Research Engineer - Video Generation

Pocket FM

2 - 5 years

Bengaluru

Posted: 21/02/2026

Job Description

About Pocket FM


Pocket FM is on a mission to deliver personalised and immersive audio experiences to listeners worldwide. We are revolutionising the audio entertainment industry through long-form storytelling, supported by our cutting-edge platform that serves millions of listeners and generates billions of minutes of engagement monthly. We leverage Generative AI in producing content and streamlining operations, developing innovative solutions for cutting-edge challenges in the AI landscape across all modalities: text, audio, and images. With strong backing and rapid user base growth, Pocket FM is an exciting and dynamic place to join.


The Role: What You'll Build and Own


  • Design and implement an agentic orchestration framework that selects the optimal video generation model for each scene, constructs and refines prompts dynamically, decomposes episode-level goals into scene-level tasks, and manages generation, validation, and refinement loops
  • Build a multi-agent system that can translate high-level episode briefs into structured scripts; break scripts into scenes, shots, and animation beats; select visual style, pacing, and emotional tone parameters; and trigger the appropriate video models and pipelines
  • Develop automated prompt engineering strategies, model selection heuristics (or learned selection policies), self-refinement and critique loops, and quality control mechanisms (LLM- or vision-based evaluators)
  • Create orchestration logic for scene continuity (character consistency, environment persistence), style preservation across the episode, temporal coherence, and budget/compute optimisation
  • Build evaluation frameworks that measure narrative coherence, visual consistency, style fidelity, emotional alignment, and anime-specific quality metrics
  • Optimise for minimal human intervention, scalable production, robust failure recovery, and reproducibility


The Ideal Candidate: Who You Are


You are experienced in building:


  • An agent-based orchestration engine
  • Automated prompt generation and refinement modules
  • Model selection and routing layer
  • Episode planner (hierarchical decomposition system)
  • Feedback-driven improvement loops
  • Evaluation and scoring system
  • Production-ready pipeline for end-to-end anime episode generation


Your Technical Toolkit:


  • Master's or PhD in Computer Science, AI, ML, or a related field
  • Strong experience with Large Language Models (LLMs), multimodal generative models, prompt engineering and prompt optimisation, Python, and production ML systems
  • Hands-on experience building agentic systems (e.g., ReAct, AutoGPT-style, planning agents), tool-using LLM systems, and orchestration pipelines
  • Deep understanding of video generation models, model evaluation and benchmarking, and experimentation frameworks


Preferred Qualifications:


  • Experience with video diffusion or text-to-video systems, character consistency techniques (LoRA, embeddings, adapters), scene planning or hierarchical generation, reinforcement learning or policy learning, and automated content evaluation systems
  • Familiarity with anime production workflows, storyboarding, shot composition and pacing, diffusion models, and narrative structure
  • Experience deploying distributed ML systems, GPU-accelerated pipelines, and cloud-based ML infrastructure
