Artificial Intelligence Engineer

VolumX

2 - 5 years

Gurugram

Posted: 01/01/2026

Job Description


About VolumX

VolumX is India's first and leading company for digital humans, trusted by Oscar-winning studios and global brands. Using some of the most advanced technologies, we create lifelike digital doubles for celebrities, powering face replacements, de-aging, and autonomous digital humans.

Now we're pushing into AI-driven performance synthesis, and we're inviting talented engineers to join us in shaping the future of content and immersive interactions.


Requirements

- Strong experience in AI/ML model development (PyTorch, TensorFlow).

- Practical experience with video generation, face replacement, digital doubles.

- Understanding of Gaussian splats / NeRF / neural rendering concepts.

- Knowledge of emotion recognition & sentiment analysis models.

- Experience with real-time inference optimization (ONNX, TensorRT, quantization).

- Background in speech processing (ASR, vocoders, prosody control) for training facial expression models.

- Familiarity with lip-sync engines (e.g., Rhubarb, Wav2Lip, custom phoneme alignment).

- Strong programming skills in Python.

- Experience in multimodal AI (audio + video + text).

- Experience deploying AI in cloud environments (AWS/GCP/Azure, Docker, Kubernetes).

- Hands-on expertise with STT/TTS frameworks (LiveKit, Whisper, Riva, Coqui TTS, Tacotron, FastSpeech, VITS, etc.).


Responsibilities

Face & Video Models

- Train person-specific models for face reenactment, face swapping, and de-aging.

- Build high-resolution, temporally consistent face replacement pipelines.

- Test and implement neural rendering pipelines using Gaussian splats, NeRFs, and diffusion models.

Lip Sync & Emotions

- Implement or adapt lip-sync models (e.g. Wav2Lip-style, phoneme/viseme-based).

- Research and integrate emotion recognition models (from audio/text input).

- Map emotion states into facial rigs and lip-sync engines.

- Interface outputs with rigs / MetaHumans via blendshapes or bone controls.

Pipeline Engineering

- Design low-latency inference pipelines for STT → NLU → TTS.

- Optimize models for real-time streaming (GPU/TPU/Cloud deployment).

- Work with backend engineers to expose AI services via APIs/WebSockets.

Collaboration & Integration

- Partner with Unreal engineers to sync AI outputs with Pixel Streaming.

- Ensure smooth coordination between voice, facial animation, and emotional response.
