IndiaAI is building aligned, safe, and multilingual LLMs for 1.4B people. We're hiring a Senior ML Engineer/Applied Scientist to lead post-training, covering SFT, DPO, GRPO, RFT, RLHF, multi-turn chat tuning, reward modeling, and evaluation, with a strong focus on Indic and low-resource languages.

What You'll Do

Build and scale SFT pipelines (single & multi-turn chat).
Run DPO, GRPO, RFT and other preference optimization techniques.
Train reward models and integrate them into alignment loops.
Use leading libraries: HuggingFace TRL/PEFT, DeepSpeed-Chat, NeMo Alignment, OpenRLHF, Axolotl, Colossal-AI.
Develop high-quality datasets for instructions, chat, and preference ranking.
Conduct multilingual & Indic evaluation using lm-eval-harness, Ragas, HELM.
Improve performance for low-resource Indic languages via augmentation & synthetic data loops.
Work with infra teams to scale training on multi-GPU clusters.

What You Bring

4-8+ years in ML/NLP with deep experience in post-training.
Strong expertise in SFT, DPO/GRPO/RFT, PPO-style RLHF.
Hands-on with TRL, NeMo, DeepSpeed-Chat, OpenRLHF, Axolotl.
Proficiency with LoRA/QLoRA, FSDP & distributed training.
Experience with Indic languages and multilingual NLP.
Strong evaluation and dataset engineering background.

Bonus Skills

Experience with 7B-70B+ LLM tuning.
Contributions to alignment libraries.
Safety alignment or Constitutional AI experience.

Join us to build India's aligned, safe, multilingual LLMs.

Senior ML Engineer / Applied Scientist – Post-Training (SFT, DPO, GRPO, RFT & Evaluation)

gnani.ai
