Senior ML Engineer / Applied Scientist – Post-Training (SFT, DPO, GRPO, RFT & Evaluation)
gnani.ai
5 - 10 years
Bengaluru
Posted: 05/01/2026
Job Description
IndiaAI is building aligned, safe, and multilingual LLMs for 1.4B people. We're hiring a Senior ML Engineer/Applied Scientist to lead post-training, covering SFT, DPO, GRPO, RFT, RLHF, multi-turn chat tuning, reward modeling, and evaluation, with a strong focus on Indic and low-resource languages.
What You'll Do
- Build and scale SFT pipelines (single & multi-turn chat).
- Run DPO, GRPO, RFT and other preference optimization techniques.
- Train reward models and integrate them into alignment loops.
- Use leading libraries: HuggingFace TRL/PEFT, DeepSpeed-Chat, NeMo Alignment, OpenRLHF, Axolotl, Colossal-AI.
- Develop high-quality datasets for instructions, chat, and preference ranking.
- Conduct multilingual & Indic evaluation using lm-eval-harness, Ragas, HELM.
- Improve performance for low-resource Indic languages via augmentation & synthetic data loops.
- Work with infra teams to scale training on multi-GPU clusters.
What You Bring
- 4-8+ years in ML/NLP with deep experience in post-training.
- Strong expertise in SFT, DPO/GRPO/RFT, PPO-style RLHF.
- Hands-on with TRL, NeMo, DeepSpeed-Chat, OpenRLHF, Axolotl.
- Proficiency with LoRA/QLoRA, FSDP & distributed training.
- Experience with Indic languages and multilingual NLP.
- Strong evaluation and dataset engineering background.
Bonus Skills
- Experience with 7B-70B+ LLM tuning.
- Contributions to alignment libraries.
- Safety alignment or Constitutional AI experience.
Join us to build India's aligned, safe, multilingual LLMs.
