Senior ML Engineer / Applied Scientist – Post-Training (SFT, DPO, GRPO, RFT & Evaluation)

gnani.ai

5 - 10 years

Bengaluru

Posted: 05/01/2026

Job Description

IndiaAI is building aligned, safe, and multilingual LLMs for 1.4B people. We're hiring a Senior ML Engineer/Applied Scientist to lead post-training, covering SFT, DPO, GRPO, RFT, RLHF, multi-turn chat tuning, reward modeling, and evaluation, with a strong focus on Indic and low-resource languages.


What You'll Do

  • Build and scale SFT pipelines (single & multi-turn chat).
  • Run DPO, GRPO, RFT and other preference optimization techniques.
  • Train reward models and integrate them into alignment loops.
  • Use leading libraries: HuggingFace TRL/PEFT, DeepSpeed-Chat, NeMo Alignment, OpenRLHF, Axolotl, Colossal-AI.
  • Develop high-quality datasets for instructions, chat, and preference ranking.
  • Conduct multilingual & Indic evaluation using lm-eval-harness, Ragas, HELM.
  • Improve performance for low-resource Indic languages via augmentation & synthetic data loops.
  • Work with infra teams to scale training on multi-GPU clusters.


What You Bring

  • 4-8+ years in ML/NLP with deep experience in post-training.
  • Strong expertise in SFT, DPO/GRPO/RFT, PPO-style RLHF.
  • Hands-on with TRL, NeMo, DeepSpeed-Chat, OpenRLHF, Axolotl.
  • Proficiency with LoRA/QLoRA, FSDP & distributed training.
  • Experience with Indic languages and multilingual NLP.
  • Strong evaluation and dataset engineering background.


Bonus Skills

  • Experience with 7B to 70B+ LLM tuning.
  • Contributions to alignment libraries.
  • Safety alignment or Constitutional AI experience.


Join us to build India's aligned, safe, multilingual LLMs.
