LLM R&D Specialist
gnani.ai
2 - 5 years
Ahmedabad
Posted: 05/02/2026
Job Description
IndiaAI is building India's next-gen foundational LLMs. We're looking for a hands-on Senior ML Engineer experienced in large-scale pre-training, distributed GPU systems, and data creation pipelines. You will work with Megatron-LM, NVIDIA NeMo, DeepSpeed, PyTorch Distributed, and SLURM to train 7B to 70B+ models on multi-node GPU clusters.
What You'll Do
- Build & optimize LLM pre-training pipelines (7B to 70B+).
- Implement distributed training using PyTorch Distributed (including FSDP), DeepSpeed (ZeRO), Megatron-LM, and NVIDIA NeMo.
- Manage multi-node GPU jobs via SLURM and optimize NCCL communication.
- Lead large-scale data creation, cleaning, deduplication, tokenization & sharding for multilingual datasets (with a focus on Indian languages).
- Build high-throughput dataloaders, monitoring dashboards & training workflows.
- Collaborate with infra teams to optimize GPU utilization, networking, and storage systems.
What You Bring
- 5+ years in ML Engineering / DL Systems.
- Prior experience training large transformer models (ideal: 7B+).
- Strong in NeMo, Megatron-LM, DeepSpeed, PyTorch Distributed.
- Experience with SLURM & multi-node GPU clusters (A100/H100).
- Understanding of transformer internals (attention, RoPE, FlashAttention, parallelism).
- Experience in data pipelines: cleaning, dataset assembly, tokenization.
Bonus Skills
- Indic-language data experience
- MoE training
- Kernel-level optimization (Triton/CUDA)
- Open-source contributions (Megatron, NeMo, DeepSpeed, PyTorch)
Apply now to help build India's national-scale foundational AI models.
