IndiaAI is building aligned, safe, and multilingual LLMs for 1.4B people. We’re hiring a Senior ML Engineer / Applied Scientist to lead post-training, covering SFT, DPO, GRPO, RFT, RLHF, multi-turn chat tuning, reward modeling, and evaluation, with a strong focus on Indic and low-resource languages.
What You’ll Do
- Build and scale SFT pipelines (single & multi-turn chat).
- Run DPO, GRPO, RFT and other preference optimization techniques.
- Train reward models and integrate them into alignment loops.
- Use leading libraries: HuggingFace TRL / PEFT, DeepSpeed-Chat, NeMo Alignment, OpenRLHF, Axolotl, Colossal-AI.
- Develop high-quality datasets for instructions, chat, and preference ranking.
- Conduct multilingual & Indic evaluation using lm-eval-harness, Ragas, HELM.
- Improve performance for low-resource Indic languages via augmentation & synthetic data loops.
- Work with infra teams to scale training on multi-GPU clusters.
What You Bring
- 4–8+ years in ML / NLP with deep experience in post-training.
- Strong expertise in SFT, DPO / GRPO / RFT, PPO-style RLHF.
- Hands-on with TRL, NeMo, DeepSpeed-Chat, OpenRLHF, Axolotl.
- Proficiency with LoRA / QLoRA, FSDP & distributed training.
- Experience with Indic languages and multilingual NLP.
- Strong evaluation and dataset engineering background.
Bonus Skills
- Experience with 7B–70B+ LLM tuning.
- Contributions to alignment libraries.
- Safety alignment or Constitutional AI experience.
📩 Join us to build India’s aligned, safe, multilingual LLMs.