Description:
You will play a leading role in designing, deploying, and scaling production-grade ML systems, including large language model (LLM)-based pipelines, AI copilots, and agentic workflows.
This role is ideal for someone who thrives on balancing cutting-edge research with production rigor and who enjoys mentoring while building impact-first AI. In this role, you will:
- Own the full ML lifecycle: model design, training, evaluation, and deployment
- Design production-ready ML pipelines with CI/CD, testing, monitoring, and drift detection
- Fine-tune LLMs and implement retrieval-augmented generation (RAG) pipelines
- Build agentic workflows for reasoning, planning, and decision-making
- Develop both real-time and batch inference systems using Docker, Kubernetes, and Spark
- Leverage state-of-the-art architectures and techniques: transformers, diffusion models, RLHF, and multimodal pipelines
- Collaborate with product and engineering teams to integrate AI models into business applications
- Mentor junior team members and promote MLOps, scalable architecture, and responsible AI best practices
Ideal Candidate:
- 5+ years of experience in designing, deploying, and scaling ML/DL systems in production
- Proficient in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX
- Experience with LLM fine-tuning, LoRA/QLoRA, vector search (Weaviate/PGVector), and RAG pipelines
- Familiarity with agent-based development (e.g., ReAct agents, function calling, orchestration)
- Solid understanding of MLOps: Docker, Kubernetes, Spark, model registries, and deployment workflows
- Strong software engineering background with experience in testing, version control, and APIs
- Proven ability to balance innovation with scalable deployment
- B.S./M.S./Ph.D. in Computer Science, Data Science, or a related field
- Must have a minimum of 5 years of experience in designing, developing, and deploying Machine Learning/Deep Learning (ML/DL) systems in production
- Must have strong hands-on experience in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX
- Must have 1+ years of experience in fine-tuning Large Language Models (LLMs) using techniques like LoRA/QLoRA, and in building Retrieval-Augmented Generation (RAG) pipelines
- Must have experience with MLOps and production-grade systems, including Docker, Kubernetes, Spark, model registries, and CI/CD workflows
(ref: hirist.tech)