Job Title: Machine Learning Engineer (Gen AI, LLMs & RAG)
Experience: 3-8 years
Job Summary:
We are looking for a highly motivated and skilled Machine Learning Engineer with deep expertise in Machine Learning (ML), Deep Learning, Large Language Models (LLMs), Generative AI, and Retrieval-Augmented Generation (RAG). You will play a key role in designing, developing, and deploying state-of-the-art AI models and scalable systems that push the boundaries of intelligent automation and human-like interaction.
Key Responsibilities:
- Develop, train, and fine-tune advanced machine learning and deep learning models using structured and unstructured data.
- Build, evaluate, and deploy LLM-based solutions (e.g., using OpenAI, Hugging Face, LLaMA, Mistral, or similar).
- Design and implement RAG pipelines to enhance LLM capabilities with domain-specific context and external knowledge bases.
- Work with cross-functional teams (NLP researchers, data engineers, product managers) to convert business needs into ML solutions.
- Optimize training and inference performance for LLMs and Gen AI models through efficient compute usage and model compression techniques.
- Monitor model performance, fairness, and bias; ensure reliability in production.
- Maintain and expand reusable ML tools, frameworks, and best practices for rapid experimentation and deployment.
Mandatory Skills:
- Strong foundation in Machine Learning and Deep Learning algorithms (e.g., CNNs, RNNs, Transformers, attention mechanisms).
- Hands-on experience with LLMs (e.g., GPT, BERT, T5, Falcon, LLaMA).
- Proficiency in Generative AI techniques (text generation, summarization, Q&A, image-to-text, etc.).
- Experience building Retrieval-Augmented Generation (RAG) architectures using vector databases such as FAISS, Pinecone, Weaviate, or Milvus.
- Deep proficiency with ML/DL frameworks: TensorFlow, PyTorch, and Hugging Face Transformers.
- Strong programming skills in Python; familiarity with libraries such as LangChain and LlamaIndex.
- Experience with MLOps, model versioning, and deployment in cloud environments (AWS, Azure, GCP).
- Familiarity with prompt engineering, LLM fine-tuning, and zero-/few-shot learning methods.