We’re building next-gen AI-powered product experiences: intelligent assistants, agentic workflows, and scalable RAG systems. We’re hiring a Senior ML Engineer / Applied Scientist who can build production-grade RAG architectures, develop agentic systems (LangGraph, CrewAI, AutoGen), and deploy high-performance LLM inference using vLLM, SGLang, and TensorRT-LLM, while working with modern vector databases (Qdrant, MongoDB Vector Search).
This is a hands-on role at the intersection of AI systems, retrieval, agents, and product engineering.
What You’ll Do
- Build end-to-end RAG pipelines (ingestion → embeddings → vector DB → retrieval → generation); a minimal sketch follows this list.
- Work with open-source models (LLaMA, Mistral, Qwen, Gemma) and closed-source models (GPT-4.x, Claude 3, Gemini, Grok).
- Develop agentic workflows using LangGraph, CrewAI, AutoGen, LangChain, LlamaIndex.
- Deploy & optimize inference using vLLM, SGLang, TensorRT-LLM, TGI.
- Implement scalable vector search with Qdrant & MongoDB Atlas Vector Search.
- Build multilingual pipelines with Indic-language support.
- Set up evaluation using Ragas, DeepEval, LangSmith, TruLens.
- Work closely with backend / product teams to deploy RAG + agent systems in production.
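For a flavor of the stack, here is a minimal, illustrative RAG sketch (not our production code), assuming sentence-transformers for embeddings, an in-memory Qdrant collection, and a vLLM server exposing its OpenAI-compatible API at a placeholder URL; the model names and collection name are placeholders only.

```python
# Illustrative RAG sketch: embed -> index in Qdrant -> retrieve -> generate via vLLM.
# All model names, the collection name, and the endpoint URL are placeholders.
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from sentence_transformers import SentenceTransformer

# 1. Ingestion + embeddings (chunking omitted for brevity).
docs = [
    "Qdrant stores dense vectors and supports filtered similarity search.",
    "vLLM serves LLMs behind an OpenAI-compatible HTTP API.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
vectors = embedder.encode(docs)

# 2. Vector DB: in-memory Qdrant collection for the example.
qdrant = QdrantClient(":memory:")
qdrant.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
qdrant.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=i, vector=vec.tolist(), payload={"text": text})
        for i, (vec, text) in enumerate(zip(vectors, docs))
    ],
)

# 3. Retrieval: embed the query and pull the top matches.
query = "How do we serve the model?"
hits = qdrant.search(
    collection_name="docs",
    query_vector=embedder.encode(query).tolist(),
    limit=2,
)
context = "\n".join(hit.payload["text"] for hit in hits)

# 4. Generation: call a vLLM server through its OpenAI-compatible API.
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = llm.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loads
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(response.choices[0].message.content)
```

In production the same shape scales out with managed Qdrant or MongoDB Atlas Vector Search, reranking, and evaluation hooks (Ragas, LangSmith).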
What You Bring
- 4–8+ years in ML engineering, NLP, or LLM-based product development.
- Strong hands-on RAG experience (chunking, embeddings, reranking, retrieval optimization).
- Experience with Qdrant, MongoDB Vector Search, FAISS, Milvus, Pinecone, or Weaviate.
- Practical experience with vLLM, SGLang, model routing, and inference optimization.
- Proficiency with LangGraph, CrewAI, AutoGen, LangChain, or LlamaIndex.
- Experience integrating both open-source and closed-source LLMs.
- Ability to ship production systems with strong reliability, latency, and quality.
Bonus Skills
- Knowledge graphs, semantic retrieval, long-context models.
- Multi-agent systems & complex tool-routing workflows.
- Experience with Indic languages and OCR pipelines.
📩 Join us to build the AI backbone of next-generation product experiences.