We are looking for an AI Consultant to guide the team in building and deploying scalable AI applications and platforms. The consultant will provide technical leadership, best practices, and deployment strategies to help turn concepts into production-ready systems. This role involves advising, mentoring, and directing development in areas such as real-time communication, multimodal AI, chat applications, custom agent creation, and scalable cloud deployment.
Key Responsibilities
- Lead design, training, and deployment of ML / LLM / SLM models (classification, recommendation, ranking, generative).
- Guide creation of custom LLMs / SLMs using RoPE, MTP, MoE, KV caching, vLLM, and RLHF (GRPO, PPO) for optimization.
- Build and scale RAG pipelines, chat applications, and custom task-specific agents (LangGraph, Agno, Smol-agents) with vector databases.
- Deliver multimodal systems: text↔speech, real-time streaming (LiveKit), Vision Transformers for image↔text, and OCR.
- Provide direction on low-code automation (n8n, LangFlow, MCP servers) for rapid workflows.
- Ensure enterprise-grade deployment across cloud and local environments with MLOps, CI / CD, monitoring, and cost optimization.
- Mentor and upskill teams, driving best practices in architecture, scaling, and AI product delivery.
Required Expertise
- Proven experience with LLM / SLM training, fine-tuning, and inference acceleration (vLLM, quantization, caching).
- Strong skills in RAG, chatbots, custom agents, multimodal AI (speech, vision), and OCR.
- Hands-on with scalable deployments (Kubernetes, GPU clusters, hybrid cloud).
- Deep knowledge of MLOps, observability, and production AI lifecycle.
Success Criteria
- Production-ready AI applications deployed with scalability, reliability, and efficiency.
- Team enabled with frameworks, playbooks, and best practices for sustainable AI delivery.
- Faster transition from POC → production with measurable business impact.