AI Platform DevOps / SRE Engineer
Location : India - 100% Remote
Full-time Permanent Position
Responsibilities / What You’ll Do
- Platform Design and Architecture : Build and operate a highly available, scalable, modular AI platform using technologies such as Qdrant, Anyscale, and Ray to support LLM orchestration, vector search, and multi-agent frameworks.
- Core Infrastructure Development : Build essential APIs and infrastructure to power conversational applications, AI agents, and analytics tools.
- LLM Operational Solutions : Implement workflows for Large Language Models, including inference pipelines, fine-tuning, caching, and evaluation for open-weight and hosted models.
- Deployment & Performance Optimization : Deploy AI services on AWS with Kubernetes (EKS), Lambda, and ECS, ensuring scalability and resilience while optimizing vector databases and model runtimes for cost and performance.
- Collaboration, Governance, & Mentorship : Partner with engineering and research teams to deliver production-grade, self-healing, and performance-optimized services for AI / RAG pipelines, establish governance and security standards, and mentor junior engineers in AI infrastructure best practices and reviews.
What We’re Looking for (Minimum Qualifications) :
- 8+ years of experience as a Platform Engineer (Site Reliability / DevOps Engineer), with at least 3 years in AI / ML platform development (MLOps).
- Deep expertise in Python, with strong design and debugging skills.
- Ability to work independently and lead complex projects, with excellent problem-solving, analytical, and communication skills.
- Proficiency with cloud platforms such as AWS, GCP, or Azure; familiarity with MLOps / AI DevOps tools like MLflow or Kubeflow; and proficiency in CI / CD and infrastructure as code (Terraform / CloudFormation).
- Hands-on expertise with CI / CD pipelines, model observability, and incident response for AI / ML services.
Preferred Qualifications :
- Experience implementing and optimizing platforms supporting large language model (LLM) pipelines with frameworks such as LangChain, LlamaIndex, Hugging Face Transformers, or similar.
- Hands-on knowledge of setting up and scaling vector DB platforms such as Qdrant (or other vector DBs like Pinecone, Weaviate) for semantic search and embeddings management.
- Exposure to MLOps tools, Ray.io, Anyscale, or other distributed orchestration & inference frameworks.
- Experience with developing and deploying containerized applications using Docker and Kubernetes, including Helm charts and automated scaling.
- Understanding of LLMOps patterns: model registry, prompt versioning, and feedback loops.