About the Role:
We’re looking for a strategic Senior MLOps Engineer to lead the end-to-end design, implementation, and scaling of our AI infrastructure. You’ll partner with researchers, product teams, and DevOps to turn prototypes into production services that meet strict SLAs for latency, reliability, and cost efficiency.
Responsibilities:
Core MLOps Pipelines: Design and implement scalable ML pipelines (training, evaluation, deployment) for LLMs, CV, and multimodal models.
Model Serving & CI/CD: Lead efforts in model serving, versioning, automated CI/CD, and real-time monitoring of AI workflows.
Inference-as-a-Service: Build and optimize GPU-backed serving infrastructure targeting low p99 latency and >80% GPU utilization.
Governance & Drift Detection: Drive initiatives on model governance, automated drift detection (≤10% false positives), and data-management best practices.
Vector Search & Agent Orchestration : Integrate vector databases (Qdrant, Pinecone) for low-latency semantic retrieval, and build agentic workflows using LangChain or similar frameworks.
Enterprise Multi-Tenancy : Architect RBAC-driven, isolated ML services to securely serve 100–500+ organizations.
Observability & Logging: Design Prometheus/Grafana dashboards, ELK/Fluentd logging pipelines, and alerting for all ML workloads.
CI/CD for Inference APIs: Maintain CI/CD pipelines for Python (FastAPI) and TypeScript (NestJS) inference services.
Metrics & Cost Optimization: Define and track SLAs/SLOs, reduce cloud spend by ≥20% year-over-year, and ensure GPU clusters operate at >80% utilization.
Cross-Functional Leadership : Partner with AI researchers, product managers, and legal to align MLOps standards with compliance and roadmap goals.
Mentorship & Community: Mentor junior engineers, run quarterly brown-bags, own onboarding docs (upskill 5+ engineers per quarter), and publish ≥1 open-source contribution or talk annually.
Requirements:
≥4 years in MLOps or ML infrastructure
Familiarity with distributed training, workload schedulers, and GPU-cluster orchestration
Plus These Critical Skills:
Minimum Qualification:
Bachelor's or Master's Degree in Computer Science, Engineering, or a related field.
Preferred:
MLOps Engineer • Delhi, India