About the Role
We are seeking an experienced MLOps Engineer to design, build, and maintain scalable machine learning infrastructure, with a strong focus on the Azure cloud ecosystem. You will deploy and optimize ML/AI models in production, with an emphasis on GPU-accelerated workloads and large language models.
Key Responsibilities
Design and implement MLOps pipelines using Kubeflow and KServe for model deployment and serving
Deploy and manage production ML models on Kubernetes with NVIDIA GPU acceleration
Optimize model inference using vLLM, TensorRT-LLM, Triton Inference Server, and PyTorch Dynamo
Implement auto-scaling using KEDA, HPA, and Azure Cluster Autoscaler
Build CI/CD pipelines using Azure DevOps for automated ML model deployment
Manage Azure infrastructure (AKS, Azure ML, Container Registry) using IaC (Terraform/ARM/Bicep)
Implement monitoring and observability using Prometheus and Grafana
Perform Linux system administration and optimize containerized workloads
Deploy and manage LLM applications with the OpenAI API and Azure OpenAI services
Monitor GPU utilization and model performance, and implement drift detection
Build RAG systems and manage vector databases for LLM applications
ML Engineer • Hyderabad, Telangana, India