Talent.com
MLOps Engineer - Billion Dollar US Enterprise Software - Hiring in India!

CareerXperts Consulting, Hyderabad, Telangana, India
21 hours ago
Job description

Role Focus: Production ML Systems | GPU Orchestration | Inference at Scale

What You'll Actually Do (Not Buzzwords)

Infrastructure That Doesn't Break

Design and maintain the backbone for training, fine-tuning, and deploying ML models that actually work in production

Orchestrate GPU workloads on Kubernetes (EKS) with node autoscaling, intelligent bin-packing, and cost-aware scheduling (spot instances, preemptibles—you know the drill)

Build CI/CD pipelines that handle ML code, data versioning, and model artifacts like a well-oiled machine (GitHub Actions, Argo Workflows, Terraform)

Production ML, Not Science Projects

Partner with Data Scientists and ML Engineers to turn Jupyter notebooks into production-grade systems

Deploy and scale inference backends (vLLM, Hugging Face, NVIDIA Triton) that serve real traffic

Optimize GPU utilization because every idle A100 hour is money burning

Build observability that actually tells you why things broke (Prometheus, Grafana, OpenTelemetry)

Ship Fast, Sleep Well

Create tooling for seamless model deployment, instant rollback, and A/B testing

Lead incident response when production AI systems decide to have opinions

Work with security and compliance teams to implement best practices without slowing down innovation

What We're Really Looking For

Must-Haves (No Negotiation)

5+ years in MLOps, infrastructure, or platform engineering: you've been in the trenches

Production ML experience: At least one project that's serving real users, not a Kaggle competition

Kubernetes expertise with GPUs: You understand taints, tolerations, affinity rules, and why GPU scheduling is its own special hell

Cloud-native architecture (AWS preferred): You think in VPCs, IAM roles, and cost optimization

Training pipeline experience: You've set up or scaled training/fine-tuning for ML models in production (PyTorch Lightning, Hugging Face Accelerate, DeepSpeed)

IaC fluency: Terraform, Helm, and Kustomize are second nature

Python engineering skills: You can debug a distributed training failure and fix it

Inference scaling: You've deployed and scaled inference workloads and lived to tell the tale

The "We're Very Interested" Signals

You mention scaling inference and we can see the fire in your eyes

You've used MLflow, W&B, or SageMaker Experiments and have opinions on which is best

You understand CI/CD for ML and why it's different from regular software

You've built monitoring systems that caught issues before users did

Nice to Have (But Seriously Nice)

GPU scheduling wizardry in Kubernetes

Model drift monitoring and versioning tools

Low-latency inference optimization (quantization, FP8, TensorRT—the good stuff)

Experience in compliance or regulated industries where "just ship it" isn't an option

What Makes This Role Different

Ownership. You're not a ticket-taker or a consultant passing through. You'll own infrastructure that powers real AI products, make architectural decisions that matter, and have the autonomy to build things the right way.

Impact. Your work directly affects model training speed, inference latency, GPU costs, and system reliability. You'll see the results of your optimizations in dollars saved and milliseconds gained.

Quality over speed. We value security, operational excellence, and sustainable systems. No "move fast and break things" chaos here—we move deliberately and build things that last.

The Reality Check

This role is not for you if:

You prefer working on proofs-of-concept over production systems

You think "it works on my machine" is an acceptable answer

You haven't shipped ML systems to production

You're looking for pure research or pure DevOps (this is the intersection)

This role is for you if:

You get excited about making GPUs go brrr efficiently

You've been on call for ML systems and learned hard lessons

You believe infrastructure is a product, not an afterthought

You want to build the foundation for AI that actually works

Write to MLOps@CareerXperts.com to get connected!
