Talent.com
▷ Apply in 3 Minutes: MLOps Engineer

RecroIndia
11 hours ago
Job description

Role Overview

We are looking for an experienced MLOps Lead with deep expertise in the Azure and AWS cloud ecosystems who can design, deploy, and manage scalable AI/ML infrastructure. The ideal candidate brings a strong background in cloud governance, GenAI tooling, automation, and CI/CD pipelines, with hands-on experience across modern MLOps frameworks.

Key Responsibilities

  • Design, implement, and manage scalable cloud-based AI/ML infrastructure across Azure and AWS.
  • Drive the end-to-end MLOps lifecycle: model deployment, monitoring, retraining, and governance.
  • Enable GenAI and agentic AI platforms using Azure OpenAI, Amazon Bedrock, Anthropic Claude, LangChain, and similar tools.
  • Implement CI/CD pipelines using Azure DevOps or AWS CodePipeline.
  • Ensure security, observability, and compliance across ML and GenAI ecosystems.
  • Manage infrastructure automation via Terraform, Bicep, CloudFormation, or similar IaC tools.
  • Collaborate with data science and engineering teams to optimize ML workflows, data pipelines, and API integrations.
  • Implement monitoring and alerting using Grafana, Prometheus, Azure Monitor, and Application Insights.
  • Oversee networking, identity management, and role-based access controls (IAM, RBAC) across clouds.
  • Support model lifecycle management: drift monitoring, retraining, technical evaluation, and business validation.

Technical Skills & Expertise

Cloud & MLOps Platforms

  • Azure: Azure ML, Azure AI Services, Azure OpenAI, Azure Kubernetes Service (AKS), Databricks, Azure Search, Azure Blob, Cosmos DB, Azure SQL, Azure Functions, Azure Event Hub, Azure Resource Manager (ARM), Bicep.
  • AWS: SageMaker, Bedrock, Lambda, DynamoDB, S3, RDS, Redshift, ECR, CloudFormation, CDK, KMS, EventBridge, Step Functions.

AI/ML & Programming

  • Hands-on experience in Python, with exposure to TensorFlow, PyTorch, and scikit-learn.
  • Understanding of LLM tokenization, prompt injection risks, jailbreak prevention, and AI safety techniques.
  • Familiarity with LangChain, LlamaCloud, AI Foundry, and related frameworks.
  • Experience in model monitoring, retraining, and evaluation workflows.

DevOps & Infrastructure

  • Expertise in CI/CD pipelines, containerization (Docker, Kubernetes), and infrastructure automation.
  • Strong in governance, audit logging, and security policies (Azure Policy, AWS SCP, IAM).
  • Deep understanding of networking, DNS, load balancers, VNets/VPCs, and VPNs.
  • Skilled in IaC tools: Terraform, Bicep, ARM, CloudFormation.

Monitoring & Observability

  • Experience with Grafana, Prometheus, Application Insights, Log Analytics Workspaces, Azure Monitor.

Security & Access Management

  • Understanding of Microsoft Active Directory (AD), least-privilege principles, IAM, and RBAC.

Testing & Automation

  • Familiarity with unit testing and integration testing in CI/CD workflows (preferably Azure DevOps).

Good to Have

  • Experience with Azure Bot Framework, M365 Copilot, and Azure API Management (APIM).
  • Exposure to code assistants such as GitHub Copilot, Cursor, Claude Code.
  • Knowledge of Boto3 (the AWS SDK for Python) and TypeScript for IaC.

Preferred Background

  • Strong background in cloud infrastructure engineering and machine learning operations.
  • Proven ability to lead cross-functional teams and implement AI governance at scale.
  • Excellent problem-solving, communication, and documentation skills.