Talent.com
Machine Learning Engineer

Recro, Pune, Maharashtra, IN
30+ days ago
Job description

Role Overview

We are looking for an experienced MLOps Lead with deep expertise in the Azure and AWS cloud ecosystems who can design, deploy, and manage scalable AI/ML infrastructure. The ideal candidate brings a strong background in cloud governance, GenAI tooling, automation, and CI/CD pipelines, with hands-on experience across modern MLOps frameworks.

Key Responsibilities

  • Design, implement, and manage scalable cloud-based AI/ML infrastructure across Azure and AWS.
  • Drive the end-to-end MLOps lifecycle: model deployment, monitoring, retraining, and governance.
  • Enable GenAI and Agentic AI platforms leveraging Azure OpenAI, Bedrock, Anthropic Claude, LangChain, etc.
  • Implement CI/CD pipelines using Azure DevOps or AWS CodePipeline.
  • Ensure security, observability, and compliance across ML and GenAI ecosystems.
  • Manage infrastructure automation via Terraform, Bicep, CloudFormation, or similar IaC tools.
  • Collaborate with data science and engineering teams to optimize ML workflows, data pipelines, and API integrations.
  • Implement monitoring and alerting using Grafana, Prometheus, Azure Monitor, and Application Insights.
  • Oversee networking, identity management, and role-based access controls (IAM, RBAC) across clouds.
  • Support model lifecycle management: drift monitoring, retraining, technical evaluation, and business validation.

Technical Skills & Expertise

Cloud & MLOps Platforms

  • Azure: Azure ML, Azure AI Services, Azure OpenAI, Azure Kubernetes Service (AKS), Databricks, Azure Search, Azure Blob, Cosmos DB, Azure SQL, Azure Functions, Azure Event Hub, Azure Resource Manager (ARM), Bicep.
  • AWS: SageMaker, Bedrock, Lambda, DynamoDB, S3, RDS, Redshift, ECR, CloudFormation, CDK, KMS, EventBridge, Step Functions.

AI/ML & Programming

  • Hands-on experience in Python, with exposure to TensorFlow, PyTorch, and scikit-learn.
  • Understanding of LLM tokenization, prompt injection risks, jailbreak prevention, and AI safety techniques.
  • Familiarity with LangChain, LlamaCloud, AI Foundry, and related frameworks.
  • Experience in model monitoring, retraining, and evaluation workflows.

DevOps & Infrastructure

  • Expertise in CI/CD pipelines, containerization (Docker, Kubernetes), and infrastructure automation.
  • Strong in governance, audit logging, and security policies (Azure Policy, AWS SCP, IAM).
  • Deep understanding of networking, DNS, load balancers, VNets/VPCs, and VPNs.
  • Skilled in IaC tools: Terraform, Bicep, ARM, CloudFormation.

Monitoring & Observability

  • Experience with Grafana, Prometheus, Application Insights, Log Analytics Workspaces, and Azure Monitor.

Security & Access Management

  • Understanding of Microsoft AD, least-privilege principles, IAM, and RBAC.

Testing & Automation

  • Familiarity with unit testing and integration testing in CI/CD workflows (preferably Azure DevOps).

Good to Have

  • Experience with Azure Bot Framework, M365 Copilot, and APIM.
  • Exposure to code assistants such as GitHub Copilot, Cursor, Claude Code.
  • Knowledge of Boto3 SDK (AWS Python) and TypeScript for IaC.

Preferred Background

  • Strong background in cloud infrastructure engineering and machine learning operations.
  • Proven ability to lead cross-functional teams and implement AI governance at scale.
  • Excellent problem-solving, communication, and documentation skills.