5+ years of experience in platform engineering, with a proven track record of designing, deploying, and managing scalable and secure cloud-based infrastructures, leveraging both Azure and AWS services.
Experience with Azure services such as Azure AI services, Azure Search, Azure ML, Databricks, and Azure Kubernetes Service, and AWS services such as AWS SageMaker, AWS Bedrock, and AWS Lambda.
Exposure to Generative AI and Agentic AI ecosystems such as Azure OpenAI, Azure AI Foundry, Azure AI Hub, Bedrock, Anthropic Claude, OpenAI API, LlamaCloud, and LangChain.
Understanding of token usage, LLM prompt injection risks, jailbreak attempts, and mitigation techniques.
Strong knowledge of governance, audit, observability, and compliance in cloud-based Gen AI and ML ecosystems.
Should understand Azure AI Evaluation SDK and AI Red Teaming Prompt Security Scans.
Good to have: experience with code assistant tools like GitHub Copilot, Cursor, and Claude Code.
Expertise in Azure DevOps or AWS CodePipeline, including setting up and managing CI/CD pipelines.
Advanced experience with Azure Blob Storage, Cosmos DB, SQL, Key Vault, AWS S3, DynamoDB, AWS RDS, etc., and their integrations with AI services.
Advanced understanding of networking concepts, including DNS management, load balancing, VPNs, and virtual networks (VNets).
Advanced understanding of security concepts, including IAM roles, identities, Azure policies, and AWS SCPs.
Experience in advanced authentication and authorization concepts across various cloud providers and platforms.
Must have experience with Azure Policy, AWS SCP, AWS IAM, audit logging, Azure RBAC, etc.
Mastery of infrastructure-as-code tools such as Azure ARM/Bicep, Terraform, CloudFormation, or equivalent.
Proficiency in networking, DNS, load balancers, and cloud engineering services.
Knowledge of Python programming and AI/ML libraries (TensorFlow, PyTorch, scikit-learn, etc.), plus Bash and PowerShell scripting.
Experience with containerization and orchestration tools such as Docker and Kubernetes.
Good to have: knowledge of Azure Bot Framework, APIM, and Application Gateway; M365 offerings such as M365 Copilot; AWS CDK and the AWS SDK for Python (Boto3).
Experience with monitoring tools like Grafana, Prometheus, Application Insights, Log Analytics Workspaces, and Azure Monitor.
Understanding of common database technologies for both OLTP and OLAP applications.
Azure Services knowledge (Azure Machine Learning)
  o Experience in ML tooling: knowledge of Azure Machine Learning studio, Python SDK (v2), and CLI (v2) to monitor, retrain, and redeploy models.
  o Exposure to Azure Machine Learning models as an architect, or to models built from an open-source platform such as PyTorch, TensorFlow, or scikit-learn.
  o Practical knowledge of how to build efficient end-to-end ML workflows.
  o Understanding of machine learning and deep learning concepts and algorithms, various statistical techniques, and experimentation analysis workflows.
  o Enable production models across the ML lifecycle.
  o Implement CI/CD orchestration for data science pipelines.
  o Understanding of production deployments and post-deployment model lifecycle management activities: drift monitoring, model retraining, model technical evaluation, and business validation.
  o Work with stakeholders to assist with ML pipeline-related technical issues and support modelling infrastructure needs.
Security and Access
  o Familiarity with Microsoft AD
  o Understanding of the principle of least privilege and its application to projects (RBAC)
Testing
  o Understand and apply unit testing in day-to-day development work
  o Applied knowledge of integration testing as part of the CI/CD process (ideally on ADO)
Infrastructure as Code
  o Understand the key concepts of IaC and have some practical experience of its application
  o Write code in languages such as Python or TypeScript to define cloud infrastructure as code
Specific to AWS:
  o AWS database services (such as RDS, DynamoDB, Redshift, Aurora)
  o AWS compute services and storage (such as EC2 – including scaling, EBS, EFS)
  o AWS serverless technology (including Lambda, SQS, SNS, EventBridge, and Step Functions)
  o AWS KMS
  o AWS container services (including ECR)
  o AWS CloudFormation and the CDK
Specific to Azure:
  o Azure database services (including Cosmos DB, Azure SQL Serverless)
  o Azure compute services (VM, VM Scale Sets)
  o Azure serverless technology (i.e. Functions, Event Grid / Event Hub, Queue Storage, Service Bus)
  o Azure container services (including ACR / AKS)
  o Azure Resource Manager (ARM) / Bicep
  o Azure Key Vault
  o Azure Machine Learning
  o Azure Data Lake Storage
MLOps Engineer • Dombivli, Maharashtra, India