Key Responsibilities:
- Design and implement cloud architecture for AI/ML and LLM model deployment, monitoring, and scaling
- Deploy models using Docker, Kubernetes, and CI/CD pipelines across cloud environments on Azure or GCP
- Optimize AI model serving using tools such as TensorFlow Serving, TorchServe, ONNX Runtime, and Triton Inference Server
- Build and manage infrastructure as code (IaC) using Terraform or CloudFormation
- Implement security best practices, autoscaling, logging, monitoring (Prometheus/Grafana/ELK), and disaster recovery plans
- Collaborate with ML engineers to turn prototypes into resilient, cloud-native services
- Benchmark and tune deployments for low latency, high throughput, and cost optimization
Required Qualifications:
- 8+ years of experience in cloud engineering, with 4+ years focused on AI/ML deployment at scale
- Strong hands-on expertise in Azure and GCP
- Proficient in Docker, Kubernetes, Helm, and serverless deployment models
- Solid understanding of ML frameworks (TensorFlow, PyTorch, scikit-learn) and model deployment workflows
- Experience with CI/CD tools (GitHub Actions, Jenkins, GitLab CI), scripting (Python, Bash), and API gateway management
Skills Required:
Docker, Cloud Engineering, AI/ML, Azure, Python, Kubernetes