Key Responsibilities:
- Design, implement, and maintain end-to-end ML pipelines for model training, evaluation, and deployment
- Collaborate with data scientists and software engineers to operationalize ML models
- Develop and maintain CI/CD pipelines for ML workflows
- Implement monitoring and logging solutions for ML models
- Optimize ML infrastructure for performance, scalability, and cost-efficiency
- Ensure compliance with data privacy and security regulations
Required Skills and Qualifications:
- Strong programming skills in Python, with experience in ML frameworks
- Expertise in containerization technologies (Docker) and orchestration platforms (Kubernetes)
- Proficiency in a cloud platform (AWS) and its ML-specific services
- Experience with MLOps tools
- Strong understanding of DevOps practices and tools (GitLab, Artifactory, Gitflow, etc.)
- Knowledge of data versioning and model versioning techniques
- Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack)
- Knowledge of distributed training techniques
- Experience with ML model serving frameworks (TensorFlow Serving, TorchServe)
- Understanding of ML-specific testing and validation techniques
Skills Required:
Prometheus, Grafana, ELK Stack, GitLab, Artifactory, AWS