We are seeking an experienced MLOps & DevOps Infrastructure Engineer to design, implement, and maintain scalable ML pipelines and cloud-native infrastructure for enterprise applications. The role requires expertise in machine learning operationalization (MLOps) and intermediate proficiency in DevOps infrastructure automation, ensuring seamless integration, monitoring, and delivery of ML and software systems and their connected infrastructure.
The candidate will work closely with Data Scientists, Software Leads, and Cloud Architects to deliver robust, secure, and automated solutions.
MLOps Responsibilities
- Design and implement end-to-end ML pipelines (data ingestion, preprocessing, training, validation, deployment and monitoring).
- Automate ML workflows with tools like Kubeflow, MLflow or equivalent.
- Deploy ML models into production environments (cloud & on-premises) using containerized and serverless architectures.
- Implement model versioning, experiment tracking, and a model registry.
- Collaborate with Data Science teams to optimize training and inference performance with connected infrastructure and delivery pipelines.
- Ensure compliance with data governance, privacy, and security policies.
DevOps Infrastructure Responsibilities
- Basic understanding of architecting and maintaining CI/CD pipelines for software and ML systems.
- Manage cloud infrastructure (AWS, on-premises) with Infrastructure-as-Code (IaC) tools.
- Administer Kubernetes clusters, Docker containers, and orchestration platforms.
- Implement monitoring, logging, and alerting using tools like Prometheus, Grafana, ELK, Datadog.
- Optimize infrastructure cost and ensure SLA adherence.
Knowledge
- Familiarity with drone data processing software packages and tools.
- Edge ML deployment (IoT, drones, or mobile).
- Prior experience scaling ML workloads in production.
- Familiarity with AI model development, training, and deployment in production environments, including monitoring.
- Knowledge of serverless computing and event-driven architecture.
- Familiarity with multi-cloud and hybrid-cloud architectures.
Skills
Python, MLflow, Docker, Kubernetes, Git, TensorFlow / PyTorch, FastAPI, AWS / GCP, Airflow / Kubeflow
Education
B.E / B.Tech in Computer Science, Data Engineering, or a related field.