About the Role
We are looking for a highly skilled Lead - Machine Learning Engineer to drive the design, development, and deployment of large-scale ML solutions that power intelligent business applications. This role bridges data science, engineering, and product, ensuring that ML models move efficiently from prototype to production without sacrificing scalability, reliability, or performance.
You’ll collaborate with data scientists, software engineers, and product teams to architect and operationalize machine learning systems that deliver measurable business value. The ideal candidate combines deep technical expertise in machine learning infrastructure with strong leadership and mentorship capabilities.
Key Responsibilities
1. Machine Learning System Design & Development
- Lead the end-to-end development of scalable ML systems — from data processing to model training, validation, deployment, and monitoring.
- Design high-performance pipelines for feature engineering, model serving, and real-time inference.
- Build reusable frameworks, SDKs, and components to accelerate ML deployment across teams.
2. MLOps & Automation
- Establish CI/CD pipelines for ML models, ensuring reproducibility, traceability, and automated rollout.
- Define best practices for model versioning, model governance, and monitoring (data drift, performance degradation, etc.).
- Integrate ML systems into production using cloud-native services (AWS, Azure, GCP) and orchestration tools (Kubernetes, Airflow, Kubeflow).
3. Leadership & Collaboration
- Lead and mentor a team of ML engineers, fostering a culture of technical excellence and continuous learning.
- Partner with data scientists to productionize research models, ensuring robustness, scalability, and maintainability.
- Work with product and engineering teams to align ML solutions with business objectives and delivery timelines.
4. Performance Optimization & Innovation
- Optimize training and inference performance using distributed computing, GPU acceleration, and efficient data pipelines.
- Evaluate new tools, frameworks, and technologies to improve model performance, scalability, and cost-efficiency.
- Champion best practices in responsible AI, including fairness, transparency, and compliance.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Machine Learning, or a related field.
- 6+ years of experience in machine learning or data engineering, including 3+ years in a senior or lead role.
- Strong proficiency in Python, with experience in ML libraries and frameworks such as TensorFlow, PyTorch, scikit-learn, and XGBoost.
- Knowledge of microservices architecture and API design for ML model serving.
- Expertise in MLOps tools: MLflow, Kubeflow, Airflow, Docker, Kubernetes, and CI/CD systems (Jenkins, GitHub Actions, Argo).
- Deep understanding of cloud infrastructure (AWS SageMaker, GCP Vertex AI, Azure ML) and data pipelines (Spark, Kafka, Snowflake, BigQuery).
- Solid experience with model monitoring, observability, and A/B testing frameworks.
- Strong understanding of software engineering principles: clean code, testing, version control, and documentation.
- Excellent communication and leadership skills, with a proven track record of cross-functional collaboration.
Nice-to-Have
- Experience with large language models (LLMs), retrieval-augmented generation (RAG), or generative AI pipelines.
- Exposure to feature stores (Feast, Tecton) and real-time data systems.
- Understanding of data governance, ethics, and security in AI systems.
- Familiarity with full-stack development, to support front-end and back-end integration of ML services.