Role Overview

Build, train, and deploy machine learning models for predictive analytics and data-driven decision making. Implement end-to-end ML pipelines from data preparation to production deployment.

Key Responsibilities

- Develop and train ML models for classification, regression, forecasting, and anomaly detection
- Perform feature engineering, data preprocessing, and exploratory data analysis
- Implement model training pipelines with hyperparameter optimization (see the sketch after this list)
- Deploy models to production and integrate with application services
- Monitor model performance, detect drift, and trigger retraining
- Collaborate with data engineers on feature store and data pipeline design
- Conduct A/B testing and model performance evaluation
- Document model architectures, experiments, and deployment processes
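To give a flavor of the pipeline work above, here is a minimal sketch of a training pipeline with hyperparameter optimization in scikit-learn. The dataset, estimator, and parameter grid are illustrative assumptions, not project specifics.

```python
# Minimal sketch: training pipeline with hyperparameter search (scikit-learn).
# The dataset, estimator, and grid below are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),               # preprocessing step
    ("clf", LogisticRegression(max_iter=1000)), # model step
])

param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}  # hyperparameters to search
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("held-out score:", search.score(X_test, y_test))
```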
Required Skills

Machine Learning:

- Strong foundation in supervised and unsupervised learning algorithms
- Time-series forecasting and anomaly detection techniques
- Classification, regression, clustering, and ensemble methods
- Feature engineering and feature selection strategies
- Model evaluation metrics and validation techniques
- Handling imbalanced datasets and data quality issues (see the sketch after this list)
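As one common approach to the last two items, a minimal sketch of evaluating a classifier on an imbalanced dataset using class weighting, a stratified split, and standard metrics; the synthetic data and model choice are assumptions for illustration.

```python
# Minimal sketch: evaluating a classifier on an imbalanced dataset.
# The synthetic data (5% positives) and model choice are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=0
)

# class_weight="balanced" reweights the loss inversely to class frequency.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]
print(classification_report(y_test, clf.predict(X_test)))  # per-class precision/recall
print("ROC AUC:", roc_auc_score(y_test, proba))            # threshold-free metric
```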
Statistical & Mathematical:

- Statistics, probability, and linear algebra
- Hypothesis testing and statistical inference
- Optimization algorithms and gradient descent (see the sketch after this list)
- Understanding of model bias, variance, and overfitting
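As a reminder of the mechanics behind gradient descent, a minimal NumPy sketch fitting least-squares linear regression; the learning rate, iteration count, and synthetic data are arbitrary choices for illustration.

```python
# Minimal sketch: batch gradient descent on least-squares linear regression.
# Learning rate and iteration count are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2 / len(y) * X.T @ (X @ w - y)  # gradient of mean squared error
    w -= lr * grad                          # step along the negative gradient

print("estimated weights:", w)  # should approach [2.0, -1.0, 0.5]
```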
Data Processing:

- Data cleaning, transformation, and normalization (see the sketch after this list)
- Exploratory Data Analysis (EDA) and data visualization
- Working with structured and unstructured data
- ETL/ELT pipeline integration
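For the cleaning and normalization work above, a minimal pandas sketch; the toy table, imputation strategy, and min-max scaling are illustrative assumptions.

```python
# Minimal sketch: cleaning and normalizing a small tabular dataset with pandas.
# The column names and imputation choices are illustrative assumptions.
import pandas as pd

df = pd.DataFrame({
    "age": [25, None, 40, 35],
    "income": [50_000, 62_000, None, 58_000],
    "segment": ["a", "b", "b", None],
})

# Impute missing numeric values with the median, categoricals with the mode.
for col in ["age", "income"]:
    df[col] = df[col].fillna(df[col].median())
df["segment"] = df["segment"].fillna(df["segment"].mode()[0])

# Min-max normalize numeric columns to [0, 1].
for col in ["age", "income"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

print(df.describe(include="all"))
```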
Required Tech Stack

Programming & ML:

- Languages: Python (expert), SQL
- ML Libraries: scikit-learn, XGBoost, LightGBM, CatBoost
- Deep Learning: PyTorch or TensorFlow, Keras
- Data Processing: Pandas, NumPy, Polars
- Visualization: Matplotlib, Seaborn, Plotly (see the sketch after this list)
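As a small taste of the visualization work, a minimal EDA sketch with Matplotlib and Seaborn; the built-in iris dataset and output file name are assumptions for illustration.

```python
# Minimal sketch: a quick EDA visualization with Matplotlib and Seaborn.
# The dataset and output file name are illustrative assumptions.
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

df = load_iris(as_frame=True).frame

sns.pairplot(df, hue="target")   # pairwise feature relationships by class
plt.savefig("eda_pairplot.png")  # write to disk rather than assuming a display
```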
MLOps & Deployment:

- Experiment Tracking: MLflow, Weights & Biases
- Model Serving: FastAPI, Flask, TensorFlow Serving (see the sketch after this list)
- Containerization: Docker
- Version Control: Git, DVC (Data Version Control)
- Workflow: Airflow, Prefect
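To give a sense of the model-serving work, a minimal sketch exposing a trained scikit-learn model over FastAPI; the model file name, feature schema, and module name are hypothetical assumptions for illustration.

```python
# Minimal sketch: serving a pickled scikit-learn model with FastAPI.
# The model path ("model.pkl") and flat feature schema are illustrative assumptions.
import pickle
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
with open("model.pkl", "rb") as f:  # hypothetical artifact produced by training
    model = pickle.load(f)

class Features(BaseModel):
    values: list[float]  # one flat feature vector per request

@app.post("/predict")
def predict(features: Features):
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()}

# Run with: uvicorn serve:app --reload   (assuming this file is named serve.py)
```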
Cloud & Tools:

- Cloud Platforms: AWS (SageMaker), Azure ML, or GCP (Vertex AI)
- Databases: SQL (PostgreSQL, MySQL), NoSQL basics
- Tools: Jupyter, VS Code, Linux/Unix
Preferred Qualifications

- Bachelor's/Master's in Computer Science, Data Science, Statistics, or related field
- Experience with distributed training (Spark MLlib, Ray)
- Knowledge of AutoML and hyperparameter tuning frameworks (Optuna, Hyperopt); see the sketch after this list
- Kaggle competitions or ML portfolio projects
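For the tuning frameworks named above, a minimal Optuna sketch; the objective, model, and search space are illustrative assumptions.

```python
# Minimal sketch: hyperparameter tuning with Optuna.
# The search space and model choice are illustrative assumptions.
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 16),
    }
    model = RandomForestClassifier(**params, random_state=0)
    return cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("best params:", study.best_params)
```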
What Success Looks Like

- Production models achieving target accuracy and business KPIs
- Automated ML pipelines reducing manual intervention
- Fast iteration cycles for model experimentation
- Well-documented, maintainable code and models
- Collaboration with cross-functional teams