Role Overview

Build, train, and deploy machine learning models for predictive analytics and data-driven decision making. Implement end-to-end ML pipelines, from data preparation to production deployment.
Key Responsibilities
- Develop and train ML models for classification, regression, forecasting, and anomaly detection
- Perform feature engineering, data preprocessing, and exploratory data analysis
- Implement model training pipelines with hyperparameter optimization (see the pipeline sketch after this list)
- Deploy models to production and integrate with application services
- Monitor model performance, detect drift, and trigger retraining (a drift-check sketch also follows this list)
- Collaborate with data engineers on feature store and data pipeline design
- Conduct A/B testing and model performance evaluation
- Document model architectures, experiments, and deployment processes
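As an illustration of the training-pipeline responsibility above, here is a minimal sketch using Scikit-learn and Optuna (both listed in the tech stack below). The dataset, model, and search space are illustrative assumptions, not a prescribed setup.

```python
# Minimal sketch: training pipeline with hyperparameter optimization.
# Assumes scikit-learn and Optuna; the dataset and search space are illustrative.
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)

def objective(trial: optuna.Trial) -> float:
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 2, 8),
    }
    model = GradientBoostingClassifier(**params, random_state=42)
    # 5-fold cross-validated ROC AUC as the optimization target
    return cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print("Best params:", study.best_params)
```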
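For the drift-monitoring responsibility, one widely used check is the Population Stability Index (PSI) over a feature's distribution. The sketch below is a generic NumPy implementation; the 0.2 alert threshold is a common heuristic, not a fixed standard.

```python
# Minimal sketch: data-drift check via the Population Stability Index (PSI).
# Pure NumPy; the alert threshold is a rule of thumb, not a standard.
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Compare a live feature distribution against its training baseline."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_cnt, _ = np.histogram(expected, bins=edges)
    act_cnt, _ = np.histogram(actual, bins=edges)  # values outside the baseline range are dropped
    # Convert counts to proportions, with a small epsilon to avoid log(0)
    eps = 1e-6
    exp_pct = exp_cnt / exp_cnt.sum() + eps
    act_pct = act_cnt / act_cnt.sum() + eps
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

baseline = np.random.normal(0.0, 1.0, 10_000)  # training-time feature values
live = np.random.normal(0.3, 1.1, 10_000)      # shifted production values
psi = population_stability_index(baseline, live)
if psi > 0.2:  # common heuristic: >0.2 suggests significant drift
    print(f"PSI={psi:.3f}: drift detected, consider retraining")
```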
Required Skills

Machine Learning:
- Strong foundation in supervised and unsupervised learning algorithms
- Time-series forecasting and anomaly detection techniques
- Classification, regression, clustering, and ensemble methods
- Feature engineering and feature selection strategies
- Model evaluation metrics and validation techniques
- Handling imbalanced datasets and data quality issues (a short sketch follows this list)
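To make the imbalanced-data point concrete, here is a minimal sketch using Scikit-learn's built-in class weighting and a stratified split; the synthetic dataset and class ratio are illustrative.

```python
# Minimal sketch: handling an imbalanced dataset with class weighting
# and evaluating with metrics that are robust to imbalance.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# 95/5 class ratio to simulate imbalance
X, y = make_classification(n_samples=5_000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=0
)

# class_weight="balanced" reweights the loss inversely to class frequency
clf = LogisticRegression(class_weight="balanced", max_iter=1_000)
clf.fit(X_train, y_train)

# Report per-class precision/recall/F1 rather than raw accuracy
print(classification_report(y_test, clf.predict(X_test)))
```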
Statistical & Mathematical:
- Statistics, probability, and linear algebra
- Hypothesis testing and statistical inference
- Optimization algorithms and gradient descent (see the sketch after this list)
- Understanding of model bias, variance, and overfitting
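As a worked example of the gradient-descent bullet, a short NumPy sketch of batch gradient descent on least-squares linear regression; the learning rate and synthetic data are illustrative.

```python
# Minimal sketch: batch gradient descent for least-squares linear regression.
# Gradient of L(w) = (1/n) * ||Xw - y||^2  is  (2/n) * X^T (Xw - y).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

w = np.zeros(3)
lr = 0.1  # illustrative learning rate
for _ in range(500):
    grad = (2 / len(y)) * X.T @ (X @ w - y)
    w -= lr * grad

print("Recovered weights:", np.round(w, 3))  # should land near [1.5, -2.0, 0.5]
```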
Data Processing:
- Data cleaning, transformation, and normalization (a pandas sketch follows this list)
- Exploratory Data Analysis (EDA) and data visualization
- Working with structured and unstructured data
- ETL/ELT pipeline integration
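Here is a small pandas sketch of the cleaning, transformation, and normalization bullet above; the column names, outlier cap, and imputation choices are hypothetical.

```python
# Minimal sketch: data cleaning, transformation, and normalization with pandas.
# Column names ("age", "income", "signup_date") are hypothetical.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, np.nan, 47, 33, 150],  # has a missing value and an outlier
    "income": [40_000, 52_000, np.nan, 61_000, 58_000],
    "signup_date": ["2024-01-05", "2024-02-10", None, "2024-03-01", "2024-03-15"],
})

# Cleaning: cap implausible ages, impute missing numerics with the median
df["age"] = df["age"].clip(upper=100)
for col in ["age", "income"]:
    df[col] = df[col].fillna(df[col].median())

# Transformation: parse dates and derive a tenure feature
df["signup_date"] = pd.to_datetime(df["signup_date"])
df["tenure_days"] = (pd.Timestamp("2024-06-01") - df["signup_date"]).dt.days

# Normalization: z-score the numeric columns
for col in ["age", "income"]:
    df[col] = (df[col] - df[col].mean()) / df[col].std()

print(df.head())
```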
Required Tech Stack

Programming & ML:
- Languages: Python (expert), SQL
- ML Libraries: Scikit-learn, XGBoost, LightGBM, CatBoost
- Deep Learning: PyTorch or TensorFlow, Keras
- Data Processing: Pandas, NumPy, Polars
- Visualization: Matplotlib, Seaborn, Plotly

MLOps & Deployment:
- Experiment Tracking: MLflow, Weights & Biases
- Model Serving: FastAPI, Flask, TensorFlow Serving (a minimal FastAPI sketch closes this document)
- Containerization: Docker
- Version Control: Git, DVC (Data Version Control)
- Workflow: Airflow, Prefect

Cloud & Tools:
- Cloud Platforms: AWS (SageMaker), Azure ML, or GCP (Vertex AI)
- Databases: SQL (PostgreSQL, MySQL), NoSQL basics
- Tools: Jupyter, VS Code, Linux/Unix

Preferred Qualifications
- Bachelor's/Master's in Computer Science, Data Science, Statistics, or a related field
- Experience with distributed training (Spark MLlib, Ray)
- Knowledge of AutoML and hyperparameter tuning frameworks (Optuna, Hyperopt)
- Kaggle competitions or ML portfolio projects

What Success Looks Like
- Production models achieving target accuracy and business KPIs
- Automated ML pipelines reducing manual intervention
- Fast iteration cycles for model experimentation
- Well-documented, maintainable code and models
- Collaboration with cross-functional teams
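As a hedged illustration of the model-serving item in the tech stack, here is a minimal FastAPI endpoint that wraps a pickled Scikit-learn model; the model path, request schema, and route name are hypothetical.

```python
# Minimal sketch: serving a trained model with FastAPI.
# The model path ("model.pkl") and the flat-feature schema are hypothetical.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:  # a previously trained scikit-learn model
    model = pickle.load(f)

class PredictRequest(BaseModel):
    features: list[float]  # one flat feature vector per request

@app.post("/predict")
def predict(req: PredictRequest) -> dict:
    # scikit-learn expects a 2-D array: one row per sample
    prediction = model.predict([req.features])[0]
    return {"prediction": float(prediction)}

# Run with: uvicorn serve:app --reload  (assuming this file is serve.py)
```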