Job Summary
We are seeking a Data Scientist to join our team to analyze complex datasets, develop machine learning models, and provide data-driven insights to support business decisions.
Key Responsibilities
Data Analysis & Exploration
- Perform comprehensive exploratory data analysis (EDA) on large datasets.
- Identify patterns, trends, and anomalies in data.
- Conduct statistical analysis to validate business hypotheses.
- Create data visualizations to communicate findings effectively.
- Assess and ensure data quality and integrity.
- Write complex SQL queries to extract and manipulate data.
Machine Learning & Modeling
- Design and develop machine learning models for business problems (e.g., XGBoost, logistic regression, DNNs, RNNs).
- Implement supervised and unsupervised learning algorithms.
- Perform feature engineering and selection.
- Evaluate model performance using appropriate metrics.
- Deploy and monitor machine learning models in production.
Programming & Development
- Develop data analysis scripts and automation tools using Python.
- Build data pipelines and ETL processes.
- Create reusable code libraries and functions.
- Maintain version control and documentation standards.
Required Qualifications
Technical Skills
- SQL: Advanced proficiency in writing complex queries, joins, subqueries, and database optimization.
- Python: Strong programming skills in Python for data analysis and machine learning.
- Exploratory Data Analysis: Expertise in EDA techniques, statistical analysis, and data visualization.
- Machine Learning: Solid understanding of ML algorithms, model evaluation, and validation techniques.
- Statistics: Knowledge of statistical methods, hypothesis testing, and experimental design.
- Must have experience working with ML algorithms, including Random Forest, XGBoost, and neural networks.
- Familiarity with version control systems.
- Experience with containerization and deployment tools.
- Knowledge of a cloud platform such as AWS, GCP, or Azure is good to have.
Good to Have
- Experience working on GenAI-based projects.
- Experience using GenAI to drive productivity in your work.
- Knowledge of PySpark is a plus.
Must-Have Frameworks
- scikit-learn
- At least one of the following three: PyTorch, Keras, or TensorFlow.