Primary Skills : Cloud (AWS, Azure, GCP), Python, GenAI, Keras, TensorFlow, scikit-learn, XGBoost, Neural Networks, Apache PySpark, Random Forest, Machine Learning, AI / ML, PyTorch, SQL
Requirements :
Must have skills for Lead Data Scientist role :
Python - 2+ years
Pandas - 2+ years
Must have led one end to end project
Must have customer facing experience
This person will manage a project as a Tech Lead.
Must have experience working with ML algorithms, including Random Forest, XGBoost, and neural networks
Must have Frameworks :
scikit-learn
At least one of these three is required : PyTorch / Keras / TensorFlow
Candidate must have worked on at least one end-to-end machine learning project
Basic SQL is mandatory
Job Description
Position : Lead Data Scientist
Experience Level : 3 - 5 years
Job Summary :
We are seeking a Sr. Data Scientist to join our team to analyze complex datasets, develop machine learning models, and provide data-driven insights to support business decisions.
Key Responsibilities
Data Analysis & Exploration
- Perform comprehensive exploratory data analysis (EDA) on large datasets.
- Identify patterns, trends, and anomalies in data.
- Conduct statistical analysis to validate business hypotheses.
- Create data visualizations to communicate findings effectively.
- Assess and ensure data quality and integrity.
- Write complex SQL queries to extract and manipulate data.
Machine Learning & Modeling
- Design and develop machine learning models for business problems, using techniques such as XGBoost, Logistic Regression, DNN, and RNN.
- Implement supervised and unsupervised learning algorithms.
- Perform feature engineering and selection.
- Evaluate model performance using appropriate metrics.
- Deploy and monitor machine learning models in production.
Programming & Development
- Develop data analysis scripts and automation tools using Python.
- Build data pipelines and ETL processes.
- Create reusable code libraries and functions.
- Maintain version control and documentation standards.
Required Qualifications
Technical Skills
- SQL : Advanced proficiency in writing complex queries, joins, subqueries, and database optimization.
- Python : Strong programming skills in Python for data analysis and machine learning.
- Exploratory Data Analysis : Expertise in EDA techniques, statistical analysis, and data visualization.
- Machine Learning : Solid understanding of ML algorithms, model evaluation, and validation techniques.
- Statistics : Knowledge of statistical methods, hypothesis testing, and experimental design.
- Knowledge of a cloud platform such as AWS, GCP, or Azure is good to have.
- Familiarity with version control systems.
- Experience with containerization and deployment tools.
Good to Have :
- Experience working on GenAI-based projects.
- Using GenAI to drive productivity in your work.