Summary
We are seeking a high-performing Senior Data Scientist (4-7 years of experience) with mandatory expertise in statistical modeling, including regressions, anomaly detection, and clustering techniques.
The ideal candidate must be highly proficient in Python, utilizing essential libraries such as NumPy, Pandas, and Scikit-learn, and possess hands-on experience with scalable data processing via Databricks and PySpark.
Key responsibilities involve developing and deploying innovative ML solutions, occasionally utilizing NLP techniques, and communicating complex insights effectively to clients and cross-functional teams.
Key Responsibilities
Technical Modeling and Algorithm Development:
- Design and implement advanced statistical modeling techniques, including regressions, for predictive analysis and forecasting.
- Develop robust anomaly detection models that identify outliers and critical events in large datasets.
- Use clustering and categorization algorithms to segment data and uncover hidden patterns and customer insights.
- Apply Natural Language Processing (NLP), a secondary skill for this role, to text data analysis, sentiment analysis, or topic modeling where required.
Programming and Big Data Tools:
- Demonstrate strong proficiency in Python and its core data manipulation libraries: NumPy and Pandas.
- Use Scikit-learn to implement standard machine learning algorithms and conduct rigorous model validation.
- Leverage Databricks and PySpark for scalable data ingestion, transformation, and model training on large-scale, distributed data architectures.
Deployment, Communication, and Strategy:
- Apply good-to-have skills, such as familiarity with MLOps principles, to streamline the deployment, monitoring, and lifecycle management of production models.
- Use knowledge of cloud ML platforms like Azure ML (or similar Cognitive Services) for platform-specific deployment and management tasks.
- Maintain excellent communication skills, backed by client-facing experience, to clearly articulate technical methodologies, project progress, and the business impact of data science findings to non-technical stakeholders.
Mandatory Skills & Qualifications:
- Modeling: Expertise in statistical modeling for regressions, anomaly detection, clustering, and categorization.
- Core Tech: Python and its essential libraries: NumPy, Pandas, and Scikit-learn.
- Big Data: Hands-on experience with Databricks and PySpark.
- Communication: Excellent communication skills with client-facing experience.
Preferred Skills:
- Experience with MLOps practices and tools (e.g., MLflow, Kubeflow).
- Familiarity with cloud ML platforms like Azure ML or similar cloud-native services.
- Experience with Natural Language Processing (NLP) techniques and libraries (e.g., NLTK, spaCy).
- Advanced SQL query writing and data warehousing experience.
(ref: iimjobs.com)