Role : Junior Data Scientist
Experience : Minimum 4 years
Location : Hyderabad
About the role :
We are scaling an AI innovation team focused on practical, high-impact use cases for pharma manufacturers and commercial operations (CMO, CDMO, CRM). In this role you will rapidly prototype ML / AI solutions, work with cross-functional stakeholders (R&D, Process Development, Quality, Manufacturing, Commercial), and contribute to moving validated models toward production. This role is ideal for a pragmatic scientist / engineer with 4–5 years of applied data science experience who wants to focus on pharma problems (process analytics, PAT, sensor data, imaging, customer analytics) and learn regulated deployment practices.
Responsibilities :
➢ Collaborate with domain teams to translate business / regulatory problems into data science hypotheses and testable experiments.
➢ Acquire, clean, and integrate manufacturing (MES, LIMS, PAT, sensor / time-series, batch records), laboratory, and CRM / commercial datasets.
➢ Rapidly prototype models and algorithms : regression, tree ensembles, time-series forecasting, anomaly detection, clustering / segmentation, and basic deep learning (e.g., CNNs for imaging, RNNs / temporal models).
➢ Build explainability and uncertainty estimates into prototype models for regulated decision-support.
➢ Participate in documentation needed for audits / validation.
➢ Implement monitoring experiments (drift detection, simple retraining pipelines) and hand off monitoring requirements.
➢ Follow data governance and security policies.
➢ Translate model results into actionable recommendations.
Required qualifications :
➢ 4–5 years of hands-on experience in applied data science, machine learning, or analytics (industry or research with applied projects).
➢ (Master’s preferred).
➢ Working knowledge of SQL.
➢ Practical experience with time-series / sensor data or tabular modeling in production-like settings.
➢ Experience with at least one deep learning framework (PyTorch or TensorFlow) for applied tasks.
➢ Demonstrated ability to move from problem definition to prototype and present results to stakeholders.
➢ Good documentation practices, basic testing, and reproducible analysis (notebooks + script refactors).
➢ Clear communication skills and ability to work in cross-functional teams.
➢ Willingness to work in regulated environments and follow documentation / validation processes.
Preferred qualifications :
➢ Prior experience in pharma, biotech, CMO / CDMO, manufacturing, or other regulated industries.
➢ Familiarity with MES, LIMS, ELN, PAT, or industrial IoT data sources.
➢ Experience with MLOps basics (Docker, simple CI / CD, MLflow, Airflow) or production handoffs.
➢ Knowledge of multivariate statistical process control (MSPC), DOE, chemometrics, or Six Sigma concepts.
➢ Exposure to LLMs and prompt engineering for knowledge extraction, summarization or augmentation of domain content (SOPs, batch records).
➢ Experience with cloud platforms (AWS / Azure / GCP) and data platforms (Snowflake, Redshift, BigQuery).
➢ Understanding of model explainability (SHAP, LIME) and uncertainty quantification techniques.
Technical stack :
➢ Languages : Python (pandas, scikit-learn, xgboost / lightgbm), SQL
➢ DL : PyTorch or TensorFlow / Keras
➢ Time-series : tsfresh, statsmodels, prophet, tslearn
➢ MLOps / infra : Docker, MLflow, Airflow (familiarity)
➢ Storage / Viz : S3 / object store, Postgres, Tableau / Power BI / Plotly
➢ Tools : Git, Jupyter / VS Code, basic Linux shell
AI Innovation Scientist • Hyderabad, Republic of India, IN