Senior Data Scientist
Experience : 5+ years | Location : Bengaluru, Chennai (Hybrid)
Akaike Technologies is a dynamic and innovative AI-driven company dedicated to building impactful solutions. Our mission is to empower businesses by harnessing the power of data and AI to drive growth, efficiency, and value. We foster a culture of collaboration, creativity, and continuous learning.
Experience Pre-Requisite : At least 5 years of total experience, including at least 4 years of relevant experience in Data Science.
Job Description :
We are seeking an experienced and highly skilled Senior Data Scientist to join our team in Bengaluru. This role focuses on driving innovative, large-scale solutions using cutting-edge Classical Machine Learning, PySpark, Spark SQL, and Generative AI. The ideal candidate will combine deep technical expertise, strong business acumen, effective communication skills, and a sense of ownership, and will be motivated to establish quantifiable business impact. We require a proven track record in designing, developing, and deploying scalable ML / DL pipelines and LLM Agents in real time, in a fast-paced, collaborative environment.
Key Responsibilities :
Large-Scale Data Handling, PySpark, & Databricks Deployment
- Efficiently handle and model billions of data points using multi-cluster data processing frameworks (PySpark, Spark SQL).
- Expertise in Databricks / AWS is a must-have: the ability to design, write, scale, and monitor end-to-end ML pipelines on Databricks / AWS.
- Proven expertise in running and managing Databricks data pipelines in real time for low-latency decision-making.
- Develop and implement scalable deployment pipelines using Docker and AWS services (ECR, Lambda, Step Functions).
Classical Machine Learning
- Own entire workstreams end to end, from use-case identification through initial design and POC, building custom machine learning solutions as needed, to calculating the business impact of the use case, while ensuring a modular, scalable, production-ready codebase.
- Design and implement custom models and loss functions, and handle nuanced conversations about the trade-offs between modeling choices.
- Apply specialized modeling for marketing scenarios (Targeting, Budget optimisation, Churn) and data limitations (Sparse / incomplete labels, Single class learning).
Generative AI & Large Language Models
- Practical experience in building LLM-ready Data Management layers for large-scale structured and unstructured data.
- Apply foundational understanding of LLM Agents and multi-agent systems (e.g., Agent-Critique, ReAct, Agent Collaboration), advanced prompting, LLM evaluation, confidence grading, and Human-in-the-Loop systems.
Team Mentorship and Stakeholder Management
- Mentor, support and manage a cross-functional team.
- Bring structure to the client engagement, both internally and externally, with effective, top-down communication.
- Act as the primary contact for clients, translating complex data needs into tasks.
- Present data insights to stakeholders, highlighting business impacts.
- Collaborate with cross-functional teams to align AI initiatives with business goals.
Must Have Technical Skills
Data Pipelines, PySpark & Databricks
- Proficiency in Python and its data science ecosystem (NumPy, Pandas, Dask, PySpark) for large-scale data processing.
- Expert, hands-on experience with Databricks for MLOps, pipeline orchestration, and real-time deployment.
- Ability to perform effective feature engineering by understanding complex business objectives.
Core Machine Learning & Deep Learning
- In-depth knowledge of Classical ML: Tree Based Models, GLMs, Clustering Models, etc.
- Deep Learning: ANN, 1D / 2D / 3D Convolutional Neural Networks (ConvNets), LSTMs, Transformer models.
- Strong proficiency in PU learning, single-class learning, and representation learning, alongside traditional ML approaches.
- Advanced understanding and application of model explainability techniques (e.g., SHAP, LIME).
- Hands-on experience with ML / DL libraries such as Scikit-learn, TensorFlow / Keras, and PyTorch.
Others
- Experience utilizing large language models (GPT-4, Mistral, Llama, Claude) through prompt engineering and custom fine-tuning.
- Code Versioning Systems: GitHub, Git
Must Have Soft Skills
- Communication Skills: Of all the things, this is perhaps the most important soft skill for us. You must be able to:
  - Capture the attention of your audience, usually in client calls.
  - Succinctly put across your ideas to your team members.
  - Bring clarity of thought and next steps to the table and present them well.
- Presentation Skills:
  - Be able to visually present your ideas on a whiteboard.
  - Be able to build compelling presentations for CxOs in a top-down manner, with business impact in mind.
- Problem Solving Skills:
  - Be able to leverage various internal tools and client datasets to craft a problem in the shortest time possible.
  - Be able to make trade-offs keeping the timelines in mind.
Relevant to Have
- Background in Pharma Domain.
- Knowledge of Recommender Systems & Next Best Action Systems.
Benefits and Perks
Competitive ESOP grants.
Support for publishing papers and attending academic / industry conferences.
High visibility across all functions at Akaike.