Join Innodata’s global Data Science team as an LLM Research Data Scientist and drive innovation in Multimodal Large Language Models and Agentic Workflows. In this position, you’ll contribute across the full LLM lifecycle, from data collection and post-training to evaluation and benchmarking. Collaborate closely with our data engineering teams to translate data into client-facing insights. Enjoy the freedom to publish scientific work and contribute to open-source initiatives.
Key Responsibilities
- Analyze existing workflows and data pipelines to extract insights, identify areas of improvement, and propose recommendations.
- Collaborate with internal teams and clients to operationalize findings.
- Specialize in one of the following focus areas:
  - Data: Collection, curation, and synthetic data creation
  - Modeling: Multimodal post-training and agentic workflow creation
  - Evaluation: LLM and Agent benchmarking and performance analysis
What We Look For
- Strong communication and collaboration skills across technical and non-technical audiences
- Ability to work autonomously, self-motivated, ambitious
- Client-facing mindset: active listening, clear articulation of findings and recommendations
- Willingness to help upskill teammates in data science and generative AI fundamentals
Qualifications
- Master’s degree in Data Science, Computer Science or a related field
- 10+ years of experience in data science and client-facing roles
- High proficiency with AI tools (e.g., Copilots, GenAI platforms) and applying them to complex real-world use cases
- Expert-level Python skills, with deep familiarity with LLM frameworks for data manipulation, post-training and evaluation