About the Role
Our team is seeking a skilled data validation specialist to help develop and optimize large-scale training datasets for agentic AI.
- Validate large-scale training datasets and maintain strict quality standards.
- Develop and optimize Python-based pipelines for data processing and validation.
- Create, manipulate, and structure JSON tasks for function-calling workflows.
- Work closely with AI researchers and teams to align datasets with agentic AI use cases.
- Monitor data accuracy, consistency, and integrity across projects.
Requirements
- Strong proficiency in Python, JavaScript, Java, or another programming language for data handling and workflow automation.
- Solid understanding of SQL for querying and managing datasets.
- Hands-on experience with GitHub for version control and collaboration.
- Expertise in handling, creating, and validating JSON-formatted tasks.
- Exceptional attention to detail with a focus on data quality and consistency.
- Interest in or exposure to agentic AI, LLMs, or NLP best practices.
Benefits of this role include:
- A collaborative and dynamic work environment.
- Opportunities for growth and professional development.