Description :
Function : Data Science and Analysis > Data Science / Machine Learning
Data Validation, Machine Learning, Python
We are seeking a detail-oriented Data Labeller to help build and validate high-quality datasets for AI model development. You will work with text, numerical, and image data, ensuring accuracy, consistency, and schema compliance. Alongside dataset creation, you will verify model predictions to ensure outputs meet internal quality standards and SLAs, directly supporting reliable datasets and strong model performance.
Responsibilities :
- Assist in building and curating high-quality datasets for AI model training and evaluation.
- Perform labelling, tagging, categorisation, and annotation tasks across text, numerical, and image data.
- Verify data integrity and compliance against schemas, guidelines, or external data sources.
- Review and validate AI model predictions to ensure they meet internal SLAs for accuracy, consistency, and reliability.
- Flag and document errors or misclassifications for retraining or process improvement.
- Ensure accuracy and consistency across large volumes of data, flagging anomalies or issues.
- Collaborate with stakeholders to refine guidelines and processes.
- Maintain organised and well-documented labelling workflows, ensuring reproducibility and auditability.
- Provide feedback to improve labelling tools, processes, and quality assurance checks.
Requirements :
- Strong proficiency in Excel or Google Sheets (formulas, pivot tables, lookups, filters, and data cleaning functions).
- Excellent written English, with high attention to grammar, spelling, and clarity.
- Strong attention to detail and ability to follow labelling instructions precisely.
- Ability to work efficiently with large datasets and repetitive tasks without loss of accuracy.
- Basic understanding of data structures, schemas, and quality control processes.
- Understanding of how labelled data and prediction QA feed into model retraining and improvement.
- Prior experience in dataset curation or quality assurance against SLAs.
- Knowledge of Python or SQL for data manipulation.
(ref : hirist.tech)