Position Title : QA / Data Validation Engineer
Experience : 4–7 Years
Location : Remote – India
Employment Type : Full-time
Description :
We are seeking a QA / Data Validation Engineer with a strong foundation in Python and SQL to design and implement automated testing for data pipelines. The ideal candidate will ensure data accuracy, integrity, and reliability across ingestion and transformation workflows. You’ll work closely with Data Engineers to validate and enhance large-scale, distributed data systems.
Tech Stack :
- Languages / Runtime : Python 3.11, SQL
- Containerization & Orchestration : Kubernetes (EKS)
- Workflow Orchestration : Prefect
- Parallel Task Runner : Ray
- Data Storage & Processing : AWS Aurora RDS, PostgreSQL (cached storage), S3 buckets
- CI / CD : Bitbucket with Helm charts and Terraform
Responsibilities :
- Develop automated validation tests for data ingestion and transformation pipelines.
- Ensure deduplication, reconciliation, and data quality rules are enforced.
- Build reusable mock datasets for regression testing and dual-run cutover validation.
- Collaborate with Data Engineers to identify, reproduce, and resolve defects in pipelines.
- Maintain and enhance validation frameworks to ensure data accuracy and consistency.
Preferred Skills :
- Experience with automated testing frameworks for data pipelines.
- Strong SQL skills for writing complex validation and reconciliation queries.
- Proficiency in Python for test automation and data validation scripting.
- Exposure to AWS (S3, RDS) and CI / CD pipelines (Bitbucket, Terraform).