Job Description :
We are seeking a highly skilled Big Data Tester with strong expertise in ETL Testing, Hadoop, and Spark / PySpark. The candidate will be responsible for ensuring data quality, validating large-scale data pipelines, and working closely with data engineers and analysts to deliver reliable big data solutions.
Key Responsibilities :
- Perform end-to-end testing of big data applications, data pipelines, and ETL workflows.
- Validate data transformations, aggregations, and data quality across multiple sources and targets.
- Conduct functional, regression, system integration, and performance testing for big data solutions.
- Write and execute complex SQL and Hive queries to verify data accuracy and consistency.
- Collaborate with data engineers and developers to debug and resolve data issues.
- Design and maintain test plans, test cases, and automation frameworks for big data testing.
- Perform batch and streaming data validation on Hadoop ecosystem components.
- Ensure compliance with data governance, quality, and security standards.
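To illustrate the kind of source-to-target reconciliation the responsibilities above describe, here is a minimal sketch in plain Python (hypothetical column names; in a real pipeline these rows would come from Spark DataFrames or Hive query results rather than in-memory lists):

```python
# Hypothetical sketch of an ETL reconciliation check: compare a source
# extract against the loaded target for row counts, missing/extra keys,
# and an aggregate (sum) on a measure column.

def validate_load(source_rows, target_rows, key="id", measure="amount"):
    """Return basic reconciliation results between source and target."""
    src_keys = {r[key] for r in source_rows}
    tgt_keys = {r[key] for r in target_rows}
    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "missing_in_target": src_keys - tgt_keys,      # records dropped by the load
        "unexpected_in_target": tgt_keys - src_keys,   # records that should not exist
        "sum_match": sum(r[measure] for r in source_rows)
                     == sum(r[measure] for r in target_rows),
    }

source = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
target = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
print(validate_load(source, target))
```

In practice the same checks are usually expressed as paired SQL/Hive queries against source and target tables, with only the aggregated results compared in the test harness.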
Required Skills & Experience :
- Strong experience in ETL Testing with a solid understanding of data warehousing concepts.
- Hands-on expertise in the Hadoop ecosystem (HDFS, Hive, HBase, Sqoop, etc.).
- Proficiency in Apache Spark / PySpark for data validation and transformations.
- Strong SQL skills (Oracle, Teradata, MySQL, or similar) for data validation.
- Experience with big data testing strategies, including volume, performance, and scalability testing.
- Familiarity with data pipeline orchestration tools (Airflow, Oozie, NiFi, etc.).
- Exposure to test automation frameworks for big data testing is a plus.
- Good analytical, debugging, and problem-solving skills.

(ref : hirist.tech)