Quality and Analytics Specialist
About Us
We do things differently. We build a solution for enterprises to make sense of all of their information. We know how important it is for companies to understand their customers, so we provide our technology to solve their biggest challenges. We believe in open and transparent communication, not strict rules and hierarchies. We are a team of hardworking, talented people who aim to build software that makes sense of data. We’ve got some huge challenges ahead of us, and we need smart, driven wordsmiths to help us tackle them. If you think you’ve got what it takes—join us.
Role Summary
We are seeking a QA specialist to ensure the quality, accuracy, and reliability of data workflows executed through JupyterLab notebooks and data platforms. This role focuses on validating processing logic, workflows, and analytical outputs built using Python, Spark, and modern data libraries.
Key Responsibilities
1. Data & Notebook Quality Assurance
Validate JupyterLab notebooks for logic accuracy, data transformation, and analytical correctness using libraries like Pandas, NumPy, PySpark, Apache Sedona, and GeoMesa.
Ensure seamless integration of notebooks within the Syntasa platform, confirming all required libraries, kernels, and dependencies function reliably for users.
Verify error-free execution, correct geospatial operations, and a stable, consistent user experience across user roles and environments.
2. Data Validation & Reconciliation
Perform backend data validation using complex SQL queries and Python-based comparison scripts.
Validate aggregation logic, statistical calculations, and transformations implemented through Pandas and Spark frameworks.
Identify anomalies, duplicates, and data inconsistencies across datasets.
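To give candidates a feel for this work, here is a minimal sketch of a Python-based comparison script of the kind described above. It is illustrative only and not taken from our codebase: the function name reconcile, the "id" key, and the sample rows are all hypothetical, and a real check would typically run against query results or Pandas / Spark DataFrames rather than in-memory lists.

```python
from collections import Counter


def reconcile(source_rows, target_rows, key="id"):
    """Compare two row sets by key and report discrepancies.

    Hypothetical helper for illustration; rows are plain dicts.
    """
    # Keys that occur more than once in either dataset.
    source_counts = Counter(row[key] for row in source_rows)
    target_counts = Counter(row[key] for row in target_rows)

    # Index rows by key; if a key is duplicated, the last row wins.
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}

    shared_keys = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in shared_keys if source[k] != target[k]),
        "duplicates_in_source": sorted(k for k, n in source_counts.items() if n > 1),
        "duplicates_in_target": sorted(k for k, n in target_counts.items() if n > 1),
    }


# Made-up sample data: id 2 differs, id 3 is missing, id 4 is extra and duplicated.
source_rows = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 250},
    {"id": 3, "amount": 75},
]
target_rows = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 205},
    {"id": 4, "amount": 60},
    {"id": 4, "amount": 60},
]

report = reconcile(source_rows, target_rows)
print(report)
```

Returning a structured report rather than failing on the first difference is a deliberate choice in this sketch: it lets one run surface missing records, value mismatches, and duplicates together, which is usually what a validation report needs.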
3. Pipeline & Platform Testing
Test data pipelines developed using Spark / PySpark and associated libraries.
Validate ingestion workflows from multiple sources including APIs, files, and cloud storage.
Test performance and scalability of pipelines handling high-volume datasets.
4. Test Planning & Reporting
Design detailed test cases covering notebook execution, data integrity, and analytical accuracy.
Maintain test artifacts and data validation reports.
Log and track defects using Jira or similar tools.
5. Collaboration & Continuous Improvement
Work closely with data engineers and data scientists to validate logic developed in notebooks and Python-based analytics frameworks.
Suggest improvements to data quality checks and automation strategies.
Required Skills
Must Have
Strong Python scripting for data validation and automation
Experience testing JupyterLab or notebook-based workflows
Hands-on validation of data logic using Pandas, NumPy, PySpark, Apache Sedona (GeoSpark), and/or GeoMesa
Strong SQL expertise for reconciliation and analysis
Experience testing Spark-based data pipelines
Good to Have
Familiarity with cloud platforms (GCP / AWS)
Experience with Airflow or similar schedulers
Quality Specialist • Bhubaneswar, Orissa, IN