QA Analyst – Data Science

Volga Infotech – Tirunelveli, Tamil Nadu, India
Job description
  • We are currently hiring for a senior-level position and are looking for immediate joiners only.
  • If you are interested, please send your updated resume to resume@volgainfotech.com along with your CTC, expected CTC (ECTC), and notice period.

    Location: Remote

    Employment Type: Full-time

    About the Role

    The QA Engineer will own quality assurance across the ML lifecycle: from raw data validation through feature engineering checks, model training/evaluation verification, batch prediction/optimization validation, and end-to-end (E2E) workflow testing. The role is hands-on with Python automation, data profiling, and pipeline test harnesses in Azure ML and Azure DevOps. Success means provably correct data, models, and outputs at production scale and cadence.

    Key Responsibilities

    • Test Strategy & Governance
      ○ Define an ML-specific test strategy covering data quality KPIs, feature consistency checks, model acceptance gates (metrics + guardrails), and E2E run acceptance (timeliness, completeness, integrity).
      ○ Establish versioned test datasets and golden baselines for repeatable regression of features, models, and optimizers.
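As an illustration of the golden-baseline idea, a regression gate can fingerprint a freshly generated feature set and compare it to a versioned baseline. This is only a sketch, assuming features arrive as a pandas DataFrame; the helper name and columns are hypothetical, not part of any existing harness:

```python
import hashlib

import pandas as pd


def frame_fingerprint(df: pd.DataFrame) -> str:
    """Stable SHA-256 fingerprint of a DataFrame's contents."""
    # Sort columns and rows so the hash is independent of ordering.
    canonical = df[sorted(df.columns)].sort_values(by=sorted(df.columns))
    payload = canonical.to_csv(index=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def test_features_match_golden_baseline():
    # Hypothetical feature slice and its stored golden baseline.
    features = pd.DataFrame({"site": ["A", "B"], "price_mean_7d": [10.5, 12.0]})
    golden = pd.DataFrame({"price_mean_7d": [10.5, 12.0], "site": ["A", "B"]})
    # Regression gate: a run passes only if its features hash
    # identically to the versioned golden baseline.
    assert frame_fingerprint(features) == frame_fingerprint(golden)
```

Hash equality gives a strict "bit-for-bit" parity check; the statistical-similarity variant mentioned above would relax this to distribution comparisons.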

    • Data Quality & Transformation
      ○ Validate raw data extracts and landed data-lake data: schema/contract checks, null/outlier thresholds, time-window completeness, duplicate detection, site/material coverage.
      ○ Validate transformed/feature datasets: deterministic feature generation, leakage detection, drift vs. historical distributions, feature parity across runs (hash or statistical similarity tests).
      ○ Implement automated data quality checks (e.g., Great Expectations / pytest + Pandas / SQL) executed in CI and AML pipelines.
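A minimal sketch of such a check using plain pytest + pandas (the column names and the 1% null threshold are assumptions for illustration, not a prescribed contract):

```python
import pandas as pd


def check_raw_extract(df: pd.DataFrame) -> list[str]:
    """Return a list of data-quality violations for a raw extract."""
    failures = []
    required = {"site_id", "material_id", "price", "event_ts"}
    if not required.issubset(df.columns):
        failures.append(f"missing columns: {required - set(df.columns)}")
        return failures
    if df["price"].isna().mean() > 0.01:          # null threshold: 1%
        failures.append("price null rate above 1%")
    if df.duplicated(["site_id", "material_id", "event_ts"]).any():
        failures.append("duplicate site/material/timestamp rows")
    if (df["price"] <= 0).any():                  # simple outlier guard
        failures.append("non-positive prices present")
    return failures


def test_raw_extract_quality():
    df = pd.DataFrame({
        "site_id": [1, 1], "material_id": [7, 8],
        "price": [9.99, 10.25],
        "event_ts": pd.to_datetime(["2025-01-01", "2025-01-01"]),
    })
    assert check_raw_extract(df) == []
```

Returning a list of violations (rather than failing on the first) lets one CI run surface every breach of the data contract at once.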

    • Model Training & Evaluation
      ○ Verify training inputs (splits, windowing, target leakage prevention) and hyperparameter configs per site/cluster.
      ○ Automate metric verification (e.g., MAPE/MAE/RMSE, uplift vs. last model, stability tests) with acceptance thresholds and champion/challenger logic.
      ○ Validate feature importance stability and sensitivity/elasticity sanity checks (price-volume monotonicity where applicable).
      ○ Gate model registration/promotion in AML based on signed test artifacts and reproducible metrics.
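To make the champion/challenger gate concrete, here is one way such a check might look; the 15% MAPE ceiling, function names, and uplift rule are hypothetical examples, not thresholds from the role:

```python
import numpy as np


def mape(y_true, y_pred) -> float:
    """Mean absolute percentage error (assumes no zero actuals)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


def passes_promotion_gate(challenger_mape: float,
                          champion_mape: float,
                          max_mape: float = 0.15,
                          min_uplift: float = 0.0) -> bool:
    """Champion/challenger gate: the challenger must beat an absolute
    accuracy threshold AND not regress vs. the current champion."""
    return (challenger_mape <= max_mape
            and (champion_mape - challenger_mape) >= min_uplift)
```

Encoding the gate as a pure function makes it trivial to unit-test and to run identically in CI and inside an AML promotion step.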

    • Predictions, Optimization & Guardrails
      ○ Validate batch predictions: result shapes, coverage, latency, and failure handling.
      ○ Test model optimization outputs and enforced guardrails: detect violations and prove idempotent writes to the DB.
      ○ Verify API push to the third-party system (idempotency keys, retry/backoff, delivery receipts).
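The idempotency-key plus retry/backoff pattern named above can be sketched as follows; `send` is a stand-in for whatever API client performs the real push, and the key fields are illustrative:

```python
import hashlib
import time


def idempotency_key(site_id: int, material_id: int, run_id: str) -> str:
    """Deterministic key: the same prediction row always maps to the
    same key, so retried pushes can be deduplicated downstream."""
    raw = f"{site_id}:{material_id}:{run_id}"
    return hashlib.sha256(raw.encode()).hexdigest()


def push_with_retry(send, payload, key, attempts=3, base_delay=0.1):
    """Retry with exponential backoff; re-raises after the last attempt.
    Because the idempotency key is constant across retries, the receiver
    can safely discard duplicate deliveries."""
    for attempt in range(attempts):
        try:
            return send(payload, headers={"Idempotency-Key": key})
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

A QA test for this path would assert both that transient failures are retried and that every retry carries the identical key.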

    • Pipelines & E2E
      ○ Build pipeline test harnesses for AML pipelines (nightly data-gen, weekly training, prediction/optimization), including orchestrated synthetic runs and fault injection (missing slice, late competitor data, SB backlog).
      ○ Run E2E tests from raw data store -> ADLS -> AML -> RDBMS -> APIM / Frontend; assert freshness SLOs and audit event completeness (Event Hubs -> ADLS immutable).

    • Automation & Tooling
      ○ Develop Python-based automated tests (pytest) for data checks, model metrics, and API contracts; integrate with Azure DevOps (pipelines, badges, gates).
      ○ Implement data-driven test runners (parameterized by site/material/model-version) and store signed test artifacts alongside models in the AML Registry.
      ○ Create synthetic test data generators and golden fixtures to cover edge cases (price gaps, competitor shocks, cold starts).
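A data-driven runner of this kind could be sketched with pytest's `parametrize`; the site/material grid and the `load_predictions` stub are placeholders, since the real parameters would come from config or the model registry:

```python
import pytest

# Hypothetical parameter grid; in practice this would be loaded from
# configuration or the registry rather than hard-coded.
SITES = ["plant_a", "plant_b"]
MATERIALS = ["mat_001", "mat_002"]


def load_predictions(site: str, material: str) -> list[float]:
    """Stand-in for fetching one batch-prediction slice (hypothetical)."""
    return [10.0, 11.5, 9.8]


@pytest.mark.parametrize("site", SITES)
@pytest.mark.parametrize("material", MATERIALS)
def test_prediction_slice_coverage(site, material):
    # One test case per (site, material) pair: 4 cases from this grid.
    preds = load_predictions(site, material)
    assert len(preds) > 0, f"empty slice for {site}/{material}"
    assert all(p > 0 for p in preds), f"non-positive prediction in {site}/{material}"
```

Stacked `parametrize` marks expand into the full cross-product, so coverage grows automatically as sites or materials are added to the grid.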

    • Reporting & Quality Ops
      ○ Publish weekly test reports and go/no-go recommendations for promotions; maintain a defect taxonomy (data vs. model vs. serving vs. optimization).
      ○ Contribute to SLI/SLO dashboards (prediction timeliness, queue/DLQ, push success, data drift) used for release gates.

      Required Qualifications

      ○ 5–7+ years in QA with 3+ years focused on ML/Data systems (data pipelines + model validation).
      ○ Python automation (pytest, pandas, NumPy), SQL (PostgreSQL/Snowflake), and CI/CD (Azure DevOps) for fully automated ML QA.
      ○ Strong grasp of ML validation: leakage checks, proper splits, metric selection (MAE/MAPE/RMSE), drift detection, sensitivity/elasticity sanity checks.
      ○ Experience testing AML pipelines (pipelines/jobs/components) and message-driven integrations (Service Bus/Event Hubs).
      ○ API test skills (FastAPI/OpenAPI, contract tests, Postman/pytest-httpx) plus idempotency and retry patterns.
      ○ Familiarity with feature stores/feature engineering concepts and reproducibility.
      ○ Solid understanding of observability (App Insights/Log Analytics) and auditability requirements.

      Education

    • Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
    • Certification in Azure Data or ML Engineer Associate is a plus.