If you are interested, please send your updated resume to resume@volgainfotech.com along with details of your CTC, ECTC, and notice period.
Location : Remote
Employment Type : Full-time
About the Role
The QA Engineer will own quality assurance across the ML lifecycle—from raw data validation through
feature engineering checks, model training / evaluation verification, batch prediction / optimization
validation, and end-to-end (E2E) workflow testing. The role is hands-on with Python automation, data
profiling, and pipeline test harnesses in Azure ML and Azure DevOps. Success means provably correct
data, models, and outputs at production scale and cadence.
Key Responsibilities
○ Define an ML-specific Test Strategy covering data quality KPIs, feature consistency
checks, model acceptance gates (metrics + guardrails), and E2E run acceptance
(timeliness, completeness, integrity).
○ Establish versioned test datasets & golden baselines for repeatable regression of
features, models, and optimizers.
○ Validate raw data extracts and landed datalake data : schema / contract checks,
null / outlier thresholds, time-window completeness, duplicate detection, site / material
coverage.
○ Validate transformed / feature datasets : deterministic feature generation, leakage
detection, drift vs. historical distributions, feature parity across runs (hash or statistical
similarity tests).
○ Implement automated data quality checks (e.g., Great Expectations / pytest +
Pandas / SQL) executed in CI and AML pipelines (an illustrative sketch follows this list).
○ Verify training inputs (splits, windowing, target leakage prevention) and
hyperparameter configs per site / cluster.
○ Automate metric verification (e.g., MAPE / MAE / RMSE, uplift vs. last model, stability
tests) with acceptance thresholds and champion / challenger logic.
○ Validate feature importance stability and sensitivity / elasticity sanity checks (price-volume monotonicity where applicable).
○ Gate model registration / promotion in AML based on signed test artifacts and
reproducible metrics.
○ Validate batch predictions : result shapes, coverage, latency, and failure handling.
○ Test model optimization outputs and enforced guardrails : detect violations and prove
idempotent writes to DB.
○ Verify API push to the third-party system (idempotency keys, retry / backoff, delivery receipts).
○ Build pipeline test harnesses for AML pipelines (data-gen nightly, training weekly,
prediction / optimization) including orchestrated synthetic runs and fault injection
(missing slice, late competitor data, SB backlog).
○ Run E2E tests from raw data store -> ADLS -> AML -> RDBMS -> APIM / Frontend; assert freshness SLOs and audit event completeness (Event Hubs -> ADLS immutable).
○ Develop Python-based automated tests (pytest) for data checks, model metrics, and API
contracts; integrate with Azure DevOps (pipelines, badges, gates).
○ Implement data-driven test runners (parameterized by site / material / model-version)
and store signed test artifacts alongside models in AML Registry.
○ Create synthetic test data generators and golden fixtures to cover edge cases (price
gaps, competitor shocks, cold starts).
○ Publish weekly test reports and go / no-go recommendations for promotions; maintain a
defect taxonomy (data vs. model vs. serving vs. optimization).
○ Contribute to SLI / SLO dashboards (prediction timeliness, queue / DLQ, push success, data
drift) used for release gates.
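As an illustration of the automated data-quality checks referenced above, a minimal pytest + pandas sketch follows; the dataset path, column names, and thresholds are hypothetical placeholders, not part of the actual pipeline.

# Minimal pytest + pandas data-quality sketch (illustrative only; path, columns, and thresholds are hypothetical).
import pandas as pd
import pytest

RAW_EXTRACT = "data/raw/sales_extract.csv"  # hypothetical landed extract
REQUIRED_COLUMNS = {"site_id", "material_id", "price", "volume", "event_date"}
MAX_NULL_RATE = 0.01  # allow at most 1% nulls per required column

@pytest.fixture(scope="module")
def raw_df() -> pd.DataFrame:
    return pd.read_csv(RAW_EXTRACT)

def test_schema_contract(raw_df):
    # Schema / contract check: every required column must be present.
    missing = REQUIRED_COLUMNS - set(raw_df.columns)
    assert not missing, f"Missing required columns: {missing}"

def test_null_thresholds(raw_df):
    # Null-rate check per required column.
    null_rates = raw_df[list(REQUIRED_COLUMNS)].isna().mean()
    violations = null_rates[null_rates > MAX_NULL_RATE]
    assert violations.empty, f"Null-rate threshold exceeded:\n{violations}"

def test_no_duplicate_keys(raw_df):
    # Duplicate detection on the business key.
    dupes = raw_df.duplicated(subset=["site_id", "material_id", "event_date"]).sum()
    assert dupes == 0, f"{dupes} duplicate (site, material, date) rows found"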
Required Qualifications
○ 5–7+ years in QA with 3+ years focused on ML / Data systems (data pipelines + model validation).
○ Python automation (pytest, pandas, NumPy), SQL (PostgreSQL / Snowflake), and CI / CD (Azure
DevOps) for fully automated ML QA.
○ Strong grasp of ML validation : leakage checks, proper splits, metric selection
(MAE / MAPE / RMSE), drift detection, sensitivity / elasticity sanity checks.
○ Experience testing AML pipelines (pipelines / jobs / components) and message-driven integrations (Service Bus / Event Hubs).
○ API test skills (FastAPI / OpenAPI, contract tests, Postman / pytest-httpx) + idempotency and retry patterns (an illustrative sketch follows this section).
○ Familiar with feature stores / feature engineering concepts and reproducibility.
○ Solid understanding of observability (App Insights / Log Analytics) and auditability requirements.
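As an illustration of the idempotency and retry patterns referenced above, a minimal Python sketch follows; the endpoint URL, header name, and payload shape are hypothetical placeholders (assumes the requests library).

# Illustrative retry / backoff with an idempotency key (hypothetical endpoint and header).
import time
import uuid
import requests

def push_with_retry(payload: dict,
                    url: str = "https://partner.example.com/api/v1/prices",
                    max_attempts: int = 5) -> requests.Response:
    # One idempotency key per logical delivery: retries reuse it so the
    # receiving system can deduplicate and the push stays idempotent.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(url, json=payload, headers=headers, timeout=30)
            if resp.status_code < 500:
                return resp  # success or non-retryable client error
        except requests.RequestException:
            pass  # network error: fall through and retry
        time.sleep(2 ** attempt)  # exponential backoff before the next attempt
    raise RuntimeError(f"Delivery failed after {max_attempts} attempts")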
Education