QA Analyst – Data Science

Volga Infotech – Tirunelveli, Tamil Nadu, India
Job description
  • We are currently hiring for a senior-level position and are looking for immediate joiners only.
  • If you are interested, please send your updated resume to resume@volgainfotech.com along with your CTC, expected CTC (ECTC), and notice period.

    Location: Remote

    Employment Type: Full-time

    About the Role

    The QA Engineer will own quality assurance across the ML lifecycle: from raw data validation through feature engineering checks, model training/evaluation verification, batch prediction/optimization validation, and end-to-end (E2E) workflow testing. The role is hands-on with Python automation, data profiling, and pipeline test harnesses in Azure ML and Azure DevOps. Success means provably correct data, models, and outputs at production scale and cadence.

    Key Responsibilities

    • Test Strategy & Governance
      ○ Define an ML-specific test strategy covering data quality KPIs, feature consistency checks, model acceptance gates (metrics + guardrails), and E2E run acceptance (timeliness, completeness, integrity).
      ○ Establish versioned test datasets and golden baselines for repeatable regression of features, models, and optimizers.
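As an illustration of the golden-baseline idea, a regression gate can fingerprint a freshly generated feature set and compare it to a versioned baseline. This is only a sketch, assuming features arrive as a pandas DataFrame; the helper name and columns are hypothetical, not part of any existing harness:

```python
import hashlib

import pandas as pd


def frame_fingerprint(df: pd.DataFrame) -> str:
    """Stable SHA-256 fingerprint of a DataFrame's contents."""
    # Sort columns and rows so the hash is independent of ordering.
    canonical = df[sorted(df.columns)].sort_values(by=sorted(df.columns))
    payload = canonical.to_csv(index=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def test_features_match_golden_baseline():
    # Hypothetical feature slice and its stored golden baseline.
    features = pd.DataFrame({"site": ["A", "B"], "price_mean_7d": [10.5, 12.0]})
    golden = pd.DataFrame({"price_mean_7d": [10.5, 12.0], "site": ["A", "B"]})
    # Regression gate: a run passes only if its features hash
    # identically to the versioned golden baseline.
    assert frame_fingerprint(features) == frame_fingerprint(golden)
```

Hash equality gives a strict "bit-for-bit" parity check; the statistical-similarity variant mentioned above would relax this to distribution comparisons.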

    • Data Quality & Transformation
      ○ Validate raw data extracts and landed data-lake data: schema/contract checks, null/outlier thresholds, time-window completeness, duplicate detection, site/material coverage.
      ○ Validate transformed/feature datasets: deterministic feature generation, leakage detection, drift vs. historical distributions, feature parity across runs (hash or statistical similarity tests).
      ○ Implement automated data quality checks (e.g., Great Expectations / pytest + Pandas / SQL) executed in CI and AML pipelines.
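A minimal sketch of such a check using plain pytest + pandas (the column names and the 1% null threshold are assumptions for illustration, not a prescribed contract):

```python
import pandas as pd


def check_raw_extract(df: pd.DataFrame) -> list[str]:
    """Return a list of data-quality violations for a raw extract."""
    failures = []
    required = {"site_id", "material_id", "price", "event_ts"}
    if not required.issubset(df.columns):
        failures.append(f"missing columns: {required - set(df.columns)}")
        return failures
    if df["price"].isna().mean() > 0.01:          # null threshold: 1%
        failures.append("price null rate above 1%")
    if df.duplicated(["site_id", "material_id", "event_ts"]).any():
        failures.append("duplicate site/material/timestamp rows")
    if (df["price"] <= 0).any():                  # simple outlier guard
        failures.append("non-positive prices present")
    return failures


def test_raw_extract_quality():
    df = pd.DataFrame({
        "site_id": [1, 1], "material_id": [7, 8],
        "price": [9.99, 10.25],
        "event_ts": pd.to_datetime(["2025-01-01", "2025-01-01"]),
    })
    assert check_raw_extract(df) == []
```

Returning a list of violations (rather than failing on the first) lets one CI run surface every breach of the data contract at once.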

    • Model Training & Evaluation
      ○ Verify training inputs (splits, windowing, target leakage prevention) and hyperparameter configs per site/cluster.
      ○ Automate metric verification (e.g., MAPE/MAE/RMSE, uplift vs. last model, stability tests) with acceptance thresholds and champion/challenger logic.
      ○ Validate feature importance stability and sensitivity/elasticity sanity checks (price-volume monotonicity where applicable).
      ○ Gate model registration/promotion in AML based on signed test artifacts and reproducible metrics.
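To make the champion/challenger gate concrete, here is one way such a check might look; the 15% MAPE ceiling, function names, and uplift rule are hypothetical examples, not thresholds from the role:

```python
import numpy as np


def mape(y_true, y_pred) -> float:
    """Mean absolute percentage error (assumes no zero actuals)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


def passes_promotion_gate(challenger_mape: float,
                          champion_mape: float,
                          max_mape: float = 0.15,
                          min_uplift: float = 0.0) -> bool:
    """Champion/challenger gate: the challenger must beat an absolute
    accuracy threshold AND not regress vs. the current champion."""
    return (challenger_mape <= max_mape
            and (champion_mape - challenger_mape) >= min_uplift)
```

Encoding the gate as a pure function makes it trivial to unit-test and to run identically in CI and inside an AML promotion step.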

    • Predictions, Optimization & Guardrails
      ○ Validate batch predictions: result shapes, coverage, latency, and failure handling.
      ○ Test model optimization outputs and enforced guardrails: detect violations and prove idempotent writes to the DB.
      ○ Verify API push to the third-party system (idempotency keys, retry/backoff, delivery receipts).
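The idempotency-key plus retry/backoff pattern named above can be sketched as follows; `send` is a stand-in for whatever API client performs the real push, and the key fields are illustrative:

```python
import hashlib
import time


def idempotency_key(site_id: int, material_id: int, run_id: str) -> str:
    """Deterministic key: the same prediction row always maps to the
    same key, so retried pushes can be deduplicated downstream."""
    raw = f"{site_id}:{material_id}:{run_id}"
    return hashlib.sha256(raw.encode()).hexdigest()


def push_with_retry(send, payload, key, attempts=3, base_delay=0.1):
    """Retry with exponential backoff; re-raises after the last attempt.
    Because the idempotency key is constant across retries, the receiver
    can safely discard duplicate deliveries."""
    for attempt in range(attempts):
        try:
            return send(payload, headers={"Idempotency-Key": key})
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

A QA test for this path would assert both that transient failures are retried and that every retry carries the identical key.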

    • Pipelines & E2E
      ○ Build pipeline test harnesses for AML pipelines (nightly data-gen, weekly training, prediction/optimization), including orchestrated synthetic runs and fault injection (missing slice, late competitor data, SB backlog).
      ○ Run E2E tests from raw data store -> ADLS -> AML -> RDBMS -> APIM / Frontend; assert freshness SLOs and audit event completeness (Event Hubs -> ADLS immutable).

    • Automation & Tooling
      ○ Develop Python-based automated tests (pytest) for data checks, model metrics, and API contracts; integrate with Azure DevOps (pipelines, badges, gates).
      ○ Implement data-driven test runners (parameterized by site/material/model-version) and store signed test artifacts alongside models in the AML Registry.
      ○ Create synthetic test data generators and golden fixtures to cover edge cases (price gaps, competitor shocks, cold starts).
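A data-driven runner of this kind could be sketched with pytest's `parametrize`; the site/material grid and the `load_predictions` stub are placeholders, since the real parameters would come from config or the model registry:

```python
import pytest

# Hypothetical parameter grid; in practice this would be loaded from
# configuration or the registry rather than hard-coded.
SITES = ["plant_a", "plant_b"]
MATERIALS = ["mat_001", "mat_002"]


def load_predictions(site: str, material: str) -> list[float]:
    """Stand-in for fetching one batch-prediction slice (hypothetical)."""
    return [10.0, 11.5, 9.8]


@pytest.mark.parametrize("site", SITES)
@pytest.mark.parametrize("material", MATERIALS)
def test_prediction_slice_coverage(site, material):
    # One test case per (site, material) pair: 4 cases from this grid.
    preds = load_predictions(site, material)
    assert len(preds) > 0, f"empty slice for {site}/{material}"
    assert all(p > 0 for p in preds), f"non-positive prediction in {site}/{material}"
```

Stacked `parametrize` marks expand into the full cross-product, so coverage grows automatically as sites or materials are added to the grid.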

    • Reporting & Quality Ops
      ○ Publish weekly test reports and go/no-go recommendations for promotions; maintain a defect taxonomy (data vs. model vs. serving vs. optimization).
      ○ Contribute to SLI/SLO dashboards (prediction timeliness, queue/DLQ, push success, data drift) used for release gates.

      Required Qualifications

      ○ 5–7+ years in QA with 3+ years focused on ML/Data systems (data pipelines + model validation).
      ○ Python automation (pytest, pandas, NumPy), SQL (PostgreSQL/Snowflake), and CI/CD (Azure DevOps) for fully automated ML QA.
      ○ Strong grasp of ML validation: leakage checks, proper splits, metric selection (MAE/MAPE/RMSE), drift detection, sensitivity/elasticity sanity checks.
      ○ Experience testing AML pipelines (pipelines/jobs/components) and message-driven integrations (Service Bus/Event Hubs).
      ○ API test skills (FastAPI/OpenAPI, contract tests, Postman/pytest-httpx) plus idempotency and retry patterns.
      ○ Familiarity with feature stores/feature engineering concepts and reproducibility.
      ○ Solid understanding of observability (App Insights/Log Analytics) and auditability requirements.

      Education

    • Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
    • Certification in Azure Data or ML Engineer Associate is a plus.