Description :
We are seeking a Lead Quality Engineering (QE) Engineer to define, operationalize, and own the quality strategy for our Agentic AI application teams. This leader will be accountable for functional, operational, and security quality across 25+ engineers (AI + UI engineers in India & USA).
This role requires deep awareness of quality challenges unique to LLM- and SLM-powered Agentic AI applications, especially in healthcare and education, where correctness, reliability, and compliance are essential.
Typical quality challenges include :
- LLM / SLM Latency & Token Efficiency : unpredictable response times, throughput constraints, and cost-performance tradeoffs.
- Non-Deterministic Outputs : validating variable responses in sensitive domains (medical correctness, educational appropriateness).
- RAG & Vector DB Use Cases : testing retrieval relevance, embedding coverage, semantic accuracy, and fallback handling.
- SME-Driven UAT Cycles : unpredictable validation cycles with clinicians or educators.
- Operational Risks : agent workflow reliability and system behavior under load.
- Security Risks : prompt injection, adversarial inputs, data leakage, and access control.
This is a transformational role : you will move the organization from manual QA toward automation-first and AI-driven evaluation, enabling every engineer to take responsibility for quality.
Key Responsibilities :
Leadership & Culture :
- Own accountability for end-to-end quality outcomes across 2 global teams (~25 engineers).
- Champion a shift-left quality culture, embedding testing in design, code reviews, and CI / CD.
- Partner closely with AI Engineers to embed quality into day-to-day development.
- Partner with the Platform QE Engineering team to ensure AI apps meet platform-level quality and scalability standards.
- Partner with Technical Product Managers (TPMs) and Technical Product Owners (TPOs) to ensure quality requirements are captured and addressed.
- Define and track team-level quality OKRs and KPIs.
Functional Quality :
- Architect and implement automation frameworks (UI, backend, API, mobile).
- Build evaluation frameworks for :
  - LLM / SLM non-deterministic responses.
  - Prompt and agent orchestration reliability.
  - RAG + Vector DB use cases (retrieval relevance, semantic correctness, failure fallback).
  - Hallucination detection, bias, fairness, and safety.
- Integrate AI evaluation into CI / CD pipelines with dashboards and gating criteria.
Operational Quality (Enablement Role) :
- Define strategies for load, performance, and reliability testing.
- Establish frameworks and test patterns for evaluating latency, concurrency, token efficiency, and response unpredictability.
- Ensure teams conduct and observe LnP (Load & Performance) tests and capture quality signals.
- Act as an enabler and coach, ensuring practices are scalable and team-owned.
Security & Compliance Quality :
- Collaborate with the Client Penetration Testing team to ensure coverage of security risks (prompt injection, adversarial attacks, access control, and data leakage prevention).
- Establish additional security validation practices (input / output sanitization for healthcare / education data).
- Ensure compliance with Client ITGC, PCI, PII, and CCPA requirements, as applicable.
Qualifications :
Must Have :
- 7+ years in Quality Engineering / Automation, with 3 years in QA leadership roles.
- Proven experience transforming teams from manual QA to automation-first.
- Awareness of LLM / SLM quality challenges (latency unpredictability, token inefficiency, hallucinations, SME UAT cycles).
- Strong automation expertise (Playwright, PyTest, Cypress, JUnit, REST API testing).
- Understanding of Agentic AI tech stack (RAG pipelines, Vector DBs).
- Solid background in CI / CD, DevOps, and cloud-native systems (Azure, Kubernetes, GitLab).
Nice to Have (Big Plus) :
- Experience with Playwright MCP for scaling AI testing automation.
- Hands-on with AI evaluation tools (Ragas, Promptfoo, DeepEval, OpenAI Evals).
- Familiarity with AI observability & monitoring (Datadog).
- Background in AI security testing (prompt injection, adversarial robustness).
- Experience in healthcare or education applications.
(ref : hirist.tech)