Title : AI QA Automation Engineer
Location : Remote
Job Summary :
We are seeking an AI Quality Engineer with a strong automation skill set to ensure the robustness, performance, and reliability of our AI systems and services. The ideal candidate is tech-savvy, proactive, and passionate about quality at every step, from initial design through deployment and ongoing monitoring. You will play a key role in building and maintaining a highly automated testing infrastructure to support fast, reliable model and pipeline delivery as the company scales.
Key Responsibilities :
Testing Expertise :
- Conduct comprehensive testing across all layers, including server load, integration points, and output quality.
- Apply Test-Driven Development (TDD) principles : anticipate, design, and define all necessary tests before feature development begins (a minimal sketch follows this list).
- Identify what needs to be tested and proactively communicate requirements before build phases.
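As a minimal illustration of the TDD expectation above, here is a Pytest sketch in which the tests spell out the contract up front; the normalize_score function and its contract are hypothetical, defined inline only to keep the example self-contained (in practice the tests would be agreed on before the implementation exists) :

```python
# Hypothetical TDD sketch: the tests below define the contract first.
import pytest


def normalize_score(raw: float) -> float:
    """Map a raw model score in [0, 100] to the unit interval."""
    if not 0 <= raw <= 100:
        raise ValueError(f"raw score out of range: {raw}")
    return raw / 100.0


def test_normalize_score_maps_raw_values_into_unit_interval():
    # Contract agreed on before implementation: scores map into [0.0, 1.0].
    assert normalize_score(0) == 0.0
    assert normalize_score(100) == 1.0


def test_normalize_score_rejects_out_of_range_input():
    # Failure behavior is also specified before any code is written.
    with pytest.raises(ValueError):
        normalize_score(-5)
```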
Automation-First Approach :
- Develop, maintain, and extend a fully automated testing suite that covers unit, integration, performance, and end-to-end testing.
- Emphasize automation to minimize manual intervention and maximize test coverage, reliability, and repeatability.
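For a sense of the repeatability we have in mind, here is a sketch of a parametrized Pytest case that keeps coverage broad without duplicated test code; the predict_label wrapper is a hypothetical stand-in for a call to the model service under test :

```python
# Sketch: parametrization keeps an automated suite broad but repeatable.
import pytest


def predict_label(text: str) -> str:
    # Hypothetical stand-in for a call to the model service under test.
    return "positive" if "good" in text.lower() else "negative"


@pytest.mark.parametrize(
    ("text", "expected"),
    [
        ("This product is good", "positive"),
        ("Terrible experience", "negative"),
        ("GOOD value overall", "positive"),
    ],
)
def test_predict_label_on_known_inputs(text, expected):
    assert predict_label(text) == expected
```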
DevOps & CI / CD Integration :
- Collaborate closely with DevOps to ensure all tests (including those for model deployment and data pipelines) are tightly integrated with modern CI / CD workflows.
- Streamline rapid yet safe releases through automation and timely feedback.
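One concrete shape this takes is a post-deploy smoke check that the pipeline runs as a release gate; the sketch below assumes a SERVICE_URL environment variable and a /health endpoint, both hypothetical :

```python
# Hypothetical smoke check a CI / CD pipeline could run after each deploy.
import os
import sys

import requests

SERVICE_URL = os.environ.get("SERVICE_URL", "http://localhost:8000")


def main() -> int:
    try:
        resp = requests.get(f"{SERVICE_URL}/health", timeout=5)
    except requests.RequestException as exc:
        print(f"health check failed to connect: {exc}")
        return 1
    if resp.status_code != 200:
        print(f"health check returned {resp.status_code}")
        return 1
    print("health check passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())  # non-zero exit fails the pipeline stage
```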
Automated Testing Frameworks :
- Extensive hands-on experience with frameworks such as Pytest (Python testing), Playwright (end-to-end browser testing), Postman (API testing), and Langfuse (LLM output tracking / testing).
- Implement and maintain robust API contract testing to ensure reliable interactions between services.
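By API contract testing we mean checks like the following Pytest sketch, which validates a response body against a shared schema; the /predict endpoint and the schema fields are illustrative assumptions :

```python
# Sketch of API contract testing: validate a response against a shared schema.
import requests
from jsonschema import validate

PREDICTION_SCHEMA = {
    "type": "object",
    "required": ["label", "confidence"],
    "properties": {
        "label": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
}


def test_prediction_response_honors_contract():
    resp = requests.post(
        "http://localhost:8000/predict",  # hypothetical service endpoint
        json={"text": "sample input"},
        timeout=5,
    )
    assert resp.status_code == 200
    validate(instance=resp.json(), schema=PREDICTION_SCHEMA)
```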
Manual & LLM Testing :
- Execute manual test cases with strong attention to detail, especially for evaluating Large Language Model (LLM) output quality.
- Flag issues such as hallucinations, factual inaccuracies, or unexpected edge-case responses.
- Continuously update manual testing strategies to adapt to evolving model behaviors and business requirements.
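Manual review is typically backed by simple automated groundedness checks; the sketch below asserts that an answer contains the key fact from the supplied context, with generate_answer as a hypothetical wrapper around the model service :

```python
# Illustrative automated backstop for manual LLM review: flag a likely
# hallucination when the key fact from the context is missing from the answer.
def generate_answer(question: str, context: str) -> str:
    # Hypothetical stand-in; a real wrapper would call the LLM service.
    return "The warranty period is 12 months."


def test_answer_is_grounded_in_context():
    context = "The warranty period is 12 months from the date of purchase."
    answer = generate_answer("How long is the warranty?", context)
    assert "12 months" in answer
```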
Monitoring, Observability & Post-Deploy Quality :
- Configure, deploy, and interpret dashboards from monitoring tools like Prometheus, Grafana, and CloudWatch.
- Track model health, pipeline performance, error rates, and system anomalies after deployment.
- Proactively investigate and triage quality issues uncovered in production.
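As a minimal sketch of the instrumentation side, here is how post-deploy quality metrics could be exported with the prometheus_client library for Prometheus to scrape and Grafana to chart; the metric names, the port, and the simulated workload are assumptions :

```python
# Minimal sketch of exporting post-deploy quality metrics with prometheus_client.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS_TOTAL = Counter("model_requests_total", "Total model requests served")
ERRORS_TOTAL = Counter("model_errors_total", "Total failed model requests")
LATENCY_SECONDS = Histogram("model_latency_seconds", "Model response latency")

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes metrics from :9100/metrics
    while True:
        REQUESTS_TOTAL.inc()
        with LATENCY_SECONDS.time():
            time.sleep(random.uniform(0.01, 0.1))  # simulated model call
        if random.random() < 0.05:
            ERRORS_TOTAL.inc()  # simulated error rate for the dashboard
```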
Core Abilities and Technical Skills :
- Deep practical knowledge of test automation, performance, and reliability engineering.
- In-depth experience integrating tests into CI / CD pipelines, especially for machine learning and AI model workflows.
- Hands-on proficiency with automated QA tools : Pytest, Playwright, Postman, Langfuse, and similar.
- Solid foundation in manual exploratory testing, particularly for complex and evolving outputs such as those from LLMs.
- Expertise in monitoring, APM, and observability tools (e.g., Prometheus, Grafana, CloudWatch).
- Demonstrated strong problem-solving skills : anticipate, identify, and resolve issues early.
- Strong communication skills to clearly articulate requirements and quality risks, and to advocate for automation-driven quality throughout the organization.
Mindset :
- Automation-First : relentless emphasis on driving automation over manual effort.
- Proactive : anticipates issues and testing needs; does not wait to be told what to test.
- Quality Advocate : champions testing best practices and designs processes to catch bugs before production.
- Curious & Continuous Learner : seeks out new tools and stays current with testing frameworks and industry best practices.
- Collaborative : partners effectively with product, engineering, and DevOps teams to deliver high-quality models and features at scale.