About Company :
- The company balances innovation with an open, friendly culture and the backing of a long-established parent company known for its ethical reputation. It guides customers from what's now to what's next by unlocking the value of their data and applications to solve their digital challenges, achieving outcomes that benefit both business and society.
About Client :
Our client is a global digital solutions and technology consulting company headquartered in Mumbai, India. The company generates annual revenue of over $4.29 billion (₹35,517 crore), reflecting 4.4% year-over-year growth in USD terms. It has a workforce of around 86,000 professionals operating in more than 40 countries and serves a global client base of over 700 organizations.

Our client operates across several major industry sectors, including Banking, Financial Services & Insurance (BFSI); Technology, Media & Telecommunications (TMT); Healthcare & Life Sciences; and Manufacturing & Consumer. In the past year, the company achieved a net profit of $553.4 million (₹4,584.6 crore), marking a 1.4% increase from the previous year. It also recorded a strong order inflow of $5.6 billion, up 15.7% year-over-year, highlighting growing demand across its service lines.

Key focus areas include Digital Transformation, Enterprise AI, Data & Analytics, and Product Engineering, reflecting its strategic commitment to driving innovation and value for clients across industries.

Job Title : Automation Tester (AI Testing_LLM + Python)
Experience : 6+ Years
Job Type : Contract to hire
Notice Period : Immediate joiners
Location : PAN India

Job Summary :
Seeking an experienced Automation QA Engineer with strong expertise in evaluating and optimizing Large Language Model (LLM) responses. The role involves building automated evaluation workflows using Ragas, LangSmith, and Python-based frameworks to assess the accuracy, relevancy, and faithfulness of model outputs. The ideal candidate has hands-on skills in prompt engineering, understands how to tweak and refine prompts for improved response quality, and contributes to LLM optimization techniques including token efficiency and prompt restructuring. The role also involves conducting performance testing of LLMs by measuring metrics such as Inter-Token Latency, First Token Time, and throughput. Strong knowledge of Python, basic SQL, and automation principles is essential to ensure high-quality, reliable, and efficient GenAI solutions.
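To illustrate the performance metrics named above, here is a minimal Python sketch that derives First Token Time, Inter-Token Latency, and throughput from recorded token arrival timestamps. The timestamps and function name are illustrative assumptions, not output from any specific model or vendor API:

```python
# Minimal sketch (illustrative, not tied to any vendor API): compute
# First Token Time, Inter-Token Latency, and throughput from a request
# start time and per-token arrival timestamps.

def performance_metrics(request_start: float, token_times: list[float]) -> dict:
    """Derive streaming latency metrics from token arrival times (seconds)."""
    if not token_times:
        raise ValueError("no tokens received")
    # First Token Time: delay from sending the request to the first token.
    first_token_time = token_times[0] - request_start
    # Inter-Token Latency: mean gap between consecutive token arrivals.
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    inter_token_latency = sum(gaps) / len(gaps) if gaps else 0.0
    # Throughput: tokens generated per second over the whole response.
    total_time = token_times[-1] - request_start
    throughput = len(token_times) / total_time
    return {
        "first_token_time_s": first_token_time,
        "inter_token_latency_s": inter_token_latency,
        "throughput_tok_per_s": throughput,
    }

# Example with made-up timings: request at t=0.0, first token after 0.4 s,
# then one token every 0.05 s.
metrics = performance_metrics(0.0, [0.4, 0.45, 0.5, 0.55])
```

In a real harness these timestamps would come from a streaming client callback; the same arithmetic applies regardless of which LLM provider is under test.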
Roles & Responsibilities
- Design and execute test strategies for AI / LLM-based applications, ensuring response accuracy, reliability, and consistency.
- Evaluate LLM outputs using frameworks like Ragas, LangSmith, and custom Python-based evaluation pipelines.
- Develop automated workflows to measure metrics such as context precision, faithfulness, answer relevancy, similarity score, and hallucination rates.
- Write, refine, and optimize prompts (system, zero-shot, few-shot, CoT) to improve model accuracy and reduce hallucinations.
- Analyze and optimize token usage, prompt structure, and context-window efficiency to enhance LLM performance.
- Perform performance testing of LLM responses, measuring:
  - Inter-Token Latency
  - First Token Time
  - Response throughput
  - Latency under load
- Build Python-based scripts, validators, and reporting dashboards for automated model evaluation.
- Work closely with Data Science, Product, and Engineering teams to improve model performance and user experience.
- Ensure LLM outputs adhere to functional, business, and compliance requirements.
- Create and maintain test documentation: test plans, test cases, prompt variants, evaluation reports, and performance benchmarking dashboards.
- Integrate LLM testing into CI / CD pipelines for continuous monitoring and regression validation.
- Validate test data and evaluation datasets using SQL.

Seniority Level : Mid-Senior level
Industry : IT Services and IT Consulting
Employment Type : Contract
Job Functions : Business Development, Consulting
Skills : Automation Testing, Python, LLM Evaluation, Ragas, AI Testing, LangSmith, Prompt Engineering, SQL, Performance Testing
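As a simple illustration of the kind of automated evaluation check these responsibilities describe, here is a toy Python sketch using token overlap between an answer and its retrieval context as a crude faithfulness proxy. This is an assumption-laden stand-in for demonstration only; it is not the Ragas or LangSmith API, whose metrics are LLM-based and far more sophisticated:

```python
# Toy stand-in for an automated evaluation check (NOT the Ragas/LangSmith API):
# token overlap between a model answer and its retrieval context, used as a
# crude proxy for faithfulness. Low scores flag possible hallucination.

def overlap_score(answer: str, context: str) -> float:
    """Fraction of answer tokens that also appear in the context (0.0-1.0)."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

def evaluate_batch(cases: list[dict], threshold: float = 0.5) -> list[dict]:
    """Score each (answer, context) pair and flag failures, mirroring how an
    evaluation pipeline would report pass/fail results for regression runs."""
    report = []
    for case in cases:
        score = overlap_score(case["answer"], case["context"])
        report.append({"id": case["id"], "score": score, "passed": score >= threshold})
    return report

# Hypothetical test cases: one grounded answer, one unsupported answer.
results = evaluate_batch([
    {"id": 1, "answer": "paris is the capital",
     "context": "paris is the capital of france"},
    {"id": 2, "answer": "berlin won the cup",
     "context": "paris is the capital of france"},
])
```

A production pipeline would replace `overlap_score` with LLM-judged metrics (faithfulness, answer relevancy, context precision) and feed the report into a CI / CD dashboard, but the batch-scoring and thresholding shape stays the same.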