Job Description
ABOUT XENONSTACK
XenonStack is the fastest-growing Data and AI Foundry for Agentic Systems, enabling enterprises to gain real-time and intelligent business insights.
We deliver innovation through:
Agentic Systems for AI Agents → akira.ai
Vision AI Platform → xenonstack.ai
Inference AI Infrastructure for Agentic Systems → nexastack.ai
Our mission is to accelerate the world’s transition to AI + Human Intelligence by making AI agents reliable, explainable, and enterprise-ready.
THE OPPORTUNITY
We are seeking an LLM Reliability & Evaluation Engineer to ensure that large language models (LLMs) and agentic AI systems meet enterprise-grade standards of accuracy, safety, and trustworthiness.
This role focuses on evaluating, benchmarking, and stress-testing LLMs in real-world workflows, and on building frameworks for reliability, robustness, and continuous improvement. If you thrive at the intersection of AI research, applied testing, and responsible deployment, this is the role for you.
KEY RESPONSIBILITIES
Evaluation Frameworks
Design and implement LLM evaluation pipelines covering accuracy, robustness, safety, and bias.
Develop automated systems for benchmarking models on enterprise-relevant tasks.
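To make the pipeline work above concrete, here is a minimal sketch of an automated evaluation loop: score model outputs against reference answers on a small benchmark and aggregate an accuracy figure. The case data and the exact-match metric are illustrative assumptions, not XenonStack's internal framework; production pipelines would add robustness, safety, and bias metrics on top.

```python
# Minimal evaluation-pipeline sketch: run (prediction, reference) pairs
# through a metric and aggregate the results into a report.

def exact_match(prediction: str, reference: str) -> bool:
    """Baseline metric: compare after trimming whitespace and lowercasing."""
    return prediction.strip().lower() == reference.strip().lower()

def run_eval(cases, metric=exact_match):
    """Apply the metric to every case and aggregate pass counts."""
    results = [metric(pred, ref) for pred, ref in cases]
    return {
        "total": len(results),
        "passed": sum(results),
        "accuracy": sum(results) / len(results) if results else 0.0,
    }

# Hypothetical benchmark cases: (model output, expected answer).
cases = [
    ("Paris", "paris"),                 # passes after normalization
    ("The capital is Paris", "Paris"),  # fails exact match
]
report = run_eval(cases)
```

The second case is the interesting one: a semantically correct answer fails exact match, which is exactly why enterprise evaluation stacks layer fuzzier metrics (semantic similarity, LLM-as-judge) over simple string checks.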
Reliability Engineering
Conduct stress tests, adversarial testing, and edge-case evaluations.
Build tools to measure latency, consistency, and error recovery in multi-turn interactions.
Metrics & Monitoring
Define KPIs such as factual accuracy, hallucination rate, toxicity, and compliance alignment.
Establish real-time monitoring for drift, anomalies, and performance regressions.
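As a sketch of the monitoring duty above: track a per-response KPI (here, a hallucination flag) over a sliding window and alert when the windowed rate crosses a threshold. The window size and threshold are assumed values for illustration, not prescribed ones.

```python
from collections import deque

class KPIMonitor:
    """Sliding-window monitor for a binary KPI such as hallucination events."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        # deque(maxlen=...) silently drops the oldest event once full,
        # giving a rolling window without manual bookkeeping.
        self.events = deque(maxlen=window)  # 1 = hallucination, 0 = clean
        self.threshold = threshold

    def record(self, hallucinated: bool) -> bool:
        """Record one response; return True if the windowed rate regressed."""
        self.events.append(1 if hallucinated else 0)
        return self.rate() > self.threshold

    def rate(self) -> float:
        return sum(self.events) / len(self.events) if self.events else 0.0

# Illustrative run: 8 clean responses, then 3 hallucinations in a row.
monitor = KPIMonitor(window=10, threshold=0.2)
alerts = [monitor.record(h) for h in [False] * 8 + [True] * 3]
```

The same shape generalizes to latency, toxicity, or consistency KPIs; real deployments would typically feed these rates into a metrics backend rather than return booleans inline.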
Collaboration & Alignment
Partner with ML engineers, product managers, and domain experts to align evaluation with business objectives.
Work with Responsible AI teams to implement ethical, explainable, and compliant evaluation practices.
Continuous Improvement
Feed evaluation insights back into fine-tuning, RLHF/RLAIF pipelines, and model selection.
Maintain a central repository of test cases, benchmarks, and evaluation results.
Research & Innovation
Stay current with state-of-the-art LLM evaluation techniques, from academic benchmarks to applied enterprise metrics.
Explore automated evaluation using agentic test harnesses and synthetic data generation.
SKILLS & QUALIFICATIONS
Must-Have
3–6 years in AI/ML, NLP, or applied model evaluation.
Strong understanding of LLM architectures, prompt engineering, and failure modes.
Hands-on experience with evaluation harnesses such as Ragas, OpenAI Evals, and DeepEval.
Proficiency in Python and libraries such as LangChain, LangGraph, LlamaIndex, and Hugging Face.
Experience with vector databases, RAG pipelines, and knowledge graph integration.
Familiarity with bias/fairness testing and Responsible AI frameworks.
Good-to-Have
Experience with reinforcement learning (RLHF, RLAIF) and reward modeling.
Exposure to agentic evaluation frameworks (multi-agent stress testing, synthetic user simulators).
Knowledge of compliance and safety requirements for BFSI, GRC, or SOC use cases.
Contributions to open-source evaluation libraries or research papers .
WHY SHOULD YOU JOIN US?
Agentic AI Product Company
Ensure reliability in cutting-edge AI platforms that are redefining enterprise adoption.
A Fast-Growing Category Leader
Be part of one of the fastest-growing AI Foundries, powering Fortune 500 enterprises with trustworthy AI.
Career Mobility & Growth
Grow into roles such as AI Systems Architect, Responsible AI Engineer, or Reliability Engineering Lead.
Global Exposure
Work on enterprise-scale evaluation challenges across BFSI, Healthcare, Telecom, and GRC.
Create Real Impact
Your evaluations will directly shape production-grade AI agents used in mission-critical systems.
Culture of Excellence
Our values of Agency, Taste, Ownership, Mastery, Impatience, and Customer Obsession empower you to innovate fearlessly.
Responsible AI First
Join a company that prioritizes trustworthy, explainable, and compliant AI.
XENONSTACK CULTURE – JOIN US & MAKE AN IMPACT!
At XenonStack, we believe in shaping the future of intelligent systems. We foster a culture of cultivation built on bold, human-centric leadership principles, where deep work, simplicity, and adoption define everything we do.
Our Cultural Values
Agency – Be self-directed and proactive.
Taste – Sweat the details and build with precision.
Ownership – Take responsibility for outcomes.
Mastery – Commit to continuous learning and growth.
Impatience – Move fast and embrace progress.
Customer Obsession – Always put the customer first.
Our Product Philosophy
Obsessed with Adoption – Making AI accessible, reliable, and enterprise-ready.
Obsessed with Simplicity – Turning complex evaluation challenges into seamless, automated frameworks.
Be part of our mission to accelerate the world’s transition to AI + Human Intelligence by making AI agents not just powerful, but trustworthy and reliable.
Requirements
Responsible AI Practices, LLM evaluation
Reliability Engineer • Mohali, Punjab, India