LLM Reliability & Evaluation Engineer

XenonStack, Mohali, PB, India
25 days ago
Job description

ABOUT XENONSTACK

XenonStack is the fastest-growing Data and AI Foundry for Agentic Systems, enabling enterprises to gain real-time and intelligent business insights.

We deliver innovation through:

Agentic Systems for AI Agents → akira.ai

Vision AI Platform → xenonstack.ai

Inference AI Infrastructure for Agentic Systems → nexastack.ai

Our mission is to accelerate the world’s transition to AI + Human Intelligence by making AI agents reliable, explainable, and enterprise-ready.

THE OPPORTUNITY

We are seeking an LLM Reliability & Evaluation Engineer to ensure that large language models (LLMs) and agentic AI systems meet enterprise-grade standards of accuracy, safety, and trustworthiness.

This role focuses on evaluating, benchmarking, and stress-testing LLMs in real-world workflows, building frameworks for reliability, robustness, and continuous improvement. If you thrive at the intersection of AI research, applied testing, and responsible deployment, this is the role for you.

KEY RESPONSIBILITIES

Evaluation Frameworks

Design and implement LLM evaluation pipelines covering accuracy, robustness, safety, and bias.

Develop automated systems for benchmarking models on enterprise-relevant tasks.

Reliability Engineering

Conduct stress tests, adversarial testing, and edge-case evaluations.

Build tools to measure latency, consistency, and error recovery in multi-turn interactions.

Metrics & Monitoring

Define KPIs such as factual accuracy, hallucination rate, toxicity, and compliance alignment.

Establish real-time monitoring for drift, anomalies, and performance regressions.

Collaboration & Alignment

Partner with ML engineers, product managers, and domain experts to align evaluation with business objectives.

Work with Responsible AI teams to implement ethical, explainable, and compliant evaluation practices.

Continuous Improvement

Feed insights from evaluation into fine-tuning, RLHF/RLAIF pipelines, and model selection.

Maintain a central repository of test cases, benchmarks, and evaluation results.

Research & Innovation

Stay current with state-of-the-art LLM evaluation techniques, from academic benchmarks to applied enterprise metrics.

Explore automated evaluation using agentic test harnesses and synthetic data generation.

SKILLS & QUALIFICATIONS

Must-Have

3–6 years of experience in AI/ML, NLP, or applied model evaluation.

Strong understanding of LLM architectures, prompt engineering, and failure modes.

Hands-on experience with evaluation frameworks (eval harnesses, Ragas, OpenAI Evals, DeepEval).

Proficiency in Python and libraries like LangChain, LangGraph, LlamaIndex, and Hugging Face.

Experience with vector databases, RAG pipelines, and knowledge graph integration.

Familiarity with bias/fairness testing and Responsible AI frameworks.

Good-to-Have

Experience with reinforcement learning (RLHF, RLAIF) and reward modeling.

Exposure to agentic evaluation frameworks (multi-agent stress testing, synthetic user simulators).

Knowledge of compliance and safety requirements for BFSI, GRC, or SOC use cases.

Contributions to open-source evaluation libraries or research papers.

WHY SHOULD YOU JOIN US?

Agentic AI Product Company

Ensure reliability in cutting-edge AI platforms that are redefining enterprise adoption.

A Fast-Growing Category Leader

Be part of one of the fastest-growing AI Foundries, powering Fortune 500 enterprises with trustworthy AI.

Career Mobility & Growth

Grow into roles such as AI Systems Architect, Responsible AI Engineer, or Reliability Engineering Lead.

Global Exposure

Work on enterprise-scale evaluation challenges across BFSI, Healthcare, Telecom, and GRC.

Create Real Impact

Your evaluations will directly shape production-grade AI agents used in mission-critical systems.

Culture of Excellence

Our values (Agency, Taste, Ownership, Mastery, Impatience, and Customer Obsession) empower you to innovate fearlessly.

Responsible AI First

Join a company that prioritizes trustworthy, explainable, and compliant AI.

XENONSTACK CULTURE – JOIN US & MAKE AN IMPACT!

At XenonStack, we believe in shaping the future of intelligent systems. We foster a culture of cultivation built on bold, human-centric leadership principles, where deep work, simplicity, and adoption define everything we do.

Our Cultural Values

Agency – Be self-directed and proactive.

Taste – Sweat the details and build with precision.

Ownership – Take responsibility for outcomes.

Mastery – Commit to continuous learning and growth.

Impatience – Move fast and embrace progress.

Customer Obsession – Always put the customer first.

Our Product Philosophy

Obsessed with Adoption – Making AI accessible, reliable, and enterprise-ready.

Obsessed with Simplicity – Turning complex evaluation challenges into seamless, automated frameworks.

Be part of our mission to accelerate the world’s transition to AI + Human Intelligence by making AI agents not just powerful, but trustworthy and reliable.

Requirements

Responsible AI Practices, LLM evaluation
