AI Lead

Confidential • Chandigarh, India
21 days ago
Job description

AIOps Lead

Location: Chandigarh (On-site)

Experience: 3 to 5 years (AI/ML + DevOps + Observability)

Employment Type: Full-time

About the Role

We are looking for a next-generation AIOps Engineer to design and operate AI-driven, self-healing, and intelligent infrastructure systems.

In this role, you'll fuse MLOps, DevOps, and agentic AI systems — leveraging technologies like Ray, vLLM, SGLang, and PyTorch Lightning to build predictive, autonomous, and scalable operational pipelines.

You will develop intelligent observability systems capable of detecting, diagnosing, and resolving issues in real time — powered by distributed AI and LLM-based automation.

Key Responsibilities

  • Design, implement, and scale AIOps pipelines that collect, analyze, and act on telemetry data across infrastructure and applications.
  • Build and deploy distributed ML/LLM workflows using Ray, PyTorch Lightning, vLLM, or SGLang for anomaly detection, event correlation, and predictive maintenance.
  • Orchestrate LLM-based operations agents using LangChain, LangGraph, or SGLang to power AI-assisted diagnostics and root-cause analysis.
  • Implement intelligent observability layers over systems like Prometheus, Grafana, ELK, OpenTelemetry, or Datadog to enable AI-driven insights and alerting.
  • Develop self-healing systems leveraging AI and automation frameworks to auto-remediate incidents.
  • Optimize inference serving and distributed compute with vLLM, Ray Serve, and Triton Inference Server for low-latency responses.
  • Build real-time data ingestion pipelines using Kafka, Spark, or Flink for operational and telemetry data.
  • Collaborate with SRE, MLOps, and AI engineering teams to create autonomous, adaptive infrastructure systems.
  • Integrate CI/CD pipelines for AI workflows using MLflow, Kubeflow, or Airflow, with model monitoring and drift detection.
  • Evaluate and integrate AIOps platforms (Moogsoft, BigPanda, Datadog AIOps, Dynatrace, etc.) and agentic frameworks for proactive automation.

Required Skills & Qualifications

  • Bachelor's or Master's in Computer Science, Engineering, or related field.
  • 4+ years of experience in DevOps, SRE, or AI infrastructure engineering.
  • Strong programming experience in Python (preferred), Go, or Bash scripting.
  • Deep understanding of cloud platforms (AWS, GCP, Azure) and Kubernetes/Docker orchestration.
  • Expertise in infrastructure as code (Terraform, Helm, Pulumi).
  • Experience with distributed compute and LLM serving frameworks (Ray, PyTorch Lightning, vLLM, SGLang).
  • Proficiency with observability and monitoring stacks (Prometheus, Grafana, ELK, OpenTelemetry, Splunk).
  • Familiarity with MLOps and LLMOps tools (MLflow, Kubeflow, Airflow, ArgoCD).
  • Experience with event-driven systems and message queues (Kafka, RabbitMQ, AWS SQS).
  • Understanding of AI-powered automation, root cause analysis, and predictive operational analytics.

Preferred / Nice-to-Have

  • Hands-on experience with vLLM for optimized LLM inference and observability agents.
  • Experience deploying and optimizing Ray Serve, vLLM, or Triton in production.
  • Exposure to SGLang for LLM-based orchestration, workflow automation, and diagnostics reasoning.
  • Familiarity with vector databases (Milvus, Weaviate, Pinecone) and RAG-based observability.
  • Experience with agentic AIOps frameworks and LLM-driven operational reasoning (LangGraph, AutoGen, CrewAI).
  • Understanding of AI observability, drift detection, cost-aware scaling, and fault-tolerant AI systems.
  • Contributions to open-source AIOps, observability, or distributed AI infrastructure projects.

What We Offer

  • Opportunity to build the foundation for autonomous, intelligent operations.
  • Hands-on exposure to SGLang, vLLM, Ray, PyTorch Lightning, and LangGraph ecosystems.
  • Collaborative, cross-functional environment spanning AI, cloud, and systems engineering.
  • Competitive compensation, flexible work setup, and professional development opportunities.

Skills Required

Airflow, ELK, Prometheus, Kafka, Grafana, Datadog, Terraform, Docker, AWS, ML, AI, Pulumi, DevOps, GCP, Spark, Helm, Azure, Kubernetes
