Site Reliability Engineer

Insight Global • Bengaluru, India
30+ days ago
Job description

₹5-15 LPA with benefits (EOR)

Onsite in Bangalore, India

Required Skills & Experience

  • 3 years of experience monitoring and responding to incidents on a globally deployed web application (keeping track of permutations)
  • Experience working with microservices that run on Kubernetes
  • Metrics-forward mindset and a strong understanding of observability tools, with a focus on operational metrics: quantiles, P99, and Prometheus
  • Foundational familiarity with AWS services or those of another cloud provider

Very strong communication and customer service skills

Nice to Have Skills & Experience

LLM or AI experience

Job Description

Insight Global is seeking a skilled LLM System Monitor to support the LLM Proxy team. In this role you will monitor and interpret the Grafana dashboards that signal failures and problems, and you will manage incident communication. Day to day, you will be the SRE watching the observability dashboards. You will either open an incident report yourself in response to an automated alert or be pulled into a chat channel by someone who has created a ticket. From there you will be the main point of contact, communicating clearly with both the end customer and the incident commander and giving all parties frequent updates on the status of the incident.
