Job Title : Site Reliability Engineer
Experience : 5 to 9 Years
Location : Bangalore (Work From Office — All 5 Days)
Job Summary :
We are seeking an experienced Site Reliability Engineer (SRE) to join our team in Bangalore. The ideal candidate will bring expertise in observability tools, incident management, and automation, ensuring high availability and performance of critical systems. The role requires working in rotational 24×7 support shifts.
Key Responsibilities :
Implement and manage monitoring and alerting solutions using Prometheus, Grafana, and Datadog
Set up and tune alerting rules for anomaly detection and noise reduction
Perform alert triage, manage incidents, and drive root cause analysis (RCA)
Optimize and report on SLA, SLO, SLI, MTTA, and MTTD metrics
Debug and resolve production issues across services and stacks
Automate operational tasks using Python and Google Apps Script
Collaborate with development teams to enhance reliability and scalability of systems
Drive continuous improvement in monitoring, alerting, and on-call processes
Primary Skills :
Secondary Skills :