Staff Site Reliability Engineer

Anlage Infotech (India) Pvt Ltd, Bangalore

11 days ago

Job description

About The Role :

We are looking for a highly experienced Staff Site Reliability Engineer (SRE) to drive the reliability, performance, and operational excellence of our core production systems.

This is a senior, hands-on role that requires deep expertise in large-scale distributed systems, complex incident management, and building world-class observability platforms.

Key Responsibilities :

Reliability Engineering :

Define, measure, and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical platform services.
Drive down toil by promoting self-service and automation.

Observability Platform :

Lead the design and implementation of our global observability stack, including metric collection (Prometheus / M3DB), distributed tracing (Jaeger / OpenTelemetry), and logging (Loki / Elasticsearch).

Incident Management :

Act as a technical leader during high-severity incidents, perform in-depth Root Cause Analysis (RCA), and implement long-term preventative measures.

Performance Tuning :

Conduct performance analysis and capacity planning for the entire platform, optimizing infrastructure and application bottlenecks.

Security & Compliance :

Partner with the security team to enforce security controls and best practices across the infrastructure layer.

Mentorship & Evangelism :

Mentor SRE and DevOps teams, and evangelize reliability best practices and engineering excellence across all product development teams.

Technical Skills (Must-Have) :

Distributed Systems :

Proven experience designing, running, and debugging large-scale distributed systems and microservices in a high-traffic environment.

Cloud & Kubernetes :

Expert proficiency in managing highly available Kubernetes clusters (e.g., K8s on GCP / AWS / Azure) and their underlying cloud resources.

Observability Stack :

Deep, hands-on experience with modern observability tools (Prometheus, Grafana, :

Expert in at least one modern programming language (Go / Python) for writing operators, automation tooling, and extending monitoring systems.

Infrastructure as Code (IaC) :

Advanced knowledge of Terraform for managing multi-cloud infrastructure.

Networking :

Advanced understanding of network concepts in a cloud / container environment (service mesh, network policies, load balancing).

Qualifications :

Bachelor's or Master's degree in Computer Science or a related technical field.

8+ years of professional experience in SRE, DevOps, or Infrastructure Engineering roles.

A history of successfully implementing reliability improvements that result in measurable SLO adherence.

(ref : hirist.tech)
