Director of Site Reliability Engineering

Confidential

Bengaluru / Bangalore, India

10 days ago

Job description

Join us in bringing joy to customer experience. Five9 is a leading provider of cloud contact center software, bringing the power of cloud innovation to customers worldwide.

Living our values every day results in our team-first culture and enables us to innovate, grow, and thrive while enjoying the journey together. We celebrate diversity and foster an inclusive environment, empowering our employees to be their authentic selves.

Director of Site Reliability Engineering

The Director of Site Reliability Engineering is responsible for leading the strategic vision, operational excellence, and organizational capability of our SRE function. This role combines technical leadership with people management to build and scale a world-class SRE organization that enables rapid innovation while maintaining exceptional reliability standards.

As the senior leader of the SRE discipline, you will establish the technical strategy, culture, and practices that ensure our systems can scale reliably to meet business demands. You will build and lead a team of SRE professionals, partner with engineering leadership across the organization, and drive the adoption of SRE principles and practices.

This is a hands-on leadership role requiring deep technical expertise, proven ability to scale engineering organizations, and a track record of building reliable systems at scale. The ideal candidate will balance reliability with tactical execution, driving both immediate operational excellence and long-term architectural improvements where necessary.

Key Responsibilities

Strategic Leadership & Vision

Define and execute the long-term SRE strategy aligned with business objectives and technical roadmap
Establish reliability standards, SLI/SLO frameworks, and error budget policies across services
Drive architectural decisions that improve system reliability, scalability, and operational efficiency
Partner with engineering leadership to influence platform and application design for reliability
Represent SRE perspective in executive technical discussions and strategic planning

Team Leadership & Development

Build, lead, and scale a high-performing SRE organization

Recruit, hire, and onboard top-tier SRE talent across multiple experience levels

Develop career progression frameworks and growth paths for SRE professionals

Foster a culture of continuous learning, blameless post-mortems, and operational excellence

Provide technical mentorship and leadership development for senior SRE staff

Operational Excellence & Incident Management

Manage and oversee enterprise-wide incident response processes and on-call practices

Drive root cause analysis programs and ensure systematic elimination of failure modes

Implement sustainable on-call practices that maintain work-life balance while ensuring coverage

Oversee capacity planning and resource optimization strategies across all services

Establish metrics and reporting frameworks for reliability, performance, and operational health

Cross-Functional Partnership

Collaborate with VP/Director-level peers in Engineering, Product, and Infrastructure

Work with Security leadership to integrate reliability and security practices

Partner with Finance on cost optimization initiatives and capacity planning budgets

Engage with Customer Success and Support teams on reliability-impacting issues

Platform & Tooling Strategy

Drive the simplification and reduction of observability, monitoring, and alerting platforms

Establish automation standards and drive toil reduction initiatives

Help improve CI/CD pipeline architecture and deployment practices

Influence infrastructure-as-code and configuration management strategies

Organizational & Process Innovation

Implement SRE best practices including error budgets, toil tracking, and reliability reviews

Establish metrics-driven decision making and continuous improvement processes

Drive adoption of chaos engineering and proactive reliability testing

Create and maintain SRE documentation, runbooks, and knowledge sharing systems

Develop and execute disaster recovery and business continuity plans

Required Skills

Leadership & Management Experience

Bachelor's or Master's degree in Computer Science, Engineering, or equivalent experience

8+ years in engineering leadership roles, with 4+ years managing managers

Proven track record of building and scaling engineering teams

Experience with performance management, career development, and succession planning

Strong executive presence and ability to influence without authority

Experience driving organizational change and cultural transformation

Technical Expertise

Experience with multiple cloud platforms (AWS, GCP, Azure) and hybrid environments

Deep understanding of distributed systems, microservices architecture, and cloud platforms

Hands-on experience with modern observability tools (Prometheus, Grafana, Datadog, etc.)

Strong background in infrastructure automation, CI/CD, and infrastructure-as-code

Expertise in capacity planning, performance optimization, and cost management

SRE & Operations Mastery

Deep understanding of SRE principles, practices, and implementation at scale

Experience establishing SLI/SLO frameworks and error budget management

Proven track record of improving system reliability and reducing operational toil

Experience with incident management, post-mortem processes, and reliability engineering

Background in 24/7 operations and on-call management best practices

Business & Strategic Acumen

Understanding of budget management, resource allocation, and ROI analysis

Ability to communicate technical concepts to non-technical stakeholders and executives

Experience with vendor management and technology partnership decisions

Knowledge of compliance frameworks and regulatory requirements

Desired Skills

Advanced Technical Background

Background in container orchestration (Kubernetes) and service mesh technologies

Knowledge of database administration and data platform reliability

Experience with security engineering and DevSecOps practices

Success Metrics

Reliability & Performance

Achieve and maintain service availability targets (typically 99.9%+ uptime)

Reduce mean time to detection (MTTD) and mean time to recovery (MTTR)

Improve capacity planning accuracy and reduce over-provisioning costs

Increase deployment frequency while maintaining reliability standards

Team & Organizational Development

Build and retain a high-performing SRE organization with low attrition

Establish clear career progression and achieve high employee satisfaction scores

Develop internal talent and promote from within the SRE organization

Create sustainable on-call practices with reasonable operational load

Operational Excellence

Drive measurable reduction in operational toil and manual interventions

Establish comprehensive observability and proactive alerting across all services

Implement effective incident response with blameless post-mortem culture

Achieve cost optimization targets while maintaining reliability standards

Five9 embraces diversity and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better we are. Five9 is an equal opportunity employer.

View our privacy policy, including our privacy notice to California residents, here: https://www.five9.com/pt-pt/legal.

Note: Five9 will never request that an applicant send money as a prerequisite for commencing employment with Five9.

Skills Required

Performance Optimization, Distributed Systems, Cost Management, Prometheus, Grafana, Datadog, Infrastructure Automation, Capacity Planning, GCP, Incident Management, Azure, Kubernetes, AWS
