Sr. Staff Site Reliability Engineer

SolarWinds, Bangalore, India
8 hours ago
Job description

At SolarWinds, we’re a people-first company. Our purpose is to enrich the lives of the people we serve, including our employees, customers, shareholders, partners, and communities. Join us in our mission to help customers accelerate business transformation with simple, powerful, and secure solutions.

The ideal candidate thrives in an innovative, fast-paced environment and is collaborative, accountable, ready, and empathetic. We’re looking for individuals who believe they can accomplish more as a team and create lasting growth for themselves and others. We hire based on attitude, competency, and commitment. Solarians are ready to advance our world-class solutions in a fast-paced environment and accept the challenge to lead with purpose. If you’re looking to build your career with an exceptional team, you’ve come to the right place. Join SolarWinds and grow with us!

About the Role :

As a Senior Staff Site Reliability Engineer, you will play a pivotal role in driving reliability and performance improvements across the SolarWinds Observability Platform. You will work closely with cross-functional engineering teams to manage and reduce SaaS backlogs, ensuring that our platform scales effectively while maintaining the highest standards of reliability and performance. Your ability to drive initiatives, provide technical leadership, and optimize complex systems will be key to our success.

This role demands deep technical expertise, a collaborative mindset, and the ability to mentor a high-performing team of engineers. You will be responsible for driving technical initiatives, overseeing incident response, and improving our platform’s infrastructure while focusing on the integration of emerging technologies such as ClickHouse, Kafka, Karpenter, and Buf.

Key Responsibilities :

  • Lead and Drive Initiatives : Own and lead strategic initiatives to improve the reliability, scalability, and performance of the SolarWinds Observability Platform, with a strong focus on reducing SaaS backlogs.
  • SaaS Backlog Management : Collaborate with cross-functional teams to identify, prioritize, and address outstanding backlog items, including incidents, infrastructure improvements, performance optimization, and automation.
  • Automation & Observability : Lead the development of automation strategies and observability tools to improve platform monitoring, reduce incidents, and enhance performance insights across the infrastructure.
  • Incident Response & Postmortems : Lead response efforts for production incidents, conducting thorough postmortems, driving continuous improvement initiatives, and ensuring the team learns from each incident.
  • Platform Engineering Leadership : Drive initiatives related to platform engineering and scale infrastructure systems, ensuring they meet the reliability and performance standards necessary for the SolarWinds Observability Platform.
  • Mentorship & Team Leadership : Mentor and provide technical guidance to the Site Reliability Engineering (SRE) team, helping them grow their skills and driving a culture of continuous learning and collaboration.
  • Collaboration & Cross-Functional Engagement : Collaborate closely with engineering, security, and product teams to ensure the seamless integration of new technologies and systems, improving platform reliability and scalability.

Ideal Candidate Attributes :

  • Strong Leadership Skills : Proven ability to drive initiatives, manage SaaS backlogs, and lead cross-functional teams to successful outcomes.
  • Collaborative Mindset : Comfortable working with diverse teams across different functions to solve complex problems and build scalable, high-performance systems.
  • Customer-Focused : A strong customer orientation, with the ability to translate technical challenges into business solutions.
  • Excellent Communication : Strong interpersonal and communication skills to effectively engage with both technical and non-technical stakeholders.
  • Problem-Solving & Ownership : A collaborative problem solver with a strong bias for ownership and decisive action.

Qualifications :

  • 13+ years of experience in Site Reliability Engineering, Platform Engineering, or related roles, with extensive experience managing SaaS environments.
  • 8+ years of experience designing, building, and maintaining AWS/Azure infrastructure, using Terraform and automation tools.
  • 5+ years of experience building, running, and scaling Kubernetes clusters in production environments.
  • Experience with observability tools (e.g., monitoring, logging, tracing, metrics) and practices for high-performance systems.
  • Strong expertise with Kafka for real-time data processing, ClickHouse for OLAP workloads, and GitOps CI/CD processes.
  • Familiarity with Karpenter for Kubernetes autoscaling and Buf for managing Protocol Buffers at scale is a plus.
  • Programming experience in Python, Go (Golang), and Bash.
  • Security Operations Experience : Knowledge of security best practices for cloud-native environments, including encryption, key management, and security policies.
  • Mentorship experience : Demonstrated success in mentoring and growing technical teams, fostering a culture of collaboration and continuous learning.