Talent.com
Senior Site Reliability Engineer

Elios Talent
Hyderabad, Telangana, India
15 hours ago
Job description

Key Highlights

🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms

⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems

📊 Own reliability, uptime, system health, costs, and performance across mission-critical environments

🔐 Strengthen DevSecOps practices, improving security, delivery velocity, and operational excellence

🚨 Lead major incident response, troubleshoot complex issues, and uphold production stability at scale

Position Overview

We are seeking a Senior Site Reliability Engineer to drive reliability, automation, and performance for large-scale, cloud-based platforms. This role blends deep technical engineering, systems thinking, DevOps collaboration, and operational leadership.

You will design and implement scalable infrastructure, improve observability, enhance resiliency, manage incident operations, and champion modern DevSecOps practices. This role plays a critical part in supporting tens of thousands of daily users while ensuring platforms remain secure, fast, and highly available.

Key Responsibilities

Cloud Engineering

  • Architect, deploy, and optimize AWS environments using automation and Infrastructure-as-Code
  • Build tooling that increases predictability, stability, and delivery speed
  • Optimize systems for scale, reliability, cost, and performance
  • Maintain repeatable, traceable, and transparent infrastructure through Terraform and automation
  • Monitor cloud spend and usage, ensuring alignment with service-level objectives

Observability & Reliability

  • Own uptime, reliability, system security, performance metrics, and golden signals
  • Lead incident management and triage bridges during major events
  • Enhance telemetry systems (New Relic, CloudWatch, Datadog) for deep operational visibility
  • Use data-driven analysis to improve system stability and customer experience
  • Ensure architecture and deployment patterns meet SLAs and reliability goals

DevSecOps & Automation

  • Strengthen CI/CD pipelines, code-review practices, and engineering standards
  • Partner with Cybersecurity to address vulnerabilities through automation
  • Support secure, consistent, and scalable delivery workflows across engineering teams

Resiliency Engineering

  • Identify failure points, blast-radius risks, and architectural gaps
  • Run failure-injection / chaos testing to validate resiliency
  • Forecast traffic, plan for seasonal peaks, and scale systems for 2x+ load scenarios
  • Drive improvements to infrastructure and software to meet resiliency targets

Leadership & Collaboration

  • Mentor engineers across levels, promoting high-quality engineering practices
  • Collaborate daily with product, engineering, and security teams in a DevOps model
  • Document, uplift, and share knowledge through cross-team forums and best practices

Qualifications

  • Experience as a software engineer with strong debugging and deployment skills
  • Hands-on expertise with AWS and Terraform (required)
  • Experience with ECS; Kubernetes / EKS experience strongly preferred
  • Strong proficiency in Python, Golang, Bash, and automation frameworks
  • CI/CD experience with Jenkins, GitHub Enterprise, CircleCI, or similar
  • Ability to troubleshoot across web servers, app servers, OS, networks, storage, and databases
  • Experience running large-scale, high-availability production systems
  • Strong communication, root-cause analysis, and incident leadership skills
  • BS in Computer Science or equivalent industry experience

About Us

We build scalable, secure, and high-performing digital platforms that power global user experiences. By combining cloud engineering, automation, observability, and resilient systems design, we help organizations operate more reliably, innovate faster, and support long-term platform stability and growth.

Why Join Us

Join a forward-thinking engineering organization where reliability, automation, and performance are core values. You’ll work with a modern cloud stack, collaborate with exceptional engineers, and own meaningful technical impact across large-scale applications. This is an opportunity to shape infrastructure strategy, elevate engineering practices, and build systems that support millions with consistency and excellence.
