Talent.com
Infrastructure Automation Site Reliability Engineer (SRE)

Confidential • Hyderabad / Secunderabad, Telangana, India
5 days ago
Job description

About The Role

The Infrastructure Automation Site Reliability Engineer (SRE) bridges the gap between development and operations by applying software engineering principles to infrastructure and operational challenges. Responsibilities include creating support documentation, developing key metrics for tracking and reporting, managing monitoring services, using automation tools, and coordinating cross-team communications related to releases and maintenance.

Automation SREs support existing Infrastructure Developers by taking ownership of application support and process work required to manage these applications at scale in a 24×7 environment. This allows developers to focus on building new features and functionality.

Key Functions

Application / Tool Support

  • Support existing applications and services hosted by the Infrastructure Automation (InfAuto) team
  • Develop runbooks for application support and maintenance
  • Create detailed alerts for incident management and monitoring tools
  • Implement and manage an updated operations platform for the Technical Operations team

Service Introduction & Communications

  • Develop communication plans for service and tool launches
  • Improve messaging around service interruptions and maintenance

Infrastructure & Automation

  • Expand use of cloud development pipelines for new observability capabilities
  • Support cloud infrastructure integration
  • Use scripts to perform maintenance tasks

Monitoring & Observability

  • Define KPIs and SLAs for managed services
  • Assist with dashboard development and management
  • Integrate cloud infrastructure with monitoring and reporting tools
  • Conduct capacity planning to support proactive scaling

Operational Excellence

  • Design and execute high availability (HA) and disaster recovery (DR) infrastructure testing
  • Partner with operations teams to expedite issue analysis
  • Coordinate change management activities with application users

Required Skills And Tools Experience

    Experience range: 3–6 years using tools in the following categories:

  • Infrastructure as Code: Terraform, CloudFormation, or similar
  • Configuration Management: Ansible, Puppet, or Chef
  • Container Technologies: Docker, Podman, basic Kubernetes concepts
  • Observability Platforms: Grafana, Elastic (ELK), Datadog, Splunk
  • Issue / Project Tracking: Jira, ServiceNow, Trello, or similar
  • CI/CD Pipelines: Jenkins, GitLab CI, GitHub Actions
  • Documentation Tools: SharePoint, Confluence (for user guides, runbooks, etc.)
  • Linux Operating Systems: Red Hat Enterprise Linux or similar (CentOS, Rocky, Fedora)
  • Database Operations: SQL, PostgreSQL
  • IDEs: Visual Studio Code (VS Code), JetBrains IntelliJ IDEA

Desired Skills

  • 2–4 years in an L1 SRE or DevOps role
  • Experience as a Systems Engineer (infrastructure design and implementation)
  • Experience as a Platform Engineer (internal tooling and platform development)
  • Experience as a Cloud Engineer (multi-cloud experience and migration projects)
  • Experience in application support (production troubleshooting)
  • Experience as a Release Engineer (software deployment and release management)
  • Experience in incident response (on-call experience and production issue resolution)

Company Benefits & Perks

  • Competitive salary package.
  • Performance-based annual bonus (cash and stocks).
  • Hybrid working model (3 days in office per week).
  • Group Medical & Life Insurance.
  • Modern offices with free amenities & fully stocked cafeterias.
  • Monthly food card & company-paid snacks.
  • Hardship / shift allowance with company-provided pickup & drop facility (depending on shift).
  • Attractive employee referral bonus.
  • Frequent company-sponsored team-building events and outings.

    The benefits package is subject to change at management's discretion.

Skills Required

    ServiceNow, Trello, Chef, PostgreSQL, Grafana, Jira, Red Hat Enterprise Linux, Confluence, Terraform, Docker, Visual Studio Code, CloudFormation, Fedora, SQL, Datadog, Jenkins, Ansible, SharePoint, CentOS, Splunk, Puppet, Kubernetes
