Principal Site Reliability Engineer - IAC Terraform

Tidyhire, Hyderabad
1 day ago
Job description

Description :

This is a pure individual contributor role.

Core Responsibilities :

Infrastructure Design & Maintenance :

  • Lead the design, build, and maintenance of our core infrastructure using infrastructure-as-code (IaC) tools (e.g., Terraform, CloudFormation).
  • Own the provisioning and lifecycle management of production, staging, and other critical environments.
  • Architect and implement shared infrastructure components (e.g., logging, metrics, service mesh, load balancing).
  • Drive continuous improvements to infrastructure scalability, availability, and performance.
  • Act as a key partner to development teams, providing infrastructure primitives and strategic guidance on deployment needs.

Deployment Systems & CI / CD :

  • Design, own, and enhance our CI / CD pipelines (GitHub Actions, Argo CD) to maximize reliability, velocity, and automation.
  • Establish and enforce best practices across all environments for deployment, rollback, and observability.
  • Partner with developers to architect and streamline the testing and delivery of code to production.
  • Champion the elimination of manual steps in deployment and operations workflows.

Reliability, Observability & Tooling :

  • Architect and manage our monitoring, alerting, and logging infrastructure (Kube-Prometheus-Grafana stack).
  • Define, implement, and track SLOs / SLIs for core services, holding service owners accountable.
  • Proactively identify and eliminate single points of failure, performance bottlenecks, and sources of instability.
  • Lead reliability reviews, blameless post-incident analyses, and capacity planning initiatives.
  • Perform basic debugging of Java applications to assist development teams.

Knowledge Sharing :

  • Ensure all systems and processes built or maintained by the SRE team are accompanied by thorough, up-to-date documentation.
  • Lead internal training sessions, walkthroughs, and pairings to cross-train teammates and reduce knowledge silos.

Collaboration & Culture :

  • Work closely with the SRE Lead to define team strategy, prioritize work, and execute on team goals.
  • Participate in on-call rotations, acting as an escalation point for complex issues.

Skillset :

  • 8+ years of experience in a Senior SRE / DevOps / related infrastructure role.

Cloud :

  • Deep, hands-on expertise with AWS, including services like ECS, EKS, Aurora (Postgres), EC2, S3, and VPC.

Containers & Orchestration :

  • Strong, production-level proficiency with Kubernetes and Helm. Deep understanding of container runtimes and networking.

CI / CD :

  • Extensive experience designing, building, and managing complex CI / CD pipelines using tools like GitHub Actions and Argo CD. Experience with container registries like GHCR.

IaC :

  • Expertise in Infrastructure as Code, with strong proficiency in Terraform.

Observability :

  • Proven experience with observability stacks, particularly the Kube-Prometheus-Grafana stack, including custom metric instrumentation and advanced dashboarding.

Debugging :

  • Ability to perform basic performance analysis and debugging of applications (Java experience is a strong plus).

Incident Management :

  • Experience leading incident response, conducting blameless post-mortems, and driving resulting action items to completion.

(ref : hirist.tech)
