Talent.com
Site Reliability Engineer - IAC Terraform

Hashone Careers • India
30+ days ago
Job description

Job Summary:

The Site Reliability Engineer specializing in Infrastructure as Code (IaC) and Terraform is responsible for designing, building, automating, and maintaining cloud infrastructure using modern DevOps and SRE practices.

The role ensures system reliability, scalability, high availability, and operational excellence across production environments.

The engineer will focus heavily on automation, monitoring, CI/CD, incident response, and performance engineering while working closely with developers and platform teams.

Key Responsibilities:

  • Design, create, and maintain scalable cloud infrastructure using Terraform.
  • Develop reusable Terraform modules, pipelines, and automation frameworks.
  • Implement infrastructure provisioning, updates, and rollback workflows through version-controlled IaC.
  • Ensure compliance with infrastructure standards, security policies, and cloud governance frameworks.
  • Build and manage cloud infrastructure on AWS / Azure / GCP.
  • Implement scalable architecture patterns (auto-scaling, load balancing, container orchestration).
  • Optimize resource utilization and cost-efficiency.
  • Manage VPCs, subnets, security groups, firewalls, IAM, and other cloud services.
  • Ensure reliability, resiliency, scalability, and performance of production systems.
  • Implement chaos engineering practices, fault injection, and resiliency tests.
  • Conduct root cause analysis (RCA) and develop permanent fixes for system failures.
  • Define and maintain SLOs, SLIs, SLAs, and error budgets.
  • Build and enhance CI/CD pipelines using GitHub Actions, GitLab CI, Jenkins, Azure DevOps, or similar.
  • Automate testing, security checks, deployments, and environment provisioning.
  • Implement GitOps workflows with tools like ArgoCD or Flux (optional).
  • Deploy and manage containerized applications using Docker and Kubernetes.
  • Manage clusters (EKS, AKS, GKE, or self-hosted Kubernetes).
  • Experience implementing a service mesh (Istio / Linkerd) is an advantage.
  • Manage Helm charts, Kustomize, and Kubernetes controllers.
  • Implement and maintain monitoring solutions (Prometheus, Grafana, Datadog, New Relic, CloudWatch, etc.).
  • Set up centralized logging using ELK / EFK, Cloud Logging, or Splunk.
  • Monitor system health, performance metrics, and application behavior.
  • Build alerting strategies and auto-remediation systems.
  • Implement security best practices across infrastructure and deployments.
  • Manage secrets, encryption, access control, and network security.
  • Use Terraform Cloud / Enterprise, Sentinel policies, and linting tools for compliance enforcement.
  • Participate in security audits, pen tests, and cloud hardening initiatives.
  • Participate in on-call rotations and respond to production incidents.
  • Troubleshoot and resolve system outages, latency issues, and performance problems.
  • Develop runbooks, playbooks, and post-incident reports.
  • Automate repetitive operational tasks.
  • Work collaboratively with developers, QA, product teams, and other SRE members.
  • Assist teams in adopting cloud-native, scalable, and automated practices.
  • Maintain up-to-date system documentation, diagrams, and operational SOPs.
  • Provide technical guidance and mentorship to junior engineers.

Required Skills & Competencies:

Technical Skills:

  • Strong experience in Terraform and IaC best practices.
  • Hands-on expertise with major cloud providers (AWS / Azure / GCP).
  • Solid knowledge of Linux administration, networking, and distributed systems.
  • Strong scripting skills (Python, Bash, Shell).
  • Excellent understanding of Kubernetes, Docker, and container orchestration.
  • Strong CI/CD experience.
  • Solid experience with monitoring tools (Grafana, Prometheus, Datadog, ELK).
  • Knowledge of GitOps, configuration management (Ansible), or cloud-native patterns (preferred).
  • Understanding of SRE concepts (SLIs, SLOs, error budgets, toil reduction).

(ref: hirist.tech)
