Site Reliability Engineer (SRE)

Infraveo • Ahmedabad, Gujarat, India

21 days ago

Job description

This is a remote position.

We are seeking a Site Reliability Engineer (SRE) with deep expertise in AWS to join our team and help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of our production systems. You'll work closely with engineering teams to automate operations, improve monitoring, and design resilient systems.

Responsibilities :

Design, implement, and maintain scalable, resilient AWS infrastructure.
Develop and manage CI/CD pipelines and infrastructure-as-code (Terraform or similar).
Set up and optimize monitoring, alerting, and incident response processes.
Proactively identify and resolve performance, reliability, and security issues.
Collaborate with development teams to integrate SRE best practices into their workflows.
Conduct post-mortems and root cause analyses on incidents.
Participate in on-call rotations to support 24/7 system reliability.

Requirements

5+ years of experience as an SRE or in a similar role.

Deep knowledge of AWS services (EC2, ECS, RDS, Lambda, S3, etc.).

Proficiency in infrastructure-as-code tools (Terraform, CloudFormation, etc.).

Solid experience with Linux systems administration and networking concepts.

Strong programming/scripting skills (Python, Bash, Go, etc.).

Experience with CI/CD tools (GitLab CI, Jenkins, etc.).

Familiarity with observability tools (Prometheus, Grafana, Datadog, etc.).

Nice To Have :

Experience with container orchestration (ECS, EKS, or Kubernetes).

Understanding of security best practices in cloud environments.

Exposure to incident management frameworks (SRE handbook, etc.).

Benefits

Work Location : Remote

5-day work week

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type : Full Time

Experience : years

Vacancy : 1

Monthly Salary : 81 - 100
