Site Reliability Engineer

Amicon Hub Services • India

8 days ago

Job description

Key Responsibilities

Manage and scale production systems hosted on Google Cloud Platform (GCP)

Implement SRE best practices: monitoring, alerting, SLAs, SLOs, and error budgets

Automate operational tasks using Infrastructure as Code (IaC) tools such as Terraform

Improve system reliability and reduce manual interventions through automation

Collaborate with development teams to ensure new services are production-ready

Perform incident response and post-mortem analysis to prevent recurring issues

Design and implement CI/CD pipelines for rapid and safe deployments

Manage GCP resources: IAM, VPC, Compute Engine, GKE, Cloud Functions, Pub/Sub, BigQuery, etc.

Ensure security, compliance, and cost optimization across the cloud infrastructure

Required Skills & Qualifications

5+ years of experience in SRE, DevOps, or Cloud Infrastructure roles

Strong hands-on experience with Google Cloud Platform (GCP) services

Proficiency with Terraform or other IaC tools

Solid knowledge of Kubernetes (GKE), containerization, and microservices

Strong scripting skills in Python, Go, or Shell

Familiarity with incident response and post-mortem culture

Knowledge of networking, security, and cloud cost management

Preferred Qualifications

GCP certifications (e.g., Professional Cloud DevOps Engineer)

Prior experience working with e-commerce or high-scale platforms

Familiarity with SRE practices and tooling such as chaos engineering and service mesh (Istio)

Soft Skills

Strong communication and stakeholder management

Problem-solving mindset with a focus on reliability and automation

Ability to work independently in a distributed, outsourced team model
