SRE & DevOps Engineer

METRO Global Solution Center IN, Pune, Maharashtra, India

Job description

We are looking for…

An experienced SRE & DevOps Engineer with deep expertise in cloud infrastructure, automation, and observability
A hands-on engineer who ensures reliability, performance, and scalability of systems
A proactive problem solver with a strong focus on operational excellence and continuous improvement
A collaborator who bridges development and operations through modern DevOps and SRE practices
An effective communicator who thrives in cross-functional teams and drives best practices

This role matters to us…

The Senior SRE & DevOps Engineer plays a vital role in ensuring the resilience, scalability, and reliability of our systems. By applying modern SRE principles, automation, and incident management practices, you will enable faster, more reliable delivery of business value while safeguarding system stability and customer trust.

Key Responsibilities

Design, implement, and maintain scalable, secure, and cloud-native infrastructure

Set up and maintain observability solutions, including monitoring, alerting, logging, and tracing (e.g., Prometheus, Grafana, ELK, Datadog)

Continuously improve CI/CD pipelines and automate deployment workflows to increase delivery efficiency

Lead structured incident response and root cause analysis, and drive a culture of post-mortem learning

Collaborate closely with developers, QA, and architects to ensure seamless integration and performance optimization

Apply SRE principles (SLIs, SLOs, SLAs, error budgets) to guide operational decisions and system reliability

Champion Infrastructure-as-Code practices using Terraform, Helm, or Ansible

Ensure security, compliance, and reliability are embedded into operations

Mentor team members and foster a culture of operational excellence and continuous improvement

Qualifications

Education

Bachelor’s or Master’s degree in Computer Science or Engineering, or equivalent practical experience

Work Experience

6 to 8 years of proven experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles

Hands-on expertise with Kubernetes (preferably GKE), Docker, and service mesh technologies like Istio

Strong background in CI/CD practices and tools (GitHub Actions, Jenkins X, Argo CD, or similar)

Experience with observability solutions (Prometheus, Grafana, ELK, Jaeger, Datadog, GCP Dashboards)

Proficiency with at least one major cloud platform (GCP, AWS, Azure)

Scripting or programming experience (Python, Go, Bash, or similar)

Practical knowledge of Infrastructure-as-Code tools like Terraform, Helm, or Ansible

Hands-on experience managing incidents, troubleshooting, and performing root cause analysis

Familiarity with SRE practices (SLIs, SLOs, SLAs, error budgets)

Other Requirements

Strong communication and collaboration skills across cross-functional teams

Ability to balance short-term operational needs with long-term scalability and system health

Analytical and proactive mindset with a focus on continuous improvement

Fluency in English (written and spoken)

Nice-to-Have

Experience with security best practices in distributed systems (OAuth2, mTLS, RBAC)

Knowledge of cost optimization and cloud governance practices

Familiarity with Camunda/CIB7 environments

Contributions to open-source DevOps/SRE communities
