Site Reliability Engineer - 2

Confidential, Bengaluru / Bangalore

30+ days ago

Job description

As an SRE-2 at MoEngage, you'll be a critical member of our SRE team, responsible for the health and performance of key services and contributing directly to the evolution of our infrastructure at a scale that few engineers get to experience. This is your chance to deepen your technical expertise, take on more ownership, and mentor emerging talent while working on a platform that operates at the cutting edge.

What You'll Do to Keep Our Engines Roaring

Be a Reliability Champion: Take ownership of the reliability, performance, and efficiency of critical services.
Automate, Automate, Automate: Design, develop, and implement robust automation solutions to eliminate toil, streamline operations, and improve system resilience.
Battle Incidents (and Win): Lead troubleshooting efforts for complex production incidents, perform in-depth root cause analysis, and implement sustainable preventative measures.
Sculpt Our Infrastructure: Actively contribute to the design, implementation, and optimization of our cloud infrastructure on AWS and GCP, leveraging your expertise in technologies like Kubernetes.
Enhance Observability: Implement and refine advanced monitoring, alerting, and logging solutions to gain deep insights into system behavior and predict potential issues.
Collaborate for Success: Partner closely with development teams to influence architectural decisions, ensuring reliability, scalability, and security are built in from the start.
Strengthen Our Security Posture: Implement and advocate for advanced security practices within our infrastructure and operational workflows.
Drive Efficiency: Analyze and optimize cloud infrastructure spend, identifying and implementing cost-saving opportunities.
Guide the Next Wave: Mentor and guide SRE-1 engineers, contributing to growth and knowledge sharing within the team.
Be Ready for Action: Participate in our on-call rotation, acting as a key point of escalation and resolution for critical issues.

What Makes You the Ideal Candidate

3-5 years of hands-on experience in Site Reliability Engineering, DevOps, or a similar role with a strong focus on production systems.

Demonstrated expertise in Python or Go, with a proven track record of automating complex tasks.

Strong command of AWS and/or GCP cloud platforms.

In-depth experience with containerization and orchestration using Kubernetes (K8s, ArgoCD, Helm / Kustomize).

Experience with infrastructure as code tools like Terraform or Ansible is highly valued.

Solid understanding of, and experience with, monitoring and observability stacks (VictoriaMetrics, Prometheus, Grafana, ELK stack, etc.).

Deep knowledge of Linux / Unix system internals and advanced networking concepts.

Proven ability to diagnose and resolve complex issues in large-scale distributed systems.

A strong understanding of Cloud Security and Information Security principles and best practices.

Experience with cloud cost analysis and optimization techniques.

Familiarity with CI/CD pipelines and GitOps methodologies.

Experience with message queues and distributed systems (Celery, Kafka) is a plus.

Excellent communication, collaboration, and problem-solving skills.

A desire to mentor and lead by example.

Skills Required

Reliability Engineering, DevOps, Python, AWS, Kubernetes
