AWS Site Reliability Engineer - ELK Stack

SMARTWORK IT SERVICES, Bangalore

30+ days ago

Job description

Job Title : AWS Site Reliability Engineer (SRE)

Experience Required : 6–8 years

Interview Drive Details :

Date : 13th Aug 2025

Time : 10 AM to 4 PM

Mode : In-person ( Face to Face )

Location : Bangalore

Role Overview

As an AWS SRE, you'll leverage DevOps and SRE best practices to build, automate, and maintain scalable, reliable cloud infrastructure.

Your focus will be on elevating system performance, observability, and incident response while fostering operational excellence.

Key Responsibilities

Define, monitor, and uphold Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to guide reliability efforts.
Build and maintain infrastructure resilience through automation (IaC with Terraform, CloudFormation), on-call tooling, and self-healing practices.
Monitor system health using tools like Prometheus, Grafana, Datadog, CloudWatch, and ELK Stack; establish proactive alerts to detect issues before they escalate.
Execute capacity planning and performance optimization to accommodate growth and improve efficiency.
Collaborate with development and operations teams to embed reliability in the software lifecycle and deployments.
Optimize costs and performance while maintaining operational effectiveness through AWS-native solutions and observability.
Support disaster recovery planning and fault tolerance, and ensure compliance with reliability standards.

Required Skills And Qualifications :

Bachelor's degree in Computer Science, IT, or related field.

6–8 years of experience in SRE, DevOps, or infrastructure engineering, with strong exposure to AWS environments.

Expert in infrastructure automation (e.g., Terraform, CloudFormation), containerization, and orchestration platforms.

Proficient in one or more programming / scripting languages (e.g., Python, Go, Bash).

Hands-on experience with monitoring, observability, and incident management tools (e.g., Prometheus, Grafana, CloudWatch, ELK, Datadog).

Strong understanding of system design, distributed systems, networking, and performance tuning.

Proven track record of managing production systems, incident response, and performing blameless postmortems.

Adept at capacity planning, performance benchmarking, and cost optimization.

Preferred Qualifications :

AWS certifications such as AWS Certified DevOps Engineer or AWS Certified Solutions Architect.

Familiarity with container orchestration platforms such as EKS / Kubernetes.

Experience with on-call practices, runbook development, and SRE methodologies (SLIs / SLOs, error budgets).

Exposure to chaos engineering or resilience testing frameworks.

(ref : hirist.tech)
