Talent.com
Senior Site Reliability Engineer - ELK Expert


iVedha Inc., Trivandrum, Kerala, India
30+ days ago
Job description

Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote). Must be available to work in the EST (US/Canada) time zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

We're looking for an SRE with 7+ years of experience, including 4+ years specializing in the ELK stack (Elasticsearch, Logstash, Kibana), to join our Platform Engineering Practice. In this role, you’ll design, manage, and scale ELK clusters ingesting 2–3+ TB/day, enhance reliability across distributed systems, and drive automation within Azure cloud environments. This is a high-impact engineering opportunity focused on performance, observability, and operational excellence at scale.

Why Join Us

Career Growth: Work alongside industry experts on cutting-edge cloud technologies

Competitive Compensation and Benefits: We recognize and reward top talent

Exciting, Impactful Work: Design and build scalable, resilient cloud environments

Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure

What You Will Do

Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure

Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration

Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor

Enhance Security and Compliance: Implement security best practices across DevOps workflows

Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency

Manage and scale large ELK clusters: Handle 2–3+ TB/day log volumes while ensuring high availability and performance

Optimize ELK architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage

Build and tune log pipelines: Scale Logstash and Beats pipelines across distributed environments

Support Kibana observability layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)

What You Bring

7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering

4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)

Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)

Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components

Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)

Proficiency in Python, Go, or Bash for automation and scripting

Deep understanding of Kubernetes, Docker, and cloud-native architectures

Experience with observability tools such as Prometheus, Grafana, and Azure Monitor

Ability to work in a fast-paced, collaborative environment and solve complex operational issues

Education

Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Certifications (Nice to Have)

Microsoft Azure certifications: AZ-104, AZ-400
