Talent.com
No longer accepting applications
Senior Site Reliability Engineer

iVoyant, Secunderabad, Telangana, India
6 days ago
Job description

One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) to join their team and work on mission-critical SaaS cloud products.

Key Responsibilities:

Reliability and Performance Management:

Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products.

Develop and implement SLOs, SLIs, and SLAs to measure and improve service reliability.

Continuously optimize system performance and resource utilization across multiple cloud platforms.

Fine-tune and optimize application performance by analyzing code, traces, and database queries.

Incident Management and Troubleshooting:

Lead incident response efforts, effectively troubleshooting complex issues to minimize downtime and impact.

Reduce Mean Time to Recover (MTTR) through proactive monitoring, automated alerting, and efficient problem-solving techniques.

Conduct thorough Root Cause Analysis (RCA) for all major incidents and implement preventive measures.

Observability and Monitoring:

Design and implement end-to-end observability solutions across our distributed systems.

Develop and maintain comprehensive monitoring strategies using tools such as the ELK Stack, Prometheus, and Grafana.

Create and optimize product status dashboards to provide real-time visibility into system health and performance.

Automation and Infrastructure as Code (IaC):

Implement Infrastructure as Code practices using tools like Terraform.

Develop and maintain automated deployment pipelines and CI/CD workflows.

Create self-healing systems and automate routine operational tasks to reduce manual intervention.

Cloud-Agnostic Architecture:

Design and implement cloud-agnostic solutions that can operate efficiently across multiple cloud providers.

Develop expertise in event-driven architecture and related technologies (e.g., Apache Kafka / Azure Event Hubs, Redis, MongoDB Atlas, Azure IoT Hub).

Implement and manage containerized applications using Kubernetes across different cloud environments.

Continuous Improvement:

Regularly review and refine operational practices to enhance efficiency and reliability.

Stay updated with the latest industry trends and technologies in SRE, cloud computing, and DevOps.

Contribute to the development of internal tools and frameworks to support SRE practices.

Requirements:

Strong knowledge of the Azure cloud platform and its associated services.

Expertise in observability tools (ELK Stack, Dynatrace, Prometheus).

Expertise in containerization technologies such as Docker and Kubernetes.

Understanding of event-driven architecture and database technologies (MongoDB Atlas, Azure SQL, PostgreSQL).

Proficiency in IaC tools such as Terraform and GitHub Actions.

Proficiency in one or more programming languages: Python, .NET, or Java.

Strong understanding of networking concepts, load balancing, and security practices.

Related jobs
  • Promoted
Engineer, Site Reliability [T500-20266]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Site Reliability Engineer

o9 Solutions, Inc., Hyderabad, India
Be part of something revolutionary. At o9 Solutions, our mission is clear: be the Most Valuable Platform (MVP) for enterprises. With our AI-driven platform, the o9 Digital Brain, we integrate globa...
Last updated: 10 days ago
  • Promoted
Sr Engineer, Site Reliability Engineer [T500-20464]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Sr Engineer, Site Reliability [T500-20279]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Engineer, Site Reliability [T500-20521]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Site Reliability Engineer

Tata Consultancy Services, Hyderabad, Telangana, India
We are currently seeking candidates for the position of SRE Engineer in Hyderabad. Job ID: 375656. Apply here: TCS iBegin. Job description: Proven experience as a DevOps/SRE Engineer. Expertise in...
Last updated: 17 days ago
  • Promoted
Sr Engineer, Site Reliability [T500-20437]

TMUS Global Solutions, Hyderabad, Telangana, India
About T-Mobile: T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship b...
Last updated: 21 days ago
  • Promoted
Sr Engineer, Site Reliability [T500-20463]

TMUS Global Solutions, Hyderabad, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Senior Site Reliability Engineer

IntraEdge, Hyderabad, IN
Strong leadership and people management skills. Exceptional technical proficiency in Pearson's technology stack. Strategic thinking with a focus on long-term operational excellence. Champion operation...
Last updated: 8 days ago
  • Promoted
Engineer, Site Reliability [T500-20517]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Engineer, Site Reliability [T500-20515]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Site Reliability Engineer

Capgemini, Hyderabad, IN
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues...
Last updated: 5 days ago
  • Promoted
Sr Engineer, Site Reliability [T500-20439]

TMUS Global Solutions, Hyderabad, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Site Reliability Engineer

SID Global Solutions, Hyderabad, Telangana, India
Job Role: Site Reliability Engineer (SRE) – GCP. SIDGS is a premium global systems integrator and global implementation partner of Google corporation, providing Digital Solutions & Services to Fortu...
Last updated: 27 days ago
  • Promoted
Senior Site Reliability Engineer - ELK Expert

iVedha Inc., Hyderabad, IN
Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice. Must be available to work in the EST (US / Canada) Time Zone. Are you a Senior Site Reliability Engineer (SRE) with ...
Last updated: 30+ days ago
  • Promoted
Engineer, Site Reliability [T500-20519]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Engineer, Site Reliability [T500-20518]

TMUS Global Solutions, Hyderabad, Telangana, India
NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mo...
Last updated: 20 days ago
  • Promoted
Site Reliability Engineer

NationsBenefits India, Hyderabad, Telangana, India
Site Reliability Engineer (SRE) | Fintech | Kubernetes | Datadog. SRE team focused on maintaining the performance, reliability, and availability of our fintech platforms. Triage and resolve product...
Last updated: 17 days ago