Site Reliability Engineer

Zelis, Hyderabad, Telangana, India

4 hours ago

Job description

Position Overview

We are seeking a Site Reliability Engineer (SRE) with a strong focus on observability, monitoring, and incident management within the Microsoft Azure ecosystem. The ideal candidate will have hands-on experience with Azure Monitor, Application Insights, and Azure API Management (APIM), and will play a key role in ensuring the reliability, performance, and visibility of our cloud-based systems. Experience with New Relic is a plus.

Key Responsibilities:

1. Observability & Monitoring

Design and implement observability solutions using Azure Monitor, Log Analytics, and Application Insights.
Develop and maintain dashboards and visualizations for real-time system health and performance metrics.
Define and fine-tune alerting rules to proactively detect and respond to system anomalies.
Continuously improve monitoring practices to reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR).
Develop and maintain Azure AD B2C SSO custom policies for the orchestration process.
Knowledge of Okta is good to have.

2. Azure API Management (APIM)

Support the deployment, configuration, and monitoring of APIs using Azure APIM.

Implement logging, tracing, and analytics for APIs to ensure visibility and performance tracking.

Collaborate with development teams to ensure APIs are observable, secure, and scalable.

3. Incident Management

Respond to incidents in a timely and effective manner.

Conduct root cause analysis and contribute to post-incident reviews and documentation.

Work with cross-functional teams to implement long-term solutions and prevent recurrence.

4. Collaboration & Documentation

Collaborate with development, QA, and operations teams to embed observability into the software development lifecycle.

Gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding.

Collaborate with development teams to improve services through rigorous testing and release procedures.

Participate in system design consulting, platform management, and capacity planning.

Create sustainable systems and services through automation and uplifts.

Balance feature development speed and reliability with well-defined service-level objectives.

Maintain clear and comprehensive documentation for monitoring setups, incident response procedures, and system configurations.

Share knowledge through internal training, documentation, and peer reviews.

Required Skills & Qualifications:

5+ years of experience in a Site Reliability, DevOps, or Cloud Engineer role.

3+ years of experience with Azure Monitor, Application Insights, and Log Analytics.

2+ years of experience with Azure API Management (APIM) and API lifecycle management.

Strong knowledge of Kusto Query Language (KQL).

Strong knowledge of SQL.

Basic scripting skills (PowerShell, Azure CLI).

Understanding of distributed systems, microservices, and cloud-native architectures.

Strong analytical, troubleshooting, and communication skills.

Experience with at least one of New Relic, Datadog, or Splunk for application performance monitoring and observability.

Commitment to Diversity, Equity, Inclusion, and Belonging

At Zelis, we champion diversity, equity, inclusion, and belonging in all aspects of our operations. We embrace the power of diversity and create an environment where people can bring their authentic and best selves to work. We know that a sense of belonging is key not only to your success at Zelis, but also to your ability to bring your best each day.
