Senior Lead Site Reliability Engineer / Expert / Specialist (DevOps & Automation)

SITA, Delhi, India

21 hours ago

Job description

Overview

WELCOME TO SITA

We're the team that keeps airports moving, airlines flying smoothly, and borders open. Our tech and communication innovations are the secret behind the success of the world's air travel industry.

You'll find us at 95% of international hubs. We partner closely with over 2,500 transportation and government clients, each with their own unique needs and challenges. Our goal is to find fresh solutions and cutting-edge tech to make their operations run like clockwork. Want to be a part of something big?

Are you ready to love your job? The adventure begins right here, with you, at SITA.

PURPOSE

Responsible for proactive product support that keeps product performance high and continuously improving. Identifies and resolves the root causes of operational incidents, implementing solutions that improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog used to trigger events, and develops both manual remediation approaches and automated workflows to resolve alerts. Oversees the deployment of IT services and solutions, ensuring successful integration with minimal disruption. Focuses on operational automation and integration to improve efficiency and collaboration between development and operations within service operations.

What you will do:

Define, build, and maintain support systems to ensure high availability and performance.

Handle complex cases for the PSO.

Build events to add to the event catalog for the relevant product or application.

Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring.

Perform incident response and root cause analysis for critical system failures.

Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs).

Collaborate with development and operations to integrate reliability best practices, including moving to zero downtime architecture.

Proactively identify and remediate performance issues.

Problem Management

Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions.

Coordinate with incident management teams and collaborate with PSO and Engineering teams to develop and implement permanent solutions.

Monitor the effectiveness of problem resolution activities, provide regular reports on problem management activities, and ensure continuous improvement.

Event Management

Define and maintain an event catalog, specifying active events, thresholds, and relevant remediation, and optimize it for efficiency.

Develop event response protocols, provide training to teams, and ensure quick and efficient handling of incidents.

Collaborate with stakeholders to define events, ensure coverage across the PSO, and drive improvements based on post-event reviews and feedback.

Deployment Management

Own the quality of deployments for the PSO, ensuring a clear process and responsibilities are assigned for smooth implementation.

Develop and maintain deployment schedules, conduct operational readiness assessments, and manage deployment risk assessments to ensure service stability.

Oversee the execution of deployment plans, coordinate resources, communicate with stakeholders, and continuously improve deployment processes based on feedback.

DevOps Management

Manage continuous integration and deployment (CI/CD) pipelines, ensuring smooth integration between development and operational teams.

Qualifications

EXPERIENCE, KNOWLEDGE & SKILLS

Several years of experience in IT operations, service management, or infrastructure management, including roles such as Site Reliability Engineer, Problem Manager, or DevOps Manager.

Proven experience in managing high-availability systems and ensuring operational reliability.

Extensive experience in root cause analysis (RCA), incident management, and developing permanent solutions for recurring service disruptions.

Hands-on experience with CI/CD pipelines, automation, system performance monitoring, and the implementation of infrastructure as code.

Strong background in collaborating with cross-functional teams (development, operations, engineering, etc.) to improve operational processes and service delivery.

Experience in managing deployments, risk assessments, and optimizing event and problem management processes.

Familiarity with cloud technologies, containerization, and scalable architecture, including experience with zero-downtime deployment strategies.

Functional Skills:

Collaboration

Stakeholder Management

Service Design

Project Management

Communication

Compliance & Risk Management

Problem Solving

Incident Management

Change Management

Innovation

Technical Skills:

Cloud Infrastructure

Automation & AI

Operations Monitoring & Diagnostics

Deployment

Programming & Scripting Languages

Educational Background

Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.

Advanced degree (Master's or equivalent) is often preferred for senior positions.

Qualifications

Relevant certifications such as ITIL, CompTIA Security+, or Certified Kubernetes Administrator (CKA).

Certifications in cloud platforms (AWS, Azure, Google Cloud) or DevOps methodologies (e.g., Certified DevOps Professional).

Certifications in specific tools like ServiceNow, Jira, or other relevant software platforms.

WHAT WE OFFER

We're all about diversity. We operate in 200 countries, speak 60 different languages, and bring together many cultures. We're really proud of our inclusive environment. Our offices are comfortable and fun places to work, and we make sure you get to work from home too. Find out what it's like to join our team and take a step closer to your best life ever.

Flex Week: Work from home up to 2 days a week (depending on your team's needs).

Flex Day: Make your workday suit your life and plans.

Flex-Location: Take up to 30 days a year to work from any location in the world.

Employee Wellbeing: We've got you covered with our Employee Assistance Program (EAP), available to you and your dependents 24/7, 365 days a year. We also offer Champion Health, a personalized platform that supports a range of wellbeing needs.

Professional Development: Level up your skills with our training platforms, including LinkedIn Learning!

Competitive Benefits: Competitive benefits that make sense with both your local market and employment status.

SITA is an Equal Opportunity Employer. We value a diverse workforce. In support of our Employment Equity Program, we encourage women, aboriginal people, members of visible
