Talent.com
Lead Site Reliability Automation Specialist
OneAdvanced • Bengaluru, Republic Of India, IN
26 days ago
Job description

We’re looking for a Senior SRE Automation Engineer to lead and drive automation across the operations lifecycle. The ideal candidate will be responsible for identifying and implementing automation opportunities to reduce manual intervention, minimise service tickets, and enable self-service capabilities.

Key Responsibilities

  • Design and implement automation pipelines to eliminate manual operational tasks and improve service efficiency.
  • Automate all manual patching activity.
  • Integrate AI / ML-based tools for incident detection, root cause analysis, and automated remediation to enhance platform resilience.
  • Build and maintain self-healing scripts and workflows using Infrastructure-as-Code (IaC) and event-driven automation frameworks.
  • Analyze recurring incidents to identify patterns and opportunities for automation and optimization.
  • Identify and automate standard operating procedures and repetitive day-to-day tasks to reduce ticket volume and manual intervention.
  • Lead service improvement initiatives through automation to improve overall team performance and customer satisfaction.
  • Own and continuously improve observability and alerting strategies to support proactive operations.
  • Effectively communicate with users to build trust and drive timely resolution of issues within SLA.
  • Collaborate with cross-functional teams to resolve complex problems and align on operational goals.
  • Handle escalations and critical incidents in a fast-paced environment with clear communication and swift action.
  • Mentor junior engineers, fostering a DevOps-first culture and encouraging skill development.
  • Demonstrate strong analytical and troubleshooting skills, including real-time issue identification and resolution in live environments.
  • Maintain thorough and accurate documentation of automation implementations, including known gaps and future opportunities.

Required Skills & Experience

  • Excellent analytical and problem-solving skills to diagnose, troubleshoot, and resolve complex technical issues.
  • Automation experience, e.g. automated patching.
  • Proficient in scripting and programming languages such as Python, Go, and Bash.
  • Strong hands-on experience with automation frameworks and tools including Terraform, Ansible, Chef, and Puppet.
  • Familiarity with automation scripting tools for infrastructure and operations (e.g. Python, Terraform, Ansible).
  • Experience working with AI-driven operations tools and AIOps platforms such as Moogsoft, BigPanda, Dynatrace, or custom ML-based pipelines.
  • In-depth knowledge of CI / CD, GitOps, and event-driven systems for modern DevOps practices.
  • Solid background in Linux systems and containerized environments like Docker and Kubernetes.
  • Proven experience in designing resilient, self-healing systems for high availability and operational efficiency.
  • Deep understanding of cloud platforms and technologies, including Microsoft Azure and Amazon Web Services (AWS), as well as on-premises and data center environments.
  • Experience integrating with LLMs for operational tasks or incident summarization.
  • Certifications in cloud platforms or DevOps tools (e.g., AWS Certified DevOps Engineer).
  • Exposure to service mesh, service discovery, or modern networking stacks.