Site Reliability Engineering Lead - Cloud Platform

Leap India Stack Foundation, Bangalore
18 days ago
Job description

About the job :

Position Purpose :

Brambles needs the platforms it builds on cloud hypervisors to run smoothly and to scale with demand. The SRE Lead will monitor, maintain, and drive the software engineering required to ensure the performance, scalability and reliability of cloud-based applications and infrastructure.

This role will proactively use observability data to identify improvement opportunities, not just across cloud services but also for the platform itself. It will also drive a self-healing mentality across a global estate that can scale seamlessly.

The SRE Lead will work alongside the Cloud Platform Engineering Team Lead(s) and others, and may assist in the creation of modules, but the focus of the role is delivering performance and optimisation to maintain production services.

Major / Key Accountabilities :

  • Use Brambles observability tools to detect platform and workload issues.
  • Work closely with the native platform management team, product groups and technical leads to design systems that troubleshoot issues proactively and automatically.
  • Support cloud operations in postmortem reviews to identify mitigations against future failures.
  • Evaluate the key workloads and implement strategies to mitigate the risk of failure.
  • Monitor continuously to review the effectiveness of these measures.
  • Minimise mean time to respond (MTTR).
  • Support the maintenance of bug-tracking tools.
  • Keep documentation and designs relevant.

Experience :

  • Significant experience within a technology automation / SRE role
  • 10+ years working with scripting languages.
  • Proven success in improving the customer experience.
  • Experience working within a matrix structure.
Qualifications :

Essential Qualifications :

  • Extensive experience with Python
  • Strong experience with BASH
  • Strong experience with automation of processes
  • Experience with Kubernetes
  • Strong knowledge of CI / CD
Desirable Qualifications :

  • Site Reliability Engineering Practitioner
  • Site Reliability Engineering Foundations
  • Bachelor's degree in Computer Science, Information Systems, Business or a related field (Master's preferred), or an equivalent combination of education and experience.
Skills and Knowledge :

  • Python : Can guide others to write clean, reusable, scalable code
    • Build pipelines for continuous improvement, writing Python scripts to automate testing, deployment, and rollback processes to ensure a smooth and reliable CI / CD pipeline
    • Advanced monitoring, logging and custom tooling
    • Write scripts to interact with cloud APIs, handling authentication, error handling, and maximising availability
  • System Programming Languages : Can guide and support others in the development, testing, and deployment of cloud-native applications, services and infrastructure
    • Troubleshoot issues with guidance from senior team members
    • Support the integration of cloud applications with edge devices using system programming languages for low-level interactions and communication
    • Develop and implement networking protocols
  • Understanding and use of event-based design, object-oriented design, functional design, multi-tenant design and domain-driven design; knowing which design approach is best suited to a particular problem; and using abstraction to solve complex problems.
  • Ability to design at both the high level (the forest) and the low level (the trees), including an understanding of current design approaches used in the field and when they are appropriate to the use cases relevant to the platform being built.
  • Use of well-established tools such as databases and Structured Query Language (SQL), and of newer leading-edge tools such as Kubernetes and the ecosystem of tools around a particular language or programming environment, with continuous research into and learning of emerging tools in a rapidly changing computing landscape.
  • Thinking abstractly to incorporate multiple perspectives; working within a space where the boundary or scope of the problem or system may be fuzzy; understanding the diverse operational contexts of the system; identifying inter- and intra-relationships and dependencies; understanding complex system behaviour; and reliably predicting the impact of change to the system.
  • Ability to navigate cloud platforms such as AWS and Azure, and to use them effectively as the technical landscape for building Brambles-specific platforms (both multi-tenant and purely internal). The platforms built within Brambles Digital need to be "cloud-native" and run securely, effectively and correctly at scale.
(ref : hirist.tech)
