Talent.com
Site Reliability Engineering (SRE) Lead

Confidential, Bengaluru / Bangalore, India
5 days ago
Job description

Position Purpose

Brambles needs the platforms it builds on cloud hypervisors to run smoothly and to scale with demand. The SRE Lead will monitor, maintain, and drive the software engineering required to ensure the performance, scalability and reliability of cloud-based applications and infrastructure.

This role will proactively use observation data to identify improvement opportunities, not just across cloud services but also in the platform itself. It will also drive a self-healing mentality across a global estate that must scale seamlessly.

The SRE Lead will work alongside the Cloud Platform Engineering Team Lead(s) and others. The role may assist in creating modules, but its focus is on delivering performance and optimisation to maintain production services.

Major / Key Accountabilities

  • Use Brambles observability tools to detect platform and workload issues.
  • Work closely with the native platform management team, product groups and technical leads to formulate and design systems that troubleshoot issues proactively and automatically.
  • Support cloud operations in postmortem reviews to identify mitigations against future failures.
  • Evaluate the key workloads and implement strategies to mitigate the risk of failure.
  • Monitor continuously to review effectiveness.
  • Minimise mean time to respond (MTTR).
  • Support the maintenance of bug-tracking tools.
  • Keep documentation and designs relevant.

Experience

  • Significant experience within a technology automation / SRE role
  • 10+ years working with scripting languages
  • Proven success in improving the customer experience
  • Experience working within a matrix structure
Qualifications

Essential Qualifications

  • Extensive experience with Python
  • Strong experience with Bash
  • Strong experience with automation of processes
  • Experience with Kubernetes
  • Strong knowledge of CI / CD

Desirable Qualifications

  • Site Reliability Engineering (SRE) Practitioner
  • Site Reliability Engineering (SRE) Foundations
  • Bachelor's degree in Computer Science, Information Systems, Business or a related field (Master's preferred), or an equivalent combination of education and experience

Skills and Knowledge

  • Python - Can guide others to write clean, reusable, scalable code
      • Build pipelines for continuous improvement, writing Python scripts to automate testing, deployment, and rollback processes to ensure a smooth and reliable CI / CD pipeline
      • Advanced monitoring, logging and custom tooling
      • Write scripts to interact with cloud APIs, handling authentication, error handling, and maximising availability
  • System Programming Languages - Can guide and support others in the development, testing, and deployment of cloud-native applications, services and infrastructure
      • Troubleshoot issues with guidance from senior team members
      • Support the integration of cloud applications with edge devices using system programming languages for low-level interactions and communication
      • Kernel-level development and optimization
      • Develop and implement networking protocols
  • Design - Understanding and use of event-based, object-oriented, functional, multi-tenant and domain-driven design, and knowing which approach and abstraction best suit a particular problem. Ability to design at both the high level (the forest) and the low level (the trees), including an understanding of the design approaches currently used 'in the field' and when they are appropriate to the use cases of the platform being built.
  • Tooling - Use of well-established tools such as databases and Structured Query Language (SQL), and of newer leading-edge tools such as Kubernetes and the ecosystem of tools around a particular language or programming environment, with continuous research into and learning of emerging tools in a rapidly changing computing landscape.
  • Systems Thinking - Thinking abstractly to incorporate multiple perspectives; working within a space where the boundary or scope of the problem or system may be 'fuzzy'; understanding the diverse operational contexts of the system; identifying inter- and intra-relationships and dependencies; understanding complex system behaviour; and reliably predicting the impact of change to the system.
  • Cloud Platforms - Ability to navigate cloud platforms such as AWS and Azure, and use them effectively as the technical landscape for building Brambles-specific platforms (both multi-tenant and purely internal). The platforms built within Brambles Digital need to be 'cloud-native' and run securely, effectively and correctly at scale.

Skills Required

Functional Design, Bash, SQL, Scripting Languages, Networking Protocols, Python, Kubernetes, Logging
