Talent.com
SRE Manager - Distributed Systems

Confidential, Hyderabad / Secunderabad, Telangana
19 days ago
Job description

We are looking for an experienced Engineering Manager to lead our Site Reliability Engineering (SRE) team. The ideal candidate will have a strong background in SRE principles and practices, as well as experience managing and mentoring engineers. The SRE Manager will be responsible for the overall success of the SRE team, including ensuring that our systems are reliable, scalable, and secure. The team is responsible for monitoring the stability and availability of mission-critical production systems, managing incidents for quicker resolution, and establishing business-as-usual (BAU) operations. The team also builds tools and infrastructure used by all development teams to assist in monitoring and troubleshooting.

As a Site Reliability Engineering Manager at Arcesium, you are expected to:

  • Manage a team of SRE engineers / SRE Leads
  • Own end-to-end availability and performance of mission-critical services and build automation to prevent problem recurrence
  • Work closely with engineering managers and development teams to ensure that platforms are designed with scale and operability in mind
  • Help manage the team's infrastructure, e.g. container infrastructure using Docker and Kubernetes clusters, Kafka clusters, etc.
  • Manage the team's AWS accounts and other infrastructure provisioning.
  • Provide day-to-day support of dashboards, including responding to outages and triaging cases escalated by clients / internal teams
  • Manage on-call rotations to provide 24-hour coverage
  • Ensure systems are always DR ready
  • Manage team projects using Agile methodologies (Scrum / Kanban).
  • Review processes periodically and drive continual improvement.
  • Mentor SREs with incident case studies and technical workshops
  • Mentor and coach engineers to be curious and effective at discovering and solving technical challenges

What you'll need:

  • 10+ years of experience in DevOps / Site reliability / Automation with 4+ years of People / Team Management exposure
  • Experience with a variety of tools that help manage, understand, and debug large, complex distributed systems
  • Good knowledge of Unix systems, web technologies, databases, networking, and public cloud systems such as AWS
  • Reliability : Exposure to Chaos Engineering and other reliability practices, including disaster recovery, is good to have
  • IT Service Management : Incident Management, Problem Management, Change Management
  • Languages : Any of Python / Java / Node.js / Ruby
  • Linux : System Administration + Shell Scripting
  • Cloud Computing : Amazon Web Services
  • Microservices & Containerization : Docker, Kubernetes
  • Version Control : Git, GitHub, GitLab, etc.
  • Configuration Management : Ansible / Chef / Puppet
  • Agile : Scrum, Kanban

Skills Required

Unix, Shell Scripting, Automation
