Talent.com
Lead Site Reliability Manager - Cloud Computing

Neemtree, Mumbai
2 days ago
Job description

Responsibilities :

  • Manage and mentor a team of SREs, assigning tasks, providing technical guidance, and fostering a culture of collaboration and continuous learning.
  • Lead the implementation of reliable, scalable, and fault-tolerant systems, including infrastructure, monitoring, and alerting.
  • Manage incident response processes, including root cause analysis, post-mortem reviews, and proactive mitigation strategies to minimise system downtime and impact.
  • Develop and maintain comprehensive monitoring systems to identify potential issues early, set appropriate alerting thresholds, and optimise system performance.
  • Drive automation initiatives to streamline operational tasks, including deployments, scaling, and configuration management, utilising relevant tools and technologies.
  • Proactively assess system capacity needs, plan for future growth, and implement scaling strategies to ensure optimal performance under load.
  • Analyse system metrics to identify bottlenecks, implement performance improvements, and optimise resource utilisation.
  • Work closely with development teams, product managers, and other stakeholders to ensure alignment on reliability goals and smooth integration of new features.
  • Develop and implement the SRE roadmap, including technology adoption, standards, and best practices to maintain a high level of system reliability.

Requirements :

  • Strong proficiency in system administration, cloud computing (AWS, Azure), networking, distributed systems, and containerisation technologies (Docker, Kubernetes).
  • Expertise in scripting languages (Python, Bash) and the ability to develop automation tools.
  • A basic understanding of Java is good to have.
  • Deep understanding of monitoring systems (Prometheus, Grafana), alerting configurations, and log analysis.
  • Proven experience in managing critical incidents, performing root cause analysis, and coordinating response efforts.
  • Excellent communication skills to convey technical concepts to both technical and non-technical audiences, and the ability to lead and motivate a team.
  • Strong analytical and troubleshooting skills to identify and resolve complex technical issues.
(ref : hirist.tech)
