Talent.com
Site Reliability Engineer on Cloud / Kubernetes Platform_Director_Infrastructure Production Management & Reliability Engineering

Morgan Stanley, Mumbai, India
11 hours ago
Job description

Profile Description

We're seeking someone to join our Enterprise Technology team as a Site Reliability Engineer on Cloud / Kubernetes Platform, working with the Cloud Platform team in Enterprise Computing (EC). The ideal candidate will have hands-on experience managing large-scale Kubernetes clusters in public cloud environments (AKS, EKS, or GKE) and a strong understanding of modern SRE and DevOps practices. You will be responsible for ensuring the high availability, reliability, scalability, and performance of our cloud-native infrastructure and CI/CD systems.

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is a Director-level position responsible for maintaining the stability and reliability of the organization's infrastructure systems, ensuring optimal performance and availability to support business operations.

Since 1935, Morgan Stanley has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

What you'll do in the role:

  • Manage, monitor, and optimize large-scale Kubernetes clusters hosted on public cloud platforms (Azure AKS, AWS EKS, or Google GKE).
  • Implement and maintain infrastructure as code using tools such as Terraform.
  • Collaborate with development and operations teams to improve system reliability and deployment automation.
  • Build and maintain CI/CD pipelines using Jenkins or similar tools.
  • Troubleshoot production issues, conduct root cause analysis, and implement preventive measures.
  • Automate operational tasks using Python or other scripting languages.
  • Contribute to observability and monitoring improvements using modern tools and best practices.
  • Participate in on-call rotations and incident response processes.

What you'll bring to the role:

  • At least 4 years of relevant experience would generally be expected to develop the skills required for this role.
  • Experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles.
  • Strong hands-on experience managing Kubernetes clusters in production (AKS / EKS / GKE).
  • Proficiency with Terraform and cloud infrastructure automation.
  • Good programming or scripting skills in Python (preferred) or similar languages.
  • Practical experience with Jenkins and CI/CD pipeline management.
  • Experience with Prometheus, Grafana, or OpenTelemetry for observability.
  • Exposure to GitOps practices and tools (e.g.).
  • Sound understanding of SRE principles (incident management, blameless postmortems, capacity planning, error budgets, etc.).
  • Strong analytical, troubleshooting, and problem-solving abilities.
  • Excellent written and verbal communication skills.

WHAT YOU CAN EXPECT FROM MORGAN STANLEY:

We are committed to maintaining the first-class service and high standard of excellence that have defined Morgan Stanley for over 89 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs, they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you'll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work.

To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.

Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential. Our skilled and creative workforce is comprised of individuals drawn from a broad cross section of the global communities in which we operate and who reflect a variety of backgrounds, talents, perspectives, and experiences. Our strong commitment to a culture of inclusion is evident through our constant focus on recruiting, developing, and advancing individuals based on their skills and talents.
