Talent.com
Databricks AI Platform SRE_Director_Infrastructure Production Management & Reliability Engineering
Morgan Stanley • Bangalore, India
2 days ago
Job description

Profile Description

We're seeking someone to join our Enterprise Technology team as a Databricks AI Platform SRE on the Platform SRE team in Enterprise Computing (EC). This role will be critical in designing, building, and optimizing a scalable, secure, and developer-friendly Databricks platform to enable Machine Learning (ML) and Artificial Intelligence (AI) workloads at enterprise scale. You will partner with ML engineers, data scientists, platform teams, and cloud architects to automate infrastructure, enforce best practices, and streamline the end-to-end ML lifecycle using modern cloud-native technologies.

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is a Director-level position responsible for maintaining the stability and reliability of the organization's infrastructure systems, ensuring optimal performance and availability to support business operations.

Since 1935, Morgan Stanley has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

What you'll do in the role:

  • Design and implement secure, scalable, and automated Databricks environments to support AI / ML workloads.
  • Develop infrastructure-as-code (IaC) solutions using Terraform for provisioning Databricks, cloud resources, and network configurations.
  • Build automation and self-service capabilities using Python, Java and APIs for platform onboarding, workspace provisioning, orchestration and monitoring.
  • Collaborate with data science and ML teams to define compute requirements, governance policies, and efficient workflows across dev / qa / prod environments.
  • Integrate Databricks offering with cloud-native services on Azure / AWS.
  • Champion CI / CD and GitOps for managing ML infrastructure and configurations.
  • Ensure compliance with enterprise security and data governance policies using RBAC, Audit Controls, Encryption, Network Isolation, and policies.
  • Monitor platform performance, reliability, and usage, and drive improvements to optimize cost and resource utilization.

What you'll bring to the role:

  • At least 4 years of relevant experience would generally be expected to have developed the skills required for this role.
  • Proven experience with Terraform for building and managing infrastructure.
  • Strong programming skills in Python and Java.
  • Hands-on experience with cloud networking, identity and access management, key vaults, monitoring, and logging in Azure.
  • Hands-on experience with Databricks (Workspace management, Clusters, Jobs, MLflow, Delta Lake, Unity Catalog, Mosaic AI).
  • Deep understanding of Azure or AWS infrastructure (e.g. IAM, VNets / VPC, Storage, Networks, Compute, Key management, monitoring).
  • Strong experience in distributed system design, development and deployment using Agile / DevOps practices.
  • Experience with CI / CD pipelines (GitHub Actions or similar).
  • Experience implementing monitoring and observability using Prometheus, Grafana or Databricks-native solutions.
  • Good communication skills, excellent teamwork experience, ability to mentor and develop more junior developers, including participating in constructive code reviews.
  • Experience in multi-cloud environments (AWS / GCP) is a bonus.
  • Experience working in highly regulated environments (finance, healthcare, etc.) is desirable.
  • Experience with Databricks REST APIs and SDKs.
  • Knowledge of MLflow, Mosaic AI, and MLOps tooling.
  • Working with teams using Scrum, Kanban or other agile practices
  • Proficiency with standard Linux command line and debugging tools
  • Azure or AWS Certifications

What you can expect from Morgan Stanley:

    We are committed to maintaining the first-class service and high standard of excellence that have defined Morgan Stanley for over 89 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs, they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you'll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work.

    To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.

    Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential. Our skilled and creative workforce is comprised of individuals drawn from a broad cross section of the global communities in which we operate and who reflect a variety of backgrounds, talents, perspectives, and experiences. Our strong commitment to a culture of inclusion is evident through our constant focus on recruiting, developing, and advancing individuals based on their skills and talents.
