Talent.com
Service Engineer II

Microsoft, Hyderabad, India
7 hours ago
Job description

Overview

Are you passionate about cloud computing, obsessed with customer experience, and driven to resolve complex issues under pressure? Do you thrive in high-stakes, live environments and want to play a pivotal role in ensuring the reliability of Microsoft’s cloud platform? If so, the Azure Customer Experience (CXP) team has the opportunity for you.

Microsoft Azure is one of the most exciting and strategic products at Microsoft—powering mission-critical workloads for enterprises, governments, and startups around the world. Azure delivers on-demand, hyper-scale infrastructure and platforms via Microsoft's global data centers, enabling customers to build, host, and scale their applications with confidence.

The Customer Reliability Engineering (CRE) team within Azure CXP is a top-level pillar of Azure Engineering. It is responsible for world-class live-site management, customer reliability engagements, and modern customer-first experiences at scale, and it drives deep customer insights and empathy into the broader Azure Engineering organization. Our “no dead ends” philosophy ensures that every customer, regardless of size or scale, can realize their full potential through the Microsoft Cloud.

We are seeking decisive and experienced Service Engineers to handle live-site issues and problem management and to drive customer reliability. This role is accountable for enhancing the customer experience across Azure, including First Party Services. The ideal candidate will demonstrate strong breadth in managing complex, highly available services, paired with deep technical expertise in Azure Core Services and their interdependencies. You will work closely with Customers, First Parties, Customer Support, Livesite, and Engineering teams to deliver critical, customer-facing features. Success in this role requires the ability to influence and collaborate across many Azure servicing teams to ensure customer needs are met.

In addition, this role includes on-call responsibilities for managing and resolving complex multi-service outages. It requires the ability to remain effective under pressure, apply broad technical and analytical skills, and coordinate seamlessly with internal service teams and stakeholders. Strong communication skills—both written and verbal—are essential. You will also lead the evolution of Azure's Incident Management practice through Post-Incident Reviews, process development, and system automation. By leveraging telemetry and metrics, you will identify and drive platform-wide improvements with global impact. You’ll be the single point of command and control during high-severity incidents, orchestrating cross-functional engineering, operations, and communications to minimize impact, restore services quickly, and protect the trust of our global customer base.

This role offers a unique opportunity to make an immediate impact and improve systems at scale.

Qualifications

Required Qualifications :

  • 6+ years of experience in cloud operations, incident response, SRE, or large-scale system engineering roles, preferably on platforms like Azure, AWS, or GCP.
  • Must have Service Engineering experience in a 24 x 7 x 365 enterprise environment.
  • Exceptional command-and-control communication skills: able to drive clarity and direction with customers, internal Microsoft stakeholders, and third-party vendors during ambiguity and chaos.
  • Deep understanding of cloud architecture patterns, microservices, and containerization.
  • Demonstrated ability to make decisions quickly, under pressure, and with limited data, without compromising long-term reliability.
  • Familiarity with monitoring and observability tools (e.g., Grafana, Prometheus, Datadog, Splunk, New Relic).
  • Contribute to implementing observability frameworks to proactively detect performance bottlenecks.
  • Strong knowledge of CI / CD pipelines, container orchestration (Kubernetes, Docker), and infrastructure as code (Terraform, ARM, Bicep).
  • Familiarity with AI / ML frameworks and cloud AI services.
  • Experience implementing AI-driven monitoring, alerting, and remediation systems.
  • Fluency in one or more automation languages (PowerShell, Python, CLI, etc.).
  • Understanding of ITIL or other incident management frameworks is a must.
  • Understanding of High Availability, Disaster Recovery, Business Continuity, and Performance Tuning.
  • Demonstrated strategic thinking, quantitative and analytical skills, team leadership, and collaboration.
  • Excellent problem resolution, judgment, negotiating, and decision-making skills.
  • Desired: strong knowledge of Windows Platform or Linux and developer tools, and the ability to diagnose and debug user code.
  • Ability to effectively manage and prioritize multiple tasks in accordance with high-level objectives / projects.
  • Excellent communication skills (written and verbal) in English, especially in high-pressure scenarios.
  • Ability to communicate with a variety of audiences, including high-profile customers, executive management, and engineering teams.
  • Experience with Azure, AWS, or GCP core services and their interdependencies.
  • Bachelor’s or master’s degree in Computer Science, Information Technology, or equivalent experience.

Preferred Qualifications :

  • 6+ years of demonstrated experience as an Incident Commander or Crisis Manager for critical, high-severity incidents in high-availability, distributed environments.
  • Experience with SRE (Site Reliability Engineering) principles and practices.
  • Exposure to chaos engineering, fault injection, or high availability architecture.
  • AI / ML experience (beginner to intermediate):
      • Familiarity with how AI / ML models are integrated into cloud infrastructure and their potential failure modes.
      • Experience using AI-powered tools for incident analysis, log correlation, or predictive alerting.
      • An understanding of the challenges and risks associated with AI / ML systems in a production environment.
  • Certifications:
      • Relevant cloud certifications (e.g., AWS Certified DevOps Engineer, Azure Solutions Architect, GCP Professional Cloud Architect).
      • Certifications in ITIL, SRE, or other relevant frameworks.

Every day, our customers stake their business and reputation on our cloud. You can help #AzCXP provide our customers with the world-class cloud services they need to succeed. #azcre

Responsibilities

To be successful in this role, you must have a great track record of customer compassion, an engineering mindset, an innate aptitude for agility, and technical excellence in software engineering. Collaborate closely with Engineering / PM to ensure the availability and performance of Live Site and the satisfaction of our customers.

  • Manage high-severity incidents (SEV0 / SEV1 / SEV2) across Azure services, serving as the single point of accountability to ensure rapid detection, triage, resolution, and customer communication.
  • Act as the central authority during live site incidents, driving real-time decision-making and coordination across Engineering, Support, PM, Communications, and Field teams.
  • Provide calm, decisive leadership in crisis situations.
  • Promote a customer-first culture by prioritizing availability, reliability, and platform trust in every response.
  • Participate in the on-call rotation.
  • Analyze customer-impacting signals from telemetry, support cases, and feedback to identify root causes, drive incident reviews (RCAs / PIRs), and implement preventative service improvements.
  • Drive continuous improvement of the Azure platform by incorporating learnings from live site events and customer feedback, ensuring improved reliability, observability, and supportability.
  • Collaborate closely with Engineering and Product teams to influence and implement service resiliency enhancements, auto-remediation tools, and customer-centric mitigation strategies.
  • Identify and advocate for customer self-service capabilities, improved documentation, and scalable solutions that empower customers to resolve common issues independently.
  • Contribute to the development and adoption of incident response playbooks, mitigation levers, and operational frameworks aligned to real-world support scenarios and strategic customer needs.
  • Contribute to the design of next-generation architecture for cloud infrastructure services with a focus on reliability and strategic customer support outcomes.
  • Build and maintain cross-functional partnerships, ensuring alignment across engineering, business, and support organizations.
  • Be data-driven and results-focused, using metrics to evaluate incident response effectiveness and platform health.
  • Bring an engineering mindset to operational challenges, balancing agility, scalability, and technical excellence.
  • Exhibit strong cross-team collaboration, an engineering mindset, and results-oriented execution under pressure.

Benefits / perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
