Talent.com
▷ [Immediate Start] Sr Engineer, Site Reliability [T500-20286]

TMUS Global Solutions, India
13 days ago
Job description

About T-Mobile :

T-Mobile US, Inc. (NASDAQ : TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience.

TMUS Global Solutions :

TMUS Global Solutions is a world-class technology powerhouse accelerating the company’s global digital transformation. With a culture built on growth, inclusivity, and global collaboration, the teams here drive innovation at scale, powered by bold thinking.

TMUS India Private Limited operates as TMUS Global Solutions.

About the Role :

As a Senior Site Reliability Engineer, you will be a key member of the CFL Platform Engineering and Operations team. You will play a pivotal role in building and scaling intelligent infrastructure to support AI / ML applications, enterprise services, and LLM-based platforms. You will contribute to the design and implementation of observability frameworks, automation-first operations, and incident response strategies to ensure reliability, performance, and scalability across production systems.

What You’ll Do :

  • Implement and maintain observability, monitoring, and alerting systems for AI platforms and backend services
  • Design and support telemetry pipelines, logging infrastructure, and dashboards (Splunk, Prometheus, Grafana, OpenTelemetry)
  • Define and monitor SLOs, SLIs, latency, availability, and throughput metrics
  • Participate in on-call rotations, incident resolution, root cause analysis, and postmortems
  • Improve CI / CD workflows and infrastructure automation using GitLab pipelines
  • Optimize and scale infrastructure including Kafka, RabbitMQ, HAProxy, and distributed APIs
  • Collaborate with engineering teams on governance, compliance, and secure operations
  • Support capacity planning, cost analysis, and tuning for high-scale performance
  • Automate repetitive tasks and reduce toil via scripting (Python, Bash, Java)
  • Contribute to runbooks, knowledge base articles, and SRE best practice documentation
  • Mentor junior engineers and support a culture of operational excellence and reliability

What You’ll Bring :

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field
  • 4-7 years of experience in SRE, DevOps, platform, or operations engineering roles
  • Strong hands-on experience in observability, monitoring, and distributed systems troubleshooting
  • Proficiency in scripting languages such as Python, Bash, or PowerShell
  • CI / CD experience with GitLab and automation across deployment pipelines
  • Solid understanding of SQL and NoSQL systems including Oracle DB and MongoDB
  • Familiarity with Kubernetes, container orchestration, and hybrid cloud (Azure, AWS, GCP, OCI)
  • Experience working in high-stakes, incident-driven environments
  • Strong working knowledge of Splunk, Grafana, Prometheus, and other observability tools
  • Understanding of AI / ML systems, inference APIs, and LLM infrastructure is a plus
  • Experience in platform compliance, security enforcement, and regulated domains (finance preferred)

Must Have Skills :

  • Application & Microservice : Java, Spring Boot, API & Service Design
  • Any CI / CD Tools : GitLab Pipelines / Test Automation / GitHub Actions / Jenkins / CircleCI
  • App Platform : Docker & Containers (Kubernetes)
  • Any Databases : SQL & NoSQL (Cassandra / Oracle / Snowflake / MongoDB)
  • Any Messaging : Kafka, RabbitMQ
  • Any Observability / Monitoring : Splunk / Grafana / OpenTelemetry / ELK Stack / Datadog / New Relic / Prometheus
  • Incident / Change / Problem Management

Nice To Have :

  • Multi-region failover (SQL Server, MongoDB, vendors)
  • Observability platform design (sampling, retention policies)
  • Own domain SLOs and error budgets
  • Performance engineering for latency-sensitive apps
  • Toil automation (SRE bots, operators)