About T-Mobile :
T-Mobile US, Inc. (NASDAQ : TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience.
About TMUS Global Solutions :
TMUS Global Solutions is a world-class technology powerhouse accelerating the company’s global digital transformation. With a culture built on growth, inclusivity, and global collaboration, the teams here drive innovation at scale, powered by bold thinking.
TMUS India Private Limited operates as TMUS Global Solutions.
About the Role :
The Senior Site Reliability Engineer (Performance) is a hands-on engineering role focused on ensuring that customer-facing and internal applications deliver optimal performance, scalability, and resiliency. This role goes beyond traditional testing: it includes profiling Java / Spring Boot applications, tuning Kubernetes workloads, analyzing network issues, and applying cloud-native best practices to drive systemic improvements. The engineer designs and automates performance tests, formulates hypotheses, and uses data-driven insights to recommend or implement fixes at code, infrastructure, and architecture levels. With an automation-first mindset, they embed performance engineering into CI / CD pipelines and partner with developers, architects, and SREs to create a performance-first culture.
What You’ll Do :
Profile and optimize Java / Spring Boot microservices: diagnose memory leaks, GC issues, and thread contention, and improve latency and throughput
Design and execute performance tests (load, stress, spike, endurance) using tools such as JMeter, k6, Gatling, Locust, or LoadRunner
Develop reusable test frameworks and integrate performance tests into CI / CD pipelines
Build load / capacity models and design experiments to test scalability under real-world conditions
Analyze and optimize Kubernetes workloads (EKS / AKS / GKE), tuning autoscaling, pod configs, and Helm charts
Use observability tools (AppDynamics, Prometheus, Grafana, Splunk) and tracing to identify performance bottlenecks
Conduct performance diagnostics across network layers (TCP / IP, gRPC,
Mentor engineers and participate in design reviews to embed performance practices from the start
What You’ll Bring :
4–7 years of experience in performance engineering or backend system optimization
Strong Java / Spring Boot development background
Hands-on experience with Kubernetes-based environments and container optimization
Familiarity with AWS (preferred); Azure or GCP also acceptable
Experience with performance / load testing tools : JMeter, k6, Gatling, Locust, or LoadRunner
Ability to conduct network analysis and work with observability tools
Experience in embedding performance tests into CI / CD pipelines
Strong collaboration and mentoring skills
Must Have Skills :
Java / Spring Boot
Kubernetes (EKS / AKS / GKE)
Performance testing tools : JMeter, k6, Gatling, LoadRunner
Observability : Prometheus, Grafana, AppDynamics, Splunk
CI / CD integration (e.g., GitLab, Jenkins)
JVM profiling tools : JFR, JProfiler, VisualVM
Nice To Have :
Cloud : AWS (preferred), Azure, GCP
Distributed tracing (OpenTelemetry, Jaeger)
Helm, Terraform
Network-level analysis (gRPC, TLS optimization)
Experience mentoring SRE or developer teams on performance best practices
Performance-related certifications (Java, Kubernetes, AWS
Site Reliability Engineer • India