Talent.com
Site Reliability Engineer III

Confidential
Hyderabad / Secunderabad, Telangana, India
30+ days ago
Job description

As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office, you will collaborate with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications. You'll participate in incident management, troubleshooting, and continuous improvement, and help implement automation and monitoring solutions. On-call rotation is part of the role, requiring effective action during production incidents and a commitment to operational excellence. You'll share knowledge, follow best practices, and contribute to a culture of learning and innovation. We value team players who communicate clearly, solve problems proactively, and focus on customer needs.

Job Responsibilities

  • Design, develop, and operate solutions for application reliability, monitoring, and automation.
  • Execute incident response, troubleshooting, and root cause analysis to resolve production issues and improve system stability.
  • Build and maintain CI/CD pipelines using Jenkins (including global libraries), and implement infrastructure as code with Terraform.
  • Develop and support containerized applications using Docker and Kubernetes, ensuring robust deployments and scalability.
  • Implement and maintain observability solutions using tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
  • Collaborate with engineering and support teams to drive continuous improvement and operational excellence.
  • Participate in on-call rotation, responding to production incidents and ensuring timely resolution.

Required Qualifications, Capabilities, And Skills

  • Formal training or certification in Site Reliability Engineering concepts and 3+ years of applied experience.
  • Experience in SRE, DevOps, or application support roles, with knowledge of SLIs/SLOs, incident response, and troubleshooting.
  • Familiarity with monitoring and observability tools (e.g., Grafana, Prometheus, Splunk, OpenTelemetry).
  • Hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
  • Exposure to cloud platforms (AWS, GCP, or Azure) and automating infrastructure and deployments.
  • Willingness to participate in on-call rotation and respond to production incidents.
  • Ability to break down issues, document solutions, and communicate effectively with team members and customers.

Preferred Qualifications, Capabilities, And Skills

  • Familiarity with banking, fintech, or regulated environments.
  • Participation in game days or chaos engineering.
  • Interest in sharing knowledge and best practices with peers.

Skills Required

Orchestration, Version Control, Prometheus, Grafana, Incident Response, Jenkins, Git, GCP, Docker, Terraform, Containerization, Troubleshooting, Splunk, Azure, Kubernetes, AWS
