Talent.com
Site Reliability Engineer - Elastic Kubernetes Service

D2KSS, Hyderabad
6 days ago
Job description

Key Responsibilities :

  • Manage and maintain Kubernetes clusters (EKS) and ensure high system reliability and scalability.
  • Implement and manage AWS services including IAM, EC2, EKS, CloudWatch, and S3.
  • Build automation tools to enable self-healing and self-monitoring systems.
  • Develop and maintain monitoring solutions that track system performance and raise alerts for low-latency applications.
  • Troubleshoot application-specific, network, system, and performance issues in real time.
  • Perform Linux debugging, performance tuning, and optimization for production systems.
  • Apply SRE principles: monitoring, alerting, error budgets, fault analysis, capacity planning, and toil reduction.
  • Collaborate with cross-functional teams to improve reliability, performance, and deployment processes.

Must-Have Qualifications :

  • Bachelor's degree in Computer Science or a related field.
  • Minimum 5 years of experience in DevOps / Site Reliability Engineering roles.
  • Strong hands-on experience with Kubernetes and container orchestration.
  • In-depth knowledge of AWS services (IAM, EC2, EKS, CloudWatch, S3).
  • Proficiency in at least one programming / scripting language (Python or Shell).
  • Excellent understanding of Linux systems, debugging tools, and performance tuning.
  • Strong problem-solving, troubleshooting, and analytical skills.
  • Ability to work collaboratively in a fast-paced, evolving technology environment.

Preferred Skills :

  • Experience with CI / CD pipelines and automation frameworks.
  • Familiarity with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
  • Understanding of networking concepts, system architecture, and distributed systems.

Key Traits :

  • Strong ownership and accountability.
  • Excellent communication and collaboration skills.
  • Willingness to continuously learn and adapt to new technologies.

(ref : hirist.tech)
