Site Reliability Engineer

CodeKarma, Hyderabad, Telangana, India
2 days ago
Job description

Site Reliability Engineer (Multi-Cloud Deployments)

Location: Bangalore / Remote

Experience: 4–10 years

Type: Full-time (6-month probation)

About CodeKarma

CodeKarma is redefining how engineering teams understand and evolve complex systems — bringing production context directly into the developer’s workflow.

Our platform runs both as SaaS and as sub-account / on-prem deployments within our customers’ cloud environments.

We’re looking for engineers who can take ownership of these deployments end-to-end — from setup to monitoring, upgrades, and ongoing reliability.

What You’ll Do

You’ll be responsible for managing CodeKarma’s distributed deployments across client environments — ensuring reliability, security, and performance at scale.

  • Deploy and manage CodeKarma clusters across AWS, GCP, and Azure customer sub-accounts.
  • Monitor, upgrade, and maintain Kubernetes clusters and related infrastructure.
  • Implement observability, alerting, and disaster recovery for each deployment.
  • Handle CI/CD automation for platform releases, patches, and version upgrades.
  • Work closely with client engineering teams to adapt deployments to their environments, policies, and security constraints.
  • Diagnose and resolve environment-specific issues across networking, storage, and configuration layers.
  • Build and maintain infrastructure playbooks, Helm charts, and Terraform modules for standardized deployment.

What We’re Looking For

  • Strong experience managing Kubernetes clusters (EKS, GKE, AKS, or on-prem equivalents).
  • Deep understanding of Kubernetes internals, Helm, ingress controllers, networking, and storage classes.
  • Hands-on experience with CI/CD tools (GitHub Actions, ArgoCD, or similar).
  • Familiarity with monitoring and alerting stacks (Prometheus, Grafana, Loki, ELK, etc.).
  • Working knowledge of cloud infrastructure across AWS / GCP / Azure.
  • Ability to work directly with client engineering and DevOps teams, understanding their constraints and helping them integrate CodeKarma.
  • Strong debugging and communication skills — you’ll often be the bridge between CodeKarma and client infrastructure.

Why Join Us

  • Manage real, large-scale production environments across multiple enterprises.
  • Work directly with founders and senior engineers to shape how CodeKarma scales across clients.
  • High ownership, fast-moving environment, and exposure to deep-tech systems.

How to Apply

Please share:

  • A short summary of your Kubernetes experience (cluster management, scaling, debugging, etc.).
  • Any automation or deployment tooling you’ve built or maintained.
  • Links to your GitHub / GitLab / blog posts (if available).