Site Reliability Engineer

Batch Systems Inc, Hyderabad, IN
1 day ago
Job description

Batch is a brand-first technology platform designed to amplify customer engagement, enable frictionless transactions, defend product authenticity, elevate customer loyalty, and ignite customer growth. Our mission is to provide seamless solutions that help businesses build stronger connections with their customers. With a focus on enhancing the customer experience, Batch delivers innovative technology that drives value and fosters long-term success.

Role Description

We are looking for a proactive Site Reliability Engineer (SRE) with strong hands-on experience in cloud infrastructure, automation, and system reliability. You will be responsible for building, maintaining, and optimizing highly available and scalable systems. The ideal candidate has expertise in Kubernetes, AWS, Terraform/Terragrunt, and CI/CD pipelines, along with solid scripting skills to automate operational tasks.

Key Responsibilities:

  • Design, deploy, and maintain cloud infrastructure (AWS) ensuring reliability, scalability, and security.
  • Manage and operate Kubernetes clusters, ensuring high availability and performance.
  • Implement infrastructure as code using Terraform/Terragrunt to automate provisioning and updates.
  • Develop and maintain CI/CD pipelines using GitLab for application deployment.
  • Write automation scripts in Python and Bash to streamline operational tasks and monitoring.
  • Manage Docker-based deployments, optimizing container performance and security.
  • Monitor and optimize PostgreSQL databases (including RDS) for performance, backup, and recovery.
  • Troubleshoot production issues and provide root cause analysis to prevent recurrence.
  • Collaborate with development and operations teams to improve system reliability, deployment processes, and infrastructure efficiency.
  • Implement observability solutions (logging, metrics, and alerts) to ensure system health.

Required Skills & Experience:

  • 3+ years of professional experience in Site Reliability, DevOps, or Cloud Engineering roles.
  • Strong expertise in Kubernetes, including deployments, scaling, and cluster management.
  • Hands-on experience with AWS services (EC2, RDS, S3, IAM, CloudWatch, etc.).
  • Experience with Terraform/Terragrunt for infrastructure automation.
  • Proficient in GitLab CI/CD pipelines and version control workflows.
  • Strong scripting skills in Python and Bash for automation and operational tasks.
  • Experience with Docker containerization and orchestration.
  • Working knowledge of PostgreSQL and RDS management.
  • Strong problem-solving skills and attention to detail.
  • Good communication skills and ability to collaborate across teams.

Preferred Skills:

  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack, etc.).
  • Familiarity with microservices architectures and cloud-native applications.
  • Understanding of security best practices in cloud infrastructure.
  • Experience with event-driven architectures and messaging systems (Kafka, RabbitMQ, etc.).

Benefits:

  • Competitive salary and benefits package.
  • Opportunities for career growth and professional development.
  • Work with cutting-edge cloud and DevOps technologies.
  • Collaborative and inclusive work environment.
  • Flexible work arrangements (if applicable).