Senior Site Reliability Engineer

Loyalytics AI, Chennai
12 days ago
Job description

We're looking for a hands-on Site Reliability / DevOps Engineer to be our first hire in this function, responsible for owning and scaling the reliability, observability, and infrastructure of our platform running entirely on Microsoft Azure.

You'll be critical in shaping DevOps culture, architecting fault-tolerant systems, and deploying automation to improve uptime, performance, and cost efficiency.

This is a hybrid role combining SRE and DevOps principles - ideal for builders comfortable working in fast-paced, product-driven environments.

What You'll Own :

Cloud Infrastructure (Microsoft Azure Must Have) :

  • Architect, deploy, and maintain services across Azure App Services, Azure Container Apps, Cosmos DB, Event Hubs, Azure Monitor, Azure VMs, and Azure Kubernetes Service (AKS).
  • Design and manage networking (VNets, Subnets, NSGs) and identity / access controls (PIM, Managed Identities, Enterprise Applications, Role-based Access Control).
  • Own infrastructure provisioning using Terraform / Bicep.
  • Implement cost-effective, scalable, and secure cloud environments across development, staging, and production.

Monitoring, Observability & Incident Response :

  • Set up end-to-end observability using Prometheus, Grafana, Azure Monitor, ELK Stack, and Sentry.
  • Define and enforce standards for logging, metrics, traces, SLIs / SLOs, and error budgets.
  • Build proactive alerting systems for APIs, RabbitMQ, Databricks pipelines, and external integrations.
  • Establish on-call rotations and incident response runbooks, and lead RCAs to minimize MTTR.

CI / CD, Automation & Tooling :

  • Automate deployments and infrastructure lifecycle using GitHub Actions, Terraform modules, and CLI tools.
  • Improve CI / CD for faster, safer releases across containerized and VM-based workloads.
  • Build internal tools for diagnostics, rollback safety, and release automation.
  • Integrate resilience patterns : retries, circuit breakers, backoff strategies, failovers.

DevOps & System Reliability :

  • Optimize system performance, memory usage, and availability for core services such as RabbitMQ, APIs, and analytics pipelines on Databricks.
  • Implement zero-downtime deployments, self-healing systems, and infrastructure audits.
  • Perform regular cost analysis, right-sizing, and tag-based budget enforcement.

Security & Compliance Collaboration :

  • Work with security teams to maintain infrastructure and data flow diagrams, and support ISO 27001, GDPR, and PDPA readiness.
  • Participate in threat modeling, define trust boundaries, and implement audit-ready infrastructure practices.

Tech Stack You'll Work With :

  • Cloud : Microsoft Azure (App Services, Container Apps, AKS, Cosmos DB, Event Hubs, Monitor, VMs).
  • IaC : Terraform, Bicep.
  • CI / CD : Azure DevOps, GitHub Actions.
  • Monitoring & Logs : Prometheus, Grafana, Azure Monitor, ELK, Sentry.
  • Queueing : RabbitMQ, Kafka.
  • Languages : Node.js, Python (mostly for debugging

(ref : hirist.tech)
