Talent.com
Senior Site Reliability Engineer
Peoplefy • Pushkar, IN
8 hours ago
Job description

Greetings from Peoplefy!

We’re looking for an SRE who can own reliability for mission-critical services on Azure, shape standards, lead incidents with calm clarity, and drive engineering excellence across teams.

Experience: 10+ years

Location: Trivandrum

Responsibilities:

  • Strong site reliability experience
  • Prior experience as a DevOps engineer, currently working as an SRE
  • Strong experience in Azure
  • Strong experience with AKS
  • Experience working with Docker
  • Experience with observability (any tool)
  • Experience working with PostgreSQL

SLIs / SLOs & Error Budgets

  • Define SLIs / SLOs for Tier-0 / Tier-1 services & review quarterly
  • Implement multi-window, multi-burn-rate alerts
  • Change gating via CI / CD based on error budgets
  • Maintain Azure Monitor / Grafana / Prometheus / App Insights dashboards
  • Conduct weekly SLO reviews & drive reliability roadmap

Incident Management

  • Lead SEV1 / SEV2 incidents, own communication & postmortems
  • Ensure corrective actions are implemented

Reliability Engineering

  • Implement DR, multi-AZ / region patterns, HPA / VPA / KEDA, resilient rollouts
  • Cluster hardening (network, identity, policy), optimize density
  • Ingress: AGIC / Nginx

Observability

  • Metrics, traces, logs via Azure Monitor, App Insights, Log Analytics, Prometheus, Grafana, OpenTelemetry
  • Alerts on symptoms, not noise

Automation & IaC

  • Terraform / Bicep, GitOps (Flux / Argo), Azure Policy / OPA Gatekeeper
  • Automate toil & build self-service runbooks / chatops

CI / CD Reliability

  • Azure DevOps / GitHub Actions with canary, blue-green, rollback
  • Key Vault-backed secrets

Performance & Capacity

  • Load testing, autoscaling, FinOps collaboration

Disaster Recovery

  • Define RTO / RPO, run chaos drills & game days

Security

  • Entra ID, Key Vault rotation, VNets / NSGs, shift-left security in CI

Documentation

  • Runbooks, SLOs, postmortems, and architectures, all kept current & accessible

Interested candidates, please share your updated resume at amruta.bu@peoplefy.com.
