Talent.com
Lead Site Reliability Specialist

Peoplefy
Thiruvananthapuram, India
1 day ago
Job description

Greetings from Peoplefy!

We’re looking for an SRE who can own reliability for mission-critical services on Azure, shape standards, lead incidents with calm clarity, and drive engineering excellence across teams.

Experience: 10+ years

Location: Trivandrum

Responsibilities:

  • Strong site reliability experience
  • Previous experience as a DevOps engineer, currently working as an SRE
  • Strong experience with Azure
  • Strong experience with AKS
  • Experience working with Docker
  • Experience with observability (any tool)
  • Experience working with PostgreSQL

SLIs/SLOs & Error Budgets

  • Define SLIs/SLOs for Tier-0/Tier-1 services & review quarterly
  • Implement multi-window, multi-burn-rate alerts
  • Gate changes in CI/CD based on error budgets
  • Maintain Azure Monitor / Grafana / Prometheus / App Insights dashboards
  • Conduct weekly SLO reviews & drive the reliability roadmap

Incident Management

  • Lead SEV1/SEV2 incidents, own communication & postmortems
  • Ensure corrective actions are implemented

Reliability Engineering

  • Implement DR, multi-AZ/multi-region patterns, HPA/VPA/KEDA, resilient rollouts
  • Harden clusters (network, identity, policy) & optimize density
  • Ingress: AGIC/Nginx

Observability

  • Metrics, traces, logs via Azure Monitor, App Insights, Log Analytics, Prometheus, Grafana, OpenTelemetry
  • Alerts on symptoms, not noise

Automation & IaC

  • Terraform/Bicep, GitOps (Flux/Argo), Azure Policy / OPA Gatekeeper
  • Automate toil & build self-service runbooks/ChatOps

CI/CD Reliability

  • Azure DevOps / GitHub Actions with canary, blue-green, rollback
  • Key Vault-backed secrets

Performance & Capacity

  • Load testing, autoscaling, FinOps collaboration

Disaster Recovery

  • Define RTO/RPO, run chaos drills & game days

Security

  • Entra ID, Key Vault rotation, VNets/NSGs, shift-left security in CI

Documentation

  • Runbooks, SLOs, postmortems, architectures, all kept current & accessible

Interested candidates, please share your updated resume at amruta.bu@peoplefy.com.
