Site Reliability Engineer

Elgebra, Chennai
30+ days ago
Job description

Role Overview:

We are seeking a highly experienced and technically proficient Site Reliability Engineer (SRE) to join our team in support of our client, Qincline. The ideal candidate will have 7 or more years of dedicated experience in Site Reliability Engineering or a closely related discipline. This pivotal role requires a strong focus on ensuring the reliability, scalability, performance, and operational efficiency of large-scale, complex production systems. You'll be instrumental in bridging the gap between development and operations by applying engineering principles to operational challenges.

Key Responsibilities:

Reliability & Performance Engineering:

  • System Reliability: Design, build, and maintain robust, fault-tolerant production systems and infrastructure to meet stringent Service Level Objectives (SLOs).
  • Performance Tuning: Proactively identify and resolve performance bottlenecks across the entire application stack, from infrastructure to application code.
  • Automation: Develop and implement automation for operational tasks, infrastructure provisioning, deployment, and monitoring to eliminate manual toil.
  • Capacity Planning: Collaborate with development teams on capacity planning and demand forecasting so that the infrastructure scales efficiently to meet future business needs.

Operations & Incident Management:

  • Monitoring & Alerting: Establish and maintain comprehensive monitoring, logging, and alerting systems to gain deep visibility into system health and performance (e.g., using Prometheus, Grafana, or the ELK Stack).
  • Incident Response: Serve as a key responder during critical incidents, performing rapid triage, mitigation, and recovery.
  • Post-Mortems & RCA: Lead detailed Post-Mortem and Root Cause Analysis (RCA) processes for all significant incidents, ensuring that permanent fixes and preventive measures are implemented.
  • On-Call: Participate in a periodic on-call rotation to provide 24/7 support for critical production systems.

Tooling & Infrastructure:

  • CI/CD & DevOps: Enhance and manage CI/CD pipelines to facilitate fast, reliable, and automated software releases.
  • Containerization & Orchestration: Manage and optimize containerized environments using Docker and Kubernetes.
  • Infrastructure as Code (IaC): Utilize IaC tools (e.g., Terraform, Ansible) to provision and manage infrastructure in a repeatable and documented manner.

Required Skills & Experience:

Core Experience (7+ Years):

  • Minimum 7 years of hands-on experience in a Site Reliability Engineer, DevOps Engineer, or Production Engineer role supporting high-availability, mission-critical production environments.
  • Deep expertise in establishing and improving system monitoring, logging, alerting, and telemetry practices.
  • Demonstrated experience with formal Incident Management processes and leading thorough Root Cause Analysis (RCA).

Technical Expertise:

  • Cloud Platforms: Extensive, hands-on experience with at least one major cloud provider (e.g., AWS, Azure, or GCP). This includes managing compute, networking, storage, and managed services.
  • Scripting & Programming: Strong proficiency in scripting and programming languages, with mandatory expertise in Python and Shell scripting for automation and tooling.
  • DevOps Tooling: Proven experience with CI/CD pipeline tools (e.g., Jenkins, GitLab CI, Azure DevOps), Git, and artifact repositories.
  • Containerization: Expert-level knowledge of Docker and robust experience with orchestrating large-scale deployments using Kubernetes.
  • Operating Systems: Strong command of Linux/Unix operating systems and networking fundamentals (TCP/IP, DNS, Load Balancing).

Desired Qualifications (Good to Have):

  • Experience with configuration management tools (e.g., Ansible, Chef, Puppet).
  • Familiarity with service mesh technologies (e.g., Istio, Linkerd).
  • Knowledge of database administration and performance tuning (SQL/NoSQL).
  • Certifications related to SRE, Cloud (e.g., AWS Certified DevOps Engineer), or Kubernetes (CKA, CKAD).

(ref: hirist.tech)
