Talent.com
Site Reliability Engineer - IT Infrastructure Automation
Five9 • Bangalore
27 days ago
Job description

Key Responsibilities :

Observability & Monitoring :

  • Dashboards & Metrics : Design and implement comprehensive dashboards covering OS / platform-level and application-level monitoring, split into primary indicators (RED) and secondary indicators (USE).
  • Availability & Reliability : Establish and maintain SLIs, SLOs, and error budgets for the service.
  • Performance Monitoring : Build alerting systems and performance monitoring to proactively identify and resolve issues before they impact users.
  • Incident Response : Participate in on-call rotations, lead incident response efforts (including post-mortem analysis and remediation), maintain on-call routing, and assign application-level problems to engineering teams.

Infrastructure Automation & Deployment :

  • CI / CD Pipeline Management : Build and optimize CI / CD pipelines for speed and resilience.
  • Infrastructure as Code : Develop and maintain infrastructure using tools like Terraform, Ansible, or similar.
  • Configuration Management : Automate system configuration and ensure consistency across environments.
  • Implement and recommend best practices for configuration control.

Security & Compliance :

  • Security Automation : Ensure security scanning systems are in place and review escalated vulnerabilities.
  • Access Control : Maintain proper authentication, authorization, and audit logging systems.
  • Compliance Reporting : Ensure systems meet regulatory and industry standards.
  • Security Incident Response : Participate in security incident response and remediation efforts.

Cost Optimization :

  • Resource Management : Monitor and optimize cloud resource usage and costs.
  • Capacity Planning : Analyze usage patterns and plan for future capacity needs.
  • Cost Analysis : Provide recommendations for cost-effective architecture and resource allocation.
  • Right-sizing : Implement automated scaling and resource optimization strategies.

Common Services & Platform Engineering :

  • Shared Infrastructure : Build and maintain common services (notification systems, caching layers, message queues, or third-party stacks).
  • Database Operations : Manage database reliability, performance, and scaling (where not handled by DB teams).
  • Service Mesh & Networking : Implement and maintain service discovery, load balancing, and network policies.
  • Developer Tools : Create and maintain tools and platforms that improve developer productivity and reliability.

Required Qualifications :

Technical Skills :

  • Programming Languages : Proficiency in at least two of Python, Shell, Java, NodeJS, or similar.
  • Cloud Platforms : Experience with AWS, GCP, or Azure.
  • Containerization : Hands-on experience with Docker, Kubernetes, and container orchestration.
  • Monitoring & Observability : Experience with Prometheus, Grafana, ELK stack, or similar tools.
  • Infrastructure as Code : Proficiency with Ansible, Terraform, Helm, or similar.
  • Version Control : Expert-level Git usage and collaborative development practices.
  • CI / CD Pipelines : Hands-on experience with GitLab CI / CD, GitHub Actions, or similar.

SRE-Specific Knowledge :

  • Experience defining and maintaining SLOs and SLIs.
  • Understanding and implementation of error budget policies.
  • Proven track record in toil reduction and automation.
  • Experience with capacity planning and performance testing.

Preferred Qualifications :

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.
  • Experience with microservices and distributed systems.
  • Knowledge of security best practices and compliance frameworks.
  • Experience with chaos engineering and reliability testing.
  • Prior experience in an SRE or DevOps role at a tech company.
  • Contributions to open-source projects or technical communities.

Success Metrics :

  • Maintain or improve service availability and reliability metrics.
  • Demonstrated reduction in manual operational work through automation.
  • Effective participation in incident response and prevention.
  • High-quality, well-tested code contributions.
  • Strong collaboration with development teams to improve system reliability.

Team Culture & Values :

  • Blameless Post-Mortems : Learn from failures without blame.
  • Automation First : Prefer automated solutions over manual processes.
  • Measure Everything : Data-driven decisions and continuous improvement.
  • Knowledge Sharing : Document and share expertise.
  • Work-Life Balance : Sustainable on-call practices and reasonable load.

Growth Opportunities :

  • Work on cutting-edge infrastructure and reliability challenges.
  • Exposure to large-scale distributed systems and modern cloud technologies.
  • Clear career path toward Senior SRE, Staff Engineer, or Management roles.
  • Collaboration with engineering teams across the organization.

(ref : hirist.tech)
