Talent.com
Senior Site Reliability Engineer - AWS / Google Cloud Platform

N-iX LTD • Delhi, IN
10 days ago
Job type
  • Remote
Job description

About the Company:

Our Client is defining the future of cybersecurity through our XDR platform that automatically prevents, detects, and responds to threats in real time. Singularity XDR ingests data and leverages our patented AI models to deliver autonomous protection. With the Client, organizations gain full transparency into everything happening across the network at machine speed to defeat every attack at every stage of the threat lifecycle.

We are a values-driven team where names are known, results are rewarded, and friendships are formed. Trust, accountability, relentlessness, ingenuity, and our client-centric approach define the pillars of our collaborative and unified global culture. We're looking for people who will drive team success and collaboration across SentinelOne. If you're enthusiastic about innovative approaches to problem-solving, we would love to speak with you about joining our team!

What Are We Looking For?

We are seeking a Site Reliability Engineer (SRE) with extensive operational experience managing large-scale SaaS infrastructures. You will be responsible for designing and maintaining data infrastructure that emphasizes automation, self-service, and scalability. This role is vital to ensuring that we meet and exceed our Service Level Objectives (SLOs) and uptime commitments to customers.

You will partner closely with engineering teams to help them deliver software faster, more safely, and with higher quality, while driving initiatives that enhance the reliability, stability, and cost efficiency of our production environments. You'll join a world-class team of like-minded SREs who manage complex, high-traffic systems that operate at global scale.

What Will You Do?

As a Site Reliability Engineer, you will play a critical role in ensuring the availability, scalability, and performance of SentinelOne's large-scale distributed systems. Working at the intersection of software development and operations, you'll focus on making our infrastructure more reliable, automated, and efficient, while empowering development teams to deliver at speed and with confidence.

In this role, you will:

Drive Continuous Deployment & Delivery Excellence:

  • Design, implement, and optimize CI/CD pipelines for efficient, secure, and reliable software releases.
  • Automate build, test, and deployment processes to enhance release velocity and reduce manual intervention.

Manage and Command Production Incidents:

  • Lead the response to production incidents, ensuring timely mitigation and root cause identification.
  • Conduct post-incident reviews, define corrective actions, and drive continuous improvements to prevent recurrence.

Partner with Product Engineering Teams:

  • Collaborate with product, platform, and infrastructure teams to embed reliability and scalability into design and architecture.
  • Provide technical guidance to improve system performance, fault tolerance, and observability.

Automate Operations and Streamline Processes:

  • Build automation tools and frameworks that eliminate repetitive tasks, standardize operational procedures, and support a self-service infrastructure model for development teams.

Monitor, Measure, and Optimize Reliability:

  • Establish metrics for system performance and reliability (availability, latency, throughput).
  • Proactively identify and resolve potential issues using data-driven insights and continuous monitoring.

Eliminate Infrastructure Bottlenecks:

  • Analyze production systems to identify performance and scalability limitations.
  • Implement architectural improvements to enhance throughput, reliability, and cost efficiency across AWS and GCP environments.

Enhance Observability & Incident Readiness:

  • Develop and maintain observability stacks with advanced monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Datadog).
  • Conduct chaos engineering experiments to validate system resilience and ensure operational preparedness.

Ensure Security, Compliance & Resilience:

  • Work with security and compliance teams to enforce secure configurations, data integrity, and regulatory adherence.
  • Participate in disaster recovery planning and capacity forecasting for high availability.

Mentor and Collaborate Across Teams:

  • Share best practices through documentation, technical discussions, and internal workshops.
  • Foster a reliability-driven culture and promote continuous improvement across engineering functions.

What Skills and Experience Will You Need?

  • 5+ years of experience managing large-scale SaaS operations or distributed systems
  • Strong expertise in orchestration systems like Kubernetes, Nomad, or Mesos
  • Proficiency in Python (preferred), Golang, or Java for automation and tooling
  • Hands-on experience running and deploying Java and JavaScript applications
  • Proven experience in AWS and GCP environments
  • Practical knowledge of Infrastructure as Code (Terraform, CloudFormation, etc.)
  • Experience with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD, and deployment strategies like blue-green, rolling, or canary deploys
  • Familiarity with SRE principles: SLOs, SLIs, and error budgets
  • Strong problem-solving, communication, and collaboration skills within distributed teams
  • Self-starter attitude with a passion for automation, reliability, and continuous learning
  • Prior product development or software engineering experience is a strong plus

What We Offer:

  • Flexible working format: remote, office-based, or hybrid
  • Competitive salary and comprehensive compensation package
  • Personalized career growth opportunities and mentorship programs
  • Professional development tools: tech talks, training sessions, and centers of excellence
  • Active technical communities with regular knowledge-sharing
  • Education reimbursement for continued learning and certifications
  • Memorable milestone celebrations and company-sponsored events
  • Corporate gatherings and team-building initiatives

(ref: hirist.tech)
