Talent.com
Site Reliability Engineer

Landmark Group • Delhi, India
6 days ago
Job description

What You’ll Do:

  • Ensure reliability and high availability of Java and microservices-based applications through proactive monitoring and automation.

  • Define and track SLIs / SLOs to maintain service performance and stability.

  • Troubleshoot and resolve production issues, performing detailed root cause analysis to prevent recurrence.

  • Build and enhance observability using Prometheus, Grafana, Loki, or New Relic.

  • Automate operational tasks: deployments, scaling, rollbacks, diagnostics, and alerting.

  • Collaborate with engineering and DevOps teams to integrate reliability practices into the CI/CD pipeline.
  • Drive AIOps initiatives for intelligent alert correlation and predictive incident management.

  • Mentor teams on best practices in monitoring, performance optimization, and operational efficiency.

What We’re Looking For:

  • 3–6 years of experience in Site Reliability Engineering, Application Operations, or DevOps.

  • Strong hands-on experience with Java, Spring Boot, and microservices architecture.

  • Proficiency in monitoring tools (Prometheus, Grafana, Loki, New Relic, or similar).

  • Experience with Kubernetes, containers, and cloud platforms (AWS, Azure, or GCP).

  • Strong scripting skills in Bash, Python, or Go for automation and diagnostics.

  • Familiarity with incident management, RCA, and performance debugging.

  • Exposure to AIOps tools or AI/LLM-based observability platforms is a plus.

  • Excellent problem-solving and communication skills.