Talent.com
Site Reliability Engineer
VXI Global Solutions • Bhopal, Madhya Pradesh, India
17 hours ago
Job description

We are looking for a Site Reliability Engineer with 3+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus and Grafana, along with exposure to SolarWinds. You should be comfortable working with metrics, logs, and traces, and be able to correlate telemetry data to proactively detect, diagnose, and resolve performance issues.

Key Responsibilities:

  • Design and maintain observability pipelines using OpenTelemetry, Prometheus, and Grafana.
  • Build dashboards and alerts to monitor system health, application performance, and business KPIs.
  • Integrate observability solutions with Google Cloud Platform services and SolarWinds.
  • Correlate logs, metrics, and traces to troubleshoot incidents and reduce MTTR.
  • Collaborate with SREs, DevOps, and development teams to improve end-to-end system observability.
  • Implement best practices for telemetry data collection, enrichment, storage, and visualization.

Requirements:

  • Strong experience with Prometheus and Grafana for monitoring and alerting.
  • Proficiency in OpenTelemetry for instrumenting distributed systems.
  • Working knowledge of observability tools in Google Cloud (e.g., Cloud Monitoring, Logging, Trace).
  • Exposure to SolarWinds for network and infrastructure monitoring.
  • Solid understanding of telemetry data types: metrics, logs, and traces.
  • Ability to correlate and analyze multi-source observability data.
  • Scripting skills (Python, Bash); familiarity with Infrastructure-as-Code is a plus.

Preferred Qualifications:

  • Experience in Site Reliability Engineering or Platform Engineering roles.
  • Knowledge of SLIs/SLOs and performance benchmarking.
  • Experience with APM tools (e.g., Datadog, New Relic) is a plus.