Grafana Specialist
Experience: 8+ years
Shift Timings: 4 PM to 1 AM IST
Locations: Pune, Bangalore, Chennai, Hyderabad, Noida
We need a Grafana Specialist who can build dashboards from scratch, including the underlying technologies and integrations.
Skilled in deploying and managing observability platforms such as Grafana, Prometheus, Loki, and Tempo, and automating infrastructure using Python, Bash, Terraform, and Ansible. Adept at dashboard design, alerting configuration, and incident response workflows, with a strong focus on security, RBAC, and compliance best practices. Experienced in integrating logs, metrics, and traces across multi-cloud systems (AWS, Azure, GCP) to deliver real-time visibility and system reliability. Recognized for cross-functional collaboration, technical communication, and driving continuous improvement in complex environments.
Core Responsibilities
User & Access Management
Create, update, and delete user accounts.
Assign roles and permissions via Okta groups.
Ensure admin access is granted only upon ARF approval.
Dashboard & Visualization Management
Create and manage dashboards using data sources like Prometheus, Loki, and Tempo.
Customize panels, variables, and layouts for dynamic filtering.
Add trace components using Tempo and trace IDs.
Alerting & Monitoring
Set up and manage alerts based on log and metric data.
Ensure alerts are configured correctly and notifications are sent to appropriate users.
Monitor the health and performance of the Grafana instance.
System Administration
Perform regular backups of Grafana configurations and data.
Restore data from backups when necessary.
Escalate issues to platform owners as needed.
Documentation & Compliance
Maintain documentation for Grafana configurations, dashboards, and processes.
Support audit and compliance requirements by maintaining traceability and access logs.
Stack Deployment & Maintenance
Deploy and manage the Grafana stack with Prometheus, Loki, and Tempo using Docker Compose.
Configure Prometheus to scrape metrics and Loki for log aggregation.
Maintain and update the Docker Compose and Prometheus configuration files.
Required Qualifications
Education & Certifications
Bachelor’s degree in Computer Science, IT, or a related field.
Preferred certifications: Grafana Cloud Admin, Prometheus Certified Associate, or equivalent.
Experience
3–5 years of experience in monitoring and observability platforms.
Hands-on experience with Grafana, Prometheus, Loki, Tempo, and Docker.
Familiarity with Okta, ARF workflows, and enterprise access control.
Skills
Strong troubleshooting and analytical skills.
Proficiency in scripting (Bash, Python) and automation tools (Ansible, Terraform).
Excellent communication and documentation abilities.
Willingness to work in 24x7 support environments and rotational shifts.