Talent.com
Lead Platform Engineer - Observability Services

neemtree, Bangalore
30+ days ago
Job description

Roles & Responsibilities :

  • Solution Packaging : Lead the end-to-end development of observability packages for 100+ standard technologies across infrastructure, databases, middleware, and application platforms.
  • Data Collection Strategy : Define and implement data collection strategies including agent instrumentation, API integrations, log and metrics collection pipelines, and auto-discovery mechanisms.
  • Golden Signals & Data Modeling : Define golden signals, KPIs, SLIs / SLOs, and data schemas for different component types to support health monitoring, performance optimization, and anomaly detection.
  • Dashboards, Alerts, Reports : Design and standardize visualizations, alerting rules, reporting templates, and RCA workflows for fast detection and resolution of issues.
  • Platform Enablement : Guide enhancements to agents, collectors, and platform components to support new integrations and data formats.
  • Team Leadership : Lead a team of engineers and specialists focused on observability solutions development. Establish best practices, design standards, and agile delivery pipelines.
  • Collaboration & Stakeholder Management : Work closely with product management, DevOps, SRE, and customer success teams to align on priorities, gather requirements, and validate delivered packages.
  • Quality, Scale & Reusability : Ensure all developed solutions are scalable, reusable, and version-controlled, with automated testing.

What You Bring :

Mandatory Skills :

  • Minimum 6 years of experience in observability, monitoring, SRE, or platform engineering roles.
  • Strong hands-on experience with observability tools such as Prometheus, Grafana, OpenTelemetry, ELK / EFK, Datadog, Splunk, or similar.
  • In-depth understanding of logs, metrics, traces, profiling, events, and the corresponding instrumentation / collection mechanisms.
  • Proven experience in developing observability solutions for platforms like Kubernetes, databases (Oracle, PostgreSQL), middleware (Tomcat, WebLogic), and distributed systems.
  • Experience with scripting, APIs, and automation frameworks (Python, Shell, Terraform, etc.).
  • Familiarity with RCA techniques, anomaly detection, and alert fatigue reduction strategies.
  • Ability to define and enforce design patterns, standards, and governance models.
  • Strong leadership, project management, and cross-functional collaboration skills.
  • Excellent verbal and written communication skills.

Good to Have Skills :

  • Experience building or managing a packaged observability marketplace or platform.
  • Contributions to open-source observability projects.
  • Certifications in Kubernetes, Observability tools, or cloud platforms (AWS, Azure, GCP).
  • Background in ITSM, CMDBs, or workflow automation is a plus.

(ref : hirist.tech)
