Lead CloudOps Engineer

DevOn
Kanpur, IN
4 days ago
Job description

Lead Operations Engineer

Experience: 8+ years

  • Own operational oversight for services running on a Java-based microservices platform.
  • Act as the primary escalation point for production incidents; lead incident response and communication.
  • Drive post-incident reviews (blameless RCAs) and embed learnings through preventive actions.
  • Maintain service dashboards, alerts, and incident tooling (e.g., PagerDuty, Datadog).

Technical Leadership

  • Guide operational practices across services built using Java (Spring Boot), Kafka, MongoDB, and related technologies.
  • Oversee monitoring, observability, and performance tuning using Datadog, ELK, Prometheus, or similar tooling.

Problem Management & Root Cause Elimination

  • Lead proactive and reactive problem management efforts.
  • Identify recurring production issues and collaborate with engineering to design permanent solutions.
  • Track and reduce operational toil via automation and tooling improvements.

Change Enablement & Service Onboarding

  • Partner with development teams to onboard new services with production readiness standards.
  • Ensure all services meet requirements for monitoring, logging, documentation, support, and resilience before go-live.
  • Support safe, rapid change practices including canary releases, feature flags, and progressive delivery.

Team Management & Leadership

  • Lead and mentor a team of operations engineers and/or SREs.
  • Manage performance reviews, career development, and day-to-day team workload.
  • Foster a high-performance culture with strong accountability, collaboration, and a learning mindset.

Continuous Improvement & DevOps Practices

  • Drive automation and self-service initiatives to reduce manual intervention and operational burden.
  • Champion observability best practices (metrics, traces, logs) and error budget tracking.
  • Promote DevOps culture and continuous feedback loops between engineering and operations.

Governance, Risk & Compliance

  • Ensure operational processes comply with security, privacy, and regulatory requirements (e.g., SOC 2, ISO 27001).
  • Manage operational risks, service continuity plans, and audit readiness.