Description :
Location : Pan India (Except Mumbai)
About the Role :
We are looking for an experienced Sr. Team Lead, Cloud & Infrastructure Operations, to lead and optimize our cloud-based infrastructure operations. The ideal candidate will be responsible for ensuring high availability, scalability, and performance of our AWS environments while driving operational excellence across monitoring, automation, and incident response.
Key Responsibilities :
Cloud Monitoring & Visibility :
- Implement AWS-native monitoring services (CloudWatch, CloudTrail, VPC Flow Logs) and integrate with Datadog to provide complete observability across Drupal websites, EKS, and RDS environments.
Performance & SLA Management :
- Configure log aggregation, health checks, and alerting rules to ensure application uptime (target 99.99%) and monitor cross-region failover effectiveness.
Dashboards & Analytics :
- Develop real-time dashboards for tracking traffic, latency, and error rates to support proactive issue detection and optimization.
Incident Management :
- Integrate monitoring insights into incident management workflows to enable faster detection, triage, and resolution of operational incidents.
Tooling & Automation :
- Oversee agent deployments and integrations of cloud workloads with third-party tools to enhance monitoring, logging, and security coverage.
Leadership & Collaboration :
- Lead a team of cloud engineers, ensuring best practices in DevOps, SRE, and Cloud Infrastructure Operations.
- Collaborate with cross-functional teams on system design, scaling, and reliability improvements.
Required Skills & Experience :
- 10-17 years of experience in Cloud Infrastructure Operations with strong leadership exposure.
- Expertise in AWS services (EC2, EKS, RDS, CloudWatch, CloudTrail, IAM, VPC).
- Hands-on experience with Datadog, Prometheus, Grafana, or similar monitoring platforms.
- Strong knowledge of infrastructure automation tools (Terraform, Ansible, or CloudFormation).
- Proficiency in Linux administration, network troubleshooting, and incident response processes.
- Proven experience in managing high-availability and scalable environments.
- Excellent communication, stakeholder management, and team leadership skills.
Good to Have :
- Certifications such as AWS Certified Solutions Architect or DevOps Engineer Professional.
- Exposure to container orchestration (EKS, Kubernetes) and CI/CD pipelines.
- Experience with Drupal-hosted environments or large-scale web application infrastructure.
Join us to lead high-performing cloud operations that power next-gen web applications with reliability, speed, and scale.
(ref : hirist.tech)