Description :
Location : On-site Gurgaon (Hybrid)
Department : Technology / Engineering
Experience Level : 8+ Years
Employment Type : Full-Time
ABOUT THE ROLE :
We are looking for a highly skilled Lead DevOps Engineer to join our team and help build, scale, and maintain a reliable messaging platform that powers seamless communication for millions of users.
You'll be responsible for designing cloud-native infrastructure, automating deployments, ensuring high availability, and driving operational excellence in a fast-paced environment.
KEY RESPONSIBILITIES :
Infrastructure & Deployment :
- Design, implement, and manage scalable, resilient cloud infrastructure (AWS / GCP / Azure) for messaging workloads.
- Build CI / CD pipelines to enable automated, reliable, and fast delivery of new features.
- Containerize applications (Docker / Kubernetes) and optimize orchestration for performance.
Reliability & Monitoring :
- Ensure high availability and low latency of the messaging platform with proactive monitoring and alerting (Prometheus, Grafana, ELK, Datadog, etc.).
- Troubleshoot production issues, perform root cause analysis, and implement long-term fixes.
- Define and track SLOs / SLAs / SLIs for messaging services.
Automation & Security :
- Automate provisioning, scaling, and failover processes using Infrastructure as Code (Terraform, Ansible, Helm).
- Enforce best practices for system security, secrets management, and compliance.
- Implement disaster recovery, backup strategies, and incident response playbooks.
Collaboration & Culture :
- Work closely with developers, SREs, and QA teams to deliver reliable features.
- Advocate for DevOps culture : CI / CD adoption, a monitoring-first mindset, and blameless postmortems.
- Contribute to documentation and knowledge sharing across teams.
Required Skills & Qualifications :
- 8+ years of experience in DevOps / SRE / Cloud Engineering roles.
- Strong experience with Kubernetes, Docker, and CI / CD pipelines.
- Hands-on expertise in cloud platforms (AWS / GCP / Azure) and Infrastructure as Code.
- Solid background in Linux systems, networking, and messaging protocols (e.g., Kafka, RabbitMQ, MQTT, WebSockets, or similar).
- Experience with monitoring, logging, and observability stacks.
- Knowledge of scripting / programming (Python, Bash, Go).
Skills :
- Experience with real-time, high-throughput systems (messaging, streaming, or event-driven architectures).
- Exposure to scaling microservices in production.
- Familiarity with security best practices in distributed systems.
Why Join Us?
- Opportunity to build and scale a mission-critical messaging platform used globally.
- Work with a passionate, talented team driving innovation in cloud-native communication.
- Competitive salary.
(ref : hirist.tech)