About the Role:
We are seeking an experienced Senior DevOps Engineer / Lead DevOps Engineer with deep expertise in Kubernetes infrastructure design and implementation. In this role, you will be responsible for architecting, building, and managing enterprise-grade Kubernetes clusters from scratch, driving cloud-native infrastructure modernization initiatives, and mentoring a small team of DevOps engineers.
This is an exciting opportunity to work with cutting-edge technologies, optimize application deployment pipelines, and ensure that our cloud infrastructure is robust, secure, and highly available.
Key Responsibilities:
- Design and implement enterprise-grade Kubernetes clusters across multi-cloud environments (AWS, Azure, GCP).
- Ensure high availability, scalability, and resilience of Kubernetes clusters.
- Manage upgrades, patching, and lifecycle management of clusters and nodes.
- Implement and maintain Terraform, Helm charts, and GitOps workflows for infrastructure provisioning and management.
- Automate routine operational tasks, deployments, and scaling operations.
- Build, maintain, and optimize CI/CD pipelines using Jenkins, GitLab CI/CD, or equivalent tools.
- Collaborate with development teams to streamline deployment strategies and reduce lead time for changes.
- Implement logging, monitoring, and observability solutions (Prometheus, Grafana, ELK Stack, or equivalent) for Kubernetes workloads.
- Proactively identify bottlenecks and performance issues.
- Implement security policies including RBAC, network policies, secrets management, and container security scanning.
- Ensure compliance with organizational and regulatory security standards.
- Design and implement disaster recovery strategies for containerized applications and Kubernetes clusters.
- Maintain backup and restore procedures for critical systems.
- Lead a team of 3 to 4 DevOps engineers, providing guidance, mentorship, and technical oversight.
- Conduct technical reviews and enforce code quality and best practices.
- Facilitate knowledge sharing and maintain comprehensive documentation.
Performance Optimization & Cost Management:
- Optimize cluster performance, resource utilization, and cost-efficiency across multi-cloud deployments.
Required Skills & Qualifications:
Education:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field (or equivalent experience).
Experience:
- 8+ years in DevOps or Site Reliability Engineering roles, with strong hands-on experience in Kubernetes.
Technical Skills:
- Kubernetes: Cluster architecture, deployment, scaling, and monitoring.
- Infrastructure as Code: Terraform, Helm, GitOps.
- Containerization: Docker and container best practices.
- Cloud Services: AWS, Azure, and/or GCP.
- Scripting & Automation: Python, Bash, or equivalent scripting languages.
- CI/CD Tools: Jenkins, GitLab CI/CD, or equivalent.
- Version Control: Git and branching strategies.
- Observability: Monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.
Leadership & Soft Skills:
- Proven ability to lead and mentor DevOps teams.
- Strong analytical, problem-solving, and troubleshooting skills.
- Excellent communication and collaboration with cross-functional teams.
- Ability to work in fast-paced, agile environments and drive initiatives independently.
(ref: hirist.tech)