Description:
- Manage containerized environments using Docker and Kubernetes.
- Monitor system health, performance, and availability using modern monitoring and alerting tools.
- Ensure high availability, scalability, and fault tolerance of cloud infrastructure.
- Automate operational tasks and improve system reliability through SRE best practices.
- Collaborate with development teams to improve deployment, observability, and incident response.
Key Skills:
- Strong experience in DevOps and SRE practices.
- Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.).
- Expertise in Docker and Kubernetes.
- Experience with monitoring and logging tools (Prometheus, Grafana, ELK, Datadog, etc.).
- Solid knowledge of cloud platforms (AWS, Azure, or GCP).
- Scripting skills (Python, Bash, or similar).
Zenith - Site Reliability Engineer - CI/CD Pipeline • Hyderabad