Description
We are seeking a skilled Site Reliability Engineer (SRE) to join our team in India. The ideal candidate will have a strong background in maintaining and improving service reliability, as well as a passion for automation and efficiency.
Responsibilities
- Monitor and maintain the availability and performance of critical services and systems.
- Design and implement automation tools for operational tasks.
- Collaborate with development teams to improve service reliability and performance.
- Troubleshoot and resolve incidents and outages in a timely manner.
- Participate in on-call rotations to respond to incidents and emergencies.
- Implement and manage CI/CD pipelines to streamline deployment processes.
- Conduct root cause analysis and implement preventive measures for recurring issues.
Skills and Qualifications
- 2-6 years of experience in Site Reliability Engineering, DevOps, or a related field.
- Strong understanding of cloud infrastructure (AWS, Azure, GCP).
- Proficiency in scripting languages (Python, Bash, etc.).
- Experience with containerization and orchestration technologies (Docker, Kubernetes).
- Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK stack).
- Knowledge of networking concepts and protocols (TCP/IP, DNS, HTTP).
- Ability to work collaboratively in a team environment and communicate effectively.
Education
Bachelor of Computer Applications (B.C.A.), Master of Computer Applications (M.C.A.), Post Graduate Diploma in Computer Applications (PGDCA), Bachelor of Technology or Engineering (B.Tech / B.E.)
Skills Required
Kubernetes, Docker, Terraform, Prometheus, Grafana, Python, Linux, Monitoring, Networking