Job Description:
Site Reliability Engineer.
For this position, we're looking for talented and experienced engineers who have a passion for infrastructure and automation.
As a Site Reliability Engineer (SRE), you will work within the development team to combine software and systems engineering and run large-scale distributed systems. You will also maintain the capacity of the client's systems. Your responsibilities will include:
- Taking part in architecture-level discussions, planning and implementation.
- Researching to ensure what we are building is always the best path forward.
- Documenting each project to facilitate integration for users.
- Driving proofs of concept and minimum viable products for demonstration.
- Delivering Infrastructure as Code.
- Supporting multiple services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Education and Experience:
To succeed in this role, candidates must have strong foundational knowledge of, and demonstrated proficiency in, Linux / Unix and cloud platforms (AWS, Azure, GCP, etc.).
- At least 6 years of SRE or similar experience as a DevOps or Software Engineer.
- SRE practices: SLIs / SLOs, error budgets, incident management, change and problem management.
- 3 years of Prometheus and Grafana custom dashboarding (Grafana as code).
- 5 years of Kubernetes.
- 2 years of Argo CD, GitOps tools (preferably FluxCD), GitHub Actions, and Flagger.
- 5+ years of experience writing Terraform code.
- At least two years of programming experience in a conventional programming language.
- Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible, Puppet, or Chef).
- Networking and cloud computing platform experience.
- Proficiency in scripting and programming languages (e.g., Bash, Python, Go, Node.js, Java, or similar).
- Familiarity with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Loki, ELK, or similar).
- Familiarity with CI / CD tools and SDLC practices.
- You have strong problem-solving skills and excellent communication skills.
- You are able to work independently as well as collaboratively in a remote team environment.
- You are friendly, collaborative, humble, honest, and always striving to be better.
(ref: hirist.tech)