Position : Site Reliability Engineer.
Budget : 1.7 LPM.
Experience : 10 years.
Duration : 6 months.
Technical Skills :
- Programming : Proficiency in languages like Python.
- Operating Systems : Deep understanding of Linux / Windows operating systems and networking concepts.
- Cloud Technologies : Experience with Azure including services, architecture, and best practices.
- Containerization and Orchestration : Hands-on experience with Docker, Kubernetes, and related tools.
- Infrastructure as Code (IaC) : Familiarity with tools like Terraform, CloudFormation or Azure CLI.
- Monitoring and Observability : Experience with tools like Splunk, New Relic or Azure Monitor.
- CI / CD : Experience with continuous integration and continuous delivery pipelines, GitHub, GitHub Actions.
- Knowledge of supporting Azure ML, Databricks and other related SaaS tools.
Soft Skills :
- Problem-Solving : Ability to troubleshoot and debug complex distributed systems independently.
- Communication : Strong written and verbal communication skills to collaborate with development and operations teams, and the ability to write documentation such as runbooks.
Specific Experience :
- Incident Management : Experience with incident response, root cause analysis, and post-incident reviews.
- Scalability and Performance : Understanding of scalability, availability, and performance monitoring for large-scale systems.
- Automation : Experience in automating repetitive tasks and workflows.
Preferred Qualifications :
- Experience with specific cloud platforms (Azure).
- Certifications related to cloud engineering or DevOps.
- Experience with microservices architecture, including supporting AI / ML solutions.
- Experience with large-scale system management and configuration.
(ref : hirist.tech)