Overview
We are seeking a highly motivated and experienced Manager of Site Reliability Engineering (SRE) to lead our Azure-focused SRE team. The ideal candidate will combine technical expertise in Azure cloud services with strong leadership skills to ensure the reliability, scalability, and performance of our applications and infrastructure. As a manager, you will oversee a team of SREs, driving automation, incident management, and operational excellence while collaborating with cross-functional teams to achieve business goals.
Responsibilities
o Lead, mentor, and grow a team of SREs, fostering a culture of collaboration, continuous learning, and operational excellence.
o Define team goals, metrics, and performance objectives aligned with organizational priorities.
o Ensure the reliability, availability, and performance of Azure-hosted services through proactive monitoring and alerting.
o Develop and enforce best practices for incident response, root cause analysis, and postmortem reporting.
o Establish SLAs, SLOs, and error budgets in collaboration with product and engineering teams.
o Drive the adoption of automation tools to reduce manual operational tasks and improve system reliability.
o Implement Infrastructure as Code (IaC) for Azure resources using tools such as Terraform, ARM templates, or Bicep.
o Optimize system performance and scalability, and lead capacity planning, to support growth and evolving business needs.
o Leverage Azure services such as Azure Monitor, Application Insights, and Log Analytics to gain insights into system health.
o Partner with development, product, and infrastructure teams to align on technical strategies and priorities.
o Communicate operational health, risks, and opportunities to executive stakeholders.
o Ensure compliance with security best practices, standards, and policies within Azure environments.
o Identify and mitigate risks related to cloud infrastructure and applications.
Qualifications
Skills Required
Azure Cloud Services, Python, Terraform, PowerShell, ARM Templates
Azure SRE • Hyderabad / Secunderabad, Telangana, India