Profile : Site Reliability Engineer (SRE)
Experience Required : 6+ Years
Locations : Mumbai, Gurgaon, Chennai
Work Arrangement : Hybrid
Key Responsibilities :
- Design and implement scalable, resilient cloud-native infrastructure across AWS / Azure / GCP platforms
- Own the SRE function including availability, latency, performance monitoring, emergency response, and capacity planning
- Collaborate with engineering and product teams to improve system reliability, speed, and performance
- Set up, maintain, and improve CI / CD pipelines using industry-standard tools
- Perform load and stress testing, analyze performance bottlenecks, and provide remediation strategies
- Manage incident response and conduct post-incident reviews
- Implement Infrastructure as Code using Terraform
- Monitor system performance and implement proactive measures for system optimization
Mandatory Technical Skills :
- Cloud Architecture : Hands-on experience with AWS / Azure / GCP platforms
- Terraform : Infrastructure as Code implementation and management
- Performance Testing : Proficiency with JMeter, Gatling, k6, or Locust
- Load Balancing : Experience with ALB, NLB, Azure Load Balancer, GCP Load Balancer
- CI / CD Pipelines : Jenkins, GitHub Actions, Azure DevOps, or Google Cloud Build
Additional Required Skills :
- Cloud certifications (AWS / Azure / GCP Solution Architect preferred)
- SRE expertise in availability, performance monitoring, and capacity planning
- Monitoring tools : CloudWatch, Prometheus, Grafana
- Container technologies : Docker, Kubernetes, ECS / AKS / GKE
- Scripting & automation : Python, Bash
- Database operations : MySQL, PostgreSQL, NoSQL databases
- Strong incident management and troubleshooting capabilities
- Analytical problem-solving mindset
(ref : hirist.tech)