Role : SRE Lead Engineer
Skills : Docker, Prometheus, Grafana, ELK, Datadog
Location : Noida
Experience : 8+ Years
Mode : Work from office
We at Coforge are hiring a highly skilled and experienced SRE Lead Engineer to drive reliability, scalability, and performance across our infrastructure and applications. You will lead a team of SREs, collaborate with development and operations teams, and implement best practices to ensure high availability and resilience of our systems.
Key Responsibilities :
- Lead and mentor a team of SREs to build scalable and reliable systems.
- Design and implement monitoring, alerting, and incident response strategies.
- Drive automation of operational tasks and improve deployment pipelines.
- Collaborate with software engineers to ensure reliability is built into the product from the ground up.
- Conduct root cause analysis and postmortems for production incidents.
- Define and track SLAs, SLOs, and SLIs to measure and improve system reliability.
- Champion DevOps and SRE best practices across the organization.
- Manage capacity planning and performance tuning.
- Ensure security and compliance standards are met in infrastructure operations.
Required Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in software engineering, DevOps, or SRE roles.
- Strong experience with Azure platforms.
- Proficiency in programming / scripting languages (Python, Go, Bash, etc.).
- Expertise in CI / CD tools (Jenkins, GitLab CI, etc.).
- Deep understanding of containerization and orchestration (Docker, Kubernetes).
- Experience with observability tools (Prometheus, Grafana, ELK, Datadog).
- Excellent problem-solving and communication skills.
- Proven leadership experience in technical teams.
Preferred Qualifications :
- Certifications in cloud technologies or DevOps practices.
- Experience with Infrastructure as Code (Terraform, Ansible).
- Familiarity with chaos engineering and resilience testing.
- Exposure to regulatory compliance (e.g., SOC 2, ISO 27001).