Key Responsibilities
- Lead the Site Reliability Engineering team
- Lay out schedules and shift plans for the team
- Manage tickets and allocate tasks for team members
- Work collaboratively with peers and management
- Ensure transparent communication with the customer
- Provide direction and assistance to team members
- Record and track team SLAs and workflows
- Ensure that the monitoring systems and procedures are aligned with industry best practices, regulatory requirements, and security policies
- Implement metrics-driven processes to ensure service quality
Skill Set
- Knowledge of monitoring tools such as Zabbix, Nagios, etc.
- Knowledge of or experience with ticketing systems such as Zoho Desk, JIRA, etc.
- Strong problem-solving skills, particularly in investigating and analyzing recurring issues
- Hands-on knowledge of Linux fundamentals, system administration, scripting, performance tuning, etc.
- Ability to think under pressure
- Basic knowledge of cloud environments such as AWS, Azure, Google Cloud, etc.
- Basic knowledge of networking, routing, and switching
- Communication and documentation skills
Experience: 5-7 years of L2 monitoring
Skills Required
System Administration, Scripting, Performance Tuning