Site Reliability Engineer
Location : Hyderabad
Reports to : Head – DevOps & TechOps
Type of Position : Full Time
About us :
Arrise Solutions (India) Pvt. Ltd. is a leading content provider to the iGaming and Betting Industry, offering a multi-product portfolio that is innovative, regulated and mobile-focused. Arrise Solutions (India) Pvt. Ltd. strives to create the most engaging and evocative experience for customers globally across a range of products, including slots, live casino, sports betting, virtual sports and bingo.
Job Description :
About The Role
The team at Arrise Solutions (India) Pvt. Ltd. deals with infrastructure provisioning automation, configuration management, tool administration, production system support, application environment maintenance, observability, release management, and internal tool development for efficient handling of the aforementioned activities. You will be actively involved in all aspects of the team's activities.
What You’ll Do Here
- Analyze, troubleshoot, debug, and assist in problem solving in test and production environments within the framework of incident and change management processes.
- Lead triage of any outage that impacts the infrastructure or applications, and collaborate closely with support and development teams by playing an active role in blameless postmortems and refining playbooks to reduce MTTR.
- Ensure the operational health, high availability, reliability, and security of the applications and infrastructure.
- Maintain production services through measuring and monitoring availability, latency, and overall system health.
- Handle on-call and emergency support.
What We're Looking For
- 7-9 years of experience in SRE / DevOps / System Engineer / Production Support roles.
- Infrastructure / app performance engineering experience is nice to have.
- Experience with Linux administration and troubleshooting of Java applications in production.
- Maintenance and development of monitoring, logging, tracing, and alerting solutions (Grafana, ELK, PagerDuty, or equivalents).
- Hands-on experience with tools and techniques to diagnose and uncover container and overall system performance issues.
- Ability to handle a fast-paced environment with multiple simultaneous projects and incident responses.
- Hands-on experience with handling infrastructure on AWS / GCP / Data Center or similar cloud / on-premise platforms.
- Expertise in scripting languages : Python / Groovy / Ruby or equivalent.
- Experience in DevOps, CI / CD, Configuration, and Release Management areas and relevant tools.