Key Responsibilities:
- Monitor and support services to ensure robust performance at scale, with minimal downtime and quick recovery.
- Analyse and resolve incidents, and prevent recurrence through automation or alerting.
- Analyse and resolve user requests in close cooperation with the user.
- Continuously improve service deployment, infrastructure footprint, monitoring, and alerting, e.g. based on resolved incidents and triggered alerts.
- Develop and maintain Python scripts for automation and operational tasks.
- Build, deploy, and manage Docker containers and Kubernetes clusters.
- Administer GitHub repositories including workflows and integration pipelines.
- Collaborate with other service and infrastructure teams to ensure robust performance of services at scale.
- Document configurations, processes, and tool usage to support internal teams.
Mandatory Skills:
- Python Development: Proven experience in Python development and testing to create and maintain automation and operational scripts. Plus: Experience in Test-Driven Development.
- Docker: Hands-on experience in containerization and image management.
- Kubernetes: Practical experience in deploying, scaling, and troubleshooting applications.
- GitHub: Proficient with version control and repository management.
- Terraform / Ansible: Proven experience with an Infrastructure as Code tool.
Skills Required:
GitHub, Version Control, Python, DevOps, Repository Management