Key Responsibilities:
- Manage deployments, patching, and changes on partner and internal cloud environments per compliance and partner needs.
- Set up new accounts and cloud environments for emerging business units.
- Conduct disaster recovery (DR) drills periodically for both internal and partner cloud environments.
- Create and manage secure golden AMIs and Docker images for consistency and security.
- Build scripts and automation tools to improve operational efficiency and scalability.
- Monitor and resolve alerts promptly to ensure high availability of systems and services.
- Act as the first responder for internal and external security issues, ensuring resolution within defined SLAs.
- Collaborate with DevOps to keep operations running smoothly by handling routine day-to-day tasks (e.g., creating new alerts or troubleshooting deployments).
- Manage internal and external audits by creating detailed documentation and automation for streamlined processes.
- Develop SOPs, runbooks, generic dashboards, and alerts for effective operations.
- Work closely with central teams (DevOps, Cloud Platform, Engineering Excellence) to ensure proper communication and understanding of requirements.
- Help developers troubleshoot deployment and alert-related issues, fostering a culture of collaboration.
- Maintain a strong understanding of networking concepts and infrastructure to ensure secure and reliable operations.
- Manage network firewalls.
- Occasionally travel to external bank offices for initial setups and infrastructure management as required.
Skills Required:
Auditing, Unix, Linux, Disaster Recovery, Windows, Operations, Python