Key Responsibilities:
- 24x7 Observability: Monitor production systems globally, ensuring continuous system reliability and a seamless customer experience.
- Cross-Functional Troubleshooting: Collaborate with engineering and operations teams to assess, diagnose, and resolve production issues effectively.
- Deployment and Configuration: Use CI/CD tools to deploy services and configuration changes at enterprise scale.
- Security and Compliance: Implement security controls that meet regulatory standards such as GDPR, SOC 2, ISO 27001, PCI, HIPAA, and FBA.
- Maintenance and Support: Apply security patches, perform upgrades, support on-call rotations (PagerDuty/Maximo), and collaborate with product support teams on issue resolution.
Skills Required
Site Reliability Engineering, Kubernetes, OpenShift, AWS, IBM Cloud, Python