RESPONSIBILITIES:
- Ensure the reliability, availability, and performance of critical payment platforms and services.
- Drive root cause analysis (RCA) and implement long-term solutions to prevent recurrence of incidents.
- Manage capacity planning, scalability, and performance tuning across cloud and on-prem environments.
- Lead and participate in the on-call rotation, providing timely support and issue resolution.
- Design, implement, and maintain CI/CD pipelines using Jenkins, GitHub, and other DevOps tools.
- Automate infrastructure deployment, configuration, and monitoring, following Infrastructure as Code (IaC) principles.
- Enhance automation for routine operational tasks, incident response, and self-healing capabilities.
- Implement and manage enterprise monitoring solutions including Splunk, Dynatrace, Prometheus, and Grafana.
- Build real-time dashboards, alerts, and reporting to proactively identify system anomalies.
- Continuously improve observability, logging, and tracing across all environments.
- Work with AWS, Azure, and PCF (Pivotal Cloud Foundry) environments, managing cloud-native services and infrastructure.
- Design and optimize cloud architecture for reliability and cost-efficiency.
- Collaborate with cloud security and networking teams to ensure secure and compliant infrastructure.
- Apply your understanding of card payment systems to ensure platform reliability and compliance.
- Troubleshoot payment-related issues, ensuring minimal impact on transaction flows and customer experience.
- Collaborate with product and development teams to ensure alignment with business objectives.
SKILLS REQUIRED:
DevOps, Jenkins, GitHub, Splunk, Prometheus, Grafana, AWS