We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join the IT Transformation team. The role involves driving automation, reliability, and performance optimization across mission-critical applications and infrastructure within a financial market ecosystem.
The successful candidate will manage end-to-end deployment automation, CI/CD pipelines, monitoring frameworks, and system reliability, ensuring high availability, scalability, and performance of production and non-production environments.
Key Responsibilities:
- Own and manage CI/CD pipelines for automated build, test, and deployment
- Design and implement robust deployment strategies for microservices and web applications
- Set up and maintain monitoring, alerting, and logging frameworks (Prometheus, Grafana, ELK)
- Build automations to optimize software delivery and reduce manual interventions
- Manage availability, latency, performance efficiency, change management, and capacity planning
- Develop services for automatic provisioning, release automation, and pre-emptive monitoring
- Partner with development and infrastructure teams to improve service reliability and incident management
- Run infrastructure leveraging GitLab CI/CD, Kubernetes, Kafka, NGINX, and the ELK stack
- Provide L1 operational support and ensure seamless production stability
Core Technical Competencies (Must Have):
- Elasticsearch, Logstash, Kibana (ELK) or AppDynamics
- CI/CD tools: GitLab or Jenkins
- Ansible
- Python
- Linux Administration
- SSO technologies
Additional Competencies (Nice to Have):
- Kubernetes, Kafka, MQ
- NGINX or Apigee
- Redis
- Experience working with outsourced vendor teams
- Knowledge of enterprise functional architecture in Capital Markets
Behavioral Competencies:
- Strong analytical, troubleshooting, and communication skills
- Collaborative and team-oriented approach
- Ability to manage multiple concurrent projects efficiently