Job Title : Site Reliability Architect
Location : GGN Office (Onsite)
Experience : 10+ Years (including 5+ years as Architect)
Domain Preference : Stock Broking / Capital Markets / Financial Services
Role Overview :
We are seeking an experienced Site Reliability Architect to design, implement, and maintain highly available, secure, and scalable systems supporting mission-critical trading, settlement, and risk management platforms. The ideal candidate will bring deep expertise in site reliability engineering (SRE) principles, cloud infrastructure, and automation, with strong exposure to financial services environments.
Key Responsibilities :
- Design and implement reliable, highly available, and fault-tolerant systems for trading, clearing, and settlement platforms.
- Lead SRE initiatives including monitoring, alerting, incident response, and capacity planning for low-latency trading systems.
- Collaborate with development, infrastructure, and security teams to implement automation, CI/CD pipelines, and DevOps best practices.
- Establish service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for critical applications.
- Drive root cause analysis (RCA) and post-incident reviews for production outages or performance issues.
- Optimize system performance and reliability across on-premises and cloud (GCP) environments.
- Ensure compliance with regulatory requirements (SEBI, NSE, BSE) for uptime, security, and audit trails.
- Mentor and guide teams on reliability, scalability, and operational best practices.
Required Skills & Experience :
- 10+ years of IT experience, with at least 5 years in Site Reliability / Systems Architecture roles.
- Strong knowledge of cloud-native architectures, microservices, containerization (Docker/Kubernetes), and orchestration.
- Hands-on experience with monitoring and observability tools (Prometheus, Grafana, ELK, AppDynamics, Dynatrace).
- Expertise in automation, CI/CD pipelines, and infrastructure as code (Terraform, Ansible, Jenkins, GitOps).
- Experience in incident management, disaster recovery, and business continuity planning.
- Familiarity with financial services / stock broking environments, including low-latency and high-availability requirements.
Good to Have :
- Experience in trading systems, OMS, risk management platforms, or market data systems.
- Knowledge of regulatory compliance requirements in capital markets (SEBI/NSE/BSE).
Skills Required :
ELK, Prometheus, Disaster Recovery, Grafana, Microservices, Jenkins, AppDynamics, Docker, Terraform, Incident Management, Ansible, Business Continuity Planning, Containerization, Cloud Infrastructure, Dynatrace, Automation, Kubernetes