Senior Associate - Reliability Operations

Zeta Services Inc.
Hyderabad, Telangana, India
17 hours ago
Job description

About Zeta

Zeta is a Next-Gen Banking Tech company that empowers banks and fintechs to launch banking products for the future. It was co-founded by Ramki Gaddipati in 2015.

Our flagship processing platform, Zeta Tachyon, is the industry’s first modern, cloud-native, and fully API-enabled stack that brings together issuance, processing, lending, core banking, fraud & risk, and many more capabilities as a single-vendor stack. 20M+ cards have been issued on our platform globally.

Zeta is actively working with the largest banks and fintechs in multiple global markets, transforming customer experience for multi-million card portfolios.

Zeta has 1700+ employees, with over 70% of roles in R&D, across locations in the US, EMEA, and Asia. We raised $280 million at a billion valuation from Softbank, Mastercard, and other investors in 2021.

About the Role

  • The Senior Associate Reliability Operations role is critical in ensuring the continuous, reliable, and secure operation of our SaaS products, operating in a 24x7 support capacity. This role involves proactive monitoring, incident response, and collaboration with teams across the organization to maintain optimal service levels. The Senior Associate will participate in a rotating shift schedule to ensure high availability, rapid issue resolution, and support for key reliability initiatives. The Senior Associate will also serve as a key escalation point, mentor junior team members, and lead critical efforts to optimize operational workflows and systems.

Responsibilities :

  • 24x7 Monitoring and Support : Oversee the health, performance, and availability of cloud-based SaaS infrastructure and applications, using monitoring tools like Prometheus and Grafana, and respond to alerts during assigned shifts. Align with and adhere to organizational processes to maintain the SLA.
  • Incident Management : Act as the first responder in a 24x7 rotation, managing and mitigating service disruptions, following standard incident procedures, and escalating issues to SMEs as needed.
  • Deployments and Change Management : Manage the deployment lifecycle of applications. Proactively engage with SMEs to resolve deployment process issues or challenges.
  • Troubleshooting and Resolution : Use diagnostic tools and scripts to resolve common issues in real-time and collaborate with cross-functional teams to analyze and address root causes.
  • Service Health and Reliability : Assist in defining and refining SLAs, SLOs, and SLIs; perform routine checks and follow established runbooks to maintain consistent service reliability.
  • Analysis and Reporting : Regularly review incident data to identify patterns, improve service resilience, and produce shift reports summarizing system health and resolved incidents.
  • Documentation and Knowledge Base : Document incident resolutions, update runbooks, and contribute to an internal knowledge base to improve team response and efficiency.
  • Continuous Improvement Initiatives : Participate in reliability enhancement projects, including automation, configuration management, and tools improvement.
  • Collaboration : Communicate effectively with SMEs to relay critical incident information, insights, and preventive recommendations.
  • Mentorship : Work closely with team members to provide guidance during shifts and share insights on improving incident response.

Experience and Qualifications

  • Education : IT, Computers, BCA or equivalent.
  • Experience : 2-4 years of experience in reliability operations or a related 24x7 support role within SaaS or cloud environments.

Skills

  • Proficiency in monitoring and alerting tools, such as Prometheus, Grafana, Datadog, or Splunk.
  • Ability to remain composed in high-stakes situations and resolve incidents promptly.
  • Strong verbal and written communication skills to document and relay incident information effectively.

Shift Information

  • 24x7 Rotational Shifts : This role requires availability to work rotating shifts, including nights, weekends, and holidays, to ensure 24x7 support coverage.

Zeta is an equal opportunity employer

At Zeta, we are committed to equal employment opportunities regardless of job history, disability, gender identity, religion, race, marital / parental status, or another special status. We are proud to be an equitable workplace that welcomes individuals from all walks of life if they fit the roles and responsibilities.
