Role Title - Site Reliability Engineer
Location : Gurgaon (Hybrid)
Bravura's Commitment and Mission
At Bravura Solutions, collaboration, diversity and excellence matter. We value your ideas, giving you room to be curious and innovate in an exciting, fast-paced, and flexible environment. We look for many different skills and abilities, as well as how you can add value to Bravura and our culture.
As a global FinTech market leader and ASX-listed company, Bravura is a trusted partner to over 350 leading financial services clients, delivering wealth management technology and products. We invest significantly in our technology hubs and innovation labs, which inspire and drive our creative, future-focused mindset. We take pride in developing cutting-edge, digital-first technology solutions that support our clients to achieve financial security and prosperity for their customers.
Position Purpose
Join our dedicated Service Operations Team and contribute to the successful delivery of exciting projects within Bravura's application portfolio. The Observability Team ensures the health, performance, and reliability of systems and applications by providing crucial insights.
Site Reliability Engineers (SREs) are skilled engineers who blend technical expertise with a passion for improvement. They creatively solve complex challenges, ensuring the availability and reliability of critical services. SREs collaborate with business leaders to build and maintain sustainable systems that adapt to a dynamic global environment.
At Bravura, we're dedicated to building software that solves real-world problems. Our SREs play a vital role in empowering our users with a robust and high-performance platform. As we expand, we seek an experienced SRE who can bring fresh perspectives and innovative solutions. This individual will collaborate with cross-functional teams to deliver exceptional user experiences.
Main Activities
Based in Gurgaon, you will join us in ensuring our applications deliver high availability, optimal performance, and reliable uptime that meet our clients' needs and service level agreements. We're looking for proactive, curious individuals with a focus on continuous improvement and automation.
Your day-to-day responsibilities will be:
- Proactively monitor and observe business services and processes to ensure uninterrupted service delivery.
- Continuously optimize system performance, anticipating client needs by proactively improving the reliability of services throughout their lifecycle.
- Support deployment, availability, reliability, performance, and customer escalation targets for these environments.
- Create traceability of workflow transactions, alerting strategies, and corresponding triggers.
- Maintain and monitor applications and infrastructure across multiple production and non-production environments.
- Support applications by troubleshooting application and infrastructure issues while coordinating with multiple stakeholders.
- Actively work with development teams to diagnose application performance issues and identify areas for improvement
- Take responsibility for a piece of work and see it through from specification into production (in collaboration with others)
- Work closely with other teams to improve knowledge sharing and platform understanding
- Document and provide feedback on application documentation and tickets
- Manage and respond to incidents within a 24/7 environment, ensuring service level targets are met.
Key skills
- Experience in supporting a cloud platform (AWS / Azure) along with previous comprehensive experience in application support to support non-cloud-based applications.
- Sound understanding of Site Reliability Engineering principles to manage a complex suite of environments and SRE tooling and leveraging SRE technology and tools to further automate current platforms and environment management activities.
- Demonstrated skills with automation including scripting knowledge (Shell / Bash).
- Experience in monitoring tools: AppDynamics, Grafana, and Prometheus.
- Experience in troubleshooting applications in Java / REST APIs / JSON.
- Excellent communication skills, with the ability to communicate ideas, concepts and facts to clients, peers, and senior members of staff.
- Friendly, professional, and business-like approach to both external and internal clients.
- Systematic, logical thinker with excellent attention to detail.
- Good client focus with the ability to build positive effective relationships.
- The aptitude to be flexible and assertive in demanding circumstances.
- Self-control and resilience including the ability to work effectively under pressure.
- Proven use of problem-solving skills with the initiative to proactively resolve issues.
- Excellent team and interpersonal skills.
- Empathy and the ability to understand customer needs.
- Effective organization and time management skills.
- Able to work unaided and as part of a collaborative team.
Qualifications and Experience
- Bachelor's degree in computer science or another highly technical, scientific discipline / MCA.
- 4-6 years of relevant industry experience.
- Any experience in regular expressions is a bonus.
- Ability to program (structured and OO) with one or more high-level languages, such as Python, Java, C / C++, Ruby, and JavaScript.
- Experience with distributed storage technologies like NFS, HDFS, Ceph, S3 as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- Proven knowledge of databases; SQL preferable on Oracle Database or SQL Server.
- A basic understanding of service delivery processes, i.e.:
  - Incident Management
  - Code promotion and release process
  - Change Control
  - Problem Management
  - Availability Management
  - Contingency planning / business continuity
  - Configuration Management
- Proven experience gained in an IT-related role within the Financial Services Industry is advantageous.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
Characteristics
- Consultative and an effective influencer.
- Ability to apply analytical skill and conceptual thinking to operations and system planning.
- Ability to collaborate with clients.
- Commercial awareness.
- Capable of working on-site at client offices.
- Troubleshooting and debugging capabilities / techniques.
Skills Required
S3, C, Oracle Database, Prometheus, JSON, YARN, Grafana, Mesos, Shell, JavaScript, Ceph, Ruby, NFS, Python, AWS, Java, SQL Server, Bash, HDFS, SQL, AppDynamics, Azure, Kubernetes