Overview
The Service Reliability Engineer role sits within the MUFG Retirement Solutions Technology Delivery function and is accountable for ensuring the reliability and scalability of data services through proactive monitoring, automation, and incident resolution. This role focuses on maintaining system uptime and reducing unplanned downtime for critical systems.
Key Accountabilities and main responsibilities
Strategic Focus
- System Monitoring: Implement and maintain monitoring frameworks to ensure real-time visibility of system performance.
- Incident Resolution: Lead resolution efforts for critical incidents, ensuring minimal downtime.
- Automation: Develop and implement automation strategies to improve system reliability and reduce manual interventions.
- Collaboration: Work with cross-functional teams to identify and address potential system issues proactively.
- Performance Optimisation: Analyse performance data to drive continuous system improvement, with a proactive focus on optimising cloud ROI.
Operational Management
- Perform regular monitoring of the data systems in real time to identify any performance issues.
- Drive incident meetings to identify the issue and provide resolution, thereby ensuring the reliability and scalability of data services.
- Develop and implement automation strategies to improve system reliability and reduce manual interventions.
- Propose, document and implement changes to policies or procedures in line with technological advancements.
- Assist in the development, maintenance and implementation of, and changes to, the SLAs.
- Monitor and identify any trends or irregular activities on jobs logged that could relate to potential IT issues, and escalate appropriately.
- Provide knowledge, training and information support to enable self-service.
- Set procedures and processes in line with standards within the IT Desktop environment.
- Perform quality checks and audit the observations on the work carried out.
- Provide regular updates to leadership on the status of all tasks, projects and improvements, including issue and risk mitigation solutions, in agreed timeframes.
- Ensure that all requests from stakeholders for assistance are handled promptly and effectively and, if necessary, escalated to the appropriate level.
- Drive the onboarding and rollout of technology services according to the pre-defined roadmap.
- Apply best practices like regular system monitoring, performance optimisation, and collaboration for system reliability and uptime.
Governance & Risk
- Adhere to MUFG’s standards, policies, and procedures.
- Ensure adherence to the governance framework set up by the domain and provide accurate metrics accordingly.
- Manage risks, dependencies and issues associated with technology delivery.
- Adhere to regulatory guidance and standards (e.g. CPS230, CPG235 and GDPR).
- Review IT processes and procedures to ensure efficiency and simplicity for the business and to meet control objectives as set out by GS007, ISO27001 and other financial industry regulations.
The above list of key accountabilities is not exhaustive and may change from time to time based on business needs.
Experience & Personal Attributes
Experience
- Overall, 7-10 years’ experience, with a minimum of 5-7 years in data platform engineering and cloud migration in large, complex organisations.
- Strong experience in automation, cloud computing, and data governance.
- Preference for experience working with onshore teams and key stakeholders, inclusive of migrations and driving global team collaboration and efficiency.
- Expertise in driving complex technical transformations and decommissions, and in realising business outcomes through data and analytics.
- Experience in implementing frameworks and policies with the ability to measure the outcomes.
- Govern IT End User Computing in a way that drives transparency, operational stability, financial sustainability and productivity.
- Identify and mitigate security risks and ensure IT security design and delivery.
- Proficiency in Snowflake, SQL, and DataOps frameworks, with an understanding of data management and processing.
- Strong analytical skills to analyse performance data and drive continuous system improvement.
- Strong troubleshooting skills for resolving critical incidents, ensuring minimal downtime.
- Experience in data observability automation to scale data monitoring.
Personal Attributes
- Effective communication and interpersonal skills to engage with people at all levels of the organisation and build strong relationships and trust with global stakeholders.
- A good problem-solver and effective decision maker with a focus on overcoming challenges.
- Strong business acumen and passion for current, new and emerging technologies, to enable and roll them out to the business and improve customer experience.
- Strong in developing presentations, with the ability to present to and engage a wide variety of audiences.
- Ability to prioritise, organise and plan, and to meet demanding deadlines.
- Ability to make decisions in a timely manner based on the information, experience and skills available.
- Ability to recognise, lead and implement continuous service improvement opportunities.