Role Purpose:
As Incident Manager IV, you will be the link between our Support, Engineering, and Infrastructure teams. You will enhance the customer experience by organizing and driving the investigation of production issues in our SaaS application, which consists of Spring-based microservices, ML models, and data pipelines hosted on AWS infrastructure. You will report findings to Engineering, Support, and other stakeholders, and in doing so improve overall product quality.
This is an engineering-focused role, not a management position.
Role Value:
Your work will directly contribute to greater customer satisfaction by ensuring timely communication and resolution of product issues. You will also support Sales teams by providing technical insights about our infrastructure in customer RFPs.
Key Responsibilities
- Investigate production issues raised by customers, Support, and Engineering
- Act as a liaison between Support and Engineering to facilitate issue resolution, root cause analysis (RCA), and implementation of improvements
- Create and track progress of problem tickets in Jira
- Generate incident analysis reports with Engineering teams
- Perform log file analysis using Datadog
- Debug basic REST API calls for investigations
- Execute SQL database queries to gather insights for issue investigations
- Create and update knowledge base articles in Confluence
- Participate in security audits (PCI DSS, ISO 27001, SOC 2) and prepare supporting evidence
Skills & Qualifications
Must-Have Skills:
- 8+ years of IT experience (SRE, Sysadmin, Developer, QA, Technical Support, or similar)
- University degree in a relevant field
- Strong analytical, problem-solving, and collaboration skills
- Good understanding of cloud-hosted application architectures
- Data analysis skills: ability to create and interpret dashboards to separate real issues from false positives
- Familiarity with project management and documentation tools such as Jira and Confluence
- Excellent verbal and written communication skills in English
- Solid knowledge of cloud (preferably AWS) infrastructure components
- Experience with REST APIs and tools such as Postman
- Experience with logging/monitoring tools such as Kibana and Datadog
- Proficiency in SQL, Linux, and networking
- Eagerness to continuously learn new technical skills