Role Purpose:
As Incident Manager IV, you will be the link between our Support, Engineering, and Infrastructure teams. You will enhance the customer experience by organizing and driving the investigation of production issues in our SaaS application, which consists of Spring-based microservices, ML models, and data pipelines hosted within AWS infrastructure. You will report findings to Engineering, Support, and other stakeholders. In doing so, you will also positively impact the overall product quality.
This is an engineering-focused role, not a management position.
Role Value:
Your work will directly contribute to greater customer satisfaction by ensuring timely communication and resolution of product issues. You will also support Sales teams by providing technical insights about our infrastructure in customer RFPs.
Key Responsibilities
Investigate production issues raised by customers, Support, and Engineering
Act as a liaison between Support and Engineering to facilitate issue resolution, root cause analysis (RCA), and implementation of improvements
Create problem tickets in Jira and track their progress
Generate incident analysis reports with Engineering teams
Perform log file analysis using Datadog
Debug basic REST API calls for investigations
Execute SQL database queries to gather insights for issue investigations
Create and update knowledge base articles in Confluence
Participate in security audits (PCI DSS, ISO 27001, SOC 2) and prepare supporting evidence
Skills & Qualifications
Must-Have Skills:
8+ years of IT experience (SRE, Sysadmin, Developer, QA, Technical Support, or similar)
University degree in a relevant field
Strong analytical, problem-solving, and collaboration skills
Good understanding of cloud-hosted application architectures
Data analysis skills: ability to create and interpret dashboards to separate real issues from false positives
Familiarity with project management and documentation tools such as Jira and Confluence
Excellent verbal and written communication skills in English
Solid knowledge of cloud (preferably AWS) infrastructure components
Experience with REST APIs and tools such as Postman
Experience with logging / monitoring tools such as Kibana and Datadog
Proficiency in SQL, Linux, and networking
Eagerness to continuously learn new technical skills
Incident Manager • India