Incident Manager - Application Architecture

CareerNet Technologies, Chennai
30+ days ago
Job description

Responsibilities :

  • Understand the complete Support App architecture well enough to determine whether an error originates from the core platform, a snap-in, or a customer's configuration.
  • Help resolve highly complex, escalated customer queries handed over by junior support engineers.
  • Identify potentially problematic services, check for recent code commits, and perform a rollback if the root cause is confirmed.
  • Devise plans to continuously upskill junior support engineers.
  • For issues in snap-ins or configurations, a deep understanding of how they work is required to check for recent changes. This includes reasoning about workflows, agents, customisations, analytics, and portals to identify misconfigurations.
  • Basic knowledge of Kubernetes (k8s) and AWS infrastructure to debug and reason about infrastructure-level failures.
  • Exposure to building log monitoring dashboards and alerts.
  • Champion customers' recurring pain points and devise solutions that stop them from resurfacing.
  • Help triage and resolve critical bugs, features, and custom requests reported by DevRev customers.

Requirements :

  • Experience : 3-6 years.
  • Master's degree in Computer Science or related field / equivalent practical experience.
  • Excellent logical thinking and problem-solving mindset.
  • Expert use of observability tools: reading logs, analysing traces, and interpreting metrics to pinpoint issues.
  • Proficiency in reading and understanding Golang, JavaScript (JS), and Python to trace issues through different parts of the codebase.
  • Deep knowledge of the product's features and the underlying engineering architecture is essential for effective debugging.
  • Skill in using tools to inspect and debug API calls between the application and its various integrations.
  • Methodical isolation of problematic components, whether a microservice, a database query, or a third-party integration.
  • Basic knowledge of Kubernetes (k8s) and AWS infrastructure.
  • Familiarity with Cursor or similar investigation tools for live or near-real-time monitoring.
  • Comfortable with GitHub workflows, including branching, PR reviews, and GitHub best practices.
  • Strong testing mindset, with experience in writing and executing test cases and verifying hotfixes in production-like environments.
  • Experience in sprint management, stakeholder communication, and working cross-functionally with engineering and product teams.

Nice-to-Have :

  • Prior experience supporting or working on Agentic AI systems, LLMs, or AI copilots.
  • Familiarity with CI / CD workflows and build tools.
  • Exposure to customer support systems and experience collaborating with different cross-functional teams.
  • Ability to generate actionable insights from logs and telemetry data.

(ref : hirist.tech)