Role : L3 support (Technical Specialist – Technical Support Engineer)
YOE : 6-8 years
CTC : 16 LPA
Notice period : Immediate
Mode : WFO, Mumbai
Primary Responsibilities :
A Tech Support Engineer is responsible for the availability, performance, monitoring, and incident response of the platforms and services that our company runs and owns.
We must make sure that everything that goes to production meets a set of general requirements, such as diagrams, dependencies on other services, monitoring and logging plans, backups, and possible high-availability setups.
Even when the software complies with all the necessary requirements, problems such as uncaught exceptions, hardware degradation, networking issues, latency, high resource usage, or slow responses from our services can happen at any time. We always need to be prepared and ready to act.
Key Technical Skills –
What you’ll do :
Documenting incidents, runbooks, and incident response reports
Writing postmortem reports
Ensuring proper logging, monitoring, and alerting
Handling change management and maintaining an incident management system
Driving root cause analysis exercises for issues
Adopting Site Reliability Engineering practices in the group
Tailoring processes to manage time-sensitive issues and bring them to appropriate closure.
Owning end-to-end availability and performance of mission-critical services and building automation to prevent the recurrence of problems.
Key Responsibilities :
Work with cross-product teams to ensure high availability of the system.
Experience in at least one of the following languages and willingness to learn new ones : Bash, PHP, Golang
Ability to identify system bottlenecks and recommend solutions to availability issues.
Proven expertise in system-level debugging
Working experience in building massively scalable high-performance services
Strong Linux systems knowledge
Guide new SREs on system debugging.
Hands-on experience working with AWS systems and components.
List of tools we are using internally :
Thread Dump Analysis
Heap Dump Analysis
GC Log Analysis
Knowledge of Java Virtual Machine (JVM) tuning
Knowledge of Tomcat, IBM WebSphere Application Server (WAS), and WebLogic
Multiple scripting / programming languages : PHP, Golang, Scala
Monitoring : Grafana, ELK, AWS CloudWatch, and other internal tools
Documentation and Ticketing : Jira and Confluence
Qualification :
Bachelor's degree in computer science or equivalent field
Excellent written and verbal communication skills
Production Support • Pune, Maharashtra, India