About the role :
We are seeking a motivated and experienced Software Development Engineer - Backend (SDE-III) to join our team.
You will be responsible for maintaining and enhancing our Deep Learning and LLM backend systems that serve high-scale applications.
Your work will directly impact our ability to deliver reliable, efficient, and innovative AI solutions.
Outcomes for the First 12 Months :
High Operational Standards :
- Ensure AI backend services meet stringent Service Level Objectives (SLOs) with a focus on 99.99% uptime, optimal accuracy, swift response times, and a high success ratio.
- Advance the observability of model inference and performance metrics to facilitate proactive monitoring and rapid troubleshooting.
- Reduce the lead time for changes in AI backend services by emphasizing code quality, readability, and reliability.
- Implement release strategies that are quicker, safer, and more efficient.
- Identify and execute initiatives aimed at enhancing performance, operational efficiency, reliability, and cost-effectiveness, ensuring our systems deliver strategic business value.
- Drive customer delight by improving the Net Promoter Score (NPS) through faster response times to queries, effective bug fixes, and the swift execution of feature requests.
Strategic Innovations :
- Lead strategic initiatives in Machine Learning (ML) and Large Language Model (LLM) inference to secure a competitive advantage.
Leadership :
- Provide guidance and mentorship to junior team members, fostering an environment of growth and accountability.
Competencies :
- Languages - Strong proficiency in Node.js
- ML inference - Hands-on experience with ML inference frameworks such as TritonServer, TensorRT, ONNX, etc.
- Backend systems - Deep understanding of REST APIs, distributed systems, and cloud infrastructure
- Code Quality : Write clean, maintainable, and efficient code. Conduct and participate in code reviews to ensure code quality.
- Scalability & Performance : Design and implement systems with a focus on performance and scalability.
- Production Releases - Familiarity with CI / CD pipelines, DevOps practices, and automated testing
- Monitoring & Observability : Proficient in implementing monitoring solutions for system reliability and performance. Utilize tools and frameworks for effective logging, alerting, and visualization of system metrics.
- Mentorship : Ability to mentor juniors in the team. Ensure that designs and code align with the overall architecture and best practices.
Competencies :
- Problem solving - You can analyze complex business problems and translate them into scalable, efficient technical solutions. You generate new and innovative approaches to solving problems.
- Analytical skills - You are able to structure and process qualitative and quantitative data and draw insightful conclusions from it. You exhibit a probing mind and achieve penetrating insights.
- Persistence - You possess a strong willingness to work hard. You demonstrate tenacity and willingness to go the distance to get something done.
- Action bias - You have a tendency to take initiative and make decisions swiftly. You proactively move forward and tackle challenges, even in uncertain situations.
- Learns quickly - You have the ability to learn quickly and proficiently understand and absorb new information. You stay updated with emerging technologies and industry trends to drive continuous improvement.
- You exhibit passion and excitement over work. You have a can-do attitude.
- You produce significant output with minimal wasted effort.
- Adaptability - You adjust quickly to changing priorities and conditions and cope effectively with complexity and change.
Experience :
- 4-6 years of total experience working with backend systems at scale, preferably on AI / LLM / Deep Learning systems
(ref : hirist.tech)