Overview
Lead the design, development, and optimization of scalable, high-performance AI / ML platforms, integrating cloud services and ensuring seamless interaction with front-end applications.
Responsibilities
- Spearhead the development and implementation strategy for backend APIs, focusing on creating scalable, reliable, and high-performance solutions that align with business goals.
- Oversee the management of Azure Kubernetes Service (AKS) deployments, and the equivalent on AWS (EKS), optimizing container orchestration to ensure seamless application scaling and performance.
- Oversee the management and automation of pipelines for data storage, movement, and retrieval across data types (real-time, audit, and fast-moving messages) and formats (structured data in SQL storage, unstructured data in BLOB storage, and semi-structured data such as JSON in NoSQL databases), both between edge and cloud and between cloud applications, using file, service / API, message bus, and bulk-movement options.
- Direct cloud operations to achieve optimal system performance, security, and cost-efficiency, leveraging Azure cloud services and infrastructure.
- Champion DevOps practices across the team, promoting automation, continuous integration (CI), and continuous deployment (CD) to enhance operational efficiency.
- Extend CI / CD automation to cover deployment strategies and automation for ML components (e.g., building and automating MLOps pipelines).
- Skills with GitHub, Ansible, ADO, and Terraform are desirable, as is the ability to integrate QA tools such as SonarQube and Snyk.
- Contribute to a robust release management strategy and ensure that the CI / CD pipeline aligns with complex global and local release management strategies.
- Ensure that test automation, deployment automation (including edge deployments), and observability are an integral part of release management and CI / CD pipelines.
- Facilitate collaboration among cross-functional teams, ensuring the seamless integration of backend services with frontend platforms and other system components.
- Establish and rigorously monitor service-level objectives (SLOs) and key performance indicators (KPIs) to maintain and improve backend system and cloud infrastructure performance.
- Lead incident response initiatives and conduct thorough post-incident analyses to identify root causes and implement preventive strategies, ensuring system integrity.
- Drive continuous improvement efforts, focusing on system efficiency, downtime reduction, and resource optimization.
- Maintain up-to-date knowledge of the latest backend technologies, Azure services, and DevOps tools, integrating innovative practices into our operations.
- Cultivate a team culture that values innovation, excellence, and continuous learning, encouraging team development and the adoption of best practices.
Qualifications
- Bachelor's or master's degree in computer science, engineering, or a closely related field.
- A minimum of 8 years of experience in backend API development, cloud infrastructure management (particularly Azure and AWS, including AKS and EKS), and CI / CD automation.
- Demonstrated proficiency in DevOps methodologies and cloud process management, with a strong technical foundation in Azure cloud services, Kubernetes, and API development, and a solid understanding of security, scalability, and high availability in cloud environments.
- Hands-on experience with CI / CD pipelines, automation tools, and cloud-native technologies involving complex edge deployments and multiple-version release management.
- Hands-on experience with Blue-Green and Canary deployments (strategy, design, and development) and with incorporating learnings into the release cycle in real time.
- Proven leadership capabilities in managing complex technical projects and guiding high-performing teams towards operational excellence.
- Exceptional problem-solving skills and a strategic mindset, capable of addressing technical challenges effectively.
- Excellent communication and collaboration abilities, with a proven track record of engaging successfully with various stakeholders.
- A history of driving operational improvements and sustaining high-performance backend systems, demonstrating a commitment to innovation and quality.
Skills Required
Terraform, Azure Cloud Services, Kubernetes, GitHub, Ansible, API Development, MLOps