Talent.com
SRE DevOps Lead Engineer - AI SaaS Fintech Product Domain

TwinPacs Sdn Bhd, Hyderabad
30+ days ago
Job description

We have an exciting role, described below, in Hyderabad with an AI SaaS Fintech product firm.

SRE DevOps Lead Engineer (SaaS) || 8-12 Years || Hyderabad (Hybrid) || Quick Starter

Key Responsibilities :

  • Architect, design, and deploy end-to-end infrastructure solutions for a multi-tenant microservices-based SaaS application with a focus on AI / ML model integration.
  • Ensure system reliability, scalability, performance, and security, specifically enhancing AI / ML processing pipelines and workflows.
  • Utilize Terraform scripting for on-demand environment provisioning within the AWS cloud, optimized for AI / ML workloads.
  • Implement and refine monitoring and alerting systems across application, network, and OS layers to support AI model operations and data processing.
  • Diagnose, support, and resolve production issues and alerts, participating in a 24 / 7 on-call rotation to maintain seamless AI / ML service operations.

Scope Of Work :

  • Actively participate in the Scrum team, delivering test automation for sprint features and ensuring high-quality product increments by certifying new and regression features using automated test suites
  • Integrate automated tests into the CI / CD pipeline and schedule them to run periodically in product development environments
  • Identify defects, collaborate with development engineers to resolve them, and verify the fixes
  • Maintain continuous availability in alignment with startup culture, staying informed and up to date with communications across various channels and email threads
  • Focus on the primary goal of minimizing customer-reported bugs to near zero.

Required Qualifications :

  • 8+ years of experience in Site Reliability Engineering (SRE) and DevOps roles with a track record of managing large-scale enterprise SaaS services in production, including 1+ year in AI / ML infrastructure
  • Demonstrated expertise with AWS public cloud technologies, including extensive experience in deploying and managing large-scale container clusters using AWS EKS.
  • Skilled in Infrastructure as Code (IaC) using Terraform, and container technologies such as Docker and Kubernetes.
  • Proficient in scripting and programming for automation (Python, Bash, etc.), with strong Linux OS and networking fundamentals relevant to AI / ML workloads.
  • Experience in establishing monitoring systems to ensure high availability, performance, and security integrity, using tools like ELK Stack, CloudWatch, and others tailored for AI / ML monitoring.
  • Hands-on experience managing microservices architecture SaaS products, enabling RESTful web services, SSO integration (Okta, Auth0), and utilizing cloud databases like EC2-RDS, MySQL, and Elasticsearch, especially in AI / ML deployments.
  • Proficient in backup and disaster recovery strategies specific to AI / ML data resources like RDS and Elasticsearch.
  • AWS Certified Solutions Architect is strongly preferred.
  • Self-driven, proactive, and adaptable to thrive in an early-stage startup environment, with a keen interest in integrating AI / ML technologies into modern SaaS solutions.
  • Strong preference for applicants with a stable career (consistent employment) and a notice period (NP) of 0-30 days only.
  • (ref : hirist.tech)
