```html
About the Company : UptimeAI is leading the way in predictive analytics and AI-driven solutions to optimize operational uptime and reduce downtime for industrial and enterprise clients. Our innovative platform harnesses cutting-edge data science to deliver actionable insights, ensuring maximum efficiency and reliability. UptimeAI uniquely combines Artificial Intelligence with Subject Matter Knowledge from 200+ years of cumulative experience to explain interrelations across upstream / downstream equipment, adapt to changes, identify problems, and provide prescriptive diagnoses as a human expert would.
About the Role : We are a fast-growing, AI-first SaaS startup backed by top-tier investors and operating across India and the US. Our platform helps enterprises optimize critical business functions using cutting-edge AI and automation. As we scale, we’re looking for a hands-on DevOps Engineer who thrives in startup environments and can take ownership of cloud infrastructure, deployment, and CI / CD workflows.
Responsibilities :
- Design, implement, and manage cloud infrastructure across Azure for both internal platforms and customer-specific deployments
- Configure and maintain VPCs, VPNs, and peering to enable secure, scalable, and isolated environments
- Build and automate CI / CD pipelines for application and ML workloads
- Manage multi-tenant and single-tenant deployments based on customer requirements
- Implement monitoring, alerting, logging, and disaster recovery strategies
- Work closely with engineering to ensure seamless Dev→Prod flows and secure release management
- Set up and manage infrastructure as code (e.g., Terraform, Pulumi, Bicep, CloudFormation)
- Optimize costs, performance, and availability for both internal and customer-facing cloud workloads
- Enforce security best practices, access control, and compliance across infrastructure
Qualifications :
- 3 - 8 years of experience as a DevOps / SRE / Cloud Engineer in high-growth SaaS or product startups
- AWS Certified (at least Solutions Architect - Associate) and Azure Certified (e.g., AZ-104 or higher)
- Strong experience with Azure networking, including : VPC, VPNs, Subnets, Route Tables, Security Groups, NAT Gateways
- Site-to-site VPN setups for enterprise customers
- Proven experience deploying applications to customer-controlled cloud environments (BYOC) and company-controlled SaaS environments
- Expertise with tools like : CI / CD : GitHub Actions, GitLab CI, Azure Pipelines; IaC : Terraform, Bicep, or Pulumi; Containers : Docker, Kubernetes (EKS / AKS preferred)
- Familiarity with Secrets Management, IAM, Role-based Access Control, and SSO / SAML integration
- Strong scripting skills in Bash, Python, or PowerShell
- Comfortable working in a fast-paced, ambiguous startup environment
Required Skills :
- Experience with AI / ML pipeline deployment or GPU workloads
- Exposure to SOC2, ISO27001, or GDPR compliance in a cloud environment
- Familiarity with tools like Prometheus, Grafana, Datadog, ELK, or Azure Monitor
Pay range and compensation package : Not specified in the provided job description.
Equal Opportunity Statement : UptimeAI is committed to diversity and inclusivity in the workplace.
Why Join UptimeAI :
- Impact Industry-Wide Change : Contribute to transformative solutions that significantly improve operational efficiency and reliability for global clients.
- Collaborative and Growth-Oriented Environment : Join a talented, passionate team that values innovation, continuous learning, and professional growth.
- Opportunities for Leadership and Innovation : Lead pioneering projects, influence product development, and shape the future of industrial AI solutions.
```