Job description

We’re Hiring

We are looking for a senior Cloud Infrastructure Operations leader to define and run a scalable cloud operating model across private cloud, hybrid cloud, Kubernetes, AI/GPU infrastructure, and edge environments. This is a foundational leadership role focused on building operational discipline, reliability standards, and end-to-end service operations for enterprise-grade infrastructure platforms.

Key Responsibilities
• Define and own the cloud operating model across all infrastructure environments
• Establish standards for reliability, observability, incident/problem management, change control, and release readiness
• Build operational frameworks for compute, storage, networking, Kubernetes, and GPU infrastructure
• Design SLO/SLA frameworks, runbooks, escalation models, and on-call structures
• Partner with engineering, product, delivery, and security teams for production readiness
• Drive capacity planning, cost optimization, patching, backup, and disaster recovery
• Standardize monitoring, logging, alerting, and operational dashboards
• Lead RCA discipline and reduce recurring operational failures
• Improve day-two operations maturity for enterprise-grade deployments

What We’re Looking For
• Strong experience in cloud/platform infrastructure operations leadership
• Background in SRE, DevOps, or large-scale production systems
• Experience across hybrid cloud and Kubernetes environments
• Ability to design and implement operational governance frameworks
• Strong focus on reliability engineering and operational maturity
• Experience moving teams from reactive firefighting to proactive operations

Nice to Have
• Experience in sovereign, regulated, or enterprise-grade environments
• Exposure to GPU / AI infrastructure operations
• Experience with ITSM transformations and enterprise tooling
• Familiarity with multi-cloud or edge computing environments

If you’re interested or know someone who fits this profile, please reach out.