About Company : Our Client Corporation provides digital engineering and technology services to Forbes Global 2000 companies worldwide. Our Engineering First approach ensures we can execute all ideas and creatively solve pressing business challenges. With industry expertise and empowered agile teams, we prioritize execution early in the process for impactful results. We combine logic, creativity and curiosity to build, solve, and create. Every day, we help clients engage with new technology paradigms, creatively building solutions that solve their most pressing business challenges and move them to the forefront of their industry.
Job Title : Senior Kubernetes Platform Engineer
Job Locations : Bengaluru
Experience : 8+ years
Education Qualification : Any graduate degree
Work Mode : Hybrid
Employment Type : Contract
Notice Period : Immediate - 10 Days
Job description
Senior Kubernetes Platform Engineer (Zero-Touch GPU Cloud – GitOps Automation)
We are looking for a Senior Kubernetes Platform Engineer with 10+ years of infrastructure experience to design and implement the Zero-Touch Build, Upgrade, and Certification pipeline for our on-premises GPU cloud platform. This role focuses on automating the Kubernetes layer and its dependencies (e.g., GPU drivers, networking, runtime) using 100% GitOps workflows. You will work across teams to deliver a fully declarative, scalable, and reproducible infrastructure stack, from hardware to Kubernetes and platform services.
Key Responsibilities
- Architect and implement GitOps-driven Kubernetes cluster lifecycle automation using tools like kubeadm, ClusterAPI, Helm, and Argo CD.
- Develop and manage declarative infrastructure components for:
  - GPU stack deployment (e.g., NVIDIA GPU Operator)
  - Container runtime configuration (Containerd)
  - Networking layers (CNI plugins like Calico, Cilium, etc.)
- Lead automation efforts to enable zero-touch upgrades and certification pipelines for Kubernetes clusters and associated workloads.
- Maintain Git-backed sources of truth for all platform configurations and integrations.
- Standardize deployment practices across multi-cluster GPU environments, ensuring scalability, repeatability, and compliance.
- Drive observability, testing, and validation as part of the continuous delivery process (e.g., cluster conformance, GPU health checks).
- Collaborate with infrastructure, security, and SRE teams to ensure seamless handoffs between lower layers (hardware / OS) and the Kubernetes platform.
- Mentor junior engineers and contribute to the platform automation roadmap.
Required Skills & Experience
- 10+ years of hands-on experience in infrastructure engineering, with a strong focus on Kubernetes-based environments.
- Primary key skills required are Kubernetes API, Helm templating, Argo CD GitOps integration, Go / Python scripting, and Containerd.
- Deep knowledge and hands-on experience with:
  - Kubernetes cluster management (kubeadm, ClusterAPI)
  - Argo CD for GitOps-based delivery
  - Helm for application and cluster add-on packaging
  - Containerd as a container runtime and its integration in GPU workloads
- Experience deploying and operating the NVIDIA GPU Operator or equivalent in production environments.
- Solid understanding of CNI plugin ecosystems, network policies, and multi-tenant networking in Kubernetes.
- Strong GitOps mindset with experience managing infrastructure as code through Git-based workflows.
- Experience building Kubernetes clusters in on-prem environments (vs. managed cloud services).
- Proven ability to scale and manage multi-cluster, GPU-accelerated workloads with high availability and security.
- Solid scripting and automation skills (Bash, Python, or Go).
- Familiarity with Linux internals, systemd, and OS-level tuning for container workloads.
Bonus :
- Experience with custom controllers, operators, or Kubernetes API extensions
- Contributions to Kubernetes or CNCF projects
- Exposure to service meshes, ingress controllers, or workload identity providers