Job Description:
Senior Infrastructure Automation Engineer (Zero-Touch GPU Cloud Stack – Linux Image Lifecycle)
We are seeking a Senior Infrastructure Automation Engineer with 10+ years of experience to lead the design and implementation of a Zero-Touch Build, Upgrade, and Certification pipeline for our on-prem GPU cloud infrastructure. This role focuses on automating the full stack, from hardware provisioning through OS and Kubernetes deployment, using 100% GitOps workflows. The candidate will bring deep expertise in Linux systems automation, image management, and compliance hardening, with a strong foundation in infrastructure engineering.
Key Responsibilities
Architect and implement a fully automated, GitOps-based pipeline for building, upgrading, and certifying the Linux operating system layer in the GPU cloud stack (hardware → OS → Kubernetes → platform).
Design and automate Linux image builds using Packer, Kickstart, and Ansible.
Integrate CIS / STIG compliance hardening and OpenSCAP scanning directly into the image lifecycle and validation workflows.
Own and manage kernel module / driver automation, ensuring version compatibility and hardware enablement for GPU nodes.
Collaborate with platform, SRE, and security teams to standardize image build and deployment practices across the stack.
Maintain GitOps-compliant infrastructure-as-code repositories, ensuring traceability and reproducibility of all automation logic.
Build self-service capabilities and frameworks for zero-touch provisioning, image certification, and drift detection.
Mentor junior engineers and contribute to strategic automation roadmap initiatives.
Required Skills & Experience
10+ years of hands-on experience in Linux infrastructure engineering, system automation, and OS lifecycle management.
Primary skills required: Ansible, Python, Packer, Kickstart, and OpenSCAP.
Deep expertise with:
Packer for automated image builds
Kickstart for unattended OS provisioning
OpenSCAP for security compliance and policy enforcement
Ansible for configuration management and post-build customization
Strong understanding of CIS / STIG hardening standards and their application in automated pipelines.
Experience with kernel and driver management, particularly in hardware-accelerated (GPU) environments.
Proven ability to implement GitOps workflows for infrastructure automation (e.g., Git-backed pipelines for image release and validation).
Solid knowledge of Linux internals, bootloaders, and provisioning mechanisms in bare-metal environments.
Exposure to Kubernetes, particularly in the context of OS-level customization and compliance.
Strong collaboration skills across teams including security, SRE, platform, and hardware engineering.
Bonus:
Familiarity with image signing, SBOM generation, or secure boot workflows
Experience working in regulated or compliance-heavy environments (e.g., FedRAMP, PCI-DSS)
Contributions to infrastructure automation frameworks or open-source tools
Eng • Bengaluru, India