Job Description:
Senior Infrastructure Automation Engineer (Zero-Touch GPU Cloud Stack – Linux Image Lifecycle)
We are seeking a Senior Infrastructure Automation Engineer with 10+ years of experience to lead the design and implementation of a Zero-Touch Build, Upgrade, and Certification pipeline for our on-prem GPU cloud infrastructure. This role focuses on automating the full stack, from hardware provisioning through OS and Kubernetes deployment, using 100% GitOps workflows. The candidate will bring deep expertise in Linux systems automation, image management, and compliance hardening, with a strong foundation in infrastructure engineering.
Key Responsibilities
Architect and implement a fully automated, GitOps-based pipeline for building, upgrading, and certifying the Linux operating system layer in the GPU cloud stack (hardware → OS → Kubernetes → platform).
Design and automate Linux image builds using Packer, Kickstart, and Ansible.
Integrate CIS / STIG compliance hardening and OpenSCAP scanning directly into the image lifecycle and validation workflows.
Own and manage kernel module / driver automation, ensuring version compatibility and hardware enablement for GPU nodes.
Collaborate with platform, SRE, and security teams to standardize image build and deployment practices across the stack.
Maintain GitOps-compliant infrastructure-as-code repositories, ensuring traceability and reproducibility of all automation logic.
Build self-service capabilities and frameworks for zero-touch provisioning, image certification, and drift detection.
Mentor junior engineers and contribute to strategic automation roadmap initiatives.
Required Skills & Experience
10+ years of hands-on experience in Linux infrastructure engineering, system automation, and OS lifecycle management.
Primary skills required: Ansible, Python, Packer, Kickstart, and OpenSCAP.
Deep expertise with:
Packer for automated image builds
Kickstart for unattended OS provisioning
OpenSCAP for security compliance and policy enforcement
Ansible for configuration management and post-build customization
Strong understanding of CIS / STIG hardening standards and their application in automated pipelines.
Experience with kernel and driver management, particularly in hardware-accelerated (GPU) environments.
Proven ability to implement GitOps workflows for infrastructure automation (e.g., Git-backed pipelines for image release and validation).
Solid knowledge of Linux internals, bootloaders, and provisioning mechanisms in bare-metal environments.
Exposure to Kubernetes, particularly in the context of OS-level customization and compliance.
Strong collaboration skills across teams including security, SRE, platform, and hardware engineering.
Bonus:
Familiarity with image signing, SBOM generation, or secure boot workflows
Experience working in regulated or compliance-heavy environments (e.g., FedRAMP, PCI-DSS)
Contributions to infrastructure automation frameworks or open-source tools
Eng • Bengaluru, Karnataka, India