Validation & Performance Automation Eng

LTIMindtree, India
8 days ago
Job Description:

Senior Infrastructure Test & Validation Engineer (Zero-Touch GPU Cloud – GitOps Validation & Certification)

We are seeking a Senior Infrastructure Test & Validation Engineer with 10+ years of experience to lead the Zero-Touch Validation, Upgrade, and Certification automation of our on-prem GPU cloud platform. This role focuses on ensuring the stability, performance, and conformance of the entire stack, from hardware to Kubernetes, using automated, GitOps-based validation pipelines. The ideal candidate has a strong infrastructure background with deep hands-on skills in Sonobuoy, LitmusChaos, k6, and pytest, and is passionate about automated test orchestration, platform resilience, and continuous conformance.

Key Responsibilities

Design and implement automated, GitOps-compliant pipelines for validation and certification of the GPU cloud stack across hardware, OS, Kubernetes, and platform layers.

Integrate Sonobuoy for Kubernetes conformance and certification testing.

Design and orchestrate chaos engineering workflows using LitmusChaos to validate system resilience across failure scenarios.

Implement performance testing suites using k6 and system-level benchmarks, integrated into CI/CD pipelines.

Develop and maintain end-to-end test frameworks using pytest and/or Go, focusing on cluster lifecycle events, upgrade paths, and GPU workloads.

Ensure test coverage and validation across multiple dimensions: conformance, performance, fault injection, and post-upgrade validation.

Build and maintain dashboards and reporting for automated test results, including traceability, drift detection, and compliance tracking.

Collaborate with infrastructure, SRE, and platform teams to embed testing and validation early in the deployment lifecycle.

Own quality assurance gates for all automation-driven deployments.

Required Skills & Experience

10+ years of hands-on experience in infrastructure engineering, systems validation, or SRE roles.

Primary key skills required: pytest, Go, k6 scripting, automation framework integration (Sonobuoy, LitmusChaos), and CI integration.

Strong experience with:

Sonobuoy for Kubernetes conformance and diagnostics

LitmusChaos for fault injection and resilience validation

k6 for performance/load testing in distributed environments

pytest or Go-based test frameworks for automation and validation scripting

Deep understanding of Kubernetes architecture, upgrade patterns, and operational risks.

Experience validating infrastructure components (GPU drivers, kernel modules, CNI, CRI, etc.) across lifecycle events.

Proficient in GitOps workflows and integrating tests into declarative, Git-backed pipelines (e.g., with Argo CD, Flux).

Hands-on experience with CI/CD systems (e.g., GitHub Actions, GitLab CI, Jenkins) to automate test orchestration.

Solid scripting and automation experience (Python, Bash, or Go).

Familiarity with GPU-based infrastructure and its performance characteristics is a strong plus.

Strong debugging, root cause analysis, and incident investigation skills.
