Lead Solutions Architect – AI Infrastructure & Private Cloud

Tekskills Inc. | Dindigul, Tamil Nadu, India
23 hours ago
Job description

Job Title: Lead Solutions Architect – AI Infrastructure & Private Cloud

Location: Bengaluru (Electronic City)

Experience: 10–15 Years (Lead / Architect Level)

Position Type: Full-Time | Immediate Joiners Preferred

Criticality: High

Role Overview:

We are seeking a Lead Solutions Architect specializing in AI Infrastructure and Private Cloud to design and deliver scalable, high-performance compute environments for machine learning, deep learning, and AI workloads. The ideal candidate will have deep expertise in Kubernetes, container orchestration, GPU/TPU acceleration, and HPC (High Performance Computing) architectures, enabling AI-driven innovation across enterprise platforms.

Key Responsibilities:

Architect, design, and implement AI/ML infrastructure solutions across private and hybrid cloud environments.

Lead the setup and optimization of Kubernetes Landing Zones, including cluster design, multi-tenancy, and security.

Manage containerized workloads using container and orchestration tools (Kubernetes, Docker, Podman, OpenShift).

Integrate AI accelerators (NVIDIA GPUs, TPUs) for ML/DL model training and inference.

Enable deployment of deep learning models with a focus on hardware acceleration, scalability, and performance tuning.

Build and maintain edge and cloud-native deployment pipelines for AI workloads.

Collaborate with AI/ML and DevOps teams to ensure robust CI/CD workflows for model deployment.

Drive HPC architecture design, including compute, storage, networking, and scheduling (SLURM, PBS, etc.).

Optimize HPC and AI infrastructure for cost, performance, and resource utilization.

Provide technical leadership in evaluating and integrating emerging technologies (AI frameworks, MLOps platforms, accelerator hardware).

Define standards, documentation, and best practices for AI infrastructure operations.

Required Technical Skills:

Containerization & Orchestration: Kubernetes, Docker, Helm, OpenShift, Rancher

Cloud Platforms: AWS, Azure, GCP (Private & Hybrid Cloud expertise preferred)

AI/ML Infrastructure: NVIDIA GPU integration, CUDA, TensorRT, TPUs, PyTorch/TensorFlow deployment

High Performance Computing (HPC): HPC architecture, schedulers (SLURM, PBS), parallel computing, storage & network optimization

DevOps & CI/CD: GitHub Actions, Jenkins, ArgoCD, Terraform, Ansible

Monitoring & Observability: Prometheus, Grafana, ELK Stack

Scripting / Programming: Python, Bash, YAML, Go (preferred)

Desired Skills:

Experience with RAG/LLM deployment pipelines or AI workload orchestration

Knowledge of edge computing and distributed inference systems

Exposure to AI model lifecycle management (MLOps)

Strong problem-solving, leadership, and cross-functional collaboration skills
