Talent.com
Lead Solutions Architect – AI Infrastructure & Private Cloud

Tekskills Inc. | Agra, Uttar Pradesh, India
12 hours ago
Job description

Job Title: Lead Solutions Architect – AI Infrastructure & Private Cloud

Location: Bengaluru (Electronic City)

Experience: 10–15 Years (Lead / Architect Level)

Position Type: Full-Time | Immediate Joiners Preferred

Criticality: High

Role Overview:

We are seeking a Lead Solutions Architect specializing in AI Infrastructure and Private Cloud to design and deliver scalable, high-performance compute environments for machine learning, deep learning, and AI workloads. The ideal candidate will have deep expertise in Kubernetes, container orchestration, GPU/TPU acceleration, and HPC (High Performance Computing) architectures, enabling AI-driven innovation across enterprise platforms.

Key Responsibilities:

Architect, design, and implement AI/ML infrastructure solutions across private and hybrid cloud environments.

Lead setup and optimization of Kubernetes Landing Zones, including cluster design, multi-tenancy, and security.

Manage containerized workloads using container runtimes and orchestration platforms (Docker, Podman, Kubernetes, OpenShift).

Integrate AI accelerators (NVIDIA GPUs, TPUs) for ML/DL model training and inference.

Enable deployment of deep learning models with a focus on hardware acceleration, scalability, and performance tuning.

Build and maintain edge and cloud-native deployment pipelines for AI workloads.

Collaborate with AI/ML and DevOps teams to ensure robust CI/CD workflows for model deployment.

Drive HPC architecture design, including compute, storage, networking, and scheduling (SLURM, PBS, etc.).

Optimize HPC and AI infrastructure for cost, performance, and resource utilization.

Provide technical leadership in evaluating and integrating emerging technologies (AI frameworks, MLOps platforms, accelerator hardware).

Define standards, documentation, and best practices for AI infrastructure operations.

Required Technical Skills:

Containerization & Orchestration: Kubernetes, Docker, Helm, OpenShift, Rancher

Cloud Platforms: AWS, Azure, GCP (Private & Hybrid Cloud expertise preferred)

AI/ML Infrastructure: NVIDIA GPU integration, CUDA, TensorRT, TPUs, PyTorch/TensorFlow deployment

High Performance Computing (HPC): HPC architecture, schedulers (SLURM, PBS), parallel computing, storage & network optimization

DevOps & CI/CD: GitHub Actions, Jenkins, ArgoCD, Terraform, Ansible

Monitoring & Observability: Prometheus, Grafana, ELK Stack

Scripting / Programming: Python, Bash, YAML, Go (preferred)

Desired Skills:

Experience with RAG/LLM deployment pipelines or AI workload orchestration

Knowledge of edge computing and distributed inference systems

Exposure to AI model lifecycle management (MLOps)

Strong problem-solving, leadership, and cross-functional collaboration skills
