Lead Solutions Architect – AI Infrastructure & Private Cloud

Tekskills Inc. • Nadiad, Gujarat, IN
12 hours ago
Job description

Job Title: Lead Solutions Architect – AI Infrastructure & Private Cloud

Location: Bengaluru (Electronic City)

Experience: 10–15 Years (Lead / Architect Level)

Position Type: Full-Time | Immediate Joiners Preferred

Criticality: High

Role Overview:

We are seeking a Lead Solutions Architect specializing in AI Infrastructure and Private Cloud to design and deliver scalable, high-performance compute environments for machine learning, deep learning, and AI workloads. The ideal candidate will have deep expertise in Kubernetes, container orchestration, GPU/TPU acceleration, and HPC (High Performance Computing) architectures, enabling AI-driven innovation across enterprise platforms.

Key Responsibilities:

  • Architect, design, and implement AI/ML infrastructure solutions across private and hybrid cloud environments.
  • Lead setup and optimization of Kubernetes Landing Zones, including cluster design, multi-tenancy, and security.
  • Manage containerized workloads using orchestration tools (Kubernetes, Docker, Podman, OpenShift).
  • Integrate AI accelerators (NVIDIA GPUs, TPUs) for ML/DL model training and inference.
  • Enable deployment of deep learning models with a focus on hardware acceleration, scalability, and performance tuning.
  • Build and maintain edge and cloud-native deployment pipelines for AI workloads.
  • Collaborate with AI/ML and DevOps teams to ensure robust CI/CD workflows for model deployment.
  • Drive HPC architecture design, including compute, storage, networking, and scheduling (SLURM, PBS, etc.).
  • Optimize HPC and AI infrastructure for cost, performance, and resource utilization.
  • Provide technical leadership in evaluating and integrating emerging technologies (AI frameworks, MLOps platforms, accelerator hardware).
  • Define standards, documentation, and best practices for AI infrastructure operations.

Required Technical Skills:

  • Containerization & Orchestration: Kubernetes, Docker, Helm, OpenShift, Rancher
  • Cloud Platforms: AWS, Azure, GCP (Private & Hybrid Cloud expertise preferred)
  • AI/ML Infrastructure: NVIDIA GPU integration, CUDA, TensorRT, TPUs, PyTorch/TensorFlow deployment
  • High Performance Computing (HPC): HPC architecture, schedulers (SLURM, PBS), parallel computing, storage & network optimization
  • DevOps & CI/CD: GitHub Actions, Jenkins, ArgoCD, Terraform, Ansible
  • Monitoring & Observability: Prometheus, Grafana, ELK Stack
  • Scripting/Programming: Python, Bash, YAML, Go (preferred)

Desired Skills:

  • Experience with RAG/LLM deployment pipelines or AI workload orchestration
  • Knowledge of edge computing and distributed inference systems
  • Exposure to AI model lifecycle management (MLOps)
  • Strong problem-solving, leadership, and cross-functional collaboration skills