Infrastructure Solutions Architect

BayOne Solutions • Thrissur, Kerala, India
2 days ago
Job description

About the Role

We are looking for a Systems or Solutions Architect with deep expertise in networking, infrastructure-as-a-service (IaaS), and cloud-scale system design to help architect and optimize AI/ML infrastructure.

The ideal candidate combines strong fundamentals in cloud architecture (AWS or equivalent), networking, compute, and storage with hands-on experience in Kubernetes, observability, and automation.

You’ll design scalable systems that support large AI workloads, enabling efficient training, inference, and data pipelines across distributed environments.

Key Responsibilities

Architect and scale AI/ML infrastructure across public cloud (AWS/Azure/GCP) and hybrid environments.

Design and optimize compute, storage, and network topologies for distributed training and inference clusters.

Build and manage containerized environments using Kubernetes, Docker, and Helm.

Develop automation frameworks for provisioning, scaling, and monitoring infrastructure using Python, Go, and IaC (Terraform/CloudFormation).

Partner with data science and MLOps teams to align on AI infrastructure requirements (GPU/CPU scaling, caching, throughput, latency).

Implement observability, logging, and tracing using Prometheus, Grafana, CloudWatch, or OpenTelemetry.

Drive networking automation (BGP, routing, load balancing, VPNs, service meshes) using software-defined networking (SDN) and modern APIs.

Lead performance, reliability, and cost-optimization efforts for AI training and inference pipelines.

Collaborate cross-functionally with product, platform, and operations teams to ensure secure, performant, and resilient infrastructure.

Required Qualifications

Knowledge of AI/ML infrastructure patterns, including distributed training, inference pipelines, and GPU orchestration.

Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.

10+ years of experience in systems, infrastructure, or solutions architecture roles.

Deep understanding of:

Cloud architecture: AWS (preferred), Azure, or GCP

Networking: VPC, Transit Gateway, DNS, routing, peering, load balancing, VPN

Compute and storage: EC2, ECS/EKS, S3, EBS, EFS, FSx, caching systems

Core infrastructure: virtualization, containers, distributed systems, and OS-level tuning

Proficiency in Linux systems engineering and scripting with Python and Bash.

Experience with Kubernetes (EKS/GKE/AKS) for large-scale workload orchestration.

Experience with Go (Golang) for infrastructure or network automation.

Familiarity with Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation.

Experience implementing monitoring and observability systems (Prometheus, Grafana, ELK, Datadog, CloudWatch).

Preferred Qualifications

Experience with DevOps and MLOps ecosystems (SageMaker, Kubeflow, MLflow, Airflow).

AWS or cloud certifications such as Solutions Architect Professional or Advanced Networking Specialty.

Experience in performance benchmarking, security hardening, and cost optimization for compute-intensive workloads.

Strong collaboration skills and ability to communicate complex infrastructure concepts clearly.
