Role : Technical Architect - Cloud & AI Infrastructure
Job Summary :
We are seeking a highly skilled and experienced Technical Architect - Cloud & AI Infrastructure to lead the design, development, and implementation of our scalable, secure, and high-performance cloud and Artificial Intelligence (AI) infrastructure. The ideal candidate will be a visionary leader responsible for architecting robust cloud environments that support advanced AI / ML workloads, ensuring operational excellence, cost efficiency, and seamless integration of cutting-edge technologies. This role demands deep expertise across major cloud platforms, infrastructure as code, containerization, and the specific requirements for deploying and managing AI / ML models at scale.
Key Responsibilities :
Architecture & Design :
- Lead the architectural design and evolution of highly scalable, secure, and resilient cloud infrastructure solutions across multiple cloud providers.
- Design and implement infrastructure specifically optimized for AI and Machine Learning (ML) workloads, including GPU orchestration, distributed training environments, and model serving platforms.
- Create detailed technical designs, reference architectures, and implementation roadmaps for cloud and AI infrastructure initiatives.
- Ensure all architectural designs adhere to best practices for security, compliance, performance, reliability, and cost efficiency.
Cloud Platform Expertise :
- Drive the adoption and utilization of core cloud services including compute, storage, networking, security, and identity management.
- Implement Infrastructure as Code (IaC) solutions using tools such as Terraform or CloudFormation to automate provisioning and management.
- Architect and manage containerization and orchestration platforms using Docker and Kubernetes.
- Develop strategies for hybrid cloud environments and cloud migration.
AI / ML Infrastructure :
- Design and implement MLOps pipelines and platforms to automate the lifecycle of ML models, from experimentation to deployment and monitoring.
- Architect data pipelines and storage solutions optimized for large-scale AI training data and real-time inference.
- Evaluate and integrate AI / ML-specific cloud services and frameworks to accelerate development and deployment of intelligent applications.
- Ensure secure and efficient management of AI / ML models, datasets, and compute resources.
Strategy & Roadmap :
- Define the technical strategy and roadmap for cloud and AI infrastructure, evaluating new technologies and recommending their adoption to meet future business needs.
- Provide expert guidance on cloud adoption frameworks, AI infrastructure best practices, and industry trends.
- Influence technology choices and architectural decisions across various engineering teams.
Collaboration & Leadership :
- Collaborate closely with engineering teams, data scientists, product managers, and security professionals to understand requirements and deliver integrated solutions.
- Lead cross-functional technical discussions and drive consensus on complex architectural challenges.
- Mentor junior architects and engineers, fostering a culture of continuous learning and technical excellence.
Optimization & Governance :
- Implement strategies for cloud cost optimization, resource management, and governance across all cloud environments.
- Establish and enforce security policies and compliance frameworks for cloud and AI infrastructure.
- Monitor system performance, identify bottlenecks, and implement solutions to optimize the efficiency and reliability of infrastructure.
Required Skills and Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Extensive experience in technical architecture roles, with a strong focus on cloud infrastructure.
- Proven experience in designing and implementing infrastructure specifically for Artificial Intelligence and Machine Learning workloads.
- Deep expertise in at least one major cloud platform (AWS, Azure, or Google Cloud), with strong familiarity across others.
- Hands-on experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Expert knowledge of containerization technologies (Docker) and orchestration with Kubernetes.
- Strong understanding of MLOps principles, tools, and platforms.
- Proficiency in programming and scripting languages including Python and shell scripting.
- Solid understanding of networking, security, and database concepts in a cloud environment.
- Excellent analytical, problem-solving, and communication skills, with the ability to articulate complex technical concepts to diverse audiences.
- Demonstrated leadership qualities and the ability to drive technical initiatives.
Preferred Skills :
- Professional certifications in cloud architecture (AWS Certified Solutions Architect Professional, Microsoft Certified Azure Solutions Architect Expert, or Google Cloud Professional Cloud Architect).
- Experience with specific AI / ML frameworks including TensorFlow or PyTorch.
- Familiarity with big data technologies such as Apache Spark or Hadoop.
- Experience with serverless computing paradigms.
- Knowledge of FinOps principles for cloud cost management.
(ref : hirist.tech)