Role : Network Automation Engineer
Location : Bangalore Only
Years of Experience : 7 to 15 Years
Primary : Ansible / Python
Mandatory : Private Cloud & Networking skills
Job Description :
We are looking for an experienced Network Automation Engineer to design, implement, and optimize automation solutions for our Private Cloud datacenter network, which underpins large-scale AI / ML GPU and TPU workloads.
The role focuses on automating the configuration, provisioning, and monitoring of high-performance networking devices to ensure low latency, high throughput, and reliability in a mission-critical environment. It covers network device management as well as OS-level network configuration on servers. Expertise in Ansible and Python is essential, and experience with Golang is a strong plus.
Key Responsibilities :
- Develop and maintain network automation frameworks for large-scale datacenter environments supporting AI / ML workloads.
- Build Ansible playbooks, roles, and modules to automate device configurations, software upgrades, and compliance checks across multi-vendor environments.
- Design and implement Python-based automation scripts and tools to integrate with APIs, orchestration platforms, and monitoring systems.
- Automate OS core networking configurations on servers (Linux / Windows / Hypervisor) including bonding, VLANs, routing tables, kernel network parameters, MTU tuning, and NIC performance optimization.
- Collaborate with cloud infrastructure, network engineering, and DevOps teams to deliver seamless provisioning and scaling of GPU / TPU clusters.
- Ensure network automation solutions meet high-performance computing (HPC) requirements such as low latency, high throughput, and fault tolerance.
- Participate in network architecture reviews to provide automation insights and recommendations.
- Document automation processes, workflows, and operational guidelines for the datacenter network.
- Stay updated on emerging technologies in network automation, SDN, and private cloud networking.
Required Skills & Experience :
- Expertise in Ansible (playbook development, dynamic inventory, custom modules) for large-scale network automation.
- Strong proficiency in Python for scripting, API integrations (REST, NETCONF, gNMI), and device interaction (e.g., NAPALM, Netmiko, Paramiko).
- Hands-on experience with high-performance datacenter networking devices (Cisco Nexus, Arista, Juniper, Mellanox / NVIDIA Networking).
- Knowledge of Linux / Windows / Hypervisor OS core networking, including :
  - Network stack configuration (sysctl tuning, TCP / UDP parameters).
  - NIC bonding, SR-IOV, DPDK, and kernel bypass techniques.
  - VLANs, routing tables, MTU adjustments, jumbo frames.
  - Performance tuning for HPC / AI workloads.
- Deep understanding of networking concepts including BGP, EVPN-VXLAN, MPLS, QoS, and leaf-spine architectures.
- Experience in Private Cloud environments with a focus on supporting HPC / AI workloads.
- Familiarity with CI / CD pipelines (GitLab, Jenkins) for deploying automation at scale.
- Knowledge of network observability, telemetry, and streaming protocols (gRPC, sFlow, SNMP, InfluxDB, Prometheus).
- Strong problem-solving skills and ability to operate in a high-availability, mission-critical datacenter environment.
Good to Have :
- Golang experience for building scalable and high-performance automation tools.
- Familiarity with Infrastructure-as-Code (IaC) tools like Terraform or Pulumi.
- Exposure to Kubernetes networking (CNI plugins) and containerized workloads.
- Understanding of AI / ML workload characteristics and their impact on network design and performance.
- Experience with SDN solutions (e.g., Cisco ACI, VMware NSX, NVIDIA Cumulus).
(ref : hirist.tech)