Responsibilities:
System Administration & Maintenance: Install, configure, and maintain HPC clusters (hardware, software, operating systems), perform regular updates/patching, manage user accounts and permissions, and troubleshoot/resolve hardware or software issues.
Performance & Optimization: Monitor and analyse system and application performance, identify bottlenecks, implement tuning solutions, and profile workloads to improve efficiency.
Cluster & Resource Management: Manage and optimize job scheduling, resource allocation, and cluster operations using tools such as Slurm, LSF, Bright Cluster Manager / Base Command Manager, OpenHPC, and Warewulf.
Networking & Interconnects: Configure, manage, and tune Linux networking (TCP/IP, DNS, routing) and high-speed HPC interconnects (InfiniBand, Ethernet) to ensure low-latency, high-bandwidth communication.
Storage & Data Management: Implement and maintain large-scale storage and parallel file systems (Lustre, Ceph, GPFS), ensure data integrity, manage backups, and support disaster recovery.
Security & Authentication: Implement security controls, ensure compliance with policies, and manage authentication and directory services such as LDAP and Active Directory.
DevOps & Automation: Use configuration management and DevOps practices (Ansible, Terraform, Jenkins, Git) to automate deployments, application packaging (RPM/DEB), and system configurations.
User Support & Collaboration: Provide technical support, documentation, and training to researchers, and collaborate with scientists, HPC architects, and engineers to align infrastructure with research needs.
Planning & Innovation: Contribute to the design and planning of HPC infrastructure upgrades, evaluate and recommend hardware/software solutions, and explore cloud-based HPC solutions where applicable.
Qualifications:
Skills Required
Git, Lustre, DNS, InfiniBand, Ansible, Active Directory, LDAP, LSF, GPFS, Ethernet, Bash, Python, Ceph, Terraform, Jenkins, RPM
Senior System Engineer • Gurgaon / Gurugram, India