Talent.com
Senior HPC Administrator
Concept Information Technologies (I) Pvt. Ltd. • Ajmer, Rajasthan, India
No longer accepting applications
4 days ago
Job description

Company Description

Concept Information Technologies (I) Pvt. Ltd., headquartered in Pune, is a leading IT solutions and system integration partner. We deliver scalable and cost-effective solutions in High-Performance Computing, Disaster Recovery, Enterprise Networking, Cybersecurity and Software Development. Backed by partnerships with HPE, Cisco, IBM and others, we combine industry expertise with advanced technology to drive measurable business value.

Location: Pune (with occasional project-based travel to Mumbai, Bengaluru, or Hyderabad)

Role Description

We’re seeking an experienced HPC Administrator to design, deploy and manage large-scale high-performance computing environments. This role requires deep technical expertise, hands-on cluster management experience and the ability to optimize performance for demanding workloads.

Key Responsibilities:

Design, deploy, and maintain HPC clusters and supporting infrastructure.

Manage schedulers such as SLURM, PBS Pro, or LSF.

Manage job queues, partitions, and scheduling policies to ensure efficient workload distribution.

Support user issues related to job submissions, resource requests, and job scripts.

Work proficiently with Docker, Kubernetes, and other container technologies.

Maintain containerized environments using Docker and Enroot.

Optimize cluster performance and resource utilization.

Develop and manage workflows for submitting container-based jobs to compute clusters using SLURM or similar job schedulers.

Support and troubleshoot third-party HPC software.

Compile and deploy open-source software.

Integrate GPU-based workloads with SLURM, PBS, or similar job scheduling systems.

Understanding of MPI, including Intel MPI.

Understanding of user authentication methods such as IPA/IdM, NIS, and LDAP.

Expert knowledge of parallel distributed file systems such as Lustre, IBM GPFS, and BeeGFS.

Implement backup, disaster recovery, and monitoring solutions.

Ability to deploy open-source and commercial HPC platforms.

Support application teams with MPI libraries, parallel processing, and GPU setups.

Automate repetitive tasks through scripting.

Create and maintain detailed technical documentation.

Mentor junior team members and collaborate on solution design.

Required Skills:

7-8 years of experience in HPC administration or Linux systems engineering.

Strong expertise in cluster management, tuning, and performance optimization.

Experience with storage systems, networking (InfiniBand, Ethernet), and monitoring tools (Grafana, Prometheus).

Proficiency in shell scripting, Python, or automation tools.

Knowledge of MPI, CUDA, or GPU computing is an advantage.

Excellent troubleshooting and communication skills.
