Join our team as a Lead Linux Admin, where you will oversee the management and maintenance of Linux-based systems within a high-performance computing or research environment.
You will be responsible for software stack installation, system upgrades, and ensuring system stability while collaborating with developers and researchers. Apply now to contribute your expertise to a dynamic technical team.
Responsibilities
- Manage and maintain Linux servers and environments to ensure high availability and optimal performance
- Install, configure, and manage software packages using EasyBuild
- Perform installation and upgrades of R bundles to maintain compatibility and stability
- Collaborate with developers, researchers, and data scientists to meet software environment requirements
- Monitor system performance and proactively address hardware, software, or network issues
- Ensure system security through user management, patch application, and configuration hardening
- Maintain documentation for system configurations, procedures, and updates
- Automate routine administrative tasks using shell scripting or configuration management tools such as Ansible or Puppet
- Troubleshoot complex issues related to software dependencies and build environments
Requirements
- 10-15 years of experience as a Linux system administrator, preferably in HPC or research environments
- Strong proficiency in shell scripting using bash or zsh
- Background in using automation tools for system administration tasks
- Understanding of system monitoring, logging, and alerting mechanisms
- Proven problem-solving skills with attention to detail
- Strong communication and documentation abilities
Nice to have
- Hands-on experience with EasyBuild for software stack management
- Practical knowledge of R language and R bundle upgrade processes
- Familiarity with module systems like Lmod or Environment Modules
- Experience with job scheduling systems such as SLURM, PBS, or SGE
- Knowledge of containerization tools including Docker or Singularity
Skills Required
PBS, Logging, Shell Scripting, Docker, Linux, Ansible, System Monitoring, Puppet