Senior HPC Engineer

Netweb Technologies India Ltd., Delhi, India
30+ days ago
Job description

Job Title: Senior Engineer-HPC

Department: Production & Support

Location: Faridabad

Position Summary:

We are seeking an accomplished HPC Systems Engineer with 8–10 years of enterprise Linux administration experience and more than 5 years of hands-on experience managing large-scale HPC clusters (more than 500 cores) and multi-petabyte storage environments. The ideal candidate has proven expertise in designing, implementing, and optimizing HPC infrastructure, including compute, storage, and high-speed networking, to deliver maximum performance for demanding workloads.

Key Responsibilities:

HPC Cluster Management & Optimization

  • Design, implement, and maintain HPC environments, including compute, storage, and network components.
  • Configure and optimize Slurm, PBS Pro, or other workload managers/schedulers for efficient job scheduling and resource allocation.
  • Implement performance tuning for CPU, GPU, memory, I/O, and network subsystems to meet workload demands.
  • Manage HPC filesystem solutions such as Lustre, BeeGFS, or GPFS/Spectrum Scale.

Linux Administration

  • Administer enterprise-grade Linux distributions (RHEL, CentOS, Rocky, Ubuntu) in large-scale compute environments.
  • Manage kernel upgrades, patching, and security hardening.
  • Troubleshoot kernel-level and system-level issues for performance and stability.

Automation & Configuration Management

  • Develop and maintain Ansible playbooks/roles for automated provisioning, configuration, and patching of HPC systems.
  • Integrate Ansible with CI/CD pipelines for infrastructure as code (IaC) practices.
  • Automate cluster deployment and environment consistency across hundreds of nodes.

Monitoring, Troubleshooting & Support

  • Implement and maintain monitoring tools (e.g., Grafana, Prometheus, Nagios, Ganglia).
  • Troubleshoot complex HPC workloads, MPI communication issues, and application performance bottlenecks.
  • Provide Tier-3 escalation support for Linux/HPC-related incidents.

Collaboration & Documentation

  • Work closely with research teams, DevOps engineers, and system architects to deliver high-performance solutions.
  • Document architecture, SOPs, troubleshooting guides, and performance tuning methodologies.

Requirements

Required Skills & Experience

  • 8–10 years of hands-on Linux system administration experience in production environments.
  • 5+ years managing HPC clusters at scale (500+ cores, multiple petabytes of storage).
  • Strong Ansible automation skills (complex playbooks, roles, variables, templates).
  • Deep understanding of MPI, OpenMP, and GPU/accelerator integration in HPC workloads.
  • Proficient with HPC job schedulers (Slurm, PBS Pro, LSF).
  • Experience with HPC storage (Lustre, BeeGFS, GPFS).
  • Strong knowledge of TCP/IP networking, InfiniBand, and RDMA technologies.
  • Experience with performance tuning and benchmarking tools (perf, HPCToolkit, Intel VTune, iperf, fio).
  • Scripting proficiency in Bash, Python, or Perl for automation and tooling.

Preferred Qualifications

  • Experience with containerized HPC (Singularity, Apptainer, or Podman).
  • Familiarity with cloud-HPC integration (AWS ParallelCluster, Azure CycleCloud, GCP HPC).
  • Knowledge of security compliance standards (CIS benchmarks, STIG).
  • Contribution to HPC community tools or open-source projects.

Soft Skills

  • Strong problem-solving and analytical thinking.
  • Ability to mentor junior engineers and collaborate across teams.
  • Excellent communication skills for technical and non-technical stakeholders.