Senior HPC Engineer

Netweb Technologies India Ltd.
Faridabad, Haryana, India
30+ days ago
Job description

Job Title : Senior Engineer-HPC

Department : Production & Support

Location : Faridabad

Position Summary :

We are looking for an accomplished HPC Systems Engineer with 8–10 years of enterprise Linux administration and more than 5 years of hands-on experience managing large-scale HPC clusters (over 500 cores) and multi-petabyte storage environments. The ideal candidate has proven expertise in designing, implementing, and optimizing HPC infrastructure, including compute, storage, and high-speed networking, to deliver maximum performance for demanding workloads.

Key Responsibilities :

HPC Cluster Management & Optimization

  • Design, implement, and maintain HPC environments, including compute, storage, and network components.
  • Configure and optimize Slurm, PBS Pro, or other workload managers/schedulers for efficient job scheduling and resource allocation.
  • Implement performance tuning for CPU, GPU, memory, I/O, and network subsystems to meet workload demands.
  • Manage HPC filesystem solutions such as Lustre, BeeGFS, or GPFS/Spectrum Scale.

Linux Administration

  • Administer enterprise-grade Linux distributions (RHEL, CentOS, Rocky, Ubuntu) in large-scale compute environments.
  • Manage kernel upgrades, patching, and security hardening.
  • Troubleshoot kernel-level and system-level issues for performance and stability.

Automation & Configuration Management

  • Develop and maintain Ansible playbooks/roles for automated provisioning, configuration, and patching of HPC systems.
  • Integrate Ansible with CI/CD pipelines for infrastructure as code (IaC) practices.
  • Automate cluster deployment and environment consistency across hundreds of nodes.

Monitoring, Troubleshooting & Support

  • Implement and maintain monitoring tools (e.g., Grafana, Prometheus, Nagios, Ganglia).
  • Troubleshoot complex HPC workloads, MPI communication issues, and application performance bottlenecks.
  • Provide Tier-3 escalation support for Linux/HPC-related incidents.

Collaboration & Documentation

  • Work closely with research teams, DevOps engineers, and system architects to deliver high-performance solutions.
  • Document architecture, SOPs, troubleshooting guides, and performance tuning methodologies.

Requirements :

Required Skills & Experience

  • 8–10 years of hands-on Linux system administration experience in production environments.
  • 5+ years managing HPC clusters at scale (500+ cores, multiple petabytes of storage).
  • Strong Ansible automation skills (complex playbooks, roles, variables, templates).
  • Deep understanding of MPI, OpenMP, and GPU/accelerator integration in HPC workloads.
  • Proficient with HPC job schedulers (Slurm, PBS Pro, LSF).
  • Experience with HPC storage (Lustre, BeeGFS, GPFS).
  • Strong knowledge of TCP/IP networking, InfiniBand, and RDMA technologies.
  • Experience with performance tuning and benchmarking tools (perf, HPCToolkit, Intel VTune, iperf, fio).
  • Scripting proficiency in Bash, Python, or Perl for automation and tooling.

Preferred Qualifications

  • Experience with containerized HPC (Singularity, Apptainer, or Podman).
  • Familiarity with cloud-HPC integration (AWS ParallelCluster, Azure CycleCloud, GCP HPC).
  • Knowledge of security compliance standards (CIS benchmarks, STIG).
  • Contribution to HPC community tools or open-source projects.

Soft Skills

  • Strong problem-solving and analytical thinking.
  • Ability to mentor junior engineers and collaborate across teams.
  • Excellent communication skills for technical and non-technical stakeholders.