Job Summary :
We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong expertise in GPU programming across one or more platforms (e.g., NVIDIA CUDA, AMD ROCm / HIP, or OpenCL) and is comfortable working at the intersection of parallel computing, performance tuning, and ML system integration.
Key Responsibilities :
- Develop, optimize, and maintain GPU-accelerated components for machine learning pipelines using frameworks such as CUDA, HIP, or OpenCL
- Analyze and improve GPU kernel performance through profiling, benchmarking, and resource optimization.
- Optimize memory access, compute throughput, and kernel execution to improve overall system performance on the target GPUs.
- Port existing CPU-based implementations to GPU platforms while ensuring correctness and performance scalability.
- Work closely with system architects, software engineers, and domain experts to integrate GPU-accelerated solutions.
Required Qualifications :
Bachelor's or master's degree in computer science, Electrical Engineering, or a related field.3+ years of hands-on experience in GPU programming using CUDA, HIP, OpenCL, or other GPU compute APIs.Strong understanding of GPU architecture, memory hierarchy, and parallel programming models.Proficiency in C / C++ and hands-on experience developing on Linux-based systems.Familiarity with profiling and tuning tools such as Nsight, rocprof, or Perfetto.Preferred Qualifications :
Familiarity with cuDNN, TensorRT, OpenCL, or other GPU computing libraries.ref : hirist.tech)