Unlocking AI's Full Potential
Key Objectives:
We're pushing the boundaries of AI acceleration, aiming for a 23% speedup in critical kernels and a 32.5% improvement in attention mechanisms.
At our organization, we're building the infrastructure for agentic hardware engineering and optimization, fueled by algorithmic discovery. Our goal is to empower every AI model to optimize its compute stack.
This involves breaking the data barrier, treating CUDA as a low-resource language, and tailoring kernel optimization to context and hardware.
Our ideal candidate has experience optimizing hardware for model training and inference workloads, with expertise in CUDA, C++, and Python. They are also familiar with frameworks such as JAX/XLA, PyTorch, and TensorFlow, and with libraries such as cuBLAS, cuDNN, and CUTLASS.
Responsibilities include:
Required Skills & Qualifications:
Proficiency in one or more programming languages (CUDA, C++, Python)
Familiarity with deep learning frameworks (JAX/XLA, PyTorch, TensorFlow)
Knowledge of computer architecture and system optimization
Experience with kernel optimization and development
Strong problem-solving skills and ability to debug complex issues
Benefits:
Collaborative and dynamic work environment
Ongoing training and professional development opportunities
Competitive salary and benefits package
Opportunities for growth and advancement within the organization
Work-life balance and flexible scheduling
What We Offer:
A challenging and rewarding role in a rapidly growing industry
The opportunity to work on cutting-edge technology and make a real impact
A collaborative and supportive team environment
Artificial Intelligence Specialist • Anand, Gujarat, India