Responsibilities :
Deep Learning Model Conversion : Convert and adapt deep learning network architectures (e.g., from PyTorch) for deployment on various embedded platforms.
Quantization-Aware Training (QAT) : Implement and fine-tune Quantization-Aware Training techniques to optimize model performance and reduce memory footprint while maintaining accuracy.
Model Optimization : Apply model optimization techniques, including pruning, quantization (post-training and QAT), and neural architecture search, to meet latency, power, and memory targets.
Runtime Integration : Integrate optimized deep learning models with embedded runtime environments and hardware accelerators.
Performance Profiling & Tuning : Analyze and profile model performance on target embedded hardware, identifying bottlenecks and implementing solutions for real-time inference.
Number Format Conversion : Work with number formats such as FP32, FP16, and INT8, and develop strategies for converting between them and using them efficiently on embedded processors.
Toolchain Development & Utilization : Use and contribute to custom conversion tools and optimization scripts that streamline the deployment pipeline.
Skills Required :
Experience : 5+ years in embedded software development with a strong focus on AI/Machine Learning deployment.
Programming Skills : Proficient in Python for AI development and scripting.
Deep Learning Frameworks : Hands-on experience with deep learning frameworks such as PyTorch. Experience with TensorFlow/Keras is a plus.
Embedded Systems : Strong understanding of embedded system architectures, microcontrollers, DSPs, and/or FPGAs.
Optimization Techniques : Proven experience with deep learning model optimization techniques (quantization, pruning, knowledge distillation).
Number Formats : Familiarity with different number formats (e.g., FP32, FP16, INT8) and their implications for embedded inference.
Conversion Tools : Experience with model interchange formats and conversion or compilation toolchains (e.g., ONNX, OpenVINO, TensorRT, TVM).
Problem-Solving : Excellent analytical and problem-solving skills, with a strong ability to debug and optimize complex systems.
Embedded Programming : Experience with C/C++ for embedded development.
Hardware Acceleration : Familiarity with hardware accelerators on edge devices (e.g., NPUs, GPUs).
Embedded Engineer • Chennai, Tamil Nadu, India