Overview
We’re building the future of voice — intelligent, expressive, multilingual systems that can understand, respond, and connect with humans naturally.
As our Speech AI Engineer, you’ll be part of a high-performance team designing voice intelligence that powers conversational platforms for governments, enterprises, and next-gen contact centers.
Your mission is to make machines sound human — blending deep learning, linguistics, and emotion modeling to create voices that respond with empathy and precision.
You’ll work on speech pipelines that run in real time across languages like Arabic, English, Hindi, and beyond, optimizing for accuracy, speed, and natural flow.
This role is for those who thrive on pushing limits — where milliseconds matter, and the voice you create will represent the next leap in human–AI interaction.
Key Responsibilities
- Design and deploy ASR / STT systems using Whisper, NeMo, or Azure Speech.
- Develop TTS pipelines capable of expressive, multilingual synthesis (Arabic, English, Hindi).
- Fine-tune and customize models for dialects and prosody control, especially Emirati Arabic.
- Build robust speech preprocessing, noise reduction, and diarization systems.
- Integrate voice AI into live contact center flows with CRM and LLM backends.
- Work with linguists and UX teams to build emotionally adaptive voice personas.
- Optimize speech models for GPU inference, batch streaming, and low-latency response.
Preferred Project Experience
- Minimum 4 years of experience delivering results in fast-paced, high-pressure project environments.
- Delivered production-grade speech models (contact centers, IVRs, assistants).
- Fine-tuned or trained custom TTS / ASR for specific regions or accents.
- Integrated speech pipelines with LLMs in deployed systems.
- Experience with voice analytics, emotion detection, or speaker ID systems.
- Provide a sample, demo, or GitHub link showing your speech or voice-related project.
Minimum Qualification
- Bachelor’s degree in Computer Science, Electrical Engineering, or Linguistics with AI focus.
Preferred Qualification
- Master’s in Speech Processing, Computational Linguistics, or AI / ML.
- Specialization in Signal Processing, Deep Learning for Audio, or Multilingual Speech Systems.
- Research or thesis work in speech synthesis, accent modeling, or dialogue systems.