Develop ASR engines using frameworks such as ESPNET, FairSeq, Athena, or Deep Speech, with PyTorch, TensorFlow, or Kaldi.
Work on speech technologies such as multilingual ASR, contextual biasing, text-to-speech, voice biometrics, and speaker separation.
Help define the technologies required beyond the core speech engine and design how these technologies are integrated.
Improve adaptation of the model to multiple domains and channels.
Desired experience:
For freshers, project work should be aligned with our domain.
Good understanding of signal processing and machine learning (ML) tools.
Should be well versed in classical speech processing methodologies such as hidden Markov models (HMMs), Gaussian mixture models (GMMs), artificial neural networks (ANNs), and language modeling.
Experience with low-latency processing and optimization techniques.
Understanding of traditional speech decoders.
Hands-on experience with current deep learning (DL) techniques used for speech processing, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM), connectionist temporal classification (CTC), and Transformers, is essential.
The candidate should have hands-on experience with an end-to-end implementation in at least one ASR tool such as ESPNET, FairSeq, Athena, or Deep Speech.