Position : Speech Data Scientist
Experience Level : 3-6 years
Location : Bangalore, India
Key Responsibilities :
Core Development & Implementation :
- Design and implement end-to-end speech analytics pipelines for production environments
- Develop ASR engines using state-of-the-art frameworks (Wav2vec, Whisper, Deep Speech) with PyTorch or TensorFlow
- Build and optimize speaker diarization, language identification (LID), and text post-processing systems
- Focus on multilingual audio processing capabilities
- Lead data selection strategies for domain adaptation and model optimization
Model Development & Enhancement :
- Develop and analyze objective measures for speech quality evaluation and enhancement
- Implement speaker-conditioned personalization techniques for improved ASR accuracy in noisy environments
- Optimize on-device ASR models with emphasis on multilingual scenarios
- Guide teams on best practices for model accuracy improvement and performance optimization
Research & Innovation :
- Conduct research on advanced speech processing techniques including neural speech enhancement
- Develop novel approaches for complex audio scenarios and multi-speaker environments
- Contribute to patent applications and research publications in speech technology
- Stay current with latest developments in transformer models, attention mechanisms, and foundation models
Technical Integration & Deployment :
- Design integration architectures for speech-to-text services and supporting technologies
- Implement MLOps processes and CI / CD pipelines for speech models
- Deploy and scale speech solutions on cloud platforms (AWS, GCP)
- Develop production-ready applications using Python, C++, and Java
Required Qualifications :
Educational Background :
- Ph.D / M.S. / M.Tech in relevant field (Computer Science / Signal Processing) preferred
- B.Tech / B.E in ECE, CSE, or related technical field
Technical Expertise :
Core Speech Processing :
- 3-6 years of hands-on experience in speech recognition and processing
- Deep understanding of classical methodologies : HMMs, GMMs, ANNs, Language modeling
- Expertise in modern deep learning techniques : CNNs, RNNs, LSTMs, CTC, Attention mechanisms
- Strong background in digital signal processing and audio analysis
Machine Learning & Deep Learning :
- Proficiency with PyTorch and TensorFlow frameworks
- Experience with transformer models (BERT, Wav2vec 2.0, Whisper)
- Knowledge of end-to-end ASR implementation and optimization
- Understanding of foundation models and transfer learning approaches
Programming & Tools :
- Strong Python programming skills with ML / DL libraries (numpy, pandas, scikit-learn)
- Experience with C++ and Java for production implementations
- Proficiency in bash scripting and automation
- Familiarity with version control (Git) and collaborative development
Cloud & Deployment :
- Hands-on experience with cloud platforms (AWS, GCP)
- Knowledge of containerization (Docker, Kubernetes)
- Experience with MLOps tools and CI / CD pipelines
- Understanding of model serving and scalability considerations
Preferred Qualifications :
Advanced Experience :
- Experience with multilingual and code-switched speech processing
- Knowledge of speaker verification, diarization, and voice biometrics
- Familiarity with speech synthesis and voice conversion techniques
- Experience with edge computing and on-device model optimization
Research & Publications :
- Published research in speech processing conferences or journals
- Patent applications in speech technology domain
- Active participation in speech technology communities
Specialized Skills :
- Experience with Indian languages and accent adaptation
- Knowledge of noise robustness and speech enhancement techniques
- Understanding of confidence scoring and uncertainty quantification
- Experience with real-time speech processing systems
What We Offer :
- Opportunity to work on cutting-edge speech AI technology & patented speech insight products
- Collaborative environment with experienced speech technology experts
- Chance to contribute to products impacting millions of users
- Professional development and conference participation opportunities
- Competitive compensation and benefits package
(ref : hirist.tech)