The First International Conference on AI-ML-Systems | 2021

Word-level beam search decoding and correction algorithm (WLBS) for end-to-end ASR

 
 

Abstract


A key challenge in resource-constrained speech recognition applications is the lack of a large, domain-specific audio corpus for training models. In such scenarios, models may not be exposed to a wide range of domain-specific words and phrases. In this work, we propose an approach that improves in-domain automatic speech recognition (ASR) results using our word-level beam search decoding and correction algorithm (WLBS). We use a token-based language model to mitigate data sparsity and out-of-vocabulary issues in the corpus. We evaluate the proposed approach on an airplane-cabin announcement use case. The experimental results show that the WLBS algorithm, with its handling of misspellings and missing words, achieves better performance than state-of-the-art beam search decoding and n-gram LMs. We report a word error rate (WER) of 11.48% on our airplane-cabin announcement test corpus.
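For readers unfamiliar with the reported metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the abstract does not state which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the authors' scoring code.
    Words are obtained by whitespace splitting, with no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substitution plus one deletion over 5 reference words.
    print(word_error_rate("please fasten your seat belts",
                          "please fasten your seatbelts"))
```

Because the denominator is the reference length, a hypothesis with many inserted words can produce a WER above 100%.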

DOI 10.1145/3486001.3486223
Language English
Journal The First International Conference on AI-ML-Systems
