2019 International Symposium on Multimedia and Communication Technology (ISMAC) | 2019
Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition
Abstract
This paper focused on the issue of robustness in children speech recognition system. The shape of filterbank analysis is presented in this study to suppress the additive background noise in acoustic features of children speakers. In addition, the Linear Discriminant Analysis (LDA), the Maximum Likelihood Linear Transform (MLLT) and feature space Maximum Likelihood Linear Regression (fMLLR) features are applied to build the speaker adaptive acoustic model with the help of Kaldi speech recognition toolkit. The performance of Gammatone filterbank and Bark-scale filterbank based Cepstral features were evaluated under contaminated situations using five different types of noise at a range of signal to noise ratio (SNR) 10dB to −10dB. As the detailed analysis shown, the performance of Gammatone frequency integration is superior to Mel Frequency Cepstral Coefficient (MFCC) in different types of additive background noise and various SNR situations.