Ramalingam Hariharan
Nokia
Publications
Featured research published by Ramalingam Hariharan.
International Conference on Acoustics, Speech, and Signal Processing | 2001
Ramalingam Hariharan; Juha Häkkinen; Kari Laurila
We propose a sub-band energy based end-of-utterance algorithm that is capable of detecting the time instant when the user has stopped speaking. The proposed algorithm finds the time instant at which a sufficient number of sub-band spectral energy trajectories fall below adaptive thresholds and stay there for a pre-defined fixed time, i.e. a non-speech period is detected after the end of the utterance. With the proposed algorithm, a practical speech recognition system can give timely feedback to the user, making the behaviour of the speech recognition system more predictable and more consistent across different usage environments and noise conditions. The proposed algorithm is shown to be more accurate and noise-robust than previously proposed approaches. Experiments with both isolated command word recognition and continuous digit recognition in various noise conditions verify the viability of the proposed approach, with an average proper end-of-utterance detection rate of around 94% in both cases, representing a 43% error rate reduction over the most competitive previously published method.
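The detection scheme the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the published algorithm: the peak-relative adaptation rule and all parameter names and values (`drop_db`, `min_bands`, `hang_frames`) are assumptions made for the sketch.

```python
# Hedged sketch of a sub-band energy end-of-utterance detector.
# The adaptive thresholds here are modelled as "running per-band peak
# minus a fixed drop", which is an assumption, not the paper's rule.

def detect_end_of_utterance(subband_energies, drop_db=15.0, min_bands=3,
                            hang_frames=25):
    """subband_energies: one list of per-band energies (in dB) per frame.

    Declares end-of-utterance at the first frame where at least `min_bands`
    band trajectories have fallen `drop_db` below their running peaks and
    stayed there for `hang_frames` consecutive frames. Returns the frame
    index, or None if no end-of-utterance is detected.
    """
    n_bands = len(subband_energies[0])
    peaks = [float("-inf")] * n_bands   # running per-band energy peaks
    quiet_run = 0                       # consecutive low-energy frames so far
    for t, frame in enumerate(subband_energies):
        for b, e in enumerate(frame):
            peaks[b] = max(peaks[b], e)
        # Count bands whose energy has dropped below their adaptive threshold.
        low = sum(1 for b, e in enumerate(frame) if e < peaks[b] - drop_db)
        quiet_run = quiet_run + 1 if low >= min_bands else 0
        if quiet_run >= hang_frames:
            return t    # non-speech has persisted long enough
    return None
```

For example, ten loud "speech" frames followed by forty quiet frames would trigger detection `hang_frames` frames into the quiet region, while an all-speech input returns `None`.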
IEEE Transactions on Speech and Audio Processing | 2001
Ramalingam Hariharan; Imre Kiss; Olli Viikki
In this paper, we present a multiresolution-based feature extraction technique for speech recognition in adverse conditions. The proposed front-end algorithm uses mel cepstrum-based feature computation on subbands in order not to spread noise distortions over the entire feature space. Conventional full-band features are also appended to the final feature vector, which is fed to the recognition unit. Other novel features of the proposed front-end include emphasis of long-term spectral information combined with cepstral domain feature vector normalization, and the use of the PCA transform, instead of the DCT, to produce the final cepstral parameters. The proposed algorithm was experimentally evaluated in a connected digit recognition task under various noise conditions. The results obtained show that the new feature extraction algorithm improves word recognition accuracy by 41% compared to the mel cepstrum front-end. A substantial increase in recognition accuracy was observed in all tested noise environments at all SNRs. The good performance of the multiresolution front-end is not only due to the higher feature vector dimension: the proposed algorithm clearly outperformed the mel cepstral front-end when the same number of HMM parameters was used in both systems. We also propose methods to reduce the computational complexity of the multiresolution front-end-based speech recognition system. Experimental results indicate the viability of the proposed techniques.
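The core idea of the multiresolution front-end, computing cepstra per subband so that band-limited noise corrupts only part of the feature vector, can be sketched as below. The two-way subband split, the coefficient counts, and the use of a plain DCT-II (the paper uses a PCA transform instead) are all simplifying assumptions for illustration.

```python
import math

def dct2(x, n_out):
    """Plain DCT-II decorrelating transform. The paper replaces this
    with a PCA transform; DCT is used here only as a stand-in."""
    N = len(x)
    return [sum(x[k] * math.cos(math.pi * i * (k + 0.5) / N) for k in range(N))
            for i in range(n_out)]

def multiresolution_features(log_mel, n_ceps=4):
    """log_mel: log mel filterbank energies for one frame.

    Computes cepstra separately on a low and a high subband, then appends
    conventional full-band cepstra, so that noise confined to one band
    leaves the other subband's coefficients untouched.
    """
    mid = len(log_mel) // 2
    low, high = log_mel[:mid], log_mel[mid:]
    return dct2(low, n_ceps) + dct2(high, n_ceps) + dct2(log_mel, n_ceps)
```

A quick check of the noise-localization property: perturbing only the high-band energies leaves the low-subband cepstral coefficients unchanged, while full-band cepstra would smear the distortion across every coefficient.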
Speech Communication | 2002
Ramalingam Hariharan; Olli Viikki
Inter-speaker variability and sensitivity to background noise are two major problems in modern speech recognition systems. In this paper, we investigate different techniques that have been developed to overcome these issues. These methods include vocal tract length normalisation (VTLN), on-line HMM adaptation and gender-dependent acoustic modelling. Our objective in this paper is to combine these techniques so that the system recognition performance is maximised. Moreover, we propose a vocal tract length normalisation technique that is more implementation-friendly than the previously published utterance-specific VTLN (u-VTLN). In order to ensure the wide applicability of the methods studied, the performance evaluation is done both in connected digit recognition and in monophone-based isolated word recognition. The recognition results obtained indicate the importance of the combined use of these techniques. The integrated use of VTLN and on-line adaptation always provided the highest performance in both types of recognition experiments using gender-independent models. As expected, on-line HMM adaptation provided the major performance improvement with respect to a gender- and speaker-independent baseline system. Adding speaker-specific VTLN (s-VTLN) or gender-dependent acoustic modelling further improved the system accuracy. However, while the joint use of s-VTLN and gender-dependent HMMs improved the recognition rate with the original unadapted models, a minor performance degradation was observed when s-VTLN was applied to on-line adapted gender-dependent HMMs.
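VTLN, common to both s-VTLN and u-VTLN, compensates for inter-speaker vocal tract length differences by warping the frequency axis before feature extraction. A common piecewise-linear form is sketched below; the cutoff frequency, bandwidth, and warping-factor values are illustrative assumptions, not the paper's configuration.

```python
def vtln_warp(freq, alpha, f_max=4000.0, f_cut=3200.0):
    """Piecewise-linear VTLN frequency warping (illustrative sketch).

    Scales frequencies by the speaker-specific factor `alpha` below a
    cutoff, then maps the remainder linearly so that f_max warps onto
    f_max, keeping the analysis bandwidth fixed.
    """
    if freq <= f_cut:
        return alpha * freq
    # Linear segment from (f_cut, alpha * f_cut) up to (f_max, f_max).
    slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    return alpha * f_cut + slope * (freq - f_cut)
```

In s-VTLN a single `alpha` is estimated per speaker (typically by maximising the likelihood of that speaker's data over a small grid such as 0.88 to 1.12), whereas u-VTLN re-estimates it for every utterance; the warp itself is identical.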
Archive | 2001
Ramalingam Hariharan
Archive | 2000
Ramalingam Hariharan; Juha Häkkinen; Imre Kiss; Jilei Tian; Olli Viikki
Archive | 2004
Kari Laurila; Juha Häkkinen; Ramalingam Hariharan
Conference of the International Speech Communication Association | 1999
Juha Häkkinen; Janne Suontausta; Ramalingam Hariharan; Marcel Vasilache; Kari Laurila
Conference of the International Speech Communication Association | 2000
Ramalingam Hariharan; Imre Kiss; Olli Viikki; Jilei Tian
Conference of the International Speech Communication Association | 1999
Juha Iso-Sipilä; Kari Laurila; Ramalingam Hariharan; Olli Viikki
Archive | 2000
Kari Laurila; Juha Häkkinen; Ramalingam Hariharan