Keizaburo Takagi
NEC
Publication
Featured research published by Keizaburo Takagi.
International Conference on Acoustics, Speech, and Signal Processing | 1995
Takao Watanabe; Koichi Shinoda; Keizaburo Takagi; Ken-ichi Iso
This paper proposes a speech recognition method that uses a tree-structured probability density function (PDF) to achieve high-speed HMM-based speech recognition. To reduce the cost of likelihood calculation over a PDF set composed of the Gaussian PDFs for all mixture components, all states, and all recognition units, the likelihood is computed only coarsely for element PDFs whose likelihood is unlikely to be large. The PDF set is expressed in a tree-structured form. During recognition, the likelihood set is calculated by searching the tree: the likelihood is computed from the cluster PDF at each node, and the search descends from the root through the nodes with the largest likelihood. Experimental results showed that the computational load was drastically reduced with little loss in recognition accuracy, in both speaker-independent and speaker-adaptive cases. The algorithm was applied to speech recognition software for personal computers without special hardware.
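The tree search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` layout, the diagonal-covariance Gaussians, the single best-child descent, and the rule that leaves under a non-selected node inherit that node's cluster score are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Tree node holding a diagonal-covariance Gaussian (hypothetical layout)."""
    mean: List[float]
    var: List[float]
    children: List["Node"] = field(default_factory=list)
    leaf_id: Optional[int] = None  # set only on leaves (element PDFs)


def log_gauss(x: List[float], mean: List[float], var: List[float]) -> float:
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def leaf_ids(node: Node) -> List[int]:
    """Collect the ids of all element PDFs below a node."""
    if not node.children:
        return [node.leaf_id]
    ids: List[int] = []
    for child in node.children:
        ids.extend(leaf_ids(child))
    return ids


def tree_log_likelihoods(root: Node, x: List[float]) -> Tuple[Dict[int, float], int]:
    """Return a score per element PDF and the number of Gaussians evaluated.

    Assumption: only the best child is expanded at each level; every leaf
    under a non-selected child receives that child's cluster score.
    """
    scores: Dict[int, float] = {}
    evaluations = 0
    if not root.children:  # degenerate tree with a single element PDF
        return {root.leaf_id: log_gauss(x, root.mean, root.var)}, 1
    node = root
    while node.children:
        scored = [(log_gauss(x, c.mean, c.var), c) for c in node.children]
        evaluations += len(scored)
        best_score, best = max(scored, key=lambda pair: pair[0])
        for score, child in scored:
            if child is not best:
                for lid in leaf_ids(child):
                    scores[lid] = score  # coarse score from the cluster PDF
        node = best
        if not node.children:
            scores[node.leaf_id] = best_score  # exact score for the reached leaf
    return scores, evaluations


def leaf(i: int, m: float) -> Node:
    return Node([m], [1.0], leaf_id=i)


# Toy one-dimensional tree: two clusters with three element PDFs each.
cluster_a = Node([1.0], [2.0], [leaf(0, 0.0), leaf(1, 1.0), leaf(2, 2.0)])
cluster_b = Node([11.0], [2.0], [leaf(3, 10.0), leaf(4, 11.0), leaf(5, 12.0)])
root = Node([6.0], [30.0], [cluster_a, cluster_b])

scores, evaluations = tree_log_likelihoods(root, [0.2])
```

In this toy tree the search evaluates the two cluster PDFs and the three element PDFs of the selected cluster, so the three element PDFs in the other cluster are never evaluated individually. The saving grows with the depth and branching of the tree.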
Journal of the Acoustical Society of America | 2000
Keizaburo Takagi
A speech adaptation device comprises a vocabulary-independent reference pattern memory that stores a plurality of vocabulary-independent reference patterns having one or more categories. Each category has one or more acoustic units, connected so that any sequence of acoustic units appearing in the input speech can be accepted. A preliminary matching unit time-aligns the time series of feature vectors of the input speech, obtained from the analysis unit, with the vocabulary-independent reference pattern, and from the aligned portions computes mean vectors for each category of both the input speech feature vectors and the vocabulary-independent reference pattern. An adaptation unit corrects at least one of the time series of feature vectors of the input speech and the vocabulary-independent reference pattern by using the per-category mean vectors calculated by the preliminary matching unit.
International Conference on Acoustics, Speech, and Signal Processing | 1995
Keizaburo Takagi; Hiroaki Hattori; Takao Watanabe
The paper proposes a rapid environment adaptation algorithm based on spectrum equalization (REALISE). In practical speech recognition applications, differences between training and testing environments often seriously diminish recognition accuracy. These environmental differences can be classified into two types: differences in additive noise and differences in multiplicative noise in the spectral domain. The proposed method computes a time-alignment between a testing utterance and its closest reference pattern, and then calculates the noise differences between the two according to that alignment. All reference patterns are then adapted to the testing environment using these differences. Finally, the testing utterance is recognized using the adapted reference patterns. In a 250-word Japanese recognition task in which the training and testing microphones were of two different types, REALISE improved recognition accuracy from 87% to 96%.
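The abstract gives no formulas, so the following is only a sketch of one plausible way to estimate additive and multiplicative differences from an alignment and apply them to reference patterns. It assumes the alignment is already available as (test frame, reference frame, speech flag) triples, that spectra are plain lists of per-bin magnitudes, and that the environments differ by a per-bin gain plus a per-bin additive noise. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Spectrum = List[float]
AlignedFrame = Tuple[Spectrum, Spectrum, bool]  # (test, reference, is_speech)


def mean_spectrum(frames: Sequence[Spectrum]) -> Spectrum:
    """Per-bin mean over a non-empty list of spectra."""
    n = len(frames)
    return [sum(bins) / n for bins in zip(*frames)]


def estimate_environment(
    aligned: Sequence[AlignedFrame],
) -> Tuple[Spectrum, Spectrum, Spectrum]:
    """Estimate (gain, reference noise, test noise) per frequency bin.

    Assumed model: test = gain * (reference - ref_noise) + test_noise.
    Noise frames give the additive terms; speech frames give the gain.
    """
    test_speech = mean_spectrum([t for t, _, s in aligned if s])
    ref_speech = mean_spectrum([r for _, r, s in aligned if s])
    test_noise = mean_spectrum([t for t, _, s in aligned if not s])
    ref_noise = mean_spectrum([r for _, r, s in aligned if not s])
    gain = [
        (ts - tn) / max(rs - rn, 1e-10)  # floor guards against division by zero
        for ts, tn, rs, rn in zip(test_speech, test_noise, ref_speech, ref_noise)
    ]
    return gain, ref_noise, test_noise


def adapt_reference(
    ref_frame: Spectrum, gain: Spectrum, ref_noise: Spectrum, test_noise: Spectrum
) -> Spectrum:
    """Map a reference spectrum into the testing environment."""
    return [
        g * (r - rn) + tn
        for r, g, rn, tn in zip(ref_frame, gain, ref_noise, test_noise)
    ]


# Toy alignment with two frequency bins: two noise frames and two speech frames.
aligned = [
    ([3.0, 2.0], [1.0, 1.0], False),
    ([3.0, 2.0], [1.0, 1.0], False),
    ([13.0, 3.5], [6.0, 4.0], True),
    ([17.0, 4.5], [8.0, 6.0], True),
]
gain, ref_noise, test_noise = estimate_environment(aligned)

# Adapt a frame from another reference pattern using the same estimates.
adapted = adapt_reference([4.0, 9.0], gain, ref_noise, test_noise)
```

Because the estimates are computed once from the aligned utterance and then reused for every reference pattern, adaptation needs only a single testing utterance, which is consistent with the "rapid" adaptation the abstract describes.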
Journal of the Acoustical Society of America | 2000
Keizaburo Takagi
A speech recognition apparatus includes a feature extraction section, and a recognition section. The feature extraction section extracts the feature vectors of input speech. The feature extraction section includes at least a pitch intensity extraction section. The pitch intensity extraction section extracts the intensities of the fundamental frequency components of the input speech. The recognition section performs speech recognition by using the feature vectors from the feature extraction section.
International Conference on Spoken Language Processing | 1996
Keizaburo Takagi; Koichi Shinoda; Hiroaki Hattori; Takao Watanabe
A speaker adaptation method is described. In practical applications of speaker adaptation, the adaptation and testing environments vary significantly and are not known beforehand. In such cases, speaker adaptation adapts a reference pattern to the adaptation utterances with respect to differences in both environment and speaker at the same time, which degrades its performance. To cope with this problem, the proposed method first eliminates the environmental differences between each input utterance and a reference pattern by using a rapid environment adaptation algorithm based on spectrum equalization (REALISE) (K. Takagi et al., 1995). It then applies unsupervised, incremental speaker adaptation with autonomous control using tree structure pdfs (ACTS) (K. Shinoda and T. Watanabe, 1995) to the environmentally adapted reference pattern. By combining these two methods, the resulting system is expected to perform well under adverse environmental conditions and to show a stable improvement regardless of the amount of adaptation data. Evaluation experiments were carried out on utterances under three vehicle speed conditions. Recognition rates for a 100-word Japanese recognition task after adaptation with 100 words improved from 92% (ACTS alone) to 95% (proposed method).
Journal of the Acoustical Society of America | 1996
Hiroaki Hattori; Keizaburo Takagi; Takao Watanabe
This paper reports results for a rapid environment adaptation algorithm in adverse car environments. It is known that the difference between training and testing environments degrades speech recognition performance. This degradation is especially serious in applications such as telephone speech recognition and speech recognition inside a running vehicle, where the testing environment may change drastically. To solve this problem, a rapid environmental adaptation method (hereafter referred to as REALISE) is proposed and its performance is measured in telephone speech recognition. REALISE estimates the differences in multiplicative and additive noise in the spectral domain between the training and testing environments and uses them to adapt acoustic features of reference patterns to the testing environment. Utterances in a car were also recorded to investigate the performance of REALISE under more severe conditions. The testing data were 100 city names uttered by three males and three females under t...
Journal of the Acoustical Society of America | 1998
Keizaburo Takagi; Hiroaki Hattori
Journal of the Acoustical Society of America | 2000
Keizaburo Takagi
Conference of the International Speech Communication Association | 1994
Takao Watanabe; Koichi Shinoda; Keizaburo Takagi; Eiko Yamada
Archive | 2002
Keizaburo Takagi