Tadashi Emori | Researchain

Publication

Featured research published by Tadashi Emori.

Journal of the Acoustical Society of America | 2006

Speaker's voice recognition system, method and recording medium using two dimensional frequency expansion coefficients

Tadashi Emori; Koichi Shinoda

A voice recognition system comprises an analyzer for converting an input voice signal to an input pattern including cepstrum, a reference pattern for storing reference patterns, an elongation/contraction estimating unit for outputting an elongation/contraction parameter in the frequency axis direction by using the input pattern and the reference patterns, and a recognizing unit for calculating the distances between the converted input pattern from the converter and the reference patterns and outputting the reference pattern corresponding to the shortest distance as the result of recognition. The elongation/contraction unit estimates the elongation/contraction parameter by using the cepstrum included in the input pattern. The elongation/contraction unit does not have to hold various values in advance for determining the elongation/contraction parameter, nor does it have to execute distance calculations for various values.

International Conference on Multimodal Interfaces | 2002

An automatic speech translation system on PDAs for travel conversation

Ryosuke Isotani; Kiyoshi Yamabana; Shinichi Ando; Ken Hanazawa; Shinya Ishikawa; Tadashi Emori; Ken-ichi Iso; Hiroaki Hattori; Akitoshi Okumura; Takao Watanabe

We present an automatic speech-to-speech translation system for personal digital assistants (PDAs) that helps oral communication between Japanese and English speakers in various situations while traveling. Our original compact large vocabulary continuous speech recognition engine, compact translation engine based on a lexicalized grammar, and compact Japanese speech synthesis engine lead to the development of a Japanese/English bi-directional speech translation system that works with limited computational resources.

International Conference on Acoustics, Speech, and Signal Processing | 2010

Speech modeling based on committee-based active learning

Yuzo Hamanaka; Koichi Shinoda; Sadaoki Furui; Tadashi Emori; Takafumi Koshinaka

We propose a committee-based active learning method for large vocabulary continuous speech recognition. In this approach, multiple recognizers are prepared beforehand, and the recognition results obtained from them are used for selecting utterances. Here, a progressive search method is used for aligning sentences, and voting entropy is used as a measure for selecting utterances. We apply our method not only to acoustic models but also to language models and their combination. Our method was evaluated on 190 hours of speech data from the Corpus of Spontaneous Japanese. It proved to be significantly better than random selection. It required only 63 h of data to achieve a word accuracy of 74%, while standard training (i.e., random selection) required 97 h of data. The recognition accuracy of our proposed method was also better than that of the conventional uncertainty sampling method using word posterior probabilities as the confidence measure for selecting sentences.
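The selection criterion named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the committee's hypotheses have already been aligned into equal-length word lists (the abstract's progressive search alignment is not reproduced), and it assumes an utterance is scored by the mean vote entropy over its aligned word slots. The names `vote_entropy`, `utterance_score`, and `select_utterances` are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's votes for one aligned word slot."""
    k = len(votes)
    counts = Counter(votes)
    return -sum((c / k) * math.log(c / k) for c in counts.values())


def utterance_score(aligned_hypotheses):
    """Mean vote entropy over aligned word slots (an assumed aggregation).

    aligned_hypotheses: one equal-length word list per recognizer.
    """
    slots = list(zip(*aligned_hypotheses))
    if not slots:
        return 0.0
    return sum(vote_entropy(slot) for slot in slots) / len(slots)


def select_utterances(pool, budget):
    """Return the `budget` utterance ids the committee disagrees on most.

    pool: dict mapping utterance id -> aligned hypotheses.
    """
    ranked = sorted(pool, key=lambda u: utterance_score(pool[u]), reverse=True)
    return ranked[:budget]


# Toy pool: three recognizers agree on "a" and disagree on "b".
pool = {
    "a": [["x", "y"], ["x", "y"], ["x", "y"]],
    "b": [["x", "y"], ["x", "z"], ["w", "y"]],
}
selected = select_utterances(pool, budget=1)
```

In this toy pool the committee is unanimous on utterance "a" and split on "b", so "b" is the one selected for transcription.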

Systems and Computers in Japan | 2002

Vocal tract length normalization using rapid maximum‐likelihood estimation for speech recognition

Tadashi Emori; Koichi Shinoda

Speaker normalization techniques that correct for differences in vocal tract length between speakers, referred to as vocal tract length normalization, have been proposed in recent years for large vocabulary voice recognition systems using hidden Markov models (HMMs). This paper proposes a scheme that approximates small changes in vocal tract length by a linear mapping in cepstrum space governed by a vocal tract length parameter, and estimates this parameter from the speaker's utterances by maximum-likelihood estimation. The proposed method can estimate a better parameter for a speaker with less computation than past schemes that prepare multiple vocal tract length parameters in advance. In evaluation tests on the recognition of 5000 single Japanese words, the proposed scheme decreased errors by 7.1% alone and by 14.6% in combination with cepstrum mean normalization (CMN).

Conference of the International Speech Communication Association | 2001