Publication


Featured research published by Ken-ichi Iso.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Speaker clustering using vector quantization and spectral clustering

Ken-ichi Iso

We present a speaker clustering method for conversational speech recordings that contain short utterances from multiple speakers. The proposed method represents each speech segment as a vector of VQ code frequencies and uses the cosine between two vectors as their similarity measure. Clustering is performed by a spectral clustering algorithm, with the number of clusters estimated from the eigenstructure of the similarity matrix. We conducted experiments on five test sets with different utterance length distributions to compare the proposed method with the conventional approach based on hierarchical agglomerative clustering with a BIC stopping criterion. The results show that the proposed method significantly outperforms the conventional one in speaker diarization error rate and purity metrics.
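The segment representation and similarity measure described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the squared Euclidean nearest-codeword rule, and the toy codebook are assumptions, and the spectral clustering step that would consume the similarity matrix is left out.

```python
import math

def vq_code_frequencies(frames, codebook):
    """Quantize each frame to its nearest codeword (squared Euclidean
    distance, an assumption) and return relative code frequencies.
    Assumes `frames` is non-empty."""
    counts = [0] * len(codebook)
    for frame in frames:
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
        )
        counts[nearest] += 1
    total = len(frames)
    return [c / total for c in counts]

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(segments, codebook):
    """Pairwise cosine similarity between the segments' code-frequency vectors."""
    vectors = [vq_code_frequencies(seg, codebook) for seg in segments]
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Toy data: a hypothetical 2-codeword codebook and three short segments.
codebook = [[0.0, 0.0], [1.0, 1.0]]
seg_a = [[0.1, 0.0], [0.0, 0.2]]              # frames near codeword 0
seg_b = [[0.9, 1.0], [1.1, 1.0]]              # frames near codeword 1
seg_c = [[0.0, 0.1], [0.1, 0.1], [0.2, 0.0]]  # frames near codeword 0
sim = similarity_matrix([seg_a, seg_b, seg_c], codebook)
```

In this toy case, segments that use the same codewords score a similarity near 1 and segments with disjoint code usage score 0, so the matrix has the block structure that a spectral clustering step could exploit.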


Spoken Language Technology Workshop | 2014

Speaker adaptation of deep neural networks using a hierarchy of output layers

Ryan Price; Ken-ichi Iso; Koichi Shinoda

Deep neural networks (DNNs) used for acoustic modeling in speech recognition often have a very large number of output units corresponding to context-dependent (CD) triphone HMM states. The amount of data available for speaker adaptation is often limited, so a large majority of these CD states may not be observed during adaptation. In this case, the posterior probabilities of unseen CD states are only pushed towards zero during DNN speaker adaptation, and the ability to predict these states can be degraded relative to the speaker-independent network. We address this problem by appending an additional output layer that maps the original set of DNN output classes to a smaller set of phonetic classes (e.g., monophones), thereby reducing the occurrences of unseen states in the adaptation data. Adaptation proceeds by backpropagation of errors from the new output layer, which is disregarded at recognition time, when posterior probabilities over the original set of CD states are used. We demonstrate the benefits of this approach over adapting the network with the original set of CD states using experiments on a Japanese voice search task and obtain a 5.03% relative reduction in character error rate with approximately 60 seconds of adaptation data.
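One way to picture the appended output layer is as a fixed summation that pools CD-state posteriors into monophone posteriors. The sketch below assumes that form (the abstract does not specify how the mapping layer is parameterized), uses a cross-entropy loss on the monophone target, and computes the resulting gradient with respect to the CD-state logits. All names and the toy state-to-phone map are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over CD-state logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def monophone_posteriors(cd_post, state_to_phone, num_phones):
    """Assumed mapping layer: sum the posteriors of all CD states
    that belong to the same monophone."""
    out = [0.0] * num_phones
    for p, phone in zip(cd_post, state_to_phone):
        out[phone] += p
    return out

def adaptation_gradient(logits, state_to_phone, num_phones, target_phone):
    """Gradient of -log P(target_phone) with respect to the CD-state logits.
    For state s: p_s - [s in target phone] * p_s / P(target_phone)."""
    cd_post = softmax(logits)
    phone_post = monophone_posteriors(cd_post, state_to_phone, num_phones)
    grad = []
    for p, phone in zip(cd_post, state_to_phone):
        share = p / phone_post[target_phone] if phone == target_phone else 0.0
        grad.append(p - share)
    return grad

# Toy setup: four CD states, the first two tied to phone 0, the rest to phone 1.
state_to_phone = [0, 0, 1, 1]
grad = adaptation_gradient([0.0, 0.0, 0.0, 0.0], state_to_phone, 2, target_phone=0)
```

Under these assumptions, every CD state of the target monophone receives a negative gradient (its logit is raised by gradient descent), including states that never appear as CD-level labels in the adaptation data, rather than only being pushed towards zero.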


Conference of the International Speech Communication Association | 2016

Robust DNN-Based VAD Augmented with Phone Entropy Based Rejection of Background Speech

Yuya Fujita; Ken-ichi Iso

We propose a DNN-based voice activity detector (VAD) augmented by entropy-based frame rejection. DNN-based VAD classifies each frame as speech or non-speech and achieves significantly higher VAD performance than conventional statistical model-based VAD. We observed that many of the remaining errors are false alarms caused by background human speech, such as TV or radio audio or nearby people’s conversations. To reject such background speech frames, we introduce an entropy-based confidence measure using the phone posterior probabilities output by a DNN-based acoustic model. Compared to the target speaker’s voice, background speech tends to have relatively unclear pronunciation or to be contaminated by other types of noise, so its entropy becomes larger than that of audio signals containing only the target speaker’s voice. Combining DNN-based VAD and the entropy criterion, we reject frames that the DNN-based VAD classifies as speech but whose entropy is larger than a threshold value. We have evaluated the proposed approach and confirmed a greater than 10% reduction in sentence error rate.
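The rejection rule described in this abstract can be sketched per frame as below. This is an illustration under assumptions: the function names and the threshold value are hypothetical, natural-log entropy is assumed, and any temporal smoothing the authors may apply is not modeled.

```python
import math

def phone_entropy(posteriors):
    """Shannon entropy (in nats) of a phone posterior distribution."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)

def reject_background_speech(vad_is_speech, phone_posteriors, threshold):
    """Keep a frame as speech only if the VAD marked it as speech AND
    its phone-posterior entropy does not exceed the threshold."""
    return [
        is_speech and phone_entropy(post) <= threshold
        for is_speech, post in zip(vad_is_speech, phone_posteriors)
    ]

# Toy frames over a hypothetical 4-phone inventory.
clear = [0.97, 0.01, 0.01, 0.01]    # confident posterior, low entropy
unclear = [0.25, 0.25, 0.25, 0.25]  # flat posterior, entropy = ln 4
decisions = reject_background_speech(
    [True, True, False],            # frame-level VAD output
    [clear, unclear, clear],
    threshold=1.0,                  # hypothetical value, would be tuned
)
```

Here the first frame stays as speech, the second is rejected because its flat posterior has high entropy, and the third remains non-speech because the VAD never marked it as speech.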


NEC Research & Development | 2003

Speech-to-speech translation software on PDAs for travel conversation

Ryosuke Isotani; Kiyoshi Yamabana; Shinichi Ando; Ken Hanazawa; Shinya Ishikawa; Ken-ichi Iso


EURASIP Journal on Audio, Speech, and Music Processing | 2016

Wise teachers train better DNN acoustic models

Ryan Price; Ken-ichi Iso; Koichi Shinoda


International Conference on Human Language Technology Research | 2002

An automatic speech translation system for travel conversation

Akitoshi Okumura; Ken-ichi Iso; Shinichi Doi; Kiyoshi Yamabana; Ken Hanazawa; Takao Watanabe


Conference of the International Speech Communication Association | 2012

Improvements in Japanese Voice Search

Ken-ichi Iso; Edward W. D. Whittaker; Tadashi Emori; Junpei Miyake


International Conference on Multimedia and Expo | 2009

Web-based topic language modeling for audio indexing

Ken-ichi Iso


Conference of the International Speech Communication Association | 1998

Predictive speaker adaptation and its prior training

Dieu Tran; Ken-ichi Iso


Conference of the International Speech Communication Association | 1994

An automatic voice dialing system developed on PC speech I/O platform

Jun Noguchi; Shinsuke Sakai; Kaichiro Hatazaki; Ken-ichi Iso; Takao Watanabe

Collaboration


Top Co-Authors

Koichi Shinoda
Tokyo Institute of Technology

Ryan Price
Tokyo Institute of Technology

Shinsuke Sakai
National Institute of Information and Communications Technology