Keiko Ochi | Researchain

Publication

Featured research published by Keiko Ochi.

international conference on acoustics, speech, and signal processing | 2009

Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model

Keiko Ochi; Keikichi Hirose; Nobuaki Minematsu

A total corpus-based process of generating prosodic features from text is developed. The process first predicts pauses and phone durations, and then generates F0 contours. Since F0 contour generation is based on the generation process model, it is rather easy to manipulate the generated F0 contours at the command level. A method was developed for generating sentence F0 contours when a focus is placed on one of the “bunsetsu” of an utterance. The method predicts the differences in the F0 model commands between utterances with and without focus, and applies them to the F0 model commands predicted beforehand by the baseline method. The validity of the method was confirmed by experiments on F0 contour generation and speech synthesis.
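As a rough illustration of command-level manipulation, the sketch below evaluates an F0 contour from phrase and accent commands using the standard form of the generation process (Fujisaki) model, then adds amplitude differences to the baseline accent commands. The time constants are commonly used default values, and every command value, the base frequency, and the helper names (`log_f0`, `apply_focus`) are assumptions made for this example. In the paper the differences are predicted from a corpus, whereas here they are simply written by hand.

```python
import math

ALPHA = 3.0   # phrase control time constant (1/s), a commonly used value
BETA = 20.0   # accent control time constant (1/s), a commonly used value
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t):
    """Impulse response of the phrase control mechanism."""
    return ALPHA ** 2 * t * math.exp(-ALPHA * t) if t >= 0 else 0.0


def accent_response(t):
    """Step response of the accent control mechanism, clipped at GAMMA."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + BETA * t) * math.exp(-BETA * t), GAMMA)


def log_f0(t, base_f0, phrase_cmds, accent_cmds):
    """ln F0(t) = ln Fb + phrase components + accent components."""
    value = math.log(base_f0)
    for onset, magnitude in phrase_cmds:
        value += magnitude * phrase_response(t - onset)
    for start, end, amplitude in accent_cmds:
        value += amplitude * (accent_response(t - start) - accent_response(t - end))
    return value


def apply_focus(accent_cmds, amplitude_deltas):
    """Add per-command amplitude differences to the baseline accent commands."""
    return [(s, e, a + d) for (s, e, a), d in zip(accent_cmds, amplitude_deltas)]


# Hypothetical baseline commands for a two-bunsetsu utterance.
phrase_cmds = [(0.0, 0.5)]                      # (onset time, magnitude)
baseline = [(0.2, 0.5, 0.3), (0.7, 1.0, 0.3)]   # (start, end, amplitude)

# Hypothetical differences, written by hand rather than predicted.
focused = apply_focus(baseline, [0.2, -0.1])

f0_base = math.exp(log_f0(0.4, 120.0, phrase_cmds, baseline))
f0_focus = math.exp(log_f0(0.4, 120.0, phrase_cmds, focused))
print(f"F0 at 0.4 s: baseline {f0_base:.1f} Hz, with focus {f0_focus:.1f} Hz")
```

Because the contour is a function of a handful of commands, changing the emphasis of one bunsetsu only requires editing the corresponding accent command; the rest of the contour is regenerated from the same model.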

conference of the international speech communication association | 2016

Multi-Talker Speech Recognition Based on Blind Source Separation with ad hoc Microphone Array Using Smartphones and Cloud Storage.

Keiko Ochi; Nobutaka Ono; Shigeki Miyabe; Shoji Makino

In this paper, we present a multi-talker speech recognition system based on blind source separation with an ad hoc microphone array, which consists of smartphones and cloud storage. In this system, a mixture of voices from multiple speakers is recorded by each speaker’s smartphone, and each recording is automatically transferred to online cloud storage. Our prototype system is realized using iPhone and Dropbox. Although the signals recorded by different iPhones are not synchronized, the blind synchronization technique compensates for both the time offset and the sampling frequency mismatch. Then, auxiliary-function-based independent vector analysis separates the synchronized mixture into each speaker’s voice. Finally, automatic speech recognition is applied to transcribe the speech. By experimental evaluation of the multi-talker speech recognition system using Julius, we confirm that it effectively reduces the speech overlap and improves the speech recognition performance.
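To give a feel for the synchronization step, here is a minimal sketch that estimates a constant time offset between two recordings by maximizing their cross-correlation. This is a deliberately simpler, swapped-in technique and not the blind synchronization method of the paper, which also compensates for the sampling frequency mismatch. The function names and the toy signals are assumptions made for the example.

```python
def estimate_offset(reference, delayed, max_lag):
    """Return the lag (in samples) of `delayed` relative to `reference`
    that maximizes the cross-correlation over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(delayed, lag):
    """Shift `delayed` back by `lag` samples, padding with zeros."""
    if lag >= 0:
        return delayed[lag:] + [0.0] * lag
    return [0.0] * (-lag) + delayed[:lag]


# Toy signals: the second device started recording 3 samples early,
# so the same event appears 3 samples later in its recording.
reference = [0.0, 0.0, 1.0, 2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0] + reference[:-3]

lag = estimate_offset(reference, delayed, max_lag=5)
aligned = align(delayed, lag)
print("estimated lag:", lag)
```

A constant shift like this only handles the start-time difference. When the devices' sampling frequencies differ slightly, the offset drifts over the recording, which is why the system described above needs a dedicated compensation step before source separation.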

9th International Conference on Speech Prosody 2018 | 2018

Respiratory Control, Pauses, and Tonal Control in L1’s and L2’s Text Reading – A Pilot Study on Swedish and Japanese –

Toshiko Isei-Jaakkola; Yasuko Nagano-Madsen; Keiko Ochi

This paper reports the results of a pilot study, which examines the respiratory control exerted by chest and abdominal-muscles during the reading of a long text in the mother tongue (L1) and a targeted foreign language that is being learned (L2), with reference to syntax and prosody in Japanese and Swedish. Three datasets of read speech were obtained from Swedish speakers (SwL1), Swedish learners of Japanese (SwL2), and Japanese speakers (JL1). The results showed that the subjects used respiratory control differently while reading L1 texts and L2 texts, respectively. Both SwL1 and JL1 used chest and abdominal-muscles almost simultaneously, and the peaks of their muscular movements co-occurred at the onset of major syntactic units such as sentences and clauses. SwL2 used more chest muscles than abdominal-muscles, with muscular movements being more frequent, irregular, and small. There was no significant difference between JL1 and Swedish L1 and L2 in terms of the tonal control (pitch range). Some pitch peaks and pauses that appeared at the major syntactic boundaries coincided with the peaks of the muscular movements, but other pitch peaks and pauses did not. These results led to the hypothesis that the acquisition of intonation precedes that of respiratory control in L2 learning.

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) | 2017

Development of High Quality Blind Source Separation Based on Independent Low-Rank Matrix Analysis and Statistical Speech Enhancement for Flexible Hose-Shaped Robot

Yoshiki Mitsui; Satoshi Mizoguchi; Hiroshi Saruwatari; Keiko Ochi; Daichi Kitamura; Nobutaka Ono; Masaru Ishimura; Narumi Mae; Moe Takakusaki; Yutaro Matsui; Kouei Yamaoka; Shoji Makino

Yoshiki MITSUI, The University of Tokyo; Satoshi MIZOGUCHI, The University of Tokyo; Hiroshi SARUWATARI, The University of Tokyo; Keiko OCHI, National Institute of Information; Daichi KITAMURA, SOKENDAI; Nobutaka ONO, National Institute of Information/SOKENDAI; Masaru ISHIMURA, University of Tsukuba; Narumi MAE, University of Tsukuba; Moe TAKAKUSAKI, University of Tsukuba; Yutaro MATSUI, University of Tsukuba; Kouei YAMAOKA, University of Tsukuba; Shoji MAKINO, University of Tsukuba

conference of the international speech communication association | 2016

Automatic Discrimination of Soft Voice Onset Using Acoustic Features of Breathy Voicing.

Keiko Ochi; Koichi Mori; Naomi Sakai; Nobutaka Ono

Soft onset vocalization is used in certain speech therapies. However, it is not easy to practice it at home because the acoustical evaluation itself needs training. It would be helpful for speech patients to get objective feedback during training. In this paper, new parameters for identifying soft onset with high accuracy are described. One of the parameters measures an aspect of the soft voice onset, in which the vocal folds start to oscillate periodically before coming in contact with each other at the beginning of vocalization. Combined with an onset time exceeding a threshold, the proposed parameters gave about 99% accuracy in identifying soft onset vocalization.

Journal of the Acoustical Society of America | 2016

Articulation rates of people who do and do not stutter during oral reading and speech shadowing

Rongna A; Keiko Ochi; Keiichi Yasu; Naomi Sakai; Koichi Mori

Purpose: Previous studies indicate that people who stutter (PWS) speak more slowly than people who do not stutter (PWNS), even in fluent utterances. The present study compared the articulation rates of PWS and PWNS in two different conditions, oral reading and speech shadowing, in order to elucidate the factors that affect the speech rate in PWS. Method: All participants were instructed to read a text aloud and to shadow a model speech without seeing its transcript. The articulation rate (morae per second) was analyzed with the open-source speech recognition engine “Julius” version 4.3.1 (https://github.com/julius-speech/julius). Pauses and disfluencies were excluded from the calculation of the articulation rate in the present study. Results: The mean articulation rate of PWS was significantly lower than that of PWNS only in oral reading, but not in speech shadowing. PWS showed a significantly faster articulation rate, comparable to that of the model speech, in shadowing than in oral reading, while PW...
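The articulation rate described in the Method can be sketched as follows: morae per second, counted over fluent speech only. The segment format, the labels, and the numbers are assumptions made for this example; they do not reproduce the output format of Julius or the data of the study.

```python
def articulation_rate(segments):
    """Morae per second over fluent speech only.

    Each segment is (label, start_s, end_s, mora_count). Segments labeled
    anything other than "speech" (pauses, disfluencies) are excluded from
    both the mora count and the duration.
    """
    fluent = [s for s in segments if s[0] == "speech"]
    duration = sum(end - start for _, start, end, _ in fluent)
    morae = sum(count for _, _, _, count in fluent)
    if duration <= 0:
        raise ValueError("no fluent speech to measure")
    return morae / duration


# Hypothetical labeled segments for one short utterance.
segments = [
    ("speech", 0.0, 1.5, 12),
    ("pause", 1.5, 2.0, 0),
    ("disfluency", 2.0, 2.5, 2),
    ("speech", 2.5, 3.5, 8),
]

rate = articulation_rate(segments)
print(f"articulation rate: {rate:.2f} morae/s")
```

Excluding pauses and disfluencies from the denominator is what separates an articulation rate from an overall speaking rate, which would divide by the full 3.5 s of this example.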

international conference on signal processing | 2008

Corpus-based generation of F0 contours of Japanese based on the generation process model and its control for prosodic focus

Keikichi Hirose; Keiko Ochi; Nobuaki Minematsu

A total corpus-based process of generating prosodic features from text is developed. The process first predicts pauses and phone durations, and then generates F0 contours. Since F0 contour generation is based on the generation process model, it is rather easy to manipulate the generated F0 contours at the command level. A method was developed for generating sentence F0 contours when a focus is placed on one of the “bunsetsu” of an utterance. The method predicts the differences in the F0 model commands between utterances with and without focus, and applies them to the F0 model commands predicted beforehand by the baseline method. The validity of the method was confirmed by experiments on F0 contour generation and speech synthesis.

conference of the international speech communication association | 2007