
Publication


Featured research published by Yumi Wakita.


Journal of the Acoustical Society of America | 2005

Sentence recognition apparatus, sentence recognition method, program, and medium

Yumi Wakita; Kenji Matsui

In the prior art, it has been difficult to perform accurate sentence recognition using speech recognition or text sentence recognition. The present invention provides a sentence recognition apparatus comprising: a database for storing a plurality of predetermined standard content word pairs, each formed from a plurality of predetermined content words; a speech recognition means for recognizing an input sentence made up of a plurality of words; a content word selection means for selecting content words from among the plurality of words forming the recognized sentence; a judging means for judging whether a content word pair arbitrarily formed from the selected content words matches any one of the standard content word pairs stored in the database; and an erroneously recognized content word determining means for determining, based on the result of the judgement, which of the selected content words was erroneously recognized.
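The pair-matching idea described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation; the standard-pair database and word lists are invented for the example:

```python
from itertools import combinations

# Hypothetical database of standard content word pairs (frozensets, so
# pair order does not matter).
STANDARD_PAIRS = {
    frozenset({"flight", "reserve"}),
    frozenset({"flight", "tokyo"}),
    frozenset({"reserve", "hotel"}),
}

def find_suspect_words(content_words):
    """Return content words that appear in no matching standard pair,
    i.e. candidates for having been misrecognized."""
    matched = set()
    for a, b in combinations(content_words, 2):
        if frozenset({a, b}) in STANDARD_PAIRS:
            matched.update({a, b})
    return [w for w in content_words if w not in matched]

print(find_suspect_words(["reserve", "flight", "tomato"]))  # ['tomato']
```

Words that pair with nothing in the database are flagged, which mirrors the abstract's judging step: a misrecognized content word tends to form no known pair with its neighbors.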


Computer Speech & Language | 1999

Multiple pronunciation dictionary using HMM-state confusion characteristics

Yumi Wakita; Harald Singer; Yoshinori Sagisaka

In this paper, we propose a POS (part-of-speech)-dependent multiple pronunciation dictionary generation method using HMM-state confusions spanning several phonemes. When used in a multi-pass search, a dictionary generated from the method makes it possible to recover missing words that are lost during the first pass of the search process in continuous speech recognition using a single pronunciation dictionary. The new pronunciations are added to a dictionary that considers the POS dependency of the confusion characteristics. Continuous word recognition experiments have confirmed that the best results are obtained when (1) confusions expressed by HMM-state sequences and (2) pronunciation variations considering the POS-dependent confusion characteristics are used.
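A simplified sketch of generating POS-dependent pronunciation variants: the paper derives confusions from HMM-state sequences, whereas this illustration assumes a hand-made phone-level confusion table, so both the table entries and the phone sequences are invented:

```python
# Hypothetical confusion table: (POS, confusable phone subsequence) -> variant.
CONFUSIONS = {
    ("particle", ("w", "a")): ("a",),   # e.g. a reduced particle pronunciation
    ("noun", ("s", "u")): ("s",),       # e.g. a devoiced/deleted final vowel
}

def expand_pronunciations(pos, phones):
    """Return the canonical pronunciation plus POS-dependent variants
    produced by substituting confusable phone subsequences."""
    variants = {tuple(phones)}
    for (p, src), dst in CONFUSIONS.items():
        if p != pos:
            continue
        n = len(src)
        for i in range(len(phones) - n + 1):
            if tuple(phones[i:i + n]) == src:
                variants.add(tuple(phones[:i]) + dst + tuple(phones[i + n:]))
    return sorted(variants)

print(expand_pronunciations("noun", ["d", "e", "s", "u"]))
```

Because the table is keyed on POS, the same phone sequence can expand differently for a noun than for a particle, which is the POS dependency the abstract emphasizes.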


Computers and Electronics in Agriculture | 2015

Prediction of K value for fish flesh based on ultraviolet-visible spectroscopy of fish eye fluid using partial least squares regression

Anisur Rahman; Naoshi Kondo; Yuichi Ogawa; Tetsuhito Suzuki; Yuri Shirataki; Yumi Wakita

Highlights: UV-VIS spectra of fish eye fluid are used to predict the K value of fish flesh. A regression model is developed using the partial least squares (PLS) regression technique. The K value of fish flesh is predicted with an Rpred2 of 0.87 and an RMSEP of 7.87%.

A method to predict the K value of fish flesh using the ultraviolet-visible (UV-VIS) spectral properties (250-600 nm) of its eye fluid and a partial least squares (PLS) regression method is investigated. The UV-VIS absorbance of eye fluid was monitored for 240 fresh fish (Japanese dace) while simultaneously measuring the K value of the fish flesh by a paper electrophoresis technique. Several spectral pre-processing techniques, such as moving average smoothing, normalization, multiplicative scatter correction (MSC), and Savitzky-Golay first- and second-order derivatives, were compared. The results showed that the PLS regression model based on MSC-preprocessed spectra performed better than the models developed with the other preprocessing methods, with a determination coefficient of prediction (Rpred2) of 0.87 and a root mean square error of prediction (RMSEP) of 7.87%. Therefore, UV-VIS spectroscopy combined with appropriate multivariate analysis has the potential to accurately predict the K value of fish flesh.


workshop on perceptive user interfaces | 2001

An experimental multilingual speech translation system

Kenji Matsui; Yumi Wakita; Tomohiro Konuma; Kenji Mizutani; Mitsuru Endo; Masashi Murata

In this paper, we describe an experimental speech translation system utilizing small, PC-based hardware with a multi-modal user interface. Two major problems for people using an automatic speech translation device are speech recognition errors and language translation errors. In this paper we focus on developing techniques to overcome these problems. The techniques include a new language translation approach based on example sentences, simplified expression rules, and a multi-modal user interface that shows possible speech recognition candidates retrieved from the example sentences. The combination of the proposed techniques can provide accurate language translation even if the speech recognition result contains some errors. We propose to use keyword classes, looking at the dependency between keywords, to detect misrecognized keywords and to search the example expressions. The suitable example expression is then chosen using a touch panel or by pushing buttons. The language translation picks up the corresponding expression in the other language, which should always be grammatically correct. Simplified translated expressions are realized by speech-act-based simplifying rules so that the system can avoid various redundant expressions. A simple comparison study showed that the proposed method produces output almost 2 to 10 times faster than a conventional translation device.
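Example retrieval by keyword overlap, the core of the example-based approach described above, might look like this in outline; the example store, keyword sets, and ranking rule are invented for illustration:

```python
# Hypothetical example-sentence store: keyword set -> (source, target) pair.
EXAMPLES = [
    ({"reserve", "room"}, ("I'd like to reserve a room.", "部屋を予約したいのですが。")),
    ({"ticket", "buy"}, ("Where can I buy a ticket?", "切符はどこで買えますか。")),
]

def retrieve(recognized_keywords):
    """Return (source, target) example pairs ranked by keyword overlap with
    the (possibly misrecognized) recognition result."""
    kw = set(recognized_keywords)
    ranked = sorted(EXAMPLES, key=lambda e: len(kw & e[0]), reverse=True)
    return [pair for keys, pair in ranked if kw & keys]

print(retrieve(["reserve", "room", "banana"])[0][0])
```

Because the translation is read off a stored example rather than generated, a partially misrecognized input ("banana" above) can still retrieve the intended, grammatically correct target expression.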


international conference on human-computer interaction | 2016

Influence of Personal Characteristics on Nonverbal Information for Estimating Communication Smoothness

Yumi Wakita; Yuta Yoshida; Mayu Nakamura

To realize a system that can provide new topics of discussion for improving lively and smooth human-to-human communication, a method for estimating conversation smoothness is necessary. In developing such a method, we confirmed the effectiveness of using the fundamental frequency (F0). Analysis of free dyadic conversations showed that the F0 of laughter utterances is strongly dependent on personal characteristics, whereas the F0 of utterances excluding laughter is effective for estimating conversation smoothness.
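A basic F0 estimator of the kind such analyses rely on can be sketched by autocorrelation peak picking; the frame size, sample rate, and search range below are illustrative assumptions, and the input is a synthetic tone rather than real conversational speech:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by picking the
    autocorrelation peak within a plausible speech F0 lag range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

sr = 16000
t = np.arange(sr) / sr
voiced = np.sin(2 * np.pi * 220 * t)   # synthetic 220 Hz "utterance"
print(round(estimate_f0(voiced[:640], sr), 1))
```

Tracking per-frame F0 over an utterance gives the contour statistics (mean, range, variation) that smoothness-estimation studies typically compare between speakers.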


international conference on digital human modeling and applications in health, safety, ergonomics and risk management | 2017

F0 Feature Analysis of Communication Between Elderly Individuals for Health Assessment

Yumi Wakita; Shunpei Matsumoto

This study explores a system that estimates the health condition of an elderly individual using nonverbal information from daily conversations.


consumer communications and networking conference | 2004

Evaluation of a speech translation system for travel conversation installed in PDA

Kenji Mizutani; Tomohiro Konuma; Mitsuru Endo; Taro Nambu; Yumi Wakita

For mobile use, we are developing a multilingual example-sentence-driven speech translation system that has a multi-modal input interface for retrieving the sentence. In addition to the basic speech input mode, we apply an associative keyword mode with a software keyboard and an associative domain selection mode. The paper discusses the characteristics of each input mode, the synergistic effect obtained by combining the modes, and the results of evaluations that show the difference between system performance in the laboratory and in the real world. As the evaluation criteria, we adopted the retrieval time and the retrieval precision of the sentence. When all of the modes were available, the precision within 30 seconds was 86.8% for a closed test set and 76.8% for an open test set. When the retrieval was completed with only one operation, the average time was 10.3 seconds for a closed set. The precision was 12.0% higher than the maximum precision obtained when only one of the modes was available. The results show that a synergistic effect of the combined modes certainly exists and that all the modes are necessary to improve the system's usability.


Journal of the Acoustical Society of America | 2008

Language transference rule producing apparatus, language transferring apparatus method, and program recording medium

Yumi Wakita


Journal of the Acoustical Society of America | 2005

Method and apparatus for converting an expression using key words

Yumi Wakita; Kenji Matsui


Archive | 1999

Language conversion rule preparing device, language conversion device and program recording medium

Yumi Wakita
