Yasuyuki Masai | Researchain

Publication

Featured research published by Yasuyuki Masai.

Intelligent User Interfaces | 2006

A multi modal supporting tool for multi lingual communication by inducing partner's reply

Kazunori Imoto; Munehiko Sasajima; Taishii Shimomori; Noriko Yamanaka; Makoto Yajima; Yasuyuki Masai

This paper introduces a new tool for supporting multilingual communication between speakers of different languages. Conventional tools such as electronic dictionaries enable users to communicate basic intentions to others, but are often insufficient to help them understand replies. When a Japanese sentence is entered, the proposed tool not only produces a translation of the sentence but also displays a window featuring possible answers. The authors evaluated a prototype system, which gave them a thorough understanding of the merits and shortcomings of the proposed tool.

International Conference on Spoken Language Processing | 1996

Word-spotting based on inter-word and intra-word diphone models

Tsuneo Nitta; Shinichi Tanaka; Yasuyuki Masai; Hiroshi Matsuura

The authors propose a precise but simple inter-word diphone model (IDM) for word-spotting based on SMQ/HMM. They have applied ordinary diphone models to a speaker-independent, large-vocabulary word recognition unit. However, because users are apt to add words and/or extraneous speech, accuracy degrades due to the mismatch of models at word-boundaries. The IDM represents a transition from the preceding phonemes to a word or from a word to the succeeding phonemes. An experiment showed that the IDMs reduce error rates by about 5% for speech containing unknown words and extraneous speech. The experiment also showed that the proposed method ensured performance good enough for the practical use of a large-vocabulary isolated-word recognition system.

International Conference on Acoustics, Speech, and Signal Processing | 1992

Representing dynamic features of phonetic segment in an orthogonalized codebook of HMM based speech recognition system

Tsuneo Nitta; Junichi Iwasaki; Yasuyuki Masai; Hiroshi Matsuura

The authors propose a matrix quantization (MQ) algorithm named statistical MQ (SMQ), which uses an orthogonalized phonetic segment codebook. The SMQ effectively incorporates pattern variations of each phonetic segment into the orthogonalized phonetic segment codebook, and transforms input speech into a sequence of phonetic symbols drawn from about 700 types of phonetic segments. The authors also propose a simple SMQ-HMM training algorithm called an equally counted K-based learning, in which each phonetic event observed within the best K is equally counted in a model and output probabilities are smoothed without fuzzy rules. The proposed algorithm has been tested on a 546-word vocabulary data set uttered by 10 unknown speakers, using a real-time recognition system, and has achieved a high performance of 96.5%.
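The equal-counting idea in the abstract above can be illustrated with a minimal sketch. It is only an illustration, not the authors' implementation: it assumes each training frame is already aligned to an HMM state and that the quantizer returns a ranked list of candidate phonetic-segment labels per frame. The function name `estimate_output_probs`, the input layout, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def estimate_output_probs(aligned_frames, k):
    """Estimate discrete HMM output probabilities by equal K-best counting.

    aligned_frames: iterable of (state, ranked_labels) pairs, where
    ranked_labels lists candidate segment labels from best to worst.
    (This input layout is an assumption made for the sketch.)
    """
    counts = defaultdict(lambda: defaultdict(float))
    for state, ranked_labels in aligned_frames:
        # Every label within the best K gets the same weight,
        # regardless of its rank or quantization distance.
        for label in ranked_labels[:k]:
            counts[state][label] += 1.0

    probs = {}
    for state, label_counts in counts.items():
        total = sum(label_counts.values())
        # Normalize counts into a probability distribution per state.
        probs[state] = {lab: c / total for lab, c in label_counts.items()}
    return probs


# Toy data: three frames, hypothetical labels "a", "b", "c".
frames = [
    (0, ["a", "b", "c"]),
    (0, ["a", "c", "b"]),
    (1, ["b", "a", "c"]),
]
probs = estimate_output_probs(frames, k=2)
```

Because runner-up labels receive the same count as the best label, probability mass is spread over near-miss segments, which is one plausible reading of how the output probabilities are smoothed without rank-dependent (fuzzy) weights.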

Systems and Computers in Japan | 1995

Multimodal dialogue system MultiksDial

Hiroyuki Kamio; Hiroshi Matsuura; Tsuneo Nitta; Yasuyuki Masai

This paper describes the multimodal dialogue system MultiksDial, which focuses on speech input and output. In this system, the speech recognition unit and the touch panel are used as the means of input, and the Text-To-Speech (TTS) synthesizer and the display are used as the means of output. In other words, both the input and the output of the system are multimodal. As an auxiliary means of input, a photoelectric sensor is used to realize a smooth dialogue by monitoring the behavior of the user and guiding the operation. An information guide system is implemented on MultiksDial, and the operability of the user interface is evaluated. An experiment comparing the input modes demonstrates the following characteristics of speech input. Direct indication by speech input executes an operation faster than touch input, which requires stepwise indication. In addition, a beginner can carry out a smooth dialogue when the operation is guided by synthesized speech. These results verify that a multimodal implementation of the dialogue channel is useful in realizing an efficient dialogue between the user and the system.

Archive | 2007