Yasuo Ariki
Kyoto University
Publications
Featured research published by Yasuo Ariki.
International Conference on Acoustics, Speech, and Signal Processing | 1986
Yasuo Ariki; K. Kajimoto; Toshiyuki Sakai
We describe a noise reduction method that diffuses and suppresses the noise component while concentrating and enhancing speech formant information. Two-Dimensional Spectral Smoothing (TDSS) is applied over a time sequence of spectral envelopes to diffuse noise and concentrate formant information. After consonant enhancement, Non-linear Spectral Amplitude Transformation (NSAT) is carried out to further suppress the noise component and enhance formant information. This noise reduction method has proven superior to the conventional frequency subtraction method.
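The two steps named in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual TDSS kernel or the NSAT curve, so the box filter in `smooth_2d` and the power law in `nonlinear_amplitude` are assumptions chosen only to show the effect: a formant ridge that persists over time survives averaging, while an isolated noise spike is diffused and then pushed down further by the non-linear mapping. All function and variable names are hypothetical.

```python
def smooth_2d(spec, radius=1):
    """Assumed stand-in for TDSS: average each cell with its
    time-frequency neighbours (box filter, clipped at the edges)."""
    n_t, n_f = len(spec), len(spec[0])
    out = [[0.0] * n_f for _ in range(n_t)]
    for t in range(n_t):
        for f in range(n_f):
            total, count = 0.0, 0
            for dt in range(-radius, radius + 1):
                for df in range(-radius, radius + 1):
                    tt, ff = t + dt, f + df
                    if 0 <= tt < n_t and 0 <= ff < n_f:
                        total += spec[tt][ff]
                        count += 1
            out[t][f] = total / count
    return out


def nonlinear_amplitude(spec, gamma=2.0):
    """Assumed stand-in for NSAT: normalise to the peak and raise to
    gamma > 1, so weak components shrink more than strong ones."""
    peak = max(max(row) for row in spec)
    if peak <= 0:
        return [row[:] for row in spec]
    return [[(v / peak) ** gamma * peak for v in row] for row in spec]


# Toy spectral envelope: 5 frames x 7 frequency bins.
spec = [[0.0] * 7 for _ in range(5)]
for row in spec:
    row[2] = 10.0        # formant ridge, present in every frame
spec[2][5] = 10.0        # isolated noise spike, same height as the ridge

smoothed = smooth_2d(spec)
enhanced = nonlinear_amplitude(smoothed)

for name, s in (("input", spec), ("smoothed", smoothed), ("enhanced", enhanced)):
    print(name, "ridge/spike ratio:", round(s[2][2] / s[2][5], 2))
```

In this toy input the ridge and the spike start at the same height; the ratio between them grows after smoothing and grows again after the amplitude transformation.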
Systems and Computers in Japan | 1987
Yasuo Ariki; Kouji Wakimoto; Hui Shieh; Toshiyuki Sakai
This paper considers the case where an existing drawing is reused after partial modification, and discusses a system that can quickly and easily modify characters and figures at the image level. First, the input drawing is segmented automatically and described in terms of its constituent elements, such as characters, line segments, connecting points, and the frames surrounding characters. Because this is a structural description and no particular type of drawing is assumed, the method can be applied to various kinds of drawings. Next, the desired transformation is planned on the basis of the description. Finally, the image is transformed according to the transformed description. The proposed system has the following features: (1) a versatile structural description is available; (2) a two-stage transformation is employed, first at the description level and then at the image level, which makes global transformations possible, for example resizing the frame surrounding a character string when the string length changes; and (3) real-time image processing, such as extraction of the constituent elements and image generation, is realized by a data-flow processor.
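The two-stage idea in feature (2) can be sketched in a few lines. This is only an illustration under strong assumptions: the `Frame` record, the ASCII "image", and the rule that the frame is two cells wider than its text are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Hypothetical description of a framed character string."""
    text: str
    width: int  # total width in cells, including both borders


def transform_description(frame, new_text):
    """Stage 1: edit the description. The frame width follows the
    new string length, which is the 'global' part of the change."""
    return replace(frame, text=new_text, width=len(new_text) + 2)


def render(frame):
    """Stage 2: regenerate the image (here ASCII rows) from the description."""
    border = "+" + "-" * (frame.width - 2) + "+"
    middle = "|" + frame.text.center(frame.width - 2) + "|"
    return [border, middle, border]


original = Frame(text="abc", width=5)
modified = transform_description(original, "abcdef")
print("\n".join(render(original)))
print("\n".join(render(modified)))
```

Because the frame is resized in the description before any pixels are produced, the image stage never has to detect that the longer string would overflow the old frame.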
Information Sciences | 1984
Kiyoshi Maenobu; Yasuo Ariki; Toshiyuki Sakai
A method of speaker-independent connected-word recognition based on segmentation that is robust to speaker variation is described. To normalize the variation between speakers, an input speech pattern is transformed through segmentation and labeling into a sequence of phonemically labeled segments (a phoneme string), which varies less between speakers. Connected-word recognition is carried out using a two-level DP matching algorithm on that phoneme string. The input speech pattern is oversegmented in order to avoid omissions, which cause fatal errors in word recognition. The number of segments that correspond to one phoneme should depend on the phoneme; the number of segments for vowels should be greater than that for consonants. From this viewpoint, we propose a method of varying the matching path adaptively with respect to each phoneme at the dynamic-programming word-matching level. In experiments on spoken-word recognition of one to four connected digits, the recognition rate was about 90% for each word and about 80% for each sequence of words, on average over seven male speakers. When the words are spoken clearly, the former improved to 93.8% and the latter to 86.0% on average.
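The phoneme-dependent matching path can be sketched as a small dynamic program. This covers only the word-matching level, not the full two-level algorithm, and the specifics are assumptions: the limits of three segments per vowel and two per consonant, the 0/1 label cost, and the tiny lexicon are hypothetical values chosen for illustration.

```python
VOWELS = set("aiueo")


def max_segments(phoneme, vowel_max=3, consonant_max=2):
    """Assumed limits: a vowel may absorb more oversegmented pieces."""
    return vowel_max if phoneme in VOWELS else consonant_max


def word_distance(segments, phonemes):
    """Minimum number of mislabeled segments when every phoneme must
    cover between 1 and max_segments(phoneme) consecutive segments."""
    inf = float("inf")
    n, m = len(segments), len(phonemes)
    # d[i][j]: best cost of explaining the first i segments with j phonemes
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for j in range(1, m + 1):
        ph = phonemes[j - 1]
        for i in range(1, n + 1):
            cost = 0
            for k in range(1, max_segments(ph) + 1):
                if i - k < 0:
                    break
                # extend the run backwards by one more segment
                cost += 0 if segments[i - k] == ph else 1
                d[i][j] = min(d[i][j], d[i - k][j - 1] + cost)
    return d[n][m]


# Hypothetical lexicon and an oversegmented, labeled input.
lexicon = {"ni": ["n", "i"], "san": ["s", "a", "n"], "go": ["g", "o"]}
segments = ["n", "n", "i", "i", "i"]

scores = {word: word_distance(segments, phs) for word, phs in lexicon.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

The path constraint is what makes oversegmentation harmless here: extra segments are absorbed by the phoneme they belong to, up to a limit that depends on the phoneme class.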
Systems and Computers in Japan | 1989
Yasuo Ariki; Masaaki Nagata; Toshiyuki Sakai
This paper experimentally examines the dynamic features that are useful for recognizing words and monosyllables. Two-dimensional cepstrum analysis of the speech waveform yields two-dimensional dynamic features that capture variation in both time and frequency. First, the usefulness of the dynamic features along the time axis is examined. The results show that the spectral envelope and its global variation are important for recognition. In particular, the monosyllable recognition rate is improved by applying a high-emphasis lifter to the dynamic features, and noise immunity is improved by applying a low-pass lifter. The reason is that the high-emphasis lifter on the two-dimensional cepstrum emphasizes the formants in two dimensions on the time-frequency plane, while the low-pass lifter corresponds to two-dimensional smoothing on the time-frequency plane. Finally, liftering on the two-dimensional cepstrum is compared experimentally with the usual one-dimensional cepstrum. The two-dimensional cepstrum proves better both for monosyllable recognition and for recognition in a noisy environment.
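The statement that a low-pass lifter corresponds to two-dimensional smoothing can be checked on a toy example. The sketch below assumes the two-dimensional cepstrum is obtained as the 2D inverse DFT of a time sequence of log spectra; the paper's exact analysis conditions are not given in the abstract, and the helper names, the rectangular lifter, and the cutoffs are hypothetical.

```python
import cmath
import math


def dft2(x, inverse=False):
    """Naive 2D DFT over a (time x frequency) grid; fine for tiny inputs."""
    n_t, n_f = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * n_f for _ in range(n_t)]
    for u in range(n_t):
        for v in range(n_f):
            acc = 0j
            for t in range(n_t):
                for f in range(n_f):
                    angle = 2 * cmath.pi * (u * t / n_t + v * f / n_f)
                    acc += x[t][f] * cmath.exp(sign * 1j * angle)
            out[u][v] = acc / (n_t * n_f) if inverse else acc
    return out


def low_pass_lifter(cep, cut_t, cut_f):
    """Keep only coefficients whose (circular) index is low on both axes."""
    n_t, n_f = len(cep), len(cep[0])
    return [[cep[u][v] if min(u, n_t - u) <= cut_t and min(v, n_f - v) <= cut_f
             else 0j
             for v in range(n_f)] for u in range(n_t)]


# Toy log spectrum: a smooth envelope plus a checkerboard disturbance.
n_t, n_f = 4, 8
envelope = [math.cos(2 * math.pi * f / n_f) for f in range(n_f)]
log_spec = [[envelope[f] + 0.5 * (-1) ** (t + f) for f in range(n_f)]
            for t in range(n_t)]

cepstrum_2d = dft2(log_spec, inverse=True)                 # 2D cepstrum
liftered = low_pass_lifter(cepstrum_2d, cut_t=1, cut_f=2)
smoothed = [[c.real for c in row] for row in dft2(liftered)]

worst = max(abs(smoothed[t][f] - envelope[f])
            for t in range(n_t) for f in range(n_f))
print("max deviation from the smooth envelope:", worst)
```

The checkerboard term sits at the highest index on both axes of the two-dimensional cepstrum, so the lifter removes it and the reconstruction matches the smooth envelope up to rounding error.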
IEE proceedings. Part I. Solid-state and electron devices | 1989
Yasuo Ariki; S. Mizuta; M. Nagata; Toshiyuki Sakai
International Joint Conference on Artificial Intelligence | 1987
Yasuo Ariki; Masashi Morimoto; Toshiyuki Sakai
音声科学研究 = Studia phonologica | 1984
Tooru Hasegawa; Yasuo Ariki; Toshiyuki Sakai
音声科学研究 = Studia phonologica | 1987
Yasuo Ariki; Kazuo Kajimoto; Toshiyuki Sakai
音声科学研究 = Studia phonologica | 1985
Yasuo Ariki; Toshiyuki Sakai
音声科学研究 = Studia phonologica | 1985
Masaaki Nagata; Yasuo Ariki; Toshiyuki Sakai