Publication


Featured research published by Satoru Tsuge.


international symposium on intelligent signal processing and communication systems | 2009

Speaker verification method using bone-conduction and air-conduction speech

Satoru Tsuge; Daichi Koizumi; Minoru Fukumi; Shingo Kuroiwa

Recently, new sensors such as bone-conduction microphones, throat microphones, and non-audible murmur (NAM) microphones have been developed for collecting speech data, in addition to conventional condenser microphones. Accordingly, researchers have begun to study speaker and speech recognition using speech data collected by these new sensors. We focus on bone-conduction speech collected with a bone-conduction microphone. In this paper, we first investigate the speaker verification performance of bone-conduction speech. In addition, we propose a method that uses bone-conduction and air-conduction speech together for speaker verification. The proposed method integrates the similarity calculated by an air-conduction speech model with the similarity calculated by a bone-conduction speech model. We conducted speaker verification experiments using speech data from 99 female speakers. Experimental results show that the speaker verification performance of bone-conduction speech is lower than that of air-conduction speech. However, the proposed method improves the verification performance over both bone- and air-conduction speech alone: it reduces the equal error rate of air-conduction speech by 16.0% and that of bone-conduction speech by 71.7%.
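The score-integration step described in the abstract can be sketched as a weighted combination of the two per-model similarity scores, followed by a threshold decision. The weight and threshold values below are illustrative assumptions, not the paper's actual parameters.

```python
# Sketch of fusing air-conduction and bone-conduction similarity scores.
# `weight` and `threshold` are hypothetical values for illustration.

def fused_score(air_score: float, bone_score: float, weight: float = 0.7) -> float:
    """Linearly interpolate the similarities from the two speaker models."""
    return weight * air_score + (1.0 - weight) * bone_score

def verify(air_score: float, bone_score: float, threshold: float = 0.5) -> bool:
    """Accept the claimed speaker if the fused similarity clears the threshold."""
    return fused_score(air_score, bone_score) >= threshold
```

In practice the interpolation weight would be tuned on development data, e.g. to minimize the equal error rate.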


international conference on modeling, simulation, and applied optimization | 2011

Feature selection method for music mood score detection

Masato Miyoshi; Satoru Tsuge; Tadahiro Oyama; Momoyo Ito; Minoru Fukumi

In general, music retrieval and classification methods based on music mood use many of the same acoustic features as music genre classification, such as spectral, rhythm, and harmony features. However, not all of these features may be effective for mood-based music retrieval and classification. Hence, in this paper, we propose a feature selection method for detecting music mood scores. In the proposed method, features that correlate strongly with mood scores are selected from a large feature set. These are then input to Multi-Layer Neural Networks (MLNNs), and a mood score is detected for each mood label. To evaluate the proposed method, we conducted music mood score detection experiments. Experimental results show that the proposed method improves detection performance compared to using no feature selection.
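The selection step, keeping only features that correlate strongly with the mood scores, can be sketched with Pearson correlation; the cut-off value below is an illustrative assumption.

```python
# Minimal sketch of correlation-based feature selection: keep feature
# columns whose absolute Pearson correlation with the target mood score
# exceeds a threshold (threshold value is a hypothetical choice).
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(feature_columns, scores, threshold=0.5):
    """Return indices of feature columns strongly correlated with the scores."""
    return [i for i, col in enumerate(feature_columns)
            if abs(pearson(col, scores)) >= threshold]
```

The surviving feature subset would then be fed to the MLNN for each mood label.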


international conference on natural computation | 2011

Music retrieval method based on filter-bank feature and earth mover's distance

Qingmei Xiao; Satoru Tsuge; Kenji Kita

In this paper, we propose a novel music retrieval method using a filter-bank feature and the earth mover's distance (EMD). In the proposed method, we use MFCCs as acoustic features and EMD as the distance measure. Evaluation experiments show that the proposed method achieves an accuracy of 96.73%.
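The retrieval idea, ranking database songs by their EMD to the query, can be illustrated with a simple one-dimensional EMD over normalized feature histograms. This is a deliberate simplification: the paper's method operates on multidimensional MFCC-based features, whereas in 1-D the EMD reduces to the sum of absolute CDF differences.

```python
# Illustrative sketch: rank songs by 1-D earth mover's distance between
# hypothetical per-song feature histograms of equal total mass.

def emd_1d(p, q):
    """1-D EMD between equal-mass histograms = sum of absolute CDF gaps."""
    total, cum = 0.0, 0.0
    for pi, qi in zip(p, q):
        cum += pi - qi
        total += abs(cum)
    return total

def retrieve(query_hist, database):
    """Return database keys sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: emd_1d(query_hist, database[name]))
```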


international conference on knowledge based and intelligent information and engineering systems | 2010

Music impression detection method for user independent music retrieval system

Masato Miyoshi; Satoru Tsuge; Hillary Kipsang Choge; Tadahiro Oyama; Momoyo Ito; Minoru Fukumi

In previous work, we proposed an automatic sensitive-word score detection system for a user-dependent music retrieval system. However, the user-dependent method places a heavy burden on the user because the system requires a large amount of data to adapt to each user. Hence, in this paper, we propose an automatic sensitive-word score detection method for a user-independent music retrieval system and evaluate it on 225 pieces of music. Experimental results show that for 87.5% of the music, the sensitive-word score was detected within a difference of 1 between the estimated and human-evaluated scores (the Error 1 rate). Moreover, we conducted subjective evaluation experiments to assess the practical utility of the proposed method. These experiments show that user satisfaction with the proposed method is higher than with random-selection impression detection.
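The "Error 1 rate" used in the evaluation can be sketched as the fraction of songs whose estimated impression score differs from the human-rated score by at most 1; the function and variable names below are illustrative, not taken from the paper.

```python
# Hypothetical sketch of the Error-N rate metric: the fraction of items
# whose estimated score is within N of the human-evaluated score.

def error_n_rate(estimated, evaluated, n=1):
    """Fraction of items with |estimated - evaluated| <= n."""
    hits = sum(1 for e, v in zip(estimated, evaluated) if abs(e - v) <= n)
    return hits / len(estimated)
```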


asia-pacific signal and information processing association annual summit and conference | 2013

Spoken document retrieval using both word-based and syllable-based document spaces with latent semantic indexing

Ken Ichikawa; Satoru Tsuge; Norihide Kitaoka; Kazuya Takeda; Kenji Kita

In this paper, we propose a spoken document retrieval method using vector space models in multiple document spaces. First, we construct multiple document vector spaces: one based on continuous-word speech recognition results and the other on continuous-syllable speech recognition results. Query expansion is also applied to the word-based document space. We propose applying latent semantic indexing (LSI) not only to the word-based space but also to the syllable-based space, to reduce the dimensionality of the spaces using implicitly defined semantics. Finally, we combine the distances between the query and each document across the spaces to rank the documents. In this procedure, we propose modeling each document as a hyperplane. To evaluate our proposed method, we conducted spoken document retrieval experiments using the NTCIR-9 SpokenDoc data set. The results show that combining the distances, and applying LSI to the syllable-based document space, improved retrieval performance.
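The final ranking step, combining per-space query-document distances, can be sketched as a weighted sum of distances in the word-based and syllable-based spaces. Cosine distance and an equal interpolation weight are illustrative assumptions; the paper's actual distance computation (including the hyperplane document model) is more involved.

```python
# Sketch of multi-space ranking: combine a word-space distance and a
# syllable-space distance per document, then sort by the combined value.
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def rank(query_word, query_syll, docs, alpha=0.5):
    """docs: {doc_id: (word_vec, syllable_vec)} -> ids by combined distance."""
    def combined(doc_id):
        wv, sv = docs[doc_id]
        return (alpha * cosine_distance(query_word, wv)
                + (1 - alpha) * cosine_distance(query_syll, sv))
    return sorted(docs, key=combined)
```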


international conference natural language processing | 2007

Automatic Utterance Segmentation Tool for Speech Corpus

M. Ozawa; Satoru Tsuge; Masami Shishibori; Kenji Kita; Minoru Fukumi; Fuji Ren; Shingo Kuroiwa

We collect speech data for investigating intra-speaker speech variability over short and long time periods. In general, to reduce the load on speakers, the speech data are collected as a single file from the start to the end of a recording session. Hence, this file contains noise, non-speech sections, and mistaken sections. Consequently, we must segment the file into individual utterances and select the useful ones, a process that requires considerable time and effort. In this paper, we propose an automatic utterance segmentation tool for dividing the collected speech data. The proposed tool is composed of four processes: voice activity detection, speech recognition, DP matching, and correction of speech sections. To evaluate the proposed tool, we conducted experiments using one female speaker's data from our corpus. Experimental results show that the proposed method reduces filing time by 90% compared to manual filing. In summary, we first introduced the large speech corpus, which contains speech data collected from specific speakers over long and short time periods. We then described the automatic utterance segmentation tool built for constructing the corpus and verified its validity. The results demonstrate that the tool performs well and that it makes building the speech corpus substantially simpler.
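The first of the four processes, voice activity detection, can be sketched as splitting a frame-energy sequence into contiguous voiced regions. The energy threshold and minimum segment length below are hypothetical parameters for illustration; the tool's later stages (recognition, DP matching, correction) would refine these boundaries.

```python
# Toy energy-threshold VAD: return (start, end) frame-index pairs for
# regions where energy stays above `threshold` for at least `min_frames`.

def detect_utterances(energies, threshold=0.5, min_frames=2):
    """Segment a frame-energy sequence into voiced (start, end) spans."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e >= threshold and start is None:
            start = i                      # voiced region begins
        elif e < threshold and start is not None:
            if i - start >= min_frames:    # keep only long-enough spans
                segments.append((start, i))
            start = None
    if start is not None and len(energies) - start >= min_frames:
        segments.append((start, len(energies)))
    return segments
```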


international conference natural language processing | 2007

Analysis of Variation on Intra-Speakers Speech Recognition Performances

Satoru Tsuge; Keiji Seida; Masami Shishibori; Kenji Kita; Fuji Ren; Minoru Fukumi; Shingo Kuroiwa

Even if a speaker uses a speaker-dependent speech recognition system, speech recognition performance varies. However, the relationships between intra-speaker speech variability and speech recognition performance are not well understood. To investigate these relationships, we have been collecting speech data since November 2002. In this paper, we analyze the relationships between intra-speaker speech variability and phoneme accuracy through a correlation analysis. The results show a strong negative correlation between phoneme accuracy and speaking rate, with a correlation coefficient of -0.77. Moreover, phoneme accuracy is also correlated with the temperature in the recording room and with the humidity difference.


conference of the international speech communication association | 2008

Combination method of bone-conduction speech and air-conduction speech for speaker recognition.

Satoru Tsuge; Takashi Osanai; Hisanori Makinae; Toshiaki Kamada; Minoru Fukumi; Shingo Kuroiwa


society of instrument and control engineers of japan | 2011

Personal identification method using footsteps

Masato Miyoshi; Kentaro Mori; Yasunori Kashihara; Masafumi Nakao; Satoru Tsuge; Minoru Fukumi


Acoustical Science and Technology | 2011

CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments

Takahiro Fukumori; Takanobu Nishiura; Masato Nakayama; Yuki Denda; Norihide Kitaoka; Takeshi Yamada; Kazumasa Yamamoto; Satoru Tsuge; Masakiyo Fujimoto; Tetsuya Takiguchi; Chiyomi Miyajima; Satoshi Tamura; Tetsuji Ogawa; Shigeki Matsuda; Shingo Kuroiwa; Kazuya Takeda; Satoshi Nakamura

Collaboration

Top co-authors of Satoru Tsuge:

Kenji Kita, University of Tokushima
Fuji Ren, University of Tokushima
Momoyo Ito, University of Tokushima