Satoru Tsuge
Daido University
Publications
Featured research published by Satoru Tsuge.
international symposium on intelligent signal processing and communication systems | 2009
Satoru Tsuge; Daichi Koizumi; Minoru Fukumi; Shingo Kuroiwa
Recently, new sensors such as bone-conductive microphones, throat microphones, and non-audible murmur (NAM) microphones have been developed for collecting speech data, in addition to conventional condenser microphones. Accordingly, researchers have begun to study speaker and speech recognition using speech data collected by these new sensors. We focus on bone-conduction speech collected by a bone-conductive microphone. In this paper, we first investigate the speaker verification performance of bone-conduction speech. In addition, we propose a method that uses bone-conduction and air-conduction speech together for speaker verification. The proposed method integrates the similarity calculated by the air-conduction speech model with the similarity calculated by the bone-conduction speech model. We conducted speaker verification experiments using speech data from 99 female speakers. Experimental results show that the speaker verification performance of bone-conduction speech is lower than that of air-conduction speech. However, the proposed method improves on both the bone- and air-conduction results: it reduces the equal error rate of air-conduction speech by 16.0% and that of bone-conduction speech by 71.7%.
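As a rough illustration of the score-level fusion described above, the following Python sketch linearly combines the two model similarities; the weight `alpha`, the threshold, and the scores are illustrative assumptions, not values from the paper.

```python
def fused_similarity(score_air: float, score_bone: float, alpha: float = 0.7) -> float:
    """Linearly combine the air- and bone-conduction model similarities.

    alpha weights the air-conduction score; its value here is an
    illustrative assumption, not a tuned setting from the paper.
    """
    return alpha * score_air + (1.0 - alpha) * score_bone


def verify(score_air: float, score_bone: float, threshold: float) -> bool:
    """Accept the claimed speaker if the fused similarity clears a threshold."""
    return fused_similarity(score_air, score_bone) >= threshold


# Example trial with made-up similarity scores and threshold.
print(verify(score_air=0.82, score_bone=0.55, threshold=0.70))  # True
```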
international conference on modeling, simulation, and applied optimization | 2011
Masato Miyoshi; Satoru Tsuge; Tadahiro Oyama; Momoyo Ito; Minoru Fukumi
In general, music retrieval and classification methods based on music moods use many of the same acoustic features as music genre classification, such as spectral, rhythm, and harmony features. However, not all of these features may be effective for mood-based retrieval and classification. Hence, in this paper, we propose a feature selection method for detecting music mood scores. The proposed method selects, from a large feature set, the features that are strongly correlated with mood scores. These are then input into Multi-Layer Neural Networks (MLNNs), and a mood score is detected for each mood label. To evaluate the proposed method, we conducted music mood score detection experiments. Experimental results show that the proposed method improves detection performance compared to using no feature selection.
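The correlation-based selection step could look like the following sketch, with scikit-learn's MLPRegressor standing in for the MLNNs; the correlation threshold, network size, and toy data are all assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def select_by_correlation(X: np.ndarray, y: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    """Indices of features whose |Pearson correlation| with the mood
    score y exceeds the threshold (the threshold value is illustrative)."""
    corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.where(np.abs(corrs) > threshold)[0]

# Toy data: 200 tracks, 40 acoustic features, one mood score per track.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = 0.6 * X[:, 3] - 0.4 * X[:, 7] + 0.1 * rng.normal(size=200)

selected = select_by_correlation(X, y)
mlnn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlnn.fit(X[:, selected], y)  # in the full system: one network per mood label
```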
international conference on natural computation | 2011
Qingmei Xiao; Satoru Tsuge; Kenji Kita
In this paper, we propose a novel music retrieval method using a filter bank feature and the earth mover's distance (EMD). The proposed method uses MFCCs as acoustic features and EMD as the distance measure. Evaluation experiments show that the proposed method achieves an accuracy of 96.73%.
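A minimal sketch of this kind of pipeline, assuming librosa for MFCC extraction and approximating the multi-dimensional EMD with per-coefficient one-dimensional Wasserstein distances; this is a simplification for illustration, not the paper's exact formulation.

```python
import numpy as np
import librosa
from scipy.stats import wasserstein_distance

def mfcc_frames(path: str, n_mfcc: int = 13) -> np.ndarray:
    """Load a clip and return its MFCC frames, shape (n_frames, n_mfcc)."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def clip_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of per-coefficient 1-D Wasserstein (EMD) distances between the
    two clips' MFCC frame distributions -- a simplified stand-in for the
    multi-dimensional EMD."""
    return sum(wasserstein_distance(a[:, k], b[:, k]) for k in range(a.shape[1]))

def retrieve(query_path: str, db_paths: list[str]) -> str:
    """Return the database clip closest to the query."""
    q = mfcc_frames(query_path)
    return min(db_paths, key=lambda p: clip_distance(q, mfcc_frames(p)))
```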
international conference on knowledge based and intelligent information and engineering systems | 2010
Masato Miyoshi; Satoru Tsuge; Hillary Kipsang Choge; Tadahiro Oyama; Momoyo Ito; Minoru Fukumi
In previous work, we proposed an automatic sensitive-word score detection system for a user-dependent music retrieval system. However, the user-dependent method places a heavy burden on the user because the system requires a large amount of data to adapt to each user. Hence, in this paper, we propose an automatic sensitive-word score detection method for a user-independent music retrieval system and evaluate it using 225 pieces of music. Experimental results show that detection succeeded for 87.5% of the music pieces when a difference of 1 between the estimated and evaluated scores is allowed (the Error 1 rate). Moreover, we conducted subjective evaluation experiments to assess the practical utility of the proposed method. These experiments show that user satisfaction with the proposed method is higher than with random-selection impression detection.
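The Error 1 rate can be read as the fraction of pieces whose estimated score falls within 1 of the human-evaluated score; a minimal sketch with made-up scores:

```python
import numpy as np

def error1_rate(estimated: np.ndarray, evaluated: np.ndarray) -> float:
    """Fraction of pieces whose estimated sensitive-word score is within
    1 of the human-evaluated score (the paper's "Error 1 rate")."""
    return float(np.mean(np.abs(estimated - evaluated) <= 1))

# Toy example: 5 pieces with integer impression scores.
print(error1_rate(np.array([3, 5, 2, 4, 1]), np.array([4, 5, 4, 3, 1])))  # 0.8
```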
asia-pacific signal and information processing association annual summit and conference | 2013
Ken Ichikawa; Satoru Tsuge; Norihide Kitaoka; Kazuya Takeda; Kenji Kita
In this paper, we propose a spoken document retrieval method using vector space models over multiple document spaces. We first construct multiple document vector spaces, one based on continuous-word speech recognition results and the other on continuous-syllable speech recognition results. Query expansion is also applied in the word-based document space. We propose applying latent semantic indexing (LSI) not only to the word-based space but also to the syllable-based space, reducing the dimensionality of the spaces using implicitly defined semantics. Finally, we combine the distances between the query and the documents across the spaces to rank the documents. In this procedure, we also propose modeling each document by a hyperplane. To evaluate our proposed method, we conducted spoken document retrieval experiments using the NTCIR-9 SpokenDoc data set. The results showed that combining the distances and applying LSI to the syllable-based document space improved retrieval performance.
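A rough sketch of the word/syllable LSI combination using scikit-learn; the component count, the equal combination weight, and the reuse of one query string for both spaces are simplifying assumptions (the paper's hyperplane document model is not shown).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_distances

def lsi_space(texts: list[str], n_components: int = 2):
    """Build a TF-IDF vector space and reduce it with truncated SVD (LSI)."""
    vec = TfidfVectorizer()
    svd = TruncatedSVD(n_components=n_components, random_state=0)
    reduced = svd.fit_transform(vec.fit_transform(texts))
    return vec, svd, reduced

def combined_rank(query: str, word_texts: list[str], syll_texts: list[str],
                  weight: float = 0.5) -> np.ndarray:
    """Rank documents by a weighted sum of cosine distances in the
    word-based and syllable-based LSI spaces (the weight is illustrative)."""
    dists = []
    for texts in (word_texts, syll_texts):
        vec, svd, docs = lsi_space(texts)
        q = svd.transform(vec.transform([query]))
        dists.append(cosine_distances(q, docs)[0])
    combined = weight * dists[0] + (1.0 - weight) * dists[1]
    return np.argsort(combined)  # best-matching document index first
```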
international conference natural language processing | 2007
M. Ozawa; Satoru Tsuge; Masami Shishibori; Kenji Kita; Minoru Fukumi; Fuji Ren; Shingo Kuroiwa
We have been collecting speech data to investigate intra-speaker speech variability over short and long time periods. In general, to reduce the burden on speakers, the speech data are recorded as a single file from the start to the end of a session. Consequently, the file contains noise, non-speech sections, and misspoken sections, and it must be segmented into individual utterances from which the useful ones are selected. This process requires considerable time and effort. In this paper, we propose an automatic utterance segmentation tool for dividing the collected speech data. The proposed tool consists of four processes: voice activity detection, speech recognition, DP matching, and correction of speech sections. To evaluate the tool, we conducted experiments using one female speaker's data from our corpus. Experimental results show that the proposed tool reduces filing time by 90% compared to manual filing, demonstrating that building the speech corpus becomes much simpler with the automatic utterance segmentation tool.
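The DP matching step can be sketched as a standard edit-distance alignment between the recognized token sequence and its prompt; everything below is an illustrative reconstruction, not the tool's actual code.

```python
def dp_match(recognized: list[str], prompt: list[str]) -> int:
    """Minimum edit distance between the recognized token sequence and
    the prompt; in the tool, such an alignment decides whether a detected
    speech section matches its intended utterance."""
    m, n = len(recognized), len(prompt)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if recognized[i - 1] == prompt[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[m][n]

# Example: a recognition result with one substitution relative to its prompt.
print(dp_match("k o N n i ch i w a".split(), "k o n n i ch i w a".split()))  # 1
```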
international conference natural language processing | 2007
Satoru Tsuge; Keiji Seida; Masami Shishibori; Kenji Kita; Fuji Ren; Minoru Fukumi; Shingo Kuroiwa
Even when a speaker uses a speaker-dependent speech recognition system, recognition performance varies. However, the relationship between intra-speaker speech variability and speech recognition performance is not well understood. To investigate this relationship, we have been collecting speech data since November 2002. In this paper, we analyze the relationship between intra-speaker speech variability and phoneme accuracy using correlation analysis. The analysis shows a strong negative correlation between phoneme accuracy and speaking rate, with a correlation coefficient of -0.77. Moreover, phoneme accuracy also correlates with the temperature in the recording room and with the humidity difference.
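The correlation analysis itself amounts to a Pearson coefficient per session-level variable; a toy sketch with invented session data illustrating the negative rate/accuracy relationship the paper reports:

```python
import numpy as np

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation coefficient between two per-session series."""
    return float(np.corrcoef(x, y)[0, 1])

# Invented sessions: faster speaking tends to lower phoneme accuracy.
rate = np.array([6.1, 6.8, 7.4, 8.0, 8.6])            # speaking rate (illustrative units)
accuracy = np.array([91.2, 90.0, 88.1, 86.5, 85.0])   # phoneme accuracy (%)
print(pearson(rate, accuracy))  # strongly negative on this toy data
```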
conference of the international speech communication association | 2008
Satoru Tsuge; Takashi Osanai; Hisanori Makinae; Toshiaki Kamada; Minoru Fukumi; Shingo Kuroiwa
society of instrument and control engineers of japan | 2011
Masato Miyoshi; Kentaro Mori; Yasunori Kashihara; Masafumi Nakao; Satoru Tsuge; Minoru Fukumi
Acoustical Science and Technology | 2011
Takahiro Fukumori; Takanobu Nishiura; Masato Nakayama; Yuki Denda; Norihide Kitaoka; Takeshi Yamada; Kazumasa Yamamoto; Satoru Tsuge; Masakiyo Fujimoto; Tetsuya Takiguchi; Chiyomi Miyajima; Satoshi Tamura; Tetsuji Ogawa; Shigeki Matsuda; Shingo Kuroiwa; Kazuya Takeda; Satoshi Nakamura