Takahiro Fukumori
Ritsumeikan University
Publication
Featured research published by Takahiro Fukumori.
International Conference on Computer Graphics and Interactive Techniques | 2010
Woong Choi; Takahiro Fukumori; Kohei Furukawa; Kozaburo Hachimura; Takanobu Nishiura; Keiji Yano
Recently, extensive research has been undertaken on digital archiving of cultural properties in the field of cultural heritage. These investigations have examined the processes of recording and preserving both tangible and intangible materials through the use of digital technologies.
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013
Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
Automatic speech recognition (ASR) performance degrades in noisy and reverberant environments. Although various techniques have been proposed against this degradation, it is difficult to apply them appropriately in evaluation environments whose noise and reverberation conditions are unknown. These techniques could be applied properly to improve ASR performance if the relationship between ASR performance and the degradation factors, covering both noise and reverberation, could be estimated. In this study, we propose new noisy and reverberant criteria, referred to as "Noisy and Reverberant Speech Recognition with the PESQ and the Dn (NRSR-PDn)". We first designed the NRSR-PDn criteria using the relationships among the D value, the PESQ score, and the ASR performance, and then estimated the ASR performance with the designed criteria in evaluation experiments. The experimental evaluations demonstrated that the proposed criteria are well suited for robustly estimating ASR performance in noisy and reverberant environments.
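The abstract does not specify how the NRSR-PDn criteria combine the PESQ score and the D value, so the following is only a minimal sketch of the general idea: regressing measured ASR word accuracy on (PESQ, D value) pairs collected under known conditions, then using the fitted model to predict accuracy for an unseen noisy and reverberant condition. All numbers are hypothetical.

```python
# Hedged sketch (not the paper's NRSR-PDn formulation): predict ASR accuracy
# from PESQ and D-value measurements with a simple linear regression.
import numpy as np

# Hypothetical training data: PESQ score, D value, and measured word accuracy (%).
pesq     = np.array([1.8, 2.4, 3.1, 3.6, 4.2])
d_value  = np.array([30.0, 45.0, 55.0, 70.0, 85.0])
word_acc = np.array([42.0, 58.0, 71.0, 83.0, 92.0])

# Fit acc ≈ w0 + w1*PESQ + w2*D by least squares.
X = np.column_stack([np.ones_like(pesq), pesq, d_value])
w, *_ = np.linalg.lstsq(X, word_acc, rcond=None)

def estimate_accuracy(pesq_score, d_val):
    """Predict ASR word accuracy for an unseen noisy/reverberant condition."""
    return w[0] + w[1] * pesq_score + w[2] * d_val

print(estimate_accuracy(2.9, 50.0))
```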
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing | 2012
Takahiro Fukumori; Takanobu Nishiura; Yoichi Yamashita
We digitally archived the festival music ("Ohayashi") of the Yamahoko parades of the Gion Festival in Kyoto, Japan. Reproducing the authentic atmosphere of this festival requires not only the festival music, which consists of traditional Japanese drums, flutes, and bells, but also the ambient noise and in-float driving noise. To reproduce a high-quality sound field, we recorded the festival music in the presence of ambient noise and float noise using multi-channel recording, and then reproduced the sound field of one of the parades with the recorded sound sources. We employed point-source loudspeakers to reproduce the traditional Japanese drums, flutes, and bells with omnidirectional radiation characteristics. We then built a web-based system linked to a map of the parade route that produces an acoustic sound field with realistic sensations at different points along the route: the system reproduces the Ohayashi at a particular position when the user clicks the corresponding circular button on the map.
Journal of the Acoustical Society of America | 2013
Naoto Kakino; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Surveillance systems with microphones have been developed to help achieve a secure society. These systems can detect hazardous situations from observed speech, but they are generally very expensive because conventional systems rely on security officers to detect hazardous situations manually. We therefore focus on automatic shout detection, which can estimate hazardous situations. An acoustic model based on the Gaussian mixture model has been proposed as a conventional method to distinguish shouted from natural speech. However, such methods require a huge number of training samples to detect shouted speech accurately. In the present paper, we focus on the rahmonic structure, a subharmonic of the fundamental frequency in the cepstrum domain, because it tends to arise in shouted speech. We therefore propose a shouted-speech detection method based on the rahmonic structure. More specifically, we investigate rahmonic st...
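The paper's actual feature and decision rule are not given in this abstract; the sketch below only illustrates the underlying idea of checking for a cepstral peak at twice the fundamental quefrency. The threshold, frame length, and pitch search range are illustrative assumptions.

```python
# Hedged sketch: flag a frame as "shouted" when a strong rahmonic (a peak near
# twice the fundamental quefrency) appears in its real cepstrum.
import numpy as np

def detect_shout(frame, fs, threshold=0.4):
    """frame: ~1024 samples at fs Hz. Returns True if a rahmonic is prominent."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    cepstrum = np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))

    # Find the fundamental quefrency in a plausible pitch range (60-400 Hz).
    lo, hi = int(fs / 400), int(fs / 60)
    q0 = lo + np.argmax(cepstrum[lo:hi])
    c0 = cepstrum[q0]

    # A rahmonic shows up as a secondary peak near twice that quefrency.
    q1 = 2 * q0
    window = cepstrum[q1 - 2:q1 + 3]
    c1 = window.max() if len(window) else 0.0

    return c0 > 0 and (c1 / c0) > threshold
```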
Journal of the Acoustical Society of America | 2012
Keisuke Horii; Takahiro Fukumori; Masanori Morise; Takanobu Nishiura; Yoichi Yamashita
Unwanted noise should be reduced from noisy speech. Spectral subtraction (SS), one of the classic noise reduction methods proposed by S. F. Boll in 1979, can effectively reduce unwanted noise using only the observed signal. However, SS has the problem that dissonant noise called musical tones is generated after noise reduction. SS estimates the noise from non-speech segments and subtracts the estimated noise from the observed signal; a flooring process, which depends on the estimated noise power, is also performed to compensate for excessive subtraction. Because this process generates the musical tones, SS must be improved to reduce them. We previously proposed SS with weighted subtraction coefficients in each frequency band to control the flooring process. In that method, however, equal-weighted subtraction coefficients were used in every frame, even though the speech and noise power differ from frame to frame. To overcome this problem, we newly propose an advanced SS with the color-weighted subt...
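For reference, a minimal single-frame sketch of plain spectral subtraction with flooring is shown below; the over-subtraction and flooring factors are illustrative, not the values used in the weighted variants discussed above.

```python
# Hedged sketch: basic spectral subtraction with a flooring step on one frame.
import numpy as np

def spectral_subtraction(noisy, noise_mag, alpha=2.0, beta=0.01):
    """noisy: time-domain frame; noise_mag: estimated noise magnitude spectrum."""
    spec = np.fft.rfft(noisy)
    mag, phase = np.abs(spec), np.angle(spec)

    # Over-subtract the noise estimate, then floor the residual to beta * noise
    # to limit the musical tones caused by negative or near-zero bins.
    clean_mag = np.maximum(mag - alpha * noise_mag, beta * noise_mag)

    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy))
```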
International Conference on Acoustics, Speech, and Signal Processing | 2017
Yukoh Wakabayashi; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
Speech enhancement in noisy environments has been widely investigated by modifying only the amplitude spectrum of the speech signal, while the phase spectrum, regarded as an unimportant feature, is ignored. However, it has recently been reported that the phase spectrum plays an important role in the intelligibility and quality of speech. We propose a speech-enhancement method with phase reconstruction that estimates a natural (inartificial) phase spectrum using a time-frequency feature called phase distortion, whereas conventional phase reconstruction estimates an artificial one. The objective experimental results indicate an improvement in speech quality and demonstrate the effectiveness of the proposed method.
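The phase-distortion-based reconstruction itself is not described in this abstract. As a stand-in, the sketch below uses Griffin-Lim iterations to recover a phase consistent with an enhanced magnitude spectrogram; it only illustrates why re-estimating the phase, rather than reusing the noisy phase, matters for resynthesis quality, and is not the paper's method.

```python
# Hedged sketch: Griffin-Lim style phase recovery for an enhanced magnitude
# spectrogram (substitute for the paper's phase-distortion-based estimation).
import numpy as np
from scipy.signal import stft, istft

def resynthesize_with_phase(enhanced_mag, fs=16000, nperseg=512, n_iter=30):
    """enhanced_mag: (freq bins, frames) magnitude spectrogram."""
    rng = np.random.default_rng(0)
    phase = np.exp(1j * rng.uniform(-np.pi, np.pi, enhanced_mag.shape))
    for _ in range(n_iter):
        _, x = istft(enhanced_mag * phase, fs=fs, nperseg=nperseg)
        _, _, spec = stft(x, fs=fs, nperseg=nperseg)
        frames = min(spec.shape[1], enhanced_mag.shape[1])
        phase[:, :frames] = np.exp(1j * np.angle(spec[:, :frames]))
    _, x = istft(enhanced_mag * phase, fs=fs, nperseg=nperseg)
    return x
```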
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2016
Masato Nakayama; Ryosuke Konabe; Takahiro Fukumori; Takanobu Nishiura
It is very important to provide a personal audible space (audio spot) to listeners. Near-sound-field propagation with a large-scale system has been proposed to realize this, and the parametric loudspeaker has also been proposed to provide an audio spot. Although it is a small-scale system, the conventional parametric loudspeaker has difficulty reproducing audible sound only in the near field. In this paper, we therefore propose a new near-sound-field propagation method based on individual beam-steering of the carrier and sideband waves using a parametric loudspeaker. In the proposed method, the two waves are emitted in different directions from a parametric array loudspeaker. The method realizes near-sound-field propagation because the audible area, in which the two waves are superposed, is limited to the vicinity of the parametric array loudspeaker. Finally, we evaluate the effectiveness of the proposed method through evaluation experiments.
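The abstract does not give the array geometry or steering rule, so the following is only a minimal sketch of computing delay-and-sum steering delays for a linear transducer array, with the carrier and sideband components assumed to be steered to two different angles so that the demodulated audible sound forms only near the array. Element count, spacing, and angles are placeholders.

```python
# Hedged sketch: per-element delays that steer a linear array toward a given angle.
import numpy as np

def steering_delays(n_elements, spacing_m, angle_deg, c=343.0):
    """Delays (s) for delay-and-sum steering of a linear array toward angle_deg."""
    positions = np.arange(n_elements) * spacing_m
    delays = positions * np.sin(np.radians(angle_deg)) / c
    return delays - delays.min()   # keep all delays non-negative

# Steer the carrier and sideband components to different assumed directions.
carrier_delays  = steering_delays(n_elements=16, spacing_m=0.005, angle_deg=+10.0)
sideband_delays = steering_delays(n_elements=16, spacing_m=0.005, angle_deg=-10.0)
```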
Journal of the Acoustical Society of America | 2016
Zhuan Zuo; Taku Yoshimura; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Three-dimensional (3-D) sound field reproduction systems, such as surround sound reproduction systems, have developed rapidly in recent years. However, these systems need a lot of space for loudspeaker placement. For this reason, binaural systems, which control sound image localization with head-related transfer functions (HRTFs) using only headphones, have been studied. However, individual differences in body features cause variations in HRTFs that greatly affect direction perception. A method based on spectral notch estimation from the pinna shape has therefore been studied for the personalization of HRTFs. In this study, we propose a new method in which the depth information used for the spectral notch estimation is measured from stereo images of the pinna. Two cameras are used together as a stereo camera to take disparate stereo images, so that their disparity can be used to measure the depth informat...
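The measurement step underlying this approach is standard stereo triangulation: depth is recovered from the pixel disparity of a pinna feature between the two images. The sketch below shows that relation; the camera parameters are placeholders for an actual calibration, not values from the paper.

```python
# Hedged sketch: depth from the disparity of a calibrated stereo camera pair.
def depth_from_disparity(disparity_px, focal_length_px=1200.0, baseline_m=0.06):
    """Depth (m) of a point observed with a horizontal pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Example: a pinna feature shifted by 90 px between the two images.
print(depth_from_disparity(90.0))   # ~0.8 m
```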
Journal of the Acoustical Society of America | 2016
Tomoyuki Mizuno; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Measuring distant-talking speech with high accuracy is important for detecting criminal activity. Various microphones, such as parabolic and shotgun microphones, have been developed for this purpose. However, most of them have difficulty extracting distant-talking speech at a target position when it is surrounded by noisy sound sources. This study therefore focuses on a microphone system that extracts distant-talking speech by sensing the vibration of a papery object with laser light, referred to here as an optical microphone. The sound quality of the optical microphone is particularly degraded at higher frequencies because it uses an external diaphragm, consisting of various materials, as the vibrating papery object. In this study, we therefore propose a reconstruction method for degraded distant-talking speech observed with the optical microphone. The method uses a deep neural network (DNN) trained to model the relationship between clean and observed speech signals. In the p...
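The abstract does not give the network architecture or input features, so the following is only a minimal sketch of a frame-wise regression DNN mapping log-magnitude spectra of optical-microphone speech to clean-speech spectra. Layer sizes, features, and loss are illustrative assumptions.

```python
# Hedged sketch: a small feed-forward network trained on paired observed/clean
# log-magnitude spectral frames (configuration is assumed, not the paper's).
import torch
import torch.nn as nn

n_bins = 257  # e.g. a 512-point FFT

model = nn.Sequential(
    nn.Linear(n_bins, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, n_bins),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(observed_logmag, clean_logmag):
    """One gradient step on a batch of paired observed/clean frames."""
    optimizer.zero_grad()
    loss = loss_fn(model(observed_logmag), clean_logmag)
    loss.backward()
    optimizer.step()
    return loss.item()
```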
Journal of the Acoustical Society of America | 2016
Sayaka Okayasu; Maiko Yoshiura; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
MRI devices are often utilized as surgery support systems. However, communication among medical staff during surgery is difficult because conversations are disturbed by the loud noise emitted by the devices. In this study, we propose an MRI noise suppression method using weighted spectral subtraction to address this problem. Spectral subtraction is generally used to suppress stationary noise, but applying it directly to non-stationary noise such as MRI noise, which has frequency and amplitude fluctuations over both the long and short term, is difficult. In the proposed method, the noise from the MRI device is estimated on the basis of the short-term frequency fluctuation. The estimated noise is then subtracted by weighted spectral subtraction whose coefficients are calculated on the basis of the long-term amplitude fluctuation. Objective evaluation experiments were carried out to evaluate the amount of noise reduction and sound distortion. As a result, we co...
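The abstract does not specify how the weighting coefficients are computed from the long-term amplitude fluctuation; the sketch below uses one plausible rule (larger subtraction weight for bins whose noise amplitude fluctuates more) inside an otherwise standard spectral subtraction, purely as an illustration.

```python
# Hedged sketch: per-bin subtraction weights derived from the long-term
# amplitude fluctuation of noise-only frames (weighting rule is an assumption).
import numpy as np

def weighted_spectral_subtraction(noisy_spec, noise_frames, beta=0.02):
    """noisy_spec: complex STFT frame (F,); noise_frames: |STFT| of noise-only frames (T, F)."""
    noise_mean = noise_frames.mean(axis=0)
    noise_std = noise_frames.std(axis=0)

    # Bins whose noise amplitude fluctuates more over time get a larger weight.
    weights = 1.0 + noise_std / (noise_mean + 1e-12)

    mag, phase = np.abs(noisy_spec), np.angle(noisy_spec)
    clean_mag = np.maximum(mag - weights * noise_mean, beta * noise_mean)
    return clean_mag * np.exp(1j * phase)
```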