Takahiro Fukumori
Ritsumeikan University
Publication
Featured research published by Takahiro Fukumori.
International Conference on Computer Graphics and Interactive Techniques | 2010
Woong Choi; Takahiro Fukumori; Kohei Furukawa; Kozaburo Hachimura; Takanobu Nishiura; Keiji Yano
Recently, extensive research has been undertaken on digital archiving of cultural properties in the field of cultural heritage. These investigations have examined the processes of recording and preserving both tangible and intangible materials through the use of digital technologies.
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013
Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
Automatic speech recognition (ASR) performance degrades in noisy and reverberant environments. Although various techniques have been proposed against this degradation, it is difficult to apply them appropriately in evaluation environments whose noise and reverberation conditions are unknown. These techniques could be applied properly to improve ASR performance if the relationship between ASR performance and the degradation factors, covering both noise and reverberation, could be estimated. In this study, we propose new noisy and reverberant criteria, referred to as "Noisy and Reverberant Speech Recognition with the PESQ and the Dn (NRSR-PDn)". We first designed the NRSR-PDn criteria using the relationships among the D value, the PESQ score, and the ASR performance, and then estimated the ASR performance with the designed criteria in evaluation experiments. The experimental evaluations demonstrated that the proposed criteria are well suited for robustly estimating ASR performance in noisy and reverberant environments.
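The abstract does not specify how the NRSR-PDn criteria combine the PESQ score and the D value, so the following is only a minimal sketch of the general idea: regressing measured ASR word accuracy on (PESQ, D value) pairs collected under known conditions, then using the fitted model to predict accuracy for an unseen noisy and reverberant condition. All numbers are hypothetical.

```python
# Hedged sketch (not the paper's NRSR-PDn formulation): predict ASR accuracy
# from PESQ and D-value measurements with a simple linear regression.
import numpy as np

# Hypothetical training data: PESQ score, D value, and measured word accuracy (%).
pesq     = np.array([1.8, 2.4, 3.1, 3.6, 4.2])
d_value  = np.array([30.0, 45.0, 55.0, 70.0, 85.0])
word_acc = np.array([42.0, 58.0, 71.0, 83.0, 92.0])

# Fit acc ≈ w0 + w1*PESQ + w2*D by least squares.
X = np.column_stack([np.ones_like(pesq), pesq, d_value])
w, *_ = np.linalg.lstsq(X, word_acc, rcond=None)

def estimate_accuracy(pesq_score, d_val):
    """Predict ASR word accuracy for an unseen noisy/reverberant condition."""
    return w[0] + w[1] * pesq_score + w[2] * d_val

print(estimate_accuracy(2.9, 50.0))
```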
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing | 2012
Takahiro Fukumori; Takanobu Nishiura; Yoichi Yamashita
We digitally archived the festival music ("Ohayashi") of the Yamahoko parades of the Gion Festival in Kyoto, Japan. Reproducing the authentic atmosphere of this festival requires not only the festival music, which consists of traditional Japanese drums, flutes, and bells, but also the ambient noise and in-float driving noise. To reproduce a high-quality sound field, we recorded the festival music in the presence of ambient noise and float noise using multi-channel recording, and then reproduced the sound field of one of the parades with the recorded sound sources. We employed point-source loudspeakers to reproduce the traditional Japanese drums, flutes, and bells with omnidirectional radiation characteristics. We then built a web-based system linked to a map of the parade route that produces an acoustic sound field with realistic sensations at different points along the route: the system reproduces the Ohayashi at a particular position when the user clicks the corresponding circular button on the map.
Journal of the Acoustical Society of America | 2013
Naoto Kakino; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Surveillance systems with microphones have been developed to help achieve a secure society. These systems can detect hazardous situations from observed speech, but they are generally very expensive because conventional systems rely on security officers to detect hazardous situations manually. We therefore focus on automatic shout detection, which can estimate hazardous situations. An acoustic model based on the Gaussian mixture model has been proposed as a conventional method to distinguish shouted from natural speech. However, such methods require a huge number of training samples to detect shouted speech accurately. In the present paper, we focus on the rahmonic structure, a subharmonic of the fundamental frequency in the cepstrum domain, because it tends to arise in shouted speech. We therefore propose a shouted-speech detection method based on the rahmonic structure. More specifically, we investigate rahmonic st...
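The paper's actual feature and decision rule are not given in this abstract; the sketch below only illustrates the underlying idea of checking for a cepstral peak at twice the fundamental quefrency. The threshold, frame length, and pitch search range are illustrative assumptions.

```python
# Hedged sketch: flag a frame as "shouted" when a strong rahmonic (a peak near
# twice the fundamental quefrency) appears in its real cepstrum.
import numpy as np

def detect_shout(frame, fs, threshold=0.4):
    """frame: ~1024 samples at fs Hz. Returns True if a rahmonic is prominent."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    cepstrum = np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))

    # Find the fundamental quefrency in a plausible pitch range (60-400 Hz).
    lo, hi = int(fs / 400), int(fs / 60)
    q0 = lo + np.argmax(cepstrum[lo:hi])
    c0 = cepstrum[q0]

    # A rahmonic shows up as a secondary peak near twice that quefrency.
    q1 = 2 * q0
    window = cepstrum[q1 - 2:q1 + 3]
    c1 = window.max() if len(window) else 0.0

    return c0 > 0 and (c1 / c0) > threshold
```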
Journal of the Acoustical Society of America | 2012
Keisuke Horii; Takahiro Fukumori; Masanori Morise; Takanobu Nishiura; Yoichi Yamashita
Unwanted noise should be reduced from noisy speech. Spectral subtraction (SS), one of the classic noise reduction methods proposed by S. F. Boll in 1979, can effectively reduce unwanted noise using only the observed signal. However, SS has the problem that dissonant noise called musical tones is generated after noise reduction. SS estimates the noise from non-speech segments and subtracts the estimated noise from the observed signal; a flooring process, which depends on the estimated noise power, is also performed to compensate for excessive subtraction. Because this process generates the musical tones, SS must be improved to reduce them. We previously proposed SS with weighted subtraction coefficients in each frequency band to control the flooring process. In that method, however, equal-weighted subtraction coefficients were used in every frame, even though the speech and noise power differ from frame to frame. To overcome this problem, we newly propose an advanced SS with the color-weighted subt...
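For reference, a minimal single-frame sketch of plain spectral subtraction with flooring is shown below; the over-subtraction and flooring factors are illustrative, not the values used in the weighted variants discussed above.

```python
# Hedged sketch: basic spectral subtraction with a flooring step on one frame.
import numpy as np

def spectral_subtraction(noisy, noise_mag, alpha=2.0, beta=0.01):
    """noisy: time-domain frame; noise_mag: estimated noise magnitude spectrum."""
    spec = np.fft.rfft(noisy)
    mag, phase = np.abs(spec), np.angle(spec)

    # Over-subtract the noise estimate, then floor the residual to beta * noise
    # to limit the musical tones caused by negative or near-zero bins.
    clean_mag = np.maximum(mag - alpha * noise_mag, beta * noise_mag)

    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy))
```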
International Conference on Acoustics, Speech, and Signal Processing | 2017
Yukoh Wakabayashi; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
Speech enhancement in noisy environments has been widely investigated by modifying only the amplitude spectrum of the speech signal, while the phase spectrum, regarded as an unimportant feature, is ignored. However, it has recently been reported that the phase spectrum plays an important role in the intelligibility and quality of speech. We propose a speech-enhancement method with phase reconstruction that estimates a natural (inartificial) phase spectrum using a time-frequency feature called phase distortion, whereas conventional phase reconstruction estimates an artificial one. The objective experimental results indicate an improvement in speech quality and demonstrate the effectiveness of the proposed method.
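The phase-distortion-based reconstruction itself is not described in this abstract. As a stand-in, the sketch below uses Griffin-Lim iterations to recover a phase consistent with an enhanced magnitude spectrogram; it only illustrates why re-estimating the phase, rather than reusing the noisy phase, matters for resynthesis quality, and is not the paper's method.

```python
# Hedged sketch: Griffin-Lim style phase recovery for an enhanced magnitude
# spectrogram (substitute for the paper's phase-distortion-based estimation).
import numpy as np
from scipy.signal import stft, istft

def resynthesize_with_phase(enhanced_mag, fs=16000, nperseg=512, n_iter=30):
    """enhanced_mag: (freq bins, frames) magnitude spectrogram."""
    rng = np.random.default_rng(0)
    phase = np.exp(1j * rng.uniform(-np.pi, np.pi, enhanced_mag.shape))
    for _ in range(n_iter):
        _, x = istft(enhanced_mag * phase, fs=fs, nperseg=nperseg)
        _, _, spec = stft(x, fs=fs, nperseg=nperseg)
        frames = min(spec.shape[1], enhanced_mag.shape[1])
        phase[:, :frames] = np.exp(1j * np.angle(spec[:, :frames]))
    _, x = istft(enhanced_mag * phase, fs=fs, nperseg=nperseg)
    return x
```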
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2016
Masato Nakayama; Ryosuke Konabe; Takahiro Fukumori; Takanobu Nishiura
It is very important to provide a personal audible space (audio spot) to listeners. Near-sound-field propagation with a large-scale system has been proposed to realize this, and the parametric loudspeaker has also been proposed to provide an audio spot. Although it is a small-scale system, the conventional parametric loudspeaker has difficulty reproducing audible sound only in the near field. In this paper, we therefore propose a new near-sound-field propagation method based on individual beam-steering of the carrier and sideband waves using a parametric loudspeaker. In the proposed method, the two waves are emitted in different directions from a parametric array loudspeaker. The method realizes near-sound-field propagation because the audible area, in which the two waves are superposed, is limited to the vicinity of the parametric array loudspeaker. Finally, we evaluate the effectiveness of the proposed method through evaluation experiments.
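The abstract does not give the array geometry or steering rule, so the following is only a minimal sketch of computing delay-and-sum steering delays for a linear transducer array, with the carrier and sideband components assumed to be steered to two different angles so that the demodulated audible sound forms only near the array. Element count, spacing, and angles are placeholders.

```python
# Hedged sketch: per-element delays that steer a linear array toward a given angle.
import numpy as np

def steering_delays(n_elements, spacing_m, angle_deg, c=343.0):
    """Delays (s) for delay-and-sum steering of a linear array toward angle_deg."""
    positions = np.arange(n_elements) * spacing_m
    delays = positions * np.sin(np.radians(angle_deg)) / c
    return delays - delays.min()   # keep all delays non-negative

# Steer the carrier and sideband components to different assumed directions.
carrier_delays  = steering_delays(n_elements=16, spacing_m=0.005, angle_deg=+10.0)
sideband_delays = steering_delays(n_elements=16, spacing_m=0.005, angle_deg=-10.0)
```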
Journal of the Acoustical Society of America | 2016
Zhuan Zuo; Taku Yoshimura; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Three-dimensional (3-D) sound field reproduction systems, such as surround sound reproduction systems, have developed rapidly in recent years. However, these systems need a lot of space for loudspeaker placement. For this reason, binaural systems, which control sound image localization with head-related transfer functions (HRTFs) using only headphones, have been studied. However, individual differences in body features cause variations in HRTFs that greatly affect direction perception. A method based on spectral notch estimation from the pinna shape has therefore been studied for the personalization of HRTFs. In this study, we propose a new method in which the depth information used for the spectral notch estimation is measured from stereo images of the pinna. Two cameras are used together as a stereo camera to take disparate stereo images, so that their disparity can be used to measure the depth informat...
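The measurement step underlying this approach is standard stereo triangulation: depth is recovered from the pixel disparity of a pinna feature between the two images. The sketch below shows that relation; the camera parameters are placeholders for an actual calibration, not values from the paper.

```python
# Hedged sketch: depth from the disparity of a calibrated stereo camera pair.
def depth_from_disparity(disparity_px, focal_length_px=1200.0, baseline_m=0.06):
    """Depth (m) of a point observed with a horizontal pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Example: a pinna feature shifted by 90 px between the two images.
print(depth_from_disparity(90.0))   # ~0.8 m
```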
Journal of the Acoustical Society of America | 2016
Tomoyuki Mizuno; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
Measuring distant-talking speech with high accuracy is important for detecting criminal activity. Various microphones, such as parabolic and shotgun microphones, have been developed for this purpose. However, most of them have difficulty extracting distant-talking speech at a target position when it is surrounded by noisy sound sources. This study therefore focuses on a microphone system that extracts distant-talking speech by sensing the vibration of a papery object with laser light, referred to here as an optical microphone. The sound quality of the optical microphone is particularly degraded at higher frequencies because it uses an external diaphragm, consisting of various materials, as the vibrating papery object. In this study, we therefore propose a reconstruction method for degraded distant-talking speech observed with the optical microphone. The method uses a deep neural network (DNN) trained to model the relationship between clean and observed speech signals. In the p...
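The abstract does not give the network architecture or input features, so the following is only a minimal sketch of a frame-wise regression DNN mapping log-magnitude spectra of optical-microphone speech to clean-speech spectra. Layer sizes, features, and loss are illustrative assumptions.

```python
# Hedged sketch: a small feed-forward network trained on paired observed/clean
# log-magnitude spectral frames (configuration is assumed, not the paper's).
import torch
import torch.nn as nn

n_bins = 257  # e.g. a 512-point FFT

model = nn.Sequential(
    nn.Linear(n_bins, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, n_bins),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(observed_logmag, clean_logmag):
    """One gradient step on a batch of paired observed/clean frames."""
    optimizer.zero_grad()
    loss = loss_fn(model(observed_logmag), clean_logmag)
    loss.backward()
    optimizer.step()
    return loss.item()
```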
Journal of the Acoustical Society of America | 2016
Sayaka Okayasu; Maiko Yoshiura; Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura
MRI devices are often utilized as surgery support systems. However, communication among medical staff during surgery is difficult because conversations are disturbed by the loud noise emitted by the devices. In this study, we propose an MRI noise suppression method using weighted spectral subtraction to address this problem. Spectral subtraction is generally used to suppress stationary noise, but applying it directly to non-stationary noise such as MRI noise, which has frequency and amplitude fluctuations over both the long and short term, is difficult. In the proposed method, the noise from the MRI device is estimated on the basis of the short-term frequency fluctuation. The estimated noise is then subtracted by weighted spectral subtraction whose coefficients are calculated on the basis of the long-term amplitude fluctuation. Objective evaluation experiments were carried out to evaluate the amount of noise reduction and sound distortion. As a result, we co...
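The abstract does not specify how the weighting coefficients are computed from the long-term amplitude fluctuation; the sketch below uses one plausible rule (larger subtraction weight for bins whose noise amplitude fluctuates more) inside an otherwise standard spectral subtraction, purely as an illustration.

```python
# Hedged sketch: per-bin subtraction weights derived from the long-term
# amplitude fluctuation of noise-only frames (weighting rule is an assumption).
import numpy as np

def weighted_spectral_subtraction(noisy_spec, noise_frames, beta=0.02):
    """noisy_spec: complex STFT frame (F,); noise_frames: |STFT| of noise-only frames (T, F)."""
    noise_mean = noise_frames.mean(axis=0)
    noise_std = noise_frames.std(axis=0)

    # Bins whose noise amplitude fluctuates more over time get a larger weight.
    weights = 1.0 + noise_std / (noise_mean + 1e-12)

    mag, phase = np.abs(noisy_spec), np.angle(noisy_spec)
    clean_mag = np.maximum(mag - weights * noise_mean, beta * noise_mean)
    return clean_mag * np.exp(1j * phase)
```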