Nobuo Nukaga | Researchain

Publication

Featured research published by Nobuo Nukaga.

IEEE Transactions on Audio, Speech, and Language Processing | 2013

Optimized Speech Dereverberation From Probabilistic Perspective for Time Varying Acoustic Transfer Function

Masahito Togami; Yohei Kawaguchi; Ryu Takeda; Yasunari Obuchi; Nobuo Nukaga

A dereverberation technique has been developed that optimally combines multichannel inverse filtering (MIF), beamforming (BF), and non-linear reverberation suppression (NRS). It is robust against acoustic transfer function (ATF) fluctuations and creates less distortion than the NRS alone. The three components are optimally combined from a probabilistic perspective using a unified likelihood function incorporating two probabilistic models. A multichannel probabilistic source model based on a recently proposed local Gaussian model (LGM) provides robustness against ATF fluctuations of the early reflection. A probabilistic reverberant transfer function model (PRTFM) provides robustness against ATF fluctuations of the late reverberation. The MIF and multichannel under-determined source separation (MUSS) are optimized in an iterative manner. The MIF is designed to reduce the time-invariant part of the late reverberation by using optimal time-weighting with reference to the PRTFM and the LGM. The MUSS separates the dereverberated speech signal and the residual reverberation after the MIF, which can be interpreted as an optimized combination of the BF and the NRS. The parameters of the PRTFM and the LGM are optimized based on the MUSS output. Experimental results show that the proposed method is robust against the ATF fluctuations under both single and multiple source conditions.

International Conference on Acoustics, Speech, and Signal Processing | 2006

Scalable Implementation Of Unit Selection Based Text-To-Speech System For Embedded Solutions

Nobuo Nukaga; Ryota Kamoshida; Kenji Nagamatsu; Yoshinori Kitahara

In this paper we propose two methods for implementing a unit selection-based text-to-speech engine on resource-limited embedded systems. While unit selection-based text-to-speech technology has improved the quality of synthesized speech, there is a practical trade-off between the size of the database and the quality of the synthesized speech. That is, a large database and expensive computation are needed to generate highly natural-sounding voices, yet the text-to-speech system must also meet the specification of the target system. To address this problem, we introduced frequency-based approaches to reduce the size of the speech database. The experimental results showed that the step-by-step downsizing method was better than the direct one in terms of the cumulative join cost and the target cost. Furthermore, some techniques were introduced and evaluated in order to implement our text-to-speech engine on an embedded system. The experimental results showed that the run-time workload for the test sentences was approximately 80 MIPS and that the implemented engine was useful and scalable for mid-class embedded systems.

International Conference on Acoustics, Speech, and Signal Processing | 2012

Multichannel speech dereverberation and separation with optimized combination of linear and non-linear filtering

Masahito Togami; Yohei Kawaguchi; Ryu Takeda; Yasunari Obuchi; Nobuo Nukaga

In this paper, we propose a multichannel speech dereverberation and separation technique that is effective even when there are multiple speakers and each speaker's transfer function is time-varying due to fluctuation of that speaker's head. For robustness against fluctuation, the proposed method simultaneously optimizes linear and non-linear filtering from a probabilistic perspective, based on a probabilistic reverberant transfer-function model (PRTFM). The PRTFM is an extension of the conventional time-invariant transfer-function model to uncertain conditions, and it can also be regarded as an extension of the recently proposed blind local Gaussian modeling. The linear filtering and the non-linear filtering are optimized in the MMSE (minimum mean square error) sense during parameter optimization. The proposed method is evaluated in a reverberant meeting room and is shown to be effective.

Multimedia Signal Processing | 1999

Sophisticated speech processing middleware on microprocessor

Nobuo Hataoka; Hiroaki Kokubo; Nobuo Nukaga; Yasunari Obuchi; Akio Amano; Yoshinori Kitahara

This paper describes speech processing middleware that has been developed on RISC microprocessors for embedded speech applications. The middleware consists of a speech recognition module and a speech synthesis module; the speech recognition module in particular is robust against environmental noise and speaker differences. The speech middleware provides sophisticated user interfaces to multimedia systems that use microprocessors as CPUs, such as car navigation systems, mobile information equipment, and game machines.

Society of Instrument and Control Engineers of Japan | 2015

Area detection technology for air conditioner

Yuto Komatsu; Koichi Hamada; Nobuo Nukaga; Tatsuhiko Kagehiro; Yoshiro Ueda; Eisuke Matsubara; Noriyuki Jinno

We developed room layout detection technology for air conditioners. This technology uses a camera built into the air conditioner. Based on the detected layout, it controls the direction and volume of the airflow. It has been installed in commercial products as a “layout search.”

International Conference on Signal Processing | 2012

Online speech dereverberation with time-varying assumption of acoustic transfer functions for teleconferencing systems

Masahito Togami; Yohei Kawaguchi; Nobuo Nukaga

This paper deals with an online dereverberation technique for teleconferencing systems that is robust against fluctuation of acoustic transfer functions (ATFs). The proposed method divides fluctuations into two classes. The first class is instantaneous fluctuation of the ATF of each speaker, e.g., movement of the speaker's head. Instead of the time-invariant assumption for the ATF used in conventional dereverberation techniques, the proposed method assumes that the ATF of each speaker is a random variable, and the dereverberated signal is obtained by integrating out the parameters related to the ATFs. The second class is fluctuation related to turn-taking of the active speaker. To be robust against turn-taking, the proposed method utilizes multiple parameters that are estimated in different time periods and selects the parameter that maximizes the likelihood value at each time-frequency point. Experimental results under time-varying conditions show that the proposed method is effective.
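
The per-point selection step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that each candidate parameter set reduces to a variance per time-frequency point and that the observation follows a zero-mean complex Gaussian. The function names `gaussian_loglik` and `select_best_parameters` are hypothetical.

```python
import math


def gaussian_loglik(power, variance):
    """Log-likelihood of a zero-mean complex Gaussian with the given variance,
    evaluated for an observation whose squared magnitude is `power`.
    (Assumed observation model, not taken from the paper.)"""
    return -math.log(math.pi * variance) - power / variance


def select_best_parameters(observed_power, candidate_variances):
    """Pick, at every time-frequency point, the candidate that maximizes the likelihood.

    observed_power:      2-D list [frame][frequency] of |x|^2 values
    candidate_variances: list of 2-D lists with the same shape, one per
                         candidate parameter set (e.g. estimated in
                         different time periods)
    Returns a 2-D list of candidate indices with the same shape.
    """
    selection = []
    for t, row in enumerate(observed_power):
        selected_row = []
        for f, power in enumerate(row):
            scores = [gaussian_loglik(power, cand[t][f])
                      for cand in candidate_variances]
            # Index of the highest-scoring candidate at this point
            selected_row.append(max(range(len(scores)), key=scores.__getitem__))
        selection.append(selected_row)
    return selection
```

Because the choice is made independently at each time-frequency point, a change of active speaker only affects the points where another candidate explains the observation better.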

Information Sciences, Signal Processing and Their Applications | 2012

Online MVBF adaptation under diffuse noise environments with MIMO based noise pre-filtering

Masahito Togami; Yohei Kawaguchi; Nobuo Nukaga; Yasunari Obuchi

A noise-robust MVBF adaptation technique under diffuse noise environments is proposed. The proposed method achieves both online adaptation and robustness against diffuse noise by combining a semi-online diffuse noise reduction with an online MVBF adaptation technique that assumes sparseness of the speech sources. Online sparseness-based MVBF adaptation is sensitive to diffuse noise, because diffuse noise is not sparse. However, by using diffuse noise pre-filtering based on local Gaussian modeling, which can be regarded as an optimized MIMO (Multi-Input Multi-Output) diffuse noise reduction method from the probabilistic perspective, the sparseness of the microphone input signal passed to the latter stage is expected to improve. The proposed method is evaluated using speech signals under diffuse noise environments, and it reduces more noise with less distortion than the conventional online sparseness-based MVBF adaptation.
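
For orientation, the following sketch computes textbook minimum-variance beamformer weights, w = R⁻¹d / (dᴴR⁻¹d), for two microphones in a single frequency bin. It assumes that MVBF in the abstract above refers to this standard formulation, and it covers neither the online adaptation nor the MIMO pre-filtering that the paper proposes. The function names and the two-channel restriction are illustrative choices.

```python
def mvbf_weights_2ch(noise_cov, steering):
    """Minimum-variance beamformer weights for two microphones.

    noise_cov: 2x2 Hermitian noise covariance matrix R (nested lists of complex)
    steering:  length-2 steering vector d toward the target source
    Returns w = R^-1 d / (d^H R^-1 d).
    """
    (a, b), (c, d) = noise_cov
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise covariance is singular")
    # Closed-form inverse of a 2x2 matrix
    inv = [[d / det, -b / det], [-c / det, a / det]]
    s0, s1 = steering
    # Numerator: R^-1 d
    r0 = inv[0][0] * s0 + inv[0][1] * s1
    r1 = inv[1][0] * s0 + inv[1][1] * s1
    # Denominator: d^H R^-1 d (real and positive for a positive-definite R)
    denom = s0.conjugate() * r0 + s1.conjugate() * r1
    return [r0 / denom, r1 / denom]


def apply_beamformer(weights, mic_signals):
    """Beamformer output y = w^H x for one frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mic_signals))
```

With these weights a signal arriving along the steering vector passes with unit gain, while the output noise power is minimized under that constraint.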

Journal of the Acoustical Society of America | 2007