
Publication


Featured research published by Takanori Nishino.


Workshop on Applications of Signal Processing to Audio and Acoustics | 1999

Interpolating head related transfer functions in the median plane

Takanori Nishino; Shoji Kajita; Kazuya Takeda; Fumitada Itakura

This paper describes the interpolation of head related transfer functions (HRTFs) for all directions in the median plane. Interpolating HRTFs makes it possible to reduce the number of measurements needed for a new user's HRTFs, and also to reduce the amount of HRTF data stored in auditory virtual systems. In this paper, a simple linear interpolation method and the spline interpolation method are evaluated and the advantages of both methods are clarified. In experiments, the interpolation methods are applied to HRTFs measured using a dummy head. The experimental results show that the two methods are comparable in the best case; the resultant minimum spectral distortion is about 2 dB for both methods. The results clarify that linear interpolation is effective for a set of elevations selected based on the cross correlation, and that spline interpolation is effective at large, equal intervals. These results indicate that HRTFs in the median plane can be interpolated by these methods.
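
As a concrete illustration, here is a minimal sketch of the two interpolation schemes and the spectral-distortion metric, assuming HRTFs stored as magnitude spectra (in dB) at a handful of measured elevations; the array contents, shapes, and elevation grid are illustrative placeholders, not the paper's data.

```python
# Sketch of median-plane HRTF interpolation (linear and spline variants) and
# the spectral-distortion metric. All data here are synthetic stand-ins.
import numpy as np
from scipy.interpolate import CubicSpline

measured_elev = np.array([0.0, 30.0, 60.0, 90.0])   # measured elevations (deg)
n_bins = 128                                        # frequency bins per HRTF
rng = np.random.default_rng(0)
hrtf_db = rng.normal(0.0, 3.0, (measured_elev.size, n_bins))  # placeholder spectra

def interp_linear(elev):
    """Linearly interpolate the magnitude spectrum between neighboring elevations."""
    return np.array([np.interp(elev, measured_elev, hrtf_db[:, k])
                     for k in range(n_bins)])

def interp_spline(elev):
    """Cubic-spline interpolation across all measured elevations."""
    return CubicSpline(measured_elev, hrtf_db, axis=0)(elev)

def spectral_distortion(h_true_db, h_est_db):
    """RMS difference of log-magnitude spectra in dB (the paper reports about 2 dB)."""
    return np.sqrt(np.mean((h_true_db - h_est_db) ** 2))

h45_lin = interp_linear(45.0)
h45_spl = interp_spline(45.0)
print(spectral_distortion(h45_lin, h45_spl))
```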


International Conference on Acoustics, Speech, and Signal Processing | 2008

Encoding large array signals into a 3D sound field representation for selective listening point audio based on blind source separation

Kenta Niwa; Takanori Nishino; Kazuya Takeda

A sound field reproduction method that uses blind source separation and head-related transfer functions is proposed. In the proposed system, multichannel acoustic signals captured at distant microphones are encoded into a set of location/signal pairs of virtual sound sources based on frequency-domain ICA. After estimating the locations and signals of the virtual sources, the spatial sound at the selected listening point is constructed by convolving the controlled acoustic transfer functions with each signal. In the evaluation, the sound field produced by 6 sound sources is captured using 48 distant microphones and encoded into a set of virtual sound sources. Subjective evaluation shows that there is no significant difference between the natural and reconstructed sound when more than 6 virtual sources are used. The effectiveness of the encoding algorithm, as well as of the virtual source representation, is therefore confirmed.
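
A minimal sketch of the encode/render pipeline described above. The paper performs ICA in the frequency domain; time-domain FastICA is used here purely as a simpler stand-in, and the mixing matrix and acoustic transfer functions are random placeholders.

```python
# Encode array signals into virtual sources by blind source separation, then
# reconstruct the sound at a listening point by filtering each source.
import numpy as np
from scipy.signal import fftconvolve
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_mics, n_sources, n_samples = 8, 2, 16000
sources = rng.laplace(size=(n_samples, n_sources))   # stand-in source signals
mixing = rng.normal(size=(n_mics, n_sources))
array_signals = sources @ mixing.T                   # what the mics capture

# Encode: estimate virtual source signals (time-domain ICA as a stand-in for
# the paper's frequency-domain ICA).
ica = FastICA(n_components=n_sources, random_state=0)
virtual_sources = ica.fit_transform(array_signals)   # (n_samples, n_sources)

# Render: convolve each virtual source with the transfer function from its
# estimated location to the selected listening point, then sum.
transfer_fns = rng.normal(size=(n_sources, 256))     # placeholder filters
listening_point = sum(fftconvolve(virtual_sources[:, i], transfer_fns[i])
                      for i in range(n_sources))
print(listening_point.shape)
```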


Journal of the Acoustical Society of America | 1996

Interpolating HRTF for auditory virtual reality

Takanori Nishino; Sumie Mase; Shoji Kajita; Kazuya Takeda; Fumitada Itakura

Two (linear and nonlinear) interpolation methods for the head‐related transfer function (HRTF) are explored in order to realize virtual auditory localization. In both methods, the HRTFs of the left and right ears are represented by a delay time and a common impulse response, where the delay time is determined so that the cross correlation of the two HRTFs takes its maximum value. A three‐layer neural network is trained for the nonlinear method, whereas basic linear interpolation is used for the linear method. Evaluation tests are performed using HRTF prototypes Web‐published by the MIT Media Lab. The signal‐to‐deviation ratios (SDR) of the measured and interpolated HRTFs are calculated for objective evaluation of the methods. The SDR of the nonlinear method (50 to 70 dB) is much better than that of the linear method (5 to 30 dB). On the other hand, there is no significant difference in the subjective evaluation of localizing the earphone‐presented sounds generated by the two interpolated HRTFs. Further...
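
A rough sketch, under stated assumptions, of the delay/common-response decomposition and the SDR metric: the interaural delay is taken where the cross-correlation of the two head-related impulse responses peaks, and SDR compares a measured response against an interpolated estimate. The impulse responses here are synthetic placeholders.

```python
# Delay estimation by cross-correlation peak, basic linear interpolation of
# impulse responses, and the signal-to-deviation ratio (SDR) in dB.
import numpy as np

def relative_delay(h_left, h_right):
    """Delay (in samples) maximizing the cross-correlation of two HRIRs."""
    xcorr = np.correlate(h_left, h_right, mode="full")
    return int(np.argmax(xcorr)) - (len(h_right) - 1)

def sdr_db(h_measured, h_interp):
    """Signal-to-deviation ratio in dB."""
    err = h_measured - h_interp
    return 10.0 * np.log10(np.sum(h_measured ** 2) / np.sum(err ** 2))

rng = np.random.default_rng(0)
h0, h1 = rng.normal(size=128), rng.normal(size=128)   # HRIRs at two directions
h_mid_lin = 0.5 * h0 + 0.5 * h1                        # basic linear interpolation
print(relative_delay(h0, h1), sdr_db(h0, h_mid_lin))
```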


EURASIP Journal on Audio, Speech, and Music Processing | 2014

Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech

Madoka Miki; Norihide Kitaoka; Chiyomi Miyajima; Takanori Nishino; Kazuya Takeda

We propose an integrative method for recognizing gestures, such as pointing, that accompany speech. Speech generated simultaneously with a gesture can assist in recognizing that gesture, and since the two modalities are complementary, gestures can likewise assist in recognizing speech. Our integrative recognition method uses a probability distribution of the time interval between the starting times of gestures and of the corresponding utterances. We evaluate the improvement achieved by the proposed integrative recognition method on a task involving the solution of a geometry problem.
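
A minimal sketch of the time-interval idea: candidate (speech, gesture) pairings are rescored with a prior over the gap between their start times. The Gaussian gap distribution and the score values are illustrative assumptions, not the paper's trained model.

```python
# Rescore candidate speech/gesture pairings with a time-gap log-prior.
import numpy as np
from scipy.stats import norm

# Assumed distribution of (speech start - gesture start) in seconds.
gap_prior = norm(loc=0.3, scale=0.5)

def joint_score(speech_logp, gesture_logp, t_speech, t_gesture):
    """Combine per-modality recognition scores with the time-gap log-prior."""
    return speech_logp + gesture_logp + gap_prior.logpdf(t_speech - t_gesture)

# Pick the pairing hypothesis with the best joint score.
hypotheses = [(-12.1, -8.4, 1.20, 1.05),   # (speech logp, gesture logp, start times)
              (-11.7, -9.0, 1.20, 3.40)]
best = max(hypotheses, key=lambda h: joint_score(*h))
print(best)
```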


Journal of the Acoustical Society of America | 2006

Estimation of a talker and listener’s positions in a car using binaural signals

Madoka Takimoto; Takanori Nishino; Kazuya Takeda

We describe the problem of estimating the location of a sound source based on signals observed at the entrances of the two ears. The purpose is to specify the talker’s and listener’s positions within a car using the binaural signal. The talker and the listener sit in two of the four car seats. In this experiment, two kinds of head and torso simulators are used as the talker and the listener. The given information includes the acoustic transfer functions for all positional patterns. Eight patterns of acoustic transfer functions are measured, including those that share the same positional pattern but in which the talker faces a different direction. A Gaussian mixture model is generated for each positional pattern. The parameters we use are interaural cues such as the envelope of the interaural level difference. The models are evaluated by specifying the positional pattern. Results show that we can specify positions with up to 97% (35/36) accuracy using the binaural signals of two men. Then the inpu...
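
A minimal sketch of the classification scheme, assuming one Gaussian mixture model per seating pattern trained on interaural features and a maximum-likelihood decision; the random feature vectors stand in for the ILD-envelope parameters used in the paper.

```python
# One GMM per positional pattern; classify a test recording by the model with
# the highest total log-likelihood over its feature frames.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_patterns, n_train, n_dim = 8, 200, 16   # 8 positional patterns, as in the paper

models = []
for p in range(n_patterns):
    feats = rng.normal(loc=p, scale=1.0, size=(n_train, n_dim))  # placeholder ILD features
    models.append(GaussianMixture(n_components=4, random_state=0).fit(feats))

def classify(feature_frames):
    """Return the positional pattern whose GMM best explains the frames."""
    scores = [m.score(feature_frames) for m in models]
    return int(np.argmax(scores))

test = rng.normal(loc=3, scale=1.0, size=(50, n_dim))
print(classify(test))    # expected: pattern 3
```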


IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2005

Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition

Weifeng Li; Chiyomi Miyajima; Takanori Nishino; Katsunobu Itou; Kazuya Takeda; Fumitada Itakura

In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In previous work, we proposed multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. In this paper, the concept is extended to nonlinear regressions, and regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared to the nearest distant microphone and an adaptive beamformer (generalized sidelobe canceller), the proposed adaptive nonlinear regression approach achieves average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition in 15 real car environments.
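
A minimal sketch of the regression idea behind MRLS: per frequency bin, the close-talking microphone's log spectrum is estimated as a weighted combination of the distant microphones' log spectra, with weights fit by least squares. The data are synthetic; the paper's nonlinear extension and environment-adaptive weighting are not shown.

```python
# Fit per-bin regression weights mapping distant-mic log spectra to the
# close-talking mic's log spectrum.
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_mics, n_bins = 500, 5, 64
distant = rng.normal(size=(n_frames, n_mics, n_bins))   # distant-mic log spectra
true_w = rng.normal(size=(n_mics, n_bins))
close = np.einsum("tmb,mb->tb", distant, true_w)        # target close-talk log spectra

# Solve an independent least-squares problem for each frequency bin.
weights = np.empty((n_mics, n_bins))
for b in range(n_bins):
    weights[:, b], *_ = np.linalg.lstsq(distant[:, :, b], close[:, b], rcond=None)

estimate = np.einsum("tmb,mb->tb", distant, weights)
print(np.max(np.abs(estimate - close)))   # ~0 on this synthetic data
```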


International Conference on Intelligent Transportation Systems | 2009

Prediction model of driving behavior based on traffic conditions and driver types

Hideomi Amata; Chiyomi Miyajima; Takanori Nishino; Norihide Kitaoka; Kazuya Takeda

We investigate differences in driving behavior at unsignalized intersections between expert and nonexpert drivers. Analysis of real-world driving data revealed significant differences in pedal operation but not in steering operation; in particular, easing off the accelerator before entering an unsignalized intersection was observed more often in expert driving. We propose two models that predict driving behavior from traffic conditions and driver type: the first, based on multiple linear regression analysis, predicts whether the driver will steer, ease up on the accelerator, or brake; the second predicts the driver's decelerating intention using a Bayesian network. The proposed models predicted the three driving actions with over 70% accuracy, and about 50% of decelerating intentions were predicted before the driver entered the unsignalized intersection.
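
A minimal sketch of the first model's setup: predict one of three actions from traffic-condition and driver-type features at an intersection approach. Multinomial logistic regression stands in here for the paper's multiple linear regression analysis, and the feature set and labels are synthetic illustrations.

```python
# Predict steer / ease accelerator / brake from approach features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
# Illustrative features: [approach speed, following distance, visibility, is_expert]
X = np.column_stack([rng.uniform(20, 60, n),
                     rng.uniform(5, 50, n),
                     rng.uniform(0, 1, n),
                     rng.integers(0, 2, n)])
y = rng.integers(0, 3, n)     # 0 = steer, 1 = ease accelerator, 2 = brake

model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.predict_proba(X[:1]))   # per-action probabilities for one approach
```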


Archive | 2009

On-Going Data Collection of Driving Behavior Signals

Chiyomi Miyajima; Takashi Kusakawa; Takanori Nishino; Norihide Kitaoka; Katsunobu Itou; Kazuya Takeda

We are developing a large-scale real-world driving database of more than 200 drivers using a data collection vehicle equipped with various sensors for the synchronous recording of multimedia data including speech, video, driving behavior, and physiological signals. The driver’s speech and video are captured with multi-channel microphones and cameras. Gas and brake pedal pressures, steering angles, vehicle velocities, and following distances are measured using pressure sensors, a potentiometer, a velocity pulse counter, and distance sensors, respectively. Physiological sensors are mounted to measure the driver’s heart rate, skin conductance, and emotion-based sweating. The multimedia data are collected under four different task conditions while driving on urban roads and an expressway. Data collection is currently underway, and to date 151 drivers have participated in the experiment. Data collection is being conducted in international collaboration with the United States and Europe. This chapter reports on our ongoing driving data collection in Japan.


International Conference on Multimodal Interfaces | 2008

An integrative recognition method for speech and gestures

Madoka Miki; Chiyomi Miyajima; Takanori Nishino; Norihide Kitaoka; Kazuya Takeda

We propose an integrative recognition method for speech accompanied by gestures such as pointing. Simultaneously generated speech and pointing complementarily help the recognition of each other, so integrating these modalities may improve recognition performance. As an example of such multimodal speech, we selected the explanation of a geometry problem. While the problem was being solved, speech and fingertip movements were recorded with a close-talking microphone and a 3D position sensor. To find the correspondence between utterances and gestures, we propose a probability distribution of the time gap between the starting times of an utterance and of its gestures, together with an integrative recognition method that uses this distribution. With this method we obtained an approximately 3-point improvement in both speech and fingertip movement recognition performance.
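
Complementing the pairing-score sketch shown for the 2014 journal version above, here is a minimal sketch of how the time-gap distribution itself might be estimated: collect the gap between each utterance start and its paired gesture start, then fit a parametric model. The pair values and the Gaussian assumption are illustrative, not the paper's data.

```python
# Fit a Gaussian to observed utterance/gesture start-time gaps.
import numpy as np
from scipy.stats import norm

# (utterance start, gesture start) pairs in seconds from annotated sessions.
pairs = np.array([(1.20, 1.05), (4.80, 4.35), (9.10, 9.00), (12.6, 12.1)])
gaps = pairs[:, 0] - pairs[:, 1]

mu, sigma = norm.fit(gaps)            # maximum-likelihood Gaussian fit
print(f"gap ~ N({mu:.2f}, {sigma:.2f}^2)")

# The fitted density can then weight candidate speech/gesture pairings during
# integrative decoding, as in the recognition method described above.
```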


IEEE Intelligent Vehicles Symposium | 2010

A browsing and retrieval system for driving data

Masashi Naito; Chiyomi Miyajima; Takanori Nishino; Norihide Kitaoka; Kazuya Takeda

With the increased presence and recent advances of drive recorders, rich driving data that include video, vehicle acceleration signals, driver speech, GPS data, and several sensor signals can be continuously recorded and stored. These advances enable researchers to study driving behavior more extensively for traffic safety. However, the increasing variety and amount of driving data complicate browsing various data simultaneously and finding desired data in large databases. In this study, we develop a browsing and retrieval system for driving data that provides a multi-modal data browser, query- and similarity-based retrieval functions, and a fast browsing function that skips redundant scenes. So that data can be shared among several users, the system can be accessed over a network from PCs or smartphones. As its retrieval algorithm, the system uses a time-series active search, which has been used successfully for fast search of audio and video data. In a few seconds, the system can retrieve driving scenes similar to an input scene from among 80,000 scenes. Retrieval performance was compared under various conditions by changing the codebook size of the vector quantization for the histogram features and the combination of driving signals. Experimental results showed that more than 97% retrieval performance was achieved for driving behaviors of left/right turns and curves using a combination of complementary information such as steering angles and lateral acceleration. We also compared the proposed method to a conventional image-based retrieval method using subjective similarity scores of driving scenes. Our system retrieved similar scenes with about 75% retrieval performance, five points higher than the conventional image-based method, because the image-based method is sensitive to image changes outside the region of interest for driving data retrieval. The fast browsing function also skipped scenes that an image-based method could not skip.
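
A minimal sketch of the retrieval backbone, assuming scenes are described by vector-quantization code histograms over driving-signal frames and compared by histogram intersection; the paper's time-series active search adds an efficient sliding-window scan on top of this, which is not reproduced here, and all signals below are synthetic.

```python
# Build a k-means VQ codebook over driving-signal frames, describe scenes by
# code histograms, and rank stored scenes against a query by intersection.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
codebook_size = 64
frames = rng.normal(size=(5000, 2))          # [steering angle, lateral accel] frames
codebook = KMeans(n_clusters=codebook_size, n_init=4, random_state=0).fit(frames)

def scene_histogram(scene_frames):
    """Normalized histogram of VQ codes over one driving scene."""
    codes = codebook.predict(scene_frames)
    hist = np.bincount(codes, minlength=codebook_size).astype(float)
    return hist / hist.sum()

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical code usage, 0.0 for disjoint."""
    return np.minimum(h1, h2).sum()

query = scene_histogram(rng.normal(size=(200, 2)))
stored = [scene_histogram(rng.normal(size=(200, 2))) for _ in range(10)]
best = max(range(10), key=lambda i: similarity(query, stored[i]))
print(best, similarity(query, stored[best]))
```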
