Kohei Hayashida
Ritsumeikan University
Publication
Featured research published by Kohei Hayashida.
International Conference on Acoustics, Speech, and Signal Processing | 2014
Kohei Hayashida; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita; Toshiharu Horiuchi; Tsuneo Kato
Desired/undesired speech discrimination is as important as speech/non-speech discrimination to achieve useful applications such as speech interfaces and teleconferencing systems. Conventional methods of voice activity detection (VAD) utilize the directional information of sound sources to distinguish desired from undesired speech. However, these methods have to utilize multiple microphones to estimate the directions of sound sources. Here, we propose a new method to discriminate desired from undesired speech with a single microphone. We assumed that the desired talkers would be close to the microphone, and the proposed method could distinguish close/distant-talking speech from observed signals based on the kurtosis of the linear prediction (LP) residual signals. The experimental results revealed that the proposed method could distinguish close-talking speech from distant-talking speech within a 10% equal error rate (EER) in ordinary reverberant environments with less processing time.
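The feature named in the abstract above, the kurtosis of the LP residual, can be illustrated with a minimal sketch. This is not the authors' implementation: the LP order, the toy "speech" signal (an impulse train through a two-pole resonator), and the synthetic reverberation tail are all assumptions chosen only to show that a reverberant (distant-talking) signal tends to give a flatter, lower-kurtosis residual than a dry (close-talking) one.

```python
import math
import random

def lp_coefficients(x, order):
    """Autocorrelation-method LP via Levinson-Durbin.
    Returns a[1..order] for the predictor x_hat[n] = sum_k a[k] * x[n-k]."""
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. silent) frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def lp_residual(x, coeffs):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-k]."""
    res = []
    for n in range(len(x)):
        pred = sum(coeffs[k] * x[n - 1 - k] for k in range(min(len(coeffs), n)))
        res.append(x[n] - pred)
    return res

def kurtosis(v):
    """Non-excess kurtosis m4 / m2^2 (about 3 for Gaussian data)."""
    m = sum(v) / len(v)
    m2 = sum((s - m) ** 2 for s in v) / len(v)
    m4 = sum((s - m) ** 4 for s in v) / len(v)
    return m4 / (m2 * m2)

def lp_residual_kurtosis(x, order=10):  # order=10 is an assumed value
    return kurtosis(lp_residual(x, lp_coefficients(x, order)))

def synth_voiced(n=4000, period=80):
    """Toy stand-in for voiced speech: impulse train through a 2-pole resonator."""
    y = [0.0] * n
    for i in range(n):
        e = 1.0 if i % period == 0 else 0.0
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y[i] = e + 1.3 * y1 - 0.8 * y2
    return y

def reverberate(x, tail=400, decay=100.0, level=0.3, seed=0):
    """Hypothetical room response: direct path plus exponentially decaying noise."""
    rng = random.Random(seed)
    rir = [1.0] + [level * rng.gauss(0.0, 1.0) * math.exp(-t / decay)
                   for t in range(1, tail)]
    return [sum(rir[k] * x[n - k] for k in range(min(len(rir), n + 1)))
            for n in range(len(x))]

close = synth_voiced()
distant = reverberate(close)
k_close = lp_residual_kurtosis(close)
k_distant = lp_residual_kurtosis(distant)
```

In this toy setting `k_close` comes out well above `k_distant`. A practical detector would compare the per-frame kurtosis against a threshold tuned on real recordings, which this sketch does not attempt.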
Journal of the Acoustical Society of America | 2013
Ryota Miyake; Kohei Hayashida; Masato Nakayama; Takanobu Nishiura
Information on the distance to a target is very important for the practical use of hands-free speech interfaces and nursing-care robots. Many distance measurement methods have been proposed that use the time of flight (TOF) of a reflected wave, measured with reference to the transmitted wave. However, these methods cannot measure short distances, because the transmitted wave has not attenuated sufficiently when the reflected wave is received and therefore suppresses reflected waves from short distances. We previously proposed an acoustic distance measurement method based on interference between the transmitted and reflected waves, which can measure distances over a short range using a single microphone. This method is referred to as the phase interference method. It can estimate the distance to a target, but not its direction. In the present paper, therefore, we propose to achieve acoustic imaging with the phase interference method by using a microphone array instead o...
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013
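The idea of measuring distance from interference between the transmitted and reflected waves can be sketched as follows. The abstract does not give the estimator, so everything here is an assumption: a single reflection with a hypothetical gain, an idealized power spectrum, and a brute-force scan over candidate distances. The underlying fact is only that the sum of a wave and its copy delayed by the round trip 2d/c has a power spectrum that ripples over frequency with period c/(2d).

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def interference_power(freqs, distance, reflect_gain=0.5):
    """Idealized power of transmitted + reflected wave at each frequency.
    reflect_gain is a hypothetical reflection amplitude."""
    delay = 2.0 * distance / SPEED_OF_SOUND  # round-trip delay of the reflection
    return [abs(1.0 + reflect_gain * cmath.exp(-2j * math.pi * f * delay)) ** 2
            for f in freqs]

def estimate_distance(freqs, power, d_min=0.05, d_max=1.0, step=0.001):
    """Scan candidate distances and pick the one whose ripple period
    best matches the mean-removed power spectrum."""
    mean = sum(power) / len(power)
    ripple = [p - mean for p in power]
    best_d, best_score = d_min, -1.0
    for i in range(int(round((d_max - d_min) / step)) + 1):
        d = d_min + i * step
        delay = 2.0 * d / SPEED_OF_SOUND
        score = abs(sum(r * cmath.exp(-2j * math.pi * f * delay)
                        for r, f in zip(ripple, freqs)))
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# Assumed measurement band: 1 kHz to about 9 kHz in 20 Hz steps.
freqs = [1000.0 + 20.0 * k for k in range(400)]
estimate = estimate_distance(freqs, interference_power(freqs, 0.30))
```

With this band the scan recovers the simulated 0.30 m target to within about a centimetre. Real measurements would add noise, multiple reflections, and transducer responses that the sketch ignores.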
Kohei Hayashida; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
Conventional near-field talker localization methods that use a microphone array calculate the spatial spectrum at each scanning position of the discretized space and at each frequency. Hence, the elapsed time increases and real-time processing becomes difficult. Real-time processing is important for achieving natural interaction with speech interfaces. To overcome this problem, we propose a talker localization method based on Multi-resolution Scanning in Frequency Domain (MSFD). MSFD utilizes lower spatial resolution in the lower frequency band and higher spatial resolution in the higher frequency band to reduce elapsed time. We also propose a calculation method for a suitable spatial resolution at each frequency on the basis of the variances of phase differences among microphones. The results of an evaluation experiment indicated that the proposed MSFD could reduce the elapsed time without degrading the localization accuracy.
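The saving from frequency-dependent scanning resolution can be illustrated by counting scan points. The abstract derives the resolution from the variances of phase differences among microphones; that rule is not reproduced here. As a stand-in, this sketch assumes a hypothetical rule in which the grid step grows in proportion to the wavelength, and the scan area, step limits, and frequency bins are all made-up values.

```python
def grid_step(freq_hz, finest_step=0.02, top_freq=8000.0, coarsest_step=0.32):
    """Hypothetical rule: grid step (m) proportional to wavelength,
    clipped to [finest_step, coarsest_step]."""
    step = finest_step * top_freq / freq_hz
    return min(max(step, finest_step), coarsest_step)

def num_scan_points(step, area=(2.0, 2.0)):
    """Number of grid points covering a rectangular scan area (m)."""
    nx = int(area[0] / step + 1e-9) + 1
    ny = int(area[1] / step + 1e-9) + 1
    return nx * ny

def scan_cost(freqs, multires, finest_step=0.02):
    """Total spatial-spectrum evaluations summed over all frequency bins."""
    return sum(num_scan_points(grid_step(f) if multires else finest_step)
               for f in freqs)

freqs = [250.0 * k for k in range(1, 33)]  # 250 Hz to 8 kHz, assumed bins
fixed_cost = scan_cost(freqs, multires=False)
multires_cost = scan_cost(freqs, multires=True)
```

Under these assumptions `multires_cost` is a small fraction of `fixed_cost`, because low-frequency bins are scanned on a much coarser grid. Whether accuracy is preserved depends on how the per-frequency resolution is chosen, which is the part the paper addresses.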
Journal of the Acoustical Society of America | 2013
Asako Okamoto; Kohei Hayashida; Masato Nakayama; Takanobu Nishiura
To detect hazardous situations from danger sounds, an acoustic surveillance system is an ideal candidate. Conventional systems recognize environmental sounds with hidden Markov models (HMMs) in order to detect danger sounds. However, it is difficult to recognize environmental sounds accurately, because the optimum HMM parameters for environmental sounds have not been identified. An important factor in recognizing them accurately is determining the number of states, one of the HMM parameters. Moreover, environmental sounds, which include danger sounds, vary widely in characteristics such as structure, complexity, and length. A variable number of states should therefore give an optimum HMM structure for detecting danger sounds. We thus propose danger sound detection based on variable-state HMMs corresponding to the number of inflection points in the delta power of environmental sounds. We first investigate the recognition performance of environmental sounds including danger sounds with ...
Journal of the Acoustical Society of America | 2013
Kohei Hayashida; Junpei Ogawa; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
In recent years, methods utilizing environmental sounds have been increasingly employed for monitoring the safety of elderly people who live far away. Environmental sounds consist of various sounds in daily life, and identifying them makes it possible to detect abnormalities. To detect abnormalities, abnormal/warning sounds must therefore be accurately identified among environmental sounds. In the past, environmental sound identification methods have generally utilized acoustic models constructed for each sound source across all environmental sounds. In our former research, we proposed multi-stage identification for detecting abnormal/warning sounds accurately. However, these methods design individual acoustic models from similar sounds. Therefore, the sound identification performance is degraded. To overcome this problem, in this study, we propose environmental sound classification based on acoustic features for model construction. The proposed method classifies environmental sounds based on the...
Computer Graphics and Imaging | 2013
Kohei Hayashida; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita
We study near-field sound source localization, for which two-dimensional multiple signal classification (2D-MUSIC) has already been developed. However, this method requires considerable processing time because it processes each frequency separately. A localization method based on cross-power spectrum phase analysis (CSP) in the near field has also been developed. However, its localization accuracy is degraded by noise sources and reverberation. To overcome these problems, we propose a new localization method based on time delay and subspace methods. Results of an evaluation experiment carried out in a conference room demonstrate that the method improves localization accuracy with less elapsed time in a diffuse noise environment.
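Cross-power spectrum phase analysis (CSP), mentioned above as the time-delay building block, is commonly implemented as a phase-only cross-correlation between two microphone signals. The sketch below shows only that building block for one microphone pair, using a naive DFT for self-containment; it is not the proposed combination with subspace methods, and the test signal and function names are illustrative assumptions.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    twiddle = [cmath.exp(sign * 2j * cmath.pi * m / n) for m in range(n)]
    out = [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def csp_delay(x1, x2):
    """Delay of x2 relative to x1, in samples, from the CSP peak."""
    n = 2 * max(len(x1), len(x2))  # zero-pad to avoid circular wrap-around
    a = dft(list(x1) + [0.0] * (n - len(x1)))
    b = dft(list(x2) + [0.0] * (n - len(x2)))
    cross = [bk * ak.conjugate() for ak, bk in zip(a, b)]
    # Keep only the phase of the cross-power spectrum (magnitude whitening).
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    csp = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: csp[i])
    return peak if peak < n // 2 else peak - n  # upper half means negative lag

rng = random.Random(1)
mic1 = [rng.gauss(0.0, 1.0) for _ in range(128)]
mic2 = [0.0] * 5 + mic1  # same signal arriving 5 samples later
delay = csp_delay(mic1, mic2)
```

Here `delay` is the lag, in samples, at which `mic2` best aligns with `mic1`; swapping the arguments flips its sign. Converting delays from several microphone pairs into a near-field position requires the array geometry, which is outside this sketch.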
Journal of the Acoustical Society of America | 2012
Kohei Hayashida; Masanori Morise; Takanobu Nishiura; Yoichi Yamashita
Robust speech recognition is necessary for realizing useful speech interfaces. A microphone array is effective at capturing high-quality distant-talking speech in noisy environments. It captures the target speech by localizing a talker and steering the directivity. This processing is usually executed at each frequency. To realize a useful speech interface, it must be completed in real time. In research on sound source localization, various methods have already been developed, and these methods localize a sound source by scanning the acoustic space with a fixed resolution at each frequency. Therefore, the calculation time increases, and real-time processing is difficult at higher spatial resolutions. To overcome this problem, we proposed a localization method based on multi-resolution scanning in the frequency domain. The lower frequency band has lower spatial resolution, and the higher frequency band has higher spatial resolution. The proposed method localizes the sound sou...
Journal of the Acoustical Society of America | 2012
Junpei Ogawa; Kohei Hayashida; Masanori Morise; Takanobu Nishiura; Yoichi Yamashita
In recent years, methods utilizing environmental sounds have been increasingly employed for monitoring the safety of elderly people who live far away. Environmental sounds consist of various sounds in daily life, and identifying them makes it possible to detect abnormalities. To detect abnormalities, abnormal/warning sounds must therefore be accurately identified among environmental sounds. In the past, environmental sound identification methods have utilized acoustic models constructed for each sound source across all environmental sounds. However, because only a few abnormal/warning sounds are available for training, it is difficult to design each abnormal/warning acoustic model accurately. To overcome this problem, we propose a multi-stage identification method for abnormal/warning sounds. In the first stage, it utilizes an abnormal/alarm acoustic sound model and each normal acoustic sound model for detecting the abnormal/alarm sounds with a few training abnormal/warning sounds. In the second stage, it utiliz...
IEICE technical report. Signal processing | 2015
Asako Okamoto; Kohei Hayashida; Masato Nakayama; Takanobu Nishiura
The IEICE transactions on information and systems | 2013
Yasuhiro Kuratani; Kohei Hayashida; Masato Nakayama; Takanobu Nishiura