Publication


Featured research published by Mitsunori Mizumachi.


international conference on acoustics speech and signal processing | 1998

Noise reduction by paired-microphones using spectral subtraction

Mitsunori Mizumachi; Masato Akagi

This paper proposes a method of noise reduction using paired microphones as a front-end processor for speech recognition systems. The method estimates noise using a subtractive microphone array and subtracts it from the noisy speech signal using spectral subtraction (SS). Because the method estimates noise analytically and frame by frame, the estimate does not depend on the acoustic properties of the noise. The method can therefore also reduce non-stationary noise, such as the sudden noise of a door closing, which other SS methods cannot reduce. The results of computer simulations and experiments in a real environment show that this method can reduce LPC log spectral envelope distortions.
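
The frame-by-frame subtraction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-frame noise power spectrum is assumed to be already available (in the paper it comes from the subtractive microphone array), and the function names and the `floor` parameter are hypothetical.

```python
def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract an estimated noise power spectrum from one noisy frame.

    noisy_power, noise_power: per-bin power values for the same frame.
    floor: hypothetical fraction of the noisy power kept as a lower bound,
           so that over-subtraction never produces negative power.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


def process_frames(noisy_frames, noise_frames, floor=0.01):
    # The noise estimate is supplied per frame, so a sudden noise burst is
    # subtracted only in the frames where it actually occurs.
    return [
        spectral_subtract(y, n, floor)
        for y, n in zip(noisy_frames, noise_frames)
    ]


# Two frames, two frequency bins each; the second frame contains a burst
# of noise in its second bin (illustrative numbers only).
noisy = [[4.0, 9.0], [4.0, 25.0]]
noise = [[1.0, 1.0], [1.0, 17.0]]
cleaned = process_frames(noisy, noise)
```

Because the noise estimate changes with every frame, the burst in the second frame is removed without affecting the first frame, which is the property a long-term averaged noise estimate would not provide.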


international conference on data engineering | 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework

Masakiyo Fujimoto; Satoshi Nakamura; Toshiki Endo; Kazuya Takeda; Chiyomi Miyajima; Shingo Kuroiwa; Takeshi Yamada; Norihide Kitaoka; Kazumasa Yamamoto; Mitsunori Mizumachi; Takanobu Nishiura; Akira Sasou

This paper introduces CENSREC-3, a common database and evaluation framework for in-car speech recognition, together with its baseline recognition results, as an outcome of the IPSJ-SIG SLP Noisy Speech Recognition Evaluation Working Group. CENSREC-3, a sequel to AURORA-2J, is designed as an evaluation framework for isolated word recognition in real driving environments. Speech data were collected using two microphones, a close-talking microphone and a hands-free microphone, under 16 carefully controlled driving conditions, i.e., combinations of 3 car speeds and 5 car conditions. CENSREC-3 provides 6 evaluation environments, which are designed using speech data collected in these car conditions.


sensor array and multichannel signal processing workshop | 2004

Iterative compensation of microphone array and sound source movements based on minimization of arrival time differences

Toshiharu Horiuchi; Mitsunori Mizumachi; Satoshi Nakamura

This paper proposes an iterative compensation method to deal with relative changes in sound source location caused by rapid movements of a microphone array and a sound source. The method introduces a delay filter built from shifted and sampled sinc functions. The paper presents a concept that applies the error function of the adaptive algorithm both to estimating the two-dimensional direction of arrival and to the coordinate system of a moving microphone array. The method directly estimates the direction of the microphone array relative to the sound source by minimizing the relative differences in arrival time among the observed signals, rather than by estimating the time difference of arrival (TDOA) between two observed signals. At the same time, the method compensates for the time delays of the observed signals, which ensures that the output signals are in phase. Simulation results support the effectiveness of the method.
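
As an illustration of a delay filter, the sketch below builds a fractional-delay filter from a shifted, sampled sinc function, which is a common textbook construction. It is an assumption that the paper's filter has this form; the tap count, the function names, and the absence of a window are all hypothetical choices.

```python
import math


def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def delay_filter(delay, half_length=8):
    """Taps h[n] = sinc(n - delay) for n = -half_length .. half_length.

    delay is in samples and may be fractional.
    """
    return [sinc(n - delay) for n in range(-half_length, half_length + 1)]


def apply_filter(signal, taps, half_length=8):
    """Convolve signal with taps whose centre tap corresponds to n = 0."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            n = k - half_length  # tap index relative to the centre
            if 0 <= t - n < len(signal):
                acc += h * signal[t - n]
        out.append(acc)
    return out


# Delay a unit impulse located at sample 5 by exactly 2 samples.
impulse = [0.0] * 20
impulse[5] = 1.0
delayed = apply_filter(impulse, delay_filter(2))
```

With an integer delay the filter reduces to a shifted impulse, so the peak moves from sample 5 to sample 7. A fractional `delay` spreads the energy over neighbouring taps, which is what allows arrival-time differences smaller than one sample to be compensated.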


Sensors | 2014

Robust sensing of approaching vehicles relying on acoustic cues

Mitsunori Mizumachi; Atsunobu Kaminuma; Nobutaka Ono; Shigeru Ando

This paper proposes a robust sensing technique for localizing an approaching vehicle relying on an acoustic cue. A camera and a radar system are commonly used as sensors for active safety systems. However, such sensors do not work well in a blind spot such as a blind junction of a highway. On the other hand, various acoustic noises caused by a cruising vehicle do arrive at the blind spot. Those acoustic cues are available for robust sensing of surrounding vehicles, provided that a robust spatial feature can be extracted from noisy acoustic observations. In this paper, the spatio-temporal gradient method is employed for the feature extraction, and the spatial feature is filtered through sequential state estimation. The feasibility of the proposed method is confirmed with real acoustic observations obtained by microphones outside a cruising vehicle. The filtering process can be done in real time: its real-time factor is 0.096 on a 2.6 GHz Intel Core i7 processor.
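
The real-time claim rests on the real-time factor, conventionally defined as processing time divided by the duration of the processed signal; a value below 1 means processing keeps up with the input. The sketch below assumes that conventional definition (the abstract does not spell it out), and the function names are hypothetical.

```python
def real_time_factor(processing_seconds, signal_seconds):
    """Processing time divided by the duration of the processed signal."""
    return processing_seconds / signal_seconds


def runs_in_real_time(rtf):
    # Below 1.0 the processor finishes before the next block of audio arrives.
    return rtf < 1.0


reported_rtf = 0.096  # value quoted in the abstract
# Under this definition, 10 s of audio would need about 0.96 s of processing.
seconds_for_ten = reported_rtf * 10.0
```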


sensor array and multichannel signal processing workshop | 2004

Passive subtractive beamformer for near-field sound sources

Mitsunori Mizumachi; Satoshi Nakamura

This paper describes the problem caused by near-field sound sources. The authors previously proposed a 2-ch passive subtractive beamformer with a single sharp notch for noise reduction. A single sharp notch is clearly insufficient for dealing with near-field, non-point sound sources. To solve this problem, this paper presents a hybrid subtractive beamformer that is realized as a cascade connection of single subtractive beamformers. The number of connections depends on frequency, in order to minimize the negative effect of spatial aliasing when the target signal is assumed to be a wide-band speech signal. The experimental results verify that the hybrid beamformer has an advantage over the original single subtractive beamformer in reducing signal distortion.
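
To show why a single subtractive beamformer has one sharp notch, the sketch below evaluates the magnitude response of a two-microphone delay-and-subtract beamformer. It assumes a far-field plane wave, a sound speed of 343 m/s, and a 5 cm spacing; these are simplifications chosen for illustration, not values from the paper, which addresses the near-field case where this model breaks down.

```python
import cmath
import math

SOUND_SPEED = 343.0  # m/s, assumed


def inter_mic_delay(angle_deg, spacing_m):
    """Arrival-time difference between two microphones for a plane wave."""
    return spacing_m * math.sin(math.radians(angle_deg)) / SOUND_SPEED


def subtractive_response(freq_hz, angle_deg, null_deg, spacing_m=0.05):
    """|1 - exp(-j*omega*(tau - tau0))| for a delay-and-subtract pair.

    tau is the inter-microphone delay of the incoming wave and tau0 is the
    delay applied by the beamformer to place the notch at null_deg.
    """
    omega = 2.0 * math.pi * freq_hz
    tau = inter_mic_delay(angle_deg, spacing_m)
    tau0 = inter_mic_delay(null_deg, spacing_m)
    return abs(1.0 - cmath.exp(-1j * omega * (tau - tau0)))


at_null = subtractive_response(1000.0, 30.0, 30.0)    # source on the notch
off_null = subtractive_response(1000.0, -60.0, 30.0)  # source elsewhere
```

The response vanishes only when the source direction matches the notch direction exactly, so a source that is spread over a range of directions, as a near-field non-point source effectively is, cannot be cancelled by one such stage.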


international conference on speech and computer | 2016

Study on the Improvement of Intelligibility for Elderly Speech Using Formant Frequency Shift Method

Yuto Tanaka; Mitsunori Mizumachi; Yoshihisa Nakatoh

Populations are aging in developed countries. Elderly people have difficulty controlling their articulation accurately due to aging, so the quality of elderly speech needs to be improved for smooth communication. In this paper, we analyzed the 1st formant frequency (F1) and 2nd formant frequency (F2) of more intelligible and less intelligible speech of Japanese elderly people. In addition, we improved the intelligibility of less intelligible elderly speech by using the formant frequency shift method. This method corrects the formant frequencies by a shift value, based on LPC. The shift value is a magnification that expands the F1-F2 size of less intelligible speech.
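
One possible reading of "expanding the F1-F2 size" is scaling each vowel's (F1, F2) point away from the centroid of the speaker's vowels. The sketch below illustrates only that reading; the abstract does not specify how the magnification is applied, and the function name, the centroid choice, and the formant values are all hypothetical. It also omits the LPC analysis and resynthesis steps.

```python
def expand_f1_f2(vowel_formants, magnification):
    """Scale (F1, F2) points about their centroid by a magnification factor.

    vowel_formants: dict mapping a vowel label to an (F1, F2) pair in Hz.
    magnification: values above 1.0 enlarge the F1-F2 area.
    """
    count = len(vowel_formants)
    centre_f1 = sum(f1 for f1, _ in vowel_formants.values()) / count
    centre_f2 = sum(f2 for _, f2 in vowel_formants.values()) / count
    return {
        vowel: (
            centre_f1 + magnification * (f1 - centre_f1),
            centre_f2 + magnification * (f2 - centre_f2),
        )
        for vowel, (f1, f2) in vowel_formants.items()
    }


# Illustrative formant values only, not measurements from the paper.
original = {"a": (800.0, 1200.0), "i": (300.0, 2200.0)}
shifted = expand_f1_f2(original, 1.25)
```

In this example the distance between /a/ and /i/ in the F1-F2 plane grows by the magnification factor while the centroid stays fixed.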


Journal of the Acoustical Society of America | 2016

Superdirective non-linear beamforming with deep neural network

Mitsunori Mizumachi; Maya Origuchi

Beamforming has been one of the important issues in acoustic signal processing, since it can achieve signal enhancement and sound source localization. In general, traditional beamformers are designed by an analytical approach or an adaptive approach. It is, however, difficult to properly optimize beamformers in complicated acoustic scenes. Non-linear beamforming can be substituted for linear beamforming as an alternative. In this study, a flexible framework for optimizing the beamformer is introduced based on a deep neural network. Capturing acoustic signals using a microphone array is regarded as spatial sampling, so annoying grating lobes appear in the beam pattern when the relationship between the wavelength and the microphone spacing does not satisfy the sampling theorem. The proposed method achieves sub-band beamforming using a non-uniform microphone array with eight nested microphones, which are carefully designed not to cause spatial aliasing. Feasibility of the proposed method has bee...
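
The spatial sampling condition mentioned above is commonly stated as a microphone spacing of at most half the wavelength. The sketch below checks that condition; the sound speed of 343 m/s and the example spacings are assumptions, and the function names are hypothetical.

```python
SOUND_SPEED = 343.0  # m/s, assumed


def max_alias_free_frequency(spacing_m, sound_speed=SOUND_SPEED):
    """Highest frequency for which spacing <= wavelength / 2 still holds."""
    return sound_speed / (2.0 * spacing_m)


def satisfies_sampling_theorem(freq_hz, spacing_m, sound_speed=SOUND_SPEED):
    wavelength = sound_speed / freq_hz
    return spacing_m <= wavelength / 2.0


# A nested array uses wider spacings for low sub-bands and narrower ones
# for high sub-bands (illustrative spacings, not the paper's design).
for spacing in (0.16, 0.08, 0.04, 0.02):
    limit = max_alias_free_frequency(spacing)
    print(f"spacing {spacing:.2f} m: alias-free up to about {limit:.0f} Hz")
```

Halving the spacing doubles the alias-free frequency limit, which is why assigning a different microphone pair to each sub-band lets one array cover a wide-band signal without grating lobes.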


international conference on informatics electronics and vision | 2015

Design of elder-friendly auditory signals for microwave ovens

Naoki Kadota; Mitsunori Mizumachi; Yoshihisa Nakatoh

Recent household electrical appliances generate a number of auditory signals when a user operates an appliance or when an appliance needs to draw the user's attention. In this paper, we focus on multi-functional microwave ovens, which commonly have several auditory signals for user operation, error occurrence, completion of cooking, and so on. The wide variety of auditory signals in current microwave ovens has two problems: it is difficult for users to imagine what the auditory signals indicate, and auditory signals with high-frequency components are not friendly to elderly users. In this paper, popular auditory signals for microwave ovens were subjectively evaluated with and without an elderly-simulated band-pass filter, which simulated an averaged audiogram of elderly people, on five-grade scales of ease of perception, unpleasantness, and emergency. Experimental results suggest that young and elderly people might imagine different functions for a single auditory signal. Finally, auditory signals suitable for both young and elderly users are proposed from the viewpoints of frequency and temporal patterns.


international symposium on computer consumer and control | 2014

Robust Sensing of Approaching Vehicles Relying on Acoustic Cue

Mitsunori Mizumachi; Atsunobu Kaminuma; Nobutaka Ono; Shigeru Ando

This paper proposes a robust sensing technique for localizing an approaching vehicle relying on an acoustic cue. A camera and a radar system are commonly used as sensors for active safety systems. However, such sensors do not work well in a blind spot such as a blind junction of a highway. On the other hand, various acoustic noises caused by a cruising vehicle do arrive at the blind spot. Those acoustic cues are available for robust sensing of surrounding vehicles, provided that a robust spatial feature can be extracted from noisy acoustic observations. In this paper, the spatio-temporal gradient method is employed for the feature extraction, and the spatial feature is filtered through sequential state estimation. The feasibility of the proposed method is confirmed with real acoustic observations obtained by microphones outside a cruising vehicle. The filtering process can be done in real time: its real-time factor is 0.096 on a 2.6 GHz Intel Core i7 processor.


Journal of the Acoustical Society of America | 2014

Change of static characteristics of Japanese word utterances with aging

Mitsunori Mizumachi; Kazuto Ogata

Acoustical characteristics of elderly speech have been investigated from various viewpoints. Elderly speech can be subjectively characterized by roughness, breathiness, asthenia, and hoarseness. Those characteristics have been individually explained in both medical science and speech science. In particular, hoarseness, which is caused by a physiological problem with aged vocal cords, is the best-known static property of elderly speech. In this study, the change in hoarseness with aging is quantitatively investigated. Japanese phonetically balanced utterances of 543 words were collected with the cooperation of 153 speakers, whose ages ranged from 20 to 89 years old. Acoustical characteristics of the word utterances were examined from the viewpoints of age and auditory impression. In the static acoustical analysis of the Japanese vowels /a/, /e/, /i/, /o/, and /u/, it is confirmed that energy in the high-frequency region rises with aging. There is a remarkable energy lift over 4 kHz, and the amount of the energy lift...

Collaboration

Top Co-Authors

Satoshi Nakamura
Nara Institute of Science and Technology

Yoshihisa Nakatoh
Kyushu Institute of Technology

Katsuyuki Niyada
Kyushu Institute of Technology

Masato Akagi
Japan Advanced Institute of Science and Technology

Kazumasa Yamamoto
Toyohashi University of Technology