Michiko Kazama | Researchain

Publication

Featured research published by Michiko Kazama.

Journal of the Acoustical Society of America | 2010

On the significance of phase in the short term Fourier spectrum for speech intelligibility

Michiko Kazama; Satoru Gotoh; Mikio Tohyama; Tammo Houtgast

This paper investigates the significance of the magnitude or the phase in the short-term Fourier spectrum for speech intelligibility as a function of the time-window length. For a wide range of window lengths (1/16-2048 ms), two hybrid signals were obtained by a cross-wise combination of the magnitude and phase spectra of speech and white noise. Speech intelligibility data showed the significance of the phase spectrum for longer windows (>256 ms) and for very short windows (<4 ms), and that of the magnitude spectrum for medium-range window lengths. The hybrid signals used in the intelligibility test were analyzed in terms of the preservation of the original narrow-band speech envelopes. Correlations between the narrow-band envelopes of the original speech and the hybrid signals show a similar pattern as a function of window length. This result illustrates the importance of the preservation of narrow-band envelopes for speech intelligibility. The observed significance of the phase spectrum in recovering the narrow-band envelopes for the long-term windows and for the very short-term windows is discussed.
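The cross-wise combination described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes non-overlapping rectangular frames (the abstract does not state the window shape or overlap), and the function name `hybrid_signal` and its parameters are hypothetical. A window length in milliseconds would map to `frame_len = round(fs * ms / 1000)` for an assumed sampling rate `fs`.

```python
import numpy as np

def hybrid_signal(mag_source, phase_source, frame_len):
    """Frame-wise hybrid: magnitude spectrum of one signal, phase spectrum of another.

    Assumption: non-overlapping rectangular frames of frame_len samples.
    """
    n = min(len(mag_source), len(phase_source))
    n -= n % frame_len  # drop the incomplete trailing frame
    a = np.asarray(mag_source[:n], dtype=float).reshape(-1, frame_len)
    b = np.asarray(phase_source[:n], dtype=float).reshape(-1, frame_len)

    spec_a = np.fft.rfft(a, axis=1)  # supplies the magnitudes
    spec_b = np.fft.rfft(b, axis=1)  # supplies the phases

    hybrid_spec = np.abs(spec_a) * np.exp(1j * np.angle(spec_b))
    frames = np.fft.irfft(hybrid_spec, n=frame_len, axis=1)
    return frames.reshape(-1)
```

Calling `hybrid_signal(speech, noise, frame_len)` would give a magnitude-from-speech, phase-from-noise signal, and swapping the first two arguments gives the other hybrid.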

International Conference on Acoustics, Speech, and Signal Processing | 2002

Pitch and speech-rate conversion using envelope modulation modeling

Kazuaki Yoshida; Michiko Kazama; Mikio Tohyama

This article describes a method of intelligible speech representation that uses narrow-band envelopes and their carriers. This method enables modification of the talker's voice pitch and speech-rate without sacrificing intelligibility. The carrier, which shows the instantaneous phase, conveys pitch information, while the temporal envelope conveys speech-rate information and preserves speech intelligibility. The carriers, however, can be replaced by sinusoidal signals without severely degrading intelligibility or voice quality. Consequently, we can modify the pitch by shifting each envelope's carrier frequency and convert the speech-rate by stretching or shrinking the envelopes. These findings could be useful in frequency scaling of the speech spectrum to assist hearing-impaired listeners or in time scaling of the speech signal for speech signal reproduction.
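The idea of shifting sinusoidal carriers and stretching envelopes can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function `resynthesize`, its parameters, and the use of linear interpolation to stretch the envelopes are not taken from the paper, which does not give these details in its abstract.

```python
import numpy as np

def resynthesize(envelopes, carrier_freqs, fs, pitch_factor=1.0, rate_factor=1.0):
    """Sum of sinusoidal carriers, each modulated by a narrow-band envelope.

    envelopes: array of shape (bands, samples), one temporal envelope per band.
    carrier_freqs: one carrier frequency in Hz per band (assumed band centers).
    pitch_factor: multiplies every carrier frequency (hypothetical parameter).
    rate_factor: values above 1 shorten the envelopes, i.e. faster speech.
    """
    envelopes = np.asarray(envelopes, dtype=float)
    n_in = envelopes.shape[1]
    n_out = int(round(n_in / rate_factor))
    src_pos = np.linspace(0, n_in - 1, n_out)  # where each output sample reads from
    t = np.arange(n_out) / fs

    out = np.zeros(n_out)
    for env, freq in zip(envelopes, carrier_freqs):
        stretched = np.interp(src_pos, np.arange(n_in), env)
        out += stretched * np.cos(2 * np.pi * freq * pitch_factor * t)
    return out
```

Because the carrier frequency and the envelope time axis are set independently, pitch and rate can be changed separately, which is the property the abstract relies on.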

International Symposium on Signal Processing and Information Technology | 2006

Speaker Verification Using Narrow-band Envelope Correlation Matrices

Satoru Gotoh; Michiko Kazama; Mikio Tohyama; Yoshio Yamasaki

We confirmed that a speaker's vocal individuality is contained in the inter-band correlations of narrow-band (1/4 or 1/8 octave bands) temporal envelopes. Two types of envelope correlation matrices (ECMs) were made for 53 speakers, using three utterances of an identical sentence (assuming a situation where a password for verification was stolen) so that any differences in the spoken contents might not greatly influence their individuality. Type-A (reference) ECMs of two of the utterances were constructed to make a speaker's individual template, and a type-B ECM was constructed using the other utterance. Speaker matching tests between the two types of ECMs, based on Gaussian mixture model (GMM) matching scores, verified the validity of the individual speakers. In particular, a speaker's voice could be verified using spoken materials through the telephone band (250 Hz - 3 kHz), a high frequency range (2 - 11.3 kHz), or a wide frequency range (250 Hz - 11.3 kHz).
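As a rough illustration of what an envelope correlation matrix is, the sketch below extracts one envelope per band and correlates them. It is an assumption-laden simplification: bands are isolated by masking FFT bins rather than by whatever filter bank the authors used, the helper names are hypothetical, and the GMM matching stage is not covered.

```python
import numpy as np

def fractional_octave_edges(f_lo, f_hi, bands_per_octave):
    """Contiguous (low, high) band edges, e.g. bands_per_octave=4 for 1/4 octave."""
    step = 2.0 ** (1.0 / bands_per_octave)
    edges = []
    lo = f_lo
    while lo * step <= f_hi * (1 + 1e-9):  # small tolerance for rounding error
        edges.append((lo, lo * step))
        lo *= step
    return edges

def narrow_band_envelopes(x, fs, band_edges):
    """One temporal envelope per band via FFT masking and the analytic signal."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    envelopes = []
    for lo, hi in band_edges:
        # Keep only positive-frequency bins inside the band; doubling them
        # gives the analytic signal of the band-limited component.
        mask = (freqs >= lo) & (freqs < hi)
        analytic = np.fft.ifft(2.0 * spectrum * mask)
        envelopes.append(np.abs(analytic))
    return np.array(envelopes)

def envelope_correlation_matrix(envelopes):
    """Pearson correlation between every pair of band envelopes."""
    return np.corrcoef(envelopes)
```

For an utterance `x` sampled at `fs`, `envelope_correlation_matrix(narrow_band_envelopes(x, fs, fractional_octave_edges(250, 3000, 4)))` would give a square matrix with one row and column per band.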

Journal of the Acoustical Society of America | 1999

Speech reconstruction from a noisy reception signal

Michiko Kazama; Mikio Tohyama; Akira Morita

Noise reduction is a fundamental issue for smart microphone systems and hearing aids. Noise reduction by spectral subtraction has been investigated for speech signals. However, identifying whether a frame is a speech or a silence portion is difficult under nonstationary noisy conditions when using this method. Extracting the desired speech based on the sinusoidal wave model [T. Quatieri and R. McAulay, IEEE ASSP 34, 1449–1464 (1986)] was investigated. It was confirmed that intelligible speech sound could be synthesized using only five dominant sinusoidal waves [M. Kazama et al., 5th ICSV 2079–2086 (1997)]. In this article, a new noise reduction method by extracting the dominant sinusoidal waves in each frame (32 ms) according to the energy ratio of the signal to noise was proposed. The signal‐to‐noise ratio was improved by 10 dB (S/N ratio) when the original S/N ratio was 0 dB. Speech quality could also be improved by reconstructing the higher harmonics from the noisy vowels using the frame‐dependent comb fi...
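The notion of keeping only a few dominant sinusoidal components per frame can be sketched as below. This is a simplified stand-in, not the proposed method: it picks the largest-magnitude FFT bins of a frame, whereas the abstract selects components according to a signal-to-noise energy ratio whose details are not given here. The function name and the default of five components (taken from the count mentioned in the abstract) are illustrative choices.

```python
import numpy as np

def keep_dominant_bins(frame, num_components=5):
    """Resynthesize a frame from its num_components largest-magnitude FFT bins.

    Assumption: 'dominant' means largest spectral magnitude in this frame.
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)
    keep = np.argsort(np.abs(spectrum))[-num_components:]  # indices of the peaks
    pruned = np.zeros_like(spectrum)
    pruned[keep] = spectrum[keep]
    return np.fft.irfft(pruned, n=len(frame))
```

At an assumed sampling rate of 8 kHz, a 32 ms frame is 256 samples, so each frame would be reduced from 129 spectral bins to a handful.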

Journal of The Audio Engineering Society | 2003