Keiichi Funaki
University of the Ryukyus
Publication
Featured research published by Keiichi Funaki.
conference of the industrial electronics society | 2006
Tatsuhiko Kinjo; Keiichi Funaki
In speech recognition, LPC cepstral coefficients (LPCCs) derived from LPC analysis and Mel-frequency cepstral coefficients (MFCCs) derived from a Mel-frequency filter bank are widely used as features, and the choice of feature extraction largely determines recognition performance. However, neither is regarded as the best possible feature extraction. In this paper, we introduce complex speech analysis of an analytic speech signal into HMM speech recognition. Complex speech analysis can estimate the speech spectrum more accurately at low frequencies; as a result, it is expected to perform well as a feature extractor for speech recognition. MMSE-based time-varying complex AR speech analysis is adopted, and the estimated complex parameters are converted to LPCCs and MFCCs that serve as feature vectors for HTK (the HMM Toolkit) in order to realize HMM speech recognition. Continuous speech recognition experiments with the converted LPCCs and MFCCs showed that the complex speech analysis method did not perform better than the real-valued one.
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2007
Keiichi Funaki; Tatsuhiko Kinjo
This paper proposes a novel robust fundamental frequency (F0) estimation algorithm based on complex-valued speech analysis of an analytic speech signal. Since an analytic signal has a spectrum only over positive frequencies, the spectrum can be estimated accurately at low frequencies. Consequently, F0 estimation using the residual signal extracted by complex-valued speech analysis is expected to perform better than F0 estimation using the residual signal extracted by conventional real-valued LPC analysis. In this paper, the autocorrelation function weighted by the AMDF is adopted as the F0 estimation criterion, and four signals are evaluated: the speech signal, the analytic speech signal, the LPC residual, and the complex LPC residual. The speech signals used in the experiments were IRS-filtered speech corrupted by additive white Gaussian noise or pink noise at levels of 10, 5, 0, and -5 dB. The experimental results demonstrate that the proposed algorithm based on the complex LPC residual performs better than the other methods in noisy environments.
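The AMDF-weighted autocorrelation criterion mentioned above can be illustrated with a minimal sketch. It assumes one common form of the weighting, R(tau) / (AMDF(tau) + k); whether the paper uses exactly this form is an assumption, and the function name, the constant `k`, the F0 search range, and the synthetic test frame are all hypothetical choices made for illustration.

```python
import math

def weighted_autocorr_f0(frame, fs, f0_min=60.0, f0_max=400.0, k=1.0):
    """Estimate F0 (Hz) of one frame by maximizing R(lag) / (AMDF(lag) + k).

    Assumed form of the AMDF weighting; k is a hypothetical stabilizing constant.
    """
    lag_min = int(fs / f0_max)
    lag_max = int(fs / f0_min)
    n = len(frame) - lag_max  # same number of terms is summed at every lag
    if n <= 0:
        raise ValueError("frame too short for the requested F0 range")
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation peaks at the pitch period.
        acf = sum(frame[i] * frame[i + lag] for i in range(n))
        # AMDF has a valley at the pitch period, so dividing by it sharpens the peak.
        amdf = sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
        score = acf / (amdf + k)
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

fs = 8000
# Synthetic, slowly decaying voiced-like frame: 200 Hz fundamental plus one harmonic.
frame = [
    math.exp(-t / 400.0)
    * (math.sin(2 * math.pi * 200 * t / fs) + 0.5 * math.sin(2 * math.pi * 400 * t / fs))
    for t in range(400)
]
f0 = weighted_autocorr_f0(frame, fs)
```

In the paper the same criterion is applied to four different inputs (speech, analytic speech, LPC residual, complex LPC residual); the sketch only covers a real-valued frame.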
Signal Processing | 1999
Keiichi Funaki; Yoshikazu Miyanaga; Koji Tochinai
A sophisticated new speech analysis method based on a Glottal-ARMAX (Auto Regressive and Moving Average eXogenous) model with phase compensation is proposed in this paper. A Glottal-ARMAX model consists of two kinds of inputs, i.e., a glottal source model excitation and a white Gaussian input, and a vocal tract ARMAX model. The proposed method can estimate the phase-compensated excitation of the glottal source model and the vocal tract parameters simultaneously and pitch-synchronously. In the method, ARMAX identification using a modified MIS (Model Identification System) method is adopted to estimate the ARMAX parameters, and a hybrid approach combining a genetic algorithm (GA) and simulated annealing (SA) is employed to efficiently solve the non-linear simultaneous optimization of the glottal source and the parameters of the vocal tract ARMAX model. Furthermore, in order to compensate for phase distortion, phase compensation using an all-pass filter is introduced within the generation loop of the GA. Experiments using Glottal-ARMAX synthetic speech and natural speech uttered by a male speaker demonstrate the efficiency of the proposed method.
Journal of the Acoustical Society of America | 1996
Keiichi Funaki; Kazunori Ozawa
A speech signal coding system for coding a speech signal at a bit rate of 4 to 8 kb/s, wherein the amount of calculation for the fractional search of delays of an adaptive codebook is reduced significantly. Before a fractional delay of the adaptive codebook is found, integer delay candidates are found by an open-loop search using correlation values. A closed-loop search for a fractional delay is then performed over a search range of ± several samples around each integer delay candidate thus found. The fractional delay search is realized by polyphase filtering of the past excitation signal. In the search, a plurality of fractional delay candidates may be found for each integer delay candidate from the adaptive codebook. In this instance, the fractional delay is finally determined from the fractional delay candidates after a search of an excitation codebook.
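The two-stage idea above (open-loop integer candidates, then a fractional search only near those candidates) can be sketched as follows. This is a simplified illustration under stated assumptions, not the coder described in the abstract: linear interpolation is swapped in for polyphase filtering, a plain normalized correlation stands in for the closed-loop error criterion, and all function names and parameters (`n_cand`, `span`, `steps`) are hypothetical.

```python
import math

def delayed(x, start, n, delay):
    """Return n samples of x starting at `start`, delayed by a possibly fractional
    `delay`. Linear interpolation is an assumed stand-in for polyphase filtering."""
    out = []
    for i in range(n):
        pos = start + i - delay
        j = math.floor(pos)
        a = pos - j
        out.append((1.0 - a) * x[j] + a * x[j + 1])
    return out

def score(x, start, n, delay):
    """Normalized correlation between the current segment and its delayed version."""
    target = x[start:start + n]
    cand = delayed(x, start, n, delay)
    num = sum(t * c for t, c in zip(target, cand))
    den = math.sqrt(sum(c * c for c in cand)) or 1.0
    return num / den

def two_stage_delay_search(x, start, n, d_min, d_max, n_cand=2, span=1, steps=4):
    # Stage 1: keep the n_cand integer delays with the highest correlation.
    ints = sorted(range(d_min, d_max + 1),
                  key=lambda d: score(x, start, n, d), reverse=True)[:n_cand]
    # Stage 2: evaluate fractional delays only within +/- span samples of each candidate.
    grid = {d + k / steps for d in ints for k in range(-span * steps, span * steps + 1)}
    grid = [d for d in grid if d_min <= d <= d_max]
    return max(grid, key=lambda d: score(x, start, n, d))

# Synthetic excitation with a non-integer period of 40.5 samples.
x = [math.sin(2 * math.pi * t / 40.5) for t in range(200)]
best = two_stage_delay_search(x, start=100, n=80, d_min=20, d_max=60)
```

The saving comes from stage 2: with the defaults, only 9 fractional positions per integer candidate are scored instead of a fractional grid over the whole delay range.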
international conference on acoustics, speech, and signal processing | 1997
Keiichi Funaki; Yoshikazu Miyanaga; Koji Tochinai
This paper presents a new speech analysis method based on a glottal-ARMAX (autoregressive and moving average exogenous) model with phase compensation. A glottal-ARMAX model consists of two kinds of inputs, a glottal source model excitation and a white Gaussian input, and a vocal tract ARMAX model. The proposed method can simultaneously estimate the glottal source model and vocal tract ARMAX model parameters pitch-synchronously. In this method, ARMAX identification using a modified MIS (model identification system) method is adopted to estimate the ARMAX parameters, and a hybrid approach combining a genetic algorithm (GA) and simulated annealing (SA) is employed to efficiently solve the non-linear simultaneous optimization of both sets of parameters. Furthermore, phase compensation using an all-pass filter is introduced within the generation loop of the GA in order to compensate for phase distortion. Experiments using synthetic speech and natural speech demonstrate the efficacy of the proposed method.
european signal processing conference | 2008
Keiichi Funaki
Applications of speech coding and speech recognition, such as cellular phones and in-car navigation systems, have recently been growing rapidly. Since these are commonly used in noisy environments, a noise reduction method, i.e., speech enhancement, is required as a pre-processor for speech coding and recognition. The iterative Wiener filter (IWF) method, which iteratively estimates the speech and noise power spectra using LPC analysis, has been adopted for speech enhancement. In this paper, we propose an improved Wiener filter algorithm that introduces complex LPC speech analysis in place of conventional LPC analysis. Complex speech analysis can estimate the spectrum more accurately at low frequencies, so it is expected to improve the IWF, especially for babble noise or car internal noise, which contain much energy at low frequencies. An objective evaluation by means of spectral distance has been performed for speech signals corrupted by white Gaussian noise, pink noise, babble noise, or car internal noise. The results demonstrate that the proposed method performs better than the conventional real-valued method for babble and car internal noise.
conference on computer as a tool | 2005
Kazuyuki Nakamura; Keiichi Funaki
As broadband IP networks have become common in homes and offices, the demand for IP telephony is steadily increasing. In IP telephony, packet loss occurs and seriously degrades speech quality. Hence, packet loss concealment (PLC) is required. The G.711 Appendix I PLC algorithm was recommended by ITU-T in 1999 and is widely used in IP telephony. Because the performance of the G.711 PLC scheme is not sufficient, we have already developed an improved algorithm using an LPC analysis and synthesis scheme. In the improved method, past speech is decomposed into LPC parameters and a residual signal; the lost residual signal is recovered by the G.711 PLC scheme, and the lost speech signal is recovered by synthesis with the repeated LPC parameters and the recovered residual signal. In this paper, a sinusoidal model is introduced to predict the lost residual signal more accurately. The estimation and prediction performance are evaluated by an objective measure, and the prediction is embedded in the improved G.711 PLC method.
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2008
Keiichi Funaki; Tatsuhiko Kinjo
Complex speech analysis of an analytic speech signal can accurately estimate the spectrum at low frequencies, since the analytic signal has a spectrum only over positive frequencies. This feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued speech analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by the AMDF was adopted as the criterion. That method adopted MMSE-based complex LPC analysis, and it has been reported that it estimates F0 more accurately for IRS-filtered speech corrupted by white Gaussian noise, although it does not work better for IRS-filtered speech corrupted by pink noise. In this paper, robust complex speech analysis based on the ELS (Extended Least Squares) method is introduced in order to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on robust ELS-based complex AR analysis performs better than the other methods.
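The property that an analytic signal has a spectrum only over positive frequencies can be illustrated with a short sketch. It builds the analytic signal by zeroing the negative-frequency bins of a discrete Fourier transform, which is a standard construction but not necessarily the one used in the paper; the function names and the naive O(N^2) transform are choices made here to keep the example dependency-free.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def analytic_signal(x):
    """Map an even-length real sequence to its complex analytic signal."""
    n = len(x)
    if n % 2:
        raise ValueError("this sketch assumes an even number of samples")
    spec = dft(x)
    # Keep DC and Nyquist, double the positive frequencies, zero the negative ones.
    weights = [1.0] + [2.0] * (n // 2 - 1) + [1.0] + [0.0] * (n // 2 - 1)
    return dft([s * w for s, w in zip(spec, weights)], inverse=True)

n = 64
x = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
z = analytic_signal(x)          # real part equals x, imaginary part is its Hilbert transform
negative_bins = dft(z)[n // 2 + 1:]  # all close to zero
```

For the cosine above, the analytic signal is the complex exponential at the same frequency, so its real part reproduces the input and its imaginary part is the corresponding sine.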
international symposium on circuits and systems | 2012
Keiichi Funaki; Takehito Higa
We have proposed F0 estimation based on time-varying complex AR (TV-CAR) speech analysis, in which F0 is estimated using a weighted autocorrelation function of the complex-valued residual signal calculated from the estimated time-varying complex-valued parameters for the analytic signal. Separately, Zero Frequency Resonance (ZFR) has been proposed, and it has been reported that ZFR can estimate F0 more accurately. ZFR applies Zero Frequency Filtering (ZFF) to the Hilbert envelope (HE) of the LP residual to emphasize the resonance at zero frequency. In this paper, ZFR based on TV-CAR speech analysis is proposed to estimate F0 more accurately. In the proposed method, the HE is calculated from the complex LP residual estimated with the complex parameters for the analytic signal. The ZFR signal is calculated from the HE and is then used in the weighted autocorrelation to estimate F0. We evaluated F0 estimation using the Keele Pitch database. The experimental results show that the LP residual-based ZFR method performs best.
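The zero frequency filtering step described above can be sketched in isolation. The sketch follows the commonly described recipe (difference the input, pass it through a cascade of two resonators with double poles at z = 1, then subtract a local mean to remove the polynomial trend); the window length of about 1.5 average pitch periods, the number of trend-removal passes, and all names are assumptions, and the Hilbert envelope and residual computation from the paper are not reproduced here.

```python
def zero_frequency_filter(signal, avg_period, passes=3):
    """Zero-frequency-filter a sequence; avg_period is the expected pitch period in samples."""
    # Difference the input to remove any constant offset.
    x = [signal[0]] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]
    # Cascade of two resonators, each y[n] = x[n] + 2*y[n-1] - y[n-2] (double pole at z = 1).
    for _ in range(2):
        y, y1, y2 = [], 0.0, 0.0
        for v in x:
            cur = v + 2.0 * y1 - y2
            y.append(cur)
            y1, y2 = cur, y1
        x = y
    # Remove the polynomial trend by repeatedly subtracting a local mean.
    half = max(1, int(0.75 * avg_period))  # window of about 1.5 periods (assumed)
    n = len(x)
    for _ in range(passes):
        detrended = []
        for i in range(n):
            lo, hi = max(0, i - half), min(n, i + half + 1)
            detrended.append(x[i] - sum(x[lo:hi]) / (hi - lo))
        x = detrended
    return x

def positive_zero_crossings(z):
    """Indices where the sequence goes from negative to non-negative."""
    return [i for i in range(1, len(z)) if z[i - 1] < 0 <= z[i]]

# Impulse train with a 50-sample period, standing in for an envelope with one peak per cycle.
train = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
zfr = zero_frequency_filter(train, avg_period=50)
# Samples near both ends are distorted by the truncated averaging windows, so skip them.
crossings = [i for i in positive_zero_crossings(zfr) if 200 <= i < 800]
gaps = [b - a for a, b in zip(crossings, crossings[1:])]
```

The filtered output is close to a sinusoid at the fundamental frequency, so the spacing of its positive-going zero crossings tracks the pitch period of the input.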
international conference on digital signal processing | 2011
Keiichi Funaki
Robust F0 (fundamental frequency) estimation plays an important role in speech processing. This paper proposes a simple F0 contour estimation algorithm based on robust TV-CAR speech analysis, in which the F0 contour is estimated by peak-picking on the time-varying spectrum estimated by ELS-based robust complex speech analysis of an analytic speech signal. The experimental results demonstrate that the proposed method yields more accurate continuous F0 estimation than conventional methods for high-pitched speech.