K.L. Brown
University of California, Davis
Publications
Featured research published by K.L. Brown.
IEEE Transactions on Speech and Audio Processing | 1993
V.R. Algazi; K.L. Brown; M.J. Ready; D.H. Irvine; Christie L. Cadwell; S. Chung
An approach to modeling and capturing the time-varying structure of the spectral envelope of speech is reported. Acoustic subword decomposition and the Karhunen-Loeve transform (KLT) are used to extract and efficiently represent the highly correlated structure of the spectral envelope. Integration of the KLT with acoustic subword modeling provides concise representation of both steady-state and dynamic features of the spectra in a unified framework that very effectively captures acoustic-phonetic patterns. The physiological and perceptual basis for the approach, the frame-based and acoustic-subword-based spectral representation, and applications to speaker-dependent recognition are presented. The performance of the recognition algorithm based on this approach compares favorably with that of other techniques.
International Conference on Acoustics, Speech, and Signal Processing | 1989
K.L. Brown; V.R. Algazi
A mathematical model has been developed for tracking spectral transitions within the spectral envelope of a speech signal. This technique incorporates linguistic knowledge into a mathematical framework to determine time-varying acoustic-phonetic features and describe formant transitions. The proposed model is quite robust and is capable of extracting not only rapid spectral movement, but also smoother spectral transitions that occur in vowel and sonorant sequences. This basic approach has been previously used to extract steady-state acoustic-phonetic features across spectrally homogeneous regions and to perform speaker-dependent recognition, with quite successful results in clean as well as noisy speech. It has now been augmented to capture the dynamics of spectral acoustic-phonetic features.
International Conference on Acoustics, Speech, and Signal Processing | 1989
V.R. Algazi; S. Chung; M.J. Ready; K.L. Brown
The authors propose a novel approach to the modeling and estimation of the speech spectral envelope over acoustic subwords that exhibits robust performance in noise. The technique exploits the underlying signal structure of speech to improve parameter estimates, and it uses the perceptual properties of hearing to decrease the computational requirements in a perceptually meaningful way. The approach provides a considerable speech quality improvement over other methods.
International Conference on Acoustics, Speech, and Signal Processing | 1988
V.R. Algazi; K.L. Brown
The authors have developed a very successful new approach to automatic speech recognition which incorporates speech knowledge into a mathematical framework and does not require a computationally intensive time alignment/dynamic programming scheme. They transform the speech signal into the spectral domain, segment it into sub-word units and, in turn, perform an additional transformation in the spectral domain to capture the spectral structure within each sub-word unit. The system was shown to perform robustly in hand-segmented whole-word digit recognition in clean as well as noisy speech. They have now augmented the system with an automatic acoustic sub-word segmentation routine and tested the performance of this integrated system with the TI isolated-word database and the confusable E-set.
International Conference on Acoustics, Speech, and Signal Processing | 1991
K.L. Brown; V.R. Algazi
A novel approach for speech signal analysis has been developed that incorporates both steady-state and dynamic spectral features into a unified model. This model has been successfully applied in automatic speech recognition contexts and does not require frame-based optimal search algorithms. The model decomposes an utterance into a chain of acoustic subwords and simultaneously generates a mathematical description of instantaneous acoustic-phonetic features and dynamic transitions. The algorithm was tested using a speaker-dependent, limited-vocabulary recognition task and achieved higher recognition rates than both vector quantization and hidden Markov models.
International Conference on Acoustics, Speech, and Signal Processing | 1996
Martha Birnbaum; K.L. Brown; Steven Bardenhagen
We have developed a new approach for speaker identification (SID) that employs fenonic speaker Markov modeling (FSMM). The FSMM is a hidden Markov model whose parameters capture and describe intra-speaker spectral dynamics. FSMM technology has been successfully applied in isolated word recognition and speaker adaptation applications. In this new application of FSMM technology to speaker identification, we have obtained an identification accuracy of 96.9% on the King database using 16 talkers. The results suggest that the FSMM provides a promising model for capturing and identifying speaker-specific characteristics.
IEEE Transactions on Speech and Audio Processing | 1993
V.R. Algazi; K.L. Brown; M.J. Ready; D.H. Irvine; Christie L. Cadwell; S. Chung
For Part I see ibid., vol.1, no.2, p.180-95 (1993). In Part I of this paper, the authors introduced an approach to the representation of the speech spectral envelope which makes use of the Karhunen-Loeve (KL) transformation of acoustic subword segments. This signal-dependent representation captures, with a few KL vectors and transform coefficients, the perceptually and phonetically important structure of the spectral envelope. Here the authors apply this representation to the analysis, synthesis, and coding of speech. They propose simple quantization and coding strategies for the KL representation vectors as well as for the resulting transform coefficients. The resulting technique is a variable rate encoding scheme which achieves good speech quality at an average rate of 3.5 kb/s.
International Conference on Acoustics, Speech, and Signal Processing | 1992
V.R. Algazi; D.H. Irvine; C. Caldwell; M.J. Ready; K.L. Brown; S. Chung
The authors have developed a signal-dependent representation which captures, with a few KL vectors and transform coefficients, the perceptually and phonetically important structure of the spectral envelope. Together with a mixed excitation strategy with some novel features, this representation has been applied to the analysis, synthesis and coding of speech with promising results in the 5-kb/s range.
International Conference on Acoustics, Speech, and Signal Processing | 1985
K.L. Brown; V. Algazi
International Conference on Acoustics, Speech, and Signal Processing | 1989
K.L. Brown; V. Ralph Algazi