ChiWei Che
Sarnoff Corporation
Publications
Featured research published by ChiWei Che.
International Conference on Spoken Language Processing | 1996
Qiguang Lin; Ea-Ee Jan; ChiWei Che; Dong-Suk Yuk; James L. Flanagan
The paper describes two separate sets of speaker identification experiments. In the first set of experiments, the speech spectrum is selectively used for speaker identification. The results show that the higher portion of the speech spectrum contains more reliable idiosyncratic information on speakers than does the lower portion of equal bandwidth. In the second set of experiments, vector-quantization based Gaussian mixture models (VQGMMs) are developed for text-independent speaker identification. The system has been evaluated in the recent speaker identification evaluation organized by NIST. Details of the system design are given and the evaluation results are presented.
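As an illustration of the second set of experiments, here is a minimal sketch of text-independent speaker identification with per-speaker Gaussian mixture models, in Python with scikit-learn. It uses plain GMMs rather than the paper's VQ-initialized variant, and all names, dimensions, and mixture sizes are illustrative assumptions.

# Minimal sketch: per-speaker GMM scoring for text-independent speaker
# identification (plain GMMs, not the paper's exact VQGMM formulation).
import numpy as np
from sklearn.mixture import GaussianMixture

def train_speaker_models(features_by_speaker, n_components=8):
    """Fit one GMM per enrolled speaker on that speaker's cepstral frames."""
    models = {}
    for speaker, feats in features_by_speaker.items():  # feats: (n_frames, n_cepstra)
        gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
        gmm.fit(feats)
        models[speaker] = gmm
    return models

def identify(models, test_feats):
    """Return the enrolled speaker whose model best explains the test frames."""
    scores = {spk: gmm.score(test_feats) for spk, gmm in models.items()}
    return max(scores, key=scores.get)  # highest average log-likelihood wins

Band-limiting the features to the upper portion of the spectrum, as in the first set of experiments, would only change how the cepstral features are computed; the scoring machinery stays the same.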
International Conference on Acoustics, Speech and Signal Processing | 1999
Prabhu Raghavan; ChiWei Che; Dong-Suk Yuk; James L. Flanagan
Performance of automatic speech recognition systems trained on close-talking data suffers when used in a distant-talking environment due to the mismatch in the training and testing conditions. Microphone array sound capture can reduce some mismatch by removing ambient noise and reverberation but offers insufficient improvement in performance. However, using array signal capture in conjunction with a hidden Markov model (HMM) adaptation on the clean-speech models can result in improved recognition accuracy. This paper describes an experiment in which the output of an 8-element microphone array system using MFA processing is used for speech recognition with LT-MLLR adaptation. The recognition is done in two passes. In the first pass, an HMM trained on clean data is used to recognize the speech. Using the results of this pass, the HMM model is adapted to the environment using the LT-MLLR algorithm. This adapted model, a product of MFA and LT-MLLR, results in improved recognition performance.
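The two-pass adapt-and-redecode loop can be sketched as follows. This is a least-squares stand-in for an MLLR-style mean transform (the published algorithm maximizes likelihood, weighting statistics by Gaussian occupancy and covariance), and frame_stats is a hypothetical input summarizing the first-pass alignment.

# Sketch of a global MLLR-style mean transform. means[g] is the clean-model
# mean of Gaussian g; frame_stats[g] is the average adaptation-data frame
# aligned to Gaussian g during the first recognition pass.
import numpy as np

def estimate_transform(means, frame_stats):
    G, d = means.shape
    xi = np.hstack([np.ones((G, 1)), means])              # extended means [1, mu]
    W, *_ = np.linalg.lstsq(xi, frame_stats, rcond=None)  # solve xi @ W ~ frame_stats
    return W                                              # shape (d+1, d)

def adapt_means(means, W):
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    return xi @ W                                         # environment-adapted means

# Two-pass outline: decode with the clean HMM, align frames to Gaussians,
# estimate W from that alignment, replace the means, and decode again.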
IEEE Automatic Speech Recognition and Understanding Workshop | 1997
Dong-Suk Yuk; ChiWei Che; James L. Flanagan
The laboratory performance of well-trained speech recognizers usually degrades when they are used in real-world environments. Robust speech recognition is therefore an important issue for the successful application of speech recognizers. Neural network based transformation methods are studied to compensate for the mismatched conditions of training and testing. First, a feature transformation neural network is studied. Second, a maximum likelihood neural network is applied to model transformations. The advantage of the neural network based transformation methods is that retraining the speech recognizer for each particular environment is avoided. Furthermore, because the multilayer neural network can compute nonlinear functions, the neural network based transformation methods can establish nonlinear mapping functions between training and testing environments without specific knowledge of the distortion or the mismatched environments.
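A minimal sketch of the feature-transformation idea, assuming time-aligned "stereo" pairs of distorted and clean cepstra are available for training; the layer sizes and the 13-dimensional feature assumption are illustrative, and PyTorch stands in for whatever network machinery was actually used.

# Learn a nonlinear map from distorted (e.g., distant-talking) cepstra to
# clean cepstra, so the recognizer itself never has to be retrained.
import torch
import torch.nn as nn

d = 13  # cepstral dimension (assumed)
net = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, d))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(noisy, clean):
    """noisy, clean: (batch, d) time-aligned feature pairs from the two channels."""
    opt.zero_grad()
    loss = loss_fn(net(noisy), clean)
    loss.backward()
    opt.step()
    return loss.item()

# At test time, feed net(noisy_features) to the unmodified clean-speech recognizer.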
Journal of the Acoustical Society of America | 1998
Dong-Suk Yuk; ChiWei Che; Prabhu Raghavan; Samir Chennoukh; James L. Flanagan
In a large-vocabulary continuous speech recognition system, high-level linguistic knowledge can enhance performance. However, integrating high-level linguistic knowledge and complex acoustic models under an efficient search scheme remains problematic. Higher-order n-grams are so computationally expensive, especially when the vocabulary is large, that real-time processing is not yet possible. In this report, the n-best breadth search algorithm is proposed within the framework of state-space search; it can handle higher-order n-grams and complex subword acoustic models such as cross-word triphones. The n-best breadth search is a combination of best-first search and breadth-first search. The proposed algorithm can be extended to handle other types of language models, such as the stochastic context-free grammar, and different types of acoustic models, including neural networks. Compared with the conventional beam-search method, this pilot experiment shows that the proposed algo...
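The combination of best-first and breadth-first search can be pictured with the following sketch, where expand_fn is a hypothetical stand-in for the acoustic and language-model scoring; only the pruning structure is meant to match the description, not any implementation detail of the paper.

# n-best breadth search sketch: advance every surviving hypothesis one frame
# at a time (breadth first), then keep only the n best scorers (best first).
import heapq

def n_best_breadth_search(initial_hyp, num_frames, expand_fn, n=10):
    """expand_fn(hyp, t) yields (successor_hyp, log_prob_increment) pairs."""
    beam = [(0.0, initial_hyp)]
    for t in range(num_frames):
        candidates = []
        for score, hyp in beam:
            for succ, delta in expand_fn(hyp, t):
                candidates.append((score + delta, succ))
        beam = heapq.nlargest(n, candidates, key=lambda c: c[0])  # prune to n best
    return max(beam, key=lambda c: c[0])  # best complete hypothesis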
Journal of the Acoustical Society of America | 1998
Prabhu Raghavan; ChiWei Che; Samir Chennoukh; Dong-Suk Yuk; James L. Flanagan
Performance of automatic speech recognition systems trained on close-talking data suffers when the systems are used in a distant-talking environment due to the mismatch in training and testing conditions. Microphone array sound capture can remove some of the mismatch by removing ambient noise and reverberation, resulting in an approximation to a clean speech signal. However, this often does not improve the performance sufficiently. Nevertheless, using array signal capture in conjunction with hidden Markov model (HMM) adaptation on the clean-speech models can result in high recognition accuracy. This paper describes an experiment in which the output of an eight-element microphone array system using MFA processing is used for speech recognition with LT-MLLR adaptation. The recognition is done in two passes. In the first pass, an HMM trained on clean data is used to recognize the speech. Using the results of this pass, the HMM model is adapted to the environment using the LT-MLLR algorithm. This adapted model is then...
Journal of the Acoustical Society of America | 1993
Qiguang Lin; James L. Flanagan; ChiWei Che
In this paper, the acoustics and synthesis of nasalization are studied using a comprehensive computer model of the vocal tract: TRACTTALK. TRACTTALK simulates the vocal-tract system in the frequency domain and derives the time-domain equivalent to produce sound output. It incorporates all important components of the system and decomposes the transfer function into its zero and pole parts. Such a decomposition enables one to accurately estimate the poles and zeros of a nasalized sound. First, temporal trajectories of poles and zeros are examined as a function of the velopharyngeal opening, the presence of the nasal sinuses, and other articulatory parameters. This follows up on previous work [Flanagan, AT&T Bell Labs. internal report (1983)]. The aim is to systematically characterize the pole/zero pattern of nasalization to improve the performance of formant tracking and feature labeling algorithms. Secondly, synthesis of nasalization is described using TRACTTALK. Listening experiments are conducted to ass...
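The effect of the pole/zero decomposition can be illustrated numerically: a nasal side branch adds a pole/zero pair to an otherwise all-pole vowel spectrum. The sketch below evaluates such a transfer function with SciPy; the formant frequencies, bandwidths, and the nasal pole/zero placement are invented for illustration and are not TRACTTALK output.

# Magnitude spectrum of a pole/zero vocal-tract transfer function.
import numpy as np
from scipy.signal import freqs_zpk

def conj_pair(f_hz, bw_hz):
    """Complex-conjugate s-plane pole/zero pair for a resonance at f_hz."""
    s = -np.pi * bw_hz + 2j * np.pi * f_hz
    return [s, np.conj(s)]

poles = conj_pair(500, 80) + conj_pair(1500, 90) + conj_pair(2500, 120)  # oral formants
poles += conj_pair(1000, 100)   # extra pole introduced by the nasal tract
zeros = conj_pair(900, 100)     # nasal zero (anti-resonance)

w = 2 * np.pi * np.linspace(50.0, 4000.0, 512)   # evaluation grid, rad/s
_, h = freqs_zpk(zeros, poles, 1.0, worN=w)
spectrum_db = 20 * np.log10(np.abs(h) + 1e-12)   # dB magnitude vs frequency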
International Conference on Acoustics, Speech and Signal Processing | 1996
J. Pearson; Qiguang Lin; ChiWei Che; D.-S. Yuk; L. Jin; B. deVries; James L. Flanagan
Archive | 1996
Dong-Suk Yuk; Qiguang Lin; ChiWei Che; Li-jie Jin; James L. Flanagan
Conference of the International Speech Communication Association | 1994
Qiguang Lin; Ea-Ee Jan; ChiWei Che; Bert de Vries
North American Chapter of the Association for Computational Linguistics | 1994
ChiWei Che; Qiguang Lin; John C. Pearson; Bert de Vries; James L. Flanagan