Carl D. Mitchell
Purdue University
Publication
Featured research published by Carl D. Mitchell.
IEEE Transactions on Speech and Audio Processing | 1995
Carl D. Mitchell; Mary P. Harper; Leah H. Jamieson
Introduces a new recursion that reduces the complexity of training a semi-Markov model with continuous output distributions. It is shown that the cost of training is proportional to M^2 + D, compared to M^2 D with the standard recursion, where M is the observation vector length and D is the maximum allowed duration.
International Conference on Acoustics, Speech, and Signal Processing | 1993
Goangshiuan S. Ying; Carl D. Mitchell; Leah H. Jamieson
The authors have developed an energy measure based on Teager's energy algorithm and have applied it to the problem of endpoint detection. This measure is important because it appears to be more suitable for describing the source energy associated with the production of speech sounds than the acoustic energy typically measured, and because it explores a new way of viewing and using Teager's energy algorithm. Experiments were conducted on 400 utterances on which endpoint detection was expected to be difficult. Typical examples are presented showing that the new measure is more effective than traditional measures in capturing speech events such as initial and final fricatives and plosives. Whereas traditional endpoint detectors have used both (acoustic) energy and zero crossing rate, the new measure effectively combines this information into a single measure. The experimental results demonstrate that the measure can be used to improve the performance of endpoint detection algorithms and should be effective for the detection of speech in noisy environments.
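As a rough illustration of the idea, the sketch below applies the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and thresholds its per-frame mean. It is not the authors' energy measure, which the abstract does not specify; the function names, frame length, and threshold are assumptions chosen for the example.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_means(values, frame_len):
    """Mean of each non-overlapping frame of length frame_len."""
    return [sum(values[i:i + frame_len]) / frame_len
            for i in range(0, len(values) - frame_len + 1, frame_len)]


def detect_endpoints(x, frame_len=80, threshold=0.01):
    """Return (first, last) active frame indices, or None if no frame is active.

    frame_len and threshold are hypothetical values, not taken from the paper.
    """
    energies = frame_means(teager_energy(x), frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]


# Silence, a sinusoidal burst, then silence again.
signal = [0.0] * 400 + [math.sin(0.5 * n) for n in range(800)] + [0.0] * 400
endpoints = detect_endpoints(signal)
```

For a pure sinusoid A*cos(w*n + phi) the operator returns A^2 * sin^2(w), so it responds to both amplitude and frequency. This is one way to read the abstract's remark that a single measure can stand in for energy plus zero crossing rate.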
International Conference on Acoustics, Speech, and Signal Processing | 1993
Carl D. Mitchell; Leah H. Jamieson
A procedure for modeling duration with some PDF (probability density function) or PMF (probability mass function) in the exponential family is presented. A means of selecting an appropriate member of the exponential family is suggested. The parameter estimation procedure presented here offers several advantages over other methods of duration modeling. First, the duration PMF can be found directly, rather than by sampling and truncating the optimum density. Secondly, the optimum duration parameters are found from Ferguson's nonparametric PMF. This simplifies reestimation because the operation of casting a nonparametric PMF to the desired parametric family can be completely separated from the forward and backward algorithms. Thirdly, several competing members of the exponential family can be evaluated quickly for each state in the HMM. This makes it possible to model each state's duration with the best member from a set of parametric PMFs in the exponential family. Finally, the solution holds for any PDF or PMF in the exponential family, which includes a large number of promising candidates.
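The step of casting a nonparametric duration PMF to a parametric family can be illustrated with a small sketch. It is not the authors' procedure. It assumes two candidate families (a geometric and a shifted Poisson, both on durations 1, 2, ...), fits each by matching the mean duration, and keeps the one with the higher expected log-likelihood under the nonparametric PMF. It also ignores any maximum-duration truncation and assumes a mean duration greater than 1. All names are hypothetical.

```python
import math


def mean_duration(pmf):
    """Mean of a nonparametric PMF given as {duration: probability}."""
    return sum(d * p for d, p in pmf.items())


def fit_geometric(pmf):
    """Geometric on d = 1, 2, ...; matching the mean gives p = 1 / mean."""
    p = 1.0 / mean_duration(pmf)
    return lambda d: (1.0 - p) ** (d - 1) * p


def fit_shifted_poisson(pmf):
    """d - 1 ~ Poisson(lam); matching the mean gives lam = mean - 1."""
    lam = mean_duration(pmf) - 1.0
    return lambda d: math.exp(-lam) * lam ** (d - 1) / math.factorial(d - 1)


def expected_log_likelihood(pmf, q):
    """Sum of p(d) * log q(d); larger means q is closer to pmf in KL terms."""
    return sum(p * math.log(q(d)) for d, p in pmf.items() if p > 0)


def best_family(pmf):
    """Name of the candidate family that best matches the nonparametric PMF."""
    fits = {"geometric": fit_geometric(pmf),
            "shifted_poisson": fit_shifted_poisson(pmf)}
    return max(fits, key=lambda name: expected_log_likelihood(pmf, fits[name]))


peaked = {3: 0.1, 4: 0.2, 5: 0.4, 6: 0.2, 7: 0.1}
decaying = {1: 0.5, 2: 0.25, 3: 0.125, 4: 0.0625, 5: 0.0625}
```

Because the fit only needs the nonparametric PMF, it can run once per state after the forward and backward passes, which is the separation the abstract describes.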
Journal of the Acoustical Society of America | 1994
Goangshiuan S. Ying; Leah H. Jamieson; Carl D. Mitchell
Global error correction routines play an important part in pitch detection algorithms (PDAs). Raw pitch period estimates are often incorrect. A good error correction routine can significantly improve the overall set of pitch estimates for an utterance. A simple and straightforward error correction routine is proposed. The pitch period markers are selected by a two‐stage probabilistic postprocessor. A time‐domain PDA, the AMDF (average magnitude difference function), generates the markers (candidates for the pitch period) for each frame of the utterance. An initial pitch period estimate for each frame is produced, but all of the markers are also saved for possible later use. Using these initial estimates, a probability distribution of the pitch period is calculated across the utterance. This probability distribution is used to adjust the weights of the markers in each frame. Using these new weights, a new pitch period estimate is calculated for each frame. This process is performed twice, each time using the di...
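The marker-and-reweighting idea can be sketched as follows. The AMDF definition is standard. The marker weights, the histogram used as the utterance-level probability distribution, and the `floor` smoothing term are assumptions made for this example, since the abstract does not give those details. Frames must be longer than `max_lag`.

```python
from collections import Counter


def amdf(frame, min_lag, max_lag):
    """Average magnitude difference for each lag in [min_lag, max_lag]."""
    values = {}
    for lag in range(min_lag, max_lag + 1):
        n = len(frame) - lag
        values[lag] = sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
    return values


def markers(frame, min_lag, max_lag, keep=3):
    """Keep the lags with the smallest AMDF as (lag, weight) candidates.

    The weight 1 - d/max(d) is a hypothetical choice, not from the paper.
    """
    d = amdf(frame, min_lag, max_lag)
    worst = max(d.values()) or 1.0
    best = sorted(d, key=d.get)[:keep]
    return [(lag, 1.0 - d[lag] / worst) for lag in best]


def reestimate(frames_markers, floor=0.01):
    """One reweighting pass: scale each marker by how common its lag is
    among the initial per-frame estimates, then pick the best marker again."""
    initial = [max(m, key=lambda p: p[1])[0] for m in frames_markers]
    hist = Counter(initial)
    total = len(initial)
    return [max(m, key=lambda p: p[1] * (hist[p[0]] / total + floor))[0]
            for m in frames_markers]


# Sawtooth with a period of 50 samples.
frame = [(n % 50) / 50 for n in range(400)]
candidates = markers(frame, 20, 120)
```

In this sketch a frame whose best marker sits at twice the true period is pulled back toward the lag that dominates the rest of the utterance, because the saved markers still include the correct candidate.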
International Conference on Acoustics, Speech, and Signal Processing | 1995
Carl D. Mitchell; Mary P. Harper; Leah H. Jamieson
We show that many of the errors in a context-dependent phone recognition system are due to poor segmentation. We then suggest a method to incorporate explicit segmentation information directly into the HMM paradigm. The utility of explicit segmentation information is illustrated with experiments involving five types of segmentation information and three methods of smoothing.
International Parallel and Distributed Processing Symposium | 1993
Carl D. Mitchell; Randall A. Helzerman; Leah H. Jamieson; Mary P. Harper
This paper describes a parallel implementation of a Hidden Markov Model (HMM) for spoken language recognition on the MasPar MP-1. By exploiting the massive parallelism of explicit duration HMMs, we can develop more complex models for real-time speech recognition. Implementation issues such as choice of data structures, method of communication, and utilization of parallel functions are explored. The results of our experiments show that the parallelism in HMMs can be effectively exploited by the MP-1. Training that used to take nearly a week can now be completed in about an hour. The system can recognize the phones of a test utterance in a fraction of a second.
International Conference on Acoustics, Speech, and Signal Processing | 1999
Carl D. Mitchell; Anand Rangaswamy Setlur
This paper addresses the problem of selecting a name from a very large list using spelling recognition. In order to greatly reduce the computational resources required, we propose a tree-based lexical fast match scheme to select a short list of candidate names. Our system consists of a free letter recognizer, a fast matcher, and a rescoring stage. The letter recognizer uses n-grams to generate an n-best list of letter hypotheses. The fast matcher is a tree that is based on confusion classes, where a confusion class is a group of acoustically similar letters such as the e-set. The fast matcher reduces over 100,000 unique last names to tens or hundreds of candidates. Then the rescoring stage picks the best name using either letter alignment or a constrained grammar. The fast matcher retained the correct name 99.6% of the time and the system retrieved the correct name 97.6% of the time.
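A tree keyed on confusion classes can be sketched in a few lines. This is an illustration under stated assumptions, not the system described above. Only the e-set is named in the abstract, so the other classes in `CLASSES` are hypothetical. The lookup requires an exact class-by-class match, with no handling of inserted or deleted letters, and the rescoring stage is omitted.

```python
# The e-set is the usual group of letters rhyming with "E".
# The remaining classes are hypothetical, chosen only for this example.
CLASSES = ["BCDEGPTVZ", "AJK", "MN", "FSX", "IY", "QU"]


def class_of(letter):
    """Index of the letter's confusion class, or the letter itself if it
    belongs to no class."""
    for index, members in enumerate(CLASSES):
        if letter in members:
            return index
    return letter


def build_tree(names):
    """Nested dicts keyed by confusion class; names are stored under None."""
    root = {}
    for name in names:
        node = root
        for letter in name.upper():
            node = node.setdefault(class_of(letter), {})
        node.setdefault(None, []).append(name)
    return root


def fast_match(tree, letters):
    """Names whose class sequence equals that of the recognized letters."""
    node = tree
    for letter in letters.upper():
        node = node.get(class_of(letter))
        if node is None:
            return []
    return node.get(None, [])


tree = build_tree(["PETE", "BEDE", "MANN", "SMITH"])
shortlist = fast_match(tree, "TEBE")  # every letter here is in the e-set
```

Names that differ only by letters within one class share a leaf, so the matcher returns a short list rather than a single answer. A later stage then has to choose among them.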
International Conference on Acoustics, Speech, and Signal Processing | 1996
Carl D. Mitchell; Mary P. Harper; Leah H. Jamieson
Hybrids that use a neural network to estimate the output probability for a hidden Markov model (HMM) word recognizer have been competitive with traditional HMM recognizers when both use monophone context. While traditional HMM recognizers can easily utilize more context (e.g., triphones) to achieve better results, the size of the task has made it impractical to use phonetic context directly in the neural network front end of a hybrid. In this paper, we suggest a simple method to incorporate more context by modeling the phone distributions obtained from the neural network. This allows the HMM to easily handle stochastic pronunciations as well as errors from the neural network phone recognizer. The re-estimation equations are derived for the new model. Results for the Resource Management task illustrate that SOHMM increases recognition accuracy for the cases of no grammar, unigram grammar, and word pair grammar.
Digital Signal Processing | 1995
Carl D. Mitchell; Mary P. Harper; Leah H. Jamieson; Randall A. Helzerman
IEEE Transactions on Speech and Audio Processing | 1994
Carl D. Mitchell; Mary P. Harper; Leah H. Jamieson