Ea-Ee Jan
IBM
Publications
Featured research published by Ea-Ee Jan.
IEEE Automatic Speech Recognition and Understanding Workshop | 1997
Mukund Padmanabhan; Ea-Ee Jan; Lalit R. Bahl; Michael Picheny
Presents a decision-tree based procedure to quantize the feature space of a speech recognizer, with the motivation of reducing the computation time required for evaluating Gaussians in a speech recognition system. The entire feature space is quantized into non-overlapping regions, where each region is bounded by a number of hyperplanes. Each region is characterized by the occurrence of only a small number of the total alphabet of allophones (sub-phonetic speech units). By identifying the region in which a test feature vector lies, only the Gaussians that model the density of allophones that exist in that region need be evaluated. The quantization of the feature space is done in a hierarchical manner using a binary decision tree. Each node of the decision tree represents a region of the feature space, and is further characterized by a hyperplane (a vector v_n and a scalar threshold value h_n) that subdivides the region corresponding to the current node into two non-overlapping regions corresponding to the two children of the current node. Given a test feature vector, the process of finding the region that it lies in involves traversing this binary decision tree, which is computationally inexpensive. We present results of experiments that show that the Gaussian computation time can be reduced by as much as a factor of 20 with negligible degradation in accuracy. We also examine issues of robustness to different environments.
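The tree traversal described in the abstract can be sketched as follows. The node layout, hyperplanes, and allophone lists are hypothetical toy values; only the mechanism (compare v_n · x against threshold h_n at each internal node, descend to a child, and evaluate only the Gaussians listed at the resulting leaf region) follows the paper's description.

```python
import numpy as np

def find_region(x, nodes, root=0):
    """Traverse a binary hyperplane decision tree to find the leaf
    region that contains feature vector x. Each internal node n holds
    a hyperplane (v_n, h_n); go left if v_n . x <= h_n, else right."""
    n = root
    while not nodes[n]["leaf"]:
        v, h = nodes[n]["v"], nodes[n]["h"]
        n = nodes[n]["left"] if np.dot(v, x) <= h else nodes[n]["right"]
    return n

# Toy 2-D tree: the root splits on x[0] <= 0. Each leaf lists the
# (hypothetical) allophones whose Gaussians occur in that region;
# only these Gaussians would be evaluated for a vector landing there.
nodes = {
    0: {"leaf": False, "v": np.array([1.0, 0.0]), "h": 0.0,
        "left": 1, "right": 2},
    1: {"leaf": True, "allophones": ["AA_1", "AO_2"]},
    2: {"leaf": True, "allophones": ["IY_1", "IH_3"]},
}

region = find_region(np.array([-0.5, 1.2]), nodes)
print(nodes[region]["allophones"])
```

The lookup cost is one dot product per tree level, which is why the traversal is computationally cheap relative to evaluating the full set of Gaussians.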
International Conference on Acoustics, Speech and Signal Processing | 1999
Yuqing Gao; Ea-Ee Jan; Mukund Padmanabhan; Michael Picheny
Two discriminant measures for HMM states to improve the effectiveness of HMM training are presented. In HMM-based speech recognition, the context-dependent states are usually modeled by Gaussian mixture distributions. In general, the number of Gaussian mixtures for each state is fixed or proportional to the amount of training data. From our study, some of the states are non-aggressive compared to others, and a higher acoustic resolution is required for them. Two methods are presented in this paper to determine those non-aggressive states. The first approach uses the recognition accuracy of the states, and the second method is based on a rank distribution of the states. Baseline systems, trained with a fixed number of Gaussian mixtures for each state and having 33 K and 120 K Gaussians, yield 14.57% and 13.04% word error rates, respectively. Using our approach, a 38 K Gaussian system was constructed that reduces the error rate to 13.95%. The average ranks of non-aggressive states in the rank lists of the test data were also seen to improve dramatically compared to the baseline systems.
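The allocation idea can be illustrated with a minimal sketch. The state names, accuracy values, threshold, and boost factor are all hypothetical, and only the first criterion (per-state recognition accuracy) is shown; the paper's rank-distribution criterion and its actual mixture-growing procedure are not reproduced here.

```python
def select_weak_states(state_accuracy, threshold=0.85):
    """Flag states whose per-state recognition accuracy falls below a
    threshold; these are the candidates for higher acoustic resolution."""
    return {s for s, acc in state_accuracy.items() if acc < threshold}

def allocate_gaussians(base_mixtures, weak_states, boost=4):
    """Give flagged states more Gaussian components instead of using a
    uniform mixture size for every state."""
    return {s: n * boost if s in weak_states else n
            for s, n in base_mixtures.items()}

accuracy = {"s1": 0.95, "s2": 0.60, "s3": 0.88}   # hypothetical values
base = {"s1": 8, "s2": 8, "s3": 8}                 # uniform baseline
weak = select_weak_states(accuracy)
print(allocate_gaussians(base, weak))
```

The point of the sketch is the non-uniform budget: total model size grows only where the chosen discriminant measure says extra resolution is needed, mirroring how a 38 K system can outperform a uniform 33 K baseline.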
International Symposium on Chinese Spoken Language Processing | 2010
Ea-Ee Jan; Niyu Ge; Shih-Hsiang Lin; Salim Roukos; Jeffrey S. Sorensen
Proper name transliteration, the pronunciation-based translation of a proper name, is important to many multilingual natural language processing tasks, such as Statistical Machine Translation (SMT) and Cross-Lingual Information Retrieval (CLIR). The task is extremely challenging due to pronunciation differences between the source and target languages: a given proper name can lead to many different transliterations. Past research efforts have demonstrated a 30–50% error rate using the top-1 reference for transliteration, and this error degrades the performance of many applications. In this paper, a novel approach to verifying a given proper name transliteration pair using a discrete-variant Hidden Markov Model (HMM) alignment is proposed. The state emission probabilities are derived from SMT phrase tables. The proposed method yields an Equal Error Rate (EER) of 3.73% on a test set of 300 matched and 1000 unmatched name pairs. By comparison, the commonly used SMT framework yields 6.5% EER under its best configuration, and the widely used edit distance approach has an EER of 22%. Our new method achieves high accuracy with low complexity, and provides an alternative for name transliteration in CLIR and other cross-lingual natural language applications such as word alignment and machine translation.
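The evaluation metric used above, Equal Error Rate, can be computed from any scorer's outputs on matched and unmatched pairs; the sketch below shows only that computation, not the paper's HMM-alignment scorer, and the score lists are hypothetical.

```python
def equal_error_rate(match_scores, nonmatch_scores):
    """Sweep a decision threshold over the observed scores and return
    the operating point where the false-reject rate on matched pairs
    is closest to the false-accept rate on unmatched pairs (the EER)."""
    best = None
    for t in sorted(set(match_scores) | set(nonmatch_scores)):
        frr = sum(s < t for s in match_scores) / len(match_scores)
        far = sum(s >= t for s in nonmatch_scores) / len(nonmatch_scores)
        if best is None or abs(frr - far) < best[0]:
            best = (abs(frr - far), (frr + far) / 2)
    return best[1]

# Hypothetical verification scores: higher = more likely a true pair.
matched = [0.9, 0.8, 0.7, 0.6]
unmatched = [0.1, 0.2, 0.3, 0.95]
print(equal_error_rate(matched, unmatched))
```

On a real test set such as the 300 matched / 1000 unmatched pairs described above, the same sweep would report a single scalar that summarizes the verifier's accuracy independently of any fixed threshold.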
Conference of the International Speech Communication Association | 1999
Kenneth Davies; Robert E. Donovan; Mark E. Epstein; Martin Franz; Abraham Ittycheriah; Ea-Ee Jan; Jean-Michel LeRoux; David Lubensky; Chalapathy Neti; Mukund Padmanabhan; Kishore Papineni; Salim Roukos; Andrej Sakrajda; Jeffrey S. Sorensen; Borivoj Tydlitát; Todd Ward
Conference of the International Speech Communication Association | 2001
George Saon; Juan M. Huerta; Ea-Ee Jan
Conference of the International Speech Communication Association | 2009
Ruhi Sarikaya; Sameer Maskey; R. Zhang; Ea-Ee Jan; Dagen Wang; Bhuvana Ramabhadran; Salim Roukos
Conference of the International Speech Communication Association | 2006
Osamuyimen Stewart; Juan M. Huerta; Ea-Ee Jan; Cheng Wu; Xiang Li; David Lubensky
Conference of the International Speech Communication Association | 2009
Ea-Ee Jan; Hong-Kwang Kuo; Osamuyimen Stewart; David Lubensky
Conference of the International Speech Communication Association | 2009
Juan M. Huerta; Cheng Wu; Andrej Sakrajda; Sasha P. Caskey; Ea-Ee Jan; Alexander Faisman; Shai Ben-David; Wen Liu; Antonio Lee; Osamuyimen Stewart; Michael Frissora; David Lubensky
Conference of the International Speech Communication Association | 2008
Ea-Ee Jan; Osamuyimen Stewart; Raymond L. Co; David Lubensky