Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Abraham Ittycheriah.
north american chapter of the association for computational linguistics | 2003
Radu Florian; Abraham Ittycheriah; Hongyan Jing; Tong Zhang
This paper presents a classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based learning, and hidden Markov model) are combined under different conditions. When no gazetteer or other additional training resources are used, the combined system attains a performance of 91.6F on the English development data; integrating name, location and person gazetteers, and named entity systems trained on additional, more general, data reduces the F-measure error by a factor of 15 to 21% on the English data.
meeting of the association for computational linguistics | 2004
Xiaoqiang Luo; Abraham Ittycheriah; Hongyan Jing; Nanda Kambhatla; Salim Roukos
This paper proposes a new approach for coreference resolution which uses the Bell tree to represent the search space and casts the coreference resolution problem as finding the best path from the root of the Bell tree to the leaf nodes. A Maximum Entropy model is used to rank these paths. The coreference performance on the 2002 and 2003 Automatic Content Extraction (ACE) data will be reported. We also train a coreference system using the MUC6 data and competitive results are obtained.
empirical methods in natural language processing | 2005
Abraham Ittycheriah; Salim Roukos
This paper presents a maximum entropy word alignment algorithm for Arabic-English based on supervised training data. We demonstrate that it is feasible to create training material for problems in machine translation and that a mixture of supervised and unsupervised methods yields superior performance. The probabilistic model used in the alignment directly models the link decisions. Significant improvement over traditional word alignment techniques is shown as well as improvement on several machine translation tests. Performance of the algorithm is contrasted with human annotation performance.
Journal of the Acoustical Society of America | 2000
Abraham Ittycheriah; Stephane Herman Maes
Clusters of quantized feature vectors are processed against each other using a threshold distance value to cluster mean values of sets of parameters contained in speaker specific codebooks to form classes of speakers against which feature vectors computed from an arbitrary input speech signal can be compared to identify a speaker class. The number of codebooks considered in the comparison may be thus reduced to limit mixture elements which engender ambiguity and reduce system response speed when the speaker population becomes large. A speaker class processing model which is speaker independent within the class may be trained on one or more members of the class and selected for implementation in a speech recognition processor in accordance with the speaker class recognized to further improve speech recognition to level comparable to that of a speaker dependent model. Formation of speaker classes can be supervised by identification of groups of speakers to be included in the class and the speaker class dependent model trained on members of a respective group.
north american chapter of the association for computational linguistics | 2003
Jennifer Chu-Carroll; Krzysztof Czuba; John M. Prager; Abraham Ittycheriah
Motivated by the success of ensemble methods in machine learning and other areas of natural language processing, we developed a multi-strategy and multi-source approach to question answering which is based on combining the results from different answering agents searching for answers in multiple corpora. The answering agents adopt fundamentally different strategies, one utilizing primarily knowledge-based mechanisms and the other adopting statistical techniques. We present our multi-level answer resolution algorithm that combines results from the answering agents at the question, passage, and/or answer levels. Experiments evaluating the effectiveness of our answer resolution algorithm show a 35.0% relative improvement over our baseline system in the number of questions correctly answered, and a 32.8% improvement according to the average precision metric.
north american chapter of the association for computational linguistics | 2001
Abraham Ittycheriah; Martin Franz; Wei-Jing Zhu; Adwait Ratnaparkhi; Richard J. Mammone
We present a statistical question answering system developed for TREC-9 in detail. The system is an application of maximum entropy classification for question/answer type prediction and named entity marking. We describe our system for information retrieval which did document retrieval from a local encyclopedia, and then expanded the query words and finally did passage retrieval from the TREC collection. We will also discuss the answer selection algorithm which determines the best sentence given both the question and the occurrence of a phrase belonging to the answer class desired by the question. A new method of analyzing system performance via a transition matrix is shown.
north american chapter of the association for computational linguistics | 2003
Deepak Ravichandran; Abraham Ittycheriah; Salim Roukos
In this paper we investigate the use of surface text patterns for a Maximum Entropy based Question Answering (QA) system. These text patterns are collected automatically in an unsupervised fashion using a collection of trivia question and answer pairs as seeds. These patterns are used to generate features for a statistical question answering system. We report our results on the TREC-10 question set.
Journal of the Acoustical Society of America | 2000
Abraham Ittycheriah; Stephane Herman Maes
A method and an apparatus are provided for performing speech recognition on speech segments frequently input by a user. The method and the apparatus include use of keyword scoring in connection with a speech recognition vocabulary, a temporary score, and a predetermined margin to determine an appropriate output as being representative of the input speech segment.
empirical methods in natural language processing | 2003
Hongyan Jing; Radu Florian; Xiaoqiang Luo; Tong Zhang; Abraham Ittycheriah
When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters or words. While there is no unique answer to this question, we discuss in detail advantages and disadvantages of each model, identify problems in segmentation and suggest possible solutions, presenting our observations, analysis, and experimental results. The second topic of this paper is classifier combination. We present and describe four classifiers for Chinese named entity recognition and describe various methods for combining their outputs. The results demonstrate that classifier combination is an effective technique of improving system performance: experiments over a large annotated corpus of fine-grained entity types exhibit a 10% relative reduction in F-measure error.
north american chapter of the association for computational linguistics | 2003
Abraham Ittycheriah; Lucian Vlad Lita; Nanda Kambhatla; Nicolas Nicolov; Salim Roukos; Margo Stys
We present a system for identifying and tracking named, nominal, and pronominal mentions of entities within a text document. Our maximum entropy model for mention detection combines two pre-existing named entity taggers (built to extract different entity categories) and other syntactic and morphological feature streams to achieve competitive performance. We developed a novel maximum entropy model for tracking all mentions of an entity within a document. We participated in the Automatic Content Extraction (ACE) evaluation and performed well. We describe our system and present results of the ACE evaluation.