Alexander Yates
Temple University
Publications
Featured research published by Alexander Yates.
International Joint Conference on Natural Language Processing | 2009
Fei Huang; Alexander Yates
Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks. Consequently, they have difficulty estimating parameters for types which appear in the test set, but seldom (or never) appear in the training set. We demonstrate that distributional representations of word types, trained on unannotated text, can be used to improve performance on rare words. We incorporate aspects of these representations into the feature space of our sequence-labeling systems. In an experiment on a standard chunking dataset, our best technique improves a chunker from 0.76 F1 to 0.86 F1 on chunks beginning with rare words. On the same dataset, it improves our part-of-speech tagger from 74% to 80% accuracy on rare words. Furthermore, our system improves significantly over a baseline system when applied to text from a different domain, and it reduces the sample complexity of sequence labeling.
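As an illustration of the general idea, and not the authors' implementation, the Python sketch below augments a token's feature map with a distributional cluster identifier learned from unlabeled text; cluster_of and freq are hypothetical lookups, and the rare-word cutoff is a made-up choice.

# A minimal sketch (not the paper's code) of adding distributional cluster
# features so that rare words share evidence with frequent words in the
# same cluster.

def token_features(tokens, i, cluster_of, freq):
    """Features for token i; cluster_of and freq are hypothetical lookups
    built from unannotated text."""
    word = tokens[i]
    feats = {
        "word=" + word.lower(): 1.0,
        "suffix3=" + word[-3:]: 1.0,
        "is_capitalized": float(word[:1].isupper()),
    }
    # Distributional representation: back off to the word's cluster, which is
    # shared with many other (possibly frequent) words.
    cluster = cluster_of.get(word.lower(), "<UNK>")
    feats["cluster=" + cluster] = 1.0
    # Optionally drop the lexical feature for rare words to reduce sparsity
    # (the threshold of 2 is arbitrary).
    if freq.get(word.lower(), 0) < 2:
        feats.pop("word=" + word.lower(), None)
    return feats

# Toy usage: the rare word "rendezvous" inherits the cluster of "meeting".
cluster_of = {"meeting": "C17", "rendezvous": "C17"}
freq = {"meeting": 120, "rendezvous": 1}
print(token_features(["a", "rendezvous", "today"], 1, cluster_of, freq))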
Journal of Artificial Intelligence Research | 2009
Alexander Yates; Oren Etzioni
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully-implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called RESOLVER, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, RESOLVER resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of RESOLVER's probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic RESOLVER system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus.
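The core intuition, that two strings occurring in many of the same extracted assertions are likely co-referential, can be sketched as follows; this is a toy similarity score, not RESOLVER's probabilistic model, and the inverse-frequency weighting is an assumption.

# A hedged sketch of assertion-overlap similarity for synonym resolution.

from collections import defaultdict
from math import log

def build_property_index(extractions):
    """extractions: iterable of (arg1, relation, arg2) triples."""
    props = defaultdict(set)
    for a1, rel, a2 in extractions:
        props[a1].add(("subj", rel, a2))
        props[a2].add(("obj", rel, a1))
    return props

def coref_score(s1, s2, props):
    """Toy score: shared properties, weighted so rare properties count more."""
    df = defaultdict(int)
    for p_set in props.values():
        for p in p_set:
            df[p] += 1
    n = len(props)
    shared = props[s1] & props[s2]
    return sum(log(n / df[p]) for p in shared)

extractions = [
    ("Chicago", "is located in", "Illinois"),
    ("Windy City", "is located in", "Illinois"),
    ("Chicago", "hosts", "Cubs"),
]
props = build_property_index(extractions)
print(coref_score("Chicago", "Windy City", props))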
Conference on Information and Knowledge Management | 2013
Avirup Sil; Alexander Yates
Recognizing names and linking them to structured data is a fundamental task in text analysis. Existing approaches typically perform these two steps using a pipeline architecture: they use a Named-Entity Recognition (NER) system to find the boundaries of mentions in text, and an Entity Linking (EL) system to connect the mentions to entries in structured or semi-structured repositories like Wikipedia. However, the two tasks are tightly coupled, and each type of system can benefit significantly from the kind of information provided by the other. We present a joint model for NER and EL, called NEREL, that takes a large set of candidate mentions from typical NER systems and a large set of candidate entity links from EL systems, and ranks the candidate mention-entity pairs together to make joint predictions. In NER and EL experiments across three datasets, NEREL significantly outperforms or comes close to the performance of two state-of-the-art NER systems, and it outperforms 6 competing EL systems. On the benchmark MSNBC dataset, NEREL provides a 60% reduction in error over the next-best NER system and a 68% reduction in error over the next-best EL system.
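A minimal sketch of joint decoding over candidate (mention, entity) pairs appears below; it is not NEREL's model, and the equal weighting of NER confidence and link prior is an illustrative assumption.

# A hypothetical sketch of ranking mention-entity pairs jointly and picking a
# non-overlapping set of mentions greedily by the joint score.

def score(pair):
    (start, end), entity, ner_conf, link_prior = pair
    # Joint score mixes NER confidence with the entity's link prior;
    # the equal weights are illustrative, not learned parameters.
    return 0.5 * ner_conf + 0.5 * link_prior

def decode(candidates):
    """candidates: list of ((start, end), entity, ner_conf, link_prior)."""
    chosen, used = [], set()
    for cand in sorted(candidates, key=score, reverse=True):
        span = range(cand[0][0], cand[0][1])
        if not used.intersection(span):
            chosen.append(cand)
            used.update(span)
    return chosen

candidates = [
    ((0, 2), "Michael_Jordan_(basketball)", 0.9, 0.8),
    ((0, 2), "Michael_I._Jordan", 0.9, 0.3),
    ((1, 2), "Jordan_(country)", 0.4, 0.6),
]
print(decode(candidates))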
Computational Linguistics | 2014
Fei Huang; Arun Ahuja; Doug Downey; Yi Yang; Yuhong Guo; Alexander Yates
Finding the right representations for words is critical for building accurate NLP systems when domain-specific labeled data for the task is scarce. This article investigates novel techniques for extracting features from n-gram models, Hidden Markov Models, and other statistical language models, including a novel Partial Lattice Markov Random Field model. Experiments on part-of-speech tagging and information extraction, among other tasks, indicate that features taken from statistical language models, in combination with more traditional features, outperform traditional representations alone, and that graphical model representations outperform n-gram models, especially on sparse and polysemous words.
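One concrete way a statistical language model can yield word features, illustrated here with a tiny hand-set HMM and Viterbi decoding rather than the article's actual models, is to use each token's most likely hidden state as an extra categorical feature.

# A hedged sketch: the HMM parameters below are made up for illustration.

import math

states = ["S0", "S1"]
start = {"S0": 0.6, "S1": 0.4}
trans = {"S0": {"S0": 0.7, "S1": 0.3}, "S1": {"S0": 0.4, "S1": 0.6}}
emit = {
    "S0": {"the": 0.5, "dog": 0.2, "barks": 0.3},
    "S1": {"the": 0.1, "dog": 0.5, "barks": 0.4},
}

def viterbi(tokens):
    """Return the most likely hidden-state sequence for tokens."""
    V = [{s: math.log(start[s]) + math.log(emit[s].get(tokens[0], 1e-6))
          for s in states}]
    back = []
    for t in range(1, len(tokens)):
        V.append({})
        back.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t - 1][p] + math.log(trans[p][s]))
            back[-1][s] = best_prev
            V[t][s] = (V[t - 1][best_prev] + math.log(trans[best_prev][s])
                       + math.log(emit[s].get(tokens[t], 1e-6)))
    path = [max(states, key=lambda s: V[-1][s])]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))

tokens = ["the", "dog", "barks"]
# Each token's decoded state becomes an extra feature, e.g. "hmm_state=S1".
print(list(zip(tokens, viterbi(tokens))))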
Empirical Methods in Natural Language Processing | 2009
Prakash Srinivasan; Alexander Yates
It is well known that pragmatic knowledge is useful and necessary in many difficult language processing tasks, but because this knowledge is difficult to acquire and process automatically, it is rarely used. We present an open information extraction technique for automatically extracting a particular kind of pragmatic knowledge from text, and we show how to integrate the knowledge into a Markov Logic Network model for quantifier scope disambiguation. Our model improves quantifier scope judgments in experiments.
BMC Bioinformatics | 2009
Weisi Duan; Min Song; Alexander Yates
Background: We aim to solve the problem of determining word senses for ambiguous biomedical terms with minimal human effort. Methods: We build a fully automated Word Sense Disambiguation system that does not require manually-constructed external resources or manually-labeled training examples, except for a single ambiguous word. The system uses a novel and efficient graph-based algorithm to cluster words into groups that have the same meaning. Our algorithm follows the principle of finding a maximum margin between clusters, determining a split of the data that maximizes the minimum distance between pairs of data points belonging to two different clusters. Results: On a test set of 21 ambiguous keywords from PubMed abstracts, our system has an average accuracy of 78%, outperforming a state-of-the-art unsupervised system by 2% and a baseline technique by 23%. On a standard data set from the National Library of Medicine, our system outperforms the baseline by 6% and comes within 5% of the accuracy of a supervised system. Conclusion: Our system is a novel, state-of-the-art technique for efficiently finding word sense clusters, and does not require training data or human effort for each new word to be disambiguated.
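The maximum-margin split principle can be illustrated with a small sketch that is not the paper's algorithm: cutting the largest edge of a minimum spanning tree produces the two-way split that maximizes the minimum distance between points in different clusters.

# A hedged illustration of the max-margin split principle on toy 2-D points.

import math

def mst_edges(points):
    """Prim's algorithm on a complete Euclidean graph; returns (dist, i, j) edges."""
    n = len(points)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree:
                    d = math.dist(points[i], points[j])
                    if best is None or d < best[0]:
                        best = (d, i, j)
        edges.append(best)
        in_tree.add(best[2])
    return edges

def max_margin_split(points):
    edges = mst_edges(points)
    cut = max(edges)                      # largest MST edge = the margin
    keep = [e for e in edges if e != cut]
    # Flood-fill one side of the cut to recover the two clusters.
    adj = {i: set() for i in range(len(points))}
    for _, i, j in keep:
        adj[i].add(j)
        adj[j].add(i)
    cluster, stack = {cut[1]}, [cut[1]]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in cluster:
                cluster.add(nb)
                stack.append(nb)
    return cluster, set(range(len(points))) - cluster

points = [(0, 0), (0.2, 0.1), (5, 5), (5.1, 4.9)]
print(max_margin_split(points))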
Conference of the European Chapter of the Association for Computational Linguistics | 2014
Fei Huang; Alexander Yates
Linguistic Code Switching (LCS) is a situation in which two or more languages appear within a single conversation. For example, in English-Chinese code switching, a sentence might read "我们15分钟后有meeting" (We will have a meeting in 15 minutes). Traditional machine translation (MT) systems treat LCS data as noise, or simply as regular sentences. However, if LCS data is processed intelligently, it can provide a useful signal for training word alignment and MT models. Moreover, LCS data comes from non-news sources, which can enhance the diversity of training data for MT. In this paper, we first extract constraints from this code-switching data and then incorporate them into a word alignment model training procedure. We also show that by using the code-switching data, we can jointly train a word alignment model and a language model using co-training. Our techniques for incorporating LCS data improve BLEU by 2.64 points over a baseline MT system trained using only standard sentence-aligned corpora.
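As a rough sketch of the kind of constraint such data can supply, and not the paper's training procedure, the snippet below pins an English token embedded in a code-switched sentence to the identical token on the English side; the tokenization and the example sentences are assumptions.

# A hedged sketch of deriving forced alignment links from code-switched data.

def alignment_constraints(cs_tokens, en_tokens):
    """Return forced (source_index, target_index) links for tokens that appear
    verbatim on both sides of a code-switched / English sentence pair."""
    constraints = []
    for i, tok in enumerate(cs_tokens):
        if tok.isascii() and tok.isalpha():        # embedded English word
            for j, en in enumerate(en_tokens):
                if en.lower() == tok.lower():
                    constraints.append((i, j))
                    break
    return constraints

cs = ["我们", "15", "分钟", "后", "有", "meeting"]
en = ["We", "will", "have", "a", "meeting", "in", "15", "minutes"]
print(alignment_constraints(cs, en))   # pins "meeting" to "meeting"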
IEEE Computer | 2009
Alexander Yates
By automatically extracting information from the Web, we can scale up the resulting knowledge bases to much greater sizes than current collections of manually gathered and user-contributed knowledge.
Conference on Information and Knowledge Management | 2013
Doug Downey; Chandra Bhagavatula; Alexander Yates
Web Information Extraction (WIE) systems extract billions of unique facts, but integrating the assertions into a coherent knowledge base and evaluating across different WIE techniques remains a challenge. We propose a framework that utilizes natural language to integrate and evaluate extracted knowledge bases (KBs). In the framework, KBs are integrated by exchanging probability distributions over natural language, and evaluated by how well the output distributions predict held-out text. We describe the advantages of the approach, and detail remaining research challenges.
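The evaluation idea, comparing knowledge bases by how well distributions over natural language predict held-out text, can be illustrated with a deliberately crude sketch; the verbalization of triples into tokens and the smoothed unigram model below are toy assumptions, not the framework's actual machinery.

# A hedged toy illustration: map a KB to a token distribution, then score
# held-out text by average log-likelihood under that distribution.

import math
from collections import Counter

def kb_to_token_dist(kb_triples, smoothing=1.0):
    """Verbalize each (subj, rel, obj) triple and build a smoothed unigram model."""
    tokens = []
    for s, r, o in kb_triples:
        tokens += s.lower().split() + r.replace("_", " ").split() + o.lower().split()
    counts = Counter(tokens)
    vocab = set(counts)
    total = sum(counts.values()) + smoothing * len(vocab)
    dist = {w: (counts[w] + smoothing) / total for w in vocab}
    unk_prob = smoothing / total
    return dist, unk_prob

def held_out_loglik(dist, unk_prob, text):
    words = text.lower().split()
    return sum(math.log(dist.get(w, unk_prob)) for w in words) / len(words)

kb = [("Chicago", "is_located_in", "Illinois"), ("Chicago", "hosts", "the Cubs")]
dist, unk = kb_to_token_dist(kb)
print(held_out_loglik(dist, unk, "Chicago is located in Illinois"))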
International Conference on Social Computing | 2010
Harish Sethu; Alexander Yates
A representation of the World Wide Web as a directed graph, with vertices representing web pages and edges representing hypertext links, underpins the algorithms used by web search engines today. However, this representation involves a key oversimplification of the true complexity of the Web: an edge in the traditional Web graph represents only the existence of a hyperlink; information on the context (e.g., informational, adversarial, commercial, spam) behind the hyperlink is absent. In this work-in-progress paper, we describe an ongoing collaborative project between two teams, one specializing in network science and analysis and the other specializing in text analysis and machine learning, to address this oversimplification. Using techniques in natural language processing, text mining and machine learning to extract relevant features of hyperlinks and classify them into one of several types, this undertaking builds and analyzes a multi-relational web graph. A key aspect of this work is that the multi-relational graph emerges naturally from the data instead of being based on an imposed classification of the hyperlinks.
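As a rough illustration of the data structure this project builds, and not its actual classifier or graph code, the sketch below stores hyperlinks as typed, directed edges; the keyword-based classify_link function is a stand-in assumption for the NLP and machine learning classifier described above.

# A hedged sketch of a multi-relational web graph with typed hyperlink edges.

from collections import defaultdict

def classify_link(anchor_text):
    """Placeholder link-type classifier; real systems would use learned models."""
    text = anchor_text.lower()
    if any(w in text for w in ("buy", "deal", "discount")):
        return "commercial"
    if any(w in text for w in ("free pills", "casino")):
        return "spam"
    return "informational"

class MultiRelationalWebGraph:
    def __init__(self):
        # edges[source][link_type] -> set of target pages
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_link(self, src, dst, anchor_text):
        self.edges[src][classify_link(anchor_text)].add(dst)

g = MultiRelationalWebGraph()
g.add_link("example.com/review", "shop.example.com", "buy now at a discount")
g.add_link("example.com/review", "en.wikipedia.org/wiki/Camera", "camera sensor sizes")
print({src: dict(types) for src, types in g.edges.items()})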