Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Limin Yao is active.

Publication


Featured research published by Limin Yao.


european conference on machine learning | 2010

Modeling relations and their mentions without labeled text

Sebastian Riedel; Limin Yao; Andrew McCallum

Several recent works on relation extraction have applied the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as a source of supervision. Crucially, these approaches are trained on the assumption that each sentence mentioning the two related entities is an expression of the given relation. Here we argue that this assumption leads to noisy patterns that hurt precision, in particular when the knowledge base is not directly related to the text at hand. We present a novel approach to distant supervision that can alleviate this problem, based on two ideas: first, we use a factor graph to explicitly model both the decision whether two entities are related and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus, using Freebase as the knowledge base. Compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve a 31% error reduction.
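
A minimal Python sketch of the central aggregation idea, hedged: the paper's factor graph is not reproduced here; instead a noisy-OR stand-in captures "the relation holds if at least one sentence expresses it". The per-mention scores are hypothetical outputs of some upstream sentence-level classifier, not anything defined in the paper.

```python
import math
from dataclasses import dataclass

@dataclass
class Mention:
    sentence: str
    score: float  # hypothetical classifier score that this sentence expresses the relation

def relation_probability(mentions: list[Mention]) -> float:
    """Noisy-OR stand-in for the at-least-once assumption:
    P(relation) = 1 - prod_i (1 - sigmoid(score_i))."""
    p_none = 1.0
    for m in mentions:
        p_i = 1.0 / (1.0 + math.exp(-m.score))  # per-sentence expression probability
        p_none *= 1.0 - p_i
    return 1.0 - p_none

# One confident mention dominates several uncertain ones.
pair = [Mention("A was born in B.", 2.5), Mention("A visited B.", -1.0)]
print(f"{relation_probability(pair):.3f}")
```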


knowledge discovery and data mining | 2009

Efficient methods for topic model inference on streaming document collections

Limin Yao; David M. Mimno; Andrew McCallum

Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model to a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model. In this paper, we empirically evaluate the performance of several methods for topic inference on previously unseen documents, including methods based on Gibbs sampling, variational inference, and a new method inspired by text classification. The classification-based inference method produces results similar to iterative inference methods but requires only a single matrix multiplication. In addition to these inference methods, we present SparseLDA, an algorithm and data structure for evaluating Gibbs sampling distributions. Empirical results indicate that SparseLDA can be approximately 20 times faster than traditional LDA and provides twice the speedup of previously published fast sampling methods, while also using substantially less memory.
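
The single-matrix-multiplication inference mentioned above fits in a few lines. A hedged sketch, assuming a pre-trained word-topic weight matrix W of shape (V, K); how W is fit from the training corpus is outside this sketch, and the softmax normalization is an illustrative choice:

```python
import numpy as np

def infer_topic_distribution(word_counts: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Map a bag-of-words count vector (V,) to a topic distribution (K,)."""
    scores = word_counts @ W             # the single matrix multiplication
    exp = np.exp(scores - scores.max())  # softmax to get a valid distribution
    return exp / exp.sum()

# Tiny usage example with random stand-in weights.
V, K = 5000, 100
rng = np.random.default_rng(0)
W = rng.normal(size=(V, K))
doc = np.zeros(V)
doc[[10, 42, 99]] = [3, 1, 2]            # a document with three word types
print(infer_topic_distribution(doc, W).shape)  # (100,)
```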


conference on information and knowledge management | 2013

Universal schema for entity type prediction

Limin Yao; Sebastian Riedel; Andrew McCallum

Categorizing entities by their types is useful in many applications, including knowledge base construction, relation extraction, and query intent prediction. Fine-grained entity type ontologies are especially valuable, but typically difficult to design because of unavoidable quandaries about level of detail and boundary cases. Automatically classifying entities by type is challenging as well, usually involving hand-labeled data and a supervised predictor. This paper presents a universal schema approach to fine-grained entity type prediction. The set of types is taken as the union of textual surface patterns (e.g. appositives) and pre-defined types from available databases (e.g. Freebase), yielding not tens or hundreds of types but more than ten thousand entity types, such as financier, criminologist, and musical trio. We robustly learn mutual implication among this large union by learning latent vector embeddings with probabilistic matrix factorization, thus avoiding the need for hand-labeled data. Experimental results demonstrate more than a 30% reduction in error versus a traditional classification approach on predicting fine-grained entity types.
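
A hedged sketch of the matrix factorization view: entities and types (surface patterns and database types alike) get latent vectors, and an (entity, type) cell is scored by their dot product. The dimensions and the ranking-style update below are illustrative assumptions, not the paper's exact probabilistic matrix factorization objective:

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_types, dim = 1_000, 12_000, 50   # illustrative sizes
E = rng.normal(scale=0.1, size=(n_entities, dim))  # entity embeddings
T = rng.normal(scale=0.1, size=(n_types, dim))     # type embeddings

def score(entity: int, etype: int) -> float:
    """Higher means the type (e.g. 'financier') is more plausible for the entity."""
    return float(E[entity] @ T[etype])

def ranking_update(entity: int, pos: int, neg: int, lr: float = 0.05) -> None:
    """Push an observed (entity, type) cell above a sampled unobserved one
    (a BPR-style update standing in for the paper's objective)."""
    g = 1.0 / (1.0 + np.exp(score(entity, pos) - score(entity, neg)))
    e_old = E[entity].copy()
    E[entity] += lr * g * (T[pos] - T[neg])
    T[pos] += lr * g * e_old
    T[neg] -= lr * g * e_old
```

Because surface patterns and KB types live in the same embedding space, the learned vectors encode the mutual implications among them that the abstract describes.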


north american chapter of the association for computational linguistics | 2013

Relation Extraction with Matrix Factorization and Universal Schemas

Sebastian Riedel; Limin Yao; Andrew McCallum; Benjamin M. Marlin


empirical methods in natural language processing | 2011

Structured Relation Discovery using Generative Models

Limin Yao; Aria Haghighi; Sebastian Riedel; Andrew McCallum


empirical methods in natural language processing | 2010

Collective Cross-Document Relation Extraction Without Labelled Data

Limin Yao; Sebastian Riedel; Andrew McCallum


meeting of the association for computational linguistics | 2012

Unsupervised Relation Discovery with Sense Disambiguation

Limin Yao; Sebastian Riedel; Andrew McCallum


north american chapter of the association for computational linguistics | 2012

Probabilistic Databases of Universal Schema

Limin Yao; Sebastian Riedel; Andrew McCallum


north american chapter of the association for computational linguistics | 2010

Constraint-Driven Rank-Based Learning for Information Extraction

Sameer Singh; Limin Yao; Sebastian Riedel; Andrew McCallum

Collaboration


Dive into Limin Yao's collaborations.

Top Co-Authors

Andrew McCallum, University of Massachusetts Amherst
Sameer Singh, University of Washington
Aria Haghighi, University of California
Benjamin M. Marlin, University of Massachusetts Amherst
Alexandre Passos, University of Massachusetts Amherst
Ari Kobren, University of Massachusetts Amherst
Brian Martin, University of Massachusetts Amherst
David Belanger, University of Massachusetts Amherst