Xiaofeng Yu

The Chinese University of Hong Kong

Publication

Featured research published by Xiaofeng Yu.

North American Chapter of the Association for Computational Linguistics | 2007

Chinese Named Entity Recognition with Cascaded Hybrid Model

Xiaofeng Yu

We propose a high-performance cascaded hybrid model for Chinese NER. First, we use Boosting, a standard and theoretically well-founded machine learning method, to combine a set of weak classifiers into a base system. Second, we introduce various types of heuristic human knowledge into Markov Logic Networks (MLNs), an effective combination of first-order logic and probabilistic graphical models, to validate the Boosting NER hypotheses. Experimental results show that the cascaded hybrid model significantly outperforms the state-of-the-art Boosting model.

Conference on Information and Knowledge Management | 2009

An integrated discriminative probabilistic approach to information extraction

Xiaofeng Yu; Wai Lam; Bo Chen

Probabilistic graphical models for sequence data enable us to deal effectively with the inherent uncertainty in many real-world domains. However, they operate on a mostly propositional level. Logic approaches, on the other hand, can compactly represent a wide variety of knowledge, especially first-order knowledge, but treat uncertainty only in limited ways. Combining probability and first-order logic is therefore highly desirable for information extraction, which requires uncertainty modeling as well as dependency and deeper knowledge representation. In this paper, we simultaneously model both segmentations in the observation sequence and relations between segments in our proposed integrated discriminative probabilistic framework. We propose using Metropolis-Hastings, a Markov chain Monte Carlo (MCMC) algorithm for approximate Bayesian inference, to find the maximum a posteriori assignment of all the variables of this model. This integrated model has several advantages over previous probabilistic graphical models: it is well suited to extracting implicit relations and discovering new relations in relation extraction from encyclopedic documents, and to capturing sub-structures in named entities for named entity recognition. We performed extensive experiments on these two well-established information extraction tasks, illustrating the feasibility and promise of our approach.
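As a rough illustration of how Metropolis-Hastings can be used to search for a maximum a posteriori assignment, the sketch below runs the sampler on a small toy labeling model and keeps the best-scoring state it visits. It is not the model from the paper: the scoring function, the `EVIDENCE` values, the single-label-flip proposal, and the names `mh_map`, `log_score`, and `flip_one` are all assumptions made for this example.

```python
import math
import random

def mh_map(log_score, init, propose, steps=5000, seed=0):
    """Metropolis-Hastings with a symmetric proposal; returns the best state seen."""
    rng = random.Random(seed)
    state = init
    cur = log_score(state)
    best, best_score = state, cur
    for _ in range(steps):
        cand = propose(state, rng)
        new = log_score(cand)
        # Accept with probability min(1, p(cand) / p(state)).
        if new >= cur or rng.random() < math.exp(new - cur):
            state, cur = cand, new
            if cur > best_score:
                best, best_score = state, cur
    return best, best_score

# Hypothetical toy model: label 6 tokens as 0 or 1. The unnormalized
# log-probability rewards agreement with per-token evidence and
# agreement between neighbouring labels.
EVIDENCE = [1.5, 1.0, -0.2, 1.2, -1.0, -1.5]

def log_score(labels):
    unary = sum(e if y == 1 else -e for e, y in zip(EVIDENCE, labels))
    pairwise = sum(0.8 for a, b in zip(labels, labels[1:]) if a == b)
    return unary + pairwise

def flip_one(labels, rng):
    """Symmetric proposal: flip the label at one uniformly chosen position."""
    i = rng.randrange(len(labels))
    return labels[:i] + (1 - labels[i],) + labels[i + 1:]

best, score = mh_map(log_score, (0,) * 6, flip_one)
print(best, round(score, 2))
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of unnormalized probabilities, so the normalizing constant is never needed. Tracking the best state seen turns the sampler into a stochastic search for the MAP assignment; on a model this small the result can be checked against exhaustive enumeration.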

Knowledge and Information Systems | 2012

Probabilistic joint models incorporating logic and learning via structured variational approximation for information extraction

Xiaofeng Yu; Wai Lam

Traditional information extraction systems for compound tasks adopt pipeline architectures, which are highly ineffective and suffer from several problems, such as the cascading accumulation of errors. In this paper, we propose a joint discriminative probabilistic framework to optimize all relevant subtasks simultaneously. This framework offers great flexibility to combine the advantages of uncertainty modeling for sequences and first-order logic for domain knowledge. The first-order logic model provides a more expressive formalism that addresses the limited expressiveness of traditional attribute-value representations. Our framework defines a joint probability distribution, in the form of an exponential family, over both segmentations in sequence data and possible worlds of relations between segments. Since exact parameter estimation and inference are intractable in this model, a structured variational inference algorithm is developed to perform parameter estimation approximately. For inference, we propose a highly coupled, bi-directional Metropolis-Hastings (MH) algorithm to find the maximum a posteriori (MAP) assignments for both segmentations and relations. Extensive experiments on two real-world information extraction tasks (entity identification and relation extraction from Wikipedia, and citation matching) show that (1) the proposed model achieves significant improvement on both tasks compared to state-of-the-art pipeline models and other joint models; and (2) the bi-directional MH inference algorithm obtains boosted performance compared to the greedy, N-best list, and uni-directional MH sampling algorithms.

International Conference on Computational Linguistics | 2008

An Integrated Probabilistic and Logic Approach to Encyclopedia Relation Extraction with Multiple Features

Xiaofeng Yu; Wai Lam

We propose a new integrated approach based on Markov logic networks (MLNs), an effective combination of probabilistic graphical models and first-order logic for statistical relational learning, for extracting relations between entities in encyclopedic articles from Wikipedia. The MLNs model entity relations collectively in a unified undirected graph using multiple features, including contextual, morphological, syntactic, and semantic features as well as Wikipedia-specific features, which capture the essential characteristics of the relation extraction task. This model makes simultaneous statistical judgments about the relations for a set of related entities. More importantly, implicit relations can also be identified easily. Our experimental results show that this integrated probabilistic and logic model significantly outperforms the current state-of-the-art probabilistic model, Conditional Random Fields (CRFs), for relation extraction from encyclopedic articles.

Conference on Information and Knowledge Management | 2011

Towards a top-down and bottom-up bidirectional approach to joint information extraction

Xiaofeng Yu; Irwin King; Michael R. Lyu

Most high-level information extraction (IE) consists of compound and aggregated subtasks. Such IE problems are generally challenging and have generated increasing interest recently. We investigate two representative IE tasks: (1) entity identification and relation extraction from Wikipedia, and (2) citation matching, and we formally define joint optimization of information extraction. We propose a joint paradigm integrating three factors (segmentation, relation, and segmentation-relation joint factors) to solve all relevant subtasks simultaneously. This modeling offers a natural formalism for exploiting rich bidirectional dependencies and interactions between relevant subtasks to capture mutual benefits. Since exact parameter estimation is intractable, we present a general, highly coupled learning algorithm based on variational expectation maximization (VEM) to perform parameter estimation approximately in a top-down and bottom-up manner, such that information can flow bidirectionally and mutual benefits from different subtasks can be well exploited. In this algorithm, segmentation and relation are optimized iteratively and collaboratively, each using hypotheses from the other. We conducted extensive experiments using two real-world datasets to demonstrate the promise of our approach.

International Conference on Neural Information Processing | 2011

Enrichment and Reductionism: Two Approaches for Web Query Classification

Ritesh Agrawal; Xiaofeng Yu; Irwin King; Remi Zajac

Classifying web queries into predefined target categories, also known as web query classification, is important for improving search relevance and online advertising. Web queries, however, are typically short, ambiguous, and in constant flux. Moreover, target categories often lack standard taxonomies and precise semantic descriptions. These challenges make web query classification a non-trivial problem. In this paper, we present two complementary approaches to the task. The first is the enrichment method, which uses the World Wide Web (WWW) to enrich target categories and then models web query classification as a search problem. The second, the reductionist approach, works by reducing web queries to a few central tokens. We evaluate the two approaches on a few thousand human-labeled local and non-local web queries. From our study, we find the two approaches to be complementary: the reductionist approach exhibits high precision but low recall, whereas the enrichment method exhibits high recall but low precision.
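The contrast between the two approaches can be sketched with a toy example. This is only an illustration under stated assumptions, not the authors' system: the category names, the `CENTRAL_TOKENS` lookup table, the hand-written `CATEGORY_TEXT` standing in for web-enriched category descriptions, and the token-overlap scoring are all hypothetical.

```python
STOPWORDS = {"a", "the", "in", "near", "me", "best", "cheap", "for"}

# Reductionist side (hypothetical table): a few central tokens map
# directly to a category.
CENTRAL_TOKENS = {
    "pizza": "Restaurants",
    "plumber": "Home Services",
    "dentist": "Health",
}

# Enrichment side (hypothetical data): each category is described by a
# larger bag of words, standing in for text gathered from the web.
CATEGORY_TEXT = {
    "Restaurants": "pizza pasta dinner menu delivery restaurant food",
    "Home Services": "plumber faucet pipe repair leak electrician roof",
    "Health": "dentist doctor clinic teeth checkup pharmacy",
}

def reductionist(query):
    """Reduce the query to its central tokens and look them up directly."""
    tokens = [t for t in query.lower().split() if t not in STOPWORDS]
    for token in tokens:
        if token in CENTRAL_TOKENS:
            return CENTRAL_TOKENS[token]
    return None  # no central token found: abstain

def enrichment(query):
    """Treat classification as search: rank categories by token overlap."""
    tokens = set(query.lower().split())
    scores = {
        category: len(tokens & set(text.split()))
        for category, text in CATEGORY_TEXT.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

for q in ["cheap pizza delivery", "leaky faucet repair"]:
    print(q, "->", reductionist(q), "|", enrichment(q))
```

In this sketch the lookup table only answers when it recognizes a central token, so it abstains on "leaky faucet repair", while the overlap-based search still returns a category from weaker evidence. That mirrors the precision and recall trade-off described in the abstract.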

Conference on Information and Knowledge Management | 2008

Coreference resolution using expressive logic models

Ki Chan; Wai Lam; Xiaofeng Yu

Coreference resolution is regarded as a crucial step for acquiring linkages among pieces of extracted information. Traditionally, coreference resolution models make use of independent attribute-value features over pairs of noun phrases. However, dependencies and deeper relations between features can more adequately describe the properties of coreference relations between noun phrases. In this paper, we propose a framework for coreference resolution based on the Markov Logic Network, which combines first-order logic with a probabilistic graphical model. The proposed framework enables the use of background knowledge and captures more complex coreference linkage properties through rich expression of conditions. Moreover, the proposed conditions can capture the structural pattern within a noun phrase as well as contextual information between noun phrases. Our experiments show improvement from the use of the expressive logic models and of pattern-based conditions.

International Conference on Computational Linguistics | 2010