Hoifung Poon | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Hoifung Poon is active.

Explore More

Publication

Featured researches published by Hoifung Poon.

international conference on computer vision | 2011

Sum-product networks: A new deep architecture

Hoifung Poon; Pedro M. Domingos

The key limiting factor in graphical model inference and learning is the complexity of the partition function. We thus ask the question: what are the most general conditions under which the partition function is tractable? The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs) and will present in this abstract.

empirical methods in natural language processing | 2009

Unsupervised Semantic Parsing

Hoifung Poon; Pedro M. Domingos

We present the first unsupervised approach to the problem of learning a semantic parser, using Markov logic. Our USP system transforms dependency trees into quasi-logical forms, recursively induces lambda forms from these, and clusters them to abstract away syntactic variations of the same meaning. The MAP semantic parse of a sentence is obtained by recursively assigning its parts to lambda-form clusters and composing them. We evaluate our approach by using it to extract a knowledge base from biomedical abstracts and answer questions. USP substantially outperforms TextRunner, DIRT and an informed baseline on both precision and recall on this task.

empirical methods in natural language processing | 2008

Joint Unsupervised Coreference Resolution with Markov Logic

Hoifung Poon; Pedro M. Domingos

Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competitive with supervised ones. This is made possible by performing joint inference across mentions, in contrast to the pairwise classification typically used in supervised methods, and by using Markov logic as a representation language, which enables us to easily express relations like apposition and predicate nominals. On MUC and ACE datasets, our model outperforms Haghigi and Kleins one using only a fraction of the training data, and often matches or exceeds the accuracy of state-of-the-art supervised models.

north american chapter of the association for computational linguistics | 2009

Unsupervised Morphological Segmentation with Log-Linear Models

Hoifung Poon; Colin Cherry; Kristina Toutanova

Morphological segmentation breaks words into morphemes (the basic semantic units). It is a key component for natural language processing systems. Unsupervised morphological segmentation is attractive, because in every language there are virtually unlimited supplies of text, but very few labeled resources. However, most existing model-based systems for unsupervised morphological segmentation use directed generative models, making it difficult to leverage arbitrary overlapping features that are potentially helpful to learning. In this paper, we present the first log-linear model for unsupervised morphological segmentation. Our model uses overlapping features such as morphemes and their contexts, and incorporates exponential priors inspired by the minimum description length (MDL) principle. We present efficient algorithms for learning and inference by combining contrastive estimation with sampling. Our system, based on monolingual features only, outperforms a state-of-the-art system by a large margin, even when the latter uses bilingual information such as phrasal alignment and phonetic correspondence. On the Arabic Penn Treebank, our system reduces F1 error by 11% compared to Morfessor.

empirical methods in natural language processing | 2015

Representing Text for Joint Embedding of Text and Knowledge Bases

Kristina Toutanova; Danqi Chen; Patrick Pantel; Hoifung Poon; Pallavi Choudhury; Michael Gamon

Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy on knowledge base completion (Riedel et al., 2013). In this paper we propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that does not share parameters among textual relations with common sub-structure.

logic in computer science | 2016

Unifying Logical and Statistical AI

Pedro M. Domingos; Daniel Lowd; Stanley Kok; Aniruddh Nath; Hoifung Poon; Matthew Richardson; Parag Singla

Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the voted perceptron, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to a wide variety of problems in natural language understanding, vision, computational biology, social networks and others, and is the basis of the open-source Alchemy system.

Scientific Reports | 2013

An exhaustive epistatic SNP association analysis on expanded Wellcome Trust data.

Christoph Lippert; Jennifer Listgarten; Robert I. Davidson; Jeff Baxter; Hoifung Poon; Carl M. Kadie; David Heckerman

We present an approach for genome-wide association analysis with improved power on the Wellcome Trust data consisting of seven common phenotypes and shared controls. We achieved improved power by expanding the control set to include other disease cohorts, multiple races, and closely related individuals. Within this setting, we conducted exhaustive univariate and epistatic interaction association analyses. Use of the expanded control set identified more known associations with Crohns disease and potential new biology, including several plausible epistatic interactions in several diseases. Our work suggests that carefully combining data from large repositories could reveal many new biological insights through increased power. As a community resource, all results have been made available through an interactive web server.

international semantic web conference | 2008

Just Add Weights: Markov Logic for the Semantic Web

Pedro M. Domingos; Daniel Lowd; Stanley Kok; Hoifung Poon; Matthew Richardson; Parag Singla

In recent years, it has become increasingly clear that the vision of the Semantic Web requires uncertain reasoning over rich, first-order representations. Markov logic brings the power of probabilistic modeling to first-order logic by attaching weights to logical formulas and viewing them as templates for features of Markov networks. This gives natural probabilistic semantics to uncertain or even inconsistent knowledge bases with minimal engineering effort. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the conjugate gradient algorithm, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to problems in entity resolution, link prediction, information extraction and others, and is the basis of the open-source Alchemy system.

meeting of the association for computational linguistics | 2016

Compositional Learning of Embeddings for Relation Paths in Knowledge Base and Text

Kristina Toutanova; Victoria Lin; Wen-tau Yih; Hoifung Poon; Chris Quirk

Modeling relation paths has offered significant gains in embedding models for knowledge base (KB) completion. However, enumerating paths between two entities is very expensive, and existing approaches typically resort to approximation with a sampled subset. This problem is particularly acute when text is jointly modeled with KB relations and used to provide direct evidence for facts mentioned in it. In this paper, we propose the first exact dynamic programming algorithm which enables efficient incorporation of all relation paths of bounded length, while modeling both relation types and intermediate nodes in the compositional path representations. We conduct a theoretical analysis of the efficiency gain from the approach. Experiments on two datasets show that it addresses representational limitations in prior approaches and improves accuracy in KB completion.

Bioinformatics | 2014

Literome: PubMed-scale genomic knowledge base in the cloud

Hoifung Poon; Chris Quirk; Charlie DeZiel; David Heckerman

MOTIVATION Advances in sequencing technology have led to an exponential growth of genomics data, yet it remains a formidable challenge to interpret such data for identifying disease genes and drug targets. There has been increasing interest in adopting a systems approach that incorporates prior knowledge such as gene networks and genotype-phenotype associations. The majority of such knowledge resides in text such as journal publications, which has been undergoing its own exponential growth. It has thus become a significant bottleneck to identify relevant knowledge for genomic interpretation as well as to keep up with new genomics findings. RESULTS In the Literome project, we have developed an automatic curation system to extract genomic knowledge from PubMed articles and made this knowledge available in the cloud with a Web site to facilitate browsing, searching and reasoning. Currently, Literome focuses on two types of knowledge most pertinent to genomic medicine: directed genic interactions such as pathways and genotype-phenotype associations. Users can search for interacting genes and the nature of the interactions, as well as diseases and drugs associated with a single nucleotide polymorphism or gene. Users can also search for indirect connections between two entities, e.g. a gene and a disease might be linked because an interacting gene is associated with a related disease. AVAILABILITY AND IMPLEMENTATION Literome is freely available at literome.azurewebsites.net. Download for non-commercial use is available via Web services.

Explore More