James R. Foulds | Researchain

Publication

Featured research published by James R. Foulds.

Knowledge Engineering Review | 2010

A review of multi-instance learning assumptions

James R. Foulds; Eibe Frank

Multi-instance (MI) learning is a variant of inductive machine learning, where each learning example contains a bag of instances instead of a single feature vector. The term commonly refers to the supervised setting, where each bag is associated with a label. This type of representation is a natural fit for a number of real-world learning scenarios, including drug activity prediction and image classification, hence many MI learning algorithms have been proposed. Any MI learning method must relate instances to bag-level class labels, but many types of relationships between instances and class labels are possible. Although all early work in MI learning assumes a specific MI concept class known to be appropriate for a drug activity prediction domain, this ‘standard MI assumption’ is not guaranteed to hold in other domains. Much of the recent work in MI learning has concentrated on a relaxed view of the MI problem, where the standard MI assumption is dropped, and alternative assumptions are considered instead. However, it is often not clearly stated which particular assumption is used and how it relates to other assumptions that have been proposed. In this paper, we aim to clarify the use of alternative MI assumptions by reviewing the work done in this area.
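
As a rough illustration of what an MI assumption is, the sketch below contrasts the standard MI assumption, as it is usually stated (a bag is positive if and only if at least one of its instances is positive), with one possible relaxed alternative that requires a minimum number of positive instances. The function names, the `concept` predicate, and the example bags are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence

Instance = Sequence[float]
Bag = Sequence[Instance]


def label_standard_mi(bag: Bag, is_positive: Callable[[Instance], bool]) -> bool:
    """Standard MI assumption: a bag is positive iff at least one instance is positive."""
    return any(is_positive(x) for x in bag)


def label_threshold(bag: Bag, is_positive: Callable[[Instance], bool], min_count: int) -> bool:
    """Illustrative relaxed assumption: positive iff at least min_count instances are positive."""
    return sum(1 for x in bag if is_positive(x)) >= min_count


def concept(x: Instance) -> bool:
    """Hypothetical instance-level concept: the first feature exceeds 0.5."""
    return x[0] > 0.5


# Two toy bags of 2-D instances (made-up values).
bag_a = [(0.1, 0.2), (0.9, 0.3), (0.4, 0.8)]  # contains exactly one positive instance
bag_b = [(0.1, 0.2), (0.3, 0.3)]              # contains no positive instance

standard_a = label_standard_mi(bag_a, concept)
standard_b = label_standard_mi(bag_b, concept)
threshold_a = label_threshold(bag_a, concept, min_count=2)
```

Under these toy definitions, `bag_a` is positive under the standard assumption but negative under the two-instance threshold, which shows how the choice of assumption can change the bag-level label for the same instances.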

knowledge discovery and data mining | 2013

Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation

James R. Foulds; Levi Boyles; Christopher DuBois; Padhraic Smyth; Max Welling

There has been an explosion in the amount of digital text information available in recent years, leading to challenges of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it feasible to learn topic models on very large-scale corpora, but these methods do not currently take full advantage of the collapsed representation of the model. We propose a stochastic algorithm for collapsed variational Bayesian inference for LDA, which is simpler and more efficient than the state-of-the-art method. In experiments on large-scale text corpora, the algorithm was found to converge faster and often to a better solution than previous methods. Human-subject experiments also demonstrated that the method can learn coherent topics in seconds on small corpora, facilitating the use of topic models in interactive document analysis software.

international joint conference on natural language processing | 2015

Joint Models of Disagreement and Stance in Online Debate

Dhanya Sridhar; James R. Foulds; Bert Huang; Lise Getoor; Marilyn A. Walker

Online debate forums present a valuable opportunity for the understanding and modeling of dialogue. To understand these debates, a key challenge is inferring the stances of the participants, all of which are interrelated and dependent. While collectively modeling users’ stances has been shown to be effective (Walker et al., 2012c; Hasan and Ng, 2013), there are many modeling decisions whose ramifications are not well understood. To investigate these choices and their effects, we introduce a scalable unified probabilistic modeling framework for stance classification models that 1) are collective, 2) reason about disagreement, and 3) can model stance at either the author level or at the post level. We comprehensively evaluate the possible modeling choices on eight topics across two online debate corpora, finding accuracy improvements of up to 11.5 percentage points over a local classifier. Our results highlight the importance of making the correct modeling choices for online dialogues, and having a unified probabilistic modeling framework that makes this possible.

empirical methods in natural language processing | 2015

RELLY: Inferring Hypernym Relationships Between Relational Phrases

Adam Grycner; Gerhard Weikum; Jay Pujara; James R. Foulds; Lise Getoor

Relational phrases (e.g., “got married to”) and their hypernyms (e.g., “is a relative of”) are central for many tasks including question answering, open information extraction, paraphrasing, and entailment detection. This has motivated the development of several linguistic resources (e.g., DIRT, PATTY, and WiseNet) which systematically collect and organize relational phrases. These resources have demonstrable practical benefits, but are each limited due to noise, sparsity, or size. We present a new general-purpose method, RELLY, for constructing a large hypernymy graph of relational phrases with high-quality subsumptions using collective probabilistic programming techniques. Our graph induction approach integrates small high-precision knowledge bases together with large automatically curated resources, and reasons collectively to combine these resources into a consistent graph. Using RELLY, we construct a high-coverage, high-precision hypernymy graph consisting of 20K relational phrases and 35K hypernymy links. Our evaluation indicates a hypernymy link precision of 78%, and demonstrates the value of this resource for a document-relevance ranking task.

discovery science | 2010

Speeding up and boosting diverse density learning

James R. Foulds; Eibe Frank

In multi-instance learning, each example is described by a bag of instances instead of a single feature vector. In this paper, we revisit the idea of performing multi-instance classification based on a point-and-scaling concept by searching for the point in instance space with the highest diverse density. This is a computationally expensive process, and we describe several heuristics designed to improve runtime. Our results show that simple variants of existing algorithms can be used to find diverse density maxima more efficiently. We also show how significant increases in accuracy can be obtained by applying a boosting algorithm with a modified version of the diverse density algorithm as the weak learner.
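
To make the quantity being maximized more concrete, here is a minimal sketch of diverse density for a point-and-scaling concept, assuming the commonly used noisy-or formulation: an instance matches the concept with probability `exp(-sum_k (s_k * (x_k - p_k))^2)`, a positive bag should contain at least one match, and a negative bag should contain none. The function names and toy bags are hypothetical, and the sketch does not reproduce the runtime heuristics or the boosted variant described in the paper.

```python
import math
from typing import Sequence

Instance = Sequence[float]
Bag = Sequence[Instance]


def instance_prob(x: Instance, point: Instance, scale: Sequence[float]) -> float:
    """Probability that instance x matches the concept (Gaussian-like falloff per feature)."""
    d2 = sum((s * (xi - pi)) ** 2 for xi, pi, s in zip(x, point, scale))
    return math.exp(-d2)


def diverse_density(point: Instance, scale: Sequence[float],
                    positive_bags: Sequence[Bag], negative_bags: Sequence[Bag]) -> float:
    """Noisy-or diverse density of a candidate concept (point, scale)."""
    dd = 1.0
    for bag in positive_bags:
        # Probability that no instance in the bag matches the concept.
        miss = 1.0
        for x in bag:
            miss *= 1.0 - instance_prob(x, point, scale)
        dd *= 1.0 - miss  # at least one instance matches
    for bag in negative_bags:
        for x in bag:
            dd *= 1.0 - instance_prob(x, point, scale)  # no instance may match
    return dd


# Toy data (made-up values): both positive bags have an instance near the origin.
positive_bags = [[(0.0, 0.0), (5.0, 5.0)], [(0.1, -0.1), (-4.0, 3.0)]]
negative_bags = [[(5.0, 5.0), (4.0, 4.0)]]
scale = (1.0, 1.0)

dd_origin = diverse_density((0.0, 0.0), scale, positive_bags, negative_bags)
dd_far = diverse_density((5.0, 5.0), scale, positive_bags, negative_bags)
```

In this toy setup the origin scores far higher than `(5.0, 5.0)`, because the latter coincides with an instance in a negative bag. Each evaluation touches every instance in every bag, and a search must evaluate many candidate points and scalings, which gives a sense of why the search is computationally expensive.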

International Journal of Operational Research | 2006

A probabilistic dynamic programming model of rape seed harvesting

James R. Foulds; Les R. Foulds

We discuss a practical scenario from an operations scheduling viewpoint involving commercial contracting enterprises that visit farms in order to harvest rape seed crops. We report on a probabilistic dynamic programming formulation that was designed specifically for scenarios of the type described. The paper is an extension of previous work from the deterministic one-farm case to allow for: the specific considerations necessary to harvest a particular crop, namely rape seed, harvesting at multiple farms, machine failure, and the fact that activity duration times are uncertain. The computational times experienced in solving practical instances of the formulation are encouraging.

Asia-Pacific Journal of Operational Research | 2006

Bridge lane direction specification for sustainable traffic management

James R. Foulds; Les R. Foulds

We present a deterministic model that specifies lane direction in a multi-laned bridge that has a movable barrier that divides the two directions of traffic flow, in order to reduce congestion. A probabilistic dynamic programming formulation for a stochastic extension of the model is also presented. Analysis of the special structure of the dynamic programming formulation provides new insights into important aspects of certain traffic planning problems and represents a useful addition to the traffic network planner's toolkit. A case study involving the lane direction management of an actual bridge is also provided.

international conference on artificial intelligence and statistics | 2011