Publication


Featured research published by Dale Schuurmans.


European Conference on Information Retrieval | 2004

Augmenting Naive Bayes Classifiers with Statistical Language Models

Fuchun Peng; Dale Schuurmans; Shaojun Wang

We augment naive Bayes models with statistical n-gram language models to address shortcomings of the standard naive Bayes text classifier. The result is a generalized naive Bayes classifier which allows for a local Markov dependence among observations; a model we refer to as the Chain Augmented Naive (CAN) Bayes classifier. CAN models have two advantages over standard naive Bayes classifiers. First, they relax some of the independence assumptions of naive Bayes, allowing a local Markov chain dependence in the observed variables, while still permitting efficient inference and learning. Second, they permit straightforward application of sophisticated smoothing techniques from statistical language modeling, which allows one to obtain better parameter estimates than the standard Laplace smoothing used in naive Bayes classification. In this paper, we introduce CAN models and apply them to various text classification problems. To demonstrate the language-independent and task-independent nature of these classifiers, we present experimental results on several text classification problems (authorship attribution, text genre classification, and topic detection) in several languages (Greek, English, Japanese, and Chinese). We then systematically study the key factors in the CAN model that can influence the classification performance, and analyze the strengths and weaknesses of the model.
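
As a rough illustration of the idea in the abstract above, the following sketch trains one character bigram model per class and classifies by the highest class-conditional log probability plus log prior. It is a minimal toy under stated assumptions, not the authors' implementation: the class name `ChainAugmentedNB`, the interpolation weight `lam`, and the choice of linear interpolation with an add-one unigram as the smoothing scheme are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


class ChainAugmentedNB:
    """Toy chain augmented naive Bayes: one character bigram model per class."""

    def __init__(self, lam=0.7):
        self.lam = lam  # weight on the bigram estimate (assumed value)
        self.bigrams = defaultdict(Counter)   # class -> counts of (prev, cur)
        self.contexts = defaultdict(Counter)  # class -> counts of prev
        self.unigrams = defaultdict(Counter)  # class -> counts of cur
        self.class_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            chars = ["<s>"] + list(text)  # "<s>" marks the start of a document
            for prev, cur in zip(chars, chars[1:]):
                self.bigrams[label][(prev, cur)] += 1
                self.contexts[label][prev] += 1
                self.unigrams[label][cur] += 1
                self.vocab.add(cur)
        return self

    def _log_prob(self, text, label):
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        total = sum(self.unigrams[label].values())
        # Start from the log class prior.
        logp = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        chars = ["<s>"] + list(text)
        for prev, cur in zip(chars, chars[1:]):
            # Add-one smoothed unigram estimate, always strictly positive.
            uni = (self.unigrams[label][cur] + 1) / (total + v)
            ctx = self.contexts[label][prev]
            bi = self.bigrams[label][(prev, cur)] / ctx if ctx else 0.0
            # Markov chain dependence: each character is conditioned on its predecessor.
            logp += math.log(self.lam * bi + (1 - self.lam) * uni)
        return logp

    def predict(self, text):
        return max(self.class_counts, key=lambda c: self._log_prob(text, c))
```

Setting `lam` to 0 reduces the model to a unigram naive Bayes classifier with add-one smoothing, which makes the effect of the chain dependence easy to compare on the same data.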


Conference on Learning Theory | 1997

General convergence results for linear discriminant updates

Adam J. Grove; Nick Littlestone; Dale Schuurmans

The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of “quasi-additive” algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including both Perceptron and Winnow, but also many new algorithms. Our proof hinges on analyzing a generic measure of progress construction that gives insight as to when and how such algorithms converge.


Meeting of the Association for Computational Linguistics | 2006

Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling

Feng Jiao; Shaojun Wang; Chi-Hoon Lee; Russell Greiner; Dale Schuurmans

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
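
The shape of the training objective described above can be sketched for the much simpler case of an unstructured multiclass classifier. This is only an illustration of combining labeled conditional likelihood with unlabeled conditional entropy; it does not handle sequences or CRFs, and `score_fn`, `gamma`, and the function names are hypothetical names introduced for this example.

```python
import math


def softmax(scores):
    """Convert unnormalized scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def entropy_regularized_objective(labeled, unlabeled, score_fn, gamma=0.1):
    """Objective to maximize: labeled log-likelihood minus gamma times
    the conditional entropy of predictions on unlabeled inputs.

    labeled:   list of (x, y_index) pairs
    unlabeled: list of x
    score_fn:  hypothetical model, maps x to a list of class scores
    """
    log_lik = sum(math.log(softmax(score_fn(x))[y]) for x, y in labeled)
    neg_entropy = 0.0
    for x in unlabeled:
        p = softmax(score_fn(x))
        neg_entropy += sum(pi * math.log(pi) for pi in p if pi > 0)
    return log_lik + gamma * neg_entropy
```

Because the entropy term rewards confident predictions on unlabeled inputs, the objective is generally not concave in the model parameters, which is consistent with the abstract's remark that it is used to improve an initial model by iterative ascent.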


European Conference on Information Retrieval | 2003

Combining naive Bayes and n-gram language models for text classification

Fuchun Peng; Dale Schuurmans

We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers. The chain augmented naive Bayes classifiers we propose have two advantages over standard naive Bayes classifiers. First, a chain augmented naive Bayes model relaxes some of the independence assumptions of naive Bayes--allowing a local Markov chain dependence in the observed variables--while still permitting efficient inference and learning. Second, smoothing techniques from statistical language modeling can be used to recover better estimates than the Laplace smoothing techniques usually used in naive Bayes classification. Our experimental results on three real world data sets show that we achieve substantial improvements over standard naive Bayes classification, while also achieving state of the art performance that competes with the best known methods in these cases.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Graphical Models and Point Pattern Matching

Tibério S. Caetano; Terry Caelli; Dale Schuurmans; Dante Augusto Couto Barone

This paper describes a novel solution to the rigid point pattern matching problem in Euclidean spaces of any dimension. Although we assume rigid motion, jitter is allowed. We present a noniterative, polynomial time algorithm that is guaranteed to find an optimal solution for the noiseless case. First, we model point pattern matching as a weighted graph matching problem, where weights correspond to Euclidean distances between nodes. We then formulate graph matching as a problem of finding a maximum probability configuration in a graphical model. By using graph rigidity arguments, we prove that a sparse graphical model yields equivalent results to the fully connected model in the noiseless case. This allows us to obtain an algorithm that runs in polynomial time and is provably optimal for exact matching between noiseless point sets. For inexact matching, we can still apply the same algorithm to find approximately optimal solutions. Experimental results obtained by our approach show improvements in accuracy over current methods, particularly when matching patterns of different sizes.


Conference of the European Chapter of the Association for Computational Linguistics | 2003

Language independent authorship attribution using character level language models

Fuchun Peng; Dale Schuurmans; Shaojun Wang; Vlado Keselj

We present a method for computer-assisted authorship attribution based on character-level n-gram language models. Our approach is based on simple information theoretic principles, and achieves improved performance across a variety of languages without requiring extensive pre-processing or feature selection. To demonstrate the effectiveness and language independence of our approach, we present experimental results on Greek, English, and Chinese data. We show that our approach achieves state of the art performance in each of these cases. In particular, we obtain an 18% accuracy improvement over the best published results for a Greek data set, while using a far simpler technique than previous investigations.


Computer Vision and Pattern Recognition | 2003

Face alignment using statistical models and wavelet features

Feng Jiao; Stan Z. Li; Heung-Yeung Shum; Dale Schuurmans

The active shape model (ASM) is a powerful statistical tool for face alignment by shape. However, it can suffer from changes in illumination and facial expression, and from local minima in optimization. In this paper, we present a method, W-ASM, in which Gabor wavelet features are used for modeling local image structure. The magnitude and phase of Gabor features contain rich information about the local structural features of face images to be aligned, and provide accurate guidance for search. To a large extent, this repairs defects in gray scale based search. An E-M algorithm is used to model the Gabor feature distribution, and a coarse-to-fine grained search is used to position local features in the image. Experimental results demonstrate the ability of W-ASM to accurately align and locate facial features.


Artificial Intelligence | 2006

Constraint-based optimization and utility elicitation using the minimax decision criterion

Craig Boutilier; Relu Patrascu; Pascal Poupart; Dale Schuurmans

In many situations, a set of hard constraints encodes the feasible configurations of some system or product over which multiple users have distinct preferences. However, making suitable decisions requires that the preferences of a specific user for different configurations be articulated or elicited, something generally acknowledged to be onerous. We address two problems associated with preference elicitation: computing a best feasible solution when the user's utilities are imprecisely specified; and developing useful elicitation procedures that reduce utility uncertainty, with minimal user interaction, to a point where (approximately) optimal decisions can be made. Our main contributions are threefold. First, we propose the use of minimax regret as a suitable decision criterion for decision making in the presence of such utility function uncertainty. Second, we devise several different procedures, all relying on mixed integer linear programs, that can be used to compute minimax regret and regret-optimizing solutions effectively. In particular, our methods exploit generalized additive structure in a user's utility function to ensure tractable computation. Third, we propose various elicitation methods that can be used to refine utility uncertainty in such a way as to quickly (i.e., with as few questions as possible) reduce minimax regret. Empirical study suggests that several of these methods are quite successful in minimizing the number of user queries, while remaining computationally practical so as to admit real-time user interaction.
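
The minimax regret criterion named above can be illustrated on a tiny problem by brute-force enumeration. This sketch assumes a finite list of feasible configurations and a finite list of candidate utility functions, which is a simplification; the paper itself computes minimax regret with mixed integer linear programs, and the function name and arguments here are hypothetical.

```python
def minimax_regret(configs, utilities):
    """Pick the configuration whose worst-case regret is smallest.

    configs:   list of feasible configurations
    utilities: list of candidate utility functions (config -> float),
               standing in for the set of utilities still considered possible
    """
    def max_regret(x):
        # Regret of x under u: how much better the best configuration is than x.
        return max(max(u(y) for y in configs) - u(x) for u in utilities)

    best = min(configs, key=max_regret)
    return best, max_regret(best)


# Two candidate utility functions over three configurations.
u1 = {"a": 10, "b": 6, "c": 0}
u2 = {"a": 0, "b": 5, "c": 8}
choice, regret = minimax_regret(["a", "b", "c"], [u1.get, u2.get])
print(choice, regret)  # b 4
```

Configuration "b" is optimal under neither candidate utility, yet it is the minimax regret choice because it is never far from optimal under either one. An elicitation query that rules out one of the candidates would shrink the utility set and can only reduce the resulting regret.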


Machine Learning | 2001

General Convergence Results for Linear Discriminant Updates

Adam J. Grove; Nick Littlestone; Dale Schuurmans

The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of “quasi-additive” algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including both Perceptron and Winnow, but also many new algorithms. Our proof hinges on analyzing a generic measure of progress construction that gives insight as to when and how such algorithms converge. Our measure of progress construction also permits us to obtain good mistake bounds for individual algorithms. We apply our unified analysis to new algorithms as well as existing algorithms. When applied to known algorithms, our method “automatically” produces close variants of existing proofs (recovering similar bounds), thus showing that, in a certain sense, these seemingly diverse results are fundamentally isomorphic. However, we also demonstrate that the unifying principles are more broadly applicable, and analyze a new class of algorithms that smoothly interpolate between the additive-update behavior of Perceptron and the multiplicative-update behavior of Winnow.
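
One way to picture how Perceptron and Winnow can share a single template is sketched below. It assumes a formulation in which an accumulator vector z is updated additively on each mistake and the weights are obtained by applying a link function to z componentwise; this formulation, the function names, and the parameters `eta` and `epochs` are assumptions made for the example, not details taken from the abstract.

```python
import math


def quasi_additive_train(examples, link, eta=1.0, epochs=20):
    """Mistake-driven training with weights w_i = link(z_i).

    examples: list of (x, y), x a list of floats, y in {-1, +1}
    link:     maps an accumulator value to a weight (assumed formulation)
    """
    n = len(examples[0][0])
    z = [0.0] * n
    mistakes = 0
    for _ in range(epochs):
        clean_pass = True
        for x, y in examples:
            w = [link(zi) for zi in z]
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin <= 0:  # update only when the example is misclassified
                z = [zi + eta * y * xi for zi, xi in zip(z, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:
            break  # every example classified correctly
    return [link(zi) for zi in z], mistakes


def perceptron_link(z):
    return z  # identity link: weights change additively


def winnow_link(z):
    return math.exp(z)  # exponential link: weights change multiplicatively
```

With the exponential link, adding `eta * y * x_i` to `z_i` multiplies the weight by `exp(eta * y * x_i)`, which is the multiplicative behavior associated with Winnow, while the identity link recovers the usual Perceptron step.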


IEEE Transactions on Image Processing | 2011

Real-Time Discriminative Background Subtraction

Li Cheng; Minglun Gong; Dale Schuurmans; Terry Caelli

We examine the problem of segmenting foreground objects in live video when background scene textures change over time. In particular, we formulate background subtraction as minimizing a penalized instantaneous risk functional, yielding a local online discriminative algorithm that can quickly adapt to temporal changes. We analyze the algorithm's convergence, discuss its robustness to nonstationarity, and provide an efficient nonlinear extension via sparse kernels. To accommodate interactions among neighboring pixels, a global algorithm is then derived that explicitly distinguishes objects versus background using maximum a posteriori inference in a Markov random field (implemented via graph-cuts). By exploiting the parallel nature of the proposed algorithms, we develop an implementation that can run efficiently on the highly parallel graphics processing unit (GPU). Empirical studies on a wide variety of datasets demonstrate that the proposed approach achieves quality that is comparable to state-of-the-art offline methods, while still being suitable for real-time video analysis (≥ 75 fps on a mid-range GPU).

Collaboration


Top Co-Authors

Fuchun Peng | University of Waterloo

Shaojun Wang | Wright State University

Yaoliang Yu | Carnegie Mellon University