Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Matthew J. Beal is active.

Publication


Featured researches published by Matthew J. Beal.


Journal of the American Statistical Association | 2006

Hierarchical Dirichlet Processes

Yee Whye Teh; Michael I. Jordan; Matthew J. Beal; David M. Blei

We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of a stick-breaking process, and a generalization of the Chinese restaurant process that we refer to as the “Chinese restaurant franchise.” We present Markov chain Monte Carlo algorithms for posterior inference in hierarchical Dirichlet process mixtures and describe applications to problems in information retrieval and text modeling.
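The "well-known clustering property" the abstract relies on is the Chinese restaurant process view of a Dirichlet process: each draw joins an existing cluster with probability proportional to its size, or starts a new one with probability proportional to a concentration parameter. A minimal sketch of that group-level prior (an illustration only, not the paper's hierarchical franchise sampler; `crp_partition` is a hypothetical name):

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample a partition of n items from a Chinese restaurant process.

    Customer i sits at an existing table with probability proportional
    to its occupancy, or opens a new table (a new mixture component)
    with probability proportional to the concentration alpha.
    """
    rng = random.Random(seed)
    tables = []       # tables[k] = number of customers at table k
    assignments = []  # assignments[i] = table index of customer i
    for i in range(n):
        # Unnormalized seating probabilities: existing tables, then a new one.
        weights = tables + [alpha]
        r = rng.uniform(0.0, sum(weights))
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r <= acc:
                break
        if k == len(tables):
            tables.append(1)   # new component created
        else:
            tables[k] += 1
        assignments.append(k)
    return assignments, tables
```

In the hierarchical construction, each group runs its own such process, but the dishes (atoms) served at the tables are themselves drawn from a shared top-level Dirichlet process, which is what forces components to be shared across groups.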


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

A graphical model for audiovisual object tracking

Matthew J. Beal; Nebojsa Jojic; Hagai Attias

We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.
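The generative model links the two microphone signals through a relative time delay that correlates with the object's position in the camera frame. As a minimal illustration of that audio cue only (a brute-force correlation search, not the paper's probabilistic inference; `estimate_delay` is a hypothetical name), one can recover the inter-microphone delay like this:

```python
def estimate_delay(x1, x2, max_shift):
    """Return the integer shift of x2 (in samples) that maximizes its
    correlation with x1, searched over [-max_shift, max_shift].

    In an audio-visual tracker this delay is the raw cue that, after
    calibration, maps to the object's horizontal position.
    """
    best, best_score = 0, float("-inf")
    for tau in range(-max_shift, max_shift + 1):
        score = 0.0
        for t in range(len(x1)):
            j = t + tau
            if 0 <= j < len(x2):
                score += x1[t] * x2[j]
        if score > best_score:
            best, best_score = tau, score
    return best
```

The paper's model instead treats the delay as an unobserved variable and infers it jointly with the video variables, which is what makes the calibration automatic.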


Bayesian Analysis | 2006

Variational Bayesian Learning of Directed Graphical Models with Hidden Variables

Matthew J. Beal; Zoubin Ghahramani

A key problem in statistics and machine learning is inferring suitable structure of a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately for most models of interest, notably those containing hidden or latent variables, the marginal likelihood is intractable to compute. We present the variational Bayesian (VB) algorithm for directed graphical models, which optimises a lower bound approximation to the marginal likelihood in a procedure similar to the standard EM algorithm. We show that for a large class of models, which we call conjugate exponential, the VB algorithm is a straightforward generalisation of the EM algorithm that incorporates uncertainty over model parameters. In a thorough case study using a small class of bipartite DAGs containing hidden variables, we compare the accuracy of the VB approximation to existing asymptotic-data approximations such as the Bayesian Information Criterion (BIC) and the Cheeseman-Stutz (CS) criterion, and also to a sampling-based gold standard, Annealed Importance Sampling (AIS). We find that the VB algorithm is empirically superior to CS and BIC, and much faster than AIS. Moreover, we prove that a VB approximation can always be constructed in such a way that it is guaranteed to be more accurate than the CS approximation.
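The BIC baseline the abstract compares against is a simple asymptotic approximation to the log marginal likelihood: the maximized log-likelihood minus a complexity penalty of (d/2) ln n for d free parameters and n data points. A minimal sketch (the function name is hypothetical):

```python
import math

def bic_score(max_log_likelihood, num_params, num_data):
    """BIC approximation to the log marginal likelihood:
        ln p(D|m) ~ ln p(D|theta_ML, m) - (d/2) ln n
    i.e. a data-fit term minus a parameter-count penalty."""
    return max_log_likelihood - 0.5 * num_params * math.log(num_data)
```

Model selection then picks the structure with the highest score; VB refines this by replacing the crude penalty with a per-model variational lower bound that accounts for posterior uncertainty over the parameters.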


international conference on document analysis and recognition | 2005

A statistical model for writer verification

Sargur N. Srihari; Matthew J. Beal; Karthik Bandi; Vivek Shah; Praveen Krishnamurthy

A statistical model for determining whether a pair of documents, a known and a questioned, were written by the same individual is proposed. The model has the following four components: (i) discriminating elements, e.g., global features and characters, are extracted from each document; (ii) differences between corresponding elements from each document are computed; (iii) using conditional probability estimates of each difference, the log-likelihood ratio (LLR) is computed for the hypotheses that the documents were written by the same or by different writers; the conditional probability estimates themselves are determined from labeled samples using either Gaussian or gamma estimates for the differences, assuming their statistical independence; and (iv) the distributions of LLRs for same-writer and different-writer pairs are analyzed to calibrate the strength of evidence into the standard nine-point scale used by questioned document examiners. The model is illustrated with experimental results for a specific set of discriminating elements.
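Under the independence assumption in step (iii), the overall LLR is just a sum of per-element log ratios. A minimal sketch with Gaussian difference models (the paper also uses gamma models; function names and parameters here are hypothetical):

```python
import math

def gaussian_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_likelihood_ratio(diffs, same_params, diff_params):
    """Sum per-element log-likelihood ratios for the same-writer vs
    different-writer hypotheses, assuming the element differences are
    statistically independent. Each params entry is (mean, std)."""
    llr = 0.0
    for d, (mu_s, sd_s), (mu_d, sd_d) in zip(diffs, same_params, diff_params):
        llr += gaussian_logpdf(d, mu_s, sd_s) - gaussian_logpdf(d, mu_d, sd_d)
    return llr
```

A positive LLR favors the same-writer hypothesis (small differences), a negative one the different-writer hypothesis; calibration then maps LLR ranges onto the nine-point scale.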


european conference on computer vision | 2002

Audio-Video Sensor Fusion with Probabilistic Graphical Models

Matthew J. Beal; Hagai Attias; Nebojsa Jojic

We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.


meeting of the association for computational linguistics | 2006

Automatically Extracting Nominal Mentions of Events with a Bootstrapped Probabilistic Classifier

Cassandre Creswell; Matthew J. Beal; John Chen; Thomas L. Cornell; Lars Nilsson; Rohini K. Srihari

Most approaches to event extraction focus on mentions anchored in verbs. However, many mentions of events surface as noun phrases. Detecting them can increase the recall of event extraction and provide the foundation for detecting relations between events. This paper describes a weakly-supervised method for detecting nominal event mentions that combines techniques from word sense disambiguation (WSD) and lexical acquisition to create a classifier that labels noun phrases as denoting events or non-events. The classifier uses bootstrapped probabilistic generative models of the contexts of events and non-events. The contexts are the lexically-anchored semantic dependency relations that the NPs appear in. Our method improves dramatically with bootstrapping, and comfortably outperforms lexical lookup methods based on much larger hand-crafted resources.
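The bootstrapped generative-classifier idea can be sketched as a naive-Bayes-style model over context features that relabels unlabeled NPs each round and folds the predictions back into training. This is an illustration only, with made-up dependency-relation feature strings; it is not the paper's actual feature set or model:

```python
import math
from collections import Counter

def train_counts(docs):
    """docs: list of (context_features, label). Count features per class."""
    counts = {"event": Counter(), "nonevent": Counter()}
    totals = {"event": 0, "nonevent": 0}
    for feats, label in docs:
        counts[label].update(feats)
        totals[label] += len(feats)
    return counts, totals

def classify(feats, counts, totals, vocab_size):
    """Label an NP by its add-one-smoothed class log-likelihood
    (a uniform class prior drops out of the argmax)."""
    scores = {}
    for label in counts:
        s = 0.0
        for f in feats:
            s += math.log((counts[label][f] + 1) / (totals[label] + vocab_size))
        scores[label] = s
    return max(scores, key=scores.get)

def bootstrap(seed_docs, unlabeled, rounds=2):
    """Each round, label the unlabeled NPs with the current model and
    fold the predictions back into the training set."""
    vocab = {f for feats, _ in seed_docs for f in feats}
    vocab |= {f for feats in unlabeled for f in feats}
    labeled = list(seed_docs)
    for _ in range(rounds):
        counts, totals = train_counts(labeled)
        labeled = list(seed_docs) + [
            (feats, classify(feats, counts, totals, len(vocab)))
            for feats in unlabeled
        ]
    return train_counts(labeled)
```

Starting from a small labeled seed, each round sharpens the context models as confidently labeled NPs accumulate, which is the source of the improvement the abstract reports.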


Scopus | 2005

Machine learning approaches for person identification and verification

Harish Srinivasan; Matthew J. Beal; Sargur N. Srihari

New machine learning strategies are proposed for person identification which can be used in several biometric modalities such as friction ridges, handwriting, signatures and speech. The biometric or forensic performance task answers the question of whether or not a sample belongs to a known person. Two different learning paradigms are discussed: person-independent (or general) learning and person-dependent (or person-specific) learning. In the first paradigm, learning is from a general population of pairs of samples, each pair labelled as being from the same person or from different persons; the learning process determines the range of variation within a given person and between different persons. In the second paradigm, the identity of a person is learnt from multiple known samples of that person, where the variations and similarities within that particular person are learnt. The person-specific learning strategy is seen to perform better than general learning (5% higher performance with signatures). Improvement of person-specific performance with an increasing number of samples is also observed.


international conference on acoustics, speech, and signal processing | 2002

A self-calibrating algorithm for speaker tracking based on audio-visual statistical models

Matthew J. Beal; Nebojsa Jojic; Hagai Attias

We present a self-calibrating algorithm for audio-visual tracking using two microphones and a camera. The algorithm uses a parametrized statistical model which combines simple models of video and audio. Using unobserved variables, the model describes the process that generates the observed data. Hence, it is able to capture and exploit the statistical structure of the audio and video data, as well as their mutual dependencies. The model parameters are estimated by the EM algorithm; object templates are learned and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location using the model. Successful performance is demonstrated on real multimedia clips.


Proceedings of SPIE, the International Society for Optical Engineering | 2006

Comparison of ROC-based and likelihood methods for fingerprint verification

Harish Srinivasan; Sargur N. Srihari; Matthew J. Beal; Prasad Phatak; Gang Fang

The fingerprint verification task answers the question of whether or not two fingerprints belong to the same finger. The paper focuses on the classification aspect of fingerprint verification. Classification is the third and final step, after the two earlier steps of feature extraction, where a known set of features (minutiae points) is extracted from each fingerprint, and scoring, where a matcher determines a degree of match between the two sets of features. Since this is a binary classification problem involving a single variable, the commonly used threshold method is related to the so-called receiver operating characteristic (ROC). In the ROC approach, the optimal threshold on the score is determined and used to declare a match or non-match. Such a method works well when there is a well-registered fingerprint image. More sophisticated methods are needed when only a partial imprint of a finger is available, as in the case of latent prints in forensics or due to limitations of the biometric device. In such situations it is useful to consider classification methods based on computing the likelihood ratio of match/non-match. Such methods are commonly used in biometric and forensic domains such as speaker verification, where there is a much higher degree of uncertainty. This paper compares the two approaches empirically for the fingerprint classification task as the number of available minutiae is varied. In both the ROC-based and likelihood-ratio methods, learning is from a general population of pairs, each labeled as being from the same finger or from different fingers. In the ROC-based method, the best operating point is derived from the ROC curve. In the likelihood method, the distributions of same-finger and different-finger scores are modeled using Gaussian and gamma distributions. Results show that the likelihood method performs better than the ROC-based method when fewer minutiae points are available. Both methods converge to the same accuracy as more minutiae points become available.
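The two decision rules being compared can be sketched side by side: a fixed score threshold versus a comparison of the score's likelihood under same-finger and different-finger models. This is an illustration with Gaussian score models only (the paper also fits gamma distributions; all names and parameters are hypothetical):

```python
import math

def threshold_decision(score, threshold):
    """ROC-style rule: declare a match when the matcher score exceeds
    a threshold chosen from the operating point on the ROC curve."""
    return score >= threshold

def likelihood_decision(score, same_mu, same_sd, diff_mu, diff_sd):
    """Likelihood-ratio rule: model the same-finger and different-finger
    score distributions and declare a match when
    p(score | same) > p(score | different)."""
    def logpdf(x, mu, sd):
        return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)
    return logpdf(score, same_mu, same_sd) > logpdf(score, diff_mu, diff_sd)
```

The likelihood rule adapts its effective decision boundary to how the two score distributions overlap, which is why it degrades more gracefully when few minutiae make the score noisy.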


international conference on pattern recognition | 2006

Competitive Mixtures of Simple Neurons

Karthik Sridharan; Matthew J. Beal; Venu Govindaraju

We propose a competitive finite mixture of neurons (or perceptrons) for solving binary classification problems. Our classifier includes a prior for the weights between different neurons such that it prefers mixture models made up from neurons having classification boundaries as orthogonal to each other as possible. We derive an EM algorithm for learning the mixing proportions and weights of each neuron, consisting of an exact E step and a partial M step, and show that our model covers the regions of high posterior probability in weight space and tends to reduce overfitting. We demonstrate the way in which our mixture classifier works using a toy 2D data set, showing the effective use of strategically positioned components in the mixture. We further compare its performance against SVMs and one-hidden-layer neural networks on four real-world data sets from the UCI repository, and show that even a relatively small number of neurons with appropriate competitive priors can achieve superior classification accuracies on held-out test data.
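The exact E step the abstract mentions computes, for each labelled point, the posterior probability that each neuron in the mixture generated it. A minimal sketch for logistic neurons (an illustration only; it omits the competitive orthogonality prior and the partial M step, and all names are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_likelihood(x, y, w):
    """Bernoulli likelihood of binary label y under one logistic neuron
    with weight vector w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return p if y == 1 else 1.0 - p

def responsibilities(x, y, weights, mixing):
    """Exact E step: posterior probability that each neuron generated
    the labelled point (x, y), given mixing proportions."""
    joint = [pi * neuron_likelihood(x, y, w) for pi, w in zip(mixing, weights)]
    total = sum(joint)
    return [j / total for j in joint]
```

The M step would then reweight each neuron's update by these responsibilities, so neurons specialize in the regions where they already classify well, which is the "competitive" behavior.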

Collaboration


Dive into Matthew J. Beal's collaborations.

Top Co-Authors

Joy Ghosh

University at Buffalo
