Publication


Featured research published by Thomas P. Minka.


International Conference on Computer Vision | 2005

Object categorization by learned universal visual dictionary

John Winn; Antonio Criminisi; Thomas P. Minka

This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable for many applications such as semantic image retrieval, Web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary. The final visual words are described by GMMs. ii) A novel statistical measure of discrimination is proposed which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the Web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
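
As a rough sketch of the classification step described above (not the paper's implementation), the snippet below represents a region by its normalised histogram of visual-word occurrences and assigns it to the class with the most similar learned histogram. The function names and the chi-squared distance are illustrative assumptions; the paper additionally models visual words with GMMs and learns the dictionary by pair-wise merging, which this sketch omits.

```python
import numpy as np

def word_histogram(word_ids, vocab_size):
    """Normalised histogram of visual-word occurrences in one region."""
    counts = np.bincount(word_ids, minlength=vocab_size).astype(float)
    return counts / max(counts.sum(), 1.0)

def classify_region(word_ids, class_histograms, vocab_size):
    """Assign a region to the class whose typical word proportions are
    closest (smallest chi-squared distance; the distance choice is an assumption)."""
    h = word_histogram(word_ids, vocab_size)
    def chi2(p, q, eps=1e-10):
        return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))
    return min(class_histograms, key=lambda c: chi2(h, class_histograms[c]))

# Toy usage: an 8-word vocabulary and two classes with different typical proportions.
class_histograms = {
    "grass": np.array([0.4, 0.3, 0.1, 0.1, 0.05, 0.05, 0.0, 0.0]),
    "car":   np.array([0.0, 0.0, 0.05, 0.05, 0.1, 0.1, 0.3, 0.4]),
}
region_words = np.array([0, 0, 1, 1, 1, 2, 3, 0])  # mostly low-index words
print(classify_region(region_words, class_histograms, vocab_size=8))  # -> "grass"
```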


IEEE Transactions on Image Processing | 2000

The Bayesian image retrieval system, PicHunter: theory, implementation, and psychophysical experiments

Ingemar J. Cox; Matthew L. Miller; Thomas P. Minka; Thomas V. Papathomas; Peter N. Yianilos

This paper presents the theory, design principles, implementation and performance results of PicHunter, a prototype content-based image retrieval (CBIR) system. In addition, this document presents the rationale, design and results of psychophysical experiments that were conducted to address some key issues that arose during PicHunter's development. The PicHunter project makes four primary contributions to research on CBIR. First, PicHunter represents a simple instance of a general Bayesian framework which we describe for using relevance feedback to direct a search. With an explicit model of what users would do, given the target image they want, PicHunter uses Bayes' rule to predict the target they want, given their actions. This is done via a probability distribution over possible image targets, rather than by refining a query. Second, an entropy-minimizing display algorithm is described that attempts to maximize the information obtained from a user at each iteration of the search. Third, PicHunter makes use of hidden annotation rather than a possibly inaccurate/inconsistent annotation structure that the user must learn and make queries in. Finally, PicHunter introduces two experimental paradigms to quantitatively evaluate the performance of the system, and psychophysical experiments are presented that support the theoretical claims.
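
The Bayesian relevance-feedback idea can be sketched as a posterior update over candidate target images. The snippet below is a minimal illustration under an assumed softmax user model; it is not PicHunter's actual user model, display-selection strategy, or image representation.

```python
import numpy as np

def pichunter_update(prior, displayed, selected, similarity):
    """One Bayesian relevance-feedback step in the spirit of PicHunter
    (a simplified sketch, not the paper's exact model).

    prior      : current probability over candidate target images
    displayed  : indices of the images shown this iteration
    selected   : index of the image the user clicked
    similarity : matrix with similarity[i, j] between images i and j (assumed given)

    The softmax user model below (users tend to click the displayed image most
    similar to their target) is an illustrative assumption.
    """
    displayed = list(displayed)
    sel_pos = displayed.index(selected)
    posterior = np.array(prior, dtype=float)
    for target in range(len(posterior)):
        scores = similarity[displayed, target]
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        posterior[target] *= probs[sel_pos]   # Bayes' rule: prior x likelihood
    return posterior / posterior.sum()

# Toy usage: five candidates, uniform prior, the user clicks image 4.
rng = np.random.default_rng(0)
sim = rng.random((5, 5)); sim = (sim + sim.T) / 2.0
print(pichunter_update(np.full(5, 0.2), displayed=[0, 2, 4], selected=4, similarity=sim))
```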


Computer Vision and Pattern Recognition | 2006

Cosegmentation of Image Pairs by Histogram Matching - Incorporating a Global Constraint into MRFs

Carsten Rother; Thomas P. Minka; Andrew Blake; Vladimir Kolmogorov

We introduce the term cosegmentation, which denotes the task of simultaneously segmenting the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint which attempts to match the appearance histograms of the common parts. This energy has not been proposed previously, and its optimization is challenging and NP-hard. For this problem a novel optimization scheme, which we call trust region graph cuts, is presented. We demonstrate that this framework has the potential to improve a wide range of research: object-driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality: the common part can be a rigid/non-rigid object (or scene), observed from different viewpoints, or even similar objects of the same class.
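
To make the energy concrete, the sketch below evaluates (but does not optimise) an energy of the form described above: per-image MRF smoothness terms plus a global penalty on the mismatch between the two foreground histograms. The Potts-style smoothness cost, the L1 histogram distance, and the weights lam and mu are illustrative assumptions; the paper's optimiser, trust region graph cuts, is not reproduced here.

```python
import numpy as np

def cosegmentation_energy(labels_a, labels_b, img_a, img_b, bins=16, lam=1.0, mu=1.0):
    """Evaluate a cosegmentation-style energy for a candidate labelling.

    labels_a, labels_b : binary masks (1 = common/foreground part)
    img_a, img_b       : greyscale images with values in [0, 1]
    """
    def smoothness(labels):
        # Count label disagreements between 4-connected neighbours (Potts model).
        return (np.abs(np.diff(labels, axis=0)).sum()
                + np.abs(np.diff(labels, axis=1)).sum())

    def fg_histogram(img, labels):
        hist, _ = np.histogram(img[labels == 1], bins=bins, range=(0.0, 1.0))
        return hist.astype(float)

    mrf_term = smoothness(labels_a) + smoothness(labels_b)
    global_term = np.abs(fg_histogram(img_a, labels_a)
                         - fg_histogram(img_b, labels_b)).sum()
    return lam * mrf_term + mu * global_term

# Toy usage: two 4x4 greyscale images and a shared candidate foreground mask.
rng = np.random.default_rng(2)
img_a, img_b = rng.random((4, 4)), rng.random((4, 4))
mask = np.zeros((4, 4), dtype=int); mask[1:3, 1:3] = 1
print(cosegmentation_energy(mask, mask, img_a, img_b))
```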


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2002

Novelty and redundancy detection in adaptive filtering

Yi Zhang; James P. Callan; Thomas P. Minka

This paper addresses the problem of extending an adaptive information filtering system to make decisions about the novelty and redundancy of relevant documents. It argues that relevance and redundancy should each be modelled explicitly and separately. Five redundancy measures are proposed and evaluated in experiments with and without redundancy thresholds. The experimental results demonstrate that the cosine similarity metric and a redundancy measure based on a mixture of language models are both effective for identifying redundant documents.
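
A minimal sketch of the cosine-similarity redundancy measure, assuming a simple whitespace bag-of-words representation and an arbitrary 0.8 threshold (neither is taken from the paper): a relevant document is flagged as redundant if it is too similar to something already delivered.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def is_redundant(new_doc, delivered, threshold=0.8):
    """Flag a relevant document as redundant if it is too similar to any
    previously delivered document (tokeniser and threshold are assumptions)."""
    new_vec = Counter(new_doc.lower().split())
    return any(cosine(new_vec, Counter(d.lower().split())) >= threshold
               for d in delivered)

delivered = ["new expectation propagation toolkit released by tom minka"]
print(is_redundant("tom minka released new expectation propagation toolkit", delivered))  # True
```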


Multimedia Systems | 1995

Vision texture for annotation

Rosalind W. Picard; Thomas P. Minka

This paper demonstrates a new application of computer vision to digital libraries: the use of texture for annotation, the description of content. Vision-based annotation assists the user in attaching descriptions to large sets of images and video. If a user labels a piece of an image as water, a texture model can be used to propagate this label to other “visually similar” regions. However, a serious problem is that no single model has been found that is good enough to reliably match human perception of similarity in pictures. Rather than using one model, the system described here knows several texture models, and is equipped with the ability to choose the one that “best explains” the regions selected by the user for annotating. If none of these models suffices, then it creates new explanations by combining models. Examples of annotations propagated by the system on natural scenes are given. The system provides an average gain of four to one in label prediction for a set of 98 images.
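
The model-selection-and-propagation loop can be sketched as follows, assuming trivial patch statistics in place of the real texture models and a hypothetical propagate_label helper; the actual system also combines models when no single one suffices, which this sketch omits.

```python
import numpy as np

# Candidate "texture models": each maps an image patch to a feature value.
# These trivial statistics stand in for real texture models (an assumption).
MODELS = {
    "mean":     lambda p: p.mean(),
    "variance": lambda p: p.var(),
    "edges":    lambda p: np.abs(np.diff(p, axis=1)).mean(),
}

def propagate_label(patches, labelled_idx, tol=0.1):
    """Pick the model that best explains the user-labelled patches (smallest
    spread of feature values), then propagate the label to every patch whose
    feature falls within `tol` of the labelled patches' mean feature."""
    best_name, best_feats = min(
        ((name, np.array([f(patches[i]) for i in labelled_idx]))
         for name, f in MODELS.items()),
        key=lambda item: item[1].std())
    f = MODELS[best_name]
    centre = best_feats.mean()
    return best_name, [i for i, p in enumerate(patches)
                       if abs(f(p) - centre) <= tol]

# Toy usage: 8x8 patches; the user labelled patches 0 and 1.
rng = np.random.default_rng(3)
patches = [rng.random((8, 8)) * 0.3, rng.random((8, 8)) * 0.3,
           rng.random((8, 8)) * 0.3 + 0.7, rng.random((8, 8)) * 0.3]
print(propagate_label(patches, labelled_idx=[0, 1]))
```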


Pattern Recognition | 1997

Interactive learning with a “society of models”

Thomas P. Minka; Rosalind W. Picard

Digital library access is driven by features, but the relevance of a feature for a query is not always obvious. This paper describes an approach for integrating a large number of context-dependent features into a semi-automated tool. Instead of requiring universal similarity measures or manual selection of relevant features, the approach provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized features. The selection process is guided by positive and negative examples from the user. The inherent combinatorics of using multiple features is reduced by a multistage grouping generation, weighting, and collection process. The stages closest to the user are trained fastest and slowly propagate their adaptations back to earlier stages, improving overall performance.
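
As an illustration of the grouping-weighting idea (a sketch under an assumed additive weighting scheme, not the paper's multistage algorithm), the snippet below weights each feature-induced grouping by how well it separates the user's positive and negative examples and ranks items by their weighted memberships.

```python
def rank_by_groupings(groupings, positives, negatives):
    """Weight each grouping by how well it separates the user's positive and
    negative examples, then score items by the weighted groupings they belong to.

    groupings : list of sets of item ids (e.g. induced by different features)
    positives, negatives : sets of item ids marked by the user
    """
    weights = [len(g & positives) - len(g & negatives) for g in groupings]
    items = set().union(*groupings)
    scores = {i: sum(w for g, w in zip(groupings, weights) if i in g)
              for i in items}
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage: two feature-induced groupings; the user liked items 1 and 2, disliked 5.
groupings = [{1, 2, 3}, {3, 4, 5}]
print(rank_by_groupings(groupings, positives={1, 2}, negatives={5}))
```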


Computer Vision and Pattern Recognition | 2006

Principled Hybrids of Generative and Discriminative Models

Julia Lasserre; Christopher M. Bishop; Thomas P. Minka

When labelled training data is plentiful, discriminative techniques are widely used since they give excellent generalization performance. However, for large-scale applications such as object recognition, hand labelling of data is expensive, and there is much interest in semi-supervised techniques based on generative models in which the majority of the training data is unlabelled. Although the generalization performance of generative models can often be improved by ‘training them discriminatively’, they can then no longer make use of unlabelled data. In an attempt to gain the benefit of both generative and discriminative approaches, heuristic procedures have been proposed [2, 3] which interpolate between these two extremes by taking a convex combination of the generative and discriminative objective functions. In this paper we adopt a new perspective which says that there is only one correct way to train a given model, and that a ‘discriminatively trained’ generative model is fundamentally a new model [7]. From this viewpoint, generative and discriminative models correspond to specific choices for the prior over parameters. As well as giving a principled interpretation of ‘discriminative training’, this approach opens the door to very general ways of interpolating between generative and discriminative extremes through alternative choices of prior. We illustrate this framework using both synthetic data and a practical example in the domain of multi-class object recognition. Our results show that, when the supply of labelled training data is limited, the optimum performance corresponds to a balance between the purely generative and the purely discriminative extremes.
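
For reference, the heuristic interpolation mentioned above ([2, 3]) takes a convex combination of the two log-likelihoods; the notation below is illustrative. The paper's own formulation instead couples two sets of parameters through a prior, recovering the two extremes as special cases of that prior.

```latex
% Heuristic convex interpolation between the two training objectives
% (alpha = 1: purely discriminative; alpha = 0: purely generative).
\begin{equation*}
  \mathcal{L}(\boldsymbol{\theta})
    = \alpha \sum_{n} \log p(y_n \mid \mathbf{x}_n, \boldsymbol{\theta})
    + (1 - \alpha) \sum_{n} \log p(\mathbf{x}_n, y_n \mid \boldsymbol{\theta}),
  \qquad \alpha \in [0, 1].
\end{equation*}
```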


Computer Vision and Pattern Recognition | 2008

Bayesian color constancy revisited

Peter V. Gehler; Carsten Rother; Andrew Blake; Thomas P. Minka; Toby Sharp

Computational color constancy is the task of estimating the true reflectances of visible surfaces in an image. In this paper we follow a line of research that assumes uniform illumination of a scene, and that the principal step in estimating reflectances is the estimation of the scene illuminant. We review recent approaches to illuminant estimation: firstly, those based on formulae for normalisation of the reflectance distribution in an image (so-called grey-world algorithms), and secondly, those based on a Bayesian formulation of image formation. In evaluating these previous approaches we introduce a new tool in the form of a database of 568 high-quality indoor and outdoor images, accurately labelled with the illuminant and preserved in their raw form, free of correction or normalisation. This has enabled us to establish several properties experimentally. Firstly, automatic selection of grey-world algorithms according to image properties is not nearly as effective as has been thought. Secondly, it is shown that Bayesian illuminant estimation is significantly improved by the improved accuracy of priors for illuminant and reflectance that are obtained from the new dataset.
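
As a concrete example of the grey-world family (a minimal sketch, not any specific algorithm evaluated in the paper), the snippet below estimates the illuminant as the mean image colour and applies a per-channel correction; the brightness normalisation and clipping choices are assumptions.

```python
import numpy as np

def grey_world_correct(image):
    """Grey-world illuminant estimation with a diagonal (per-channel) correction.

    image : float array of shape (H, W, 3) with values in [0, 1]
    Assumes the average reflectance of the scene is achromatic, so the mean
    RGB of the image estimates the colour of the illuminant.
    """
    illuminant = image.reshape(-1, 3).mean(axis=0)        # estimated colour cast
    illuminant = illuminant / illuminant.mean()           # keep overall brightness
    corrected = np.clip(image / illuminant, 0.0, 1.0)     # von Kries-style scaling
    return illuminant, corrected

# Toy usage: a random scene rendered under a reddish illuminant.
rng = np.random.default_rng(1)
scene = rng.random((4, 4, 3)) * np.array([1.0, 0.7, 0.6])
est, fixed = grey_world_correct(scene)
print(est)   # roughly proportional to [1.0, 0.7, 0.6]
```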


Web Search and Data Mining | 2008

SoftRank: optimizing non-smooth rank metrics

Michael J. Taylor; John Guiver; Stephen E. Robertson; Thomas P. Minka

We address the problem of learning large complex ranking functions. Most IR applications use evaluation metrics that depend only upon the ranks of documents. However, most ranking functions generate document scores, which are sorted to produce a ranking. Hence IR metrics are innately non-smooth with respect to the scores, due to the sort. Unfortunately, many machine learning algorithms require the gradient of a training objective in order to perform the optimization of the model parameters, and because IR metrics are non-smooth, we need to find a smooth proxy objective that can be used for training. We present a new family of training objectives that are derived from the rank distributions of documents, induced by smoothed scores. We call this approach SoftRank. We focus on a smoothed approximation to Normalized Discounted Cumulative Gain (NDCG), called SoftNDCG, and we compare it with three other training objectives in the recent literature. We present two main results. First, SoftRank yields a very good way of optimizing NDCG. Second, we show that it is possible to achieve state-of-the-art test-set NDCG results by optimizing a soft NDCG objective on the training set with a different discount function.
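
The smoothing idea can be illustrated by treating each score as a Gaussian random variable and taking the expected NDCG. SoftRank computes this expectation analytically through rank distributions; the Monte Carlo estimate below is only a sketch of the idea, and sigma, the sample count, and the gain/discount conventions are assumptions.

```python
import numpy as np

def ndcg(relevances_in_rank_order):
    """Normalised discounted cumulative gain for one ranked list."""
    gains = 2.0 ** np.asarray(relevances_in_rank_order, dtype=float) - 1.0
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    dcg = (gains * discounts).sum()
    ideal = (np.sort(gains)[::-1] * discounts).sum()
    return dcg / ideal if ideal > 0 else 0.0

def soft_ndcg_mc(scores, relevances, sigma=1.0, n_samples=2000, seed=0):
    """Expected NDCG when each document score is perturbed by Gaussian noise
    (a smooth function of the underlying scores, unlike NDCG itself)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    relevances = np.asarray(relevances)
    total = 0.0
    for _ in range(n_samples):
        noisy = scores + rng.normal(0.0, sigma, size=scores.shape)
        order = np.argsort(-noisy)            # rank by perturbed score
        total += ndcg(relevances[order])
    return total / n_samples

print(soft_ndcg_mc(scores=[2.0, 1.0, 0.5], relevances=[2, 0, 1]))
```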


International Conference on Machine Learning | 2004

Predictive automatic relevance determination by expectation propagation

Yuan Qi; Thomas P. Minka; Rosalind W. Picard; Zoubin Ghahramani

In many real-world classification problems the input contains a large number of potentially irrelevant features. This paper proposes a new Bayesian framework for determining the relevance of input features. This approach extends one of the most successful Bayesian methods for feature selection and sparse learning, known as Automatic Relevance Determination (ARD). ARD finds the relevance of features by optimizing the model marginal likelihood, also known as the evidence. We show that this can lead to overfitting. To address this problem, we propose Predictive ARD based on estimating the predictive performance of the classifier. While the actual leave-one-out predictive performance is generally very costly to compute, the expectation propagation (EP) algorithm proposed by Minka provides an estimate of this predictive performance as a side-effect of its iterations. We exploit this in our algorithm to do feature selection, and to select data points in a sparse Bayesian kernel classifier. Moreover, we provide two other improvements to previous algorithms, by replacing Laplace's approximation with the generally more accurate EP, and by incorporating the fast optimization algorithm proposed by Faul and Tipping. Our experiments show that our method based on the EP estimate of predictive performance is more accurate on test data than relevance determination by optimizing the evidence.
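
In outline (notation illustrative, not taken from the paper): ARD places a zero-mean Gaussian prior on each weight with a per-feature precision, classical ARD picks the precisions by maximising the evidence, and Predictive ARD instead maximises an EP-based estimate of the leave-one-out predictive likelihood, pruning features whose precisions grow large.

```latex
% One precision hyperparameter alpha_i per input feature (notation illustrative).
\begin{align*}
  p(\mathbf{w} \mid \boldsymbol{\alpha})
    &= \prod_{i} \mathcal{N}\!\left(w_i \mid 0,\ \alpha_i^{-1}\right) \\[4pt]
  \text{evidence-based ARD:} \qquad
  \hat{\boldsymbol{\alpha}}
    &= \arg\max_{\boldsymbol{\alpha}}\ p(\mathcal{D} \mid \boldsymbol{\alpha}) \\[4pt]
  \text{Predictive ARD (sketch):} \qquad
  \hat{\boldsymbol{\alpha}}
    &= \arg\max_{\boldsymbol{\alpha}} \sum_{n} \log q_{\backslash n}(y_n \mid \mathbf{x}_n, \boldsymbol{\alpha})
\end{align*}
```

Here q_{\backslash n} stands for the approximate leave-one-out predictive distribution that EP yields for point n as a side-effect of its iterations.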

Collaboration


Dive into Thomas P. Minka's collaboration.

Top Co-Authors

Rosalind W. Picard (Massachusetts Institute of Technology)
Ingemar J. Cox (University College London)
Peter Boatwright (Carnegie Mellon University)