Philip H. S. Torr | Researchain

Publication

Featured research published by Philip H. S. Torr.

International Conference on Computer Vision | 2011

Struck: Structured output tracking with kernels

Sam Hare; Amir Saffari; Philip H. S. Torr

Adaptive tracking-by-detection methods are widely used in computer vision for tracking arbitrary objects. Current approaches treat the tracking problem as a classification task and use online learning techniques to update the object model. However, for these updates to happen one needs to convert the estimated object position into a set of labelled training examples, and it is not clear how best to perform this intermediate step. Furthermore, the objective for the classifier (label prediction) is not explicitly coupled to the objective for the tracker (accurate estimation of object position). In this paper, we present a framework for adaptive visual object tracking based on structured output prediction. By explicitly allowing the output space to express the needs of the tracker, we are able to avoid the need for an intermediate classification step. Our method uses a kernelized structured output support vector machine (SVM), which is learned online to provide adaptive tracking. To allow for real-time application, we introduce a budgeting mechanism which prevents the unbounded growth in the number of support vectors which would otherwise occur during tracking. Experimentally, we show that our algorithm is able to outperform state-of-the-art trackers on various benchmark videos. Additionally, we show that we can easily incorporate additional features and kernels into our framework, which results in increased performance.

Computer Vision and Image Understanding | 2000

MLESAC: A New Robust Estimator with Application to Estimating Image Geometry

Philip H. S. Torr; Andrew Zisserman

A new method is presented for robustly estimating multiple view relations from point correspondences. The method comprises two parts. The first is a new robust estimator MLESAC which is a generalization of the RANSAC estimator. It adopts the same sampling strategy as RANSAC to generate putative solutions, but chooses the solution that maximizes the likelihood rather than just the number of inliers. The second part of the algorithm is a general purpose method for automatically parameterizing these relations, using the output of MLESAC. A difficulty with multiview image relations is that there are often nonlinear constraints between the parameters, making optimization a difficult task. The parameterization method overcomes the difficulty of nonlinear constraints and conducts a constrained optimization. The method is general and its use is illustrated for the estimation of fundamental matrices, image–image homographies, and quadratic transformations. Results are given for both synthetic and real images. It is demonstrated that the method gives results equal or superior to those of previous approaches.
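The difference between counting inliers and maximizing a likelihood can be illustrated with a minimal sketch. It assumes scalar residuals, a Gaussian inlier model mixed with a uniform outlier model, and a fixed mixing weight; the function names and the values of `sigma`, `gamma`, and `outlier_range` are illustrative assumptions, not taken from the paper.

```python
import math


def ransac_score(residuals, threshold):
    """Count residuals inside the inlier threshold (higher is better)."""
    return sum(1 for r in residuals if abs(r) <= threshold)


def mlesac_cost(residuals, sigma, gamma, outlier_range):
    """Negative log-likelihood under an inlier/outlier mixture (lower is better).

    Inliers are modelled as zero-mean Gaussian with standard deviation sigma,
    outliers as uniform over a window of width outlier_range. gamma is the
    assumed inlier fraction, held fixed here for simplicity.
    """
    cost = 0.0
    for r in residuals:
        inlier = gamma * math.exp(-r * r / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
        outlier = (1 - gamma) / outlier_range
        cost -= math.log(inlier + outlier)
    return cost


# Residuals of two hypothetical candidate models on the same five points.
tight = [0.1, -0.2, 0.1, 0.0, 9.0]
loose = [0.9, -0.9, 0.8, -0.9, 9.0]

same_inlier_count = ransac_score(tight, 1.0) == ransac_score(loose, 1.0)
tight_cost = mlesac_cost(tight, sigma=0.5, gamma=0.5, outlier_range=20.0)
loose_cost = mlesac_cost(loose, sigma=0.5, gamma=0.5, outlier_range=20.0)
```

Both candidates have four residuals inside the threshold, so an inlier count cannot separate them, while the likelihood-based cost prefers the candidate whose inliers fit more tightly.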

Computer Vision and Pattern Recognition | 2008

Robust higher order potentials for enforcing label consistency

Pushmeet Kohli; Lubor Ladicky; Philip H. S. Torr

This paper proposes a novel framework for labelling problems which is able to combine multiple segmentations in a principled manner. Our method is based on higher order conditional random fields and uses potentials defined on sets of pixels (image segments) generated using unsupervised segmentation algorithms. These potentials enforce label consistency in image regions and can be seen as a generalization of the commonly used pairwise contrast sensitive smoothness potentials. The higher order potential functions used in our framework take the form of the Robust P^n model and are more general than the P^n Potts model recently proposed by Kohli et al. We prove that the optimal swap and expansion moves for energy functions composed of these potentials can be computed by solving an st-mincut problem. This enables the use of powerful graph cut based move making algorithms for performing inference in the framework. We test our method on the problem of multi-class object segmentation by augmenting the conventional CRF used for object segmentation with higher order potentials defined on image regions. Experiments on challenging data sets show that integration of higher order potentials quantitatively and qualitatively improves results, leading to much better definition of object boundaries. We believe that this method can be used to yield similar improvements for many other labelling problems.
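A rough sketch of how a robust label-consistency potential on a segment differs from a strict Potts-style one is given below. The abstract does not state the formula; the truncated linear form used here, the truncation parameter `q`, and the function names are assumptions made for illustration.

```python
from collections import Counter


def potts_segment_cost(labels, gamma_max):
    """Strict consistency: any disagreement in the segment costs gamma_max."""
    return 0.0 if len(set(labels)) == 1 else gamma_max


def robust_segment_cost(labels, gamma_max, q):
    """Robust consistency (assumed truncated linear form).

    The cost grows linearly with the number of pixels that disagree with
    the dominant label and is capped at gamma_max once q pixels disagree.
    """
    dominant_count = Counter(labels).most_common(1)[0][1]
    disagreeing = len(labels) - dominant_count
    return min(disagreeing * gamma_max / q, gamma_max)


# A hypothetical ten-pixel segment with a single mislabelled pixel.
segment = ["sky"] * 9 + ["tree"]
strict = potts_segment_cost(segment, gamma_max=10.0)
robust = robust_segment_cost(segment, gamma_max=10.0, q=4)
```

With one disagreeing pixel the strict potential already charges the full penalty, whereas the robust form charges only a fraction of it, which tolerates segments that slightly overlap an object boundary.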

International Journal of Computer Vision | 1997

The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix

Philip H. S. Torr; David W. Murray

This paper has two goals. The first is to develop a variety of robust methods for the computation of the Fundamental Matrix, the calibration-free representation of camera motion. The methods are drawn from the principal categories of robust estimators, viz. case deletion diagnostics, M-estimators and random sampling, and the paper develops the theory required to apply them to non-linear orthogonal regression problems. Although a considerable amount of interest has focussed on the application of robust estimation in computer vision, the relative merits of the many individual methods are unknown, leaving the potential practitioner to guess at their value. The second goal is therefore to compare and judge the methods. Comparative tests are carried out using correspondences generated both synthetically in a statistically controlled fashion and from feature matching in real imagery. In contrast with previously reported methods the goodness of fit to the synthetic observations is judged not in terms of the fit to the observations per se but in terms of fit to the ground truth. A variety of error measures are examined. The experiments allow a statistically satisfying and quasi-optimal method to be synthesized, which is shown to be stable with up to 50 percent outlier contamination, and may still be used if there are more than 50 percent outliers. Performance bounds are established for the method, and a variety of robust methods to estimate the standard deviation of the error and covariance matrix of the parameters are examined. The results of the comparison have broad applicability to vision algorithms where the input data are corrupted not only by noise but also by gross outliers.

Computer Vision and Pattern Recognition | 2014

BING: Binarized Normed Gradients for Objectness Estimation at 300fps

Ming-Ming Cheng; Ziming Zhang; Wen-Yan Lin; Philip H. S. Torr

Training a generic objectness measure to produce a small set of candidate object windows has been shown to speed up the classical sliding window object detection paradigm. We observe that generic objects with well-defined closed boundary can be discriminated by looking at the norm of gradients, with a suitable resizing of their corresponding image windows into a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g. ADD, BITWISE SHIFT, etc.). Experiments on the challenging PASCAL VOC 2007 dataset show that our method efficiently (300fps on a single laptop CPU) generates a small set of category-independent, high quality object windows, yielding 96.2% object detection rate (DR) with 1,000 proposals. Increasing the numbers of proposals and color spaces for computing BING features, our performance can be further improved to 99.5% DR.
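The 64D normed-gradient feature described above can be sketched for a window that has already been resized to 8 × 8. The central-difference gradient, the clamped L1 norm, the replicated borders, and the top-bits approximation are assumptions chosen for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
def normed_gradient_feature(window):
    """Return 64 clamped L1 gradient magnitudes for an 8x8 grayscale window."""
    size = 8
    assert len(window) == size and all(len(row) == size for row in window)
    feature = []
    for y in range(size):
        for x in range(size):
            # Central differences, replicating pixels at the window border.
            left = window[y][max(x - 1, 0)]
            right = window[y][min(x + 1, size - 1)]
            up = window[max(y - 1, 0)][x]
            down = window[min(y + 1, size - 1)][x]
            gx = right - left
            gy = down - up
            feature.append(min(abs(gx) + abs(gy), 255))
    return feature


def keep_top_bits(feature, bits=4):
    """Approximate each byte value by its most significant bits."""
    shift = 8 - bits
    return [(v >> shift) << shift for v in feature]


# A hypothetical window with a vertical edge between columns 3 and 4.
window = [[0] * 4 + [200] * 4 for _ in range(8)]
feature = normed_gradient_feature(window)
approx = keep_top_bits(feature, bits=4)
```

Only the two columns next to the edge respond, and the top-bits approximation uses nothing beyond bitwise shifts, which is the kind of operation the abstract says the binarized feature relies on.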

International Conference on Computer Vision | 2015

Conditional Random Fields as Recurrent Neural Networks

Shuai Zheng; Sadeep Jayasumana; Bernardino Romera-Paredes; Vibhav Vineet; Zhizhong Su; Dalong Du; Chang Huang; Philip H. S. Torr

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate Conditional Random Fields with Gaussian pairwise potentials and mean-field approximate inference as Recurrent Neural Networks. This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network that has desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.
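Mean-field inference with a Gaussian pairwise kernel, unrolled for a fixed number of iterations, can be sketched in a few lines. This toy version works on a 1D row of pixels with a single spatial kernel and a Potts compatibility function; the fixed `weight` and `bandwidth`, the 1D layout, and the function names are assumptions for illustration, whereas the abstract describes a network trained end-to-end.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary_scores, positions, weight=1.0, bandwidth=1.0, iterations=5):
    """Run unrolled mean-field updates and return per-pixel label marginals."""
    n = len(unary_scores)
    num_labels = len(unary_scores[0])
    # Gaussian kernel on pixel positions, with no self-connection.
    kernel = [
        [0.0 if i == j else
         weight * math.exp(-((positions[i] - positions[j]) ** 2) / (2 * bandwidth ** 2))
         for j in range(n)]
        for i in range(n)
    ]
    q = [softmax(u) for u in unary_scores]
    for _ in range(iterations):
        new_q = []
        for i in range(n):
            # Message passing: kernel-weighted sum of neighbouring marginals.
            message = [sum(kernel[i][j] * q[j][l] for j in range(n))
                       for l in range(num_labels)]
            # Potts compatibility: label l is penalised by mass on other labels.
            total = sum(message)
            penalty = [total - message[l] for l in range(num_labels)]
            # Add unaries and normalise.
            new_q.append(softmax([unary_scores[i][l] - penalty[l]
                                  for l in range(num_labels)]))
        q = new_q
    return q


# Five pixels: the middle one weakly prefers label 1, its neighbours label 0.
unaries = [[2.0, 0.0], [2.0, 0.0], [0.0, 0.5], [2.0, 0.0], [2.0, 0.0]]
marginals = mean_field(unaries, positions=[0, 1, 2, 3, 4])
```

In this toy case the middle pixel's unary preference for label 1 is overridden by its neighbours, so its marginal ends up favouring label 0. Each pass through the loop repeats the same operations on the previous marginals, which is what allows the iterations to be read as steps of a recurrent network.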

European Conference on Computer Vision | 2004

Interactive Image Segmentation Using an Adaptive GMMRF Model

Andrew Blake; Carsten Rother; Matthew Brown; Patrick Pérez; Philip H. S. Torr

The problem of interactive foreground/background segmentation in still images is of great practical importance in image editing. The state of the art in interactive segmentation is probably represented by the graph cut algorithm of Boykov and Jolly (ICCV 2001). Its underlying model uses both colour and contrast information, together with a strong prior for region coherence. Estimation is performed by solving a graph cut problem for which very efficient algorithms have recently been developed. However, the model depends on parameters which must be set by hand, and the aim of this work is for those constants to be learned from image data.

International Conference on Computer Vision | 2009

Associative hierarchical CRFs for object class image segmentation

Lubor Ladicky; Chris Russell; Pushmeet Kohli; Philip H. S. Torr

Most methods for object class segmentation are formulated as a labelling problem over a single choice of quantisation of an image space: pixels, segments or groups of segments. It is well known that each quantisation has its fair share of pros and cons, and the existence of a common optimal quantisation level suitable for all object categories is highly unlikely. Motivated by this observation, we propose a hierarchical random field model that allows integration of features computed at different levels of the quantisation hierarchy. MAP inference in this model can be performed efficiently using powerful graph cut based move making algorithms. Our framework generalises much of the previous work based on pixels or segments. We evaluate its efficiency on some of the most challenging data-sets for object class segmentation, and show it obtains state-of-the-art results.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Model-based hand tracking using a hierarchical Bayesian filter

Björn Stenger; Arasanathan Thayananthan; Philip H. S. Torr; Roberto Cipolla

This paper sets out a tracking framework, which is applied to the recovery of three-dimensional hand motion from an image sequence. The method handles the issues of initialization, tracking, and recovery in a unified way. In a single input image with no prior information of the hand pose, the algorithm is equivalent to a hierarchical detection scheme, where unlikely pose candidates are rapidly discarded. In image sequences, a dynamic model is used to guide the search and approximate the optimal filtering equations. A dynamic model is given by transition probabilities between regions in parameter space and is learned from training data obtained by capturing articulated motion. The algorithm is evaluated on a number of image sequences, which include hand motion with self-occlusion in front of a cluttered background.

European Conference on Computer Vision | 1996

3D Model Acquisition from Extended Image Sequences

Paul A. Beardsley; Philip H. S. Torr; Andrew Zisserman

A method for matching image primitives through a sequence is described, for the purpose of acquiring 3D geometric models. The method includes a novel robust estimator of the trifocal tensor, based on a minimum number of token correspondences across an image triplet; and a novel tracking algorithm in which corners and line segments are matched over image triplets in an integrated framework. The matching techniques are both robust (detecting and discarding mismatches) and fully automatic.