Network


Latest external collaboration at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Stuart Golodetz is active.

Publication


Featured research published by Stuart Golodetz.


European Conference on Computer Vision | 2016

The Visual Object Tracking VOT2014 Challenge Results

Matej Kristan; Roman P. Pflugfelder; Aleš Leonardis; Jiri Matas; Luka Cehovin; Georg Nebehay; Tomas Vojir; Gustavo Fernández; Alan Lukezic; Aleksandar Dimitriev; Alfredo Petrosino; Amir Saffari; Bo Li; Bohyung Han; CherKeng Heng; Christophe Garcia; Dominik Pangersic; Gustav Häger; Fahad Shahbaz Khan; Franci Oven; Horst Bischof; Hyeonseob Nam; Jianke Zhu; Jijia Li; Jin Young Choi; Jin-Woo Choi; João F. Henriques; Joost van de Weijer; Jorge Batista; Karel Lebeda

Visual tracking has attracted significant attention in the last few decades. The recent surge in the number of publications on tracking-related problems has made it almost impossible to follow the developments in the field. One of the reasons is that there is a lack of commonly accepted annotated data-sets and standardized evaluation protocols that would allow objective comparison of different tracking methods. To address this issue, the Visual Object Tracking (VOT) workshop was organized in conjunction with ICCV2013. Researchers from academia as well as industry were invited to participate in the first VOT2013 challenge which aimed at single-object visual trackers that do not apply pre-learned models of object appearance (model-free). Presented here is the VOT2013 benchmark dataset for evaluation of single-object visual trackers as well as the results obtained by the trackers competing in the challenge. In contrast to related attempts in tracker benchmarking, the dataset is labeled per-frame by visual attributes that indicate occlusion, illumination change, motion change, size change and camera motion, offering a more systematic comparison of the trackers. Furthermore, we have designed an automated system for performing and evaluating the experiments. We present the evaluation protocol of the VOT2013 challenge and the results of a comparison of 27 trackers on the benchmark dataset. The dataset, the evaluation tools and the tracker rankings are publicly available from the challenge website (http://votchallenge.net).
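
The per-frame attribute labels are what enable attribute-specific comparison of trackers. As a minimal sketch (in Python, not the actual VOT toolkit), the helper below computes mean bounding-box overlap broken down by attribute; the function names and the (x, y, w, h) box convention are assumptions for illustration.

```python
import numpy as np

def overlap(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    ix = max(0.0, min(xa + wa, xb + wb) - max(xa, xb))
    iy = max(0.0, min(ya + ha, yb + hb) - max(ya, yb))
    inter = ix * iy
    union = wa * ha + wb * hb - inter
    return inter / union if union > 0 else 0.0

def per_attribute_accuracy(pred_boxes, gt_boxes, frame_attributes):
    """Mean overlap per visual attribute (e.g. 'occlusion', 'camera_motion').

    frame_attributes[i] is the set of attributes annotated for frame i.
    """
    scores = {}
    for i, (p, g) in enumerate(zip(pred_boxes, gt_boxes)):
        for attr in frame_attributes[i]:
            scores.setdefault(attr, []).append(overlap(p, g))
    return {attr: float(np.mean(v)) for attr, v in scores.items()}
```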


Computer Vision and Pattern Recognition | 2016

Staple: Complementary Learners for Real-Time Tracking

Luca Bertinetto; Jack Valmadre; Stuart Golodetz; Ondrej Miksik; Philip H. S. Torr

Correlation Filter-based trackers have recently achieved excellent performance, showing great robustness to challenging situations exhibiting motion blur and illumination changes. However, since the model that they learn depends strongly on the spatial layout of the tracked object, they are notoriously sensitive to deformation. Models based on colour statistics have complementary traits: they cope well with variation in shape, but suffer when illumination is not consistent throughout a sequence. Moreover, colour distributions alone can be insufficiently discriminative. In this paper, we show that a simple tracker combining complementary cues in a ridge regression framework can operate faster than 80 FPS and outperform not only all entries in the popular VOT14 competition, but also recent and far more sophisticated trackers according to multiple benchmarks.
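
The core idea is that the template-based and colour-based cues produce per-location scores that can simply be fused before picking the new target position. The sketch below illustrates that fusion with a weighted sum of two normalised response maps; the merge factor and the normalisation are placeholders for illustration, not the published Staple settings.

```python
import numpy as np

def fuse_responses(cf_response, colour_response, merge_factor=0.3):
    """Combine a correlation-filter response map (template cue) with a
    per-pixel colour-histogram score map (colour cue) by a weighted sum,
    then return the location of the fused maximum as the new target position.
    """
    cf = (cf_response - cf_response.min()) / (np.ptp(cf_response) + 1e-12)
    col = (colour_response - colour_response.min()) / (np.ptp(colour_response) + 1e-12)
    fused = (1.0 - merge_factor) * cf + merge_factor * col
    return np.unravel_index(np.argmax(fused), fused.shape)
```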


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Struck: Structured Output Tracking with Kernels

Sam Hare; Stuart Golodetz; Amir Saffari; Vibhav Vineet; Ming-Ming Cheng; Stephen L. Hicks; Philip H. S. Torr

Adaptive tracking-by-detection methods are widely used in computer vision for tracking arbitrary objects. Current approaches treat the tracking problem as a classification task and use online learning techniques to update the object model. However, for these updates to happen one needs to convert the estimated object position into a set of labelled training examples, and it is not clear how best to perform this intermediate step. Furthermore, the objective for the classifier (label prediction) is not explicitly coupled to the objective for the tracker (estimation of object position). In this paper, we present a framework for adaptive visual object tracking based on structured output prediction. By explicitly allowing the output space to express the needs of the tracker, we avoid the need for an intermediate classification step. Our method uses a kernelised structured output support vector machine (SVM), which is learned online to provide adaptive tracking. To allow our tracker to run at high frame rates, we (a) introduce a budgeting mechanism that prevents the unbounded growth in the number of support vectors that would otherwise occur during tracking, and (b) show how to implement tracking on the GPU. Experimentally, we show that our algorithm is able to outperform state-of-the-art trackers on various benchmark videos. Additionally, we show that we can easily incorporate additional features and kernels into our framework, which results in increased tracking performance.
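
The budgeting mechanism is what keeps the online SVM fast: whenever the support-vector set exceeds a fixed size, the vector whose removal perturbs the learned weights least is discarded. The sketch below illustrates that idea for a linear-kernel simplification; Struck itself uses a kernelised criterion and a more careful rebalancing step, so this is illustrative only.

```python
import numpy as np

def enforce_budget(support_vectors, betas, budget):
    """Keep at most `budget` support vectors by repeatedly discarding the one
    whose removal changes the weight vector least (approximated here, for a
    linear kernel, by |beta_i|^2 * <x_i, x_i>).
    """
    sv = list(support_vectors)
    b = list(betas)
    while len(sv) > budget:
        costs = [abs(bi) ** 2 * float(np.dot(xi, xi)) for xi, bi in zip(sv, b)]
        worst = int(np.argmin(costs))
        del sv[worst]
        del b[worst]
    return sv, b
```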


Frontiers in Psychology | 2015

Imagining the impossible before breakfast: the relation between creativity, dissociation, and sleep

Dalena van Heugten-van der Kloet; Jan Cosgrave; Harald Merckelbach; Ross Haines; Stuart Golodetz; Steven Jay Lynn

Dissociative symptoms have been related to higher rapid eye movement sleep density, a sleep phase during which hyperassociativity may occur. This may enhance artistic creativity during the day. To test this hypothesis, we conducted a creative photo contest to explore the relation between dissociation, sleep, and creativity. During the contest, participants (N = 72) took one photo per day for five consecutive days, based on specific daily themes (consisting of single words) and the instruction to take as creative a photo as possible each day. Furthermore, they completed daily measures of state dissociation and a short sleep diary. The photos and their captions were ranked by two professional photographers and two clinical psychologists based on creativity, originality, bizarreness, and quality. We expected that dissociative people would rank higher in the contest compared with low-dissociative participants, and that the most original photos would be taken on days when the participants scored highest on acute dissociation. We found that acute dissociation predicted a higher ranking on creativity. Poorer sleep quality and fewer hours of sleep predicted more bizarreness in the photos and captions. None of the trait measures could predict creativity. In sum, acute dissociation related to enhanced creativity. These findings contribute to our understanding of dissociative symptomatology.


Computer Vision and Pattern Recognition | 2017

On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation

Tommaso Cavallari; Stuart Golodetz; Nicholas A. Lord; Julien P. C. Valentin; Luigi Di Stefano; Philip H. S. Torr

Camera relocalisation is an important problem in computer vision, with applications in simultaneous localisation and mapping, virtual/augmented reality and navigation. Common techniques either match the current image against keyframes with known poses coming from a tracker, or establish 2D-to-3D correspondences between keypoints in the current image and points in the scene in order to estimate the camera pose. Recently, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but must be trained offline on the target scene, preventing relocalisation in new environments. In this paper, we show how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. Our adapted forests achieve relocalisation performance that is on par with that of offline forests, and our approach runs in under 150ms, making it desirable for real-time systems that require online relocalisation.
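
The adaptation idea is that the split structure of a forest trained on one scene transfers to another, while the leaf predictions do not: the leaves are emptied and refilled with correspondences from the new scene. The toy class below sketches that scheme; the `leaf_id` interface and the class structure are hypothetical, not the authors' implementation.

```python
from collections import defaultdict

class AdaptedForestSketch:
    """Toy illustration of on-the-fly adaptation: the trees' split functions are
    kept from pre-training, and only the leaf contents are replaced with
    2D-to-3D correspondences gathered from the new scene."""

    def __init__(self, pretrained_trees):
        self.trees = pretrained_trees          # each tree maps a feature to a leaf id
        self.leaf_points = defaultdict(list)   # (tree_index, leaf_id) -> 3D scene points

    def adapt(self, features, scene_points):
        """Refill leaves using labelled examples from frames of the new scene."""
        for f, p in zip(features, scene_points):
            for t, tree in enumerate(self.trees):
                self.leaf_points[(t, tree.leaf_id(f))].append(p)

    def predict(self, feature):
        """Gather candidate scene points for one pixel (e.g. for RANSAC pose estimation)."""
        candidates = []
        for t, tree in enumerate(self.trees):
            candidates.extend(self.leaf_points.get((t, tree.leaf_id(feature)), []))
        return candidates
```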


Pattern Recognition | 2014

Two Tree-Based Methods for the Waterfall

Stuart Golodetz; C. Nicholls; Irina Voiculescu; Stephen Cameron

The waterfall transform is a hierarchical segmentation technique based on the watershed transform from the field of mathematical morphology. Watershed-based techniques are useful in numerous fields ranging from image segmentation to cell-and-portal generation for games. The waterfall helps mitigate the problem of over-segmentation that commonly occurs when applying the basic watershed transform. It can also be used as a core part of a method for constructing image partition forests, a tree-based, multi-scale representation of an image. The best existing method for the waterfall is fast and effective, but our experience has been that it is not as straightforward to implement as might be desired. Furthermore, it does not deal consistently with the issue of non-minimal plateaux. This paper therefore proposes two new tree-based methods for the waterfall. Both are easier to implement than the existing state-of-the-art, and in our implementations, both were faster by a constant factor. The Simplified Waterfall (SW) method focuses on simplicity and ease of implementation; the Balanced Waterfall (BW) method focuses on robust handling of non-minimal plateaux. We perform experiments on both 2D and 3D images to contrast the new methods with each other and with the existing state-of-the-art, and show that both achieve a noticeable speed-up whilst producing similar results.
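
To give a flavour of waterfall-style merging on a tree, the sketch below applies a single simplified pass over the minimum spanning tree of a region adjacency graph, keeping as boundaries only the edges that are local maxima among their neighbours. This is a toy criterion for illustration; it is not the SW or BW algorithm proposed in the paper.

```python
def waterfall_merge_sketch(mst_edges):
    """One simplified waterfall pass over a minimum spanning tree of the region
    adjacency graph: an edge survives as a boundary only if its weight is a
    local maximum among the edges incident to its two endpoints; all other
    edges are returned as merges.

    mst_edges: list of (region_a, region_b, weight) tuples.
    """
    incident = {}
    for a, b, w in mst_edges:
        incident.setdefault(a, []).append(w)
        incident.setdefault(b, []).append(w)

    merges = []
    for a, b, w in mst_edges:
        is_local_max = w >= max(incident[a]) and w >= max(incident[b])
        if not is_local_max:
            merges.append((a, b))
    return merges
```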


Conference on Soft Computing as Transdisciplinary Science and Technology | 2008

Region analysis of abdominal CT scans using image partition forests

Stuart Golodetz; Irina Voiculescu; Stephen Cameron

The segmentation of medical scans (CT, MRI, etc.) and the subsequent identification of key features therein, such as organs and tumours, is an important precursor to many medical imaging applications. It is a difficult problem, not least because of the extent to which the shapes of organs can vary from one image to the next. One interesting approach is to start by partitioning the image into a region hierarchy, in which each node represents a contiguous region of the image. This is a well-known approach in the literature: the resulting hierarchy is variously referred to as a partition tree, an image tree, or a semantic segmentation tree. Such trees summarise the image information in a helpful way, and allow efficient searches for regions which satisfy certain criteria. However, once built, the hierarchy tends to be static, making the results very dependent on the initial tree construction process (which, in the case of medical images, is done independently of any anatomical knowledge we might wish to bring to bear). In this paper, we describe our approach to the automatic feature identification problem, in particular explaining why modifying the hierarchy at a later stage can be useful, and how it can be achieved. We illustrate the efficacy of our method with some preliminary results showing the automatic identification of ribs.
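
One practical benefit of such a hierarchy is that regions satisfying simple criteria can be found by walking the tree. The sketch below shows a generic depth-first query over a region hierarchy; the `children`, `area` and `mean_grey` attributes and the example thresholds are hypothetical, purely for illustration.

```python
def find_regions(root, predicate):
    """Depth-first search of a region hierarchy for nodes matching a predicate.

    Each node is assumed to expose a `children` list plus per-region statistics
    such as `area` and `mean_grey` (hypothetical attribute names).
    """
    matches, stack = [], [root]
    while stack:
        node = stack.pop()
        if predicate(node):
            matches.append(node)
        stack.extend(node.children)
    return matches

# e.g. candidate regions selected by size and intensity range (illustrative thresholds):
# candidates = find_regions(root, lambda n: 500 < n.area < 5000 and 30 < n.mean_grey < 70)
```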


British Machine Vision Conference | 2015

Joint Object-Material Category Segmentation from Audio-Visual Cues.

Anurag Arnab; Michael Sapienza; Stuart Golodetz; Julien P. C. Valentin; Ondrej Miksik; Shahram Izadi; Philip H. S. Torr

It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials. In this paper, we therefore present an approach that augments the available dense visual cues with sparse auditory cues in order to estimate dense object and material labels. Since estimates of object class and material properties are mutually informative, we optimise our multi-output labelling jointly using a random-field framework. We evaluate our system on a new dataset with paired visual and auditory data that we make publicly available. We demonstrate that this joint estimation of object and material labels significantly outperforms the estimation of either category in isolation.
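
To make the notion of joint optimisation concrete, the sketch below writes down a simplified labelling energy in which visual unaries for objects and materials, sparse auditory unaries for materials, and an object-material compatibility term are summed. It only scores a given labelling rather than optimising a dense random field, and all names and weights are assumptions for illustration.

```python
import numpy as np

def joint_energy(obj_labels, mat_labels, visual_unary_obj, visual_unary_mat,
                 audio_unary_mat, compatibility, audio_weight=1.0, joint_weight=1.0):
    """Evaluate a simplified joint labelling energy over pixels i:

        E = sum_i [ U_obj(i, o_i) + U_mat(i, m_i)
                    + audio_weight * A_mat(i, m_i)
                    + joint_weight * C(o_i, m_i) ],

    where C penalises implausible object/material pairs. Unary tables are
    (num_pixels x num_labels) arrays; labels are integer arrays.
    """
    idx = np.arange(len(obj_labels))
    e = visual_unary_obj[idx, obj_labels].sum()
    e += visual_unary_mat[idx, mat_labels].sum()
    e += audio_weight * audio_unary_mat[idx, mat_labels].sum()
    e += joint_weight * compatibility[obj_labels, mat_labels].sum()
    return float(e)
```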


Proceedings of the 6th International Symposium on Image and Signal Processing and Analysis | 2009

Automatic spine identification in abdominal CT slices using image partition forests

Stuart Golodetz; Irina Voiculescu; Stephen Cameron

The identification of key features (e.g. organs and tumours) in medical scans (CT, MRI, etc.) is a vital first step in many other image analysis applications, but it is by no means easy to identify such features automatically. Using statistical properties of image regions alone, it is not always possible to distinguish between different features with overlapping greyscale distributions. To do so, it helps to make use of additional knowledge that may have been acquired (e.g. from a medic) about a patient's anatomy. One important form this external knowledge can take is localization information: this allows a program to narrow down its search to a particular region of the image, or to decide how likely a feature candidate is to be correct (e.g. it would be worrisome were the aorta identified as running through the middle of a kidney). To make use of this information, however, it is necessary to identify a suitable frame of reference in which it can be specified. This frame should ideally be based on rigid structures, e.g. the spine and ribs. In this paper, we present a method for automatically identifying cross-sections of the spine in image partition forests of axial abdominal CT slices as a first step towards defining a robust coordinate system for localization.
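
As a hypothetical illustration of how a spine-based frame of reference might be used downstream, the helper below scores whether a candidate region lies within an expected offset of the spine cross-section. The paper establishes the reference frame rather than prescribing this particular test, so the criterion and names are assumptions.

```python
def plausible_given_spine(region_centroid, spine_centre, expected_offset, tolerance):
    """Return True if a candidate region's centroid lies within `tolerance`
    (in pixels) of its expected offset from the spine cross-section, which
    serves as the origin of a patient-specific coordinate frame."""
    dx = region_centroid[0] - spine_centre[0]
    dy = region_centroid[1] - spine_centre[1]
    ex, ey = expected_offset
    return ((dx - ex) ** 2 + (dy - ey) ** 2) ** 0.5 <= tolerance
```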


Computer Vision and Pattern Recognition | 2017

Straight to Shapes: Real-Time Detection of Encoded Shapes

Saumya Jetley; Michael Sapienza; Stuart Golodetz; Philip H. S. Torr

Current object detection approaches predict bounding boxes that provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to regress directly to object shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher-order concepts such as view similarity, pose variation and occlusion. To achieve this, we use a denoising convolutional auto-encoder to learn a low-dimensional shape embedding space. We place the decoder network after a fast end-to-end deep convolutional network that is trained to regress directly to the shape vectors provided by the auto-encoder. This yields what is, to the best of our knowledge, the first real-time shape prediction network, running at ~35 FPS on a high-end desktop. With higher-order shape reasoning well-integrated into the network pipeline, the network shows the useful practical quality of generalising to unseen categories that are similar to the ones in the training set, something that most existing approaches fail to handle.
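
Conceptually, the detector outputs a low-dimensional shape vector for each detection, and the auto-encoder's decoder turns that vector back into an instance mask. The sketch below shows that decoding step with a generic `decoder` callable standing in for the decoder network; the 20-D embedding and 64x64 mask size are assumed values, not the paper's.

```python
import numpy as np

def decode_detections(detections, decoder, embedding_dim=20, mask_size=(64, 64)):
    """For each detection (box, score, shape embedding), decode the
    low-dimensional embedding into a binary instance mask to accompany the box.

    `decoder` is any callable mapping an embedding vector to a flat array of
    mask_size[0] * mask_size[1] occupancy probabilities.
    """
    results = []
    for box, score, embedding in detections:
        assert embedding.shape == (embedding_dim,)
        mask = np.asarray(decoder(embedding)).reshape(mask_size) > 0.5
        results.append({"box": box, "score": score, "mask": mask})
    return results
```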

Collaboration


Dive into Stuart Golodetz's collaboration.
