Eric Sommerlade
University of Oxford
Publications
Featured research published by Eric Sommerlade.
international conference on computer vision | 2009
David Ellis; Eric Sommerlade; Ian D. Reid
We propose a non-parametric model for pedestrian motion based on Gaussian Process regression, in which trajectory data are modelled by regressing relative motion against current position. We show how the underlying model can be learned in an unsupervised fashion, demonstrating this on two databases collected from static surveillance cameras. We furthermore exemplify the use of the model for prediction, comparing the recently proposed GP-Bayesfilters with a Monte Carlo method. We illustrate the benefit of this approach for long-term motion prediction, where parametric models such as Kalman filters would perform poorly.
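The core of the approach above - regressing relative displacement against current position with a Gaussian Process - can be sketched in a few lines. This is an illustrative toy (RBF kernel, naive dense solve, class and function names of our own choosing), not the authors' implementation:

```python
import math

def rbf(x, y, ell=1.0):
    """Squared-exponential kernel on 2-D positions."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * d2 / ell ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

class MotionGP:
    """Regress relative motion (dx, dy) against current position (x, y)."""
    def __init__(self, positions, deltas, noise=1e-3):
        self.X = positions
        K = [[rbf(a, b) + (noise if i == j else 0.0)
              for j, b in enumerate(positions)] for i, a in enumerate(positions)]
        # One weight vector per output dimension (dx and dy).
        self.alpha = [solve(K, [d[k] for d in deltas]) for k in range(2)]

    def predict(self, x):
        ks = [rbf(x, xi) for xi in self.X]
        return tuple(sum(k * a for k, a in zip(ks, al)) for al in self.alpha)
```

Training on a straight trajectory and querying a point on it should recover the observed relative motion; long-horizon prediction would iterate `predict` (the paper compares GP-Bayesfilters against a Monte Carlo rollout for this step).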
computer vision and pattern recognition | 2008
Eric Sommerlade; Ian D. Reid
Studies support the need for high resolution imagery to identify persons in surveillance videos. However, the use of telephoto lenses sacrifices a wider field of view and thereby increases the uncertainty about other, possibly more interesting events in the scene. Using zoom lenses offers the possibility of enjoying the benefits of both wide field of view and high resolution, but not simultaneously. We approach this problem of balancing these finite imaging resources - or of exploration vs exploitation - using an information-theoretic approach. We argue that the camera parameters - pan, tilt and zoom - should be set to maximise information gain, or equivalently to minimise the conditional entropy of the scene model, which comprises multiple targets and an as yet unobserved one. The information content of the former is supplied directly by the uncertainties computed using a Kalman filter tracker, while the latter is modelled using a "background" Poisson process whose parameters are learned from extended scene observations; together these yield an entropy for the scene. We support our argument with quantitative and qualitative analyses in simulated and real-world environments, demonstrating that this approach yields sensible exploration behaviours in which the camera alternates between obtaining close-up views of the targets and paying attention to the background, especially to areas of known high activity.
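The parameter selection described above can be caricatured as scoring each candidate camera setting by its expected information gain and picking the best. The sketch below is a heavily simplified stand-in: 1-D fields of view, scalar Kalman variances, and a crude placeholder for the Poisson background term; all names and numbers are illustrative:

```python
import math

def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

def expected_gain(setting, targets, bg_rate):
    """Information gain for one candidate camera setting.

    setting: dict with "fov" (1-D interval) and "coverage" (fraction of scene).
    targets: list of (position, prior_var, post_var_if_observed) tuples,
             i.e. Kalman uncertainties before and after an observation.
    bg_rate: arrival rate of new targets; the background term here is a
             crude placeholder proportional to the covered area.
    """
    lo, hi = setting["fov"]
    gain = 0.0
    for pos, prior, post in targets:
        if lo <= pos <= hi:  # target falls inside this field of view
            gain += gaussian_entropy(prior) - gaussian_entropy(post)
    gain += bg_rate * setting["coverage"]
    return gain

def best_setting(settings, targets, bg_rate):
    """Greedy choice: the setting maximising expected information gain."""
    return max(settings, key=lambda s: expected_gain(s, targets, bg_rate))
```

With a high-uncertainty target in view, a setting that observes it beats one that only watches empty background, which is the exploration/exploitation trade-off the paper formalises.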
international conference on distributed smart cameras | 2009
Nicola Bellotto; Eric Sommerlade; Ben Benfold; Charles Bibby; Ian D. Reid; Daniel Roth; Charles Fernandez; Luc Van Gool; Jordi Gonzàlez
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database.
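The central-repository pattern - each camera process publishing asynchronously to an SQL database that other processes poll - can be sketched with SQLite. The table layout and function names here are our own illustration, not the system's actual schema:

```python
import sqlite3
import time

def open_repo(path=":memory:"):
    """Open the shared repository and ensure the detections table exists."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS detections (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        camera TEXT, ts REAL, x REAL, y REAL)""")
    return db

def publish(db, camera, x, y):
    """Called by a camera process: append one detection (asynchronous write)."""
    db.execute("INSERT INTO detections (camera, ts, x, y) VALUES (?, ?, ?, ?)",
               (camera, time.time(), x, y))
    db.commit()

def poll_since(db, last_id):
    """Called by the supervisor: fetch all detections newer than last_id."""
    cur = db.execute(
        "SELECT id, camera, x, y FROM detections WHERE id > ? ORDER BY id",
        (last_id,))
    return cur.fetchall()
```

The monotonically increasing row id doubles as a cursor, so each consumer tracks its own read position and no producer ever blocks on a reader - the "simple and effective" decoupling the abstract refers to.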
international conference on robotics and automation | 2010
Eric Sommerlade; Ian D. Reid
In this work we present a consistent probabilistic approach to control multiple, but diverse pan-tilt-zoom cameras concertedly observing a scene. There are disparate goals to this control: the cameras are not only to react to objects moving about, arbitrating conflicting interests of target resolution and trajectory accuracy, they are also to anticipate the appearance of new targets.
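Arbitrating conflicting interests among several cameras can be framed as an assignment problem: each camera takes the target for which it expects the highest utility (resolution, trajectory accuracy, or anticipation of new targets). A brute-force sketch, illustrative only - the paper's probabilistic objective is richer, and this version assumes at least as many targets as cameras:

```python
from itertools import permutations

def best_assignment(utility):
    """Exhaustively search camera-to-target assignments.

    utility[c][t]: expected value of camera c observing target t.
    Returns (assignment, total), where assignment[c] is camera c's target.
    """
    n_cams = len(utility)
    n_tgts = len(utility[0])
    best, best_val = None, float("-inf")
    for perm in permutations(range(n_tgts), n_cams):
        val = sum(utility[c][t] for c, t in enumerate(perm))
        if val > best_val:
            best, best_val = perm, val
    return best, best_val
```

Exhaustive search is only viable for a handful of cameras; the point is that the cameras are scored jointly rather than greedily per camera.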
Computer Vision and Image Understanding | 2012
Nicola Bellotto; Ben Benfold; Hanno Harland; Hans-Hellmut Nagel; Nicola Pirlo; Ian D. Reid; Eric Sommerlade; Chuan Zhao
Cognitive visual tracking is the process of observing and understanding the behavior of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision.
indian conference on computer vision, graphics and image processing | 2014
Makarand Tapaswi; Omkar M. Parkhi; Esa Rahtu; Eric Sommerlade; Rainer Stiefelhagen; Andrew Zisserman
The goal of this paper is unsupervised face clustering in edited video material – where face tracks arising from different people are assigned to separate clusters, with one cluster for each person. In particular we explore the extent to which faces can be clustered automatically without making an error. This is a very challenging problem given the variation in pose, lighting and expressions that can occur, and the similarities between different people. The novelty we bring is threefold: first, we show that a form of weak supervision is available from the editing structure of the material – the shots, threads and scenes that are standard in edited video; second, we show that by first clustering within scenes the number of face tracks can be significantly reduced with almost no errors; third, we propose an extension of the clustering method to entire episodes using exemplar SVMs based on the negative training data automatically harvested from the editing structure. The method is demonstrated on multiple episodes from two very different TV series, Scrubs and Buffy. For both series it is shown that we move towards our goal, and also outperform a number of baselines from previous work.
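The within-scene clustering step can be sketched as agglomerative clustering with cannot-link constraints harvested from the editing structure: face tracks that co-occur in a shot must belong to different people. The distances, complete linkage and threshold below are our own illustrative choices, not the paper's actual settings:

```python
def cluster_tracks(dist, cannot_link, thresh):
    """Greedy agglomerative clustering with cannot-link constraints.

    dist: symmetric matrix of track-to-track distances.
    cannot_link: set of (i, j) track pairs known to be different people
                 (e.g. they appear together in the same shot).
    thresh: conservative merge threshold, to cluster "without making an error".
    """
    clusters = [{i} for i in range(len(dist))]

    def violates(a, b):
        return any((i, j) in cannot_link or (j, i) in cannot_link
                   for i in a for j in b)

    def linkage(a, b):
        # Complete linkage: merge only if ALL cross-pairs are close.
        return max(dist[i][j] for i in a for j in b)

    merged = True
    while merged:
        merged = False
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                if violates(clusters[x], clusters[y]):
                    continue
                d = linkage(clusters[x], clusters[y])
                if d < thresh and (best is None or d < best[0]):
                    best = (d, x, y)
        if best:
            _, x, y = best
            clusters[x] |= clusters[y]
            del clusters[y]
            merged = True
    return clusters
```

The cannot-link pairs are exactly the free negative data the paper later feeds to exemplar SVMs for episode-level clustering.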
british machine vision conference | 2008
Michael D. Breitenstein; Eric Sommerlade; Bastian Leibe; Luc Van Gool; Ian D. Reid
We present an online learning approach for robustly combining unreliable observations from a pedestrian detector to estimate the rough 3D scene geometry from video sequences of a static camera. Our approach is based on an entropy modelling framework, which allows us to simultaneously adapt the detector parameters such that the expected information gain about the scene structure is maximised. As a result, our approach automatically restricts the detector scale range for each image region as the estimation results become more confident, thus improving detector run-time and limiting false positives.
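Restricting the detector's scale range per image region as confidence grows can be sketched as a discrete Bayesian update over person scales, then keeping only the scales that cover most of the posterior mass. This toy assumes a Gaussian detection likelihood; all parameters are illustrative:

```python
import math

def update(posterior, observed_scale, sigma=0.5):
    """One (noisy) detection at observed_scale refines the scale posterior."""
    post = {s: p * math.exp(-0.5 * ((s - observed_scale) / sigma) ** 2)
            for s, p in posterior.items()}
    z = sum(post.values())
    return {s: p / z for s, p in post.items()}

def active_range(posterior, mass=0.95):
    """Smallest high-probability set of scales; the detector scans only these."""
    items = sorted(posterior.items(), key=lambda kv: -kv[1])
    acc, keep = 0.0, []
    for s, p in items:
        keep.append(s)
        acc += p
        if acc >= mass:
            break
    return min(keep), max(keep)
```

Starting from a uniform posterior over eight scales, a couple of consistent detections collapse the active range to a single scale, which is the mechanism that cuts run-time and false positives in the abstract.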
international conference on robotics and automation | 2011
Christopher Mei; Eric Sommerlade; Gabe Sibley; Paul Newman; Ian D. Reid
Understanding and analysing video data from static or mobile surveillance cameras often requires knowledge of the scene and the camera placement. In this article, we provide a way to simplify the user's task of understanding the scene by rendering the camera view as if observed from the user's perspective, estimating the user's position using a real-time visual SLAM system. Augmenting the view in this way is referred to as hidden view synthesis. Compared to previous work, the current approach simplifies the setup and requires minimal user input. This is achieved by building a map of the environment using a visual SLAM system and then registering the surveillance camera in this map. By exploiting the map, a different moving camera can render hidden views in real time at 30 Hz. We discuss some of the challenges remaining for full automation. Results are shown in an indoor environment for surveillance applications and outdoors with application to improved safety in transport.
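Once the surveillance camera is registered in the SLAM map, re-rendering what it sees from the user's pose reduces to composing rigid transforms through the shared map frame. A 2-D toy version (the real system works with full 6-DoF poses and camera projection; poses here are (x, y, heading) tuples of our own choosing):

```python
import math

def world_to_cam(pose, pt):
    """Express a map point in a camera's local frame (2-D rigid transform)."""
    x, y, th = pose
    dx, dy = pt[0] - x, pt[1] - y
    c, s = math.cos(-th), math.sin(-th)
    return (c * dx - s * dy, s * dx + c * dy)

def reproject(surv_pose, user_pose, pt_in_surv):
    """Map a point seen by the surveillance camera into the user's frame.

    Both poses are expressed in the shared SLAM map frame, which is what
    registering the surveillance camera in the map buys us.
    """
    x, y, th = surv_pose
    c, s = math.cos(th), math.sin(th)
    world = (x + c * pt_in_surv[0] - s * pt_in_surv[1],
             y + s * pt_in_surv[0] + c * pt_in_surv[1])
    return world_to_cam(user_pose, world)
```

The same composition runs per frame as the user's camera moves, which is why the rendering can keep up at frame rate once the one-off registration is done.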
international conference on computer vision | 2010
Ian D. Reid; Ben Benfold; Alonso Patron; Eric Sommerlade
The central tenet of this paper is that by determining where people are looking, other tasks involved with understanding and interrogating a scene are simplified. To this end we describe a fully automatic method to determine a person's attention based on real-time visual tracking of their head and a coarse classification of their head pose. We estimate the head pose, or coarse gaze, using randomised ferns with decision branches based on both histograms of gradient orientations and colour based features. We use the coarse gaze for three applications to demonstrate its value: (i) we show how by building static and temporally varying maps of areas where people look we are able to identify interesting regions; (ii) we show how by determining the gaze of people in the scene we can more effectively control a multi-camera surveillance system to acquire faces for identification; (iii) we show how by identifying where people are looking we can more effectively classify human interactions.
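Randomised ferns classify by combining many small sets of binary feature comparisons under a semi-naive Bayes assumption. A minimal sketch, with made-up feature vectors standing in for the gradient-orientation and colour features the paper uses:

```python
import math
import random

class Fern:
    """A fern: a few random pairwise feature comparisons whose joint binary
    outcome indexes a leaf holding per-class counts."""
    def __init__(self, n_tests, n_features, n_classes, rng):
        self.tests = [(rng.randrange(n_features), rng.randrange(n_features))
                      for _ in range(n_tests)]
        # Laplace-smoothed class counts, one row per leaf.
        self.counts = [[1.0] * n_classes for _ in range(2 ** n_tests)]

    def leaf(self, feats):
        idx = 0
        for a, b in self.tests:
            idx = (idx << 1) | (1 if feats[a] > feats[b] else 0)
        return idx

    def train(self, feats, label):
        self.counts[self.leaf(feats)][label] += 1.0

    def posterior(self, feats):
        row = self.counts[self.leaf(feats)]
        z = sum(row)
        return [c / z for c in row]

def classify(ferns, feats):
    """Semi-naive Bayes: sum log-posteriors across independent ferns."""
    n_classes = len(ferns[0].counts[0])
    score = [0.0] * n_classes
    for f in ferns:
        for k, p in enumerate(f.posterior(feats)):
            score[k] += math.log(p)
    return max(range(n_classes), key=lambda k: score[k])
```

Each fern is cheap (a handful of comparisons plus a table lookup), which is what makes the head-pose classification fast enough to run inside a real-time tracker.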
international conference on robotics and automation | 2011
Eric Sommerlade; Ben Benfold; Ian D. Reid
Face recognition in surveillance situations usually requires high resolution face images to be captured from remote active cameras. Since the recognition accuracy is typically a function of the face direction - with frontal faces more likely to lead to reliable recognition - we propose a system which optimises the capturing of such images by using coarse gaze estimates from a static camera. By considering the potential information gain from observing each target, our system automatically sets the pan, tilt and zoom values (i.e. the field of view) of multiple cameras observing different tracked targets in order to maximise the likelihood of correct identification. The expected gain in information is influenced by the controllable field of view, and by the false positive and negative rates of the identification process, which are in turn a function of the gaze angle. We validate the approach using a combination of simulated situations and real tracking output to demonstrate superior performance over alternative approaches, notably using no gaze information, or using gaze inferred from direction of travel (i.e. assuming each person is always looking directly ahead). We also show results from a live implementation with a static camera and two pan-tilt-zoom devices, involving real-time tracking, processing and control.
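The gaze-dependent scheduling can be caricatured as follows: model identification as a binary test whose true-positive rate falls off away from frontal gaze, and zoom onto the target whose observation yields the largest expected reduction in identity entropy. The rate curve and priors below are purely illustrative:

```python
import math

def entropy(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def recognition_rates(gaze_angle):
    """True/false positive rates of the recogniser vs gaze angle
    (0 = frontal). Illustrative shape: frontal faces are recognised best."""
    tpr = 0.5 + 0.45 * math.cos(gaze_angle)
    fpr = 0.05
    return tpr, fpr

def expected_gain(prior, gaze_angle):
    """Expected identity-entropy reduction from one face observation."""
    tpr, fpr = recognition_rates(gaze_angle)
    p_pos = prior * tpr + (1 - prior) * fpr
    post_pos = prior * tpr / p_pos if p_pos > 0 else prior
    p_neg = 1 - p_pos
    post_neg = prior * (1 - tpr) / p_neg if p_neg > 0 else prior
    return entropy(prior) - (p_pos * entropy(post_pos) + p_neg * entropy(post_neg))

def pick_target(priors, gazes):
    """Zoom onto the target with the highest expected information gain."""
    return max(range(len(priors)), key=lambda i: expected_gain(priors[i], gazes[i]))
```

A frontal target is worth more than an averted one at the same identity prior, which is exactly why gaze estimates beat the "always looking ahead" baseline in the abstract.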