Dominic Zeng Wang
University of Oxford
Publications
Featured research published by Dominic Zeng Wang.
robotics science and systems | 2015
Dominic Zeng Wang; Ingmar Posner
This paper proposes an efficient and effective scheme for applying the sliding window approach popular in computer vision to 3D data. Specifically, the sparse nature of the problem is exploited via a voting scheme to enable a search through all putative object locations at any orientation. We prove that this voting scheme is mathematically equivalent to a convolution on a sparse feature grid and thus enables the processing, in full 3D, of any point cloud irrespective of the number of vantage points required to construct it. As such it is versatile enough to operate on data from popular 3D laser scanners such as a Velodyne as well as on 3D data obtained from increasingly popular push-broom configurations. Our approach is “embarrassingly parallelisable” and capable of processing a point cloud containing over 100K points at eight orientations in less than 0.5s. For the object classes car, pedestrian and bicyclist, the resulting detector achieves best-in-class detection and timing performance relative to prior art on the KITTI dataset as well as to another existing 3D object detection approach.
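A minimal sketch of the voting idea in Python (illustrative, not the authors' implementation): scattering weighted votes from each occupied cell of a sparse grid accumulates the same detection scores as convolving the dense feature grid with the (flipped) filter, but only non-empty cells do any work. The offset-keyed `weights` dictionary is an assumed stand-in for a learned detection filter.

```python
import numpy as np

def sparse_voting_scores(occupied, features, weights, grid_shape):
    """occupied: (N, 3) integer cell indices of the non-empty grid cells.
    features: (N, F) feature vector per occupied cell.
    weights:  dict mapping an offset (dz, dy, dx) within the detection
              window to an (F,) weight vector (hypothetical learned filter).
    Returns a dense score grid; empty cells receive votes but never cast any."""
    scores = np.zeros(grid_shape)
    for cell, feat in zip(occupied, features):
        for offset, w in weights.items():
            origin = cell - np.array(offset)  # window origins this cell supports
            if np.all(origin >= 0) and np.all(origin < grid_shape):
                scores[tuple(origin)] += feat @ w  # accumulate the weighted vote
    return scores
```

Because the work scales with the number of occupied cells rather than the grid volume, the same routine can be run once per candidate orientation and parallelised trivially across them.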
international conference on robotics and automation | 2012
Dominic Zeng Wang; Ingmar Posner; Paul Newman
This paper tackles the problem of segmenting things that could move from 3D laser scans of urban scenes. In particular, we wish to detect instances of classes of interest in autonomous driving applications - cars, pedestrians and bicyclists - amongst significant background clutter. Our aim is to provide the layout of an end-to-end pipeline which, when fed by a raw stream of 3D data, produces distinct groups of points which can be fed to downstream classifiers for categorisation. We postulate that, for the specific classes considered in this work, solving a binary classification task (i.e. separating the data into foreground and background first) outperforms approaches that tackle the multi-class problem directly. This is confirmed using custom and third-party datasets of urban street scenes. While our system is agnostic to the specific clustering algorithm deployed, we explore the use of a Euclidean Minimum Spanning Tree for an end-to-end segmentation pipeline and devise a RANSAC-based edge selection criterion.
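The clustering stage can be sketched with SciPy (a simplified illustration, not the paper's code): build a Euclidean Minimum Spanning Tree over the points the binary classifier keeps as foreground, then cut long edges so that each remaining connected component becomes an object segment. A fixed `max_edge` threshold stands in here for the paper's RANSAC-based edge selection criterion.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def emst_segments(points, max_edge=0.5):
    """points: (N, 3) foreground points; returns one cluster label per point.
    Dense pairwise distances keep the sketch short; a k-NN graph would be
    preferable at scale."""
    dists = squareform(pdist(points))            # all pairwise Euclidean distances
    mst = minimum_spanning_tree(dists).tocsr()   # Euclidean MST over the points
    mst.data[mst.data > max_edge] = 0            # cut edges longer than the threshold
    mst.eliminate_zeros()
    _, labels = connected_components(mst, directed=False)
    return labels                                # each connected component is a segment
```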
international conference on robotics and automation | 2017
Martin Engelcke; Dushyant Rao; Dominic Zeng Wang; Chi Hay Tong; Ingmar Posner
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs). In particular, this is achieved by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input. To this end, we examine the trade-off between accuracy and speed for different architectures and additionally propose to use an L1 penalty on the filter activations to further encourage sparsity in the intermediate representations. To the best of our knowledge, this is the first work to propose sparse convolutional layers and L1 regularisation for efficient large-scale processing of 3D data. We demonstrate the efficacy of our approach on the KITTI object detection benchmark and show that Vote3Deep models with as few as three layers outperform the previous state of the art in both laser and laser-vision based approaches by margins of up to 40% while remaining highly competitive in terms of processing time.
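A hedged PyTorch sketch of the L1 idea (the dense convolution below is a stand-in for the paper's voting-based sparse convolution): penalising the ReLU activations of each layer pushes the intermediate feature maps towards sparsity, which sparse layers can then exploit for speed.

```python
import torch
import torch.nn as nn

class L1RegularisedBlock(nn.Module):
    """Conv block that records an L1 penalty on its own activations."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv3d(c_in, c_out, kernel_size=3, padding=1)

    def forward(self, x):
        a = torch.relu(self.conv(x))
        self.l1_term = a.sum()  # equals the L1 norm: ReLU outputs are non-negative
        return a

# During training (lam is a hypothetical regularisation weight):
#   loss = detection_loss + lam * sum(block.l1_term for block in blocks)
```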
The International Journal of Robotics Research | 2015
Dominic Zeng Wang; Ingmar Posner; Paul Newman
We present a new approach to detection and tracking of moving objects with a 2D laser scanner for autonomous driving applications. Objects are modelled with a set of rigidly attached sample points along their boundaries whose positions are initialized with and updated by raw laser measurements, thus allowing a non-parametric representation that is capable of representing objects independent of their classes and shapes. Detection and tracking of such object models are handled in a theoretically principled manner as a Bayes filter where the motion states and shape information of all objects are represented as a part of a joint state which includes in addition the pose of the sensor and geometry of the static part of the world. We derive the prediction and observation models for the evolution of the joint state, and describe how the knowledge of the static local background helps in identifying dynamic objects from static ones in a principled and straightforward way. Dealing with raw laser points poses a significant challenge to data association. We propose a hierarchical approach, and present a new variant of the well-known Joint Compatibility Branch and Bound algorithm to respect and take advantage of the constraints of the problem introduced through correlations between observations. Finally, we calibrate the system systematically on real world data containing 7,500 labelled object examples and validate on 6,000 test cases. We demonstrate its performance over an existing industry standard targeted at the same problem domain as well as a classical approach to model-free object tracking.
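The non-parametric object model can be caricatured in a few lines of Python (a toy sketch under strong simplifications, not the paper's Bayes filter): each object is a rigid set of boundary sample points; prediction applies the estimated rigid motion, and associated raw laser returns pull the points towards the measurements, with a fixed scalar gain standing in for the full filter update.

```python
import numpy as np

def predict(points, velocity, yaw_rate, dt):
    """Rigid-body prediction of (N, 2) boundary sample points; rotation is
    taken about the object frame origin for simplicity."""
    c, s = np.cos(yaw_rate * dt), np.sin(yaw_rate * dt)
    R = np.array([[c, -s], [s, c]])
    return points @ R.T + np.asarray(velocity) * dt

def update(points, measurements, matches, gain=0.3):
    """matches: (point_idx, meas_idx) pairs produced by data association;
    the scalar gain is a stand-in for a proper Kalman gain."""
    points = points.copy()
    for i, j in matches:
        points[i] += gain * (measurements[j] - points[i])  # pull towards the return
    return points
```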
intelligent robots and systems | 2016
Markus Wulfmeier; Dominic Zeng Wang; Ingmar Posner
In this work, we present an approach to learn cost maps for driving in complex urban environments from a large number of demonstrations of human driving behaviour. The learned cost maps are constructed directly from raw sensor measurements, bypassing the effort of manually designing cost maps as well as features. When deploying the cost maps, the trajectories generated not only replicate human-like driving behaviour but are also demonstrably robust against systematic errors in putative robot configuration. To achieve this we deploy a Maximum-Entropy-based, non-linear IRL framework which uses Fully Convolutional Neural Networks (FCNs) to represent the cost model underlying expert driving behaviour. Using a deep, parametric approach enables us to scale efficiently to large datasets and complex behaviours while keeping run time independent of dataset size during deployment. We demonstrate scalability and performance on an ambitious dataset collected over the course of one year, including more than 25k demonstration trajectories extracted from over 120 km of driving and 13 different drivers. We evaluate against a carefully designed cost map and, in addition, demonstrate robustness to systematic errors by learning precise cost maps even in the presence of system calibration perturbations.
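The core training signal can be sketched as follows (a PyTorch sketch under assumptions; `solve_mdp` is a hypothetical planner, e.g. soft value iteration, that returns expected state-visitation frequencies under a given cost map): the gradient of the demonstration negative log-likelihood with respect to the per-cell cost is the difference between expert and expected visitation frequencies, backpropagated through the FCN.

```python
import torch

def irl_step(fcn, sensor_grid, expert_svf, solve_mdp, optimizer):
    """expert_svf and the planner output are assumed to match the cost map's shape."""
    cost_map = fcn(sensor_grid)               # per-cell cost from raw sensor input
    with torch.no_grad():
        expected_svf = solve_mdp(cost_map)    # visitation under the current cost
    grad = expert_svf - expected_svf          # NLL gradient w.r.t. the cost map:
    optimizer.zero_grad()                     # descent lowers the cost where the
    cost_map.backward(gradient=grad)          # expert drives more than expected
    optimizer.step()
```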
The International Journal of Robotics Research | 2017
Markus Wulfmeier; Dushyant Rao; Dominic Zeng Wang; Peter Ondruska; Ingmar Posner
We present an approach for learning spatial traversability maps for driving in complex, urban environments based on an extensive dataset demonstrating the driving behaviour of human experts. The direct end-to-end mapping from raw input data to cost bypasses the effort of manually designing parts of the pipeline, exploits a large number of data samples, and can additionally be framed to refine handcrafted cost maps built from hand-engineered features. To achieve this, we introduce a maximum-entropy-based, non-linear inverse reinforcement learning (IRL) framework which exploits the capacity of fully convolutional neural networks (FCNs) to represent the cost model underlying driving behaviours. The application of a high-capacity, deep, parametric approach successfully scales to more complex environments and driving behaviours, while its run time at deployment is independent of training dataset size. After benchmarking against state-of-the-art IRL approaches, we focus on demonstrating scalability and performance on an ambitious dataset collected over the course of 1 year, including more than 25,000 demonstration trajectories extracted from over 120 km of urban driving. We evaluate the resulting cost representations by showing their advantages over a carefully hand-designed cost map and furthermore demonstrate robustness to systematic errors by learning accurate representations even in the presence of calibration perturbations. Importantly, we demonstrate that a manually designed cost map can be refined to more accurately handle corner cases that are scarcely seen in the environment, such as stairs, slopes and underpasses, by further incorporating human priors into the training framework.
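The refinement framing can be sketched as a residual architecture (layer sizes are illustrative, not the paper's): a small FCN maps the raw sensor grid to a correction that is added to a handcrafted prior cost map, so the network only has to learn where the prior is wrong.

```python
import torch.nn as nn

class CostFCN(nn.Module):
    """Maps a rasterised sensor grid to a cost map, optionally as a residual
    on top of a hand-designed prior."""
    def __init__(self, c_in=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # one cost value per grid cell
        )

    def forward(self, sensor_grid, prior_cost=None):
        residual = self.net(sensor_grid)
        # refinement mode: correct the handcrafted prior where the data disagrees
        return residual if prior_cost is None else prior_cost + residual
```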
ISRR | 2016
Dominic Zeng Wang; Ingmar Posner; Paul Newman
This paper presents a unified and model-free framework for the detection and tracking of dynamic objects with 2D laser range finders in an autonomous driving scenario. A novel state formulation is proposed that captures joint estimates of the sensor pose, a local static background and the dynamic states of moving objects. In addition, we contribute a new hierarchical data association algorithm to associate raw laser measurements to observable states, within which a new variant of the Joint Compatibility Branch and Bound (JCBB) algorithm is introduced for problems with large numbers of measurements. The system is calibrated systematically on 7.5K labeled object examples and evaluated on 6K test cases, and is shown to greatly outperform an existing industry standard targeted at the same problem domain.
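A heavily simplified sketch of the JCBB flavour of branch and bound (for intuition only: each pairing is gated individually by Mahalanobis distance, whereas the real algorithm also tests the joint compatibility of the whole hypothesis, and the paper's variant adds further problem-specific constraints):

```python
import numpy as np

CHI2_GATE = 5.99  # 95% gate for a 2-DOF innovation

def compatible(z, track):
    nu = z - track["pred"]                       # innovation
    return nu @ np.linalg.solve(track["S"], nu) < CHI2_GATE

def jcbb(Z, tracks):
    """Z: list of measurements; tracks: dicts with predicted measurement
    'pred' and innovation covariance 'S'. Returns the largest set of
    mutually exclusive, individually compatible pairings."""
    best = {"pairs": []}

    def recurse(i, pairs, used):
        if len(pairs) + (len(Z) - i) <= len(best["pairs"]):
            return                               # bound: cannot beat the incumbent
        if i == len(Z):
            best["pairs"] = list(pairs)          # reached only when strictly better
            return
        for j, t in enumerate(tracks):           # branch: pair Z[i] with track j
            if j not in used and compatible(Z[i], t):
                recurse(i + 1, pairs + [(i, j)], used | {j})
        recurse(i + 1, pairs, used)              # branch: leave Z[i] unassociated

    recurse(0, [], set())
    return best["pairs"]
```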
The International Journal of Robotics Research | 2018
Julie Dequaire; Peter Ondruska; Dushyant Rao; Dominic Zeng Wang; Ingmar Posner
This paper presents a novel approach for tracking static and dynamic objects for an autonomous vehicle operating in complex urban environments. Whereas traditional approaches for tracking often feature numerous hand-engineered stages, this method is learned end-to-end and can directly predict a fully unoccluded occupancy grid from raw laser input. We employ a recurrent neural network to capture the state and evolution of the environment, and train the model in an entirely unsupervised manner. In doing so, our approach is comparable to model-free, multi-object tracking, although we do not explicitly perform the underlying data-association process. Further, we demonstrate that the underlying representation learned for the tracking task can be leveraged via inductive transfer to train an object detector in a data-efficient manner. We motivate a number of architectural features and show the positive contribution of dilated convolutions and dynamic and static memory units to the task of tracking and classifying complex dynamic scenes through full occlusion. Our experimental results illustrate the ability of the model to track cars, buses, pedestrians and cyclists from both moving and stationary platforms. Further, we compare and contrast the approach with a more traditional model-free multi-object tracking pipeline, demonstrating that it can more accurately predict future states of objects from current inputs.
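A simplified sketch of the recurrent idea (the paper's architecture uses dilated convolutions and separate static and dynamic memory units; a single convolutional GRU-style cell is shown here for brevity): the cell consumes raw occupancy-grid inputs and predicts logits for the unoccluded grid, and training can remain unsupervised by penalising predictions only on the visible cells of future frames.

```python
import torch
import torch.nn as nn

class ConvGRUTracker(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.gates = nn.Conv2d(1 + hidden, 2 * hidden, 3, padding=1)
        self.cand = nn.Conv2d(1 + hidden, hidden, 3, padding=1)
        self.out = nn.Conv2d(hidden, 1, 1)  # unoccluded occupancy logits

    def step(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], 1))).chunk(2, 1)
        h_new = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        h = (1 - z) * h + z * h_new         # gated memory update
        return self.out(h), h

# Unsupervised training idea: loss = BCE(logits[visible], next_frame[visible]),
# i.e. the network is only ever supervised with what the sensor can actually see.
```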
arXiv: Learning | 2016
Peter Ondruska; Julie Dequaire; Dominic Zeng Wang; Ingmar Posner
arXiv: Computer Vision and Pattern Recognition | 2016
Julie Dequaire; Dushyant Rao; Peter Ondruska; Dominic Zeng Wang; Ingmar Posner