Rodrigo Benenson | Researchain

Publication

Featured research published by Rodrigo Benenson.

computer vision and pattern recognition | 2016

The Cityscapes Dataset for Semantic Urban Scene Understanding

Marius Cordts; Mohamed Omran; Sebastian Ramos; Timo Rehfeld; Markus Enzweiler; Rodrigo Benenson; Uwe Franke; Stefan Roth; Bernt Schiele

Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in streets from 50 different cities. Of these images, 5000 have high-quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.

european conference on computer vision | 2014

Ten Years of Pedestrian Detection, What Have We Learned?

Rodrigo Benenson; Mohamed Omran; Jan Hendrik Hosang; Bernt Schiele

Paper-by-paper results make it easy to miss the forest for the trees. We analyse the remarkable progress of the last decade by discussing the main ideas explored in the 40+ detectors currently present in the Caltech pedestrian detection benchmark. We observe that there exist three families of approaches, all currently reaching similar detection quality. Based on our analysis, we study the complementarity of the most promising ideas by combining multiple published strategies. This new decision forest detector achieves the current best known performance on the challenging Caltech-USA dataset.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

What Makes for Effective Detection Proposals

Jan Hendrik Hosang; Rodrigo Benenson; Piotr Dollár; Bernt Schiele

Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images. Despite the popularity and widespread use of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in-depth analysis of twelve proposal methods along with four baselines regarding proposal repeatability, ground truth annotation recall on PASCAL, ImageNet, and MS COCO, and their impact on DPM, R-CNN, and Fast R-CNN detection performance. Our analysis shows that for object detection improving proposal localisation accuracy is as important as improving recall. We introduce a novel metric, the average recall (AR), which rewards both high recall and good localisation and correlates surprisingly well with detection performance. Our findings show common strengths and weaknesses of existing methods, and provide insights and metrics for selecting and tuning proposal methods.
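
The average recall idea can be illustrated with a minimal sketch. The abstract only states that AR rewards both high recall and good localisation; the sketch below assumes AR is computed as recall averaged over a grid of IoU thresholds from 0.5 upward. The threshold grid, the box format, and all function names are illustrative assumptions, not the authors' code.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_ious(gt: Sequence[Box], proposals: Sequence[Box]) -> List[float]:
    """For each ground-truth box, the IoU of its best-overlapping proposal."""
    return [max((iou(g, p) for p in proposals), default=0.0) for g in gt]


def recall_at(best: Sequence[float], threshold: float) -> float:
    """Fraction of ground-truth boxes covered at the given IoU threshold."""
    if not best:
        return 0.0
    return sum(b >= threshold for b in best) / len(best)


def average_recall(
    gt: Sequence[Box],
    proposals: Sequence[Box],
    thresholds: Optional[Sequence[float]] = None,
) -> float:
    """Recall averaged over IoU thresholds (assumed grid: 0.50 to 0.95)."""
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    best = best_ious(gt, proposals)
    return sum(recall_at(best, t) for t in thresholds) / len(thresholds)


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposal_boxes = [(0, 0, 10, 10), (20, 20, 30, 25)]
ar = average_recall(gt_boxes, proposal_boxes)
```

In this toy case the first object is matched perfectly and the second only at IoU 0.5, so recall at the loosest threshold is 1.0 but drops to 0.5 at every stricter one. Averaging over thresholds is what makes the score sensitive to localisation as well as to coverage.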

european conference on computer vision | 2014

Face Detection without Bells and Whistles

Markus Mathias; Rodrigo Benenson; Marco Pedersoli; Luc Van Gool

Face detection is a mature problem in computer vision. While diverse high performing face detectors have been proposed in the past, we present two surprising new top performance results. First, we show that a properly trained vanilla DPM reaches top performance, improving over commercial and research systems. Second, we show that a detector based on rigid templates - similar in structure to the Viola & Jones detector - can reach similar top performance on this task. Importantly, we discuss issues with existing evaluation benchmarks and propose an improved procedure.

computer vision and pattern recognition | 2015

Filtered channel features for pedestrian detection

Shanshan Zhang; Rodrigo Benenson; Bernt Schiele

This paper starts from the observation that multiple top performing pedestrian detectors can be modelled by using an intermediate layer filtering low-level features in combination with a boosted decision forest. Based on this observation we propose a unifying framework and experimentally explore different filter families. We report extensive results enabling a systematic analysis. Using filtered channel features we obtain top performance on the challenging Caltech and KITTI datasets, while using only HOG+LUV as low-level features. When adding optical flow features we further improve detection quality and report the best known results on the Caltech dataset, reaching 93% recall at 1 FPPI.
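
To make the "recall at 1 FPPI" figure concrete, here is a minimal sketch of how recall at a given number of false positives per image (FPPI) could be read off a list of scored detections. It assumes each detection has already been matched against ground truth and flagged as a true or false positive; the function name, the input layout, and the handling of score ties are illustrative assumptions, not the benchmark's official evaluation code.

```python
from typing import Iterable, Tuple


def recall_at_fppi(
    detections: Iterable[Tuple[float, bool]],
    num_images: int,
    num_gt: int,
    target_fppi: float = 1.0,
) -> float:
    """Highest recall reachable while staying at or below target_fppi.

    detections: (score, is_true_positive) pairs pooled over all images.
    num_images: number of evaluated images.
    num_gt:     total number of ground-truth objects.
    """
    if num_images <= 0 or num_gt <= 0:
        raise ValueError("num_images and num_gt must be positive")

    true_pos = 0
    false_pos = 0
    best_recall = 0.0
    # Lower the score threshold one detection at a time, highest score first.
    for _score, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            true_pos += 1
        else:
            false_pos += 1
        if false_pos / num_images > target_fppi:
            break  # any lower threshold would exceed the FPPI budget
        best_recall = max(best_recall, true_pos / num_gt)
    return best_recall


toy_detections = [
    (0.9, True), (0.8, False), (0.7, True),
    (0.6, False), (0.5, False), (0.4, True),
]
toy_recall = recall_at_fppi(toy_detections, num_images=2, num_gt=4)
```

With two images, the budget of 1 FPPI allows two false positives in total, so the sweep stops at the third false positive and reports the recall reached before it.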

computer vision and pattern recognition | 2015

Taking a deeper look at pedestrians

Jan Hendrik Hosang; Mohamed Omran; Rodrigo Benenson; Bernt Schiele

In this paper we study the use of convolutional neural networks (convnets) for the task of pedestrian detection. Despite their recent diverse successes, convnets historically underperform compared to other pedestrian detectors. We deliberately omit explicitly modelling the problem into the network (e.g. parts or occlusion modelling) and show that we can reach competitive performance without bells and whistles. In a wide range of experiments we analyse small and big convnets, their architectural choices, parameters, and the influence of different training data, including pretraining on surrogate tasks. We present the best convnet detectors on the Caltech and KITTI datasets. On Caltech our convnets reach top performance both for the Caltech1x and Caltech10x training setup. Using additional data at training time, our strongest convnet model is competitive even with detectors that use additional data (optical flow) at test time.

international symposium on neural networks | 2013

Traffic sign recognition — How far are we from the solution?

Markus Mathias; Radu Timofte; Rodrigo Benenson; Luc Van Gool

Traffic sign recognition has been a recurring application domain for visual object detection. The public datasets have only recently reached large enough size and variety to enable proper empirical studies. We revisit the topic by showing how modern methods perform on two large detection and classification datasets (thousands of images, tens of categories) captured in Belgium and Germany. We show that, without any application-specific modification, existing methods for pedestrian detection, and for digit and face classification, can reach performances in the range of 95% ~ 99% of the perfect solution. We show detailed experiments and discuss the trade-offs of different options. Our top performing methods use modern variants of HOG features for detection, and sparse representations for classification.

international conference on computer vision | 2013

Handling Occlusions with Franken-Classifiers

Markus Mathias; Rodrigo Benenson; Radu Timofte; Luc Van Gool

Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets: INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We also show that test time cost grows sub-linearly.

computer vision and pattern recognition | 2017

Simple Does It: Weakly Supervised Instance and Semantic Segmentation

Anna Khoreva; Rodrigo Benenson; Jan Hendrik Hosang; Matthias Hein; Bernt Schiele

Semantic labelling and instance segmentation are two tasks that require particularly costly annotations. Starting from weak supervision in the form of bounding box detection annotations, we propose a new approach that does not require modification of the segmentation training procedure. We show that when carefully designing the input labels from given bounding boxes, even a single round of training is enough to improve over previously reported weakly supervised results. Overall, our weak supervision approach reaches ~95% of the quality of the fully supervised model, both for semantic labelling and instance segmentation.

computer vision and pattern recognition | 2017

Learning Video Object Segmentation from Static Images

Federico Perazzi; Anna Khoreva; Rodrigo Benenson; Bernt Schiele; Alexander Sorkine-Hornung

Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce the concept of convnet-based guidance applied to video object segmentation. Our model proceeds on a per-frame basis, guided by the output of the previous frame towards the object of interest in the next frame. We demonstrate that highly accurate object segmentation in videos can be enabled by using a convolutional neural network (convnet) trained with static images only. The key component of our approach is a combination of offline and online learning strategies, where the former produces a refined mask from the previous frame estimate and the latter captures the appearance of the specific object instance. Our method can handle different types of input annotations such as bounding boxes and segments while leveraging an arbitrary amount of annotated frames. Therefore our system is suitable for diverse applications with different requirements in terms of accuracy and efficiency. In our extensive evaluation, we obtain competitive results on three different datasets, independently of the type of input annotation.