Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Alexander Hermans is active.

Publication


Featured research published by Alexander Hermans.


International Conference on Robotics and Automation | 2014

Dense 3D semantic mapping of indoor scenes from RGB-D images

Alexander Hermans; Georgios Floros; Bastian Leibe

Dense semantic segmentation of 3D point clouds is a challenging task. Many approaches deal with 2D semantic segmentation and can obtain impressive results. With the availability of cheap RGB-D sensors, the field of indoor semantic segmentation has seen a lot of progress. Still, it remains unclear how best to deal with 3D semantic segmentation. We propose a novel 2D-3D label transfer based on Bayesian updates and dense pairwise 3D Conditional Random Fields. This approach allows us to use 2D semantic segmentations to create a consistent 3D semantic reconstruction of indoor scenes. To this end, we also propose a fast 2D semantic segmentation approach based on Randomized Decision Forests. Furthermore, we show that it is not necessary to obtain a semantic segmentation for every frame in a sequence in order to create accurate semantic 3D reconstructions. We evaluate our approach on both NYU Depth datasets and show that we can obtain a significant speed-up compared to other methods.
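
The label-transfer step lends itself to a compact illustration. Below is a minimal numpy sketch of a recursive Bayesian update of per-point class beliefs from successive 2D segmentations; function and variable names are illustrative, and the paper's dense pairwise 3D CRF refinement is omitted.

```python
import numpy as np

def bayesian_label_update(point_probs, frame_probs):
    """Recursive Bayesian update of per-point class beliefs.

    point_probs: (N, C) current class beliefs for N 3D map points.
    frame_probs: (N, C) per-class likelihoods from the current frame's
                 2D segmentation, gathered at each point's projection
                 (points not visible in the frame can pass a uniform row).
    Returns the renormalized posterior, shape (N, C).
    """
    posterior = point_probs * frame_probs              # elementwise Bayes update
    posterior /= posterior.sum(axis=1, keepdims=True)  # renormalize per point
    return posterior

# Toy usage: 2 points, 3 classes, starting from a uniform prior.
prior = np.full((2, 3), 1.0 / 3.0)
obs = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.1, 0.8]])
print(bayesian_label_update(prior, obs))
```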


Computer Vision and Pattern Recognition | 2017

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

Tobias Pohlen; Alexander Hermans; Markus Mathias; Bastian Leibe

Semantic image segmentation is an essential component of modern autonomous driving systems, as an accurate understanding of the surrounding scene is crucial to navigation and action planning. Current state-of-the-art approaches in semantic image segmentation rely on pre-trained networks that were initially developed for classifying images as a whole. While these networks exhibit outstanding recognition performance (i.e., what is visible?), they lack localization accuracy (i.e., where precisely is something located?). Therefore, additional processing steps must be performed to obtain pixel-accurate segmentation masks at the full image resolution. To alleviate this problem, we propose a novel ResNet-like architecture that exhibits strong localization and recognition performance. We combine multi-scale context with pixel-level accuracy by using two processing streams within our network: one stream carries information at the full image resolution, enabling precise adherence to segment boundaries; the other stream undergoes a sequence of pooling operations to obtain robust features for recognition. The two streams are coupled at the full image resolution using residuals. Without additional processing steps and without pre-training, our approach achieves an intersection-over-union score of 71.8% on the Cityscapes dataset.
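
The two-stream coupling can be sketched schematically. The following PyTorch module is an illustrative reading of the idea, not the published FRRN code: one stream stays at full resolution, the other is pooled, and the two exchange information through a residual connection; channel counts and the 2x pooling factor are placeholder choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStreamUnit(nn.Module):
    """Schematic unit coupling a full-resolution residual stream with a
    pooled feature stream, in the spirit of the paper's two-stream design."""

    def __init__(self, res_ch=32, pool_ch=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(pool_ch + res_ch, pool_ch, 3, padding=1),
            nn.BatchNorm2d(pool_ch), nn.ReLU(inplace=True),
        )
        self.to_residual = nn.Conv2d(pool_ch, res_ch, 1)  # back to full res

    def forward(self, residual, pooled):
        # Pool the residual stream and fuse it into the feature stream.
        z = torch.cat([pooled, F.max_pool2d(residual, 2)], dim=1)
        pooled = self.conv(z)
        # Upsample and add back: the residual stream keeps full resolution.
        up = F.interpolate(self.to_residual(pooled), scale_factor=2,
                           mode="bilinear", align_corners=False)
        return residual + up, pooled

# Toy usage on a full-resolution / half-resolution stream pair.
unit = TwoStreamUnit()
r = torch.randn(1, 32, 64, 64)   # full-resolution stream
p = torch.randn(1, 64, 32, 32)   # pooled stream (half resolution)
r2, p2 = unit(r, p)
print(r2.shape, p2.shape)
```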


IEEE Robotics & Automation Magazine | 2017

The STRANDS Project: Long-Term Autonomy in Everyday Environments

Nick Hawes; Christopher Burbridge; Ferdian Jovan; Lars Kunze; Bruno Lacerda; Lenka Mudrová; Jay Young; Jeremy L. Wyatt; Denise Hebesberger; Tobias Körtner; Rares Ambrus; Nils Bore; John Folkesson; Patric Jensfelt; Lucas Beyer; Alexander Hermans; Bastian Leibe; Aitor Aldoma; Thomas Faulhammer; Michael Zillich; Markus Vincze; Eris Chinellato; Muhannad Al-Omari; Paul Duckworth; Yiannis Gatsoulis; David C. Hogg; Anthony G. Cohn; Christian Dondrup; Jaime Pulido Fentanes; Tomas Krajnik

Thanks to the efforts of the robotics and autonomous systems community, the applications and capabilities of robots are ever expanding. End users increasingly demand autonomous service robots that can operate in real environments for extended periods. In the Spatiotemporal Representations and Activities for Cognitive Control in Long-Term Scenarios (STRANDS) project (http://strandsproject.eu), we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots and deploying these systems for long-term installations in security and care environments. Our robots have been operational for a combined duration of 104 days over four deployments, autonomously performing end-user-defined tasks and traversing 116 km in the process. In this article, we describe the approach we used to enable long-term autonomous operation in everyday environments and how our robots are able to use their long run times to improve their own performance.


International Conference on Robotics and Automation | 2016

Multi-scale object candidates for generic object tracking in street scenes

Aljosa Osep; Alexander Hermans; Francis Engelmann; Dirk Klostermann; Markus Mathias; Bastian Leibe

Most vision-based systems for object tracking in urban environments focus on a limited number of important object categories, such as cars or pedestrians, for which powerful detectors are available. However, practical driving scenarios contain many additional objects of interest, for which suitable detectors either do not yet exist or would be cumbersome to obtain. In this paper, we propose a more general tracking approach which does not follow the often-used tracking-by-detection principle. Instead, we investigate how far we can get by tracking unknown, generic objects in challenging street scenes. As such, we do not restrict ourselves to tracking only the most common categories, but are able to handle a large variety of static and moving objects. We evaluate our approach on the KITTI dataset and show competitive results for the annotated classes, even though we are not restricted to them.


International Conference on Computer Vision | 2017

Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds

Francis Engelmann; Theodora Kontogianni; Alexander Hermans; Bastian Leibe

Deep learning approaches have made tremendous progress in the field of semantic segmentation over the past few years. However, most current approaches operate in the 2D image space. Direct semantic segmentation of unstructured 3D point clouds is still an open research problem. The recently proposed PointNet architecture presents an interesting step ahead in that it can operate on unstructured point clouds, achieving encouraging segmentation results. However, it subdivides the input points into a grid of blocks and processes each block individually. In this paper, we investigate how such an architecture can be extended to incorporate larger-scale spatial context. We build upon PointNet and propose two extensions that enlarge the receptive field over the 3D scene. We evaluate the proposed strategies on challenging indoor and outdoor datasets and show improved results in both scenarios.
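
As a rough illustration of the underlying idea, the sketch below builds PointNet-style per-block features (a shared per-point MLP followed by max pooling) and then appends a scene-level summary to each block feature as one simple way to enlarge the receptive field. This is an assumption-laden simplification; the paper's actual extensions differ in detail.

```python
import torch
import torch.nn as nn

class BlockPointNet(nn.Module):
    """Minimal PointNet-style encoder: a shared per-point MLP plus max
    pooling yields one feature vector per block of points."""

    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, feat_dim))

    def forward(self, blocks):               # blocks: (B, N, 3) points per block
        per_point = self.mlp(blocks)         # (B, N, feat_dim)
        return per_point.max(dim=1).values   # (B, feat_dim) per-block feature

def add_scene_context(block_feats):
    """One illustrative way to enlarge the receptive field: concatenate
    each block's feature with a pooled summary over all blocks, so
    per-block predictions see larger-scale spatial context."""
    scene = block_feats.max(dim=0, keepdim=True).values  # (1, feat_dim)
    return torch.cat([block_feats, scene.expand_as(block_feats)], dim=1)

enc = BlockPointNet()
feats = enc(torch.randn(8, 1024, 3))     # 8 blocks of 1024 points each
print(add_scene_context(feats).shape)    # torch.Size([8, 128])
```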


German Conference on Pattern Recognition | 2015

Biternion Nets: Continuous Head Pose Regression from Discrete Training Labels

Lucas Beyer; Alexander Hermans; Bastian Leibe

While head pose estimation has been studied for some time, continuous head pose estimation is still an open problem. Most approaches either cannot deal with the periodicity of angular data or require very fine-grained regression labels. We introduce biternion nets, a CNN-based approach that can be trained on very coarse regression labels and still estimate fully continuous 360° head poses. We show state-of-the-art results on several publicly available datasets. Finally, we demonstrate how easy it is to record and annotate a new dataset with coarse orientation labels in order to obtain continuous head pose estimates using our biternion nets.
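
The biternion representation itself is simple enough to sketch. Below is a minimal PyTorch version of the core idea: encode an angle as a unit vector (cos θ, sin θ), normalize the network's raw 2D output onto the unit circle, and train with a cosine loss that is periodic by construction. The paper also derives a von Mises-based loss; this sketch uses only the cosine variant, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def biternion(theta):
    """Encode an angle (radians) as a unit 2-vector; periodic by design."""
    return torch.stack([torch.cos(theta), torch.sin(theta)], dim=-1)

def biternion_output(raw):
    """Normalize a network's raw 2D output onto the unit circle."""
    return F.normalize(raw, dim=-1)

def cosine_loss(pred, target):
    """1 - cos(angle difference), via a dot product of unit vectors."""
    return (1.0 - (pred * target).sum(dim=-1)).mean()

# Toy usage: coarse labels (e.g. 45- or 90-degree bins) still give valid
# targets, yet the regressed output remains fully continuous.
raw = torch.randn(4, 2, requires_grad=True)     # stand-in for a CNN head
target = biternion(torch.deg2rad(torch.tensor([0., 90., 180., 315.])))
loss = cosine_loss(biternion_output(raw), target)
loss.backward()
print(float(loss))
```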


International Conference on Robotics and Automation | 2017

DROW: Real-Time Deep Learning-Based Wheelchair Detection in 2-D Range Data

Lucas Beyer; Alexander Hermans; Bastian Leibe

We introduce the DROW detector, a deep learning-based object detector operating on 2-dimensional (2-D) range data. Laser scanners are lighting invariant, provide accurate 2-D range data, and typically cover a large field of view, making them interesting sensors for robotics applications. So far, research on detection in 2-D laser range data has been dominated by hand-crafted features and boosted classifiers, potentially losing performance due to suboptimal design choices. We propose a convolutional neural network (CNN) based detector for this task. We show how to effectively apply CNNs for detection in 2-D range data, and propose a depth preprocessing step and a voting scheme that significantly improve CNN performance. We demonstrate our approach on wheelchairs and walkers, obtaining state-of-the-art detection results. Apart from the training data, however, none of our design choices limits the detector to these two classes. We provide a ROS node for our detector and release our dataset containing 464k laser scans, of which 24k are annotated.
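
The voting scheme can be illustrated independently of the network. The numpy sketch below shows one plausible accumulation step, assuming the network has already produced a 2D offset vote and a confidence per scan point; grid resolution, extent, and all names are hypothetical, and the actual DROW pipeline (cut-out preprocessing, non-maximum suppression) is more involved.

```python
import numpy as np

def accumulate_votes(points, offsets, confidences, cell=0.1, extent=15.0):
    """Hypothetical voting step: every scan point casts a confidence-weighted
    vote at (its position + predicted offset); detections are grid maxima.

    points:      (N, 2) laser points in metres.
    offsets:     (N, 2) per-point offsets predicted by the network.
    confidences: (N,)   predicted foreground confidence per point.
    """
    votes = points + offsets
    side = int(2 * extent / cell)
    grid = np.zeros((side, side))
    idx = ((votes + extent) / cell).astype(int)
    ok = (idx >= 0).all(axis=1) & (idx < side).all(axis=1)  # in-bounds votes
    np.add.at(grid, (idx[ok, 0], idx[ok, 1]), confidences[ok])
    return grid  # non-maximum suppression over this grid yields detections

rng = np.random.default_rng(0)
pts = rng.uniform(-5, 5, size=(450, 2))
grid = accumulate_votes(pts, rng.normal(0, 0.2, (450, 2)), rng.random(450))
print(grid.shape, grid.max())
```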


arXiv: Computer Vision and Pattern Recognition | 2017

In Defense of the Triplet Loss for Person Re-Identification

Alexander Hermans; Lucas Beyer; Bastian Leibe
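
This entry lists no abstract; the paper is best known for showing that a plain triplet loss with batch-hard mining, trained end to end, is highly effective for re-identification. The sketch below implements that mining strategy as commonly described, not the authors' reference code; all names are illustrative.

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Triplet loss with batch-hard mining: for each anchor, use the
    hardest (farthest) positive and hardest (closest) negative within
    the batch, then apply a hinge with the given margin."""
    dist = torch.cdist(embeddings, embeddings)            # (B, B) pairwise
    same = labels[:, None] == labels[None, :]             # positive-pair mask
    pos = (dist * same).max(dim=1).values                 # hardest positive
    neg = dist.masked_fill(same, float("inf")).min(dim=1).values
    return torch.clamp(pos - neg + margin, min=0).mean()

# Toy usage: a batch of 8 embeddings covering 4 identities.
emb = torch.randn(8, 128, requires_grad=True)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
loss = batch_hard_triplet_loss(emb, labels)
loss.backward()
print(float(loss))
```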


Computer Vision and Image Understanding | 2018

Superpixels: An evaluation of the state-of-the-art

David Stutz; Alexander Hermans; Bastian Leibe


Computer Vision and Pattern Recognition | 2018

MaskLab: Instance Segmentation by Refining Object Detection With Semantic and Direction Features

Liang-Chieh Chen; Alexander Hermans; George Papandreou; Florian Schroff; Peng Wang; Hartwig Adam

Collaboration


Dive into Alexander Hermans's collaborations.

Top Co-Authors

Lucas Beyer

RWTH Aachen University


Timm Linder

University of Freiburg
