Hayko Riemenschneider

Publication

Featured research published by Hayko Riemenschneider.

European Conference on Computer Vision | 2014

Creating Summaries from User Videos

Michael Gygli; Helmut Grabner; Hayko Riemenschneider; Luc Van Gool

This paper proposes a novel approach and a new benchmark for video summarization. We focus on user videos, which are raw videos containing a set of interesting events. Our method starts by segmenting the video using a novel “superframe” segmentation tailored to raw videos. Then, we estimate visual interestingness per superframe using a set of low-, mid- and high-level features. Based on this scoring, we select an optimal subset of superframes to create an informative and interesting summary. The introduced benchmark comes with multiple human-created summaries, which were acquired in a controlled psychological experiment. This data paves the way to evaluate summarization methods objectively and to gain new insights into video summarization. When evaluating our method, we find that it generates high-quality results, comparable to manual, human-created summaries.
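The subset-selection step described above can be illustrated as a 0/1 knapsack problem: choose the superframes that maximize total interestingness while keeping the summary within a length budget. This is a minimal sketch under that assumption. The abstract does not name the optimizer, and the function name `select_superframes`, the scores, and the lengths are hypothetical.

```python
def select_superframes(scores, lengths, budget):
    """Pick superframes maximizing total score with total length <= budget.

    Illustrative 0/1 knapsack solved by dynamic programming.
    scores:  hypothetical interestingness value per superframe.
    lengths: superframe lengths in frames (non-negative integers).
    budget:  maximum summary length in frames (integer).
    """
    n = len(scores)
    # best[i][b] = best total score using the first i superframes and budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip superframe i-1
            if lengths[i - 1] <= b:
                take = best[i - 1][b - lengths[i - 1]] + scores[i - 1]
                if take > best[i][b]:
                    best[i][b] = take  # option 2: include superframe i-1
    # Backtrack through the table to recover the chosen indices.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    return sorted(chosen)
```

For example, with scores `[0.9, 0.2, 0.7, 0.4]`, lengths `[40, 30, 50, 20]` and a budget of 100 frames, the sketch selects superframes 0 and 2.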

Computer Vision and Pattern Recognition | 2012

Irregular lattices for complex shape grammar facade parsing

Hayko Riemenschneider; Ulrich Krispel; Wolfgang Thaller; Michael Donoser; Sven Havemann; Dieter W. Fellner; Horst Bischof

High-quality urban reconstruction requires more than multi-view reconstruction and local optimization. The structure of facades depends on the general layout, which has to be optimized globally. Shape grammars are an established method to express hierarchical spatial relationships, and are therefore suited to representing constraints for semantic facade interpretation. Usually, inference uses numerical approximations or hard-coded grammar schemes. Existing methods inspired by classical grammar parsing are not applicable to real-world images due to their prohibitively high complexity. This work provides feasible generic facade reconstruction by combining low-level classifiers with mid-level object detectors to infer an irregular lattice. The irregular lattice preserves the logical structure of the facade while reducing the search space to a manageable size. We introduce a novel method for handling symmetry and repetition within the generic grammar. We show competitive results on two datasets, namely the Paris 2010 and the Graz 50. The former includes only Haussmannian architecture, while the latter includes Classicism, Biedermeier, Historicism, Art Nouveau and post-modern architectural styles.

European Conference on Computer Vision | 2010

Using partial edge contour matches for efficient object category localization

Hayko Riemenschneider; Michael Donoser; Horst Bischof

We propose a method for object category localization by partially matching edge contours to a single shape prototype of the category. Previous work in this area either relies on piecewise contour approximations, requires meaningful supervised decompositions, or matches coarse shape-based descriptions at local interest points. Our method avoids error-prone pre-processing steps by using all obtained edges in a partial contour matching setting. The matched fragments are efficiently summarized and aggregated to form location hypotheses. The efficiency and accuracy of our edge-fragment-based voting step yield high-quality hypotheses in low computation time. The experimental evaluation achieves excellent performance in the hypotheses voting stage and yields competitive results on challenging datasets like ETHZ and INRIA horses.

Computer Vision and Pattern Recognition | 2015

3D all the way: Semantic segmentation of urban scenes from start to end in 3D

Andelo Martinovic; Jan Knopp; Hayko Riemenschneider; Luc Van Gool

We propose a new approach for semantic segmentation of 3D city models. Starting from an SfM reconstruction of a street-side scene, we perform classification and facade splitting purely in 3D, obviating the need for slow image-based semantic segmentation methods. We show that a properly trained pure-3D approach produces high-quality labelings, with significant speed benefits (20x faster) allowing us to analyze entire streets in a matter of minutes. Additionally, if speed is not of the essence, the 3D labeling can be combined with the results of a state-of-the-art 2D classifier, further boosting the performance. We also propose a novel facade separation based on semantic nuances between facades. Finally, inspired by the use of architectural principles for 2D facade labeling, we propose new 3D-specific principles and an efficient optimization scheme based on an integer quadratic programming formulation.

Asian Conference on Computer Vision | 2009

Efficient partial shape matching of outer contours

Michael Donoser; Hayko Riemenschneider; Horst Bischof

This paper introduces a novel, efficient partial shape matching method named IS-Match. We use sampled points from the silhouette as a shape representation. The sampled points can be ordered, which in turn allows the matching step to be formulated as an order-preserving assignment problem. We propose an angle descriptor between shape chords combining the advantages of global and local shape description. An efficient integral-image-based implementation of the matching step is introduced, which allows detecting partial matches an order of magnitude faster than comparable methods. We further show how the proposed algorithm is used to calculate a globally optimal Pareto frontier to define a partial similarity measure between shapes. Shape retrieval experiments on standard shape databases like MPEG-7 show that state-of-the-art results are achieved at reduced computational costs.
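The integral-image idea mentioned above can be sketched generically: after a single precomputation pass, the sum over any rectangular block of a matrix is available in constant time. This is a minimal illustration of the general technique, not of IS-Match itself. The function names and the toy matrix are assumptions.

```python
def integral_image(matrix):
    """Build a summed-area table with an extra leading zero row and column."""
    rows, cols = len(matrix), len(matrix[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            # Inclusion-exclusion over the three already computed neighbors.
            table[r + 1][c + 1] = (matrix[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table


def block_sum(table, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 in O(1)."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Toy usage: a 3x3 matrix standing in for a descriptor-difference matrix.
toy = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
toy_table = integral_image(toy)
lower_right_sum = block_sum(toy_table, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```

Once the table exists, every candidate block costs four lookups regardless of its size, which is what makes exhaustive scanning over many sub-blocks affordable.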

Computer Vision and Pattern Recognition | 2014

Fast, Approximate Piecewise-Planar Modeling Based on Sparse Structure-from-Motion and Superpixels

András Bódis-Szomorú; Hayko Riemenschneider; Luc Van Gool

State-of-the-art Multi-View Stereo (MVS) algorithms deliver dense depth maps or complex meshes with very high detail, but also with redundancy over regular surfaces. In contrast, our interest lies in an approximate but lightweight method that is better suited to large-scale applications, such as urban scene reconstruction from ground-based images. We present a novel approach for producing dense reconstructions from multiple images and from the underlying sparse Structure-from-Motion (SfM) data in an efficient way. To overcome the problem of SfM sparsity and textureless areas, we assume piecewise planarity of man-made scenes and exploit both sparse visibility and a fast over-segmentation of the images. Reconstruction is formulated as an energy-driven, multi-view plane assignment problem, which we solve jointly over superpixels from all views while avoiding expensive photoconsistency computations. The resulting planar primitives, defined by detailed superpixel boundaries, are computed in about 10 seconds per image.

European Conference on Computer Vision | 2014

Learning Where to Classify in Multi-view Semantic Segmentation

Hayko Riemenschneider; András Bódis-Szomorú; Julien Weissenberg; Luc Van Gool

There is an increasing interest in semantically annotated 3D models, e.g. of cities. The typical approaches start with the semantic labelling of all the images used for the 3D model. Such labelling tends to be very time-consuming, though. The inherent redundancy among the overlapping images calls for more efficient solutions. This paper proposes an alternative approach that exploits the geometry of a 3D mesh model obtained from multi-view reconstruction. Instead of clustering similar views, we predict the best view before the actual labelling. For this, we find the single image part that best supports the correct semantic labelling of each face of the underlying 3D mesh. Moreover, our single-image approach may surprise because it tends to increase the accuracy of the model labelling when compared to approaches that fuse the labels from multiple images. As a matter of fact, we even go a step further and only explicitly label a subset of faces (e.g. 10%), to subsequently fill in the labels of the remaining faces. This leads to a further reduction of computation time, again combined with a gain in accuracy. Compared to a process that starts from the semantic labelling of the images, our method to semantically label 3D models yields accelerations of about 2 orders of magnitude. We tested our multi-view semantic labelling on a variety of street scenes.
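The idea of labelling only a subset of faces and filling in the rest can be illustrated with a simple propagation over the mesh adjacency graph. The abstract does not say how the fill-in is performed, so the sketch below swaps in a plain multi-source breadth-first propagation as a stand-in. The function name `fill_labels`, the adjacency structure, and the labels are hypothetical.

```python
from collections import deque


def fill_labels(adjacency, seed_labels):
    """Spread labels from seed faces to unlabeled faces by multi-source BFS.

    adjacency:   dict mapping face id -> list of neighboring face ids
                 (a hypothetical mesh connectivity).
    seed_labels: dict mapping face id -> label for the explicitly
                 labeled subset of faces.
    Each unlabeled face takes the label of the seed that reaches it first.
    """
    labels = dict(seed_labels)
    queue = deque(seed_labels)  # start from all seeds at once
    while queue:
        face = queue.popleft()
        for neighbor in adjacency[face]:
            if neighbor not in labels:
                labels[neighbor] = labels[face]
                queue.append(neighbor)
    return labels
```

On a strip of five faces connected in a chain, with face 0 labeled "wall" and face 4 labeled "window", faces 1 and 2 inherit "wall" and face 3 inherits "window". Faces not reachable from any seed stay unlabeled.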

European Conference on Computer Vision | 2012

Hough regions for joining instance localization and segmentation

Hayko Riemenschneider; Sabine Sternig; Michael Donoser; Peter M. Roth; Horst Bischof

Object detection and segmentation are two challenging tasks in computer vision, which are usually considered independent steps. In this paper, we propose a framework which jointly optimizes for both tasks and implicitly provides detection hypotheses and corresponding segmentations. Our novel approach is attachable to any of the available generalized Hough voting methods. We introduce Hough Regions by formulating the problem of Hough space analysis as Bayesian labeling of a random field. This exploits provided classifier responses, object center votes and low-level cues like color consistency, which are combined into a global energy term. We further propose a greedy approach to solve this energy minimization problem, providing a pixel-wise assignment to background or to a specific category instance. This way we bypass the parameter-sensitive non-maximum suppression that is required in related methods. The experimental evaluation demonstrates that state-of-the-art detection and segmentation results are achieved and that our method is inherently able to handle overlapping instances and an increased range of articulations, aspect ratios and scales.
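The object center votes that Hough-based methods build on can be sketched as follows: each local feature casts a weighted vote for where the object center should lie, and the votes are accumulated in a coarse grid. This shows only the generic vote accumulation, not the Hough Regions random-field labeling proposed in the paper. The function names, the tuple layout, and the cell size are assumptions.

```python
from collections import Counter


def accumulate_votes(features, cell_size=10):
    """Accumulate weighted object-center votes into a coarse Hough grid.

    features: iterable of (x, y, dx, dy, weight), where (x, y) is the
              feature position and (dx, dy) is the offset to the object
              center predicted by that feature (hypothetical inputs).
    Returns a Counter mapping grid cell -> accumulated vote weight.
    """
    grid = Counter()
    for x, y, dx, dy, weight in features:
        center_x, center_y = x + dx, y + dy
        cell = (int(center_x // cell_size), int(center_y // cell_size))
        grid[cell] += weight
    return grid


def strongest_cell(grid):
    """Return the grid cell with the highest accumulated vote weight."""
    return max(grid, key=grid.get)
```

Taking a single maximum only works for one instance per image. Finding several instances normally requires non-maximum suppression over the grid, which is the step the abstract says the proposed method bypasses.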

Computer Vision and Pattern Recognition | 2015

Superpixel meshes for fast edge-preserving surface reconstruction

András Bódis-Szomorú; Hayko Riemenschneider; Luc Van Gool

Multi-View Stereo (MVS) methods aim for the highest detail possible; however, such detail is often not required. In this work, we propose a novel surface reconstruction method based on image edges, superpixels and second-order smoothness constraints, producing meshes comparable to classic MVS surfaces in quality but orders of magnitude faster. Our method performs per-view dense depth optimization directly over sparse 3D Ground Control Points (GCPs), hence removing the need for view pairing, image rectification, and stereo depth estimation, and allowing for full per-image parallelization. We use Structure-from-Motion (SfM) points as GCPs, but the method is not specific to these, e.g. LiDAR or RGB-D can also be used. The resulting meshes are compact and inherently edge-aligned with image gradients, enabling good-quality lightweight per-face flat renderings. Our experiments on a variety of 3D datasets demonstrate superior speed and competitive surface quality.

British Machine Vision Conference | 2009

Bag of Optical Flow Volumes for Image Sequence Recognition

Hayko Riemenschneider; Michael Donoser; Horst Bischof

This paper introduces a novel 3D interest point detector and feature representation for describing image sequences. The approach considers image sequences as spatiotemporal volumes and detects Maximally Stable Volumes (MSVs) in efficiently calculated optical flow fields. This provides a set of binary optical flow volumes highlighting the dominant motions in the sequences. 3D interest points are sampled on the surface of the volumes, striking a good balance between density and informativeness. The binary optical flow volumes are used as the feature representation in a 3D shape context descriptor. A standard bag-of-words approach then allows building discriminant optical flow volume signatures for predicting class labels of previously unseen image sequences by machine learning algorithms. We evaluate the proposed method for the task of action recognition on the well-known Weizmann dataset, and show that we outperform recently proposed state-of-the-art 3D interest point detection and description methods.
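The standard bag-of-words step mentioned above can be sketched as follows: each descriptor is assigned to its nearest codeword, and the normalized histogram of assignments serves as the sequence signature. This is a minimal generic sketch, assuming a precomputed codebook and squared Euclidean distance. The function name `bag_of_words` and the toy 2D descriptors are hypothetical and stand in for the 3D shape context descriptors.

```python
def bag_of_words(descriptors, codebook):
    """Return an L1-normalized histogram of nearest-codeword assignments.

    descriptors: list of equal-length numeric tuples (toy stand-ins for
                 real feature descriptors).
    codebook:    list of codeword tuples, assumed to be learned beforehand,
                 for example by clustering training descriptors.
    """
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        # Squared Euclidean distance to every codeword.
        distances = [sum((a - b) ** 2 for a, b in zip(descriptor, word))
                     for word in codebook]
        counts[distances.index(min(distances))] += 1
    total = sum(counts)
    # Avoid division by zero when no descriptors were given.
    return [count / total for count in counts] if total else counts
```

The resulting fixed-length histogram can then be passed to any standard classifier, independent of how many interest points a sequence produced.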
