Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Markus Vincze is active.

Publication


Featured research published by Markus Vincze.


The International Journal of Robotics Research | 2007

Fast Ego-motion Estimation with Multi-rate Fusion of Inertial and Vision

Leopoldo Armesto; Josep Tornero; Markus Vincze

This paper presents a tracking system for ego-motion estimation which fuses vision and inertial measurements using Extended and Unscented Kalman Filters (EKF and UKF), whose performance is compared. It also considers the multi-rate nature of the sensors: inertial sensing is sampled at a high frequency, while the sampling frequency of vision is lower. The proposed approach uses a constant linear acceleration model and a constant angular velocity model based on quaternions, which yields a non-linear model for the states and a linear model for the measurement equations. Results show that fusing both measurements significantly improves the estimation compared with using vision or inertial measurements alone. It is also shown that the proposed system can estimate fast motions even when the vision system fails. Moreover, a study of the influence of the noise covariances is performed, which aims to select appropriate values during the tuning process. The setup is an end-effector-mounted camera, which allows us to pre-define basic rotational and translational motions for validating the results.
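
The multi-rate idea can be illustrated in a few lines: predict the state with every fast inertial sample, and correct only when a slower vision measurement arrives. Below is a minimal one-dimensional sketch of that scheme in Python; the constant-velocity toy model, the rates, and all noise values are assumptions for illustration, not the paper's quaternion-based formulation.

```python
import numpy as np

# Minimal multi-rate fusion sketch (1-D toy, not the paper's quaternion
# model): "IMU" acceleration arrives at 100 Hz, "vision" position at 10 Hz.
# State x = [position, velocity]; all rates and noise values are assumed.
dt = 0.01                                    # IMU period [s]
VISION_EVERY = 10                            # vision arrives every 10th step

F = np.array([[1.0, dt], [0.0, 1.0]])        # constant-velocity transition
B = np.array([0.5 * dt**2, dt])              # acceleration enters as input
H = np.array([[1.0, 0.0]])                   # vision observes position only
Q = 1e-4 * np.eye(2)                         # process noise (tuning choice)
R = np.array([[1e-2]])                       # vision noise (tuning choice)

x, P = np.zeros(2), np.eye(2)
rng = np.random.default_rng(0)
true_pos, true_vel = 0.0, 0.0

for k in range(1000):
    acc = np.sin(0.01 * k)                   # synthetic true acceleration
    true_vel += acc * dt
    true_pos += true_vel * dt

    # Prediction at the fast IMU rate, driven by measured acceleration.
    x = F @ x + B * (acc + rng.normal(0.0, 0.1))
    P = F @ P @ F.T + Q

    # Correction only when a (slower) vision measurement is available.
    if k % VISION_EVERY == 0:
        z = true_pos + rng.normal(0.0, 0.1)
        y = z - H @ x                        # innovation, shape (1,)
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain, shape (2, 1)
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P

print("estimate:", x, " truth:", true_pos, true_vel)
```

In the full system the same predict-at-fast-rate, correct-at-slow-rate pattern applies to the quaternion state, with the EKF or UKF handling the resulting non-linearities.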


The International Journal of Robotics Research | 2007

Simultaneous Motion and Structure Estimation by Fusion of Inertial and Vision Data

Peter Gemeiner; Peter Einramhof; Markus Vincze

For mobile robotics, headgear in augmented reality (AR) applications, or computer vision, it is essential to continuously estimate the egomotion and the structure of the environment. This paper presents the system developed in the SmartTracking project, which simultaneously integrates visual and inertial sensors in a combined estimation scheme. The sparse structure estimation is based on the detection of corner features in the environment. From a single known starting position, the system can move into an unknown environment. The vision and inertial data are fused, and the performance of the Unscented and the Extended Kalman filter is compared for this task. The filters are designed to handle asynchronous input from visual and inertial sensors, which typically operate at different and possibly varying rates. Additionally, a bank of Extended Kalman filters, one per corner feature, is used to estimate the position and the quality of structure points and to include them in the structure estimation process. The system is demonstrated on a mobile robot executing known motions, such that the estimation of the egomotion in an unknown environment can be compared to ground truth.
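
The per-feature filter bank can be sketched compactly: each corner gets its own small Kalman filter refining a 3-D point estimate, and a point is promoted into the structure estimate once its uncertainty falls below a gate. The snippet below is a hypothetical toy (identity measurement model, made-up gate and noise values), not the SmartTracking implementation.

```python
import numpy as np

# Toy per-feature filter bank: one small Kalman filter per corner,
# promoted into the map once its covariance trace is below a gate.
class PointFilter:
    def __init__(self, init_xyz, init_var=1.0, meas_var=0.05):
        self.x = np.asarray(init_xyz, dtype=float)   # 3-D point estimate
        self.P = init_var * np.eye(3)                # estimate covariance
        self.R = meas_var * np.eye(3)                # measurement noise

    def update(self, z):
        # Static point, identity measurement model: plain Kalman update.
        K = self.P @ np.linalg.inv(self.P + self.R)
        self.x = self.x + K @ (np.asarray(z) - self.x)
        self.P = (np.eye(3) - K) @ self.P

    def quality(self):
        return np.trace(self.P)                      # smaller is better

bank = {}                     # feature id -> PointFilter
structure = {}                # accepted map points
GATE = 0.02                   # promotion threshold (assumed)

rng = np.random.default_rng(1)
true_points = {7: np.array([1.0, 0.5, 2.0]), 9: np.array([-0.3, 0.2, 1.5])}
for frame in range(30):
    for fid, p in true_points.items():              # simulated observations
        z = p + rng.normal(0.0, 0.2, size=3)
        if fid not in bank:
            bank[fid] = PointFilter(z)
        else:
            bank[fid].update(z)
        if fid not in structure and bank[fid].quality() < GATE:
            structure[fid] = bank[fid].x            # snapshot at promotion

print({fid: np.round(x, 2) for fid, x in structure.items()})
```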


International Conference on Robotics and Automation | 2013

Multimodal cue integration through Hypotheses Verification for RGB-D object recognition and 6DOF pose estimation

Aitor Aldoma; Federico Tombari; Johann Prankl; A. Richtsfeld; L. Di Stefano; Markus Vincze

This paper proposes an effective algorithm for recognizing objects and accurately estimating their 6DOF pose in scenes acquired by an RGB-D sensor. The proposed method is based on a combination of different recognition pipelines, each exploiting the data in a diverse manner and generating object hypotheses that are ultimately fused together in a Hypothesis Verification stage that globally enforces geometric consistency between model hypotheses and the scene. Such a scheme boosts the overall recognition performance, as it combines the strengths of the different recognition pipelines while diminishing the impact of their specific weaknesses. The proposed method outperforms the state of the art on two challenging benchmark datasets for object recognition comprising 35 object models and, respectively, 176 and 353 scenes.
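
The fusion step can be approximated by a much simpler stand-in: hypotheses from several pipelines compete to explain scene points, and only hypotheses that account for enough not-yet-explained points survive. The scores, thresholds, and greedy selection below are my simplification for illustration; the paper's Hypothesis Verification stage solves a global optimization instead.

```python
# Toy hypothesis-verification sketch: each recognition pipeline emits
# hypotheses; each "explains" a set of scene point indices with a fit
# score; we greedily keep hypotheses that add enough newly explained
# points, enforcing rough consistency between them.
hypotheses = [
    # (label, explained scene-point ids, average fit in [0, 1])
    ("mug",    {0, 1, 2, 3, 4},  0.90),   # from a local-feature pipeline
    ("mug",    {2, 3, 4, 5},     0.60),   # same object, worse pose
    ("bottle", {10, 11, 12, 13}, 0.85),   # from a global-feature pipeline
    ("bowl",   {3, 4, 10},       0.40),   # mostly conflicts with others
]

MIN_NEW, MIN_FIT = 3, 0.5                 # thresholds (assumed)
explained, accepted = set(), []
for label, pts, fit in sorted(hypotheses, key=lambda h: -h[2]):
    new = pts - explained                 # points not yet accounted for
    if fit >= MIN_FIT and len(new) >= MIN_NEW:
        accepted.append((label, fit))
        explained |= pts
print(accepted)   # -> [('mug', 0.9), ('bottle', 0.85)]
```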


International Conference on Robotics and Automation | 2015

Fast semantic segmentation of 3D point clouds using a dense CRF with learned parameters

Daniel Wolf; Johann Prankl; Markus Vincze

In this paper, we present an efficient semantic segmentation framework for indoor scenes operating on 3D point clouds. We use the results of a Random Forest classifier to initialize the unary potentials of a densely interconnected Conditional Random Field, whose pairwise potentials are learned from training data. These potentials capture and model common spatial relations between class labels, which can often be observed in indoor scenes. We evaluate our approach on the popular NYU Depth datasets, on which it achieves superior results compared to the current state of the art. Exploiting parallelization and applying an efficient CRF inference method based on mean-field approximation, our framework is able to process full-resolution Kinect point clouds in half a second on a regular laptop, more than twice as fast as comparable methods.
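
A mean-field update for a dense CRF has a compact form: each point's label distribution is repeatedly re-estimated from its unary potential plus affinity-weighted messages from all other points. The toy below spells this out with a brute-force dense affinity matrix; the actual framework relies on efficient Gaussian filtering to make the message passing tractable, and the unaries would come from the Random Forest rather than random numbers.

```python
import numpy as np

# Brute-force mean-field inference for a dense CRF on N points, L labels.
rng = np.random.default_rng(0)
N, L, iters = 200, 4, 10
xyz = rng.normal(size=(N, 3))                 # toy 3-D points
unary = rng.normal(size=(N, L))               # stand-in for -log RF probs

# Gaussian affinity between all point pairs (dense connectivity).
d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)
np.fill_diagonal(W, 0.0)                      # no self-messages
mu = 1.0 - np.eye(L)                          # Potts label compatibility

Q = np.exp(-unary)
Q /= Q.sum(1, keepdims=True)                  # initialize with unaries
for _ in range(iters):
    msg = W @ Q                               # message passing
    Q = np.exp(-unary - msg @ mu)             # compatibility transform
    Q /= Q.sum(1, keepdims=True)              # normalize per point

labels = Q.argmax(1)
print(np.bincount(labels, minlength=L))       # label histogram
```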


International Conference on Robotics and Automation | 2016

Enhancing Semantic Segmentation for Robotics: The Power of 3-D Entangled Forests

Daniel Wolf; Johann Prankl; Markus Vincze

We present a novel, fast, and compact method to improve semantic segmentation of three-dimensional (3-D) point clouds, which is able to learn and exploit common contextual relations between observed structures and objects. Introducing 3-D Entangled Forests (3-DEF), we extend the concept of entangled features for decision trees to 3-D point clouds, enabling the classifier to learn not only which labels are likely to occur close to each other, but also in which specific geometric configuration. Operating on a plane-based representation of a point cloud, our method does not require a final smoothing step and achieves state-of-the-art results on the NYU Depth Dataset in a single inference step. This compactness in turn allows for fast processing times, a crucial factor for online applications on robotic platforms. In a thorough evaluation, we demonstrate the expressiveness of our new 3-D entangled feature set and the importance of spatial context in the scope of semantic segmentation.
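
The essence of an entangled feature is that a split test at a deeper tree level may query the intermediate label beliefs of another region at a geometric offset. The snippet below illustrates that idea with hypothetical names and a two-segment toy scene; the actual 3-DEF features operate on plane segments with richer geometric queries.

```python
import numpy as np

# Sketch of an "entangled" split test (hypothetical names). During
# inference, earlier tree levels already assign intermediate label
# posteriors to all segments; a deeper node may then ask: "does the
# segment at geometric offset `offset` from me currently look like
# class `c` with probability above `tau`?"
def entangled_test(seg_id, offset, c, tau, neighbors, posterior):
    """True if the neighbor at `offset` currently favors class `c`."""
    nb = neighbors.get((seg_id, offset))
    if nb is None:
        return False                      # no segment at that offset
    return posterior[nb][c] > tau

# Toy scene: segment 0 is a "table" (class 1), segment 1 sits above it.
neighbors = {(1, "below"): 0}             # segment below segment 1 is 0
posterior = {0: np.array([0.1, 0.8, 0.1]),   # intermediate beliefs after
             1: np.array([0.4, 0.3, 0.3])}   # the first tree levels

# A deeper node can exploit context: "is the thing below me a table?"
print(entangled_test(1, "below", c=1, tau=0.5,
                     neighbors=neighbors, posterior=posterior))  # True
```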


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

A Global Hypothesis Verification Framework for 3D Object Recognition in Clutter

Aitor Aldoma; Federico Tombari; Luigi Di Stefano; Markus Vincze

Pipelines to recognize 3D objects despite clutter and occlusions usually end with a final verification stage whereby recognition hypotheses are validated or dismissed based on how well they explain sensor measurements. Unlike previous work, we propose a Global Hypothesis Verification (GHV) approach which regards all hypotheses jointly so as to account for their mutual interactions. GHV provides a principled framework to tackle the complexity of our visual world by leveraging a plurality of recognition paradigms and cues. Accordingly, we present a 3D object recognition pipeline deploying both global and local 3D features as well as shape and color. Thereby, and facilitated by the robustness of the verification process, diverse object hypotheses can be gathered, and weak hypotheses need not be suppressed too early to trade sensitivity for specificity. Experiments demonstrate the effectiveness of our proposal, which significantly improves over the state of the art and attains ideal performance (no false negatives, no false positives) on three out of the six most relevant and challenging benchmark datasets.
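
The global flavor of GHV can be conveyed with a toy: instead of accepting hypotheses one by one, we search over the boolean activation vector of all hypotheses for the assignment that best explains the scene while penalizing conflicts. The cost function and the one-bit-flip local search below are my simplification, not the paper's exact formulation.

```python
# Joint verification sketch: pick the boolean activation of hypotheses
# that maximizes explained scene points minus penalties for points
# claimed by several active hypotheses, via greedy one-bit flips.
explains = [set(range(0, 6)), set(range(4, 9)), set(range(10, 14))]

def cost(active):
    counts = {}
    for h, on in enumerate(active):
        if on:
            for p in explains[h]:
                counts[p] = counts.get(p, 0) + 1
    covered = sum(1 for c in counts.values() if c >= 1)
    conflicts = sum(c - 1 for c in counts.values() if c > 1)
    return covered - 2 * conflicts          # reward coverage, punish overlap

active = [False] * len(explains)
improved = True
while improved:                             # flip one bit at a time
    improved = False
    for h in range(len(active)):
        flipped = active.copy()
        flipped[h] = not flipped[h]
        if cost(flipped) > cost(active):
            active, improved = flipped, True

print(active, cost(active))   # -> [True, False, True] 10
```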


Proceedings of SPIE | 1996

General relationship for optimal tracking performance

Markus Vincze; Carl F. R. Weiman

Visual tracking is a vital task in active vision research, traffic surveillance, face following, robotics, and many other applications. This paper investigates the principles of finding optimal tracking performance depending on image tessellation and window size. Square windows reach their best performance when the image sampling time equals the image processing time. This holds in all cases where the algorithm investigates each pixel in the window, and for tracking with both fixed and steered cameras. Linear windows can improve tracking performance, though their performance is limited as well. Space-variant image tessellations yield the best performance: image pyramids and log-polar sampled images show steadily increasing tracking performance with increasing sensor size, because their resolution drops as sensor size increases.
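
The square-window result can be checked numerically under a simple model (my assumption, chosen to match the stated conclusion): the target may move at most half a window per cycle, and a cycle lasts the fixed sampling time plus a processing time proportional to the window area. The optimum then falls exactly where processing time equals sampling time.

```python
import numpy as np

# Model (assumed): a cycle lasts t_s + t_p with t_p = k * w**2, since
# every pixel in the w-by-w window is visited; the target may move at
# most w/2 pixels per cycle.  Maximum trackable velocity:
#     v(w) = (w / 2) / (t_s + k * w**2)
# Setting dv/dw = 0 gives t_s = k * w**2, i.e. sampling time equals
# processing time at the optimum.
t_s = 0.02            # image sampling time [s] (assumed)
k = 2e-6              # per-pixel processing cost [s/pixel] (assumed)

w = np.linspace(10, 400, 2000)
v = (w / 2) / (t_s + k * w**2)

w_star = w[np.argmax(v)]
print(f"best window ~{w_star:.0f} px, "
      f"t_p = {k * w_star**2 * 1e3:.1f} ms vs t_s = {t_s * 1e3:.1f} ms")
# analytic optimum: w* = sqrt(t_s / k) = 100 px, where t_p == t_s
```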


International Conference on Computer Vision | 2011

Surface reconstruction for RGB-D data using real-time depth propagation

Karthik Mahesh Varadarajan; Markus Vincze

Real-time noise removal and depth propagation is a crucial component of surface reconstruction algorithms. Given the recent surge in the development of RGB-D sensors, a host of methods are available for detecting and tracking RGB-D features across multiple frames, as well as for combining these frames to yield dense 3D point clouds. Nevertheless, the sensor outputs are sparse in low-texture areas (for traditional stereo cameras) and in high-reflectance regions (for Kinect-like active sensors). It is crucial to employ a depth-estimate propagation or diffusion algorithm to generate the best approximation of the surface curvature in these regions for visualization. In this paper, we extend the Depth Diffusion using Iterative Back Substitution scheme to Kinect-like RGB-D sensor data for real-time surface reconstruction.
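
A plain iterative propagation scheme conveys the idea: missing depth pixels are repeatedly replaced by a color-similarity-weighted average of their neighbors until the field settles. The weighting, parameters, and fixed iteration count below are my assumptions for a self-contained sketch, not the paper's Iterative Back Substitution formulation.

```python
import numpy as np

# Color-guided depth propagation sketch: holes (NaNs) are filled by
# iterating a weighted average over 4-neighbors, where the weight
# favors neighbors with similar (grayscale) color.
def propagate_depth(depth, color, iters=200, sigma=0.1):
    d = depth.copy()
    missing = np.isnan(depth)
    d[missing] = np.nanmean(depth)                # neutral initialization
    for _ in range(iters):
        num = np.zeros_like(d)
        den = np.zeros_like(d)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb_d = np.roll(d, (dy, dx), axis=(0, 1))
            nb_c = np.roll(color, (dy, dx), axis=(0, 1))
            w = np.exp(-(color - nb_c) ** 2 / (2 * sigma**2))
            num += w * nb_d
            den += w
        d[missing] = (num / den)[missing]         # update holes only
    return d

# Toy example: a depth ramp with a hole, guided by a matching color ramp.
H, W = 32, 32
color = np.tile(np.linspace(0, 1, W), (H, 1))
depth = 2.0 + color.copy()                        # depth follows color
depth[12:20, 12:20] = np.nan                      # simulated dropout
filled = propagate_depth(depth, color)
print(np.abs(filled[12:20, 12:20] - (2.0 + color[12:20, 12:20])).max())
```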


5th International Symposium on Computational Intelligence and Intelligent Informatics (ISCIII) | 2011

Augmented virtuality based immersive telepresence for control of mining robots

Karthik Mahesh Varadarajan; Markus Vincze

Vast mineral resources of precious metals such as gold remain trapped and unexploited due to the lack of economical and practical means of exploration. This requires the development of alternative exploitation techniques. Mining robots form a significant alternative to conventional mining techniques. However, there are several practical limitations that make such systems difficult to implement in practice. The primary hurdle in realizing such systems is the difficulty of tele-operating the robot under high-latency conditions, which are typical of mining environments. This is further compounded by poor representation of the environment, resulting in reduced situational awareness. The latency in tele-operation can be caused by numerous factors: system latency, compression scheme, communication protocols, constraints on bandwidth, channel contention, poor line of sight, and display overhead. It is typically countered by reducing the frame rate, display resolution, or quality, which in turn hampers remote navigation of the robot. Non-holistic scene displays further degrade situational perception, which is intricately tied to the effectiveness of the Operator Control Unit (OCU). Moreover, improvements in these capabilities without any vehicle intelligence do little to reduce the operator's task load. In this paper, we present the design of a novel augmented-virtuality-based visualization and operator interface unit, supported by vehicular intelligence, targeted at overcoming the above issues. These design considerations and the presented algorithms are expected to form the foundation of next-generation mining robots.


Proceedings of SPIE | 1996

Generic motion platform for active vision

Carl F. R. Weiman; Markus Vincze

The term active vision was first used by Bajcsy at a NATO workshop in 1982 to describe an emerging field of robot vision which departed sharply from traditional paradigms of image understanding and machine vision. The new approach embeds a moving camera platform as an in-the-loop component of robotic navigation or hand-eye coordination. Visually servoed steering of the focus of attention supersedes the traditional functions of recognition and gauging. Custom active vision platforms soon proliferated in research laboratories in Europe and North America. In 1990 the National Science Foundation funded the design of a common platform to promote cooperation and reduce cost in active vision research. This paper describes the resulting platform. The design was driven by payload requirements for binocular motorized C-mount lenses on a platform whose performance and articulation emulate those of the human eye-head system. The result was a 4-DOF mechanism driven by servo-controlled DC brush motors. A crossbeam supports two independent worm-gear-driven camera vergence mounts at speeds up to 1,000 degrees per second over a range of +/- 90 degrees from dead ahead. This crossbeam is supported by a pan-tilt mount whose horizontal axis intersects the vergence axes for translation-free camera rotation about these axes at speeds up to 500 degrees per second.

Collaboration


Dive into Markus Vincze's collaborations.

Top Co-Authors

Daniel Wolf, Vienna University of Technology

Kai Zhou, Vienna University of Technology

Peter Einramhof, Vienna University of Technology

Aitor Aldoma, Vienna University of Technology

David Fischinger, Vienna University of Technology

Markus Bajones, Vienna University of Technology

Michael Zillich, Vienna University of Technology

Robert Schwarz, Vienna University of Technology