Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Aitor Aldoma is active.

Publication


Featured research published by Aitor Aldoma.


IEEE Robotics & Automation Magazine | 2012

Tutorial: Point Cloud Library: Three-Dimensional Object Recognition and 6 DOF Pose Estimation

Aitor Aldoma; Zoltan-Csaba Marton; Federico Tombari; Walter Wohlkinger; Christian Potthast; Bernhard Zeisl; Radu Bogdan Rusu; Suat Gedikli; Markus Vincze

With the advent of new-generation depth sensors, the use of three-dimensional (3-D) data is becoming increasingly popular. As these sensors are commodity hardware and sold at low cost, a rapidly growing group of people can acquire 3-D data cheaply and in real time.
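
As a flavor of the kind of PCL workflow the tutorial covers, the sketch below (not taken from the tutorial itself) loads a point cloud from an assumed file "scene.pcd" and estimates surface normals, a typical preprocessing step for the recognition pipelines listed further down.

```cpp
// Minimal PCL sketch: load a depth-sensor point cloud from an assumed file
// "scene.pcd" and estimate surface normals for every point.
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <pcl/features/normal_3d.h>
#include <pcl/search/kdtree.h>

int main()
{
  pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
  if (pcl::io::loadPCDFile<pcl::PointXYZ>("scene.pcd", *cloud) < 0)
    return -1;

  // Estimate a normal for every point using a 3 cm neighbourhood.
  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
  ne.setInputCloud(cloud);
  ne.setSearchMethod(tree);
  ne.setRadiusSearch(0.03);
  ne.compute(*normals);
  return 0;
}
```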


International Conference on Computer Vision | 2011

CAD-model recognition and 6DOF pose estimation using 3D cues

Aitor Aldoma; Markus Vincze; Nico Blodow; David Gossow; Suat Gedikli; Radu Bogdan Rusu; Gary R. Bradski

This paper focuses on developing a fast and accurate 3D feature for use in object recognition and pose estimation for rigid objects. More specifically, given a set of CAD models of different objects representing our knowledge of the world - obtained using high-precision scanners that deliver accurate and noiseless data - our goal is to identify and estimate their pose in a real scene obtained by a depth sensor like the Microsoft Kinect. Borrowing ideas from the Viewpoint Feature Histogram (VFH) due to its computational efficiency and recognition performance, we describe the Clustered Viewpoint Feature Histogram (CVFH) and the camera's roll histogram together with our recognition framework to show that they can be effectively used to recognize objects and their 6DOF pose in real environments, dealing with partial occlusion, noise and different sensor attributes for training and recognition data. We show that CVFH outperforms VFH and present recognition results using the Microsoft Kinect sensor on a set of 44 objects.
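
As an illustrative sketch, CVFH is available in PCL as pcl::CVFHEstimation; the snippet below shows typical usage, assuming `cloud` and `normals` were prepared as in the previous snippet. The threshold values are illustrative, not the paper's tuned settings.

```cpp
// Illustrative sketch of computing the CVFH descriptor with PCL.
#include <pcl/features/cvfh.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>

pcl::PointCloud<pcl::VFHSignature308>
computeCVFH(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud,
            const pcl::PointCloud<pcl::Normal>::Ptr& normals)
{
  pcl::CVFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::VFHSignature308> cvfh;
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  cvfh.setInputCloud(cloud);
  cvfh.setInputNormals(normals);
  cvfh.setSearchMethod(tree);
  cvfh.setEPSAngleThreshold(0.0873f);   // ~5 degrees: smooth-region clustering threshold
  cvfh.setCurvatureThreshold(1.0f);
  cvfh.setNormalizeBins(false);

  // CVFH yields one VFHSignature308 histogram per stable cluster of the view.
  pcl::PointCloud<pcl::VFHSignature308> descriptors;
  cvfh.compute(descriptors);
  return descriptors;
}
```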


European Conference on Computer Vision | 2012

A global hypotheses verification method for 3D object recognition

Aitor Aldoma; Federico Tombari; Luigi Di Stefano; Markus Vincze

We propose a novel approach for verifying model hypotheses in cluttered and heavily occluded 3D scenes. Instead of verifying one hypothesis at a time, as done by most state-of-the-art 3D object recognition methods, we determine object and pose instances according to a global optimization stage based on a cost function which encompasses geometrical cues. Peculiar to our approach is the inherent ability to detect significantly occluded objects without increasing the number of false positives, so that the operating point of the object recognition algorithm can move toward higher recall without sacrificing precision. Our approach outperforms the state of the art on a challenging dataset including 35 household models obtained with the Kinect sensor, as well as on the standard 3D object recognition benchmark dataset.
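
PCL's recognition module ships a GlobalHypothesesVerification class following this approach; the hedged sketch below shows typical usage (method names and availability may vary across PCL versions). Candidate model clouds, already transformed into scene coordinates, are accepted or rejected jointly rather than one at a time.

```cpp
// Hedged sketch of global hypothesis verification with PCL's
// pcl::GlobalHypothesesVerification (pcl/recognition).
#include <pcl/recognition/hv/hv_go.h>
#include <pcl/point_types.h>
#include <vector>

std::vector<bool>
verifyHypotheses(const pcl::PointCloud<pcl::PointXYZ>::Ptr& scene,
                 std::vector<pcl::PointCloud<pcl::PointXYZ>::ConstPtr>& hypotheses)
{
  pcl::GlobalHypothesesVerification<pcl::PointXYZ, pcl::PointXYZ> hv;
  hv.setSceneCloud(scene);
  hv.addModels(hypotheses, true);      // true: enable occlusion reasoning
  hv.setInlierThreshold(0.005f);       // 5 mm model-to-scene inlier distance
  hv.setOcclusionThreshold(0.01f);
  hv.verify();                         // one global optimization over all hypotheses

  std::vector<bool> mask;
  hv.getMask(mask);                    // mask[i] == true -> hypothesis i accepted
  return mask;
}
```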


Joint DAGM (German Association for Pattern Recognition) and OAGM Symposium | 2012

OUR-CVFH – Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram for Object Recognition and 6DOF Pose Estimation

Aitor Aldoma; Federico Tombari; Radu Bogdan Rusu; Markus Vincze

We propose a novel method to estimate a unique and repeatable reference frame in the context of 3D object recognition from a single viewpoint based on global descriptors. We show that the ability to define a robust reference frame on both model and scene views allows creating descriptive global representations of the object view, with the beneficial effect of enhancing the spatial descriptiveness of the feature and its ability to recognize objects by means of a simple nearest-neighbor classifier computed on the descriptor space. Moreover, the definition of repeatable directions can be deployed to efficiently retrieve the 6DOF pose of the objects in a scene. We experimentally demonstrate the effectiveness of the proposed method on a dataset including 23 scenes acquired with the Microsoft Kinect sensor and 25 full-3D models by comparing the proposed approach with state-of-the-art global descriptors. We report a substantial improvement in recognition accuracy and 6DOF pose estimation, as well as in computational performance.
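
OUR-CVFH is likewise available in PCL as pcl::OURCVFHEstimation; a hedged usage sketch follows, with illustrative parameter values. Besides the histograms, the estimated reference frames can be used downstream to recover the 6DOF pose.

```cpp
// Hedged usage sketch of PCL's pcl::OURCVFHEstimation (pcl/features/our_cvfh.h).
#include <pcl/features/our_cvfh.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>

pcl::PointCloud<pcl::VFHSignature308>
computeOURCVFH(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud,
               const pcl::PointCloud<pcl::Normal>::Ptr& normals)
{
  pcl::OURCVFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::VFHSignature308> ourcvfh;
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  ourcvfh.setInputCloud(cloud);
  ourcvfh.setInputNormals(normals);
  ourcvfh.setSearchMethod(tree);
  ourcvfh.setEPSAngleThreshold(0.0873f);  // ~5 degrees
  ourcvfh.setCurvatureThreshold(1.0f);
  ourcvfh.setAxisRatio(0.8f);             // disambiguation of the reference-frame axes

  // One descriptor (and one repeatable reference frame) per stable surface cluster.
  pcl::PointCloud<pcl::VFHSignature308> descriptors;
  ourcvfh.compute(descriptors);
  return descriptors;
}
```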


International Conference on Robotics and Automation | 2013

Multimodal cue integration through Hypotheses Verification for RGB-D object recognition and 6DOF pose estimation

Aitor Aldoma; Federico Tombari; Johann Prankl; A. Richtsfeld; L. Di Stefano; Markus Vincze

This paper proposes an effective algorithm for recognizing objects and accurately estimating their 6DOF pose in scenes acquired by an RGB-D sensor. The proposed method is based on a combination of different recognition pipelines, each exploiting the data in a diverse manner and generating object hypotheses that are ultimately fused together in a Hypothesis Verification stage that globally enforces geometrical consistency between model hypotheses and the scene. Such a scheme boosts the overall recognition performance as it enhances the strength of the different recognition pipelines while diminishing the impact of their specific weaknesses. The proposed method outperforms the state of the art on two challenging benchmark datasets for object recognition comprising 35 object models and, respectively, 176 and 353 scenes.
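
The structure of the approach can be sketched as follows. The pipeline functions below are hypothetical placeholders standing in for the individual recognition pipelines, and the verification stand-in corresponds to a global stage like the GHV sketch above; only the pooling structure is the point here.

```cpp
// Conceptual sketch only: pool (model, pose) hypotheses from several
// independent pipelines and let a single verification stage decide which
// subset is globally consistent with the scene.
#include <Eigen/Core>
#include <Eigen/StdVector>
#include <string>
#include <vector>

struct Hypothesis
{
  std::string model_id;   // which object model is hypothesized
  Eigen::Matrix4f pose;   // its 6DOF pose in scene coordinates
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW
};
using HypothesisVec = std::vector<Hypothesis, Eigen::aligned_allocator<Hypothesis>>;

// Placeholder pipelines (hypothetical names; each would analyze the RGB-D frame).
HypothesisVec runSiftPipeline()  { return {}; }
HypothesisVec runShapePipeline() { return {}; }

// Stand-in for the global Hypothesis Verification stage.
std::vector<bool> verifyHypotheses(const HypothesisVec& all)
{
  return std::vector<bool>(all.size(), true);
}

HypothesisVec recognize()
{
  HypothesisVec pooled = runSiftPipeline();           // 2D-feature-based hypotheses
  HypothesisVec shape  = runShapePipeline();          // 3D-shape-based hypotheses
  pooled.insert(pooled.end(), shape.begin(), shape.end());

  std::vector<bool> accepted = verifyHypotheses(pooled);
  HypothesisVec result;
  for (std::size_t i = 0; i < pooled.size(); ++i)
    if (accepted[i])
      result.push_back(pooled[i]);                    // keep only consistent hypotheses
  return result;
}
```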


International Conference on Robotics and Automation | 2012

3DNet: Large-scale object class recognition from CAD models

Walter Wohlkinger; Aitor Aldoma; Radu Bogdan Rusu; Markus Vincze

3D object and object class recognition gained momentum with the arrival of low-cost RGB-D sensors and enables robotics tasks not feasible years ago. Scaling object class recognition to hundreds of classes still requires extensive time and many objects for learning. To overcome the training issue, we introduce a methodology for learning 3D descriptors from synthetic CAD models and classifying never-before-seen objects at first glance, where classification rates and speed are suited for robotics tasks. We provide this in 3DNet (3d-net.org), a free resource for object class recognition and 6DOF pose estimation from point cloud data. 3DNet provides a large-scale hierarchical CAD-model database with increasing numbers of classes and difficulty (10, 50, 100 and 200 object classes), together with evaluation datasets that contain thousands of scenes captured with an RGB-D sensor. 3DNet further provides an open-source framework based on the Point Cloud Library (PCL) for testing new descriptors and benchmarking state-of-the-art descriptors together with pose estimation procedures, enabling robotics tasks such as search and grasping.
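
As a rough illustration (not the 3DNet code), classifying a never-before-seen view can be reduced to nearest-neighbor search in descriptor space over descriptors generated from the CAD models; the per-descriptor class labels are an assumed input here.

```cpp
// Illustrative 1-NN classification of a view descriptor against a database
// of descriptors rendered from CAD models, using a kd-tree over histograms.
#include <pcl/kdtree/kdtree_flann.h>
#include <pcl/point_types.h>
#include <string>
#include <vector>

std::string
classifyView(const pcl::PointCloud<pcl::VFHSignature308>::Ptr& training_descriptors,
             const std::vector<std::string>& labels,   // one class label per training descriptor
             const pcl::VFHSignature308& query)
{
  pcl::KdTreeFLANN<pcl::VFHSignature308> tree;
  tree.setInputCloud(training_descriptors);

  std::vector<int> indices(1);
  std::vector<float> sq_distances(1);
  tree.nearestKSearch(query, 1, indices, sq_distances);  // nearest neighbour in descriptor space
  return labels[indices[0]];
}
```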


Intelligent Robots and Systems | 2015

RGB-D object modelling for object recognition and tracking

Johann Prankl; Aitor Aldoma; Alexander Svejda; Markus Vincze

This work presents a flexible system to reconstruct 3D models of objects captured with an RGB-D sensor. A major advantage of the method is that, unlike other modelling tools, our reconstruction pipeline allows the user to acquire a full 3D model of the object. This is achieved by acquiring several partial 3D models in different sessions, each session presenting the object of interest in a different configuration that reveals occluded parts of the object; the partial models are then automatically merged to reconstruct a full 3D model. In addition, the 3D models acquired by our system can be directly used by state-of-the-art object instance recognition and object tracking modules, providing object-perception capabilities to complex applications requiring these functionalities (e.g. human-object interaction analysis, robot grasping, etc.). The system does not impose constraints on the appearance of objects (textured or untextured) nor on the modelling setup (moving camera with a static object, or turntable setups with a static camera). The proposed reconstruction system has been used to model a large number of objects, resulting in metrically accurate and visually appealing 3D models.
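
A highly simplified sketch of the merging idea, assuming two partial models are already roughly aligned: refine the alignment with ICP and concatenate. The actual system performs a full multi-session registration pipeline rather than a single pairwise ICP.

```cpp
// Simplified sketch: align one partial model to a reference model with ICP
// and concatenate the result into a merged cloud.
#include <pcl/point_types.h>
#include <pcl/registration/icp.h>

pcl::PointCloud<pcl::PointXYZRGB>::Ptr
mergePartialModels(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& reference,
                   const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& partial)
{
  pcl::IterativeClosestPoint<pcl::PointXYZRGB, pcl::PointXYZRGB> icp;
  icp.setInputSource(partial);
  icp.setInputTarget(reference);
  icp.setMaxCorrespondenceDistance(0.02);   // 2 cm gating for correspondences

  pcl::PointCloud<pcl::PointXYZRGB> aligned;
  icp.align(aligned);                        // refines the source-to-target transform

  pcl::PointCloud<pcl::PointXYZRGB>::Ptr merged(new pcl::PointCloud<pcl::PointXYZRGB>(*reference));
  if (icp.hasConverged())
    *merged += aligned;                      // concatenate the aligned partial view
  return merged;
}
```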


International Conference on Robotics and Automation | 2012

Supervised learning of hidden and non-hidden 0-order affordances and detection in real scenes

Aitor Aldoma; Federico Tombari; Markus Vincze

The ability to perceive possible interactions with the environment is a key capability of task-guided robotic agents. An important subset of possible interactions depends solely on the objects of interest and their position and orientation in the scene. We call these object-based interactions 0-order affordances and divide them into non-hidden and hidden, depending on whether the current configuration of an object in the scene renders its affordance directly usable or not. In contrast to other works, we propose that detecting affordances that are not directly perceivable increases the usefulness of robotic agents with manipulation capabilities: by appropriate manipulation they can modify the object configuration until the sought affordance becomes available. In this paper we show how 0-order affordances depending on the geometry of the objects and their pose can be learned using a supervised learning strategy on 3D mesh representations of the objects, allowing the use of the whole object geometry. Moreover, we show how the learned affordances can be detected in real scenes obtained with a low-cost depth sensor like the Microsoft Kinect through object recognition and 6DOF pose estimation, and we present results for both learning on meshes and detection on real scenes to demonstrate the practical applicability of the presented approach.


IEEE Robotics & Automation Magazine | 2017

The STRANDS Project: Long-Term Autonomy in Everyday Environments

Nick Hawes; Christopher Burbridge; Ferdian Jovan; Lars Kunze; Bruno Lacerda; Lenka Mudrová; Jay Young; Jeremy L. Wyatt; Denise Hebesberger; Tobias Körtner; Rares Ambrus; Nils Bore; John Folkesson; Patric Jensfelt; Lucas Beyer; Alexander Hermans; Bastian Leibe; Aitor Aldoma; Thomas Faulhammer; Michael Zillich; Markus Vincze; Eris Chinellato; Muhannad Al-Omari; Paul Duckworth; Yiannis Gatsoulis; David C. Hogg; Anthony G. Cohn; Christian Dondrup; Jaime Pulido Fentanes; Tomas Krajnik

Thanks to the efforts of the robotics and autonomous systems community, the myriad applications and capacities of robots are ever increasing. There is increasing demand from end users for autonomous service robots that can operate in real environments for extended periods. In the Spatiotemporal Representations and Activities for Cognitive Control in Long-Term Scenarios (STRANDS) project (http://strandsproject.eu), we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots and deploying these systems for long-term installations in security and care environments. Our robots have been operational for a combined duration of 104 days over four deployments, autonomously performing end-user-defined tasks and traversing 116 km in the process. In this article, we describe the approach we used to enable long-term autonomous operation in everyday environments and how our robots are able to use their long run times to improve their own performance.


Intelligent Robots and Systems | 2014

Automation of “ground truth” annotation for multi-view RGB-D object instance recognition datasets

Aitor Aldoma; Thomas Faulhammer; Markus Vincze

Aiming at reducing the labour intensity associated with the acquisition of ground truth annotations for object instance recognition datasets, this paper discusses a novel multi-view recognition method to automate the annotation (object instances and associated poses) of individual images in multi-view RGB-D datasets. In combination with recent single-view object recognition techniques, the supplementary information provided by multiple vantage points results in a rich and integrated representation of the environment, in the form of a 3D reconstructed scene as well as object hypotheses therein. We argue that such a representation facilitates improved recognition to an extent that the recovered results, obtained by means of a suitable 3D hypotheses verification stage, closely resemble the ground truth of the scene under consideration. On two large datasets, totalling more than 3500 object instances, our method yields 99.1% and 93.2% correct automatic annotations. These results corroborate our approach for the task at hand.
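
A minimal sketch of the multi-view accumulation step, assuming camera poses are already known (e.g. from the registration stage): transform each view into a common reference frame and concatenate, yielding the reconstructed scene against which single-view hypotheses can be verified.

```cpp
// Minimal sketch: accumulate registered RGB-D views into one reconstructed scene.
#include <pcl/common/transforms.h>
#include <pcl/point_types.h>
#include <Eigen/Core>
#include <Eigen/StdVector>
#include <vector>

pcl::PointCloud<pcl::PointXYZRGB>::Ptr
accumulateViews(const std::vector<pcl::PointCloud<pcl::PointXYZRGB>::Ptr>& views,
                const std::vector<Eigen::Matrix4f,
                                  Eigen::aligned_allocator<Eigen::Matrix4f>>& camera_poses)
{
  pcl::PointCloud<pcl::PointXYZRGB>::Ptr scene(new pcl::PointCloud<pcl::PointXYZRGB>);
  for (std::size_t i = 0; i < views.size(); ++i)
  {
    pcl::PointCloud<pcl::PointXYZRGB> in_world;
    pcl::transformPointCloud(*views[i], in_world, camera_poses[i]);  // view -> world frame
    *scene += in_world;                                              // concatenate into the scene
  }
  return scene;
}
```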

Collaboration


Dive into Aitor Aldoma's collaborations.

Top Co-Authors

Markus Vincze (Vienna University of Technology)
Walter Wohlkinger (Vienna University of Technology)
Johann Prankl (Vienna University of Technology)
Michael Zillich (Vienna University of Technology)
Thomas Faulhammer (Vienna University of Technology)
Alexander Svejda (Vienna University of Technology)