
Publication


Featured research published by Alvaro Collet.


Autonomous Robots | 2010

HERB: a home exploring robotic butler

Siddhartha S. Srinivasa; Dave Ferguson; Casey Helfrich; Dmitry Berenson; Alvaro Collet; Rosen Diankov; Garratt Gallagher; Geoffrey A. Hollinger; James J. Kuffner; Michael Vande Weghe

We describe the architecture, algorithms, and experiments with HERB, an autonomous mobile manipulator that performs useful manipulation tasks in the home. We present new algorithms for searching for objects, learning to navigate in cluttered dynamic indoor scenes, recognizing and registering objects accurately in high clutter using vision, manipulating doors and other constrained objects using caging grasps, grasp planning and execution in clutter, and manipulation on pose and torque constraint manifolds. We also present results from numerous demanding real-world tests of the integration of these algorithms into a single mobile manipulator.


International Conference on Robotics and Automation | 2009

Object recognition and full pose registration from a single image for robotic manipulation

Alvaro Collet; Dmitry Berenson; Siddhartha S. Srinivasa; Dave Ferguson

Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation of the pose output being used by an autonomous robot to perform grasping in highly cluttered scenes.
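
To make the registration idea concrete, here is a minimal sketch (Python/NumPy) of combining mean shift with RANSAC-style verification to register multiple instances of one object. It illustrates the concept only, not the paper's implementation: the translation-only pose model, all parameter values, and every function name are assumptions.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=0.6, iters=30):
    """Toy mean shift with a flat kernel: move each seed to the mean of its
    neighbors until the modes stabilize, then merge near-duplicates."""
    modes = points.copy()
    for _ in range(iters):
        for i, m in enumerate(modes):
            nb = points[np.linalg.norm(points - m, axis=1) < bandwidth]
            if len(nb):
                modes[i] = nb.mean(axis=0)
    uniq = []
    for m in modes:
        if not any(np.linalg.norm(m - u) < bandwidth / 2 for u in uniq):
            uniq.append(m)
    return uniq

def ransac_verify(hyps, mode, inlier_tol=0.3, min_inliers=5):
    """RANSAC-style verification: count hypotheses consistent with the mode
    and refit the pose on the inliers (here simply their mean)."""
    d = np.linalg.norm(hyps - mode, axis=1)
    inliers = hyps[d < inlier_tol]
    return inliers.mean(axis=0) if len(inliers) >= min_inliers else None

# Toy data: translation hypotheses from two instances of the same object
# (around (0,0) and (3,3)) plus uniform clutter from bad matches.
rng = np.random.default_rng(0)
hyps = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(3.0, 0.1, (20, 2)),
                  rng.uniform(-2.0, 5.0, (10, 2))])
for mode in mean_shift_modes(hyps):
    pose = ransac_verify(hyps, mode)
    if pose is not None:
        print("object instance registered at", np.round(pose, 2))
```

Mean shift supplies one mode per instance and the inlier test discards modes produced by clutter, which is what allows several instances of the same model to be registered from a single image.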


The International Journal of Robotics Research | 2011

The MOPED framework: Object recognition and pose estimation for manipulation

Alvaro Collet; Manuel Martinez; Siddhartha S. Srinivasa

We present MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework. We address two main challenges in computer vision for robotics: robust performance in complex scenes, and low latency for real-time operation. We achieve robust performance with Iterative Clustering Estimation (ICE), a novel algorithm that iteratively combines feature clustering with robust pose estimation. Feature clustering quickly partitions the scene and produces object hypotheses. The hypotheses are used to further refine the feature clusters, and the two steps iterate until convergence. ICE is easy to parallelize, and easily integrates single- and multi-camera object recognition and pose estimation. We also introduce a novel object hypothesis scoring function based on M-estimator theory, and a novel pose clustering algorithm that robustly handles recognition outliers. We achieve scalability and low latency with an improved feature matching algorithm for large databases, a GPU/CPU hybrid architecture that exploits parallelism at all levels, and an optimized resource scheduler. We provide extensive experimental results demonstrating state-of-the-art performance in terms of recognition, scalability, and latency in real-world robotic applications.
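
The core of ICE is the alternation between clustering and robust estimation. Below is a toy sketch of that loop under heavy assumptions (2D point "features", translation-only "poses", a median as the robust estimator); it paraphrases the idea and is not MOPED's actual code.

```python
import numpy as np

def ice_toy(feats, hyps, iters=20):
    """Toy Iterative Clustering Estimation: alternate (1) clustering features
    around the current object hypotheses and (2) robustly re-estimating each
    hypothesis from its cluster, until the two steps agree."""
    for _ in range(iters):
        # Clustering step: assign every feature to its closest hypothesis.
        d = np.linalg.norm(feats[:, None, :] - hyps[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Estimation step: robust pose update per cluster (median, not mean).
        new = np.array([np.median(feats[assign == k], axis=0)
                        if (assign == k).any() else hyps[k]
                        for k in range(len(hyps))])
        if np.allclose(new, hyps, atol=1e-6):
            break  # converged: clusters and poses are mutually consistent
        hyps = new
    return hyps, assign

rng = np.random.default_rng(2)
feats = np.vstack([rng.normal(0.0, 0.2, (30, 2)),   # features of object A
                   rng.normal(4.0, 0.2, (30, 2))])  # features of object B
seeds = feats[[0, -1]]  # crude init; a real system seeds from feature clustering
poses, labels = ice_toy(feats, seeds)
print("recovered object 'poses':\n", np.round(poses, 2))
```

Convergence here, when assignments and hypotheses stop changing, mirrors the mutual refinement between feature clusters and object hypotheses that the abstract describes.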


International Conference on Robotics and Automation | 2009

Manipulation planning with Workspace Goal Regions

Dmitry Berenson; Siddhartha S. Srinivasa; Dave Ferguson; Alvaro Collet; James J. Kuffner

We present an approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses. Instead of specifying a discrete set of goals in the manipulator's configuration space, we specify goals more intuitively as volumes in the manipulator's workspace. We show that WGRs provide a common framework for describing goal regions that are useful for grasping and manipulation. We also describe two randomized planning algorithms capable of planning with WGRs. The first is an extension of RRT-JT that interleaves exploration using a Rapidly-exploring Random Tree (RRT) with exploitation using Jacobian-based gradient descent toward WGR samples. The second is the IKBiRRT algorithm, which uses a forward-searching tree rooted at the start and a backward-searching tree that is seeded by WGR samples. We demonstrate both simulation and experimental results for a 7-DOF WAM arm with a mobile base performing reaching and pick-and-place tasks. Our results show that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.
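
The exploitation move in RRT-JT, Jacobian-based descent toward a WGR sample, can be sketched on a planar two-link arm. Everything below (link lengths, the box-shaped WGR, step sizes) is an illustrative assumption rather than the paper's setup.

```python
import numpy as np

L1, L2 = 1.0, 0.8  # link lengths of a toy planar 2-link arm (assumed)

def fk(q):
    """Forward kinematics: end-effector (x, y) of the 2-link arm."""
    return np.array([L1*np.cos(q[0]) + L2*np.cos(q[0]+q[1]),
                     L1*np.sin(q[0]) + L2*np.sin(q[0]+q[1])])

def jacobian(q):
    """Analytic end-effector Jacobian of fk with respect to the joints."""
    s1, s12 = np.sin(q[0]), np.sin(q[0]+q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0]+q[1])
    return np.array([[-L1*s1 - L2*s12, -L2*s12],
                     [ L1*c1 + L2*c12,  L2*c12]])

def sample_wgr(rng):
    """Sample an end-effector goal from a box-shaped Workspace Goal Region."""
    return rng.uniform([0.8, 0.2], [1.2, 0.6])  # [x_min, y_min]..[x_max, y_max]

def descend_to_wgr(q, rng, step=0.2, tol=1e-3, iters=1000):
    """Jacobian-transpose gradient descent toward a WGR sample
    (the 'exploitation' move interleaved with RRT growth in RRT-JT)."""
    goal = sample_wgr(rng)
    for _ in range(iters):
        err = goal - fk(q)
        if np.linalg.norm(err) < tol:
            return q, goal
        q = q + step * jacobian(q).T @ err  # J^T maps task-space error to joints
    return None, goal  # descent failed; a planner would resume exploring

q, goal = descend_to_wgr(np.array([0.3, 0.5]), np.random.default_rng(0))
if q is not None:
    print("reached", np.round(fk(q), 3), "target", np.round(goal, 3))
```

In the full planner this descent is interleaved with ordinary RRT exploration, so a failed descent simply returns control to the tree.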


Proceedings of the IEEE | 2012

Herb 2.0: Lessons Learned From Developing a Mobile Manipulator for the Home

Siddhartha S. Srinivasa; Dmitry Berenson; Maya Cakmak; Alvaro Collet; Mehmet Remzi Dogar; Anca D. Dragan; Ross A. Knepper; Tim Niemueller; Kyle Strabala; M. Vande Weghe; Julius Ziegler

We present the hardware design, software architecture, and core algorithms of Herb 2.0, a bimanual mobile manipulator developed at the Personal Robotics Lab at Carnegie Mellon University, Pittsburgh, PA. We have developed Herb 2.0 to perform useful tasks for and with people in human environments. We exploit two key paradigms in human environments: that they have structure that a robot can learn, adapt and exploit, and that they demand general-purpose capability in robotic systems. In this paper, we reveal some of the structure present in everyday environments that we have been able to harness for manipulation and interaction, comment on the particular challenges of working in human spaces, and describe some of the lessons we learned from extensively testing our integrated platform in kitchen and office environments.


International Conference on Robotics and Automation | 2010

MOPED: A scalable and low latency object recognition and pose estimation system

Manuel Martinez; Alvaro Collet; Siddhartha S. Srinivasa

The latency of a perception system is crucial for a robot performing interactive tasks in dynamic human environments. We present MOPED, a fast and scalable perception system for object recognition and pose estimation. MOPED builds on POSESEQ, a state-of-the-art object recognition algorithm, demonstrating a massive improvement in scalability and latency without sacrificing robustness. We achieve this with both algorithmic and architectural improvements: a novel feature matching algorithm, a hybrid GPU/CPU architecture that exploits parallelism at all levels, and an optimized resource scheduler. Using the same standard hardware, we achieve up to a 30x improvement on real-world scenes.
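
Much of the latency gain in systems like this comes from keeping CPU and GPU stages busy at the same time. The sketch below shows generic two-stage pipelining with worker threads and bounded queues; it illustrates the pattern only and is not MOPED's scheduler (the stage names and costs are invented).

```python
import queue, threading, time

def stage(name, inbox, outbox, cost):
    """One pipeline stage: pull a frame, simulate `cost` seconds of work, pass it on."""
    while (item := inbox.get()) is not None:
        time.sleep(cost)          # stand-in for feature extraction / matching
        if outbox is not None:
            outbox.put(item)
        print(f"{name} finished frame {item}")
    if outbox is not None:
        outbox.put(None)          # propagate the shutdown sentinel downstream

q1, q2 = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
threads = [threading.Thread(target=stage, args=("cpu-extract", q1, q2, 0.05)),
           threading.Thread(target=stage, args=("gpu-match", q2, None, 0.05))]
for t in threads:
    t.start()
for frame in range(5):
    q1.put(frame)                 # frames flow through both stages concurrently
q1.put(None)                      # shutdown sentinel
for t in threads:
    t.join()
```

Because the two stages overlap, the steady-state latency per frame approaches the cost of the slowest stage rather than the sum of all stages.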


Computer Vision and Pattern Recognition | 2010

Making specific features less discriminative to improve point-based 3D object recognition

Edward Hsiao; Alvaro Collet; Martial Hebert

We present a framework that retains ambiguity in feature matching to increase the performance of 3D object recognition systems. Whereas previous systems removed ambiguous correspondences during matching, we show that ambiguity should be resolved during hypothesis testing and not at the matching phase. To preserve ambiguity during matching, we vector quantize and match model features in a hierarchical manner. This matching technique allows our system to be more robust to the distribution of model descriptors in feature space. We also show that we can address recognition under arbitrary viewpoint by using our framework to facilitate matching of additional features extracted from affine-transformed model images. We evaluate our algorithms on a difficult 3D object recognition dataset of 620 images.
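
The matching strategy can be caricatured as follows: quantize descriptors through coarse-to-fine codebooks and return the whole candidate set in the query's cell, rather than a single nearest neighbor, deferring disambiguation to hypothesis testing. A toy sketch under that reading (the two-level codebooks, dimensions, and names are all assumptions; a real vocabulary tree conditions each level on its parent):

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=10):
    """Tiny k-means for building the quantization codebooks."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        a = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(1)
        C = np.array([X[a == j].mean(0) if (a == j).any() else C[j]
                      for j in range(k)])
    return C

def quantize_path(x, levels):
    """Descend the codebooks, recording the chosen branch at each level.
    (A real vocabulary tree conditions each codebook on its parent branch;
    this flat toy just re-quantizes against the next codebook.)"""
    return tuple(np.linalg.norm(C - x, axis=1).argmin() for C in levels)

# Model features (SIFT-like descriptors; here 8-D toys).
model = rng.normal(size=(200, 8))
levels = [kmeans(model, 4), kmeans(model, 16)]
index = {}
for i, f in enumerate(model):
    index.setdefault(quantize_path(f, levels), []).append(i)

query = model[17] + rng.normal(scale=0.05, size=8)
candidates = index.get(quantize_path(query, levels), [])
print("ambiguous candidate set:", candidates)  # all kept for hypothesis testing
```

Returning the entire cell keeps specific features deliberately less discriminative at match time, which is the behavior the abstract argues for.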


International Conference on Robotics and Automation | 2010

Efficient multi-view object recognition and full pose estimation

Alvaro Collet; Siddhartha S. Srinivasa

We present an approach for efficiently recognizing all objects in a scene and estimating their full pose from multiple views. Our approach builds upon a state-of-the-art single-view algorithm which recognizes and registers learned metric 3D models using local descriptors. We extend it to multiple views using a novel multi-step optimization that processes each view individually and feeds consistent hypotheses back to the algorithm for global refinement. We demonstrate that our method produces results comparable to the theoretical optimum, a full multi-view generalized camera approach, while avoiding its combinatorial time complexity. We provide experimental results demonstrating pose accuracy, speed, and robustness to model error using a three-camera rig, as well as a physical implementation of the pose output being used by an autonomous robot executing grasps in highly cluttered scenes.
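
One way to picture the multi-step optimization: run the single-view recognizer per camera, keep only hypotheses that reappear in every view, and refine the survivors globally. The sketch below does exactly that on toy data (translation-only "poses" and invented noise levels; it paraphrases the idea and is not the paper's optimizer).

```python
import numpy as np

def single_view_hypotheses(rng, true_pose, n_clutter=3):
    """Stand-in for the single-view recognizer: one noisy detection plus clutter."""
    good = true_pose + rng.normal(0, 0.05, 3)
    clutter = rng.uniform(-1, 1, (n_clutter, 3))
    return np.vstack([good, clutter])

def multi_view_refine(views, tol=0.2):
    """Keep hypotheses that reappear (within tol) in every view, then
    refine each surviving object pose as the mean over views."""
    consistent = []
    for h in views[0]:
        support = [h]
        for v in views[1:]:
            d = np.linalg.norm(v - h, axis=1)
            if d.min() > tol:
                break               # no support in this view: discard hypothesis
            support.append(v[d.argmin()])
        else:
            consistent.append(np.mean(support, axis=0))  # global refinement
    return np.array(consistent)

rng = np.random.default_rng(4)
true_pose = np.array([0.3, -0.1, 0.8])   # one object, toy 3-D "pose"
views = [single_view_hypotheses(rng, true_pose) for _ in range(3)]
print(np.round(multi_view_refine(views), 3))
```

Checking per-view hypotheses against each other is far cheaper than jointly searching all views, which is the complexity argument the abstract makes against the generalized camera approach.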


International Conference on Robotics and Automation | 2011

Structure discovery in multi-modal data: A region-based approach

Alvaro Collet; Siddhartha S. Srinivasa; Martial Hebert

The ability of a perception system to discern what is important in a scene and what is not is an invaluable asset, with multiple applications in object recognition, people detection and SLAM, among others. In this paper, we aim to analyze all sensory data available to separate a scene into a few physically meaningful parts, which we term structure, while discarding background clutter. In particular, we consider the combination of image and range data, and base our decision on both appearance and 3D shape. Our main contribution is the development of a framework to perform scene segmentation that preserves physical objects using multi-modal data. We combine image and range data using a novel mid-level fusion technique based on the concept of regions that avoids any pixel-level correspondences between data sources. We associate groups of pixels with 3D points into multi-modal regions that we term regionlets, and measure the structure-ness of each regionlet using simple, bottom-up cues from image and range features. We show that the highest-ranked regionlets correspond to the most prominent objects in the scene. We verify the validity of our approach on 105 scenes of household environments.
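
A regionlet's structure-ness score combines appearance and shape cues at the region level, with no pixel-level correspondence between the modalities. As a loose illustration (the specific cues, the equal weighting, and the data layout are invented, not the paper's features), one might score regions like this:

```python
import numpy as np

def structureness(regionlet, scene_depth_std):
    """Toy 'structure-ness' of a multi-modal region: an appearance cue
    (color contrast against the surroundings) combined with a shape cue
    (3D compactness). Cues and equal weights are illustrative assumptions."""
    color_contrast = np.linalg.norm(regionlet["mean_color"] - regionlet["bg_color"])
    pts = regionlet["points3d"]
    compact = 1.0 / (1.0 + np.std(pts - pts.mean(0)) / scene_depth_std)
    return 0.5 * color_contrast + 0.5 * compact

rng = np.random.default_rng(5)
mug = {"mean_color": np.array([0.9, 0.1, 0.1]),      # red mug on a gray table
       "bg_color":   np.array([0.5, 0.5, 0.5]),
       "points3d":   rng.normal(0, 0.03, (100, 3))}  # tight 3-D blob
wall = {"mean_color": np.array([0.55, 0.5, 0.5]),    # background surface
        "bg_color":   np.array([0.5, 0.5, 0.5]),
        "points3d":   rng.normal(0, 0.8, (100, 3))}  # spread-out 3-D points
for name, r in [("mug", mug), ("wall", wall)]:
    print(name, round(structureness(r, scene_depth_std=0.5), 3))
```

The compact, distinctive region outranks the diffuse background one, matching the claim that the highest-ranked regionlets correspond to the most prominent objects.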


Computer Vision and Pattern Recognition | 2007

Filtered Component Analysis to Increase Robustness to Local Minima in Appearance Models

F. De la Torre; Alvaro Collet; M. Quero; Jeffrey F. Cohn; Takeo Kanade

Appearance models (AMs) are commonly used to model appearance and shape variation of objects in images. In particular, they have proven useful for detection, tracking, and synthesis of people's faces from video. While AMs have numerous advantages relative to alternative approaches, they have at least two important drawbacks. First, they are especially prone to local minima in fitting; this problem becomes increasingly problematic as the number of parameters to estimate grows. Second, often few if any of the local minima correspond to the correct location of the model. To address these problems, we propose filtered component analysis (FCA), an extension of traditional principal component analysis (PCA). FCA learns an optimal set of filters with which to build a multi-band representation of the object. FCA representations were found to be more robust to local minima than either grayscale or Gabor filter representations. The effectiveness and robustness of the proposed algorithm are demonstrated on both synthetic and real data.
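
FCA's key move is running component analysis on a multi-band, filtered representation rather than on raw pixels, with the filters themselves learned. A minimal sketch of the representation side follows, with a fixed, hand-picked filter bank standing in for the learned one and 1-D signals standing in for images; all of that is assumed for illustration.

```python
import numpy as np

# A fixed toy filter bank. FCA *learns* its filters; these are hand-picked
# stand-ins: a low-pass (smoothing) band and a derivative (edge) band.
filters = [np.array([0.25, 0.5, 0.25]),
           np.array([-1.0, 0.0, 1.0])]

def multiband(signal):
    """Stack filtered versions of the signal into one multi-band vector."""
    return np.concatenate([np.convolve(signal, f, mode="same") for f in filters])

def pca(X, k):
    """Plain PCA via SVD on mean-centered data; FCA applies component
    analysis to the multi-band representation instead of raw pixels."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]  # top-k principal directions

rng = np.random.default_rng(6)
t = np.linspace(0, 1, 64)
signals = np.array([np.sin(2*np.pi*(t + rng.uniform())) + rng.normal(0, 0.1, 64)
                    for _ in range(40)])          # toy 1-D "appearances"
basis_raw = pca(signals, k=3)                     # standard AM/PCA basis
basis_fca = pca(np.array([multiband(s) for s in signals]), k=3)
print(basis_raw.shape, basis_fca.shape)           # (3, 64) vs. (3, 128)
```

The multi-band basis fits against several filtered views of the object at once, which is what gives the fitting cost a smoother landscape with fewer misleading local minima.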

Collaboration


Dive into Alvaro Collet's collaborations.

Top Co-Authors

Martial Hebert
Carnegie Mellon University

Bo Xiong
University of Texas at Austin

James J. Kuffner
Carnegie Mellon University

Corina Gurau
Jacobs University Bremen

Manuel Martinez
Karlsruhe Institute of Technology

Anca D. Dragan
University of California