Publication


Featured research published by Jeremie Papon.


Computer Vision and Pattern Recognition | 2013

Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

Jeremie Papon; Alexey Abramov; Markus Schoeler; Florentin Wörgötter

Unsupervised over-segmentation of an image into regions of perceptually similar pixels, known as superpixels, is a widely used preprocessing step in segmentation algorithms. Superpixel methods reduce the number of regions that must be considered later by more computationally expensive algorithms, with a minimal loss of information. Nevertheless, as some information is inevitably lost, it is vital that superpixels not cross object boundaries, as such errors will propagate through later steps. Existing methods make use of projected color or depth information, but do not consider the three-dimensional geometric relationships between observed data points, which can be used to prevent superpixels from crossing regions of empty space. We propose a novel over-segmentation algorithm which uses voxel relationships to produce over-segmentations which are fully consistent with the spatial geometry of the scene in three-dimensional, rather than projective, space. Enforcing the constraint that segmented regions must have spatial connectivity prevents label flow across semantic object boundaries which might otherwise be violated. Additionally, as the algorithm works directly in 3D space, observations from several calibrated RGB+D cameras can be segmented jointly. Experiments on a large dataset of human-annotated RGB+D images demonstrate a significant reduction in the occurrence of clusters crossing object boundaries, while maintaining speeds comparable to state-of-the-art 2D methods.
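
The central constraint is that supervoxel labels may only flow between occupied, adjacent voxels, so clusters can never bridge empty space. A minimal, self-contained sketch of that idea (this is not the published VCCS implementation; seeding is simplified and the weighted color/spatial/normal distance competition between seeds is omitted):

```python
import numpy as np
from collections import deque

def supervoxelize(points, voxel_res=0.01, seed_res=0.1):
    # Quantize points into occupied voxels (voxel key -> mean position).
    keys = np.floor(points / voxel_res).astype(int)
    voxels = {}
    for k, p in zip(map(tuple, keys), points):
        s, n = voxels.get(k, (np.zeros(3), 0))
        voxels[k] = (s + p, n + 1)
    voxels = {k: s / n for k, (s, n) in voxels.items()}

    # Pick one seed voxel per seed_res-sized cell.
    labels, queue, seen = {k: -1 for k in voxels}, deque(), set()
    for k, pos in voxels.items():
        cell = tuple(np.floor(pos / seed_res).astype(int))
        if cell not in seen:
            seen.add(cell)
            labels[k] = len(seen) - 1
            queue.append(k)

    # Breadth-first label flow through occupied 26-neighbors only,
    # so a supervoxel can never bridge a gap of empty space.
    offs = [(i, j, l) for i in (-1, 0, 1) for j in (-1, 0, 1)
            for l in (-1, 0, 1) if (i, j, l) != (0, 0, 0)]
    while queue:
        k = queue.popleft()
        for o in offs:
            nk = (k[0] + o[0], k[1] + o[1], k[2] + o[2])
            if nk in voxels and labels[nk] == -1:
                labels[nk] = labels[k]
                queue.append(nk)
    return labels  # voxel key -> supervoxel id
```

The real algorithm additionally lets adjacent seeds compete for voxels via a weighted color, spatial, and normal distance; the geodesic flood fill above keeps only the connectivity constraint.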


Computer Vision and Pattern Recognition | 2014

Object Partitioning Using Local Convexity

Simon Christoph Stein; Markus Schoeler; Jeremie Papon; Florentin Wörgötter

The problem of how to arrive at an appropriate 3D segmentation of a scene remains difficult. While current state-of-the-art methods continue to gradually improve in benchmark performance, they also grow more and more complex, for example by incorporating chains of classifiers which require training on large manually annotated datasets. As an alternative, we present a new, efficient, learning- and model-free approach for the segmentation of 3D point clouds into object parts. The algorithm begins by decomposing the scene into an adjacency graph of surface patches based on a voxel grid. Edges in the graph are then classified as either convex or concave using a novel combination of simple criteria which operate on the local geometry of these patches. In this way the graph is divided into locally convex connected subgraphs, which represent object parts with high accuracy. Additionally, we propose a novel depth-dependent voxel grid to deal with the decreasing point density at far distances in the point clouds. This improves segmentation and allows the use of fixed parameters for vastly different scenes. The algorithm is straightforward to implement and requires no training data, while nevertheless producing results that are comparable to state-of-the-art methods which incorporate high-level concepts involving classification, learning, and model fitting.
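
The edge classification can be illustrated with a simple two-patch convexity test: a connection is convex when the surface normals tilt away from each other across the line joining the patch centroids. A sketch under that assumption (the paper combines several criteria; this shows only the basic angle test):

```python
import numpy as np

def edge_is_convex(c1, n1, c2, n2, tol_deg=10.0):
    """Two adjacent surface patches, each given by centroid c_i and unit
    normal n_i. The connection is convex when n1 points more along the
    centroid difference d = c1 - c2 than n2 does (angle a1 < a2); a small
    tolerance keeps near-coplanar patches connected."""
    d = c1 - c2
    d = d / (np.linalg.norm(d) + 1e-12)
    a1 = np.degrees(np.arccos(np.clip(n1 @ d, -1.0, 1.0)))
    a2 = np.degrees(np.arccos(np.clip(n2 @ d, -1.0, 1.0)))
    return a2 - a1 > -tol_deg

# Outer corner of a box (convex) vs. floor-wall junction (concave):
print(edge_is_convex(np.array([0., 0, 1]), np.array([0., 0, 1]),
                     np.array([1., 0, 0]), np.array([1., 0, 0])))   # True
print(edge_is_convex(np.array([0., 0, 0]), np.array([0., 0, 1]),
                     np.array([1., 0, 1]), np.array([-1., 0, 0])))  # False
```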


Workshop on Applications of Computer Vision | 2012

Depth-supported real-time video segmentation with the Kinect

Alexey Abramov; Karl Pauwels; Jeremie Papon; Florentin Wörgötter; Babette Dellen

We present a real-time technique for the spatiotemporal segmentation of color/depth movies. Images are segmented using a parallel Metropolis algorithm implemented on a GPU, utilizing both color and depth information acquired with the Microsoft Kinect. Segments represent the equilibrium states of a Potts model, and tracking of segments is achieved by warping the obtained segment labels to the next frame using real-time optical flow, which reduces the number of iterations required for the Metropolis method to reach the new equilibrium state. Including depth information in the framework makes true object boundaries easier to find and also improves the temporal coherence of the method. The algorithm has been tested on videos of medium resolution showing human manipulations of objects. The framework provides an inexpensive visual front end for the preprocessing of videos in industrial settings and robot labs, and can potentially be used in various applications.
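
As a rough illustration of the relaxation step, here is a sequential toy version of a Metropolis sweep for a Potts model whose bond strengths come from color and depth similarity (the coupling form and weights here are made up for illustration; the paper's version runs in parallel on a GPU):

```python
import numpy as np

def metropolis_sweep(labels, color, depth, T=0.5, w_c=4.0, w_d=2.0, rng=None):
    """One sequential Metropolis sweep of a Potts model on a 4-connected
    grid. Pixels propose adopting a random neighbor's label; bonds are
    strong (prefer equal labels) where color and depth are similar."""
    rng = rng or np.random.default_rng(0)
    H, W = labels.shape

    def coupling(p, q):  # bond strength between neighboring pixels
        dc = np.linalg.norm(color[p] - color[q])
        dd = abs(depth[p] - depth[q])
        return np.exp(-(w_c * dc + w_d * dd))

    for y in range(H):
        for x in range(W):
            p = (y, x)
            nbrs = [(y + dy, x + dx)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= y + dy < H and 0 <= x + dx < W]
            old = labels[p]
            new = labels[nbrs[rng.integers(len(nbrs))]]
            if new == old:
                continue
            # Potts energy change of relabeling p from `old` to `new`
            dE = sum(coupling(p, n) * (int(labels[n] == old) - int(labels[n] == new))
                     for n in nbrs)
            if dE < 0 or rng.random() < np.exp(-dE / T):
                labels[p] = new
    return labels
```

Initializing each frame with the previous frame's labels, warped forward by optical flow, starts the relaxation near the new equilibrium, which is why only a few sweeps per frame are needed.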


International Conference on Robotics and Automation | 2014

Convexity based object partitioning for robot applications

Simon Christoph Stein; Florentin Wörgötter; Markus Schoeler; Jeremie Papon; Tomas Kulvicius

The idea that connected convex surfaces, separated by concave boundaries, play an important role in the perception of objects and their decomposition into parts has been discussed for a long time. Based on this idea, we present a new bottom-up approach for the segmentation of 3D point clouds into object parts. The algorithm approximates a scene using an adjacency graph of spatially connected surface patches. Edges in the graph are then classified as either convex or concave using a novel, strictly local criterion. Region growing is employed to identify locally convex connected subgraphs, which represent the object parts. We show quantitatively that our algorithm, although conceptually easy to grasp and fast to compute, produces results that are comparable to far more complex state-of-the-art methods which use classification, learning, and model fitting. This suggests that convexity/concavity is a powerful feature for object partitioning using 3D data. Furthermore, we demonstrate that for many objects a natural decomposition into “handle and body” emerges when employing our method. We exploit this property in a robotic application, enabling a robot to automatically grasp objects by their handles.
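
Given convex/concave edge labels (for instance from a test like the one sketched above), the region-growing step amounts to a flood fill that may only cross convex edges; a minimal sketch:

```python
from collections import deque

def grow_convex_parts(num_patches, edges, is_convex):
    """Partition an adjacency graph of surface patches into parts.

    edges: iterable of (i, j) patch index pairs
    is_convex: dict mapping (i, j) -> bool from the local convexity test
    Region growing may only cross convex edges, so concave boundaries
    (typically part boundaries) stop the flood fill.
    """
    adj = {i: [] for i in range(num_patches)}
    for i, j in edges:
        if is_convex.get((i, j), is_convex.get((j, i), False)):
            adj[i].append(j)
            adj[j].append(i)

    part, next_id = [-1] * num_patches, 0
    for s in range(num_patches):
        if part[s] != -1:
            continue
        part[s] = next_id
        q = deque([s])
        while q:                      # breadth-first flood fill
            u = q.popleft()
            for v in adj[u]:
                if part[v] == -1:
                    part[v] = next_id
                    q.append(v)
        next_id += 1
    return part  # patch index -> part id
```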


International Conference on Computer Vision | 2015

Semantic Pose Using Deep Networks Trained on Synthetic RGB-D

Jeremie Papon; Markus Schoeler

In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location of possible objects simultaneously. To overcome the lack of large annotated RGB-D training sets (especially those with pose), we use an on-the-fly rendering pipeline that generates realistic cluttered room scenes in parallel with training. We then perform transfer learning on the relatively small amount of publicly available annotated RGB-D data, and find that our model is able to successfully annotate even highly challenging real scenes. Importantly, our trained network is able to understand noisy and sparse observations of highly cluttered scenes with a remarkable degree of accuracy, inferring class and pose from a very limited set of cues. Additionally, our neural network is only moderately deep and estimates class, pose, and position simultaneously in a single pass, so the overall run-time is significantly lower than that of existing methods.
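
The multi-output structure can be sketched as a shared convolutional trunk with separate heads. The layer sizes, the 4-channel RGB-D input, and the discretized-yaw pose head below are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class MultiHeadNet(nn.Module):
    """Shared trunk with three output heads, so class, pose, and
    location come out of a single forward pass."""
    def __init__(self, n_classes=10, n_pose_bins=16):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(4, 32, 5, stride=2, padding=2), nn.ReLU(),   # RGB-D input
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.cls_head = nn.Linear(128, n_classes)     # object class logits
        self.pose_head = nn.Linear(128, n_pose_bins)  # discretized yaw bins
        self.loc_head = nn.Linear(128, 3)             # (x, y, z) offset

    def forward(self, x):
        f = self.trunk(x)
        return self.cls_head(f), self.pose_head(f), self.loc_head(f)

# All three outputs in one pass over a batch of RGB-D crops:
net = MultiHeadNet()
cls_logits, pose_logits, loc = net(torch.randn(2, 4, 96, 96))
```

Training would combine a cross-entropy loss on the class and pose heads with a regression loss on the location head, fed by the synthetic rendering pipeline.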


Intelligent Robots and Systems | 2013

Point cloud video object segmentation using a persistent supervoxel world-model

Jeremie Papon; Tomas Kulvicius; Eren Erdal Aksoy; Florentin Wörgötter

Robust visual tracking is an essential precursor to understanding and replicating human actions in robotic systems. In order to accurately evaluate the semantic meaning of a sequence of video frames, or to replicate an action contained therein, one must be able to coherently track and segment all observed agents and objects. This work proposes a novel online point-cloud-based algorithm which simultaneously tracks the 6DoF pose and determines the spatial extent of all entities in indoor scenarios. This is accomplished using a persistent supervoxel world model which is updated, rather than replaced, as new frames of data arrive. Maintaining a world model enables general object permanence, permitting successful tracking through full occlusions. Object models are tracked using a bank of independent adaptive particle filters which use a supervoxel observation model to give rough estimates of object state. These are united using a novel multi-model RANSAC-like approach, which seeks to minimize a global energy function associating world-model supervoxels to predicted states. We present results on a standard robotic assembly benchmark for two application scenarios, human trajectory imitation and semantic action understanding, demonstrating the usefulness of the tracking in intelligent robotic systems.
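
A single object's tracker reduces to a standard particle-filter cycle. Here is a generic predict/weight/resample sketch; the supervoxel observation model and the RANSAC-like global association across objects are the paper's contributions and are abstracted into the observe_loglik callback:

```python
import numpy as np

def particle_filter_step(particles, weights, observe_loglik,
                         motion_noise=0.01, rng=None):
    """One predict/weight/resample cycle for a single tracked object.

    particles: (N, D) array of pose hypotheses (e.g. D=6 for 6DoF)
    observe_loglik: callable scoring a pose against the current
    observation (in the paper, a supervoxel observation model).
    """
    rng = rng or np.random.default_rng(0)
    # Predict: diffuse hypotheses with random-walk motion noise.
    particles = particles + rng.normal(0, motion_noise, particles.shape)
    # Weight: score each hypothesis against the observation.
    logw = np.array([observe_loglik(p) for p in particles]) \
           + np.log(weights + 1e-300)
    logw -= logw.max()
    weights = np.exp(logw)
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    estimate = weights @ particles  # weighted-mean state estimate
    return particles, weights, estimate
```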


IEEE Transactions on Circuits and Systems for Video Technology | 2012

Real-Time Segmentation of Stereo Videos on a Portable System With a Mobile GPU

Alexey Abramov; Karl Pauwels; Jeremie Papon; Florentin Wörgötter; Babette Dellen

In mobile robotic applications, visual information needs to be processed fast despite resource limitations of the mobile system. Here, a novel real-time framework for model-free spatiotemporal segmentation of stereo videos is presented. It combines real-time optical flow and stereo with image segmentation and runs on a portable system with an integrated mobile graphics processing unit. The system performs online, automatic, and dense segmentation of stereo videos and serves as a visual front end for preprocessing in mobile robots, providing a condensed representation of the scene that can potentially be utilized in various applications, e.g., object manipulation, manipulation recognition, visual servoing. The method was tested on real-world sequences with arbitrary motions, including videos acquired with a moving camera.
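
The temporal linking that keeps such a spatiotemporal segmentation coherent is, at its core, the warping of the previous frame's segment labels along dense optical flow; a minimal sketch:

```python
import numpy as np

def warp_labels(labels, flow):
    """Warp segment labels from frame t to t+1 along dense optical flow.

    labels: (H, W) int array of segment ids at frame t
    flow:   (H, W, 2) array of per-pixel (dx, dy), frame t -> t+1
    The warped labels seed the segmentation at t+1, so only a few
    relaxation iterations are needed instead of starting from scratch.
    """
    H, W = labels.shape
    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    yt = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    warped = np.full((H, W), -1, dtype=labels.dtype)  # -1 = unassigned
    warped[yt, xt] = labels[ys, xs]  # forward splat; collisions keep last writer
    return warped
```

Unassigned (-1) and collided pixels are then resolved by the segmentation relaxation itself, which is one reason the warp can stay this cheap.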


Intelligent Robots and Systems | 2013

Toward a library of manipulation actions based on semantic object-action relations

Mohamad Javad Aein; Eren Erdal Aksoy; Minija Tamosiunaite; Jeremie Papon; Ales Ude; Florentin Wörgötter

The goal of this study is to provide an architecture for a generic definition of robot manipulation actions. We emphasize that the representation of actions presented here is “procedural”: thus, we define the structural elements of our action representations as execution protocols. To achieve this, manipulations are defined on three levels. The top level defines objects, their relations, and the actions in an abstract and symbolic way. A mid-level sequencer, with which the action primitives are chained, structures the actual action execution, which is performed via the bottom level. This lowest level collects data from sensors and communicates with the control system of the robot. This method enables robot manipulators to execute the same action in different situations, i.e., on different objects with different positions and orientations. In addition, two methods of detecting action failure are provided, which are necessary to handle faults in the system. To demonstrate the effectiveness of the proposed framework, several different actions are performed on our robotic setup and results are shown. In this way we are creating a library of human-like robot actions, which can be used by higher-level task planners to execute more complex tasks.
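
The mid-level sequencer can be pictured as an execution protocol: an ordered chain of primitives, each paired with a check used for failure detection. A structural sketch with hypothetical primitive and check names (not the paper's API):

```python
class ActionSequencer:
    """Mid-level sequencer: chains action primitives into an execution
    protocol and aborts when a step's post-condition check fails."""
    def __init__(self, primitives):
        # primitives: list of (name, execute_fn, check_fn) tuples
        self.primitives = primitives

    def run(self, context):
        for name, execute, check in self.primitives:
            execute(context)        # bottom level: sensing + robot control
            if not check(context):  # failure-detection hook
                return f"failed at primitive '{name}'"
        return "success"

# The top level would resolve a symbolic action ("pick A") into a
# parametrized chain like this hypothetical one:
# seq = ActionSequencer([("approach", approach, gripper_clear),
#                        ("grasp", close_gripper, object_in_gripper),
#                        ("lift", lift, object_in_gripper)])
# print(seq.run(context))
```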


Computer Vision and Pattern Recognition | 2015

Constrained planar cuts - Object partitioning for point clouds

Markus Schoeler; Jeremie Papon; Florentin Wörgötter

While humans can easily separate unknown objects into meaningful parts, recent segmentation methods can only achieve similar partitionings by training on human-annotated ground-truth data. Here we introduce a bottom-up method for segmenting 3D point clouds into functional parts which does not require supervision and achieves equally good results. Our method uses local concavities as an indicator of inter-part boundaries. We show that this criterion is efficient to compute and generalizes well across different object classes. The algorithm employs a novel locally constrained geometrical boundary model which proposes greedy cuts through a local concavity graph. Only planar cuts are considered, evaluated using a cost function which rewards cuts orthogonal to concave edges. Additionally, a local clustering constraint is applied to ensure the partitioning only affects relevant locally concave regions. We evaluate our algorithm on recordings from an RGB-D camera as well as on the Princeton Segmentation Benchmark, using a fixed set of parameters across all object classes. This stands in stark contrast to most reported results, which require either knowing the number of parts or annotated ground truth for learning. Our approach outperforms all existing bottom-up methods (reducing the gap to human performance by up to 50%) and achieves scores similar to top-down, data-driven approaches.
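
An illustrative version of the cut cost, based on my reading of the abstract (the paper's exact cost and its proposal mechanism for candidate planes differ in detail): a candidate plane scores well when it passes close to concave edges and cuts straight across them.

```python
import numpy as np

def cut_score(plane_point, plane_normal, edge_points, edge_dirs, sigma=0.02):
    """Score a candidate planar cut through a local concavity region.

    edge_points: (N, 3) locations of concave connections
    edge_dirs:   (N, 3) unit vectors across each concave connection
                 (from one adjacent patch centroid to the other)
    The score rewards planes that pass near the concave edges and whose
    normal aligns with the connection direction (cutting across the
    concavity rather than along it).
    """
    n = plane_normal / np.linalg.norm(plane_normal)
    dist = np.abs((edge_points - plane_point) @ n)  # point-to-plane distances
    proximity = np.exp(-(dist / sigma) ** 2)        # near-plane edges count more
    across = np.abs(edge_dirs @ n)                  # 1 when cutting straight across
    return float(np.sum(proximity * across))
```

A greedy search would propose many candidate planes within a locally concave region, keep the highest-scoring cut, and recurse on the resulting parts.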


Künstliche Intelligenz | 2014

Technologies for the Fast Set-Up of Automated Assembly Processes

Norbert Krüger; Ales Ude; Henrik Gordon Petersen; Bojan Nemec; Lars-Peter Ellekilde; Thiusius Rajeeth Savarimuthu; Jimmy Alison Rytz; Kerstin Fischer; Anders Buch; Dirk Kraft; Wail Mustafa; Eren Erdal Aksoy; Jeremie Papon; Aljaž Kramberger; Florentin Wörgötter

In this article, we describe technologies facilitating the set-up of automated assembly solutions which have been developed in the context of the IntellAct project (2011–2014). Tedious procedures are currently still required to establish such robot solutions. This hinders especially the automation of so-called few-of-a-kind production. Therefore, most production of this kind is done manually and thus often performed in low-wage countries. In the IntellAct project, we have developed a set of methods which facilitate the set-up of a complex automatic assembly process, and here we present our work on tele-operation, dexterous grasping, pose estimation, and learning of control strategies. The prototype developed in IntellAct is at TRL 4 (corresponding to ‘demonstration in lab environment’).

Collaboration


Dive into Jeremie Papon's collaboration.

Top Co-Authors

Alexey Abramov
University of Göttingen

Eren Erdal Aksoy
Karlsruhe Institute of Technology

Anders Buch
University of Southern Denmark

Ales Ude
Karlsruhe Institute of Technology

Babette Dellen
Spanish National Research Council