Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Freek Stulp is active.

Publication


Featured research published by Freek Stulp.


The International Journal of Robotics Research | 2011

Learning variable impedance control

Jonas Buchli; Freek Stulp; Evangelos A. Theodorou; Stefan Schaal

One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks. In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PI2 (Policy Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first principles of stochastic optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on the cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling for variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider tasks involving accurate tracking through via points, and manipulation tasks requiring physical contact with the environment. In these tasks, the optimal strategy requires tuning both a reference trajectory and the impedance of the end-effector. The results show that we can use path-integral-based reinforcement learning not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.
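At its core, the PI2 update described above is reward-weighted averaging: perturb the policy parameters with exploration noise, roll out, and average the noise weighted by a softmax over rollout costs. The following Python sketch illustrates that core under simplifying assumptions (a flat parameter vector, one update per batch of rollouts); the names pi2_update, cost_fn, and the sensitivity h are illustrative, not the authors' implementation.

    import numpy as np

    def pi2_update(theta, noise_std, cost_fn, n_rollouts=20, h=10.0):
        """One PI2-style update: perturb parameters, evaluate rollout costs,
        and return a softmax-weighted average of the exploration noise
        (reward-weighted averaging, no gradient estimation)."""
        eps = noise_std * np.random.randn(n_rollouts, theta.size)  # exploration noise
        costs = np.array([cost_fn(theta + e) for e in eps])        # one rollout per sample
        # Normalize costs to [0, 1] and map to weights: low cost -> high weight.
        s = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-10)
        w = np.exp(-h * s)
        w /= w.sum()
        return theta + w @ eps                                     # weighted noise average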


IEEE Transactions on Robotics | 2012

Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation

Freek Stulp; Evangelos A. Theodorou; Stefan Schaal

Physical contact events often allow a natural decomposition of manipulation tasks into action phases and subgoals. Within the motion primitive paradigm, each action phase corresponds to a motion primitive, and the subgoals correspond to the goal parameters of these primitives. Current state-of-the-art reinforcement learning algorithms are able to efficiently and robustly optimize the parameters of motion primitives in very high-dimensional problems. These algorithms often consider only shape parameters, which determine the trajectory between the start- and end-point of the movement. In manipulation, however, it is also crucial to optimize the goal parameters, which represent the subgoals between the motion primitives. We therefore extend the policy improvement with path integrals (PI2) algorithm to simultaneously optimize shape and goal parameters. Applying simultaneous shape and goal learning to sequences of motion primitives leads to the novel algorithm PI2Seq. We use our methods to address a fundamental challenge in manipulation: improving the robustness of everyday pick-and-place tasks.
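As a toy continuation of the PI2 sketch above, simultaneous shape and goal learning can be illustrated by concatenating both into one parameter vector, so that a single update optimizes them jointly. The quadratic cost below is a made-up stand-in for the paper's pick-and-place cost, not taken from it.

    import numpy as np

    # Toy cost over concatenated parameters: 10 shape weights plus 1 goal.
    # The endpoint (goal) should land on a target while shape effort stays low.
    def cost_fn(params, n_shape=10, target=1.0):
        shape, goal = params[:n_shape], params[n_shape:]
        return (goal[0] - target) ** 2 + 0.01 * np.sum(shape ** 2)

    theta = np.zeros(11)  # shape and goal parameters in one vector
    for _ in range(50):   # reuses pi2_update from the sketch above
        theta = pi2_update(theta, noise_std=0.3, cost_fn=cost_fn)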


IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) | 2008

The Assistive Kitchen — A demonstration scenario for cognitive technical systems

Michael Beetz; Freek Stulp; Bernd Radig; Jan Bandouch; Nico Blodow; Mihai Emanuel Dolha; Andreas Fedrizzi; Dominik Jain; Uli Klank; Ingo Kresse; Alexis Maldonado; Zoltan Marton; Lorenz Mösenlechner; Federico Ruiz; Radu Bogdan Rusu; Moritz Tenorth

This paper introduces the assistive kitchen as a comprehensive demonstration and challenge scenario for technical cognitive systems. We describe its hardware and software infrastructure. Within the assistive kitchen application, we select particular domain activities as research subjects and identify the cognitive capabilities needed for perceiving, interpreting, analyzing, and executing these activities as research foci. We conclude by outlining open research issues that need to be solved to realize the scenarios successfully.


IEEE International Conference on Robotics and Automation (ICRA) | 2011

Learning to grasp under uncertainty

Freek Stulp; Evangelos A. Theodorou; Jonas Buchli; Stefan Schaal

We present an approach that enables robots to learn motion primitives that are robust to state estimation uncertainties. During reaching and preshaping, the robot learns to use fine manipulation strategies to maneuver the object into a pose at which closing the hand to perform the grasp is more likely to succeed. This contrasts with common assumptions in grasp planning and motion planning for reaching, namely that these tasks can be performed independently and that the robot has perfect knowledge of the poses of the objects in the environment.


Paladyn | 2013

Robot Skill Learning: From Reinforcement Learning to Evolution Strategies

Freek Stulp; Olivier Sigaud

Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. Owing to current trends involving searching in parameter space (rather than action space) and using reward-weighted averaging (rather than gradient estimation), reinforcement learning algorithms for policy improvement, e.g. PoWER and PI2, are now able to learn sophisticated high-dimensional robot skills. A side effect of these trends has been that, over the last 15 years, reinforcement learning (RL) algorithms have become more and more similar to evolution strategies such as (μW, λ)-ES and CMA-ES. Evolution strategies treat policy improvement as a black-box optimization problem, and thus do not leverage the problem structure, whereas RL algorithms do. In this paper, we demonstrate how two straightforward simplifications to the state-of-the-art RL algorithm PI2 suffice to convert it into the black-box optimization algorithm (μW, λ)-ES. Furthermore, we show that (μW, λ)-ES empirically outperforms PI2 on the tasks in [36]. It is striking that PI2 and (μW, λ)-ES share a common core, and that the simpler algorithm converges faster and leads to similar or lower final costs. We argue that this difference is due to a third trend in robot skill learning: the predominant use of dynamic movement primitives (DMPs). We show how DMPs dramatically simplify the learning problem, and discuss the implications of this for past and future work on policy improvement for robot skill learning.
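The (μW, λ)-ES that PI2 is reduced to has a compact core: sample λ candidates around the current parameters, keep the μ best, and recombine them with rank-based weights. The sketch below is illustrative and omits step-size adaptation; the log-rank weights are a conventional choice, not taken from the paper.

    import numpy as np

    def es_mu_w_lambda(f, theta, sigma=0.1, lam=20, mu=10, iters=100):
        """Minimal (mu_W, lambda)-ES: sample lambda candidates, keep the mu
        best, and recombine them with rank-based weights -- the same
        reward-weighted-averaging core that PI2 shares."""
        weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
        weights /= weights.sum()
        for _ in range(iters):
            pop = theta + sigma * np.random.randn(lam, theta.size)
            order = np.argsort([f(x) for x in pop])   # ascending cost
            theta = weights @ pop[order[:mu]]         # weighted recombination
        return theta

    # Example: minimize a simple quadratic cost.
    best = es_mu_w_lambda(lambda x: np.sum(x ** 2), theta=np.ones(5))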


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008

Learning Local Objective Functions for Robust Face Model Fitting

Matthias Wimmer; Freek Stulp; Sylvia Pietzsch; Bernd Radig

Model-based techniques have proven to be successful in interpreting the large amount of information contained in images. Associated fitting algorithms search for the global optimum of an objective function, which should correspond to the best model fit in a given image. Although fitting algorithms have been the subject of intensive research and evaluation, the objective function is usually designed ad hoc, based on implicit and domain-dependent knowledge. In this article, we address the root of the problem by learning more robust objective functions. First, we formulate a set of desirable properties for objective functions and give a concrete example function that has these properties. Then, we propose a novel approach that learns an objective function from training data generated by manual image annotations and this ideal objective function. In this approach, critical decisions such as feature selection are automated, and the remaining manual steps hardly require domain-dependent knowledge. Furthermore, an extensive empirical evaluation demonstrates that the learned objective functions are more robust: they enable fitting algorithms to determine the best model fit more accurately than designed objective functions do.
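A hypothetical sketch of the training setup described above: pair local features at candidate model displacements with the ideal objective value (zero at the annotated position, growing with displacement), then fit a regressor that serves as the learned objective function. The features and targets below are synthetic stand-ins, not the paper's image data.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Synthetic stand-in for annotation-derived training data: candidate
    # displacements around the annotated position plus local image features.
    rng = np.random.default_rng(0)
    disp = rng.uniform(-5, 5, size=(1000, 1))
    features = np.hstack([disp, rng.normal(size=(1000, 4))])
    ideal = np.abs(disp[:, 0])  # ideal objective: minimal at the annotation

    # The fitted regressor acts as the learned local objective function.
    objective = DecisionTreeRegressor(max_depth=6).fit(features, ideal)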


Robotics: Science and Systems (RSS) | 2010

Variable Impedance Control: A Reinforcement Learning Approach

Jonas Buchli; Evangelos A. Theodorou; Freek Stulp; Stefan Schaal

One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high-DOF robotic tasks. In this contribution, we accomplish such gain scheduling with a reinforcement learning approach, the algorithm PI2 (Policy Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first principles of optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that RL on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling. We evaluate our approach by presenting results on two different simulated robotic systems, a 3-DOF Phantom Premium Robot and a 6-DOF Kuka Lightweight Robot. We investigate tasks where the optimal strategy requires tuning both the impedance of the end-effector and a reference trajectory. The results show that we can use path-integral-based RL not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.


IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) | 2011

Movement segmentation using a primitive library

Franziska Meier; Evangelos A. Theodorou; Freek Stulp; Stefan Schaal

Segmenting complex movements into a sequence of primitives remains a difficult problem with many applications in the robotics and vision communities. In this work, we show how the movement segmentation problem can be reduced to a sequential movement recognition problem. To this end, we reformulate the original Dynamic Movement Primitive (DMP) formulation as a linear dynamical system with control inputs. Based on this new formulation, we develop an Expectation-Maximization algorithm to estimate the duration and goal position of a partially observed trajectory. Using this algorithm, and assuming that a library of movement primitives is available, we build a movement segmentation framework. We illustrate the usefulness of the new DMP formulation on the two applications of online movement recognition and movement segmentation.
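For context, the standard DMP transformation system that the paper rewrites as a linear dynamical system with a control input (the forcing term f) has the following conventional form; the constants here are illustrative defaults, not the paper's exact reformulation.

    def dmp_step(y, z, g, f, dt, alpha=25.0, beta=6.25, tau=1.0):
        """One Euler step of the DMP transformation system, linear in the
        state (y, z) given the control input f:
            tau * dz = alpha * (beta * (g - y) - z) + f
            tau * dy = z
        """
        dz = (alpha * (beta * (g - y) - z) + f) / tau
        dy = z / tau
        return y + dy * dt, z + dz * dt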


IEEE-RAS International Conference on Humanoid Robots | 2013

Learning compact parameterized skills with a single regression

Freek Stulp; Gennaro Raiola; Antoine Hoarau; Serena Ivaldi; Olivier Sigaud

One of the long-term challenges of programming by demonstration is achieving generality, i.e. automatically adapting the reproduced behavior to novel situations. A common approach for achieving generality is to learn parameterizable skills from multiple demonstrations for different situations. In this paper, we generalize recent approaches on learning parameterizable skills based on dynamical movement primitives (DMPs), such that task parameters are also passed as inputs to the function approximator of the DMP. This leads to a more general, flexible, and compact representation of parameterizable skills, as demonstrated by our empirical evaluation on the iCub and Meka humanoid robots.
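The idea of a single regression over both the movement phase and the task parameters can be sketched with synthetic data, using ridge regression as a stand-in for the DMP's function approximator; all data and names below are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Ridge

    # Inputs: DMP phase s plus task parameters q (e.g. an object pose);
    # targets: forcing-term values extracted from the demonstrations.
    rng = np.random.default_rng(0)
    s = rng.uniform(0, 1, size=(500, 1))          # phase variable
    q = rng.uniform(-1, 1, size=(500, 2))         # task parameters
    y = np.sin(3 * s[:, 0]) * q[:, 0] + q[:, 1]   # stand-in forcing targets

    model = Ridge(alpha=1e-3).fit(np.hstack([s, q]), y)
    f_new = model.predict([[0.5, 0.2, -0.3]])     # forcing term for a novel situation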


IEEE-RAS International Conference on Humanoid Robots | 2010

Reinforcement learning of full-body humanoid motor skills

Freek Stulp; Jonas Buchli; Evangelos A. Theodorou; Stefan Schaal

Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive number of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, called Policy Improvement with Path Integrals (PI2), has a surprisingly simple form, has no open tuning parameters besides the exploration noise, is model-free, and is numerically robust in high-dimensional learning problems. We demonstrate how PI2 is able to learn full-body motor skills on a 34-DOF humanoid robot. To demonstrate the generality of our approach, we also apply PI2 in the context of variable impedance control, where both planned trajectories and gain schedules for each joint are optimized simultaneously.

Collaboration


Dive into Freek Stulp's collaborations.

Top Co-Authors

Evangelos A. Theodorou

Georgia Institute of Technology

Peter Pastor

University of Southern California


Gennaro Raiola

École Nationale Supérieure de Techniques Avancées (ENSTA)
