
Publications


Featured research published by Evangelos A. Theodorou.


International Conference on Robotics and Automation | 2011

STOMP: Stochastic trajectory optimization for motion planning

Mrinal Kalakrishnan; Sachin Chitta; Evangelos A. Theodorou; Peter Pastor; Stefan Schaal

We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produce an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness costs is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use, so general costs for which derivatives may not be available (e.g., costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based methods like CHOMP can get stuck in.
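
The update at the heart of this scheme can be sketched in a few lines. Below is a minimal single-DOF illustration of the gradient-free, cost-weighted recombination described above; the function names, the plain Gaussian noise model, and the toy cost are illustrative assumptions, not the paper's exact formulation (STOMP weights rollouts per time step and smooths updates through a precision matrix):

```python
import numpy as np

def stomp_step(theta, cost_fn, n_samples=20, noise_std=0.1, temperature=10.0):
    """One gradient-free STOMP-style update: sample noisy variations of the
    trajectory, score them, and recombine them by cost-weighted averaging."""
    eps = np.random.randn(n_samples, theta.size) * noise_std
    eps[:, 0] = eps[:, -1] = 0.0                 # keep start and goal fixed
    costs = np.array([cost_fn(theta + e) for e in eps])
    # Softmin weights: low-cost rollouts dominate the recombined update.
    w = np.exp(-temperature * (costs - costs.min()) / (np.ptp(costs) + 1e-10))
    w /= w.sum()
    return theta + w @ eps

# Toy usage: smooth a 1-DOF trajectory while avoiding a "cost bump" at 0.5.
T = 50
theta = np.linspace(0.0, 1.0, T)

def trajectory_cost(x):
    smoothness = np.sum(np.diff(x, 2) ** 2)               # squared accelerations
    obstacle = np.sum(np.exp(-((x - 0.5) ** 2) / 0.005))  # penalty near x = 0.5
    return 100.0 * smoothness + 0.1 * obstacle

for _ in range(100):
    theta = stomp_step(theta, trajectory_cost)
print("final cost:", trajectory_cost(theta))
```

Because only cost evaluations are needed, non-differentiable terms (constraint violations, torque limits) slot directly into `trajectory_cost`.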


International Conference on Robotics and Automation | 2010

Reinforcement learning of motor skills in high dimensions: A path integral approach

Evangelos A. Theodorou; Jonas Buchli; Stefan Schaal

Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has so far been largely impossible due to the computational difficulties that reinforcement learning encounters in high-dimensional continuous state-action spaces. In this paper, we derive a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals. While solidly grounded in optimal control theory and estimation theory, the update equations for learning are surprisingly simple and have no danger of numerical instabilities, as neither matrix inversions nor gradient learning rates are required. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. Finally, a learning experiment on a robot dog illustrates the functionality of our algorithm in a real-world scenario. We believe that our new algorithm, Policy Improvement with Path Integrals (PI2), currently offers one of the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.
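
In the same spirit, here is a stripped-down sketch of a PI2-style parameter update; it collapses the paper's per-time-step path-integral weighting into a single per-rollout weight for brevity, and the names and toy cost are illustrative:

```python
import numpy as np

def pi2_update(theta, rollout_cost, n_rollouts=30, noise_std=0.05, lam=1.0):
    """One PI2-style policy-improvement step: sample parameter perturbations,
    weight them by exp(-cost / lambda), and average. No gradients, no matrix
    inversions, and no learning rate to tune."""
    eps = np.random.randn(n_rollouts, theta.size) * noise_std
    S = np.array([rollout_cost(theta + e) for e in eps])   # rollout path costs
    P = np.exp(-(S - S.min()) / lam)                       # path-integral weights
    P /= P.sum()
    return theta + P @ eps

# Toy usage: drive 10 policy parameters toward the minimum of a quadratic cost.
theta = np.ones(10)
cost = lambda th: float(np.sum(th ** 2))
for _ in range(200):
    theta = pi2_update(theta, cost)
print("residual cost:", cost(theta))
```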


International Conference on Robotics and Automation | 2011

Skill learning and task outcome prediction for manipulation

Peter Pastor; Mrinal Kalakrishnan; Sachin Chitta; Evangelos A. Theodorou; Stefan Schaal

Learning complex motor skills for real-world tasks is a hard problem in robotic manipulation that often requires painstaking manual tuning and design by a human expert. In this work, we present a reinforcement-learning-based approach to acquiring new motor skills from demonstration. Our approach allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration. It aims to incorporate task domain knowledge, where appropriate, by working in a space consistent with the constraints of a specific task. We also present an approach to using sensor feedback to learn a predictive model of the task outcome. This allows our system to learn the proprioceptive sensor feedback needed to monitor subsequent executions of the task online and abort execution in the event of predicted failure. We illustrate our approach using two example tasks executed with the PR2 dual-arm robot: a straight and accurate pool stroke and a box-flipping task using two chopsticks as tools.
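
The outcome-prediction component could, in spirit, be approximated by any probabilistic classifier trained on features of sensor traces from past executions. A hypothetical sketch with scikit-learn; the features, labels, and abort threshold are all invented for illustration and are not the paper's model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row of summary features per past execution
# (e.g., mean and peak force per movement phase), labeled 1 = success, 0 = failure.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 8))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 3] > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

def should_continue(trace_features, abort_threshold=0.2):
    """Online monitoring: abort if the predicted success probability of the
    ongoing execution drops below the (assumed) threshold."""
    p_success = model.predict_proba(trace_features.reshape(1, -1))[0, 1]
    return p_success >= abort_threshold

print("continue execution:", should_continue(rng.normal(size=8)))
```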


The International Journal of Robotics Research | 2011

Learning variable impedance control

Jonas Buchli; Freek Stulp; Evangelos A. Theodorou; Stefan Schaal

One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks. In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PI2 (Policy Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first principles of stochastic optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on the cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling for variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider tasks involving accurate tracking through via points, and manipulation tasks requiring physical contact with the environment. In these tasks, the optimal strategy requires both tuning of a reference trajectory and the impedance of the end-effector. The results show that we can use path integral based reinforcement learning not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.
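
To make the gain-scheduling idea concrete, here is a minimal sketch of a variable-impedance PD command in which the time-varying stiffness profile is part of what PI2 would optimize alongside the reference trajectory; the specific parameterization and the critically damped gain coupling are illustrative assumptions:

```python
import numpy as np

def impedance_command(x, xd, x_ref, xd_ref, kp):
    """PD-style impedance command with time-varying stiffness kp; damping is
    coupled to stiffness for critical damping (an illustrative choice). The
    stiffness schedule and the reference are the learnable quantities."""
    kd = 2.0 * np.sqrt(kp)
    return kp * (x_ref - x) + kd * (xd_ref - xd)

# A stiffness schedule over T steps: compliant in free space, stiff in contact.
T = 100
kp_schedule = np.full(T, 20.0)
kp_schedule[40:60] = 200.0                       # assumed contact phase
u = impedance_command(x=0.1, xd=0.0, x_ref=0.0, xd_ref=0.0, kp=kp_schedule[50])
print("command during contact:", u)
```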


IEEE Reviews in Biomedical Engineering | 2009

Computational Models for Neuromuscular Function

Francisco J. Valero-Cuevas; Heiko Hoffmann; Manish U. Kurse; Jason J. Kutch; Evangelos A. Theodorou

Computational models of the neuromuscular system hold the potential to allow us to reach a deeper understanding of neuromuscular function and clinical rehabilitation by complementing experimentation. By serving as a means to distill and explore specific hypotheses, computational models emerge from prior experimental data and motivate future experimental work. Here we review computational tools used to understand neuromuscular function, including musculoskeletal modeling, machine learning, control theory, and statistical model analysis. We conclude that these tools, when used in combination, have the potential to further our understanding of neuromuscular function by serving as a rigorous means to test scientific hypotheses in ways that complement and leverage experimental data.


Intelligent Robots and Systems | 2007

A Kalman filter for robust outlier detection

Jo-Anne Ting; Evangelos A. Theodorou; Stefan Schaal

In this paper, we introduce a modified Kalman filter that can perform robust, real-time outlier detection in the observations, without the need for manual parameter tuning by the user. Robotic systems that rely on high quality sensory data can be sensitive to data containing outliers. Since the standard Kalman filter is not robust to outliers, other variations of the Kalman filter have been proposed to overcome this issue, but these methods may require manual parameter tuning, heuristics, or complicated parameter estimation. Our Kalman filter uses a weighted least squares-like approach by introducing weights for each data sample. A data sample with a smaller weight has a weaker contribution when estimating the current time step's state. We learn the weights and system dynamics using a variational Expectation-Maximization framework. We evaluate our Kalman filter algorithm on data from a robotic dog.
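
The weighting idea can be sketched as a standard Kalman update whose measurement noise is inflated by the inverse of a per-sample weight, so that a weight near 1 recovers the usual filter and a weight near 0 effectively ignores an outlying observation. In the paper both the weights and the system dynamics are learned with variational EM; in this sketch the weight is simply supplied:

```python
import numpy as np

def weighted_kalman_update(x, P, z, A, C, Q, R, w):
    """One predict/update cycle with a per-sample weight w in [0, 1]:
    w -> 1 gives the standard Kalman update, w -> 0 down-weights the
    measurement toward irrelevance (an outlier)."""
    # Predict.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update, with measurement noise inflated by 1/w.
    S = C @ P_pred @ C.T + R / max(w, 1e-6)
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new

# Toy usage: 1-D constant state, one clean and one outlying measurement.
A = C = np.eye(1); Q = np.eye(1) * 1e-4; R = np.eye(1) * 0.1
x, P = np.zeros(1), np.eye(1)
x, P = weighted_kalman_update(x, P, np.array([0.05]), A, C, Q, R, w=1.0)
x, P = weighted_kalman_update(x, P, np.array([9.0]), A, C, Q, R, w=0.01)  # outlier
print("state estimate:", x)
```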


IEEE Transactions on Robotics | 2012

Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation

Freek Stulp; Evangelos A. Theodorou; Stefan Schaal

Physical contact events often allow a natural decomposition of manipulation tasks into action phases and subgoals. Within the motion primitive paradigm, each action phase corresponds to a motion primitive, and the subgoals correspond to the goal parameters of these primitives. Current state-of-the-art reinforcement learning algorithms are able to efficiently and robustly optimize the parameters of motion primitives in very high-dimensional problems. These algorithms often consider only shape parameters, which determine the trajectory between the start- and end-point of the movement. In manipulation, however, it is also crucial to optimize the goal parameters, which represent the subgoals between the motion primitives. We therefore extend the policy improvement with path integrals (PI2) algorithm to simultaneously optimize shape and goal parameters. Applying simultaneous shape and goal learning to sequences of motion primitives leads to the novel algorithm PI2Seq. We use our methods to address a fundamental challenge in manipulation: improving the robustness of everyday pick-and-place tasks.
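
The essence of optimizing shape and goal parameters simultaneously is to concatenate them into one parameter vector and perturb and update them jointly. A toy sketch under that reading; the primitive, cost, and dimensions are illustrative:

```python
import numpy as np

def pi2_seq_step(shape, goal, rollout_cost, n=30, std=0.05, lam=1.0):
    """Jointly perturb the shape parameters (trajectory between start and end)
    and the goal parameter (the subgoal between primitives), then apply one
    cost-weighted update to both at once."""
    theta = np.concatenate([shape, goal])
    eps = np.random.randn(n, theta.size) * std
    S = np.array([rollout_cost((theta + e)[:shape.size], (theta + e)[shape.size:])
                  for e in eps])
    P = np.exp(-(S - S.min()) / lam)
    P /= P.sum()
    theta = theta + P @ eps
    return theta[:shape.size], theta[shape.size:]

# Toy usage: the best goal is 0.5 and the best shape parameters are all zero.
shape, goal = np.zeros(5), np.array([0.0])
cost = lambda sh, g: float(np.sum(sh ** 2) + (g[0] - 0.5) ** 2)
for _ in range(300):
    shape, goal = pi2_seq_step(shape, goal, cost)
print("learned goal:", goal.round(2))
```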


International Conference on Robotics and Automation | 2011

Learning to grasp under uncertainty

Freek Stulp; Evangelos A. Theodorou; Jonas Buchli; Stefan Schaal

We present an approach that enables robots to learn motion primitives that are robust to state estimation uncertainties. During reaching and preshaping, the robot learns to use fine manipulation strategies to maneuver the object into a pose at which closing the hand to perform the grasp is more likely to succeed. This contrasts with common assumptions in grasp planning and motion planning for reaching, namely that these tasks can be performed independently and that the robot has perfect knowledge of the pose of the objects in the environment.


IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | 2011

Path integral control and bounded rationality

Daniel A. Braun; Pedro A. Ortega; Evangelos A. Theodorou; Stefan Schaal

Path integral methods [1], [2], [3] have recently been shown to be applicable to a very general class of optimal control problems. Here we examine the path integral formalism from a decision-theoretic point of view, since an optimal controller can always be regarded as an instance of a perfectly rational decision-maker that chooses its actions so as to maximize its expected utility [4]. The problem with perfect rationality is, however, that finding optimal actions is often very difficult due to prohibitive computational resource costs that are not taken into account. In contrast, a bounded rational decision-maker has only limited resources and therefore needs to strike some compromise between the desired utility and the required resource costs [5]. In particular, we suggest an information-theoretic measure of resource costs that can be derived axiomatically [6]. As a consequence we obtain a variational principle for choice probabilities that trades off maximizing a given utility criterion and avoiding resource costs that arise due to deviating from initially given default choice probabilities. The resulting bounded rational policies are in general probabilistic. We show that the solutions found by the path integral formalism are such bounded rational policies. Furthermore, we show that the same formalism generalizes to discrete control problems, leading to linearly solvable bounded rational control policies in the case of Markov systems. Importantly, Bellman's optimality principle is not presupposed by this variational principle, but it can be derived as a limit case. This suggests that the information-theoretic formalization of bounded rationality might serve as a general principle in control design that unifies a number of recently reported approximate optimal control methods both in the continuous and discrete domain.
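
For a discrete action set, the variational principle described above has a closed-form solution: the bounded-rational policy is the default policy reweighted by exponentiated utility, p*(a) proportional to p0(a) exp(beta U(a)), where the inverse temperature beta prices the information cost of deviating from the default. A small numeric sketch (the utilities and default are made up for illustration):

```python
import numpy as np

def bounded_rational_policy(utility, p_default, beta):
    """Closed-form maximizer of E_p[U] - (1/beta) * KL(p || p_default):
    p*(a) is proportional to p_default(a) * exp(beta * U(a)). beta -> inf
    recovers the perfectly rational arg-max policy; beta -> 0 spends no
    resources and keeps the default."""
    logits = np.log(p_default) + beta * utility
    p = np.exp(logits - logits.max())            # numerically stable softmax
    return p / p.sum()

U = np.array([1.0, 0.9, 0.1])                    # utilities of three actions
p0 = np.full(3, 1.0 / 3.0)                       # uniform default policy
for beta in (0.1, 1.0, 10.0):
    print(beta, bounded_rational_policy(U, p0, beta).round(3))
```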


Robotics: Science and Systems | 2010

Variable Impedance Control: A Reinforcement Learning Approach

Jonas Buchli; Evangelos A. Theodorou; Freek Stulp; Stefan Schaal

One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high-DOF robotic tasks. In this contribution, we accomplish such gain scheduling with a reinforcement learning algorithm, PI2 (Policy Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first principles of optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that RL on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling. We evaluate our approach by presenting results on two different simulated robotic systems, a 3-DOF Phantom Premium Robot and a 6-DOF Kuka Lightweight Robot. We investigate tasks where the optimal strategy requires both tuning of the impedance of the end-effector and tuning of a reference trajectory. The results show that we can use path integral based RL not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.

Collaboration


Dive into Evangelos A. Theodorou's collaborations.

Top Co-Authors

Yunpeng Pan (Georgia Institute of Technology)
Grady Williams (Georgia Institute of Technology)
Panagiotis Tsiotras (Georgia Institute of Technology)
Brian Goldfain (Georgia Institute of Technology)
James M. Rehg (Georgia Institute of Technology)
Paul Drews (Georgia Institute of Technology)
Byron Boots (Georgia Institute of Technology)
Emo Todorov (University of Washington)