Alexandros Paraschos | Researchain

Publication

Featured research published by Alexandros Paraschos.

international conference on robotics and automation | 2013

Model-based imitation learning by probabilistic trajectory matching

Peter Englert; Alexandros Paraschos; Jan Peters; Marc Peter Deisenroth

One of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot.
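The trajectory-matching objective described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each trajectory is summarized by an independent univariate Gaussian per time step, so the trajectory-level divergence is a sum of per-step terms. The function names `gaussian_kl` and `trajectory_kl` are hypothetical, and the direction of the divergence (demonstration first, prediction second) is also an assumption rather than something the abstract states.

```python
import math

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two univariate Gaussians given by mean and variance."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

def trajectory_kl(demo, pred):
    """Sum of per-time-step KL terms.

    Each trajectory is a list of (mean, variance) pairs, one per time step.
    Treating time steps as independent is a simplifying assumption.
    """
    return sum(gaussian_kl(mp, vp, mq, vq)
               for (mp, vp), (mq, vq) in zip(demo, pred))

# Hypothetical example data: two time steps of a 1-D trajectory.
demo = [(0.0, 1.0), (1.0, 0.5)]
pred = [(0.2, 1.0), (0.8, 0.5)]
print(trajectory_kl(demo, pred))
```

A policy search method of this kind would adjust the controller parameters so that the predicted distribution `pred` moves toward `demo`, driving this quantity toward zero.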

Adaptive Behavior | 2013

Probabilistic model-based imitation learning

Peter Englert; Alexandros Paraschos; Marc Peter Deisenroth; Jan Peters

Efficient skill acquisition is crucial for creating versatile robots. One intuitive way to teach a robot new tricks is to demonstrate a task and enable the robot to imitate the demonstrated behavior. This approach is known as imitation learning. Classical methods of imitation learning, such as inverse reinforcement learning or behavioral cloning, suffer substantially from the correspondence problem when the actions (i.e. motor commands, torques or forces) of the teacher are not observed or the body of the teacher differs substantially, e.g., in the actuation. To address these drawbacks we propose to learn a robot-specific controller that directly matches robot trajectories with observed ones. We present a novel and robust probabilistic model-based approach for solving a probabilistic trajectory matching problem via policy search. For this purpose, we propose to learn a probabilistic model of the system, which we exploit for mental rehearsal of the current controller by making predictions about future trajectories. These internal simulations allow for learning a controller without permanently interacting with the real system, which results in a reduced overall interaction time. Using long-term predictions from this learned model, we train robot-specific controllers that reproduce the expert’s distribution of demonstrations without the need to observe motor commands during the demonstration. The strength of our approach is that it addresses the correspondence problem in a principled way. Our method achieves a higher learning speed than both model-based imitation learning based on dynamics motor primitives and trial-and-error-based learning systems with hand-crafted cost functions. We successfully applied our approach to imitating human behavior using a tendon-driven compliant robotic arm. Moreover, we demonstrate the generalization ability of our approach in a multi-task learning setup.

international conference on robotics and automation | 2015

Extracting low-dimensional control variables for movement primitives

Elmar Rueckert; Jan Mundo; Alexandros Paraschos; Jan Peters; Gerhard Neumann

Movement primitives (MPs) provide a powerful framework for data-driven movement generation that has been successfully applied to learning from demonstrations and robot reinforcement learning. In robotics, we often want to solve a multitude of different but related tasks. As the parameters of the primitives are typically high-dimensional, a common practice for generalizing movement primitives to new tasks is to adapt only a small set of control variables, also called meta-parameters, of the primitive. Yet, for most MP representations, the encoding of these control variables is pre-coded in the representation and cannot be adapted to the considered tasks. In this paper, we learn the encoding of task-specific control variables from data instead of relying on fixed meta-parameter representations. We use hierarchical Bayesian models (HBMs) to estimate a low-dimensional latent variable model for probabilistic movement primitives (ProMPs), a recent movement primitive representation. We show on two real robot datasets that ProMPs based on HBMs outperform standard ProMPs in terms of generalization and learning from a small amount of data, and also allow for an intuitive analysis of the movement. We also extend our HBM with a mixture model, so that we can model different movement types in the same dataset.

ieee-ras international conference on humanoid robots | 2013

A probabilistic approach to robot trajectory generation

Alexandros Paraschos; Gerhard Neumann; Jan Peters

Motor Primitives (MPs) are a promising approach for the data-driven acquisition as well as for the modular and re-usable generation of movements. However, a modular control architecture with MPs is only effective if the MPs support co-activation as well as continuously blending the activation from one MP to the next. In addition, we need efficient mechanisms to adapt an MP to the current situation. Common approaches to movement primitives lack such capabilities or their implementation is based on heuristics. We present a probabilistic movement primitive approach that overcomes the limitations of existing approaches. We encode a primitive as a probability distribution over trajectories. The representation as a distribution has several beneficial properties. It allows encoding a time-varying variance profile. Most importantly, it allows performing new operations: a product of distributions for the co-activation of MPs, and conditioning for generalizing the MP to different desired targets. We derive a feedback controller that reproduces a given trajectory distribution in closed form. We compare our approach to the existing state of the art and present real robot results for learning from demonstration.
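The two operations named above, a product of distributions and conditioning, have simple closed forms for Gaussians. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes a Gaussian over basis-function weights, a linear observation model `y = phi . w + noise`, and a plain product with no activation exponents. The function names and the example numbers are hypothetical.

```python
def product_of_gaussians(mu1, var1, mu2, var2):
    """Normalized product of two univariate Gaussians (precision-weighted)."""
    precision = 1.0 / var1 + 1.0 / var2
    var = 1.0 / precision
    mu = var * (mu1 / var1 + mu2 / var2)
    return mu, var

def condition_weights(mu_w, sigma_w, phi, y_star, obs_var):
    """Condition a Gaussian over weights on one observed point y_star.

    mu_w: mean vector, sigma_w: symmetric covariance (list of lists),
    phi: basis-function values at the observed time, obs_var: observation noise.
    """
    n = len(mu_w)
    # sigma_w @ phi
    sigma_phi = [sum(sigma_w[i][j] * phi[j] for j in range(n)) for i in range(n)]
    # Innovation variance: obs_var + phi^T sigma_w phi
    s = obs_var + sum(phi[i] * sigma_phi[i] for i in range(n))
    gain = [sp / s for sp in sigma_phi]
    residual = y_star - sum(phi[i] * mu_w[i] for i in range(n))
    new_mu = [mu_w[i] + gain[i] * residual for i in range(n)]
    # sigma_w - gain (phi^T sigma_w); uses the symmetry of sigma_w
    new_sigma = [[sigma_w[i][j] - gain[i] * sigma_phi[j] for j in range(n)]
                 for i in range(n)]
    return new_mu, new_sigma

# Co-activating two primitives at one time step (hypothetical numbers).
print(product_of_gaussians(0.0, 1.0, 2.0, 1.0))

# Conditioning a two-weight primitive on a desired target value.
print(condition_weights([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], 1.0, 0.01))
```

The product concentrates probability mass where both primitives agree, while conditioning pulls the mean toward the target and shrinks the variance along the observed direction.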

Frontiers in Computational Neuroscience | 2014

Learning modular policies for robotics

Gerhard Neumann; Christian Daniel; Alexandros Paraschos; Andras Gabor Kupcsik; Jan Peters

A promising idea for scaling robot learning to more complex tasks is to use elemental behaviors as building blocks to compose more complex behavior. Ideally, such building blocks are used in combination with a learning algorithm that is able to learn to select, adapt, sequence and co-activate the building blocks. While there has been a lot of work on approaches that support one of these requirements, no learning algorithm exists that unifies all these properties in one framework. In this paper we present our work on a unified approach for learning such a modular control architecture. We introduce new policy search algorithms that are based on information-theoretic principles and are able to learn to select, adapt and sequence the building blocks. Furthermore, we developed a new representation for the individual building block that supports co-activation and principled ways for adapting the movement. Finally, we summarize our experiments for learning modular control architectures in simulation and with real robots.

international conference on robotics and automation | 2017

Probabilistic prioritization of movement primitives

Alexandros Paraschos; Rudolf Lioutikov; Jan Peters; Gerhard Neumann

Movement prioritization is a common approach to combine controllers of different tasks for redundant robots, where each task is assigned a priority. The priorities of the tasks are often hand-tuned or the result of an optimization, but seldom learned from data. This letter combines Bayesian task prioritization with probabilistic movement primitives (ProMPs) to prioritize full motion sequences that are learned from demonstrations. ProMPs can encode distributions of movements over full motion sequences and provide control laws to exactly follow these distributions. The probabilistic formulation allows for a natural application of Bayesian task prioritization. We extend the ProMP controllers with an additional feedback component that accounts for inaccuracies in following the distribution and allows for a more robust prioritization of primitives. We demonstrate how the task priorities can be obtained from imitation learning and how different primitives can be combined to solve even unseen task combinations. Due to the prioritization, our approach can efficiently learn a combination of tasks without requiring individual models per task combination. Furthermore, our approach can adapt an existing primitive library by prioritizing additional controllers, for example, for implementing obstacle avoidance. Hence, the need to retrain the whole library is avoided in many cases. We evaluate our approach on reaching movements under constraints with redundant simulated planar robots and two physical robot platforms, the humanoid robot “iCub” and a KUKA LWR robot arm.

intelligent robots and systems | 2015

Model-free Probabilistic Movement Primitives for physical interaction

Alexandros Paraschos; Elmar Rueckert; Jan Peters; Gerhard Neumann

Physical interaction in robotics is a complex problem that requires not only accurate reproduction of the kinematic trajectories but also of the forces and torques exhibited during the movement. We base our approach on Movement Primitives (MPs), as MPs provide a framework for modelling complex movements and introduce useful operations on the movements, such as generalization to novel situations, time scaling, and others. Usually, MPs are trained with imitation learning, where an expert demonstrates the trajectories. However, MPs used in physical interaction either require additional learning approaches, e.g., reinforcement learning, or are based on handcrafted solutions. Our goal is to learn and generate movements for physical interaction with imitation learning, from a small set of demonstrated trajectories. The Probabilistic Movement Primitives (ProMPs) framework is a recent MP approach that introduces beneficial properties, such as combination and blending of MPs, and represents the correlations present in the movement. ProMPs provide a variable stiffness controller that reproduces the movement, but it requires a dynamics model of the system. Learning such a model is not a trivial task, and, therefore, we introduce the model-free ProMPs, which jointly learn the movement and the necessary actions from a few demonstrations. We derive a variable stiffness controller analytically. We further extend the ProMPs to include force and torque signals, necessary for physical interaction. We evaluate our approach in simulated and real robot tasks.

Autonomous Robots | 2018

Using probabilistic movement primitives in robotics

Alexandros Paraschos; Christian Daniel; Jan Peters; Gerhard Neumann

Movement Primitives (MPs) are a well-established paradigm for modular movement representation and generation. They provide a data-driven representation of movements and support generalization to novel situations, temporal modulation, sequencing of primitives and controllers for executing the primitive on physical systems. However, while many MP frameworks exhibit some of these properties, there is a need for a unified framework that implements all of them in a principled way. In this paper, we show that this goal can be achieved by using a probabilistic representation. Our approach models trajectory distributions learned from stochastic movements. Probabilistic operations, such as conditioning, can be used to achieve generalization to novel situations or to combine and blend movements in a principled way. We derive a stochastic feedback controller that reproduces the encoded variability of the movement and the coupling of the degrees of freedom of the robot. We evaluate and compare our approach on several simulated and real robot scenarios.

intelligent robots and systems | 2015

Reinforcement learning vs human programming in tetherball robot games

Simone Parisi; Hany Abdulsamad; Alexandros Paraschos; Christian Daniel; Jan Peters

Reinforcement learning of motor skills is an important challenge in order to endow robots with the ability to learn a wide range of skills and solve complex tasks. However, comparing reinforcement learning against human programming is not straightforward. In this paper, we create a motor learning framework consisting of state-of-the-art components in motor skill learning and compare it to a manually designed program on the task of robot tetherball. We use dynamical motor primitives for representing the robot's trajectories and relative entropy policy search to train the motor framework and improve its behavior by trial and error. These algorithmic components allow for high-quality skill learning while the experimental setup enables an accurate evaluation of our framework as robot players can compete against each other. In the complex game of robot tetherball, we show that our learning approach outperforms and wins a match against a high-quality hand-crafted system.
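The trial-and-error update mentioned above can be illustrated with a simplified, episodic sketch in the spirit of relative entropy policy search. It is an assumption-laden illustration: the temperature `eta` is fixed by hand here, whereas the actual method obtains it by solving a dual optimization problem that enforces a bound on the relative entropy between successive policies. The function names and the sample data are hypothetical.

```python
import math

def reps_weights(returns, eta):
    """Exponential weights exp(R / eta), normalized to sum to one.

    Subtracting the maximum return keeps the exponentials numerically stable
    and does not change the normalized weights.
    """
    max_r = max(returns)
    unnorm = [math.exp((r - max_r) / eta) for r in returns]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def weighted_mean_update(samples, weights):
    """New policy mean as the weighted average of sampled parameter vectors."""
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]

# Hypothetical rollouts: sampled primitive parameters and their returns.
samples = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
returns = [1.0, 3.0, 2.0]
weights = reps_weights(returns, eta=1.0)
print(weighted_mean_update(samples, weights))
```

A smaller `eta` makes the update greedier toward the best rollouts, while a larger `eta` keeps the new policy closer to the sampling distribution.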

Journal of Aerospace Information Systems | 2014

Generalizing Movements with Information-Theoretic Stochastic Optimal Control

Rudolf Lioutikov; Alexandros Paraschos; Jan Peters; Gerhard Neumann

Stochastic optimal control is typically used to plan a movement for a specific situation. Although most stochastic optimal control methods fail to generalize this movement plan to a new situation without replanning, a stochastic optimal control method is presented that allows reuse of the obtained policy in a new situation, as the policy is more robust to slight deviations from the initial movement plan. To improve the robustness of the policy, we employ information-theoretic policy updates that explicitly operate on trajectory distributions instead of single trajectories. To ensure a stable and smooth policy update, the "distance" between the trajectory distributions of the old and the new control policies is limited. The introduced bound offers a closed-form solution for the resulting policy and extends results from recent developments in stochastic optimal control. In contrast to many standard stochastic optimal control algorithms, the current approach can directly infer the system dynamics from data p...
