Publications


Featured research published by Petar Kormushev.


Intelligent Robots and Systems | 2010

Robot motor skill coordination with EM-based Reinforcement Learning

Petar Kormushev; Sylvain Calinon; Darwin G. Caldwell

We present an approach allowing a robot to acquire new motor skills by learning the couplings across motor control variables. The demonstrated skill is first encoded in a compact form through a modified version of Dynamic Movement Primitives (DMP) which encapsulates correlation information. Expectation-Maximization-based Reinforcement Learning is then used to modulate the mixture of dynamical systems initialized from the user's demonstration. The approach is evaluated on a torque-controlled 7-DOF Barrett WAM robotic arm. Two skill-learning experiments are conducted: a reaching task where the robot needs to adapt the learned movement to avoid an obstacle, and a dynamic pancake-flipping task.
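To make the EM-based update concrete, here is a minimal PoWER-style reward-weighted policy-search sketch. The toy reward, the parameter dimensionality and all hyperparameters are illustrative assumptions, not the paper's actual code or task.

```python
# Minimal sketch of an EM-based policy-search update in the spirit of the
# paper (PoWER-style reward-weighted averaging). The toy 1-D reward stands
# in for executing the DMP on the robot and scoring the rollout.
import numpy as np

rng = np.random.default_rng(0)

n_params = 10                       # e.g. weights of DMP basis functions
theta = np.zeros(n_params)          # policy initialized from a demonstration
sigma = 0.3 * np.ones(n_params)     # exploration noise per parameter

def rollout_return(params):
    """Toy stand-in for running the policy and measuring task success."""
    target = np.linspace(0.0, 1.0, n_params)   # hypothetical optimum
    return np.exp(-np.sum((params - target) ** 2))

for iteration in range(200):
    # E-step: sample perturbed policies around the current mean.
    eps = rng.normal(0.0, sigma, size=(15, n_params))
    returns = np.array([rollout_return(theta + e) for e in eps])
    # M-step: reward-weighted average of the perturbations.
    w = returns / (returns.sum() + 1e-12)
    theta = theta + w @ eps

print("final return:", rollout_return(theta))
```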


Advanced Robotics | 2011

Imitation Learning of Positional and Force Skills Demonstrated via Kinesthetic Teaching and Haptic Input

Petar Kormushev; Sylvain Calinon; Darwin G. Caldwell

A method to learn and reproduce robot force interactions in a human–robot interaction setting is proposed. The method allows a robotic manipulator to learn to perform tasks that require exerting forces on external objects by interacting with a human operator in an unstructured environment. This is achieved by learning two aspects of a task: positional and force profiles. The positional profile is obtained from task demonstrations via kinesthetic teaching. The force profile is obtained from additional demonstrations via a haptic device. A human teacher uses the haptic device to input the desired forces that the robot should exert on external objects during the task execution. The two profiles are encoded as a mixture of dynamical systems, which is used to reproduce the task satisfying both the positional and force profiles. An active control strategy based on task-space control with variable stiffness is then proposed to reproduce the skill. The method is demonstrated with two experiments in which the robot learns an ironing task and a door-opening task.
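As a rough illustration of the reproduction stage, the sketch below combines a learned positional profile and a learned force profile in a simple task-space impedance law with variable stiffness. The control law, gains and profile samples are my assumptions, not the paper's actual controller.

```python
# A minimal sketch, assuming a task-space impedance law, of combining a
# learned positional profile with a learned force profile at reproduction.
import numpy as np

def task_space_command(x, x_des, f_des, K, D, v):
    """Commanded end-effector force: track x_des with variable stiffness K
    while superposing the desired interaction force f_des."""
    return K @ (x_des - x) - D @ v + f_des

# Example: one time step with 3-D position, profiles sampled at time t.
x = np.array([0.40, 0.00, 0.30])       # current end-effector position [m]
v = np.zeros(3)                         # current velocity [m/s]
x_des = np.array([0.42, 0.00, 0.28])    # learned positional profile sample
f_des = np.array([0.0, 0.0, -15.0])     # learned force profile sample [N]
K = np.diag([500.0, 500.0, 50.0])       # low stiffness along the force axis
D = np.diag([40.0, 40.0, 10.0])         # damping

print(task_space_command(x, x_des, f_des, K, D, v))
```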


Robotics | 2013

Reinforcement Learning in Robotics: Applications and Real-World Challenges

Petar Kormushev; Sylvain Calinon; Darwin G. Caldwell

In robotics, the ultimate goal of reinforcement learning is to endow robots with the ability to learn, improve, adapt and reproduce tasks with dynamically changing constraints, based on exploration and autonomous learning. We give a summary of the state of the art of reinforcement learning in the context of robotics, in terms of both algorithms and policy representations. Numerous challenges faced by the policy representation in robotics are identified. Three recent examples of the application of reinforcement learning to real-world robots are described: a pancake-flipping task, a bipedal-walking energy-minimization task and an archery-based aiming task. In all examples, a state-of-the-art expectation-maximization-based reinforcement learning algorithm is used, and different policy representations are proposed and evaluated for each task. The proposed policy representations offer viable solutions to six rarely addressed challenges in policy representations: correlations, adaptability, multi-resolution, globality, multi-dimensionality and convergence. Both the successes and the practical difficulties encountered in these examples are discussed. Based on insights from these particular cases, conclusions are drawn about the state of the art and future directions for reinforcement learning in robotics.


Intelligent Robots and Systems | 2011

Bipedal walking energy minimization by reinforcement learning with evolving policy parameterization

Petar Kormushev; Barkan Ugurlu; Sylvain Calinon; Nikolaos G. Tsagarakis; Darwin G. Caldwell

We present a learning-based approach for minimizing the electric energy consumption during walking of a passively compliant bipedal robot. The energy consumption is reduced by learning a varying-height center-of-mass trajectory which efficiently exploits the robot's passive compliance. To do this, we propose a reinforcement learning method which evolves the policy parameterization dynamically during the learning process, and thus manages to find better policies faster than with a fixed parameterization. The method is first tested on a function-approximation task, and then applied to the humanoid robot COMAN, where it achieves a significant energy reduction.
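The core idea of evolving the parameterization can be sketched as follows: start with a coarse spline policy and refine it by doubling the knot resolution, mapping the old policy exactly into the new parameter space so learning continues from the same behaviour. The knot counts and the staging are illustrative assumptions.

```python
# Sketch of an evolving spline parameterization: grow the representation
# during learning without changing the current policy's behaviour.
import numpy as np

def refine(knots_t, knots_y):
    """Double the knot resolution; linear interpolation preserves the
    current policy exactly, so learning continues from the same point."""
    t_new = np.linspace(knots_t[0], knots_t[-1], 2 * len(knots_t) - 1)
    return t_new, np.interp(t_new, knots_t, knots_y)

t = np.linspace(0.0, 1.0, 4)     # coarse CoM-height trajectory, 4 knots
y = np.zeros(4)
for stage in range(3):
    # ... run a fixed budget of RL updates on (t, y) here ...
    t, y = refine(t, y)          # then grow the representation
    print(f"stage {stage}: {len(t)} knots")
```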


International Conference on Mechatronics | 2013

Development of a dynamic simulator for a compliant humanoid robot based on a symbolic multibody approach

Houman Dallali; Mohamad Mosadeghzad; Gustavo A. Medrano-Cerda; Nicolas Docquier; Petar Kormushev; Nikos G. Tsagarakis; Zhibin Li; Darwin G. Caldwell

This paper reports on the development of an open-source dynamic simulator for the Compliant huMANoid robot COMAN. The key advantages of this simulator are that it generates efficient symbolic dynamical equations for a robot with many degrees of freedom, and that it includes a user-defined model of the actuator dynamics (the passive elasticity and the DC motor equations), user-defined ground models and fall detection. Users have the freedom to choose the proposed features or include their own models. The models are generated in the Matlab and C languages, so the user can leverage the power of Matlab and Simulink to carry out parameter-variation analyses or optimization, and also has the flexibility of the C language for real-time experiments on a DSP or FPGA chip. Simulation and experimental results for the robot, as well as an optimization example for tuning the ground-model coefficients, are presented. The simulator can be downloaded from the IIT website [1].
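For intuition, the sketch below integrates the kind of user-defined actuator model the abstract mentions: a DC motor driving a link through a passive spring (series elasticity). All parameter values are illustrative, not COMAN's.

```python
# Sketch of a series-elastic actuator model: DC motor + passive spring.
import numpy as np
from scipy.integrate import solve_ivp

K_s = 500.0   # spring stiffness [Nm/rad]
J_m = 0.01    # motor-side inertia [kg m^2]
J_l = 0.10    # link-side inertia  [kg m^2]
k_t = 0.05    # motor torque constant [Nm/A]

def sea_dynamics(t, s, i_motor):
    th_m, w_m, th_l, w_l = s
    tau_spring = K_s * (th_m - th_l)           # passive elasticity
    dw_m = (k_t * i_motor - tau_spring) / J_m  # DC motor side
    dw_l = tau_spring / J_l                    # link side
    return [w_m, dw_m, w_l, dw_l]

sol = solve_ivp(sea_dynamics, (0.0, 1.0), [0, 0, 0, 0], args=(1.0,))
print("final link angle [rad]:", sol.y[2, -1])
```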


IEEE-RAS International Conference on Humanoid Robots | 2010

Learning the skill of archery by a humanoid robot iCub

Petar Kormushev; Sylvain Calinon; Ryo Saegusa; Giorgio Metta

We present an integrated approach allowing the humanoid robot iCub to learn the skill of archery. After being instructed how to hold the bow and release the arrow, the robot learns by itself to shoot the arrow in such a way that it hits the center of the target. Two learning algorithms are proposed and compared for learning the bi-manual skill: one based on Expectation-Maximization-based Reinforcement Learning, and one based on chained vector regression, called the ARCHER algorithm. Both algorithms are used to modulate and coordinate the motion of the two hands, while an inverse kinematics controller is used for the motion of the arms. The image-processing part recognizes where the arrow hits the target, and is based on Gaussian Mixture Models for color-based detection of the target and the arrow's tip. The approach is evaluated on the 53-DOF humanoid robot iCub.
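A hedged sketch of the regression idea behind ARCHER: fit a local linear map from aiming parameters to where the arrow lands, then solve for the parameters that should put the arrow on the target centre. The least-squares formulation and the toy shoot function are my assumptions standing in for the real robot and vision pipeline.

```python
# Sketch: regress hit position on aiming parameters, then invert for a hit.
import numpy as np

rng = np.random.default_rng(1)
A_true = np.array([[0.8, 0.1], [-0.2, 0.9]])   # unknown ground truth

def shoot(params):
    """Hit position relative to the target centre (toy model + noise)."""
    return A_true @ params + np.array([0.3, -0.2]) + 0.01 * rng.normal(size=2)

params_hist, hits_hist = [], []
params = np.zeros(2)
for trial in range(8):
    hit = shoot(params)
    params_hist.append(params.copy())
    hits_hist.append(hit)
    if len(params_hist) >= 3:
        # Least-squares fit hit ~ A @ params + b, then aim for hit = 0.
        P = np.column_stack([np.array(params_hist), np.ones(len(params_hist))])
        coef, *_ = np.linalg.lstsq(P, np.array(hits_hist), rcond=None)
        A, b = coef[:2].T, coef[2]
        params = np.linalg.solve(A, -b)
    else:
        params = params + 0.1 * rng.normal(size=2)   # initial exploration
    print(f"trial {trial}: miss distance {np.linalg.norm(hit):.3f}")
```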


International Conference on Robotics and Automation | 2011

Upper-body kinesthetic teaching of a free-standing humanoid robot

Petar Kormushev; Dragomir N. Nenchev; Sylvain Calinon; Darwin G. Caldwell

We present an integrated approach allowing a free-standing humanoid robot to acquire new motor skills by kinesthetic teaching. The proposed method simultaneously controls the upper and lower body of the robot with different control strategies. Imitation learning is used for training the upper body of the humanoid robot via kinesthetic teaching, while at the same time the Reaction Null Space method is used for keeping the robot balanced. During demonstration, a force/torque sensor is used to record the exerted forces, and during reproduction, a hybrid position/force controller applies the learned trajectories, in terms of positions and forces, to the end effector. The proposed method is tested on a 25-DOF Fujitsu HOAP-2 humanoid robot with a surface-cleaning task.
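A minimal sketch of a hybrid position/force law of the kind mentioned for reproduction: a selection matrix picks which task axes are force controlled, while the rest track the learned positions. The gains and axis split are illustrative assumptions, not the paper's controller.

```python
# Sketch of hybrid position/force control with a selection matrix S.
import numpy as np

def hybrid_command(x, x_des, f, f_des, S, Kp, Kf):
    """S = diag(1 for force-controlled axes, 0 for position-controlled)."""
    I = np.eye(len(S))
    pos_term = (I - S) @ (Kp @ (x_des - x))   # position subspace
    force_term = S @ (Kf @ (f_des - f))       # force subspace
    return pos_term + force_term

S = np.diag([0.0, 0.0, 1.0])                  # press along z, track x-y
Kp = 400.0 * np.eye(3)
Kf = 0.5 * np.eye(3)
cmd = hybrid_command(x=np.array([0.3, 0.1, 0.2]),
                     x_des=np.array([0.32, 0.1, 0.2]),
                     f=np.array([0.0, 0.0, -5.0]),
                     f_des=np.array([0.0, 0.0, -10.0]),
                     S=S, Kp=Kp, Kf=Kf)
print(cmd)
```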


Robotics and Autonomous Systems | 2013

Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

Sylvain Calinon; Petar Kormushev; Darwin G. Caldwell

The democratization of robotics technology and the development of new actuators progressively bring robots closer to humans. The applications that can now be envisaged drastically contrast with the requirements of industrial robots. In standard manufacturing settings, the criteria used to assess performance are usually related to the robot's accuracy, repeatability, speed or stiffness. Learning a control policy to actuate such robots is characterized by the search for a single solution to the task, with a representation of the policy consisting of moving the robot through a set of points to follow a trajectory. In new environments such as homes and offices populated with humans, reproduction performance is portrayed differently. These robots are expected to acquire rich motor skills that can be generalized to new situations, while behaving safely in the vicinity of users. Skill acquisition can no longer be guided by a single form of learning, and must instead combine different approaches to continuously create, adapt and refine policies. The family of search strategies based on expectation-maximization (EM) looks particularly promising for coping with these new requirements. The exploration can be performed directly in the policy-parameter space, by refining the policy together with exploration parameters represented in the form of covariances. With this formulation, reinforcement learning can be extended to a multi-optima search problem in which several policy alternatives can be considered. We present two applications exploiting EM-based exploration strategies, by considering parameterized policies based on dynamical systems, and by using Gaussian mixture models for the search for multiple policy alternatives.
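The sketch below illustrates the multi-optima idea under simplifying assumptions: keep several Gaussian components over the policy-parameter space, sample from each, and refit every component with a reward-weighted mean and covariance so that distinct optima can coexist. The toy bimodal reward and all hyperparameters are my assumptions.

```python
# Sketch: reward-weighted refitting of a two-component Gaussian mixture
# over policy parameters, so two distinct optima can be tracked at once.
import numpy as np

rng = np.random.default_rng(2)

def reward(p):
    return max(np.exp(-np.sum((p - 2.0) ** 2)),
               np.exp(-np.sum((p + 2.0) ** 2)))   # two optima

means = [rng.normal(size=2) for _ in range(2)]
covs = [np.eye(2) for _ in range(2)]
for it in range(60):
    for k in range(2):
        samples = rng.multivariate_normal(means[k], covs[k], size=20)
        w = np.array([reward(s) for s in samples])
        w = w / (w.sum() + 1e-12)
        means[k] = w @ samples                       # weighted mean
        diff = samples - means[k]
        covs[k] = (w[:, None] * diff).T @ diff + 1e-3 * np.eye(2)

print("component means:", [m.round(2) for m in means])
```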


IFAC Proceedings Volumes | 2012

Persistent Autonomy: the Challenges of the PANDORA Project

David M. Lane; Francesco Maurelli; Petar Kormushev; Marc Carreras; Maria Fox; Konstantinos Kyriakopoulos

PANDORA is a three-year project developing new computational methods to make underwater robots persistently autonomous, significantly reducing the frequency of assistance requests. The aim of the project is to extend the range of tasks that can be carried out autonomously and to increase their complexity, while reducing the need for operator assistance. Dynamic adaptation to changing conditions is very important when addressing autonomy in the real world, not just in well-known situations. The key idea of PANDORA is the ability to recognise failure and respond to it, at all levels of abstraction. Under the guidance of major industrial players, validation tasks of inspection, cleaning and valve turning will be trialled with the partners' AUVs in Scotland and Spain.


Cybernetics and Information Technologies | 2012

Learning Fast Quadruped Robot Gaits with the RL PoWER Spline Parameterization

Haocheng Shen; Jason Yosinski; Petar Kormushev; Darwin G. Caldwell; Hod Lipson

Legged robots are uniquely privileged over their wheeled counterparts in their potential to access rugged terrain. However, designing walking gaits by hand for legged robots is a difficult and time-consuming process, so we seek algorithms for learning such gaits automatically, using real-world experimentation. Numerous previous studies have examined a variety of algorithms for learning gaits, using an assortment of different robots. It is often difficult to compare the algorithmic results from one study to the next, because the conditions and robots used vary. With this in mind, we have used an open-source, 3D-printed quadruped robot called QuadraTot, so the results may be verified, and hopefully improved upon, by any group so desiring. Because many robots do not have accurate simulators, we test gait-learning algorithms entirely on the physical robot. Previous studies using the QuadraTot have compared parameterized splines, the HyperNEAT generative encoding and a genetic algorithm. Among these, the research on the genetic algorithm was conducted by Glette et al. (2012) in a simulator and tested on a real robot. Here we compare these results to an algorithm called Policy learning by Weighting Exploration with the Returns, or RL PoWER. We report that this algorithm has learned the fastest gait yet reported in the literature using only physical experiments, 16.3% faster than the gait reported for HyperNEAT. In addition, the learned gaits are less taxing on the robot and more repeatable than the previous record-breaking gaits.
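For a sense of the representation RL PoWER optimizes here, the sketch below builds a periodic cubic-spline trajectory for one joint from a handful of control points; the learner would perturb those control-point values. Knot counts and the joint are illustrative, not QuadraTot's actual setup.

```python
# Sketch of a periodic spline gait parameterization for one motor.
import numpy as np
from scipy.interpolate import CubicSpline

n_knots = 6
phase = np.linspace(0.0, 1.0, n_knots + 1)      # one gait cycle

def gait_spline(control_points):
    """Periodic spline over one cycle; repeat the first point to close it."""
    y = np.append(control_points, control_points[0])
    return CubicSpline(phase, y, bc_type="periodic")

hip_points = np.array([0.0, 0.3, 0.5, 0.2, -0.2, -0.4])   # one joint [rad]
spline = gait_spline(hip_points)
t = np.linspace(0.0, 1.0, 200)
angles = spline(t)                # command these at each control tick
print("range of motion [rad]:", angles.min().round(2), angles.max().round(2))
```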

Collaboration


An overview of Petar Kormushev's collaborations.

Top Co-Authors

Darwin G. Caldwell
Istituto Italiano di Tecnologia

Seyed Reza Ahmadzadeh
Istituto Italiano di Tecnologia

Fangyan Dong
Tokyo Institute of Technology

Kaoru Hirota
Tokyo Institute of Technology

Nawid Jamali
Istituto Italiano di Tecnologia

Matteo Leonetti
University of Texas at Austin