Publication


Featured research published by Peter Englert.


international conference on robotics and automation | 2014

Multi-Task Policy Search for Robotics

Marc Peter Deisenroth; Peter Englert; Jan Peters; Dieter Fox

Learning policies that generalize across multiple tasks is an important and challenging research topic in reinforcement learning and robotics. Training individual policies for every single potential task is often impractical, especially for continuous task variations, requiring more principled approaches to share and transfer knowledge among similar tasks. We present a novel approach for learning a nonlinear feedback policy that generalizes across multiple tasks. The key idea is to define a parametrized policy as a function of both the state and the task, which allows learning a single policy that generalizes across multiple known and unknown tasks. Applications of our novel approach to reinforcement and imitation learning in real-robot experiments are shown.
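
The key construction here, a single policy that takes the task as an additional input, can be sketched in a few lines. The following toy snippet is illustrative only; the RBF feature choice, names, and dimensions are assumptions, not the authors' code:

    import numpy as np

    def rbf_features(x, centers, lengthscale=1.0):
        # squared distances from the joint (state, task) input to each center
        d = x[None, :] - centers
        return np.exp(-0.5 * np.sum(d**2, axis=1) / lengthscale**2)

    def policy(state, task, weights, centers):
        # action = w^T phi([state; task]): the task is simply another policy
        # input, so one weight vector covers all task variations
        x = np.concatenate([state, task])
        return weights.T @ rbf_features(x, centers)

    # the same weights produce actions for two different tasks
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(50, 3))   # 2-d state + 1-d task (hypothetical sizes)
    weights = rng.normal(size=(50, 1))
    a_task0 = policy(np.array([0.1, -0.2]), np.array([0.0]), weights, centers)
    a_task1 = policy(np.array([0.1, -0.2]), np.array([1.0]), weights, centers)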


international conference on robotics and automation | 2013

Model-based imitation learning by probabilistic trajectory matching

Peter Englert; Alexandros Paraschos; Jan Peters; Marc Peter Deisenroth

One of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot.
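
The matching cost at the heart of this approach, the KL divergence between the teacher's and the predicted trajectory distributions, has a closed form when both are Gaussian. A minimal sketch using the standard textbook formula (not the paper's code):

    import numpy as np

    def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
        # KL(p || q) for multivariate Gaussians: the cost minimized when
        # matching the predicted trajectory distribution to the teacher's
        k = mu_p.shape[0]
        cov_q_inv = np.linalg.inv(cov_q)
        diff = mu_q - mu_p
        return 0.5 * (np.trace(cov_q_inv @ cov_p)
                      + diff @ cov_q_inv @ diff
                      - k
                      + np.log(np.linalg.det(cov_q) / np.linalg.det(cov_p)))

    # example: two 2-d trajectory-point distributions
    kl = gaussian_kl(np.zeros(2), np.eye(2), np.ones(2), 2 * np.eye(2))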


Adaptive Behavior | 2013

Probabilistic model-based imitation learning

Peter Englert; Alexandros Paraschos; Marc Peter Deisenroth; Jan Peters

Efficient skill acquisition is crucial for creating versatile robots. One intuitive way to teach a robot new tricks is to demonstrate a task and enable the robot to imitate the demonstrated behavior. This approach is known as imitation learning. Classical methods of imitation learning, such as inverse reinforcement learning or behavioral cloning, suffer substantially from the correspondence problem when the actions (i.e. motor commands, torques or forces) of the teacher are not observed or the body of the teacher differs substantially, e.g., in the actuation. To address these drawbacks, we propose to learn a robot-specific controller that directly matches robot trajectories with observed ones. We present a novel and robust probabilistic model-based approach for solving a probabilistic trajectory matching problem via policy search. For this purpose, we propose to learn a probabilistic model of the system, which we exploit for mental rehearsal of the current controller by making predictions about future trajectories. These internal simulations allow for learning a controller without permanently interacting with the real system, which results in a reduced overall interaction time. Using long-term predictions from this learned model, we train robot-specific controllers that reproduce the expert’s distribution of demonstrations without the need to observe motor commands during the demonstration. The strength of our approach is that it addresses the correspondence problem in a principled way. Our method achieves a higher learning speed than both model-based imitation learning based on dynamic movement primitives and trial-and-error-based learning systems with hand-crafted cost functions. We successfully applied our approach to imitating human behavior using a tendon-driven compliant robotic arm. Moreover, we demonstrate the generalization ability of our approach in a multi-task learning setup.
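
The "mental rehearsal" step, predicting future trajectories with a learned probabilistic forward model instead of running the real robot, can be sketched as follows. For brevity this uses Monte Carlo sampling rather than the GP moment matching the authors build on; all names are illustrative assumptions:

    import numpy as np

    def rehearse(model, controller, x0, horizon, n_samples=100, seed=0):
        # roll out the learned probabilistic model under the current
        # controller; model(x, u) returns the mean and std of the next state
        rng = np.random.default_rng(seed)
        trajs = np.empty((n_samples, horizon + 1, x0.shape[0]))
        trajs[:, 0] = x0
        for i in range(n_samples):
            for t in range(horizon):
                u = controller(trajs[i, t])
                mean, std = model(trajs[i, t], u)
                trajs[i, t + 1] = rng.normal(mean, std)  # sample a next state
        # empirical trajectory distribution, usable in the KL matching cost
        return trajs

    # toy linear system and controller, purely for illustration
    toy_model = lambda x, u: (0.9 * x + 0.1 * u, 0.01 * np.ones_like(x))
    toy_controller = lambda x: -0.5 * x
    samples = rehearse(toy_model, toy_controller, np.array([1.0]), horizon=20)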


intelligent robots and systems | 2014

Dual execution of optimized contact interaction trajectories

Marc Toussaint; Nathan D. Ratliff; Jeannette Bohg; Ludovic Righetti; Peter Englert; Stefan Schaal

Efficient manipulation requires contact to reduce uncertainty. The manipulation literature refers to this as funneling: a methodology for increasing reliability and robustness by leveraging haptic feedback and control of environmental interaction. However, there is a fundamental gap between traditional approaches to trajectory optimization and this concept of robustness by funneling: traditional trajectory optimizers do not discover force feedback strategies. From a POMDP perspective, these behaviors could be regarded as explicit observation actions planned to sufficiently reduce uncertainty, thereby enabling a task. While we are sympathetic to the full POMDP view, solving full continuous-space POMDPs in high dimensions is hard. In this paper, we propose an alternative approach in which trajectory optimization objectives are augmented with new terms that reward uncertainty reduction through contacts, explicitly promoting funneling. This augmentation shifts the responsibility for robustness toward the actual execution of the optimized trajectories. Directly tracking trajectories through configuration space would lose all robustness; dual execution achieves robustness by devising force controllers that reproduce the temporal interaction profile encoded in the dual solution of the optimization problem. This work introduces dual execution in depth and analyzes its performance through robustness experiments, both in simulation and on a real-world robotic platform.
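
The execution idea can be illustrated with a deliberately simplified, one-dimensional control step: wherever the optimizer's dual solution prescribes a contact force, regulate the measured force toward it; elsewhere, track the reference path. The gains, names, and scalar decomposition below are assumptions for illustration, not the paper's controller:

    def dual_execution_step(q, q_ref, f_meas, f_star, kp=50.0, kf=0.1):
        # one control step along a single task-space direction: during
        # planned contact phases (f_star > 0), regulate the measured force
        # toward the profile from the dual solution; otherwise track the
        # reference position
        if f_star > 0.0:
            return kf * (f_star - f_meas)   # force regulation at the contact
        return kp * (q_ref - q)             # free-space position tracking

    # example: in a contact phase the command depends on the force error only
    u = dual_execution_step(q=0.30, q_ref=0.32, f_meas=2.0, f_star=5.0)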


robotics science and systems | 2016

Combined Optimization and Reinforcement Learning for Manipulation Skills

Peter Englert; Marc Toussaint

This work addresses the problem of how a robot can improve a manipulation skill in a sample-efficient and secure manner. As an alternative to the standard reinforcement learning formulation, where all objectives are defined in a single reward function, we propose a generalized formulation that consists of three components: 1) a known analytic control cost function; 2) a black-box return function; and 3) a black-box binary success constraint. While the overall policy optimization problem is high-dimensional, in typical robot manipulation problems we can assume that the black-box return and constraint only depend on a lower-dimensional projection of the solution. With our formulation we can exploit this structure for a sample-efficient learning framework that iteratively improves the policy with respect to the objective functions under the success constraint. We employ efficient 2nd-order optimization methods to optimize the high-dimensional policy w.r.t. the analytic cost function while keeping the lower-dimensional projection fixed. This is alternated with safe Bayesian optimization over the lower-dimensional projection to address the black-box return and success constraint. During both improvement steps the success constraint is used to keep the optimization in a secure region and to clearly distinguish between motions that lead to success or failure. The learning algorithm is evaluated on a simulated benchmark problem and a door opening task with a PR2.
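
A toy, self-contained skeleton of the alternation just described, under the assumption that the policy splits into a high-dimensional part theta (seen only by the analytic cost) and a low-dimensional projection z (seen by the black-box return and success constraint). The "safe" step is reduced to accepting only successful local candidates, a crude stand-in for the paper's constrained Bayesian optimization:

    import numpy as np

    # stand-ins for the three components of the formulation
    analytic_cost_grad = lambda theta: 2 * theta    # gradient of ||theta||^2
    blackbox_return = lambda z: -(z - 0.7) ** 2     # observed only per trial
    success = lambda z: abs(z) < 1.0                # binary success constraint

    theta, z = np.ones(20), 0.1
    rng = np.random.default_rng(0)
    for _ in range(25):
        # 1) second-order step on the known analytic cost with z fixed
        #    (for this quadratic cost the Newton step is exact)
        theta = theta - 0.5 * analytic_cost_grad(theta)
        # 2) crude safe step over the low-dimensional projection: propose
        #    near the current feasible z, accept only safe improvements
        z_cand = z + 0.2 * rng.standard_normal()
        if success(z_cand) and blackbox_return(z_cand) > blackbox_return(z):
            z = z_cand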


international conference on robotics and automation | 2015

Sparse Gaussian process regression for compliant, real-time robot control

Jens Schreiter; Peter Englert; Duy Nguyen-Tuong; Marc Toussaint

Sparse Gaussian process (GP) models provide an efficient way to perform regression on large data sets. The key idea is to select a representative subset of the available training data, which induces the sparse GP model approximation. In the past, a variety of selection criteria for GP approximation have been proposed, but they either lack accuracy or suffer from high computational costs. In this paper, we introduce a novel and straightforward criterion for the successive selection of training points used for GP model approximation. The proposed algorithm allows a fast and efficient selection of training points, while being competitive in learning performance. As evaluation, we employ our approach in learning inverse dynamics models for robot control using very large data sets (e.g., 500,000 samples). Experiments demonstrate that our approximated GP model is sufficiently fast for real-time prediction in robot control. Comparisons with other state-of-the-art approximation techniques show that our proposed approach is significantly faster, while remaining competitive in generalization accuracy.
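
The generic pattern of successive training-point selection can be sketched with a common variance-based greedy rule: repeatedly add the candidate the current sparse model is most uncertain about. This criterion is a textbook stand-in used here for illustration; it is not the specific criterion proposed in the paper:

    import numpy as np

    def rbf_kernel(A, B, ls=1.0):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d / ls ** 2)

    def greedy_subset(X, n_select, noise=1e-2):
        idx = [0]                            # seed with an arbitrary first point
        while len(idx) < n_select:
            Kss = rbf_kernel(X[idx], X[idx]) + noise * np.eye(len(idx))
            Ks = rbf_kernel(X, X[idx])
            # GP posterior variance of every candidate given the current subset
            var = 1.0 - np.einsum('ij,ij->i', Ks @ np.linalg.inv(Kss), Ks)
            var[idx] = -np.inf               # never re-select chosen points
            idx.append(int(np.argmax(var)))  # add the most uncertain candidate
        return idx

    X = np.random.default_rng(0).uniform(-3, 3, size=(500, 2))
    subset = greedy_subset(X, n_select=20)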


The International Journal of Robotics Research | 2018

Learning manipulation skills from a single demonstration

Peter Englert; Marc Toussaint

We consider the scenario in which a robot is shown a manipulation skill once and must then use only a few trials on its own to learn to reproduce, optimize, and generalize that same skill. A manipulation skill is generally a high-dimensional policy. To achieve the desired sample efficiency, we need to exploit the inherent structure in this problem. With our approach, we propose to decompose the problem into analytically known objectives, such as motion smoothness, and black-box objectives that depend on the interaction with the environment, such as trial success or reward. The decomposition allows us to leverage and combine (i) constrained optimization methods to address the analytic objectives, (ii) constrained Bayesian optimization to explore the black-box objectives, and (iii) inverse optimal control methods to eventually extract a generalizable skill representation. The algorithm is evaluated on a synthetic benchmark experiment and compared with state-of-the-art learning methods. We also demonstrate its performance in real-robot experiments with a PR2.
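
Step (iii), extracting a skill representation with inverse optimal control, is developed in the authors' Inverse KKT work (listed below). Its core idea, recovering cost weights under which the demonstration is a stationary point, fits in a few lines; the features and demonstration here are toy assumptions, not the paper's formulation:

    import numpy as np

    x_star = np.array([1.0, 2.0])   # demonstrated (assumed optimal) configuration
    # gradients of two candidate cost features, evaluated at the demonstration
    feature_grads = np.stack([
        2 * (x_star - np.array([1.0, 2.0])),   # feature 1: ||x - (1, 2)||^2
        2 * (x_star - np.array([0.0, 0.0])),   # feature 2: ||x||^2
    ])
    # find weights w on the simplex minimizing the stationarity (KKT) residual
    # ||sum_i w_i * grad f_i(x_star)||; brute-force grid search for brevity
    ws = np.linspace(0.0, 1.0, 101)
    residuals = [np.linalg.norm(feature_grads.T @ np.array([w, 1.0 - w]))
                 for w in ws]
    w_hat = ws[int(np.argmin(residuals))]      # -> all weight on feature 1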


international conference on robotics and automation | 2017

Constrained Bayesian optimization of combined interaction force/task space controllers for manipulations

Danny Driess; Peter Englert; Marc Toussaint

In this paper, we address the problem of how a robot can actively optimize the parameters of combined interaction force/task space controllers under a success constraint. To enable the robot to explore its environment robustly, safely, and without the risk of damaging anything, suitable control concepts have to be developed that enable compliant and force control in situations afflicted with high uncertainty. Instances of such concepts are impedance, operational space, or hybrid control. However, the parameters of these controllers have to be tuned precisely in order to achieve reasonable performance, which is inherently challenging, as often no sufficient model of the environment is available. To overcome this, we propose to use constrained Bayesian optimization to enable the robot to tune its controller parameters autonomously. Unlike other controller tuning methods, this approach allows us to include a success constraint in the optimization. Further, we introduce novel performance measures for compliant, force-controlled robots. In real-world experiments, we show that our approach is able to efficiently and successfully optimize the parameters for a task that consists of establishing and maintaining contact between the robot and the environment.
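
A common acquisition function for constrained Bayesian optimization of this kind weights expected improvement by the probability that the success constraint holds, both computed from Gaussian-process surrogates. The sketch below uses placeholder surrogate outputs and omits GP fitting; it illustrates the generic constrained-BO pattern rather than the paper's exact method:

    import numpy as np
    from scipy.stats import norm

    def constrained_ei(mu, sigma, best, mu_c, sigma_c):
        # expected improvement of the return surrogate over the best observed
        # value, weighted by the probability that the success constraint
        # c(x) >= 0 holds under its own surrogate
        z = (mu - best) / sigma
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
        p_feasible = norm.cdf(mu_c / sigma_c)
        return ei * p_feasible

    # pick the candidate controller parameters with the highest acquisition;
    # the surrogate means/stds below are placeholders for fitted GP outputs
    mu = np.array([0.2, 0.5, 0.4]); sigma = np.array([0.10, 0.30, 0.05])
    mu_c = np.array([1.0, -0.5, 2.0]); sigma_c = np.ones(3)
    best_candidate = int(np.argmax(constrained_ei(mu, sigma, 0.3, mu_c, sigma_c)))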


The International Journal of Robotics Research | 2017

Inverse KKT – Learning Cost Functions of Manipulation Tasks from Demonstrations

Peter Englert; Marc Toussaint


intelligent robots and systems | 2014

Reactive phase and task space adaptation for robust motion execution

Peter Englert; Marc Toussaint

Collaboration


Dive into Peter Englert's collaborations.

Top Co-Authors

Alexandros Paraschos (Technische Universität Darmstadt)
Dieter Fox (University of Washington)
Danny Driess (University of Stuttgart)