Martin A. Riedmiller | Researchain

Publication

Featured research published by Martin A. Riedmiller.

international symposium on neural networks | 1993

A direct adaptive method for faster backpropagation learning: the RPROP algorithm

Martin A. Riedmiller; Heinrich Braun

A learning algorithm for multilayer feedforward networks, RPROP (resilient propagation), is proposed. To overcome the inherent disadvantages of pure gradient-descent, RPROP performs a local adaptation of the weight-updates according to the behavior of the error function. Contrary to other adaptive techniques, the effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but only dependent on the temporal behavior of its sign. This leads to an efficient and transparent adaptation process. The capabilities of RPROP are shown in comparison to other adaptive techniques.
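As a rough illustration of the sign-based adaptation described in the abstract above, the following Python sketch keeps one step size per weight and changes it only according to whether the gradient sign flipped between iterations. The function name `rprop_minimize`, the toy quadratic, and all constants (growth factor 1.2, shrink factor 0.5, initial step 0.1, step bounds) are assumptions chosen for illustration. Skipping the update after a sign change is one simplified variant and is not necessarily the exact procedure of the paper.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, weights, steps=100, eta_plus=1.2, eta_minus=0.5,
                   delta_init=0.1, delta_min=1e-6, delta_max=50.0):
    """Minimize a function using only the sign of its partial derivatives.

    grad_fn maps a list of weights to a list of partial derivatives.
    All default constants are illustrative assumptions.
    """
    w = list(weights)
    delta = [delta_init] * len(w)      # one step size per weight
    prev_grad = [0.0] * len(w)
    for _ in range(steps):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            change = g * prev_grad[i]
            if change > 0:
                # Same sign as last time: grow the step size.
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif change < 0:
                # Sign flipped, so a minimum was overshot: shrink the
                # step size and skip the update for this iteration.
                delta[i] = max(delta[i] * eta_minus, delta_min)
                g = 0.0
            # The update uses the sign of the gradient, not its magnitude.
            w[i] -= sign(g) * delta[i]
            prev_grad[i] = g
    return w


def toy_gradient(w):
    """Gradient of (w0 - 3)^2 + 100 * (w1 + 1)^2, a badly scaled quadratic."""
    return [2.0 * (w[0] - 3.0), 200.0 * (w[1] + 1.0)]


solution = rprop_minimize(toy_gradient, [0.0, 0.0], steps=300)
```

Because the update ignores the gradient magnitude, the two coordinates of the toy problem are handled alike even though their curvatures differ by a factor of 100.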

Nature | 2015

Human-level control through deep reinforcement learning

Volodymyr Mnih; Koray Kavukcuoglu; David Silver; Andrei A. Rusu; Joel Veness; Marc G. Bellemare; Alex Graves; Martin A. Riedmiller; Andreas K. Fidjeland; Georg Ostrovski; Stig Petersen; Charles Beattie; Amir Sadik; Ioannis Antonoglou; Helen King; Dharshan Kumaran; Daan Wierstra; Shane Legg; Demis Hassabis

The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Computer Standards & Interfaces | 1994

Advanced supervised learning in multi-layer perceptrons — From backpropagation to adaptive learning algorithms

Martin A. Riedmiller

Since the presentation of the backpropagation algorithm [1] a vast variety of improvements of the technique for training the weights in a feed-forward neural network have been proposed. The following article introduces the concept of supervised learning in multi-layer perceptrons based on the technique of gradient descent. Some problems and drawbacks of the original backpropagation learning procedure are discussed, eventually leading to the development of more sophisticated techniques. This article concentrates on adaptive learning strategies. Some of the most popular learning algorithms are described and discussed according to their classification in terms of global and local adaptation strategies. The behavior of several learning procedures on some popular benchmark problems is reported, thereby illuminating convergence, robustness, and scaling properties of the respective algorithms.

european conference on machine learning | 2005

Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method

Martin A. Riedmiller

This paper introduces NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron. Based on the principle of storing and reusing transition experiences, a model-free, neural network based Reinforcement Learning algorithm is proposed. The method is evaluated on three benchmark problems. It is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.
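The idea of storing transitions and reusing them for repeated batch updates of a Q-value function can be sketched as follows. This is not NFQ itself: a lookup table stands in for the multi-layer perceptron so that the example runs with the standard library only, and the reward-maximizing formulation, the discount factor, the function name `fitted_q_iteration`, and the three-state toy chain are all assumptions made for illustration.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, actions, gamma=0.9, iterations=50):
    """Batch Q iteration over a fixed set of stored transitions.

    transitions: list of (state, action, reward, next_state, done) tuples.
    A table fitted by averaging targets replaces the neural network here.
    """
    q = defaultdict(float)
    for _ in range(iterations):
        sums = defaultdict(float)
        counts = defaultdict(int)
        # Build one regression target per stored transition from the current Q.
        for s, a, r, s_next, done in transitions:
            best_next = 0.0 if done else max(q[(s_next, b)] for b in actions)
            sums[(s, a)] += r + gamma * best_next
            counts[(s, a)] += 1
        # "Fit" the new Q-function to the whole batch of targets at once.
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q


# Hypothetical toy chain: states 0 and 1, with a rewarded terminal state 2.
ACTIONS = ["left", "right"]
stored = [
    (0, "right", 0.0, 1, False),
    (1, "right", 1.0, 2, True),
    (1, "left", 0.0, 0, False),
    (0, "left", 0.0, 0, False),
]
q_values = fitted_q_iteration(stored, ACTIONS)
greedy_at_0 = max(ACTIONS, key=lambda a: q_values[(0, a)])
```

No new interaction with the environment happens inside the loop; every iteration reuses the same stored transitions, which is the data-efficiency argument made in the abstract.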

international conference on robotics and automation | 2012

A learned feature descriptor for object recognition in RGB-D data

Manuel Blum; Jost Tobias Springenberg; Jan Wülfing; Martin A. Riedmiller

In this work we address the problem of feature extraction for object recognition in the context of cameras providing RGB and depth information (RGB-D data). We consider this problem in a bag-of-features-like setting and propose a new, learned, local feature descriptor for RGB-D images, the convolutional k-means descriptor. The descriptor is based on recent results from the machine learning community. It automatically learns feature responses in the neighborhood of detected interest points and is able to combine all available information, such as color and depth, into one concise representation. To demonstrate the strength of this approach we show its applicability to different recognition problems. We evaluate the quality of the descriptor on the RGB-D Object Dataset, where it is competitive with previously published results, and propose an embedding into an image processing pipeline for object recognition and pose estimation.

intelligent robots and systems | 2015

Multimodal deep learning for robust RGB-D object recognition

Andreas Eitel; Jost Tobias Springenberg; Luciano Spinello; Martin A. Riedmiller; Wolfram Burgard

Robust object recognition is a crucial ingredient of many, if not all, real-world robotics applications. This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition. Our architecture is composed of two separate CNN processing streams, one for each modality, which are consecutively combined with a late fusion network. We focus on learning with imperfect sensor data, a typical problem in real-world robotics tasks. For accurate learning, we introduce a multi-stage training methodology and two crucial ingredients for handling depth data with CNNs. The first is an effective encoding of depth information for CNNs that enables learning without the need for large depth datasets. The second is a data augmentation scheme for robust learning with depth images by corrupting them with realistic noise patterns. We present state-of-the-art results on the RGB-D object dataset [15] and show recognition in challenging RGB-D real-world noisy settings.

Autonomous Robots | 2009

Reinforcement learning for robot soccer

Martin A. Riedmiller; Thomas Gabel; Roland Hafner; Sascha Lange

Batch reinforcement learning methods provide a powerful framework for learning efficiently and effectively in autonomous robots. The paper reviews some recent work of the authors aiming at the successful application of reinforcement learning in a challenging and complex domain. It discusses several variants of the general batch learning framework, particularly tailored to the use of multilayer perceptrons to approximate value functions over continuous state spaces. The batch learning framework is successfully used to learn crucial skills in our soccer-playing robots participating in the RoboCup competitions. This is demonstrated on three different case studies.

international symposium on neural networks | 2010

Deep auto-encoder neural networks in reinforcement learning

Sascha Lange; Martin A. Riedmiller

This paper discusses the effectiveness of deep auto-encoder neural networks in visual reinforcement learning (RL) tasks. We propose a framework for combining the training of deep auto-encoders (for learning compact feature spaces) with recently-proposed batch-mode RL algorithms (for learning policies). An emphasis is put on the data-efficiency of this combination and on studying the properties of the feature spaces automatically constructed by the deep auto-encoders. These feature spaces are empirically shown to adequately resemble existing similarities and spatial relations between observations and to allow useful policies to be learned. We propose several methods for improving the topology of the feature spaces making use of task-dependent information. Finally, we present first results on successfully learning good control policies directly on synthesized and real images.

robot soccer world cup | 2002

Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer

Artur Merke; Martin A. Riedmiller

Our long-term goal is to build a robot soccer team where the decision making part is based completely on Reinforcement Learning (RL) methods. The paper describes the overall approach pursued by the Karlsruhe Brainstormers simulator league team. The main parts of basic decision making have by now been solved using RL techniques. On the tactical level, first empirical results are presented for 2 against 2 attack situations.

international symposium on neural networks | 2012

Autonomous reinforcement learning on raw visual input data in a real world application

Sascha Lange; Martin A. Riedmiller; Arne Voigtländer

We propose a learning architecture that is able to do reinforcement learning based on raw visual input data. In contrast to previous approaches, not only the control policy is learned. In order to be successful, the system must also autonomously learn how to extract relevant information out of a high-dimensional stream of input information, for which the semantics are not provided to the learning system. We give a first proof-of-concept of this novel learning architecture on a challenging benchmark, namely visual control of a racing slot car. The resulting policy, learned only by success or failure, is hardly beaten by an experienced human player.
