Bernd Porr | Researchain

Publication

Featured research published by Bernd Porr.

Neural Computation | 2005

Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

Florentin Wörgötter; Bernd Porr

In this review, we compare methods for temporal sequence learning (TSL) across the disciplines machine-control, classical conditioning, neuronal models for TSL as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? and How do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment will be discussed, showing that both learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal-ganglia, thalamus, and cortex) and the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.

The International Journal of Robotics Research | 2006

Fast Biped Walking with a Sensor-driven Neuronal Controller and Real-time Online Learning

Tao Geng; Bernd Porr; Florentin Wörgötter

In this paper, we present our design and experiments on a planar biped robot under the control of a pure sensor-driven controller. This design has some special mechanical features, for example small curved feet allowing rolling action and a properly positioned center of mass, that facilitate fast walking through exploitation of the robot's natural dynamics. Our sensor-driven controller is built with biologically inspired sensor- and motor-neuron models, and does not employ any kind of position or trajectory tracking control algorithm. Instead, it allows our biped robot to exploit its own natural dynamics during critical stages of its walking gait cycle. Due to the interaction between the sensor-driven neuronal controller and the properly designed mechanics of the robot, the biped robot can realize stable dynamic walking gaits in a large domain of the neuronal parameters. In addition, this structure allows the use of a policy gradient reinforcement learning algorithm to tune the parameters of the sensor-driven controller in real time, during walking. This way RunBot can reach a relative speed of 3.5 leg lengths per second after only a few minutes of online learning, which is faster than that of any other biped robot, and is also comparable to the fastest relative speed of human walking.

PLOS Computational Biology | 2007

Adaptive, fast walking in a biped robot under neuronal control and learning

Poramate Manoonpong; Tao Geng; Tomas Kulvicius; Bernd Porr; Florentin Wörgötter

Human walking is a dynamic, partly self-stabilizing process relying on the interaction of the biomechanical design with its neuronal control. The coordination of this process is a very difficult problem, and it has been suggested that it involves a hierarchy of levels, where the lower ones, e.g., interactions between muscles and the spinal cord, are largely autonomous, and where higher level control (e.g., cortical) arises only pointwise, as needed. This requires an architecture of several nested, sensori-motor loops where the walking process provides feedback signals to the walker's sensory systems, which can be used to coordinate its movements. To complicate the situation, at a maximal walking speed of more than four leg-lengths per second, the cycle period available to coordinate all these loops is rather short. In this study we present a planar biped robot, which uses the design principle of nested loops to combine the self-stabilizing properties of its biomechanical design with several levels of neuronal control. Specifically, we show how to adapt control by including online learning mechanisms based on simulated synaptic plasticity. This robot can walk with a high speed (>3.0 leg length/s), self-adapting to minor disturbances, and reacting in a robust way to abruptly induced gait changes. At the same time, it can learn walking on different terrains, requiring only a few learning experiences. This study shows that the tight coupling of physical with neuronal control, guided by sensory feedback from the walking pattern itself, combined with synaptic learning may be a way forward to better understand and solve coordination problems in other complex motor tasks.

Neural Computation | 2003

Isotropic sequence order learning

Bernd Porr; Florentin Wörgötter

In this article, we present an isotropic unsupervised algorithm for temporal sequence learning. No special reward signal is used such that all inputs are completely isotropic. All input signals are bandpass filtered before converging onto a linear output neuron. All synaptic weights change according to the correlation of bandpass-filtered inputs with the derivative of the output. We investigate the algorithm in an open- and a closed-loop condition, the latter being defined by embedding the learning system into a behavioral feedback loop. In the open-loop condition, we find that the linear structure of the algorithm allows analytically calculating the shape of the weight change, which is strictly heterosynaptic and follows the shape of the weight change curves found in spike-time-dependent plasticity. Furthermore, we show that synaptic weights stabilize automatically when no more temporal differences exist between the inputs without additional normalizing measures. In the second part of this study, the algorithm is placed in an environment that leads to a closed sensor-motor loop. To this end, a robot is programmed with a prewired retraction reflex reaction in response to collisions. Through isotropic sequence order (ISO) learning, the robot achieves collision avoidance by learning the correlation between its early range-finder signals and the later occurring collision signal. Synaptic weights stabilize at the end of learning as theoretically predicted. Finally, we discuss the relation of ISO learning with other drive reinforcement models and with the commonly used temporal difference learning algorithm. This study is followed up by a mathematical analysis of the closed-loop situation in the companion article in this issue, ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm (pp. 865–884).
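
The update described in this abstract (bandpass-filtered inputs, a linear output neuron, weights driven by the correlation of each filtered input with the derivative of the output) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the BandPass class (a fast leaky integrator minus a slow one), the pulse helper, the learning rate mu, and the pulse timings are all assumptions chosen for illustration.

```python
class BandPass:
    """Stand-in band-pass filter (an assumption): fast leaky integrator minus a slow one."""

    def __init__(self, fast=0.3, slow=0.05):
        self.fast, self.slow = fast, slow
        self.y_fast = 0.0
        self.y_slow = 0.0

    def step(self, x):
        self.y_fast += self.fast * (x - self.y_fast)
        self.y_slow += self.slow * (x - self.y_slow)
        return self.y_fast - self.y_slow


def iso_learning(inputs, weights, mu=0.01):
    """Run an ISO-style update over equal-length input signals; return the final weights."""
    filters = [BandPass() for _ in inputs]
    w = list(weights)
    v_prev = 0.0
    for t in range(len(inputs[0])):
        # Band-pass filter every input, then sum linearly into the output v.
        u = [f.step(x[t]) for f, x in zip(filters, inputs)]
        v = sum(wi * ui for wi, ui in zip(w, u))
        dv = v - v_prev  # discrete-time derivative of the output
        # Every weight changes with (filtered input) * (output derivative).
        w = [wi + mu * ui * dv for wi, ui in zip(w, u)]
        v_prev = v
    return w


def pulse(at, length=200):
    """Hypothetical helper: a unit pulse at time step `at`."""
    return [1.0 if t == at else 0.0 for t in range(length)]


# Input 0 starts with weight 0; input 1 plays the role of the reflex (weight 1).
w_early_first = iso_learning([pulse(10), pulse(13)], [0.0, 1.0])
w_late_first = iso_learning([pulse(13), pulse(10)], [0.0, 1.0])
print(w_early_first[0], w_late_first[0])
```

In this sketch the weight of input 0 grows when its pulse precedes the reflex pulse and shrinks when it follows it, which is the order-dependent behaviour the abstract compares to spike-timing-dependent plasticity curves.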

Neural Computation | 2006

Strongly improved stability and faster convergence of temporal sequence learning by using input correlations only

Bernd Porr; Florentin Wörgötter

Currently all important, low-level, unsupervised network learning algorithms follow the paradigm of Hebb, where input and output activity are correlated to change the connection strength of a synapse. However, as a consequence, classical Hebbian learning always carries a potentially destabilizing autocorrelation term, which is due to the fact that every input is in a weighted form reflected in the neuron's output. This self-correlation can lead to positive feedback, where increasing weights will increase the output, and vice versa, which may result in divergence. This can be avoided by different strategies like weight normalization or weight saturation, which, however, can cause different problems. Consequently, in most cases, high learning rates cannot be used for Hebbian learning, leading to relatively slow convergence. Here we introduce a novel correlation-based learning rule that is related to our isotropic sequence order (ISO) learning rule (Porr & Wörgötter, 2003a), but replaces the derivative of the output in the learning rule with the derivative of the reflex input. Hence, the new rule uses input correlations only, effectively implementing strict heterosynaptic learning. This looks like a minor modification but leads to dramatically improved properties. Elimination of the output from the learning rule removes the unwanted, destabilizing autocorrelation term, allowing us to use high learning rates. As a consequence, we can mathematically show that the theoretical optimum of one-shot learning can be reached under ideal conditions with the new rule. This result is then tested against four different experimental setups, and we will show that in all of them, very few (and sometimes only one) learning experiences are needed to achieve the learning goal. As a consequence, the new learning rule is up to 100 times faster and in general more stable than ISO learning.
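
The modification described in this abstract (using the derivative of the reflex input instead of the derivative of the output) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published implementation: the BandPass filter, the function name input_correlation_learning, the pulse helper, and all parameter values are hypothetical.

```python
class BandPass:
    """Stand-in band-pass filter (an assumption): fast leaky integrator minus a slow one."""

    def __init__(self, fast=0.3, slow=0.05):
        self.fast, self.slow = fast, slow
        self.y_fast = 0.0
        self.y_slow = 0.0

    def step(self, x):
        self.y_fast += self.fast * (x - self.y_fast)
        self.y_slow += self.slow * (x - self.y_slow)
        return self.y_fast - self.y_slow


def input_correlation_learning(predictive, reflex, mu=0.01):
    """Return the learned weight of the predictive input after one pass over the signals."""
    f_pred, f_reflex = BandPass(), BandPass()
    w = 0.0
    u_reflex_prev = 0.0
    for x_pred, x_reflex in zip(predictive, reflex):
        u_pred = f_pred.step(x_pred)
        u_reflex = f_reflex.step(x_reflex)
        # Only inputs appear here: filtered predictive input times the
        # discrete derivative of the filtered reflex input. The neuron's
        # output is not part of the update.
        w += mu * u_pred * (u_reflex - u_reflex_prev)
        u_reflex_prev = u_reflex
    return w


def pulse(at, length=200):
    """Hypothetical helper: a unit pulse at time step `at`."""
    return [1.0 if t == at else 0.0 for t in range(length)]


w_slow = input_correlation_learning(pulse(10), pulse(13), mu=0.01)
w_fast = input_correlation_learning(pulse(10), pulse(13), mu=1.0)
w_silent = input_correlation_learning(pulse(10), [0.0] * 200)
print(w_slow, w_fast, w_silent)
```

Because the output never feeds back into the update in this sketch, the weight change scales linearly with the learning rate, and the weight stays at zero when the reflex input is silent.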

Neural Computation | 2006

A Reflexive Neural Network for Dynamic Biped Walking Control

Tao Geng; Bernd Porr; Florentin Wörgötter

Biped walking remains a difficult problem, and robot models can greatly facilitate our understanding of the underlying biomechanical principles as well as their neuronal control. The goal of this study is to specifically demonstrate that stable biped walking can be achieved by combining the physical properties of the walking robot with a small, reflex-based neuronal network governed mainly by local sensor signals. Building on earlier work (Taga, 1995; Cruse, Kindermann, Schumm, Dean, & Schmitz, 1998), this study shows that human-like gaits emerge without specific position or trajectory control and that the walker is able to compensate small disturbances through its own dynamical properties. The reflexive controller used here has the following characteristics, which are different from earlier approaches: (1) Control is mainly local. Hence, it uses only two signals (anterior extreme angle and ground contact), which operate at the interjoint level. All other signals operate only at single joints. (2) Neither position control nor trajectory tracking control is used. Instead, the approximate nature of the local reflexes on each joint allows the robot mechanics itself (e.g., its passive dynamics) to contribute substantially to the overall gait trajectory computation. (3) The motor control scheme used in the local reflexes of our robot is more straightforward and has more biological plausibility than that of other robots, because the outputs of the motor neurons in our reflexive controller are directly driving the motors of the joints rather than working as references for position or velocity control. As a consequence, the neural controller and the robot mechanics are closely coupled as a neuromechanical system, and this study emphasizes that dynamically stable biped walking gaits emerge from the coupling between neural computation and physical computation. This is demonstrated by different walking experiments using a real robot as well as by a Poincaré map analysis applied to a model of the robot in order to assess its stability.

Neural Computation | 2007

Learning with “Relevance”: Using a Third Factor to Stabilize Hebbian Learning

Bernd Porr; Florentin Wörgötter

It is a well-known fact that Hebbian learning is inherently unstable because of its self-amplifying terms: the more a synapse grows, the stronger the postsynaptic activity, and therefore the faster the synaptic growth. This unwanted weight growth is driven by the autocorrelation term of Hebbian learning where the same synapse drives its own growth. On the other hand, the cross-correlation term performs actual learning where different inputs are correlated with each other. Consequently, we would like to minimize the autocorrelation and maximize the cross-correlation. Here we show that we can achieve this with a third factor that switches on learning when the autocorrelation is minimal or zero and the cross-correlation is maximal. The biological counterpart of such a third factor is a neuromodulator that switches on learning at a certain moment in time. We show in a behavioral experiment that our three-factor learning clearly outperforms classical Hebbian learning.
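
The gating idea in this abstract can be illustrated with a very small Python sketch. The abstract does not spell out the exact rule, so the form below is an assumption: a differential Hebbian update (filtered input times output derivative) multiplied by a relevance signal. The function name three_factor_update and all numeric values are hypothetical.

```python
def three_factor_update(weights, u, dv, relevance, mu=0.01):
    """One step of a relevance-gated update (assumed form, not taken from the paper).

    weights   -- current synaptic weights
    u         -- filtered input values at this time step
    dv        -- derivative of the neuron's output at this time step
    relevance -- third factor; 0 switches learning off, 1 switches it on
    """
    return [w + mu * relevance * ui * dv for w, ui in zip(weights, u)]


# With the third factor at zero nothing is learned; at one, the
# correlation between input and output derivative changes the weights.
gated_off = three_factor_update([0.0, 1.0], [0.5, 0.2], dv=0.1, relevance=0.0)
gated_on = three_factor_update([0.0, 1.0], [0.5, 0.2], dv=0.1, relevance=1.0)
print(gated_off, gated_on)
```

In this reading, choosing when relevance is nonzero decides which correlations are allowed to change the weights, which is how the abstract describes the role of a neuromodulator.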

Neural Computation | 2003

ISO learning approximates a solution to the inverse-controller problem in an unsupervised behavioral paradigm

Bernd Porr; Christian von Ferber; Florentin Wörgötter

In Isotropic Sequence Order Learning (pp. 831–864 in this issue), we introduced a novel algorithm for temporal sequence learning (ISO learning). Here, we embed this algorithm into a formal nonevaluating (teacher free) environment, which establishes a sensor-motor feedback. The system is initially guided by a fixed reflex reaction, which has the objective disadvantage that it can react only after a disturbance has occurred. ISO learning eliminates this disadvantage by replacing the reflex-loop reactions with earlier anticipatory actions. In this article, we analytically demonstrate that this process can be understood in terms of control theory, showing that the system learns the inverse controller of its own reflex. Thereby, this system is able to learn a simple form of feedforward motor control.

Kybernetes | 2005

Inside embodiment – what means embodiment to radical constructivists?

Bernd Porr; Florentin Wörgötter

Purpose – This work explores the consequences of Heinz von Foerster's claim in the context of linear signal theory, embodiment and the creation of artifacts that the nervous system is operationally closed. It operates only in contact with itself. Design/methodology/approach – In linear signal theory all transfer functions can be directly associated with the neural activity where also the environment is described by neural activity. The phenomenon of embodiment is interpreted from the perspective of the nervous system, thus from the inner perspective. To identify inside and outside an organism must learn to identify the disturbances which are only in the environment. This can be done by anticipatory learning. Findings – Questions whether the nervous system is able to distinguish between inside and outside. Mathematically stays in the field of linear control theory and tries to give this mathematical formalism a new meaning in the light of radical constructivism. Gives some guidelines how to apply the highly th...

Philosophical Transactions of the Royal Society A | 2003

Isotropic-sequence-order learning in a closed-loop behavioural system.

Bernd Porr; Florentin Wörgötter

The simplest form of sensor-motor control is obtained with a reflex. In this case the reflex can be interpreted as part of a closed-loop control paradigm which measures a sensor input and generates a motor reaction as soon as the sensor signal deviates from its desired (resting) state. This is a typical case of feedback control. However, reflex reactions are tardy, because they always occur only after a (for example, unpleasant) reflex-eliciting sensor event. This defines an objective problem for an organism which can only be avoided if the corresponding motor reaction is generated earlier. The goal of this study is to design a closed-loop control situation where temporal-sequence learning supersedes a tardy reflex reaction with a proactive anticipatory action. We achieve this by employing a second, earlier-occurring and causally coupled sensor event. An appropriate motor reaction to this early event prevents triggering of the original, primary reflex. Such causally coupled sensor events are common for animals, for example when smell predicts taste or when heat radiation precedes pain. We show that trying to achieve anticipatory control is a fundamentally different goal from trying to model a classical conditioning paradigm, which is an open-loop condition. To this end, we use a novel learning rule for temporal-sequence learning called isotropic-sequence-order (ISO) learning, which performs a confounded correlation between the primary sensor signal associated with the reflex and a predictive, earlier-occurring sensor input: this way the system learns the relation between the primary reflex and the earlier sensor input in order to create an earlier-occurring motor reaction. As a consequence of learning, the primary reflex will not be triggered any more, thereby permanently remaining in its desired resting state. In a robot application, we demonstrate that ISO learning can successfully solve the classical obstacle-avoidance task by learning to correlate a built-in reflex behaviour (retraction after touching) with earlier arising signals from range finders (before touching). Finally, we show that avoidance and attraction tasks can be combined in the same agent.
