Volkmar Sterzing | Researchain

Publication

Featured research published by Volkmar Sterzing.

Neural Networks: Tricks of the Trade (2nd ed.) | 2012

Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks

Siegmund Duell; Steffen Udluft; Volkmar Sterzing

The aim of this chapter is to provide a series of tricks and recipes for neural state estimation, particularly for real-world applications of reinforcement learning. We use various topologies of recurrent neural networks because they can identify the continuous-valued, possibly high-dimensional state space of complex dynamical systems. Recurrent neural networks explicitly account for time and memory; in principle, they are able to model any type of dynamical system. Because of these capabilities, recurrent neural networks are a suitable tool for approximating a Markovian state space of dynamical systems. In a second step, reinforcement learning methods can be applied to solve a defined control problem. Besides the trick of using a recurrent neural network for state estimation, various issues regarding real-world problems, such as large sets of observables and long-term dependencies, are addressed.
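The state-estimation idea in this abstract can be illustrated with a minimal sketch. It is not the chapter's architecture: it assumes a plain Elman-style update with hypothetical, untrained weights (`W_STATE`, `W_OBS`) and only shows how a recurrent hidden state folds an observation history into one vector, so that two histories ending in the same observation yield different state estimates.

```python
import math


def rnn_step(state, obs, w_state, w_obs):
    """One Elman-style update: s_t = tanh(W_s * s_{t-1} + W_o * o_t)."""
    new_state = []
    for i in range(len(state)):
        total = sum(w_state[i][j] * state[j] for j in range(len(state)))
        total += sum(w_obs[i][k] * obs[k] for k in range(len(obs)))
        new_state.append(math.tanh(total))
    return new_state


def estimate_state(observations, w_state, w_obs):
    """Fold a whole observation history into a single hidden state."""
    state = [0.0] * len(w_state)
    for obs in observations:
        state = rnn_step(state, obs, w_state, w_obs)
    return state


# Hypothetical fixed weights for illustration; a real model would learn them
# by predicting future observations from past ones.
W_STATE = [[0.5, -0.3], [0.2, 0.4]]
W_OBS = [[1.0], [-1.0]]

# Both histories end in the observation 1.0, but one is rising and one falling.
rising = [[0.0], [0.5], [1.0]]
falling = [[2.0], [1.5], [1.0]]

s_rising = estimate_state(rising, W_STATE, W_OBS)
s_falling = estimate_state(falling, W_STATE, W_OBS)
```

The last observation alone cannot tell the two situations apart, while the hidden states can. That is the sense in which the recurrent state approximates a Markovian state on which a reinforcement learning method could then operate.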

International Symposium on Neural Networks | 2007

A Neural Reinforcement Learning Approach to Gas Turbine Control

Anton Maximilian Schaefer; Daniel Schneegass; Volkmar Sterzing; Steffen Udluft

In this paper, a new neural-network-based approach to controlling a gas turbine for stable operation at high load is presented. A combination of recurrent neural networks (RNN) and reinforcement learning (RL) is used. The authors start by applying an RNN to identify the minimal state space of a gas turbine's dynamics. Based on this, the optimal control policy is determined by standard RL methods. The authors then proceed to the recurrent control neural network, which combines these two steps into one integrated neural network. This approach has the advantage that, by using neural networks, one can easily deal with the high dimensionality of a gas turbine. Due to the high system-identification quality of RNNs, one can also cope with the limited amount of available data. The proposed methods are demonstrated on an exemplary gas turbine model where, compared to standard controllers, they strongly improve the performance.

International Symposium on Neural Networks | 2017

Batch reinforcement learning on the industrial benchmark: First experiences

Daniel Hein; Steffen Udluft; Michel Tokic; Alexander Hentschel; Thomas A. Runkler; Volkmar Sterzing

The Particle Swarm Optimization Policy (PSO-P) has recently been introduced and shown to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and feasibility for real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best-performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.
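The abstract does not spell out how PSO-P selects actions. The sketch below rests on an assumption about the general idea: a particle swarm searches over action sequences, each sequence is scored by rolling it out on a system model, and the first action of the best sequence is applied. Everything in it is hypothetical and chosen for illustration: the one-dimensional toy model `model_step`, the global-best swarm topology, and the parameter values. It is not the Industrial Benchmark and not the authors' implementation.

```python
import random


def model_step(state, action):
    """Toy stand-in for a learned system model: reward peaks at state 0."""
    next_state = state + action
    return next_state, -(next_state ** 2)


def rollout_return(state, actions):
    """Sum of rewards predicted by the model for one action sequence."""
    total = 0.0
    for action in actions:
        state, reward = model_step(state, action)
        total += reward
    return total


def pso_plan(state, horizon=5, particles=20, iterations=40, seed=0):
    """Search action sequences in [-1, 1]^horizon with a global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
           for _ in range(particles)]
    pos[0] = [0.0] * horizon  # include the do-nothing plan as a baseline
    vel = [[0.0] * horizon for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [rollout_return(state, p) for p in pos]
    g = max(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(horizon):
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clip to the admissible action range.
                pos[i][d] = max(-1.0, min(1.0, pos[i][d] + vel[i][d]))
            val = rollout_return(state, pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


plan, plan_return = pso_plan(state=3.0)
first_action = plan[0]  # only this action would be applied before replanning
```

In this setup no closed-form policy is ever trained. The search is repeated from each new state, which may be one reason such an approach needs little design effort beyond a model and a reward definition.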

Archive | 2002