Is this you? Create Your Porfile

Felix A. Gers

Dalle Molle Institute for Artificial Intelligence Research

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Felix A. Gers is active.

Explore More

Publication

Featured researches published by Felix A. Gers.

Neural Computation | 2000

Learning to Forget: Continual Prediction with LSTM

Felix A. Gers; Jürgen Schmidhuber; Fred Cummins

Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the networks internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive forget gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.

IEEE Transactions on Neural Networks | 2001

LSTM recurrent networks learn simple context-free and context-sensitive languages

Felix A. Gers; E. Schmidhuber

Previous work on learning regular languages from exemplary training sequences showed that long short-term memory (LSTM) outperforms traditional recurrent neural networks (RNNs). We demonstrate LSTMs superior performance on context-free language benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a simple context-sensitive language, namely a(n)b(n)c(n).

Neural Networks | 2003

Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

Juan Antonio Pérez-Ortiz; Felix A. Gers; Douglas Eck; Jürgen Schmidhuber

The long short-term memory (LSTM) network trained by gradient descent solves difficult problems which traditional recurrent neural networks in general cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, reducing significantly the number of training steps when compared to the original gradient descent training algorithm. In this paper we present a set of experiments which are unsolvable by classical recurrent networks but which are solved elegantly and robustly and quickly by LSTM combined with Kalman filters.

Neural Computation | 2002

Learning nonregular languages: a comparison of simple recurrent networks and LSTM

Jürgen Schmidhuber; Felix A. Gers; Douglas Eck

In response to Rodriguezs recent article (2001), we compare the performance of simple recurrent nets and long short-term memory recurrent nets on context-free and context-sensitive languages.

european conference on artificial evolution | 1997

CoDi-1Bit: A Simplified Cellular Automata Based Neuron Model

Felix A. Gers; Hugo de Garis; Michael Korkin

This paper presents some simplifications to our recently introduced “CoDi-model”, which we use to evolve Cellular Automata based neural network modules for ATRs artificial brain project “CAM-Brain” [11]. The great advantage of CAs as a modeling medium, is their parallelism, which permits neural system simulation hardware based on CoDi to be scaled up without loss of speed. Simulation speed is crucial for systems using ldevolutionary engineering” technologies, such as ATRs CAM-Brain Project, which aims to build/grow/evolve a billion neuron artificial brain. The improvements in the CoDi model simplify it sufficiently, so that it can be implemented in state of the art FPGAs (e.g. Xilinxs XC6264 chips). ATR is building an FPGA based Cellular Automata Machine “CAM-Brain Machine (CBM)” [13], which includes circuits for neural module evolution and will simulate CoDi about 500 times faster than MITs Cellular Automata Machine CAM-8 currently used at ATR.

international conference on evolvable systems | 1996

CAM-Brain: A New Model for ATR's Cellular Automata Based Artificial Brain Project

Felix A. Gers; Hugo de Garis

This paper introduces a new model for ATRs CAM-Brain Project, which is far more efficient and simpler than the older model. The CAM-Brain Project aims at building a billion neuron artificial brain using “evolutionary engineering” technologies. Our neural structures are based on Cellular Automata (CA) and grow/evolve in special hardware such as MITs “CAM-8” machine. With the CAM-8 and the new CAM-Brain model, it is possible to grow a neural structure with several million neurons in a 128 M cell CA-space, at a speed of 200 M cell-updates per second. The improvements in the new model are based on a new CA-implementation technique, on reducing the number of cell-behaviors to two, and on using genetic encoding of neural structures in which the chromosome is initially distributed homogeneously over the entire CA-space. This new CAM-Brain model allows the implementation of neural structures directly in parallel hardware, evolving at hardware speeds.

international conference on artificial neural networks | 2002

Improving Long-Term Online Prediction with Decoupled Extended Kalman Filters

Juan Antonio Pérez-Ortiz; Jürgen Schmidhuber; Felix A. Gers; Douglas Eck

Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform traditional RNNs when dealing with sequences involving not only short-term but also long-term dependencies. The decoupled extended Kalman filter learning algorithm (DEKF) works well in online environments and reduces significantly the number of training steps when compared to the standard gradient-descent algorithms. Previous work on LSTM, however, has always used a form of gradient descent and has not focused on true online situations. Here we combine LSTM with DEKF and show that this new hybrid improves upon the original learning algorithm when applied to online processing.

international conference on artificial neural networks | 2002

Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

Felix A. Gers; Juan Antonio Pérez-Ortiz; Douglas Eck; Jürgen Schmidhuber

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ? 10) of the context sensitive language anbncn to deal correctly with values ofn up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.

Archive | 1999

Continual Prediction using LSTM with Forget Gates

Felix A. Gers; Jürgen Schmidhuber; Fred Cummins

Long Short-Term Memory (LSTM,[1]) can solve many tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends. Without resets, the internal state values may grow indefinitely and eventually cause the network to break down. Our remedy is an adaptive “forget gate” that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review an illustrative benchmark problem on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve a continual version of that problem. LSTM with forget gates, however, easily solves it in an elegant way.

ieee international conference on evolutionary computation | 1998

ATR's Artificial Brain (CAM-Brain) Project: a progress report

H. de Garis; Felix A. Gers; Michael Korkin; Arvin Agah; Norberto Eiji Nawa

The paper reports on recent progress made in ATRs attempt to build a 10,000 evolved neural net module artificial brain to control the behavior of a life sized robot kitten.

Explore More