Marijn F. Stollenga

Dalle Molle Institute for Artificial Intelligence Research

Publication

Featured research published by Marijn F. Stollenga.

Artificial Intelligence | 2017

Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots

Varun Raj Kompella; Marijn F. Stollenga; Matthew D. Luciw; Jürgen Schmidhuber

In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? We propose here Continual Curiosity driven Skill Acquisition (CCSA). CCSA makes robots intrinsically motivated to acquire, store and reuse skills. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world model improvements, and used reinforcement learning to learn how to get these intrinsic rewards. CCSA also does this, but unlike previous implementations, the world model is a set of compact low-dimensional representations of the streams of high-dimensional visual information, which are learned through incremental slow feature analysis. These representations augment the robot's state space with new information about the environment. We show how this information can have a higher-level (compared to pixels) and useful interpretation, for example, whether or not the robot has grasped a cup in its field of view. After learning a representation, large intrinsic rewards are given to the robot for performing actions that greatly change the feature output, which otherwise tends to change slowly over time. We show empirically what these actions are (e.g., grasping the cup) and how they can be useful as skills. An acquired skill includes both the learned actions and the learned slow feature representation. Skills are stored and reused to generate new observations, enabling continual acquisition of complex skills. We present results of experiments with an iCub humanoid robot that uses CCSA to incrementally acquire skills to topple, grasp and pick-place a cup, driven by its intrinsic motivation from raw pixel vision.

International Conference on Development and Learning | 2012

Autonomous learning of abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis

Varun Raj Kompella; Matthew D. Luciw; Marijn F. Stollenga; Leo Pape; Jürgen Schmidhuber

To autonomously learn behaviors in complex environments, vision-based agents need to develop useful sensory abstractions from high-dimensional video. We propose a modular, curiosity-driven learning system that autonomously learns multiple abstract representations. The policy to build the library of abstractions is adapted through reinforcement learning, and the corresponding abstractions are learned through incremental slow-feature analysis (IncSFA). IncSFA learns each abstraction based on how the inputs change over time, directly from unprocessed visual data. Modularity is induced via a gating system, which also prevents abstraction duplication. The system is driven by a curiosity signal that is based on the learnability of the inputs by the current adaptive module. After the learning completes, the result is multiple slow-feature modules serving as distinct behavior-specific abstractions. Experiments with a simulated iCub humanoid robot show how the proposed method effectively learns a set of abstractions from raw, unpreprocessed video, making it, to our knowledge, the first curious learning agent to demonstrate this ability.

International Symposium on Neural Networks | 2014

Explore to see, learn to perceive, get the actions for free: SKILLABILITY

Varun Raj Kompella; Marijn F. Stollenga; Matthew D. Luciw; Jürgen Schmidhuber

How can a humanoid robot autonomously learn and refine multiple sensorimotor skills as a byproduct of curiosity-driven exploration, based on its high-dimensional unprocessed visual input? We present SKILLABILITY, which makes this possible. It combines the recently introduced Curiosity Driven Modular Incremental Slow Feature Analysis (Curious Dr. MISFA) with the well-known options framework. Curious Dr. MISFA's objective is to acquire abstractions as quickly as possible. These abstractions map high-dimensional pixel-level vision to a low-dimensional manifold. We find that each learnable abstraction augments the robot's state space (a set of poses) with new information about the environment, for example, when the robot is grasping a cup. The abstraction is a function on an image, called a slow feature, which can effectively discretize a high-dimensional visual sequence. For example, it maps the sequence of the robot watching its arm as it moves around, grasping randomly, then grasping a cup, and moving around some more while holding the cup, into a step function having two outputs: when the cup is or is not currently grasped. The new state space includes this grasped/not grasped information. Each abstraction is coupled with an option. The reward function for the option's policy (learned through Least Squares Policy Iteration) is high for transitions that produce a large change in the step-function-like slow features. This corresponds to finding bottleneck states, which are known good subgoals for hierarchical reinforcement learning; in the example, the subgoal corresponds to grasping the cup. The final skill includes both the learned policy and the learned abstraction. SKILLABILITY makes our iCub the first humanoid robot to learn complex skills such as to topple or grasp an object, from raw high-dimensional video input, driven purely by its intrinsic motivations.
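The reward idea in this abstract (high reward for transitions that produce a large change in a step-function-like slow feature) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: `slow_feature_reward`, `grasp_feature`, and the one-dimensional toy observations are hypothetical names and data introduced here, and the absolute difference is only one plausible way to measure "large change".

```python
from typing import Callable, List, Sequence


def slow_feature_reward(
    feature: Callable[[Sequence[float]], float],
    obs: Sequence[float],
    next_obs: Sequence[float],
) -> float:
    """Reward a transition by how much it changes the slow-feature output."""
    return abs(feature(next_obs) - feature(obs))


def grasp_feature(obs: Sequence[float]) -> float:
    """Hypothetical step-like abstraction: 1.0 if the cup counts as grasped."""
    return 1.0 if obs[0] > 0.5 else 0.0


# Toy trajectory (assumed data): the first component rises as the hand closes.
trajectory: List[List[float]] = [[0.0], [0.1], [0.9], [1.0]]
rewards = [
    slow_feature_reward(grasp_feature, a, b)
    for a, b in zip(trajectory, trajectory[1:])
]
print(rewards)  # only the transition that crosses the step is rewarded
```

In this toy setting the single rewarded transition plays the role of the bottleneck state (the moment of grasping) that the abstract identifies as a subgoal.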

International Conference on Development and Learning | 2012

Continually adding self-invented problems to the repertoire: First experiments with POWERPLAY

Rupesh Kumar Srivastava; Bas R. Steunebrink; Marijn F. Stollenga; Jürgen Schmidhuber

Pure scientists do not only invent new methods to solve given problems. They also invent new problems. The recent POWERPLAY framework formalizes this type of curiosity and creativity in a new, general, yet practical way. To acquire problem-solving prowess through playing, POWERPLAY-based artificial explorers by design continually come up with the fastest-to-find, initially novel, but eventually solvable problems. They also continually simplify or speed up solutions to previous problems. We report on results of first experiments with POWERPLAY. A self-delimiting recurrent neural network (SLIM RNN) is used as a general computational architecture to implement the system's solver. Its weights can encode arbitrary, self-delimiting, halting or non-halting programs affecting both environment (through effectors) and internal states encoding abstractions of event sequences. In open-ended fashion, our POWERPLAY-driven RNNs learn to become increasingly general problem solvers, continually adding new problem-solving procedures to the growing repertoire, exhibiting interesting developmental stages.

Neural Computation | 2016

Optimal curiosity-driven modular incremental slow feature analysis

Varun Raj Kompella; Matthew D. Luciw; Marijn F. Stollenga; Jürgen Schmidhuber

Consider a self-motivated artificial agent who is exploring a complex environment. Part of the complexity is due to the raw high-dimensional sensory input streams, which the agent needs to make sense of. Such inputs can be compactly encoded through a variety of means; one of these is slow feature analysis (SFA). Slow features encode spatiotemporal regularities, which are information-rich explanatory factors (latent variables) underlying the high-dimensional input streams. In our previous work, we have shown how slow features can be learned incrementally, while the agent explores its world, and modularly, such that different sets of features are learned for different parts of the environment (since a single set of regularities does not explain everything). In what order should the agent explore the different parts of the environment? Following Schmidhuber’s theory of artificial curiosity, the agent should always concentrate on the area where it can learn the easiest-to-learn set of features that it has not already learned. We formalize this learning problem and theoretically show that, using our model, called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions. We provide experimental results to support the theoretical analysis.
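To make the notion of a slow feature concrete, the following sketch implements plain batch linear SFA with NumPy: whiten the input, then keep the directions along which the whitened signal changes least from one time step to the next. Note that this is the classic batch formulation, not the incremental, modular, or curiosity-driven variants described in the abstract; the function name `linear_sfa` and the toy two-channel signal are assumptions made for illustration.

```python
import numpy as np


def linear_sfa(x: np.ndarray, n_features: int = 1) -> np.ndarray:
    """Batch linear SFA for x of shape (time, dim), dim >= 2.

    Returns W of shape (dim, n_features) such that (x - mean) @ W gives
    unit-variance features ordered from slowest to fastest.
    """
    x = x - x.mean(axis=0)
    # Step 1: whiten, so every direction has unit variance.
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    whiten = eigvec / np.sqrt(eigval)  # scales column j by 1/sqrt(eigval[j])
    z = x @ whiten
    # Step 2: find directions with the smallest temporal-difference variance.
    dz = np.diff(z, axis=0)
    _, dvec = np.linalg.eigh(np.cov(dz, rowvar=False))  # ascending eigenvalues
    return whiten @ dvec[:, :n_features]


# Toy input (assumed data): a slow and a fast sine, linearly mixed.
t = np.linspace(0.0, 2.0 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(40.0 * t)
x = np.column_stack([slow + fast, slow - 2.0 * fast])

w = linear_sfa(x, n_features=1)
y = (x - x.mean(axis=0)) @ w
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
print(f"|correlation| between slowest feature and slow source: {corr:.4f}")
```

The sign of an extracted feature is arbitrary, which is why the example compares the absolute correlation with the slow source.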

Simulation of Adaptive Behavior | 2014

Rapid Humanoid Motion Learning through Coordinated, Parallel Evolution

Marijn F. Stollenga; Jürgen Schmidhuber; Faustino J. Gomez

Planning movements for humanoid robots is still a major challenge due to the very high number of degrees of freedom involved. Most humanoid control frameworks incorporate dynamical constraints related to a task that require detailed knowledge of the robot’s dynamics, making them impractical for efficient planning. In previous work, we introduced a novel planning method that uses an inverse kinematics solver called Natural Gradient Inverse Kinematics (NGIK) to build task-relevant roadmaps (graphs in task space representing robot configurations that satisfy task constraints) by searching the configuration space via the Natural Evolution Strategies (NES) algorithm. The approach places minimal requirements on the constraints, allowing for complex planning in the task space. However, building a roadmap via NGIK is too slow for dynamic environments. In this paper, the approach is scaled up to a fully parallelized implementation where additional constraints coordinate the interaction between independent NES searches running on separate threads. Parallelization yields a 12× speedup that moves this promising planning method a major step closer to working in dynamic environments.
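As a rough illustration of searching configuration space with an evolution strategy using only forward-kinematics evaluations, the sketch below solves inverse kinematics for a planar three-link arm. It is a simplified stand-in, not NGIK or full NES: it uses rank-based fitness shaping with a fixed step size and omits the natural-gradient adaptation of the search distribution, the roadmap construction, and the parallel coordination described in the abstract. The arm model, the function names, and all parameter values are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def forward_kinematics(angles: Sequence[float], link: float = 1.0) -> Tuple[float, float]:
    """End-effector position of a planar arm with equal-length links."""
    x = y = total = 0.0
    for a in angles:
        total += a
        x += link * math.cos(total)
        y += link * math.sin(total)
    return x, y


def cost(angles: Sequence[float], target: Tuple[float, float]) -> float:
    """Distance between the end effector and the target."""
    x, y = forward_kinematics(angles)
    return math.hypot(x - target[0], y - target[1])


def es_inverse_kinematics(
    target: Tuple[float, float],
    n_joints: int = 3,
    pop: int = 30,
    sigma: float = 0.1,
    lr: float = 0.05,
    iters: int = 300,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Return the best joint angles found and their cost."""
    rng = random.Random(seed)
    mean = [0.1] * n_joints
    best, best_cost = list(mean), cost(mean, target)
    for _ in range(iters):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(n_joints)] for _ in range(pop)]
        samples = [[m + sigma * e for m, e in zip(mean, eps)] for eps in noise]
        costs = [cost(s, target) for s in samples]
        # Rank-based utilities: +1 for the best sample down to -1 for the worst.
        order = sorted(range(pop), key=lambda i: costs[i])
        util = [0.0] * pop
        for rank, i in enumerate(order):
            util[i] = (pop - 1 - 2 * rank) / (pop - 1)
        # Move the search mean toward the noise directions that ranked well.
        for j in range(n_joints):
            mean[j] += lr * sum(util[i] * noise[i][j] for i in range(pop)) / pop
        if costs[order[0]] < best_cost:
            best, best_cost = samples[order[0]], costs[order[0]]
    return best, best_cost


target = (1.5, 1.5)
angles, err = es_inverse_kinematics(target)
print(f"residual distance to target: {err:.4f}")
```

Because the search only ranks sampled configurations by cost, the cost function does not need to be differentiable, which is consistent with the abstract's remark that the approach places minimal requirements on the constraints.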

IEEE-RAS International Conference on Humanoid Robots | 2015

The Natural Gradient as a control signal for a humanoid robot

Marijn F. Stollenga; Alan J. Lockett; Jürgen Schmidhuber

This paper presents Natural Gradient Control (NGC), a control algorithm that efficiently estimates and applies the natural gradient for high-degree-of-freedom robotic control. In contrast to the standard task Jacobian, the natural gradient follows the direction of steepest descent with respect to a parameterized model with extra degrees of freedom injected. This procedure enables NGC to maneuver smoothly in regions where the task Jacobian is ill-conditioned or singular. NGC efficiently estimates the natural gradient using only forward kinematics evaluations. This sampling-based algorithm avoids the need for gradient calculations and therefore allows great flexibility in the cost functions. Experiments show NGC can even use statistics of rendered images as part of the cost function, which would be impossible with traditional inverse kinematics approaches. The advantages of NGC are shown on the full 41-degree-of-freedom upper body of an iCub humanoid, in simulation and on a real robot, and compared to a Jacobian-based controller. Experiments show that the natural gradient is robust and avoids common pitfalls such as local minima and slow convergence, which often affect the application of Jacobian-based methods. Demonstrations on the iCub show that NGC is a practical method that can be used for complex movements.

Neural Information Processing Systems | 2014