Marijn F. Stollenga
Dalle Molle Institute for Artificial Intelligence Research
Publication
Featured research published by Marijn F. Stollenga.
Artificial Intelligence | 2017
Varun Raj Kompella; Marijn F. Stollenga; Matthew D. Luciw; Jürgen Schmidhuber
In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? We propose here Continual Curiosity driven Skill Acquisition (CCSA). CCSA makes robots intrinsically motivated to acquire, store and reuse skills. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world model improvements, and used reinforcement learning to learn how to get these intrinsic rewards. CCSA also does this, but unlike previous implementations, the world model is a set of compact low-dimensional representations of the streams of high-dimensional visual information, which are learned through incremental slow feature analysis. These representations augment the robot's state space with new information about the environment. We show how this information can have a higher-level (compared to pixels) and useful interpretation, for example, whether or not the robot has grasped a cup in its field of view. After learning a representation, large intrinsic rewards are given to the robot for performing actions that greatly change the feature output, which otherwise tends to change slowly in time. We show empirically what these actions are (e.g., grasping the cup) and how they can be useful as skills. An acquired skill includes both the learned actions and the learned slow feature representation. Skills are stored and reused to generate new observations, enabling continual acquisition of complex skills. We present results of experiments with an iCub humanoid robot that uses CCSA to incrementally acquire skills to topple, grasp and pick-place a cup, driven by its intrinsic motivation from raw pixel vision.
International Conference on Development and Learning | 2012
Varun Raj Kompella; Matthew D. Luciw; Marijn F. Stollenga; Leo Pape; Jürgen Schmidhuber
To autonomously learn behaviors in complex environments, vision-based agents need to develop useful sensory abstractions from high-dimensional video. We propose a modular, curiosity-driven learning system that autonomously learns multiple abstract representations. The policy to build the library of abstractions is adapted through reinforcement learning, and the corresponding abstractions are learned through incremental slow-feature analysis (IncSFA). IncSFA learns each abstraction based on how the inputs change over time, directly from unprocessed visual data. Modularity is induced via a gating system, which also prevents abstraction duplication. The system is driven by a curiosity signal that is based on the learnability of the inputs by the current adaptive module. After the learning completes, the result is multiple slow-feature modules serving as distinct behavior-specific abstractions. Experiments with a simulated iCub humanoid robot show how the proposed method effectively learns a set of abstractions from raw un-preprocessed video, to our knowledge the first curious learning agent to demonstrate this ability.
International Symposium on Neural Networks | 2014
Varun Raj Kompella; Marijn F. Stollenga; Matthew D. Luciw; Jürgen Schmidhuber
How can a humanoid robot autonomously learn and refine multiple sensorimotor skills as a byproduct of curiosity-driven exploration, from its high-dimensional unprocessed visual input? We present SKILLABILITY, which makes this possible. It combines the recently introduced Curiosity Driven Modular Incremental Slow Feature Analysis (Curious Dr. MISFA) with the well-known options framework. Curious Dr. MISFA's objective is to acquire abstractions as quickly as possible. These abstractions map high-dimensional pixel-level vision to a low-dimensional manifold. We find that each learnable abstraction augments the robot's state space (a set of poses) with new information about the environment, for example, when the robot is grasping a cup. The abstraction is a function on an image, called a slow feature, which can effectively discretize a high-dimensional visual sequence. For example, it maps the sequence of the robot watching its arm as it moves around, grasping randomly, then grasping a cup, and moving around some more while holding the cup, into a step function having two outputs: the cup is or is not currently grasped. The new state space includes this grasped/not grasped information. Each abstraction is coupled with an option. The reward function for the option's policy (learned through Least Squares Policy Iteration) is high for transitions that produce a large change in the step-function-like slow features. This corresponds to finding bottleneck states, which are known to be good subgoals for hierarchical reinforcement learning; in the example, the subgoal corresponds to grasping the cup. The final skill includes both the learned policy and the learned abstraction. SKILLABILITY makes our iCub the first humanoid robot to learn complex skills such as toppling or grasping an object, from raw high-dimensional video input, driven purely by its intrinsic motivations.
International Conference on Development and Learning | 2012
Rupesh Kumar Srivastava; Bas R. Steunebrink; Marijn F. Stollenga; Jürgen Schmidhuber
Pure scientists do not only invent new methods to solve given problems. They also invent new problems. The recent POWERPLAY framework formalizes this type of curiosity and creativity in a new, general, yet practical way. To acquire problem-solving prowess through playing, POWERPLAY-based artificial explorers by design continually come up with the fastest-to-find, initially novel, but eventually solvable problems. They also continually simplify or speed up solutions to previous problems. We report on results of first experiments with POWERPLAY. A self-delimiting recurrent neural network (SLIM RNN) is used as a general computational architecture to implement the system's solver. Its weights can encode arbitrary, self-delimiting, halting or non-halting programs affecting both environment (through effectors) and internal states encoding abstractions of event sequences. In open-ended fashion, our POWERPLAY-driven RNNs learn to become increasingly general problem solvers, continually adding new problem-solving procedures to the growing repertoire, exhibiting interesting developmental stages.
Neural Computation | 2016
Varun Raj Kompella; Matthew D. Luciw; Marijn F. Stollenga; Juergen Schmidhuber
Consider a self-motivated artificial agent who is exploring a complex environment. Part of the complexity is due to the raw high-dimensional sensory input streams, which the agent needs to make sense of. Such inputs can be compactly encoded through a variety of means; one of these is slow feature analysis (SFA). Slow features encode spatiotemporal regularities, which are information-rich explanatory factors (latent variables) underlying the high-dimensional input streams. In our previous work, we have shown how slow features can be learned incrementally, while the agent explores its world, and modularly, such that different sets of features are learned for different parts of the environment (since a single set of regularities does not explain everything). In what order should the agent explore the different parts of the environment? Following Schmidhuber’s theory of artificial curiosity, the agent should always concentrate on the area where it can learn the easiest-to-learn set of features that it has not already learned. We formalize this learning problem and theoretically show that, using our model, called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions. We provide experimental results to support the theoretical analysis.
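As a rough illustration of the slow feature analysis mentioned in the abstract above, the following is a minimal sketch of plain batch linear SFA in Python with NumPy. It is not the incremental, modular, curiosity-driven variant the paper describes; the function name `slow_feature_analysis`, its parameters, and the toy signals in the usage lines are assumptions made for this example only.

```python
import numpy as np


def slow_feature_analysis(x, n_features=1):
    """Batch linear SFA (illustrative sketch, not the paper's incremental method).

    x has shape (T, D): T time steps of a D-dimensional signal.
    Returns (W, mean) such that (x - mean) @ W gives the n_features
    slowest-varying, unit-variance linear projections of x.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Step 1: whiten the input so every direction has unit variance.
    cov = xc.T @ xc / (len(xc) - 1)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10  # drop (near-)degenerate directions
    whitener = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = xc @ whitener

    # Step 2: find the directions whose time differences have least variance.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / (len(dz) - 1)
    _, dvec = np.linalg.eigh(dcov)  # eigenvalues come back in ascending order

    W = whitener @ dvec[:, :n_features]
    return W, mean


# Toy usage: two sensors observe a linear mix of a slow and a fast sine wave.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(40 * t)
x = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])
W, mean = slow_feature_analysis(x, n_features=1)
y = (x - mean) @ W  # should track the slow sine up to sign and scale
```

In this toy setting the extracted feature should follow the slow source rather than the fast one, which is the sense in which slow features expose the slowly changing factors behind a fast-changing input stream.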
Simulation of Adaptive Behavior | 2014
Marijn F. Stollenga; Jürgen Schmidhuber; Faustino J. Gomez
Planning movements for humanoid robots is still a major challenge due to the large number of degrees of freedom involved. Most humanoid control frameworks incorporate dynamical constraints related to a task that require detailed knowledge of the robot’s dynamics, making them impractical for efficient planning. In previous work, we introduced a novel planning method that uses an inverse kinematics solver called Natural Gradient Inverse Kinematics (NGIK) to build task-relevant roadmaps (graphs in task space representing robot configurations that satisfy task constraints) by searching the configuration space via the Natural Evolution Strategies (NES) algorithm. The approach places minimal requirements on the constraints, allowing for complex planning in the task space. However, building a roadmap via NGIK is too slow for dynamic environments. In this paper, the approach is scaled up to a fully parallelized implementation where additional constraints coordinate the interaction between independent NES searches running on separate threads. Parallelization yields a 12× speedup that moves this promising planning method a major step closer to working in dynamic environments.
IEEE-RAS International Conference on Humanoid Robots | 2015
Marijn F. Stollenga; Alan J. Lockett; Jürgen Schmidhuber
This paper presents Natural Gradient Control (NGC), a control algorithm that efficiently estimates and applies the natural gradient for high-degree-of-freedom robotic control. In contrast to the standard task Jacobian, the natural gradient follows the direction of steepest descent with respect to a parameterized model with extra degrees of freedom injected. This procedure enables NGC to maneuver smoothly in regions where the task Jacobian is ill-conditioned or singular. NGC efficiently estimates the natural gradient using only forward kinematics evaluations. This sampling-based algorithm avoids the need for gradient calculations and therefore allows great flexibility in the cost functions. Experiments show NGC can even use statistics of rendered images as part of the cost function, which would be impossible with traditional inverse kinematics approaches. The advantages of NGC are shown on the full 41-degree-of-freedom upper body of an iCub humanoid, in simulation and on a real robot, and compared to a Jacobian-based controller. Experiments show that the natural gradient is robust and avoids common pitfalls such as local minima and slow convergence, which often affect the application of Jacobian-based methods. Demonstrations on the iCub show that NGC is a practical method that can be used for complex movements.
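To make the idea of a sampling-based controller that needs only forward kinematics evaluations more concrete, here is a small Python sketch on a hypothetical two-link planar arm. It uses a plain evolution-strategies style gradient estimate with antithetic samples, which is a simplification and not the natural gradient that NGC estimates; the arm model, the function names, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def forward_kinematics(q, lengths=(1.0, 1.0)):
    """End-effector position of a hypothetical two-link planar arm."""
    a1, a2 = q
    l1, l2 = lengths
    return (l1 * math.cos(a1) + l2 * math.cos(a1 + a2),
            l1 * math.sin(a1) + l2 * math.sin(a1 + a2))


def cost(q, target):
    """Squared distance between the end effector and the target."""
    x, y = forward_kinematics(q)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2


def sampled_gradient(q, target, rng, samples=20, sigma=0.01):
    """Estimate the cost gradient from perturbed forward-kinematics calls only."""
    grad = [0.0] * len(q)
    for _ in range(samples):
        eps = [rng.gauss(0.0, 1.0) for _ in q]
        plus = cost([qi + sigma * e for qi, e in zip(q, eps)], target)
        minus = cost([qi - sigma * e for qi, e in zip(q, eps)], target)
        scale = (plus - minus) / (2.0 * sigma * samples)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


def reach(q0, target, steps=500, lr=0.05, seed=0):
    """Move the joint angles downhill using only sampled cost evaluations."""
    rng = random.Random(seed)
    q = list(q0)
    for _ in range(steps):
        g = sampled_gradient(q, target, rng)
        q = [qi - lr * gi for qi, gi in zip(q, g)]
    return q


# Usage: start from an arbitrary pose and reach for a point inside the workspace.
q_final = reach([0.3, 0.5], (1.0, 1.0))
```

Because the update never differentiates the cost analytically, the `cost` function could in principle be replaced by any quantity computable from a pose, which is the flexibility the abstract attributes to sampling-based estimation.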
Neural Information Processing Systems | 2014
Marijn F. Stollenga; Jonathan Masci; Faustino J. Gomez; Jürgen Schmidhuber
Neural Information Processing Systems | 2015
Marijn F. Stollenga; Wonmin Byeon; Marcus Liwicki; Juergen Schmidhuber
International Conference on Informatics in Control, Automation and Robotics | 2012
Mikhail Frank; Jürgen Leitner; Marijn F. Stollenga; Simon Harding; Alexander Förster; Jürgen Schmidhuber