Olivier Sigaud | Researchain

Publication

Featured research published by Olivier Sigaud.

Soft Computing | 2007

Learning classifier systems: a survey

Olivier Sigaud; Stewart W. Wilson

Learning classifier systems (LCSs) are rule-based systems that automatically build their ruleset. At the origin of Holland’s work, LCSs were seen as a model of the emergence of cognitive abilities thanks to adaptive mechanisms, particularly evolutionary processes. After a renewal of the field more focused on learning, LCSs are now considered as sequential decision problem-solving systems endowed with a generalization property. Indeed, from a Reinforcement Learning point of view, LCSs can be seen as learning systems building a compact representation of their problem thanks to generalization. More recently, LCSs have proved efficient at solving automatic classification tasks. The aim of the present contribution is to describe the state-of-the-art of LCSs, emphasizing recent developments, and focusing more on the sequential decision domain than on automatic classification.

Robotics and Autonomous Systems | 2011

On-line regression algorithms for learning mechanical models of robots: A survey

Olivier Sigaud; Camille Salaün; Vincent Padois

With the emergence of more challenging contexts for robotics, the mechanical design of robots is becoming more and more complex. Moreover, their missions often involve unforeseen physical interactions with the environment. To deal with these difficulties, endowing the controllers of the robots with the capability to learn a model of their kinematics and dynamics under changing circumstances is becoming mandatory. This emergent necessity has given rise to a significant amount of research in the Machine Learning community, generating algorithms that address more and more sophisticated on-line modeling questions. In this paper, we provide a survey of the corresponding literature with a focus on the methods rather than on the results. In particular, we provide a unified view of all recent algorithms that outlines their distinctive features and provides a framework for their combination. Finally, we give a prospective account of the evolution of the domain towards more challenging questions.

International Conference on Machine Learning | 2006

Learning the structure of Factored Markov Decision Processes in reinforcement learning problems

Thomas Degris; Olivier Sigaud; Pierre-Henri Wuillemin

Recent decision-theoretic planning algorithms are able to find optimal solutions in large problems, using Factored Markov Decision Processes (FMDPs). However, these algorithms need perfect knowledge of the structure of the problem. In this paper, we propose SDYNA, a general framework for addressing large reinforcement learning problems by trial-and-error and with no initial knowledge of their structure. SDYNA integrates incremental planning algorithms based on FMDPs with supervised learning techniques building structured representations of the problem. We describe SPITI, an instantiation of SDYNA, that uses incremental decision tree induction to learn the structure of a problem combined with an incremental version of the Structured Value Iteration algorithm. We show that SPITI can build a factored representation of a reinforcement learning problem and may improve the policy faster than tabular reinforcement learning algorithms by exploiting the generalization property of decision tree induction algorithms.

Lecture Notes in Computer Science | 2003

Internal Models and Anticipations in Adaptive Learning Systems

Martin V. Butz; Olivier Sigaud; Pierre Gérard

The explicit investigation of anticipations in relation to adaptive behavior is a recent approach. This chapter first provides psychological background that motivates and inspires the study of anticipations in the adaptive behavior field. Next, a basic framework for the study of anticipations in adaptive behavior is suggested. Different anticipatory mechanisms are identified and characterized. First, fundamental distinctions are drawn between implicit anticipatory behavior, payoff anticipatory behavior, sensory anticipatory behavior, and state anticipatory behavior. A case study allows further insights into the drawn distinctions. Many future research directions are suggested.

7ème Journées Nationales de la Recherche en Robotique | 2010

From Motor Learning to Interaction Learning in Robots

Olivier Sigaud; Jan Peters

From an engineering standpoint, given the increasing complexity of robotic systems and the increasing demand for more autonomous robots, learning has become essential. This book is largely based on the successful workshop "From motor to interaction learning in robots" held at the IEEE/RSJ International Conference on Intelligent Robot Systems. The major aim of the book is to give students interested in the topics described above a chance to get started faster and researchers a helpful compendium.

ABiALS | 2003

Anticipatory Behavior: Exploiting Knowledge About the Future to Improve Current Behavior

Martin V. Butz; Olivier Sigaud; Pierre Gérard

This chapter is meant to give a concise introduction to the topic of this book. The study of anticipatory behavior refers to behavior that is dependent on predictions, expectations, or beliefs about future states. Here, behavior includes actual decision making, internal decision making, internal preparatory mechanisms, as well as learning. Despite several recent theoretical approaches to this topic, it remains unclear in which situations anticipatory behavior is useful or even mandatory to achieve competent behavior in adaptive learning systems. This book provides a collection of articles that investigate these questions. We provide an overview of all articles, relating them to each other and highlighting their significance to anticipatory behavior research in general.

Paladyn | 2013

Robot Skill Learning: From Reinforcement Learning to Evolution Strategies

Freek Stulp; Olivier Sigaud

Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. Owing to current trends involving searching in parameter space (rather than action space) and using reward-weighted averaging (rather than gradient estimation), reinforcement learning algorithms for policy improvement, e.g. PoWER and PI², are now able to learn sophisticated high-dimensional robot skills. A side-effect of these trends has been that, over the last 15 years, reinforcement learning (RL) algorithms have become more and more similar to evolution strategies such as (μW, λ)-ES and CMA-ES. Evolution strategies treat policy improvement as a black-box optimization problem, and thus do not leverage the problem structure, whereas RL algorithms do. In this paper, we demonstrate how two straightforward simplifications to the state-of-the-art RL algorithm PI² suffice to convert it into the black-box optimization algorithm (μW, λ)-ES. Furthermore, we show that (μW, λ)-ES empirically outperforms PI² on the tasks in [36]. It is striking that PI² and (μW, λ)-ES share a common core, and that the simpler algorithm converges faster and leads to similar or lower final costs. We argue that this difference is due to a third trend in robot skill learning: the predominant use of dynamic movement primitives (DMPs). We show how DMPs dramatically simplify the learning problem, and discuss the implications of this for past and future work on policy improvement for robot skill learning.
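
As a rough illustration of the reward-weighted averaging in parameter space mentioned in this abstract, here is a minimal Python sketch of a black-box update in the spirit of a (μW, λ)-ES. It is not the authors' implementation: the function names, the log-rank weighting, and the fixed exploration-noise decay are all assumptions made for the example, and no covariance adaptation is performed.

```python
import math
import random


def reward_weighted_update(samples, costs, n_elite):
    """Return the weighted mean of the n_elite lowest-cost samples.

    Weights decrease with rank (log-rank weights). This particular
    weighting is an assumption of the sketch, not taken from the paper.
    """
    ranked = sorted(zip(costs, samples), key=lambda pair: pair[0])[:n_elite]
    raw = [math.log(n_elite + 0.5) - math.log(rank + 1) for rank in range(n_elite)]
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(samples[0])
    return [sum(w * s[d] for w, (_, s) in zip(weights, ranked)) for d in range(dim)]


def optimize(cost_fn, theta, sigma=0.5, n_samples=20, n_elite=5,
             n_updates=100, seed=0):
    """Black-box policy improvement: perturb parameters, evaluate, average."""
    rng = random.Random(seed)
    for _ in range(n_updates):
        # Explore directly in parameter space with isotropic Gaussian noise.
        samples = [[t + rng.gauss(0.0, sigma) for t in theta]
                   for _ in range(n_samples)]
        # Each sample is scored only by its scalar cost (black-box view).
        costs = [cost_fn(s) for s in samples]
        theta = reward_weighted_update(samples, costs, n_elite)
        sigma *= 0.97  # hypothetical fixed decay of the exploration noise
    return theta


if __name__ == "__main__":
    # Toy cost with its minimum at (1, -2); stands in for a rollout cost.
    toy_cost = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
    print(optimize(toy_cost, [0.0, 0.0]))
```

The update uses only the scalar cost of each sampled parameter vector, which is what makes it a black-box method: nothing about the structure of the underlying rollout is exploited.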

PLOS Computational Biology | 2014

Modelling Individual Differences in the Form of Pavlovian Conditioned Approach Responses: A Dual Learning Systems Approach with Factored Representations

Olivier Sigaud; Shelly B. Flagel; Terry E. Robinson; Mehdi Khamassi

Reinforcement Learning has greatly influenced models of conditioning, providing powerful explanations of acquired behaviour and underlying physiological observations. However, in recent autoshaping experiments in rats, variation in the form of Pavlovian conditioned responses (CRs) and in associated dopamine activity has called into question the classical hypothesis that phasic dopamine activity corresponds to a reward prediction error-like signal arising from a classical Model-Free system, necessary for Pavlovian conditioning. Over the course of Pavlovian conditioning using food as the unconditioned stimulus (US), some rats (sign-trackers) come to approach and engage the conditioned stimulus (CS) itself – a lever – more and more avidly, whereas other rats (goal-trackers) learn to approach the location of food delivery upon CS presentation. Importantly, although both sign-trackers and goal-trackers learn the CS-US association equally well, only in sign-trackers does phasic dopamine activity show classical reward prediction error-like bursts. Furthermore, neither the acquisition nor the expression of a goal-tracking CR is dopamine-dependent. Here we present a computational model that can account for such individual variations. We show that a combination of a Model-Based system and a revised Model-Free system can account for the development of distinct CRs in rats. Moreover, we show that revising a classical Model-Free system to individually process stimuli by using factored representations can explain why classical dopaminergic patterns may be observed for some rats and not for others depending on the CR they develop. In addition, the model can account for other behavioural and pharmacological results obtained using the same, or similar, autoshaping procedures. Finally, the model makes it possible to draw a set of experimental predictions that may be verified in a modified experimental protocol. We suggest that further investigation of factored representations in computational neuroscience studies may be useful.

IEEE Transactions on Autonomous Mental Development | 2014

Object Learning Through Active Exploration

Serena Ivaldi; Sao Mai Nguyen; Natalia Lyubova; Alain Droniou; Vincent Padois; David Filliat; Pierre-Yves Oudeyer; Olivier Sigaud

This paper addresses the problem of active object learning by a humanoid child-like robot, using a developmental approach. We propose a cognitive architecture where the visual representation of the objects is built incrementally through active exploration. We present the design guidelines of the cognitive architecture, its main functionalities, and we outline the cognitive process of the robot by showing how it learns to recognize objects in a human-robot interaction scenario inspired by social parenting. The robot actively explores the objects through manipulation, driven by a combination of social guidance and intrinsic motivation. Besides the robotics and engineering achievements, our experiments replicate some observations about the coupling of vision and manipulation in infants, particularly how they focus on the most informative objects. We discuss the further benefits of our architecture, particularly how it can be improved and used to ground concepts.

IEEE-RAS International Conference on Humanoid Robots | 2013

Learning compact parameterized skills with a single regression

Freek Stulp; Gennaro Raiola; Antoine Hoarau; Serena Ivaldi; Olivier Sigaud

One of the long-term challenges of programming by demonstration is achieving generality, i.e. automatically adapting the reproduced behavior to novel situations. A common approach for achieving generality is to learn parameterizable skills from multiple demonstrations for different situations. In this paper, we generalize recent approaches to learning parameterizable skills based on dynamical movement primitives (DMPs), such that task parameters are also passed as inputs to the function approximator of the DMP. This leads to a more general, flexible, and compact representation of parameterizable skills, as demonstrated by our empirical evaluation on the iCub and Meka humanoid robots.
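
The idea of passing task parameters as extra inputs to a single function approximator can be sketched as follows. This is only an illustration under stated assumptions: a simple kernel-weighted average stands in for the function approximator, the names `fit` and `predict` and the `bandwidth` value are hypothetical, and the toy demonstrations are synthetic rather than taken from the paper.

```python
import math


def fit(demos):
    """Flatten all demonstrations into one training set.

    demos: list of (task_parameter, targets) pairs, where targets is a
    sequence of values sampled at evenly spaced phases in [0, 1].
    Each training input is the joint vector (phase, task_parameter).
    """
    data = []
    for task, targets in demos:
        n = len(targets)
        for i, y in enumerate(targets):
            phase = i / (n - 1)
            data.append(((phase, task), y))
    return data


def predict(data, phase, task, bandwidth=0.1):
    """Kernel-weighted average over the joint (phase, task) input space.

    A single regression covers all task parameters, instead of one model
    per demonstration followed by a second regression across models.
    """
    num = 0.0
    den = 0.0
    for (p, q), y in data:
        sq_dist = (phase - p) ** 2 + (task - q) ** 2
        w = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        num += w * y
        den += w
    return num / den


if __name__ == "__main__":
    # Synthetic demonstrations: the task parameter scales a bump profile.
    demos = [(q, [q * math.sin(math.pi * i / 20) for i in range(21)])
             for q in (0.0, 0.5, 1.0)]
    data = fit(demos)
    # Query a task parameter that was never demonstrated.
    print(predict(data, phase=0.5, task=0.25))
```

Because the task parameter is just another input dimension, querying an undemonstrated value interpolates between neighbouring demonstrations without any additional model.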
