Raphaël Fonteneau | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Raphaël Fonteneau is active.

Explore More

Publication

Featured researches published by Raphaël Fonteneau.

ieee symposium on adaptive dynamic programming and reinforcement learning | 2009

Inferring bounds on the performance of a control policy from a sample of trajectories

Raphaël Fonteneau; Susan A. Murphy; Louis Wehenkel; Damien Ernst

We propose an approach for inferring bounds on the finite-horizon return of a control policy from an off-policy sample of trajectories collecting state transitions, rewards, and control actions. In this paper, the dynamics, control policy, and reward function are supposed to be deterministic and Lipschitz continuous. Under these assumptions, a polynomial algorithm, in terms of the sample size and length of the optimization horizon, is derived to compute these bounds, and their tightness is characterized in terms of the sample density.

ieee symposium on adaptive dynamic programming and reinforcement learning | 2013

Optimistic planning for belief-augmented Markov Decision Processes

Raphaël Fonteneau; Lucian Busoniu; Rémi Munos

This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to contexts where the transition model of the MDP is initially unknown and progressively learned through interactions within the environment. The knowledge about the unknown MDP is represented with a probability distribution over all possible transition models using Dirichlet distributions, and the BOP algorithm plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal when the budget parameter increases to infinity. Preliminary empirical validations show promising performance.

international conference on agents and artificial intelligence | 2010

Towards Min Max Generalization in Reinforcement Learning

Raphaël Fonteneau; Susan A. Murphy; Louis Wehenkel; Damien Ernst

In this paper, we introduce a min max approach for addressing the generalization problem in Reinforcement Learning. The min max approach works by determining a sequence of actions that maximizes the worst return that could possibly be obtained considering any dynamics and reward function compatible with the sample of trajectories and some prior knowledge on the environment. We consider the particular case of deterministic Lipschitz continuous environments over continuous state spaces, finite action spaces, and a finite optimization horizon. We discuss the non-triviality of computing an exact solution of the min max problem even after reformulating it so as to avoid search in function spaces. For addressing this problem, we propose to replace, inside this min max problem, the search for the worst environment given a sequence of actions by an expression that lower bounds the worst return that can be obtained for a given sequence of actions. This lower bound has a tightness that depends on the sample sparsity. From there, we propose an algorithm of polynomial complexity that returns a sequence of actions leading to the maximization of this lower bound. We give a condition on the sample sparsity ensuring that, for a given initial state, the proposed algorithm produces an optimal sequence of actions in open-loop. Our experiments show that this algorithm can lead to more cautious policies than algorithms combining dynamic programming with function approximators.

Iet Systems Biology | 2008

Modelling the influence of activation-induced apoptosis of CD4 + and CD8 + T-cells on the immune system response of a HIV-infected patient

Guy-Bart Stan; Florence Belmudes; Raphaël Fonteneau; Frédéric Zeggwagh; Marie-Anne Lefebvre; Christian Michelet; Damien Ernst

On the basis of the human immunodeficiency virus (HIV) infection dynamics model proposed by Adams, the authors propose an extended model that aims at incorporating the influence of activation-induced apoptosis of CD4+ and CD8+ T-cells on the immune system response of HIV-infected patients. Through this model, the authors study the influence of this phenomenon on the time evolution of specific cell populations such as plasma concentrations of HIV copies, or blood concentrations of CD4+ and CD8+ T-cells. In particular, this study shows that depending on its intensity, the apoptosis phenomenon can either favour or mitigate the long-term evolution of the HIV infection.

PLOS ONE | 2016

Benchmarking for Bayesian Reinforcement Learning

Michaël Castronovo; Damien Ernst; Adrien Couëtoux; Raphaël Fonteneau

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.

discovery science | 2012

Policy Search in a Space of Simple Closed-form Formulas: Towards Interpretability of Reinforcement Learning

Francis Maes; Raphaël Fonteneau; Louis Wehenkel; Damien Ernst

In this paper, we address the problem of computing interpretable solutions to reinforcement learning (RL) problems. To this end, we propose a search algorithm over a space of simple closed-form formulas that are used to rank actions. We formalize the search for a high-performance policy as a multi-armed bandit problem where each arm corresponds to a candidate policy canonically represented by its shortest formula-based representation. Experiments, conducted on standard benchmarks, show that this approach manages to determine both efficient and interpretable solutions.

BioResearch Open Access | 2014

Mathematical Modeling of HIV Dynamics After Antiretroviral Therapy Initiation: A Review

Pablo S. Rivadeneira; Claude H. Moog; Guy-Bart Stan; Cécile Brunet; François Raffi; Virginie Ferré; Vicente Costanza; Marie J. Mhawej; Federico L. Biafore; Djomangan Adama Ouattara; Damien Ernst; Raphaël Fonteneau; Xiaohua Xia

Abstract This review shows the potential ground-breaking impact that mathematical tools may have in the analysis and the understanding of the HIV dynamics. In the first part, early diagnosis of immunological failure is inferred from the estimation of certain parameters of a mathematical model of the HIV infection dynamics. This method is supported by clinical research results from an original clinical trial: data just after 1 month following therapy initiation are used to carry out the model identification. The diagnosis is shown to be consistent with results from monitoring of the patients after 6 months. In the second part of this review, prospective research results are given for the design of individual anti-HIV treatments optimizing the recovery of the immune system and minimizing side effects. In this respect, two methods are discussed. The first one combines HIV population dynamics with pharmacokinetics and pharmacodynamics models to generate drug treatments using impulsive control systems. The second one is based on optimal control theory and uses a recently published differential equation to model the side effects produced by highly active antiretroviral therapy therapies. The main advantage of these revisited methods is that the drug treatment is computed directly in amounts of drugs, which is easier to interpret by physicians and patients.

computational intelligence and games | 2012

Imitative learning for real-time strategy games

Quentin Gemine; Firas Safadi; Raphaël Fonteneau; Damien Ernst

Over the past decades, video games have become increasingly popular and complex. Virtual worlds have gone a long way since the first arcades and so have the artificial intelligence (AI) techniques used to control agents in these growing environments. Tasks such as world exploration, constrained pathfinding or team tactics and coordination just to name a few are now default requirements for contemporary video games. However, despite its recent advances, video game AI still lacks the ability to learn. In this paper, we attempt to break the barrier between video game AI and machine learning and propose a generic method allowing real-time strategy (RTS) agents to learn production strategies from a set of recorded games using supervised learning. We test this imitative learning approach on the popular RTS title StarCraft II® and successfully teach a Terran agent facing a Protoss opponent new production strategies.

Siam Journal on Control and Optimization | 2013