Olivier Spanjaard
University of Paris
Publication
Featured research published by Olivier Spanjaard.
algorithmic decision theory | 2015
Hugo Gilbert; Olivier Spanjaard; Paolo Viappiani; Paul Weng
To tackle the potentially hard task of defining the reward function in a Markov Decision Process (MDP), a new approach called Interactive Value Iteration (IVI) has recently been proposed by Weng and Zanuttini (2013). This solving method, which interweaves elicitation and optimization phases, computes a near-optimal policy without knowing the precise reward values. The procedure as originally presented can be improved in order to reduce the number of queries needed to determine an optimal policy. The key insights are that (1) queries should be delayed as much as possible, to avoid asking queries that might not be necessary to determine the best policy; (2) queries should be asked following a priority order, because the answers to some queries can make it possible to resolve others; and (3) queries can be avoided by using heuristic information to guide the process. Following these ideas, a modified IVI algorithm is presented, and experimental results show a significant decrease in the number of queries issued.
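For context, the sketch below shows plain value iteration on an MDP whose rewards are fully known, which is the baseline that an interactive variant would modify. It is not the IVI algorithm: the elicitation queries are not modelled, and the function name, the dictionary layout and the toy MDP are all assumptions made for illustration.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Standard value iteration with known numeric rewards (illustrative only).

    transitions[s][a] is a list of (probability, next_state) pairs.
    rewards[s][a] is the immediate reward of action a in state s.
    """
    values = {s: 0.0 for s in transitions}

    def q(s, a):
        # Expected discounted return of taking a in s, then following `values`.
        return rewards[s][a] + gamma * sum(p * values[t] for p, t in transitions[s][a])

    while True:
        delta = 0.0
        for s in transitions:
            best = max(q(s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    policy = {s: max(transitions[s], key=lambda a, s=s: q(s, a)) for s in transitions}
    return values, policy


# Hypothetical two-state MDP used only to exercise the function.
TRANSITIONS = {
    "s0": {"stay": [(1.0, "s0")], "go": [(1.0, "s1")]},
    "s1": {"stay": [(1.0, "s1")]},
}
REWARDS = {
    "s0": {"stay": 0.0, "go": 1.0},
    "s1": {"stay": 2.0},
}

values, policy = value_iteration(TRANSITIONS, REWARDS, gamma=0.5)
```

In this baseline every comparison between actions uses exact reward values; the abstract's point is that, when rewards are unknown, each such comparison may require a query, so delaying and ordering them matters.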
European Journal of Operational Research | 2017
Hugo Gilbert; Olivier Spanjaard
In this paper, we provide a generic anytime lower bounding procedure for minmax regret optimization problems. We show that the lower bound obtained is always at least as accurate as the lower bound recently proposed by Chassein and Goerigk [3]. This lower bound can be viewed as the optimal value of a linear programming relaxation of a mixed integer programming formulation of minmax regret optimization, but the contribution of the paper is to compute this lower bound via a double oracle algorithm [10] that we specify. The double oracle algorithm is designed by relying on a game-theoretic view of robust optimization, similar to the one developed by Mastin et al. [9], and it can be efficiently implemented for any minmax regret optimization problem whose standard version is easy. We describe how to efficiently embed this lower bound in a branch and bound procedure. Finally, we apply our approach to the robust shortest path problem. Our numerical results show a significant gain in computation times compared to previous approaches in the literature.
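To make the minmax regret objective concrete, here is a brute-force sketch on a tiny instance with a finite scenario set. It only illustrates the definition (the regret of a solution in a scenario is its cost minus the best achievable cost in that scenario); it is not the lower bounding or double oracle procedure of the paper, and the instance, the discrete scenario set and all names are assumptions.

```python
# Hypothetical toy instance: three feasible solutions (think of s-t paths),
# each described by the edges it uses, and two cost scenarios over four edges.
SOLUTIONS = {
    "top": ("a", "b"),
    "bottom": ("c", "d"),
    "mixed": ("a", "d"),
}
SCENARIOS = [
    {"a": 1, "b": 4, "c": 3, "d": 3},
    {"a": 4, "b": 1, "c": 2, "d": 2},
]


def cost(solution, scenario):
    """Total cost of a solution under one scenario."""
    return sum(scenario[edge] for edge in SOLUTIONS[solution])


def max_regret(solution):
    """Worst-case gap to the scenario-wise optimum over all scenarios."""
    return max(
        cost(solution, s) - min(cost(other, s) for other in SOLUTIONS)
        for s in SCENARIOS
    )


def minmax_regret():
    """Solution with the smallest max regret, found by full enumeration."""
    best = min(SOLUTIONS, key=max_regret)
    return best, max_regret(best)


best_solution, best_value = minmax_regret()
```

Full enumeration is only viable for toy instances, since the number of feasible solutions typically grows exponentially; this is why lower bounds and branch and bound are needed in practice.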
multi disciplinary trends in artificial intelligence | 2013
Olivier Spanjaard; Paul Weng
Markov decision processes (MDPs) have become one of the standard models for decision-theoretic planning problems under uncertainty. In their standard form, rewards are assumed to be numerical additive scalars. In this paper, we propose a generalization of this model allowing rewards to be functional. The value of a history is recursively computed by composing the reward functions. We show that several variants of MDPs presented in the literature can be instantiated in this setting. We then identify sufficient conditions on these reward functions for dynamic programming to be valid. In order to show the potential of our framework, we conclude the paper by presenting several illustrative examples.
International Journal on Artificial Intelligence Tools | 2017
Paul Weng; Olivier Spanjaard
Markov decision processes (MDPs) have become one of the standard models for decision-theoretic planning problems under uncertainty. In their standard form, rewards are assumed to be numerical additive scalars. In this paper, we propose a generalization of this model allowing rewards to be functional. The value of a history is recursively computed by composing the reward functions. We show that several variants of MDPs presented in the literature can be instantiated in this setting. We then identify sufficient conditions on these reward functions for dynamic programming to be valid. We also discuss the infinite horizon case and the case where a maximum operator does not exist. In order to show the potential of our framework, we conclude the paper by presenting several illustrative examples.
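The idea of computing the value of a history by composing reward functions can be sketched as follows. Each step contributes a function that maps the value of the rest of the history to the value from that step onward. The helper names (`history_value`, `additive`, `capped`) and the two reward families are assumptions chosen for illustration, not examples taken from the paper.

```python
from functools import reduce


def history_value(reward_functions, terminal_value=0.0):
    """Compute r_1(r_2(...r_n(terminal_value))) for a finite history."""
    return reduce(lambda acc, r: r(acc), reversed(reward_functions), terminal_value)


def additive(reward, gamma=1.0):
    """Standard case: immediate scalar reward plus the discounted future value."""
    return lambda future: reward + gamma * future


def capped(level):
    """Non-additive case: the value of a history is its smallest level (bottleneck)."""
    return lambda future: min(level, future)


# Discounted additive rewards 1, 2, 3 with gamma = 0.5:
# 1 + 0.5 * (2 + 0.5 * (3 + 0.5 * 0)) = 2.75
discounted = history_value([additive(1, 0.5), additive(2, 0.5), additive(3, 0.5)])

# Bottleneck value of a history with levels 5, 2, 7.
bottleneck = history_value([capped(5), capped(2), capped(7)], float("inf"))
```

Both reward families here are nondecreasing in the future value. Whether a given family satisfies the sufficient conditions identified in the paper is not something this sketch establishes.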
A Quarterly Journal of Operations Research | 2004
Olivier Spanjaard
This is a summary of the most important results presented in the author’s PhD thesis (Spanjaard 2003). This thesis, written in French, was defended on 16 December 2003 and supervised by Patrice Perny. A copy is available from the author upon request. This thesis deals with the search for preferred solutions in combinatorial optimization problems (and more particularly graph problems). It aims at reconciling preference modelling and algorithmic concerns for decision aiding.
uncertainty in artificial intelligence | 2002
Patrice Perny; Olivier Spanjaard
international joint conference on artificial intelligence | 2005
Patrice Perny; Olivier Spanjaard; Paul Weng
national conference on artificial intelligence | 2002
Patrice Perny; Olivier Spanjaard
7th International Conference in Multi-Objective Programming and Goal Programming | 2006
Francis Sourd; Olivier Spanjaard; Patrice Perny
international joint conference on artificial intelligence | 2007
Patrice Perny; Olivier Spanjaard; Louis-Xavier Storme