Joelle Pineau | Researchain

Publication

Featured research published by Joelle Pineau.

Robotics and Autonomous Systems | 2003

Towards robotic assistants in nursing homes: Challenges and results

Joelle Pineau; Michael Montemerlo; Martha E. Pollack; Nicholas Roy; Sebastian Thrun

This paper describes a mobile robotic assistant, developed to assist elderly individuals with mild cognitive and physical impairments, as well as support nurses in their daily activities. We present three software modules relevant to ensuring successful human–robot interaction: an automated reminder system; a people tracking and detection system; and finally a high-level robot controller that performs planning under uncertainty by incorporating knowledge from low-level modules, and selecting appropriate courses of action. During the course of experiments conducted in an assisted living facility, the robot successfully demonstrated that it could autonomously provide reminders and guidance for elderly residents.

Journal of Artificial Intelligence Research | 2008

Online planning algorithms for POMDPs

Stéphane Ross; Joelle Pineau; Sébastien Paquet; Brahim Chaib-draa

Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.

Meeting of the Association for Computational Linguistics | 2000

Spoken dialogue management using probabilistic reasoning

Nicholas Roy; Joelle Pineau; Sebastian Thrun

Spoken dialogue managers have benefited from using stochastic planners such as Markov Decision Processes (MDPs). However, so far, MDPs do not handle noisy and ambiguous speech utterances well. We use a Partially Observable Markov Decision Process (POMDP)-style approach to generate dialogue strategies by inverting the notion of dialogue state; the state represents the user's intentions, rather than the system state. We demonstrate that under the same noisy conditions, a POMDP dialogue manager makes fewer mistakes than an MDP dialogue manager. Furthermore, as the quality of speech recognition degrades, the POMDP dialogue manager automatically adjusts the policy.
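
Treating the user's intention as a hidden state means the dialogue manager has to maintain a belief over intentions and revise it after each noisy utterance. The following is a minimal sketch of that revision as a discrete Bayes update, not the authors' implementation. The intention names, the observation probabilities, and the function `update_belief` are hypothetical, and the sketch assumes the intention does not change between turns.

```python
def update_belief(belief, observation, obs_model):
    """Bayes update of a belief over hidden user intentions.

    belief: dict mapping intention -> prior probability
    obs_model: dict mapping intention -> dict of P(observation | intention)
    Assumption: the user's intention stays fixed between turns.
    """
    unnormalized = {
        s: p * obs_model[s].get(observation, 0.0) for s, p in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under the model: keep the prior.
        return dict(belief)
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical two-intention example with a noisy speech recognizer.
prior = {"want_weather": 0.5, "want_tv_schedule": 0.5}
obs_model = {
    "want_weather": {"heard_weather": 0.7, "heard_tv": 0.3},
    "want_tv_schedule": {"heard_weather": 0.2, "heard_tv": 0.8},
}
posterior = update_belief(prior, "heard_weather", obs_model)
```

With these made-up numbers the belief shifts toward `want_weather` without becoming certain, which is what lets a policy defined over beliefs choose to ask a clarifying question when recognition is poor.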

Journal of Artificial Intelligence Research | 2006

Anytime point-based approximations for large POMDPs

Joelle Pineau; Geoffrey J. Gordon; Sebastian Thrun

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.
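
A value backup at a single belief point can be sketched as follows. This is an illustrative version of the standard point-based backup over alpha vectors, written under assumed data layouts; it is not the paper's code and it omits belief point selection entirely. The function name `point_based_backup`, the nested-list layout of `T`, `O`, and `R`, and the toy numbers are all hypothetical.

```python
def point_based_backup(b, alphas, T, O, R, gamma):
    """One value backup at a single belief point b.

    T[a][s][s2] = P(s2 | s, a), O[a][s2][o] = P(o | s2, a), R[a][s] = reward.
    alphas is the current set of alpha vectors (lists indexed by state).
    Returns the new alpha vector that is maximal at b.
    """
    n_states = len(b)
    n_actions = len(T)
    n_obs = len(O[0][0])
    best_alpha, best_value = None, float("-inf")
    for a in range(n_actions):
        alpha_a = list(R[a])
        for o in range(n_obs):
            # Project every alpha vector back through the pair (a, o).
            projected = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep the projection with the highest value at b.
            best_proj = max(
                projected, key=lambda g: sum(bi * gi for bi, gi in zip(b, g))
            )
            for s in range(n_states):
                alpha_a[s] += gamma * best_proj[s]
        value = sum(bi * ai for bi, ai in zip(b, alpha_a))
        if value > best_value:
            best_alpha, best_value = alpha_a, value
    return best_alpha


# Hypothetical toy model: 2 states, 2 actions, 2 observations.
T = [
    [[0.9, 0.1], [0.1, 0.9]],  # action 0
    [[0.5, 0.5], [0.5, 0.5]],  # action 1
]
O = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.8, 0.2], [0.3, 0.7]],
]
R = [[1.0, 0.0], [0.0, 1.0]]
new_alpha = point_based_backup([0.9, 0.1], [[0.0, 0.0]], T, O, R, gamma=0.95)
```

Because the backup returns one vector per belief point, the size of the value function is bounded by the number of points kept, which is why the choice of points matters so much for solution quality.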

Autonomous Agents and Multi-Agent Systems | 2013

A survey of point-based POMDP solvers

Guy Shani; Joelle Pineau; Robert Kaplow

The past decade has seen a significant breakthrough in research on solving partially observable Markov decision processes (POMDPs). Where past solvers could not scale beyond perhaps a dozen states, modern solvers can handle complex domains with many thousands of states. This breakthrough was mainly due to the idea of restricting value function computations to a finite subset of the belief space, permitting only local value updates for this subset. This approach, known as point-based value iteration, avoids the exponential growth of the value function, and is thus applicable for domains with longer horizons, even with relatively large state spaces. Many extensions were suggested to this basic idea, focusing on various aspects of the algorithm—mainly the selection of the belief space subset, and the order of value function updates. In this survey, we walk the reader through the fundamentals of point-based value iteration, explaining the main concepts and ideas. Then, we survey the major extensions to the basic algorithm, discussing their merits. Finally, we include an extensive empirical analysis using well known benchmarks, in order to shed light on the strengths and limitations of the various approaches.

Empirical Methods in Natural Language Processing | 2016

How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Chia-Wei Liu; Ryan Lowe; Iulian Vlad Serban; Michael Noseworthy; Laurent Charlin; Joelle Pineau

We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available. Recent works in response generation have adopted metrics from machine translation to compare a model's generated response to a single target response. We show that these metrics correlate very weakly with human judgements in the non-technical Twitter domain, and not at all in the technical Ubuntu domain. We provide quantitative and qualitative results highlighting specific weaknesses in existing metrics, and provide recommendations for future development of better automatic evaluation metrics for dialogue systems.

Machine Learning | 2011

Informing sequential clinical decision-making through reinforcement learning: an empirical study

Susan M. Shortreed; Eric B. Laber; Daniel J. Lizotte; T. Scott Stroup; Joelle Pineau; Susan A. Murphy

This paper highlights the role that reinforcement learning can play in the optimization of treatment policies for chronic illnesses. Before applying any off-the-shelf reinforcement learning methods in this setting, we must first tackle a number of challenges. We outline some of these challenges and present methods for overcoming them. First, we describe a multiple imputation approach to overcome the problem of missing data. Second, we discuss the use of function approximation in the context of a highly variable observation set. Finally, we discuss approaches to summarizing the evidence in the data for recommending a particular action and quantifying the uncertainty around the Q-function of the recommended policy. We present the results of applying these methods to real clinical trial data of patients with schizophrenia.

European Conference on Machine Learning | 2005

Active learning in partially observable Markov decision processes

Robin Jaulmes; Joelle Pineau; Doina Precup

This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. We propose two approaches to this problem. The first relies on a model of the uncertainty that is added directly into the POMDP planning problem. This has theoretical guarantees, but is impractical when many of the parameters are uncertain. The second, called MEDUSA, incrementally improves the POMDP model using selected queries, while still optimizing reward. Results show good performance of the algorithm even in large problems: the most useful parameters of the model are learned quickly and the agent still accumulates high reward throughout the process.

International Conference on Robotics and Automation | 2008

Bayesian reinforcement learning in continuous POMDPs with application to robot navigation

Stéphane Ross; Brahim Chaib-draa; Joelle Pineau

We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially observable Markov decision processes (POMDPs) provide a rich mathematical model to handle such environments but require a known model to be solved by most approaches. This is a limitation in practice as the exact model parameters are often difficult to specify. We adopt a Bayesian approach where a posterior distribution over the model parameters is maintained and updated through experience with the environment. We propose a particle filter algorithm to maintain the posterior distribution and an online planning algorithm, based on trajectory sampling, to plan the best action to perform under the current posterior. The resulting approach selects control actions which optimally trade off between 1) exploring the environment to learn the model, 2) identifying the system's state, and 3) exploiting its knowledge in order to maximize long-term rewards. Our preliminary results on a simulated robot navigation problem show that our approach is able to learn good models of the sensors and actuators, and performs as well as if it had the true model.
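
The idea of keeping a particle-based posterior over unknown model parameters can be illustrated with a much simpler setting than the paper's. The sketch below only performs the importance-reweighting step over a single scalar parameter, a hypothetical sensor accuracy `theta`, and omits resampling, hidden state, and planning. The function `reweight_particles`, the particle grid, and the observation sequence are assumptions made for illustration, not the authors' algorithm.

```python
def reweight_particles(particles, weights, observations):
    """Importance-reweight particles over an unknown sensor accuracy.

    particles: candidate values of theta = P(sensor reading is correct)
    weights: current normalized weights, one per particle
    observations: list of booleans, True when a reading was correct
    """
    new_weights = []
    for theta, w in zip(particles, weights):
        likelihood = 1.0
        for correct in observations:
            likelihood *= theta if correct else (1.0 - theta)
        new_weights.append(w * likelihood)
    total = sum(new_weights)
    if total == 0.0:
        # Every particle is inconsistent with the data: keep old weights.
        return list(weights)
    return [w / total for w in new_weights]


# Hypothetical setup: 9 particles on a grid, uniform prior,
# and 8 correct readings out of 10.
particles = [0.1 * k for k in range(1, 10)]
weights = [1.0 / len(particles)] * len(particles)
observations = [True] * 8 + [False] * 2
posterior = reweight_particles(particles, weights, observations)
estimate = sum(p * w for p, w in zip(particles, posterior))
```

After the update, most of the weight sits on particles near 0.8, so `estimate` serves as a point estimate of the sensor accuracy while the spread of the weights reflects the remaining uncertainty.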

ISRR | 2007

POMDP Planning for Robust Robot Control

Joelle Pineau; Geoffrey J. Gordon

POMDPs provide a rich framework for planning and control in partially observable domains. Recent new algorithms have greatly improved the scalability of POMDPs, to the point where they can be used in robot applications. In this paper, we describe how approximate POMDP solving can be further improved by the use of a new theoretically-motivated algorithm for selecting salient information states. We present the algorithm, called PEMA, demonstrate competitive performance on a range of navigation tasks, and show how this approach is robust to mismatches between the robot’s physical environment and the model used for planning.
