Stéphane Ross
Carnegie Mellon University
Publications
Featured research published by Stéphane Ross.
Journal of Artificial Intelligence Research | 2008
Stéphane Ross; Joelle Pineau; Sébastien Paquet; Brahim Chaib-draa
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP exactly is often intractable except for small problems, owing to its computational complexity. Here, we focus on online approaches that alleviate this complexity by computing good local policies at each decision step during execution. Online algorithms generally perform a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are twofold: to survey the various existing online POMDP methods, analyze their properties, and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
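The lookahead search described above can be sketched as a depth-limited expectimax over beliefs. The Python below is a minimal illustration only, assuming a small discrete POMDP stored as nested dictionaries (T, O, R are assumptions for the sketch, not the implementations evaluated in the survey).

def belief_update(b, a, o, T, O, states):
    """Bayes filter: b'(s') is proportional to O[a][s'][o] * sum_s T[s][a][s'] * b[s]."""
    bp = {sp: O[a][sp][o] * sum(T[s][a][sp] * b[s] for s in states) for sp in states}
    z = sum(bp.values())
    return {sp: (p / z if z > 0 else 1.0 / len(states)) for sp, p in bp.items()}

def lookahead(b, depth, states, actions, observations, T, O, R, gamma):
    """Depth-limited expectimax over the belief tree; returns (value, best_action)."""
    if depth == 0:
        return 0.0, None
    best_v, best_a = float("-inf"), None
    for a in actions:
        # Expected immediate reward under the current belief.
        q = sum(b[s] * R[s][a] for s in states)
        for o in observations:
            # P(o | b, a) = sum_{s'} O[a][s'][o] * sum_s T[s][a][s'] * b[s]
            po = sum(O[a][sp][o] * sum(T[s][a][sp] * b[s] for s in states)
                     for sp in states)
            if po > 0:
                bp = belief_update(b, a, o, T, O, states)
                v, _ = lookahead(bp, depth - 1, states, actions, observations,
                                 T, O, R, gamma)
                q += gamma * po * v
        if q > best_v:
            best_v, best_a = q, a
    return best_v, best_a

Heuristic search methods improve on this plain expansion by ordering and pruning the belief tree with upper and lower bounds, but the recursion above captures the basic online loop.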
International Conference on Robotics and Automation | 2013
Stéphane Ross; Narek Melik-Barkhudarov; Kumar Shaurya Shankar; Andreas Wendel; Debadeepta Dey; J. Andrew Bagnell; Martial Hebert
Autonomous navigation for large Unmanned Aerial Vehicles (UAVs) is fairly straightforward, as expensive sensors and monitoring devices can be employed. In contrast, obstacle avoidance remains a challenging task for Micro Aerial Vehicles (MAVs), which operate at low altitude in cluttered environments. Unlike large vehicles, MAVs can only carry very light sensors, such as cameras, making autonomous navigation through obstacles much more challenging. In this paper, we describe a system that navigates a small quadrotor helicopter autonomously at low altitude through natural forest environments. Using only a single cheap camera to perceive the environment, we are able to maintain a constant velocity of up to 1.5 m/s. Given a small set of human pilot demonstrations, we use recent state-of-the-art imitation learning techniques to train a controller that can avoid trees by adapting the MAV's heading. We demonstrate the performance of our system in a more controlled indoor environment and in real natural forest environments outdoors.
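The imitation learning step can be sketched as a DAgger-style data-aggregation loop, in which the learner's own rollouts are labeled by the expert and folded into the training set. In this hedged sketch, expert_policy, fly_one_run, and the ridge regressor are hypothetical stand-ins for the pilot, a flight rollout, and the heading controller.

import numpy as np
from sklearn.linear_model import Ridge

def dagger(expert_policy, fly_one_run, n_iters=10):
    """Iteratively aggregate expert-labeled states visited by the learner."""
    X, y = [], []          # aggregated dataset of (features, expert heading)
    learner = None
    for i in range(n_iters):
        # Roll out the expert on iteration 0, the learned controller afterwards.
        policy = expert_policy if learner is None else \
                 (lambda feat: learner.predict(feat[None])[0])
        visited = fly_one_run(policy)        # hypothetical: features seen in flight
        for feat in visited:
            X.append(feat)
            y.append(expert_policy(feat))    # expert labels every visited state
        learner = Ridge(alpha=1.0).fit(np.array(X), np.array(y))
    return learner

Training on the states the learner itself visits, rather than only on expert trajectories, is what lets the controller recover from its own drift.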
Computer Vision and Pattern Recognition | 2011
Stéphane Ross; Daniel Munoz; Martial Hebert; J. Andrew Bagnell
Nearly every structured prediction problem in computer vision requires approximate inference due to large and complex dependencies among output labels. While graphical models provide a clean separation between modeling and inference, learning these models with approximate inference is not well understood. Furthermore, even if a good model is learned, predictions are often inaccurate due to approximations. In this work, instead of performing inference over a graphical model, we consider the inference procedure as a composition of predictors. Specifically, we focus on message-passing algorithms, such as Belief Propagation, and show how they can be viewed as procedures that sequentially predict label distributions at each node over a graph. Given labeled graphs, we can then train the sequence of predictors to output the correct labelings. The result no longer corresponds to a graphical model but simply defines an inference procedure, with strong theoretical properties, that can be used to classify new graphs. We demonstrate the scalability and efficacy of our approach on 3D point cloud classification and 3D surface estimation from single images.
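The message-passing-as-prediction idea can be sketched as a loop that repeatedly predicts each node's label distribution from its own features plus its neighbors' current predictions. In this illustrative Python sketch, graph (node to neighbor list), features, and the per-pass predictors (any trained classifier exposing predict_proba) are assumptions, not the paper's system.

import numpy as np

def inference_machine(graph, features, predictors, n_labels, n_passes=3):
    """Sequentially re-predict per-node label distributions, one pass at a time."""
    beliefs = {v: np.full(n_labels, 1.0 / n_labels) for v in graph}
    for t in range(n_passes):
        clf = predictors[t]            # one trained predictor per pass
        for v in graph:
            # "Message": the mean of the neighbors' current label distributions.
            neigh = (np.mean([beliefs[u] for u in graph[v]], axis=0)
                     if graph[v] else np.zeros(n_labels))
            x = np.concatenate([features[v], neigh])
            beliefs[v] = clf.predict_proba(x[None])[0]
    return beliefs

At training time, each pass's predictor is fit on the inputs generated by running the earlier passes, so the learned sequence directly defines the inference procedure.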
International Conference on Robotics and Automation | 2008
Stéphane Ross; Brahim Chaib-draa; Joelle Pineau
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially observable Markov decision processes (POMDPs) provide a rich mathematical model to handle such environments, but most solution approaches require the model to be known. This is a limitation in practice, as exact model parameters are often difficult to specify. We adopt a Bayesian approach in which a posterior distribution over the model parameters is maintained and updated through experience with the environment. We propose a particle filter algorithm to maintain the posterior distribution and an online planning algorithm, based on trajectory sampling, to plan the best action to perform under the current posterior. The resulting approach selects control actions that optimally trade off between 1) exploring the environment to learn the model, 2) identifying the system's state, and 3) exploiting its knowledge in order to maximize long-term rewards. Our preliminary results on a simulated robot navigation problem show that our approach is able to learn good models of the sensors and actuators, and performs as well as if it had the true model.
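One update step of a particle filter over model parameters might look like the sketch below, where each particle is a candidate model and likelihood is a hypothetical helper giving the probability a model assigns to observation o after action a. This is an assumption-laden illustration of the general technique, not the paper's algorithm.

import numpy as np

def particle_filter_step(particles, weights, a, o, likelihood):
    """Reweight candidate models by how well they explain (a, o); resample if needed."""
    n = len(particles)
    weights = weights * np.array([likelihood(m, a, o) for m in particles])
    if weights.sum() == 0:
        weights = np.ones(n)               # degenerate case: reset to uniform
    weights = weights / weights.sum()
    # Resample when the effective sample size gets small.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < n / 2:
        idx = np.random.choice(n, size=n, p=weights)
        particles = [particles[i] for i in idx]
        weights = np.ones(n) / n
    return particles, weights

Planning under the current posterior then amounts to sampling models from the particle set and evaluating candidate action sequences against those samples.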
Intelligent Robots and Systems | 2009
Patrick Dallaire; Camille Besse; Stéphane Ross; Brahim Chaib-draa
Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle real-world sequential decision processes, but most solution approaches require the model to be known. Moreover, mainstream POMDP research focuses on the discrete case, which complicates application to the many realistic problems that are naturally modeled with continuous state spaces. In this paper, we consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are unknown. We advocate the use of Gaussian Process Dynamical Models (GPDMs), which allow the model to be learned through experience with the environment. Our results on the blimp problem show that the approach can learn good models of the sensors and actuators in order to maximize long-term rewards.
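As a simplified stand-in for the GPDM (which also learns latent states and an observation mapping), one can fit a continuous dynamics model to logged transitions with plain Gaussian process regression. The scikit-learn sketch below is illustrative only and much weaker than the full GPDM.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_dynamics(states, actions, next_states):
    """Learn s_{t+1} = f(s_t, a_t) from arrays of logged transitions."""
    X = np.hstack([states, actions])         # inputs: (s_t, a_t) pairs
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gp.fit(X, next_states)

The GP posterior also provides predictive uncertainty, which is what makes this family of models attractive when planning with limited experience.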
International Conference on Machine Learning and Applications | 2009
Stéphane Ross; Masoumeh T. Izadi; Mark Mercer; David L. Buckeridge
In sequential decision making under uncertainty, as in many other modeling endeavors, researchers observe a dynamical system and collect data measuring its behavior over time. These data are often used to build models that explain relationships between the measured variables, and are eventually used for planning and control purposes. However, these measurements cannot always be exact, systems can change over time, and discovering these facts or fixing these problems is not always feasible. It is therefore important to formally describe the degree of noise a model can tolerate while maintaining near-optimal behavior. The problem of finding tolerance bounds has been the focus of many studies for Markov Decision Processes (MDPs) due to their usefulness in practical applications. In this paper, we consider Partially Observable MDPs (POMDPs), a more realistic extension of MDPs with a wider scope of applications. We address two types of perturbation in POMDP model parameters, additive and multiplicative, and provide theoretical bounds on the impact of these changes on the value function. Experimental results illustrate our POMDP perturbation analysis in practice.
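The two perturbation types can be illustrated on a row-stochastic transition matrix: additive noise T + E versus multiplicative noise T(1 + E), each followed by re-normalization so the rows remain distributions. This Python sketch shows the mechanics only; it is not the paper's experimental protocol, and its bounds are not reproduced here.

import numpy as np

def perturb(T, eps, mode="additive", seed=0):
    """Apply bounded additive or multiplicative noise to a stochastic matrix."""
    rng = np.random.default_rng(seed)
    E = rng.uniform(-eps, eps, size=T.shape)
    Tp = T + E if mode == "additive" else T * (1.0 + E)
    Tp = np.clip(Tp, 0.0, None)               # keep probabilities non-negative
    return Tp / Tp.sum(axis=-1, keepdims=True)  # re-normalize each row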
International Conference on Artificial Intelligence and Statistics | 2011
Stéphane Ross; Geoffrey J. Gordon; J. Andrew Bagnell
Neural Information Processing Systems | 2007
Stéphane Ross; Brahim Chaib-draa; Joelle Pineau
International Conference on Artificial Intelligence and Statistics | 2010
Stéphane Ross; Drew Bagnell
Journal of Machine Learning Research | 2011
Stéphane Ross; Joelle Pineau; Brahim Chaib-draa; Pierre Kreitmann