Publication


Featured research published by Romain Laroche.


International Conference on Neural Information Processing | 2014

Contextual Bandit for Active Learning: Active Thompson Sampling

Djallel Bouneffouf; Romain Laroche; Tanguy Urvoy; Raphaël Féraud; Robin Allesiardo

The labelling of training examples is a costly task in supervised classification. Active learning strategies address this problem by selecting the most useful unlabelled examples with which to train a predictive model. The choice of examples to label can be seen as a dilemma between exploration and exploitation over the data space representation. In this paper, a novel active learning strategy manages this compromise by modelling the active learning problem as a contextual bandit problem. We propose a sequential algorithm named Active Thompson Sampling (ATS) which, in each round, assigns a sampling distribution to the pool, samples one point from this distribution, and queries the oracle for this sample point's label. Experimental comparison to previously proposed active learning algorithms shows superior performance on a real application dataset.
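The round structure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the utility function, the Bayesian-linear posterior sample, and all names (`ats_round`, `oracle`, etc.) are assumptions made for the example.

```python
import math
import random

random.seed(0)

def ats_round(pool, posterior_sample, labelled, oracle):
    """One round of (a sketch of) Active Thompson Sampling: draw a model
    from the posterior, turn per-point utilities into a sampling
    distribution over the unlabelled pool, sample one point from it,
    and query the oracle for that point's label."""
    # Utility of labelling x under the sampled linear model: here,
    # uncertainty, i.e. how close the sampled scorer is to zero on x.
    scores = [abs(sum(w * xi for w, xi in zip(posterior_sample, x))) for x in pool]
    utilities = [math.exp(-s) for s in scores]     # most uncertain -> highest
    total = sum(utilities)
    probs = [u / total for u in utilities]         # sampling distribution on the pool
    # Sample one index from the distribution.
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            break
    x = pool.pop(i)
    labelled.append((x, oracle(x)))                # query the oracle for the label
    return x

# Toy usage: 2-d points, an oracle that labels by the sign of the first coordinate.
pool = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(20)]
labelled = []
queried = ats_round(pool, posterior_sample=[1.0, 0.5], labelled=labelled,
                    oracle=lambda x: 1 if x[0] > 0 else 0)
print(len(pool), len(labelled))  # 19 1
```

In a full implementation the posterior would be updated after each query and a fresh model sampled per round; here a single fixed sample stands in for that step.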


Proceedings of the First International Conference on Statistical Language and Speech Processing (SLSP 2013) | 2013

Reward shaping for statistical optimisation of dialogue management

Layla El Asri; Romain Laroche; Olivier Pietquin

This paper investigates the impact of reward shaping on the learning of reinforcement learning-based spoken dialogue systems. A diffuse reward function gives a reward after each transition between two dialogue states; a sparse function only gives a reward at the end of the dialogue. Reward shaping consists of learning a diffuse function without modifying the optimal policy relative to the sparse one. Two reward shaping methods are applied to a corpus of dialogues evaluated with numerical performance scores. Learning with these functions is compared to the sparse case, and it is shown on simulated dialogues that the policies learnt after reward shaping lead to higher performance.
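The abstract does not specify the two shaping methods, but the classic way to diffuse a sparse reward without changing the optimal policy is potential-based shaping (Ng, Harada & Russell, 1999), which adds F(s, s') = γΦ(s') − Φ(s) to each transition. The sketch below illustrates that scheme; the slot-filling potential is an invented example, not the paper's function.

```python
GAMMA = 0.99  # discount factor (illustrative value)

def shaped_reward(sparse_reward, phi, s, s_next):
    """Potential-based reward shaping: add F(s, s') = gamma * phi(s') - phi(s)
    to the sparse reward. This spreads reward over transitions while
    leaving the optimal policy unchanged."""
    return sparse_reward + GAMMA * phi(s_next) - phi(s)

# Toy potential: estimated dialogue progress (fraction of filled slots).
phi = lambda state: state["filled_slots"] / state["total_slots"]

s  = {"filled_slots": 1, "total_slots": 4}
s2 = {"filled_slots": 2, "total_slots": 4}
r = shaped_reward(0.0, phi, s, s2)  # mid-dialogue transition: sparse reward is 0
print(round(r, 4))  # 0.245
```

The agent now receives a non-zero learning signal mid-dialogue (0.99 × 0.5 − 0.25 = 0.245) even though the sparse reward arrives only at the end.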


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2015

Optimising Turn-Taking Strategies With Reinforcement Learning

Hatim Khouzaimi; Romain Laroche; Fabrice Lefèvre

In this paper, reinforcement learning (RL) is used to learn an efficient turn-taking management model in a simulated slot-filling task, with the objective of minimising the dialogue duration and maximising the task completion ratio. Turn-taking decisions are handled in a separate new module, the Scheduler. Unlike in most dialogue systems, a dialogue turn is split into micro-turns and the Scheduler makes a decision for each one of them. A Fitted Value Iteration algorithm, Fitted-Q, with a linear state representation is used to learn the state-to-action policy. A comparison between non-incremental and incremental handcrafted strategies, taken as baselines, and an incremental RL-based strategy shows the latter to be significantly more efficient, especially in noisy environments.
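One sweep of Fitted-Q with a linear representation can be sketched as below. Everything here is illustrative: the 2-d features, the WAIT/SPEAK micro-turn actions, and the toy transitions are assumptions, not the paper's actual state space or data.

```python
def fitted_q_iteration(transitions, theta, actions, gamma=0.95):
    """One sweep of Fitted-Q with a linear Q-function: Q(s, a) = theta[a] . s.
    Regression targets are r + gamma * max_a' Q(s', a'); each action's weight
    vector is refit by least squares on its own transitions."""
    new_theta = {}
    for a in actions:
        X = [s for (s, act, r, s2) in transitions if act == a]
        y = [r + gamma * max(dot(theta[b], s2) for b in actions)
             for (s, act, r, s2) in transitions if act == a]
        new_theta[a] = lstsq_2d(X, y)
    return new_theta

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def lstsq_2d(X, y):
    # Normal equations for 2-d features: theta = (X^T X)^-1 X^T y.
    a = sum(x[0] * x[0] for x in X); b = sum(x[0] * x[1] for x in X)
    c = b;                           d = sum(x[1] * x[1] for x in X)
    u = sum(x[0] * yi for x, yi in zip(X, y))
    v = sum(x[1] * yi for x, yi in zip(X, y))
    det = a * d - b * c
    return [(d * u - b * v) / det, (a * v - c * u) / det]

# Toy micro-turn transitions: features are [bias, dialogue progress].
transitions = [
    ([1, 0.0], "WAIT",  0.0, [1, 0.5]),
    ([1, 0.5], "WAIT",  0.0, [1, 1.0]),
    ([1, 0.5], "SPEAK", -1.0, [1, 0.5]),  # barging in too early is penalised
    ([1, 1.0], "SPEAK",  1.0, [1, 1.0]),  # speaking once done is rewarded
]
theta0 = {"WAIT": [0.0, 0.0], "SPEAK": [0.0, 0.0]}
theta1 = fitted_q_iteration(transitions, theta0, ["WAIT", "SPEAK"])
print(theta1["SPEAK"])  # [-3.0, 4.0]
```

After one sweep the SPEAK weights fit the two regression targets exactly (Q = −3 + 4 × progress); repeated sweeps would propagate the values further back through the micro-turn chain.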


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2014

An easy method to make dialogue systems incremental

Hatim Khouzaimi; Romain Laroche; Fabrice Lefèvre

Incrementality as a way of managing the interactions between a dialogue system and its users has been shown to have concrete advantages over the traditional turn-taking frame. Incremental systems are more reactive and more human-like, offer a better user experience, and allow the user to correct errors faster, hence avoiding desynchronisations. Several incremental models have been proposed; however, their core underlying architecture is different from that of classical dialogue systems, so they have to be implemented from scratch. In this paper, we propose a method to transform traditional dialogue systems into incremental ones. A new module, called the Scheduler, is inserted between the client and the service so that, from the client's point of view, the system behaves incrementally even though the service does not.
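The proxy idea can be sketched as a small class: at each micro-turn the Scheduler feeds the growing partial utterance to the unmodified turn-based service and decides whether to commit its answer now or keep listening. The class, the commit rule, and the toy service below are all illustrative assumptions, not the paper's code.

```python
class Scheduler:
    """Sketch of the Scheduler idea: a proxy between the client and a
    turn-based dialogue service. At each micro-turn it sends the current
    partial utterance to the service and decides whether to release the
    answer now or wait for more input, so the client sees incremental
    behaviour while the service stays non-incremental."""

    def __init__(self, service, commit_rule):
        self.service = service          # non-incremental system: text -> answer
        self.commit_rule = commit_rule  # decides when an answer is released
        self.partial = []

    def on_micro_turn(self, word):
        self.partial.append(word)
        candidate = self.service(" ".join(self.partial))
        if self.commit_rule(candidate, self.partial):
            self.partial = []
            return candidate            # committed: the client hears the answer now
        return None                     # keep listening

# Toy turn-based service and a rule that commits once the slot is filled.
service = lambda text: "PARIS" if "paris" in text else "ASK_CITY"
scheduler = Scheduler(service, commit_rule=lambda ans, p: ans != "ASK_CITY")

out = [scheduler.on_micro_turn(w) for w in ["i", "want", "paris"]]
print(out)  # [None, None, 'PARIS']
```

The design point is that only the Scheduler is new: the client and the service keep their original request/response interfaces.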


Empirical Methods in Natural Language Processing | 2015

Turn-taking phenomena in incremental dialogue systems

Hatim Khouzaimi; Romain Laroche; Fabrice Lefèvre

In this paper, a taxonomy of turn-taking phenomena is introduced, organised according to the level of information conveyed. It aims to provide a better grasp of the behaviours humans use while talking to each other, so that they can be methodically replicated in spoken dialogue systems. Five phenomena of interest have been implemented in a simulated environment: the system barge-in with three variants (resulting from an unclear, an incoherent, or a sufficient user message), the feedback, and the user barge-in. The experiments reported in the paper illustrate that how such phenomena are implemented is a delicate choice, as their impact on the system's performance varies.


7th International Workshop on Spoken Dialogue Systems (IWSDS 2016) | 2017

Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory

Layla El Asri; Romain Laroche; Olivier Pietquin

User satisfaction is often considered the objective that spoken dialogue systems should achieve, which is why the reward function of Spoken Dialogue Systems (SDS) trained by Reinforcement Learning (RL) is often designed to reflect user satisfaction. To do so, the state space representation should be based on features capturing user satisfaction characteristics, such as the mean speech recognition confidence score. On the other hand, for deployment in industrial systems, there is a need for state representations that are understandable by system engineers. In this article, we propose to represent the state space using a Genetic Sparse Distributed Memory. This is a state aggregation method that computes state prototypes selected so as to lead to the best linear representation of the value function in RL. To this end, previous work on Genetic Sparse Distributed Memory for classification is adapted to the reinforcement learning task, and a new way of building the prototypes is proposed. The approach is tested on a corpus of dialogues collected with an appointment scheduling system, and the results are compared to a grid-based linear parametrisation. It is shown that learning is accelerated and made more memory-efficient. It is also shown that the framework is scalable, in that it is possible to include many dialogue features in the representation, interpret the resulting policy, and identify the most important dialogue features.
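The read operation of a Sparse Distributed Memory used as a value function can be sketched as follows: a binary state activates every prototype within a Hamming radius, and the value is a combination of the activated prototypes' weights. The genetic selection of the prototypes, which is the paper's contribution, is out of scope here; the prototypes, weights, and radius below are fixed illustrative values.

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def sdm_value(state, prototypes, weights, radius):
    """Sparse Distributed Memory read used as a linear value function:
    the binary state activates every prototype within a Hamming radius,
    and the value is the mean weight of the activated prototypes.
    (In the paper the prototypes are chosen by a genetic algorithm;
    here they are fixed for illustration.)"""
    active = [w for p, w in zip(prototypes, weights)
              if hamming(state, p) <= radius]
    return sum(active) / len(active) if active else 0.0

# Three hand-picked prototypes over 4 binary dialogue features.
prototypes = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 1, 1]]
weights    = [10.0, -5.0, 20.0]

print(sdm_value([1, 0, 1, 1], prototypes, weights, radius=1))  # 15.0
```

Interpretability comes from the prototypes themselves: each one is a readable configuration of dialogue features, so an engineer can inspect which prototypes drive a state's value.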


International Conference on Acoustics, Speech, and Signal Processing | 2014

Ordinal regression for interaction quality prediction

Layla El Asri; Hatim Khouzaimi; Romain Laroche; Olivier Pietquin

The automatic prediction of the quality of a dialogue is useful for keeping track of a spoken dialogue system's performance and, if necessary, adapting its behaviour. Classifiers and regression models have been suggested to make this prediction; the parameters of these models are learnt from a corpus of dialogues evaluated by users or experts. In this paper, we propose to model this task as an ordinal regression problem. We apply support vector machines for ordinal regression to a corpus of dialogues where each system-user exchange was rated on a scale of 1 to 5 by experts. Compared to previous models proposed in the literature, the ordinal regression predictor has significantly better results according to the following evaluation metrics: Cohen's agreement rate with the experts' ratings, Spearman's rank correlation coefficient, and the Euclidean and Manhattan errors.
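Threshold-based ordinal regression SVMs share a simple inference step, sketched below: a learned real-valued score is cut into K ordered bins, one per rating. Learning the score function and the cut points is out of scope; the thresholds here are invented for illustration.

```python
import bisect

def predict_rating(score, thresholds):
    """Inference step of threshold-based ordinal regression: ordered cut
    points split the real line into K bins, one per rating. The predicted
    rating is 1 plus the number of thresholds the score exceeds."""
    return 1 + bisect.bisect_right(thresholds, score)

# Illustrative cut points for a 1-5 interaction-quality scale; in the
# paper both the scoring function and the thresholds are learnt.
thresholds = [-2.0, -0.5, 0.5, 2.0]

print([predict_rating(s, thresholds) for s in [-3.0, 0.0, 0.6, 2.5]])
# [1, 3, 4, 5]
```

Unlike a plain classifier, this construction guarantees the ordinal structure: predictions are monotone in the score, so nearby scores can never map to distant ratings.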


IWSDS | 2017

The Negotiation Dialogue Game

Romain Laroche; Aude Genevay

This article presents the design of a generic negotiation dialogue game between two or more players. The goal is to reach an agreement, each player having his own preferences over a shared set of options. Several simulated users have been implemented. An MDP policy has been optimised individually with Fitted Q-Iteration for several user instances; the learnt policies have then been cross-evaluated with other users. Results show a strong disparity in inter-user performance, which illustrates the importance of user adaptation in negotiation-based dialogue systems.
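A toy two-player version of such a game can be sketched as below: players share a set of options but hold private preferences, alternate offers, and accept when the option on the table is at least as good for them as their own next offer. The acceptance rule and all names are assumptions for illustration, not the paper's game definition.

```python
def negotiate(prefs_a, prefs_b, max_turns=10):
    """Toy negotiation dialogue game: two players share a set of options
    but have private preferences over them. Players alternate, proposing
    their best not-yet-offered option, and accept as soon as the current
    offer is at least as good for them as their own next proposal."""
    prefs = {"A": prefs_a, "B": prefs_b}
    order = {name: sorted(p, key=p.get, reverse=True) for name, p in prefs.items()}
    offered = {"A": [], "B": []}
    turn, last_offer = "A", None
    for _ in range(max_turns):
        mine = [o for o in order[turn] if o not in offered[turn]]
        if not mine:
            break
        if last_offer is not None and prefs[turn][last_offer] >= prefs[turn][mine[0]]:
            return last_offer, "accepted by " + turn   # agreement reached
        offered[turn].append(mine[0])                  # make a counter-offer
        last_offer = mine[0]
        turn = "B" if turn == "A" else "A"
    return None, "no agreement"

# Both players value option "wed" enough for a quick deal.
a = {"mon": 3, "tue": 1, "wed": 2}
b = {"mon": 1, "tue": 2, "wed": 3}
print(negotiate(a, b))  # ('wed', 'accepted by A')
```

Even in this tiny game, the best policy against one opponent (e.g. a stubborn one who concedes late) differs from the best policy against another, which is the inter-user disparity the abstract points to.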


IWSDS | 2017

Incremental Human-Machine Dialogue Simulation

Hatim Khouzaimi; Romain Laroche; Fabrice Lefèvre

This chapter introduces a simulator for incremental human-machine dialogue, used to generate artificial dialogue datasets that can in turn train and test data-driven methods. We review the various simulator components in detail, including an unstable speech recognizer, and their differences from non-incremental approaches. Then, as an illustration of its capacities, an incremental strategy based on handcrafted rules is implemented and compared to several non-incremental baselines. Their performance in terms of dialogue efficiency is presented under different noise conditions, showing that the simulator can handle several configurations representative of real usage.
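The "unstable speech recognizer" component can be sketched as a generator of partial hypotheses that occasionally revises already-emitted words before settling on a final result, which is the behaviour real incremental ASR exhibits. The corruption model, probabilities, and names below are invented for the sketch, not the chapter's simulator.

```python
import random

def unstable_asr(true_words, p_revise=0.3, noise=("uh", "the", "a"), seed=1):
    """Sketch of an unstable incremental ASR: after each new word it emits
    a growing partial hypothesis, but with probability p_revise one past
    word is temporarily corrupted (and corrected at the next micro-turn),
    mimicking the hypothesis revisions of incremental recognizers."""
    rng = random.Random(seed)
    hyps = []
    for i, _ in enumerate(true_words):
        partial = list(true_words[: i + 1])
        if i > 0 and rng.random() < p_revise:
            partial[rng.randrange(i)] = rng.choice(noise)  # corrupt a past word
        hyps.append(" ".join(partial))
    hyps.append(" ".join(true_words))  # final, stable hypothesis
    return hyps

for h in unstable_asr(["book", "a", "table", "for", "two"]):
    print(h)
```

This instability is exactly what makes incremental turn-taking strategies hard to evaluate: a module that commits to an early partial hypothesis may act on words the recognizer later retracts.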


Computer Speech & Language | 2018

A methodology for turn-taking capabilities enhancement in Spoken Dialogue Systems using Reinforcement Learning

Hatim Khouzaimi; Romain Laroche; Fabrice Lefèvre

This article introduces a new methodology for enhancing an existing traditional Spoken Dialogue System (SDS) with optimal turn-taking capabilities, in order to increase dialogue efficiency. A new approach for transforming the traditional dialogue architecture into an incremental one at low cost is presented: a new turn-taking decision module called the Scheduler is inserted between the Client and the Service and is responsible for handling turn-taking decisions. A User Simulator able to interact with the system through this new architecture has then been implemented and used to train a new Reinforcement Learning turn-taking strategy. Compared to non-incremental and handcrafted incremental baselines, it is shown to perform better both in simulation and in a real live experiment.

Collaboration


Dive into Romain Laroche's collaborations.

Top Co-Authors

Olivier Pietquin

Institut Universitaire de France

Layla El Asri

Georgia Institute of Technology
