Matthew Henderson | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Matthew Henderson is active.

Explore More

Publication

Featured researches published by Matthew Henderson.

annual meeting of the special interest group on discourse and dialogue | 2014

The Second Dialog State Tracking Challenge

Matthew Henderson; Blaise Thomson; Jason D. Williams

A spoken dialog system, while communicating with a user, must keep track of what the user wants from the system at each step. This process, termed dialog state tracking, is essential for a successful dialog system as it directly informs the system’s actions. The first Dialog State Tracking Challenge allowed for evaluation of different dialog state tracking techniques, providing common testbeds and evaluation suites. This paper presents a second challenge, which continues this tradition and introduces some additional features ‐ a new domain, changing user goals and a richer dialog state. The challenge received 31 entries from 9 research groups. The results suggest that while large improvements on a competitive baseline are possible, trackers are still prone to degradation in mismatched conditions. An investigation into ensemble learning demonstrates the most accurate tracking can be achieved by combining multiple trackers.

annual meeting of the special interest group on discourse and dialogue | 2014

Word-Based Dialog State Tracking with Recurrent Neural Networks

Matthew Henderson; Blaise Thomson; Steve J. Young

Recently discriminative methods for tracking the state of a spoken dialog have been shown to outperform traditional generative models. This paper presents a new wordbased tracking method which maps directly from the speech recognition results to the dialog state without using an explicit semantic decoder. The method is based on a recurrent neural network structure which is capable of generalising to unseen dialog state hypotheses, and which requires very little feature engineering. The method is evaluated on the second Dialog State Tracking Challenge (DSTC2) corpus and the results demonstrate consistently high performance across all of the metrics.

spoken language technology workshop | 2012

Discriminative spoken language understanding using word confusion networks

Matthew Henderson; Milica Gasic; Blaise Thomson; Pirros Tsiakoulis; Kai Yu; Steve J. Young

Current commercial dialogue systems typically use hand-crafted grammars for Spoken Language Understanding (SLU) operating on the top one or two hypotheses output by the speech recogniser. These systems are expensive to develop and they suffer from significant degradation in performance when faced with recognition errors. This paper presents a robust method for SLU based on features extracted from the full posterior distribution of recognition hypotheses encoded in the form of word confusion networks. Following [1], the system uses SVM classifiers operating on n-gram features, trained on unaligned input/output pairs. Performance is evaluated on both an off-line corpus and on-line in a live user trial. It is shown that a statistical discriminative approach to SLU operating on the full posterior ASR output distribution can substantially improve performance both in terms of accuracy and overall dialogue reward. Furthermore, additional gains can be obtained by incorporating features from the previous system output.

spoken language technology workshop | 2014

The third Dialog State Tracking Challenge

Matthew Henderson; Blaise Thomson; Jason D. Williams

In spoken dialog systems, dialog state tracking refers to the task of correctly inferring the users goal at a given turn, given all of the dialog history up to that turn. This task is challenging because of speech recognition and language understanding errors, yet good dialog state tracking is crucial to the performance of spoken dialog systems. This paper presents results from the third Dialog State Tracking Challenge, a research community challenge task based on a corpus of annotated logs of human-computer dialogs, with a blind test set evaluation. The main new feature of this challenge is that it studied the ability of trackers to generalize to new entities - i.e. new slots and values not present in the training data. This challenge received 28 entries from 7 research teams. About half the teams substantially exceeded the performance of a competitive rule-based baseline, illustrating not only the merits of statistical methods for dialog state tracking but also the difficulty of the problem.

international conference on acoustics, speech, and signal processing | 2013

On-line policy optimisation of Bayesian spoken dialogue systems via human interaction

Milica Gasic; Catherine Breslin; Matthew Henderson; Dongho Kim; Martin Szummer; Blaise Thomson; Pirros Tsiakoulis; Steve J. Young

A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator trained policy.

spoken language technology workshop | 2016

The fifth dialog state tracking challenge

Seokhwan Kim; Luis Fernando D'Haro; Rafael E. Banchs; Jason D. Williams; Matthew Henderson; Koichiro Yoshino

Dialog state tracking - the process of updating the dialog state after each interaction with the user - is a key component of most dialog systems. Following a similar scheme to the fourth dialog state tracking challenge, this edition again focused on human-human dialogs, but introduced the task of cross-lingual adaptation of trackers. The challenge received a total of 32 entries from 9 research groups. In addition, several pilot track evaluations were also proposed receiving a total of 16 entries from 4 groups. In both cases, the results show that most of the groups were able to outperform the provided baselines for each task.

spoken language technology workshop | 2014

Robust dialog state tracking using delexicalised recurrent neural networks and unsupervised adaptation

Matthew Henderson; Blaise Thomson; Steve J. Young

Tracking the users intention throughout the course of a dialog, called dialog state tracking, is an important component of any dialog system. Most existing spoken dialog systems are designed to work in a static, well-defined domain, and are not well suited to tasks in which the domain may change or be extended over time. This paper shows how recurrent neural networks can be effectively applied to tracking in an extended domain with new slots and values not present in training data. The method is evaluated in the third Dialog State Tracking Challenge, where it significantly outperforms other approaches in the task of tracking the users goal. A method for online unsupervised adaptation to new domains is also presented. Unsupervised adaptation is shown to be helpful in improving word-based recurrent neural networks, which work directly from the speech recognition results. Word-based dialog state tracking is attractive as it does not require engineering a spoken language understanding system for use in the new domain and it avoids the need for a general purpose intermediate semantic representation.

spoken language technology workshop | 2012

Policy optimisation of POMDP-based dialogue systems without state space compression

Milica Gasic; Matthew Henderson; Blaise Thomson; Pirros Tsiakoulis; Steve J. Young

The partially observable Markov decision process (POMDP) has been proposed as a dialogue model that enables automatic improvement of the dialogue policy and robustness to speech understanding errors. It requires, however, a large number of dialogues to train the dialogue policy. Gaussian processes (GP) have recently been applied to POMDP dialogue management optimisation showing an ability to substantially increase the speed of learning. Here, we investigate this further using the Bayesian Update of Dialogue State dialogue manager. We show that it is possible to apply Gaussian processes directly to the belief state, removing the need for a parametric policy representation. In addition, the resulting policy learns significantly faster while maintaining operational performance.

spoken language technology workshop | 2012

N-best error simulation for training spoken dialogue systems

Blaise Thomson; Milica Gasic; Matthew Henderson; Pirros Tsiakoulis; Steve J. Young

A recent trend in spoken dialogue research is the use of reinforcement learning to train dialogue systems in a simulated environment. Past researchers have shown that the types of errors that are simulated can have a significant effect on simulated dialogue performance. Since modern systems typically receive an N-best list of possible user utterances, it is important to be able to simulate a full N-best list of hypotheses. This paper presents a new method for simulating such errors based on logistic regression, as well as a new method for simulating the structure of N-best lists of semantics and their probabilities, based on the Dirichlet distribution. Off-line evaluations show that the new Dirichlet model results in a much closer match to the receiver operating characteristics (ROC) of the live data. Experiments also show that the logistic model gives confusions that are closer to the type of confusions observed in live situations. The hope is that these new error models will be able to improve the resulting performance of trained dialogue systems.

Archive | 2016

Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments

Steve J. Young; Catherine Breslin; Milica Gasic; Matthew Henderson; Dongho Kim; Martin Szummer; Blaise Thomson; Pirros Tsiakoulis; Eli Tzirkel Hancock

Compared to conventional hand-crafted rule-based dialogue management systems, statistical POMDP-based dialogue managers offer the promise of increased robustness, reduced development and maintenance costs, and scaleability to large open-domains. As a consequence, there has been considerable research activity in approaches to statistical spoken dialogue systems over recent years. However, building and deploying a real-time spoken dialogue system is expensive, and even when operational, it is hard to recruit sufficient users to get statistically significant results. Instead, researchers have tended to evaluate using user simulators or by reprocessing existing corpora, both of which are unconvincing predictors of actual real world performance. This paper describes the deployment of a real-world restaurant information system and its evaluation in a motor car using subjects recruited locally and by remote users recruited using Amazon Mechanical Turk. The paper explores three key questions: are statistical dialogue systems more robust than conventional hand-crafted systems; how does the performance of a system evaluated on a user simulator compare to performance with real users; and can performance of a system tested over the telephone network be used to predict performance in more hostile environments such as a motor car? The results show that the statistical approach is indeed more robust, but results from a simulator significantly over-estimate performance both absolute and relative. Finally, by matching WER rates, performance results obtained over the telephone can provide useful predictors of performance in noisier environments such as the motor car, but again they tend to over-estimate performance.

Explore More