Jason D. Williams | Researchain

Publication

Featured research published by Jason D. Williams.

Computer Speech & Language | 2007

Partially observable Markov decision processes for spoken dialog systems

Jason D. Williams; Steve J. Young

In a spoken dialog system, determining which action a machine should take in a given situation is a difficult problem because automatic speech recognition is unreliable and hence the state of the conversation can never be known with certainty. Much of the research in spoken dialog systems centres on mitigating this uncertainty and recent work has focussed on three largely disparate techniques: parallel dialog state hypotheses, local use of confidence scores, and automated planning. While in isolation each of these approaches can improve action selection, taken together they currently lack a unified statistical framework that admits global optimization. In this paper we cast a spoken dialog system as a partially observable Markov decision process (POMDP). We show how this formulation unifies and extends existing techniques to form a single principled framework. A number of illustrations are used to show qualitatively the potential benefits of POMDPs compared to existing techniques, and empirical results from dialog simulations are presented which demonstrate significant quantitative gains. Finally, some of the key challenges to advancing this method - in particular scalability - are briefly outlined.

International Conference on Acoustics, Speech, and Signal Processing | 2013

Recent advances in deep learning for speech research at Microsoft

Li Deng; Jinyu Li; Jui-Ting Huang; Kaisheng Yao; Dong Yu; Frank Seide; Michael L. Seltzer; Geoffrey Zweig; Xiaodong He; Jason D. Williams; Yifan Gong; Alex Acero

Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light on the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected experimental results, including speech recognition and related applications such as spoken dialogue and language modeling, are presented to demonstrate and analyze the strengths and weaknesses of the techniques described in the paper. Potential improvement of these techniques and future research directions are discussed.

Proceedings of the IEEE | 2013

POMDP-Based Statistical Spoken Dialog Systems: A Review

Steve J. Young; Milica Gasic; Blaise Thomson; Jason D. Williams

Statistical dialog systems (SDSs) are motivated by the need for a data-driven framework that reduces the cost of laboriously handcrafting complex dialog managers and that provides robustness against the errors created by speech recognizers operating in noisy environments. By including an explicit Bayesian model of uncertainty and by optimizing the policy via a reward-driven process, partially observable Markov decision processes (POMDPs) provide such a framework. However, exact model representation and optimization is computationally intractable. Hence, the practical application of POMDP-based systems requires efficient algorithms and carefully constructed approximations. This review article provides an overview of the current state of the art in the development of POMDP-based spoken dialog systems.

Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2014

The Second Dialog State Tracking Challenge

Matthew Henderson; Blaise Thomson; Jason D. Williams

A spoken dialog system, while communicating with a user, must keep track of what the user wants from the system at each step. This process, termed dialog state tracking, is essential for a successful dialog system as it directly informs the system’s actions. The first Dialog State Tracking Challenge allowed for evaluation of different dialog state tracking techniques, providing common testbeds and evaluation suites. This paper presents a second challenge, which continues this tradition and introduces some additional features: a new domain, changing user goals and a richer dialog state. The challenge received 31 entries from 9 research groups. The results suggest that while large improvements on a competitive baseline are possible, trackers are still prone to degradation in mismatched conditions. An investigation into ensemble learning demonstrates the most accurate tracking can be achieved by combining multiple trackers.

IEEE Transactions on Audio, Speech, and Language Processing | 2007

Scaling POMDPs for Spoken Dialog Management

Jason D. Williams; Steve J. Young

Control in spoken dialog systems is challenging largely because automatic speech recognition is unreliable, and hence the state of the conversation can never be known with certainty. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for planning and control in this context; however, POMDPs face severe scalability challenges, and past work has been limited to trivially small dialog tasks. This paper presents a novel POMDP optimization technique, composite summary point-based value iteration (CSPBVI), which enables optimization to be performed on slot-filling POMDP-based dialog managers of a realistic size. Using dialog models trained on data from a tourist information domain, simulation results show that CSPBVI scales effectively, outperforms non-POMDP baselines, and is robust to estimation errors.

Archive | 2008

Partially Observable Markov Decision Processes with Continuous Observations for Dialogue Management

Jason D. Williams; Pascal Poupart; Steve J. Young

This work shows how a spoken dialogue system can be represented as a Partially Observable Markov Decision Process (POMDP) with composite observations consisting of discrete elements representing dialogue acts and continuous components representing confidence scores. Using a testbed simulated dialogue management problem and recently developed optimisation techniques, we demonstrate that this continuous POMDP can outperform traditional approaches in which confidence score is tracked discretely. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.

International Conference on Acoustics, Speech, and Signal Processing | 2010

Incremental partition recombination for efficient tracking of multiple dialog states

Jason D. Williams

For spoken dialog systems, tracking a distribution over multiple dialog states has been shown to add robustness to speech recognition errors. To retain tractability, past work has suggested tracking dialog states in groups called partitions. While promising, current techniques are limited to incorporating a small number of ASR N-Best hypotheses. This paper overcomes this limitation by incrementally recombining partitions during the update. Experiments with a database of 300,000 AT&T staff show better whole-dialog accuracy than existing approaches. In addition, our implementation, which is available to the research community [1], views partitions as programmatic objects - an accessible formulation for commercial application developers.

Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2014

Web-style ranking and SLU combination for dialog state tracking

Jason D. Williams

In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. This paper introduces two novel methods for this task. First, we explain how state tracking is structurally similar to web-style ranking, enabling mature, powerful ranking algorithms to be applied. Second, we show how to use multiple spoken language understanding engines (SLUs) in state tracking — multiple SLUs can expand the set of dialog states being tracked, and give more information about each, thereby increasing both recall and precision of state tracking. We evaluate on the second Dialog State Tracking Challenge; together these two techniques yield highest accuracy in 2 of 3 tasks, including the most difficult and general task.

Spoken Language Technology Workshop | 2014

The third Dialog State Tracking Challenge

Matthew Henderson; Blaise Thomson; Jason D. Williams

In spoken dialog systems, dialog state tracking refers to the task of correctly inferring the user's goal at a given turn, given all of the dialog history up to that turn. This task is challenging because of speech recognition and language understanding errors, yet good dialog state tracking is crucial to the performance of spoken dialog systems. This paper presents results from the third Dialog State Tracking Challenge, a research community challenge task based on a corpus of annotated logs of human-computer dialogs, with a blind test set evaluation. The main new feature of this challenge is that it studied the ability of trackers to generalize to new entities - i.e. new slots and values not present in the training data. This challenge received 28 entries from 7 research teams. About half the teams substantially exceeded the performance of a competitive rule-based baseline, illustrating not only the merits of statistical methods for dialog state tracking but also the difficulty of the problem.

Meeting of the Association for Computational Linguistics | 2017

Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

Jason D. Williams; Kavosh Asadi; Geoffrey Zweig

End-to-end learning of recurrent neural networks (RNNs) is an attractive solution for dialog systems; however, current techniques are data-intensive and require thousands of dialogs to learn simple behaviors. We introduce Hybrid Code Networks (HCNs), which combine an RNN with domain-specific knowledge encoded as software and system action templates. Compared to existing end-to-end approaches, HCNs considerably reduce the amount of training data required, while retaining the key benefit of inferring a latent representation of dialog state. In addition, HCNs can be optimized with supervised learning, reinforcement learning, or a mixture of both. HCNs attain state-of-the-art performance on the bAbI dialog dataset (Bordes and Weston, 2016), and outperform two commercially deployed customer-facing dialog systems at our company.
