
Publications


Featured research published by Tsung-Hsien Wen.


Empirical Methods in Natural Language Processing | 2015

Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems

Tsung-Hsien Wen; Milica Gasic; Nikola Mrksic; Pei-Hao Su; David Vandyke; Steve J. Young

Natural language generation (NLG) is a critical component of spoken dialogue systems, and it has a significant impact on both usability and perceived quality. Most NLG systems in common use employ rules and heuristics, and tend to generate rigid and stylised responses without the natural variation of human language. They are also not easily scaled to systems covering multiple domains and languages. This paper presents a statistical language generator based on a semantically controlled Long Short-Term Memory (LSTM) structure. The LSTM generator can learn from unaligned data by jointly optimising sentence planning and surface realisation using a simple cross-entropy training criterion, and language variation can be easily achieved by sampling from output candidates. With fewer heuristics, an objective evaluation in two differing test domains showed that the proposed method improved performance compared to previous methods. Human judges scored the LSTM system higher on informativeness and naturalness and overall preferred it to the other systems.
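
As a rough illustration of the mechanism, the sketch below (in PyTorch; the class name SCLSTMCell and all tensor names are illustrative, not the authors' code) shows a semantically conditioned LSTM cell: a sigmoid reading gate gradually consumes a one-hot dialogue-act vector, and the retained vector feeds the cell state so that each slot tends to be realised exactly once.

import torch
import torch.nn as nn

class SCLSTMCell(nn.Module):
    """Illustrative semantically conditioned LSTM cell (a sketch, not the
    authors' implementation). A reading gate r_t decides how much of the
    dialogue-act (DA) vector to consume at each step; the retained DA
    vector d_t is injected into the cell state to steer generation."""

    def __init__(self, input_size, hidden_size, da_size):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        self.read = nn.Linear(input_size + hidden_size, da_size)
        self.da_proj = nn.Linear(da_size, hidden_size, bias=False)

    def forward(self, x, h, c, d):
        z = torch.cat([x, h], dim=-1)
        i, f, o, g = self.gates(z).chunk(4, dim=-1)
        r = torch.sigmoid(self.read(z))       # reading gate over the DA vector
        d = r * d                             # consume part of the DA vector
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g) \
            + torch.tanh(self.da_proj(d))     # retained DA feeds the cell state
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c, d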


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2015

Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

Tsung-Hsien Wen; Milica Gasic; Dongho Kim; Nikola Mrksic; Pei-Hao Su; David Vandyke; Steve J. Young

The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multi-lingual dialogue systems intractable. Moreover, human languages are context-aware: the most natural response should be learned directly from data rather than depending on predefined syntax or rules. This paper presents a statistical language generator based on a joint recurrent and convolutional neural network structure which can be trained on dialogue act-utterance pairs without any semantic alignments or predefined grammar trees. Objective metrics suggest that this new model outperforms previous methods under the same experimental conditions. Results of an evaluation by human judges indicate that it produces not only high-quality but also linguistically varied utterances, which are preferred to n-gram and rule-based systems.
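
The generate-then-rerank loop can be pictured roughly as below; generator and cnn_scorer are hypothetical stand-ins for the trained RNN generator and the convolutional consistency scorer, not the paper's actual interfaces.

def generate_with_reranking(generator, cnn_scorer, dialogue_act, n_samples=20):
    """Sample candidate utterances from an RNN generator conditioned on a
    dialogue act, then rerank them with a CNN that scores how well each
    candidate covers the required slots and values (illustrative sketch)."""
    candidates = [generator.sample(dialogue_act) for _ in range(n_samples)]
    # Higher score = candidate better matches the dialogue act semantics.
    scores = [cnn_scorer(utterance, dialogue_act) for utterance in candidates]
    best_score, best_utterance = max(zip(scores, candidates),
                                     key=lambda pair: pair[0])
    return best_utterance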


North American Chapter of the Association for Computational Linguistics | 2016

Multi-domain Neural Network Language Generation for Spoken Dialogue Systems

Tsung-Hsien Wen; Milica Gasic; Nikola Mrksic; Lina Maria Rojas-Barahona; Pei-Hao Su; David Vandyke; Stephen Young

Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine-tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.
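
The two-stage procedure might look like the sketch below (the data loaders and loss function are hypothetical; the paper's fine-tuning stage uses a discriminative objective, which this sketch simplifies to a single loss for brevity).

import torch

def adapt_generator(model, counterfeit_loader, in_domain_loader, loss_fn):
    """Two-stage multi-domain adaptation (sketch, assuming PyTorch).

    Stage 1: train on counterfeited data synthesised from an out-of-domain
    corpus (slot and value tokens rewritten to resemble the target domain).
    Stage 2: fine-tune on the small set of real in-domain utterances.
    Epoch counts are arbitrary placeholders."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for loader, epochs in [(counterfeit_loader, 10), (in_domain_loader, 3)]:
        for _ in range(epochs):
            for dialogue_act, utterance in loader:
                opt.zero_grad()
                loss = loss_fn(model(dialogue_act), utterance)
                loss.backward()
                opt.step()
    return model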


International Joint Conference on Natural Language Processing | 2015

Multi-domain Dialog State Tracking using Recurrent Neural Networks

Nikola Mrksic; Diarmuid Ó Séaghdha; Blaise Thomson; Milica Gasic; Pei-Hao Su; David Vandyke; Tsung-Hsien Wen; Steve J. Young

Dialog state tracking is a key component of many modern dialog systems, most of which are designed with a single, well-defined domain in mind. This paper shows that dialog data drawn from different dialog domains can be used to train a general belief tracking model which can operate across all of these domains, exhibiting superior performance to each of the domain-specific models. We propose a training procedure which uses out-of-domain data to initialise belief tracking models for entirely new domains. This procedure leads to improvements in belief tracking performance regardless of the amount of in-domain data available for training the model.
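
The initialisation idea can be sketched as follows (assuming PyTorch; the data loader and loss function are hypothetical stand-ins, not the paper's code): a general tracker trained on pooled out-of-domain data supplies the starting weights for an entirely new domain.

import copy
import torch

def init_new_domain_tracker(general_tracker, new_domain_loader, loss_fn):
    """Initialise a belief tracker for an unseen domain from a general model
    trained on pooled out-of-domain dialogue data, then fine-tune on whatever
    in-domain data exists, possibly none (illustrative sketch)."""
    tracker = copy.deepcopy(general_tracker)   # start from out-of-domain weights
    opt = torch.optim.SGD(tracker.parameters(), lr=0.01)
    for features, label in new_domain_loader:  # loader may be empty
        opt.zero_grad()
        loss_fn(tracker(features), label).backward()
        opt.step()
    return tracker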


Meeting of the Association for Computational Linguistics | 2016

On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems

Pei-Hao Su; Milica Gasic; Nikola Mrksic; Lina M. Rojas Barahona; Stefan Ultes; David Vandyke; Tsung-Hsien Wen; Steve J. Young

The ability to compute an accurate reward function is essential for optimising a dialogue policy via reinforcement learning. In real-world applications, using explicit user feedback as the reward signal is often unreliable and costly to collect. This problem can be mitigated if the user's intent is known in advance or data is available to pre-train a task success predictor off-line. In practice, neither of these applies to most real-world applications. Here we propose an on-line learning framework whereby the dialogue policy is jointly trained alongside the reward model via active learning with a Gaussian process model. This Gaussian process operates on a continuous-space dialogue representation generated in an unsupervised fashion using a recurrent neural network encoder-decoder. The experimental results demonstrate that the proposed framework is able to significantly reduce data annotation costs and mitigate noisy user feedback in dialogue policy learning.
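
A minimal sketch of the active-learning loop, with scikit-learn's Gaussian process classifier standing in for the paper's GP reward model; encode (the RNN dialogue embedding) and ask_user (explicit feedback query) are hypothetical stand-ins. Feedback is requested only when the GP is uncertain.

import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

class ActiveRewardModel:
    """On-line active reward model (illustrative sketch). encode embeds a
    dialogue into a fixed-length vector (the paper uses an unsupervised RNN
    encoder-decoder); ask_user returns explicit feedback (1 = success,
    0 = failure) and is invoked only when the GP is unsure."""

    def __init__(self, encode, ask_user, threshold=0.85):
        self.encode, self.ask_user, self.threshold = encode, ask_user, threshold
        self.X, self.y = [], []
        self.gp = GaussianProcessClassifier(kernel=RBF(length_scale=1.0))

    def reward(self, dialogue):
        x = self.encode(dialogue)
        if len(set(self.y)) < 2:                 # too few labels to fit a GP
            p = 0.5
        else:
            p = self.gp.predict_proba(np.array([x]))[0, 1]
        if max(p, 1.0 - p) < self.threshold:     # uncertain: buy a user label
            p = float(self.ask_user(dialogue))
            self.X.append(x)
            self.y.append(int(p))
            if len(set(self.y)) >= 2:            # refit with the new label
                self.gp.fit(np.array(self.X), np.array(self.y))
        return 1.0 if p > 0.5 else 0.0           # task-success reward signal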


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

Policy committee for adaptation in multi-domain spoken dialogue systems

Milica Gasic; Nikola Mrksic; Pei-Hao Su; David Vandyke; Tsung-Hsien Wen; Steve J. Young

Moving from limited-domain dialogue systems to open-domain dialogue systems raises a number of challenges. One of them is the ability of the system to utilise small amounts of data from disparate domains to build a dialogue manager policy. Previous work has focused on using data from different domains to adapt a generic policy to a specific domain. Inspired by Bayesian committee machines, this paper proposes the use of a committee of dialogue policies. The results show that such a model is particularly beneficial for adaptation in multi-domain dialogue systems. The use of this model significantly improves performance compared to a single-policy baseline, as confirmed in a real-user trial. This is the first time a dialogue policy has been trained on multiple domains on-line in interaction with real users.
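
The Bayesian committee machine combination at the heart of this approach can be written down compactly; the sketch below applies the standard BCM formula to per-policy Gaussian Q-value estimates (function and variable names are this document's own).

import numpy as np

def bcm_q(means, variances, prior_var):
    """Bayesian committee machine combination of Q-value estimates (sketch).

    Each committee member i returns a Gaussian estimate N(means[i],
    variances[i]) of Q(belief, action); the committee pools the estimates,
    correcting for the shared prior with variance prior_var."""
    means, variances = np.asarray(means), np.asarray(variances)
    m = len(means)
    precision = np.sum(1.0 / variances) - (m - 1) / prior_var
    mean = (1.0 / precision) * np.sum(means / variances)
    return mean, 1.0 / precision   # combined estimate and its variance

Intuitively, confident committee members (small variance) dominate the pooled estimate, which is why policies trained on disparate domains can still be combined usefully.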


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2015

Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

Pei-Hao Su; David Vandyke; Milica Gasic; Nikola Mrksic; Tsung-Hsien Wen; Steve J. Young

Statistical spoken dialogue systems have the attractive property of being optimisable from data via interactions with real users. However, in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real users, where learning costs are expensive. Reward shaping is one promising technique for addressing these concerns. Here we examine three recurrent neural network (RNN) approaches for providing reward shaping information in addition to the primary (task-orientated) environmental feedback. These RNNs are trained on returns from dialogues generated by a simulated user and attempt to diffuse the overall evaluation of the dialogue back down to the turn level to guide the agent towards good behaviour faster. In both simulated and real user scenarios these RNNs are shown to increase policy learning speed. Importantly, they do not require prior knowledge of the user's goal.
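
One common way to apply such turn-level signals is potential-based shaping, sketched below; phi_prev and phi_next are assumed here to be per-turn potentials produced by an RNN trained on dialogue-level returns (names and the exact wiring are illustrative).

def shaped_reward(env_reward, phi_prev, phi_next, gamma=0.99):
    """Potential-based reward shaping for one dialogue turn (sketch).

    phi_prev and phi_next are potentials of consecutive turns; adding the
    discounted potential difference densifies the per-turn learning signal
    without changing which policy is optimal."""
    return env_reward + gamma * phi_next - phi_prev

Because the shaping terms telescope over a dialogue, the total shaped return differs from the environmental return only by the initial potential, which is why the optimal policy is preserved.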


Meeting of the Association for Computational Linguistics | 2017

PyDial: A Multi-domain Statistical Dialogue System Toolkit

Stefan Ultes; Lina M. Rojas Barahona; Pei-Hao Su; David Vandyke; Dongho Kim; Iñigo Casanueva; Pawel Budzianowski; Nikola Mrksic; Tsung-Hsien Wen; Milica Gasic; Steve J. Young

Statistical spoken dialogue systems have been around for many years. However, access to these systems has always been difficult, as there is still no publicly available end-to-end system implementation. To alleviate this, we present PyDial, an open-source end-to-end statistical spoken dialogue system toolkit which provides implementations of statistical approaches for all dialogue system modules. Moreover, it has been extended to provide multi-domain conversational functionality. It offers easy configuration, easy extensibility, and domain-independent implementations of the respective dialogue system modules. The toolkit is available for download under the Apache 2.0 license.
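
The kind of modular pipeline such a toolkit wires together can be pictured generically as below. This is a sketch of a statistical SDS turn, not PyDial's actual API; consult the toolkit's documentation for real module names and configuration.

def dialogue_turn(user_utterance, modules, state):
    """One turn through a modular statistical SDS pipeline (generic sketch,
    not PyDial's actual interfaces). Each module is a statistical component
    selected and configured independently of the others."""
    semantics = modules["semantic_decoder"](user_utterance)   # language understanding
    state = modules["belief_tracker"](state, semantics)       # belief-state update
    system_act = modules["policy"](state)                     # action selection
    return modules["generator"](system_act), state            # language generation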


Computer Speech & Language | 2017

Dialogue manager domain adaptation using Gaussian process reinforcement learning

Milica Gasic; Nikola Mrksic; Lina Maria Rojas-Barahona; Pei-Hao Su; Stefan Ultes; David Vandyke; Tsung-Hsien Wen; Steve J. Young

Highlights: generic-specific policy model; policy committee model; multi-agent policy model; human user evaluation.

Spoken dialogue systems allow humans to interact with machines using natural speech. As such, they have many benefits. By using speech as the primary communication medium, a computer interface can facilitate swift, human-like acquisition of information. In recent years, speech interfaces have become ever more popular, as is evident from the rise of personal assistants such as Siri, Google Now, Cortana and Amazon Alexa. Recently, data-driven machine learning methods have been applied to dialogue modelling and the results achieved for limited-domain applications are comparable to or outperform traditional approaches. Methods based on Gaussian processes are particularly effective as they enable good models to be estimated from limited training data. Furthermore, they provide an explicit estimate of the uncertainty, which is particularly useful for reinforcement learning. This article explores the additional steps that are necessary to extend these methods to model multiple dialogue domains. We show that Gaussian process reinforcement learning is an elegant framework that naturally supports a range of methods, including prior knowledge, Bayesian committee machines and multi-agent learning, for facilitating extensible and adaptable dialogue systems.
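
One of the supported adaptation routes (using prior knowledge) can be sketched as follows: let the generic policy's value estimate serve as the GP prior mean in the target domain by regressing the residuals. Scikit-learn's GP regressor stands in here for the GP-SARSA machinery of the paper; all names are illustrative.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fit_adapted_q(prior_q, X_target, q_target):
    """Domain adaptation with a GP (sketch). prior_q is the generic policy's
    value function; a zero-mean GP is fit to the residuals observed in the
    target domain (X_target: 2-D array of belief-action features, q_target:
    observed returns). Predictions fall back towards the generic policy
    wherever target-domain data is scarce."""
    residuals = q_target - np.array([prior_q(x) for x in X_target])
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
    gp.fit(X_target, residuals)

    def q(x):
        mean, std = gp.predict(np.array([x]), return_std=True)
        return prior_q(x) + mean[0], std[0]   # adapted estimate + uncertainty
    return q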


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

Multi-domain dialogue success classifiers for policy training

David Vandyke; Pei-Hao Su; Milica Gasic; Nikola Mrksic; Tsung-Hsien Wen; Steve J. Young

We propose a method for constructing dialogue success classifiers that are capable of making accurate predictions in domains unseen during training. Pooling and adaptation are also investigated for constructing multi-domain models when data is available in the new domain. This is achieved by reformulating the features input to the recurrent neural network models introduced in [1]. Importantly, on our task of main interest, this enables policy training in a new domain without the dialogue success classifier (which forms the reinforcement learning reward function) ever having seen data from that domain before, while incurring only a small reduction in performance relative to developing and using an in-domain dialogue success classifier. Finally, given that the motivation for these dialogue success classifiers is to enable policy training with real users, we demonstrate that the initial policy training results obtained with a simulated user carry over to learning from paid human users.
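
The feature reformulation idea can be illustrated roughly as below: strip out domain-specific slot names and keep only domain-general quantities, so the same classifier input applies to unseen domains. The belief-state structure and feature choice here are this document's assumptions, not the paper's exact features.

def domain_general_features(belief_state, k=5):
    """Reformulate domain-specific belief-state features into a fixed-length,
    domain-general form (illustrative sketch). belief_state is assumed to map
    each slot name to a value-probability distribution; instead of keying on
    slot names such as 'food' or 'area', keep only the sorted top slot
    confidences, padded or truncated to length k."""
    top_probs = sorted(
        (max(dist.values()) for dist in belief_state.values()), reverse=True)
    return (top_probs + [0.0] * k)[:k]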

Collaboration


Dive into Tsung-Hsien Wen's collaborations.

Top Co-Authors

Milica Gasic (University of Cambridge)
Pei-Hao Su (University of Cambridge)
Stefan Ultes (University of Cambridge)