Lina Maria Rojas-Barahona
University of Cambridge
Publications
Featured research published by Lina Maria Rojas-Barahona.
North American Chapter of the Association for Computational Linguistics | 2016
Tsung-Hsien Wen; Milica Gasic; Nikola Mrksic; Lina Maria Rojas-Barahona; Pei-Hao Su; David Vandyke; Stephen Young
Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine-tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.
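The two-stage recipe in this abstract (pre-train on counterfeited out-of-domain data, then fine-tune on a small in-domain set) can be illustrated with a minimal sketch. This is not the authors' released code: the generator architecture, the data loaders and the use of plain cross-entropy in place of the paper's discriminative objective are all simplifying assumptions.

```python
# Minimal sketch of two-stage domain adaptation for an RNN generator.
# Data loaders, vocabulary size and dialogue-act dimensionality are placeholders.
import torch
import torch.nn as nn

class RNNGenerator(nn.Module):
    def __init__(self, vocab_size, dialogue_act_dim, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.act_proj = nn.Linear(dialogue_act_dim, hidden)  # condition on the semantic input
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, dialogue_act):
        h0 = torch.tanh(self.act_proj(dialogue_act)).unsqueeze(0)  # init state from the dialogue act
        hidden_states, _ = self.rnn(self.embed(tokens), h0)
        return self.out(hidden_states)                              # per-step token logits

def run_epoch(model, loader, optimiser, loss_fn):
    for tokens, dialogue_act, targets in loader:                    # hypothetical batches
        logits = model(tokens, dialogue_act)
        loss = loss_fn(logits.flatten(0, 1), targets.flatten())
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

model = RNNGenerator(vocab_size=2000, dialogue_act_dim=50)
loss_fn = nn.CrossEntropyLoss()

# Stage 1: pre-train on counterfeited data synthesised from the out-of-domain corpus.
pretrain_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# run_epoch(model, counterfeited_loader, pretrain_opt, loss_fn)

# Stage 2: fine-tune on the small in-domain set with a lower learning rate
# (the paper uses a discriminative objective here; plain cross-entropy is a stand-in).
finetune_opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# run_epoch(model, in_domain_loader, finetune_opt, loss_fn)
```

The key point is only that the same parameters are optimised twice: first on abundant synthetic data, then, with a smaller learning rate, on the scarce in-domain utterances.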
Computer Speech & Language | 2017
Milica Gašić; Nikola Mrkšić; Lina Maria Rojas-Barahona; Pei-Hao Su; Stefan Ultes; David Vandyke; Tsung-Hsien Wen; Steve J. Young
Highlights: generic-specific policy model; policy committee model; multi-agent policy model; human user evaluation.
Spoken dialogue systems allow humans to interact with machines using natural speech. As such, they have many benefits. By using speech as the primary communication medium, a computer interface can facilitate swift, human-like acquisition of information. In recent years, speech interfaces have become ever more popular, as is evident from the rise of personal assistants such as Siri, Google Now, Cortana and Amazon Alexa. Recently, data-driven machine learning methods have been applied to dialogue modelling, and the results achieved for limited-domain applications are comparable to, or better than, those of traditional approaches. Methods based on Gaussian processes are particularly effective as they enable good models to be estimated from limited training data. Furthermore, they provide an explicit estimate of the uncertainty, which is particularly useful for reinforcement learning. This article explores the additional steps that are necessary to extend these methods to model multiple dialogue domains. We show that Gaussian process reinforcement learning is an elegant framework that naturally supports a range of methods, including prior knowledge, Bayesian committee machines and multi-agent learning, for facilitating extensible and adaptable dialogue systems.
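Of the methods listed above, the Bayesian committee machine is the easiest to sketch in isolation: each domain keeps its own Gaussian process, and their posteriors are combined with precision weighting. The sketch below uses scikit-learn regression GPs on random stand-in data; the real systems estimate Q-values with GP-SARSA over belief states, so treat this only as an illustration of the combination formula.

```python
# Minimal sketch of a Bayesian committee machine (BCM) over per-domain GPs.
# Features, targets and the number of domains are illustrative placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
kernel = RBF(length_scale=1.0)

# Train one GP per domain on its own (feature, value) pairs.
domain_gps = []
for _ in range(3):                                     # three hypothetical domains
    X = rng.normal(size=(40, 5))                       # belief-state features
    y = np.sin(X.sum(axis=1)) + 0.1 * rng.normal(size=40)
    domain_gps.append(GaussianProcessRegressor(kernel=kernel, alpha=1e-2).fit(X, y))

def bcm_predict(gps, X_query, prior_kernel):
    """Combine GP posteriors with precision weighting (standard BCM formula)."""
    means, variances = [], []
    for gp in gps:
        mu, std = gp.predict(X_query, return_std=True)
        means.append(mu)
        variances.append(std ** 2)
    prior_var = prior_kernel.diag(X_query)
    # The combined precision subtracts the (M - 1) times over-counted prior precision.
    precision = sum(1.0 / v for v in variances) - (len(gps) - 1) / prior_var
    combined_var = 1.0 / precision
    combined_mean = combined_var * sum(m / v for m, v in zip(means, variances))
    return combined_mean, combined_var

mu, var = bcm_predict(domain_gps, rng.normal(size=(4, 5)), kernel)
print(mu, var)
```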
Language and Linguistics Compass | 2016
Lina Maria Rojas-Barahona
Research and industry are increasingly interested in automatically determining the polarity of public opinion on a given subject. The advent of social networks has opened access to massive collections of blogs, recommendations, and reviews. The challenge is to extract the polarity from these data, a task known as opinion mining or sentiment analysis. The specific difficulties inherent in this task include issues related to subjective interpretation and linguistic phenomena that affect the polarity of words. Recently, deep learning has become a popular method of addressing this task, and a range of different approaches have been proposed in the literature. This article provides an overview of deep learning for sentiment analysis in order to place these approaches in context.
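As a concrete, deliberately simple instance of the kind of model this survey covers, the sketch below classifies the polarity of a text from averaged word embeddings; the vocabulary size, label set and toy batch are assumptions for illustration, not anything prescribed by the article.

```python
# Minimal neural sentiment classifier: averaged word embeddings -> polarity logits.
import torch
import torch.nn as nn

class BagOfEmbeddingsClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, num_classes=3):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, num_classes)
        )

    def forward(self, token_ids, offsets):
        return self.classifier(self.embed(token_ids, offsets))  # logits: negative / neutral / positive

model = BagOfEmbeddingsClassifier()
tokens = torch.tensor([4, 17, 256, 9, 31])   # two toy "reviews", flattened into one index tensor
offsets = torch.tensor([0, 3])               # start position of each review inside `tokens`
labels = torch.tensor([2, 0])                # toy polarity labels
loss = nn.CrossEntropyLoss()(model(tokens, offsets), labels)
loss.backward()
```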
Empirical Methods in Natural Language Processing | 2016
Tsung-Hsien Wen; Milica Gasic; Nikola Mrksic; Lina Maria Rojas-Barahona; Pei-Hao Su; Stefan Ultes; David Vandyke; Steve J. Young
Recently a variety of LSTM-based conditional language models (LMs) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential signals by applying a companion cross-entropy objective function to the conditioning vector. The experimental and analytical results demonstrate, firstly, that competition occurs between the conditioning vector and the LM, and that the differing architectures provide different trade-offs between the two. Secondly, the discriminative power and transparency of the conditioning vector is key to providing both model interpretability and better performance. Thirdly, snapshot learning leads to consistent performance improvements independent of which architecture is used.
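A rough sketch of the companion-objective idea, under simplifying assumptions: the conditioning vector is a fixed-size dialogue-act feature vector, the per-step "snapshot" targets mark which attributes remain to be generated, and the 0.5 loss weight is arbitrary. It is not the authors' implementation, but it shows how an auxiliary cross-entropy can be attached to the conditioning pathway alongside the usual LM loss.

```python
# Conditional LM with an auxiliary (snapshot-style) objective on the conditioning pathway.
import torch
import torch.nn as nn

class ConditionalLM(nn.Module):
    def __init__(self, vocab_size=2000, cond_dim=30, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.LSTM(hidden + cond_dim, hidden, batch_first=True)
        self.token_head = nn.Linear(hidden, vocab_size)  # next-token prediction
        self.cond_head = nn.Linear(hidden, cond_dim)     # per-step snapshot prediction

    def forward(self, tokens, cond):
        cond_seq = cond.unsqueeze(1).expand(-1, tokens.size(1), -1)
        h, _ = self.rnn(torch.cat([self.embed(tokens), cond_seq], dim=-1))
        return self.token_head(h), self.cond_head(h)

model = ConditionalLM()
tokens = torch.randint(0, 2000, (8, 12))              # toy batch: 8 utterances, 12 steps
cond = torch.rand(8, 30)                              # toy conditioning (dialogue-act) vectors
targets = torch.randint(0, 2000, (8, 12))             # next-token targets
snapshots = torch.randint(0, 2, (8, 12, 30)).float()  # which attributes remain at each step

token_logits, cond_logits = model(tokens, cond)
lm_loss = nn.CrossEntropyLoss()(token_logits.flatten(0, 1), targets.flatten())
companion_loss = nn.BCEWithLogitsLoss()(cond_logits, snapshots)
(lm_loss + 0.5 * companion_loss).backward()           # 0.5 is an arbitrary weighting
```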
IWSDS | 2019
Kyusong Lee; Tiancheng Zhao; Stefan Ultes; Lina Maria Rojas-Barahona; Eli Pincus; David R. Traum; Maxine Eskenazi
Collecting a large amount of real human-computer interaction data in various domains is a cornerstone in the development of better data-driven spoken dialog systems. The DialPort project is creating a portal to collect a constant stream of real user conversational data on a variety of topics. In order to keep real users attracted to DialPort, it is crucial to develop a robust evaluation framework to monitor and maintain high performance. Different from earlier spoken dialog systems, DialPort has a heterogeneous set of spoken dialog systems gathered under one outward-looking agent. In order to assess this new structure, we have identified some unique challenges that DialPort will encounter so that it can appeal to real users, and we have created a novel evaluation scheme that quantitatively assesses their performance in these situations. We look at assessment from the point of view of the system developer as well as that of the end user.
arXiv: Computation and Language | 2016
Pei-Hao Su; Milica Gasic; Nikola Mrksic; Lina Maria Rojas-Barahona; Stefan Ultes; David Vandyke; Tsung-Hsien Wen; Steve J. Young
North American Chapter of the Association for Computational Linguistics | 2016
Nikola Mrksic; Diarmuid Ó Séaghdha; Blaise Thomson; Milica Gasic; Lina Maria Rojas-Barahona; Pei-Hao Su; David Vandyke; Tsung-Hsien Wen; Steve J. Young
Conference of the International Speech Communication Association | 2017
Stefan Ultes; Pawel Budzianowski; Iñigo Casanueva; Nikola Mrksic; Lina Maria Rojas-Barahona; Pei-Hao Su; Tsung-Hsien Wen; Milica Gasic; Steve J. Young
Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2017
Pawel Budzianowski; Stefan Ultes; Pei-Hao Su; Nikola Mrksic; Tsung-Hsien Wen; Iñigo Casanueva; Lina Maria Rojas-Barahona; Milica Gasic
Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2013
Claire Gardent; Alejandra Lorenzo; Laura Perez-Beltrachini; Lina Maria Rojas-Barahona