Network


Latest external collaborations at the country level. Click on the dots to dive into the details.

Hotspot


Dive into the research topics where Matthew Purver is active.

Publications


Featured research published by Matthew Purver.


IEEE Transactions on Audio, Speech, and Language Processing | 2010

The CALO Meeting Assistant System

Gökhan Tür; Andreas Stolcke; L. Lynn Voss; Stanley Peters; Dilek Hakkani-Tür; John Dowding; Benoit Favre; Raquel Fernández; Matthew Frampton; Michael W. Frandsen; Clint Frederickson; Martin Graciarena; Donald Kintzing; Kyle Leveque; Shane Mason; John Niekrasz; Matthew Purver; Korbinian Riedhammer; Elizabeth Shriberg; Jing Tien; Dimitra Vergyri; Fan Yang

The CALO Meeting Assistant (MA) provides for distributed meeting capture, annotation, automatic transcription and semantic analysis of multiparty meetings, and is part of the larger CALO personal assistant system. This paper presents the CALO-MA architecture and its speech recognition and understanding components, which include real-time and offline speech transcription, dialog act segmentation and tagging, topic identification and segmentation, question-answer pair identification, action item recognition, decision extraction, and summarization.
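
As a rough illustration of what such a meeting-understanding pipeline can look like as software, the sketch below chains independent annotation stages over a shared transcript object. The stage names, heuristics and data structures are hypothetical, not the CALO-MA interfaces.

    # Hypothetical sketch of a meeting-understanding pipeline in the spirit of
    # the components listed above; stage names and data structures are
    # illustrative, not the actual CALO-MA architecture.
    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Meeting:
        utterances: List[str]                              # transcribed utterances (ASR output)
        annotations: Dict[str, object] = field(default_factory=dict)

    # Each stage reads the transcript and earlier annotations, and adds its own.
    Stage = Callable[[Meeting], None]

    def dialog_act_tagger(m: Meeting) -> None:
        # Placeholder: tag every utterance with a generic dialog act.
        m.annotations["dialog_acts"] = ["statement"] * len(m.utterances)

    def action_item_detector(m: Meeting) -> None:
        # Placeholder heuristic: utterances containing "will" are action items.
        m.annotations["action_items"] = [u for u in m.utterances if "will" in u.lower()]

    def run_pipeline(m: Meeting, stages: List[Stage]) -> Meeting:
        for stage in stages:                               # offline processing: stages run in sequence
            stage(m)
        return m

    meeting = Meeting(["Alice will send the report.", "Sounds good."])
    run_pipeline(meeting, [dialog_act_tagger, action_item_detector])
    print(meeting.annotations["action_items"])             # ['Alice will send the report.']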


Spoken Language Technology Workshop | 2008

The CALO meeting speech recognition and understanding system

Gökhan Tür; Andreas Stolcke; L. Lynn Voss; John Dowding; Benoit Favre; Raquel Fernández; Matthew Frampton; Michael W. Frandsen; Clint Frederickson; Martin Graciarena; Dilek Hakkani-Tür; Donald Kintzing; Kyle Leveque; Shane Mason; John Niekrasz; Stanley Peters; Matthew Purver; Korbinian Riedhammer; Elizabeth Shriberg; Jing Tien; Dimitra Vergyri; Fan Yang

The CALO meeting assistant provides for distributed meeting capture, annotation, automatic transcription and semantic analysis of multiparty meetings, and is part of the larger CALO personal assistant system. This paper summarizes the CALO-MA architecture and its speech recognition and understanding components, which include real-time and offline speech transcription, dialog act segmentation and tagging, question-answer pair identification, action item recognition, decision extraction, and summarization.


Meeting of the Association for Computational Linguistics | 2006

Unsupervised Topic Modelling for Multi-Party Spoken Discourse

Matthew Purver; Konrad P. Körding; Thomas L. Griffiths; Joshua B. Tenenbaum

We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party discourse transcripts. We show how Bayesian inference in this generative model can be used to simultaneously address the problems of topic segmentation and topic identification: automatically segmenting multi-party meetings into topically coherent segments with performance which compares well with previous unsupervised segmentation-only methods (Galley et al., 2003) while simultaneously extracting topics which rate highly when assessed for coherence by human judges. We also show that this method appears robust in the face of off-topic dialogue and speech recognition errors.
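
The paper's contribution is a single generative model that yields segmentation and topics jointly; the sketch below only illustrates the general intuition, not that model. It fits an off-the-shelf LDA (in the spirit of Blei et al., 2003) over sliding windows of utterances and looks for sharp shifts in the inferred topic mixture. All data and parameter choices are invented for illustration.

    # Loose illustration of topic-shift-based segmentation (not the paper's
    # joint Bayesian model): fit LDA over windows of utterances, then look at
    # how sharply adjacent windows' topic mixtures diverge.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    utterances = [
        "let's plan the budget for next quarter",
        "the budget needs board approval",
        "switching gears, the demo crashed yesterday",
        "the crash was a null pointer in the parser",
    ]

    window = 2  # group consecutive utterances into overlapping windows
    windows = [" ".join(utterances[i:i + window])
               for i in range(len(utterances) - window + 1)]

    counts = CountVectorizer().fit_transform(windows)
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    theta = lda.fit_transform(counts)          # per-window topic mixtures

    def cos_dist(a, b):
        # Cosine distance between two topic-mixture vectors.
        return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    shifts = [cos_dist(theta[i], theta[i + 1]) for i in range(len(theta) - 1)]
    print("topic shift between consecutive windows:", [round(s, 3) for s in shifts])
    print("largest shift after window", int(np.argmax(shifts)))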


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2001

On the means for clarification in dialogue

Matthew Purver; Jonathan Ginzburg; Patrick G. T. Healey

The ability to request clarification of utterances is a vital part of the communicative process. In this paper we discuss the range of possible forms for clarification requests, together with the range of readings they can convey. We present the results of corpus analysis which show a correlation between certain forms and possible readings, together with some indication of the maximum likely distance between a request and the utterance being clarified. We then explain the implications of these results for a possible HPSG analysis of clarification requests and for an ongoing implementation of a clarification-capable dialogue system.


Empirical Methods in Natural Language Processing | 2014

Evaluating Neural Word Representations in Tensor-Based Compositional Settings

Dmitrijs Milajevs; Dimitri Kartsaklis; Mehrnoosh Sadrzadeh; Matthew Purver

We provide a comparative study between neural word representations and traditional vector spaces based on co-occurrence counts, in a number of compositional tasks. We use three different semantic spaces and implement seven tensor-based compositional models, which we then test (together with simpler additive and multiplicative approaches) in tasks involving verb disambiguation and sentence similarity. To check their scalability, we additionally evaluate the spaces using simple compositional methods on larger-scale tasks with less constrained language: paraphrase detection and dialogue act tagging. In the more constrained tasks, co-occurrence vectors are competitive, although choice of compositional method is important; on the larger-scale tasks, they are outperformed by neural word embeddings, which show robust, stable performance across the tasks.
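
To make the simpler baselines concrete, here is a minimal sketch of additive and point-wise multiplicative composition followed by cosine similarity, the kind of comparison used in the sentence-similarity setting. The four-dimensional toy vectors are invented, not taken from any of the paper's semantic spaces.

    # Minimal sketch of additive vs. point-wise multiplicative composition of
    # word vectors, with cosine similarity between the resulting sentence
    # vectors. The toy 4-d vectors are invented for illustration; real
    # experiments would use count-based or neural embeddings.
    import numpy as np

    vectors = {
        "dog":   np.array([0.9, 0.1, 0.3, 0.0]),
        "barks": np.array([0.8, 0.0, 0.5, 0.1]),
        "cat":   np.array([0.7, 0.2, 0.4, 0.0]),
        "meows": np.array([0.6, 0.1, 0.6, 0.1]),
    }

    def compose(words, mode="add"):
        vecs = [vectors[w] for w in words]
        if mode == "add":                        # additive composition
            return np.sum(vecs, axis=0)
        return np.prod(vecs, axis=0)             # point-wise multiplicative composition

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    s1, s2 = ["dog", "barks"], ["cat", "meows"]
    for mode in ("add", "mult"):
        sim = cosine(compose(s1, mode), compose(s2, mode))
        print(f"{mode:4s} composition similarity: {sim:.3f}")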


PLOS ONE | 2014

Divergence in dialogue.

Patrick G. T. Healey; Matthew Purver; Christine Howes

One of the best known claims about human communication is that people's behaviour and language use converge during conversation. It has been proposed that these patterns can be explained by automatic, cross-person priming. A key test case is structural priming: does exposure to one syntactic structure, in production or comprehension, make reuse of that structure (by the same or another speaker) more likely? It has been claimed that syntactic repetition caused by structural priming is ubiquitous in conversation. However, previous work has not tested for general syntactic repetition effects in ordinary conversation independently of lexical repetition. Here we analyse patterns of syntactic repetition in two large corpora of unscripted everyday conversations. Our results show that when lexical repetition is taken into account there is no general tendency for people to repeat their own syntactic constructions. More importantly, people repeat each other's syntactic constructions less than would be expected by chance; i.e., people systematically diverge from one another in their use of syntactic constructions. We conclude that in ordinary conversation the structural priming effects described in the literature are overwhelmed by the need to actively engage with our conversational partners and respond productively to what they say.
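
The analysis hinges on comparing observed cross-speaker repetition of syntactic constructions against a chance baseline. The sketch below shows one toy version of that kind of comparison using a permutation baseline over invented construction labels; it is not the authors' corpora or statistical model.

    # Sketch of comparing repetition of syntactic constructions across adjacent
    # turns against a chance baseline via permutation. The per-turn
    # "construction" labels are toy data; the real study used syntactic rules
    # extracted from large conversational corpora.
    import random

    # Labels for ten consecutive turns (speakers assumed to alternate A, B, A, B, ...).
    turns = ["NP-VP", "NP-VP", "VP-NP", "NP-VP-PP", "NP-VP",
             "VP-NP", "NP-VP-PP", "NP-VP", "VP-NP", "NP-VP-PP"]

    def adjacent_repetition(seq):
        # Fraction of adjacent turn pairs that share a construction label.
        pairs = list(zip(seq, seq[1:]))
        return sum(a == b for a, b in pairs) / len(pairs)

    observed = adjacent_repetition(turns)

    random.seed(0)
    baseline = []
    for _ in range(10_000):                      # permutation baseline
        shuffled = turns[:]
        random.shuffle(shuffled)
        baseline.append(adjacent_repetition(shuffled))
    expected = sum(baseline) / len(baseline)

    print(f"observed repetition rate: {observed:.3f}")
    print(f"chance (permuted) rate:   {expected:.3f}")
    # Divergence shows up as observed < expected across many dialogues.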


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Modelling and Detecting Decisions in Multi-party Dialogue

Raquel Fernández; Matthew Frampton; Patrick Ehlen; Matthew Purver; Stanley Peters

We describe a process for automatically detecting decision-making sub-dialogues in transcripts of multi-party, human-human meetings. Extending our previous work on action item identification, we propose a structured approach that takes into account the different roles utterances play in the decision-making process. We show that this structured approach outperforms the accuracy achieved by existing decision detection systems based on flat annotations, while enabling the extraction of more fine-grained information that can be used for summarization and reporting.
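
To illustrate what a structured approach means here, the sketch below tags utterances with simplified decision roles and only flags a window as a decision sub-dialogue when the roles co-occur. The role names and keyword rules are hypothetical stand-ins, not the paper's annotation scheme or classifiers.

    # Illustrative sketch of a "structured" decision detector: tag utterances
    # with decision-related roles, then flag a window only if the roles jointly
    # occur. Role labels and keyword rules are simplified stand-ins.
    from typing import List, Optional

    def role_of(utterance: str) -> Optional[str]:
        u = utterance.lower()
        if "should we" in u or "what about" in u:
            return "issue"
        if "let's" in u or "we will" in u:
            return "resolution"
        if u.strip() in {"yeah.", "agreed.", "ok."}:
            return "agreement"
        return None

    def decision_windows(utterances: List[str], width: int = 4) -> List[int]:
        roles = [role_of(u) for u in utterances]
        hits = []
        for start in range(len(roles) - width + 1):
            window = {r for r in roles[start:start + width] if r}
            # Require the full role structure, not just any single cue.
            if {"issue", "resolution", "agreement"} <= window:
                hits.append(start)
        return hits

    meeting = ["Should we move the release?", "Let's push it a week.",
               "Agreed.", "Next item is the budget."]
    print(decision_windows(meeting))   # [0]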


PLOS ONE | 2015

Twitter Language Use Reflects Psychological Differences between Democrats and Republicans.

Karolina Sylwester; Matthew Purver

Previous research has shown that political leanings correlate with various psychological factors. While surveys and experiments provide a rich source of information for political psychology, data from social networks can offer more naturalistic and robust material for analysis. This research investigates psychological differences between individuals of different political orientations on a social networking platform, Twitter. Based on previous findings, we hypothesized that the language used by liberals emphasizes their perception of uniqueness, contains more swear words, more anxiety-related words and more feeling-related words than conservatives’ language. Conversely, we predicted that the language of conservatives emphasizes group membership and contains more references to achievement and religion than liberals’ language. We analysed Twitter timelines of 5,373 followers of three Twitter accounts of the American Democratic and 5,386 followers of three accounts of the Republican parties’ Congressional Organizations. The results support most of the predictions and previous findings, confirming that Twitter behaviour offers valid insights into offline behaviour.
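
The underlying comparison is between per-user frequencies of psychologically meaningful word categories across the two follower groups. The sketch below shows that kind of comparison with an invented word list, toy timelines and a nonparametric test; it is not the lexica or dataset used in the study.

    # Sketch of a group comparison of word-category usage: compute each user's
    # relative frequency of a category, then compare the two groups with a
    # nonparametric test. The word list and tweets are toy examples, not the
    # LIWC-style lexica or the follower data used in the study.
    from scipy.stats import mannwhitneyu

    anxiety_words = {"worried", "afraid", "nervous"}     # illustrative category

    def category_rate(timeline):
        tokens = " ".join(timeline).lower().split()
        return sum(t.strip(".,!?") in anxiety_words for t in tokens) / len(tokens)

    group_a = [["I am worried about the vote", "so nervous today"],
               ["feeling afraid and worried"]]
    group_b = [["great rally today", "proud of our team"],
               ["hard work pays off"]]

    rates_a = [category_rate(t) for t in group_a]
    rates_b = [category_rate(t) for t in group_b]

    stat, p = mannwhitneyu(rates_a, rates_b, alternative="two-sided")
    print(f"mean anxiety-word rate A={sum(rates_a)/len(rates_a):.3f}, "
          f"B={sum(rates_b)/len(rates_b):.3f}, Mann-Whitney p={p:.3f}")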


IEEE Transactions on Audio, Speech, and Language Processing | 2008

A Probabilistic Model of Meetings That Combines Words and Discourse Features

Mike Dowman; Virginia Savova; Thomas L. Griffiths; Konrad P. Körding; Joshua B. Tenenbaum; Matthew Purver

In order to determine the points at which meeting discourse changes from one topic to another, probabilistic models were used to approximate the process through which meeting transcripts were produced. Gibbs sampling was used to estimate the values of random variables in the models, including the locations of topic boundaries. This paper shows how discourse features were integrated into the Bayesian model and reports empirical evaluations of the benefit obtained through the inclusion of each feature and of the suitability of alternative models of the placement of topic boundaries. It demonstrates how multiple cues to segmentation can be combined in a principled way, and empirical tests show a clear improvement over previous work.
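
A heavily stripped-down version of the Gibbs-sampling idea is sketched below: one binary boundary variable per gap between utterances, a Dirichlet-multinomial word model per segment, and none of the discourse features. This illustrates the inference pattern only, not the paper's model; data and hyperparameters are invented.

    # Toy Gibbs sampler for topic boundaries: each gap between utterances has a
    # binary boundary variable; words within a segment follow a
    # Dirichlet-multinomial. Discourse features from the paper are omitted.
    import math, random
    from collections import Counter

    utterances = [["budget", "cost", "budget"], ["cost", "price", "budget"],
                  ["demo", "crash", "parser"], ["crash", "bug", "parser"]]
    vocab = sorted({w for u in utterances for w in u})
    ALPHA, PI = 0.5, 0.2          # symmetric Dirichlet prior, prior P(boundary)

    def seg_loglik(words):
        # log P(words) under a Dirichlet-multinomial with symmetric prior ALPHA.
        counts = Counter(words)
        ll = math.lgamma(len(vocab) * ALPHA) - math.lgamma(len(words) + len(vocab) * ALPHA)
        for w in vocab:
            ll += math.lgamma(counts[w] + ALPHA) - math.lgamma(ALPHA)
        return ll

    def segments(boundaries):
        # Split the utterance sequence into segments at active boundaries.
        segs, cur = [], []
        for i, utt in enumerate(utterances):
            cur.extend(utt)
            if i == len(utterances) - 1 or boundaries[i]:
                segs.append(cur)
                cur = []
        return segs

    def joint_loglik(boundaries):
        ll = sum(seg_loglik(s) for s in segments(boundaries))
        ll += sum(math.log(PI) if b else math.log(1 - PI) for b in boundaries)
        return ll

    random.seed(0)
    boundaries = [random.random() < PI for _ in range(len(utterances) - 1)]
    for sweep in range(50):                       # Gibbs sweeps
        for i in range(len(boundaries)):
            lls = []
            for val in (False, True):             # score both settings of this boundary
                boundaries[i] = val
                lls.append(joint_loglik(boundaries))
            p_true = 1.0 / (1.0 + math.exp(lls[0] - lls[1]))
            boundaries[i] = random.random() < p_true
    print("sampled boundaries after utterances:",
          [i for i, b in enumerate(boundaries) if b])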


European Conference on Principles of Data Mining and Knowledge Discovery | 2015

Predicting Emotion Labels for Chinese Microblog Texts

Zheng Yuan; Matthew Purver

We describe an experiment into detecting emotions in texts on the Chinese microblog service Sina Weibo (www.weibo.com) using distant supervision via various author-supplied emotion labels (emoticons and smilies). Existing word segmentation tools proved unreliable; better accuracy was achieved using character-based features. Higher-order n-grams proved to be useful features. Accuracy varied according to label and emotion: while smilies are used more often, emoticons are more reliable. Happiness is the most accurately predicted emotion, with accuracies around 90% on both distant and gold-standard labels. This approach works well and achieves high accuracies for happiness and anger, while it is less effective for sadness, surprise, disgust and fear, which are also difficult for human annotators to detect.
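
A compact sketch of the distant-supervision setup: derive a label from an author-supplied marker, strip the marker from the text, and train on character n-gram features so that no word segmenter is needed. The marker-to-emotion mapping and toy posts below are assumptions for illustration, not the paper's label set or data.

    # Sketch of distant supervision for emotion labels: the marker supplies the
    # label, is removed from the text, and a classifier is trained on character
    # n-grams (avoiding unreliable word segmentation). Mapping and posts are
    # illustrative only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    marker_to_emotion = {"[哈哈]": "happiness", "[怒]": "anger"}   # assumed mapping

    raw_posts = ["今天真开心 [哈哈]", "又加班到深夜 [怒]",
                 "周末去公园玩 [哈哈]", "排了两小时的队 [怒]"]

    texts, labels = [], []
    for post in raw_posts:
        for marker, emotion in marker_to_emotion.items():
            if marker in post:
                texts.append(post.replace(marker, "").strip())   # strip the marker
                labels.append(emotion)                           # distant label
                break

    # Character unigrams to trigrams as features, so no word segmenter is needed.
    model = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(1, 3)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    print(model.predict(["今天玩得很开心"]))    # likely: ['happiness']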

Collaboration


Dive into Matthew Purver's collaborations.

Top Co-Authors


Patrick G. T. Healey

Queen Mary University of London


Geraint A. Wiggins

Queen Mary University of London


Christine Howes

Queen Mary University of London


Ronnie Cann

University of Edinburgh
