Maite Taboada
Simon Fraser University
Publication
Featured research published by Maite Taboada.
Computational Linguistics | 2011
Maite Taboada; Julian Brooke; Milan Tofiloski; Kimberly D. Voll; Manfred Stede
We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text's opinion towards its main subject matter. We show that SO-CAL's performance is consistent across domains and on completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.
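As a rough illustration of the lexicon-based idea described in this abstract, the following Python sketch scores a text from a word dictionary, with simple intensification and negation. It is not SO-CAL itself: the dictionary entries, the intensifier weights, and the polarity-flip treatment of negation are all assumptions made for the example, since the abstract does not specify them.

```python
# Toy dictionaries: every word and score here is invented for illustration
# and is NOT taken from the SO-CAL dictionaries.
LEXICON = {"good": 3.0, "great": 4.0, "bad": -3.0, "terrible": -5.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}  # assumed multipliers
NEGATORS = {"not", "never"}


def score_text(text):
    """Sum the semantic orientation of every lexicon word in the text."""
    tokens = text.lower().split()
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        j = i - 1
        # An intensifier directly before the word scales its strength.
        if j >= 0 and tokens[j] in INTENSIFIERS:
            value *= INTENSIFIERS[tokens[j]]
            j -= 1
        # Assumption: a negator before the (intensified) word flips polarity.
        if j >= 0 and tokens[j] in NEGATORS:
            value = -value
        total += value
    return total


def classify(text):
    """Label a text positive if its total score is above zero."""
    return "positive" if score_text(text) > 0 else "negative"
```

With these toy values, `classify("great acting but terrible plot")` yields a negative label because the stronger negative word outweighs the positive one. A text with a score of exactly zero is labelled negative here, which is an arbitrary choice for the sketch.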
Discourse Studies | 2006
Maite Taboada; William C. Mann
Rhetorical Structure Theory has enjoyed continuous attention since its origins in the 1980s. It has been applied, compared to other approaches, and also criticized in a number of areas in discourse analysis, theoretical linguistics, psycholinguistics, and computational linguistics. In this article, we review some of the discussions about the theory itself, especially addressing issues of the reliability of analyses and psychological validity, together with a discussion of the nature of text relations. We also propose areas for further research. A follow-up article (Taboada and Mann, forthcoming) will discuss applications of the theory in various fields.
Discourse Studies | 2006
Maite Taboada; William C. Mann
Rhetorical Structure Theory (RST) is a theory of text organization that has led to areas of application beyond discourse analysis and text generation, its original goals. In this article, we review the most important applications in several areas: discourse analysis, theoretical linguistics, psycholinguistics, and computational linguistics. We also provide a list of resources useful for work within the RST framework. The present article is a complement to our review of the theoretical aspects of the theory (Taboada and Mann, 2006).
Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2009
Maite Taboada; Julian Brooke; Manfred Stede
We present a taxonomy and classification system for distinguishing between different types of paragraphs in movie reviews: formal vs. functional paragraphs and, within the latter, between description and comment. The classification is used for sentiment extraction, achieving improvement over a baseline without paragraph classification.
Meeting of the Association for Computational Linguistics | 2009
Milan Tofiloski; Julian Brooke; Maite Taboada
We present a syntactic and lexically based discourse segmenter (SLSeg) that is designed to avoid the common problem of over-segmenting text. Segmentation is the first step in a discourse parser, a system that constructs discourse trees from elementary discourse units. We compare SLSeg to a probabilistic segmenter, showing that a conservative approach increases precision at the expense of recall, while retaining a high F-score across both formal and informal texts.
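The precision/recall trade-off mentioned in this abstract can be made concrete with a small Python sketch. It computes precision, recall, and the balanced F-score (F1) for a set of predicted segment boundaries against a gold standard. The function name and the representation of boundaries as token offsets are hypothetical choices for the example, not part of SLSeg or its evaluation.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted boundary positions against gold-standard ones.

    Boundaries are represented as token offsets (an assumption of this
    sketch); only exact matches count as correct.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example data: a conservative segmenter proposes few boundaries.
gold_boundaries = [5, 12, 20, 27]
conservative = [5, 20]
print(precision_recall_f1(conservative, gold_boundaries))
```

In this invented example every proposed boundary is correct, so precision is perfect, while half of the gold boundaries are missed, which lowers recall.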
Text - Interdisciplinary Journal for the Study of Discourse | 2011
Maite Taboada
Genre, from the systemic functional linguistics point of view, refers to the organization of any speech activity in stages, determined by the overall purpose of the genre and by social conventions. In this paper, the SFL approach to genre and register is applied to the genre of online movie reviews. A corpus analysis shows specific stages in the genre: Descriptive stages (in turn, Subject Matter, Plot, Characters, and Background) and an obligatory Evaluation stage. Each stage is described in detail, in particular its characteristics and placement in the texts. We then turn to lexicogrammatical characteristics of the two main stages, showing that Description and Evaluation can be distinguished from each other using two features: evaluative words and connectives. Evaluation stages contain significantly more evaluative words. In terms of connectives, Description was shown to contain more temporal markers than Evaluation, whereas Evaluation contains more causal markers, indicating a basic distinction between narration (which tends to necessitate more temporal relations) and comment (which makes more use of cause, result, concession, condition, and contrast relations).
Association for Information Science and Technology | 2016
Noa P. Cruz; Maite Taboada; Ruslan Mitkov
Recognizing negative and speculative information is highly relevant for sentiment analysis. This paper presents a machine‐learning approach to automatically detect this kind of information in the review domain. The resulting system works in two steps: in the first pass, negation/speculation cues are identified, and in the second phase the full scope of these cues is determined. The system is trained and evaluated on the Simon Fraser University Review corpus, which is extensively used in opinion mining. The results show how the proposed method outstrips the baseline by as much as roughly 20% in the negation cue detection and around 13% in the scope recognition, both in terms of F1. In speculation, the performance obtained in the cue prediction phase is close to that obtained by a human rater carrying out the same task. In the scope detection, the results are also promising and represent a substantial improvement on the baseline (up by roughly 10%). A detailed error analysis is also provided. The extrinsic evaluation shows that the correct identification of cues and scopes is vital for the task of sentiment analysis.
Corpus Linguistics and Linguistic Theory | 2008
Maite Taboada; Loreley Hadic Zabala
Many efforts in corpora annotation start with segmenting discourse into units of analysis. In this paper, we present a method for deciding on segmentation units within Centering Theory (Grosz et al. 1995). We survey the different existing methods to break down discourse into utterances and discuss the results of a comparison study among them. The contribution of our study is that it was carried out with spoken data and in two different languages (English and Spanish). Our comparison suggests that the best unit of analysis for Centering-based annotation is the finite clause. The final result is a set of guidelines for how to segment discourse for Centering analysis, which is also potentially applicable to other analyses.
North American Chapter of the Association for Computational Linguistics | 2006
Gabriel Murray; Maite Taboada; Steve Renals
This paper investigates the usefulness of prosodic features in classifying rhetorical relations between utterances in meeting recordings. Five rhetorical relations of contrast, elaboration, summary, question and cause are explored. Three training methods - supervised, unsupervised, and combined - are compared, and classification is carried out using support vector machines. The results of this pilot study are encouraging but mixed, with pairwise classification achieving an average of 68% accuracy in discerning between relation pairs using only prosodic features, but multi-class classification performing only slightly better than chance.
Language Resources and Evaluation | 2015
Mikel Iruskieta; Iria da Cunha; Maite Taboada
Explaining why the same passage may have different rhetorical structures when conveyed in different languages remains an open question. Starting from a trilingual translation corpus, this paper aims to provide a new qualitative method for the comparison of rhetorical structures in different languages and to specify why translated texts may differ in their rhetorical structures. To achieve these aims we have carried out a contrastive analysis, comparing a corpus of parallel English, Spanish and Basque texts, using Rhetorical Structure Theory. We propose a method to describe the main linguistic differences among the rhetorical structures of the three languages in the two annotation stages (segmentation and rhetorical analysis). We show a new type of comparison that has important advantages with regard to the quantitative method usually employed: it provides an accurate measurement of inter-annotator agreement, and it pinpoints sources of disagreement among annotators. With the use of this new method, we show how translation strategies affect discourse structure.