Publication


Featured research published by Sara Rosenthal.


International Conference on Computational Linguistics | 2014

SemEval-2014 Task 9: Sentiment Analysis in Twitter

Sara Rosenthal; Alan Ritter; Preslav Nakov; Veselin Stoyanov

We describe the Sentiment Analysis in Twitter task, run as part of SemEval-2014. It is a continuation of last year’s task, which ran successfully as part of SemEval-2013. As in 2013, this was the most popular SemEval task; a total of 46 teams contributed 27 submissions for subtask A (21 teams) and 50 submissions for subtask B (44 teams). This year, we introduced three new test sets: (i) regular tweets, (ii) sarcastic tweets, and (iii) LiveJournal sentences. We further tested on (iv) 2013 tweets, and (v) 2013 SMS messages. The highest F1 score on (i) was achieved by NRC-Canada at 86.63 for subtask A and by TeamX at 70.96 for subtask B.


North American Chapter of the Association for Computational Linguistics | 2015

SemEval-2015 Task 10: Sentiment Analysis in Twitter

Sara Rosenthal; Preslav Nakov; Svetlana Kiritchenko; Saif Mohammad; Alan Ritter; Veselin Stoyanov

In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter. This was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years. This year’s shared task competition consisted of five sentiment prediction subtasks. Two were reruns from previous years: (A) sentiment expressed by a phrase in the context of a tweet, and (B) overall sentiment of a tweet. We further included three new subtasks asking to predict (C) the sentiment towards a topic in a single tweet, (D) the overall sentiment towards a topic in a set of tweets, and (E) the degree of prior polarity of a phrase.


North American Chapter of the Association for Computational Linguistics | 2016

SemEval-2016 Task 4: Sentiment Analysis in Twitter

Preslav Nakov; Alan Ritter; Sara Rosenthal; Fabrizio Sebastiani; Veselin Stoyanov

This paper discusses the fourth year of the “Sentiment Analysis in Twitter Task”. SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. The first two subtasks are reruns from prior years and ask to predict the overall sentiment, and the sentiment towards a topic in a tweet. The three new subtasks focus on two variants of the basic “sentiment classification in Twitter” task. The first variant adopts a five-point scale, which confers an ordinal character to the classification task. The second variant focuses on the correct estimation of the prevalence of each class of interest, a task which has been called quantification in the supervised learning literature. The task continues to be very popular, attracting a total of 43 teams.


Meeting of the Association for Computational Linguistics | 2011

Age Prediction in Blogs: A Study of Style, Content, and Online Behavior in Pre- and Post-Social Media Generations

Sara Rosenthal; Kathleen R. McKeown

We investigate whether wording, stylistic choices, and online behavior can be used to predict the age category of blog authors. Our hypothesis is that significant changes in writing style distinguish pre-social media bloggers from post-social media bloggers. Through experimentation with a range of years, we found that the birth dates of students in college at the time when social media such as AIM, SMS text messaging, MySpace and Facebook first became popular enable accurate age prediction. We also show that internet writing characteristics are important features for age prediction, but that lexical content is also needed to produce significantly more accurate results. Our best results allow for 81.57% accuracy.


IEEE International Conference on Semantic Computing | 2012

Detecting Opinionated Claims in Online Discussions

Sara Rosenthal; Kathleen R. McKeown

This paper explores the automatic detection of sentences that are opinionated claims, in which the author expresses a belief. We use a machine learning based approach, investigating the impact of features such as sentiment and the output of a system that determines committed belief. We train and test our approach on social media, where people often try to convince others of the validity of their opinions. We experiment with two different types of data, drawn from Live Journal web logs and Wikipedia discussion forums. Our experiments show that sentiment analysis is more important in Live Journal, while committed belief is more helpful for Wikipedia. In both corpora, n-grams and part-of-speech features also account for significantly better accuracy. We discuss the ramifications behind these differences.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2015

I Couldn’t Agree More: The Role of Conversational Structure in Agreement and Disagreement Detection in Online Discussions

Sara Rosenthal; Kathleen R. McKeown

Determining when conversational participants agree or disagree is instrumental for broader conversational analysis; it is necessary, for example, in deciding when a group has reached consensus. In this paper, we describe three main contributions. First, we show how different aspects of conversational structure can be used to detect agreement and disagreement in discussion forums. In particular, we exploit information about meta-thread structure and accommodation between participants. Second, we demonstrate the impact of the features using 3-way classification, including sentences expressing disagreement, agreement or neither. Finally, we show how to use a naturally occurring data set with labels derived from the sides that participants choose in debates on createdebate.com. The resulting new agreement corpus, Agreement by Create Debaters (ABCD), is 25 times larger than any prior corpus. We demonstrate that using this data enables us to outperform the same system trained on smaller, previously existing in-domain annotated datasets.


North American Chapter of the Association for Computational Linguistics | 2010

Corpus Creation for New Genres: A Crowdsourced Approach to PP Attachment

Mukund Jha; Jacob Andreas; Kapil Thadani; Sara Rosenthal; Kathleen R. McKeown

This paper explores the task of building an accurate prepositional phrase attachment corpus for new genres while avoiding a large investment in terms of time and money by crowd-sourcing judgments. We develop and present a system to extract prepositional phrases and their potential attachments from ungrammatical and informal sentences and pose the subsequent disambiguation tasks as multiple choice questions to workers from Amazon's Mechanical Turk service. Our analysis shows that this two-step approach is capable of producing reliable annotations on informal and potentially noisy blog text, and this semi-automated strategy holds promise for similar annotation projects in new genres.


Proceedings of the Second Workshop on Language in Social Media | 2012

Detecting Influencers in Written Online Conversations

Or Biran; Sara Rosenthal; Jacob Andreas; Kathleen R. McKeown; Owen Rambow

It has long been established that there is a correlation between the dialog behavior of a participant and how influential he or she is perceived to be by other discourse participants. In this paper we explore the characteristics of communication that make someone an opinion leader and develop a machine learning based approach for the automatic identification of discourse participants that are likely to be influencers in online communication. Our approach relies on identification of three types of conversational behavior: persuasion, agreement/disagreement, and dialog patterns.


North American Chapter of the Association for Computational Linguistics | 2010

Time-Efficient Creation of an Accurate Sentence Fusion Corpus

Kathleen R. McKeown; Sara Rosenthal; Kapil Thadani; Coleman Moore

Sentence fusion enables summarization and question-answering systems to produce output by combining fully formed phrases from different sentences. Yet there is little data that can be used to develop and evaluate fusion techniques. In this paper, we present a methodology for collecting fusions of similar sentence pairs using Amazon's Mechanical Turk, selecting the input pairs in a semi-automated fashion. We evaluate the results using a novel technique for automatically selecting a representative sentence from multiple responses. Our approach allows for rapid construction of a high-accuracy fusion corpus.


International Conference on Computational Linguistics | 2014

Columbia NLP: Sentiment Detection of Sentences and Subjective Phrases in Social Media

Sara Rosenthal; Kathy McKeown; Apoorv Agarwal

We present two supervised sentiment detection systems which were used to compete in SemEval-2014 Task 9: Sentiment Analysis in Twitter. The first system (Rosenthal and McKeown, 2013) classifies the polarity of subjective phrases as positive, negative, or neutral. It is tailored towards online genres, specifically Twitter, through the inclusion of dictionaries developed to capture vocabulary used in online conversations (e.g., slang and emoticons) as well as stylistic features common to social media. The second system (Agarwal et al., 2011) classifies entire tweets as positive, negative, or neutral. It too includes dictionaries and stylistic features developed for social media, several of which are distinct from those in the first system. We use both systems to participate in Subtasks A and B of SemEval-2014 Task 9: Sentiment Analysis in Twitter. We participated for the first time in Subtask B: Message-Level Sentiment Detection by combining the two systems to achieve improved results compared to either system alone.

Collaboration


Dive into Sara Rosenthal's collaborations.

Top Co-Authors

Preslav Nakov

Qatar Computing Research Institute

Jacob Andreas

University of California

Zornitsa Kozareva

Information Sciences Institute
