
Publication


Featured research published by Svetlana Kiritchenko.


North American Chapter of the Association for Computational Linguistics | 2015

SemEval-2015 Task 10: Sentiment Analysis in Twitter

Sara Rosenthal; Preslav Nakov; Svetlana Kiritchenko; Saif Mohammad; Alan Ritter; Veselin Stoyanov

In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter. This was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years. This year’s shared task competition consisted of five sentiment prediction subtasks. Two were reruns from previous years: (A) sentiment expressed by a phrase in the context of a tweet, and (B) overall sentiment of a tweet. We further included three new subtasks asking to predict (C) the sentiment towards a topic in a single tweet, (D) the overall sentiment towards a topic in a set of tweets, and (E) the degree of prior polarity of a phrase.


Journal of the American Medical Informatics Association | 2011

Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010.

Berry de Bruijn; Colin Cherry; Svetlana Kiritchenko; Joel D. Martin; Xiaodan Zhu

Objective: As clinical text mining continues to mature, its potential as an enabling technology for innovations in patient care and clinical research is becoming a reality. A critical part of that process is rigid benchmark testing of natural language processing methods on realistic clinical narrative. In this paper, the authors describe the design and performance of three state-of-the-art text-mining applications from the National Research Council of Canada on evaluations within the 2010 i2b2 challenge.

Design: The three systems perform three key steps in clinical information extraction: (1) extraction of medical problems, tests, and treatments, from discharge summaries and progress notes; (2) classification of assertions made on the medical problems; (3) classification of relations between medical concepts. Machine learning systems performed these tasks using large-dimensional bags of features, as derived from both the text itself and from external sources: UMLS, cTAKES, and Medline.

Measurements: Performance was measured per subtask, using micro-averaged F-scores, as calculated by comparing system annotations with ground-truth annotations on a test set.

Results: The systems ranked high among all submitted systems in the competition, with the following F-scores: concept extraction 0.8523 (ranked first); assertion detection 0.9362 (ranked first); relationship detection 0.7313 (ranked second).

Conclusion: For all tasks, we found that the introduction of a wide range of features was crucial to success. Importantly, our choice of machine learning algorithms allowed us to be versatile in our feature design, and to introduce a large number of features without overfitting and without encountering computing-resource bottlenecks.
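The micro-averaged F-scores above pool counts over all annotations before computing precision and recall. A minimal sketch of that exact-match scoring follows; the span-tuple layout is an illustrative assumption, not the i2b2 annotation format:

```python
def micro_f1(gold, pred):
    """Micro-averaged F-score: pool true positives, false positives,
    and false negatives over all annotations, then compute P, R, F1."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    fp = len(pred_set - gold_set)
    fn = len(gold_set - pred_set)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy annotations as (doc_id, start, end, label) tuples; only exact
# matches on all four fields count as true positives.
gold = [("d1", 0, 5, "problem"), ("d1", 9, 14, "test"), ("d2", 3, 7, "treatment")]
pred = [("d1", 0, 5, "problem"), ("d1", 9, 14, "problem"), ("d2", 3, 7, "treatment")]
print(round(micro_f1(gold, pred), 4))  # 0.6667: 2 exact matches, 1 miss each way
```

Micro-averaging weights every annotation equally, so frequent concept types dominate the score, unlike macro-averaging over types.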


North American Chapter of the Association for Computational Linguistics | 2016

SemEval-2016 Task 6: Detecting Stance in Tweets

Saif M. Mohammad; Svetlana Kiritchenko; Parinaz Sobhani; Xiaodan Zhu; Colin Cherry

Here for the first time we present a shared task on detecting stance from tweets: given a tweet and a target entity (person, organization, etc.), automatic natural language systems must determine whether the tweeter is in favor of the given target, against the given target, or whether neither inference is likely. The target of interest may or may not be referred to in the tweet, and it may or may not be the target of opinion. Two tasks are proposed. Task A is a traditional supervised classification task where 70% of the annotated data for a target is used as training and the rest for testing. For Task B, we use as test data all of the instances for a new target (not used in task A) and no training data is provided. Our shared task received submissions from 19 teams for Task A and from 9 teams for Task B. The highest classification F-score obtained was 67.82 for Task A and 56.28 for Task B. However, systems found it markedly more difficult to infer stance towards the target of interest from tweets that express opinion towards another entity.


International Conference on Computational Linguistics | 2014

NRC-Canada-2014: Detecting Aspects and Sentiment in Customer Reviews

Svetlana Kiritchenko; Xiaodan Zhu; Colin Cherry; Saif M. Mohammad

Reviews depict the sentiments of customers towards various aspects of a product or service. Some of these aspects can be grouped into coarser aspect categories. SemEval-2014 ran a shared task (Task 4) on aspect-level sentiment analysis, in which over 30 teams participated. In this paper, we describe our submissions, which stood first in detecting aspect categories, first in detecting sentiment towards aspect categories, third in detecting aspect terms, and first and second in detecting sentiment towards aspect terms in the laptop and restaurant domains, respectively.


Computational Intelligence | 2015

Using Hashtags to Capture Fine Emotion Categories from Tweets

Saif M. Mohammad; Svetlana Kiritchenko

Detecting emotions in microblogs and social media posts has applications for industry, health, and security. Statistical, supervised automatic methods for emotion detection rely on text that is labeled for emotions, but such data are rare and available for only a handful of basic emotions. In this article, we show that emotion‐word hashtags are good manual labels of emotions in tweets. We also propose a method to generate a large lexicon of word–emotion associations from this emotion‐labeled tweet corpus. This is the first lexicon with real‐valued word–emotion association scores. We begin with experiments for six basic emotions and show that the hashtag annotations are consistent and match with the annotations of trained judges. We also show how the extracted tweet corpus and word–emotion associations can be used to improve emotion classification accuracy in a different nontweet domain.
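The lexicon's real-valued scores measure how strongly each word is associated with each emotion in the hashtag-labeled corpus. A toy sketch using pointwise mutual information, a common choice for such association scores; the counting scheme and example tweets are illustrative assumptions, not the paper's exact procedure:

```python
import math
from collections import Counter, defaultdict

def emotion_lexicon(tweets):
    """Build real-valued word-emotion association scores via PMI from
    tweets labeled by their emotion hashtag (hashtag stripped from text).
    `tweets` is a list of (tokens, emotion) pairs."""
    word_emotion = Counter()  # co-occurrence count of (word, emotion)
    word = Counter()          # count of each word
    emotion = Counter()       # count of each emotion label, per token slot
    n = 0
    for tokens, emo in tweets:
        for w in set(tokens):
            word_emotion[(w, emo)] += 1
            word[w] += 1
            emotion[emo] += 1
            n += 1
    lexicon = defaultdict(dict)
    for (w, e), f_we in word_emotion.items():
        # PMI > 0: word occurs with this emotion more than chance predicts.
        lexicon[w][e] = math.log2(f_we * n / (word[w] * emotion[e]))
    return lexicon

tweets = [
    (["great", "day"], "joy"),
    (["great", "win"], "joy"),
    (["awful", "day"], "sadness"),
]
lex = emotion_lexicon(tweets)
# "great" co-occurs only with joy, so its joy association is positive.
assert lex["great"]["joy"] > 0
```

In practice such counts come from millions of hashtag-labeled tweets, and low-frequency words are filtered before scoring to keep the PMI estimates stable.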


International Conference on Computational Linguistics | 2014

NRC-Canada-2014: Recent Improvements in the Sentiment Analysis of Tweets

Xiaodan Zhu; Svetlana Kiritchenko; Saif M. Mohammad

This paper describes state-of-the-art statistical systems for automatic sentiment analysis of tweets. In a SemEval-2014 shared task (Task 9), our submissions obtained the highest scores in the term-level sentiment classification subtask on both the 2013 and 2014 tweets test sets. In the message-level sentiment classification task, our submissions obtained the highest scores on the LiveJournal blog posts test set, the sarcastic tweets test set, and the 2013 SMS test set. These systems build on our SemEval-2013 sentiment analysis systems (Mohammad et al., 2013), which ranked first in both the term- and message-level subtasks in 2013. Key improvements over the 2013 systems are in the handling of negation. We create separate tweet-specific sentiment lexicons for terms in affirmative contexts and in negated contexts.
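A widely used way to separate affirmative from negated contexts, in the spirit of the lexicon split described above, is to mark every token between a negation word and the next clause-level punctuation. This is a simplified sketch: the negator list and the punctuation-based scope rule are minimal assumptions, not the full system's rules:

```python
import re

NEGATORS = {"no", "not", "never", "cannot"}  # illustrative subset
CLAUSE_END = re.compile(r"[.,:;!?]$")

def mark_negation(tokens):
    """Append _NEG to every token after a negation word (or an n't
    ending) up to the next clause-level punctuation mark, so negated
    occurrences get their own entries in a sentiment lexicon."""
    out, negated = [], False
    for tok in tokens:
        if negated and not CLAUSE_END.search(tok):
            out.append(tok + "_NEG")
        else:
            out.append(tok)
        if tok.lower() in NEGATORS or tok.lower().endswith("n't"):
            negated = True
        if CLAUSE_END.search(tok):
            negated = False  # negation scope ends at the clause boundary
    return out

print(mark_negation(["I", "did", "not", "like", "the", "movie", ",", "but", "ok"]))
# ['I', 'did', 'not', 'like_NEG', 'the_NEG', 'movie_NEG', ',', 'but', 'ok']
```

With tokens marked this way, "good" and "good_NEG" accumulate separate corpus statistics, which is what lets affirmative and negated lexicons diverge.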


Information Processing and Management | 2015

Sentiment, emotion, purpose, and style in electoral tweets

Saif M. Mohammad; Xiaodan Zhu; Svetlana Kiritchenko; Joel D. Martin

Highlights: We automatically compile a dataset of 2012 US presidential election tweets. We annotate the tweets for sentiment, emotion, style, and purpose. We show that the tweets convey negative emotions twice as often as positive ones. We describe two automatic systems that predict emotion and purpose in tweets.

Social media is playing a growing role in elections worldwide. Thus, automatically analyzing electoral tweets has applications in understanding how public sentiment is shaped, tracking public sentiment and polarization with respect to candidates and issues, understanding the impact of tweets from various entities, etc. Here, for the first time, we automatically annotate a set of 2012 US presidential election tweets for a number of attributes pertaining to sentiment, emotion, purpose, and style by crowdsourcing. Overall, more than 100,000 crowdsourced responses were obtained for 13 questions on emotions, style, and purpose. Additionally, we show through an analysis of these annotations that purpose, even though correlated with emotions, is significantly different. Finally, we describe how we developed automatic classifiers, using features from state-of-the-art sentiment analysis systems, to predict emotion and purpose labels, respectively, in new unseen tweets. These experiments establish baseline results for automatic systems on this new data.


ACM Transactions on Internet Technology | 2017

Stance and Sentiment in Tweets

Saif M. Mohammad; Parinaz Sobhani; Svetlana Kiritchenko

We can often detect from a person’s utterances whether he or she is in favor of or against a given target entity—one’s stance toward the target. However, a person may express the same stance toward a target by using negative or positive language. Here for the first time we present a dataset of tweet–target pairs annotated for both stance and sentiment. The targets may or may not be referred to in the tweets, and they may or may not be the target of opinion in the tweets. Partitions of this dataset were used as training and test sets in a SemEval-2016 shared task competition. We propose a simple stance detection system that outperforms submissions from all 19 teams that participated in the shared task. Additionally, access to both stance and sentiment annotations allows us to explore several research questions. We show that although knowing the sentiment expressed by a tweet is beneficial for stance classification, it alone is not sufficient. Finally, we use additional unlabeled data through distant supervision techniques and word embeddings to further improve stance classification.


North American Chapter of the Association for Computational Linguistics | 2015

Sentiment after Translation: A Case-Study on Arabic Social Media Posts

Mohammad Salameh; Saif M. Mohammad; Svetlana Kiritchenko

When text is translated from one language into another, sentiment is preserved to varying degrees. In this paper, we use Arabic social media posts as a stand-in for source-language text, and determine the loss in sentiment predictability when they are translated into English, manually and automatically. As benchmarks, we use manually and automatically determined sentiment labels of the Arabic texts. We show that sentiment analysis of English translations of Arabic texts produces results competitive with direct Arabic sentiment analysis. We discover that even though translation significantly reduces the human ability to recover sentiment, automatic sentiment systems are still able to capture sentiment information from the translations.


Meeting of the Association for Computational Linguistics | 2014

An Empirical Study on the Effect of Negation Words on Sentiment

Xiaodan Zhu; Hongyu Guo; Saif M. Mohammad; Svetlana Kiritchenko

Negation words, such as no and not, play a fundamental role in modifying sentiment of textual expressions. We will refer to a negation word as the negator and the text span within the scope of the negator as the argument. Commonly used heuristics to estimate the sentiment of negated expressions rely simply on the sentiment of argument (and not on the negator or the argument itself). We use a sentiment treebank to show that these existing heuristics are poor estimators of sentiment. We then modify these heuristics to be dependent on the negators and show that this improves prediction. Next, we evaluate a recently proposed composition model (Socher et al., 2013) that relies on both the negator and the argument. This model learns the syntax and semantics of the negator’s argument with a recursive neural network. We show that this approach performs better than those mentioned above. In addition, we explicitly incorporate the prior sentiment of the argument and observe that this information can help reduce fitting errors.
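The heuristics contrasted above can be sketched on a [-1, 1] sentiment scale. All scores and shift values below are hypothetical, chosen only to illustrate how each rule behaves, not figures from the paper:

```python
def flip(arg_score):
    """Baseline heuristic: negation simply reverses the sign of the
    argument's sentiment."""
    return -arg_score

def shift(arg_score, delta=0.6):
    """Fixed-shift heuristic: negation moves sentiment a constant
    amount toward the opposite pole, regardless of which negator."""
    return arg_score - delta if arg_score > 0 else arg_score + delta

def negator_shift(arg_score, negator, deltas):
    """Negator-dependent shift: each negation word carries its own
    offset, the modification shown above to improve prediction."""
    d = deltas[negator]
    return arg_score - d if arg_score > 0 else arg_score + d

# "not terrible" is mildly negative in human judgments, not strongly
# positive, which pure sign-flipping would predict.
terrible = -0.8
print(flip(terrible))                  # 0.8: overshoots to strongly positive
print(round(shift(terrible), 2))       # -0.2: dampened, closer to intuition
print(round(negator_shift(terrible, "not", {"not": 0.5, "never": 0.7}), 2))
```

The paper's stronger result, the recursive neural composition model, has no closed-form rule like these; the point of the sketch is only the contrast between argument-only and negator-aware heuristics.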

Collaboration


Dive into Svetlana Kiritchenko's collaborations.

Top Co-Authors

Xiaodan Zhu, National Research Council
Berry de Bruijn, National Research Council
Colin Cherry, National Research Council
Joel D. Martin, National Research Council
Ida Sim, University of California