Publication


Featured research published by Svitlana Volkova.


Meeting of the Association for Computational Linguistics | 2014

Inferring User Political Preferences from Streaming Communications

Svitlana Volkova; Glen Coppersmith; Benjamin Van Durme

Existing models for social media personal analytics assume access to thousands of messages per user, even though most users author content only sporadically over time. Given this sparsity, we: (i) leverage content from the local neighborhood of a user; (ii) evaluate batch models as a function of size and the amount of messages in various types of neighborhoods; and (iii) estimate the amount of time and tweets required for a dynamic model to predict user preferences. We show that even when limited or no self-authored data is available, language from friend, retweet and user mention communications provides sufficient evidence for prediction. When updating models over time based on Twitter, we find that political preference can often be predicted using roughly 100 tweets, depending on the context of user selection, where this could mean hours, or weeks, based on the author’s tweeting frequency.


PLOS ONE | 2015

Studying User Income through Language, Behaviour and Affect in Social Media.

Daniel Preoţiuc-Pietro; Svitlana Volkova; Vasileios Lampos; Nikolaos Aletras

Automatically inferring user demographics from social media posts is useful for both social science research and a range of downstream applications in marketing and politics. We present the first extensive study where user behaviour on Twitter is used to build a predictive model of income. We apply non-linear methods for regression, i.e. Gaussian Processes, achieving strong correlation between predicted and actual user income. This allows us to shed light on the factors that characterise income on Twitter and analyse their interplay with user emotions and sentiment, perceived psycho-demographics and language use expressed through the topics of their posts. Our analysis uncovers correlations between different feature categories and income. Some reflect common belief (e.g. higher perceived education and intelligence indicate higher earnings) or known differences (e.g. gender and age differences), while others are novel findings (e.g. higher income users express more fear and anger, whereas lower income users express more of the time emotion and opinions).


Cyberpsychology, Behavior, and Social Networking | 2015

On Predicting Sociodemographic Traits and Emotions from Communications in Social Networks and Their Implications to Online Self-Disclosure

Svitlana Volkova

Social media services such as Twitter and Facebook are virtual environments where people express their thoughts, emotions, and opinions and where they reveal themselves to their peers. We analyze a sample of 123,000 Twitter users and 25 million of their tweets to investigate the relation between the opinions and emotions that users express and their predicted psychodemographic traits. We show that the emotions that we express on online social networks reveal deep insights about ourselves. Our methodology is based on building machine learning models for inferring coarse-grained emotions and psychodemographic profiles from user-generated content. We examine several user attributes, including gender, income, political views, age, education, optimism, and life satisfaction. We correlate these predicted demographics with the emotional profiles emanating from user tweets, as captured by Ekman's emotion classification. We find that some users tend to express significantly more joy and significantly less sadness in their tweets, such as those predicted to be in a relationship, with children, or with a higher than average annual income or educational level. Users predicted to be women tend to be more opinionated, whereas those predicted to be men tend to be more neutral. Finally, users predicted to be younger and liberal tend to project more negative opinions and emotions. We discuss the implications of our findings for online privacy concerns and self-disclosure behavior.


IEEE International Conference on Data Science and Advanced Analytics | 2015

Using emotions to predict user interest areas in online social networks

Yoad Lewenberg; Svitlana Volkova

We examine the relation between the emotions users express on social networks and their perceived areas of interest, based on a sample of Twitter users. Our methodology relies on training machine learning models to classify the emotions expressed in tweets, according to Ekman's six high-level emotions. We then used raters, sourced from Amazon's Mechanical Turk, to examine several Twitter profiles and to determine whether the profile owner is interested in various areas, including sports, movies, technology and computing, politics, news, economics, science, arts, health and religion. We find that the propensity of a user to express various emotions correlates with their perceived degree of interest in various areas. We present several models that use the emotional distribution of a Twitter user, as reflected by their tweets, to predict whether they are interested or disinterested in a topic or to determine their degree of interest in a topic.


Meeting of the Association for Computational Linguistics | 2016

Inferring Perceived Demographics from User Emotional Tone and User-Environment Emotional Contrast.

Svitlana Volkova

We examine communications in a social network to study user emotional contrast – the propensity of users to express different emotions than those expressed by their neighbors. Our analysis is based on a large Twitter dataset, consisting of the tweets of 123,513 users from the USA and Canada. Focusing on Ekman’s basic emotions, we analyze differences between the emotional tone expressed by these users and their neighbors of different types, and correlate these differences with perceived user demographics. We demonstrate that many perceived demographic traits correlate with the emotional contrast between users and their neighbors. Unlike other approaches on inferring user attributes that rely solely on user communications, we explore the network structure and show that it is possible to accurately predict a range of perceived demographic traits based solely on the emotions emanating from users and their neighbors.


Meeting of the Association for Computational Linguistics | 2017

Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter.

Svitlana Volkova; Kyle Shaffer; Jin Yea Jang; Nathan O. Hodas

Pew Research polls report that 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Barthel et al., 2016). Fabricated stories in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news: satire, hoaxes, clickbait and propaganda. We show that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models does not improve performance. Incorporating linguistic features improves classification results; however, social interaction features are most informative for finer-grained separation between four types of suspicious news posts.


Web Intelligence | 2010

Boosting Biomedical Entity Extraction by Using Syntactic Patterns for Semantic Relation Discovery

Svitlana Volkova; Doina Caragea; William H. Hsu; John Drouhard; Landon Fowles

Biomedical entity extraction from unstructured web documents is an important task that needs to be performed in order to discover knowledge in the veterinary medicine domain. In general, this task can be approached by applying domain specific ontologies, but a review of the literature shows that there is no universal dictionary, or ontology for this domain. To address this issue, we manually construct an ontology for extracting entities such as: animal disease names, viruses and serotypes. We then use an automated ontology expansion approach to extract semantic relationships between concepts. Such relationships include asserted synonymy, hyponymy and causality. Specifically, these relationships are extracted by using a set of syntactic patterns and part-of-speech tagging. The resulting ontology contains richer semantics compared to the manually constructed ontology. We compare our approach for extracting synonyms, hyponyms and other disease related concepts, with an approach where the ontology is expanded using GoogleSets, on the veterinary medicine entity extraction task. Experimental results show that our semantic relationship extraction approach produces a significant increase in precision and recall as compared to the GoogleSets approach.


Intelligence and Security Informatics | 2010

Computational knowledge and information management in veterinary epidemiology

Svitlana Volkova; William H. Hsu

Monitoring of infectious animal diseases is an essential task for national biosecurity management and bioterrorism prevention. For this purpose, we present a system for animal disease outbreak analysis by automatically extracting relational information from online data. We aim to detect and map infectious disease outbreaks by extracting information from unstructured sources. The system crawls web sites and classifies pages by topical relevance. The information extraction component performs document analysis for animal disease related event recognition. The visualization component plots extracted events into GoogleMaps using geospatial information and supports timeline representation of animal disease outbreaks in SIMILE.


North American Chapter of the Association for Computational Linguistics | 2015

Social Media Predictive Analytics.

Svitlana Volkova; Benjamin Van Durme; David Yarowsky

The recent explosion of social media services like Twitter, Google+ and Facebook has led to an interest in social media predictive analytics: automatically inferring hidden information from the large amounts of freely available content. It has a number of applications, including online targeted advertising, personalized marketing, large-scale passive polling and real-time live polling, personalized recommendation systems and search, and real-time healthcare analytics. In this tutorial, we will describe how to build a variety of social media predictive analytics for inferring latent user properties from a Twitter network, including demographic traits, personality, interests, emotions and opinions. Our methods will address several important aspects of social media, such as the dynamic, streaming nature of the data, multi-relationality in social networks, data collection and annotation biases, data and model sharing, generalization of the existing models, data drift, and scalability to other languages. We will start with an overview of the existing approaches for social media predictive analytics. We will describe the state-of-the-art static (batch) models and features. We will then present models for streaming (online) inference from single and multiple data streams, and formulate a latent attribute prediction task as a sequence-labeling problem. Finally, we present several techniques for dynamic (iterative) learning and prediction using an active learning setup with rationale annotation and filtering. The tutorial will conclude with a practice session focusing on walk-through examples for predicting latent user properties, e.g., political preferences, income, education level, life satisfaction and emotions emanating from user communications on Twitter.


Social Informatics | 2016

Contrasting Public Opinion Dynamics and Emotional Response During Crisis

Svitlana Volkova; Ilia Chetviorkin; Dustin Arendt; Benjamin Van Durme

We propose an approach for contrasting spatiotemporal dynamics of public opinions expressed toward targeted entities, also known as the stance detection task, in Russia and Ukraine during crisis. Our analysis relies on a novel corpus constructed from posts on the VKontakte social network, centered on local public opinion of the ongoing Russian-Ukrainian crisis, along with newly annotated resources for predicting expressions of fine-grained emotions including joy, sadness, disgust, anger, surprise and fear. Akin to prior work on sentiment analysis, we align traditional public opinion polls with aggregated automatic predictions of sentiments for contrastive geo-locations. We report interesting observations on emotional response and stance variations across geo-locations. Some of our findings contradict stereotypical misconceptions imposed by media; for example, we found posts from Ukraine that do not support Euromaidan but support Putin, and posts from Russia that are against Putin but in favor of the USA. Furthermore, we are the first to demonstrate contrastive stance variations over time across geo-locations using a storyline visualization technique (the storyline visualization is available at http://www.cs.jhu.edu/~svitlana/).

Collaboration

Top Co-Authors

Dustin Arendt
Pacific Northwest National Laboratory

Eric B. Bell
Pacific Northwest National Laboratory

David Yarowsky
Johns Hopkins University

Theresa Wilson
Johns Hopkins University

Maria Glenski
University of Notre Dame

Alireza Karduni
University of North Carolina at Charlotte