Featured Research

Social And Information Networks

Characterizing Communities of Hashtag Usage on Twitter During the 2020 COVID-19 Pandemic by Multi-view Clustering

The COVID-19 pandemic has produced a flurry of online activity on social media sites. As such, analysis of social media data during the COVID-19 pandemic can produce unique insights into discussion topics and how those topics evolve over the course of the pandemic. In this study, we propose analyzing discussion topics on Twitter by clustering hashtags. In order to obtain high-quality clusters of the Twitter hashtags, we also propose a novel multi-view clustering technique that incorporates multiple data types that describe how users interact with hashtags. The results of our multi-view clustering show that there are distinct temporal and topical trends present within COVID-19 Twitter discussion. In particular, we find that some topical clusters of hashtags shift over the course of the pandemic, while others are persistent throughout, and that there are distinct temporal trends in hashtag usage. This study is the first to use multi-view clustering to analyze hashtags and the first analysis of the greater trends of discussion occurring online during the COVID-19 pandemic.

Social And Information Networks

Characterizing Discourse about COVID-19 Vaccines: A Reddit Version of the Pandemic Story

It has been one year since the outbreak of the COVID-19 pandemic. The good news is that vaccines developed by several manufacturers are being actively distributed worldwide. However, as more and more vaccines become available to the public, various concerns related to vaccines become the primary barriers that may hinder the public from getting vaccinated. Considering the complexities of these concerns and their potential hazards, this study aims to offer a clear understanding of the underlying concerns of different population groups, particularly those active on Reddit, when they talk about COVID-19 vaccines. The goal is achieved by applying LDA and LIWC to characterize the pertinent discourse, with insights generated through a combination of quantitative and qualitative comparisons. Findings include: 1) during the pandemic, the proportion of Reddit comments dominated by conspiracy theories exceeded that of any other topic; 2) each subreddit has its own user base, so information posted in one subreddit may not reach users of other subreddits; 3) since users' concerns vary across time and subreddits, communication strategies must be adjusted according to specific needs. The results of this study reveal challenges as well as opportunities in the process of designing effective communication and immunization programs.

Social And Information Networks

Characterizing Online Vandalism: A Rational Choice Perspective

What factors influence the decision to vandalize? Although the harm is clear, the benefit to the vandal is less clear. In many cases, the thing being damaged may itself be something the vandal uses or enjoys. Vandalism holds communicative value: perhaps to the vandal themselves, to some audience at whom the vandalism is aimed, and to the general public. Viewing vandals as rational community participants despite their antinormative behavior offers the possibility of engaging with or countering their choices in novel ways. Rational choice theory (RCT) as applied in value expectancy theory (VET) offers a strategy for characterizing behaviors in a framework of rational choices, and begins with the supposition that subject to some weighting of personal preferences and constraints, individuals maximize their own utility by committing acts of vandalism. This study applies the framework of RCT and VET to gain insight into vandals' preferences and constraints. Using a mixed-methods analysis of Wikipedia, I combine social computing and criminological perspectives on vandalism to propose an ontology of vandalism for online content communities. I use this ontology to categorize 141 instances of vandalism and find that the character of vandalistic acts varies by vandals' relative identifiability, policy history with Wikipedia, and the effort required to vandalize.

Social And Information Networks

Characterizing Sociolinguistic Variation in the Competing Vaccination Communities

Public health practitioners and policy makers grapple with the challenge of devising effective message-based interventions for debunking public health misinformation in cyber communities. "Framing" and "personalization" of the message are among the key features for devising a persuasive messaging strategy. For effective health communication, it is imperative to focus on "preference-based framing", where the preferences of the target sub-community are taken into consideration. To achieve that, it is important to understand and hence characterize the target sub-communities in terms of their social interactions. In the context of health-related misinformation, vaccination remains the most prevalent topic of discord. Hence, in this paper, we conduct a sociolinguistic analysis of the two competing vaccination communities on Twitter: "pro-vaxxers", or individuals who believe in the effectiveness of vaccinations, and "anti-vaxxers", or individuals who are opposed to vaccinations. Our data analysis shows significant linguistic variation between the two communities in terms of their usage of linguistic intensifiers, pronouns, and uncertainty words. Our network-level analysis shows significant differences between the two communities in terms of their network density, echo-chamberness, and the EI index. We hypothesize that these sociolinguistic differences can be used as proxies to characterize and understand these communities to devise better message interventions.
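
As a rough illustration of one of the network measures named above: the abstract does not say how the EI index is computed, so the sketch below assumes the commonly used External-Internal formulation, (E - I) / (E + I), where E counts ties between groups and I counts ties within a group. The function name `ei_index`, the edge list, and the group labels are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Iterable, Mapping, Tuple


def ei_index(
    edges: Iterable[Tuple[Hashable, Hashable]],
    group: Mapping[Hashable, Hashable],
) -> float:
    """Return (E - I) / (E + I) for an undirected edge list.

    Assumed definition: E is the number of edges whose endpoints belong to
    different groups, I the number whose endpoints share a group. The value
    ranges from -1 (all ties internal) to +1 (all ties external).
    """
    external = 0
    internal = 0
    for u, v in edges:
        if group[u] == group[v]:
            internal += 1
        else:
            external += 1
    total = external + internal
    if total == 0:
        raise ValueError("E-I index is undefined for a graph with no edges")
    return (external - internal) / total


# Hypothetical toy network: two accounts per community.
toy_group = {"a": "pro", "b": "pro", "c": "anti", "d": "anti"}
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
toy_score = ei_index(toy_edges, toy_group)
```

Under this assumed definition, a strongly negative score means most ties stay inside a community, which is one way to quantify echo-chamber-like structure.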

Social And Information Networks

Chasm in Hegemony: Explaining and Reproducing Disparities in Homophilous Networks

In networks with a minority and a majority community, it is well established that minorities are under-represented at the top of the social hierarchy. However, researchers are less clear about the representation of minorities at the lower levels of the hierarchy, where other disadvantages or vulnerabilities may exist. We offer a more complete picture of social disparities at each social level with empirical evidence that the minority representation exhibits two opposite phases: at the higher rungs of the social ladder, the representation of the minority community decreases; but, lower in the ladder, which is more populous, as you ascend, the representation of the minority community improves. We refer to this opposing phenomenon between the upper and lower levels as the chasm effect. Previous models of network growth with homophily fail to detect and explain the presence of this chasm effect. We analyze the interactions among a few well-observed network-growing mechanisms with a simple model to reveal the sufficient and necessary conditions for both phases in the chasm effect to occur. By generalizing the simple model naturally, we present a complete bi-affiliation bipartite network-growth model that could successfully capture disparities at all social levels and reproduce real social networks. Finally, we illustrate that addressing the chasm effect can create fairer systems with two applications in advertisement and fact-checks, thereby demonstrating the potential impact of the chasm effect on future research on minority-majority disparities and fair algorithms.

Social And Information Networks

Client Network: An Interactive Model for Predicting New Clients

Understanding prospective clients becomes increasingly important as companies aim to enlarge their market bases. Traditional approaches typically treat each client in isolation, either studying its interactions or similarities with existing clients. We propose the Client Network, which considers the entire client ecosystem to predict the success of sales pitches for targeted clients using complex network analysis. It combines a novel ranking algorithm with data visualization and navigation. Based on historical interaction data between companies and clients, the Client Network leverages organizational connectivity to locate the optimal paths to prospective clients. The user interface supports exploring the client ecosystem and performing sales-essential tasks. Our experiments and user interviews demonstrate the effectiveness of the Client Network and its success in supporting sellers' day-to-day tasks.

Social And Information Networks

Clinical trial of an AI-augmented intervention for HIV prevention in youth experiencing homelessness

Youth experiencing homelessness (YEH) are subject to substantially greater risk of HIV infection, compounded both by their lack of access to stable housing and the disproportionate representation of youth of marginalized racial, ethnic, and gender identity groups among YEH. A key goal for health equity is to improve adoption of protective behaviors in this population. One promising strategy for intervention is to recruit peer leaders from the population of YEH to promote behaviors such as condom usage and regular HIV testing to their social contacts. This raises a computational question: which youth should be selected as peer leaders to maximize the overall impact of the intervention? We developed an artificial intelligence system to optimize such social network interventions in a community health setting. We conducted a clinical trial enrolling 713 YEH at drop-in centers in a large US city. The clinical trial compared interventions planned with the algorithm to those where the highest-degree nodes in the youths' social network were recruited as peer leaders (the standard method in public health) and to an observation-only control group. Results from the clinical trial show that youth in the AI group experience statistically significant reductions in key risk behaviors for HIV transmission, while those in the other groups do not. This provides, to our knowledge, the first empirical validation of the usage of AI methods to optimize social network interventions for health. We conclude by discussing lessons learned over the course of the project which may inform future attempts to use AI in community-level interventions.

Social And Information Networks

Clustering Network Tree Data From Respondent-driven sampling with application to opioid users in New York City

There is great interest in finding meaningful subgroups of attributed network data. There are many available methods for clustering complete networks. Unfortunately, much network data is collected through sampling, and is therefore incomplete. Respondent-driven sampling (RDS) is a widely used method for sampling hard-to-reach human populations based on tracing links in the underlying unobserved social network. The resulting data therefore have a tree structure representing a sub-sample of the network, along with many nodal attributes. In this paper, we introduce an approach to adjust mixture models for general network clustering to samples obtained by RDS. We apply our model to data on opioid users in New York City, and detect communities reflecting group characteristics of interest for intervention activities, including drug use patterns, social connections, and other community variables.

Social And Information Networks

Clustering of Social Media Messages for Humanitarian Aid Response during Crisis

Social media has quickly grown into an essential tool for people to communicate and express their needs during crisis events. Prior work in analyzing social media data for crisis management has focused primarily on automatically identifying actionable (or informative) crisis-related messages. In this work, we show that recent advances in Deep Learning and Natural Language Processing outperform prior approaches for the task of classifying informativeness, and we encourage the field to adopt them for research or even deployment. We also extend these methods to two sub-tasks of informativeness and find that the Deep Learning methods are effective here as well.

Social And Information Networks

CoVaxxy: A Collection of English-language Twitter Posts About COVID-19 Vaccines

With a substantial proportion of the population currently hesitant to take the COVID-19 vaccine, it is important that people have access to accurate information. However, there is a large amount of low-credibility information about vaccines spreading on social media. In this paper, we present the CoVaxxy dataset, a growing collection of English-language Twitter posts about COVID-19 vaccines. Using one week of data, we provide statistics regarding the numbers of tweets over time, the hashtags used, and the websites shared. We also illustrate how these data might be utilized by performing an analysis of the prevalence over time of high- and low-credibility sources, topic groups of hashtags, and geographical distributions. Additionally, we develop and present the CoVaxxy dashboard, allowing people to visualize the relationship between COVID-19 vaccine adoption and U.S. geo-located posts in our dataset. This dataset can be used to study the impact of online information on COVID-19 health outcomes (e.g., vaccine uptake) and our dashboard can help with exploration of the data.
