In the Eyes of the Beholder: Sentiment and Topic Analyses on Social Media Use of Neutral and Controversial Terms for COVID-19
Long Chen, Tongyu Yang, Jiebo Luo
Department of Computer Science, University of Rochester, Rochester, USA
{lchen62,tyang20}@u.rochester.edu, [email protected]

Hanjia Lyu
Goergen Institute for Data Science, University of Rochester, Rochester, USA
[email protected]

Yu Wang
Political Science, University of Rochester, Rochester, USA
[email protected]
Abstract—During the COVID-19 pandemic, "Chinese Virus" emerged as a controversial term for coronavirus. To some, it may seem like a neutral term referring to the physical origin of the virus. To many others, however, the term in fact attaches ethnicity to the virus. In this paper, we attempt to shed light on the term's real-world usage on Twitter. Using sentiment feature analysis and topic modeling, we reveal substantial differences between the use of controversial terms such as "Chinese virus" and that of non-controversial terms such as "COVID-19". For example, tweets using controversial terms contain a higher percentage of anger as well as negative emotions. They also point to China more frequently. Our results suggest that while the term "Chinese virus" could be interpreted either as neutral or racist, its usage on social media leans strongly towards the latter.
Index Terms—COVID-19, Topic Modeling, LIWC2015, Twitter, Controversial Term, Social Media
I. INTRODUCTION
Starting in late 2019, the COVID-19 pandemic has rapidly impacted over 200 countries, areas, and territories. As of April 19, according to the World Health Organization (WHO), 2,241,359 COVID-19 cases were confirmed worldwide, with 152,551 deaths. This disease has had a tremendous impact on the daily lives of the world's populations.

In light of the deteriorating situation in the United States, discussion of the pandemic on social media has drastically increased since March 2020. Within these discussions, an overwhelming trend is the use of controversial terms targeting the Asian and, specifically, Chinese population, insinuating that the virus originated in China. On March 16, the President of the United States, Donald Trump, posted on Twitter calling COVID-19 the Chinese Virus (https://twitter.com/realdonaldtrump/status/1239685852093169664). Around March 18, media coverage of the term Chinese Flu also took off (https://blog.gdeltproject.org/is-it-coronavirus-or-covid-19-or-chinese-flu-the-naming-of-a-pandemic/). Although most public figures who used the controversial terms claimed them to be non-discriminative, such terms have stimulated racism and discrimination against Asian-Americans in the US, as reported by the New York Times, the Washington Post, the Guardian, and other mainstream news media.

A recent work used social media data to characterize users who used controversial or non-controversial terms associated with COVID-19, and found associations between demographics, user-level features, political following status, and geo-location attributes and the use of controversial terms [6]. In this study, we analyze, from a language perspective, crawled tweets (Twitter posts) with and without controversial terms associated with COVID-19. To operationalize this idea, we perform two investigations. First, latent Dirichlet allocation (LDA) [1] is applied to extract the topics in controversial and non-controversial posts.
Next, LIWC2015 (Linguistic Inquiry and Word Count 2015) [8] is applied to build multi-dimensional profiles of the posts. We then compare the topics and profiles presented in the controversial and non-controversial posts, investigating any association between the use of controversial terms and the underlying mindsets.

II. RELATED WORK
Our work builds upon previous work on text mining using data from social media during influential events.

Studies have been conducted using topic modeling, the process of identifying topics in a collection of documents. The commonly used model, latent Dirichlet allocation (LDA), provides a way to automatically detect a given number of hidden topics [1]. Previous research has inferred topics on social media. Kim et al. [5] investigated the topic coverage and sentiment dynamics on Twitter and in the news press regarding the issue of Ebola. Chen et al. [2] found LDA-generated topics from e-cigarette related posts on

TABLE I
TOPICS GENERATED BY THE LDA MODEL FOR BOTH CONTROVERSIAL AND NON-CONTROVERSIAL TWEETS.

Classification     Topics            Top 10 Topic Words
Controversial      Racism            call chinese virus stop racist people let kill pandemic infect
                   Anecdote          virus chinese people say world covid would death call take
                   Conspiracy        chinese spread virus be send full conspiracy exactly theory vid
                   Work in hospital  get good chinese test fight virus hospital need work last
                   Blame the lie     chinese virus must people lie blame tweet fact need die
Non-controversial  Test cases        case test covid death new day positive patient break number
                   Anecdote          tell virus covid man story spread free say try government
                   Trump             say people know get trump could make would
                   Health workers    help need covid crisis health fight worker work government pandemic
                   Stay home         people virus home stay corona take get die

Note: All appearances of "Chinese virus" related keywords were removed prior to the LDA process. Bigrams and trigrams were included in the LDA model; none of them appears in the top 10 topic words due to infrequency. Some topics contain fewer than 10 topic words due to the deletion of short (less than 3 characters) words.
Reddit to identify potential associations between e-cigarette use and various self-reported health symptoms. Wang et al. [11] applied negative binomial regression to abstract LDA topics to model the likes on Trump's Twitter and infer topic preferences among followers.

A large number of studies have used LIWC, an API for the linguistic analysis of documents. Tumasjan et al. [10] used LIWC to capture political sentiment and predict elections with Twitter. The API was also used by Zhang et al. [13] to provide insights into the sentiment of the descriptions of crowdfunding campaigns. Our motivation is to combine qualitative analysis with LDA and quantitative analysis with LIWC, and to comparatively investigate discrepancies between the tweets that use controversial terms associated with COVID-19 and the tweets that use non-controversial terms.

III. DATA AND METHODOLOGY
The related tweets (Twitter posts) were crawled with the Tweepy API using keyword filtering in reference to [6]. Simultaneous streams were collected to build the controversial dataset (CD) and the non-controversial dataset (ND) from March 23 to April 5. The controversial keywords include "Chinese virus"; the full keyword lists follow [6].

The LDA model was tuned using the C_v coherence measure, which is based on a sliding window, a one-set segmentation of the top words, and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and cosine similarity. In the end, we set the number of topics to 5 and each topic's number of words to 10. Since the objective of topic modeling is to find what people talk about when using controversial or non-controversial terms, we also masked all appearances of the aforementioned streaming keywords by deleting them from the documents. Next, bigrams and trigrams were applied to the documents. We performed a qualitative, comparative analysis to find differences and similarities between the topics generated from the two datasets.

Next, LIWC2015 (https://liwc.wpengine.com/) was applied to extract the sentiment of the tweets of CD and ND. LIWC2015 is a dictionary-based linguistic analysis tool that counts the percentage of words that reflect different emotions, thinking styles, and social concerns, and captures people's psychological states. We focused on 4 summary linguistic variables and 12 more detailed variables that reflect the psychological states, cognition, drives, time orientation, and personal concerns of the Twitter users of CD and ND. Following a methodology similar to Yu et al. [12], we concatenated all tweets posted by users of CD and ND, respectively: one text sample was composed of all the tweets from the sampled dataset of CD, and the other of all the tweets from that of ND. We then applied LIWC2015 to analyze these two text samples, yielding 16 linguistic variables for CD and ND, respectively.

IV. RESULTS
A. Topic Modeling Results
In Table I, the 5 topics generated by LDA on CD and ND are reported, respectively, together with the top 10 topic words. We manually assigned each topic a name to summarize what would most likely be discussed under it.

Comparing across CD and ND, we observe that the topics in CD contain more opinions: 3 out of 5 topics have very strong opinion-related topic words. In the topic "blame the lie," a very strong signal about lying and blame is present, as well as an indication of some correlation between "lie" and people "die". In contrast, the generated topics in ND are more related to fighting the pandemic (n=3), stories (n=1), and government (n=1), which are all related to COVID-19 in the US. No strong opinion-related keywords could be found.

One finding is that all 5 topics in CD contain the topic word "chinese," even though we removed all keywords related to "Chinese virus" from the documents before LDA. This suggests that discussions in CD are closely related to China or the Chinese people/government. In addition, all 5 topics in CD contain the topic word "virus," whereas only 2 of the 5 topics in ND do. This difference suggests that in CD, discussions are more related to the virus itself, while in ND, discussions are more related to the pandemic caused by the virus. In fact, only one topic in CD is about how people work towards containing the pandemic ("work in hospital"), whereas 3 topics in ND discuss measures to relieve the pandemic ("test cases," "health workers," and "stay home"). These discrepancies in the topic modeling results contradict the claim of "only referring to the geo-locational origin of the pandemic" made by some public figures who used "Chinese virus" when referring to COVID-19. Nevertheless, such words have provoked, to a certain degree, racist or xenophobic opinions and hate speech towards China or people of Chinese ethnicity on social media.
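As a concrete illustration of the preprocessing behind these topics, the keyword masking, short-word removal, and bigram steps from Section III can be sketched in plain Python. This is a minimal sketch, not the paper's actual pipeline: the keyword list and example tweet are hypothetical, and in practice a library such as gensim (`Phrases` for the n-grams, `LdaModel` for the 5-topic fit) would be used on the full corpus.

```python
import re

# Hypothetical streaming keywords to mask before LDA (the paper deletes
# all appearances of the query terms from the documents).
STREAM_KEYWORDS = ["chinese virus", "covid-19", "covid19", "coronavirus"]

def preprocess(tweet, keywords=STREAM_KEYWORDS, min_len=3):
    """Lowercase, mask streaming keywords, tokenize, drop short words,
    and append bigrams, mirroring the steps described in Section III."""
    text = tweet.lower()
    for kw in keywords:
        text = text.replace(kw, " ")          # mask the query terms
    tokens = re.findall(r"[a-z']+", text)
    tokens = [t for t in tokens if len(t) >= min_len]  # drop short words
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams

doc = preprocess("The Chinese virus is spreading, stay home and stay safe")
```

The resulting documents (unigrams plus joined bigrams, with the query terms removed) are what would be fed to the LDA model.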
Furthermore, hate speech can spread extremely fast on online social media platforms and can stay online for a long time [3]. Gagliardone et al. [3] found that such speech is also itinerant: even when forcefully removed by the platforms, one can still find related expressions elsewhere on the Internet and even offline.

TABLE II
SCORES OF "I", "WE", "SHE/HE", "THEY", AND PRESENT ORIENTATION.

Variables            Controversial  Non-Controversial
i                    0.96           1.04
we                   1.25           1.0
she/he               0.69           0.7
they                 1.05           0.71
present orientation  9.37           9.22
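The scores in Table II are LIWC-style category percentages: the share of all words that fall in a category's dictionary. LIWC2015 itself is proprietary, so the sketch below uses toy word lists (hypothetical, and far smaller than the real dictionaries) purely to show the counting scheme:

```python
import re

# Toy category word lists; the actual LIWC2015 dictionaries are
# proprietary and far more extensive.
CATEGORIES = {
    "i": {"i", "me", "my", "mine"},
    "we": {"we", "us", "our", "ours"},
    "they": {"they", "them", "their", "theirs"},
}

def category_scores(text, categories=CATEGORIES):
    """Return, per category, the percentage of tokens belonging to that
    category, mirroring LIWC-style word-count scoring."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens) or 1
    return {cat: 100.0 * sum(t in words for t in tokens) / n
            for cat, words in categories.items()}

scores = category_scores("They say we must act before they act on us")
```

Applied with the real LIWC2015 dictionaries to the concatenated CD and ND text samples, this kind of counting yields scores analogous to those in Table II.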
B. LIWC Sentiment Features
Fig. 1. Summary variables for CD and ND.

Fig. 1 shows the 4 summary variables for CD and ND. We observe that the clout scores for CD and ND are similar. A high clout score suggests that the author is speaking from the perspective of high expertise [8]. At the same time, the analytical thinking, authenticity, and emotional tone scores for ND are higher than those for CD. The analytical thinking score reflects the degree of hierarchical thinking; higher numbers indicate more logical and formal thinking [8]. A higher authenticity score suggests that the content of the text is more honest, personal, and disclosing [8]. The emotional tone scores for CD and ND are both lower than 50, indicating that the overall emotions of CD and ND are negative. This is consistent with our expectation. However, the emotional tone score for ND is higher than that for CD, indicating that the Twitter users in ND express relatively more positive emotion.

Fig. 2. Profiles for the tweets of CD and ND.

Fig. 2 shows 12 more detailed linguistic variables of the tweets of CD and ND. The scores of "future-oriented" and "past-oriented" reflect the temporal focus of attention of the Twitter users, based on the verb tenses used in the tweets [9]. The tweets of ND are more future-oriented, while those of CD are more past-oriented. To better understand this difference, we conducted an analysis similar to Gunsch et al. [4]. We extracted five more linguistic variables: four pronoun scores and one time-orientation score. The scores of "i", "we", "she/he", "they", and present orientation are shown in Table II. The tweets of CD show more other-references ("they"), whereas more self-references ("i", "we") are present in the tweets of ND. The scores for "she/he" of CD and ND are close. The present-orientation score of CD is higher than that of ND. From this observation, similar to the findings of Gunsch et al. [4], we can infer that the tweets of CD focus on the past and present actions of others, while the tweets of ND focus more on the future acts of the authors themselves.

Research shows that LIWC can identify the emotion in language use [9]. As discussed above, the tweets of both CD and ND express negative emotion, and the emotion expressed by the Twitter users of ND is relatively more positive. This is consistent with the positive-emotions and negative-emotions scores. However, there are nuanced differences across the sadness, anxiety, and anger scores: when referring to COVID-19, the tweets of ND express more sadness and anxiety than those of CD, while more anger is expressed in the tweets of CD. The certainty and tentativeness scores reveal the extent to which the event the author is going through has been established or is still being formed [9]. A higher percentage of words like "always" or "never" results in a higher certainty score, and a higher percentage of words like "maybe" or "perhaps" leads to a higher tentativeness score [8]. We observe both a higher tentative score and a higher certainty score for the tweets of CD, while these two scores are both lower for the tweets of ND. We have an interesting hypothesis for this subtle difference. Since 1986, Pennebaker et al.
[8] have been collecting text samples from a variety of sources, including blogs, expressive writing, novels, natural speech, the New York Times, and Twitter, to get a sense of the degree to which language varies across settings. Of all these sources, the tentative and certainty scores for the New York Times text are the lowest, while these two scores for expressive writing, blogs, and natural speech are relatively higher. This observation leads to our hypothesis that the tweets of CD are more like blogs, expressive writing, or natural speech, which focus on expressing ideas, whereas the tweets of ND are more like newspaper articles, which focus on describing facts.

As for the "achievement" score, McClelland [7] found that the stories people told in response to drawings of people could provide important clues to their need for achievement. We hypothesize that the higher "achievement" score for the tweets of ND reflects the need of these Twitter users to succeed in fighting against COVID-19. As for the personal concerns, the scores for "work" and "money" of ND are both higher than those of CD, which shows that the Twitter users of ND focus more on work and money issues (e.g., working from home, unemployment). According to reports of the U.S. Department of Labor, the advance seasonally adjusted insured unemployment rate was 8.2% for the week ending April 4; the previous high was 7.0% in May of 1975.

V. CONCLUSION AND FUTURE WORK
We have presented a study of the topic preferences related to the use of controversial and non-controversial terms associated with COVID-19 on Twitter during the ongoing pandemic. We first use LDA to extract topics from the controversial and non-controversial posts crawled from Twitter, and then qualitatively compare them across the two sets of posts. We find that the topics in the controversial posts are more related to China, even after the keywords related to "Chinese virus" are removed before the analysis, whereas discussions in the non-controversial posts are more related to fighting the pandemic in the US. We also find differences in sentiment between the tweets posted by users using controversial terms and those posted by users using non-controversial terms. Both groups express negative emotion, yet the tweets of ND are relatively more positive. The tweets of ND also show more analytical thinking and are expressed in a more truthful manner. The tweets of CD focus more on the past and present actions of others, while the tweets of ND focus more on the future acts of the authors themselves. More anger is present in the tweets of CD, while more anxiety and sadness are observed in the tweets of ND. More tentativeness and certainty are observed in the tweets of CD, which is not contradictory, since these two scores are both higher in text samples from blogs and expressive writing that focus on expressing ideas and opinions. These two scores are both lower for the tweets of ND, similar to the case of newspaper articles such as the New York Times. Tweets of ND reflect a strong need for achievement.
As for the personal concerns, users of ND focus more on work and money issues.

It has been reported that the widespread use of controversial terms associated with COVID-19 has induced hate speech and, to some degree, racism and xenophobia on social media. Therefore, such content on social media should be closely monitored to prevent a further escalation of the situation, which could potentially lead to social unrest. Our next step is to use textual, demographic, and account-level features to detect the use of hate speech on social media, in an effort to predict and analyze such behaviors for uses including but not limited to social media content management and policy-making.

REFERENCES

[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022, 2003.
[2] Long Chen, Xinyi Lu, Jianbo Yuan, Joyce Luo, Jiebo Luo, Zidian Xie, and Dongmei Li. A social media study on associations of flavored e-cigarette with health symptoms: Observational study (preprint). December 2019.
[3] Iginio Gagliardone, Danit Gal, Thiago Alves, and Gabriela Martinez. Countering Online Hate Speech. UNESCO Publishing, 2015.
[4] Mark A. Gunsch, Sheila Brownlow, Sarah E. Haynes, and Zachary Mabe. Differential linguistic content of various forms of political advertising. Journal of Broadcasting & Electronic Media, 44(1):27–42, 2000.
[5] Erin Hea-Jin Kim, Yoo Kyung Jeong, Yuyoung Kim, Keun Young Kang, and Min Song. Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news. Journal of Information Science, 42(6):763–781, 2016.
[6] Hanjia Lyu, Long Chen, Yu Wang, and Jiebo Luo. Sense and sensibility: Characterizing social media users regarding the use of controversial terms for COVID-19, 2020.
[7] David C. McClelland. Inhibited power motivation and high blood pressure in men. Journal of Abnormal Psychology, 88(2):182, 1979.
[8] James W. Pennebaker, Ryan L. Boyd, Kayla Jordan, and Kate Blackburn. The development and psychometric properties of LIWC2015. Technical report, 2015.
[9] Yla R. Tausczik and James W. Pennebaker. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1):24–54, 2010.
[10] Andranik Tumasjan, Timm O. Sprenger, Philipp G. Sandner, and Isabell M. Welpe. Predicting elections with Twitter: What 140 characters reveal about political sentiment. In Fourth International AAAI Conference on Weblogs and Social Media, 2010.
[11] Yu Wang, Jiebo Luo, Richard Niemi, Yuncheng Li, and Tianran Hu. Catching fire via "likes": Inferring topic preferences of Trump followers on Twitter. In Tenth International AAAI Conference on Web and Social Media, 2016.
[12] Bei Yu, Stefan Kaufmann, and Daniel Diermeier. Exploring the characteristics of opinion expressions for political opinion classification. In Proceedings of the 2008 International Conference on Digital Government Research, pages 82–91. Digital Government Society of North America, 2008.
[13] Xupin Zhang, Hanjia Lyu, and Jiebo Luo. What contributes to a crowdfunding campaign's success? Evidence and analyses from GoFundMe data. arXiv preprint arXiv:2001.05446, 2020.