Linguistic Analysis of Toxic Behavior in an Online Video Game
Haewoon Kwak† and Jeremy Blackburn‡
†Qatar Computing Research Institute, Doha, Qatar ([email protected])
‡Telefonica Research, Barcelona, Spain ([email protected])
Abstract.
In this paper we explore the linguistic components of toxic behavior using crowdsourced data from over 590 thousand cases of accused toxic players in a popular match-based competition game, League of Legends. We perform a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. We characterize the linguistic behavior of toxic players and compare it with that of typical players in an online competition game. We also find empirical support describing how a player transitions from typical to toxic behavior. Our findings can help to automatically detect and warn players who may become toxic and thus insulate potential victims from toxic playing in advance.
Keywords:
Toxic behavior · verbal violence · Tribunal · League of Legends · cyberbullying · online games

Introduction

Multiplayer games provide players with the thrill of true competition. Players prove themselves superior to other humans that exhibit dynamic behavior far beyond that of any computer-controlled opponent. Additionally, some multiplayer games provide another wrinkle: teamwork. Now, not only is it a test of skill between two individuals, but cooperation, strategy, and communication between teammates can ensure victory. Unfortunately, the presence of teammates and their influence on victory and defeat can result in toxic behavior.

Toxic behavior, also known as cyberbullying [1], griefing [4], or online disinhibition [7], is bad behavior that violates social norms, inflicts misery, continues to cause harm after it occurs, and affects an entire community. The anonymity afforded by, and ubiquity of, computer-mediated communication (CMC) naturally leads to hostility and aggressiveness [3,8]. A major obstacle in understanding toxic behavior is its subjective perception. Unlike unethical behavior such as cheating, toxic behavior is nebulously defined; toxic players themselves sometimes fail to recognize their behavior as toxic [6]. Nevertheless, because of the very real impact toxic behavior has on our daily lives, even outside of games, a deeper understanding is necessary.
To further our understanding, in this paper we explore the linguistic components of toxic behavior. Using crowdsourced data from over 590 thousand "judicial trials" of accused toxic players, representing over 2.1 million matches of a popular match-based competition game, League of Legends, we perform a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. In our previous work [2], we found that offensive language is the most reported reason across all three regions. Also, in North America, verbal abuse is the second most reported reason. In other words, linguistic components are a prime method of expressing toxicity.

From our analyses we draw several findings. First, the volume of communication is not uniform throughout the length of a match, instead showing a bi-modal shape with peaks at the beginning and end. By comparing the distribution of communication frequency between normal players and toxic players, we find subtle differences. Typical players chat relatively more at the beginning of a match, mainly for ice breaking, morale boosting, and sharing early strategic information. In contrast, toxic players chat less at the beginning but consistently more than typical players after some time point, i.e., a phase transition. Next, we find discriminative uni- and bi-grams used by typical and toxic players, as signatures of each, examine the differences, and show that certain bi-grams can be classified based on when they appear in a match. Temporal patterns of the linguistic signature of toxic players illustrate what kind of toxic playing happens as the match progresses. Deeper temporal analysis of the words used by toxic and typical players reveals a more interesting picture. We focus on how a player transitions to toxicity by comparing the temporal usage of popular uni-grams between typical and toxic players.

Our contribution is two-fold.
First, we characterize the linguistic behavior of toxic players and compare it with that of typical players in online competition games. Second, we find empirical support describing how a player turns toxic. Our findings can be helpful to automatically detect and warn players who may turn toxic and thus protect potential victims of toxic playing in advance.

Dataset

League of Legends (LoL, http://leagueoflegends.com) is the most popular Multiplayer Online Battle Arena game out today, and suffers from a high degree of toxic behavior. The LoL Tribunal is a crowdsourced system for determining the guilt of players accused of toxic behavior. We collected 590,311 Tribunal cases from the North America region, representing a total of 2,107,522 individual matches. Each Tribunal case represents a single player and includes up to 5 matches in which he was accused of toxic behavior. In LoL, players can communicate via chat, which is ostensibly used to share strategic plans and other important information during the game. However, chat is also a prime vector for exhibiting toxic behavior. Thus, although a variety of information is presented to Tribunal reviewers [2], in this paper we focus exclusively on the in-game chat logs.

We extract 24,039,184 messages from toxic players and 33,252,018 messages from typical players. Because the teammates of toxic players are directly impacted by toxic playing and readily express aggressive reactions to a toxic player, we define typical players as the set of players on the opposite team when none of them report the toxic player.

Before continuing, we report some basic statistics about vocabulary size and message length. We found 1,042,940 unique tokens in toxic player messages and 1,176,356 unique tokens in typical player messages. While typical players send 38% more messages than toxic players, the messages are composed of only 13% more unique tokens. Interestingly, we find that toxic players send longer messages than typical players; the average number of words per message is 3.139 and 2.732 for toxic and typical players, respectively.

Fig. 1. Change of chat volume during a match.
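The vocabulary and message-length statistics above can be computed with a straightforward pass over the chat logs. The sketch below is a minimal illustration; the `chat_stats` helper and the toy messages are our own, not part of the paper's pipeline, and whitespace tokenization is an assumption (the paper does not specify its tokenizer).

```python
from collections import Counter

def chat_stats(messages):
    """Return (unique token count, mean words per message) for a chat log,
    using simple lowercase whitespace tokenization."""
    tokens = Counter()
    total_words = 0
    for msg in messages:
        words = msg.lower().split()
        tokens.update(words)
        total_words += len(words)
    avg_len = total_words / len(messages) if messages else 0.0
    return len(tokens), avg_len

# Toy examples standing in for the 24M / 33M real messages:
toxic_msgs = ["report mid noob", "gg ez", "u all suck so bad"]
typical_msgs = ["gl hf", "ss mid", "gj", "care bot"]
print(chat_stats(toxic_msgs))
print(chat_stats(typical_msgs))
```

Run over the full dataset, the same two numbers per group yield the 13% vocabulary gap and the 3.139 vs. 2.732 words-per-message figures reported above.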
Chat Volume over Time

We begin our analysis by exploring chat volume over time. A LoL match can be broken up into logical stages. First is the early game (also known as the "laning phase"), where characters are low level and weak. In the early game, players expend great effort toward "farming" computer-controlled minions to gain experience and gold, with aggressive plays against the other team usually coming as the result of an over-extension or other mistake. As players earn gold and experience, they level up and become stronger, and the match transitions to the mid game. During the mid game, players become more aggressive and tend to group up with teammates to make plays on their opponents. Finally, once players are reaching their maximum power levels, the match transitions into the end game, where teams will group together and make hard pushes toward taking objectives and winning the match.
While these phases are not dictated by the programming of LoL, and thus there is no hard cut-off point for when the transitions between phases occur, we suspect that each phase has an associated pattern of communication. Thus, in Figure 1 we plot the density of chat messages written by toxic and typical players as a function of normalized time during a match. The plot confirms our suspicions: communication is not uniform throughout the match. Instead, we see three distinct levels of communication, likely corresponding to the three phases of a match, with relative peaks at the beginning and end of the match.

This finding can be explained with a deeper understanding of how a LoL match progresses. As mentioned above, in the early game players are relatively weak and must focus on farming for resources. Early-game farming occurs via players choosing one of three lanes to spend their time in. The lanes are quite far from each other on the map (10+ seconds to travel between them), and thus players on the same team tend to be relatively isolated from each other. To take advantage of this isolation, and to get an early lead, players might roam from the lane they chose to play in to another lane. In turn, this provides their teammate in the other lane with a numbers advantage over the opposing player in that lane. Colloquially, this roaming to provide a temporary numbers advantage is known as a "gank." To deal with ganks in the early game, players tend to communicate via chat when the opposing player in their lane has gone missing. As the match transitions to the mid game, teammates start grouping up. Since they are no longer so isolated, the fear of ganks dissipates, and the need to communicate missing players diminishes.
Additionally, since teammates are grouped together, they are seeing the same portion of the map, and there is little additional information they can convey to each other.

Finally, as the late game comes around, teams must focus and work together to complete objectives and win the match. In practice, this might involve coming to agreement on a final "push" for an objective, or agreeing on which lane the team should travel down. There is also an e-sports custom of saying 'gg' (good game) at the end of the game; the sharp spikes contain those messages as well. While this might explain some of the spike seen at the end of Figure 1, another, simpler explanation is that players are simply communicating their (dis)pleasure in winning or losing the match.

A more interesting finding is the subtle difference in the distributions of typical and toxic players. At the early stage we see more active communication by normal players. We suppose that it includes messages for ice breaking or cheering (e.g., 'gl' (good luck) or 'hf' (have fun)). However, at some point after this short period, toxic players begin to chat more than typical players and keep this pattern until the last stage. At the last stage of the match, typical players again chat more socially, for example sending smile emoticons, such as :D or :), and also saying 'gg', as we mentioned. The transition point, where the distribution of toxic players crosses over that of typical players, is the basis of our further analysis in Section 5.
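The crossover between the two densities can be located numerically once each group's messages are binned over normalized match time. This is a minimal sketch under our own assumptions (per-group normalized histograms and a first-exceedance rule); the paper does not specify how it estimates the transition point, and `crossover_point` is a hypothetical helper.

```python
import numpy as np

def crossover_point(toxic_times, typical_times, bins=100):
    """Estimate the normalized match time (0..1) at which the toxic
    players' chat density first overtakes the typical players' density."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    toxic_pdf, _ = np.histogram(toxic_times, bins=edges, density=True)
    typical_pdf, _ = np.histogram(typical_times, bins=edges, density=True)
    above = toxic_pdf > typical_pdf
    if not above.any():
        return None                       # toxic density never exceeds typical
    idx = int(np.argmax(above))           # first bin where toxic > typical
    return 0.5 * (edges[idx] + edges[idx + 1])  # midpoint of that bin
```

With noisy real data one would smooth the histograms (or use a kernel density estimate) before comparing, so that a single noisy bin does not trigger a spurious crossover.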
Fig. 2. Top 10 discriminative uni- and bi-grams.
Discriminative N-grams

The linguistic approach to the chat logs characterizes toxic players with context. We conduct n-gram analysis because it is intuitive and straightforward. We filter the stopwords and then count the frequency of uni- and bi-grams from the chat logs involving toxic reports of either verbal abuse or offensive language.

In order to find discriminative n-grams of toxic players we need a reference for comparison. We conduct the same n-gram analysis on the enemies' chat logs when verbal abuse or offensive language is not reported by the enemies. We consider this a normal conversation among players and call those enemies typical players. We create the top 1,000 uni- and bi-grams for toxic and typical players, respectively. We find 867 uni- and 748 bi-grams in common. We then obtain 133 non-overlapping uni- and 252 bi-grams for toxic and typical players; they appear only among either toxic or typical players. We define these as the discriminative uni- and bi-grams of toxic and typical players, respectively.

Figure 2 shows the top 10 discriminative uni- and bi-grams of toxic and typical players. The top 10 discriminative uni- and bi-grams of toxic players are filled with bad words. That is, Riot Games does not offer even a basic level of bad-word filtering, and such bad words can be used as signatures of toxic players who engaged in verbal abuse or offensive language. We find that several discriminative bi-grams of typical players are about strategies, while most of the toxic players' bi-grams are bad words. We note that some variations of 'fucking' are discriminative uni-grams but 'fucking' itself is not. This means that 'fucking' is often used not only by toxic players but by typical players as well, which shows the difficulty of filtering bad words with a simple dictionary-based approach.

As the next step of the linguistic approach, we are interested in when verbal abuse occurs from a temporal perspective during a match.
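The discriminative-n-gram procedure described above (top-1,000 lists per group, then set difference) can be sketched as follows. This is an illustration, not the paper's code: the `STOPWORDS` set is a tiny stand-in for a real stopword list, and `top_ngrams` / `discriminative` are our own names.

```python
from collections import Counter

# Illustrative stopword subset; a real run would use a full list.
STOPWORDS = {"the", "a", "an", "to", "and", "is", "i", "it"}

def top_ngrams(messages, n, k):
    """Return the set of the k most frequent n-grams (as tuples)
    after lowercasing and stopword filtering."""
    counts = Counter()
    for msg in messages:
        words = [w for w in msg.lower().split() if w not in STOPWORDS]
        counts.update(zip(*(words[i:] for i in range(n))))
    return {gram for gram, _ in counts.most_common(k)}

def discriminative(toxic_msgs, typical_msgs, n=2, k=1000):
    """N-grams appearing in only one group's top-k list: the paper's
    'discriminative' n-grams for each group."""
    toxic = top_ngrams(toxic_msgs, n, k)
    typical = top_ngrams(typical_msgs, n, k)
    return toxic - typical, typical - toxic
```

On the real corpora, k=1000 with n=1 and n=2 reproduces the 133 discriminative uni-grams and 252 discriminative bi-grams reported above.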
We divide the 252 discriminative bi-grams of toxic players into three classes, early-, mid-, and late-bi-grams, based on when their highest frequencies occur.

Fig. 3. Example of early-, mid-, and late-bi-grams.

Figure 3 presents an example of the three temporal classes of bi-grams. Interestingly, 209 (82.94%) out of 252 bi-grams are late-bi-grams. The early-bi-gram 'ill feed' is a domain-specific example of toxic behavior. In LoL, one of the ways players earn gold and experience during a match is by killing players on the opposite team. Intentional feeding is when a player deliberately allows the other team to kill him, thus "feeding" the enemies gold and experience, in turn allowing them to become quite powerful.

The mid-bi-gram 'fucking bot' is the toxic player expressing his displeasure at the performance of the bottom lane. The bottom lane is usually manned by characters that have a primarily late-game presence, and thus being behind during the mid game has a significant impact on the remainder of the match.

Most verbal abuse by toxic players occurs in the late stage of the game. For example, 'report noob' is the toxic player requesting that the rest of his team report a player (the "noob") that he has singled out for his ire. We believe the most likely explanation is that verbal abuse is a response to losing a game, which is often not apparent until the late game. For example, consider a scenario where one player on the team has a bad game, perhaps making poor decisions that result in the enemy team becoming quite strong. In the early-, and even mid-game phases, a toxic player might still be able to hold his own; however, when the enemy team groups up and makes coordinated pushes in the late game, their relative strength will often result in quick and decisive victories in team fights. If toxic playing could be detected in real time, we could protect potential victims from verbal violence, for example via alerts or by simply not delivering such messages.

Temporal dynamics of bi-grams might help to create a mental model of toxic players. For instance, the 10 bi-grams containing 'bot' are divided into 1 early-bi-gram, 5 mid-bi-grams, and 4 late-bi-grams.
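Labeling a bi-gram early, mid, or late by when its highest frequency occurs amounts to finding the peak bin of its occurrence histogram over normalized time. A minimal sketch, assuming three equal-width thirds of the match as the class boundaries (the paper does not state its exact binning):

```python
import numpy as np

def classify_by_peak(occurrence_times):
    """Label an n-gram 'early', 'mid', or 'late' by which third of
    normalized match time (0..1) holds its highest occurrence count."""
    counts, _ = np.histogram(occurrence_times, bins=3, range=(0.0, 1.0))
    return ("early", "mid", "late")[int(np.argmax(counts))]
```

Applying this to each discriminative bi-gram's list of occurrence times yields the split into 1 early-, 5 mid-, and 4 late-bi-grams reported for the 'bot' bi-grams above.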
Through manual inspection, we confirm that the early-bi-gram ('go bot') is strategic and non-aggressive, the mid-bi-grams are cursing, and the late-bi-grams are blaming the result of the match on the bot player(s). This provides us with a rough idea of how toxic players might behave and think over time: initially they have a similar mindset to typical players, but, as the game plays out contrary to their desires, they grow increasingly aggressive, eventually lashing out with purely abusive language. We leave more sophisticated modeling of toxic players' thought processes as future work.

Fig. 4. Time difference of the last-used time of uni-grams.
How Toxicity Emerges

In the previous section we recognized which words are exclusively used by toxic and normal players. However, some words are used by both toxic and normal players. For these, the emerging patterns in a temporal sense could be quite different. If we assume that toxic players exhibit toxic behavior in reaction to certain events happening during the match, then the linguistic behavior of such toxic players should be the same as that of typical players before those events happen.

To validate this hypothesis, we conduct the following experiment, which is focused on finding words that are not used after some time point by toxic players, while they are continuously used by normal players. We extract the top 30 uni-grams at every normalized time unit (ranging from 0 to 100) for toxic players and normal players, respectively. Since the top 30 uni-grams are quite stable during the match, we obtain 80 unique uni-grams for toxic players and 91 uni-grams for normal players. We first observe that toxic players have slightly smaller vocabularies than normal players. For each of these uni-grams, we compute the normalized time of last use by toxic players and normal players, respectively. Finally, we compute the difference in last-used time between toxic and normal players for the common uni-grams.

Figure 4 lists the uni-grams with a time difference greater than 30; i.e., words in the list are used later into the match by normal players. Some interesting patterns are present in the results.

First, emoticons, particularly smile emoticons, are almost never used by toxic players. Second, apologies (e.g., 'sorry') are also exclusively used by normal players. Third, some words for strategic team maneuvers (e.g., 'come', 'ult', 'blue', 'ward') are used by toxic players, but this ceases at some point during the match. Fourth, some words primarily used for communicating movements with partners in the same lane (e.g., 'back', 'b', 'brb' (be right back), 'omw' (on my way), 'k' (okay)) are also used by toxic players, but again, after some point toxic players stop this form of communication. Fifth, toxic players stop praising their teammates (e.g., 'gj' (good job)) after some point in time.

All these findings reveal how toxicity is born during a match. It appears to be a kind of phase transition. Toxic players behave the same as normal players during the early stage of the match, but at some point they change their behavior. After that point, they utter neither apologies nor praise to express their feelings, and they also stop strategic communication with team members.

By combining this finding with the discriminative words of toxic players, we see the possibility of detecting the point at which a player transitions to toxicity without using detailed in-game action logs, but just chat logs. Thus, linguistic analysis of toxic players shows not just how different they are, but when they become different as well.
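The last-used-time comparison described above can be sketched as follows. The `last_use_gap` helper is a hypothetical name of our own; it assumes, per the text, normalized time units from 0 to 100 and the reported threshold of 30.

```python
def last_use_gap(toxic_uses, typical_uses, common_grams, threshold=30):
    """For uni-grams used by both groups, report how much later (in
    normalized time units, 0-100) typical players keep using each word.
    `toxic_uses` / `typical_uses` map uni-gram -> list of use times."""
    gaps = {}
    for gram in common_grams:
        gap = max(typical_uses[gram]) - max(toxic_uses[gram])
        if gap > threshold:
            gaps[gram] = gap
    return gaps

# Toy example: toxic players stop saying 'sorry' early in the match.
toxic = {"sorry": [5, 10], "gg": [90]}
typical = {"sorry": [5, 95], "gg": [95]}
print(last_use_gap(toxic, typical, {"sorry", "gg"}))
```

Words surviving the threshold are exactly those plotted in Figure 4: the vocabulary that toxic players abandon mid-match while typical players keep using it.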
Conclusion

In this work we have examined crowdsourced data from 590 thousand cases of accused toxic players in a popular match-based competition game, League of Legends. We have performed a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. We have several interesting findings: a bi-modal distribution of chats during a match, a difference in temporal chat patterns between toxic and typical players, a list of discriminative uni- and bi-grams used by typical and toxic players as signatures of each, temporal patterns of the linguistic signature of toxic players, and a possible footprint of the transition from typical to toxic behavior. Our findings can be helpful to automatically detect and warn players who may turn toxic and thus protect potential victims of toxic playing in advance.

Finally, we suggest several directions for future work. The first is focusing on the interaction between typical and toxic players. In this work the unit of our analysis is a message, but we do not delve into the flow of messages. Interaction analysis could reveal clearer narratives of how a player transitions to toxic behavior. The next is building a pre-warning system to detect toxic playing earlier. The main challenge here is to build a dictionary of words that are signs of toxic playing. As we have seen in the list of discriminative uni- and bi-grams of toxic and typical players, some bad words are used by typical players as well. This behavior is prevalent in "trash talk" culture, and is an important factor in immersing players in a competitive game [5]. Thus, any pre-warning system must be effective in detecting toxic playing while being flexible enough to allow for trash talk, to avoid breaking the immersive gaming experience. We believe that the signatures of toxic and typical players we found are a first step toward building the dictionary for a pre-warning system.
References
1. J. Barlińska, A. Szuster, and M. Winiewski. Cyberbullying among adolescent bystanders: role of the communication medium, form of violence, and empathy. Journal of Community & Applied Social Psychology, 23(1):37-51, 2013.
2. J. Blackburn and H. Kwak. Stfu noob!: Predicting crowdsourced decisions on toxic behavior in online games. In Proceedings of the 23rd International Conference on World Wide Web, WWW '14, pages 877-888, 2014.
3. V. H.-H. Chen, H. B.-L. Duh, and C. W. Ng. Players who play to make others cry: The influence of anonymity and immersion. In Proceedings of the International Conference on Advances in Computer Entertainment Technology, ACE '09, pages 341-344, 2009.
4. T. Chesney, I. Coyne, B. Logan, and N. Madden. Griefing in virtual worlds: causes, casualties and coping strategies. Information Systems Journal, 19(6):525-548, 2009.
5. O. B. Conmy. Trash Talk in a Competitive Setting: Impact on Self-efficacy, Affect, and Performance. ProQuest, 2008.
6. H. Lin and C.-T. Sun. The "white-eyed" player culture: Grief play and construction of deviance in MMORPGs. In Proceedings of DiGRA 2005 Conference, 2005.
7. J. Suler. The online disinhibition effect. Cyberpsychology & Behavior, 7(3):321-326, 2004.
8. P. Thompson. What's fueling the flames in cyberspace? A social influence model.