I Would Not Plant Apple Trees If the World Will Be Wiped: Analyzing Hundreds of Millions of Behavioral Records of Players During an MMORPG Beta Test
Ah Reum Kang†, Jeremy Blackburn‡, Haewoon Kwak§, Huy Kang Kim¶
†University at Buffalo, ‡Telefonica Research, §Qatar Computing Research Institute, Hamad Bin Khalifa University, ¶Korea University
ABSTRACT
In this work, we use player behavior during the closed beta test of the MMORPG ArcheAge as a proxy for an extreme situation: at the end of the closed beta test, all user data is deleted, and thus, the outcome (or penalty) of players’ in-game behaviors in the last few days loses its meaning. We analyzed 270 million records of player behavior in the 4th closed beta test of ArcheAge. Our findings show that there are no apparent pandemic behavior changes, but some outliers were more likely to exhibit anti-social behavior (e.g., player killing). We also found that contrary to the reassuring adage that “Even if I knew the world would go to pieces tomorrow, I would still plant my apple tree,” players abandoned character progression, showing a drastic decrease in quest completion, leveling, and ability changes at the end of the beta test.
Keywords
Massively multiplayer online role playing game (MMORPG); Online games; ArcheAge; Closed beta test (CBT)
1. INTRODUCTION
One problem that philosophers have struggled with over the centuries is how humans will behave in a disastrous “end times” scenario. For example, how does an individual behave if his/her behavior will have no lasting outcomes or penalties? Do we continue to follow the compass that has led us through life or do we abandon our morals, ideals, and social norms in the face of oblivion? In this paper, we examine such a scenario through the lens of a massively multiplayer online role playing game (MMORPG).

In contrast to typical studies on running MMORPGs, our dataset is from the Closed Beta Test (CBT) of ArcheAge, developed and serviced by XLGames in Korea. The CBT is populated with a limited number of testers, and more importantly, at the end of the CBT the server is wiped: all characters are deleted, progression is lost, virtual property is deleted, etc. The mapping principle [19] states that the behavior of players in online games is not very far from the behavior that humans exhibit in the real world. Thus, while not a perfect mapping, we believe that the end of the CBT is a relatively good approximation of an “end times” scenario, and thus the present work is not only useful for the understanding of players’ behavior but can also begin to shed light on human behavior in general under such conditions.

WWW’17 Companion, April 3–7, 2017, Perth, Australia. ACM 978-1-4503-4914-7/17/04. http://dx.doi.org/10.1145/3038912.3038914

From the “living laboratory” of the CBT, we first formulate the research problem. Our aim is to characterize the activity patterns of players over time with respect to a salient event. A salient event is defined as an event that takes place in time and space and has an impact on social units that can respond to the event. The closing of the CBT can be considered a salient event for the entire population of players.

In this work, we investigate how player behavior changes during the course of the CBT. We examine player behavior at two levels of granularity: system-wide and individual-level. We do this to avoid an ecological fallacy [6], which is when statistical inferences about individuals are deduced from those about groups that they belong to. Via a two-level analysis, we find no apparent pandemic (system-wide) behavior changes, although some outliers resorted to anti-social behavior, such as murder (player killing, or “PK”). That said, we surprisingly find that chat content exhibits a slightly positive trend as the CBT draws to a close. Overall, players increase social interaction with others: they exchange more in-game messages (mails) and create more parties to enjoy group-play or complete high-level quests.

Additionally, we focus on whether individuals’ behavioral changes are due to the CBT ending by comparing behavior to that of typical churners. We find significant differences between players that voluntarily leave the game (churners) and those who stay until the end of the CBT. In particular, we find that churners were more likely to exhibit anti-social behavior (e.g., PK).
It seems that churners lose their sense of responsibility and attachment to the game. In contrast, those who stay until the end might have some loyalty to the game and thus continue to behave within accepted social norms.

Using network analysis, we focus on associations between a wide range of individual player behaviors. We examine patterns of in-game actions, looking for changes in frequency as the CBT ends, and find that, contrary to the reassuring adage “Even if I knew the world would go to pieces tomorrow, I would still plant my apple tree” (i.e., I would still continue to better myself and the world), players abandoned character progression, showing a drastic decrease in quest completion, leveling, and ability changes. This finding itself is interesting and indicates why the quote resonates, and at the same time, it sheds light on game design implications for CBTs with respect to player reactions to the inevitable end of the beta test.

Our contributions are three-fold: 1) we show that analyses with different granularity, both individual-level and system-wide, are crucial to understand user behavior comprehensively; 2) we propose a robust method to distinguish the effect of the end of the beta test from that of voluntarily quitting the game by dealing with the typical churners separately; and 3) to the best of our knowledge, we are the first to perform a large-scale quantitative characterization of behavior changes as the beta test of a game ends. This brings practical implications to game designers and theoretical implications to researchers who are interested in user behavior around a critical event.
2. BACKGROUND
2.1 MMORPG as a Miniature of the World
MMORPGs, such as World of Warcraft (WoW) and EverQuest II, where a very large number of players interact with one another within a virtual world, are one of the most popular forms of online gaming. In MMORPGs, players choose a character and its role, race, and other traits, and live in the virtual world taking various actions.

ArcheAge is a medieval fantasy MMORPG serviced by XLGames. ArcheAge has been released in Asia, Europe, and North America, and has over 2 million subscribers as of October 2014.

A design goal of ArcheAge is to offer players a playground to do whatever they want and find their own way to enjoy the game. To this end, a wide range of actions are possible in ArcheAge compared to other MMORPGs. Users can modify the environment by means of construction and cultivation much more extensively than in other comparable games. Users can enjoy player-vs-environment (PvE) combat and player-vs-player (PvP) combat, questing, group activities, chatting, housing and farming, crafting, trading, politics, voyaging and pirating, etc. All these actions are recorded in server-side game logs. Therefore, the logs are good assets to capture and observe more varied human behaviors than in other games.

Online game development usually follows an ordered release process: Alpha Test, CBT, Open Beta Test (OBT), and then the official launch. CBTs are private, with a limited number of testers, in order to find bugs, validate the market and fun of the game before official release, and improve the game through feedback. A game can have multiple CBTs if necessary, and ArcheAge had five CBTs before launch.
3. RELATED WORK
Castronova has put forth the theory that video games can serve as living laboratories allowing for novel social science research [3]. The rules of the game world are not just explicitly known, but in fact completely controlled by the game developers. This in turn allows us to study what amounts to real world behavior but still have some well defined controls. In a nutshell, his thesis is that the scale, breadth, and depth of online games result in genuine social interactions, providing detailed, precise, and accurate traces at the society level. Related to this, Williams [19] presents the mapping principle, which posits that behavior in online video games “maps” to behavior in the real world. That is, we can gain an understanding of real world behavior by examining behavior in online games.

Castronova et al. [4] further used traces of virtual goods transactions to measure whether there is a mapping principle for economic theories. They quantitatively examine several economic indicators at large scale based on the accurate, complete digital traces collected from an MMORPG. Their findings show that virtual world denizens operate in the same way as in real world economies. Like the work presented in this paper, having access to detailed data eliminates many of the concerns and limitations of traditional survey based studies.

Footnote URLs: http://archeage.xlgames.com/en and https://goo.gl/HHqCTx

Detailed game logs collected from the CBT of ArcheAge enable us to formulate research questions looking into players faced with an extreme situation. As previous efforts on online game beta tests have focused on things like market understanding [7] or performance testing [9], we believe that the analogy between the server shutdown and an end times scenario presents a novel research challenge.

One well-known study around a critical event in online games was performed by Boman and Johansson, modeling a synthetic plague in WoW [17].
The synthetic plague grew from what was originally conceived as a “debuff” intended to spread only from monsters to players. A programming bug, however, resulted in the plague being able to spread from player to player. As players constitute a synthetic society in the game, the game can be seen as an interactive executable model for studying disease spread (with the caveat that it is a very special kind of disease) [16]. One interesting emerging behavior was that players would deliberately attempt to infect others by passing the debuff to their pets, dismissing them, and then re-summoning them in a populated area, causing the plague to spread. Similarly, some industrious players set up sales of fraudulent cures.

Although this behavior is clearly anti-social, other behavior emerged to counteract it. Some players acted as public health workers, healing the sick, while others even attempted to quarantine themselves and suffer a solitary death. This incident, however, is very limited in scope, which limits its implications. In contrast, our work is applicable to any game that has beta tests, which are essential for game development. We thus believe that our work is easily generalizable with broad implications. Although ArcheAge has been studied previously and shown to be rich enough to capture complicated social dynamics [10, 11], these previous works do not focus on the end times scenario.
4. DATASET
We acquired the anonymized full logs of the 4th CBT of ArcheAge directly from XLGames. While we have the logs of the first two weeks of the 5th CBT, in this work, we focus on the 4th CBT because our aim is to examine user behavior as the CBT closes. The logs were delivered as a 45 GB MySQL database and include essentially all actions that players take during the 4th CBT, for example, experience points gained, spells and abilities used, and items purchased or crafted. The CBT took place from December 8th, 2011 to February 20th, 2012 (about 11 weeks) and is summarized in Table 1.
Period | 12/08/2011 ∼
Table 1: Summary of our dataset.
We classified 75 different in-game actions into 11 categories: combat, party (grouping up with other players), instance dungeons (specially built dungeons with different content than the “surface” world), battle ground (a team deathmatch between players regardless of their race that takes about 15 minutes), siege warfare (battle between guilds over player-owned castles; the largest scale collaborative play in the game), raid (a large party formed to defeat difficult boss enemies), expedition (ArcheAge’s version of guilds), PvP, “interaction doodad” (players interacting with various objects in the world, e.g., harvesting a tree for wood), item production, and housing. This variety of in-game actions lets us study complex dynamics of a virtual world.

Finally, with access to such detailed data, we respect and protect the privacy of game players. All the data are anonymized. We have made no attempt to deanonymize it and do not have any information that can connect online and offline identity. Also, we note that our legal agreement with XLGames explicitly prevents us from receiving any information that can directly reveal players’ real world identities. We have further made a best attempt to avoid any analysis that would inadvertently reveal players’ real world identity.
5. DOES A SYSTEM-LEVEL PANDEMIC EXIST?
The first step in understanding how players deal with the end of the CBT is exploring how their behavior evolved over time. In this section, we examine aggregate player behavior in ArcheAge from the start of the CBT to the end.
Like most MMORPGs, ArcheAge offers players a variety of activities to partake in. In particular, ArcheAge has a sophisticated crafting system which allows players to produce items, a construction system allowing players to build houses, several types of PvE and PvP combat, as well as a variety of grouping and guild functionality. However, not all of these actions are easily accessible at the start of the world. For example, to build a house, players require some materials, such as a blueprint, wood, and ore. To get a blueprint, players need to go to a specific region, called Mirage Island, and buy it from an NPC. For purchases, players need Gilda Stars, the main currency of ArcheAge, obtained by completing quests or trading packs. To get wood, players must plant and harvest trees. To get ore, players need to mine stone, metal, and gems from rocks. Overall, gathering these materials takes some play time.

Figure 1 plots radar charts of the change in the frequency of actions per week, normalized by the number of users performing the action during the entire CBT. Each week is demarcated by a tick on the outside radius (counter-clockwise ordering) and the further away the shaded region is from the center, the greater the number of actions.

From Figure 1 we see that at week 1, expedition events were quite common. Players are exploring and gathering information about the new world from the start. Next, during week 2, party and raid events become popular. This is likely due to players focusing on leveling up their characters with newfound friends. At week 3, we see many players taking part in instance dungeons, house building, and battle ground events. Once players have established a foothold in the world, they begin to take part in more varied content for fun and currency. At weeks 4 and 5, we see a relative peak of item production and PvP activity. PvP requires rarer materials and in-game experience as it is generally considered more difficult than PvE combat.
Interestingly, after week 5, the frequency of PvP activities declines. This shows that PvP tends not to be adopted as a regular activity by players. Siege warfare is observed only at weeks 9 and 10 because it was being tested during those periods only. Overall, there are no extreme changes in in-game actions in terms of system-wide aggregated frequency over time, even though the CBT is ending.

Figure 1: Radar plot of normalized action frequency per week. Weeks are arranged on the outer radius in clock-wise order. The further away the shaded region is from the center, the greater the frequency for the corresponding week.
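The normalization behind Figure 1 can be illustrated with a minimal sketch. The `(player_id, action, week)` record layout and the function name are assumptions made for illustration only; the paper does not describe the log schema or the exact aggregation code.

```python
from collections import defaultdict


def weekly_normalized_frequency(records):
    """Weekly action counts divided by the number of distinct players who
    performed that action at any point in the CBT.

    `records` is an iterable of (player_id, action, week) tuples; this
    schema is hypothetical, not the actual ArcheAge log format.
    """
    counts = defaultdict(lambda: defaultdict(int))  # action -> week -> count
    users = defaultdict(set)                        # action -> distinct players
    for player_id, action, week in records:
        counts[action][week] += 1
        users[action].add(player_id)
    return {
        action: {week: n / len(users[action]) for week, n in by_week.items()}
        for action, by_week in counts.items()
    }


# Toy usage: two players perform PvP in week 4, one of them again in week 5.
toy = [("p1", "pvp", 4), ("p2", "pvp", 4), ("p1", "pvp", 5), ("p1", "raid", 2)]
freq = weekly_normalized_frequency(toy)
```

Dividing by the number of distinct performers keeps popular and niche actions on a comparable scale, which is what allows them to share a single radar plot.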
As mentioned earlier, ArcheAge has a strong virtual economy. Like other online games, ArcheAge players earn a currency, “Gilda Stars,” by killing monsters, completing quests, or trading items. Similarly, they spend money to buy housing, items, pets, or materials. Intuitively, at the end of the CBT, there is no motivation to hold money in hand because it does not change to real money after the game ends. Thus, we hypothesize that players increase their rate of spending when the end approaches. Figure 2 plots the distribution of the expenditure ratio per week. The expenditure ratio is computed as the amount of gold players spend divided by their net worth: if a player spends no gold they have an expenditure ratio of 0.0 but if they spend all the gold they have it is 1.0.

From Figure 2 we see that in the earlier weeks players tend to have a low expenditure ratio. They are likely to save up resources to spend on higher-tier items like houses or boats. Once they have built up their reserves, we see an increasing trend towards higher expenditure ratios. At the 4th week, player expenditures become more stable, with perhaps a slight reduction as the end of the world approaches. This is opposite to our hypothesis. A potential explanation is that players would not wait until the exact moment of the end of the CBT to spend their money on in-game items, because it also takes some time to enjoy what they buy. The highest peak appears at the 9th week rather than the last week, although the difference is marginal.

One interesting note is that a preliminary analysis on the economy of a different MMORPG (Aion) found that players tended to have only around a 27% expenditure ratio [12], whereas in ArcheAge the median is around 27% but the average is much higher.
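As a minimal sketch of the expenditure ratio defined above, the snippet below assumes that a player's net worth for a week equals the gold spent that week plus the gold still held at the end of it. That decomposition, and the function and variable names, are illustrative assumptions; the paper only fixes the two endpoints (0.0 when nothing is spent, 1.0 when everything is spent).

```python
from statistics import median


def expenditure_ratio(spent, remaining):
    """Gold spent divided by net worth, with net worth assumed to be
    spent + remaining (an assumption, not the paper's stated definition).

    Returns None for a player with no gold at all, since the ratio is
    undefined in that case.
    """
    net_worth = spent + remaining
    if net_worth <= 0:
        return None
    return spent / net_worth


# Toy week: (spent, remaining) pairs for three hypothetical players.
week = [(0, 50), (25, 75), (50, 0)]
ratios = [r for r in (expenditure_ratio(s, rem) for s, rem in week) if r is not None]
weekly_median = median(ratios)
```

Summarizing each week by a median as well as a mean matters here because, as the text notes, the two can differ considerably when a few players spend nearly everything.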
Aion: http://na.aiononline.com/en/

Figure 2: Distribution of expenditure ratio (money outflow divided by net worth) per week. (Axes: Week vs. Expenditure Ratio.)
Figure 3: The relative frequency of murders over time. (Axes: Days vs. PK Ratio; PK_Type: MURDER, BATTLE.)
ArcheAge has a sophisticated PvP system to support a wide range of player actions. Intuitively, we might expect that once it is apparent there are no consequences for actions, players might not feel the need to obey rules (e.g., ethics and social norms) and take part in anti-social behavior for fun. ArcheAge’s PvP system allows us to examine this: PvP between players of the same character race is classed as murder with a variety of in-game consequences and penalties.

Ultimately, we are concerned with whether or not players abandon whatever reservations they might have against murder. Figure 3 presents the relative frequency of regular PvP (battle kills allowed in ArcheAge) to murders over time. From the figure, we see that murders are much more common at the beginning of the CBT, decreasing in proportion to regular PvP events until about the last third of the timeline, where murders start becoming more prevalent again. We suspect that the initial peak in murders is due to players trying out the PvP system as well as victimizing other new players who might not expect such early aggression. The increasing trend at the end of the timeline is an indication that players might be reverting to more “savage” tendencies as well. As we expected, players are more likely to perform anti-social behavior when no penalty will be imposed.

Next we examine how pervasive such anti-social behavior was. We extracted all players that committed at least one murder in the last two weeks of the beta period. We note that there were relatively few such murderers (334), and thus, the analysis does not represent pandemic behavior, but rather a closer look at outlier players that did resort to violence at the end.

How do those 334 players behave during the whole beta test? Our interest here is whether they were “normal” players that turned into murderers only at the end of the CBT or whether they exhibited abnormal behavior throughout the CBT. We cluster them based on their in-game activities, obtaining four clusters. We use the k-means clustering algorithm and find the optimal k using the elbow method.

Figure 4 plots a histogram of the murders performed per cluster divided into three intervals (early-, mid-, and end-times). Interestingly, we see some differences in when the different clusters performed their murders. Even though the mid-times period encompasses more weeks than the end-times period, cluster 4 committed about the same number of murders in both periods and cluster 2 saw a dramatic increase in murders.

There are two main takeaways from this finding: 1) not all murderers are alike, but there do seem to be some archetypes they can be clustered under, and 2) clearly there are some players who, although they did not quite go from pacifists to serial killers, did in fact show an increase in murderous tendencies as the end of the world drew near.

Figure 4: Frequency of murders per period committed by each cluster of murderers. (Axes: Cluster vs. PK_Count; Period: WEEK_1_3, WEEK_4_8, WEEK_9_11.)

The second “M” in MMORPG stands for
Multiplayer, and thus the obvious draw to the genre is interacting with other players. A big part of player interaction is social interaction via various forms of communication. ArcheAge supports a variety of real-time communication channels, many of which are tied to various action types. For example, expeditions, parties, and factions all have their own chat channels.

These chat logs enable us to explore a few questions related to communication. We have seen initial evidence that game play related behavior changes over time, but does communication also change?

To answer this question, we examine emotions conveyed in chat messages via sentiment analysis. While many methods for sentiment analysis have been proposed, we use the simple, yet powerful, valence score [8]. A valence score is a measure of “happiness” and when applied to a corpus of text provides us with the sentiment expressed in the text. As the ArcheAge CBT was populated almost entirely with native Korean speakers, we used a Korean language valence score dictionary compiled in [5], which showed a slight, but statistically significant positive bias in human language across a variety of communication channels and languages. The valence scale ranged from unpleasant (1), to neutral (5) and pleasant (9).

Figure 5 plots the valence scores over time for each public chat channel in ArcheAge. We plot the valence score for each day as well as a smoothed trend line. We first note that as found in [5], there is a slight positive bias across time; between 4.85 and 6.07 for the entirety of our dataset. In particular, most of the valence scores hover right around 5.4, although there are some extreme outliers. For example, the Raid channel saw a valence score of around 6.0 on one day, even though every other day was around 5.4. We spoke to ArcheAge’s developers and were unable to find an in-game reason (e.g., a special event put on by the game company) for this to happen, but we intend to look deeper into these outliers in the future.

More interestingly, we see that there are clearly different patterns expressed in the different chat channels over time. For example, the Expedition channel has a clearly increasing “happiness,” while the Party channel has an increasing valence score for the first few weeks and then flattens out. In general, the “social” channels (Expedition, Party, and Raid) see no decrease in valence as the end of the beta test approaches. Although players do have somewhat changing sentiment, it is not sadness as the world reaches its end.

Figure 5: Valence scores over time by chat channel. Note differing y-scales. (Panels: EXPEDITION, FACTION, FIND_PARTY, PARTY, RAID, SAY, TRADE, ZONE; axes: Days vs. Valence Score.)
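To make the valence measurement concrete, the following minimal sketch scores a day of chat as the average valence of the words found in a lexicon. This is one common formulation of a dictionary-based valence score and is assumed here for illustration; the exact weighting used in [8] and the Korean dictionary from [5] are not reproduced, and the tiny English lexicon below is a made-up placeholder.

```python
def valence_score(tokens, lexicon):
    """Mean valence of the tokens that appear in `lexicon`.

    `lexicon` maps a word to a score on the 1 (unpleasant) to 9 (pleasant)
    scale described in the text. Words missing from the lexicon are skipped.
    Returns None when no token can be scored.
    """
    scored = [lexicon[t] for t in tokens if t in lexicon]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Placeholder lexicon; the real study uses a Korean valence dictionary.
toy_lexicon = {"win": 8.0, "lose": 2.0, "the": 5.0}
day_tokens = ["win", "the", "lose", "xyz"]
score = valence_score(day_tokens, toy_lexicon)
```

Because repeated words are counted each time they occur, frequent words dominate the daily score, which is why neutral function words tend to pull values toward the middle of the scale.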
6. INDIVIDUAL PLAYER-LEVEL BEHAVIORAL CHANGE
So far we have examined aggregated dynamics from the system-wide view and found that global-scale pandemic behavior does not seem to emerge. In this section, we focus on each individual player to see whether there is a noticeable behavioral change that can be hidden when viewed in aggregate.

Footnote: We do not have the data of private channels.
Our aim is to test whether an individual player shows behavioral change at the end of the CBT. To this end, we first define the irregularity of player behavior as the behavioral difference between the last day and the average day. In other words, we generate time-series of the frequency of actions for each user and detect a positive (higher frequency than average) or negative peak (lower frequency than average) on the last day.

Although there are sophisticated approaches to detect peaks at arbitrary time points of a given stream, like [13], what we need is much simpler. We are interested in whether or not the irregular behavior is observed on the last day, and thus, we take the straightforward approach of using the mean and the standard deviation of the time-series prior to the last login day. More specifically, we define a positive peak as the case where the frequency of a given action on the last login day of a player is greater than the mean frequency of the action plus two standard deviations. Similarly, a negative peak is defined as the case where the frequency of a specific action on the last login day is less than the mean frequency of the action minus two standard deviations.

More formally,

\[
\mathrm{peak}(c, a_i) =
\begin{cases}
+ & \text{if } |a_{i,\Omega}| > \langle a_{i,1..(\Omega-1)} \rangle + 2 \times \sigma_{i,1..(\Omega-1)} \\
- & \text{if } |a_{i,\Omega}| < \langle a_{i,1..(\Omega-1)} \rangle - 2 \times \sigma_{i,1..(\Omega-1)}
\end{cases}
\]

where c is the player, a_i is the action i, and Ω is the last day. While the height of the peak shows the strength of the irregularity, we first focus on the sign of the peak only.

For the correct interpretation of peaks occurring at the last day, we compare them with peaks that can be observed from the players who leave the game earlier (i.e., voluntary churners).
We denote the former with peaks for system-wide (S) reasons (i.e., the end of the beta test) and the latter with peaks for individual (I) reasons. The key question here is which behaviors are common and which are not common between S and I players, because the circumstances that the players face are similar, but at the same time different. For example, players are not likely to care about the consequences of their actions on the last day because the next day does not come. In particular, penalties for bad actions, e.g., a decrease in “honor points” or account suspension, are meaningless for both S and I players since they will not be playing another day anyway.

Nevertheless, there exist differences between the circumstances they face. The difference is whether others are also supposed to leave the game at the same time and for the same reason. In other words, while I users leave the game alone, S users leave the game together with all other players. From this difference we posit that system-wide impact could be observed from S users but not from I users. This means that some of the behavioral changes derived from shared emotions among the players might be observed from S but not I.

Also, there might be some differences between the attitudes of the S and I users towards the game. Playing the game until the end of the beta test (S users) shows players’ devotion to the game. By contrast, leaving the game during the beta test (I users) indicates that players lost interest. Losing interest in the game also connects to losing loyalty to the game and might lead to anti-social behavior.

We filter out players that logged in on fewer than 5 days (not necessarily consecutive) to exclude unstable one-time playing characters from our analysis. This filtering criterion leaves 22,945 players, 28.27% of all players created during the 4th CBT.
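The last-day peak definition and the five-day activity filter described above can be sketched as follows. The input is assumed to be one player's per-login-day counts of a single action in chronological order; that layout, the function name, and the use of the population standard deviation are illustrative assumptions, since the paper does not say which estimator was used.

```python
from statistics import mean, pstdev

MIN_LOGIN_DAYS = 5  # players with fewer login days are excluded, as in the text


def last_day_peak(daily_counts):
    """Classify the last login day of one (player, action) time series.

    `daily_counts` holds the action frequency on each login day, oldest
    first; the final element is the last login day. Returns "+" for a
    positive peak, "-" for a negative peak, and None otherwise (including
    players below the login-day threshold).
    """
    if len(daily_counts) < MIN_LOGIN_DAYS:
        return None
    *history, last = daily_counts
    mu = mean(history)
    sigma = pstdev(history)  # population std dev: an assumption
    if last > mu + 2 * sigma:
        return "+"
    if last < mu - 2 * sigma:
        return "-"
    return None


# Toy usage: a burst of activity, a collapse, and an ordinary last day.
burst = last_day_peak([10, 10, 10, 10, 30])
collapse = last_day_peak([10, 12, 8, 10, 0])
ordinary = last_day_peak([10, 12, 8, 10, 11])
```

Since frequencies cannot be negative, a negative peak is only possible when the mean exceeds two standard deviations, so highly erratic players can never register one under this rule.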
Among these relatively long-lived (again, > […] peak(c, a_i) for 6,242 I players and 15,430 peak(c, a_i) for 3,150 S players. Among them, 40.93% of players show irregular behavior on their last login day. In other words, four out of ten players behave irregularly on their last login day no matter whether leaving the game due to systemic or personal reasons. This confirms our original intuition about the pervasiveness of unusual behavior on the last day.

Figure 6: Normalized frequency of positive/negative peaks for each type of action at the last day of the player’s “life.” Negative peak frequency is shown on the negative y-axis. INDIVIDUAL players stopped playing before the CBT ended, while SYSTEM players stayed until the end. (Axes: Action vs. Normalized Peak Frequency; Character Type: INDIVIDUAL, SYSTEM.)

Figure 6 presents the number of players (either I or S) that show positive and negative peaks on their last login day for each action, normalized to [0, 1] and [−1, 0], respectively. From the figure, we observe many differences between I and S players. Both I and S players show positive peaks for mail sending and receiving. In detail, I players have more positive peaks for mail sending than receiving, but S players show more receiving than sending. Players who leave on purpose behave differently compared to players who leave due to the system end.

We then rank each action by its likelihood to show positive peaks and negative peaks. Considering the relatively small sample size of ArcheAge players compared to million-player scale MMORPGs, we use the lower bound of the Wilson score confidence interval for a Bernoulli parameter [20] instead of a simple ranking by the exact number of positive and negative peaks. The method to compute the Wilson score is as follows:

\[
W = \frac{\hat{p} + \dfrac{z_{\alpha/2}^{2}}{2n} - z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1 - \hat{p}) + z_{\alpha/2}^{2}/(4n)}{n}}}{1 + \dfrac{z_{\alpha/2}^{2}}{n}}
\]

where \hat{p} is the proportion of positive peaks to negative peaks, n is the number of peaks, and z_{α/2} is the (1 − α/2) quantile of the normal distribution.
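The Wilson lower bound above can be computed directly, as in this minimal sketch. Here `positive` is taken to be the number of positive peaks and `total` the number of all peaks for an action, which is one reading of the definition of p̂ and is an assumption; the function name and the default z = 1.96 (the 95% level) are likewise illustrative.

```python
from math import sqrt


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli proportion.

    positive: number of positive peaks (assumed numerator of p-hat)
    total:    number of all peaks observed for the action
    z:        normal quantile; 1.96 corresponds to 95% confidence
    """
    if total == 0:
        return 0.0
    p_hat = positive / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * sqrt((p_hat * (1 - p_hat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


# Same observed proportion (90%), very different amounts of evidence.
many = wilson_lower_bound(90, 100)
few = wilson_lower_bound(9, 10)
```

Ranking by this bound rather than by the raw proportion places an action with 90 positive peaks out of 100 above one with 9 out of 10, since the smaller sample gives much weaker evidence for the same ratio.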
We use $z_{\alpha/2} = 1.96$ for a 95% confidence level.

Table 2 shows the top 10 actions with positive peaks (the highest W) and negative peaks (the lowest W) for S players and I players. From the top positive peaks of the S players, we find an increase in social interaction, such as exchanging mails, party creation and invitation, and item exchange. The significant increase in item production shows that players willingly accept the risk of production failure on their last login day because the risk is no longer a risk. Raid withdrawal indicates that raiding parties were broken up when the CBT ended.

Among the top positive peaks of the I users, we find two kinds of actions that are not observed among the S users: character deletion and PK. As a single account may create multiple characters, deleting characters is not essential when leaving the game. Thus, it is reasonable to interpret character deletion as intentional behavior for one's own reasons. We suppose that users intend to wipe out their characters when they leave, perhaps related to growing privacy concerns online.

PK observed as a positive peak is more interesting. ArcheAge churners lose their reservations and show online disinhibition just before they leave the game. Even though PK is not explicitly banned by gameplay mechanics, it is fundamentally harmful to other players and involves risks such as penalty or item loss upon death. Several studies have examined the underlying conditions that catalyze toxic behavior [15], and leaving the game might very well be one more condition that can trigger toxic behavior. It is worth noting that, by contrast, PK is not likely to appear (48th rank) for players who stayed until the end.

Most of the top negative peaks in Table 2 are common between the two sets of players.
The easily recognizable trend is that players do not usually invest their time in making their characters better or stronger (e.g., level up, ability change, experience point change, quest acceptance, quest completion, and game money change) once the end of the world approaches. Players typically perform these actions because of their expected outcome in the future rather than an inherent enjoyment of the actions themselves. Therefore, at the end of the beta test, they cease performing them. This contradicts the adage attributed to Martin Luther, "Even if I knew that tomorrow the world would go to pieces, I would still plant my apple tree"; people do not behave like that, and the counterfactual optimistic view might be why the adage resonated.

Table 2: Top 10 positive and negative peaks for SYSTEM and INDIVIDUAL users. Columns: Rank; SYSTEM positive peak and negative peak; INDIVIDUAL positive peak and negative peak.

While Table 2 shows which actions show positive or negative peaks, players can actually perform multiple actions. How, then, do such peaks associate with each other? To answer this question, we borrow some tools from network science: we construct a network of peaks and cluster the peaks into communities.

More formally, we build a graph $G(V, E)$ where each vertex is a pair of an action and a peak sign, i.e., $v_{a_i,\mathrm{sign}} \in V$. For each vertex $v_i$, we define $c_{v_i}$ as the set of players that have the corresponding peak $v_i$. Next, we connect two vertices by an edge if and only if they have any players in common:

$$e_{i,j} = \begin{cases} 1 & \text{if } c_{v_i} \cap c_{v_j} \neq \emptyset \\ 0 & \text{otherwise} \end{cases}$$

Naturally, we weight each edge by the number of players common to both vertices:

$$w_{i,j} = |c_{v_i} \cap c_{v_j}|$$

We note that vertices without edges (i.e., isolated vertices) are excluded from the graph. We also prune relatively unimportant edges based on their weights by backbone extraction [18], with the statistical significance level set to 95%. In pruning, the importance of a single edge can be measured differently depending on which of its vertices it is measured from. Thus, to better understand associations of peaks, we transform each undirected edge into two reciprocal directed edges and prune them according to the statistical framework. In the remaining text, $G'$ denotes the pruned $G$.

We investigate two weighted directed networks, $G'_S$ and $G'_I$, built from S users and I users, respectively, which allows us to compare peak associations between the two groups of users. We find that $G'_I$ consists of fewer vertices than $G'_S$: it has 79 vertices and 464 edges, whereas $G'_S$ has 103 vertices and 498 edges. The difference mainly comes from negative peaks.
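The graph construction and pruning described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the input format (`peaks_by_player`), both function names, and the handling of vertices with a single edge are hypothetical, and the pruning step assumes the disparity filter of [18], which keeps a directed edge from vertex $i$ to $j$ when $(1 - w_{i,j}/s_i)^{k_i - 1} < \alpha$, with $s_i$ the strength and $k_i$ the degree of $i$.

```python
from collections import defaultdict
from itertools import combinations


def build_peak_graph(peaks_by_player):
    """Return {(u, v): weight} for undirected edges between peak vertices.

    peaks_by_player maps a player id to the set of (action, sign) peaks seen
    on that player's last login day (hypothetical input format).
    """
    players_by_peak = defaultdict(set)  # c_v: players exhibiting peak v
    for player, peaks in peaks_by_player.items():
        for peak in peaks:
            players_by_peak[peak].add(player)

    weights = {}
    for u, v in combinations(sorted(players_by_peak), 2):
        common = len(players_by_peak[u] & players_by_peak[v])
        if common > 0:  # an edge exists iff the player sets intersect
            weights[(u, v)] = common  # weight = number of shared players
    return weights  # isolated vertices never appear in any edge


def disparity_backbone(weights, alpha=0.05):
    """Keep directed edges that are significant from their source vertex."""
    strength = defaultdict(int)
    degree = defaultdict(int)
    for (u, v), w in weights.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    kept = set()
    for (u, v), w in weights.items():
        for src, dst in ((u, v), (v, u)):  # two reciprocal directed edges
            k = degree[src]
            if k < 2:
                continue  # assumption: single-edge vertices cannot be tested
            p_value = (1 - w / strength[src]) ** (k - 1)
            if p_value < alpha:
                kept.add((src, dst))
    return kept


toy = {
    "p1": {("mail sending", "+"), ("party creation", "+")},
    "p2": {("mail sending", "+"), ("party creation", "+")},
    "p3": {("level up", "-")},
}
print(build_peak_graph(toy))
```

Because significance is evaluated separately from each endpoint, an edge can survive in one direction only, which is why the pruned networks are directed.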
Only four negative peaks (level up (-), ability change (-), mate recall (-), and mate recall cancel (-)) are included in $G'_I$, whereas 28 negative peaks are in $G'_S$.

A large proportion of positive peaks in $G'_I$ means that players do more of those actions together on their last login day rather than doing them less, especially when they decide to leave the game. More importantly, those positive peaks are well connected to each other, like a clique.

Intuitively, actions of a similar category are likely to synchronize. In other words, if a user does more leveling up, then the user is likely to do other activities related to making the character stronger. To support our intuition with data, we apply the Louvain method [2], a widely used modularity-based community detection algorithm, to identify densely connected local groups of peaks.

Figure 7 shows the communities derived from the peak networks $G'_I$ and $G'_S$. From this plot we make several observations. We find 12 communities in $G'_S$ and 15 communities in $G'_I$. The low link density, defined as $|E| / \{|V|(|V|-1)\}$, of $G'_S$ (0.047) results in fewer communities, although the pruned network has more vertices than $G'_I$ (0.075).

Interestingly, we find no communities containing positive peaks and negative peaks at the same time in either network. This tendency for peaks of different signs never to associate with each other is further confirmed by the absence of direct edges between peaks of different signs in $G'_I$ and $G'_S$. That is, players do not change their typical behavior in two different ways at the same time.

Also, the communities from the two networks do not overlap well, as we observe in Figure 7. This finding is more evidence that S players and I players behave differently on their last login days.
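As a quick arithmetic check, the link densities quoted above follow directly from the reported vertex and edge counts of the two pruned directed networks. The helper name `link_density` is ours and is used only for illustration.

```python
def link_density(num_vertices, num_edges):
    """Directed link density |E| / (|V| * (|V| - 1))."""
    return num_edges / (num_vertices * (num_vertices - 1))


density_s = link_density(103, 498)  # pruned network of SYSTEM players
density_i = link_density(79, 464)   # pruned network of INDIVIDUAL players

print(round(density_s, 3))  # 0.047
print(round(density_i, 3))  # 0.075
```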
To quantitatively measure the similarity (or difference) between the two sets of communities, for each community in $G'_S$, we compute the Jaccard similarity coefficient with every community identified in $G'_I$. Then we pick the community with the maximum Jaccard coefficient as the corresponding community for that community in $G'_S$. The average and the median of the computed Jaccard coefficients are 0.537 and 0.500, respectively. We note that in computing Jaccard similarity, we focus only on vertices (peaks) that are common to both networks because the two networks have different numbers of vertices.

In the system-wide view, we find that the sentiment of player chat slightly changes over time, as in Figure 5. For our last experiment, we examine how an individual's chat behavior changes when their playing reaches an end. We again use the valence score to capture sentiment.

Since the range of valence scores that each player uses is different, comparing the average of the scores for all players during typical days and on the last day averages out the behavioral change of each player. For example, suppose the score of one user changes from 3 to 7 and that of another user changes from 7 to 3. In this case, the average score stays at 5, even though each user shows substantial behavioral change. To address this issue, we track the change of every player, i.e., +4 and -4 for the two users in the above example.

For each player, we then subtract the average valence score of typical days from that of the last day and denote it by ∆. Figure 8 shows a box-plot distribution of ∆ for the S users and I users. Although the medians of the two groups are almost the same ( S : − . , and I : . ), we can see that the distribution of I users is much wider. In other words, users show more change in chat sentiment when they leave the game of their own volition than when the beta test ends.
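The per-player difference ∆ and the averaging-out problem it avoids can be illustrated with the two-user example from the text. This is a minimal sketch: the player names, the data layout, and the function `valence_delta` are hypothetical and only mirror the example above.

```python
from statistics import mean


def valence_delta(typical_day_scores, last_day_score):
    """Last-day valence minus the player's own typical-day average."""
    return last_day_score - mean(typical_day_scores)


# Two hypothetical players mirroring the example in the text:
# (valence scores on typical days, valence score on the last day).
players = {
    "player_a": ([3, 3, 3], 7),
    "player_b": ([7, 7, 7], 3),
}

deltas = {
    name: valence_delta(typical, last)
    for name, (typical, last) in players.items()
}

# Group-level averages hide the change that the per-player deltas reveal.
group_typical = mean(mean(typical) for typical, _ in players.values())
group_last = mean(last for _, last in players.values())

print(deltas)                     # per-player change: +4 and -4
print(group_typical, group_last)  # both group averages equal 5
```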
Also, we have found a statistically significant effect of group (I or S) on the sentiment ( p < . ).

We break down the change of valence score based on players' typical valence score in Figure 9. We bin players by the rounded value of their average valence score for typical days and then show the distributions as box plots. With this breakdown, an interesting pattern that is veiled in Figure 8 emerges: ∆ is positive when the sentiment of typical days is negative (< 5) and negative when the sentiment of typical days is positive (> 5).

Figure 7: Backbone networks of peaks for users who stayed until the system ended (left) and users who left of their own accord before then (right). Each vertex is an action with a peak sign, e.g., mail sending (+) or level up (-).

Figure 8: Change of valence score on the last day compared to typical days. x-axis: ∆ Valence (Last day − Typical days); groups: INDIVIDUAL, SYSTEM.
7. DISCUSSION AND CONCLUSION
In this work, we focused on understanding user behavior during the beta test of an MMORPG. We used detailed logs from the CBT of the MMORPG ArcheAge as a proxy for an extreme situation: at the end of the CBT, all user data is deleted, and thus the outcome of players' in-game behavior on the last days loses its meaning. We examined this virtual petri dish in terms of how player behavior evolved over time and then focused on the particular behavioral changes that occurred right before the CBT ended.

Figure 9: Change of valence score on the last day compared to typical days, binned by the average valence score of typical days. Panels: INDIVIDUAL, SYSTEM; x-axis: average valence score of typical days (bins 3 to 7); y-axis: ∆ Valence (Last day − Typical days).

Our findings show that there are no apparent pandemic behavior changes even when the CBT ends. While we did find that some players resorted to anti-social behavior, such as murder, aggregate sentiment through chats shows pro-social trends. When we focus on individual users' behavioral changes, we find significant differences between churners who voluntarily left the game before the end and players who stayed until the end. In particular, we found that churners were more likely to exhibit anti-social behavior (PK). Using network analysis, we found communities of related behavioral peaks. Interestingly, no communities contain positive and negative peaks simultaneously (i.e., no positive peak in a given behavior is correlated with a negative peak in a different behavior). Players do not change their typical behavior in two different ways at the same time. Finally, when the CBT ends, we found that, contrary to the reassuring adage, players abandoned character progression, showing a drastic decrease in quest completion, leveling, and ability changes.
Although we believe that our dataset represents about as close as we can get to an empirical end of the world scenario, there are several limitations to keep in mind. First, ArcheAge is a video game and thus the true consequences of the CBT ending are entirely virtual: although plenty of in-game characters perish, no humans do. Thus, it would be naive of us to claim a one-to-one mapping with real world behavior. However, players do invest substantial time and energy into their characters, and it is quite common for virtual property to be worth real world money these days, so there are some real consequences.

Next, we examined data from the 4th CBT of ArcheAge, but there were three previous CBTs, one subsequent closed beta, and an open beta prior to the game officially launching. It is quite possible that players became familiar with the end of the beta tests through the prior tests. However, the end of the particular instance of their virtual avatar affected their behavior. That said, where relevant, our findings tend to be in agreement with those based on real world incidents, and thus we suspect this knowledge played little role.
Our study brings practical and theoretical implications to the game industry and research communities. Practically, our findings on the irregular behavior of individual churners could serve as an alarm, or early warning, of their leaving. As addressing churners remains a consistent goal of game developers, our work can help inform the development of retention strategies, such as offering incentives or new interactions to help players become attached to the virtual world. Also, knowing which actions players perform more or less often as the end of the CBT approaches provides guidance on how to run the CBT; some features should be tested earlier because players abandon them when the end of the CBT comes.

From the perspective of studying human behavior where behavioral outcome does not have significant meaning, our findings that players do not invest their time in advancement and that some outliers exhibit anti-social behavior can help design future studies.

Also, we have provided additional empirical evidence in favor of the emergence of pro-social behavior. Our finding that the sentiment of chat channels specific to social groupings trends toward "happier" as the end times approach is a first indication of this pro-social behavior: existing social relationships are likely being strengthened. Further, we saw that players who stayed until the end of the world exhibited peaks in the number of small temporary groupings: new social relationships are being formed.
8. ACKNOWLEDGMENTS
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2014R1A1A1006228).
9. REFERENCES
[1] J. Blackburn and H. Kwak. STFU NOOB!: predicting crowdsourced decisions on toxic behavior in online games. In Proceedings of the 23rd International Conference on World Wide Web, pages 877–888, 2014.
[2] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
[3] E. Castronova. On the Research Value of Large Games: Natural Experiments in Norrath and Camelot. Games and Culture, 1(2):163–186, Apr. 2006.
[4] E. Castronova, D. Williams, Cuihua Shen, R. Ratan, Li Xiong, Yun Huang, and B. Keegan. As Real as Real? Macroeconomic Behavior in a Large-scale Virtual World. New Media & Society, 11(5):685–707, 2009.
[5] P. S. Dodds, E. M. Clark, S. Desu, M. R. Frank, A. J. Reagan, J. R. Williams, L. Mitchell, K. D. Harris, I. M. Kloumann, J. P. Bagrow, K. Megerdoomian, M. T. McMahon, B. F. Tivnan, and C. M. Danforth. Human language reveals a universal positivity bias. In Proceedings of the National Academy of Sciences, pages 2389–2394, 2015.
[6] C. Ess. Culture, technology, communication: Towards an intercultural global village. The Information Society, 20(3):233–234, 2004.
[7] S. C. Gold and J. Wolfe. The validity and effectiveness of a business game beta test. Simulation & Gaming, 43(4):481–505, 2012.
[8] P. Gonçalves, M. Araújo, F. Benevenuto, and M. Cha. Comparing and combining sentiment analysis methods. In Proceedings of the First ACM Conference on Online Social Networks, pages 27–38, 2013.
[9] Y. Jung, B.-H. Lim, K.-H. Sim, H. Lee, I. Park, J. Chung, and J. Lee. Venus: The online game simulator using massively virtual clients. Systems Modeling and Simulation: Theory and Applications, 3398:589–596, 2005.
[10] A. R. Kang, J. Park, and H. K. Kim. Loyalty or profit? Early evolutionary dynamics of online game groups. In Proceedings of the 12th Annual Workshop on Network and Systems Support for Games, 2013.
[11] A. R. Kang, J. Park, J. Lee, and H. K. Kim. Rise and fall of online game groups: Common findings on two different games. In Proceedings of the 6th Annual Workshop on Simplifying Complex Networks for Practitioners (collocated with WWW), pages 1079–1084, 2015.
[12] A. R. Kang, J. Woo, J. Park, and H. K. Kim. Online game bot detection based on party-play log analysis. Computers & Mathematics with Applications, 65(9):1384–1395, 2013.
[13] J. Kleinberg. Bursty and hierarchical structure in streams. Data Mining and Knowledge Discovery, 7(4):373–397, 2003.
[14] H. Kwak and J. Blackburn. Linguistic analysis of toxic behavior in an online video game. In Proceedings of the First Workshop on Exploration on Games and Gamers (collocated with SocInfo), 2014.
[15] H. Kwak, J. Blackburn, and S. Han. Exploring cyberbullying and other toxic behavior in team competition online games. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 3739–3748, 2015.
[16] E. T. Lofgren and N. H. Fefferman. The untapped potential of virtual game worlds to shed light on real world epidemics. The Lancet Infectious Diseases, 7(9):625–629, 2007.
[17] S. J. J. Magnus Boman. Modeling epidemic spread in synthetic populations-virtual plagues in massively multiplayer online games. In Proceedings of Digital Games Research Association International Conference, 2007.
[18] M. Á. Serrano, M. Boguñá, and A. Vespignani. Extracting the multiscale backbone of complex weighted networks. In Proceedings of the National Academy of Sciences, volume 106, pages 6483–6488, 2009.
[19] D. Williams. The mapping principle, and a research framework for virtual worlds. Communication Theory, 20(4):451–470, 2010.
[20] E. B. Wilson. Probable inference, the law of succession, and statistical inference.