Game Mechanic Alignment Theory and Discovery
Michael Cerny Green, Ahmed Khalifa, Philip Bontrager, Rodrigo Canaan, Julian Togelius
Michael Cerny Green, [email protected], New York University | OriGen.AI, New York City, New York, USA
Ahmed Khalifa, [email protected], Game Innovation Lab | Modl.ai, New York City, New York, USA
Philip Bontrager, [email protected], New York University, New York City, New York, USA
Rodrigo Canaan, [email protected], New York University, New York City, New York, USA
Julian Togelius, [email protected], New York University, New York City, New York, USA
ABSTRACT
We present a new concept called Game Mechanic Alignment theory as a way to organize game mechanics through the lens of environmental rewards and intrinsic player motivations. By disentangling player and environmental influences, mechanics may be better identified for use in an automated tutorial generation system, which could tailor tutorials for a particular playstyle or player. Within, we apply this theory to several well-known games to demonstrate how designers can benefit from it, we describe a methodology for how to estimate mechanic alignment, and we apply this methodology on multiple games in the GVGAI framework. We discuss how effectively this estimation captures intrinsic/extrinsic rewards and how our theory could be used as an alternative to critical mechanic discovery methods for tutorial generation.
CCS CONCEPTS
• Applied computing → Computer games; • Human-centered computing → Information visualization; • Mathematics of computing → Nonparametric statistics.

KEYWORDS
tutorial, player behavior, video game, mechanic, game mechanic, playstyle
ACM Reference Format:
Michael Cerny Green, Ahmed Khalifa, Philip Bontrager, Rodrigo Canaan, and Julian Togelius. 2021. Game Mechanic Alignment Theory and Discovery. In
FDG ’21: Foundations of Digital Games, August 03–06, 2021, Woodstock, NY.
ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/1122445.1122456
1 INTRODUCTION
A player's first experience with a video game is often with its tutorial. Tutorials provide a way for a game designer to communicate with the player, to train them, and to help them understand the
game's rules. Without this guidance, the player may become frustrated, unable to figure out how to play or feel a sense of progression. At worst, a tutorial is unhelpful and confusing. But if done correctly, a tutorial excites and encourages the player to keep playing.

To be effective teachers, tutorials need to contain the crucial bits of information needed to play, such as what the controls are, how to win, and how to lose. This information is typically defined by the game's mechanics, i.e. the events within the game triggered by game elements that impact the game state [46]. "Critical mechanics" are the mechanics that need to be triggered in order to win a level [22]. Therefore, it makes logical sense that critical mechanics be explained within a tutorial. Previous work has proposed several solutions to automatically find critical mechanics (coined "critical mechanic discovery methods") using uninformed [20, 21] and informed [22] tree search methods and a graph of game mechanic relationships.

However, player enjoyment cannot and should not be limited to a binary choice of "winning" and "losing". Players engage with games with their own intrinsic biases and motivations, which impact the enjoyment they receive from play. For example, in the game Minecraft (Mojang 2009), a voxel-based open-world sandbox game, the designers may have intended for players to travel to the "End," a dangerous zone of monsters and treacherous terrain, to defeat the Ender Dragon, which could act as the game's final boss. However, many players may choose to never travel to the End, instead selecting to build large castles and design elegant structures that are aesthetically pleasing. In an extreme case, some players employ TNT in an explosive attempt to blow up as much of the rendered world as possible without crashing their server. Admittedly, defeating the Ender Dragon does not explicitly end the game either, as the player can keep on playing in their world with the ability to slay the Ender Dragon repeatedly.

The game designer creates external, "environmental" incentives to guide the player into behaving a certain way. However, a player has their own internal, "intrinsic" motivations, which may be aligned with or run against these external incentives. By categorizing mechanics within a framework that provides spaces for both extrinsic and intrinsic incentives, we can better understand game mechanics and how to teach them to players. Automated tutorial generation [21] is a relatively unexplored artificial intelligence application that could greatly benefit from such a framework. Critical mechanic discovery [22], i.e. the process of automatically finding which mechanics to teach inside a tutorial, provides a useful family of methods to feed an automated tutorial generator. However, tutorial generation methods are limited and have drawbacks, chief among those being a reliance on a complex game graph of mechanical relationships similar to the graphs created by the Machinations framework [16] (https://machinations.io/).

This paper presents Game Mechanic Alignment theory, a framework in which mechanics can be analyzed in terms of intrinsic and extrinsic rewards.
This work includes examples of Game Mechanic Alignment applied to some well-known games, a methodology that can estimate mechanic alignment given playtrace data, and an application of this methodology on several video games in the General Video Game Artificial Intelligence (GVGAI) framework [40]. We conclude by discussing how the theory can be applied as input to automated tutorial generation systems, as an alternative statistical method to critical mechanic discovery.
2 GAME MECHANICS
Sicart defines a "game mechanic" as an event within the game that is fired by a game element that impacts the game's state [46]. Mechanics allow players to interact with and impact the game's state. Tutorials are meant to help players understand game mechanics and, ultimately, learn how to play with them.
In addition to his definition of game mechanics, Sicart defines "core," "primary," and "secondary" mechanics, which are also relevant to this work:
• Core mechanics are (repetitively) used by agents to achieve a systematically rewarded end-state.
• Primary mechanics are the subset of core mechanics that can be directly applied by agents to solve challenges that lead to a desired end-state.
• Secondary mechanics are the subset of core mechanics that make it easier for the agent to reach the desired end-state but are not essential like primary mechanics.
For example, in Super Mario Bros (Nintendo 1985) World 1-1, Jumping and Running are both core mechanics. Jumping is a primary mechanic, as you can't finish the level without it. Running is a secondary mechanic, as it makes jumping easier and finishing the level faster, but it is not necessary to reach the desired end state. Another example is Pacman (Namco 1980), where eating pellets is primary and eating ghosts is secondary.

Core mechanics have indeed been defined by others [1, 30, 45], but those definitions lack a systemic perspective. For example, Salen and Zimmerman define them as "the essential play activity players perform again and again in a game (...) however, in many games, the core mechanic is a compound activity composed of a suite of actions" [45]. Järvinen indirectly touches upon reward systems with "the possible or preferred or encouraged means with which the player can interact with game elements as she is trying to influence the game state at hand towards the attainment of a goal" [30], but still does not tell us whose goals these are, or whether the core mechanics are actually essential to attaining them. Although Sicart's definitions provide precision with this useful formalism, they admittedly lack the ability to classify all game mechanics outside of those oriented around environmentally defined goal/reward structures. Our proposed framework attempts to cover some of this undefined space, providing a method to analyze mechanics through the lens of player and environment.

3 TUTORIALS
Developers have experimented with multiple tutorial formats [50]. For simple games meant to be picked up and played quickly, mechanics tend to be intuitive: "Press space to shoot", "Press up to jump", and so on. As a result, these games usually lack a formal tutorial. As game complexity increased and home consoles started to explode in popularity, formal tutorials became more common. Tutorials have since evolved to incorporate different design and presentation styles depending on the taste and conviction of game designers and their perception of players [20]. Tutorials can adapt to the different learning capabilities of their users. Sheri Graner Ray [43] discusses different knowledge acquisition styles: Explorative Acquisition follows a "learning by doing" philosophy, while Modeling Acquisition is about "reading before doing." Green et al. [20] propose three different presentation styles in the same vein:
Text, Demonstrations, and
Well-Designed Experiences.

Several projects have addressed challenges in automated tutorial generation, such as heuristic generation for Blackjack and Poker [12–14] or quest/achievement generation in
Minecraft [2]. Mechanic Miner [11] evolves mechanics for 2D puzzle-platform games, using
Reflection (https://code.google.com/archive/p/reflections/) to find a new game mechanic and then generating levels that utilize it. The Gemini system [47] takes game mechanics as input and performs static reasoning to find higher-level meanings about the game. Mappy [39] can transform a series of button presses into a graph of room associations, transforming movement mechanics into level information for any Nintendo Entertainment System game.

The
AtDelfi system [21] attempts to solve the challenge of automatically generating tutorials for video games using two different formats: text-based instructions and curated GIF demonstrations. This has later been expanded upon to include small sub-levels [23, 32] and entire levels in Mario [24] and 2D arcade games [10]. To develop these tutorials, each system requires an input set of game mechanics, referred to as the critical mechanics: the set of mechanics that are necessary to trigger in order to win the level. In addition to presenting a method to automatically find critical mechanics,
AtDelfi also includes mechanics that give the player points or cause a loss.
Talin [4] is a Unity-based tutorial generation system which dynamically presents information to a player based on their skill level. Novice players will be presented with more information, whereas experienced players will be spared unneeded tooltips. Talin differentiates itself from AtDelfi in that it does not attempt to automatically discover which mechanics are critical, but instead which of the manually selected mechanics need to be displayed for the user's consumption.

By choosing to expose or omit certain game mechanics, a tutorial may cater to some players' intrinsic motivations over others. In this work, we suggest that automated tutorial generation systems should incorporate not only different presentation formats and learning styles, but also a variety of players' intrinsic motivations.
4 PLAYER BEHAVIOR
Ribbens et al. wrote that studying player behavior should not occur in isolation from the game environment [44]. A player cannot play without an environment to interact within, and gameplay is intimately intertwined with the particular player.

Gameplay environments are created and shaped by the game designer. Conscious design decisions can influence player behavior and therefore the player experience [6]. But player behavior is not just influenced by the environment. Players carry their own biases into the game that influence their in-game behavior, stemming from individual motivations [36], culture [7], and even age [49]. By analyzing behaviors, one can categorize players into a taxonomy of playstyles [5, 55], with each category not being mutually exclusive of the rest. Artificial intelligence systems may assist with player behavioral analysis [29]. Our work presents a novel method to analyze player behavior by focusing on game mechanic usage during play.
Player modeling is the study of computational models of players in games, including their incentives and behavior [54]. It is often used to study and even mimic the styles of players. This can be done using methods such as supervised learning on real playtraces [38, 51] or utility-function formulation [25–28]. Player modeling is relevant to this work, as AI gameplaying agents are used in place of humans for the sake of rapidly studying the efficacy of this method.

Observations of human play data have been used to bias tree search agents to play card games [15]. Outside of learning from human data, several projects have demonstrated that human behavior can be mimicked by limiting computational resources [37, 56]. Khalifa et al. [34] identify another method that can be used to manipulate tree search to act more like humans. In this work, we utilize player modeling by using gameplaying agents in lieu of human players to demonstrate the method's efficacy.
GVG-AI is a research framework for general video game playing [40, 41], aimed at exploring the problem of creating artificial players capable of playing a variety of games. The organizers host an annual competition where AI agents are scored on their performance on unseen games. Each agent is given 40 milliseconds to submit an action, provided with a forward model for the current game. The framework's environment has grown over the years [40], including new competition tracks such as level generation [35], rule generation [33], learning agents [52], and two-player agents [18]. To date, the framework contains a diverse set of over 100 games, including familiar titles such as
Pacman (Namco 1980) and
Sokoban (Imabayashi 1981), and brand new games such as
Wait for Breakfast.

5 GAME MECHANIC ALIGNMENT
This paper presents our theory of "Game Mechanic Alignment". Within this framework, game mechanics can be categorized in terms of the environment (the game) and the player engaging with it. Usually, games contain designer-defined reward systems which are consistent regardless of who is playing. The impact these reward systems have on play, and the form they take, highly depend on the conscious decisions of the designers. In a general sense, these reward systems can be interpreted as environmental penalties versus environmental rewards, which usually guide the player toward winning and away from losing. This is a separate concept from the environmental reward explicitly defined in reinforcement learning environments [48]. In games that do not have explicit winning conditions, such as Minecraft (Mojang 2009) or The Sims (Maxis 2000), this condition can be substituted with another win-like condition, such as defeating the Ender Dragon in Minecraft, getting a job in The Sims, having X amount of happy customers in RollerCoaster Tycoon (Chris Sawyer 1999), etc.

In addition to built-in environmental rewards, player-specific intrinsic rewards also influence how a player engages with the game. For example, speedrunners and casual players have very different goals and motivations. A speedrunner may bypass, skip, or glitch their way to the end of the game. A casual player may take their time and explore the environment. They may even test out losing to better understand how certain failure mechanics work. Thus, a player's goals may be independent of the environmental rewards. A game may reward a player when they move toward the right side of the screen, but a player may want to first collect every powerup before moving on to the next checkpoint. In an extreme case, a player interested in exploring the losing mechanics of a game may find themselves moving counter to the environmental rewards. If player incentives and environmental rewards are in agreement for a specific mechanic, we consider this mechanic to be in alignment.

We can think of mechanics as living in a 2D space according to how much they are rewarded both by explicit environmental reward systems and by a player's intrinsic motivations. The extremes of the "Environmental Rewards" axis would coincide with critical and fatal mechanics as defined by previous work in automated game mechanic discovery [22]:
• Critical Mechanics: The set of game mechanics which must be triggered to win, or the equivalent of winning.
• Fatal Mechanics: The set of game mechanics which result in losing, or the equivalent harshest environmental penalty.
At the extremes of the "Intrinsic Rewards" axis lie mechanics that the player feels intrinsically incentivized to pursue or to avoid:
• Incentive Mechanics: The set of mechanics that a specific player is incentivized to trigger over the course of the level to accomplish that player's subgoals.
Figure 1: Game Mechanic Alignment Theory

• Avoidance Mechanics: The opposite of incentive mechanics; the set of mechanics that a specific player avoids triggering over the course of the level to accomplish that player's subgoals.

The intersection of the intrinsic and extrinsic reward axes creates an origin point, a.k.a. the "neutral" zone, where the player's behavior is either not influenced by rewards/penalties or internal motivations/aversions, or only marginally so. The quadrant that a mechanic inhabits embodies the relationship that the mechanic has with the environment and the player. Figure 1 shows the intrinsic and extrinsic reward space plotted as 2D axes, where the x-axis represents extrinsic/environmental rewards and the y-axis represents intrinsic rewards/player motivations. These two axes divide the space into four quadrants. They are, in counter-clockwise order:
• Quadrant 1 (Green): Both the environmental rewards and the player's motivations encourage the player to trigger this mechanic.
• Quadrant 2 (Yellow): The environment punishes the player, but the player wants to trigger this mechanic anyway.
• Quadrant 3 (Red): Both the environmental rewards and the player's motivations discourage the player from triggering this mechanic.
• Quadrant 4 (Blue): The environment rewards the player, but the player wants to avoid triggering this mechanic regardless.

When game mechanics are within Quadrant 1 or Quadrant 3 (the green and red zones), they can be considered to be "in alignment": both the environment and the player are in agreement in regards to rewards and incentives. Perfect alignment would be at the y = x line. However, if mechanics lie within Quadrant 2 and/or Quadrant 4 (the yellow and blue zones), they are player-environment misaligned. The player may be motivated to take actions that cause environmental penalties, or else refuse to take actions that would otherwise give them rewards.
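To make the quadrant semantics concrete, the following minimal sketch (our own illustration; the function name and the epsilon threshold are hypothetical, not from any existing system) classifies a mechanic given its environmental-reward coordinate e (x-axis) and intrinsic-reward coordinate i (y-axis), with a small band around the origin standing in for the neutral zone:

```python
# Hypothetical sketch: map a mechanic's (environmental, intrinsic)
# coordinates from Figure 1 onto the four quadrants / neutral zone.
def quadrant(e: float, i: float, eps: float = 0.05) -> str:
    """e: environmental reward (x-axis), i: intrinsic reward (y-axis).
    Coordinates within eps of the origin count as 'neutral'."""
    if abs(e) <= eps and abs(i) <= eps:
        return "neutral"  # near the origin: only marginal influence
    if e >= 0 and i >= 0:
        return "Q1: aligned (both encourage triggering)"
    if e < 0 <= i:
        return "Q2: misaligned (environment punishes, player pursues)"
    if e < 0 and i < 0:
        return "Q3: aligned (both discourage triggering)"
    return "Q4: misaligned (environment rewards, player avoids)"
```

With estimated coordinates for each mechanic (Section 7), this mapping can be applied directly to playtrace data.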
In the following sections, we propose how this theory can be applied in practical situations. Section 6 describes how a game designer might apply this theory to some video game examples. Section 7 explains a methodology which can be used to estimate mechanic alignment from playtrace data.

6 GAME EXAMPLES
To demonstrate the usefulness of this framework, in this section we present examples using several well-known games: Super Mario Bros (Nintendo 1985), Minecraft (Mojang 2009), and Bioshock (2K Games 2007). For each game, we highlight aspects relevant to the discussion of intrinsic and environmental rewards, and provide mechanic alignment charts as examples of how the intrinsic rewards associated with certain actions could differ for two hypothetical player profiles. Since these examples are simply for illustration purposes, the values on the x and y axes are meant to be taken qualitatively. In each example, we simplify the number and description of the game mechanics: each game has many more mechanics, and they may be described differently depending on the individual doing the analysis. In Sections 7 and 8 we propose a method for estimating these values given playtraces of a game or level by many different players.
6.1 Super Mario Bros
While much has been said about how the Mario series teaches the player through careful level design [19], a lot of the discussion around the series focuses on critical and fatal mechanics such as moving to the right, collecting power-ups, and jumping over hazards. These mechanics directly help the player progress towards winning states or avoid losing states. However, the game also acknowledges and incorporates in its design various intrinsic motivations that players are likely to exhibit.

For example, the scoring system featured in the early games of the franchise rewards the player both for actions that directly help the player progress and avoid losing, such as killing enemies and picking up power-ups, and also for actions that display mastery over the game and have high intrinsic appeal to advanced players, such as grabbing the flagpole at a higher spot and finishing the level faster.

The collectible coins fulfill multiple roles. At first, they provide no immediate benefit other than serving as a token of intrinsically-motivated achievements such as mastering a tricky jump or uncovering secrets. But once a certain number of coins are collected, the player is given an extra life, which helps avoid a game-over state. Finally, coins and other collectibles in Mario and similar games can be used to guide the player toward secrets or suggest alternate paths that may require a power-up, which Khalifa et al. [31] call guidance and foreshadowing, respectively. Following these cues can lead to both environmental and intrinsic rewards.

While there is no explicit in-game tutorial for these reward systems, taken as a whole they show that designers can take steps to align environmental rewards with the actions that satisfy the player's intrinsic motivations. Figure 2 suggests a possible Player-Environment Mechanical Alignment chart for the mechanics of the game and the hypothetical player profiles of "explorer" and "speedrunner". Both players are ultimately interested in beating the level, but the explorer player does this in a slower-paced and safer way. The explorer enjoys eating Mushrooms for extra safety, breaking Bricks out of curiosity for what's inside, and collecting coins
that are easily accessible. The speedrunner, as the name suggests, wants to beat the level with the lowest possible in-game time and will avoid collectibles and bricks unless this results in a faster clear. The speedrunner will even occasionally take damage on purpose to shrink in size in order to go through narrow paths, making the intrinsic reward of this mechanic higher for the speedrunner than for the explorer. While neither player wants to purposefully waste time, the explorer will only choose to run when it is safe to do so, while the speedrunner will run most of the time, even at the risk of losing a life (and thus restarting the level and the timer). That is why the speedrunner has a higher value for the "Run" mechanic than the explorer.

Figure 2: An example of alignment axes for two different players in Super Mario Bros

For the x-axis values, we assigned a relative order for these mechanics according to what we think the game designer wants the player to do to finish the game. That is why "Touch Flag" and "Move Right" have the highest x-axis values, as they provide progress toward finishing the game, while "Losing Life" and "Shrink one size" have the lowest values on the x-axis, as they prevent progress.
6.2 Minecraft
Minecraft presents an interesting case study for our framework, since much of the appeal of the game comes from fulfilling intrinsically-motivated goals. While triggering the credits by beating the Ender Dragon could be considered "winning" the game in a traditional sense, players are free to ignore this goal (potentially indefinitely) and focus on exploring, mining, crafting, and building structures.

Figure 3 illustrates the mechanical alignment for two hypothetical player profiles: the "builder", who takes enjoyment from building
structures, and the "adventurer", who seeks challenging mob encounters, including the Ender Dragon and other opponents found in the Nether and the End dimensions. Both players place equal value on crafting equipment, as it enables both to reach their goals. Neither player desires to be hit, but the adventurer is more comfortable with the possibility, and gets hit more often as a consequence of their more frequent combat encounters.

Figure 3: An example of alignment axes for two different players in Minecraft

For the x-axis values, we assigned them from the perspective of a designer who wants the player to reach the credits scene and finish the game. All mechanics that provide progress toward this goal are on the extreme right side, such as "Visit End/Nether," "Mine Ores," and of course defeating the Ender Dragon, while all the mechanics that hinder this progress lie on the extreme left, such as "Getting hit" and "Death."

6.3 Bioshock
In Bioshock, the player is faced with an important choice regarding the fate of characters known as
Little Sisters, which carries both moral (intrinsic) and environmental implications. When meeting one of these genetically-modified young girls, the player can choose to either harvest or save them. Harvesting them yields more ADAM, an environmental reward that serves as in-game currency for upgrades, but results in the child's death. Saving them yields less ADAM but some other environmental benefits, and can lead to a "happier" ending. Thus the game pits a player's intrinsic motivation for taking a moral path and reaching the "happy" ending against environmental considerations.

Figure 4 illustrates a possible alignment chart for two hypothetical player profiles: the "morally good" player and the "thriller" player. Both these players have the same final goal, which is beating the game, but the morally good player is trying to reach it with the
least amount of killing, while the thriller only cares about power and destruction. That is why they differ on the "Little Sisters" mechanics: since the thriller needs more power, they are highly motivated to kill the sisters, as harvesting provides much more power than saving them. On the other hand, the morally good player does not care about the power but feels bad about killing these little girls (even if they are not real), and they might be motivated to reach the good ending of the game.

Figure 4: An example of alignment axes for two different players in Bioshock

Similarly to the previous games, we sort the mechanics on the x-axis such that mechanics near the right push the player to progress in the game, such as "Shooting Enemies" and "Hacking Turrets", while mechanics near the left block that progress and cause the player to replay certain areas, such as "Death" and "Getting hit".
7 MECHANIC ALIGNMENT ESTIMATION
In this section, we propose a method for automatically estimating the intrinsic and environmental rewards for the game mechanics of a certain game and level. This method requires a sizable distribution of playtraces from a diverse set of agents/players, where each agent/player plays the same level multiple times. The large body of player data allows the influence of player incentives to be separated from the environment-specific motivations, and thus mechanic alignments can be calculated for many types of players. As with any statistical technique, the bigger and more diverse the input data, the more accurate the estimations.

Without loss of generality, the method to calculate how critical/fatal a mechanic is is fairly straightforward and intuitive. We simply want to know how often a mechanic occurs in a winning/losing playtrace as opposed to how often it occurs in any playtrace in general. For example, a critical mechanic would occur in every winning playtrace while likely occurring infrequently in losing playtraces (winning/losing doesn't need to be literal; it could be a certain condition that the player needs to trigger, as discussed in Section 5). The frequency of a mechanic across playtraces can be viewed as a probability density function (PDF) that represents how likely a mechanic is to occur in a playtrace. It then follows that the difference between the PDF of the mechanic and the PDF of the mechanic given winning/losing gives a quantitative measure of how critical a mechanic is. This can be applied directly to player motivations by simply conditioning the mechanic on playtraces of the given player. We use the Wasserstein distance [53] to calculate the distance between the two distributions. The following equation shows the general form of the distance calculation:

$$D_{m,c} = W\big(PDF(m \mid c),\; PDF(m)\big) \qquad (1)$$

where $m$ is the current mechanic, $c$ is the current condition ($win$ in the case of the environmental reward and $agent$ in the case of the intrinsic reward), $W$ is the first Wasserstein distance, $PDF(m \mid c)$ is the distribution of the mechanic $m$ given that condition $c$ holds, and $PDF(m)$ is the distribution of the mechanic $m$ over all playtraces.

The PDF can be calculated directly from the discrete data. The frequencies for each playtrace can be normalized to form a rough, discrete approximation of the true PDF. The Wasserstein distance can then be computed directly on this discrete data to compute the estimated distance. This approach has the benefit of not needing to fit any distribution to the data, and therefore does not require any assumptions about the data distribution. Calculating the result over a PDF would also allow this approach to be used when the PDF is known through other means.

The distance proposed in Equation 1 is a scalar with a value between 0 and 1, and it does not express a direction in the space. To calculate the direction, we compare the mean of $PDF(m \mid c)$ with the mean of $PDF(m)$: if the conditional distribution has a higher average, the mechanic is encouraged to happen, while if it has a lower average, it is discouraged from happening.

$$S_{m,c} = \operatorname{sign}\big(\mu_{PDF(m \mid c)} - \mu_{PDF(m)}\big) \qquad (2)$$

To calculate the final rewards, the environmental reward ($E_m$) and the intrinsic reward ($I_m$), we combine Equations 1 and 2. Equation 3 shows the final equation for both rewards, where the difference is that the intrinsic reward estimation is conditioned on a certain agent (or agents) playing, while the environmental reward is conditioned on winning playtraces:
$$I_m = S_{m,agent} \cdot D_{m,agent}, \qquad E_m = S_{m,win} \cdot D_{m,win} \qquad (3)$$
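The sketch below shows one way Equations 1–3 could be computed with off-the-shelf tools. The playtrace record layout (an agent name, a win flag, and normalized per-playtrace mechanic frequencies) is our own assumption for illustration, not the paper's actual logging format; the first Wasserstein distance comes from SciPy.

```python
# A minimal sketch of the alignment estimation in Equations 1-3.
# Assumed (hypothetical) playtrace layout:
#   {"agent": "MCTS", "win": True, "freq": {"Collect Key": 0.2, ...}}
# where "freq" holds the mechanic's normalized frequency in that
# playtrace, so distances stay roughly within [0, 1].
import numpy as np
from scipy.stats import wasserstein_distance

def signed_distance(cond, full):
    """Signed first Wasserstein distance between the conditional
    frequency samples and the full samples (Equations 1 and 2)."""
    if not cond:                                  # no playtraces satisfy the condition
        return 0.0
    d = wasserstein_distance(cond, full)          # Eq. 1
    s = np.sign(np.mean(cond) - np.mean(full))    # Eq. 2
    return float(s * d)

def alignment(traces, mechanic, agent):
    """Environmental reward E_m (conditioned on winning) and
    intrinsic reward I_m (conditioned on one agent), per Eq. 3."""
    full = [t["freq"].get(mechanic, 0.0) for t in traces]
    wins = [f for t, f in zip(traces, full) if t["win"]]
    mine = [f for t, f in zip(traces, full) if t["agent"] == agent]
    return signed_distance(wins, full), signed_distance(mine, full)
```

Each mechanic's resulting (E_m, I_m) pair can then be placed in the quadrant space of Figure 1, e.g. with the quadrant sketch from Section 5.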
8 EXPERIMENTS
We ran a diverse set of 26 agents on 3 game levels from the General Video Game Artificial Intelligence (GVGAI) framework. The set of agents comes with the public version of the framework. One of these game levels is Zelda, a demake of The Legend of Zelda (Nintendo 1986) dungeon system. The second is Pacman, a port of Pacman (Namco 1980). The last one is Butterflies, a new game where the player tries to collect all the butterflies before all the cocoons hatch. Figure 5 shows the levels used from each of
these three games. All 26 agents play each game 100 times, for a total of 2600 playtraces per game level. We recorded every mechanic triggered by an agent, such as movement, collisions, and (in the case of Zelda) swinging a sword. Presented below are the results of 4 of these agents, which we believe best showcase a variety of playstyles and demonstrate the efficacy of our method: Adrienctx, Monte Carlo Tree Search (MCTS), a Greedy Search agent, and Do Nothing, which always takes the same action (neutral). These 4 agents are sorted based on their performance on these games [8]. Adrienctx uses the Open Loop Expectimax Tree Search (OLETS) algorithm to play the games; it is also a previous winner of the planning track of the GVGAI competition [42]. The MCTS agent comes with the GVGAI framework and is a vanilla implementation of the Upper Confidence Bounds for Trees (UCT) algorithm [9]. The Greedy Search agent looks only one step ahead and picks the action that will either make it win or increase the score. Finally, the Do Nothing agent, as its name implies, just stands still without executing any actions until it dies or the game times out.

Figure 5: GVGAI game levels used to test our estimation method: (a) Zelda, (b) Butterflies, (c) Pacman.

8.1 Zelda
The goal of Zelda is to collect a key and unlock the door on the far side of the level (as seen in Figure 5a). Along the way, the player encounters monsters, who can destroy the player if they collide. The player can swing a sword in front of them, destroying any monster it touches. Monsters move randomly around the level every couple of game ticks. Figure 6 displays the mechanical alignment of the four agents.

Figure 6: The Player-Environmental Mechanical Alignment graph for agents on GVGAI's Zelda.

"Unlock Door" and "Collect Key" are the two most environmentally rewarding mechanics, which makes sense considering Zelda's nature as an adventure game. "Slay Monster" follows behind them as also positively rewarding. Interestingly enough, "Space Bar" (when the player presses the space bar) and "Swing Sword" are negatively rewarding. The discrepancy between the two mechanics' scores can be explained if we consider the frequency with which these mechanics are triggered. The sword can only be swung every couple of seconds due to the frame animation for the attack; thus, pressing space and swinging a sword are counted as separate mechanics, as pressing the space bar does not guarantee the sword will be swung. These mechanics are scored negatively in the environmental measurement. Agents who chase after monsters to slay them tend to also die, due to the stochastic nature of the game. Because monsters move randomly, it is impossible for an agent to accurately predict if a monster will move, and in what direction, at any particular game tick. Slaying monsters requires the player to put themselves at risk by being right next to one.

Adrienctx (circle) far outperforms the other agents when it comes to collecting the key and unlocking the door. In fact, Adrienctx appears to be aligned along y = x, and its mechanics are in perfect player-environment alignment. Greedy Search (diamond) seems more inclined to chase after and slay monsters than the other agents, while the Do Nothing (cross) agent does nothing at all. The MCTS (square) agent is the least inclined to slay monsters while often being slain by them.

We can use this graph to analyze each agent's gameplay performance as if it were a human player:
• Adrienctx understands the game mechanics very well. They could take more risks and kill more monsters to get a higher score.
• MCTS could benefit from learning to collect the key and unlock the door, but they understand that slaying monsters is important.
• Greedy Search knows to collect the key but not what to do with it after. They also do not seem inclined to slay monsters, although they seem to avoid them well enough.
• Do Nothing is entirely out of alignment and does not seem to understand basic principles of the game. Perhaps an easier level with no enemies would be beneficial.
8.2 Butterflies
Butterflies is considered a "deceptive game" [3], i.e. one where the reward structure is designed to lead a player away from globally
optimal decisions. The goal of Butterflies is to clear the level of all butterflies, which fly around randomly (Figure 5b). More butterflies can be spawned from cocoons, which crack open when a butterfly touches them. However, the player will lose if no more cocoons are left in the level. Therefore, the optimal strategy is to let all but one cocoon open to spawn as many butterflies as possible and collect them all. Figure 7 displays the mechanical alignment of the four agents.

Figure 7: The Player-Environmental Mechanical Alignment graph for agents on GVGAI's Butterflies.

Due to the deceptive nature of the game, two of the mechanics (cocoons bursting and spawning butterflies) are heavily associated with losing. Remember that if all the cocoons in the game pop, the player will lose. Most agents do not engage with the risk-reward tradeoff of spawning more butterflies for a higher score, and therefore cocoons do not open nearly as often in winning playtraces. Collecting a butterfly is slightly positively associated with winning, as it happens slightly more often in winning playtraces.

Adrienctx is in near perfect alignment, while "Catch Butterflies" is just under the x-axis, as Adrienctx tries not to catch all the butterflies as soon as possible, which makes it catch fewer butterflies compared to the MCTS agent. MCTS collects a lot of butterflies, as it allows cocoons to burst into new butterflies (that is why "Cocoon Spawns Butterflies" and "Butterfly Destroys Cocoon" are slightly positive on the y-axis). This behavior of MCTS is probably due to it being less efficient than Adrienctx and easily deceived by the high score from collecting more butterflies, at the cost of losing the game. The rest of the agents ("Do Nothing" and "Greedy Search") don't have any mechanics in alignment, as they don't collect enough butterflies.

Similar to Zelda, we can look at the graph and analyze these agents like human players:
• Adrienctx appears to not let many cocoons burst open, which limits its ability to score high. It resembles humans who try to play it safe and not risk losing the game. It could learn to be more risky by waiting a little longer before catching all the butterflies, yielding a higher score.
• MCTS has the opposite problem, in that it lets too many cocoons burst open and loses. It resembles players who take more risks and are not afraid of losing. These players could get frustrated over time if they didn't understand how to
improve themselves. They could learn to catch butterflies faster.
• Neither Greedy Search nor Do Nothing seems to understand the mechanics of the game, and both could benefit greatly from an easier level and a tutorial.
8.3 Pacman
The Pacman level is made up of a 2-dimensional maze, which the player traverses while collecting all the pellets and fruit in the level (Figure 5c). Four deadly ghosts chase the player through the maze and must be avoided. Some of the pellets are "power pellets", which grant the player temporary invulnerability to the ghosts, allowing the player to eat them and force them to respawn in the center of the maze. Figure 8 displays the mechanical alignment of the four agents.

Figure 8: The Player-Environmental Mechanical Alignment graph for agents on GVGAI's Pacman.

Eating fruit, pellets, and power pellets are the most environmentally rewarding mechanics of the game, which is in line with its winning conditions. Eating ghosts is only slightly positively rewarding, suggesting that it is not as aligned with winning and could be avoided while still winning. Getting eaten by a ghost results in the player losing, so it makes sense that it is heavily penalized.

Adrienctx seems to understand the basic premise of Pacman, in that it must eat the pellets and fruit to win. However, all four agents are nearly equivalent in getting eaten by ghosts, suggesting that none of them can reliably escape losing relative to each other. This is supported by the fact that there are only 9 winning playtraces out of the 2600. Adrienctx is relatively more capable of collecting power pellets and proceeding to eat ghosts. Do Nothing, on the other hand, is incapable of performing any of these actions. Greedy Search is only slightly better than Do Nothing in this regard. MCTS seems capable of engaging with these aspects of the game, although not nearly as effectively as Adrienctx.

Looking at these agents as human players, we could deduce the following:
• Adrienctx understands the game mechanics well relative to the other players. It appears well-aligned for all the environmentally rewarding mechanics. It doesn't need any tutorials, just practice on the level to get better.
• MCTS is beginning to understand the game mechanics; however, they could use some assistance in the form of easier levels that have fewer ghosts or smaller maps.
• Greedy Search and Do Nothing are almost entirely out of alignment. They may not understand the basic premise of the game very well. They need a formal tutorial and easier levels so they can understand the game.
9 DISCUSSION
Game Mechanic Alignment allows us to visualise how different players engage with various game mechanics according to their intrinsic motivations and the extrinsic motivations encouraged by environmental rewards. This has many potential applications, such as categorizing players based on the y-values of different events (e.g. through clustering, as sketched below), validating a designer's assumptions about which features of the game will be most enjoyed by players, building reward systems that are aligned (to the desired extent) with players' intrinsic preferences, and building tutorials that support actions preferred by different types of players, even if those actions are not all equally rewarded by the reward systems.
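As one concrete instance of the clustering application mentioned above, the sketch below (hypothetical, not part of this paper's experiments) groups players by their per-mechanic intrinsic reward vectors using k-means; the data layout and the choice of two clusters are assumptions for illustration.

```python
# Hypothetical sketch: cluster players into playstyles by their
# per-mechanic intrinsic reward vectors (I_m values in a fixed
# mechanic order), e.g. as estimated by Equation 3.
import numpy as np
from sklearn.cluster import KMeans

def cluster_playstyles(intrinsic, n_styles=2):
    """intrinsic: dict mapping player id -> list of I_m values.
    Returns a dict mapping player id -> playstyle cluster label."""
    players = sorted(intrinsic)
    X = np.array([intrinsic[p] for p in players])
    labels = KMeans(n_clusters=n_styles, n_init=10).fit_predict(X)
    return dict(zip(players, labels.tolist()))
```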
It is important, however, to distinguish between the theoretical notion that players' motivations might be more or less aligned with environmental rewards and the quantitative methods used to estimate this alignment. The first, and most obvious, limitation of the quantitative estimation done in this paper is that we use simple artificial agents as proxies for what a player's behavior might look like. More accurate estimations could be done with either human playtraces or artificial agents that attempt to emulate a specific persona.

Another limitation of the estimation is that it relies on the correlation between triggering a mechanic and winning the game (for the x-axis) or between triggering a mechanic and belonging to a certain player profile (for the y-axis). These correlations are not necessarily causal and could be affected by spurious factors. One such factor is skill. Consider a hypothetical level where a cosmetic item with no functional value is hidden near the exit. The event of picking up this item would have a high correlation with winning, and therefore would show up far to the right on the x-axis, even though it does not improve a player's chance to beat the level. And if two player profiles A and B are equally intrinsically motivated to pick up the item, but profile A reaches the end of the level more often, the event would show up higher on the y-axis for profile A, as profile A simply had more opportunities to trigger it due to its higher player skill. This can be seen in our examples, where all the automated agents except Do Nothing are designed with the goal of winning the level but have different skill levels, causing differences on the y-axis even with no discernible differences in motivation.

The Talin [4] system arguably touches upon this more directly. In Talin, mechanics are dynamically taught to the player based on their skill level. Mechanic mastery is represented using scalar values initialized by the game designer. As the player plays and either triggers or fails to trigger a mechanic, its value rises or falls. A similar system could be used to measure player skill/mastery of particular mechanics, to explain how much of a y-value difference between different players is a result of their skill versus intrinsic motivation. If we can be sure of the method's accuracy, it may be more beneficial to designate another axis to account for player skill.

Idiosyncrasies of a particular game or level could also contribute to spurious correlations. For example, if another hypothetical level contains a bifurcation where the player can choose between a red path heading North and a blue path heading South, and these paths feature opportunities to trigger different sets of events, then a naive analysis of the prevalence of these events might be confounded by an aesthetic preference for the red or blue color (or, as is more likely for bots, by an arbitrary tie-breaker between the North and South directions).

Level difficulty may also play a role here. In our examples, all agents were compared on the same level; therefore, we can be assured that difficulty remains constant. However, many games present the player with procedurally generated sets of levels, such as Spelunky (Derek Yu 2008). In games like these, a designer may not be able to assume that all levels are equally difficult. If we assume there exists a method for measuring level difficulty, then we could test our agents only against levels with similar difficulty and the same set of game mechanics.

It is possible, in principle, to attempt to correct for spurious correlations. For example, if we know an event can only be triggered at a certain point of the map, we could consider only playtraces that got within a certain distance of that point. Or we could, at each time step of the playtrace, simulate the game for a number of steps with the goal of triggering the event, and consider only playtraces where it was possible to do so. We could also perform A/B tests where some features of the level (e.g. the events on each path) are kept constant while others (e.g. the colors) are swapped. But these corrections would come at the cost of domain knowledge and/or computational power, which is why they were not considered in our experiments.

All of our examples in Sections 5 and 8 refer to games with mechanics that have either immediate or delayed rewards/penalties. Let us consider the game Fallout 2 (Black Isle Studios 1998), a role-playing game set in the post-apocalyptic western United States. In the game, the player may encounter a consumable substance called "Jet," a highly addictive methamphetamine which grants the player temporary short-term bonuses to their combat abilities. After the initial bonus period, however, the player suffers heavy penalties to these same skills. Additionally, the player may become addicted to the substance, requiring them to keep dosing themselves every day or suffer additional penalties. In this example, consuming Jet has opposing extrinsic rewards/penalties depending on the time horizon: rewarding in the short term, penalizing in the long term. To visualize this using Game Mechanic Alignment, we may need to introduce a third axis to represent "time."
10 CONCLUSION
In this work, we present Game Mechanic Alignment theory, a framework which enables designers to organize game mechanics in terms of the environment and a player engaging with it. We provide several well-known games as examples of how a designer may apply this theory. To demonstrate its practicality, we propose a methodology to estimate reward values for Game Mechanic Alignment,
as well as an experimental evaluation using 3 games from the GVGAI framework. We then point out shortcomings of this methodology while discussing several ways they may be overcome.

One of our core motivations for this work was to find a new alternative method for detecting critical mechanics [22] for automated tutorial generation systems, one that considers both the player and the environment. By taking both into account, tutorial generators can create highly personalized tutorials. Rather than one tutorial for a single game or level, each player or playstyle could receive their own unique tutorial. Players who are highly skilled can be introduced to more complex mechanics, whereas novice players can be given explanations of basic controls. Furthermore, players with certain playstyles can be introduced to completely different styles they might not have considered. Proposed future work is to automatically design tutorials using this methodology to organize input, first for gameplaying agents using the same methodology from Section 8, and then for human players in a formal user study.

The mechanic-graph restriction of the previously mentioned critical mechanic discovery methods [21, 22] is not a requirement for the Game Mechanic Alignment method. Therefore, users might find it more generalized and easier to use. In our experimental section, we used the output logs from agent gameplay in the GVGAI framework. However, mechanics could be automatically detected in a similar manner to detecting highlights in soccer matches [17].
11 ACKNOWLEDGEMENTS
Michael Cerny Green would like to thank the OriGen.AI education program for their financial support. Rodrigo Canaan gratefully acknowledges the financial support from Honda Research Institute Europe (HRI-EU).
REFERENCES
[1] Ernest Adams and Andrew Rollings. 2007. Game design and development: Fundamentals of game design. New Jersey: Pearson Prentice Hall (2007).
[2] Ryan Alexander and Chris Martens. 2017. Deriving quests from open world mechanics. In Foundations of Digital Games. ACM, 12.
[3] Damien Anderson, Matthew Stephenson, Julian Togelius, Christoph Salge, John Levine, and Jochen Renz. 2018. Deceptive games. In International Conference on the Applications of Evolutionary Computation. Springer, 376–391.
[4] Batu Aytemiz, Isaac Karth, Jesse Harder, Adam Smith, and Jim Whitehead. 2018. Talin: A Framework for Dynamic Tutorials Based on the Skill Atoms Theory. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Vol. 14.
[5] Richard A Bartle. 2004. Designing Virtual Worlds. New Riders.
[6] Kelly Bergstrom, Marcus Carter, Darryl Woodford, and Chris Paul. 2013. Constructing the ideal EVE Online player. Proceedings of DiGRA 2013: DeFragging Game Studies (2013).
[7] Mateusz Bialas, Shoshannah Tekofsky, and Pieter Spronck. 2014. Cultural influences on play style. In Computational Intelligence and Games. IEEE, 1–7.
[8] Philip Bontrager, Ahmed Khalifa, Andre Mendes, and Julian Togelius. 2016. Matching games and algorithms for general video game playing. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Vol. 12.
[9] Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. 2012. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games 4, 1 (2012), 1–43.
[10] Megan Charity, Michael Cerny Green, Ahmed Khalifa, and Julian Togelius. 2020. Mech-Elites: Illuminating the Mechanic Space of GVG-AI. In International Conference on the Foundations of Digital Games. 1–10.
[11] Michael Cook, Simon Colton, Azalea Raad, and Jeremy Gow. 2013. Mechanic Miner: Reflection-driven game mechanic discovery and level design. In European Conference on the Applications of Evolutionary Computation. Springer, 284–293.
[12] Fernando de Mesentier Silva, Aaron Isaksen, Julian Togelius, and Andy Nealen. 2016. Generating heuristics for novice players. In Computational Intelligence and Games. IEEE, 1–8.
[13] Fernando de Mesentier Silva, Julian Togelius, Frank Lantz, and Andy Nealen. 2018. Generating Beginner Heuristics for Simple Texas Hold'em. In Genetic and Evolutionary Computation Conference. ACM.
[14] Fernando de Mesentier Silva, Julian Togelius, Frank Lantz, and Andy Nealen. 2018. Generating Novice Heuristics for Post-Flop Poker. In Computational Intelligence and Games. IEEE.
[15] Sam Devlin, Anastasija Anspoka, Nick Sephton, Peter I Cowling, and Jeff Rollason. 2016. Combining Gameplay Data With Monte Carlo Tree Search To Emulate Human Play. In Proceedings of the AAAI Artificial Intelligence for Interactive Digital Entertainment Conference. AAAI.
[16] Joris Dormans. 2011. Simulating mechanics to study emergence in games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Vol. 7.
[17] Ahmet Ekin, A Murat Tekalp, and Rajiv Mehrotra. 2003. Automatic soccer video analysis and summarization. IEEE Transactions on Image Processing 12, 7 (2003).
[18] Raluca D Gaina, Diego Pérez-Liébana, and Simon M Lucas. 2016. General video game for 2 players: Framework and competition. In Computer Science and Electronic Engineering.
[20] Michael Cerny Green, Ahmed Khalifa, Gabriella AB Barros, and Julian Togelius. 2017. "Press Space to Fire": Automatic Video Game Tutorial Generation. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Vol. 13. AAAI.
[21] Michael Cerny Green, Ahmed Khalifa, Gabriella AB Barros, Tiago Machado, Andy Nealen, and Julian Togelius. 2018. AtDELFI: automatically designing legible, full instructions for games. In Proceedings of the 13th International Conference on the Foundations of Digital Games. ACM, 1–10.
[22] Michael Cerny Green, Ahmed Khalifa, Gabriella AB Barros, Tiago Machado, and Julian Togelius. 2020. Automatic Critical Mechanic Discovery Using Playtraces in Video Games. In International Conference on the Foundations of Digital Games. ACM, 1–9.
[23] Michael Cerny Green, Ahmed Khalifa, Gabriella AB Barros, Andy Nealen, and Julian Togelius. 2018. Generating levels that teach mechanics. In Proceedings of the 13th International Conference on the Foundations of Digital Games. 1–8.
[24] Michael Cerny Green, Luvneesh Mugrai, Ahmed Khalifa, and Julian Togelius. 2020. Mario level generation from mechanics using scene stitching. In Conference on Games. IEEE, 49–56.
[25] Christoffer Holmgård, Michael Cerny Green, Antonios Liapis, and Julian Togelius. 2018. Automated playtesting with procedural personas through MCTS with evolved heuristics. (2018).
[26] Christoffer Holmgård, Antonios Liapis, Julian Togelius, and Georgios N. Yannakakis. 2014. Evolving Personas for Player Decision Modeling. In Proceedings of the IEEE Conference on Computational Intelligence and Games. IEEE.
[27] Christoffer Holmgård, Antonios Liapis, Julian Togelius, and Georgios N Yannakakis. 2014. Personas versus clones for player decision modeling. In Proceedings of the International Conference on Entertainment Computing. Springer, 159–166.
[28] Christoffer Holmgård, Antonios Liapis, Julian Togelius, and Georgios N. Yannakakis. 2015. Monte-Carlo Tree Search for Persona Based Player Modeling. In Proceedings of the AIIDE Workshop on Player Modeling.
[29] Britton Horn, Amy K Hoover, Yetunde Folajimi, Jackie Barnes, Casper Harteveld, and Gillian Smith. 2017. AI-assisted analysis of player strategy across level progressions in a puzzle game. In Proceedings of the 12th International Conference on the Foundations of Digital Games. ACM, 1–10.
[30] Aki Järvinen. 2008. Games without frontiers: Theories and methods for game studies and design. Tampere University Press.
[31] Ahmed Khalifa, Fernando de Mesentier Silva, and Julian Togelius. 2019. Level Design Patterns in 2D Games. In Conference on Games. IEEE, 1–8.
[32] Ahmed Khalifa, Michael Cerny Green, Gabriella Barros, and Julian Togelius. 2019. Intentional computational level design. In Proceedings of The Genetic and Evolutionary Computation Conference. 796–803.
[33] Ahmed Khalifa, Michael Cerny Green, Diego Perez-Liebana, and Julian Togelius. 2017. General video game rule generation. In Computational Intelligence and Games. IEEE, 170–177.
[34] Ahmed Khalifa, Aaron Isaksen, Julian Togelius, and Andy Nealen. 2016. Modifying MCTS for human-like general video game playing. In Proceedings of IJCAI.
[35] Ahmed Khalifa, Diego Perez-Liebana, Simon M Lucas, and Julian Togelius. 2016. General video game level generation. In Genetic and Evolutionary Computation Conference. ACM, 253–259.
[36] D Michael Kuhlman and Alfred F Marshello. 1975. Individual differences in game motivation as moderators of preprogrammed strategy effects in prisoner's dilemma. Journal of Personality and Social Psychology 32, 5 (1975), 922.
[37] Mark Nelson. 2016. Investigating vanilla MCTS scaling on the GVG-AI game corpus. In Proceedings of the IEEE Conference on Computational Intelligence and Games. IEEE.
[38] Juan Ortega, Noor Shaker, Julian Togelius, and Georgios N Yannakakis. 2013. Imitating Human Playing Styles in Super Mario Bros. Entertainment Computing 4, 2 (2013), 93–104.
[39] Joseph Osborn, Adam Summerville, and Michael Mateas. 2017. Automatic mapping of NES games with Mappy. In Foundations of Digital Games. ACM, 78.
[40] Diego Perez-Liebana, Jialin Liu, Ahmed Khalifa, Raluca D Gaina, Julian Togelius, and Simon M Lucas. 2019. General video game AI: a multi-track framework for evaluating agents, games and content generation algorithms. Transactions on Games (2019).
[41] Diego Perez-Liebana, Spyridon Samothrakis, Julian Togelius, Tom Schaul, and Simon M Lucas. 2016. General video game AI: Competition, challenges and opportunities. In AAAI Conference on Artificial Intelligence.
[42] Diego Perez-Liebana, Spyridon Samothrakis, Julian Togelius, Tom Schaul, Simon M Lucas, Adrien Couëtoux, Jerry Lee, Chong-U Lim, and Tommy Thompson. 2015. The 2014 general video game playing competition. IEEE Transactions on Computational Intelligence and AI in Games.
[43] Sheri Graner Ray. 2004. Gender Inclusive Game Design: Expanding the Market. Charles River Media.
[44] … DiGRA Conference.
[45] Katie Salen, Katie Salen Tekinbaş, and Eric Zimmerman. 2004. Rules of Play: Game Design Fundamentals. MIT Press.
[46] Miguel Sicart. 2008. Defining game mechanics. Game Studies 8, 2 (2008).
[47] Adam Summerville, Chris Martens, Sarah Harmon, Michael Mateas, Joseph Carter Osborn, Noah Wardrip-Fruin, and Arnav Jhala. 2017. From Mechanics to Meaning. IEEE Transactions on Computational Intelligence and AI in Games.
[48] Richard S Sutton and Andrew G Barto. 2018. Reinforcement Learning: An Introduction. MIT Press.
[49] S. Tekofsky, P. Spronck, A. Plaat, J. van den Herik, and J. Broersen. 2013. Play style: Showing your age. In Computational Intelligence and Games. IEEE, 1–8. https://doi.org/10.1109/CIG.2013.6633616
[50] Carl Therrien. 2011. "To Get Help, Please Press X": The Rise of the Assistance Paradigm in Video Game Design. In DiGRA Conference.
[51] Julian Togelius, Renzo De Nardi, and Simon M Lucas. 2007. Towards Automatic Personalised Content Creation for Racing Games. In Proceedings of the IEEE Conference on Computational Intelligence and Games. IEEE.
[52] Ruben Rodriguez Torrado, Philip Bontrager, Julian Togelius, Jialin Liu, and Diego Perez-Liebana. 2018. Deep Reinforcement Learning for General Video Game AI. In Computational Intelligence and Games. IEEE, 1–8.
[53] Leonid Wasserstein. 1969. Markov processes with countable state space describing large systems of automata. Problemy Peredachi Informatsii 5, 3 (1969). In Russian.
[54] Georgios N. Yannakakis, Pieter Spronck, Daniele Loiacono, and Elisabeth André. 2013. Player Modeling. In Artificial and Computational Intelligence in Games. Dagstuhl Publishing, Saarbrücken/Wadern, 45–55.
[55] Nicholas Yee. 2002. Facets: 5 motivation factors for why people play MMORPG's. Terra Incognita