The 2017 AIBIRDS Competition
Matthew Stephenson, Jochen Renz, Xiaoyu Ge, and Peng Zhang
Abstract—This paper presents an overview of the sixth AIBIRDS competition, held at the 26th International Joint Conference on Artificial Intelligence. This competition tasked participants with developing an intelligent agent which can play the physics-based puzzle game Angry Birds. This game uses a sophisticated physics engine that requires agents to reason and predict the outcome of actions with only limited environmental information. Agents entered into this competition were required to solve a wide assortment of previously unseen levels within a set time limit. The physical reasoning and planning required to solve these levels are very similar to those of many real-world problems. This year's competition featured some of the best agents developed so far and even included several new AI techniques such as deep reinforcement learning. Within this paper we describe the framework, rules, submitted agents and results for this competition. We also provide some background information on related work and other video game AI competitions, as well as discussing some potential ideas for future AIBIRDS competitions and agent improvements.
Index Terms—Angry Birds, intelligent agents, physics-based games, AI competitions, video games
I. INTRODUCTION
Over the past several years, many different AI competitions focused around video games have become extremely popular. Many of these competitions have yielded promising results and improvements for the wider AI community, and have been hosted at several major international conferences including CIG, AIIDE, IJCAI, ECAI, GECCO and FDG to name just a few. Whilst competitions and challenges centred around AI playing classic board games, such as chess with Deep Blue [1] and more recently Go with DeepMind's AlphaGo [2], have been incredibly popular and successful, video games typically provide a much more complex and challenging domain in which to interact. Developing agents (autonomous programs that can react intelligently to environmental inputs) that can successfully play popular and complex video games is a key area of research for AI. Video games provide a controllable and parameterised environment to work in, and the problems they pose are often very similar to those of the real world [3]. Most video games are designed to test the cognitive abilities of human players in one or multiple areas, which is precisely the problem we wish intelligent agents to solve. Physics-based puzzles are a great example of this as they not only require a fair amount of planning and knowledge reasoning, but also the type of physical reasoning required to play them is comparable to that needed for an agent to operate successfully in the real world [4]. Angry Birds is a popular video game that fits perfectly into this category.
M. Stephenson, J. Renz, X. Ge and P. Zhang are with the Research School of Computer Science, Australian National University, Canberra, A.C.T. 0200, Australia, e-mail: ([email protected]).
Angry Birds is a physics-based simulation puzzle game, developed by Rovio Entertainment [5]. It uses a sophisticated (at least in terms of other video game AI problems) physics engine to control the movement of certain objects and how they respond to the player's actions. Without getting too in-depth with its specific mechanics, players can only solve this game's levels by planning out a sequence of well-reasoned actions, taking into account the physical nature of the game's environment. This type of physical reasoning problem is very different to traditional games such as chess, as the exact attributes and parameters of various objects are often unknown. This means that it is very difficult to accurately predict the outcome of any action taken [6] and the exact result of an action is only known for certain once it has been carried out. Even if the exact physics parameters of the world were available, the agent would have to simulate a potentially infinite number of possible actions due to the game's continuous state and action spaces. Developing intelligent agents that can play this game effectively has been an incredibly complex and challenging problem for traditional AI techniques to solve, even though the game is simple enough that any human player, or even a child, could learn it within a few minutes. Humans are naturally very good at predicting the result of a physical action based on visual information, while agents still struggle with this form of reasoning in unknown environments.

What makes research on physics-based games such as Angry Birds so important is that the exact same problems need to be solved by AI systems that are intended to interact successfully with the real world. The ability to accurately estimate the consequences of a physical action based solely on visual inputs or other forms of perception is essential for the future of ubiquitous AI, and has huge real-world relevance and application. Any real-world AI system that cannot achieve this will likely result in many unintended outcomes which could potentially be dangerous to people. Angry Birds, as well as other physics-based games, provides a controlled and parametrised environment to experiment with new ideas and capabilities. It is particularly important for the development of such systems to integrate the areas of computer vision, machine learning, knowledge representation and reasoning, heuristic search, planning, and reasoning under uncertainty. Contributions or additions to each of these areas will help improve the overall performance of an agent, but combining effective solutions to all of these problems will be needed to develop a truly intelligent physical reasoning system.

In this paper we present the description, entrants, results and conclusions for the sixth AIBIRDS competition. Participating competitors are tasked with developing an agent that can play and solve unknown Angry Birds levels. As previously mentioned, this competition was created as a means to promote the research and creation of intelligent agents that can reason and predict the outcome of actions in a physical simulation environment [7]. During the competition, agents are required to play a set number of unknown levels within a given time, attempting to score as many points as possible in each level. The exact parameters of certain objects, as well as the current internal state of the game, are not directly accessible.
Instead, information about the level is provided using a computer vision module which gives approximations of object boundaries and locations based on screenshots of the game screen, effectively meaning that an agent gets exactly the same input as a human player. Agents are required to solve these levels in real-time, and can attempt levels in any order and as many times as they like. Once the time limit has expired the maximum scores that an agent achieved for each solved level are summed up to give its final score. Agents are then ranked based on this value and after several rounds of elimination a winner is declared. The eventual goal of this competition is to design agents that can play new levels as well as, or better than, the best human players. Many of the previous agents that have participated in this competition employed a variety of techniques, including qualitative reasoning [8], internal simulation analysis [9], [10], logic programming [11], heuristics [12], Bayesian inferences [13], [14], and structural analysis [15].

Holding an AI competition has many advantages over traditional research methods, chief of which is that it informs members of the AI community who may not be aware of the problem about it. This in turn may encourage and motivate people to take part, perhaps inspiring them to try out their own methods and ideas to solve the problem. Competitions provide easy to use software and interfaces that can make a daunting and challenging task seem much more possible. Some participants may even be able to apply their existing algorithms to an entirely new problem they had not previously considered. Competitions also provide an effective way of comparing and benchmarking all of the currently existing algorithms. All submitted agents are evaluated using the same levels and rules, allowing for a fair and unbiased means of comparing them. This also provides opportunities for discussion and collaboration between researchers, and is a great way to get both industry specialists and other non-academics involved in this kind of work.

The remainder of this paper is organized as follows: Section II provides the background to this competition, including past AI video game competitions, a description of the Angry Birds game, and details on the related AIBIRDS level generation competition; Section III describes the competition itself, providing details on the naive agent that is provided to all entrants, as well as the rules and scoring procedure; Section IV contains descriptions of the ten agents submitted to this year's competition; Section V provides the results of the competition; Section VI discusses and interprets these results, providing some possible improvements for next year's agents and other potential benefits beyond the competition itself; Section VII presents our final conclusions and desired goals for future competitions.

Fig. 1: Screenshot of a level from the Angry Birds game.

II. BACKGROUND
A. Previous AI video game competitions
Examples of popular AI competitions (both past and present) include the Mario AI Championship [16], [17], [18], the StarCraft AI Competition [19], the Visual Doom AI Competition (ViZDoom) [20], the Geometry Friends Game AI Competition [21] and the Fighting Game AI Competition [22]. The General Video Game AI (GVGAI) Competition has also run several tracks around developing agents for playing general video games. These include the single-player planning track [23], the two-player planning track [24], [25] and the learning track [26]. The AIBIRDS competition has itself been running since 2012 [4], [7], with many advancements and improved agents being developed since its inception.
B. Angry Birds game
Angry Birds is a popular physics-based puzzle game where in each level the player uses a slingshot to shoot birds at structures composed of blocks, with pigs placed within or around them [5]. The player's objective is to kill all the pigs within a level using the birds provided. A typical Angry Birds level, as shown in Figure 1, contains a slingshot, birds, pigs and a collection of blocks arranged in one or more structures. All objects within the level have properties such as location, size, mass, friction, density, etc., and obey simplified physics principles defined within the game's engine. Each block in the game can have one of multiple different shapes and is made from one of three materials (wood, ice or stone). Each bird is assigned one of five different types (red, blue, yellow, black or white). Each of these bird types is strong or weak against certain block materials, and some types possess secondary abilities which the player can activate during the bird's flight. The player can choose the angle and speed with which to fire a bird from the slingshot, as well as a tap time for when to activate the bird's special ability if it has one, but cannot alter the ordering of the birds or affect the level in any other way. Pigs are killed once they take enough damage from either the birds directly or by being hit with another object. The ground is usually flat but can vary in height for certain difficult levels. TNT can also be placed within a level and explodes when hit by another object. The difficulty of this game comes from predicting the physical consequences of actions taken, and accurately planning a sequence of shots that results in success. Points are awarded to the player once the level is solved, based on the number of birds remaining and the total amount of damage caused.

Fig. 2: An example generated level that was used in this year's AIBIRDS competition.
C. AIBIRDS level generation competition
Whilst the AIBIRDS competition has been running annually for many years, a second track of the competition was started in 2016 known as the AIBIRDS level generation competition. This new competition revolves around developing procedural content generators (PCG) that can autonomously create Angry Birds levels. These generators must create levels that are both solvable and physically stable. Generated levels should also be fun and creatively designed, as well as providing the player with a suitable level of challenge. Currently the generators create levels for a clone version of Angry Birds, due to the fact that the real Angry Birds game is not open source, but recent engine improvements have allowed the generators to create levels very similar to those seen in the real Angry Birds game. In fact, several levels that were used in this year's AIBIRDS competition were converted from levels created by generators from the AIBIRDS level generation competition. Figure 2 provides an example generated level that was used in the AIBIRDS competition and was created using the algorithm described in [27]. We also discuss later some ways in which both these competitions could be combined to increase the abilities and performance of agents, as well as helping to create better levels.

III. AIBIRDS COMPETITION
A. Rules
Developed agents can be written in any programming language, although we strongly recommend the use of Java to make integration with existing software easier. Each Angry Birds game instance is run on a game server while the agent itself is executed on a client computer; see Figure 3 for a server-client architecture diagram. Client computers have no access to the internet and can only communicate with the game server via the specified communication protocol. No communication with other agents is possible and each agent can only access files in its own directory. Each agent is able to obtain screenshots of the current Angry Birds game state from the server and can submit actions and other commands back to it. The game is played in SD mode and all screenshots have a resolution of 840x480 pixels. Agents that attempt to tamper with the competition settings or try to gain an unfair advantage will be disqualified.

The following objects are used for the competition levels: all objects, background, terrain, etc., that occur in the first 21 Poached Eggs levels of the Chrome version of Angry Birds. In addition, the competition levels may include the white bird, the black bird, TNT boxes, triangular blocks and hollow blocks. No other objects are used. The vision module of the provided game playing software recognises all relevant game objects, including all birds, pigs and the terrain, but not the background. All competition levels use the same background that occurs in the first 21 Poached Eggs levels.

Fig. 3: Server-client architecture.
B. Naive agent
The source code for a naive agent is provided to all competition entrants as a useful starting point upon which to create their own agent. Objects within the level are first identified using a computer vision module, which converts the raw pixel image input into an easier-to-manage list of object types, sizes, materials and locations; see Figure 4. The naive agent also has an additional trajectory module, which calculates two possible release points, one firing horizontally (low trajectory) and the other vertically (high trajectory), that result in the current bird hitting a specified pig (assuming no objects are blocking the bird's trajectory). The naive agent always fires the currently selected bird at a randomly chosen pig using either a low or high trajectory (also chosen at random). No other objects apart from the current bird and pigs are used when determining a suitable shot, and tap times are fixed for each bird based on the total length of its trajectory. This agent can therefore make shot calculations quickly and accurately but is unlikely to be skilled enough to solve more challenging Angry Birds levels.
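To make this behaviour concrete, the sketch below captures the naive agent's shot-selection loop as described above. It is only an illustrative outline: the Vision, TrajectoryPlanner and GameClient interfaces, all of their method names, and the 0.85 tap-time fraction are hypothetical stand-ins for the modules shipped with the framework, not its actual API.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;
import java.util.List;
import java.util.Random;

// Hypothetical stand-ins for the vision, trajectory and server-communication
// modules provided with the competition framework (names are illustrative only).
interface Vision {
    List<Point> findPigCentres(BufferedImage screenshot);
    Point findSlingshot(BufferedImage screenshot);
}

interface TrajectoryPlanner {
    // Returns up to two release points: a low (direct) and a high (lobbed) trajectory.
    List<Point> estimateLaunchPoints(Point slingshot, Point target);
    int estimateFlightTimeMs(Point release, Point target);
}

interface GameClient {
    boolean levelIsInProgress();
    BufferedImage screenshot();
    void shoot(Point releasePoint, int tapTimeMs);
}

// Minimal sketch of the naive agent's strategy: fire the current bird at a
// randomly chosen pig using a randomly chosen low or high trajectory, with a
// tap time fixed as a fraction of the estimated flight time.
public class NaiveAgentSketch {
    private final Random random = new Random();

    public void solveLevel(GameClient game, Vision vision, TrajectoryPlanner planner) {
        while (game.levelIsInProgress()) {
            BufferedImage screenshot = game.screenshot();
            List<Point> pigs = vision.findPigCentres(screenshot);
            if (pigs.isEmpty()) {
                return; // vision failed to detect any pigs: give up on this level
            }
            Point target = pigs.get(random.nextInt(pigs.size()));
            Point slingshot = vision.findSlingshot(screenshot);

            List<Point> releasePoints = planner.estimateLaunchPoints(slingshot, target);
            Point release = releasePoints.get(random.nextInt(releasePoints.size()));

            // Tap time fixed relative to the total flight time of the trajectory
            // (the 0.85 factor is an assumed placeholder value).
            int tapTimeMs = (int) (0.85 * planner.estimateFlightTimeMs(release, target));
            game.shoot(release, tapTimeMs);
        }
    }
}
```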
Fig. 4: The example level from Figure 1, with blocks, pigs and birds identified using the computer vision module.

Participating teams are advised to use both the vision and trajectory modules provided with this agent, but will likely need to improve the techniques and strategies used to solve levels. Entrants are also provided with several other built-in functions. These include a function that identifies which blocks support a specific block, and a function which identifies whether particular trajectories to pigs are obstructed by other objects.

More detailed explanations about how the naive agent and server software work can be found on the AIBIRDS website [28]. This website also provides instructions on how to begin coding your own intelligent agent for Angry Birds and has several available open source entries to try out.
C. Scoring and tournament procedure
During the competition, there is a time limit to play a given set of Angry Birds levels automatically and without any human intervention. The competition is played over multiple knock-out rounds. In each round, the agents that achieve the highest combined game score over all solved levels proceed to the next round. The agent with the highest combined game score in the grand final is the winner of the competition. There is also an additional side competition in which the best agents from the main AI competition are pitted against human players attending the host conference.
1) Main AI competition:
The main AI competition consists of three group rounds to determine the two best agents (qualification, quarter-finals and semi-final) as well as a final showdown to decide the overall winner (grand final). For each round we have a dynamically updated leaderboard where all agents are ranked according to their total score. All levels used in the competition are brand new and are not known in advance by any participating team. Agents have a total time of 30 minutes to solve eight levels in each round. Each of the eight game levels within a round can be accessed, played and replayed in any arbitrary order. After the 30 minute time limit for a round is reached, the connection between agents and the game server is terminated. Agents then have up to two minutes to store any information they wish to keep and to stop running. After two minutes the organisers terminate the agents if they are still running. Agents cannot be modified during the competition.

The first round of the competition is a qualification round, where up to 16 teams are selected to proceed onto the next round. As there were only ten participating teams this year, this round was simply used to help divide up the teams into groups for the quarter-finals. For the quarter-finals we had one group of four teams and two groups of three teams, based on the scores achieved in the qualification round. Any team can query the current group high score for each level, but not the high scores of other groups. The top four teams across all three quarter-final groups move on to the semi-final. The final four teams that make it to the semi-final then all play in the same group. Any team can query the current high score for each level, and the two best teams qualify for the grand final. During the grand final both teams can query the current high score for each level, with the winner of this match being the 2017 AIBIRDS champion. The four semi-finalists also qualify for the man vs. machine challenge.
2) Man vs. machine challenge:
During the man vs. machine challenge we test if the best agents can beat humans at playing Angry Birds. For this challenge we use four new Angry Birds levels not included in the main AI competition, that each player has 10 minutes to solve. Each game level can be accessed, played and replayed in any arbitrary order. Participating human players play the game levels first. Each player can participate only once. A leaderboard is kept which ranks the human players according to their overall score (sum of individual high scores per level). After the human players, the four best agents (those which qualified for the semi-final in the main AI competition) are run in parallel on the same game levels with the same time limit. The player with the highest overall score, man or machine, wins this challenge.

IV. COMPETITION AGENTS
This year we had 10 agents submitted from teams across 12 countries. More information on the agents entered into both this year's and previous competitions is available on the AIBIRDS website [29].
A. Datalab (Placed 7th; Czech Technical University in Prague; Czech Republic; First entered in 2014)
The Datalab agent uses a combination of four different strategies when attempting to solve a level. These can be described as the destroy pigs, building, dynamite and round blocks strategies. The decision of which strategy to use is based on the environment, possible trajectories, currently selected bird and remaining birds. The destroy pigs strategy attempts to find a trajectory that intersects with as many pigs as possible. The building strategy identifies groups of connected blocks that either protect pigs or are near to them. The decision of which blocks within the building are suitable targets is based on their location, size, shape, material and relative placement within the structure, as well as the shape of the building itself. The shot that will cause the most damage to the building is then selected. The dynamite strategy ranks each TNT box within the level based on the number of pigs, stone blocks and other TNT boxes that are nearby. The round blocks strategy attempts to either hit round blocks directly or else destroy objects that are supporting round blocks. The tap time for each bird is fixed based on the location of the first obstacle in its trajectory, with the exception of the white bird.
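As a rough illustration of this kind of ranking, the sketch below scores TNT boxes by how many pigs, stone blocks and other TNT boxes lie nearby, in the spirit of the dynamite strategy described above. The weights and radius are invented placeholders; Datalab's actual values and implementation are not published here.

```java
import java.awt.Point;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of a "dynamite"-style strategy: every TNT box is ranked
// by the number of pigs, stone blocks and other TNT boxes within a fixed
// radius, and the best-ranked box becomes the target.
public class DynamiteStrategySketch {
    private static final double RADIUS = 100.0; // pixels, assumed value

    public static Point bestTntTarget(List<Point> tntBoxes, List<Point> pigs, List<Point> stoneBlocks) {
        return tntBoxes.stream()
                .max(Comparator.comparingDouble((Point tnt) ->
                        3.0 * countNear(tnt, pigs)              // pigs matter most (assumed weight)
                      + 1.0 * countNear(tnt, stoneBlocks)       // stone is hard to destroy directly
                      + 2.0 * (countNear(tnt, tntBoxes) - 1)))  // chain reactions; exclude the box itself
                .orElse(null);
    }

    private static long countNear(Point centre, List<Point> objects) {
        return objects.stream().filter(o -> centre.distance(o) <= RADIUS).count();
    }
}
```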
B. IHSEV (Placed 2nd; École nationale d'ingénieurs de Brest; France; First entered in 2013)
The IHSEV agent creates an internal Box2D simulation of the level, within which it tries out many shot angles and tap times. These mental simulations are carried out in parallel to identify the shot that destroys the most pigs. The simulation is not a perfect representation of the environment and great care is taken when perceiving and reconstructing each level. The vision module has also been slightly improved from the base code provided so that objects are more robustly identified. The agent does not use any information about the number or type of remaining birds when deciding which shot to take. A future plan to adapt the agent's environmental simulation based on the deviation between the actual and expected outcome of a shot was proposed but has not yet been implemented. This is currently the only agent that considers multiple different tap times for activating each bird's abilities.
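The following sketch shows the general shape of such simulation-based shot selection: candidate (angle, tap time) pairs are evaluated in parallel against an internal model of the level, and the shot predicted to kill the most pigs is chosen. The LevelModel interface and its simulate() method are hypothetical stand-ins for IHSEV's Box2D reconstruction, not its actual code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch of simulation-based shot selection: evaluate many candidate shots
// against an internal physics model and keep the one with the best outcome.
public class SimulationSearchSketch {

    public static class Shot {
        final double angleDegrees;
        final int tapTimeMs;

        public Shot(double angleDegrees, int tapTimeMs) {
            this.angleDegrees = angleDegrees;
            this.tapTimeMs = tapTimeMs;
        }
    }

    public interface LevelModel {
        // Runs the internal simulation and returns the number of pigs killed by the shot.
        int simulate(Shot shot);
    }

    public static Optional<Shot> bestShot(LevelModel model, List<Shot> candidates) {
        return candidates.parallelStream() // mental simulations run in parallel
                .max(Comparator.comparingInt((Shot s) -> model.simulate(s)));
    }
}
```

Note that the quality of the chosen shot depends entirely on how faithfully the internal model reproduces the real game engine, which is the weakness discussed later in Section VI.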
C. Angry-HEX (Placed 3rd; Università della Calabria, Vienna University of Technology, Marmara University, Max Planck Institut für Informatik; Italy, Austria, Turkey, Germany; First entered in 2013)
The Angry-HEX agent uses HEX programs to deal with decisions and reasoning, while the computations are performed by traditional programming. HEX programs are an extension of answer set programming (ASP) which use declarative knowledge bases for information representation and reasoning. The Reasoner module of this agent determines several possible shots based on different strategies. These shots are then simulated using an internal Box2D simulation, with the shot that kills the most pigs being selected as the ideal action. If the estimated number of killed pigs is the same for multiple possible shots, then the shot that also destroys the most objects is selected. The trajectory module of the base program was improved to take the thickness of the currently selected bird into account, as well as the ability to select several different points on a block as the target location. The tap time for each bird is fixed based on the location of the first obstacle in its trajectory, with the exception of the white bird. This agent can also remember the shots and strategies previously carried out, to aid it when re-attempting levels.
D. Eagle's Wing (Placed 1st; University of Alberta and Zazzle Inc.; Canada; First entered in 2016)
The Eagle's Wing agent chooses from five different strategies when deciding what shot to perform. These are defined as the pigshooter, TNT, most blocks, high round objects and bottom building blocks strategies. The decision of which strategy to use is based on the estimated utility of each approach with the currently selected bird. This utility is calculated based on the level's features and how these compare to a small collection of practice levels that are used to train the agent with the machine learning method xgboost. The pigshooter strategy attempts to find a trajectory that either targets an unprotected pig or includes multiple pigs within it. The TNT strategy aims for any TNT box that can cause significant damage to a large region. The most blocks strategy finds the trajectory that destroys the most blocks (highly dependent on the type of bird being used). The high round objects strategy attempts to destroy objects close to large round objects that are high above the ground, hopefully causing them to fall onto pigs. The bottom building blocks strategy targets blocks that are important to a structure's overall stability. The tap time for each bird is fixed based on the location of the first obstacle in its trajectory, with the exception of the white bird.
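The sketch below illustrates the general pattern of utility-based strategy selection described above: a learned regression model predicts the expected value of each strategy from level features and the current bird, and the highest-utility strategy is executed. The UtilityModel interface is a generic, hypothetical stand-in for the xgboost model the team trained on practice levels; it is not Eagle's Wing's actual code.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch of utility-based strategy selection: predict the expected value of
// each candidate strategy from level features and pick the best one.
public class UtilitySelectionSketch {

    enum Strategy { PIG_SHOOTER, TNT, MOST_BLOCKS, HIGH_ROUND_OBJECTS, BOTTOM_BUILDING_BLOCKS }

    // Hypothetical stand-in for a trained regression model (e.g. gradient-boosted trees).
    interface UtilityModel {
        double predictUtility(Strategy strategy, double[] levelFeatures, int currentBirdType);
    }

    public static Strategy chooseStrategy(UtilityModel model, double[] levelFeatures, int currentBirdType) {
        Map<Strategy, Double> utilities = new EnumMap<>(Strategy.class);
        for (Strategy s : Strategy.values()) {
            utilities.put(s, model.predictUtility(s, levelFeatures, currentBirdType));
        }
        // Pick the strategy with the highest predicted utility for the current bird.
        return utilities.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get()
                .getKey();
    }
}
```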
E. s-birds (Placed 5th; Dhirubhai Ambani IICT; India; First entered in 2013)
The s-birds agent has two different approaches for determining the most effective shot to perform. The first strategy is called the bottom-up approach and identifies a set of candidate target blocks for the level based on the potential number of affected pigs. The second strategy is called the top-down approach and utilizes the crushing/rolling effect of a bird or round block onto pigs, as well as the toppling effect of thinner blocks. Suitable target blocks are identified for each method and are then ranked based on the expected number of pigs killed and the likelihood of the shot's success. The penetration factor of specific bird types against certain materials is also considered when determining if a block can be hit. The tap time for each bird is fixed based on the total length of its trajectory, with the exception of the white bird.
F. BamBirds (Placed 9th; Bamberg University; Germany; First entered in 2016)
The BamBirds agent creates a qualitative representation of the level and then chooses one of nine different strategies based on its current state. This includes approaches such as utilizing blocks within the level to create a domino effect, targeting blocks that support heavy objects, maximum structure penetration and prioritizing protective blocks, as well as simpler options such as targeting pigs/TNT or utilizing certain birds' special abilities. These strategies are each given a score based on their estimated damage potential for the current bird type. A strategy is then chosen randomly, with this score being used to determine the likelihood of selection (i.e. shots that are believed to be the most effective are more likely to be chosen). The tap time for each bird is fixed based on the total length of its trajectory, with the exception of the white bird. This agent can also remember the shots and strategies previously carried out, to aid it when re-attempting levels.
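Score-proportional random selection of this kind is often implemented as a "roulette wheel", as in the sketch below. This is an illustrative reading of the description above, not BamBirds' published implementation; the class and field names are invented.

```java
import java.util.List;
import java.util.Random;

// Sketch of score-proportional (roulette-wheel) strategy selection: each
// candidate strategy carries an estimated damage score, and its probability
// of being chosen is proportional to that score.
public class WeightedStrategyChoiceSketch {

    public static class ScoredStrategy {
        final String name;
        final double estimatedDamageScore;

        public ScoredStrategy(String name, double estimatedDamageScore) {
            this.name = name;
            this.estimatedDamageScore = estimatedDamageScore;
        }
    }

    public static ScoredStrategy choose(List<ScoredStrategy> candidates, Random random) {
        double total = candidates.stream().mapToDouble(c -> c.estimatedDamageScore).sum();
        double pick = random.nextDouble() * total;
        double cumulative = 0.0;
        for (ScoredStrategy candidate : candidates) {
            cumulative += candidate.estimatedDamageScore;
            if (pick <= cumulative) {
                return candidate; // higher-scored strategies are selected more often
            }
        }
        return candidates.get(candidates.size() - 1); // numerical safety net
    }
}
```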
G. PlanA+ (Placed 4th; Sejong University; South Korea; First entered in 2014)
The PlanA+ agent alternates between two different strategies each time it attempts a level. The first strategy involves identifying two possible trajectories to every pig and TNT box within the level, and then counting the number of blocks (for each material) that are blocking each trajectory from being successful. The agent then compares the type of bird that is currently available against the number and material of blocks blocking each trajectory, to calculate a heuristic for each possible shot. This heuristic value defines the likelihood of the bird successfully making it to the specific target. The second strategy is similar to the first, except that the number of pixels crossing the trajectory is used rather than the number of blocks. Parameter values for each strategy are first identified by humans using trial and error, but are then optimised using a simple greedy-style algorithm. The tap time for each bird is fixed based on the location of the first obstacle in its trajectory, with the exception of the white bird.
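A minimal sketch of such a block-counting heuristic is given below: a candidate trajectory is scored by how many blocks of each material obstruct it, weighted by how well the current bird type penetrates that material. The penetration weights and the multiplicative form are assumptions for illustration; PlanA+'s tuned parameter values are not reproduced here.

```java
import java.util.Map;

// Sketch of a trajectory-reachability heuristic: score a shot by the blocks
// obstructing the trajectory, weighted by bird-vs-material penetration.
public class TrajectoryHeuristicSketch {

    enum Material { WOOD, ICE, STONE }
    enum Bird { RED, BLUE, YELLOW, BLACK, WHITE }

    // Hypothetical penetration weights: higher means the bird passes through more easily.
    private static double penetration(Bird bird, Material material) {
        switch (bird) {
            case YELLOW: return material == Material.WOOD ? 0.9 : 0.2;
            case BLUE:   return material == Material.ICE ? 0.9 : 0.2;
            case BLACK:  return material == Material.STONE ? 0.8 : 0.5;
            default:     return 0.4;
        }
    }

    // blockingBlocks maps each material to the number of blocks of that
    // material lying on the candidate trajectory.
    public static double reachability(Bird currentBird, Map<Material, Integer> blockingBlocks) {
        double score = 1.0;
        for (Map.Entry<Material, Integer> entry : blockingBlocks.entrySet()) {
            // Each obstructing block reduces the chance of reaching the target,
            // scaled by the current bird's penetration of that material.
            score *= Math.pow(penetration(currentBird, entry.getKey()), entry.getValue());
        }
        return score; // higher = more likely the bird reaches the target
    }
}
```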
H. AngryBNU (Placed 10th; Beijing Normal University; China; First entered in 2017)
The AngryBNU agent uses deep reinforcement learning, more specifically deep deterministic policy gradients (DDPG), to build a model for predicting suitable shots in unknown levels based on generalised experience from previously played levels. DDPG is a combination of deep Q-networks, deterministic policy gradient algorithms and actor-critic methods, allowing for efficient deep learning in continuous action spaces, such as the environment present in Angry Birds. The model trained with DDPG can be used to predict optimal shot angles and tap times, based on the features within a level. The level features that are considered when training and utilising this model are the current bird type, the distance to the target points, and a 128x128 pixel matrix around each target (nearby objects). The level screenshot received by the agent is also transformed into an annotated image that retains relevant features while discarding those which are unnecessary. This pre-processing allows for more efficient generalisation when only a limited training set is available. Continuous Q-learning (SARSA) is used as the critic model and policy gradient is used as the actor model. By following this process, a deep learning model is trained on the original Angry Birds levels that are available, which allows the agent to predict the best target point for a shot based on the level's features.
I. Condor (Placed 8th; UTN Facultad Regional Santa Fe; Argentina; First entered in 2017)
The Condor agent chooses from five different strategies when deciding what shot to perform. These are defined as the structure, boulder, TNT, bird and alone pig strategies. Each strategy has corresponding level requirements to decide whether it is considered or discarded for the current shot. Each strategy also has a numerical weighting based on human analysis of its potential impact for the current level. The structure strategy detects the shape of structures (two or more connected blocks) and classifies them as either a fortress or a lookout. Fortresses are targeted at the top left position, whilst lookouts are targeted at the mid-point. The boulder strategy targets round blocks next to pigs. The TNT strategy simply targets TNT boxes. The bird strategy identifies suitable blocks to hit based on their material and the type of bird that is currently available. The alone pig strategy targets pigs that are reachable and unprotected. The agent also uses an improved system when waiting for the resulting movement caused by a shot to finish, using a dynamic check rather than a static timer. The agent additionally re-adjusts the target point slightly if it believes this may give a better shot result (multiple target points for each object). The tap time for each bird is fixed based on the total length of its trajectory.
J. Vale Fina 007 (Placed 6th; Technical University of Crete; Greece; First entered in 2017)
The Vale Fina 007 agent uses reinforcement learning (specifically Q-learning) in an attempt to identify suitable shots for unknown levels based on past experience. In order to describe the current state of a level, a list of objects is used that contains information about every object within it. An object is described based on several features, including the object angle, object area, nearest pig distance, nearest round stone distance, the weight that the object supports, the impact that the current bird type has on the object, and several others. Q-learning is then used to associate the features of the objects within a level with certain actions (shots) that result in success. These features are weighted based on their perceived importance for a collection of sample testing levels. Unfortunately, no more information was provided about this agent, so a more in-depth comparison between this and the other reinforcement learning agent (AngryBNU) is not possible.
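For readers unfamiliar with this style of agent, the sketch below shows Q-learning with linear function approximation in the spirit of the description above: the value of shooting at an object is a weighted sum of that object's features, and the weights are updated from the score a shot actually produces. The feature set, learning rate and discount factor are illustrative assumptions, not the team's values.

```java
// Sketch of Q-learning with linear function approximation: Q(s, a) is a
// weighted sum of the target object's features, and the weights are updated
// from observed rewards (the points gained by a shot).
public class LinearQLearningSketch {
    private final double[] weights;
    private final double learningRate = 0.01; // assumed value
    private final double discount = 0.95;     // assumed value

    public LinearQLearningSketch(int numFeatures) {
        this.weights = new double[numFeatures];
    }

    // Approximate Q-value of shooting at an object described by its feature vector.
    public double qValue(double[] objectFeatures) {
        double q = 0.0;
        for (int i = 0; i < weights.length; i++) {
            q += weights[i] * objectFeatures[i];
        }
        return q;
    }

    // One temporal-difference update after taking a shot:
    // reward = points gained, bestNextQ = best Q-value among targets in the resulting state.
    public void update(double[] takenActionFeatures, double reward, double bestNextQ) {
        double tdError = reward + discount * bestNextQ - qValue(takenActionFeatures);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += learningRate * tdError * takenActionFeatures[i];
        }
    }
}
```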
V. RESULTS
During the competition agents played in repeated rounds of 8 levels which needed to be solved in 30 minutes, as per our already described tournament procedure. A total of 32 levels were created for the four rounds required. These levels were created using a variety of techniques, including both hand-designed levels written by the competition organisers and generated levels from this year's companion AIBIRDS level generation competition. The additional man vs. machine challenge was held after the main AI competition once the final agent rankings were known. The final result for each agent in each round of the main AI competition is shown in Table I. A dash in this table indicates that the agent was eliminated due to a low score and did not proceed to this round of the competition.

Although the qualification round was only used to divide agents into smaller groups for the quarter-finals, it was still useful in identifying the agents that would likely perform best in the future. Previous two-time winner Datalab was the best scoring agent for this round and looked set to dominate the following rounds as well. However, disaster struck in the quarter-finals with Datalab being ranked 7th, well below the requisite 4th place ranking to make it into the semi-finals. Instead the quarter-final round, and the subsequent semi-final round, had their highest score achieved by IHSEV. Even though IHSEV performed best in both the quarter and semi-final rounds, and was a clear favourite going into the grand final, it ultimately lost to Eagle's Wing.

We can also compare the results from this year's competition against the agent rankings from past years, see Table II, as well as the benchmark scores for each agent, see Table III. These benchmark scores are the total scores for each agent, when given 120 minutes to solve each of the first two sets of Poached Eggs levels from the original Angry Birds game (21 levels in each benchmark set).
Agent              Qualification   Quarter-finals   Semi-final   Grand final
(1) Eagle's Wing       416,650         175,510        350,900       355,700
(2) IHSEV              415,370         261,600        415,890       275,110
(3) Angry-HEX          405,340         242,980        238,040             -
(4) PlanA+             455,110         172,410        225,780             -
(5) s-birds            155,980         147,120              -             -
(6) Vale Fina 007      332,630         106,930              -             -
(7) Datalab            483,750          97,100              -             -
(8) Condor             282,000          94,600              -             -
(9) BamBirds           307,890          89,830              -             -
(10) AngryBNU                0               0              -             -
TABLE I: Round scores for each agent at the 2017 AIBIRDS competition (ordered based on final ranking)
Agent            2016   2015   2014   2013
Eagle's Wing      5th      -      -      -
IHSEV             2nd    4th    4th    8th
Angry-HEX         7th    2nd    7th    4th
PlanA+              -    5th    3rd      -
s-birds           8th    6th    6th   11th
Vale Fina 007       -      -      -      -
Datalab           3rd    1st    1st      -
Condor              -      -      -      -
BamBirds          1st      -      -      -
AngryBNU            -      -      -      -
TABLE II: Agent rankings at previous AIBIRDS competitions
Agent            Benchmark set 1   Benchmark set 2       Total
Eagle's Wing             941,840           896,630   1,838,470
IHSEV                    915,540           513,740   1,429,280
Angry-HEX                865,470           668,690   1,534,160
PlanA+                   922,480           653,720   1,576,200
s-birds                  732,080           223,710     955,790
Vale Fina 007            661,870           292,060     953,930
Datalab                  947,240         1,060,610   2,007,850
Condor                   765,870           190,860     956,730
BamBirds                 774,730           242,150   1,016,880
AngryBNU                 763,720           618,820   1,382,540
Naive                    855,370           584,290   1,439,660
TABLE III: Agent benchmark scores on original Angry Birds levels

These levels are available to participating teams before the competition, and allow us to compare each agent's performance when playing the levels it has been trained and fine-tuned on against the unknown levels of the competition.

For the man vs. machine challenge we had 45 human participants, facing the four best agents from the main AI competition (Eagle's Wing, IHSEV, Angry-HEX and PlanA+). The best performing agent was PlanA+, which solved three levels and scored 73,540 points; second was IHSEV, which also solved three levels for a total of 71,020 points; third was Eagle's Wing (the winner of the main AI competition), which solved two levels for 43,830 points; and last was Angry-HEX, which solved one level for 22,920 points. Only one of the 45 human players was unable to beat all agents, proving once again that agents still have a long way to go to achieve a level of skill equivalent to that of a human. For the record, the best score achieved by a human player was 178,290 points by Sebastian Rudolph from the University of Dresden, Germany.
VI. DISCUSSION
A. Competition issues
Due to the fact that some of the agents decided to implement their own computer vision module, there were occasionally times when agents would fail to detect key objects such as birds or pigs and would be unable to complete a level. So as not to penalise agents with this problem too much, any agent that was stuck on a particular level for too long without making a shot had its level reset manually. This would often fix any vision issues, and if it did not then the failure was on the agent. All levels were tested before the competition with our provided computer vision module, to ensure that there were no problems with agents that chose to use it. The deep learning techniques used by AngryBNU also required more memory than was typically available in past competitions, so all client computers had their memory increased from 4GB to 8GB. Apart from these small complications, everything else in the competition's procedure went exactly to plan.
B. Agent comparison
While the overall performance of each agent during this year's competition should be clear from the results, there is much discussion to be had on why certain agents were more successful than others and what this means for future work around developing AI systems for physical reasoning in unknown environments.
1) Agent techniques:
By looking at the techniques used by each agent, we can try and identify why certain approaches lead to vastly different agent performances against certain types of levels. From our own observations, each of the 10 agents entered into this year's competition can be grouped into one of three main categories based on the AI approach they used: heuristic-based agents (Datalab, Eagle's Wing, s-birds, BamBirds, PlanA+ and Condor), simulation-based agents (IHSEV and Angry-HEX) and reinforcement learning agents (AngryBNU and Vale Fina 007). Naturally there is some crossover between these groups, Angry-HEX for example uses heuristic calculations to identify shots that are worth simulating, but these categories allow us to discuss the different ways to approach this problem in broader and more general terms without having to refer to specific agents.

Heuristic-based agents are by far the most varying in their performance, as they effectively choose from a fixed number of strategies based on their level observations. The skill of a heuristic-based agent is therefore entirely dependent on the skill of the human designer, and their ability to identify common methods for solving levels. These agents have traditionally performed very well in both this year's and past competitions, but struggle with levels that cannot be solved using one of their pre-defined strategies. Simulation-based agents do not suffer from this limitation as much, as they instead simulate a variety of different possible shots using an internal simulator and pick that which has the best outcome. This method typically takes longer than the simpler heuristic-based approaches, especially on big levels with lots of objects, but can often find solutions to more non-traditional level designs. The problem with this approach lies in the fact that the internal simulation used by these agents is not a perfect representation of the actual Angry Birds game engine, and so its estimated shot results can sometimes be wildly inaccurate, leading to very strange and foolish shots. The last approach is that of reinforcement learning. This year was the first time that agents in the competition have used this technique, and unfortunately the results were far from groundbreaking. While advanced reinforcement learning techniques such as deep learning have proven successful in many other video games [30], they require a large number of varied training levels on which to practise (something which the version of Angry Birds we are using does not currently possess).
2) Shot time:
As previously mentioned, not all agents take the same length of time to make a shot, with many taking much longer to consider different options before acting. This is an additional factor that must be considered in conjunction with the strategy being used. The timed nature of this competition means that simulation-based strategies, such as those used by IHSEV, may not be the best approach. A 2017 paper on the development of an Angry Birds hyper-agent based on the 2016 competition entries found a moderate negative correlation between the average score and shot time for each agent [31]. This would suggest that having a faster shot time typically leads to a greater overall score, which is likely due to the increased number of level attempts this allows. Giving IHSEV more time to simulate a greater number of shot possibilities may allow it to perform better, although the discrepancy between its own internal simulation and the engine being used within the actual game would still hinder this approach. Allowing agents an unlimited or highly extended amount of time to make decisions would heavily restrict real-world applicability however, as an AI system that can perfectly reason about its environment is pointless if the time required to make decisions is too great. This same paper also found that heuristic and simulation-based approaches performed better at certain levels, suggesting that there is no one-size-fits-all strategy.
3) Meta-strategies:
Apart from the techniques used by each agent to solve levels, another degree of complexity that this competition brings is that of meta-strategies for determining which specific levels to play. The time given to each agent to solve a round of the competition is typically high enough that each level can be attempted multiple times. Some agents choose to attempt all levels once before replaying any unsolved levels (such as Datalab and Eagle's Wing) whilst others attempt a level multiple times before moving on (such as s-birds). Angry-HEX and BamBirds are also able to remember the shots and strategies previously carried out, to aid them when re-attempting levels later on. Whilst most agents try to solve all levels before re-attempting those already solved, BamBirds calculates a probability of attempting each level based on an estimated number of points for solving it, the number of times it has been played and the current score for that level. Agents can also see the scores of other agents, to determine which levels their counterparts are struggling with. Whilst these strategies can be very influential on an agent's final score, they are not yet sufficiently complex to warrant a full game theory style investigation and currently have little bearing outside of the competition environment.
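A simple form of such a level-selection meta-strategy is sketched below: a level's selection weight grows with the points still believed to be available and shrinks with the number of times it has already been attempted, and a level is then drawn at random in proportion to that weight. The weighting formula is an illustrative assumption, not the published BamBirds formula.

```java
import java.util.List;
import java.util.Random;

// Sketch of a meta-strategy for choosing which level to (re)play during a round.
public class LevelSelectionSketch {

    public static class LevelState {
        final int levelIndex;
        final double estimatedMaxScore; // points the level is believed to be worth
        final double currentBestScore;  // 0 if not yet solved
        final int attempts;

        public LevelState(int levelIndex, double estimatedMaxScore, double currentBestScore, int attempts) {
            this.levelIndex = levelIndex;
            this.estimatedMaxScore = estimatedMaxScore;
            this.currentBestScore = currentBestScore;
            this.attempts = attempts;
        }
    }

    public static int chooseLevel(List<LevelState> levels, Random random) {
        double[] weights = new double[levels.size()];
        double total = 0.0;
        for (int i = 0; i < levels.size(); i++) {
            LevelState level = levels.get(i);
            double remainingPoints = Math.max(0.0, level.estimatedMaxScore - level.currentBestScore);
            weights[i] = remainingPoints / (1 + level.attempts); // diminishing returns on retries
            total += weights[i];
        }
        double pick = random.nextDouble() * total;
        for (int i = 0; i < levels.size(); i++) {
            pick -= weights[i];
            if (pick <= 0) {
                return levels.get(i).levelIndex;
            }
        }
        return levels.get(levels.size() - 1).levelIndex; // fallback if all weights are zero
    }
}
```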
4) Creative levels:
Several of the levels used in this year's competition were designed to require some form of creative reasoning to solve them. These levels typically require agents to make a non-obvious first shot in order to clear the level with the second shot; see Figure 5 for an example. Human players can easily tell that if they first destroy the wooden support blocks the stone blocks will fall and leave the pig exposed, yet not one agent was able to solve this seemingly simple level. Most agents always targeted seemingly important objects such as pigs or TNT, as well as making greedy shots with the current bird without considering those still to come. Planning more than one shot ahead may seem like an incredibly difficult task without knowing the outcome of the initial shot, but humans who play Angry Birds can still come up with intelligent shot sequences without this information. It seems as though human intuition about how physical environments will react to certain actions can extend multiple steps into the future. Levels that are designed to deliberately exploit agent limitations and biases demonstrate the need to develop and combine different AI techniques across multiple fields in order to achieve success.

Fig. 5: A level used in the quarter-finals round that requires agents to plan multiple shots forward to achieve success.
5) Previous competitions:
Comparing the rankings of this year's agents against those same agents' rankings in previous competitions, see Table II, we can see that there is a large amount of variation in each agent's final ranking from year to year. Previous two-time winner Datalab, and last year's winner BamBirds, did surprisingly poorly in this year's competition, whilst this year's winner Eagle's Wing rose from 5th place last year. This is likely caused by the nature of our competition's "multiple round" design, which has been structured in such a way as to give weaker agents a greater chance of winning than if the sum of each agent's scores across all 32 competition levels was used. This format means that factors such as the decision of which levels to use for each round are very important. Swapping the levels used in the semi-final and grand final rounds may have resulted in a completely different competition winner. Nevertheless, the fact that certain agents perform better in different rounds with different levels indicates that the variety of Angry Birds level designs used in the competition is sufficiently diverse that no agent is currently skilled enough to successfully solve them all, and definitely not to a human's level of performance.
6) Agent benchmarks:
Whilst the competition's levels are unknown, the benchmark levels are provided to participating teams beforehand as a way to test and evaluate their agent(s). By comparing each agent's scores for these levels against those in the competition, we can see that some agents relied much more heavily on the design of the known levels than others. The clearest example would be AngryBNU, which did reasonably well on the benchmark levels, although not as well as some other agents, but failed to score any points at all during the competition. This would suggest that the deep reinforcement learning techniques it uses have been trained too heavily on the benchmark levels (i.e. overfitting), and thus cannot successfully adapt to the previously unseen levels used in the competition. As will be discussed later, recent improvements in the development of Angry Birds level generators could help alleviate this problem by providing a much larger selection of training levels.
7) Man vs. Machine:
In order to estimate how close we are to achieving our goal of agents with human level performance, we compared the skill of the best four agents against human participants in the man vs. machine challenge. In previous iterations, humans have always won, with a wide but shrinking margin. In 2013, half of the human participants were better than the best AI, while in 2014 it was a third, and in 2015/2016 the winning agent ended up being among the best eighth of all human players who participated. To up the stakes this year, we significantly increased the complexity and difficulty of the levels used in this challenge. As a result, the overall performance of our agents against human players dropped dramatically compared to previous years. This suggests that while agents have been improving in their ability to play Angry Birds levels with a more traditional design, those that require creative reasoning to solve still pose a significant challenge.
C. Combining AIBIRDS competitions
There are several ways in which the two current AIBIRDS competitions (agent and level generation) could be combined. As previously mentioned and utilised in this year's competition, level generators can be used to create additional levels for testing and evaluating the performance of an agent beyond the original hand-designed levels that the game currently provides. This ability to rapidly create new and unknown levels means that it is now possible to construct a large database of training levels for agents focused on reinforcement learning techniques, such as AngryBNU or Vale Fina 007. This addition could dramatically improve the performance of agents that use these techniques, particularly as they performed so poorly this year compared to other more traditional AI approaches. Agents are also extremely useful in the AIBIRDS level generation competition, as they can be used to evaluate and test levels created by the generators. Different agents could also be used to test a generated level against different playstyles, or to determine whether a level is too hard or too easy based on the number of agents that can solve it and how long it takes them. We also hope to be able to link both the AIBIRDS agent and level generation competitions in the future, perhaps with agents trying to beat generated levels and generators trying to create levels that are difficult for agents.
D. Research, education and teaching
The AIBIRDS website [32] holds an extensive repository of resources for anyone wishing to enter the competition or conduct research around it. This includes open source code and papers on prior agents to assist newcomers, benchmarking software for comparing different techniques, and extensive details on past competitions. There is a wide range of techniques that are yet to be successfully implemented which could dramatically increase agent performance, ranging from incredibly complex machine learning algorithms to simply better heuristics for evaluating shots. We hope that this paper will inspire others to take up the challenge; perhaps you could be the one who finally cracks this problem and develops a skilful agent that can outperform humans.

An additional version of the basic game-playing software has also been developed using the simple visual programming language Snap!. This version of the framework allows anyone to develop their own Angry Birds agent that can utilise multiple different strategies with little to no prior programming experience. We hope that this software will be used to promote computer science and AI to school children, whilst also teaching them basic programming skills. Separate AIBIRDS competitions focused on comparing agents developed by students from different countries could even be held, inspiring future generations of computer science researchers to take on the AI challenges of tomorrow.
E. Future ideas
We believe that there are several key areas where agents could be improved to help achieve better performance in the future. An overreliance on traditional Angry Birds levels seems to be a key weakness for a lot of agents, not just those that use reinforcement learning. While an increased number of available generated levels would help address this issue, it fails to tackle the underlying problem itself. Real-time machine learning techniques will need to be employed in order for agents to experiment and develop solutions to creatively designed levels. These agents can use human knowledge and insight as a useful starting point, but should also construct their own strategies independent of designer bias. It is clear from the results of this competition that this work still has a long way to go, and that any successful agent will need to combine multiple AI approaches to even stand a chance of rivalling human players.

The main improvements that could be made to future competitions would be to provide a greater variety of available levels, allowing competition entrants to better evaluate their agents before the actual competition, and to deliberately design levels that are meant to be difficult for agents but easy for human players, which would greatly help in identifying where certain AI approaches are lacking. It would also be very helpful if more competition participants made their agents open source or provided more detailed information about their inner workings. While a competitive mentality and a desire for teams with successful agents to keep their secrets to themselves is understandable, the main goal of this competition is to further the development and research around agents that can interact within a physical environment. The more transparent participating teams are with their breakthroughs, the more successful future agents will be. Several previous competition entrants have already published their research and agent designs in academic conference or journal papers [8], [9], [10], [11], [12], [13], [14], and we hope that future participants will continue doing this.

Beyond the competition itself, the results and scores for each agent can help identify where the current techniques for physical reasoning in unknown environments are lacking. It would seem that even advanced AI techniques such as deep learning are not enough on their own to solve this problem. Humans can accomplish these tasks with little cognitive effort, yet they remain incredibly difficult for an AI system to accomplish. Developing Angry Birds agents is only the start of this research but represents a clear and well-formed step in the right direction. The AIBIRDS competition was recently investigated in a 2016 expert survey on progress in AI, which predicted that Angry Birds agents should be able to outperform humans in the next three years (median of all experts' predicted times) [33]. While we would be thrilled if such an agent was created in the next three years, we feel that this is a severe underestimation of the challenges and complexities involved in such an accomplishment.

VII. CONCLUSION
In this paper we have presented an overview of the sixth AIBIRDS competition. The task of solving unknown Angry Birds levels posed by this competition is hugely relevant to many real-world problems that require physical reasoning. Even though many different AI approaches have been implemented to tackle this challenge, it appears that the problem is too difficult to be solved by any single technique alone. This year's competition even featured agents that attempted to use modern machine learning techniques, but unfortunately with little success. It seems evident from this and previous years' results that for future agents to succeed, they must draw from multiple areas of AI. We hope that in the future many of the competition entrants will mutually share information about their techniques, further pushing forward towards our goal of developing an AI system that can reason and act within unknown physical environments based solely on visual inputs or other forms of perception. We would also like to thank the members of the IJCAI committee, competition entrants and all man vs. machine participants for their contribution to making this event possible. We intend to run this competition again in 2018 and encourage all interested teams to participate in this exciting challenge.
REFERENCES

[1] M. Campbell, A. Hoane, and F.-h. Hsu, "Deep Blue," Artificial Intelligence, vol. 134, no. 1-2, pp. 57-83, 2002.
[2] D. Silver et al., "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7587, pp. 484-489, 2016.
[3] J. Togelius, "AI researchers, video games are your friends!" vol. 3, 2015, pp. 5-5.
[4] J. Renz, "AIBIRDS: The Angry Birds artificial intelligence competition," in AAAI Conference on Artificial Intelligence, 2015.
[5] Rovio Entertainment, "Angry Birds," https://www.angrybirds.com.
[6] In AAAI Conference on Artificial Intelligence, 2016, pp. 4338-4339.
[7] J. Renz, X. Ge, S. Gould, and P. Zhang, "The Angry Birds AI competition," AI Magazine, vol. 36, no. 2, pp. 85-87, 2015.
[8] P. A. Wałęga, M. Zawidzki, and T. Lechowski, "Qualitative physics in Angry Birds," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 2, pp. 152-165, 2016.
[9] M. Polceanu and C. Buche, "Towards a theory-of-mind-inspired generic decision-making framework," in IJCAI Symposium on AI in Angry Birds, 2013.
[10] S. Schiffer, M. Jourenko, and G. Lakemeyer, "Akbaba: An agent for the Angry Birds AI challenge based on search and simulation," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 2, pp. 116-127, 2016.
[11] F. Calimeri, M. Fink, S. Germano, A. Humenberger, G. Ianni, C. Redl, D. Stepanova, A. Tucci, and A. Wimmer, "Angry-HEX: An artificial player for Angry Birds based on declarative knowledge bases," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 2, pp. 128-139, 2016.
[12] S. Dasgupta, S. Vaghela, V. Modi, and H. Kanakia, "s-Birds Avengers: A dynamic heuristic engine-based agent for the Angry Birds problem," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 2, pp. 140-151, 2016.
[13] N. Tziortziotis, G. Papagiannis, and K. Blekas, "A Bayesian ensemble regression framework on the Angry Birds game," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 2, pp. 104-115, 2016.
[14] A. Narayan-Chen, L. Xu, and J. Shavlik, "An empirical evaluation of machine learning approaches for Angry Birds," in IJCAI Symposium on AI in Angry Birds, 2013.
[15] P. Zhang and J. Renz, "Qualitative spatial representation and reasoning in Angry Birds: The extended rectangle algebra," in Proceedings of the Fourteenth International Conference on Principles of Knowledge Representation and Reasoning, ser. KR'14, 2014, pp. 378-387.
[16] J. Togelius, N. Shaker, S. Karakovskiy, and G. Yannakakis, "The Mario AI championship 2009-2012," AI Magazine, vol. 34, pp. 89-92, 2013.
[17] J. Togelius, S. Karakovskiy, and R. Baumgarten, "The 2009 Mario AI competition," in IEEE Congress on Evolutionary Computation, 2010, pp. 1-8.
[18] S. Karakovskiy and J. Togelius, "The Mario AI benchmark and competitions," IEEE Transactions on Computational Intelligence and AI in Games, vol. 4, no. 1, pp. 55-67, 2012.
[19] S. Ontañón, G. Synnaeve, A. Uriarte, F. Richoux, D. Churchill, and M. Preuss, "A survey of real-time strategy game AI research and competition in StarCraft," IEEE Transactions on Computational Intelligence and AI in Games, vol. 5, no. 4, pp. 293-311, 2013.
[20] M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaśkowski, "ViZDoom: A Doom-based AI research platform for visual reinforcement learning," in IEEE Conference on Computational Intelligence and Games (CIG), 2016, pp. 1-8.
[21] R. Prada, P. Lopes, J. Catarino, J. Quitério, and F. S. Melo, "The Geometry Friends game AI competition," in IEEE Conference on Computational Intelligence and Games (CIG), 2015, pp. 431-438.
[22] F. Lu, K. Yamamoto, L. H. Nomura, S. Mizuno, Y. Lee, and R. Thawonmas, "Fighting game artificial intelligence competition platform," 2013, pp. 320-323.
[23] D. Perez-Liebana, S. Samothrakis, J. Togelius, T. Schaul, S. M. Lucas, A. Couëtoux, J. Lee, C. U. Lim, and T. Thompson, "The 2014 general video game playing competition," IEEE Transactions on Computational Intelligence and AI in Games, vol. 8, no. 3, pp. 229-243, 2016.
[24] R. D. Gaina, D. Pérez-Liébana, and S. M. Lucas, "General video game for 2 players: Framework and competition," 2016, pp. 186-191.
[25] R. D. Gaina, A. Couëtoux, D. Soemers, M. H. M. Winands, T. Vodopivec, F. Kirchgeßner, J. Liu, S. M. Lucas, and D. Perez, "The 2016 two-player GVGAI competition," IEEE Transactions on Computational Intelligence and AI in Games, 2017.
[26] D. Perez-Liebana, S. Samothrakis, J. Togelius, S. Lucas, and T. Schaul, "General video game AI: Competition, challenges, and opportunities," in AAAI Conference on Artificial Intelligence. AAAI Press, 2016, pp. 4335-4337.
[27] M. Stephenson and J. Renz, "Generating varied, stable and solvable levels for Angry Birds style physics games," in IEEE Conference on Computational Intelligence and Games (CIG), 2017, pp. 288-295.
[28] AIBIRDS, "AIBIRDS getting started," https://aibirds.org/basic-game-playing-software/getting-started.html, 2017, accessed: 2017-11-14.
[29] ——, "AIBIRDS 2017 participating teams," https://aibirds.org/angry-birds-ai-competition/participating-teams.html, 2017, accessed: 2017-11-14.
[30] N. Justesen, P. Bontrager, J. Togelius, and S. Risi, "Deep learning for video game playing," CoRR, vol. abs/1708.07902, 2017. [Online]. Available: http://arxiv.org/abs/1708.07902
[31] M. Stephenson and J. Renz, "Creating a hyper-agent for solving Angry Birds levels," in AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2017.
[32] AIBIRDS, "AIBIRDS homepage," https://aibirds.org, 2017, accessed: 2017-11-14.
[33] K. Grace, J. Salvatier, A. Dafoe, B. Zhang, and O. Evans, "When will AI exceed human performance? Evidence from AI experts."