Random Choices Can Facilitate the Solving of Collective Network Coloring Problems by Artificial Agents
Matthew I. Jones, Scott D. Pauls, Feng Fu
Department of Mathematics, Dartmouth College, Hanover, NH 03755, USA
Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, NH 03756, USA
Abstract.
Global coordination is required to solve a wide variety of challenging collective action problems, from network colorings to the tragedy of the commons. A recent empirical study shows that the presence of a few noisy autonomous agents can greatly improve the collective performance of humans in solving networked color coordination games. To provide further analytical insight into the role of behavioral randomness, here we study how myopic artificial agents attempt to solve similar network coloring problems using decision update rules that are based only on local information but allow random choices at various stages of their heuristic reasoning. We consider agents distributed over a random bipartite network, which is guaranteed to be solvable with two colors. Using agent-based simulations and theoretical analysis, we show that the resulting efficacy of resolving color conflicts depends on the specific implementation of the agents' random behavior, including the fraction of noisy agents and the decision stage at which noise is introduced. Moreover, behavioral randomness can be finely tuned to the specific underlying population structure, such as network size and average network degree, in order to produce advantageous results in finding collective coloring solutions. Our work demonstrates that distributed greedy optimization algorithms exploiting local information should be deployed in combination with occasional exploration via random choices in order to overcome local minima and achieve global coordination.
1. Introduction
Many classical games like the Prisoner's Dilemma focus on two players attempting to get the better of each other. Both players would like to defect while their opponent cooperates, thus reaping rewards and avoiding punishment. A great body of work is focused on how to foster cooperation in such non-zero-sum games [1, 2]. But there is another well-studied class of games, called coordination games, in which all players receive the most benefit when they work together [3]. The optimal behavior for all players can be easily determined and agreed upon if all players can meet and strategize beforehand. In such games, the difficulty comes not from attempting to outwit one's opponent, but from figuring out what one's partner will play before choosing one's own strategy [4, 5]. However, there can still be a "defecting" component, in which one's opponent can unilaterally choose a strategy with a lower maximum payoff but also less risk [6].

Frequently, we consider playing games in which the population is given some spatial structure other than being well-mixed [7, 8]. Population structure is typically modelled as a graph or network, where each node is an individual, and individuals play games if they are connected by an edge [9, 10, 11, 12, 13, 14, 15, 16]. On such a network, many coordination games can be rephrased as network coloring problems [17]. A coloring is a collection of labels, or colors, one for each node, such that any two nodes connected by an edge have different colors. Network colorings make appearances in all sorts of fields, including sudoku puzzles, register allocation in computer science [18], and clustering problems [19]. Deciding on a timetable for various classes with shared classrooms [20] and assigning radio frequencies [21] are just two examples of coordination games that manifest naturally as network coloring problems. Generally, we let the nodes be individuals (which we refer to as artificial agents in this work), and the color choice represents the strategy of that individual. When the nodes of a network are properly colored, all the individuals are playing an optimal strategy. In this sense, the network coloring problem, if assigned a proper payoff structure for the coloring outcome, can broadly be considered a coordination game [22, 23].

In general, the network coloring problem is NP-hard [24]. Many difficult mathematical problems cannot be solved by a simple, direct approach, but it can help to apply a small degree of randomness to an algorithm searching the solution space. This approach has been applied to all sorts of problems, including the Traveling Salesman Problem [25] and the graph coloring problem [26], with which we are concerned in this paper.

Attempts to solve the network coloring problem typically use information about the entire network to make decisions about the colors of nodes. This makes sense, as having all the information simultaneously leads to better-informed decisions. For example, Ref. [26] uses a notion of temperature to gradually reduce stochastic behavior as the system "cools" into the global solution. This requires some central information unit that instructs each node on color choice. However, if we are instead using the network as a model of a distributed system, where each agent has access only to local information about its own neighborhood, no such central unit is available, and agents must rely on simple local update rules.
2. Methods
As we will see, different network topologies will be easier or harder to color. Even with global information, finding network colorings becomes exponentially more difficult as the number of nodes increases [24]. On the other hand, as average degree increases, individuals will have more neighbors and therefore be able to make more informed decisions when choosing a color. Throughout this paper, we simulate artificial agents that attempt to find 2-colorings of random bipartite networks. The exact structure of these networks will vary, as will the decision update rules agents use to solve the network colorings.

In this paper, we consider multiple decision update rules to account for a variety of artificial agents' behavior, each with their own strengths and weaknesses. In the following, an acceptable local coloring at a node is a choice of color such that none of the node's neighbors have that color (no color conflicts with neighbors). We first consider a basic greedy update rule (I below), and then modified rules that incorporate random choices at various decision stages (II-IV below).

We create random bipartite networks [31] with n nodes and average degree k by first assigning each node to group A or group B with probability 1/2. Then, we add an edge between any two nodes in different groups with probability 2k/n. Thus, the resulting network is guaranteed to have a 2-coloring: assign every node in group A one color and every node in group B the other color. However, there may be different numbers of nodes of each color, as the sizes of groups A and B are binomially distributed in our bipartite network model.
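For concreteness, this construction can be sketched in a few lines of Python; the function name and the return format are our own illustrative choices, not the authors' code.

```python
import random

def random_bipartite_network(n, k, seed=None):
    """Build a random bipartite network with n nodes and average degree k.

    Each node joins group A (True) or group B (False) with probability 1/2.
    Cross-group pairs are joined with probability 2k/n, so a node's expected
    degree is (n/2) * (2k/n) = k. Returns (group flags, adjacency sets).
    """
    rng = random.Random(seed)
    in_group_a = [rng.random() < 0.5 for _ in range(n)]
    neighbors = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if in_group_a[u] != in_group_a[v] and rng.random() < 2 * k / n:
                neighbors[u].add(v)
                neighbors[v].add(u)
    return in_group_a, neighbors
```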
I: Basic greedy update rule
Step 1: Check if the current color is already an acceptable local coloring. If yes, make no further change.
Step 2: If not, check if the other color would make an acceptable local coloring. If yes, choose the other color.
Step 3: If not, choose the color that minimizes color conflicts, and choose one color at random if both color choices have the same number of color conflicts with neighbors.

A minimal sketch of this rule is given below. We then incorporate random behavior at various decision stages in the modified update rules that follow, all based on this basic greedy update rule.
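One way to implement the greedy rule is sketched below; the two-color encoding and the helper names (n_conflicts, greedy_color) are our own assumptions, and the adjacency-set format matches the network sketch above.

```python
COLORS = (0, 1)  # a 2-coloring problem, so two colors suffice

def n_conflicts(node, color, colors, neighbors):
    """Number of node's neighbors currently holding `color`."""
    return sum(1 for v in neighbors[node] if colors[v] == color)

def greedy_color(node, colors, neighbors, rng=random):
    """Basic greedy update rule (Steps 1-3)."""
    cur = colors[node]
    oth = 1 - cur
    c_cur = n_conflicts(node, cur, colors, neighbors)
    if c_cur == 0:            # Step 1: current color already acceptable
        return cur
    c_oth = n_conflicts(node, oth, colors, neighbors)
    if c_oth == 0:            # Step 2: the other color is acceptable
        return oth
    if c_cur != c_oth:        # Step 3: minimize conflicts...
        return cur if c_cur < c_oth else oth
    return rng.choice(COLORS)  # ...breaking ties uniformly at random
```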
II: Randomness-first update rule
Step 1: With probability p, choose a color uniformly at random.
Step 2: Check if the chosen color is already an acceptable local coloring. If yes, make no further change.
Step 3: If not, check if the other color would make an acceptable local coloring. If yes, choose the other color.
Step 4: If not, choose the color that minimizes color conflicts, and choose one color at random if both color choices have the same number of color conflicts with neighbors.

III: Memory-0 update rule

Step 1: Check if the current color is already an acceptable local coloring. If yes, make no further change.
Step 2: If not, check if the other color would make an acceptable local coloring. If yes, choose the other color.
Step 3: If not, with probability p, choose a color uniformly at random.
Step 4: Otherwise, choose the color of the two that minimizes color conflicts, and choose one color at random if both color choices have the same number of color conflicts with neighbors.

IV: Memory-N update rule

Step 1: Check if the current color is already an acceptable local coloring. If yes, make no further change.
Step 2: If not, check if the other color would make an acceptable local coloring. If yes, choose the other color.
Step 3: If not, and if no neighbors have changed colors in the prior N cycles, with probability p, choose a color uniformly at random.
Step 4: Otherwise, choose the color of the two that minimizes color conflicts, and choose one color at random if both color choices have the same number of color conflicts with neighbors.

Each artificial agent, located at a node in the network, behaves according to one of the aforementioned update rules. Specifically, we consider scenarios where the population may be using two different update rules. A certain fraction ρ_r of randomly selected agents adopt one of the randomness-first, memory-0, or memory-N update rules, where the propensity for random behavior is p (as defined in the update rules), and the rest of the agents use the basic greedy update rule.

The color choices of agents are updated in a random sequential manner [32]: agents update one at a time, and the order in which agents update is random. Each agent begins with a randomly chosen color. A sketch of the randomness-based rules and of this simulation loop is given below.

We use three different metrics to quantify how successful a given decision update rule is in solving coloring problems by artificial agents: the number of unsolved networks, the number of update cycles, and the number of agent updates. The number of unsolved networks metric is simply the probability that a given network will fail to reach a coloring given certain initial conditions, including the update order, the update rule for each agent, and the initial coloring. The number of update cycles measures the number of times each agent goes through the update process, and the number of updated agents measures the total number of color changes. Roughly, the number of update cycles measures how long it will take the system to reach a coloring in real time, and the number of updated agents measures how involved the process is for all agents involved. Because some combinations of networks and initial conditions may never reach a complete coloring solution, these metrics can be infinite in those cases. Therefore, the average of the difficulty metrics across model parameter combinations may be heavily skewed by some of the unsolved network coloring cases. Nevertheless, these difficulty metrics provide a practical means to compare the efficacy of resolving color conflicts across simulated scenarios and can help reveal interesting results.
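Building on the greedy_color sketch above, the following Python sketch illustrates the randomness-first and memory-N rules (memory-0 is the N = 0 case), together with the random sequential update loop and the difficulty metrics; the bookkeeping of "quiet" cycles and all names are our own illustrative assumptions, not the authors' code.

```python
def rand_first_color(node, colors, neighbors, p, rng):
    """Randomness-first rule: with probability p first adopt a random
    color (Step 1), then apply the greedy steps to that color."""
    if rng.random() < p:
        colors[node] = rng.choice(COLORS)
    return greedy_color(node, colors, neighbors, rng)

def memory_n_color(node, colors, neighbors, p, rng, quiet_cycles, N):
    """Memory-N rule: greedy, except that a conflicted node whose neighbors
    have not changed color in the prior N cycles acts randomly with
    probability p. Memory-0 (N = 0) drops the staleness condition."""
    cur, oth = colors[node], 1 - colors[node]
    if n_conflicts(node, cur, colors, neighbors) == 0:
        return cur                                       # Step 1
    if n_conflicts(node, oth, colors, neighbors) == 0:
        return oth                                       # Step 2
    if quiet_cycles >= N and rng.random() < p:
        return rng.choice(COLORS)                        # Step 3
    return greedy_color(node, colors, neighbors, rng)    # Step 4

def simulate(neighbors, noisy, p, N=0, max_cycles=10_000, rng=random):
    """Random sequential updating until a proper coloring or max_cycles.
    `noisy[v]` marks agents using the memory-N rule; the rest are greedy.
    Returns (solved, update_cycles, color_changes)."""
    n = len(neighbors)
    colors = [rng.choice(COLORS) for _ in range(n)]
    last_change = [0] * n   # cycle in which each node last changed color
    changes = 0
    for cycle in range(1, max_cycles + 1):
        for node in rng.sample(range(n), n):  # fresh random order each cycle
            quiet = cycle - 1 - max((last_change[v] for v in neighbors[node]),
                                    default=0)
            old = colors[node]
            if noisy[node]:
                colors[node] = memory_n_color(node, colors, neighbors,
                                              p, rng, quiet, N)
            else:
                colors[node] = greedy_color(node, colors, neighbors, rng)
            if colors[node] != old:
                last_change[node] = cycle
                changes += 1
        if all(n_conflicts(v, colors[v], colors, neighbors) == 0
               for v in range(n)):
            return True, cycle, changes
    return False, max_cycles, changes
```

The driver above assigns the memory-N rule to the noisy agents; swapping memory_n_color for rand_first_color (and ignoring the quiet-cycle bookkeeping) gives the randomness-first scenario. The three returned values correspond directly to the difficulty metrics described above.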
Figure 1. Overcoming local minima is often needed to solve collective action problems. (a) A small network that did not find a valid coloring using only greedy behavior. The four dashed edges represent "bowties," subgraphs where the greedy update rule can become gridlocked. The red edge shows a color conflict that cannot be resolved by greedy behavior. (b) The interior nodes of a bowtie are both forced to keep the same color by the exterior nodes, creating gridlock.
3. Results
To see how local minima arise, we show in Fig. 1a a small network in which each agent occupying a network node uses the greedy update rule. The dashed edges are "bowties," small subgraphs consisting of a central edge whose end nodes both have at least three edges. Motif structures like this can lead to gridlock and the failure of the greedy update rule, as demonstrated in Fig. 1b. If the central agents are playing the same color, they can become locked in by their other neighbors, and as a consequence, the greedy update rule becomes trapped at this local minimum, unable to explore the entire space and find a global minimum of color conflicts. Without random behavior, the network will never reach a global coloring once this happens. The smallest possible network structure that can become gridlocked is the six-node bowtie, as shown in Fig. 1b.

The simple case demonstrated in Fig. 1b yields an interesting insight. Consider the case where there is no random behavior and each agent is playing the greedy update rule. There are 6! · 2^6 = 46,080 possible initial conditions for the update order and initial colors. Using exhaustive search to work out each case, we find that the simple bowtie results in gridlock with probability 29/120. In each case, either gridlock or a global coloring is always reached after two update cycles.

Under the randomness-first update rule, as long as at least one of the six agents is capable of random behavior (which happens with probability 1 − (1 − ρ_r)^6), the network will eventually find a global coloring. However, under the memory-N update rule, the peripheral nodes already have a locally acceptable color and will not change even if they have the potential for random behavior. One of the middle two nodes must have random behavior to find a coloring, which happens with probability 1 − (1 − ρ_r)^2, a much less likely event than in the randomness-first update rule. Thus, the gridlock probabilities for the randomness-first and memory-N update rules, respectively, are approximately

P_rand-first(Gridlock) = (29/120)(1 − ρ_r)^6,  (1)

P_memory-N(Gridlock) = (29/120)(1 − ρ_r)^2.  (2)

We see excellent agreement between these equations and simulations in Fig. 2. We note that these probabilities are less accurate when p is large, because individuals could behave randomly before the system reaches gridlock, disrupting the earlier computation, which assumed that no random behavior takes place in the first two update cycles. Similarly, we see that the memory-N update rules require a larger ρ_r than the randomness-first rule to reach the same efficacy of resolving color conflicts. Under the memory-N rules, only agents with a color conflict are allowed to make random choices, unlike under the randomness-first update rule. Because random behavior is limited to individuals with a color conflict, large ρ_r values are less likely to result in too much randomness when most agents are already in a local coloring without conflicts and hence will not behave randomly in any given time step. We shall see this difference between the randomness-first and memory-N update rules manifest itself in simulations on larger networks in the following section.
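As a quick numerical check on the 29/120 figure, the sketch below (reusing greedy_color and n_conflicts from the Methods sketches) estimates the all-greedy gridlock probability on the six-node bowtie by Monte Carlo rather than by the exhaustive enumeration used in the text; the node labeling is our own.

```python
# Six-node bowtie: central edge (2, 3); node 2 also neighbors 0 and 1,
# and node 3 also neighbors 4 and 5.
BOWTIE = [{2}, {2}, {0, 1, 3}, {2, 4, 5}, {3}, {3}]

def bowtie_gridlock_rate(trials=200_000, seed=1):
    """Estimate the probability that all-greedy agents gridlock on the
    bowtie; should approach the exhaustive value 29/120 ≈ 0.2417."""
    rng = random.Random(seed)
    gridlocked = 0
    for _ in range(trials):
        colors = [rng.choice(COLORS) for _ in range(6)]
        order = rng.sample(range(6), 6)  # one order, matching the 6!·2^6 count
        for _ in range(2):  # gridlock or a coloring is reached in two cycles
            for node in order:
                colors[node] = greedy_color(node, colors, BOWTIE, rng)
        if any(n_conflicts(v, colors[v], colors, BOWTIE) > 0
               for v in range(6)):
            gridlocked += 1
    return gridlocked / trials
```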
Having defined the model parameters for the problem, we can now ask a basic question: what is the optimal amount of randomness to have in the system so as to reach a coloring solution? It turns out that the answer varies, depending on the specific update rule used, the size of the network, and the average degree of the underlying network. Typically, we will consider small and large networks with 50 and 500 nodes, respectively, and average network degree values of 2 and 20.

Figure 3 shows how noisy agents using different update rules succeed at reducing the total number of conflicts in different situations. Notice that no randomness-based update rule can beat the greedy update rule in the short term: the randomness-based update rules initially under-perform the greedy rule, only to eventually surpass it and completely eliminate color conflicts.
Figure 2. The probability of gridlock in the six-node bowtie for varying fractions of agents with random behavior, ρ_r. The simulations (using p = 0.5) match well with the analytic results in Eqs. (1) and (2). Here we compare the randomness-first rule with the memory-0 rule. Simulation results are averaged over 1,000 independent runs.
There are two sources of difficulty for coloring networks using any randomness-based update rule. If there is not enough randomness, the decision update rule is unable to break away from the local minimum found by agents using the greedy update rule. If there is too much randomness, the probability that at least one agent will be picking the wrong color every turn is so high that the network will not find a coloring in a reasonable number of time steps. Methods like simulated annealing avoid this problem by cooling the system and decreasing the amount of randomness over time [26]. However, in a distributed system (where each agent is using only local information to choose a color) with no global information like temperature, we are limited to very simple local update rules that simply cannot evolve over time.
For the randomness-first update rule, we ran simulations for 20 combinations of ρ_r and p values between 0 and 1. Networks that found a coloring within 10,000 update cycles were considered solved, and those that did not were considered unsolved. As seen in Fig. 4, an intermediate value of p is best for most of these networks. Unfortunately, for large networks with small average degree, there seems to be no good choice of p and ρ_r when using the randomness-first rule.

Notice that, in general, as network size goes up and/or average degree goes down, there are more unsolved networks. This makes intuitive sense, as additional nodes mean more colors that need to be correct, and a smaller average degree means the nodes have less information and make poorer decisions.

Figure 3. Plots of total conflicts vs. time. Each curve is the average of 1,000 simulations, each run for 10,000 update cycles. Observe that the x-axis is log-scale, to show the short- and long-term behavior of each update rule. All networks have average degree 2, and the other network properties are as follows: a) n = 50, p = 0.…, ρ_r = 0.…; b) n = 50, p = 0.…, ρ_r = 0.…; c) n = 50, p = 0.…, ρ_r = 1; d) n = 500, p = 0.…, ρ_r = 1.

Figure 4. For the randomness-first update rule, simulation results of the probability of not solving the network in 10,000 time steps using four different types of networks, as a function of the level of randomness p and the fraction of agents with random behavior ρ_r. The bipartite network parameters, including the size n and the average degree k used for the underlying networks, are as follows: a) n = 500, k = 2; b) n = 500, k = 20; c) n = 50, k = 2; d) n = 50, k = 20.

The memory-N rule

We first study the memory-0 update rule, which differs from the randomness-first rule in that agents only take random actions if they are in conflict with at least one of their neighbors. Thus, there are fewer needless random actions, and we would expect this decision update rule to perform better where excess randomness is an issue. This is partially confirmed by the simulations in Fig. 5.
Figure 5. For the memory-0 update rule, simulation results of the probability of not solving the network in 10,000 time steps using four different types of networks, as a function of the level of randomness p and the fraction of agents with random behavior ρ_r. The bipartite network parameters, including the size n and the average degree k used for the underlying networks, are as follows: a) n = 500, k = 2; b) n = 500, k = 20; c) n = 50, k = 2; d) n = 50, k = 20.

Generally, we see an improvement in performance over the randomness-first update rule. The memory-0 rule does very well when ρ_r is close to one, even for large networks with low average degree. However, it still struggles with excess randomness, particularly when network size and average degree are large. A higher average degree means that a single random color choice creates more color conflicts and therefore makes it more difficult for the system to settle into a global coloring. If we instead assume agents with a longer memory (i.e., N ≥ 1), excess randomness becomes less of a problem: as shown in Fig. 6, when ρ_r is close to one, networks are almost always able to find a global coloring, regardless of network size or average degree. On the other hand, if for some reason only a rather small fraction of the agents are allowed to use randomness-based update rules, the randomness-first update rule will have more success, as seen in the simple bowtie example in Fig. 1b.

Figure 6. For the memory-1 update rule, simulation results of the probability of not solving the network in 10,000 time steps using four different types of networks, as a function of the level of randomness p and the fraction of agents with random behavior ρ_r. The bipartite network parameters, including the size n and the average degree k used for the underlying networks, are as follows: a) n = 500, k = 2; b) n = 500, k = 20; c) n = 50, k = 2; d) n = 50, k = 20.
4. Discussion & Conclusion
Among other insights stemming from the present paper, an important one is that the type of decision update rule used by agents is at least as important as the amount of random behavior. The randomness-first and memory-N update rules require different conditions to be successful. This gives us two different update rules that are useful in different settings, and they should be thought of as complementary rather than one being superior to the other. For example, in a scenario where all agents are able to use a randomness-based update rule, a memory-N update rule can be used to great success. However, if only a few agents in the population can be persuaded to take on the personal risk of behaving randomly (or a small number of bots prescribed with random behavior have been deployed), the randomness-first rule with a suitable p will have a higher chance of success.

This paper most closely relates to previous work involving human subjects playing the coloring game with random bots [30]. While random behavior was observed coming from human players [29], it is not clear whether this behavior was closer to the randomness-first or the memory-N update rule. The noisy bots themselves in Ref. [30] played a randomness-first update rule, which may explain how such a small fraction (ρ_r = 0.…) of noisy agents could improve collective performance. It is also possible that a combination of rules, such as the randomness-first rule together with a memory-N rule, could succeed in places where neither update rule succeeds alone. Future work taking into account these extensions will be of interest and will improve our understanding of collective decision-making in the presence of noise [33, 34] and, more generally, of machine behavior [35].
5. Acknowledgements
F.F. is supported by the Bill & Melinda Gates Foundation (award no. OPP1217336), the NIH COBRE Program (grant no. 1P20GM130454), a Neukom CompX Faculty Grant, the Dartmouth Faculty Startup Fund, and the Walter & Constance Burke Research Initiation Award.
6. References

[1] M. A. Nowak. Five rules for the evolution of cooperation. Science, 314(5805):1560–1563, 2006.
[2] Michael Doebeli and Christoph Hauert. Models of cooperation based on the prisoner's dilemma and the snowdrift game. Ecology Letters, 8(7):748–766, 2005.
[3] Brian Skyrms. The Stag Hunt and the Evolution of Social Structure. Cambridge University Press, 2004.
[4] John B. Van Huyck, Raymond C. Battalio, and Richard O. Beil. Tacit coordination games, strategic uncertainty, and coordination failure. The American Economic Review, 80(1):234–248, 1990.
[5] Martin A. Nowak. Evolutionary Dynamics: Exploring the Equations of Life. Belknap Press of Harvard University Press, 2006.
[6] Christina Fang, Steven O. Kimbrough, Stefano Pace, Annapurna Valluri, and Zhiqiang Zheng. On adaptive emergence of trust behavior in the game of stag hunt. Group Decision and Negotiation, 11(6):449–467, 2002.
[7] Richard Durrett and Simon Levin. The importance of being discrete (and spatial). Theoretical Population Biology, 46(3):363–394, 1994.
[8] György Szabó, Jeromos Vukov, and Attila Szolnoki. Phase diagrams for an evolutionary prisoner's dilemma game on two-dimensional lattices. Physical Review E, 72(4):047107, 2005.
[9] Hisashi Ohtsuki, Christoph Hauert, Erez Lieberman, and Martin A. Nowak. A simple rule for the evolution of cooperation on graphs and social networks. Nature, 441(7092):502–505, 2006.
[10] Francisco C. Santos and Jorge M. Pacheco. Scale-free networks provide a unifying framework for the emergence of cooperation. Physical Review Letters, 95(9):098104, 2005.
[11] Feng Fu, Christoph Hauert, Martin A. Nowak, and Long Wang. Reputation-based partner choice promotes cooperation in social networks. Physical Review E, 78(2), 2008.
[12] Matjaž Perc and Attila Szolnoki. Coevolutionary games – a mini review. BioSystems, 99(2):109–125, 2010.
[13] David G. Rand, Samuel Arbesman, and Nicholas A. Christakis. Dynamic social networks promote cooperation in experiments with humans. Proceedings of the National Academy of Sciences, 108(48):19193–19198, 2011.
[14] Hirokazu Shirado, Feng Fu, James H. Fowler, and Nicholas A. Christakis. Quality versus quantity of social ties in experimental cooperative networks. Nature Communications, 4(1):1–8, 2013.
[15] Jesús Gómez-Gardeñes, Michel Campillo, Luis Mario Floría, and Yamir Moreno. Dynamical organization of cooperation in complex topologies. Physical Review Letters, 98(10):108103, 2007.
[16] Hirokazu Shirado and Nicholas A. Christakis. Network engineering using autonomous agents increases cooperation in human groups. iScience, 23(9):101438, 2020.
[17] Stephen Judd, Michael Kearns, and Yevgeniy Vorobeychik. Behavioral dynamics and influence in networked coloring and consensus. Proceedings of the National Academy of Sciences, 107(34):14978–14982, 2010.
[18] G. J. Chaitin. Register allocation & spilling via graph coloring. ACM SIGPLAN Notices, 17(6):98–101, 1982.
[19] Pierre Hansen and Michel Delattre. Complete-link cluster analysis by graph coloring. Journal of the American Statistical Association, 73(362):397–403, 1978.
[20] D. De Werra. An introduction to timetabling. European Journal of Operational Research, 19(2):151–162, 1985.
[21] J. Zoellner and C. Beall. A breakthrough in spectrum conserving frequency assignment technology. IEEE Transactions on Electromagnetic Compatibility, EMC-19(3):313–319, 1977.
[22] Jeremy Kun, Brian Powers, and Lev Reyzin. Anti-coordination games and stable graph colorings. In International Symposium on Algorithmic Game Theory, pages 122–133. Springer, 2013.
[23] Krzysztof R. Apt, Mona Rahn, Guido Schäfer, and Sunil Simon. Coordination games on graphs. In International Conference on Web and Internet Economics, pages 441–446. Springer, 2014.
[24] Michael R. Garey and David S. Johnson. Computers and Intractability. Freeman, 1999.
[25] Ernesto Bonomi and Jean-Luc Lutton. The N-city travelling salesman problem: statistical mechanics and the Metropolis algorithm. SIAM Review, 26(4):551–568, 1984.
[26] David S. Johnson, Cecilia R. Aragon, Lyle A. McGeoch, and Catherine Schevon. Optimization by simulated annealing: an experimental evaluation; part II, graph coloring and number partitioning. Operations Research, 39(3):378–406, 1991.
[27] Irene Finocchi, Alessandro Panconesi, and Riccardo Silvestri. An experimental analysis of simple, distributed vertex coloring algorithms. Algorithmica, 41(1):1–23, 2004.
[28] Kamalika Chaudhuri, Fan Chung Graham, and Mohammad Shoaib Jamall. A network coloring game. In Internet and Network Economics, Lecture Notes in Computer Science, pages 522–530, 2008.
[29] M. Kearns. An experimental study of the coloring problem on human subject networks. Science, 313(5788):824–827, 2006.
[30] Hirokazu Shirado and Nicholas A. Christakis. Locally noisy autonomous agents improve global human coordination in network experiments. Nature, 545(7654):370–374, 2017.
[31] Jean-Loup Guillaume and Matthieu Latapy. Bipartite graphs as models of complex networks. Physica A: Statistical Mechanics and its Applications, 371(2):795–813, 2006.
[32] György Szabó and Gábor Fáth. Evolutionary games on graphs. Physics Reports, 446(4-6):97–216, 2007.
[33] Iain D. Couzin, Christos C. Ioannou, Güven Demirel, Thilo Gross, Colin J. Torney, Andrew Hartnett, Larissa Conradt, Simon A. Levin, and Naomi E. Leonard. Uninformed individuals promote democratic consensus in animal groups. Science, 334(6062):1578–1580, 2011.
[34] Iain D. Couzin, Jens Krause, Nigel R. Franks, and Simon A. Levin. Effective leadership and decision-making in animal groups on the move. Nature, 433(7025):513–516, 2005.
[35] Iyad Rahwan, Manuel Cebrian, Nick Obradovich, Josh Bongard, Jean-François Bonnefon, Cynthia Breazeal, Jacob W. Crandall, Nicholas A. Christakis, Iain D. Couzin, Matthew O. Jackson, et al. Machine behaviour. Nature, 568(7753):477–486, 2019.