Follow-the-Regularized-Leader Routes to Chaos in Routing Games

Abstract

We study the emergence of chaotic behavior of Follow-the-Regularized Leader (FoReL) dynamics in games. We focus on the effects of increasing the population size or the scale of costs in congestion games, and generalize recent results on unstable, chaotic behaviors in the Multiplicative Weights Update dynamics to a much larger class of FoReL dynamics. We establish that, even in simple linear non-atomic congestion games with two parallel links and any fixed learning rate, unless the game is fully symmetric, increasing the population size or the scale of costs causes learning dynamics to become unstable and eventually chaotic, in the sense of Li-Yorke and positive topological entropy. Furthermore, we show the existence of novel non-standard phenomena such as the coexistence of stable Nash equilibria and chaos in the same game. We also observe the simultaneous creation of a chaotic attractor as another chaotic attractor gets destroyed. Lastly, although FoReL dynamics can be strange and non-equilibrating, we prove that the time average still converges to an exact equilibrium for any choice of learning rate and any scale of costs.



JAKUB BIELAWSKI, THIPARAT CHOTIBUT, FRYDERYK FALNIOWSKI, GRZEGORZ KOSIOROWSKI, MICHAŁ MISIUREWICZ, AND GEORGIOS PILIOURAS

1. Introduction

arXiv:… [cs.GT] … Feb …

We study the dynamics of online learning in a non-atomic repeated congestion game. Namely, every iteration of the game presents a population (i.e., a continuum of players) with a choice between two strategies, and imposes on them a cost which increases with the fraction of the population adopting the same strategy. In each iteration, the players update their strategy, accounting for the outcomes of previous iterations. The cost structure is that of congestion games, which were introduced by Rosenthal [45] and are among the most studied classes of games. A seminal result of [40] shows that congestion games are isomorphic to potential games; as such, numerous learning dynamics are known to converge to Nash equilibria [5, 14–16, 28, 29, 35].

A prototypical class of online learning dynamics is Follow the Regularized Leader (FoReL) [22, 48]. FoReL includes as special cases ubiquitous meta-algorithms, such as the Multiplicative Weights Update (MWU) algorithm [2]. Under FoReL, the strategy in each iteration is chosen by minimizing the weighted (by the learning rate) sum of the total cost of all actions chosen by the players and the regularization term. FoReL dynamics are known to achieve optimal regret guarantees (i.e., to be competitive with the best fixed action in hindsight), as long as they are executed with a highly optimized learning rate; i.e., one that is decreasing with the steepness of the online costs (inverse to the Lipschitz constant of the online cost functions) as well as decreasing with time $T$ at a rate $1/\sqrt{T}$ [48].

Although precise parameter tuning is perfectly reasonable from the perspective of algorithmic design, it seems implausible from the perspective of behavioral game theory and modeling. For example, experimental and econometric studies based on a behavioral game-theoretic learning model known as Experience-Weighted Attraction (EWA), which includes MWU as a special case, have shown that agents can use much larger learning rates than those required for the standard regret bounds to be meaningfully applicable [9, 23–25]. In some sense, such a tension is to be expected because small and optimized learning rates are designed with system stability and asymptotic optimality in mind, whereas selfish agents care more about short-term rewards, which result in larger learning rates and more aggressive behavioral adaptation. Interestingly, recent work on learning in games studies exactly such settings of FoReL dynamics with large, fixed step-sizes, showing that vanishing and even constant regret is possible in some game settings [3, 4].

For congestion games, it is reasonable to expect that increased demands and thus larger daily costs should result in steeper behavioral responses, as agents become increasingly agitated at the mounting delays. However, to capture this behavior we need to move away from the standard assumption of effectively scaling down the learning rate as the costs increase, and allow for more general models that can incorporate non-vanishing regret. Thus, in this regime, FoReL dynamics in congestion games cannot be reduced to standard regret-based analysis [7], or even Lyapunov function arguments (e.g., [43]), and more refined and tailored arguments are needed.

In a series of works [10, 11, 42], the special case of MWU was analyzed under arbitrary population sizes (demands). In a nutshell, for any fixed learning rate, MWU becomes unstable/chaotic even in small congestion games with just two strategies/paths as long as the total demand exceeds some critical threshold, whereas for small population sizes it is always convergent.

Can we extend our understanding from MWU to more general FoReL dynamics? Moreover, are the results qualitatively similar, showing that the dynamic is either convergent for all initial conditions or non-convergent for almost all initial conditions, or can there be more complicated behaviors depending on the choice of the regularizer of FoReL dynamics?

Our model & results.

We analyze FoReL-based dynamics with steep regularizers¹ in non-atomic linear congestion games with two strategies. This seemingly simple setting will suffice for the emergence of highly elaborate and unpredictable behavioral patterns. For any such game $G$ and an arbitrarily small but fixed learning rate $\varepsilon$, we show that there exists a system capacity $N(G, \varepsilon)$ such that the system is unstable when the total demand exceeds this threshold. In such a case, we observe complex non-equilibrating dynamics: periodic orbits of any period and chaotic behavior of trajectories (Section 7). A core technical result is that almost all such congestion games (i.e., unless they are fully symmetric), given sufficiently large demand, will exhibit Li-Yorke chaos and positive topological entropy (Section 7.1). In the case of games with asymmetric equilibrium flow, the bifurcation diagram is very complex (see Section 8). Li-Yorke chaos implies that there exists an uncountable set of initial conditions that gets scrambled by the dynamics. Formally, given any two initial conditions $x(0), y(0)$ in this set, $\liminf_{t\to\infty} \mathrm{dist}(x(t), y(t)) = 0$ while $\limsup_{t\to\infty} \mathrm{dist}(x(t), y(t)) > 0$, meaning that trajectories come arbitrarily close together infinitely often but also then move away again. In the special case where the two edges have symmetric costs (equilibrium flow is the 50-50 split), the system will still become unstable given large enough demand, but chaos is not possible. Instead, in the unstable regime, all but a measure-zero set of initial conditions gets attracted by periodic orbits of period two which are symmetric around the equilibrium. Furthermore, we construct formal criteria for when the Nash equilibrium flow is globally attracting. For such systems we can prove their equilibration and thus social optimality even when standard regret bounds are not applicable (Section 6.1). Also, remarkably, whether the system is equilibrating or chaotic, we prove that the time-average flows of FoReL dynamics exhibit regularity and always converge exactly to the Nash equilibrium (Section 4).

¹ Steepness of the regularizer guarantees that the dynamics will be well-defined as a function of the current state of the congestion game. For details, see Section 2.

[Figure 1: two rows of panels, one per parameter pair $(a, b)$. Left column: the potential $\Phi_{a,b}(x)$ as a function of $x$. Right column: trajectories $x_n$ as a function of $n$ for initial conditions $x \approx x_l$ and $x \approx x_r$; the level $b$ is marked.]

Figure 1. Coexistence of a locally attracting Nash equilibrium (green), limit cycles, and chaos in the same congestion game. Since the congestion game has an associated convex potential (cost) function $\Phi_{a,b}(x) = a\left((1-b)x^2 + b(1-x)^2\right)$ with a unique global minimum at the Nash equilibrium $b$, standard learning algorithms such as a gradient-like update with a small step size will converge to the equilibrium. However, here we highlight the unusual coexistence of the attracting Nash equilibrium, limit cycles, and chaos for FoReL dynamics with the log-barrier regularizer $r(x) = (1-x)\log(1-x) + x\log(x) - \log(-x^2 + x + 0.\ldots)$. The right column shows that the FoReL dynamics $x_n$ depend on the initial conditions (cyan and orange colors). Red encodes the dynamics initialized near the left critical point $x_l$, which converge to the Nash equilibrium $b$. Blue encodes the dynamics initialized near the right critical point $x_r$, which converge to a limit cycle of period 2 (top) and to chaotic attractors (bottom). Convergence to the Nash equilibrium arises through dynamics that lower the cost function at every successive step (left column), while convergence to a limit cycle or a chaotic attractor incurs a large cost, bouncing around in the cost landscape away from the Nash equilibrium. Remarkably, despite the dynamics being periodic or chaotic, we prove that their time average converges exactly to the Nash equilibrium $b$, independent of the interior initial conditions. The bifurcation diagram associated with $b = 0.\ldots$, which demonstrates the coexistence of multiple attractors in the same game, is shown in Fig. 2.

In Section 8, for the first time, to our knowledge, we report strange dynamics arising from FoReL in congestion games.
Firstly, we numerically show that for FoReL dynamics a locally attracting Nash equilibrium and chaos can coexist; see Figure 2. This is also formally proven in Section 6.2. Given the prominence of local stability analysis of equilibria in numerous game-theoretic settings which are widely used in Artificial Intelligence, such as Generative Adversarial Networks (GANs), e.g., [18, 33, 37, 41, 53], we believe that this result is rather important, as it reveals that local stability analysis is not sufficient to guard against chaotic behaviors even in a trivial game with one (locally stable) Nash equilibrium! Secondly, Figure 4 reveals that chaotic attractors can be non-robust. Specifically, we show that mild perturbations in the parameter can lead to the destruction of one complex attractor while another totally distinct complex attractor is born! To the best of our knowledge, these phenomena have never been reported, and thus they expand our understanding of the range of possible behaviors in game dynamics. Several more examples of complex phenomena are provided in Section 8. Finally, further calculations for entropic regularizers can be found in Appendix A.

Our findings suggest that the chaotic behavior of players using the Multiplicative Weights Update algorithm in congestion games (see results from [10, 11, 42]) is not an exception but the rule. Chaos is robust and can be seen for a vast subclass of online learning algorithms. In particular, our results apply to an important subclass of regularizers, the generalized entropies, which are widely used concepts in information theory, complexity theory, and statistical mechanics [13, 50, 51]. Steep functions [34–36] and generalized entropies are also often used as regularizers in game-theoretic settings [8, 12, 35]. In particular, Havrda-Charvát-Tsallis entropy-based dynamics were studied, for instance, in [20, 26]. Lastly, the emergence of chaos is clearly a hardness type of result. Such results only increase in strength the simpler the class of examples is. Complicated games are harder to learn and it is harder for players to coordinate on an equilibrium. Thus, in more complicated games one should expect even more complicated, unpredictable behaviors.

2. Model

We consider a two-strategy congestion game (see [45]) with a continuum of players (agents), where all of the players apply the Follow the Regularized Leader (FoReL) algorithm to update their strategies [48]. Each of the players controls an infinitesimally small fraction of the flow. We assume that the total flow of all the players is equal to $N$. We denote the fraction of the players adopting the first strategy at time $n$ as $x_n$. The second strategy is then chosen by a $1 - x_n$ fraction of the players. This model encapsulates how a large population of commuters selects between the two alternative paths that connect the initial point to the end point. When a large fraction of the players adopt the same strategy, congestion arises, and the cost of choosing the same strategy increases.

Linear congestion games: We focus on linear cost functions. Specifically, the cost of each path (link, route, or strategy) is proportional to the load. Denoting by $c_j$ the cost of selecting strategy $j$ (when a relative fraction $x$ of the agents choose the first strategy),

$$c_1(x) = \alpha N x, \qquad c_2(1-x) = \beta N (1-x), \tag{1}$$

where $\alpha, \beta > 0$ are the coefficients of proportionality. Without loss of generality we will assume throughout the paper that $\alpha + \beta = 1$. Therefore, the values of $\alpha$ and $\beta = 1 - \alpha$ indicate how different the path costs are from each other.

A quantity of interest is the value of the equilibrium split; i.e., the relative fraction of players using the first strategy at equilibrium. The first benefit of this formulation is that the fraction of agents using each strategy at equilibrium is independent of the flow $N$: equating the two costs in (1), $\alpha N x = \beta N (1-x)$, gives $x = \beta$. The second benefit is that, independent of $\alpha$, $\beta$ and $N$, playing the Nash equilibrium results in the optimal social cost, which is the point of contact with the Price of Anarchy research [11, 30].

2.1. Learning in congestion games with FoReL algorithms.

We assume that the players at time $n+1$ know the cost of the strategies at time $n$ (equivalently, the realized flow (split) $(x_n, 1 - x_n)$) and update their choices according to the Follow the Regularized Leader (FoReL) algorithm. Namely, in period $n+1$ the players choose the first strategy with probability $x_{n+1}$ such that:

$$\begin{aligned}
x_{n+1} &= \arg\min_{x \in (0,1)} \Big( \varepsilon \sum_{j \le n} \big[ c_1(x_j) \cdot x + c_2(1-x_j) \cdot (1-x) \big] + R(x, 1-x) \Big) \\
&= \arg\min_{x \in (0,1)} \Big( \varepsilon \sum_{j \le n} \big[ \alpha N \cdot x_j \cdot x + \beta N \cdot (1-x_j) \cdot (1-x) \big] + R(x, 1-x) \Big),
\end{aligned} \tag{2}$$

where $c_1(x_j) \cdot x + c_2(1-x_j) \cdot (1-x)$ is the total cost that is inflicted on the population of agents playing against the mix $(x, 1-x)$ in period $j$, while $R : (0,1)^2 \to \mathbb{R}$ is a regularizer which represents a "risk penalty": namely, that term penalizes abrupt changes of strategy based on a small amount of data from previous iterations of the game. The existence of a regularizer rules out strategies that focus too much on optimizing with respect to the history of the game. A weight coefficient $\varepsilon > 0$ of our choosing is used to balance these two terms and may be perceived as a propensity to learn and try new strategies based on new information: the larger $\varepsilon$ is, the faster the players learn and the more eager they are to update their strategies. As is commonly assumed, the learning rate $\varepsilon$ can be regarded as a small, fixed constant in the following analysis, but its exact value is not of particular interest. Our analysis and results hold for any fixed choice of $\varepsilon$.

Note that FoReL can also be regarded as an instance of exploration-exploitation dynamics under the multi-armed bandits framework in online learning [54].
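As a rough numerical illustration of the update (2), the following sketch minimizes the FoReL objective directly for the two-link linear game. It is a minimal sketch under stated assumptions, not the authors' implementation: the regularizer is taken to be the negative Shannon entropy $R(x, 1-x) = x \log x + (1-x)\log(1-x)$, the minimizer is located by a ternary search (which is valid because the objective is convex in $x$), and the function names and the sample history are hypothetical.

```python
import math


def shannon_regularizer(x):
    """Negative Shannon entropy r(x) = x log x + (1 - x) log(1 - x) (assumed choice of R)."""
    return x * math.log(x) + (1.0 - x) * math.log(1.0 - x)


def forel_objective(x, history, alpha, beta, N, eps):
    """Objective in (2): eps * (cumulative cost against the mix (x, 1 - x)) + regularizer."""
    cumulative = sum(
        alpha * N * xj * x + beta * N * (1.0 - xj) * (1.0 - x) for xj in history
    )
    return eps * cumulative + shannon_regularizer(x)


def forel_step(history, alpha, beta, N, eps, iters=200):
    """Approximate the arg min over (0, 1) by ternary search on the convex objective."""
    lo, hi = 1e-12, 1.0 - 1e-12  # stay strictly inside (0, 1) so the logs are finite
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        f1 = forel_objective(m1, history, alpha, beta, N, eps)
        f2 = forel_objective(m2, history, alpha, beta, N, eps)
        if f1 < f2:
            hi = m2  # the minimizer cannot lie to the right of m2
        else:
            lo = m1  # the minimizer cannot lie to the left of m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    history = [0.5, 0.5]  # hypothetical past splits x_0, x_1
    for eps in (0.01, 1.0, 100.0):
        x_next = forel_step(history, alpha=0.7, beta=0.3, N=1.0, eps=eps)
        print(f"eps = {eps:7.2f}  ->  next split = {x_next:.6g}")
```

With this history the first link has accumulated the higher cost, so the minimizer sits below $1/2$, and it moves further toward $0$ as the weight $\varepsilon$ on the cumulative cost grows.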
In the limit $\varepsilon \gg 1$, (2) is well approximated by the minimization of the cumulative expected cost

$$\sum_{j \le n} \big[ c_1(x_j) \cdot x + c_2(1-x_j) \cdot (1-x) \big] = \sum_{j \le n} c_2(1-x_j) + \Big( \sum_{j \le n} \big[ c_1(x_j) - c_2(1-x_j) \big] \Big) x,$$

and the minimization yields

$$x_{n+1} = \begin{cases} 0, & \sum_{j \le n} \big[ c_1(x_j) - c_2(1-x_j) \big] > 0, \\ 1, & \sum_{j \le n} \big[ c_1(x_j) - c_2(1-x_j) \big] < 0. \end{cases}$$

Namely, the strategy that incurs the least cumulative cost in the past time horizon is selected with probability 1. This term thus represents exploitation dynamics in the reinforcement learning and multi-armed bandits framework. In the opposite limit, when $\varepsilon \ll 1$, (2) is well approximated by the minimization of the regularizer $R(x, 1-x)$. For the Shannon entropy regularizer $R(x, 1-x) = -H_S(x, 1-x) = x \log x + (1-x)\log(1-x)$, which results in the Multiplicative Weights Update algorithm (see the details in Appendix A and Sec. 3), its minimization yields $x_{n+1} = 1 - x_{n+1} = 1/2$. The entropic regularization term tends to explore every strategy with equal probabilities, neglecting the information of the past cumulative cost. Thus, this regularization term corresponds to exploration dynamics. Therefore, $\varepsilon$ adjusts the tradeoff between exploration and exploitation. The continuous-time variant of (2) with the Shannon entropy regularizer has been studied as a model of collective adaptation [46, 47], also known as Boltzmann Q-learning [27], in which the exploitation term is interpreted as behavioral adaptation whereas the exploration term represents memory loss. More recent continuous-time variants study generalized entropies as regularizers, leading to a larger class of dynamics called Escort Replicator Dynamics [20], which was analyzed extensively in [34, 35].

Motivated by the continuous-time dynamics with generalized entropies, we extend the FoReL discrete-time dynamics (2) to a larger class of regularizers. For a given regularizer $R$, we define an auxiliary function:

$$r : (0,1) \ni x \mapsto R(x, 1-x) \in \mathbb{R}. \tag{3}$$

We restrict the analysis to a FoReL class of regularizers for which the dynamics implied by the algorithm is well-defined. Henceforth, we assume that $R$ is a steep symmetric convex regularizer, namely $R \in SSC$, where:

$$SSC = \Big\{ R \in C^2\big((0,1)^2\big) : \ \forall_{(x,y) \in (0,1)^2}\ R(y,x) = R(x,y); \ \ \forall_{x \in (0,1)}\ r''(x) > 0; \ \ \lim_{x \to 0^+} r'(x) = -\infty \Big\}.$$

These conditions on regularizers are not overly restrictive: the assumptions of convexity and symmetry of the regularizer are natural, and if $\lim_{x \to 0} r'(x)$ is finite, then the dynamics of $x_n$ from (2) will not be well-defined.

Many well-known and widely used regularizers, like (negative) Arimoto entropies (Shannon entropy, Havrda-Charvát-Tsallis (HCT) entropies and log-barrier being the most famous ones) and (negative) Rényi entropies, under mild assumptions, belong to $SSC$ (see Appendix A). A standard non-example is the square of the Euclidean norm $R(x, 1-x) = x^2 + (1-x)^2$.

3. The dynamics introduced by FoReL

Let $R \in SSC$. Assume that up to iteration $n > 0$ the trajectory $(x_0, x_1, \ldots, x_{n-1})$ was established by (2). Then

$$x_n = \arg\min_{x \in (0,1)} \Big( N\varepsilon \sum_{j \le n-1} \big[ \alpha \cdot x_j \cdot x + \beta \cdot (1-x_j) \cdot (1-x) \big] + r(x) \Big).$$

The first-order condition yields

$$r'(x_n) = -N\varepsilon \sum_{j \le n-1} \big[ \alpha \cdot x_j - \beta \cdot (1-x_j) \big].$$

We know that $r$ is convex; therefore the necessary and sufficient condition for $x_{n+1}$ to satisfy (2) takes the form:

$$\begin{aligned}
r'(x_{n+1}) &= -N\varepsilon \sum_{j \le n} \big[ \alpha \cdot x_j - \beta \cdot (1-x_j) \big] = r'(x_n) - N\varepsilon \big[ \alpha \cdot x_n - \beta \cdot (1-x_n) \big] \\
&= r'(x_n) - N\varepsilon \, [x_n - \beta],
\end{aligned} \tag{4}$$

where the last equality uses $\alpha + \beta = 1$.
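The telescoping step behind (4) can be checked numerically. The sketch below builds a trajectory from the cumulative first-order condition $r'(x_{n+1}) = -N\varepsilon \sum_{j \le n} (x_j - \beta)$ and then measures how well consecutive iterates satisfy the one-step recursion (4). It rests on several assumptions that are not part of the text: the trajectory starts at $x_0 = 1/2$ (where $r'$ vanishes by symmetry), the equation $r'(x) = t$ is solved by bisection, the parameter values are arbitrary, and all function names are hypothetical. Two regularizers are tried, the negative Shannon entropy and the log-barrier $-\log x - \log(1-x)$.

```python
import math


def rprime_shannon(x):
    """r'(x) for r(x) = x log x + (1 - x) log(1 - x)."""
    return math.log(x) - math.log(1.0 - x)


def rprime_log_barrier(x):
    """r'(x) for r(x) = -log x - log(1 - x)."""
    return -1.0 / x + 1.0 / (1.0 - x)


def solve_rprime(target, rprime, iters=200):
    """Solve r'(x) = target on (0, 1) by bisection; r' is increasing because r'' > 0."""
    lo, hi = 1e-15, 1.0 - 1e-15
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rprime(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def trajectory_from_cumulative_sums(rprime, N, eps, beta, steps):
    """Iterate r'(x_{n+1}) = -N*eps * sum_{j<=n} (x_j - beta), with x_0 = 1/2 (assumption).

    The summand x_j - beta equals alpha*x_j - beta*(1 - x_j) because alpha + beta = 1.
    """
    xs = [0.5]
    running_sum = 0.0
    for _ in range(steps):
        running_sum += xs[-1] - beta
        xs.append(solve_rprime(-N * eps * running_sum, rprime))
    return xs


def recursion_residuals(xs, rprime, N, eps, beta):
    """Residuals of (4): r'(x_{n+1}) - [r'(x_n) - N*eps*(x_n - beta)] along a trajectory."""
    return [
        rprime(xs[n + 1]) - (rprime(xs[n]) - N * eps * (xs[n] - beta))
        for n in range(len(xs) - 1)
    ]


if __name__ == "__main__":
    for name, rp in (("Shannon", rprime_shannon), ("log-barrier", rprime_log_barrier)):
        xs = trajectory_from_cumulative_sums(rp, N=20.0, eps=0.1, beta=0.3, steps=30)
        worst = max(abs(r) for r in recursion_residuals(xs, rp, 20.0, 0.1, 0.3))
        print(f"{name}: largest residual of (4) = {worst:.2e}")
```

Only the derivative $r'$ enters the computation, which is why the same driver works for both regularizers.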

We define $\Psi : (0,1) \ni x \mapsto -r'(x) \in \mathbb{R}$. Table 1 lists the functions $\Psi$ for different entropic regularizers². Before proceeding any further, we need to establish crucial properties of the function $\Psi$.

Table 1.

Homeomorphisms $\Psi$ for regularizers from $SSC$.

| regularizer | $r(x)$ | $\Psi(x)$ |
|---|---|---|
| Shannon | $x \log x + (1-x)\log(1-x)$ | $\log\frac{1-x}{x}$ |
| Havrda-Charvát-Tsallis, $q \in (0,1)$ | $\frac{1}{1-q}\left(1 - x^q - (1-x)^q\right)$ | $\frac{q}{1-q}\left(x^{q-1} - (1-x)^{q-1}\right)$ |
| Rényi, $q \in (0,1)$ | $\frac{1}{q-1}\log\left(x^q + (1-x)^q\right)$ | $\frac{q}{1-q} \cdot \frac{x^{q-1} - (1-x)^{q-1}}{x^q + (1-x)^q}$ |
| log-barrier | $-\log x - \log(1-x)$ | $\frac{1}{x} - \frac{1}{1-x}$ |

Proposition 3.1.

Let $\Psi$ be a function derived from a regularizer from $SSC$. Then

i) $\Psi(1-x) = -\Psi(x)$ for $x \in (0,1)$.
ii) $\Psi$ is a homeomorphism, $\lim_{x \to 0^+} \Psi(x) = \infty$, and $\lim_{x \to 1^-} \Psi(x) = -\infty$.

Proof. Due to the condition $R(x,y) = R(y,x)$, we have that $\frac{\partial R}{\partial x}(x, 1-x) = \frac{\partial R}{\partial y}(1-x, x)$. Thus, if $\varphi(x) = 1-x$, then:

$$\begin{aligned}
\Psi(1-x) &= -r'(\varphi(x)) = -\frac{\partial R}{\partial x}(1-x, x) + \frac{\partial R}{\partial y}(1-x, x) \\
&= -\frac{\partial R}{\partial y}(x, 1-x) + \frac{\partial R}{\partial x}(x, 1-x) = r'(x) = -\Psi(x).
\end{aligned}$$

This implies (i). Moreover, $\Psi'(x) = -r''(x) < 0$. Thus, $\Psi$ is decreasing. Furthermore, $\lim_{x \to 0^+} \Psi(x) = -\lim_{x \to 0^+} r'(x) = \infty$. From (i) we obtain that $\lim_{x \to 1^-} \Psi(x) = -\infty$. □

By Proposition 3.1.ii, $\Psi$ is a homeomorphism between $(0,1)$ and $\mathbb{R}$. After substituting

$$a = N\varepsilon, \qquad b = \beta \tag{5}$$

we obtain from (4) a general formula for the dynamics

$$x_{n+1} = \Psi^{-1}\big(\Psi(x_n) + a(x_n - b)\big), \tag{6}$$

where $a > 0$, $b \in (0,1)$. Thus, we introduce $f_{a,b} : [0,1] \to [0,1]$ as

$$f_{a,b}(x) = \begin{cases} 0, & x = 0, \\ \Psi^{-1}\big(\Psi(x) + a(x - b)\big), & x \in (0,1), \\ 1, & x = 1. \end{cases} \tag{7}$$

By the properties of $\Psi$, $f_{a,b} : [0,1] \to [0,1]$ is continuous, and (7) defines a discrete dynamical system emerging from the FoReL algorithm for the pair of parameters $(a, b)$.

Lemma 3.2.

The following properties hold:

i) $f_{a,b}(x) > x$ if and only if $x < b$, and $f_{a,b}(x) < x$ if and only if $x > b$.
ii) If $\varphi : (0,1) \to (0,1)$ is given by $\varphi(x) = 1-x$, then $\varphi \circ f_{a,b} = f_{a,1-b} \circ \varphi$.
iii) Under the dynamics defined by (7), there exists a closed, invariant, and globally attracting interval $I \subset (0,1)$.

Proof. We obtain (i) directly from (7) and the fact that $\Psi$ is decreasing.

$\Psi$ is a homeomorphism; thus, if $y = \Psi(x)$ for some $x \in (0,1)$, then $y = \Psi(\Psi^{-1}(y)) = -\Psi(1 - \Psi^{-1}(y))$. Hence, $\Psi^{-1}(-y) = 1 - \Psi^{-1}(y)$. Now let $x \in (0,1)$. Then

$$\begin{aligned}
(\varphi \circ f_{a,b})(x) &= 1 - f_{a,b}(x) = 1 - \Psi^{-1}\big(\Psi(x) + a(x-b)\big) = \Psi^{-1}\big(-\Psi(x) - a(x-b)\big) \\
&= \Psi^{-1}\Big(\Psi(1-x) + a\big((1-x) - (1-b)\big)\Big) = (f_{a,1-b} \circ \varphi)(x),
\end{aligned}$$

and (ii) follows.

By (i), $f_{a,b}(x) > x$ for $x \in (0,b)$ and $f_{a,b}(x) < x$ for $x \in (b,1)$. Therefore, there exists $0 < \delta_1 < \min\{b, 1-b\}$ such that $\left|\tfrac12 - x\right| > \left|\tfrac12 - f_{a,b}(x)\right|$ for $x \in (0,1) \setminus (\delta_1, 1-\delta_1)$. There exists also $\delta_2 > 0$ such that $f_{a,b}([\delta_1, 1-\delta_1]) \subset (\delta_2, 1-\delta_2)$. Set $\delta = \min\{\delta_1, \delta_2\}$. Then, the interval $I = [\delta, 1-\delta]$ is invariant.

To complete the proof of (iii) we need to show that $I$ is attracting. Assume that $x \in (0,1) \setminus I$ is such that its $f_{a,b}$-trajectory never enters $I$. Since $\delta \le \delta_1$, the distance between $f^n_{a,b}(x)$ and $I$ (that is, $d_I(f^n_{a,b}(x))$, where $d_I(z) = \delta - z$ for $z \in [0, \delta]$ and $d_I(z) = z - (1-\delta)$ for $z \in [1-\delta, 1]$) is decreasing, and $\delta < f(\delta) < 1-\delta$. The sequence $d_I(f^n_{a,b}(x))$ is decreasing and bounded from below by $0$, so it converges to some $\epsilon \ge 0$. Therefore, the $\omega$-set of the trajectory of $x$ is a non-empty subset of $d_I^{-1}(\{\epsilon\}) = I_\epsilon = \{\delta - \epsilon, 1 - \delta + \epsilon\}$. However, no non-empty subset of $I_\epsilon$ can be invariant (and thus, can be an $\omega$-set of a trajectory), because $\delta - \epsilon \le \delta_1$ and thus $f_{a,b}(I_\epsilon) \subset (\delta - \epsilon, 1 - \delta + \epsilon)$, and $f_{a,b}(I_\epsilon) \cap I_\epsilon = \emptyset$. By this contradiction, such $x$ does not exist; thus $I$ is globally attracting. □

² By substituting the negative Shannon entropy as $r$ in (4), that is, $r(x) = R(x, 1-x) = x \log x + (1-x)\log(1-x)$, we obtain the Multiplicative Weights Update algorithm.

4. Average behavior: Nash equilibrium is Cesàro attracting

We start by studying the asymptotic behavior of the system through the average behavior of its orbits. We will show that the orbits of our dynamics exhibit regular average behavior known as Cesàro attraction to the Nash equilibrium $b$.

Definition 4.1. For an interval map $f$, a point $p$ is Cesàro attracting if there is a neighborhood $U$ of $p$ such that for every $x \in U$ the averages $\frac{1}{n}\sum_{k=0}^{n-1} f^k(x)$ converge to $p$.

Theorem 4.2 (Cesàro attracting). For every $a > 0$, $b \in (0,1)$ and $x \in (0,1)$ we have
$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} f^k_{a,b}(x) = b. \tag{8}$$

Proof. Fix $x \in (0,1)$ and let $x_k = f^k_{a,b}(x)$. From (7) we get by induction that
$$x_n = f_{a,b}(x_{n-1}) = \Psi^{-1}\left(\Psi(x) + a\left(\sum_{k=0}^{n-1}(x_k - b)\right)\right). \tag{9}$$
By Lemma 3.2.iii there is $\delta > 0$ such that there exists a closed, globally absorbing and invariant interval $I \subset (\delta, 1-\delta)$. Thus, for sufficiently large $n$,
$$\delta < x_n = \Psi^{-1}\left(\Psi(x) + a\left(\sum_{k=0}^{n-1}(x_k - b)\right)\right) < 1 - \delta.$$
$\Psi$ is decreasing, thus
$$\Psi(\delta) > \Psi(x) + a\left(\sum_{k=0}^{n-1}(x_k - b)\right) > \Psi(1-\delta).$$
Therefore
$$\frac{1}{an}\big(\Psi(\delta) - \Psi(x)\big) > \frac{1}{n}\sum_{k=0}^{n-1} x_k - b > \frac{1}{an}\big(\Psi(1-\delta) - \Psi(x)\big),$$
so
$$\left|\frac{1}{n}\sum_{k=0}^{n-1} x_k - b\right| < \frac{1}{an}\max\big\{|\Psi(\delta) - \Psi(x)|,\ |\Psi(1-\delta) - \Psi(x)|\big\}.$$
Thus, (8) follows. □
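Theorem 4.2 can be checked numerically. The following is a minimal sketch, assuming the negative Shannon entropy regularizer, for which $\Psi(x) = \log(1-x) - \log x$ and (7) reduces to the Multiplicative Weights Update map. The function names `f_mwu` and `time_average` and the parameter values are illustrative choices, not taken from the paper.

```python
import math


def f_mwu(x: float, a: float, b: float) -> float:
    """One step of (7) for Psi(x) = log(1 - x) - log(x) (Shannon entropy case).

    Here Psi^{-1}(y) = 1 / (1 + exp(y)), so Psi^{-1}(Psi(x) + a (x - b))
    simplifies to the closed form used below.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x / (x + (1.0 - x) * math.exp(a * (x - b)))


def time_average(x0: float, a: float, b: float, n: int) -> float:
    """Average of the first n points x_0, ..., x_{n-1} of the orbit of x0."""
    x, total = x0, 0.0
    for _ in range(n):
        total += x
        x = f_mwu(x, a, b)
    return total / n


if __name__ == "__main__":
    a, b = 25.0, 0.7  # illustrative parameters (assumption of this sketch)
    for n in (10, 100, 1000, 10000):
        print(n, abs(time_average(0.2, a, b, n) - b))
```

With these illustrative parameters the gap between the running average and $b$ should shrink roughly like $1/n$, in line with the bound at the end of the proof, regardless of whether the orbit itself settles down.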

Corollary 4.3. The center of mass of any periodic orbit $\{x_0, x_1, \ldots, x_{n-1}\}$ of $f_{a,b}$ in $(0,1)$, namely $\frac{x_0 + x_1 + \ldots + x_{n-1}}{n}$, is equal to $b$.

Applying the Birkhoff Ergodic Theorem, we obtain:

Corollary 4.4. For every probability measure $\mu$, invariant for $f_{a,b}$ and such that $\mu(\{0,1\}) = 0$, we have
$$\int_{[0,1]} x \, d\mu = b.$$

In the following sections, we will show that, despite the regularity of the average trajectories, which converge to the Nash equilibrium $b$, the trajectories themselves typically exhibit complex and diverse behaviors.

5. Two definitions of chaos

In this section we introduce two notions of chaotic behavior: Li-Yorke chaos and (positive) topological entropy. Most definitions of chaos focus on complex behavior of trajectories, such as Li-Yorke chaos or fast growth of the number of distinguishable orbits of length $n$, detected by positivity of the topological entropy.

Definition 5.1 (Li-Yorke chaos). Let $(X, f)$ be a dynamical system and $(x, y) \in X \times X$. We say that $(x, y)$ is a Li-Yorke pair if
$$\liminf_{n\to\infty} \mathrm{dist}(f^n(x), f^n(y)) = 0 \quad\text{and}\quad \limsup_{n\to\infty} \mathrm{dist}(f^n(x), f^n(y)) > 0.$$
A dynamical system $(X, f)$ is Li-Yorke chaotic if there is an uncountable set $S \subset X$ (called a scrambled set) such that every pair $(x, y)$ with $x, y \in S$ and $x \ne y$ is a Li-Yorke pair.

Intuitively, the orbits of two points from the scrambled set have to come arbitrarily close together and spring apart infinitely many times, but (if $X$ is compact) this cannot happen simultaneously for each pair of points. Obviously, the existence of a large scrambled set implies that orbits of points behave in an unpredictable, complex way.

A crucial feature of the chaotic behavior of a dynamical system is also exponential growth of the number of distinguishable orbits. This happens if and only if the topological entropy of the system is positive. In fact, positivity of topological entropy turned out to be an essential criterion of chaos [17]. This choice comes from the fact that the future of a deterministic (zero entropy) dynamical system can be predicted if its past is known (see [52, Chapter 7]), and positive entropy is related to randomness and chaos. For every dynamical system over a compact phase space, we can define a number $h(f) \in [0, \infty]$ called the topological entropy of the transformation $f$. This quantity was first introduced by Adler, Konheim and McAndrew [1] as the topological counterpart of a metric (and Shannon) entropy. In general, computing topological entropy is not an easy task. However, in the context of piecewise monotone interval maps, topological entropy is equal to the exponential growth rate of the minimal number of monotone subintervals for $f^n$.

Theorem 5.2 ([39]). Let $f$ be a piecewise monotone interval map and, for all $n \ge 1$, let $m_n$ be the minimal cardinality of a monotone partition for $f^n$. Then
$$h(f) = \lim_{n\to\infty} \frac{1}{n}\log m_n = \inf_{n \ge 1} \frac{1}{n}\log m_n.$$

6. Asymptotic stability of Nash equilibria

6.1. Asymptotic stability of Nash equilibria.

The dynamics induced by (7) admits three fixed points: $0$, $1$ and $b$. By Lemma 3.2.iii we know that all orbits starting from $(0,1)$ eventually fall into a globally attracting interval $I$. Thus, the points $0$ and $1$ are repelling. When does the Nash equilibrium $b$ attract all points from $(0,1)$? First, we look at when $b$ is an attracting and when it is a repelling fixed point. With this aim, we study the derivative of $f_{a,b}$:
$$f'_{a,b}(x) = \big(\Psi^{-1}\big)'\big(\Psi(x) + a(x-b)\big) \cdot \big(\Psi'(x) + a\big).$$
Then,
$$f'_{a,b}(b) = \big(\Psi^{-1}\big)'(\Psi(b)) \cdot \big(\Psi'(b) + a\big) = \frac{\Psi'(b) + a}{\Psi'(b)}. \tag{10}$$
The fixed point $b$ is attracting if and only if $|f'_{a,b}(b)| < 1$, which is equivalent to the condition:
$$|\Psi'(b) + a| < -\Psi'(b). \tag{11}$$
Thus, the fixed point $b$ is attracting if and only if $a \in (0, -2\Psi'(b))$ and repelling otherwise. We will answer when $b$ is globally attracting on $(0,1)$. First we will show the following auxiliary lemma.

Lemma 6.1. Let a function $g : I \to \mathbb{R}$ be such that $g''' < 0$. Then
$$g'\left(\frac{x+y}{2}\right) > \frac{g(x) - g(y)}{x - y}$$
for every $x, y \in I$ with $x \ne y$.

Proof. Without loss of generality we can assume that $x < y$. Then
$$\frac{y-x}{2}\, g'\left(\frac{x+y}{2}\right) - \int_x^{\frac{x+y}{2}} g'(t)\,dt = \int_x^{\frac{x+y}{2}} \int_t^{\frac{x+y}{2}} g''(s)\,ds\,dt > \int_{\frac{x+y}{2}}^{y} \int_{\frac{x+y}{2}}^{t} g''(s)\,ds\,dt = \int_{\frac{x+y}{2}}^{y} g'(t)\,dt - \frac{y-x}{2}\, g'\left(\frac{x+y}{2}\right),$$
where the inequality follows from the fact that $g''(s)$ is smaller in the latter region while the integration is over a set of the same size. Therefore,
$$g'\left(\frac{x+y}{2}\right) > \frac{1}{y-x}\int_x^y g'(t)\,dt = \frac{g(y) - g(x)}{y - x},$$
which completes the proof of the lemma. □

The following theorem answers whether $b$ is globally attracting on $(0,1)$.

Theorem 6.2.

Let $\Psi$ be a homeomorphism derived from a regularizer from SSC. Suppose that $b$ is an attracting fixed point of $f_{a,b}$. If $\Psi''' < 0$, then trajectories of all points from $(0,1)$ converge to $b$.

Proof. In order to prove this theorem it is sufficient to show that $f_{a,b}$ does not have periodic orbits of period 2.

Suppose that $\{x_0, x_1\} \subset (0,1)$ is a periodic orbit of $f_{a,b}$ of period 2. Then, by Corollary 4.3, $\frac{x_0 + x_1}{2} = b$. We have that $x_1 = \Psi^{-1}(\Psi(x_0) + a(x_0 - b))$, and therefore, $\Psi(x_1) = \Psi(x_0) + a(x_0 - b)$. Thus, $\Psi(x_1) - \Psi(x_0) = -\frac{a}{2}(x_1 - x_0)$, or equivalently
$$a = -2 \cdot \frac{\Psi(x_1) - \Psi(x_0)}{x_1 - x_0}. \tag{12}$$
By Lemma 6.1,
$$\Psi'(b) > \frac{\Psi(x_1) - \Psi(x_0)}{x_1 - x_0} = -\frac{a}{2}, \tag{13}$$
but the point $b$ is attracting if and only if $\Psi'(b) < -\frac{a}{2}$, which contradicts the inequality (13). Therefore, $f$ has no periodic point of period 2.

Now, by [6], Chapter VI, Proposition 1, every trajectory of $f$ converges to a fixed point. □

Corollary 6.3. Let $\Psi''' < 0$. Then the Nash equilibrium $b$ attracts all points from the open interval $(0,1)$ if and only if $a \in (0, -2\Psi'(b))$.

Functions $\Psi$ derived from Shannon entropy, HCT entropy or log-barrier satisfy the inequality $\Psi''' < 0$. Nevertheless, this additional condition is needed, because for an arbitrary $\Psi$ derived from SSC, attracting orbits of any period may exist together with the attracting Nash equilibrium $b$. In the next section we will discuss thoroughly an example of such behavior. This shows that even for the well-known class of FoReL algorithms, knowledge of the local behavior (even attraction) of the Nash equilibrium may not be enough to properly describe the behavior of agents.

Figure 2.

Coexistence of the attracting Nash equilibrium and chaos. The bifurcation diagrams for $f_{a,b}$ where the dynamics is induced by the regularizer $r(x) = (1-x)\log(1-x) + x\log x - \ldots$ for $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis values of $f_{a,b}$ are shown. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken: red refers to the critical point in $(0, 0.5)$ and blue to the critical point in $(0.5, 1)$. Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one the order is reversed. The function $\Psi(x) = -r'(x) = \log(1-x) - \log x + 0.\ldots\cdot[\ldots]$ fulfills all assumptions of Theorem 6.2 excluding $\Psi''' < 0$. Although for $a < -2\Psi'(b) \approx \ldots$ the unique Nash equilibrium is attracting, we can observe chaotic behavior already for $a > \ldots$ The picture suggests that in the coexistence region we have an interval which is invariant for $f_{a,b}$, and in it we see the usual evolution of unimodal maps. This means that sometimes we see an attracting periodic orbit, sometimes a chaotic attractor.

Figure 3.

Graph of $f_{a,b}$ for $a = 3.\ldots$, $b = 0.\ldots$ generated by the regularizer $r(x) = (1-x)\log(1-x) + x\log x - \ldots$

6.2. Coexistence of attracting Nash equilibrium and chaos.

In this section we will describe an example of a regularizer from SSC which introduces game dynamics in which an attracting Nash equilibrium coexists with chaos; see the example in Figure 2. This phenomenon is observed by replacing the Shannon entropic regularizer by the log-barrier regularizer. Namely, we take
$$\Psi(x) = \log(1-x) - \log x + 0.\ldots\cdot\left(\frac{1}{\ldots - x} - \frac{1}{x + 0.\ldots}\right).$$
We will show that there exist $a > 0$, $b \in (0,1)$ such that $f_{a,b}$ has an attracting fixed point (which is the Nash equilibrium) yet the map can be chaotic!

Proposition 6.4. There exist $a > 0$, $b \in (0,1)$ such that $f_{a,b}$ has an attracting fixed point (Nash equilibrium), positive topological entropy and is Li-Yorke chaotic.

Proof. Let us take $b = 0.\ldots$ and $a = 3.\ldots$ (see the graph of $f_{a,b}$ in Figure 3). Set
$$\xi(x) := \Psi(x) + a(x-b) = \log(1-x) - \log x + 0.\ldots\cdot\left(\frac{1}{\ldots - x} - \frac{1}{x + 0.\ldots}\right) + 3.\ldots\cdot(x - 0.\ldots).$$
Since $a < \ldots \approx -2\Psi'(b)$, the fixed point $b$ is attracting.

To show that $f_{a,b}$ is chaotic, we will prove that $f_{a,b}$ has a periodic point of period 6. With this aim, we will show that $(f^2_{a,b})^3(x) < x < f^2_{a,b}(x)$ for any $x \in [0.\ldots, 0.\ldots]$. We start by showing that $\xi(x)$ is monotone on $[0.\ldots, 0.\ldots]$. The formula for the derivative of $\xi(x)$ is
$$\xi'(x) = -\frac{1}{1-x} - \frac{1}{x} + 0.\ldots\cdot\left(\frac{1}{(\ldots - x)^2} + \frac{1}{(x + 0.\ldots)^2}\right) + 3.\ldots$$
Set $z = (x - 0.5)^2$. Then $\xi'(x) = 0$ if and only if $g(z) = 0$, where $g$ is the cubic polynomial
$$g(z) = 3.\ldots\cdot z^3 - \ldots\cdot z^2 + 0.\ldots\cdot z - \ldots$$
We have
$$g'(z) = 9.\ldots\cdot z^2 - \ldots\cdot z + 0.\ldots,$$
and the discriminant of this quadratic polynomial is negative. Therefore, $g$ has only one zero (approximately …), so $\xi'$ has only two zeros, symmetric with respect to 0.5. Thus, as $\xi'(0.\ldots) \approx -\ldots$ and $\xi'(0.\ldots) \approx -\ldots$, there is no zero of $\xi'$ between these two points. Moreover, those computations give us an approximation to both zeros of $\xi'$: … and ….

Now we look at the first six images of $[0.\ldots, 0.\ldots]$.

[Table of approximate values of $\Psi$ and $\xi$ at the endpoints of the six successive image intervals; the numerical entries are not recoverable from this copy.]

We have $f_{a,b}(x) = \Psi^{-1}(\xi(x))$, so $\Psi(f_{a,b}(x)) = \xi(x)$. Write $\langle x, y \rangle$ for $[x, y]$ or $[y, x]$. If $\langle \xi(x), \xi(y) \rangle \subset \langle \Psi(z), \Psi(w) \rangle$ and $\xi$ is monotone on $\langle x, y \rangle$, then $\langle f_{a,b}(x), f_{a,b}(y) \rangle \subset \langle z, w \rangle$. Thus, the computations show that $f^2_{a,b}([0.\ldots, 0.\ldots]) \subset [0.\ldots, 0.\ldots]$ and $f^6_{a,b}([0.\ldots, 0.\ldots]) \subset [0.\ldots, 0.\ldots]$. Therefore, for any $x \in [0.\ldots, 0.\ldots]$ we have
$$(f^2_{a,b})^3(x) < x < (f^2_{a,b})(x),$$
so by the theorem from [31], $f^2_{a,b}$ has a periodic point of period 3 and $f_{a,b}$ has a periodic point of period 6. Thus, because $f_{a,b}$ has a periodic point of period that is not a power of 2, the topological entropy $h(f_{a,b})$ is positive (see [38]) and $f_{a,b}$ is Li-Yorke chaotic. □

Corollary 6.5. There exist FoReL dynamics such that, when applied to symmetric linear congestion games with only two strategies/paths, the resulting dynamics have
• a set of positive measure of initial conditions that converge to the unique and socially optimum Nash equilibrium,
• an uncountable scrambled set for which trajectories exhibit Li-Yorke chaos, and
• periodic orbits of all possible even periods.
Thus, the (long-term) social cost depends critically on the initial condition.

FoReL dynamics induced by this regularizer manifest drastically different behaviors that depend on the initial condition.
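The local stability criterion derived earlier in this section, $a < -2\Psi'(b)$, becomes explicit once a regularizer is fixed. The following is a minimal sketch, assuming the Shannon entropy case $\Psi(x) = \log(1-x) - \log x$, for which $\Psi'(x) = -1/(x(1-x))$ and $\Psi''' < 0$, so Theorem 6.2 applies. It does not reproduce the coexistence example above; all function names and parameter values are illustrative assumptions.

```python
import math


def psi_prime_shannon(x: float) -> float:
    """Psi'(x) for Psi(x) = log(1 - x) - log(x)."""
    return -1.0 / (x * (1.0 - x))


def stability_threshold(b: float) -> float:
    """Upper end -2 Psi'(b) of the range of a for which b is attracting."""
    return -2.0 * psi_prime_shannon(b)


def multiplier(a: float, b: float) -> float:
    """f'_{a,b}(b) = (Psi'(b) + a) / Psi'(b), as in equation (10)."""
    dpsi = psi_prime_shannon(b)
    return (dpsi + a) / dpsi


def f_mwu(x: float, a: float, b: float) -> float:
    """The map (7) in the Shannon entropy case, in closed form."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x / (x + (1.0 - x) * math.exp(a * (x - b)))


def iterate(x0: float, a: float, b: float, n: int) -> float:
    """Return the n-th iterate of x0 under f_mwu."""
    x = x0
    for _ in range(n):
        x = f_mwu(x, a, b)
    return x


if __name__ == "__main__":
    b = 0.7  # illustrative value (assumption of this sketch)
    print("threshold:", stability_threshold(b))
    for a in (5.0, 12.0):
        print(a, multiplier(a, b), iterate(0.2, a, b, 500))
```

Below the threshold the multiplier has modulus less than one and the orbit should settle at $b$. Above it the fixed point is repelling, although by Theorem 4.2 the time average of the orbit still converges to $b$.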

7. Behavior for sufficiently large a

Non-convergence for sufficiently large a. In this subsection, we study what happens as we fix $b$ and let $a$ be arbitrarily large (by (5), this reflects the case when we fix the cost functions and the learning rate $\varepsilon$ and increase the total demand $N$). First, we study the asymmetric case, namely $b \ne 1/2$. We show chaotic behavior of our dynamical system for $a$ sufficiently large; that is, we will show that if $a$ is sufficiently large then $f_{a,b}$ is Li-Yorke chaotic, has periodic orbits of all periods and positive topological entropy.

The crucial ingredient of our analysis is the existence of a periodic orbit of period 3.

Theorem 7.1. If $b \in (0,1) \setminus \{\tfrac12\}$, then there exists $a_b$ such that if $a > a_b$ then $f_{a,b}$ has a periodic point of period 3.

Proof. By Lemma 3.2.ii, without loss of generality, we may assume $b \in (0, \tfrac12)$. We will show that there exists $x \in (0,1)$ such that $f^3_{a,b}(x) < x < f_{a,b}(x)$.

Fix $a > 0$ and $b, x \in (0,1)$. We set $x_n = f^n_{a,b}(x)$; then formula (9) holds. Hence $f_{a,b}(x) > x$ if and only if $x < b$ and, because $\Psi^{-1}$ is decreasing, $f^3_{a,b}(x) < x$ is equivalent to $x + f_{a,b}(x) + f^2_{a,b}(x) > 3b$.

From the fact that $b \in (0, \tfrac12)$ we have that $3b - 1 < b$. So we can take $x > 0$ such that $3b - 1 < x < b$. Then $f_{a,b}(x) > x$. Moreover,
$$\lim_{a\to\infty} f_{a,b}(x) = \lim_{a\to\infty} \Psi^{-1}\big(\Psi(x) + a(x-b)\big) = 1.$$
Thus, since $3b - x < 1$, there exists $a_b > 0$ such that if $a > a_b$, then $f_{a,b}(x) > 3b - x$, so $x + f_{a,b}(x) + f^2_{a,b}(x) > 3b$. Hence, if $a > a_b$, then $f^3_{a,b}(x) < x$.

Now we conclude that $f_{a,b}$ has a periodic point of period 3 for $a > a_b$ from the theorem from [31], which implies that if $f^n(x) < x < f(x)$ for some odd $n > 1$, then $f$ has a periodic point of period $n$. □

By the Sharkovsky Theorem ([49]), existence of a periodic orbit of period 3 implies existence of periodic orbits of all periods, and by the result of [32], period 3 implies Li-Yorke chaos. Moreover, because $f_{a,b}$ has a periodic point of period that is not a power of 2, the topological entropy $h(f_{a,b})$ is positive (see [38]). Thus:

Corollary 7.2. If $b \in (0,1) \setminus \{1/2\}$, then there exists $a_b$ such that if $a > a_b$ then $f_{a,b}$ has periodic orbits of all periods, has positive topological entropy and is Li-Yorke chaotic.

This result has an implication in non-atomic routing games. Recall that the parameter $a$ expresses the normalized total demand. Thus, Corollary 7.2 implies that when the costs (cost functions) of paths are different, then increasing the total demand of the system will inevitably lead to chaotic behavior.

Now we consider the symmetric case, when $b = \tfrac12$, which corresponds to equal coefficients of the cost functions, $\alpha = \beta$. To simplify the notation we denote $f_a = f_{a,1/2}$.

Theorem 7.3. If the parameter $a$ is small enough, then all trajectories of $f_a$ starting from $(0,1)$ converge to the attracting fixed point $1/2$. There exists $a_b$ such that if $a > a_b$, then all points from $(0,1)$ (except countably many points, whose trajectories eventually fall into the repelling point $1/2$) are attracted by periodic attracting orbits of the form $\{\sigma_a, 1 - \sigma_a\}$, where $0 < \sigma_a < 1/2$. Moreover, if there exists $\delta > 0$ such that $\Psi$ is convex on $(0, \delta)$, then there exists a unique attracting orbit $\{\sigma_a, 1 - \sigma_a\}$, which attracts trajectories of all points from $(0,1)$, except countably many points, whose trajectories eventually fall into the repelling fixed point $1/2$.

Proof. By Lemma 3.2.ii the maps $f_a$ and $\varphi$ commute. Set $g_a = \varphi \circ f_a = f_a \circ \varphi$. Since $\varphi$ is an involution, we have $g_a^2 = f_a^2$. We show that the dynamics of $f_a$ is simple, no matter how large $a$ is.

We aim to find fixed points and points of period 2 of $f_a$ and $g_a$. Clearly, $f_a(0) = 0$, $f_a(1) = 1$, $g_a(0) = 1$, $g_a(1) = 0$. By (9) we have
$$f_a^2(x) = \Psi^{-1}\big(\Psi(x) + a(x + f_a(x) - 1)\big),$$
so the fixed points of $f_a^2$ are $0$, $1$ and the solutions to $x + f_a(x) - 1 = 0$, that is, to $g_a(x) = x$. Thus, the fixed points of $g_a^2$ (which, as we noticed, is equal to $f_a^2$) are the fixed points of $g_a$ and $0$ and $1$.

We can choose the invariant interval $I_a = I_{a,1/2}$ symmetric, so that $\varphi(I_a) = I_a$. Let us look at $G_a = g_a|_{I_a} : I_a \to I_a$. All fixed points of $G_a^2$ are also fixed points of $G_a$, so $G_a$ has no periodic points of period 2. By the Sharkovsky Theorem, $G_a$ has no periodic points other than fixed points. For such maps it is known (see, e.g., [6]) that the $\omega$-limit set of every trajectory is a singleton of a fixed point, that is, every trajectory converges to a fixed point. If $x \in (0,1) \setminus I_a$, then the $g_a$-trajectory of $x$ after a finite time enters $I_a$, so $g_a$-trajectories of all points of $(0,1)$ converge to a fixed point of $g_a$ in $I_a$. Observe that a fixed point of $g_a$ can be a fixed point of $f_a$ (other than 0, 1) or a periodic point of $f_a$ of period 2. Thus, the $f_a$-trajectory of every point of $(0,1)$ converges to a fixed point or a periodic orbit of period 2 of $f_a$, other than $0$ and $1$.

Observe now that $1/2$ is a fixed point of both $f_a$ and $g_a$. The fixed points of $g_a$ in $[0, 1/2]$ are the solutions of the equation $g_a(x) = x$, which is equivalent to $f_a(x) = 1 - x$, further to $\Psi(x) + a(x - 1/2) = \Psi(1-x)$ and finally, by Proposition 3.1.i, to $2\Psi(x) = -a(x - 1/2)$. Define $\gamma_a(x) = -\frac{a}{2}(x - 1/2)$. We look for $\sigma_a \in (0, 1/2)$ such that
$$\Psi(\sigma_a) = \gamma_a(\sigma_a). \tag{14}$$
We know that $\Psi(1/2) = \gamma_a(1/2) = 0$ and $\gamma_a'(1/2) = -\frac{a}{2}$. As $\Psi'(1/2) < 0$ ($\Psi$ is strictly decreasing), there is no solution of (14) in $(0, 1/2)$ for sufficiently small $a$. Then, $1/2$ is the only fixed point of $g_a$ in $(0,1)$. Thus, $1/2$ will attract all points from $(0,1)$.

If $\gamma_a'(1/2) < \Psi'(1/2)$, then there exists $x \in (0, 1/2)$ such that $\gamma_a(x) > \Psi(x)$. Because $\lim_{x\to 0^+}\Psi(x) = +\infty$ and $\lim_{x\to 0^+}\gamma_a(x) = a/4$ and both functions are continuous, there exists $\sigma_a \in (0, 1/2)$ such that $\Psi(\sigma_a) = \gamma_a(\sigma_a)$. Finally, $\gamma_a'(1/2) < \Psi'(1/2)$ if and only if $a > -2\Psi'(1/2)$.

Lastly, if $\Psi$ is convex on some neighborhood of zero, that is $(0, \delta)$, then we can choose $a$ (sufficiently large) such that all solutions of (14) in $(0, \tfrac12)$ lie in $(0, \delta)$. From the fact that $\Psi$ is convex on this interval and $\gamma_a$ is an affine function we obtain uniqueness of $\sigma_a$. □
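Equation (14) can be solved numerically once $\Psi$ is fixed. The following is a minimal sketch, assuming the Shannon entropy case $\Psi(x) = \log(1-x) - \log x$, which is convex on $(0, 1/2)$ and for which $-2\Psi'(1/2) = 8$. It finds $\sigma_a$ by bisection and checks that $\{\sigma_a, 1 - \sigma_a\}$ is a period-2 orbit of $f_a$. The function names, the bracket endpoints, and the value $a = 12$ are illustrative assumptions.

```python
import math


def psi(x: float) -> float:
    """Psi(x) = log(1 - x) - log(x) (Shannon entropy case)."""
    return math.log1p(-x) - math.log(x)


def gamma_a(x: float, a: float) -> float:
    """The affine function gamma_a(x) = -(a / 2) (x - 1/2) from the proof."""
    return -(a / 2.0) * (x - 0.5)


def f_sym(x: float, a: float) -> float:
    """The symmetric map f_a = f_{a,1/2} in the Shannon entropy case."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x / (x + (1.0 - x) * math.exp(a * (x - 0.5)))


def sigma_a(a: float) -> float:
    """Solve Psi(s) = gamma_a(s) for s in (0, 1/2) by bisection."""
    def h(x: float) -> float:
        return psi(x) - gamma_a(x, a)

    lo, hi = 1e-12, 0.5 - 1e-6  # bracket chosen for this sketch
    if not (h(lo) > 0.0 > h(hi)):
        raise ValueError("no sign change on the bracket; this sketch needs a > 8")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a = 12.0  # illustrative value above the threshold 8
    s = sigma_a(a)
    print("sigma_a:", s)
    print("f_a(sigma_a) - (1 - sigma_a):", f_sym(s, a) - (1.0 - s))
```

The bisection relies only on the sign change of $\Psi - \gamma_a$ between a point near $0$ and a point just below $1/2$, which is exactly the existence argument used in the proof.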

Theorem 7.3 has a remarkable implication in non-atomic routing games. It implies that if the cost of both paths is the same, then there is a threshold such that if the total demand crosses this threshold, then starting from almost any initial condition the system will oscillate, converging to the symmetric periodic orbit of period 2 and never converging to the Nash equilibrium.

8. Experimental results

In this section we report complex behaviors in bifurcation diagrams of FoReL dynamics. We investigate the structures of the attracting periodic orbits and chaotic attractors associated with the interval map $f_{a,b} : [0,1] \to [0,1]$ defined by (7). In the asymmetric case, that is when $b$ differs from $0.5$, the standard equilibrium analysis applies when the fixed point $b$ is stable, which is when $|f'_{a,b}(b)| \le 1$, or equivalently when $a \le -2\Psi'(b)$. Therefore, as we argued in the previous section, in this case the dynamics will converge toward the fixed point $b$ whenever $a < -2\Psi'(b)$. However, when $a \ge -2\Psi'(b)$ there is no attracting fixed point. Moreover, chaotic behavior of trajectories emerges when $a$ is sufficiently large, as the period-doubling route to chaos is guaranteed to arise.

In particular, we study the attractors of the map $f_{a,b}$ generated by the log-barrier regularizer (see Example A.2 with $\eta(x) = \log x$) and by the Havrda-Charvát-Tsallis regularizer for $q = 0.\ldots$ (see Example A.3). Note that for both of these regularizers, we have that $\Psi''' < 0$. Note also that the functions $\Psi$ for these regularizers can be found in Table 1.

We first focus on the log-barrier regularizer. Figure 4 reveals an unusual bifurcation phenomenon, which, to our knowledge, is not known in other natural interval maps. We observe simultaneous evolution of two attractors in opposite directions: one attractor, generated by the trajectory of the left critical point, is shrinking, while the other one, generated by the trajectory of the right critical point, is growing. Figure 5 shows another unusual bifurcation phenomenon: a chaotic attractor arises via period-doubling bifurcations and then collapses. After that, the trajectories of the critical points, one after the other, jump, and then they together follow a period-doubling route to chaos once more.

Finally, we study the bifurcation diagrams generated from the Havrda-Charvát-Tsallis regularizer with $q = 0.\ldots$ In Figure 8 we observe a finite number of period-doubling (and period-halving) bifurcations, a behavior that does not lead to chaos. Nevertheless, as $a$ increases from 39.915 to 39.93, the trajectory of the right critical point leaves the attractor which it shared with the trajectory of the left critical point, and builds a separate chaotic attractor. When chaos arises, however, we observe that the induced dynamics of the log-barrier regularizer and of the Havrda-Charvát-Tsallis regularizer with $q = 0.\ldots$ both exhibit period-doubling routes to chaos, though the regularizers are starkly different; see Figure 6 and Figure 7 respectively.

9. Conclusion

We study FoReL dynamics in non-atomic congestion games with arbitrarily small but fixed step-sizes, rather than with decreasing and regret-optimizing step-sizes. Our model allows for agents that can learn over time (e.g., by tracking the cumulative performance of all actions to inform their future decisions), while being driven by opportunities for short-term rewards, rather than only by long-term asymptotic guarantees. As a result, we can study the effects of increasing system demand and delays on agents' responses, which can become steeper as they are increasingly agitated by the increasing costs. Such assumptions are well justified from a behavioral game theory perspective [9, 24]; however, FoReL dynamics are then pushed outside of the standard parameter regime, into one in which classic black-box regret bounds do not apply meaningfully. Using tools from dynamical systems, we show that, under sufficiently large demand, dynamics will unavoidably become chaotic and unpredictable. Thus, our work vastly generalizes previous results that hold in the special case of Multiplicative Weights Update [10, 11, 42]. We also report a variety of undocumented complex behaviors such as the co-existence of a locally attracting Nash equilibrium and of chaos in the same game. Despite this behavioral complexity of the day-to-day behavior, the time-average system behavior is always perfectly regular, converging to an exact equilibrium. Our analysis showcases that local stability in congestion games should not be considered as a foregone conclusion and paves the way toward further investigations at the intersection of optimization theory, (behavioral) game theory, and dynamical systems.

Note on Figures 4 to 8. From the regularity of the map $f_{a,b}$ (see Appendix B), we know that every limit cycle of the dynamics generated by $f_{a,b}$ can be found by studying the behavior of the critical points of $f_{a,b}$. Therefore, all attractors of this dynamics can be revealed by following the trajectories of these two critical points (as in Figure 4).

Figure 4. Simultaneous creation and destruction of different attractors. The bifurcation diagrams for $f_{a,b}$ where the dynamics is determined by taking the (negative) log-barrier regularizer with parameter $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis values of $f_{a,b}$ between … and … are shown. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken (regularity of this map, see Appendix B, guarantees that their trajectories detect all attractors): red refers to the critical point in $(0, 0.5)$ and blue to the critical point in $(0.5, 1)$. Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one first blue and then red. We observe the collapse of the red attractor (built on the left critical point) with the simultaneous creation of the blue one (built on the right critical point).

Figure 5. Locally complex behavior. The bifurcation diagrams for $f_{a,b}$ where the dynamics is determined by taking the (negative) log-barrier as the regularizer for $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis values of $f_{a,b}$ are between … and …. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken: red refers to the critical point in $(0, 0.5)$ and blue to the critical point in $(0.5, 1)$. Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one the order is reversed. As $a$ increases, chaotic behavior of orbits disappears (around …). Then, within the window $[153.\ldots, \ldots]$, chaos emerges at $[153.\ldots, \ldots]$ and vanishes. Then trajectories jump, one after the other, and then generate a chaotic attractor which then spreads, vanishes, and finally spreads onto the whole interval.

Figure 6. Period-doubling road to chaos. The bifurcation diagrams for $f_{a,b}$ where the dynamics is determined by taking the (negative) log-barrier as the regularizer: $r(x) = -\log x - \log(1-x)$ for $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis the value of $f_{a,b}$ ranges between … and …. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken (regularity of this map, see Appendix B, guarantees that by studying their trajectories we visit all attractors): red refers to the left critical point (in $(0, 0.5)$) and blue to the right critical point (in $(0.5, 1)$). Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one the order is reversed. The first bifurcation takes place at the moment when the Nash equilibrium $b$ becomes repelling. Then we observe a period-doubling route to chaos. In addition, two different attractors are visible for $a \in (92, \ldots)$.

Figure 7. Period-doubling road to chaos with Havrda-Charvát-Tsallis regularizer. The bifurcation diagrams for $f_{a,b}$ where the dynamics is determined by taking the (negative) Havrda-Charvát-Tsallis entropy with $q = 0.\ldots$ as the regularizer and $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis the value of $f_{a,b}$ ranges between … and …. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken: red refers to the critical point in $(0, 0.5)$ and blue to the critical point in $(0.5, 1)$. Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one the order is reversed. The first bifurcation takes place at the moment when the Nash equilibrium $b$ becomes repelling. Then we observe a period-doubling route to chaos. In addition, two different attractors are visible for $a \in (22.\ldots, \ldots)$.

Figure 8. Period-doubling does not always lead to chaos. The bifurcation diagrams for $f_{a,b}$ where the dynamics is determined by taking the (negative) Havrda-Charvát-Tsallis entropy with $q = 0.\ldots$ as the regularizer, that is, $r(x) = -\sqrt{x} - \sqrt{1-x}$. We fix $b = 0.\ldots$ On the horizontal axis the parameter $a$ is between … and …, and on the vertical axis values of $f_{a,b}$ are between … and …. As starting points for bifurcation diagrams two critical points of $f_{a,b}$ are taken: red refers to the critical point in $(0, 0.5)$ and blue to the critical point in $(0.5, 1)$. Each critical point is iterated 4000 times, visualizing the last 200 iterates. On the top picture first red and then blue trajectories are drawn, and on the bottom one the order is reversed. As $a$ increases, both trajectories go through the same forward and backward period-doubling steps. Then, as $a$ increases from 39.915 to 39.93, the trajectory of the right critical point escapes the attractor which it shared with the trajectory of the left critical point, and builds a separate chaotic attractor. Then it jumps back to the red attractor.
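The bifurcation diagrams of Section 8 are described as iterating each of the two critical points of $f_{a,b}$ 4000 times and plotting the last 200 iterates. The following is a minimal sketch of that procedure, assuming the log-barrier regularizer $r(x) = -\log x - \log(1-x)$ named in the caption of Figure 6, so that $\Psi(x) = 1/x - 1/(1-x)$. The closed-form inverse, the bisection for the critical points, and all function names and parameter values are assumptions of this sketch; plotting is omitted.

```python
import math


def psi(x: float) -> float:
    """Psi = -r' for the log-barrier regularizer r(x) = -log x - log(1 - x)."""
    return 1.0 / x - 1.0 / (1.0 - x)


def psi_inv(y: float) -> float:
    """Root in (0, 1) of y x^2 - (y + 2) x + 1 = 0, in a cancellation-free form."""
    return 2.0 / (y + 2.0 + math.sqrt(y * y + 4.0))


def f_logbarrier(x: float, a: float, b: float) -> float:
    """The map (7) for the log-barrier regularizer."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return psi_inv(psi(x) + a * (x - b))


def critical_points(a: float) -> tuple[float, float]:
    """Solve Psi'(x) + a = 0, i.e. 1/x^2 + 1/(1-x)^2 = a; requires a > 8."""
    if a <= 8.0:
        raise ValueError("f has no critical points in (0, 1) for a <= 8")
    lo, hi = 1e-9, 0.5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 / mid**2 + 1.0 / (1.0 - mid) ** 2 > a:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return c, 1.0 - c  # the two solutions are symmetric about 1/2


def attractor_samples(a, b, x0, burn_in=3800, keep=200):
    """Iterate x0 for burn_in + keep steps and return the last keep iterates."""
    x = x0
    for _ in range(burn_in):
        x = f_logbarrier(x, a, b)
    samples = []
    for _ in range(keep):
        x = f_logbarrier(x, a, b)
        samples.append(x)
    return samples


if __name__ == "__main__":
    b = 0.7  # illustrative value (assumption of this sketch)
    for a in (20.0, 40.0, 80.0):
        left, right = critical_points(a)
        red = attractor_samples(a, b, left)
        blue = attractor_samples(a, b, right)
        print(a, min(red), max(red), min(blue), max(blue))
```

Collecting `(a, sample)` pairs over a fine grid of `a` values and scatter-plotting them, red for the left critical point and blue for the right, gives a diagram of the kind the captions describe. For $a$ below $-2\Psi'(b)$ both colors should collapse onto the single value $b$.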

Acknowledgements

Georgios Piliouras acknowledges AcRF Tier 2 grant 2016-T2-1-170, grant PIE-SGP-AI-2018-01, NRF2019-NRF-ANR095 ALIAS grant and NRF 2018 Fellowship NRF-NRFF2018-07. Fryderyk Falniowski acknowledges the support of the National Science Centre, Poland, grant 2016/21/D/HS4/01798 and COST Action CA16228 “European Network for Game Theory”. Research of Michał Misiurewicz was partially supported by grant number 426602 from the Simons Foundation. Jakub Bielawski and Grzegorz Kosiorowski acknowledge support from a subsidy granted to Cracow University of Economics. Thiparat Chotibut acknowledges a fruitful discussion with Tanapat Deesuwan, and was partially supported by grants for development of new faculty staff, Ratchadaphiseksomphot endowment fund, and Sci-Super VI fund, Chulalongkorn University.

References

[1] R. L. Adler, A. G. Konheim, and M. H. McAndrew. Topological entropy. Transactions of American Mathematical Society, 114:309–319, 1965.
[2] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121–164, 2012.
[3] J. P. Bailey, G. Gidel, and G. Piliouras. Finite regret and cycles with fixed step-size via alternating gradient descent-ascent. CoRR, abs/1907.04392, 2019.
[4] J. P. Bailey and G. Piliouras. Fast and furious learning in zero-sum games: Vanishing regret with non-vanishing step sizes. In Advances in Neural Information Processing Systems, volume 32, pages 12977–12987, 2019.
[5] P. Berenbrink, M. Hoefer, and T. Sauerwald. Distributed selfish load balancing on networks. In ACM Transactions on Algorithms (TALG), 2014.
[6] L. Block and W. A. Coppel. Dynamics in one dimension, volume 513 of Lecture Notes in Mathematics. Springer, Berlin New York, 2006.
[7] A. Blum, E. Even-Dar, and K. Ligett. Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games. In Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing, pages 45–52. ACM, 2006.
[8] I. M. Bomze, P. Mertikopoulos, W. Schachinger, and M. Staudigl. Hessian barrier algorithms for linearly constrained optimization problems. SIAM Journal on Optimization, 29(3):2100–2127, 2019.
[9] C. F. Camerer. Behavioral game theory: Experiments in strategic interaction. Princeton University Press, 2011.
[10] T. Chotibut, F. Falniowski, M. Misiurewicz, and G. Piliouras. Family of chaotic maps from game theory. Dynamical Systems, 2020. https://doi.org/10.1080/14689367.2020.1795624.
[11] T. Chotibut, F. Falniowski, M. Misiurewicz, and G. Piliouras. The route to chaos in routing games: When is price of anarchy too optimistic? Advances in Neural Information Processing Systems, 33, 2020.
[12] P. Coucheney, B. Gaujal, and P. Mertikopoulos. Penalty-regulated dynamics and robust learning procedures in games. Mathematics of Operations Research, 40(3):611–633, 2015.
[13] I. Csiszár. Axiomatic characterizations of information measures. Entropy, 10(3):261–273, 2008.
[14] E. Even-Dar and Y. Mansour. Fast convergence of selfish rerouting. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’05, pages 772–781, Philadelphia, PA, USA, 2005. Society for Industrial and Applied Mathematics.
[15] S. Fischer, H. Räcke, and B. Vöcking. Fast convergence to Wardrop equilibria by adaptive sampling methods. In Proceedings of the Thirty-eighth Annual ACM Symposium on Theory of Computing, STOC ’06, pages 653–662, New York, NY, USA, 2006. ACM.
[16] D. Fotakis, A. C. Kaporis, and P. G. Spirakis. Atomic congestion games: Fast, myopic and concurrent. In B. Monien and U.-P. Schroeder, editors, Algorithmic Game Theory, volume 4997 of Lecture Notes in Computer Science, pages 121–132. Springer Berlin Heidelberg, 2008.
[17] E. Glasner and B. Weiss. Sensitive dependence on initial conditions. Nonlinearity, 6(6):1067–1085, 1993.
[18] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.
[19] P. D. Grünwald and A. P. Dawid. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory. The Annals of Statistics, 32(4):1367–1433, 2004.
[20] M. Harper. Escort evolutionary game theory. Physica D: Nonlinear Phenomena, 240(18):1411–1415, 2011.
[21] J. Havrda and F. Charvát. Quantification method of classification processes. Concept of structural a-entropy. Kybernetika, 3(1):30–35, 1967.
[22] E. Hazan et al. Introduction to online convex optimization. Foundations and Trends® in Optimization, 2(3-4):157–325, 2016.
[23] T.-H. Ho and C. Camerer. Experience-weighted attraction learning in coordination games: Probability rules, heterogeneity, and time-variation. Journal of Mathematical Psychology, 42:305–326, 1998.
[24] T.-H. Ho and C. Camerer. Experience-weighted attraction learning in normal form games. Econometrica, 67:827–874, 1999.
[25] T.-H. Ho, C. F. Camerer, and J.-K. Chong. Self-tuning experience weighted attraction learning in games. Journal of Economic Theory, 133:177–198, 2007.
[26] G. P. Karev and E. V. Koonin. Parabolic replicator dynamics and the principle of minimum Tsallis information gain. Biology Direct, 8:19–19, 2013.
[27] A. Kianercy and A. Galstyan. Dynamics of Boltzmann Q learning in two-player two-action games. Phys. Rev. E, 85:041145, Apr 2012.
[28] R. Kleinberg, G. Piliouras, and É. Tardos. Multiplicative updates outperform generic no-regret learning in congestion games. In ACM Symposium on Theory of Computing (STOC), 2009.
[29] R. Kleinberg, G. Piliouras, and É. Tardos. Load balancing without regret in the bulletin board model. Distributed Computing, 24(1):21–29, 2011.
[30] E. Koutsoupias and C. Papadimitriou. Worst-case equilibria. In C. Meinel and S. Tison, editors, STACS 99, pages 404–413, Berlin, Heidelberg, 1999. Springer Berlin Heidelberg.
[31] T. Y. Li, M. Misiurewicz, G. Pianigiani, and J. A. Yorke. Odd chaos. Physics Letters A, 87(6):271–273, 1982.
[32] T. Y. Li and J. A. Yorke. Period three implies chaos. The American Mathematical Monthly, 82:985–992, 1975.
[33] T. Liang and J. Stokes. Interaction matters: A note on non-asymptotic local convergence of generative adversarial networks. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 907–915. PMLR, 2019.
[34] P. Mertikopoulos and W. H. Sandholm. Learning in games via reinforcement and regularization. Mathematics of Operations Research, 41(4):1297–1324, 2016.
[35] P. Mertikopoulos and W. H. Sandholm. Riemannian game dynamics. Journal of Economic Theory, 177:315–364, 2018.
[36] P. Mertikopoulos and Z. Zhou. Learning in games with continuous action sets and unknown payoff functions. Mathematical Programming, 173(1-2):465–507, 2019.
[37] L. Mescheder, A. Geiger, and S. Nowozin. Which training methods for GANs do actually converge? arXiv preprint arXiv:1801.04406, 2018.
[38] M. Misiurewicz. Horseshoes for mapping of the interval. Bull. Acad. Polon. Sci. Sér. Sci., 27:167–169, 1979.
[39] M. Misiurewicz and W. Szlenk. Entropy of piecewise monotone mappings. Studia Mathematica, 67(1):45–63, 1980.
[40] D. Monderer and L. S. Shapley. Fictitious play property for games with identical interests. Journal of Economic Theory, 68(1):258–265, 1996.
[41] V. Nagarajan and J. Z. Kolter. Gradient descent GAN optimization is locally stable. In Advances in neural information processing systems, pages 5585–5595, 2017.
[42] G. Palaiopanos, I. Panageas, and G. Piliouras. Multiplicative weights update with constant step-size in congestion games: Convergence, limit cycles and chaos. In Advances in Neural Information Processing Systems, pages 5872–5882, 2017.
[43] I. Panageas, G. Piliouras, and X. Wang. Multiplicative weights updates as a distributed constrained optimization algorithm: Convergence to second-order stationary points almost always. In International Conference on Machine Learning, pages 4961–4969. PMLR, 2019.
[44] A. Rényi. On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, 1961.
[45] R. Rosenthal. A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory, 2(1):65–67, 1973.
[46] Y. Sato, E. Akiyama, and J. P. Crutchfield. Stability and diversity in collective adaptation. Physica D: Nonlinear Phenomena, 210(1):21–57, 2005.
[47] Y. Sato and J. P. Crutchfield. Coupled replicator equations for the dynamics of learning in multiagent systems. Phys. Rev. E, 67:015206, Jan 2003.
[48] S. Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2012.
[49] A. N. Sharkovsky. Coexistence of the cycles of a continuous mapping of the line into itself. Ukrain. Math. Zh., 16:61–71, 1964.
[50] W. Słomczyński, J. Kwapień, and K. Życzkowski. Entropy computing via integration over fractal measures. Chaos: An Interdisciplinary Journal of Nonlinear Science, 10(1):180–188, 2000.
[51] C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics, 52(1-2):479–487, 1988.
[52] B. Weiss. Single orbit dynamics, volume 95 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, Providence, RI, 2000.
[53] Y. Yaz, C.-S. Foo, S. Winkler, K.-H. Yap, G. Piliouras, V. Chandrasekhar, et al. The unusual effectiveness of averaging in GAN training. In International Conference on Learning Representations, 2018.
[54] Q. Zhao. Multi-Armed Bandits: Theory and Applications to Online Learning in Networks. Morgan and Claypool Publishers, 2019.


Appendices

Appendix A. Generalized entropies as regularizers

We present information measures which are often used as regularizers.

Example A.1 (Shannon entropy). Let $R(x,y) = -H_S(x,y)$, where $H_S(x,y) = -x\log x - y\log y$. Then $H_S(x,1-x)$ is the Shannon entropy of the probability distribution $(x,1-x)$ and
\[ r(x) = R(x,1-x) = -H_S(x,1-x) = x\log x + (1-x)\log(1-x). \]
From $r'(x) = \log\frac{x}{1-x}$ we observe that $r \in SSC$. (By substituting the negative Shannon entropy as $R$ into (4) we obtain the Multiplicative Weights Update algorithm.)

Example A.2 (Arimoto entropies). We consider the class of Arimoto entropies [13], that is, functions defined as
\[ H_\eta(x,y) = \eta(x) + \eta(y), \]
where $\eta \in C^2((0,1])$ is a concave function. (In decision theory, Arimoto entropies correspond to separable Bregman scores [19].) We define $R(x,y) = -H_\eta(x,y)$. Then, by its definition, $R \in C^2((0,1)^2)$ and $R(y,x) = R(x,y)$. Moreover,
\[ r(x) = R(x,1-x) = -H_\eta(x,1-x) = -\eta(x) - \eta(1-x), \]
\[ r'(x) = -\eta'(x) + \eta'(1-x) \quad\text{and}\quad r''(x) = -\eta''(x) - \eta''(1-x). \]
Thus, $r$ is convex and the limit $\lim_{x\to 1^-}\eta'(x)$ is finite. Therefore, the condition for $R = -H_\eta \in SSC$ is steepness of $\eta$ at zero:
\[ \lim_{x\to 0^+}\eta'(x) = \infty. \tag{15} \]
Hence, $R = -H_\eta \in SSC$ if and only if $\eta$ satisfies (15).

Several well-known regularizers are given by (negative) Arimoto entropies satisfying (15). For instance, the Shannon entropy from Example A.1 is an Arimoto entropy for $\eta(x) = -x\log x$, and the log-barrier regularizer is obtained from $\eta(x) = \log x$. Another widely used (especially in statistical physics) example of an Arimoto entropy is the Havrda-Charvát-Tsallis entropy. This entropy (also called the entropy of degree $q$) was first introduced by Havrda and Charvát [21] and used to bound the probability of error for testing multiple hypotheses. In statistical physics it is known as the Tsallis entropy, referring to [51].

Example A.3 (Havrda-Charvát-Tsallis entropies). The Havrda-Charvát-Tsallis entropy for $q \in (0,\infty)$ is defined as
\[ H_q(x,y) = \begin{cases} \frac{1}{1-q}\left(x^q + y^q - 1\right) & \text{for } q \neq 1, \\ H_S(x,y) & \text{for } q = 1. \end{cases} \tag{16} \]
$H_q$ is an Arimoto entropy for $\eta(x) = \frac{1}{1-q}\left(x^q - \frac{1}{2}\right)$, satisfying (15) for $0 < q < 1$. If $R(x,y) = -H_q(x,y)$ then
\[ r(x) = R(x,1-x) = \frac{1}{q-1}\left(x^q + (1-x)^q - 1\right) \quad\text{and}\quad r'(x) = \frac{q}{q-1}\left(x^{q-1} - (1-x)^{q-1}\right), \]
and $r \in SSC$ for $q \in (0,1)$.

For $q > 1$ the Havrda-Charvát-Tsallis entropy does not satisfy (15) and, consequently, the regularizer $R$ emerging from the Havrda-Charvát-Tsallis entropy does not belong to SSC. A standard non-example is the Euclidean norm, which we get from (16) when $q = 2$. Then
\[ r(x) = R(x,1-x) = -H_2(x,1-x) = x^2 + (1-x)^2 - 1, \]
and since $\lim_{x\to 0^+} r'(x) = -2$, $R$ does not belong to SSC.

Evidently, there exist functions which are not Arimoto entropies but also generate regularizers that belong to SSC, one of them being the Rényi entropy of order $q < 1$.

Example A.4 (Rényi entropies). The Shannon entropy represents an expected mean of individual informations of the form $I_k = -\log p_k$. Rényi [44] introduced alternative information measures, namely generalized means $g^{-1}\left(\sum_k p_k\, g(I_k)\right)$, where $g$ is a continuous, strictly monotone function. Then, the Rényi entropy of order $q \neq 1$ corresponds to $g(x) = \exp((1-q)x)$, namely:
\[ H^R_q(x,y) = \begin{cases} \frac{1}{1-q}\log\left(x^q + y^q\right) & \text{for } q \neq 1, \\ H_S(x,y) & \text{for } q = 1. \end{cases} \]
As the variables $x$ and $y$ are not separable, this is not an Arimoto entropy. However, for $R(x,y) = -H^R_q(x,y)$, $R \in C^2((0,1)^2)$ and $R(y,x) = R(x,y)$. Moreover,
\[ r(x) = R(x,1-x) = -H^R_q(x,1-x) = \frac{1}{q-1}\log\left(x^q + (1-x)^q\right) \]
and
\[ r'(x) = \frac{q}{q-1}\cdot\frac{x^{q-1} - (1-x)^{q-1}}{x^q + (1-x)^q}. \]
Thus, for $q \in (0,1)$ we know that $r''(x) > 0$ on $(0,1)$ and $\lim_{x\to 0^+} r'(x) = -\infty$. Because $H^R_1 = H_S$, we infer that $R \in SSC$ for $q \in (0,1]$.
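The steepness conditions in Examples A.1, A.3 and A.4 can be illustrated numerically. The following Python sketch evaluates the closed-form derivatives $r'(x)$ given above at points approaching zero. The function names and the sample points are illustrative assumptions, not part of the paper, and the sketch only probes the limit at $0^+$; it does not check the remaining requirements of SSC.

```python
import math

# Closed-form derivatives r'(x) taken from Examples A.1, A.3 and A.4.
# Function names and sample points are hypothetical, chosen for illustration.

def r_prime_shannon(x):
    """r'(x) = log(x / (1 - x)) for the negative Shannon entropy."""
    return math.log(x / (1.0 - x))

def r_prime_tsallis(x, q):
    """r'(x) = q/(q-1) * (x^(q-1) - (1-x)^(q-1)) for q != 1."""
    return q / (q - 1.0) * (x ** (q - 1.0) - (1.0 - x) ** (q - 1.0))

def r_prime_renyi(x, q):
    """r'(x) = q/(q-1) * (x^(q-1) - (1-x)^(q-1)) / (x^q + (1-x)^q) for q != 1."""
    numerator = x ** (q - 1.0) - (1.0 - x) ** (q - 1.0)
    denominator = x ** q + (1.0 - x) ** q
    return q / (q - 1.0) * numerator / denominator

def steepness_profile(r_prime, points=(1e-2, 1e-4, 1e-8)):
    """Evaluate r' at points approaching 0 from the right."""
    return [r_prime(x) for x in points]

if __name__ == "__main__":
    print("Shannon       :", steepness_profile(r_prime_shannon))
    print("Tsallis q=0.5 :", steepness_profile(lambda x: r_prime_tsallis(x, 0.5)))
    print("Tsallis q=2   :", steepness_profile(lambda x: r_prime_tsallis(x, 2.0)))
    print("Renyi   q=0.5 :", steepness_profile(lambda x: r_prime_renyi(x, 0.5)))
```

For $q = 0.5$ the values should keep decreasing without bound as $x$ shrinks, while for $q = 2$ they should level off near $-2$, in line with the non-example discussed in Example A.3.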

Appendix B. Regularity of log-barrier dynamics

To better understand the phenomenon discussed in Section 8, let us investigate the regularity of $f_{a,b}$. Nice properties of interval maps are guaranteed by a negative Schwarzian derivative. Let us recall that the Schwarzian derivative of $f$ is given by the formula
\[ Sf = \frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^2. \]
A “metatheorem” states that almost all natural noninvertible interval maps have negative Schwarzian derivative. Note that, by Lemma 3.2.ii, if $a \le -\Psi'(b)$ then $f_{a,b}$ is a homeomorphism, so we should not expect negative Schwarzian derivative in that case. For maps with negative Schwarzian derivative, each attracting or neutral periodic orbit has a critical point in its immediate basin of attraction. Thus, if we show that the Schwarzian derivative is negative, then we will know that all attracting or neutral periodic orbits can be found by studying the behavior of the critical points of $f_{a,b}$. Therefore, we want to show that $Sf_{a,b} < 0$ for sufficiently large $a$ for $f_{a,b}$ determined by the log-barrier regularizer.

In general, computation of the Schwarzian derivative may be very complicated. However, there is a useful formula
\[ S(h\circ f) = (f')^2\,\big((Sh)\circ f\big) + Sf. \tag{17} \]
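The composition formula (17), $S(h\circ f) = (f')^2((Sh)\circ f) + Sf$, can be sanity-checked on a simple pair of maps. The Python sketch below uses $f(x) = x^2$ and $h(y) = e^y$, which are illustrative choices and not maps from the paper, and compares the Schwarzian derivative of $h\circ f$ computed directly from its derivatives with the right-hand side of (17). All function names are hypothetical.

```python
import math

def schwarzian(d1, d2, d3):
    """Schwarzian derivative from the values of f', f'' and f''' at a point:
    Sf = f'''/f' - (3/2) * (f''/f')^2."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def S_f(x):
    """Schwarzian of the illustrative map f(x) = x^2 (f' = 2x, f'' = 2, f''' = 0)."""
    return schwarzian(2.0 * x, 2.0, 0.0)

def S_h(y):
    """Schwarzian of the illustrative map h(y) = exp(y); all derivatives equal exp(y)."""
    e = math.exp(y)
    return schwarzian(e, e, e)

def S_composite(x):
    """Schwarzian of (h o f)(x) = exp(x^2), from its explicit derivatives."""
    e = math.exp(x * x)
    d1 = 2.0 * x * e
    d2 = (2.0 + 4.0 * x * x) * e
    d3 = (12.0 * x + 8.0 * x ** 3) * e
    return schwarzian(d1, d2, d3)

def chain_rule_rhs(x):
    """Right-hand side of (17): (f'(x))^2 * (Sh)(f(x)) + Sf(x)."""
    return (2.0 * x) ** 2 * S_h(x * x) + S_f(x)

if __name__ == "__main__":
    for x in (0.3, 0.7, 1.5):
        print(x, S_composite(x), chain_rule_rhs(x))
```

Both columns should agree up to floating-point rounding. The same pattern applies to any pair of maps whose first three derivatives are available in closed form.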

The function $f_{a,b}$ is given by (7). Consider $g(x) := (\Psi\circ f_{a,b})(x) = \Psi(x) + a(x-b)$. By (17) we have that
\[ Sg = (f'_{a,b})^2\,\big((S\Psi)\circ f_{a,b}\big) + Sf_{a,b}. \]
At the same time, $Sg(x) = S(\Psi(x) + a(x-b))$. Therefore,
\[ (f'_{a,b}(x))^2\,\big((S\Psi)\circ f_{a,b}(x)\big) + Sf_{a,b}(x) = S(\Psi(x) + a(x-b)). \tag{18} \]
Direct computations yield
\[ S\Psi(x) = \frac{6}{\left(x^2 + (1-x)^2\right)^2} > 0 \]
and
\[ S(\Psi(x) + a(x-b)) = \frac{6\left[1 - a\left(x^4 + (1-x)^4\right)\right]}{\left[x^2 + (1-x)^2 - a x^2 (1-x)^2\right]^2}. \]
Observe that $x^4 + (1-x)^4 \ge \frac{1}{8}$ for all $x \in [0,1]$. Thus $1 - a\left(x^4 + (1-x)^4\right) \le 1 - \frac{a}{8}$, and
\[ S(\Psi(x) + a(x-b)) < 0 \quad\text{for } a > 8. \tag{19} \]
Since $S\Psi > 0$, (18) gives $Sf_{a,b} \le S(\Psi(x) + a(x-b))$. Therefore, $Sf_{a,b} < 0$ for $a > 8$. Moreover, $\max_{b\in[0,1]}\Psi'(b) = \Psi'(1/2) = -8$. Thus, $Sf_{a,b}(x) < 0$ for all $a > -\Psi'(b) \ge 8$.

(J. Bielawski) Department of Mathematics, Cracow University of Economics, Rakowicka 27, 31-510 Kraków, Poland

(T. Chotibut) Chula Intelligent and Complex Systems, Department of Physics, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand

(F. Falniowski) Department of Mathematics, Cracow University of Economics, Rakowicka 27, 31-510 Kraków, Poland

(G. Kosiorowski) Department of Mathematics, Cracow University of Economics, Rakowicka 27, 31-510 Kraków, Poland

(M. Misiurewicz) Department of Mathematical Sciences, Indiana University-Purdue University Indianapolis, 402 N. Blackford Street, Indianapolis, IN 46202, USA

(G. Piliouras) Engineering Systems and Design, Singapore University of Technology and Design, 8 Somapah Road, Singapore 487372