arXiv [econ.TH]

Mathematical Game Theory

Ulrich Faigle
Universität zu Köln
Mathematisches Institut
Weyertal 80
[email protected]

Contents
Preface

Part 1. Introduction

Part 2. 2-Person-Games

Lagrange games
Chapter 4. Investing and Betting
1. Proportional investing
2. The fortune formula
3. Fair odds
4. Betting on alternatives
5. Betting and information
6. Common knowledge

Part 3. n-Person Games

1. n-person games
2. Equilibria
3. Randomization of matrix games
4. Traffic flows
Chapter 7. Cooperative Games
1. Cooperative TU-games
2. The core
3. Values
4. Boltzmann values
5. Coalition formation
6. Equilibria in cooperative games
Chapter 8. Interaction Systems and Quantum Models
1. Algebraic preliminaries
2. Complex matrices
3. Interaction systems
4. Quantum systems
5. Quantum games
6. Final Remarks
Appendix
1. Notions and facts from real analysis
2. Convexity
3. Brouwer's fixed-point theorem
4. Linear inequalities
5. The Monge algorithm
6. Entropy and Boltzmann distributions

Preface
People have gambled and played games for thousands of years. Yet only in the 17th century do we see a serious attempt at a scientific approach to the subject. The combinatorial foundations of probability theory were developed by various mathematicians as a means to understand games of chance (mostly with dice) and to make conjectures.¹

Since then, game theory has grown into a wide field and appears at times quite removed from its mathematical roots. The notion of a game has been broadened to encompass all kinds of human behavior and the interactions of individuals or of groups and societies.² Much of current research studies humans in economic and social contexts and seeks to discover behavioral laws in analogy to physical laws.

The role of mathematics, however, has been quite limited so far in this endeavor. One major reason lies certainly in the fact that players in real life often behave differently than a simple mathematical model would predict. So seemingly paradoxical situations exist where people appear to contradict the straightforward analysis of the mathematical model builder. A famous such example is Selten's chain store paradox.³

As interesting and worthwhile as research into laws that govern the psychological, social or economic behavior of humans may be, mathematical game theory is not about these aspects of game theory. In the center of our attention are mathematical models that may be useful for the analysis of game-theoretic situations. We are concerned with the mathematics of game-theoretic models but leave aside the question whether a particular model describes a particular situation in real life appropriately.

The mathematical analysis of a game-theoretic model treats objects neutrally. Elements and sets have no feelings per se and show no psychological behavior. They are neither generous nor cost conscious unless such features are built into the model as clearly formulated mathematical properties. The advantage of mathematical neutrality is substantial, however, because it allows us to imbed the mathematical analysis into a much wider framework.

The present introduction into mathematical game theory sees games being played on (possibly quite general) systems. Moves of the game then correspond to transitions of the system from one state to another. This approach reveals a close connection with fundamental physical systems via the same underlying mathematics. Indeed, it is hoped that mathematical game theory may eventually play a role for real-world games akin to the role of theoretical physics for real-world physical systems.

The reader of this introductory text is expected to have some knowledge in mathematics, perhaps at the level of an introductory course in linear algebra and real analysis. Nevertheless, the text will try to review relevant mathematical notions and properties and point to the literature for further details.

The reader is expected to read the text "actively". "Ex." marks not only an "example" but also an "exercise" that might deepen the understanding of the mathematical development.

The book is based on a one-term course on the subject the author has given repeatedly at the University of Cologne to pre-master level students with an interest in applied mathematics, operations research and mathematical modelling.

¹ see, e.g., the Ars Conjectandi by J. Bernoulli (1654-1705)
² see, e.g., E. Berne, Games People Play: The Psychology of Human Relationships, Grove Press, 1964
³ R. Selten (1978): The chain store paradox, Theory and Decision 9, 127-159.

Part 1
Introduction
Chapter 1
Mathematical Models of the Real World
This introductory chapter discusses mathematical models, sketches the mathematical tools for their analysis, defines systems in general and systems of decisions in particular. Then games are introduced from a general point of view and it is indicated how they may arise in combinatorial, economic, social, physical and other contexts.
1. Mathematical modelling
Mathematics is the powerful human instrument to analyze and to structure observations and to possibly discover natural "laws". These laws are logical principles that allow us not only to understand observed phenomena (i.e., the so-called real world) but also to compute possible evolutions of current situations and thus to try a "look into the future".

Why is that so? An answer to this question is difficult if not impossible. There is a wide-spread belief that mathematics is the language of the universe.¹ So everything can supposedly be captured by mathematics and all mathematical deductions reveal facts about the real world. I do not know whether this is true. But even if it were, one would have to be careful with real-world interpretations of mathematics nonetheless. A simple example may illustrate the difficulty:

While apples on a tree are counted by natural numbers n, it is not true that, for every natural number n, there exists a tree with n apples. In other words, when we use the set of nonnegative integers to describe the number of apples, our mathematical model will comprise mathematical objects that have no real counterparts.

Theoretically, one could try to get out of the apple dilemma by restricting the mathematical model to those numbers n that are realized by apple trees. But such a restricted model would be of no practical use as the set of such apple numbers n is not explicitly known.

¹ Galileo Galilei (1564-1642)
In general, a mathematical model of a real-world situation is, alas, not necessarily guaranteed to be absolutely comprehensive. Mathematical conclusions are possibly only theoretical and may suggest objects and situations which do not exist in reality. One always has to double-check real-world interpretations of mathematical deductions and ask whether an interpretation is "reasonable" in the sense that it is commensurate with one's own personal experience.

In the analysis of a game-theoretic situation, for example, one may want to take the psychology of individual players into account. A mathematical model of psychological behavior, however, is typically based on assumptions whose accuracy is often not clear. Consequently, mathematically established results within such a model must be interpreted with care, of course.

Moreover, similar to physical systems with a large number of particles (like molecules etc.), game-theoretic systems with many agents (e.g., traffic systems and economies) are too complex to analyze by following each of the many agents individually. Hence a practical approach will have to concentrate on "group behavior" and consider statistical parameters that average over individual numerical attributes.

Having cautioned the reader about the real-world interpretation of mathematical deductions, we will concentrate on mathematical models (and their mathematics) and leave the interpretation to the reader. Our emphasis is on game-theoretic models. So we should explain what we understand by this.

A game involves players that perform actions which make a given system go through a sequence of states. When the game ends, the system is in a state according to which the players receive rewards (or are charged with costs or whatever). Many game theorists think of a "player" as a humanoid, i.e., a creature with human feelings, wishes and desires, and thus give it a human name.¹

Elements of a mathematical model, however, do not have humanoid feelings per se. If they are to represent objects with wishes and desires, these wishes and desires must be explicitly formulated as mathematical optimization challenges with specified objective functions and restrictions. Therefore, we will try to be neutral and refer to "players" often just as agents with no specified sexual attributes. In particular, an agent will typically be an "it" rather than a "he" or "she".

This terminological neutrality makes it clear that mathematical game theory comprises many more models than just those with human players. As we will see, many models of games, decisions, economics and social sciences have the same underlying mathematics as models of physics and informatics.

¹ Alice and Bob are quite popular choices.
A note on continuous and differentiable functions.
Real-world phenomena are often modelled with continuous or even differentiable functions. However,

• There exists no practically feasible test for the continuity or differentiability of a function!

Continuity and differentiability, therefore, are assumptions of the model builder. These assumptions appear often very reasonable and produce good results in applications. Moreover, they facilitate the mathematical analysis. The reader should nevertheless be aware of this difference between a mathematical model and its physical origin.
2. Mathematical preliminaries
The reader is assumed to have some basic mathematical knowledge (at least at the level of an introductory course on linear algebra). Nevertheless, it is useful to review some of the mathematical terminology. Further basic facts are outlined in the Appendix.
A function f : S → W assigns elements f_s = f(s) of a set W to the elements s of a set S. One way of looking at f is to imagine f as a measuring device which produces the result f(s) upon the input s:

s ∈ S −→ f −→ f(s) ∈ W.

We denote the collection of all such functions as

W^S = {f : S → W}.

There is a dual way to look at this situation where the roles of the function f and the variable s are reversed. The dual viewpoint sees s as a probe which produces the value f(s) when applied to f:

f ∈ W^S −→ s −→ f(s).

If S is small, f can be presented in table form, which displays the total effect of f on S:

f ←→  s_1     s_2     s_3     ...  s_n
      f(s_1)  f(s_2)  f(s_3)  ...  f(s_n)
The dual viewpoint would fix an element s ∈ S and evaluate the effect of the measuring devices f_1, ..., f_k, for example, and thus represent an individual element s ∈ S by a k-dimensional data table:

s ←→  f_1     f_2     ...  f_k
      f_1(s)  f_2(s)  ...  f_k(s)

The dual viewpoint is typically present when one tries to describe the state s of an economic, social or physical system via the data values f_1(s), ..., f_k(s) of statistical measurements f_1, ..., f_k with respect to k system characteristics.

The two viewpoints are logically equivalent. Indeed, the dual point of view sees the element s ∈ S just like a function ŝ : W^S → W with values ŝ(f) = f(s). Also the first point of view is relevant for data representation. Consider, for example, an n-element set N = {i_1, ..., i_n}. In this context, a function f : N → {0, 1} may specify a subset S_f of N via the identification

(1)  f ∈ {0, 1}^N ←→ S_f = {i ∈ N | f(i) = 1} ⊆ N.

Remark. The vector f in (1) is the incidence vector of the subset S_f. Denoting by 2^N the collection of all subsets of N and writing 2 = {0, 1}, the identification (1) establishes the correspondence

2^N = {0, 1}^N ←→ 2^N = {S ⊆ N}.

Nota bene. (0,1)-valued functions may also have other interpretations, of course. Information theory, for example, thinks of them as bit vectors and thus as carriers of information. An abstract function is a purely mathematical object with no physical meaning by itself. It obtains a concrete meaning only within the context to which it refers.

Notation.
When we think of a function f : S → W as a data representative, we think of f as a coordinate vector f ∈ W^S with coordinate components f_s = f(s) and also use the notation f = (f_s | s ∈ S). In the case S = {s_1, s_2, s_3, ...}, we may write

f = (f(s_1), f(s_2), ...) = (f_{s_1}, f_{s_2}, ...) = (f_s | s ∈ S).

In the case of a direct product S = X × Y = {(x, y) | x ∈ X, y ∈ Y} of sets X and Y, a function A : X × Y → W can be imagined as a matrix with rows labeled by the elements x ∈ X and columns labeled by the elements y ∈ Y:

A =  A_{x_1 y_1}  A_{x_1 y_2}  A_{x_1 y_3}  ...
     A_{x_2 y_1}  A_{x_2 y_2}  A_{x_2 y_3}  ...
     A_{x_3 y_1}  A_{x_3 y_2}  A_{x_3 y_3}  ...
     ...          ...          ...          ...

The function values A_{xy} are the coefficients of A. We express this point of view in the shorthand notation A = [A_{xy}] ∈ W^{X×Y}.

The matrix form suggests to relate similar structures to A. The transpose of the matrix A ∈ W^{X×Y}, for example, is the matrix A^T = [A^T_{yx}] ∈ W^{Y×X} with the coefficients A^T_{yx} = A_{xy}. In the case X = {1, 2, ..., m} and Y = {1, 2, ..., n}, one often simply writes

W^{m×n} ≅ W^{X×Y}.

Remark. When one thinks of a coordinate vector f ∈ W^X as a matrix having just f as its only column, one calls f a column vector. If f corresponds to a matrix with f as its only row, f is a row vector. So

f^T row vector ⇐⇒ f column vector.
Graphs. A (combinatorial) graph G = G(X) consists of a set X of nodes (or vertices) whose ordered pairs (x, y) of elements are viewed as arrows (or (directed) edges) between nodes:

x −→ y  (x, y ∈ X).

Denoting, as usual, by R the set of all real numbers, an R-weighting of G is an assignment of real number values a_{xx} to the nodes x ∈ X and a_{xy} to the other edges (x, y) and hence corresponds to a matrix A ∈ R^{X×X} with X as its row and its column index set and coefficients A_{xy} = a_{xy}.

Although logically equivalent to a matrix, a graph representation is often more intuitive in dynamic contexts. A directed edge e = (x, y) may, for example, represent a road along which one can travel from x to y in a traffic context. e could also indicate a possible transformation of x into y, etc. The edge weight a_e = a_{xy} could be the distance from x to y or the strength of an action exerted by x onto y, etc.

While the coefficients of data vectors or matrices could be quite varied (colors, sounds, configurations in games, etc.), we will typically deal with numerical data, so that coordinate vectors have real numbers as components. Hence we deal with coordinate spaces of the type

R^S = {f : S → R}.

Addition and scalar multiplication. The sum f + g of two coordinate vectors f, g ∈ R^S is the vector of component sums (f + g)_s = f_s + g_s, i.e.,

f + g = (f_s + g_s | s ∈ S).

For any scalar λ ∈ R, the scalar product multiplies each component by λ:

λf = (λf_s | s ∈ S).

Warning. There are many – quite different – notions for multiplication with vectors.
Products.
The product f • g of two vectors f, g ∈ R^S is the vector with the componentwise products, i.e.,

f • g = (f_s g_s | s ∈ S).

In the special case of matrices A, B ∈ R^{X×Y}, the function product of A and B is called the Hadamard product A • B ∈ R^{X×Y} (with coefficients (A • B)_{xy} = A_{xy} B_{xy}).

Warning: This is not the standard matrix multiplication rule (2)! The standard product of matrices A ∈ R^{X×Y} and B ∈ R^{U×Z} is ONLY declared in the case U = Y and, then, defined as the matrix

(2)  C = AB ∈ R^{X×Z} with coefficients C_{xz} = Σ_{y∈Y} A_{xy} B_{yz}.

Nota bene. The standard product of two matrices may not be well-defined in cases where the index sets X and Y are infinite, because infinite sums are problematic. For the purposes of this book, however, this is no obstacle:

• Mainly finite sums are considered.
Remark (Hilbert spaces). Much of game-theoretic analysis can be extended to the framework of Hilbert spaces, namely to coordinate spaces of the form

(3)  ℓ²(S) = {f : S → R | Σ_{s∈S} f_s² < ∞},

where S is a countable (possibly infinite) set.

Inner product and norm. The inner product ⟨A|B⟩ of two matrices A, B ∈ R^{X×Y} is the sum of the products of the respective components:

⟨A|B⟩ = Σ_{x∈X} Σ_{y∈Y} A_{xy} B_{xy} = Σ_{(x,y)∈X×Y} (A • B)_{xy}.

Nota bene. The inner product ⟨A|B⟩ is a scalar number – and not a matrix!

J. Hadamard (1865-1963)
D. Hilbert (1862-1943)
In the vector case, we have for f, g ∈ R^S the inner product

⟨f|g⟩ = Σ_{s∈S} f_s g_s ≅ f^T g,

where the latter expression assumes that we think of f and g as column vectors and identify the (1×1)-matrix f^T g with the scalar ⟨f|g⟩.

Ex. 1.1. If f, g ∈ R^n are column vectors, then f^T g ∈ R^{1×1} is a (1×1)-matrix. This is to be clearly distinguished from the (n×n)-matrix

f g^T =  f_1 g_1  f_1 g_2  ...  f_1 g_n
         f_2 g_1  f_2 g_2  ...  f_2 g_n
         ...      ...      ...  ...
         f_n g_1  f_n g_2  ...  f_n g_n    ∈ R^{n×n}.

The norm of a vector (or matrix) f ∈ R^S is defined as

‖f‖ = √⟨f|f⟩ = ( Σ_{s∈S} |f_s|² )^{1/2}.

The norm of a vector is often geometrically interpreted as its euclidian length. So one says that two vectors f, g ∈ R^S are orthogonal if they satisfy the so-called Theorem of Pythagoras:

(4)  ‖f + g‖² = ‖f‖² + ‖g‖².

Lemma. Assuming S finite, one has for the coordinate vectors f, g ∈ R^S:

f and g are orthogonal ⇐⇒ ⟨f|g⟩ = 0.

Proof. Straightforward exercise.
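The lemma and identity (4) can be checked numerically with a few lines (the helper names are ours):

```python
# Two vectors are orthogonal exactly when their inner product vanishes,
# and then the Pythagoras identity (4) holds.
import math

def inner(f, g):
    return sum(fs * gs for fs, gs in zip(f, g))

def norm(f):
    return math.sqrt(inner(f, f))

f = [3.0, 0.0, 4.0]
g = [0.0, 5.0, 0.0]

assert inner(f, g) == 0.0  # f and g are orthogonal
fg = [fs + gs for fs, gs in zip(f, g)]
# ||f + g||^2 = ||f||^2 + ||g||^2 up to rounding:
assert abs(norm(fg) ** 2 - (norm(f) ** 2 + norm(g) ** 2)) < 1e-9
```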
The set R of real numbers has an algebraic structure under the usual addition and multiplication rules for real numbers. R contains the set of natural numbers

N = {0, 1, 2, ..., n, ...}.

So the algebraic computational rules of R may also be applied to N, as the sum and the product of two natural numbers yields a natural number.¹ Similar algebraic rules can be defined on other sets. We give two examples.

¹ though the same is not guaranteed for subtractions and divisions

Complex numbers.
The computational rules of R can be extended to the set R × R of pairs of real numbers when one defines

(a, b) + (c, d) = (a + c, b + d)
(a, b) · (c, d) = (ac − bd, ad + bc).

A convenient notation with respect to this algebra is the form

(a, b) = a(1, 0) + b(0, 1) ←→ a + ib,

with the so-called imaginary unit i ↔ (0, 1). Notice that the algebra then yields

i² ↔ (0, 1) · (0, 1) = (−1, 0) ↔ −1.

We define the set of complex numbers as the set

C = {z = a + ib | (a, b) ∈ R × R}

and identify R as a subset of C:

a ∈ R ←→ (a, 0) ∈ R × R ←→ a + i·0 ∈ C.

Algebra in C follows the same rules as algebra in R with the additional rule i² = −1.

Binary algebra.
On the 2-element set B = {0, 1}, define addition ⊕ and multiplication ⊗ according to the following tables:

⊕ | 0  1        ⊗ | 0  1
0 | 0  1        0 | 0  0
1 | 1  0        1 | 0  1

In this binary algebra, also division is possible in the sense that the equation x ⊗ y = 1 has a unique solution y "for every" x ≠ 0. (There is only one such case: y = x = 1.)

Vector algebra.
Complex numbers allow us to define sums and products of vectors with complex coefficients in analogy with real sums and products. Applications of this algebraic technique will be discussed in Chapter 8. The same is true for vectors with (0,1)-coefficients under the binary rules. An application of binary algebra is the analysis of winning strategies for nim games in Chapter 2.

Remark. Are there clearly defined "correct" or "optimal" addition and multiplication rules on data structures that would reveal their real-world structure mathematically? The answer is "no" in general. The imposed algebra is always a choice of the mathematical analyst – and not of "mother nature". It often requires care and ingenuity. Moreover, different algebraic setups may reveal different structural aspects and thus lead to additional insight.

Consider n mutually exclusive events E_1, ..., E_n, and expect that any one of these, say E_i, indeed occurs "with probability" p_i = Pr(E_i). Then the parameters p_i form a probability distribution on the set E = {E_1, ..., E_n}, i.e., the p_i are nonnegative real numbers that sum to 1:

p_1 + ... + p_n = 1 and p_1, ..., p_n ≥ 0.

If we have furthermore a measuring or observation device f that produces the number f_i if E_i occurs, then these numbers have the expected value

(5)  μ(f) = f_1 p_1 + ... + f_n p_n = Σ_{i=1}^n f_i p_i = ⟨f|p⟩.

In a game-theoretic context, a probability is often a subjective evaluation of the likelihood for an event to occur. The gambler, investor, or general player may not know in advance what the future will bring, but has more or less educated guesses on the likelihood of certain events. There is a close connection with the notion of information.

Intensity.
We think of the intensity of an event E as a numerical parameter that is inversely proportional to its probability p = Pr(E): the smaller p, the more intensely felt is the actual occurrence of E. For simplicity, let us take 1/p as our subjective intensity measure.

Remark (Fechner's law). According to Fechner,¹ the intensity of physical stimulations is physiologically felt on a logarithmic scale. Well-known examples are the Richter scale for earthquakes or the decibel scale for sound.

Following Fechner, we feel the intensity of an event E that we expect with probability p on a logarithmic scale and hence according to the function

(6)  I_a(p) = log_a(1/p) = −log_a p,

where log_a p is the logarithm of p relative to the basis a > 1 (see Ex. 1.2). In particular, the occurrence of an "impossible" event, which we expect with zero probability, has infinite intensity I_a(0) = ∞.

Nota bene. The mathematical intensity of an event depends only on the probability p with which it occurs – and not on its interpretation within a modelling context or its "true nature" in a physical environment.

Ex. 1.2 (Logarithm). Recall: For any given positive numbers a, x > 0, there is a unique number y = log_a x such that

x = a^y = a^{log_a x}.

Where e = 2.718... is Euler's² number, the notation ln x = log_e x is commonly used. ln x is the so-called natural logarithm with the function derivative (ln x)′ = 1/x for all x > 0.

Two logarithm functions log_a x and log_b x differ just by a multiplicative constant. Indeed, one has

a^{log_a x} = x = b^{log_b x} = a^{(log_a b) log_b x}

and hence log_a x = (log_a b) · log_b x for all x > 0.

¹ G.Th. Fechner (1801-1887)
² L. Euler (1707-1783)
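The base-change rule of Ex. 1.2 is easy to verify numerically, e.g. with Python's `math.log(x, base)` (an illustration, not part of the text):

```python
# Check of log_a x = (log_a b) * log_b x, here with a = 2 and b = e,
# so that log_2 x = log_2(e) * ln x.
import math

for x in (0.5, 1.0, 7.3, 100.0):
    assert abs(math.log(x, 2) - math.log(math.e, 2) * math.log(x)) < 1e-9
```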
Information.
In the fundamental theory of information, the parameter

I(p) = −log₂ p

is the quantity of information provided by an event E that occurs with probability p. Note that the probability value p can be regained from the information quantity I(p):

p = 2^{log₂ p} = 2^{−I(p)}.

This relationship shows that "probabilities" can be understood as parameters that capture the amount of information (or lack of information) we have on the occurrence of events.
Entropy.
The expected quantity of information provided by the family E = {E_1, ..., E_n} of events with the probability distribution π = (p_1, ..., p_n) is known as its entropy

(7)  H(E) = H(π) = Σ_{k=1}^n p_k I(p_k) = − Σ_{k=1}^n p_k log₂ p_k,

where, by convention, one sets 0 · log₂ 0 = 0. Again, it should be noticed: H(E) just depends on the parameter vector π – and not on a real-world interpretation of E.

Remark. Entropy is also a fundamental notion in thermodynamics, where it serves, for example, to define the temperature of a system. Physicists prefer to work with base e rather than base 2 and thus with ln x instead of log₂ x, i.e., with the accordingly scaled entropy

H′(π) = − Σ_{k=1}^n p_k ln p_k = (ln 2) · H(π).
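Formula (7) and the rescaling by ln 2 can be checked with a few lines (the helper name `entropy` is ours):

```python
# Entropy of a distribution, with the convention 0 * log 0 = 0, and
# the physicists' rescaling H'(pi) = (ln 2) * H(pi).
import math

def entropy(p, base=2.0):
    """H(p) = -sum p_k log_base p_k over the nonzero p_k."""
    return -sum(pk * math.log(pk, base) for pk in p if pk > 0)

# A fair coin carries one bit of information per toss:
assert abs(entropy([0.5, 0.5]) - 1.0) < 1e-9
# A certain event provides no information:
assert entropy([1.0, 0.0]) == 0.0
# Base e versus base 2:
pi = [0.1, 0.2, 0.3, 0.4]
assert abs(entropy(pi, base=math.e) - math.log(2) * entropy(pi)) < 1e-9
```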
¹ Information theory in this sense is due to C.E. Shannon (1916-2001).

3. Systems

A system is a physical, economic, or other entity that is in a certain state at any given moment. Denoting by S the collection of all possible states σ, we identify the system with S. This is, of course, a very abstract definition. In practice, one will have to describe the system states in a way that is suitable for a concrete mathematical analysis. To get a first idea of what is meant, let us look at some examples.
Chess.
A system arises from a game of chess as follows: A state of chess is a particular configuration of the chess pieces on the chess board, together with the information which of the two players ("B" or "W") is to move next. If 𝒞 is the collection of all possible chess configurations, a state could thus be described as a pair

σ = (C, p) with C ∈ 𝒞 and p ∈ {B, W}.

In a similar way, a card game takes place in the context of a system whose states are the possible distributions of cards among the players, together with the information which players are to move next.
Economies.
The model of an exchange economy involves a set N of agents and a set 𝒢 of certain specified goods. A bundle for agent i ∈ N is a data vector b ∈ R^𝒢, where the component b_G indicates that the bundle b comprises b_G units of the good G ∈ 𝒢. Denote by B the set of all possible bundles. A state of the exchange economy is now described by a map β : N → B (or vector β ∈ B^N) that specifies agent i's particular bundle β(i) ∈ B.

Closely related is the description of the state of a general economy. One considers a set ℰ of economic statistics E. Assuming that these statistics take numerical values ε_E at a given moment, the corresponding economic state is given by the data vector ε ∈ R^ℰ having the statistical values ε_E as its components.

Decisions.
In a general decision system D, we are given a set N = {n_1, n_2, ...} of agents and assume that each agent n_i ∈ N has to make a decision of a given type, i.e., we assume that n_i has to choose an element d_i in its "decision set" D_i. The joint decision of N is then a vector

d = (d_1, d_2, ...) = (d_i | i ∈ N) ∈ D_1 × D_2 × ··· = Π_{i∈N} D_i

and thus describes a decision state of the set N. In the context of game theory, decisions of agents often correspond to choices of strategies from certain feasible strategy sets.

Decision systems are ubiquitous. In the context of a traffic situation, for example, N can be a set of persons who want to travel from individual starting points to individual destinations. Suppose that each person i selects a path P_i from a set 𝒫_i of possible paths in order to do so. Then a state of the associated traffic system is a definite selection π of paths of members of the group N and thus a data vector with path-valued components:

π = (P_i | i ∈ N).
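A decision state as an element of the product of the individual decision sets can be sketched as follows (the agents and their decision sets are invented for illustration):

```python
# A toy decision system: the decision states are exactly the elements
# of the product D_1 x D_2 x ... of the individual decision sets.
from itertools import product

D = {
    "agent1": ["left", "right"],
    "agent2": ["left", "right", "stay"],
}

agents = sorted(D)
states = [dict(zip(agents, choice))
          for choice in product(*(D[i] for i in agents))]

assert len(states) == 2 * 3  # |D_1| * |D_2| joint decisions
assert {"agent1": "left", "agent2": "stay"} in states
```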
4. Games

A game Γ involves a set N of agents (or players) and a system S relative to which the game is played. A concrete game instance γ starts with some initial state σ_0 and consists in a sequence of moves, i.e., state transitions

σ_t → σ_{t+1}

that are feasible according to the rules of Γ. After t steps, the system has evolved from state σ_0 into a state σ_t in a sequence of (feasible) moves

σ_0 → σ_1 → ... → σ_{t−1} → σ_t.

We refer to the associated sequence γ_t = σ_0 σ_1 ··· σ_{t−1} σ_t as the stage of Γ at time t and denote the set of all possible stages after t steps by

(8)  Γ_t = {γ_t | γ_t is a possible stage of Γ at time t}.

If the game instance γ ends in stage γ_t = σ_0 σ_1 ··· σ_t, then σ_t is the final state of γ. It is important to note that there may be many state sequences in S that are not necessarily stages of Γ because they are not feasible according to the rules of Γ.

This informal discussion indicates how a general game can be defined from an abstract point of view:

• A game Γ is, by definition, a collection of finite state sequences γ with the property

σ_0 σ_1 ··· σ_{t−1} σ_t ∈ Γ ⟹ σ_0 σ_1 ··· σ_{t−1} ∈ Γ.

The members γ ∈ Γ are called the stages of Γ.

Chess would thus be abstractly defined as the set of all possible finite sequences of legal chess moves. This set, however, is infinitely large and impossibly difficult to handle computationally.

In concrete practical situations, a game Γ is characterized by a set of rules that allow us to check whether a state sequence γ is feasible for Γ, i.e., belongs to that potentially huge set Γ. The rules typically involve also a set N of players (or agents) that "somehow" influence the evolution of a game by exerting certain actions and making certain choices at subsequent points in time t = 0, 1, 2, .... Let us remain a bit vague on the precise mathematical meaning of "influence" at this point.
It will become clear in special game contexts later. In an instance of chess, for example, one knows which of the players is to move at a given time t. This player can then move the system deterministically from the current state σ_t into a next state σ_{t+1} according to the rules of chess. Many games, however, involve stochastic procedures (like rolling dice or shuffling cards) whose outcome is not known in advance and make it impossible for a player to select a desired subsequent state with certainty.

Remark. When a game starts in a state σ_0 at time t = 0, it is usually not clear in what stage γ it will end (or whether it ends at all).

Objectives and utilities.
The players in a game typically have certain objectives according to which they try to influence the evolution of the game. A rigorous mathematical model requires these objectives to be clearly formulated in mathematical terms, of course. A typical example of such objectives is a set u of utility functions

u_i : Γ → R  (i ∈ N)

which associates with each player i ∈ N a real number u_i(γ) ∈ R as its utility value once the stage γ ∈ Γ is realized. Its expected utility is, of course, of importance for the strategic decision of a player in a game. We illustrate this with an example in a betting context.

Ex. 1.3. Consider a single player with a capital of 100 euros in a situation where a bet can be placed on the outcome of a (0,1)-valued stochastic variable X with probabilities

p = Pr{X = 1} and q = Pr{X = 0} = 1 − p.

Assume:

• If the player invests f euros into the game and the event {X = 1} occurs, the player will receive 2f euros back (a net gain of f). In the event {X = 0} the investment f will be lost.

Question: What is the optimal investment amount f* for the player?

To answer it, observe that the player's total portfolio after the bet is

x = x(f) = { 100 + f with probability p
           { 100 − f with probability q.

For the sake of the example, let us suppose that the player has a utility function u(x) and wants to maximize the expected utility of x, that is, the function

g(f) = p · u(100 + f) + q · u(100 − f).

Let us consider two scenarios:

(i) u(x) = x and hence

g(f) = p(100 + f) + q(100 − f) with derivative g′(f) = p − q = 2p − 1.

If p < 1/2, g(f) is monotonically decreasing in f. Consequently, f* = 0 would be the best decision. In the case p > 1/2, g(f) is monotonically increasing and, therefore, the full investment f* = 100 is optimal.

(ii) u(x) = ln x and hence g(f) = p ln(100 + f) + q ln(100 − f) with the derivative

g′(f) = p/(100 + f) − q/(100 − f) = (100(p − q) − f) / ((100 + f)(100 − f))  (0 ≤ f ≤ 100).

In this case, g(f) increases monotonically until f = 100(p − q) and decreases monotonically afterwards. So the best investment choice is f* = 100(p − q) if p ≥ q. If p < q, we have 100(p − q) < 0. Hence f* = 0 would be the best investment choice.

Nota bene. The player in Ex. 1.3 with utility function u(x) = x risks a complete loss of the capital in the case p > 1/2 with probability q. A player with utility function u(x) = ln x will never experience a complete loss of the capital.

Ex. 1.4. Analyze the betting problem in Ex. 1.3 for an investor with utility function u(x) = √x.
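Scenario (ii) of Ex. 1.3 can be verified numerically: a simple grid search over the expected log-utility recovers the optimal stake f* = 100(p − q) (a sketch; all names are ours):

```python
# Numerical check of scenario (ii): for u(x) = ln x, maximizing
# g(f) = p*ln(100 + f) + q*ln(100 - f) over f recovers f* = 100*(p - q).
import math

def g(f, p):
    q = 1.0 - p
    return p * math.log(100 + f) + q * math.log(100 - f)

p = 0.7
f_star = 100 * (p - (1 - p))               # = 40 euros
grid = [f / 10 for f in range(0, 1000)]    # 0.0, 0.1, ..., 99.9
best = max(grid, key=lambda f: g(f, p))
assert abs(best - f_star) < 0.1
```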
Remark. Utility functions which represent a gain are typically "concave", which intuitively means that the marginal utility gain is higher when the reference quantity is small than when it is big. As an illustration, assume that u : (0, ∞) → R is a differentiable utility function. Then the derivative u′(x) represents the marginal utility value at x. u is concave if the derivative function x ↦ u′(x) decreases monotonically with x. The logarithm function f(x) = ln x has the strictly decreasing derivative f′(x) = 1/x and is thus an (important) example of a concave utility.

Profit and cost.
In a profit game the players i are assumed to aim at maximizing their utility u_i. In a cost game each player tries to minimize his utility as far as possible.

REMARK. The notions of profit and cost games are closely related: A profit game with the utilities u_i is formally equivalent to a cost game with utilities c_i = −u_i.

Terminology.
A game with a set N of players is a so-called N-person game. If N has cardinality n = |N|, the N-person game is also simply termed an n-person game. The particular case of 2-person games is fundamental, as we will see later.

Decisions and strategies.
In order to pursue its objective in a game, an agent i ∈ N may choose a strategy s_i from a set S_i of possible strategies. The joint strategic choice s = (s_i | i ∈ N) typically influences the evolution of the game. We illustrate the situation with a well-known game-theoretic puzzle:

EX. 1.5 (Prisoner's dilemma). There are two agents
A, B and the data matrix

(9)  U = [ (u^A_11, u^B_11)  (u^A_12, u^B_12) ]   =   [ (7, 7)  (1, 9) ]
         [ (u^A_21, u^B_21)  (u^A_22, u^B_22) ]       [ (9, 1)  (3, 3) ].

A and B play a game with these rules:

(1) A chooses a row i and B a column j of U.
(2) The choice (i, j) entails that A is "penalized" with the value u^A_ij and B with the value u^B_ij.

This 2-person game (even if the players are not real "persons") has an initial state σ0 and other states (1, 1), (1, 2), (2, 1), (2, 2), which correspond to the four entry positions of U. The agents have to decide on strategies i, j ∈ {1, 2}. Their joint decision (i, j) will move the game from σ0 into the final state σ1 = (i, j). The game ends at time t = 1. The utility of player A is then the value u^A_ij; B has the utility value u^B_ij.

This game is usually understood as a cost game, i.e., A and B aim at minimizing their utilities. What should A and B do optimally?

REMARK
The utility matrix U in (9) yields a version of the so-called Prisoner's dilemma, which is told as the story of two prisoners A and B who can either individually "confess" or "not confess" to the crime they are jointly accused of. Depending on their joint decision, they supposedly face prison terms as specified in U. Their "dilemma" is:

• no matter what they do, at least one of them will feel, in the end, to have taken the wrong decision.

Part 2

CHAPTER 2
Combinatorial Games
A look is taken at general games from the standpoint of two alternatingplayers. This aspect reveals a recursive character of games. Finite gamesare combinatorial. Under the normal winning rule, combinatorial games areseen to behave like generalized numbers. Game algebra allows one to ex-plicitly compute winning strategies for nim games, for example.
1. Alternating players
Let Γ be a game that is played on a system S and recall that Γ represents the collection of all possible stages in an abstract sense. Assume that a concrete instance of Γ starts in the initial state σ0 ∈ S. Then we may imagine that the evolution of the game is caused by two "superplayers" that alternate with the following moves:

(1) The beginning player chooses a stage γ1 = σ0σ1 ∈ Γ.
(2) Then the second player chooses a stage γ2 = σ0σ1σ2 ∈ Γ.
(3) Now it is again the turn of the first player to realize the next feasible stage γ3 = σ0σ1σ2σ3 ∈ Γ, and so on.
(4) The game stops if the player who would be next to move cannot find a feasible extension γ_{t+1} ∈ Γ of the current stage γ_t.

This point of view allows us to interpret the evolution of a game as the evolution of a so-called alternating 2-person game. For such a game A, we assume:

(A1) There is a set 𝒢, two players L and R, and an initial element G0 ∈ 𝒢.
(A2) For every G ∈ 𝒢, there are subsets 𝒢^L ⊆ 𝒢 and 𝒢^R ⊆ 𝒢.

The two sets 𝒢^L and 𝒢^R in (A2) are the sets of options of the respective players relative to G.
The rules of the alternating game A are:

(A3) The beginning player chooses an option G1 relative to G0. Then the second player chooses an option G2 relative to G1. Now the first player may select an option G3 relative to G2, and so on.
(A4) The game stops with Gt if the player whose turn it is has no option relative to Gt (i.e., the corresponding option set is empty).

EX. 2.1 (Chess). Chess is an obvious example of an alternating 2-person game. Its stopping rule (A4) says that the game ends when a player's king has been taken ("checkmate").

REMARK. It is left unspecified whether L or R is the first player in the general definition of an alternating 2-person game. This will offer the necessary flexibility in the recursive analysis of games later.
2. Recursiveness
An alternating 2-person game A as above has a recursive structure:

(R) A feasible move G → G′ of a player reduces the current game to a new alternating 2-person game with initial element G′.

To make this conceptually clear, we denote the options of the players L and R relative to G as

(10)  G = {G^L_1, G^L_2, ... | G^R_1, G^R_2, ...}

and think of G as the (recursive) description of a game that could possibly be reduced by L to a game G^L_i or by R to a game G^R_j, depending on whose turn it is to make a move.
3. Combinatorial games
Consider an alternating 2-person game in its recursive form (10):

G = {G^L_1, G^L_2, ... | G^R_1, G^R_2, ...}.

Denoting by |G| the maximal number of subsequent moves that are possible in G, we say that G is a combinatorial game if

|G| < ∞,

i.e., if G is guaranteed to stop after a finite number of moves (no matter which player starts). Clearly, all the options G^L_i and G^R_j of G must then be combinatorial games as well:

|G| < ∞  =⇒  |G^L_i|, |G^R_j| ≤ |G| − 1 < ∞.

EX. 2.2 (Chess). According to its standard rules, chess is not a combinatorial game because the players could move pieces back and forth and thus create a never-ending sequence of moves. In practice, chess is played with an additional rule that ensures finiteness and thus makes it combinatorial (in the sense above). The use of a timing clock, for example, limits the number of moves.

EX. 2.3 (Nim). The nim game G = G(N_1, ..., N_k) has two alternating players and starts with the initial configuration of a collection of k finite and pairwise disjoint sets N_1, ..., N_k. A move of a player is:

• Select one of these sets, say N_j, and remove one or more of the elements from N_j.

Clearly, one has |G(N_1, ..., N_k)| ≤ |N_1| + ... + |N_k| < ∞. So nim is a combinatorial game. (A popular version of nim starts from four sets N_1, N_2, N_3, N_4 of pieces (pebbles or matches etc.) with |N_1| = 1, |N_2| = 3, |N_3| = 5 and |N_4| = 7 elements.)

EX. 2.4 (Frogs). Having fixed numbers n and k, the two frogs L and R sit n positions apart. A move of a frog consists in taking a leap of at least 1 but not more than k positions toward the other frog:

L → • • • · · · • • • ← R

The frogs are not allowed to jump over each other. Obviously, the game ends after at most n moves.

REMARK
The game of frogs in Ex. 2.4 can be understood as a nim game with an additional move restriction. Initially, there is a set N with n elements (which correspond to the positions separating the frogs). A player must remove at least 1 but not more than k elements.

Creation of combinatorial games.
The class R of all combinatorial games can be created systematically. We first observe that there is exactly one combinatorial game G with |G| = 0, namely the game

O = {· | ·}

in which no player has an option to move. Recall, furthermore, that all options G^L and G^R of a game G with |G| = t must satisfy |G^L| ≤ t − 1 and |G^R| ≤ t − 1. So we can imagine that R is "created" in a never-ending process from day to day:

DAY 0: The game O = {· | ·} is created and yields R_0 = {O}.

DAY 1: The games {O | ·}, {· | O}, {O | O} are created and one obtains the class R_1 = {O, {O | ·}, {· | O}, {O | O}} of all combinatorial games G with |G| ≤ 1.

DAY 2: The creation of the class R_2 of those combinatorial games with options in R_1 is completed. These include the games already in R_1 and new games such as

{· | {O | ·}}, {· | {· | O}}, {O | {O | ·}}, {O | {· | O}}, {O, {· | O} | {O | ·}}, ...

DAY t: The class R_t of all those combinatorial games G with options in R_{t−1} is created.

So one has R_0 ⊂ R_1 ⊂ ... ⊂ R_t ⊂ ... and

R = R_0 ∪ R_1 ∪ ... ∪ R_t ∪ ...

EX. 2.5. The number of combinatorial games grows rapidly:

(1)
List all the combinatorial games in R_2.

(2) Argue that vastly many more combinatorial games have been created by the end of the following day (see Ex. 2.6).

EX. 2.6. Show that r_t = |R_t| grows super-exponentially fast:

r_t > 2^{r_{t−1}}   (t = 1, 2, ...)

(Hint: A finite set S with n = |S| elements admits 2^n subsets.)
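The day-by-day creation can be simulated directly by representing a game as a pair (L-options, R-options). This is an illustrative sketch with an encoding of my own choosing, not the text's notation; it confirms r_0 = 1, r_1 = 4 and, since every pair of subsets of R_{t−1} yields a game, r_t = 2^{r_{t−1}} · 2^{r_{t−1}}, so r_2 = 256 > 2^4, in line with Ex. 2.6:

```python
from itertools import combinations

def powerset(s):
    """All subsets of a finite iterable, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def next_day(games):
    """R_t: all games (L-options, R-options) with option sets drawn from R_{t-1}."""
    subsets = powerset(games)
    return {(L, R) for L in subsets for R in subsets}

O = (frozenset(), frozenset())      # the game {. | .}
R0 = {O}
R1 = next_day(R0)                   # the four games of DAY 1
R2 = next_day(R1)                   # includes R1, since options in R0 are options in R1
print(len(R0), len(R1), len(R2))
```

Note that R1 ⊆ R2 holds automatically because R0 ⊆ R1, matching the chain R_0 ⊂ R_1 ⊂ ... in the text.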
4. Winning strategies
A combinatorial game is started with either L or R making the first move. This determines the first player. The other player is the second player. The normal winning rule for an alternating 2-person game is:

(NR) If a player i ∈ {L, R} cannot move, player i has lost and the other player is declared the winner.

Chess matches, for example, are played under the normal rule: A loss of the king means a loss of the match (see Ex. 2.1).

REMARK
The misère rule declares the player with no move to be the winner of the game.

A winning strategy for player i is a move (option) selection rule for i that ensures that i is the winner.

THEOREM 2.1.
In any combinatorial game G, an overall winning strategy exists for either the first or the second player.

Proof. We prove the theorem by mathematical induction on t = |G|. In the case t = 0, we have G = O = {· | ·}. Because the first player has no move in O, the second player is automatically the winner in normal play and hence has a winning strategy trivially guaranteed. Under the misère rule, the first player wins.

Suppose now t ≥ 1 and that the Theorem is true for all games that were created on DAY t − 1 or before. Consider the first player in G and assume that it is R. (The argument for L would go exactly the same way!) If R has no option, L is the declared winner in normal play while R is the declared winner in misère play. Either way, G has a guaranteed winner. If options G^R exist, the induction hypothesis says that each of R's options leads to a situation in which either the first or the second player would have a winning strategy. If there is (at least) one option G^R with the second player as the winner, R can take this option and win as the second player in G^R. On the other hand, if all of R's options have their first player as the winner, there is nothing R can do to prevent L from winning. So the originally second player L has an overall strategy to win the game guaranteed. ⋄

Note that the proof of Theorem 2.1 is constructive in the following sense:

(1) Player i marks by v(G_i) = +1 all the options G_i in G that would have i as the winner and sets v(G_i) = −1 otherwise.
(2) Player i follows the strategy to move to an option with the highest v-value.
(3) Provided a winning strategy exists at all for i, strategy (2) is a winning strategy for i.

The reader must be cautioned, however: the concrete computation of a winning strategy may be a very difficult task in real life.

EX. 2.7 (DE BRUIJN's game). Two players choose a natural number n and write down all the numbers

1, 2, 3, ..., n − 1, n.
A move of a player consists in selecting one of the numbers still present and erasing it together with all its (proper or improper) divisors.

Note that a winning strategy exists for the first player in normal play. Indeed, if it existed for the second player, the first player could simply erase "1" on the first move and afterwards (being now the second player) follow that strategy and win. Alas, no practically efficient method for the computation of a winning strategy is known.

REMARK
If chess is played with a finiteness rule, then a winning strat-egy exists for one of the two players. Unfortunately, it is not known whatit looks like. It is not even known which player is the potential guaranteedwinner.
While winning strategies can be computed in principle (see the proof of Theorem 2.1), the combinatorial structure of many games is so complex that even today's computers cannot perform the computation efficiently.
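The constructive marking scheme from the proof of Theorem 2.1 can be sketched as a short recursion. The nested-tuple encoding of games below is my own illustration (not the text's notation); it decides who wins under the normal rule, but, as just remarked, only tiny games are feasible this way:

```python
from functools import lru_cache

# A game is a pair (L, R) of tuples of games: the options of Left and Right.
O = ((), ())  # {. | .}: the player to move has no option and loses (normal rule)

@lru_cache(maxsize=None)
def first_player_wins(game, mover):
    """True iff `mover` ('L' or 'R'), moving first in `game`, can force a win
    under the normal winning rule."""
    left, right = game
    options = left if mover == 'L' else right
    other = 'R' if mover == 'L' else 'L'
    # Win iff some option leaves the opponent, moving first there, losing.
    return any(not first_player_wins(opt, other) for opt in options)

STAR = ((O,), (O,))   # the game {O | O}: one move for either player
print(first_player_wins(O, 'L'))     # False: no move available, L loses
print(first_player_wins(STAR, 'L'))  # True: L moves to O and R cannot answer
```

The memoization via lru_cache is what makes repeated subgames cheap, but the number of distinct positions itself still grows explosively, as the day-by-day count showed.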
In practice, a player i will proceed according to the following v-greedy strategy:

(vg_i) Assign a quality estimate v(G_i) ∈ R to all the options G_i and move to an option with a highest v-value.

A quality estimate v is not necessarily completely pre-defined by the game in absolute terms but may reflect previous experience and other considerations. Once quality measures are accepted as "reasonable", it is perhaps natural to expect that the game would evolve according to greedy strategies relative to these measures.

EX. 2.8. A popular rule of thumb evaluates the quality of a chess configuration σ for a player W, say, by assigning a numerical weight v to the white pieces on the board. For example:

v(pawn) = 1, v(bishop) = 3, v(knight) = 3, v(castle) = 4.5, v(queen) = 9.

Where v(σ) is the total weight of the white pieces, a v-greedy player W would choose a move to a configuration σ′ with a maximal value v(σ′). (Player B can, of course, evaluate the black pieces similarly. Chess computer programs, too, follow this idea.)
5. Algebra of games
For the rest of the chapter we will (unless explicitly said otherwise) assume: • The combinatorial games under consideration are played with thenormal winning rule.
The set R of combinatorial games has an algebraic structure which allows us to view games as generalized numbers. This section will give a short sketch of the idea; (much) more can be found in the highly recommended treatise:

J.H. CONWAY, On Numbers and Games. A.K. Peters, 2000.
Negation.
We first define the negation for the game

G = {G^L_1, G^L_2, ... | G^R_1, G^R_2, ...} ∈ R

as the game (−G) in which the players L and R interchange their roles: L becomes the "right" and R the "left" player. So we obtain the negated games recursively as

−O = O  and  −G = {−G^R_1, −G^R_2, ... | −G^L_1, −G^L_2, ...} if G ≠ O.

Also −G is a combinatorial game, and one has the algebraic rule

G = −(−G).

Addition.
The sum G + H of the games G and H is the game in which a player i ∈ {L, R} may choose to play either on G or on H. This means that i chooses an option G_i in G or an option H_i in H and accordingly reduces the game either to G_i + H or to G + H_i. The reader is invited to verify the further algebraic rules:

G + H = H + G
(G + H) + K = G + (H + K)
G + O = G.

Moreover, we write G − H = G + (−H).

EX. 2.9. The second player wins G − G in normal play with the obvious strategy:

• Imitate every move of the first player. When the first player chooses the option G_i in G, the second player will answer with the option (−G_i) in (−G), etc.

Motivated by Ex. 2.9, we say that combinatorial games G and H are congruent (notation: "G ≡ H") if

(C) G − H is won by the second player (in normal play).

In particular, G ≡ O means that G is won by the second player.

THEOREM 2.2.
For all
G, H, K ∈ R one has:

(a) If G ≡ H, then H ≡ G.
(b) If G ≡ H, then G + K ≡ H + K.
(c) If G ≡ H and H ≡ K, then G ≡ K.

Proof. The verification of the symmetry rule (a) is left to the reader.

To see that (b) is true, we consider the game

M = (G + K) − (H + K) = G + K − H − K = (G − H) + (K − K).

The game K − K can always be won by the second player (Ex. 2.9). Hence, if the second player can win G − H, then clearly M as well:

• It suffices for the second player to apply the respective winning strategies to G − H and to K − K.

The proof of the transitivity rule (c) is similar. By assumption, the game

T = (G − K) + (−H + H) = (G − H) + (H − K)

can be won by the second player. We must show that the second player can therefore win G − K. Suppose, to the contrary, that G − K ≢ O were true and that the game G − K could be won by the first player. Then the first player could win T by beginning with a winning move in G − K and continuing with the winning strategy whenever the second player moves in G − K. If the second player moves in −H + H, the first player becomes second there and thus is assured to win on −H + H! So the first player would win T, which would contradict the assumption, however. Hence we conclude that G − K ≡ O must hold. ⋄

Congruence classes.
For any G ∈ R, the class of congruent games is

[G] = {H ∈ R | G ≡ H}.

Theorem 2.2 says that addition and subtraction can be meaningfully defined for congruence classes:

[G] + [H] = [G + H]  and  [G] − [H] = [G − H].

In particular, we obtain the familiar algebraic rule

[G] − [G] = [G − G] = [O],

where [O] is the class of all combinatorial games that are won by the second player. Hence we can re-cast the optimal strategy for a player (under the normal rule):

• Winning strategy:
Make a move G → G′ to an option G′ ∈ [O].

Say that the combinatorial games G and H are strategically equivalent (denoted "G ∼ H") if one of the following statements is true:

(SE1) G and H can both be won by the first player (i.e., G ≢ O ≢ H).
(SE2) G and H can both be won by the second player (i.e., G ≡ O ≡ H).

THEOREM
Congruent games
G, H ∈ R are strategically equivalent, i.e. , G ≡ H = ⇒ G ∼ H. Proof.
We claim that strategically non-equivalent games G and H cannot be congruent. So assume, for example, that the first player wins G (i.e., G ≢ O) and the second player wins H (i.e., H ≡ O and hence (−H) ≡ O). We will argue that the first player has a winning strategy for G − H, which means G ≢ H. Indeed, the first player can begin with a winning strategy on G. Once the second player moves on (−H), the first player, being now the second player on (−H), wins there. Thus an overall victory is guaranteed for the first player. ⋄
6. Impartial games
A combinatorial game G is said to be impartial (or neutral) if both players have the same options. The formal definition is recursive:

• O = {· | ·} is impartial.
• G = {A, B, ..., T | A, B, ..., T} is impartial if all the options A, B, ..., T are impartial.

Notice the following rules for impartial games G and H:

(1) G = −G and hence G + G = G − G ∈ [O].
(2) G + H is impartial.

Nim is the prototypical impartial game (as we will see with the SPRAGUE-GRUNDY Theorem 2.4 below). To formalize this claim, we use the notation ∗n for a nim game relative to just one single set N with n = |N| elements. The options of ∗n are the nim games

∗0, ∗1, ..., ∗(n − 1).

Moreover,

G = ∗n_1 + ∗n_2 + ... + ∗n_k

is the nim game described in Ex. 2.3 with k piles of sizes n_1, n_2, ..., n_k.

EX. 2.10. Show that the frog game of Ex. 2.4 is impartial.
We now define the mex ("minimal excluded") of numbers a, b, c, ..., t as the smallest natural number g that equals none of the numbers a, b, c, ..., t:

(11)  mex{a, b, c, ..., t} = min{g ≥ 0 | g ∉ {a, b, c, ..., t}}.

The crucial observation is stated in Lemma 2.1.

LEMMA 2.1. Let a, b, c, ..., t be arbitrary natural numbers. Then one has

G = {∗a, ∗b, ∗c, ..., ∗t | ∗a, ∗b, ∗c, ..., ∗t} ≡ ∗mex{a, b, c, ..., t},

i.e., the impartial game G with the nim options ∗a, ∗b, ∗c, ..., ∗t is equivalent to the simple nim game ∗m with m = mex{a, b, c, ..., t}.

Proof. In view of ∗m = −∗m, we must show G + ∗m ≡ O, i.e., the second player wins G + ∗m. Indeed, if the first player chooses an option ∗j from ∗m = {∗0, ∗1, ..., ∗(m−1) | ∗0, ∗1, ..., ∗(m−1)}, then the second player can choose ∗j from G (which must exist because of the definition of m as the minimal excluded number) and continue to win ∗j + ∗j as the second player. If the first player selects an option from G, say ∗a, we distinguish two cases. If a > m, then the second player reduces ∗a to ∗m and wins. If a < m, then the second player can reduce ∗m to ∗a and win. (Note that a = m is impossible by the definition of mex.) ⋄
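Definition (11) translates directly into a few lines of code; this is an illustrative sketch with a function name of my own choosing:

```python
def mex(numbers):
    """Minimal excluded number: the smallest natural number not in `numbers`."""
    present = set(numbers)
    g = 0
    while g in present:
        g += 1
    return g

print(mex([0, 1, 2, 4]))  # 3
print(mex([1, 2, 3]))     # 0: the value 0 itself is missing
```

The loop terminates after at most len(numbers) + 1 steps, since a set of that size cannot contain all of 0, ..., len(numbers).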
THEOREM 2.4 (SPRAGUE-GRUNDY). Every impartial combinatorial game

G = {A, B, C, ..., T | A, B, C, ..., T}

is equivalent to a unique nim game of type ∗m. The number m is the so-called GRUNDY number G(G) and can be computed recursively:

(12)  m = G(G) = mex{G(A), G(B), G(C), ..., G(T)}.

Proof.
We prove the Theorem by induction on |G| and note G ≡ O = ∗0 if |G| = 0. By induction, we now assume that the Theorem is true for all options of G, i.e., A ≡ ∗a, B ≡ ∗b etc. with a = G(A), b = G(B) etc. Hence we can argue G ≡ ∗m = ∗G(G) exactly as in the proof of Lemma 2.1. G cannot be equivalent to another nim game ∗k since (as Ex. 2.11 below shows):

∗k ≡ ∗m  =⇒  k = m. ⋄

EX. 2.11. Show for all natural numbers k and m:

∗k ≡ ∗m  ⇐⇒  k = m.

EX. 2.12 (GRUNDY number of frogs).
Let F(n, k) be the frog game of Ex. 2.4 and G(n, k) its GRUNDY number. For k = 3, F(n, k) has the options F(n−1, 3), F(n−2, 3), F(n−3, 3). So the associated GRUNDY number G(n, 3) satisfies the recursion

G(n, 3) = mex{G(n−1, 3), G(n−2, 3), G(n−3, 3)}.

Clearly, G(0, 3) = 0, G(1, 3) = 1 and G(2, 3) = 2. The recursion then produces the subsequent GRUNDY numbers:

n        0  1  2  3  4  5  6  7  8  9  10  ···
G(n, 3)  0  1  2  3  0  1  2  3  0  1  2   ···
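The mex recursion above is easy to run mechanically; in the sketch below (function names are my own illustration), the slice G[m−k : m] is exactly the window of the k previously computed GRUNDY values:

```python
def mex(numbers):
    """Smallest natural number not occurring in `numbers` (definition (11))."""
    present = set(numbers)
    g = 0
    while g in present:
        g += 1
    return g

def frog_grundy(n, k=3):
    """GRUNDY numbers G(0,k), ..., G(n,k) of the frog game via the mex recursion."""
    G = []
    for m in range(n + 1):
        G.append(mex(G[max(0, m - k):m]))  # options: the k preceding positions
    return G

print(frog_grundy(10))  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2]
```

The output reproduces the table: the values cycle with period k + 1 = 4, i.e., G(n, 3) = n mod 4.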
The second player wins the nim game ∗m if and only if m = 0. So the first player can win exactly those impartial games G with GRUNDY number G(G) ≠ 0. In general, we note:

• Winning strategy for impartial games: Make a move G → G′ to an option G′ with GRUNDY number G(G′) = 0.

Sums and GRUNDY numbers. If G and H are impartial games with GRUNDY numbers m = G(G) and n = G(H), the GRUNDY number of their sum is

G(G + H) = G(∗m + ∗n).

Indeed, if G ≡ ∗m and H ≡ ∗n, then G + H ≡ ∗m + ∗n must hold (recall Theorem 2.2). For the study of sums, we may therefore restrict ourselves to nim games. Moreover, the fundamental property

G(G + G) = G(∗n + ∗n) = G(O) = 0

suggests studying sums in the context of binary algebra.

Binary algebra.
Recall that every natural number n has a unique binary representation in terms of powers of 2,

n = Σ_{j=0}^∞ α_j 2^j,  with binary coefficients α_j ∈ {0, 1}.

We define binary addition of 0 and 1 according to the rules

0 ⊕ 0 = 0 = 1 ⊕ 1  and  0 ⊕ 1 = 1 = 1 ⊕ 0

and extend it to natural numbers:

(Σ_{j=0}^∞ α_j 2^j) ⊕ (Σ_{j=0}^∞ β_j 2^j) = Σ_{j=0}^∞ (α_j ⊕ β_j) 2^j.

REMARK. Notice that α_j = 0 must hold for all j > log_2 n if n = Σ_{j=0}^∞ α_j 2^j with α_j ∈ {0, 1}.

EX. 2.13. Show for the binary addition of natural numbers m, n, k:

n ⊕ m = m ⊕ n
n ⊕ (m ⊕ k) = (n ⊕ m) ⊕ k
n ⊕ m ⊕ k = 0  ⇐⇒  n ⊕ m = k.

The sum theorem.
We consider nim games with three piles of n, m and k objects, i.e., sums of single nim games ∗n, ∗m and ∗k.

LEMMA 2.2. For all natural numbers n, m, k, one has:

(1) If n ⊕ m ⊕ k ≠ 0, then the first player wins ∗n + ∗m + ∗k.
(2) If n ⊕ m ⊕ k = 0, then the second player wins ∗n + ∗m + ∗k.

Proof. We prove the Lemma by induction on n + m + k and note that the statements (1) and (2) are obviously true in the case n + m + k = 0. By induction, we now assume that the Lemma is true for all natural numbers n′, m′, k′ such that

n′ + m′ + k′ < n + m + k.

We must now show that the Lemma holds for n, m, k with the binary representations

n = Σ_{j=0}^∞ α_j 2^j,  m = Σ_{j=0}^∞ β_j 2^j,  k = Σ_{j=0}^∞ γ_j 2^j.

In the case (1) with n ⊕ m ⊕ k ≠ 0, there must be at least one j such that

α_j ⊕ β_j ⊕ γ_j = 1.

Let J be the largest such index j. Two of the coefficients α_J, β_J, γ_J must be equal and the third one must have value 1. So suppose α_J = β_J and γ_J = 1, for example, which implies

n ⊕ m < k  and  n + m + (n ⊕ m) < n + m + k.

Let k′ = n ⊕ m. We claim that the first player can win by reducing ∗k to ∗k′. Indeed, the induction hypothesis says that the Lemma is true for n, m, k′. Since

n ⊕ m ⊕ k′ = n ⊕ m ⊕ n ⊕ m = 0,

property (2) guarantees a winning strategy for the second player in the reduced nim game ∗n + ∗m + ∗k′. But the latter is the originally first player! So statement (1) is found to be true.

In case (2), when n ⊕ m ⊕ k = 0, the first player must make a move on one of the three piles. Let us say that ∗n is reduced to ∗n′. Because n = m ⊕ k, we have n′ ≠ m ⊕ k and therefore n′ ⊕ m ⊕ k ≠ 0. Because the Lemma is assumed to be true for n′, m, k, statement (1) guarantees a winning strategy for the first player in the reduced game ∗n′ + ∗m + ∗k, which is the originally second player. ⋄

THEOREM 2.5.
For any impartial combinatorial games G and H, one has

G(G + H) = G(G) ⊕ G(H).

Proof.
Let n = G(G), m = G(H) and k = n ⊕ m. Then n ⊕ m ⊕ k = 0 holds. So Lemma 2.2 says that the second player wins ∗n + ∗m + ∗(n ⊕ m), which yields

G + H ≡ ∗n + ∗m ≡ ∗(n ⊕ m).

Consequently, n ⊕ m must be the GRUNDY number of G + H. ⋄

We illustrate Theorem 2.5 with the nim game G = ∗1 + ∗3 + ∗5 + ∗7 of four piles with 1, 3, 5 and 7 objects respectively. The binary representations of the pile sizes are

1 = 0·4 + 0·2 + 1·1
3 = 0·4 + 1·2 + 1·1
5 = 1·4 + 0·2 + 1·1
7 = 1·4 + 1·2 + 1·1.

So the G
RUNDY number of G is

G(∗1 + ∗3 + ∗5 + ∗7) = 1 ⊕ 3 ⊕ 5 ⊕ 7 = 0.

Hence G can be won by the second player in normal play.

EX. 2.14. Suppose that the first player removes some objects from one of the piles in G = ∗1 + ∗3 + ∗5 + ∗7. How should the second player respond?

EX. 2.15. There is a pile of red and another pile of black pebbles. Two players move alternatingly with the following options:

• EITHER: take at least 1 but not more than a given number k_r of the red pebbles,
• OR: take at least 1 but not more than a given number k_b of the black pebbles.

Which of the players has a winning strategy in normal play? (Hint: Compute the GRUNDY numbers for the red and black piles separately (as in Ex. 2.4) and apply Theorem 2.5.)
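By Theorem 2.5, applied repeatedly, the GRUNDY number of a multi-pile nim game is the binary sum ⊕ of its pile sizes, which for machine integers is just bitwise XOR. A minimal sketch (names of my own choosing):

```python
def grundy_nim_sum(piles):
    """GRUNDY number of a multi-pile nim game: the binary sum (XOR) of the
    pile sizes, by Theorem 2.5 applied to sums of single nim games *n."""
    g = 0
    for n in piles:
        g ^= n  # binary addition without carries: (a XOR b) per bit
    return g

print(grundy_nim_sum([1, 3, 5, 7]))  # 0: the second player wins in normal play
print(grundy_nim_sum([1, 3, 5]))     # 7: the first player wins
```

A winning first move in a game with nonzero value reduces some pile so that the XOR of all sizes becomes 0, exactly the strategy extracted from Lemma 2.2.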
CHAPTER 3
Zero-sum Games
Zero-sum games abstract the model of combinatorial games. They arise naturally as LAGRANGE games from mathematical optimization problems and thus furnish an important link between game theory and mathematical optimization theory. In particular, strategic equilibria in games correspond to optimal solutions of optimization problems. Conversely, mathematical optimization techniques are important tools for the analysis of game-theoretic situations.
As in the previous chapter, we consider games Γ with two agents (or players). However, rather than having to compute explicit strategies from scratch, the players are assumed to have sets X and Y of possible strategies (or decisions, or actions etc.) already at their disposal. One player now chooses a strategy x ∈ X and the other player a strategy y ∈ Y. So Γ can be viewed as being played on the system

S = X × Y ∪ {σ0} = {(x, y) | x ∈ X, y ∈ Y} ∪ {σ0}

of all possible joint strategy choices of the players, together with an initial state σ0. Γ is a zero-sum game if there is a function U : X × Y → R that encodes the utility of the strategic choice (x, y) in the sense that the utility values of the individual players add up to zero:

(1) u_1(x, y) = U(x, y) is the gain of the x-player;
(2) u_2(x, y) = −U(x, y) is the gain of the y-player.

So the two players have opposing goals:

(X) The x-player wants to choose x ∈ X so as to maximize U(x, y).
(Y) The y-player wants to choose y ∈ Y so as to minimize U(x, y).

We denote the corresponding zero-sum game by
Γ = Γ(
X, Y, U ) .
EX. 3.1. A combinatorial game with respective strategy sets X and Y for the two players is, in principle, a zero-sum game with the utility function

U(x, y) = { +1 if x is a winning strategy for the x-player
          { −1 if y is a winning strategy for the y-player
          {  0 otherwise.
1. Matrix games
In the case of finite strategy sets, say X = {1, ..., m} and Y = {1, ..., n}, a function U : X × Y → R can be given in matrix form:

U = [ u_11  u_12  ...  u_1n ]
    [ u_21  u_22  ...  u_2n ]
    [  ...   ...  ...   ... ]
    [ u_m1  u_m2  ...  u_mn ]  ∈ R^{m×n}.

Γ = (
X, Y, U) is the game where the x-player chooses a row i and the y-player a column j. This joint selection (i, j) has the utility value u_ij for the row player and the value (−u_ij) for the column player. As an example, consider the game with the utility matrix

(13)  U = [ +1  −1 ]
          [ −1  +1 ].

There is no obvious overall "optimal" choice of strategies. No matter what row i or column j is selected by the players, one of the players will find in hindsight that the other choice would have been more profitable. In this sense, this game has no "solution". Before pursuing this point further, we will introduce a general concept for the solution of a zero-sum game in terms of an equilibrium between both players.
2. Equilibria
Let us assume that both players in the zero-sum game
Γ = (
X, Y, U) are risk-avoiding and want to insure themselves optimally against the worst case. So they consider the worst-case functions

(14)  U(x) = min_{y∈Y} U(x, y) ∈ R ∪ {−∞}
      U(y) = max_{x∈X} U(x, y) ∈ R ∪ {+∞}.

The x-player thus faces the primal problem

(15)  max_{x∈X} U(x) = max_{x∈X} min_{y∈Y} U(x, y),

while the y-player has to solve the dual problem

(16)  min_{y∈Y} U(y) = min_{y∈Y} max_{x∈X} U(x, y).

From the definition, one immediately deduces for any x ∈ X and y ∈ Y the primal-dual inequality:

(17)  U(x) ≤ U(x, y) ≤ U(y).

We say that (x*, y*) ∈ X × Y is an equilibrium of the game Γ if it yields the equality U(x*) = U(y*), i.e., if the primal-dual inequality is, in fact, an equality:

(18)  max_{x∈X} min_{y∈Y} U(x, y) = U(x*, y*) = min_{y∈Y} max_{x∈X} U(x, y).

In the equilibrium (x*, y*), none of the risk-avoiding players has an incentive to deviate from the chosen strategy. In this sense, equilibria represent optimal strategies for risk-avoiding players.

EX. 3.2. Determine the best worst-case strategies for the two players in the matrix game with the matrix U of (13) and show that the game has no equilibrium. Give, furthermore, an example of a matrix game that possesses at least one equilibrium.

If the strategy sets X and Y are finite and hence Γ = (
X, Y, U) is a matrix game, the question whether an equilibrium exists can, in principle, be answered in finite time by a simple procedure:

• Check each strategy pair (x*, y*) ∈ X × Y for the property (18).

If X and Y are infinite, the existence of equilibria can usually only be decided if the function U : X × Y → R has special properties. From a theoretical point of view, the notion of convexity is very helpful and important.
3. Convex zero-sum games
Recall that a convex combination of points x_1, ..., x_k ∈ R^n is a linear combination

x = Σ_{i=1}^k λ_i x_i  with coefficients λ_i ≥ 0 such that Σ_{i=1}^k λ_i = 1.

An important interpretation of x is based on the observation that the coefficient vector λ = (λ_1, ..., λ_k) is a probability distribution:

• If a point x_i is selected from the set {x_1, ..., x_k} with probability λ_i, then the convex combination x has as components exactly the expected component values of the stochastically selected point.

Another way of looking at x is:

• If weights of size λ_i are placed on the points x_i, then x is their center of gravity.

A set X ⊆ R^n is convex if X contains all convex combinations of all possible finite subsets {x_1, ..., x_k} ⊆ X.

EX. 3.3. Let S = {s_1, ..., s_m} be an arbitrary set with m ≥ 1 elements. Show that the set of all probability distributions λ on S forms a compact convex subset of R^m.

A function f : X → R is convex (or convex up) if X is a convex subset of some coordinate space R^n and, for every x_1, ..., x_k ∈ X and probability distribution λ = (λ_1, ..., λ_k), one has

f(λ_1 x_1 + ... + λ_k x_k) ≤ λ_1 f(x_1) + ... + λ_k f(x_k).

f is concave (or convex down) if g = −f is convex (up). With this terminology, we say that the zero-sum game Γ = (
X, Y, U) is convex if

(1) X and Y are non-empty convex strategy sets;
(2) the utility U : X × Y → R is such that
  (a) for every y ∈ Y, the map x ↦ U(x, y) is concave;
  (b) for every x ∈ X, the map y ↦ U(x, y) is convex.

(See also Section 2 of the Appendix for more details.)

The main theorem on general convex zero-sum games guarantees the existence of at least one equilibrium in the case of compact strategy sets:

THEOREM 3.1. A convex zero-sum game Γ = (X, Y, U) with compact strategy sets X and Y and a continuous utility U admits a strategic equilibrium (x*, y*) ∈ X × Y.

Proof. Since X and Y are convex and compact, so is Z = X × Y and hence also Z × Z. Consider the continuous function G : Z × Z → R where

G((x′, y′), (x, y)) = U(x, y′) − U(x′, y).

Since U is concave in the first variable x and (−U) is concave in the second variable y, we find that G is concave in its second variable (x, y). So we deduce from Corollary A.1 in the Appendix the existence of an element (x*, y*) ∈ Z that satisfies for all (x, y) ∈ Z the inequality

G((x*, y*), (x*, y*)) ≥ G((x*, y*), (x, y)) = U(x, y*) − U(x*, y)

and hence U(x, y*) ≤ U(x*, y) for all x ∈ X and all y ∈ Y. This shows that x* is the best strategy for the x-player if the y-player chooses y* ∈ Y. Similarly, y* is optimal against x*. In other words, (x*, y*) is an equilibrium of (X, Y, U). ⋄

Theorem 3.1 has important consequences not only in game theory but also in the theory of mathematical optimization in general, which we will sketch in more detail in Section 4 below. To illustrate the situation, let us first look at the special case of randomizing the strategic decisions in finite zero-sum games.
The utility $U$ of a zero-sum game $\Gamma = (X, Y, U)$ with finite sets $X = \{1, \dots, m\}$ and $Y = \{1, \dots, n\}$ can be described as a matrix $U = [u_{ij}] \in \mathbb{R}^{m \times n}$ with coefficients $u_{ij}$. Such a matrix game $\Gamma$ does not necessarily admit an equilibrium.

Suppose the players randomize the choice of their respective strategies. That is to say, the $x$-player decides on a probability distribution $x$ on $X$ and chooses an $i \in X$ with probability $x_i$. Similarly, the $y$-player chooses a probability distribution $y$ on $Y$ and chooses $j \in Y$ with probability $y_j$. Then the $x$-player's expected gain is
\[ \overline{U}(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} u_{ij}\, x_i y_j. \]
So we arrive at a zero-sum game $\overline{\Gamma} = (\overline{X}, \overline{Y}, \overline{U})$, where $\overline{X}$ is the set of probability distributions on $X$ and $\overline{Y}$ the set of probability distributions on $Y$. $\overline{X}$ and $\overline{Y}$ are compact convex sets (cf. Ex. 3.3). The function $\overline{U}$ is linear, and thus both concave and convex, in both components. It follows that $\overline{\Gamma}$ is a convex game that satisfies the hypothesis of Theorem 3.1 and therefore admits an equilibrium. This proves VON NEUMANN's Theorem:

THEOREM 3.2 (VON NEUMANN). Let $U \in \mathbb{R}^{m \times n}$ be an arbitrary matrix with coefficients $u_{ij}$. Then there exist $x^* \in \overline{X}$ and $y^* \in \overline{Y}$ such that
\[ \max_{x \in \overline{X}} \min_{y \in \overline{Y}} \sum_{ij} u_{ij}\, x_i y_j \;=\; \sum_{ij} u_{ij}\, x^*_i y^*_j \;=\; \min_{y \in \overline{Y}} \max_{x \in \overline{X}} \sum_{ij} u_{ij}\, x_i y_j, \]
where $\overline{X}$ is the set of all probability distributions on $\{1, \dots, m\}$ and $\overline{Y}$ the set of all probability distributions on $\{1, \dots, n\}$. ⋄

While it is generally not easy to compute equilibria in zero-sum games, the task becomes tractable for randomized matrix games. Consider, for example, the two sets $X = \{1, \dots, m\}$ and $Y = \{1, \dots, n\}$ and the utility matrix
\[ U = \begin{bmatrix} u_{11} & u_{12} & \dots & u_{1n} \\ u_{21} & u_{22} & \dots & u_{2n} \\ \vdots & \vdots & & \vdots \\ u_{m1} & u_{m2} & \dots & u_{mn} \end{bmatrix} \in \mathbb{R}^{m \times n}. \]
For the probability distributions $x \in \overline{X}$ and $y \in \overline{Y}$, the expected utility for the $x$-player is
\[ \overline{U}(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} u_{ij}\, x_i y_j = \sum_{j=1}^{n} y_j \Big( \sum_{i=1}^{m} u_{ij} x_i \Big). \]
So the worst case for the $x$-player happens when the $y$-player selects a probability distribution that puts the full weight on a $k \in Y$ such that
\[ \sum_{i=1}^{m} u_{ik} x_i = \min\Big\{ \sum_{i=1}^{m} u_{ij} x_i \;\Big|\; j = 1, \dots, n \Big\} = \underline{U}(x). \]
Hence
(19) \[ \max_{x \in \overline{X}} \underline{U}(x) = \max_{z \in \mathbb{R},\, x \in \overline{X}} \Big\{ z \;\Big|\; z \leq \sum_{i=1}^{m} u_{ij} x_i \text{ for all } j = 1, \dots, n \Big\}. \]
Zur Theorie der Gesellschaftsspiele , Math. Annalen 100 . LAGRANGE GAMES 45
Similarly, the worst case for the $y$-player is attained when the $x$-player puts the full probability weight onto an $\ell \in X$ such that
\[ \sum_{j=1}^{n} u_{\ell j} y_j = \max\Big\{ \sum_{j=1}^{n} u_{ij} y_j \;\Big|\; i = 1, \dots, m \Big\} = \overline{U}(y). \]
This yields
(20) \[ \min_{y \in \overline{Y}} \overline{U}(y) = \min_{w \in \mathbb{R},\, y \in \overline{Y}} \Big\{ w \;\Big|\; w \geq \sum_{j=1}^{n} u_{ij} y_j \text{ for all } i = 1, \dots, m \Big\}. \]
This analysis shows:

PROPOSITION. If $(z^*, x^*)$ is an optimal solution of (19) and $(w^*, y^*)$ an optimal solution of (20), then

(1) $(x^*, y^*)$ is an equilibrium of $\overline{\Gamma} = (\overline{X}, \overline{Y}, \overline{U})$;
(2) $z^* = \max_{x \in \overline{X}} \underline{U}(x) = \min_{y \in \overline{Y}} \overline{U}(y) = w^*$.

REMARK. As further outlined in Section 4.4 below, the optimization problems (19) and (20) are so-called linear programs that are dual to each other. They can be solved very efficiently in practice. For explicit solution algorithms, we refer the interested reader to the standard literature on mathematical optimization.
4. LAGRANGE games

The analysis of zero-sum games is very closely connected with a fundamental technique in mathematical optimization. A very general form of an optimization problem is
\[ \max_{x \in \mathcal{F}} f(x), \]
where $\mathcal{F}$ could be any set and $f : \mathcal{F} \to \mathbb{R}$ an arbitrary objective function. In our context, however, we will look at more concretely specified problems and understand by a mathematical optimization problem a problem of the form
(21) \[ \max_{x \in X} f(x) \quad\text{such that}\quad g(x) \geq 0, \]
where $X$ is a subset of some coordinate space $\mathbb{R}^n$ with an objective function $f : X \to \mathbb{R}$. The vector-valued function $g : X \to \mathbb{R}^m$ is a restriction function and combines $m$ real-valued restriction functions $g_i : X \to \mathbb{R}$ as its components. The set of feasible solutions of (21) is
\[ \mathcal{F} = \{x \in X \mid g_i(x) \geq 0 \text{ for all } i = 1, \dots, m\}. \]

see, e.g., U. FAIGLE, W. KERN and G. STILL, Algorithmic Principles of Mathematical Programming, Springer (2002)

REMARK.
The model (21) formulates an optimization problem as a maximization problem. Of course, minimization problems can also be formulated within this model because of
\[ \min_{x \in \mathcal{F}} f(x) = -\max_{x \in \mathcal{F}} \tilde{f}(x) \]
with the objective function $\tilde{f}(x) = -f(x)$.

The optimization problem (21) gives rise to a zero-sum game $\Lambda = (X, \mathbb{R}^m_+, L)$ with the so-called LAGRANGE function
(22) \[ L(x, y) = f(x) + y^T g(x) = f(x) + \sum_{i=1}^{m} y_i g_i(x) \]
as its utility. We refer to $\Lambda$ as a LAGRANGE game.

the idea goes back to J.-L. LAGRANGE (1736-1813)

EX. 3.4 (Convex LAGRANGE games). If $X$ is convex and the objective function $f : X \to \mathbb{R}$ in (21) as well as the restriction functions $g_i : X \to \mathbb{R}$ are concave, then the LAGRANGE game $\Lambda = (X, \mathbb{R}^m_+, L)$ is a convex zero-sum game. Indeed, $L(x, y)$ is concave in $x$ for every $y \geq 0$ and linear in $y$ for every $x \in X$. Since linear functions are in particular convex, the game $\Lambda$ is convex.

Complementary slackness. The choice of an element $x \in X$ with at least one restriction violation $g_i(x) < 0$ would allow the $y$-player in the LAGRANGE game $\Lambda = (X, \mathbb{R}^m_+, L)$ to increase its utility value arbitrarily by letting $y_i \to \infty$. So the risk-avoiding $x$-player will always try to select a feasible $x$.

On the other hand, if $g_i(x) \geq 0$ holds for all $i$, the best the $y$-player can do is the selection of $y \in \mathbb{R}^m_+$ such that the so-called complementary slackness condition
(23) \[ \sum_{i=1}^{m} y_i g_i(x) = y^T g(x) = 0 \quad\text{and hence}\quad L(x, y) = f(x) \]
is satisfied. Consequently, one finds: The primal LAGRANGE problem is identical with the original problem:
(24) \[ \max_{x \in X} \min_{y \geq 0} L(x, y) = \max_{x \in \mathcal{F}} \underline{L}(x) = \max_{x \in \mathcal{F}} f(x). \]
The dual LAGRANGE worst case function is
(25) \[ \overline{L}(y) = \max_{x \in X}\; f(x) + y^T g(x). \]

LEMMA 3.1. If $(x^*, y^*)$ is an equilibrium of the LAGRANGE game $\Lambda$, then $x^*$ is an optimal solution of problem (21).

Proof. For every feasible $x \in \mathcal{F}$, we have
\[ f(x) \leq \overline{L}(y^*) = \underline{L}(x^*) = f(x^*). \]
So $x^*$ is optimal. ⋄

Lemma 3.1 indicates the importance of being able to identify equilibria in LAGRANGE games. In order to establish necessary conditions, i.e., conditions which candidates for equilibria must satisfy, we impose further assumptions on problem (21):

(1) $X \subseteq \mathbb{R}^n$ is a convex set, i.e., $X$ contains with every $x, x'$ also the whole line segment
\[ [x, x'] = \{x + \lambda(x' - x) \mid 0 \leq \lambda \leq 1\}. \]
(2) The functions $f$ and $g_i$ in (21) have continuous partial derivatives $\partial f(x)/\partial x_j$ and $\partial g_i(x)/\partial x_j$ for all $j = 1, \dots, n$.

It follows that also the partial derivatives of the LAGRANGE function $L$ exist. So the marginal change of $L$ into the direction $d$ of the $x$-variables is
\[ \nabla_x L(x, y)\, d = \nabla f(x)\, d + \sum_{i=1}^{m} y_i \nabla g_i(x)\, d = \sum_{j=1}^{n} \frac{\partial f(x)}{\partial x_j} d_j + \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\partial g_i(x)}{\partial x_j} y_i d_j. \]

REMARK (JACOBI matrix). The $(m \times n)$-matrix $Dg(x)$ having as coefficients
\[ Dg(x)_{ij} = \frac{\partial g_i(x)}{\partial x_j} \]
the partial derivatives of a function $g : \mathbb{R}^n \to \mathbb{R}^m$ is known as a functional matrix or JACOBI matrix. It allows a compact matrix notation for the marginal change of the LAGRANGE function:
\[ \nabla_x L(x, y)\, d = \big(\nabla f(x) + y^T Dg(x)\big)\, d. \]

LEMMA 3.2.
The pair $(x, y) \in X \times \mathbb{R}^m_+$ cannot be an equilibrium of the LAGRANGE game $\Lambda$ unless:

(K1) $g(x) \geq 0$, i.e., $x$ is feasible.
(K2) $y^T g(x) = 0$.
(K3) $\nabla_x L(x, y)\, d \leq 0$ holds for all $d$ such that $x + d \in X$.

Proof. We already know that the feasibility condition (K1) and the complementary slackness condition (K2) are necessarily satisfied by an equilibrium. If (K3) were violated and $\nabla_x L(x, y)\, d > 0$ were true, the $x$-player could improve the $L$-value by moving a bit into direction $d$. This would contradict the definition of an "equilibrium". ⋄

REMARK.
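As a quick numerical illustration of the three conditions of Lemma 3.2, consider a toy concave problem where they can be verified directly; the objective, the restriction and the candidate point below are invented for this sketch.

```python
# Numerical check of the KKT-conditions on a toy convex Lagrange game:
#   maximize f(x) = -(x-2)^2   subject to  g(x) = 1 - x >= 0,  X = R.
# All data (f, g, and the candidate pair) are illustrative assumptions.

def f(x): return -(x - 2) ** 2
def g(x): return 1 - x
def L(x, y): return f(x) + y * g(x)          # Lagrange function (22)
def grad_x_L(x, y): return -2 * (x - 2) - y  # d/dx L(x, y)

x_star, y_star = 1.0, 2.0  # candidate equilibrium

# feasibility, complementary slackness, stationarity over X = R
assert g(x_star) >= 0
assert y_star * g(x_star) == 0
assert grad_x_L(x_star, y_star) == 0  # so grad * d <= 0 for every direction d

# Equilibrium property: x* maximizes L(., y*) and y* minimizes L(x*, .)
assert all(L(x / 10, y_star) <= L(x_star, y_star) + 1e-12 for x in range(-50, 50))
assert all(L(x_star, y / 10) >= L(x_star, y_star) - 1e-12 for y in range(0, 100))
print(L(x_star, y_star))  # -> -1.0
```

Since the toy problem is concave, the check above is also sufficient for optimality of $x^* = 1$, in line with the sufficiency result for convex games below.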
The three conditions of Lemma 3.2 are the so-called KKT-conditions. Although they are always necessary, they are not always sufficient to conclude that a candidate $(x, y)$ is indeed an equilibrium.

The optimization problem
(26) \[ \max_{x \in \mathbb{R}^n_+} f(x) \quad\text{s.t.}\quad a_1(x) \leq b_1, \;\dots,\; a_m(x) \leq b_m \]
is of type (21) with the $m$ restriction functions $g_i(x) = b_i - a_i(x)$ and has the LAGRANGE function
\[ L(x, y) = f(x) + \sum_{i=1}^{m} y_i (b_i - a_i(x)) = f(x) - \sum_{i=1}^{m} y_i a_i(x) + \sum_{i=1}^{m} y_i b_i. \]

C.G. JACOBI (1804-1851)
named after the mathematicians KARUSH, KUHN and TUCKER.

For an intuitive interpretation of the problem (26), think of the data vector $x = (x_1, \dots, x_n)$ as a plan for $n$ products to be manufactured in quantities $x_j$ and of $f(x)$ as the market value of $x$. Assume that $x$ requires the use of $m$ materials in respective quantities $a_i(x)$, for $i = 1, \dots, m$, and that the $b_i$ are the quantities of the materials already in the possession of the manufacturer.

If the $y_i$ represent the market prices (per unit) of the $m$ materials, $L(x, y)$ is the market value of the production $x$ plus the value of the materials left in stock after the production of $x$. The manufacturer would, of course, like to have that value as high as possible. "The market" is an opponent of the manufacturer and looks at the value
\[ -L(x, y) = \sum_{i=1}^{m} y_i (a_i(x) - b_i) - f(x), \]
which is the value of the materials the manufacturer must still buy on the market for the production of $x$ minus the value of the production that the market would have to pay to the manufacturer for the production $x$. The market would like to set the prices $y_i$ so that $-L(x, y)$ is as large as possible. Hence:

• The manufacturer and the market play a LAGRANGE game $\Lambda$.
• An equilibrium $(x^*, y^*)$ of $\Lambda$ reflects an economic balance: neither the manufacturer nor the market has a guaranteed way to improve their value by changing the production plan or by setting different prices.

In this sense, the production plan $x^*$ is optimal. The (from the market point of view) optimal prices $y^*_1, \dots, y^*_m$ are the so-called shadow prices of the $m$ materials.

The complementary slackness condition (K2) says that a material which is in stock but not completely used by $x^*$ has zero market value:
\[ a_i(x^*) < b_i \;\Longrightarrow\; y^*_i = 0. \]
The condition (K2) also implies that $x^*$ is a production plan of optimal value
\[ f(x^*) = L(x^*, y^*) \]
under the given restrictions. Moreover, one has
\[ \sum_{i=1}^{m} y^*_i a_i(x^*) = \sum_{i=1}^{m} y^*_i b_i, \]
which says that the price of the materials used for the production $x^*$ equals the value of the inventory under the shadow prices $y^*_i$.

Property (K3) says that the marginal change $\nabla_x L(x^*, y^*)\, d$ of the manufacturer's value $L$ is non-positive in any feasible production modification from $x^*$ to $x^* + d$ and only profitable for the market because
\[ \nabla_x (-L(x^*, y^*)) = -\nabla_x L(x^*, y^*). \]

We will return to production games in the context of cooperative game theory in Section 1.3.
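The shadow price interpretation can be checked by hand on a miniature production problem; all data below (market values, material consumption, stock) are invented for illustration, and the equality of the primal and dual objective values certifies the equilibrium.

```python
# Shadow prices in a tiny production LP (all data invented for illustration):
#   maximize f(x) = 3*x1 + 2*x2          (market value of the production)
#   subject to x1 + x2 <= 4 (material 1),  x1 <= 2 (material 2),  x >= 0.
# Claim: x* = (2, 2) together with shadow prices y* = (2, 1) is an equilibrium.

c = [3, 2]
A = [[1, 1],
     [1, 0]]
b = [4, 2]
x_star = [2, 2]
y_star = [2, 1]

primal = sum(ci * xi for ci, xi in zip(c, x_star))            # c^T x*
dual = sum(bi * yi for bi, yi in zip(b, y_star))              # b^T y*
slack = [bi - sum(aij * xj for aij, xj in zip(row, x_star))
         for row, bi in zip(A, b)]                            # g(x*) = b - A x*

# feasibility and complementary slackness: y_i * g_i(x*) = 0
assert all(s >= 0 for s in slack)
assert all(yi * si == 0 for yi, si in zip(y_star, slack))
# dual feasibility: y^T A >= c^T (material prices cover marginal product values)
assert all(sum(y_star[i] * A[i][j] for i in range(2)) >= c[j] for j in range(2))

# Equal objective values certify optimality of both the plan and the prices.
print(primal, dual)  # -> 10 10
```

Here both materials are fully used, so both shadow prices are positive; shrinking either stock by one unit would lower the optimal market value by exactly the corresponding price.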
Convex LAGRANGE games. Remarkably, the KKT-conditions turn out to be not only necessary but also sufficient for the characterization of equilibria in convex LAGRANGE games with differentiable objective functions. This gives a way to compute such equilibria and hence to solve optimization problems of type (21) in practice:

• Find a solution $(x^*, y^*) \in X \times \mathbb{R}^m_+$ for the KKT-inequalities. $(x^*, y^*)$ will yield an equilibrium in $\Lambda = (X, \mathbb{R}^m_+, L)$ and $x^*$ will be an optimal solution for (21).

it is not our current purpose to investigate in detail further computational aspects, which can be found in the established literature on mathematical programming.

THEOREM 3.3. A pair $(x^*, y^*) \in X \times \mathbb{R}^m_+$ is an equilibrium of the convex LAGRANGE game $\Lambda = (X, \mathbb{R}^m_+, L)$ if and only if $(x^*, y^*)$ satisfies the KKT-conditions.

Proof. From Lemma 3.2, we know that the KKT-conditions are necessary. To show sufficiency, assume that $(x^*, y^*) \in X \times \mathbb{R}^m_+$ satisfies the KKT-conditions. We must demonstrate that $(x^*, y^*)$ is an equilibrium of the LAGRANGE game $\Lambda = (X, \mathbb{R}^m_+, L)$, i.e., satisfies
(27) \[ \max_{x \in X} L(x, y^*) = L(x^*, y^*) = \min_{y \geq 0} L(x^*, y) \]
for $L(x, y) = f(x) + y^T g(x)$. Since $x \mapsto L(x, y)$ is concave for every $y \geq 0$, we find for every $x \in X$,
\[ L(x, y^*) \;\leq\; L(x^*, y^*) + \nabla_x L(x^*, y^*)(x - x^*) \;\leq\; L(x^*, y^*) \]
because (K3) guarantees $\nabla_x L(x^*, y^*)(x - x^*) \leq 0$. So the first equality in (27) follows. From (K1) and (K2), we have $g(x^*) \geq 0$ and $(y^*)^T g(x^*) = 0$ and therefore deduce the second equality:
\[ \min_{y \geq 0} L(x^*, y) = f(x^*) + \min_{y \geq 0} y^T g(x^*) = f(x^*) + 0 = f(x^*) + (y^*)^T g(x^*) = L(x^*, y^*). \] ⋄

A linear program (LP) is an optimization problem of the form
(28) \[ \max_{x \in \mathbb{R}^n_+} c^T x \quad\text{s.t.}\quad Ax \leq b, \]
where $c \in \mathbb{R}^n$ is an $n$-dimensional coefficient vector, $A \in \mathbb{R}^{m \times n}$ a matrix and $b \in \mathbb{R}^m$ an $m$-dimensional coefficient vector. The problem type (28) is a special case of (21) with the parameters

(1) $X = \mathbb{R}^n_+$,
(2) $f(x) = c^T x = \sum_{j=1}^{n} c_j x_j$,
(3) $g(x) = b - Ax$,

and the LAGRANGE function
\[ L(x, y) = c^T x + y^T (b - Ax) = y^T b + (c^T - y^T A)x, \]
which is linear, and hence both concave and convex, in each of the variables $x$ and $y$. The worst-case functions are
\[ \underline{L}(x) = \begin{cases} c^T x & \text{if } Ax \leq b \\ -\infty & \text{otherwise,} \end{cases} \qquad \overline{L}(y) = \begin{cases} y^T b & \text{if } y^T A \geq c^T \\ +\infty & \text{otherwise.} \end{cases} \]
To mark this special case, we refer to a LAGRANGE game relative to the linear program (28) as an
LP-game and denote it by $LP(c; A, b)$.

As we already know, the problem of maximizing $\underline{L}(x)$ corresponds to the original problem (28). The dual problem of minimizing $\overline{L}(y)$ corresponds to the optimization problem
(29) \[ \min_{y \in \mathbb{R}^m_+} b^T y \quad\text{s.t.}\quad A^T y \geq c. \]

EX. 3.5. Formulate a linear program of type (28) which is equivalent to the optimization problem (29).

EX. 3.6. Formulate the problem of finding an optimal strategy in a randomized matrix game $(\overline{X}, \overline{Y}, \overline{U})$ as a linear program of the form (28), i.e., find a suitable matrix $A$ and coefficient vectors $c$ and $b$.

We know from Theorem 3.3 that equilibria of LP-games can be computed as solutions of the KKT-conditions. As to their existence, the fundamental Theorem 3.4 provides a compactness-free characterization:

THEOREM 3.4. The LP-game $LP(c; A, b)$ has an equilibrium if and only if both problems (28) and (29) have feasible solutions.

LP-games are not only interesting as zero-sum games in their own right. In the theory of cooperative games with possibly more than two players (see Chapter 7), linear programming is a structurally analytical tool. Linear programming problems are particularly important in applications because they can be solved efficiently. We do not go into algorithmic details here but refer to the standard mathematical optimization literature. (See also Section 4 in the Appendix.)

e.g., U. FAIGLE, W. KERN and G. STILL, Algorithmic Principles of Mathematical Programming, Springer, 2002
CHAPTER 4

Investing and Betting
The opponent of a gambler is usually a player with no specific optimization goal. The opponent's strategy choices are determined by chance. Therefore, the gambler will have to decide on strategies with good expected returns. Information plays an important role in the quest for the best decision. Hence the problem of how to model information exchange and common knowledge among (possibly more than two) players deserves to be addressed as well.

Assume that an investor (or bettor or gambler or simply player) is considering a financial engagement in a certain venture. Then the obvious, albeit rather vague, big question for the investor is:

• What decision should best be taken?

More specifically, the investor wants to decide whether an engagement is worthwhile at all and, if so, how much money should be invested, and how. Obviously, the answer depends a lot on additional information: What is the likelihood of a success? What gain can be expected? What is the risk of a loss? Etc.

The investor is thus about to participate as a player in a 2-person game with an opponent whose strategies and objective are not always clear or known in advance. Relevant information is not completely (or reliably) available to the investor, so that the decision must be made under uncertainty. Typical examples are gambling and betting, where the success of the engagement depends on events that may or may not occur and hence on "fortune" or "chance". But also investments in the stock market fall into this category when it is not clear in advance whether the value of a particular investment will go up or down.

We will not be able to answer the big question above completely but discuss various aspects of it. Before going into further details, let us illustrate the difficulties of the subject with a classical, and seemingly paradoxical, gambling situation.
The St. Petersburg paradox. Imagine yourself as a potential player in the following game of chance.

EX. 4.1 (St. Petersburg game). A coin (with faces "H" and "T") is tossed repeatedly until "H" shows. If this happens at the $n$th toss, a participating player will receive $\alpha_n = 2^n$ euros. There is a participation fee of $a$ euros, however. So the net gain of the player is
\[ \alpha_n - a = 2^n - a \]
if the game stops at the $n$th toss. At what fee $a$ would a participation in the game be attractive?

Assuming a fair coin in the St. Petersburg game, the probability to go through more than $n$ tosses (and hence to have the first $n$ results as "T") is
\[ q_n = \Big(\frac{1}{2}\Big)^n = \frac{1}{2^n} \;\to\; 0 \quad (n \to \infty). \]
So the game ends almost certainly after a finite number of tosses. The expected return to a participant is nevertheless infinite (note that the game stops exactly at the $n$th toss with probability $2^{-n} = q_n$):
\[ E_P = \sum_{n=1}^{\infty} 2^n q_n = \frac{2}{2} + \frac{2^2}{2^2} + \dots + \frac{2^n}{2^n} + \dots = +\infty, \]
which might suggest that a player should be willing to pay any finite amount $a$ for being allowed into the game. In practice, however, this could be a risky venture (see Ex. 4.2).

EX. 4.2. Show that the probability of receiving a return of $100$ euros or more in the St. Petersburg game is less than $2\%$. So a participation fee of $a = 100$ euros or more appears to be not attractive because it will not be recovered with a probability of more than $98\%$.

Paradoxically, when we evaluate the utility of the return $2^n$ not directly but by its logarithm $\log_2 2^n = n$, the St. Petersburg payoff has a finite utility expectation:
\[ G_P = \frac{\log_2 2}{2} + \frac{\log_2 4}{4} + \dots + \frac{\log_2 2^n}{2^n} + \dots = \sum_{n=1}^{\infty} \frac{n}{2^n} = 2. \]
It suggests that one should expect a utility value of $2$ and hence a return of $2^2 = 4$ euros.

REMARK. The logarithm function as a measure for the utility value of a financial gain was introduced by BERNOULLI in his analysis of the St. Petersburg game. This concave function plays an important role in our analysis as well. Whether one uses $\log_2 x$, the logarithm to base $2$, or the natural logarithm $\ln x$, does not make any essential difference, since the two functions differ just by a scaling factor:
\[ \ln x = (\ln 2) \cdot \log_2 x. \]
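A small Monte-Carlo sketch may make the contrast vivid: the sample mean of the raw St. Petersburg payoff is erratic (the expectation is infinite), while the sample mean of the $\log_2$-utility settles near the theoretical value $2$. Sample size and seed below are arbitrary choices.

```python
# Monte-Carlo look at the St. Petersburg game: the mean log2-utility of the
# payoff concentrates near sum n/2^n = 2. (Sample size and seed are arbitrary.)
import random

random.seed(7)

def play():
    n = 1
    while random.random() < 0.5:  # "T" with probability 1/2: keep tossing
        n += 1
    return 2 ** n                  # payoff when "H" first shows at toss n

N = 100_000
payoffs = [play() for _ in range(N)]
mean_log2 = sum(p.bit_length() - 1 for p in payoffs) / N  # log2 of 2^n is n
print(round(mean_log2, 1))  # close to the theoretical value 2
```

Printing the plain sample mean of `payoffs` instead would show a value dominated by the few huge payoffs that happen to occur, which is exactly the paradox.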
1. Proportional investing

Our general model consists of a potential investor with an initial portfolio of $B > 0$ euros (or dollars or ...) and an investment opportunity $A$. The investor is to decide what portion $aB$ (with scaling factor $0 \leq a \leq 1$) of $B$ should be invested and considers $k$ possible scenarios $A_1, \dots, A_k$ for its development. The investor believes that one of these scenarios will be realized and furthermore assumes:

(S1) If $A_i$ occurs, then each invested euro returns $\rho_i \geq 0$ euros.
(S2) Scenario $A_i$ occurs with probability $p_i$.

Expected gain.
Under the investor's assumptions, the expected return on every invested euro is
\[ \rho = \rho_1 p_1 + \dots + \rho_k p_k. \]
If the investor's decision is motivated by the maximization of the expected return, the naive investment rule applies:

(NIR) If $\rho > 1$, invest all of $B$ in $A$ and expect the return $B\rho > B$. If $\rho \leq 1$, invest nothing since no proper gain is expected.

In spite of its intuitive appeal, rule (NIR) can be quite risky (see Ex. 4.3).

EX. 4.3. For $k = 2$, assume the return rates $\rho_1 = 0$ and $\rho_2 = 100$. If $p_1 = 0.9$ and $p_2 = 0.1$, the investor expects a tenfold return on the investment:
\[ \rho = 0 \cdot 0.9 + 100 \cdot 0.1 = 10. \]
However, with probability $p_1 = 90\%$, the investment can be expected to result in a total loss.

D. BERNOULLI (1700-1782)
Expected utility. With respect to the logarithmic utility function $\ln x$, the expected utility of an investment of size $aB$ would be
\[ U(a) = \sum_{i=1}^{k} p_i \ln[(1-a)B + \rho_i aB] = \sum_{i=1}^{k} p_i \ln[1 + (\rho_i - 1)a] + \ln B. \]
The derivative of $U(a)$ is
\[ U'(a) = \sum_{i=1}^{k} \frac{p_i(\rho_i - 1)}{1 + (\rho_i - 1)a} = \sum_{i=1}^{k} \frac{p_i}{1/r_i + a} \]
with $r_i = \rho_i - 1$ being the investor's expected surplus over each invested euro in scenario $A_i$. The investment rate $a^*$ with the optimal utility value would have to satisfy $U'(a^*) = 0$ and can thus be computed by solving the equation $U'(a) = 0$.

EX. 4.4. In the situation of Ex. 4.3, one has
\[ U(a) = \frac{9}{10}\ln(1-a) + \frac{1}{10}\ln(1 + 99a) + \ln B \]
with the derivative
\[ U'(a) = \frac{-9}{10(1-a)} + \frac{99}{10(1 + 99a)}. \]
$U'(a) = 0$ implies $a = 1/11$. So the portion $B/11$ of $B$ should be invested in order to maximize the expected utility. The rest $B - B/11 = (10/11)B$ of the portfolio should be retained and not invested.
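Ex. 4.4 can be double-checked numerically by maximizing $U(a)$ over a grid, dropping the constant $\ln B$ term; the grid resolution is an arbitrary choice.

```python
# Numerical check of Ex. 4.4: maximize U(a) = 0.9*ln(1-a) + 0.1*ln(1+99a)
# over 0 <= a < 1 (the constant ln B term is omitted). The maximum should
# occur at a = 1/11. The grid resolution is an arbitrary choice.
import math

def U(a):
    return 0.9 * math.log(1 - a) + 0.1 * math.log(1 + 99 * a)

grid = [k / 100000 for k in range(0, 99999)]
a_best = max(grid, key=U)
print(round(a_best, 4), round(1 / 11, 4))  # -> 0.0909 0.0909
```

The grid maximizer agrees with the closed-form solution of $U'(a) = 0$ to the resolution of the grid.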
2. The fortune formula
We turn to a fundamental question:

• Should one invest in an opportunity $A$ that offers an expected return at the rate of $\rho$ with probability $p$ but also a complete loss (i.e., a zero return) with probability $q = 1 - p$?

A special case of this situation was already encountered in Ex. 4.4. Denoting by $r = \rho - 1$ the expected surplus rate of the investment, the associated expected logarithmic utility in general is
(30) \[ U(a) = q\ln(1-a) + p\ln(1 + ra) + \ln B \]
with the derivative
(31) \[ U'(a) = \frac{-q}{1-a} + \frac{p}{1/r + a}. \]
If a loss is to be expected with positive probability $q > 0$, and the investor decides on a full investment, i.e., chooses $a = 1$, then the utility value $U(1) = -\infty$ must be expected, no matter how big the surplus rate $r$ might be. On the other hand, the choice $a = 0$ of no investment has the utility $U(0) = \ln B$. The investment rate $a^*$ with the optimal utility lies somewhere between these extremes.

LEMMA.
Let $U'(a)$ be as in (31) and $0 < a^* < 1$. Then
\[ U'(a^*) = 0 \;\Longleftrightarrow\; a^* = p - q/r. \]
Proof. (Exercise left to the reader.) ⋄

The investment rate $a^*$ with optimal expected logarithmic utility $U(a^*)$ is thus given by the so-called fortune formula of KELLY:
(32) \[ a^* = p - \frac{q}{r} \qquad\text{if } 0 < p - \frac{q}{r} < 1. \]

Betting one's belief. It is important to keep in mind that the probability $p$ in the fortune formula (32) is the subjective evaluation of an investment success by the investor. The "true" probability is often unknown at the time of the investment. However, if $p$ reflects the investor's best knowledge about the true probability, there is nothing better the investor could do. This truism is known as the investment advice:

Bet your belief!

J.L. KELLY (1956): A new interpretation of information rate, The Bell System Technical Journal.
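A sketch of the fortune formula at work, reusing the data of Ex. 4.4 ($p = 0.1$, $r = 99$) and comparing the simulated long-run log-growth of Kelly betting against an over-aggressive rate; the seed, round count and rival rate are arbitrary choices.

```python
# The fortune formula (32): a* = p - q/r. With p = 0.1, q = 0.9, r = 99
# (the data of Ex. 4.4) it gives a* = 1/11. A simulation contrasts the
# average log-growth per round of Kelly betting with an over-aggressive
# rate a = 0.5. (Seed, round count and the rival rate are arbitrary.)
import math
import random

p, r = 0.1, 99.0
q = 1 - p
a_star = p - q / r                 # the Kelly rate, here 1/11

random.seed(1)

def log_growth(a, rounds=10000):
    # capital factor (1 - a) on a loss, (1 + r*a) on a win
    total = 0.0
    for _ in range(rounds):
        total += math.log(1 + r * a) if random.random() < p else math.log(1 - a)
    return total / rounds          # average log-growth per round

print(log_growth(a_star) > 0, log_growth(0.5) < 0)  # -> True True
```

The Kelly rate grows the portfolio on average, while betting half the capital each round shrinks it in the long run even though the bet has a large expected surplus.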
3. Fair odds
An investment into an opportunity $A$ offering a return of $\rho \geq 0$ euros per euro invested with a certain probability $\Pr(A)$, or returning nothing (with probability $1 - \Pr(A)$), is called a bet on $A$. The investor is then a bettor (or a gambler) and the return $\rho$ is the payoff. The payoff is assumed to be guaranteed by a bookmaker (or bank). The payoff rate is also denoted by $\rho$ and known as the odds of the bet. The expected gain (per euro) of the gambler is
\[ E = \rho \Pr(A) + (-1)(1 - \Pr(A)) = (\rho + 1)\Pr(A) - 1. \]
So $(-E)$ is the expected gain of the bookmaker. The odds $\rho$ are considered to be fair if the gambler and the bookmaker have the same expected gain, i.e., if $E = -E$ and hence $E = 0$ holds. In other words:
\[ \rho \text{ is fair} \;\Longleftrightarrow\; \rho = \frac{1 - \Pr(A)}{\Pr(A)}. \]
If the true probability $\Pr(A)$ is not known to the bettor, it needs to be estimated. Suppose the bettor's estimate for $\Pr(A)$ is $p$. Then the bet appears (subjectively) advantageous if and only if
(33) \[ E(p) > 0, \quad\text{i.e., if}\quad \rho + 1 > 1/p. \]
The bettor will consider the odds $\rho$ as fair if $E(p) = 0$ and hence $\rho + 1 = 1/p$. In the case $E(p) < 0$, of course, the bettor would not expect a gain but a loss on the bet, on the basis of the information that has led to the probability estimate $p$ for $\Pr(A)$. Let us look at some examples.

EX. 4.5 (DE MÉRÉ's game). Let $A$ be the event that no "6" shows if a single $6$-sided die is rolled four times. Suppose the odds $\rho = 1$ are offered on $A$. If the gambler considers all results as equally likely, the gambler's estimate of the probability for $A$ is
\[ p = \frac{5^4}{6^4} = \frac{625}{1296} \approx 0.48 < \frac{1}{2}, \]
because there are $6^4 = 1296$ possible result sequences on four rolls of the die, of which $5^4 = 625$ correspond to $A$. So the player should expect a negative return:
\[ E(p) = (\rho + 1)p - 1 = 2p - 1 < 0. \]
In contrast, let $\tilde{A}$ be the event that no double $6$ shows if a pair of dice is rolled $24$ times. Now the prospective gambler estimates $\Pr(\tilde{A})$ as
\[ \tilde{p} = (35/36)^{24} \approx 0.51 > \frac{1}{2}. \]
Consequently, the odds $\rho = 1$ on $\tilde{A}$ would let the gambler expect a proper gain:
\[ \tilde{E} = 2\tilde{p} - 1 > 0. \]

mentioned to B. PASCAL (1623-1662)

EX. 4.6 (Roulette). Let $W = \{0, 1, 2, \dots, 36\}$ represent a roulette wheel and assume that $0 \in W$ is colored green while eighteen numbers in $W$ are red and the remaining eighteen numbers black. Assume that a number $X \in W$ is randomly determined by spinning the wheel and allowing a ball to come to rest at one of these numbers.

(a) Fix $w \in W$ and the odds $\rho = 35$ on the event $A_w = \{X = w\}$. Should a gambler expect a positive return when placing a bet on $A_w$?
(b) Suppose the bank offers the odds $\rho = 1$ on the event $R = \{X = \text{red}\}$. Should a gambler consider these odds on $R$ to be fair?

For the game of roulette (see Ex. 4.6) and for similar betting games with odds $\rho = 1$, a popular wisdom recommends repeated betting according to the following strategy:

(R) Bet the amount $1$ on $R = \{X = \text{red}\}$. If $R$ does not occur, continue with the double amount $2$ on $R$. If $R$ does not show, double again and bet $4$ on $R$, and so on, until the event $R$ happens.

Once $R$ shows, one has a net gain of $1$ on the original investment of size $1$ (see Ex. 4.7). The probability for $R$ not to happen in one spin is $19/37$. So the probability of seeing red in one of the first $n$ spins of an equally balanced roulette wheel is high:
\[ 1 - (19/37)^n \;\to\; 1 \quad (n \to \infty). \]
Hence:
Strategy (R) achieves a net gain of $1$ with high probability.

I have learned strategy (R) myself as a youth from my uncle Max.
Paradoxically(?), the expected net gain for betting any amount $x > 0$ on the event $R$ is always strictly negative, however:
\[ E_R = 2x\Big(\frac{18}{37}\Big) - x = -\frac{x}{37} < 0. \]

EX. 4.7. Show for the game of roulette with a well-balanced wheel:

(1) If $\{X = \text{red}\}$ shows on the fifth spin of the wheel only, strategy (R) has lost a total of $1 + 2 + 4 + 8 = 15$ on the first $4$ spins. However, having invested $16$ more and then winning $2 \cdot 16 = 32$ on the fifth spin yields the overall net return $32 - (15 + 16) = 1$.
(2) The probability for $\{X = \text{red}\}$ to happen on one of the first $5$ spins is more than $96\%$.

COMMENT. The problem with strategy (R) is its risk management. A player has only a limited amount of money available in practice. If the player wants to limit the risk of a loss to $B$ euros, then the number of iterations in the betting sequence is limited to at most $k$, where
\[ 2^k - 1 \leq B < 2^{k+1} - 1 \quad\text{and hence}\quad k = \lfloor \log_2(B + 1) \rfloor. \]
Consequently:

• The available budget $B$ is lost with probability $(19/37)^k$.
• The portfolio grows to $B + 1$ with probability $1 - (19/37)^k$.
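The risk profile described in the comment above can be reproduced by simulation; the budget $B = 31$ (allowing $k = 5$ doublings) and the number of trials are arbitrary choices for this sketch.

```python
# Simulation of the doubling strategy (R) on red with a limited budget:
# with k = floor(log2(B+1)) doublings available, the gambler wins 1 with
# probability 1 - (19/37)^k and otherwise loses the stakes placed so far.
# (Budget B = 31 and the trial count are arbitrary choices.)
import random

random.seed(42)
P_RED = 18 / 37

def play_R(budget):
    stake, spent = 1, 0
    while spent + stake <= budget:
        if random.random() < P_RED:
            return 1              # net gain 1 once red shows
        spent += stake
        stake *= 2                # double after every loss
    return -spent                 # doubling no longer affordable: total loss

N = 200_000
results = [play_R(31) for _ in range(N)]   # budget 31 allows k = 5 bets
win_rate = sum(1 for g in results if g == 1) / N
mean_gain = sum(results) / N
print(round(win_rate, 3), mean_gain < 0)
```

The empirical win rate lands near $1 - (19/37)^5 \approx 0.964$, yet the average gain is negative: the rare total losses outweigh the frequent unit gains.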
4. Betting on alternatives
Consider $k$ mutually exclusive events $A_0, \dots, A_{k-1}$ of which one will occur and a bank that offers the odds $\rho_i$ on the $k$ events $A_i$, which means:

(1) The bank offers a scenario with $1/\rho_i$ being the probability for $A_i$ to occur.
(2) The bank guarantees a payoff of $\rho_i$ euros for each euro bet on $A_i$ if the event $A_i$ occurs.

Suppose a gambler estimates that the events $A_i$ occur with probabilities $p_i > 0$ and decides to invest the capital $B = 1$ fully. Under this condition, a (betting) strategy is a $k$-tuple $a = (a_0, a_1, \dots, a_{k-1})$ of numbers $a_i \geq 0$ such that
\[ a_0 + a_1 + \dots + a_{k-1} = 1, \]
with the interpretation that the portion $a_i$ of the capital will be bet onto the occurrence of event $A_i$ for $i = 0, 1, \dots, k-1$. So the gambler's expected logarithmic utility of strategy $a$ is
\[ U(a, p) = \sum_{i=0}^{k-1} p_i \ln(a_i \rho_i) = \sum_{i=0}^{k-1} p_i \ln a_i + \sum_{i=0}^{k-1} p_i \ln \rho_i. \]
Notice that $p = (p_0, p_1, \dots, p_{k-1})$ is a strategy in its own right and that the second sum term in the expression for $U(a, p)$ does not depend on the choice of $a$. So only the first sum term is of interest when one searches a strategy with optimal expected utility.

THEOREM 4.1. Let $p = (p_0, p_1, \dots, p_{k-1})$ be the gambler's probability assessment. Then:
\[ U(a, p) < U(p, p) \;\Longleftrightarrow\; a \neq p. \]
Consequently, $a^* = p$ is the strategy with the optimal logarithmic utility under the gambler's expectations.

Proof. The function $f(x) = x - 1 - \ln x$ is defined for all $x > 0$. Its derivative
\[ f'(x) = 1 - 1/x \]
is negative for $x < 1$ and positive for $x > 1$. So $f(x)$ is strictly decreasing for $x < 1$ and strictly increasing for $x > 1$ with the unique minimum $f(1) = 0$. This yields BERNOULLI's inequality
(34) \[ \ln x \leq x - 1 \qquad\text{and}\qquad \ln x = x - 1 \;\Longleftrightarrow\; x = 1. \]
Applying the BERNOULLI inequality, we find
\[ U(a, p) - U(p, p) = \sum_{i=0}^{k-1} p_i \ln a_i - \sum_{i=0}^{k-1} p_i \ln p_i = \sum_{i=0}^{k-1} p_i \ln(a_i/p_i) \leq \sum_{i=0}^{k-1} p_i (a_i/p_i - 1) = \sum_{i=0}^{k-1} a_i - \sum_{i=0}^{k-1} p_i = 0 \]
with equality if and only if $a_i = p_i$ for all $i = 0, 1, \dots, k-1$. ⋄

Theorem 4.1 leads to the betting rule with the optimal expected logarithmic utility:

(BR) For all $i = 0, 1, \dots, k-1$, bet the portion $a_i = p_i$ of the capital $B$ on the event $A_i$.

NOTA BENE. The proportional rule (BR) only depends on the gambler's probability estimate $p$. It is independent of the particular odds $\rho_i$ the bank may offer!

Fair odds.
As in the proof of Theorem 4.1, one sees:
\[ \sum_{i=0}^{k-1} p_i \ln \rho_i = -\sum_{i=0}^{k-1} p_i \ln(1/\rho_i) \;\geq\; -\sum_{i=0}^{k-1} p_i \ln p_i \]
with equality if and only if $\rho_i = 1/p_i$ holds for all $i = 0, 1, \dots, k-1$. It follows that the best odds for the bank (and worst for the gambler) are given by
(35) \[ \rho_i = 1/p_i \qquad (i = 0, 1, \dots, k-1). \]
In this case, the gambler expects the logarithmic utility of the optimal strategy $p$ as
\[ U(p) = \sum_{i=0}^{k-1} p_i \ln p_i - \sum_{i=0}^{k-1} p_i \ln(1/\rho_i) = 0. \]
We understand the odds as in (35) to be fair in the context of betting with alternatives.
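A direct check of Theorem 4.1 and rule (BR) on invented data: for any rival strategy $a \neq p$ the expected log-utility $U(a, p)$ falls below $U(p, p)$, whatever odds the bank posts.

```python
# Theorem 4.1 / rule (BR): betting the proportions a = p maximizes the
# expected log-utility U(a, p) = sum_i p_i * ln(a_i * rho_i), independently
# of the odds. All numbers below are invented for this sketch.
import math

p = [0.5, 0.3, 0.2]          # gambler's probability assessment
rho = [1.8, 3.5, 6.0]        # odds offered by the bank (arbitrary)

def U(a):
    return sum(pi * math.log(ai * ri) for pi, ai, ri in zip(p, a, rho))

rivals = [[0.4, 0.4, 0.2], [0.6, 0.2, 0.2], [1/3, 1/3, 1/3]]
assert all(U(a) < U(p) for a in rivals)   # a* = p beats every rival strategy
print(round(U(p), 4))  # -> -0.0016
```

Changing `rho` shifts every $U(a)$ by the same constant $\sum_i p_i \ln \rho_i$, so the ranking of strategies, and hence the optimality of $a = p$, is unaffected.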
Assume the gambler of the previous sections has observed that, in $n$ consecutive instances of the bet, the event $A_i$ has come up $s_i$ times. So, under the strategy $a$, the original portfolio $B = 1$ would have developed into
$$B_n(a) = (a_0\rho_0)^{s_0}(a_1\rho_1)^{s_1} \cdots (a_{k-1}\rho_{k-1})^{s_{k-1}}$$
with the logarithmic utility
$$U_n(a) = \ln B_n(a) = \sum_{i=0}^{k-1} s_i \ln(a_i\rho_i) = \sum_{i=0}^{k-1} s_i \ln a_i + \sum_{i=0}^{k-1} s_i \ln \rho_i.$$
Based on the observed frequencies $s_i$, the gambler might reasonably estimate the events $A_i$ to occur with probabilities according to the relative frequencies
$$p_i = s_i/n \quad (i = 0, 1, \ldots, k-1).$$
As in the proof of Theorem 4.1, we now find in hindsight:

COROLLARY 4.1. The strategy $a^* = (s_0/n, \ldots, s_{k-1}/n)$ would have led to the maximal logarithmic utility value
$$U_n(a^*) = \sum_{i=0}^{k-1} s_i \ln(s_i/n) + \sum_{i=0}^{k-1} s_i \ln \rho_i$$
and hence to the maximal growth
$$B_n(a^*) = \frac{(s_0\rho_0)^{s_0}(s_1\rho_1)^{s_1} \cdots (s_{k-1}\rho_{k-1})^{s_{k-1}}}{n^n}.$$
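The hindsight statement of Corollary 4.1 can likewise be verified numerically; the observation counts, odds and alternative strategies below are hypothetical:

```python
import math

# Hypothetical record: the event A_i was observed s_i times in n rounds, at odds rho_i.
s = [5, 3, 2]
rho = [2.0, 3.0, 6.0]
n = sum(s)

def growth(a):
    """B_n(a): growth of the portfolio B = 1 under the proportional strategy a."""
    return math.prod((a_i * r_i) ** s_i for a_i, r_i, s_i in zip(a, rho, s))

a_star = [s_i / n for s_i in s]   # the relative frequencies s_i / n
best = growth(a_star)

# B_n(a*) equals prod_i (s_i rho_i)^{s_i} / n^n, and no other strategy grows faster.
closed_form = math.prod((s_i * r_i) ** s_i for r_i, s_i in zip(rho, s)) / n ** n
assert math.isclose(best, closed_form)
for a in ([0.4, 0.4, 0.2], [1/3, 1/3, 1/3], [0.6, 0.3, 0.1]):
    assert growth(a) <= best
```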
5. Betting and information
Assume a betting situation with the $k$ alternatives $A_0, A_1, \ldots, A_{k-1}$ and the odds $\rho_x$ as before. Suppose, however, that the outcome $A_x$ is already established, but that the bettor has no such information before placing the bet. Suppose further that information now arrives through some (human or technical) communication channel $K$, so that the outcome $A_x$ is reported to the bettor (perhaps incorrectly) as $A_y$:
$$x \to K \to y.$$
Having received the ("insider") information "$y$", how should the bettor place the bet? To answer this question, let
$$p(x|y) = \text{probability for the true result to be } x \text{ when } y \text{ is received}.$$
Note that these parameters $p(x|y)$ are typically subjective evaluations of the bettor's trust in the channel $K$.

A betting strategy in this information setting is now a $(k \times k)$-matrix $A$ with coefficients $a(x|y) \ge 0$ which satisfy
$$\sum_{x=0}^{k-1} a(x|y) = 1 \quad\text{for } y = 0, 1, \ldots, k-1.$$
$a(x|y)$ is the fraction of the budget that is bet on the event $A_x$ when $y$ is received. In particular, the bettor's trust matrix $P$ with coefficients $p(x|y)$ is a strategy. For the case that $A_x$ is the true result, one therefore expects the logarithmic utility
$$U_x(A) = \sum_{y=0}^{k-1} p(x|y) \ln[a(x|y)\rho_x] = \sum_{y=0}^{k-1} p(x|y) \ln a(x|y) + \ln \rho_x.$$
As in Corollary 4.1, we find for all $x = 0, 1, \ldots, k-1$:
$$U_x(A) \le U_x(P), \quad\text{with equality if and only if } a(x|y) = p(x|y) \text{ for all } y = 0, 1, \ldots, k-1.$$
So the strategy $P$ is optimal (under the bettor's given trust in $K$) and confirms the betting rule: Bet your belief!

Information transmission.
Let $p = (p_0, \ldots, p_{k-1})$ be the bettor's probability estimates on the $k$ events $A_0, A_1, \ldots, A_{k-1}$ or, equivalently, on the index set $\{0, 1, \ldots, k-1\}$. Then the expected logarithmic utility of strategy $A$ relative to base 2 is:
$$U_2^{(p)}(A) = \sum_{x=0}^{k-1}\sum_{y=0}^{k-1} p_x\, p(x|y) \log_2 a(x|y) + \sum_{x=0}^{k-1} p_x \log_2 \rho_x.$$
NOTA BENE. The probabilities $p_x$ are estimates on the likelihood of the events $A_x$, while the probabilities $p(x|y)$ are estimates on the bettor's trust in the reliability of the communication channel $K$. They are logically not related.

Setting
$$H(X) = -\sum_{x=0}^{k-1} p_x \log_2 p_x, \qquad H(\rho) = -\sum_{x=0}^{k-1} p_x \log_2 \rho_x,$$
$$H(X|Y) = -\sum_{x=0}^{k-1}\sum_{y=0}^{k-1} p_x\, p(x|y) \log_2 p(x|y),$$
we thus have for the optimal strategy $A = P$:
$$U_2^{(p)}(P) = -H(X|Y) - H(\rho) = U^{(p)} + T(X|Y),$$
where $U^{(p)} = -H(X) - H(\rho)$ is the expected (base-2) logarithmic utility of betting without the channel information, and
$$T(X|Y) = H(X) - H(X|Y)$$
is the increase of the bettor's expected logarithmic utility due to the communication via channel $K$.

REMARK.
Given the channel $K$ as above, with transmission probabilities $p(x|y)$ and the distribution $p = (p_0, p_1, \ldots, p_{k-1})$ on the channel inputs $x$, the parameter $T(X|Y)$ is the (information) transmission rate of $K$. Maximizing over all possible input distributions $p$, one obtains the channel capacity $C(K)$ as the smallest upper bound on the achievable transmission rates:
$$C(K) = \sup_p T(X|Y).$$
The parameter $C(K)$ plays an important role in the theory of information and communication in general.

EX. 4.8. A bettor expects the event $A_0$ with a certain probability and the alternative event $A_1$ with the complementary probability. What bet should be placed? Suppose now that an expert tells the bettor that $A_1$ (say) is certain to happen. What bet should the bettor place under the assumption that the expert is believed to be right with a given probability?
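The entropy bookkeeping above can be illustrated with a small numerical sketch. The estimates $p_x$, the odds and the trust matrix $p(x|y)$ below are hypothetical; the identity $U_2^{(p)}(P) = U^{(p)} + T(X|Y)$ is the one derived in the text:

```python
import numpy as np

log2 = np.log2

# Hypothetical example with k = 2 events.
p = np.array([0.7, 0.3])            # bettor's estimates p_x on the events
rho = np.array([1 / 0.7, 1 / 0.3])  # fair odds as in (35)
# trust[x, y] = p(x|y): probability that the true result is x when y is received.
trust = np.array([[0.9, 0.2],
                  [0.1, 0.8]])      # columns sum to 1

H_X = -np.sum(p * log2(p))
H_rho = -np.sum(p * log2(rho))
H_XY = -np.sum(p[:, None] * trust * log2(trust))  # H(X|Y), at the optimal A = P
T = H_X - H_XY                                    # transmission rate of the channel

U_channel = -H_XY - H_rho   # expected utility when betting the belief a(x|y) = p(x|y)
U_plain = -H_X - H_rho      # expected utility without the channel (bet a_x = p_x)

assert np.isclose(U_channel, U_plain + T)  # the channel raises the utility by T(X|Y)
assert T > 0                               # this channel transmits proper information
```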
6. Common knowledge
Having discussed information with respect to betting, let us digress a little and take a more general view on information and knowledge. Given a system $S$, we ask: to what extent does common knowledge in a group of agents influence individual conclusions about the state of $S$? To explain what is meant here, we first discuss a well-known riddle.
C.E. SHANNON (1948): A mathematical theory of communication. Bell System Technical Journal.
Imagine the following situation:
(I) Three girls, $G_1$, $G_2$ and $G_3$, with red hats are sitting in a circle.
(II) They all know that their hats are either red or white.
(III) Each can see the color of all hats except her own.
Now the teacher comes and announces:
(1) There is at least one red hat.
(2) I will start counting slowly. As soon as someone knows the color of her hat, she should raise her hand.
What will happen? Does the teacher provide information that goes beyond the common knowledge the girls already have? After all, each girl sees two red hats, and hence knows that each of the other girls sees at least one red hat as well.

Because of (III), the girls know their hat universe $S$ is in one of the states of possible color distributions:

        σ1  σ2  σ3  σ4  σ5  σ6  σ7  σ8
  G1    R   R   R   W   R   W   W   W
  G2    R   R   W   R   W   R   W   W
  G3    R   W   R   R   W   W   R   W

None of these states can be jointly ruled out. The entropy $H_0$ of their common knowledge is
$$H_0 = \log_2 8 = 3.$$
The teacher's announcement, however, rules out the state $\sigma_8$ and reduces the entropy to
$$H_1 = \log_2 7 < H_0,$$
which means that the teacher has supplied proper additional information.

At the teacher's first count, no girl can be sure about her own hat because none sees two white hats. So no hand is raised, which rules out the states $\sigma_5$, $\sigma_6$ and $\sigma_7$ as possibilities. Denote now by $P_i(\sigma)$ the set of states thought possible by girl $G_i$ when the hat distribution is actually $\sigma$. So we have, for example,
$$P_1(\sigma_2) = \{\sigma_2\}, \quad P_1(\sigma_3) = \{\sigma_3\}, \quad P_2(\sigma_4) = \{\sigma_4\}.$$
Consequently, in each of the states $\sigma_2, \sigma_3, \sigma_4$, at least one girl would raise her hand at the second count and conclude confidently that her hat is red, which would signal the state (and hence the hat distribution) to the other girls. If no hand goes up at the second count, all girls know that they are in state $\sigma_1$ and will raise their hands at the third count.

In contrast, consider the other extreme scenario and assume:
(I') Three girls, $G_1$, $G_2$ and $G_3$, with white hats are sitting in a circle.
(II) They all know that their hats are either red or white.
(III) Each can see the color of all hats except her own.
The effect of the teacher's announcement is quite different:
• Each girl will immediately conclude that her hat is red and raise her hand, because she sees only white hats on the other girls.
This analysis shows:
(i) The information supplied by the teacher is subjective: even when the information ("there is at least one red hat") is false, the girls will eventually conclude with confidence that they know their hat's color.
(ii) When a girl thinks she knows her hat's color, she may nevertheless have arrived at a factually wrong conclusion.

EX. 4.9. Assume an arbitrary distribution of red and white hats among the three girls.
Will the teacher's announcement nevertheless lead the girls to the belief that they know the color of their hats?

An event in the system $S$ is a subset $E \subseteq S$ of states. We say that $E$ occurs when $S$ is in a state $\sigma \in E$. Denoting by $\mathcal{S}$ the collection of all possible events, we think of a function $P: S \to \mathcal{S}$ with the property
$$\sigma \in P(\sigma) \quad\text{for all } \sigma \in S$$
as an information function. $P$ has the interpretation:
• If $S$ is in the state $\sigma$, then $P$ provides the information that the event $P(\sigma)$ has occurred.
Notice that $P$ is not necessarily sharp: any state $\tau \in P(\sigma)$ is a candidate for the true state under the information function $P$.

The information function $P$ defines a knowledge function $K: \mathcal{S} \to \mathcal{S}$ via
$$K(E) = \{\sigma \mid P(\sigma) \subseteq E\}$$
with the interpretation:
• $K(E)$ is the set of states $\sigma \in S$ where $P$ suggests that the event $E$ has certainly occurred.

LEMMA 4.2.
The knowledge function $K$ of the information function $P$ has the properties:
(K.1) $K(S) = S$.
(K.2) $E \subseteq F \implies K(E) \subseteq K(F)$.
(K.3) $K(E \cap F) = K(E) \cap K(F)$.
(K.4) $K(E) \subseteq E$.
Proof. Straightforward exercise, left to the reader. ⋄

Property (K.4) is the so-called reliability axiom: if one knows (under $K$) that $E$ has occurred, then $E$ really has occurred.

EX. 4.10 (Transparency). Verify the transparency axiom
(K.5) $K(K(E)) = K(E)$ for all events $E$.
Interpretation: when one knows with certainty that $E$ has occurred, then one knows with certainty that one considers $E$ as having occurred.

We say that $E$ is evident if $E = K(E)$ is true, which means:
• The knowledge function $K$ considers an evident event $E$ as having occurred if and only if $E$ really has occurred.

EX. 4.11. Show: the set $S$ of all possible states always constitutes an evident event.

EX. 4.12 (Wisdom). Verify the wisdom axiom
(K.6) $S \setminus K(E) = K(S \setminus K(E))$ for all events $E$.
Interpretation: when one does not know with certainty that $E$ has occurred, then one is aware of one's uncertainty.

Consider now a set $N = \{p_1, \ldots, p_n\}$ of $n$ players $p_i$ with respective information functions $P_i$ and knowledge functions $K_i$. We say that the event $E \subseteq S$ is evident for $N$ if $E$ is evident for each of the members of $N$, i.e., if
$$E = K_1(E) = \ldots = K_n(E).$$
More generally, an event $E \subseteq S$ is said to be common knowledge of $N$ in the state $\sigma$ if there is an event $F \subseteq E$ such that $F$ is evident for $N$ and $\sigma \in F$.

PROPOSITION 4.1.
If the event $E \subseteq S$ is common knowledge for the $n$ players $p_i$ with information functions $P_i$ in state $\sigma$, then
$$\sigma \in K_{i_1}(K_{i_2}(\ldots(K_{i_m}(E))\ldots))$$
holds for all sequences $i_1 i_2 \ldots i_m$ of indices $1 \le i_j \le n$.

Proof. If the event $E$ is common knowledge, it comprises an evident event $F \subseteq E$ with $\sigma \in F$. By definition, we have
$$\sigma \in K_{i_1}(K_{i_2}(\ldots(K_{i_m}(F))\ldots)) = F.$$
By property (K.2) of a knowledge function (Lemma 4.2), we thus conclude
$$K_{i_1}(K_{i_2}(\ldots(K_{i_m}(E))\ldots)) \supseteq K_{i_1}(K_{i_2}(\ldots(K_{i_m}(F))\ldots)) = F \ni \sigma. \quad\diamond$$

As an illustration of Proposition 4.1, consider the events
$$K_1(E), \quad K_2(K_1(E)), \quad K_1(K_2(K_1(E))).$$
$K_1(E)$ consists of all the states where player $p_1$ is sure that $E$ has occurred. The set $K_2(K_1(E))$ comprises those states where player $p_2$ is sure that player $p_1$ is sure that $E$ has occurred. In $K_1(K_2(K_1(E)))$ are all the states where player $p_1$ is certain that player $p_2$ is sure that player $p_1$ believes that $E$ has occurred. And so on.
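The knowledge operator and the properties (K.1)–(K.4) of Lemma 4.2 can be checked exhaustively on the hat riddle, where each girl's information function is determined by what she sees. The encoding below (states as triples of 'R'/'W') is one possible choice:

```python
from itertools import combinations, product

S = set(product("RW", repeat=3))   # the 8 hat states of the riddle

def P(i, sigma):
    """Information function of girl G_{i+1}: she sees every hat except her own."""
    return {tau for tau in S if all(tau[j] == sigma[j] for j in range(3) if j != i)}

def K(i, E):
    """Knowledge function: K_i(E) = { sigma in S : P_i(sigma) is a subset of E }."""
    return {sigma for sigma in S if P(i, sigma) <= E}

# Check (K.1)-(K.4) of Lemma 4.2 over all 2^8 events, for girl G_1.
events = [frozenset(c) for r in range(9) for c in combinations(sorted(S), r)]
know = {E: frozenset(K(0, E)) for E in events}

assert know[frozenset(S)] == S                                # (K.1)
for E in events:
    assert know[E] <= E                                       # (K.4)
    for F in events:
        if E <= F:
            assert know[E] <= know[F]                         # (K.2)
        assert know[frozenset(E & F)] == know[E] & know[F]    # (K.3)
```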
Let $p_1$ and $p_2$ be two players with information functions $P_1$ and $P_2$ relative to a finite system $S$, and assume:
• Both players have the same probability estimates $\Pr(E)$ on the occurrence of events $E \subseteq S$.
We turn to the question:
• Can there be common knowledge among the two players in a certain state $\sigma^*$ that they differ in their estimate on the likelihood of an event $E$ having occurred?
Surprisingly(?), the answer can be "yes", as Ex. 4.13 shows. For the analysis in the example, recall that the conditional probability of an event $E$, given the event $A$, is
$$\Pr(E|A) = \begin{cases} \Pr(E \cap A)/\Pr(A) & \text{if } \Pr(A) > 0 \\ 0 & \text{if } \Pr(A) = 0. \end{cases}$$

EX. 4.13. Let $S = \{\sigma_1, \sigma_2\}$ and assume $\Pr(\sigma_1) = \Pr(\sigma_2) = 1/2$. Consider the information functions
$$P_1(\sigma_1) = \{\sigma_1\} \text{ and } P_1(\sigma_2) = \{\sigma_2\}, \qquad P_2(\sigma_1) = \{\sigma_1, \sigma_2\} = P_2(\sigma_2).$$
For the event $E = \{\sigma_1\}$, one finds
$$\Pr(E|P_1(\sigma_1)) = 1 \text{ and } \Pr(E|P_1(\sigma_2)) = 0, \qquad \Pr(E|P_2(\sigma_1)) = 1/2 = \Pr(E|P_2(\sigma_2)).$$
The ground set $S = \{\sigma_1, \sigma_2\}$ corresponds to the event "the two players differ in their estimates on the likelihood that $E$ has occurred". $S$ is (trivially) common knowledge in each of the two states $\sigma_1, \sigma_2$.

For a large class of information functions, however, our initial question has the guaranteed answer "no". For example, let us call an information function $P$ strict if
• every evident event $E$ is a union of pairwise disjoint sets $P(\sigma)$.

PROPOSITION 4.2.
Assume that both information functions $P_1$ and $P_2$ are strict, and let $E \subseteq S$ be an arbitrary event. Then there is no state $\sigma^*$ in which it could be common knowledge of the players that their likelihood estimates $\eta_1$ resp. $\eta_2$ on the occurrence of $E$ are different.

Proof. Consider the events
$$E_i = \{\sigma \mid \Pr(E|P_i(\sigma)) = \eta_i\} \quad (i = 1, 2).$$
The event $E_1 \cap E_2$ is then the event that player $p_1$ estimates the probability for the occurrence of $E$ with $\eta_1$ while player $p_2$'s estimate is $\eta_2$. Suppose $E_1 \cap E_2$ is common knowledge in state $\sigma^*$, i.e., there exists an event $F \subseteq E_1 \cap E_2$ such that $\sigma^* \in F$ and
$$K_1(F) = F = K_2(F).$$
Because the information function $P_1$ is strict, $F$ is the union of pairwise disjoint sets $P_1(\sigma_1), \ldots, P_1(\sigma_k)$, say. Because of $F \subseteq E_1 \cap E_2$, one has
$$\Pr(E|P_1(\sigma_1)) = \ldots = \Pr(E|P_1(\sigma_k)) = \eta_1.$$
Taking Ex. 4.14 into account, we therefore find
$$\Pr(E|F) = \Pr(E|P_1(\sigma_1)) = \eta_1.$$
Similarly, $\Pr(E|F) = \eta_2$ is deduced, and hence $\eta_1 = \eta_2$ follows. ⋄

EX. 4.14. Let $A, B$ be events such that $A \cap B = \emptyset$. Then the conditional probability satisfies:
$$\Pr(E|A) = \Pr(E|B) \implies \Pr(E|A \cup B) = \Pr(E|A).$$

Part 3. n-Person Games

CHAPTER 5
Utilities, Potentials and Equilibria
Before discussing n-person games per se, it is useful to go back to the fundamental model of a game $\Gamma$ being played on a system $S$ of states and to look at characteristic features of $\Gamma$. The aim is a numerical assessment of the worth of states and strategic decisions from a general perspective.
1. Utilities and Potentials

1.1. Utilities. A utility on the system $S$ is an ensemble $U = \{u_\sigma \mid \sigma \in S\}$ of functions $u_\sigma: S \to \mathbb{R}$, the so-called local utility functions of $U$. We think of $U$ as a measuring instrument which allows us to evaluate a possible move $\sigma \to \tau$ numerically by the ensuing marginal difference
$$\partial U(\sigma, \tau) = u_\sigma(\tau) - u_\sigma(\sigma).$$
If the quality of any move $\sigma \to \tau$ in the game $\Gamma$ is evaluated via the utility $U$, we call $U$ the characteristic utility of $\Gamma$.

1.2. Potentials. Having a 'potential' means to have the capability to enact something. In physics, the term potential refers to a characteristic quantity of a system whose change results in a dynamic behavior of the system. Potential energy, for example, may allow a mass to be set into motion. The resulting kinetic energy corresponds to the change in the potential energy. Gravity is thought to result from changes in a corresponding potential, the so-called gravitational field, etc.
Mathematically, a potential is represented as a real-valued numerical parameter. In other words: a potential on the system $S$ is just a function $v: S \to \mathbb{R}$ which assigns to a state $\sigma \in S$ a numerical value $v(\sigma)$. The potential $v$ gives rise to an associated utility $V = \{v_\sigma \mid \sigma \in S\}$, where $v_\sigma = v$ for all $\sigma \in S$.
If $V$ is the utility of the game $\Gamma$ on $S$, then the potential $v$ is called the characteristic function of $\Gamma$. The value of a move $\sigma \to \tau$ is then given by the marginal difference
$$\partial V(\sigma, \tau) = \partial v(\sigma, \tau) = v(\tau) - v(\sigma).$$

Path independence.
Given the utility $U$ on $S$, a path
$$\gamma = \sigma_0 \to \sigma_1 \to \cdots \to \sigma_{k-1} \to \sigma_k$$
of system transitions has the total utility weight
$$\partial U(\gamma) = \partial U(\sigma_0, \sigma_1) + \partial U(\sigma_1, \sigma_2) + \cdots + \partial U(\sigma_{k-1}, \sigma_k).$$
We say that $U$ is path independent if the utility weight of any path depends only on its initial state $\sigma_0$ and the final state $\sigma_k$ (but not on the states $\sigma_i$ in between):
$$\partial U(\sigma_0 \ldots \sigma_i \ldots \sigma_k) = \partial U(\sigma_0, \sigma_k).$$

PROPOSITION 5.1.
The utility $U$ is path independent on $S$ if and only if $U$ is derived from a potential on $S$.

Proof. If $U$ is derived from the potential $u: S \to \mathbb{R}$, we have $\partial U(\sigma_{i-1}, \sigma_i) = u(\sigma_i) - u(\sigma_{i-1})$ and, therefore, for any $\gamma = \sigma_0\sigma_1 \ldots \sigma_k$:
$$\partial U(\gamma) = \sum_{i=1}^k \big(u(\sigma_i) - u(\sigma_{i-1})\big) = u(\sigma_k) - u(\sigma_0) = \partial U(\sigma_0, \sigma_k),$$
which shows that $U$ is path independent.

Conversely, assume that $U = \{u_\sigma \mid \sigma \in S\}$ is a path independent utility. Fix a state $\sigma_0$ and notice that the utility function $u = u_{\sigma_0}$ is a potential in its own right. Since $U$ is path independent, we have for all $\sigma, \tau \in S$,
$$\partial U(\sigma_0, \sigma) + \partial U(\sigma, \tau) = \partial U(\sigma_0, \tau)$$
and, therefore,
$$\partial U(\sigma, \tau) = \partial U(\sigma_0, \tau) - \partial U(\sigma_0, \sigma) = u(\tau) - u(\sigma).$$
So $U$ is identical with the utility derived from the potential $u$. ⋄
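Proposition 5.1 can be illustrated numerically: for a utility derived from a potential, every path between two fixed states has the same total weight. The state set and potential values below are arbitrary:

```python
import itertools
import random

random.seed(0)
S = range(6)                                   # a small state system (arbitrary)
u = {s: random.uniform(-1.0, 1.0) for s in S}  # a potential u : S -> R

def dU(s, t):
    """Marginal value of a move s -> t for the utility derived from u."""
    return u[t] - u[s]

def weight(gamma):
    """Total utility weight of a path sigma_0 -> sigma_1 -> ... -> sigma_k."""
    return sum(dU(a, b) for a, b in zip(gamma, gamma[1:]))

# Every path from state 0 to state 5 telescopes to the same value u[5] - u[0].
for mid in itertools.permutations([1, 2, 3, 4]):
    gamma = (0, *mid, 5)
    assert abs(weight(gamma) - (u[5] - u[0])) < 1e-12
```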
2. Equilibria
When we talk about an equilibrium of a utility $U = \{u_\sigma \mid \sigma \in S\}$ on the system $S$, we assume that each state $\sigma$ has an associated neighborhood $F_\sigma \subseteq S$ with $\sigma \in F_\sigma$. This means that we restrict state transitions to neighbors, i.e., to transitions $\sigma \to \tau$ with $\tau \in F_\sigma$. We now say that a system state $\sigma$ is an "equilibrium" if it yields a locally extreme value of the utility function, from which no transition to a neighbor appears attractive. To be precise, we distinguish maximal and minimal extreme values and, therefore, define:
(1) $\sigma$ is a gain (or profit) equilibrium of $U$ if $u_\sigma(\tau) \le u_\sigma(\sigma)$ holds for all $\tau \in F_\sigma$;
(2) $\sigma$ is a cost equilibrium of $U$ if $u_\sigma(\tau) \ge u_\sigma(\sigma)$ holds for all $\tau \in F_\sigma$.
Many real-world systems are assumed to be subject to a dynamic process that eventually settles in an equilibrium state (or at least approximates an equilibrium) according to some utility measure. This phenomenon is strikingly observed in physics. But economic theory, too, has long suspected that economic systems may tend towards equilibrium states. If the utility measure suggests maximizing the value, gain equilibria are of interest. If a minimal value is desirable, one investigates cost equilibria.

REMARK.
Denote by $C = -U$ the utility with local utility functions $c_\sigma = -u_\sigma$. Then one has:
$$\sigma \text{ is a gain equilibrium of } U \iff \sigma \text{ is a cost equilibrium of } C.$$
From an abstract point of view, the theory of gain equilibria is, therefore, equivalent to the theory of cost equilibria.
In practice, the determination of an equilibrium is typically a very difficult computational task. In fact, many utilities do not have equilibria at all, and it is generally not easy to decide whether an equilibrium for a given utility exists. So one is interested in conditions that allow one to conclude that at least one equilibrium exists.

A.A. COURNOT (1838): Recherches sur les principes mathématiques de la théorie des richesses, Paris.
Utilities from potentials.
Consider a potential $u: S \to \mathbb{R}$ with the derived utility values
$$\partial u(\sigma, \tau) = u(\tau) - u(\sigma).$$
Here, one has conditions that are obviously sufficient:
(1) If $u(\sigma) = \max_{\tau \in F_\sigma} u(\tau)$, then $\sigma$ is a gain equilibrium.
(2) If $u(\sigma) = \min_{\tau \in F_\sigma} u(\tau)$, then $\sigma$ is a cost equilibrium.
Since every function on a finite set attains a maximum and a minimum, we find:

PROPOSITION 5.2. If $S$ is finite, then every potential function yields a utility with at least one gain and one cost equilibrium.

Similarly, we can derive the existence of equilibria on systems that are represented in a coordinate space.

PROPOSITION 5.3. If $S$ can be represented as a compact set $S \subseteq \mathbb{R}^m$ such that $u: S \to \mathbb{R}$ is continuous, then $u$ implies a utility on $S$ with at least one gain and one cost equilibrium.

Indeed, it is well-known that a continuous function on a compact set attains a maximum and a minimum.

REMARK. Notice that the conditions given in this section are sufficient to guarantee the existence of equilibria, no matter what neighborhood structure on $S$ is assumed.

Convex and concave utilities.
If the utility is not implied by a potential function, not even the finiteness of $S$ may be sufficient to guarantee the existence of an equilibrium (see Ex. 5.1).

EX. 5.1. Give an example of a utility $U$ relative to a finite state set $S$ with no gain and no cost equilibrium.

We now derive sufficient conditions for utilities $U$ on systems whose states are represented by a nonempty convex set $S \subseteq \mathbb{R}^m$. We say:
• $U$ is convex if every local utility function $u_s: S \to \mathbb{R}$ is convex.
• $U$ is concave if every local utility function $u_s: S \to \mathbb{R}$ is concave.

THEOREM 5.1.
Let $U$ be a utility with continuous local utility functions $u_s: S \to \mathbb{R}$ on the nonempty compact set $S \subseteq \mathbb{R}^m$. Then:
(1) If $U$ is convex, a cost equilibrium exists.
(2) If $U$ is concave, a gain equilibrium exists.

Proof. Define the function $G: S \times S \to \mathbb{R}$ with values $G(s,t) = u_s(t)$ for all $s, t \in S$. Then the hypothesis of the Theorem says that $G$ satisfies the conditions of Corollary A.1 of the Appendix. Therefore, an element $s^* \in S$ exists such that
$$u_{s^*}(t) = G(s^*, t) \le G(s^*, s^*) = u_{s^*}(s^*)$$
holds for all $t \in S$. Consequently, $s^*$ is a gain equilibrium of $U$. (The convex case is proved in the same way.) ⋄

CHAPTER 6

n-Person Games

n-person games generalize 2-person games. Yet, it turns out that the special techniques for the analysis of 2-person games apply in this seemingly wider context as well. Traffic systems, for example, fall into this category naturally.

The model of an n-person game $\Gamma$ assumes the presence of a finite set $N$ with $n = |N|$ elements together with a family $\mathcal{X} = \{X_i \mid i \in N\}$ of $n$ further nonempty sets $X_i$. The elements $i \in N$ are thought of as players or agents etc. A member $X_i \in \mathcal{X}$ represents the collection of resources (or actions, strategies, decisions etc.) that are available to agent $i \in N$. A state of $\Gamma$ is a particular selection $x = (x_i \mid i \in N)$ of individual resources $x_i \in X_i$ by the $n$ agents $i$. So the collection of all states $x$ of $\Gamma$ is represented by the direct product
$$X = \prod_{i \in N} X_i.$$

REMARK.
It is often convenient to label the elements of $N$ by natural numbers and assume $N = \{1, 2, \ldots, n\}$ for simplicity of notation. In this case, a state $x$ of $\Gamma$ can be denoted in the form
$$x = (x_1, x_2, \ldots, x_n) \in X_1 \times X_2 \times \cdots \times X_n \ (= X).$$
We furthermore assume that each player $i \in N$ has an individual utility
$$U_i = \{u^x_i \mid x \in X\}$$
so that $u^x_i(y)$ assesses the value of a state transition $x \to y$ for $i$. The whole context
$$\Gamma = \Gamma(U_i \mid i \in N)$$
of the set of players and their utilities now describes the n-person game under consideration.
EX. 6.1. The matrix game $\Gamma$ with a row player $R$ and a column player $C$ and the payoff matrix
$$P = \begin{bmatrix} (p_{11}, q_{11}) & (p_{12}, q_{12}) \\ (p_{21}, q_{21}) & (p_{22}, q_{22}) \end{bmatrix} = \begin{bmatrix} (+1, -1) & (-1, +1) \\ (-1, +1) & (+1, -1) \end{bmatrix}$$
is a 2-person game with the player set $N = \{R, C\}$ and the strategy sets $X_R = \{1, 2\}$ and $X_C = \{1, 2\}$. Accordingly, the set of states is
$$X = X_R \times X_C = \{(1,1), (1,2), (2,1), (2,2)\}.$$
The individual utility functions $u^{(i,j)}_R, u^{(i,j)}_C: X \to \mathbb{R}$ take the values
$$u^{(i,j)}_R(s,t) = p_{st} \quad\text{and}\quad u^{(i,j)}_C(s,t) = q_{st} \quad\text{for all } (s,t) \in X.$$

Potential games.
The n-person game $\Gamma = \Gamma(U_i \mid i \in N)$ is called a potential game if there is a potential $v: X \to \mathbb{R}$ such that, for all $i \in N$ and $x, y \in X$, the marginal utility change equals the change in the potential:
$$u^x_i(y) - u^x_i(x) = \partial v(x, y) = v(y) - v(x).$$

Cooperation.
The basic game model with a set $N$ of players is readily generalized to a model where groups of players (and not just individuals) derive a utility value from a certain state $x \in X$. To this end, we call a subset $S \subseteq N$ of players a coalition and assume an individual utility function $u_S: X \to \mathbb{R}$ to exist for each coalition $S$. From an abstract mathematical point of view, however, this generalized model can be treated like a standard $|\mathcal{N}|$-person game, having the set
$$\mathcal{N} = \{S \mid S \subseteq N\}$$
of coalitions as its set of "superplayers". In fact, we may allow each coalition $S$ to be endowed with its own set $X_S$ of resources. In this chapter, we therefore retain the basic model with respect to an underlying set $N$ of players. A special class of potential games with cooperation, so-called cooperative games, will be studied in more detail in Chapter 7.

Probabilistic models.
There are many probabilistic aspects of n-person games. One consists in having a probabilistic model to start with (see Ex. 6.2).

EX. 6.2 (Fuzzy games). Assume a game $\Gamma$ where any player $i \in N$ has to decide between two alternatives, say "0" and "1", and chooses "1" with probability $x_i$. Then $\Gamma$ is a $|N|$-person game in which each player $i$ has the unit interval
$$X_i = [0,1] = \{x \in \mathbb{R} \mid 0 \le x \le 1\}$$
as its set of resources. A joint strategic choice
$$x = (x_1, \ldots, x_i, \ldots, x_n) \in [0,1]^N$$
can be interpreted as a "fuzzy" decision to form a coalition $X \subseteq N$:
• Player $i$ will be a member of $X$ with probability $x_i$.
$x$ is thus the description of a fuzzy coalition. $\Gamma$ is a fuzzy cooperative game in the sense of Aubin if it is a potential game in our terminology.

A further model arises from the randomization of an n-person game (see Section 3). Other probabilistic aspects of n-person games are studied in Chapter 7 and in Chapter 8.
1. Dynamics of n-person games

If the game $\Gamma = \Gamma(U_i \mid i \in N)$ is played, a game instance yields a sequence of state transitions. The transitions are thought to result from changes in the strategy choices of the players. Suppose $i \in N$ replaces its current strategy $x_i$ by the strategy $y \in X_i$ while all other players $j \neq i$ retain their choices $x_j \in X_j$. Then a state transition $x \to y = x_{-i}(y)$ results, where the new state has the components
$$y_j = \begin{cases} y & \text{if } j = i \\ x_j & \text{if } j \neq i. \end{cases}$$
Note in particular that $x_{-i}(x_i) = x$ holds under this definition. Let us take the set
$$F_i(x) = \{x_{-i}(y) \mid y \in X_i\}$$
as the neighborhood of the state $x \in X$ for the player $i \in N$. So the neighbors of $x$ from $i$'s perspective are those states that could be achieved by $i$ with a change of its current strategy $x_i$, provided all other players $j \neq i$ retain their current strategies $x_j$.
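The substitution $x_{-i}(y)$ and the neighborhoods $F_i(x)$ are straightforward to implement. The following sketch uses the states of the 2×2 game of Ex. 6.1 (players indexed 0 and 1):

```python
def substitute(x, i, y):
    """x_{-i}(y): replace the i-th strategy in state x by y, all others unchanged."""
    return tuple(y if j == i else xj for j, xj in enumerate(x))

def neighborhood(x, i, X_i):
    """F_i(x) = { x_{-i}(y) : y in X_i }: states player i can reach unilaterally."""
    return {substitute(x, i, y) for y in X_i}

x = (1, 2)                                  # a state of the game of Ex. 6.1
assert substitute(x, 0, x[0]) == x          # x_{-i}(x_i) = x
assert neighborhood(x, 0, {1, 2}) == {(1, 2), (2, 2)}
assert neighborhood(x, 1, {1, 2}) == {(1, 1), (1, 2)}
```

Note that $x$ itself always belongs to $F_i(x)$, matching the convention $\sigma \in F_\sigma$ of Chapter 5.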
2. Equilibria

A gain equilibrium of $\Gamma = \Gamma(U_i \mid i \in N)$ is a joint strategic choice $x \in X$ such that no player has a utility incentive to switch to another strategy, i.e.,
$$u^x_i(x) \ge u^x_i(z) \quad\text{holds for all } i \in N \text{ and } z \in F_i(x).$$
Completely analogously, a cost equilibrium is defined via the reverse condition:
$$u^x_i(x) \le u^x_i(z) \quad\text{holds for all } i \in N \text{ and } z \in F_i(x).$$

J.-P. AUBIN (1981): Cooperative fuzzy games, Mathematics of Operations Research 6, 1-13.
This notion of an equilibrium can be brought in line with the general definition in Chapter 2. Given the state $x$, imagine that each player $i$ considers an alternative $y_i$ to its current strategy $x_i$. The aggregated sum of the resulting utility values is
$$G(x, y) = \sum_{i \in N} u^x_i(x_{-i}(y_i)) \qquad (y = (y_i \mid y_i \in X_i)).$$

LEMMA 6.1. $x \in X$ is a gain equilibrium of $\Gamma(U_i \mid i \in N)$ if and only if
$$G(x, y) \le G(x, x) \quad\text{holds for all } y \in X.$$

Proof. If $x$ is a gain equilibrium and $y = (y_i \mid i \in N) \in X$, we have
$$u^x_i(x) \ge u^x_i(x_{-i}(y_i)) \quad\text{for all } y_i \in X_i,$$
which implies $G(x, x) \ge G(x, y)$. Conversely, if $x$ is not a gain equilibrium, there is an $i \in N$ and a $y \in X_i$ such that
$$0 < u^x_i(x_{-i}(y)) - u^x_i(x) = G(x, x_{-i}(y)) - G(x, x),$$
which means that $x_{-i}(y) \in X$ violates the inequality. ⋄

Lemma 6.1 reduces the quest for an equilibrium of $\Gamma$ to the quest for an equilibrium of the utility $G = \{g_x \mid x \in X\}$ with values $g_x(y) = G(x, y)$. It follows that we can immediately carry over the general sufficient conditions of Chapter 5 for the existence of equilibria to the n-person game $\Gamma = \Gamma(U_i \mid i \in N)$ with utility aggregation function $G$:
(1) If $\Gamma$ is a potential game with a finite set $X$ of states, then the existence of a gain and of a cost equilibrium is guaranteed.
(2) If $X$ is represented as a nonempty compact and convex set in a finite-dimensional real parameter space, and all the maps $y \mapsto G(x, y)$ are continuous and concave, then $\Gamma$ admits a gain equilibrium.
(3) If $X$ is represented as a nonempty compact and convex set in a finite-dimensional real parameter space, and all the maps $y \mapsto G(x, y)$ are continuous and convex, then $\Gamma$ admits a cost equilibrium.

EX. 6.3. Show that the matrix game in Ex. 6.1 is not a potential game. (Hint: The set of states is finite.)
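For the matrix game of Ex. 6.1, an exhaustive check confirms that no state is a gain equilibrium; by the sufficient condition (1), the game can therefore not be a potential game (cf. Ex. 6.3):

```python
import itertools

# Payoffs of the matrix game of Ex. 6.1: entry (p_st, q_st) for the state (s, t).
payoff = {
    (1, 1): (+1, -1), (1, 2): (-1, +1),
    (2, 1): (-1, +1), (2, 2): (+1, -1),
}

def is_gain_equilibrium(state):
    """True if no player can raise its own payoff by a unilateral strategy change."""
    s, t = state
    row_ok = all(payoff[(s, t)][0] >= payoff[(s2, t)][0] for s2 in (1, 2))
    col_ok = all(payoff[(s, t)][1] >= payoff[(s, t2)][1] for t2 in (1, 2))
    return row_ok and col_ok

# None of the four states is a gain equilibrium: a potential game on a finite
# state set would have one by condition (1), so this game has no potential.
assert not any(is_gain_equilibrium(x) for x in itertools.product((1, 2), (1, 2)))
```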
3. Randomization of matrix games

An n-person game $\Gamma = \Gamma(U_i \mid i \in N)$ is a matrix game if
(i) the set $X_i$ of resources of any player $i \in N$ is finite;
(ii) each player $i \in N$ has just one utility function $u_i: X \to \mathbb{R}$.
For a motivation of the terminology, assume $N = \{1, \ldots, n\}$ and think of the sets $X_i$ as index sets for the coordinates of a multidimensional matrix $U$. A particular index vector
$$x = (x_1, \ldots, x_i, \ldots, x_n) \in X = X_1 \times \cdots \times X_i \times \cdots \times X_n$$
thus specifies a position in $U$ with the n-dimensional coordinate entry
$$U_x = (u_1(x), \ldots, u_i(x), \ldots, u_n(x)) \in \mathbb{R}^n.$$
Let us now change the rules of the matrix game $\Gamma$ in the following way:
(R) For each $i \in N$, player $i$ chooses a probability distribution $p^{(i)}$ on $X_i$ and selects the element $x \in X_i$ with probability $p^{(i)}_x$.
Under rule (R), the players are really playing the related n-person game $\bar\Gamma = \bar\Gamma(\bar U_i \mid i \in N)$ with resource sets $P_i$ and utility functions $\bar u_i: P_1 \times \cdots \times P_n \to \mathbb{R}$, where
(1) $P_i$ is the set of all probability distributions on $X_i$;
(2) $\bar u_i(p)$ is the expected value of $u_i$ relative to the joint probability distribution $p = (p^{(i)} \mid i \in N)$ of the players.
The n-person game $\bar\Gamma$ is the randomization of the matrix game $\Gamma$. Assuming $N = \{1, \ldots, n\}$, one has the expected utility values given as
$$\bar u_i(p^{(1)}, \ldots, p^{(n)}) = \sum_{x_1 \in X_1} \cdots \sum_{x_n \in X_n} u_i(x_1, \ldots, x_n)\, p^{(1)}_{x_1} \cdots p^{(n)}_{x_n}.$$
As Ex. 6.1 shows, a (non-randomized) matrix game $\Gamma$ does not necessarily have equilibria. On the other hand, notice that the coordinate product function
$$(t_1, \ldots, t_n) \in \mathbb{R}^n \mapsto t_1 \cdots t_n \in \mathbb{R}$$
is continuous and linear in each variable. Each utility function $\bar u_i$ of the randomized game $\bar\Gamma$ is a linear combination of such functions and, therefore, also continuous and linear in each variable. Since linear functions are both concave and convex and the state set
$$P = P_1 \times \cdots \times P_n$$
is convex and compact, we conclude:
THEOREM 6.1 (NASH). The randomization $\bar\Gamma$ of an n-person matrix game $\Gamma$ admits both a gain and a cost equilibrium.

REMARK. An equilibrium of a randomized matrix game is also known as a Nash equilibrium.
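For the matrix game of Ex. 6.1, the randomized game has the well-known Nash equilibrium in which both players mix uniformly. The following sketch performs the unilateral-deviation check on a grid of mixed strategies (the grid resolution is an arbitrary choice):

```python
import numpy as np

# Row player's payoff in the matrix game of Ex. 6.1; the column player receives
# the negative (zero-sum).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])

def u_row(p, q):
    """Expected utility of the row player under mixed strategies p (row), q (column)."""
    return float(p @ A @ q)

p_star = q_star = np.array([0.5, 0.5])

# (p*, q*) is a Nash (gain) equilibrium: no unilateral deviation helps either player.
for a in np.linspace(0.0, 1.0, 101):
    dev = np.array([a, 1.0 - a])
    assert u_row(dev, q_star) <= u_row(p_star, q_star) + 1e-12    # row cannot gain
    assert -u_row(p_star, dev) <= -u_row(p_star, q_star) + 1e-12  # column cannot gain
```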
4. Traffic flows
A fundamental model for the analysis of flows in traffic networks goes back to Wardrop. It is based on a graph $G = (V, E)$ with a (finite) set $V$ of nodes and a set $E$ of (directed) edges $e$ between nodes,
$$v \stackrel{e}{\longrightarrow} w,$$
representing directed connections from nodes to other nodes. The model assumes:
(W) There is a set $N$ of players. A player $i \in N$ wants to travel along a path in $G$ from a starting point $s_i$ to a destination $t_i$ and has a set $\mathcal{P}_i$ of paths to choose from.
Game-theoretically speaking, a strategic action of player $i \in N$ means a particular choice of a path $P \in \mathcal{P}_i$. Let us identify a path $P \in \mathcal{P}_i$ with its incidence vector in $\mathbb{R}^E$ with the components
$$P_e = \begin{cases} 1 & \text{if } P \text{ passes through } e \\ 0 & \text{otherwise.} \end{cases}$$
The joint travel path choice $s$ of the players generates the traffic flow
$$x^s = \sum_{i \in N} \sum_{P \in \mathcal{P}_i} \lambda^s_P P \quad\text{of size}\quad |x^s| = \sum_P \lambda^s_P \le n,$$
where $\lambda^s_P$ is the number of players that choose path $P$ in $s$. The component $x^s_e$ of $x^s$ is the amount of traffic on edge $e$ caused by the choice $s$. We assume that a traffic flow $x$ produces congestion costs $c_e(x_e)$ along the edges $e$ and hence results in the total congestion cost
$$C(x) = \sum_{e \in E} c_e(x_e)\, x_e$$
across all edges. An individual player $i$ has the congestion cost just along its chosen path $P$:
$$C(P, x) = \sum_{e \in P} c_e(x_e).$$
If we associate with the flow $x$ the potential of aggregated costs
$$\Phi(x) = \sum_{e \in E} \sum_{t=1}^{x_e} c_e(t),$$
we find that player $i$'s congestion cost along path $P$ in $x$ equals the marginal potential:
$$C(P, x) = \sum_{e \in P} c_e(x_e) = \Phi(x) - \Phi(x - P) = \partial_P \Phi(x - P).$$
It follows that the players in the Wardrop traffic model play an n-person potential game on the finite set $X$ of possible traffic flows.

$x \in X$ is said to be a Nash flow if no player $i$ can improve its congestion cost by switching from the current path $P \in \mathcal{P}_i$ to the use of another path $Q \in \mathcal{P}_i$. In other words, the Nash flows are the cost equilibrium flows. Since the potential function $\Phi$ is defined on a finite set, we conclude:
• The Wardrop traffic flow model admits a Nash flow.

J. NASH (1950): Equilibrium points in n-person games, Proc. National Academy of Sciences 36, 48-49.
J.G. WARDROP (1952): Some theoretical aspects of road traffic research. Institution of Civil Engineers 1, 325-378.

Braess' paradox.
If one assumes that traffic in the Wardrop model eventually settles in a Nash flow, i.e., that the traffic flow evolves toward a cost equilibrium situation, the well-known observation of Braess is counter-intuitive:
(B) It can happen that a reduction of the congestion along a particular connection increases(!) the total congestion cost.
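Observation (B) can be reproduced with a small computation on a Braess-type network; the cost values below follow the four-user example discussed next:

```python
from collections import Counter

def make_cost(c_rq):
    """Edge cost functions of the example network; c_rq is the (r, q) congestion."""
    return {
        ("s", "r"): lambda t: t, ("s", "q"): lambda t: 4,
        ("r", "t"): lambda t: 4, ("q", "t"): lambda t: t,
        ("r", "q"): lambda t: c_rq,
    }

def edges(P):
    return list(zip(P, P[1:]))

def loads(paths):
    """Edge loads x_e of the flow generated by the players' path choices."""
    return Counter(e for P in paths for e in edges(P))

def total_cost(paths, c_rq):
    """Overall congestion cost C(x) = sum_e c_e(x_e) * x_e."""
    cost, x = make_cost(c_rq), loads(paths)
    return sum(cost[e](xe) * xe for e, xe in x.items())

def player_cost(P, paths, c_rq):
    """Individual cost C(P, x) = sum of c_e(x_e) over the edges of P."""
    cost, x = make_cost(c_rq), loads(paths)
    return sum(cost[e](x[e]) for e in edges(P))

P, Pt, Q = ("s", "r", "t"), ("s", "q", "t"), ("s", "r", "q", "t")

assert total_cost([P, P, Pt, Pt], 10) == 24    # Nash flow before the 'improvement'
assert player_cost(P, [P, P, Pt, Pt], 0) == 6  # a P-user's cost once c_rq = 0 ...
assert player_cost(Q, [P, Q, Pt, Pt], 0) == 5  # ... so switching to Q pays off,
assert total_cost([P, Q, Pt, Pt], 0) == 25     # but the total cost increases
```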
As an example of BRAESS' paradox, consider the network G = (V, E) with

V = {s, r, q, t} and E = {(s, r), (s, q), (r, t), (q, t), (r, q)}.

[D. BRAESS (1968): Über ein Paradoxon aus der Verkehrsplanung]

Assume that the cost functions on the edges are

c_sr(x) = x, c_sq(x) = 4, c_rt(x) = 4, c_qt(x) = x, c_rq(x) = 10

and that there are four network users, who choose individual paths from the starting point s to the destination t and want to minimize their individual travel times.

Because of the high congestion cost, no user will travel along (r, q). As a consequence, a NASH flow will have two users of path P = (s → r → t) while the other two users travel along P̃ = (s → q → t). Each user then has the individual travel cost 2 + 4 = 6, and the overall cost is

C(2P + 2P̃) = 2·6 + 2·6 = 24.

If road improvement measures are taken to reduce the congestion on (r, q) to c'_rq = 0, a user of path P can lower its current cost C(P) = 6 to C(Q) = 5 by switching to the path Q = (s → r → q → t). The resulting traffic flow, however, causes a higher overall cost:

C'(P + Q + 2P̃) = 6 + 5 + 2·7 = 25.
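The two totals above can be re-derived mechanically. The following sketch (the helper names are our own, not from the text) recomputes the individual travel costs and the overall costs of both flows:

```python
# A sketch of the Braess computation; function names are hypothetical.
def path_edges(path):
    return list(zip(path, path[1:]))

def travel_costs(paths, cost):
    # Edge loads induced by the joint path choice of all users.
    load = {}
    for p in paths:
        for e in path_edges(p):
            load[e] = load.get(e, 0) + 1
    # Individual cost of a path: sum of its edge costs at the current loads.
    return [sum(cost[e](load[e]) for e in path_edges(p)) for p in paths]

def network(c_rq):
    # Edge cost functions of the example, with adjustable cost on (r, q).
    return {('s', 'r'): lambda x: x, ('s', 'q'): lambda x: 4,
            ('r', 't'): lambda x: 4, ('q', 't'): lambda x: x,
            ('r', 'q'): lambda x: c_rq}

P, Pt, Q = ['s', 'r', 't'], ['s', 'q', 't'], ['s', 'r', 'q', 't']

before = travel_costs([P, P, Pt, Pt], network(10))  # Nash flow 2P + 2P~
after = travel_costs([P, Q, Pt, Pt], network(0))    # after the "improvement"
```

The switching user indeed pays 5 instead of 6, while the overall cost rises from 24 to 25.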
Chapter 7. Cooperative Games
Players in a cooperative game strive for a common goal, from which they possibly profit. Mathematically, such games are special potential games and are best studied within the context of linear algebra. A central question is how to distribute the profit of the jointly achieved goal appropriately. The core of a cooperative game is an important analytical notion. It provides a link to the theory of discrete optimization and to greedy algorithms in particular. Moreover, natural models for the dynamics of coalition formation are closely related to thermodynamical models in statistical physics. Consequently, the notion of a temperature in a potential game can be made precise, for example.
While the agents in the n-person games of the previous chapters typically have opposing utility goals, the model of a cooperative game refers to a finite set N of n = |N| players that may or may not be active towards a common goal. A subset S ⊆ N of potentially active players is traditionally called a coalition. Mathematically, there are several ways of looking at the system of coalitions:

From a set-theoretic point of view, one has the system of the 2^n coalitions

N = {S | S ⊆ N}.

On the other hand, one may represent a subset S ∈ N by its incidence vector x^(S) ∈ R^N with the coordinates

x^(S)_i = 1 if i ∈ S, 0 if i ∉ S.

The incidence vector x^(S) suggests the interpretation of i ∈ N being active if x^(S)_i = 1. The coalition S is thus the collection of active players.

A further interpretation imagines every player i ∈ N to have a binary strategy set X_i = {0, 1} from which to choose one element. An incidence vector

x = (x_1, ..., x_n) ∈ X_1 × ··· × X_n = {0, 1}^N ⊆ R^N

represents the joint strategy decision of the n players, and we have the correspondence

N ←→ {0, 1}^N = 2^N.

By a cooperative game we will just understand an n-person game Γ with player set N and state set X = N or X = 2^N, depending on a set-theoretic or vector space point of view.
1. Cooperative TU-games

A transferable utility relative to a set N of players is a quantity v whose value v(S) depends on the coalition S of active players and hence is a potential v : N → R. The resulting potential game Γ = (N, v) represents a cooperative TU-game with characteristic function v. v(∅) is the utility value if no member of N is active in the game Γ.

Typically, (N, v) is assumed to be zero-normalized, i.e., to have v(∅) = 0. In the case v(∅) ≠ 0, one considers the TU-game (N, v^(0)) instead of (N, v), with the zero-normalized characteristic function values

v^(0)(S) = v(S) − v(∅).

In the sequel, we will concentrate on TU-games and therefore just talk about a cooperative game (N, v).

REMARK. Often the characteristic function v of a cooperative game (N, v) is already called a cooperative game. In discrete mathematics and computer science, a function v : {0, 1}^n → R is also known as a pseudo-Boolean function. Decision theory refers to pseudo-Boolean functions as set functions.

The characteristic function v of a cooperative game can represent a cost utility or a profit utility. The real-world interpretation of the mathematical analysis, of course, depends on whether a cost or a gain model is assumed. Usually, the modeling context makes this clear, however.

[see, e.g., M. GRABISCH (2016): Set Functions, Games and Capacities in Decision Making, Springer-Verlag]
Identifying a TU-game (N, v) with its characteristic function v, we think of the function space

R^N = {v : N → R} with N = {S | S ⊆ N}

as the vector space of all TU-games on N. R^N is isomorphic with the coordinate space R^{2^n} and has dimension

dim R^N = |N| = 2^n.

The 2^n unit vectors of R^N correspond to the so-called DIRAC functions δ_S ∈ R^N with the values

δ_S(T) = 1 if T = S, 0 if T ≠ S.

The set {δ_S | S ∈ N} is a basis of R^N. Any v ∈ R^N has the representation

v = Σ_{S∈N} v(S) δ_S.

Duality.
It is advantageous to retain N as the index set explicitly because one can use the set-theoretic structure of N for game-theoretic analysis. One such example is the duality operator v ↦ v* on R^N, where

(36) v*(S) = v(N) − v(N \ S) for all S ⊆ N.

We say that the game (N, v*) is the dual of (N, v). For any possible coalition S ∈ N, the numerical value

v*(N \ S) = v(N) − v(S)

is the "surplus" of the "grand coalition" N vs. S in the game (N, v). So duality expresses a balance

v(S) + v*(N \ S) = v(N) for all coalitions S.

EX. 7.1. Show:
(1) v ↦ v* is a linear operator on R^N.
(2) The dual v** = (v*)* of the dual v* of v yields exactly the zero-normalization of v.

MÖBIUS transformation. For any v ∈ R^N, let us define its MÖBIUS transform as the function v̂ ∈ R^N with values

v̂(S) = Σ_{T⊆S} v(T)  (S ∈ N).

[A.F. MÖBIUS (1790-1868)]

v̂(S) sums up the v-values of all subcoalitions T ⊆ S. In this sense, the MÖBIUS transformation is a kind of "discrete integral" on the function space R^N.

EX. 7.2 (Unanimity games). The MÖBIUS transform δ̂_S of the DIRAC function δ_S is known as a unanimity game and has the values

δ̂_S(T) = 1 if S ⊆ T, 0 if S ⊄ T.

A coalition T has a non-zero value δ̂_S(T) = 1 exactly when the coalition T includes all members of S. Unanimity games appear to be quite simple and yet are basic (Corollary 7.1 below). Many concepts in cooperative game theory are tested against their performance on unanimity games.

Clearly, the MÖBIUS transformation v ↦ v̂ is a linear operator on R^N. The important observation concerns an inverse property: every characteristic function v arises as the transform of a uniquely determined other characteristic function w.

THEOREM 7.1 (MÖBIUS inversion). For each v ∈ R^N, there is a unique w ∈ R^N such that v = ŵ.

Proof. Recall from linear algebra that it suffices to show that ẑ = O implies z = O, i.e., that the kernel of the MÖBIUS transform contains just the zero vector O ∈ R^N.

So let us consider an arbitrary function z ∈ R^N with transform ẑ = O. Let S ∈ N be a coalition and observe in the case S = ∅:

z(∅) = ẑ(∅) = 0.

Assume now, by induction, that z(T) = 0 holds for all T ∈ N of size |T| < |S|. Then the conclusion

z(S) = ẑ(S) − Σ_{T⊂S} z(T) = 0 − 0 = 0

follows and completes the inductive step of the proof. So z(S) = 0 must be true for all coalitions S. ⋄

Since the MÖBIUS operator is linear, Theorem 7.1 implies that it is, in fact, an automorphism of R^N, which maps bases onto bases. So we find in particular:

COROLLARY 7.1. The unanimity games δ̂_S form a basis of R^N, i.e., each v ∈ R^N admits a unique representation of the form

v = Σ_{S∈N} λ_S δ̂_S

with coefficients λ_S ∈ R.

EX. 7.3 (HARSANYI dividends). Where v = ŵ, the values w(S) are known in cooperative game theory as the HARSANYI dividends of the coalitions S in the game (N, v). It follows that the value v(S) of any coalition S is the sum of the HARSANYI dividends of its subcoalitions T:

v(S) = ŵ(S) = Σ_{T⊆S} w(T).

REMARK. The literature is not quite clear on the terminology and often refers to the inverse transformation v̂ ↦ v as the MÖBIUS transformation. Either way, the MÖBIUS transformation is a classical and important tool also in number theory and in combinatorics.
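Theorem 7.1 and its inductive proof translate directly into a computation. A minimal sketch (function names are ours): the transform sums over subcoalitions, and the inverse recovers w by increasing coalition size, exactly as in the proof:

```python
from itertools import combinations

def coalitions(N):
    # All subsets of N as frozensets, ordered by size.
    return [frozenset(S) for k in range(len(N) + 1)
            for S in combinations(N, k)]

def mobius_transform(v, N):
    # v-hat(S) = sum of v(T) over all subcoalitions T of S.
    return {S: sum(v[T] for T in coalitions(N) if T <= S)
            for S in coalitions(N)}

def mobius_inverse(vh, N):
    # Theorem 7.1: recover the unique w with w-hat = vh, by coalition size:
    # w(S) = vh(S) - sum of w(T) over proper subsets T of S.
    w = {}
    for S in sorted(coalitions(N), key=len):
        w[S] = vh[S] - sum(w[T] for T in w if T < S)
    return w

N = ('a', 'b', 'c')
v = {S: len(S) ** 2 for S in coalitions(N)}   # a small sample game
```

Applying the inverse to the transform returns the original game, and the transform of a DIRAC function reproduces the unanimity game of Ex. 7.2.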
Potentials and linear functionals. A potential f : N → R, interpreted as a vector f ∈ R^N, defines a linear functional f̃ : R^N → R, where

f̃(g) = ⟨f | g⟩ = Σ_{S∈N} f_S g_S for all g ∈ R^N.

If g^(S) is the (0,1)-incidence vector of a particular coalition S ∈ N, we have

f̃(g^(S)) = ⟨f | g^(S)⟩ = f_S,

which means that f̃ extends the potential f on N to all of R^N. Conversely, every linear functional g ↦ ⟨f | g⟩ on R^N defines a unique potential f on N via

f(S) = ⟨f | g^(S)⟩ for all S ∈ N.

[see, e.g., G.-C. ROTA (1964): On the foundations of combinatorial theory I. Theory of MÖBIUS functions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete]
These considerations show that potentials (characteristic functions) on N and linear functionals on R^N are two sides of the same coin. From the point of view of linear algebra, one can therefore equivalently define:

• A cooperative TU-game is a pair Γ = (N, v), where N is a set of players and g ↦ ⟨v | g⟩ is a linear functional on the vector space R^N.

The characteristic function v of the cooperative game Γ = (N, v) is a utility relative to the system N of coalitions of N. Individual players i ∈ N will assess their value with respect to v by evaluating the change in v that they can effect by being active or inactive. For a player i ∈ N, we thus define its marginal value with respect to the coalition S as

∂_i v(S) = v(S ∪ i) − v(S) if i ∈ N \ S,
∂_i v(S) = v(S) − v(S \ i) if i ∈ S.

Additive games.
The marginal value ∂_i v(S) of a player i ∈ N depends on the coalition S it refers to. Different coalitions may yield different marginal values for the player i.

The game Γ = (N, v) is said to be additive if every player's marginal values are the same relative to all possible coalitions. So there are numbers v_i such that

∂_i v(S) = v_i for all S ∈ N and i ∈ N.

Hence, if v is additive, we have

v(S) = v(∅) + Σ_{i∈S} v_i.

Conversely, every vector a ∈ R^N defines a zero-normalized additive game (N, a) with the understanding

(37) a(∅) = 0 and a(S) = Σ_{i∈S} a_i for all S ≠ ∅.

EX. 7.4. Which unanimity games (see Ex. 7.2) are additive? Show that the vector space of all additive games on N has dimension |N| + 1. The subspace of all zero-normalized additive games on N has dimension |N|.

We turn now to more examples of cooperative games.
As in Section 4.2, consider a factory that produces k different types of goods from m raw materials M_1, ..., M_m. Let x = (x_1, ..., x_k) be a plan that proposes the production of x_j units of the jth good and assume

(1) x would need a_i(x) units of material M_i for i = 1, ..., m;
(2) the production x could be sold for the price of f(x) (euros, dollars or whatever);
(3) there is a set N of suppliers and each s ∈ N owns b_is units of material M_i.

As in Section 4.2, the quest for an optimal production plan x* leads to the optimization problem

max_{x∈R^k_+} f(x) s.t. a_i(x) ≤ Σ_{s∈N} b_is (i = 1, ..., m).

Assume that the market prices of the materials are y*_1, ..., y*_m (per unit). Then an optimal production plan x* needs to buy

v(N) = Σ_{i=1}^m y*_i a_i(x*) = Σ_{i=1}^m y*_i b_i  (with b_i = Σ_{s∈N} b_is)

worth of materials from the suppliers.

How should the worth of an individual supplier s ∈ N be assessed? A natural parameter is the market value of all the materials owned by s:

(38) w*_s = Σ_{i=1}^m y*_i b_is.

Is this allocation s ↦ w*_s to individual suppliers s "fair"? To shed more light onto this question (without answering it), let us consider an alternative approach:

Assume that a coalition S evaluates its inner worth from the shadow prices y^S_1, ..., y^S_m of the S-restricted optimization problem

(39) max_{x∈R^k_+} f(x) s.t. a_i(x) ≤ Σ_{s∈S} b_is (i = 1, ..., m).

An optimal solution x^S of (39) requires

v(S) = Σ_{i=1}^m y^S_i a_i(x^S) = Σ_{i=1}^m y^S_i b^S_i  (with b^S_i = Σ_{s∈S} b_is)

worth of material and gives rise to a cooperative game (N, v). In this context, the worth of a supplier s ∈ N \ T for a coalition T is

∂_s v(T) = v(T ∪ s) − v(T).

So one may want to argue that a "fair" assessment of the suppliers should take their marginal values into account. (This idea is studied further in Section 3.2 below.)
Forgetting about marginal values for the moment, the situation is particularly transparent in the case of a linear objective and linear production requirements:

f(x) = c^T x = c_1 x_1 + ... + c_k x_k
a_i(x) = a_i^T x = a_{i1} x_1 + ... + a_{ik} x_k (i = 1, ..., m).

Where A denotes the matrix with the m row vectors a_i^T, the shadow price vector y* is an optimal solution of the dual linear program

min_{y∈R^m_+} y^T b s.t. y^T A ≥ c^T

and has the property

v(N) = Σ_{i=1}^m b_i y*_i = Σ_{j=1}^k c_j x*_j = f(x*).

Note that the dual of the S-restricted production problem has the same constraints and differs only in the coefficients of the objective function:

min_{y∈R^m_+} y^T b^S s.t. y^T A ≥ c^T.

An optimal solution x^S and a shadow price vector y^S yield

v(S) = Σ_{i=1}^m b^S_i y^S_i = Σ_{j=1}^k c_j x^S_j = f(x^S).

So the shadow price vector y* is also feasible (but not necessarily optimal) for the S-restriction and we conclude (with w*_s as in (38)):

(40) v(S) ≤ Σ_{i=1}^m b^S_i y*_i = Σ_{s∈S} w*_s = w*(S).

REMARK.
The inequality (40) suggests that the shadow prices y* satisfy all coalitions in the sense that every coalition S receives a material worth w*(S) that is at least as large as its pure market value v(S). This is the thought behind the notion of the core of a game (cf. Section 2 below).
Consider a set N = {p_1, ..., p_n} of users of some utility that are to be linked, either directly or indirectly (via other users), to some supply node p_0. Assume that the cost of establishing a link between p_i and p_j would be c_ij (euros, dollars or whatever). The associated cooperative game has the utility function

c(S) = minimal cost of connecting just S to p_0.

The relevant question is:

• How much should a user p_i ∈ N be charged so that a network with the desired connection can be established?

One possible cost distribution scheme is derived from a construction method for a connection of minimal total cost c(N). The greedy algorithm builds up a chain of coalitions

∅ = S_0 ⊂ S_1 ⊂ S_2 ⊂ ... ⊂ S_n = N

according to the following iterative procedure:

(G_0) Set S_0 = ∅.
(G_1) If S_j has been constructed, choose p ∈ N \ S_j such that c(S_j ∪ p) is as small as possible and charge user p the marginal cost c(S_j ∪ p) − c(S_j).
(G_2) Set S_{j+1} = S_j ∪ p and continue until all users have been charged.

NOTA BENE. The greedy algorithm makes sure that the user set N in total is charged the minimal possible connection cost:

Σ_{j=1}^n [c(S_j) − c(S_{j−1})] = c(S_n) − c(S_0) = c(N) − c(∅) = c(N).

In this sense, the greedy algorithm is efficient. Nevertheless, the greedy cost allocation scheme may appear "unfair" from the point of view of individual users (see Ex. 7.5).

[game theorists disagree on "the best" network cost allocation scheme]

EX. 7.5. Consider a user set N = {p_1, p_2} with connection cost coefficients c_01 = 100, c_02 = 101 and c_12 = 2. The greedy algorithm constructs the coalition chain

∅ = S_0 ⊂ S_1 = {p_1} ⊂ S_2 = {p_1, p_2} = N

and charges c(S_1) = 100 to user p_1 and c(S_2) − c(S_1) = 2 to user p_2. So p_1 would have to bear about 98% of the total cost c(N) = 102.
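The greedy procedure (G_0)-(G_2) is short enough to state as code. A sketch reproducing Ex. 7.5 (names are ours; the cost function c is given as a table of coalition costs):

```python
def greedy_charges(players, c):
    # Greedy chain (G0)-(G2): repeatedly add the user whose addition is
    # cheapest and charge it the marginal connection cost c(S ∪ p) - c(S).
    S, charges = frozenset(), {}
    while S != players:
        p = min(players - S, key=lambda q: c[S | frozenset([q])])
        charges[p] = c[S | frozenset([p])] - c[S]
        S = S | frozenset([p])
    return charges

# Ex. 7.5: c(S) = minimal cost of connecting S to the supply node p0.
c = {frozenset(): 0,
     frozenset(['p1']): 100,          # direct link p0-p1
     frozenset(['p2']): 101,          # direct link p0-p2
     frozenset(['p1', 'p2']): 102}    # link p0-p1 plus the cheap link p1-p2
charges = greedy_charges(frozenset(['p1', 'p2']), c)
```

As in the text, the total charge is exactly c(N) = 102, but almost all of it falls on p_1.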
Voting games. Assume there is a set N of n voters i of (not necessarily equal) voting power. Denote by w_i the number of votes voter i can cast. Given a threshold w, the associated voting game (also known as a threshold game) has the characteristic function

v(S) = 1 if Σ_{i∈S} w_i ≥ w, 0 otherwise.

In the voting context, v(S) = 1 has the interpretation that the coalition S has the voting power to make a certain proposed measure pass. Notice that in the case v(S) = 0, a voter i with marginal value

∂_i v(S) = v(S ∪ i) − v(S) = 1

has the power to swing the vote by joining S. The general question is of high political importance:

• How can (or should) one assess the overall voting power of a voter i in a voting context?

REMARK. A popular index for individual voting power is the BANZHAF power index (see Section 3 below). However, there are alternative evaluations that also have their merits. As in the case of network cost allocation, abstract mathematics cannot decide what the "best" method would be.
2. The core
Note this particular feature in our current discussion:

• In order to avoid technicalities, we assume that all cooperative games in this section are zero-normalized.
The core of a (zero-normalized) cooperative profit game (N, v) is the set

core(v) = {x ∈ R^N | x(N) = v(N), x(S) ≥ v(S) ∀ S ⊆ N},

with the notational understanding x(S) = Σ_{i∈S} x_i. Mathematically speaking, core(v) is the solution set of a finite number of linear inequalities in the Euclidean space R^N.

In a game-theoretic interpretation, on the other hand, a vector x ∈ core(v) is an assignment of individual values x_i to the players i ∈ N such that the value v(N) is distributed completely and each coalition S receives at least its proper value v(S).

Inequality (40) above, for example, exhibits the suppliers' market values w*_s (see equation (38)) as the coefficients of a core vector in a linear production game.

The core of a cost game (N, c) is defined analogously:

core*(c) = {x ∈ R^N | x(N) = c(N), x(S) ≤ c(S) ∀ S ⊆ N}.

x ∈ core*(c) distributes the cost c(N) among the players i ∈ N so that no coalition S pays more than its proper cost c(S).

EX. 7.6. Show for the (zero-normalized) cooperative game (N, v) and its dual (N, v*):

core(v*) = core*(v).

EX. 7.7. Give the example of a cooperative game (N, v) with core(v) = ∅.

Alas, as Ex. 7.7 shows, the core is not a generally applicable concept for "fair" profit (or cost) distributions to the individual players in cooperative games because it may be empty. Therefore, further value assignment concepts are of interest. Section 3 will provide examples of such concepts. For the moment, let us continue with the study of the core.
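Since core(v) is cut out by finitely many linear constraints, membership of a given vector can be checked by enumerating all coalitions. A sketch (helper names are ours); the majority game below is a standard instance for Ex. 7.7, since summing its three two-player constraints forces x(N) ≥ 3/2 > 1 = v(N):

```python
from itertools import combinations

def coalitions(N):
    return [frozenset(S) for k in range(len(N) + 1)
            for S in combinations(N, k)]

def in_core(x, v, N):
    # x lies in core(v) iff x(N) = v(N) and x(S) >= v(S) for every coalition.
    xs = lambda S: sum(x[i] for i in S)
    return (abs(xs(N) - v[frozenset(N)]) < 1e-9 and
            all(xs(S) >= v[S] - 1e-9 for S in coalitions(N)))

N = ('a', 'b', 'c')

# Majority game: v(S) = 1 iff |S| >= 2 -- its core is empty, so in
# particular the symmetric split of v(N) = 1 violates some constraint.
maj = {S: (1 if len(S) >= 2 else 0) for S in coalitions(N)}

# Additive game: v(S) = |S| -- here the vector (1, 1, 1) is a core vector.
add = {S: len(S) for S in coalitions(N)}
```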
The MONGE algorithm. We consider a fixed cooperative game (N, v) with n players and collection N of coalitions. Given a parameter vector c ∈ R^N and an arrangement π = i_1 i_2 ... i_n of the elements of N, the MONGE algorithm constructs a primal MONGE vector x^π (indexed by the players) and a dual MONGE vector y^π (indexed by the coalitions) as follows:

(M_0) Set S^π_0 = ∅ and S^π_k = {i_1, ..., i_k} for k = 1, 2, ..., n.
(M_1) Set x^π_{i_k} = v(S^π_k) − v(S^π_{k−1}) for k = 1, 2, ..., n.
(M_2) Set y^π_{S^π_n} = c_{i_n} and y^π_{S^π_ℓ} = c_{i_ℓ} − c_{i_{ℓ+1}} for ℓ = 1, 2, ..., n − 1; set y^π_S = 0 otherwise.

[G. MONGE (1746-1818)]
It is not hard to see that the MONGE vectors x^π and y^π satisfy the identity

m^π(c) = Σ_{i∈N} c_i x^π_i = Σ_{S∈N} v(S) y^π_S.

Different arrangements π and ψ of N, of course, may yield different MONGE sums m^π(c) and m^ψ(c). Important is the following observation.

LEMMA 7.1. Let π = i_1 i_2 ... i_n and ψ = j_1 j_2 ... j_n be two arrangements of N such that

c_{i_1} ≥ c_{i_2} ≥ ... ≥ c_{i_n} and c_{j_1} ≥ c_{j_2} ≥ ... ≥ c_{j_n}.

Then

m^π(c) = Σ_{S∈N} v(S) y^π_S = Σ_{S∈N} v(S) y^ψ_S = m^ψ(c).

Proof.
Note that c_{i_ℓ} = c_{i_{ℓ+1}}, for example, implies y^π_{S_ℓ} = 0. So we may assume that the components of c have pairwise different values. But then π = ψ holds, which makes the claim trivial. ⋄

The MONGE extension. Lemma 7.1 says that there is a well-defined real-valued function c ↦ [v](c), where

[v](c) = m^π(c) if π = i_1 i_2 ... i_n s.t. c_{i_1} ≥ c_{i_2} ≥ ... ≥ c_{i_n}.

The function [v] is the MONGE extension of the characteristic function v. To justify the terminology "extension", consider a coalition S ⊆ N and its (0,1)-incidence vector c^(S) with the component c^(S)_i = 1 in the case i ∈ S. An appropriate MONGE arrangement of the elements of N lists first all 1-components and then all 0-components:

π = i_1 ... i_k ... i_n s.t. c^(S)_{i_1} ... c^(S)_{i_k} ... c^(S)_{i_n} = 1 ... 1 0 ... 0.

Hence we have y^π_S = 1 and y^π_T = 0 for T ≠ S and conclude

[v](c^(S)) = v(S) for all S ⊆ N.

REMARK (CHOQUET and LOVÁSZ). Applying the idea of MONGE sums to nondecreasing value arrangements f_1 ≤ ... ≤ f_n of nonnegative functions f : N → R_+, one arrives at the CHOQUET integral

∫ f dv = Σ_{k=1}^n f_k (v(A_k) − v(A_{k+1})),

where A_k = {k, k+1, ..., n} and A_{n+1} = ∅.

[cf. Section 5 of the Appendix]
The map f ↦ ∫ f dv is the so-called LOVÁSZ extension of the characteristic function v. Of course, mutatis mutandis, all structural properties carry over from MONGE to CHOQUET and LOVÁSZ.
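Steps (M_0)-(M_2) and the MONGE sum identity are easy to check numerically. A sketch under naming conventions of our own:

```python
from itertools import combinations

def monge_vectors(pi, v, c):
    # (M0)-(M2): primal vector x^pi of marginal values along the chain
    # S_0 ⊂ S_1 ⊂ ... ⊂ S_n; dual vector y^pi supported on the chain sets.
    n = len(pi)
    S = [frozenset(pi[:k]) for k in range(n + 1)]
    x = {pi[k - 1]: v[S[k]] - v[S[k - 1]] for k in range(1, n + 1)}
    y = {T: 0 for T in v}
    y[S[n]] = c[pi[n - 1]]
    for l in range(1, n):
        y[S[l]] = c[pi[l - 1]] - c[pi[l]]
    return x, y

# A small game: v(S) = |S|^2 on N = {a, b, c}.
N = ('a', 'b', 'c')
v = {frozenset(T): len(T) ** 2
     for k in range(4) for T in combinations(N, k)}
c = {'a': 3, 'b': 2, 'c': 1}          # nonincreasing along pi = abc
x, y = monge_vectors(N, v, c)

# The Monge sum identity: sum_i c_i x_i = sum_S v(S) y_S.
primal = sum(c[i] * xi for i, xi in x.items())
dual = sum(v[T] * yT for T, yT in y.items())
```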
Linear programming aspects. Since core(v) is defined as the solution set of a finite number of inequalities (and one equality), it is natural to study linear programs with the core as their set of feasible solutions. Consider the linear program

(41) min_{x∈R^N} c^T x s.t. x(N) = v(N) and x(S) ≥ v(S) if S ≠ N

and its dual

(42) max_y v^T y s.t. Σ_{S∋i} y_S ≤ c_i ∀ i ∈ N and y_S ≥ 0 if S ≠ N.

Observe in the case c_{i_1} ≥ ... ≥ c_{i_n} that the dual MONGE vector y^π relative to c is a feasible solution since y^π_S ≥ 0 is satisfied for all S ≠ N. Hence, if core(v) ≠ ∅, both linear programs have optimal solutions. Linear programming duality then furthermore shows

(43) ṽ(c) = min_{x∈core(v)} c^T x ≥ v^T y^π = [v](c).

THEOREM 7.2. ṽ = [v] holds for the game (N, v) if and only if all primal MONGE vectors x^π lie in core(v).

Proof. Assume c_{i_1} ≥ ... ≥ c_{i_n} and π = i_1 ... i_n. If x^π ∈ core(v), then x^π is a feasible solution for the linear program (41). Since the dual MONGE vector y^π is feasible for (42), we find

c^T x^π ≥ ṽ(c) ≥ [v](c) = c^T x^π

and hence ṽ(c) = [v](c).

Conversely, ṽ = [v] means that the dual MONGE vector is guaranteed to yield an optimal solution for (42). So consider an arrangement ψ = j_1 ... j_n of N and the parameter vector c ∈ R^N with the components c_{j_k} = n + 1 − k for k = 1, ..., n. The dual vector y^ψ has strictly positive components y^ψ_{S_k} = 1 > 0 on the sets S_k. It follows from the KKT-conditions for optimal solutions that an optimal solution x* ∈ core(v) of the corresponding linear program (41) must satisfy the equalities

x*(S_k) = Σ_{i∈S_k} x*_i = v(S_k) for k = 1, ..., n,

which means that x* is exactly the primal MONGE vector x^ψ and, hence, that x^ψ ∈ core(v) holds. ⋄

Concavity. Let us call a characteristic function v : 2^N → R concave if v arises from the restriction of a concave function to the (0,1)-incidence vectors c^(S) of the coalitions S, i.e., if there is a concave function f : R^N → R such that

v(S) = f(c^(S)) holds for all S ⊆ N.

Accordingly, the cooperative game Γ = (N, v) is concave if v is concave. We will not pursue an investigation of general concave cooperative games here but focus on a particularly important class of concave games which are closely tied to the MONGE algorithm via Theorem 7.2.

PROPOSITION 7.1. If all MONGE vectors of the game (N, v) lie in core(v), then (N, v) is concave.

Proof. By Theorem 7.2, the hypothesis of the Proposition says

ṽ(c) = [v](c) for all c ∈ R^N.

Consequently, it suffices to demonstrate that ṽ is a concave function. Clearly, ṽ(λc) = λṽ(c) holds for all scalars λ > 0, i.e., ṽ is positively homogeneous. Consider now arbitrary parameter vectors c, d ∈ R^N and x ∈ core(v) such that ṽ(c + d) = (c + d)^T x. Then

ṽ(c + d) = c^T x + d^T x ≥ ṽ(c) + ṽ(d),

which exhibits ṽ as concave. ⋄

REMARK. The converse of Proposition 7.1 is not true: there are concave games whose core does not include all primal MONGE vectors.
A word of terminological caution.
The game-theoretic literature often applies the terminology "convex cooperative game" to games (N, v) having all primal MONGE vectors in core(v). In our terminology, however, such games are not convex but concave. To avoid terminological confusion, one may prefer to refer to them as supermodular games (cf. Theorem 7.3 in the next Section 2.3).
The central notion that connects the MONGE algorithm with concavity is the notion of supermodularity:

THEOREM 7.3. For the cooperative game (N, v), the following statements are equivalent:

(I) All MONGE vectors x^π ∈ R^N lie in core(v).
(II) v is supermodular, i.e., satisfies the inequality

v(S ∩ T) + v(S ∪ T) ≥ v(S) + v(T) for all S, T ⊆ N.

Proof. Assuming (I), order the elements of N in an order i_1 ... i_n such that

S ∩ T = {i_1, ..., i_k}, S = {i_1, ..., i_ℓ}, S ∪ T = {i_1, ..., i_m}.

By the definition of the MONGE algorithm, x^π then satisfies

x^π(S ∩ T) = v(S ∩ T), x^π(S) = v(S), x^π(S ∪ T) = v(S ∪ T).

Moreover, x^π(T) ≥ v(T) holds if x^π ∈ core(v). So we deduce the supermodular inequality

v(S ∩ T) + v(S ∪ T) = x^π(S ∩ T) + x^π(S ∪ T) = x^π(S) + x^π(T) ≥ v(S) + v(T).

Conversely, suppose that v is not supermodular. We will exhibit a MONGE vector x^π that is not a member of core(v). Let S, T ⊆ N be such that

v(S ∩ T) + v(S ∪ T) < v(S) + v(T)

is true and arrange the elements of N in an order π = i_1 ... i_n such that

S ∩ T = {i_1, ..., i_k}, S = {i_1, ..., i_ℓ}, S ∪ T = {i_1, ..., i_m}.

Consequently, the MONGE vector x^π satisfies

v(S) + v(T) > v(S ∩ T) + v(S ∪ T) = x^π(S ∩ T) + x^π(S ∪ T) = x^π(S) + x^π(T) = v(S) + x^π(T)

and therefore v(T) > x^π(T), which shows x^π ∉ core(v). ⋄

EX. 7.8. The preceding proof uses the fact that any vector x ∈ R^N satisfies the modular equality

x(S ∩ T) + x(S ∪ T) = x(S) + x(T) for all S, T ⊆ N,

with the understanding x(S) = Σ_{i∈S} x_i.

A characteristic function v is called submodular if the inequality

v(S ∩ T) + v(S ∪ T) ≤ v(S) + v(T)

holds for all S, T ⊆ N.

EX. 7.9. Show for the zero-normalized game (N, v) the equivalence of the statements:
(1) v is supermodular.
(2) v* is submodular.
(3) w = −v is submodular.

In view of the equality core(c*) = core*(c) (Ex. 7.6), we find that the MONGE algorithm also constructs vectors in the core*(c) of cooperative cost games (N, c) with submodular characteristic functions c.
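Both directions of Theorem 7.3 can be verified exhaustively on small games. A sketch (names are ours): v(S) = |S|^2 is supermodular, while the game that pays 1 to every nonempty coalition is not:

```python
from itertools import combinations, permutations

def coalitions(N):
    return [frozenset(S) for k in range(len(N) + 1)
            for S in combinations(N, k)]

def is_supermodular(v, N):
    cs = coalitions(N)
    return all(v[S & T] + v[S | T] >= v[S] + v[T] for S in cs for T in cs)

def marginal_vector(pi, v):
    # Primal Monge vector x^pi for the arrangement pi, via (M0)-(M1).
    x, S = {}, frozenset()
    for i in pi:
        x[i] = v[S | {i}] - v[S]
        S = S | {i}
    return x

def all_monge_in_core(v, N):
    cs = coalitions(N)
    def in_core(x):
        # x(N) = v(N) holds automatically (telescoping sum).
        return all(sum(x[i] for i in S) >= v[S] for S in cs)
    return all(in_core(marginal_vector(pi, v)) for pi in permutations(N))

N = ('a', 'b', 'c')
conv = {S: len(S) ** 2 for S in coalitions(N)}        # supermodular
flat = {S: (1 if S else 0) for S in coalitions(N)}    # not supermodular
```

In agreement with Theorem 7.3, the two predicates always answer alike.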
REMARK. Note the fine point of Theorem 7.3, which in the language of submodularity says: (N, c) is a submodular cost game if and only if all MONGE vectors x^π lie in core*(c).

Network connection games are typically not submodular. Yet, the particular cost distribution vector discussed in Section 1.5 does lie in core*(c), as the ambitious reader is invited to demonstrate.

REMARK. Because of the MONGE algorithm, sub- and supermodular functions play a prominent role in the field of discrete optimization. In fact, many results of discrete optimization have a direct interpretation in the theory of cooperative games. Conversely, the model of cooperative games often provides conceptual insight into the structure of discrete optimization problems.

[cf. S. FUJISHIGE (2005): Submodular Functions and Optimization, 2nd ed., Annals of Discrete Mathematics 58]

REMARK. The MONGE algorithm, applied to linear programs with core-type constraints, is also known as the greedy algorithm in discrete optimization.
3. Values
While the marginal value ∂_i v(S) of player i's decision to join resp. to leave the coalition S is intuitively clear, it is less clear how the overall strength of i should be assessed. From a mathematical point of view, there are infinitely many possibilities to do this.

In general, we understand by a value for the class of all TU-games (N, v) a function Φ : R^N → R^N that associates with every characteristic function v a vector Φ(v) ∈ R^N. Given Φ, the number Φ_i(v) is the assessment of the strength of i ∈ N in the game (N, v) according to the evaluation concept Φ.

The value Φ is said to be linear if Φ is a linear operator, i.e., if one has for all games v, w and scalars λ ∈ R the equality

Φ(λv + w) = λΦ(v) + Φ(w).

In other words: Φ is linear if each component function Φ_i is a linear functional on the vector space R^N.

Recall from Corollary 7.1 that the unanimity games form a basis of R^N, which means that every game v can be uniquely expressed as a linear combination of unanimity games. Hence a linear value Φ is completely determined by the values assigned to unanimity games. The same is true for any other basis of R^N, of course. Indeed, if v_1, ..., v_k ∈ G(N) are arbitrary games and λ_1, ..., λ_k arbitrary real scalars, the linearity of Φ yields

Φ(λ_1 v_1 + ... + λ_k v_k) = λ_1 Φ(v_1) + ... + λ_k Φ(v_k).

We give two typical examples of linear values.
The SHAPLEY value. Consider the unanimity game δ̂_T relative to the coalition T ∈ N, where

δ̂_T(S) = 1 if S ⊇ T, 0 otherwise.

In this case, it might appear reasonable to assess the strength of a player s ∈ N \ T as null, i.e., with the value Φ^Sh_s(δ̂_T) = 0, and the strength of each of the players t ∈ T in equal proportion as

Φ^Sh_t(δ̂_T) = 1/|T|.

Extending Φ^Sh to all games v by linearity in the sense

v = Σ_{T∈N} λ_T δ̂_T ⟹ Φ^Sh(v) = Σ_{T∈N} λ_T Φ^Sh(δ̂_T),

one obtains a linear value v ↦ Φ^Sh(v), the so-called SHAPLEY value.

EX. 7.10. Show that the players in (N, v) and in its zero-normalization (N, v**) are assigned the same SHAPLEY values.

The BANZHAF power index. The BANZHAF power index Φ^B appears at first sight quite similar to the SHAPLEY value, assessing the power

Φ^B_s(δ̂_T) = 0 if s ∈ N \ T

while treating all t ∈ T as equals. Assuming T ≠ ∅, the mathematical difference lies in the scaling factor:

Φ^B_t(δ̂_T) = 1/2^{|T|−1} for all t ∈ T.

As was done with the SHAPLEY value, the BANZHAF power index is extended by linearity from unanimity games to all games (N, v) and thus gives rise to a linear value v ↦ Φ^B(v).

As we will see in Section 3.2, the difference between the values Φ^Sh and Φ^B can also be explained by two different probabilistic assumptions about the way coalitions are formed.

REMARK. If |T| ≥ 1, the SHAPLEY value distributes the total amount

Σ_{i∈N} Φ^Sh_i(δ̂_T) = 1 = δ̂_T(N)

to the members of N and is, therefore, said to be efficient. In contrast, we have

Σ_{i∈N} Φ^B_i(δ̂_T) = |T|/2^{|T|−1} < δ̂_T(N) if |T| ≥ 2.

So the BANZHAF power index is not efficient.
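By Corollary 7.1, the SHAPLEY value of an arbitrary game can be computed by expanding v in the unanimity basis, i.e., via its HARSANYI dividends (Ex. 7.3), and splitting each dividend w(T) equally among the members of T. A sketch with names of our own choosing:

```python
from itertools import combinations

def coalitions(N):
    return [frozenset(S) for k in range(len(N) + 1)
            for S in combinations(N, k)]

def harsanyi_dividends(v, N):
    # Moebius inversion: the unique w with v(S) = sum_{T ⊆ S} w(T).
    w = {}
    for S in sorted(coalitions(N), key=len):
        w[S] = v[S] - sum(w[T] for T in w if T < S)
    return w

def shapley(v, N):
    # Linearity on the unanimity basis: each dividend w(T) is split
    # equally among the members of T.
    w = harsanyi_dividends(v, N)
    return {i: sum(wT / len(T) for T, wT in w.items() if i in T)
            for i in N}

# Sample game: the combination delta-hat_{a,b} + 2 * delta-hat_{c}.
N = ('a', 'b', 'c')
T1, T2 = frozenset('ab'), frozenset('c')
v = {S: (1 if T1 <= S else 0) + (2 if T2 <= S else 0)
     for S in coalitions(N)}
phi = shapley(v, N)
```

Each member of {a, b} receives 1/2 of the first dividend, c receives the full second dividend, and the total distributed equals v(N): the value is efficient.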
The concept of a random value is based on the assumption that a player i ∈ N joins a coalition S ⊆ N \ {i} with a certain probability π_S as a new member. The expected marginal value of i ∈ N thus is

E^π_i(v) = Σ_{S⊆N\{i}} ∂_i v(S) π_S.

The function E^π : G(N) → R^N with components E^π_i(v) is the associated random value.

Notice that marginal values are linear. Indeed, if u = λv + w, one has

∂_i u(S) = λ∂_i v(S) + ∂_i w(S) for all i ∈ N and S ⊆ N.

Therefore, the random value E^π is linear as well:

(44) E^π(λv + w) = λE^π(v) + E^π(w).

REMARK. The linearity relation (44) implicitly assumes that the probabilities π_S are independent of the particular characteristic function v. If π_S depends on v, the linearity of E^π is no longer guaranteed! The BOLTZMANN value (to be discussed in Section 4 below) is a random value that is not linear in the sense of (44) because the associated probability distribution depends on the characteristic function.
The value of Banzhaf. As an example, let us assume that a player $i$ joins any of the $2^{n-1}$ coalitions $S \subseteq N\setminus\{i\}$ with equal likelihood, i.e., with probability

$$\pi^B_S = \frac{1}{2^{n-1}}.$$

Consider the unanimity game $v_T = \hat\delta_T$ and observe that $\partial_i v_T(S) = 0$ holds if $i \notin T$. On the other hand, if $i \in T$, then one has

$$\partial_i v_T(S) = 1 \iff T\setminus\{i\} \subseteq S.$$

So the number of coalitions $S$ with $\partial_i v_T(S) = 1$ equals

$$|\{S \subseteq N\setminus\{i\} \mid T \subseteq S\cup\{i\}\}| = 2^{n-|T|}.$$

Hence we conclude

(45) $\quad E^{\pi^B}_i(v_T) = \sum_{S\subseteq N\setminus\{i\}} \partial_i v_T(S)\,\pi^B_S = \frac{2^{n-|T|}}{2^{n-1}} = \frac{1}{2^{|T|-1}},$

which means that the random value $E^{\pi^B}$ is identical with the Banzhaf power index. The probabilistic approach yields the explicit formula

(46) $\quad \Phi^B_i(v) = E^{\pi^B}_i(v) = \frac{1}{2^{n-1}} \sum_{S\subseteq N\setminus\{i\}} \big(v(S\cup i) - v(S)\big) \quad (i \in N).$
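Formula (46) is easy to evaluate by direct enumeration for small $n$. The following sketch (our own illustration, not part of the text; the function name and the example game are chosen for demonstration only) computes the Banzhaf value of every player and also exhibits the inefficiency noted in the remark above:

```python
from itertools import combinations

def banzhaf(n, v):
    """Banzhaf value via formula (46): Phi_i(v) = 2^(1-n) * sum of marginals."""
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = sum(v(frozenset(S) | {i}) - v(frozenset(S))
                    for k in range(n)
                    for S in combinations(others, k))
        phi.append(total / 2 ** (n - 1))
    return phi

# unanimity game delta_T with T = {0, 1} in a 3-player game
T = frozenset({0, 1})
delta_T = lambda S: 1.0 if T <= S else 0.0
print(banzhaf(3, delta_T))   # [0.5, 0.5, 0.0]: members of T get 1/2^(|T|-1)
```

For the full unanimity game ($|T| = 3$) the values sum to $3/4 < 1$, illustrating that the Banzhaf power index is not efficient.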
Marginal vectors and the Shapley value. Let us imagine that the members of $N$ build up the "grand coalition" $N$ in a certain order $\sigma = i_1 i_2 \ldots i_n$ and hence join in the sequence of coalitions

$$\emptyset = S^\sigma_0 \subset S^\sigma_1 \subset \cdots \subset S^\sigma_k \subset \cdots \subset S^\sigma_n = N, \quad\text{where } S^\sigma_k = S^\sigma_{k-1} \cup \{i_k\}$$

for $k = 1,\ldots,n$. Given the game $(N,v)$, $\sigma$ gives rise to the marginal vector $\partial^\sigma(v) \in \mathbb{R}^N$ with components

$$\partial^\sigma_{i_k}(v) = v(S^\sigma_k) - v(S^\sigma_{k-1}) \quad (k = 1,\ldots,n).$$

Notice that $v \mapsto \partial^\sigma(v)$ is a linear value for $\mathcal{G}(N)$. We can randomize this value by picking the order $\sigma$ from the set $\Sigma_N$ of all orders of $N$ according to a probability distribution $\pi$. Then the expected marginal vector

$$\partial^\pi(v) = \sum_{\sigma\in\Sigma_N} \partial^\sigma(v)\,\pi_\sigma$$

represents, of course, also a linear value on $\mathcal{G}(N)$.

EX. 7.11. Show that the value $v \mapsto \partial^\pi(v)$ is linear and efficient. (HINT: Recall the discussion of the greedy algorithm for network connection games in Section 1.5: the marginal vectors are precisely the primal Monge vectors of Section 2.1.)

PROPOSITION. The Shapley value results as the expected marginal vector relative to the uniform probability distribution on $\Sigma_N$, where all orders are equally likely:

$$\Phi^{Sh}(v) = \frac{1}{n!} \sum_{\sigma\in\Sigma_N} \partial^\sigma(v).$$

(Recall from combinatorics that there are $n! = |\Sigma_N|$ ordered arrangements of $N$.)

Proof. Because of linearity, it suffices to prove the Proposition for unanimity games $v_T = \hat\delta_T$. For any order $\sigma = i_1\ldots i_n \in \Sigma_N$ and element $i_k \in N\setminus T$, we have $\partial^\sigma_{i_k}(v_T) = 0$ and hence

$$\frac{1}{n!} \sum_{\sigma\in\Sigma_N} \partial^\sigma_i(v_T) = 0 \quad\text{for all } i \in N\setminus T.$$

On the other hand, the uniform distribution treats all members $i \in T$ equally and thus distributes the value $v_T(N) = 1$ equally and efficiently among the members of $T$:

$$\frac{1}{n!} \sum_{\sigma\in\Sigma_N} \partial^\sigma_i(v_T) = \frac{v_T(N)}{|T|} = \frac{1}{|T|} \quad\text{for all } i \in T,$$

which is exactly the concept of the Shapley value. ⋄

We can interpret the Shapley value within the framework of random values initially introduced. So we assume that an order $\sigma \in \Sigma_N$ is chosen with probability $1/n!$ and consider a coalition $S \subseteq N\setminus\{i\}$. We ask:

• What is the probability $\pi^{Sh}_S$ that $i$ would join $S$?

Letting $k-1 = |S|$ be the size of $S$, the number of sequences $\sigma$ where $i$ would be added to $S$ is

$$|\{\sigma \mid i = i_k \text{ and } S^\sigma_{k-1} = S\}| = (k-1)!\,(n-k)!$$

This is so because:
(1) The first $k-1$ elements must be chosen from $S$ in any of the $(k-1)!$ possible orders.
(2) The remaining $n-k$ elements must be from $N\setminus(S\cup\{i\})$.

So one concludes

$$\pi^{Sh}_S = \frac{(k-1)!\,(n-k)!}{n!} = \frac{|S|!\,(n-|S|-1)!}{n!}$$

and obtains another explicit formula for the Shapley value:

(47) $\quad \Phi^{Sh}_i(v) = \sum_{S\subseteq N\setminus\{i\}} \partial_i v(S)\,\pi^{Sh}_S = \sum_{S\subseteq N\setminus\{i\}} \partial_i v(S)\,\frac{|S|!\,(n-|S|-1)!}{n!}.$

EX. 7.12. Consider a voting/threshold game (cf. Section 1.6) with four players of weights $w_1 = 3$, $w_2 = 2$, $w_3 = 2$, $w_4 = 1$. Compute the Banzhaf and the Shapley values for each of the players for the threshold $w = 4$.
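Formula (47) translates directly into code. The sketch below (our own illustration, not from the text) enumerates all coalitions $S \subseteq N\setminus\{i\}$ with the probabilities $\pi^{Sh}_S$; the toy game is a simple 3-player majority game, chosen here only for demonstration:

```python
from itertools import combinations
from math import factorial

def shapley(n, v):
    """Shapley value via formula (47) with weights |S|!(n-|S|-1)!/n!."""
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            pi_sh = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                total += pi_sh * (v(frozenset(S) | {i}) - v(frozenset(S)))
        phi.append(total)
    return phi

# 3-player majority game: a coalition wins iff it has at least 2 members
v = lambda S: 1.0 if len(S) >= 2 else 0.0
phi = shapley(3, v)
print(phi)          # symmetric players: each gets 1/3
print(sum(phi))     # efficiency: the total equals v(N) = 1
```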
4. Boltzmann values
The probabilistic analysis of the previous section shows that the value assessment concepts of the Banzhaf power index and the Shapley value, for example, implicitly assume that players just join – but never leave – an existing coalition in a cooperative game $(N,v) \in \mathcal{G}(N)$.

In contrast, the model of the present section assumes an underlying probability distribution $\pi$ on the set $\mathcal{N}$ of all coalitions of $N$ and assigns to player $i \in N$ its expected marginal value

$$E_i(v,\pi) = \sum_{S\subseteq N} \partial_i v(S)\,\pi_S = \sum_{S\subseteq N} \big(v(S\,\Delta\, i) - v(S)\big)\,\pi_S.$$

We furthermore do allow $\pi$ to depend on the particular characteristic function $v$ under consideration. So the functional $v \mapsto E_i(v,\pi)$ is not guaranteed to be linear.

What probability distribution $\pi$ should one reasonably expect in this model? To answer this question, we consider the associated expected characteristic value

$$\mu = \sum_{S\subseteq N} v(S)\,\pi_S$$

as a relevant parameter and ask:

• Given just $\mu$, which probability distribution $\hat\pi$ would be the best unbiased guess for (the unknown) $\pi$?

From the information-theoretic point of view, the best unbiased guess $\hat\pi$ is the one with the highest entropy among those probability distributions $\pi$ yielding the expected value $\mu$. So we seek a solution to the optimization problem

(48) $\quad \max_{x_S \geq 0}\; H(x) = -\sum_{S\subseteq N} x_S \ln x_S \quad\text{s.t.}\quad \mu = \sum_{S\subseteq N} v(S)\,x_S, \quad \sum_{S\subseteq N} x_S = 1.$

THEOREM 7.4. For every potential $v : \mathcal{N} \to \mathbb{R}$ and possible expected value $\mu$ of $v$, there exists a unique parameter $-\infty \leq T \leq +\infty$ such that

(1) $\mu = \sum_{S\subseteq N} v(S)\,e^{v(S)T}/Z_T$, where $Z_T = \sum_{S\subseteq N} e^{v(S)T}$;

(2) the numbers $b^T_S = e^{v(S)T}/Z_T$ are strictly positive and yield the unique optimal solution of the entropy optimization problem (48).

A proof of Theorem 7.4 can be found in Section 6.2 of the Appendix.

The probabilities $b^T_S$ define the Boltzmann distribution of $v : \mathcal{N} \to \mathbb{R}$ relative to the parameter $T$. So we obtain a Boltzmann value $\Phi^B$ for every $v$ and parameter $T$:

(49) $\quad \Phi^B_i(v;T) = \sum_{S\subseteq N} \partial_i v(S)\,b^T_S = \frac{1}{Z_T} \sum_{S\subseteq N} \partial_i v(S)\,e^{v(S)T} \quad (i \in N).$

Let us look at some extreme cases. For $T = 0$, the Boltzmann distribution is just the uniform distribution on $\mathcal{N}$,

$$b^0_S = \frac{1}{|\mathcal{N}|} = \frac{1}{2^n} \quad\text{for all } S \subseteq N,$$

and the Boltzmann value of a player $i \in N$ is the average over all its marginal values:

$$\Phi^B_i(v;0) = \frac{1}{2^n} \sum_{S\subseteq N} \partial_i v(S).$$

In the case $T = +\infty$, $b^\infty$ becomes the uniform distribution on the set $V_{\max} \subseteq \mathcal{N}$ of all maximizers of $v$ (see Ex. 7.13). Hence one has

$$\Phi^B_i(v;+\infty) = \frac{1}{|V_{\max}|} \sum_{S\in V_{\max}} \partial_i v(S).$$
Similarly, one sees that $b^{-\infty}$ is the uniform distribution on the set $V_{\min}$ of minimizers of $v$.

EX. 7.13. Let $S^* \in V_{\max}$ be a maximizer of $v$ and $S \subseteq N$ arbitrary. Then

$$\lim_{T\to+\infty} \frac{e^{v(S)T}}{e^{v(S^*)T}} = \lim_{T\to+\infty} \big(e^{v(S)-v(S^*)}\big)^T = \begin{cases} 1 & \text{if } v(S) = v(S^*) \\ 0 & \text{otherwise.} \end{cases}$$

Conclude that $b^\infty$ is the uniform distribution on $V_{\max}$ (all other coalitions occur with probability $0$).

Temperature. In analogy with the Boltzmann model in statistical thermodynamical physics, we may think of $v$ as an energy potential function and ask for its minimizers. So we face Boltzmann distributions $b^T$ with $T < 0$. Where $k_B > 0$ is a normalizing factor (the precise physical value of the so-called Boltzmann constant $k_B$ is irrelevant for our game-theoretic purposes!), we define

$$\tilde\theta = -\frac{1}{k_B T} \geq 0 \quad\text{and hence}\quad b^T_S = \frac{1}{Z_T}\,e^{-v(S)/k_B\tilde\theta}.$$

Thermodynamics interprets $\tilde\theta$ as the temperature of the cooperative system with potential distribution $b^T$. Hence, if $\tilde\theta \to 0$, the system attains a state (coalition) of minimal potential with high probability. In the high temperature case $\tilde\theta \to \infty$, the system becomes more and more unpredictable: all states (coalitions) are about equally likely (i.e., are approximately uniformly distributed).

The analogy with physics suggests to take the nonnegative parameter

$$\theta = \frac{1}{|T|}$$

as a measure for the temperature in general game-theoretic contexts. In particular, it appears reasonable to speak of the "temperature" of an economic system, for example, if the system is assumed to be governed by a potential (like the gross national product or a similar global indicator). The conclusions are the same:

• If $\theta \to \infty$, all joint strategic actions are approximately equally likely.
• If $\theta \to 0$, the expected potential value becomes extreme, namely maximal if $T \to +\infty$ and minimal if $T \to -\infty$.
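The limiting behavior of the Boltzmann distribution $b^T$ is easy to observe numerically. In the sketch below (our own illustration; the states and potential values are arbitrary), $T = 0$ yields the uniform distribution, while a large $T$ concentrates the probability mass on the maximizers of $v$, as in Ex. 7.13:

```python
from math import exp

def boltzmann(v, T):
    """Boltzmann probabilities b^T_S = e^{v(S) T} / Z_T over the states of v."""
    Z = sum(exp(val * T) for val in v.values())
    return {S: exp(val * T) / Z for S, val in v.items()}

v = {"a": 0.0, "b": 1.0, "c": 1.0, "d": 0.5}   # two maximizers: b and c
print(boltzmann(v, 0))     # uniform: every state has probability 1/4
print(boltzmann(v, 20))    # nearly uniform on the maximizers b and c
mu = sum(v[S] * p for S, p in boltzmann(v, 20).items())
print(mu)                  # expected value close to v_max = 1
```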
5. Coalition formation
The term coalition formation refers originally to cooperative TU-games. The formation of a coalition that will eventually be active is viewed as a dynamic process where coalitions evolve over time according to the behavior of players that might leave the momentary coalition and join other players for a new coalition etc. The question how a decision process evolves over time is, of course, also of interest for a general potential game $\Gamma = (N,v)$ with a system $X$ of joint strategies.

If one assumes $|X| < \infty$ and the agents $i \in N$ to be individually greedy, i.e., to switch their strategic action if a switch offers a better marginal value, then one must conclude that the agents eventually reach an action equilibrium $x^* \in X$ (cf. Proposition 5.2).

On the other hand, if we make no a priori assumption on the individual agents but assume that the decision process eventually arrives at the joint action $x \in X$ with a probability $\pi_x$ and produces the expected potential value

$$\mu = \sum_{x\in X} v(x)\,\pi_x,$$

an unbiased estimate of the probability distribution $\pi$ leads to the conclusion (Theorem 7.4) that the decision process eventually produces the joint strategic choice $x \in X$ with the Boltzmann probability

$$b^T_x = \frac{1}{Z_T}\,e^{v(x)T} \quad\text{with } T \text{ such that } \mu = \sum_{x\in X} v(x)\,b^T_x.$$

Metropolis et al. have formulated the model of a stochastic process that converges to the distribution $b^T$ with $T \geq 0$ as follows:

(M1) If the process is currently in the state $x$, an agent $i \in N$ is chosen with some probability $p_i > 0$;
(M2) $i$ chooses an action $y \in X_i$ with probability $q_y > 0$;
(M3) If $v(x_{-i}(y)) > v(x)$, then $i$ switches from $x_i$ to $y$;
(M4) If $v(x_{-i}(y)) \leq v(x)$, then $i$ switches from $x_i$ to $y$ with probability $\alpha = e^{(v(x_{-i}(y)) - v(x))T}$ (and does not change the action otherwise).

N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller: Equation of state calculations by fast computing machines. J. Chem. Physics 21 (1953)
The algorithm of Metropolis et al. simulates a so-called Markov chain on $X$. The proof of its correctness is not too difficult but a bit technical. Therefore, we will not reproduce it here but point out the relevant features of the algorithm:

• If $\partial_y v(x) = v(x_{-i}(y)) - v(x) \geq 0$, agent $i$ is greedy and switches the strategic action to $y$.
• If $\partial_y v(x) < 0$, then $i$ switches with a probability that is small when $T$ is large:

$$\big(e^{\partial_y v(x)}\big)^T \to 0 \quad\text{as } T \to +\infty.$$

REMARK. The Metropolis algorithm is easily adjusted for $T \leq 0$: One simply replaces $v$ by the potential $w = -v$ and proceeds with $w$ and the nonnegative parameter $T' = -T$ as above.

Let us assume that $N$ is a society whose common welfare is expressed by the potential $v$. If all members of $N$ act purely greedily, an equilibrium action will eventually be arrived at that does not necessarily lead to a high common welfare level. For example, if all players in a Wardrop traffic situation (cf. Section 4) act purely greedily, no optimal traffic flow is guaranteed. However, if the members of $N$ are prepared to possibly accept a momentary marginal individual deterioration (case (M4) in the algorithm), then a common welfare of level

$$\mu_T(v) = \sum_{x\in X} v(x)\,b^T_x$$

can be expected. Moreover, the larger $T$ is, the closer $\mu_T(v)$ is to the maximal possible level $v_{\max}$. So the society $N$ must offer incentives or individual rewards to induce the players to act as in (M4) in order to achieve a high public welfare level.

If there is only a single agent, $X$ is the strategy set of that agent and the Metropolis algorithm can be used to optimize a function $v : X \to \mathbb{R}$, i.e., to find an optimal solution to the problem $\max_{x\in X} v(x)$ with high probability by adding the procedural step

(M5) After each iteration, increase $T$ slightly with the aim $T \to \infty$.

In this form, the Metropolis algorithm is also known as simulated annealing. It has proven to be a very successful optimization technique in the field of discrete optimization.

REMARKS. Notice that the description of the simulated annealing algorithm is not very specific with respect to its practical implementation. How should the probabilities $q_y$ in (M2) be chosen? How should $T$ be increased in (M5)? etc. So the success of the simulated annealing method will also depend on the skill and experience of its user in practice.
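A minimal sketch of steps (M1)–(M5) follows (our own illustration; the toy potential, the heating schedule, and all parameter choices are hypothetical and are exactly the kind of implementation detail the remarks above leave open):

```python
import math
import random

def metropolis_step(x, v, actions, T, rng):
    """One step (M1)-(M4): pick an agent, propose a switch, accept it greedily
    or otherwise with the Boltzmann probability e^{(v(y) - v(x)) T}."""
    i = rng.randrange(len(x))                      # (M1) choose an agent i
    y = list(x)
    y[i] = rng.choice(actions)                     # (M2) propose an action for i
    dv = v(y) - v(x)
    if dv > 0 or rng.random() < math.exp(dv * T):  # (M3) greedy / (M4) Boltzmann
        return y
    return x

def simulated_annealing(v, actions, n_agents, steps=5000, seed=0):
    """(M5): increase T after each iteration so the chain settles near max v."""
    rng = random.Random(seed)
    x = [rng.choice(actions) for _ in range(n_agents)]
    for t in range(steps):
        T = 0.01 * (t + 1)                         # hypothetical heating schedule
        x = metropolis_step(x, v, actions, T, rng)
    return x

# toy potential: agents are rewarded for coordinating on the same action
v = lambda x: float(sum(x[i] == x[j] for i in range(len(x)) for j in range(i)))
best = simulated_annealing(v, actions=[0, 1, 2], n_agents=4)
print(best, v(best))    # typically a fully coordinated joint action
```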
6. Equilibria in cooperative games
In the previous section, we have discussed the Metropolis algorithm as an example of a Markov chain that models coalition formation according to Boltzmann probabilities. If we go one step back and think of the players in a cooperative game $\Gamma = (N,v)$ as a group whose behavior is guided by individual profits from the achievement of a common goal, we must assume that each $i \in N$ has an individual utility function

$$u_i : \mathcal{N} \to \mathbb{R},$$

where $u_i(S)$ is $i$'s expected gain (or cost) in case the coalition $S$ will be active in $\Gamma$ and thus achieve the value $v(S)$ for $N$. Of course, $u_i$ will naturally depend on the characteristic function $v$ of $\Gamma$. But other considerations may play a role as well.

In the Boltzmann model, a player $i$'s utility criterion is essentially its marginal gain $\partial_i v(S)$ – if it is nonnegative. If it is negative, an additional incentive is assumed to be offered to carry out a strategic switch nonetheless with a certain probability. Typically, the Boltzmann model does not admit coalition equilibria – unless the game $\Gamma$ is played under an extreme temperature $T$. Many other value concepts (like Shapley and Banzhaf, for example) are based on marginal gains as fundamental criteria for the individual utility assessment of a player.

Let us consider a game $\Gamma = (N,v)$ and take into account that the game will eventually split $N$ into a group $S \subseteq N$ and the complementary group $S^c = N\setminus S$. Suppose a player $i \in N$ evaluates the utility of the partition $(S, S^c)$ of $N$ by

$$v_i(S) = v_i(S^c) = \begin{cases} v(S) - v(S\setminus i) & \text{if } i \in S \\ v(S^c) - v(S^c\setminus i) & \text{if } i \in S^c. \end{cases}$$

S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi: Optimization by simulated annealing. Science 220 (1983)
EX. 7.14. Assume that $(N,v)$ is a supermodular game. Then one has for all players $i \neq j$,

$$v_i(N) = v(N) - v(N\setminus i) \geq v(N\setminus j) - v((N\setminus j)\setminus i) = v_i(N\setminus j),$$
$$v_i(N) = v(N) - v(N\setminus i) \geq v(\{i\}) - v(\emptyset) = v_i(N\setminus i).$$

Consequently, the grand coalition $N$ represents a gain equilibrium relative to the utilities $v_i$.

EX. 7.15. Assume that $(N,c)$ is a zero-normalized submodular game and that the players $i$ have the utilities

$$c_i(S) = \begin{cases} c(S) - c(S\setminus i) & \text{if } i \in S \\ c(S^c) - c(S^c\setminus i) & \text{if } i \in S^c. \end{cases}$$

Show: The grand coalition $N$ is a cost equilibrium relative to the utilities $c_i$.

CHAPTER 8

Interaction Systems and Quantum Models
This final chapter investigates a quite general model for cooperation and interaction relative to a set $X$. Using complex numbers, the states of this model are naturally represented as hermitian matrices with complex coefficients. This representation allows us to carry out standard spectral analysis for interaction systems and provides a link to the standard mathematical model of quantum systems in physics. While the analysis could be extended to general Hilbert spaces, $X$ is assumed to be finite to keep the discussion simpler.
1. Algebraic preliminaries
Since matrix algebra is the main tool in our analysis, we review some of the more fundamental notions from linear algebra. Further details and proofs can be found in any decent book on linear algebra (see, e.g., E.D. Nering (1967), Linear Algebra and Matrix Theory, Wiley, New York).

Where $X = \{x_1,\ldots,x_m\}$ and $Y = \{y_1,\ldots,y_n\}$ are two finite index sets, recall that $\mathbb{R}^{X\times Y}$ denotes the real vector space of all matrices $A$ with rows indexed by $X$, columns indexed by $Y$, and coefficients $A_{xy} \in \mathbb{R}$.

The transpose of $A \in \mathbb{R}^{X\times Y}$ is the matrix $A^T \in \mathbb{R}^{Y\times X}$ with the coefficients $A^T_{yx} = A_{xy}$. The map $A \mapsto A^T$ establishes an isomorphism between the vector spaces $\mathbb{R}^{X\times Y}$ and $\mathbb{R}^{Y\times X}$.

Viewing $A, B \in \mathbb{R}^{X\times Y}$ as $mn$-dimensional parameter vectors, we have the usual euclidian inner product

$$\langle A|B\rangle = \sum_{(x,y)\in X\times Y} A_{xy} B_{xy}.$$

In the case $\langle A|B\rangle = 0$, $A$ and $B$ are said to be orthogonal. The associated euclidian norm is

$$\|A\| = \sqrt{\langle A|A\rangle} = \sqrt{\sum_{(x,y)\in X\times Y} |A_{xy}|^2}.$$
We think of a vector $v \in \mathbb{R}^X$ typically as a column vector. So $v^T$ is the row vector with the same coordinates $v^T_x = v_x$. Notice the difference between the two matrix products:

$$v^T v = \sum_{x\in X} |v_x|^2 = \|v\|^2 \quad\text{and}\quad vv^T = \begin{pmatrix} v_{x_1}v_{x_1} & v_{x_1}v_{x_2} & \ldots & v_{x_1}v_{x_m} \\ v_{x_2}v_{x_1} & v_{x_2}v_{x_2} & \ldots & v_{x_2}v_{x_m} \\ \vdots & \vdots & \ddots & \vdots \\ v_{x_m}v_{x_1} & v_{x_m}v_{x_2} & \ldots & v_{x_m}v_{x_m} \end{pmatrix}.$$

Assuming now identical index sets $X = Y = \{x_1,\ldots,x_n\}$, a matrix $A \in \mathbb{R}^{X\times X}$ is symmetric if $A^T = A$. In the case $A^T = -A$, the matrix $A$ is skew-symmetric. With an arbitrary matrix $A \in \mathbb{R}^{X\times X}$, we associate the matrices

$$A^+ = \frac{1}{2}(A + A^T) \quad\text{and}\quad A^- = \frac{1}{2}(A - A^T) = A - A^+.$$

Note that $A^+$ is symmetric and $A^-$ is skew-symmetric. The symmetry decomposition of $A$ is the representation

(50) $\quad A = A^+ + A^-.$

The matrix $A$ allows exactly one decomposition into a symmetric and a skew-symmetric matrix (see Ex. 8.1). So the symmetry decomposition is unique.

EX. 8.1. Let $A, B, C \in \mathbb{R}^{X\times X}$ be such that $A = B + C$. Show that the two statements are equivalent:
(1) $B$ is symmetric and $C$ is skew-symmetric.
(2) $B = A^+$ and $C = A^-$.

Notice that symmetric and skew-symmetric matrices are necessarily pairwise orthogonal (see Ex. 8.2).

EX. 8.2. Let $A$ be a symmetric and $B$ a skew-symmetric matrix. Show:

$$\langle A|B\rangle = 0 \quad\text{and}\quad \|A+B\|^2 = \|A\|^2 + \|B\|^2.$$
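Numerically, the symmetry decomposition and the orthogonality of Ex. 8.2 can be checked in a few lines (a sketch of ours using numpy, not part of the text; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])
A_plus = (A + A.T) / 2         # symmetric part A+
A_minus = (A - A.T) / 2        # skew-symmetric part A-

assert np.allclose(A, A_plus + A_minus)            # decomposition (50)
assert np.isclose(np.sum(A_plus * A_minus), 0.0)   # <A+|A-> = 0 (Ex. 8.2)
# Pythagoras: ||A||^2 = ||A+||^2 + ||A-||^2
assert np.isclose((A**2).sum(), (A_plus**2).sum() + (A_minus**2).sum())
print("symmetry decomposition verified")
```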
2. Complex matrices
In physics and engineering, complex numbers offer a convenient means to represent orthogonal structures. Applying this idea to the symmetry decomposition, one arrives at so-called hermitian matrices.

Recall that a complex number is an expression of the form $z = a + \mathrm{i}b$, where $a$ and $b$ are real numbers and $\mathrm{i}$ a special "new" number, the so-called imaginary unit. In particular, a complex number $z$ of the form $a + \mathrm{i}\cdot 0$ is identified with the real number $a \in \mathbb{R}$. We denote by $\mathbb{C}$ the set of all complex numbers, i.e.,

$$\mathbb{C} = \mathbb{R} + \mathrm{i}\mathbb{R} = \{a + \mathrm{i}b \mid a,b \in \mathbb{R}\}.$$

Complex numbers can be added, subtracted, multiplied and divided according to the algebraic rules for real numbers with the additional computational proviso:

$$\mathrm{i}^2 = -1.$$

$\bar z = a - \mathrm{i}b$ is the conjugate of the complex number $z = a + \mathrm{i}b$. So one has

$$z\bar z = (a+\mathrm{i}b)(a-\mathrm{i}b) = a^2 + b^2 = |z|^2.$$

The conjugate of a complex matrix $C$ is the matrix $\bar C$ with the conjugated coefficients $\bar C_{xy} = \overline{C_{xy}}$. The adjoint $C^*$ of $C$ is the transpose of the conjugate of $C$:

$$C^* = \bar C^T.$$

For two complex matrices $A = A_1 + \mathrm{i}A_2$ and $B = B_1 + \mathrm{i}B_2$ with the matrices $A_1, A_2, B_1, B_2 \in \mathbb{R}^{X\times Y}$, one computes

$$B^*A = (B_1^T - \mathrm{i}B_2^T)(A_1 + \mathrm{i}A_2) = \langle A_1|B_1\rangle + \langle A_2|B_2\rangle + \mathrm{i}\big(\langle A_2|B_1\rangle - \langle A_1|B_2\rangle\big),$$

which means that the definition

$$\langle A|B\rangle = B^*A$$

is a natural way to extend the inner product of real matrices to complex matrices. In particular, one has the Pythagorean property

$$\|A_1 + \mathrm{i}A_2\|^2 = \langle A_1 + \mathrm{i}A_2 \mid A_1 + \mathrm{i}A_2\rangle = \|A_1\|^2 + \|A_2\|^2.$$

A complex matrix $C$ is called selfadjoint if it equals its adjoint:

$$C = C^* = \bar C^T.$$

If $C$ has only real coefficients, then $\bar C = C$, and consequently, 'selfadjoint' boils down to 'symmetric'. It is well-known that real symmetric matrices can be diagonalized. With the same arguments, one can extend this result to general selfadjoint matrices:

THEOREM 8.1. For a matrix $C \in \mathbb{C}^{X\times X}$ the two statements are equivalent:
(1) $C = C^*$.
(2) $\mathbb{C}^X$ admits a unitary basis $U = \{U_x \mid x \in X\}$ of eigenvectors $U_x$ of $C$ with real eigenvalues $\lambda_x$.

Unitary means for the basis $U$ that the vectors $U_x$ have unit norm and are pairwise orthogonal, i.e.,

$$\langle U_x|U_y\rangle = U_y^* U_x = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{if } x \neq y. \end{cases}$$

The scalar $\lambda_x$ is the eigenvalue of the eigenvector $U_x$ of $C$ if $CU_x = \lambda_x U_x$.

It follows from Theorem 8.1 (see Ex. 8.3) that a selfadjoint matrix $C$ admits a spectral decomposition (the spectrum of a matrix is, by definition, its set of eigenvalues), i.e., a representation in the form

(51) $\quad C = \sum_{x\in X} \lambda_x U_x U_x^*,$

where the $U_x$ are pairwise orthogonal eigenvectors of $C$ with eigenvalues $\lambda_x \in \mathbb{R}$.

EX. 8.3. Let $U = \{U_x \mid x \in X\}$ be a unitary basis of $\mathbb{C}^X$ together with a set $\Lambda = \{\lambda_x \mid x \in X\}$ of arbitrary complex scalars. Show:
(1) The $U_x$ are eigenvectors with eigenvalues $\lambda_x$ of the matrix $C = \sum_{x\in X} \lambda_x U_x U_x^*$.
(2) $C$ is selfadjoint if and only if all the $\lambda_x$ are real numbers.

The spectral decomposition shows: The selfadjoint matrices $C$ in $\mathbb{C}^{X\times X}$ are precisely the linear combinations of matrices of type

$$C = \sum_{x\in X} \lambda_x U_x U_x^*,$$

where the $U_x$ are (column) vectors in $\mathbb{C}^X$ and the $\lambda_x$ are real numbers.

Spectral unity decomposition.
As an illustration, consider a matrix $U \in \mathbb{C}^{X\times X}$ with pairwise orthogonal column vectors $U_x$ of norm $\|U_x\| = 1$, which means that the identity matrix $I$ has the representation

$$I = UU^* = U^*U.$$

The eigenvalues of $I$ all have value $\lambda_x = 1$. Relative to $U$, the matrix $I$ has the spectral decomposition

(52) $\quad I = \sum_{x\in X} U_x U_x^*.$

For any vector $v \in \mathbb{C}^X$ with norm $\|v\| = 1$, we therefore find

$$\langle v|v\rangle = v^* I v = \sum_{x\in X} v^* U_x U_x^* v = \sum_{x\in X} |\langle v|U_x\rangle|^2.$$

It follows that the (squared) inner products $p^v_x = |\langle v|U_x\rangle|^2$ of the vector $v$ with the vectors $U_x$ yield a probability distribution $p^v$ on the set $X$.

Consider now, more generally, the selfadjoint matrix $C$ with eigenvalues $\rho_x$ of the form

$$C = \sum_{x\in X} \rho_x U_x U_x^*.$$

Then we have

(53) $\quad \langle v|Cv\rangle = v^* C v = \sum_{x\in X} \rho_x |\langle v|U_x\rangle|^2 = \sum_{x\in X} \rho_x p^v_x.$

In other words: The inner product $\langle v|Cv\rangle$ of the vectors $v$ and $Cv$ is the expected value of the eigenvalues $\rho_x$ of $C$ with respect to the probability distribution $p^v$ on $X$.
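The probability interpretation (52)–(53) can be verified with numpy's eigendecomposition of hermitian matrices (a sketch of ours; the matrix and the state vector are arbitrary examples):

```python
import numpy as np

C = np.array([[2.0, 1j],
              [-1j, 3.0]])                 # a selfadjoint matrix: C = C*
lam, U = np.linalg.eigh(C)                 # real eigenvalues, unitary columns U_x
v = np.array([1.0, 1.0j]) / np.sqrt(2)     # a state vector with ||v|| = 1

p = np.abs(U.conj().T @ v) ** 2            # p_x = |<v|U_x>|^2
assert np.isclose(p.sum(), 1.0)            # a probability distribution on X, cf. (52)

expected = (v.conj() @ C @ v).real         # <v|Cv>, cf. (53)
assert np.isclose(expected, (lam * p).sum())
print(lam, p, expected)
```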
The hermitian representation. Coming back to real matrices in the context of symmetry decompositions, associate with a matrix $A \in \mathbb{R}^{X\times X}$ the complex matrix

$$\hat A = A^+ + \mathrm{i}A^-.$$

$\hat A$ is a hermitian matrix (named after C. Hermite (1822-1901)). The hermitian map $A \mapsto \hat A$ establishes an isomorphism between the vector space $\mathbb{R}^{X\times X}$ and the vector space

$$\mathcal{H}_X = \{\hat A \mid A \in \mathbb{R}^{X\times X}\}$$

with the set $\mathbb{R}$ as field of scalars. (Note that $\mathcal{H}_X$ is not a complex vector space: The product $zC$ of a hermitian matrix $C$ with a complex scalar $z$ is not necessarily hermitian.) The import in our context is the fundamental observation that the selfadjoint matrices are precisely the hermitian matrices:

LEMMA. Let $C \in \mathbb{C}^{X\times X}$ be an arbitrary complex matrix. Then

$$C \in \mathcal{H}_X \iff C = C^*.$$

Proof. Assume $C = A + \mathrm{i}B$ with $A, B \in \mathbb{R}^{X\times X}$ and hence $C^* = A^T - \mathrm{i}B^T$. So $C = C^*$ means symmetry $A = A^T$ and skew-symmetry $B = -B^T$. Consequently, one has $\hat A = A$ and $\hat B = \mathrm{i}B$, which yields

$$C = A + \mathrm{i}B = \hat A + \hat B \in \mathcal{H}_X.$$

The converse is seen as easily. ⋄

The remarkable property of the hermitian representation is:

• While a real matrix $A \in \mathbb{R}^{X\times X}$ does not necessarily admit a spectral decomposition with real eigenvalues, its hermitian representation $\hat A$ is always guaranteed to have one.
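The Lemma and the remarkable property above can be observed directly (our own numerical sketch; the matrices are arbitrary examples): the hermitian representation of a real matrix is selfadjoint and hence has real eigenvalues, while the matrix itself may not.

```python
import numpy as np

A = np.array([[0.0, 5.0],
              [1.0, 2.0]])                        # an arbitrary real matrix
A_hat = (A + A.T) / 2 + 1j * (A - A.T) / 2        # hermitian representation A^

assert np.allclose(A_hat, A_hat.conj().T)         # A^ is selfadjoint (Lemma)
print(np.linalg.eigvalsh(A_hat))                  # real eigenvalues, as guaranteed

R = np.array([[0.0, -1.0],
              [1.0, 0.0]])                        # a rotation: no real eigenvalues
print(np.linalg.eigvals(R))                       # a complex pair
```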
3. Interaction systems
Let us assume that elements $x, y \in X$ can interact with a certain interaction strength, measured by a real number $a_{xy}$. We denote this interaction symbolically as $a_{xy}\varepsilon_{xy}$. Graphically, one may equally well think of a weighted (directed) edge in an interaction graph with $X$ as its set of nodes:

$$a_{xy}\varepsilon_{xy} : \quad x \xrightarrow{\;a_{xy}\;} y.$$

An interaction instance is a weighted superposition of interactions:

$$\varepsilon = \sum_{x,y\in X} a_{xy}\varepsilon_{xy}.$$

We record the interaction instance $\varepsilon$ in the interaction matrix $A \in \mathbb{R}^{X\times X}$ with the interaction coefficients $A_{xy} = a_{xy}$. The interaction is symmetric if $A^T = A$ and skew-symmetric if $A^T = -A$. Conversely, each matrix $A \in \mathbb{R}^{X\times X}$ corresponds to some interaction instance

$$\varepsilon = \sum_{x,y\in X} A_{xy}\varepsilon_{xy}.$$

So we may think of $\mathbb{R}^{X\times X}$ as the interaction space relative to the set $X$. Moreover, the symmetry decomposition $A = A^+ + A^-$ shows:

Every interaction instance $\varepsilon$ is the superposition of a symmetric interaction instance $\varepsilon^+$ and a skew-symmetric interaction instance $\varepsilon^-$. Moreover, $\varepsilon^+$ and $\varepsilon^-$ are uniquely determined by $\varepsilon$.

The norm of an interaction state $\varepsilon$ with interaction matrix $A$ is the norm of the associated interaction matrix: $\|\varepsilon\| = \|A\|$. So $\|\varepsilon\| \neq 0$ means that at least two members $s, t \in X$ interact with strength $A_{st} \neq 0$ and that the numbers

$$p_{xy} = \frac{|A_{xy}|^2}{\|A\|^2} \quad ((x,y) \in X\times X)$$

yield a probability distribution on the set of all possibly interacting pairs and offer a probabilistic perspective on $\varepsilon$:

• A pair $(x,y)$ of members of $X$ is interacting nontrivially with probability $p_{xy}$.

Clearly, scaling $\varepsilon$ to $\lambda\varepsilon$ with a scalar $\lambda \neq 0$ would result in the same probability distribution on $X\times X$. From the probabilistic point of view, it therefore suffices to consider interaction instances $\varepsilon$ with norm $\|\varepsilon\| = 1$. We thus define:
The interaction system on $X$ is the system $\mathfrak{I}(X)$ with the set of states

$$\mathfrak{I}_X = \{\varepsilon \mid \varepsilon \text{ is an interaction instance of } X \text{ of norm } \|\varepsilon\| = 1\}.$$

In terms of the matrix representation of states, we have

$$\mathfrak{I}_X \longleftrightarrow \mathcal{S}_X = \{A \in \mathbb{R}^{X\times X} \mid \|A\| = 1\}.$$

Potentials. A potential $F : X\times X \to \mathbb{R}$ defines a matrix with coefficients $F_{xy} = F(x,y)$ and thus a scalar-valued linear functional

$$A \mapsto \langle F|A\rangle = \sum_{x,y\in X} F_{xy} A_{xy}$$

on the vector space $\mathbb{R}^{X\times X}$. Conversely, every linear functional $f$ on $\mathbb{R}^{X\times X}$ is of the form

$$f(A) = \sum_{x,y\in X} F_{xy} A_{xy} = \langle F|A\rangle$$

with uniquely determined coefficients $F_{xy} \in \mathbb{R}$. So potentials and linear functionals correspond to each other.

On the other hand, the potential $F$ defines a linear operator $A \mapsto F\bullet A$ on the space $\mathbb{R}^{X\times X}$, where the matrix $F\bullet A$ is the Hadamard product of $F$ and $A$ with the coefficients

$$(F\bullet A)_{xy} = F_{xy} A_{xy} \quad\text{for all } x,y \in X.$$

With this understanding, one has

$$\langle F|A\rangle = \sum_{x,y\in X} (F\bullet A)_{xy}.$$

Moreover, one computes

(54) $\quad \langle A|F\bullet A\rangle = \sum_{x,y\in X} A_{xy}(F\bullet A)_{xy} = \sum_{x,y\in X} F_{xy}|A_{xy}|^2.$

If $A \in \mathcal{S}_X$ (i.e., if $A$ represents an interaction state $\varepsilon \in \mathfrak{I}_X$), the parameters $p^A_{xy} = |A_{xy}|^2$ define a probability distribution on $X\times X$. The expected value of the potential $F$ in this state $\varepsilon$ is

$$\mu_\varepsilon(F) = \sum_{x,y\in X} F_{xy}\,p^A_{xy} = \langle A|F\bullet A\rangle.$$

The interaction model offers a considerably wider context for the analysis of cooperation. To illustrate this, consider a cooperative TU-game $\Gamma = (N,v)$ with collection $\mathcal{N}$ of coalitions. $v$ is a potential on $\mathcal{N}$ but not on the set $\mathcal{N}\times\mathcal{N}$ of possibly pairwise interacting coalitions. However, there is a straightforward extension of $v$ to $\mathcal{N}\times\mathcal{N}$:

$$v(S,T) = \begin{cases} v(S) & \text{if } S = T \\ 0 & \text{if } S \neq T. \end{cases}$$

Relative to a state $\sigma \in \mathfrak{I}_{\mathcal{N}}$ with interaction matrix $A$, the expected value of $v$ is

$$v(\sigma) = \sum_{S\in\mathcal{N}} v(S)\,|A_{SS}|^2.$$

In the special case of a state $\sigma_S$ where the coalition $S$ interacts with itself with certainty (and hence no proper interaction among coalitions takes place), we have

$$v(\sigma_S) = v(S),$$

which is exactly the potential value of the coalition $S$ in the classical interpretation of $\Gamma$.

Generalized cooperative games.
A more comprehensive model for the study of cooperation among players would be structures of the type $\Gamma = (N, \mathcal{N}, v)$, where $v$ is a potential on $\mathcal{N}\times\mathcal{N}$ (rather than just $\mathcal{N}$).

Much of the current interaction analysis remains valid for infinite sets with some modifications. For example, we admit as descriptions of interaction states only those matrices $A \in \mathbb{R}^{X\times X}$ with the properties

(H1) $\mathrm{supp}(A) = \{(x,y) \in X\times X \mid A_{xy} \neq 0\}$ is finite or countably infinite;
(H2) $\|A\|^2 = \sum_{x,y\in X} |A_{xy}|^2 = 1$.

If the conditions (H1) and (H2) are met, we factually represent interaction states in Hilbert spaces. To keep things simple, however, we retain the finiteness property of the agent set $X$ in the current text and refer the interested reader to the literature for further details (see, e.g., J. Weidmann (1980): Linear Operators in Hilbert Spaces, Graduate Texts in Mathematics, Springer Verlag).
4. Quantum systems
Without going into the physics of quantum mechanics, let us quickly sketch the basic mathematical model and then look at the relationship with the interaction model. In this context, we think of an observable as a mechanism $\alpha$ that can be applied to a system $\mathfrak{S}$,

$$\mathfrak{S}(\sigma) \xrightarrow{\;\alpha\;} \alpha(\sigma),$$

with the interpretation:

• If $\mathfrak{S}$ is in the state $\sigma$, then $\alpha$ is expected to produce a measurement result $\alpha(\sigma)$.

There are two views on a quantum system $\mathfrak{Q}_X$ relative to a set $X$. They are dual to each other (reversing the roles of states and observables) but mathematically equivalent.

The Schrödinger picture. In the so-called Schrödinger picture (named after E. Schrödinger (1887-1961)), the states of $\mathfrak{Q}_X$ are presented as the elements of the set

$$\mathcal{W}_X = \{v \in \mathbb{C}^X \mid \|v\| = 1\}$$

of complex vectors of norm $1$. An observable $\alpha$ corresponds to a selfadjoint $(n\times n)$-matrix $A \in \mathcal{H}_X$ and produces the real number

$$\alpha(v) = \langle v|Av\rangle = v^*A^*v = v^*Av$$

when $\mathfrak{Q}_X$ is in the state $v \in \mathcal{W}_X$. Recall from the discussion of the spectral decomposition in Section 2.1 that $\alpha(v)$ is the expected value of the eigenvalues $\rho_x$ of $A$ relative to the probabilities

$$p^{A,v}_x = |\langle v|U_x\rangle|^2 \quad (x \in X),$$

where the vectors $U_x \in \mathcal{W}_X$ constitute a vector space basis of corresponding eigenvectors of $A$. An interpretation of the probabilities $p^{A,v}$ could be this:

$\mathfrak{Q}_X$ is a stochastic system that shows the element $x \in X$ with probability $p^{A,v}_x$ if it is observed under $A$ in the state $v$:

$$\mathfrak{Q}_X(v) \xrightarrow{\;A\;} x.$$

EX. 8.4. The identity matrix $I \in \mathbb{C}^{X\times X}$ is selfadjoint and yields the distribution $p^{I,v}$ on $X$ with probabilities $p^{I,v}_x = |v_x|^2$ $(x \in X)$.

The Heisenberg picture. In the Heisenberg picture of $\mathfrak{Q}_X$, the selfadjoint matrices $A \in \mathcal{H}_X$ take over the role of states while the vectors $v \in \mathcal{W}_X$ induce measuring results. The Heisenberg picture is dual to the Schrödinger picture. In both pictures, the expected values

$$\langle v|Av\rangle \quad (v \in \mathcal{W}_X,\; A \in \mathcal{H}_X)$$

are thought to be the numbers resulting from measurements on the system $\mathfrak{Q}_X$. The Heisenberg picture sees an element $x \in X$ according to the scheme

$$\mathfrak{Q}_X(A) \longrightarrow x \quad\text{with probability } p^{A,v}_x.$$

Densities and wave functions.
The difference in the two pictures lies in the interpretation of the probability distribution $p^{A,v}$ on the index set $X$ relative to $A \in \mathcal{H}_X$ and $v \in \mathcal{W}_X$. In the Heisenberg picture (named after W. Heisenberg (1901-1976)), $p^{A,v}$ is imagined to be implied by a possibly varying $A$ relative to a fixed state vector $v$. Therefore, the elements $A \in \mathcal{H}_X$ are also known as density matrices. In the Schrödinger picture, the matrix $A$ is considered to be fixed while the state vector $v = v(t)$ may vary in time $t$. $v(t)$ is called a wave function.

A quantum evolution $\Phi = \Phi(M, v, A)$ in (discrete) time $t$ (in the sense of Section 2.1 of Chapter 1) depends on a matrix-valued function $t \mapsto M_t$, a state vector $v \in \mathcal{W}_X$, and a density $A \in \mathcal{H}_X$. The evolution $\Phi$ produces the real observation values

(55) $\quad \varphi_t = v^*(M_t^* A M_t)v \quad (t = 0,1,2,\ldots).$

Notice that the matrices $A_t = M_t^* A M_t$ are selfadjoint. So the evolution $\Phi$ can be seen as an evolution of density matrices, which is in accord with the Heisenberg picture. If $v(t) = M_t v \in \mathcal{W}_X$ holds for all $t$, the evolution $\Phi$ can also be interpreted in the Schrödinger picture as an evolution of state vectors:

(56) $\quad \varphi_t = (M_t v)^* A (M_t v) \quad (t = 0,1,2,\ldots).$

REMARK. The standard model of quantum mechanics assumes that evolutions satisfy the condition $M_t v \in \mathcal{W}_X$ at any time $t$, so that the Heisenberg and the Schrödinger pictures are equivalent.
Markov coalition formation.
Let 𝒩 be the collection of coalitions of the set N. The classical view on coalition formation sees the probability distributions p on 𝒩 as the possible states of the formation process and the process itself as a MARKOV chain (A.A. MARKOV (1856-1922)).

To formalize this model, let P = P(𝒩) be the set of all probability distributions on 𝒩. A MARKOV operator is a linear map µ : R^𝒩 → R^𝒩 such that µp ∈ P holds for all p ∈ P. µ defines for every initial state p^(0) ∈ P a so-called MARKOV chain of probability distributions

   M(p^(0)) = { µ^t p^(0) | t = 0, 1, 2, ... }.

Define now P_t ∈ R^{𝒩×𝒩} as the diagonal matrix with p^(t) = µ^t p^(0) as its diagonal coefficient vector. P_t is a real symmetric matrix and therefore a density in particular. Any v ∈ W gives rise to a quantum evolution with observed values

   π_t = v^* P_t v    (t = 0, 1, 2, ...).

For example, if e_S ∈ R^𝒩 is the unit vector that corresponds to the coalition S ∈ 𝒩, one has

   π_t^(S) = e_S^* P_t e_S = (P_t)_SS = p_S^(t)

with the usual interpretation:

 • If the coalition formation proceeds according to the MARKOV chain M(p^(0)), then an inspection at time t will find S to be active with probability π_t^(S) = p_S^(t).

REMARK. More generally, the simulated annealing processes of Section 5.2 are MARKOV chains and, therefore, special cases of quantum evolutions.
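The Markov view of coalition formation can be sketched numerically. In the snippet below, the transition matrix mu (column-stochastic, acting on the four coalitions of a hypothetical 2-player set N = {1, 2}, ordered as Ø, {1}, {2}, {1,2}) is invented illustrative data.

```python
import numpy as np

# Hypothetical Markov operator on the coalitions [Ø, {1}, {2}, {1,2}]:
# each column sums to 1, so mu maps probability distributions to
# probability distributions.
mu = np.array([[0.5, 0.1, 0.1, 0.0],
               [0.2, 0.6, 0.0, 0.1],
               [0.2, 0.0, 0.6, 0.1],
               [0.1, 0.3, 0.3, 0.8]])
p = np.array([1.0, 0.0, 0.0, 0.0])     # start from the empty coalition

for t in range(3):
    P_t = np.diag(p)                   # the density P_t of the text
    # pi_t(S) = e_S^T P_t e_S is just the diagonal entry p_S(t)
    for S in range(4):
        e_S = np.eye(4)[S]
        assert abs(e_S @ P_t @ e_S - p[S]) < 1e-12
    p = mu @ p                         # next state of the chain

assert abs(p.sum() - 1.0) < 1e-12      # still a probability distribution
```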
Recalling the vector space isomorphism via the hermitian representation

   A ∈ R^{X×X}  ←→  Â = A^+ + iA^- ∈ H_X,

we may think of interaction states as manifestations of SCHRÖDINGER states of the quantum system Q_{X×X},

   S_X = { A ∈ R^{X×X} | ‖A‖ = 1 }  ↔  W_{X×X} = { Â ∈ H_X | ‖Â‖ = 1 },

or as normed representatives of HEISENBERG densities relative to the quantum system Q_X.

Principal components.
An interaction instance A on X has a hermitian spectral decomposition

   Â = Σ_{x∈X} λ_x U_x U_x^* = Σ_{x∈X} λ_x Â_x,

where the matrices Â_x = U_x U_x^* are the principal components of Â. The corresponding interaction instances A_x are the principal components of A:

   A = Σ_{x∈X} λ_x A_x.

Principal components V of interaction instances arise from SCHRÖDINGER states v = a + ib ∈ W_X with a, b ∈ R^X in the following way. Setting

   V̂ = vv^* = (a + ib)(a - ib)^T = aa^T + bb^T + i(ba^T - ab^T),

one has V^+ = aa^T + bb^T and V^- = ba^T - ab^T and thus

   V = V^+ + V^- = (aa^T + bb^T) + (ba^T - ab^T).

The principal interaction instance V has hence the underlying structure:

(0) Each x ∈ X has a pair (a_x, b_x) of weights a_x, b_x ∈ R.
(1) The symmetric interaction between two arbitrary elements x, y ∈ X is
      V^+_{xy} = a_x a_y + b_x b_y.
(2) The skew-symmetric interaction between two arbitrary elements x, y ∈ X is
      V^-_{xy} = b_x a_y - a_x b_y.
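The decomposition of vv^* into the symmetric part V^+ and the skew-symmetric part V^- can be verified numerically; the vectors a and b below are arbitrary illustrative data.

```python
import numpy as np

# Illustrative state v = a + ib, normalized to ||v|| = 1.
a = np.array([0.6, 0.0, 0.8])
b = np.array([0.0, 1.0, 0.0])
v = (a + 1j * b)
v = v / np.linalg.norm(v)
a, b = v.real, v.imag

Vhat = np.outer(v, v.conj())                 # vv* = V+ + iV-
Vplus = np.outer(a, a) + np.outer(b, b)      # aa^T + bb^T
Vminus = np.outer(b, a) - np.outer(a, b)     # ba^T - ab^T

assert np.allclose(Vhat.real, Vplus)
assert np.allclose(Vhat.imag, Vminus)
assert np.allclose(Vplus, Vplus.T)           # symmetric interaction
assert np.allclose(Vminus, -Vminus.T)        # skew-symmetric interaction
```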
Let N be a (finite) set of players and 𝒩 a family of coalitions. From the quantum point of view, a (SCHRÖDINGER) state of N is a complex vector v ∈ W_N, which implies the probability distribution p^v with probabilities

   p^v_i = |v_i|^2    (i ∈ N)

on N. In the terminology of fuzzy cooperation (cf. Ex. 6.2), p^v describes a fuzzy coalition:

 • Player i ∈ N is active in state v with probability p^v_i.

Conversely, if w ∈ R^N is a non-zero fuzzy coalition with component probabilities 0 ≤ w_i ≤ 1, the vector √w = (√w_i | i ∈ N) may be normalized to a SCHRÖDINGER state

   v = √w / ‖√w‖   s.t.   w_i = ‖√w‖^2 · |v_i|^2 for all i ∈ N.

In the same way, a vector V ∈ W_𝒩 describes a SCHRÖDINGER state of interaction among the coalitions of N. It is particularly instructive to look at the interactions V of principal component type. As we have seen above, V arises as follows:

(0) The interaction V on 𝒩 is implied by two cooperative games Γ_a = (N, a) and Γ_b = (N, b).
(1) Two coalitions S, T ∈ 𝒩 interact symmetrically via
      V^+_{ST} = a(S)a(T) + b(S)b(T).
(2) Two coalitions S, T ∈ 𝒩 interact skew-symmetrically via
      V^-_{ST} = b(S)a(T) - a(S)b(T).
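The normalization of a fuzzy coalition to a Schrödinger state can be sketched in a few lines; the fuzzy coalition w below is hypothetical data.

```python
import numpy as np

# A fuzzy coalition w with component probabilities in [0, 1],
# normalized to a state v = sqrt(w)/||sqrt(w)||, so that
# w_i = ||sqrt(w)||^2 * |v_i|^2 for every player i.
w = np.array([0.9, 0.25, 0.0, 0.49])
sqrt_w = np.sqrt(w)
norm = np.linalg.norm(sqrt_w)          # ||sqrt(w)||
v = sqrt_w / norm                      # a state: ||v|| = 1

assert abs(np.linalg.norm(v) - 1.0) < 1e-12
assert np.allclose(w, norm**2 * np.abs(v)**2)
```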
5. Quantum games
A large part of the mathematical analysis of game theoretic systems follows the guideline:

 • Represent the system in a mathematical structure, analyze the representation mathematically and re-interpret the result in the original game theoretic setting.
When one chooses a representation of the system in the same space as the ones usually employed for the representation of a quantum system, one automatically arrives at a "quantum game", i.e., at a quantum theoretic interpretation of a game theoretic environment. So we understand by a quantum game any game on a system S whose states are represented as quantum states, and leave it to the reader to review game theory in this more comprehensive context.
6. Final Remarks
Why should one pass to complex numbers and the hermitian space H_X rather than the euclidean space R^{X×X} if both spaces are isomorphic real Hilbert spaces?

The advantage lies in the algebraic structure of the field C of complex numbers, which yields the spectral decomposition (51), for example. It would not be impossible, but somewhat "unnatural", to translate this structural insight back into the environment R^{X×X} without appeal to complex algebra.

Another advantage becomes apparent when one studies evolutions of systems over time. In the classical situation of real vector spaces, MARKOV chains are an important model for system evolutions. It turns out that this model generalizes considerably when one passes to the context of HILBERT spaces (see U. FAIGLE and G. GIERZ (2017): Markovian statistics on evolving systems, Evolving Systems, DOI 10.1007/s12530-017-9186-8).

The game theoretic ramifications of this approach are to a large extent unexplored at this point.

Appendix
1. Notions and facts from real analysis
The euclidean norm (or geometric length) of a vector x ∈ R^n with components x_j is

   ‖x‖ = √(x_1^2 + ... + x_n^2).

Writing B_r(x) = { y ∈ R^n | ‖x - y‖ ≤ r } for r ∈ R and x ∈ R^n, a subset S ⊆ R^n is

(1) bounded if there is some r > 0 such that S ⊆ B_r(0);
(2) open if for each x ∈ S there is some r > 0 such that B_r(x) ⊆ S;
(3) closed if R^n \ S is open;
(4) compact if S is closed and bounded.

LEMMA A.2 (HEINE-BOREL). S ⊆ R^n is compact if and only if

(HB) every family O of open sets O ⊆ R^n such that every x ∈ S lies in at least one O ∈ O admits a finite number of sets O_1, ..., O_ℓ ∈ O with the covering property

 • S ⊆ O_1 ∪ O_2 ∪ ... ∪ O_ℓ.    ⋄

It is important to note that compactness is preserved under forming direct products:

 • If X ⊆ R^n and Y ⊆ R^m are compact sets, then X × Y ⊆ R^{n+m} is compact.

A function f : R^n → R is continuous on S if for all x ∈ S, one always has

   lim_{d→0} f(x + d) = f(x).

LEMMA
A.3 (Extreme values). If f is continuous on the compact set S, then there exist elements x_*, x^* ∈ S such that

   f(x_*) ≤ f(x) ≤ f(x^*)

holds for all x ∈ S.    ⋄

The continuous function f : S → R is differentiable on the open set S ⊆ R^n if for each x ∈ S there is a (row) vector ∇f(x) such that for every d ∈ R^n of unit length ‖d‖ = 1, one has

   lim_{t→0} ( f(x + td) - f(x) ) / t = ∇f(x) d    (t ∈ R).

∇f(x) is the gradient of f. Its components are the partial derivatives:

   ∇f(x) = ( ∂f(x)/∂x_1, ..., ∂f(x)/∂x_n ).

NOTA BENE. Not all continuous functions are differentiable.
2. Convexity

A linear combination of elements x_1, ..., x_m is an expression of the form

   z = λ_1 x_1 + ... + λ_m x_m,

where λ_1, ..., λ_m are scalars (real or complex numbers). The linear combination z is affine if

   λ_1 + ... + λ_m = 1 and λ_1, ..., λ_m ∈ R.

An affine combination is a convex combination if all scalars λ_i are nonnegative. The scalars (λ_1, ..., λ_m) of a convex combination form an (m-dimensional) probability distribution.

The set S ⊆ R^n is convex if it contains with every x, y ∈ S also the connecting line segment:

   [x, y] = { x + λ(y - x) | 0 ≤ λ ≤ 1 } ⊆ S.

It is easy to verify that the direct product S = X × Y ⊆ R^{n+m} is convex if X ⊆ R^n and Y ⊆ R^m are convex sets.

A function f : S → R is convex (up) on the convex set S if for all x, y ∈ S and for all scalars 0 ≤ λ ≤ 1,

   f(x + λ(y - x)) ≤ f(x) + λ( f(y) - f(x) ).

This definition is equivalent to the requirement that one has, for any finitely many elements x_1, ..., x_m ∈ S and probability distributions (λ_1, ..., λ_m),

   f(λ_1 x_1 + ... + λ_m x_m) ≤ λ_1 f(x_1) + ... + λ_m f(x_m).

The function f is concave (or convex down) if g = -f is convex (up). A differentiable function f : S → R on the open set S ⊆ R^n is convex (up) if and only if

(57)   f(y) ≥ f(x) + ∇f(x)(y - x)

holds for all x, y ∈ S.

Assume, for example, that ∇f(x)(y - x) ≥ 0 is true for all y ∈ S. Then one has

   f(x) = min_{y∈S} f(y).

On the other hand, if ∇f(x)(y - x) < 0 is true for some y ∈ S, one can move from x a bit into the direction of y and find an element x′ with f(x′) < f(x). Hence one has a criterion for minimizers of f on S:

LEMMA A.4. If f is a differentiable convex function on the convex set S, then for any x ∈ S, the following statements are equivalent:

(1) f(x) = min_{y∈S} f(y).
(2) ∇f(x)(y - x) ≥ 0 for all y ∈ S.

If strict inequality holds in (57) for all y ≠ x, f is said to be strictly convex. In the case n = 1 (i.e., S ⊆ R), a simple criterion applies to twice differentiable functions:

   f is convex ⇐⇒ f″(x) ≥ 0 for all x ∈ S.

For example, the logarithm function f(x) = ln x is seen to be strictly concave on the open interval S = (0, ∞) because of f″(x) = -1/x^2 < 0 for all x ∈ S.
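The gradient inequality (57) and the minimizer criterion of Lemma A.4 are easy to check numerically for a concrete convex function; the function f below is an illustrative choice, not taken from the text.

```python
# Check of (57) and Lemma A.4 for f(x) = (x - 2)^2, which is strictly
# convex on S = [0, 1] with derivative f'(x) = 2(x - 2).
f = lambda x: (x - 2.0) ** 2
grad = lambda x: 2.0 * (x - 2.0)

pts = [i / 10 for i in range(11)]          # a grid on S = [0, 1]

# (57): f(y) >= f(x) + f'(x)(y - x) for all x, y in S
for x in pts:
    for y in pts:
        assert f(y) >= f(x) + grad(x) * (y - x) - 1e-12

# Lemma A.4: x = 1 minimizes f on S, and indeed
# f'(1)(y - 1) = -2(y - 1) >= 0 holds for all y in [0, 1].
assert all(grad(1.0) * (y - 1.0) >= -1e-12 for y in pts)
assert min(f(y) for y in pts) == f(1.0)
```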
3. BROUWER's fixed-point theorem

A fixed-point of a map f : X → X is a point x ∈ X such that f(x) = x. It is usually difficult to find a fixed-point (or even to decide whether a fixed-point exists). Well-known sufficient conditions were given by BROUWER:

THEOREM A.2 (BROUWER (1911)). Let X ⊆ R^n be a convex, compact and non-empty set and f : X → X a continuous function. Then f has a fixed-point.

Proof. See, e.g., the encyclopedic textbook of A. GRANAS and J. DUGUNDJI, Fixed Point Theory, Springer-Verlag 2003.    ⋄

For game theoretic applications, the following implication is of interest.

COROLLARY A.1.
Let X ⊆ R^n be a convex, compact and nonempty set and G : X × X → R a continuous map that is concave in the second variable, i.e.,

(C) for every x ∈ X, the map y ↦ G(x, y) is concave.

Then there exists a point x^* ∈ X such that

   G(x^*, x^*) ≥ G(x^*, y) for all y ∈ X.

Proof. We will derive a contradiction from the supposition that the Corollary is false. Indeed, if there is no x^* with the claimed property, then each x ∈ X lies in at least one of the sets

   O(y) = { x ∈ X | G(x, x) < G(x, y) }    (y ∈ X).

Since G is continuous, the sets O(y) are open. Hence, since X is compact, already finitely many of them cover all of X, say

   X ⊆ O(y_1) ∪ O(y_2) ∪ ... ∪ O(y_h).

For all x ∈ X, define the parameters

   d_ℓ(x) = max{ 0, G(x, y_ℓ) - G(x, x) }    (ℓ = 1, ..., h).

x lies in at least one of the sets O(y_ℓ). Therefore, we have

   d(x) = d_1(x) + d_2(x) + ... + d_h(x) > 0.

Consider now the function

   x ↦ φ(x) = Σ_{ℓ=1}^h λ_ℓ(x) y_ℓ    (with λ_ℓ(x) = d_ℓ(x)/d(x)).

Since G is continuous, also the functions x ↦ d_ℓ(x) are continuous. Therefore, φ : X → X is continuous. By BROUWER's Theorem A.2 (L.E.J. BROUWER (1881-1966)), φ has a fixed point

   x^* = φ(x^*) = Σ_{ℓ=1}^h λ_ℓ(x^*) y_ℓ.

Since G(x, y) is concave in y and x^* is a convex combination of the y_ℓ, we have

   G(x^*, x^*) = G(x^*, φ(x^*)) ≥ Σ_{ℓ=1}^h λ_ℓ(x^*) G(x^*, y_ℓ).

If the Corollary were false, one would have

   λ_ℓ(x^*) G(x^*, y_ℓ) ≥ λ_ℓ(x^*) G(x^*, x^*)

for each summand and, in at least one case, even a strict inequality

   λ_ℓ(x^*) G(x^*, y_ℓ) > λ_ℓ(x^*) G(x^*, x^*),

which would produce the contradictory statement

   G(x^*, x^*) ≥ Σ_{ℓ=1}^h λ_ℓ(x^*) G(x^*, y_ℓ) > Σ_{ℓ=1}^h λ_ℓ(x^*) G(x^*, x^*) = G(x^*, x^*).

It follows that the Corollary must be correct.    ⋄
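A concrete illustration of Corollary A.1, with a hypothetical map G on X = [0, 1] that is concave in the second variable: iterating the best response y = 1 - x/2 locates the point x* = 2/3 whose existence the Corollary guarantees.

```python
# Illustrative data: G(x, y) = -(y - (1 - x/2))^2 is continuous on
# [0, 1] x [0, 1] and concave in y.  The best response to x is
# y = 1 - x/2, and the best-response iteration contracts to x* = 2/3.
G = lambda x, y: -(y - (1.0 - x / 2.0)) ** 2

x = 0.0
for _ in range(60):
    x = 1.0 - x / 2.0                  # best-response iteration

assert abs(x - 2.0 / 3.0) < 1e-12
# x* satisfies G(x*, x*) >= G(x*, y) for all y in X (checked on a grid):
assert all(G(x, x) >= G(x, y) for y in [i / 100 for i in range(101)])
```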
4. Linear inequalities
The facts stated in this section are also well-known. For a coordinate vector x ∈ R^n, we write x = 0 if x_j = 0 holds for all components x_j of x. x ≥ 0 means that all components of x are nonnegative.

Assume now that the matrix A ∈ R^{m×n} and vectors c ∈ R^n and b ∈ R^m are given and define the sets

   X = { x ∈ R^n | b - Ax ≥ 0 }
   Y = { y ∈ R^m | y ≥ 0, A^T y = c }.

Then the main theorem of linear programming says:

THEOREM A.3. For X and Y as above, the following statements are true:

(1) For all x ∈ X and y ∈ Y one has: c^T x ≤ b^T y.
(2) X ≠ ∅ ≠ Y is true exactly when there are elements x^* ∈ X and y^* ∈ Y such that c^T x^* = b^T y^*.    ⋄

In terms of mathematical optimization, Theorem A.3 says that either X or Y is empty or that there are elements x^* ∈ X and y^* ∈ Y with the property

(58)   c^T x^* = max_{x∈X} c^T x = min_{y∈Y} b^T y = b^T y^*.

The optimization problems in (58) are so-called linear programs.
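A small numerical instance of Theorem A.3; the data A, b, c and the pair (x*, y*) are hand-picked for illustration and verified rather than computed by a solver.

```python
import numpy as np

# Primal: maximize c^T x subject to Ax <= b.
# Dual:   minimize b^T y subject to A^T y = c, y >= 0.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 1.5])
c = np.array([1.0, 1.0])

x_star = np.array([0.5, 1.0])       # primal feasible: A x* <= b
y_star = np.array([0.0, 0.0, 1.0])  # dual feasible: y* >= 0, A^T y* = c

assert np.all(b - A @ x_star >= -1e-12)
assert np.allclose(A.T @ y_star, c) and np.all(y_star >= 0)
# Zero duality gap (58): both objectives equal 1.5, so both are optimal.
assert abs(c @ x_star - b @ y_star) < 1e-12
```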
5. The MONGE algorithm

The MONGE algorithm with respect to coefficient vectors c, v ∈ R^n has two versions (more details can be found in, e.g., U. FAIGLE, W. KERN and G. STILL, Algorithmic Principles of Mathematical Programming, Springer (2002)).

The primal MONGE algorithm constructs a vector x(v) with the components

   x_1(v) = v_1  and  x_k(v) = v_k - v_{k-1}    (k = 2, ..., n).

The dual MONGE algorithm constructs a vector y(c) with the components

   y_n(c) = c_n  and  y_ℓ(c) = c_ℓ - c_{ℓ+1}    (ℓ = 1, ..., n-1).

Notice:

   c_1 ≥ c_2 ≥ ... ≥ c_n  ⟹  y_ℓ(c) ≥ 0    (ℓ = 1, ..., n-1)
   v_1 ≤ v_2 ≤ ... ≤ v_n  ⟹  x_k(v) ≥ 0    (k = 2, ..., n).

The important property to observe is

LEMMA A.5. The MONGE vectors x(v) and y(c) satisfy

   c^T x(v) = Σ_{k=1}^n c_k x_k(v) = Σ_{ℓ=1}^n v_ℓ y_ℓ(c) = v^T y(c).

Proof.
Writing x = x(v) and y = y(c), notice for all 1 ≤ k, ℓ ≤ n,

   x_1 + x_2 + ... + x_ℓ = v_ℓ  and  y_k + y_{k+1} + ... + y_n = c_k,

and hence

   Σ_{k=1}^n c_k x_k = Σ_{k=1}^n ( Σ_{ℓ=k}^n y_ℓ ) x_k = Σ_{ℓ=1}^n ( Σ_{k=1}^ℓ x_k ) y_ℓ = Σ_{ℓ=1}^n v_ℓ y_ℓ.    ⋄
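Lemma A.5 can be checked directly in code; the vectors c and v below are arbitrary test data satisfying the monotonicity conditions above.

```python
# The primal and dual Monge vectors, plus a check of Lemma A.5:
# c^T x(v) = v^T y(c).
def monge_primal(v):
    # x_1 = v_1, x_k = v_k - v_{k-1} for k = 2, ..., n
    return [v[0]] + [v[k] - v[k - 1] for k in range(1, len(v))]

def monge_dual(c):
    # y_n = c_n, y_l = c_l - c_{l+1} for l = 1, ..., n-1
    n = len(c)
    return [c[l] - c[l + 1] for l in range(n - 1)] + [c[-1]]

c = [9.0, 7.0, 4.0, 4.0, 1.0]   # nonincreasing, so y(c) >= 0
v = [2.0, 3.0, 3.0, 5.0, 8.0]   # nondecreasing, so x_k(v) >= 0 for k >= 2

x, y = monge_primal(v), monge_dual(c)
lhs = sum(ck * xk for ck, xk in zip(c, x))
rhs = sum(vl * yl for vl, yl in zip(v, y))
assert abs(lhs - rhs) < 1e-9            # c^T x(v) = v^T y(c)
assert all(yl >= 0 for yl in y) and all(xk >= 0 for xk in x[1:])
```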
6. Entropy and BOLTZMANN distributions

6.1. BOLTZMANN distributions. The partition function Z for a given vector v = (v_1, ..., v_n) of real numbers v_j takes the values

   Z(t) = Σ_{j=1}^n e^{v_j t}    (t ∈ R).

The associated BOLTZMANN probability distribution b(t) has the components

   b_j(t) = e^{v_j t} / Z(t) > 0

and yields the expected value function

   µ(t) = Σ_{j=1}^n v_j b_j(t) = Z′(t)/Z(t).

The variance is defined as the expected quadratic deviation of v from µ(t):

   σ^2(t) = Σ_{j=1}^n (µ(t) - v_j)^2 b_j(t) = Σ_{j=1}^n v_j^2 b_j(t) - µ(t)^2
          = Z″(t)/Z(t) - ( Z′(t)/Z(t) )^2 = µ′(t).

One has σ^2(t) > 0 unless all v_j are equal to a constant K (and hence µ(t) = K for all t). Because µ′(t) = σ^2(t), one sees that µ(t) is strictly increasing in t unless µ(t) is constant.

Arrange the components such that v_1 ≤ v_2 ≤ ... ≤ v_n. Then

   lim_{t→∞} b_j(t)/b_n(t) = lim_{t→∞} e^{(v_j - v_n)t} = 0 unless v_j = v_n,

which implies b_j(t) → 0 if v_j < v_n. It follows that the limit distribution b(∞) is the uniform distribution on the maximizers of v. Similarly, one has

   lim_{t→-∞} b_j(t)/b_1(t) = lim_{t→-∞} e^{(v_j - v_1)t} = 0 unless v_j = v_1

and concludes that the limit distribution b(-∞) is the uniform distribution on the minimizers of v.

THEOREM A.4.
For every value v_1 ≤ ξ ≤ v_n, there is a unique parameter t ∈ R ∪ {-∞, +∞} such that

   ξ = µ(t) = Σ_{j=1}^n v_j b_j(t).

Proof. If v_1 = ξ = v_n, the function µ(t) is constant and the claim is trivial. In the non-constant case, µ(t) is strictly monotone and continuous on R and satisfies

   v_1 ≤ µ(t) ≤ v_n.

So, for every prescribed value ξ between the extrema v_1 and v_n, there must exist precisely one t with µ(t) = ξ.    ⋄

The real function h(x) = x ln x is defined for all non-negative real numbers (with the understanding ln 0 = -∞ and 0 · ln 0 = 0) and has the strictly increasing derivative

   h′(x) = 1 + ln x.

So h is strictly convex and satisfies the inequality

   h(y) - h(x) > h′(x)(y - x)

for all non-negative y ≠ x. h is extended to nonnegative real vectors x = (x_1, ..., x_n) via

   h(x) = h(x_1, ..., x_n) = Σ_{j=1}^n x_j ln x_j  ( = Σ_{j=1}^n h(x_j) ).

The strict convexity of h becomes the inequality

   h(y) - h(x) > ∇h(x)(y - x),

with the gradient

   ∇h(x) = ( h′(x_1), ..., h′(x_n) ) = ( 1 + ln x_1, ..., 1 + ln x_n ).

In the case x_1 + ... + x_n = 1, the nonnegative vector x is a probability distribution on the set {1, ..., n} and has the entropy

   H(x) = Σ_{j=1}^n x_j ln(1/x_j) = -Σ_{j=1}^n x_j ln x_j = -h(x_1, ..., x_n).

We want to show that BOLTZMANN probability distributions are precisely the ones with maximal entropy relative to given expected values.

THEOREM A.5. Let v = (v_1, ..., v_n) be a vector of real numbers and b the BOLTZMANN distribution on {1, ..., n} with components

   b_j = e^{v_j t} / Z(t)    (j = 1, ..., n)

with respect to some t. Let p = (p_1, ..., p_n) be a probability distribution with the same expected value

   Σ_{j=1}^n v_j p_j = µ = Σ_{j=1}^n v_j b_j.

Then one has either p = b or H(p) < H(b).

Proof. For d = p - b, we have Σ_j d_j = Σ_j p_j - Σ_j b_j = 1 - 1 = 0, and therefore

   ∇h(b) d = Σ_{j=1}^n (1 + ln b_j) d_j = Σ_{j=1}^n d_j ln b_j = Σ_{j=1}^n d_j ( v_j t - ln Z(t) )
           = t Σ_{j=1}^n v_j d_j = t ( Σ_{j=1}^n v_j p_j - Σ_{j=1}^n v_j b_j ) = 0.

In the case p ≠ b, the strict convexity of h thus yields

   h(p) - h(b) > ∇h(b)(p - b) = 0

and hence H(p) < H(b).    ⋄

LEMMA
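A small numerical instance of Theorem A.5 with illustrative data: for v = (0, 1, 2) and parameter t = 0, the Boltzmann distribution b is uniform with expected value µ = 1; the distribution p below has the same expected value but strictly smaller entropy.

```python
import math

def H(q):
    # entropy H(q) = -sum q_i ln q_i (with 0 ln 0 = 0)
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

v = [0.0, 1.0, 2.0]
b = [1/3, 1/3, 1/3]            # Boltzmann distribution for t = 0
p = [0.25, 0.5, 0.25]          # same mean, but not Boltzmann

mean = lambda q: sum(vi * qi for vi, qi in zip(v, q))
assert abs(mean(p) - mean(b)) < 1e-12    # both have expected value 1
assert H(p) < H(b)                       # H(b) = ln 3 is maximal
```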
A.6 (Divergence). Let a_1, ..., a_n, p_1, ..., p_n be arbitrary nonnegative numbers. Then

   Σ_{i=1}^n a_i ≤ Σ_{i=1}^n p_i  ⟹  Σ_{i=1}^n p_i ln a_i ≤ Σ_{i=1}^n p_i ln p_i.

Equality is attained exactly when a_i = p_i holds for all i = 1, ..., n.

Proof. We may assume p_i ≠ 0 for all i and make use of the well-known fact (which follows easily from the concavity of the logarithm function):

   ln x ≤ x - 1  and  ln x = x - 1 ⇔ x = 1.

Then we observe

   Σ_{i=1}^n p_i ln(a_i/p_i) ≤ Σ_{i=1}^n p_i ( a_i/p_i - 1 ) = Σ_{i=1}^n a_i - Σ_{i=1}^n p_i ≤ 0

and therefore

   Σ_{i=1}^n p_i ln a_i - Σ_{i=1}^n p_i ln p_i = Σ_{i=1}^n p_i ln(a_i/p_i) ≤ 0.

Equality can only hold if ln(a_i/p_i) = (a_i/p_i) - 1, and hence a_i = p_i is true, for all i.    ⋄
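Lemma A.6 can be checked on concrete numbers; the vectors p and a below are hypothetical data with sum(a) ≤ sum(p).

```python
import math

def cross(p, a):
    # the "cross" sum: sum p_i ln a_i
    return sum(pi * math.log(ai) for pi, ai in zip(p, a))

p = [0.2, 0.3, 0.5]
a = [0.1, 0.4, 0.4]            # sum(a) = 0.9 <= 1.0 = sum(p)

assert sum(a) <= sum(p)
# Lemma A.6: sum p_i ln a_i <= sum p_i ln p_i, strictly here since a != p.
assert cross(p, a) < cross(p, p)
```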