Incentive Alignment of Business Processes: a game theoretic approach
Tobias Heindel and Ingo Weber
Chair of Software and Business Engineering, Technische Universitaet Berlin, Germany
{heindel,ingo.weber}@tu-berlin.de
Abstract.
Many definitions of business processes refer to business goals, value creation, or profits / gains of sorts. Nevertheless, the focus of formal methods research on business processes, like the well-known soundness property, lies on correctness with regard to the execution semantics of modeling languages. Among others, soundness requires proper completion of process instances. However, the question of whether participants have any interest in working towards completion (or in participating in the process) has not been addressed as of yet.
In this work, we investigate whether inter-organizational business processes give participants incentives for achieving the common business goals – in short, whether incentives are aligned with the process. In particular, fair behavior should pay off and efficient completion of tasks should be rewarded. We propose a game-theoretic approach that relies on algorithms for solving stochastic games from the machine learning community. We describe a method for checking incentive alignment of process models with utility annotations for tasks, which can be used for a priori analysis of inter-organizational business processes. Last but not least, we show that the soundness property corresponds to a special case of incentive alignment.
Keywords: incentive alignment · inter-organizational business processes · collaboration · choreography · soundness property
Many definitions of business processes refer to business goals [29] or value creation [7], but whether process participants are actually incentivized to contribute to a process has not been addressed as yet. For intra-organizational processes, this question is less relevant; motivation to contribute is often based on loyalty, bonuses if the organization performs well, or simply that tasks in a process are part of one's job. Instead, economic modeling of intra-organizational processes often focuses on cost, e.g. in activity-based costing [12], which can be assessed using model checking tools [9] or simulation [5].
For inter-organizational business processes, such indirect motivation cannot be assumed. A prime example of misaligned incentives was the $2.5B write-off in Cisco's supply chain in April 2001 [20]: success of the overall supply chain was grossly misaligned with the incentives of individual participants. (This happened although several game theoretic approaches for analyzing incentive structures are available for the case of supply chains [4].) Furthermore, modeling incentives accurately is actually possible in cross-organizational processes, e.g., based on contracts and agreed-upon prices.
Now, with the advent of blockchain technology [30], it is possible to execute cross-organizational business processes or choreographies as smart contracts [18,28]. The blockchain serves as a neutral, participant-independent computational infrastructure, and as such enables collaboration across organizations even in situations characterized by a lack of trust between participants [28]. However, as there is no central role for oversight, it is important that incentives are properly designed in such situations, among other reasons to avoid unintended – possibly devastating – results, like those encountered by Cisco. In fact, a main goal of the Ethereum blockchain is, according to its founder Vitalik Buterin, to create “a better world by aligning incentives”.
In this paper, we present a principled framework for incentive alignment of inter-organizational business processes. We consider BPMN models with suitable annotations concerning the utility of activities, very much in the spirit of activity-based costing (ABC) [12, Chapter 5]. In short, fair behavior should pay off and participants should be rewarded for efficient completion of process instances. In more detail, we shall consider BPMN models as stochastic games [24] and formalize incentive alignment as “good” equilibria of the resulting game. Which equilibria are the desirable ones depends on the business goals w.r.t. which we want to align incentives. In the present paper, we focus on proper completion and liveness of activities. Interestingly, the soundness property [2] will be rediscovered as the special case of incentive alignment within a single organization that rewards completion of every activity.
The overall contribution of the paper is a framework for incentive alignment of business process models, particularly in inter-organizational settings.
Our approach is based on game theory and inspired by advances on the solution of stochastic games from the machine learning community, which has developed algorithms for the practical computation of Nash [22] and correlated equilibria [16,17]. The framework focuses on checking incentive alignment as an a priori analysis of business processes specified as BPMN models with activity-based utility annotations. Specifically, we:
1. describe a principled method for translating BPMN models with activity-based costs to stochastic games [24],
2. propose a notion of incentive alignment that we prove to be a conservative extension of Van der Aalst's soundness property [2],
3. illustrate the approach with an order-to-cash (o2c) process.
We pick up the idea of incentive alignment for supply chains [4] and set out to apply it in the realm of inter-organizational business processes. From a technical point of view, we are interested in extending the model checking tools for cost analysis [9] for BPMN process models to proper collaborations, very much like the model checker PRISM has been extended from Markov decision processes to games [14]. Last but not least, we put importance on the connection to established concepts from the business process management community: not only do we keep the spirit of the soundness property; we even obtain the rigorous result that incentive alignment is a conservative extension of the soundness property.
Footnote (on the quotation from Buterin): accessed 8-3-2020.
Footnote (on utility): We shall use utility functions in the sense of von Neumann and Morgenstern [19].
The remainder of the paper is structured as follows. We introduce concepts and notations in Section 2. On this basis, we formulate two versions of incentive alignment in Section 3. Finally, we draw conclusions in Section 4. The proof of the main theorem can be found in Appendix A.
We now introduce the prerequisite concepts for stochastic games [24] and elementary net systems [23]. The main benefit of using a game theoretic approach is a short list of candidate definitions of equilibrium, which make precise the idea of a “good strategy” for rational actors that compete as players of a game. We shall require the following two properties of an equilibrium: (1) no player can benefit from unilateral deviation from the “agreed” strategy and (2) players have the possibility to base their moves on information from a single (trusted) mediator. The specific instance that we shall use is that of correlated equilibria [3,10] as studied by Solan and Vieille [25]. We take ample space to review the latter two concepts, followed by a short summary of the background on Petri nets.
We use the following basic concepts and notation. The cardinality and the powerset of a set $M$ are denoted by $|M|$ and $\wp M$, respectively. The set of real numbers is denoted by $\mathbb{R}$ and $[0,1] \subseteq \mathbb{R}$ is the unit interval. A probability distribution over a finite or countably infinite set $M$ is a function $p : M \to [0,1]$ whose values are non-negative and sum up to 1, in symbols $\sum_{m \in M} p(m) = 1$. The set of all probability distributions over a set $M$ is denoted by $\Delta(M)$.
We proceed by reviewing core concepts and central results for stochastic games [24], introducing notation alongside; we shall use examples to illustrate the most important concepts. The presentation is intended to be self-contained such that no additional references should be necessary. However, the interested reader might want to consult standard references or additional material, e.g., textbooks [15,21], handbook articles [11], and surveys [26]. We start with the central notion.
Definition 1 (Stochastic game). A stochastic game $G$ is a quintuple $G = \langle N, S, A, q, u \rangle$ that consists of
– a finite set of players $N = \{1, \dots, |N|\}$ (ranged over by $i$, $j$, $i_n$, etc.);
– a finite set of states $S$ (ranged over by $s$, $s'$, $s_n$, etc.);
– a finite, non-empty set of action profiles $A = \prod_{i=1}^{|N|} A^i$ (ranged over by $a$, $a_n$, etc.), which is the Cartesian product of a player-indexed family $\{A^i\}_{i \in N}$ of sets $A^i$, each of which contains the actions of the respective player (ranged over by $a^i$, $a^i_n$, etc.);
– a non-empty set of available actions $A^i(s) \subseteq A^i$, for each state $s \in S$ and player $i$;
– probability distributions $q(\cdot \mid s, a) \in \Delta(S)$, for each state $s \in S$ and every action profile $a \in A$, which map each state $s' \in S$ to $q(s' \mid s, a)$, the transition probability from state $s$ to state $s'$ under the action profile $a$; and
– the payoff vectors $u(s, a) = \langle u^1(s, a), \dots, u^{|N|}(s, a) \rangle$, for each state $s \in S$ and every action profile $a = \langle a^1, \dots, a^{|N|} \rangle \in A$.
Note that players always have some action(s) available, possibly just a dedicated idle action, see e.g. [13].
Footnote (on correlated equilibria): Nash equilibria are a special case, which however have drawbacks that motivate Aumann's work on the more general correlated equilibria [3,10].

Fig. 1. An order-to-cash process (BPMN collaboration diagram with the pools Customer, Shipper, and Supplier)

The BPMN model of Fig. 1 can be understood as (a partial specification of) a stochastic game played by a shipper, a customer, and a supplier. Abstracting from data, precise timings, and similar semantic aspects, a state of the game is a state of an instance of the process, which is represented as a token marking of the BPMN model. The actions of each player are the activities and events in the respective pool, e.g., the check stock task, which Supplier performs upon receiving an order from Customer. Action profiles are combinations of actions that can (or must) be executed concurrently. For example, sending the order and receiving the order after the start of the collaboration may be performed synchronously (e.g., via telephone). The available actions of a player in a given state are the tasks or events in the respective pool that can be executed or happen next – plus the possibility to idle. The transition probabilities for available actions in this BPMN process are all 1, such that if players choose to execute certain tasks next, they will be able to do so as long as the chosen activities are actually available actions. As a consequence, all other transition probabilities are 0.
One important piece of information that is not explicitly specified in the BPMN model is the utility (or payoff) of tasks and events. In general, it is non-trivial to choose utility functions. However, the example is chosen such that there are natural candidates; e.g., postage can be looked up from one's favorite carrier.
A single instance of the order-to-cash process exhibits the well-known phenomenon that Customer has no incentive to pay. However, we want to stress that – very much for the same reason! – Shipper would not have any good reason to perform delivery once the postage fee is paid. Thus, besides the single instance scenario, we shall consider an unbounded number of repetitions of the process, but only one active process instance at each point in time. (Footnote: We leave the very interesting situation of interleaved execution of several process instances for future work.) Now, the rational reason for the shipper to deliver (and return damaged goods) is expected revenue from future process instances.
One distinguishing feature of the order-to-cash collaboration is that participants do not need to coordinate with each other in any non-trivial way; in particular, there are no joint decisions to make. To illustrate the point, let us consider a different example. Alice and Bob are co-founders of a company, which is running so smoothly that it suffices if, on any day of the week, only one of them goes to work. Depending on their mood, at least one of them can spend their time with their favorite hobby.

Fig. 2. Bob's ideas about work life balance with Alice's disapproval (BPMN sketch with the pools Bob and Alice and the tasks surfing, work, and fishing; annotated “Bad diagram!”)

Bob sketches a first plan to synchronize their schedules, depicted in Figure 2. However, Alice points out that this diagram is no good, among other reasons because it mixes data and event-based decisions. Alice remarks that this can be avoided by throwing a coin in the morning to decide who goes to work. But that leaves the question of who should throw the coin and communicate the result. Alice suggests that their secretary Mrs. Medina could help them out by rolling a 10-sided die each morning and notifying them about who is going to go to work that day.
Alice comes up with the more elaborate process shown in Fig. 3, which lets Bob and Alice work 60% and 40% of the days, resp., because Alice is 50% more efficient than Bob. (Footnote: In the plot on the right in Fig. 3, the projections to the x-y, y-z, and x-z plane are rendered as dashed lines for clarity.) In fact, every point on the solid red line on the right in Fig. 3 corresponds to a
choice of odds for going to work; the given BPMN model corresponds to the dot slightly off the middle. The coordinates of each point describe the expected gain in personal enjoyment and profits for the company, respectively. Thus, there is a choice of parameter for any suitable work life balance. One might want to maximize for distance from the origin, but questions of fairness can be addressed as well. However, Alice's criterion for choosing the odds is equal contribution of both co-founders to the company income.

Fig. 3. The “To work or not to work?” collaboration (BPMN collaboration diagram with the pools Bob, Alice, and Mrs. Medina)

In game theoretic terminology, Mrs. Medina is taking the role of a common source of randomness that is independent of the state of the game and does not need to observe the actions of the players. The specific formal notion that we shall use is that of an autonomous correlation device [25, Definition 2.1].
Definition 2 (Autonomous correlation device). An autonomous correlation device is a family of pairs $\mathcal{D} = \{ \langle \{M^i_n\}_{i \in N}, d_n \rangle \}_{n \in \mathbb{N}}$ (that is indexed over natural numbers $n \in \mathbb{N}$), each of which consists of
– a family of finite sets of signals $M^i_n$, (additionally) indexed over players; and
– a function $d_n$ that maps lists of signal vectors $\langle x_1, \dots, x_{n-1} \rangle \in \prod_{k=1}^{n-1} M_k$ to probability distributions $d_n\langle x_1, \dots, x_{n-1} \rangle \in \Delta(M_n)$ over the Cartesian product $M_n = \prod_{i=1}^{|N|} M^i_n$ of all signal sets $M^i_n$.
We shall refer to operators of autonomous correlation devices as mediators, which guide the actions of players during the game.
Each correlation device for a game induces an extended game, which proceeds in stages. Concerning the example of Bob and Alice, Fig. 3 is a description of the extended game of the collaboration that Bob drafted combined with the simplest possible correlation device – assuming that each of the roles restarts at the end of the day.
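As an illustration of Definition 2, the following minimal Python sketch mimics Mrs. Medina's device: each morning a 10-sided die is rolled and the same recommendation is sent to both players. The function names, the encoding of signals as names, and the threshold 6 (chosen here so that Bob works about 60% of the days, as stated in the text) are assumptions made for this sketch and are not taken from the process model. The device is autonomous in the sense that it inspects neither the state of the game nor the players' actions; in this simple case it does not even depend on previously drawn signals.

```python
import random


def medina_device(rng, bob_sides=6, sides=10):
    """Draw one signal vector (signal for Bob, signal for Alice).

    Assumption of this sketch: rolls 1..bob_sides send Bob to work,
    the remaining rolls send Alice to work.
    """
    roll = rng.randint(1, sides)  # fair die with faces 1..sides
    worker = "Bob" if roll <= bob_sides else "Alice"
    return (worker, worker)  # both players receive the same recommendation


def work_shares(days, seed=0):
    """Empirical share of working days per player if both heed the signal."""
    rng = random.Random(seed)
    counts = {"Bob": 0, "Alice": 0}
    for _ in range(days):
        signal_for_bob, _signal_for_alice = medina_device(rng)
        counts[signal_for_bob] += 1
    return {name: count / days for name, count in counts.items()}


if __name__ == "__main__":
    print(work_shares(10_000))
```

Changing `bob_sides` moves along the line of possible odds discussed above; the sketch does not model the players' option to ignore the signal and idle.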
In general, given a game and an autonomous correlation device, the $n$-th stage begins with the mediator drawing a signal vector $x_n \in M_n = \prod_{i=1}^{|N|} M^i_n$ according to the device distribution $d_n\langle x_1, \dots, x_{n-1} \rangle$ – e.g., Mrs. Medina rolling the die – and sending the components to the respective players – the sending of messages to Bob and Alice (in one order or the other). Then, each player $i$ chooses an available action $a^i_n$. This choice can be based on the respective component $x^i_n$ of the signal vector $x_n \in M_n$, information about previous states $s_k$ of the game $G$, and moves $a^j_k$ of (other) players from the history. After all players have made their choice, we obtain an action profile $a_n = \langle a^1_n, \dots, a^{|N|}_n \rangle$. The reader may note that the process model does not give Alice and Bob a lot of choice: once Mrs. Medina has sent them the message with the decision, they can only notify each other (respectively receive this notification), and then work or surf / fish. However, in our game setting they can also choose the “idle” action, e.g., instead of going to work.
While playing the extended game described above, each player makes observations about the state and the actions of players; the role of the mediator is special insofar as it does not need and is also not expected to observe the run of the game. The “local” observations of each player are the basis of their strategies.
Definition 3 (Observation, strategy, strategy profile). For a natural number $n \in \mathbb{N}$, an observation at stage $n$ by player $i$ is a tuple $h = \langle s_1, x^i_1, a_1, \dots, s_{n-1}, x^i_{n-1}, a_{n-1}, s_n, x^i_n \rangle$ that consists of
– one state $s_k$, signal $x^i_k$, and action profile $a_k$, for each number $k < n$,
– the current state $s_n$, also denoted by $s_h$, and
– the current signal $x^i_n$.
The set of all observations at stage $n$ is denoted by $H^i_n(\mathcal{D})$. The union $H^i(\mathcal{D}) = \bigcup_{n \in \mathbb{N}} H^i_n(\mathcal{D})$ of observations at any stage is the set of observations of player $i$. A strategy is a map $\sigma^i : H^i(\mathcal{D}) \to \Delta(A^i)$ from observations to probability distributions over actions that are available at the current state of histories, i.e., $\sigma^i_h(a^i) = 0$ if $a^i \notin A^i(s_h)$, for all histories $h \in H^i(\mathcal{D})$. A strategy profile is a player-indexed family of strategies $\{\sigma^i\}_{i \in N}$.
Thus, each player observes the moves of the other players, which includes the possibility of punishing other players for not heeding the advice of the mediator. (Footnote: In the present paper, we only consider games of perfect information, which is suitable for business processes in a single organization or which are monitored on a blockchain.) This is possible since signals might give (indirect) information concerning the (mis-)behavior of players in the past, as remarked by Solan and Vieille [25, p. 370]: by revealing information about proposed actions of previous rounds, players can check for themselves whether some player has ignored some signal of the mediator.
The data of a game, a correlation device, and a strategy profile induce probabilities for finite plays of the game, which in turn determine the expected utility of playing the strategy. Formally, an autonomous correlation device and a strategy profile with strategies for every player yield a probabilistic trajectory of a sequence of “global” states, signal vectors of all players, and complete action profiles, dubbed history. The formal details are as follows.
Definition 4 (History and its probability). Given a natural number $n \in \mathbb{N}$, a history at stage $n$ is a tuple $h = \langle s_1, x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle$ that consists of
– one state $s_k$, signal vector $x_k$, and action profile $a_k$, for each number $k < n$,
– the current state $s_n$, often denoted by $s_h$, and
– the current signal vector $x_n$.
The set of all histories at stage $n$ is denoted by $H_n(\mathcal{D})$. The union $H(\mathcal{D}) = \bigcup_{n \in \mathbb{N}} H_n(\mathcal{D})$ of histories at arbitrary stages is the set of finite histories. The probability of a finite history $h = \langle s_1, x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle$ in the context of a correlation device $\mathcal{D}$, an initial state $s$, and a strategy profile $\sigma$ is defined as follows, by recursion over the length of histories.
$n = 1$: $P_{\mathcal{D},s,\sigma}(\langle s_1, x_1 \rangle) = 0$ if $s_1 \neq s$, and $P_{\mathcal{D},s,\sigma}(\langle s_1, x_1 \rangle) = d_1\langle\rangle(x_1)$ otherwise.
$n > 1$: $P_{\mathcal{D},s,\sigma}(\langle h', a_{n-1}, s_n, x_n \rangle) = P_{\mathcal{D},s,\sigma}(h') \cdot p_{h'}(a_{n-1}) \cdot q(s_n \mid s_{n-1}, a_{n-1}) \cdot p_{d_n}(x_n)$, where $h' = \langle s_1, x_1, a_1, \dots, s_{n-1}, x_{n-1} \rangle$ is the prefix history at stage $n-1$, $p_{h'}(a_{n-1}) = \prod_{i \in N} \sigma^i_{h'}(a^i_{n-1})$, and $p_{d_n}(x_n) = d_n\langle x_1, \dots, x_{n-1} \rangle(x_n)$.
Again, note that the autonomous correlation device does not “inspect” the states of a history, in the sense that the distributions over signal vectors $d_n$ are not parameterized over states from the history, but only over previously drawn signal vectors – whence the name.
Definition 5 (Mean expected payoff). The mean expected payoff of player $i$ for stage $n$ is
$$\bar{\gamma}^i_n(\mathcal{D}, s, \sigma) = \sum_{h \in H_{n+1}(\mathcal{D})} P_{\mathcal{D},s,\sigma}(h) \, \frac{1}{n} \sum_{k=1}^{n} u^i(s_k, a_k)$$
where $h = \langle s_1, x_1, a_1, \dots, a_n, s_{n+1}, x_{n+1} \rangle$.
At this point, we can address the question of what a good strategy profile is and fill in all the details of the idea that an equilibrium is a strategy profile that does not give players any good reason to deviate unilaterally. We shall tip our hats to game theory and use the notation $(\pi^i, \sigma^{-i})$ for the strategy profile which is obtained by “overwriting” the single strategy $\sigma^i$ of player $i$ with a strategy $\pi^i$ (which might, but does not have to be different); thus, the expression ‘$(\pi^i, \sigma^{-i})$’ denotes the unique strategy profile subject to the equations $(\pi^i, \sigma^{-i})^i = \pi^i$ and $(\pi^i, \sigma^{-i})^j = \sigma^j$ (for $i \neq j$).
Definition 6 (Autonomous correlated ε-equilibrium). Given a positive real $\varepsilon > 0$, an autonomous correlated ε-equilibrium is a pair $\langle \mathcal{D}, \sigma^* \rangle$, which consists of an autonomous correlation device $\mathcal{D}$ and a strategy profile $\sigma^*$ for which there exists a natural number $n_0 \in \mathbb{N}$ such that, for any alternative strategy $\sigma^i$ of any player $i$, Inequality (1) holds for all $n \geq n_0$ and all states $s \in S$.
$$\bar{\gamma}^i_n(\mathcal{D}, s, \sigma^*) \geq \bar{\gamma}^i_n\bigl(\mathcal{D}, s, (\sigma^i, (\sigma^*)^{-i})\bigr) - \varepsilon \qquad (1)$$
Thus, a strategy is an autonomous correlated ε-equilibrium if the benefits that one might reap in the long run by unilateral deviation from the strategy are negligible as ε can be arbitrarily small. In fact, other players will have ways to punish deviation from the equilibrium [25, § 3.2].
Footnote (on the probability of finite histories): Note that these probabilities induce the probability measure on infinite histories from Solan and Vieille [25]; thus, we extend the notation of op. cit. to finite histories.
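To make Definitions 5 and 6 concrete, here is a small Python sketch that evaluates the mean payoff of a single history, takes the expectation over an explicitly given finite list of histories with their probabilities, and checks Inequality (1) for one pair of payoff values. The encoding of a history as a list of (state, action profile) pairs and all names, states, and numbers in the example are hypothetical; computing the probabilities themselves (Definition 4) and quantifying over all deviations and stages are not covered by this sketch.

```python
def mean_payoff(stages, utility, player):
    """(1/n) * sum of u^i(s_k, a_k) over the n completed stages of one history."""
    return sum(utility(player, state, profile) for state, profile in stages) / len(stages)


def mean_expected_payoff(weighted_histories, utility, player):
    """Sum over histories h of P(h) times the mean payoff of h (Definition 5)."""
    return sum(prob * mean_payoff(stages, utility, player)
               for stages, prob in weighted_histories)


def within_epsilon(payoff_candidate, payoff_deviation, eps):
    """Inequality (1) for one player, one deviation, and one stage n."""
    return payoff_candidate >= payoff_deviation - eps


# Hypothetical two-player example: working pays 2.0, idling pays 0.0.
def example_utility(player, state, profile):
    return {"work": 2.0, "idle": 0.0}[profile[player]]


HISTORY_BOB_WORKS = [("day", ("work", "idle")), ("day", ("work", "idle"))]
HISTORY_ALICE_WORKS = [("day", ("idle", "work")), ("day", ("idle", "work"))]
WEIGHTED = [(HISTORY_BOB_WORKS, 0.5), (HISTORY_ALICE_WORKS, 0.5)]

if __name__ == "__main__":
    print(mean_expected_payoff(WEIGHTED, example_utility, player=0))
```

The helper `within_epsilon` only tests a single instance of the inequality; an actual equilibrium check would have to consider every alternative strategy of every player.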
Elementary net systems [23] are expressive enough for the purposes of the present paper; however, for the sake of simplicity, we refer to them as Petri nets.
Definition 7 (Petri net, marking, and marked Petri net). A Petri net is a tuple $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet)$ that consists of
– a finite set of places $P$;
– a finite set of transitions $T$; and
– two functions ${}^\bullet(\,), (\,)^\bullet : T \to \wp P \setminus \{\emptyset\}$ that assign to each transition $t \in T$ its pre-set ${}^\bullet t \subseteq P$ and post-set $t^\bullet \subseteq P$, respectively, which are both required to be non-empty.
A marking of a Petri net $\mathcal{N}$ is a multiset of places $m$, i.e., a function $m : P \to \mathbb{N}$ that assigns to each place $p \in P$ a non-negative integer $m(p) \geq 0$. A marked Petri net is a tuple $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet, m_0)$ whose first four components $(P, T, {}^\bullet(\,), (\,)^\bullet)$ are a Petri net and whose last component $m_0$ is the initial marking, which is a marking of the latter Petri net.
One essential feature of Petri nets is the ability to execute several transitions concurrently – possibly several occurrences of one and the same transition. However, we shall only encounter situations in which a set of transitions fires. Again, for the sake of simplicity, we shall use the general term step. We fix a Petri net $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet)$ for the remainder of the section.
Definition 8 (Step, step transition, reachable marking). A step in the net $\mathcal{N}$ is a set of transitions $\mathbf{t} \subseteq T$. The transition relation of a step $\mathbf{t} \subseteq T$ relates a marking $m$ to another marking $m'$, in symbols $m \, [\mathbf{t}\rangle \, m'$, if the following two conditions are satisfied, for every place $p \in P$.
1. $m(p) \geq |\{t \in \mathbf{t} \mid p \in {}^\bullet t\}|$
2. $m'(p) = m(p) - |\{t \in \mathbf{t} \mid p \in {}^\bullet t\}| + |\{t \in \mathbf{t} \mid p \in t^\bullet\}|$
We write $m \, [\rangle \, m'$ if $m \, [\mathbf{t}\rangle \, m'$ holds for some step $\mathbf{t}$ and denote the reflexive transitive closure of the relation $[\rangle$ by $[\rangle^*$. A marking $m'$ is reachable in a marked Petri net $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet, m_0)$ if $m_0 \, [\rangle^* \, m'$ holds (in the net $(P, T, {}^\bullet(\,), (\,)^\bullet)$).
For a single transition $t \in T$, we write $m \, [t\rangle \, m'$ instead of $m \, [\{t\}\rangle \, m'$. Note that the empty step is always fireable, i.e., for each marking $m$, we have an “idle” step $m \, [\emptyset\rangle \, m$.
Recall that a marked Petri net $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet, m_0)$ is safe if all reachable markings $m'$ have at most one token in any place, i.e., if they satisfy $m'(p) \leq 1$, for all $p \in P$. Thus, a marking $m$ corresponds to a set $\hat{m} \subseteq P$ satisfying $p \in \hat{m}$ iff $m(p) > 0$; for convenience, we shall identify a safe marking $m$ with its set of places $\hat{m}$. The main focus will be on Petri nets that are safe and extended free choice, i.e., if the pre-sets of two transitions have a place in common, the pre-sets coincide.
Finally, recall the conflict relation: two transitions $t, t' \in T$ are in conflict, denoted by $t \mathrel{\#} t'$, if ${}^\bullet t \cap {}^\bullet t' \neq \emptyset$; for extended free choice nets, the conflict relation is an equivalence relation.
Footnote (on elementary net systems): In fact, the results of the paper apply to the general case, mutatis mutandis.
Footnote (on Definition 7): Also, note that we leave the flow relation implicit as its identity is immaterial for the execution semantics.
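The two conditions of Definition 8 translate almost literally into code. The following Python sketch represents a marking as a mapping from places to token counts and pre- and post-sets as dictionaries of sets; this encoding, the function names, and the tiny example net with two conflicting transitions are assumptions of the sketch rather than notation from the paper.

```python
from collections import Counter


def is_enabled(marking, step, pre):
    """Condition 1: every place holds at least as many tokens as the step consumes."""
    demand = Counter(p for t in step for p in pre[t])
    return all(marking.get(p, 0) >= k for p, k in demand.items())


def fire(marking, step, pre, post):
    """Condition 2: return the successor marking, or None if the step is not enabled."""
    if not is_enabled(marking, step, pre):
        return None
    result = Counter(marking)
    for t in step:
        result.subtract(pre[t])  # remove one token from each place of the pre-set
        result.update(post[t])   # add one token to each place of the post-set
    return {p: k for p, k in result.items() if k > 0}


# Hypothetical net: transitions t and t_alt compete for the token in p1.
PRE = {"t": {"p1"}, "t_alt": {"p1"}}
POST = {"t": {"p2"}, "t_alt": {"p3"}}

if __name__ == "__main__":
    print(fire({"p1": 1}, {"t"}, PRE, POST))           # token moves from p1 to p2
    print(fire({"p1": 1}, set(), PRE, POST))           # idle step leaves the marking unchanged
    print(fire({"p1": 1}, {"t", "t_alt"}, PRE, POST))  # conflicting step is not enabled
```

In the example, the step containing both conflicting transitions is not enabled with a single token, which is exactly the situation that the fair conflict resolution of the base game addresses later on.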
Soundness of business processes in the sense of Van der Aalst [2] implies termination if transitions are governed by a strongly fair scheduler [1]:
“If we assume a strong notion of fairness, then the first requirement implies that eventually state [o] is reached. Strong fairness, sometimes also referred to as “impartial” or “recurrent” [KA99], means that in every infinite firing sequence, each transition fires infinitely often. Note that weaker notions of fairness are not sufficient, see Figure 2 in [KA99].”
Indeed, such a scheduler fits the intra-organizational setting. However, as discussed for the order-to-cash process model, unfair scheduling practices could arise in the inter-organizational setting if undesired behavior yields higher profits. We consider incentive alignment to rule out scenarios that lure actors into counterproductive behavior. We can even check whether all activities in a given BPMN model with utility annotations are relevant and profitable.
As BPMN models have established Petri net semantics [6], it suffices to consider the latter for the game theoretic aspects of incentive alignment. As a preparatory step, we extend Petri nets with utility functions as pioneered by von Neumann and Morgenstern [19]. Then we describe two ways to associate a stochastic game to a Petri net with transition-based utilities: the first game retains the state space and the principal design choice concerns transition probabilities; the second game is the restarting version of the first game, which will turn out to be better suited to analyze business processes and even gives rise to a tight connection with the soundness property of workflow nets [2]. Finally, we define incentive alignment in full formal rigor based on stochastic games and show that the soundness property for workflow nets can be “rediscovered” as a special case of incentive alignment; in other words, soundness is conserved, and therefore incentive alignment is a conservative extension of soundness.
We assume that costs (respectively profits) are incurred (resp. gained) per task and that, in particular, utility functions do not depend on the state. Note that the game theoretic results do not require this assumption; however, this assumption not only avoids clutter, but also retains the spirit of the ABC method [12] and is in line with the work of Herbert and Sharp [9].
Definition 9 (Petri net with transition payoffs and roles). For a set of roles $\mathcal{R}$, a Petri net with transition payoffs and roles is a triple $(\mathcal{N}, u, \rho)$ where
– $\mathcal{N} = (P, T, {}^\bullet(\,), (\,)^\bullet, m_0)$ is a marked Petri net with initial marking $m_0$,
– $u : \mathcal{R} \to T \to \mathbb{R}$ is a utility function, and
– $\rho : T \rightharpoonup \mathcal{R}$ is a partial function, assigning at most one role to each transition.
The utility $u^i(\mathbf{t})$ of a step $\mathbf{t} \subseteq T$ is the sum of the utilities of its elements, i.e., $u^i(\mathbf{t}) = \sum_{t \in \mathbf{t}} u^i(t)$, for each role $i \in \mathcal{R}$.

Fig. 4. Extending Petri nets with role and utility annotations (left: a Petri net with four places and four transitions; right: the same net with role-utility annotations for the roles a, b, and c)
As a consequence of the definition, the idle step has zero utility. We have included the possibility that some of the transitions are not controlled by any of the roles (of a BPMN model) by using a partial function from transitions to roles; we take a leaf out of the game theorist's book and attribute the missing role to nature.
Fig. 4 displays a Petri net on the left. The names of the places $p_1, \dots, p_4$ will be convenient later. In the same figure on the right, we have added annotations that carry information concerning roles, costs, and profits in the form of lists of role-utility pairs next to transitions. E.g., one of the transitions is assigned to role $a$, and firing it results in utility $-1$ for $a$, i.e., one unit of cost. The first role in each list denotes responsibility for the transition and we have omitted entries with zero utility. We also have colored transitions with the same color as the role assigned to them.
There are natural translations from BPMN models with payoff annotations for activities to Petri nets with payoffs and roles (relative to any of the established Petri net semantics for models in BPMN [6]). If pools are used, we take one role per pool and each task is assigned to its enclosing pool; for pairs of sending and receiving tasks or events, the sender is responsible for the transition to be taken. The only subtle point concerns the role of nature. When should we blame nature for the data on which choices are based? The answer depends on the application at hand. For instance, let us consider the order-to-cash model of Fig. 1: whether or not the supplier is out of stock is only partially within the control of the supplier; also, we can avoid the need to meticulously model all factors that might lead to damage of a shipped good by blaming nature. In a first approximation, we simply let nature determine the state of the stock and damage goods at will.
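The annotations of Definition 9 can be kept in two plain dictionaries, as in the following Python sketch. The transition names, the role names, and all utility values are hypothetical and merely imitate the style of Fig. 4; the string "nature" stands in for the missing role of transitions on which the partial function is undefined.

```python
# Hypothetical annotations in the style of Fig. 4: role -> transition -> utility.
UTILITIES = {
    "a": {"t1": -1, "t3": 2},
    "b": {"t2": 1},
    "c": {"t2_alt": 1, "t3": 1},
}

# Partial function rho: transitions missing from this dictionary belong to nature.
RHO = {"t1": "a", "t2": "b", "t2_alt": "c", "t3": "a"}


def step_utility(utilities, role, step):
    """u^i(step): sum of the utilities of the transitions in the step for one role."""
    return sum(utilities.get(role, {}).get(t, 0) for t in step)


def responsible(rho, transition):
    """Role in charge of a transition, with nature as the fallback."""
    return rho.get(transition, "nature")


if __name__ == "__main__":
    print(step_utility(UTILITIES, "a", {"t1", "t3"}))  # -1 + 2
    print(step_utility(UTILITIES, "a", set()))          # the idle step has zero utility
    print(responsible(RHO, "t_damage"))                 # an unassigned transition
```

Omitted dictionary entries play the role of the zero-utility entries that are left out of the annotation lists in the figure.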
We now describe how each Petri net with transition payo ff s and roles gives rise to astochastic game, based on two design choices: each role can execute only one (enabled)transition at a time and conflicts are resolved in a probabilistically fair manner. Forexample, for the net on the right in Fig. 4, we take four states p , p , p , p , one for eachreachable marking. The Petri net does not prescribe what should happen if roles a and c both try to fire transitions t and t (cid:48) simultaneously if the game is in state p . The simplestprobabilistically fair solution consists of flipping a coin; depending on the outcome, thegame continues in state p or in state p . For the general case, let us fix a safe, extendedfree-choice net ( N , u , ρ ) with payo ff s and roles whose initial marking is m . Definition 10 (The base game with fair conflicts).
Let $\mathcal{X} \subseteq \wp T$ be the partitioning of the set of transitions into equivalence classes of the conflict relation on the set of transitions, i.e., $\mathcal{X} = \{\{t' \in T \mid t' \text{ is in conflict with } t\} \mid t \in T\}$; its members are called conflict sets. Given a safe marking $m \subseteq P$ and a step $\mathbf{t} \subseteq T$, a maximal $m$-enabled sub-step is a step $\mathbf{t}'$ that is enabled at the marking $m$, is contained in the step $\mathbf{t}$, and contains one transition of each conflict set that has a non-empty intersection with the step, i.e., such that all three of $m[\mathbf{t}'\rangle$, $\mathbf{t}' \subseteq \mathbf{t}$, and $|\mathbf{t}'| = |\{X \in \mathcal{X} \mid \mathbf{t} \cap X \neq \emptyset\}|$ hold. We write $\mathbf{t}' \sqsubseteq_m \mathbf{t}$ if the step $\mathbf{t}'$ is a maximal $m$-enabled sub-step of the step $\mathbf{t}$.

The base game with fair conflicts $\langle N, S, A, q, u \rangle$ of the net $(N, u, \rho)$ is defined as follows.
– The set of players $N := R \cup \{\bot\}$ is the set of roles and nature, $\bot \notin R$.
– The state space $S$ is the set of reachable markings, i.e., $S = \{m' \mid m_0 [\rangle^* m'\}$, where $m_0$ is the initial marking.
– The action set of an individual player $i$ is $A_i := \{\emptyset\} \cup \{\{t\} \mid t \in T, \rho(t) = i\}$, which consists of the empty set and possibly singletons of transitions, where $\rho(t) = \bot$ if $\rho(t)$ is not defined. We identify an action profile $a \in A = \prod_{i=1}^{|N|} A_i$ with the union of its components, $a \equiv \bigcup_{i \in N} a_i$.
– In a given state $m$, the available actions of player $i$ are the enabled transitions, i.e., $A_i(m) = \{\{t\} \in A_i \mid m[t\rangle\}$.
– The transition probabilities are
$$q(m' \mid m, \mathbf{t}) = \sum_{\mathbf{t}' \sqsubseteq_m \mathbf{t} \text{ s.t. } m[\mathbf{t}'\rangle m'} \ \prod_{X \in \mathcal{X} \text{ s.t. } \mathbf{t} \cap X \neq \emptyset} \frac{1}{|\mathbf{t} \cap X|}.$$
– The payoffs are $u_i(m, \mathbf{t}) = \sum_{t \in \mathbf{t}} u_i(t)$ if $i \in R$ and $u_\bot(m, \mathbf{t}) = 0$, for all $\mathbf{t} \subseteq T$ and $m \subseteq P$.

Let us summarize the stochastic game of a given Petri net with transition payoffs and roles. The stochastic game has the same state space as the Petri net, i.e., the set of reachable markings.
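The transition probabilities of Definition 10 can be made concrete with a minimal sketch. Everything named below is an assumption made for illustration: the toy net `NET` is hypothetical (it is not the net of Fig. 4), markings and steps are encoded as frozensets, and conflict sets are computed by grouping transitions with equal pre-sets, which is only adequate under the extended free-choice assumption.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy net (not the net of Fig. 4): transition -> (pre-set, post-set).
NET = {
    "t1": (frozenset({"p1"}), frozenset({"p2"})),
    "t2": (frozenset({"p2"}), frozenset({"p3"})),
    "t2x": (frozenset({"p2"}), frozenset({"p4"})),
}

def conflict_sets(net):
    """Partition transitions by pre-set (assumption: extended free-choice net)."""
    classes = {}
    for name, (pre, _post) in net.items():
        classes.setdefault(pre, set()).add(name)
    return [frozenset(c) for c in classes.values()]

def transition_probabilities(net, marking, step):
    """Return the distribution {m': q(m' | marking, step)} as exact fractions."""
    # Transitions of the step, grouped by the conflict sets that the step touches.
    touched = [sorted(step & x) for x in conflict_sets(net) if step & x]
    # Every maximal sub-step gets the weight prod_X 1 / |step ∩ X|.
    weight = Fraction(1)
    for candidates in touched:
        weight /= len(candidates)
    dist = {}
    for sub_step in product(*touched):  # one transition per touched conflict set
        pres = [net[t][0] for t in sub_step]
        consumed = frozenset().union(*pres)
        # Enabled: all pre-sets are marked and pairwise disjoint.
        if consumed <= marking and sum(map(len, pres)) == len(consumed):
            produced = frozenset().union(*(net[t][1] for t in sub_step))
            target = (marking - consumed) | produced
            dist[target] = dist.get(target, Fraction(0)) + weight
    return dist

# Two players try to fire conflicting transitions: a fair coin decides.
coin_flip = transition_probabilities(NET, frozenset({"p2"}), frozenset({"t2", "t2x"}))
```

In this sketch the empty step (all players idle) keeps the current marking with probability one, because the product over zero conflict sets yields a single empty sub-step.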
The available actions for each player at a given marking are the enabled transitions that are assigned to the player, plus the "idle" step. Each step comes with a state-independent payoff, which sums up the utilities of each single transition, for each player $i$. In particular, if all players choose to idle, the corresponding action profile is the empty step $\emptyset$, which gives payoff 0. The transition probabilities implement the idea that all transitions of an action profile get a fair chance to fire, even if the step contains conflicting transitions. Let us highlight the following two points for a fixed marking and step: (1) given a maximal enabled sub-step, we roll a fair "die" for each conflict set, where the "die" has one "side" for each transition in the conflict set that also belongs to the step (unless the "die" has zero sides); (2) there might be several choices of maximal enabled sub-steps that lead to the same marking. In the definition of transition probabilities, the second point is captured by summation over maximal enabled sub-steps of the step, and the first point corresponds to a product of probabilities for each outcome of "rolling" one of the "dice".

We want to emphasize that if additional information about transition probabilities is known, it should be incorporated. In a similar vein, one can adapt the approach of Herbert and Sharp [9], which extends the BPMN language with probability annotations for choices. However, as we are mainly interested in a priori analysis, our approach might be preferable since it avoids arbitrary parameter guessing. The most important design choice that we have made concerns the role of nature, which we consider as absolutely neutral; it is not even concerned with progress of the system as it does not benefit from transitions being fired.

Now, let us consider once more the order-to-cash process. If the process reaches the state in which customer's next step is payment, there is no incentive for paying.
Instead, customer can choose to idle, ad infinitum. In fact, this strategy yields maximum payoff for the customer. The BPMN model does not give any means for punishing customer's payment inertia. However, there is not even any incentive for shipper to pick up the goods! Incentives in the single instance scenario can be fixed, e.g., by adding escrow. However, in the present paper, we shall give yet a different perspective: we repeat the process indefinitely.

Fig. 5. Restarting process example

The single instance game from Definition 10 has one major drawback: it only allows analyzing a single instance of a business process. We shall now consider a variation of the stochastic game, which addresses the case of multiple instances in the simplest form. The idea is the same as the one for looping versions of workflow nets that have been considered in the literature, e.g., to relate soundness with liveness [1, Lemma 5.1]: we simply restart the game in the initial state whenever we reach a final marking.
Definition 11 (Restart game).
A safe marking $m \subseteq P$ is final if it does not intersect with any pre-set, i.e., if $m \cap {}^\bullet t = \emptyset$ for all transitions $t \in T$; we write $m{\downarrow}$ if the marking $m$ is final, and $m{\not\downarrow}$ if not. Let $\langle N, S, A, q, u \rangle$ be the base game with fair conflicts of the net $(N, u, \rho)$, and let $m_0$ denote the initial marking. The restart game of the net $(N, u, \rho)$ is the game $\langle N, \mathring{S}, \mathring{A}, \mathring{q}, u \rangle$ with
– $\mathring{S} = S \setminus \{m'' \subseteq P \mid m''{\downarrow}\}$;
– $\mathring{q}(m' \mid m, \mathbf{t}) = \begin{cases} q(m' \mid m, \mathbf{t}) & \text{if } m' \neq m_0 \\ q(m_0 \mid m, \mathbf{t}) + \sum_{m''{\downarrow}} q(m'' \mid m, \mathbf{t}) & \text{if } m' = m_0 \end{cases}$ for all $m, m' \in \mathring{S}$; and
– the available actions restricted to $\mathring{S} \subseteq S$, i.e., $\mathring{A}_i(s) = A_i(s)$ for $s \in \mathring{S}$.
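The passage from the base game to the restart game in Definition 11 amounts to redirecting probability mass from final markings to the initial marking. The following is a minimal sketch under stated assumptions: the net `NET`, the marking `INITIAL`, and the distribution `base` are hypothetical, and a distribution is encoded as a dictionary from markings (frozensets of places) to exact fractions.

```python
from fractions import Fraction

# Hypothetical toy net: transition -> (pre-set, post-set).
NET = {
    "t1": (frozenset({"p1"}), frozenset({"p2"})),
    "t2": (frozenset({"p2"}), frozenset({"p3"})),
}
INITIAL = frozenset({"p1"})

def is_final(net, marking):
    """A marking is final if it does not intersect any pre-set."""
    return all(not (marking & pre) for pre, _post in net.values())

def restart(net, initial, dist):
    """Turn a base-game distribution q(. | m, t) into the restart-game one."""
    out = {}
    for target, prob in dist.items():
        # Mass on a final marking is added to the mass of the initial marking.
        key = initial if is_final(net, target) else target
        out[key] = out.get(key, Fraction(0)) + prob
    return out

# Hypothetical base-game distribution, chosen only to exercise both cases.
base = {frozenset({"p3"}): Fraction(2, 3), INITIAL: Fraction(1, 3)}
restarted = restart(NET, INITIAL, base)
```

Final markings never appear as keys of the result, which mirrors the removal of final markings from the state space.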
For workflow nets, the variation amounts to identifying the final place with the initial place. The passage to the restart game is illustrated in Fig. 5. Note that the restart game of our example is drastically different. Player $c$ will be better off "cooperating" and never choosing the action $t'$, but instead idly reaping benefits by letting players $a$ and $b$ do the work.

We now formalize the idea that participants want to expect benefits from taking part in a collaboration if agents behave rationally, which is the standard assumption of game theory.
The proposed definition of incentive alignment is in principle of qualitative nature, but it hinges on quantitative information, namely the expected utility for each of the business partners of an inter-organizational process.

Let us consider a Petri net with payoffs $(N, u, \rho)$, e.g., the Petri net semantics of a BPMN model. Incentive alignment amounts to the existence of equilibrium strategies in the associated restart game $\langle N, \mathring{S}, \mathring{A}, \mathring{q}, u \rangle$ (as per Definition 11) that eventually will lead to positive utility for every participating player. The full details are as follows.

Definition 12 (Incentive alignment w.r.t. completion and full liveness).
Given an autonomous correlation device $\mathcal{D}$, a correlated strategy profile $\sigma$ is eventually positive if there exists a natural number $\bar{n} \in \mathbb{N}$ such that, for all larger natural numbers $n > \bar{n}$, the expected payoff of every player is positive, i.e., for all $i \in N$, $\bar{\gamma}^i_n(\mathcal{D}, m_0, \sigma) > 0$, where $m_0$ is the initial marking. Incentives in the net $(N, u, \rho)$ are aligned with
– proper completion if, for every positive real $\varepsilon > 0$, there exist an autonomous correlation device $\mathcal{D}$ and an eventually positive correlated $\varepsilon$-equilibrium strategy profile $\sigma$ of the restart game $\langle N, \mathring{S}, \mathring{A}, \mathring{q}, u \rangle$ such that, for every natural number $\bar{n} \in \mathbb{N}$, there exists a history $h \in H_n(\mathcal{D})$ at stage $n > \bar{n}$ with current state $s_h = m_0$ that has non-zero probability, i.e., $\mathbf{P}_{\mathcal{D}, m_0, \sigma}(h) > 0$;
– full liveness if, for every positive real $\varepsilon > 0$, there exist an autonomous correlation device $\mathcal{D}$ and an eventually positive correlated $\varepsilon$-equilibrium strategy profile $\sigma$ of the restart game $\langle N, \mathring{S}, \mathring{A}, \mathring{q}, u \rangle$ such that, for every transition $t \in T$, for every reachable marking $m'$, and for every natural number $\bar{n} \in \mathbb{N}$, there exists a history $h = \langle m', x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle \in H_n(\mathcal{D})$ at stage $n > \bar{n}$ with $t \in a_{n-1}$ and $\mathbf{P}_{\mathcal{D}, m', \sigma}(h) > 0$.

Both variations of incentive alignment ensure that all participants can expect to gain profits on average, eventually; moreover, something "good" will always be possible in the future, where something "good" is either restart of the game (upon completion) or additional occurrences of every transition.

There are several interesting consequences. First, incentive alignment w.r.t. full liveness implies incentive alignment w.r.t. proper completion, for the case of safe conflict-free Petri nets where the initial marking is only reachable via the empty transition sequence, including the following class of workflow nets.
Definition 13 (Elementary workflow net). An elementary workflow net is an elementary net system $N = (P, T, {}^\bullet(\,), (\,)^\bullet)$ [23] such that
– there exist unique places $i \in P$ and $o \in P$ such that $i \notin t^\bullet$ and $o \notin {}^\bullet t$ for all $t \in T$, which are called the initial and final place, resp., and
– each place is in the pre- or post-set of some transition, i.e., $P = \bigcup_{t \in T} ({}^\bullet t \cup t^\bullet)$.

Next, note that incentive alignment w.r.t. full liveness implies the soundness property for safe elementary workflow nets. The main insight is that correlated equilibria can serve as a very special case of strongly fair schedulers, not only for the case of a single player. However, we can even obtain a characterization of soundness in terms of incentive alignment w.r.t. full liveness, which makes precise in which sense we have extended soundness conservatively.
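For small nets, the soundness property (recalled in Appendix A) can be checked by exhaustive exploration of the reachable markings. The sketch below is an illustration under assumptions rather than a definitive implementation: the nets `SOUND` and `UNSOUND` are hypothetical, markings are encoded as frozensets (so safety of the net is assumed, not verified), and the place names `i` and `o` stand for the initial and final place of Definition 13.

```python
def reachable(net, start):
    """All markings reachable from `start` (assumes a safe net, markings as sets)."""
    seen, todo = {start}, [start]
    while todo:
        marking = todo.pop()
        for pre, post in net.values():
            if pre <= marking:  # the transition is enabled
                successor = (marking - pre) | post
                if successor not in seen:
                    seen.add(successor)
                    todo.append(successor)
    return seen

def is_sound(net, i="i", o="o"):
    """Check the three soundness requirements by brute force."""
    initial, final = frozenset({i}), frozenset({o})
    markings = reachable(net, initial)
    # (1) option to complete: the final marking is reachable from every reachable marking
    option_to_complete = all(final in reachable(net, m) for m in markings)
    # (2) proper completion: if the final place is marked, nothing else is
    proper_completion = all(m == final for m in markings if o in m)
    # (3) no dead transitions: every transition is enabled at some reachable marking
    no_dead = all(any(pre <= m for m in markings) for pre, _post in net.values())
    return option_to_complete and proper_completion and no_dead

# Hypothetical examples: a two-step sequence, and a net with a dead transition "c".
SOUND = {
    "a": (frozenset({"i"}), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"o"})),
}
UNSOUND = {
    "a": (frozenset({"i"}), frozenset({"p"})),
    "b": (frozenset({"i"}), frozenset({"o"})),
    "c": (frozenset({"p", "q"}), frozenset({"o"})),
}
```

In the second net, the place `q` is never marked, so `c` is dead and the marking that contains only `p` has no option to complete.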
Theorem 1 (Characterization of soundness).
Let $N$ be an elementary workflow net that is safe and extended free choice; let $(N, \rho\colon T \to \{\Sigma\}, \mathbf{1})$ be the net with transition payoffs and roles where $\Sigma$ is a unique role, $\rho\colon T \to \{\Sigma\}$ is the unique total function, and $\mathbf{1}$ is the constant utility-1 function. The soundness property holds in $N$ if, and only if, incentives in $(N, \rho\colon T \to \{\Sigma\}, \mathbf{1})$ are aligned w.r.t. full liveness.

The proof can be found in Appendix A.

Finally, the reader may wonder why we consider the restarting game. Indeed, for Petri nets that do not have any cycles, one could formalize the idea of incentive alignment using finite extensive form games, for which correlated equilibria have been studied as well [27]. However, this alternative approach is only natural for Petri nets with transition payoffs and roles, but without cycles. In the present paper we have opted for a general approach, which does not impose the rather strong restriction on nets to be acyclic. Also, let us emphasize that the restart games are merely a means to an end. We generate them from arbitrary free-choice safe Petri nets with transition payoffs and roles, which are the main objects of the theoretical result.

We have described a game theoretic perspective on incentive alignment of inter-organizational business processes. It applies to BPMN collaboration models that have annotations for activity-based utilities for all roles. The main theoretical result is that incentive alignment is a conservative extension of the soundness property, which means that we have described a uniform framework that applies the same principles to intra- and inter-organizational business processes.
We have illustrated incentive alignment for the example of the order-to-cash process and an additional example that is tailored to illustrate the game theoretic element of mediators.

The natural next step is the implementation of a tool chain that takes a BPMN collaboration model with annotations and transforms it into a Petri net with transition payoffs and roles, which in turn is analyzed concerning incentive alignment, e.g., using algorithms for solving stochastic games [17]. One might even consider extending PRISM [14] or the model checker Storm [8]. A very challenging avenue for future theoretical work is the extension to the analysis of interleaved execution of several instances of a process.

References
1. Van der Aalst, W.M.P., Van Hee, K.M., ter Hofstede, A.H.M., et al.: Soundness of workflow nets: classification, decidability, and analysis. Formal Asp. Comp. (3), 333–363 (2011)
2. Van der Aalst, W.M.P.: Verification of workflow nets. In: Application and Theory of Petri Nets, ICATPN (1997)
3. Aumann, R.J.: Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics (1), 67–96 (1974)
4. Cachon, G.P., Netessine, S.: Handbook of Quantitative Supply Chain Analysis, chap. Game Theory in Supply Chain Analysis, pp. 13–65. Springer (2004)
5. Cartelli, V., Di Modica, G., Manni, D., Tomarchio, O.: A cost-object model for activity based costing simulation of business processes. In: European Modelling Symposium (2014)
6. Dijkman, R.M., Dumas, M., Ouyang, C.: Semantics and analysis of business process models in BPMN. Information and Software Technology (12), 1281–1294 (2008)
7. Dumas, M., Rosa, M.L., Mendling, J., Reijers, H.A.: Fundamentals of Business Process Management. Springer, 2nd edn. (2018)
8. Hensel, C., Junges, S., Katoen, J., Quatmann, T., Volk, M.: The probabilistic model checker Storm. CoRR abs/2002.07080 (2020), https://arxiv.org/abs/2002.07080
9. Herbert, L., Sharp, R.: Using stochastic model checking to provision complex business services. In: IEEE Intl. Symp. High-Assurance Systems Engineering (2012)
10. Aumann, R.J.: Correlated equilibrium as an expression of Bayesian rationality. Econometrica (1), 1–18 (1987)
11. Jaśkiewicz, A., Nowak, A.S.: Non-Zero-Sum Stochastic Games, pp. 1–64. Springer (2017)
12. Kaplan, R., Atkinson, A.: Advanced Management Accounting. Prentice Hall, 3rd edn. (1998)
13. Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Automated verification of concurrent stochastic games. In: Intl. Conf. Quantitative Evaluation of Systems. pp. 223–239 (2018)
14. Kwiatkowska, M., Parker, D., Wiltsche, C.: PRISM-games 2.0: A tool for multi-objective strategy synthesis for stochastic games. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2016)
15. Leyton-Brown, K., Shoham, Y.: Essentials of game theory: A concise multidisciplinary introduction. Synthesis lectures on AI and ML (1), 1–88 (2008)
16. MacDermed, L., Isbell, C.L.: Solving stochastic games. In: Conf. Neural Information Processing Systems. pp. 1186–1194 (2009)
17. MacDermed, L., Narayan, K.S., Isbell, C.L., Weiss, L.: Quick polytope approximation of all correlated equilibria in stochastic games. In: AAAI Conference (2011)
18. Mendling, J., Weber, I., Van der Aalst, W.M.P., et al.: Blockchains for business process management – challenges and opportunities. ACM Transactions on Management Information Systems (TMIS) (1), 4:1–4:16 (Feb 2018)
19. Morgenstern, O., von Neumann, J.: Theory of games and economic behavior. Princeton University Press (1953)
20. Narayanan, V., Raman, A.: Aligning incentives in supply chains. Harvard Business Review, 94–102, 149 (12 2004)
21. Osborne, M.J., Rubinstein, A.: A course in game theory. MIT Press (1994)
22. Prasad, H., L., P., Bhatnagar, S.: Two-timescale algorithms for learning Nash equilibria in general-sum stochastic games. In: Intl. Conf. Auton. Agents and Multiagent Systems (2015)
23. Rozenberg, G., Engelfriet, J.: Elementary net systems. In: Lectures on Petri Nets I: Basic Models, Advances in Petri Nets, held in Dagstuhl. pp. 12–121 (Sep 1996)
24. Shapley, L.S.: Stochastic games. Proc. Nat. Academy of Sciences (10), 1095–1100 (1953)
25. Solan, E., Vieille, N.: Correlated equilibrium in stochastic games. Games and Economic Behavior (2), 362–399 (2002)
26. Solan, E., Vieille, N.: Stochastic games. Proceedings of the National Academy of Sciences (45), 13743–13746 (2015)
27. von Stengel, B., Forges, F.: Extensive-form correlated equilibrium: Definition and computational complexity. Math. Oper. Res., 1002–1022 (2008)
28. Weber, I., Xu, X., Riveret, R., et al.: Untrusted business process monitoring and execution using blockchain. In: BPM (2016)
29. Weske, M.: Business Process Management – Concepts, Languages, Architectures. Springer, 3rd edn. (2019)
30. Xu, X., Weber, I., Staples, M.: Architecture for Blockchain Applications. Springer (2019)

A The conservative extension theorem
Let us quickly recall the definition of soundness. According to [1], an (elementary) workflow net is "sound if and only if the following three requirements are satisfied: (1) option to complete: for each case it is always still possible to reach the state which just marks place end, (2) proper completion: if place end is marked all other places are empty for a given case, and (3) no dead transitions: it should be possible to execute an arbitrary activity by following the appropriate route" (emphasis is taken from the source), where each case means each reachable marking, state means marking, marked means marked by a reachable marking, activity means transition, and following the appropriate route means after executing the appropriate firing sequence (see also [1, Definition 5.1]).

Lemma 1 (Proper completion from safety and option to complete). If an elementary workflow net is safe, the option to complete implies proper completion.

Proof. Let $m'$ be a reachable marking that covers the final place, i.e., $o \in m'$. The existence of an option to complete implies that there is a firing sequence from the marking $m'$ to the final marking, i.e., a firing sequence of the form
$$m' [t_1\rangle\, m_2 [t_2\rangle \cdots m_n [t_n\rangle\, [o].$$
If this firing sequence is empty, i.e., if $n = 0$, we obtain the desired result, because then we have already started from the final marking, i.e., $m' = [o]$. It remains to show that, for every reachable marking $m'$ that covers $o$, there cannot be any (non-empty) firing sequence
$$m' [t_1\rangle\, m_2 [t_2\rangle \cdots m_n [t_n\rangle\, [o]$$
of length $n \geq 1$. We proceed by induction on $n \geq 1$.

Case $n = 1$. Let $m'$ be a reachable marking such that $o \in m'$ and assume that $m' [t_1\rangle\, [o]$ (to derive a contradiction). Because the net is an elementary net system, the transition $t_1$ consumes a token from the marking $m'$ and produces a token in the final place. However, this would mean that the net is not safe, as the marking $m'$ covers the final place already and $t_1$ cannot consume a token from the final place, by the definition of elementary workflow net. This contradicts the assumption of safety. Hence, there cannot be a firing sequence of length one from the marking $m'$ to the final marking.

Step $n \rightsquigarrow n+1$. The induction hypothesis is that, for every reachable marking $m'$ with $o \in m'$, there cannot be any firing sequence
$$m' [t_1\rangle\, m_2 [t_2\rangle \cdots m_n [t_n\rangle\, [o]$$
of length $n \geq 1$. Let $m''$ be a reachable marking such that $o \in m''$ and assume that there exists a firing sequence
$$m'' [t'_1\rangle\, m'_2 [t'_2\rangle \cdots m'_{n+1} [t'_{n+1}\rangle\, [o]$$
(to derive a contradiction). Note that $m'_2$ still covers the final place (as $o \notin {}^\bullet t'_1$) and is reachable. This contradicts the induction hypothesis, which states that there cannot be any firing sequence of the form
$$m'_2 [t'_2\rangle \cdots m'_{n+1} [t'_{n+1}\rangle\, [o]$$
with $m'_2$ reachable and covering the final place. Thus, we have derived a contradiction.

Finally, by the principle of complete induction, there cannot be any firing sequence
$$m' [t_1\rangle\, m_2 [t_2\rangle \cdots m_n [t_n\rangle\, [o]$$
of length $n$ for $n > 0$. Hence $n = 0$. □

We now provide the proof of Theorem 1. To avoid clutter, we denote the restart game by $\langle N, S, A, q, u \rangle$.

Theorem 1 (Characterization of soundness).
Let $N$ be an elementary workflow net that is safe and extended free choice; let $(N, \rho\colon T \to \{\Sigma\}, \mathbf{1})$ be the net with transition payoffs and roles where $\Sigma$ is a unique role, $\rho\colon T \to \{\Sigma\}$ is the unique total function, and $\mathbf{1}$ is the constant utility-1 function. The soundness property holds in $N$ if, and only if, incentives in $(N, \rho\colon T \to \{\Sigma\}, \mathbf{1})$ are aligned w.r.t. full liveness.

Proof. Let $N$ be an elementary workflow net that is safe and extended free-choice with initial marking $m_0 = [i]$. This implies that the set of transitions is non-empty, because the final place has to be an element of some post-set, by definition.

First, assume that the net $(N, u, \rho)$ with initial marking $[i]$ is incentive aligned w.r.t. full liveness. By definition of the latter, for some specific choice of $\varepsilon > 0$, there exist an autonomous correlation device $\mathcal{D}$ and an eventually positive correlated $\varepsilon$-equilibrium $\sigma$ of the restart game $\langle N, S, A, q, u \rangle$ such that, for every transition $t \in T$, every reachable marking $m'$, and every natural number $\bar{n} \in \mathbb{N}$, there exists a history
$$h = \langle m', x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle \in H_n(\mathcal{D}) \qquad (2)$$
at stage $n > \bar{n}$ with $a_{n-1} = \{t\}$ (using that there are no parallel steps allowed in the restart game, and there is only one role) and $\mathbf{P}_{\mathcal{D}, m', \sigma}(h) > 0$.

We first show that there are no dead transitions. Let $t \in T$ be a transition. We have to show that $t$ is not dead, i.e., that it will eventually be fired. Now, $m_0 = [i]$ is a reachable marking and 1 is a natural number. Hence, by the above property of $\sigma$, there exists a history
$$h = \langle m_0, x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle \in H_n(\mathcal{D}) \qquad (3)$$
at stage $n > 1$ with $a_{n-1} = \{t\}$ and $\mathbf{P}_{\mathcal{D}, m_0, \sigma}(h) > 0$. Let $k < n$ be the largest index such that $s_k = m_0$. Thus, in the Petri net $N$, there is a firing sequence
$$[i] = m_0 = s_k [a_k\rangle \cdots s_{n-1} [a_{n-1}\rangle\, s_n$$
such that $a_{n-1} = \{t\}$; hence, $s_{n-1} [t\rangle$ and $s_{n-1}$ is reachable. As the transition $t$ was arbitrary, incentive alignment w.r.t. full liveness implies that there are no dead transitions.

Concerning the option to complete, let $m'$ be an arbitrary reachable marking. We want to show that there exists a firing sequence
$$m' [t_1\rangle \cdots m_n [t_n\rangle\, [o]$$
from the reachable marking $m'$ to the final marking $[o]$. We can assume w.l.o.g. that $m' \neq [o]$, as otherwise the empty firing sequence suffices. We proceed by choosing a transition $t \in T$ that is enabled at the initial marking, i.e., $[i] [t\rangle$; such a transition must exist since the net $N$ is an elementary net system. Also, 2 is a natural number. Hence, by the above property of $\sigma$, there exists a history
$$h = \langle m', x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle \in H_n(\mathcal{D}) \qquad (4)$$
at stage $n > 2$ with $a_{n-1} = \{t\}$ and $\mathbf{P}_{\mathcal{D}, m', \sigma}(h) > 0$. Hence, $s_{n-1} = m_0 = [i]$, because the net is an elementary workflow net and ${}^\bullet t = [i] = m_0$. Thus, there exists a smallest index $\ell > 1$ such that $s_\ell = m_0 = [i]$. Hence, there is a non-empty firing sequence
$$m' [a_1\rangle \cdots s_{\ell-1} [a_{\ell-1}\rangle\, [o]$$
in the Petri net $N$, by definition of the restart game and because $[o]$ is the only final marking. Thus, we have shown that options to complete exist. Proper completion follows by Lemma 1.

Conversely, assume that the net $N$ satisfies the soundness property. We choose a correlation device $\mathcal{D}$ that gives the same signal (no matter the history). Thus, $M^\Sigma_n = \{\top\}$ where $\top$ is an arbitrary unique signal, and the probability distributions $d_n\langle x_1, \dots, x_{n-1} \rangle$ concentrate all probability mass on the unique signal $x = \langle \top \rangle$.

First, note that if player $\Sigma$ always chooses some transition as action, or, equivalently, never chooses to idle, we obtain a strategy profile for the correlation device, which is an eventually positive $\varepsilon$-equilibrium strategy profile. (These strategies are the best ones existing, or dominant, in game theoretic vernacular.) Here, we use the fact that the restart game allows the player to execute at most one transition at a time. Now, relying on the option to complete, we shall construct a strategy $\sigma$ that only considers the current state; such strategies are called stationary. Thus, let $\sigma$ be the strategy of player $\Sigma$ that chooses, uniformly at random, one of the transitions that are enabled at the current state. More formally, if we put $L_m = \{t \in T \mid m[t\rangle\}$ for any reachable marking $m$, then $\sigma$ is characterized by the equation $\sigma^\Sigma_h(\{t\}) = \frac{1}{|L_{s_h}|}$ if $s_h [t\rangle$, for every observation $h$.

Now, we claim that the strategy $\sigma$ yields a "uniform" witness for incentive alignment w.r.t. full liveness, i.e., that for every $\varepsilon > 0$, the strategy profile $\langle \sigma \rangle$ is an eventually positive correlated $\varepsilon$-equilibrium such that, for all transitions $t \in T$, every reachable marking $m'$, and each natural number $\bar{n} \in \mathbb{N}$, there exists a history
$$h = \langle m', x_1, a_1, \dots, s_{n-1}, x_{n-1}, a_{n-1}, s_n, x_n \rangle \in H_n(\mathcal{D})$$
at stage $n > \bar{n}$ with $t \in a_{n-1}$ and $\mathbf{P}_{\mathcal{D}, m', \sigma}(h) > 0$. Let $\varepsilon > 0$, let $t \in T$ be a transition, let $m'$ be a reachable marking, and let $\bar{n} \in \mathbb{N}$ be a natural number. We first show that there exist
1. a history $h'$ that starts from $m'$ and leads to $[i]$,
2. a history $h''$ that starts from $[i]$, also leads to $[i]$, and is non-empty, and
3. a history $h'''$ that starts from $[i]$ and leads to a marking that enables the transition $t$,
and all these histories are "possible", i.e., $\mathbf{P}_{\mathcal{D}, m', \sigma}(h') > 0$, $\mathbf{P}_{\mathcal{D}, [i], \sigma}(h'') > 0$, and $\mathbf{P}_{\mathcal{D}, [i], \sigma}(h''') > 0$.

Back to the initial marking
By construction of $\sigma$, using the option to complete and Lemma 1, there is a history $h'$ of the form
$$h' = \langle m', x'_1, a'_1, \dots, s'_{k-1}, x'_{k-1}, a'_{k-1}, s'_k, x'_k \rangle \in H_k(\mathcal{D})$$
with $\mathbf{P}_{\mathcal{D}, m', \sigma}(h') > 0$, $s'_k = [i]$, $s'_l \neq [i]$ for all $l < k$, and $k > 1$.

Looping on the initial marking
There is an initial transition $t$, i.e., one such that ${}^\bullet t = [i]$; let $m''$ be the marking reached after firing $t$, i.e., $[i] [t\rangle\, m''$. Now, by the same argument as for $h'$, we obtain a history $h''$ of the form
$$h'' = \langle m'', x''_1, a''_1, \dots, s''_{r-1}, x''_{r-1}, a''_{r-1}, s''_r, x''_r \rangle \in H_r(\mathcal{D})$$
with $\mathbf{P}_{\mathcal{D}, m'', \sigma}(h'') > 0$, $s''_r = [i]$, $s''_j \neq [i]$ for all $j < r$, and $r > 1$.

Reaching the transition
Finally, by construction of $\sigma$, and using the absence of dead transitions, there is a history $h'''$ of the form
$$h''' = \langle m''', x'''_1, a'''_1, \dots, s'''_{n'-1}, x'''_{n'-1}, a'''_{n'-1}, s'''_{n'}, x'''_{n'} \rangle \in H_{n'}(\mathcal{D})$$
with $\mathbf{P}_{\mathcal{D}, m''', \sigma}(h''') > 0$ and $\{t\} = a'''_{n'-1}$.

Putting histories together
Finally, as the strategy $\sigma$ is stationary, i.e., it does not distinguish between histories with the same current state, we can concatenate $h'$, enough copies of $h''$, and the third history $h'''$ to obtain a history $h$ as desired.

As $\sigma$ always chooses some transition to execute next, it is a dominant strategy and thus in particular an $\varepsilon$-equilibrium; moreover, it is eventually positive since it reaches average utility of 1.
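The stationary strategy used in the proof can be illustrated by a small simulation. This is only a sketch under stated assumptions: the net `NET` is a hypothetical sound workflow net, the single player receives utility 1 per fired transition (the constant utility-1 function of Theorem 1), the restart is modeled by resetting to the initial marking as soon as a final marking is reached, and the seed is fixed so that the run is reproducible.

```python
import random

# Hypothetical sound workflow net: transition -> (pre-set, post-set).
NET = {
    "a": (frozenset({"i"}), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"o"})),
    "c": (frozenset({"p"}), frozenset({"o"})),
}

def simulate(net, initial, stages, seed=0):
    """Play the uniform stationary strategy for a number of stages of the restart game."""
    rng = random.Random(seed)
    marking, fired, utility = initial, set(), 0
    for _ in range(stages):
        # The strategy only inspects the current marking (stationarity).
        enabled = sorted(t for t, (pre, _post) in net.items() if pre <= marking)
        t = rng.choice(enabled)  # uniform choice; the player never idles
        pre, post = net[t]
        marking = (marking - pre) | post
        fired.add(t)
        utility += 1  # constant utility-1 function
        if all(not (marking & p) for p, _post in net.values()):
            marking = initial  # restart upon reaching a final marking
    return fired, utility / stages

fired, average = simulate(NET, frozenset({"i"}), stages=200)
```

Because the player never idles, the average utility is 1 by construction; the set `fired` records which transitions occurred during the run. On a net that is not sound, `rng.choice` may fail on an empty list of enabled transitions, which corresponds to a deadlock outside the final marking.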