A Neural Architecture for Designing Truthful and Efficient Auctions
Andrea Tacchetti, DJ Strouse, Marta Garnelo, Thore Graepel, Yoram Bachrach
DeepMind, UK {atacchet, strouse, garnelo, thore, yorambac}@google.com
Abstract
Auctions are protocols to allocate goods to buyers who have preferences over them, and collect payments in return. Economists have invested significant effort in designing auction rules that result in allocations of the goods that are desirable for the group as a whole. However, for settings where participants’ valuations of the items on sale are their private information, the rules of the auction must deter buyers from misreporting their preferences, so as to maximize their own utility, since misreported preferences hinder the ability for the auctioneer to allocate goods to those who want them most. Manual auction design has yielded excellent mechanisms for specific settings, but requires significant effort when tackling new domains. We propose a deep learning based approach to automatically design auctions in a wide variety of domains, shifting the design work from human to machine. We assume that participants’ valuations for the items for sale are independently sampled from an unknown but fixed distribution. Our system receives a data-set consisting of such valuation samples, and outputs an auction rule encoding the desired incentive structure. We focus on producing truthful and efficient auctions that minimize the economic burden on participants. We evaluate the auctions designed by our framework on well-studied domains, such as multi-unit and combinatorial auctions, showing that they outperform known auction designs in terms of the economic burden placed on participants.
Mechanism design is a field in economics that deals with setting incentives and interaction rules among self-interested agents so as to achieve desired objectives for the group as a whole. It is sometimes referred to as “inverse game theory”: in game theory we set the rules of a game, and study the behaviors that emerge, while in mechanism design we have a target behavior we wish to encourage, and we set the rules of the game so that agents acting in their own self-interest will gravitate towards that desired outcome. One prominent problem in mechanism design is engineering auction rules, as auctions account for a large proportion of economic activity, such as the sponsored search auction (the main source of revenue for search engines), e-commerce websites such as eBay, or the fine art market [2, 11, 21].

One possible goal of an auction design is to maximize the revenue to the auctioneer [24]. In many cases, however, we are merely interested in allocating a set of goods in order to maximize the total welfare of participants (and thus minimize revenue to the auctioneer). One example is spectrum auctions [8], in which governments want to allocate the rights (licenses) to transmit signals over specific bands of the electromagnetic spectrum. The government may wish to allocate the scarce transmission rights to the firms who value these the most, with the goal of maximizing job creation, trade, and economic welfare. However, the true valuation a firm has for a spectrum band is only known to the firm, rather than to the government. If the government simply declared they would give a band to the firm who wants it most without extracting payments, then all firms who want the band a non-zero amount would be incentivized to lie and say they value the band arbitrarily highly, and the government could not ensure an optimal allocation.
Thus, the auction design challenge for the government becomes: which prices should it charge in order to get truthful reports regarding firms’ valuations, and optimally allocate the spectrum bands, while still minimizing the economic burden on participants?

We propose a learning approach to auction design. The point of departure from the existing economics literature is that we make the (often reasonable) assumption that bidders’ valuations for the goods up for sale cannot take arbitrary values, but rather are sampled from an unknown, but fixed, probability distribution (e.g. it is very unlikely anyone would pay $500,000 for a burrito). Under these settings we introduce a representation of bidders’ preferences and a network architecture that can be used to learn auction rules that a) incentivize truthful reports from the participants, b) result in the social-welfare-maximizing allocation of the goods in question, and c) place minimal economic burden on the participants (i.e. extract minimal payments). We show that the proposed approach can learn truthful mechanisms under a wide variety of settings, including various “bidding languages” [25] (i.e. the set of outcomes that bidders can have preferences over), arbitrary distributions of valuations, and arbitrary numbers of participants. Moreover, the resulting payment rules generalize across varying numbers of participants.

Auctions are a pillar of economics and the market protocol of choice for a significant portion of world-wide trade. Similar to what Hartford et al. [19] have done for modeling human strategic behavior, here we show that, under reasonable assumptions, designing auctions that shepherd the behavior of rational participants towards desirable outcomes can be cast as a supervised function approximation problem, thus unlocking the application of modern machine learning methods, and in particular deep learning, to this field.

Mechanism design
Mechanism design deals with choosing from a set $K$ of possible alternatives, where we have a set $N = \{1, 2, \ldots, n\}$ of agents who each have preferences regarding the alternatives in $K$, expressed in monetary terms. Auction design relates to the specific case where we have a set $I$ of items, and the alternative set $K$ consists of all the possible ways to allocate the items to the agents. We call a subset of items $B \subseteq I$ a bundle, and let $P(I)$ be the power set of $I$ (that is, the set of all possible bundles). An allocation of the items is a function $k : N \to P(I)$ mapping each agent $i$ to a bundle of items $k(i) \subseteq I$, such that for any $i \neq j$ we have $k(i) \cap k(j) = \emptyset$ (i.e. no item is allocated more than once).

Each agent, with knowledge of their own true preferences, reports what is commonly referred to as a “type”: for each allocation $k \in K$ the agent communicates to the mechanism a valuation for that outcome $\theta_i(k) \in \Theta_i$. In particular, participants may choose to report truthfully and submit $v_i(k)$. The mechanism then selects an allocation according to a choice rule $c : \Theta_1 \times \ldots \times \Theta_n \to K$ and determines the agents’ payments using a payment rule $t_i : \Theta_1 \times \ldots \times \Theta_n \to \mathbb{R}$, where $t_i$ is the payment for the $i$th agent. Note that “payments” can be negative, that is, the auctioneer can also pay participants. After payments are collected, each agent thus derives utility $u_i(k, t_i) = v_i(k) - t_i$ from the interaction.

Allocation efficiency
The main goal of the designs we consider is to choose an efficient outcome: allocate items to those who want them the most, maximizing social welfare. We thus fix the choice function $c$ to $c = \arg\max_{k \in K} \sum_i v_i(k) = k^*$.

Strategic behavior
Selecting the welfare-maximizing allocation is difficult when the mechanism does not have access to the true preferences of each agent, but only to their reported types. This asymmetry in information leads to strategic behavior: rational participants will report whatever preference $\theta_i$ maximizes their utility under the mechanism (post payment). Let $\theta_{-i}$ indicate all reports, truthful or otherwise, from all agents but $i$; then rational agents will report $\theta_i = \arg\max_{\theta \in \Theta_i} u_i(c(\theta_{-i}, \theta), t_i(\theta_{-i}, \theta))$. In general $\theta_i \neq v_i$.

Footnote: One common assumption is that each agent only cares about the items allocated to them, that is $v_i = v(k(i))$.

Footnote: With a slight abuse of notation, we will sometimes drop $v_i$ and $\theta_i$’s explicit dependence on each allocation $k$, and simply denote with $v_i$ and $\theta_i$ the collections of preferences that can be held and expressed by player $i$. That is, given an arbitrary ordering of the choice set $K$, so that $K = [k_1, k_2, \ldots, k_{|K|}]$, we will use $v_i$ and $\theta_i$ to refer to the vectors $v_i = [v_i(k_1), \ldots, v_i(k_{|K|})]$ and $\theta_i = [\theta_i(k_1), \ldots, \theta_i(k_{|K|})]$.

Truthful mechanisms
In the presence of rational agents, and for our choice of allocation function, it is possible to select a payment rule that makes reporting one’s true preferences the dominant strategy. That is, for any agent $i$, and for all possible reports, or misreports, from other players $\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_n$, the best course of action is to tell the truth: $\arg\max_{\theta \in \Theta_i} u_i(k^*(\theta_{-i}, \theta), t_i(\theta_{-i}, \theta)) = v_i$ (where we bypassed the explicit dependence on the choice function $c$, and let $k^*$ depend on $\theta$ directly).

We restrict our attention to mechanisms that are both efficient and truthful.
The only such mechanisms are members of the Groves family, and their payment rule can be written as [15, 14, 13]:

$$t_i(\theta_{-i}, \theta_i) = h(\theta_{-i}) - \sum_{j \neq i} v_j(k^*(\theta_{-i}, \theta_i)), \qquad (1)$$

where $h : \Theta_{-i} \to \mathbb{R}$ may be any function that only depends on the reported types of agents other than $i$, and $k^*$ is the optimal allocation defined previously.

Individual rationality
In the presence of strategic agents, we must ensure that bidders are never worse off participating in the auctions we design than not. We should guarantee that our auctions are individually rational: any agent who truthfully reports their preferences realizes a non-negative utility. That is, regardless of reports from other agents $\theta_{-i}$, we wish to have $u_i(v_i, \theta_{-i}) = u_i(k^*(v_i, \theta_{-i}), t_i(v_i, \theta_{-i})) \geq 0$.

Weak budget balance
Finally, we wish to design mechanisms that do not require a subsidy to operate. That is, we require that the sum of payments collected by the mechanism be non-negative: $\sum_i t_i \geq 0$.

Vickrey-Clarke-Groves auctions
One of the main results of mechanism design is an auction rule that satisfies all the criteria we listed above for any realization of agents’ preferences: the Vickrey-Clarke-Groves (VCG) auction. VCG is both efficient and truthful, and as such it is a member of the Groves family. It is characterized by the choice of function $h(\theta_{-i})$ that completes the payment rule in Eq. 1: $h_{VCG} = \sum_{j \neq i} v_j(k^*(\theta_{-i}))$. In words, $h_{VCG}$ is the collective value realized by all other agents when agent $i$ is removed from the auction. Thus, the completed VCG payment $t_{VCG}$ for agent $i$ (see Eq. 1) is the reduction in the collective value realized by all other agents due to agent $i$’s participation in the auction. A special case of a VCG auction for a single item is the well-known second-price auction (e.g. an eBay auction with no reserve price). VCG is the most widely accepted truthful and efficient mechanism (e.g. it is used for Facebook ad auctions [31]). VCG does not, however, aim to minimize the economic burden on participants, as we do here.
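To make the VCG payment concrete, the following is a minimal sketch that evaluates Eq. 1 with $h = h_{VCG}$ for a small auction of identical units. It assumes, purely for illustration, that each agent reports non-negative, non-increasing marginal values and that the efficient allocation can be found greedily; the names `allocate` and `vcg_payments` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def allocate(marginals: Dict[str, List[float]], units: int) -> Dict[str, float]:
    """Greedy efficient allocation of identical units (illustrative assumption).

    marginals[a][m] is agent a's reported value for their (m+1)-th unit,
    assumed non-negative and non-increasing in m.
    Returns the value each agent realises under the chosen allocation.
    """
    taken = {a: 0 for a in marginals}
    value = {a: 0.0 for a in marginals}
    for _ in range(units):
        open_agents = [a for a in marginals if taken[a] < len(marginals[a])]
        if not open_agents:
            break
        # Give the next unit to whoever values their next unit the most.
        best = max(open_agents, key=lambda a: marginals[a][taken[a]])
        value[best] += marginals[best][taken[best]]
        taken[best] += 1
    return value


def vcg_payments(
    marginals: Dict[str, List[float]], units: int
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (realised values, VCG payments) for every agent."""
    value = allocate(marginals, units)
    payments = {}
    for i in marginals:
        others = {a: m for a, m in marginals.items() if a != i}
        # h_VCG: welfare of the others in the auction where i is absent.
        h_vcg = sum(allocate(others, units).values())
        # Second term of Eq. 1: welfare of the others when i takes part.
        others_with_i = sum(v for a, v in value.items() if a != i)
        payments[i] = h_vcg - others_with_i
    return value, payments


if __name__ == "__main__":
    # Single item: the winner pays the second-highest bid.
    print(vcg_payments({"a": [10], "b": [7], "c": [3]}, units=1))
```

With a single item the sketch reduces to the second-price rule mentioned above: the highest bidder wins and pays the runner-up's bid, while everyone else pays nothing.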
Bidding languages
Since we focus on efficient mechanisms, we must ensure that $k^*$ can be computed quickly, even for relatively large numbers of players (see Allocation Efficiency). We thus restrict the way in which participants may express their preferences. Such representations are called bidding languages [25]. We consider the following three languages.

Multi-unit auctions with decreasing marginal utilities
The first bidding language we examine considers selling multi-unit bundles to participants whose preferences depend only on the size of bundles but not on their component objects. This language is useful for selling multiple identical units of the same kind. We further impose that larger bundles cannot be valued less than smaller ones. In these auctions $k^*$ can be calculated greedily by allocating objects one by one.

Heterogeneous objects with unit demand
The second bidding language we consider is useful when players can take advantage of at most one of the items they receive, for example, vacation packages for a specific week. The valuation for a bundle of items, in this case, is identical to the valuation of the best object in the bundle. The allocation function $k^*$ is found by solving the maximum-weighted bipartite matching between bidders and items, where participants’ preferences are incorporated as weights.

Hierarchical bundles
Finally, we consider a bidding language that is useful to express preferences for a hierarchy of bundles. For example, home builders might bid to be awarded the contract to develop two lots, with up to two new homes within each lot. The spatial nature of the work makes building individual homes in separate lots less cost-effective. Thus participants can express preferences for a hierarchy of bundles: component objects are arranged as the leaves of a binary tree, and valuations can be expressed for leaf-nodes (individual objects), or for any sub-tree. The integer program required to find $k^*$ can be relaxed as a feasible linear program [25].

Footnote: For this reason, [27] summarizes the effect of the VCG payment rule as to “internalize the externality.”

Problem statement and main contributions
Equipped with the definitions of Sec. 2, we can proceed to state our objective:
To design truthful and allocatively-efficient auctions, minimizing the sum of payments collected by the mechanism, while keeping the auction individually-rational and weakly budget balanced.
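The objective can be read as one quantity to minimise plus two constraints to check. The sketch below spells this out for a single auction, given each participant's realised value $v_i(k^*)$ and payment $t_i$. The quadratic penalty is one common way to fold the constraints into a scalar and anticipates the penalised loss of Eq. 3; the function names are hypothetical, and the default multipliers of 100 follow the value reported in the experiments.

```python
from typing import Sequence


def is_weakly_budget_balanced(payments: Sequence[float], tol: float = 1e-9) -> bool:
    """No subsidy needed: the payments collected sum to a non-negative amount."""
    return sum(payments) >= -tol


def is_individually_rational(
    values: Sequence[float], payments: Sequence[float], tol: float = 1e-9
) -> bool:
    """Every participant ends up with non-negative utility v_i(k*) - t_i."""
    return all(v - t >= -tol for v, t in zip(values, payments))


def penalised_objective(
    values: Sequence[float],
    payments: Sequence[float],
    lam_b: float = 100.0,
    lam_r: float = 100.0,
) -> float:
    """Sum of payments plus quadratic penalties for constraint violations.

    Assumption: violations are penalised by their square, so that a deficit
    or a negative utility always increases the objective.
    """
    total = sum(payments)
    deficit = min(total, 0.0)
    ir_violation = sum(min(v - t, 0.0) ** 2 for v, t in zip(values, payments))
    return total + lam_b * deficit ** 2 + lam_r * ir_violation


if __name__ == "__main__":
    values, payments = [10.0, 7.0, 0.0], [3.0, 4.0, 0.0]
    print(is_weakly_budget_balanced(payments),
          is_individually_rational(values, payments),
          penalised_objective(values, payments))
```

When both constraints hold the penalties vanish and the objective is simply the total payment collected, which is the quantity the designs in this paper aim to reduce.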
Minimizing the sum of payments is useful in settings such as the spectrum auction, where the goal is to allocate a scarce resource in an optimal way; the payments $t_i$ are by-products resulting from the need to elicit true reports, so it is desirable to minimize them.

Additionally, we strive to adhere to the following desiderata: we seek a payment rule that is a) “convolutional” over players [22] (i.e. the same function is used to compute the payment owed by each player), b) invariant to the order of other participants for each player (i.e. the payment of player 1 does not change if players 2 and 3 swap bids), and c) robust to changes in the number of participants.

In pursuit of this goal, we propose three main technical contributions. a) We show how the problem of designing truthful and efficient auctions can be cast as supervised function approximation. b) We introduce a novel representation of efficient auctions as a collection of counterfactual smaller auctions. c) We propose a network architecture to learn Groves payment rules based on our representation which supports various bidding languages and an arbitrary (and even varying) number of bidders.

Mechanism design is a relatively mature field. We rely on the framework of the Vickrey-Clarke-Groves mechanism [5] presented in the 1970s, based on earlier work on auctions that Vickrey had conducted in the 1960s [32]. The design of auctions and mechanisms has been done predominantly manually, where a person uses their experience or intuition to come up with interaction or payment rules leading to their desired objective. Economists have designed incentive schemes for various goals, such as minimizing the burden on participants while maintaining efficiency [14, 1] or maximizing revenue [24, 4, 29]. The field of automated mechanism design, where we let a computer design an incentive scheme to meet desired objectives, is relatively new [6, 30, 7, 18, 16].
Early work on automated mechanism design has focused on producing incentive compatible mechanisms, where truthfulness is a Nash equilibrium [30, 7]. In contrast, we aim to achieve truthfulness in the strong sense of a dominant strategy, where agents opt for a truthful report no matter what other agents do.

More recently, economists have given significant attention to efficient mechanisms where truthfulness is a dominant strategy, characterizing the family of Groves mechanisms as the only class of mechanisms which are truthful, efficient, individually rational and weakly budget balanced [15, 26]. They have also provided negative results, showing that it is impossible to guarantee full budget-balance (i.e. $\sum_i t_i = 0$) in fully truthful and efficient mechanisms [14]. Given these results, researchers have manually constructed Redistribution Mechanisms, specific members in the Groves family that maximize budget-balance (i.e. minimize agent payments while requiring no subsidy) in restricted settings. We also use the general family of Groves mechanisms, or more specialized cases of Groves redistribution mechanisms, but rather than manually building incentive schemes to achieve high budget balance in specific settings, we take an automated mechanism design approach, using machine learning to identify good members of the Groves family.

Closest to our work are recent approaches for automated mechanism design through machine learning, and deep learning in particular [10, 12, 23]. These approaches search a family of payment functions for a mechanism with desired properties by defining a loss relating to the desired properties.
While our approach is similar, we propose a more elaborate neural network architecture to capture reasonable auction rule properties, tackle the more demanding domain of combinatorial auctions under various bidding languages [9, 25], and crucially we are able to learn mechanisms that are truthful in the strong sense and support an arbitrary, and even variable, number of bidders.
Here we introduce the details of our main technical contributions: we show how the problem of completing the Groves payment is equivalent to supervised function approximation. We introduce our novel representation of efficient auctions, and we propose a network architecture to learn social-utility-maximizing, truthful auctions.

[Figure 1 (diagram). Input: a tensor indexed by players (except $i$) and number of objects, with channels: Channel 1: all players’ valuations except $i$’s; Channel 2: assignment of the most valuable object; Channel 3: utility of the assignment in Channel 2; Channel 4: assignment of the 2 most valuable objects; Channel 5: utility of the assignment in Channel 4; Channel 6: assignment of the 3 most valuable objects; Channel 7: utility of the assignment in Channel 6. A 2-layer CNN produces distributed representations of each player’s preferences; the same 2-layer MLP is applied to each player’s preference embedding; sum pooling builds invariance to the ordering of other players and robustness to the number of participants; a rectified linear decoder outputs the final representation of a counter-factual auction where player $i$ did not participate.]

Figure 1: Example of auction representation and network architecture (best viewed in color). In this example we represent a multi-unit auction with decreasing marginal utilities with five players and three objects. We construct the network input to compute the Groves payment rule or redistribution for a given player $i$. The input tensor is of size $(n-1) \times |K| \times (2|K|+1) = 4 \times 3 \times 7$ and is constructed as shown in the figure, on the left (darker shades of red indicate higher valuations in the first channel). This representation is processed with a 2-layer CNN that extracts a per-player distributed representation of preferences and a per-player 2-layer MLP (with shared weights across the players). The resulting embeddings are sum-pooled to build invariance to the ordering of players, and robustness to the number of participants, and decoded into a single positive number.

We seek to design efficient and truthful mechanisms that are, at least in expectation, as close to budget balanced as possible. As discussed in Sec. 2, all mechanisms that are truthful and efficient belong to the Groves family (and vice versa), so we restrict our search to this family, and effectively seek to complete the Groves payment rule by selecting a function $h : \Theta_{-i} \to \mathbb{R}$ (see Eq. 1). The aim is then to complete the payment rule $t_i$ of a Groves mechanism so that, in expectation over valuation profiles sampled from $\rho$, we minimize the sum total of payments received by the mechanism. However, minimizing payments without any further constraint will result in mechanisms that require a subsidy to operate. Since this is undesirable, we incorporate a non-deficit constraint. Similarly, we ensure that strategic players are never worse off participating in our mechanism than not, by including an individual rationality constraint for all players.
The resulting “ideal” mechanism design problem we wish to solve is thus:

$$h^* = \arg\min_{h \in \mathcal{H}} \mathbb{E}_{v_i \sim \rho}\left[\sum_{i=1}^{n} t_i\right] \quad \text{s.t.} \quad \sum_{i=1}^{n} t_i \geq 0, \ \text{and} \ v_i(k^*) - t_i \geq 0, \qquad (2)$$

where $t_i$ is as in Eq. 1. As mentioned above, we assume we do not have access to the true distribution $\rho$, so that we cannot solve this minimization analytically. We do, however, assume we have access to a data-set of $L$ realized $n$-player profiles $D = \{(v_1^l, \ldots, v_n^l) \mid l = 1, \ldots, L\}$, sampled i.i.d. from $\rho$. We therefore use Lagrange multipliers $\lambda_b$ and $\lambda_r$ to encode the non-deficit and individual rationality constraints, and minimize the empirical version of our loss:

$$\hat{h} = \arg\min_{h \in \mathcal{H}} \sum_{l=1}^{L} \left[ \sum_{i=1}^{n} t_i^l + \lambda_b \left( \min\left\{ \sum_{i=1}^{n} t_i^l, 0 \right\} \right)^2 + \lambda_r \sum_{i=1}^{n} \left( \min\left\{ v_i^l(k^*) - t_i^l, 0 \right\} \right)^2 \right]. \qquad (3)$$

Selecting a Groves payment rule
Concretely, we introduce two alternatives for learning a Groves payment rule $\hat{h}$: first, we investigate constructing a neural network to implement $\hat{h}$ directly and minimize the empirical loss in Eq. 3, given a data-set of realized valuation profiles.

Learning a VCG redistribution mechanism
Our second approach amounts to learning a VCG redistribution mechanism. In this case, we use a neural network to implement a redistribution function $r(v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n)$, and let $\hat{h}(\cdot) = h_{VCG}(\cdot) - r(\cdot)$. Note that in this case individual rationality can be guaranteed by simply ensuring that $r$ takes non-negative values, since VCG is individually rational and giving payments back can only increase participants’ utilities. The same representation and network architecture are used in both settings.

We select a hypothesis space $\mathcal{H}$ so that, in practice, we can solve the minimization problem in Eq. 3, given access to a data-set $D$ of valuation profiles. To this end, we introduce a novel representation of auctions that supports learning Groves payment rules with Deep Neural Networks. Fig. 1 shows an example of our representation and architecture for an auction with three objects and five participants.

When computing $t_i$, the payment owed by player $i$, the function we wish to learn has access to reports from “other” players, but no knowledge of player $i$’s valuation (see Eq. 1). We construct our representation to highlight the magnitude of each individual bid, and preference profile, relative to the rest of the “other” players’ types. The intuition behind this choice is that by comparing the available bids to each other, a network can construct a sense of how likely it is that these will be surpassed or matched by the unseen preference profile.
This is achieved as follows: first, since we focus on efficient mechanisms, we assume we are given access to an “allocation oracle” (a function that, for any set of valid preference profiles, returns the welfare-maximizing allocation $k^* \in K$, see Sec. 2). Second, we choose to represent each of the $v_{-i}$ as outcomes of $|K|$ counter-factual auctions, each for the most valuable $p$ bundles ($p = 1, \ldots, |K|$), thus providing information about the relative rank of each bundle valuation and preference profile.

We provide evidence that an alternative representation of the same information as a flat vector results in substantially worse auction designs.

Precisely, given a data-set of realized valuation profiles $D$, and an allocation oracle, for each player $i$, we construct a tall “image” with “spatial dimensions” $|K| \times (n-1)$, and $2|K|+1$ “channels”. The first “channel” is a matrix $V_{-i} \in \mathbb{R}^{|K| \times (n-1)}$ with non-negative entries $(m, j)$ representing the utility player $j$ would realize from receiving bundle $m$. Each successive channel $p$ is constructed by considering the outcome of a counter-factual auction where the $n-1$ players bid for the $p$ most valuable bundles. In particular, the second channel contains the allocation matrix $k_1^* \in \{0, 1\}^{|K| \times (n-1)}$ with entries $(m, j) = 1$ if bidder $j$ is allocated bundle $m$, and zero otherwise. The third channel represents the amount of utility realized by each player for this allocation (the element-wise product between the first and second channels). Similarly, the fourth channel contains $k_2^*$, the allocation for two bundles, and the fifth channel contains the element-wise product between channels 1 and 4, and so on until all bundles are considered. We alter this representation slightly in multi-unit auctions with decreasing marginal utilities. In this case the matrix $V_{-i}$ contains, for each player, the marginal utility of adding one item to their bundle and has size $|B| \times (n-1)$, with $B$ the set of available items.
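The construction above can be sketched for the multi-unit case as follows. This is an illustrative reading rather than the authors' implementation: it assumes a greedy allocation oracle over non-increasing marginal utilities, stores the tensor channel-first as nested lists, and uses the hypothetical names `allocation_oracle` and `build_representation`.

```python
from typing import List

Matrix = List[List[float]]


def allocation_oracle(V: Matrix, p: int) -> Matrix:
    """Allocate the p most valuable units among the other players.

    V[m][j] is the marginal utility of player j for their (m+1)-th unit,
    assumed non-increasing in m. Returns a 0/1 matrix of the same shape.
    """
    rows, cols = len(V), len(V[0])
    alloc = [[0] * cols for _ in range(rows)]
    taken = [0] * cols
    for _ in range(min(p, rows * cols)):
        candidates = [j for j in range(cols) if taken[j] < rows]
        # Next unit goes to the player with the highest next marginal utility.
        winner = max(candidates, key=lambda c: V[taken[c]][c])
        alloc[taken[winner]][winner] = 1
        taken[winner] += 1
    return alloc


def build_representation(V: Matrix) -> List[Matrix]:
    """Stack V with the outcomes of |B| counter-factual auctions.

    Channel 1 is V itself; for p = 1..|B| the next two channels hold the
    allocation of the p most valuable units and its element-wise product
    with V, giving 2|B| + 1 channels in total.
    """
    channels = [V]
    for p in range(1, len(V) + 1):
        alloc = allocation_oracle(V, p)
        utility = [[a * v for a, v in zip(a_row, v_row)]
                   for a_row, v_row in zip(alloc, V)]
        channels += [alloc, utility]
    return channels


if __name__ == "__main__":
    # Three units and four "other" players, as in the Fig. 1 example.
    V = [[9, 6, 5, 2],
         [4, 3, 1, 1],
         [2, 1, 0, 0]]
    rep = build_representation(V)
    print(len(rep), len(rep[0]), len(rep[0][0]))
```

For three units and four other players the sketch yields seven channels of size 3 by 4, matching the $4 \times 3 \times 7$ input described in the Figure 1 caption up to the ordering of axes.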
A network architecture to learn Groves payment rules
Given our auction representation, we propose an architecture to learn a Groves payment rule that satisfies the desiderata outlined in Sec. 3 and is: a) “convolutional” over players, b) invariant to the order of other participants for each player, and c) robust to changes in the number of participants.

For each player $i$, we construct the input tensor of size $|K| \times (n-1) \times (2|K|+1)$ described above and pass it through a 2-layer CNN. The first layer uses filters of spatial size $1 \times 1$ so as to construct an embedding of each individual bid (how soon each bundle is allocated, and how much utility it realizes, can be readily extracted from a single “column” in our representation). The second CNN layer has filters of size $|K| \times 1$. The CNN’s output has spatial size $1 \times (n-1)$, and contains an embedding of each of the $n-1$ players’ preferences. We follow our CNN with a 2-layer MLP, which we apply independently to each of the $(n-1)$ player preference embeddings to produce a new embedding for each player. We then sum-pool over the $n-1$ players (which guarantees the desired robustness properties), and apply a linear decoder (with ReLU rectification) to output a single value for either $\hat{h}$ directly, or for a redistribution function $r$.

Footnote: This is referred to as a “redistribution” mechanism because it can be viewed as collecting the VCG payments and then “redistributing” some of them back to participants.

Footnote: Note that this is reasonable given our choice of bidding languages.

Setting considered        Guo et al. [17]   Manisha et al. [23]   G-CNN (ours)   R-CNN (ours)
Arbitrary distribution    NO                YES                   YES            YES
No knowledge of dist.     NO                YES                   YES            YES
Arbitrary

Table 1: Qualitative results. The method we propose here can be applied in more general settings than previously proposed alternatives. Models: G-CNN: learns a Groves payment rule directly using our data representation and network architecture. R-CNN: learns a VCG redistribution payment rule using our data representation and network architecture.

For each combination of number of participants, valuation distribution and bidding language we consider, we construct an “auction simulator” that returns sample auctions (i.e. valuation profiles for all participants, expressed in the appropriate language). We use each simulator to construct separate training and testing data-sets. For each auction, we construct the representation described in Sec. 5.2, and train the auction design network above using Adam SGD [20] with mini-batches. In all experiments we set $\lambda_b = \lambda_r = 100$ (see Eq. 3). After training, we use our held-out test set to report performance. The details of our auction simulators (details on the distributions we consider, how we construct bundles, and how we implement the allocation function for each bidding language) can be found in the Supplementary Material. In all experiments the number of objects for sale was as follows: with decreasing marginal utilities: 15 objects, with heterogeneous objects and unit-demand: 8 objects, and with hierarchical bundles: 8 component objects (resulting in 15 bundles).

Baselines
We consider four baselines when reporting our performance.
1) VCG auctions: the most commonly used Groves mechanism, a truthful, efficient, weakly budget balanced and individually rational auction.
2) Guo and Conitzer [17]: a provably optimal-in-expectation linear VCG redistribution mechanism, which requires $n < |K|$, analytical knowledge of $\rho$, and only handles multi-unit auctions.
3) Manisha et al. [23]: a VCG redistribution learned using an MLP architecture that requires $n < |K|$, does not support hierarchical bundles, and only works with unit-demand valuations.
4) MLP-based architecture: lastly, we compare to a 2-layer, 128-hidden-unit MLP that operates on a flattened version of the same data to empirically support our choice of representation and architecture.
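The contrast between the sum-pooled architecture and the flattened-input MLP of baseline 4 can be illustrated with a toy example. In the sketch below a single shared ReLU layer stands in for the per-player CNN+MLP, and all weights are arbitrary hand-picked numbers; nothing here reproduces the trained networks, and the names `embed` and `pooled_output` are hypothetical.

```python
from typing import List, Sequence


def relu(x: float) -> float:
    return max(0.0, x)


def embed(player: Sequence[float], W: Sequence[Sequence[float]],
          b: Sequence[float]) -> List[float]:
    """Shared per-player map (stand-in for CNN+MLP): ReLU(W @ player + b)."""
    return [relu(sum(w * x for w, x in zip(row, player)) + bias)
            for row, bias in zip(W, b)]


def pooled_output(players: Sequence[Sequence[float]],
                  W: Sequence[Sequence[float]], b: Sequence[float],
                  decoder: Sequence[float], dec_bias: float) -> float:
    """Embed each player with shared weights, sum-pool, decode with ReLU."""
    pooled = [0.0] * len(W)
    for player in players:
        for k, e in enumerate(embed(player, W, b)):
            pooled[k] += e
    return relu(sum(d * s for d, s in zip(decoder, pooled)) + dec_bias)


if __name__ == "__main__":
    W = [[1.0, -0.5, 0.0], [0.0, 1.0, 1.0]]   # arbitrary illustrative weights
    b = [0.0, -1.0]
    decoder, dec_bias = [0.5, 0.25], 0.0
    others = [[9, 4, 2], [6, 3, 1], [5, 1, 0], [2, 1, 0]]

    # Same output whatever the order of the other players.
    print(pooled_output(others, W, b, decoder, dec_bias))
    print(pooled_output(others[::-1], W, b, decoder, dec_bias))

    # Still defined when one player is removed; a flattened input would
    # change length (12 -> 9) and no longer fit a fixed-size MLP.
    print(pooled_output(others[:3], W, b, decoder, dec_bias))
```

Because the only interaction between players is a sum, reordering them leaves the output unchanged (up to floating-point rounding) and any number of players can be processed with the same weights, which is the property the MLP baseline lacks.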
Qualitative comparison with alternative methods
We start with a qualitative comparison with two existing alternative methods to automatically construct VCG redistribution mechanisms (see Sec. 4), and highlight how our method can be applied in more general settings in Tab. 1. A quantitative comparison with these two methods (in the settings in which they can be applied) shows that our method also leads to better performance in practice (see the Supplementary Material). Importantly, while our method does not guarantee that we will find auctions that are weakly budget balanced and individually rational, our quantitative results show that, in practice, we find zero, or next-to-zero, violations of these constraints (see next paragraph).

Footnote: This architecture is effectively a DeepSets network applied to a graph of $n-1$ nodes, and a single global output [33, 3]. The node functions are our CNN+MLP and the aggregator function is a sum.

[Figure 2 (bar charts). For each bidding language (Multi-Unit, Unit Dem., Hier. Bund.): % of VCG budget returned (axis 0 to 100) and % of auctions with deficit (axis 0 to 5). Panel (a): $n = 10$, models G-CNN, G-MLP, R-CNN, R-MLP. Panel (b): train and test on different numbers of participants (test: $n = 10$), models G-CNN, R-CNN.]

Figure 2: Quantitative results. We report performance as the average reduction in payments collected relative to the VCG mechanism (higher is better). Displayed: average performance across auctions sampled from $\rho$, mean and standard deviation across training seeds. For each choice of model and bidding language we also report the fraction of auctions that resulted in a deficit (i.e. the mechanism had to be subsidized) (lower is better). The right panel shows interpolation to a previously unseen number of participants (note that MLP-based models do not support this, so their performance is not reported). Models: G-CNN: learning a Groves payment rule directly using our data representation and network architecture (ours). G-MLP: learning a Groves payment rule directly using an MLP. R-CNN: learning a VCG redistribution mechanism using our network architecture (ours). R-MLP: learning a VCG redistribution mechanism using an MLP.

Quantitative results
We illustrate quantitative results on synthetic auction data-sets in Fig. 2. Our experiments show that auctions learned using our data representation and network architecture result in a significantly smaller economic burden on the participants than VCG and, crucially, that we are able to learn auction rules with zero or next-to-zero violations of the weak budget balance constraint (i.e. mechanisms should operate without a subsidy). We highlight this by comparing our designs with auction rules based on MLP architectures trained on the same data.

Fig. 2 shows results for valuations distributed as ρ = N(N(10. , . ), N(2. , . )) (where N(µ, σ) is a Gaussian distribution with the appropriate parameters, clipped from the left). We leave 2 further examples of distributions to the Supplementary Material. The left panel of Fig. 2 shows results on the three bidding languages we consider and for a fixed number of participants. Learning a Groves payment rule directly with our architecture and data representation (G-CNN) results in a reduction of payments collected relative to VCG, and in zero violations of the no-deficit constraint. Using our representation and architecture to learn a VCG redistribution mechanism (R-CNN) results in a higher percentage of the budget returned, and in next-to-zero violations of the weak budget balance constraint. In this case we are able to compare our architecture with MLP baselines, which result in a relatively larger number of violations (some of which incur egregious deficits relative to the VCG budget, see Supplementary Material). Note how, across the two design choices of learning a Groves payment rule or learning a redistribution mechanism, our CNN-based architecture consistently results in fewer constraint violations.
The right panel shows the case where the auction rule we learn is required to interpolate to a previously unseen number of participants. Again our data representation and network architecture result in a dramatic reduction of the economic burden placed on participants relative to VCG, and, when we learn a redistribution (R-CNN), in zero constraint violations. MLP baselines cannot operate on a variable number of participants so we are unable to show a comparison. In the Supplementary Material we report results from testing the same network on a varying number of participants. Note that since we strive to minimize payments, we find zero, or next to zero, violations of the individual rationality constraints in any of the models (i.e. participants never pay for a bundle more than they think it’s worth).

We investigated a machine-learning based approach to automated mechanism design and introduced the first truly general-purpose data representation, network architecture, and robust problem formulation to learn truthful and efficient auction rules automatically, given access only to a data-set of valuations. We introduced a novel way to represent auctions as a collection of “counter-factual” smaller auctions, and proposed a neural architecture that operates on this representation, to learn truthful and efficient mechanisms with minimal economic burden on the participants. Our methods can be applied to a wide variety of settings including arbitrary distributions, complex bidding languages and a variable number of participants. Our empirical analysis shows how the resulting auctions collect only a small fraction of the VCG budget, and almost never require a subsidy.

Mechanism design is a pillar of economics and social sciences and the domain of choice to study how a central authority can shape the incentives of self-interested individuals in pursuit of group metrics of success (e.g. elicit truthful reports and maximize social welfare).
Nonetheless, very few attempts to apply machine learning ideas in this domain have been made. Here we show that under certain reasonable assumptions, the special case of auction design can be turned into a supervised learning problem and the modern tools of statistical learning and deep networks can be brought to bear. The recent renaissance of Artificial Intelligence points to a future where multiple artificial agents act in a shared environment to maximize individual rewards, realizing the vision of the machina economicus [28]. In this context, it is paramount to investigate how to automatically translate high-level group-wide metrics of success, such as “social welfare maximization” and “truth-telling”, to individual-level incentive structures. This work is a first step in this direction, and builds heavily on the economics literature on the subject. Future efforts will focus on the extension of these ideas beyond auctions to more general decision problems.
References

[1] Lawrence M Ausubel and Paul Milgrom. The lovely but lonely Vickrey auction. Combinatorial Auctions, 17:22–26, 2006.
[2] Patrick Bajari and Ali Hortaçsu. The winner’s curse, reserve prices, and endogenous entry: Empirical insights from eBay auctions. RAND Journal of Economics, pages 329–355, 2003.
[3] Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinícius Flores Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Çaglar Gülçehre, Francis Song, Andrew J. Ballard, Justin Gilmer, George E. Dahl, Ashish Vaswani, Kelsey R. Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matthew Botvinick, Oriol Vinyals, Yujia Li, and Razvan Pascanu. Relational inductive biases, deep learning, and graph networks. CoRR, abs/1806.01261, 2018.
[4] Jeremy Bulow and John Roberts. The simple economics of optimal auctions. Journal of Political Economy, 97(5):1060–1090, 1989.
[5] Edward H Clarke. Multipart pricing of public goods. Public Choice, 11(1):17–33, 1971.
[6] Vincent Conitzer and Tuomas Sandholm. Complexity of mechanism design. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, pages 103–110. Morgan Kaufmann Publishers Inc., 2002.
[7] Vincent Conitzer and Tuomas Sandholm. Self-interested automated mechanism design and implications for optimal combinatorial auctions. In Proceedings of the 5th ACM Conference on Electronic Commerce, pages 132–141. ACM, 2004.
[8] Peter Cramton. The FCC spectrum auctions: An early assessment. Journal of Economics & Management Strategy, 6(3):431–495, 1997.
[9] Sven De Vries and Rakesh V Vohra. Combinatorial auctions: A survey. INFORMS Journal on Computing, 15(3):284–309, 2003.
[10] Paul Dütting, Zhe Feng, Harikrishna Narasimhan, and David C Parkes. Optimal auctions through deep learning. arXiv preprint arXiv:1706.03459, 2017.
[11] Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. American Economic Review, 97(1):242–259, 2007.
[12] Zhe Feng, Harikrishna Narasimhan, and David C Parkes. Deep learning for revenue-optimal auctions with budgets. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 354–362. International Foundation for Autonomous Agents and Multiagent Systems, 2018.
[13] Jerry Green and Jean-Jacques Laffont. Characterization of Satisfactory Mechanisms for the Revelation of Preferences for Public Goods. Econometrica, 45(2):427–438, 1977.
[14] Jerry R. Green and Jean-Jacques Laffont. Incentives in Public Decision Making. North-Holland, Amsterdam, 1979.
[15] Theodore Groves. Incentives in Teams. Econometrica, 41(4):617–631, 1973.
[16] Mingyu Guo and Vincent Conitzer. Computationally feasible automated mechanism design: General approach and case studies. In Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010.
[17] Mingyu Guo and Vincent Conitzer. Optimal-in-expectation redistribution mechanisms. Artificial Intelligence, 174(5-6):363–381, 2010.
[18] Mohammad Taghi Hajiaghayi, Robert Kleinberg, and Tuomas Sandholm. Automated online mechanism design and prophet inequalities. In Twenty-First AAAI Conference on Artificial Intelligence, volume 7, pages 58–65, 2007.
[19] Jason S Hartford, James R Wright, and Kevin Leyton-Brown. Deep learning for predicting human strategic behavior. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 2424–2432. Curran Associates, Inc., 2016.
[20] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2015.
[21] Paul Klemperer. Auctions: theory and practice. Princeton University Press, 2018.
[22] Yann LeCun, Yoshua Bengio, et al. Convolutional networks for images, speech, and time series. In Michael Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 255–258. MIT Press, 1995.
[23] Padala Manisha, CV Jawahar, and Sujit Gujar. Learning optimal redistribution mechanisms through neural networks. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 345–353. International Foundation for Autonomous Agents and Multiagent Systems, 2018.
[24] Roger B Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981.
[25] Noam Nisan. Bidding and allocation in combinatorial auctions. In Proceedings of the Second ACM Conference on Electronic Commerce, pages 1–12, 2000.
[26] Noam Nisan and Amir Ronen. Algorithmic Mechanism Design. Games and Economic Behavior, 35(1-2):166–196, 2001.
[27] David C Parkes. Iterative Combinatorial Auctions: Achieving Economic and Computational Efficiency. PhD thesis, University of Pennsylvania, 2001.
[28] David C. Parkes and Michael P. Wellman. Economic reasoning and artificial intelligence. Science, 349(6245):267–272, 2015.
[29] Alvin E Roth. The economist as engineer: Game theory, experimentation, and computation as tools for design economics. Econometrica, 70(4):1341–1378, 2002.
[30] Tuomas Sandholm. Automated mechanism design: A new application area for search algorithms. In International Conference on Principles and Practice of Constraint Programming, pages 19–36. Springer, 2003.
[31] Hal R Varian and Christopher Harris. The VCG auction in theory and practice. American Economic Review, 104(5):442–45, 2014.
[32] William Vickrey. Counterspeculation, auctions, and competitive sealed tenders. The Journal of Finance, 16(1):8–37, 1961.
[33] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R Salakhutdinov, and Alexander J Smola. Deep sets. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3391–3401. Curran Associates, Inc., 2017.
A Neural Architecture for Designing Truthful and Efficient Auctions
Supplementary Material
SM-1 Example of auction
Here we provide a simple example of how the rules of an auction dictate the behavior of rational participants. Consider an auction with two participants, Alice and Bob, who are interested in buying a single item for sale: a burrito. Alice is hungry, so she would gladly pay up to $12. Bob just had breakfast, and he would not pay more than $6. Consider a mechanism that receives all agents’ bids, allocates the burrito to the highest bidder, and charges the winner the amount they bid (and zero to the other bidder). Let’s assume that Bob decides to report truthfully that he values the burrito at $6. What should Alice do? If she also tells the truth, she will end up paying $12 for her wrap, realizing a net utility of $0. However, Alice can improve her position by reporting anything between $6 . and $11 . . This mechanism is not truthful: rational bidders will lie about their valuations, and consequently the mechanism might allocate items sub-optimally.

Let’s modify the mechanism slightly: our new mechanism allocates the burrito to the highest bidder (like before), but it charges the winner the amount bid by the other participant. It is easy to see that once Bob bids truthfully, for any report by Alice between $6 . and $12 . she will realize the same utility: $6. Moreover, if she decides to bid under $6, she risks losing the burrito to Bob, so she might as well bid her true valuation, $12, to have the best chance of beating Bob’s bid (remember she does not know what Bob’s valuation actually is, nor what he might decide to report).

The second mechanism is a special case of the VCG mechanism for a single item and two bidders.

SM-2 Auction Simulators: Implementation details
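As a warm-up for the simulators, the two-bidder burrito example of SM-1 can be replayed in a few lines. This is only an illustrative sketch: the function names and the tie-breaking rule (a tied bid loses) are assumptions introduced here, and the valuations are the $12 and $6 of the example.

```python
def first_price_utility(valuation, own_bid, other_bid):
    """Utility when the winner pays their own bid (ties lose, by assumption)."""
    if own_bid > other_bid:
        return valuation - own_bid
    return 0


def second_price_utility(valuation, own_bid, other_bid):
    """Utility when the winner pays the other participant's bid."""
    if own_bid > other_bid:
        return valuation - other_bid
    return 0


if __name__ == "__main__":
    alice_value, bob_bid = 12, 6  # Bob reports truthfully
    for bid in (5, 7, 9, 12):
        print(
            bid,
            first_price_utility(alice_value, bid, bob_bid),
            second_price_utility(alice_value, bid, bob_bid),
        )
```

Under the first rule Alice's utility depends on her own report, so shading her bid pays off; under the second rule every winning report yields the same utility, so misreporting brings no gain.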
We evaluate our ideas on synthetic auction data-sets where valuations are sampled from reasonable distributions and are expressed in the appropriate bidding language. Here we give implementation details on the “auction simulators” used, including parameters for the valuation distributions we considered and implementation details of the allocation oracles required for the various bidding languages.
SM-2.1 Details on the distributions

Multi-unit auction with decreasing marginal utilities
We start by sampling participants’ valuations: for each player i, and for each individual object m, we sample v_i(k_m) ∼ ρ, where k_m is a “bundle” containing only object m. When considering multi-unit auctions with decreasing marginal utilities, we let the 10 players bid for 15 objects and sort the per-object valuations independently by player (see Fig. 1 in the main text for an example). We assume objects are ordered absolutely so that all players agree on exactly which object m is.

Heterogeneous objects with unit demand
When dealing with auctions for heterogeneous objects and unit demand, we again sample valuations independently for bidders and for single-object bundles from ρ, like before. We then let bidders have preferences over the 8 objects and use the v_i(k_m) directly, since preferences are only expressed for each individual object.

Hierarchical bundles
When considering hierarchical bundles we let participants have preferences over 8 component objects (for a total of 15 bundles). We again start by sampling from ρ valuations for each player and for the 8 component objects independently, just as we did for the two previous languages. We then construct valuations for bundles as follows: for each bundle B, we let

$$v_i(k_B) = \Big( \sum_{l \in B} v_i(k_l) \Big) \cdot \delta \cdot (1 + \epsilon),$$

where v_i(k_B) is player i’s valuation for receiving bundle B, δ ∈ {0, 1} is sampled from a Bernoulli distribution with p = 0 . and ε ∼ N (0 . , . is drawn from a Gaussian distribution with the appropriate parameters, clipped at from the left. Bundles are thus worth at least the sum of their component objects, and they have a chance of being worth about more.

SM-2.2 Details on the allocation oracles

Multi-unit auction with decreasing marginal utilities
The allocation oracle for this case is implemented greedily. Agents essentially submit preferences in the form of their utility for receiving an additional item. We can then simply loop over the items, and assign each one to the participant with the largest outstanding unmatched utility.
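The greedy oracle described above can be sketched as follows. This is a minimal illustration under stated assumptions: `marginal_utilities[i][k]` (a name introduced here) holds bidder i's utility for their (k+1)-th item, each list is non-increasing, and all utilities are non-negative. A heap is used only to find the largest outstanding marginal utility quickly.

```python
import heapq


def greedy_multi_unit_allocation(marginal_utilities, num_items):
    """Assign identical items one at a time to the highest outstanding bid.

    Returns (items_per_bidder, total_welfare).
    """
    counts = [0] * len(marginal_utilities)
    # Max-heap (via negated values) of each bidder's next marginal utility.
    heap = [(-m[0], i) for i, m in enumerate(marginal_utilities) if m]
    heapq.heapify(heap)
    welfare = 0.0
    for _ in range(num_items):
        if not heap:
            break  # every bidder's demand is exhausted
        neg_utility, bidder = heapq.heappop(heap)
        counts[bidder] += 1
        welfare += -neg_utility
        next_index = counts[bidder]
        if next_index < len(marginal_utilities[bidder]):
            # Expose this bidder's utility for one more item.
            heapq.heappush(
                heap, (-marginal_utilities[bidder][next_index], bidder)
            )
    return counts, welfare


if __name__ == "__main__":
    print(greedy_multi_unit_allocation([[9, 4, 1], [7, 6, 2]], 3))
```

Greedy assignment is welfare-maximizing here only because marginal utilities decrease: the next item a bidder would receive is never worth more than the ones they already hold.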
Heterogeneous objects with unit demand
In this case, objects are assigned so that the sum total of utilities is maximized. This is exactly equivalent to a maximum-weight bipartite matching on a graph with players on the left, objects on the right, and edges connecting all agents to all objects weighted by the utility of the assignment.
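For small instances, the matching formulation above can be checked by exhaustive search. The sketch below deliberately swaps a proper matching algorithm for brute force over assignments, so it is exponential in the number of bidders and meant only to illustrate the objective; the function name and the `values[i][m]` layout are assumptions made for this example.

```python
from itertools import permutations


def unit_demand_allocation(values):
    """Brute-force maximum-weight matching of bidders to distinct objects.

    values[i][m] is bidder i's value for object m. Returns (assignment,
    welfare), where assignment[i] is an object index or None.
    """
    num_bidders = len(values)
    num_objects = len(values[0])
    # Pad with None "slots" so that any bidder may be left unassigned.
    slots = list(range(num_objects)) + [None] * num_bidders
    best_assignment, best_welfare = None, float("-inf")
    for assignment in permutations(slots, num_bidders):
        welfare = sum(
            values[i][obj]
            for i, obj in enumerate(assignment)
            if obj is not None
        )
        if welfare > best_welfare:
            best_assignment, best_welfare = assignment, welfare
    return list(best_assignment), best_welfare


if __name__ == "__main__":
    print(unit_demand_allocation([[5, 1], [4, 3], [2, 6]]))
```

At realistic sizes one would instead use a polynomial-time assignment solver, for example SciPy's `scipy.optimize.linear_sum_assignment` with `maximize=True`; that choice is a suggestion here, not a statement about the implementation used in the experiments.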
Hierarchical bundles
When considering hierarchical bundles, we place component objects at the leaves of a binary tree, and let users express their utility for obtaining either a leaf node, or any sub-tree. In this case we implement the allocation oracle as a linear program that maximizes the sum of realized utilities (by all players), with linear inequality constraints ensuring objects are not over-allocated (see Nisan [4] for further details).
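To illustrate the structure of this allocation problem without an LP solver, the sketch below uses a different technique: a bottom-up dynamic program over the bundle tree. It is not the linear program described above, and it rests on assumptions introduced here: the tree is a complete binary tree stored in heap order (node j has children 2j+1 and 2j+2), `bids[i][j]` is bidder i's value for the bundle at node j, and a bidder who wins several disjoint bundles values them additively.

```python
def hierarchical_allocation(bids):
    """Welfare-maximizing award of tree-structured bundles.

    Returns (welfare, awards), where awards is a list of (node, bidder)
    pairs covering disjoint sub-trees.
    """
    num_bidders = len(bids)
    num_nodes = len(bids[0])

    def solve(node):
        # Option 1: sell the whole bundle at this node to its best bidder.
        winner = max(range(num_bidders), key=lambda i: bids[i][node])
        whole_value, whole_awards = bids[winner][node], [(node, winner)]
        left, right = 2 * node + 1, 2 * node + 2
        if left >= num_nodes:
            return whole_value, whole_awards  # leaf: nothing to split
        # Option 2: split and allocate the two sub-trees independently.
        left_value, left_awards = solve(left)
        right_value, right_awards = solve(right)
        split_value = left_value + right_value
        if whole_value >= split_value:
            return whole_value, whole_awards
        return split_value, left_awards + right_awards

    return solve(0)


if __name__ == "__main__":
    # Two leaves (nodes 1 and 2) under one root bundle (node 0).
    print(hierarchical_allocation([[10, 5, 4], [6, 2, 7]]))
```

Because every bundle is a sub-tree, no object can be over-allocated by construction, which is the role the inequality constraints play in the LP formulation.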
SM-3 Additional results
The primary performance metric for our learned mechanisms is the percent reduction in payments collected (budget) relative to the VCG mechanism. This can also be thought of as the percent of collected VCG payments (budget) returned to bidders. A performance of 100% means that the auction is strongly budget balanced and no net payments are collected by the auctioneer, an ideal but impossible goal to achieve consistently [1]. A performance below 100% means that some amount of payments is retained by the mechanism, and a performance above 100% represents a budget deficit. Thus, a well-performing mechanism will come as close to 100% as possible, without going over, across many auctions.

In the main text (see Fig. 2), we report the average fraction of the VCG budget returned, the percent of auctions with a budget balance violation, and the variance of these two quantities across seeds. Here, we provide a more fine-grained picture by reporting histograms of budgets across auctions. In addition to the two above metrics, these histograms allow us to read off the budget variance across auctions, as well as the magnitude of budget balance violations. To wit: 1) the average percent of the VCG budget returned to bidders is the average of the distributions plotted, 2) the percent of auctions with a budget balance violation is the integral of the distribution above 100%, 3) the variance of the budget across auctions is visible in the width of the distribution, and 4) the spread of the tail above 100% represents the magnitude of budget balance violations.

In Figures SM-1 through SM-5, the three panels correspond to the three bidding languages we consider. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles. The y-axis represents the fraction of auctions (out of , , and for 5 seeds for each model) that resulted in the budget reported on the x-axis. The stacked histograms represent the different models, with the model names listed on the left. In Figures SM-1 through SM-3, the number of bidders is fixed (n = 10) and only the distribution over bidder preferences, ρ, varies. Here, we consider models G-CNN and R-CNN (our network architecture and auction representation to learn a Groves payment rule, or a VCG redistribution, respectively); and G-MLP and R-MLP (Groves and redistribution mechanisms parameterized by an MLP, respectively). In Figures SM-4 and SM-5, the number of bidders varies, with training on n ∈ { , } and testing on n = 10 for Figure SM-4, and training and testing on n ∈ { , , } for Figure SM-5. The MLP models cannot handle variable numbers of bidders, so we only compare G-CNN and R-CNN. As for the distribution over bidder preferences, ρ, U(a, b) represents a uniform distribution between a and b, while N(µ, σ) represents a normal distribution with mean µ and standard deviation σ.

The general trends we see are that: 1) the redistribution methods tend to result in larger budget reductions relative to VCG than do the Groves mechanisms. However, 2) the Groves mechanisms tend to better avoid budget balance violations, and thus the need for subsidies. 3) The redistribution mechanisms also tend to result in a lower variance in budget. Although it is easier to see in Fig. 2 in the main text than in the plots here, 4) the CNN models avoid budget balance violations better than the MLP models.

Figure SM-1: ρ = N ( N (10 . , . , N (2 . , . . Train: n = 10. Test: n = 10. Models shown: G-CNN, G-MLP, R-CNN, R-MLP. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles.

Figure SM-2: ρ = N (10 . , . . Train: n = 10. Test: n = 10. Models shown: G-CNN, G-MLP, R-CNN, R-MLP. We speculate that the higher variance in budgets across models is due to overfitting, which is diminished for the more varied hierarchical distribution of bidder preferences in the previous figure. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles.

SM-4 Comparison with alternative methods
SM-4.1 Comparison with Learning Optimal Redistribution Mechanisms Through Neural Networks, Manisha et al. 2018

Here we present a quantitative comparison with the recent work by Manisha et al. [3] (which we re-implemented). The authors also considered learning nonlinear mechanisms, restricted to the case of heterogeneous objects with unit demand. Their architecture was essentially an MLP mapping from bidder preferences to redistributed payments. We reproduce their model and compare to our set of models in Figure SM-6. Across bidder preference distributions, Manisha et al.’s model (purple) leads to significantly more budget balance violations. Note that unlike other figures in the Supplementary Material, the three panels do not correspond to various bidding languages, but rather to choices of valuation distributions; the bidding language is fixed to the only one considered by Manisha et al.: heterogeneous objects with unit demand.

Figure SM-3: ρ = U (0 . , . . Train: n = 10. Test: n = 10. Models shown: G-CNN, G-MLP, R-CNN, R-MLP. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles.

Figure SM-4: ρ = N ( N (10 . , . , N (2 . , . . Train: n ∈ { , }. Test: n = 10. Models shown: G-CNN, R-CNN. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles.
SM-5 Comparison with Optimal-in-expectation redistribution mechanisms, Guo and Conitzer 2010

Here we present a quantitative comparison with the work by Guo and Conitzer [2]. The authors present a provably optimal-in-expectation linear VCG redistribution mechanism. This method can only be applied when the number of participants is greater than the number of objects; the authors only report quantitative results for a 2-unit auction with either n = 3 or n = 7 participants. Here we report the reduction in VCG budget and the fraction of auctions resulting in violations of the budget balance constraints. These results show that the methods we propose here are competitive with optimal linear redistribution methods in terms of budget reduction, and empirically result in virtually no budget balance violations.

Figure SM-5: ρ = N ( N (10 . , . , N (2 . , . . Train: n ∈ { , , }. Test: n ∈ { , , }. Models shown: G-CNN, R-CNN. The three peaks in the distribution of budgets correspond to the three possible numbers of bidders. Left: multi-unit auction with decreasing marginal utilities. Middle: heterogeneous objects with unit demand. Right: hierarchical bundles.
Figure SM-6: Left: ρ = N ( N (10 . , . , N (2 . , . . Middle: ρ = N (10 . , . , Right: ρ = U (0 . , . . n = 10. Bidding language: heterogeneous objects with unit demand. Models shown: G-CNN, G-MLP, R-CNN, R-MLP, Manisha et al.

         R-CNN (ours)              G-CNN (ours)              Guo et al. [2]
         budget red.   w/deficit   budget red.   w/deficit   budget red.   w/deficit
n = 3    80 ± 1%       0 ± 0%      87 ± 1%       0 ± 1%      76 ± 0%       0 ±
n = 7    84 ± 6%       0 ± 0%      95 ± 0%       0 ± 1%      94 ± 0%       0 ±

Table 1: Quantitative comparison with Guo and Conitzer [2]. Our G-CNN method achieves competitive budget reductions, with no budget balance violations. Our R-CNN (non-linear) redistribution mechanism outperforms the budget reduction of the optimal linear case, at a cost of very few budget balance violations. Reported: average budget reduction and fraction of auctions resulting in a budget balance violation across , auctions. Shown are mean and standard deviation across five model initialization seeds. Bidding language: multi-unit auction with decreasing marginal utilities. ρ = U (0 . , . .

References

[1] Jerry R. Green and Jean-Jacques Laffont. Incentives in Public Decision Making. North-Holland, Amsterdam, 1979.
[2] Mingyu Guo and Vincent Conitzer. Optimal-in-expectation redistribution mechanisms. Artificial Intelligence, 174(5-6):363–381, 2010.
[3] Padala Manisha, CV Jawahar, and Sujit Gujar. Learning optimal redistribution mechanisms through neural networks. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 345–353. International Foundation for Autonomous Agents and Multiagent Systems, 2018.
[4] Noam Nisan. Bidding and allocation in combinatorial auctions. In