Contextual First-Price Auctions with Budgets

Abstract

The internet advertising market is a multi-billion dollar industry, in which advertisers buy thousands of ad placements every day by repeatedly participating in auctions. In recent years, the industry has shifted to first-price auctions as the preferred paradigm for selling advertising slots. Another important and ubiquitous feature of these auctions is the presence of campaign budgets, which specify the maximum amount the advertisers are willing to pay over a specified time period. In this paper, we present a new model to study the equilibrium bidding strategies in first-price auctions for advertisers who satisfy budget constraints on average. Our model dispenses with the common, yet unrealistic assumption that advertisers' values are independent and instead assumes a contextual model in which advertisers determine their values using a common feature vector. We show the existence of a natural value-pacing-based Bayes-Nash equilibrium under very mild assumptions, and study its structural properties. Furthermore, we generalize the existence result to standard auctions and prove a revenue equivalence showing that all standard auctions yield the same revenue even in the presence of budget constraints.


Santiago R. Balseiro, Columbia University

Christian Kroer, Columbia University

Rachitesh Kumar, Columbia University

This version: February 23, 2021


Keywords: first-price auctions, contextual value models, budget constraints, equilibria in auctions, revenue equivalence.

Introduction

In 2019, the revenue from selling internet ads in the US surpassed $129 billion, with double-digit growth expected in future years. This growth is in large part fueled by ad platforms operated by tech giants like Google, Facebook and Twitter, which facilitate the sale of ads by acting as intermediaries between advertisers and publishers. Typically, millions of ad slots are sold every day using auctions in which advertisers bid based on user-specific information (such as geographical location, cookies, and historical activity, among others). The advertisers repeatedly participate in these auctions, with the aim of using their advertising campaign budget to maximize their reach, through a combination of user-specific targeting and bid optimization. The presence of budgets introduces significant challenges in the analysis as it links different auctions together.

With billions of dollars at stake, the auction format plays a crucial role. Recent years have seen a shift by some major ad platforms towards using first-price auctions as the preferred mode of selling display ads, moving away from the earlier standard of using second-price auctions. For example, in 2019, Google, which is one of the industry leaders, announced a shift to the first-price auction format for its AdManager exchange. In 2020, Twitter also made the move to first-price auctions for the sale of mobile app advertising slots. First-price auctions typically lead to more complicated bidding behaviour on the part of the buyers because, unlike second-price auctions, truthful bidding is not an equilibrium in the first-price setting.

This paper attempts to capture the salient features of these display ad auctions, with a focus on the newly adopted first-price auctions. While equilibrium behavior in first-price auctions has been studied for a long time in the literature, very little attention has been given to the effect of budget constraints (which span across several potential auctions) and user-specific information on the equilibrium behaviour of the advertisers. Our paper aims to shed some light on these aspects by introducing and analyzing a framework for first-price auctions that incorporates budgets and context-based valuation.

The value that an advertiser has for an opportunity to show her ad to a particular user depends on the extent to which the characteristics of the user, that can be gleaned from the information provided by the auctioneer, match the targeting criteria for the advertiser’s campaign. This means that, in general, the value of an ad opportunity is not independent across advertisers, because they all receive the same information about the user. We incorporate this availability of user-specific information (that is common to all advertisers) via a contextual valuation model. In our framework, user information ...

Equilibrium Analysis.

The main technical contribution of this paper is to develop a novel topological argument, which is used to establish the existence of a Bayes-Nash equilibrium. In our non-atomic model, there is a continuum of advertiser types, and a strategy for each advertiser type is a function which maps contexts to bids. Directly analyzing this complicated strategy space in the presence of budget constraints turns out to be difficult. We side-step this difficulty by establishing strong duality for the constrained non-convex optimization problem faced by each advertiser type and characterizing the primal optimum in terms of the dual optimum. Using this, we establish the existence of an equilibrium for our model via an infinite-dimensional fixed-point argument in the dual space. Our existence result carefully leverages the structural properties of our model to prove existence of a fixed point in the space of functions of bounded variation. We believe the tools developed in this paper might be useful in other non-atomic games.

In fact, we go beyond mere existence and find a remarkably simple value-pacing-based equilibrium strategy. Our equilibrium strategy builds on the symmetric equilibrium strategy of the standard i.i.d. setting, inheriting its interpretability in the process. It recommends that each advertiser should shade her value by a multiplicative factor to manage her budget, and then bid using the symmetric equilibrium strategy from the standard i.i.d. setting (like she would in the absence of budgets), but assuming that competitors’ values are also paced. This is only a slight modification of multiplicative bid pacing/shading, which is one of the several ways budgets are managed in practice (Balseiro et al., 2017; Conitzer et al., 2018, 2019). Furthermore, we study how the structure of value-pacing-based equilibrium strategies changes with types and budgets. Finally, we show that in some cases, the strategies admit a simple closed-form solution.

Standard Auctions and Revenue Equivalence.

Our framework naturally extends to anonymous auction formats in which the highest bidder wins, such as the second-price auction and all-pay auction (even in the presence of reserve prices). In its full generality, it acts as a powerful black box, taking as input any Bayes-Nash equilibrium for the well-studied standard i.i.d. setting and composing it with value-pacing to output a Bayes-Nash equilibrium for our model. Surprisingly, we show that, for a fixed distribution over advertisers and users, the same multiplicative factors can be used by the advertisers to shade their values in the equilibrium strategies for all standard auctions. This fact allows us to compare revenues across standard auctions in our model and prove the following revenue equivalence result: in the presence of in-expectation budget constraints, the revenue generated in a value-pacing-based equilibrium is the same for all standard auctions. This is in sharp contrast to the case when budget constraints are strict, where revenue equivalence is known not to hold (Che and Gale, 1998). In light of the recent shift from second-price auctions to first-price auctions by many ad platforms, the ability to compare budget management in both first- and second-price auctions is an especially relevant aspect of our framework.

Numerical Experiments.

To test our model, we ran numerical experiments after making appropriate discretizations. The outcomes of these experiments were strikingly close to our theoretical predictions. In particular, despite the discontinuities introduced by discretization, budget violations were found to be vanishingly small, and moreover, the equilibria found were in strong adherence to the structural properties derived theoretically.

In addition to the works already mentioned above, the study of online auctions has received extensive attention in the literature. Here, we discuss existing work that is most closely related to ours and provide references to material which discusses alternative approaches in more detail. In keeping with previous work on auctions, from now on, we will use the terms buyers and items in place of advertisers and users.

Auctions with budget-constrained buyers have been modeled in a variety of ways, most of which focus on second-price auctions. From a technical standpoint, the closest to our work is that of Balseiro et al. (2015), in which the authors consider randomly arriving budget-constrained buyers and study the resulting strategic interactions using the fluid mean field equilibrium technique. They show the existence of an equilibrium for second-price auctions, in which buyers use pacing-based strategies. Their model assumes a finite type space and independence of the value distributions of the buyers, as opposed to our context-based model, which allows for correlation between buyer values. Gummadi et al. (2012) study an MDP-based model of repeated second-price auctions with strategic budget-constrained buyers and arrive at an optimal strategy which relies on pacing. Balseiro et al. (2017) investigate the system equilibria of various budget-management strategies in repeated second-price auctions, including pacing. Conitzer et al. (2018) study equilibrium bidding behaviour in second-price auctions where every buyer uses pacing for budget management. Balseiro and Gur (2019) study dynamics in repeated second-price auctions with budget-constrained buyers, both in the adversarial and stochastic settings. None of the aforementioned existing work addresses strategic bidding in first-price auctions with budget-constrained buyers, where, unlike second-price auctions, bidding truthfully is not a dominant strategy in the absence of budgets.

Conitzer et al. (2019) extend the model of Conitzer et al. (2018) to first-price auctions and draw connections to market equilibria. Borgs et al. (2007) propose a tatonnement-type heuristic for budget management in repeated first-price keyword auctions and prove that it converges efficiently to a market equilibrium. Neither of these works addresses the strategic behavior exhibited by buyers in first-price auctions. This is also the case for a long line of work that models repeated auctions among budget-constrained buyers as an online matching problem with capacity constraints (see Mehta 2013 for a survey).

Another direction of research considers buyers who need to satisfy their budget constraints ex-post (also called strict budget constraints), which is different from our model that only requires these constraints to hold in expectation at the interim stage. Kotowski (2020) investigates equilibrium behaviour in first-price auctions with two bidders who have strict budget constraints and interdependent values. Che and Gale (1998) study standard auctions under strict budget constraints and i.i.d. valuations, with the aim of comparing their revenue and efficiency properties. Interestingly, they show that first-price auctions yield higher revenue than second-price auctions when budget constraints are strict. Pai and Vohra (2014) characterize optimal auctions in a setting with strict budget constraints and i.i.d. valuations. Goel et al. (2015) design auctions for budget-constrained buyers and combinatorial constraints. Inspired by online auctions, we only require on-average budget constraints, which are more appropriate to model repeated auctions and yield simple and interpretable equilibrium strategies, something that can seldom be said for equilibrium behavior under strict budget constraints.

Feature vectors are widely used as contexts in the multi-armed bandit literature (for example, see Langford and Zhang 2007 and Li et al. 2010), and in pricing (Golrezaei et al., 2021; Chen et al., 2021; Lobel et al., 2018). Vector-based valuation models are also connected to low-rank models, which have received attention in previous market design work (see e.g. McMahan et al. 2013; Kroer et al. 2019).

Due to the presence of a continuum of buyer types in our model, the topological arguments we develop bear resemblance to those used in the study of non-atomic games, such as the one addressed in Schmeidler (1973), though continuity is by assumption in Schmeidler (1973), whereas achieving continuity is at the heart of our proof.

We consider the setting in which a seller (i.e., the advertising platform) plans to sell an indivisible item to one of $n$ buyers (i.e., the advertisers) using an auction. We adopt a feature-based valuation model for the buyer. More precisely, the item type is represented using a vector $\alpha$ belonging to the set $A \subset \mathbb{R}^d$, where each component of $\alpha$ can be interpreted as a feature. We also refer to $\alpha$ as the context. Each buyer type is represented using a vector $(w, B)$ belonging to the set $\Theta \subset \mathbb{R}^{d+1}$ of possible buyer types, where the last component $B$ denotes her budget and the first $d$ components $w$ capture the weights she assigns to each of the $d$ features. The value (maximum willingness to pay) that buyer type $(w, B)$ has for item $\alpha$ is given by the inner product $w^\top \alpha$. While we assume a linear relationship between values and the features, our model and results can be extended to accommodate non-linear response functions that are commonly used in practice.

We assume that the context of the item to be auctioned is drawn from some distribution $F$ over the set of possible item types $A$. Furthermore, the type for every buyer is drawn according to some distribution $G$ over the set of possible buyer types $\Theta$, independently of the other buyers and the choice of the item. Note that, by virtue of our context-based valuation model, the values of the $n$ buyers for the item need not be independent. In line with standard models used for Bayesian analysis of auctions, we will assume that both $G$ and $F$ are common knowledge, while maintaining that the realized type vector $(w, B)$ associated with a buyer is her private information. Our model allows budgets to be random and correlated with the buyers’ weight vector. In addition, we will assume that buyers are unaware of the type of their competing buyers. We would like to emphasize that this means that the budgets are private.

The seller allocates the item using a first-price auction with a reserve price, in which the highest bidder wins whenever her bid is above the reserve price and the winning bidder pays her bid. We assume the seller discloses the item type $\alpha$ to the $n$ buyers before bids are solicited from them. As a result, the bid of a buyer on item $\alpha$ can depend on $\alpha$. We use $r : A \to \mathbb{R}$ to specify the publicly known context-dependent reserve prices, where $r(\alpha)$ denotes the reserve price on item type $\alpha$.

The budget of a buyer represents an upper bound on the amount she would like to pay in the auction. We only require that each buyer satisfy her budget constraint in expectation over the item type and competing buyer types. Similar assumptions have been made in the literature (see, e.g., Gummadi et al. 2012; Abhishek and Hosanagar 2013; Balseiro et al. 2015, 2017; Conitzer et al. 2018). The motivation behind this modeling choice is that budget constraints are often enforced on average by advertising platforms. For example, Google Ads allows daily budgets to be exceeded by a factor of two in any given day, but, over the course of a month, the total expenditure never exceeds the daily budget times the days in the month (the Google Ads Help page defines “Average Daily Budget”: https://support.google.com/google-ads/answer/6312?hl=en). In-expectation budget constraints are also motivated by the fact that, in practice, buyers typically participate in a large number of auctions and many buyers use stationary bidding strategies. Thus, by the law of large numbers, our model can be interpreted as collapsing multiple, repeated auctions in which item types are drawn i.i.d. from $F$ into a single one-shot auction with in-expectation constraints.

Notation: We will use $\mathbb{R}_{+}$ and $\mathbb{R}_{\geq 0}$ to denote the sets of strictly positive and non-negative real numbers, respectively. We will use $G_w$ to denote the marginal distribution of $w$ when $(w, B) \sim G$, i.e., $G_w(K) := G(\{(w, B) \in \Theta \mid w \in K\})$ for all Borel sets $K \subset S$. In a similar vein, we will use $\Theta_w$ to denote the set of $w \in \mathbb{R}^d$ such that $(w, B) \in \Theta$ for some $B \in \mathbb{R}$. Unless specified otherwise, $\|\cdot\|$ denotes the Euclidean norm.

Assumptions:

We will assume that there exist $U, B_{\min} > 0$ such that the set of possible buyer types $\Theta$ is a subset of $(0, U)^d \times (B_{\min}, U)$. In a similar vein, we also assume that the set of possible item types $A$ is a subset of the positive orthant $\mathbb{R}^d_{+}$. We will restrict our attention to $d \geq 2$, which is the regime in which our feature-vector-based valuation model yields interesting insights. To completely specify the aforementioned probability spaces, we endow $A$, $\Theta$ and $\Theta_w$ with the Lebesgue sigma-algebra. Moreover, we will assume that the distributions $F$ and $G$ have density functions. Note that the distribution $G$ can be any distribution on $\Theta$, including one with probability zero on some regions of $\Theta$. Thus we can address any buyer distribution, so long as it has a density and is supported on a bounded subset of the strictly positive orthant with a positive lower bound on the possible budgets. Similarly, $F$ can capture a wide variety of item distributions. For ease of exposition, we will also assume that $r(\alpha) > 0$ for all $\alpha \in A$. All of our results hold for general reserve prices $r : A \to \mathbb{R}$, but the general case leads to more complex proofs without much additional insight.

Consider the decision problem faced by a buyer type $(w, B) \in \Theta$ if we fix the bidding strategies of all competing buyers on all possible item types: she wishes to bid on the items in a way that maximizes her expected utility while satisfying her per-auction budget constraint in expectation (where the expectation is taken over items and competing buyers’ types). As is true in the well-studied standard budget-free i.i.d. setting (Krishna, 2009), her optimal strategy depends on the strategies used by the other buyer types. In the standard setting, the symmetric Bayes-Nash equilibrium is an appealing solution concept for the game formed by these interdependent decision problems faced by the buyers. We adopt a similar approach and define the symmetric Bayes-Nash equilibrium for our model. A strategy $\beta^* : \Theta \times A \to \mathbb{R}_{\geq 0}$ (a mapping that specifies what each buyer type should bid on every item) is a Symmetric First Price Equilibrium if, almost surely over all buyer types, using $\beta^*$ is an optimal solution to a buyer type’s decision problem when all other buyer types also use it.

Definition 1.

A strategy $\beta^* : \Theta \times A \to \mathbb{R}_{\geq 0}$ is called a Symmetric First Price Equilibrium (SFPE) if $\beta^*(w, B, \alpha)$ (as a function of $\alpha$) is an optimal solution to the following optimization problem almost surely w.r.t. $(w, B) \sim G$:

$$
\begin{aligned}
\max_{b : A \to \mathbb{R}_{\geq 0}} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \left( w^\top \alpha - b(\alpha) \right) \mathbb{1}\left\{ b(\alpha) \geq \max\left( r(\alpha), \{\beta^*(\theta_i, \alpha)\}_i \right) \right\} \right] \\
\text{s.t.} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ b(\alpha)\, \mathbb{1}\left\{ b(\alpha) \geq \max\left( r(\alpha), \{\beta^*(\theta_i, \alpha)\}_i \right) \right\} \right] \leq B.
\end{aligned}
$$

In the buyer’s optimization problem, the buyer wins whenever her bid $b(\alpha)$ is at least as high as the reserve price $r(\alpha)$ and all competing bids $\beta^*(\theta_i, \alpha)$ for $i = 1, \dots, n-1$. Because of the first-price auction payment rule, each bidder pays her bid whenever she wins.
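The objective and the budget constraint in Definition 1 are both expectations over the item context and the competitors' types, so they can be estimated by simulation. The following is a minimal sketch under stated assumptions, not the paper's experimental setup: the uniform distributions standing in for $F$ and $G_w$, the constant reserve price, the budget value, and the rule "bid half of the value" used by every buyer (which ignores budgets) are all hypothetical choices made for illustration, as is the function name `estimate_utility_and_spend`.

```python
import numpy as np

def estimate_utility_and_spend(w, bid_fn, beta_fn, reserve_fn, n,
                               num_samples=50_000, seed=0):
    """Monte Carlo estimate of the objective and expected payment in Definition 1."""
    rng = np.random.default_rng(seed)
    d = w.size
    # Hypothetical F: contexts uniform on (0, 1)^d.
    alphas = rng.uniform(0.0, 1.0, size=(num_samples, d))
    # Hypothetical G_w: competitors' weight vectors uniform on (0, 1)^d.
    comp_w = rng.uniform(0.0, 1.0, size=(num_samples, n - 1, d))

    values = alphas @ w                                    # w^T alpha for the focal buyer
    bids = bid_fn(values, alphas)                          # b(alpha)
    # Competitors see the same context alpha, so values are correlated across buyers.
    comp_values = np.einsum("skd,sd->sk", comp_w, alphas)
    comp_bids = beta_fn(comp_values)                       # stand-in for beta*(theta_i, alpha)

    threshold = np.maximum(reserve_fn(alphas), comp_bids.max(axis=1))
    wins = bids >= threshold                               # indicator in Definition 1
    utility = float(np.mean((values - bids) * wins))
    spend = float(np.mean(bids * wins))
    return utility, spend

w = np.array([0.8, 0.6])     # focal buyer's weights (illustrative)
budget = 0.2                 # focal buyer's budget B (illustrative)
utility, spend = estimate_utility_and_spend(
    w,
    bid_fn=lambda v, a: 0.5 * v,                  # hypothetical candidate bid function
    beta_fn=lambda v: 0.5 * v,                    # hypothetical competitor strategy
    reserve_fn=lambda a: np.full(a.shape[0], 0.05),
    n=3,
)
within_budget = spend <= budget
```

A candidate bid function is feasible for the buyer's problem only if the estimated expected payment does not exceed $B$; comparing `utility` across feasible candidates gives a rough picture of the best-response problem that the equilibrium definition refers to.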

Remark 1.

For convenience, in the above definition, we are using an infeasible tie-breaking rule which allocates the entire good to every highest bidder. This is inconsequential, and can be replaced by any arbitrary tie-breaking rule, because we will later show (see part (d) of Lemma 1) that ties are a zero-probability event under our value-pacing-based equilibria.

In our solution concept, it is sufficient that advertisers have Bayesian priors over the maximum competing bid $\max_i \{\beta^*(\theta_i, \alpha)\}$ to determine a best response. This is aligned with practice, as many advertising platforms provide bidders with historical bidding landscapes, which advertisers can use to optimize their bidding strategies. Additionally, we require that budgets are satisfied in expectation over the contexts and buyer types. Connecting back to our repeated auctions interpretation, one can assume competitors’ types to be fixed throughout the horizon while contexts are drawn i.i.d. in each auction. In this case, our solution concept would be appropriate if buyers cannot observe the types of competitors and, in turn, employ stationary strategies that do not react to the market dynamics. Such stationary strategies are appealing because they deplete budgets smoothly over time and are simple to implement. Moreover, it has been previously established that stationary policies approximate well the performance of dynamic policies in non-strategic settings when the number of auctions is large and the maximum value of each auction is small relative to the budget (see, e.g., Talluri and Van Ryzin 2006).

When the types of bidders are fixed throughout the horizon, a bidder who employs a dynamic strategy could, in principle, profitably deviate by inferring the competitors’ types and using this information to optimally shade her bids. Implementing such strategies in practice is challenging because many platforms do not disclose the identity of the winner nor the bids of competitors in real time (as we discussed above, they mostly provide historical information that is aggregated over many auctions). Moreover, when the number of bidders is large and each bidder competes with a random subset of bidders, such deviations can be shown to not be profitable using mean-field techniques (see, e.g., Iyer et al. 2014; Balseiro et al. 2015). Therefore, our model can be alternatively interpreted as one in which there is a large population of active buyers and each buyer competes with a random subset of buyers. This assumption is well motivated in the context of internet advertising markets because the number of advertisers actively bidding is typically large and, because of sophisticated targeting technologies, advertisers often participate only in a fraction of all auctions. Having defined the equilibrium notion, we proceed to establish its existence in the next section. We do so by producing a simple value-pacing-based strategy that forms a SFPE.

Existence of Symmetric First Price Equilibrium

In this section, we study the existence of SFPE, and show that this existence is achieved by a compelling solution which is interpretable. We do so in several steps. First, we define a natural parameterized class of value-pacing-based strategies. Then, assuming that the buyer types are using a strategy from this class, we establish strong duality for the optimization problem faced by each buyer type and characterize the primal optimum in terms of the dual optimum. This leads to a substantial simplification of the analysis because it allows us to work in the much simpler dual space. Finally, we establish the existence of a value-pacing-based SFPE by a fixed-point argument over the space of dual multipliers.

In this paper, pacing refers to multiplicatively scaling down a quantity. We use the term value-pacing to differentiate it from bid-pacing/bid-shading, which has previously been studied in the context of auctions (Borgs et al., 2007; Balseiro et al., 2015, 2017; Conitzer et al., 2018, 2019). Consider a function $\mu : \Theta \to \mathbb{R}_{\geq 0}$, which we will refer to as the pacing function. We define the paced weight vector of a buyer with type $(w, B)$ to be $w / (1 + \mu(w, B))$, which is simply the true weight vector $w$ scaled down by the factor $1 / (1 + \mu(w, B))$. Similarly, we define the paced value of a buyer type $(w, B)$ for item $\alpha$ as $w^\top \alpha / (1 + \mu(w, B))$. We will use pacing to ensure that the budget constraints of all buyer types are satisfied, and at the same time, maintain the best response property at equilibrium. The motivation for using pacing as a budget management strategy will become clear in the next section, where we show that the best response of a buyer to other buyers using a value-pacing-based strategy is to also use a value-pacing-based strategy. Before defining the strategy, we set up some preliminaries.

Consider a pacing function $\mu : \Theta \to [0, \omega / B_{\min}]$ and an item $\alpha \in A$. Let $\lambda^\mu_\alpha$ denote the distribution of paced values $w^\top \alpha / (1 + \mu(w, B))$ for item $\alpha$ when $(w, B) \sim G$. Let $H^\mu_\alpha$ denote the distribution of the highest value $Y := \max\{X_1, \dots, X_{n-1}\}$ among $n - 1$ buyers, when each $X_i \sim \lambda^\mu_\alpha$ is drawn independently for $i \in \{1, \dots, n-1\}$. Observe that $H^\mu_\alpha((-\infty, x]) = \lambda^\mu_\alpha((-\infty, x])^{n-1}$ for all $\alpha \in A$ because the random variables are i.i.d.

For a given item $\alpha \in A$, when $x \geq r(\alpha)$, define the following bidding function,

$$
\sigma^\mu_\alpha(x) := x - \int_{r(\alpha)}^{x} \frac{H^\mu_\alpha(s)}{H^\mu_\alpha(x)} \, ds,
$$

where we interpret $\sigma^\mu_\alpha(x)$ to be $0$ if $H^\mu_\alpha(x) = 0$. Moreover, when $x < r(\alpha)$, define $\sigma^\mu_\alpha(x) := x$ (we make this choice to ensure that no value below the reserve price gets mapped to a bid above the reserve price, while maintaining continuity). Note that $\sigma^\mu_\alpha(x) = \mathbb{E}[\max(Y, r(\alpha)) \mid Y < x]$.
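As a sanity check on the bidding function, consider the illustrative special case (an assumption made here, not a case singled out in the text) in which the paced values for a fixed item are uniform on $[0, 1]$, so that $H^\mu_\alpha(s) = s^{n-1}$ on $[0, 1]$. The integral then evaluates to $\sigma^\mu_\alpha(x) = x - (x^n - r^n) / (n x^{n-1})$ for $x \geq r$, which reduces to the familiar $(n-1)x/n$ when $r = 0$. The sketch below compares this closed form with a direct numerical evaluation of the definition; the function names are hypothetical.

```python
import numpy as np

def H_uniform(s, n):
    """CDF of the max of n-1 i.i.d. Uniform[0, 1] values (illustrative assumption)."""
    return np.clip(s, 0.0, 1.0) ** (n - 1)

def sigma_numeric(x, r, n, grid=10_001):
    """sigma(x) = x - int_r^x H(s)/H(x) ds for x >= r, and sigma(x) = x for x < r."""
    if x < r:
        return x
    hx = H_uniform(x, n)
    if hx == 0.0:
        return 0.0  # convention from the text when H(x) = 0
    s = np.linspace(r, x, grid)
    vals = H_uniform(s, n)
    # Trapezoid rule written out by hand to avoid version-specific NumPy helpers.
    integral = np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(s))
    return x - integral / hx

def sigma_closed_form(x, r, n):
    """Closed form of sigma for uniform paced values on [0, 1]."""
    if x < r:
        return x
    if x == 0.0:
        return 0.0
    return x - (x**n - r**n) / (n * x ** (n - 1))

# Three buyers, reserve 0.2, paced value 0.8.
numeric = sigma_numeric(0.8, 0.2, 3)
closed = sigma_closed_form(0.8, 0.2, 3)
```

With three buyers, reserve $0.2$ and paced value $0.8$, the closed form gives $0.8 - 0.504 / 1.92 = 0.5375$, and the numerical integral agrees up to discretization error.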
If $\lambda^\mu_\alpha$ has a density, then $\sigma^\mu_\alpha$ is the same as the symmetric equilibrium strategy for a standard first-price auction without budgets, when the buyer values are drawn i.i.d. from $\lambda^\mu_\alpha$ and the item has a reserve price of $r(\alpha)$ (see, e.g., Section 2.5 of Krishna 2009). Our value-pacing-based strategy uses $\sigma^\mu_\alpha(x)$ as a building block, by composing it with value-pacing:

Definition 2.

The value-pacing-based strategy $\beta^\mu : \Theta \times A \to \mathbb{R}_{\geq 0}$ for pacing function $\mu : \Theta \to \mathbb{R}_{\geq 0}$ is given by

$$
\beta^\mu(w, B, \alpha) := \sigma^\mu_\alpha\left( \frac{w^\top \alpha}{1 + \mu(w, B)} \right) \quad \forall\, (w, B) \in \Theta,\ \alpha \in A.
$$

The bid $\beta^\mu(w, B, \alpha)$ is the amount that a non-budget-constrained buyer with type $(w, B)$ would bid on item $\alpha$ if she acted as if her paced value was her true value (this is captured by the use of the paced value as the argument for $\sigma^\mu_\alpha$), and believed that the rest of the buyers were also acting in this way (this is captured by the use of $\sigma^\mu_\alpha$). Therefore, our strategy has a simple interpretation: bidders pace their values and then bid as in a first-price auction in which competitors’ values are also paced. With this definition in hand, we are ready to state our main existence result.

Theorem 1.

There exists a pacing function $\mu : \Theta \to \mathbb{R}_{\geq 0}$ such that the value-pacing-based strategy $\beta^\mu : \Theta \times A \to \mathbb{R}_{\geq 0}$ is a Symmetric First Price Equilibrium (SFPE).

We provide the proof of this result in the remaining subsections. Before moving forward, we state some useful properties about the quantities defined in this section.
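To make Definition 2 concrete, the following sketch evaluates a value-pacing-based bid from sampled buyer types. It is only an illustration under stated assumptions: the uniform type distribution, the item context, the reserve price, and the constant pacing function $\mu \equiv 0.5$ are hypothetical (in particular, this $\mu$ is not claimed to be the equilibrium pacing function of Theorem 1), and the empirical CDF of sampled paced values stands in for $\lambda^\mu_\alpha$.

```python
import numpy as np

def make_bid_function(paced_values, n, reserve, grid=2001):
    """Build sigma(x) for one item from an empirical sample of paced values."""
    sorted_vals = np.sort(np.asarray(paced_values, dtype=float))

    def H(s):
        # Empirical CDF of a single paced value, raised to the power n-1
        # (distribution of the max of n-1 i.i.d. draws).
        lam = np.searchsorted(sorted_vals, s, side="right") / sorted_vals.size
        return lam ** (n - 1)

    def sigma(x):
        if x < reserve:
            return x
        hx = H(x)
        if hx == 0.0:
            return 0.0
        s = np.linspace(reserve, x, grid)
        vals = H(s)
        integral = np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(s))  # trapezoid rule
        return x - integral / hx

    return sigma

def value_pacing_bid(w, B, alpha, mu, sigma):
    """beta^mu(w, B, alpha) = sigma(w^T alpha / (1 + mu(w, B)))."""
    return sigma(float(w @ alpha) / (1.0 + mu(w, B)))

rng = np.random.default_rng(0)
d, n, num_types = 2, 3, 50_000
weights = rng.uniform(0.0, 1.0, size=(num_types, d))   # hypothetical G_w
alpha = np.array([0.6, 0.4])                            # one item context (illustrative)
mu_const = 0.5                                          # hypothetical pacing multiplier
mu = lambda w, B: mu_const                              # constant pacing function

paced = (weights @ alpha) / (1.0 + mu_const)            # sample from lambda^mu_alpha
sigma = make_bid_function(paced, n=n, reserve=0.1)

w_focal, B_focal = np.array([0.9, 0.7]), 0.8
paced_value = float(w_focal @ alpha) / (1.0 + mu_const)
bid = value_pacing_bid(w_focal, B_focal, alpha, mu, sigma)
```

The resulting bid lies between the reserve price and the buyer's paced value, reflecting the two layers of shading in the strategy: the multiplicative pacing of the value, followed by the usual first-price shading through $\sigma^\mu_\alpha$.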

The following lemma establishes the almost sure continuity of the CDF of $H^\mu_\alpha$, which is used extensively in our analysis.

Lemma 1.

For every $\mu : \Theta \to \mathbb{R}_{\geq 0}$, the following properties hold:

a. $\lambda^\mu_\alpha$ and $H^\mu_\alpha$ have continuous CDFs almost surely w.r.t. $\alpha \sim F$.

b. $\sigma^\mu_\alpha$ is continuous almost surely w.r.t. $\alpha \sim F$.

c. $\sigma^\mu_\alpha$ is non-decreasing. Furthermore, for $x \in [0, \omega]$ and $\alpha \in A$ such that $H^\mu_\alpha$ is continuous, the following statement holds almost surely w.r.t. $Y \sim H^\mu_\alpha$:

$$
\mathbb{1}\{ x \geq r(\alpha),\ x \geq Y \} = \mathbb{1}\{ \sigma^\mu_\alpha(x) \geq r(\alpha),\ \sigma^\mu_\alpha(x) \geq \sigma^\mu_\alpha(Y) \}.
$$

d. Almost surely w.r.t. $\alpha \sim F$, when $x_1, x_2 \sim \lambda^\mu_\alpha$ i.i.d., the probability of $\sigma^\mu_\alpha(x_1) = \sigma^\mu_\alpha(x_2)$ is zero.

... $\alpha \sim F$. This property is crucial because it allows us to leverage the known result establishing the existence of a symmetric equilibrium in the i.i.d. setting under arbitrary tie-breaking rules, which holds only if the distribution of values is atomless. The proof begins by showing that for any $\alpha_1, \alpha_2 \in A$ that are not scalar multiples of each other, if $\lambda^\mu_{\alpha_1}$ and $\lambda^\mu_{\alpha_2}$ have atoms $x_1$ and $x_2$ respectively, i.e., $G\left( (w, B) \mid w^\top \alpha_i / (1 + \mu(w, B)) = x_i \right) > 0$ for $i = 1, 2$, then the probability, under $(w, B) \sim G$, of the intersection of the events $w^\top \alpha_1 / (1 + \mu(w, B)) = x_1$ and $w^\top \alpha_2 / (1 + \mu(w, B)) = x_2$ is zero. Therefore, the positive-probability sets of weight vectors $w$ corresponding to atoms in different directions of $\alpha$ have zero-probability pairwise intersections, which means that there cannot be uncountably many such directions. Part (a) of the lemma follows because $F$ has a density and hence, the probability of the cone formed by the countably many bad directions is zero. Part (b) is a direct consequence of the definition of $\sigma^\mu_\alpha$. Part (c) follows from part (a). Part (c) says that when everyone uses the strategy $\sigma^\mu_\alpha$, with probability 1, a buyer who has paced value $x$ for item $\alpha$ has the highest bid if and only if she has the highest paced value. Finally, part (d) says that ties are a zero-probability event when players use the value-pacing-based strategy. Full proofs of our results are given in the appendices.

We start by considering the optimization problem faced by an individual buyer with type $(w, B)$ when all competing buyers use the value-pacing-based strategy with pacing function $\mu : \Theta \to \mathbb{R}_{\geq 0}$. Denoting by $Q^\mu(w, B)$ the optimal expected utility of such a buyer, we have

$$
\begin{aligned}
Q^\mu(w, B) = \max_{b : A \to \mathbb{R}_{\geq 0}} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \left( w^\top \alpha - b(\alpha) \right) \mathbb{1}\left\{ b(\alpha) \geq \max\left( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i \right) \right\} \right] \\
\text{s.t.} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ b(\alpha)\, \mathbb{1}\left\{ b(\alpha) \geq \max\left( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i \right) \right\} \right] \leq B.
\end{aligned}
$$

Our goal in this section is to show that the value-pacing-based strategy put forward in Definition 2 is a best response when competitors are pacing their bids according to a pacing function $\mu$.

Remark 2.

Compare Q µ ( w, B ) to the deﬁnition of a SFPE (Deﬁnition 1), and observe that, if wewere able to show that there exists µ : Θ → R ≥ such that β µ ( w, B, · ) is an optimal solution to Q µ ( w, B ) almost surely w.r.t. ( w, B ) ∼ G , then β µ would be an SFPE. For µ : Θ → R ≥ and ( w, B ) ∈ Θ, consider the Lagrangian optimization problem of Q µ ( w, B ) inwhich we move the budget constraint to the objective using the Lagrange multiplier t ≥

0. We use t to denote the multiplier of one buyer in isolation to distinguish from µ , which is a function giving amultiplier for every buyer type. Denoting by q µ ( w, B, t ) the dual function, we have that q µ ( w, B, t ) = max b ( · ) E α, { θ i } n − i =1 (cid:2) ( w T α − (1 + t ) b ( α )) { b ( α ) ≥ max ( r ( α ) , { β µ ( θ i , α ) } i ) } (cid:3) + tB = (1 + t ) max b ( · ) E α, { θ i } n − i =1 (cid:20)(cid:18) w T α t − b ( α ) (cid:19) { b ( α ) ≥ max ( r ( α ) , { β µ ( θ i , α ) } i ) } (cid:21) + tB. Q µ ( w, B ) is given by min t ≥ q µ ( w, B, t ).The next lemma states that the optimal solution to the Lagrangian optimization problem is a value-pacing-based strategy. More speciﬁcally, for every pacing function µ : Θ → R ≥ , buyer type ( w, B )and dual multiplier t , the value pacing based strategy σ µα (cid:0) w T α/ (1 + t ) (cid:1) is an optimal solution to theLangrangian relaxation of Q µ ( w, B ) corresponding to multiplier t . Note that, in general, t need notbe equal to µ ( w, B ). Lemma 2.

For pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$, buyer type $(w,B) \in \Theta$ and dual multiplier $t \ge 0$,

$$\sigma^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right) \in \arg\max_{b(\cdot)} \ \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[\left(\frac{w^T\alpha}{1+t} - b(\alpha)\right) \mathbf{1}\left\{b(\alpha) \ge \max\left(r(\alpha), \{\beta^\mu(\theta_i,\alpha)\}_i\right)\right\}\right].$$

In the proof of Lemma 2, we actually show something stronger than the statement of Lemma 2: the value-pacing-based strategy is optimal point-wise for each $\alpha$ and not just in expectation over $\alpha$. This follows from the observation that once we fix an item $\alpha$, we are solving the best-response optimization problem faced by a buyer with value $w^T\alpha/(1+t)$ in the standard i.i.d. setting (Krishna, 2009), with competing buyer values being drawn from $\lambda^\mu_\alpha$ and under the assumption that the competing buyers use the strategy $\sigma^\mu_\alpha$. If $\lambda^\mu_\alpha$ had a strictly positive density, then the optimality of $\sigma^\mu_\alpha\left(w^T\alpha/(1+t)\right)$ would be a direct consequence of the definition of a symmetric BNE in the standard i.i.d. setting. Even though the standard results cannot be used directly because of the potential absence of a density in the situation outlined above, we show that it is possible to adapt the techniques used in the proof of Proposition 2.2 of Krishna (2009) to show Lemma 2.

Using Lemma 2, we can simplify the expression for the dual function $q^\mu(w,B,t)$. 
First, note that because $\sigma^\mu_\alpha$ is non-decreasing, the highest competing bid can be written as

$$\max_{i=1,\dots,n-1} \{\beta^\mu(\theta_i,\alpha)\} = \max_{i=1,\dots,n-1} \left\{\sigma^\mu_\alpha\left(\frac{w_i^T\alpha}{1+\mu(\theta_i)}\right)\right\} = \sigma^\mu_\alpha(Y),$$

where $Y \sim H^\mu_\alpha$ is the maximum of the $n-1$ competing paced values. Because $\sigma^\mu_\alpha\left(w^T\alpha/(1+t)\right)$ is an optimal bidding strategy, we get that

$$q^\mu(w,B,t) = (1+t)\, \mathbb{E}_\alpha \mathbb{E}_{Y \sim H^\mu_\alpha}\left[\left(\frac{w^T\alpha}{1+t} - \sigma^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right)\right) \mathbf{1}\left\{\sigma^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right) \ge \max\left(r(\alpha), \sigma^\mu_\alpha(Y)\right)\right\}\right] + tB$$

$$= (1+t)\, \mathbb{E}_\alpha \mathbb{E}_{Y \sim H^\mu_\alpha}\left[\left(\frac{w^T\alpha}{1+t} - \sigma^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right)\right) \mathbf{1}\left\{\frac{w^T\alpha}{1+t} \ge \max\left(r(\alpha), Y\right)\right\}\right] + tB$$

$$= (1+t)\, \mathbb{E}_\alpha\left[\left(\frac{w^T\alpha}{1+t} - \sigma^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right)\right) H^\mu_\alpha\left(\frac{w^T\alpha}{1+t}\right) \mathbf{1}\left\{\frac{w^T\alpha}{1+t} \ge r(\alpha)\right\}\right] + tB$$

$$= (1+t)\, \mathbb{E}_\alpha\left[\mathbf{1}\left\{\frac{w^T\alpha}{1+t} \ge r(\alpha)\right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds\right] + tB,$$

where the second equation follows from part (c) of Lemma 1, the third from taking expectations with respect to $Y$, and the last from our formula for $\sigma^\mu_\alpha$.

We now present the main result of this subsection, which characterizes the optimal solution of $Q^\mu(w,B)$ in terms of the optimal solution of the dual problem. The idea of using value-pacing-based strategies as candidates for the equilibrium strategy owes its motivation to Theorem 2. It establishes that if all the other buyers are using a value-pacing-based strategy, with some pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$, then a value-pacing-based strategy is a best response for a given buyer $(w,B)$. Theorem 2.

There exists $\Theta' \subset \Theta$ such that $G(\Theta') = 1$ and, for all pacing functions $\mu : \Theta \to \mathbb{R}_{\ge 0}$ and buyer types $(w,B) \in \Theta'$, if $t^*$ is an optimal solution to the dual problem, i.e., if $t^* \in \arg\min_{t \ge 0} q^\mu(w,B,t)$, then $\sigma^\mu_\alpha\left(w^T\alpha/(1+t^*)\right)$ is an optimal solution for the optimization problem $Q^\mu(w,B)$.

In Theorem 2, the pacing parameter $t^*$ used for pacing in the best response can, in general, be different from $\mu(w,B)$. This caveat requires a fixed-point argument to resolve, which will be the subject matter of the next subsection. Remark 3.

Restricting to the measure-one set $\Theta'$ is without loss. Recall that according to Definition 1, a strategy $\beta^*$ constitutes an SFPE if, almost surely over $(w,B) \sim G$, using $\beta^*$ is an optimal solution to the buyer's optimization problem when all other buyer types also use it. As a consequence of this definition, we will show that it suffices to show strong duality for a subset of buyer types $\Theta' \subset \Theta$ such that $G(\Theta') = 1$. Remark 4.

In the absence of reserve prices $r(\alpha)$ for the items, Theorem 2 holds for all $(w,B) \in \Theta$. Reserve prices introduce some discontinuities in the utility and payment terms. The subset $\Theta' \subset \Theta$ captures a collection of buyer types for which these discontinuities are inconsequential, while maintaining $G(\Theta') = 1$.

Observe that $Q^\mu(w,B)$ is not a convex optimization problem, so in order to prove the above theorem, we cannot appeal to the well-known strong duality results established for convex optimization. Instead, we will use Theorem 5.1.5 of Bertsekas et al. (1998), which states that, to prove optimality of $\sigma^\mu_\alpha\left(w^T\alpha/(1+t^*)\right)$ for $Q^\mu(w,B)$, it suffices to show primal feasibility of $\sigma^\mu_\alpha\left(w^T\alpha/(1+t^*)\right)$, dual feasibility of $t^*$, Lagrange optimality of $\sigma^\mu_\alpha\left(w^T\alpha/(1+t^*)\right)$ for multiplier $t^*$, and complementary slackness. Our approach will be to show these required properties by combining the differentiability of the dual function with first-order optimality conditions for one-dimensional optimization problems. The key observation here is that the derivative of the dual function is equal to the difference between the budget of the buyer and her expected expenditure. Therefore, at optimality, the first-order conditions of the dual problem imply feasibility of the value-pacing-based strategy. To prove differentiability we leverage that in our game the distribution of competing bids is absolutely continuous, which is critical for our results to hold.

For $t^* \in \arg\min_{t \ge 0} q^\mu(w,B,t)$, if we apply the first-order optimality conditions for an optimization problem with a differentiable objective function over the domain $[0,\infty)$, we get

$$\frac{\partial q^\mu(w,B,t^*)}{\partial t} \ge 0, \qquad t^* \ge 0, \qquad t^* \cdot \frac{\partial q^\mu(w,B,t^*)}{\partial t} = 0.$$

The first condition can be shown to imply primal feasibility, the second implies dual feasibility, and the third implies complementary slackness. 
Combining this with Lemma 2, which establishes Lagrange optimality, and applying Theorem 5.1.5 of Bertsekas et al. (1998) yields Theorem 2. The complete proof of Theorem 2 can be found in Appendix A.
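To make the first-order conditions concrete, the following minimal sketch works through a toy instance that is not taken from the paper: a single item, zero reserve price, and one competitor whose paced value is assumed to be uniform on $[0,1]$, so that $H(s) = s$ and the first-price equilibrium bid is $\sigma(x) = x/2$. Under these assumptions the dual function is $q(t) = v^2/(2(1+t)) + tB$ with $v = w^T\alpha \le 1$, and its derivative is the budget minus the expected expenditure. All function names are hypothetical.

```python
def expected_expenditure(v, t):
    """Expected payment of a buyer with value v who paces by 1 / (1 + t).

    Toy assumption: H(s) = s on [0, 1] and sigma(x) = x / 2, so the buyer
    wins with probability x and pays x / 2 whenever she wins.
    """
    x = v / (1.0 + t)  # paced value
    return 0.5 * x * x


def dual_derivative(v, budget, t):
    """dq/dt = budget - expected expenditure (for this toy dual function)."""
    return budget - expected_expenditure(v, t)


def solve_dual(v, budget, iterations=200):
    """Minimize q over t >= 0 using the first-order conditions.

    q is convex in t, so either t* = 0 (the budget is not binding) or the
    derivative vanishes at some t* > 0 (the budget is spent exactly).
    """
    if dual_derivative(v, budget, 0.0) >= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while dual_derivative(v, budget, hi) < 0.0:  # bracket the root
        hi *= 2.0
    for _ in range(iterations):  # bisection on the increasing derivative
        mid = 0.5 * (lo + hi)
        if dual_derivative(v, budget, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t_star = solve_dual(v=0.8, budget=0.08)
spend = expected_expenditure(0.8, t_star)
print(t_star, spend)
```

In this toy case the closed form is $t^* = \max(0, v/\sqrt{2B} - 1)$, which equals 1 for $v = 0.8$ and $B = 0.08$; at that multiplier the expected expenditure equals the budget, which is complementary slackness in action.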

In light of Theorem 2, the proof of Theorem 1 (the existence of a value-pacing-based SFPE) boils down to showing that there exists a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$ such that, almost surely w.r.t. $(w,B) \sim G$, $\mu(w,B)$ is an optimal solution to the dual optimization problem $\min_{t \ge 0} q^\mu(w,B,t)$. In other words, given that everybody else acts according to $\mu$, a buyer $(w,B)$ that wishes to minimize the dual function is best off acting according to $\mu$. More specifically, in Theorem 2 we showed that, starting from a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$, if $\mu^*(w,B)$ constitutes an optimal solution to the dual problem $\min_{t \ge 0} q^\mu(w,B,t)$ almost surely w.r.t. $(w,B) \sim G$, then $\sigma^\mu_\alpha\left(w^T\alpha/(1+\mu^*(w,B))\right)$ is an optimal solution for the optimization problem $Q^\mu(w,B)$ almost surely w.r.t. $(w,B) \sim G$. In other words, bidding according to $\sigma^\mu_\alpha$ while pacing according to $\mu^* : \Theta \to \mathbb{R}_{\ge 0}$ is a utility-maximizing strategy for buyer $(w,B) \sim G$ almost surely, given that other buyers bid according to $\sigma^\mu_\alpha$ with paced values obtained from $\mu$. The following theorem establishes the existence of a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$ for which $\mu$ itself fills the role of $\mu^*$ in the previous statement, thereby implying the optimality of $\sigma^\mu_\alpha\left(w^T\alpha/(1+\mu(w,B))\right)$ almost surely w.r.t. $(w,B) \sim G$. Theorem 3.

There exists $\mu : \Theta \to \mathbb{R}_{\ge 0}$ such that $\mu(w,B) \in \arg\min_{t \ge 0} q^\mu(w,B,t)$ almost surely w.r.t. $(w,B) \sim G$.

We prove the above statement using an infinite-dimensional fixed-point argument on the space of pacing functions with a carefully chosen topology. Informally, we need to show that the correspondence that maps a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$ to the set of dual-optimal pacing functions $\mu^* : \Theta \to \mathbb{R}_{\ge 0}$, which satisfy $\mu^*(w,B) \in \arg\min_{t \ge 0} q^\mu(w,B,t)$, has a fixed point. However, unlike finite-dimensional fixed-point arguments, establishing the sufficient conditions of convexity and compactness needed to apply infinite-dimensional fixed-point theorems requires a careful topological argument.

Lemma 8 in the appendix shows that all dual-optimal functions $\mu^* : \Theta \to \mathbb{R}_{\ge 0}$ map to a range that is a subset of $[0, \omega/B_{\min}]$. Therefore, any pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$ that is a fixed point, i.e., satisfies $\mu(w,B) \in \arg\min_{t \ge 0} q^\mu(w,B,t)$ almost surely w.r.t. $(w,B) \sim G$, must also satisfy $\mathrm{range}(\mu) \subset [0, \omega/B_{\min}]$. Hence, it suffices to restrict our attention to pacing functions of the form $\mu : \Theta \to [0, \omega/B_{\min}]$.

Consider the set of all potential pacing functions

$$\mathcal{X} = \left\{\mu \in L^1(\Theta) \mid \mu(w,B) \in [0, \omega/B_{\min}] \ \ \forall\, (w,B) \in \Theta\right\},$$

where $L^1(\Theta)$ is the space of functions $f : \Theta \to \mathbb{R}$ with finite $L^1$ norm w.r.t. the Lebesgue measure. Here, by the $L^1$ norm of $f$ w.r.t. the Lebesgue measure, we mean $\|f\|_{L^1} = \int_\Theta |f(\theta)|\, d\theta$. Our goal is to find a $\mu \in \mathcal{X}$ such that almost surely w.r.t. $(w,B) \sim G$ we have

$$\mu(w,B) \in \arg\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t).$$

Dealing with infinitely many individual optimization problems $\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t)$, one for each $(w,B)$, makes the analysis hard. To remedy this issue, we combine these optimization problems by defining the objective $f : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, for all $\mu, \hat{\mu} \in \mathcal{X}$, as follows:

$$f(\mu, \hat{\mu}) := \mathbb{E}_{(w,B)}\left[q^\mu(w,B,\hat{\mu}(w,B))\right].$$

For a fixed $\mu \in \mathcal{X}$, we then get a single optimization problem $\min_{\hat{\mu} \in \mathcal{X}} f(\mu, \hat{\mu})$ over functions in $\mathcal{X}$, instead of one optimization problem for each of the infinitely many buyer types $(w,B) \in \Theta$. Later, in Lemma 6, we will show that any optimal solution to the combined optimization problem is also an optimal solution to the individual optimization problems almost surely w.r.t. $(w,B) \sim G$. Thus, shifting our attention to the combined optimization problem is without any loss (because sub-optimality on zero-measure sets is tolerable).

With $f$ as above, we proceed to define the correspondence that is used in our fixed-point argument. The optimal solution correspondence $C^* : \mathcal{X} \rightrightarrows \mathcal{X}$ is given by $C^*(\mu) := \arg\min_{\hat{\mu} \in \mathcal{X}} f(\mu, \hat{\mu})$ (which could be empty) for all $\mu \in \mathcal{X}$. In Lemma 6, we will show that the proof of Theorem 3 boils down to showing that $C^*$ has a fixed point, which will be our next step.

Our proof will culminate with an application of the Kakutani-Fan-Glicksberg theorem, on a suitable version of $C^*$, to show the existence of a fixed point. An application of this result (or any other infinite-dimensional fixed-point theorem) requires intricate topological considerations. In particular, we need to endow $\mathcal{X}$ with a topology that satisfies the following conditions:

I. $\mathcal{X}$ is compact, convex and $C^*(\mu)$ is a non-empty subset of $\mathcal{X}$ for all $\mu \in \mathcal{X}$.
II. $C^*$ is a Kakutani map, i.e., it is upper hemicontinuous, and $C^*(\mu)$ is compact and convex for all $\mu \in \mathcal{X}$.

In the case of infinite dimensions, bounded sets in many spaces, such as the $L^p(\Omega)$ spaces, are not compact. In particular, $\mathcal{X}$ is not compact as a subset of $L^p(\Omega)$ for any $1 \le p \le \infty$. One possible way around it would be to consider the weak* topology on $\mathcal{X} \subset L^\infty(\Omega)$, in which bounded sets are compact. This choice runs into trouble because it is difficult to show the upper hemicontinuity of $C^*$ (property II) under the weak convergence notion of the weak* topology. 
Alternatively, one could impose structural properties and restrict to a subset of $\mathcal{X}$, such as the space of Lipschitz functions, in which both compactness and continuity can be established. The issue with this approach is that the correspondence operator may, in general, not preserve these properties, i.e., property I might not hold. For example, even if $\mu$ is Lipschitz, $C^*(\mu)$ might not contain any Lipschitz functions.

We would like to strike a delicate balance between properties I and II by picking a space in which we can establish compactness of $\mathcal{X}$ and upper hemicontinuity of $C^*$, while, at the same time, ensuring that $C^*(\mu)$ contains at least one element from this space. It turns out that the right space for our proof is the space of functions of bounded variation. To motivate this topology on the space of pacing functions, we state some properties of the "smallest" dual-optimal pacing function. For $\mu : \Theta \to [0, \omega/B_{\min}]$, we define $\ell^\mu : \Theta \to [0, \omega/B_{\min}]$ as

$$\ell^\mu(w,B) := \min\left\{ s \in \arg\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t) \right\}$$

for all $(w,B) \in \Theta$. The minimum always exists because $q^\mu(w,B,t)$ is continuous as a function of $t$ (see Corollary 1 in the appendix for a proof) and the feasible set of the dual problem is compact.

We first show that $\ell^\mu$ varies nicely with $w$ and $B$ along individual components: Lemma 3.

For $\mu : \Theta \to [0, \omega/B_{\min}]$, the following statements hold:

1. $\ell^\mu : \Theta \to [0, \omega/B_{\min}]$ is non-decreasing in each component of $w$.
2. $\ell^\mu : \Theta \to [0, \omega/B_{\min}]$ is non-increasing as a function of $B$.

The proof applies results from comparative statics, which characterize the way the optimal solutions behave as a function of the parameters, to the family of optimization problems $\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t)$ parameterized by $(w,B) \in \Theta$.

Now we wish to show bounded variation of $\ell^\mu$. It is a well-known fact that monotonic functions of one variable have finite total variation. Moreover, functions of bounded total variation also form the dual space of the space of continuous functions with the $L^\infty$ norm, which allows us to invoke the Banach-Alaoglu Theorem to establish compactness in the weak* topology. These results for single-variable functions, although not directly applicable to the multivariable setting, act as a guide in choosing the appropriate topology for our setting.

Since pacing functions take as input several variables, we need to look at multivariable generalizations of total variation. To this end, we state one of the standard definitions (there are multiple equivalent ones) of total variation for functions of several variables (see Section 5.1 of Evans and Gariepy 2015) and then follow it up with a lemma which gives a bound on the total variation of the component-wise monotonic function $\ell^\mu$.

Definition 3. For an open subset $\Omega \subset \mathbb{R}^n$, the total variation of a function $u \in L^1(\Omega)$ is given by

$$V(u, \Omega) := \sup\left\{ \int_\Omega u(\omega)\, \operatorname{div} \phi(\omega)\, d\omega \ \middle|\ \phi \in C_c^1(\Omega, \mathbb{R}^n),\ \|\phi\|_\infty \le 1 \right\},$$

where $C_c^1(\Omega, \mathbb{R}^n)$ is the space of continuously differentiable vector functions $\phi$ of compact support contained in $\Omega$ and $\operatorname{div} \phi = \sum_{i=1}^n \frac{\partial \phi_i}{\partial x_i}$ is the divergence of $\phi$. Lemma 4.

For any pacing function $\mu : \Theta \to [0, \omega/B_{\min}]$, the following statements hold:

1. $\ell^\mu \in L^1(\Theta)$.
2. $V(\ell^\mu, \Theta) \le V_0$, where $V_0 := (d+1)\, U^{d+1}\, \omega/B_{\min}$ is a fixed constant.

Motivated by the above lemma, we define the set of pacing functions that will allow us to use our fixed-point argument. Define $\mathcal{X}_0 = \{\mu \in \mathcal{X} \mid V(\mu, \Theta) \le V_0\}$ to be the subset of pacing functions with variation at most $V_0$. Note that $\ell^\mu \in \mathcal{X}_0$. Define $C_0^* : \mathcal{X}_0 \rightrightarrows \mathcal{X}_0$ as $C_0^*(\mu) := \arg\min_{\hat{\mu} \in \mathcal{X}_0} f(\mu, \hat{\mu})$ for all $\mu \in \mathcal{X}_0$. We now state the properties satisfied by $\mathcal{X}_0$ that make it compatible with the Kakutani-Fan-Glicksberg fixed-point theorem. Lemma 5.

The following statements hold:

1. $\mathcal{X}_0$ is non-empty, compact and convex as a subset of $L^1(\Theta)$.
2. $f : \mathcal{X}_0 \times \mathcal{X}_0 \to \mathbb{R}$ is continuous when $\mathcal{X}_0 \times \mathcal{X}_0$ is endowed with the product topology.
3. $C_0^* : \mathcal{X}_0 \rightrightarrows \mathcal{X}_0$ is upper hemicontinuous with non-empty, convex and compact values.

Finally, with the above lemma in place, we can apply the Kakutani-Fan-Glicksberg theorem to establish the existence of a $\mu \in \mathcal{X}_0$ such that $\mu \in C_0^*(\mu)$. The following lemma completes the proof of Theorem 3 by showing that the fixed point is also almost surely optimal for each type. It follows from the fact that for $\mu \in \mathcal{X}_0$ that satisfy $\mu \in C_0^*(\mu)$, we have $\ell^\mu \in C_0^*(\mu)$.

Lemma 6. If $\mu \in C_0^*(\mu) = \arg\min_{\hat{\mu} \in \mathcal{X}_0} f(\mu, \hat{\mu})$, then $\mu(w,B)$ is almost surely optimal for each type, i.e., $\mu(w,B) \in \arg\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t)$ a.s. w.r.t. $(w,B) \sim G$.

As mentioned earlier, Theorem 3, combined with Theorem 2, implies Theorem 1.

In this section, we will show that a pacing function $\mu$ associated with an SFPE (and more generally fixed points of the dual minimization problem) satisfies certain monotonicity and geometric properties related to the space of value vectors.

For the purposes of this section, we will assume that the reserve price for each item is zero, i.e., $r(\alpha) = 0$ for all $\alpha \in A$. Without this assumption, similar results hold, but they become less intuitively appealing and harder to state. Moreover, we will also assume that the support of $G$, denoted by $\delta(G)$, is a convex compact subset of $\mathbb{R}^{d+1}_{+}$. This assumption is here to avoid having to specify conditions on the pacing multipliers of types with probability zero of occurring. Moreover, we consider a pacing function $\mu : \Theta \to [0, \omega/B_{\min}]$ such that $\mu(w,B)$ is the unique optimal solution for the dual minimization problem for each $(w,B)$ in the support of $G$, i.e., $\mu(w,B) = \arg\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t)$ for all $(w,B) \in \delta(G)$. We remark that we are assuming that the best response is unique rather than the equilibrium being unique. The former can be shown to hold under fairly general conditions.

First, in Lemma 3 we showed that the pacing function associated with an SFPE is monotone in the buyer type. In particular, when the best response is unique, this result implies that $\mu(w,B)$ is non-decreasing in each component of the weight vector $w$ and non-increasing in the budget $B$. Intuitively, if the budget decreases, a buyer needs to shade bids more aggressively to meet her constraints. Alternatively, when the weight vector increases, the advertiser's paced values increase, which would result in more auctions won and higher payments. Therefore, to meet her constraints the advertiser would need to respond by shading bids more aggressively. 
Furthermore, when the best response is unique, it can also be shown that $\mu$ is continuous (see Lemma 11).

The next theorem further elucidates the structure imposed on $\mu$ by virtue of it corresponding to the optima of the family of dual optimization problems parameterized by $(w,B)$. In what follows, we will refer to a buyer $(w,B)$ with $\mu(w,B) = 0$ as an unpaced buyer, and call her a paced buyer otherwise. Theorem 4.

Consider a unit vector $\hat{w} \in \mathbb{R}^d_+$ and budget $B > 0$ such that $w/\|w\| = \hat{w}$ for some $(w,B) \in \delta(G)$. Then, the following statements hold:

1. Paced buyers with budget $B$ and weight vectors lying along the same unit vector $\hat{w}$ have identical paced feature vectors in equilibrium. Specifically, if $(w_1,B), (w_2,B) \in \delta(G)$, with $w_1/\|w_1\| = w_2/\|w_2\| = \hat{w}$ and $\mu(w_1,B), \mu(w_2,B) > 0$, then $w_1/(1+\mu(w_1,B)) = w_2/(1+\mu(w_2,B))$.
2. Suppose there exists an unpaced buyer $(w,B) \in \delta(G)$ with $w/\|w\| = \hat{w}$ and $\mu(w,B) = 0$. Let $w_0 = \arg\max\{\|w\| \mid w \in \mathbb{R}^d;\ \mu(w,B) = 0 \text{ and } w/\|w\| = \hat{w}\}$ be the largest unpaced weight vector along the direction $\hat{w}$. Then, all paced weight vectors get paced down to $w_0$, i.e., $w/(1+\mu(w,B)) = w_0$ for all $(w,B) \in \delta(G)$ with $w/\|w\| = \hat{w}$ and $\mu(w,B) > 0$.

In combination with complementary slackness, the first part states that, in equilibrium, buyers who have the same budget, have positive pacing multipliers, and have feature vectors which are scalar multiples of each other, get paced down to the same type at which they exactly spend their budget. In other words, scaling up the feature vector of a budget-constrained buyer, while keeping her budget the same, does not affect the equilibrium outcome. The second part of Theorem 4 addresses the directions along which there is a mixture of paced and unpaced buyers. In this case, there is a critical buyer type who exactly spends her budget when unpaced, and all buyer types which have a larger norm (but the same budget) get paced down to this critical buyer type, i.e., their paced weight vector equals the critical buyer type's weight vector in equilibrium. 
The buyer types which have a smaller norm are unpaced.

The proof of Theorem 4 follows from the fact that the type of a buyer appears in her expenditure solely in the form of her paced weight vector $w/(1+\mu(w,B))$. Moreover, from the complementary slackness conditions of the dual optimal $\mu$ established in Section 3.3, we get that a buyer with type $(w,B)$ such that $\mu(w,B) > 0$ spends her budget exactly. Consequently, scaling $w$ by a scalar $x > 1$ while keeping the budget fixed leaves the paced weight vector unchanged, i.e., $1 + \mu(xw,B) = x\,(1+\mu(w,B))$. This relationship between the multipliers of buyer types whose weight vectors lie in the same direction allows us to prove Theorem 4.

In this section, we move beyond first-price auctions and generalize our results to anonymous standard auctions with reserve prices. An auction $\mathcal{A} = (Q, M)$, with allocation rule $Q : \mathbb{R}^n_{\ge 0} \to [0,1]^n$, payment rule $M : \mathbb{R}^n_{\ge 0} \to \mathbb{R}^n_{\ge 0}$ and reserve price $r$, is called an anonymous standard auction if the following conditions are satisfied: • Highest bidder wins.

When the buyers bid $(b_1, \dots, b_n)$, the allocation received by buyer $i$ is given by $Q_i(b_1, \dots, b_n) = \mathbf{1}(b_i \ge r,\ b_i \ge b_j \ \forall j \in [n])$, for all $i \in [n]$. • Anonymity.

The payments made by a buyer do not depend on the identity of the buyer. More formally, if the buyers bid $(b_1, \dots, b_n)$, then for any permutation $\pi$ of $[n]$ and buyer $i \in [n]$, we have $M_i(b_1, \dots, b_n) = M_{\pi(i)}(b_{\pi(1)}, \dots, b_{\pi(n)})$, i.e., the payment made by the $i$-th buyer before the bids are permuted equals the payment made by the bidder $\pi(i)$ after the bids have been permuted. Remark 5.

As in our definition of SFPE, we are using an infeasible tie-breaking rule which allocates the entire good to every highest bidder. As with SFPE, ties are a zero-probability event under our value-pacing-based equilibria, and our results hold for arbitrary tie-breaking rules.
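The two defining conditions can be illustrated with a minimal sketch for two familiar members of the class, the first-price and second-price auctions with a reserve. The function names are hypothetical, and the allocation mirrors the tie-breaking convention of Remark 5 by marking every highest bidder as a winner. The anonymity check verifies that payments follow bids rather than bidder labels.

```python
def allocation(bids, reserve):
    """Highest bidder wins, provided her bid meets the reserve price."""
    top = max(bids)
    return [1 if (b >= reserve and b == top) else 0 for b in bids]


def first_price_payments(bids, reserve):
    """The winner pays her own bid; everyone else pays nothing."""
    return [b * q for b, q in zip(bids, allocation(bids, reserve))]


def second_price_payments(bids, reserve):
    """The winner pays the larger of the reserve and the best competing bid."""
    alloc = allocation(bids, reserve)
    payments = []
    for i, q in enumerate(alloc):
        others = bids[:i] + bids[i + 1:]
        payments.append(q * max([reserve] + others))
    return payments


def is_anonymous(payment_rule, bids, reserve, perm):
    """Check that relabelling the bidders moves payments along with bids."""
    permuted_bids = [bids[k] for k in perm]
    before = payment_rule(bids, reserve)
    after = payment_rule(permuted_bids, reserve)
    return all(after[pos] == before[k] for pos, k in enumerate(perm))


bids = [0.3, 0.9, 0.5]
print(first_price_payments(bids, reserve=0.4))
print(second_price_payments(bids, reserve=0.4))
print(is_anonymous(second_price_payments, bids, 0.4, perm=[2, 0, 1]))
```

Both payment rules share the same allocation function, which reflects the observation that the reserve price fully determines the allocation rule of a standard auction.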

For consistency of notation, we will modify the above notation slightly to better match the one used in previous sections. Exploiting the anonymity of auction $\mathcal{A}$, we will denote the payment made by a buyer who bids $b$, when the other $n-1$ buyers bid $\{b_i\}_{i=1}^{n-1}$, by $M\left(b, \{b_i\}_{i=1}^{n-1}\right)$, i.e., we use the first argument for the bid of the buyer under consideration and the other arguments for the competitors' bids. Also, as the reserve price completely determines the allocation rule of a standard auction, in the rest of the section, we will omit the allocation rule while discussing anonymous standard auctions.

To avoid delving into the inner workings of the auction, we assume the existence of an oracle that takes as an input an atomless distribution $H$ over $[0, \omega]$ and outputs a bidding strategy $\psi^H : [0, \omega] \to \mathbb{R}$ satisfying the following properties:

• The strategy $\psi^H$ is a symmetric equilibrium for the auction $\mathcal{A}$ when the values are drawn i.i.d. from $H$.
• The strategy $\psi^H(x)$ is non-decreasing in $x$, and $\psi^H(x) \ge r$ if and only if $x \ge r$.
• The payoff for a bidder who has zero value for the object is zero at the symmetric equilibrium.
• The distribution of $\psi^H(x)$, when $x \sim H$, is atomless.

Our results will produce a pacing-based equilibrium bidding strategy for budget-constrained buyers by invoking $\psi^H$ as a black box. Note that, in particular, if we set $\mathcal{A}$ to be the second-price auction with reserve price $r$, then all the assumptions stated above are satisfied with truthful bidding as the symmetric equilibrium.

In our analysis, we allow the seller to condition on the feature vector and choose a different mechanism for each context $\alpha \in A$. Let $\{\mathcal{A}_\alpha = (M_\alpha, r(\alpha))\}_{\alpha \in A}$ be a family of anonymous standard auctions such that $\alpha \mapsto r(\alpha)$ is measurable. 
Moreover, suppose that for any measurable bidding function $\alpha \mapsto b(\alpha)$ and any collection of measurable competing bidding functions $\alpha \mapsto b_i(\alpha)$ for $i \in [n-1]$, the map $\alpha \mapsto M_\alpha\left(b(\alpha), \{b_i(\alpha)\}_{i=1}^{n-1}\right)$ is also measurable. Next, we define the equilibrium notion for the family $\{\mathcal{A}_\alpha\}_{\alpha \in A}$ of anonymous standard auctions. Definition 4.

A strategy $\beta^* : \Theta \times A \to \mathbb{R}$ is called a Symmetric Equilibrium for the family of standard auctions $\{\mathcal{A}_\alpha\}_{\alpha \in A}$ if $\beta^*(w,B,\alpha)$ (as a function of $\alpha$) is an optimal solution to the following optimization problem almost surely w.r.t. $(w,B) \sim G$:

$$\max_{b : A \to \mathbb{R}_{\ge 0}} \ \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[\left(w^T\alpha - M_\alpha\left(b(\alpha), \{\beta^*(w_i,B_i,\alpha)\}_i\right)\right) \mathbf{1}\left\{b(\alpha) \ge \max\left(r(\alpha), \{\beta^*(\theta_i,\alpha)\}_i\right)\right\}\right]$$

$$\text{s.t.} \quad \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[M_\alpha\left(b(\alpha), \{\beta^*(w_i,B_i,\alpha)\}_i\right) \mathbf{1}\left\{b(\alpha) \ge \max\left(r(\alpha), \{\beta^*(\theta_i,\alpha)\}_i\right)\right\}\right] \le B.$$

Observe that the above definition reduces to Definition 1 if we take $\{\mathcal{A}_\alpha\}_{\alpha \in A}$ to be the set of first-price auctions with reserve price $r(\alpha)$. Next, we show that the equilibrium existence and characterization results of the previous sections apply to all standard auctions that satisfy the required assumptions. To do this, we first need to define value-pacing strategies for anonymous standard auctions. These are a natural generalization of the value-pacing-based strategies used for first-price auctions.

Recall that, for a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$ and $\alpha \in A$, $\lambda^\mu_\alpha$ denotes the distribution of paced values for item $\alpha$, and $H^\mu_\alpha$ denotes the distribution of the highest value for $\alpha$ among $n-1$ buyers. We use $\psi^\mu_\alpha$ to denote $\psi^H_\alpha$, the symmetric equilibrium strategy for auction $\mathcal{A}_\alpha$ when values are drawn from $H = \lambda^\mu_\alpha$. For a pacing function $\mu : \Theta \to \mathbb{R}_{\ge 0}$, $(w,B) \in \Theta$ and $\alpha \in A$, define

$$\Psi^\mu(w,B,\alpha) := \psi^\mu_\alpha\left(\frac{w^T\alpha}{1+\mu(w,B)}\right)$$

to be our candidate equilibrium strategy. This strategy is well-defined because, by Lemma 1, $\lambda^\mu_\alpha$ is atomless almost surely w.r.t. $\alpha$. As before, the bid $\Psi^\mu(w,B,\alpha)$ is the amount a non-budget-constrained buyer with type $(w,B)$ would bid on item $\alpha$ if her paced value was her true value, when competitors are pacing their values accordingly. 
In other words, bidders in the proposed equilibrium first pace their values, and then bid according to the symmetric equilibrium of auction $\mathcal{A}_\alpha$ in which competitors' values are also paced.

With the definition of value-pacing-based strategies in place, we can now state the main result of this section. Recall that $C^* : \mathcal{X} \rightrightarrows \mathcal{X}$ is given by $C^*(\mu) := \arg\min_{\hat{\mu} \in \mathcal{X}} f(\mu, \hat{\mu})$ for all $\mu \in \mathcal{X}$, where $f$ is the expected dual function in the case of a first-price auction, as defined in Section 3.4.

Theorem 5 (Revenue and Pacing Equivalence). For any pacing function $\mu \in \mathcal{X}$ such that $\mu \in C^*(\mu)$, i.e., $\mu$ is an equilibrium pacing function for first-price auctions, the value-pacing-based strategy $\Psi^\mu : \Theta \times A \to \mathbb{R}_{\ge 0}$ is a Symmetric Equilibrium for the family of auctions $\{\mathcal{A}_\alpha\}_{\alpha \in A}$. Moreover, the expected payment made by buyer $\theta$ under this equilibrium strategy is equal to the expected payment made by buyer $\theta$ in first-price auctions under the equilibrium strategy $\beta^\mu : \Theta \times A \to \mathbb{R}_{\ge 0}$, i.e.,

$$\mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[M_\alpha\left(\Psi^\mu(\theta,\alpha), \{\Psi^\mu(\theta_i,\alpha)\}_i\right) \mathbf{1}\left\{\Psi^\mu(\theta,\alpha) \ge \max\left(r(\alpha), \{\Psi^\mu(\theta_i,\alpha)\}_i\right)\right\}\right]$$

$$= \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[\beta^\mu(\theta,\alpha)\, \mathbf{1}\left\{\beta^\mu(\theta,\alpha) \ge \max\left(r(\alpha), \{\beta^\mu(\theta_i,\alpha)\}_i\right)\right\}\right].$$

The key lemma in the proof involves showing that the dual of the budget-constrained utility-optimization problem faced by a buyer is identical for all standard auctions, when the other buyers use the equilibrium strategy $\Psi^\mu$ of the standard auction under consideration. To establish this key step, we exploit the separable structure of the Lagrangian optimization problem and apply the known utility equivalence result for standard auctions in the standard i.i.d. setting, once for each item $\alpha \in A$. Then, we establish the analogue of Theorem 2 for standard auctions. Combining this with $\mu \in C^*(\mu)$ yields Theorem 5.

Before ending this section, we state some important implications of Theorem 5. 
If the pacing function $\mu$ allows the buyers to satisfy their budget constraints in some standard auction, then the same pacing function $\mu$ allows the buyers to satisfy their budgets in every other standard auction. In other words, the equilibrium pacing functions are the same for all standard auctions. This means that in order to calculate an equilibrium pacing function $\mu$ that satisfies $\mu \in C^*(\mu)$, it suffices to compute it for any standard auction (in particular, one could consider a second-price auction, for which bidding truthfully is a symmetric equilibrium in the absence of budget constraints). This fact is especially pertinent in view of the recent shift in the auction format used for selling display ads from second-price auctions to first-price auctions, because it states that, in equilibrium, the buyers can use the same pacing function even after the change. Moreover, the same pacing function continues to work even if the family $\{\mathcal{A}_\alpha\}_{\alpha \in A}$ is an arbitrary collection of first-price and second-price auctions (or any other combination of standard auctions), i.e., Theorem 5 states that not only can one pacing function be used to manage budgets in first-price and second-price auctions, but the same pacing function also works in the intermediate transition stages, in which buyers may potentially participate in some mixture of these auctions.

Another important takeaway is that all standard auctions with the same allocation rule yield the same revenue to the seller. While revenue equivalence is known to hold for standard auctions without budget constraints, Che and Gale (1998) showed that when budget constraints are strict, first-price auctions lead to higher revenue than second-price auctions. Surprisingly, Theorem 5 shows that when budget constraints are in expectation (and values are feature-based) we recover revenue equivalence. This result is driven by the invariance of the pacing function over all standard auctions and the classical revenue equivalence result for the unconstrained i.i.d. setting. We remark, however, that the revenue of the seller does depend on the allocation, and the seller could thus maximize her revenue by optimizing over the reserve prices. We leave the question of optimizing the auction design as a future research direction.

In this section, we illustrate our theory by providing a stylized example in which we can determine the equilibrium bidding strategies in closed form, and then conduct some numerical experiments to verify our theoretical results.
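Before turning to the example, the payment equivalence that drives Theorem 5 can be sanity-checked numerically in its unconstrained i.i.d. building block. The sketch below assumes, purely for illustration, a zero reserve price and $n-1$ competitors whose paced values are i.i.d. uniform on $[0,1]$, so that $H(s) = s^{n-1}$. It compares the expected first-price payment $xH(x) - \int_0^x H(s)\,ds$ (the form implied by the closed-form dual function) with the expected second-price payment $\int_0^x s\,dH(s)$ under truthful bidding of paced values. The function names are hypothetical.

```python
def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


def first_price_payment(x, n):
    """Expected payment sigma(x) * H(x) = x * H(x) - int_0^x H(s) ds."""
    def H(s):
        return s ** (n - 1)  # CDF of the highest of n - 1 uniform values
    return x * H(x) - midpoint_integral(H, 0.0, x)


def second_price_payment(x, n):
    """Expected payment E[Y * 1{Y <= x}], Y = highest competing paced value."""
    def density(s):
        return (n - 1) * s ** (n - 2)  # density of H
    return midpoint_integral(lambda s: s * density(s), 0.0, x)


for n in (2, 3, 5):
    fp = first_price_payment(0.7, n)
    sp = second_price_payment(0.7, n)
    print(n, fp, sp)
```

Since the two expected payments coincide for every paced value $x$, a multiplier that balances the budget under one format balances it under the other, which is the mechanism behind the invariance of the pacing function.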

We provide an instructive (albeit stylized) example to illustrate the structural property described in Section 4. For $1 \le a < b$, define the set of buyer types as (see the blue region in Figure 1 for a visualization of this set)

$$\Theta := \left\{ (w,B) \in \mathbb{R}^2_{\ge 0} \times \mathbb{R}_+ \ \middle|\ a \le \|w\| \le b,\ B = \frac{2\|w\| - w_1 - w_2}{\pi \|w\|} \right\}.$$

Observe that all buyer types whose weight vectors are co-linear (i.e., they lie along the same unit vector) have identical budgets. Let the number of buyers in the auction be $n = 2$. Moreover, define the set of item types as the two standard basis vectors $A := \{e_1, e_2\}$. Finally, let $G$ (distribution over buyer types) and $F$ (distribution over item types) be the uniform distributions on $\Theta$ and $A$ respectively. Since $A$ is discrete and $F$ does not have a density, this example does not satisfy the assumptions we made in our model. Nonetheless, in the next claim, we show that not only does a pacing equilibrium exist, but we can also state it in closed form. The proof of the claim can be found in Appendix D. Claim 1.

The pacing function $\mu : \Theta \to \mathbb{R}$ defined as $\mu(w,B) = \|w\| - 1$, for all $(w,B) \in \Theta$, is an equilibrium, i.e., $\beta^\mu$, as given in Definition 1, is an SFPE.

Since $H^\mu_\alpha(\cdot)$ is a strictly increasing function for all $\alpha \in A$, it is easy to check that $\mu(w,B)$ is the unique optimal solution to the dual optimization problem $\min_{t \in [0, \omega/B_{\min}]} q^\mu(w,B,t)$ for all $(w,B) \in \delta(G)$. Therefore, this example falls under the purview of part 1 of Theorem 4. As expected, conforming to Theorem 4, the buyers whose weight vectors are co-linear get paced down to the same point on the unit arc, as shown in Figure 1.

We now describe the simulation-based experiments we conducted to verify our theoretical results. As is necessitated by computer simulations, we studied a discretized version of our problem in these experiments. More precisely, in our experiments, we used discrete approximations to the buyer type distribution $G$ and the item type distribution $F$. Moreover, for all item types $\alpha$, we set the reserve price $r(\alpha) = 0$. For each discretized instance, we compute the equilibrium pacing multipliers $\mu(w,B)$ using best-response dynamics in the dual space while assuming finitely many discrete buyer types. We note that this approach is not guaranteed to converge to a solution in our setting (in fact, our results do not guarantee the existence of an equilibrium when types are discrete). Nonetheless, we show below that in our experiments the computed pacing multipliers behave consistently with the predictions of our theoretical results.

As a first step, and to validate our best-response dynamics, we ran the algorithm on the discrete approximation of the example discussed in subsection 6.1, for which we had already analytically determined a pacing equilibrium in Claim 1. The problem was discretized by picking 320 points lying in the set of buyer types $\Theta$ defined in subsection 6.1. In Figure 1, we provide plots for the case when $a = 2$, $b = 3$. 
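Independently of the best-response simulation, the closed form in Claim 1 can be sanity-checked with a few lines of numerical integration. The sketch below is not the authors' code and rests on stated assumptions: the budget in the definition of $\Theta$ is read as $B = (2\|w\| - w_1 - w_2)/(\pi\|w\|)$; the uniform distribution on $\Theta$ is taken to make the direction angle of $w$ uniform on $[0, \pi/2]$; and with $\mu(w,B) = \|w\| - 1$ every weight vector is paced to the unit arc, so the competitor's paced value for either item has CDF $H(s) = 1 - (2/\pi)\arccos(s)$. The check is that the expected first-price expenditure equals the budget, as complementary slackness requires for paced buyers.

```python
import math


def H(s):
    """CDF of the competitor's paced value cos(phi), phi ~ Uniform[0, pi/2]."""
    s = min(max(s, 0.0), 1.0)
    return 1.0 - (2.0 / math.pi) * math.acos(s)


def first_price_payment(x, steps=20000):
    """Expected payment x * H(x) - int_0^x H(s) ds with zero reserve price."""
    h = x / steps
    integral = sum(H((k + 0.5) * h) for k in range(steps)) * h  # midpoint rule
    return x * H(x) - integral


def expected_spend(w1, w2):
    """Expected expenditure when pacing with mu = ||w|| - 1 (Claim 1)."""
    norm = math.hypot(w1, w2)
    x1, x2 = w1 / norm, w2 / norm  # paced weights lie on the unit arc
    # Items e_1 and e_2 each arrive with probability 1/2.
    return 0.5 * first_price_payment(x1) + 0.5 * first_price_payment(x2)


def budget(w1, w2):
    """Budget attached to weight vector w in the example (assumed reading)."""
    norm = math.hypot(w1, w2)
    return (2.0 * norm - w1 - w2) / (math.pi * norm)


print(expected_spend(2.0, 1.5), budget(2.0, 1.5))
```

For $w = (2, 1.5)$, which has norm 2.5 and therefore lies in $\Theta$ when $a = 2$ and $b = 3$, both quantities equal $0.6/\pi$ up to integration error.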
We see that the theoretical predictions from Claim 1 are replicated almost exactly by the solution computed by best-response iteration on the discretized problem.

Secondly, we conducted experiments to verify the structural properties described in Theorem 4. Here we consider instances with $n = 3$ buyers per auction, $d = 2$ features, the buyer type distribution $G$ given by the uniform distribution on $(1, 2) \times (1, 2) \times \{0.6\}$ and the item type distribution $F$ given by the uniform distribution on the one-dimensional simplex $\{(x, y) \mid x, y \geq 0;\ x + y = 1\}$. These were discretized taking a uniform grid with 10 points along each dimension. The results are portrayed in Figure 2.

The structural properties discussed in Theorem 4 are clearly evident in Figure 2. In this scenario, the buyer types are uniformly distributed on $(1, 2) \times (1, 2) \times \{0.6\}$ and, as a consequence, all buyers have identical budgets equal to 0.6. At equilibrium, it can be seen that the co-linear buyer types (i.e., buyer types whose weight vectors $w$ are co-linear) who have a positive multiplier get paced down to the critical buyer type who exactly spends her budget. Moreover, at equilibrium, the boundary that separates the paced buyer types from the unpaced buyer types (the curve on which the critical buyer types lie) can be clearly observed in the left-hand plot in Figure 2.

Figure 1: The example from Section 6.1 with $a = 2$, $b = 3$. The unpaced and paced buyer weight vectors are uniformly distributed in the blue and black region, respectively. Each plot shows the distribution of two-dimensional buyer weight vectors. The left plot shows the theoretical results of subsection 6.1. In the left plot, the buyer weight vectors lying on the red line get paced down to the red point. The right plot shows the results of best-response iteration on the corresponding discretized problem.

Figure 2: The left plot depicts how the multiplicative shading factor $1/(1 + \mu(w, B))$ varies with buyer weight vector $w$ (budget $B = 0.6$ […]

This paper introduces a natural contextual valuation model and characterizes the equilibrium bidding behavior of budget-constrained buyers in first-price auctions in this model. We extend this result to other standard auctions and establish revenue equivalence among them. Due to the extensive focus on second-price auctions, previous works endorse bid-pacing as the framework of choice for budget management in the presence of strategic buyers. Our results suggest that value-pacing, which coincides with bid-pacing in second-price auctions, is an appropriate framework to manage budgets across all standard auctions.

An important open question we leave unanswered is that of optimizing the reserve prices to maximize seller revenue under equilibrium bidding. In general, optimizing under equilibrium constraints is usually challenging, so it is interesting to explore whether our model possesses additional structure that allows for tractability. Investigating dynamics in first-price auctions with strategic budget-constrained buyers is another interesting open direction worth exploring. We also leave open the question of efficient computation of the pacing-based equilibria discussed in this paper. Addressing this question will likely require choosing a suitable method of discretization and tie-breaking, without which equilibrium existence may not be guaranteed (see Conitzer et al. 2018).

References

Vibhanshu Abhishek and Kartik Hosanagar. Optimal bidding in multi-item multislot sponsored search auctions. Operations Research, 61(4):855–873, 2013.

Charalambos D Aliprantis and Kim C Border. Infinite Dimensional Analysis: A Hitchhiker's Guide. Springer Science & Business Media, 2006.

Luigi Ambrosio, Nicola Fusco, and Diego Pallara. Functions of bounded variation and free discontinuity problems, volume 254. Clarendon Press Oxford, 2000.

Santiago Balseiro, Anthony Kim, Mohammad Mahdian, and Vahab Mirrokni. Budget management strategies in repeated auctions. In Proceedings of the 26th International Conference on World Wide Web, pages 15–23, 2017.

Santiago R Balseiro and Yonatan Gur. Learning in repeated auctions with budgets: Regret minimization and equilibrium. Management Science, 65(9):3952–3968, 2019.

Santiago R Balseiro, Omar Besbes, and Gabriel Y Weintraub. Repeated auctions with budgets in ad exchanges: Approximations and design. Management Science, 61(4):864–884, 2015.

Dimitri P Bertsekas, WW Hager, and OL Mangasarian. Nonlinear programming. Athena Scientific Belmont, MA, 1998.

Christian Borgs, Jennifer Chayes, Nicole Immorlica, Kamal Jain, Omid Etesami, and Mohammad Mahdian. Dynamics of bid optimization in online advertisement auctions. In Proceedings of the 16th international conference on World Wide Web, pages 531–540, 2007.

Yeon-Koo Che and Ian Gale. Standard auctions with financially constrained bidders. The Review of Economic Studies, 65(1):1–21, 1998.

Xi Chen, Zachary Owen, Clark Pixton, and David Simchi-Levi. A statistical learning approach to personalization in revenue management. Management Science, 2021.

Vincent Conitzer, Christian Kroer, Eric Sodomka, and Nicolás E. Stier Moses. Multiplicative pacing equilibria in auction markets. In Web and Internet Economics - 14th International Conference, WINE, volume 11316, page 443, 2018.

Vincent Conitzer, Christian Kroer, Debmalya Panigrahi, Okke Schrijvers, Eric Sodomka, Nicolas E Stier-Moses, and Chris Wilkens. Pacing equilibrium in first-price auction markets. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 587–587, 2019.

Rick Durrett. Probability: theory and examples, volume 49. Cambridge university press, 2019.

Lawrence Craig Evans and Ronald F Gariepy. Measure theory and fine properties of functions. CRC press, 2015.

Gagan Goel, Vahab Mirrokni, and Renato Paes Leme. Polyhedral clinching auctions and the adwords polytope. J. ACM, 62(3), June 2015.

Negin Golrezaei, Adel Javanmard, Vahab Mirrokni, et al. Dynamic incentive-aware learning: Robust pricing in contextual auctions. Operations Research, 69(1):297–314, 2021.

Ramakrishna Gummadi, Peter Key, and Alexandre Proutiere. Repeated auctions under budget constraints: Optimal bidding strategies and equilibria. In the Eighth Ad Auction Workshop, 2012.

Dariusz Idczak. Functions of several variables of finite variation and their differentiability. In Annales Polonici Mathematici, volume 60, pages 47–56. Instytut Matematyczny Polskiej Akademii Nauk, 1994.

Krishnamurthy Iyer, Ramesh Johari, and Mukund Sundararajan. Mean field equilibria of dynamic auctions with learning. Management Science, 60(12):2949–2970, 2014.

Maciej H Kotowski. First-price auctions with budget constraints. Theoretical Economics, 15(1):199–237, 2020.

Vijay Krishna. Auction theory. Academic press, 2009.

Christian Kroer, Alexander Peysakhovich, Eric Sodomka, and Nicolas E Stier-Moses. Computing large market equilibria using abstractions. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 745–746, 2019.

John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. Advances in neural information processing systems, 20(1):96–1, 2007.

Lihong Li, Wei Chu, John Langford, and Robert E Schapire. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web, pages 661–670, 2010.

Ilan Lobel, Renato Paes Leme, and Adrian Vladu. Multidimensional binary search for contextual decision-making. Operations Research, 66(5):1346–1361, 2018.

H Brendan McMahan, Gary Holt, David Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, et al. Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1222–1230, 2013.

Aranyak Mehta. Online matching and ad allocation. 2013.

Mallesh M Pai and Rakesh Vohra. Optimal auctions with financially constrained buyers. Journal of Economic Theory, 150:383–425, 2014.

Walter Rudin. Principles of mathematical analysis, volume 3. McGraw-hill New York, 1964.

David Schmeidler. Equilibrium points of nonatomic games. Journal of statistical Physics, 7(4):295–300, 1973.

Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczynski. Lectures on stochastic programming. 2009.

Rangarajan K Sundaram. A first course in optimization theory. Cambridge university press, 1996.

Kalyan T Talluri and Garrett J Van Ryzin. The theory and practice of revenue management, volume 68. Springer Science & Business Media, 2006.

Electronic Companion: Contextual First-Price Auctions with Budgets

Santiago Balseiro, Christian Kroer, Rachitesh Kumar

February 23, 2021

A Existence of Symmetric First Price Equilibrium

A.1 Preliminaries on Continuity

Here, we provide the missing proofs from Section 3. To state the proof of Lemma 1, we will need the following lemma.

Lemma 7.

Consider a set $Y = \{y_\alpha\}_{\alpha \in I}$ with $y_\alpha > 0$, where $I$ is an index set. If $I$ is uncountable, then there exists a countable sequence $\{\alpha_n\}_{n \in \mathbb{N}} \subset I$ such that $\sum_{n \in \mathbb{N}} y_{\alpha_n} = \infty$.

Proof. Write $I = \cup_{n \in \mathbb{Z}_+} \{\alpha \in I \mid y_\alpha \geq 1/n\}$. It is a well-known fact that a countable union of countable sets is countable (see Theorem 2.12 of Rudin 1964). Because $I$ is uncountable, there exists $n_0$ such that $\{\alpha \in I \mid y_\alpha \geq 1/n_0\}$ is uncountable. Therefore, we can find a countable sequence $\{y_{\alpha_n}\}_{n \in \mathbb{N}}$ such that $y_{\alpha_n} \geq 1/n_0$ for all $n \in \mathbb{N}$. For this sequence, $\sum_{n \in \mathbb{N}} y_{\alpha_n} = \infty$.

We now state the proof of Lemma 1.

Proof of Lemma 1.

a. First, we will show that $\lambda^\mu_\alpha$ has a continuous CDF almost surely w.r.t. $\alpha \sim F$. Recall that $A \subset \mathbb{R}^d_+$, where $d \geq 2$. Define $f : \Theta_w \to S^d$ as $f(x) = x/\|x\|$, where $S^d$ is the unit $d$-sphere. Consider linearly independent $\alpha_1, \alpha_2 \in A$. For $x_1, x_2 \in [0, \omega]$, consider the set $C := \mathrm{cone}(\{s \in \Theta_w \mid \alpha_1^T s = x_1,\ \alpha_2^T s = x_2\}) \cap \Theta_w$. Next, we show that $C = f^{-1}(f(C))$. It is straightforward to check $C \subset f^{-1}(f(C))$. On the other hand, if $v \in f^{-1}(f(C))$, then there exists $w \in C$ such that $f(v) = v/\|v\| = w/\|w\| = f(w)$. Therefore, $v = \|v\|\, w/\|w\| \in C$ as cones are closed under multiplication by positive scalars and $v \in \Theta_w$.

Let $\lambda^\mu$ denote the distribution of $w/(1 + \mu(w, B))$ when $(w, B) \sim G$. Note that, as $\alpha_1, \alpha_2$ are linearly independent, $V := \{s \in \mathbb{R}^d \mid \alpha_1^T s = x_1,\ \alpha_2^T s = x_2\}$ is a $(d-2)$-dimensional affine subspace of $\mathbb{R}^d$. Let $\{s_0, \ldots, s_{d-2}\}$ be an affine basis of $V$, i.e., for every $s \in V$, there exist scalars $a_0, \ldots, a_{d-2}$ such that $s = \sum_{i=0}^{d-2} a_i s_i$ and $\sum_{i=0}^{d-2} a_i = 1$. Therefore, $V \subset \mathrm{span}(\{s_0, \ldots, s_{d-2}\})$. This implies that $C$ lies in a $(d-1)$-dimensional subspace. As $G_w$ has a density, we get $G_w(C) = 0$.

Observe that $f(w/(1 + \mu(w, B))) = f(w)$ for all $(w, B) \in \Theta$. Hence, the distribution of $f(s)$ when $s \sim \lambda^\mu$ is the same as the distribution of $f(w)$ when $w \sim G_w$, which implies $\lambda^\mu(C) = \lambda^\mu(f^{-1}(f(C))) = G_w(f^{-1}(f(C))) = G_w(C) = 0$. Therefore, we have shown that for linearly independent $\alpha_1, \alpha_2$, we have

$$\lambda^\mu(\{s \in S \mid \alpha_1^T s = x_1,\ \alpha_2^T s = x_2\}) = 0.$$

Define $J = \{\alpha/\|\alpha\| \mid \exists\, x_\alpha > 0 \text{ s.t. } \lambda^\mu(\alpha^T s = x_\alpha) > 0\}$. Suppose $J$ is uncountable. Then, using Lemma 7, there exist countable sequences $\{\alpha_m\}_{m \in \mathbb{N}}$ and $\{x_{\alpha_m}\}_{m \in \mathbb{N}}$ such that $\alpha_i/\|\alpha_i\| \neq \alpha_j/\|\alpha_j\|$ for all $i \neq j$ and

$$\sum_m \lambda^\mu(\alpha_m^T s = x_{\alpha_m}) = \infty.$$

Set $S_m := \{s \mid \alpha_m^T s = x_{\alpha_m}\}$. We have shown above that $\lambda^\mu(S_i \cap S_j) = 0$ for all $i \neq j$. Therefore, for all $m \geq 1$, we have λ^µ(S_m ∩ (∪_j […] x | σ^µ_α(t) = σ^µ_α(x)}, we have $\sigma^\mu_\alpha(y) = \sigma^\mu_\alpha(x)$ (as $\sigma^\mu_\alpha$ is continuous) and $H^\mu_\alpha((x, y]) > 0$. First, consider the case when $H^\mu_\alpha(x) > 0$. Observe that

$$\begin{aligned}
\sigma^\mu_\alpha(y) - \sigma^\mu_\alpha(x) &= y - x - \int_{r(\alpha)}^{y} \frac{H^\mu_\alpha(s)}{H^\mu_\alpha(y)}\, ds + \int_{r(\alpha)}^{x} \frac{H^\mu_\alpha(s)}{H^\mu_\alpha(x)}\, ds \\
&= y - x - \int_{x}^{y} \frac{H^\mu_\alpha(s)}{H^\mu_\alpha(y)}\, ds + \left( \frac{1}{H^\mu_\alpha(x)} - \frac{1}{H^\mu_\alpha(y)} \right) \int_{r(\alpha)}^{x} H^\mu_\alpha(s)\, ds \\
&> \left( \frac{1}{H^\mu_\alpha(x)} - \frac{1}{H^\mu_\alpha(y)} \right) \int_{r(\alpha)}^{x} H^\mu_\alpha(s)\, ds,
\end{aligned}$$

where the last inequality follows from $H^\mu_\alpha(y) - H^\mu_\alpha(x) = H^\mu_\alpha((x, y]) > 0$. Therefore, $\sigma^\mu_\alpha(y) > \sigma^\mu_\alpha(x)$ because $H^\mu_\alpha(y) > H^\mu_\alpha(x)$, which contradicts $\sigma^\mu_\alpha(y) = \sigma^\mu_\alpha(x)$.

Next, consider the case when $H^\mu_\alpha(x) = 0$. Then, $H^\mu_\alpha(x) = 0$ and $H^\mu_\alpha(y) = H^\mu_\alpha(x) + H^\mu_\alpha((x, y]) > 0$. Note that

$$\sigma^\mu_\alpha(y) H^\mu_\alpha(y) = y H^\mu_\alpha(y) - \int_{r(\alpha)}^{y} H^\mu_\alpha(s)\, ds = \int_{r(\alpha)}^{y} \left[ H^\mu_\alpha(y) - H^\mu_\alpha(s) \right] ds + r(\alpha) H^\mu_\alpha(y).$$

Hence, $\sigma^\mu_\alpha(y) = 0$ if and only if $H^\mu_\alpha(s) = H^\mu_\alpha(y)$ for all $s \in [r(\alpha), y]$ and $r(\alpha) = 0$. As $H^\mu_\alpha(0) = 0$ and $H^\mu_\alpha(y) > 0$, we get $\sigma^\mu_\alpha(y) > 0$, which contradicts $\sigma^\mu_\alpha(y) = \sigma^\mu_\alpha(x)$.

d. Consider an $\alpha \in A$ such that $\lambda^\mu_\alpha$ has a continuous CDF and $P_{x \sim \lambda^\mu_\alpha}(\sigma^\mu_\alpha(x) = c) > 0$ for some $c \geq 0$. Then, by the definition of $\sigma^\mu_\alpha$, it must be that $c \geq r(\alpha)$. Moreover, if we let $x_0 = \inf\{x \mid \sigma^\mu_\alpha(x) = c\}$, then $P_{x \sim \lambda^\mu_\alpha}(\sigma^\mu_\alpha(x) = c) > 0$ implies $H^\mu_\alpha(\{y \in [0, \omega] \mid x_0 < y,\ \sigma^\mu_\alpha(x_0) = \sigma^\mu_\alpha(y)\}) > 0$. This contradicts part (c), which shows that for $\alpha \in A$ such that $H^\mu_\alpha$ is continuous and $x \geq r(\alpha)$, we have

$$H^\mu_\alpha(\{y \in [0, \omega] \mid x < y,\ \sigma^\mu_\alpha(x) = \sigma^\mu_\alpha(y)\}) = 0.$$

Therefore, when $x \sim \lambda^\mu_\alpha$, the CDF of $\sigma^\mu_\alpha(x)$ is continuous, and hence, if $x_1, x_2 \sim \lambda^\mu_\alpha$ i.i.d., then the probability of $\sigma^\mu_\alpha(x_1) = \sigma^\mu_\alpha(x_2)$ is zero. Part (d) follows from combining this fact with part (a).

A.2 Strong Duality and Characterizing an Optimal Pacing Strategy

We now state the proof of Lemma 2.

Proof of Lemma 2.

Note that bidding more than the highest competing bid with a positive probability is not optimal, i.e., if $P_\alpha(b(\alpha) > \sigma^\mu_\alpha(\omega)) > 0$, then $b$ is not optimal. Therefore, we can restrict our attention to $b$ such that $0 \leq b(\alpha) \leq \sigma^\mu_\alpha(\omega)$ a.s. w.r.t. $\alpha \sim F$. Now, consider such a $b$. As $\sigma^\mu_\alpha(0) = 0$ and $\sigma^\mu_\alpha$ is continuous a.s. w.r.t. $\alpha \sim F$, by the Intermediate Value Theorem, there exists $z(\alpha) \in [0, \omega]$ such that $\sigma^\mu_\alpha(z(\alpha)) = b(\alpha)$.

Therefore, with $x(\alpha) := w^T\alpha/(1 + t)$, we have

$$\begin{aligned}
&\max_{b(\cdot)} E_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \left( \frac{w^T\alpha}{1+t} - b(\alpha) \right) \mathbf{1}\{ b(\alpha) \geq \max( r(\alpha), \{\beta(\theta_i, \alpha)\}_i ) \} \right] \\
&= \max_{b(\cdot)} E_\alpha E_{Y \sim H^\mu_\alpha} \left[ (x(\alpha) - b(\alpha))\, \mathbf{1}\{ b(\alpha) \geq \max( r(\alpha), \sigma^\mu_\alpha(Y) ) \} \right] \\
&= \max_{z(\cdot)} E_\alpha E_{Y \sim H^\mu_\alpha} \left[ (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, \mathbf{1}\{ \sigma^\mu_\alpha(z(\alpha)) \geq \max( r(\alpha), \sigma^\mu_\alpha(Y) ) \} \right] \\
&= \max_{z(\cdot)} E_\alpha E_{Y \sim H^\mu_\alpha} \left[ (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, \mathbf{1}\{ z(\alpha) \geq \max( r(\alpha), Y ) \} \right] \\
&= \max_{z(\cdot)} E_\alpha \left[ (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, H^\mu_\alpha(z(\alpha))\, \mathbf{1}\{ z(\alpha) \geq r(\alpha) \} \right],
\end{aligned}$$

where the third equality follows from part (c) of Lemma 1. Hence, to prove the claim, it is enough to show that for all $\alpha \in A$, we have

$$x(\alpha) \in \arg\max_{z(\cdot)}\ (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, H^\mu_\alpha(z(\alpha))\, \mathbf{1}\{ z(\alpha) \geq r(\alpha) \}.$$

The above statement holds trivially for $\alpha$ such that $x(\alpha) < r(\alpha)$, because $\sigma^\mu_\alpha(t) \geq r(\alpha)$ when $t \geq r(\alpha)$. Consider $\alpha \in A$ for which $x(\alpha) \geq r(\alpha)$. Then, for $z(\alpha) \geq r(\alpha)$,

$$\begin{aligned}
(x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, H^\mu_\alpha(z(\alpha)) &= x(\alpha) H^\mu_\alpha(z(\alpha)) - z(\alpha) H^\mu_\alpha(z(\alpha)) + \int_{r(\alpha)}^{z(\alpha)} H^\mu_\alpha(s)\, ds \\
&= (x(\alpha) - z(\alpha))\, H^\mu_\alpha(z(\alpha)) + \int_{r(\alpha)}^{z(\alpha)} H^\mu_\alpha(s)\, ds.
\end{aligned}$$

Therefore, for $z(\alpha) \geq r(\alpha)$, we have

$$(x(\alpha) - \sigma^\mu_\alpha(x(\alpha)))\, H^\mu_\alpha(x(\alpha)) - (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, H^\mu_\alpha(z(\alpha)) = (z(\alpha) - x(\alpha))\, H^\mu_\alpha(z(\alpha)) - \int_{x(\alpha)}^{z(\alpha)} H^\mu_\alpha(s)\, ds \geq 0,$$

where the inequality holds in both cases $z(\alpha) \geq x(\alpha)$ and $x(\alpha) \geq z(\alpha)$ because $H^\mu_\alpha$ is non-decreasing. Furthermore, for $z(\alpha) < r(\alpha)$, we have

$$(x(\alpha) - \sigma^\mu_\alpha(x(\alpha)))\, H^\mu_\alpha(x(\alpha))\, \mathbf{1}\{ x(\alpha) \geq r(\alpha) \} \geq (x(\alpha) - \sigma^\mu_\alpha(z(\alpha)))\, H^\mu_\alpha(z(\alpha))\, \mathbf{1}\{ z(\alpha) \geq r(\alpha) \} = 0.$$

Hence, $z(\alpha) = x(\alpha)$ is optimal, which completes the proof.

Before proceeding with the proof of Theorem 2, we state a few useful lemmas and definitions. Recall that

$$q^\mu(w, B, t) = (1 + t)\, E_\alpha \left[ \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds \right] + tB.$$

Lemma 8.

For $\mu : \Theta \to \mathbb{R}_{\geq 0}$ and $(w, B) \in \Theta$:

1. $q^\mu(w, B, t)$ is convex as a function of $t$.

2. $\min_{t \geq 0} q^\mu(w, B, t) = \min_{t \in [0, \omega/B]} q^\mu(w, B, t)$.

Proof.

1. The objective function of the dual problem of a maximization problem is convex.

2. As $H^\mu_\alpha(s) \leq 1$ for all $\alpha \in A$ and $s \in \mathbb{R}$, the following inequalities hold:

$$0 \leq \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds \leq \omega \qquad \forall\, t \geq 0,\ \alpha \in A.$$

If $t > \omega/B$, then $q^\mu(w, B, t) \geq tB > \omega \geq q^\mu(w, B, 0)$. Hence, $q^\mu(w, B, t)$, as a function of $t$, has its minimum in the interval $[0, \omega/B]$.

Let $K$ be the distribution of $\alpha/r(\alpha)$ when $\alpha \sim F$. For $w \in \Theta_w$, let $K_w$ be the distribution of $w^T\gamma$ when $\gamma \sim K$, i.e., $K_w(\mathcal{B}) := K(\{\gamma \mid w^T\gamma \in \mathcal{B}\})$ for all Borel sets $\mathcal{B} \subset \mathbb{R}$.

Lemma 9. $K_w$ has a continuous CDF almost surely w.r.t. $w \sim G_w$.

Proof. The proof follows from an argument almost identical to the one used in the proof of part (a) of Lemma 1. Nevertheless, we state it for the sake of completeness. Recall that $\Theta_w \subset \mathbb{R}^d_+$, where $d \geq 2$. Define $f : A \to S^d$ as $f(x) = x/\|x\|$, where $S^d$ is the unit $d$-sphere. Consider linearly independent $w_1, w_2 \in \Theta_w$. For $x_1, x_2 \in [0, \infty)$, consider the set $C := \mathrm{cone}(\{s \in A \mid w_1^T s = x_1,\ w_2^T s = x_2\}) \cap A$. Next, we show that $C = f^{-1}(f(C))$. It is straightforward to check $C \subset f^{-1}(f(C))$. On the other hand, if $v \in f^{-1}(f(C))$, then there exists $\alpha \in C$ such that $f(v) = v/\|v\| = \alpha/\|\alpha\| = f(\alpha)$. Therefore, $v = \|v\|\, \alpha/\|\alpha\| \in C$ as cones are closed under multiplication by positive scalars and $v \in A$.

Note that, as $w_1, w_2$ are linearly independent, $V := \{s \in \mathbb{R}^d \mid w_1^T s = x_1,\ w_2^T s = x_2\}$ is a $(d-2)$-dimensional affine subspace of $\mathbb{R}^d$. Let $\{s_0, \ldots, s_{d-2}\}$ be an affine basis of $V$, i.e., for every $s \in V$, there exist scalars $a_0, \ldots, a_{d-2}$ such that $s = \sum_{i=0}^{d-2} a_i s_i$ and $\sum_{i=0}^{d-2} a_i = 1$. Therefore, $V \subset \mathrm{span}(\{s_0, \ldots, s_{d-2}\})$. This implies that $C$ lies in a $(d-1)$-dimensional subspace. As $F$ has a density, we get $F(C) = 0$.

Observe that $f(\alpha/r(\alpha)) = f(\alpha)$ for all $\alpha \in A$. Hence, the distribution of $f(s)$ when $s \sim K$ is the same as the distribution of $f(\alpha)$ when $\alpha \sim F$, which implies $K(C) = K(f^{-1}(f(C))) = F(f^{-1}(f(C))) = F(C) = 0$. Therefore, we have shown that for linearly independent $w_1, w_2$, we have

$$K(\{s \in S \mid w_1^T s = x_1,\ w_2^T s = x_2\}) = 0.$$

Define $J = \{w/\|w\| \mid \exists\, x_w > 0 \text{ s.t. } K(w^T s = x_w) > 0\}$. Suppose $J$ is uncountable. Then, using Lemma 7, there exist countable sequences $\{w_m\}_{m \in \mathbb{N}}$ and $\{x_{w_m}\}_{m \in \mathbb{N}}$ such that $w_i/\|w_i\| \neq w_j/\|w_j\|$ for all $i \neq j$ and

$$\sum_m K(w_m^T s = x_{w_m}) = \infty.$$

Set $S_m := \{s \mid w_m^T s = x_{w_m}\}$. We have shown above that $K(S_i \cap S_j) = 0$ for all $i \neq j$. Therefore, for all $m \geq 1$, we have K(S_m ∩ (∪_j […]

Define $\Theta' \subset \Theta$ to be the set of $(w, B) \in \Theta$ for which $K_w$ has a continuous CDF. The following lemma establishes differentiability of the dual objective function.

Lemma 10.

For all pacing functions $\mu : \Theta \to \mathbb{R}_{\geq 0}$ and buyer types $(w, B) \in \Theta'$, the dual objective $q^\mu(w, B, t)$ is differentiable as a function of $t$ for $t > -1/2$. Moreover,

$$\frac{\partial q^\mu(w, B, t)}{\partial t} = B - E_\alpha \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t} \right) H^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t} \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \right].$$

Proof.

Fix a pacing function $\mu : \Theta \to \mathbb{R}_{\geq 0}$ and a buyer $(w, B) \in \Theta'$. Define

$$g(t, \alpha) := \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds \qquad \forall\, t > -1/2,\ \alpha \in A.$$

Note that $x \mapsto \mathbf{1}(x \geq r(\alpha)) \int_{r(\alpha)}^{x} H^\mu_\alpha(s)\, ds$ is a non-decreasing convex function because $H^\mu_\alpha$ is non-decreasing. Moreover, it is easy to verify using the second order sufficient condition that $t \mapsto \frac{w^T\alpha}{1+t}$ is convex. As $t \mapsto g(t, \alpha)$ is a composition of these aforementioned functions, it is convex for each $\alpha$.

Fix $t_0 > -1/2$. Using Lemma 9 and the definition of $\Theta'$, we can write

$$F\left( \left\{ \alpha \in A \;\middle|\; \frac{w^T\alpha}{1+t_0} = r(\alpha) \right\} \right) = F\left( \left\{ \alpha \in A \;\middle|\; \frac{w^T\alpha}{r(\alpha)} = 1 + t_0 \right\} \right) = K\left( \left\{ \gamma \mid w^T\gamma = 1 + t_0 \right\} \right) = K_w(\{1 + t_0\}) = 0.$$

Using Theorem 7.46 of Shapiro et al. (2009), we get that $E_\alpha[g(t, \alpha)]$ is differentiable w.r.t. $t$ at $t_0$ and

$$\frac{\partial}{\partial t} E_\alpha[g(t_0, \alpha)] = E_\alpha\left[ \frac{\partial g(t_0, \alpha)}{\partial t} \right].$$

Therefore, the dual objective $q^\mu(w, B, t)$ is differentiable as a function of $t$ for $t > -1/2$, and

$$\begin{aligned}
\frac{\partial q^\mu(w, B, t_0)}{\partial t} &= E_\alpha[g(t_0, \alpha)] + (1 + t_0) \frac{\partial}{\partial t} E_\alpha[g(t_0, \alpha)] + B \\
&= E_\alpha[g(t_0, \alpha)] + (1 + t_0)\, E_\alpha\left[ \frac{\partial g(t_0, \alpha)}{\partial t} \right] + B \\
&= E_\alpha\left[ \mathbf{1}\left\{ \frac{w^T\alpha}{1+t_0} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t_0}} H^\mu_\alpha(s)\, ds \right] + (1 + t_0)\, E_\alpha\left[ -\frac{w^T\alpha}{(1+t_0)^2}\, H^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t_0} \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t_0} \geq r(\alpha) \right\} \right] + B \\
&= B - E_\alpha\left[ \left( \frac{w^T\alpha}{1+t_0}\, H^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t_0} \right) - \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t_0}} H^\mu_\alpha(s)\, ds \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t_0} \geq r(\alpha) \right\} \right].
\end{aligned}$$

Corollary 1. For all pacing functions $\mu : \Theta \to \mathbb{R}_{\geq 0}$ and buyer types $(w, B) \in \Theta'$, $q^\mu(w, B, t)$ is continuous as a function of $t$ for $t > -1/2$.

Corollary 2.

For all pacing functions $\mu : \Theta \to \mathbb{R}_{\geq 0}$ and buyer types $(w, B) \in \Theta'$, $\arg\min_{t \in [0, \omega/B]} q^\mu(w, B, t)$ is non-empty and compact.

Corollary 1 is a direct consequence of Lemma 10 and Corollary 2 follows from the Weierstrass Theorem. Finally, having established the required lemmas, we are ready to prove Theorem 2.
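The convexity claim of Lemma 8 and the derivative formula of Lemma 10 can be illustrated on a toy instance. The sketch below assumes (these are not choices made in the paper) a competing paced-value CDF $H(s) = s$ on $[0, 1]$ for every item type, a constant reserve price, and three equally likely item types. With this $H$, $\int_r^x H(s)\,ds = (x^2 - r^2)/2$ and $\sigma(x)H(x) = (x^2 + r^2)/2$. Because the toy item distribution is discrete, the dual has kinks at the values of $t$ where some paced value equals the reserve price, so the derivative is only compared away from those points; Lemma 10 rules such kinks out through the continuity of $K_w$.

```python
def paced_values(t, w, items):
    """Paced values x = w . alpha / (1 + t), one per item type."""
    return [sum(wi * ai for wi, ai in zip(w, alpha)) / (1.0 + t) for alpha in items]

def dual_objective(t, w, B, items, r):
    """q(w, B, t) under the toy assumption H(s) = s on [0, 1]."""
    gains = [(x * x - r * r) / 2.0 for x in paced_values(t, w, items) if x >= r]
    return (1.0 + t) * sum(gains) / len(items) + t * B

def dual_derivative(t, w, B, items, r):
    """Lemma 10 formula: B minus expected expenditure sigma(x) H(x) 1{x >= r}."""
    spend = [(x * x + r * r) / 2.0 for x in paced_values(t, w, items) if x >= r]
    return B - sum(spend) / len(items)

# Hypothetical instance: all values w . alpha stay below 1, so H(s) = s applies.
W, B, R = (0.9, 0.6), 0.1, 0.2
ITEMS = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

def finite_difference(t, h=1e-6):
    """Central finite-difference approximation of dq/dt."""
    up = dual_objective(t + h, W, B, ITEMS, R)
    down = dual_objective(t - h, W, B, ITEMS, R)
    return (up - down) / (2.0 * h)

def is_midpoint_convex(grid, tol=1e-12):
    """Check q((a + b) / 2) <= (q(a) + q(b)) / 2 for all pairs on the grid."""
    q = lambda t: dual_objective(t, W, B, ITEMS, R)
    return all(q(0.5 * (a + b)) <= 0.5 * (q(a) + q(b)) + tol
               for a in grid for b in grid if a < b)

if __name__ == "__main__":
    for t in (0.5, 1.0):
        print(t, finite_difference(t), dual_derivative(t, W, B, ITEMS, R))
    print(is_midpoint_convex([0.05 * k for k in range(101)]))
```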

Proof of Theorem 2.

Let $t^* \in \arg\min_{t \in [0, \omega/B]} q^\mu(w, B, t)$. According to Theorem 5.1.5 from Bertsekas et al. (1998), in order to prove Theorem 2, it suffices to show the following conditions:

(i) Primal feasibility:
$$E_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \geq \max\left( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i \right) \right\} \right] \leq B$$

(ii) Dual feasibility: $t^* \geq 0$

(iii)
$$\sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \in \arg\max_{b(\cdot)} E_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ (w^T\alpha - (1 + t^*)\, b(\alpha))\, \mathbf{1}\{ b(\alpha) \geq \max( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i ) \} \right] + t^* B$$

(iv) Complementary slackness:
$$t^* \cdot \left\{ B - E_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \geq \max\left( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i \right) \right\} \right] \right\} = 0$$

First, we simplify the expression for the expected expenditure used in the sufficient conditions (i)-(iv) stated above:

$$\begin{aligned}
&E_{\alpha, \{\theta_i\}_{i=1}^{n-1}} \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \geq \max\left( r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i \right) \right\} \right] \\
&= E_{\alpha, \{(w_i, B_i)\}_{i=1}^{n-1}} \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \geq \max\left( r(\alpha), \left\{ \sigma^\mu_\alpha\!\left( \frac{w_i^T\alpha}{1+\mu(w_i, B_i)} \right) \right\}_i \right) \right\} \right] \\
&= E_{\alpha, \{(w_i, B_i)\}_{i=1}^{n-1}} \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t^*} \geq \max\left( r(\alpha), \left\{ \frac{w_i^T\alpha}{1+\mu(w_i, B_i)} \right\}_i \right) \right\} \right] \\
&= E_\alpha \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) H^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t^*} \geq r(\alpha) \right\} \right].
\end{aligned}$$

In the rest of the proof, we establish the aforementioned sufficient conditions (i)-(iv). Note that $t^*$ satisfies the following first order conditions of optimality:

$$\frac{\partial q^\mu(w, B, t^*)}{\partial t} \geq 0, \qquad t^* \geq 0, \qquad t^* \cdot \frac{\partial q^\mu(w, B, t^*)}{\partial t} = 0. \tag{A-1}$$

Using Lemma 10, we can write

$$\frac{\partial q^\mu(w, B, t^*)}{\partial t} = B - E_\alpha \left[ \sigma^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) H^\mu_\alpha\!\left( \frac{w^T\alpha}{1+t^*} \right) \mathbf{1}\left\{ \frac{w^T\alpha}{1+t^*} \geq r(\alpha) \right\} \right].$$

To establish the sufficient conditions (i)-(iv), observe that (after simplification) conditions (i), (ii) and (iv) are the same as (A-1), and condition (iii) is a direct consequence of Lemma 2, thereby completing the proof of Theorem 2.
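The first order conditions (A-1) also suggest a simple way to compute an optimal multiplier numerically: since the dual objective is convex, its derivative $B - (\text{expected expenditure})$ is non-decreasing in $t$, and one can bisect on its sign over $[0, \omega/B]$. The sketch below is only an illustration under toy assumptions that are not from the paper: $H(s) = s$ on $[0, 1]$, zero reserve price (so $\sigma(x)H(x) = x^2/2$), $\omega = 1$, and a hypothetical list `VALUES` of equally likely values $w^T\alpha$.

```python
import math

def expected_spend(t, values):
    """E_alpha[sigma(x) H(x)] with x = v / (1 + t), H(s) = s and r = 0."""
    return sum((v / (1.0 + t)) ** 2 / 2.0 for v in values) / len(values)

def optimal_multiplier(values, B, omega=1.0, tol=1e-12):
    """Minimizer t* of the convex dual on [0, omega / B].

    Bisects on the sign of dq/dt = B - expected_spend(t).
    """
    if expected_spend(0.0, values) <= B:  # dq/dt(0) >= 0, so t* = 0
        return 0.0
    lo, hi = 0.0, omega / B               # dq/dt(lo) < 0 <= dq/dt(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_spend(mid, values) > B:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

VALUES = [0.9, 0.75, 0.6]  # hypothetical values w . alpha, each at most omega = 1

if __name__ == "__main__":
    for B in (0.1, 0.5):
        t_star = optimal_multiplier(VALUES, B)
        spend = expected_spend(t_star, VALUES)
        # (i) spend <= B, (ii) t* >= 0, (iv) t* * (B - spend) = 0
        print(B, t_star, spend, t_star * (B - spend))
```

With the tight budget the multiplier is positive and the budget binds; with the loose budget the multiplier is zero and the budget is slack, which are the two cases allowed by complementary slackness.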

A.3 Fixed Point Argument

Proof of Lemma 3.

1. First, observe that

$$\begin{aligned}
q^\mu(w, B, t) &= E_\alpha \left[ \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} (1 + t)\, H^\mu_\alpha(s)\, ds + tB \right] \\
&= E_\alpha \left[ \mathbf{1}\left\{ \frac{w^T\alpha}{1+t} \geq r(\alpha) \right\} \int_{(1+t) r(\alpha)}^{w^T\alpha} H^\mu_\alpha\!\left( \frac{y}{1+t} \right) dy + tB \right].
\end{aligned}$$

Consider $(w^L, B), (w^H, B) \in \Theta'$ such that $w^L_i < w^H_i$ and $w^L_{-i} = w^H_{-i}$, for some $i \in [d]$. Moreover, consider $t_L, t_H \in [0, \omega/B_{\min}]$ such that $t_L < t_H$. As $H^\mu_\alpha$ is a non-decreasing function, it is straightforward to check that $-q^\mu(w, B, t)$ has increasing differences w.r.t. $w_i$ and $t$:

$$q^\mu(w^H, B, t_L) - q^\mu(w^L, B, t_L) \geq q^\mu(w^H, B, t_H) - q^\mu(w^L, B, t_H).$$

Theorem 10.7 of Sundaram (1996) in combination with the definition of $\ell^\mu$ imply $\ell^\mu(w^H, B) \geq \ell^\mu(w^L, B)$.

2. Consider $(w, B_L), (w, B_H) \in \Theta'$ such that $B_L < B_H$ and $t_L, t_H \in [0, \omega/B_{\min}]$ such that $t_L < t_H$. Then, $-q^\mu(w, B, t)$ has increasing differences w.r.t. $-B$ and $t$:

$$q^\mu(w, B_H, t_H) - q^\mu(w, B_L, t_H) = (B_H - B_L)\, t_H \geq (B_H - B_L)\, t_L = q^\mu(w, B_H, t_L) - q^\mu(w, B_L, t_L).$$

Theorem 10.7 of Sundaram (1996) and the definition of $\ell^\mu$ imply $\ell^\mu(w, B_H) \leq \ell^\mu(w, B_L)$.

Proof of Lemma 4.

1. Theorem 1 of Idczak (1994) implies measurability of $\ell^\mu$. Moreover, $\ell^\mu$ is bounded by definition.

2. Consider $\phi \in C^1_c(\Theta, \mathbb{R}^n)$ such that $\|\phi\|_\infty \leq 1$. Then,

$$\begin{aligned}
V(\ell^\mu, \Theta) &= \int_\Theta \ell^\mu(\theta)\, \mathrm{div}\, \phi(\theta)\, d\theta \\
&= \sum_{i=1}^{d+1} \int_\Theta \ell^\mu(\theta)\, \frac{\partial \phi(\theta)}{\partial \theta_i}\, d\theta \\
&= \sum_{i=1}^{d+1} \int_{\theta_{-i}} \int_{\theta_i} \ell^\mu(\theta)\, \frac{\partial \phi(\theta)}{\partial \theta_i}\, d\theta_i\, d\theta_{-i} \\
&= \sum_{i=1}^{d+1} \int_{\theta_{-i}} \int_{\theta_i} -\phi(\theta_i, \theta_{-i})\, d\ell^\mu(\theta_i)\, d\theta_{-i} \\
&\leq \sum_{i=1}^{d+1} \int_{\theta_{-i}} \int_{\theta_i} d\ell^\mu(\theta_i)\, d\theta_{-i} \\
&\leq \sum_{i=1}^{d+1} \int_{\theta_{-i}} \frac{\omega}{B_{\min}}\, d\theta_{-i} \\
&\leq (d + 1)\, U^{d+1}\, \frac{\omega}{B_{\min}},
\end{aligned}$$

where the third equality follows from Fubini's Theorem. The sufficient conditions for Fubini's Theorem to hold are satisfied because $|\ell^\mu\, \mathrm{div}\, \phi|$ is bounded. Moreover, the fourth equality follows from integration by parts for the Lebesgue-Stieltjes integral and the fact that $\phi$ evaluates to 0 at the boundaries of $\Theta$ because $\phi$ is compactly supported.

Proof of Lemma 5.

We start by noting that, as $G$ has a density, if a sequence converges almost surely (or in $L^1$) under the Lebesgue measure on $\Theta$, then it converges almost surely (or in $L^1$) under $G$.

1. If $\mu(\theta) = 0$ for all $\theta \in \Theta$, then $\mu \in X$. Hence, $X$ is non-empty. Consider $\mu_1, \mu_2 \in X$ and $a \in [0, 1]$. Then, $a\mu_1 + (1 - a)\mu_2 \in [0, \omega/B_{\min}]$ and for $\phi \in C^1_c(\Omega, \mathbb{R}^n)$ s.t. $\|\phi\|_\infty \leq 1$, we have

$$\int_\Omega \{a\mu_1 + (1 - a)\mu_2\}(\theta)\, \mathrm{div}\, \phi(\theta)\, d\theta = a \int_\Omega \mu_1(\theta)\, \mathrm{div}\, \phi(\theta)\, d\theta + (1 - a) \int_\Omega \mu_2(\theta)\, \mathrm{div}\, \phi(\theta)\, d\theta \leq (d + 1)\, U^{d+1}\, \frac{\omega}{B_{\min}}.$$

Hence, $X$ is convex.

Consider a sequence $\{\mu_n\} \subset X$ and $\mu \in L^1(\Theta)$ such that $\mu_n \xrightarrow{L^1} \mu$. Then, there exists a subsequence $\{n_k\}$ such that $\mu_{n_k} \xrightarrow{a.s.} \mu$ as $k \to \infty$. Hence, $\mathrm{range}(\mu) \subset [0, \omega/B_{\min}]$. Moreover, by the semi-continuity of total variation (Remark 3.5 of Ambrosio et al. 2000), we have

$$V(\mu, \Theta) \leq \liminf_{n \to \infty} V(\mu_n, \Theta) \leq (d + 1)\, U^{d+1}\, \omega/B_{\min}.$$

Therefore, $X$ is closed. To see why $X$ is compact, consider a sequence $\{\mu_n\} \subset X$. Then, by Theorem 3.23 of Ambrosio et al. (2000), there exists a subsequence $\{n_k\}$ and $\mu \in BV(\Theta)$ such that $\mu_{n_k}$ converges to $\mu$ in the weak* topology, which implies convergence in $L^1(\Theta)$ (Proposition 3.13 of Ambrosio et al. 2000). Combining this with the fact that $X$ is closed completes the proof of compactness of $X$.

2. Suppose $f$ is not continuous. Then, there exists $\epsilon > 0$, a sequence $\{(\mu_n, \hat\mu_n)\}_n \subset X \times X$ and $(\mu, \hat\mu) \in X \times X$ such that $\lim_{n \to \infty} (\mu_n, \hat\mu_n) = (\mu, \hat\mu)$ and $|f(\mu_n, \hat\mu_n) - f(\mu, \hat\mu)| \geq \epsilon$ for all $n \in \mathbb{N}$. As $\mu_n \xrightarrow{L^1} \mu$, there exists a subsequence $\{n_k\}_k$ such that $\mu_{n_k} \xrightarrow{a.s.} \mu$ when $k \to \infty$. Moreover, $\hat\mu_n \xrightarrow{L^1} \hat\mu$ implies $\hat\mu_{n_k} \xrightarrow{L^1} \hat\mu$. Therefore, there exists a subsequence $\{n_{k_l}\}_l$ such that $\hat\mu_{n_{k_l}} \xrightarrow{a.s.} \hat\mu$ and $\mu_{n_{k_l}} \xrightarrow{a.s.} \mu$ as $l \to \infty$. Here, we have repeatedly used the fact that $L^1$ convergence implies the existence of a subsequence that converges a.s. Hence, after relabelling for ease of notation, we can write that there exists $\epsilon > 0$, a sequence $\{(\mu_n, \hat\mu_n)\}_n \subset X \times X$ and $(\mu, \hat\mu) \in X \times X$ such that $\mu_n \xrightarrow{a.s.} \mu$, $\hat\mu_n \xrightarrow{a.s.} \hat\mu$ and $|f(\mu_n, \hat\mu_n) - f(\mu, \hat\mu)| \geq \epsilon$ for all $n \in \mathbb{N}$.

First, observe that $\mu_n \xrightarrow{a.s.} \mu$ implies $w^T\alpha/(1 + \mu_n(w, B)) \xrightarrow{a.s.} w^T\alpha/(1 + \mu(w, B))$ and hence, $\lambda^{\mu_n}_\alpha \xrightarrow{d} \lambda^\mu_\alpha$ for all $\alpha \in A$. As $\lambda^\mu_\alpha$ is continuous almost surely w.r.t. $\alpha$, by the definition of convergence in distribution, we get that $\lim_{n \to \infty} \lambda^{\mu_n}_\alpha(s) = \lambda^\mu_\alpha(s)$ for all $s \in \mathbb{R}$ a.s. w.r.t. $\alpha \sim F$. Therefore, $\lim_{n \to \infty} H^{\mu_n}_\alpha(s) = H^\mu_\alpha(s)$ for all $s \in \mathbb{R}$, a.s. w.r.t. $\alpha \sim F$.

Also, note that $\lambda^{\hat\mu_n}_\alpha$ and $\lambda^{\hat\mu}_\alpha$ are atom-less almost surely w.r.t. $\alpha$. Let $\bar{A} \subset A$ be the set of $\alpha$ such that $\lim_{n \to \infty} H^{\mu_n}_\alpha(s) = H^\mu_\alpha(s)$ for all $s \in \mathbb{R}$ and $\{\lambda^{\hat\mu_n}_\alpha, \lambda^{\hat\mu}_\alpha\}$ are atom-less. Therefore, $F(\bar{A}) = 1$. For $s \in \mathbb{R}$ and $\alpha \in \bar{A}$, we get

$$\lim_{n \to \infty} \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} = \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\}$$

a.s. w.r.t. $(w, B) \sim G$. Note that the set of measure zero on which the above equality doesn't hold may depend on $\alpha, s$.

Fix $s \in [0, \omega]$ and $\alpha \in \bar{A}$. Combining these a.s. convergence statements, we get

$$\lim_{n \to \infty} (1 + \hat\mu_n(w, B))\, H^{\mu_n}_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} = (1 + \hat\mu(w, B))\, H^\mu_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\}$$

a.s. w.r.t. $(w, B) \sim G$. Furthermore, we can use the Dominated Convergence Theorem (as the sequence is bounded) to show

$$\lim_{n \to \infty} E_{(w,B)} \left[ (1 + \hat\mu_n(w, B))\, H^{\mu_n}_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} \right] = E_{(w,B)} \left[ (1 + \hat\mu(w, B))\, H^\mu_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\} \right].$$

Keep $s \in [0, \omega]$ fixed and apply the Dominated Convergence Theorem for a second time to obtain

$$\lim_{n \to \infty} E_\alpha \left[ E_{(w,B)} \left[ (1 + \hat\mu_n(w, B))\, H^{\mu_n}_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} \right] \right] = E_\alpha \left[ E_{(w,B)} \left[ (1 + \hat\mu(w, B))\, H^\mu_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\} \right] \right].$$

Finally, apply the Dominated Convergence Theorem for the third time to obtain

$$\lim_{n \to \infty} \int_0^\omega E_\alpha \left[ E_{(w,B)} \left[ (1 + \hat\mu_n(w, B))\, H^{\mu_n}_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} \right] \right] ds = \int_0^\omega E_\alpha \left[ E_{(w,B)} \left[ (1 + \hat\mu(w, B))\, H^\mu_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\} \right] \right] ds.$$

As we are dealing with non-negative random variables, we can apply Fubini's Theorem to rewrite the above statement as

$$\begin{aligned}
&\lim_{n \to \infty} E_{(w,B)} E_\alpha \left[ (1 + \hat\mu_n(w, B))\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1 + \hat\mu_n(w, B)}} H^{\mu_n}_\alpha(s)\, ds \right] \\
&= \lim_{n \to \infty} E_{(w,B)} E_\alpha \left[ \int_0^\omega (1 + \hat\mu_n(w, B))\, H^{\mu_n}_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu_n(w, B)} \geq s \geq r(\alpha) \right\} ds \right] \\
&= E_{(w,B)} E_\alpha \left[ \int_0^\omega (1 + \hat\mu(w, B))\, H^\mu_\alpha(s)\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq s \geq r(\alpha) \right\} ds \right] \\
&= E_{(w,B)} E_\alpha \left[ (1 + \hat\mu(w, B))\, \mathbf{1}\left\{ \frac{w^T\alpha}{1 + \hat\mu(w, B)} \geq r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1 + \hat\mu(w, B)}} H^\mu_\alpha(s)\, ds \right].
\end{aligned}$$

Moreover, applying the Dominated Convergence Theorem to $\hat\mu_n \xrightarrow{a.s.} \hat\mu$ yields $\lim_{n \to \infty} E_{(w,B)}[\hat\mu_n(w, B)\, B] = E_{(w,B)}[\hat\mu(w, B)\, B]$. Together, the above statements imply $\lim_{n \to \infty} f(\mu_n, \hat\mu_n) = f(\mu, \hat\mu)$, which is a contradiction.

3. Part (2) allows us to invoke the Berge Maximum Theorem (Theorem 17.31 of Aliprantis and Border 2006), which implies that $C^*$ is upper hemi-continuous with non-empty and compact values. Next, we show that $C^*(\mu)$ is also convex. Fix $\mu \in X$. Consider $\hat\mu_1, \hat\mu_2 \in C^*(\mu)$ and $\lambda \in [0, 1]$. Then,

$$\begin{aligned}
f(\mu, \lambda\hat\mu_1 + (1 - \lambda)\hat\mu_2) &= E_{(w,B)} \left[ q^\mu(w, B, \lambda\hat\mu_1(w, B) + (1 - \lambda)\hat\mu_2(w, B)) \right] \\
&\leq \lambda\, E_{(w,B)} \left[ q^\mu(w, B, \hat\mu_1(w, B)) \right] + (1 - \lambda)\, E_{(w,B)} \left[ q^\mu(w, B, \hat\mu_2(w, B)) \right] \\
&= \lambda f(\mu, \hat\mu_1) + (1 - \lambda) f(\mu, \hat\mu_2).
\end{aligned}$$

Hence, $\lambda\hat\mu_1 + (1 - \lambda)\hat\mu_2 \in C^*(\mu)$.

Proof of Lemma 6.

Recall that, in Lemma 4, we showed that $\ell^\mu \in X$. Therefore, as $\mu \in C^*(\mu)$,

$$\mathbb{E}_{(w,B)}\left[ q^\mu(w, B, \ell^\mu(w,B)) \right] \ge \mathbb{E}_{(w,B)}\left[ q^\mu(w, B, \mu(w,B)) \right].$$

On the other hand, by the definition of $\ell^\mu$, we get that

$$q^\mu(w, B, \ell^\mu(w,B)) \le q^\mu(w, B, \mu(w,B)) \quad \forall\, (w,B) \in \Theta.$$

Hence, combining the two statements yields $q^\mu(w, B, \ell^\mu(w,B)) = q^\mu(w, B, \mu(w,B))$ a.s. w.r.t. $(w,B) \sim G$, which completes the proof.

Structural Properties

Before proceeding with the proof of Theorem 4, we establish the following lemma, which is informative in its own right.

Lemma 11.

The pacing function $\mu : \Theta \to [0, \omega/B_{\min}]$ is continuous.

Proof. We start by observing that the following function is continuous for all $\alpha \in A$:

$$(w, B, t) \mapsto \int_0^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds.$$

Therefore, the Dominated Convergence Theorem implies that $(w, B, t) \mapsto q^\mu(w, B, t)$ is continuous. Finally, applying the Berge Maximum Theorem (Theorem 17.31 of Aliprantis and Border (2006)) yields the continuity of $(w, B) \mapsto \mu(w, B)$, because of our assumption that $\mu(w, B)$ is the unique minimizer of $q^\mu(w, B, t)$.

We now state the proof of Theorem 4.

Proof of Theorem 4.

Consider a unit vector $\hat{w} \in \mathbb{R}^d_+$ and budget $B > 0$ such that $w/\|w\| = \hat{w}$ for some $(w, B) \in \delta(X)$. If $\mu(w, B) = 0$ for all buyers $(w, B) \in \delta(X)$ with $w/\|w\| = \hat{w}$, then the theorem statement holds trivially. So assume that there exists $x > 0$ such that $(x\hat{w}, B) \in \delta(X)$ and $\mu(x\hat{w}, B) > 0$. Define $x_0 := \inf\{ x \in (0, \infty) \mid (x\hat{w}, B) \in \delta(X);\ \mu(x\hat{w}, B) > 0 \}$. Then, as a consequence of the complementary slackness condition established in Theorem 2, for $x > x_0$, we have

$$\mathbb{E}_\alpha\left[ \sigma^\mu_\alpha\left( \frac{x\hat{w}^T\alpha}{1 + \mu(x\hat{w}, B)} \right) H^\mu_\alpha\left( \frac{x\hat{w}^T\alpha}{1 + \mu(x\hat{w}, B)} \right) \right] = B.$$

Recall that, in Lemma 1, we established the continuity of $\sigma^\mu_\alpha$ and $H^\mu_\alpha$ almost surely w.r.t. $\alpha \sim F$. Combining this with the continuity of $\mu$ established in Lemma 11, we can apply the Dominated Convergence Theorem to establish

$$\mathbb{E}_\alpha\left[ \sigma^\mu_\alpha\left( \frac{x_0\hat{w}^T\alpha}{1 + \mu(x_0\hat{w}, B)} \right) H^\mu_\alpha\left( \frac{x_0\hat{w}^T\alpha}{1 + \mu(x_0\hat{w}, B)} \right) \right] = B.$$

As $B > 0$, we get $x_0 > 0$. Next, observe that if $t^* \ge 0$ satisfies $x_0(1 + t^*) = x(1 + \mu(x_0\hat{w}, B))$, then $x/(1 + t^*) = x_0/(1 + \mu(x_0\hat{w}, B))$, and therefore

$$\frac{\partial q^\mu(x\hat{w}, B, t^*)}{\partial t} = B - \mathbb{E}_\alpha\left[ \sigma^\mu_\alpha\left( \frac{x\hat{w}^T\alpha}{1 + t^*} \right) H^\mu_\alpha\left( \frac{x\hat{w}^T\alpha}{1 + t^*} \right) \right] = 0.$$

Therefore, by our uniqueness assumption on $\mu$, we get $1 + \mu(x\hat{w}, B) = (x/x_0)(1 + \mu(x_0\hat{w}, B))$ for all $x \ge x_0$. Hence, for all $x \ge x_0$, we get

$$\frac{x\hat{w}^T\alpha}{1 + \mu(x\hat{w}, B)} = \frac{x_0\hat{w}^T\alpha}{1 + \mu(x_0\hat{w}, B)}.$$

Part (1) of Theorem 4 follows directly. Part (2) considers the case when there exists $y \ge 0$ such that $(y\hat{w}, B) \in \delta(X)$ and $\mu(y\hat{w}, B) = 0$. In this case, Lemma 11 and the connectedness of $\delta(X)$ imply that $\mu(x\hat{w}, B) = 0$, with part (2) of Theorem 4 following as a direct consequence.

C Standard Auctions and Revenue Equivalence

Before proceeding with the proof of Theorem 5, we state some definitions and lemmas which will come in handy. For buyer type $(w, B) \in \Theta$, we will use $R^\mu(w, B)$ to denote the following optimization problem:

$$\begin{aligned}
R^\mu(w, B) := \max_{b : A \to \mathbb{R}_{\ge 0}} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( w^T\alpha - M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] \\
\text{s.t.} \quad & \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i)\, \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] \le B.
\end{aligned}$$

Then the dual optimization problem (or simply the dual problem) of $R^\mu(w, B)$ is given by

$$\min_{t \ge 0}\ \max_{b : A \to \mathbb{R}_{\ge 0}} \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( w^T\alpha - (1 + t)\, M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] + tB.$$

The following lemma characterizes the optimal solution to the Lagrangian problem.

Lemma 12.

For all $t \ge 0$,

$$\psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \in \arg\max_{b(\cdot)} \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right].$$

Proof.

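Consider an $\alpha \in A$ such that $\lambda^\mu_\alpha$ is atom-less. Then, using the assumptions on auction $\mathcal{A}$, we can write (with $b \in \mathbb{R}$ denoting the bid)

$$\psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \in \arg\max_{b \in \mathbb{R}} \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha(b, \{\psi^\mu_\alpha(X_i)\}_i) \right) \mathbb{1}\{ b \ge \max(r(\alpha), \{\psi^\mu_\alpha(X_i)\}_i) \} \right].$$

Combining this with the definition of $\Psi^\mu$, we get

$$\psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \in \arg\max_{b \in \mathbb{R}} \mathbb{E}_{\{\theta_i\}_{i=1}^{n-1}}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha(b, \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right].$$

To complete the proof, note that $\lambda^\mu_\alpha$ is atom-less a.s. w.r.t. $\alpha$ by part (a) of Lemma 1.

Lemma 13.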

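For $\alpha \in A$ such that $\lambda^\mu_\alpha$ is continuous,

$$\mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \psi^\mu_\alpha(Y)) \} = \mathbb{1}\{ x \ge \max(r(\alpha), Y) \} \quad \text{a.s. } Y \sim H^\mu_\alpha, \ \forall\, x \in [0, \omega].$$

Proof. As $\psi^\mu_\alpha$ is non-decreasing, $\mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \psi^\mu_\alpha(Y)) \} \ge \mathbb{1}\{ x \ge \max(r(\alpha), Y) \}$ always holds. Suppose there exists $\alpha \in A$ such that $\lambda^\mu_\alpha$ is continuous and $x \in [0, \omega]$ for which $\mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \psi^\mu_\alpha(Y)) \} > \mathbb{1}\{ x \ge \max(r(\alpha), Y) \}$ with positive probability w.r.t. $Y \sim H^\mu_\alpha$. Observe that $\psi^\mu_\alpha(x) \ge r(\alpha)$ implies $x \ge r(\alpha)$, by the assumptions made on $\psi^\mu_\alpha$. Therefore,

$$\mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \psi^\mu_\alpha(Y)) \} > \mathbb{1}\{ x \ge \max(r(\alpha), Y) \} \implies Y > x, \ x \ge r(\alpha), \ \psi^\mu_\alpha(x) \ge \psi^\mu_\alpha(Y).$$

Hence, there exists $\alpha \in A$ such that $\lambda^\mu_\alpha$ is continuous and $x \in [r(\alpha), \omega]$ for which $H^\mu_\alpha(\{ y \in [0, \omega] \mid y > x, \ \psi^\mu_\alpha(y) \le \psi^\mu_\alpha(x) \}) > 0$. As $y > x$ implies $\psi^\mu_\alpha(y) \ge \psi^\mu_\alpha(x)$, we get $H^\mu_\alpha(\{ y \in [0, \omega] \mid \psi^\mu_\alpha(y) = \psi^\mu_\alpha(x) \}) > 0$, which is a contradiction because $\psi^\mu_\alpha$ has an atom-less distribution. Hence, the lemma holds.

Consider an $\alpha$ for which $\lambda^\mu_\alpha$ is continuous. Then, the expected utility $U^\mu_\alpha(x)$ of a bidder with value $x$ in auction $\mathcal{A}$, when the values of the other agents are drawn i.i.d. from $\lambda^\mu_\alpha$ and every bidder employs strategy $\psi^\mu_\alpha$, is given by

$$\begin{aligned}
U^\mu_\alpha(x) &:= \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ \left( x - M_\alpha(\psi^\mu_\alpha(x), \{\psi^\mu_\alpha(X_i)\}_i) \right) \mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \{\psi^\mu_\alpha(X_i)\}_i) \} \right] \\
&= \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ x\, \mathbb{1}\{ x \ge \max(r(\alpha), \{X_i\}_i) \} \right] - m^\mu_\alpha(x) \\
&= x H^\mu_\alpha(x)\, \mathbb{1}\{ x \ge r(\alpha) \} - m^\mu_\alpha(x),
\end{aligned}$$

where $m^\mu_\alpha(x) = \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ M_\alpha(\psi^\mu_\alpha(x), \{\psi^\mu_\alpha(X_i)\}_i)\, \mathbb{1}\{ \psi^\mu_\alpha(x) \ge \max(r(\alpha), \{\psi^\mu_\alpha(X_i)\}_i) \} \right]$ and the second equality follows from Lemma 13. Then, from the arguments given in Section 5.1.2 of Krishna (2009), we get

$$U^\mu_\alpha(x) = \int_0^x H^\mu_\alpha(s)\, \mathbb{1}\{ s \ge r(\alpha) \}\, ds = \mathbb{1}\{ x \ge r(\alpha) \} \int_{r(\alpha)}^x H^\mu_\alpha(s)\, ds,$$

which further implies

$$m^\mu_\alpha(x) = x H^\mu_\alpha(x)\, \mathbb{1}\{ x \ge r(\alpha) \} - U^\mu_\alpha(x) = \mathbb{1}\{ x \ge r(\alpha) \} \left( x H^\mu_\alpha(x) - \int_{r(\alpha)}^x H^\mu_\alpha(s)\, ds \right).$$

Then, using Lemma 12 and Lemma 13, the value that the objective function of the dual problem of $R^\mu(w, B)$ takes at $t \ge 0$ is

$$\begin{aligned}
&\max_{b : A \to \mathbb{R}_{\ge 0}} \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( w^T\alpha - (1+t)\, M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] + tB \\
&\quad = (1+t) \max_{b : A \to \mathbb{R}_{\ge 0}} \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] + tB \\
&\quad = (1+t)\, \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha\left( \psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right), \{\Psi^\mu(\theta_i, \alpha)\}_i \right) \right) \mathbb{1}\left\{ \psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right\} \right] + tB \\
&\quad = (1+t)\, \mathbb{E}_\alpha \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ \left( \frac{w^T\alpha}{1+t} - M_\alpha\left( \psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right), \{\psi^\mu_\alpha(X_i)\}_i \right) \right) \mathbb{1}\left\{ \psi^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \ge \max(r(\alpha), \{\psi^\mu_\alpha(X_i)\}_i) \right\} \right] + tB \\
&\quad = (1+t)\, \mathbb{E}_\alpha\left[ U^\mu_\alpha\left( \frac{w^T\alpha}{1+t} \right) \right] + tB \\
&\quad = (1+t)\, \mathbb{E}_\alpha\left[ \mathbb{1}\left\{ \frac{w^T\alpha}{1+t} \ge r(\alpha) \right\} \int_{r(\alpha)}^{\frac{w^T\alpha}{1+t}} H^\mu_\alpha(s)\, ds \right] + tB \\
&\quad = q^\mu(w, B, t).
\end{aligned}$$

Hence, we have shown that the dual optimization problem is identical to the one derived for first-price auctions. The proof of Theorem 5 crucially hinges on this fact. The rest of the proof follows that of Theorem 2.

Proof of Theorem 5.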

By Lemma 6, we know that if $\mu \in C^*(\mu)$, then $\mu(w, B) \in \arg\min_{t \in [0, \omega/B]} q^\mu(w, B, t)$ almost surely w.r.t. $(w, B) \sim G$. Moreover, by part (b) of Lemma 8, we have $\mu(w, B) \in \arg\min_{t \in [0, \infty)} q^\mu(w, B, t)$. Consider a $\theta = (w, B) \in \Theta'$ (see Definition 5) for which $\mu(w, B) \in \arg\min_{t \in [0, \infty)} q^\mu(w, B, t)$. Observe that such $\theta$ form a subset which has measure one under $G$. According to Theorem 5.1.5 from Bertsekas et al. (1998), in order to prove that $\Psi^\mu(w, B, \alpha)$ (as a function of $\alpha$) is an optimal solution for the optimization problem $R^\mu(w, B)$, it suffices to show the following conditions:

(i) Primal feasibility:
$$\mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ M_\alpha(\Psi^\mu(w, B, \alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i)\, \mathbb{1}\{ \Psi^\mu(w, B, \alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] \le B$$

(ii) Dual feasibility: $\mu(w, B) \ge 0$

(iii) $\Psi^\mu(w, B, \alpha)$ (as a function of $\alpha$) is an optimal solution for
$$\max_{b : A \to \mathbb{R}_{\ge 0}} \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \left( w^T\alpha - (1 + \mu(w, B))\, M_\alpha(b(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \right) \mathbb{1}\{ b(\alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] + \mu(w, B)\, B$$

(iv) Complementary slackness:
$$\mu(w, B) \cdot \left\{ B - \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ M_\alpha(\Psi^\mu(w, B, \alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i)\, \mathbb{1}\{ \Psi^\mu(w, B, \alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] \right\} = 0$$

First, we simplify the expression for the expected expenditure used in the sufficient conditions (i)-(iv) stated above:

$$\begin{aligned}
&\mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ M_\alpha(\Psi^\mu(\theta, \alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i)\, \mathbb{1}\{ \Psi^\mu(\theta, \alpha) \ge \max(r(\alpha), \{\Psi^\mu(\theta_i, \alpha)\}_i) \} \right] \\
&\quad = \mathbb{E}_\alpha \mathbb{E}_{\{X_i\}_{i=1}^{n-1} \sim \lambda^\mu_\alpha}\left[ M_\alpha\left( \psi^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right), \{\psi^\mu_\alpha(X_i)\}_i \right) \mathbb{1}\left\{ \psi^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) \ge \max(r(\alpha), \{\psi^\mu_\alpha(X_i)\}_i) \right\} \right] \\
&\quad = \mathbb{E}_\alpha\left[ m^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) \right] \\
&\quad = \mathbb{E}_\alpha\left[ \left( \frac{w^T\alpha}{1 + \mu(w, B)}\, H^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) - \int_{r(\alpha)}^{\frac{w^T\alpha}{1 + \mu(w, B)}} H^\mu_\alpha(s)\, ds \right) \mathbb{1}\left\{ \frac{w^T\alpha}{1 + \mu(w, B)} \ge r(\alpha) \right\} \right] \\
&\quad = \mathbb{E}_\alpha\left[ \sigma^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) H^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) \mathbb{1}\left\{ \frac{w^T\alpha}{1 + \mu(w, B)} \ge r(\alpha) \right\} \right] \\
&\quad = \mathbb{E}_{\alpha, \{\theta_i\}_{i=1}^{n-1}}\left[ \beta^\mu(\theta, \alpha)\, \mathbb{1}\{ \beta^\mu(\theta, \alpha) \ge \max(r(\alpha), \{\beta^\mu(\theta_i, \alpha)\}_i) \} \right].
\end{aligned}$$

Observe that the last term equals the expected payment made by buyer type $(w, B)$ in the SFPE determined by pacing function $\mu$. Hence, Theorem 5 will follow if we establish the aforementioned sufficient conditions (i)-(iv). Note that $\mu(w, B)$ satisfies the following first-order conditions of optimality:

$$\frac{\partial q^\mu(w, B, \mu(w, B))}{\partial t} \ge 0, \qquad \mu(w, B) \ge 0, \qquad \mu(w, B) \cdot \frac{\partial q^\mu(w, B, \mu(w, B))}{\partial t} = 0. \tag{C-2}$$

Using Lemma 10, we can write

$$\frac{\partial q^\mu(w, B, \mu(w, B))}{\partial t} = B - \mathbb{E}_\alpha\left[ \sigma^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) H^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) \mathbb{1}\left\{ \frac{w^T\alpha}{1 + \mu(w, B)} \ge r(\alpha) \right\} \right].$$

To establish the sufficient conditions (i)-(iv), observe that (after simplification) conditions (i), (ii) and (iv) are the same as (C-2), and condition (iii) is a direct consequence of Lemma 12, thereby completing the proof of Theorem 5.
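The payment identity $m^\mu_\alpha(x) = \mathbb{1}\{x \ge r(\alpha)\}\left(x H^\mu_\alpha(x) - \int_{r(\alpha)}^{x} H^\mu_\alpha(s)\, ds\right)$ that drives the argument can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not part of the paper: values are i.i.d. Uniform[0, 1] with no pacing, so $H(s) = s^{n-1}$, and the two auctions compared are a first-price auction and a second-price auction with reserve $r$ and truthful bidding. All function names are hypothetical.

```python
from typing import Callable


def H(s: float, n: int) -> float:
    """CDF of the highest of n-1 i.i.d. Uniform[0,1] values (illustrative assumption)."""
    return min(max(s, 0.0), 1.0) ** (n - 1)


def integrate(f: Callable[[float], float], a: float, b: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def payment_formula(x: float, r: float, n: int) -> float:
    """1{x >= r} * (x H(x) - int_r^x H(s) ds)."""
    if x < r:
        return 0.0
    return x * H(x, n) - integrate(lambda s: H(s, n), r, x)


def first_price_payment(x: float, r: float, n: int) -> float:
    """Expected payment in a first-price auction: bid sigma(x) times win probability H(x)."""
    if x < r or H(x, n) == 0.0:
        return 0.0
    bid = x - integrate(lambda s: H(s, n), r, x) / H(x, n)
    return bid * H(x, n)


def second_price_payment(x: float, r: float, n: int) -> float:
    """Expected payment in a second-price auction with reserve r and truthful bids.

    The winner pays max(r, Y), where Y is the highest competing value.
    """
    if x < r:
        return 0.0
    density = lambda y: (n - 1) * y ** (n - 2)  # density of Y on [0, 1]
    return r * H(r, n) + integrate(lambda y: y * density(y), r, x)


if __name__ == "__main__":
    for x in (0.3, 0.5, 0.7, 0.9):
        print(x, payment_formula(x, 0.2, 3),
              first_price_payment(x, 0.2, 3),
              second_price_payment(x, 0.2, 3))
```

Under these assumptions the three quantities agree up to numerical integration error, which is the revenue-equivalence behaviour the proof relies on: the expected payment depends on the auction format only through $H$.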

D Analytical and Numerical Examples

Proof of Claim 1.

Note that $w/(1 + \mu(w, B)) = w/\|w\|$ for all $(w, B) \in \Theta$. Therefore, $w/(1 + \mu(w, B))$ is distributed uniformly on the unit ring restricted to the positive quadrant $\{ (x, y) \in \mathbb{R}^2_{\ge 0} \mid x^2 + y^2 = 1 \}$. Hence,

$$H^\mu_\alpha(s) = \mathbb{P}_{(w,B)}\left( \frac{w^T\alpha}{1 + \mu(w, B)} \le s \right) = \frac{\arcsin(s)}{\pi/2} \quad \forall\, \alpha \in A = \{e_1, e_2\}.$$

Observe that $H^\mu_\alpha$ is continuous for all $\alpha \in A$. This implies that, for all $(w, B) \in \Theta$, strong duality holds for the optimization problem $Q^\mu(w, B)$, because the proof of the results given in Section 3.3 only relied on continuity of $H^\mu_\alpha$. Therefore, to prove the claim, it suffices to show that each buyer $(w, B)$ exactly spends her budget. The total payment made by buyer $(w, B) \in \Theta$, when everyone uses $\beta^\mu$, is given by

$$\mathbb{E}_\alpha\left[ \tilde\beta^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) H^\mu_\alpha\left( \frac{w^T\alpha}{1 + \mu(w, B)} \right) \right] = \frac{1}{2} \sum_{i=1}^{2} \left[ \hat{w}_i H^\mu_{e_i}(\hat{w}_i) - \int_0^{\hat{w}_i} H^\mu_{e_i}(s)\, ds \right] = \frac{2 - \hat{w}_1 - \hat{w}_2}{\pi} = \frac{2\|w\| - w_1 - w_2}{\pi \|w\|}.$$
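The closed form for the total payment can be checked numerically. The sketch below assumes that $\alpha$ is uniform over $\{e_1, e_2\}$ (so the expectation over $\alpha$ is a plain average of the two coordinates), that there is no reserve price, and that $H^\mu_{e_i}(s) = \arcsin(s)/(\pi/2)$ as derived above. The function names are hypothetical and the integral is approximated with a simple midpoint rule.

```python
import math


def H(s: float) -> float:
    """CDF of the competing paced value in Claim 1: arcsin(s) / (pi / 2) on [0, 1]."""
    return math.asin(min(max(s, 0.0), 1.0)) / (math.pi / 2)


def integral_H(upper: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of H over [0, upper]."""
    h = upper / steps
    return h * sum(H((k + 0.5) * h) for k in range(steps))


def expected_payment(w1: float, w2: float) -> float:
    """(1/2) * sum_i [ w_hat_i * H(w_hat_i) - int_0^{w_hat_i} H(s) ds ].

    Assumes alpha is uniform on {e_1, e_2}, which is where the factor 1/2 comes from.
    """
    norm = math.hypot(w1, w2)
    w_hat = (w1 / norm, w2 / norm)
    return 0.5 * sum(x * H(x) - integral_H(x) for x in w_hat)


def closed_form(w1: float, w2: float) -> float:
    """(2 ||w|| - w_1 - w_2) / (pi ||w||)."""
    norm = math.hypot(w1, w2)
    return (2 * norm - w1 - w2) / (math.pi * norm)


if __name__ == "__main__":
    for w in ((3.0, 4.0), (1.0, 1.0), (1.0, 0.0)):
        print(w, expected_payment(*w), closed_form(*w))
```

The agreement follows from $\int_0^a \arcsin(s)\, ds = a \arcsin(a) + \sqrt{1 - a^2} - 1$, which gives $\hat{w}_i H(\hat{w}_i) - \int_0^{\hat{w}_i} H(s)\, ds = \frac{2}{\pi}\left(1 - \sqrt{1 - \hat{w}_i^2}\right)$, together with $\sqrt{1 - \hat{w}_1^2} = \hat{w}_2$ on the unit ring.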