Objective Variables for Probabilistic Revenue Maximization in Second-Price Auctions with Reserve
Maja R. Rudolph
Department of Computer Science, Columbia University
[email protected]

Joseph G. Ellis
Department of Electrical Engineering, Columbia University
[email protected]

David M. Blei
Data Science Institute, Departments of Computer Science and Statistics, Columbia University
[email protected]
Abstract
Many online companies sell advertisement space in second-price auctions with reserve. In this paper, we develop a probabilistic method to learn a profitable strategy to set the reserve price. We use historical auction data with features to fit a predictor of the best reserve price. This problem is delicate—the structure of the auction is such that a reserve price set too high is much worse than a reserve price set too low. To address this we develop objective variables, a new framework for combining probabilistic modeling with optimal decision-making. Objective variables are "hallucinated observations" that transform the revenue maximization task into a regularized maximum likelihood estimation problem, which we solve with an EM algorithm. This framework enables a variety of prediction mechanisms to set the reserve price. As examples, we study objective variable methods with regression, kernelized regression, and neural networks on simulated and real data. Our methods outperform previous approaches both in terms of scalability and profit.
1 Introduction

Many online companies earn money from auctions, selling advertisement space or other items. One widely used auction paradigm is second-price auctions with reserve [1]. In this paradigm, the company sets a reserve price, the minimal price at which it is willing to sell, before potential buyers cast their bids. If the highest bid is smaller than the reserve price then there is no transaction; the company does not earn money. If any bid is larger than the reserve price then the highest bidding buyer wins the auction, and the buyer pays the larger of the second highest bid and the reserve price. To maximize their profit from a specific auction, the host company wants to set the reserve price as close as possible to the (future, unknown) highest bid, but no higher.

Imagine a company which hosts second-price auctions with reserve to sell baseball cards. This auction mechanism is designed to be incentive compatible [2], which means that it is advantageous for baseball enthusiasts to bid exactly what they are willing to pay for the Stanley Kofax baseball card they are eager to own.¹ Before each auction starts the company has to set the reserve price. When companies run millions of auctions of similar items, they have the opportunity to learn how to opportunistically set the reserve price from their historical data. In other words, they can try to learn their users' value of different items, and take advantage of this knowledge to maximize profit. This is the problem that we address in this paper.

We develop a probabilistic model that predicts a good reserve price from prior features of an auction. These features might be properties of the product, such as the placement of the advertisement, properties of the potential buyers, such as each one's average past bids, or other external features, such as the time of day of the auction. Given a data set of auction features and bids, our method learns a predictor of reserve price that maximizes the profit of future auctions.

A typical solution to such real-valued prediction problems is linear regression. However, the solution to this problem is more delicate. The reason is that the revenue function for each auction—the amount of money that we make as a function of the reserve price y—is asymmetric. It remains constant up to the second-highest bid b, increases up to the highest bid B, and is zero beyond the highest bid. Formally,

$$R(y, B, b) = \begin{cases} b & \text{if } y < b \\ y & \text{if } b \le y \le B \\ 0 & \text{if } y > B. \end{cases} \quad (1)$$

Fig. 1a illustrates this function for four auctions of sports collectibles from eBay. This figure puts the delicacy into relief. The best reserve price, in retrospect, is the highest bid B. But using a regression to predict the reserve price, e.g., by using the highest bid as the response variable, neglects the important fact that overestimating the reserve price is much worse than underestimating it. For example, consider the top left panel in Fig. 1a, which might be the price of a Stanley Kofax baseball card. (Our data are anonymized, but we use this example for concreteness.) The best reserve price in retrospect is $43.03. A linear regressor is just as likely to overestimate as to underestimate and hence fails to reflect that setting the price in advance to $44.00 would yield zero earnings while setting it to $40.00 would yield the full reserve of $40.00.

[Figure 1: The revenue (a) and smoothed revenue (b) for example auctions from the eBay data set. (a) Revenue function of four auctions from the eBay data set (B = $43.03, b = $17.50; B = $34.23, b = $21.53; B = $39.83, b = $39.13; B = $45.28, b = $41.00) as a function of reserve price. In second-price auctions with reserve the revenue depends on the highest and the second highest bid (dashed lines). (b) The effect of smoothing on the revenue function of an auction from the eBay data set (B = $34.23, b = $21.53), for σ ∈ {0.1, 0.5, 1, 2, 3, 4}. The smaller σ, the closer the smoothed revenue approximates the actual revenue function.]

¹ In contrast, the auction mechanism used on eBay is not incentive compatible since the bids are not sealed. As a result, experienced bidders refrain from bidding the true amount they are willing to pay until seconds before the auction ends, to keep sale prices low.
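To make Eq. 1 concrete, here is a minimal sketch of the revenue function in Python (the function name and vectorization are ours; the example values are the top left auction of Fig. 1a):

```python
import numpy as np

def revenue(y, B, b):
    """Revenue of a second-price auction with reserve (Eq. 1).

    y: reserve price(s); B: highest bid(s); b: second-highest bid(s).
    Returns b where y < b, y where b <= y <= B, and 0 where y > B.
    """
    y, B, b = np.broadcast_arrays(y, B, b)
    return np.where(y < b, b, np.where(y <= B, y, 0.0))

# Top left auction of Fig. 1a: B = $43.03, b = $17.50.
print(revenue(40.00, 43.03, 17.50))   # 40.0: the reserve binds
print(revenue(44.00, 43.03, 17.50))   # 0.0: reserve set above the highest bid
print(revenue(10.00, 43.03, 17.50))   # 17.5: the second-highest bid is paid
```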
To solve this problem we develop a new idea, the objective variable. Objective variables use the machinery of probabilistic models to reason about difficult prediction problems, such as one that seeks to optimize Eq. 1. Specifically, objective variables enable us to formulate probabilistic models for which MAP estimation directly uncovers profitable decision-making strategies. We develop and study this technique to set the reserve price in second-price auctions.

In more detail, our aim is to find a parameterized mechanism f(x_i; w) to set the reserve price from the auction features x_i. In our study, we will consider a linear predictor, kernelized regression, and a neural network. We observe a historical data set of N auctions that contains features x_i and the auction's two highest bids B_i and b_i; we would like to learn a good mechanism by optimizing the parameter w to maximize the total (retrospective) revenue $\sum_{i=1}^{N} R(f(x_i; w), B_i, b_i)$.

We solve this optimization problem by turning it into a maximum a posteriori (MAP) problem. For each auction we define new binary variables—these are the objective variables—that are conditional on a reserve price. The probability of the objective variable being on (i.e., equal to one) is related to the revenue obtained from the reserve price; it is more likely on if the auction produces more revenue. We then set up a model that first assumes each reserve price is drawn from the parameterized mechanism f(x_i; w) and then draws the corresponding objective variable. Note that this model is defined conditioned on our data, the features and the bids. It is a model of the objective variables.

With the model defined, we now imagine a "data set" where all of the objective variables are on, and then fit the parameters w subject to these data. Because of how we defined the objective variables, the model will prefer more profitable settings of the parameters. With this setup, fitting the parameters by MAP estimation is equivalent to finding the parameters that maximize revenue.

The spirit of this technique is that the objective variables are likely to be on when we make good decisions, that is, when we profit from our setting of the reserve price. When we imagine that they are all on, we are imagining that we made good decisions (in retrospect). When we fit the parameters to these data, we are using MAP estimation to find a mechanism that helps us make such decisions.

We first derive our method for linear predictors of reserve price and show how to use the expectation-maximization algorithm [3] to solve our MAP problem. We then show how to generalize the approach to nonlinear predictors, such as kernel regression and neural networks. Finally, on simulated data and real-world data from eBay, we show that this approach outperforms the existing methods for setting the reserve price. It is both more profitable and more easily scales to larger data sets.
Related work. Second-price auctions with reserve were first introduced in [1]. Ref. [4] empirically demonstrates the importance of optimizing reserve prices; their study quantifies the positive impact it had on Yahoo!'s revenue. However, most previous work on optimizing the reserve price is limited in that it does not consider features of the auction [4, 5].

Our work builds on the ideas in Ref. [6]. This research shows how to learn a linear mapping from auction features to reserve prices, and demonstrates that we can increase profit when we incorporate features into the reserve-price setting mechanism. We take a probabilistic perspective on this problem, and show how to incorporate nonlinear predictors. We show in Sec. 3 that our algorithms scale better and perform better than these approaches.

The objective variable framework also relates to recent ideas from reinforcement learning to solve partially observable Markov decision processes (POMDPs) [7, 8]. Solving a POMDP amounts to finding an action policy that maximizes the expected future return. Refs. [7, 8] introduce a binary reward variable (similar to an objective variable) and use maximum likelihood estimation to find such a policy. Our work solves a different problem with similar ideas, but there are also differences between the methods. In one way, the problem in reinforcement learning is more difficult because the reward is itself a function of the learned policy; in auctions, the revenue function is known and fixed. In addition, the work in reinforcement learning focuses on simple discrete policies while we show how to use these ideas for continuously parameterized predictors.
2 Revenue Maximization with Objective Variables

We first describe the problem setting and the objective. Our data come from previous auctions. For each auction, we observe features x_i, the highest bid B_i, and the second highest bid b_i. The features represent various characteristics of the auction, such as the date, time of day, or properties of the item. For example, one of the auctions in the eBay sports collectibles data set might be for a Stanley Kofax baseball card; its features include the date of the auction and various aspects of the item, such as its condition and the average price of such cards on the open market.

When we execute an auction we set a reserve price before seeing the bids; this determines the revenue we receive after the bids are in. The revenue function (Eq. 1), which is indexed by the bids, determines how much money we make as a function of the chosen reserve price. We illustrate this function for four auctions from eBay in Fig. 1a. Our goal is to use the historical data to learn how to profitably set the reserve price from auction features, that is, before we see the bids.

For now we will use a linear function to map auction features to a good reserve price. Given the feature vector x_i, we set the reserve price with f(x_i; w) = w^⊤ x_i. (In Sec. 2.4 we consider nonlinear alternatives.) We fit the coefficients w from data, seeking the w that maximizes the regularized revenue

$$w^* = \arg\max_w \sum_{i=1}^{N} R(f(x_i; w), B_i, b_i) - (\lambda/2)\, w^\top w. \quad (2)$$

We have chosen an L2 regularization controlled by the parameter λ; other regularizers are also possible.

Before we discuss our solution to this optimization, we make two related notes. First, the previous reserve prices are not included in the data. Rather, our data tell us about the relationship between features and bids. All the information about how much we might profit from the auction is in the revenue function; the way previous sellers set the reserve prices is not relevant. Second, our goal is not the same as learning a mapping from features to the highest bid. Not all auctions are made equal: Consider the top left auction in Fig. 1a with highest and second highest bids B = $43.03 and b = $17.50, and the bottom left auction with B = $39.83 and b = $39.13. The profit margin in the first auction is much larger, so predicting the reserve price for this auction well is much more important than when the two highest bids are close to each other. We account for this by directly maximizing revenue, rather than by modeling the highest bid.
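As a sketch of evaluating the objective in Eq. 2 for a linear predictor, reusing the `revenue` function above (the function and variable names are ours):

```python
import numpy as np

def regularized_revenue(w, X, B, b, lam):
    """Objective of Eq. 2 for the linear predictor f(x; w) = w^T x.

    X: (N, d) feature matrix; B, b: (N,) arrays of the two highest bids.
    Reuses revenue() from the earlier sketch.
    """
    y = X @ w                                   # predicted reserve prices
    return revenue(y, B, b).sum() - 0.5 * lam * (w @ w)
```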
The optimization problem in Eq. 2 is difficult to solve because R(·) is discontinuous (and thus non-convex). Previous work [6] addresses this problem by iteratively fitting differences of convex (DC) surrogate functions and solving the resulting DC-program [9]. We instead define an objective function related to the revenue, but one that smooths out the troublesome discontinuity. In the next section we show how to optimize this objective with an expectation-maximization algorithm.

We first place a Gaussian distribution on the reserve price centered around the linear mapping, y_i ∼ N(f(x_i; w), σ²). We define the smoothed regularized revenue to be

$$\mathcal{L}(w) = \sum_{i=1}^{N} \log \mathbb{E}_{y_i}\left[\exp\{R(y_i, B_i, b_i)\}\right] - (\lambda/2)\, w^\top w. \quad (3)$$

Figure 1b shows one term from Eq. 3 and how – for a specific auction – the smoothed revenue becomes closer to the original revenue function as σ decreases. This approach was inspired by probit regression, where a Gaussian expectation is introduced to smooth the discontinuous 0-1 loss [10, 11].

We now have a well-defined and continuous objective function; in principle, we can use gradient methods to fit the parameters. However, we will fit them by recasting the problem as a regularized likelihood under a latent variable model and then using the expectation-maximization (EM) algorithm [3]. This leads to closed-form updates in both the E and M steps, and facilitates replacing linear regression with a nonlinear predictor.

To reformulate our optimization problem, we introduce the idea of the objective variable. Objective variables are part of a probabilistic model for which MAP estimation recovers the parameter w that maximizes the smoothed revenue in Eq. 3. Specifically, we define binary variables z_i for each auction, each conditioned on the reserve price y_i, the highest bid B_i, and the next bid b_i. We can interpret these variables to indicate "Is the auction host satisfied with the outcome?" Concretely, the likelihood of satisfaction is related to how profitable the auction was relative to the maximum profit, p(z_i = 1 | y_i, B_i, b_i) = π(y_i, B_i, b_i), where

$$\pi(y_i, B_i, b_i) = \exp\{-(B_i - R(y_i, B_i, b_i))\}. \quad (4)$$

The revenue function R(·) is in Eq. 1. The revenue is bounded by B_i; thus the probability is in (0, 1]. Our strategy is to imagine a data set in which every objective variable is on and then to fit w to maximize the posterior conditioned on this "hallucinated data". Fig. 2b provides visual intuition for why the modes of the posterior are profitable. For fixed w the posterior of y_i is proportional to the product of its prior, centered at f(x_i; w), and the likelihood of the objective variable (Eq. 4), which captures the profitability of each possible reserve price prediction.

Consider the following model,

$$w \sim \mathcal{N}(0, \lambda^{-1} I) \quad (5)$$
$$y_i \mid w, x_i \sim \mathcal{N}(f(x_i; w), \sigma^2), \quad i \in \{1, \ldots, N\} \quad (6)$$
$$z_i \mid y_i, B_i, b_i \sim \text{Bernoulli}(\pi(y_i, B_i, b_i)), \quad (7)$$

where f(x_i; w) = x_i^⊤ w is a linear map (for now). This is illustrated as a graphical model in Fig. 2a.

Now consider a data set z where all of the objective variables z_i are equal to one. Conditional on these data, the log posterior of w marginalizes out the latent reserve prices y_i,

$$\log p(w \mid z, x, B, b) = \log p(w \mid \lambda) + \sum_{i=1}^{N} \left( \log \mathbb{E}\left[\exp\{R(y_i, B_i, b_i)\}\right] - B_i \right) - C, \quad (8)$$

where C is the normalizer. This is the smoothed revenue of Eq. 3 plus a constant involving the top bids B_i in Eq. 4, constant components of the prior on w, and the normalizer.
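Before turning to the EM algorithm, here is a minimal numerical sketch of Eqs. 3 and 4, assuming our reconstruction of the garbled equations above. The expectation in Eq. 3 is approximated by Monte Carlo here, rather than by the closed form derived later; names are ours, and `revenue` is the earlier sketch:

```python
import numpy as np
from scipy.special import logsumexp

def ov_likelihood(y, B, b):
    """p(z = 1 | y, B, b) = exp{-(B - R(y, B, b))}, Eq. 4; lies in (0, 1]."""
    return np.exp(-(B - revenue(y, B, b)))

def smoothed_revenue(w, X, B, b, sigma, lam, n_samples=10_000, seed=0):
    """Monte Carlo estimate of the smoothed regularized revenue, Eq. 3."""
    rng = np.random.default_rng(seed)
    mu = X @ w                                          # prior means, (N,)
    y = mu + sigma * rng.standard_normal((n_samples, mu.size))
    # log E[exp{R(y_i)}] estimated stably by logsumexp over the samples
    log_terms = logsumexp(revenue(y, B, b), axis=0) - np.log(n_samples)
    return log_terms.sum() - 0.5 * lam * (w @ w)
```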
Thus, we can optimize the smoothed revenue by taking MAP estimates of w.

As we mentioned above, we have defined variables corresponding to the auction host's satisfaction. With historical data of auction attributes and bids, we imagine that the host was satisfied with every auction. When we fit w, we ask for the reserve-price-setting mechanism that leads to such an outcome.

The EM algorithm is a technique for maximum likelihood estimation in the face of hidden variables [3]. (When there are regularizers, it is a technique for MAP estimation.) In the E-step, we compute the posterior distribution of the hidden variables given the current model settings; in the M-step, we maximize the expected complete regularized log likelihood, where the expectation is taken with respect to the previously computed posterior.

[Figure 2: The objective variable framework transforms the revenue maximization task into a MAP estimation task. The model and the hallucinated data are designed such that the modes of the model's posterior are the local maxima of the smoothed revenue in Eq. 3. (a) The objective variable model (OV model). The objective variable is shaded with diagonal lines to distinguish that its value is not observed but rather set to our desired value. (b) For fixed w the posterior of the latent reserve price (red) is proportional to the prior p(y_i | w*) (blue) times the likelihood of the objective p(z_i = 1 | y_i) (green), shown for the four example auctions of Fig. 1a. MAP estimation uncovers profitable modes of the posterior.]

In the OV model, the latent variables are the reserve prices y; the observations are the objective variables z; and the model parameters are the coefficients w. We compute the posterior expectation of the latent reserve prices in the E-step and fit the model parameters in the M-step. This is a coordinate ascent algorithm on the expected complete regularized log likelihood of the model and the data. Each E-step tightens the bound on the likelihood and the new bound is then optimized in the M-step.
E-step. At iteration t, the E-step computes the conditional distribution of the latent reserve prices y_i given the objective variables z_i = 1 and the parameters w^{(t−1)} of the previous iteration. It is

$$p(y_i \mid z_i = 1, w^{(t-1)}) \propto p(z_i = 1 \mid y_i)\, p(y_i \mid w^{(t-1)}) \quad (9)$$
$$\propto \exp\{-(B_i - R(y_i, B_i, b_i))\}\, \phi\!\left(\frac{y_i - f(x_i; w^{(t-1)})}{\sigma}\right), \quad (10)$$

where φ(·) is the pdf of the standard normal distribution. The normalizing constant is in the appendix in Eq. 15; we compute it by integrating Eq. 9 over the real line. We can then compute the posterior expectation E[y_i | z_i, w^{(t−1)}] by using the moment generating function. (See Eq. 18, Sec. A.)
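The closed form of this expectation is given in the appendix (Eqs. 15-18). As an illustration, the same posterior expectation for a single auction can be approximated by numerical quadrature (a sketch; the helper name is ours):

```python
import math
from scipy import integrate
from scipy.stats import norm

def e_step_expectation(mu, sigma, B, b):
    """E[y | z = 1] for one auction under the posterior of Eqs. 9-10,
    with prior mean mu = f(x; w^{(t-1)}), by numerical quadrature."""
    def unnorm(y):
        r = b if y < b else (y if y <= B else 0.0)      # R(y, B, b), Eq. 1
        return math.exp(-(B - r)) * norm.pdf((y - mu) / sigma)
    lo = min(b, mu) - 10 * sigma                         # effective support
    hi = max(B, mu) + 10 * sigma
    Z, _ = integrate.quad(unnorm, lo, hi, points=[b, B])           # cf. Eq. 15
    m, _ = integrate.quad(lambda y: y * unnorm(y), lo, hi, points=[b, B])
    return m / Z
```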
M-step. The M-step maximizes the complete joint log-likelihood with respect to the model parameters w. When we use a linear predictor to set the reserve prices, i.e., f(x_i; w) = x_i^⊤ w, the M-step has a closed-form update, which amounts to ridge regression against the response variables E[y_i | z_i, w^{(t−1)}] (Eq. 18) computed in the E-step. The update is

$$w^{(t)} = \left(\lambda I + \frac{1}{\sigma^2}\, x^\top x\right)^{-1} \frac{1}{\sigma^2}\, x^\top\, \mathbb{E}\left[y \mid z, w^{(t-1)}\right], \quad (11)$$

where E[y | z, w^{(t−1)}] denotes the vector with i-th entry E[y_i | z, w^{(t−1)}] and similarly x is a matrix of all feature vectors x_i.
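A sketch of the update in Eq. 11 (names are ours):

```python
import numpy as np

def m_step_linear(X, y_expect, sigma, lam):
    """Ridge-regression M-step of Eq. 11: regress E[y_i | z_i] on features."""
    d = X.shape[1]
    A = lam * np.eye(d) + (X.T @ X) / sigma**2
    return np.linalg.solve(A, X.T @ y_expect / sigma**2)
```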
Algorithm details. To initialize, we set the expected reserve prices to the highest bids, E[y_i | z_i] = B_i, and run an M-step. The algorithm then alternates between updating the weights using Eq. 11 in the M-step and integrating out the latent reserve prices in the E-step. The algorithm terminates when the change in revenue on a validation set is below a threshold.

The E-step is linear in the number of auctions N and can be parallelized, since the expected reserve prices are conditionally independent in our model. The least squares update has asymptotic complexity O(d²N), where d is the number of features.

2.4 Nonlinear Objective Variable Models

One of the advantages of our EM algorithm is that we can change the parameterized prediction technique f(x_i; w) with which we map auction features to the mean of the reserve price. So far we have only considered linear predictors; here we show how we can adapt the algorithm to nonlinear predictors. As we will see in Sec. 3, these nonlinear predictors outperform the linear predictors.

In our framework, much of the model in Fig. 2a and the corresponding algorithm remains the same even when considering nonlinear predictors. The distribution of the objective variables is unchanged (Eq. 4), as is the E-step update in the EM algorithm (Eq. 18). All of the changes are in the M-step.
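Before turning to the nonlinear variants, a sketch of the full EM loop for the linear case, combining the sketches above (not the authors' implementation; for brevity it checks convergence on training revenue, whereas the paper monitors a validation set):

```python
import numpy as np

def fit_ov_regression(X, B, b, sigma, lam, n_iters=50, tol=1e-6):
    """EM for the linear OV model, using the helper sketches above."""
    w = m_step_linear(X, B, sigma, lam)       # init with E[y_i | z_i] = B_i
    prev = -np.inf
    for _ in range(n_iters):
        y_expect = np.array([e_step_expectation(m, sigma, Bi, bi)    # E-step
                             for m, Bi, bi in zip(X @ w, B, b)])
        w = m_step_linear(X, y_expect, sigma, lam)                   # M-step
        cur = revenue(X @ w, B, b).sum()
        if abs(cur - prev) < tol:
            break
        prev = cur
    return w
```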
Kernel regression. Kernel regression [12] maps the features x_i into a higher dimensional space through a feature map ψ(·); the mechanism for setting the reserve price becomes f(x_i; w) = ψ(x_i)^⊤ w. In kernel regression we work with the N × N Gram matrix K of inner products, where K_{ij} = ψ(x_i)^⊤ ψ(x_j). In this work we use a polynomial kernel of degree D, and thus compute the Gram matrix without evaluating the feature map ψ(·) explicitly, K = (x^⊤ x + 1)^D.

Rather than learning the weights directly, kernel methods operate in the dual space α ∈ R^N. If K_i is the i-th column of the Gram matrix, then the mean of the reserve price is

$$f(x_i; w) = \psi(x_i)^\top w = K_i^\top \alpha. \quad (12)$$

The corresponding M-step in the algorithm becomes

$$\alpha^{(t)} = \left(\frac{1}{\sigma^2}\, K + \lambda I_N\right)^{-1} \frac{1}{\sigma^2}\, \mathbb{E}\left[y \mid z, \alpha^{(t-1)}\right]. \quad (13)$$

See [13] for the technical details around kernel regression.

We will demonstrate in Sec. 3 that replacing linear regression with kernel regression can lead to better reserve price predictions. However, working with the Gram matrices comes at a computational cost, and we consider neural networks as a scalable alternative for infusing nonlinearity into the model.
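A sketch of the kernelized pieces (Eqs. 12-13); the function names are ours:

```python
import numpy as np

def gram_poly(X1, X2, degree):
    """Polynomial-kernel Gram matrix, K_ij = (x_i^T x_j + 1)^D."""
    return (X1 @ X2.T + 1.0) ** degree

def m_step_kernel(K, y_expect, sigma, lam):
    """Dual M-step of Eq. 13: alpha = (K/sigma^2 + lam I)^{-1} E[y]/sigma^2."""
    N = K.shape[0]
    return np.linalg.solve(K / sigma**2 + lam * np.eye(N), y_expect / sigma**2)

# Predicted reserve prices for new auctions, Eq. 12:
# reserves = gram_poly(X_new, X_train, degree) @ alpha
```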
Neural networks. We also explore an objective variable model that uses a neural network [14] to set the mean reserve prices. We use a network with one hidden layer of H units and activation function tanh(·). The parameters of the neural net are the weights of the first layer and the second layer, w = {w^{(1)} ∈ R^{H×d}, w^{(2)} ∈ R^{1×H}}. The mean of the reserve price is

$$f(x_i; w) = w^{(2)} \tanh(w^{(1)} x_i). \quad (14)$$

The M-step is no longer analytic; instead, the network is trained using stochastic gradient methods.

3 Empirical Study

We studied our algorithms with two simulated data sets and a large collection of real-world auction data from eBay. In each study, we fit a model on a subset of the data (using a validation set to set hyperparameters) and then test how profitable we would be if we used the fitted model to set reserve prices on a held-out set. Our objective variable methods outperformed the existing state of the art.
Data sets and replications.
We evaluated our method on both simulated data and real-world data.

• Linear simulated data. Our simplest simulated data set contains d = 5 auction features. We drew features x_i ∼ N(0, I) ∈ R^d for 2,000 auctions; we drew a ground truth weight vector ŵ ∼ N(0, I) ∈ R^d and an intercept α ∼ N(0, 1). We drew highest bids B_i ∼ N(ŵ^⊤ x_i + α, 0.1) and set the second bids b_i = B_i/2. (Data for which B_i is negative are discarded and re-drawn.) We split into N_train = 1000 and N_valid = N_test = 500.

• Nonlinear simulated data. These data contain features x_i, true coefficients ŵ, and intercept α generated as for the linear data. We generate highest bids by taking the absolute value of those generated by the regression and second highest bids by halving them, as above. Taking the absolute value introduces a nonlinear relationship between features and bids.

• Data from eBay. Our real-world data are auctions of sports collectibles from eBay. (The data set comes from http://cims.nyu.edu/~munoz/data/index.html.) There are d = 74 features. All covariates are centered and rescaled to have mean zero and standard deviation one. We analyze two data sets from eBay, one small and one large. On the small data set, the total number of auctions is 6,000, with N_train = N_valid = N_test = 2,000. On the large data set, N_train = 50,000 and N_valid = N_test = 10,000.
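A sketch of the linear simulated data described in the first bullet above (our reconstruction; the intercept prior N(0, 1) and the reading of 0.1 as the noise scale are assumptions where the source is garbled):

```python
import numpy as np

def simulate_linear_auctions(n=2000, d=5, seed=0):
    """Linear simulated auctions: B_i ~ N(w^T x_i + alpha, 0.1), b_i = B_i/2."""
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(d)              # ground-truth weights
    alpha = rng.standard_normal()                # intercept, assumed N(0, 1)
    X = rng.standard_normal((n, d))
    B = rng.normal(X @ w_true + alpha, 0.1)
    while (B < 0).any():                         # discard and redraw negatives
        neg = B < 0
        X[neg] = rng.standard_normal((neg.sum(), d))
        B[neg] = rng.normal(X[neg] @ w_true + alpha, 0.1)
    b = B / 2.0                                  # second bids are half of B
    return X, B, b
```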
Table 1: The performance of the EM algorithms from Sec. 2 (OV Regression, OV Kernel Regression with degree 2 and 4, OV Neural Networks) against the current state of the art (DC [6] and NoF [5]), on the linear and nonlinear simulated data and the small and large eBay data sets. We report results in terms of percentage of maximum possible revenue (computed by an oracle that knows the highest bid in advance). For each data set, we report the mean and standard error aggregated from ten train/validation/test splits. Our methods outperform the existing methods on all data. [The numeric entries are not recoverable from the source; DC [6] has no entry on the large eBay data set because it does not scale to it.]

Algorithms. We describe the objective variable algorithms from Sec. 2, all of which we implemented in Theano [15, 16], as well as the two previous methods we compare against.

• OV Regression. OV Regression learns a linear predictor w for reserve prices using the algorithm in Sec. 2.3. We find a good setting for the smoothing parameter σ and the regularization parameter λ using grid search.

• OV Kernel Regression. OV Kernel Regression uses a polynomial kernel to predict the mean of the reserve price; we study polynomial kernels of degree 2 and 4.

• OV Neural Network. OV Neural Network fits a neural net for predicting the reserve prices. As we discussed in Sec. 2.4, the M-step uses gradient optimization; we used stochastic gradient ascent with a constant learning rate and early stopping [17]. Further, we used a warm-start approach, where the next M-step is initialized with the results of the previous M-step. We set the number of hidden units to H = 5 for the simulated data and H = 100 for the eBay data. We use grid search to set the smoothing parameter σ, the regularization parameters, the learning rate, the batch size, and the number of passes over the data for each M-step.

• Difference of Convex Functions (DC) [6]. The DC algorithm finds a linear predictor of reserve price with an iterative procedure based on DC-programming [9]. Grid search is used on the regularization parameter as well as the margin to select the surrogates for the auction loss.

• No Features (NoF) [5]. This is the state-of-the-art approach to set the reserve prices when we do not consider the auction's features. The algorithm iterates over the highest bids in the training set and evaluates the profitability of setting all reserve prices to this value on the training set. Ref. [6] gives a more efficient algorithm based on sorting.

Results. Tab. 1 gives the results of our study. The metric is the percentage of the highest possible revenue, where an oracle anticipates the bids and sets the reserve price to the highest bid.

A trivial strategy (not reported) sets all reserve prices to zero, and thus earns the second highest bid on each auction. The algorithm using no features [5] does slightly better than this, but not as well as the algorithms which use features. OV Regression [this paper] and DC [6] both fit linear mappings and exhibit similar performance. However, the DC algorithm does not scale to the large eBay data set.

The nonlinear OV algorithms (OV Kernel Regression and OV Neural Networks) outperform the linear models on the nonlinear simulated data and the real-world data. Note that the kernel algorithms do not scale to the large eBay data set because working with the Gram matrix becomes infeasible as the training set gets large. OV Neural Networks significantly outperforms the existing methods on the real-world data. This is a viable solution to maximizing profit from historical auction data.
4 Conclusion

We developed the objective variable framework for combining probabilistic modeling with optimal decision making. We used this method to solve the problem of how to set the reserve price in second-price auctions. Our algorithms scaled better and outperformed the current state of the art on both simulated and real-world data.
A Appendix - Update Equations for EM
The normalizing constant C_i of Eq. 9 can be computed by integrating Eq. 9 over the real line. Let μ_i = f(x_i; w^{(t−1)}). Up to a constant factor of e^{B_i}, the normalizing constant C_i then equals

$$C_i e^{B_i} = \int_{-\infty}^{b_i} e^{b_i}\,\phi\!\Big(\frac{y_i-\mu_i}{\sigma}\Big)\,dy_i + \int_{b_i}^{B_i} e^{y_i}\,\phi\!\Big(\frac{y_i-\mu_i}{\sigma}\Big)\,dy_i + \int_{B_i}^{\infty} \phi\!\Big(\frac{y_i-\mu_i}{\sigma}\Big)\,dy_i \quad (15)$$

$$= \sigma e^{b_i}\,\Phi\!\Big(\frac{b_i-\mu_i}{\sigma}\Big) + \sigma e^{\mu_i+\sigma^2/2}\Big[\Phi\!\Big(\frac{B_i-(\mu_i+\sigma^2)}{\sigma}\Big) - \Phi\!\Big(\frac{b_i-(\mu_i+\sigma^2)}{\sigma}\Big)\Big] + \sigma\Big[1-\Phi\!\Big(\frac{B_i-\mu_i}{\sigma}\Big)\Big]. \quad (16)$$

Computing the expectation of the latent reserve price E[y_i] entails evaluating the moment generating function M_i(s) = E[e^{s y_i}], where the expectation is taken w.r.t. the posterior p(y_i | z_i = 1, w^{(t−1)}). Taking the derivative with respect to s and setting s = 0 then yields the desired expectation:

$$\mathbb{E}[y_i] = \left.\frac{dM_i(s)}{ds}\right|_{s=0} \quad (17)$$

$$\begin{aligned}
&= \frac{\sigma e^{b_i}}{C_i e^{B_i}}\,\mu_i\,\Phi\!\Big(\frac{b_i-\mu_i}{\sigma}\Big) - \frac{\sigma^2 e^{b_i}}{C_i e^{B_i}}\,\phi\!\Big(\frac{b_i-\mu_i}{\sigma}\Big) \\
&\quad + \frac{\sigma}{C_i e^{B_i}}(\mu_i+\sigma^2)\,e^{\mu_i+\sigma^2/2}\,\Phi\!\Big(\frac{B_i-(\mu_i+\sigma^2)}{\sigma}\Big) - \frac{\sigma}{C_i e^{B_i}}(\mu_i+\sigma^2)\,e^{\mu_i+\sigma^2/2}\,\Phi\!\Big(\frac{b_i-(\mu_i+\sigma^2)}{\sigma}\Big) \\
&\quad + \frac{\sigma}{C_i e^{B_i}}\,\mu_i\Big[1-\Phi\!\Big(\frac{B_i-\mu_i}{\sigma}\Big)\Big] - \frac{\sigma^2}{C_i e^{B_i}}\,e^{\mu_i+\sigma^2/2}\Big[\phi\!\Big(\frac{B_i-(\mu_i+\sigma^2)}{\sigma}\Big) - \phi\!\Big(\frac{b_i-(\mu_i+\sigma^2)}{\sigma}\Big)\Big] + \frac{\sigma^2}{C_i e^{B_i}}\,\phi\!\Big(\frac{B_i-\mu_i}{\sigma}\Big). \quad (18)
\end{aligned}$$

References

[1] D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, NY, USA, 2010.

[2] Z. Bar-Yossef, K. Hildrum, and F. Wu. Incentive-compatible online auctions for digital goods. In
ACM-SIAM Symposium on Discrete Algorithms, pages 964–970, 2002.

[3] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.

[4] M. Ostrovsky and M. Schwarz. Reserve prices in internet advertising auctions: A field experiment. In ACM Conference on Electronic Commerce, pages 59–60, 2011.

[5] N. Cesa-Bianchi, C. Gentile, and Y. Mansour. Regret minimization for reserve prices in second-price auctions. In ACM-SIAM Symposium on Discrete Algorithms, pages 1190–1204, 2013.

[6] A. Medina and M. Mohri. Learning theory and algorithms for revenue optimization in second price auctions with reserve. In International Conference on Machine Learning, 2014.

[7] M. Toussaint, S. Harmeling, and A. Storkey. Probabilistic inference for solving (PO)MDPs. 2006.

[8] M. Toussaint, L. Charlin, and P. Poupart. Hierarchical POMDP controller optimization by likelihood maximization. In UAI, volume 24, pages 562–570, 2008.

[9] P. Tao and L. An. A DC optimization algorithm for solving the trust-region subproblem. SIAM Journal on Optimization, 8(2):476–505, 1998.

[10] J. Albert and S. Chib. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422):669–679, 1993.

[11] C. Holmes, L. Held, et al. Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Analysis, 1:145–168, 2006.

[12] A. Aizerman, E. Braverman, and L. Rozoner. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, 25, 1964.

[13] C. Bishop et al. Pattern Recognition and Machine Learning, volume 4. Springer New York, 2006.

[14] C. Bishop et al. Neural Networks for Pattern Recognition. 1995.

[15] J. Bergstra et al. Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), June 2010. Oral presentation.

[16] F. Bastien et al. Theano: new features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, 2012.

[17] L. Prechelt. Early stopping - but when? In Neural Networks: Tricks of the Trade. Springer, 1998.