Approximating the Value Functions of Stochastic Knapsack Problems: A Homogeneous Monge-Ampére Equation and Its Stochastic Counterparts
aa r X i v : . [ m a t h . O C ] M a y Approximating the Value Functions of Stochastic KnapsackProblems: A Homogeneous Monge-Amp ´ere Equation andIts Stochastic Counterparts
Yingdong LuIBM T.J. Watson Research CenterYorktown Heights, NY 10598
ABSTRACT
Stochastic knapsack problem originally was a versatile model for controls in telecom-munication networks. Recently, it draws attentions of revenue management community byserving as a basic model for allocating resources over time. We develop approximationschemes for knapsack problems in this paper, a system of nonlinear but solvable partialdifferential equations and stochastic partial differential equation are shown to be the limit ofthe process that following the optimal solution of the stochastic knapsack problem.
Keywords: stochastic knapsack problem, fluid limit, diffusion approximation, Monge-Amp ´ereeqnation.
Stochastic knapsack problem(a.k.a. dynamic knapsack problem) is a typical mathematicalmodel for sequential resource allocation, see, e.g. ( ? ). Similar model was proposed also forqueueing control problems, see, e.g. ( ? ), ( ? ) and ( ? ). In the 1990’s, this type of models wereintensively studied in the analysis and design of telecommunication networks, see, e.g. ( ? )and ( ? ). Lately, the stochastic knapsack problem drew considerable attentions in the opera-tions research community due to their applications in revenue (yield) management, see, ( ? )and references therein. The solution of the problem can generally obtained through dynamicprogramming. However, these stochastic knapsack problems are usually only componentsof a larger optimal control problem. For example, in many revenue management problems,stochastic knapsack serves as a realistic model for pricing activities. In order to achieve over-all financial objectives, a pricing model have to be tied with manufacturing and supply chainmanagement. In these cases, constantly checking the look-up table generated by dynamicprogramming will not be feasible. Efficient approximations in more explicit forms are henceneeded. In this paper, we intend to study the first and second order approximations in conciseform.he methodology we follow is basically the fluid and diffusion approximation approaches thatderived from functional law of large numbers and functional central limit theorems. Thesemethods have been applied very successfully in the study of queues and queueing networks,see, e.g. ( ? ). Our study can be viewed as a further enrichment to this body of research. Inqueueing context, the limit process are usually functional of simple differential equations and(reflected) linear differential equations. The limit we obtain is a solution of Monge-Amp ´ereequation, one of the most important nonlinear partial differential equations. Monge-Amp ´ereequation was linked to crucial quantities in geometry, was extensively studied by researchersin various branches of mathematics. We also hope that this could lead to further interactionsbetween related fields.The specific setup of the problem under consideration is the following, at time t = 0 , we have W units of resource available, at any time t = 1 , · · · , T , a request(demand) arrives in the formof a bivariate random vector, ( P t , Q t ) , where the two components represent the unit offer priceand the required quantity; at each time t , after the realization of the request, we decide whetherto accept or reject it to maximize the average revenue collected by time T . Furthermore, weassume that, • Suppose ( P t , Q t ) , for each t (and i.i.d. across t ) follows a joint (discrete) distribution:P [ P t = p i , Q t = q j ] := θ ij , i = 0 , , ..., ℓ ; j = 1 , ..., k. (1.1)Here, p = 0 , and θ := P j θ j represents the probability that there is no arrival (re-quest/demand) in a given period; • No partial fulfillment is allowed.The decision of accept/rejection at each time t , given that the available resource level is d , isdetermined by the following stochastic dynamic programming, V ( t, d ) = V ( t + 1 , d )[ θ + Θ( d )]+ X i =0 X j : q j ≤ d θ ij · max { p i q j + V ( t + 1 , d − q j ) , V ( t + 1 , d ) } , (1.2)where Θ( d ) := X i =0 X j : q j >d θ ij . (1.3)Clearly, the first term on the right hand side of (1.2) corresponds to the case of either no arrivalor the request size exceeds the available inventory; whereas each term under the summationcompares the two actions: accept (i.e., supply) the request, or reject it. If we supply the request,then we earn the revenue p j q j , and proceed to the next period with q j units less in the availableinventory.t the last period, we have V ( T, d ) = X i =0 X j : q j ≤ d θ ij p i q j , (1.4)since clearly the best action is to supply any possible requests using the remaining inventory.The optimal decision is hence, to accept the demand ( P t , Q t ) , if it satisfies, P t Q t + V ( t + 1 , d − Q t ) ≥ V ( t + 1 , d ); (1.5)and reject it otherwise.This problem arises from many applications involving sequentially allocating limit amount ofresources to multiple classes of demands, hence has been widely studied. The existing re-search can be divided into two groups, one focuses on the structural properties of the valuefunction, usually through Markov decision process and modularity argument, see, e.g. ( ? ), ( ? );the other focus on a variety of heuristics policies and their performance, much attentions wasdrawn some applications in the field of revenue management and dynamic pricing, which is justanother mechanism to implement the accept/rejection decision, see, e.g. ( ? ). In this paper, weexamine the asymptotic behaviors of the value function. Our approach is similar to the “fluidlimits” and “diffusion limits” methods in queueing theory, where time and space are properlyscaled to induce asymptotic behaviors of the systems. To be more specific, we scale both timeand space, which is the amount we allow to allocate, by a constant n , then let n go to infinity,and examine the asymptotic behaviors of the value function under this situation. Equivalently,we consider that following two processes, ¯ V n ( t, d ) = 1 n V ( nt, nd ) , ˆ V n ( t, d ) = 1 √ n V ( nt, nd ) , In the following two sections, we will exam the asymptotic behavior of ¯ V as n → ∞ , we showthat it converges to the solution of a boundary value problem of a homogeneous Monge-Amp ´ere equation, give the general solution to the equation, and solve it for some specialcase; then we in order to characterize the rate of the convergence and the asymptotic randombehaviors, we identify the stochastic differential equation that the limit of ˆ V n ( t, d ) satisfies.When the knapsack is characterized by more than one characters, we have a multi-dimensionalstochastic knapsack problem. It is discovered recently to be an important model in the studyof inventory management. Unlike the one dimensional case, the dynamic programming for themulti-dimensional stochastic knapsack problem is exponential with respect to the size of theproblem. Approximations can certainly play even more important roles. Our fluid and diffusionapproximation are carried over to this important model. A system of differential equation needsto be solved to get the fluid limit.The reset of the paper will be organized as the following, in Sec. 2, we will establish the fluidlimit through probabilistic arguments, and present the solution to the Monge-Amp ´ere equation;in Sec. 3, diffusion approximations are established for the case of unit demand; then, themodel is extended to the case of multi-dimensional stochastic knapsack problem, for both fluidand diffusion approximation. Convergence Results
To facilitate the analysis, we will take a de tour. Instead of the value function, we study thestochastic process whose mean will achieve the value function. Let us denote X t,d ( s ) , s = t, t + 1 , · · · , T to be the reward collected at time s following the optimal policy while starting attime t with d units available. Apparently, V ( t, d ) = E [ X t,d ( T )] . Define, V n ( t, d ) = 1 n X nt,nd ( nT ) . (2.1)We expect that a law of large number type result exists, so that V n ( t, d ) can converge to V ( t, d ) for each ( t, d ) . Moreover, we hope to extract the dynamic it satisfies, thus give us therelationship we expect from V ( t, d ) .To prove the convergence, we need, Lemma . Let V n ( t, d ) be the random variable defined above, then, Var [ X nt,nd ( nT )] = O ( n ) , as n → ∞ . (2.2) Proof
We prove this lemma by induction. First, when n = 1 , observe that when d approachinfinity, the optimal policy will be just accept any arrivals, then the variance will be constant.Therefore, we have a maximum for the variance for all d . Denote M to be twice that number,we want to show that for each n , Var [ X nt,nd ( nT )] n ≤ M, ∀ n, d. Suppose it holds up to n . Then for the case of n + 1 , apparently, we need,Var [ X ( n +1) t, ( n +1) d (( n + 1) T )] n + 1 ≤ M, ∀ d, To see this, conditional upon the arrivals from ( n + 1) t + 1 to ( n + 1) t + ( T − t ) , whose σ -algebracan be denoted as F ( n +1) t +( T − t ) we have,Var [ X ( n +1) t, ( n +1) d (( n + 1) T )] = E [ Var [ X ( n +1) t, ( n +1) d (( n + 1) T ) |F ( n +1) t +( T − t ) ]]+ Var [ E [ X ( n +1) t, ( n +1) d (( n + 1) T ) |F ( n +1) t +( T − t ) ]] , in which, E [ Var [ X ( n +1) t, ( n +1) d (( n + 1) T ) |F ( n +1) t +( T − t ) ]] ≤ nM, from inductive assumption, andVar [ E [ X ( n +1) t, ( n +1) d (( n + 1) T ) |F ( n +1) t +( T − t ) ]] ≤ M. due to the definition of M . (cid:3) For the convergence proof, we adapt the classical proof of the strong law of large number, see,e.g. ( ? ). Equation (2.2) and Markov inequality imply, ∞ X n =1 P (cid:20)(cid:12)(cid:12)(cid:12)(cid:12) X u n t,u n d ( u n T ) − E [ X u n t,u n d ( u n T )] u n (cid:12)(cid:12)(cid:12)(cid:12) > ǫ (cid:21) < ∞ , here u n = ⌊ α n ⌋ for some α > . From Borel-Cantelli lemma, we know that X u n t,u n d ( u n T ) /u n converges almost surely, and let us denote the limit as u ( t, d ) for each ( t, d ) . Meanwhile, forany u n ≤ k ≤ u n +1 , we have, u n u n +1 X u n t,u n d ( u n T ) u n ≤ X kt,kd ( kT ) k ≤ u n +1 u n X u n +1 t,u n +1 d ( u n +1 T ) u n +1 . This leads to, α u ( t, d ) ≤ lim inf k X kt,kd ( kT ) k ≤ lim sup k X kt,kd ( kT ) k ≤ αu ( t, d ) , since it holds for any α > , we can conclude that the convergence with probability one.Next, we relax the integral assumptions on the time and the knapsack size. For any ( t, d ) ∈ R , define, ˆ X nt,nd ( nT ) = X ⌊ nt ⌋ , ⌊ nt ⌋ ( nT ) . Then it is easy to see that the convergence can beextended. Furthermore, the converge is uniform on compact set, for related details, see. e.g.( ? ). Therefore we have, Theorem . V n ( t, d ) → u ( t, d ) almost surely as n → ∞ . and u ( t, d ) satisfies the Monge-Amp ´ere equation with proper boundary value. ( ∂u ( x,y ) ∂x + g ( ∂u ( x,y ) ∂y ) = 0 ,u ( X, y ) = h ( y ) , u ( x,
0) = 0 , ( x, y ) ∈ [0 , X ] × [0 , Y ] (2.3)where g ( · ) is the loss function of the random variable P Q , i.e. g ( x ) := E [ Q [ P − x ] + ] . Proof
The only thing left to proof is that the limit will satisfies the equation in (2.3). Sincethe convergence is uniform on compact set, we can further conclude that, if u is continuouslydifferentiable with respect to t and d on a compact set, which imply uniform continuity for both u and its derivatives, then, X nd,nt ( nT ) − X nd,nt +1 ( nT ) = X nd,nt ( nT ) n − X nd,nt +1 ( nT ) n /n → − ∂u ( t, d ) ∂t , a . s .X nd,nt ( nT ) − X nd +1 ,nt ( nT ) = X nd,nt ( nT ) n − X nd +1 ,nt ( nT ) n /n → − ∂u ( t, d ) ∂d . a . s . This can be achieve from the following relationship. X nd,nt ( nT ) − X nd,nt +1 ( nT ) = [ V ( nd − Q, nt + 1) +
P Q − V ( nd, nt + 1)] + (2.4)This is the same as, X nd,nt ( nT ) n − X nd,nt +1 ( nT ) n n = [ P Q + V ( nd − Q,nt +1) n − V ( nd,nt +1) n n ] + . (2.5) The following relations can be also derived through actions of Schwartz functions to define derivative in thedistribution sense, hence, the solutions of the partial differential differential equation are weak solutions. However,in our case, the PDE has a classical solution, this direct approach is more intuitive. ince, both X nd,nt ( nT ) n and ¯ V n ( d, t ) , as its mean, converge to the same function uniformly oncompact set, and according to regularity theorems on the first order partial differential equa-tions, both this function and its derivatives are uniformly continuous, see, e.g. ( ? ), we canensure the convergence of differences to the derivatives. (2.5) gives us the relation, ∂u ( x, y ) ∂x + g ( ∂u ( x, y ) ∂y ) = 0 . We can draw the same conclusion from the boundary value. (cid:3)
The equation (2.3) is, of course, a nonlinear partial differential equation, which in general isextremely difficult to analyze. However, we can reduce this problem to a very well knownequation. Just take derivative with respect to x , we have, u xx + g ′ ( u y ) u xy = 0 , similarly, take derivative with respect to y , we have u xy + g ′ ( u y ) u yy = 0 , multiply the first one by u yy and the second one by u xy , then take difference, we have, u xx u yy − u xy = 0 . This is a homogeneous Monge-Amp ´ere equation. In general, Monge-Amp ´ere equa-tion is an important mathematical subject in analysis and geometry. The general solution tothe homogeneous Monge-Amp ´ere solutions was obtained independently in ( ? ) and ( ? ). Wewill follow the basic setup and structure of the latter. In Fairlie and Leznov(1995) ( ? ), a generalconstruction to the second class is obtained. Given any two arbitrary C function R ( · ) and f ( · ) ,a solution can be uniquely determined. Here, the general procedure of the construction givesus the basic forms that the solution will take, g ( · ) and h ( · ) then will help us to determine thesolution completely. We will also reveal the relations between the two sets of functions. Thegeneral solution to the homogeneous Monge-Amp ´ere equation bears the following form, ∂u∂x = R ( ξ ) − ξ dRdξ , ∂u∂y = dRdξ ,y − xξ = f ( ξ ) . (2.6)For any continuous function R ( ξ ) and f ( ξ ) . For our problem, we know that ξ + g ( R ′ )+ g ′ ( R ′ ) = 0 , u ( X, y ) = h ( y ) , these conditions can uniquely determine the two functions R ( ξ ) and f ( ξ ) ,hence, uniquely determine u ( x, y ) . For example, when the batch distribution takes the followingspecial form, g ( x ) = e − λx , R ′ ( ξ ) = − log( ξλ − ) λ , we know that u ( x, y ) = Z xx (cid:20) R ( ξ ) − ξ dRdξ (cid:21) dx + Z yy dRdξ dy, in conjunction with the relation, dξdx = ξx − f ′ ( ξ ) , dξdy = 1 f ′ ( ξ ) − x . we can obtain u ( x, y ) from routine differential equation procedures for first order partial differ-ential equation, see e.g. ( ? ). Second Order Approximations
While fluid limits capture the dynamics of the system evolution, asymptotic characterization ofthe randomness of the system is also desired. Diffusion approximations, results of various cen-tral limit theorems and strong approximation theorems usually serve this purpose. In addition,diffusion approximations can also provide an estimation of the order of the system convergingto the fluid limits. The process understudy differs from most of those appears in the queueingcontext is that the main process is a random walk with time and state dependency, furthermore,it depends upon the increments of another random walk. Standard functional central limit the-orem can not be applied directly. How ever in the case of unit demand, we can overcome thisbarrier by making use of results derived in ( ? ). In ( ? ), a large class of switching system arestudied. Our system turns out to be one that “switching” at every time period. To apply theresults in ( ? ), it is require that the evolution function of the mean and variance are smooth, infact Lipschitz should be enough. Those functions for our case are functions of the solutionsto the Monge-Amp ´ere equation, whose smoothness has been well established. Then, resultsin ( ? ) will give us the stochastic differential equation(SDE) the limiting process will satisfy.However, the existence and uniqueness of the solution to the SDE were not established in( ? ). Meanwhile, in ( ? ), the existence and uniqueness of the solution to this type of SDE withmore general assumptions on the coefficients have been obtained. Hence, the SDE uniquelydetermine a stochastic process.To capture the randomness of X n,d ( T ) , we define y d,T ( s ) to be the number of units that aresupplied to the demands up to time s . From the set up, we know that y d,T ( s ) is a Markov chain.Moreover, we have, y d,T ( s + 1) = y d,T ( s ) + χ ( { V ( s + 1 , d − y d,T ( s ) −
1) + p − V ( s + 1 , d − y d,T ( s )) > } ) ,Z d,T ( s ) − Z d,T ( s ) = p i [ y d,T ( s ) − y d,T ( s )] . In ( ? ), random walks of this type have been studied. More specifically, Let { ξ n ( α ) , α ∈ R + } beindependent random variables with al as parameters, and S n +1 = S n + ξ ( S n ) . The followingtheorem is a summary of results in ( ? ), Theorem . (( ? )) If there exists a function s ( t ) such that, | n S n ( nt ) − s ( t ) | → , a.s. and D n → D ( α ) , then, D n = Var [ ξ n ( α )] /n then, γ n ( t ) := n − / [ S n ( nt ) − ns ( t )] weekly convergein D to the following stochastic differential equation, dγ ( t ) = D ( s ( t )) / dw t , provided that its solution exists and is unique.ollowing the similar functional strong law of large numbers in the previous section, we cansee that s ( t ) satisfies, ds ( t ) t = F ( u d ( t, d − s ( t ))) . By the Lipschitz continuity of the solution to the Monge-Amp ´ere equation, and existence andstability theorems of ordinary differential equation, we can see that s ( t ) can be uniquely de-termined and is Lipschtz continuous. Therefore, we can conclude that n − / [ Y n ( nt ) − ns ( t )] converge to a stochastic process that satisfies the following stochastic differential equation, dY t = p g ( u d ( t, d − s ( t )))[1 − g ( u d ( t, d − s ( t )))] dW t , (3.1)where dW t denotes a wiener integral with respect to a standard Brownian motion, and therevenue collected can be represented by dZ s = P dY s . (3.2) Remark:
For the problems with general demand batch size, we conjecture that the processwill weakly converge to a process that satisfies the following stochastic differential equation, dY t = q F p,d ( u d ( t, d − Y t ))[1 − F p,d ( u d ( t, d − Y t ))] dW t . Where F p,d is the distribution function for both price and the quantity, and is not necessarycontinuous. Hence, the limiting process is not a diffusion process any more. However, resultsin ( ? ) guarantee the existence and uniqueness of the limiting process. This convergence willbe the subject of future research. In this section, we intend to deal with a multi-dimensional version of the stochastic knapsackproblem. To be more specific, at time t = 0 , we have a vector of m different types of prod-ucts available, ( W , W , · · · , W m ) , at any time t = 1 , , · · · , T , the demand arrives now ischaracterized by a m + 1 tube, ( P t , Q t , Q t , · · · , Q mt ) , where P t describes the rewards and a m -dimensional characterization vector ( Q t , Q t , · · · , Q mt ) describes the combination of the the m types products it requires. They are i.i.d random variables, with discrete distribution,P [ P t = p i , Q t = q j , Q t = q j , · · · , Q mt = q mj m ] = θ i,j ,j , ··· ,j m . Again, at each time t , we need to determine whether to accept or to reject the demand arrivalto achieve the maximum average revenue at time T . Let us denote V ( t, d , d , · · · , d m ) be thevalue function for the dynamic programming. Similarly to the one dimensional case, it satisfieshe following recursive equation, V ( t, d , d , · · · , d m )= V ( t + 1 , d , d , · · · , d m )[ θ + Θ( d , d , · · · , d m )]+ X θ i,j ,j , ··· ,j m max (cid:8) p i + V ( t + 1 , d − q , d − q , · · · , d m − q m ) ,V ( t + 1 , d , d , · · · , d m ) (cid:9) . (4.1)Here, Θ( d , d , · · · , d m ) = P [ ∪ mk =1 { Q kt > d k } ] . It is obvious that the dynamic programming is not so easy-to-solve as those in one-dimensionalcase. In fact, the computational efforts needed grow exponentially with respect to W =max k W k . Therefore, approximation will be much more desirable.Apply the sam scaling and argument, we can see that the similar convergence results hold, Theorem . V n ( t, d , d , · · · , d m ) → u ( t, d , d , · · · , d m ) as n → ∞ . and u ( t, d , d , · · · , d m ) satisfies the following boundary value problem, ( ∂u ( x ,x ,x , ··· ,x n − ) ∂x + G ( ∂u∂x , ∂u∂x , · · · , ∂u∂x m − ) = 0 ,u ( x , x , x , · · · , x m − ) = h ( x , x , · · · , x n − ) , u ( x ,
0) = 0 , where G ( z , z , · · · , z m − ) := E [ P − ( z + z + · · · + z m − )] + . Hence, u satisfies the homogeneous Monge-Amp ´ere equation Det ( D u ) = 0 . Proof:
The only thing left is to show that u satisfy the Monge-Amp ´ere equation. To see this,take derivative with respect to x , x , · · · , x m respectively on equation, ∂u ( x , x , x , · · · , x n − ) ∂x + G ( ∂u∂x , ∂u∂x , · · · , ∂u∂x m − ) = 0 , we have, for k = 0 , , · · · , m∂u ∂x x k + m − X i =1 G i ( ∂u∂x , ∂u∂x , · · · , ∂u∂x m − ) ∂u ∂x i x k = 0 , (4.2)regard this as a linear system for variables (1 , G , G , · · · , G m ) , then in order for (4.2) to hold,the coefficient matrix, ( D u ) , must be singular. (cid:3) Again from ( ? ), we have the general solution for n dimensional Monge-Amp ´ere equation. Un-derstandably, it depends upon the selection of two m − variate functions. Given arbitraryfunction R ( ξ , ξ , · · · , ξ n − ) , L ( ξ , ξ , · · · , ξ n − ) , u = R − X ξ j R j , u j = R j ,x j − ξ j x = Q j ( ξ , ξ , · · · , ξ n − ) , = ( D R ) − DL.
Where DL and D R denote the gradient vector and Hessian matrix of L and R respectively.Function G ( · ) and the boundary condition h ( · ) will then determine R and L .Define y kd,T ( s ) , k = 1 , , · · · , m to be the number of units of characterizations that are suppliedto the demands up to time s . There exists a function ( s ( t ) , s ( t ) , · · · , s m ( t )) which is thesolution to the following differential equation system, ds k ( t ) t = G ( u i ( t, d − s ( t ) , d − s ( t ) , · · · , d m − s m ( t )) , k = 1 , , · · · , m. The regularity of the solution will then help us conclude that, (cid:18) Y kn ( nt ) − ns k ( t ) n / (cid:19) , converges to a stochastic process determined by the following stochastic differential equation, dY kt = q ˜ G k (1 − ˜ G k ) dW kt , (4.3)where, ˜ G k = G ( u k ( t, d − s ( t ) , d − s ( t ) , · · · , d m − s m ( t ))))