Fair Dynamic Rationing

Abstract

We study the allocative challenges that governmental and nonprofit organizations face when tasked with equitable and efficient rationing of a social good among agents whose needs (demands) realize sequentially and are possibly correlated. To better achieve their dual aims of equity and efficiency in such contexts, social planners intend to maximize the minimum fill rate across agents, where each agent's fill rate must be irrevocably decided upon its arrival. For an arbitrarily correlated sequence of demands, we establish upper bounds on both the expected minimum fill rate (ex-post fairness) and the minimum expected fill rate (ex-ante fairness) achievable by any policy. Our bounds are parameterized by the number of agents and the expected demand-to-supply ratio, and they shed light on the limits of attaining equity in dynamic rationing. Further, we show that for any set of parameters, a simple adaptive policy of projected proportional allocation achieves the best possible fairness guarantee, ex post as well as ex ante. Our policy is transparent and easy to implement; yet despite its simplicity, we demonstrate that this policy provides significant improvement over the class of non-adaptive target-fill-rate policies. We obtain the performance guarantees of (i) our proposed adaptive policy by inductively designing lower-bound functions on its corresponding value-to-go, and (ii) the optimal target-fill-rate policy by establishing an intriguing connection to a monopoly-pricing optimization problem. We complement our theoretical developments with a numerical study motivated by the rationing of COVID-19 medical supplies based on a projected-demand model used by the White House. In such a setting, our simple adaptive policy significantly outperforms its theoretical guarantee as well as the optimal target-fill-rate policy.

Fair Dynamic Rationing

Vahideh Manshadi

Yale School of Management, New Haven, CT, [email protected]

Rad Niazadeh

University of Chicago Booth School of Business, Chicago, IL, [email protected]

Scott Rodilitz

Yale School of Management, New Haven, CT, [email protected]

We study the allocative challenges that governmental and nonprofit organizations face when tasked with equitable and efficient rationing of a social good among agents whose needs (demands) realize sequentially and are possibly correlated. As one example, early in the COVID-19 pandemic, the Federal Emergency Management Agency faced overwhelming, temporally scattered, a priori uncertain, and correlated demands for medical supplies from different states. To better achieve their dual aims of equity and efficiency in such contexts, social planners intend to maximize the minimum fill rate across agents, where each agent’s fill rate must be irrevocably decided upon its arrival. For an arbitrarily correlated sequence of demands, we establish upper bounds on both the expected minimum fill rate (ex-post fairness) and the minimum expected fill rate (ex-ante fairness) achievable by any policy. Our bounds are parameterized by the number of agents and the expected demand-to-supply ratio, and they shed light on the limits of attaining equity in dynamic rationing. Further, we show that for any set of parameters, a simple adaptive policy of projected proportional allocation achieves the best possible fairness guarantee, ex post as well as ex ante. Our policy is transparent and easy to implement, as it does not rely on distributional information beyond the first conditional moments. Despite its simplicity, we demonstrate that this policy provides significant improvement over the class of non-adaptive target-fill-rate policies by characterizing the performance of the optimal such policy, which relies on full distributional knowledge. We obtain the performance guarantees of (i) our proposed adaptive policy by inductively designing lower-bound functions on its corresponding value-to-go, and (ii) the optimal target-fill-rate policy by establishing an intriguing connection to a monopoly-pricing optimization problem. Further, we extend our results to considering alternative objective functions and to rationing multiple types of resources. We complement our theoretical developments with a numerical study motivated by the rationing of COVID-19 medical supplies based on a projected-demand model used by the White House. In such a setting, our simple adaptive policy significantly outperforms its theoretical guarantee as well as the optimal target-fill-rate policy.

Key words: rationing, fair allocation, social goods, correlated demands, online resource allocation

arXiv [cs.GT], Feb

1. Introduction

In Spring 2020, with the COVID-19 pandemic surging across the US, states were relying on the Federal Emergency Management Agency (FEMA) to provide urgently needed medical equipment from the Strategic National Stockpile. Unequipped for such a widespread emergency, FEMA aimed to ration its limited supplies in order to address states’ current needs while also retaining some of the stockpile in anticipation of future needs. However, the allocation decisions made by FEMA were inconsistent and lacked transparency, which frustrated state officials (Washington Post 2020a). Because having access to medical equipment can be a matter of life or death for a COVID-19 patient, making allocation decisions which are efficient and equitable is of paramount importance (Emanuel et al. 2020). Achieving efficiency alone is easy: a first-come, first-serve policy allocates all of the supply to meet early-arriving needs. However, such a policy can be unfair to patients in states where needs materialize later.

The above is just one example of a fundamental sequential allocation problem that social planners face when aiming to allocate divisible goods as efficiently and equitably as possible to demanding agents that arrive over time.

In this paper, we take the first step toward theoretically studying the aforementioned class of problems. We develop a framework for fair dynamic rationing where agents’ one-time needs (demands) for a divisible good realize sequentially and can be arbitrarily correlated. In particular, upon arrival of each agent’s demand, the planner makes an irrevocable decision about their fill rate (FR). Toward jointly achieving efficiency and equity, the planner aims to maximize the minimum FR, either ex post or ex ante. To assess the performance of sequential allocation policies, we introduce measures of ex-post and ex-ante fairness guarantees. For this general setting:

(i) We establish upper bounds on the ex-post and ex-ante fairness guarantees achievable by any policy. These bounds are parameterized by the supply scarcity (i.e., the expected demand-to-supply ratio) and the number of agents.

(ii) Remarkably, we show that a simple, adaptive, and transparent policy called projected proportional allocation (PPA) simultaneously achieves our upper bounds on the ex-post and ex-ante fairness guarantees for any set of parameters.

(iii) We illustrate the power of adaptivity by characterizing the ex-post guarantee of the optimal target-fill-rate policy and showing that such a non-adaptive policy cannot achieve our upper bounds.

(iv) Finally, we demonstrate the effectiveness of our policy through an illustrative case study motivated by the allocation of COVID-19 medical supplies based on a model of demand which was used by the White House.

Footnote: “We don’t know how the federal government is making those decisions,” said Casey Katims, the federal liaison for Washington state.

Introducing a framework for fair dynamic rationing:

We study the allocation of a divisible good to agents arriving over time with varying levels of demand. We assume the demand sequence is drawn from an arbitrary but known joint distribution across all agents. To account for heterogeneity in the demand level of different agents, we set each agent’s utility to be its FR. In our base model, we focus on the objective of maximizing the minimum FR across all agents. Such an objective, which is in the spirit of Rawlsian justice, maximizes the utility of the worst-off agent. As such, it takes fairness into consideration along with efficiency. Due to the stochasticity of the demand sequence, we consider two versions of this objective function: the expected minimum FR and the minimum expected FR (see eq. (ex-post) and eq. (ex-ante), respectively, as well as the subsequent discussion).

Like other online stochastic optimization problems, our sequential allocation problem can be formulated as a dynamic program (DP), and it similarly suffers from the curse of dimensionality as well as other practical limitations such as a lack of interpretability. Consequently, we aim to design sequential allocation policies that perform well while being practically appealing and computable in polynomial time. We assess the performance of a policy by computing its ex-post and ex-ante fairness guarantees for any given supply scarcity and number of agents. In defining our notions of such guarantees, we use the minimum FR achievable under deterministic demand as a benchmark (see eq. (1) and Definition 1 and their related discussion in Section 2) to separate the impact of demand stochasticity from the impact of supply scarcity. The ex-post (resp. ex-ante) fairness guarantee of a policy serves as a lower bound on the expected minimum (resp. minimum expected) FR that the policy achieves relative to our benchmark under all possible joint demand distributions.

Establishing upper bounds:

In order to gain insight into the difficulty of achieving equity and efficiency in sequential allocation, we develop upper bounds on the achievable fairness guarantees of any online policy, even policies which cannot be computed in polynomial time. For intuition, consider the following example with two agents. The first agent has demand of $B_1$, where $B_1$ is a Bernoulli random variable with success probability $2/3$. The second agent has demand $B_1 \times B_2$, where $B_2$ is an independent Bernoulli random variable with success probability $1/2$. In other words, the demand sequence is equally likely to be $(0, 0)$, $(1, 0)$, or $(1, 1)$. [...] establish upper bounds on the ex-post and ex-ante guarantees of any policy (see Theorems 1 and 4).

As we later show, these bounds are indeed tight. Thus, conducting comparative statics with respect to the supply scarcity and the number of agents reveals several insights (see Figure 1): when demand is small relative to supply, the bounds on both fairness guarantees deteriorate with increased demand. However, in the over-demanded regime, the bounds are independent of the supply scarcity. Further, in both the under-demanded and over-demanded regimes, the ex-post fairness guarantee worsens with more agents. On the other hand, the ex-ante fairness guarantee is independent of the number of agents. This highlights the fundamental difference in our notions of fairness: the objective corresponding to ex-post fairness (the expected minimum FR) is concerned with fairness along all sample paths, whereas the objective corresponding to ex-ante fairness (the minimum expected FR) is only concerned with marginal fairness (see the related discussion in Section 2).

Footnote: In Section 5, we generalize our objective function.
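The two-agent example in this paragraph can be checked by direct enumeration. The sketch below assumes the first agent's demand is one Bernoulli variable (success probability 2/3) and the second agent's demand is that variable multiplied by an independent Bernoulli variable (success probability 1/2). The indices on the two variables are a reconstruction, and the function name is hypothetical. Under this reading, each of the three possible demand sequences has probability 1/3 and the expected total demand is 1.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def two_agent_demand_distribution():
    """Enumerate the joint distribution of (d1, d2) = (B1, B1 * B2)."""
    p1 = Fraction(2, 3)  # assumed success probability of B1 (first agent's demand)
    p2 = Fraction(1, 2)  # assumed success probability of the independent B2
    dist = defaultdict(Fraction)
    for b1, b2 in product((0, 1), repeat=2):
        prob = (p1 if b1 else 1 - p1) * (p2 if b2 else 1 - p2)
        dist[(b1, b1 * b2)] += prob
    return dict(dist)


dist = two_agent_demand_distribution()
for sequence, prob in sorted(dist.items()):
    print(sequence, prob)

# Expected total demand; with supply normalized to 1 this is the supply scarcity.
mu = sum(prob * sum(sequence) for sequence, prob in dist.items())
print("expected total demand:", mu)
```

Exact fractions are used so that the equal-likelihood claim can be compared without floating-point noise.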

Achieving upper bounds:

Since our upper bounds apply to all sequential policies including the optimal online policy (namely, the exponential-sized DP), it would be reasonable if no policy could achieve these upper bounds in polynomial time. However, we show that not only are these upper bounds achievable, but they can be achieved by our PPA policy. To motivate our policy, let us consider a hypothetical situation where the demand sequence is known a priori. In that case, the optimal allocation under both objectives is to equalize the FRs and then maximize that FR (see eq. (1) and its related discussion). Alternatively, this can be written as a deterministic DP with a simple solution: at each time period, proportionally allocate the remaining supply based on the current demand and the total future demand (see Section 3.2 and Appendix A.3). When demand is stochastic, our PPA policy simply replaces all the future random demands by their projected values, namely, their conditional expectations (see eq. (4)).

In Sections 3.3 and 3.5, we analyze the ex-post and ex-ante fairness guarantees of the PPA policy and show that it achieves the best of both worlds: our lower bounds on the PPA policy’s guarantees match the corresponding upper bounds for any supply scarcity and any number of agents. These two analyses rely on delicate inductive arguments. For ex-post fairness, we establish a lower bound on the value-to-go function of our PPA policy by analyzing the evolution of the minimum FR and progressively constructing a worst-case joint distribution for demand (see Lemma 1 and Figure 3). For ex-ante fairness, we demonstrate that the expected demand-to-supply ratio before the arrival of each agent is non-increasing when following the PPA policy, which enables us to bound the marginal expected FR for each agent (see Appendix A.6).

We highlight that beyond enjoying the best possible guarantees, our PPA policy is practically appealing: it is computationally efficient, interpretable, and transparent. In addition, it does not require full distributional knowledge, as it only relies on the first conditional moments of the joint distribution for demand. This last property is significant because knowing the full distribution may not be feasible in many practical situations (we discuss one such example in Section 4). Further, policies which rely on detailed distributional knowledge can be prone to errors or perturbations (see Footnote 15 and Appendix A.2).

Establishing sub-optimality of target-fill-rate policies:

In addition to showing that our PPA policy achieves the best possible guarantees, we extend our work to studying the subclass of target-fill-rate (TFR) policies. A TFR policy commits upfront to a fill rate τ, and upon arrival of each agent, it allocates a fraction τ of that agent’s demand until it exhausts the supply. Our study of TFR policies is motivated by two reasons: (i) since such policies are transparent and easy-to-communicate, they are frequently used in practice, including at the outset of the COVID-19 pandemic when an initial formula allocated a fixed percentage of states’ estimated needs (Washington Post 2020a), and (ii) TFR policies are a natural yet powerful class of non-adaptive policies (see Section 3.4). Consequently, comparing the performance of TFR policies with that of our adaptive PPA policy sheds light on the limitations of making non-adaptive decisions.

Intuitively, a TFR policy can perform poorly because it does not take advantage of information that reduces future uncertainty. For instance, consider a setting with two agents where the second agent’s demand is perfectly correlated with the first agent’s demand. A simple adaptive policy, such as our PPA policy, will perform optimally in such a setting because demand is deterministic upon the first agent’s arrival. The PPA policy achieves such performance by crucially leveraging information about the second agent’s demand when determining the first agent’s fill rate. In contrast, a TFR policy targets the same fill rate regardless of the first agent’s demand, and consequently cannot ensure that sufficient supply remains for the second agent. Based on this intuition, in Section 3.4, we provide a tight bound on the ex-post fairness guarantee of the optimal TFR policy (see Theorem 3), which can be considerably lower than the corresponding guarantee of our adaptive PPA policy (see Figure 4).

To characterize the ex-post fairness guarantee of the optimal TFR policy, we construct the worst-case total demand distribution against such a policy. In the proof, we establish a rather surprising connection to the literature on monopoly pricing and Bayesian mechanism design (see Hartline (2013) for more details on this literature). In particular, upon mapping the problem of finding the worst-case instance into the quantile space, our problem reduces to a constrained version of the (single-item) monopoly pricing problem (see Remark 2). We identify two key properties of the worst-case distribution in this constrained monopoly pricing problem, and by exploiting the connection to our original problem, we end up with the desired characterization of the worst-case total demand distribution against the optimal TFR policy. Due to this connection, our proof technique and corresponding results can be of independent interest (e.g., see Alaei et al. (2019) for proof techniques and results in the same spirit).
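The two-agent intuition above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's eq. (4), which is not reproduced here. The PPA rule is read as "fill rate = remaining supply divided by current demand plus projected future demand, capped at 1". The TFR rule allocates a fraction τ of each demand until supply runs out. The function names and the `projected_future` callback are hypothetical. Supply is normalized to 1 and a zero demand is treated as fully filled, following the paper's convention.

```python
def ppa_fill_rates(demands, projected_future, supply=1.0):
    """Fill rates under a projected-proportional-allocation sketch.

    projected_future(history) is a hypothetical callback returning the
    conditional expectation of total future demand given observed demands.
    """
    remaining = supply
    rates = []
    for i, d in enumerate(demands):
        if d == 0:
            rates.append(1.0)  # convention: zero demand counts as fully filled
            continue
        future = projected_future(demands[: i + 1])
        rate = min(1.0, remaining / (d + future))  # proportional share, capped at 1
        remaining -= rate * d
        rates.append(rate)
    return rates


def tfr_fill_rates(demands, tau, supply=1.0):
    """Fill rates under a target-fill-rate policy with fixed target tau."""
    remaining = supply
    rates = []
    for d in demands:
        if d == 0:
            rates.append(1.0)
            continue
        x = min(tau * d, remaining)  # allocate tau * d until supply is exhausted
        remaining -= x
        rates.append(x / d)
    return rates


def perfectly_correlated_projection(history):
    """Two agents with d2 = d1: after agent 1 arrives, future demand is known."""
    return history[0] if len(history) == 1 else 0.0


for d in (0.25, 1.0):
    path = [d, d]
    print(path, "PPA:", ppa_fill_rates(path, perfectly_correlated_projection))
    for tau in (0.5, 1.0):
        print(path, f"TFR(tau={tau}):", tfr_fill_rates(path, tau))
```

On the low-demand path, τ = 0.5 leaves supply unused, while on the high-demand path, τ = 1 leaves nothing for the second agent. The adaptive sketch matches the better outcome on both paths because it conditions on the first agent's demand.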

Illustrative case study:

To demonstrate the effectiveness of our policy, in Section 4 we conduct a numerical case study motivated by the allocative challenges that FEMA faced at the beginning of the COVID-19 pandemic (as discussed at the beginning of this section). Drawing upon a model cited by the White House at that time, we first highlight the sequential and heterogeneous nature of states’ demands for medical supplies. Then, we augment that model by considering a multivariate Normal distribution for demand with various levels of correlation. Our simulation results illustrate the superior performance of our PPA policy compared to both its ex-post fairness guarantee and the optimal TFR policy. Further, the results suggest that our PPA policy performs nearly as well as a DP solution (which, as we discuss, suffers from many practical limitations).

Allocating medical supplies in a pandemic is just one motivating example of the challenges that arise when a governmental or nonprofit organization aims to ration supply among agents whose (a priori uncertain and correlated) needs realize sequentially. Other examples include the allocation of emergency aid when a natural disaster such as a hurricane or wildfire impacts multiple locations over time (Wang et al. 2019), as well as the distribution of food donations by mobile pantries that sequentially visit agencies (Lien et al. 2014). Our proposed policy can effectively guide transparent allocation decisions in such contexts while also providing a guarantee on the fairness level of the process. Finally, as discussed in Section 5, our framework can be enriched to account for other practical considerations, such as (i) generalized objective functions that enable the social planner to balance equity and efficiency to varying degrees, and (ii) rationing multiple types of resources (see Corollaries 1, 2, and 3 in Section 5.2).

We conclude this section by discussing how our work relates to and contributes to several streams of literature.

Fairness in static resource allocation:

Considerations of fairness and its trade-off with efficiency have frequently arisen in the resource allocation literature in operations research and computer science. We begin by discussing papers which study fairness in static (one-shot) allocation settings. The seminal work of Bertsimas et al. (2011) considers a general setting where a central decision-maker allocates m divisible resources to n agents, each with a different utility function. Focusing on two commonly used notions of fairness in allocation, max-min and proportional fairness, the authors characterize the efficiency loss due to maximizing fairness (see also Bertsimas et al. 2012 and Bertsimas et al. 2013). If demand were deterministic in our setting, the optimal allocation would coincide with that of the max-min objective in Bertsimas et al. (2011). Namely, for both objectives, the optimal allocation consists of maximized equal FRs.

Focusing on indivisible goods, Donahue and Kleinberg (2020) considers the trade-off between fairness and utilization when demand is distributed across different agents. A priori, only demand distributions are known. However, after a one-shot allocation decision, all demand values realize. The fairness notion considered in this line of work is in the same spirit as our notion of ex-ante fairness: they require that an individual’s chance of receiving the resource should not significantly depend on the group to which the individual belongs. Similarly, by maximizing the minimum expected FR, we aim to reduce the impact of an agent’s place in the sequence of arrivals. Sharing similar motivation to our paper, Grigoryan (2020) and Pathak et al. (2020) consider equitable COVID-19 vaccine allocation. However, the settings (e.g., offline and deterministic), models, and techniques in both papers differ drastically from those in this work.

Also falling within the category of static allocation of indivisible goods, a stream of papers in computer science considers allocation problems when agents’ valuations are deterministically known. For deterministic algorithms, recent research has centered on the existence of allocations which satisfy certain fairness properties, such as envy-freeness up to any good (see, e.g., Chaudhury et al. (2020) and references therein). For randomized algorithms, the closest to our work is the recent work of Freeman et al. (2020), which uses notions of ex-post and ex-ante fairness and explores whether both can be achieved simultaneously. They develop a randomized algorithm that is approximately fair ex post and precisely fair ex ante. We ask a similar question, albeit in a dynamic divisible-good setting with random and correlated demand, and we affirmatively answer it: our PPA policy exactly achieves the best possible fairness guarantee ex post as well as ex ante (see Theorems 2 and 4).

Footnote: As explained in detail in Lien et al. (2014), even though the daily demands for food donations from different agencies are not temporally scattered, they will only be observed by the operators upon their arrival at the sites.

Footnote: Other recent papers have focused on fairness in the contexts of pricing (Cohen et al. 2019), information acquisition (Cai et al. 2020), targeted interventions (Levi et al. 2019), service levels (Jiang et al. 2019), and online learning (Gupta and Kamble 2019). See also the work of Cayci et al. (2020) that considers fair resource allocation with online learning.

Fairness in dynamic resource allocation:

We now turn our attention to papers that consider fairness in dynamic (online) allocation settings. In terms of modeling, closest to our work are Lien et al. (2014) and Sinclair et al. (2020). Motivated by the distribution of food donations by mobile pantries, Lien et al. (2014) introduced the problem of sequential resource allocation which coincides with our base model and the ex-post fairness objective function, in that it aims to maximize the expected minimum FR (although it only studies the special case of independent demands). The recent work of Sinclair et al. (2020) considers a similar model; however, it focuses on a multi-criteria objective which is based on an allocation’s distance from the optimal offline Nash Social Welfare solution. We note that their notion of fairness is also different in nature from ours. The algorithmic aspects of both Lien et al. (2014) and Sinclair et al. (2020) consist of designing novel heuristics and numerically evaluating them against a relevant benchmark (the intractable DP solution and the Nash Social Welfare solution, respectively). On the other hand, we take a theoretical approach and analyze fairness guarantees for the policies we design. Further, we provide upper bounds on the performance of any policy (including the DP solution), which serves a dual purpose: (i) it establishes that our policy is the best possible one if we aim to achieve both ex-ante and ex-post fairness guarantees, and (ii) it highlights the fundamental limits of achieving equity in a dynamic setting.

In settings with multiple types of resources, Azar et al. (2010) and Bateni et al. (2016) study online versions of Fisher markets and develop policies with fairness guarantees under two different arrival models. The former assumes an adversarial model whereas the latter considers demand that belongs to a general class of stochastic processes. There are fundamental differences between our work and the aforementioned papers. Just to name one, the settings of Azar et al. (2010) and Bateni et al. (2016) are motivated by online advertising, where demanding agents (advertisers) are offline and items (impressions) arrive in an online fashion. Demanding agents have a large budget compared to the price of each arriving item, and they derive item-specific utilities. Consequently, the fairness notion is concerned with the total utility of each agent, which is a function of all items allocated to it during the horizon. In contrast to such a setting, demanding agents in our work arrive in an online fashion while the supply side is offline, and each demanding agent receives a single allocation. The recent works of Ma and Xu (2020) and Nanda et al. (2020) are closer to our setting in that the demanding agents arrive online; however, they differ in several aspects: (i) the underlying arrival process is known i.i.d. where arriving demand belongs to various groups, (ii) they focus on group-level fairness, and (iii) they consider a matching setting, i.e., allocating indivisible goods.

The objectives of ex-post and ex-ante fairness which we study in our problem bear some resemblance to the objective in the online contention resolution scheme (OCRS) problem, although the two problems are not directly comparable. The OCRS is basically a rounding algorithm that aims to uniformly preserve the marginals induced by a fractional solution while obtaining feasibility of the final allocation. This technique has found application in many settings such as Bayesian online selection, oblivious posted pricing mechanisms, and stochastic probing models (see, e.g., Alaei 2014, Feldman et al. 2016, and Lee and Singla 2018). The OCRS problem diverges from ours because that setting focuses on designing randomized policies for allocating indivisible goods, while our focus is on divisible goods (consequently, restricting to deterministic policies is without loss).

Footnote: For other examples of work in this application area, see Solak et al. (2014), Orgut et al. (2018), and Eisenhandler and Tzur (2019).

Footnote: In Sinclair et al. (2020), their notion of fairness is with respect to the absolute allocation, i.e., if possible, agents’ allocation should be equalized regardless of differences in their needs. In contrast, we aim for an allocation which is proportional to need.

Footnote: We remark that papers considering general convex objective functions, such as Agrawal and Devanur (2014) and Balseiro et al. (2020), admit many common fairness objective functions as special cases. See also Mehta (2012) for more details.

Dynamic allocation of social goods:

On a broader level, our paper is related to the literature on dynamic allocation of social goods and services, such as public housing, donated organs, and emergency care. Examples of centralized allocation policies include Kaplan (1984), Ashlagi et al. (2013), Agarwal et al. (2019), and Ashlagi et al. (2019); examples of decentralized mechanisms are Leshno (2017), Anunrojwong et al. (2020), and Arnosti and Shi (2020). For the most part, the aforementioned papers focus on the analysis of social welfare in steady-state models where both demand and supply dynamically arrive. We complement this literature by focusing on equitable allocation in a non-stationary framework where a fixed amount of supply must be rationed across demand that arrives over time.

Online resource allocation:

From a technical point of view, our work is related to the rich literature on online resource allocation and prophet inequalities, which started from the seminal work of Krengel and Sucheston (1978) and Samuel-Cahn et al. (1984). For an informative survey, we refer the interested reader to Lucier (2017). We highlight that in terms of modeling demand, our work departs from the prevailing approaches in this literature, namely adversarial, i.i.d., or random permutation arrival models. In our work, we assume that the sequence of demands can be arbitrarily correlated and the joint distribution is known in advance. In terms of modeling demand, our work is closest to a few papers that consider prophet inequalities with correlated demand (Rinott and Samuel-Cahn 1992, Truong and Wang 2019, Immorlica et al. 2020). However, the nature of the online decisions is different; in our model, a fraction of a divisible good is allocated to each arriving demand, whereas in prophet inequality settings, an indivisible good is allocated to a single agent.

Finally, our PPA policy relies on re-optimizing the FR by replacing all future random demands by their expected values. As such, it is related to the stream of papers in revenue management and dynamic programming that theoretically analyze the performance of such heuristics. Great examples of work in this direction include Ciocan and Farias (2012), Jasin and Kumar (2012), Balseiro and Brown (2019), and Calmon et al. (2020).

2. Preliminaries

Problem setup:

Consider a planner that is using a sequential allocation policy (also referred to as an online policy) to allocate a divisible resource of supply $s$ among $n$ agents. Without loss of generality, we normalize the total supply so that $s = 1$. Agents arrive sequentially over time periods $1, 2, \ldots, n$, and we index agents according to the period in which they arrive. Once agent $i$ arrives, their demand $d_i \in \mathbb{R}_{\geq 0}$ is realized and observed by the planner. Based on the observed demand $d_i$ and the history up to time period $i$, the sequential policy makes an irrevocable decision by allocating an amount $x_i$ of the resource to this agent. The allocated amount $x_i$ cannot exceed the agent’s realized demand $d_i$ nor can it exceed the remaining supply before agent $i$’s arrival, which we denote by $s_i$. Thus, $x_i$ is a feasible allocation if $x_i \in [0, \min\{s_i, d_i\}]$. Given the feasible allocation $x_i$ and the demand $d_i$, agent $i$’s fill rate (FR) is defined as $x_i / d_i$. After allocating $x_i$ to agent $i$, the remaining supply before the arrival of agent $i + 1$ is $s_{i+1} = s_i - x_i$.

To model the uncertainty about future demands, we consider a Bayesian setting where the $d_i$’s are stochastic and arbitrarily correlated such that $\vec{d} = (d_1, d_2, \ldots, d_n)$ is drawn from a joint distribution $\vec{F} \in \Delta(\mathbb{R}^n_{\geq 0})$ known by the planner. We further denote the supply scarcity (i.e., the expected demand-to-supply ratio) by $\mu \triangleq \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\sum_{i \in [n]} d_i\right]$, which is equivalent to the total expected demand since we normalize the total supply to be 1. For simplicity of presenting our results, we consider joint distributions that assign non-zero probability to at least one sample path of demands with $d_n \neq 0$. Equivalently, we assume $d_n$ is not deterministically equal to zero.

As detailed earlier, our setup is motivated by the distributional operations of a governmental or nonprofit organization. Consequently, we focus on an egalitarian planner that intends to balance the equity and efficiency of the allocation. To this end, the planner’s objective is to maximize the minimum achieved FR among the agents, i.e., $\min_{i \in [n]} x_i / d_i$, given the uncertainty in the demands. Maximizing such an objective has its roots in the classic literature on welfare economics (e.g., Arrow 1963) and has been studied more recently in similar contexts in operations research (e.g., Lien et al. 2014). It provides equity through its focus on the worst FR across all agents (in contrast to the sum of FRs) and provides efficiency by aiming to maximize this FR (in contrast to allocating an equally minimal amount of the resource to all agents).

Objectives & fairness guarantees:

Since demands are a priori uncertain in the setup describedabove, the planner should consider appropriate metrics to aggregate over uncertain outcomes. If x i = d i = 0, we set the FR to 1 as a convention. For any a ∈ N , we use [ a ] to refer to the set { , , . . . , a } . This assumption is without loss of generality, as one can alternatively re-deﬁne n to be the smallest index suchthat d n (cid:48) is deterministically equal to zero for n (cid:48) > n . We consider a broader class of objectives that subsumes the minimum FR in Section 5.2.1. anshadi, Niazadeh, Rodilitz:

We now formally define the planner's objectives by considering two different metrics: the ex-post minimum FR and the ex-ante minimum FR. For any sequential allocation policy $\pi$, the ex-post minimum FR of policy $\pi$ is its expected minimum FR, i.e.,

\[
W_p^{\vec{F}}(\pi) \triangleq \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\min_{i \in [n]} \left\{\frac{x_i}{d_i}\right\}\right], \tag{ex-post}
\]

where $\vec{x} = (x_1, x_2, \ldots, x_n)$ is the sequence of allocations generated by $\pi$. On the other hand, the ex-ante minimum FR of policy $\pi$ is its minimum expected FR, i.e.,

\[
W_a^{\vec{F}}(\pi) \triangleq \min_{i \in [n]} \left\{\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{x_i}{d_i}\right]\right\}. \tag{ex-ante}
\]

For a randomized policy $\pi$, we abuse notation and again use $W_p^{\vec{F}}(\pi)$ and $W_a^{\vec{F}}(\pi)$ to denote the expectation of the above two quantities over the policy $\pi$'s internal randomness.

These two objectives represent two different notions of fairness: eq. (ex-post) aims for equity in outcomes, whereas eq. (ex-ante) aims for equity in expected outcomes. We largely focus on the ex-post minimum FR for two main reasons. First, when allocating supplies in response to a rare event like a pandemic or natural disaster, agents only observe one realized outcome. Because the ex-ante minimum FR is only concerned with marginal fairness, it can have unfair outcomes for every sample path, i.e., every realized demand sequence. In contrast, the ex-post minimum FR considers each full sample path; every sample path with positive probability which results in an unfair outcome reduces $W_p^{\vec{F}}(\pi)$. Second, by Jensen's inequality, the ex-post minimum FR serves as a lower bound on the ex-ante minimum FR, i.e., for any policy $\pi$,

\[
W_a^{\vec{F}}(\pi) \geq W_p^{\vec{F}}(\pi).
\]

However, for a fixed ex-post minimum FR, achieving a higher ex-ante minimum FR is desirable because it reduces systematic biases against a particular agent, e.g., the last-arriving agent.
In the extreme case where $W_a^{\vec{F}}(\pi) = W_p^{\vec{F}}(\pi)$, one particular agent receives the smallest FR, regardless of the sample path. On the other hand, $W_a^{\vec{F}}(\pi) > W_p^{\vec{F}}(\pi)$ implies that the worst-off agent varies across different sample paths.

Having defined our notions of fairness, we first observe that if the sequence of demands is deterministic, then the policy that maximizes both the ex-post and the ex-ante minimum FR simply equalizes all FRs. Namely,

\[
W \triangleq \max_{\pi} \left\{W_a^{\vec{F}}(\pi)\right\} = \max_{\pi} \left\{W_p^{\vec{F}}(\pi)\right\} = \frac{x_1}{d_1} = \frac{x_2}{d_2} = \cdots = \frac{x_n}{d_n} = \min\left\{1, \frac{1}{\mu}\right\}. \tag{1}
\]

Footnote: In principle, we allow randomization of our policies in this paper; however, as will be clear later, all of our proposed policies are deterministic and no randomization is needed to obtain our targeted performance guarantees.

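To make the two metrics concrete, the following minimal Python sketch evaluates the ex-post and ex-ante minimum FR on a small hand-made example. The helper names and the two sample paths are illustrative assumptions, not taken from the paper. The allocations correspond to a simple hypothetical rule that gives agent 1 at most half of the supply and gives agent 2 whatever remains, up to their demand.

```python
def fill_rate(x, d):
    """Fill rate x/d, with the convention FR = 1 when x = d = 0."""
    return 1.0 if d == 0 else x / d


def ex_post_min_fr(paths):
    """Expected minimum FR: average over paths of the worst FR on each path."""
    return sum(
        p * min(fill_rate(x, d) for x, d in zip(xs, ds))
        for p, ds, xs in paths
    )


def ex_ante_min_fr(paths):
    """Minimum expected FR: worst agent-level average FR across paths."""
    n = len(paths[0][1])
    return min(
        sum(p * fill_rate(xs[i], ds[i]) for p, ds, xs in paths)
        for i in range(n)
    )


# Each entry: (probability, demands, allocations); total supply is 1.
paths = [
    (0.5, (0.5, 1.0), (0.5, 0.5)),  # agent 2 is worst off on this path
    (0.5, (1.0, 0.5), (0.5, 0.5)),  # agent 1 is worst off on this path
]

w_p = ex_post_min_fr(paths)
w_a = ex_ante_min_fr(paths)
```

In this example the worst-off agent changes from one path to the other, so the ex-ante value (0.75) is strictly larger than the ex-post value (0.5), in line with the Jensen's inequality relation stated above.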

The above observation highlights that when total demand exceeds supply, even without stochasticity, we cannot guarantee a minimum FR better than $1/\mu$, simply due to the scarcity of supply. However, if the sequence of demands is stochastic and correlated, a minimum FR of $1/\mu$ may not be achievable. Consequently, we evaluate policies based on how they perform relative to $W$. For a policy $\pi$ and a joint demand distribution $\vec{F}$, we say that the policy achieves ex-post fairness (resp. ex-ante fairness) of $W_p^{\vec{F}}(\pi)/W$ (resp. $W_a^{\vec{F}}(\pi)/W$). We aim to design a policy with guarantees on both ex-post and ex-ante fairness that hold universally for all joint demand distributions $\vec{F}$ with $n$ agents and supply scarcity $\mu$. We refer to the universal lower bounds of a policy $\pi$ as its fairness guarantees, which we formally define below.

Definition 1 (Ex-post/Ex-ante Fairness Guarantee). A sequential allocation policy $\pi$ achieves an ex-post fairness guarantee (resp. ex-ante fairness guarantee) of $\kappa_p(\mu, n)$ (resp. $\kappa_a(\mu, n)$) if, for all $n \in \mathbb{N}$ and $\mu \in \mathbb{R}_{\geq 0}$,

\[
\inf_{\vec{F} \in \Delta\left(\mathbb{R}^n_{\geq 0};\, \mu\right)} \frac{W_p^{\vec{F}}(\pi)}{W} \geq \kappa_p(\mu, n)
\qquad \left(\text{resp. } \inf_{\vec{F} \in \Delta\left(\mathbb{R}^n_{\geq 0};\, \mu\right)} \frac{W_a^{\vec{F}}(\pi)}{W} \geq \kappa_a(\mu, n)\right),
\]

where $\Delta\left(\mathbb{R}^n_{\geq 0}; \mu\right)$ denotes the domain of joint demand distributions with $n$ agents and supply scarcity $\mu$.

Our goals are (i) to understand the limits of achieving fairness in sequential allocation by computing upper bounds on the achievable guarantees, and (ii) to obtain tight lower bounds by designing policies with strong ex-post guarantees as well as ex-ante guarantees. We show in Section 3 that no gap exists between the achievable upper and lower bounds under both ex-post and ex-ante notions. More specifically, we show how to obtain exactly matching upper and lower bounds for both notions of fairness using a single adaptive policy.

3. Optimal Bounds on Fairness Guarantees

In this section, we present our main results for the setting introduced in Section 2. First, we focus on ex-post fairness in Section 3.1 and establish parameterized upper bounds on the ex-post fairness guarantee achievable by any sequential allocation policy, whether adaptive or non-adaptive, computationally efficient (i.e., with polynomial running time) or not. Then, somewhat surprisingly, we show that such upper bounds can be achieved by our policy, which is introduced and analyzed in Sections 3.2 and 3.3. Next, to illustrate the power of our simple adaptive algorithm, in Section 3.4 we characterize the ex-post fairness guarantee of the best policy which non-adaptively aims for a particular target fill rate, and we show that our policy performs favorably compared to such a policy. Finally, in Section 3.5 we turn our attention to the notion of ex-ante fairness, and we show that our policy also achieves the best possible ex-ante fairness guarantee.

Figure 1 The upper bound on the ex-post fairness guarantee of any policy, as a function of the supply scarcity and the number of agents. [Axes: number of agents, supply scarcity.]

We begin this section by establishing a fundamental limit on ex-post fairness for any allocation policy when faced with stochastic and sequential demands. The main result of this subsection is the following theorem:

Theorem 1 (Upper Bound on Ex-post Fairness Guarantee). Given a fixed number of agents $n \in \mathbb{N}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, no sequential allocation policy obtains an ex-post fairness guarantee (see Definition 1) greater than $\kappa_p(\mu, n)$, defined as

\[
\kappa_p(\mu, n) \triangleq
\begin{cases}
1 - \dfrac{n}{2(n+1)}\,\mu, & \mu \in [0, 1) \\[2mm]
\mu - \dfrac{n}{2(n+1)}\,\mu^2, & \mu \in \left[1, \dfrac{n+1}{n}\right) \\[2mm]
\dfrac{n+1}{2n}, & \mu \in \left[\dfrac{n+1}{n}, +\infty\right).
\end{cases}
\tag{2}
\]

See Figure 1 for an illustration of this upper bound as a function of the supply scarcity $\mu$ and the number of agents $n$. Per Definition 1, the ex-post fairness guarantee is relative to the achievable minimum FR when demands are deterministic, namely $W = \min\{1, 1/\mu\}$. Consequently, this upper bound provides insight into the unavoidable loss in efficiency and equity when demands are a priori uncertain and realize sequentially. In particular, we remark that the achievable fairness guarantee crucially depends on the supply scarcity. In the regime where $\mu < \frac{n+1}{n}$, which we refer to as the under-demanded regime, $\kappa_p(\mu, n)$ initially worsens as $\mu$ increases before hitting its minimum (for any fixed $n$) when expected demand equals supply, i.e., at $\mu = 1$. This suggests that the stochastic nature of demand is most harmful when expected demand exactly equals supply. On the other hand, in the over-demanded regime where $\mu \geq \frac{n+1}{n}$, the achievable fairness guarantee is independent of $\mu$. Given that we are usually in the over-demanded regime in our motivating applications, Theorem 1 ensures that supply scarcity does not contribute to the loss in fairness due to uncertain, correlated, and sequential demand. Fixing $\mu$, the upper bound always decreases with $n$, implying that achieving fairness can be more challenging for a larger population of agents with stochastic demands, even if the total expected demand of the population remains the same. Finally, we highlight that the bound is always at least $1/2$, a value that is approached when $\mu = 1$ and $n \to +\infty$.

The proof of Theorem 1 relies on establishing two hard instances with similar structures, one for $\mu < \frac{n+1}{n}$ and one for $\mu \geq \frac{n+1}{n}$. The details of the proof are presented in Appendix A.1. Here, we present the instance for the over-demanded regime along with a sketch of our analysis. In this instance, there are $n$ possible equally-likely scenarios, i.e., scenario $\sigma$ happens with probability $1/n$ for $\sigma \in [n]$. In scenario $\sigma$, the first $\sigma$ agents have equal demand of $\frac{2\mu}{n+1}$ and the rest have no demand. We illustrate this instance in Figure 2(a).

First, note that the total supply scarcity for the above hard instance is $\mu$ (as shown in Appendix A.1). Next, consider any sequential policy that faces a non-zero demand from agent $i$. The policy cannot distinguish among possible scenarios $i, i+1, \ldots, n$. Consequently, its allocation decision for agent $i$ will be independent of the scenario. In light of this observation, any policy can be sufficiently described by a set of (possibly random) allocations with expected values $\vec{y} = (y_1, y_2, \ldots, y_n)$, such that if agent $i$ has non-zero demand, then they receive an expected allocation $y_i$. Given $\vec{y}$, the minimum FR for scenario $\sigma$ is

\[
r_\sigma \triangleq \frac{n+1}{2\mu}\, \mathbb{E}_{\pi}\left[\min\{x_1, x_2, \ldots, x_\sigma\}\right] \leq \frac{n+1}{2\mu} \min\{y_1, y_2, \ldots, y_\sigma\} \leq \frac{n+1}{2\mu}\, y_\sigma, \tag{3}
\]

where the first inequality is due to the expectation of a minimum being at most the minimum over expectations (Jensen's inequality).

In order to establish our upper bound, we set up a factor-revealing linear program as presented in Figure 2(b). The LP maximizes the expected minimum FR subject to three sets of natural constraints that must hold for any sequential policy:

• The minimum FR in scenario $\sigma$ cannot exceed the FR for agent $\sigma$, as shown in eq. (3).
• The minimum FR in scenario $\sigma$ is at most the minimum FR in scenario $\sigma - 1$.
• The total amount of expected allocations cannot exceed the available supply of 1.

Figure 2 (a) The instance and (b) the factor-revealing LP which establish an upper bound of $\kappa_p(\mu, n)$ for the over-demanded regime (i.e., when $\mu \geq \frac{n+1}{n}$). (LP) uses variables $\vec{r}$ (fill rate) and $\vec{y}$ (allocation):

\[
\begin{aligned}
\max_{\vec{y}, \vec{r} \in \mathbb{R}^n_{\geq 0}} \quad & \frac{1}{n} \sum_{\sigma \in [n]} r_\sigma && \text{(LP)} \\
\text{s.t.} \quad & r_\sigma \leq \frac{(n+1)\, y_\sigma}{2\mu}, && \sigma \in [n] \\
& r_\sigma \leq r_{\sigma-1}, \quad r_0 = 1, && \sigma \in [n] \\
& \sum_{i \in [n]} y_i \leq 1.
\end{aligned}
\]

In Appendix A.1, we provide an upper bound on the optimal value of this LP by presenting a feasible solution to its dual. To complete the proof of Theorem 1, we must scale by $W = \min\{1, 1/\mu\}$ to translate this upper bound on the expected minimum FR into an upper bound on ex-post fairness (see Definition 1).

Having provided the proof sketch of Theorem 1, we finish this section by noting that it is not clear whether the bound in eq. (2) can be achieved, even by the optimal online policy which can be found via a DP. Furthermore, there are significant limitations and drawbacks to a DP approach for maximizing the expected minimum FR in this setting. First, (i) the state space of such a DP is exponentially large for correlated demands, which makes the DP intractable. In addition, (ii) solving the DP requires full distributional knowledge, and (iii) the DP decisions may lack transparency and interpretability, which are highly desirable properties in our motivating applications.

Footnote: We remark that settings similar to the hard instance can occur in practice. As one example, consider the challenge of allocating limited disaster-relief supplies to towns damaged by a hurricane which may continue on its destructive path or may veer back out to sea.
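The bound in eq. (2) and the over-demanded hard instance are easy to tabulate. The sketch below is illustrative only: the function names are assumptions, and it checks just two facts stated in the text (the three branches of $\kappa_p$ and the claim that the hard instance has supply scarcity $\mu$). It does not solve the LP.

```python
def kappa_p(mu, n):
    """Upper bound on the ex-post fairness guarantee, following eq. (2)."""
    c = n / (2 * (n + 1))
    if mu < 1:
        return 1 - c * mu
    if mu < (n + 1) / n:
        return mu - c * mu ** 2
    return (n + 1) / (2 * n)


def hard_instance(mu, n):
    """Scenarios of the over-demanded hard instance as (probability, demands).

    In scenario sigma, the first sigma agents each demand 2*mu/(n+1)
    and the remaining agents demand nothing.
    """
    d = 2 * mu / (n + 1)
    return [
        (1 / n, [d] * sigma + [0.0] * (n - sigma))
        for sigma in range(1, n + 1)
    ]


def expected_total_demand(scenarios):
    return sum(p * sum(demands) for p, demands in scenarios)


n, mu = 4, 2.0
scarcity = expected_total_demand(hard_instance(mu, n))  # should equal mu
bound_at_one = kappa_p(1.0, n)       # minimum over mu for this n
bound_over_demanded = kappa_p(mu, n)  # constant for mu >= (n+1)/n
```

For $n = 4$, the bound equals $(n+2)/(2(n+1)) = 0.6$ at $\mu = 1$ and $(n+1)/(2n) = 0.625$ throughout the over-demanded regime.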
Remarkably, in the following subsection, we design a simple adaptive policy that not only achieves the best possible ex-post fairness guarantee of $\kappa_p(\mu, n)$, but also offers several corresponding advantages over a DP solution: (i) it can be computed efficiently, (ii) it only requires knowledge of the conditional first moments of agents' demands, and (iii) its decisions can be clearly explained. Additionally, as shown in Section 3.5, it simultaneously attains the best-possible ex-ante fairness guarantee.

Footnotes: While the exponential-sized state space does not formally prove the computational hardness of finding the optimal online policy, it suggests that the naive DP approach will be computationally intractable; additional evidence suggesting computational hardness is due to Papadimitriou and Tsitsiklis (1987), which shows that the closely related general problem of finding the optimal policy in partially observable MDPs is PSPACE-hard. Of course, the DP solution for ex-post fairness can lead to significantly sub-optimal ex-ante fairness, and it can also be sensitive to small perturbations in the demand distributions. We provide a simple example to illustrate both of those drawbacks in Appendix A.2.

We introduce our policy, referred to as the projected proportional allocation (PPA) policy, through the following simple intuition. Consider a planner that (magically) has access to all the demand realizations $\vec{d}$. As already discussed in Section 2, to maximize the minimum FR when the demand realizations are known a priori, the planner should equalize the FR of all agents by allocating $x_i^* = \min\left(d_i, \frac{d_i}{\sum_{j \in [n]} d_j}\right)$ to each agent $i$. If $\sum_{j \in [n]} d_j$ is at most the initial supply (which we normalize to 1), then each agent $i$ obtains a full allocation of $x_i = d_i$ in such a solution. This results in the maximum equal FR of 1. Otherwise, all the agents will have an equal FR of $1/\sum_{j \in [n]} d_j$, which is $1/\mu$ when each demand is equal to its expected value.

This solution can alternatively be obtained by solving a DP that returns allocations $x_n^*, x_{n-1}^*, \ldots, x_1^*$ maximizing the minimum FR. By a simple induction argument, given the remaining supply $s_i$ at period $i$, this DP maintains the following invariant at each period $i$ (refer to Appendix A.3 for details):

\[
x_i^* = \min\left\{d_i,\; \frac{s_i\, d_i}{d_i + \sum_{j \in [i+1:n]} d_j}\right\} = \min\left\{d_i,\; \frac{s_i\, d_i}{d_i + [\text{total future demand}]}\right\}. \tag{4}
\]

Notably, the above invariant suggests a sequential implementation of the optimal solution at each period $i$ that only uses the knowledge of $d_i$ (i.e., the current demand at period $i$) and $\sum_{j \in [i+1:n]} d_j$ (i.e., the total future demand from period $i+1$ to $n$). Now consider a setting with incomplete information, namely, with only knowledge of the current sample path of the observed demands up to period $i$, which we denote by $\vec{d}_{[1:i]} \triangleq (d_1, d_2, \ldots, d_i)$. Our PPA policy implements a version of the above policy by replacing the exact realization of total future demand with the conditional first moment of this random variable given the current sample path.
More precisely:

• Given the remaining supply $s_i$, the PPA policy allocates an amount

\[
x_i = \min\left\{d_i,\; \frac{s_i\, d_i}{d_i + \mu_{i+1}}\right\} \tag{PPA's update rule}
\]

of the (divisible) resource to agent $i$ upon their arrival, where

\[
\mu_{i+1} \triangleq \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\sum_{j \in [i+1:n]} d_j \;\middle|\; \vec{d}_{[1:i]}\right].
\]

Note that the conditional expected future demand $\mu_{i+1}$ given all previously-realized demands $\vec{d}_{[1:i]}$ is a function of $\vec{d}_{[1:i]}$; however, for ease of notation, we use $\mu_{i+1}$ without any input arguments.

We remark that the PPA policy is simple, computationally efficient, and solely uses first-moment knowledge about the future demands. Further, because the allocation decisions of the PPA policy depend smoothly on the first moment of future demand, these decisions are robust to small changes in the scale of any marginal distribution. Yet, as we show in Sections 3.3 and 3.5, this simple policy remarkably achieves the best possible guarantee for both notions of fairness (ex-post and ex-ante), even though these two notions are quantitatively different whenever $n > 1$.

Remark 1. The PPA policy can only run out of supply at the end of period $i$ if $\mu_{i+1} = 0$, or equivalently, only if all future demands $\vec{d}_{[i+1:n]}$ are deterministically equal to zero, conditional on the current realized sample path of demands $\vec{d}_{[1:i]}$. This property holds simply because

\[
s_{i+1} = s_i - x_i \geq s_i\, \frac{\mu_{i+1}}{d_i + \mu_{i+1}}.
\]

In this section, we analyze the ex-post fairness guarantee of our PPA policy. Theorem 2 shows that this simple policy indeed achieves the best possible ex-post fairness guarantee.

Footnote: For any $a, b \in \mathbb{N}$ we use $[a:b]$ to refer to the set $\{a, a+1, \ldots, b\}$ if $a \leq b$ (and the empty set otherwise).
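The PPA update rule is short enough to sketch directly. The following Python sketch is a minimal illustration under stated assumptions: the function names, the callback `cond_future_mean` (which supplies $\mu_{i+1}$ from the observed prefix), and the two-path, three-agent toy instance are all hypothetical choices made for this example, not part of the paper.

```python
def ppa_allocations(demands, cond_future_mean, supply=1.0):
    """Apply x_i = min{d_i, s_i * d_i / (d_i + mu_{i+1})} in arrival order.

    demands: realized demands d_1, ..., d_n.
    cond_future_mean: hypothetical callback mapping the observed prefix
        (d_1, ..., d_i) to mu_{i+1}, the conditional expected future demand.
    """
    s = supply
    allocations = []
    for i, d in enumerate(demands):
        mu_next = cond_future_mean(demands[: i + 1])
        # With d = 0 nothing is allocated (this also avoids 0/0 when mu_next = 0).
        x = 0.0 if d == 0 else min(d, s * d / (d + mu_next))
        allocations.append(x)
        s -= x
    return allocations


def min_fill_rate(demands, allocations):
    # Convention from the model: the fill rate is 1 when x = d = 0.
    return min(1.0 if d == 0 else x / d for d, x in zip(demands, allocations))


# Assumed toy instance: the first demand reveals which of two paths occurs.
EPS1, EPS2 = 0.01, 0.02
PATHS = {EPS1: [EPS1, 1.0, 1.0], EPS2: [EPS2, 1.0, 0.0]}


def cond_future_mean(prefix):
    full_path = PATHS[prefix[0]]
    return sum(full_path[len(prefix):])


expected_min_fr = sum(
    0.5 * min_fill_rate(path, ppa_allocations(path, cond_future_mean))
    for path in PATHS.values()
)
```

Because the first demand fully reveals the path in this toy instance, the conditional mean equals the true future demand and PPA equalizes fill rates on each path, giving an expected minimum FR of about 0.739. With less informative observations, `cond_future_mean` would return a genuine conditional expectation and the allocations would only be proportional in expectation.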

Theorem 2 (Ex-post Fairness Guarantee of PPA Policy). Given a fixed number of agents $n \in \mathbb{N}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, the PPA policy achieves an ex-post fairness guarantee (see Definition 1) of at least $\kappa_p(\mu, n)$ (defined in eq. (2)).

In order to prove the above theorem, we would have liked to analyze the evolution of the minimum FR, which we denote with $f_i$ at the end of period $i - 1$, i.e., $f_1 = 1$, $f_i = \min\left\{f_{i-1}, \frac{x_{i-1}}{d_{i-1}}\right\}$ for $i \in [2 : n+1]$. Instead, we consider the evolution of a closely related stochastic process, which makes the analysis simpler. We define this surrogate stochastic process as follows:

\[
\beta_1 \triangleq \min\left\{1, \frac{n+1}{n\mu}\right\}, \qquad
\beta_i \triangleq \min\left\{\beta_{i-1}, \frac{x_{i-1}}{d_{i-1}}\right\}, \quad i \in [2 : n+1]. \tag{5}
\]

First, we note that $\beta_i = \min\left\{f_i, \frac{n+1}{n\mu}\right\}$ for $i \in [n+1]$. Next, recall that $s_i$ denotes the remaining supply after agent $i - 1$ has been served. Under the PPA policy, $s_i$ evolves according to

\[
s_1 = 1, \qquad
s_i = s_{i-1} - \min\left\{d_{i-1},\; \frac{d_{i-1}}{d_{i-1} + \mu_i}\, s_{i-1}\right\}, \quad i \in [2 : n]. \tag{6}
\]

With the above observations, the main step of the proof is carefully analyzing the evolution of $(\beta_k, s_k)$ under the PPA policy, which enables us to lower bound the final expected minimum FR in the following lemma.


Lemma 1 (Lower Bound on Expected Minimum FR). Under the PPA policy, for all $i \in [n+1]$ and any subsequence of demand realizations $\vec{d}_{[1:i-1]}$,

\[
\mathbb{E}_{\vec{d} \sim \vec{F}}\left[f_{n+1} \,\middle|\, \vec{d}_{[1:i-1]}\right] \geq \beta_i \left(1 - \frac{n+1-i}{2(n+2-i)}\, \frac{\mu_i}{s_i}\, \beta_i\right) \tag{7}
\]

where $\beta_i$ is defined in eq. (5).

Since the objective of our dynamic decision-making problem has no per-stage rewards and consists only of a terminal reward (i.e., the minimum FR), Lemma 1 can be thought of as establishing a lower bound on the value-to-go function of the PPA policy. Before providing the proof for this key lemma, we lay out the two remaining steps that finish the proof of Theorem 2: (i) plugging $i = 1$ into inequality (7) to obtain a lower bound on $\mathbb{E}_{\vec{d} \sim \vec{F}}[f_{n+1}]$, and (ii) scaling the obtained lower bound by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$, which provides an ex-post fairness guarantee (see Definition 1).

Proof of Lemma 1: We will show that inequality (7) holds via backwards induction. The base case of $i = n+1$ is trivial as it follows from the observation we made earlier: $\beta_i = \min\left\{f_i, \frac{n+1}{n\mu}\right\}$. Now let us consider $i = k < n+1$. Instead of proving inequality (7), we prove a stronger result:

\[
\mathbb{E}_{\vec{d} \sim \vec{F}}\left[f_{n+1} \,\middle|\, \vec{d}_{[1:k]}\right] \geq \beta_k \left(1 - \frac{n+1-k}{2(n+2-k)}\, \frac{d_k + \mu_{k+1}}{s_k}\, \beta_k\right). \tag{8}
\]

Establishing inequality (8) means that the inequality in (7) holds for any realization of agent $k$'s demand. Consequently, it will hold when we take an expectation over agent $k$'s demand. In order to prove inequality (8), we consider two different cases that can arise depending on the remaining supply $s_k$, agent $k$'s demand $d_k$, and the future expected demand $\mu_{k+1}$. In the following, we introduce and analyze these cases separately.

(i) Sufficient supply ($s_k \geq \beta_k (d_k + \mu_{k+1})$): Recall that according to the PPA policy, $x_k = \min\left\{d_k, \frac{d_k}{d_k + \mu_{k+1}}\, s_k\right\}$. Therefore, in this case, either $x_k = d_k$, i.e., the PPA policy meets the entire demand, or $x_k = \frac{d_k}{d_k + \mu_{k+1}}\, s_k \geq \beta_k d_k$, i.e., the PPA policy attains an FR of at least $\beta_k$. According to the dynamics specified in (5) and (6), this implies $\beta_{k+1} = \beta_k$ and

\[
\frac{\mu_{k+1}}{s_{k+1}} = \frac{\mu_{k+1}}{s_k - \min\left\{d_k, \frac{d_k}{d_k + \mu_{k+1}}\, s_k\right\}} \leq \frac{\mu_{k+1}}{s_k - \frac{d_k}{d_k + \mu_{k+1}}\, s_k} = \frac{d_k + \mu_{k+1}}{s_k}.
\]

Using our inductive hypothesis when $i = k+1$,

\[
\mathbb{E}_{\vec{d} \sim \vec{F}}\left[f_{n+1} \,\middle|\, \vec{d}_{[1:k]}\right] \geq \beta_k \left(1 - \frac{n-k}{2(n+1-k)}\, \frac{d_k + \mu_{k+1}}{s_k}\, \beta_k\right) \triangleq \mathrm{RHS}^{(1)}. \tag{9}
\]

The lower bound given by $\mathrm{RHS}^{(1)}$ is a linear function of $d_k + \mu_{k+1}$, as illustrated by the dotted red lines in all panels of Figure 3 (in the regime where $d_k + \mu_{k+1} \in [0, s_k/\beta_k]$). This linear function has a non-positive slope and an intercept of $\beta_k$. We can further lower bound this function for any $d_k + \mu_{k+1} \in [0, s_k/\beta_k]$ by another linear function with the same intercept of $\beta_k$ and a smaller (more negative) slope. In particular, since $\frac{n-k}{n+1-k} \leq \frac{n+1-k}{n+2-k}$, we have:

\[
\mathrm{RHS}^{(1)} \geq \beta_k \left(1 - \frac{n+1-k}{2(n+2-k)}\, \frac{d_k + \mu_{k+1}}{s_k}\, \beta_k\right), \tag{10}
\]

which proves inequality (8) in the sufficient supply case (see the blue lines in all panels of Figure 3).

(ii) Insufficient supply ($s_k < \beta_k (d_k + \mu_{k+1})$): In this case, the allocation of the PPA policy is $x_k = \frac{d_k}{d_k + \mu_{k+1}}\, s_k$, which results in an FR less than $\beta_k$, i.e., $\frac{x_k}{d_k} = \frac{s_k}{d_k + \mu_{k+1}} < \beta_k$. According to the dynamics specified in (5) and (6), this implies $\beta_{k+1} = \frac{s_k}{d_k + \mu_{k+1}}$ and

\[
\frac{\mu_{k+1}}{s_{k+1}} = \frac{\mu_{k+1}}{s_k - \frac{d_k}{d_k + \mu_{k+1}}\, s_k} = \frac{d_k + \mu_{k+1}}{s_k}.
\]

Using our inductive hypothesis when $i = k+1$,

\[
\mathbb{E}_{\vec{d} \sim \vec{F}}\left[f_{n+1} \,\middle|\, \vec{d}_{[1:k]}\right] \geq \frac{s_k}{d_k + \mu_{k+1}} \left(1 - \frac{n-k}{2(n+1-k)}\right) \triangleq \mathrm{RHS}^{(2)}. \tag{11}
\]

The lower bound given by $\mathrm{RHS}^{(2)}$ is a convex homographic function of $d_k + \mu_{k+1}$, as illustrated by the dashed red lines in all panels of Figure 3 (in the regime where $d_k + \mu_{k+1} \in [s_k/\beta_k, +\infty)$). To further lower bound this function by a linear function, note that for any variable $z > 0$ the following inequality holds:

\[
\left(\frac{n+2-k}{2(n+1-k)}\right) \frac{s_k}{z} - \beta_k \left(1 - \frac{n+1-k}{2(n+2-k)}\, \frac{\beta_k}{s_k}\, z\right) = \frac{(n+1-k)\, \beta_k^2}{2(n+2-k)\, s_k\, z} \left(\frac{(n+2-k)\, s_k}{(n+1-k)\, \beta_k} - z\right)^2 \geq 0.
\]

The proof of the above inequality is purely algebraic and we omit it for brevity.

Footnote: If $s_i = 0$, then by Remark 1, we must also have $\mu_i = 0$. In such cases, we take the convention that $\frac{\mu_i}{s_i} = 0$.
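For readers who want the omitted algebra, one way to verify the identity is sketched below. The shorthand $a \triangleq n+1-k$ and $b \triangleq n+2-k$ is introduced here only for this derivation and is not notation from the paper; the argument assumes $s_k, \beta_k, z > 0$.

```latex
\begin{align*}
\frac{b}{2a}\,\frac{s_k}{z} \;-\; \beta_k\left(1 - \frac{a}{2b}\,\frac{\beta_k}{s_k}\,z\right)
  &= \frac{b\,s_k}{2a\,z} \;-\; \beta_k \;+\; \frac{a\,\beta_k^2\,z}{2b\,s_k} \\
  &= \frac{b^2 s_k^2 \;-\; 2ab\,s_k\,\beta_k\,z \;+\; a^2\beta_k^2 z^2}{2ab\,s_k\,z}
     && \text{(common denominator } 2ab\,s_k\,z\text{)} \\
  &= \frac{\left(b\,s_k - a\,\beta_k\,z\right)^2}{2ab\,s_k\,z}
     && \text{(perfect square)} \\
  &= \frac{a\,\beta_k^2}{2b\,s_k\,z}\left(\frac{b\,s_k}{a\,\beta_k} - z\right)^2 \;\ge\; 0
     && \text{(factor out } a^2\beta_k^2\text{)}.
\end{align*}
```

Since $1 - \frac{n-k}{2(n+1-k)} = \frac{n+2-k}{2(n+1-k)} = \frac{b}{2a}$, the first term on the left is exactly $\mathrm{RHS}^{(2)}$ evaluated at $z = d_k + \mu_{k+1}$.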
Substituting $z = d_k + \mu_{k+1}$ in this inequality, we have:

\[
\mathrm{RHS}^{(2)} \geq \beta_k \left(1 - \frac{n+1-k}{2(n+2-k)}\, \frac{d_k + \mu_{k+1}}{s_k}\, \beta_k\right), \tag{12}
\]

which proves inequality (8) in the insufficient supply case (again, see the blue lines in all panels of Figure 3).

Combining the above cases proves inequality (8) everywhere, which immediately implies the inductive hypothesis, i.e., inequality (7), for $i = k$, thus finishing the proof of the lemma. $\square$

Figure 3 Lower bounds on the expected minimum FR given by eq. (8) (blue solid lines), eq. (9) (red dotted lines), and eq. (11) (red dashed lines) when $n = 4$ for $k \in [4]$. [Horizontal axis of each panel: $d_k + \mu_{k+1}$.]

As discussed in the previous sections, our PPA policy is adaptive, that is, the FR for agent $i$ (and its corresponding allocation decision) can depend not only on the observed demand $d_i$ but also on the exact sample path up to time $i$ as well as the remaining supply $s_i$. In contrast to an adaptive policy, a non-adaptive policy commits to a sequence of feasible allocation maps $\{x_i(d_i)\}_{i \in [n]}$ upfront, where $x_i(d_i) \in [0, d_i]$ is the allocation decision for agent $i$ when agent $i$ has demand $d_i$. If the non-adaptive policy's allocation decision $x_i(d_i)$ exceeds the remaining supply $s_i$, then agent $i$ instead receives the entire remaining supply.

For settings that we consider, adaptivity can indeed help with improving the expected minimum FR of a policy. As an example, compare running our PPA policy versus the best non-adaptive policy on an instance with three agents. In this instance, the demands $\vec{d} = (d_1, d_2, d_3)$ follow one of the two possible sample paths $(\epsilon_1, 1, 1)$ or $(\epsilon_2, 1, 0)$ with equal probabilities $1/2$, where $\epsilon_1, \epsilon_2 \geq 0$ and $\epsilon_1 \neq \epsilon_2$. After agent 1's demand is realized, the PPA policy knows exactly which sample path is happening. By calculating the exact total demand of agents 2 and 3, it obtains the optimal expected minimum FR of $\frac{1}{2} \times \frac{1}{2} + \frac{1}{2} \times 1 = 3/4$ for small $\epsilon_1, \epsilon_2$. However, a non-adaptive policy cannot distinguish between the two possible sample paths after agent 1's demand is realized. Therefore, without loss of generality, it targets a FR of $\tau$ for agent 2 and obtains an expected minimum FR of $\frac{1}{2}\tau + \frac{1}{2}\min\{\tau, 1 - \tau\}$ for small $\epsilon_1, \epsilon_2$, which attains its maximum equal to $\frac{1}{2}$ at any $\tau \in \left[\frac{1}{2}, 1\right]$.

Motivated by such scenarios, we study two simple and natural canonical classes of non-adaptive policies: those that fix the sequence of allocation decisions a priori, namely they specify one allocation vector $\vec{x}$, and "smarter" policies which fix the sequence of fill rates $\vec{\tau} \triangleq (\tau_1, \tau_2, \ldots, \tau_n)$ a priori. In Appendix A.4, we show that the ex-post fairness guarantee for the former subclass is vanishing as $n$ gets large. Therefore, we focus on the latter subclass, which is formally defined as follows.

Definition 2 (Target-fill-rate Policies). A target-fill-rate (TFR) policy is any policy $\pi$ which pre-determines a target fill rate $\tau \in [0, 1]$. For each agent $i$, the policy $\pi$ must either allocate sufficient supply to meet the target or allocate all remaining supply, i.e.,

\[
\forall i \in [n]: \quad x_i(s_i, d_i) = \min\{\tau d_i, s_i\}.
\]

In the following theorem, we provide a tight bound on the ex-post fairness guarantee (Definition 1) achievable by the optimal TFR policy, defined as the one that maximizes ex-post fairness for the given joint demand distribution. We remark that setting one threshold is without loss of generality because the ex-post fairness guarantee of a policy which pre-determines a sequence of target fill rates $\{\tau_i\}_{i \in [n]}$ is upper bounded by that of a TFR policy with the same target fill rate $\tau = \min_{i \in [n]} \{\tau_i\}$ for all agents. We also highlight that in addition to achieving a lower ex-post fairness guarantee compared to our adaptive policy, finding the best TFR policy requires full knowledge of the total demand distribution, in contrast to our PPA policy, which only requires knowing the first conditional moments of the future total demand at each time.

Theorem 3 (Ex-post Fairness Guarantee of Optimal TFR Policy). Given any number of agents $n \in \mathbb{N} \setminus \{1\}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, the optimal TFR policy achieves an ex-post fairness guarantee (see Definition 1) of $\frac{\max\{1, \mu\}}{\mu + \sqrt{\mu^2 + 1}}$.

In Figure 4, we compare the guarantee of the optimal TFR policy against our PPA policy for different model primitives, $\mu$ and $n$. First, we note that when $n$ is not too large, our PPA policy achieves a considerably higher guarantee. Next, we highlight that the ex-post fairness guarantee for the optimal TFR policy does not depend on the number of agents $n$. This is in contrast to the ex-post guarantee for the PPA policy $\kappa_p(\mu, n)$, which worsens as the number of agents increases. Furthermore, the guarantee in Theorem 3 has a unique minimum of $\sqrt{2} - 1 \approx 0.41$ when $\mu = 1$. This once again suggests that the stochastic nature of demand is most harmful when expected demand exactly equals supply.

Figure 4 Ex-post fairness guarantees of our PPA policy and the optimal TFR policy. [Horizontal axis: supply scarcity. Vertical axis: ex-post fairness guarantee. Curves: PPA policy with $n = 1$, $n = 4$, and $n \to \infty$; optimal TFR policy with $n \geq 2$.]

Footnotes: For ease of presentation, we focus on deterministic non-adaptive policies. This is without loss of generality, as the ex-post fairness of any randomized non-adaptive policy must be weakly dominated by the ex-post fairness of one of the deterministic policies that it randomizes over. As discussed in the introduction, the initial strategy for allocating medical supplies at the beginning of the COVID-19 pandemic had the form of a target-fill-rate policy, which is a canonical non-adaptive strategy as we will discuss soon.

We begin by placing a lower bound on the performance of the optimal TFR policy, and we then demonstrate the existence of a matching upper bound.
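The closed form in Theorem 3 can be checked numerically. The sketch below is illustrative only: the function name `tfr_guarantee` and the grid of $\mu$ values are assumptions made for this example. It evaluates the guarantee on a grid and locates its minimum.

```python
import math


def tfr_guarantee(mu):
    """Ex-post fairness guarantee of the optimal TFR policy (Theorem 3)."""
    return max(1.0, mu) / (mu + math.sqrt(mu * mu + 1.0))


# Grid of supply-scarcity values 0.01, 0.02, ..., 5.00 (an arbitrary choice).
grid = [i / 100 for i in range(1, 501)]
mu_star = min(grid, key=tfr_guarantee)
worst_guarantee = tfr_guarantee(mu_star)
```

On this grid the minimizer is $\mu = 1$, where the guarantee equals $\sqrt{2} - 1$. For every $\mu \geq 1$ the expression stays strictly below $1/2$, because $\sqrt{\mu^2 + 1} > \mu$.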

Proof of lower bound: For any target fill rate $\tau$, a TFR policy will achieve that fill rate if $\tau \left(\sum_{i \in [n]} d_i\right) \leq 1$. Let us define $G$ as the cumulative distribution function (CDF) of the random variable $v \triangleq \frac{1}{\sum_{i \in [n]} d_i}$, where $\mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$. For ease of notation, we use $\Delta(\mathbb{R}_{\geq 0}; \mu)$ to denote the domain of all such CDFs. Given a CDF $G$, a TFR policy with target fill rate $\tau$ achieves an expected minimum fill rate of at least $\tau (1 - G(\tau))$, which implies that the optimal TFR policy attains an expected minimum fill rate of at least $\max_{\tau \in [0,1]} \tau (1 - G(\tau))$. In the following lemma, we establish a lower bound on $\max_{\tau \in [0,1]} \tau (1 - G(\tau))$, which enables us to lower-bound the ex-post fairness guarantee that the optimal TFR policy achieves.

Lemma 2 (Tight Lower Bound for Optimal TFR Policy). Given a fixed number of agents $n \in \mathbb{N}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, the following holds:

\[
\inf_{G \in \Delta(\mathbb{R}_{\geq 0};\, \mu)} \left\{\max_{\tau \in [0,1]} \tau (1 - G(\tau))\right\} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}. \tag{13}
\]

This infimum is attained by the following CDF

\[
\hat{G}(v) =
\begin{cases}
0 & \text{if } v \in [0, \hat{q}) \\
1 - \hat{q}/v & \text{if } v \in [\hat{q}, 1) \\
1 - \hat{q} & \text{if } v \in [1, +\infty) \\
1 & \text{if } v = +\infty
\end{cases}
\tag{14}
\]

where $\hat{q} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}$.

Before presenting the proof of the above lemma in Section 3.4.2, which is the key step of the proof of Theorem 3, we establish a matching upper bound and complete the proof of the theorem.

Footnote: We elaborate on the intuition behind the independence of this guarantee from $n$ when we present the hard instance for establishing the upper bound of Theorem 3.
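As a numerical sanity check on the worst-case distribution in eq. (14), the sketch below evaluates $\tau(1 - \hat{G}(\tau))$ on a grid of target fill rates. The function names are assumptions made for this example, and the closed form used for $\mathbb{E}[1/v]$ is a short calculation of ours rather than a formula quoted from the paper: the density $\hat{q}/v^2$ on $[\hat{q}, 1)$ contributes $\frac{1}{2\hat{q}} - \frac{\hat{q}}{2}$, and the atom at $v = +\infty$ contributes zero.

```python
import math


def q_hat(mu):
    """Value of the infimum in eq. (13)."""
    return 1.0 / (mu + math.sqrt(mu * mu + 1.0))


def worst_case_cdf(v, mu):
    """CDF from eq. (14); the remaining mass q_hat sits at v = +infinity."""
    q = q_hat(mu)
    if v < q:
        return 0.0
    if v < 1.0:
        return 1.0 - q / v
    if math.isinf(v):
        return 1.0
    return 1.0 - q


def tfr_lower_bound(tau, mu):
    """tau * (1 - G(tau)): target times the probability the target is met."""
    return tau * (1.0 - worst_case_cdf(tau, mu))


def mean_inverse_v(mu):
    # Assumed closed form for E[1/v] under eq. (14), derived as noted above.
    q = q_hat(mu)
    return 1.0 / (2.0 * q) - q / 2.0


mu = 2.0
best = max(tfr_lower_bound(t / 1000, mu) for t in range(1, 1001))
```

Under this distribution every target $\tau \in [\hat{q}, 1]$ yields the same value $\hat{q}$, so no choice of target fill rate can do better than $\hat{q}$, while `mean_inverse_v(mu)` recovers $\mu$ as required by the scarcity constraint.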

Proof of upper bound: We show a matching upper bound by considering a two-agent instance. In this instance, only the first agent has stochastic demand. In particular, $d_1 = \frac{1 - \epsilon}{v}$ for $v \sim \hat{G}$ (defined in eq. (14)) and $d_2 = \epsilon\mu$ deterministically. Note that $\mathbb{E}[d_1 + d_2] = \mu$. For any target fill rate $\tau' = (1 - \epsilon)^{-1}\tau$ where $\tau \in [0, 1]$, the first agent exhausts the entire supply with probability $\hat{G}(\tau)$, in which case the minimum FR will be 0. Therefore, the expected minimum fill rate of the optimal TFR policy in this instance is at most

\[
\max_{\tau \in [0,1]} (1 - \epsilon)^{-1}\, \tau \left(1 - \hat{G}(\tau)\right) = \frac{(1 - \epsilon)^{-1}}{\mu + \sqrt{\mu^2 + 1}},
\]

where the equality follows from Lemma 2.

By allowing $\epsilon \to 0$, we conclude that there exists an instance where the expected minimum fill rate of the optimal TFR policy is $\frac{1}{\mu + \sqrt{\mu^2 + 1}}$, which matches the lower bound from above. We remark that the construction of the above two-agent example clarifies why our upper bound does not depend on the number of agents: we can modify the example to an $n$-agent one where the total demand of the first $n - 1$ agents is $\frac{1 - \epsilon}{v}$ for $v \sim \hat{G}$ and the last agent has a deterministic demand of $\epsilon\mu$.

With the above (matching) bounds, we complete the proof of Theorem 3 by scaling this tight bound by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$, to arrive at the guarantee stated in Theorem 3. $\square$

Having laid out the proof steps of Theorem 3, we now provide a constructive proof of the key lemma, i.e., Lemma 2. We do so by identifying properties of the worst-case distribution against the optimal TFR policy, which enables us to exactly characterize that distribution.

To aid in this proof, we introduce a one-to-one mapping of each target fill rate $\tau$ into the quantile space, such that quantile $q$ corresponds to TFR $\tau$ if and only if there is sufficient supply to meet a fraction $\tau$ of demand with probability exactly $q$. We start by describing notation for this transformation, along with some basic properties, in the following definition. For simplicity of exposition, we assume all the distributions playing the role of $G$ are non-atomic.

Footnote: The non-atomic assumption is without loss of generality, as one can always add an infinitesimal continuous perturbation to each distribution, which does not change any of the arguments in this proof.

Fair Dynamic Rationing (a) (b)

Figure 5 (a) The expected achievable fill rate (EAFR) curve, fill rates, and support in quantile space. (b) The EAFR curve of the worst-case distribution, as defined in eq. (14), in quantile space.

Definition 3 (TFR in Quantile Space). Given a (non-atomic) CDF $G : \mathbb{R}_{\geq 0} \to [0,1]$ and inverse total demand $v \sim G$, we define the following mappings.

• TFR-to-quantile map $Q_G$: The quantile corresponding to TFR $\tau \in [0,1]$ is $Q_G(\tau) \triangleq 1 - G(\tau)$. In words, the probability of being able to meet a fraction $\tau$ of total demand is $Q_G(\tau)$. This map is monotone non-increasing.

• Quantile-to-TFR map $T_G$: The TFR corresponding to quantile $q \in [0,1]$ is $T_G(q) \triangleq G^{-1}(1-q)$. In words, $T_G(q)$ is the TFR for which the probability of being able to meet a fraction $T_G(q)$ of total demand is $q$. This map is monotone non-increasing and is the inverse of the TFR-to-quantile map, i.e., $T_G = (Q_G)^{-1}$.

• The expected achievable fill rate (EAFR) curve $R_G$: For $q \in [0,1]$, $R_G(q) \triangleq q \cdot G^{-1}(1-q)$ is the EAFR when the probability of meeting demand (given the TFR) is exactly equal to $q$, i.e., when the TFR is $T_G(q)$.

Remark 2.

In light of the above transformation, we remark that there is a reduction from our setup to a single-parameter Bayesian mechanism design problem in which a monopolistic seller has an item to sell to a single buyer with private valuation $v \sim G$, where $G$ is the common prior valuation distribution. See Alaei et al. (2019) for an example of such a setting; also refer to Hartline (2013) for more details on monopoly pricing. In this reduction, target fill rates correspond to prices and the EAFR corresponds to the expected revenue in monopoly pricing (accordingly, the EAFR curve also corresponds to the revenue curve). The problem in this parallel monopoly pricing setting is identifying the worst-case distribution $G$ satisfying $\mathbb{E}_{v \sim G}[1/v] = \mu$, so that we minimize the maximum revenue obtained from selling the item at prices constrained to be in the interval $[0,1]$.

According to Definition 3, $\tau(1 - G(\tau))$ is equivalent to $R_G(Q_G(\tau))$ for any TFR $\tau \in [0,1]$. Hence, Lemma 2 can be restated in quantile space as

$$\inf_{G \in \Delta(\mathbb{R}_{\geq 0};\, \mu)} \left\{ \max_{q \in [0,1]:\ T_G(q) \in [0,1]} R_G(q) \right\} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}. \qquad (15)$$

Consider all cumulative distribution functions $G \in \Delta(\mathbb{R}_{\geq 0}; \mu)$. We first identify two additional constraints on $G$ that do not change the infimum in eq. (15). These constraints enable us to find the worst-case distribution that achieves the infimum value, which establishes the desired result.

Before proceeding, we develop intuition using an illustrative example of the EAFR curve shown in Figure 5(a). In general, if one draws $R_G(q)$ as a function of $q \in [0,1]$ (i.e., in the quantile space), then the slope of the line connecting the point $(0,0)$ to $(q, R_G(q))$ is equal to $T_G(q) = R_G(q)/q$. This slope is monotone non-increasing in $q$ for any CDF $G$ according to Definition 3. Hence, given the EAFR curve $R_G(q)$, the support of the feasible fill rates is equal to $[L, H]$, where $L \triangleq R_G(1)$ and $H \triangleq \min\left\{1, \liminf_{q \to 0} R_G(q)/q\right\}$. The two constraints that we will add below, as stated in Claims 1 and 2, imply that the outer optimization problem in eq. (15) will remain unchanged if we require the EAFR curve to be (i) flat over quantiles corresponding to target fill rates in $[L, 1]$, i.e., over $q \in [Q_G(1), 1]$, and (ii) a straight line over quantiles $q \in [0, Q_G(1)]$. With these two additional constraints, in Claim 3 we find the worst-case CDF, which has an EAFR curve as shown in Figure 5(b).

Claim 1 (Equal EAFR). Adding the constraint $R_G(q) = R_G(q')$, $\forall q, q' \in [Q_G(1), 1]$ to the outer optimization in eq. (15) does not change its infimum value.

We prove Claim 1 by contradiction: we show that for any CDF $G \in \Delta(\mathbb{R}_{\geq 0}; \mu)$, if the above condition does not hold, we can slightly modify $G$ to design a new distribution $\tilde{G} \in \Delta(\mathbb{R}_{\geq 0}; \mu)$ which has an EAFR curve with a lower maximum value. The details are presented in Appendix A.5.1. The above claim readily implies that we can focus on distributions for which the EAFR curve is flat in the interval $[Q_G(1), 1]$. Next, we show that it is without loss to place no probability mass on $v \in (1, +\infty)$. Said differently, the support of inverse demand is $(0, 1] \cup \{+\infty\}$.

Claim 2 (Restricted Support for Inverse Demand). Adding the constraint $G(v) = G(1)$ for all $v \in [1, +\infty)$ and $\lim_{v \to +\infty} G(v) = 1$ to the outer optimization in eq. (15) does not change its infimum value.

We also prove Claim 2 by contradiction: we show that for any CDF $G \in \Delta(\mathbb{R}_{\geq 0}; \mu)$, if there is probability mass on $v \in (1, +\infty)$, we can construct a CDF $\tilde{G} \in \Delta(\mathbb{R}_{\geq 0}; \mu)$ which has an EAFR curve with a lower maximum value by shifting that mass to $+\infty$. The details are presented in Appendix A.5.2. Again, note that this claim implies that we can focus on distributions for which the EAFR curve starts with a straight line up to quantile $Q_G(1)$.

Given the two claims above, the distribution that attains the infimum in eq. (15) must satisfy the two constraints introduced. Figure 5(b) summarizes the effect of these two restrictions on $R_G(q)$.

Claim 3 (Worst-case CDF). For any $\mu \in \mathbb{R}_{\geq 0}$, the distribution $\hat{G}$ given in eq. (14) is the unique distribution in $\Delta(\mathbb{R}_{\geq 0}; \mu)$ satisfying the constraints introduced in Claims 1 and 2. Therefore, this distribution attains the infimum in eq. (15).

We prove Claim 3 in Appendix A.5.3. Since the EAFR curve $R_{\hat{G}}(q)$ has a maximum value of $\hat{q} = \frac{1}{\mu + \sqrt{\mu^2+1}}$, we have shown that the optimal TFR policy always achieves an EAFR of at least $\hat{q}$. This completes the proof of Lemma 2. $\square$

In this section, we study our second notion of fairness, namely, ex-ante fairness. As we did for ex-post fairness, we first establish an upper bound on the ex-ante fairness guarantee achievable by any policy. More importantly, we then show that our PPA policy achieves this worst-case ex-ante fairness bound. The following theorem establishes our matching upper and lower bounds on the ex-ante fairness guarantee.

Theorem 4 (Ex-ante Fairness Guarantee of PPA Achieves Upper Bound). Given a fixed number of agents $n \in \mathbb{N}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, no sequential allocation policy obtains an ex-ante fairness guarantee (see Definition 1) greater than $\kappa_a(\mu, n)$, defined as

$$\kappa_a(\mu, n) = \begin{cases} 1 - \frac{\mu}{4}, & \mu \in [0, 1) \\ \mu\left(1 - \frac{\mu}{4}\right), & \mu \in [1, 2) \\ 1, & \mu \in [2, +\infty). \end{cases} \qquad (16)$$

Further, the PPA policy achieves an ex-ante fairness guarantee of at least $\kappa_a(\mu, n)$.

Like its counterpart for ex-post fairness, $\kappa_a(\mu, n)$ depends on the supply scarcity, $\mu$, and is at its lowest when expected demand equals supply, which highlights the loss due to stochasticity when trying to achieve efficiency and equity ex ante. However, unlike the bound for ex-post fairness, the ex-ante fairness bound is independent of the number of agents. In fact, this bound is identical to the ex-post fairness bound in the single-agent case, i.e., $\kappa_a(\mu, n) = \kappa_p(\mu, 1)$ (which is shown by the dotted line in Figure 4).

For intuition about this relationship, note that one feasible policy is to allocate supply to each agent proportional to their expected demand. Since the ex-ante problem only depends on marginal FRs, this reduces ex-ante fairness to the minimum ex-ante fairness across $n$ single-agent instances (where in each instance, the supply scarcity is $\mu$). In a single-agent instance, ex-ante fairness is equal to ex-post fairness, which implies that any lower bound on single-agent ex-post fairness also serves as a lower bound on ex-ante fairness with $n$ agents. Furthermore, since demands can be perfectly correlated, any single-agent instance can be expressed as an instance with $n$ agents for any $n \in \mathbb{N}$. This implies that any upper bound on single-agent ex-post fairness also serves as an upper bound on ex-ante fairness with $n$ agents.

To prove the upper bound in Theorem 4, we build on the hard instances from the proof of Theorem 1. To prove the lower bound, we use ideas similar to the proof of Theorem 2. We show that when following the PPA policy, the expected FR for each agent is a decreasing and convex function of the ratio of expected remaining demand to remaining supply upon their arrival. We inductively place an upper bound on the ex-ante expected value of that ratio for each agent, which enables us to provide a lower bound on ex-ante fairness. See Appendix A.6 for a detailed proof.
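As a numerical illustration of the single-agent view above, the following sketch evaluates the bound and checks it against one simple instance. It assumes that eq. (16) reads $\kappa_a = 1 - \mu/4$ on $[0,1)$, $\mu(1 - \mu/4)$ on $[1,2)$, and $1$ on $[2, +\infty)$, that supply is normalized to 1, and that an agent with zero demand counts as fully served. The two-point demand distribution on $\{0, 2\}$ is an illustrative choice made here, not necessarily the hard instance used in the proof of Theorem 1; the function names are hypothetical.

```python
def kappa_a(mu: float) -> float:
    """Ex-ante fairness bound, assuming the piecewise form of eq. (16) stated above."""
    if mu < 0:
        raise ValueError("supply scarcity must be non-negative")
    if mu < 1:
        return 1 - mu / 4
    if mu < 2:
        return mu * (1 - mu / 4)
    return 1.0


def single_agent_fairness(demands, probs):
    """Expected fill rate of a lone agent (unit supply), divided by W = min{1, 1/mu}.

    Assumption: a realized demand of at most 1 (including 0) has fill rate 1.
    """
    mu = sum(p * d for d, p in zip(demands, probs))
    expected_fr = sum(
        p * (1.0 if d <= 1 else 1.0 / d) for d, p in zip(demands, probs)
    )
    benchmark = min(1.0, 1.0 / mu)  # deterministic-demand benchmark W
    return expected_fr / benchmark


if __name__ == "__main__":
    for mu in (0.5, 1.0, 1.5, 2.0):
        # Two-point demand: 2 with probability mu/2, otherwise 0 (mean is mu).
        value = single_agent_fairness([0.0, 2.0], [1 - mu / 2, mu / 2])
        print(f"mu={mu}: instance fairness={value:.4f}, kappa_a={kappa_a(mu):.4f}")
```

On this instance the single agent's normalized expected fill rate coincides with the assumed bound for every $\mu \in (0, 2]$, which is consistent with the bound being lowest at $\mu = 1$.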

4. Numerical Results

We complement our theoretical developments with an illustrative case study motivated by the allocative challenges that FEMA faced when rationing COVID-19 medical supplies in the early days of the pandemic in the US. First, we provide background on the sequential and heterogeneous nature of the demand from different states. Then, we study this dynamic rationing problem within our framework and illustrate the effectiveness of our PPA policy by comparing it to its theoretical guarantee, to the optimal TFR policy, and to a DP approach.

The U.S. federal government is equipped with a stockpile of medical resources which can be used to alleviate excessive demands on states' local resources. However, the stockpile is intended to address isolated emergencies in a small number of states, not a nationwide epidemic. Thus, due to COVID-19, the total demand for certain resources was expected to far exceed the federal government's stockpile. Given the uneven spread of COVID-19 toward the outset of the pandemic, different states' needs were expected to realize at different times. Furthermore, the sizes of the demands were likely to be correlated across states. When states with early outbreaks began requesting supplies, FEMA had to determine how much of the federal stockpile they should allocate and how much they should save to meet the projected future needs of other states. (In mid-March, FEMA took control of the stockpile and was tasked with distributing medical supplies.)

To capture the setting faced by FEMA, we draw upon the projections made by the Institute for Health Metrics and Evaluation (IHME) at the University of Washington as of April 1, 2020. These projections were cited by Dr. Deborah Birx (the White House Coronavirus Response Coordinator) as influencing decision-making at that time (Washington Post 2020b). We use each state's projected peak excess demand for ICU beds as a proxy for their demand for medical supplies. The scale and timing of the projected peak demand varied significantly from state to state. To succinctly present the temporal aspect of states' demands, we partition them into four groups based on when their peak demand was projected to occur: April 1st-7th, April 8th-14th, April 15th-21st, and after April 21st. We visualize this partitioning in the map in Figure 6(a).

Next, we describe how we construct the demand distributions for the aforementioned groups. We remark that the IHME's projections only consist of the estimated mean as well as estimated 5th and 95th percentiles for each individual state. Said differently, the model does not specify a joint (or even marginal) distribution for the demand. Consequently, we augment the model by assuming that the aggregate demand in each group $i \in [4]$ is Normally distributed with mean equal to the IHME's total estimated mean for the states belonging to group $i$. The standard deviation for group $i$'s demand distribution is set such that the 5th percentile of demand is equal to the 5th percentile of projected peak excess demands for ICU beds from all states in group $i$. Summary statistics about the demand distributions are presented in Figure 6(b). We highlight the significant heterogeneity in both the mean and the standard deviation across these four groups. Finally, we construct the joint distribution for these four groups by letting all of the pairwise correlation coefficients equal $\rho$, where $\rho \in \{0, 0.25, 0.50, 0.75\}$. The limited number of groups and simple model of correlated demand allow us to compare the PPA policy against a DP solution with manageable computing time, as we discuss in the following subsection.
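The demand model described above can be sketched as follows. The group means and standard deviations are taken from Figure 6(b); everything else is an assumption made for illustration: the groups are taken to be jointly Normal, the common pairwise correlation $\rho \geq 0$ is produced with a one-factor construction (a shared standard Normal shock plus an independent one per group), and negative draws are set to 0. The names `GROUP_MEANS`, `GROUP_STDS`, and `sample_demands` are hypothetical.

```python
import math
import random

# Group-level summary statistics reported in Figure 6(b).
GROUP_MEANS = [15366.0, 2193.0, 9719.0, 3831.0]
GROUP_STDS = [3287.0, 710.0, 3600.0, 1763.0]


def sample_demands(rho: float, rng: random.Random) -> list:
    """Draw one sample path of the four group demands.

    Assumption: z_i = sqrt(rho) * common + sqrt(1 - rho) * own_i, which gives
    unit-variance Normals with pairwise correlation rho for rho in [0, 1].
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this one-factor sketch needs rho in [0, 1]")
    common = rng.gauss(0.0, 1.0)  # shock shared by all groups
    demands = []
    for mean, std in zip(GROUP_MEANS, GROUP_STDS):
        own = rng.gauss(0.0, 1.0)  # group-specific shock
        z = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * own
        demands.append(max(0.0, mean + std * z))  # negative draws are set to 0
    return demands


if __name__ == "__main__":
    rng = random.Random(2020)
    for rho in (0.0, 0.25, 0.50, 0.75):
        path = sample_demands(rho, rng)
        print(rho, [round(d) for d in path])
```

Truncation at 0 slightly shifts the realized means and correlations away from their nominal values; for these parameters the effect is small because every group mean is at least two standard deviations above 0.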

For the setup described above, we numerically evaluate the performance of various policies. In the simulations, we consider three values for supply scarcity (1, 2, and 4) by setting the initial supply relative to the sum of expected demand across the four groups according to the IHME model. Then, fixing the correlation coefficient $\rho \in \{0, 0.25, 0.50, 0.75\}$, we generate 10,000 sample paths for the purpose of Monte-Carlo simulation. (In rare cases, the demand drawn from the described Normal distributions can be negative. In such cases, we assume the demand is 0. In this context, we focus on different levels of positive correlation to capture network effects during the pandemic, e.g., virus transmission across state borders.)

For each sample path, we compute the minimum FR of the following policies: (i) our PPA policy, (ii) the optimal TFR policy (defined in Section 3.4), and (iii) the solution to a DP computed by discretizing the state space, which consists of the remaining supply, the current demand, the minimum FR, and the distribution of future (i.e., remaining) demand given the observed demand thus far. We choose the discretization level such that each DP can be solved in under three hours on Google's Compute Engine.

Group    Mean Demand    Std. Dev.
1        15,366         3287
2        2193           710
3        9719           3600
4        3831           1763

Figure 6 (a) The allocation group for each state, which is based on the projected timing of excess demand for ICU beds. (b) Summary of the demand distribution for each group of states.

In Table 1, we present the average ex-post fairness for the aforementioned policies along with the PPA policy's theoretical guarantee for ex-post fairness (as given in Theorem 2). (We remark that ex-post fairness is well-concentrated around its average value for all of the considered policies.) We make three key observations:

• PPA vs. Guarantee: In all cases, our policy performs more than 33% better than its theoretical guarantee. In addition, in over-demanded settings our policy performs better when the correlation is higher because it leverages the information provided by realized demands. However, when supply is equal to expected demand, the extra information can be outweighed by the strain that correlated demands place on the available supply.

• PPA vs. Optimal TFR: Our PPA policy outperforms the optimal TFR policy by between 7% and 28%. As we would expect, the gap is largest when there is correlation between demands since TFR policies do not take advantage of the information that can be extracted from realized demands. Correlation between demands is likely strong during a nationwide epidemic, implying that our PPA policy is better suited than the optimal TFR policy in such situations.

• PPA vs. DP: The solution to the DP, despite being challenging to compute even in this simple setting, exhibits nearly identical performance to the PPA policy.

Moving beyond the performance comparison, we highlight that the PPA policy can be implemented simply by knowing the expected remaining demand, which was frequently updated by the IHME. In contrast, determining the optimal TFR policy or solving the DP requires full distributional information, which was not provided by the IHME. Further, regarding the DP solution, we remark that our distributional assumptions (four agents and a simple correlation structure across demands) make it possible to solve such a DP with a granular state space in a reasonable amount of time. However, we emphasize that in other settings, particularly those with a more complicated correlation structure, such an approach may not be practical. The solution to the DP also suffers from two additional shortcomings: it achieves consistently worse ex-ante fairness than the PPA policy (by an average of 6%), and its allocation is not transparent. Transparency is particularly important in this setting, as numerous states questioned the allocation procedures implemented by the federal government (see Footnote 1 for one such example). Similar to TFR policies, the PPA policy follows a strategy that can be easily explained to stakeholders.

Supply Scarcity (µ)         1                          2                          4
Demand Correlation (ρ)     0.00  0.25  0.50  0.75     0.00  0.25  0.50  0.75     0.00  0.25  0.50  0.75
PPA Policy                 0.85  0.85  0.85  0.86     0.84  0.85  0.87  0.92     0.84  0.85  0.87  0.91
PPA Guarantee              0.600                      0.625                      0.625
Opt. TFR Policy            0.79  0.76  0.74  0.72     0.79  0.76  0.74  0.72     0.79  0.76  0.74  0.72
DP Solution                0.85  0.85  0.85  0.86     0.85  0.87  0.89  0.92     0.86  0.86  0.89  0.92

Table 1 Ex-post fairness of three policies across 10,000 simulations of the setting described in Figure 6(b), as well as the ex-post fairness guarantee of the PPA policy.
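To make the comparison in Table 1 more concrete, the sketch below computes the minimum FR on a single sample path for two kinds of rules. The first is a non-adaptive TFR rule that offers every agent the same fraction of their demand while supply lasts. The second is a proportional rule in the spirit of PPA: it sets each agent's fill rate to the remaining supply divided by the sum of the current demand and the expected future demand, capped at 1. That formula is an assumption made here for illustration; the exact PPA definition appears earlier in the paper and may differ. The sketch also assumes independent demands, so that expected future demand does not change with realized demands. All function names are hypothetical.

```python
import random


def min_fill_rate_tfr(demands, supply, target):
    """Minimum fill rate when each agent is offered the fixed fraction `target`."""
    remaining, min_fr = supply, 1.0
    for d in demands:
        if d <= 0:
            continue  # assumption: agents with no demand do not affect the minimum
        x = min(target * d, remaining)  # cannot allocate more than what is left
        remaining -= x
        min_fr = min(min_fr, x / d)
    return min_fr


def min_fill_rate_proportional(demands, supply, expected_future):
    """Minimum fill rate under the assumed proportional rule.

    expected_future[i] is the expected total demand of the agents arriving after i.
    """
    remaining, min_fr = supply, 1.0
    for d, future in zip(demands, expected_future):
        if d <= 0:
            continue
        fill_rate = min(1.0, remaining / (d + future))
        remaining -= fill_rate * d
        min_fr = min(min_fr, fill_rate)
    return min_fr


if __name__ == "__main__":
    means = [15366.0, 2193.0, 9719.0, 3831.0]  # Figure 6(b)
    stds = [3287.0, 710.0, 3600.0, 1763.0]
    scarcity = 2.0
    supply = sum(means) / scarcity
    expected_future = [sum(means[i + 1:]) for i in range(len(means))]
    benchmark = min(1.0, 1.0 / scarcity)  # W for deterministic demand
    rng = random.Random(1)
    paths = [[max(0.0, rng.gauss(m, s)) for m, s in zip(means, stds)]
             for _ in range(2000)]
    prop = sum(min_fill_rate_proportional(p, supply, expected_future) for p in paths)
    tfr = sum(min_fill_rate_tfr(p, supply, 0.8 * benchmark) for p in paths)
    print("proportional rule:", prop / len(paths) / benchmark)
    print("TFR at 0.8 * W:   ", tfr / len(paths) / benchmark)
```

The TFR target of $0.8\,W$ in the usage example is arbitrary; finding the optimal target requires the full demand distribution, as noted above.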

5. Concluding Remarks, Extensions, and Future Directions

We conclude the paper by first summarizing our main findings, and then elaborating on a few extensions of our base framework. We finish by listing a few future directions.

In this paper, we initiate the theoretical study of fair dynamic rationing by introducing a simple yet fundamental and well-motivated framework. In a nutshell, we design sequential policies for allocating limited supply to a sequence of arbitrarily correlated demands given an objective which encompasses the dual goals of efficiency and equity. Based on our formalized notions of ex-post and ex-ante fairness, we establish upper bounds on the fairness guarantees achievable by any sequential allocation policy which depend on the supply's scarcity level and the number of demanding agents. More importantly, we show that our simple PPA policy achieves the "best of both worlds" by attaining the upper bound on both the ex-post and ex-ante fairness guarantees. In addition to enjoying optimal fairness guarantees, our PPA policy is practically appealing: it is interpretable as well as computationally efficient since it does not rely on distributional knowledge beyond the conditional first moments.

Our framework lends itself to extensions such as considering generalized objectives and rationing multiple types of resources. More broadly, it serves as a base model for theoretically studying sequential allocation problems with an objective beyond utility maximization, which in turn opens several new research directions. In the rest of this section, we first discuss the aforementioned extensions of our base model and then finish the paper by discussing future directions.

Throughout the paper, we have focused on the minimum FR as a social welfare objective that combines elements of equity and efficiency (i.e., $U(\vec{x}) = \min_{i \in [n]} \{x_i / d_i\}$). However, this social welfare function (which is also known as the Rawlsian social welfare function thanks to the philosophical work of John Rawls (1973)) is only a special case of a more general class of social welfare functions that we call the weighted power mean (WPM) social welfare family of functions. More precisely, this family is parameterized by $\alpha \in [0, +\infty)$ and defined as

$$U_\alpha(\vec{x}) \triangleq \begin{cases} \left( \dfrac{1}{\sum_{i \in [n]} d_i} \sum_{i \in [n]} d_i \left( \dfrac{x_i}{d_i} \right)^{1-\alpha} \right)^{1/(1-\alpha)}, & \alpha \neq 1 \\[2ex] \prod_{i \in [n]} \left( \dfrac{x_i}{d_i} \right)^{d_i / \sum_{i \in [n]} d_i}, & \alpha = 1 \end{cases} \qquad (17)$$

Note that the above is a weighted version of the celebrated power mean functions, introduced in Atkinson et al. (1970), that provides a broad class of social welfare functions which balance equity and efficiency to varying degrees. Having weights proportional to the demands in eq. (17) ensures that equity is measured relative to demand and not simply based on the absolute allocation. (Further, this family of functions has a one-to-one relationship with the $\alpha$-fairness social welfare functions introduced in Mo and Walrand (2000). In fact, the two families of functions are the same up to a transformation via a one-to-one, increasing function, which means that the maximizing vectors are identical for a given $\alpha$.) By varying the parameter $\alpha$ from 0 to $+\infty$, the focus of the planner is shifted from extreme efficiency towards more equitable allocations. When $\alpha = 0$, a utilitarian allocation (i.e., any allocation without waste) maximizes social welfare. In the limit as $\alpha \to 1$, proportional fairness (i.e., a generalization of the Nash bargaining solution) maximizes social welfare. Finally, in the limit as $\alpha \to +\infty$, maximizing the minimum FR maximizes social welfare. In fact, we highlight that the value of this social welfare function exactly approaches our objective in the base model (i.e., the minimum FR, or equivalently, the Rawlsian social welfare function).

For any parameter $\alpha$ (including when $\alpha \to +\infty$, which corresponds to our base model), the optimal policy when demands are deterministic is to allocate the supply proportionally. Such a policy achieves the optimal social welfare of $W$ (defined in eq. (1)). However, the value of the parameter $\alpha$ impacts the optimal policy when demands are stochastic. To study this impact, we naturally generalize our notion of ex-post fairness to WPM social welfare functions, i.e., ex-post fairness is given by $\mathbb{E}_{\vec{d} \sim \vec{F}}[U_\alpha] / W$. We remark that in the limit as $\alpha \to +\infty$, this is equivalent to the notion of ex-post fairness introduced in Section 2. In the following corollary of Theorem 2, we establish that our PPA policy achieves an ex-post fairness guarantee of at least $\kappa_p(\mu, n)$ for any $\alpha \in [0, +\infty)$.

Corollary 1 (PPA's Guarantee for WPM Objectives). Given a fixed number of agents $n \in \mathbb{N}$, supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, and any $\alpha \in [0, +\infty)$, the PPA policy achieves an ex-post fairness guarantee of at least $\kappa_p(\mu, n)$ (defined in eq. (2)) when social welfare is measured by a WPM function (defined in eq. (17)) with parameter $\alpha$.

We prove Corollary 1 in Appendix B.1.
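The behavior of the WPM family across $\alpha$ can be checked numerically. The sketch below assumes that eq. (17) is the demand-weighted power mean of the fill rates $x_i / d_i$ with exponent $1 - \alpha$, and the weighted geometric mean at $\alpha = 1$. The function name `wpm_welfare` and the convention of returning 0 when some fill rate is 0 and $\alpha \geq 1$ (the limiting value) are choices made here for illustration.

```python
import math


def wpm_welfare(allocations, demands, alpha):
    """Weighted power mean of fill rates, assuming the form of eq. (17).

    All demands must be positive; weights are proportional to demand.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    total = sum(demands)
    rates = [x / d for x, d in zip(allocations, demands)]
    weights = [d / total for d in demands]
    if alpha >= 1 and any(r == 0 for r in rates):
        return 0.0  # limiting value when some agent receives nothing
    if alpha == 1:
        # Weighted geometric mean of the fill rates.
        return math.exp(sum(w * math.log(r) for w, r in zip(weights, rates)))
    power = 1.0 - alpha
    return sum(w * r ** power for w, r in zip(weights, rates)) ** (1.0 / power)


if __name__ == "__main__":
    x, d = [1.0, 1.0], [1.0, 3.0]  # fill rates 1 and 1/3
    for alpha in (0.0, 0.5, 1.0, 2.0, 10.0, 200.0):
        print(alpha, wpm_welfare(x, d, alpha))
```

In this example the value at $\alpha = 0$ equals total allocation over total demand, the values are non-increasing in $\alpha$, and for large $\alpha$ they approach the minimum fill rate of $1/3$. A proportional allocation, where every agent has the same fill rate, yields that common fill rate for every $\alpha$.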

Note that in a deterministic setting, maximizing any social welfare objective function as in eq. (17) is a concave maximization. By writing KKT conditions it is not hard to see that any such function attains its maximum at a feasible proportional allocation, i.e., $x_i = \min\left\{ d_i, \frac{d_i}{\sum_{j \in [n]} d_j} \right\}$, under deterministic demands. The maximum is then equal to $W = \min\{1, 1/\mu\}$.

In our base model, we assume that agents have demand for only a single type of resource. However, in many of the motivating applications that we consider, agents may have concurrent demand for multiple types of resources. For example, states may need many different types of medical supplies during the peak of a pandemic.

Our setup readily extends to the sequential allocation of $m$ different resource types where an arriving agent simultaneously demands $m$ types of supply. We allow the demands to be correlated across agents as well as resource types. For the sake of brevity, we refrain from repeating the setup and we simply augment our notation for various quantities (e.g., supply, demand, and allocation) by adding a superscript $j \in [m]$. In this generalized model, we define agent $i$'s utility to be their weighted FR, defined as:

$$\sum_{j \in [m]} \lambda_j \frac{x_i^j}{d_i^j}, \qquad (18)$$

where we normalize the weights $\lambda_j$ to satisfy $\sum_{j \in [m]} \lambda_j = 1$. A simple corollary of Theorem 2, as stated below, ensures that independently following our PPA policy for each resource achieves a lower bound on the expected minimum weighted FR which is a weighted sum of the expected minimum FR guaranteed by the PPA for one resource, i.e., $\frac{\kappa_p(\mu_j, n)}{\max\{1, \mu_j\}}$ for resource $j$.

Corollary 2 (PPA's Guarantee on Expected Minimum Weighted FR). Consider any instance with $n \in \mathbb{N}$ agents and $m \in \mathbb{N}$ resources, where the initial supply for resource $j$ is $s_j \in \mathbb{R}_{\geq 0}$. For any joint demand distribution over all agents and resources $\vec{F} \in \Delta(\mathbb{R}^{n \times m}_{\geq 0})$, independently following the PPA policy for each resource achieves an expected minimum weighted FR (as defined in eq. (18)) of at least $\sum_{j \in [m]} \lambda_j \frac{\kappa_p\left(\mu_j / s_j,\, n\right)}{\max\left\{1,\, \mu_j / s_j\right\}}$, where $\mu_j \in \mathbb{R}_{\geq 0}$ is the expected total demand for resource $j$.

In addition, since demand can be correlated across agents, we can re-use the hard instances of Theorem 1 to construct a joint distribution which establishes an upper bound on the performance of any policy matching the lower bound given in Corollary 2. We state this upper bound as a corollary of Theorem 1 below.

Corollary 3 (Upper Bound on Expected Minimum Weighted FR). For any $n \in \mathbb{N}$ agents, $m \in \mathbb{N}$ resources, and any initial supply for resource $j$ of $s_j \in \mathbb{R}_{\geq 0}$, there exists a joint demand distribution over all agents and resources $\vec{F} \in \Delta(\mathbb{R}^{n \times m}_{\geq 0})$ for which no policy can achieve an expected minimum weighted FR greater than $\sum_{j \in [m]} \lambda_j \frac{\kappa_p\left(\mu_j / s_j,\, n\right)}{\max\left\{1,\, \mu_j / s_j\right\}}$, where $\mu_j \in \mathbb{R}_{\geq 0}$ is the expected total demand for resource $j$.

Together, these two corollaries (which we prove in Appendix B) establish that independently following the PPA policy for each resource $j$ provides the best possible guarantee on the expected minimum weighted FR. Consequently, we can use our PPA policy to shed light on how the social planner can prepare for demand across multiple types of resources. If the initial endowment of different resource types is not exogenously set, then the social planner can solve an outer endowment optimization problem to maximize the guarantee on the expected minimum weighted FR subject to a budget constraint.

We remark that such an endowment optimization problem is a max-max-min problem where the social planner first optimizes the initial endowment across resource types subject to a budget constraint (the outer problem). Then, given the initial endowment, the social planner maximizes over policies the minimum over demand distributions of our objective, i.e., the expected minimum weighted FR among agents. We solve the outer maximization of the multiple resource-type problem by determining the optimal initial endowment across resource types when the social planner independently follows the PPA policy for each resource. To be concrete, suppose the social planner has a fixed budget $B$ that can be used to procure an initial endowment $\vec{s} = (s_1, s_2, \ldots, s_m)$. Further, suppose the per unit cost of resource $j \in [m]$ is $c_j$.
Then, the outer endowment optimization problem can be formulated as follows:

$$\max_{\vec{s} \in \mathbb{R}^m_{\geq 0}} \ \sum_{j \in [m]} \lambda_j \frac{\kappa_p\left(\mu_j / s_j,\, n\right)}{\max\left\{1,\, \mu_j / s_j\right\}} \quad \text{s.t.} \quad B \geq \sum_{j \in [m]} c_j s_j \qquad \text{(Endowment Optimization)}$$

We highlight that to formulate this optimization problem, we crucially use the parameterized characterization of the ex-post fairness guarantee in Theorem 2, as opposed to the worst-case guarantee for any set of parameters. Further, we remark that the above maximization problem is linearly separable and concave in the decision variables $\vec{s}$, meaning that it can be solved efficiently. (It is not difficult to check that the function $\frac{\kappa_p\left(\mu_j / s_j,\, n\right)}{\max\left\{1,\, \mu_j / s_j\right\}}$ is concave in $s_j$ for any choice of $\mu_j$ and $n$; check eq. (2) for a definition of $\kappa_p(\cdot, \cdot)$. We omit this purely algebraic proof for brevity.) Based on Corollaries 2 and 3, the optimal solution $\vec{s}^*$, combined with independently implementing the PPA policy for each resource, is indeed the optimal solution of the max-max-min multiple resource-type problem.

Our paper can be viewed as an analog of the classic prophet inequality problem (Krengel and Sucheston 1978, Samuel-Cahn et al. 1984) for equitably allocating divisible goods. As such, similar to prophet inequalities, many interesting variants of our setting arise. We discussed two such variants above, and for both, we established an achievable lower bound by employing our PPA policy. However, in the former variant, we do not establish a matching upper bound. Establishing tight bounds on the achievable performance in such a setting, which may require the use of a different policy, is an interesting direction for future research. Further, understanding the inefficiency (unused supply) which may occur in sequential allocation due to our focus on an egalitarian objective is a fruitful research direction that we plan to pursue. Finally, here we made no assumption about the correlation structure underlying the demand sequence. It would be compelling to investigate whether including a (well-motivated) correlation structure can result in improved fairness guarantees.

Acknowledgment

The authors would like to thank Itai Ashlagi, Amin Saberi, and Ed Kaplan for helpful comments and insights at early stages of this work.

References

Nikhil Agarwal, Itai Ashlagi, Michael A Rees, Paulo J Somaini, and Daniel C Waldinger. An empirical framework for sequential assignment: The allocation of deceased donor kidneys. Technical report, National Bureau of Economic Research, 2019.

Shipra Agrawal and Nikhil R Devanur. Fast algorithms for online stochastic convex programming. In Proceedings of the twenty-sixth annual ACM-SIAM symposium on Discrete algorithms, pages 1405–1424. SIAM, 2014.

Saeed Alaei. Bayesian combinatorial auctions: Expanding single buyer mechanisms to many buyers. SIAM Journal on Computing, 43(2):930–972, 2014.

Games and Economic Behavior, 118:494–510, 2019.

Jerry Anunrojwong, Krishnamurthy Iyer, and Vahideh Manshadi. Information design for congested social services: Optimal need-based persuasion. arXiv preprint arXiv:2005.07253, 2020.

Nick Arnosti and Peng Shi. Design of lotteries and wait-lists for affordable housing allocation. Management Science, 66(6):2291–2307, 2020. doi: 10.1287/mnsc.2019.3311.

Kenneth Joseph Arrow. Social Choice and Individual Values. Yale University Press, 1963.

Itai Ashlagi, Patrick Jaillet, and Vahideh H Manshadi. Kidney exchange in dynamic sparse heterogenous pools. arXiv preprint arXiv:1301.3509, 2013.

Itai Ashlagi, Maximilien Burq, Patrick Jaillet, and Vahideh Manshadi. On matching and thickness in heterogeneous dynamic markets. Operations Research, 67(4):927–949, 2019.

Anthony B Atkinson et al. On the measurement of inequality. Journal of Economic Theory, 2(3):244–263, 1970.

Yossi Azar, Niv Buchbinder, and Kamal Jain. How to allocate goods in an online market? In European Symposium on Algorithms, pages 51–62. Springer, 2010.

Santiago Balseiro, Haihao Lu, and Vahab Mirrokni. The best of many worlds: Dual mirror descent for online allocation problems. arXiv preprint arXiv:2011.10124, 2020.

Santiago R Balseiro and David B Brown. Approximations to stochastic dynamic programs via information relaxation duality. Operations Research, 67(2):577–597, 2019.

Mohammad Hossein Bateni, Yiwei Chen, Dragos Florin Ciocan, and Vahab Mirrokni. Fair resource allocation in a volatile marketplace. In Proceedings of the 2016 ACM Conference on Economics and Computation, pages 819–819, 2016.

Dimitris Bertsimas, Vivek F Farias, and Nikolaos Trichakis. The price of fairness. Operations Research, 59(1):17–31, 2011.

Dimitris Bertsimas, Vivek F Farias, and Nikolaos Trichakis. On the efficiency-fairness trade-off. Management Science, 58(12):2234–2250, 2012.

Dimitris Bertsimas, Vivek F Farias, and Nikolaos Trichakis. Fairness, efficiency, and flexibility in organ allocation for kidney transplantation. Operations Research, 61(1):73–87, 2013.

William Cai, Johann Gaebler, Nikhil Garg, and Sharad Goel. Fair allocation through selective information acquisition. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 22–28, 2020.

Andre P Calmon, Florin D Ciocan, and Gonzalo Romero. Revenue management with repeated customer interactions. Management Science, 2020.

Semih Cayci, Swati Gupta, and Atilla Eryilmaz. Group-fair online allocation in continuous time. Advances in Neural Information Processing Systems, 33, 2020.

Bhaskar Ray Chaudhury, Jugal Garg, and Kurt Mehlhorn. EFX exists for three agents. In Proceedings of the 21st ACM Conference on Economics and Computation, pages 1–19, 2020.

Dragos Florin Ciocan and Vivek Farias. Model predictive control for dynamic resource allocation. Mathematics of Operations Research, 37(3):501–525, 2012.

Maxime Cohen, Adam N Elmachtoub, and Xiao Lei. Price discrimination with fairness constraints. Available at SSRN 3459289, 2019.

Kate Donahue and Jon Kleinberg. Fairness and utilization in allocating resources with uncertain demand. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 658–668, 2020.

Ohad Eisenhandler and Michal Tzur. The humanitarian pickup and distribution problem. Operations Research, 67(1):10–32, 2019.

Ezekiel J. Emanuel, Govind Persad, Ross Upshur, Beatriz Thome, Michael Parker, Aaron Glickman, Cathy Zhang, Connor Boyle, Maxwell Smith, and James P. Phillips. Fair allocation of scarce medical resources in the time of covid-19. New England Journal of Medicine, 382(21):2049–2055, 2020. doi: 10.1056/NEJMsb2005114. URL https://doi.org/10.1056/NEJMsb2005114.

Moran Feldman, Ola Svensson, and Rico Zenklusen. Online contention resolution schemes. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1014–1033. SIAM, 2016.

Rupert Freeman, Nisarg Shah, and Rohit Vaish. Best of both worlds: Ex-ante and ex-post fairness in resource allocation. arXiv preprint arXiv:2005.14122, 2020.

Aram Grigoryan. Effective, fair and equitable pandemic rationing. Available at SSRN 3646539, 2020.

Swati Gupta and Vijay Kamble. Individual fairness in hindsight. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 805–806, 2019.

Jason D Hartline. Mechanism design and approximation. Book draft. October, 122, 2013.

Nicole Immorlica, Sahil Singla, and Bo Waggoner. Prophet inequalities with linear correlations and augmentations. In

Proceedings of the 21st ACM Conference on Economics and Computation , pages 159–185,2020.Stefanus Jasin and Sunil Kumar. A re-solving heuristic with bounded revenue loss for network revenuemanagement with customer choice.

Mathematics of Operations Research , 37(2):313–345, 2012.Jiashuo Jiang, Shixin Wang, and Jiawei Zhang. Achieving high individual service-levels without safety stock?optimal rationing policy of pooled resources.

Optimal Rationing Policy of Pooled Resources (May 2,2019). NYU Stern School of Business , 2019.Edward H Kaplan.

Managing the Demand for Public Housing . PhD thesis, Massachusetts Institute ofTechnology, 1984.Ulrich Krengel and Louis Sucheston. On semiamarts, amarts, and processes with ﬁnite value.

Probability onBanach spaces , 4:197–266, 1978.Euiwoong Lee and Sahil Singla. Optimal online contention resolution schemes via ex-ante prophet inequal-ities. In . Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.Jacob D. Leshno. Dynamic matching in overloaded waiting lists. Technical report, SSRN:2967011, 2017.URL https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2967011 .Retsef Levi, Elisabeth Paulson, and Georgia Perakis. Optimal interventions for increasing healthy foodconsumption among low income households.

Available at SSRN 3486292 , 2019. anshadi, Niazadeh, Rodilitz:

Fair Dynamic Rationing

Operations Research , 62(2):301–317, 2014.Brendan Lucier. An economic view of prophet inequalities.

ACM SIGecom Exchanges , 16(1):24–47, 2017.Will Ma and Pan Xu. Group-level fairness maximization in online bipartite matching. arXiv preprintarXiv:2011.13908 , 2020.Aranyak Mehta. Online matching and ad allocation.

Theoretical Computer Science , 8(4):265–368, 2012.Jeonghoon Mo and Jean Walrand. Fair end-to-end window-based congestion control.

IEEE/ACM Transac-tions on networking , 8(5):556–567, 2000.Vedant Nanda, Pan Xu, Karthik Abhinav Sankararaman, John Dickerson, and Aravind Srinivasan. Balancingthe tradeoﬀ between proﬁt and fairness in rideshare platforms during high-demand hours. In

Proceedingsof the AAAI Conference on Artiﬁcial Intelligence , volume 34, pages 2210–2217, 2020.Irem Sengul Orgut, Julie S Ivy, Reha Uzsoy, and Charlie Hale. Robust optimization approaches for theequitable and eﬀective distribution of donated food.

European Journal of Operational Research , 269(2):516–531, 2018.Christos H Papadimitriou and John N Tsitsiklis. The complexity of markov decision processes.

Mathematicsof operations research , 12(3):441–450, 1987.Parag A Pathak, Tayfun S¨onmez, M Utku ¨Unver, and M Bumin Yenmez. Fair allocation of vaccines,ventilators and antiviral treatments: leaving no ethical value behind in health care rationing. arXivpreprint arXiv:2008.00374 , 2020.John Rawls. A theory of justice, 1973.Yosef Rinott and Ester Samuel-Cahn. Optimal stopping values and prophet inequalities for some dependentrandom variables.

Lecture Notes-Monograph Series , 22:343–358, 1992. ISSN 07492170. URL .Ester Samuel-Cahn et al. Comparison of threshold stop rules and maximum for independent nonnegativerandom variables. the Annals of Probability , 12(4):1213–1216, 1984.Sean R Sinclair, Gauri Jain, Siddhartha Banerjee, and Christina Lee Yu. Sequential fair allocation of limitedresources under stochastic demands. arXiv preprint arXiv:2011.14382 , 2020.Senay Solak, Christina Scherrer, and Ahmed Ghoniem. The stop-and-drop problem in nonproﬁt food distri-bution networks.

Annals of Operations Research , 221(1):407–426, 2014.Van-Anh Truong and Xinshang Wang. Prophet inequality with correlated arrival probabilities, with appli-cation to two sided matchings. arXiv preprint arXiv:1901.02552 , 2019.Yanyan Wang, Vicki M Bier, and Baiqing Sun. Measuring and achieving equity in multiperiod emergencymaterial allocation.

Risk Analysis , 39(11):2408–2426, 2019.Washington Post. Desperate for medical equipment, states encounter a beleaguered nationalstockpile, 2020a. URL .8 Manshadi, Niazadeh, Rodilitz:

Fair Dynamic Rationing

Washington Post. Desperate for medical equipment, states encounter a beleaguerednational stockpile, 2020b. URL . anshadi, Niazadeh, Rodilitz: Fair Dynamic Rationing Appendix

A. Missing Proofs and Discussions of Section 3

A.1. Proof of Theorem 1 (Section 3.1)

We prove the theorem by considering two separate cases corresponding to the over-demanded regime ($\mu \geq 1 + \frac{1}{n}$) and the under-demanded regime ($\mu < 1 + \frac{1}{n}$). For each regime, we provide an instance of the problem under which no sequential allocation policy obtains ex-post fairness larger than $\kappa_p(\mu, n)$ restricted to that regime.

Over-demanded regime ($\mu \geq 1 + \frac{1}{n}$): Consider an instance with $n$ equally likely scenarios, where in scenario $\sigma$ all agents $i \in [\sigma]$ have demand $d_i = \frac{2\mu}{n+1}$. This instance is depicted in Figure 2(a). We remark that the total expected demand is equal to $\mu$, simply because

$$\sum_{\sigma \in [n]} \frac{1}{n} \cdot \frac{2\mu}{n+1} \cdot \sigma = \mu.$$

In such a setting, whenever agent $i$ has non-zero demand, every agent $j$ where $j < i$ must also have had non-zero demand. Since the policy cannot distinguish among scenarios $i, i+1, \ldots, n$, its allocation decision must be independent of the scenario. Therefore, any policy can be described by a set of allocations $\vec{y} = (y_1, y_2, \ldots, y_n)$, such that if agent $i$ has non-zero demand, then they receive an expected allocation $y_i$. Furthermore, when making the allocation decision for agent $n$, there is only one possible history: every other agent also had non-zero demand. Thus, any feasible sequential allocation policy must respect the constraint $\sum_{i \in [n]} y_i \leq 1$. We define $r_\sigma$ as the expected minimum FR of the given policy in scenario $\sigma$ (i.e., if only the first $\sigma$ agents have non-zero demand). By convention, we set $r_0 = 1$, and we must have $r_\sigma \leq r_{\sigma-1}$ by definition. In addition, $r_\sigma$ can be at most the expected FR of agent $\sigma$, so $r_\sigma \leq \frac{n+1}{2\mu} y_\sigma$. Given $\vec{r}$, the expected minimum FR in this instance is equal to $\frac{1}{n} \sum_{\sigma \in [n]} r_\sigma$. Based on these constraints and the objective, we can formulate a linear program whose optimal solution is an upper bound on the expected minimum FR achievable by any feasible sequential allocation policy. This linear program was originally presented in Section 3.1, but we replicate it here (Primal-LP1) along with its dual program (Dual-LP1).

$$
\begin{aligned}
\text{(Primal-LP1)} \qquad \max_{\vec{y}, \vec{r} \in \mathbb{R}^n_{\geq 0}} \quad & \frac{1}{n} \sum_{\sigma \in [n]} r_\sigma \\
\text{s.t.} \quad & r_\sigma \leq \frac{(n+1)\, y_\sigma}{2\mu}, && \sigma \in [n], \\
& r_\sigma \leq r_{\sigma-1}, \quad r_0 = 1, && \sigma \in [n], \\
& \sum_{i \in [n]} y_i \leq 1.
\end{aligned}
$$

$$
\begin{aligned}
\text{(Dual-LP1)} \qquad \min_{\vec{\gamma}, \vec{\delta} \in \mathbb{R}^n_{\geq 0},\ \omega \in \mathbb{R}_{\geq 0}} \quad & \omega + \gamma_1 \\
\text{s.t.} \quad & \omega \geq \frac{n+1}{2\mu}\, \delta_i, && i \in [n], \\
& \delta_i \geq \frac{1}{n} + \gamma_{i+1} - \gamma_i, && i \in [n-1], \\
& \delta_n \geq \frac{1}{n} - \gamma_n.
\end{aligned}
$$

Figure 7: The instance which establishes an upper bound of $\kappa_p(\mu, n)$ when $\mu < 1 + \frac{1}{n}$ (which we refer to as the under-demanded regime).

To upper-bound the value of the program Primal-LP1, we find a feasible assignment for the dual program Dual-LP1. Consider the assignment where $\delta_i = \frac{1}{n}$ and $\gamma_i = 0$ for all $i \in [n]$, and where $\omega = \frac{n+1}{2n\mu}$. Under this assignment, all dual variables are non-negative and all constraints are satisfied (in fact, all are tight). Thus, this assignment is feasible in Dual-LP1. It also attains an objective value of $\frac{n+1}{2n\mu}$. By weak duality, this represents an upper bound on the optimal value of Primal-LP1, and hence an upper bound on the expected minimum FR of any policy in the over-demanded regime.
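As a sanity check on the over-demanded argument, the following minimal sketch verifies the dual assignment $\delta_i = \frac{1}{n}$, $\gamma_i = 0$, $\omega = \frac{n+1}{2n\mu}$ for Dual-LP1 in exact arithmetic, using the constraint coefficient $\frac{n+1}{2\mu}$ implied by per-agent demand $\frac{2\mu}{n+1}$. The function names and the primal test point $y_i = \frac{1}{n}$ are illustrative assumptions and are not part of the original proof; the primal point is included only to show that, for the sampled parameters, a feasible primal solution attains the same value as the dual assignment.

```python
from fractions import Fraction


def dual_lp1_assignment(n, mu):
    """Return (feasible, objective) for delta_i = 1/n, gamma_i = 0, omega = (n+1)/(2*n*mu)."""
    mu = Fraction(mu)
    delta = [Fraction(1, n)] * n
    gamma = [Fraction(0)] * n
    omega = Fraction(n + 1, 2 * n) / mu
    # omega >= (n+1)/(2*mu) * delta_i for every i in [n]
    c1 = all(omega >= Fraction(n + 1, 2) / mu * d for d in delta)
    # delta_i >= 1/n + gamma_{i+1} - gamma_i for i in [n-1]
    c2 = all(delta[i] >= Fraction(1, n) + gamma[i + 1] - gamma[i] for i in range(n - 1))
    # delta_n >= 1/n - gamma_n
    c3 = delta[-1] >= Fraction(1, n) - gamma[-1]
    return c1 and c2 and c3, omega + gamma[0]


def primal_lp1_equal_split(n, mu):
    """Objective of Primal-LP1 at the illustrative (assumed) point y_i = 1/n."""
    mu = Fraction(mu)
    y = [Fraction(1, n)] * n  # uses the whole unit supply: sum(y) == 1
    r_prev, total = Fraction(1), Fraction(0)  # convention r_0 = 1
    for y_sigma in y:
        # largest r_sigma allowed by r_sigma <= r_{sigma-1} and r_sigma <= (n+1) y_sigma / (2 mu)
        r_sigma = min(r_prev, Fraction(n + 1, 2) / mu * y_sigma)
        total += r_sigma
        r_prev = r_sigma
    return total / n


if __name__ == "__main__":
    n, mu = 4, Fraction(3)  # over-demanded: mu >= 1 + 1/n
    print(dual_lp1_assignment(n, mu))
    print(primal_lp1_equal_split(n, mu))
```

For the sampled values $n = 4$ and $\mu = 3$, both functions report the value $\frac{n+1}{2n\mu} = \frac{5}{24}$, which is consistent with the weak-duality bound above being attained on this instance.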

Under-demanded regime ($\mu < 1 + \frac{1}{n}$): Consider an instance with $n + 1$ scenarios, where the first $n$ scenarios each occur with equal probability of $\frac{\mu}{n+1}$ and scenario $n + 1$ occurs with probability $1 - \frac{n\mu}{n+1}$. In scenario $n + 1$, there is no demand. In scenario $\sigma$ for $\sigma \in [n]$, all agents $i \in [\sigma]$ have demand $d_i = \frac{2}{n}$. This instance is depicted in Figure 7. We remark that the total expected demand is equal to $\mu$, simply because

$$\sum_{\sigma \in [n]} \frac{\mu}{n+1} \cdot \frac{2}{n} \cdot \sigma = \mu.$$

As was the case in the over-demanded regime, any sequential allocation policy can be described by a set of allocation decisions such that if agent $i$ has non-zero demand, then they receive an expected allocation $y_i$. We again define $r_\sigma$ as the expected minimum FR in scenario $\sigma$, and we note that $r_{n+1} = 1$. Thus, the expected minimum FR in this instance is equal to $\frac{\mu}{n+1} \sum_{\sigma \in [n]} r_\sigma + 1 - \frac{n\mu}{n+1}$. By imposing the constraints described for Primal-LP1, we can formulate a slightly different linear program whose optimal solution is an upper bound on the expected minimum FR of any feasible sequential allocation policy in the under-demanded regime. This linear program (Primal-LP2), along with its dual program (Dual-LP2), is presented below.

$$
\begin{aligned}
\text{(Primal-LP2)} \qquad \max_{\vec{y}, \vec{r} \in \mathbb{R}^n_{\geq 0}} \quad & \frac{\mu}{n+1} \sum_{\sigma \in [n]} r_\sigma + 1 - \frac{n\mu}{n+1} \\
\text{s.t.} \quad & r_\sigma \leq \frac{n\, y_\sigma}{2}, && \sigma \in [n], \\
& r_\sigma \leq r_{\sigma-1}, \quad r_0 = 1, && \sigma \in [n], \\
& \sum_{i \in [n]} y_i \leq 1.
\end{aligned}
$$

$$
\begin{aligned}
\text{(Dual-LP2)} \qquad \min_{\vec{\gamma}, \vec{\delta} \in \mathbb{R}^n_{\geq 0},\ \omega \in \mathbb{R}_{\geq 0}} \quad & \omega + \gamma_1 + 1 - \frac{n\mu}{n+1} \\
\text{s.t.} \quad & \omega \geq \frac{n}{2}\, \delta_i, && i \in [n], \\
& \delta_i \geq \frac{\mu}{n+1} + \gamma_{i+1} - \gamma_i, && i \in [n-1], \\
& \delta_n \geq \frac{\mu}{n+1} - \gamma_n.
\end{aligned}
$$

To upper-bound the value of the program Primal-LP2, we find a feasible assignment for its dual. Consider $\delta_i = \frac{\mu}{n+1}$ and $\gamma_i = 0$ for all $i \in [n]$, and $\omega = \frac{n\mu}{2(n+1)}$. Under this dual assignment, all dual variables are non-negative and all constraints are satisfied (and again tight). Thus, this assignment is feasible in Dual-LP2. It also attains an objective value of $\frac{n\mu}{2(n+1)} + 1 - \frac{n\mu}{n+1} = 1 - \frac{n\mu}{2(n+1)}$. By weak duality, this represents an upper bound on the optimal value of Primal-LP2, and hence an upper bound on the expected minimum FR attainable by any sequential allocation policy when $\mu < 1 + \frac{1}{n}$.

We conclude the proof by scaling the obtained upper bounds on the expected minimum FR by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$. This establishes an upper bound of $\kappa_p(\mu, n)$ on the ex-post fairness guarantee (see Definition 1) achievable by any sequential allocation policy. □

A.2. Example Illustrating Limitations of Dynamic Programming (Section 3.1)

To highlight some of the shortcomings of a DP approach, we will use the following example, which is in fact a perturbed version of a particular instance of the class of hard examples illustrated in Figure 2(a).

Example 1. Consider an instance with two agents ($n = 2$) where expected demand is almost twice the amount of supply ($\mu = 2 + \epsilon$ for small $\epsilon > 0$). The demand vector is either $(\frac{4}{3} + \epsilon, \frac{4}{3})$ or $(\frac{4}{3} + \epsilon, 0)$ with equal probabilities, that is, the first agent has deterministic demand $d_1 = \frac{4}{3} + \epsilon$ and the second agent either has no demand or has demand $d_2 = \frac{4}{3}$.

Given the objective of maximizing the expected minimum FR in such a setting, it is not hard to see that the optimal online policy (which is the solution of the ex-post DP) simply allocates $\frac{4 + 3\epsilon}{8 + 3\epsilon}$ of the supply to the first agent and the remaining supply to the second agent, assuming that agent's demand is realized. This achieves an expected minimum FR of $\frac{3}{8 + 3\epsilon}$. Note that the minimum expected FR is no higher than the expected minimum FR, as the FR of the first agent is always the minimum FR when following this DP solution. We now make the following two observations:

(i) The DP solution achieves a sub-optimal minimum expected FR. As mentioned earlier, the minimum expected FR of the DP solution is $\frac{3}{8 + 3\epsilon}$. Since $\mu = 2 + \epsilon$, this corresponds to ex-ante (and ex-post) fairness of $\frac{6 + 3\epsilon}{8 + 3\epsilon}$. We highlight that the ex-ante fairness of the DP solution is less than the ex-ante fairness (and the ex-ante fairness guarantee) of our PPA policy on this instance (for small enough $\epsilon$) by a constant multiplicative factor of $\approx \frac{3}{4}$. Note that the PPA policy targets an FR of $\approx \frac{2/3}{4/3} = \frac{1}{2}$ for the first agent, resulting in the second agent having a higher expected FR of $\approx \frac{1/3}{4/3} \times \frac{1}{2} + 1 \times \frac{1}{2} = \frac{5}{8}$. So, the PPA policy achieves a minimum expected FR of $\approx \frac{1}{2}$ and an ex-ante fairness of $\approx 1$, which for small $\epsilon$ matches the ex-ante fairness guarantee of the PPA policy established in Theorem 4 when $\mu = 2 + \epsilon$.

(ii) The DP solution is sensitive to small perturbations in agents' demand distributions. To see this, we show that the above DP solution can vary significantly if the demand distribution is slightly different. Suppose that we perturb this instance by decreasing the first agent's demand by $2\epsilon$. In this case, the DP solution is to allocate all of the supply to the first agent. The expected minimum FR is essentially unchanged, but the first agent receives almost twice as much supply as before. Not only does this demonstrate that the DP solution is highly sensitive, but it also highlights that the DP solution can suffer from a lack of interpretability: the first agent receives more supply in this case, even though the only change was a deterministic and negligible decrease in that agent's demand.

A.3. Simple Backward Dynamic Programming for Optimum Oﬄine (Section 3.2)

Suppose the planner has access to all the demand realizations $\vec{d}$, and let $f_i$ be the minimum FR of the policy at the end of period $i - 1$, i.e., $f_1 = 1$ and $f_i = \min\left\{f_{i-1}, \frac{x_{i-1}}{d_{i-1}}\right\}$ for $i \in [2 : n+1]$. We will show via backward induction that for any remaining supply $s_i$ and minimum FR $f_i$, the maximum-achievable minimum FR is $\min\left\{f_i, \frac{s_i}{\sum_{j \in [i:n]} d_j}\right\}$, which is achieved by a policy $x_i^* = \min\left\{d_i, \frac{s_i d_i}{\sum_{j \in [i:n]} d_j}\right\}$.

Clearly, this is true when $i = n$, as the optimal policy is to allocate as much supply as possible, i.e., $x_n^* = \min\{d_n, s_n\}$. This policy achieves a minimum FR of

$$f_{n+1} = \min\left\{f_n, \frac{x_n}{d_n}\right\} = \min\left\{f_n, \frac{\min\{d_n, s_n\}}{d_n}\right\} = \min\left\{f_n, \frac{s_n}{d_n}\right\}.$$

We now assume this is true for $i > k$. In that case, given an allocation to agent $k$ of $x_k$, the minimum FR at the end of period $k$ is given by $f_{k+1} = \min\left\{f_k, \frac{x_k}{d_k}\right\}$ and the remaining supply is $s_{k+1} = s_k - x_k$. Based on our inductive hypothesis, the maximum-achievable minimum FR is thus

$$\min\left\{f_{k+1}, \frac{s_{k+1}}{\sum_{j \in [k+1:n]} d_j}\right\} = \min\left\{f_k, \frac{x_k}{d_k}, \frac{s_k - x_k}{\sum_{j \in [k+1:n]} d_j}\right\}.$$

This is maximized when the second and third terms are equal, which occurs when $x_k = \frac{s_k d_k}{\sum_{j \in [k:n]} d_j}$. If this allocation is infeasible (i.e., if $\frac{s_k d_k}{\sum_{j \in [k:n]} d_j} \geq d_k$), then an allocation of $x_k = d_k$ is optimal because this allocation ensures that $f_k$ must be the minimum of the three terms. This completes the proof by backward induction that the maximum-achievable minimum FR is $\min\left\{f_i, \frac{s_i}{\sum_{j \in [i:n]} d_j}\right\}$, which is achieved by a policy $x_i^* = \min\left\{d_i, \frac{s_i d_i}{\sum_{j \in [i:n]} d_j}\right\}$.

A.4. Analysis of Non-adaptive Fixed-allocation Policies (Section 3.4)

In this appendix section, we consider another class of non-adaptive policies which we call fixed-allocation policies. A fixed-allocation policy is one which pre-determines an allocation $x_i$ for each agent $i \in [n]$. The optimal fixed-allocation policy is the policy which, given a joint demand distribution $\vec{F}$, pre-determines a vector of allocations $\vec{x} = (x_1, x_2, \ldots, x_n)$ which maximizes $\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\min_{i \in [n]} \left\{\frac{x_i}{d_i}\right\}\right]$. As explained in Footnote 18, we focus on deterministic policies without loss of generality.

Proposition 1 (Ex-post Fairness Guarantee of the Optimal Fixed-allocation Policy). Given a fixed number of agents $n \in \mathbb{N}$ and supply scarcity $\mu \in \mathbb{R}_{\geq 0}$, the optimal fixed-allocation policy achieves an ex-post fairness guarantee of

$$\kappa_{fa}(\mu, n) = \max\{1, \mu\} \cdot \begin{cases} 1 - \frac{n\mu}{4}, & n\mu \in [0, 2) \\ \frac{1}{n\mu}, & n\mu \in [2, +\infty). \end{cases} \tag{19}$$

We remark that the ex-post fairness guarantee $\kappa_{fa}(\mu, n)$ tends to 0 as the number of agents $n$ gets large. This is in stark contrast to the guarantees provided by the PPA policy and the optimal TFR policy, which are lower-bounded by a constant regardless of the number of agents.

A.4.1. Proof of Proposition 1

We prove this proposition by first showing that there exists a distribution where no fixed-allocation policy achieves ex-post fairness greater than $\kappa_{fa}(\mu, n)$, which thus serves as an upper bound on the ex-post fairness guarantee of the optimal fixed-allocation policy. We then show that there exists a fixed-allocation policy that achieves ex-post fairness of at least $\kappa_{fa}(\mu, n)$ for any demand distribution, which means the bound $\kappa_{fa}(\mu, n)$ is tight.

Upper bound: We prove the hardness result by considering two separate cases corresponding to $n\mu < 2$ and $n\mu \geq 2$. For each case, we provide an instance of the problem under which any fixed-allocation policy obtains ex-post fairness no larger than $\kappa_{fa}(\mu, n)$.

(i) If $n\mu < 2$, consider a joint demand distribution such that with probability $1 - \frac{n\mu}{2}$ there is no demand, and with probability $\frac{n\mu}{2}$ one agent chosen uniformly at random has demand $\frac{2}{n}$. In this case, with probability $1 - \frac{n\mu}{2}$, the minimum FR is 1, and with probability $\frac{n\mu}{2}$, the minimum FR is equal to the allocation of a randomly selected agent (which is at most $1/n$ in expectation) divided by the total demand $2/n$. Therefore, the expected minimum FR for this instance is upper-bounded by $1 - \frac{n\mu}{2} + \frac{n\mu}{2} \cdot \frac{1}{2} = 1 - \frac{n\mu}{4}$.

(ii) If $n\mu \geq 2$, consider a joint demand distribution where one agent chosen uniformly at random has demand equal to the expected total demand $\mu$. In this instance, the expected minimum FR is upper-bounded by the allocation of a randomly selected agent (which is at most $1/n$ in expectation) divided by the total demand $\mu$, i.e., by $\frac{1}{n\mu}$.

Taken together, these instances provide an upper bound on the expected minimum FR one can hope to achieve with a fixed-allocation policy. We then scale each instance by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$, which provides an upper bound of $\kappa_{fa}(\mu, n)$ on the ex-post fairness guarantee (see Definition 1) of any fixed-allocation policy.

Lower bound: Consider a policy which allocates an equal amount of supply to each agent, i.e., $x_i = \frac{1}{n}$ for all $i \in [n]$. In that case, the minimum FR is lower-bounded by $\min\left\{1, \frac{1}{n \sum_{i \in [n]} d_i}\right\}$. Further,

$$\min\left\{1, \frac{1}{n \sum_{i \in [n]} d_i}\right\} \geq \begin{cases} 1 - \frac{n \sum_{i \in [n]} d_i}{4}, & n \sum_{i \in [n]} d_i \in [0, 2) \\ \frac{1}{n \sum_{i \in [n]} d_i}, & n \sum_{i \in [n]} d_i \in [2, +\infty). \end{cases}$$

We note that the right-hand side of the above inequality is convex in $\sum_{i \in [n]} d_i$. Therefore, using Jensen's inequality, the expected minimum FR must be at least

$$\begin{cases} 1 - \frac{n\mu}{4}, & n\mu \in [0, 2) \\ \frac{1}{n\mu}, & n\mu \in [2, +\infty). \end{cases}$$

We then scale this lower bound on the expected minimum FR by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$. This provides a lower bound on the ex-post fairness guarantee (see Definition 1) that is equal to $\kappa_{fa}(\mu, n)$. Thus, we have shown that $\kappa_{fa}(\mu, n)$ is a tight bound on the ex-post fairness guarantee of the optimal fixed-allocation policy. □
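The two hard instances in the upper-bound argument for Proposition 1 are simple enough to evaluate exactly. The following minimal sketch assumes unit supply, $\mu > 0$, and the convention that agents with zero demand have FR 1; the function names are hypothetical. It computes $\kappa_{fa}(\mu, n) = \max\{1, \mu\}\,(1 - \frac{n\mu}{4})$ for $n\mu < 2$ and $\max\{1, \mu\}\,\frac{1}{n\mu}$ otherwise, along with the ex-post fairness of an arbitrary fixed allocation on the corresponding hard instance.

```python
from fractions import Fraction


def kappa_fa(mu, n):
    """Ex-post fairness guarantee stated in Proposition 1, eq. (19)."""
    mu = Fraction(mu)
    bound = 1 - n * mu / 4 if n * mu < 2 else 1 / (n * mu)
    return max(Fraction(1), mu) * bound


def hard_instance_fairness(x, mu):
    """Ex-post fairness of the fixed allocation x on the hard instance of the upper-bound argument."""
    mu = Fraction(mu)
    n = len(x)
    if n * mu < 2:
        p, demand = n * mu / 2, Fraction(2, n)  # case (i): demand 2/n with probability n*mu/2
    else:
        p, demand = Fraction(1), mu             # case (ii): demand mu with probability 1
    # One uniformly random agent carries the demand; every other agent has FR 1.
    avg_fr = sum(min(Fraction(1), Fraction(xi) / demand) for xi in x) / n
    expected_min_fr = (1 - p) + p * avg_fr
    return expected_min_fr / min(Fraction(1), 1 / mu)  # divide by W = min{1, 1/mu}


if __name__ == "__main__":
    equal_split = [Fraction(1, 4)] * 4
    for mu in (Fraction(1, 4), Fraction(2)):
        print(mu, hard_instance_fairness(equal_split, mu), kappa_fa(mu, 4))
```

On both sampled parameter pairs, the equal split $x_i = \frac{1}{n}$ attains exactly $\kappa_{fa}(\mu, n)$ on the hard instance, while a concentrated allocation such as $(1, 0, 0, 0)$ does strictly worse for $\mu = \frac{1}{4}$.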

A.5. Proof of Claims in Lemma 2 (Section 3.4)

In this section, we present the proofs of the three claims which appear in the proof of Lemma 2.

A.5.1. Proof of Claim 1

Suppose we have a distribution $G$ such that $R_G(q)$ is not flat in $(Q_G(1), 1]$, i.e., $\exists\, q_1, q_2 \in (Q_G(1), 1]$ such that $R_G(q_1) \neq R_G(q_2)$. Let $R_G(q)$ attain its maximum in $[Q_G(1), 1]$ at the quantile $\bar{q}$. Now consider a quantile $q' \in (Q_G(1), 1]$ so that $R_G(q') < R_G(\bar{q})$. Let $\delta \triangleq q' - Q_G(T_G(q') + \epsilon)$ be the total probability mass in the TFR interval $(T_G(q'), T_G(q') + \epsilon]$. Then pick a small enough $\epsilon > 0$ such that:

(i) $\epsilon + \delta + \epsilon \cdot \delta < R_G(\bar{q}) - R_G(q')$,
(ii) $\epsilon \leq 1 - T_G(q')$,
(iii) $\bar{q} \notin [Q_G(T_G(q') + \epsilon), q']$.

Now consider a distribution $\bar{G}$ that is generated from $G$ by moving all the $\delta$ probability mass in $(T_G(q'), T_G(q') + \epsilon]$ to the point $T_G(q') + \epsilon$. With this modification in the distribution, the EAFR of every TFR $\tau \in [0, T_G(q')] \cup (T_G(q') + \epsilon, 1]$ remains the same. Moreover, the maximum EAFR in the interval $(T_G(q'), T_G(q') + \epsilon]$ is also achieved at $T_G(q') + \epsilon$. Given this target fill rate, the EAFR of the distribution $\bar{G}$ is equal to

$$R_{\bar{G}}(T_G(q') + \epsilon) = (T_G(q') + \epsilon)\left(Q_G(T_G(q') + \epsilon) + \delta\right) < T_G(q') \cdot Q_G(T_G(q') + \epsilon) + \epsilon + \delta + \epsilon \cdot \delta < R_G(q') + \left(R_G(\bar{q}) - R_G(q')\right) = R_G(\bar{q}).$$

Therefore, the maximum EAFR over all possible TFRs in $[0, 1]$ is the same for $G$ and $\bar{G}$, i.e.,

$$\max_{q \in [0,1]:\ T_G(q) \in [0,1]} R_G(q) = \max_{q \in [0,1]:\ T_{\bar{G}}(q) \in [0,1]} R_{\bar{G}}(q) = R_G(\bar{q}).$$

However, $\bar{\mu} = \mathbb{E}_{v \sim \bar{G}}\left[\frac{1}{v}\right] < \mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$. Now let $\tilde{G}$ be the distribution of the random variable $(\bar{\mu}/\mu) \cdot v$ where $v \sim \bar{G}$. We have:

$$\max_{\tau \in [0,1]} \tau\left(1 - \tilde{G}(\tau)\right) = \max_{\tau \in [0,1]} \tau\left(1 - \bar{G}(\tau \cdot \mu/\bar{\mu})\right) \leq \max_{\tau \in [0,1]} \tau\left(1 - \bar{G}(\tau)\right) = \max_{\tau \in [0,1]} \tau\left(1 - G(\tau)\right).$$

Also, $\mathbb{E}_{v \sim \tilde{G}}\left[\frac{1}{v}\right] = \mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$. Therefore, dropping such a distribution $G$ from the feasible set in the outer optimization of eq. (15) does not change the infimum value, which proves Claim 1. □

A.5.2. Proof of Claim 2

Suppose we have a distribution $G$ that has non-zero total mass in the interval $(1, +\infty)$. Now shift all the probability mass in $(1, +\infty)$ to $+\infty$. Let $\bar{G}$ be the resulting distribution. Note that the maximum EAFR among targets in $[0, 1]$ is the same for $G$ and $\bar{G}$, as the EAFR for any target in $[0, 1]$ remains the same. However, $\bar{\mu} = \mathbb{E}_{v \sim \bar{G}}\left[\frac{1}{v}\right] < \mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$, as we have moved the probability mass of $v$ towards larger values (equivalently, the probability mass of demand to lower values). By using the same trick as above in Section A.5.1, we conclude that dropping such a distribution $G$ from the feasible set in the outer optimization of eq. (15) does not change the infimum value, which proves Claim 2. □

A.5.3. Proof of Claim 3

Consider an inverse demand distribution $\bar{G}$ that satisfies the two constraints given by Claim 1 and Claim 2, namely (i) $R_G(q) = R_G(q')$, $\forall q, q' \in [Q_G(1), 1]$, and (ii) $G(v) = G(1)$ for all $v \in [1, +\infty)$. Let us define $\bar{q}$ such that $Q_{\bar{G}}(1) = \bar{q}$, or equivalently, $T_{\bar{G}}(\bar{q}) = 1$. The EAFR curve for $\bar{G}$ by definition attains a value of $R_{\bar{G}}(\bar{q}) = \bar{q}$. Since $\bar{G}$ has a constant EAFR curve in the interval $[\bar{q}, 1]$, the curve must be equal to $\bar{q}$ over that interval. Further, since the EAFR curve can also be expressed as $q \cdot \bar{G}^{-1}(1 - q)$, we must have $\bar{G}(\bar{q}/q) = 1 - q$ for all $q \in [\bar{q}, 1]$. Substituting $v = \bar{q}/q$, we must have $\bar{G}(v) = 1 - \bar{q}/v$ for all $v \in [\bar{q}, 1]$ (which implies $\bar{G}(\bar{q}) = 0$). Further, since the CDF $\bar{G}$ pushes all the probability mass in the interval $(1, +\infty)$ to $+\infty$, $\bar{G}$ must be constant in the interval $[1, +\infty)$. Thus, we have uniquely described $\bar{G}$, up to a constant $\bar{q}$:

$$\bar{G}(v) = \begin{cases} 0 & \text{if } v \in [0, \bar{q}) \\ 1 - \bar{q}/v & \text{if } v \in [\bar{q}, 1) \\ 1 - \bar{q} & \text{if } v \in [1, +\infty) \\ 1 & \text{if } v = +\infty. \end{cases}$$

For this distribution to have an expected demand of $\mu$, consider the corresponding CDF for demand $\bar{F} : \mathbb{R}_{\geq 0} \to [0, 1]$ for the random variable $x \triangleq \frac{1}{v}$ where $v \sim \bar{G}$. We have:

$$\bar{F}(x) = 1 - \bar{G}(1/x) = \begin{cases} 0 & \text{if } x = 0 \\ \bar{q} & \text{if } x \in (0, 1] \\ \bar{q} x & \text{if } x \in (1, \frac{1}{\bar{q}}] \\ 1 & \text{if } x \in (\frac{1}{\bar{q}}, +\infty). \end{cases}$$

As a result,

$$\mathbb{E}_{v \sim \bar{G}}\left[\frac{1}{v}\right] = \int_0^\infty \left(1 - \bar{F}(x)\right) dx = 1 - \bar{q} + \int_1^{1/\bar{q}} (1 - \bar{q} x)\, dx = \frac{1}{2}\left(\frac{1}{\bar{q}} - \bar{q}\right).$$

The unique solution to $\frac{1}{2}\left(\frac{1}{\bar{q}} - \bar{q}\right) = \mu$ satisfying $\bar{q} \geq 0$ is $\bar{q} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}$. We highlight that $\bar{G}$ with $\bar{q} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}$ is identical to the distribution $\hat{G}$ defined in eq. (14). This shows that $\hat{G}$ is the unique worst-case distribution. □

A.6. Proof of Theorem 4 (Section 3.5)

We prove this theorem by first providing two instances which together show that no policy achieves ex-ante fairness greater than $\kappa_a(\mu, n)$. We then show that our PPA policy achieves ex-ante fairness of at least $\kappa_a(\mu, n)$ for any demand distribution, which means the bound is tight.

Upper bound: We prove the hardness result by considering two separate cases corresponding to $\mu < 2$ and $\mu \geq 2$. For each case, we provide an instance of the problem under which no policy can obtain ex-ante fairness larger than $\kappa_a(\mu, n)$.

(i) If $\mu < 2$, consider a joint demand distribution for an arbitrary number of agents such that with probability $1 - \frac{\mu}{2}$ there is no demand, and with probability $\frac{\mu}{2}$ the total demand is 2 (arbitrarily and deterministically split among agents). In this case, with probability $1 - \frac{\mu}{2}$, each agent achieves an FR of 1, and with probability $\frac{\mu}{2}$, the expected FR cannot exceed $\frac{1}{2}$ for at least one agent. Therefore, in this instance the minimum expected FR is upper-bounded by $1 - \frac{\mu}{2} + \frac{\mu}{2} \cdot \frac{1}{2} = 1 - \frac{\mu}{4}$.

(ii) If $\mu \geq 2$, consider a deterministic demand distribution where total demand is equal to its expectation $\mu$ (arbitrarily split among $n$ agents). In this case, the minimum expected FR is clearly upper-bounded by $\frac{1}{\mu}$.

Taken together, these instances provide an upper bound on the minimum expected FR one can hope to achieve with any policy. We then scale each instance by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$, which provides an upper bound of $\kappa_a(\mu, n)$ on the ex-ante fairness guarantee (see Definition 1) of any policy. □

Lower bound: We now show that the PPA policy achieves an ex-ante fairness guarantee of $\kappa_a(\mu, n)$. First, we show via induction that after the arrival of any agent $i \in [n]$, the ratio $\frac{d_i + \mu_{i+1}}{s_i}$, i.e., expected total remaining demand to remaining supply, is in expectation at most $\mu$. We then place a lower bound on the expected FR of agent $i$ when following the PPA policy which is a decreasing and convex function of the ratio $\frac{d_i + \mu_{i+1}}{s_i}$. We conclude by applying Jensen's inequality to lower bound the expected FR of agent $i$. Since this lower bound is the same for each agent, it constitutes a lower bound on the minimum expected FR.

Claim 4 (Upper Bound on Demand-to-Supply Ratio). When following the PPA policy, for all $i \in [n]$, $\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{d_i + \mu_{i+1}}{s_i}\right] \leq \mu$.

Proof: We proceed by induction. Clearly, when $i = 1$, $\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{d_1 + \mu_2}{s_1}\right] = \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{\mu_1}{s_1}\right] = \mu$. We now assume that this holds for $i = k$ and prove the claim for $i = k + 1$. According to the PPA policy, $x_k \leq \frac{s_k d_k}{d_k + \mu_{k+1}}$. Thus, $s_{k+1} = s_k - x_k$ is at least $\frac{s_k \mu_{k+1}}{d_k + \mu_{k+1}}$. Consequently,

$$\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{d_{k+1} + \mu_{k+2}}{s_{k+1}}\right] = \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{\mu_{k+1}}{s_{k+1}}\right] \leq \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{\mu_{k+1}}{\frac{s_k \mu_{k+1}}{d_k + \mu_{k+1}}}\right] = \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{d_k + \mu_{k+1}}{s_k}\right] \leq \mu.$$

The final inequality comes from our inductive hypothesis, which completes the proof by induction. □
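Claim 4 can be checked numerically on small discrete instances. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes unit initial supply, a joint distribution given as a finite list of scenarios, $\mu_{i+1}$ computed as the expected future demand conditional on the observed prefix $d_1, \ldots, d_i$, the PPA allocation $x_i = \min\left\{d_i, \frac{s_i d_i}{d_i + \mu_{i+1}}\right\}$, and the convention that the ratio counts as 0 when no current or future demand remains. The function name is hypothetical.

```python
from fractions import Fraction


def ppa_ratio_expectations(scenarios):
    """E[(d_i + mu_{i+1}) / s_i] for each agent i under PPA with unit initial supply.

    scenarios: list of (probability, demand_tuple) pairs describing the joint distribution.
    """
    n = len(scenarios[0][1])
    expectations = [Fraction(0)] * n
    for prob, demands in scenarios:
        supply = Fraction(1)
        for i in range(n):
            # mu_{i+1}: expected future demand conditional on the observed prefix d_1..d_i
            consistent = [(p, d) for p, d in scenarios if d[: i + 1] == demands[: i + 1]]
            mass = sum(p for p, _ in consistent)
            mu_next = sum(p * sum(d[i + 1:]) for p, d in consistent) / mass
            total = demands[i] + mu_next
            if total == 0:
                continue  # assumed convention: the ratio is 0 when nothing remains to serve
            expectations[i] += prob * total / supply
            supply -= min(demands[i], supply * demands[i] / total)  # PPA allocation x_i
    return expectations


if __name__ == "__main__":
    # Over-demanded instance from the proof of Theorem 1 with n = 3 and mu = 2 (per-agent demand 1).
    third, one = Fraction(1, 3), Fraction(1)
    scenarios = [(third, (one, 0, 0)), (third, (one, one, 0)), (third, (one, one, one))]
    print(ppa_ratio_expectations(scenarios))
```

On this instance the expected ratio equals $\mu = 2$ for every agent, so the bound of Claim 4 holds with equality; this is expected whenever the cap $x_i \leq d_i$ never binds.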

Given a current demand $d_i$ and expected future demand $\mu_{i+1}$, the FR of agent $i$ is $\min\left\{1, \frac{s_i}{d_i + \mu_{i+1}}\right\}$. It is straightforward to show that this FR is lower-bounded by the following function of the ratio $\frac{d_i + \mu_{i+1}}{s_i}$:

$$h\left(\frac{d_i + \mu_{i+1}}{s_i}\right) = \begin{cases} 1 - \frac{d_i + \mu_{i+1}}{4 s_i}, & \frac{d_i + \mu_{i+1}}{s_i} < 2 \\ \frac{s_i}{d_i + \mu_{i+1}}, & \frac{d_i + \mu_{i+1}}{s_i} \geq 2. \end{cases}$$

Note that $h$ is a decreasing and convex function of its argument. Hence, by Jensen's inequality and Claim 4,

$$\mathbb{E}_{\vec{d} \sim \vec{F}}\left[h\left(\frac{d_i + \mu_{i+1}}{s_i}\right)\right] \geq h\left(\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\frac{d_i + \mu_{i+1}}{s_i}\right]\right) \geq h(\mu).$$

Since this lower bound on the expected FR holds for each agent $i$, we have shown a lower bound on the minimum expected FR when following the PPA policy. When scaled by our benchmark for deterministic demand, namely $W = \min\{1, 1/\mu\}$, this lower bound exactly matches the upper bound of $\kappa_a(\mu, n)$ established above, and thus completes the proof of Theorem 4. □

B. Missing Proofs of Section 5.2

B.1. Proof of Corollary 1

Suppose the allocation given by the PPA policy is $\vec{x}$. By Theorem 2, the PPA policy achieves ex-post fairness of $\kappa_p(\mu, n)$ when the social welfare function is the minimum FR, which implies $U_{+\infty}(\vec{x}) = \min_{i \in [n]} \left\{\frac{x_i}{d_i}\right\} \geq \kappa_p(\mu, n)\, W$. Now consider a new allocation vector $\vec{x}'$ such that for all $i \in [n]$, $\frac{x'_i}{d_i} = \min_{j \in [n]} \left\{\frac{x_j}{d_j}\right\}$. We remark that $x_i \geq x'_i$ for all $i \in [n]$. Further, it is easy to verify that $U_\alpha(\vec{x}') = U_{+\infty}(\vec{x})$ for any $\alpha \in [0, +\infty)$. Since $U_\alpha(\vec{x})$ is non-decreasing in each $x_i$, $U_\alpha(\vec{x}) \geq U_\alpha(\vec{x}') = U_{+\infty}(\vec{x})$. [...] $\alpha$). Scaling by the achievable social welfare when demand is deterministic, we have shown that ex-post fairness, i.e., $\mathbb{E}_{\vec{d} \sim \vec{F}}[U_\alpha(\vec{x})]/W$, must be at least $\kappa_p(\mu, n)$. □

B.2. Proof of Corollary 2

Theorem 2 establishes that the PPA policy guarantees an expected minimum FR for resource $j$ of at least $\frac{\kappa_p\left(\mu^j / s^j,\, n\right)}{\max\left\{1,\ \mu^j / s^j\right\}}$. Consequently, we can place a lower bound on the expected minimum weighted FR:

$$\mathbb{E}_{\vec{d} \sim \vec{F}}\left[\min_{i \in [n]} \sum_{j \in [m]} \lambda_j \frac{x_i^j}{d_i^j}\right] \geq \sum_{j \in [m]} \lambda_j\, \mathbb{E}_{\vec{d} \sim \vec{F}}\left[\min_{i \in [n]} \frac{x_i^j}{d_i^j}\right] \geq \sum_{j \in [m]} \lambda_j \frac{\kappa_p\left(\mu^j / s^j,\, n\right)}{\max\left\{1,\ \mu^j / s^j\right\}}. \qquad \square$$

B.3. Proof of Corollary 3

Consider a correlated distribution for demands (correlated across agents and resource types) where the marginal distribution of demands for each resource type matches the worst-case joint distribution of the single-type problem described in the proof of Theorem 1. These marginal distributions are then coupled such that the last-arriving agent with non-zero demand for resource $j$ is the same as the last-arriving agent with non-zero demand for resource $j'$, for any resources $j, j' \in [m]$ for which at least one agent has non-zero demand. Given the marginal distributions, each agent is equally likely to be this last-arriving agent. For any sample path drawn from this distribution, it is without loss of generality to only consider policies where the allocation is decreasing (i.e., where this last-arriving agent has the worst FR) for every resource.

If supply of resource $j$ is $s^j$, Theorem 1 establishes an upper bound of $\frac{\kappa_p\left(\mu^j / s^j,\, n\right)}{\max\left\{1,\ \mu^j / s^j\right\}}$ on the expected minimum FR of any policy in the single-type problem corresponding to resource $j$. Since for every sample path the agent with the worst FR (i.e., the last-arriving agent) is the same across resources, aggregating these bounds establishes an upper bound of $\sum_{j \in [m]} \lambda_j \frac{\kappa_p\left(\mu^j / s^j,\, n\right)}{\max\left\{1,\ \mu^j / s^j\right\}}$ on the expected minimum weighted FR. □

For intuition as to why this is without loss of generality, note (i) each agent with non-zero demand for resource $j \in [m]$ has identical demand for that resource, (ii) each agent linearly aggregates their FRs by the same weights $\{\lambda_j\}_{j \in [m]}$, and (iii) each agent is equally likely to be the last-arriving agent. Consequently, if agent $i$ has a strictly larger allocation than agent $i'$ (where $i' < i$) for any resource jj