Eliciting Private User Information for Residential Demand Response
Datong P. Zhou ∗†, Maximilian Balandat ⋆, Munther A. Dahleh †, and Claire J. Tomlin ⋆

Abstract: Residential Demand Response has emerged as a viable tool to alleviate supply and demand imbalances of electricity during times when the electric grid is strained. Demand Response providers bid reduction capacity into the wholesale electricity market by asking their customers under contract to temporarily reduce their consumption in exchange for a monetary incentive. This paper models consumer behavior in response to such incentives by formulating Demand Response in a Mechanism Design framework. In this auction setting, the Demand Response Provider collects the price elasticities of demand as bids from its rational, profit-maximizing customers, which allows targeting only the users most susceptible to incentives such that an aggregate reduction target is reached in expectation. We measure reductions by comparing the materialized consumption to the projected consumption, which we model as the “10-in-10” baseline, the regulatory standard set by the California Independent System Operator. Due to the suboptimal performance of this baseline, we show, using consumption data of residential customers in California, that Demand Response Providers receive payments for “virtual reductions”, which exist due to the inaccuracies of the baseline rather than actual reductions. Improving the accuracy of the baseline diminishes the contribution of these virtual reductions.
I. INTRODUCTION
With the restructuring of the traditional, vertically integrated energy market towards a competitive market, Demand-Side Management (DSM) has become a viable tool for alleviating supply and demand imbalances of electricity. Facilitated by advancements in information and communications technology, smart metering infrastructure allows end-users of electricity to “participate” in the electric market as virtual power plants through properly designed incentive mechanisms. DSM is motivated by the inelasticity of energy supply, which causes small variations in demand to result in a price boom or bust. These price fluctuations are aggravated by the inherent volatility of renewable generation resources, their increasing levels of penetration, and the prohibitively high capital cost of energy storage. Since a load-serving entity (LSE) is required to procure electricity at fluctuating prices to cover the electricity demand of its residential households under contract instantaneously and at quasi-fixed tariffs, price risks are almost entirely borne by the LSE. Incentivizing users to temporarily reduce their consumption (and charging a fee if users do not reduce) during periods of high prices therefore partially passes such price risks on to customers.

∗ Department of Mechanical Engineering, University of California, Berkeley, USA. [email protected]
† Laboratory for Information and Decision Systems, MIT, Cambridge, USA. [datong, dahleh]@mit.edu
⋆ Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA. [balandat, tomlin]@eecs.berkeley.edu
This work has been supported in part by the National Science Foundation under CPS:FORCES (CNS-1239166) and CEC Grant 15-083.

While the area of DSM has attracted a vast array of research across different domains (see [1] for a summary), in this paper we focus on the area of Demand Response (DR), where end-users of electricity are incentivized to reduce their demand temporarily during designated hours, precisely when there is a shortage of electricity supply. Users receive a reward for each unit of reduction, but incur a penalty for increasing their consumption. Demand Response providers (DRPs) bundle these reductions and can offer them as a bid directly into the competitive wholesale electricity market, or enter bilateral contracts with load-serving utilities. While DR is traditionally carried out on a commercial level, residential customers are targeted for load reduction programs as well. For instance, in California, the Public Utilities Commission (CPUC) launched a “Demand Response Auction Mechanism” (DRAM) in July 2015 [2] to allow DRPs to offer reduction capacity from residential customers directly into the day-ahead electricity market, where they are subject to regular market clearing prices and shortfall penalties. Utilities are required to purchase a fixed minimum monthly amount of this reduction capacity.

To make an informed capacity bid into the market, the DRP must take various factors into account, such as the expected Locational Marginal Price (LMP), which determines its market clearing price, the elasticity of users’ demand given an incentive, and the number of users under contract. If the DRP bids too much capacity, the aggregate reduction among its user base will likely fail to reach the capacity volume, thereby incurring a shortfall penalty; similarly, a suboptimal revenue arises from too small a bid.
The DRP can improve its bidding strategy by learning users’ behavior in response to incentives. However, users’ preferences are typically private information and hence unknown to the utility. The challenge thus becomes to elicit this private information. We cast this problem as a mechanism design problem, where the DRP as the auctioneer solicits bids from each of its residential customers through an incentive compatible and individually rational mechanism. The motivation behind this approach is to increase allocative efficiency, that is, the utility would like to solicit reductions only from the highest reducers, who are most willing to reduce their consumption in exchange for the lowest possible reward. In this paper, we design such a mechanism that fulfills these criteria and benchmark its performance against the omniscient case, where user characteristics are common knowledge.

A crucial question that arises from this setting is how to measure the reduction of any individual user during a DR event, given that only the consumption outcome under a treatment can be observed, but not its counterfactual (the consumption had there been no DR event). This is the fundamental problem of causal inference [3]. To estimate the reduction during any particular DR event, it is thus essential to estimate the counterfactual, which we refer to in this context as the “baseline”. Estimating this baseline in the absence of a Randomized Controlled Trial is a modern area of research at the intersection of economics and machine learning. Examples of such baseline estimates can be found in [4], [5], [6], [7].
In this paper, we use the “10-in-10” baseline employed by the California Independent System Operator (CAISO) [8], which estimates the counterfactual for a particular DR event as the mean consumption of the 10 most recent days during the same hour as the DR event. Using this baseline, the measured reduction for any selected user can be formulated as the sum of a virtual reduction, which reflects the estimation error in the baseline prediction, and the actual reduction due to price elasticity of user demand. We observe that the DR provider can achieve a virtual reduction from those users for which the baseline is high. That is, the DR provider receives payments for virtual, non-existent reductions which are indirectly paid for by utilities. However, we show that a more accurate baseline diminishes the impact of such virtual reductions.

Related Work
Modeling consumer behavior in response to monetary incentives in DR and their heterogeneity is a growing area of research. In [9], the authors formulate the problem of targeting the “right” customers for DR as a stochastic knapsack problem in order to achieve a target reduction with high probability. However, users’ responses are modeled as a linear model without private user information.

Other works have incorporated a contractual formulation between consumers and suppliers in DR settings. For example, [10] designs a DR market where suppliers bid supply curves in the presence of a supply shortage to the load-serving entity and analyzes the ensuing market equilibria. In [11], the authors formulate a contract between an aggregator of buildings, individual buildings, and the wholesale electricity market to exploit flexibility of commercial buildings’ HVAC consumption. In a similar fashion, [12] formulates a contract design problem between an aggregator and individual electric vehicle owners to maximize its revenue by providing power capacity to the grid operator.

To quantify the impact of DR signals on the reduction of consumption, [6], [7] estimate individual treatment effects in response to hourly DR events by comparing the estimated counterfactual consumption to the actual, observed consumption. [13] formulates an optimal treatment assignment strategy to precisely measure the treatment effect of DR.

The application of Mechanism Design to DR is covered in [14], where the authors maximize the social welfare of consumers and the energy provider by designing a consumption controller with a Vickrey-Clarke-Groves auction. In [15], [16], the authors incorporate uncertainty into consumers’ reduction behavior and introduce the notion of reliability for achieving a designated amount of aggregate reduction.
Contributions
Unlike previous works, which modeled reductions as multiples of unit reductions, we incorporate the Fundamental Problem of Causal Inference [3] into the mechanism design formulation between DRP and users, which is our main contribution. Specifically, we estimate reductions using the CAISO “10-in-10” baseline as the counterfactual estimate. As a consequence of uncertain baseline predictions, virtual reductions arise. Using observational data from residential customers in California, we quantify the extent to which these virtual reductions counteract DR, and how these reductions diminish as baseline estimates become more precise.
Notation
Let $[\cdot]_+ = \max(0, \cdot)$. Vectors are printed in boldface. Let $\mathbf{a}_{-i}$ denote the vector of all components in $\mathbf{a}$ excluding $i$. $\mathbb{1}(\cdot)$ denotes the indicator function.

Outline
The remainder of this paper is organized as follows: Section II characterizes DR market participants and their interactions, based on which Section III presents a mechanism for the DR provider to elicit private user information and to achieve an aggregate reduction among its users under contract. Section IV elucidates the difference between virtual and actual reductions as an artifact of an uncertain baseline estimate. The mechanism is simulated on residential smart meter data in California in Section V, where we experimentally show how more accurate baselines reduce the amount of virtual reductions. Section VI concludes. All proofs are relegated to the Appendix.

II. MARKET PARTICIPANTS AND INTERACTIONS
A. Residential Demand Response
Figure 1 describes the interaction between the DRP, end-users, the electric utility, and the wholesale electricity market.

[Figure 1: diagram with the nodes Electric Utility, Wholesale Market, End Users, and DR Provider, connected by flows labeled Payment, Electricity, Incentives, Reductions, and Signal.]
Fig. 1: Energy Market Participants for DR and their Interactions
The DRAM requires electric utilities to acquire demand flexibility from DRPs, which they submit as part of their supply curves as a bid into the real-time wholesale electricity market. If these bids are cleared, the utility sends the DRP a signal to ask for a specified aggregate reduction among its users. The DRP elicits reductions by incentivizing a subset of its customers $\mathcal{T} \subseteq \mathcal{I}$ with user-specific per-unit rewards $\{r_i \in \mathbb{R}_+ \mid i \in \mathcal{T}\}$, where $\mathcal{I} = \{1, \ldots, n\}$ denotes the set of users. In exchange for the monetary incentive, users reduce consumption by $\{\delta_i \in \mathbb{R} \mid i \in \mathcal{T}\}$. A per-unit penalty $q \in \mathbb{R}_+$, which is assumed to be identical for all users and common knowledge, is enforced for an increase in consumption beyond the baseline. Users in the non-targeted group $\mathcal{I} \setminus \mathcal{T}$ are excluded from the incentive program. In this paper, we focus on the interaction between the DRP and the end-users from the perspective of the DRP. To maximize its profit, the goal of the DRP is to achieve an a-priori defined aggregate reduction with minimal payments to its users.
B. Residential Customers
Each rational, profit-maximizing user $i \in \mathcal{I}$ is endowed with the “10-in-10” baseline $\hat{x}_i \in \mathbb{R}_+$ employed by the California Independent System Operator (CAISO) [8], which is an estimate of her counterfactual consumption [17] for a particular hour. For notational ease, we drop time indices, but we emphasize the need to re-calculate $\hat{x}_i$ for any individual hour. The baseline for a particular hour on a weekday is calculated as the mean of the hourly consumptions on the 10 most recent business days during the hour of interest. For weekend days and holidays, the mean of the 4 most recent observations is calculated. User $i$’s measured load reduction $\delta_i$, provided she is given incentive $r_i$ to reduce during a particular hour, is simply the difference between the baseline $\hat{x}_i$ and the actual, materialized consumption $x_i$:

$$\delta_i = \begin{cases} 0, & \text{if } i \notin \mathcal{T} \\ \hat{x}_i - x_i, & \text{if } i \in \mathcal{T} \end{cases} \tag{1}$$

Due to the widespread existence of advanced metering infrastructure, the baseline $\hat{x}_i$ is assumed to be common knowledge among the DRP and user $i$. The utility of user $i$ is defined as follows:

$$u_i = \begin{cases} 0, & \text{if } i \notin \mathcal{T} \\ r_i \cdot [\hat{x}_i - x_i]_+ - q \cdot [x_i - \hat{x}_i]_+, & \text{if } i \in \mathcal{T} \end{cases} \tag{2}$$

which equals the payment from the DRP to user $i$. That is, if the user is under a DR contract with the DRP, she is rewarded with $r_i \in \mathbb{R}_+$ for each unit of reduction, and charged $q$ for each unit of consumption above the baseline $\hat{x}_i$.

We model users’ consumption in response to $r_i$, denoted by $x_i(r_i)$, with a semi-logarithmic demand curve, an assumption frequently made in economics:

$$x_i(r_i) = \bar{x}_i \cdot \exp(-\alpha_i r_i) \quad \Longleftrightarrow \quad \log x_i(r_i) = \log \bar{x}_i - \alpha_i r_i \qquad \forall\, i \in \mathcal{I} \tag{3}$$

In (3), $\bar{x}_i \in \mathbb{R}_+$ and $\alpha_i \in \mathbb{R}_+$ are random variables signifying the base demand (the intercept, i.e. the consumption with $r_i = 0$) and the slope of the demand curve in log-linear coordinates, respectively. This semi-logarithmic demand curve captures the fact that the amount of reduction is marginally decreasing in the reward $r_i$ and saturates.
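The quantities in (1) to (3) can be illustrated with a minimal Python sketch. The function names and the convention that the meter history is ordered from oldest to newest are assumptions made for this illustration only; the exclusion of past DR event hours from the history is left to the caller.

```python
import math


def ten_in_ten_baseline(history, is_business_day=True):
    """Baseline for one hour: mean of the most recent same-hour readings.

    `history` (hypothetical input) holds same-hour readings, oldest first.
    Business days use the last 10 readings, weekends and holidays the last 4.
    """
    window = 10 if is_business_day else 4
    if len(history) < window:
        raise ValueError("not enough past observations for the baseline")
    return sum(history[-window:]) / window


def consumption(base_demand, alpha, reward):
    """Semi-logarithmic demand curve (3)."""
    return base_demand * math.exp(-alpha * reward)


def measured_reduction(baseline, actual, targeted=True):
    """Measured reduction (1): baseline minus actual for targeted users."""
    return baseline - actual if targeted else 0.0


def user_utility(baseline, actual, reward, penalty, targeted=True):
    """User utility (2): reward per unit below, penalty per unit above baseline."""
    if not targeted:
        return 0.0
    return (reward * max(baseline - actual, 0.0)
            - penalty * max(actual - baseline, 0.0))
```

For instance, with a baseline of 2.0 kWh, a reward of 1.0 and a penalty of 5.0, consuming 1.5 kWh yields a utility of 0.5, whereas consuming 2.5 kWh yields a utility of -2.5.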
User $i$’s type $\theta_i$ is user $i$’s private information; it is correlated with $(\bar{x}_i, \alpha_i)$, but is not necessarily $(\bar{x}_i, \alpha_i)$ itself.

C. Demand Response Provider
The DRP aims to maximize its profit $\Pi$ in expectation:

$$\Pi = \bar{r} \cdot \min(\Delta, M) - \bar{q} \cdot [M - \Delta]_+ - \sum_{i \in \mathcal{I}} \delta_i \left( r_i \cdot \mathbb{1}(\delta_i \ge 0) + q_i \cdot \mathbb{1}(\delta_i < 0) \right). \tag{4}$$

$\Pi$ is random in $\delta_1, \ldots, \delta_n$. $\Delta = \sum_{i \in \mathcal{I}} \delta_i$ is the total sum of reductions and $M \in \mathbb{R}_+$ the target capacity the DRP has to provide to the utility. $\bar{r}$ and $\bar{q} \in \mathbb{R}_+$ denote the per-unit reward and shortfall penalty the DRP is subject to in the wholesale electricity market. Note that $\bar{q} \neq q$ and $\bar{r} \neq r_i$. The first term of (4) represents the profit the DRP earns for materialized reductions, the second term captures the shortfall penalty for unfulfilled reductions, and the last term is the sum of payments disbursed to individual customers, which equals the sum of the user utilities (2) with $q_i = q$ for all users.

Assumption 1.
The DRP is risk-neutral and profit-maximizing.
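As a small illustration of the profit expression (4), the sketch below evaluates the DRP profit for given realized reductions. It is only a sketch under stated assumptions: the function and argument names are hypothetical, and the payment to each user is computed from the user utility (2) with a common penalty $q$.

```python
def drp_profit(reductions, rewards, user_penalty, target,
               market_reward, market_penalty):
    """Evaluate the DRP profit for realized reductions of the targeted users.

    reductions[i] is the measured reduction of user i (negative values are
    increases above the baseline), rewards[i] her per-unit reward.
    """
    total = sum(reductions)
    revenue = market_reward * min(total, target)
    shortfall = market_penalty * max(target - total, 0.0)
    # Payment to each user follows (2): reward for reductions,
    # penalty (a payment back to the DRP) for increases.
    payments = sum(r * max(d, 0.0) - user_penalty * max(-d, 0.0)
                   for d, r in zip(reductions, rewards))
    return revenue - shortfall - payments
```

With reductions of 1.0, 2.0 and -0.5, a uniform reward of 0.5, a user penalty of 1.0, a target of 3.0, and market reward and penalty of 2.0 and 3.0, the revenue is 5.0, the shortfall penalty 1.5, and the net payment to users 1.0, which gives a profit of 2.5.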
Assumption 2.
The per-unit penalty $\bar{q}$ in the wholesale electricity market and the per-unit reward $\bar{r}$ are greater than the maximum per-unit reward disbursed to any customer, i.e. $\min(\bar{q}, \bar{r}) > \max_{1 \le i \le n}(r_i)$.

With Assumptions 1 and 2, (4) can be rewritten as follows:

$$\begin{aligned} \underset{r_1, \ldots, r_n}{\text{minimize}} \quad & \mathbb{E}_{\delta_1, \ldots, \delta_n} \left[ \sum_{i \in \mathcal{I}} \delta_i \left( r_i \, \mathbb{1}(\delta_i \ge 0) + q_i \, \mathbb{1}(\delta_i < 0) \right) \right] \\ \text{subject to} \quad & \mathbb{E}_{\delta_1, \ldots, \delta_n} \left[ \sum_{i \in \mathcal{I}} \delta_i \right] \ge M. \end{aligned} \tag{5}$$

That is, the DRP aims to find an optimal vector of per-unit rewards $\mathbf{r}^*$ that minimizes the expected total amount of payments disbursed to the users while satisfying the constraint that the expected sum of reductions is at least $M$.

III. DEMAND RESPONSE MECHANISM
To find an approximation to the solution of (5), the DRP needs to elicit user $i$’s private type $\theta_i$ with an incentive compatible (IC) and individually rational (IR) mechanism. IR guarantees that participation in the mechanism, provided users act rationally, results in an expected payoff that is at least as large as in the case of non-participation (the outside option), which is zero in our case (2). IC is required to ensure that users report their types truthfully to the DRP.

A. Mechanism Design Basics
We first introduce basic notation relevant to our problem. Let $\boldsymbol{\theta}$ denote the collection of types $(\theta_1, \ldots, \theta_n)$, where each $\theta_i \in \Theta_i$, $i \in \mathcal{I}$, is drawn from its type space $\Theta_i$. It is assumed that $\boldsymbol{\theta}$ is drawn from a commonly known joint distribution $F$ defined on the product space $\Theta = \times_{i=1}^n \Theta_i$. Each agent is assumed to maximize the expectation of her utility function $u_i(\mathbf{y}, \theta_i): \mathcal{Y} \times \Theta_i \mapsto \mathbb{R}$, where $\mathbf{y} = (\mathbf{d}, \mathbf{r}) \in \mathcal{Y} = \{0, 1\}^n \times \mathbb{R}^n_+$ is the collective choice consisting of the vector of allocation decisions $\mathbf{d}$ and the vector of rewards $\mathbf{r}$. The social choice function $f(\boldsymbol{\theta}): \Theta \mapsto \mathcal{Y}$ maps a particular collection of types $\boldsymbol{\theta}$ to $\mathbf{y}$.

Let $S_1, \ldots, S_n$ denote the strategy spaces of users $i \in \mathcal{I}$. The outcome function $g(s_1, \ldots, s_n): \times_{i=1}^n S_i \mapsto \mathcal{Y}$ maps a realized strategy vector $\mathbf{s} \in \times_{i=1}^n S_i$ to a collective choice. Together they define a mechanism $\Gamma = (S_1, \ldots, S_n, g(\cdot))$, which implements a social choice function through the outcome function $g(\cdot)$. $(\Gamma, F, \{u_i\}_{i=1}^n)$ defines a Bayesian Game with payoffs $u_i(g(s_1, \ldots, s_n), \theta_i)$ and strategies $s_i: \Theta_i \mapsto S_i$. The revelation principle [18] allows us to focus on direct mechanisms, i.e. $S_i = \Theta_i$ and $g(s_1, \ldots, s_n) \equiv g(\boldsymbol{\theta}) = f(\boldsymbol{\theta})$: it is the well-known fact that any equilibrium outcome of any mechanism can be replicated by an equilibrium of a direct mechanism in which users report truthfully. We focus on the dominant strategy equilibrium:

Definition 1 (Dominant Strategy Equilibrium (DSE)). A Dominant Strategy Equilibrium is given by

$$\theta_i = \arg\max_{z_i \in \Theta_i} \mathbb{E}_{z_i} \left[ u_i(f(z_i, \mathbf{z}_{-i}), \theta_i) \right] \qquad \forall\, i \in \mathcal{I},\ \mathbf{z} \in \Theta \tag{6}$$

That is, if the supremum of user $i$’s expected utility $u_i$ is achieved with truthful reporting $s_i^*(\theta_i) = \theta_i$, regardless of the other users’ reports $\mathbf{z}_{-i} \in \Theta_{-i}$, then the social choice function $f(\cdot)$ is dominant strategy incentive compatible.

B. Timing, User Types, and Reward Calculation

The DR mechanism unfolds as follows:
• The users $i \in \mathcal{I}$ discover their types $\theta_1, \ldots, \theta_n$. The baselines $\hat{x}_1, \ldots, \hat{x}_n$ become common knowledge.
• The users report their types $\{z_i\}_{i=1}^n$ to the DRP, where $z_i$ does not necessarily correspond to the true type $\theta_i$.
• The DRP implements the collective choice $f(\mathbf{z}) = \mathbf{y} = (\mathbf{d}, \mathbf{r})$ through the mechanism $\Gamma$.
• Users observe $f(\mathbf{z})$ and adjust their consumption according to (3) and $d_i$, $r_i$.

For better visualization, Figure 2 depicts these steps.

[Figure 2: timeline. $t = 0$: $\boldsymbol{\theta}$ and $\hat{\mathbf{x}}$ materialize. $t = 1$: users reveal types $\mathbf{z}$ to the DRP. $t = 2$: the DRP implements $f(\mathbf{z}) = (\mathbf{d}, \mathbf{r})$ and informs users. $t = 3$: users’ consumptions in response to $\mathbf{y}$ materialize.]
Fig. 2: DR Mechanism Timeline
An important observation is that, after the implementation of $f(\mathbf{z})$ at $t = 2$, the DRP calculates its expected profit $\mathbb{E}[\Pi]$ and the expected payments disbursed to each user $i$. Due to the Myerson-Satterthwaite Theorem [19], we do not perform any ex-post analysis on the realized consumptions $\mathbf{x}(\mathbf{r})$ at $t = 3$.

To model the fact that users’ base electricity consumption is often driven by habits rather than rational profit-maximization [20], we assume the user-specific intercept $\bar{x}_i$ to be drawn from an a-priori defined distribution $G$ with characteristic parameters $\xi_i$ encoded in user $i$’s private type. $\xi_i$ itself is distributed according to the joint distribution $F_\xi$, and so $\bar{x}_i$ is a compound random variable. The slope, however, is assumed to be explicitly known to each user and drawn from distribution $F_\alpha$. Thus $\theta_i = (\alpha_i \sim F_\alpha, \xi_i \sim F_\xi)$, where $\bar{x}_i \sim G_{\xi_i}$ with $\xi_i \sim F_\xi$. All distributions have support on $\mathbb{R}_+$. We make the following assumption:

Assumption 3.
The types $(\alpha_i, \xi_i)$ are drawn from independent, absolutely continuous distributions $F_\alpha$ and $F_\xi$. Each component $k$ in $\xi_i$ is independently drawn from the marginal distribution $F_{\xi_k}$ s.t. $F_\xi = F_{\xi_1} \cdot \ldots \cdot F_{\xi_m}$, where $m$ is the dimension of $\xi_i$. $G$ is pairwise independent of $F_\alpha$ and $F_\xi$.

User $i$’s expected utility $\mu_i$, given the realized types $\alpha_i$ and $\xi_i$, allocation $d_i = 1$, and reward $r_i$, is obtained by taking the expectation of (2) with respect to the random variable $\bar{x}_i \sim G_{\xi_i}$:

$$\mu_i(d_i = 1, r_i) = \int_{\mathbb{R}_+} u_i(\alpha_i, r_i, x) \, dG_{\xi_i}(x), \tag{7}$$

which is strictly monotonically increasing in the reward $r_i$, cf. (2). Letting $G(\cdot)$ also denote the CDF of $G_{\xi_i}$, (7) for $r_i = 0$ becomes

$$\mu_i(d_i = 1, r_i = 0) = q_i \left[ \hat{x}_i \left( 1 - G(\hat{x}_i) \right) - \int_{\hat{x}_i}^{\infty} x \, dG_{\xi_i}(x) \right],$$

which is negative. Because $\mu_i$ is strictly increasing in $r_i$ and becomes positive for sufficiently large rewards, there is a unique $\tilde{r}_i$ such that $\mu_i(d_i = 1, \tilde{r}_i) = 0$, i.e. the unique threshold reward level for which user $i$’s expected utility is zero. We approximate $\tilde{r}_i$ with Newton’s method, exploiting the fact that $\mu_i$ is monotonically increasing in $r_i$. Due to the same property, any reward $r_i \ge \tilde{r}_i$ fulfills the IR constraint as $\mu_i(d_i = 0) = 0$ (Eq. 2).

C. Mechanism for Demand Response
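As a preliminary to the mechanism, the threshold reward $\tilde{r}_i$ defined through (7) can be approximated numerically. The sketch below is one hedged way to do so and is not the authors' implementation: it assumes a log-normal base demand with zero location parameter, approximates the integral in (7) on a quantile grid, and uses bisection in place of Newton's method, which only requires the monotonicity of $\mu_i$. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

# Midpoint quantile grid of the standard normal, reused for every evaluation.
_GRID = [NormalDist().inv_cdf((k + 0.5) / 2000) for k in range(2000)]


def expected_utility(reward, alpha, baseline, penalty, mu, sigma):
    """Approximate the expected utility (7) of a targeted user.

    Assumption: base demand is log-normal with log-mean `mu` and
    log-standard-deviation `sigma` (location parameter set to zero).
    """
    total = 0.0
    for z in _GRID:
        base_demand = math.exp(mu + sigma * z)
        actual = base_demand * math.exp(-alpha * reward)  # demand curve (3)
        total += reward * max(baseline - actual, 0.0)     # reward term of (2)
        total -= penalty * max(actual - baseline, 0.0)    # penalty term of (2)
    return total / len(_GRID)


def threshold_reward(alpha, baseline, penalty, mu, sigma, tol=1e-6):
    """Reward level at which the expected utility crosses zero.

    Requires alpha > 0 and baseline > 0 so that the utility eventually
    becomes positive. Bisection replaces Newton's method in this sketch.
    """
    low, high = 0.0, 1.0
    while expected_utility(high, alpha, baseline, penalty, mu, sigma) <= 0.0:
        high *= 2.0  # expand the bracket until the utility is positive
    while high - low > tol:
        mid = 0.5 * (low + high)
        if expected_utility(mid, alpha, baseline, penalty, mu, sigma) < 0.0:
            low = mid
        else:
            high = mid
    return high
```

Any reward at or above the returned value gives the user a non-negative expected utility under these assumptions, which is the IR condition used in the mechanism.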
We now present the Demand Response Mechanism:
1) Each user announces her private type $z_i \in \Theta_i$ to the DRP. We will later show that this mechanism is incentive compatible, so that users report their types truthfully. In the following, we thus let $z_i = \theta_i$.
2) The DRP calculates the unique $\tilde{r}_i$ for each user based on the reports $\theta_i$ with Newton’s method on (7).
3) The DRP sorts $\{\tilde{r}_i \mid i \in \mathcal{I}\}$ in ascending order. Call this sorted set $\mathcal{R}$.
4) The DRP implements the social choice $\mathbf{y}$ as follows:

$$j_{\max} = \min_j \left\{ j \in \mathbb{N}_+ \,\middle|\, \sum_{i=1}^{j} \delta_i(\tilde{r}_j \mid \theta_i) \ge M \right\} \tag{8a}$$

$$j(i) = \min_k \left\{ k \in \mathbb{N}_+ \,\middle|\, \sum_{s=1, s \neq i}^{k} \delta_s(\tilde{r}_k \mid \theta_s) \ge M \right\} \qquad \forall\, i \in \{1, \ldots, j_{\max}\} =: \mathcal{T} \tag{8b}$$

$$r_i \leftarrow \tilde{r}_{j(i)} \ge \tilde{r}_i \qquad \forall\, i \in \mathcal{T} \tag{8c}$$

The allocation decision and the reward vector are

$$\mathbf{d} = (1, \ldots, 1, \mathbf{0}_{n - j_{\max}}), \tag{9a}$$

$$\mathbf{r} = (\tilde{r}_{j(1)}, \ldots, \tilde{r}_{j(j_{\max})}, \mathbf{0}_{n - j_{\max}}). \tag{9b}$$

In the above mechanism, $\delta_i(\tilde{r}_j \mid \theta_i)$ denotes the expected reduction of user $i$, given the reward level $\tilde{r}_j$ and conditional on truthful reporting $z_i = \theta_i$, which is computed by taking the expectation of (1) and (3) with respect to $\xi_i$.

The mechanism first determines the set of targeted users $\mathcal{T}$ by selecting the smallest index $j_{\max} \in \{1, \ldots, n\}$ such that the sum of expected reductions of users 1 through $j_{\max}$, if each user were given the reward $\tilde{r}_{j_{\max}}$, reaches the desired aggregate amount $M$ (8a). Notice that since the set $\mathcal{R}$ is sorted in ascending order, $\tilde{r}_{j_{\max}} \ge \tilde{r}_i$ for all $i \le j_{\max}$. Because $\mu_i(d_i = 1, r_i)$ is strictly monotonically increasing in $r_i$, all targeted users will respond to the incentive level $\tilde{r}_{j_{\max}}$.

Next, the reward for each user $i \in \mathcal{T}$ is determined by running the exact same mechanism (8a) on $\mathcal{I} \setminus i$, i.e. the set of all users excluding $i$ (8b). Denote the user with the largest threshold reward $\tilde{r}_{j(i)}$ in this new set by $j(i)$. This reward level is then assigned to user $i$ (8c).

In summary, the first $j_{\max}$ users (8a) with the smallest threshold rewards $\tilde{r}_i$ are offered user-specific unit rewards ((8b), (8c)). The remaining $n - j_{\max}$ users are not targeted. Lastly, to ensure that the mechanism returns a valid index $j_{\max}$, we restrict $M$ to the range $\left[ 0, \sum_{i=2}^{n-1} \delta_i(\tilde{r}_{n-1} \mid \theta_i) \right]$. If $M$ exceeds this range, there are not enough users to achieve the expected aggregate reduction $M$ with the given $n$ users.

Theorem 1. If $M \in \left[ 0, \sum_{i=2}^{n-1} \delta_i(\tilde{r}_{n-1} \mid \theta_i) \right]$, the DR Mechanism terminates. The mechanism fulfills the IR constraint. Truthful reporting, i.e. $s_i^*(\theta_i) = \theta_i$, establishes a DSE.

Since truthful reporting establishes a DSE (Theorem 1), the DR Mechanism is also IC, due to the revelation principle [21].
Remark 1.
Due to the fact that $\{(\alpha_i, \xi_i)\}_{i=1}^n$ are realizations of continuous random variables, no ties need to be broken in (8a), (8b) and the sorting of the users into $\mathcal{R}$, because identical threshold rewards $\tilde{r}_i = \tilde{r}_j$, $i, j \in \mathcal{I}$, $i \neq j$, only occur with probability zero.

The presented mechanism runs in $O(n \log n)$ time, as it takes $O(n \log n)$ time to create the sorted list $\mathcal{R}$ and $\log n$ time to determine the correct index $j_{\max}$ (8a) with a binary search on all possible values of $j = 1, \ldots, n$. Once $j_{\max}$ has been found, we have to determine the reward level for each user by running the same mechanism again, which amounts to $O(n \log n)$. This yields a runtime of $O(n \log n)$.

Remark 2.
This mechanism is motivated by the classic Vickrey-Clarke-Groves Mechanism [21], as it allocates “items” (in our case rewards) to the “highest” bidders (in our case the users with the lowest threshold reward levels).

D. Numerical Example
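Table I lists threshold rewards $\tilde{r}_i$ and reduction functions of hypothetical users in a synthetic user pool. The linearity of the reduction functions $\{\delta_i\}$ is assumed for ease of exposition. Let $M = 4.\ldots$ (with $4.25 < M \le 4.5$, consistent with the inequalities below).

TABLE I: Example User Characteristics (Pool of Users)
User    $\tilde{r}_i$    $\delta_i(r_i)$
1       …                $1 + r$
2       $1$              $2 + r/2$
3       $1.5$            $1 + r/3$
4       $1.8$            $2 + r/4$
⋮       …                …
[Entries shown as … are not recoverable; the values for users 2 to 4 follow from the computations below.]

(8a) selects $j_{\max} = 2$ because $\delta_1(\tilde{r}_2) + \delta_2(\tilde{r}_2) = (1 + 1) + (2 + \tfrac{1}{2} \cdot 1) = 4.5 \ge M$. Thus $\mathcal{T} = \{1, 2\}$. (8b) then determines $j(1)$ and $j(2)$ by solving (8a) on $\mathcal{I} \setminus 1$ and $\mathcal{I} \setminus 2$, respectively:
• For $i = 1$, $j(1) = 4$ because $\delta_2(\tilde{r}_4) + \delta_3(\tilde{r}_4) + \delta_4(\tilde{r}_4) = (2 + 1.8/2) + (1 + 1.8/3) + (2 + 1.8/4) = 6.95 \ge M$. Indeed, $j(1) \neq 3$ because $\delta_2(\tilde{r}_3) + \delta_3(\tilde{r}_3) = (2 + 1.5/2) + (1 + 1.5/3) = 4.25 < M$.
• For $i = 2$, $j(2) = 4$ because $\delta_1(\tilde{r}_4) + \delta_3(\tilde{r}_4) + \delta_4(\tilde{r}_4) = (1 + 1.8) + (1 + 1.8/3) + (2 + 1.8/4) = 6.85 \ge M$. Indeed, $j(2) \neq 3$ because $\delta_1(\tilde{r}_3) + \delta_3(\tilde{r}_3) = (1 + 1.5) + (1 + 1.5/3) = 4 < M$.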
Users 1 and 2 therefore both receive the reward $\tilde{r}_4$, see (8c).

IV. EFFECT OF BASELINE “GAMING”

By expanding user $i$’s reduction of consumption (1),

$$\delta_i = (\hat{x}_i - \bar{x}_i) + \bar{x}_i \left( 1 - e^{-\alpha_i r_i} \right) =: \delta_i^{BL} + \delta_i^{r}, \tag{10}$$

it becomes clear that the measured reduction $\delta_i$ of user $i$ is comprised of two components: $\delta_i^{BL}$, which captures the difference between the baseline $\hat{x}_i$ and the base consumption (i.e. the consumption with no reward), and the actual reduction $\delta_i^{r}$ due to the elasticity of user $i$ in response to the reward level $r_i$. $\delta_i^{BL}$ is a “virtual reduction”, which, if positive (negative), represents the amount of falsely measured reduction (increase). From an economic perspective, $\delta_i^{BL} > 0$ results in falsely allocated credit from the utility to the DRP as well as from the DRP to user $i$. On the contrary, $\delta_i^{BL} < 0$ is synonymous with a misallocated monetary transfer from user $i$ to the utility as well as from the utility to the DRP proportional to the amount of $|\delta_i^{BL}|$. To diminish the effect of virtual reductions, the baseline estimates should become as precise as possible. We make the following assumption:

Assumption 4.
The random variables $\alpha_i$ and $\xi_i$ for different points in time are independent.

Assumption 4 excludes the possibility of baseline manipulation [22], which captures the fact that users can inflate or deflate their baseline, given the knowledge of future DR events, in order to increase their calculated reduction $\delta_i$ (1). For example, a user can increase her expected utility (2) for a DR event by consciously over-consuming prior to the DR event so as to increase the baseline $\hat{x}_i$, which results in a higher payment $r_i \cdot [\hat{x}_i - x_i]_+$, despite having a zero actual reduction $\delta_i^{r}$. However, as DR events are difficult to forecast, the mild assumption that users do not consciously manipulate their baseline justifies Assumption 4, that is, users consume independently of the past and the future.

As a result, averaging 10 recent observations for weekdays (or 4 for weekends and holidays), excluding hours of past DR events, results in an unbiased estimate of the mean consumption $x_i$, but with considerable variance around $x_i$. From a theoretical perspective, the baseline estimate approaches zero variance as the number of previous observations used to estimate $\hat{x}_i$ goes to infinity, due to the Central Limit Theorem and Assumption 4. In the next Section, we simulate the effect of more precise baseline estimates on the quantity of virtual reductions $\delta_i^{BL}$.

As the analysis of the economic implications of this virtual baseline reduction component is outside the scope of the paper, the reader is referred to [23], which explicitly characterizes the magnitude of marginal competitive rents in California’s wholesale electricity market, and [6], [7], where the authors suggest alternative baselining methodologies based on Machine Learning, which weaken the effect of such virtual reductions.

V. SIMULATIONS
In this section, we simulate the presented mechanism and the effect of virtual reductions stemming from imperfect baseline predictions. We utilize hourly smart meter data from 1,000 residential customers serviced by the three largest utilities in California (Pacific Gas & Electric, San Diego Gas & Electric, and Southern California Edison).
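The claim from Section IV that averaging over more past observations yields a more precise baseline can be illustrated with synthetic data. The sketch below is an assumption-laden illustration, not the paper's simulation: it draws independent log-normal consumptions (the parameters `mu` and `sigma` are arbitrary), forms the baseline as the mean of `num_days` draws, and reports the spread of the baseline around the true mean consumption.

```python
import math
import random
import statistics


def baseline_error_std(num_days, mu=0.0, sigma=0.5, trials=5000, seed=0):
    """Standard deviation of the baseline error for a given averaging window.

    Assumption: hourly consumptions are i.i.d. log-normal draws, in line
    with the independence stated in Assumption 4.
    """
    rng = random.Random(seed)
    true_mean = math.exp(mu + sigma ** 2 / 2)  # mean of the log-normal
    errors = []
    for _ in range(trials):
        history = [rng.lognormvariate(mu, sigma) for _ in range(num_days)]
        baseline = statistics.fmean(history)
        errors.append(baseline - true_mean)
    return statistics.pstdev(errors)


# Spread of the baseline error for 4, 10, and 40 averaging days.
spread = {days: baseline_error_std(days) for days in (4, 10, 40)}
```

Under these assumptions the spread shrinks roughly with the square root of the number of averaging days, so a 40-day window is about half as noisy as the 10-day window used on business days.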
A. Approximation of Base Consumption
Figure 3 shows the distribution of the hourly base consumptions between 5-6 pm in the absence of DR events of a selected user. The restriction to 5-6 pm is arbitrarily chosen. For a more thorough analysis, we would have to analyze all 24 hours of the day separately.
[Figure 3: empirical PDF of hourly consumption in kWh with log-normal fit; panel title: Log-Normal Empirical Distribution.]
Fig. 3: Lognormal Consumption Distribution Fit for Selected User, 5-6 pm
It is found that the base consumption $\bar{x}_i$ can be approximated with a log-normal distribution, whose density

$$\mathcal{N}(\log x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(\log(x - \ell) - \mu)^2}{2\sigma^2} \right) \tag{11}$$

is fully parameterized by the shape $\sigma > 0$, scale $s = e^{\mu} > 0$, and location parameter $\ell$. As (11) has support on $(\ell, \infty)$, the location $\ell$ denotes the lower bound on the support of the base consumption distribution.

Fitting a log-normal distribution to the hourly consumptions between 5-6 pm across all users yields a distribution of the compound statistics $\xi_i = (\sigma, s, \ell)$, given below:

$$\bar{x}_i \sim \text{Lognormal}(\sigma, s, \ell), \qquad \sigma \sim \mathcal{N}(\mu_n, \sigma_n), \qquad \ell \sim \text{Cauchy}(\ell_c, s_c), \qquad s \sim \text{Exponential}(\lambda_e)$$

That is, the shape parameter $\sigma$ is best approximated with a Gaussian distribution $\mathcal{N}(\mu_n, \sigma_n)$, the location $\ell$ by a Cauchy distribution parameterized by location $\ell_c$ and scale parameter $s_c$, and the scale parameter $s$ by an exponential distribution with parameter $\lambda_e$. Figure 4 shows the distribution of these compound statistics across all 1,000 users.

B. Performance of DR Mechanism
We compare the DR Mechanism (8a)-(8c) to the hypothetical case of an omniscient DRP, which knows $\{(\alpha_i, \xi_i)\}_{i=1}^n$. Despite this being an unrealistic scenario, it provides a near-optimal approximation of the minimum payment disbursed to the users necessary to elicit a target reduction of $M$. Given the sorted list $\mathcal{R}$ of user-specific threshold rewards, the omniscient DRP implements the social choice $\mathbf{y}^o = (\mathbf{d}^o, \mathbf{r}^o)$ as follows:

$$j^o = \min_j \left\{ j \in \mathbb{N}_+ \,\middle|\, \sum_{i=1}^{j} \delta_i(\tilde{r}_i) \ge M \right\} \tag{12a}$$

$$\mathcal{T}^o = \{1, \ldots, j^o\} \tag{12b}$$

$$r_i^o = \tilde{r}_i \qquad \forall\, i \in \mathcal{T}^o \tag{12c}$$

$$\mathbf{d}^o = (1, \ldots, 1, \mathbf{0}_{n - j^o}) \tag{12d}$$

That is, the DRP determines the smallest index $j^o$ to obtain the desired expected aggregate reduction $M$ (12a), where each user in $\{1, \ldots, j^o\}$ is given her individual threshold reward $\tilde{r}_i$ (12c). These are the targeted users (12b), (12d).

Due to $\{(\alpha_i, \xi_i)\}_{i=1}^n$ being publicly known, users are unable to extract information rent from the DRP, which is the payment to the users to elicit their private information [24]. Hence, the DRP can offer targeted users their threshold reward $\tilde{r}_i$, which keeps users at an expected utility (7) of zero. To guarantee user participation, the DRP has to offer the reward level $\tilde{r}_i + \varepsilon$ to each user $i \in \mathcal{T}^o$, where $\varepsilon$ is an arbitrarily small positive number.

Figure 5 compares the DR Mechanism (8a)-(8c) to the omniscient allocation with respect to the number of targeted users (left) and the total amount of rewards disbursed (right) on $n = 500$ users whose parameters $\xi_i = (\sigma_i, s_i, \ell_i)$ are sampled from the fitted distributions in Figure 4. As expected, the omniscient allocation is more economical at eliciting a particular aggregate reduction target $M$ due to the lack of private user information, namely about [...] better than the DR mechanism. However, it needs to target more customers, as each customer in the omniscient case receives a smaller reward level than in the DR mechanism.

C. Virtual Reductions
Figure 6 shows the total reduction $\sum_{i \in \mathcal{T}} \delta_i$ of all targeted users and its components $\sum_{i \in \mathcal{T}} \delta_i^{\mathrm{BL}}$ and $\sum_{i \in \mathcal{T}} \delta_i^{r}$ as a function of $M$ for $n = 500$ users, $q = 5$, and elasticities $\{\alpha_i\}_{i=1}^{n}$ drawn from a uniform distribution. The baseline computed with a particular number $x$ of previous days taken into consideration is calculated as the mean of $x$ randomly drawn samples from the empirical consumption distribution (11).

As can be seen from Figure 6, almost the entire reduction is attributed to the baseline component $\sum_{i \in \mathcal{T}} \delta_i^{\mathrm{BL}}$ for small $M$. With larger values of $M$, the contribution of $\sum_{i \in \mathcal{T}} \delta_i^{\mathrm{BL}}$ increases only marginally and eventually starts decreasing. This can be explained by the fact that sorting users in $\mathcal{R}$ tends to put users with the highest $\delta_i^{\mathrm{BL}}$ towards the start of the array, while those with the lowest (and negative) $\delta_i^{\mathrm{BL}}$ bunch up at the end of $\mathcal{R}$. Consequently, as more users are assigned to $\mathcal{T}$, the sum of baseline reductions decreases. The actual reduction $\sum_{i \in \mathcal{T}} \delta_i^{r}$ grows superlinearly with the number of users targeted, because the per-unit reward levels also increase as more users are assigned to $\mathcal{T}$.

[Figure 4 panels: empirical PDF and fit for the shape (Gaussian fit), location (Cauchy fit), and scale (Exponential fit) of the log-normal distribution. Panel title: Distribution of Log-Normal Parameters of Consumption Distribution Across Users.]
Fig. 4: Compound Statistics for Lognormal Consumption Distribution. Left: Shape, Middle: Location, Right: Scale.

[Figure 5 panels: Number of Targeted Users (left) and Total Payment to Users (right), each plotted against the Aggregate Reduction Target.]
Fig. 5: Number of Targeted Users and Total Payment to Users for DR Mechanism (blue) vs. Omniscient Allocation (green), $n = 500$, $q = 5$, $\alpha_i \sim \mathrm{unif}$.

[Figure 6: Reduction Components plotted against the Aggregate Reduction Target $M$. Panel title: Effect of BL Accuracy on $\delta^{\mathrm{BL}}$ vs. $\delta^{r}$.]
Fig. 6: Composition of Target Aggregate Reduction $M$ for varying Baselines. Red: $\sum_{i \in \mathcal{T}} \delta_i^{\mathrm{BL}}$. Blue: $\sum_{i \in \mathcal{T}} \delta_i^{r}$. Parameters: $n = 500$, $q = 5$, $\alpha_i \sim \mathrm{unif}$.

For increasing numbers of baseline averaging components, that is, the number of previous days used to calculate the baseline, the variance of the baseline estimation error $\bar{x}_i - \hat{x}_i$ decreases, and so the virtual reductions decrease. For the limiting case of a perfect baseline, the virtual reductions are zero.

Finally, Figure 7 depicts the total amount of payments the DRP has to make to the users for varying baseline accuracies, restricted to small values of $M$, where virtual reductions have the largest effect (see Figure 6). For more inaccurate baselines (fewer averaging days), the DRP has to pay users less, as it can exploit the virtual reduction component.

VI. CONCLUSION
We modeled Residential Demand Response with a Mechanism Design Framework where a Demand Response Provider asks a subset of its customers under contract to reduce electricity consumption temporarily in exchange for a monetary reward. Each user's consumption in response to a per-unit reduction incentive is modeled as a logarithmic demand curve where the intercept and the slope are private information of users. While each user has a fixed slope, the user-specific intercept, which corresponds to the consumption given no incentive, is modeled as a realization of a compound random variable. This captures the fact that users often do not consume electricity in a profit-maximizing fashion, but rather follow habits, and hence have no explicit utility function. To make an informed choice about the magnitude of reductions in response to incentives to achieve an a priori defined aggregate reduction target $M$, the Demand Response Provider asks for residential customers' bids to elicit their private information. Reductions are measured against a counterfactual estimate of the consumption in the hypothetical case of no DR event, which in this paper is the "10-in-10"-baseline employed by the California Independent System Operator. Since this baseline is plagued by high variance, the Demand Response Provider can exploit "virtual reductions" emanating from high baseline estimates. These are false-positive reductions that are recorded although the users have not reduced, and their role diminishes as the baseline becomes more precise. The Demand Response mechanism is validated on hourly smart meter data of residential customers in California.

[Figure 7: Payments plotted against the Aggregate Reduction Target $M$. Panel title: Effect of BL Accuracy on Payments to Users.]
Fig. 7: Payments to Users to Elicit $M$ for varying Baseline Accuracies.

Our analysis is an initial step towards quantifying economic implications of Demand Response on a residential level. While we approximated users' base demand (i.e., in the absence of incentives) reasonably well with existing smart meter data, the price elasticity of users in response to incentives is unknown, a fact that is complicated by the fundamental problem of causal inference.
Thus, to further validate our analysis on real data, credible parameters for the slope of users' demand curves would be necessary.

Lastly, extending the single-period analysis employed in this paper towards a dynamical problem, which allows for baseline manipulation by users, is a logical next step. Comparing the "10-in-10"-baseline to improved baseline estimates obtained with Machine Learning techniques, which exploit serial correlation of consumption time series, would shed further light on the economics of Residential Demand Response.

APPENDIX
Proof of Theorem 1

Individual Rationality: Notice first that each user $i \in \mathcal{T}$ is given the reward $\tilde{r}_{j(i)}$, where $j(i) \ge j_{\max} \ge i$. The first inequality is a consequence of (8b), which for each $i \in \mathcal{T}$ re-runs (8a) on the subset of users $\mathcal{I} \setminus i$. Thus, to achieve the aggregate reduction $M$ on users $\mathcal{I} \setminus i$, where each user would be given the highest threshold reward of the targeted group, requires more users to be targeted than running the same mechanism on $\mathcal{I}$. Hence $j(i) \ge j_{\max}$. The second inequality is due to the fact that $\tilde{\mathcal{R}}$ is sorted in ascending order, which also implies

$$\mathbb{E}[u_i(\tilde{r}_{j(i)} \mid \theta_i)] \ge \mathbb{E}[u_i(\tilde{r}_i \mid \theta_i)] = 0$$

due to the monotonically increasing property of the expected utility in the reward. Thus, participation in the mechanism and being assigned to $\mathcal{T}$ results in a non-negative expected utility, compared to a zero utility for non-participation. On the other hand, users $i \notin \mathcal{T}$ receive a zero payment and so the expected utility is zero.

Incentive Compatibility: To show that the DR mechanism is incentive compatible, first note that the reward level $r(i)$ for each $i \in \mathcal{T}$ is determined independently of user $i$'s bid $z_i$. For each $i \notin \mathcal{T}$, user $i$ is not given a reward. To show IC, we must therefore consider the following two cases:

1) $i \in \mathcal{T}$ for $z_i = \theta_i$, i.e. user $i$ is assigned treatment with truthful reporting. This implies user $i$ is given reward $\tilde{r}_{j(i)}$, which results in a positive expected utility. Now suppose user $i$ had reported $z_i \ne \theta_i$. Then either the user is still assigned treatment, in which case her reward remains the same, or the user is not assigned treatment, in which case her reward reduces to zero. Thus, misreporting could lead to a zero utility when the user could have had a positive expected utility.

2) $i \notin \mathcal{T}$ for $z_i = \theta_i$, i.e. user $i$ is outside the treatment group with truthful reporting. If user $i$ had reported $z_i \ne \theta_i$, then either the user is still outside the treatment group, which results in a zero utility, or the user is now in the treatment group. In the latter case, note that user $i$ is assigned reward $\tilde{r}_{j_{\max}}$, as $j_{\max}$ is exactly the solution to (8b). Finally, because $\tilde{r}_{j_{\max}} < \tilde{r}_i$ (due to $i \notin \mathcal{T}$), the expected utility turns negative. Thus, misreporting does not improve the expected utility, but could lead to a negative expected utility when the user could have had a zero utility.

Combining the two cases above, it follows that misreporting the true type yields a utility that is at most the utility under truthful reporting. Therefore the maximum expected utility is attained with truthful reporting, $z_i = \theta_i$, and so the DR mechanism is incentive compatible.

Lastly, to show that the mechanism terminates if $0 \le M \le \sum_{i=2}^{n-1} \delta_i(\tilde{r}_{n-1} \mid \theta_i)$, simply notice that $j(i) \le n \ \forall\, i \in \mathcal{T}$ (8b) because $\delta_1(\tilde{r}_k \mid \theta_1) \ge \delta_i(\tilde{r}_k \mid \theta_i)$, $i \le k \le j(i)$, due to the monotonically increasing property of (1) and (7). Hence running the mechanism on $\mathcal{T} \setminus i$ always satisfies $M$.

REFERENCES

[1] P. Palensky and D. Dietrich, "Demand Side Management: Demand Response, Intelligent Energy Systems, and Smart Loads,"
IEEE Transactions on Industrial Informatics, vol. 7, no. 3, pp. 381–388, 2011.
[2] Public Utilities Commission of the State of California, "Resolution E-4728. Joint Utility Proposal for a Demand Response Auction Mechanism Pilot."
[3] P. W. Holland, "Statistics and Causal Inference," Journal of the American Statistical Association, vol. 81, no. 396, pp. 945–960, 1986.
[4] S. Athey and G. Imbens, "Recursive Partitioning for Heterogeneous Causal Effects," Proceedings of the National Academy of Sciences of the United States of America, vol. 113, no. 27, pp. 7353–7360, 2016.
[5] A. Abadie, A. Diamond, and J. Hainmueller, "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program," Journal of the American Statistical Association, vol. 105, no. 490, pp. 493–505, 2012.
[6] D. Zhou, M. Balandat, and C. Tomlin, "Residential Demand Response Targeting Using Observational Data," 2016.
[7] ——, "A Bayesian Perspective on Residential Demand Response Using Smart Meter Data,"
Big Data, IEEE International Conference on, 2013.
[10] N. Li, L. Chen, and M. Dahleh, "Demand Response Using Linear Supply Function Bidding," IEEE Transactions on Smart Grid, vol. 6, no. 4, pp. 1827–1838, 2015.
[11] M. Balandat, F. Oldewurtel, M. Chen, and C. Tomlin, "Contract Design for Frequency Regulation by Aggregations of Commercial Buildings," Communication, Control, and Computing (Allerton), 2014 52nd Annual Allerton Conference on, pp. 38–45, 2014.
[12] S. Han, S. Han, and K. Sezaki, "Development of an Optimal Vehicle-to-Grid Aggregator for Frequency Regulation," IEEE Transactions on Smart Grid, vol. 1, no. 1, pp. 65–72, 2010.
[13] P. Li and B. Zhang, "An Optimal Treatment Assignment Strategy to Evaluate Demand Response Effect," 2016.
[14] P. Samadi, H. Mohsenian-Rad, R. Schober, and V. W. S. Wong, "Advanced Demand Side Management for the Future Smart Grid Using Mechanism Design," IEEE Transactions on Smart Grid, vol. 3, no. 3, pp. 1170–1180, 2012.
[15] H. Ma, V. Robu, N. Li, and D. C. Parkes, "Incentivizing Reliability in Demand-Side Response," The 25th International Joint Conference on Artificial Intelligence
∼ /media/markets-ops/dsr/pjm-analysis-of-dr-baseline-methods-full-report.ashx.
[18] M. J. Osborne and A. Rubinstein, A Course in Game Theory. MIT Press, 1994.
[19] R. B. Myerson and M. A. Satterthwaite, "Efficient Mechanisms for Bilateral Trading," Journal of Economic Theory, vol. 29, no. 2, pp. 265–281, 1983.
[20] K. Maréchal, "Not Irrational but Habitual: The Importance of "Behavioural Lock-in" in Energy Consumption," Ecological Economics, vol. 69, no. 5, pp. 1104–1114, 2010.
[21] P. Milgrom, Putting Auction Theory to Work. Cambridge University Press, 2004.
[22] C. Campaigne, M. Balandat, and L. Ratliff, "Welfare Effects of Dynamic Electricity Pricing," Working Paper, 2016.
[23] S. Borenstein, J. Bushnell, and F. Wolak, "Measuring Market Inefficiencies in California's Restructured Wholesale Electricity Market," The American Economic Review, vol. 92, no. 5, pp. 1376–1405, 2002.
[24] J.-J. Laffont and D. Martimort,