Data-Driven Load Profiles and the Dynamics of Residential Electric Power Consumption

Abstract

The dynamics of power consumption constitutes an essential building block for planning and operating energy systems based on renewable energy supply. Whereas variations in the dynamics of renewable energy generation are reasonably well studied, a deeper understanding of short- and long-term variations in consumption dynamics is still missing. Here, we analyse highly resolved residential electricity consumption data of Austrian and German households and propose a generally applicable methodology for extracting both the average demand profiles and the demand fluctuations purely from time series data. The analysis reveals that demand fluctuations of individual households are skewed and consistently highly intermittent. We introduce a stochastic model to quantitatively capture such real-world fluctuations. The analysis indicates to what extent the broadly used standard load profile (SLP) may be insufficient to describe the key characteristics observed. These results offer a better understanding of demand dynamics, in particular its fluctuations, and provide general tools for disentangling mean demand and fluctuations for any given system. The insights on the demand dynamics may support planning and operating future-compliant (micro) grids in maintaining supply-demand balance.



Mehrnaz Anvari,∗ Elisavet Proedrou,∗ Benjamin Schäfer,∗ Christian Beck, Holger Kantz, and Marc Timme
Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, P.O. Box 60 12 03, D-14412 Potsdam, Germany
DLR Institute for Networked Energy Systems
School of Mathematical Sciences, Queen Mary University of London, United Kingdom
Max Planck Institute for the Physics of Complex Systems, D-01187 Dresden, Germany
Chair for Network Dynamics, Center for Advancing Electronics Dresden (cfaed) and Institute for Theoretical Physics, Technical University of Dresden, 01062 Dresden, Germany

∗ contributed equally

arXiv preprint [physics.app-ph]

I. Introduction

Electrical energy is an essential part of daily life that has to be generated, transmitted, stored and, finally, consumed. Generation and storage of electricity must match the dynamic consumption of the residential, industrial and other sectors at all times. To maintain the balance between the electricity generated by energy providers and the electricity consumed by consumers, energy suppliers need to know the electricity required by all consumer sectors on a broad range of time scales, i.e. from seconds to days. Estimating the typical variations in the electricity demand over the course of a day yields a load profile, which can be obtained by defining a methodology to extract a load profile from empirical data, by creating a model, or by a combination of both.

Research aiming at developing load profiles goes back to at least the 1940s [1]; however, the issue of finding a precise, high-resolution load profile is becoming more and more urgent due to the increasing population, electrical heating systems, electric vehicles for transportation and solar home systems, as well as the increasing share of fluctuating renewable energy (RE) feed-in and the construction of distributed power grids, especially smart grids. In 1999, the first methodologically systematic German household load profile, known as the H0 Standard Load Profile (H0 SLP), was developed [2] and has since been in use without alterations in at least Germany and Austria [3].

We focus here on the residential sector, which consumes around 29% of all electricity in the European Union [4]. Although household load profiles initially used a temporal resolution of around one hour, newer models have temporal resolutions as fine as one second, due to the recent availability of highly resolved data sets of electricity consumption and the usage of smart meters in selected houses (see Supplementary Note 1).
These new data sets allow the grid operators to record the electricity consumption of individual houses at high temporal resolutions. Several previous studies [5–7] analysed high temporal resolution data sets and reported the presence of extreme and significant peaks, which have not been reported for data sets with temporal resolutions of 15 minutes to 1 hour. However, a structured approach to translate these high-resolution data sets into usable load profiles is not readily available to date. Therefore, in this work we first analyse highly resolved electricity consumption data for groups of houses in Austria and Germany. Our data-driven analysis indicates the potential for strong fluctuations and high levels of unpredictability in the distribution grids, see Section II. Then, we use the same analysis to introduce a new load profile that is consistent with high-resolution electricity consumption data.

As mentioned above, one important reason for having a high-resolution load profile is the increasing share of RE feed-in. In contrast to electric energy generated from burning fossil fuels, RE feed-in is weather-dependent, intermittent, and highly variable [8–10]. It thus becomes harder to balance supply and demand. As the share of renewable feed-in is increasing, recent and current research focuses on gaining a deeper understanding of the electricity generated by RE as well as on advanced approaches to balancing demand and supply, e.g. by load-shifting [11, 12].
In contrast, the dynamics of electricity consumption is far from understood, in particular on the residential level, partially because electricity consumption data are difficult to obtain.

Here, we analyse the dynamic characteristics of residential power consumption at a high temporal resolution and develop a generally applicable methodology to extract both the consumption trend and the consumption fluctuations. After reviewing the need for a new demand model in more technical detail (Section II), we analyse the electricity consumption data of German and Austrian houses measured for several weeks. We disentangle the variations in the consumption dynamics into two main factors, the average load profile (ALP) and the statistics of short-time fluctuations around the ALP. First, through the application of the empirical mode decomposition (EMD) [13], we extract the average load profile from the time series data. This extracted trend captures the demand much more accurately than the often-used H0 SLP (Section III). Next, we use superstatistics to model the fluctuations around this trend as a stochastic fluctuation profile (SFP). Combining the trend and fluctuation analysis, we successfully reproduce a synthetic high-resolution load profile, yielding a full data-driven load profile (DLP). Our modelling approach can be readily generalised for use with other data sets, and to facilitate this we provide executable code (Section IV). We thereby develop a load profile methodology that is applicable to existing power grids and data sets, and that also provides the tools for extracting load profiles in different regions and under different boundary conditions, for instance for microgrids with high shares of on-site generation or electric cars.

FIG. 1. Systematic, asymmetric and intermittent deviations of empirical power consumption dynamics from the German standard load profile (H0 SLP). Panels (a) and (b) show power [W] over time for NOVAREF, ADRES and the H0 SLP; panel (c) shows power [W] over time for Main, PV and Main - PV. (a) We compare the H0 SLP with the averaged real consumption data of a single day in winter for the NOVAREF (12 houses) and ADRES (30 houses) data sets. The H0 SLP not only fails to capture the correct daily trend, but we also observe large fluctuations of the consumption at short time scales. (b) The same data sets at 15-minute resolution, emulating the temporal resolution that was used to generate the H0 SLP. Here the failure of the H0 SLP to capture the correct daily trend is even more pronounced. (c) The recorded household power consumption with and without photovoltaics, as well as the electricity generated by the PV module of a single household. Note that the spikes visible in panels (a) and (b) are smaller because the data are averaged over 12 and 30 houses for NOVAREF and ADRES, respectively, while panel (c) shows only a single household.
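The averaging behind panel (b) can be illustrated with a minimal sketch. It assumes a 2 s sampling interval, so that a 15-minute window holds 450 samples. The function name `block_average` and the synthetic numbers (a 200 W baseline with a single 10 s spike of 2000 W) are illustrative assumptions and are not taken from the data sets.

```python
def block_average(samples, block):
    """Mean of each consecutive block of `block` samples.

    A trailing partial block is dropped.
    """
    n_blocks = len(samples) // block
    return [
        sum(samples[k * block:(k + 1) * block]) / block
        for k in range(n_blocks)
    ]


# Hypothetical 30 minutes of 2 s data: 200 W baseline, one 10 s spike of 2000 W.
SAMPLES_PER_15_MIN = 450  # 15 min * 60 s / 2 s
power = [200.0] * 900
power[100:105] = [2000.0] * 5

averaged = block_average(power, SAMPLES_PER_15_MIN)
print(max(power), averaged)
```

In this synthetic example the 2000 W spike raises the first 15-minute mean by only 20 W, which is the sense in which a coarse profile hides short spikes.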

II. Complex demand dynamics – the necessity of new load profiles

Notwithstanding recent advances, energy suppliers still mostly use older load profiles, such as the H0 standard load profile (H0 SLP), which only has a 15-minute temporal resolution. In Germany, the H0 SLP was introduced in 1999. Ninety percent of the residential load data used in its creation were measured in the 1970s or earlier with an hourly temporal resolution. Only 10% of the measured load data had a temporal resolution of 15 minutes. Here, we review some of the recent advances towards a modern load profile and touch on the persistent need for a generally applicable consumption framework, which we provide in the next sections.

Three well-known model classes exist to describe a household load profile.

Top-down or conditional demand analysis models are downward models that use the total electricity consumption estimates of multiple households, as well as macro-variables, to predict the household energy consumption and generate household load profiles [14, 15].

Bottom-up models use micro-variables as input, such as the number of active occupants, the appliances' energy demand and usage time, etc. They also often use Markov chains to generate household load profiles [16, 17]. Finally, hybrid models employ a combination of the techniques used in top-down and bottom-up models to build up a Statistically Adjusted Engineering (SAE) model [18]. How these models have been applied was recently reviewed in detail in [19]. At present, many demand side management models exist that can generate daily residential electricity load profiles, see Supplementary Note 1, [20] and [1] for details. Only a few of these models use a high temporal resolution of the order of seconds, and those require many micro-parameters, which still leaves the need for an accurate, high-resolution, easy-to-use load profile.

FIG. 2. Indicating the interaction between households' energy consumption by considering the diversity factor. (a) The diversity factor, i.e. the ratio between the coincident demand, $P_{\mathrm{cd}}$, and the non-coincident demand, $P_{\mathrm{ncd}}$, of the 30 households belonging to the ADRES data set, evaluated every 15 minutes for 14 days. There are time intervals during a day in which the diversity factor reaches 0.6. (b) and (c) show the energy consumption trajectories (power [W] over time) of all 30 households from 6:00 to 6:15 and from 11:30 to 11:45, respectively, together with the coincident and non-coincident demand. The energy consumption of all houses is low around 6:00 and then gradually increases together until around 11:00; consequently, the difference between $P_{\mathrm{cd}}$ and $P_{\mathrm{ncd}}$ decreases, indicating the interaction between the houses' electricity consumption.

A focus on higher temporal resolution is necessary to fully understand modern consumption patterns and to respond quickly, for instance to disturbances caused by input fluctuations or regulatory or trading anomalies [21]. Recent statistical research on power consumption (see for example [6] and [7]) demonstrates that substantial differences exist between the statistical features of the highly resolved power consumption and consumption on a 15-minute time scale. Analysing power consumption on the short time scale of seconds to one minute reveals extreme consumption spikes, which are completely ignored in the 15-minute load profile [6]. Fig. 1(a) and (b) compare the H0 SLP with the load profiles of two residential data sets of high temporal resolution, measured in Germany and Austria. The Austrian data set was recorded in 2009-2010 during the ADRES project [22]. The German data set was recorded between 2013 and 2016 during the NOVAREF project [23]. In Supplementary Note 2 we report further analysis of data sets related to 70 households, recorded in August and September 2019 in Germany during the ENERA project. As can be seen in Fig. 1(a) and (b), the averaged single-day load profiles of both the ADRES and the NOVAREF data sets strongly deviate from the H0 SLP. Specifically, the trend (mean) of the load profile of the measured data is not well described by the H0 SLP and the fluctuations on short time scales are significant (see Section III).
Furthermore, the analysis of the statistics of the highly resolved power consumption reveals the presence of non-Gaussian, intermittent and asymmetric fluctuations, which need to be taken into account when designing demand side management and control mechanisms (see Section IV and Supplementary Note 4).

Some of the observed stochastic properties could be explained by the increased usage of new power generation and consumption devices: rooftop photovoltaic panels (PVs) in the first case and new electronic devices such as electrical heating systems, smartphones, tablets, robot vacuum cleaners, electric vehicles, etc. in the second case [24]. The usage of such devices will influence and alter the shape of the current standard load profile. As shown in Fig. 1(c), the consumption can even reach negative values if houses are directly connected to local RE resources [25]. Local fluctuations in solar and wind generation may further lead to coincident load profile spikes, both positive and negative. Therefore, the precise prediction and management of the power consumption of households on a short time scale is essential to ensure the stability of distributed grids. In addition to the installation of more on-site generation in the residential sector, replacing fossil-fuelled cars with electric ones, which are charged using household electricity, changes the shape of the household load profile (see Supplementary Note 2), especially when organising consumers in microgrids [26]. Since we expect an increasing number of such electric cars, as well as an increasing penetration of strongly fluctuating RE generation, a new data-driven load profile based on modern and highly resolved measurements is urgently needed.

As mentioned above, extreme consumption spikes are completely ignored in the 15-minute load profile as they are averaged out.
However, these spikes are of particular importance to lower-voltage distribution grids, where coincident consumption can dominate the consumption patterns locally or on a country scale, e.g. due to synchronised activity during major (e.g., sports) events [27]. A well-known simple measure to quantify the coincident electricity consumption between households, and to see how this coincidence depends on the number of households, is the diversity factor [28]. To determine the diversity factor, we first evaluate the maximum coincident demand, $P_{\mathrm{cd}}$, and the non-coincident demand, $P_{\mathrm{ncd}}$, in 15-minute and one-day time windows, respectively. For this purpose, we sum the maximum power demand of all houses within every 15-minute window to obtain $P_{\mathrm{cd}}$, and then divide $P_{\mathrm{cd}}$ by the sum of the maximum power demand of all houses over the course of the day, i.e. $P_{\mathrm{ncd}}$.

The diversity factor varies from zero to one, where zero indicates no coincident electricity consumption between households, while a diversity factor equal to one indicates strong coincidence. As an example, the diversity factor of the ADRES data set, sampled every 15 minutes for 14 days, is shown in Fig. 2(a). To clearly indicate the interaction between houses during a day, Fig. 2(b) and (c) show, for day 13, (i) the energy consumption trajectories of all 30 households (in grey), (ii) the maximum coincident demand (in red) and (iii) the non-coincident demand (in purple) from 6:00 to 6:15 (Fig. 2(b)) and from 11:30 to 11:45 (Fig. 2(c)).

Looking at the household trajectories, it is clear that the energy consumption of all houses is low at the beginning of the day and then gradually increases until around 11:00. Thus, the value of the diversity factor, which is the ratio between the coincident demand ($P_{\mathrm{cd}}$) and the non-coincident demand ($P_{\mathrm{ncd}}$), increases during daytime.
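The diversity factor described above can be sketched in a few lines. The sketch follows the wording of the text literally: the coincident demand of a window is taken as the sum over houses of each house's maximum within that window, and the non-coincident demand as the sum of each house's maximum over the whole day. The function name, the data layout (one list of power samples per household) and the toy numbers are assumptions made for illustration only.

```python
def diversity_factor(day_loads, window):
    """Diversity factor per window for one day of household loads.

    day_loads: list with one list of power samples [W] per household,
               all of equal length and covering the same day.
    window:    number of samples per window (e.g. 450 for 15 min at 2 s).
    """
    n_samples = len(day_loads[0])
    # Non-coincident demand: sum of the daily maxima of the individual houses.
    p_ncd = sum(max(house) for house in day_loads)
    if p_ncd <= 0:
        raise ValueError("non-coincident demand must be positive")
    factors = []
    for start in range(0, n_samples, window):
        # Coincident demand: sum of the per-house maxima within this window.
        p_cd = sum(max(house[start:start + window]) for house in day_loads)
        factors.append(p_cd / p_ncd)
    return factors


# Toy example: two households, four samples, windows of two samples.
house_a = [100.0, 200.0, 1000.0, 300.0]
house_b = [50.0, 800.0, 100.0, 100.0]
print(diversity_factor([house_a, house_b], window=2))
```

A value close to one in a given window means that most houses reach their daily peak within that same window, so their spikes add up rather than average out.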
The increase of the diversity factor during the day demonstrates the interaction between the demand of the households and is the reason why the significant spikes in the average energy consumption do not disappear after averaging.

To show the relationship between the value of the diversity factor and the number of households, we calculate the diversity factor for the NOVAREF and ENERA data sets, which are composed of 12 and 70 households, respectively, in Supplementary Note 2. Our results demonstrate that the values of the diversity factor are not zero during a day, even for the 70 ENERA houses, and for some time intervals the diversity factor ranges between 0.3 and 0.4 or is even larger (maximum 0.6). This indicates that, regardless of the number of houses, coincident demand will take place during certain time intervals (e.g. lunch and dinner time) and spikes during that time will not be averaged out. This results in a spiky load profile, such as the single-day averaged ENERA load profile, see also Supplementary Note 1.

In the next sections, we introduce a generally applicable methodology for creating residential load profiles and demonstrate its applicability through comparisons with power consumption data sets measured in Germany and Austria, as well as with the industry standard load profile. In the next section (Section III) we present a methodology to extract the averaged load profile (ALP).

III. Demand trend: Mode decomposition

In this section we present the Average Load Profile predictor, a methodology that extracts the Averaged Load Profile (ALP) from high temporal resolution data sets and creates the load profile of a full week that explicitly respects differences between the different days of a week, and in particular differences between different working days, much as the H0 SLP does. Its main advantage compared to the H0 SLP is that it only requires 4 weeks of high-resolution measured data and can, therefore, be applied to both small neighbourhoods and whole cities. Furthermore, its only input is the electricity consumption trajectory, so it does not rely on numerous micro- or macroscopic parameters, as existing models do. As a result, it can easily be used to analyse both present and future residential power systems, which might include new technologies and devices. Due to the high temporal resolution of the data sets we use, the ALP predictor has a higher level of accuracy than the H0 SLP. It is, therefore, a good alternative to the H0 SLP and could provide more accurate results, especially when investigating or operating microgrids (a detailed discussion of the subject can be found in Sec. V). In its present form, the ALP predictor can predict the weekly trend of the load profile of a group of houses (anywhere from 12 to 70 or more houses).

To produce the ALP predictor we extract the consumption trend from four consecutive weeks of high-resolution electricity consumption measurements. In this section we present our results for the NOVAREF data set (temporal resolution = 2 s). The methodology we use to create the ALP predictor is the following. We split the measured data into multiple modes using the Empirical Mode Decomposition (EMD) [13], and so separate the long-term trends from the short-term fluctuations.
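To make the idea of splitting a signal into modes concrete, the following is a deliberately simplified sketch of the sifting procedure behind the EMD. It is not the implementation used for the results in this paper: it uses piecewise-linear envelopes instead of the cubic splines of the standard EMD [13], and a fixed number of sifting iterations instead of a convergence criterion. All function names and the synthetic test signal are assumptions made for illustration.

```python
import math


def _extrema(x):
    """Indices of the interior local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def _envelope(idx, x):
    """Piecewise-linear curve through the points (i, x[i]) for i in idx,
    held constant before the first and after the last of these points."""
    env = [x[idx[0]]] * len(x)
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env[i] = (1.0 - w) * x[a] + w * x[b]
    for i in range(idx[-1], len(x)):
        env[i] = x[idx[-1]]
    return env


def simple_emd(signal, max_modes=10, n_sift=10):
    """Return a list of modes, fastest first, with the residual trend last."""
    residual = list(signal)
    modes = []
    while len(modes) < max_modes:
        maxima, minima = _extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left: treat the residual as the trend
        h = residual
        for _ in range(n_sift):
            maxima, minima = _extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            upper = _envelope(maxima, h)
            lower = _envelope(minima, h)
            # Subtract the local mean of the two envelopes.
            h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
        modes.append(h)
        residual = [r - m for r, m in zip(residual, h)]
    modes.append(residual)
    return modes


# Synthetic signal: fast oscillation + slow oscillation + linear drift.
signal = [
    math.sin(2 * math.pi * i / 10) + 0.5 * math.sin(2 * math.pi * i / 100) + 0.01 * i
    for i in range(400)
]
modes = simple_emd(signal)
reconstructed = [sum(m[i] for m in modes) for i in range(len(signal))]
print(len(modes), max(abs(a - b) for a, b in zip(signal, reconstructed)))
```

By construction, each mode is subtracted from the running residual, so summing all returned modes recovers the input up to floating-point rounding.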
Because the EMD extracts all the signals present in the data, the original data set can be recreated one-to-one, without any loss of information, by summing up all the modes. A detailed discussion of the adaptive time-frequency data analysis via the EMD can be found in Methods, Supplementary Note 3 and in a recent paper applying EMD in a stochastic context [29].

FIG. 3. Determining the optimal number of modes for the average load profile (ALP). Top panels: the first three panels from left to right show the power mismatch $\Delta P$ [W] between the original NOVAREF data and the ALP for (a) 5, (b) 8 and (c) 16 summed modes, respectively. The last top panel (d) shows $\Delta P$ between the original NOVAREF data and the H0 SLP. Panel (e) shows the MSE of the ALP as a function of the number of modes $N$ (yellow curve) and the MSE of the H0 SLP (black line). The largest difference between the H0 SLP and the ALP (and so the optimal number of modes to sum) is achieved for $N = 8$, but the ALP clearly outperforms the H0 SLP for all sufficiently large $N$ (see main text).

In order to predict the future daily and weekly household demand we must first train the predictor. As a first step we select 4 chronologically consecutive weeks of measured data with no data gaps from the NOVAREF data set, i.e. 07.01–14.01, 14.01–21.01, 04.02–11.02, 11.2–18.04. This constitutes our training set. Then, we calculate the average of these four weekly consumption profiles and apply the EMD method to it. Through this process 18 individual and independent modes are extracted. We then divide these modes into two categories: (a) high-frequency and (b) low-frequency modes. The high-frequency modes contain information on the stochastic consumption fluctuations (see Section IV), while the low-frequency modes are almost free of stochastic fluctuations and are instead dominated by deterministic effects. Next, we sum the last $N$ low-frequency modes.

In order to get the best performing ALP we must determine the optimal number of modes $N$ ($N_{\mathrm{optimal}}$). If we sum too few modes, the predictor will fail to generate an ALP that tracks the high-resolution consumption better than the H0 SLP. If, on the other hand, we sum too many modes, it will overfit the training data set and lead to inaccurate future load predictions.
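The training steps above, together with an error-based choice of $N$, can be sketched as follows. The sketch assumes that the modes are already available as plain lists ordered from fastest to slowest, and it scores each candidate $N$ by its mean-squared error against held-out weeks. The function names and the tiny hand-made numbers are illustrative assumptions, not part of the published code.

```python
def average_weeks(weeks):
    """Point-wise mean of equally long weekly consumption series."""
    return [sum(values) / len(weeks) for values in zip(*weeks)]


def sum_last_modes(modes, n):
    """Sum the last n modes (modes ordered from fastest to slowest)."""
    return [sum(values) for values in zip(*modes[-n:])]


def mse(a, b):
    """Mean-squared error between two equally long series."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def select_n(modes, validation_weeks):
    """Return (best_n, errors), where errors[n] is the MSE of the n-mode
    profile averaged over all validation weeks."""
    errors = {}
    for n in range(1, len(modes) + 1):
        profile = sum_last_modes(modes, n)
        errors[n] = sum(mse(profile, week) for week in validation_weeks) / len(
            validation_weeks
        )
    best_n = min(errors, key=errors.get)
    return best_n, errors


# Hand-made example: three "modes" of a training profile and one held-out week
# that shares the two slowest modes but not the fastest one.
toy_modes = [[1.0, -1.0, 1.0, -1.0], [2.0, 2.0, -2.0, -2.0], [5.0, 5.0, 5.0, 5.0]]
held_out = [[7.0, 7.0, 3.0, 3.0]]
best_n, errors = select_n(toy_modes, held_out)
print(best_n, errors)
```

In the toy example, one mode misses part of the repeatable pattern and three modes carry over a fluctuation that is specific to the training data, so the intermediate choice gives the lowest validation error.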
To determine $N_{\mathrm{optimal}}$ and quantify the performance of the optimised ALP, we use the mean-squared error (MSE) on a validation set consisting of 29 randomly chosen weeks of data from the NOVAREF data set (see Methods for details on the MSE). We stress that we used 29 weeks of data because they were available to us. For the purpose of applying this method to other (possibly shorter) data sets, only two additional weeks of data are needed to determine the ALP.

The results of this validation analysis reveal the optimal number of modes, which in this case is $N_{\mathrm{optimal}} = 8$, see Fig. 3. Indeed, even for values of $N > 7$, the generated ALP outperforms the H0 SLP in terms of demand prediction. Fig. 3 shows the results for one of the 29 weeks, and the ALP is found to predict the future daily and weekly electricity consumption of the 12 NOVAREF households much more accurately than the H0 SLP.

Finally, we use 3 randomly chosen weeks to test our predictor and visualise modes and trajectories. We find that the ALP outperforms the H0 SLP when $N_{\mathrm{optimal}}$ is chosen, while it performs worse for a small number of modes, see Fig. 4. Furthermore, we observe the different modes, ranging from high-frequency fluctuations (mode 1) to a slowly changing trend (mode 16). We continue to evaluate the ALP performance in the next section (Section IV).

FIG. 4. The ALP is a suitable replacement for the H0 SLP with higher precision. (a)-(b) Comparison of the original data with the H0 SLP for the selected dates and the ALP generated by summing 5 and 8 modes (power [W] over time). The 5-mode ALP is not a good approximation of the original data. The 8-mode ALP is a suitable approximation of the signal while excluding the stochastic behaviour of the original data. (c)-(f) Selected examples of the modes extracted using the EMD (power [W] over time [s]). We categorise the first nine modes as high-frequency modes, and consequently the tenth to eighteenth modes as low-frequency modes. From left to right, (c) mode 1, (d) mode 9, (e) mode 12 and (f) mode 16 are shown.

The ALP predictor has several major advantages compared to both the H0 SLP and the available demand side management models. Firstly, it is data-driven and is based on modern, high temporal resolution (2 s) measurements, whereas the H0 SLP is based mostly on hourly resolution data measured before 1999, with a small subset of 15-minute resolution data measured between 1990 and 1999 [2]. Secondly, because only four chronologically consecutive weeks of data are necessary to generate the ALP and only two additional weeks of data are necessary to determine $N_{\mathrm{optimal}}$, it can be used to predict the electricity consumption of both small and large groups of houses and does not need the plethora of micro- and macro-parameters that presently available demand side management models require. Lastly, the ALP has a higher level of accuracy than the H0 SLP due to the high temporal resolution of the data sets it uses.

In the next section (Section IV) we present a stochastic model to generate the stochastic fluctuations around the trend.

IV. Demand fluctuations: Stochastic model

Having extracted the predominantly deterministic trend of the consumption, we now turn to the stochastic fluctuations around this trend. Specifically, we split the power consumption into trend $P^{(\mathrm{trend})}$ and fluctuations $P^{(\mathrm{fluc.})}$:

$$P(t) = P^{(\mathrm{trend})}(t) + P^{(\mathrm{fluc.})}(t). \qquad (1)$$

First, we investigate the statistical properties of these consumption fluctuations. Analysing the histograms, we notice that the fluctuations are skewed, i.e. asymmetric, and heavy-tailed, so large deviations are much more likely than if they were characterised by a simple Gaussian distribution, see Fig. 5.

Next, we construct a stochastic model, which describes the observed consumption fluctuation statistics by applying superstatistics methods [21, 30–33]: Dividing the full trajectory into several shorter trajectories allows us to characterise each local distribution with a simpler distribution, such as a Gaussian or an exponential distribution. In the case of consumption fluctuations, the fluctuations within each time window of length approximately $T$, the long time scale, follow a Maxwell-Boltzmann distribution. Each local Maxwell-Boltzmann distribution has its distinct scale parameter $\sigma_{\mathrm{MB}}$ and offset from zero $\mu_{\mathrm{MB}}$. When we now move our analysis from one local distribution to the next one, we observe a slow time dynamics of these Maxwell-Boltzmann parameters $\sigma_{\mathrm{MB}}$ and $\mu_{\mathrm{MB}}$ and thereby a time dynamics of the local Maxwell-Boltzmann distribution itself. Superimposing these time-varying local distributions, we re-obtain the full aggregated statistics as approximately a q-Maxwell-Boltzmann distribution, see Fig. 5.

FIG. 5. The intermittent characteristic of power consumption fluctuations. (a) The total power consumption $P$ [W] is the sum of the trend consumption $P^{(\mathrm{trend})}$ (the ALP), obtained by the EMD method described in the previous section, and the fluctuations $P^{(\mathrm{fluc.})}$. We record the difference between trend and real demand as the fluctuation trajectory. (b) The probability density function (PDF) of the consumption fluctuations $P^{(\mathrm{fluc.})}$ [W] does not follow a Gaussian distribution but is better described by a q-Maxwell-Boltzmann distribution, especially on the right flank. The histogram uses the NOVAREF data from 2018, with Gaussian and q-Maxwell-Boltzmann parameters determined by the method of moments.

Mathematically, we formulate stochastic equations of motion for the fluctuations leading to local Maxwell-Boltzmann distributions as follows: We define auxiliary variables $x_i$, with $i \in \{1, 2, ..., J\}$, each following a simple Ornstein-Uhlenbeck process, based on independent Wiener processes $W_i$:

$$\mathrm{d}x_i(t) = -\gamma x_i(t)\,\mathrm{d}t + \epsilon\,\mathrm{d}W_i, \qquad (2)$$

with damping $\gamma$ and fluctuation amplitude $\epsilon$. Hence, the $x_i$ are independent and identically distributed Gaussian random variables with mean 0 and standard deviation $\sigma = \epsilon / \sqrt{2\gamma}$. Then, the demand fluctuations $P^{(\mathrm{fluc.})}$ are obtained by aggregating these Gaussian variables and applying the observed shift from zero $\mu_{\mathrm{MB}}$:

$$P^{(\mathrm{fluc.})}(t) = \sqrt{(x_1(t))^2 + (x_2(t))^2 + ... + (x_J(t))^2} + \mu_{\mathrm{MB}}. \qquad (3)$$

As is known from statistical physics [34], choosing $J = 3$ yields exactly Maxwell-Boltzmann distributions in the probability density $p$:

$$p\left(P^{(\mathrm{fluc.})}\right) = \frac{1}{\sigma_{\mathrm{MB}}^{3}} \sqrt{\frac{2}{\pi}} \left(P^{(\mathrm{fluc.})} - \mu_{\mathrm{MB}}\right)^{2} \exp\left[-\frac{\left(P^{(\mathrm{fluc.})} - \mu_{\mathrm{MB}}\right)^{2}}{2\sigma_{\mathrm{MB}}^{2}}\right], \qquad (4)$$

where the scale parameter $\sigma_{\mathrm{MB}}$ of the local Maxwell-Boltzmann distribution is identical to the standard deviation of the independent Gaussian variables, $\sigma_{\mathrm{MB}} = \sigma$. We do consider cases of $J \neq 3$ in Supplementary Note 4 and find that $J = 3$ is the best fit to the data. Indeed, the approach of 3 independent Gaussian variables is very convenient for the computational application. Alternative modelling approaches using a mathematically simple 1-D process would require more complex dynamics.

In applying the superstatistical approach, we implicitly assumed a separation of time scales. Here, we have the long time scale $T$, on which we locally observe Maxwell-Boltzmann distributions of the power fluctuations, and the short time scale $\tau = 1/\gamma \approx$ … 400 seconds, see also Supplementary Note 4. Comparing these two time scales, we observe a clear separation between the long time scale $T$ and the short time scale $\tau$, which differ by a factor of 20. Hence, each local Maxwell-Boltzmann distribution relaxes much faster towards its equilibrium (with rate $1/\tau = \gamma$) than the overall process changes towards a new Maxwell-Boltzmann distribution (which happens with a rate of $1/T$).

FIG. 6. Local Maxwell-Boltzmann distributions of power demand fluctuations. (a) Using only the high-frequency modes from the empirical mode decomposition (EMD), we plot the consumption fluctuations $P^{(\mathrm{fluc.})}$ [W] over time. (b-c) PDF and CDF of $P^{(\mathrm{fluc.})}$ [W] for the data and for a Maxwell-Boltzmann (MB) distribution. Using superstatistical methods [30, 31], we find a long time scale $T \approx$ …

Finally, we combine the EMD-based trend of the demand with the stochastic fluctuation model, obtaining a data-driven load profile (DLP), and then compare it to the original NOVAREF consumption data. We notice that while the precise trajectories are not identical (by construction), the stochastic properties align very well, with a drastically reduced error compared to the standard H0 SLP model (see Fig. 7). In the Methods section we provide a link to the software and pseudo-code to generate this kind of trajectory for other demand regions and data sets.

V. Discussion

Summarising, we have shown that modern residential electricity load profiles strongly differ from the H0 standard load profile (H0 SLP), which is widely used in the industry. We set out to replace or supplement the existing standard load profile and obtained three main results: First, using the empirical mode decomposition (EMD) method, we developed a method to approximate the average load profile (ALP). Second, using superstatistics, we developed a simple stochastic fluctuation profile (SFP) to describe the non-Gaussian fluctuations of the demand around the trend. Finally, combining both trend and fluctuations yields a full data-driven load profile (DLP).

All three findings are critical for the stable operation of the power system: Knowledge of the expected demand is critical for energy providers to calculate how much power is needed by each household within a given time period. Simultaneously, knowledge of how much the demand might fluctuate around this trend is essential in order to have sufficient balancing and back-up power at hand. According to the Bundesverband der Energie- und Wasserwirtschaft e. V. (BDEW) [2], the electricity load of a household can be expected to deviate between 10 % and 20 % at any given time. Our fluctuation model provides a better quantitative estimate for these fluctuations, which helps to avoid an overestimation of the demand and excessive use of expensive, quickly dispatchable generation [35]. On the other hand, we have to prevent an underestimation of the demand, as this could easily lead to a collapse of the system.

Another strong point of our modelling approach is its flexibility: We do not present a fixed load profile but a methodology to extract trend (ALP) and fluctuations (SFP) for any present or future power system. This can be applied to different regions or even continents. Maybe even more importantly, the extracted profiles can easily be updated based on recent developments in consumption, e.g. due to additional PV installation or the adoption of electric cars.
Similarly, new load profiles can be created for newly built micro or smart grids [36].

[Figure 7, panels (a)-(c). Axes: Time [d] vs. Power [W]; |ΔP| [W] vs. PDF. Legend: NOVAREF, H0 SLP, ALP.]

FIG. 7. Synthetic power demand in agreement with empirical data. (a) We display a brief trajectory of the real consumption (blue), the H0 SLP (black) and the new ALP (orange). The ALP curve overall stays closer to the real consumption values than the H0 SLP does. (b) The histogram of the power mismatch $\Delta P = |P^{(\mathrm{real})} - P^{(\mathrm{model})}|$ shows higher deviations of the demand forecast for the H0 SLP compared to the ALP. We also report the mean-squared error (MSE) of the H0 SLP and the ALP with respect to the NOVAREF data set. (c) Combining the trend extraction via EMD to obtain the ALP and the superstatistical model for demand fluctuations (DLP) approximates the real consumption histogram very well, in particular for large consumption values. All trajectories use the same time stamps, starting 15 minutes past midnight on 15th of April of the NOVAREF data set. The histograms return almost identical results for other weeks.

The H0 SLP focused mainly on the time scale of 15 minutes and reports much smaller fluctuations than our new load profiles. How can we explain this? First, we note that the H0 SLP used older data, with a temporal resolution much lower than the 1 second of our data sources. More relevant, however, is the power system perspective: Generation and demand are scheduled for fixed time intervals, such as 15 minutes, and all fluctuations and deviations within each interval are taken care of by control mechanisms [37]. This view also emphasises the role of the high-voltage transmission grid, where some demand fluctuations that we observe on the distribution grid might not occur. Our temporally highly resolved model and focus on the distribution grid become increasingly relevant: Conventional generators and their stabilising inertia are removed from the grid and renewable generators are often directly coupled with households. Hence, the balance of supply and demand has to be present also on the distribution level and on an increasingly fast time scale.
Finally, we note that not only do both households and renewables introduce fluctuations in the distribution power grid, but their fluctuations also share similarities. In particular, the heavy tails of consumption fluctuations and their slowly decreasing power spectrum are also observed in renewable generation [10].

Our model is also especially useful in the case of micro-grids coupled with smart meters. Micro-grids are small autonomous grids, which can function independently or semi-independently from the main power grid [38]. In the past, small autonomous grids mainly comprised remote, usually rural villages or households powered by fossil fuel generators. The concept has been reconfigured to help overcome the challenges of integrating intermittent renewable energy production into the main energy grid [38], [39] as well as to provide electricity to populations in remote regions while replacing fossil fuels with renewable resources. Such grids, however, always face the problem of matching the intermittent energy generation with the, as we have shown in Section II, also intermittent power consumption of the households that comprise them. This task is made harder by the fact that many microgrids are composed of fewer than the 332 households that the H0 SLP assumes. As shown in Sections II and III, the consumption behaviour of a small number of houses deviates significantly from the H0 SLP and displays a strong stochastic component.
Which technologies can be used to achieve a successful balancing between generation and consumption [38], [40], which protection schemes [41] and control systems can be used to ensure the stable and secure operation of these grids [42], [43], [44], and further scheduling and operation issues [45], [39], [46] are all active areas of research.

Furthermore, our model could allow the administrators of an autonomous or semi-autonomous microgrid equipped with smart meters to use the collected data to create an accurate prediction of the electricity consumption of the microgrid's units in the following months and to take the appropriate measures to ensure continuous stable operation, e.g. by ensuring enough balancing power is available. At present the H0 SLP, which does not differentiate between the electricity consumption of a handful of houses and a small city, makes this impossible. Our model would also provide researchers who have access to smart meter data (or any other high temporal resolution data) with much more accurate electricity consumption predictions than the H0 SLP currently allows. These predictions could then be used as input for further consumption models.

Because our approach is entirely dependent on data, with no prior assumptions being made, it could theoretically be used to predict the electricity consumption not only of households but also of small businesses or full countries. The current lack of freely available data makes this a task for the future. For example, we wish to apply our model to consumption data of other regions, such as the UK or the US, and compare it to the local equivalent of the H0 SLP. Regions like Canada, where a large number of smart meters are already installed, are particularly promising. While we focused here on household demand, the presented methods should also be applicable to extract the demand trend and fluctuations of industry consumption or combinations of very different consumers.
Finally, the fluctuation modelling could be extended to include longer-term correlations or match the precise increments in the observed data sets. This would move the study of consumption fluctuations closer to the well-researched state of fluctuations in renewable generation.

Methods

EMD

The empirical mode decomposition (EMD) is a well-known method used for nonlinear, non-stationary data sets. It decomposes the data into a finite number of intrinsic mode functions based on the local properties of the data. Therefore, this method is not restricted to linear or stationary time series, as is the case with other methods, such as Fourier spectral analysis. To determine the empirical mode functions, one applies the following steps to a data set:

(1) First, the envelopes of the local maxima and of the local minima of the data set are defined separately. For instance, all local maxima are connected by cubic splines to form the upper envelope, and the same procedure is repeated for the lower envelope. Consequently, all data are confined between the upper and lower envelopes. It is worth mentioning that in this method there is no need for the data to have zero crossings; all values can therefore be purely positive or purely negative.

(2) Having defined the upper and lower envelopes, their mean value $m_1$ is calculated and subtracted from the original data, i.e. $X(t) - m_1 = h_1$, where $X(t)$ is the original data and $h_1$ is called the first component.

(3) In the next step the first component $h_1$ is converted to an intrinsic mode function (IMF), i.e. $h_1$ should have the same number of extrema and zero crossings, and its upper and lower envelopes should be symmetric around zero (for details see [13]). After finding the IMF $c_1$, we subtract it from the original data: $X(t) - c_1 = r_1$. This $c_1$ is the first mode and contains the shortest period of the original data $X(t)$.

(4) Steps (1)-(3) are repeated on the residue until $r_n$ becomes a monotonic function and it becomes impossible to define any envelope for it.

If we sum up all IMFs and the residue, we reproduce the original data again, i.e.

$$X(t) = \sum_{i=1}^{n} c_i + r_n. \qquad (5)$$

Consequently, we can obtain all the intrinsic oscillation modes of the data set with the EMD method.

Trend extraction
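Since the trend extraction builds on the EMD modes, the sifting steps (1)-(4) of the EMD section may be easier to follow as code. The following is a minimal Python sketch under several simplifying assumptions, and not the implementation used for the results in this paper: the envelopes are piecewise linear instead of cubic splines, each mode is obtained with a fixed number of sifting passes, and the decomposition stops once the residue has fewer than two maxima or minima. All function names (`local_extrema`, `envelope`, `sift`, `emd`) are hypothetical.

```python
import math


def local_extrema(x):
    """Return the indices of the interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points of x at idx.

    Simplification: the text connects the extrema with cubic splines.
    Both end points of x are used as extra knots so that the envelope
    covers the whole series.
    """
    knots = [0] + idx + [len(x) - 1]
    env = []
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    env.append(x[-1])
    return env


def sift(x, n_sift=10):
    """Steps (1)-(3): repeatedly subtract the envelope mean to get one mode."""
    h = list(x)
    for _ in range(n_sift):  # assumption: fixed number of sifting passes
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_modes=20):
    """Step (4): extract modes until the residue has too few extrema left."""
    modes, residue = [], list(x)
    for _ in range(max_modes):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break
        c = sift(residue)
        modes.append(c)
        residue = [r - v for r, v in zip(residue, c)]
    return modes, residue


# Toy signal: fast oscillation + slow oscillation + linear trend.
signal = [
    math.sin(2 * math.pi * t / 10) + 0.5 * math.sin(2 * math.pi * t / 100) + 0.01 * t
    for t in range(500)
]
modes, residue = emd(signal)
# Eq. (5): the modes and the residue add up to the original data again.
rebuilt = [sum(vals) + r for vals, r in zip(zip(*modes), residue)]
```

The modes are returned from highest to lowest frequency, which is the ordering assumed when the low-frequency modes are summed to form the ALP. For real consumption data, a tested EMD implementation with spline envelopes and a proper IMF stopping criterion would be preferable to this sketch.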

In this section we present the adaptive time-frequency data analysis we used to create the ALP. It is based on a normalised version of the one-step prediction mean squared error (MSE) [47] of an estimator.

In order to produce the ALP, we create a predictor of the future daily and weekly household electricity demand, which we then train using four chronologically consecutive weeks of high-resolution electricity consumption measurements that have no data gaps. In this case, these are weeks 07.01-14.01, 14.01-21.01, 04.02-11.02, 11.2-18.04 of the NOVAREF data set. For accurate and meaningful results we must use data from the same group of houses whose demand we want to predict.

The first step in this process is calculating the average of the four weeks of data:

$$E_{mean}(t) = \frac{1}{4} \sum_{i=1}^{4} E_i(t), \qquad (6)$$

where $E_i(t)$ is the electricity consumption of week $i$ and $i$ runs over the four weeks used in the calculation.

For the next step we apply the EMD to the averaged weekly electricity demand profile $E_{mean}$. In the case of the NOVAREF data, 18 individual and independent modes are extracted. Finally, to calculate the ALP we sum the last $N$ low-frequency modes:

$$ALP = \sum_{i=1}^{N} M_{i+s}, \qquad (7)$$

where $s$ is an offset such that $M_{s+1}$ is the highest-frequency mode still included. In the case of the NOVAREF data, we have $N = 8$ and $s = 10$, hence all modes from $i = 11$ up to $i = 18$ are summed up to obtain the ALP.

To determine the optimal number of low modes to sum ($N_{optimal}$) and quantify the performance of the optimised ALP, we use the mean-squared error (MSE). The MSE measures the average squared difference between the estimated values and the actual values [47], and in this paper we calculate it as follows:

$$MSE = \frac{1}{L} \sum_{i=1}^{L} \left[ C(\mathrm{predicted}, t_i) - C(\mathrm{actual}, t_i) \right]^{2}, \qquad (8)$$

where $C(\mathrm{predicted}, t_i)$ is the predicted consumption time series, $C(\mathrm{actual}, t_i)$ is the measured consumption time series, $t_i$ is the time in seconds with $1 \le i \le L$, and $L$ is the length of the data used to normalise the ALP.
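Equations (6)-(8) translate almost directly into code. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the modes are stored from highest to lowest frequency with the slowest component (including any residue) as the last entry, which is a convention chosen here rather than one stated in the text.

```python
def weekly_mean(weeks):
    """Eq. (6): pointwise average of equally long weekly series."""
    n_weeks = len(weeks)
    return [sum(values) / n_weeks for values in zip(*weeks)]


def alp_from_modes(modes, n_low):
    """Eq. (7): sum the last n_low (lowest-frequency) modes."""
    if not 1 <= n_low <= len(modes):
        raise ValueError("n_low must be between 1 and the number of modes")
    return [sum(values) for values in zip(*modes[-n_low:])]


def mse(predicted, actual):
    """Eq. (8): mean squared difference of two equally long series."""
    if len(predicted) != len(actual):
        raise ValueError("series must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Tiny made-up numbers, only to show the call pattern.
e_mean = weekly_mean([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
alp = alp_from_modes([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]], n_low=2)
error = mse([1.0, 2.0], [1.0, 4.0])
```

With 18 modes, `alp_from_modes(modes, 8)` would sum modes 11 to 18, matching the NOVAREF values $N = 8$ and $s = 10$ quoted in the text.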
$L$ is 7 days measured in 2-second increments:

$$L = 7\ \text{days} \ast \ldots\ \text{hours} \ast \ldots\ \text{minutes} \ast \ldots\ \text{seconds} \qquad (9)$$

The MSE is calculated for both the ALP ($MSE_{ALP}$) and the H0 SLP ($MSE_{H0\,SLP}$), and the two are compared (see Fig. 3) in Section III. The comparison reveals that

$$MSE_{ALP} < MSE_{H0\,SLP} \qquad (10)$$

for the majority of modes summed. Therefore, the ALP performs better than the H0 SLP and can replace it.

When the number of modes summed to create the ALP is varied, there is always a minimum which fulfils the $MSE_{ALP} < MSE_{H0\,SLP}$ condition (see Fig. 3). This minimum gives the optimal number of low modes $N$ ($N_{optimal}$) that must be summed to create the optimal ALP for a given set of four chronologically consecutive weeks. This minimum varies with the set of weeks used, though for the NOVAREF data set $N_{optimal} = 8$ for the majority of the weeks investigated. A detailed analysis of the model training, validation and testing can be found in Supplementary Note 3.

The non-randomness of our results was also verified by calculating the ratio of $MSE_{ALP}$ and $MSE_{H0\,SLP}$:

$$MSE_{fraction} = \frac{MSE_{ALP}}{MSE_{H0\,SLP}}. \qquad (11)$$

For $N \approx N_{optimal}$, $MSE_{fraction} < 1$. A more detailed analysis can be found in Supplementary Note 3.
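The search for $N_{optimal}$ and the ratio in Eq. (11) can be sketched as a simple scan over the number of summed modes. The Python code below is a hedged illustration with hypothetical function names. It assumes the modes are ordered from highest to lowest frequency and that `actual` is the measured consumption series used for comparison; the training, validation and testing procedure of Supplementary Note 3 is not reproduced here.

```python
def mse(predicted, actual):
    """Eq. (8): mean squared difference of two equally long series."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def optimal_mode_count(modes, actual):
    """Scan N = 1..len(modes) and return (N_optimal, MSE at the minimum)."""
    best_n, best_mse = None, float("inf")
    for n in range(1, len(modes) + 1):
        # Candidate ALP: sum of the n lowest-frequency modes, as in Eq. (7).
        alp = [sum(values) for values in zip(*modes[-n:])]
        err = mse(alp, actual)
        if err < best_mse:
            best_n, best_mse = n, err
    return best_n, best_mse


def mse_fraction(mse_alp, mse_slp):
    """Eq. (11): values below 1 mean the ALP beats the H0 SLP."""
    return mse_alp / mse_slp


# Made-up example: the two slowest modes reproduce the data exactly,
# while the fastest mode only adds error.
toy_modes = [[5.0, -5.0, 5.0, -5.0], [1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 10.0]]
toy_actual = [11.0, 12.0, 13.0, 14.0]
n_opt, mse_min = optimal_mode_count(toy_modes, toy_actual)
ratio = mse_fraction(2.0, 8.0)
```

In practice the scan would be repeated for each set of training weeks, since the text notes that the position of the minimum varies with the weeks used.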

Data availability statement

The ADRES household consumption data were provided by the TU Wien and are available for research purposes upon request. The ENERA household consumption data that support the findings of this study were made available by the DLR Institute for Networked Energy Systems and the EWE, respectively. Restrictions apply to their availability as they were used under license for the current study, and so are not publicly available. The data are, however, available from the authors upon reasonable request and with permission of the DLR Institute for Networked Energy Systems and the EWE. The NOVAREF household consumption data that support the findings of this study are available for download under the CC-BY license together with code on how the EMD and fluctuation analyses were performed on OSF: https://osf.io/yu2dm/?view_only=685370675ca145eb88234031158fc32c .

Acknowledgments

The authors would like to thank the NOVAREF, ADRES and ENERA projects for the data that they provided, without which this paper would not have been possible. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 840825, the German Ministry for Education and Research (BMBF) under grant no. 03ET4027A for the DYNAMOS project and no. 03EF3055F for the CoNDyNet2 project.

Author contributions

M.A., E.P. and B.S. contributed equally. M.A., E.P., B.S., C.B., H.K. and M.T. conceived and designed the research. E.P. and M.A. detrended the data, E.P. developed the deterministic model, and B.S. and C.B. developed the stochastic model. All authors contributed to discussing and interpreting the results and writing the manuscript.

Competing interests

The authors declare no competing interests.

[1] Grandjean, A., Adnot, J. & Binet, G. A review and an analysis of the residential electric load curve models. Renewable and Sustainable Energy Reviews, 6539–6565 (2012).
[2] Bitterer, R. & Schieferdecker, B. Repräsentative vdew-lastprofile aktionsplan wettbewerb, m-32/99. Tech. Rep., VDEW, Stresemannallee 23, D-60596 Frankfurt/M (2001).
[3] E-Control. Sonstige marktregeln strom kapitel 6 zählwerte, datenformate und standardisierte lastprofile. Tech. Rep., Energie-Control Austria für die Regulierung der Elektrizitäts- und Erdgaswirtschaft, Rudolfspl. 13A, 1010 Wien, Austria (2019). URL .
[4] Schilling, S. Final energy consumption by sector and fuel. Tech. Rep., European Environment Agency (2018). URL .
[5] Monacchi, A. et al. Greend: An energy consumption dataset of households in italy and austria. In , 511–516 (2014).
[6] Wright, A. & Firth, S. The nature of domestic electricity-loads and effects of time averaging on statistics and on-site generation calculations. Applied Energy, 389–403 (2007).
[7] Marszal-Pomianowska, A., Heiselberg, P. & Larsen, O. K. Household electricity demand profiles - a high-resolution load model to facilitate modelling of energy flexible buildings. Energy, 487–501 (2016).
[8] Liu, J., Krogh, B. H. & Ydstie, B. E. Passivity-based robust control for power systems subject to wind power variability. In Proceedings of the 2011 American Control Conference, 4149–4154 (IEEE, 2011).
[9] Milan, P., Wächter, M. & Peinke, J. Turbulent character of wind energy. Physical Review Letters, 138701 (2013).
[10] Anvari, M. et al. Short term fluctuations of wind and solar power systems. New Journal of Physics, 063027 (2016).
[11] López, M. A., De La Torre, S., Martín, S. & Aguado, J. A. Demand-side management in smart grid operation considering electric vehicles load shifting and vehicle-to-grid support. International Journal of Electrical Power & Energy Systems, 689–698 (2015).
[12] Logenthiran, T., Srinivasan, D. & Shun, T. Z. Demand side management in smart grid using heuristic optimization. IEEE Transactions on Smart Grid, 1244–1252 (2012).
[13] Wu, Z. & Huang, N. E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Advances in Adaptive Data Analysis, 1–41 (2009). URL https://doi.org/10.1142/S1793536909000047 .
[14] Parti, M. & Parti, C. The total and appliance-specific conditional demand for electricity in the household sector. Bell Journal of Economics, 309–21 (1980).
[15] Aigner, D., Sorooshian, C. & Kerwin, P. Conditional demand analysis for estimating residential end-use load profiles. The Energy Journal, 81–97 (1984).
[16] Capasso, A., Grattieri, W., Lamedica, R. & Prudenzi, A. A bottom-up approach to residential load modeling. IEEE Transactions on Power Systems, 957–964 (1994).
[17] Paatero, J. V. & Lund, D. A model for generating household electricity load profiles. International Journal of Energy Research, 273–290 (2006). URL https://onlinelibrary.wiley.com/doi/abs/10.1002/er.1136 .
[18] Train, K., Herriges, J. & Windle, R. Statistically adjusted engineering models of end use load curves. Energy, 1103–11 (1985).
[19] Esther, B. P. & Kumar, K. S. A survey on residential demand side management architecture, approaches, optimization models and methods. Renewable and Sustainable Energy Reviews, 342–351 (2016). URL .
[20] Proedrou, E. A comprehensive review of residential electricity load profile modelling. ArXiv (2019).
[21] Schäfer, B., Beck, C., Aihara, K., Witthaut, D. & Timme, M. Non-gaussian power grid frequency fluctuations characterized by lévy-stable laws and superstatistics. Nature Energy, 119 (2018).
[22] Einfalt, A. et al. Energie der zukunft publizierbarer endbericht, adres-concept. Tech. Rep., TU Wien (2012).
[23] Lange, M. & Zobel, M. Novaref, erstellung neuer referenzlastprofile zur auslegung, dimensionierung und wirtschaftlichkeitsberechnung von hausenergieversorgungssystemen. Tech. Rep., NEXT ENERGY, Carl-von-Ossietzky-Strasse 15, 26129 Oldenburg (2016).
[24] Statistisches Bundesamt (Destatis). Wirtschaftsrechnungen einkommens- und verbrauchsstichprobe ausstattung privater haushalte mit ausgewählten gebrauchsgütern und versicherungen. Tech. Rep., Statistisches Bundesamt (Destatis) (2018). URL .
[25] Braun, M., Büdenbender, K., Magnor, D. & Jossen, A. Photovoltaic self-consumption in germany - using lithium-ion storage to increase self-consumed photovoltaic energy. In , 3121–3127 (2009).
[26] Muratori, M. Impact of uncoordinated plug-in electric vehicle charging on residential power demand. Nature Energy, 193–201 (2018).
[27] Chen, L., Markham, P., Chen, C.-f. & Liu, Y. Analysis of societal event impacts on the power system frequency using fnet measurements. In , 1–8 (IEEE, 2011).
[28] Kersting, W. H. Distribution System Modeling and Analysis (CRC Press, Boca Raton, FL, 2007).
[29] Kampers, G. et al. Disentangling stochastic signals superposed on short localized oscillations. Physics Letters A ...
[30] ... Physica A: Statistical Mechanics and its Applications, 267–275 (2003).
[31] Beck, C., Cohen, E. G. & Swinney, H. L. From time series to superstatistics. Physical Review E, 056133 (2005).
[32] Weber, J. et al. Wind power persistence is characterized by superstatistics. arXiv preprint arXiv:1810.06391 (2018).
[33] Williams, G., Schäfer, B. & Beck, C. Superstatistical approach to air pollution statistics. arXiv preprint arXiv:1909.10433 (2019).
[34] Laurendeau, N. M. Statistical thermodynamics: fundamentals and applications (Cambridge University Press, 2005).
[35] Wood, A. J., Wollenberg, B. F. & Sheblé, G. B. Power generation, operation, and control (John Wiley & Sons, 2013).
[36] Fang, X., Misra, S., Xue, G. & Yang, D. Smart grid - the new and improved power grid: A survey. IEEE Communications Surveys & Tutorials, 944–980 (2011).
[37] Machowski, J., Bialek, J. W. & Bumby, J. Power system dynamics: stability and control (John Wiley & Sons, 2011).
[38] Olivares, D. E. et al. Trends in microgrid control. IEEE Transactions on Smart Grid, 1905–1919 (2014).
[39] Lasseter, R. H. Microgrids. In , vol. 1, 305–308 (2002).
[40] Katiraei, F., Iravani, R., Hatziargyriou, N. & Dimeas, A. Microgrids management. IEEE Power and Energy Magazine, 54–65 (2008).
[41] Sortomme, E., Venkata, S. S. & Mitra, J. Microgrid protection using communication-assisted digital relays. IEEE Transactions on Power Delivery, 2789–2796 (2010).
[42] Farrokhabadi, M., Cañizares, C. A. & Bhattacharya, K. Frequency control in isolated/islanded microgrids through voltage regulation. IEEE Transactions on Smart Grid, 1185–1194 (2017).
[43] Delille, G., Francois, B. & Malarange, G. Dynamic frequency control support by energy storage to reduce the impact of wind and solar generation on isolated power system's inertia. IEEE Transactions on Sustainable Energy, 931–939 (2012).
[44] Katiraei, F., Iravani, M. R. & Lehn, P. W. Micro-grid autonomous operation during and subsequent to islanding process. IEEE Transactions on Power Delivery, 248–257 (2005).
[45] Hirsch, A., Parag, Y. & Guerrero, J. Microgrids: A review of technologies, key drivers, and outstanding issues. Renewable and Sustainable Energy Reviews, 402–411 (2018). URL .
[46] Mumtaz, F. & Bayram, I. S. Planning, operation, and protection of microgrids: An overview. Energy Procedia, 94–100 (2017). URL . 3rd International Conference on Energy and Environment Research, ICEER 2016, 7-11 September 2016, Barcelona, Spain.
[47] Schelter, B., Winterhalder, M. & Timmer, J. Local and cluster weighted modeling for time series prediction. In