Using attention to model long-term dependencies in occupancy behavior
Max Kleinebrahm, Jacopo Torriti, Russell McKenna, Armin Ardone, Wolf Fichtner
Max Kleinebrahm
Chair of Energy Economics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Jacopo Torriti
School of the Built Environment, University of Reading, Reading, United Kingdom
Russell McKenna
Chair in Energy Transition, University of Aberdeen, Aberdeen, United Kingdom
Armin Ardone
Chair of Energy Economics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Wolf Fichtner
Chair of Energy Economics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Abstract
Models simulating household energy demand based on different occupant and household types and their behavioral patterns have received increasing attention over the last years due to the need to better understand fundamental characteristics that shape the demand side. Most of the models described in the literature are based on Time Use Survey data and Markov chains. Due to the nature of the underlying data and the Markov property, day to day dependencies in occupant behavior cannot be adequately considered. An accurate mapping of day to day dependencies is of increasing importance for accurately reproducing mobility patterns and therefore for assessing the charging flexibility of electric vehicles. This study bridges the gap between energy related activity modelling and novel machine learning approaches with the objective of better incorporating findings from the field of social practice theory in the simulation of occupancy behavior. Weekly mobility data are merged with daily time use survey data by using attention based models. In a first step, an autoregressive model is presented, which generates synthetic weekly mobility schedules of individual occupants and thereby captures day to day dependencies in mobility behavior. In a second step, an imputation model is presented, which enriches the weekly mobility schedules with detailed information about energy relevant at home activities. The weekly activity profiles form the basis for modelling consistent electricity, heat and mobility demand profiles of households. Furthermore, the approach presented forms the basis for providing data on socio-demographically differentiated occupant behavior to the general public.
Tackling Climate Change with Machine Learning workshop at NeurIPS 2020.

Introduction
Occupant behavior has been identified as having a significant impact on household energy consumption [12]. Therefore, there has been an increasing research interest in the field of behavioral modelling over the last years with the aim of explaining dynamics in residential energy demand based on energy related activities [13, 14]. A large number of studies focus on the modelling of activity sequences of single households or individuals with the objective of describing occupant behavior on an aggregated level for socio-demographically differentiated groups [10, 17, 6, 1]. Time use data (TUD), which provide information on the temporal course of occupant activities over single days and are available for various countries in the form of population representative samples, are used as a data basis [4]. Based on occupant behavior, different approaches were developed that connect occupant activities with electrical household appliances and thus generate synthetic electricity demand profiles [18]. The aim of these studies is to gain a deeper understanding of household electricity demand in order to be able to evaluate, for example, device-specific efficiency measures, time-dependent electricity tariffs or load shift potentials.

In the course of the decarbonisation of domestic heat demand, it is expected that a large part of the heat will be generated by electricity (e.g. through heat pumps). In order to decarbonise the mobility sector, the aim is to increase the number of electric vehicles in, for example, Germany from 53,861 in 2018 to 6,000,000 by 2030 [7, 2]. Due to these developments, fundamental characteristics of energy demand in the household sector will change. Furthermore, the introduction of stationary and mobile electricity storage systems as well as stationary heat storage systems enables the storage of energy over periods of single days and therefore opens up flexibility potentials in the residential sector, which can support the integration of fluctuating renewable energies. To evaluate these flexibility potentials, data are required that contain information about the mobility behavior of individuals over several days and about their energy relevant at home activities. TUD only provide information on activity patterns of two or three individual days; therefore, longer-term dependencies in behavior that extend over several days are not captured in existing TUD based models [13]. The objective of this study is to develop an approach which captures long-term dependencies in behavior in order to be able to provide high quality mobility and activity data of individuals to the general public.

The paper is structured as follows. Section 2 provides an overview of the literature in the field of categorical (activity) sequence modelling. Section 3 explains the methodology introduced to capture long-term dependencies in occupancy behavior, before Section 4 presents and discusses the results. The conclusion can be found in Section 5.
Activity schedules are categorical time series in which the state space is defined on the basis of the possible activity states that a person can be in. The most commonly used approach to model activity sequences is to describe them as Markov chains. Richardson et al. [10] developed an approach which uses a first order Markov chain and distinguishes between the states ‘active at home’ and ‘not active at home’ for each person of a household. First order Markov models are well suited to describe processes that fulfil the Markov property, which refers to the memorylessness of a stochastic process. This means that the transition to a subsequent state depends only on the current state and is independent of previously observed states. Residential activity schedules represent more complex processes and therefore cannot easily be represented by a first order Markov model. To overcome this problem, a variety of more complex Markov models have been presented in recent years (semi-Markov [17], higher order [6], variable memory length [8]). However, two serious problems remain with higher-order Markov chains. The number of free parameters in the model increases exponentially with the order of the model, and the collection of all possible full high-order Markov chain models is limited and completely stratified. Due to these issues, Markov models are not used in this study. Social practice theory literature points out that in order to understand people’s daily/weekly activity schedules these should be treated as a whole [11, 14]. As a result, model approaches are required whose memory mechanisms enable complex relationships in human behavior to be recorded as comprehensively as possible.

Attention based neural networks have represented the state of the art in the rapidly developing field of natural language processing (NLP) over the past years [15, 3]. In these networks, language is interpreted as categorical sequences in which letters or words represent states. Attention based models have shown that they can learn complex long-term dependencies in language and are therefore promising for the application in this study.
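The memorylessness of a first order Markov chain and the exponential growth of free parameters with the model order can be illustrated with a minimal sketch. The transition matrix, the two-state coding and the choice of six states for the parameter count are hypothetical values chosen for illustration, and the chain is time-homogeneous for simplicity; none of this is taken from the cited models.

```python
import numpy as np

def n_free_parameters(n_states: int, order: int) -> int:
    """Free transition probabilities of a full Markov chain of a given order.

    Each of the n_states**order possible histories has its own distribution
    over the next state with n_states - 1 free entries (each row sums to 1).
    """
    return n_states ** order * (n_states - 1)

def sample_first_order(transition, start, n_steps, rng):
    """Sample a sequence in which each step depends only on the current state."""
    states = np.empty(n_steps, dtype=int)
    states[0] = start
    for t in range(1, n_steps):
        states[t] = rng.choice(len(transition), p=transition[states[t - 1]])
    return states

# Hypothetical two-state chain: 0 = not active at home, 1 = active at home.
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])
schedule = sample_first_order(P, start=0, n_steps=144, rng=np.random.default_rng(0))

# Parameter growth for an illustrative state space of six states.
growth = {order: n_free_parameters(6, order) for order in (1, 2, 3, 4)}
print(growth)  # {1: 30, 2: 180, 3: 1080, 4: 6480}
```

With six states, moving from order 1 to order 4 already raises the number of free transition probabilities from 30 to 6480, while the sampled schedule still carries no memory beyond the chosen order.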
Figure 1 describes the architecture of the two step approach presented in this study to combine weekly mobility schedules with daily activity schedules for the generation of synthetic weekly activity schedules, which combine the advantages of both datasets. For the application in this study, data from the German Mobility Panel (MOP) and German time use data (TUD) are used. The datasets are described in the appendix in Section 6.1.

Figure 1: Visualization of the two step approach for the generation of weekly activity schedules. [Diagram: the autoregressive model is trained on the mobility dataset and predicts synthetic weekly mobility schedules; the imputation model is trained on the activity dataset, receives these schedules as input and predicts synthetic weekly activity & mobility schedules.]

The autoregressive model tries to capture the stochasticity in mobility behavior and to generate, step by step, synthetic mobility schedules that have the same properties as the empirical data. Mobility state information is provided as input in 10-minute resolution, with six mobility states being distinguished from one another (see Figure 3). In contrast to recurrent models (e.g. LSTM models), attention based models have no built-in notion of temporal order, so the temporal relationships in the time series must be learned from scratch. To make this easier, timestamp information is provided to the model in the form of sinusoidal position encoding [15] as well as weekday embeddings (see Figure 4). Weekday embeddings are multi-dimensional representations of the weekdays in a continuous space, which are learned during the training process. In this way, daily and weekly rhythms in behavior and differences in behavior on work and weekend days can be learned more easily. Since the socio-demographic composition of the two data sets differs, and in order to be able to generate socio-demographically differentiated activity schedules, information about the age and occupation type is provided in the form of seven age classes and seven occupation classes. After all time-step-specific information is concatenated, it flows into the main part of the attention based autoregressive model, which is shown in Figure 5. By using the self-attention mechanism, all dependencies between the 1008 10-minute time steps are learned during the training process. The use of the look-ahead mask ensures that only information from previous time steps is used when predicting the mobility state of the next time step.

The imputation model enriches the “at home” state in the generated mobility schedules by distinguishing between ten different energy relevant at home activities (see Figure 3).
During the training process, TUD are used which provide activity information for a maximum of three days (each from 4 am to 4 am) for each individual (see Figure 4). In contrast to the autoregressive model, no look-ahead mask is used in the imputation model, since the model is supposed to use information about future mobility states while predicting current at home states. Therefore, only future at home states are masked during the training process, so that the attention matrix calculates all dependencies between the unknown at home state under consideration and all mobility states as well as all known (previous) energy relevant at home states. By using all three days per individual in one sample during the training process, the model is able to learn day-to-day dependencies between at home activities (e.g. people go to bed or eat at similar times on consecutive days). During the prediction process, the entire weekly mobility plan is given as input and the at home activities are enriched chronologically.

Both models are trained using the cross entropy loss function and the Adam optimizer. The datasets are split up into training data (9-fold cross validation (80 % training, 10 % validation)) and test data (10 %). The generated mobility and activity schedules are continuously evaluated on an individual and aggregated level by calculating the aggregated state probability, the distribution of state durations, the autocorrelation, the number of weekly activities for each state and the distribution of the Hamming distance between all working days over all generated samples.

Figure 2: Visualization of the distribution of the Hamming distance between all working days (a.), the autocorrelation of the state driving car (b.) and an example synthetic activity schedule (c.). [Panels: a.) Hamming distance working days; b.) Autocorrelation (driving car); c.) Schedule (age: 51-60, occ.: full time). Legend: Attention, Empirical, Markov.]
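The two masking schemes described in this section can be sketched with plain NumPy. This is a minimal illustration and not the authors' implementation: the function names, the toy dimensions, the random feature vectors and in particular the layout of the imputation mask (all mobility steps visible, only earlier activity steps visible, keys concatenated side by side) are assumptions made for the example.

```python
import numpy as np

def look_ahead_mask(n_steps):
    """True where attention is allowed: step i may attend to steps j <= i."""
    return np.tril(np.ones((n_steps, n_steps), dtype=bool))

def imputation_mask(n_steps):
    """Assumed layout: keys are [all mobility steps | activity steps].

    Every mobility step is visible (past and future); activity steps are
    visible only if they lie strictly before the step being predicted.
    """
    mobility = np.ones((n_steps, n_steps), dtype=bool)
    activity = np.tril(np.ones((n_steps, n_steps), dtype=bool), k=-1)
    return np.concatenate([mobility, activity], axis=1)

def masked_attention(q, k, v, allowed):
    """Scaled dot-product attention; disallowed positions get zero weight."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(allowed, scores, -1e9)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example with 6 time steps and 4 features (hypothetical sizes).
rng = np.random.default_rng(0)
T, d = 6, 4
mobility_feats = rng.normal(size=(T, d))
activity_feats = rng.normal(size=(T, d))

# Autoregressive model: each step sees only itself and earlier steps.
_, w_ar = masked_attention(mobility_feats, mobility_feats, mobility_feats,
                           look_ahead_mask(T))

# Imputation model: queries built from the known mobility features.
keys = np.concatenate([mobility_feats, activity_feats], axis=0)
_, w_imp = masked_attention(mobility_feats, keys, keys, imputation_mask(T))
```

In `w_ar` all weights above the diagonal are zero, whereas in `w_imp` the first `T` columns (mobility states) are fully populated and only the activity columns are restricted to earlier steps.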
The model configurations used for the results in Figure 2 can be found in Table 1 (model no. 3) and Table 2 (model no. 4). As a reference model for the autoregressive mobility schedule generation, a 1st order Markov model is used. The characteristics of the 1st order Markov model are representative of the models presented in Section 2, since more complex Markov chains achieve only marginal changes in the metrics while the basic problem (no long-term memory) remains. From the visualization of the distribution of the Hamming distances between the five working days of the week, it can be seen that the attention based models reproduce the similarity between the days well, in contrast to the Markov model, in which working days of individuals are not as similar as in the empirical data. The autocorrelation of the driving car state (Figure 2 b.) shows a peak after 144 10-minute intervals (24 hours). This peak can be explained by daily rhythms in the commuting behavior of car drivers. From the distribution of the Hamming distance and the 25/75% quantiles of the autocorrelation, it can be seen that both the diversity between individuals and the average mobility behavior are captured by the model. The described day to day dependencies can also be recognized in the exemplary activity plan in Figure 2 c. The person under consideration leaves home during working days at around the same time and sleeps a bit longer on the weekend. In Tables 1 and 2, error values are given for the various metrics in comparison to the empirical data. In contrast to Markov models, which, based on their structure, meet aggregate state probabilities and the average number of weekly activities well, the attention based models reflect intrapersonal dependencies significantly better, which can be seen from the lower errors in the duration of the states, the autocorrelation and the Hamming distance.
The errors of the attention based model in the aggregated state probabilities are, as can be seen from Table 1 and Table 2, slightly higher than those of the Markov model, in which the error tends to zero with increasing sample size. However, the average error in the aggregated state probability over all socio-demographic groups between the generated data and the empirical mobility data is less than half of the error that arises when comparing the overlapping states of the two input data sets. It can be concluded that the approach presented makes it possible to combine the advantages of weekly mobility data with a large sample size with the advantages of high-resolution activity data of individual days, thus creating a new data basis which can be used for further analyses of human occupancy and mobility behavior.
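Two of the metrics discussed above, the Hamming distance between working days and the autocorrelation of a single state, can be computed as in the following sketch. The state coding (0 = at home, 1 = driving car), the assumption that a week starts with the five working days, the toy commuting schedule and the specific autocorrelation estimator are illustrative choices, not details taken from the study.

```python
import numpy as np
from itertools import combinations

STEPS_PER_DAY = 144  # 24 hours at 10-minute resolution

def working_day_hamming(week, n_working_days=5):
    """Pairwise Hamming distances between the first n_working_days of a week."""
    days = np.asarray(week).reshape(7, STEPS_PER_DAY)[:n_working_days]
    return [int(np.sum(days[a] != days[b]))
            for a, b in combinations(range(n_working_days), 2)]

def state_autocorrelation(week, state, max_lag):
    """Autocorrelation of the binary indicator 'person is in `state`'."""
    x = (np.asarray(week) == state).astype(float)
    x = x - x.mean()
    var = np.dot(x, x)
    if var == 0.0:
        return np.zeros(max_lag + 1)
    return np.array([np.dot(x[:len(x) - lag], x[lag:]) / var
                     for lag in range(max_lag + 1)])

def build_week(morning_starts):
    """Hypothetical commuter: 0 = at home, 1 = driving car."""
    week = np.zeros((7, STEPS_PER_DAY), dtype=int)
    for day, start in enumerate(morning_starts):
        week[day, start:start + 3] = 1   # drive to work (30 minutes)
        week[day, 102:105] = 1           # drive home (30 minutes)
    return week.reshape(-1)

regular = build_week([45] * 5)               # same departure on all working days
shifted = build_week([45, 45, 45, 45, 48])   # leaves 30 minutes later on day 5

dist_regular = working_day_hamming(regular)
dist_shifted = working_day_hamming(shifted)
acf = state_autocorrelation(regular, state=1, max_lag=STEPS_PER_DAY)
```

For the regular commuter all ten pairwise distances are zero and the autocorrelation of the driving state is high at a lag of 144 steps, which is the kind of 24-hour peak described for Figure 2 b. A memoryless model would not be expected to reproduce such a peak for an individual schedule.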
Conclusion
Over the past years, more and more models have been published that aim to capture relationships in human residential behavior. Most of these models are different Markov variants or regression models that have a strong assumption bias and are therefore unable to capture complex long-term dependencies and the diversity in occupant behavior. This work shows that attention based models are able to capture complex long-term dependencies in occupancy behavior and at the same time adequately depict the diversity in behavior across the entire population and different socio-demographic groups. By combining an autoregressive generative model with an imputation model, the advantages of two data sets are combined and new data are generated which are beneficial for multiple use cases (e.g. generation of consistent household energy demand profiles). The two step approach generates synthetic activity schedules that have statistical properties similar to those of the empirically collected schedules and do not contain direct information about single individuals. Therefore, the presented approach forms the basis for making data on occupant behavior freely available, so that further investigations based on the synthetic data can be carried out without a large data application effort. In future work it is planned to take interpersonal dependencies into account in order to be able to generate entire household behavior profiles.
Acknowledgments and Disclosure of Funding
This work was supported by the Helmholtz Association under the Joint Initiative “Energy Systems Integration” (funding reference: ZT-0002) and was done during a research stay funded by the Centre for Research into Energy Demand Solutions (CREDS) at the University of Reading (UK). This work was supported by UKRI [grant numbers EP/R000735/1, EP/R035288/1 and EP/P000630/1].
References
[1] D. Aerts, J. Minnen, I. Glorieux, I. Wouters, and F. Descamps. A method for the identification and modelling of realistic domestic occupancy sequences for building energy demand simulations and peer comparison. Building and Environment, 75:67–78, 2014.
[2] BMWi, BMVBS, BMU, and BMBF. Regierungsprogramm elektromobilität: technical report, 2011.
[3] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners, 28.05.2020.
[4] Eurostat. Harmonized european time of use survey. 2000.
[5] Eurostat. Final energy consumption in the residential sector by use, eu-28: Statistics explained, 2019.
[6] Graeme Flett and Nick Kelly. An occupant-differentiated, higher-order markov chain method for prediction of domestic occupancy. Energy and Buildings, 125:219–230, 2016.
[7] Kraftfahrt-Bundesamt. Bestand an pkw in den jahren 2009 bis 2018 nach ausgewählten kraftstoffarten, 2020.
[8] José Luis Ramírez-Mendiola, Philipp Grünewald, and Nick Eyre. Residential activity pattern modelling through stochastic chains of variable memory length. Applied Energy, 237:417–430, 2019.
[9] RDC of the Federal Statistical Office and Statistical Offices of the Länder. Zeitbudgeterhebung 2001 and 2002, own calculations, 2002.
[10] Ian Richardson, Murray Thomson, and David Infield. A high-resolution domestic building occupancy model for energy demand simulations. Energy and Buildings, 40(8):1560–1566, 2008.
[11] Elizabeth Shove, Mika Pantzar, and Matt Watson. The Dynamics of Social Practice: Everyday Life and How it Changes. SAGE Publications Ltd, 2012.
[12] Koen Steemers and Geun Young Yun. Household energy consumption: A study of the role of occupants. Building Research & Information, 37(5-6):625–637, 2009.
[13] Jacopo Torriti. A review of time use models of residential electricity demand. Renewable and Sustainable Energy Reviews, 37:265–272, 2014.
[14] Jacopo Torriti. Understanding the timing of energy demand through time use data: Time of the day dependence of social practices. Energy Research & Social Science, 25:37–47, 2017.
[15] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. 2017.
[16] Christine Weiß, Bastian Chlond, Tim Hilgert, and Peter Vortisch. Deutsches mobilitätspanel (mop) - wissenschaftliche begleitung und auswertungen, bericht 2014/2015: Alltagsmobilität und fahrleistung.
[17] Urs Wilke. Probabilistic bottom-up modelling of occupancy and activities to predict electricity demand in residential buildings: Phd thesis. 2013.
[18] Y. Yamaguchi, S. Yilmaz, N. Prakash, S. K. Firth, and Y. Shimoda. A cross analysis of existing methods for modelling household appliance use. Journal of Building Performance Simulation, 12(2):160–179, 2018.
[19] D. Zumkeller and B. Chlond. Dynamics of change: Fifteen-year german mobility panel, 2009.
Broader impact on climate change
In 2017, the residential sector accounted for 27% of European final energy consumption (mobility excluded) and therefore plays a key role in achieving European climate targets [5]. Approximately 64% of the residential final energy demand can be attributed to space heating demand, 15% to domestic hot water demand and 20% to the demand for lighting, cooking and appliances [5]. In the course of the decarbonisation of domestic heat demand, it is expected that a large part of the heat will be generated by electricity (e.g. through heat pumps). In order to decarbonise the mobility sector, the aim is to increase the number of electric vehicles in Germany from 53,861 in 2018 to 6,000,000 by 2030 [7, 2]. Due to these developments and an expected further increase in photovoltaic battery systems in the residential sector, fundamental characteristics of energy demand in the household sector will change. Therefore, residential neighbourhoods are gaining increased attention from policy makers. In order to establish targeted policy interventions for different households, which allow an optimal integration of low carbon technologies and open up flexibility options on the demand side, fundamental factors influencing the structure of energy demand must be understood. The diversity and differences in electricity and heat consumption between different households/buildings are a major obstacle to understanding future energy consumption.
Broader impact regarding ethical aspects
When providing behavioral data, ethical aspects such as data privacy must be taken into account at all times. The data sets used in this work do not allow any conclusions to be drawn back to individuals. However, before the synthetic data are made available online, differential privacy of the models must be ensured.

The data set provided by this work can support energy system planners, for example in predicting future load peaks due to high electrical demand at certain times. Furthermore, politicians can be supported in the introduction of variable electricity tariffs so that they do not favor or disadvantage different socio-demographic groups. Based on the behavior of different socio-demographic groups, the data could be used in the marketing sector to target specific groups that increasingly watch television at certain times. When using synthetic data, it must also be taken into account that models always introduce errors and that the synthetic data therefore differ from the empirically collected data, which are also subject to certain biases. This study particularly points out that the two basic data sets differ relatively strongly in terms of the composition of the socio-demographic groups. This is taken into account when generating the synthetic data by using the socio-demographic factors age and occupation status while merging the datasets.
Appendix
Information about the weekly mobility behavior is taken from the German Mobility Panel (MOP), which has collected information about each travel activity of about 1,500 to 3,100 individuals every year since 1994 [16, 19]. In this study, 26,610 weekly mobility schedules from the years 2001 to 2017 together with their associated socio-demographic information (age, occupation) are used as input. Information about energy relevant at home activities is taken from the Harmonized European Time Use Survey [9, 4]. Activity diaries and socio-demographic information of 11,921 individuals out of 5,443 households are used to train the imputation model. Most of the participants provide diaries on two weekdays and one weekend-day in 10-minute resolution. Figure 3 shows the temporal course of the aggregated state probability over the entire dataset population and an exemplary mobility and activity schedule.
Figure 3: Visualization of aggregated state probabilities based on the MOP [16] and the German TUS [9], and exemplary artificial individual diary entries. [Panels: aggregated state probability (MOP); aggregated state probability (TUD); individual mobility schedule (exemplary visualization); individual activity schedule (exemplary visualization).]
Figure 4: Visualization of the training input of the autoregressive model and imputation model and visualization of their first layers. [Panels: a.) Autoregressive model; b.) Imputation model. Inputs shown: time of day/week (τ_t), day of week (d_t), mobility state (ms_p,t), activity state (as_p,t), job and age (sd); each passes through an embedding layer before concatenation.]
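The input layers in Figure 4 can be approximated by the following sketch, which builds a sinusoidal position encoding in the form given in [15] and concatenates it with embedding lookups. All dimensions, the random embedding tables (which would be learned during training), the class indices and the assumption that a week starts at step 0 of the first day are hypothetical; the exact encoding of the time of day/week used in the study may differ.

```python
import numpy as np

def sinusoidal_encoding(n_positions, d_model):
    """Position encoding from [15]: sine on even, cosine on odd dimensions.

    d_model is assumed to be even.
    """
    positions = np.arange(n_positions)[:, None]
    dims = np.arange(0, d_model, 2)[None, :]
    angles = positions / np.power(10000.0, dims / d_model)
    encoding = np.zeros((n_positions, d_model))
    encoding[:, 0::2] = np.sin(angles)
    encoding[:, 1::2] = np.cos(angles)
    return encoding

N_STEPS, STEPS_PER_DAY = 1008, 144   # one week at 10-minute resolution
EMB_DIM, POS_DIM = 8, 16             # hypothetical sizes

rng = np.random.default_rng(0)
# Random stand-ins for embedding tables that would be learned in training.
state_table = rng.normal(size=(6, EMB_DIM))     # six mobility states
weekday_table = rng.normal(size=(7, EMB_DIM))   # seven days of the week
age_table = rng.normal(size=(7, EMB_DIM))       # seven age classes
occ_table = rng.normal(size=(7, EMB_DIM))       # seven occupation classes

states = rng.integers(0, 6, size=N_STEPS)       # toy mobility schedule
weekdays = np.arange(N_STEPS) // STEPS_PER_DAY  # day index 0..6 per step
age_class, occ_class = 4, 1                     # hypothetical person

position = sinusoidal_encoding(N_STEPS, POS_DIM)
features = np.concatenate([
    state_table[states],
    position,
    weekday_table[weekdays],
    np.tile(age_table[age_class], (N_STEPS, 1)),
    np.tile(occ_table[occ_class], (N_STEPS, 1)),
], axis=1)
print(features.shape)  # (1008, 48)
```

Each of the 1008 rows is the concatenated, time-step-specific input vector that would be passed on to the attention layers.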
Figure 5: Attention based autoregressive model architecture (residual connections are not visualized). [Diagram: input layers feed N transformer layers (Nx), each with dense layers producing Q, K and V for masked self-attention; dense output layers predict the mobility states ms_p,1 … ms_p,1008.]
Figure 6: Attention based imputation model architecture (residual connections are not visualized). [Diagram: input layers, one transformer layer (1x) and N further transformer layers (Nx) with masked attention (Q, K, V; the mask hides the unknown states as_p,t); dense output layers predict the activity states as_p,1 … as_p,432.]

Table 1: Hyperparameter configurations and metrics for the attention based autoregressive model. Metrics are calculated using N = 2,000 samples.

| No. | layers / d_model / learning rate / batch size | sp rmse [%] | sd rmse [%] | ac rmse [-] | na mae [-] | hd mae [-] | loss [-] | acc. [%] | epoch [-] |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1/64/0.001/64 | 0.83 | 0.31 | 1.32 | 2.96 | 244 | 0.14 | 95.95 | 9 |
| 2 | 4/64/0.001/64 | 0.91 | 0.16 | 0.70 | 2.53 | 33 | 0.128 | 96.34 | 15 |
| 3 | 8/64/0.001/64 | 0.86 | 0.17 | 0.54 | 3.6 | 5 | 0.127 | 96.36 | 7 |
| 4 | 4/128/0.001/128 | 0.89 | 0.24 | 0.59 | 3.60 | 9 | 0.128 | 96.33 | 6 |
| | 1st order Markov | 0.53 | 0.53 | 3.79 | 0.73 | 908 | | | |

Table 2: Hyperparameter configurations and metrics for the attention based imputation model. Metrics are calculated using N = 2,000 samples.

| No. | layers / d_model / learning rate / batch size | sp rmse [%] | sd rmse [%] | ac rmse [-] | na mae [-] | loss [-] | acc. [%] | epoch [-] |
|---|---|---|---|---|---|---|---|---|
| 1 | 1/64/0.001/256 | 0.58 | 0.39 | 0.50 | 0.50 | 0.469 | 86.97 | 158 |
| 2 | 4/64/0.001/256 | 0.58 | 0.39 | 0.44 | 0.62 | 0.436 | 87.32 | 22 |
| 3 | 4/64/0.001/64 | 0.57 | 0.38 | 0.36 | 0.90 | 0.436 | 87.35 | 8 |
| 4 | 4/64/0.0005/128 | 0.49 | 0.39 | 0.39 | 0.66 | 0.431 | 87.41 | 37 |
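An error in the aggregated state probability, as reported in percent in the tables, could be computed along the following lines. How the study aggregates the error over time steps and states is not specified here, so the averaging over all (time step, state) pairs, the function names and the tiny example schedules are assumptions for illustration only.

```python
import numpy as np

def aggregated_state_probability(schedules, n_states):
    """Share of individuals in each state at each time step.

    schedules: array of shape (n_individuals, n_steps) with integer states.
    Returns an array of shape (n_steps, n_states) whose rows sum to 1.
    """
    schedules = np.asarray(schedules)
    return np.stack([(schedules == s).mean(axis=0) for s in range(n_states)],
                    axis=1)

def rmse_percent(p_a, p_b):
    """Root mean squared difference over all (time step, state) pairs in %."""
    return float(100.0 * np.sqrt(np.mean((p_a - p_b) ** 2)))

# Tiny hypothetical example: two individuals, two time steps, two states.
empirical = [[0, 1],
             [0, 0]]
synthetic = [[1, 1],
             [0, 0]]

p_emp = aggregated_state_probability(empirical, n_states=2)
p_syn = aggregated_state_probability(synthetic, n_states=2)
error = rmse_percent(p_emp, p_syn)
```

In this toy case the two samples differ only in the first time step, where the state shares are (1, 0) versus (0.5, 0.5), which gives an error of about 35.4 %.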