Indoor environment data time-series reconstruction using autoencoder neural networks

Abstract

As the number of installed meters in buildings increases, a growing number of data time-series becomes available for developing data-driven models that support and optimize building operation. However, building data sets are often characterized by errors and missing values, which recent research considers among the main factors limiting the performance of the proposed models. Motivated by the need to address the problem of missing data in building operation, this work presents a data-driven approach to fill these gaps. In this study, three different autoencoder neural networks are trained to reconstruct missing short-term indoor environment data time-series in a data set collected in an office building in Aachen, Germany. The data come from a four-year monitoring campaign, from 2014 to 2017, covering 84 different rooms. The models are applicable to different time-series obtained from room automation, such as indoor air temperature, relative humidity and CO2 data streams. The results show that the proposed methods outperform classic numerical approaches, reconstructing the corresponding variables with average RMSEs of 0.42 °C, 1.30 % and 78.41 ppm, respectively.



Antonio Liguori* (a), Romana Markovic (b), Thi Thu Ha Dam (a), Jérôme Frisch (a), Christoph van Treeck (a), Francesco Causone (c)

(a) E3D - Institute of Energy Efficiency and Sustainable Building, RWTH Aachen University, Mathieustr. 30, 52074 Aachen, Germany
(b) Building Science Group, Karlsruhe Institute of Technology, Englerstr. 7, 76131 Karlsruhe, Germany
(c) Department of Energy, Politecnico di Milano, Via Lambruschini 4, 20156 Milano, Italy

Keywords: indoor environment data time-series, machine learning, data analytics, autoencoder, neural networks.

* Corresponding author. Tel.: +

49 241 80 25541; E-Mail: [email protected]

arXiv: [stat.ML], Sep

1. Introduction

In the European Union, buildings account for more than 40 % of the total final energy consumption and approximately 36 % of CO2 emissions [1]. As a consequence, reliable estimation of building consumption data could foster energy efficiency strategies, such as the analysis of retrofit options [2] or the development of fault detection and diagnosis (FDD) schemes [3]. In the related research, two approaches are generally followed to achieve this goal [4]: forward modeling and data-driven modeling. While the former is based on solid engineering principles, the latter relies on data collected during normal or predetermined system operation and can usually capture the as-built system's performance more accurately with a limited number of known parameters [4]. Additionally, data-driven approaches can be successfully applied to represent energy-related human actions in buildings (e.g. window openings), which result from a number of stochastic driving forces [5, 6].

By definition, data-driven modeling explicitly requires the availability of useful data [7]. Therefore, missing values are the major limitation of this approach [8–10]. As stated by multiple studies [7, 8], data gaps are a common problem in building automation systems (BAS), and they may be caused by power outages, sensor defects, communication problems or network issues. As a result, these anomalies could significantly reduce the size of the available data set and hinder further energy analysis [11, 12]. So far, existing studies have handled missing data either by using simplified methods [8] or by excluding them from further analytics due to the lack of ground truth values [7]. Both approaches have usually led to limited imputation accuracy and lower resulting model performance [7, 8].

The aim of this paper is to propose a method for reconstructing missing sequences of indoor environment data obtained from room control sensors. For that purpose, a data set collected in an office building in Aachen, Germany, was analyzed and preprocessed. Models for handling missing data points were implemented, trained and evaluated on indoor air temperature (T), relative humidity (RH) and CO2 concentration data. In particular, three promising autoencoder architectures were investigated: feed-forward denoising autoencoder, convolutional denoising autoencoder and long short-term memory (LSTM) denoising autoencoder. Finally, the performance of the proposed methods was compared to analytical methods based on polynomial interpolations.

Even though the related research has already identified autoencoders as a promising technique to address missing values and anomalies in building monitoring data sets [13–16], some significant research questions remain unaddressed. In particular, the existing studies have often focused on reconstructing a single type of signal, or they have been limited by the small amount of available training data and computational power. Motivated by these open questions, this work presents an autoencoder-based method for reconstructing and forecasting indoor environment data time-series. The models were developed using monitoring data from more than 70 offices over four years, which resulted in almost 100,000 monitoring days. The optimal problem hypothesis was identified based on the results obtained from 7,000 core hours of GPU and CPU computations.

Additionally, the scientific contribution of the presented work consists of the following:

• To analyze the variability of univariate artificial neural networks (ANNs) performance when applied to different kinds of indoor environment data time-series.
• To present a generalized gap-filling method to address the problem of missing values in building data sets.
• To propose a solution to address the issue of ANNs' saturation for energy systems applications.

The rest of this paper is organized as follows: Section 2 presents the motivation that led to the development of a missing data imputation model based on autoencoder neural networks. Section 3 provides further information on the data set used and on the models' theory and implementation. Section 4 presents results on developing a suitable tool for indoor environment data time-series reconstruction. Finally, the results and novel findings are discussed and summarized in Sections 5 and 6.

Table 1: List of abbreviations.

ANNs    artificial neural networks
BAS     building automation systems
BIT     bi-directional imputation and transfer learning
CR      corruption rate
DBN     deep belief network
ELM     extreme learning machine
FDD     fault detection and diagnosis
FFT     fast Fourier transform
GANs    generative adversarial networks
IAQ     indoor air quality
IQR     interquartile range
LSTM    long short-term memory
MAE     mean absolute error
MELs    miscellaneous electric loads
MSE     mean squared error
NRMSE   normalized root mean squared error
OB      occupant behavior
RBMs    restricted Boltzmann machines
RF      random forest
RH      indoor relative humidity
RMSE    root mean squared error
SAT     saturation performance metric
SGD     stochastic gradient descent
T       indoor air temperature

2. Background

The importance of sufficiently large data sets for time-series modeling was empirically explored in the scope of the Texas LoanSTAR program [11], whose objective was to measure savings from energy conservation retrofits. By increasing the length of building data sets from one to five months, the average cooling prediction error decreased from 7.3 % to 3.0 % and the annual heating prediction error decreased from 27.5 % to 12.9 %. Zapata et al. [12] also found that a fast Fourier transform (FFT), applied to assess the frequency content of wind speed time-series data, gave unacceptable results when the information loss was at least 2.5 % of the data set.

As pointed out by Chong et al. [8], in 2016 there was still little relevant research on handling missing values in building data sets. As a consequence, existing studies often relied on simplified methods such as complete case analysis, mean imputation and zero imputation [8]. However, these methods often resulted in poor reconstruction of missing data, which could limit the performance of data-driven models applied afterwards. In a recent study, Candanedo et al. [9] reconstructed the average indoor air temperatures of a passive house, achieving accurate results with a random forest (RF) model. However, the proposed model used multiple time-series as input features (e.g. external weather data, total electrical energy use) that are not always known in a data reconstruction problem. Furthermore, when the same outputs were used to reconstruct the single-room air temperature, the performance of the proposed method dropped significantly. The use of more advanced deep learning techniques for the reconstruction of building energy data was explored by Ma et al. [10]. By applying an LSTM with bi-directional imputation and transfer learning (BIT), the authors achieved a reconstruction error approximately 30 % lower than linear interpolation models for continuous missing electrical power data. Finally, Benitez et al. [13] coupled multivariate variational autoencoders with convolutional layers to estimate missing indoor air quality (IAQ) subway data. The results showed that a correct reconstruction of IAQ data gaps could have a direct impact on the performance of the subway ventilation system. However, a strong limitation of this study was the size of the data set used, namely 1 month of measurements with hourly resolution.

2.2. Deep learning methods for buildings' control

Given the recent findings in the related literature [10, 13–16], the modeling methods adopted in this study belong to the category of deep learning.

Deep learning models are neural networks with learned feature representation over multiple hidden layers [17]. In the related building research, these methods have been extensively used for energy consumption and occupant behavior (OB) modeling applications [5, 18–37]. Qian et al. [23] explored the potential of ANNs for HVAC load forecasting when applied to a small amount of data. The results showed that the fitting degree of the proposed models was over 85 %. They also pointed out the importance of a sufficiently large training data set: reducing the training set to one month and to one week of monitoring data decreased the accuracy by 6 % and 20 %, respectively. Zhang et al. [25] trained a framework based on a deep belief network (DBN) and an extreme learning machine (ELM) to predict half-hourly building energy consumption data. Here, the DBN consisted of a stack of restricted Boltzmann machines (RBMs), where the RBMs had fully connected visible and hidden layers [25]. The corresponding mean absolute error (MAE) was 10 % higher than the results obtained by support vector regression. Yan et al. [30] addressed the issue of imbalanced training data sets for automatic FDD of chillers. They used generative adversarial networks (GANs) to generate faulty training samples. The results showed that, without the implemented model, classification accuracy could hardly reach 90 %. Finally, Markovic et al. [37] developed an LSTM neural network for day-ahead prediction of miscellaneous electric loads (MELs). The proposed implementation outperformed benchmark approaches based on Weibull distribution and Gaussian mixture methods when MELs and occupancy information data were used as input parameters to the model.

Autoencoders are ANNs which learn to reconstruct original inputs from a noisy version, making missing data reconstruction one of their natural applications [38]. However, few studies have applied these models in the field of building energy systems [39], which further emphasizes the relevance of this work. In this regard, Fan et al. [14] explored the potential of different types of autoencoder neural networks for anomaly detection in building operational data. The results showed that a 1D convolutional architecture could effectively capture the intrinsic characteristics of building energy data while preserving the temporal information. Liu et al. [15] applied different machine learning-based anomaly detection methods to vertical plant wall systems. The results confirmed that the autoencoder configuration outperformed other models for both contextual and point anomaly detection of temperature and CO2 data. Finally, Araya et al. [16] used autoencoders to capture HVAC consumption patterns and used the gained knowledge to identify abnormal consumption behaviours in the same system.

3. Methodology

The aim of this paper is to develop an approach for filling indoor environment data gaps. For that purpose, three different autoencoder neural network architectures were implemented in order to identify the optimal model hypothesis. Room temperature, relative humidity and CO2 concentration data were artificially corrupted by setting sub-daily sequences of random length to zero. The reconstructed gaps ranged from a few hours (10 % of the daily values) to around 22 hours (90 % of the daily values). A summary of the adopted modeling approach is given in Figure 1.

[Figure 1 flowchart: data set; outlier detection; missing values detection; split into training set (x 0.3), validation set (x 0.1) and evaluation set (x 0.6); hyperparameter tuning; model training; model validation; model evaluation; repeated for every CR.]

Figure 1: Modeling flowchart. CR is the applied masking noise.

The general structure of an autoencoder is presented in Figure 2. An autoencoder neural network is a representation learning approach which turns incoming data into a different representation through an encoder function and reconstructs the original input through a decoder function [17].

[Figure 2 diagram: input x, code h, reconstruction r, encoder f, decoder g.]

Figure 2: Working principle of a general autoencoder. Figure reproduced based on Goodfellow [17].

Input x is mapped to an output reconstruction r through an internal representation h, called the code. The autoencoder has two components: the encoder f (mapping x to h) and the decoder g (mapping h to r) [17]. The encoder is defined as [17]:

h = c(W x + b), (1)

while the decoder is formulated as:

r = c(W' h + b'), (2)

where c is a non-linear function, called the activation function, W and W' are the weights, and b and b' are the biases. By applying these mappings repeatedly while the weights and biases are adjusted (training), the model can acquire useful knowledge about the system's properties [17].

The structure of a denoising autoencoder is presented in Figure 3, while its extension to the stacked denoising autoencoder is presented in Figure 4. In contrast to a general autoencoder, a denoising autoencoder receives a corrupted input x* and is trained to reconstruct the original uncorrupted data x [17].

[Figure 3 diagram: original input x, corrupted input x*, code h, reconstruction r, encoder f, decoder g, loss L(x, r).]

Figure 3: Working principle of a denoising autoencoder. Figure reproduced based on Vincent et al. [40].
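The data flow of Equations (1) and (2) in the denoising setting can be illustrated with a minimal, untrained sketch. It is not the authors' implementation: the code size of 8, the tanh activation, the small uniform weight initialization and the helper names (`dense`, `encode`, `decode`, `mse`) are assumptions made only for this example, and no training step is shown.

```python
import math
import random

def dense(weights, bias, inputs, activation):
    """One layer: activation(W @ inputs + bias), written with plain lists."""
    return [
        activation(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]

def random_matrix(n_out, n_in, rng):
    # Small uniform weights, chosen only for illustration.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)
n_features, n_code = 48, 8            # 48 half-hourly values; code size is assumed
W, b = random_matrix(n_code, n_features, rng), [0.0] * n_code
W_prime, b_prime = random_matrix(n_features, n_code, rng), [0.0] * n_features

def encode(x):
    return dense(W, b, x, math.tanh)              # Eq. (1): h = c(W x + b)

def decode(h):
    return dense(W_prime, b_prime, h, math.tanh)  # Eq. (2): r = c(W' h + b')

def mse(x, r):
    return sum((a - c) ** 2 for a, c in zip(x, r)) / len(x)

x = [math.sin(2 * math.pi * t / 48) for t in range(48)]  # synthetic daily profile
x_star = list(x)
x_star[10:20] = [0.0] * 10            # corrupted input x*: a zeroed interval
r = decode(encode(x_star))            # reconstruction from the corrupted input
loss = mse(x, r)                      # L(x, r) is measured against the clean x
```

The point of the sketch is the last three lines: the network only sees `x_star`, while the loss compares the reconstruction with the uncorrupted `x`.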

The training process consists of minimizing a loss function L(x, r), which quantifies the difference between the original input and the output at each step. In this study, corruption of input data is performed by setting to zero an interval of sequential values of random length. This approach is used to simulate how missing data are distributed.

[Figure 4 diagram: input x, encoder f, corrupted input x*', code h', reconstruction r', encoder f', decoder g', loss L(x', r').]

Figure 4: Working principle of a stacked denoising autoencoder. Figure reproduced based on Vincent et al. [40].
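The masking noise described above can be sketched as follows. This is one plausible reading rather than the authors' code: here the corruption rate fixes the length of the gap and only its position is drawn at random, and the function name `corrupt_sequence` is hypothetical.

```python
import random

def corrupt_sequence(day, corruption_rate, rng=random):
    """Set one contiguous interval of `day` to zero.

    The interval covers `corruption_rate` of the sequence (assumption) and
    starts at a random position. Returns the corrupted copy and the
    (start, stop) indices of the gap.
    """
    n = len(day)
    gap = max(1, round(corruption_rate * n))
    start = rng.randrange(0, n - gap + 1)     # random position of the gap
    corrupted = list(day)
    corrupted[start:start + gap] = [0.0] * gap
    return corrupted, (start, start + gap)

rng = random.Random(0)
day = [21.0 + 0.05 * i for i in range(48)]    # synthetic half-hourly temperatures
corrupted, (start, stop) = corrupt_sequence(day, 0.5, rng)
```

Returning the gap indices is a convenience for later evaluation, since the reconstruction error in this study is computed on the corrupted part of the sequence only.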

After the first level of denoising autoencoder has been trained (Figure 3), a second level of denoising autoencoder is trained using the previously optimized encoding function f. Corruption takes place on the output of the previously optimized layer, r [40].

The data used were collected in the E.ON ERC main building, located in Aachen, Germany. The building under investigation has a usable area of 7,500 m² over four storeys and includes offices, seminar rooms, laboratories and common areas [41–44].

Based on the logging frequency and monitoring duration, around 181 million sets of observations were expected to be collected for each variable from 2014 to 2017. The monitoring data were grouped in two subsets, namely data set "A" (2014-2015) and data set "B" (2016-2017). Data set "A" contained measurements for 73 rooms, stored in "HDF5" data containers [45] on a monthly basis. Data set "B" contained measurements for 84 offices, stored in "pickle" files [46], where each file contained data for a single office over the whole two-year period.

3.3. Data preprocessing

Before further analysis and modeling, data were cleaned and preprocessed. This step involved the detection of frequently encountered anomalies, such as missing values and outliers [47]. Missing values reduce the size of the available data set, hence compromising the reliability of the model's outcome [47]. Outliers are noisy data points with values significantly different from the majority of other data points [47]. For this reason, they could lead to underestimated or overestimated results [47]. For a detailed explanation of the adopted data cleaning procedure, the reader is referred to the Appendix.

Following the existing literature [48–50], additional preprocessing, namely resampling and data normalization, was performed in order to increase the performance and computational efficiency of the proposed models. Here, resampling is the process of changing the frequency of time-series data [49], while data normalization prepares raw data for better use by the network [50]. In particular, data were downsampled to a 30-minute resolution and normalized using the statistical or Z-score normalization function [50]:

z = (x − u) / s, (3)

where x are the data to normalize, u is the mean of the training samples and s is the standard deviation of the training samples. Before normalizing, data were split into training (30 %), validation (10 %) and test (60 %) sets for every variable. In order to favour model generalization, the normalization parameters were defined based on the training set and then applied identically to each of the three parts.

The models were developed using the Python programming language and the open source libraries TensorFlow [51] and Keras [52]. Figure 5 shows the autoencoder structure as implemented in the performed experiments. The explored models included architectures with one to three hidden layers for each encoder and decoder. Batch normalization was included after every layer in order to avoid network saturation for both feed-forward and convolutional autoencoders [53].

Figure 5: General autoencoder architecture with 3 layers per side.
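The data split and the Z-score normalization of Equation (3) can be sketched as below. Several details are assumptions not stated in the text: the split is taken in chronological order, the population standard deviation is used, and the helper names (`split_sequences`, `fit_zscore`, `apply_zscore`) are hypothetical.

```python
import statistics

def split_sequences(days, train_frac=0.3, val_frac=0.1):
    """Split daily sequences into training, validation and test parts.

    A chronological split is an assumption of this sketch.
    """
    n_train = round(len(days) * train_frac)
    n_val = round(len(days) * val_frac)
    return (days[:n_train],
            days[n_train:n_train + n_val],
            days[n_train + n_val:])

def fit_zscore(train_days):
    """Mean u and standard deviation s of the training samples only."""
    values = [v for day in train_days for v in day]
    return statistics.fmean(values), statistics.pstdev(values)

def apply_zscore(days, u, s):
    """Eq. (3): z = (x - u) / s, with u and s taken from the training set."""
    return [[(v - u) / s for v in day] for day in days]

# Ten synthetic days of 48 half-hourly values each.
days = [[20.0 + d + 0.1 * t for t in range(48)] for d in range(10)]
train, val, test = split_sequences(days)
u, s = fit_zscore(train)
train_z = apply_zscore(train, u, s)
val_z = apply_zscore(val, u, s)      # same u and s as for the training set
test_z = apply_zscore(test, u, s)
```

Reusing the training statistics for the validation and test parts keeps information from the held-out data out of the preprocessing step.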

Feed-forward denoising autoencoders were fed with unrolled half-hourly daily observations, which resulted in 48 features. Unrolling the 1-D temporal sequences into a single input layer is a commonly adopted approach to address temporal dependencies in a feed-forward neural network [5, 36]. However, feed-forward neural networks with unrolled sequences have separate parameters for each feature, which means that weights cannot be shared along the input series [17]. In contrast, convolutional neural networks apply the same kernel to every time-step, and in the LSTM configuration each output node is a function of the previous one, so that parameters can be shared along very long sequences [17].

Overfitting was prevented by using an early stopping criterion based on the validation loss [17]. Furthermore, the mean squared error (MSE) loss function was applied to the reconstructed and original input over all the training samples, as follows [17]:

\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} (Y'_i - Y_i)^2, (4)

where Y' is the reconstructed input, Y is the original input and m is the batch size. In order to find the optimal parameters for the models, the MSE was minimized using either a stochastic gradient descent (SGD) optimizer with momentum [54] or the Adam optimizer [55]. The optimizer choice was handled as an additional hyperparameter. In this study, the hyperparameter tuning was conducted separately for each target variable (temperature, relative humidity, CO2 concentration). Additionally, the whole hyperparameter range was explored for all neural units, namely feed-forward, LSTM and convolutional. The range of values in which the optimal model configuration was investigated is summarized in Table 2.

Table 2: Overview of tuned hyperparameters and explored values.

Hyperparameter                                 Values
Hidden layers per side                         1 - 3
Hidden layer units / filters (convolutional)   8, 16, 32, 64, 128
Kernel size (convolutional)                    2, 3, 5, 7, 11
Batch size                                     128
Learning rate                                  0.001, 0.01, 0.1
Optimizer                                      SGD with 0.9 momentum, Adam

Grid search was chosen over more advanced tuning methods due to the relatively small hyperparameter space. Other tuning strategies, such as random or greedy search [56], would be beneficial in more complex scenarios.

For every combination of hyperparameters, the model score was obtained by averaging the MSEs computed on the normalized validation set, with a corruption rate (CR) ranging between 0.2 and 0.8. Given the computational cost of this process, multiple independent jobs with different hyperparameters were run in parallel using the computational resources of the RWTH Aachen University Compute Cluster. In order to account for the stochastic initialization of the models' weights, the tuned configurations were run again 10 times and evaluated on the same validation data as before. The models with the lowest MSE were then exported for further evaluation on the test set.
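The grid search over Table 2 can be sketched with the standard library. The values are those listed in the table (batch size is fixed at 128 and therefore omitted); the CR step of 0.1 between 0.2 and 0.8 is an assumption, and `validation_mse` is a hypothetical callback that would train a model for a configuration and return its validation MSE at a given CR. A toy stand-in replaces it here so that the example runs.

```python
from itertools import product
from statistics import fmean

GRID = {
    "hidden_layers": [1, 2, 3],
    "units": [8, 16, 32, 64, 128],
    "learning_rate": [0.001, 0.01, 0.1],
    "optimizer": ["sgd_momentum_0.9", "adam"],
}
KERNEL_SIZES = [2, 3, 5, 7, 11]            # only for convolutional models
CORRUPTION_RATES = (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)   # step is assumed

def configurations(convolutional=False):
    """Yield every combination of the explored hyperparameter values."""
    grid = dict(GRID)
    if convolutional:
        grid["kernel_size"] = KERNEL_SIZES
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def score(config, validation_mse):
    """Average validation MSE over the corruption rates."""
    return fmean([validation_mse(config, cr) for cr in CORRUPTION_RATES])

def grid_search(validation_mse, convolutional=False):
    return min(configurations(convolutional),
               key=lambda config: score(config, validation_mse))

def toy_validation_mse(config, cr):
    """Stand-in for training and validating a model (purely illustrative)."""
    return (abs(config["units"] - 32) / 32
            + abs(config["learning_rate"] - 0.01)
            + 0.01 * config["hidden_layers"]
            + (0.0 if config["optimizer"] == "adam" else 0.05)
            + 0.1 * cr)

best = grid_search(toy_validation_mse)
```

With these values the grid contains 90 combinations for the feed-forward and LSTM units and 450 for the convolutional ones, which is consistent with the relatively small search space mentioned above.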

4. Results

The performance of missing data imputation was assessed using the root mean squared error (RMSE), since it is the evaluation metric established by the existing research [9, 10, 13]. In order to obtain objective evaluation results, the RMSE was applied only to the corrupted sequence of the test data. Additionally, the ability of the proposed models to capture indoor environmental data patterns was estimated by computing the RMSE on each sequence [14]. The RMSE equation is given as follows [9, 10, 14]:

\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (X^{obs}_i - X^{inserted}_i)^2}{n}}, (5)

where X^{obs}_i are the i-th real values, X^{inserted}_i are the i-th reconstructed values and n is the total number of data points on which the RMSE is computed. Comparison between different variables and studies was made by means of the normalized root mean squared error (NRMSE), obtained by normalizing the RMSE over the interquartile range (IQR), due to the possible presence of noisy data points:

\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\mathrm{IQR}}. (6)

Network saturation was identified as a major modeling difficulty for both feed-forward and convolutional denoising autoencoders. Formally, neural network saturation can be described as an impediment to gradient propagation [37, 57]. Its main effect was that the network always produced the same output, no matter how different the input sequence was.

As presented earlier, neural networks are trained by minimizing a loss function between the network outputs and the true values. This can be accomplished by applying an iterative algorithm called gradient descent [58]. The working principle of gradient descent is simple: weights and biases are initialized to some values and then continuously updated in the direction that decreases the loss function, namely opposite to the gradient [58]. Each unit in a neural network receives signals and weights from previous units and computes a value called pre-activation [58]:

z = \sum_{j} w_j x_j + b, (7)

where z is the pre-activation value, j indexes the input signals and weights, x_j are the input signals, w_j are the weights and b is the bias, which determines the unit's activation when no inputs are present. The activation value is the pre-activation passed through an activation function \phi [58]:

a = \phi(z). (8)

If, during training, the activation of a neural network unit is always near the boundaries of its dynamic range (the possible outputs of an activation function), then the gradient with respect to the pre-activation is very small and the weights are not updated [58]. Such units are called saturated units [58]. Hence, saturated units can be identified by looking at the histogram of the average activations and checking that they are not concentrated at the endpoints [58].

Different approaches are followed in the literature to avoid network saturation. Glorot and Bengio [57] proposed a novel normalized initialization for neural network weights:

W \sim U\left[ -\frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}}, \frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}} \right], (9)

where W are the neural network weights, U is a uniform distribution and n_j is the size of the j-th layer. Maas et al. [59] used ReLU activation functions as an alternative to sigmoid and tanh, so that units saturate only at exactly 0. Ioffe and Szegedy [53] proposed a new mechanism called batch normalization, which ensures that the distribution of nonlinearity inputs (i.e. pre-activations) remains more stable as the network trains.

In this study, since no other formal metrics were available at the current state of the research, network saturation was evaluated using the saturation performance metric (SAT) proposed by Markovic et al. [37], calculated over the normalized reconstructed values.

In order to guarantee effective model training, all the above approaches from the literature were implemented [53, 57, 59]. Figure 6 shows the histogram of the average activations for the original feed-forward denoising autoencoder configuration with classical uniform weights initialization, sigmoid activation functions and no batch normalization. The typical network saturation problem can be observed: the average activations are always concentrated at 0 and 1, namely the endpoints of a sigmoid activation function.

[Figure 6 plot: 3D histogram with axes Activation [-] and Step.]

Figure 6: 3D Histogram of the average activations for the original feed-forward denoising autoencoder with 10 % corruption rate.
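The normalized initialization of Equation (9) and the endpoint check on activations can be sketched as follows. The `saturated_fraction` helper and its tolerance of 0.05 are assumptions introduced for illustration; it is a simple diagnostic in the spirit of the histogram inspection described above and is not the SAT metric of Markovic et al. [37].

```python
import math
import random

def glorot_uniform(n_in, n_out, rng):
    """Eq. (9): weights drawn from U[-sqrt(6)/sqrt(n_in + n_out), +same]."""
    limit = math.sqrt(6.0) / math.sqrt(n_in + n_out)
    return [[rng.uniform(-limit, limit) for _ in range(n_in)]
            for _ in range(n_out)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def saturated_fraction(activations, low=0.0, high=1.0, tol=0.05):
    """Share of activations within `tol` of the endpoints of the range.

    Hypothetical diagnostic: values close to 1.0 mean that most units sit
    at the boundaries of the activation function's dynamic range.
    """
    near = sum(1 for a in activations if a - low < tol or high - a < tol)
    return near / len(activations)

rng = random.Random(0)
W = glorot_uniform(48, 16, rng)       # 48 inputs, 16 hidden units (assumed sizes)

# Large pre-activations push a sigmoid to its endpoints; small ones do not.
saturated = [sigmoid(z) for z in (-10.0, -8.0, 9.0, 10.0)]
healthy = [sigmoid(z) for z in (-0.5, 0.0, 0.5)]
```

For a tanh unit the same check would use `low=-1.0`, since its dynamic range is [-1, 1].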

The use of normalized initialization, ReLU activation functions and batch normalization, as proposed in the literature, was confirmed to achieve higher performance and to overcome saturation problems. Figure 7 shows how the average network activations change during training without being stuck at any saturating point. Note that the ReLU activation function was applied on the encoder layer, while batch normalization was applied only in the decoder layer. The authors found that higher performance could be reached using a tanh activation function in the decoder layer, in association with batch normalization. Finally, the SAT results confirmed the superiority of the chosen model configurations with respect to the original denoising autoencoder architectures with classical uniform weights initialization, sigmoid activation functions and no batch normalization.

[Figure 7 plot: 3D histogram with axes Activation [-] and Step.]

Figure 7: 3D Histogram of the average activations for the new feed-forward denoising autoencoder with 10 % corruption rate.

The saturation performance results of the proposed models for different CR are summarized in Table 3. Here, network saturation is defined as a SAT lower than 0.1 [37]. Every autoencoder was well above this limit, meaning that the adopted architecture strategies could efficiently overcome saturation issues. The metric depended on the particular variable and on the applied CR. While the SAT was insensitive to changes in CR for relative humidity, it decreased for both temperature and CO2 data.

Table 3: SAT for denoising autoencoder neural networks for different CR. "CONV", "FEED" and "LSTM" stand for convolutional, feed-forward and LSTM denoising autoencoder.

                    T [-]                   RH [-]                  CO2 [-]
Metric     CR [-]   CONV    FEED    LSTM    CONV    FEED    LSTM    CONV    FEED    LSTM
SAT        0.10     0.88    0.85    0.84    0.95    0.97    0.98    0.78    0.83    0.77
           0.20     0.87    0.83    0.79    0.95    0.98    0.98    0.71    0.77    0.76
           0.30     0.83    0.82    0.77    1.00    0.98    0.98    0.65    0.73    0.69
           0.40     0.77    0.75    0.73    0.99    0.96    0.99    0.75    0.66    0.59
           0.50     0.73    0.72    0.72    0.98    0.97    0.98    0.57    0.60    0.47
           0.60     0.73    0.68    0.65    0.97    0.95    0.95    0.52    0.46    0.35
           0.70     0.69    0.69    0.59    0.94    0.96    0.96    0.51    0.48    0.32
           0.80     0.64    0.64    0.60    0.92    0.94    0.94    0.44    0.41    0.30
           0.90     0.62    0.63    0.57    0.93    0.96    0.93    0.39    0.34    0.31
           Average  0.75    0.73    0.70    0.96    0.96    0.97    0.59    0.59    0.51

First, the ability of autoencoder neural networks to capture daily patterns of environmental data is assessed. Then, the performance of the proposed models in reconstructing sub-daily indoor environment data gaps is evaluated and compared to classic polynomial interpolations.

Figure 8 shows the RMSE variability for an LSTM denoising autoencoder. In particular, the model was trained with a masking noise of 0.1 and applied to the non-corrupted test sets. Here, the RMSE was computed on the single time-step as seen in Section 4.1. The average RMSEs over the number of test sequences were 0.027 °C, 0.12 % and 4.25 ppm for T, RH and CO2 data, respectively. Furthermore, all the reconstruction residuals computed with the LSTM architecture were lower than the convolutional and feed-forward ones.

The boxplots presented in Figure 8 also show the observed days whose RMSE falls outside the measured IQR. These could be considered as sequences with atypical behavior, for which the indoor environment data patterns cannot be detected by the model. Figure 8 also shows, for every variable, what a random day with atypical behavior looks like and how it is reconstructed by the LSTM autoencoder. In particular, the represented sequence of relative humidity data presents an outlier in the early morning observation. This anomaly, which could be caused by a sensor malfunction, was not detected during the data preprocessing, but it was identified with the proposed model.

[Figure 8 plots: boxplots of RMSE [°C], RMSE [%] and RMSE [ppm]; below, observation and reconstruction of measured temperature [°C], relative humidity [%] and CO2 concentration [ppm] over the daily observations.]

Figure 8: Boxplots of the observed sequences and reconstruction of a day with atypical behavior.

Table 4 summarizes the behavior of all the models when applied to data with different CR. For every variable there was a polynomial degree for which the daily data trends were better fitted by the interpolation. While indoor air temperature data were more accurately described by cubic correlations, relative humidity and CO2 concentration data had more linear and quadratic trends, respectively. However, all autoencoders performed better than the baseline approaches by a large margin for all variables. In particular, the convolutional configuration outperformed, on average, all the alternative models. Its RMSE was 37 % lower than cubic interpolation for indoor air temperature, 24 % lower than linear interpolation for relative humidity and 30 % lower than quadratic interpolation for CO2 concentration. In terms of NRMSE, missing relative humidity data could be reconstructed with higher accuracy than the other variables. In contrast, the worst performance was obtained with CO2 data. In particular, the NRMSE of the convolutional configuration for RH data was 75 % lower than for T and 90 % lower than for CO2. These results were consistent with the SAT trend, which was higher for RH and lower for CO2 (Table 3).

Figure 9 shows an exemplary indoor environment data reconstruction over one random day from the test sets. All presented data were corrupted with a masking noise of 0.5. The presented data confirmed once again the results presented in Table 4, namely that the proposed denoising autoencoder architectures were suitable for filling the missing indoor environment data sequences.

Table 4: Performance of denoising autoencoder neural networks and polynomial interpolations for reconstructing sub-daily indoor environment data gaps. "CONV", "FEED" and "LSTM" stand for convolutional, feed-forward and LSTM denoising autoencoder. "LIN", "QUAD" and "CUB" stand for linear, quadratic and cubic interpolation.
CR [-]     T [°C]                   RH [%]                   CO2 [ppm]
           CONV    FEED    LSTM     CONV    FEED    LSTM     CONV    FEED    LSTM
RMSE
0.10       0.22    0.32    0.33     0.73    1.14    1.05     49.10   64.58   64.88
0.20       0.31    0.36    0.47     0.90    1.23    1.47     61.16   69.41   82.51
0.30       0.36    0.42    0.53     1.08    1.35    1.78     69.11   75.04   89.00
0.40       0.41    0.47    0.59     1.23    1.45    2.11     78.25   81.46   101.64
0.50       0.46    0.51    0.62     1.36    1.59    2.33     84.12   89.98   107.85
0.60       0.50    0.53    0.64     1.48    1.66    2.54     91.59   95.38   110.28
0.70       0.51    0.52    0.63     1.54    1.66    2.72     91.43   94.86   106.16
0.80       0.50    0.51    0.61     1.66    1.74    2.80     91.26   94.32   102.18
0.90       0.50    0.49    0.60     1.73    1.78    3.00     89.63   90.52   96.75
Average    0.42    0.46    0.56     1.30    1.51    2.20     78.41   83.95   95.69
NRMSE [-]
0.10       0.15    0.21    0.22     0.04    0.07    0.06     0.43    0.56    0.56
0.20       0.21    0.24    0.31     0.05    0.07    0.08     0.53    0.61    0.72
0.30       0.24    0.28    0.35     0.06    0.08    0.10     0.60    0.66    0.78
0.40       0.27    0.31    0.39     0.07    0.08    0.12     0.68    0.71    0.88
0.50       0.31    0.34    0.41     0.08    0.09    0.13     0.73    0.79    0.94
0.60       0.33    0.35    0.42     0.09    0.10    0.14     0.80    0.83    0.96
0.70       0.34    0.35    0.42     0.09    0.10    0.15     0.80    0.83    0.92
0.80       0.33    0.34    0.41     0.10    0.10    0.16     0.80    0.82    0.89
0.90       0.33    0.33    0.40     0.10    0.10    0.17     0.78    0.79    0.84
Average    0.28    0.31    0.37     0.07    0.09    0.12     0.68    0.73    0.83

CR [-]     T [°C]                   RH [%]                   CO2 [ppm]
           LIN     QUAD    CUB      LIN     QUAD    CUB      LIN     QUAD    CUB
RMSE
0.10       0.58    0.52    0.39     1.31    1.10    0.94     94.89   79.97   72.00
0.20       0.67    0.64    0.49     1.48    1.28    1.13     106.67  92.86   85.32
0.30       0.73    0.73    0.57     1.59    1.45    1.29     114.41  102.74  96.76
0.40       0.78    0.82    0.66     1.70    1.59    1.49     118.96  113.61  113.68
0.50       0.80    0.91    0.74     1.77    1.76    1.69     121.94  122.14  125.89
0.60       0.79    0.95    0.82     1.82    1.89    1.89     121.25  124.12  148.58
0.70       0.77    0.95    0.82     1.86    2.01    2.03     120.32  122.30  138.57
0.80       0.75    0.92    0.70     1.88    2.21    2.31     117.48  122.40  120.25
0.90       0.74    0.86    0.77     1.91    3.15    3.24     112.40  126.33  125.01
Average    0.73    0.81    0.66     1.70    1.83    1.78     114.26  111.83  114.01
NRMSE [-]
0.10       0.39    0.34    0.27     0.07    0.06    0.05     0.83    0.69    0.63
0.20       0.45    0.42    0.33     0.08    0.07    0.06     0.93    0.81    0.74
0.30       0.49    0.48    0.38     0.09    0.08    0.07     1.00    0.89    0.84
0.40       0.52    0.55    0.44     0.09    0.09    0.08     1.03    0.99    0.99
0.50       0.53    0.60    0.49     0.10    0.10    0.09     1.06    1.06    1.10
0.60       0.53    0.63    0.54     0.10    0.11    0.11     1.05    1.08    1.29
0.70       0.51    0.63    0.54     0.10    0.11    0.11     1.05    1.07    1.21
0.80       0.51    0.61    0.46     0.11    0.12    0.13     1.02    1.07    1.05
0.90       0.49    0.57    0.51     0.11    0.18    0.18     0.98    1.10    1.09
Average    0.49    0.54    0.44     0.09    0.10    0.10     0.99    0.97    0.99
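The baseline evaluation behind Table 4 can be illustrated with a minimal sketch. It assumes that the masking noise removes a random fraction CR of the 48 half-hourly samples of a day and that the RMSE is computed over the whole day; neither detail is confirmed by the text above. Only the linear baseline is shown, implemented as piecewise-linear gap filling, which may differ from the interpolation routine used by the authors. All function names are hypothetical.

```python
import math
import random


def mask_sequence(day, corruption_rate, rng):
    """Return a copy of `day` with a random fraction of samples set to None."""
    n_missing = round(corruption_rate * len(day))
    missing = set(rng.sample(range(len(day)), n_missing))
    return [None if i in missing else v for i, v in enumerate(day)]


def fill_linear(masked):
    """Fill None entries by linear interpolation between observed neighbours.

    Gaps at the start or end of the sequence take the nearest observed value.
    """
    known = [i for i, v in enumerate(masked) if v is not None]
    if not known:
        raise ValueError("at least one observed sample is required")
    filled = list(masked)
    for i, v in enumerate(masked):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = masked[right]
        elif right is None:
            filled[i] = masked[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = masked[left] + w * (masked[right] - masked[left])
    return filled


def rmse(truth, estimate):
    """Root mean squared error over the whole sequence (assumed convention)."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


# Synthetic temperature day (made-up values): 48 half-hourly samples.
day = [22.0 + 1.5 * math.sin(2 * math.pi * i / 48) for i in range(48)]
masked = mask_sequence(day, 0.5, random.Random(0))
reconstructed = fill_linear(masked)
error = rmse(day, reconstructed)
```

Repeating this over all test days and corruption rates from 0.1 to 0.9 would give one row of a table shaped like Table 4 for the linear baseline.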

[Figure 9 plot panels: measured temperature [°C], relative humidity [%] and CO2 concentration [ppm] against observations, each shown for the convolutional autoencoder, feed-forward autoencoder, LSTM autoencoder, linear interpolation, quadratic interpolation and cubic interpolation.]

Figure 9: One-day-long indoor environment data reconstruction. The blue line represents the real data. The hashed blue line represents the missing data. The red line represents the reconstruction of the whole day with the adopted model. Observations were resampled to 30-minute steps.

4.4. Data forecasting performance evaluation

As presented in earlier sections, the proposed autoencoder neural networks were implemented to reconstruct sub-daily indoor environment data gaps, since building data sets often contain missing values that could hinder further energy analysis. Nonetheless, the same models could also be used for short-term indoor environment data forecasting.

Table 5 summarizes the performance of the implemented autoencoders for different predictive horizons. In summary, all the models performed similarly well. However, there was a clear improvement in the LSTM configuration with respect to the data reconstruction case. In particular, the average RMSE of the LSTM model was 14 % lower for indoor air temperature and 20 % lower for CO2 concentration (Tables 4 and 5). In terms of NRMSE, the proposed autoencoder neural networks could, in this case too, forecast relative humidity data better than the other variables. In particular, the NRMSE of the LSTM configuration for RH data was 59 % lower than for T and 80 % lower than for CO2 data (Table 5).

Table 5: Performance of denoising autoencoder neural networks for forecasting indoor environment data. "CONV", "FEED" and "LSTM" stand for convolutional, feed-forward and LSTM denoising autoencoder. PH is the predictive horizon.

PH [h]     T [°C]                   RH [%]                   CO2 [ppm]
           CONV    FEED    LSTM     CONV    FEED    LSTM     CONV    FEED    LSTM
RMSE
2.50       0.19    0.18    0.17     0.89    0.97    0.89     25.62   23.86   25.34
5.00       0.31    0.30    0.29     1.47    1.44    1.37     43.00   42.38   39.33
7.00       0.42    0.41    0.40     1.72    1.83    1.76     55.62   55.29   56.11
9.50       0.50    0.48    0.46     2.15    2.21    2.18     72.37   72.84   74.21
12.00      0.55    0.52    0.52     2.47    2.45    2.41     81.27   82.55   82.11
14.50      0.62    0.60    0.59     2.88    2.85    2.67     102.71  102.40  104.51
17.00      0.73    0.66    0.66     2.84    2.88    2.81     108.13  107.22  107.79
19.00      0.74    0.64    0.63     2.93    2.95    2.84     102.40  101.58  102.02
21.50      0.75    0.60    0.60     3.01    3.01    3.18     96.89   96.14   96.88
Average    0.53    0.49    0.48     2.26    2.29    2.23     76.45   76.03   76.48
NRMSE [-]
2.50       0.13    0.12    0.11     0.05    0.05    0.05     0.22    0.21    0.22
5.00       0.20    0.20    0.19     0.08    0.08    0.08     0.37    0.37    0.34
7.00       0.28    0.27    0.27     0.10    0.10    0.10     0.48    0.48    0.49
9.50       0.33    0.32    0.31     0.12    0.12    0.12     0.63    0.63    0.65
12.00      0.36    0.35    0.35     0.14    0.14    0.14     0.71    0.72    0.71
14.50      0.41    0.40    0.40     0.16    0.16    0.15     0.89    0.89    0.91
17.00      0.48    0.44    0.44     0.16    0.17    0.16     0.94    0.93    0.94
19.00      0.49    0.42    0.42     0.17    0.17    0.16     0.89    0.88    0.89
21.50      0.50    0.40    0.40     0.17    0.17    0.18     0.84    0.84    0.84
Average    0.35    0.32    0.32     0.13    0.13    0.13     0.66    0.66    0.67

Figure 10 shows exemplary indoor environment data forecasts over one random day from the test sets. All presented data were corrupted with a masking noise of 0.5 at the end of each daily sequence (predictive horizon of 12 h). The presented data confirmed once again the results presented in Table 5, namely that the proposed denoising autoencoder architectures were also suitable for short-term indoor environment data forecasting.

Figure 10: One-day-long indoor environment data forecasting. The blue line represents the real data. The hashed blue line represents the missing data. The red line represents the reconstruction of the whole day with the adopted model. Observations were resampled to 30-minute steps.
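The link between masking noise and predictive horizon can be made explicit with a short sketch. It assumes that forecasting is obtained by masking the last round(CR × 48) samples of a 48-sample day. This rounding rule is inferred from the predictive horizons listed in Table 5, which it reproduces exactly, and is not stated in the text. The function names are hypothetical.

```python
SAMPLES_PER_DAY = 48  # 30-minute resolution
STEP_HOURS = 0.5


def tail_mask(day, corruption_rate):
    """Replace the last round(CR * len(day)) samples of a day with None."""
    n_missing = round(corruption_rate * len(day))
    n_observed = len(day) - n_missing
    return [v if i < n_observed else None for i, v in enumerate(day)]


def predictive_horizon_hours(corruption_rate, samples_per_day=SAMPLES_PER_DAY):
    """Length in hours of the masked tail for a given corruption rate."""
    return round(corruption_rate * samples_per_day) * STEP_HOURS


# Corruption rates 0.1 ... 0.9 mapped to predictive horizons in hours.
horizons = [predictive_horizon_hours(cr / 10) for cr in range(1, 10)]
# horizons == [2.5, 5.0, 7.0, 9.5, 12.0, 14.5, 17.0, 19.0, 21.5]
```

With a masking noise of 0.5 the last 24 samples are hidden, which corresponds to the 12 h horizon used for Figure 10.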

5. Discussion and future work

The aim of this study was to reconstruct sub-daily indoor environment data time-series, since short-term missing data are often present in building data sets and could hinder further energy analysis. Considering that building energy models usually require inputs at an hourly resolution [8], feeding the models with minute-wise time-series would have resulted in a significant increase of computational costs. Consequently, a 30-minute sampling interval was adopted for model development. An important contribution of this paper is the analysis of the performance of autoencoder neural networks on different types of environmental time-series, measured over multiple years in a whole commercial building. This fills an important research gap in the related literature, since existing studies either focused on reconstructing a single type of data stream or were limited by the size of the available training set.

As presented in Section 3, data were split into three sets before normalization. Model training was performed using the training set, the optimal model configuration was chosen based on performance on the validation set and the data reconstruction accuracy was evaluated using the test set. Overall, 94,085 full days of observations were available, which makes this study, as far as the authors know, the largest of its kind for indoor environment data time-series reconstruction purposes. In order to guarantee a significant generalization of the models, an extensive test set was used: approximately 2.7 million data points served for model evaluation. Improved performance of the final models could be achieved by introducing the dimension of each data set as an additional hyperparameter. However, this choice would have led to additional computational costs and, therefore, it was not pursued.

In total, 1,890 hyperparameter combinations were explored by applying a grid search. Simulations were performed using computing resources granted at RWTH Aachen University. In particular, approximately 7,000 core hours were used under project rwth0622.

This work provided important insights regarding the occurrence of neural network saturation in analytics related to building performance. The consequence of model saturation was that the weights were not updated due to the vanishing gradient problem (Section 4.2), which always led to identical predicted values. This problem was observed for the convolutional and feed-forward autoencoders, while it was not detected for the LSTM autoencoders. A suitable approach to tackle model saturation was found in the existing literature on computer vision and general machine learning. Even though all the computed saturation metrics were well above the defined saturation limit (0.1), the SAT decreased with the corruption rate for both indoor air temperature and CO2 concentration and remained almost unchanged for relative humidity. This was inconsistent with the increasing RMSE trend of the latter variable (Table 4). It can be concluded that the above saturation metric could not be used as an additional performance measure, since it also depended on the sequence variability of the original data. In particular, the worst saturation performance on CO2 data could be explained by the presence of more extreme values and frequent peaks. This could explain why LSTM neural networks suffered from saturation issues in the study by Markovic et al. [37]. In this earlier study, an LSTM-based model was applied to plug-in load data and saturation occurred in more than 70 % of the trained configurations. This could be caused by the larger data imbalance of the plug-in loads and their extreme values. Namely, similarly to plug-in loads, the time-series of CO2 concentration consist of frequent peaks and extreme values, which proved to be a particular difficulty when using LSTMs for building energy analysis.

The data reconstruction analysis, applied to the non-corrupted data, revealed that the proposed autoencoder neural networks, especially the LSTM configuration, could accurately capture the indoor environment patterns. This represents a significant practical potential for the inclusion of these methods in real-time building control. They could be used for anomaly detection purposes, by identifying data sequences with atypical behavior (e.g. noisy data, sensor malfunctioning) (Figure 8). On the other hand, the convolutional configuration stood out when masking noise was applied to the test sequences. It can, therefore, be stated that the spatial correlations of the input data were more important than the temporal ones when a gap-filling method was investigated. Additionally, the NRMSE analysis established that relative humidity data patterns were, in general, easier to detect by the proposed models. In order to increase the generalization capability of the developed methods, the inclusion of monitoring data collected in multiple buildings with significant differences in thermal mass and design should be further researched.

One possible application of this study is the use of autoencoder neural networks for time-series data forecasting. In this regard, the performance of the proposed models is promising and should be considered as part of future expansions of this work. The proposed models could forecast the indoor air temperature data even better than calibrated Modelica-based building performance simulation tools applied in other studies [60]. The temporal correlations of the input data gained significant importance with respect to the reconstruction case, placing the LSTM configuration on a slightly better performance level than the convolutional one. Based on the previous considerations, a denoising autoencoder which relies, at the same time, on LSTM and convolutional units could further increase the predictive accuracy of the model. Future work should also evaluate the implemented autoencoders for forecasting energy and environmental data time-series over longer time horizons. The ability of these methods to capture indoor environment data patterns could be further exploited by modelling the energy-related user actions in an unsupervised manner.

Additionally, the proposed models could be used as a back-up option in case of sensor failure in real-time building control [7, 13]. In that case, missing data insertion could be effective over the current day of observation.

One of the possible limitations of this study is a direct consequence of the training process. The proposed autoencoders were implemented to capture information related to the daily trends of the observed variables. Accordingly, day-ahead data sequences cannot be reconstructed with the current training scheme. Finally, the applicability of the proposed models for reconstructing indoor environment data in other buildings has not yet been tested.
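The anomaly detection use mentioned in the discussion could be prototyped along the following lines. This is only an illustrative sketch: the paper does not specify a detection rule, so the threshold (median plus k times the median absolute deviation of the daily reconstruction errors), the value of k and all function names are assumptions.

```python
import math
import statistics


def rmse(truth, estimate):
    """Root mean squared error between an observed and a reconstructed day."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


def flag_atypical_days(observed_days, reconstructed_days, k=3.0):
    """Return indices of days whose reconstruction error is unusually large.

    A day is flagged when its RMSE exceeds median + k * MAD of all daily
    errors. This rule is a hypothetical choice, not taken from the paper.
    """
    errors = [rmse(o, r) for o, r in zip(observed_days, reconstructed_days)]
    median = statistics.median(errors)
    mad = statistics.median(abs(e - median) for e in errors)
    threshold = median + k * mad
    flagged = [i for i, e in enumerate(errors) if e > threshold]
    return flagged, errors


# Made-up example: five days, the last one reconstructed poorly.
base = [22.0 + 0.1 * i for i in range(48)]
offsets = [0.10, 0.12, 0.09, 0.11, 1.50]
observed = [base for _ in offsets]
reconstructed = [[v + off for v in base] for off in offsets]
flagged, errors = flag_atypical_days(observed, reconstructed)
```

In practice the reconstructed days would come from the trained autoencoder applied to non-corrupted input, as in the analysis behind Figure 8.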

6. Conclusion

The aim of this paper was to develop an approach for reconstructing indoor environment data time-series. For that purpose, three autoencoder neural network models were implemented and polynomial interpolation methods were evaluated for baseline comparison. The evaluation of model performance was conducted using indoor air temperature, relative humidity and CO2 concentration data. The key findings can be summarized as follows:
• Autoencoder neural networks outperformed polynomial interpolation methods for filling environmental data gaps.
• The convolutional configuration worked better than the other models. In this regard, temperature, relative humidity and CO2 data could be reconstructed with average RMSEs (from 10 % to 90 % masking noise) of 0.42 °C, 1.30 % and 78.41 ppm, respectively.
• Autoencoder neural networks could be used for predicting the indoor environment data with high accuracy over a multi-hour time horizon. The proposed models outperformed calibrated building performance simulation with an approximately 56 % lower error rate in terms of indoor air temperature [60].
• The implementation of normalized initial weights, the ReLU activation function and batch normalization for feed-forward and convolutional autoencoders avoided network saturation.
• Network saturation was not an issue for neural networks with LSTM layers, since they could overcome the vanishing gradient problem.
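The margins by which the convolutional autoencoder beats the best interpolation baseline can be recomputed from the average RMSEs in Table 4. The sketch below does this. Because the tabulated averages are rounded to two decimals, the recomputed percentages may differ from the quoted ones by up to about one percentage point. The helper name is hypothetical.

```python
def relative_reduction(model_rmse, baseline_rmse):
    """Percentage by which the model RMSE is lower than the baseline RMSE."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Average RMSEs from Table 4: (convolutional autoencoder, best interpolation).
table4_averages = {
    "T [degC] vs cubic": (0.42, 0.66),
    "RH [%] vs linear": (1.30, 1.70),
    "CO2 [ppm] vs quadratic": (78.41, 111.83),
}

reductions = {
    name: relative_reduction(conv, baseline)
    for name, (conv, baseline) in table4_averages.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f} % lower RMSE")
```

The three values come out at roughly 36 %, 24 % and 30 %, in line with the margins reported in the results.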

7. Acknowledgements

Part of this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - TR892 /

References

[1] Directive (EU) 2018/844 of the European Parliament and of the Council of 30 May 2018 amending Directive 2010/31/EU on the energy performance of buildings and Directive 2012/27/EU on energy efficiency. Official Journal of the European Union 61 (2018) 75-76. http://data.europa.eu/eli/dir/2018/844/oj
[2] N. Fumo. A review on the basics of building energy estimation. Renewable & Sustainable Energy Reviews 31 (2014) 53-60. https://doi.org/10.1016/j.rser.2013.11.040
[3] I. Khan, A. Capozzoli, S.P. Corgnati, T. Cerquitelli. Fault Detection Analysis of Building Energy Consumption Using Data Mining Techniques. Energy Procedia 42 (2013) 557-566. https://doi.org/10.1016/j.egypro.2013.11.057
[4] 2009 ASHRAE Handbook-Fundamentals. ASHRAE Inc 404 (2009) 636-8400.
[5] R. Markovic, E. Grintal, D. Wölki, J. Frisch, C. van Treeck. Window opening model using deep learning methods. Building and Environment 145 (2018) 319-329. https://doi.org/10.1016/j.buildenv.2018.09.024
[6] F. Causone, S. Carlucci, M. Ferrando, A. Marchenko, S. Erba. A data-driven procedure to model occupancy and occupant-related electric load profiles in residential buildings for energy simulation. Energy and Buildings 202 (2019) 109342. https://doi.org/10.1016/j.enbuild.2019.109342
[7] R. Markovic. Generic occupant behavior modeling for commercial buildings. Doctoral Thesis. RWTH Aachen University (2020).
[8] A. Chong, K.P. Lam, W. Xu, O.T. Karaguzel, Y. Mo. Imputation of missing values in building sensor data. ASHRAE and IBPSA-USA SimBuild 2016 Building Performance Modeling Conference. Salt Lake City, UT (2016).
[9] L.M. Candanedo, V. Feldheim, D. Deramaix. Reconstruction of the indoor temperature dataset of a house using data driven models for performance evaluation. Building and Environment 138 (2018) 250-261. https://doi.org/10.1016/j.buildenv.2018.04.035
[10] J. Ma, J.C.P. Cheng, F. Jiang, W. Chen, M. Wang, C. Zhai. A bi-directional missing data imputation scheme based on LSTM and transfer. Energy and Buildings 216 (2020) 109941. https://doi.org/10.1016/j.enbuild.2020.109941
[11] J.S. Haberl, T.A. Reddy, D.E. Claridge, W.D. Turner, D.L. O'Neal, W.M. Heffington. Measuring energy-saving retrofits: experiences from the Texas LoanSTAR Program. United States (1996). https://doi.org/10.2172/219427
[12] A.J. Zapata-Sierra, A. Cama-Pinto, F.G. Montoya, A. Alcayde, F. Manzano-Agugliaro. Wind missing data arrangement using wavelet based techniques for getting maximum likelihood. Energy Conversion and Management 185 (2019) 552-561. https://doi.org/10.1016/j.enconman.2019.01.109
[13] J. Loy-Benitez, S. Heo, C. Yoo. Imputing missing indoor air quality data via variational convolutional autoencoders: Implications for ventilation management of subway metro systems. Building and Environment 182 (2020) 107135. https://doi.org/10.1016/j.buildenv.2020.107135
[14] C. Fan, F. Xiao, Y. Zhao, J. Wang. Analytical investigation of autoencoder-based methods for unsupervised anomaly detection in building energy data. Applied Energy 211 (2018) 1123-1135. https://doi.org/10.1016/j.apenergy.2017.12.005
[15] Y. Liu, Z. Pang, M. Karlsson, S. Gong. Anomaly detection based on machine learning in IoT-based vertical plant wall for indoor climate control. Building and Environment 183 (2020) 107212. https://doi.org/10.1016/j.buildenv.2020.107212
[16] D.B. Araya, K. Grolinger, H.F. ElYamany, M.A.M. Capretz, G. Bitsuamlak. Collective contextual anomaly detection framework for smart buildings. 2016 International Joint Conference on Neural Networks, IJCNN (2016) 511-518. https://doi.org/10.1109/IJCNN.2016.7727242
[17] I. Goodfellow, Y. Bengio, A. Courville. Deep Learning. MIT Press (2016).
[18] D.H. Tran, D.L. Luong, J.S. Chou. Nature-inspired metaheuristic ensemble model for forecasting energy consumption in residential buildings. Energy 191 (2020) 116552. https://doi.org/10.1016/j.energy.2019.116552
[19] M. Ilbeigi, M. Ghomeishi, A. Dehghanbanadaki. Prediction and optimization of energy consumption in an office building using artificial neural network and a genetic algorithm. Sustainable Cities and Society 61 (2020) 102325. https://doi.org/10.1016/j.scs.2020.102325
[20] X.J. Luo, L.O. Oyedele, A.O. Ajayi, O.O. Akinade, H.A. Owolabi, A. Ahmed. Feature extraction and genetic algorithm enhanced adaptive deep neural network for energy consumption prediction in buildings. Renewable and Sustainable Energy Reviews 131 (2020) 109980. https://doi.org/10.1016/j.rser.2020.109980
[21] N. Somu, G.R. M R, K. Ramamritham. A hybrid model for building energy consumption forecasting using long short term memory networks. Applied Energy 261 (2020) 114131. https://doi.org/10.1016/j.apenergy.2019.114131
[22] J.R.S. Iruela, L.G.B. Ruiz, M.C. Pegalajar, M.I. Capel. A parallel solution with GPU technology to predict energy consumption in spatially distributed buildings using evolutionary optimization and artificial neural networks. Energy Conversion and Management 207 (2020) 112535. https://doi.org/10.1016/j.enconman.2020.112535
[23] F. Qian, W. Gao, Y. Yang, D. Yu. Potential analysis of the transfer learning model in short and medium-term forecasting of building HVAC energy consumption. Energy 193 (2020) 116724. https://doi.org/10.1016/j.energy.2019.11672
[24] Y. Gao, Y. Ruan, C. Fang, S. Yin. Deep learning and transfer learning models of energy consumption forecasting for buildings with poor information data. Energy and Buildings 223 (2020) 110156. https://doi.org/10.1016/j.enbuild.2020.110156
[25] G. Zhang, C. Tian, C. Li, J.J. Zhang, W. Zuo. Accurate forecasting of building energy consumption via a novel ensembled deep learning method considering the cyclic feature. Energy 201 (2020) 117531. https://doi.org/10.1016/j.energy.2020.117531
[26] D.K. Bui, T.N. Nguyen, T.D. Ngo, H. Nguyen-Xuan. An artificial neural network (ANN) expert system enhanced with the electromagnetism-based firefly algorithm (EFA) for predicting the energy consumption in buildings. Energy 190 (2020) 116370. https://doi.org/10.1016/j.energy.2019.116370
[27] Y. Tian, J. Yu, A. Zhao. Predictive model of energy consumption for office building by using improved GWO-BP. Energy Reports 6 (2020) 620-627. https://doi.org/10.1016/j.egyr.2020.03.003
[28] C. Zhou, Z. Fang, X. Xu, X. Zhang, Y. Ding, X. Jiang, Y. Ji. Using long short-term memory networks to predict energy consumption of air-conditioning systems. Sustainable Cities and Society 55 (2020) 102000. https://doi.org/10.1016/j.scs.2019.102000
[29] Y. Meng, T. Li, G. Liu, S. Xu, T. Ji. Real-time dynamic estimation of occupancy load and an air-conditioning predictive control method based on image information fusion. Building and Environment 173 (2020) 106741. https://doi.org/10.1016/j.buildenv.2020.106741
[30] K. Yan, A. Chong, Y. Mo. Generative adversarial network for fault detection diagnosis of chillers. Building and Environment 172 (2020) 106698. https://doi.org/10.1016/j.buildenv.2020.106698
[31] K. Yan, J. Huang, W. Shen, Z. Ji. Unsupervised learning for fault detection and diagnosis of air handling units. Energy and Buildings 210 (2020) 109689. https://doi.org/10.1016/j.enbuild.2019.109689
[32] C. Hengda, C. Huanxin, L. Zhengfei, C. Xiangdong. Ensemble 1-D CNN diagnosis model for VRF system refrigerant charge faults under heating condition. Energy and Buildings 224 (2020) 110256. https://doi.org/10.1016/j.enbuild.2020.110256
[33] Z. Zhou, G. Li, J. Wang, H. Chen, H. Zhong, Z. Cao. A comparison study of basic data-driven fault diagnosis methods for variable refrigerant flow system. Energy and Buildings 224 (2020) 110232. https://doi.org/10.1016/j.enbuild.2020.110232
[34] M. Han, R. May, X. Zhang, X. Wang, S. Pan, D. Yan, Y. Jin. A novel reinforcement learning method for improving occupant comfort via window opening and closing. Sustainable Cities and Society 61 (2020) 102247. https://doi.org/10.1016/j.scs.2020.102247
[35] A. Das, M.K. Annaqeeb, E. Azar, V. Novakovic, M.B. Kjærgaard. Occupant-centric miscellaneous electric loads prediction in buildings using state-of-the-art deep learning methods. Applied Energy 269 (2020) 115135. https://doi.org/10.1016/j.apenergy.2020.115135
[36] R. Markovic, J. Frisch, C. Treeck. Learning short-term past as predictor of window opening-related human behavior in commercial buildings. Energy and Buildings 185 (2019) 1-11. https://doi.org/10.1016/j.enbuild.2018.12.012
[37] Under review: R. Markovic, M.K. Annaqeeb, J. Frisch, C. Treeck, E. Azar. Day-ahead prediction of plug-in loads using a long short-term memory neural network. Energy and Buildings (2019).
[38] B.K. Beaulieu-Jones, J.H. Moore. Missing data imputation in the electronic health record using deeply learned autoencoders. Pacific Symposium on Biocomputing 22 (2017) 207-218. https://doi.org/10.1142/9789813207813_0021
[39] G. Bode, T. Schreiber, M. Baranski, D. Müller. A time series clustering approach for Building Automation and Control Systems. Applied Energy 238 (2019) 1337-1345. https://doi.org/10.1016/j.apenergy.2019.01.196
[40] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.A. Manzagol. Stacked Denoising Autoencoders: Learning useful representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research 11(12) (2010) 3379-3382. http://portal.acm.org/citation.cfm?id=1953039
[41] J.P. Fütterer, A. Constantin. Energy concept for the E.ON ERC main building. E.ON Energy Research Center 4 (2014). https://publications.rwth-aachen.de/record/443118
[42] J.P. Fütterer, A. Constantin, M. Schmidt, R. Streblow, D. Müller, E. Kosmatopoulos. A multifunctional demonstration bench for advanced control research in buildings - Monitoring, control, and interface system. IECON 2013 - 39th Annual Conference of the IEEE Industrial Electronics Society, Vienna (2013) 5696-5701. http://dx.doi.org/10.1109/IECON.2013.6700068
[43] G. Bode, J. Fütterer, D. Müller. Ein Demonstrator für innovative Technologien und Konzepte. TGA-Kongress 2016 Berlin (2016). http://publications.rwth-aachen.de/record/660706/files/Vortragsfolien.pdf
[44] M. Baranski, A. Kümpel, T.P. Schild, M.H. Schraven, F. Stinner, T.P.B. Storek, D. Müller, J. Fütterer. Ein flexibles lebendes Labor für die Entwicklung und Erprobung von Cloud-basierten Regelungsalgorithmen für die Gebäudeautomation. Deutscher Kälte- und Klimatechnischer Verein (DKV) e.V.: Tagung 2018, 2018-11-21 - 2018-11-23, Aachen (2018). http://publications.rwth-aachen.de/record/750585
[45] The HDF Group. Hierarchical Data Format, version 5, 1997-2020.
[46] Python Core Team (2015). Pickle: Python object serialization. Python Software Foundation. https://docs.python.org/3/library/pickle.html
[47] S.K. Kwak, J.H. Kim. Statistical data preparation: Management of missing values and outliers. Korean Journal of Anesthesiology 70(4) (2017) 407-411. https://doi.org/10.4097/kjae.2017.70.4.407
[48] M. Ferrando, A. Marchenko, S. Erba, F. Causone, S. Carlucci. Pattern Recognition And Classification For Electrical Energy Use In Residential Buildings. Building Simulation 2019, Rome (2019). https://doi.org/10.26868/25222708.2019.210750
[49] M. Rätz, A.P. Javadi, M. Baranski, K. Finkbeiner, D. Müller. Automated data-driven modeling of building energy systems via machine learning algorithms. Energy and Buildings 202 (2019) 109384. https://doi.org/10.1016/j.enbuild.2019.109384
[50] T. Jayalakshmi, A. Santhakumaran. Statistical normalization and back propagation for classification. International Journal of Computer Theory & Engineering 3(1) (2011) 1793-8201.
[51] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, R. Jozefowicz, Y. Jia, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, M. Schuster, R. Monga, S. Moore, D. Murray, C. Olah, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. CoRR (2016). https://arxiv.org/abs/1603.04467v2
[52] F. Chollet. Keras. (2015). https://keras.io
[53] S. Ioffe, C. Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. CoRR (2015). https://arxiv.org/abs/1502.03167v3
[54] I. Sutskever, J. Martens, G. Dahl, G. Hinton. On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning 28(3) (2013) 1139-1147. http://proceedings.mlr.press/v28/sutskever13.html
[55] D.P. Kingma, J. Ba. Adam: A method for stochastic optimization. (2014). https://arxiv.org/abs/1412.6980v9
[56] J. Bergstra, Y. Bengio. Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13 (2012) 281-305. https://dl.acm.org/doi/10.5555/2188385.2188395
[57] X. Glorot, Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. Proceedings of Machine Learning Research 9 (2010) 249-256. http://proceedings.mlr.press/v9/glorot10a.html
[58] R. Grosse. Introduction to Neural Networks and Machine Learning. CSC 321 Winter 2018.
[59] A.L. Maas, A.Y. Hannun, A.Y. Ng. Rectifier nonlinearities improve neural network acoustic models. Proceedings of the 30th International Conference on Machine Learning 28 (2013). http://robotics.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf
[60] R. Markovic, E. Grintal, A. Nouri, J. Frisch, C. Treeck. Right on Time - Exploring Suitable Time Discretization for Occupant Behavior Co-Simulation. Building Simulation 2019, Rome (2019). http://doi.org/10.26868/25222708.2019.210166
[61] D.K. Chaturvedi. Soft computing: techniques and its applications in electrical engineering. Springer 103 (2008) 51-85. https://doi.org/10.1007/978-3-540-77481-5

Appendix

This Appendix provides additional information about the adopted data cleaning process.

A.1. Outliers

Outliers were detected favoring model generalization, rather than accuracy. The aim of this paper was, indeed, to provide a tool to reconstruct indoor environment time-series, independently of the type and quality of data. Ma et al. [10] applied the IQR method for the reconstruction of building electric power data, defining as an outlier every value outside the following range:

$$[\,Q_1 - 1.5 \cdot \mathrm{IQR}\,;\; Q_3 + 1.5 \cdot \mathrm{IQR}\,], \tag{10}$$

where $Q_1$ is the first quartile of the dataset, $Q_3$ is the third quartile and IQR is the difference between the third and first quartile. Data outside this interval were replaced with the nearest IQR limit [10]. However, since the generalization characteristics of ANNs depend on the noise included in the training data [61], the authors decided not to follow this approach, at the expense of an overall reduced accuracy [61]. Accordingly, outliers were detected based on theoretical limits fixed by Markovic et al. [5] in a different study, where a subset of the same data set was analyzed. Temperature was therefore bounded between -10 and +40 °C, based on the plausible range for the continental climate in Germany [5]. Relative humidity was set between 0 and 100 % [5], while CO2 concentration was assumed to be between 0 and 2,500 ppm [5]. Table 6 summarizes descriptive statistics for the data set before and after outliers detection, based on the methods proposed in the literature [5, 10]. The IQR method proposed by Ma et al. [10] seemed to oversimplify the problem, by identifying as outliers a wide range of values (Table 6). The other approach [5] was therefore adopted.

Table 6: Descriptive statistics for the data set before (raw data) and after ([5, 10]) outliers detection. Std and L/U IQR stand respectively for standard deviation and lower/upper IQR limit. In the Outliers row, the mantissas of the [10] entries are missing.

           T [°C]                        RH [%]                        CO2 [ppm]
           Raw data  [10]      [5]       Raw data  [10]      [5]       Raw data  [10]      [5]
Min        6         19.4      6         0.5       1.45      0.5       0         265       192
Max        2.34E+16  25.8      37.1      2.53E+03  72.25     99.4      1.31E+04  737       2,000
Mean       6.47E+10  22.64     22.63     3.75E+01  37.52     37.5      5.16E+02  509.12    516.3
Median     22.7      22.7      22.7      36.4      36.4      36.4      491       491       491
Std        3.60E+13  1.22      1.4       1.18E+01  11.01     11        1.25E+02  100.27    124.25
L IQR      19.4      19.4      19.4      1.45      1.45      1.45      265       265       265
U IQR      25.8      25.8      25.8      72.25     72.25     72.3      737       737       737
Outliers   /         …E+06     358       /         …E+03     313       /         …E+06     307
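The two outlier strategies compared in this appendix can be contrasted with a small sketch. It assumes the conventional factor of 1.5 for the IQR fences and uses the fixed plausibility limits quoted above from [5]. The function names are hypothetical, the CO2 readings are made up, and the quartile convention of Python's `statistics.quantiles` may differ from the one used in [10].

```python
import statistics

# Fixed plausibility limits quoted in the text (from [5]).
PLAUSIBLE_LIMITS = {"T": (-10.0, 40.0), "RH": (0.0, 100.0), "CO2": (0.0, 2500.0)}


def iqr_fences(values, k=1.5):
    """Lower and upper IQR limits: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_to_fences(values, k=1.5):
    """IQR approach as described for [10]: replace outliers with the nearest limit."""
    lower, upper = iqr_fences(values, k)
    return [min(max(v, lower), upper) for v in values]


def outside_limits(values, variable):
    """Fixed-limit approach: mark values outside the plausible physical range."""
    lower, upper = PLAUSIBLE_LIMITS[variable]
    return [not (lower <= v <= upper) for v in values]


# Made-up CO2 readings [ppm] with one high but plausible peak and one faulty value.
co2 = [400.0, 420.0, 440.0, 460.0, 480.0, 500.0, 520.0, 540.0, 560.0, 1500.0, 2600.0]
fences = iqr_fences(co2)          # (260.0, 740.0)
clipped = clip_to_fences(co2)     # both 1500.0 and 2600.0 become 740.0
flags = outside_limits(co2, "CO2")  # only 2600.0 is marked
```

On this toy series the IQR fences also flatten the plausible 1,500 ppm peak, whereas the fixed limits only reject the physically implausible reading, which mirrors the reasoning given for preferring the limits from [5].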

A.2. Missing values

Based on the logging frequency and monitoring duration, around 181 million observations would be expected for each variable from 2014 to 2017. Of these, only 73 million data points were correctly recorded for T (59.6 % error rate) and 70 million for both RH and CO2 (61.3 % error rate). In order to increase the computational efficiency of the models, the sampling interval was increased from one minute to 30 minutes, leading to approximately 2.3 million data points for each variable. For the missing values handling, a complete case analysis approach was adopted, where only full days of observations at the current resolution were considered. Hence, the number of available data points per variable was reduced to approximately 1.5 million. Accordingly, of the initial 376,938 daily observations, models were applied only to 94,085 days (75 % error rate). An overview of the missing values handling strategy is presented in Table 7 and in Figure 11.

Table 7: Overview of the preprocessed data set.

                   T         RH        CO2
Frequency [min]    30        30        30
Expected days      125,646   125,646   125,646
Discarded days     94,265    94,294    94,294
Complete days      31,381    31,352    31,352

[Figure 11: schematic of the missing values handling strategy; recoverable labels: propagation, discarded days, restored days, complete days.]
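The complete case analysis and the bookkeeping of Table 7 can be sketched as follows. The sketch assumes that each day is represented by its list of 48 half-hourly values, with `None` marking a missing sample. It does not model the propagation and restoration steps hinted at by the Figure 11 labels. The function name and the sample data are hypothetical.

```python
SAMPLES_PER_DAY = 48  # 30-minute resolution


def complete_days(samples_by_day, samples_per_day=SAMPLES_PER_DAY):
    """Keep only days with a full set of non-missing half-hourly samples."""
    return {
        day: values
        for day, values in samples_by_day.items()
        if len(values) == samples_per_day and all(v is not None for v in values)
    }


# Made-up example: one complete day, one day with a gap, one truncated day.
samples = {
    "2014-01-01": [21.0] * 48,
    "2014-01-02": [21.0] * 20 + [None] * 4 + [21.5] * 24,
    "2014-01-03": [21.0] * 30,
}
kept = complete_days(samples)  # only "2014-01-01" survives

# Bookkeeping with the figures reported in Table 7.
expected_days = 125_646
discarded = {"T": 94_265, "RH": 94_294, "CO2": 94_294}
complete = {name: expected_days - n for name, n in discarded.items()}
total_complete = sum(complete.values())                       # 94,085 days
discard_share = 1 - total_complete / (3 * expected_days)      # about 0.75
points_per_variable = complete["T"] * SAMPLES_PER_DAY         # about 1.5 million
```

The recomputed totals agree with the 94,085 complete days, the 75 % discard share and the roughly 1.5 million data points per variable stated in the text.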