A Novel Event-based Non-intrusive Load Monitoring Algorithm

Abstract

Non-intrusive load monitoring (NILM) aims to infer the power profiles of appliances from the aggregated power signal via purely analytical methods. Existing NILM methods are susceptible to various issues such as noise and transient spikes in the power signal, overshoots at mode transition times, close consumption values of different appliances, and the unavailability of a large training dataset. This paper proposes a novel event-based NILM classification algorithm that mitigates these issues. The proposed algorithm (i) filters power consumption signals and accurately detects all events, (ii) extracts specific features of appliances, such as operation modes and their respective power consumption intervals, from their power consumption signals in the training dataset, and (iii) labels with high accuracy each detected event of the aggregated signal with an appliance mode transition. The algorithm is validated using REDD, with the results showing its effectiveness in accurately disaggregating low-frequency data measured by existing smart meters.



Elnaz Azizi, Student Member, IEEE, Mohhamd TH Beheshti, Member, IEEE, and Sadegh Bolouki, Member, IEEE

Index Terms: Demand-side management, clustering, event detection, non-intrusive load monitoring.

E. Azizi, M. TH. Beheshti and S. Bolouki are with the Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran. E-mails: {e.azizi, mbehesht, bolouki}@modares.ac.ir

I. INTRODUCTION

Due to the unpredictable nature of both generation, caused by renewable energy resources, and consumer demand, maintaining the balance between generation and demand is one of the main challenges in smart grids [1]. Residential demand-side management programs have thus emerged as a promising set of methods to strike such balance [2]. Among them, non-intrusive load monitoring (NILM), that is, the process of extracting the power consumption profile or operating pattern of each appliance from the aggregated power consumption signal of a house using purely analytical methods, has gained a great deal of attention in recent years. Practical and efficient, NILM provides consumers with an opportunity to track the energy consumption of each appliance and voluntarily change their usage patterns to save energy and reduce cost while maintaining their comfort, which also results in higher stability and efficiency of the power grid [3].

The concept of NILM was first introduced in 1992 by Hart [4]. Since then, a variety of analytical algorithms have been proposed to address the NILM problem. These algorithms employ various features and parameters such as the voltage, current, and active and reactive power signals of a house. Since measuring the active power is cost-efficient, a majority of studies have focused on this feature alone [5]. NILM research based on the active power signal has diverged into two main lines of study: (i) state-based algorithms, which consider each appliance as a finite-state machine and disaggregate the total power signal based on the learned model of state transitions of appliances [6], and (ii) event-based algorithms, which are based on the edges or considerable variations of the signal caused by turning appliances ON/OFF or by their other mode transitions [2]. Due to the low computational complexity of event-based techniques, they have proved more popular than the state-based ones [7].

In designing an event-based algorithm, multiple challenges are involved. The first one lies in the event detection part and is caused by the presence of noise, spikes, uncertainties in the voltage of the grid, and overshoots in appliances' power consumption signals. The second challenge is the closeness of different appliances' power consumption values, which makes them somewhat indistinguishable. The third and last challenge is that high-volume training datasets and ground-truth information about each appliance are scarce in practice, although a small amount of data can perhaps be collected for each residential building. To overcome these challenges, this paper proposes an event-based NILM algorithm with competitive accuracy, which first detects events in the power signals via a novel method, then extracts specific features and information about appliances from their consumption profiles in a small training dataset, and finally utilizes them to disaggregate the aggregated power consumption signal.

A. Related Work

The most well-known state-based NILM algorithms are the Hidden Markov Model (HMM) [8] and its variants such as Factorial HMM methods [9]. The main drawback of these methods is the requirement for a large training dataset to construct and learn the model. The computational complexity of these methods also increases exponentially with each added appliance [10]. In contrast, event-based NILM techniques, which deal with the detected events of the aggregated signal and classify them, have lower computational complexity than state-based ones [11]. Recent research on event-based NILM falls into two main categories, namely unsupervised and supervised methods [12]. Unsupervised NILM algorithms, tackling the so-called blind source problem, deal with the case where no prior information about appliances is available. In these methods, events are detected and different clustering algorithms such as subtractive clustering [13] and $k$-means [14] are applied to them. They detect different clusters of appliances without assigning a label to each cluster. Despite some success in the case where all appliances have only two (ON and OFF) modes, these algorithms have been ineffective in dealing with multi-mode appliances [14], [15].

In contrast with unsupervised NILM methods, supervised algorithms such as NILM classification algorithms require prior metadata about the number of appliances and their operation modes as well as a training dataset containing appliances' consumption profiles for a period of time. Considering modes of appliances as class labels, various classification methods such as KNN, multi-label classification [16], [17], and deep learning [18] have been utilized in this field. These methods have proved to be significantly more accurate than their unsupervised counterparts, particularly in the presence of multi-state appliances [18]. However, their main drawback is the need for an enormous training dataset that is not in general feasible to collect [19]. Therefore, extracting useful information from a small training dataset for the NILM classification problem has become a topic of great interest in the past few years [20].

B. Contributions

This paper proposes a novel event-based NILM algorithm that minimizes the ground-truth data required, performs well in analyzing real data measured by existing meters, and remains efficient and accurate even for large numbers of appliances. Major contributions of this work are detailed below.

1) Event-based algorithms are highly dependent on the detection of events. Therefore, the event detection algorithm used for the NILM purpose should be accurate in the sense that it should not miss any actual event or mistake fluctuations of the signal for an event. We propose in Section III a novel statistics-based method that filters the signal and detects events with 100% accuracy.

2) For NILM as a classification problem, the number of operation modes of appliances and their respective power consumption values are key to assigning labels. In most of the existing literature, these modes are obtained by visually analyzing the appliances' power consumption signals in the training dataset. We introduce a clustering approach in Subsection IV-A, using in part the linkage-Ward algorithm, which automatically extracts appliances' modes and their respective consumption values. Then, in Section V, a novel classification technique with competitive accuracy is established for the NILM problem, employing the previously extracted features.

3) Existing classification algorithms consider appliances' power consumption values at each mode as their main characteristics. However, an appliance's consumption pattern, its transitions between different modes, its ON duration, and the probability of occurrence of its transitions are also key information that can be used to distinguish two appliances with close power consumption values. Analyzing the training dataset in Section IV, we extract these features of appliances and utilize them for label refinement.

C. Paper Organization

The remainder of this paper is organized as follows. The terminology and problem statement are presented in Section II. In Section III, the proposed signal filtering and event detection techniques are described. The feature extraction methods are then detailed in Section IV, followed by the proposed classification method in Section V. The effectiveness and accuracy of the proposed algorithms are evaluated and compared with other algorithms using the REDD [21] in Section VI. Finally, Section VII concludes the paper.

II. TERMINOLOGY AND PROBLEM STATEMENT

In this section, we present the terminology and the event-based NILM classification problem considered in this paper.

A. Notions and Terminology

In this research, power refers to active power. Operation modes of an appliance refer to a fixed set of modes, including the OFF mode, in which the appliance can operate. Appliances are assumed to have two or more operation modes. The operating mode of an appliance is the mode in which the appliance is operating at a specific point in time. When no ambiguity results, the term mode is used to refer to an operation mode or operating mode. A state of an appliance is defined as its power consumption amount in one of its operation modes. Since this amount is assumed to vary at least slightly over time, a state is represented by a fixed closed interval within the set $\mathbb{R}$ of real numbers. Note that there exists a state corresponding to each operation mode of an appliance.

The aggregated power consumption signal, or simply the aggregated signal, refers to the sum of the power consumption signals of all appliances of a house or of specific appliances of interest. The term non-intrusive load monitoring, or NILM, in this work is then defined as extracting the sequence of operating modes of each appliance from the aggregated signal. This NILM problem is sometimes referred to as the NILM classification problem. While deducing the individual power consumption signals of appliances from the aggregated signal is also a NILM problem, which one views as a NILM regression problem, it is not considered in this work.

Given the power consumption signal of an appliance, an event is a change in the signal value caused by a mode transition of the appliance. Similarly, an event of the aggregated signal is a change in the signal value caused by a mode transition of any of the appliances contributing to the aggregated signal.

B. Event-based NILM Problem

The event-based NILM (classification) problem can be described as the process of assigning proper labels to events of the aggregated power consumption signal, where the set of labels consists of all appliance mode transitions. It should be noted that a training set in the form of a set or sequence of events and their corresponding labels is in general not immediately available. Instead, it has to be derived from the given individual appliances' power consumption signals over a period of time. No additional information, such as the number of modes of each appliance or their nominal power consumption values, is available. It is assumed throughout the paper that the given signals are measured in discrete time.

Fig. 1: Typical outliers in consumption signal

III. SIGNAL FILTERING AND EVENT DETECTION

A fundamental part of event-based NILM is detecting events accurately. Event detection should be executed on individual appliances' power consumption signals in the training set as well as on the aggregated signal in the test set. In a vast majority of the literature, an event is detected based on the difference between two consecutive sampled values. More precisely, if the absolute value of this difference is greater than a certain threshold, an event is considered to have occurred between the two sampling times. Existing threshold-based event detection techniques rely heavily on a threshold that is selected manually for the dataset at hand. Thus, they are not expected to perform as well on the meter data of a different residential house. Besides this extensibility issue, there appears to exist a fundamental limit on the accuracy of threshold-based event detection techniques, which is caused by fluctuations of voltage in the power grid and by noise, spikes, and the various ranges of overshoots in the signal, as shown in Fig. 1 [22], [23]. In this section, we propose a novel statistics-based algorithm that overcomes all these challenges and achieves 100% accuracy in event detection.

While the mainstream view of an event is a significant value change in the signal, our view of an event is an uncommon value change. Thus, considering a set formed based on value changes in the signal, we search for "outliers" of the set. As will be explained later in this section, this set consists of the min/max ratios between consecutive sampled values of the signal, subtracted from 1. Of course, careful consideration should be given to transient spikes and lengthy overshoots during mode transitions, as they would also be outliers of the formed set. Detecting these spikes and overshoots and filtering them will also prove significant for getting more accurate results in the event-based NILM problem.

Our proposed event detection algorithm consists of three main steps, 1) outlier detection, 2) filtered signal construction, and 3) event detection, which are discussed in the following subsections.

A. Outlier Detection

Different fields of research have dealt with the problem of outlier detection in a dataset, and different methods have been proposed to address it [24], [25]. We herein extend a statistics-based outlier detection method to suit the NILM problem. As shown in Algorithm 1, the algorithm starts by calculating the min/max ratio between any two consecutive sampled values of the signal $P(t)$, subtracting it from 1, and saving the result in a vector $M$. Then, the standard deviation of $M$ is computed. Finally, for any $t$, if $M(t)$ is greater than the calculated standard deviation, it is considered an outlier and $t$ is saved as the outlier occurrence instance in the vector $M_o$ of outlier instances.

Algorithm 1: Proposed algorithm for outlier detection.

Step 0: Initialize the parameters, $i = 1$, $M(:) = 0$ and $M_o(:) = 0$
Step 1: Get signal $P(t)$, $t = 1, \dots, T$
Step 2: Set $t = 1$
    while $t \leq T - 1$ do
        $S(t) = \{P(t), P(t+1)\}$
        $M(t) = 1 - \min(S(t)) / \max(S(t))$
        $t = t + 1$
Step 3: Compute $sd$ as the standard deviation of $M$
Step 4: Extract outliers' instances based on $sd$; set $t = 1$
    while $t \leq T - 1$ do
        if $M(t) > sd$ then
            $M_o(i) = t$; $i = i + 1$
        $t = t + 1$

Fig. 2: Construction of the filtered signal (outliers are substituted with the average of their following inliers; axes: sample vs. active power (W))

B. Filtered Signal Construction

Having performed outlier detection, outliers' instances are obtained, as well as instances that are not outliers, referred to as inlier instances. In this so-called filtering step, the aim is to flatten spikes and overshoots of the signal. To achieve this aim, the signal value at each outlier instance is substituted with the mean of the signal values at the following consecutive inlier instances, as shown in Fig. 2. Note that, unlike for spikes and overshoots, the signal values at actual event times will not experience significant change in the filtering step.
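The outlier detection of Algorithm 1 and the filtering step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, indices are 0-based, the population standard deviation is used (the text does not say which estimator is meant), a pair of zero-valued samples is treated as "no change", and the last sample, which has no ratio of its own, is treated as an inlier. The example signal is synthetic.

```python
from statistics import mean, pstdev


def change_ratios(signal):
    """M(t) = 1 - min/max for each pair of consecutive samples."""
    ratios = []
    for a, b in zip(signal, signal[1:]):
        hi = max(a, b)
        # Assumption: two zero-valued samples count as "no change".
        ratios.append(0.0 if hi == 0 else 1.0 - min(a, b) / hi)
    return ratios


def detect_outliers(signal):
    """Indices t whose ratio M(t) exceeds the standard deviation of M."""
    ratios = change_ratios(signal)
    if not ratios:
        return []
    sd = pstdev(ratios)  # assumption: population standard deviation
    return [t for t, m in enumerate(ratios) if m > sd]


def filter_signal(signal):
    """Replace each run of outliers by the mean of the inlier run after it."""
    outliers = set(detect_outliers(signal))
    filtered = list(signal)
    n = len(signal)
    t = 0
    while t < n:
        if t not in outliers:
            t += 1
            continue
        run_end = t  # [t, run_end) becomes a maximal run of outliers
        while run_end < n and run_end in outliers:
            run_end += 1
        inl_end = run_end  # [run_end, inl_end) is the following inlier run
        while inl_end < n and inl_end not in outliers:
            inl_end += 1
        if inl_end > run_end:
            level = mean(signal[run_end:inl_end])
            for k in range(t, run_end):
                filtered[k] = level
        t = run_end
    return filtered


# Synthetic example: a spike at sample 4, an overshoot at sample 9,
# and two genuine mode transitions (100 W -> 500 W -> 100 W).
signal = [100, 100, 100, 100, 600, 100, 100, 100, 100,
          1000, 500, 500, 500, 500, 100, 100, 100, 100]
filtered = filter_signal(signal)
events = detect_outliers(filtered)  # second pass on the filtered signal
```

On this synthetic signal the spike and the overshoot are flattened, and a second pass of `detect_outliers` over the filtered signal leaves only the two genuine transitions, which is the behavior that Subsection III-C relies on.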

C. Event Detection

In the final step, to detect events, Algorithm 1 is applied to the filtered signal. Now, all detected outlier instances are considered as event instances. We note that transient spikes and overshoots of the original signal have already been flattened in constructing the filtered signal, meaning that they can no longer be mistaken for events.

IV. FEATURE EXTRACTION

The most common specific feature of appliances utilized in NILM algorithms is their power consumption values at their operation modes. However, since different appliances may have operation modes with close power consumption values, an effective load disaggregation algorithm should use additional appliance features, broadly called fingerprints in [26]. In this section, we propose different methods to extract operation modes of appliances and additional features from a small training dataset.

A. Modes and States of Appliances

States of an appliance, defined as the power consumption intervals corresponding to its operation modes, are its most useful features and are widely used for the NILM purpose. Therefore, detecting the number of modes of an appliance and their corresponding states plays a crucial role in the accuracy of NILM. As opposed to most of the existing literature, which extracts an appliance's modes/states visually using its power consumption values in the training set or by using the datasheet of the appliance, we propose a novel clustering-based approach that extracts appliance modes/states from the power consumption signal in a systematic fashion.

Our approach is based on the linkage-Ward (LW) clustering algorithm [27]. The objective function of the LW algorithm is the sum of squared distances between data points and their cluster centroids. The LW clustering algorithm first treats each data point as a cluster of its own, which means that the initial value of the objective function is 0. Then, clusters are merged together, one pair at every stage, based on the following merging policy: clusters $A$ and $B$ are merged if $\Delta(A, B)$ is a minimum among all pairs of clusters, with

$$\Delta(A, B) = \sum_{p \in A \cup B} \| p - m_{A \cup B} \|^2 - \sum_{p \in A} \| p - m_A \|^2 - \sum_{p \in B} \| p - m_B \|^2 = \frac{n_A n_B}{n_A + n_B} \| m_A - m_B \|^2, \tag{1}$$

where $m_A$, $m_B$, and $m_{A \cup B}$ are the centroids of clusters $A$, $B$, and $A \cup B$, respectively, and $n_A$ and $n_B$ are the sizes of clusters $A$ and $B$. Note that $\Delta(A, B)$, which is non-negative, is in essence the cost of merging clusters $A$ and $B$. Thus, the value of the objective function at any given stage is the total cost of all the merging up to that stage. Analyzing the increasing objective function, the elbow method [11] is often used to determine the optimal number of clusters, that is, to determine when to stop merging. Roughly speaking, according to the elbow method, merging stops when it becomes too costly compared to the merging at the previous stage.

It can be seen from the definition of $\Delta(A, B)$ in (1) that the combination of the LW algorithm and the elbow method is susceptible to unbalanced data. More precisely, when the LW algorithm is close to reaching the optimal number of clusters, the elbow method discourages stopping where two or more small clusters exist, since merging them is not relatively "too costly" even though their centroids may be far apart. For the mode/state extraction purpose, this could be a significant issue, as infrequently occurring modes of an appliance may be lumped into one mode with a wide-ranging power consumption, which would undermine any attempt at NILM. Therefore, one should take advantage of the LW algorithm in a way that suits the NILM purpose, by discouraging the merging of clusters that are far apart. Thus, the following algorithm is suggested for mode extraction.

After filtering the power consumption signal of each appliance in the training dataset, the LW algorithm is applied to the signal's data considering $K$ clusters ($K = 10$ in Section VI). The cluster centroids are then computed and sorted in descending order. From this stage forward, a different merging policy, called the distance-based policy, is adopted that only depends on the distance between cluster centroids. It starts by considering the cluster with the highest centroid as the root cluster. If its centroid's distance to the next highest centroid is less than 15% of the root cluster's centroid, the two clusters merge and the step is repeated considering the merged cluster as the new root cluster. Otherwise, the cluster with the next highest centroid is considered as the new root cluster and the step is repeated. The algorithm terminates when no further merging can take place. The flowchart of the proposed mode extraction method is illustrated in Fig. 3, where the centroids of the root cluster and the cluster with the next highest centroid are denoted as $C_r$ and $C_j$.

B. Transition Intervals of Appliances

Consider an arbitrary mode transition of an appliance from a (relatively) high state $H = [H_{min}, H_{max}]$ to a (relatively) low state $L = [L_{min}, L_{max}]$. Then, the interval of this transition is defined as

$$[H_{min} - L_{max},\; H_{max} - L_{min}]. \tag{2}$$

C. Transition Participation Indices

Two appliances with some overlapping transition intervals can be difficult to separate in the aggregated signal. The participation index of their transitions, defined in (3) and obtained from each appliance's usage pattern in the training set, may help separate these appliances. Given a transition $T$ of an appliance, its participation index $P_{par}$ is defined as

$$P_{par} = \frac{1}{N_{days}} \sum_{day=1}^{N_{days}} \frac{\text{number of } T \text{ in day}}{\text{number of events in day}}, \tag{3}$$

where $N_{days}$ is the total number of days of the training dataset in which the transition $T$ happened. In other words, this parameter measures the daily average contribution of a specific transition of an appliance to the events of the total signal.

D. Additional Features of Appliances

Analyzing the power consumption signals of appliances in the training dataset, one observes that some of the appliances exhibit very specific behavioral patterns. Some of these features, which can further help improve the accuracy of load disaggregation, are listed below.

• As shown in Fig. 4, the dishwasher exhibits the same pattern when it is ON. As this pattern is complex, using it may hurt the efficiency of NILM. However, one notices that in any given day, the dishwasher operates in either all or none of its modes. This simple characteristic will prove significant for NILM.
• Not all possible mode transitions of multi-mode appliances actually occur.
• Some appliances appear to have unique mode transition overshoots in their signals.
• The time period between two consecutive ON samples (with OFF samples in between) differs significantly between appliances.

Fig. 3: Flowchart for the proposed mode extraction algorithm

Fig. 4: Daily usage pattern of the dishwasher

V. CLASSIFICATION

In this section, a novel classification algorithm is proposed to address the NILM problem, that is, to determine which appliance mode transition each event in the aggregated power consumption signal corresponds to. To label a given event, the classifier first obtains all transition intervals containing the value of that event. As a simple example, consider two ON/OFF appliances 1 and 2, whose ON modes correspond to states [890,1000] and [970,1050], respectively. Given an event with value 980 in the aggregated test signal, the classifier first associates this event with the OFF-to-ON transition of appliance 1 as well as that of appliance 2. Afterwards, taking other specific features of each appliance into consideration along with the test signal's behavior near the event, a single label is assigned to the event.

Defining $N_e$ and $N_T$ as the total number of events in the test signal and the number of all possible appliance mode transitions, respectively, let $L_e$ be an $N_T \times N_e$ binary-valued matrix, with columns corresponding to events and rows corresponding to mode transitions, which represents the predicted labels for events of the test signal. An element 1 of $L_e$ indicates that its row's corresponding transition is a predicted label of its column's event. The proposed classification algorithm is detailed in 4 steps below.

Step 1:

Given an event of the test signal and a mode transition of an appliance, if the event value is within the transition interval, the respective element of $L_e$ is set to 1. Otherwise, it is set to 0. Since transition intervals may overlap, some events may be assigned multiple labels in this step. Some events may also remain unlabeled, which means that the matrix $L_e$ may have some all-zero columns. Each of these unlabeled events is then labeled with the transition whose interval is closest to the event value. The distance between a value and an interval is calculated as the minimum distance between the value and any point within the interval.

Step 2:

Analyzing the daily aggregated signal, it can be observed that in a majority of time samples, appliances are in their OFF modes. With that in mind, one should focus on the parts of the aggregated signal where at least one appliance is ON. In particular, given the aggregated signal, one obtains its cycles, where each cycle starts with an event succeeding an all-OFF sample and ends with the nearest event preceding an all-OFF sample. Fig. 5 illustrates how cycles of an aggregated signal are derived. Over each cycle, the labels assigned to the events should be compatible.

Fig. 5: Daily usage pattern of the refrigerator vs its cycles

To clarify what is meant by label compatibility over a cycle, consider an undirected graph in which nodes represent all mode vectors $\theta$, where each element $\theta_a$ of $\theta$ is a mode of appliance $a$. Two nodes are connected by an edge if their corresponding vectors differ in exactly one element. An edge thus represents a single appliance mode transition, while two different edges may correspond to the same transition. Now, a sequence of labels over a cycle is said to be compatible if, starting from the all-OFF node of the graph, one can walk according to the labels in the sequence and terminate at the all-OFF node. In other words, labels/transitions over a cycle are compatible if they form a cycle in the graph constructed above.

Using the compatibility condition described, the multiple labels assigned to some events can be narrowed down, as some labels are deemed inadmissible. More precisely, a label within a cycle is removed if it is not part of any compatible sequence of labels over the cycle. As an example, in cycle 1 of Fig. 5, after Step 1, assume that the first and second events are assigned single labels, which are mode transitions of appliance 1, while the third event is assigned two labels, one a mode transition of appliance 1 and one a mode transition of appliance 2. Then, in Step 2, the mode transition of appliance 2 is ruled out as a label of the third event since it is not part of any compatible sequence of labels from Step 1 over cycle 1. In other words, appliance 2 cannot possibly be ON when the third event occurs.

Step 3:

After Step 2's cycle-based label refinement, each event may still be assigned multiple labels. Considering the extracted specific features of some appliances as described in IV-D, some labels are removed from the multi-labeled events.

Step 4:

Finally, based on the participation indices of appliance transitions explained in IV-C, the most probable label is chosen for the multi-labeled events.

VI. SIMULATION STUDY

In this section, the accuracy and effectiveness of the algorithms proposed in Sections III–V for the NILM purpose are evaluated by applying them to a low-frequency dataset, which is practical to gather since it can be collected using existing smart meters [28]. The dataset considered consists of 28 days of power consumption data for seven appliances of house 1 in the REDD [21]. These appliances are the oven (OV), microwave (MW), kitchen outlets (KO), bathroom GFI (BGFI), washer/drier (W/D) with a high consumption state, refrigerator (RFG), and dishwasher (DW). The first five listed appliances only have ON and OFF modes, while the last two have more than two operation modes. Fig. 6 shows the power consumption signals of all seven appliances in a day of the dataset.

Fig. 6: Total consumption vs the consumption of each appliance

To measure the accuracy of the proposed classification, the F measure is used [29],

$$F_{measure} = \frac{2 \times TRP \times RC}{TRP + RC}, \tag{4}$$

$$TRP = \frac{TP}{TP + FP}, \quad RC = \frac{TP}{TP + FN}, \tag{5}$$

where $TRP$ and $RC$ are the precision and recall, $TP$ is the number of true positives, $FP$ is the number of false positives, and $FN$ is the number of false negatives.

In the following subsections, first the proposed filtering and event detection methods are applied to the training and test datasets. Then, based on the filtered signal of each appliance, their specific features are extracted. Finally, considering these features, the proposed classification technique is utilized to disaggregate the test signal.
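As a small numerical illustration of the F measure, the sketch below uses the standard definitions of precision, TP/(TP+FP), and recall, TP/(TP+FN). The function name and the counts in the usage note are hypothetical and are not results from the paper; returning 0 when there are no true positives is a convention assumed here.

```python
def f_measure(tp, fp, fn):
    """F measure from true-positive, false-positive, false-negative counts."""
    if tp == 0:
        # Assumed convention: no true positives gives a score of 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, an appliance with 8 correctly labeled events, 2 events wrongly attributed to it, and 2 of its events missed has precision 0.8 and recall 0.8, so `f_measure(8, 2, 2)` is 0.8 up to floating-point rounding.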

A. Signal Filtering and Event Detection

The filtering and event-detection method of Section III is applied to individual appliances' power consumption signals in the training dataset as well as to the aggregated signal in the test dataset. As an example, Fig. 7 illustrates the outliers of the signal, the overshoots and spikes in the signal, and the constructed filtered signal of the dishwasher for a period of time, respectively.

B. Feature Extraction

Having individual appliances' signals filtered, their features are extracted via the methods of Section IV as discussed below.

1) States of Each Appliance:

Applying the LW-based clustering method of Subsection IV-A to the filtered signal of each appliance, its modes and their corresponding states are obtained. One recalls that the merging policy is that of the LW algorithm until 10 clusters are obtained, then changes to the distance-based policy. As an example, Table I shows the 10 obtained clusters for modes/states of the dishwasher before the merging policy changes from the LW method, and Table II shows the final obtained states of all appliances.

Fig. 7: Filtering the dishwasher's signal: (a) inliers vs. outliers in 1-min/max ratios of consecutive samples, (b) detected outliers in the power consumption signal, (c) the original signal vs. the filtered signal

TABLE I: Minimum, maximum, and centroid of the 10 LW-obtained clusters for the dishwasher's modes/states (active power, W).

min     max     centroid
154     183     168
198     208     205
210     231     219
236     261     236
398     420     449
416     496     489
500     598     589
643     737     680
1078    1110    1099
1115    1247    1173
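The distance-based merging policy of Subsection IV-A can be sketched as a single top-down pass over the clusters sorted by centroid. This is only a sketch under stated assumptions: the initial clusters are assumed to be already available as lists of power samples (for example from a Ward-linkage routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"`, which is not used here), the function name is hypothetical, and the example clusters are synthetic rather than taken from REDD.

```python
from statistics import mean


def merge_by_distance(clusters, ratio=0.15):
    """Merge clusters whose centroids are closer than ratio * root centroid.

    clusters: non-empty lists of power samples (one list per cluster).
    Returns the resulting states as (min, max) intervals, highest first.
    """
    if not clusters:
        return []
    ordered = sorted(clusters, key=mean, reverse=True)
    modes = []
    root = list(ordered[0])
    for nxt in ordered[1:]:
        c_root = mean(root)
        if c_root - mean(nxt) < ratio * c_root:
            # Close enough: merge, and the merged cluster becomes the root.
            root = root + list(nxt)
        else:
            # Too far: the root is final, the next cluster is the new root.
            modes.append(root)
            root = list(nxt)
    modes.append(root)
    return [(min(m), max(m)) for m in modes]


# Synthetic clusters (hypothetical values, in W).
clusters = [[0, 0, 1], [150, 160, 170], [180, 190],
            [1000, 1020], [1100, 1120, 1140]]
states = merge_by_distance(clusters)
```

A single pass suffices in this sketch because merging only lowers the centroid of the current root, so a cluster that was already too far from the previous root stays too far after later merges.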

2) Transition Intervals of Appliances:

Intervals of probable mode transitions of all appliances are now derived using (2). When a mode transition is from or to the OFF mode of an appliance, its interval simply coincides with the state corresponding to the other appliance mode involved in that transition. Considering the probable transitions of multi-mode appliances, the non-trivial transition intervals [817-1049] and [155-240] should be considered for DW and RFG, respectively.
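The two non-trivial intervals quoted above can be recovered from the states listed in Table II. The sketch below is an illustration under an assumption: equation (2) is not reproduced in this section, so the interval of a transition between two non-OFF states is taken here to run from the smallest to the largest possible power difference between them. This choice is consistent with the two reported intervals, but it should not be read as the exact form of (2). The function names are hypothetical, and the code enumerates all state pairs rather than only the transitions judged probable from the training data.

```python
from itertools import combinations

# Non-OFF states (W) of the two multi-mode appliances, copied from Table II.
STATES = {
    "DW": [(157, 183), (198, 261), (398, 496), (643, 737), (1078, 1247)],
    "RFG": [(153, 183), (185, 260), (415, 425)],
}


def mode_transition_interval(low, high):
    """Range of the power change between two non-OFF states.

    `low` and `high` are (min, max) states with `low` below `high`.
    Assumed form: [high_min - low_max, high_max - low_min].
    """
    return (high[0] - low[1], high[1] - low[0])


def all_mode_transition_intervals(states):
    """Intervals for every pair of non-OFF states of one appliance."""
    ordered = sorted(states)
    return {
        (lo, hi): mode_transition_interval(lo, hi)
        for lo, hi in combinations(ordered, 2)
    }


dw = mode_transition_interval((198, 261), (1078, 1247))   # -> (817, 1049)
rfg = mode_transition_interval((185, 260), (415, 425))    # -> (155, 240)
```

Under this assumption, the DW interval corresponds to a transition between the states 198-261 and 1078-1247, and the RFG interval to a transition between the states 185-260 and 415-425.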

3) Transition Participation Index:

One recalls that the transition participation index is only used to separate appliance transitions with overlapping intervals. Table III shows the participation index for different groups of overlapping appliances.

TABLE II: States corresponding to non-OFF modes of appliances

App     States (W)
DW      157-183, 198-261, 398-496, 643-737, 1078-1247
RFG     153-183, 185-260, 415-425
MW      1439-1598
BGFI    1580-1620
KO      1064-1087
W/D     2641-3000
OV      4138-4157

TABLE III: Transition participation index

Overlapping transitions group    Corresponding app    P_par
1                                BGFI                 0.11
1                                MW                   0.20
2                                KO                   0.17
2                                DW                   0.73

TABLE IV: F measure for different methods

App        [20]    ML-KNN [30]    Proposed method
DW         0.59    -              0.96
RFG        1       0.88           0.81
MW         0.40    0.76           0.82
BGFI       0.97    0.88           0.91
KO         1       0.84           0.86
WD         0.92    0.94           1
OV         -       0.86           1
LIGHT      -       0.76           -
AVERAGE    0.81    0.80           0.90
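To illustrate how the values of Table III can separate appliances within an overlapping group, the sketch below picks the appliance whose calculated index lies nearest to its training value. The definition of the participation index is given earlier in the paper and is not reproduced here, so the calculated indices are treated as given inputs. The function name, the example inputs, and the reading of "close" as the smallest absolute difference are assumptions.

```python
# Training-set participation indices, copied from Table III.
P_PAR = {
    "group 1": {"BGFI": 0.11, "MW": 0.20},
    "group 2": {"KO": 0.17, "DW": 0.73},
}


def closest_appliance(group, calculated):
    """Return the appliance of `group` whose calculated participation
    index is nearest to its training P_par.

    `calculated` maps each appliance of the group to the index computed
    for the event under consideration (hypothetical inputs here).
    """
    reference = P_PAR[group]
    return min(reference, key=lambda app: abs(calculated[app] - reference[app]))


# Hypothetical calculated indices for one ambiguous event in group 2.
label = closest_appliance("group 2", {"KO": 0.40, "DW": 0.70})  # -> "DW"
```

In this example the DW value differs from its training index by 0.03 and the KO value by 0.23, so the event is labeled as a dishwasher transition.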

C. Classiﬁcation

To apply the proposed classification method, the events of the filtered test signal are first detected. Then, the classification-based algorithm is applied to the detected events based on the transition intervals of the appliances. Finally, after cycle-based label refinement, the following specific features of appliances are considered in order to refine the remaining events with multiple labels and choose the most probable label for each event:

1) Based on the power consumption signal of the dishwasher, one observes that all its modes occur during every ON cycle. Thus, if the dishwasher transition with interval [643-737], which happens not to overlap with any other transition interval, is not detected, the dishwasher can be assumed OFF during that whole day. Consequently, all candidate labels/transitions involving a non-OFF mode of the dishwasher in that day can be ruled out.
2) Three modes have been detected for the refrigerator, namely the OFF mode, the lower ON mode, and the higher ON mode. It is observed in the training dataset that the transition from the OFF mode to the higher ON mode never happens. Thus, all such candidate labels can be removed.
3) Overshoot values appearing in the refrigerator's signal are higher than 500 W, as opposed to those of the dishwasher. This can be used to separate transitions of the refrigerator and the dishwasher with overlapping intervals.

After applying the aforementioned features, for each remaining event with multiple labels, the participation index is calculated for each of the overlapping appliances separately. The event is then assigned to the appliance whose calculated participation index is close to its P_par obtained from the training dataset. Table IV shows the high accuracy of our classification method for each appliance in comparison with the results of [20] and [30]. Keeping in mind that a higher number of appliances should diminish the accuracy of NILM, one notes that the numbers of appliances considered in this work and in [20] are seven and six, respectively. On the other hand, considering multi-mode appliances such as the dishwasher increases the complexity of disaggregation. Nevertheless, the accuracy of our proposed method, in which the dishwasher is considered, is higher than that of [30], which did not consider the dishwasher in its set of appliances.

VII. CONCLUDING REMARKS AND FUTURE WORK

In this paper, we have proposed a novel classification method to address the NILM problem given a small dataset. The proposed algorithm has three main phases: 1) filtering the training and test signals and accurately detecting their events using a statistics-based method, 2) extracting features of appliances, most notably their modes and states, via a clustering approach that in part uses the LW clustering method, and 3) labeling events of the aggregated test signal with mode transitions of appliances via a classification algorithm, where various features and techniques are utilized to enhance its accuracy. The proposed event detection and modes/states extraction methods have been designed in a systematic fashion so as to perform well for any set of power consumption data. The proposed filtering method, feature extraction techniques, and event-based NILM classification algorithm have been validated using REDD. Results show 100% accuracy in event detection. Juxtaposing the results of our classification algorithm with those of two recently introduced event-based NILM methods indicates a relatively high accuracy of our algorithm.

Reconstructing the power consumption signals of appliances, which can be cast as a regression problem as opposed to the classification problem we considered in this work, is one of the main challenges of event-based NILM problems. We aim to modify our algorithms, mainly the one in Section V, to address the reconstruction problem. Moreover, due to the lack of a training dataset for each residential building, we wish to move a step further and use transfer learning to circumvent the training phase of the proposed method, with the practical assumption that nominal values for appliances' power consumption are given.

REFERENCES

[1] M. S. H. Nizami, M. J. Hossain, and E. Fernandez, “Multiagent-based transactive energy management systems for residential buildings with distributed energy resources,” IEEE Trans. Industrial Informatics, vol. 16, no. 3, pp. 1836–1847, 2020.
[2] M. Lu and Z. Li, “A hybrid event detection approach for non-intrusive load monitoring,” IEEE Trans. Smart Grid, vol. 11, no. 1, pp. 528–540, 2019.
[3] T. Ji, L. Liu, T. Wang, W. Lin, M. Li, and Q. Wu, “Non-intrusive load monitoring using additive factorial approximate maximum a posteriori based on iterative fuzzy c-means,” IEEE Trans. Smart Grid, vol. 10, no. 6, pp. 6667–6677, 2019.
[4] G. W. Hart, “Nonintrusive appliance load monitoring,” Proceedings of the IEEE, vol. 80, no. 12, pp. 1870–1891, 1992.
[5] V. Rathore and S. K. Jain, “Non intrusive load monitoring and load disaggregation using transient data analysis,” in . IEEE, 2018, pp. 1–5.
[6] B. Zhao, K. He, L. Stankovic, and V. Stankovic, “Improving event-based non-intrusive load monitoring using graph signal processing,” IEEE Access, vol. 6, pp. 53 944–53 959, 2018.
[7] Q. Liu, K. M. Kamoto, X. Liu, M. Sun, and N. Linge, “Low-complexity non-intrusive load monitoring using unsupervised learning and generalized appliance models,” IEEE Trans. Consumer Electronics, vol. 65, no. 1, pp. 28–37, 2019.
[8] S. Singh and A. Majumdar, “Deep sparse coding for non–intrusive load monitoring,” IEEE Trans. Smart Grid, vol. 9, no. 5, pp. 4669–4678, 2017.
[9] R. Bonfigli, E. Principi, M. Fagiani, M. Severini, S. Squartini, and F. Piazza, “Non-intrusive load monitoring by using active and reactive power in additive factorial hidden markov models,” Applied Energy, vol. 208, pp. 1590–1607, 2017.
[10] K. He, D. Jakovetic, B. Zhao, V. Stankovic, L. Stankovic, and S. Cheng, “A generic optimisation-based approach for improving non-intrusive load monitoring,” IEEE Trans. Smart Grid, vol. 10, no. 6, pp. 6472–6480, 2019.
[11] E. Azizi, A. M. Shotorbani, M.-T. Hamidi-Beheshti, B. Mohammadi-Ivatloo, and S. Bolouki, “Residential household non-intrusive load monitoring via smart event-based optimization,” IEEE Trans. Consumer Electronics, vol. 66, no. 3, pp. 233–241, 2020.
[12] B. Zhao, M. Ye, L. Stankovic, and V. Stankovic, “Non-intrusive load disaggregation solutions for very low-rate smart meter data,” Applied Energy, vol. 268, p. 114949, 2020.
[13] N. Henao, K. Agbossou, S. Kelouwani, Y. Dubé, and M. Fournier, “Approach in nonintrusive type i load monitoring using subtractive clustering,” IEEE Trans. Smart Grid, vol. 8, no. 2, pp. 812–821, 2015.
[14] W. Kong, Z. Y. Dong, J. Ma, D. J. Hill, J. Zhao, and F. Luo, “An extensible approach for non-intrusive load disaggregation with smart meter data,” IEEE Trans. Smart Grid, vol. 9, no. 4, pp. 3362–3372, 2016.
[15] C. Dinesh, S. Makonin, and I. V. Bajić, “Residential power forecasting using load identification and graph spectral clustering,” IEEE Trans. Circuits and Systems II: Express Briefs, vol. 66, no. 11, pp. 1900–1904, 2019.
[16] S. M. Tabatabaei, S. Dick, and W. Xu, “Toward non-intrusive load monitoring via multi-label classification,” IEEE Trans. Smart Grid, vol. 8, no. 1, pp. 26–40, 2017.
[17] L. Massidda, M. Marrocu, and S. Manca, “Non-intrusive load disaggregation by convolutional neural network and multilabel classification,” Applied Sciences, vol. 10, no. 4, p. 1454, 2020.
[18] V. Singhal, J. Maggu, and A. Majumdar, “Simultaneous detection of multiple appliances from smart-meter measurements via multi-label consistent deep dictionary learning and deep transform learning,” IEEE Trans. Smart Grid, 2018.
[19] C. C. Yang, C. S. Soh, and V. V. Yap, “A systematic approach in load disaggregation utilizing a multi-stage classification algorithm for consumer electrical appliances classification,” Frontiers in Energy, vol. 13, no. 2, pp. 386–398, 2019.
[20] S. Dash, R. Sodhi, and B. Sodhi, “An appliance load disaggregation scheme using automatic state detection enabled enhanced integer-programming,” IEEE Trans. Industrial Informatics, 2020.
[21] J. Z. Kolter and M. J. Johnson, “Redd: A public data set for energy disaggregation research,” in Workshop on Data Mining Applications in Sustainability (SIGKDD), San Diego, CA, vol. 25, no. Citeseer, 2011, pp. 59–62.
[22] N. Henao, K. Agbossou, S. Kelouwani, Y. Dubé, and M. Fournier, “Approach in nonintrusive type i load monitoring using subtractive clustering,” IEEE Trans. Smart Grid, vol. 8, no. 2, pp. 812–821, 2017.
[23] T. Lu, Z. Xu, and B. Huang, “An event-based nonintrusive load monitoring approach: Using the simplified viterbi algorithm,” IEEE Pervasive Computing, vol. 16, no. 4, pp. 54–61, 2017.
[24] S. Kandanaarachchi, M. A. Muñoz, R. J. Hyndman, and K. Smith-Miles, “On normalization and algorithm selection for unsupervised outlier detection,” Data Mining and Knowledge Discovery, vol. 34, no. 2, pp. 309–354, 2020.
[25] Y. Chakhchoukh, S. Liu, M. Sugiyama, and H. Ishii, “Statistical outlier detection for diagnosis of cyber attacks in power state estimation,” in . IEEE, 2016, pp. 1–5.
[26] M. Ma, W. Lin, J. Zhang, P. Wang, Y. Zhou, and X. Liang, “Toward energy-awareness smart building: Discover the fingerprint of your electrical appliances,” IEEE Trans. Industrial Informatics, vol. 14, no. 4, pp. 1458–1468, 2017.
[27] E. Alpaydin, Introduction to machine learning. MIT press, 2020.
[28] A. Elnaz, S. Amin M, B. Mohammad TH, I. Behnam M, and B. Sadegh, “Residential household non-intrusive load monitoring via smart event-based optimization,” IEEE Trans. Consumer Electronics, vol. 10, no. 4, pp. 4615–4627, 2020.
[29] Z. Xu, W. Chen, and Q. Wang, “A new non-intrusive load monitoring algorithm based on event matching,” IEEE Access, vol. 7, pp. 55 966–55 973, 2019.
[30] D. Li and S. Dick, “Residential household non-intrusive load monitoring via graph-based multi-label semi-supervised learning,”