An Online Approach to Cyberattack Detection and Localization in Smart Grid

Abstract

Complex interconnections between information technology and digital control systems have significantly increased cybersecurity vulnerabilities in smart grids. Cyberattacks involving data integrity can be very disruptive because of their potential to compromise physical control by manipulating measurement data. This is especially true in large and complex electric networks that often rely on traditional intrusion detection systems focused on monitoring network traffic. In this paper, we develop an online detection algorithm to detect and localize covert attacks on smart grids. Using a network system model, we develop a theoretical framework by characterizing a covert attack on a generator bus in the network as sparse features in the state-estimation residuals. We leverage such sparsity via a regularized linear regression method to detect and localize covert attacks based on the regression coefficients. We conduct a comprehensive numerical study on both linear and nonlinear system models to validate our proposed method. The results show that our method outperforms conventional methods in both detection delay and localization accuracy.


Dan Li, Student Member, IEEE, Nagi Gebraeel, Kamran Paynabar, and A. P. Sakis Meliopoulos


Index Terms: Cybersecurity, ISO, State Estimation, Detection, Localization

I. INTRODUCTION

The growing digitization of smart grids and the infusion of IoT technologies have exposed numerous cybersecurity vulnerabilities [5], [11]. Cybersecurity of smart grids has been studied extensively. Common areas of interest include data confidentiality (eavesdropping, phishing, spoofing) and availability (flooding, denial of service (DoS), and distributed DoS) [8]. A sizeable research effort is focused on cyberattacks that target data integrity in smart grid applications. Data integrity cyberattacks refer to the manipulation of sensor measurements (namely, false data injections [15]) and control actions, as in replay [6] and covert attacks [18], where the attacker's intent is often to alter normal system operations and cause physical damage. Data integrity cyberattacks can be especially disruptive to grid operations; consider, for example, the 2015 Ukrainian blackout [5]. Recent research has shown that data integrity attacks can successfully bypass conventional detection schemes such as bad-data detection [15], [16]. Consequently, this paper focuses on developing a detection scheme for such cyberattacks in smart grid applications.

Aside from detection, cyberattack localization is a major challenge in the smart grid. Due to the sheer scale coupled with complex and dynamic interactions between cyber and physical components of the grid, a cyberattack on one part of the network can propagate very rapidly. Hypothesis testing has traditionally been the de facto approach for detecting and identifying the locations of sensors with anomalous data signatures [1], [3], [21]. As the size of the network grows, so do the number of sensors and the frequency of hypothesis tests required to identify the anomalous sensors. This can create significant statistical and computational challenges. Statistically, it is hard to make inferences (interpret the test statistics) on a large number of hypothesis tests, especially when they are dependent. The false alarm rate also increases, and correction methods can be conservative. Computationally, with the number of affected sensors unknown, the number of hypothesis tests grows exponentially with the number of sensors. Thus, with a large number of sensors in the smart grid, hypothesis testing becomes intractable. Graph-based approaches have also been proposed for localization. They often require a hierarchical partition of the network and are often computationally expensive [9], [17]. These challenges provide an opportunity to develop an integrated detection and localization methodology that is statistically interpretable and computationally efficient. Specifically, we extract the data feature that contains information about both the abnormality caused by a covert attack and the location of the attack. The abnormality is identified by the magnitude of the extracted feature, and the location is identified by its sparsity.

Dan Li, Nagi Gebraeel, Kamran Paynabar, and A. P. Sakis Meliopoulos are with Georgia Institute of Technology, Atlanta, GA, USA.

A. Related Work

Models for detecting data integrity cyberattacks can be classified into two groups. The first group uses network traffic from cyber communications to detect attacks on the smart grid [14], [24], [26]. These methods are based on detecting abnormalities in the network traffic data and are often similar in their operation to methods for detecting DoS attacks.

The second group couples model-based detection with sensor data. A mathematical model is used to represent the normal behavior of the system. Attacks are detected using discrepancies between model predictions and actual system observations [10], [13], [19], [20], [27]. Most of these methods are based on new designs of the detector, i.e., redefining the test statistic for detection [13], [19], [20], [27]. For example, in [13], a CUSUM statistic is used to capture the cumulative error, and in [27], the test statistic is extracted from the residuals of a robust state estimator. Another commonly used model-based approach is to use authenticating data signatures to periodically verify the system's state [10]. Covert attacks, where an attacker has sufficient knowledge of the system, can still be missed by these approaches.

Literature related to localizing cyberattacks in smart grid applications is very limited. This is especially the case for data integrity attacks. One of the main challenges in localizing cyberattacks in smart grids is the complexity of the physical interactions. An attack on one node can quickly propagate through the network, making it difficult to associate the anomaly generated by the attack with its origin in the network. In [17], a graphical model is used to locate attacks, which are modeled as disturbances. However, it is not clear how this method can be extended to attacks that simultaneously manipulate the sensor data. In [23], the attack localization is coupled with distributed state estimation, where each region shares its belief of the attack localization. In this paper, we use a centralized state estimation configuration to facilitate attack localization. The centralized approach utilizes the information from the neighboring regions that can be affected by the attacked region to locate the cyberattacks more accurately.

B. Contributions

To the best of our knowledge, this is the first work that focuses on detecting covert attacks in smart grids. The paper focuses on power systems consisting of an Independent System Operator (ISO) and multiple Regional Control Centers (RCCs). We develop a method to detect and localize a covert attack on an RCC in (near) real time. This is accomplished by analyzing residuals from the ISO state estimation in real time. We demonstrate the effectiveness of our proposed methodology through a simulation study on an IEEE 14-bus system.

Our main contributions are summarized as follows:
1. We develop a generalized framework to model covert attacks on a regional control center.
2. We build an online covert attack detection mechanism by investigating and formally modeling the characteristics of the residuals from various sensor measurements under normal operations and under covert attacks.
3. We derive the impact of a covert attack on the neighboring regions of the targeted RCC. This serves as the basis of our online attack localization scheme to identify which RCC is under attack.
4. Specifically, we leverage the unique sparse structure of the system residuals to locate the covert attack at the level of the individual generator bus. This is achieved by utilizing Sparse Group Lasso to enable efficient feature extraction.

The rest of the paper is organized as follows: In Section II, we introduce the problem setup, including the system model, the attack model, and the Sparse Group Lasso (SGL) problem. In Section III, we develop our methodology and propose two detection algorithms for linear and nonlinear systems, respectively. In Section IV, we conduct a numerical study and present the results. Finally, Section V concludes the paper.

II. PROBLEM SETUP

A. System Model

We consider an $N$-bus power transmission system comprised of power plants and substations that are grouped in $K$ different regions (an example with $N = 14$ and $K = 3$ is shown in Figure 1). An ISO acts as a centralized coordinator that manages and controls the electric transmission of the power network. The power generation plants operate under the control of regional control centers (RCCs). The global system state $x \in \mathbb{R}^n$ ($n = 2N - 1$ under the $N$-bus setting) is defined as $x^T = [x_1^T, \ldots, x_N^T]$, where $x_i$ represents the state of bus $i$. In most cases, $x_i$ is defined as the voltage and phase angle of bus $i$. Without loss of generality, the phase angle of the reference bus is set equal to 0. Hence, $x_1 \in \mathbb{R}$ when bus 1 is the reference bus, and $x_i \in \mathbb{R}^2$ for $i = 2, \ldots, N$. We assume there are $m$ ($m \gg n$) sensors in the global system that guarantee the observability of the system. We denote the vector of measurements as $z \in \mathbb{R}^m$, which may include the line power flows, the bus voltages and/or currents, the loads on all the load buses, and the generation powers of all the generator buses. The measurements are related to the state through a (nonlinear) measurement function

$$z = h(x), \tag{1}$$

and the state estimate is obtained by solving the following optimization problem:

$$\min_x \; (z - h(x))^T \Sigma^{-1} (z - h(x)), \tag{2}$$

where $\Sigma$ is the diagonal matrix of the sensor measurement precisions.

For a nonlinear model, the above problem is solved using Newton's method, and the state estimate $\hat{x}$ is calculated iteratively as follows:

$$\hat{x}^{\nu+1} = \hat{x}^{\nu} + (J^T \Sigma^{-1} J)^{-1} J^T \Sigma^{-1} (z - h(\hat{x}^{\nu})), \tag{3}$$

where $J$ is the Jacobian matrix of $h(\cdot)$ evaluated at $\hat{x}^{\nu}$, and $\nu$ is the iteration index.

However, if the system operates around a state $x_0$, the model can be properly linearized around $x_0$ [17]. The state $x_0$ can be obtained from a recent state estimation, which remains valid for multiple observations.
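The Newton-type iteration in (3) can be sketched in a few lines. The following is only an illustrative sketch under stated assumptions: the function name `gauss_newton_wls`, the toy measurement function `h_toy`, and its Jacobian `jac_toy` are hypothetical and not part of the paper, and the weight vector plays the role of the diagonal of $\Sigma^{-1}$.

```python
import numpy as np

def gauss_newton_wls(h, jac, z, weights, x_init, tol=1e-8, max_iter=50):
    """Weighted least-squares state estimate: min_x (z - h(x))^T W (z - h(x))."""
    x = np.asarray(x_init, dtype=float).copy()
    W = np.diag(weights)              # diagonal weights (Sigma^{-1} in the text)
    for _ in range(max_iter):
        J = jac(x)                    # Jacobian of h at the current iterate
        resid = z - h(x)              # measurement residual
        step = np.linalg.solve(J.T @ W @ J, J.T @ W @ resid)
        x = x + step                  # Gauss-Newton update
        if np.linalg.norm(step) < tol:
            break
    return x

# Hypothetical toy model: 2 state variables observed by 4 sensors.
def h_toy(x):
    return np.array([x[0] ** 2, x[0] * x[1], x[1], x[0] + x[1]])

def jac_toy(x):
    return np.array([[2 * x[0], 0.0],
                     [x[1], x[0]],
                     [0.0, 1.0],
                     [1.0, 1.0]])

x_true = np.array([1.0, 0.5])
z_toy = h_toy(x_true)                 # noise-free measurements
x_hat = gauss_newton_wls(h_toy, jac_toy, z_toy, np.ones(4), x_init=[0.8, 0.3])
```

Near a fixed operating point the Jacobian changes little between iterations, which is what makes a single linearized model a reasonable substitute for repeated Newton steps.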
The generalized linearization has the following form:

$$z = Hx, \tag{4}$$

where $z$ and $x$ are redefined as their linearizations around $z_0$ and $x_0$, and $H$ is the measurement matrix derived from $x_0$. In this case, the above state estimation problem is solved as a linear regression problem. That is,

$$\hat{x} = (H^T H)^{-1} H^T z. \tag{5}$$

This approach is commonly used in the literature. For example, a state-space model is used in [10], and a linear regression model is used in [25].

B. Covert Attack

A covert attack is a cyberattack that maliciously manipulates system controls while hiding the manipulation by altering the sensor measurements. The mechanism of a covert attack on a steady-state linear system was first proposed in [18]. Sensor data and control actions were manipulated by injecting two bias terms, which were assumed to be linearly dependent. In [12], the authors proposed a covert attack against a linear dynamic system. The proposed attacks were proved to be undetectable when all the sensors in the system are vulnerable to manipulation.

In contrast, this paper introduces a covert attack on a nonlinear system. We provide a more generalized definition of a covert attack that is applicable to both linear and nonlinear systems. For both types of systems, we assume that an attacker has full knowledge of the system. With this knowledge and access to controllers, the attacker can manipulate control actions arbitrarily. In addition, with access to the sensors, the attacker can simultaneously manipulate sensor data to disguise the control actions. We also assume that an attacker can only access one power plant at a time.

Fig. 1: The Graphical Model of the Smart Grid (IEEE 14 bus)

In this work, we consider the following covert attack vector:

1. The attacker reads and manipulates the control of a generator $i$, such that the state (power and/or voltage) of generator $i$ is altered. That is,

$$x_i^a = x_i + \beta_i, \tag{6}$$

where $x_i$ is the original state of generator bus $i$, $\beta_i$ is the shift of the state caused by the covert attack, and $x_i^a$ is the state of bus $i$ under attack.

2. The attacker simulates the expected sensor measurements corresponding to the "normal" state of the generator using the acquired knowledge of the system model:

$$z = h(x). \tag{7}$$

3. The attacker manipulates the sensor measurements associated with generator $i$ by replacing them with the simulated sensor measurements:

$$z_k^a = z_k \quad \forall k \in M_i, \tag{8}$$

where $M_i$ is the set of sensors connected to generator $i$, including those that measure the state of bus $i$ and those that are in the neighborhood of bus $i$.

If an attacker were to orchestrate such an attack vector, it is highly unlikely that the attack would be detected by existing detection models. Note that all the sensors in $M_i$ that would otherwise have registered an abnormality in the state $x_i$ have been manipulated by the attacker, i.e., their readings indicate normal system behavior. The covert attack has been proven to be undetectable for linear systems when the attacker has access to all the sensors and full knowledge of the system dynamics [18]. In [22], the authors designed a detection method for scenarios where the attacker has limited access to the sensors, which is similar to our assumption here.
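The three steps of the attack vector in (6)-(8) can be sketched as follows. This is a minimal illustration under assumptions rather than the authors' simulation code: the helper name `covert_attack_readings`, the toy matrix `H_toy`, and all numbers are hypothetical, and a linear measurement function is used only to keep the example small.

```python
import numpy as np

def covert_attack_readings(h, x, beta, attacked_sensors):
    """Return the sensor vector reported under the covert attack of (6)-(8)."""
    x = np.asarray(x, dtype=float)
    z_normal = h(x)                     # step 2: simulated "normal" readings, Eq. (7)
    z_reported = h(x + beta).copy()     # step 1: readings produced by the shifted state, Eq. (6)
    # step 3: overwrite the sensors in M_i with the simulated normal values, Eq. (8)
    z_reported[attacked_sensors] = z_normal[attacked_sensors]
    return z_reported

# Hypothetical toy system: 5 sensors, 3 state variables.
H_toy = np.array([[1.0, 0.0, 0.0],
                  [0.8, 0.1, 0.0],
                  [0.2, 1.0, 0.0],
                  [0.0, 0.3, 1.0],
                  [0.1, 0.0, 0.9]])
x_toy = np.array([1.0, 2.0, 3.0])
beta_toy = np.array([0.5, 0.0, 0.0])    # the attack shifts only the first state variable
M_i = [0, 1]                             # sensors the attacker can overwrite

z_a = covert_attack_readings(lambda v: H_toy @ v, x_toy, beta_toy, M_i)
deviation = z_a - H_toy @ x_toy          # zero on M_i, nonzero only on uncovered sensors
```

In this toy case the sensors in `M_i` report exactly the normal values, while the sensors outside `M_i` that depend on the shifted state variable still carry a small trace of the attack.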
The limited-access setting in [22], however, implies that a subset of the sensors is always protected from manipulation. In contrast, this paper assumes that none of the sensors is immune to manipulation. Instead, we assume the attacker can only access sensors related to the targeted generator bus. We believe that this is a reasonable assumption since it is highly unlikely that an attacker would access all sensors in a large system simultaneously. By manipulating only the related sensors, the attacker is able to bypass traditional bad data detection schemes, such as $\chi^2$ detection. For example, in Figure 2 we plot the $\chi^2$ statistic of a system that experiences a covert attack from time $t = 1000$ onward. As shown in the figure, there is no significant change before and after the onset of the attack.

Fig. 2: The $\chi^2$ detector fails to detect the covert attack (the $\chi^2$ statistic plotted over time, with the threshold and the onset of attack marked)

III. METHODOLOGY

In this section, we develop an online approach to detect and localize covert attacks in a smart grid. Our problem setting considers a smart grid comprised of multiple regions. Each region consists of generator buses that can experience a covert attack. For illustrative purposes, Figure 1 shows a smart grid comprised of three regions. We assume that an ISO collects all the sensor data from all the regions and that the topology of the network is known. Our detection and localization algorithm is coupled with the smart grid state estimation at the ISO level.

In this paper, we consider a scenario where only one generator bus experiences a covert attack (note that our model can be extended to multiple buses). If the true state of the system is known, then residuals derived from sensor observations can be utilized to detect a covert attack. In this case, we expect that only the subset of sensors connected to the generator bus will display relatively large residuals. This will produce significant sparsity in the residuals. Sparse Group Lasso (SGL) can be used to extract relevant (sparse) features that can detect and localize the covert attack. However, the true state of the system is typically unknown. Consequently, sparse features are used to correct the state estimation by coupling the observation model in Eq. (1) as a constraint to the SGL problem. This will be formalized later in Eq. (16). In the following subsections, we discuss the development of our methodology on a linear approximation of the system. Next, we relax the linear assumption and extend the methodology to the nonlinear setting.

A. Sparse Group Lasso

The formulation of sparse group Lasso was proposed in [7] as an advancement of the group Lasso problem, which selects the group(s), among $L$ groups of predictors $X_1, \ldots, X_L$, that explain the variation in the data $y$:

$$\min \Big( \big\| y - \sum_{l=1}^{L} X_l \beta_l \big\|_2^2 + \lambda \sum_{l=1}^{L} \sqrt{p_l} \, \|\beta_l\|_2 \Big), \tag{9}$$

where $p_l$ is the group size, and $\|\cdot\|_2$ is the $L_2$ (Euclidean) norm. The penalty term $\lambda \sum_{l=1}^{L} \sqrt{p_l} \, \|\beta_l\|_2$ yields sparsity at the group level. The sparse group Lasso considers within-group sparsity in addition to the group-level sparsity and solves the following optimization problem:

$$\min \Big( \big\| y - \sum_{l=1}^{L} X_l \beta_l \big\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \sum_{l=1}^{L} \|\beta_l\|_2 \Big), \tag{10}$$

where the $L_1$ penalty term $\lambda_1 \|\beta\|_1$ yields the element-wise (within-group) penalty. The SGL problem can be solved by block coordinate descent. The algorithm is given in [7].

B. Linear System

We first hypothesize a simplified linearized model of the system, where the system operates around some state $x_0$, and the measurement function is linearized in the form of

$$z = h(x) = Hx, \tag{11}$$

where $H = h'(x_0)$ is a known constant matrix, which is derived from the steady operating point $x_0$. As given in Section II-B,

$$z' = H(x + \beta) = z + H\beta, \tag{12}$$

where $z'$ is the vector of measurements under covert attack before the attacker manipulates the sensor data, and $\beta$ is sparse, such that $\beta_i \neq 0$ and $\beta_j = 0$ for all $j \neq i$, because the attack only changes the state of generator $i$. The measurements $z^a$ after the attacker manipulates the sensor data, according to (8), are

$$z_j^a = z_j, \quad \forall j \in M_i, \tag{13}$$

$$z_j^a = z'_j, \quad \forall j \in M_i^C, \tag{14}$$

where $M_i$ represents the set of sensors in the neighborhood of plant $i$, and $M_i^C$ is the complement of set $M_i$. From (12)-(14), we have

$$z^a = z + B_i \beta_i, \tag{15}$$

where $B_i \in \mathbb{R}^{m \times |S_i|}$ is defined as follows:

$$B_i[M_i^C, \cdot] = H[M_i^C, S_i], \qquad B_i[M_i, \cdot] = 0,$$

with $B[S, \cdot]$ representing the rows in set $S$ of matrix $B$, and $B[\cdot, S]$ representing the columns in set $S$ of matrix $B$. In the above equations, $S_i$ denotes the set of elements corresponding to the state of bus $i$, $x_i$, in the state vector $x$. In general, the matrix $B_i$ shows the relation between the state of bus $i$ and the measurements of the sensors that are not directly connected to bus $i$.

When the system is complex, each state is correlated with (different) multiple sensors. Notice that $B_i$ is obtained by taking some columns of $H$ and setting a subset of elements to zero. This transformation is nonlinear since it is element-wise and sparse, meaning there is a very high chance that $B_i$ is linearly independent of the column space of the matrix $H$. Therefore, when we know that bus $i$ is under attack, we could estimate $\beta$ in the following way:

1. Project the measurement onto the column space of matrix $H$ to estimate the state $\hat{x}$ (i.e., solve the linear regression problem using (5)).
2. Project the residuals $r = z - H\hat{x}$ onto the column space of matrix $B_i$. The solution is $\hat{\beta}_i$, which can be expressed as $\hat{\beta}_i = (B_i^T B_i)^{-1} B_i^T r$.

Note that the state estimate in step 1 may be inaccurate because of leverage points. Therefore, the two steps should be done iteratively by removing the explained residuals ($B_i \hat{\beta}_i$) from $z$ and re-estimating $\hat{x}$. That is, in the second iteration, we first update the measurement to its corrected value by subtracting the explained residuals, i.e., $z_c = z - B_i \hat{\beta}_i$, and then estimate $\hat{x}$. The residuals $r$ in step 2 are still calculated based on the original measurement $z$. After we get the projection of the new residuals, we correct the measurement again using $z_c \leftarrow z_c - B_i \hat{\beta}_i$. These two steps are repeated until the estimated states converge.

The above solution is only valid when it is known that bus $i$ is under attack. When this is unknown, we need to locate the attack. Note that in the above solution, $B_i$ can be treated as the basis for bus $i$. Therefore, one can find the bases for all the buses, and the problem can be formulated as finding the basis among $B_i$ for all $i$ that best explains the residuals $r$. Since it is likely that only a subset of the states of bus $i$ is altered (e.g., when there are multiple generators in one power station, the attacker might only attack a subset of the generators, or the attacker might only change the power without changing the voltage), $\beta_i$ would be sparse.
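For the case where the attacked bus is known, the two-step projection described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names `build_basis` and `estimate_shift` and the toy matrix are hypothetical, and the corrected measurement is recomputed from the original data as $z_c = z - B_i \hat{\beta}_i$ in every pass, which is an assumption about how the correction is applied.

```python
import numpy as np

def build_basis(H, state_idx, attacked_sensors):
    """B_i: the columns S_i of H with the rows in M_i set to zero."""
    B = H[:, state_idx].copy()
    B[attacked_sensors, :] = 0.0
    return B

def estimate_shift(z, H, B, tol=1e-10, max_iter=100):
    """Alternate between state estimation (step 1) and residual projection (step 2)."""
    z_c = z.copy()
    x_old = None
    beta_hat = np.zeros(B.shape[1])
    for _ in range(max_iter):
        x_hat, *_ = np.linalg.lstsq(H, z_c, rcond=None)   # step 1 on the corrected data
        r = z - H @ x_hat                                  # residuals of the original data
        beta_hat, *_ = np.linalg.lstsq(B, r, rcond=None)  # step 2: project onto col(B_i)
        z_c = z - B @ beta_hat                             # remove the explained residuals
        if x_old is not None and np.linalg.norm(x_hat - x_old) < tol:
            break
        x_old = x_hat
    return x_hat, beta_hat

# Hypothetical toy system: 6 sensors, 2 state variables, attack on the first one.
H_toy = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0],
                  [0.0, 1.0], [0.0, 1.0], [1.0, -1.0]])
B_toy = build_basis(H_toy, state_idx=[0], attacked_sensors=[0, 1])
x_true = np.array([1.0, 2.0])
beta_true = np.array([0.3])
z_obs = H_toy @ x_true + B_toy @ beta_true   # noise-free readings under attack, Eq. (15)
x_hat, beta_hat = estimate_shift(z_obs, H_toy, B_toy)
```

The alternation behaves like a block coordinate descent on the joint least-squares problem in $(x, \beta_i)$, so it is only expected to converge when the columns of $B_i$ are not contained in the column space of $H$.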
Dividing the elements of $\beta$ into groups according to the states of each node $i$, the estimated $\beta$ should therefore be: 1) between-group sparse, meaning there should be only one basis $B_i$ that properly explains the residuals, and 2) within-group sparse, meaning it is very likely that only a subset of the elements in $x_i$ is altered, which also means only a subset of the columns in basis $B_i$ is important in explaining the variation of the residuals.

The above problem can then be formulated as a Sparse Group Lasso (SGL) with a linear constraint in the following form:

$$\min_{\beta_1, \ldots, \beta_L, \hat{x}} \; \big\| r - \sum_{i=1}^{L} B_i \beta_i \big\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \sum_{i=1}^{L} \|\beta_i\|_2 \tag{16}$$

such that

$$H\hat{x} + r = z, \tag{17}$$

where $\lambda_1 \|\beta\|_1$ is the $L_1$ penalty term that encourages within-group sparsity, and the $L_2$ penalty term $\lambda_2 \sum_{i=1}^{L} \|\beta_i\|_2$ encourages between-group sparsity. Under our assumption that only one region is under attack, only one of the $\|\beta_i\|$'s is significantly greater than 0. Since we assume that the basis $B_i$ does not lie in the column space of $H$, the above optimization problem can be solved iteratively by first solving the linear regression problem and then the SGL without the constraint, which is demonstrated in Algorithm 1. The solution $\hat{\beta} = [\hat{\beta}_1^T, \ldots, \hat{\beta}_K^T]$ is an estimate of the attack vector $\beta$. For generator $i$ under attack, $\|\hat{\beta}_i\| > 0$, and for generator $j$ not under attack, we expect to get $\hat{\beta}_j \approx 0$. Within the detected generator, the non-zero elements correspond to the altered state variables. When there is no attack, $\|\hat{\beta}_i\|$ would be close to 0 for all $i$.

The online detection mechanism is built on the maximum magnitude of the $L_1$ norm $\|\hat{\beta}_i\|_1$: the alarm is set when $\max_i \|\hat{\beta}_i\|$ is greater than the pre-specified threshold $\lambda$, where $\lambda$ could be selected based on the empirical distribution of $\max_i \|\hat{\beta}_i\|$ under normal conditions. Specifically, $\lambda$ could be selected as the $(1 - \alpha)$ quantile of the empirical distribution of $\max_i \|\hat{\beta}_i\|$, where $\alpha$ is the desired Type-I error rate (usually set to 0.005). For generality, the $\max_i \|\hat{\beta}_i\|$ value is normalized based on the historical mean of $\max_i \|\hat{\beta}_i\|$ for the in-control data.

Algorithm 1: SGL-based attack detection and localization for linear system

    Input: tol, $z$, $H$, $\{M_1, \ldots, M_K\}$, $\{S_1, \ldots, S_N\}$, $\lambda$
    alarm = 0
    for $t = 1, 2, \ldots$ do
        $z \leftarrow z(t)$; $z_{new} \leftarrow z$; $\hat{x}_{old} \leftarrow x_0$; converge = 0
        while not converge do
            $\hat{x}_{new} = (H^T H)^{-1} H^T z_{new}$
            $r = z - H\hat{x}_{new}$
            Solve (16) for $\hat{\beta}_1, \ldots, \hat{\beta}_L$
            $z_{new} \leftarrow z - \sum_{i=1}^{L} B_i \hat{\beta}_i$
            if $\|\hat{x}_{new} - \hat{x}_{old}\| <$ tol then
                converge = 1
            end if
            $\hat{x}_{old} \leftarrow \hat{x}_{new}$
        end while
        if $\max_i \|\hat{\beta}_i\| > \lambda$ then
            alarm = 1; location = $\arg\max_i \|\hat{\beta}_i\|$
            break
        end if
    end for
    Return alarm, location

C. Nonlinear System

We now extend the formulation in the previous subsection to the nonlinear system setting as given by (1). In this case, the Jacobian matrix $H(x)$ is no longer a constant, but a function of $x$. Therefore, we need to re-approximate $H$ and the basis $B_i$ at every iteration, based on the new state estimate $\hat{x}$. The detection and localization problem could be extended to a nonlinear system setting as a sparse group Lasso (SGL) problem with nonlinear constraints:

$$\min_{\beta_1, \ldots, \beta_L, \hat{x}} \; \big\| r - \sum_{i=1}^{L} B_i \beta_i \big\|_2^2 + \lambda_1 \sum_{i=1}^{L} \|\beta_i\|_1 + \lambda_2 \sum_{i=1}^{L} \|\beta_i\|_2$$

such that

$$h(\hat{x}) + r = z, \qquad B_i[M_i^C, \cdot] = H(\hat{x})[M_i^C, S_i], \qquad B_i[M_i, \cdot] = 0.$$

Note that the above optimization problem has no closed-form solution or iterative algorithm with guaranteed convergence. However, we could linearize the constraint as follows. In practice, if the attacker alters the state of the generator by a large magnitude in a short time, it might directly shut down the generator, and the attack could be easily exposed. Therefore, we reasonably assume $\beta$ is not too large, such that

$$h(x^a) \approx h(x) + H(x)\beta.$$

Similar to the linear case, $\beta$ is sparse with $\beta_k = 0 \;\; \forall k \notin N_i$. As mentioned in Section II-B, the observed measurement $z^a$ is an element-wise combination of $h(x^a)$ and $h(x)$, i.e.,

$$z' = h(x^a) \approx h(x) + H(x)\beta, \tag{18}$$

$$z_j^a = z'_j, \quad \forall j \in M_i^C, \tag{19}$$

$$z_j^a = z_j, \quad \forall j \in M_i. \tag{20}$$

Since all the sensors that are directly related to the attacked node are covered by the normal measurements, the state estimate should be close to $x$, i.e., $\hat{x} \approx x$. Therefore, if generator $i$ is attacked, the residual $r$ should satisfy

$$r_j = H(\hat{x})[j, \cdot]\,\beta \quad \forall j \in M_i^C,$$

which is equivalent to

$$r = B_i \beta_i.$$

Similar to the linear case, when region $i$ is under attack, the corresponding $\hat{\beta}_i$ should be large; otherwise we expect $\hat{\beta}_i \approx 0$.

The above optimization problem is solved iteratively by solving the state estimation and the SGL.
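The SGL subproblem inside this alternation can be sketched as follows. Note the substitution: the sketch uses a plain proximal-gradient loop instead of the block coordinate descent of [7], because the proximal step of the combined penalty has a simple closed form (element-wise soft-thresholding followed by group-wise shrinkage). The names `sgl_prox` and `solve_sgl`, the identity basis, and the numbers are hypothetical and chosen so the result can be checked by hand.

```python
import numpy as np

def sgl_prox(v, groups, t1, t2):
    """Proximal step for t1 * ||b||_1 + t2 * sum_g ||b_g||_2."""
    out = np.sign(v) * np.maximum(np.abs(v) - t1, 0.0)   # element-wise soft-thresholding
    for g in groups:
        norm_g = np.linalg.norm(out[g])
        scale = max(0.0, 1.0 - t2 / norm_g) if norm_g > 0 else 0.0
        out[g] = scale * out[g]                            # group-wise shrinkage
    return out

def solve_sgl(B, r, groups, lam1, lam2, n_iter=500):
    """Minimize ||r - B b||_2^2 + lam1 * ||b||_1 + lam2 * sum_g ||b_g||_2."""
    beta = np.zeros(B.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(B, 2) ** 2)        # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = -2.0 * B.T @ (r - B @ beta)
        beta = sgl_prox(beta - step * grad, groups, step * lam1, step * lam2)
    return beta

# Hypothetical toy case: stacked basis [B_1, B_2] equal to the identity, two groups.
B_toy = np.eye(4)
groups_toy = [[0, 1], [2, 3]]
r_toy = np.array([3.0, 0.05, 0.1, -0.1])
beta_sgl = solve_sgl(B_toy, r_toy, groups_toy, lam1=0.2, lam2=0.4)
group_norms = [np.linalg.norm(beta_sgl[g], 1) for g in groups_toy]
location = int(np.argmax(group_norms))    # group with the largest coefficient norm
```

In this toy case only the first coefficient of the first group survives, which mirrors the between-group and within-group sparsity that the penalties are meant to produce.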
At each iteration, we first solve the state estimation (SE) problem using Newton's method and get the residual $r$. Then, we solve the SGL using block coordinate descent and get the estimates $\hat{\beta}_i$, $i = 1, \ldots, L$. In the next iteration, the state estimation is solved by correcting $z$ using $\hat{\beta}_i$, $i = 1, \ldots, L$; i.e., $z' = z - \sum_{i=1}^{L} B_i \hat{\beta}_i$. At each time step, this procedure is iterated until convergence. We demonstrate the above procedure in Algorithm 2.

IV. NUMERICAL RESULTS

We validate the proposed detection algorithm on both linear and nonlinear systems. For the linear system setting, we model the complete system with a 20-variable linear time-invariant state-space model composed of 4 regions. For the nonlinear system setting, we use the IEEE 14 bus model and decompose it into three regions as shown in Figure 1.

A. Linear System

For simulation on the linear system, we use the following discrete-time state-space model to represent the system operations:

$$x(t+1) = A x(t) + B u(t) + e(t), \tag{21}$$

$$z(t) = H x(t), \tag{22}$$

where $x$ and $z$ are the system state and sensor measurement, respectively, as defined earlier, and $u$ is the control action that is calculated by the controller to keep the system state at target. Equation (21) is the state-transition function, and (22) is the measurement function.

Algorithm 2: SGL-based attack detection and localization for nonlinear system

    Input: tol, $z$, $H$, $\{M_1, \ldots, M_K\}$, $\{S_1, \ldots, S_N\}$, $\lambda$
    alarm = 0
    for $t = 1, 2, \ldots$ do
        $z \leftarrow z(t)$; $z_{new} \leftarrow z$; $\hat{x}_{old} \leftarrow x_0$; converge = 0
        while not converge do
            Solve $\min_x (z_{new} - h(x))^T \Sigma^{-1} (z_{new} - h(x))$ using Newton's method to get $\hat{x}_{new}$
            Update $H(\hat{x}_{new})$ and the bases $B_i$
            $r = z - h(\hat{x}_{new})$
            Solve (16) for $\hat{\beta}_1, \ldots, \hat{\beta}_L$
            $z_{new} \leftarrow z - \sum_{i=1}^{L} B_i \hat{\beta}_i$
            if $\|\hat{x}_{new} - \hat{x}_{old}\| <$ tol then
                converge = 1
            end if
            $\hat{x}_{old} \leftarrow \hat{x}_{new}$
        end while
        if $\max_i \|\hat{\beta}_i\| > \lambda$ then
            alarm = 1; location = $\arg\max_i \|\hat{\beta}_i\|$
            break
        end if
    end for
    Return alarm, location

We generate the state-transition matrix $A$ randomly as a positive-definite matrix with the largest eigenvalue less than 1 to guarantee stability. The control action $u$ is calculated by a coupled linear-quadratic regulator [2]. Note that $H$ is a sparse matrix where each state variable only affects a subset of the sensors. The non-zero elements are generated from a uniform distribution between 0 and 1. The sensors connected to generator $i$ are defined by the strong correlation between sensor $j$ and state $x_i$: sensor $j$ is directly connected to generator $i$ when $\|H[j, S_i]\|_\infty$ exceeds a preset threshold. The state-transition function is taken as a "black box" that is assumed unknown, and the steady-state estimation is implemented based only on (22) using (5). In this simulation, we have 20 state variables (i.e., $x \in \mathbb{R}^{20 \times 1}$) and 30 sensors (i.e., $z \in \mathbb{R}^{30 \times 1}$). We run the detection algorithm for $N = 500$ replications to evaluate the detection delay and localization performance on average. In each replication, the attacked region $i_{attack}$ is chosen randomly, and the thresholds remain unchanged.

The attack detection performance of our proposed method is evaluated by the in-control and out-of-control average run lengths, i.e., $\mathrm{ARL}_0$ and $\mathrm{ARL}_1$.
The average run length is defined as the average number of observations before an alarm is raised:

$$\mathrm{ARL} = E\big[\min\{t : \max_i \|\hat{\beta}_i(t)\| > \lambda\}\big]. \tag{23}$$

It has the following relation with the type-I and type-II error rates:

$$\mathrm{ARL}_0 \approx \frac{1}{\Pr(\text{type-I error})} = \frac{1}{\Pr(\text{false positive})}, \tag{24}$$

$$\mathrm{ARL}_1 \approx \frac{1}{1 - \Pr(\text{type-II error})} = \frac{1}{1 - \Pr(\text{false negative})}. \tag{25}$$

We choose the threshold $\lambda$ offline based on the empirical distribution of $\max_i \|\hat{\beta}_i\|$, such that the Type-I error rate is $\alpha = 0.005$, so the expected in-control average run length is 200. The magnitude of the attack is defined by the signal-to-noise ratio (SNR):

$$\mathrm{SNR} = \sqrt{\beta_i^T \Sigma_i^{-1} \beta_i},$$

where $\Sigma_i$ is the covariance matrix of the state variables of generator $i$, $x_i$.

The out-of-control $\mathrm{ARL}_1$ values along with their standard deviations under different SNRs are shown in Table I. For comparison, we use the traditional $\chi^2$ detector as a baseline, whose $\mathrm{ARL}_1$ is also given in the table.

Recall that in our proposed algorithm, the attack location is identified as $\arg\max_i \|\hat{\beta}_i\|$. The attack localization performance of the proposed method is evaluated by the identification accuracy, precision, recall, and $F_1$ score, which are calculated using the following equations:

$$\text{Accuracy} = \Pr\big(\arg\max_i \|\hat{\beta}_i\| = i_{attack}\big), \tag{26}$$

$$\text{Precision} = \frac{TP}{TP + FP}, \tag{27}$$

$$\text{Recall} = \frac{TP}{TP + FN}, \tag{28}$$

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \tag{29}$$

where $TP$, $TN$, $FP$, and $FN$ are the numbers of true positives, true negatives, false positives, and false negatives, respectively. The precision values in Table I are obtained by taking the average of the precision values over all three regions, and the same applies to the recall and $F_1$ score values. The accuracy represents the probability that the algorithm correctly identifies the attacked region at the time of detection.
Precision represents the proportion of correct alarms among all the alarms, recall represents the proportion of correct alarms among all the cases where region $i$ is indeed under attack, and the $F_1$ score is the harmonic mean of precision and recall.

The localization accuracy of the proposed method is compared with a modification of the hypothesis testing technique used in the literature [21], in which we test the group of sensors related to region $i$ all together. More specifically, for each region $i$, we remove the sensors in the set $M_i$ and re-estimate the state using the remaining sensor measurements. The new $\chi^2_i$ statistic is calculated accordingly. After we go through all the regions, the new $\chi^2_i$ statistics are compared, and the attacked region is identified as the region $i$ that minimizes $\chi^2_i$. This is because a low $\chi^2_i$ value means that the removed sensors best explain the abnormality. The accuracies of the proposed method and the hypothesis testing method are given in Table I.

The results in Table I show that the proposed method has a higher detection power and a higher localization accuracy than the traditional $\chi^2$ detector. For example, when the SNR is 1, the localization accuracy of SGL is 69.8%, compared with 48.8% for the $\chi^2$ detector. When the SNR is 6, the accuracy of both methods is higher: the $\chi^2$ detector reaches 86.6% and SGL reaches 99.4%. Table I shows that, as the SNR increases, the localization accuracy of both methods generally increases. However, the proposed method has higher accuracy than the hypothesis test at all tested SNR levels. More importantly, the proposed method reaches a reasonably high accuracy (more than 85%) at a relatively low SNR level (SNR = 2), while the hypothesis test reaches similar accuracy only at a much higher SNR level (SNR = 6). This means the proposed method is more sensitive to covert attacks.

[Figure 3, panels (a) and (b): $\max_i \|\hat{\beta}_i\|$ plotted against $t$, with the threshold and the onset of attack marked. Each panel is titled "Detection of attack on ..."]

Fig. 3: Simulation Results
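The threshold calibration and run-length evaluation described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the in-control coefficient estimates are replaced by standard Gaussian draws, the group layout (3 groups of 4 coefficients) is arbitrary, and the helper names `monitoring_statistic`, `calibrate_threshold`, and `run_length` are hypothetical.

```python
import numpy as np


def monitoring_statistic(beta_hat):
    """max_i ||beta_hat_i||_2 for an array of shape (time, groups, dim)."""
    return np.linalg.norm(beta_hat, axis=2).max(axis=1)


def calibrate_threshold(in_control_stats, alpha=0.005):
    """Return the (1 - alpha) empirical quantile of the monitoring statistic."""
    return float(np.quantile(in_control_stats, 1.0 - alpha))


def run_length(stats, threshold):
    """1-based index of the first observation above the threshold.

    Returns len(stats) + 1 when no alarm is raised (a censored run).
    """
    exceed = np.flatnonzero(np.asarray(stats) > threshold)
    return int(exceed[0]) + 1 if exceed.size else len(stats) + 1


rng = np.random.default_rng(0)

# Assumption: Gaussian noise stands in for in-control coefficient estimates.
calibration = monitoring_statistic(rng.normal(size=(200_000, 3, 4)))
lam = calibrate_threshold(calibration, alpha=0.005)

# Estimate the in-control ARL on fresh replications; by Equation (24) it
# should be close to 1 / alpha = 200.
runs = [
    run_length(monitoring_statistic(rng.normal(size=(5_000, 3, 4))), lam)
    for _ in range(400)
]
arl0 = float(np.mean(runs))
print(f"threshold = {lam:.3f}, estimated in-control ARL = {arl0:.1f}")
```

Because the run length is approximately geometric with success probability $\alpha$, the estimate fluctuates around 200 from one random seed to another; more replications tighten it.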

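The localization metrics in Equations (26) to (29) can be computed along the lines of the sketch below. It assumes that each detection yields one predicted region (the argmax of the group norms) and one true attacked region, and it averages the per-region precision, recall, and $F_1$ values as described for Table I. The function name `localization_metrics` and the toy group norms are hypothetical and chosen only for illustration.

```python
import numpy as np


def localization_metrics(true_region, predicted_region, n_regions):
    """Accuracy plus region-averaged precision, recall, and F1."""
    true_region = np.asarray(true_region)
    predicted_region = np.asarray(predicted_region)
    accuracy = float(np.mean(true_region == predicted_region))

    precisions, recalls, f1s = [], [], []
    for r in range(n_regions):
        tp = int(np.sum((predicted_region == r) & (true_region == r)))
        fp = int(np.sum((predicted_region == r) & (true_region != r)))
        fn = int(np.sum((predicted_region != r) & (true_region == r)))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": float(np.mean(precisions)),
        "recall": float(np.mean(recalls)),
        "f1": float(np.mean(f1s)),
    }


# Toy example: rows are detections, columns are the group norms per region.
group_norms = np.array([
    [3.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],
    [0.0, 4.0, 1.0],
    [1.0, 3.0, 2.0],
    [0.0, 1.0, 5.0],
    [2.0, 1.0, 1.0],
])
predicted = group_norms.argmax(axis=1)   # localization rule: argmax of norms
true = np.array([0, 0, 1, 1, 2, 2])      # hypothetical attacked regions

metrics = localization_metrics(true, predicted, n_regions=3)
print(metrics)
```

When every region is attacked equally often, the region-averaged recall coincides with the accuracy, which matches the closeness of the accuracy and recall columns in Table I.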
B. Nonlinear System

To validate the performance of the method on nonlinear systems, we simulate the attack using the IEEE 14-bus model. The inputs to the simulation are the load profiles of the load buses, the generation plan of the generator buses, and the phase angles and voltages at each bus. The load profiles are generated from real data in the Pecan Street dataset. The generation plan is generated from the load by solving the mixed-integer unit commitment problem [4].

We simulate the system for 2000 observations and monitor the $\ell_2$ norm of the $\hat{\beta}_i$ vectors. The threshold is selected based on the 0.995 quantile of the monitoring statistic ($\max_i \|\hat{\beta}_i\|$), such that the in-control average run length is around 200. There are 5 levels of attack, where levels 1 to 5 represent decreasing the generation level by 20% to 100%. For each level of attack, we replicate the simulation 500 times, and in each replication, a covert attack on one randomly selected generator bus begins at $t = 1000$. Before the onset of the attack, the magnitudes of $\|\hat{\beta}_i\|$ for all generator buses generally follow the same distribution, and false alarms are triggered with low probability. In contrast, after the onset of the attack, the $\|\hat{\beta}_i\|$ values for the region under attack are much higher than the others, and the $\max_i \|\hat{\beta}_i\|$ values are above the threshold, so alarms are triggered frequently.

We evaluate the performance of the method using the average run length and the localization accuracy defined by Equations (23) and (26). Figure 4 shows how the average run length and the localization accuracy vary across the 5 levels of attack. The results show that, as the attack level (severity of attack) increases, the detection delay decreases and the localization accuracy increases. This means the proposed method has a higher detection power as well as a higher localization accuracy when the attack is more severe.

V. CONCLUSION

We proposed an online approach to detect and localize a type of data integrity attack called a covert attack. Our detection approach is based on the SGL formulation. We established the theoretical foundation for applying SGL in a linear system setting and extended the method to a nonlinear system setting with relaxation. We conducted a simulation study to evaluate the performance of our proposed method in terms of the average run length, which indicates the expected detection delay, and the localization accuracy. The results showed that as the severity (SNR) of the attack increases, both the detection power and the localization accuracy of the proposed method increase. The results also showed that the proposed method is more sensitive than the $\chi^2$ test at every tested SNR level. Furthermore, we implemented a case study on the IEEE 14-bus system as a representative of the more practical nonlinear systems. The results showed that the proposed method is applicable to nonlinear systems and achieves shorter detection delay and higher localization accuracy as the attack severity increases.

As future work, we will investigate the scalability and the robustness of the proposed method. We can also extend the method to the detection, localization, and identification of other types of cyberattacks and faults in smart grids. Another direction is to incorporate deep-learning-based methods in order to reach high accuracy while preserving interpretability. Presently, our optimization algorithm is research grade. Our plan is to further develop the code into a commercial-grade optimization algorithm that solves the SGL with linear and/or nonlinear constraints.

REFERENCES

[1] Eduardo N Asada, Ariovaldo V Garcia, and R Romero. Identifying multiple interacting bad data in power system state estimation. In IEEE Power Engineering Society General Meeting, 2005, pages 571–577. IEEE, 2005.
[2] Michael Athans. The role and use of the stochastic linear-quadratic-Gaussian problem in control system design. IEEE Transactions on Automatic Control, 16(6):529–552, 1971.
[3] Eduardo Caro, Antonio J Conejo, Roberto Minguez, Marija Zima, and Göran Andersson. Multiple bad data identification considering measurement dependencies. IEEE Transactions on Power Systems, 26(4):1953–1961, 2011.
[4] Miguel Carrión and José M Arroyo. A computationally efficient mixed-integer linear formulation for the thermal unit commitment problem. IEEE Transactions on Power Systems, 21(3):1371–1378, 2006.

TABLE I: ARL and Accuracy under different levels of SNR

| SNR | ARL (Std. Dev.), $\chi^2$ | ARL (Std. Dev.), SGL | Accuracy, $\chi^2$ | Accuracy, SGL | Precision, $\chi^2$ | Precision, SGL | Recall, $\chi^2$ | Recall, SGL | $F_1$ score, $\chi^2$ | $F_1$ score, SGL |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 203.46 (8.04) | 200.84 (9.96) | - | - | - | - | - | - | - | - |
| 1 | 171.68 (7.26) | 153.34 (6.80) | 48.80% | 69.80% | 52.74% | 72.63% | 48.80% | 69.80% | 47.35% | 69.61% |
| 2 | 97.404 (4.48) | 82.78 (3.88) | 60.80% | 85.80% | 63.13% | 86.77% | 59.45% | 85.37% | 59.91% | 85.92% |
| 3 | 50.20 (2.22) | 41.47 (1.95) | 68.20% | 92.60% | 73.86% | 92.79% | 68.27% | 92.60% | 68.51% | 92.69% |
| 4 | 23.32 (1.00) | 16.18 (0.66) | 72.40% | 97.00% | 75.97% | 97.13% | 71.37% | 97.05% | 72.18% | 97.08% |
| 5 | 15.54 (0.63) | 13.11 (0.51) | 80.40% | 96.40% | 82.90% | 96.77% | 80.91% | 96.38% | 81.28% | 96.58% |
| 6 | 12.13 (0.44) | 8.38 (0.27) | 86.60% | 99.40% | 87.99% | 99.39% | 86.36% | 99.41% | 86.91% | 99.40% |

[Figure 4: two panels plotting ARL and localization accuracy against attack level.]

Fig. 4: ARL and localization accuracy under 5 levels of attack

[5] Defense Use Case. Analysis of the cyber attack on the Ukrainian power grid. Electricity Information Sharing and Analysis Center (E-ISAC), 2016.
[6] Vasco Delgado-Gomes, João F Martins, Celson Lima, and Paul Nicolae Borza. Smart grid security issues. In , pages 534–538. IEEE, 2015.
[7] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. A note on the group lasso and a sparse group lasso. arXiv preprint arXiv:1001.0736, 2010.
[8] M Zekeriya Gunduz and Resul Das. Analysis of cyber-attacks on smart grid applications. In , pages 1–5. IEEE, 2018.
[9] Miao He and Junshan Zhang. A dependency graph approach for fault detection and localization towards secure smart grid. IEEE Transactions on Smart Grid, 2(2):342–351, 2011.
[10] Tong Huang, Bharadwaj Satchidanandan, PR Kumar, and Le Xie. An online detection framework for cyber attacks on automatic generation control. IEEE Transactions on Power Systems, 33(6):6816–6827, 2018.
[11] David Kushner. The real story of Stuxnet. IEEE Spectrum, 3(50):48–53, 2013.
[12] Dan Li, Kamran Paynabar, and Nagi Gebraeel. A degradation-based detection framework against covert cyberattacks on SCADA systems. IISE Transactions, (just-accepted):1–37, 2020.
[13] Shang Li, Yasin Yılmaz, and Xiaodong Wang. Quickest detection of false data injection attack in wide-area smart grids. IEEE Transactions on Smart Grid, 6(6):2725–2735, 2014.
[14] Ting Liu, Yanan Sun, Yang Liu, Yuhong Gui, Yucheng Zhao, Dai Wang, and Chao Shen. Abnormal traffic-indexed state estimation: A cyber–physical fusion approach for smart grid attack detection. Future Generation Computer Systems, 49:94–103, 2015.
[15] Yao Liu, Peng Ning, and Michael K Reiter. False data injection attacks against state estimation in electric power grids. ACM Transactions on Information and System Security (TISSEC), 14(1):13, 2011.
[16] Yilin Mo, Rohan Chabukswar, and Bruno Sinopoli. Detecting integrity attacks on SCADA systems. IEEE Transactions on Control Systems Technology, 22(4):1396–1407, 2014.
[17] Thomas R Nudell, Seyedbehzad Nabavi, and Aranya Chakrabortty. A real-time attack localization algorithm for large power system networks using graph-theoretic techniques. IEEE Transactions on Smart Grid, 6(5):2551–2559, 2015.
[18] Roy S Smith. A decoupled feedback structure for covertly appropriating networked control systems. IFAC Proceedings Volumes, 44(1):90–95, 2011.
[19] Siddharth Sridhar and Manimaran Govindarasu. Model-based attack detection and mitigation for automatic generation control. IEEE Transactions on Smart Grid, 5(2):580–591, 2014.
[20] Rui Tan, Hoang Hai Nguyen, Eddy YS Foo, David KY Yau, Zbigniew Kalbarczyk, Ravishankar K Iyer, and Hoay Beng Gooi. Modeling and mitigating impact of false data injection attacks on automatic generation control. IEEE Transactions on Information Forensics and Security, 12(7):1609–1624, 2017.
[21] Th Van Cutsem, Mania Ribbens-Pavella, and Lamine Mili. Hypothesis testing identification: A new method for bad data analysis in power system state estimation. IEEE Transactions on Power Apparatus and Systems, (11):3239–3252, 1984.
[22] Do Van Long, Lionel Fillatre, and Igor Nikiforov. Sequential monitoring of SCADA systems against cyber/physical attacks. IFAC-PapersOnLine, 48(21):746–753, 2015.
[23] Ognjen Vuković and György Dán. Detection and localization of targeted attacks on fully distributed power system state estimation. In , pages 390–395. IEEE, 2013.
[24] Ruzhi Xu, Rui Wang, Zhitao Guan, Longfei Wu, Jun Wu, and Xiaojiang Du. Achieving efficient detection against false data injection attacks in smart grid. IEEE Access, 5:13787–13798, 2017.
[25] Jiafan Yu, Yang Weng, and Ram Rajagopal. Patopa: A data-driven parameter and topology joint estimation framework in distribution grids. IEEE Transactions on Power Systems, 33(4):4335–4347, 2017.
[26] Yichi Zhang, Lingfeng Wang, Weiqing Sun, Robert C Green II, and Mansoor Alam. Distributed intrusion detection system in a multi-layer network architecture of smart grids. IEEE Transactions on Smart Grid, 2(4):796–808, 2011.
[27] Junbo Zhao, Lamine Mili, and Meng Wang. A generalized false data injection attacks against power system nonlinear state estimator and countermeasures.