Online Detection of Low-Quality Synchrophasor Data Considering Frequency Similarity
Wenyun Ju, Member, IEEE, Horacio Silva-Saravia, Member, IEEE, Neeraj Nayak, Member, IEEE, Wenxuan Yao, Member, IEEE, Yichen Zhang, Member, IEEE, Qingxin Shi, Member, IEEE, Fan Ye, Member, IEEE
Abstract: This letter proposes a new approach for online detection of low-quality synchrophasor data under both normal and event conditions. The proposed approach utilizes the features of synchrophasor data in the time and frequency domains to distinguish multiple regional PMU signals and detect low-quality synchrophasor data. It does not require any offline study, and it is more effective at detecting low-quality data with apparently indistinguishable profiles. Case studies using recorded synchrophasor measurements verify the effectiveness of the proposed approach.
Index Terms: Synchrophasor measurements, low-quality synchrophasor data, frequency domain, data analytics.
I. INTRODUCTION
Low-quality synchrophasor data are widely seen in practice. They are data that cannot accurately reflect the underlying system behavior [1]. Due to the inherent networked electrical couplings between individual buses, synchrophasor data from regional PMU signals generally have similar dynamic behaviors in both normal and event operating conditions [1], [2]. This is called strong spatial-temporal correlation. The correlation becomes relatively weak if data anomalies exist. The objective of this letter is the detection of spatial-temporal anomalies, including random spikes, repeated data and false data injection. Other types of data anomalies, such as missing data and high sensing noise, are not considered in this letter.
In recent years, model-free methods [1], [2], [3], [4], [5] have been exploited to achieve more reliable detection under inaccurate topology information or parameter errors. Reference [3] proposes a method to identify and correct low-quality data based on the low-rank property of the Hankel structure. However, its complicated optimizations make it hard to apply online. References [4], [5] use machine learning methods to detect low-quality data and require labelled datasets for training, which are time-consuming to prepare. Reference [1] proposes a density-based local outlier factor (LOF) approach to detect low-quality data. It requires a high-quality historical database for multiple PMU signals. Reference [2] proposes an approach based on spatial-temporal nearest neighbor (STNN) discovery. Some of these methods cannot detect low-quality data that appears in the same time period of multiple regional PMU signals [2], [4], [5].
This letter addresses the issues mentioned above by developing a model-free approach that utilizes the features of synchrophasor data in the time and frequency domains to detect low-quality synchrophasor data online. The advantages are:
• It does not require any offline study or training, and little online computational effort is required.
• It is more effective at detecting low-quality synchrophasor data with apparently indistinguishable profiles.
• It can differentiate event data from low-quality data.
• The threshold used to detect low-quality synchrophasor data is meaningful and is much easier to understand and set.
II. QUANTIFYING SIMILARITY FOR TWO DATA CURVES
The synchrophasor data matrix $M$ from $N$ regional PMU signals is divided into $T$ time periods. Let $M_i$ and $M_j$ denote the data of signals $i$ and $j$ for the same time period. Three indices are used to comprehensively quantify the similarity between $M_i$ and $M_j$ in both the time and frequency domains.
A. Dynamic Change Similarity
Define the Dynamic Change Similarity, denoted by $I_{dcs}$, as in (1):
$$I_{dcs} = e^{1-\gamma}, \quad \gamma = \max\left[\frac{\sigma_i}{\sigma_j}, \frac{\sigma_j}{\sigma_i}\right] \tag{1}$$
where $\sigma_i$ and $\sigma_j$ are the standard deviations of $M_i$ and $M_j$. The range of $I_{dcs}$ is [0, 1]. The closer $I_{dcs}$ is to 1, the more similar $M_i$ and $M_j$ are in terms of the strength of dynamic change.
B. Frequency Magnitude Similarity
The frequency response of a system with $M_i$ as the input and $M_j$ as the output is:
$$H(f) = \frac{|M_j(f)|}{|M_i(f)|} e^{j(\phi_{M_j}(f) - \phi_{M_i}(f))} \tag{2}$$
where $M_i(f)$ and $M_j(f)$ represent the Fourier transforms (FTs) of $M_i$ and $M_j$, $|M_i(f)|$ and $|M_j(f)|$ are the magnitudes of the FTs, and $\phi_{M_i}(f)$ and $\phi_{M_j}(f)$ are their phase angles. $|H(f)|$ equals 1 for all values of $f$ if $M_i$ and $M_j$ are the same. Eqn. (3) is used for the frequency magnitude; it maps the values to the range [0, 1] for each frequency:
$$S(f) = 1 - \tanh\left(|\log|H(f)||^{\lambda}\right) \tag{3}$$
where $\lambda$ is a sensitivity parameter for magnitude distance. For a frequency range with $m$ frequency values, the Frequency Magnitude Similarity [6], denoted by $I_{fms}$, is:
$$I_{fms} = \frac{\sum_{k=1}^{m} S(f_k)}{m} \tag{4}$$
C. Frequency Phase Similarity
The angle of the frequency response $H(f)$, denoted by $\phi(f)$, equals zero for all frequencies if $M_i$ and $M_j$ are the same. The phase angle distance is defined for each frequency:
$$A(f) = 1 - \tanh\left(\frac{1}{2\pi}|\phi(f)|^{\epsilon}\right) \tag{5}$$
where $\epsilon$ is a sensitivity parameter for angle distance. For a given frequency range with $m$ frequency values, the Frequency Phase Similarity [6], denoted by $I_{fps}$, is:
$$I_{fps} = \frac{\sum_{k=1}^{m} A(f_k)}{m} \tag{6}$$
Note that $I_{dcs}$ quantifies the similarity of $M_i$ and $M_j$ in the time domain, while $I_{fms}$ and $I_{fps}$ quantify it in the frequency domain. If either $M_i$ or $M_j$ contains low-quality data, the two have different dynamic behaviors and the values of $I_{dcs}$, $I_{fms}$ and $I_{fps}$ tend to be close to 0. Otherwise, they tend to be close to 1.
We use a simple example in Fig. 1 to illustrate the necessity of considering $I_{fms}$ and $I_{fps}$ to distinguish two data curves.
Fig. 1. An example for two signals.
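As a minimal sketch of (1)-(6), the Python functions below compute the three indices for two equal-length data windows. Several details are assumptions rather than part of the letter: the exponent in (1) is taken as 1 − γ so that identical spreads give a value of 1, λ and ε are read as exponents in (3) and (5), the logarithm is natural, and the phase difference is wrapped to (−π, π]. The function names, the sampling rate `fs` and the magnitude floor `tiny` are hypothetical.

```python
import cmath
import math
import statistics


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for short windows."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dynamic_change_similarity(mi, mj):
    """I_dcs of (1), assuming the exponent is 1 - gamma."""
    si, sj = statistics.pstdev(mi), statistics.pstdev(mj)
    if si == 0.0 and sj == 0.0:
        return 1.0          # both windows are flat: treated as identical spread
    if si == 0.0 or sj == 0.0:
        return 0.0          # gamma would be infinite
    gamma = max(si / sj, sj / si)
    return math.exp(1.0 - gamma)


def frequency_similarities(mi, mj, fs, f_max=5.0, lam=10.0, eps=0.5, tiny=1e-12):
    """Return (I_fms, I_fps) of (4) and (6) over DFT bins in [0, f_max] Hz.

    Assumptions: lam and eps act as exponents, log is natural, and magnitudes
    below `tiny` are floored so that empty bins do not divide by zero.
    """
    n = len(mi)
    fi, fj = dft(mi), dft(mj)
    bins = [k for k in range(n // 2 + 1) if k * fs / n <= f_max]
    s_vals, a_vals = [], []
    for k in bins:
        ratio = max(abs(fj[k]), tiny) / max(abs(fi[k]), tiny)   # |H(f)|
        phi = cmath.phase(fj[k] * fi[k].conjugate())            # angle of H(f)
        s_vals.append(1.0 - math.tanh(abs(math.log(ratio)) ** lam))
        a_vals.append(1.0 - math.tanh(abs(phi) ** eps / (2.0 * math.pi)))
    return sum(s_vals) / len(s_vals), sum(a_vals) / len(a_vals)
```

For two identical windows all three indices evaluate to 1. The naive transform is used only to keep the sketch dependency-free; an FFT routine could be substituted for longer windows.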
The standard deviations of signals 1 and 2 are almost the same; the difference is only 0.0001%. Therefore, the LOF approach will conclude that they have the same dynamic change and that there is no difference between them. However, the values of $I_{fms}$ and $I_{fps}$ are calculated as 0.4355 and 0.6615, which quantify the difference between signals 1 and 2 more accurately in the frequency domain.
III. PROPOSED APPROACH FOR DETECTING LOW-QUALITY SYNCHROPHASOR DATA
A. Similarity Degree
By weighting $I_{dcs}$, $I_{fms}$ and $I_{fps}$, we obtain $I_{sd}^{ij}$:
$$I_{sd}^{ij} = \omega_1 I_{dcs} + \omega_2 I_{fms} + \omega_3 I_{fps} \tag{7}$$
where $\omega_1$, $\omega_2$ and $\omega_3$ are the weights, with $\omega_1 + \omega_2 + \omega_3 = 1$. For the $i$th PMU signal at the $k$th time period, calculate $I_{sd}^{ij}$ for each pair of data curves $M_i$ and $M_j$ to obtain the set $\{I_{sd}^{i1}, ..., I_{sd}^{ij}, ..., I_{sd}^{iN}\}$, then calculate the mean value to obtain the Similarity Degree, denoted by $I_{sd}^{i}$:
$$I_{sd}^{i} = \frac{\sum_{j=1, j \neq i}^{N} I_{sd}^{ij}}{N-1}, \quad i = 1, ..., N \tag{8}$$
where $N$ is the number of regional PMU signals. Because the frequency characteristics of low-quality synchrophasor data in multiple regional PMU signals, especially random spikes and false data injection, are expected to be different, $I_{sd}^{i}$ is more accurate at distinguishing low-quality synchrophasor data in multiple regional PMU signals.
B. Proposed Approach for Detecting Low-Quality Data
With the set $\{I_{sd}^{1}, ..., I_{sd}^{i}, ..., I_{sd}^{N}\}$, the $i$th PMU signal is detected as a candidate PMU signal with low-quality data at the $k$th time period if its value of $I_{sd}^{i}$ satisfies (9):
$$I_{sd}^{i} < \zeta, \quad i = 1, ..., N \tag{9}$$
where $\zeta$ is the threshold. The range of $I_{sd}^{i}$ is [0, 1]. The closer $I_{sd}^{i}$ is to 0, the more dissimilar the $i$th PMU signal is from the other PMU signals. Therefore, $\zeta$ is meaningful and easy to understand and tune for online detection.
The algorithm for detecting low-quality synchrophasor data can be summarized as:
Step 1: Obtain the synchrophasor data matrix $M$ of $N$ regional PMU signals for the $k$th time period.
Step 2: For the $i$th ($i = 1, ..., N$) PMU signal, calculate $I_{sd}^{ij}$ for each pair of data curves $M_i$ and $M_j$ using (1)-(7) and obtain $\{I_{sd}^{i1}, ..., I_{sd}^{ij}, ..., I_{sd}^{iN}\}$.
Step 3: Calculate $I_{sd}^{i}$ for the $i$th ($i = 1, ..., N$) PMU signal using (8).
Step 4: Identify the candidate PMU signals with low-quality synchrophasor data using (9), flag the data points in the $k$th time period of the candidate PMU signals as low-quality data, then go to the $(k+1)$th time period.
IV. CASE STUDIES
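Steps 2 to 4 of the algorithm in Section III-B can be sketched as follows for a single time period, using the weights and threshold reported in this section as defaults. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the pairwise indices are assumed to be precomputed as N x N matrices, and the example matrix is made-up data rather than a measurement.

```python
def similarity_degrees(dcs, fms, fps, weights=(0.3, 0.35, 0.35)):
    """Return the Similarity Degree of every signal from N x N index matrices."""
    w1, w2, w3 = weights
    n = len(dcs)
    degrees = []
    for i in range(n):
        # Eq. (7): weighted pairwise similarity against every other signal j.
        pairwise = [w1 * dcs[i][j] + w2 * fms[i][j] + w3 * fps[i][j]
                    for j in range(n) if j != i]
        # Eq. (8): mean over the N - 1 other signals.
        degrees.append(sum(pairwise) / (n - 1))
    return degrees


def flag_low_quality(degrees, zeta=0.3):
    """Eq. (9): indices of signals whose Similarity Degree is below zeta."""
    return [i for i, d in enumerate(degrees) if d < zeta]


def pairwise_matrix(n, outlier, similar=0.9, dissimilar=0.1):
    """Hypothetical index matrix: one outlier signal, all other pairs similar."""
    return [[1.0 if i == j else (dissimilar if outlier in (i, j) else similar)
             for j in range(n)] for i in range(n)]


# Made-up illustration: 4 signals, signal 3 behaves differently in this period.
m = pairwise_matrix(4, outlier=3)
degrees = similarity_degrees(m, m, m)
candidates = flag_low_quality(degrees)
```

In this made-up case only signal 3 falls below the threshold, because its mean similarity to the other signals is about 0.1 while the remaining signals stay near 0.63. In an online setting the same two calls would be repeated for each new time period.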
Case studies use recorded PMU measurements from utilities under normal and event conditions. $\lambda$ is set to 10 and $\epsilon$ is set to 0.5, the same as in [6]. The frequency range is [0, 5] Hz; higher frequency bands are not considered. $\omega_1$, $\omega_2$, $\omega_3$ and $\zeta$ are set to 0.3, 0.35, 0.35 and 0.3, respectively. The LOF approach [1] is implemented for comparison. The similarity metric with low variance $f_L$ is used and the LOF threshold is set to 10 (100 in [1]). We use a smaller LOF threshold in order to achieve better performance for the LOF approach. The moving window length is 80 data points for both approaches. To avoid false alarms, synchrophasor measurements within the current moving window are identified as containing low-quality data only if the 15 consecutive moving windows prior to the current window have already been detected [1].
A. Scenario 1: Detection of indistinguishable low-quality synchrophasor data under normal condition
The synchrophasor data of 22 frequency signals ($f_1$ to $f_{22}$) recorded under normal conditions are used. Among them, $f_{10}$ contains random spikes and repeated data.
(a) Proposed approach (b) LOF approach
Fig. 2. Identified random spikes and repeated data for Scenario 1.
The data points detected as low-quality synchrophasor data by the proposed and LOF approaches are marked in red in Fig. 2(a) and Fig. 2(b), respectively. The results are almost the same, except that the proposed approach detects additional data points, highlighted in red in the zoomed subfigures in Fig. 2(a), which the LOF approach fails to detect. The reason is investigated below.
(a) $f_1$ to $f_{22}$ ($f_{10}$ excluded) (b) $f_{10}$ vs other signals
Fig. 3. $S(f)$ values between different signals.
We select one time period, [100.18 s, 101.50 s], in which the proposed approach detects the low-quality data successfully and the LOF approach fails. For this time period, we calculate the $S(f)$ values between different signals for different frequency values. Fig. 3(a) shows the $S(f)$ values among the signals other than $f_{10}$. Fig. 3(b) gives the $S(f)$ values between $f_{10}$ and the other 21 signals. Comparing them, we can see that $f_{10}$ is very different from the other signals in terms of frequency magnitude for the non-DC components. However, the other signals ($f_{10}$ excluded) are much more similar to each other. The dynamic change in the time domain for the selected time period is not significant. Because the LOF approach is more sensitive to dynamic change, it fails to detect the low-quality data. The proposed approach also considers the frequency characteristics of the data points, so it quantifies the similarity and differentiates the signals more accurately.
B. Scenario 2: Performance under event condition
The synchrophasor data recorded from a frequency event, shown in Fig. 4, are used. There are 20 frequency signals.
Using the proposed approach, no low-quality data is detected. This indicates that the proposed approach does not cause wrong detection of low-quality data under event conditions.
C. Scenario 3: Detection of low-quality synchrophasor data that appears in the same time period of multiple regional PMU signals under event condition
Fig. 4. Signals for Scenario 2. Fig. 5. Signals for Scenario 3.
The LOF approach can detect false data injection that appears in the same time period of multiple regional PMU signals [1]. This subsection verifies that the proposed approach is more effective than the LOF approach at detecting indistinguishable false data injection that appears in the same time period of multiple regional PMU signals.
False data injection (10 data points) is introduced into the same time period of 4 signals ($f_1$ to $f_4$) shown in Fig. 5.
(a) Proposed approach (b) LOF approach
Fig. 6. Identified false data injection for Scenario 3.
The data points identified as low-quality data by the proposed and LOF approaches are marked in red in Fig. 6(a) and Fig. 6(b), respectively. Comparing them, we find that the proposed approach detects the data points with false data injection for all four signals ($f_1$ to $f_4$). However, the LOF approach detects the data points with false data injection for only one of these signals.
TABLE I. Mean Value of Identified Signals with False Data Injection
Approach             Value
LOF Approach         1.21
Proposed Approach    3.95
V. CONCLUSION
This letter proposes a novel approach for detecting low-quality synchrophasor data under normal and event operating conditions. It utilizes the features of synchrophasor data in both the time and frequency domains. The proposed approach is more effective at detecting inconspicuous low-quality synchrophasor data such as random spikes and false data injection. It does not involve any offline study or training. Case studies for different scenarios verify the proposed approach.
REFERENCES
[1] M. Wu and L. Xie, "Online detection of low-quality synchrophasor measurements: a data-driven approach," IEEE Trans. Power Syst., vol. 32, no. 4, pp. 2817-2827, Jul. 2017.
[2] L. Zhu and D. J. Hill, "Cost-effective bad synchrophasor data detection based on unsupervised time series data analytics," IEEE Internet of Things Journal, DOI: 10.1109/JIOT.2020.3016032.
[3] Y. Hao, M. Wang, et al., "Modelless data quality improvement of streaming synchrophasor measurements by exploiting the low-rank Hankel structure," IEEE Trans. Power Syst., vol. 33, no. 6, pp. 6966-6977, 2018.
[4] Z. Yang, H. Liu, T. Bi, and Q. Yang, "Bad data detection algorithm for PMU based on spectral clustering," Journal of Modern Power Systems and Clean Energy, vol. 8, no. 3, pp. 473-483, 2020.
[5] X. Deng, D. Bian, W. Wang, et al., "Deep learning model to detect various synchrophasor data anomalies," IET Generation, Transmission & Distribution, DOI: 10.1049/iet-gtd.2020.0526.
[6] W. Ju, N. Nayak, C. Vikram, et al., "Indices for automated identification of questionable generator models using synchrophasors,"