Ensemble and Random Collaborative Representation-Based Anomaly Detector for Hyperspectral Imagery
Rong Wang, Wei Feng, Qianrong Zhang, Feiping Nie, Zhen Wang, Xuelong Li
Rong Wang, Wei Feng, Qianrong Zhang, Feiping Nie, Zhen Wang, and Xuelong Li,
Fellow, IEEE
Abstract: In recent years, hyperspectral anomaly detection (HAD) has become an active topic and plays a significant role in military and civilian fields. As a classic HAD method, the collaborative representation-based detector (CRD) has attracted extensive attention and in-depth research. Despite the good performance of the CRD method, its computational cost is too high for the widely demanded real-time applications. To alleviate this problem, a novel ensemble and random collaborative representation-based detector (ERCRD) is proposed for HAD. This approach comprises two main steps. Firstly, we propose a random background modeling to replace the sliding dual window strategy used in the original CRD method. Secondly, we obtain multiple detection results through multiple random background modelings, and these results are further refined into a final detection result through ensemble learning. Experiments on four real hyperspectral datasets demonstrate the accuracy and efficiency of the proposed ERCRD method compared with ten state-of-the-art HAD methods.
Index Terms: Hyperspectral imagery (HSI), hyperspectral anomaly detection (HAD), collaborative representation, random background modeling, ensemble learning.
I. INTRODUCTION
Hyperspectral imagery (HSI) offers plentiful spectral and spatial information for monitoring the earth’s surface and finely identifying various land cover materials [1]–[5]. Owing to this advantage, HSI has been widely applied in various remote sensing fields, such as scene classification [3], [5], unmixing [6], clustering [7], [8], change detection [9], and target or anomaly detection [2], [10]–[12]. Hyperspectral anomaly detection (HAD) has also attracted much interest and in-depth research for its widespread application in areas such as military reconnaissance, civilian search and rescue, environmental monitoring, and mineral exploration [13], [14].

In essence, HAD is an unsupervised binary classification problem, which detects anomalies against the background without any prior knowledge of the scene. In general, anomalies refer to pixels with two significant features, namely, a low occurrence probability and spectral-spatial characteristics distinct from the background. Examples include tanks in a forest background, warships in a sea background, and aircraft in an airport background.

R. Wang is with the School of Cybersecurity and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi’an 710072, P. R. China. E-mail: [email protected]
W. Feng, Q. Zhang, F. Nie, and X. Li are with the School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi’an 710072, P. R. China. E-mail: [email protected]
Z. Wang is with the Center for OPTical IMagery Analysis and Learning (OPTIMAL) and School of Mechanical Engineering, Northwestern Polytechnical University, Xi’an 710072, China.

In the last three decades, various methods have been developed to detect anomalies in HSI. Generally speaking, there are two main kinds of existing HAD methods: statistics modeling HAD and representation-based HAD. Statistics modeling HAD methods assume a multivariate normal (Gaussian) background distribution. A well-known statistics modeling method is the Reed-Xiaoli (RX) detector [15], which identifies anomalies based on the Mahalanobis distance between a pixel and the background. The RX detector is now the benchmark in HSI anomaly detection and comes in two variants: the global RX (GRX) detector, which uses the entire image to model the background, and the local RX (LRX) detector, which uses the pixels in a local dual window. Afterwards, numerous variants of the RX detector have been proposed [16]–[25].
For example, the kernel RX (KRX) [16] method, considered the nonlinear version of the RX detector, uses kernel theory to characterize non-Gaussian distributions in a high-dimensional feature space. The cluster-based anomaly detection (CBAD) [17] method first groups the entire image into several clusters and then applies the RX detector to each cluster to achieve better detection performance. The subspace RX (SSRX) [18] detector assumes that the high-variance principal components and the low-variance principal components are associated with the background subspace and the anomaly subspace, respectively; it is obtained by applying the RX detector to the low-variance principal components. The weighted RX (WRX) [21] detector assigns a specific weight to every pixel for background modeling. To speed up the KRX detector, the cluster KRX (CKRX) [25] detector applies fast eigenvalue decomposition to clusters obtained by clustering the entire image. Chang et al. [26] designed a background-anomaly component projection and separation optimized detector (BASO) to detect anomalies from the perspective of optimization theory.

Besides, many representation-based HAD methods, including sparse representation [27]–[32], low-rank representation [33]–[37], and collaborative representation [38]–[43], have also attracted substantial attention and in-depth research. For example, Chen et al. [27] proposed the sparse representation-based HAD method, which assumes that the background can be approximated by several atoms in the dictionary, while the anomalies cannot. The low-rank and sparse matrix decomposition (LRaSMD) detector designed by Sun et al. [33] separates the anomalies by calculating the Euclidean distance between each pixel and the mean vector on the sparse part. Zhang et al. [34] proposed the low-rank and sparse matrix decomposition-based Mahalanobis distance (LSMAD) method, which exerts a low-rank constraint on the background and a sparse constraint on the anomalies. Under the assumption that each background pixel can be approximated by its spatial neighborhood pixels in a sliding dual window centered at this pixel, while the anomalies cannot, Li and Du proposed the collaborative representation-based detector (CRD) for HSI in [38]. On the basis of the CRD, many variants have been proposed recently. To make better use of the spatial features of HSI, a morphology-based collaborative representation detector (MCRD) [39] was presented. The PCAroCRD method [41], which applies principal component analysis (PCA) to the CRD to remove outliers, first uses PCA to extract the main background pixel information and then predicts the anomalies with collaborative representation. It comes in two variants: Global PCAroCRD and Local PCAroCRD. To decrease the computational complexity, a recursive CRD (RCRD) [42] algorithm was proposed based on the matrix inversion lemma. In addition to the above two categories of methods, many scholars have also explored HAD methods based on support vector data description (SVDD) [44], [45], morphological and attribute filters [46]–[48], tensor decomposition [49]–[52], and deep convolutional neural networks [53]–[55], which shows that the in-depth study of HAD methods has become a popular topic.

Here we focus on the family of collaborative representation-based HAD methods. These methods are built on one assumption: each background pixel can be approximated by its spatial neighborhood pixels in a dual window centered at this pixel, while the anomalies cannot.
Most traditional collaborative representation-based HAD methods, however, focus on detection accuracy while ignoring the underlying computational complexity, which is of great importance for anomaly detection on large datasets. The complexity mainly arises from the sliding dual window strategy. Therefore, we propose a novel ensemble and random collaborative representation-based detector (ERCRD) with random background modeling and ensemble learning. Firstly, we propose a random background modeling to replace the sliding dual window strategy and use the same background pixels for every pixel in our model. Then, we obtain multiple detection results through multiple random background modelings, and these results are further refined into a final detection result through ensemble learning. The proposed ERCRD shows its advantages over many existing HAD methods, such as GRX [15], LRX [15], SSRX [18], CBAD [17], and LSMAD [34]. In comparison with the CRD [38] and its variants [39], [41], [42], we also validate that our ERCRD method attains comparable or better detection accuracy with much less running time.

The rest of this paper is arranged as follows. In Section II, we briefly review the traditional CRD and the recently proposed PCAroCRD. The proposed ERCRD approach is described in Section III. In Section IV, we conduct empirical studies on four real datasets to validate the accuracy and efficiency of our approach. Finally, we summarize this paper in Section V.

II. RELATED WORK
Let $\mathcal{X} \in \mathbb{R}^{d \times h \times w}$ denote the hyperspectral imagery, where $d$ is the number of spectral bands, and $h$ and $w$ are the height and width of the image, respectively. The three-dimensional (3-D) hyperspectral imagery $\mathcal{X}$ is transformed into a two-dimensional (2-D) matrix $X = [x_1, x_2, \cdots, x_n] \in \mathbb{R}^{d \times n}$, where $n = h \times w$ is the total number of pixels.
Fig. 1: The sliding dual window of the CRD.
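As a small illustration of the reshaping from the 3-D cube to the 2-D matrix $X$ described above, the following NumPy sketch may help. The function names `cube_to_matrix` and `scores_to_map` and the row-major pixel ordering are assumptions made for illustration; the paper does not prescribe a particular ordering.

```python
import numpy as np

def cube_to_matrix(cube):
    """Flatten a (d, h, w) hyperspectral cube into a (d, n) matrix, n = h * w.

    Pixels are ordered row by row (an assumed convention), so the pixel at
    spatial position (row, col) becomes column row * w + col.
    """
    d, h, w = cube.shape
    return cube.reshape(d, h * w)

def scores_to_map(scores, h, w):
    """Put n per-pixel anomaly scores back on the (h, w) image grid."""
    return np.asarray(scores).reshape(h, w)

# Tiny example: d = 2 bands, h = 3 rows, w = 4 columns.
cube = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
X = cube_to_matrix(cube)  # shape (2, 12)
```

Any detector that returns one score per column of `X` can then be visualized with `scores_to_map`, provided the same ordering is used in both directions.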
A. CRD
In this subsection, we review the collaborative representation-based detector (CRD) [38]. The CRD assumes that a background pixel can be well approximated by its spatial neighborhood pixels, whereas an anomaly cannot. For the pixel $x_i \in \mathbb{R}^{d \times 1}$, its spatial neighborhood pixels are selected by a sliding dual window, as shown in Fig. 1. The spatial neighborhood pixels are those outside the inner window and inside the outer window. The sizes of the outer window and the inner window are denoted by $w_{out}$ and $w_{in}$, respectively. The spatial neighborhood pixels are arranged into a two-dimensional matrix $X_s = [\bar{x}_1, \bar{x}_2, \cdots, \bar{x}_s] \in \mathbb{R}^{d \times s}$, where $s = w_{out} \times w_{out} - w_{in} \times w_{in}$ denotes the number of spatial neighborhood pixels. The objective function of the CRD is defined as [38]

$$\min_{\alpha_i} \|x_i - X_s \alpha_i\|_2^2 + \lambda \|\alpha_i\|_2^2, \tag{1}$$

where $\alpha_i \in \mathbb{R}^{s \times 1}$ denotes the weight vector and $\lambda$ is the regularization parameter. Taking the derivative w.r.t. $\alpha_i$ and setting it to zero, we have

$$\hat{\alpha}_i = (X_s^T X_s + \lambda I)^{-1} X_s^T x_i, \tag{2}$$

where $I$ is an identity matrix. The reconstruction error $r_i$ for the pixel $x_i$ is regarded as the anomaly score and can be computed by

$$r_i = \|x_i - X_s \hat{\alpha}_i\|_2. \tag{3}$$

If $r_i$ is greater than a threshold, the pixel $x_i$ is declared anomalous. The computational complexity of this step is $O(ds^2 + s^3 + ds + d)$. Each pixel $x_i$ needs its own surrounding pixel matrix $X_s$ from its own sliding dual window. Thus, the weight vector $\alpha_i$ and the anomaly score $r_i$ must be computed $n$ times, so the total computational complexity of the CRD is $O(nds^2 + ns^3 + nds + nd)$. This repeated process incurs a high computational burden, reduces the speed of anomaly detection, and limits the application of the CRD in real-time tasks.

B. PCAroCRD
The PCAroCRD method was recently proposed by Su et al. [41]; it applies PCA within the CRD to remove outliers. It has two versions: Global PCAroCRD and Local PCAroCRD.

The Global PCAroCRD first obtains the projection matrix $W_g \in \mathbb{R}^{n \times n}$ by solving the standard PCA model

$$\max_{W_g^T W_g = I} \operatorname{tr}(W_g^T X^T X W_g), \tag{4}$$

then the spatial-domain PCA transformation is represented as

$$\hat{X} = X W_g, \tag{5}$$

where $\hat{X} = [\hat{x}_1, \hat{x}_2, \cdots, \hat{x}_n] \in \mathbb{R}^{d \times n}$ denotes the transformed data matrix. For the pixel $x_i$, the objective function of the Global PCAroCRD can be written as follows [41]:

$$\min_{\alpha_i} \|x_i - X_m \alpha_i\|_2^2 + \lambda \|\Gamma_i^g \alpha_i\|_2^2, \tag{6}$$

where $X_m = [\hat{x}_1, \hat{x}_2, \cdots, \hat{x}_m] \in \mathbb{R}^{d \times m}$ denotes the first $m$ principal components of $\hat{X}$ and contains most of the information of $X$ in the spatial domain. $\Gamma_i^g$ denotes the Tikhonov regularization matrix and is defined as

$$\Gamma_i^g = \begin{bmatrix} \|x_i - \hat{x}_1\|_2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \|x_i - \hat{x}_m\|_2 \end{bmatrix}. \tag{7}$$

Similarly, taking the derivative w.r.t. $\alpha_i$ and setting it to zero, we have

$$\hat{\alpha}_i = (X_m^T X_m + \lambda \Gamma_i^{gT} \Gamma_i^g)^{-1} X_m^T x_i. \tag{8}$$

The reconstruction error $r_i$ can be computed by

$$r_i = \|x_i - X_m \hat{\alpha}_i\|_2. \tag{9}$$

If $r_i$ is greater than a threshold, the pixel $x_i$ is regarded as an anomaly. The total computational complexity of the Global PCAroCRD algorithm is $O(n^3 + ndm^2 + nm^3 + ndm + nd)$.

The Local PCAroCRD first selects the surrounding pixels for each pixel $x_i$ by a sliding dual window, exactly as in the CRD. Then, the spatial-domain PCA transformation is performed on the surrounding pixels $X_s \in \mathbb{R}^{d \times s}$:

$$\check{X} = X_s W_l, \tag{10}$$

where $\check{X} = [\check{x}_1, \check{x}_2, \cdots, \check{x}_s] \in \mathbb{R}^{d \times s}$ denotes the transformed data matrix. The projection matrix $W_l \in \mathbb{R}^{s \times s}$ is obtained by solving the PCA model

$$\max_{W_l^T W_l = I} \operatorname{tr}(W_l^T X_s^T X_s W_l). \tag{11}$$

For the pixel $x_i$, the objective function of the Local PCAroCRD can be written as follows [41]:

$$\min_{\alpha_i} \|x_i - X_k \alpha_i\|_2^2 + \lambda \|\Gamma_i^l \alpha_i\|_2^2, \tag{12}$$

where $X_k = [\check{x}_1, \check{x}_2, \cdots, \check{x}_k] \in \mathbb{R}^{d \times k}$ denotes the first $k$ principal components of $\check{X}$ and contains most of the information of $X_s$ in the spatial domain. $\Gamma_i^l$ denotes the Tikhonov regularization matrix and is defined as

$$\Gamma_i^l = \begin{bmatrix} \|x_i - \check{x}_1\|_2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \|x_i - \check{x}_k\|_2 \end{bmatrix}. \tag{13}$$

Similarly, taking the derivative w.r.t. $\alpha_i$ and setting it to zero, we have

$$\hat{\alpha}_i = (X_k^T X_k + \lambda \Gamma_i^{lT} \Gamma_i^l)^{-1} X_k^T x_i. \tag{14}$$

The reconstruction error $r_i$ can be computed by

$$r_i = \|x_i - X_k \hat{\alpha}_i\|_2. \tag{15}$$

If $r_i$ is greater than a threshold, the pixel $x_i$ is regarded as an anomaly. The total computational complexity of the Local PCAroCRD algorithm is $O(ns^3 + ndk^2 + nk^3 + ndk + nd)$.

III. THE PROPOSED METHOD
For HAD applications, computational efficiency is an important factor in assessing a detector. To reduce the computational complexity of the CRD, a novel collaborative representation-based detector using random background modeling and ensemble learning is proposed in this section.

Fig. 2: Random background modeling.
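To make the cost of the sliding dual window concrete before the proposed method is introduced, the following is a minimal NumPy sketch of the baseline CRD of Section II-A (Eqs. (1) to (3)). It is an illustrative reading, not the reference implementation: the function name `crd_scores`, the reflect padding at the image borders, and the default values of `w_in`, `w_out`, and `lam` are all assumptions.

```python
import numpy as np

def crd_scores(cube, w_in=3, w_out=5, lam=1e-6):
    """Baseline CRD sketch: one ridge regression per pixel.

    cube : (d, h, w) array. Returns an (h, w) map of reconstruction errors.
    Border handling by reflect padding is an assumption of this sketch.
    """
    d, h, w = cube.shape
    r_out, r_in = w_out // 2, w_in // 2
    padded = np.pad(cube, ((0, 0), (r_out, r_out), (r_out, r_out)), mode="reflect")

    # True for positions inside the outer window but outside the inner window.
    mask = np.ones((w_out, w_out), dtype=bool)
    mask[r_out - r_in:r_out + r_in + 1, r_out - r_in:r_out + r_in + 1] = False

    scores = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            x = cube[:, i, j]
            window = padded[:, i:i + w_out, j:j + w_out]
            Xs = window[:, mask]                       # (d, s) neighborhood matrix
            G = Xs.T @ Xs + lam * np.eye(Xs.shape[1])  # X_s^T X_s + lambda I
            alpha = np.linalg.solve(G, Xs.T @ x)       # Eq. (2)
            scores[i, j] = np.linalg.norm(x - Xs @ alpha)  # Eq. (3)
    return scores
```

The double loop builds and solves a separate $s \times s$ system for every pixel, which is the source of the factor $n$ in the total complexity of the CRD.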
A. Random CRD
The main idea of the CRD is the hypothesis that a background pixel can be approximated by a linear combination of background pixels, whereas an anomalous pixel cannot. In the CRD, the background pixels for each pixel are represented by its surrounding pixels, which are selected by a sliding dual window centered at this pixel.
Different from the sliding dual window in the CRD, we select the background pixels for each pixel by random background modeling, which is accomplished by random sampling from the whole hyperspectral image scene, as shown in Fig. 2.

The background pixels obtained by random sampling are arranged into a matrix $X_r = [\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_r] \in \mathbb{R}^{d \times r}$. In the CRD, every pixel has its own surrounding pixels, which differ from pixel to pixel. Unlike the CRD, we use the same matrix $X_r$ for every pixel in our model. Therefore, the objective function of the proposed Random CRD is written as

$$\min_{A} \|X - X_r A\|_F^2 + \lambda \|A\|_F^2, \tag{16}$$

where $A \in \mathbb{R}^{r \times n}$ denotes the weight matrix and $\lambda$ denotes the regularization parameter. Taking the derivative w.r.t. $A$ and setting it to zero, we have

$$A = (X_r^T X_r + \lambda I)^{-1} X_r^T X. \tag{17}$$

Thus, the matrix $X$ can be reconstructed by the matrix $X_r A$. The reconstruction error for the pixel $x_i$ is regarded as its anomaly score and can be obtained by

$$\delta_i = \|x_i - X_r a_i\|_2, \tag{18}$$

where $x_i$ and $a_i$ denote the $i$-th columns of $X$ and $A$, respectively. If $\delta_i$ is larger than a threshold, the pixel $x_i$ is declared an anomaly. The computational complexity of the proposed Random CRD is $O(ndr + nd + dr^2 + r^3)$. The detailed process is given in Algorithm 1.

Algorithm 1 Random CRD
Input: The two-dimensional HSI matrix $X$, the number of random samples $r$.
1. Randomly select $r$ pixels from the matrix $X$ and arrange these pixels into the matrix $X_r$.
2. Calculate the weight matrix $A$ by Eq. (17).
3. Obtain the anomaly score $\delta_i$ for each pixel $x_i$ by Eq. (18).
Output: The anomaly scores for all pixels.
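The three steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `random_crd`, sampling without replacement, the default `lam`, and the `rng` argument are choices made here, since the text only says that $r$ pixels are selected at random.

```python
import numpy as np

def random_crd(X, r, lam=1e-6, rng=None):
    """Random CRD sketch (Algorithm 1).

    X   : (d, n) matrix with one pixel spectrum per column.
    r   : number of randomly sampled background pixels.
    lam : ridge regularization parameter (assumed default).
    rng : seed or numpy Generator (assumed interface).
    Returns the n reconstruction errors delta_i of Eq. (18).
    """
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    idx = rng.choice(n, size=r, replace=False)  # step 1 (without replacement: assumption)
    Xr = X[:, idx]                              # (d, r) shared background matrix
    G = Xr.T @ Xr + lam * np.eye(r)             # X_r^T X_r + lambda I
    A = np.linalg.solve(G, Xr.T @ X)            # step 2, Eq. (17), all pixels at once
    return np.linalg.norm(X - Xr @ A, axis=0)   # step 3, Eq. (18)
```

Because $X_r$ is shared by all pixels, a single $r \times r$ system is factorized once and reused for every column of $X$, in contrast to the $n$ separate systems of the CRD. Using `np.linalg.solve` instead of forming the explicit inverse is a numerical design choice of this sketch and gives the same $A$ as Eq. (17).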
B. Ensemble and Random CRD
As an anomaly detector ensemble that employs Random CRDs, the proposed Ensemble and Random CRD (ERCRD) has multiple Random CRDs acting as ‘experts’ to detect different anomalies. Through multiple Random CRDs, the anomaly score for the pixel $x_i$ is obtained by

$$\gamma_i = \sum_{t=1}^{T} \delta_i^t, \tag{19}$$

where $\delta_i^t$ is the anomaly score of $x_i$ given by the $t$-th Random CRD and $T$ denotes the ensemble size of the Random CRDs. Therefore, the total computational complexity of the proposed ERCRD is $O(ndrT + ndT + dr^2T + r^3T)$. The details of the ERCRD are given in Algorithm 2.

Algorithm 2 Ensemble and Random CRD (ERCRD)
Input: The two-dimensional HSI matrix $X$, the number of random samples $r$, and the ensemble size of the Random CRDs $T$.
1. Repeat the Random CRD (Algorithm 1) $T$ times.
2. Obtain the anomaly score $\gamma_i$ for each pixel $x_i$ by Eq. (19).
Output: The anomaly scores for all pixels.

IV. EXPERIMENTAL RESULTS
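As a concrete reference for the detector evaluated in this section, Algorithm 2 can be sketched as follows. The sketch is self-contained and hedged: the names `random_crd_scores` and `ercrd`, the `seed` argument, the default `lam`, and sampling without replacement are assumptions for illustration, and the scores are summed over the $T$ runs as in Eq. (19).

```python
import numpy as np

def random_crd_scores(X, r, lam, rng):
    """One Random CRD run: reconstruction error of every column of X."""
    idx = rng.choice(X.shape[1], size=r, replace=False)  # assumed: no replacement
    Xr = X[:, idx]
    A = np.linalg.solve(Xr.T @ Xr + lam * np.eye(r), Xr.T @ X)
    return np.linalg.norm(X - Xr @ A, axis=0)

def ercrd(X, r, T, lam=1e-6, seed=None):
    """ERCRD sketch (Algorithm 2): sum the scores of T Random CRDs.

    X : (d, n) pixel matrix, r : samples per run, T : ensemble size.
    """
    rng = np.random.default_rng(seed)
    gamma = np.zeros(X.shape[1])
    for _ in range(T):
        # Each run draws a fresh random background matrix X_r.
        gamma += random_crd_scores(X, r, lam, rng)
    return gamma
```

A pixel that happens to be drawn into $X_r$ reconstructs itself almost perfectly in that run, so a single Random CRD can miss it; summing over several independent draws is one way to read why the ensemble is more stable than a single run.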
To explore the detection performance of our ERCRD method, we conduct several experiments on a PC with E - v @ . GHz and GB RAM, MATLAB b. We use four hyperspectral datasets obtained from different scenes, which are described as follows:

1) AVIRIS-I Dataset: This dataset was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) from San Diego, CA, USA, with a spatial resolution of . m per pixel and a spectral resolution of nm. This dataset has spectral bands with wavelengths ranging from to , nm. After removing the bands with water absorption, low signal-to-noise ratio, and poor quality ( − , − , , − , − , and − ), bands are retained in this experiment. The size of the entire image scene is × pixels, from which we select a × pixels area in the upper left corner for testing and mark it as AVIRIS-I. The three airplanes in the image, which consist of pixels, are considered to be the anomalies to be detected.
2) AVIRIS-II Dataset: This dataset was derived from [46]. As with the above dataset, we select a × pixels area at the center of the San Diego image for testing and mark it as AVIRIS-II. The three airplanes, which consist of pixels in the image, are considered to be anomalies.
3) AVIRIS-III Dataset: This dataset was obtained from [56]. Again, we select a × pixels area in the upper left of the San Diego image for testing and mark it as AVIRIS-III. The six airplanes, which consist of pixels in the image, are considered to be anomalies.
4) Cri Dataset: This dataset was derived from [34] and collected by the Nuance Cri hyperspectral sensor, with a spectral resolution of nm. This dataset has a size of × pixels and spectral bands with wavelengths ranging from to , nm. The ten rocks, which consist of , pixels in the image, are considered to be anomalies.

Note that Fig. 3a and 3b respectively present the false color image and the corresponding ground truth map of the AVIRIS-I dataset. In the same way, Fig. 3c and 3d correspond to the AVIRIS-II dataset, Fig. 3e and 3f to the AVIRIS-III dataset, and Fig. 3g and 3h to the Cri dataset.

In addition, the color detection map is used as the qualitative evaluation metric in our experiments. The receiver operating characteristic (ROC) curve [57], the area under the ROC curve (AUC), the normalized background-anomaly separation map, and the running time are used as quantitative evaluation metrics. The ROC curve reflects the relation between the detection probability (DP) and the false alarm rate (FAR) at thresholds ranging from to , based on the ground truth. An excellent detector usually has a high DP value at the same FAR value, so its ROC curve lies close to the upper left corner and the area under the curve is larger. The area enclosed by the ROC curve and the false alarm rate axis is called the AUC. The normalized background-anomaly separation map describes the normalized anomaly score distributions of the background and anomalous pixels, each represented by a box plot. Generally speaking, a good method should have a high AUC value and a distinct gap between the background box and the anomaly box.

A. Detection Performance
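The ROC and AUC metrics used throughout this subsection can be computed as in the following sketch. It is an illustration under assumptions, not the evaluation code behind the reported figures: the function name `roc_auc`, the convention that a pixel is flagged when its score is at least the threshold, the use of every distinct score as a threshold, and trapezoidal integration are choices made here.

```python
import numpy as np

def roc_auc(scores, truth):
    """ROC curve and AUC from anomaly scores and a binary ground-truth map.

    scores : array of anomaly scores (any shape).
    truth  : array of the same shape, nonzero/True for anomalous pixels.
    Returns (far, dp, auc): false alarm rates, detection probabilities, area.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()

    order = np.argsort(-scores, kind="stable")  # descending scores
    s, t = scores[order], truth[order]

    # Last index of every run of equal scores, so ties share one threshold.
    last = np.r_[np.nonzero(np.diff(s))[0], s.size - 1]
    tp = np.cumsum(t)[last]    # anomalies flagged at each threshold
    fp = np.cumsum(~t)[last]   # background pixels flagged at each threshold

    dp = np.r_[0.0, tp / t.sum()]
    far = np.r_[0.0, fp / (~t).sum()]
    auc = float(np.sum(np.diff(far) * (dp[1:] + dp[:-1]) / 2.0))  # trapezoid rule
    return far, dp, auc

# Two anomalies (scores 0.9, 0.3) and two background pixels (0.8, 0.1):
far, dp, auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this toy example three of the four anomaly/background pairs are ranked correctly, which matches an area of 0.75 under the curve.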
In this subsection, we carry out two experiments to verify the performance of our ERCRD method. Firstly, our ERCRD method is compared with five state-of-the-art methods. Since the proposed ERCRD method is a variant of the CRD, the CRD and four representative variants of the CRD are subsequently compared.

In the first experiment, to evaluate detection performance, we compare the proposed ERCRD method with five state-of-the-art methods: GRX [15], LRX [15], SSRX [18], CBAD [17], and LSMAD [34]. Among them, GRX is known as the benchmark anomaly detector for HSI. LRX, SSRX, and CBAD are three representative improved versions of RX. LSMAD is a typical low-rank and sparse matrix decomposition-based detector with remarkable detection performance. Because the detection performance of LRX is sensitive to the window sizes, we choose the inner window size $w_{in}$ ranging from to and the outer window size $w_{out}$ ranging from to . Moreover, we set the parameters of SSRX, CBAD, and LSMAD in accordance with earlier work [17], [18], [34].

The color detection maps of the different methods on the AVIRIS-I, AVIRIS-II, AVIRIS-III, and Cri datasets are presented in Fig. 4, Fig. 6, Fig. 8, and Fig. 10, respectively.

For the AVIRIS-I dataset, our ERCRD method is able to identify the locations of the three airplanes but fails to precisely depict their shapes. The GRX, LRX, SSRX, CBAD, and LSMAD methods not only fail to detect the anomalies but also misidentify several normal background pixels as anomalies. Moreover, the ROC curves, the corresponding AUC values, and the normalized background-anomaly separation maps are displayed in Fig. 5. It can be observed that the curve of the ERCRD method is closer to the upper left corner than the others and its AUC value is . , which is larger than the others. The separation gap for the proposed ERCRD method is also larger than those for the other methods, which indicates that the ERCRD method achieves the best separation result.
Moreover, the LSMAD, SSRX, and CBAD methods obtain relatively better separation capacity, while the GRX and LRX methods show unsatisfactory separation capacity.

Fig. 3: Image scene descriptions. (a) False color image of AVIRIS-I dataset. (b) Ground truth map of AVIRIS-I dataset. (c) False color image of AVIRIS-II dataset. (d) Ground truth map of AVIRIS-II dataset. (e) False color image of AVIRIS-III dataset. (f) Ground truth map of AVIRIS-III dataset. (g) False color image of Cri dataset. (h) Ground truth map of Cri dataset.

Fig. 4: Color detection maps obtained by different algorithms for AVIRIS-I dataset. (a) GRX. (b) LRX. (c) SSRX. (d) CBAD. (e) LSMAD. (f) ERCRD.

Fig. 5: Detection accuracy evaluation for AVIRIS-I dataset. (a) ROC curves (detection probability versus false alarm rate). (b) AUC values. (c) Normalized background-anomaly separation map.

For the AVIRIS-II dataset, the LRX and CBAD methods still cannot separate the anomalies from the background and even misidentify some normal pixels as anomalies. The GRX, SSRX, LSMAD, and ERCRD methods can identify the locations of the three airplanes, but their shapes are fuzzy and some false anomalies are also detected. The ROC curves, corresponding AUC values, and normalized background-anomaly separation maps of the different methods are given in Fig. 7. It can be concluded that the ROC curve of the ERCRD method is closer to the upper left corner than the others, and its AUC value is . , which is also larger than the others. Here, the proposed ERCRD method still achieves larger separation gaps, while the separation capabilities of the GRX, SSRX, CBAD, and LSMAD methods are slightly poorer. Compared with the above methods, the LRX method performs relatively unsatisfactorily.

Fig. 6: Color detection maps obtained by different algorithms for AVIRIS-II dataset. (a) GRX. (b) LRX. (c) SSRX. (d) CBAD. (e) LSMAD. (f) ERCRD.

Fig. 7: Detection accuracy evaluation for AVIRIS-II dataset. (a) ROC curves (detection probability versus false alarm rate). (b) AUC values. (c) Normalized background-anomaly separation map.

For the AVIRIS-III dataset, the proposed ERCRD method is able to identify the locations of the six airplanes, but some anomalous pixels are missing and several normal pixels are misidentified. Unfortunately, the other methods cannot detect the anomalies effectively. In Fig. 9a, the ROC curves indicate that the proposed ERCRD method obtains a higher detection probability than the others. The AUC values of all methods are illustrated in Fig. 9b; these values indicate that the proposed ERCRD method achieves the best detection results among all the compared methods. Fig. 9c presents the separation maps for this dataset. Here, the proposed ERCRD method still achieves larger separation gaps. Moreover, the separation abilities of the other methods are considerably poorer, since their separation gaps are narrower than that of the ERCRD method.

Fig. 8: Color detection maps obtained by different algorithms for AVIRIS-III dataset. (a) GRX. (b) LRX. (c) SSRX. (d) CBAD. (e) LSMAD. (f) ERCRD.

Fig. 9: Detection accuracy evaluation for AVIRIS-III dataset. (a) ROC curves (detection probability versus false alarm rate). (b) AUC values. (c) Normalized background-anomaly separation map.

For the Cri dataset, the SSRX, LSMAD, and proposed ERCRD methods can effectively detect the locations and clear shapes of the ten rocks, while the GRX, LRX, and CBAD methods can only detect the positions of several anomalous pixels, and the shapes of some rocks are missing. It can be seen from Fig. 11 that the proposed ERCRD method shows a high detection ability with a low false alarm rate and a high AUC value, and achieves larger separation gaps than the other methods.

Fig. 10: Color detection maps obtained by different algorithms for Cri dataset. (a) GRX. (b) LRX. (c) SSRX. (d) CBAD. (e) LSMAD. (f) ERCRD.

Fig. 11: Detection accuracy evaluation for Cri dataset. (a) ROC curves (detection probability versus false alarm rate). (b) AUC values. (c) Normalized background-anomaly separation map.

In addition, the running times for the four datasets are displayed in Table I. The GRX and SSRX methods are very fast compared to the other methods. It is noteworthy that the running times of the proposed ERCRD and CBAD methods are similar to those of the GRX and SSRX methods; meanwhile, the proposed ERCRD method also achieves excellent detection performance. Thus, the proposed ERCRD method can be easily utilized in real-time applications. Furthermore, the LRX and LSMAD methods are more time-consuming than the other methods.

TABLE I: Running time (seconds)

Dataset       GRX     LRX      SSRX    CBAD    LSMAD   ERCRD
AVIRIS-I      .27     140.75   0.21    0.41    20.28   0.
AVIRIS-II     .15     98.04    0.14    0.23    14.50   0.
AVIRIS-III    .52     443.55   0.67    3.56    73.99   2.
Cri           .79     192.86   0.98    2.76    55.01   1.

In the second experiment, the detection performance of the ERCRD method is assessed and compared with the CRD [38] and four representative variants of the CRD: Global PCAroCRD [41], Local PCAroCRD [41], MCRD [39], and RCRD [42]. The detection performance of the CRD, Local PCAroCRD, MCRD, and RCRD methods is sensitive to the inner window size $w_{in}$ and the outer window size $w_{out}$. Thus, we employ four window sizes $(w_{in}, w_{out})$: (5, 9), (7, 11), (9, 13), and (11, 15). The regularization parameter $\lambda$ of these six methods is set to − . The AUC values and the corresponding running times of these six methods are displayed in Table II and Table III, respectively.

TABLE II: AUC values of the CRD, Local PCAroCRD, MCRD, and RCRD (each with window sizes (5, 9), (7, 11), (9, 13), and (11, 15)), Global PCAroCRD, and ERCRD on the AVIRIS-I, AVIRIS-II, AVIRIS-III, and Cri datasets.

TABLE III: Running time (seconds)

Method             Window      AVIRIS-I   AVIRIS-II   AVIRIS-III   Cri
CRD                (5, 9)      19.31      13.11       63.92        186.
CRD                (7, 11)     25.12      18.53       88.05        264.
CRD                (9, 13)     33.21      23.43       119.41       360.
CRD                (11, 15)    44.66      31.01       153.33       434.
Local PCAroCRD     (5, 9)      11.39      7.72        39.79        101.
Local PCAroCRD     (7, 11)     15.61      10.48       52.32        131.
Local PCAroCRD     (9, 13)     20.97      13.89       70.89        186.
Local PCAroCRD     (11, 15)    26.82      18.12       91.24        214.
MCRD               (5, 9)      58.45      52.14       249.23       816.
MCRD               (7, 11)     82.64      72.11       311.20       1195.
MCRD               (9, 13)     115.99     91.41       463.19       1667.
MCRD               (11, 15)    167.34     114.62      568.85       2051.
RCRD               (5, 9)      12.55      8.43        42.65        99.
RCRD               (7, 11)     15.88      10.56       51.18        123.
RCRD               (9, 13)     17.47      11.65       56.84        145.
RCRD               (11, 15)    23.72      16.15       79.23        224.
Global PCAroCRD                .90        37.37       115.89       809.
ERCRD                          .77        0.79        2.45         1.

For the AVIRIS-I dataset, the AUC value of the ERCRD method is . , only smaller than that of MCRD with window sizes (11, 15) and larger than that of the other methods. The running time of ERCRD is . s, which is much lower than that of the other methods. The running time of MCRD with window sizes (11, 15) is 167.34 s, which is much higher than that of the ERCRD method. For the AVIRIS-II dataset, the AUC value of ERCRD is . , higher than that of the other methods. The running time of ERCRD is 0.79 s, which is much lower than that of the other methods. For the AVIRIS-III dataset, the AUC value of ERCRD is . , smaller than that of CRD with window sizes (9, 13) and (11, 15), Local PCAroCRD with window sizes (11, 15), MCRD with window sizes (9, 13) and (11, 15), and RCRD with window sizes (11, 15). Note that the running time of the ERCRD method is 2.45 s, which is much lower than the others. For the Cri dataset, the AUC value of the ERCRD method is . , higher than that of the other methods. The running time of the ERCRD method is . s, which is much lower than that of the other methods.
B. Parameter Analysis and Discussion
There are two parameters in the proposed ERCRD method: the number of random samples $r$ and the ensemble size $T$. We conduct a parameter analysis by comparing the AUC value and running time on the four datasets.

Fig. 12 presents the impact of the random sampling number $r$ on the detection performance and running time for each dataset. The value of $r$ is varied in the range [1 , , · · · , and the ensemble size $T$ is set to . It can be seen from Fig. 12a that the AUC values on the AVIRIS-I and AVIRIS-II datasets fluctuate in a small range from . to . , while on the AVIRIS-III dataset they fluctuate in a slightly larger range from . to . , and on the Cri dataset within a relatively larger range from . to . . Fig. 12b shows that the running time on each dataset is nearly stable as the random sampling number $r$ increases.

Then the impact of the ensemble size $T$ on the four datasets is analyzed. The value of $T$ is varied within the range [5 , , · · · , and the number of random samples $r$ is set to . In Fig. 13a, we see that the AUC values on the AVIRIS-I and AVIRIS-II datasets are nearly stable, while those on the AVIRIS-III and Cri datasets increase at first and then fluctuate on a small scale. Fig. 13b illustrates that as the ensemble size $T$ increases, the running time on each dataset increases almost linearly.

V. CONCLUSION
In this paper, we proposed a novel ensemble and random collaborative representation-based detector (ERCRD) for hyperspectral imagery by exploring random background modeling and ensemble learning. Firstly, the random CRD is built on random background modeling, which greatly reduces the computational complexity compared with that of the original CRD. Then, multiple random CRDs act as ‘experts’ to detect different anomalies by using ensemble learning. Experimental results on four real hyperspectral datasets validate that our method outperforms its counterparts in terms of detection accuracy and running time.
REFERENCES
[1] G. Shaw and D. Manolakis, “Signal processing for hyperspectral image exploitation,”
IEEE Signal Process. Mag., vol. 19, no. 1, pp. 12–16, Jan. 2002.
[2] D. W. J. Stein, S. G. Beaven, L. E. Hoff, E. M. Winter, A. P. Schaum, and A. D. Stocker, “Anomaly detection from hyperspectral imagery,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 58–69, Jan. 2002.
[3] C.-I. Chang and S.-S. Chiang, “Anomaly detection and classification for hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 40, no. 6, pp. 1314–1325, Jun. 2002.
[4] L. Zhang, Q. Zhang, B. Du, X. Huang, Y. Y. Tang, and D. Tao, “Simultaneous spectral-spatial feature selection and extraction for hyperspectral images,” IEEE Trans. Cybern., vol. 48, no. 1, pp. 16–28, 2018.
[5] F. Luo, B. Du, L. Zhang, L. Zhang, and D. Tao, “Feature learning using spatial-spectral hypergraph discriminant analysis for hyperspectral image,” IEEE Trans. Cybern., vol. 49, no. 7, pp. 2406–2419, 2019.
[6] J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot, “Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 2, pp. 354–379, Apr. 2012.
[7] R. Wang, F. Nie, and W. Yu, “Fast spectral clustering with anchor graph for large hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 11, pp. 2003–2007, Nov. 2017.
[8] R. Wang, F. Nie, Z. Wang, F. He, and X. Li, “Scalable graph-based clustering with nonnegative relaxation for large hyperspectral image,”
IEEE Trans. Geosci. Remote Sens., vol. 57, no. 10, pp. 7352–7364, Oct. 2019.
[9] S. Liu, D. Marinelli, L. Bruzzone, and F. Bovolo, “A review of change detection in multitemporal hyperspectral images: Current techniques, applications, and challenges,” IEEE Geosci. Remote Sens. Mag., vol. 7, no. 2, pp. 140–158, Jun. 2019.
[10] Y. Yuan, D. Ma, and Q. Wang, “Hyperspectral anomaly detection by graph pixel selection,” IEEE Trans. Cybern., vol. 46, no. 12, pp. 3123–3134, 2016.
[11] L. Li, W. Li, Q. Du, and R. Tao, “Low-rank and sparse decomposition with mixture of Gaussian for hyperspectral anomaly detection,” IEEE Trans. Cybern., pp. 1–10, 2020.
[12] N. M. Nasrabadi, “Hyperspectral target detection: An overview of current and future challenges,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 34–44, Jan. 2014.
[13] M. Huber-Lerner, O. Hadar, S. R. Rotman, and R. Huber-Shalem, “Hyperspectral band selection for anomaly detection: The role of data Gaussianity,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 2, pp. 732–743, Feb. 2016.
[14] L. Wang, C. Chang, L. Lee, Y. Wang, B. Xue, M. Song, C. Yu, and S. Li, “Band subset selection for anomaly detection in hyperspectral imagery,”
IEEE Trans. Geosci. Remote Sens., vol. 55, no. 9, pp. 4887–4898, Sep. 2017.
[15] I. S. Reed and X. Yu, “Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution,” IEEE Trans. Acoust., Speech, Signal Process., vol. 38, no. 10, pp. 1760–1770, Oct. 1990.
[16] H. Kwon and N. M. Nasrabadi, “Kernel RX-algorithm: A nonlinear anomaly detector for hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 2, pp. 388–397, Feb. 2005.
[17] M. J. Carlotto, “A cluster-based approach for detecting man-made objects and changes in imagery,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 2, pp. 374–387, Feb. 2005.
[18] A. P. Schaum, “Hyperspectral anomaly detection beyond RX,” in Proc. SPIE, vol. 6565, May 2007.
[19] J. M. Molero, E. M. Garzón, I. García, and A. Plaza, “Analysis and optimizations of global and local versions of the RX algorithm for anomaly detection in hyperspectral data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 2, pp. 801–814, Apr. 2013.
[20] W. Liu and C. Chang, “Multiple-window anomaly detection for hyperspectral imagery,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 2, pp. 644–658, Apr. 2013.
[21] Q. Guo, B. Zhang, Q. Ran, L. Gao, J. Li, and A. Plaza, “Weighted-RXD and linear filter-based RXD: Improving background statistics estimation for anomaly detection in hyperspectral imagery,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2351–2366, Jun. 2014.
[22] M. Imani, “RX anomaly detector with rectified background,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 8, pp. 1313–1317, Aug. 2017.
[23] C. Zhao, X. Yao, and Y. Yan, “Modified kernel RX algorithm based on background purification and inverse-of-matrix-free calculation,”
IEEE Geosci. Remote Sens. Lett., vol. 14, no. 4, pp. 544–548, Apr. 2017.
[24] C. Zhao and Y. Xi-Feng, “Fast real-time kernel RX algorithm based on Cholesky decomposition,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 11, pp. 1760–1764, Nov. 2018.
[25] J. Zhou, C. Kwan, B. Ayhan, and M. T. Eismann, “A novel cluster kernel RX algorithm for anomaly and change detection using hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 11, pp. 6497–6504, Nov. 2016.
[26] S. Chang, B. Du, and L. Zhang, “BASO: A background-anomaly component projection and separation optimized filter for anomaly detection in hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 7, pp. 3747–3761, Jul. 2018.
[27] Y. Chen, N. M. Nasrabadi, and T. D. Tran, “Sparse representation for target detection in hyperspectral imagery,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 3, pp. 629–640, Jun. 2011.
[28] J. Li, H. Zhang, L. Zhang, and L. Ma, “Hyperspectral anomaly detection by the use of background joint sparse representation,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2523–2533, Jun. 2015.
[29] Y. Zhang, B. Du, Y. Zhang, and L. Zhang, “Spatially adaptive sparse representation for target detection in hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 11, pp. 1923–1927, Nov. 2017.
[30] R. Zhao, B. Du, and L. Zhang, “Hyperspectral anomaly detection via a sparsity score estimation framework,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 6, pp. 3208–3222, Jun. 2017.
[31] F. Li, X. Zhang, L. Zhang, D. Jiang, and Y. Zhang, “Exploiting structured sparsity for hyperspectral anomaly detection,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 7, pp. 4050–4064, Jul. 2018.
[32] Q. Ling, Y. Guo, Z. Lin, and W. An, “A constrained sparse representation model for hyperspectral anomaly detection,”
IEEE Trans. Geosci. Remote Sens., vol. 57, no. 4, pp. 2358–2371, Apr. 2019.
Fig. 12: Effect of the number of random sampling r on each dataset (AVIRIS-I, AVIRIS-II, AVIRIS-III, and Cri). (a) AUC values. (b) Running time (s).
Fig. 13: Effect of the ensemble size T on each dataset (AVIRIS-I, AVIRIS-II, AVIRIS-III, and Cri). (a) AUC values. (b) Running time (s).
[33] W. Sun, C. Liu, J. Li, Y. M. Lai, and W. Li, “Low-rank and sparse matrix decomposition-based anomaly detection for hyperspectral imagery,” J. Appl. Remote Sens., vol. 8, no. 1, pp. 1–18, 2014.
[34] Y. Zhang, B. Du, L. Zhang, and S. Wang, “A low-rank and sparse matrix decomposition-based Mahalanobis distance method for hyperspectral anomaly detection,”
IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1376–1389, Mar. 2016.
[35] Y. Xu, Z. Wu, J. Li, A. Plaza, and Z. Wei, “Anomaly detection in hyperspectral images based on low-rank and sparse representation,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 4, pp. 1990–2000, Apr. 2016.
[36] Y. Qu, W. Wang, R. Guo, B. Ayhan, C. Kwan, S. Vance, and H. Qi, “Hyperspectral anomaly detection through spectral unmixing and dictionary-based low-rank decomposition,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 4391–4405, Aug. 2018.
[37] B. Madathil and S. N. George, “Simultaneous reconstruction and anomaly detection of subsampled hyperspectral images using l1/2 regularized joint sparse and low-rank recovery,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 5190–5197, Jul. 2019.
[38] W. Li and Q. Du, “Collaborative representation for hyperspectral anomaly detection,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3, pp. 1463–1474, Mar. 2015.
[39] M. Imani, “Anomaly detection using morphology-based collaborative representation in hyperspectral imagery,” European Journal of Remote Sensing, vol. 51, no. 1, pp. 457–471, 2018.
[40] M. Vafadar and H. Ghassemian, “Anomaly detection of hyperspectral imagery using modified collaborative representation,”
IEEE Geosci. Remote Sens. Lett., vol. 15, no. 4, pp. 577–581, Apr. 2018.
[41] H. Su, Z. Wu, Q. Du, and P. Du, “Hyperspectral anomaly detection using collaborative representation with outlier removal,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 12, pp. 5029–5038, Dec. 2018.
[42] N. Ma, Y. Peng, and S. Wang, “A fast recursive collaboration representation anomaly detector for hyperspectral image,” IEEE Geosci. Remote Sens. Lett., vol. 16, no. 4, pp. 588–592, Apr. 2019.
[43] T. Cheng and B. Wang, “Graph and total variation regularized low-rank representation for hyperspectral anomaly detection,” IEEE Trans. Geosci. Remote Sens., vol. 58, no. 1, pp. 391–406, Jan. 2020.
[44] A. Banerjee, P. Burlina, and C. Diehl, “A support vector method for anomaly detection in hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 8, pp. 2282–2291, Aug. 2006.
[45] W. Sakla, A. Chan, J. Ji, and A. Sakla, “An SVDD-based algorithm for target detection in hyperspectral imagery,”
IEEE Geosci. Remote Sens. Lett., vol. 8, no. 2, pp. 384–388, Mar. 2011.
[46] X. Kang, X. Zhang, S. Li, K. Li, J. Li, and J. A. Benediktsson, “Hyperspectral anomaly detection with attribute and edge-preserving filters,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 10, pp. 5600–5611, Oct. 2017.
[47] S. Li, K. Zhang, Q. Hao, P. Duan, and X. Kang, “Hyperspectral anomaly detection with multiscale attribute and edge-preserving filters,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 10, pp. 1605–1609, Oct. 2018.
[48] A. Taghipour and H. Ghassemian, “Hyperspectral anomaly detection using attribute profiles,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 7, pp. 1136–1140, Jul. 2017.
[49] X. Zhang, G. Wen, and W. Dai, “A tensor decomposition-based anomaly detection algorithm for hyperspectral image,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 5801–5820, Oct. 2016.
[50] Y. Xu, Z. Wu, J. Chanussot, and Z. Wei, “Joint reconstruction and anomaly detection from compressive hyperspectral images using Mahalanobis distance-regularized tensor RPCA,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 5, pp. 2919–2930, May 2018.
[51] X. Zhang and G. Wen, “A fast and adaptive method for determining k1, k2, and k3 in the tensor decomposition-based anomaly detection algorithm,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 1, pp. 3–7, Jan. 2018.
[52] W. Xie, T. Jiang, Y. Li, X. Jia, and J. Lei, “Structure tensor and guided filtering-based algorithm for hyperspectral anomaly detection,”
IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 4218–4230, Jul. 2019.
[53] W. Li, G. Wu, and Q. Du, “Transferred deep learning for anomaly detection in hyperspectral imagery,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 5, pp. 597–601, May 2017.
[54] C. Zhao, X. Li, and H. Zhu, “Hyperspectral anomaly detection based on stacked denoising autoencoders,” J. Appl. Remote Sens., vol. 11, no. 4, p. 042605, 2017.
[55] N. Ma, Y. Peng, S. Wang, and P. H. W. Leong, “An unsupervised deep hyperspectral anomaly detector,” Sensors, vol. 18, no. 3, p. 693, 2018.
[56] Y. Zhang, B. Du, and L. Zhang, “A sparse representation-based binary hypothesis model for target detection in hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3, pp. 1346–1354, Mar. 2015.
[57] J. Kerekes, “Receiver operating characteristic curve confidence intervals and regions,”