Quickest Eigenvalue-Based Spectrum Sensing using Random Matrix Theory
11 Quickest Eigenvalue-Based Spectrum Sensing usingRandom Matrix Theory
Martijn Arts, Andreas Bollig and Rudolf MatharInstitute for Theoretical Information Technology, RWTH Aachen University, D-52074 Aachen, GermanyE-mail: { arts, bollig, mathar } @ti.rwth-aachen.de Abstract —We investigate the potential of quickest detectionbased on the eigenvalues of the sample covariance matrix forspectrum sensing applications. A simple phase shift keying (PSK)model with additive white Gaussian noise (AWGN), with primary user (PU) and K secondary users (SUs) is considered.Under both detection hypotheses H (noise only) and H (signal+ noise) the eigenvalues of the sample covariance matrix followWishart distributions. For the case of K = 2 SUs, we derivean analytical formulation of the probability density function(PDF) of the maximum-minimum eigenvalue (MME) detectorunder H . Utilizing results from the literature under H , weinvestigate two detection schemes. First, we calculate the receiveroperator characteristic (ROC) for MME block detector basedon analytical results. Second, we introduce two eigenvalue-based quickest detection algorithms: a cumulative sum (CUSUM)algorithm, when the signal-to-noise ratio (SNR) of the PU signalis known and an algorithm using the generalized likelihood ratio,in case the SNR is unknown. Bounds on the mean time to false-alarm τ fa and the mean time to detection τ d are given for theCUSUM algorithm. Numerical simulations illustrate the potentialadvantages of the quickest detection approach over the blockdetection scheme. I. I
NTRODUCTION M ODERN telecommunication systems face ever-increasing demands on data rates in order to supporta growing number of mobile applications, e.g., mobile audioand video streaming. Since unlicensed wireless frequencyspectrum is already very rare, it has been proposed to reuseparts of the spectrum which are already licensed. This isdue to the fact that large parts of the spectrum are not inuse in certain geographical locations or at certain pointsin time. The term dynamic spectrum access groups effortsto make use of such spectrum holes in order to increasebandwidth, see [1] for an overview. One subgroup of researchfocuses on opportunistic spectrum access, in which theunlicensed users, called secondary users (SUs), determine thepresence of licensed primary users (PUs) and based on thesefindings decide whether to start unlicensed communicationin an autonomous fashion. Obviously, a major challenge is
This work was partly supported by the Deutsche Forschungsgemeinschaft(DFG) projects CoCoSa (grant MA 1184/26-1).This work has been submitted to the IEEE for possible publication.Copyright may be transferred without notice, after which this version mayno longer be accessible.c (cid:13) the reliable detection of transmission opportunities for theSUs, such that ideally no or only a negligible amount ofinterference is caused for PUs.Detecting spectrum holes is known as spectrum sensing inthe literature and there have been a number of publicationson detection schemes based on various signal features, see [2]for a fairly recent review. Among those features, the samplecovariance matrix and particularly its eigenvalues have beenused in order to design several detectors, see e.g., [3]–[8]. Awell known one, which will be relevant in this work, is theratio of the maximum to minimum eigenvalue (MME), i.e. thestandard condition number (SCN) of the sample covariancematrix, see [3], [4]. Since the noise power of the receiver iscontained in both eigenvalues, it is canceled out in the ratio.Thus, designing an alarm threshold for this detector is possiblewithout knowledge of the noise power.When analyzing communication systems, receiver noise isoften modeled as additive white Gaussian noise (AWGN). Ifthe entries of a matrix A are Gaussian distributed, then thematrix AA H is called a Wishart matrix [9]. On the one hand,such matrices occur in the analysis of multiple-input multiple-output (MIMO) communication systems as the instantaneousMIMO correlation matrix. On the other hand, Wishart matricesare also encountered when using the sample covariance matrixfor spectrum sensing detectors, as will be explained in greaterdetail in Section II. The field of Random Matrix Theory (RMT)studies said Wishart matrices and in the recent years therehas been significant progress on the exact distributions oftheir eigenvalues, see e.g., [10] for a summary. Based on thejoint ordered eigenvalue distribution, also the distribution ofthe SCN can be found, see [11], [12]. This result, amongother approximating approaches from RMT, has recently beenapplied to spectrum sensing [13]. In this work, we will utilizethe SCN distribution in a spectrum sensing problem as well.The spectrum sensing detectors discussed above are mostoften employed with a fixed sample size. There, a block ofsamples is taken, the test statistic is calculated and subse-quently a decision is made by comparing the value of thetest statistic to a predetermined threshold. One disadvantageof said block detection schemes lies in the fact that the numberof samples taken is independent of the detection difficultyof the current situation. For example, even if a very strongPU signal is present, it will take as many samples until adecision is reached as originally planned when defining thethreshold. This may introduce significant delays, which mayeither result in interference for the PUs or reduce the data rate a r X i v : . [ c s . I T ] O c t H H tt c t d detection delay Fig. 1. Quickest detection problem. At the change time t c , hypothesis H becomes true. The detection algorithm gives an alarm at the detection time t d , hence the detection delay or time to detection is t d − t c . by shortening the transmission window of the SUs.If one seeks to minimize the detection delay, the hypoth-esis detection problem can be cast within the frameworkof quickest detection . In contrast to block detection, it isassumed that it is known which hypothesis is true and thatan upcoming change is to be detected with minimum delay.Let H identify the situation that the PU is not transmittingand only noise is received. Consequently, let H define thesituation that the PU is transmitting and a signal with additivenoise is received. Consider without loss of generality thatthere is no PU transmitting (hypothesis H is true). At theunknown change time t c a PU starts transmitting, such thatthe corresponding hypothesis H becomes true and at thedetection time t d the detection algorithm raises an alarm, seealso Figure 1. The objective in a quickest detection problemis to minimize the mean detection delay τ d , while having amean time to false alarm τ fa that is greater than a predefinedvalue. Quickest detection has been applied to the spectrumsensing problem and is also known as quickest spectrumsensing in the literature. The first publications in this field were[14], [15] using Gaussian noise & signal and Gaussian noise& sinusoidal signal models, respectively. Quickest spectrumsensing using cyclostationary signal features was investigatedin [16]. Collaboration between multiple SUs using quickestdetection has been studied in [17], [18]. Also, the case of anSU having multiple antennas was analyzed in [19].To the best of our knowledge, so far there has been noinvestigation whether it is feasible to use quickest spectrumsensing, while using a test statistic that is based on thesample covariance matrix or its eigenvalues. In this work,we use a very basic model using two receivers that caneither be interpreted as one SU with K receiving antennasor as K collaborating SUs in close proximity. That is, theyare assumed to have the same signal-to-noise ratio (SNR).It is furthermore assumed, that only one PU is potentiallypresent using phase shift keying (PSK) signal modulation.With this simple model, we study eigenvalue-based quickestspectrum sensing on the basis of the MME test statisticmentioned above. The main contributions and the outline ofthis paper can be summarized as follows. First, we formalizethe signal model and specify the Wishart matrices involvedunder both hypotheses in Section II. Second, we develop theexact probability distribution function (PDFs) of the MME teststatistic under hypothesis H for the special case of 2 SUsusing results from RMT in Section III. For the hypothesis H we recall results known from literature. We also give detailson the numerical evaluation of the PDFs in the same Section.Then, we apply these results to predict the performance of theMME block detection algorithm by theoretical calculation ofthe receiver operator characteristic (ROC) in Section IV. InSection V we study the quickest detection problem using theMME test statistic with and without knowledge of the SNR.For the first case theoretical performance bounds are given. Anumerical performance evaluation of the MME based quickestspectrum sensing detector is subsequently given in Section VI.We conclude the paper in Section VII and discuss some futureresearch directions. II. S YSTEM M ODEL
In this Section, we introduce the system model and fixnotation. We introduce the model in general with K SUs. FromSection III on it will be confined to two SUs ( K = 2 ) only.The basic hypothesis testing problem of detecting the pres-ence of a PU with K collaborating SUs can be stated in thefollowing form: H : y ( t ) = w ( t ) H : y ( t ) = x ( t ) + w ( t ) , where y ( t ) is a K × vector of complex baseband samplescollected by the SUs at time index t ∈ N = { , , , . . . } .Furthermore, the complex K × vectors x ( t ) and w ( t ) standfor the PU’s signal and additive noise, respectively. If H istrue the PU is not transmitting and the SUs receive noise only.If H is true, the PU is transmitting such that the SUs receivea noisy version of the PU signal. Combining N samples intoa K × N sample matrix yields Y = ( y (1) , y (2) , . . . , y ( N )) .Similarly, a signal matrix X and a noise matrix W can beconstructed, so it follows Y = W under H and Y = X + W under H . Using this notation, we can calculate the samplecovariance matrix as R y = N Y Y H .In this work we consider a phase shift keying (PSK) signalsubject to basic additive white Gaussian noise (AWGN). Thus,each column of the signal matrix X is assumed to havethe form x j = x ( j ) = √ α s j K for j = 1 , . . . , N . Here, α is the signal-to-noise ratio (SNR), s j ∈ C is a complexPSK symbol on the unit circle (i.e. | s j | = 1 ) and K is acolumn vector of dimension K containing only ones. ThePSK symbols s j sent by the PU are assumed to be i.i.d.and are drawn from a uniform distribution over an arbitraryPSK alphabet. The noise matrix is assumed to follow a jointlycomplex circularly symmetric Gaussian distribution, such that W ∼ CN ( K × N , I K ⊗ I N ) , where I K is the identity matrixof dimension K . This means, that the real- and imaginarypart of each matrix entry of W is independently Gaussian N (0 , / distributed and the matrix entries are mutuallyindependent. Furthermore, we assume that the PU signal isindependent of the noise. Under this model, every receiver isassumed to have the same SNR. This will be essential for theanalytical derivation of the PDF under H in Section III.The test statistic of the well known MME detector is: T = λ λ K , (1) where λ ≥ · · · ≥ λ K are the ordered eigenvalues of thesample covariance matrix R y [3], [4]. Since scaling of thesample covariance matrix results in the same scaling of itseigenvalues, the ratio is not affected by it. Thus, we will omitthe normalization factor and use the scaled sample covariancematrix R = Y Y H in the following. Similarly, the test statistic T is only dependent on the SNR α , but independent of theactual noise power under both hypotheses. Hence, our systemmodel uses the SNR as a parameter directly, as knowledge ofthe actual noise power is unnecessary.Under hypothesis H the sample covariance matrix is sim-ply R = W W H . This is a complex uncorrelated centralWishart matrix of dimension K with N degrees of freedom[9], which we denote as R ∼ CW K ( N ) .Under hypothesis H , the sample covariance matrix is R = ( X + W )( X + W ) H = XX H + XW H + W X H + W W H . (2)The first part of (2) written using the columns x j of X gives XX H = N (cid:88) j =1 x j ( x j ) H = α N (cid:88) j =1 s j s ∗ j (cid:124)(cid:123)(cid:122)(cid:125) | s j | K T K = αN K T K . (3)Evidently, (3) is constant. Similarly, the middle parts of (2)can be expanded as XW H = N (cid:88) j =1 x j ( w j ) H = N (cid:88) j =1 √ αs j K ( w j ) H and W X H = N (cid:88) j =1 w j ( x j ) H = N (cid:88) j =1 √ αs ∗ j w j T K , respectively. Since the entries of the vectors w j are complexcircularly symmetric Gaussian, their distribution is unaffectedby complex rotations, which occur by multiplication with theunit circle PSK symbols s j . Therefore the distribution of themiddle part of (2), XW H + W X H , is identically to thedistribution of √ α K T K W H + √ α W K T K . Hence, it makesno difference for the distribution of the sample covariancematrix (and thereby also no difference for the distribution ofits eigenvalues) whether an arbitrary PSK signal (including4QAM for that matter) or a constant signal is observed. Thus, R is a complex uncorrelated non-central Wishart matrix ofdimension K with N degrees of freedom, c.f. [9]. In shortnotation we write R ∼ CW K ( N, Ω ) . Here, the parameter Ω = E[ ( X + W )]E[( X + W )] H = αN K T K is calledthe non-centrality matrix.In order to determine the PDF of the test statistic T underhypothesis H in Section III, we need the ordered eigenvalues ω ≥ · · · ≥ ω K of the non-centrality matrix Ω . For the rankone matrix Ω they are readily given as ω = αKN , while ω = · · · = ω K = 0 . III. MME D ISTRIBUTIONS FOR H & H As a special case, we have investigated the scenario of twocooperating users in more detail, so for the following K = 2 .For this case, the sample covariance matrices are distributedaccording to R ∼ CW ( N ) and R ∼ CW ( N, α N T ) ,respectively. It turns out, that for this simplification the PDFsunder both hypotheses can be found explicitly. A. PDF under hypothesis H Under hypothesis H , the PDF of the test statistic T for K = 2 has been found in [11] as: f ( T ) = Γ(2 N )Γ( N )Γ( N − (cid:18) − T (cid:19) (cid:18) T (cid:19) N (cid:18) T (cid:19) − N = ( N − N )[Γ( N )] ( T − T ( N − ( T + 1) N , (4)for T ≥ . B. PDF under hypothesis H To develop the PDF of the test statistic T for K = 2 underhypothesis H , we follow the same way as [20], where aversion of the PDF was given for N = 2 and generalize thisresult to arbitrary N . We collect the ordered eigenvalues of R and Ω in the vectors λ = ( λ , λ ) T and ω = ( ω , ω ) T ,respectively. Then, we can find the joint PDF of the orderedeigenvalues of R in [12] as: f λ ( λ ) = K un | V ( λ ) | | F ( λ , ω ) | (cid:89) j =1 ξ ( λ j )= K un e − ( λ + λ ) ( λ − λ ) ( λ λ ) ( N − | F ( λ , ω ) | , where | V ( λ ) | is the determinant of a Vandermonde matrixbuilt from λ , | F ( λ , ω ) | is the determinant of a matrix wherethe entry of the i -th row and j -th column follows standardgeneralized hypergeometric functions F ( N − λ j ω i ) (see[21], (9.14.1)), ξ ( λ j ) = λ ( N − j e − λ j and K un = e − ( ω + ω ) [( N − ( ω − ω ) . We are interested in the PDF of the test statistic T , however,which is the ratio between the two ordered eigenvalues. Bysubstituting λ = T λ we can obtain the desired marginal PDFby applying the transformation in (5). Explicit calculation ofthe remaining determinant of (5) yields (6). The hypergeomet-ric function F ( a + 1; b ) can also be written in terms of the a -th order modified Bessel function of the first kind I a ( b ) ,c.f. [21], as F ( a + 1; b ) = a ! b − a I a (2 √ b ) . Substituting (6)into (5), using the identity for the hypergeometric functionand simplifying gives (7). Lacking an analytical solution ofthe integral in (7), we use the series expansion of the a -thorder modified Bessel function of the first kind I a ( · ) : I a ( b ) = ∞ (cid:88) j =0 j ! Γ( j + a + 1) (cid:18) b (cid:19) (2 j + a ) . (8) f ( T ) = (cid:90) ∞ λ f λ ( T λ , λ ) d λ = K un (cid:90) ∞ e − λ ( T +1) λ ( T − T λ ) ( N − | F (( T λ , λ ) T , ω ) | d λ (5) | F (( T λ , λ ) T , ω ) | = F ( N − T λ ω ) F ( N − λ ω ) − F ( N − T λ ω ) F ( N − λ ω ) (6) f ( T ) = K un [( N − ( T − T ( N − ( T ω ω ) − ( N − ) × (7) (cid:90) ∞ λ N e − λ ( T +1) (cid:104) I ( N − (2 (cid:112) T λ ω ) I ( N − (2 (cid:112) T λ ω ) − I ( N − (2 (cid:112) T λ ω ) I ( N − (2 (cid:112) T λ ω ) (cid:105) d λ f ( T ) = K un [( N − ( T − T ( N − ∞ (cid:88) j =0 ∞ (cid:88) l =0 (cid:0) ( T ω ) j ω l − ( T ω ) j ω l (cid:1) j ! l ! Γ( j + N −
1) Γ( l + N − (cid:90) ∞ λ ( j + l +2 N − e − λ ( T +1) d λ (9) f ( T ) = e − ( ω + ω ) ( T − T ( N − ( ω − ω ) ∞ (cid:88) j =0 ∞ (cid:88) l =0 Γ( j + l + 2 N − T j (cid:16) ω j ω l − ω j ω l (cid:17) j ! l ! Γ( j + N −
1) Γ( l + N − T + 1) ( j + l +2 N − (10) f α ( T ) = e − (2 αN ) ( T − T ( N − ∞ (cid:88) j =0 ( T j −
1) Γ( j + 2 N −
1) (2 αN ) ( j − j ! Γ( j + N −
1) Γ( N −
1) ( T + 1) ( j +2 N − (11)Substituting the Bessel functions by (8) in (7), using that ( (cid:80) ∞ j =0 a j )( (cid:80) ∞ l =0 b l ) = (cid:80) ∞ j =0 (cid:80) ∞ l =0 a j b l and simplifying wearrive at (9). The remaining integral in (9) can be found in[21] (3.351.3) as (cid:90) ∞ x b e − ax d x = b ! a ( b +1) = Γ( b + 1) a ( b +1) to obtain f ( T ) for T ≥ and ω (cid:54) = ω in (10). There, we usethe identity Γ( a ) = ( a − for a ∈ N . As already discussedabove, the non-centrality matrix Ω of the PSK signal has rankone and therefore ω = 2 αN and ω = 0 . Setting ω = 0 and ω = 2 αN in (10), we gain a simpler version of the PDF ofthe test statistic T under hypothesis H for T ≥ in (11),which has the SNR α as a parameter directly.The PDF f ( T ) must be a special case of the PDF f α ( T ) in the limit for vanishing SNR. Examining (11) in the limitfor α → , the only term in the sum which is unequal to zerois when j = 1 , and it follows: lim α → f α ( T ) = ( N − N )[Γ( N )] ( T − T ( N − ( T + 1) N = f ( T ) . (12)In Figure 2 the PDFs f ( T ) and f α ( T ) are visualized, thelatter for different values of the SNR α . C. Numerical Evaluation of the PDFs
Numerical evaluation of the PDFs f ( T ) and f α ( T ) is notstraightforward, especially for large numbers of samples N ,which are relevant in a spectrum sensing context. First, dueto the factorials and exponents the numerical range neededquickly exceeds the range of the IEEE double precision format.Second, f α ( T ) involves an infinite sum and must therefore beapproximated, e.g., by stopping the summation after a certainnumber of summands. One possibility is to utilize arbitrary-precision arithmetic to bypass numerical issues, however, thecomputation time is often unacceptably long. Thus, we will . . . . . . T f ( T ) , f α ( T ) noise only α = − dB α = − dB α = − dB α = − dB α = − dB Fig. 2. Plot of f ( T ) (noise only) and f α ( T ) for different values of theSNR α and N = 500 . give a brief description of our implementation of the PDFsand compare them to empirical PDFs obtained by monte-carlosimulations.The log-gamma function log(Γ( x )) is often used for com-putations involving factorials or the gamma function itself. Weused MATLAB for our numerical calculations, where an im-plementation of the log-gamma function is already provided.Therefore, we reformulate f ( T ) to perform the numericallycritical computations inside the exponential function: f ( T ) = ( N −
1) ( T − exp [( N −
2) log( T ) − N log( T + 1) + log(Γ(2 N )) (13) − N ))] . In this way, we can evaluate the function for much higher N than with a direct implementation of (4).Since f α ( T ) contains an infinite sum, we approximate it by stopping the computation after a finite number of summands.Similarly to the approach of f ( T ) we reformulate f α ( T ) to beable to use log-gamma functions for numerical computation: f α ( T ) ≈ ( T − J s (cid:88) j =0 exp( β j + j log( T )) − exp( β j ) , (14)where β j = log(Γ( j + 2 N − − log(Γ( j + 1)) − (2 αN ) − log(Γ( N − N −
2) log( T ) − ( j + 2 N −
1) log( T + 1) + ( j −
1) log(2 αN ) . In [20], where a version of f α ( T ) was developed with N = 2 , it was reported that the number of summands canbe chosen such that a further increase would only add anegligible amount (say ≤ . ) to the summation. However,for higher N this criterion fails since the main contributionto the summation does not take place in the first summandsanymore. To achieve a reasonable approximation, we checkwhether the integral over f α ( T ) is very close to one. InTable I, we summarize values for J s of (14), such that theintegral deviates less than − from one (using a trapezoidalquadrature rule with a bin width of . ). TABLE IU
PPER BOUND OF SUMMATION ( J s ) TO ACHIEVE A REASONABLEAPPROXIMATION OF f α USING (14)
FOR DIFFERENT VALUES OF THENUMBER OF SAMPLES N AND THE
SNR α . N ↓ \ → α -20 dB -15 dB -10 dB -5 dB 0 dB50 10 20 30 60 200100 20 30 50 200 300500 30 60 200 400 20001000 50 200 300 800 30005000 200 400 2000 4000 2000010000 300 800 3000 7000 3000050000 2000 4000 20000 40000 200000 To verify our implementation including our approximations,the PDFs f ( T ) and f ( T ) are plotted in Figure 3 by eval-uating (13) and (14), respectively. There, the values of theempirical PDFs obtained by digital simulation are drawn incircles. .
05 1 . .
15 1 . .
25 1 . . T f ( T ) , f α ( T ) noise only α = − dB Fig. 3. Plot of f ( T ) (noise only) and f α ( T ) for α = − dB and N = 500 . Circles indicate values taken from an empirical PDF obtained bysimulation. IV. A
PPLICATION : B
LOCK D ETECTION
In this Section, we apply the results from Section III toa block detection scenario, before we turn to the quickestdetection problem in the next Section. For a block detectionscheme using the MME detector, the number of samples N used in calculating the sample covariance matrix and thedecision threshold h need to be determined. Intuitively, higher N result in less overlap between the PDFs f ( T ) and f α ( T ) .Thus, choosing N presents a trade-off between the detectionperformance and the time to reach a decision. The threshold h , poses a trade-off between the probability of detection P d and the probability of false alarm P fa : P fa ( h ) = 1 − F ( h ) = 1 − (cid:90) h f ( T ) d T ,P d ( h ) = 1 − F α ( h ) = 1 − (cid:90) h f α ( T ) d T . (15)As mentioned above, test statistic T of the MME detector isnot directly dependent on the actual noise and signal powers,but rather on the SNR α . Typically, the SNR α is assumedto be unknown beforehand. The decision threshold h is thenchosen, such that the detector has false alarm rate P fa .Note, that it is assumed in block detection that no changeof the hypothesis happens within a block. Consequently, whenthe CDFs under hypotheses H and H are both known,the receiver operator characteristic (ROC) can be calculatedtheoretically. Thereby, the design of a detector is simplifiedsignificantly, as the detection performance can be determinedbeforehand without the need for simulations or empirical tests.The CDF under hypothesis H , i.e. F ( T ) was given in [12]as: F ( T ) = K uc ( p ( T ) − p (1)) , where T ≥ , p ( T ) = ∆ ( N, N − , T ) − ( N − , N, T )+ ∆ ( N − , N + 1 , T ) and ∆ ( a, b, c ) = ( b − a ! − ( b − (cid:88) j =0 ( a + j )! c j j ! ( c + 1) a + j +1 . We are not aware of an analytical solution to the integralin (15). However, with the help of the PDF f α ( T ) from (10)we can numerically evaluate the integral to get P d . Plotting P d over P fa for various thresholds results in the ROC. Asan example, the ROC for N = 500 with various SNRs α isshown in Figure 4. There, results from a digital simulation areindicated by circles to verify the theoretical ROC.V. A PPLICATION : Q
UICKEST D ETECTION
This Section applies the results from Section III to introducequickest detection (QD) based on the eigenvalues of thesample covariance matrix. In contrast to block detection, wherethe objective is to determine which hypothesis is true, in QD itis assumed that it is known which hypothesis is true and that achange to the other hypothesis shall be detected with as little . . . . . . . . P fa P d α = − dB α = − dB α = − dB Fig. 4. ROC of the MME block detection scheme with N = 500 and variousSNRs α . Circles indicate values obtained from digital simulation. Comparealso the corresponding PDFs in Figure 2. delay as possible. Contrary to evaluating the probabilities ofdetection P d and false alarm P fa as done in block detection,the design criterions for QD algorithms are the mean timeto detection τ d and the mean time to false alarm τ fa . A wellknown algorithm used in QD is the cumulative sum (CUSUM)algorithm [22]. It is the optimal QD algorithm, in the sensethat it achieves minimal τ d while satisfying τ fa ≥ c , for any c [23]. Without loss of generality, we assume in the followingthat before the change time t c hypothesis H is true and at time t c transmission of the PU starts, i.e., hypothesis H becomestrue (see also Figure 1). This can be formalized as follows.Let Z t be a sequence of random variables, which for each t is independently distributed according to the PDF f θ ( z ) with parameter θ before the change time, i.e. for t < t c andaccording to the PDF f θ ( z ) with parameter θ at and after thechange time, i.e. for t ≥ t c . Samples z ( t ) are taken in orderto detect the change from H → H . The CUSUM algorithmrelies on the log-likelihood ratio of the received sample attime index t : l z ( t ) = log ( f θ ( z ( t )) /f θ ( z ( t ))) . Over time, l z ( t ) shows a positive drift if H is true and a negative driftif H is true. A cumulative sum of the log-likelihood ratiosof consecutive samples is formed by the CUSUM algorithm.However, only those consecutive samples are used for thesummation, which result in the largest sum. In other words,only the time interval showing a consistent positive drift isconsidered for summation. A convenient recursive formulationof the CUSUM algorithm can be given as [22], [24]: g z ( t ) = max ≤ m ≤ t t (cid:88) j = m +1 log (cid:18) f θ ( z ( j )) f θ ( z ( j )) (cid:19) = [ g z ( t −
1) + l z ( t )] + , (16)where the initial value is defined as g z (0) = 0 and the positivepart is defined as [ · ] + = max( · , . At each time index t the value g z ( t ) is compared to a threshold h > and if g z ( t ) > h the algorithm decides that a change to H hashappened. As can be seen from (16), g z ( t ) may never reachnegative values. Since l z ( t ) shows a negative drift under H and we are interested in detecting a potential change from H → H , accumulating negative values would result in a much longer time to reach the threshold after the change time t c and thus in a larger mean time to detection τ d . Note, thatthe parameters θ and θ must be known to use the CUSUMalgorithm. If the parameters are unknown, the algorithm canbe adapted to use the generalized log-likelihood ratio. We willstudy the application of QD using the MME test statistic withand without knowledge of the SNR in the following. A. Known SNR
First, we study the model from Section II for the case thatthe SNR α is known to the receiver. To adapt the results fromSection III, we define a time-dependent version of the teststatistic T from (1). Introducing a block index k ∈ N , wedefine λ ( k ) as the vector of ordered eigenvalues of the scaledsample covariance matrix of block k : R ( k ) = Y ( k ) Y H ( k ) ,where Y ( k ) = ( y (( k − N + 1) , · · · , y ( kN )) . In otherwords, λ ( k ) contains the eigenvalues of the k -th consecutivesample covariance matrix built from a block of N non-overlapping samples. Thus, since we assume K = 2 , the teststatistic follows as: T ( k ) = λ ( k ) λ ( k ) . To take over existing results from QD, we make an addi-tional assumption, which is similarly present in block detec-tion: t c ∈ { ( k − N +1 | k ∈ N } . This means that no hypothe-sis change may happen within a block. Depending on whether H or H is true, T ( k ) is distributed according to f ( T ) or f α ( T ) , respectively. As shown in (12), lim α → f α ( T ) = f ( T ) so the only parameter influenced by the change is the SNR α .The log-likelihood ratio can be found by inserting (4) and(10) into its definition and simplifying as seen in (17). There,we use the Pochhammer symbol defined as ( a ) b = ( a + b − / ( a − . Note, that (17) can be evaluated numericallyusing a similar approach to the one used for the PDFs inSection III-C. Using the log-likelihood ratio, we can give theCUSUM algorithm for our model as: g ( k ) = [ g ( k −
1) + l ( k )] + . (18)When designing the CUSUM detector, a method for findinga suitable threshold h is desirable. Since the exact calculationof the mean time to detection τ d and the mean time to falsealarm τ fa is very complicated in general, we turn to boundsalready studied in the literature. Of special interest are anupper bound on τ d and a lower bound on τ fa to ease thedesign of the threshold h . Note, that we are evaluating thebounds using the timescale of the block index k . To evaluatethe bounds on a sample based timescale (time index t inSection II), the bounds must be multiplied by the number ofsamples N used for calculating the sample covariance matrixin each block.Starting with the upper bound on τ d , we find from [24],[25], that τ d ≤ ( h + γ ( f α ))E f α [ l ( k )] , (19)where γ ( f ) = sup δ> E f [ l ( k ) − δ | l ( k ) ≥ δ > . l ( k ) = log (cid:18) f α ( T ( k )) f ( T ( k )) (cid:19) = log e − αN ( T ( k ) − ∞ (cid:88) j =0 ( T ( k ) j −
1) (2 αN ) ( j − (2 N ) ( j − j ! ( T ( k ) + 1) ( j − ( N ) ( j − (17)Here, we denote the expectation over the PDF f ( T ) as E f [ · ] .Likewise, a lower bound on τ fa can be found in [24]: τ fa ≥ f [ l ( k )] (cid:18) e − ϕh − ϕ + h + γ ( f ) (cid:19) , (20)where ϕ < is the single non-zero root of E f [ e − ϕ l ( k ) ] = 1 ,which can be found by solving: E f [ e − ϕ l ( k ) ] = (cid:90) ∞ (cid:18) f α ( T ) f ( T ) (cid:19) − ϕ f ( T ) d T ! = 1 . (21)Obviously, ϕ = − solves (21), since only the integral over f α ( T ) remains. Thus, (20) can easily be simplified to: τ fa ≥ f [ l ( k )] (cid:0) − e h + h + γ ( f ) (cid:1) . (22)Neither E f [ l ( k )] , E f α [ l ( k )] , γ ( f ) nor γ ( f α ) lend them-selves to proper analytical treatment, but can be calculatednumerically.A second lower bound on τ fa reported in the literature (seee.g., [24]) is simply τ fa ≥ e h . (23)Although this bound is very simple to compute, we found thatfor our model and the threshold values we considered in thenumerical evaluation, this bound turns out to be very loose.Since (22) is much tighter to the value of τ fa obtained bysimulation, we will omit (23) in our numerical evaluation. B. Unknown SNR
A more realistic scenario is that the SNR is unknown tothe receiver beforehand. In this case, the CUSUM algorithmcannot be used directly. However, a similar algorithm basedon the generalized likelihood ratio test (GLRT) can be utilizedinstead. The GLRT allows to perform likelihood ratio testswhen the parameters of the involved PDFs are unknown,by estimating these parameters using a maximum likelihoodestimation (MLE) beforehand [26]. In our scenario, the onlyunknown parameter is the SNR and the GLRT for this casecan be given as: l G ( k ) = log (cid:18) sup ˆ α f ˆ α ( T ( k )) f ( T ( k )) (cid:19) , (24)where ˆ α is the estimated SNR. Utilizing the GLRT test onan i.i.d. sequence of samples, the generalized likelihood ratio(GLR) algorithm can be constructed as [24]: g G ( k ) = max ≤ m ≤ k sup ˆ α log k (cid:89) j = m +1 f ˆ α ( T ( j )) f ( T ( j )) = max ≤ m ≤ k sup ˆ α k (cid:88) j = m +1 log (cid:18) f ˆ α ( T ( j )) f ( T ( j )) (cid:19) . (25) We are not aware of an analytical form of the supremumin (24), therefore we evaluate it numerically. Note also, thatthe GLR algorithm of (25) cannot be written in a recursiveform. In contrast to the CUSUM algorithm, the GLR algorithmrequires memory of all previous samples. This is due to thenumerical evaluation of the supremum and the subsequentmaximization in (25), which also has a higher computationalcomplexity.In Figure 5 the mean time to detection τ d obtained fromdigital simulation is plotted over the threshold for the CUSUMand the GLR algorithm. Also, the theoretical upper bound on τ d from (19) is depicted. Note, that the upper bound is onlyvalid for the CUSUM algorithm. Similarly, in Figure 6 themean time to false alarm τ fa obtained by digital simulationis shown for both the CUSUM and the GLR algorithm witha logarithmic scaling of the ordinate. Again, the theoreticallower bound on τ fa from (22), which is only valid for theCUSUM algorithm, is evaluated there. The values of τ d and τ fa shown in the Figures 5 and 6 are evaluated on a block-wisetimescale, as already explained above. h τ d boundCUSUMGLR Fig. 5. Theoretical upper bound from (19) on the mean time to detection τ d of the CUSUM algorithm. Simulated performance of both the CUSUMand GLR algorithm is shown for N = 500 and SNR α = − dB.The GLR numerically evaluates the supremum over ˆ α within the interval [ − dB , − dB ] in . dB steps. As can be seen in both Figures 5 and 6, the GLR algorithmshows a more aggressive behavior compared to the CUSUMalgorithm. This can be explained by two effects, predominantlypresent at the beginning of the algorithm run-time. Firstly,the SNR estimate ˆ α used by the GLR algorithm (25) mayover- or underestimate the SNR. If, for instance, the firstcouple of sample covariance matrices result in a greater thanaverage test statistic T ( k ) for the respective SNR, the GLRwill overestimate the SNR ˆ α and hence the log-likelihoodratio will be greater than it would be for the true SNR α . Secondly, when the GLR algorithm (25) determines itsparameter m , it may also choose larger values for m thanit would if the SNR estimation ˆ α had returned the true SNR h τ f a boundCUSUMGLR Fig. 6. Theoretical lower bound from (22) on the mean time to false alarm τ fa of the CUSUM algorithm. Simulated performance of both the CUSUMand GLR algorithm is shown for N = 500 and SNR α = − dB.The GLR numerically evaluates the supremum over ˆ α within the interval [ − dB , − dB ] in . dB steps. Note the logarithmic scaling of the ordinate. α . In Figure 7, a typical run of both the CUSUM and theGLR algorithm are depicted. The bottom of Figure 7 showsthe internal parameters of the GLR algorithm. Both effectscan be observed as discussed above. With advancing run-time, the SNR estimation ˆ α converges to the true SNR α .Consequently, also the behavior of the GLR algorithm moreclosely resembles the behavior of the CUSUM algorithm. Ifthe SNR estimation of the GLR algorithm returns the trueSNR, both algorithms behave identically, as can be seen bycomparing (18) and (25). g ( k ) , g G ( k ) CUSUMGLR − − − ˆ α block index k m ˆ αm Fig. 7. Comparison of a typical run of the CUSUM and the GLR algorithmwith N = 500 and SNR α = − dB. The internal parameters ˆ α and m ofthe GLR algorithm are shown in the bottom plot. VI. N
UMERICAL E VALUATION
A fair comparison between the GLR algorithm, which isa quickest detection algorithm and the MME block detection scheme is difficult to obtain due to the assumptions present inthe two different detection problems. In the following evalu-ation we design the MME block detector with a block lengthof samples and probability of false alarm P fa = 0 . .Using the results from Section IV, we find the probabilityof detection P d = 0 . for the SNR α = − dB.When using the GLR algorithm, a run-time has to be decidedbecause the algorithm will eventually generate a false alarm.Evidently, this run-time is to be chosen such that the mean timeto false alarm τ fa is significantly higher than the algorithmrun-time. We performed a monte-carlo simulation with theGLR algorithm for 1000 random seeds, with N = 10000 samples in each block and a variety of threshold values.Furthermore, two versions of the algorithm per monte-carloinstance are computed. The first one receives samples underhypothesis H , i.e. it only receives noise. The second onereceives samples under hypothesis H , so it receives signalwith additive noise. For the detection performance this is aworst-case analysis, since the GLR algorithm is initialized withthe value zero. If the hypothesis would switch from H to H during the simulation, it might detect a longer consecutivepositive drift due to the effects of the noise, resulting inpotentially quicker detection. To calculate probabilities of falsealarm and detection for the GLR algorithm, respectively, weconsider different run-times and calculate how many of the1000 monte-carlo instances have raised a false-alarm / havecorrectly detected the signal for a given run-time, threshold h and SNR α . Thus, we can compare the detection performanceof the GLR algorithm to the fixed performance of the MMEblock detector. For the numerical evaluation of the supremumin (25), an SNR range of [ − , − dB was evaluated in . dBsteps. In Figure 8, the GLR performance is depicted for severalSNRs. The threshold h = 4 . was chosen, such that at a run-time of samples, i.e., the block size of the MME detector,the probabilities of false alarm are approximately equal (GLR: P fa = 0 . vs. MME: P fa = 0 . ). As expected, the GLR · . . . . GLR run-time P f a , P d P fa P d , α = − dB P d , α = − dB P d , α = − dB P d , α = − dB P d , α = − dB P d , α = − dB Fig. 8. Performance of the GLR algorithm, evaluated as P d and P fa overthe algorithm run-time for the threshold h = 4 . . The gray circular markersand the thin solid lines indicate the performance of the MME block detectordesigned for a block length of samples and P fa = 0 . , which resultsin P d = 0 . at SNR α = − dB. quickest detection algorithm is able to detect higher SNRsmuch earlier than the MME detector at a lower or comparablefalse-alarm rate. However, at low SNRs the MME detectoris faster. When considering realistic detection scenarios, thePUs SNR cannot be known beforehand. So the MME detectoris designed to have a specific performance for the lowestSNR that has to be reliably detected. Evidently, in practiceit would be more precise to consider an SNR interval, saye.g., [ − , − dB. If no further knowledge is available, wemay model the SNR with a uniform distribution over saidinterval. Here, we see the advantage of the quickest detectionapproach. For a wide range of SNRs, it offers faster detectionat comparable false-alarm performance, thus wasting less timeon the detection process. On the one hand, this leads tohigher efficiency in using transmission opportunities, therebyincreasing transmission rates. On the other hand, interferencefor the PU may be decreased by diminishing the reaction timeto free the channel when the PU initiates a communication. Ascan be seen in Figure 8, there is a gap of approximately dBin the low SNR performance of the GLR quickest detectionalgorithm compared to the MME block detector. This is dueto the fact that the GLR algorithm does not use the entirety ofsamples, but rather blocks of N = 10000 samples to calculatethe sample covariance matrices. Thus, in choosing the numberof samples N , one has to trade-off the detection delay forhigher SNRs and the performance gap compared to the MMEblock detector for low SNRs. However, it must be stressedthat this comparison is not completely fair, since the GLRalgorithm introduced in (25) essentially throws data away. Thisstems from the CUSUM analysis, where the bounds rely onthe assumption that the samples of the test statistic used inthe CUSUM algorithm are i.i.d.. Further improvements in thisdirection are discussed in Section VII.As for all detection algorithms the choice of the detectionthreshold has a huge impact on the detection performance.This effect can already be seen in Figures 5 and 6. To visualizethe performance trade-off between P d and P fa directly, threedifferent choices of the threshold h are compared in Figure 9.VII. C ONCLUSION & F
UTURE W ORK
In this work, we have investigated the potential for usingquickest detection based on the sample covariance matrixfor spectrum sensing applications. First, we have specified asimple phase shift keying (PSK) model with additive whiteGaussian noise (AWGN), consisting of one primary user(PU) and K secondary users (SUs). Second, we characterizedthe Wishart distributions of the eigenvalues of the samplecovariance matrix under both detection hypotheses H (noiseonly) and H (PU signal + noise). Third, we derived analyticalformulations of the PDF of the maximum-minimum eigen-value (MME) test statistic, which is also called the standardcondition number (SCN) of a matrix, under both hypotheses H and H for the special case K = 2 . These PDFs have thenumber of samples N in the sample covariance matrix andthe signal-to-noise ratio α as parameters. Then, we discussedthe numerical computation of said PDFs. We put these results . . · . . . . GLR run-time P f a , P d P fa , h = 3 . P d , h = 3 . P fa , h = 4 . P d , h = 4 . P fa , h = 6 . P d , h = 6 . Fig. 9. Performance of the GLR algorithm, evaluated as P d and P fa overthe algorithm run-time for different thresholds h at SNR α = − dB. to application by explicitly calculating the ROC for the wellknown MME block detector without the need of expensivesimulations. Subsequently, the main application of these re-sults is discussed: quickest detection based on the samplecovariance matrix, i.e. eigenvalue-based quickest detection forspectrum sensing. Two quickest detection algorithms wereintroduced, which rely on the MME test statistic. The CUSUMalgorithm is applicable when the SNR of the PU signal isknown and a GLR algorithm was deduced to cope with thesituation when the SNR is unknown. Bounds on the mean timeto false-alarm τ fa and the mean time to detection τ d have beenprovided for the CUSUM algorithm. Numerical simulationsillustrate the potential advantages of the quickest detectionapproach compared to the block detection scheme for spectrumsensing applications. It turns out, that for a wide range of SNRsthe introduced GLR quickest detection algorithm offers fasterdetection at better or comparable false-alarm performance thanthe MME block detector.Future research may focus on finding similar bounds on τ d and τ fa for the GLR algorithm. Also, at very low SNRsthe block detector is still faster than the GLR algorithm. Thisresults from the fact that the GLR algorithm calculates itssample covariance matrices block-wise and hence uses lessdata than the block detector. One possibility of improvementwould be to get rid of the block structure introduced inSection V and perform updates of the samples covariancematrix on a sample basis. However, this would introducestochastic dependencies in the eigenvalues, so that the behaviorof both algorithms would be changed fundamentally. Therebythe presented bounds would be rendered invalid. Since thepresented GLR algorithm is computationally expensive, itwould be interesting to investigate efficient heuristics whichcan be used in practical scenarios. Additionally, it wouldbe interesting to extend the simple channel model to morerealistic fading channel models. Finally, it may be relevant toexamine other test statistics based on the (eigenvalues of) thesample covariance matrix for quickest spectrum sensing. R EFERENCES[1] Q. Zhao and B. M. Sadler, “A survey of dynamic spectrum access,”
IEEE Signal Processing Magazine , vol. 24, no. 3, pp. 79–89, 2007.[2] E. Axell, G. Leus, E. G. Larsson, and H. V. Poor, “Spectrum sensingfor cognitive radio: State-of-the-art and recent advances,”
IEEE SignalProcessing Magazine , vol. 29, no. 3, pp. 101–116, 2012.[3] Y. Zeng and Y.-C. Liang, “Maximum-minimum eigenvalue detectionfor cognitive radio,” in
IEEE 18th Annual International Symposium onPersonal, Indoor and Mobile Radio Communication , vol. 7, (Athens,Greece), pp. 1–5, 2007.[4] Y. Zeng and Y.-C. Liang, “Eigenvalue-based spectrum sensing algo-rithms for cognitive radio,”
IEEE Transactions on Communications ,vol. 57, no. 6, pp. 1784–1793, 2009.[5] Y. Zeng, Y.-C. Liang, and R. Zhang, “Blindly combined energy detectionfor spectrum sensing in cognitive radio,”
IEEE Signal Processing Letters ,vol. 15, pp. 649–652, 2008.[6] X. Yang, K. Lei, S. Peng, and X. Cao, “Blind detection for primaryuser based on the sample covariance matrix in cognitive radio,”
IEEECommunications Letters , vol. 15, no. 1, pp. 40–42, 2011.[7] J. Font-Segura, J. Riba, J. Villares, and G. Vazquez, “Quadratic spheric-ity test for blind detection over time-varying frequency-selective fadingchannels,” in
IEEE International Conference on Acoustics, Speech andSignal Processing (ICASSP) , (Vancouver, Canada), pp. 4708–4712,2013.[8] A. Bollig and R. Mathar, “MMME and DME: Two new eigenvalue-based detectors for spectrum sensing in cognitive radio,” in
IEEE GlobalConference on Signal and Information Processing (GlobalSIP) , (Austin,USA), pp. 1210–1213, 2013.[9] A. T. James, “Distributions of matrix variates and latent roots derivedfrom normal samples,”
The Annals of Mathematical Statistics , vol. 35,no. 2, pp. 475–501, 1964.[10] A. Zanella, M. Chiani, and M. Z. Win, “On the marginal distribution ofthe eigenvalues of wishart matrices,”
IEEE Transactions on Communi-cations , vol. 57, no. 4, pp. 1050–1060, 2009.[11] A. Kortun, T. Ratnarajah, M. Sellathurai, C. Zhong, and C. Papadias,“On the performance of eigenvalue-based cooperative spectrum sensingfor cognitive radio,”
IEEE Journal of Selected Topics in Signal Process-ing , vol. 5, pp. 49–55, Feb. 2011.[12] M. Matthaiou, M. R. McKay, P. J. Smith, and J. A. Nossek, “Onthe condition number distribution of complex wishart matrices,”
IEEETransactions on Communications , vol. 58, no. 6, pp. 1705–1717, 2010.[13] W. Zhang, G. Abreu, M. Inamori, and Y. Sanada, “Spectrum sensingalgorithms via finite random matrices,”
IEEE Transactions on Commu-nications , vol. 60, no. 1, pp. 164–175, 2012.[14] L. Lai, Y. Fan, and H. V. Poor, “Quickest detection in cognitive radio:A sequential change detection framework,” in
IEEE Global Telecom-munications Conference (GLOBECOM) , (New Orleans, USA), pp. 1–5,2008.[15] H. Li, C. Li, and H. Dai, “Quickest spectrum sensing in cognitive radio,”in
IEEE 42nd Annual Conference on Information Sciences and Systems(CISS) , (Princeton, USA), pp. 203–208, 2008.[16] H. Li, “Cyclostationary feature based quickest spectrum sensing in cog-nitive radio systems,” in
IEEE 72nd Vehicular Technology ConferenceFall (VTC 2010-Fall) , (Ottawa, Canada), pp. 1–5, 2010.[17] S. Zarrin and T. J. Lim, “Cooperative quickest spectrum sensing incognitive radios with unknown parameters,” in
IEEE Global Telecommu-nications Conference (GLOBECOM) , (Honululu, Hawaii, USA), pp. 1–6, IEEE, 2009.[18] H. Li, H. Dai, and C. Li, “Collaborative quickest spectrum sensing viarandom broadcast in cognitive radio systems,”
IEEE Transactions onWireless Communications , vol. 9, no. 7, pp. 2338–2348, 2010.[19] E. Hanafi, P. A. Martin, P. J. Smith, and A. J. Coulson, “Extension ofquickest spectrum sensing to multiple antennas and rayleigh channels,”
IEEE Communications Letters , vol. 17, no. 4, pp. 625–628, 2013.[20] M. Matthaiou, D. I. Laurenson, and C.-X. Wang, “On analytical deriva-tions of the condition number distributions of dual non-central wishartmatrices,”
IEEE Transactions on Wireless Communications , vol. 8, no. 3,pp. 1212–1217, 2009.[21] A. Jeffrey and D. Zwillinger,
Table of integrals, series, and products .Academic Press, 2007.[22] E. S. Page, “Continuous inspection schemes,”
Biometrika , vol. 41,no. 1/2, pp. 100–115, 1954.[23] G. V. Moustakides, “Optimal stopping times for detecting changes indistributions,”
The Annals of Statistics , vol. 14, no. 4, pp. 1379–1387,1986. [24] M. Basseville and I. V. Nikiforov,
Detection of abrupt changes: theoryand application . Englewood Cliffs: Prentice Hall, 1993.[25] A. Wald,
Sequential analysis . New York: Wiley, 1947.[26] S. M. Kay,