EOVSA Implementation of a Spectral Kurtosis Correlator for Transient Detection and Classification
aa r X i v : . [ a s t r o - ph . I M ] F e b September 7, 2018 14:36 ms-jai-arxiv
Journal of Astronomical Instrumentationc (cid:13)
World Scientific Publishing Company
EOVSA Implementation of a Spectral Kurtosis Correlator for Transient Detection andClassification
Gelu M. Nita , Jack Hickish , David MacMahon , Dale E. Gary Center for Solar-Terrestrial Research, New Jersey Institute of Technology, Newark, NJ 07102, USA, [email protected] Radio Astronomy Laboratory, University of California Berkeley, Berkeley, CA, USA
We describe in general terms the practical use in astronomy of a higher-order statistical quantity called SpectralKurtosis (SK), and describe the first implementation of SK-enabled firmware in the F-engine (Fourier transform-engine) of a digital FX correlator for Expanded Owens Valley Solar Array (EOVSA). The development of thetheory for SK is summarized, leading to an expression for generalized SK that is applicable to both SK spectrom-eters and those not specifically designed for SK. We also give the means for computing both the d SK estimatorand thresholds for its application as a discriminator of RFI contamination. Tests of the performance of EOVSA asan SK spectrometer are shown to agree precisely with theoretical expectations, and the methods for configuringthe correlator for correct SK operation are described. Keywords : instrumentation: interferometers, instrumentation: spectrographs, methods: statistical, methods: an-alytical
1. Introduction
The quality of radio astronomy scientific data can be greatly affected by radio frequency interference(RFI). With the increased use of wireless communication systems operating in frequency bands of scientificinterest, the RFI environment becomes ever more hostile. At the same time, astronomers seek to observeover an ever broader part of the radio spectrum, both for increased sensitivity and to exploit spectralcontinuum diagnostics. These trends make it necessary to find efficient and robust methods that are ableto discriminate and excise the RFI contamination, while preserving as much of the underlying usefulinformation as possible.There is no universal solution for the problem of RFI mitigation, and one must find or developmethods that are most suitable for a particular instrument and the RFI environment in which it op-erates (Fridman, P. A. & Baan, W. A., 2001). Such methods include active sampling and subtractionof RFI signals, detection and excision (blanking), and spatial nulling using interferometric techniques(van Ardenne et al. , 2000). Among these methods, RFI mitigation algorithms based on statistical discrim-ination of signals in both the time domain (Ruf et al. , 2006) and the frequency domain (Fridman, P. A.,2001; Fridman, P. A. & Baan, W. A., 2001) have become increasingly popular in recent years, due to thefeasibility of their implementation using field-programmable gate arrays (FPGAs) or graphical program-ming units (GPUs), equipped on many modern digital instruments (Baan et al. , 2004; Gary et al. , 2010;Ford & Buch, 2014).Spectral Kurtosis (SK) is a higher-order statistical tool, applicable to accumulated Power SpectralDensity (PSD) estimates, that has become increasingly popular over the last several decades in digital signalprocessing applications for automatic detection of non-Gaussian signals embedded in Gaussian backgroundnoise (Dwyer, 1983; Ottonello & Pagnan, 1994; Vrabie et al. , 2003; Antoni, 2006; Antoni & Randall, 2006;Antoni, 2007; Dion et al. , 2012). More recently, Nita et al. (2007) demonstrated that SK may be effectivefor automatic detection of RFI in astronomical signals, and proposed a simple design for a wide-band SKspectrometer based on its implementation and testing in a software correlator for a prototype instrument(Liu et al. , 2007). eptember 7, 2018 14:36 ms-jai-arxiv Nita et al.
As extensively developed in our preceding studies (Nita et al. , 2007; Nita & Gary, 2010a; Gary et al. ,2010; Nita & Gary, 2010b), what makes an SK spectrometer distinct from a traditional one is the fact that,for each frequency channel, it accumulates not only a set of M instantaneous power spectral density (PSD)estimates, S = P Mi =1 P i , but also the squared instantaneous spectral power, S = P Mi =1 P i . A remarkableproperty of the spectral kurtosis estimator ( d SK ) computed from these parallel accumulations is that, inthe case of a pure Gaussian time domain signal, its statistical expectation is unity for each frequencychannel and time, no matter how the power level may vary in spectral shape or time. This property givesthe d SK estimator the ability to discriminate signals that deviate from Gaussian time domain statisticsagainst arbitrarily shaped astronomical backgrounds, thus making it sensitive to many man-made signalsthat produce unwanted RFI contamination of the astronomical signals of interest.Moreover, the d SK estimator is expected to deviate below unity if the RFI signal is continuous or actingfor more than half of the accumulation time, while it is expected to deviate above unity if the RFI signal hasa less than 50% duty-cycle. Thus, the d SK estimator also provides the means to characterize the durationof the RFI signals relative to the instrumental accumulation time. Nevertheless, as shown by Nita et al. (2007), the d SK estimator also deviates above unity in the case of a Gaussian transient lasting shorter thanthe integration time, which may pose a real-time decision challenge if such Gaussian transients representwanted astronomical signals. However, as recently suggested by Nita (2016) in a comprehensive theoreticalstudy, and experimentally demonstrated by Nita & Gary (2016), this sensitivity of the d SK estimatorto transient astronomical signals, if properly taken advantage of, may provide an experimental meansfor obtaining accurate signal-to-noise ratio and duration estimates for both Gaussian and non-Gaussiantransients, and thus extends the potential uses of the d SK spectrometer well-beyond its originally intendedscope.Developed for the Korean Solar Radio Burst Locator telescope (KSRBL, Dou et al. , 2009), the firsthardware implementation of a wide-band spectrometer with SK capabilities, utilizing FPGA architecture,served as a demonstration (Gary et al. , 2010) of the effectiveness of real-time RFI detection and excisionvia the SK algorithm. In addition, the quantitative examination of the resulting high-precision probabilitydensity function (PDF) also revealed the need for a more accurate calculation of the theoretical RFI de-tection thresholds than the traditional, symmetrical standard deviation thresholds employed by Nita et al. (2007) in the original software-based SK spectrometer prototype.Consequently, Nita & Gary (2010a) proposed a more theoretically well-founded spectral kurtosis esti-mator ( d SK ), which applies to the case of a normally distributed time domain signal whose PSD estimatesare obtained by means of a Fast Fourier Transform (FFT). It is an unbiased estimator of the spectralvariability, which is defined as the ratio between the variance and the squared mean of the PSD estimates.For this particular case, Nita & Gary (2010a) derived exact analytical expressions for the infinite series ofstatistical moments of the d SK estimator, and assigned to it a Pearson Type IV probability curve (Pearson,1895), which was shown to be in very good agreement with numerically simulated d SK PDFs, as well aswith the d SK distributions derived from direct experimental observations made with KSRBL (Gary et al. ,2010).Given its conceptual simplicity, which makes it ideally suited for a real-time FPGA or GPU imple-mentation, the original d SK spectrometer design proposed by Nita et al. (2007) has begun to be adopted asa standard component in the analysis pipelines of a new generation of radio telescopes designed not onlyby us, but also by other independent instrument design teams (Deller et al. , 2011; Anderson et al. , 2012;Bassa et al. , 2016).However, one drawback of the SK spectrometer is the need to record both power and power-squared.Despite rapidly increasing computational and storage capabilities and decreasing associated costs, the over-head brought by the doubling of real-time data volume and rates relative to that required by a traditionalspectrometer design may still negatively impact the larger adoption of such instruments, especially in thecase of radio interferometry arrays involving a large number of individual antennas. One solution to thisis to compute d SK in the firmware and either record only the real-time SK flags, which can be representedby single bits, or even apply them in the firmware and dispense with recording power-squared entirely.eptember 7, 2018 14:36 ms-jai-arxiv EOVSA SK Correlator An alternative solution, which also broadens the use of the d SK estimator to existing spectrometers notspecifically designed for SK, was developed by Nita & Gary (2010b)—the concept of a generalized d SK es-timator that, with a trade-off of decreased temporal resolution, may be computed solely from the outputof a standard spectrometer (one that provides only accumulated PSD estimates, s = Σ Nj =1 P j ).As defined by Nita & Gary (2010b) in terms of the accumulations S = Σ Mi =1 ( s ) i and S = Σ Mi =1 ( s ) i ,and the shape parameter d , the generalized spectral estimator is d SK = M N d + 1 M − (cid:16) M S S − (cid:17) , (1)where N is the number of internal accumulations performed to produce the instrumental output s , and M is the number of such consecutive accumulations used to compute the above sums S and S . Thisgeneralized form of d SK has unity expectation not only in the particular case of accumulations involving rawPSD estimates obtained via FFT, which have a Gamma probability distribution of shape parameter d = 1(exponential distribution), but also for PSD estimates characterized by Gamma probability distributions ofarbitrary shape parameters d , particularly the PSD estimates obtained by means of narrow-band filtering ofwide-band time domain signals, which obey a chi-square statistics corresponding to a Gamma distributionwith shape parameter d = 0 .
5. Therefore, unlike its original counterpart, the generalized d SK estimatordefined by Equation 1 is no longer limited to a particular instrument design or method used to obtain thePSD estimates.It may be helpful to give some specific use cases: (i) For a spectrometer specifically designed for SK,which produces S and S in firmware, one would use N = 1, d = 1 in Equation 1. (ii) For a typicalFFT-based spectrometer that outputs only accumulated power s , one would set N to the number ofon-board accumulations, and then perform offline accumulation of S and S , and use d = 1. (iii) Fora spectrometer that outputs only a band-limited time-domain signal, one would set N to the number ofon-board accumulations of data samples, perform offline accumulation of S and S , and use d = 0 . M , d SK ′ = ( M S /S −
1) and observe its value in RFI-free regions, to estimate
N d ≈ ( M S /S − − .Although Equation 1 suffices to calculate the expected d SK for different use cases, it is also necessary toknow the d SK PDF for each case, so that thresholds can be computed beyond which one can be confident, atsome probability level, that the signal is non-Gaussian. Following the same mathematical framework usedto obtain the d SK PDF approximation corresponding to the particular case { N = 1 , d = 1 } (Nita & Gary,2010a), Nita & Gary (2010b) derived the infinite series of statistical moments of the generalized d SK esti-mator defined by Equation 1 and demonstrated that, for different regions of the parameter space { M, N d } ,the d SK PDF may be approximated by one of a Pearson Type I, Type III, or Type IV probability distri-bution, any of which can provide known probability of false alarm (PFA) detection thresholds accurateenough for practical applications. In Appendix 1 we provide an example procedure written in InteractiveData Language (IDL) that may be used to compute such thresholds.Just as KSRBL (Dou et al. , 2009) was the first implementation of an SK spectrometer, we report hereon the first implementation of SK in a multi-antenna, interferometric array, the Expanded Owens ValleySolar Array (EOVSA). In doing so, we extend the previous work by demonstrating the use of both thetraditional d SK estimator and the generalized d SK estimator for data recorded with the system. In § § §
2. EOVSA SK Correlator Design
Following our success with the KSRBL spectrometer design described by Dou et al. (2009), we have imple-mented a similar design in the Fourier transform engine (F-engine) of the 16-antenna, dual-polarization FXcorrelator for the Expanded Owens Valley Solar Array (EOVSA). EOVSA is a solar-dedicated array of 13eptember 7, 2018 14:36 ms-jai-arxiv Nita et al. small 2.1-m-diameter dishes equipped with broadband (1–18 GHz) front-end systems with crossed-linear-polarization feeds. A pair of 20 GHz optical links transmit the entire RF band from the two polarizationsto a central control building, where they are downconverted via a tunable LO to sample an instantaneousbandwidth of 600 MHz. The system is capable of tuning to a new IF band in under 1 ms, and the correlatoris designed to accumulate for 19 ms, thus providing a 20 ms sample time. The frequency-agile system iscapable of tuning in arbitrary order, and typically tunes to 50 IF frequencies in a fixed 1 s cycle. Thecorrelator also accepts input from two 27-m antennas equipped with He-cooled receivers, which are usedfor observation of celestial sources for calibration of the small dishes.The digital part of the system, comprising the ADCs, the F-engine, and the cross-correlation engine(X-engine), is based on the hardware and firmware of the Collaboration for Astronomical Signal Process-ing and Electronics Research (CASPER), and is implemented on second-generation Reconfigurable OpenArchitecture Computing Hardware (ROACH-2) boards equipped with first-generation, dual-input KarooArray Telescope analog-to-digital converter (KatADC) boards. Each KatADC board accepts input fromthe two polarizations of a single antenna, and each ROACH-2 has two such boards, for four inputs in total.The entire 16-antenna correlator thus requires 8 ROACH-2 boards. The firmware for each ROACH-2 isrunning four parallel, identical F-engines and a single X-engine that handles 1/8th of the total IF band-width on a single Xilinx Virtex-6 Field Programmable Gate Array (FPGA). The remaining 7/8ths of theband is sent via 10-gigabit ethernet packets to the other 7 ROACH-2 boards. The F-engine and X-enginedata are accumulated over 19 ms on each board, and then sent via 10-gigabit ethernet to a processingcomputer. Figure 2 shows a simplified block diagram of the digital hardware. At present (August 2016),the functional design of the digital correlator is final, but the design is operating at an FPGA clock speedof 200 MHz, lower than the design goal of 300 MHz. Small design tweaks are being applied in order tobring the timing to the design goal of 300 MHz, which is expected soon. Meanwhile, the tests described inthe next section were taken with the lower clock speed, which limits the effective IF bandwidth to 400 MHzand causes an overlap of the lower half of the band, so that it is effectively double-sideband. The upperhalf of the 400 MHz band is the preferred single-sideband. However, these complications in no way affectthe operation of the system for the purposes of evaluating SK performance.
Fig. 1.
Block diagram of EOVSA correlator. Quantities with no bit-truncation are shown without an underbar, while quantitieswith potential truncation are shown with an underbar. Truncation is performed via the ”Cast” blocks, which change the outputbit-width for practical purposes of data volume and compatibility with standard computational data types, e.g. 32-bit data foraccumulated power ( P P ) and cross-power ( P X ), and 64-bit data for power-squared ( P P ). Figure 2 gives the bit-width of the digital representation of numerical values at various points ineptember 7, 2018 14:36 ms-jai-arxiv
EOVSA SK Correlator the design. To properly determine d SK from the quantities output by the EOVSA correlator, care mustbe taken to keep the precision of the approximate output values ( P P ) and ( P P ) as high as possiblewithout overflow. The critical locations where inappropriate truncation could occur are the “Cast” blockjust after the accumulator for the accumulated power, and the “Cast” block just before the accumulatorfor the individual power-squared samples. Since these blocks merely take the most significant bits in bothcases, this amounts to ensuring that the power be as high as possible without overflowing. There aretwo adjustments that must act together to keep the power at the desired level. First, the overall analogsystem gain must be adjusted to keep the digitized ADC signal in an appropriate range, which anyway isneeded for optimum performance of the digital system, and second, a quantity called the FFT shift can bemanipulated to keep the output of the FFT in the appropriate range for good SK performance. EOVSA’s4096 frequency channels require a 13-stage FFT, and at each stage it is possible with the CASPER firmwareto shift the output by one bit to lower precision. With no shifts at all, FFT overflows are highly likely, soshifting the output of some FFT stages is often needed. Each shift of a stage results in the x, y quantitiesbeing reduced by a factor of 2, and after squaring to form the power P , the power reduces by a factor of4 (and power-squared by a factor of 16). The procedure we use to determine the optimum FFT shift is tosimply try different values and form the d SK estimator, which will be unity for non-RFI affected channelsfor a fairly wide range of FFT shift values, but will differ from unity when the FFT shift is too small, sothat P starts to overflow, or too large, so that P starts to fall to low precision.In practice, EOVSA’s design provides such a high precision (64 bits) for P P that d SK is well behavedover a wide range of FFT shifts. When only the first 3 or fewer stages are shifted, the d SK values fall belowunity for some non-RFI channels, but they are well behaved when at least 4, and up to 10, stages areshifted. This range of 7 bits in x, y corresponds to a range of 14 bits in P P and 28 bits in P P , yet theSK performance is unchanged over this range. For our operational setting, we have settled on shifting thefirst 5 stages, which gives some headroom in case of flares while keeping the precision high. Although notrelevant to SK operation, we note that once the FFT shift is set, it is a more delicate matter to keep the X data in range due to its cast to only 4 bits prior to cross-correlation, via the “Cast” block after theequalizer block (labeled “EQ multiply” in Figure 2). This is done by applying a 34 ×
128 array of equalizercoefficients (yellow block in the figure), which can vary by band and channel, with 128 coefficients per bandso that each applies to 32 of the 4096 channels in each band. The equalizer coefficients are adjusted by acalibration procedure that seeks to make the rms of the 4-bit autocorrelations lie in the range 2 to 3.
3. EOVSA SK Performance Tests
As described in §
2, the normal mode of operation of EOVSA consists of cycling a user-defined tuningsequence composed from a set of 34 predefined 400 MHz-wide IF bands (will be 600 MHz-wide whentiming is met), some of them repeated, such that the entire (1–18 GHz) frequency range is covered in 1 sby 50 independent tunings, resulting in 19-ms-long power and power-square accumulations ( M = 1792raw FFT spectra), which are divided into 4096 frequency channels ( ∼ .
098 MHz resolution). If the real-time RFI-excision algorithm is activated, the high-frequency-resolution S and S accumulations are thenused to compute d SK estimators according to equation 1 ( N = 1 , d = 1). d SK values that fall outside thethresholds are interpreted as RFI-contaminated channels, which are excluded from a subsequent frequencyintegration performed in order to produce the final 1 s time-resolution dynamic spectrum characterizedby, typically, ∼
500 “science channels” (with variable frequency resolution up to ∼
40 MHz). However, forthe purpose of evaluating the performance of the hardware-embedded SK engine, we first analyze a datasegment obtained by dwelling on a fixed 400 MHz IF band (9.6–10 GHz), which happens to be virtuallyfree of RFI.The results of statistical analysis displayed in Table 1, which was performed for the 13 EOVSA anten-nas, 2 linear polarizations labeled XX and YY, 4096 frequency channels, and 1000 contiguous accumulationsof M = 1792 raw FFT spectra each, demonstrates a full agreement of results with the theoretical expecta-tions of Nita & Gary (2010a,b). Indeed, despite the variations of bandpass shapes of individual antennasand polarizations, the distribution of the observed 4,096,000/antenna/polarization d SK samples displayedeptember 7, 2018 14:36 ms-jai-arxiv Nita et al. in Table 1 have frequency- and polarization-independent means close to unity, and sample variances con-sistent with the theoretical expected variance σ d SK = 2 / √ M = 0 . M = 1792.The accuracy of the Pearson Type IV approximation of the d SK PDF derived by Nita & Gary (2010a) isconsistently confirmed by the observed percentages of flagged data by a pair of asymmetric thresholds, d SK ≤ . d SK ≥ . . Table 1.
EOVSA SK statistical analysis of RFI-free data.
Thesample d SK means and variances are to be compared with thetheoretical expectations given in the first line of the table. Thepercentages of flagged frequency-time bins below and aboveunity are to be compared with the expected equal probabili-ties of false alarm corresponding to the asymmetric RFI de-tection thresholds, i.e. PFA( d SK ≤ . . d SK ≥ . . h d SK i σ d SK ≤ . ≥ . → In the next performance test, we tuned the EOVSA instrument to a fixed 400 MHz IF band prone toRFI contamination (3.6–4.0 GHz), and then commanded the antennas to point from an on-Sun position tooff-Sun, to simulate a change in total power level that might be seen during the evolution of a solar burst.Although S thus varies by about a factor of 3, the value of the d SK estimator is expected to be unchangeddue to its strict lack of dependence on changes of the background power level.Figure 2a displays the XX-polarization dynamic power spectrum for one of the antennas obtainedin the 3 . − f = 0 .
097 MHz, and time resolution∆ t = 0 .
02 s, the latter corresponding to an accumulation length of M = 1792 raw FFT spectra. Theeptember 7, 2018 14:36 ms-jai-arxiv EOVSA SK Correlator change of the background on a time scale of about 2 s during the on-off Sun transition is evident. Thesharp transition to higher power level at about 17:32:19 UT is due to the automatic gain control switchingout 2 dB of attenuation to compensate for the decreased radio frequency (RF) level, which affords us theopportunity to characterize the SK performance for both gradual and abrupt power-level changes. Alsoapparent from Figure 2a is the nonuniform bandpass shape reflecting the (temporary) overlap of the lowerhalf of the band mentioned in §
2. Relatively faint RFI contamination may be observed superposed on thebackground spectrum in some time–frequency bins that are crossed by the vertical dotted line and one ofthe two horizontal dotted lines that mark one instant of time and two frequency channels selected for amore detailed analysis presented in the subsequent panels of Figure 2.Figure 2b displays a color-coded dynamic spectrum of d SK flags corresponding to the power spectrumshown in panel 2a. The green uniform background indicates the RFI-free time-frequency bins where d SK iswithin the calculated thresholds. The black and yellow regions indicate the presumably RFI-contaminatedtime-frequency bins characterized by d SK ≤ . d SK ≥ . M = 1792 RFI thresholds corresponding to an intended 0 . d SK ≥ . d SK flag spectrum indicate the presence of RFI signals characterizedby <
50% duty-cycle relative to the integration time, while the more uniformly scattered d SK ≤ . d SK ≥ . d SK statistical fluctuations, as we confirm by comparingthe actual percentage of the d SK ≤ . . . . d SK ≥ . . − .
74 GHz frequency range. The corresponding d SK spectrumdisplayed in Figure 2d generally reveals a flat unity mean d SK distribution bounded by the computed0 . . − .
74 GHz range isclearly flagged by larger than unity d SK values, which are indicative of RFI contamination with < d SK lightcurve displayed in Figure 2f, which fluctuates within the statistical 0 . d SK evolution at 3.73 GHz is shown in Figure 2h. The varying d SK ≥ . <
50% duty-cycles relative to the integration time.Nevertheless, the d SK time profile clearly demonstrates the ability of the SK-based RFI excision algorithmto adaptively select contiguous RFI-free time segments suitable for science investigations, as opposed to anon-adaptive procedure that would discard entirely such heavily RFI-contaminated channels.In a final test of the SK performance, in Figure 3 we present a data segment recorded by the EOVSAinstrument during the GOES C8.6 flare of 2016-07-10 00:53:00 UT. However, since at the time of thisevent the real-time RFI excision algorithm was turned off, we do not have the full-resolution SK data, butrather only S , S and M accumulated over the wider, and variable-width, science channels. The dynamicspectrum ranges from 2.86–17.98 GHz, and the time resolution is 1 s, which, as explained in §
2, are a resultof the tuning cycle of 1 s and a frequency-domain integration resulting in 134 science channels. Moreover,prior to the science-band integration, the S and S accumulations corresponding to any repeated IF bandsduring the 1 s cycle were added together. As the result of these operations, the reduced-resolution S and S corresponding to the science channels are characterized by different accumulation lengths M . For thedata set analyzed in Figure 3, the number of accumulations range from M = 164 ,
864 = 90 × M for theeptember 7, 2018 14:36 ms-jai-arxiv Nita et al.
Fig. 2.
Fixed IF band SK analysis (3.6–4 GHz). a) Antenna 1 dynamic power spectrum for XX polarization, with ∆ f =0 .
097 MHz; ∆ t = 0 .
02 s; M = 1792, during an on-to-off-center Sun track. The vertical and horizontal dotted lines indicateone time and two frequency channels selected for the line-plots in subsequent panels. b) Color-coded dynamic spectrum ofthe corresponding d SK flags. A green, uniform background indicates d SK within the computed thresholds. Black and yellowcolors indicate the presumably RFI-contaminated time-frequency bins characterized by d SK ≤ . d SK ≥ . M = 1792 RFI thresholds corresponding to an intended 0 . d SK spectrumcorresponding to the same selected time slot as in panel c, overplotted with the thresholds in red. e) Time evolution of theintegrated power at 3.75 GHz. f) Time evolution of d SK corresponding to panel e, overplotted with the thresholds in red. g)same as e, for the RFI-contaminated frequency 3.73 GHz. h) Time evolution of d SK at 3.73 GHz. The varying d SK ≥ . <
50% duty-cycles relative to the integration time. eptember 7, 2018 14:36 ms-jai-arxiv
EOVSA SK Correlator narrower science bands at lower frequencies, to M = 1 , ,
608 = 1074 × M for the broader (and repeated)science bands at higher frequencies, where M = 1792 is the output accumulation length of the EOVSAcorrelator.Figure 3a displays the raw S spectrum that results from this procedure. Except for a strong RFI-contaminated channel, the raw solar spectrum appears almost featureless due uncalibrated, frequency-dependent gain variations, and to the strong gradient of the quiet Sun signal, which reaches more than 500solar flux units (sfu; 1 sfu = 10 − W m − Hz − = 10000 Jy) at high frequencies. Only after subtractingthis background signal and applying calibration does the <
50 sfu solar burst become clearly visible. Thecalibrated power spectrum derived from S , which is displayed in Figure 3b, reveals the flare spectrumcomposed of a more or less smooth envelope, on top of which a set of clustered fine spectral features areclearly visible. However, except for the clearly RFI-contaminated channel near 6 GHz, it is not immediatelyevident whether other frequency-time bins are affected by RFI. Although the frequency-time integration isan irreversible process, the availability of the S , S and M still allows the computation of an d SK estimatorfor each time-frequency bin, and thus discrimination of those bins affected by RFI signals.Similar to the procedure used to obtain the RFI flags displayed in Figure 2b, we show in Figure 3cthe RFI flag dynamic spectrum obtained by assigning a flat unity value to all spectral bins characterizedby d SK that fluctuate within the 1 ± / √ M range (black background), while saturating to these M -dependent values all spectral bins characterized by d SK values ranging below or above these limits. Thechoice of these 1 ± σ d SK symmetric thresholds, instead of the exact 0 . M values, for which the Gaussian approximation of the true d SK PDF is fully justified (Nita & Gary, 2010a). Figure 3c reveals that the RFI contamination is strongerthan suggested by the visual analysis of the dynamic spectra shown in Figure 3a,b. Indeed, a quantitativeanalysis demonstrates that 12 .
57% of the time-frequency bins feature d SK values above the upper 1+6 / √ M threshold, while no spectral bin is found to have d SK ≤ − / √ M .Remarkably, the d SK flagged power spectrum shown in Figure 3d, demonstrates that, with a fewnoticeable exceptions discussed below, the vast majority of the d SK flags are not related to the flare evolu-tion. This demonstrates the ability of the d SK estimator to automatically discriminate RFI contaminationagainst natural spectral features, even in the case of fine spectral structures as those seen superposed on theflare background. Nevertheless, a few flagged spectral bins coincide with some of the fine spectral features,which may be just simple coincidence, or the result of a fast evolution of the such spectral features at atime scale shorter than the integration time, as theoretically expected by Nita (2016) and experimentallydemonstrated by Nita & Gary (2016) using data recorded by an EOVSA prototype instrument during aflare featuring solar radio spikes. Therefore, aside from these few exceptions, the d SK flags may be safelyattributed to RFI contamination, and corresponding data may be safely discarded from scientific analy-sis of the event, particularly imaging when available, given the fact that EOVSA is capable designed toproduce a full Sun image for very time-frequency bin of the dynamic spectrum.We also find noteworthy the fact that none of the d SK flags are found to correspond to d SK values lessthan unity, which would be the case for an RFI signal lasting longer than the instrumental integrationtime (Nita, 2016; Nita & Gary, 2016), which in the particular case of the EOVSA tuning sequence mayrange anywhere between 20 ms (for non-repeated bands) and 60 ms (for bands repeated 3 times). However,rather than asserting that no such RFI signals were mixed in the data, we point out that, even for apurely continuous, narrowband RFI signal, which would be detected as less than unity by the real-timeRFI excision algorithm, the integration across frequency channels to form the wider-band science-frequencychannels effectively reduces the contribution of RFI to the integrated signal to a percentage much smaller50%, and thus results in d SK values larger than unity.We conclude this analysis by pointing out that, had the automatic RFI excision algorithm been activeat the time of the flare, most of the 12 .
57% bins contaminated by RFI could have been saved from beingfully discarded, and the scientific return of the instrument significantly increased.eptember 7, 2018 14:36 ms-jai-arxiv Nita et al.
Fig. 3. d SK analysis based on EOVSA data recorded during the GOES C8.6 flare of 2016-07-10 00:53:00 UT in the [2 . − . GHz frequency range. a) Raw (uncalibrated, not background-subtracted) S spectrum. b) Calibrated and background-subtracted power spectrum that reveals a more or less smooth flare envelope, on top of which a set of clustered fine spectralfeatures are clearly visible. The RFI contaminated channel around 6 GHz is also visible in this panel. c) RFI flag spectrumcorresponding to variable 1 + 6 / √ M thresholds, as explained in the main text. A much larger percent of RFI contaminatedchannel than suggested by the plots shown in a and b is revealed. d) RFI-flagged dynamic spectrum. Remarkably, almost noneof the observed fine structures are flagged, which confirms their solar origin.
4. Discussion and Conclusion
Although not a test of EOVSA’s SK performance per se, we can use EOVSA data to better illustrate inFigure 4 the contrast between using its output as an SK spectrometer versus its output as a traditionalspectrometer lacking specially designed SK capabilities.We already showed an example in Figure 2d of the d SK estimator for a single, 20 ms snapshot, wherethe lower and upper thresholds for M = 1792, N = 1 and d = 1 are 0.87145 and 1.15800, respectively.Figure 4a,c are the same as Figure 2c,d, to aid comparison. In addition, Figure 4b also shows the power-squared accumulation corresponding to the same 20 ms time slot. As indicated in the Figure 4c inset, theeptember 7, 2018 14:36 ms-jai-arxiv EOVSA SK Correlator d SK thresholds (red horizontal lines) flag 2 .
17% of the 4096 frequency channels as beingcontaminated by RFI. For later comparison, the inset also gives the mean and standard deviation of RFIoccupancy (1 . ± . .
44% standard deviation corresponding to the analyzedsequence clearly indicates a true change in RFI occupancy from one accumulation to another. We considerthis finding to be fully consistent with the observed, larger-than-unity d SK values, which can only ariseas the result of RFI signals that change their dynamics at time scales shorter than the integration time(Nita et al. , 2007; Nita, 2016), and thus may not be necessarily active, or detectable, in all consecutiveaccumulations.Now imagine that the EOVSA correlator does not provide power-squared information, but insteadprovides only accumulated power, s . We can still calculate d SK using the generalized d SK formula (Equation1), where S and S are accumulated from sums of s and s over some time interval. The middle row ofFigure 4 illustrates this approach for a sum of 50 consecutive spectral samples (1 s) of the same data asin Figure 2a, where now the generalized d SK is computed using M = 50, N = 1792, and d = 1. Figure 4ddisplays the 1 s accumulated power, while Figure 4e displays the sum of the squares of the same sequence ofaccumulated power spectra. Based on these inputs, we calculate the generalized d SK spectrum displayed inFigure 4f, which, despite being computed from a much larger data set, is affected by much larger statisticalfluctuations than those corresponding to the d SK plot displayed in Figure 4c. However, this is an expectedresult (Nita & Gary, 2010b), since the statistical fluctuations affecting the generalized d SK estimator areultimately dictated by the much smaller number of terms entering the outer sum, i.e M = 50 versus M = 1792.Nevertheless, the confidence level of the RFI flags is not dictated by the absolute value of the d SK sta-tistical fluctuations, but by the accuracy with which the RFI threshold are computed for a predefined PFA(Nita & Gary, 2010a,b). Indeed, the lower and upper thresholds, shown by red horizontal lines in Figure 4c,which are now 0.50077 and 1.71615, respectively, do not appear to be crossed by more than the expected0.13499% PFA of d SK values, except for the clearly RFI contaminated spectral region flagged in both (c)and (f) panels.From this perspective, the significantly larger percentage of RFI occupancy, 7 . d SK analysis over the same data segment on which the sequence of 50 d SK estimators indicatesonly 1 . ± .
44% RFI occupancy, cannot be attributed to a less accurate generalized d SK estimation, butto the dynamical nature of the RFI signals themselves, which during a longer time window contaminatemore frequency channels than the number contaminated in a single 20 ms period. This conclusion is alsosupported by much larger generalized d SK values, ( d SK max ∼ . d SK max ∼ . d SK and generalized d SK approaches, we display on the bottomrow of Figure 4 the result of the d SK analysis involving both outputs of the EOVSA spectrometer over theentire 1 s time window, from which accumulations of M = 50 × d SK spectra displayedon panels (f) and (i), the vertical scale of the d SK plot shown in panel (i) has been zoomed-in by a ∼
30 : 1factor, which demonstrates their statistical similarity. Remarkably, when compared with the 0.13499%PFA, the difference between the 7 .
13% and 7 .
06% percentages of RFI flags is not statistically significant,which quantitatively demonstrates the statistical equivalence of the d SK and generalized d SK estimators interms of their RFI sensitivity, when computed based on data spanning the same time window. Although,in the light of this comparison, it may be concluded that the same RFI detection performance is obtainedby generalized d SK without the need of doubling the data output of a spectrometer, our comparison alsodemonstrates that this may come at the cost of flagging and discarding potentially good data (of order 5% inthis example), and on a time scale much larger than the native resolution of the instrument. Moreover, if theastronomical signal of interest is expected to vary on a time scale shorter than the analysis time window, asnoted in Nita et al. (2007) (see also Nita, 2016; Nita & Gary, 2016), flagging of such astronomical transientswill occur. Therefore, all these aspects should be considered when designing a new instrument.eptember 7, 2018 14:36 ms-jai-arxiv Nita et al.
In any case, as we already pointed out in the introductory section, a native d SK design may extendthe potential uses of the d SK spectrometer well-beyond its originally intended scope by providing an ex-perimental means for obtaining accurate signal-to-noise-ratio and duration estimates for both Gaussianand non-Gaussian transients (Nita, 2016; Nita & Gary, 2016). Moreover, due to the additive property ofboth its outputs in both frequency and time, a native d SK spectrometer may also offer versatile means ofemploying, in an automated manner, the concept of multi-scale d SK analysis (Gary et al. , 2010), not onlyfor improving the sensitivity of RFI excision (Gary et al. , 2010), but also for the purpose of automaticdetection and classification of the statistical nature of signals (Nita, 2016). A direct illustration of themulti-scale d SK analysis concept is provided by the right column of Figure 4, where the maximum valuesof d SK vary from d SK max ∼ . d SK max ∼ . d SK max ∼ . d SK estimator defined inEquation 1. Using actual data from EOVSA, we have demonstrated that it performs exactly as expectedin a series of stringent tests, for RFI-free data (Table 1), data with RFI (Figure 2), and data containingboth RFI and solar radio bursts (Figure 3).While the SK concept is simple and easy to apply, the computation of the correct thresholds is moreinvolved in the general case. To help with this, we include some example code written in the IDL languagein Appendix 1. We note that the thresholds are typically just constants for data from a given instrumentalsetup. For example, for the EOVSA case the M = 1792, N = 1 and d = 1 values are fixed, hence thethresholds do not vary for the real-time d SK calculation. Acknowledgments
EOVSA was funded by a major construction grant from NSF under the American Recovery and Reinvest-ment Act. This work was supported by NSF grant AST-1312802, AFOSR grant FA9550-14-1-0336, andNASA grant NNX14AK66G to New Jersey Institute of Technology.eptember 7, 2018 14:36 ms-jai-arxiv
EOVSA SK Correlator d SK analysis comparing use of EOVSA data in SK mode versus use as a traditional spectrometer in generalized SKmode. a) The 20 ms accumulated power spectrum from Figure 2c. b) The accompanying 20 ms accumulated power-squaredspectrum output. c) Computed d SK spectrum (black), and 0.13499% PFA threshold levels (red horizontal lines) from Figure 2d.d) Sum of 50 consecutive 20-ms power accumulations covering 1 s of data, starting with the time frame illustrated on the toppanels. e) Sum of the squares of the same 50 consecutive 20 ms power accumulations. f) Computed generalized d SK spectrumfor the band (black) and the threshold levels (red horizontal lines). The same vertical scale as in panel (c) has been imposedto facilitate comparison of the d SK and generalized d SK fluctuations. g) The same 1 s accumulated power spectrum shown inpanel (d). h) The 1 s accumulated power-squared spectrum directly obtained by summing the 50 consecutive 20 ms powersquared accumulations natively produced by the EOVSA spectrometer. i) The d SK spectrum computed based on the 1 s powerand power-squared accumulations. The vertical scale of the d SK plot shown in panel (i) has been zoomed-in by a ∼
30 : 1factor to illustrate the practical statistical equivalence of the the 1 s d SK and generalized d SK spectra, despite the ∼
30 : 1ratio of their absolute statistical fluctuations. In each panel on the right column we indicate the actual M , N and d valuesentered in Equation 1 to compute the d SK spectra. The percentages of RFI flagged channels are displayed for each d SK plotin the right column. In panel (c), the mean and standard deviation of the RFI percentages corresponding to each of the 50consecutive time frames are also displayed. Appendix A Code example for threshold computations
Here we provide an Interactive Data Language (IDL) example code that may be used to compute thegeneralized d SK thresholds with a predefined PFA based on equations provided by Nita & Gary (2010b).The program can be adapted to other languages provided replacements are available for IDL’s incompletegamma function (igamma) and NEWTON function, which is based on the routine newt described in section9.7 of Numerical Recipes in C: The Art of Scientific Computing (Second Edition), published by CambridgeUniversity Press. ;+; NAME: eptember 7, 2018 14:36 ms-jai-arxiv Nita et al. ; GSK;; PURPOSE:; This function computes the lower and upper detection thresholds; for the GSK estimator corresponding to a given false alarm (PFA) probability level;; CATEGORY:; Numerical Analysis.;; CALLING SEQUENCE:; Result = GSK(M,N=N,d=d,PFA=PFA);; INPUTS:;; KEYWORD PARAMETERS:; N: Number of on-board accumulations before S1 and S2 are computed.;; If not provided, N=1 is assumed.; d: The shape parameter of the Gamma probability distribution corresponding to the PSD estimates.; The default value of d=1 corresponds to PSD estimates obtained by FFT means.; A value of d=0.5 corresponds to PSD estimates obtained by means of narrow; band filtering of a time domain signal.; Other vagues my be used to account for the particulars of the instrument; PFA: False alarm probability level.; If not provided, the default PFA=0.0013499 (3-sigma normal PDF) is used.; OUTPUTS:; This function returns the lower and upper thresholds in the array form [lower,upper];; RESTRICTIONS:; Restricted to reasonably small values of M to avoid IGAMMA computation errors (M<1e9);; PROCEDURE:;; EXAMPLES:;; REFERENCES:; Nita and Gary 2010, The Generalized Spectral Kurtosis Estimator,; MNRAS, 406 (1): L60-L64, doi: 10.1111/j.1745-3933.2010.00882.x;;; MODIFICATION HISTORY:; Written April 2010 by Gelu M. Nita ([email protected]);;-FUNCTION upper_root, Xcommon share, m1,m2,m3,m4,preturn,abs((1-igamma(4*(m2^3)/(m3^2),(-(m3-2*m2^2)/m3+x[0])/(m3/2/m2)))-p)ENDFUNCTION lower_root, Xcommon share, m1,m2,m3,m4,preturn,abs(igamma(4*(m2^3)/(m3^2),(-(m3-2*m2^2)/m3+x[0])/(m3/2/m2))-p)ENDfunction GSK,M,N=N,d=d,PFA=PFAcommon share, m1,m2,m3,m4,pon_error,1 eptember 7, 2018 14:36 ms-jai-arxiv
EOVSA SK Correlator ;Define default parametersif n_elements(N) eq 0 then N=1if n_elements(d) eq 0 then d=1if n_elements(M) eq 0 then M=6104if n_elements(PFA) eq 0 then PFA=0.0013499;Force double precisionp=double(pfa)M=double(M)Nd=double(N*d)PFA=double(PFA);Compute GSK moments according to equation 9 (used in threshold computation)m1=1dm2=(2*( M^2)* Nd *(1 + Nd))/((-1 + M) *(6 + 5* M* Nd + (M^2)*( Nd^2)))m3=(8*( M^3)* Nd* (1 + Nd)* (-2 + Nd* (-5 + M *(4 + Nd))))/(((-1 + M)^2)* $(2 + M* Nd) *(3 +M* Nd)* (4 + M* Nd)* (5 + M* Nd))m4=(12*( M^4)* Nd* (1 + Nd)* (24 + Nd *(48 + 84* Nd + $M *(-32 + Nd *(-245 - 93 *Nd + M* (125 + Nd* (68 + M + (3 + M)* Nd)))))))/$(((-1 + M)^3)* (2 + M* Nd)* (3 + M *Nd)* (4 + M* Nd) *(5 + M *Nd)* (6 + M* Nd)* (7 + M *Nd));Compute normalized moments entering Pearson criterion (not used below)beta1=(8* (2 + M *Nd)* (3 + M* Nd)* (-2 + Nd* (-5 + M* (4 + Nd)))^2)/$((-1 + M)* Nd* (1 + Nd) *((4 + M *Nd)^2)* (5 + M *Nd)^2)beta2=(3* (2 + M *Nd) *(3 + M* Nd)* (24 + Nd* (48 + 84* Nd + M *(-32 + $Nd *(-245 - 93* Nd + M *(125 + Nd* (68 + M + (3 + M)* Nd)))))))/$((-1 +M)* Nd *(1 + Nd)* (4 + M* Nd)* (5 + M *Nd)* (6 + M* Nd)* (7 + M *Nd));Compute Pearson criterion according to equation 11 (not used below)k=2*m2*(3*(m2^2)-m4)+3*(m3^2);Compute Type III parameters defined by equation 19delta=m1-2*(m2^2)/m3beta=4*(m2^3)/(m3^2)alpha=m3/(2*m2);Compute fourth moment error according to equation 21err4=abs((3* beta *(2 + beta)* (alpha^4)/m4-1)*100);Compute the thresholds according to equation 20upper = NEWTON([1], ’upper_root’,/double,tolf=1e-8)lower = NEWTON([1], ’lower_root’,/double,tolf=1e-8)return,[lower,upper]end References
Anderson, M. M., Siemion, A. P. V., Barott, W. C., Bower, G. C., Delory, G. T., de Pater, I. & Werthimer, D. [2012]
The Astrophysical Journal , 15, doi:10.1088/0004-637X/744/1/15.Antoni, J. [2006]
Mechanical Systems and Signal Processing Mechanical Systems and Signal Processing Mechanical Systems and Signal Processing , 308, doi:10.1016/j.ymssp.2004.09.002, URL eptember 7, 2018 14:36 ms-jai-arxiv Nita et al.
Signal Processing Conference, 2004 12th European , p. 765.Bassa, C. G., Janssen, G. H., Karuppusamy, R., Kramer, M., Lee, K. J., Liu, K., McKee, J., Perrodin, D., Purver,M., Sanidas, S., Smits, R. & Stappers, B. W. [2016]
Monthly Notices of the Royal Astronomical Society ,2196, doi:10.1093/mnras/stv2755.Deller, A. T., Brisken, W. F., Phillips, C. J., Morgan, J., Alef, W., Cappallo, R., Middelberg, E., Romney, J.,Rottmann, H., Tingay, S. J. & Wayth, R. [2011]
Publications of the Astronomical Socieaty of Pacific , 275,doi:10.1086/658907.Dion, J.-L., Tawfiq, I. & Chevallier, G. [2012]
Mechanical Systems and Signal Processing Publicationsof the Astronomical Socieaty of Pacific , p. 231, doi:10.1109/IGARSS.2014.6946399.Fridman, P. A. [2001]
A&A , 369, doi:10.1051/0004-6361:20000552, URLhttp://dx.doi.org/10.1051/0004-6361:20000552.Fridman, P. A. & Baan, W. A. [2001]
A&A , 327, doi:10.1051/0004-6361:20011166, URLhttp://dx.doi.org/10.1051/0004-6361:20011166.Gary, D. E., Liu, Z. & Nita, G. M. [2010]
Publications of the Astronomical Socieaty of Pacific , 560, doi:10.1086/652410.Liu, Z., Gary, D. E., Nita, G. M., White, S. M. & Hurford, G. J. [2007]
Publications of the Astronomical Socieaty ofPacific , 303, doi:10.1086/512825.Nita, G. M. [2016]
Monthly Notices of the Royal Astronomical Society , 2530, doi:10.1093/mnras/stw550, URLhttp://mnras.oxfordjournals.org/content/458/3/2530.abstract.Nita, G. M. & Gary, D. E. [2010a]
Publications of the Astronomical Socieaty of Pacific , 595, doi:10.1086/652409.Nita, G. M. & Gary, D. E. [2010b]
Monthly Notices of the Royal Astronomical Society , L60, doi:10.1111/j.1745-3933.2010.00882.x.Nita, G. M. & Gary, D. E. [2016]
Journal of Geophysical Research: Space Physics , n/adoi:10.1002/2016JA022615,URL http://dx.doi.org/10.1002/2016JA022615, 2016JA022615.Nita, G. M., Gary, D. E., Liu, Z., Hurford, G. J. & White, S. M. [2007]
Publications of the Astronomical Socieaty ofPacific , 805, doi:10.1086/520938.Ottonello, C. & Pagnan, S. [1994]
Electronics Letters Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engi-neering Sciences , 343, doi:10.1098/rsta.1895.0010.Ruf, C., Gross, S. & Misra, S. [2006]