Interferometric radio transient reconstruction in compressed sensing framework
SSF2A 2015
S. Boissier, V. Buat, L. Cambr´esy, F. Martins and P. Petit (eds)
INTERFEROMETRIC RADIO TRANSIENT RECONSTRUCTION IN COMPRESSEDSENSING FRAMEWORK
M. Jiang , J. N. Girard , J.-L. Starck , S. Corbel and C. Tasse Abstract.
Imaging by aperture synthesis from interferometric data is a well-known, but is a strong ill-posed inverseproblem. Strong and faint radio sources can be imaged unambiguously using time and frequency integrationto gather more Fourier samples of the sky. However, these imagers assumes a steady sky and the complexityof the problem increases when transients radio sources are also present in the data. Hopefully, in the contextof transient imaging, the spatial and temporal information are separable which enable extension of animager fit for a steady sky. We introduce independent spatial and temporal wavelet dictionaries to sparselyrepresent the transient in both spatial domain and temporal domain. These dictionaries intervenes in anew reconstruction method developed in the Compressed Sensing (CS) framework and using a primal-dualsplitting algorithm. According to the preliminary tests in different noise regimes, this new “Time-agile” (or2D-1D) method seems to be efficient in detecting and reconstructing the transients temporal dependence.Keywords: interferometry, imaging, transients, sparsity, compressed sensing
The study of the radio sky at radio wavelengths has increased since the arrival of new sensitive instrumentation.The timescale to produce images in radio was reduced by the development of new imaging techniques taking thefull advantage of the instrument. At the same time, the study of known class of transient sources (e.g. pulsarsfor general relativity tests, Active Galactic Nuclei, etc) and the recent discovery of new class of transients (e.g.Rotating Radio Transients, Fast Radio Bursts, Lorimer type bursts, see Lorimer et al. (2007)) has motivatedfurther development for transient detection and characterization.Imaging via aperture synthesis with interferometric data has been an active field of research for ∼
40 years.A radio interferometer gives a limited set of noisy Fourier samples of the sky (the visibilities (Wilson et al.2009)) inside the field of view of the instrument. An approximate of the sky can be obtained (assuming thesmall field approximation) by simply taking the inverse Fourier Transform (FT) of those visibilities. Due to thelimited number of baselines, the Fourier plane is incomplete and one requires to process this incomplete Fouriermap either by solving a deconvolution problem, using tools such as CLEAN and its derivates (e.g. H¨ogbom(1974); Schwab (1984); Rau & Cornwell (2011)) or by solving the “inpainting” problem by recovering missinginformation in the Fourier plane. Several teams have addressed this issue within the framework of the recentCompressed Sensing (CS) theory (e.g. Garsden et al. (2015); Girard et al. (2015); Dabbech et al. (2015) andother references therein). In addition, next generation of giant interferometers such as LOFAR (van Haarlemet al. 2013), suffers from “direction-dependent” effects (Tasse et al. 2012) which distort the Fourier relationshipbetween the measurements and the sky (such as array non-coplanarity and dipole projection). In Garsden et al.(2015), a new imager compatible with LOFAR combined both a sparse approach given by the CS theory andcorrections for A and W effects (Tasse et al. 2013). It also demonstrated better angular resolution and lowerresiduals as compared to classical methods, on simulated and real datasets.A lot of effort has been put into the development of detection pipelines (e.g. the LOFAR TRAnsient Pipeline– TRAP (Swinbank et al. 2015), based on fast iterative closed-loop performing calibration / imaging / source Service d’Astrophysique, CEA Saclay, Orme des Merisiers, 91410 GIF-Sur-YVETTE, France GEPI, Observatoire de Paris-Meudon, 5 rue place Jules Janssen, 92190 Meudon, Francec (cid:13)
Soci´et´e Francaise d’Astronomie et d’Astrophysique (SF2A) 2015 a r X i v : . [ a s t r o - ph . I M ] D ec
38 SF2A 2015detection / catalogue cross-matching). However, being variable and mostly point-like, the transients imagingsuffers from the imaging rate. On the one hand, short time integration enables temporal monitoring of atransient, but each snapshot provides poor visibility coverage. Therefore, the image has low signal-to-noiseratio (SNR) due to large fraction of missing data. On the other hand, long time integration ensures a goodsampling, but it will average out the temporal variability of the transient by mixing and diluting “ON” stateperiods with “OFF” state periods. As a result, a variety of transient radio sources might be missed due touncertainty or timescale filtering. Consequently, it is difficult to use classical imagers to detect and imagetransient source when the temporal variability of the transient source is unknown. There is an interest indeveloping fast imagers, enable to cope with the time variability of the sources. Such imagers can rely on theCS framework to give a quick approximate of the true sky, giving access to faster transients. This motivatedthe development of a 2D-1D sparse reconstruction imager on the experience obtained in the 2D imaging case.
Shannon theory is commonly known in the domain of signal processing to perfectly reconstruct a regularlysampled signal. However, the innovative sampling and compression theory of recent years, Compressed Sensing(CS) (Candes & Wakin 2008), or Compressive Sensing could go beyond the Shannon limit, at a rate significantlybelow the Nyquist rate, to capture and represent compressible signals based on the sparsity of observed signals.The CS theory is a paradigm for finding a nearly exact reconstruction in the case of an undeterminedproblem. In the radio interferometry imaging problem, as we have fewer observations than unknowns (i.e. thesparsely sampled FT of the sky), the CS theory applies and could enable to produce accurate maps possiblywith improved angular resolution. To achieve the perfect reconstruction from few samples, the CS theory relieson two principles: sparsity and incoherence. First, in general, the CS theory exploits the fact that the signalcan be sparse or compressive in some dictionary Φ . For instance, a signal x ( t ) may be not sparse in the directspace, but can be sparse in the wavelet space. In such case, x ( t ) can be decomposed as its sum of few, butsignificant, coefficients, as x = Φ α = (cid:80) Ti =1 α [ i ] ϕ i , with T relatively small. Second, the incoherence principlestates that a sparse signal in the dictionary Φ must be as dense as possible in the domain where it is acquired.It means that the sensing vectors must be as different as possible from the atoms of Φ . As indicated in the subsection 2.1, the interferometry imaging problem constitutes an ill-posed inpaintingproblem which can be described mathematically in Eq. (2.1): V = MFx + N (2.1)with V , the measured visibility vector, M the sampling mask which accounts for incomplete sampling in theFourier space, F the FT operator, x the sky, and N the noise. The sky x , expressed in the direct space, is areal quantity while the noise N is complex as it alters both amplitude and phase of the visibility measurements.By extension, the application to transient imaging leads to recast the entities of Eq. (2.1) as time-dependententities. In that scope, the data model of the sky x , containing a transient source, will be a cube. At a givenfrequency, two dimensions are associated with the spatial information and the third dimension describe its timedependence. In the classical 2D case, the masking operator M is a given, depending on the frequency and timeintegration. In our case, we have to account for its time-dependence as we considered a ground-based radiointerferometer that rotates with Earth (leading to the apparent motion of the sky). M will sample differentregion of the sky FT with time which is a cumulative effect to the variation of the sampled FT of the varyingsky. The imaging of a transient radio sky, can be regarded as a 2D-1D spatial-temporal image reconstructionproblem.To comply with the CS framework, the choice of the corresponding 2D-1D sparse representation Φ is criticalfor our problem. Fortunately, as the temporal information is not correlated to the spatial information, wecan separate the 2D-spatial dictionary and 1D-temporal dictionary rather than looking for a 3D dictionary.Therefore, as described in (Starck et al. 2009), an ideal wavelet function would be ψ ( x, y, t ) = ψ ( xy ) ( x, y ) ψ ( t ) ( t )where the space (xy) and time (t) are independent. As in Garsden et al. (2015), we selected the 2D starlets(Starck et al. 2010) which have proven to be adapted to astronomical sources. For the 1D temporal transformnterferometric Radio Transient Reconstruction 239 ψ ( t ) , we used decimated wavelet functions such as Haar or biorthogonal CDF 9/7 wavelets, depending on thetemporal profile of the transient. The whole 2D-1D decomposition scheme is illustrated in Fig. 1. Firstly,assuming a cube of size N x × N y × N z where N z denotes the number of time frames, the starlet transformis operated on each time frame, yielding N D spatial scales for each frame. Secondly, For each spatial scale,the temporal 1D transform is performed on each pixel column in the temporal direction of each scale. The 1Dtransform is decimated and will not increase the size of coefficients in time. Thus, we obtain a 2D-1D coefficientset of size N D × N x × N y × N z .
2D Wavelet
DATA
WT1D2D Wavelet 2D Wavelet time axis
WT1DWT1DWT1DWT1D N p = N Nx Ny Nz
Fig. 1.
Illustration of 2D-1D decomposition: For a cube of size N x × N y × N z , the total number of coefficients will be N D × N x × N y × N z where N D is the 2D decomposition scale. Given this 2D-1D dictionary Φ, the minimization problem can be formulated from Eq. (2.1) in the analysisframework: min || Φ t x || s . t . || V − MFx || < (cid:15), (2.2)where (cid:15) denotes the error radius. The objective minimization function takes the form of || Φ t · || where the (cid:96) -norm (summation of absolute values of coefficients) is used to reinforce the sparsity of the solution and ensurethe convexity of the problem. However, the (cid:96) -norm involves a soft-thresholding operator which induce bias issolutions (Starck et al. 2010). This is particularly unsuitable for scientific data analysis involving photometry.According to Candes et al. (2007), the reweighted (cid:96) scheme is one way to handle this issue. To address thisissue, we adopted a reweighted (cid:96) scheme (Candes et al. 2007), by defining a weighting vector W . In addition,as the source photometry is always assumed positive, we impose a positivity constraint as well. Thus, ourconvex minimization problem can be formulated as: min x || V − MFx || + || W (cid:12) λ (cid:12) Φ t x || + i C + ( x ) , (2.3)where λ , a scale-dependent vector, is a Lagrangian parameter which depends implicitly on (cid:15) of the data fidelityin (2.2), the operator (cid:12) denotes the element-by-element multiplication and i C + ( x ) is the indicator function(which is 0 if x belongs to C + , + ∞ otherwise)As λ is not explicitly related to the error radius (cid:15) , the mathematical relation between λ and (cid:15) is not easyto find out. However, since the real dataset V is noisy and makes the sparsity constraint decline in the 2D-1Ddecomposition space, λ is closely related to the statistical distribution of decomposition coefficients. Thus,the study of the statistical distribution is important. In practice, several noise driven strategies are availableto estimate the statistical distribution. One of the strategies is noise-driven strategy from the residual, whichis used hereof. As we will see in section 2.3, the residual is obtained by R ( n ) = V − MFx ( n ) in the n-thiteration. Consequently, the statistical distribution α R ( n ) = Φ t F t M t R ( n ) , and σ is accessible by the reliablenoise estimator MAD (Median of the Absolute Deviation). Then, λ = k σ where k defines the level of thesignificant coefficients which lie within the band kσ of the Gaussian distribution.40 SF2A 2015 According to Candes et al. (2007), the reweighted (cid:96) scheme is applied as follows:1. Set the iteration count n = 0 and initialize W (0) = .2. Solve the minimization problem (2.3) yielding a solution x ( n ) , and α ( n ) is obtained by α ( n ) = Φ t x ( n ) .3. Update the weights (see later on).4. Terminate on convergence or when reaching the maximum number of iterations N max . Otherwise, incre-ment n and go to step 2.First, a biased solution x is obtained by the non-reweighted convex optimization, then a weighting step isperformed using the following weighting strategy: if | α i,j | ≥ k (cid:48) σ j then w i,j = k (cid:48) σ j / | α i,j | , else, w i,j =1 (operationlater described as function f ( | α i,j | ). We update the weights for each entity i at scale j , σ j is the noise standarddeviation at scale j . k (cid:48) acts as a reweight level in scale j . Then, we subsequently apply the proximal theoryto solve the minimization problem (2.3) with the Condat-V˜u splitting method (CVSM - Condat (2013); V˜u(2013)). The CVSM introduces a primal-dual pair ( x, u ) to solve the convex optimization problem (2.3) usingforward-backward algorithm. Thus, in summary, the adapted CVSM with reweighted scheme is presented inAlgo 1, where the parameters τ and η respect the convergence condition such as 1 − τ η || Φ || > τ || MF || / µ is a relaxation parameter used to accelerate the algorithm. If µ = 1, the algorithm is in the unrelaxedcase or no acceleration case. In the Algo 1, the line 3 can be considered as the forward step to converge thenon-negative solution from the residual R = V − MFx , while the line 4 can be regarded as the backward stepto enforce the sparsity constraint by using the soft-thresholding proximity (ST).
Algorithm 1:
Analysis reconstruction using CVSM
Data : Visibility V ; Mask MResult : Reconstructed image x Initialize ( x (0) , u (0) ) , W (0) = , τ > , η > , µ ∈ ]0 , for n = 0 to N max − do p ( n +1) = Proj C + ( x ( n ) − τ Φ u ( n ) + τ ( MF ) ∗ ( V − MFx ( n ) )); q ( n +1) = ( Id − ST λ (cid:12) W )( u ( n ) + η Φ T (2 p ( n +1) − x ( n ) )); ( x ( n +1) , u ( n +1) ) = µ ( p ( n +1) , q ( n +1) ) + (1 − µ )( x ( n ) , u ( n ) ); α ( n ) = Φ t x ( n ) ; Update W by w ( n +1) i,j = f ( | α ( n ) i,j | ); end return x ( N max ) We simulated a sky model of size 32 × ×
64, i.e. 32 ×
32 pixels on image plane and 64 2-min frames on time.The time-dependent sky model is constituted of a control steady source at the center of the field and a transientsource with a gaussian light curve (FWHM = 20 min located at time slice T=24). Both sources have the samepeak flux density of 10 arbitrary unit in the sky model. Then, to do the realistic simulation, we generated a2-hour Fourier sampling mask cube by using an uniform random distribution of 20 antennas observing at thezenith at the latitude of the Nanay Radio Observatory.To generate the observed visibilities in terms of noise levels, we take the FT of the sky model and apply themask cube, and then add white gaussian noise with various magnitude ( σ = 0 . , . , . , . Flux (AU)
Fig. 2. (left) Dirty images from the benchmark data cube containing two point sources: a central steady source(x=15,y=15) and a transient source (x=24,y=6). First row corresponds to the OFF state (T=10) and second rowto the ON state (T=25) at the maximum of the transient. Each column corresponds to various level of additive gaussiannoise with σ = 0 , . , . , . due to missing data), our 2D-1D CS method gives a perfect profile reconstruction, with flux unit relative error ∼ − . As the noise level increases, the flux density of the transient is more spread around the central pixel,resulting in a bias of the peak. However, by summing the flux of the transient on nearby pixels (over a surfaceequal to the source size), the profile of the transient is again well recovered. Meanwhile, the steady source(not shown) is also well recovered except in the high noise regime where fluctuations are present. From thesepreliminary results, it seems that our 2D-1D CS reconstruction method is efficient reconstruct the transient ina noisy dataset. The reconstruction time was not monitored but is subject to further study. CS theory offers new tools for solving ill-posed problems such as the imaging problem in radio interferometry. Inprevious studies, such as Garsden et al. (2015), we have shown that the 2D CS method can outperform classicaltools used for deconvolution. In this work, we present an extension of this 2D imager, by recovering the 3D datawith the third dimension being the temporal information to detect and reconstruct radio transients. We usedrespectively a 2D and a 1D wavelet dictionaries to perform 2D-1D reconstruction. In addition, we minimizethe convex problem using Condat-V˜u splitting method in the proximal theory framework. The preliminaryresults based on simulated data cubes containing both steady and transient sources show a good potentialfor transient imaging. For next steps, we will compare our method with classical deconvolution methods anddevelop a “time-agile” imager addressed to the next generation of interferometers, such as LOFAR and SKA,to enable the detection of radio transients. A featured paper is in preparation, and contains extended tests onvarious dataset (simulated and real datasets containing transient source). As the search for known and unknowntransient is a emerging field in radio astronomy, the development of such tools may have a strong impact ontransient studies.
We acknowledge the financial support from the UnivEarthS Labex program of Sorbonne Paris Cit´e (ANR-10-LABX-0023 andANR-11-IDEX-0005-02) and the Physis project (H2020-LEIT-Space-COMPET).
References
Candes, E. & Wakin, M. 2008, Signal Processing Magazine, IEEE, 25, 21Candes, E. J., Wakin, M. B., & Boyd, S. P. 2007, ArXiv:0711.1612Condat, L. 2013, Journal of Optimization Theory and Applications, 158, 460Dabbech, A., Ferrari, C., Mary, D., et al. 2015, A&A, 576, A7Garsden, H., Girard, J. N., Starck, J. L., et al. 2015, A&A, 575, A90Girard, J. N., Garsden, H., Starck, J. L., et al. 2015, Journal of Instrumentation, 10, C8013
42 SF2A 201542 SF2A 2015