Imaging Spatially Extended Objects with Interferometers: Mosaicking and the Short Spacing Correction
Brian S. Mason National Radio Astronomy Observatory, Charlottesville, VA, USA; [email protected]
Abstract.
Interferometry is a powerful technique for making sensitive, high-fidelity images of the sky, but is limited in its ability to measure extended or diffuse emission. Better images of extended astronomical objects can be obtained by mosaicking together many pointings of the interferometer array. Even better images can be obtained by combining these data with data from a single-dish telescope. This lecture explains commonly practiced techniques for obtaining and analyzing these observations, and the theory behind them.
1. Introduction
An interferometer can make remarkably accurate measurements of the sky intensity, but the features that these measurements capture are limited by the length and orientation of the interferometer's baselines. The absence of information at spatial frequencies higher than that provided by the interferometer's longest baselines results in limited angular resolution, making features much smaller than the scale of θ_min = λ/b_max impossible to distinguish. The absence of information on angular scales larger than those measured by the interferometer's shortest baselines, θ > θ_max ∼ (λ/b_min), can in many instances be even more problematic. The interferometer can sometimes be reconfigured to measure larger scales by having more, shorter baselines. However, since the baselines cannot be shorter than the dish diameter D, θ_max can never be larger than (λ/D). The missing large-scale information corresponds to missing flux for spatially resolved objects, and can give rise, in effect, to variable local backgrounds that make it challenging to measure integrated flux densities, even of relatively compact objects. Since much astrophysics is done by comparing spatially integrated features at different wavelengths (e.g., the integrated line intensity in two transitions, or the integrated continuum flux density at two widely separated frequencies), this can be a significant limitation.

This lecture will address two of the main methods to overcome this limitation: acquiring and jointly analyzing data from adjacent interferometer pointings, a technique known as mosaicking; and combining data from one or more single-dish telescopes with the interferometer data. In § 2, I will articulate the problem and present the mathematical and conceptual foundation for both of these methods, the Ekers-Rots theorem.
We will see that the Ekers-Rots theorem implies that the shortest baseline b_min in fact contains information about scales θ > (λ/b_min), and gives a conceptual scheme to retrieve that information from mosaicked interferometer observations. Several common approaches for analyzing interferometric mosaics are discussed in § 3. Techniques for combining single dish and interferometer data are discussed in § 4, and some common "real-world" considerations are discussed in § 5.

A few general comments are in order. The purpose of this lecture is to explain the current, generally accepted best practices involved in interferometric imaging of spatially extended astronomical objects, and the theory behind these practices. The reader is presumed to be familiar with the basic concepts of synthesis imaging, such as the CLEAN algorithm for deconvolution. These foundational topics are covered in Taylor et al. (1999), and with greater depth and mathematical rigor in Thompson et al. (2017). The reader may also find the online lectures from recent Synthesis Imaging Workshops to be of use (e.g. Mioduszewski 2014; Marvil 2018). The field of astronomical imaging is dynamic, and consequently there are numerous new and exciting techniques under development that are applicable to the problems we consider. While pointers into this literature are provided, there is no attempt to survey these approaches systematically, which could constitute an article unto itself. In this article the clean deconvolution algorithm will be denoted as CLEAN; implementations of CLEAN (or other imaging algorithms) in particular packages will be indicated in typewriter font: clean, tclean, or MOSMEM, for instance. Most of the practical examples in this lecture use the CASA package. At the time these lecture notes were posted the most recent CASA release was 5.6. In this and recent releases, the generally recommended implementation of CLEAN is the task called tclean.
2. Mosaicking Fundamentals

2.1. The Problem
There are two related criteria by which a source can be considered "large" from the point of view of synthesis imaging:

1. The source is large compared to the scale measured by the shortest baseline b_min: θ_src > λ/b_min.

2. The source is large compared to the antenna primary beam: θ_src > θ_B.

Here θ_B is the Full-Width at Half Maximum (FWHM) of the antenna primary beam; λ is the wavelength of the observation; and θ_src is the source size. As mentioned previously these two cases are related, since θ_B is determined by the antenna diameter D (θ_B ≈ λ/D), and the shortest possible physically realizable value of b_min is also constrained by the dish diameter (b_min > D). Interferometric mosaicking can help in both cases, although in the second case supplementary single-dish data will often be required, depending on the science goal and source morphology.

A related situation is that where the sources are compact by the above criteria but distributed over a region comparable to or larger than the antenna primary beam. Many interferometric surveys fall into this category, such as the NRAO VLA Sky Survey (NVSS: Condon et al. 1998) and the more recent VLA Sky Survey (VLASS: Lacy et al. 2020). These wide-field imaging cases require techniques (for instance, accurate modelling of the primary beam and accounting for the W-term) that can also apply to the cases we are considering here. For clarity our discussion will focus specifically on issues involved in imaging and flux recovery for objects that are large in the sense indicated. Note that for geometrical reasons, the W-term tends not to be significant for the types of compact configurations which are best for imaging extended sources.

To make the problem concrete, consider the visibility measured on a baseline of length b, for a sky brightness component which has a Gaussian shape with FWHM = θ_src.
As a proxy for the quality of information on a given angular scale, we consider the ratio of the measured visibility to the total flux density of the Gaussian component S_src, which can be shown to be:

$$\frac{V(b)}{S_{\rm src}} = \frac{1}{1 + (\theta_{\rm src}/\theta_B)^2}\, \exp\!\left( \frac{-4.71\,(b/D)^2}{1 + (\theta_B/\theta_{\rm src})^2} \right) \qquad (1)$$

Here we have assumed the antenna primary beam FWHM θ_B = 1.15 × (λ/D). This ratio is shown as a function of source size for three representative baseline lengths in Figure 1. Several facts are apparent. First, for a source as large as the primary beam (θ_B = θ_src), even the shortest physically realizable baseline (b = D) performs very poorly, recovering only 5% of the total flux density. For this reason, a typically quoted "largest angular scale" (LAS) that an interferometer can effectively measure is θ_LAS = 0.5(λ/b_min). If the uv-coverage is poor, even this is an overestimate. Our single-baseline example measurement would detect ∼
40% of the total flux density of a Gaussian component of this size. Second, it is apparent that the decline in visibility with baseline length is very rapid. For instance, a b = 2D baseline, which practically speaking is still a very short baseline, will recover only 4% of the total flux density for a Gaussian component with θ_src = θ_LAS = 0.5(λ/b_min). Finally, the increase in the signal amplitude as the baseline gets shorter is (reciprocally) rapid: by Eq. 1 a hypothetical b = D/2 "baseline" would recover roughly 70% of the total flux density of the same component.

A conceptual picture of the effect of missing short-spacings can be obtained by considering Figure 2. Imagining the interferometer uv-coverage as a uniform plateau, which has a Fourier transform of a Sinc() function, the short-spacing deficit can be thought of as a uniform plateau of lesser extent subtracted from the uv-coverage. This introduces a negative bowl around the core of the point spread function (PSF), or dirty beam, which in turn gives rise to negative bowls around regions of positive emission in the dirty map.

2.2. The Ekers-Rots Theorem

Begin by considering the measurement equation, which states the relationship between a single visibility measurement V(u, v) and the sky intensity I(ℓ, m):

$$V(u,v) = \int\!\!\int d\ell\, dm\, A(\ell,m)\, I(\ell,m)\, e^{-2\pi i (u\ell + vm)} \qquad (2)$$

(The LAS a particular interferometer can accurately recover in a given observation depends on the uv-coverage. There is furthermore no widely accepted quantitative criterion by which this LAS is judged. Consequently there is some variation in the relationship between LAS and b_min that different facilities quote. For example, the ALMA Technical Handbook gives θ_LAS = 0.6 λ/b_min, and the VLA Observational Status Summary quotes somewhat different coefficients for long tracks, with half those values for snapshots. If this level of discrepancy would substantially impact the science of a proposed project, careful simulations would be in order to determine the suitability of the proposed observations.)
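The flux fractions quoted above can be reproduced numerically from Eq. 1. The following sketch is illustrative only: the function name is ours, and it assumes a Gaussian primary beam with θ_B = 1.15 λ/D, which fixes the exponent coefficient at (π²/4 ln 2) × 1.15² ≈ 4.71.

```python
import math

def flux_fraction(b_over_d, src_over_beam):
    """Fraction of a Gaussian component's total flux density measured on a
    single baseline (Eq. 1).

    b_over_d:      baseline length b in units of the dish diameter D.
    src_over_beam: source FWHM theta_src in units of the primary beam
                   FWHM theta_B, with theta_B assumed to be 1.15 lambda/D.
    """
    # Attenuation of the total flux by the primary beam:
    prefactor = 1.0 / (1.0 + src_over_beam ** 2)
    # Gaussian falloff with baseline length; 4.71 ~= (pi**2/(4 ln 2)) * 1.15**2
    exponent = -4.71 * b_over_d ** 2 / (1.0 + (1.0 / src_over_beam) ** 2)
    return prefactor * math.exp(exponent)

# A source as large as the primary beam, on the shortest physical baseline
# (b = D): only ~5% of the total flux density is measured.
print(round(flux_fraction(1.0, 1.0), 3))   # ~0.047

# A source of size theta_LAS = 0.5 lambda/b_min (b_min = D) yields ~40% on
# a b = D baseline, but only ~4% on b = 2D.
print(round(flux_fraction(1.0, 0.5 / 1.15), 2))
print(round(flux_fraction(2.0, 0.5 / 1.15), 2))
```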
Figure 1. Fraction of total flux density retrieved by an interferometer as a function of the ratio of the (Gaussian) source size to the antenna primary beam, for three different baseline lengths.

Here A(ℓ, m) is the antenna primary beam (assumed identical for all antennas); I(ℓ, m) is the sky intensity; ℓ and m are sky coordinates; u and v represent the baseline between antennas in wavelengths; and V(u, v) is the measured visibility value. First, note that V(0,
0) is the flux density which would be measured by a single dish telescope with the same primary beam A, also known as the "zero-spacing" flux. Most of the following therefore also applies to total power measurements with single dish telescopes.

From the convolution theorem, Eq. 2 can be written as

$$V(u,v) = \tilde A(u,v) \otimes \tilde I(u,v) \qquad (3)$$

where Ĩ(u, v) is the Fourier transform of the sky intensity and Ã(u, v) is the Fourier transform of the antenna primary beam. From this it is clear that the antenna primary beam has the effect of a point spread function in uv space: a single visibility measurement V(u, v) has information not only from the Fourier component (u, v) of the ideal sky brightness Ĩ(u, v), but also from a range of nearby values determined by the Fourier transform of the primary beam.

Using the fact that the antenna primary beam A(ℓ, m) is the magnitude squared of the Fourier transform of the aperture illumination function E(u, v) (Napier 1999; Hunter & Napier 2016), it can also be shown that

$$\tilde A(u,v) = E(u,v) \otimes E(u,v),$$

The term "zero-spacing flux" is sometimes loosely used to mean the total flux density of a given object, although if the source is comparable to or bigger than the primary beam A(ℓ, m), the total flux density is greater than the value V(0,
0) that a single-dish having that primary beam would measure.

Figure 2. Conceptual illustration of the effect of missing short spacings on an interferometer's synthesized beam. Left (a): uv- or aperture-plane coverage; Right (b): resulting beam. Figures from Braun & Walterbos (1985).

i.e., the uv-plane "point-spread function" of an interferometer is the auto-correlation function of the aperture illumination function. For a finite, circular aperture of diameter D, E(u, v) is nonzero only out to a radius √(u² + v²) = D/(2λ). It follows that Ã(u, v) will typically have support over a region twice as large, i.e., from (u, v) = (0,
0) to √(u² + v²) = D/λ. Therefore, a given baseline b contains information not only about the spatial frequency b/λ, but about a whole range of spatial frequencies from (b − D)/λ to (b + D)/λ.

This result is illustrated in Figure 3: a single visibility measurement on a baseline of (nominal) length b contains information about a range of baseline lengths, from b − D to b + D. Similarly, a measurement with a single-dish telescope can be thought of as being a sum of measurements with "baselines" ranging from D to 0. Geometrically it is clear that the single dish measurement has many of the very short spacings and few "baselines" of length ∼ D; the single dish correspondingly provides greater sensitivity or weight at the shortest spatial frequencies and less at the higher spatial frequencies. Similarly the interferometer provides the highest sensitivity at the spatial frequency b/λ.

The blurring effect of the primary beam on an interferometer's uv-coverage can be seen in Fig. 4. This is particularly important for the problem of imaging large objects: the "zero-spacing hole" has shrunk, and the shortest interferometer baseline b ∼ D is seen to contain information almost all the way down to (u, v) =
Figure 3. Geometrical interpretation of the range of baselines measured by a single dish (blue, left) and a single baseline interferometer (red, right).

(0, 0). Some of the shortest effective spacings are nonetheless poorly measured due to their low weight; these correspond to spatial frequencies of ∼ (3 m)/λ in Fig. 3. The other effect of the primary beam convolution has been to fill in most of the unsampled holes in the aperture plane, at least for this (quite compact) array configuration. If this information can be used it has the potential to stabilize nonlinear deconvolution algorithms like CLEAN (Cornwell et al. 1999).

The problem is that with only a single measurement V(u, v), it is impossible to separately estimate the contributions of the individual Fourier components of the sky brightness. Ekers & Rots (1979) first showed how this limitation can be overcome for an interferometer. By continuously scanning the antennas' pointing positions, making many measurements at different pointing centers on the sky (ℓ′, m′), one can tabulate (sample) the function V(u, v, ℓ′, m′). Fourier transforming this function with respect to ℓ′ and m′ yields an estimate of the sky Fourier transform which has a Fourier resolution determined by the size of the area scanned over. If this area is larger than the primary beam, this is a higher Fourier resolution estimate of the sky brightness. It will also allow

Figure 4. Nominal aperture plane coverage of a single snapshot of a compact ALMA configuration (left). The effective aperture plane coverage is shown on the right, which is simply the convolution of the points in the left panel with the antenna (aperture-plane) primary beam. Note that the units of aperture plane coordinates are distance units (in this case, meters).
This is a convenient representation of an interferometer's spatial frequency response when, as is often the case by design, the antenna illumination pattern is approximately wavelength-independent.

estimation of sky components with spatial frequencies all the way down to (b_min − D)/λ, i.e., less than the spatial frequency given by the shortest physical baseline b_min/λ. A completely analogous argument applies to measurements made by scanning a single dish telescope.

The scheme described by Ekers & Rots gives powerful conceptual insight but suffers from several shortcomings as a practical solution to the short spacing problem. A variety of practical implementations are described in § 3. Among the shortcomings of Ekers-Rots is the fact that what is usually of interest is not a higher resolution estimate of the sky's Fourier transform, but rather, an improved image of the sky, and in particular, an image which has not been convolved with the interferometer's synthesized beam. Consequently most practical algorithms which make use of the Ekers-Rots information do so within the context of a deconvolution operation. Additionally, most interferometers do not retain fixed baselines (u, v) with time; rather, the baselines (projected on the sky) change as time progresses. Finally, continuously-scanned interferometric observing has not been commonly available on major radio interferometers until relatively recently.

2.3. Mosaicking & Fourier Space Sampling
Although the argument of Ekers & Rots assumes the interferometer is continuously scanned across the sky, it turns out that continuous scanning is not in fact required. Cornwell (1988) showed that the full Ekers-Rots information can be obtained provided
only that the sky is Nyquist sampled with respect to the antenna primary beam. Cornwell also presented one of the first practical implementations of a deconvolution algorithm which has the benefit of the short-spacing, Ekers-Rots information. This algorithm will be discussed in § 3.2.1.

The Nyquist sampling requirement can be understood as follows. Imagine a single-baseline interferometer with a fixed baseline (u, v), constructing a rectangular mosaic of pointings spaced by ∆ in each of two orthogonal directions on the sky. This will give rise to a series of aliases spaced by 1/∆ in the aperture plane. Each measurement, and each alias, is in fact a measurement over a circular patch of uv-space of diameter 2D/λ. Therefore spacing the interferometer pointings ∆ ≤ λ/(2D) ∼ θ_B/2 suffices to prevent the aliases from overlapping the "real" data.

Because the Nyquist-Shannon theorem strictly applies only in one dimension, it is in fact possible to do slightly better than this. A generalization of the Nyquist-Shannon theorem (Petersen & Middleton 1962) can be used to show that in two dimensions the optimal sampling strategy for circularly band-limited functions is a hexagonally close-packed lattice. This gives the maximally close-packed arrangement of aliases in Fourier space, and therefore a somewhat more sparse set of telescope pointings in real space. For the case of mosaicking with antennas of diameter D it corresponds to interferometer pointings arranged in equilateral triangles with centers spaced by λ/(√3 D) ∼ θ_B/
2. A hexagonal pattern will require 15% fewer pointings than a rectangular mosaic covering the same region.

For imaging extended emission with an interferometer the recommended practice is to use a hexagonal mosaic with pointing centers spaced by λ/(√3 D) ∼ θ_B/2. For observations with a wide fractional bandwidth, the shortest wavelength λ_min should be used to establish the sampling requirement. In practice, the amount of aliasing caused by slightly undersampling the sky is greatly reduced by the effects of the aperture illumination taper. Therefore for surveying relatively compact sources over large regions considerably more sparse samplings, a substantial fraction of θ_B, are often used to good effect. Note that the anti-aliasing sampling requirements λ/(2D) (for a square mosaic) and λ/(√3 D) (for a hexagonal mosaic) are exact and do not require use of an approximate Gaussian beam width. As pointed out by Cornwell (1988), all of these sampling arguments also apply to single-dish mapping.

An approach which has gained wider use recently due to advances in computing and storage speed is on-the-fly (or OTF) mosaicking (e.g. Murphy et al. 2010; Lacy et al. 2020). The OTF observing technique is widespread in the single-dish community; Mangum et al. (2007) present an excellent review of the details involved in using OTF in a single dish context. In interferometric OTF mosaicking, just as for single-dish OTF, the antenna pointing positions are continuously and synchronously scanned over the region of astronomical interest on the sky while the visibility data are recorded, usually at a considerably higher data rate than would be necessary for a conventional mosaic. This is also, of course, the observing scheme considered by Ekers & Rots (1979). OTF mosaicking will often provide higher observing efficiency (i.e., lower observing overheads), and can be advantageous in situations where the goal is to map large areas quickly.

This assumes that the antennas are circular, as is commonly but not always the case.
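The hexagonal pointing geometry described above is simple to lay out in code. The helper below is an illustration, not an observatory tool: the function name and argument conventions are ours, and it uses the spacing s = θ_B/2 as a convenient stand-in for the exact λ/(√3 D) criterion.

```python
import math

def hex_pointings(width, height, theta_b):
    """Pointing-center offsets (same units as inputs, e.g. arcsec) for a
    hexagonally packed mosaic covering a width x height rectangle.

    Pointings within a row are spaced by s = theta_b / 2; successive rows
    are spaced by (sqrt(3)/2) * s and offset by s/2, so every pointing sits
    at the vertex of an equilateral triangle of side s.
    """
    s = theta_b / 2.0
    row_step = s * math.sqrt(3) / 2.0
    n_rows = int(math.ceil(height / row_step)) + 1
    n_cols = int(math.ceil(width / s)) + 1
    points = []
    for j in range(n_rows):
        y = j * row_step - height / 2.0
        x_offset = s / 2.0 if (j % 2) else 0.0   # stagger alternate rows
        for i in range(n_cols):
            x = i * s + x_offset - width / 2.0
            if abs(x) <= width / 2.0 and abs(y) <= height / 2.0:
                points.append((x, y))
    return points

# A 60" x 60" mosaic with a 6" primary beam: every pair of neighboring
# pointings is separated by exactly s = 3".
pts = hex_pointings(60.0, 60.0, 6.0)
```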
3. Interferometric Mosaicking Algorithms
In the following sub-sections, I describe the algorithms most commonly used to make images from mosaicked interferometer data.
3.1. Linear Mosaicking

The most straightforward approach to making an image from mosaicked interferometer data is the so-called linear mosaic. In a linear mosaic the individual interferometer pointings are deconvolved separately and the resulting deconvolved maps are appropriately weighted and added together to make a final mosaic image. Explicitly, the linear mosaic image I_LM formed out of the individual-pointing deconvolved images I_p is:

$$I_{\rm LM}(x) = \frac{\sum_p A(x - x_p)\, I_p(x)/\sigma_p^2}{\sum_p A^2(x - x_p)/\sigma_p^2} \qquad (4)$$

Here σ_p is an estimated or measured noise level for pointing p, which is needed to weight the data for optimal sensitivity in beam-overlap regions. For the single-field case this reduces to

$$I_{\rm LM}(x) = I_p(x)/A(x - x_p)$$

which is simply the deconvolved image corrected for the primary beam attenuation.

Linear mosaicking is not usually recommended for imaging extended objects. It suffers from two significant disadvantages for this application, both related to the critical deconvolution step. First, because the visibility data from the individual pointings are never used together, the deconvolution algorithm does not have access to the Ekers-Rots information. Therefore the Fourier resolution and short-spacing information in the deconvolved image will be no better than for a single field. Second, the signal to noise ratio and uv coverage in the individual (dirty) maps used by the deconvolution are worse than in the joint map, particularly in regions of significant overlap between adjacent pointings. This also limits the effectiveness of the deconvolution algorithm, e.g., how deeply it is feasible to clean in order to accurately deconvolve diffuse structures. These problems are exacerbated by the fact that the deconvolutions used for interferometric imaging are necessarily nonlinear operations.

On the other hand linear mosaicking does offer the possibility of carefully fine-tuning the calibration and deconvolution parameters for each individual pointing.
This can be useful when there are significant time- and sky-position-variable effects (such as the ionosphere), or when the field is crowded or confused. For these reasons linear mosaicking is often used at low frequencies. Note, however, that a linear mosaic of the dirty images does in fact contain the full Ekers-Rots information. Since it is convolved with the interferometer synthesized beam, which moreover will vary from pointing to pointing, and thus over the mosaicked dirty image, the joint dirty image mosaic is not useful for most applications. This observation does, however, motivate one of the methods of joint deconvolution presented in the next section.

Linear mosaicking is available via the CASA image toolkit method linearmosaic; there is also a prototype implementation in tclean accessed by setting gridder='imagemosaic'. It is available in AIPS through the FLATN task.
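The weighted combination of Eq. 4 is straightforward to express with array operations. The sketch below is an illustrative helper (names are ours; it is not a CASA task), and it assumes the per-pointing deconvolved images and primary beam responses have already been placed on a common pixel grid:

```python
import numpy as np

def linear_mosaic(images, beams, sigmas):
    """Weighted linear mosaic of deconvolved single-pointing images (Eq. 4).

    images: list of 2-D arrays, the deconvolved (not PB-corrected) maps I_p;
    beams:  list of 2-D arrays, the primary beam response A(x - x_p) of each
            pointing evaluated on the common grid;
    sigmas: list of per-pointing noise levels sigma_p.
    """
    num = np.zeros_like(images[0], dtype=float)
    den = np.zeros_like(images[0], dtype=float)
    for img, pb, sig in zip(images, beams, sigmas):
        num += pb * img / sig ** 2       # sum_p A I_p / sigma_p^2
        den += pb ** 2 / sig ** 2        # sum_p A^2 / sigma_p^2
    # Blank pixels with no primary-beam coverage rather than divide by zero.
    return np.where(den > 0, num / np.where(den > 0, den, 1.0), np.nan)
```

For a single pointing the result reduces to I_p/A, the primary-beam-corrected image, as in the text.
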
3.2. Joint Deconvolution

In order for the deconvolved interferometer image to possess accurate information at spatial frequencies lower than that of the shortest baseline, it is necessary for the deconvolution algorithm to make use of the visibilities from all of the interferometer pointings (see, e.g., Cornwell 1988; Cornwell et al. 1993; Sault et al. 1996). I will describe three approaches that have been widely used: one that combines the visibility data during deconvolution (§ 3.2.1), and two that combine the visibility data from different pointings before the deconvolution (§ 3.2.2 and § 3.2.3). Note that I use the term "deconvolution" in a broader sense, common in the synthesis imaging community, of estimating the true sky brightness from the incompletely sampled data, rather than the stricter mathematical sense of removing the effects of a spatially invariant point spread function.

3.2.1. Nonlinear Mosaicking & Maximum Entropy
CLEAN is a procedure, or recipe, which decades of use by the community have shown to be quite effective for deconvolution of interferometric data. Its nature as a procedure, however, can make its results and limitations difficult to understand. An alternative approach is to determine the true sky brightness as the solution to a more mathematically simple and well-defined problem. One could, for instance, regard the true (deconvolved) sky as represented by a set of pixel values I_j, all of which are parameters which are varied in a χ² fit to the measured visibility data V:

$$\chi^2 = \sum_p \sum_i \frac{|V_{p,i} - V^M_{p,i}|^2}{\sigma_{p,i}^2} \qquad (5)$$

Here the model visibility values V^M_{p,i} for each pointing p are obtained by Fourier transforming the sky model I_j using the measurement equation. Due to the finite sampling of uv space, this minimization is not generally a well-posed or stable problem: this is the "ghost distributions" problem discussed in Cornwell et al. (1999).

Cornwell (1988) introduced a practical algorithm for deconvolving mosaicked interferometer data which, instead, aims to maximize the so-called Entropy

$$H = -\sum_j I_j \ln\!\left(\frac{I_j}{M_j\, e}\right) \qquad (6)$$

subject to the constraint that χ² (Eq. 5) is close to its expected value. In Eq. 6, M_j is a "default image" which is an input to the deconvolution process. In the absence of other information it is typically taken to be a constant. The H term serves to regularize the otherwise ill-posed inversion problem and thereby select one of the infinitely many surface brightness distributions which are consistent with the measured visibilities. Cornwell (1988) called this method "Nonlinear Mosaicking". It is commonly referred to in the synthesis imaging literature as the Maximum Entropy Method (MEM). A general review of the MEM technique is given by Narayan & Nityananda (1986).

Maximizing the entropy term has the effect of compressing the range of pixel values (relative to the default image) in the solution.
This tends therefore to produce smoother images than CLEAN, and is naturally suited to imaging extended emission. MEM has drawbacks as well: all pixel values I_j in the solution must be positive; and the effect of the entropy term H is typically to move the solution away from solutions which strictly minimize χ² alone. Means of minimizing this bias will be discussed more in § 3.3. One attractive property of MEM is that it provides a straightforward method to provide single dish information to the joint, interferometric deconvolution: the single dish data can be used as the default image M_j (§ 4).

A version of the MEM mosaic deconvolver is implemented in AIPS as the tasks UTESS and
VTESS, and is available in the CASA toolkit.
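The regularizing behavior of the entropy term can be seen directly: in the absence of the χ² constraint, H in Eq. 6 is maximized when the image equals the default image, so it is the data that pull the solution away from M_j. A toy numerical illustration (the function name is ours):

```python
import numpy as np

def entropy(image, default):
    """Entropy H = -sum_j I_j ln(I_j / (M_j e)) of Eq. 6.

    Both arrays must be strictly positive, reflecting the MEM positivity
    constraint on the solution.
    """
    return -np.sum(image * np.log(image / (default * np.e)))

M = np.full(16, 2.0)               # a flat default image
h_eq = entropy(M.copy(), M)        # trial image equal to the default...
h_hi = entropy(2.0 * M, M)         # ...versus images pushed away from it
h_lo = entropy(0.5 * M, M)
# h_eq exceeds both h_hi and h_lo: H alone is maximized at I = M.
```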
3.2.2. Joint Deconvolution of the Mosaicked Dirty Image

Another approach to imaging mosaicked interferometer data takes as its starting point the linear mosaic of the dirty maps from all pointings:

$$I^D_{\rm joint}(x) = W(x)\, \frac{\sum_p A(x - x_p)\, I^D_p(x)/\sigma_p^2}{\sum_p A^2(x - x_p)/\sigma_p^2} \qquad (7)$$

W(x) here is an apodization function used to suppress noise artifacts at the edge of the mosaic. As previously noted this joint dirty map contains all of the information that the visibilities together do, but not in a very convenient form, since it is convolved with a messy point spread function. In fact for an image formed according to Eq. 7, the PSF will typically be spatially variable, since the uv coverage and noise generally change from one interferometer pointing to the next. This approach is often referred to simply as "joint deconvolution".

Given a method of calculating the PSF, this I^D_joint can be deconvolved by the usual means, e.g.
CLEAN or MEM. Cornwell et al. (1993) studied a method employing an approximate PSF evaluated at the center of the mosaic; CASA's tclean task, invoked with gridder='standard', uses a variant of this algorithm. Sault et al. (1996) present an exact expression for the position-dependent PSF of the map in Eq. 7; they also present an image projection designed to minimize w-term distortions. MIRIAD implements this form of joint deconvolution in the tasks
MOSSDI (which uses CLEAN for the deconvolution) and
MOSMEM (which uses MEM instead). Use of MEM for the deconvolution of the joint dirty map (Eq. 7) is very similar to the strategy outlined in § 3.2.1, but with the χ² of the sky model with respect to the data evaluated in the image rather than the visibility domain. This approach to joint deconvolution has been widely used and is a good option for imaging significantly extended objects with an interferometer.

3.2.3. Gridding the Pointings onto a Common uv-Plane

A more modern approach to imaging mosaicked interferometer data is to reference all pointings' data to a single phase center and grid them onto a common uv-plane. This results in a single PSF with improved uv-coverage for the whole mosaic and an optimal-sensitivity weighting of the data. This approach is described in detail in Myers et al. (2003), but a brief outline is as follows. First, we can write the measurement equation for a given visibility u_k = (u_k, v_k), for a mosaic pointing at x_p = (ℓ_p, m_p), and with an explicit phase tracking center x_φ,p, as:

$$V_p({\bf u}_k) = \int\!\!\int d{\bf x}\, A({\bf x} - {\bf x}_p)\, I({\bf x})\, e^{-2\pi i\, {\bf u}_k \cdot ({\bf x} - {\bf x}_{\phi,p})} \qquad (8)$$
Using the shift theorem this can be re-expressed as an integral in the aperture plane:

$$V_p({\bf u}_k) = e^{2\pi i\, ({\bf u}_k \cdot {\bf x}_{\phi,p})} \int\!\!\int d{\bf u}\, \tilde A({\bf u}_k - {\bf u})\, \tilde I({\bf u})\, e^{2\pi i\, ({\bf u} - {\bf u}_k) \cdot {\bf x}_p}. \qquad (9)$$

This relationship can, in essence, be inverted to obtain a set of estimators Ĩ_j = Ĩ(u_j) of the Fourier transform of the sky brightness on a grid of points u_j. The cell size of the grid u_j is determined by the angular extent of the mosaic on the sky, and the estimators themselves are simply linear combinations of the original visibility data with appropriate phase gradients and normalizations. To the extent that the aperture plane illumination Ã(u_k − u) is correct, the "blurring" effect of the interferometer primary beam will have been removed from the Ĩ_j, which will have a resolution determined by the mosaic size. These visibility estimators can be imaged by standard techniques such as CLEAN.

This approach makes optimal use of the whole range of aperture plane information inherent in the data and provides a natural basis on which to incorporate wide-field effects such as W-projection and A-projection. It is also well-suited to jointly imaging data from a heterogeneous interferometer, i.e., one in which the antenna sizes are not identical. One drawback to this approach is that it relies upon an accurate model of the antenna primary beam. It can also be computationally expensive, though it is faster when there are many pointing centers in the mosaic, such as there are for an OTF mosaic. It is generally the preferred approach for mosaicking extended structures when using CLEAN in CASA. The tclean task uses this algorithm when gridder='mosaic' is set in its invocation.

3.3. Deconvolution Considerations

Even with the benefit of the information provided by the joint mosaic pointings' visibilities, imaging extended emission with an interferometer can be challenging. It is therefore important to understand the characteristics of the deconvolution algorithm used.
Here I summarize some relevant, key properties of the two dominant approaches used by the synthesis imaging community: CLEAN and Maximum Entropy.

Since CLEAN represents the sky as a sum of unresolved signals (or "point sources"), it can take many iterations to construct a model of a significantly extended source. In general, it is advisable to clean interactively; to carefully define the regions used for cleaning so as to contain only ostensibly real and believable signal; and to iteratively inspect and refine these regions through the cleaning process (e.g., after each major cycle). For imaging extended emission, the CLEAN threshold chosen can have a significant effect on the resulting image. For instance, a 3σ threshold will leave a plateau of un-deconvolved signal of approximately this amplitude (with the corresponding "short spacing bowl" around it). This effect can be mitigated by cleaning deeply, e.g., down to a ∼1σ threshold. Such deep cleaning will result in some spurious noise components in the CLEAN model. This is often acceptable since the science image is usually taken to be the sum of the (CLEAN-beam convolved) model and the residuals, and to first order the deep cleaning will simply have re-partitioned the noise between the model and the residuals. It is potentially more problematic if the CLEAN model is to be used for self calibration. Such deep cleaning can also be very time-consuming for spectral line cubes with many channels.

There are two other biases worth bearing in mind when using CLEAN to image extended objects. One is the flux bias resulting from the mismatch between the clean beam and the dirty beam, which can place the model and the residuals on different effective flux scales; note that it too is mitigated by deeper cleaning. Another is the so-called "CLEAN bias" discussed in, e.g., Condon et al. (1998). The CLEAN bias results from the constructive interference of the sidelobes of the synthesized beam.
Particularly for diffuse, extended sources, these sidelobes can then be brighter in the dirty map than the apparent signal in the main lobe of the PSF. This biases the recovered flux densities low, and is mitigated by careful masking and better uv coverage. All of that said, CLEAN is the algorithm most commonly used for synthesis imaging, and when carefully applied will generally yield excellent results if the data are of sufficient quality.

A variety of techniques have been developed to improve the performance of CLEAN for extended sources. These include Multi-Scale CLEAN (Cornwell 2008) and ASP-clean (Bhatnagar & Cornwell 2004), and involve using CLEAN components of varying sizes. These algorithms can improve the stability and convergence speed when imaging extended objects. Multi-scale CLEAN is available within CASA's tclean task by setting deconvolver='multiscale'.

Maximum Entropy Deconvolution (MEM) is naturally well-suited to imaging extended emission. By definition it is a biased estimator of the true sky image, since the effect of including an entropy term is to move the solution away from the minimum of χ² (§3.2.1); for SNR ≫ 1 this bias is small. MEM also provides a natural way to incorporate prior information through its default image, e.g., a single-dish image. The resolution of MEM-reconstructed images is also known to vary with SNR; the impact of this can be reduced by convolving the MEM image with a nominal synthesized beam, much as is done with a CLEAN model. In addition, MEM has great difficulty with point sources embedded in diffuse emission. In this circumstance it is best to remove the point source by other means (such as CLEAN) prior to MEM deconvolution.

Finally, different implementations of these deconvolution algorithms often contain subtle (or not so subtle) differences, and the performance of algorithms can vary between implementations. If the application is demanding (and imaging diffuse emission often is), it can be helpful to consult or collaborate with an expert.
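In CASA, the mosaic gridder (§3) and multi-scale CLEAN are both selected through tclean parameters. The sketch below only assembles an illustrative parameter set; the measurement set name, image geometry, scales, and threshold are placeholders, not recommendations, and should be tuned to the data at hand. Inside CASA it would be passed to tclean directly.

```python
# Illustrative tclean parameter set for jointly imaging a mosaic with
# multi-scale CLEAN in CASA. All values below are placeholders.
tclean_params = dict(
    vis="mosaic.ms",           # hypothetical multi-pointing measurement set
    imagename="mosaic_image",
    imsize=[1024, 1024],
    cell="1.0arcsec",
    gridder="mosaic",          # joint aperture-plane gridding of all pointings
    deconvolver="multiscale",
    scales=[0, 5, 15],         # pixels: point sources plus two extended scales
    niter=10000,
    threshold="1mJy",          # clean deeply, but inspect masks interactively
    interactive=True,
    pbcor=True,                # primary-beam-correct the output image
)
# In CASA:  from casatasks import tclean;  tclean(**tclean_params)
```

Interactive masking, as recommended above, lets the clean regions be refined after each major cycle.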
4. Combining Interferometric and Single-Dish Data
Often when the source of interest is comparable in size to the interferometer primary beam, the best course of action is to obtain single-dish data and to combine the two datasets. A variety of techniques have been developed to do this. An excellent review and comparison of techniques used to combine interferometric and single dish data is given by Stanimirovic (2002). Here (because it is straightforward, widely used, and robust) I will focus mainly on a technique to linearly combine the single dish and interferometer data in uv space, known as "feathering". Cotton (2017) discusses the technique in detail.

To understand feathering, it is useful first to consider the nature of the low spatial frequency components in an interferometer map which has been deconvolved using CLEAN. Although the interferometer does not accurately measure these spatially large signals, there is not usually an explicit constraint on the total flux density in the CLEAN model: for instance, it is not generally zero. Therefore CLEAN will extrapolate to spatial frequencies smaller than the smallest measured. Often this extrapolation will be very noisy. It is desirable to down-weight this low-quality information where better quality (single dish) information is available.

To feather single dish and interferometric maps together, the individual datasets are first deconvolved separately to remove the effects of their respective point spread functions. Each deconvolved map is then Fourier transformed, and the interferometer map is high-pass filtered (with a characteristic scale determined by the diameter D_SD of the single dish telescope) to eliminate the poorly determined large-scale information in it. The maps are then added together and inverse Fourier transformed to obtain a single image with information from both datasets. In more detail, the sequence of steps typically required is as follows:

1. Prepare input images.

(a) Deconvolve the synthesized beam from the interferometric dirty map, e.g. using CLEAN. The resulting model is usually then re-convolved with a nominal "restoring" beam (of FWHM θ_B,int) to obtain the deconvolved interferometric map I_int in units of Janskys per (restoring) beam. This nominal beam is almost always taken to be a Gaussian.

(b) Deconvolve the antenna beam from the single dish map and re-convolve it with a nominal restoring beam (θ_B,SD) to obtain the deconvolved single dish map I_SD in Janskys per (restoring) beam. Since the single dish data in principle sample all spatial frequencies out to the maximum (D_SD/λ), simple linear methods can sometimes be used instead of the non-linear methods required by the interferometric imaging problem. If the single-dish beam is simple enough to be well-approximated by a Gaussian, this step can be omitted. If there is considerable overlap between the interferometer and the single dish in uv-space, it may be desirable to use a restoring beam larger than the antenna primary beam to give relatively greater weight to the interferometer data at the higher spatial frequencies.

(c) Various implementation-dependent clerical steps may also be required. For example: ensuring the map sizes, registrations, and pixellizations are appropriate; correcting for the primary beams; re-ordering axes; padding and apodizing maps as needed to avoid edge effects; and putting restoring beam information in image headers.

2. Place the deconvolved maps on a common surface brightness scale. This can be done, for example, by multiplying I_SD by (θ_B,int/θ_B,SD)², or by converting both maps into instrument-independent surface brightness units such as Janskys per steradian.

3. Fourier transform the maps to obtain Ĩ_SD(u,v) and Ĩ_int(u,v).

(a) It is often useful at this stage to correct any (hopefully slight) errors in the relative calibration of the single-dish and interferometric maps, e.g., due to the flux scale assumed. This can be done by deriving a scale factor f_cal from comparing the Fourier transforms of I_SD and I_int in the range of uv values well-measured by each, properly accounting for the restoring beam of each.

(b) "Feather" the images together in the Fourier domain using a taper function T(u,v) = 1 − B̃_SD as

Ĩ_combined(u,v) = Ĩ_SD(u,v) + Ĩ_int(u,v) × (1 − B̃_SD),   (10)

where B̃_SD is the Fourier transform of the single dish restoring beam, normalized to have a peak of one. An example interferometer re-weighting function suitable for combining GBT and VLA C-configuration data is shown in Figure 5. The taper function has the effect of emphasizing each dataset where it provides the best information, while minimizing excessive noise that would be introduced by over-weighting data in poorly measured regions of the uv plane.

4. Inverse Fourier transform to obtain the image I_combined, containing information from both the single dish and the interferometer.

This algorithm is implemented in CASA as feather, in AIPS as IMERG, and in MIRIAD as
IMMERGE. These implementations differ in the details of how the restoring beams and feathering weights are defined and handled, but typically take the deconvolved interferometer and single dish maps and their beams θ_SD, θ_int as input and perform steps 2-4 of the above. They also assume that the restoring beams are Gaussian. In particular, as described in step 1(b) above, if the single-dish beam significantly deviates from a Gaussian shape (e.g., due to sidelobes or ellipticity) then its effects should be removed from the map by deconvolution (see, e.g., Weiß et al. 2001). Two of many examples of the application of the feathering technique in the literature are Vogel et al. (1984) and Dubner et al. (1998).

For feathering to work well there should be a region of the uv plane measured by both the interferometer and the single dish. A commonly adopted criterion is for the diameter of the single dish D_SD to be at least 1.5× the minimum baseline b_min in the interferometric array. By this criterion, for instance, the 100-meter GBT is well-suited to provide short-spacing data for EVLA C- or D-configuration data. This criterion also implies that an interferometric array with antennas of a single diameter cannot even in principle provide high-quality total power data for itself by measuring auto-correlations or outfitting some of the antennas with total power receivers.

Another approach to combining single dish and interferometric data is to do so before or during the deconvolution. This has the advantage of providing considerably more information to the nonlinear interferometric deconvolution. One method, mentioned previously in §3.2.1, is to use the single-dish data as the "default image" in a Maximum Entropy deconvolution of the interferometric image. Another method is to directly form a linear combination of the single-dish and interferometer "dirty" images, pre-deconvolution.
In this case the effective beam is the corresponding linear combination of the respective single-dish and interferometer point spread functions; the image can then be deconvolved by the usual methods. These techniques are discussed and compared in greater detail in Stanimirovic (2002), beautifully applied to 21-cm mosaic maps of the Small Magellanic Cloud from Parkes and the ATCA. Two recently developed, alternative approaches which fold single dish data into the deconvolution process are the so-called TP2VIS (Koda et al. 2011, 2019) and SDINT (Rau et al. 2019) methods. One attractive feature of these methods is that they naturally accommodate deconvolution of the single-dish PSF (primary beam). A word of caution: the sdfactor keyword of CASA's feather task is not a weight; it is a direct scaling factor applied to the single dish data. As such, values other than the default (1.0) should be set with caution, and with a clear and quantitative physical motivation.
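The Fourier-domain combination of equation (10) can be sketched in a few lines of numpy. This is a bare-bones illustration, not a replacement for CASA's feather: it assumes both maps are already on the same pixel grid and in common surface-brightness units (step 2), and that the single-dish restoring beam is Gaussian. In the idealized limit of an interferometer map containing all spatial frequencies, the combination returns the true sky exactly.

```python
import numpy as np

def gaussian_beam_ft(n, fwhm_pix):
    """Peak-normalized Fourier transform of a Gaussian beam of the given
    FWHM (in pixels), evaluated on the n x n FFT frequency grid."""
    f = np.fft.fftfreq(n)
    fx, fy = np.meshgrid(f, f)
    sigma = fwhm_pix / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-2.0 * np.pi**2 * sigma**2 * (fx**2 + fy**2))

def feather(sd_map, int_map, sd_fwhm_pix):
    """Feather a single-dish and an interferometer map (same grid, same
    surface-brightness units): the interferometer transform is tapered by
    T = 1 - B_sd, so the low spatial frequencies come from the single dish."""
    b_sd = gaussian_beam_ft(sd_map.shape[0], sd_fwhm_pix)
    combined = np.fft.fft2(sd_map) + np.fft.fft2(int_map) * (1.0 - b_sd)
    return np.fft.ifft2(combined).real

# Idealized check: if the interferometer map were perfect (all spatial
# frequencies present), feathering it with the beam-smoothed single-dish
# map returns the truth, since B*I + I*(1 - B) = I in the Fourier domain.
n, fwhm = 128, 10.0
rng = np.random.default_rng(0)
truth = rng.random((n, n))
sd = np.fft.ifft2(np.fft.fft2(truth) * gaussian_beam_ft(n, fwhm)).real
recovered = feather(sd, truth, fwhm)
```

Note that the taper is zero at (u, v) = (0, 0), so the total flux of the feathered image is inherited entirely from the single dish map.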
Figure 5. Relative aperture-plane sensitivities of the GBT and of the C-configuration of the VLA, and an appropriate feathering weight which could be used to combine the two.
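A re-weighting function like that of Figure 5 is straightforward to compute for an assumed Gaussian single-dish beam. The sketch below uses an approximate GBT beam at 21 cm (the FWHM coefficient of 1.2 λ/D is an assumption for illustration) and locates the baseline length at which the weight is shared equally between the single dish and the interferometer; it lands near, and somewhat below, the shortest VLA C-configuration baselines.

```python
import numpy as np

wavelength = 0.21    # [m]; e.g., HI at 21 cm (illustrative)
D_sd = 100.0         # GBT diameter [m]
fwhm_beam = 1.2 * wavelength / D_sd                 # assumed beam FWHM [rad]
sigma_x = fwhm_beam / (2.0 * np.sqrt(2.0 * np.log(2.0)))

u = np.linspace(0.0, 2000.0, 4001)                  # baseline length [wavelengths]
B_sd = np.exp(-2.0 * np.pi**2 * sigma_x**2 * u**2)  # FT of the Gaussian beam
w_int = 1.0 - B_sd                                  # interferometer taper, eq. (10)

# baseline at which the weight is split evenly between the two datasets
u_half = u[np.argmin(np.abs(w_int - 0.5))]
u_half_m = u_half * wavelength                      # in meters; roughly 37 m here
```

The analytic crossover is u_half = 2 ln 2 / (π θ_B,SD); for larger dishes it moves to longer baselines, extending the usable overlap region of the uv plane.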
The relative weights of the single dish and interferometer data will strongly affect the characteristics of the combined image. These can be set or adjusted manually, but the best quality images are obtained when the relative weights are determined by the intrinsic noise properties of each dataset. Therefore it is important that the sensitivities of the single dish and interferometric maps or cubes are well matched. Two useful criteria are to match the single dish and interferometer sensitivities in overlapping regions of the aperture plane (Kurono et al. 2009; Koda et al. 2011; Mason & Brogan 2014) and to obtain single dish data which result in an overall distribution of uv-plane weights which is smooth and approximately Gaussian (Rodríguez-Fernández et al. 2008). When combining with data from modern interferometer arrays, the implied requirement on the single dish map sensitivity is sufficiently stringent that large-aperture single dishes and/or focal plane arrays are often advantageous.

An illustration of the improvement that can be made by adding single-dish data to an interferometer map is shown in Figure 6. This is a simulated observation of a nearby galaxy of arcminute scale, observed with ALMA (12-m primary beam: 19″ FWHM); 67 pointings of the 12-m ALMA array on a hexagonal lattice are used to cover the field of interest, together with 23 pointings of the 7-m array. This simulation and imaging exercise uses sample data and scripts available at http://casaguides.nrao.edu/.
Figure 6. Left: Simulated ALMA 12-m mosaic image of a nearby galaxy of arcminute scale (ALMA 12-m primary beam: 19″ FWHM). Right: the same after adding 12-m total power data using the feather algorithm.
5. Practical Issues
Following is a brief summary of some of the practical effects that need to be considered when planning and analyzing mosaicked interferometric observations and single dish observations intended to support them.

• Choose an appropriate mosaic sampling strategy (§2.3). For high-fidelity imaging of extended emission, a hexagonal mosaic with pointing centers spaced by λ/(√3 D) is preferable. Faster sky coverage can be achieved with sparser coverage, although this will not perform as well for retrieving short spacings, and will require more accurate knowledge of the primary beam.

• Many effects can cause the interferometer imaging characteristics to vary from one mosaic pointing to another. This will tend to degrade the resulting image quality, depending on the reconstruction approach adopted. For instance, the uv coverage will change due to the Earth's rotation and to flagging; the system temperature (and hence the noise) can vary; at short wavelengths the tropospheric phase can vary; and at long wavelengths ionospheric effects will vary. Multiple coverages of the mosaic will tend to average such variations out, improving image quality, at the price of lower observing efficiency.

• Minimizing antenna pointing errors is more important for interferometric mosaics than for single pointings. In the single-pointing case, the strongest emission is typically in the center of the beam, which is relatively flat; in the case of a mosaic, there is generally emission over the entire field, including areas where the primary beam has a steep gradient.

• If single dish data is required, it is desirable that the dish diameter be at least 1.5× the minimum baseline in the interferometer, and preferably more. This is required to provide adequate uv coverage and to be able to accurately link the calibration of the two instruments.

• Single dish pointing errors can introduce spurious high spatial frequency structure into the map. These can often be alleviated by lightly smoothing.
• The single dish antenna may have a non-trivial beam (e.g., due to an error beam, shadowing of the primary, etc.) requiring deconvolution.

• Depending on the calibration and imaging algorithms used, the single dish map may have accurate information over only a limited range of spatial frequencies, not all the way down to (u,v) = 0. This is particularly true for continuum data, which by definition lack a spectral dimension to help distinguish systematic effects from astronomical signal.

• Ensure that the single dish map is sufficiently sensitive to provide useful information, and has a guard band of at least a few single-dish beams around the source.

Acknowledgments.
The National Radio Astronomy Observatory is a facility of the National Science Foundation operated by Associated Universities, Inc. This text is based on lectures given by the author at the 14th, 15th, and 16th NRAO Synthesis Imaging Summer Schools, and I thank the organizing committees for doing the hard work of putting together and running these workshops. I am also grateful to those who covered this topic at previous summer schools, especially Jurgen Ott and Steve Myers. I thank Urvashi Rau and Crystal Brogan for helpful discussions, and Jeff Mangum and Adele Plunkett for comments on this manuscript.
References
Bhatnagar, S., & Cornwell, T. J. 2004, A&A, 426, 747. astro-ph/0407225
Condon, J. J., Cotton, W. D., Greisen, E. W., Yin, Q. F., Perley, R. A., Taylor, G. B., & Broderick, J. J. 1998, AJ, 115, 1693
Cornwell, T., Braun, R., & Briggs, D. S. 1999, in Synthesis Imaging in Radio Astronomy II, edited by G. B. Taylor, C. L. Carilli, & R. A. Perley, vol. 180 of Astronomical Society of the Pacific Conference Series, 151
Cornwell, T. J. 1988, A&A, 202, 316
— 2008, IEEE Journal of Selected Topics in Signal Processing, 2, 793
Cornwell, T. J., Holdaway, M. A., & Uson, J. M. 1993, A&A, 271, 697
Cotton, W. D. 2017, PASP, 129, 094501
Dubner, G. M., Holdaway, M., Goss, W. M., & Mirabel, I. F. 1998, AJ, 116, 1842
Ekers, R. D., & Rots, A. H. 1979, in IAU Colloq. 49: Image Formation from Coherence Functions in Astronomy, edited by C. van Schooneveld, vol. 76 of Astrophysics and Space Science Library, 61
Hunter, T. R., & Napier, P. J. 2016, arXiv e-prints, arXiv:1609.09376
Jorsater, S., & van Moorsel, G. A. 1995, AJ, 110, 2037
Koda, J., Sawada, T., Wright, M. C. H., Teuben, P., Corder, S. A., Patience, J., Scoville, N., Donovan Meyer, J., & Egusa, F. 2011, ApJS, 193, 19
Koda, J., Teuben, P., Sawada, T., Plunkett, A., & Fomalont, E. 2019, PASP, 131, 054505
Kurono, Y., Morita, K.-I., & Kamazaki, T. 2009, PASJ, 61, 873
Lacy, M., Baum, S. A., Chandler, C. J., Chatterjee, S., Clarke, T. E., Deustua, S., English, J., Farnes, J., Gaensler, B. M., Gugliucci, N., Hallinan, G., Kent, B. R., Kimball, A., Law, C. J., Lazio, T. J. W., Marvil, J., Mao, S. A., Medlin, D., Mooley, K., Murphy, E. J., Myers, S., Osten, R., Richards, G. T., Rosolowsky, E., Rudnick, L., Schinzel, F., Sivakoff, G. R., Sjouwerman, L. O., Taylor, R., White, R. L., Wrobel, J., Andernach, H., Beasley, A. J., Berger, E., Bhatnagar, S., Birkinshaw, M., Bower, G. C., Brandt, W. N., Brown, S., Burke-Spolaor, S., Butler, B. J., Comerford, J., Demorest, P. B., Fu, H., Giacintucci, S., Golap, K., Güth, T., Hales, C. A., Hiriart, R., Hodge, J., Horesh, A., Ivezić, Ž., Jarvis, M. J., Kamble, A., Kassim, N., Liu, X., Loinard, L., Lyons, D. K., Masters, J., Mezcua, M., Moellenbrock, G. A., Mroczkowski, T., Nyland, K., O'Dea, C. P., O'Sullivan, S. P., Peters, W. M., Radford, K., Rao, U., Robnett, J., Salcido, J., Shen, Y., Sobotka, A., Witz, S., Vaccari, M., van Weeren, R. J., Vargas, A., Williams, P. K. G., & Yoon, I. 2020, PASP, 132, 035001
Mangum, J. G., Emerson, D. T., & Greisen, E. W. 2007, A&A, 474, 679
Marvil, J. 2018, Online Proceedings of the 16th NRAO Synthesis Imaging Workshop. Accessed: 2020-06-02, URL https://science.nrao.edu/science/meetings/2018/16th-synthesis-imaging-workshop/16th-synthesis-imaging-workshop-lectures
Mason, B., & Brogan, C. 2014, ALMA Memo 598
Mioduszewski, A. 2014, Online Proceedings of the 14th NRAO Synthesis Imaging Workshop. Accessed: 2020-06-02, URL https://science.nrao.edu/science/meetings/2014/14th-synthesis-imaging-workshop/lectures
Murphy, T., Sadler, E. M., Ekers, R. D., Massardi, M., Hancock, P. J., Mahony, E., Ricci, R., Burke-Spolaor, S., Calabretta, M., Chhetri, R., de Zotti, G., Edwards, P. G., Ekers, J. A., Jackson, C. A., Kesteven, M. J., Lindley, E., Newton-McGee, K., Phillips, C., Roberts, P., Sault, R. J., Staveley-Smith, L., Subrahmanyan, R., Walker, M. A., & Wilson, W. E. 2010, MNRAS, 402, 2403
Myers, S. T., Contaldi, C. R., Bond, J. R., Pen, U.-L., Pogosyan, D., Prunet, S., Sievers, J. L., Mason, B. S., Pearson, T. J., Readhead, A. C. S., & Shepherd, M. C. 2003, ApJ, 591, 575. astro-ph/0205385
Napier, P. 1999, in Synthesis Imaging in Radio Astronomy II, edited by G. B. Taylor, C. L. Carilli, & R. A. Perley, vol. 180 of Astronomical Society of the Pacific Conference Series, 37
Narayan, R., & Nityananda, R. 1986, ARA&A, 24, 127
Petersen, D. P., & Middleton, D. 1962, Information and Control, 5, 279
Rau, U., Naik, N., & Braun, T. 2019, AJ, 158, 3
Rodríguez-Fernández, N., Pety, J., & Gueth, F. 2008, IRAM technical memo 2008-2
Sault, R. J., Staveley-Smith, L., & Brouw, W. N. 1996, A&AS, 120, 375
Stanimirovic, S. 2002, in Single-Dish Radio Astronomy: Techniques and Applications, edited by S. Stanimirovic, D. Altschuler, P. Goldsmith, & C. Salter, vol. 278 of Astronomical Society of the Pacific Conference Series, 375. astro-ph/0205329
Taylor, G. B., Carilli, C. L., & Perley, R. A. (eds.) 1999, Synthesis Imaging in Radio Astronomy II, vol. 180 of Astronomical Society of the Pacific Conference Series
Thompson, A. R., Moran, J. M., & Swenson, G. W., Jr. 2017, Interferometry and Synthesis in Radio Astronomy, 3rd Edition
Vogel, S. N., Wright, M. C. H., Plambeck, R. L., & Welch, W. J. 1984, ApJ, 283, 655
Walter, F., Brinks, E., de Blok, W. J. G., Bigiel, F., Kennicutt, R. C., Jr., Thornley, M. D., & Leroy, A. 2008, AJ, 136, 2563