Estimating sizes of faint, distant galaxies in the submillimetre regime

Abstract

We measure the sizes of redshift ~2 star-forming galaxies by stacking data from the Atacama Large Millimeter/submillimeter Array (ALMA). We use a uv-stacking algorithm in combination with model fitting in the uv-domain and show that this allows for robust measurements of the sizes of marginally resolved sources. The analysis is primarily based on the 344 GHz ALMA continuum observations centred on 88 sub-millimeter galaxies in the LABOCA ECDFS Submillimeter Survey (ALESS). We study several samples of galaxies at z~2 with M∗ ~ 5×10^10 M⊙, selected using near-infrared photometry (distant red galaxies, extremely red objects, sBzK-galaxies, and galaxies selected on photometric redshift). We find that the typical sizes of these galaxies are ~0.6 arcsec, which corresponds to ~5 kpc at z~2; this agrees well with the median sizes measured in the near-infrared z-band (~0.6 arcsec). We find errors on our size estimates of ~0.1-0.2 arcsec, which agree well with the expected errors for model fitting at the given signal-to-noise ratio. With the uv-coverage of our observations (18-160 m), the size and flux density measurements are sensitive to scales out to 2 arcsec. We compare this to a simulated ALMA Cycle 3 dataset with intermediate length baseline coverage, and we find that, using only these baselines, the measured stacked flux density would be an order of magnitude fainter. This highlights the importance of short baselines to recover the full flux density of high-redshift galaxies.


Mon. Not. R. Astron. Soc., 1–11 (2016). Printed October 9, 2018 (MN LaTeX style file v2.2)


L. Lindroos, K. K. Knudsen, L. Fan, J. Conway, K. Coppin, R. Decarli, G. Drouart, J. A. Hodge, A. Karim, J. M. Simpson, and J. Wardlow

Department of Earth and Space Sciences, Chalmers University of Technology, Onsala Space Observatory, SE-439 92 Onsala, Sweden
Shandong Provincial Key Lab of Optical Astronomy and Solar-Terrestrial Environment, Institute of Space Science, Shandong University, Weihai, 264209, China
Centre for Astrophysics Research, University of Hertfordshire, College Lane, Hatfield, AL10 9AB, UK
Max-Planck-Institut für Astronomie, Königstuhl 17, D-69117 Heidelberg, Germany
International Centre for Radio Astronomy Research, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
Leiden Observatory, Leiden University, P.O. Box 9513, 2300 RA Leiden, The Netherlands
Argelander-Institut für Astronomie, Universität Bonn, Auf dem Hügel 71, D-53121 Bonn, Germany
Centre for Extragalactic Astronomy, Department of Physics, Durham University, South Road, Durham DH1 3LE, UK
Dark Cosmology Centre, Niels Bohr Institute, University of Copenhagen, DK-2100 Copenhagen, Denmark

Accepted 2016 July 05. Received 2016 June 25; in original form 2016 December 21

Key words: techniques: interferometric – galaxies: high-redshift – galaxies: structure – sub-millimetre: galaxies

The star-formation rate density in the universe peaks at z ∼ 2 (e.g. Madau & Dickinson 2014), making this a very important epoch in the formation of galaxies. For galaxies at these redshifts, submillimeter (sub-mm) emission is a commonly used tracer of star formation (e.g. Daddi et al. 2010b), often used in combination with ultraviolet and optical measurements to allow reliable star-formation rate (SFR) estimates for galaxies with very different dust properties (e.g. Tacconi et al. 2013; da Cunha et al. 2015). The Atacama Large Millimeter/submillimeter Array (ALMA) and the IRAM NOrthern Extended Millimeter Array (NOEMA) are currently producing a wealth of data at frequencies of […]−[…] GHz, allowing us to measure the sub-mm emission from high-redshift galaxies previously too faint to study. Observing at these frequencies is efficient for high redshifts, as the flux density for galaxies at a given SFR is expected to be almost constant for redshift z ∼ […]−[…] due to the negative K-correction (e.g. Blain et al. 2002; Casey et al. 2014).

Current observations with ALMA and NOEMA primarily focus on galaxies with high SFR, > 100 M⊙ yr⁻¹; however, these galaxies constitute a small fraction of the total star formation (e.g. Bouwens et al. 2011; Rodighiero et al. 2011). It is possible to study single sources from much fainter galaxy populations, e.g., with 50 ALMA antennas and ∼[…] hour integration we can reach a depth of 20 µJy/beam at 345 GHz, which corresponds to a 1σ uncertainty of ∼[…] M⊙ yr⁻¹ at z = 2. However, obtaining large samples of galaxies for statistical studies is very expensive. An alternative approach is to study galaxies that are amplified by gravitational lensing. By using lensing it is possible to detect very faint sources with shorter observations, e.g., Watson et al. (2015) detected a z ∼ […] galaxy with a SFR of 9 M⊙ yr⁻¹ and a flux density of 0.61 mJy at 220 GHz, which would require only a ∼30 s integration for a 5σ detection with 50 ALMA antennas. However, it can be difficult to obtain large samples of such galaxies as high magnifications are rare. A third approach is stacking, which uses shallower surveys to study statistical properties of large samples of galaxies which have previously been detected at other wavelengths. Stacking is a common technique used across many different wavelengths: γ-rays (e.g. Aleksić et al. 2011), X-rays (Chaudhary et al. 2012; George et al. 2012), optical/near infrared (Zibetti et al. 2007; Matsuda et al. 2012; González et al. 2012), mid/far infrared (e.g. Dole et al. 2006), and radio (Boyle et al. 2007; Ivison et al. 2007; Hodge et al. 2008, 2009; Dunne et al. 2009; Karim et al. 2011).

Looking specifically at sub-mm emission, stacking has been applied to data from the James Clerk Maxwell Telescope (JCMT) and the Atacama Pathfinder EXperiment (APEX), using several different samples of high-redshift galaxies (e.g. Webb et al. 2004; Knudsen et al. 2005; Greve et al. 2010). Compared to these surveys, ALMA can achieve sub-arcsecond resolution, which is orders of magnitude better than the […]″ and […]″ at […] GHz of APEX and JCMT respectively. Firstly, this allows us to measure the flux density of the sources without being affected by confusion, which is believed to impact the result of stacking at JCMT and APEX resolutions (e.g. Webb et al. 2004). Secondly, we can study the structure of our stacked source. Several studies have found that star-forming galaxies at redshifts of z ∼ […] have large sizes, e.g. Daddi et al. (2010b) found sizes up to 1.[…]″ […] z ∼ […] galaxies.

Decarli et al. (2014) used stacking to measure the sub-mm flux density of star-forming galaxies in the Extended Chandra Deep Field South (ECDFS) with data from ALMA. In this paper we build on the work by Decarli et al. (2014), using the same data, but extending the analysis to focus on the sizes of the stacked sources.
Decarli et al. (2014) performed stacking on the imaged pointings, analogous to how stacking is done at other wavelengths. However, as seen in Lindroos et al. (2015), this may not be ideal for interferometric data. In this paper we instead adopt the uv-stacking approach described in Lindroos et al. (2015), which performs the stacking directly on the visibility data. When using image stacking in mosaiced data sets, it is necessary to combine data from pointings imaged with different restoring beams. Because of this, it is very difficult to deconvolve the source structure from the beam in the final stacked image. Using uv-stacking, we combine the data in the uv-domain, and the beam can be directly calculated from the new uv-coverage. Therefore, using the uv-stacking algorithm is especially important for measuring the sizes of the stacked sources.

While the work in this paper is primarily focused on stacking high-redshift galaxies, the stacking techniques applied are quite general. Many of the lessons learned apply to any ALMA stacking of marginally extended sources.

The paper is structured as follows. In §2, we describe the ALMA data we use and in §3 we describe the sample, as well as the photometric near-infrared and optical catalogue. In §4 we describe a set of simulations performed to test various aspects of the stacking result and in §5 we describe our uv-stacking procedure. In §6, we summarise our results, including the typical galaxy sizes for each sample. Finally, in §7 we discuss the implications of the results both for star formation at z ∼ 2, and for general stacking of ALMA data.

In this paper we use a standard cosmology with H₀ = 67.[…] km s⁻¹ Mpc⁻¹, ΩΛ = 0.[…], and Ωm = 0.[…] (Planck Collaboration et al. 2014). All magnitudes are in AB (Oke 1974) unless otherwise specified.

Our analysis is based on data from the ALMA survey of the submillimetre galaxies (SMGs) detected in LESS (ALESS, Hodge et al. 2013), where LESS is the LABOCA ECDFS Submm Survey, LABOCA is the Large Apex BOlometer CAmera mounted on APEX, and ECDFS is the Extended Chandra Deep Field South. The ALESS survey is composed of 122 pointings across the ECDFS, centred on 122 SMGs, observed during ALMA Cycle 0 between October and November 2011. The observations are tuned to a frequency of 344 GHz and have a typical resolution around 1.[…]″ × […].2″. The median value of the noise (standard deviation) in the centre of each pointing is ∼[…] mJy/beam. All pointings with central noise > […] mJy/beam or beam axis ratio > […] are excluded from the analysis; see Hodge et al. (2013) for more details. As such our data consist of 88 "good quality" pointings, each with a field of view (full width at half power of the ALMA primary beam) of 17.[…]″.

In this paper we extend the analysis of Decarli et al. (2014), using the same sample selection. The selection is based on the photometric catalogue of the ECDFS assembled using the same procedure as Simpson et al. (2014), using primarily data from the Wide MUlti-wavelength Survey by Yale-Chile (MUSYC; Taylor et al. 2009). The MUSYC catalogue is a K-band flux-limited sample, covering a […]′ × […]′ area of the ECDFS, with photometry for the sources in the bands UBVRIzJHK. At K_AB = 22 mag the sample is 100 per cent complete for point sources, and 96 per cent complete for extended sources with a scale radius of 0.[…]″. […] J and K band catalogue Zibetti et al. (in preparation), the Taiwan ECDFS NIR survey (Hsieh et al. 2012), and Spitzer/IRAC 3.6, 4.5, 5.8, and 8.0 µm images from the Spitzer IRAC/MUSYC Public Legacy Survey (Damen et al. […]).

Figure 1. Distribution of stellar masses for each sample. The stellar masses are estimated by SED fitting to optical and near-infrared bands using PEGASE 2 (Fioc & Rocca-Volmerange 1997).

[…] K_Vega < 20, z > […], and further limit the samples as follows:

(i) All sources with (z − K − […]) > […] (B − z + 0.[…]) − […], which separates the galaxies from the stars (Daddi et al. 2004). This sample was referred to as the K_Vega < 20 sample in Decarli et al. (2014) and will be referred to as the K20 sample in this paper.
(ii) Actively star-forming galaxies selected using the sBzK criteria by Daddi et al. (2004), i.e., (z − K − […]) − (B − z + 0.[…]) > −[…].
(iii) Distant Red Galaxies (DRGs) selected using J − K > […] (Franx et al. 2003).
(iv) Extremely Red Objects (EROs) selected using (R − K) > […] and (J − K) > […] (Elston, Rieke & Rieke 1988).

This results in our samples being the same as the z > […] samples in Decarli et al. (2014).

Using the MUSYC photometry we also estimate the stellar mass (M∗) of our selected galaxies. The stellar-mass estimates are done using PEGASE 2 (Fioc & Rocca-Volmerange 1997). For each galaxy we use all available bands, i.e., U, B, V, R, I, z, J, H, and K. Using four different galaxy templates (elliptical, spiral Sa, spiral Sd, and starburst), all assuming a Kroupa IMF, we fit for stellar mass. The redshift is not fitted directly; instead we use the photometric estimates from Taylor et al. (2009). For each source we choose the model with the lowest χ², with more than 90 per cent of sources best fitted by the elliptical or the starburst model. The distributions of stellar masses for these samples are shown in Fig. 1.

The samples are stacked using the uv-stacking algorithm described in Lindroos et al. (2015). The algorithm performs the stacking operation directly on the visibility data. We use model fitting in the uv-domain to estimate the flux densities and sizes of our stacked sources.
For comparison with previous image stacking results we also use a simpler flux density estimate which assumes a point source, where the flux density is estimated using the weighted average of all non-flagged visibilities. We refer to this estimate as the point-source estimate.

Prior to stacking each sample, all bright sources not part of the sample are subtracted from the visibility data.

The modelling and subtraction were performed as follows. The data are imaged and cleaned using the Common Astronomy Software Applications package (CASA) version 4.4. Each pointing is imaged separately with a cell size of 0.[…]″ […]″ of a stacking position. The model is subtracted from the uv-data, to produce a residual data set. To ensure that the visibility weights are accurate after the subtraction, the weights are recalculated using the scatter of the visibilities in each baseline and time bin.

Note that the aim of the bright source subtraction is to remove bright sources that are unrelated to the target stacking sources, not to remove those bright in the target sample. As such, this subtraction is performed separately for each sample. We also note that the bright source subtraction is based on the clean models, which, while not fully removing the sources, is found to be sufficient for stacking; see section 6.2.

The uv-stacking method prescribed in Lindroos et al. (2015) uses a weighted average. We calculate the stacking weights for each position from the primary beam attenuation. Noise variations between pointings are included in the visibility weights, and are thus not included in the stacking weights. To ensure that the visibility weights are accurate, they are recalculated prior to stacking from the scatter of each baseline and integration.

The primary beam attenuation (A_N) is estimated using the ALMA model present in CASA version 4.4, i.e., an Airy pattern with a full width at half maximum (FWHM) of 1.17 λ/D ≈ […]″.
This results in stacking weights calculated as

$$W_k = \left[ A_N(\hat{S}_k) \right]^2, \qquad (1)$$

where $W_k$ and $\hat{S}_k$ are the weight and position of source $k$.

CASA: http://casa.nrao.edu
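As a rough illustration of how such weights act, the sketch below applies them to a list of per-source flux densities rather than to visibilities. It assumes that each weight is the squared primary-beam attenuation and that each apparent flux density is first divided by its attenuation; the function names and the scalar simplification are illustrative and are not taken from the stacking code used in the paper.

```python
def stacking_weights(attenuations):
    """Return one stacking weight per source from its primary-beam attenuation.

    Assumption: the weight is the squared attenuation, W_k = A_N(S_k)**2,
    i.e. inverse-variance weighting after primary-beam correction.
    """
    for a in attenuations:
        if not 0.0 < a <= 1.0:
            raise ValueError("attenuation must lie in (0, 1]")
    return [a * a for a in attenuations]


def stack_fluxes(apparent_fluxes, attenuations):
    """Weighted mean of primary-beam-corrected flux densities.

    Hypothetical scalar stand-in for the visibility-based stacking:
    each apparent flux is corrected by 1/A and averaged with weight A**2.
    """
    weights = stacking_weights(attenuations)
    corrected = [s / a for s, a in zip(apparent_fluxes, attenuations)]
    return sum(w * s for w, s in zip(weights, corrected)) / sum(weights)
```

With this choice a source at the half-power point of the primary beam contributes a quarter of the weight of a source at the pointing centre, which reflects the fourfold increase in noise variance after the primary-beam correction.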

Two different models are used to characterise the stacked sources.

The first is a point source model, defined by

$$V_\mathrm{ps}(u, v) = \Phi \, e^{2\pi i \frac{ul + vm}{\lambda}}, \qquad (2)$$

where $(u, v)$ are the projected baselines, $\lambda$ is the wavelength, $(l, m)$ are the direction cosines relative to the phase centre, and $\Phi$ is the flux density of the source.

The second is a Gaussian model, defined by

$$V(u, v) = e^{-\frac{\pi^2 \Theta^2 (u^2 + v^2)}{4 \ln 2 \, \lambda^2}} \, V_\mathrm{ps}(u, v), \qquad (3)$$

where $V_\mathrm{ps}$ is defined according to Equation 2, and $\Theta$ is the source size (FWHM) in radians.

The models are fitted in the uv-domain to our stacked sources using the least-squares minimizer package Ceres. The model fitting is done to all non-flagged visibilities, and includes the visibility weights in the χ² minimization.

We use two different methods to estimate the uncertainties of our size and flux density estimates: a Monte Carlo method where random sources are inserted into the data and stacked, and a bootstrapping method.

The Monte Carlo simulation for a given sample and model is performed as follows: a set of Monte Carlo sources is generated with the same number of sources as the given sample. The position of each source is randomized, but always within the same pointing as its corresponding actual source. Each source is modelled as the fit for the given model to the stacked sources of the given sample. The set of Monte Carlo sources is introduced into the residual data set for the given sample and stacked using the same procedure as for the actual samples. Finally the flux density and size of the stacked Monte Carlo sources are estimated using the given model. This procedure is repeated 100 times for each sample and model to produce a distribution of estimated Monte Carlo flux densities and sizes.
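The Gaussian model of Equation 3 can be evaluated and fitted with a few lines of code. The sketch below is a simplified stand-in for the Ceres-based fit: for a source at the phase centre it fits a straight line to the logarithm of the amplitude against the squared baseline length, which is exact only for noiseless, positive amplitudes. The function names and the example numbers are assumptions made for illustration.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def gaussian_amplitude(flux, fwhm_arcsec, baseline_m, wavelength_m):
    """Visibility amplitude of a circular Gaussian source at the phase centre."""
    q = baseline_m / wavelength_m      # baseline length in wavelengths
    theta = fwhm_arcsec * ARCSEC       # FWHM in radians
    return flux * math.exp(-(math.pi * theta * q) ** 2 / (4.0 * math.log(2.0)))


def fit_gaussian(baselines_m, amplitudes, weights, wavelength_m):
    """Weighted straight-line fit of ln(amplitude) against q**2.

    Returns (flux, fwhm_arcsec). Requires strictly positive amplitudes.
    This log-linear fit is an illustrative simplification, not the
    Levenberg-Marquardt fit described in the text.
    """
    x = [(b / wavelength_m) ** 2 for b in baselines_m]
    y = [math.log(a) for a in amplitudes]
    sw = sum(weights)
    mx = sum(w * xi for w, xi in zip(weights, x)) / sw
    my = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx                  # equals -pi^2 theta^2 / (4 ln 2)
    flux = math.exp(my - slope * mx)   # intercept at zero baseline
    theta = math.sqrt(max(-slope, 0.0) * 4.0 * math.log(2.0)) / math.pi
    return flux, theta / ARCSEC
```

For example, amplitudes generated with `gaussian_amplitude` at 344 GHz (wavelength of about 0.87 mm) on baselines between 18 m and 160 m can be passed back to `fit_gaussian` to recover the input flux density and FWHM. On real, noisy data a non-linear fit to the complex visibilities, as described above, is the appropriate tool.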
The uncertainties are calculated as the standard deviation of our Monte Carlo estimates.

The bootstrapping method is performed by resampling the galaxies in each sample with replacement, e.g., picking galaxy 1 two times, galaxy 2 one time, and galaxy 4 one time from a sample of 4 galaxies. We stack the new sample, and estimate the flux density and size using model fitting. By studying the distribution of the parameters in different resamples we can measure the influence of noise and underlying sample variance on the result. To fully exhaust all possible resamplings would require $\binom{2N-1}{N}$ resamplings, where N is the number of galaxies in the sample. This is approximately […] for the sBzK sample; however, we can get a good estimate using a much smaller number of resamplings. As such we resample 1000 times for each target sample. The error on each parameter is reported as where the measured cumulative distribution function (CDF) crosses 0.159 and 0.841, equivalent to ±1σ of a Normal distribution. The estimated parameters are also recentred on where the measured CDF crosses 0.5, thereby reducing the influence of outliers on the result.

Ceres (Agarwal, Mierle et al. 2015) uses a Levenberg-Marquardt algorithm (Levenberg 1944) for non-linear least-squares minimization. It supports several different solvers for the linear step. We use the solver based on Cholesky decomposition, which for our data set typically runs 2 times faster compared to a standard QR factorisation. The fit is terminated at the first to occur of: 50 iterations, a parameter change in the last step of less than […], or a relative χ² change of less than […]. All flux densities are constrained to be positive.

We choose to refer to the first method as the Monte Carlo method as this is the same as the Monte Carlo method used in Decarli et al. (2014).
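The resampling step and the percentile-based error bars can be sketched as follows. This is a minimal illustration in which the statistic is an arbitrary function of the resampled values (for example a mean); in the analysis described here the statistic would instead be a full stack followed by a model fit. All function names are hypothetical.

```python
import math
import random


def distinct_resamples(n):
    """Number of distinct bootstrap resamples of n items: C(2n - 1, n)."""
    return math.comb(2 * n - 1, n)


def percentile(sorted_values, fraction):
    """Linearly interpolated quantile of an ascending list, 0 <= fraction <= 1."""
    pos = fraction * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def bootstrap_interval(values, statistic, n_resamples=1000, seed=0):
    """Return the 15.9, 50 and 84.1 per cent points of a bootstrapped statistic.

    Each resample draws len(values) items with replacement.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    return tuple(percentile(estimates, f) for f in (0.159, 0.5, 0.841))
```

For the four-galaxy example in the text, `distinct_resamples(4)` gives 35 possible resamples. The count grows very quickly with sample size, which is why a fixed number of random resamples is drawn instead of enumerating them all.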
It is worth noting, however, that the bootstrap method is also a Monte Carlo method, as we do not fully exhaust all possible resamples; in this work we nevertheless refer to it as bootstrapping.

The bootstrapping described in §4.4 uses resampling of the galaxies to estimate the uncertainties of stacking. Using bootstrapping we can also estimate the uncertainty of the model fitting, by resampling the visibilities of the uv-data. This method is not used for the stacked results as it will not estimate uncertainty from variance within the sample; however, it is powerful for model fitting of individual sources. We will refer to this method as visibility bootstrapping.

The model fitting described in section 4.3 allows us to estimate the total flux densities and typical sizes of our stacked sources. The uv-models used aim to simulate the behaviour of the averages of our samples. They are not based on the underlying morphologies of our samples. However, looking at the data in the uv-domain we can obtain hints on the underlying structures of our sources. We have simulated several possible morphologies for the galaxies of our samples, to test if they produce different signatures in the stacked data, and to be able to compare them with our actual stacked data.

For each simulation we generate a model of fake sources and then simulate an ALMA data set with the following procedure. We take the raw ALESS data set and set all visibilities to zero, then we add the model and noise to the data set. The noise is added using the simulator tool (sm) in CASA, using the default parameters, which produce realistic noise for the ALMA site. After this the visibility weights are recalculated by using the scatter in each baseline and time bin.

This simulated data set is then stacked using the same procedure as for our real data sets (section 4).

Observations of high-z star-forming galaxies at rest-frame wavelengths of ∼200 nm indicate that they are more clumpy compared to their counterparts at lower redshifts (e.g. Im et al. 1999; Förster Schreiber et al. 2009). Based on this we have generated a model where all the sub-mm flux is coming from a few clumps.

For each source in the sample we generate 3 clumps, i.e., 3 point sources. The clumps are scattered uniformly around the source position, with a maximal distance of 0.[…]″ […] pc, resulting in a total flux of 2.1 mJy for each simulated galaxy.

We simulate two different uv-coverages. Firstly, the same as our ALESS observations, with a similar level of noise added using the standard sm parameters. Secondly, an intermediate length baseline array with 36 antennas taken from ALMA Cycle 3: the C36-5 configuration described in the ALMA Cycle 3 technical handbook (https://almascience.eso.org/documents-and-tools/cycle3/alma-technical-handbook), with baselines from 45 m to 1.4 km. The total observation time is scaled down to achieve a similar noise, i.e. 1 h spread evenly over the 122 pointings.

The inner parts of the ECDFS are covered by the GOODS-S survey (Giavalisco et al. 2004), with Hubble Space Telescope (HST) observations in z-band (900 nm) with a point-source sensitivity of 27.4 mag. The wider field of ECDFS is observed in the Galaxy Evolution from Morphology and SEDs survey (GEMS, Rix et al. 2004), with HST data in the F606W and F850LP filters, however at a shallower depth compared to the GOODS-S observations: 2000 s typical integration as compared to 6000 s. At z ∼ 2 the z-band observed corresponds to a rest-frame wavelength of approximately 300 nm, where the emission is dominated by light from intermediate mass stars (Bruzual & Charlot 2003).

In contrast the sub-mm emission observed by ALMA at 344 GHz will primarily trace star-formation surface density (Leroy et al. 2012). We can use this to test whether the star formation follows a significantly different morphology compared to the stellar population. Since we are working with stacking we cannot study individual galaxies; however, we can say something about average properties. As such, we produce a simulated dataset where the star formation has the same surface density as the stellar mass traced by the HST z-band. This simulated dataset can be directly compared with the actual stacked data.

The simulated dataset is produced as follows: we select all sources which are part of the K20 sample and have at least a 5σ detection in either GEMS (band F850LP) or GOODS-S, a total of 32 sources. These galaxies are stacked using the same method as for the other samples. For each source we take the HST image, mask all pixels below 5 times the noise, and scale to the same total flux density as the stacked average for the sample, i.e., 1.4 mJy. These images are then used as input model for a simulation, following the same method as described for the clumpy model.

As part of the stacking process, we re-align the astrometry from our optical catalogue with our ALMA astrometry. From model fitting with a point source we find an offset in declination of approximately 0.3″, with small variations (< […]″) between different samples. We also fit the position using the disk and Gaussian models, finding a variation of ∼0.02″ between the different models. This is consistent with the offset found by Simpson et al. (2014) for the bright galaxies in the same data. Based on this, all stacked datasets were phase rotated with 0.[…]″ […] K-band detections. Of the 100 sources in the K20 sample, 11 galaxies are detected at a peak SNR > […]. For these galaxies we estimate the peak position using a point-source model, and the errors of the fitted positions using visibility bootstrapping (see §4.5). We find that the weighted means of the offsets between the optical positions and submm positions are […]″ ± […]″ in right ascension and […]″ ± […]″ in declination. The errors on the averaged offsets are estimated using bootstrapping, where the 11 galaxies are resampled 1000 times. We model the offsets between the sources as the systematic offset, combined with one random offset for each source between the optical and submm position, plus the error of the position measurement for the submm position. We find that the offset between the submm and measured optical position can be modelled as a circular Gaussian with a FWHM of […] (+0.[…], −[…]). Again the errors are estimated based on bootstrapping, where the 11 galaxies are resampled 1000 times.

To ensure robustness of our new results based on uv-stacking, we perform several tests on the stacked data and method. By inserting and stacking point sources in the ALESS data, using the method described in section 4.4, we evaluate biases in the stacking result. We find that the flux density agrees with the expected values, except for the very shortest baselines, where the flux density is approximately 20 per cent too high. The results for the sBzK sample are shown in Fig. 2; the other samples show very similar structure in the uv-plane. Lindroos et al. (2015) found similar biases on the shortest baselines for simulated datasets. In Lindroos et al. (2015) this could be shown to be due to nearby bright sources which were not fully subtracted.
This is consistent with our data, as the bright source subtraction is based on the clean models, which may not fully subtract the sources. Based on this we flag all baselines shorter than […] for the sBzK and DRG samples (∼[…] m) and […] for the K20 and ERO samples (∼[…] m). […]

The stacked data are well fitted by a Gaussian, as shown in Fig. 3, with a flux density of […] ± […] mJy and a size of […]″ ± […]″. This agrees well with Simpson et al. (2015), which found typical sizes (FWHM) of SMGs between 0.[…]″ and […]″.

Figure 2. Stacked flux densities for a simulated dataset, produced by inserting point sources into the ALESS data. The flux densities are averaged over 100 simulated datasets to accurately estimate systematic biases. The noise is estimated as the standard deviation between the different simulations. The red line indicates the expected flux density for the stacked point sources. The shortest baseline is higher than the expected flux density due to contributions from residuals of bright sources; see Lindroos et al. (2015) for more discussion of such effects.

Figure 3.

Flux densities for the stacked visibilities of the SMGsample. The visibilities are binned by baseline length. The redline indicates a Gaussian ﬁt. The errors are estimated from thestandard deviation of the real part of the visibilities within eachbin. The horizontal error is estimated from the standard deviationof the uv -distance within each bin. Fig. 4 shows ﬂux density as a function of baseline length foreach sample. In the plot is shown the ﬁt from the Gaussianmodel, with two free parameters: the total ﬂux density andthe FWHM size.The typical sources in all of our samples are found to beextended, with stacked sizes between . (cid:48)(cid:48) and . (cid:48)(cid:48) (see Ta- Figure 4.

Stacked visibilities for each sample binned by baselinelength. The errors are estimated from the standard deviation forthe real part of the visibilities within each bin. The horizontalerror is estimated from the standard deviation of the uv -distancewithin each bin. The lines show uv -models that are ﬁtted to thefull uv -data. The blue dash-dotted line is a Gaussian, the solidgreen line is a Gaussian plus a point source, and the black dashedline is a disk plus a point source. Note that no Gaussian model isvisible for the DRG sample, as it is identical to the Gaussian +point source model for this sample. ble 1). The measured stacked sizes are broadened by randomoﬀsets between the measured K -band positions and submmpositions. Accounting for this eﬀect, we ﬁnd deconvolvedsizes for our samples between . (cid:48)(cid:48) and . (cid:48)(cid:48) . The uncer-tainties are estimated by using the bootstrap and MonteCarlo methods described in section 4.4. The bootstrappingerrors are larger as they account for variance within the se-lected sample as well as observational uncertainties, whilethe Monte Carlo only accounts for observational uncertain-ties. For the deconvolved sizes, the reported errors are thecombination of the Monte-Carlo errors and the errors onrandom oﬀset measurements, assuming that these two er-rors are independent.Roughly half the galaxies in our samples are detected inthe HST z -band observations from GOODS-S and GEMS.By ﬁtting a Sérsic distribution to these sources we can esti-mate the sizes at z -band wavelength. We ﬁnd a median sizeof 0 . (cid:48)(cid:48)

46 for the K20 sample and 0 . (cid:48)(cid:48)

52 for the other samples.The median Sérscic index n is around 1.33 for each sample,although slightly lower for the sBzK sample at 0.94.Compared to the results from Decarli et al. (2014), weﬁnd ﬂux densities which are 20 to 40 per cent higher. Thisis expected as the image stacking method in Decarli et al.(2014) uses the peak ﬂux density in the stacked stamp, whichassumes that the sources are unresolved at the image res-olution of 1 . (cid:48)(cid:48)

6. When ﬁtting a point source model to our uv -stacked data, the measured ﬂux densities deviate fromthe Decarli et al. (2014) measurements by less than a fewper cent. c (cid:13) , 1–11 stimating sizes of faint, distant galaxies in the submillimetre regime Table 1.

Flux density estimates with uv-stacking. The flux density in uv-stacking is estimated using two different methods. Method one (model): the flux density is estimated from the best-fit Gaussian model. Method two (point source): the flux density is estimated as the weighted average of all unflagged visibilities. These two estimates would coincide for point sources. We also present the fitted size of the Gaussian model, as well as the fitted size deconvolved from the random offsets between optical and sub-mm positions. For comparison the table also shows the image stacking results from Decarli et al. (2014). The errors are estimated by stacking fake sources introduced into the data.

                 uv-stacking, Gaussian                                  uv-stacking, point source   Image stacking
Sample   N.gal   Flux density [mJy]   Size            Deconvolved size   Flux density [mJy]          Peak flux density [mJy]
K20      52      … ± 0.30             0.″… ± 0.″14    0.″… ± 0.″16       1.… ± 0.07                  1.… ± …
sBzK     22      … ± 0.32             0.″… ± 0.″15    0.″… ± 0.″17       1.… ± 0.10                  1.… ± …
ERO      25      … ± 0.22             0.″… ± 0.″17    0.″… ± 0.″19       1.… ± 0.10                  1.… ± …
DRG      19      … ± 0.28             0.″… ± 0.″14    0.″… ± 0.″16       1.… ± 0.11                  1.… ± …

Table 2.

Distributions of stacked parameters as estimated from bootstrapping, resampling the galaxies within each sample 1000 times. These distributions include both errors from measurement uncertainties and variance within the samples. The presented range of 15.9 per cent to 84.1 per cent corresponds to the ±1σ range for a Gaussian distribution. The distributions are also presented as histograms in Appendix A.

Sample   Gaussian flux [mJy]       Size                      Point source flux [mJy]
         15.9%   50%     84.1%     15.9%   50%     84.1%     15.9%   50%     84.1%
K20      1.33    1.90    2.58      0.″63   0.″94   1.″38     0.95    1.25    1.61
sBzK     1.62    2.38    3.14      0.″54   0.″74   0.″91     1.31    1.86    2.33
ERO      1.03    1.56    2.20      0.″48   0.″76   1.″05     0.80    1.14    1.56
DRG      1.81    2.43    3.16      0.″54   0.″72   0.″85     1.49    1.91    2.32

To study the effect of substructure, we perform a simulation in which the emission originates from kpc-scale clumps in the galaxies, described in more detail in section 5. At baselines shorter than ∼… m the stacked visibilities are well fitted by a Gaussian model, as is shown in Fig. 5. The black squares indicate the ALESS baselines. The simulation also includes a set of longer baselines modelled on an intermediate-length baseline configuration from ALMA Cycle 3, with baselines from 45 m to 1400 m, shown in Fig. 5 as red circles. The Gaussian model recovers an average flux density for the stacked sources of … ± … mJy, compared to the input flux density for the simulation of 2.1 mJy per source. The flux density is primarily recovered by the ALESS baselines: using only the long baselines from the ALMA Cycle 3 configuration, we measure an average flux density of only 90 µJy. When fitting to the data from both baseline configurations, the size measured for the Gaussian is 0.″… ± …″. This agrees well with the distribution of the positions for the clumps, which are spread in a disk with a diameter of 1.″2.

For the HST z-band detected galaxies, we measure and compare the HST sizes to our stacked ALMA sizes, and find the values to be consistent within uncertainties for all samples. However, for those sources with a strong detection we can perform a more in-depth comparison. We select all sources from the K20 sample with peak SNR > … in z-band, a total of 32 sources. Stacking these sources in the ALESS data we measure an average size of 0.″… ± … z-band (0.″… z-band morphology, described in detail in section 5.2. Fig. … uv-plane, indicating that the z-band and the sub-mm emission trace a similar radial morphology.

Figure 5. Stacked flux densities for simulated dataset, plotted as flux density [mJy] against baseline length [m]. Each galaxy is simulated as a combination of three clumps, scattered within a radius of 5 kpc from the centre position of the galaxy. The errors are estimated from the standard deviations of the visibilities in each bin. The plot combines data from simulations with two different baseline configurations. The shorter baselines, marked with black squares, are simulated with the same uv-coverage as the ALESS observations. The longer baselines, marked with red circles, are simulated using an ALMA Cycle 3 configuration with baselines from 45 m to 1.4 km.
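As a rough illustration of why the long baselines recover so little of the flux density in the simulation above, the sketch below evaluates the visibility amplitude of a circular Gaussian source as a function of baseline length at 344 GHz. It is not the fitting code used in this work: the function name `gaussian_visibility`, the 0.″7 FWHM and the list of baseline lengths are assumptions chosen for illustration, and the source is assumed to sit exactly at the phase centre.

```python
import math

C = 299_792_458.0                      # speed of light [m/s]
ARCSEC = math.pi / (180.0 * 3600.0)    # radians per arcsecond


def gaussian_visibility(flux_mjy, fwhm_arcsec, baseline_m, freq_hz=344e9):
    """Visibility amplitude [mJy] of a circular Gaussian at the phase centre.

    Uses the Fourier transform of a Gaussian of the given FWHM:
    V(q) = S * exp(-(pi * theta * q)^2 / (4 ln 2)), with q in wavelengths.
    """
    q = baseline_m * freq_hz / C        # baseline length in wavelengths
    theta = fwhm_arcsec * ARCSEC        # FWHM in radians
    return flux_mjy * math.exp(-((math.pi * theta * q) ** 2) / (4.0 * math.log(2.0)))


if __name__ == "__main__":
    # Hypothetical baselines: the ALESS range (18-160 m) and Cycle 3-like lengths.
    for b in (18.0, 45.0, 100.0, 160.0, 1400.0):
        frac = gaussian_visibility(1.0, 0.7, b)
        print(f"{b:7.0f} m: {frac:.3g} of the total flux density")
```

Under these assumptions, a 0.″7 Gaussian retains about a quarter of its total flux density at 160 m and essentially nothing at 1.4 km, which is the qualitative behaviour described for the simulated long-baseline data.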

Figure 6.

Simulation of stacked flux densities based on HST z-band emission maps shown in black, binned by baseline length. The errors are estimated from the standard deviation of the visibilities within each bin. For comparison, the stacked flux densities of the z-band detected galaxies of our sample are shown using the same binning. Note that for the middle bin the simulated and real data are very close, and as such the simulated data point is hidden behind the real data point in the plot.

Decarli et al. (2014) stacked each of the four samples in the three Herschel SPIRE bands, using data from the Herschel Multi-tiered Extragalactic Survey (Oliver et al. 2012). We combine these values with our stacked ALESS flux densities to better constrain the dust spectral energy distributions (SEDs) of our samples. The dust emission is modelled as a modified black body:

$$S_\nu \propto \nu^{\beta} B_\nu(T)$$

where $S_\nu$ is the dust SED, $B_\nu(T)$ is the Planck function, $T$ is the dust temperature (typically $T$ ≈ …), and $\beta$ describes the effect of dust opacity (typically $\beta$ ≈ …) (e.g. Kelly et al. 2012). The total IR luminosity ($L_\mathrm{IR}$) is calculated between 8 µm and 1000 µm (e.g. Sanders et al. 2003). The dust emission is fitted using a $\chi^2$ minimization, with two free parameters, $T$ and $L_\mathrm{IR}$. The value of $\beta$ is fixed to 1.6. Each data point is weighted by $\sigma^{-2}$. Data and fitted SEDs are shown in Fig. 7, and results are summarised in Table 3.

The SFRs are calculated from $L_\mathrm{IR}$ assuming a Chabrier (2003) initial mass function (Genzel et al. 2010):

$$\mathrm{SFR} = 1.\ldots \times 10^{-\ldots}\; M_\odot\,\mathrm{yr}^{-1}\; \frac{L_\mathrm{IR}}{L_\odot} \qquad (4)$$

We find that the SFRs are similar for all samples at ∼… M⊙ yr⁻¹, with the DRG sample showing a ∼20 per cent larger star-formation rate compared to the other samples. In Fig. 8 we show SFR as a function of stellar mass for each sample. The measured values fall close to the best-fit "main sequence" for star-forming galaxies at similar redshifts (e.g. the Tacconi et al. (2013) parametrization). We also split the sBzK sample into two subsets based on stellar mass, estimating the flux density of the stacked data with a Gaussian. The star-formation rate is calculated using the same dust temperature as for the full sBzK sample.

Table 3. Infrared luminosity and SFR estimates for the stacked samples, using a combination of Herschel and the new stacked ALMA results. We also show the average stellar mass for each sample. The errors are estimated from $\chi^2$ when varying both $T$ and $L_\mathrm{FIR}$ simultaneously.

Sample              L_FIR [L⊙]   T_dust [K]   SFR [M⊙ yr⁻¹]   M∗ [M⊙]
K20                 … ± …        … ± …        … ± 18          5.… × …
sBzK                … ± …        … ± …        … ± 14          5.… × …
ERO                 … ± …        … ± …        … ± 22          4.… × …
DRG                 … ± …        … ± …        … ± 20          6.… × …
sBzK (high mass)    … ± …        … ± …        … ± 20          2.… × …
sBzK (low mass)     … ± …        … ± …        … ± 20          8.… × …

Figure 7. Stacked flux densities for the samples and fitted dust-emission SEDs, plotted as flux density [mJy] against wavelength [µm] in one panel per sample (K20, sBzK, ERO, DRG). Combines the three wavelengths from Herschel/SPIRE with our new ALMA estimates. The parameters of the fitted models can be found in Table 3.
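To make the modified black-body model used for the SED fits concrete, the sketch below evaluates $S_\nu \propto \nu^{\beta} B_\nu(T)$ at the observed SPIRE wavelengths relative to 870 µm (344 GHz). This is only an illustration of the functional form, not the fitting procedure used here: the dust temperature of 30 K and redshift of 2 are assumed values, and the function names are hypothetical. Only $\beta = 1.6$ is taken from the text.

```python
import math

H = 6.62607015e-34    # Planck constant [J s]
K_B = 1.380649e-23    # Boltzmann constant [J/K]
C = 299_792_458.0     # speed of light [m/s]


def planck(nu_hz, temp_k):
    """Planck function B_nu(T) in SI units."""
    return 2.0 * H * nu_hz**3 / C**2 / math.expm1(H * nu_hz / (K_B * temp_k))


def modified_black_body(nu_rest_hz, temp_k, beta=1.6):
    """Unnormalised modified black body, nu^beta * B_nu(T)."""
    return nu_rest_hz**beta * planck(nu_rest_hz, temp_k)


def relative_sed(obs_wavelengths_um, z, temp_k, beta=1.6, ref_um=870.0):
    """Model flux densities relative to the value at ref_um (observed frame)."""
    def shape(lam_um):
        nu_rest = C / (lam_um * 1e-6) * (1.0 + z)   # shift to the rest frame
        return modified_black_body(nu_rest, temp_k, beta)

    ref = shape(ref_um)
    return [shape(lam) / ref for lam in obs_wavelengths_um]


if __name__ == "__main__":
    # Assumed z = 2 and T = 30 K; SPIRE bands at 250, 350 and 500 micron.
    bands = [250.0, 350.0, 500.0, 870.0]
    for lam, ratio in zip(bands, relative_sed(bands, z=2.0, temp_k=30.0)):
        print(f"{lam:5.0f} um: {ratio:.2f} x S(870 um)")
```

In an actual fit, the overall normalisation (equivalently $L_\mathrm{IR}$) and $T$ would be varied to minimise $\chi^2$ against the four stacked flux densities.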

Our stacked results show that the stacked sources have extended emission with typical sizes ∼0.″7. Assuming that the target sources are compact or unresolved, as was done in Decarli et al. (2014), the flux density is systematically underestimated, for the samples in this study by between 30 and 40 per cent. For the SMGs, where we measure the stacked size to be 0.″4, this effect is smaller, with the peak brightness only ∼… uv-domain we can effectively recover the full flux density. This does, however, rely on having sufficient sensitivity on short baselines. The ALESS data were observed in a very compact ALMA configuration, with most baselines shorter than 100 m, or 115 kλ. This results in a naturally weighted beam size of ∼…″, i.e., the observations are sensitive to scales of 1″-2″.

The filtering of spatial scales is a well known effect within interferometry; however, the results of this study show that the effect is especially pronounced for stacking. For the mapping of individual galaxies, most of the flux density will originate from smaller scales, allowing it to be resolved with higher resolutions. Only emission which is smooth over larger scales is filtered. In the case of stacking, the averaging of multiple galaxies smooths out substructure. As such, having access to sufficiently short baselines is essential to measure the total flux density of the stacked sources. Emission at larger scales, at sizes larger than approximately 2-3″, would be similarly suppressed in the ALESS data. However, HST data at z-band set an upper limit for our samples at around 2″, as the dust emission is unlikely to extend much beyond the stellar region.

Figure 8. Average star-formation rate [M⊙ yr⁻¹] and stellar mass [M⊙] for each sample shown as blue triangles (see Table 3). The sBzK sample is also split into two sub-samples based on stellar mass, shown as black circles. The red line indicates the best-fit "main sequence" for star-forming galaxies at z ∼ …, using the Tacconi et al. (2013) parametrization.

Our simulations show that with stacking, we can efficiently estimate the total flux density and the radial distribution of the emission. Using Gaussian models, we find sizes around 0.″… … 0.″17. This means that all samples are extended at a greater than …σ significance. Martí-Vidal, Pérez-Torres & Lobanov (2012) calculate the limitation of model fitting of detected sources in an interferometric data set and find that the minimal size that can be measured is given by

$$\Theta_\mathrm{min} = \beta \left(\frac{\lambda_c}{2}\right)^{1/4} \left(\frac{1}{S/N}\right)^{1/2} \times \Theta_\mathrm{beam} \qquad (5)$$

where S/N is the SNR of the averaged visibilities, $\beta$ is a parameter that depends on the array configuration (typically between 0.5 and 1.0), $\Theta_\mathrm{beam}$ is the FWHM of the beam using natural weighting, and $\lambda_c$ depends on the probability cut-off for false detection (3.84 for 2σ). Using this formula we find our size error to be consistent with a $\beta$ between 0.4 and 0.5. This both indicates that the sizes of …″ are very robust, and also shows that model fitting of stacked sources has similar noise to individual sources with similar SNR. For comparison we also stacked the SMGs in our data, and find an average size of …″ ± …″. This is marginally larger than the median size measured by Simpson et al. (2015) of 0.″… … > …σ, we find that the typical offsets are …″ ± …″. If we deconvolve this from the measured sizes we find that the sizes of the actual galaxies are …″ - …″.

We also estimate the variance of the target samples using bootstrapping. This indicates larger errors on our estimated parameters due to the sample sizes, with size errors increasing to 0.″20 - 0.″35. Larger samples of star-forming galaxies have been studied using HST, e.g. van der Wel et al. (2014) measured the sizes of ∼… star-forming galaxies at z > …. Based on this sample they find that the optical sizes follow a log-normal distribution. Looking at the sBzK galaxies, if we assume that the sub-mm sizes of our samples follow a similar distribution, we would expect this to contribute 0.″04 to the error of our stacked size, assuming we sample 22 random galaxies. This effect is similar for the other samples, getting smaller the larger the sample is. Looking at the results from bootstrapping, we find that the results are consistent for the sBzK and DRG samples. For the K20 sample the bootstrap estimated error is larger than expected from the optical sizes of star-forming galaxies; however, this sample is not selective to star-forming galaxies, probably leading to a more heterogeneous sample. For the flux densities of our stacked samples, the bootstrap errors are larger than the measurement errors. This is consistent with the large variation seen for star-forming galaxies, where the SFR can vary by more than an order of magnitude within a sample. We note that this indicates that the errors on the SFRs measured for our samples are dominated by sample variance. This would be true even if each galaxy was individually detected, indicating the importance of large samples to accurately estimate the typical SFR for a population of galaxies.
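The size limit of Eq. (5) can be evaluated with a few lines of code. The sketch below uses the combined form $\Theta_\mathrm{min} = \beta\,[\lambda_c / (2\,(S/N)^2)]^{1/4}\,\Theta_\mathrm{beam}$, which is our reading of the expression from Martí-Vidal, Pérez-Torres & Lobanov (2012). The function name `minimum_size` and the example inputs (an S/N of 10 and a 1.″6 beam) are assumptions for illustration and are not values fitted in this work.

```python
def minimum_size(snr, beam_fwhm, beta=0.75, lambda_c=3.84):
    """Smallest measurable source size, in the same units as beam_fwhm.

    snr       : signal-to-noise ratio of the averaged visibilities
    beam_fwhm : FWHM of the naturally weighted beam
    beta      : array-dependent factor (typically 0.5-1.0)
    lambda_c  : cut-off value for false detection (3.84 for 2 sigma)
    """
    return beta * (lambda_c / (2.0 * snr**2)) ** 0.25 * beam_fwhm


if __name__ == "__main__":
    # Hypothetical example: S/N = 10 with a 1.6 arcsec beam.
    for beta in (0.4, 0.5, 1.0):
        print(f"beta = {beta:.1f}: {minimum_size(10.0, 1.6, beta=beta):.2f} arcsec")
```

Because the limit scales as (S/N)^(-1/2), a factor of four in S/N is needed to halve the smallest measurable size.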

Looking at the sizes of the galaxies with a detection in the HST z-band data (peak SNR > …), we can estimate the size of the stellar component of the galaxies. Using a Sérsic distribution, we find a median effective radius (r_e) of 0.″… n of 1.33. The sizes measured at sub-mm wavelengths for our stacked sources are based on a Gaussian profile in place of a Sérsic profile. For comparison we fit our stacked sources using a Sérsic profile, with n fixed to … HST observations are from the GEMS survey. The GEMS z-band observations are not as deep as the GOODS-S z-band observations. As such it is possible that we are missing low flux surface density emission, and underestimating the size of these galaxies. However, as this primarily affects half the sample, the impact on the median value is not expected to be very large.

Another limitation of the z-band measurements is dust obscuration. The measured submm continuum emission indicates that dust is abundant in all samples. We can compare to the shallower HST H-band observations from GEMS and GOODS-S, which are less affected by dust absorption. However, only 16 galaxies are detected in H-band. For these galaxies we measure a median size of 0.″6, which agrees well with the sizes measured in z-band.

The size of …″ corresponds to a physical size of 6 kpc at the average redshift of the sBzK sample. For SMGs several measurements of the sizes at sub-mm wavelengths exist, e.g., Simpson et al. (2015) find a median size of 2.4 ± … ± … ∼ … × …

Focusing on the sBzK sample, the total SFR is estimated to be 100 M⊙ yr⁻¹, over a size of 10 kpc, or a SFR surface density (Σ_SFR) of 1 M⊙ yr⁻¹ kpc⁻². This value is consistent with other measurements of sBzK galaxies, e.g., Daddi et al. (2010b), who found values from 0.1 to 30 M⊙ yr⁻¹ kpc⁻². Of this, 40 per cent originates in the centre. This corresponds to Σ_SFR ≈ 13 M⊙ yr⁻¹ kpc⁻² in the inner 1 kpc of the galaxies. While this is higher than the corresponding value for the DRGs (∼2 M⊙ yr⁻¹ kpc⁻²), it is a very small value compared to LIRGs at lower redshift. E.g., in Arp 220 with a similar SFR (Anantharamaiah et al. 2000), the majority of the star formation occurs inside 1 kpc of the centre (Scoville, Yun & Bryant 1997), resulting in an average Σ_SFR of approximately 70 M⊙ yr⁻¹ kpc⁻² (Anantharamaiah et al. 2000). We can also compare this to SMGs, e.g., Hodge et al. (2015) measured Σ_SFR in the centre of a z = 4 SMG to be ∼… M⊙ yr⁻¹ kpc⁻², which is similar to Arp 220, but much higher than our sBzK galaxies.

As noted, Σ_SFR in the centre of the DRG sample is very low: at 2 M⊙ yr⁻¹ kpc⁻² it is only a factor 4 above the same value for the Milky Way (Robitaille & Whitney 2010), despite a factor 100 difference in SFR.

In Decarli et al. (2014), all samples were found to have an excess of star formation compared to similar samples in other fields. Our updated flux-density estimates are ∼30 - 40 per cent higher than those found by Decarli et al. (2014). However, after fitting the SED of the dust emission, the fitted dust temperatures are typically lower. For the sBzK and DRG samples, this results in SFRs which are consistent with the Decarli et al. (2014) measurements within the uncertainties. However, for the K20 and ERO samples the SFR drops by ∼… per cent compared to Decarli et al. (2014). This results in the K20, ERO and sBzK samples having very similar star-formation rates, at ∼90 M⊙ yr⁻¹.

We also compare the measured star-formation rates to the stellar masses, and find them to be consistent with Tacconi et al. (2013) for star-forming galaxies at z ∼ …. We also split the sBzK sample, the sample with the highest SNR, by stellar mass. Both the low- and high-mass samples fall close to the best-fit "main sequence" using the Tacconi et al. (2013) parametrization. This indicates that, while these galaxies are typically more massive compared to other similar samples, the star formation is driven by the same mechanisms.

In this paper we use stacking to measure the average morphologies and sizes of samples of galaxies using ALMA. We use a uv-stacking algorithm combined with model fitting in the uv-domain. We select star-forming galaxies at z ∼ … using four different criteria: K_VEGA < 20, ERO, DRG, and sBzK. The samples are stacked in the ALMA 344 GHz continuum observations from the ALESS survey. We find that all samples are extended, with FWHM sizes of ∼…″ ± …″ estimated using a Gaussian model. Accounting for random offsets between optical catalogue positions and submm positions in the data, we find that the actual average sizes are somewhat smaller at ∼…″ ± …″.

The uv-model fitting results in flux densities that are ∼… per cent higher than if the sources are assumed to be point sources. Furthermore, assuming that the dust emission measured at 344 GHz is primarily heated by star formation, we find that the majority of the star formation is taking place outside the inner kpc of the galaxy. We compare this to the stellar distribution in the same galaxies, using HST z-band data. The median effective radius is measured to 0.″… z-band maps as input model for each galaxy. The distributions are found to agree well, indicating no systematic difference in size or radial distributions between the stellar and star-forming components.

Using a Monte Carlo method to estimate the robustness of the result, we find the measured sizes to be robust at > …σ for all samples. The measured difference between the sBzK and DRG samples is larger than the uncertainties, with a statistical significance of …σ. We find that the measured accuracy of the sizes is comparable to the theoretical limits for individual sources (e.g. Martí-Vidal et al. 2014). As in all cases with stacking we do not measure the properties of the individual galaxies, but the average properties of the samples, and this smoothing effect can simplify the modelling of the stacked source. However, it also increases the interferometric effect of filtering of large spatial scales, making short spacings very important to recover the full flux density.

We can conclude that for the stacking of any sources that may be marginally extended, using uv-stacking with model fitting can provide a flux-density estimate that is significantly more robust, as well as valuable additional information such as the typical sizes of the sources of the stacked sample. This is also important for future facilities such as the Square Kilometre Array (SKA), showing that having access to uv-data in stacking is invaluable.

LL thanks Robert Beswick for useful discussion, Ivan Martí-Vidal for helpful input on the model fitting, and Ian Smail for useful discussion. We thank an anonymous referee for helpful suggestions and useful comments. This paper makes use of the following ALMA data: ADS/JAO.ALMA

References

Agarwal S., Mierle K., et al., 2015, Ceres solver. http://ceres-solver.org
Aleksić J. et al., 2011, ApJ, 729, 115
Anantharamaiah K. R., Viallefond F., Mohan N. R., Goss W. M., Zhao J. H., 2000, ApJ, 537, 613
Blain A. W., Smail I., Ivison R. J., Kneib J.-P., Frayer D. T., 2002, PhR, 369, 111
Bouwens R. J. et al., 2011, ApJ, 737, 90
Boyle B. J., Cornwell T. J., Middelberg E., Norris R. P., Appleton P. N., Smail I., 2007, MNRAS, 376, 1182
Brammer G. B., van Dokkum P. G., Coppi P., 2008, ApJ, 686, 1503
Bruzual G., Charlot S., 2003, MNRAS, 344, 1000
Casey C. M. et al., 2014, ApJ, 796, 95
Chabrier G., 2003, PASP, 115, 763
Chaudhary P., Brusa M., Hasinger G., Merloni A., Comastri A., Nandra K., 2012, A&A, 537, A6
da Cunha E. et al., 2015, ApJ, 806, 110
Daddi E. et al., 2010a, ApJ, 713, 686
Daddi E., Cimatti A., Renzini A., Fontana A., Mignoli M., Pozzetti L., Tozzi P., Zamorani G., 2004, ApJ, 617, 746
Daddi E. et al., 2010b, ApJ, 714, L118
Damen M. et al., 2011, ApJ, 727, 1
Decarli R. et al., 2014, ApJ, 780, 115
Dole H. et al., 2006, A&A, 451, 417
Dunne L. et al., 2009, MNRAS, 394, 3
Elston R., Rieke G. H., Rieke M. J., 1988, ApJ, 331, L77
Fioc M., Rocca-Volmerange B., 1997, A&A, 326, 950
Förster Schreiber N. M. et al., 2009, ApJ, 706, 1364
Franx M. et al., 2003, ApJ, 587, L79
Genzel R. et al., 2010, MNRAS, 407, 2091
George M. R. et al., 2012, ApJ, 757, 2
Giavalisco M. et al., 2004, ApJ, 600, L93
González V., Bouwens R. J., Labbé I., Illingworth G., Oesch P., Franx M., Magee D., 2012, ApJ, 755, 148
Greve T. R. et al., 2010, ApJ, 719, 483
Hodge J. A., Becker R. H., White R. L., de Vries W. H., 2008, AJ, 136, 1097
Hodge J. A. et al., 2013, ApJ, 768, 91
Hodge J. A., Riechers D., Decarli R., Walter F., Carilli C. L., Daddi E., Dannerbauer H., 2015, ApJ, 798, L18
Hodge J. A., Zeimann G. R., Becker R. H., White R. L., 2009, AJ, 138, 900
Hsieh B.-C., Wang W.-H., Hsieh C.-C., Lin L., Yan H., Lim J., Ho P. T. P., 2012, ApJS, 203, 23
Ikarashi S. et al., 2015, ApJ, 810, 133
Im M., Griffiths R. E., Naim A., Ratnatunga K. U., Roche N., Green R. F., Sarajedini V. L., 1999, ApJ, 510, 82
Ivison R. J. et al., 2007, MNRAS, 380, 199
Karim A. et al., 2011, ApJ, 730, 61
Kelly B. C., Shetty R., Stutz A. M., Kauffmann J., Goodman A. A., Launhardt R., 2012, ApJ, 752, 55
Knudsen K. K. et al., 2005, ApJ, 632, L9
Leroy A. K. et al., 2012, AJ, 144, 3
Levenberg K., 1944, Quart. Appl. Math., 2, 164
Lindroos L., Knudsen K. K., Vlemmings W., Conway J., Martí-Vidal I., 2015, MNRAS, 446, 3502
Madau P., Dickinson M., 2014, ARA&A, 52, 415
Martí-Vidal I., Pérez-Torres M. A., Lobanov A. P., 2012, A&A, 541, A135
Martí-Vidal I., Vlemmings W. H. T., Muller S., Casey S., 2014, A&A, 563, A136
Matsuda Y. et al., 2012, MNRAS, 425, 878
Oke J. B., 1974, ApJS, 27, 21
Oliver S. J. et al., 2012, MNRAS, 424, 1614
Planck Collaboration et al., 2014, A&A, 571, A16
Rix H.-W. et al., 2004, ApJS, 152, 163
Robitaille T. P., Whitney B. A., 2010, Highlights of Astronomy, 15, 799
Rodighiero G. et al., 2011, ApJ, 739, L40
Sanders D. B., Mazzarella J. M., Kim D.-C., Surace J. A., Soifer B. T., 2003, AJ, 126, 1607
Scoville N. Z., Yun M. S., Bryant P. M., 1997, ApJ, 484, 702
Simpson J. M. et al., 2015, ApJ, 799, 81
Simpson J. M. et al., 2014, ApJ, 788, 125
Tacconi L. J. et al., 2013, ApJ, 768, 74
Taylor E. N. et al., 2009, ApJS, 183, 295
van der Wel A. et al., 2014, ApJ, 788, 28
Watson D., Christensen L., Knudsen K. K., Richard J., Gallazzi A., Michałowski M. J., 2015, Nature, 519, 327
Webb T. M. A., Brodwin M., Eales S., Lilly S. J., 2004, ApJ, 605, 645
Zibetti S., Ménard B., Nestor D. B., Quider A. M., Rao S. M., Turnshek D. A., 2007, ApJ, 658, 161

Figure A1. Distribution of stacked size for the K20 sample as estimated through bootstrapping.

APPENDIX A: FITTED MODELS

In this appendix we present the distributions determined for the fitted sizes using bootstrapping on the stacking samples. The method for the bootstrapping is described in §4.4, and the plotted distributions indicate the probability of possible sizes for the population of each sample. The bootstrapping method approximates errors from observational noise as well as sample variance.
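The resampling behind these distributions can be sketched as follows. This is a simplified stand-in rather than the procedure of §4.4: in the actual analysis each resampled set of galaxies is uv-stacked and model fitted, whereas here a hypothetical per-galaxy list of values and a plain mean are used as the "stacked" statistic. The function names and the input values are assumptions made for illustration only.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list (q in 0..100)."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def mean(values):
    """Stand-in for the stacking step: a plain average of the inputs."""
    return sum(values) / len(values)


def bootstrap_percentiles(values, statistic, n_resamples=1000, seed=0):
    """Resample with replacement and return the 15.9, 50 and 84.1 percentiles."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    return tuple(percentile(estimates, q) for q in (15.9, 50.0, 84.1))


if __name__ == "__main__":
    # Hypothetical per-galaxy sizes in arcsec, not data from this work.
    sizes = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    print(bootstrap_percentiles(sizes, mean))
```

The 15.9 and 84.1 percentiles are the same limits quoted in Table 2, so the returned triple maps directly onto the columns of that table.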

Figure A2.

Distribution of stacked size for the sBzK sample as estimated through bootstrapping.

Figure A3.

Distribution of stacked size for the ERO sample as estimated through bootstrapping.

Figure A4.

Distribution of stacked size for the DRG sample as estimated through bootstrapping.