Imaging the Schwarzschild-radius-scale Structure of M87 with the Event Horizon Telescope using Sparse Modeling
Kazunori Akiyama, Kazuki Kuramochi, Shiro Ikeda, Vincent L. Fish, Fumie Tazaki, Mareki Honma, Sheperd S. Doeleman, Avery Broderick, Jason Dexter, Monika Mościbrodzka, Katherine L. Bouman, Andrew Chael, Masamichi Zaizen
Draft version February 27, 2017
Typeset using LaTeX twocolumn style in AASTeX61
Kazunori Akiyama (1,2,3,*), Kazuki Kuramochi (3,2), Shiro Ikeda (4,5), Vincent L. Fish (1), Fumie Tazaki (2), Mareki Honma (2,6), Sheperd S. Doeleman (1,7), Avery Broderick (8,9), Jason Dexter (10), Monika Mościbrodzka (11), Katherine L. Bouman (12), Andrew Chael (7), and Masamichi Zaizen (13)

1. Massachusetts Institute of Technology, Haystack Observatory, Route 40, Westford, MA 01886, USA
2. Mizusawa VLBI Observatory, National Astronomical Observatory of Japan, 2-21-1 Osawa, Mitaka, Tokyo 181-8588, Japan
3. Department of Astronomy, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
4. Department of Statistical Science, School of Multidisciplinary Sciences, Graduate University for Advanced Studies, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
5. Graduate University for Advanced Studies, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
6. Department of Astronomical Science, School of Physical Sciences, Graduate University for Advanced Studies, 2-21-1 Osawa, Mitaka, Tokyo 181-8588, Japan
7. Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138, USA
8. Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario N2L 2Y5, Canada
9. Department of Physics and Astronomy, University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada
10. Max Planck Institute for Extraterrestrial Physics, Giessenbachstr. 1, 85748 Garching, Germany
11. Department of Astrophysics/IMAPP, Radboud University Nijmegen, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands
12. Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, 32 Vassar Street, Cambridge, MA 02139, USA
13. Department of Astronomy, School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Submitted to ApJ

ABSTRACT

We propose a new imaging technique for radio and optical/infrared interferometry. The proposed technique reconstructs the image from the visibility amplitude and closure phase, which are standard data products of short-millimeter very long baseline interferometers such as the Event Horizon Telescope (EHT) and of optical/infrared interferometers, by utilizing two regularization functions: the ℓ1-norm and the total variation (TV) of the brightness distribution. In the proposed method, optimal regularization parameters, which represent the sparseness and effective spatial resolution of the image, are derived from the data themselves using cross validation (CV). As an application of this technique, we present simulated observations of M87 with the EHT based on four physically motivated models. We confirm that ℓ1+TV regularization can achieve an optimal resolution of ∼20-30% of the diffraction limit λ/D_max, which is the nominal spatial resolution of a radio interferometer. With the proposed technique, the EHT can robustly and reasonably achieve super-resolution sufficient to clearly resolve the black hole shadow. These results make it promising for the EHT to provide an unprecedented view of the event-horizon-scale structure in the vicinity of the supermassive black hole in M87 and also in the Galactic center source Sgr A*.
Keywords: accretion, accretion disks — black hole physics — Galaxy: center — submillimeter: general — techniques: interferometric
Corresponding author: Kazunori Akiyama ([email protected])

∗ JSPS Postdoctoral Fellow for Research Abroad

1. INTRODUCTION

Supermassive black holes (SMBHs) reside in the majority of galactic nuclei in the universe. In a subset of such galaxies, accretion drives a highly energetic active galactic nucleus (AGN), often associated with powerful jets. Understanding the nature of these systems has been a central quest in astronomy and astrophysics. The SMBHs at the centers of our galaxy (Sgr A*) and of the giant elliptical galaxy M87 provide unprecedented opportunities to directly image the innermost regions close to the central black hole, since the angular sizes of their event horizons are the largest among known black holes. The angular size of the Schwarzschild radius (R_s) is ∼10 μas for Sgr A*, for a distance of 8.3 kpc and a mass of 4.3 × 10^6 M_⊙ (e.g. Chatzopoulos et al. 2015), and ∼4-8 μas for M87, with a distance of 16.7 Mpc (Blakeslee et al. 2009) and a mass of (3.5-6.6) × 10^9 M_⊙ (e.g. Gebhardt et al. 2011; Walsh et al. 2013). The apparent diameter of the dark silhouette of the black hole is √27 R_s for a non-rotating black hole, corresponding to ∼50 μas for Sgr A* and ∼20-40 μas for M87, and it changes by only ∼4% with black-hole spin and viewing orientation (Bardeen 1973).

Very long baseline interferometric (VLBI) observations at short/sub-millimeter wavelengths (λ ≲ 1.3 mm, ν ≳
230 GHz) can achieve a spatial resolution of a few tens of microarcseconds and therefore are expected to resolve event-horizon-scale structures, including the shadows of SMBHs. Indeed, recent significant progress on 1.3 mm VLBI observations with the Event Horizon Telescope (EHT; Doeleman et al. 2009) has succeeded in resolving compact structures of a few R_s in the vicinity of the SMBHs in both Sgr A* and M87 (e.g. Doeleman et al. 2008, 2012; Fish et al. 2011, 2016; Akiyama et al. 2015; Johnson et al. 2015). Direct imaging of these scales will be accessible in the next few years with technical developments and the addition of new (sub)millimeter telescopes such as the Atacama Large Millimeter/submillimeter Array (ALMA) to the EHT (e.g. Fish et al. 2013).

Regardless of the observing wavelength, angular resolution (often referred to as the "beam size" in radio astronomy and the "diffraction limit" in optical astronomy) is simply given by θ ≈ λ/D_max, where λ and D_max are the observed wavelength and the diameter of the telescope (or the longest baseline length for a radio interferometer), respectively. A practical limit for a ground-based 1.3 mm VLBI array like the EHT is a few tens of μas, set by λ/D_max; a technique achieving resolution finer than λ/D_max would therefore be desirable, particularly for future EHT observations of M87 and Sgr A*.

The imaging problem of interferometry is formulated as an under-determined linear problem when reconstructing the image from full-complex visibilities, which are Fourier components of the source image. In the context of compressed sensing (also known as "compressive sensing"), it has been shown that an ill-posed linear problem may be solved accurately if the underlying solution vector is sparse (Donoho 2006; Candès & Tao 2006). Since then, many such imaging methods have been applied to radio interferometry (see Garsden et al. 2015, and references therein). We call these approaches "sparse modeling" since they utilize the sparsity of the ground truth.

In Honma et al.
(2014), we applied LASSO (Least Absolute Shrinkage and Selection Operator; Tibshirani 1996), a technique of sparse modeling, to interferometric imaging. LASSO solves under-determined ill-posed problems by utilizing the ℓ1-norm (see §2.2). Minimizing the ℓ1-norm of the solution reduces the number of non-zero parameters in the solution, equivalent to choosing a sparse solution. The philosophy of LASSO is therefore similar to that of the traditional CLEAN technique (Högbom 1974), which favors sparsity in the reconstructed image and was independently developed as Matching Pursuit (Mallat & Zhang 1993) in statistical mathematics for sparse reconstruction. In Honma et al. (2014), we found that LASSO can potentially reconstruct structure finer than the diffraction limit λ/D_max. Indeed, it works well for imaging the black hole shadow of M87 with the EHT in simulations.

Our previous work (Honma et al. 2014) leaves three relevant issues. The first issue is reconstructing the image only from the visibility amplitudes and closure phases (see §2.1), rather than from the full-complex visibilities assumed for ℓ1-norm minimization and sparse image reconstruction in Honma et al. (2014); another potential approach is to solve for the image and visibility phases jointly with ℓ1-norm regularization. The second issue is that ℓ1-norm regularization might not provide a unique solution and/or could reconstruct an image that is too sparse if the number of pixels with non-zero brightness is not small enough compared to the total number of pixels. This violates a critical assumption of techniques with ℓ1-norm regularization, namely that the solution (i.e. the true image) should be sparse. Such a situation may occur for an extended source, or even for a compact source if the imaging pixel size is set much smaller than the size of the emission structure. Pioneering work has made use of transforms to wavelet or curvelet bases, in which the image can be represented sparsely (e.g. Li et al. 2011; Carrillo et al. 2012, 2014; Garsden et al. 2015; Dabbech et al. 2015).
As another strategy to resolve this potential issue, in Honma et al. (2014) we proposed to add another regularization term, total variation (TV; e.g. Rudin et al. 1992), which is another popular regularizer in sparse modeling. TV is a good indicator of sparsity of the image in its gradient domain rather than in the image domain (see §2.2), and it penalizes structure differently from the ℓ1-norm, thereby extending the class of objects to which sparse modeling is applicable. Indeed, regularization with both the ℓ1-norm and TV has been shown to be effective for imaging polarization with full-complex visibilities in our recent work (Akiyama et al. 2017).

An important detail is the determination of the regularization parameters (e.g. the weights on the regularization functions), which is common to the vast majority of existing techniques. Since one cannot know the true image of the source a priori, one should evaluate the goodness-of-fit and select appropriate regularization parameters from the data themselves. In well-posed problems, one can use statistical quantities that consider the residuals between data and model as well as the model complexity to avoid over-fitting, such as the reduced χ², the Akaike information criterion (AIC), and the Bayesian information criterion (BIC), which use the degrees of freedom to constrain model complexity. However, for ill-posed problems like interferometric imaging, the degrees of freedom cannot be rigorously defined, preventing the use of such statistical quantities.

In this paper, we propose a new technique to reconstruct images from interferometric data using sparse modeling. The proposed technique directly solves for the image from visibility amplitudes and closure phases. In addition to the ℓ1-norm (LASSO), we also utilize another regularization term, TV, so that a high-fidelity image can be obtained even with a small pixel size and/or for extended sources. Furthermore, we propose a method to determine the optimal regularization parameters with cross validation (CV; see §2.3).

2. THE PROPOSED METHOD

2.1. A brief introduction to the closure phase
A goal of radio and optical/infrared interferometry is to obtain the brightness distribution I(x, y) of a target source at a wavelength λ or a frequency ν, where (x, y) is a sky coordinate relative to a reference position, the so-called phase-tracking center. The observed quantity is a complex function called the visibility V(u, v), which is related to I(x, y) by the two-dimensional Fourier transform

V(u, v) = ∫∫ dx dy I(x, y) e^{-2πi(ux+vy)}.  (1)

Here, the spatial frequency (u, v) corresponds to the baseline vector (in units of the observing wavelength λ) between two antennas, projected onto the tangent plane of the celestial sphere at the phase-tracking center.

Observed visibilities are discrete quantities, and the sky image can be approximated by a pixelated version whose pixel size is much smaller than the nominal resolution of the interferometer. The image can therefore be represented as a discrete vector I, related to the Stokes visibilities V by a discrete Fourier transform F:

V = FI.  (2)

The sampling of visibilities is almost always incomplete. Since the number of visibility samples V is smaller than the number of pixels in the image, solving the above equation for the image I is an ill-posed problem.

Here, we consider that the complex visibility V_j is obtained from observations with multiple antennas. Let us define its phase and amplitude as φ_j and V̄_j, respectively, denoted as

V_j = V̄_j e^{iφ_j},  (3)

where j is the index of the measurement. Each measurement corresponds to a point (u_j, v_j) in the (u, v)-plane recorded at time t_j. In actual observations, instrumental effects and atmospheric turbulence, primarily from the troposphere, induce antenna-based errors in the visibility phase, so that the observed phase φ̃_j is offset from the true phase φ_j of the true image.
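To make Eq. (2) concrete, the discrete Fourier matrix F can be written out explicitly. The sketch below is purely illustrative (the grid size, sample count, and random (u, v) points are arbitrary choices, not an actual EHT coverage); it shows why the problem becomes under-determined when the number of samples is smaller than the number of pixels:

```python
import numpy as np

N = 16                        # image is N x N pixels
npix = N * N
M = 60                        # number of (u, v) samples; M < npix

rng = np.random.default_rng(0)
u = rng.uniform(-N / 2, N / 2, M)
v = rng.uniform(-N / 2, N / 2, M)

# DFT matrix F: one row per (u, v) sample, one column per pixel (Eq. 2).
x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * (np.outer(u, x.ravel()) +
                          np.outer(v, y.ravel())) / N)

image = np.zeros((N, N))
image[4, 7] = 1.0             # a single point source of unit flux
V = F @ image.ravel()         # sampled visibilities, V = F I

# Fewer measurements than unknowns: inverting V = F I is ill-posed.
print(F.shape)                # → (60, 256)
```

For a unit point source, every sampled visibility has modulus one, so the missing information is entirely in the (corrupted) phases, which motivates the closure quantities discussed next.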
In particular, this is a serious problem in VLBI observations performed at different sites (see Thompson et al. 2001). However, robust interferometric phase information can be obtained through measurements of the closure phase, defined as the sum of the three visibility phases on a closed triangle of baselines recorded at the same time. The closure phase is free from antenna-based phase errors (Jennison 1958), as can be seen from its definition,

ψ_m = φ̃_j + φ̃_k + φ̃_l = φ_j + φ_k + φ_l,  (4)

where m is the index of the closure phase and j, k, l are the indices of the three baselines forming the closed triangle. The closure phase is also the phase of the triple product of the visibilities on the closed baselines recorded at the same time, V_j V_k V_l, known as the bi-spectrum.¹ Closure phases have been used to calibrate visibility phases in VLBI observations (e.g. Rogers et al. 1974).

In short/sub-millimeter VLBI and optical/infrared interferometry, stochastic atmospheric turbulence in the troposphere over each station induces a rapid phase rotation in the visibility, making it difficult to calibrate or even measure the visibility phase reliably (e.g., see Rogers et al. 1995; Thiébaut 2013). Thus, image reconstruction using the more robust closure phases, which are free from station-based phase errors, is useful for such interferometers.

2.2. Image Reconstruction from Visibility Amplitudes and Closure Phases
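The cancellation of antenna-based phase errors in the closure phase is easy to verify numerically. In the sketch below, the three "true" visibilities are arbitrary illustrative values and the g_i are random station phase errors:

```python
import numpy as np

rng = np.random.default_rng(0)

# True visibilities on the baselines (1-2), (2-3), (3-1) of a closed
# station triangle (arbitrary illustrative values).
V_true = np.array([0.8 * np.exp(0.3j),
                   1.1 * np.exp(-1.2j),
                   0.6 * np.exp(2.0j)])

# Antenna-based phase errors g_i corrupt baseline (i, j) as exp(i(g_i - g_j)).
g = rng.uniform(0.0, 2.0 * np.pi, 3)
gains = np.exp(1j * np.array([g[0] - g[1], g[1] - g[2], g[2] - g[0]]))
V_obs = V_true * gains

# Bispectrum and closure phase (Eq. 4): the g_i cancel around the triangle.
bispec_true = np.prod(V_true)
bispec_obs = np.prod(V_obs)
print(np.angle(bispec_true), np.angle(bispec_obs))  # equal up to rounding
```

Because (g_1 - g_2) + (g_2 - g_3) + (g_3 - g_1) = 0, the observed bispectrum phase equals the true one regardless of the station errors.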
In this paper, we propose a method to solve for the two-dimensional image I = {I_{i,j}} through the constrained minimization

min_I C(I)  subject to  I_{i,j} ≥ 0.  (5)

¹ Data products consisting of visibility amplitudes and closure phases are also sometimes called the "bi-spectrum" in the literature (e.g. Buscher 1994). In this paper, we strictly distinguish them.
The cost function C(I) is defined as

C(I) = χ²(I) + η_l ||I||_1 + η_t ||I||_tv,  (6)

where ||I||_p is the ℓp-norm of the vector I, given by

||I||_p = ( Σ_i Σ_j |I_{i,j}|^p )^{1/p}  (for p > 0),  (7)

and ||I||_tv denotes the TV operator.

The first term of Eq. (6) is the traditional χ² term representing the deviation between the reconstructed image and the observational data (i.e. the visibility amplitudes V̄ = {|V_j|} and closure phases Ψ = {ψ_m}), defined by

χ²(I) = ||V̄ − A(FI)||²_2 + ||Ψ − B(FI)||²_2,  (8)

where A and B denote operators that calculate the visibility amplitude and closure phase, respectively. The deviations between the model and the observational data are normalized by the errors. This form of the residual sum of squares (RSS) was originally proposed in the Bi-spectrum Maximum Entropy Method (BSMEM; Buscher 1994) and has also been used for modeling EHT data (e.g. Lu et al. 2012, 2013). Note that it could be replaced with an RSS term for bi-spectra (e.g. Bouman et al. 2015; Chael et al. 2016).

The second term is the LASSO-like regularization using the ℓ1-norm. Under the non-negativity condition, the ℓ1-norm is equivalent to the total flux. η_l is the regularization parameter for LASSO, adjusting the degree of sparsity by changing the weight of the ℓ1-norm penalty. In general, a large η_l prefers a solution with very few non-zero components, while η_l = 0 introduces no sparsity. In this paper, we use the normalized regularization parameter η̃_l defined by

η_l ≡ η̃_l (N_amp + N_cphase) / max(V̄),  (9)

which is less affected by the numbers of visibility amplitude and closure phase data points, N_amp and N_cphase, respectively, and also by the total flux density of the target source.

The third term is the TV regularization, defined as the sum of the differences in brightness between adjacent image pixels. In this paper, we adopt a typical form for two-dimensional images (Rudin et al. 1992) that has been used in astronomical imaging (e.g. Wiaux et al. 2010; McEwen & Wiaux 2011; Uemura et al. 2015; Chael et al. 2016), defined as

||I||_tv = Σ_i Σ_j √( |I_{i+1,j} − I_{i,j}|² + |I_{i,j+1} − I_{i,j}|² ).  (10)

TV is a good indicator of image sparsity in its gradient domain instead of the image domain. The regularization parameter η_t adjusts the effective spatial resolution of the reconstructed image. In general, a larger (smaller) η_t prefers a smoother (finer) distribution of power with less (greater) discreteness, leading to a larger (smaller) effective angular resolution. In the present work, we use the normalized regularization parameter η̃_t defined by

η_t ≡ η̃_t (N_amp + N_cphase) / (4 max(V̄)),  (11)

similar to the LASSO term. The factor of 4 reflects the property of TV that differences in brightness are taken in four directions at each pixel. Note that a major difference from maximum entropy methods, which also favor smooth images, is that TV regularization has a strong advantage in edge preservation: strong TV regularization favors a piecewise smooth structure, but with clear and often sharp boundaries between non-emitting and emitting regions.

The problem described in Eqs. (5) and (6) is a non-linear optimization. In this work, we adopt the non-linear programming algorithm L-BFGS-B (Byrd et al. 1995; Zhu et al. 1997), an iterative method for solving bound-constrained non-linear optimization problems. L-BFGS-B is a quasi-Newton method that approximates the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm using a limited amount of computer memory. In L-BFGS-B, the cost function and its gradient are used to determine the next model parameters at each iteration. We approximately set the partial derivatives to 0 at non-differentiable points for both the ℓ1-norm and TV. The partial derivatives of χ² are calculated numerically with central differences.
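A toy end-to-end version of this minimization can be sketched with SciPy's L-BFGS-B interface. For brevity the sketch fits full complex visibilities rather than amplitudes and closure phases, leaves the gradient to finite differences, and uses arbitrary illustrative sizes and regularization weights; it is not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N = 8                                    # tiny grid for illustration
npix = N * N
M = 40                                   # M < npix: under-determined

# Random Fourier sampling matrix (rows = (u, v) samples).
u = rng.uniform(-N / 2, N / 2, M)
v = rng.uniform(-N / 2, N / 2, M)
xx, yy = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * (np.outer(u, xx.ravel()) +
                          np.outer(v, yy.ravel())) / N)

truth = np.zeros((N, N))
truth[2, 3] = 1.0
truth[5, 5] = 0.7
V_obs = F @ truth.ravel()                # noiseless "observed" visibilities

def tv(img):
    # Isotropic total variation, Eq. (10), with the last row/column clamped.
    dx = np.diff(img, axis=0, append=img[-1:, :])
    dy = np.diff(img, axis=1, append=img[:, -1:])
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def cost(flat, eta_l=0.05, eta_t=0.05):
    # C(I) = chi^2 + eta_l ||I||_1 + eta_t ||I||_tv, cf. Eq. (6).
    resid = V_obs - F @ flat
    chi2 = np.sum(np.abs(resid) ** 2)
    return chi2 + eta_l * np.sum(np.abs(flat)) + eta_t * tv(flat.reshape(N, N))

# Bound-constrained quasi-Newton solve with non-negativity (Eq. 5).
x0 = np.full(npix, 0.01)
res = minimize(cost, x0, method="L-BFGS-B", bounds=[(0.0, None)] * npix)
sol = res.x.reshape(N, N)
```

The `bounds` argument enforces the non-negativity constraint of Eq. (5) inside the quasi-Newton iterations, mirroring the role of the bound constraints in L-BFGS-B.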
We use the latest Fortran implementation of L-BFGS-B (L-BFGS-B v3.0; Morales & Nocedal 2011). We note that the problem is non-convex, as in other imaging techniques using closure quantities (e.g. Buscher 1994; Thiébaut 2008; Bouman et al. 2015; Chael et al. 2016), and a global solution is generally not guaranteed.

2.3. Determination of Imaging Parameters
In the proposed method, the most important tuning parameters are the regularization parameters for the ℓ1-norm (η̃_l) and TV (η̃_t), which determine the sparseness and the effective spatial resolution of the image, respectively. Smaller regularization parameters generally favor images with larger numbers of non-zero pixels and more complex structure, which can yield better χ² values through over-fitting. On the other hand, large regularization parameters produce images that are too simple and that do not fit the data well. To determine optimal parameters, we need to evaluate the goodness-of-fit in the spirit of Occam's razor to prevent over-fitting.

In this work, we adopt cross validation (CV) to evaluate the goodness-of-fit. CV is a measure of the relative quality of models for a given set of data. CV checks how well a model generalizes to an independent data set by using separate data sets for fitting the model and for testing the fitted model. CV consists of three steps: (1) randomly partitioning the data into complementary subsets, (2) fitting the model on one subset (the training set), and (3) validating the fit on the other subset (the validation set). To reduce variability, multiple rounds of cross validation are performed using different partitions, and the validation results are averaged over the rounds. If the regularization parameters are too small, the model established from the training set will be over-fitted and too complicated, resulting in a large deviation on the validation set. If, on the other hand, the regularization parameters are too large, the established model will be too simple and poorly fitted to the training set, also resulting in a large deviation on the validation set. Thus, reasonable parameters can be estimated by finding the parameter set that minimizes the deviation (e.g. χ²) on the validation set.

In this work, we adopted 10-fold CV for evaluating the goodness-of-fit.
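The k-fold loop itself is simple to write down. In the generic sketch below, the actual image reconstruction is replaced by a ridge-regression stand-in on synthetic linear data (all names and sizes are hypothetical), but the partition/fit/validate structure is the same 10-fold scheme used here:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic linear "observations" y = A x + noise standing in for
# visibility data; sizes and names are hypothetical.
n_data, n_param = 100, 20
A = rng.normal(size=(n_data, n_param))
x_true = np.zeros(n_param)
x_true[:3] = [2.0, -1.0, 0.5]            # sparse ground truth
y = A @ x_true + 0.1 * rng.normal(size=n_data)

def fit(A_tr, y_tr, lam):
    # Ridge-regularized fit as a stand-in for image reconstruction.
    return np.linalg.solve(A_tr.T @ A_tr + lam * np.eye(n_param),
                           A_tr.T @ y_tr)

def cv_score(lam, k=10):
    # Average held-out chi^2 over k folds (10-fold CV in the text).
    idx = rng.permutation(n_data)
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        val = folds[i]
        tr = np.concatenate([folds[j] for j in range(k) if j != i])
        x_hat = fit(A[tr], y[tr], lam)
        scores.append(np.mean((y[val] - A[val] @ x_hat) ** 2))
    return np.mean(scores)

# Scan regularization parameters and keep the CV minimum.
lams = [1e-3, 1e-1, 1e1, 1e3]
best = min(lams, key=cv_score)
```

An over-regularized model (large `lam`) under-fits the training folds and scores poorly on the held-out fold, which is exactly the behavior the text exploits to select η̃_l and η̃_t.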
The original data were randomly partitioned into 10 equal-sized subsamples. Nine subsamples were used for the image reconstruction as the training set, and the remaining subsample was used as the validation set for testing the model using χ². We repeated this procedure, changing the validation subsample, 10 times, until every subsample had been used for both training and validation. The χ² values of the validation data were then averaged and used to determine the optimal tuning parameters.²

² In this work, subsamples are obtained with uniform probability regardless of baseline, following the most basic style of CV. However, there could be more effective ways of choosing subsamples for interferometric imaging; the optimal partitioning for CV is within the scope of our future studies.

An important advantage of this method compared with previously proposed methods is that it is applicable to any type of regularization function and also to imaging with multiple regularization functions. For instance, Carrillo et al. (2012, 2014) and subsequent work solve images by utilizing the ℓ1-norm of a wavelet-transformed image or TV regularization alone. In this
case, the parameter can be uniquely determined from the ℓ2-norm of the estimated uncertainties on the observational data (see Carrillo et al. 2012, for details). However, it is not straightforward to extend this idea to problems with multiple regularization functions. As another example, Garsden et al. (2015) proposed a heuristic method to determine the regularization parameter of ℓ1-regularization on the wavelet/curvelet-transformed image by estimating the noise level on each scale, which is successful; however, that method would not work for all types of regularization functions. On the other hand, CV is a general technique that can in principle be applied to imaging with any other regularization functions, or any combination of them, including MEM (e.g. Buscher 1994; Chael et al. 2016) and patch priors (e.g. Bouman et al. 2015). This advantage is particularly important for Sgr A*, which requires a regularization function to mitigate interstellar scattering effects (scattering optics; Johnson 2016) in addition to the general regularization function(s) for imaging.

A relevant disadvantage of this method is its computational cost, since n-fold CV requires reconstructing (n + 1) images for each set of regularization parameters. Recently, an accurate approximation of CV, which can be derived from a single imaging run on the full data set for each parameter set, has been proposed for imaging from full-complex visibilities with ℓ1+TV regularization (Obuchi & Kabashima 2016; Obuchi et al. 2016). Future development of such approximations for other types of data and regularization functions could overcome this issue.

3. IMAGING SIMULATIONS

3.1.
Physically Motivated Models
In this paper, we adopt four physical models previously proposed for the 1.3 mm emission on event-horizon scales.

The first model is a simple, but qualitatively correct, force-free jet model (hereafter BL09) in the magnetically dominated regime presented in Broderick & Loeb (2009) and Lu et al. (2014). We adopted a model image presented in Akiyama et al. (2015), which is based on the model parameters fitted to the results of 1.3 mm observations with the EHT in Doeleman et al. (2012) and the SED of M87 (Broderick et al. in preparation). The approaching jet is predominant in this model (see Akiyama et al. 2015, for more details).

The second and third models are based on results of GRMHD simulations presented in Dexter et al. (2012). We used the representative models DJ1 and J2, which are based on the same GRMHD simulation but with different energy and spatial distributions for the radio-emitting leptons. The dominant emission region is the accretion flow in DJ1 and the counter jet illuminating the last photon orbit in J2. We adopt the model images in Akiyama et al. (2015), where the position angle of the large-scale jet is adjusted to that inferred for M87 (e.g. Hada et al. 2011).

The last model (hereafter M16) is based on results of GRMHD simulations presented in Mościbrodzka et al. (2016), which model the M87 core emission as radiation produced by the jet sheath. We use the time-averaged image.

3.2. Simulated Observations
We simulate observations with the EHT at 1.3 mm (230 GHz) using the MAPS (MIT Array Performance Simulator) package for the above models. The simulated observations are performed for the array expected to be in place in Spring 2017.

We assume an array consisting of stations at 6 different sites: a phased array of the Submillimeter Array (SMA) antennas and the James Clerk Maxwell Telescope (JCMT) on Mauna Kea in Hawaii; the Arizona Radio Observatory's Submillimeter Telescope (ARO/SMT) on Mt. Graham in Arizona; the Large Millimeter Telescope (LMT) on Sierra Negra, Mexico; a phased array of the Atacama Large Millimeter/submillimeter Array (ALMA) in the Atacama desert, Chile; the Institut de Radioastronomie Millimétrique (IRAM) 30 m telescope on Pico Veleta, Spain; and a single dish of the Northern Extended Millimeter Array (NOEMA) in France. We adopt the system equivalent flux density (SEFD) of each station shown in Table 1, based on the proposer's guide for 1 mm VLBI observations in ALMA Cycle 4.³

The simulations assume a bandwidth of 3.5 GHz for Stokes I, which is half of the standard setting in ALMA Cycle 4. We assume a correlation efficiency of 0.7, including a quantization efficiency of 0.88 for 2-bit sampling and other potential losses such as bandpass effects and pointing errors.

³ https://science.nrao.edu/observing/call-for-proposals/1mm-vlbi-cycle4/

Table 1.
Stations used in simulated observations

Telescope            SEFD (Jy)
Phased ALMA          100
Phased SMA and JCMT  4000
LMT                  1400
IRAM 30m             1400
NOEMA single dish    5200
ARO/SMT              11000
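Given the SEFDs in Table 1 and the bandwidth and efficiency quoted above, the per-scan thermal noise on any baseline follows from the standard VLBI radiometer equation (e.g. Thompson et al. 2001); the function name and the choice of example baseline below are illustrative:

```python
import math

# SEFDs (Jy) from Table 1.
sefd = {"ALMA": 100, "SMA+JCMT": 4000, "LMT": 1400,
        "IRAM30m": 1400, "NOEMA": 5200, "ARO/SMT": 11000}

def baseline_noise(sefd1, sefd2, bw_hz=3.5e9, t_sec=300.0, eff=0.7):
    """Thermal noise (Jy) on one baseline for a 3.5 GHz bandwidth,
    a 5-minute scan, and a 0.7 correlation efficiency."""
    return math.sqrt(sefd1 * sefd2 / (2.0 * bw_hz * t_sec)) / eff

# Example: the sensitive ALMA-LMT baseline.
sigma = baseline_noise(sefd["ALMA"], sefd["LMT"])
print(f"{sigma * 1e3:.2f} mJy")          # → 0.37 mJy
```

Sub-mJy noise on the ALMA baselines is consistent with the 0.08 mJy beam⁻¹ image-sensitivity threshold used for CLEAN later in the text once many scans are combined.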
Figure 1.
The uv-coverage of the simulated observations with the EHT array expected in Spring 2017. Each baseline is split into two colors to show the stations involved.

We simulate observations as a series of 5-minute scans with a cadence of 20 minutes over a GST range of 13-0 h, corresponding to the time range when M87 can be observed by ALMA or LMT at an elevation greater than 20°. ALMA and LMT are sensitive stations near the middle of the east-west extent of the array, and they may be important anchor stations for fringe detection. This provides an observational efficiency of 25% in time, as expected for VLBI observations with ALMA in 2017. Data are integrated for the duration of each scan (i.e. 5 min), following previous EHT observations (e.g. Doeleman et al. 2012; Akiyama et al. 2015). Figure 1 shows the resultant uv-coverage of the simulated observations. The maximum baseline length is 7.2 Gλ, corresponding to λ/D_max = 28.6 μas.

We note that the conditions of our simulation are considerably worse than those of the previous simulations in Lu et al. (2014) in terms of baseline sensitivity, uv-coverage, angular resolution (i.e. maximum baseline length), and total integration time. Nevertheless, our simulated conditions are much closer to the expected observational conditions in Spring 2017.

3.3. Imaging
We reconstruct images from the simulated data sets with the method described in §2. We adopt a field of view gridded by 100 pixels in both the RA and Dec directions for all models, giving a pixel size corresponding to a small fraction of R_s. Images are reconstructed at 4 values of each regularization parameter η̃_l and η̃_t, ranging over 10^{-1}, 10^{0}, ..., 10^{+2}, so that we obtain 4 × 4 images for each data set. We start imaging at η̃_l = η̃_t = 10^{+2}, which is expected to yield the simplest image among the adopted parameters, assuming a point source as the initial image. The result is then used as the initial image for the other values of the regularization parameters.

At each parameter set, we iterate until achieving convergence or 1000 iterations, and then filter the output image with a hard thresholding defined by

I^filtered_i = { 0 (|I_i| < t); I_i (otherwise) },  (12)

where t is a threshold. We repeat this process until the normalized root mean square error (NRMSE) between the previous and latest filtered images becomes smaller than 1%. The NRMSE is defined by (e.g. Chael et al. 2016)

NRMSE(I, K) = √( Σ_i |I_i − K_i|² / Σ_i |K_i|² ),  (13)

where I and K are the image to be evaluated and the reference image, respectively. We adopt the latest non-filtered image as the final product. Although this procedure lengthens the computation, we found that it works very well for avoiding convergence to local minima. In this paper, we set the threshold t to 10% of the peak flux. For the simulated observational data in this paper, it typically takes several to ten minutes on a standard desktop computer with six Intel Core i7 CPU cores to reconstruct an image for each set of two regularization parameters.
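The filtering and convergence test in Eqs. (12) and (13) amount to a few lines; the demo array below is arbitrary:

```python
import numpy as np

def hard_threshold(img, frac=0.1):
    # Eq. (12): zero out pixels below t, here 10% of the peak flux.
    t = frac * np.abs(img).max()
    out = img.copy()
    out[np.abs(out) < t] = 0.0
    return out

def nrmse(img, ref):
    # Eq. (13): normalized root-mean-square error against a reference.
    return np.sqrt(np.sum(np.abs(img - ref) ** 2) /
                   np.sum(np.abs(ref) ** 2))

demo = np.array([[1.0, 0.05],
                 [0.2, 0.005]])
filtered = hard_threshold(demo)           # zeros the two faint pixels
print(nrmse(filtered, demo))
```

In the iterative scheme described above, `nrmse` would be evaluated between successive filtered images, stopping once it drops below 0.01.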
We evaluate the goodness-of-fit for each image and then select the best-fit image with 10-fold CV as described in §2.3. Before evaluation, each reconstructed image is aligned with its model image⁴, because absolute positions cannot be defined from visibility amplitudes and closure phases alone. Finally, the metrics are evaluated. In addition to the NRMSE, we also measure the structural dissimilarity (DSSIM; Wang et al. 2004) between the model and reconstructed images, as adopted in Lu et al. (2014) and Fish et al. (2014). Since both metrics show similar trends, we show only the behavior of the NRMSE in the figures that follow.

3.3.1. Imaging with the Cotton-Schwab CLEAN
For evaluating the performance of our technique, we also reconstructed images with the most widely used Cotton-Schwab CLEAN (henceforth CS-CLEAN; Schwab 1984) implemented in the Common Astronomy Software Applications (CASA) package⁵ with uniform weighting. Since CLEAN requires complex visibilities, we adopted the simulated complex visibilities with thermal noise. We set a gain of 0.1 and a threshold of 0.08 mJy beam⁻¹, comparable to the image sensitivity of the simulated observations. Since the fast Fourier transform is often used in CLEAN, a very small FOV can require a grid size in the uv-plane that is too large, which could cause additional deconvolution errors. Hence, we set 1024 pixels with the same pixel size in each axis for the entire map, and put a CLEAN box on the central 100 × 100 pixels to place CLEAN components in the same region as for the other techniques. We use the model image instead of the CLEAN map for calculating metrics, since the residual map, which is generally added to the CLEAN map, cannot be calculated for the proposed method.

⁴ Previous works (e.g. Lu et al. 2014; Fish et al. 2014) derived position offsets between the model and reconstructed images by taking cross-correlations of the two images. However, we found that the position offsets derived from cross-correlations induce an additional error in the metrics. We found that the center of mass of the image is a better indicator of the position offset than the cross-correlation.

⁵ https://casa.nrao.edu/

4. RESULTS

4.1. The best-fit images
We show the best-fit images selected with CV in Figure 2. A clear shadow feature is well reproduced for the counter-jet- and accretion-flow-dominated models (J2, M16, and DJ1). This demonstrates that the EHT will achieve sufficient spatial resolution to image the black hole shadow of M87 if the mass-loading radius of the jet is not too large (Broderick & Loeb 2009; Lu et al. 2014; Akiyama et al. 2015).

Figure 3 shows the NRMSE metric for the reconstructed images. The black curve labeled "Model" shows the NRMSE calculated when the model image is convolved with a circular Gaussian beam with the full width at half maximum (FWHM) shown on the abscissa and compared against the original (unconvolved) model image. The Model curve effectively quantifies the best-case scenario, in which the differences from the original input are due solely to a loss of resolution, not to errors in reconstructing the image. Figure 3 also shows the NRMSE of each of the reconstructed images convolved with circular Gaussian beams.

Figure 3 clearly shows that closure-phase imaging with ℓ1+TV regularization works well compared to CS-CLEAN, in particular at finer resolutions, despite the fact that CS-CLEAN uses full complex visibilities, with more information and higher SNRs than closure phases. Both CS-CLEAN and ℓ1+TV images achieve similar NRMSEs on scales comparable to or greater than the diffraction limit. On the other hand, the NRMSEs of the reconstructed images start to deviate from those of the model images in the super-resolution regime, namely on scales smaller than the diffraction limit. In this regime, the NRMSEs differ by technique. CS-CLEAN shows a common trend for all four models, which is broadly consistent with the results of Chael et al. (2016): the CS-CLEAN images achieve minimum errors at a resolution of ∼60% of the diffraction limit and then show a rapid increase in errors at smaller scales. In contrast, closure-phase imaging with ℓ1+TV regularization shows much more modest variations in the super-resolution regime. The ℓ1+TV images achieve minimum errors at a resolution of ∼30% of the diffraction limit, smaller than for CS-CLEAN, and show only a slight increase at smaller scales. ℓ1+TV reconstructions produce images that have a smooth distribution similar to the model images, resulting in smaller errors than CS-CLEAN, even if the ℓ1+TV reconstructions are not convolved with a restoring beam.

In super-resolution regimes, the errors in ℓ1+TV images mostly arise from the presence of tiny substructures in the image. As an illustration, we show model, reconstructed, and residual images of DJ1 filtered with different baseline lengths in Figure 4.
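The "Model" curve described above can be sketched numerically: convolve the model with a circular Gaussian of a given FWHM and compare it against the unconvolved model. This is an illustrative sketch, not the authors' analysis code, and it assumes the standard L2 definition of the NRMSE in percent; `nrmse` and `nrmse_vs_beam` are hypothetical helper names.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nrmse(image, reference):
    """Normalized root-mean-square error in percent (assumed L2 definition)."""
    return 100.0 * np.sqrt(np.sum((image - reference) ** 2) / np.sum(reference ** 2))

def nrmse_vs_beam(model, fwhm_pixels):
    """NRMSE of the beam-convolved model against the unconvolved model,
    for a list of circular-Gaussian FWHMs in pixels (the 'Model' curve)."""
    curve = []
    for fwhm in fwhm_pixels:
        # Convert FWHM to the Gaussian standard deviation.
        sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
        curve.append(nrmse(gaussian_filter(model, sigma), model))
    return curve
```

For a compact source, the curve rises monotonically with beam size, which is the "loss of resolution only" baseline against which the reconstructions are judged.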
[Figure 2 panels: Model, Model (Convolved), CS-CLEAN, and ℓ1+TV images for each model; axes are Relative RA and Relative Dec in µas, spanning ±60 µas.] Figure 2.
The model and reconstructed images. All images are convolved with circular Gaussian beams with the FWHM sizes corresponding to the diameters of the yellow circles, which coincide with the optimal resolutions for ℓ1+TV regularization shown in Figure 3. Top panels: the approaching-jet-dominated model BL09, taken from Akiyama et al. (2015) (originally from Broderick & Loeb 2009 and Lu et al. 2014). Second and third panels from the top: the counter-jet-dominated models J2, taken from Akiyama et al. (2015) (originally from Dexter et al. 2012), and M16 (Mościbrodzka et al. 2016), respectively. Bottom panels: the accretion-flow-dominated model DJ1, taken from Akiyama et al. (2015) (originally from Dexter et al. 2012).

As shown in Figure 4, the residuals are small when the filtering baseline lengths are shorter than the maximum baseline length. On the other hand, for longer filtering baseline lengths, systematic residuals due to tiny substructures much smaller than the diffraction limit start to appear; these substructures can be traced only with baselines longer than those of the simulated observations.

Although the NRMSEs are better than those of CS-CLEAN at finer resolutions, ℓ1+TV images have broader emission-region sizes regardless of the model. This is a typical feature of images reconstructed with the isotropic TV, which prefers flat images with sharp edges. Since the simulated data do not have visibilities at baseline lengths long enough to resolve the width of ring- or crescent-like features, TV enlarges their widths until the images start to deviate from the observed visibilities. This property of the isotropic TV regularization would be useful to constrain the upper-limit size of the emission region and the black hole shadow.
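The baseline-length filtering used in Figure 4 amounts to a hard low-pass filter in the uv-plane: spatial frequencies beyond a chosen baseline length (in wavelengths) are zeroed before transforming back to the image. A minimal sketch under simple assumptions (square pixels, FFT gridding, no tapering); `lowpass_filter` is a hypothetical helper, not the analysis code used in the paper.

```python
import numpy as np

def lowpass_filter(image, pixel_size_uas, cutoff_lambda):
    """Zero all spatial frequencies with |u| above a baseline-length cutoff
    (in wavelengths), then return the filtered image."""
    ny, nx = image.shape
    # Pixel size in radians; spatial frequencies then come out in wavelengths.
    pix_rad = pixel_size_uas * np.pi / (180.0 * 3600.0 * 1e6)
    u = np.fft.fftfreq(nx, d=pix_rad)
    v = np.fft.fftfreq(ny, d=pix_rad)
    uu, vv = np.meshgrid(u, v)
    mask = np.sqrt(uu ** 2 + vv ** 2) <= cutoff_lambda
    vis = np.fft.fft2(image)          # model visibilities on the FFT grid
    return np.real(np.fft.ifft2(vis * mask))
```

Residual maps like those in Figure 4 are then the difference of the filtered model and filtered reconstruction, normalized by the model peak.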
[Figure 3 panels: NRMSE (%) versus FWHM of the restoring beam, in units of λ/D_max (bottom axis) and µas (top axis), for models BL09, J2, M16, and DJ1; each panel shows curves for the Model, ℓ1+TV, and CS-CLEAN images.] Figure 3.
The NRMSE between the non-beam-convolved original model image and the beam-convolved model/reconstructed images of all four models, as a function of the FWHM size of the convolving circular beam. The red and blue arrows indicate the optimal resolutions of ℓ1+TV regularization and CS-CLEAN, respectively, which minimize the NRMSE.

4.2. Regularization parameters and Cross Validation
Reconstructed images for all 16 sets of the regularization parameters are shown in Figure 5 for the accretion-flow-dominated model. As shown in Figure 5, the reconstructed images depend on the regularization parameters ˜η_l and ˜η_t. For each parameter set, we evaluate the χ² between the validation set and the model image reconstructed from the training set, averaged over all 10 trials (henceforth the CV value).

[Figure 4 panels: Model, Reconstructed, and Residual images of DJ1; axes are Relative RA and Relative Dec in µas, spanning ±60 µas. Rows (a)-(c) are low-pass filtered at increasing |u| cutoffs in units of D_max/λ; row (d) shows the original images with no filtering. The residual color scale spans ±0.4.]
Figure 4.
The model, reconstructed, and residual images of DJ1 for different filtering baseline lengths. The left and middle panels show the model and best-fit images, while the right panels show the difference between these two images, normalized by the peak brightness of the model image. The images in panels (a), (b), and (c) are low-pass filtered at increasing spatial-frequency (baseline-length) cutoffs |u|_cutoff in units of D_max/λ, equivalent to convolving the images with modified Bessel functions of corresponding FWHM sizes in units of λ/D_max. The panels in (d) show the original images without filtering.

[Figure 5 panels: reconstructed images for a grid of regularization parameters ˜η_l and ˜η_t; tick labels are in µas, spanning ±50 µas.] Figure 5.
The dependence of the reconstructed images on the regularization parameters for the accretion-flow-dominated model (DJ1). The yellow star indicates the regularization parameters for the best-fit image with the minimum CV value. The units of the tick labels are µas.

Although the regularization parameters minimizing the NRMSE differ from the CV-selected minimum values for three of the four models (J2, M16, and DJ1), they are consistent to within a few percent. These slight differences between the best-fit and best-fidelity parameters would not produce substantial differences in the image, and the resulting images are of sufficient quality.

The above results clearly demonstrate that CV is a useful technique for determining the regularization parameters so that the reconstructed image does not overfit noise in the data. We emphasize that CV is a general technique and can be applied to imaging regardless of the specific data products used (e.g., full complex visibilities, visibility amplitudes, and/or closure quantities) or the chosen regularization (for instance, sparse modeling; MEM: e.g., Buscher 1994, Chael et al. 2016; patch priors: e.g., Bouman et al. 2015; scattering optics: Johnson 2016).

5. DISCUSSION

5.1.
Implications for future EHT observations of M87
The proposed method successfully reproduces a clear signature of the black hole shadow for the models J2, M16, and DJ1. This is as expected from the visibility distributions of these models, which have null-amplitude regions, created by the shadow feature, at intermediate baseline lengths (Akiyama et al. 2015). MEM also succeeds in reproducing the black hole shadow in these models (Lu et al. 2014). The presence of a clear shadow feature is tightly connected to where the dominant emission originates, since the silhouette of the black hole is created by photons produced a few R_s from the black hole, illuminating the last photon orbit (see the discussion in Akiyama et al. 2015). As demonstrated in previous imaging simulations (Lu et al. 2014), future EHT observations can constrain the loading radius of the high-energy leptons producing the synchrotron emission at 1.3 mm via the appearance of the black hole.

[Figure 6 panels: (a) cross validation (log CV) and (b) NRMSE (%) as functions of log ˜η_l and log ˜η_t for models BL09, J2, M16, and DJ1.]
Figure 6.
The logarithm of the CV value (a) and the NRMSE of the non-Gaussian-convolved image (b) for models BL09, J2, M16, and DJ1. The yellow stars indicate the regularization parameters for the best-fit images with the minimum CV value. Note that, in panel (a), we set the upper limit of the color contours below the maximum CV value.

The most important implication of this work is that our TV regularization function and parameter selection with CV will enlarge the ring- or crescent-like emission illuminating the last photon orbit of the black hole as much as possible, within the range over which the model image neither over-fits nor deviates too much from the observed visibilities. Hence, the obtained width of the surrounding emission in the reconstructed image is close to an upper limit on the width of the emission region, and, simultaneously, the obtained diameter of the black hole shadow should be interpreted as a reasonable lower limit. The clear shadow features in the reconstructed images for models J2, M16, and DJ1 therefore strongly indicate that the EHT can sample a large enough range of visibilities, with appropriately low noise levels, to image the black hole shadow. In addition, the raw reconstructed image from the proposed method can be used to constrain the lower-limit size of the shadow. This would be useful for constraining the mass of M87, which has an uncertainty of a factor of about two between the stellar-dynamical (e.g., Gebhardt et al. 2011) and gas-dynamical (e.g., Walsh et al. 2013) modeling, and would therefore help clarify which modeling method is preferable for measuring the masses of super-massive black holes.

Another important implication is that, at least for black hole images, post-processing Gaussian convolution would not be required with the ℓ1+TV regularization, although the CLEAN techniques do require it to reduce the many compact artifacts in the image. As shown in Figure 3, the NRMSE curves for the ℓ1+TV regularization are shallow at smaller convolving sizes, and applying a circular Gaussian beam therefore makes only small improvements of a few percent in the NRMSE, regardless of the input model image. Similar results are also shown in recent work with the MEM (Chael et al. 2016).
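The isotropic TV adopted in this work and the anisotropic variant of Equation (14) differ only in how the two forward differences at each pixel are combined. A minimal numerical sketch, with hypothetical helper names (not the authors' implementation) and replicated edge pixels:

```python
import numpy as np

def _forward_diffs(img):
    """Forward differences with replicated edges, so shapes match the image."""
    dx = np.diff(img, axis=1, append=img[:, -1:])  # I[i, j+1] - I[i, j]
    dy = np.diff(img, axis=0, append=img[-1:, :])  # I[i+1, j] - I[i, j]
    return dx, dy

def tv_isotropic(img):
    """Isotropic TV: sum over pixels of the L2 magnitude of the local gradient."""
    dx, dy = _forward_diffs(img)
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def tv_anisotropic(img):
    """Anisotropic TV of Equation (14): sum of absolute differences."""
    dx, dy = _forward_diffs(img)
    return np.sum(np.abs(dx) + np.abs(dy))
```

Both forms are convex; the anisotropic sum is always at least as large as the isotropic one, with equality only where the local gradient is axis-aligned, which is one way to see that the two terms penalize edge shapes differently.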
Our results suggest that application of a restoring beam is not required for recent state-of-the-art imaging methods utilizing multi-resolution regularization functions when imaging the R_s-scale structure of M87 and Sgr A* with the EHT.

5.2. Relevant future issues
5.2.1. Other sparse regularization for smoothed images
In this paper, we adopt TV regularization, which favors a smooth image, so that images can be reconstructed with smaller pixel sizes and/or for more extended sources. As shown in §4, the reconstructed images have good image fidelity. In particular, it is noteworthy that CV selected high values of ˜η_t for all models (see Figure 6 (a)), suggesting that the solutions would be over-fitted without TV regularization. These results demonstrate that inclusion of the TV regularization can extend the range of objects to which sparse modeling is applicable.

However, as described above, the isotropic TV term prefers flat distributions with sharp edges in super-resolution regimes, where the image cannot be constrained well by the data. Hence, a regularization preferring smoother edges is required to improve the fidelity of black-hole imaging with the EHT.

In the context of sparse reconstruction, there are several candidates for improving the image fidelity as a natural extension of this work. First of all, there are other forms of regularization functions that prefer sparse images in the gradient domain with smoother edges. For instance, the alternative (anisotropic) form, given by

\|I\|_{tv} = \sum_i \sum_j \left( |I_{i+1,j} - I_{i,j}| + |I_{i,j+1} - I_{i,j}| \right),   (14)

is also convex, like the isotropic TV term adopted in this work, and prefers images with smoother edges. Furthermore, previous studies of sparse image-reconstruction techniques have shown that regularization with ℓ1 + wavelet/curvelet transforms is also a promising approach (e.g., Li et al. 2011; Carrillo et al. 2012, 2014; Garsden et al. 2015; Dabbech et al. 2015). We will test these sparse regularizations in a forthcoming paper.

5.2.2. Enhancing dynamic range with self-calibration
In VLBI, the visibility phase is initially calibrated with fringe fitting (also called a fringe search), which is a self-calibration technique using phase closure (see Thompson et al. 2001). Fringe fitting can mitigate most station-based errors due to atmospheric and instrumental effects, although errors may remain if an incorrect source model is assumed. Traditionally, self-calibration with hybrid/differential mapping (e.g., Walker 1995) has been employed to solve for residual structural phase errors and images simultaneously, and this has been successful for VLBI imaging.

This work and previous works on closure-phase imaging techniques using other regularizations, such as MEMs (Lu et al. 2014, 2016; Fish et al. 2014; Chael et al. 2016) and patch priors (CHIRP; Bouman et al. 2015), demonstrate that an image can be reconstructed with high fidelity even from closure quantities. However, since closure phases and other closure quantities carry less information about the source structure and have larger thermal noise than full complex visibilities, imaging with closure quantities can limit the dynamic range, image sensitivity, and optimal spatial resolution.

A promising approach to improving the dynamic range is to use an image reconstructed with closure-imaging techniques as an initial image for hybrid/differential mapping. It can also be used as a model for fringe fitting, as self-calibrated images are often applied to detect more fringes on faint sources (e.g., Hada et al. 2016). In a forthcoming paper, we will evaluate the performance of such a hybrid-mapping technique including closure-phase/full-closure imaging priors. We emphasize that, for this purpose, one does not need to reconstruct the image with pixels much smaller than the scales where the brightness distribution cannot be constrained by the data, since such scales will not affect the results of self-calibration.
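The cancellation of station-based phase errors that underlies both fringe fitting and closure-phase imaging can be verified in a few lines. A minimal sketch (hypothetical function name; the visibilities are complex numbers on a closed triangle of baselines):

```python
import numpy as np

def closure_phase(v12, v23, v31):
    """Closure phase (radians) around a triangle of baselines.
    In the bispectrum v12 * v23 * v31, each station-based phase error
    exp(i * phi_k) enters once directly and once conjugated, so it cancels."""
    return np.angle(v12 * v23 * v31)

# Demonstration: corrupt true visibilities with station-based phase errors.
rng = np.random.default_rng(42)
v_true = rng.normal(size=3) + 1j * rng.normal(size=3)   # true visibilities
g = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=3))  # station phase gains
v12 = v_true[0] * g[0] * np.conj(g[1])
v23 = v_true[1] * g[1] * np.conj(g[2])
v31 = v_true[2] * g[2] * np.conj(g[0])
# The closure phase is immune to the corruption:
assert np.isclose(closure_phase(v12, v23, v31), closure_phase(*v_true))
```

This is why closure phases survive atmospheric corruption, and also why they carry less information than full complex visibilities: the station phases themselves are unrecoverable from closure quantities alone.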
6. CONCLUSIONS

We have presented a new imaging technique that reconstructs images from visibility amplitudes and closure phases by utilizing two regularizations of sparse modeling: the ℓ1-norm and total variation (TV). Furthermore, we also proposed a method to select the optimal regularization parameters with cross validation (CV), which can be applied to most existing imaging algorithms. As an example, we applied our technique to simulated observations of M87 with the EHT at 1.3 mm. We summarize our conclusions as follows.

1. We find that ℓ1+TV regularization can achieve an optimal resolution of ∼30% of the diffraction limit λ/D_max, which is the nominal spatial resolution of a radio interferometer. This optimal resolution is better than that of the most widely used Cotton-Schwab CLEAN, which uses full complex visibilities.

2. We confirm that cross validation (CV) works as an Occam's razor and prevents over-fitting when selecting the optimal regularization parameters. CV is a general method that can be applied to interferometric imaging more broadly, such as imaging with full complex visibilities and/or using other regularizations.

3. With ℓ1+TV regularization, the reconstructed image maximizes the width of the emission region within the range over which it neither over-fits nor deviates too strongly from the data. Hence, the clear reproduction of the black hole shadow in the reconstructed image suggests that future EHT observations will have uv-coverage and sensitivity sufficient for imaging it. In addition, the reconstructed image will be able to constrain the sizes of the black hole shadow and the surrounding emission region.

Finally, we remark that all of the above results demonstrate the clear promise of the EHT for providing an unprecedented view of the event-horizon-scale structure of the super-massive black hole in M87 and also of the Galactic center, Sgr A*.

We thank the anonymous referee for his/her useful and constructive suggestions to improve the paper.
MAPS ( ), CASAREFERENCES