Machine-learning approach to identification of coronal holes in solar disk images and synoptic maps

Abstract

Identification of solar coronal holes (CHs) provides information both for operational space weather forecasting and long-term investigation of solar activity. Source data for the first problem are typically most recent solar disk observations, while for the second problem it is convenient to consider solar synoptic maps. Motivated by the idea that the concept of CHs should be similar for both cases we investigate universal models that can learn a CHs segmentation in disk images and reproduce the same segmentation in synoptic maps. We demonstrate that Convolutional Neural Networks (CNN) trained on daily disk images provide an accurate CHs segmentation in synoptic maps and their pole-centric projections. Using this approach we construct a catalog of synoptic maps for the period of 2010-20 based on SDO/AIA observations in the 193 Angstrom wavelength. The obtained CHs synoptic maps are compared with magnetic synoptic maps in the time-latitude and time-longitude diagrams. The initial results demonstrate that while in some cases the CHs are associated with magnetic flux transport events there are other mechanisms contributing to the CHs formation and evolution. To stimulate further investigations the catalog of synoptic maps is published in open access.


Draft version June 16, 2020

Typeset using LaTeX default style in AASTeX63

Machine-learning approach to identification of coronal holes in solar disk images and synoptic maps

Egor Illarionov (1, 2), Alexander Kosovichev (3, 4, 5), and Andrey Tlatov (6)

1 Moscow State University, Moscow, 119991, Russia
2 Moscow Center of Fundamental and Applied Mathematics, Moscow, 119234, Russia
3 Center for Computational Heliophysics, New Jersey Institute of Technology, Newark, NJ 07102, USA
4 Department of Physics, New Jersey Institute of Technology, Newark, NJ 07102, USA
5 NASA Ames Research Center, Moffett Field, CA 94035, USA
6 Kislovodsk Mountain Astronomical Station of the Pulkovo Observatory, Kislovodsk, 357700, Russia

(Received; Revised; Accepted)

Submitted to ApJ

Corresponding author: Egor [email protected]

Keywords: Solar coronal holes, Astronomy data analysis, Solar magnetic fields

1. INTRODUCTION

Solar magnetic fields play a key role in the formation of solar activity tracers that are observed in solar disk images (Solanki et al. 2006). Regions where magnetic field lines are open to outer space, and which appear darker in EUV images, are called coronal holes (CHs). Direct observation of such structures is challenging and requires special conditions (Lin et al. 2004). Another option, based on a reconstruction of magnetic field lines from solar magnetograms, requires additional modeling (see e.g. Stenflo 2013, for details of observations). There is a long-running and intense debate about the proper way to reconstruct the magnetic field, and no single method is generally accepted (see Wiegelmann et al. 2014; Wiegelmann et al. 2017, for a review of models and their limitations).

A search for a robust detection procedure for CHs is motivated by at least two aspects. First, due to the open magnetic field line configuration, high-energy particles can easily flow into outer space and form the solar wind (Nolte et al. 1976; Abramenko et al. 2009; Cranmer 2009; Obridko et al. 2009). The solar wind from CHs can reach the Earth and manifest itself in geomagnetic storms (Robbins et al. 2006; Vršnak et al. 2007). Thus, the detection of CHs is essential for space weather forecasting. Second, in the view of solar dynamo theory, periods of solar activity minima are associated with a strong poloidal magnetic field (Parker 1955). Thus, observations of polar CHs may provide information about the poloidal field strength and also the upcoming solar cycle (Harvey & Recely 2002). Identification of CHs as open field regions in reconstructed solar magnetic field lines is feasible, but with significant uncertainties (see e.g. Linker et al. 2017).

Fortunately, CHs have an easily accessible tracer. They appear as massive dark regions when the solar disk is observed in the EUV or X-ray spectrum. The reason for their darker appearance is the lower density and temperature of the solar corona due to the special magnetic field configuration (Priest 2014). Detection of such specific dark regions is a convenient way for CH identification. We review some common approaches to this problem below.

Detection of CHs is performed both in solar disk images and in solar synoptic (Carrington) maps, which are a compilation of successive disk images during a solar rotation period. Methods for CHs identification in the disk images are remarkably diverse. They range from fully manual procedures to fully automatic ones and use observations in various wavelengths (Table 1). In addition, source data providers often apply custom data preprocessing that contributes to disagreements among various identification attempts. A detailed and unbiased analysis of the various approaches and their uncertainties is outside the scope of this research.

Table 1. Input data used for CHs segmentation in previous studies.

| Author | Reference name | Input wavelength |
|---|---|---|
| Henney & Harvey (2005) | – | 10830 Å and magnetogram |
| Scholl & Habbal (2008) | – | 171, 195, 304 Å and magnetogram |
| Krista & Gallagher (2009) | – | 195 Å |
| Reiss et al. (2014) | – | 193 Å |
| Verbeeck, C. et al. (2014) | SPoCA | 193 Å or 195 Å or (171 and 195 Å) |
| Lowder et al. (2017) | – | (193 or 195 Å) and magnetogram |
| Garton et al. (2018) | CHIMERA | 171, 193 and 211 Å |
| Heinemann et al. (2019) | CATCH | 193 Å and magnetogram |

Further progress in methods for CHs identification in disk images will help to reduce uncertainties in the determination of CH boundaries.
However, CHs are typically large structures, and a single disk image may reveal only the part of a CH that is on the visible side of the Sun. This means that we need some compilation of a series of disk images to capture the whole region of a CH. Solar synoptic maps are a convenient way for such representation. A straightforward approach to get the CH boundaries in a synoptic map is a compilation of the CH boundaries detected in disk images. This approach was implemented e.g. by Caplan et al. (2016). We note that this approach may unambiguously work only if all disk images are taken at the same time and cover the whole solar surface. However, CHs evolve and change their shape with time. Even long-living CHs may appear substantially different in the disk images after a single solar rotation. The instantaneous coverage of the whole solar surface was only available during the STEREO observations of the far side of the Sun.

An alternative approach suggests first merging the solar disk images into full-surface synoptic maps, and then identifying CHs in the synoptic map directly. Of course, we still have uncertainties in pixel intensities; however, it is more convenient to resolve them for continuous values (pixel intensities) than for binary values (CH boundaries). Quite surprisingly, we find far fewer recent publications on CH identification in synoptic maps. Toma & Arge (2005) and Toma (2010) developed a CH identification procedure using synoptic maps in the 171, 195, 284, 304, 10830 Å spectral lines along with the Hα and magnetic synoptic maps. The dataset and analysis cover a period from 2006 to 2009. Hess Webber et al. (2014) investigated polar coronal holes from 1996 through 2010 and compared the identification of CHs in the disk images with two techniques that identify CHs in the synoptic maps. One method is based on a combination of synoptic maps in the 171, 195, and 304 Å wavelengths, while the second one works with the magnetic synoptic maps.
The authors concluded that these methods produced comparable results. An extended time period from 1996 to 2016 was considered by Hamada et al. (2018), who used multi-wavelength synoptic maps together with magnetograms. An important contribution of that paper is the development of a homogenization procedure for data from different observational instruments, which allowed them to perform a joint analysis of two solar cycles (23 and 24).

2. DATA

We analyze a dataset of Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly (AIA) 193 Å solar disk images with a cadence of one image per day (Lemen et al. 2012). The start date is 2010-06-16 and the end date is 2020-03-01. This period covers 130 full solar rotation periods, from Carrington rotation (CR) 2098 to 2227 inclusive. The dataset was obtained from the SunInTime website in JPEG quality and 1K resolution. There are two reasons for this choice. First, this is the same dataset as was used by Illarionov & Tlatov (2018) for the CNN model training. In the context of neural network models, dataset uniformity is essential. Second, this dataset is already calibrated with respect to any known instrument issues by the instrument team (Lemen et al. 2012). This allows a direct assessment of the input data quality and prevents possible misinterpretation in data preprocessing steps. Based on this data we construct solar synoptic maps as described in the next section.

In the data analysis section, we also use Carrington rotation synoptic charts of the radial magnetic field component from the Helioseismic and Magnetic Imager (HMI, Scherrer et al. 2012).

3. CONSTRUCTION OF SYNOPTIC MAPS

A standard way of constructing a synoptic map consists of two steps. First, we project the solar disk images onto the Carrington coordinate system. Second, we select latitudinal strips centered at the central meridian and concatenate them within a single solar rotation period. Other catalogs of the SDO/AIA synoptic maps were prepared similarly (e.g. Karna et al. 2014; Caplan et al. 2016; Hamada et al. 2020).

For the construction of the synoptic maps, we use the dataset of solar disk images described in Section 2. The disk images have a resolution of 1024 × [...]

https://suntoday.lmsal.com/suntoday/
http://hmi.stanford.edu/data/synoptic.html

Figure 1. Pixel intensity distribution of synoptic maps before histogram matching in comparison to the distribution of contributing disk projections. The histogram matching procedure adjusts the synoptic map to make it similar to the disk projections. Carrington rotations (a) CR 2098, (b) CR 2145, and (c) CR 2219 are shown.

[...] information from all pixels that cover the solar disk; the disadvantage is that the corresponding pixels of the synoptic map are sparse. The higher the resolution of the synoptic map, the greater its sparsity. On the other hand, one can construct a reverse mapping. The advantage here is that the pixels of the synoptic map are dense; however, some pixels of the disk image will be ignored and will not contribute to the synoptic map. In this case, the higher the resolution of the disk image, the greater the number of pixels ignored in this image. Since we want to keep the resolution of the synoptic maps as a free parameter, we suggest using mappings of both types and averaging the pixel values that correspond to the same pixel of a synoptic map.

The next step is to select a strip around the central meridian of the projected disk image. It is convenient to consider this step as a part of an averaging procedure, in which we take into account the distance between the pixel longitude and the central meridian longitude in the contributing disk image. The greater the distance, the smaller the pixel weighting factor. The proposed weighting function is defined as $\mathrm{sigmoid}((-d + a)/b)$, where $\mathrm{sigmoid}(x) = 1/(1 + \exp(-x))$ is the standard sigmoid function, $d$ is the distance in degrees, and $a$ and $b$ are the shift and scale parameters that select the desired blending. Indeed, by varying these parameters we obtain wider or narrower rectangular domains and can adjust the softness of their borders. As a particular choice in this work, we use the weighting function sigmoid((−d + 13.[...]

Figure 2. Sample synoptic maps for Carrington rotations (a) CR 2098, (b) CR 2145, and (c) CR 2219.
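As a minimal sketch of the distance-based weighting used in this section, the following Python snippet evaluates the sigmoid weight and uses it to blend values from several disk images into one synoptic-map pixel. The function names and the parameter values (a = 15°, b = 2°) are illustrative assumptions, not the values used for the catalog.

```python
import math


def sigmoid(x):
    """Numerically stable standard sigmoid 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def strip_weight(d, a, b):
    """Weight of a pixel at angular distance d (degrees) from the central meridian.

    a is the shift and b the scale parameter of sigmoid((-d + a) / b).
    """
    return sigmoid((-d + a) / b)


def blend(values, distances, a, b):
    """Weighted average of pixel values contributed by several disk images."""
    weights = [strip_weight(d, a, b) for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical shift and scale, chosen only for illustration.
A_DEG, B_DEG = 15.0, 2.0
for d in (0.0, 15.0, 30.0):
    print(d, strip_weight(d, A_DEG, B_DEG))
```

With these assumed values the weight is close to one near the central meridian, equals one half at d = a, and decays towards zero beyond it; a smaller b gives a sharper strip edge.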

To conclude this section, we note that the source code for the synoptic map construction is open-sourced in the GitHub repository https://github.com/observethesun/synoptic_maps, while the synoptic maps produced for each Carrington rotation are available in the catalog https://sun.njit.edu/coronal_holes/.

Figure 3. Concatenation of synoptic maps averaged over longitudes (vertical axis: latitude). Green vertical lines mark timestamps corresponding to CR 2098, CR 2145, and CR 2219 shown in Figure 2.

4. SEGMENTATION MODEL

We start with a brief description of the neural network model proposed by Illarionov & Tlatov (2018) and discuss how to apply it to the synoptic maps or, generally speaking, to input images of arbitrary shape.

The model is a typical U-Net convolutional model (Ronneberger et al. 2015). Figure 4 schematically shows the model architecture. It consists of two branches. The first branch compresses an input image via a set of convolutional and downsampling operations into a tensor with reduced spatial dimensions but an increased channel dimension. Each downsampling operation reduces the spatial dimensions by a factor of two, while each convolutional operation increases the number of channels by the same factor of two. The number of channels after the first convolutional operation (denoted K in Figure 4) is a parameter of the model. The model we use has K = 24. In total, the compression branch consists of four convolutional-downsampling steps. For example, for an input image of (256, 256) pixels and K = 24 the compression branch will result in a (16, 16, 384) tensor.
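The shape arithmetic of the compression branch can be checked with a few lines of Python. This is only a bookkeeping sketch of the tensor shapes (the function name is hypothetical); it does not build the network itself.

```python
def compression_shapes(height, width, k, steps=4):
    """Tensor shapes along the compression branch.

    Starts from the shape after the first convolution (k channels) and
    applies `steps` convolutional-downsampling steps, each halving the
    spatial dimensions and doubling the number of channels.
    """
    shapes = [(height, width, k)]
    for _ in range(steps):
        height, width, k = height // 2, width // 2, k * 2
        shapes.append((height, width, k))
    return shapes


for shape in compression_shapes(256, 256, 24):
    print(shape)
# The last printed shape is (16, 16, 384), as in the example in the text.
```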

Figure 4. The U-Net architecture with compression and decompression branches and skip-connections. The input images (e.g. solar disk image or synoptic map) have spatial dimensions N × M and C_in channels. Each convolutional-downsampling block compresses the spatial dimensions and increases the number of channels. The decompression branch acts as an inverse operation; the output images (e.g. segmentation mask) have spatial dimensions N × M and C_out channels.

[...] These binary masks, along with other products, are contained in the daily reports of the station. An archive of solar activity maps, including CH boundaries, is available at https://observethesun.com. Thus, the model training represents the semi-automated and manually controlled process of CH identification applied at the station.

We use the same convolutional kernels and other trainable parameters that were obtained by Illarionov & Tlatov (2018). This means that the presented results can be directly compared with the previous work.

There are some technical issues that we would like to mention. First, the synoptic maps presented in Sec. 3 have a spatial resolution of 720 × 360 pixels. The model was trained on the 256 × [...] × 180 pixels and apply spatial padding to obtain the target size of 512 × 256 pixels. The CNN model applied to the 512 × 256 input images produces segmentation masks of the same size, from which we extract the 360 × 180 region that contains the desired segmentation map for the synoptic map and is the final output.

Figure 5. A heatmap of CHs in the output of the CNN model for CR 2219 (shown in grayscale; vertical axis: latitude). Green lines correspond to the threshold value of 0.5 used for binarization. In the following figures, we show only the boundaries of the binarized heatmaps.
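The padding, inference, cropping, and binarization steps can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used for the catalog: `model` is a hypothetical callable that maps a 2D array to per-pixel scores of the same shape, the padding is centered and uses edge replication, and a strict inequality is used at the 0.5 threshold. Sizes quoted in the text as width × height (e.g. 512 × 256) appear here as NumPy shapes (rows, columns).

```python
import numpy as np


def segment_synoptic_map(image, model, target_shape=(256, 512), threshold=0.5):
    """Pad a synoptic map to the target size, run the model, crop, binarize.

    image: 2D array, rows = latitude, columns = longitude (e.g. 180 x 360).
    model: hypothetical callable returning scores in [0, 1], same shape as input.
    """
    rows, cols = image.shape
    pad_r, pad_c = target_shape[0] - rows, target_shape[1] - cols
    if pad_r < 0 or pad_c < 0:
        raise ValueError("image is larger than the target shape")
    top, left = pad_r // 2, pad_c // 2
    # Assumption: edge replication; the padding mode is not stated in the text.
    padded = np.pad(image, ((top, pad_r - top), (left, pad_c - left)), mode="edge")
    scores = model(padded)
    # Extract the region that corresponds to the original synoptic map.
    heatmap = scores[top:top + rows, left:left + cols]
    return heatmap > threshold


# Stand-in "model" for demonstration only: darker pixels get higher scores.
demo = np.linspace(0.0, 1.0, 180 * 360).reshape(180, 360)
mask = segment_synoptic_map(demo, lambda x: 1.0 - x)
print(mask.shape, mask.dtype)
```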

Figure 5 shows a sample segmentation map obtained using the CNN model. The model outputs a score for each pixel to be a part of a CH. The score ranges from 0 to 1. We apply thresholding at 0.5 to convert the heatmaps into binary masks. For example, Figure 6 shows that the identified CH boundaries correspond to visual expectation and accurately delineate the CH regions. In the next section, we provide a detailed analysis.

Figure 6. Overlaid synoptic maps and reconstructed CH boundaries for CR 2098 (a), CR 2145 (b), and CR 2219 (c). These are the same CRs as in Figure 2.

To demonstrate an additional application of the CNN model, we apply it to the pole-centric projections of the synoptic maps. The model inference in this case is the same as for the solar disk images. Figure 7 shows sample segmentation maps obtained for the polar projection inputs. For comparison, we put in the same figure the pole-centric projections of CHs obtained in synoptic maps. We note that both methods are in good agreement, as should be expected.

In the Appendix, we discuss a possible interpretation of the segmentation procedure within the CNN model from a physical point of view.

http://en.solarstation.ru/

Figure 7. CH boundaries (green lines) identified in the pole-centric input images (color background), in comparison with the pole-centric projections of the CH boundaries deduced from the synoptic maps (blue lines). Columns correspond to the same CRs as in Figure 2. Top and bottom rows show the North and South pole projections.

5. ANALYSIS

In this section, we demonstrate that the CH detection method is stable against parameters of the synoptic map construction, and investigate general physical properties of CHs.

The most essential parameter in the synoptic map construction is the strip width (in our notation it is represented by the shift and scale parameters). Indeed, wider strips result in smoother maps without finer details, while narrower strips preserve details but provide noisier maps. Another point is that, due to the limb-brightening effect, the strip width also affects the pixel intensity distribution. To avoid this effect we apply the histogram matching procedure as described in Sec. 3.

For the uncertainty estimation we consider all combinations of values of the shift parameter {[...]°, 13.[...]°, 19.[...]°, 26.[...]°, 33.[...]°, 39.[...]°} and the scale parameter {[...]}. Note that the extreme cases correspond approximately to the narrowest possible strip (about ±[...]° around the central meridian with a thin blending zone) and a case where each pixel of the synoptic map results from averaging of the 6 nearest disk images. In Figure 8 we show intervals between the smallest and largest total areas obtained for all parameter combinations. One can notice that the uncertainties are negligible. This important point allows us to conclude that the CH regions detected in the synoptic maps do not depend on a particular way of the map compilation, but represent stable and physical structures.

Figure 8 shows the CH areas as a function of time separately for the Northern and Southern hemispheres as well as for the polar (|θ| > [...]°) and low-latitude (|θ| ≤ [...]°) zones. Our choice of the separating boundary θ = ±[...]° is consistent with the work of Hess Webber et al. (2014). We take into account the contribution of individual pixels into each of these groups rather than attribute a whole CH based on the location of its center. Thus, pixels from the same CH may contribute to different groups.
We make two observations from the figure. First, there is an asymmetry between the North and South. We observe the hemispheric asymmetry both in time (the area of the Southern polar CHs decreases later and starts to increase earlier than the area of the Northern CHs) and in amplitude (the southern polar CHs demonstrate an increasing trend during the solar minimum between Cycles 24 and 25, while the northern CHs do not show this trend). Hess Webber et al. (2014) also demonstrated asymmetries in the polar CHs during the solar minimum between Cycles 23 and 24. Second, from the bottom panel, we find that the solar minimum manifests itself in an increase of both the polar and low-latitude areas of CHs. Moreover, while the areas of the polar CHs continue to increase, the low-latitude CH areas fluctuate near a constant value. This may be consistent with the ideas of the solar flux transport theory that magnetic fields migrate from low latitudes to the poles and accumulate there during solar minima (see Babcock 1961; Leighton 1969).
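The per-pixel attribution of CH area to the polar and low-latitude zones can be sketched as below. The sketch assumes a binary mask on a grid that is regular in latitude and longitude, with rows running from the north pole to the south pole; the function name and the boundary value used in the example (50°) are arbitrary illustrative choices, not the values of the paper.

```python
import numpy as np


def ch_area_fractions(mask, boundary_deg):
    """CH area as a fraction of the sphere, split by latitude zone.

    mask: 2D boolean array, rows from +90 deg (north) to -90 deg (south),
          assumed regular in latitude and longitude.
    boundary_deg: latitude separating polar and low-latitude zones.
    """
    n_lat, n_lon = mask.shape
    edges = np.linspace(90.0, -90.0, n_lat + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Exact fraction of the sphere covered by each latitude band.
    band = (np.sin(np.radians(edges[:-1])) - np.sin(np.radians(edges[1:]))) / 2.0
    per_row = mask.sum(axis=1) * band / n_lon
    return {
        "north_polar": per_row[centers > boundary_deg].sum(),
        "south_polar": per_row[centers < -boundary_deg].sum(),
        "low_latitude": per_row[np.abs(centers) <= boundary_deg].sum(),
    }


# Example with a synthetic mask covering a northern polar cap.
demo_mask = np.zeros((180, 360), dtype=bool)
demo_mask[:20, :] = True
fractions = ch_area_fractions(demo_mask, boundary_deg=50.0)
print({name: 100.0 * value for name, value in fractions.items()})  # % of sphere
```

Because each pixel is weighted by the area of its latitude band, a CH that straddles the boundary contributes to both zones, which is the behaviour described in the text.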

Figure 8. Upper panel: areas of the northern and southern polar CHs. Bottom panel: areas of the polar and low-latitude CHs. Both panels show the total CH area (% of sphere) versus date. The separating boundary between polar and low-latitude regions is θ = ±[...]°. Line width corresponds to uncertainties that arise from different parameters of the synoptic map construction.

Now we consider synoptic maps of CHs with respect to magnetic synoptic maps and construct time-latitude and time-longitude diagrams. We start with the time-latitude diagram that shows the ratio of unsigned magnetic flux in CHs to the unsigned magnetic flux integrated over all longitudes (Figure 9). We conclude from this plot that while the solar minimum is accompanied by an increase of the low-latitude CH areas (see Figure 8, lower panel), their contribution to the total unsigned flux is not dominant. In contrast, polar CHs generate almost the whole unsigned magnetic flux. Note that for the construction of this plot we thresholded the unsigned magnetic synoptic maps at 10 Gauss to avoid noise contribution.

Figure 9. Ratio of unsigned magnetic flux in CHs to the unsigned magnetic flux integrated over all longitudes.

For a more detailed investigation, we take into account the sign of the magnetic field. In Figure 10, the grayscale background is the magnetic field averaged over all longitudes, while blue and red colors show the magnetic field averaged over longitudes only in CH regions. Note that when averaging the CH magnetic field we filter out latitudes where CHs cover less than 20° of longitude in total to prevent plotting of statistically insignificant values. We find from this plot that polar latitudes have a prevalent sign of the magnetic field that is opposite in North and South and between solar cycles. Also, in agreement with Figure 9, we find that CHs at lower latitudes in the minimum between Cycles 24 and 25 have significantly lower magnetic fields in contrast to polar CHs.

Figure 10. Time-latitude diagram of the longitudinally averaged magnetic field in CH regions (shown in blue and red colors) and the magnetic field averaged over all longitudes (grayscale map); color bars show <B_r> in Gauss. Note that the neutral color in the red-blue color bar is not white but transparent, so that weak CH magnetic fields are not visible in the plot.
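For a single Carrington rotation, the two per-latitude quantities plotted in Figures 9 and 10 can be sketched as follows. The function names are hypothetical, and one detail is an assumption: the 10 Gauss thresholding is interpreted here as setting unsigned values below 10 G to zero. Within one latitude row all pixels have equal area, so the flux ratio reduces to a ratio of sums.

```python
import numpy as np


def flux_ratio_by_latitude(br, ch_mask, noise_gauss=10.0):
    """Ratio of unsigned flux inside CHs to unsigned flux over all longitudes.

    br: 2D array of radial field (Gauss), rows = latitude, columns = longitude.
    ch_mask: 2D boolean CH mask of the same shape.
    """
    unsigned = np.abs(br)
    # Assumption: values below the noise level are zeroed out.
    unsigned = np.where(unsigned >= noise_gauss, unsigned, 0.0)
    total = unsigned.sum(axis=1)
    in_ch = (unsigned * ch_mask).sum(axis=1)
    ratio = np.full(total.shape, np.nan)
    np.divide(in_ch, total, out=ratio, where=total > 0)
    return ratio


def mean_ch_field_by_latitude(br, ch_mask, min_coverage_deg=20.0):
    """Signed field averaged over CH pixels at each latitude.

    Latitudes where CHs cover less than min_coverage_deg of longitude are
    returned as NaN.
    """
    n_lon = br.shape[1]
    counts = ch_mask.sum(axis=1)
    coverage_deg = counts * 360.0 / n_lon
    sums = (br * ch_mask).sum(axis=1)
    mean = np.full(br.shape[0], np.nan)
    keep = coverage_deg >= min_coverage_deg
    mean[keep] = sums[keep] / counts[keep]
    return mean
```

Stacking these one-dimensional profiles over successive Carrington rotations would give the time-latitude diagrams.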

A detailed investigation of the results presented in Figure 10 can give insights into the origin of the CH open magnetic flux and its relation to the flux-transport mechanism. For example, Golubeva & Mordvinov (2017) associated CHs with decaying complexes of magnetic activity, while studies of Tlatov et al. (2014) and Huang et al. (2017) revealed pole-to-pole open flux migration. Hamada et al. (2018) presented a similar plot showing dominant polarity and relative areas of CHs for Cycles 23 and 24. To facilitate such investigations, we have constructed the CH catalog and made it publicly available.

Finally, we demonstrate time-longitude diagrams of the CH magnetic fields. Panels in Figure 11 correspond to three regions located at northern polar latitudes, low latitudes, and southern polar latitudes. The separating boundaries are θ = ±[...]° as in Figure 8. We observe that CH patterns are substantially different in the high- and low-latitude regions. At the high latitudes, we find large-scale structures that exist for about a year. This indicates that CHs form stable sector structures in the magnetic field distribution. In the low-latitude region, we find a mixture of two populations. Before 2015 (during the solar maximum) one can observe small-scale structures that exist for several months. After 2015 (during the solar minimum) we find characteristic stripe structures that can be traced for several years. A final remark on Figure 11 concerns the inclination of the structures across all three panels. The elongation from the bottom right to the top left (which we see in the high-latitude zones) means that the region rotates slower than the Carrington coordinate system. In contrast, the opposite elongation at the low latitudes means faster rotation. This is consistent with the general picture of the differential rotation of the Sun. However, a detailed analysis and rotation speed estimation are out of the scope of this paper.

Figure 11. Time-longitude diagrams of CH magnetic fields (shown in blue and red colors) in three latitudinal zones. Panel (a) is for high latitudes in the Northern hemisphere (θ > [...]°), panel (b) is for lower latitudes (|θ| ≤ [...]°), panel (c) is for high latitudes in the Southern hemisphere (θ < −[...]°). The grayscale background shows the magnetic field averaged over latitudes for each latitudinal zone. Note that the neutral color in the red-blue color bar is not white but transparent, so that weak CH magnetic fields are not visible in the plot.

6. CONCLUSIONS

We have demonstrated that a Convolutional Neural Network (CNN) model trained to identify CHs in solar disk images is capable of detecting CHs in solar synoptic maps without any additional adjustments. Being composed of only convolutional operations, the CNN processes images of any shape in the same way. This also implies that the local image content dominates over the global content (i.e. the segmentation result will be the same for portions of the image and the whole image). Due to these facts, one can expect that for the CNN it should be the same whether it sees the whole disk image, a partial disk image, or a synoptic map (we suppose that human interpretation acts similarly).

To illustrate this idea, we constructed a dataset of synoptic maps from the daily solar disk images used for model training. The process of synoptic map construction is not unique and contains free parameters. We have shown that the segmentation procedure is stable for a wide range of parameter values (Figure 8).

[...] of the CHs segmentation algorithms proposed earlier rely on this idea more or less explicitly. In this respect, the CNN model automatically finds this more or less reasonable and intuitive strategy.

Figure 12. Comparison of CNN and threshold-based segmentation. Columns correspond to CR 2098 (a), CR 2145 (b), and CR 2219 (c). These are the same CRs as in Figure 2. Top row shows the input synoptic maps. Second row demonstrates the CHs segmentation by the CNN model. Third row shows the equivalent, in terms of CHs pixel counts, threshold-based segmentation. Bottom row shows histograms of the pixel intensity distributions and the threshold levels which provide the equivalent segmentation in terms of the pixel counts.
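One simple way to obtain a threshold that is equivalent to a given CNN mask in terms of pixel counts is to pick the intensity of the n-th darkest pixel, where n is the number of CH pixels in the mask. The sketch below illustrates this idea; the function name is hypothetical, and it assumes that CHs correspond to the darkest pixels of the map.

```python
import numpy as np


def equivalent_threshold(image, cnn_mask):
    """Intensity level whose thresholding labels as many pixels as the CNN mask.

    Returns (level, mask), where mask = image <= level. If several pixels
    share the level value, the count may slightly exceed the CNN count.
    """
    n_ch = int(cnn_mask.sum())
    if n_ch == 0:
        return None, np.zeros(image.shape, dtype=bool)
    level = np.sort(image, axis=None)[n_ch - 1]
    return level, image <= level


# Small demonstration with made-up numbers.
demo_image = np.arange(12.0).reshape(3, 4)
demo_cnn = np.zeros((3, 4), dtype=bool)
demo_cnn[0, :] = True
demo_cnn[1, 0] = True
level, threshold_mask = equivalent_threshold(demo_image, demo_cnn)
print(level, int(threshold_mask.sum()))
```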

Now we want to go a step deeper and consider some synthetic cases. We noted in Figure 12 that, while being equivalent in terms of the CH pixel counts to the thresholding procedure, the CNN segmentation masks are not as noisy as the threshold-based ones. To investigate this fact in more detail, we generate a set of synthetic synoptic maps as Gaussian random structures with the radial exponential correlation function $K(r) = \exp(-r/r_0)$. Here $r$ is the distance between pixels in pixel units, and $r_0$ is the correlation radius. Varying $r_0$, we obtain a set of synthetic maps ranging from maps with almost uncorrelated noise for small $r_0$ to maps with large-scale correlated random structures for large $r_0$. For each map, we apply the histogram matching procedure and make its distribution similar to that of the solar synoptic map corresponding to CR 2219 (see the right column in Figure 12). Thus, for the threshold-based approach, each synthetic map contains the same number of pixels assigned to CHs (the threshold is also the same as for the synoptic map corresponding to CR 2219). Our goal is to compare this against the CNN model. In fact, we vary $r_0$ from 0.01 to 20 and use a sample of ten synthetic maps for each $r_0$. Figure 13 shows a sample of the synthetic maps for various $r_0$ and the corresponding segmentation maps.

Figure 13. Sample of synthetic synoptic maps and corresponding segmentation maps. Columns correspond to the correlation radius parameter $r_0$ = 0.01 (a), 10 (b) and 20 (c). Top row shows the synthetic synoptic maps. Middle row shows the segmentation maps obtained using the CNN model. Bottom row shows the threshold-based segmentation.

We note in Figure 13 that both segmentation methods give mostly similar results for large-scale structures, but substantially differ for small-scale structures. This is also a reasonable feature of the CNN model trained for the CH segmentation. Indeed, CHs are typically large-scale structures, so a proper model should take into account the size factor. While for a typical CH segmentation method a region filtering procedure is an explicit part of the algorithm, for the CNN model this step works automatically. Figure 14 demonstrates the number of pixels labeled as CHs against the scale factor (or the correlation radius $r_0$ in our notation). Note that for the threshold-based segmentation the pixel count is constant because each synthetic synoptic map has the same intensity distribution.
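A synthetic map of this kind can be generated, for example, by spectral filtering of white noise. The sketch below is one possible construction and not necessarily the one used for the figures: it assumes periodic boundaries in both directions, and it implements histogram matching by rank, i.e. the sorted values of the synthetic map are replaced by the sorted values of a reference map. Function names are hypothetical.

```python
import numpy as np


def synthetic_map(shape, corr_radius, rng):
    """Gaussian random field with correlation exp(-r / corr_radius).

    Assumption: periodic boundaries, so the covariance is circulant and
    can be diagonalized with the FFT.
    """
    rows, cols = shape
    dy = np.minimum(np.arange(rows), rows - np.arange(rows))[:, None]
    dx = np.minimum(np.arange(cols), cols - np.arange(cols))[None, :]
    kernel = np.exp(-np.hypot(dy, dx) / corr_radius)
    # Power spectrum of the field; clip tiny negative values from round-off.
    power = np.clip(np.fft.fft2(kernel).real, 0.0, None)
    noise = rng.standard_normal(shape)
    return np.fft.ifft2(np.fft.fft2(noise) * np.sqrt(power)).real


def match_histogram(source, reference):
    """Rank-based histogram matching of source to reference."""
    order = np.argsort(source, axis=None)
    ref_sorted = np.sort(reference, axis=None)
    # Resample reference quantiles in case the two maps differ in size.
    idx = np.linspace(0, ref_sorted.size - 1, order.size).round().astype(int)
    matched = np.empty(order.size, dtype=ref_sorted.dtype)
    matched[order] = ref_sorted[idx]
    return matched.reshape(source.shape)


rng = np.random.default_rng(0)
reference = rng.random((64, 128))  # stand-in for a real synoptic map
field = match_histogram(synthetic_map((64, 128), 5.0, rng), reference)
```

After matching, any fixed intensity threshold selects the same number of pixels in every synthetic map, which is why the threshold-based pixel count does not depend on the correlation radius.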

Figure 14. CH pixel counts for the CNN model (blue line) and the thresholding method (orange line). The horizontal axis shows the correlation radius $r_0$ used for synthetic synoptic map sampling. Gray color shows the min-max range within 10 samples.

REFERENCES