Alert Classification for the ALeRCE Broker System: The Light Curve Classifier
P. Sánchez-Sáez, I. Reyes, C. Valenzuela, F. Förster, S. Eyheramendy, F. Elorrieta, F. E. Bauer, G. Cabrera-Vives, P. A. Estévez, M. Catelan, G. Pignata, P. Huijse, D. De Cicco, P. Arévalo, R. Carrasco-Davis, J. Abril, R. Kurtev, J. Borissova, J. Arredondo, E. Castillo-Navarrete, D. Rodriguez, D. Ruz-Mieres, A. Moya, L. Sabatini-Gacitúa, C. Sepúlveda-Cobo, E. Camacho-Iñiguez
Draft version August 11, 2020
Typeset using LaTeX twocolumn style in AASTeX63
Affiliations:
Millennium Institute of Astrophysics (MAS), Nuncio Monseñor Sótero Sanz 100, Providencia, Santiago, Chile
Instituto de Astrofísica, Facultad de Física, Pontificia Universidad Católica de Chile, Casilla 306, Santiago 22, Chile
Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Diagonal Las Torres 2700, Peñalolén, Santiago, Chile
Center for Mathematical Modeling, Universidad de Chile, Beauchef 851, North building, 7th floor, Santiago 8320000, Chile
Department of Electrical Engineering, Universidad de Chile, Av. Tupper 2007, Santiago 8320000, Chile
Departamento de Matemáticas, Facultad de Ciencia, Universidad de Santiago de Chile, Av. Libertador Bernardo O'Higgins 3663, Estación Central, Santiago, Chile
Departamento de Astronomía, Universidad de Chile, Casilla 36D, Santiago, Chile
Centro de Astroingeniería, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, 7820436 Macul, Santiago, Chile
Space Science Institute, 4750 Walnut Street, Suite 205, Boulder, Colorado 80301
Department of Computer Science, University of Concepción, Edmundo Larenas 219, Concepción, Chile
Departamento de Ciencias Físicas, Universidad Andres Bello, Avda. Republica 252, Santiago, Chile
Informatics Institute, Universidad Austral de Chile, General Lagos 2086, Valdivia, Chile
Instituto de Física y Astronomía, Facultad de Ciencias, Universidad de Valparaíso, Gran Bretaña No. 1111, Playa Ancha, Valparaíso, Chile
European Southern Observatory (ESO), Alonso de Córdova 3107, Vitacura, Santiago, Chile
Centro de Estudios de Física del Cosmos de Aragón (CEFCA) - Unidad Asociada al CSIC, Plaza San Juan, 1, E-44001, Teruel, Spain

(Received; Revised; Accepted)
Submitted to AJ

ABSTRACT

We present the first version of the ALeRCE (Automatic Learning for the Rapid Classification of Events) broker light curve classifier. ALeRCE is currently processing the Zwicky Transient Facility (ZTF) alert stream, in preparation for the Vera C. Rubin Observatory. The ALeRCE light curve classifier uses variability features computed from the ZTF alert stream, and colors obtained from AllWISE and ZTF photometry. We apply a Balanced Hierarchical Random Forest algorithm with a two-level scheme, where the top level classifies each source as periodic, stochastic, or transient, and the bottom level further resolves each hierarchical class, yielding a total of 15 classes. This classifier corresponds to the first attempt to classify multiple classes of stochastic variables (including nucleus- and host-dominated active galactic nuclei, blazars, young stellar objects, and cataclysmic variables), in addition to different classes of periodic and transient sources, using real data. We created a labeled set using various public catalogs (such as the Catalina Surveys and Gaia DR2 variable stars catalogs, and the Million Quasars catalog), and we classify all objects with ≥ 6 g-band or ≥ 6 r-band detections in ZTF (868,371 sources as of 2020/06/09), providing updated classifications for sources with new alerts every day. For the top level we obtain macro-averaged precision and recall scores of 0.96 and 0.99, respectively, and for the bottom level we obtain macro-averaged precision and recall scores of 0.57 and 0.76, respectively. Updated classifications from the light curve classifier can be found at the ALeRCE Explorer website.

Corresponding author: P. Sánchez-Sáez, [email protected]
Keywords: galaxies: active – methods: data analysis – stars: variables: general – supernovae: general – surveys

1. INTRODUCTION

Brightness variations of astrophysical objects offer key insights into their physical emission mechanisms and related phenomena. In stars, pulsations, both radial and non-radial, can result from a thermodynamic engine operating in their partial ionization layers, when stars are located inside one of the several so-called instability strips that are found in the Hertzsprung-Russell diagram. Eruptive events can be generated by material being lost from a star, or occasionally accreted onto it, as is typical in protostars and young stellar objects (YSOs). Explosive events can occur when material is accreted onto compact objects, such as white dwarfs in the case of cataclysmic variables (CVs) or neutron stars in the case of X-ray binaries, or in stellar mergers. Brightness changes can also originate from the rotation of stars, caused by surface features such as starspots, and/or by stars' ellipsoidal shapes. Finally, eclipses can occur, depending on the observer's line of sight, due to the presence of binary companions, planets, and/or other circumstellar material. These and other classes of stellar variability are reviewed and summarized, for instance, in Catelan & Smith (2015), where extensive additional references can be found. In addition, there is a wide array of transients, such as kilonovae (Metzger et al. 2010), supernovae (SNe; Woosley et al. 2002), and tidal disruption events, which are beacons of destructive episodes in the life of a star (Komossa 2015). Galaxies, in turn, can also present a wide array of variability phenomena. In those hosting strongly accreting massive black holes (BHs), for instance, variations develop due to the stochastic nature of the accretion disk, corona, and jet emission, potentially related to both the BH properties and the structure of the material in the immediate vicinity (e.g., MacLeod et al. 2010; Caplar et al.
2017; Sánchez-Sáez et al. 2018).

To study the variability of individual variable and transient objects in detail, and to use this information to probe different variability and physical models, observations over a wide range of timescales are required. Hence, long and intensive campaigns targeting a large number of objects are crucial. In recent years, surveys covering a significant part of the sky, revisiting the same regions on timescales from days to years, and containing a large sample of serendipitous objects, have become available as predecessors of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al. 2019). Among these is the Zwicky Transient Facility (ZTF; Bellm 2014; Bellm et al. 2019), which had first light in 2017 and employs a powerful 47 deg² field-of-view camera mounted on the Samuel Oschin 48-inch Schmidt telescope. ZTF is designed to image the entire northern sky every three nights and scan the plane of the Milky Way twice each night to a limiting magnitude of 20.5 in gri, thus enabling a wide variety of novel multiband time series studies, in preparation for the LSST. LSST, which aims for first light in 2022, will revolutionize time domain astronomy, enabling for the first time the study of transient and variable objects over long periods of time (∼
10 years), to deep single-epoch limiting magnitudes (at the 5σ level), over a large sky area. Given the large number of sources that ZTF and LSST will observe, automated classification tools are required to exploit these alert streams.

2. DATA

2.1. Reference Data
ALeRCE has been processing the public ZTF alert stream, which includes g and r photometry, since May 2019. The ALeRCE pipeline is described in detail by Förster (2020); for clarity, we provide a brief description of the light curve construction process. The ALeRCE pipeline processes the ZTF Avro alert files. These files contain metadata and contextual information for a single event, which is defined as a flux-transient, a reoccurring flux-variable, or a moving object (Masci et al. 2019; for details, see https://zwickytransientfacility.github.io/ztf-avro-alert/). To construct light curves, the ALeRCE pipeline uses: the photometry of the difference image and reference image (detections); possible non-detections associated with the target during the 30 days preceding the event (5σ magnitude limit in the difference image based on PSF-fit photometry, called diffmaglim by ZTF); the real-bogus quality score reported by ZTF (rb, which ranges from 0 to 1, with values closer to 1 implying more reliable detections); and the morphological classification of the closest object obtained from PanSTARRS1 (Tachibana & Miller 2018). An overview of the pipeline is presented in Figure 1. In addition, Figure 2 shows an example of a SN light curve obtained using ZTF data. In summary, the different stages of the pipeline are:

1) Ingestion: the ZTF public stream is ingested using Kafka.
2) S3 upload: the alert Avro packets are stored in AWS S3 for later access.
3) Crossmatch: the position of the alert is used to query external catalogs.
4) Stamp classifier: alerts from new objects are classified using their image cutouts (stamps).
5) Preprocessing: the photometry associated with a given alert is corrected to take into account the use of difference image fluxes (see details below), and simple statistics associated with the aggregated light curve are computed.
6) Light curve features: advanced light curve statistics (features) are computed when there are at least six detections in a given band.
7) Light curve classifier: the light curve classifier described in this work is applied.
8) Outliers: an outlier detection algorithm is applied.
9) ALeRCE stream: the aggregated, annotated, and classified light curves are reported in a Kafka stream.

In step 3) we are experimenting with several catalogs, but for this work we use the AllWISE public Source Catalog (Wright et al. 2010; Mainzer et al. 2011; http://wise2.ipac.caltech.edu/docs/release/allwise/), invoking a match radius of 2 arcseconds, to obtain W1, W2, and W3 photometry (using magnitudes measured with profile-fitting photometry, e.g., w1mpro).

The preprocessing procedure (step 5) is described in detail in Förster (2020) (see Section A of their appendix). In particular, for the light curve classifier we use the corrected light curves (lc_corr; m̂_sci in Förster 2020) for sources whose closest counterpart in the reference image coincides with the location of the alert (within a radius of 1.4 arcseconds). It is important to use the corrected light curves for persistent variable sources, in order to take into account changes in the sign of the difference between the reference and the science images, or possible changes of the reference image. For the rest of the sources, which in general correspond to transients, we use the light curves obtained from the difference images (lc_diff; m_diff in Förster 2020).

2.2. Classification Taxonomy
The first version of the ALeRCE light curve classifier considers 15 subclasses of variable and transient objects, presented as a taxonomy tree, defined by the ALeRCE collaboration, in Figure 3. The taxonomy is subdivided in a hierarchical fashion according to both the physical properties of each class and the empirical variability properties of the light curves, as follows (in parentheses we indicate the class name used by the classifier):

• Transient: Type Ia supernova (SNIa), Type Ibc supernova (SNIbc), Type II supernova (SNII), and Super Luminous Supernova (SLSN);

• Stochastic: Type 1 Seyfert galaxy (AGN; i.e., host-dominated active galactic nuclei), Type 1 Quasar (QSO; i.e., nucleus-dominated active galactic nuclei), blazar (Blazar; i.e., beamed jet-dominated active galactic nuclei), Young Stellar Object (YSO), and Cataclysmic Variable/Nova (CV/Nova);

• Periodic: Long-Period Variable (LPV; includes regular, semi-regular, and irregular variable stars), RR Lyrae (RRL), Cepheid (CEP), eclipsing binary (E), δ Scuti (DSCT), and other periodic variable stars (Periodic-Other; this includes classes of variable stars that are not well represented in the labeled set, e.g., sources classified as Hump, Misc, Rotational, or RS CVn in CRTS).

It is important to note that there are a number of less common classes which have not been separated out yet in the ALeRCE taxonomy tree, because the number of cross-matched objects in these classes is too low to train a good classification model (e.g., SNe IIb, TDEs, and KNe, among others). There is a catch-all "Periodic-Other" class for periodic classes excluded from the taxonomy tree, but not for transient or stochastic classes; thus, for the moment, these missing classes are being grouped into one or more of the existing ones.
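The two-level scheme can be illustrated with a minimal sketch. This is not the actual ALeRCE implementation (which uses a Balanced Hierarchical Random Forest); it only shows one natural way to combine a top-level branch probability with a bottom-level conditional probability into a final subclass score, using the class lists of the taxonomy above and made-up probabilities:

```python
# Toy sketch of the two-level hierarchical scheme: the final score of a
# subclass is P(branch) * P(subclass | branch). Probabilities are invented.

TOP = ["transient", "stochastic", "periodic"]
BOTTOM = {
    "transient": ["SNIa", "SNIbc", "SNII", "SLSN"],
    "stochastic": ["AGN", "QSO", "Blazar", "YSO", "CV/Nova"],
    "periodic": ["LPV", "RRL", "CEP", "E", "DSCT", "Periodic-Other"],
}

def combine(top_probs, bottom_probs):
    """top_probs: branch -> prob; bottom_probs: branch -> {subclass: prob}."""
    scores = {}
    for branch, p_branch in top_probs.items():
        for sub, p_sub in bottom_probs[branch].items():
            scores[sub] = p_branch * p_sub
    return scores

# Hypothetical output for one object:
top = {"transient": 0.1, "stochastic": 0.2, "periodic": 0.7}
bottom = {
    "transient": {"SNIa": 0.7, "SNIbc": 0.1, "SNII": 0.15, "SLSN": 0.05},
    "stochastic": {"AGN": 0.2, "QSO": 0.5, "Blazar": 0.1, "YSO": 0.1,
                   "CV/Nova": 0.1},
    "periodic": {"LPV": 0.05, "RRL": 0.6, "CEP": 0.05, "E": 0.2,
                 "DSCT": 0.05, "Periodic-Other": 0.05},
}
scores = combine(top, bottom)
best = max(scores, key=scores.get)  # most probable of the 15 subclasses
```

Because each bottom-level distribution sums to one within its branch, the combined scores over all 15 subclasses also sum to one.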
Figure 1. A scheme of the ALeRCE pipeline. ZTF alerts are ingested using Kafka and a series of sequential and parallel steps are initiated. Alerts are stored in AWS S3, classified based on their image stamps, crossmatched with other catalogs, and their photometry corrected to take into account difference fluxes. Aggregated light curves are used to compute basic statistics (for internal use) and, if enough data points exist, features are computed, and the light curve and outlier classifiers are applied before sending an output stream. A PostgreSQL database is populated along the way, which can then be queried.
Figure 2. A SN light curve queried from the previously populated database (after stage 5 of the pipeline is performed). Green circles and red squares indicate the g and r bands, respectively. Error bars indicate photometry associated with detections, with consecutive measurements connected by straight solid line segments. Triangles denote limiting magnitudes.

2.3. Labeled Set
The labeled set (i.e., the set of sources used to define the training and testing sets) for the light curve classifier was built using sources observed by ZTF with known labels obtained via spectroscopic and/or photometric analysis by previous works. Further description of the labeled set construction strategy can be found in Förster (2020). We obtained labels from the following catalogs: the ASAS-SN catalog of variable stars (ASASSN; Jayasinghe et al. 2018, 2019a,b, 2020), the Catalina Surveys Variable Star Catalogs (CRTS; Drake et al. 2014; Drake et al. 2017), the LINEAR catalog of periodic light curves (LINEAR; Palaversa et al. 2013), Gaia Data Release 2 (Gaia DR2; Mowlavi et al. 2018; Rimoldini et al. 2019), the Transient Name Server database (TNS; https://wis-tns.weizmann.ac.il/), the Roma-BZCAT Multi-Frequency Catalog of Blazars (ROMABZCAT; Massaro et al. 2015), the Million Quasars Catalog (MILLIQUAS, version 6.4c, December 2019; Flesch 2015, 2019), the New Catalog of Type 1 AGNs (Oh2015; Oh et al. 2015), and the SIMBAD database (Wenger et al. 2000). Some additional CV labels were obtained from different catalogs (including Ritter & Kolb 2003), compiled by Abril et al. (2020) (JAbril).
Figure 3. Taxonomy tree used in the current version of the ALeRCE light curve classifier. (Illustrations: @wandering_astro)
Table 1 lists the number of sources in the labeled set belonging to each class (with their corresponding percentages within their hierarchical group), and the catalogs from which the classifications were obtained. Only sources with ≥ 6 detections in g or ≥ 6 detections in r were included (considering data obtained until 2020/06/09). It is clear from the table that there is a high imbalance in the labeled set, with some classes representing less than 5% of their respective hierarchical group. Figure 4 shows the (ordered) number of sources per class for the labeled set, and Figure 5 shows the fraction of sources in each class with photometry only in the g band, only in the r band, or in both bands.

3. FEATURES USED BY THE CLASSIFIER

The light curve classifier uses a total of 152 features. We avoid including features that require a long time to compute, for example features that require the use of Markov chain Monte Carlo techniques, since one of the goals of the light curve classifier is to provide a fast and highly scalable classification. 146 of these features are computed using the public ZTF g and r data. We excluded the mean magnitude as a feature, to avoid any bias in the labeled set magnitude distribution affecting the classification of sources that are fainter (or brighter). Features obtained using the ZTF observed magnitudes are called detection features (124 in total), and features computed using the ZTF non-detection 5σ magnitude limits (diffmaglim) are called non-detection features (18 in total). These features are described in the following sections (3.1 and 3.2), as well as in Appendix A. Considering the LSST Data Products Definition Document (Jurić et al. 2019), we expect that all of these features will be measurable using LSST data.

Figure 4. Number of sources per class for the labeled set, as reported in Table 1.
Table 1. Labeled set definition.

Hierarchical Class | Class | Number (†) | Source Catalogs
Transient | SNIa | 1272 (74.0%) | TNS
Transient | SNIbc | 94 (5.5%) | TNS
Transient | SNII | 328 (19.1%) | TNS
Transient | SLSN | 24 (1.4%) | TNS
Transient | Total | 1718 |
Stochastic | QSO | 26168 (75.4%) | MILLIQUAS (sources with class "Q")
Stochastic | AGN | 4667 (13.4%) | Oh2015, MILLIQUAS (sources with class "A")
Stochastic | Blazar | 1267 (3.6%) | ROMABZCAT, MILLIQUAS (sources with class "B")
Stochastic | YSO | 1740 (5.0%) | SIMBAD
Stochastic | CV/Nova | 871 (2.5%) | TNS, ASASSN, JAbril
Stochastic | Total | 34713 |
Periodic | LPV | 14076 (16.2%) | CRTS, ASASSN, Gaia DR2
Periodic | E | 37901 (43.5%) | CRTS, ASASSN, LINEAR
Periodic | DSCT | 732 (0.8%) | CRTS, ASASSN, LINEAR, Gaia DR2
Periodic | RRL | 32482 (37.3%) | CRTS, ASASSN, LINEAR, Gaia DR2
Periodic | CEP | 618 (0.7%) | CRTS, ASASSN
Periodic | Periodic-Other | 1256 (1.4%) | CRTS, LINEAR
Periodic | Total | 87065 |

† Values in parentheses correspond to the fraction of sources of a given class (second column) within its corresponding hierarchical class (first column).
Figure 5. For the sources in the labeled set, this figure shows the fraction of sources in each class with photometry: only in the g band (green); only in the r band (red); or in both bands (grey). The reasons for the non-uniformity of coverage may be physical (strongly red or blue source) or organizational (survey focused on one band only). For most classes, the vast majority of the sources have photometry in both g and r; the exceptions are the YSO, LPV, and CEP classes, where only 76% of the sources have photometry in both bands.

We also included as features the galactic coordinates of each target (gal_b and gal_l), the W1−W2 and W2−W3 AllWISE colors, and the g−W2, g−W3, r−W2, and r−W3 colors, where g and r are computed as the mean magnitudes of the g-band and r-band light curves of a given source. In addition, we use information included in the Avro file metadata: the sgscore1 parameter, which corresponds to a morphological star/galaxy score of the closest source from PanSTARRS1 (Tachibana & Miller 2018) reported in the ZTF Avro files, with 0 ≤ sgscore1 ≤ 1, where values closer to 1 imply a higher likelihood of the source being a star; and the median rb (real-bogus) parameter for each band. As we mentioned in Section 2.1, in this work we only consider light curves with ≥ 6 detections in g or ≥ 6 detections in r. If a given source has ≥ 6 detections in only one band, the features are computed for that band alone.
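The color features described above are simple differences of mean magnitudes. A minimal sketch (the helper name and inputs are hypothetical, not the pipeline's API) is:

```python
import numpy as np

# Sketch of the color features: mean g and r magnitudes from the light
# curves combined with AllWISE W1, W2, W3 photometry.
def color_features(g_mags, r_mags, w1, w2, w3):
    g_mean = float(np.mean(g_mags))
    r_mean = float(np.mean(r_mags))
    return {
        "g-r_mean": g_mean - r_mean,
        "W1-W2": w1 - w2,
        "W2-W3": w2 - w3,
        "g-W2": g_mean - w2,
        "g-W3": g_mean - w3,
        "r-W2": r_mean - w2,
        "r-W3": r_mean - w3,
    }

# Invented magnitudes for one source:
feats = color_features([18.2, 18.4, 18.3], [17.9, 18.1, 18.0],
                       w1=15.0, w2=14.5, w3=12.0)
```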
3.1. Detection Features
Most of the features used by the light curve classifier are computed using the observed magnitudes in the g and r bands (i.e., the detections). There are 56 different features computed for each band, and 12 features computed using a combination of both bands, yielding a total of 124 detection features. The definitions of all these features can be found in Table 2. We split the table into three blocks. The first block contains new features defined in this work (i.e., novel features); some of these are further described in Section 3.1.1. The second block contains features that correspond to new variants of descriptors included in other works; some of them are further described in Section 3.1.1 and in Appendix A. Finally, the third block includes 22 features that come from the Feature Analysis for Time Series (FATS; Nun et al. 2015) Python package. Hereafter, features ending with "_1" are computed for the g band, and features ending with "_2" are computed for the r band, following the notation used in the ZTF Avro files.

Table 2. List of detection features used by the light curve classifier. Features marked with ⋄ are computed using both g and r bands at the same time. Features marked with * and ** are further described in Section 3.1.1 and Appendix A, respectively.

Feature | Description | Reference
delta_period | Absolute value of the difference between the Multiband_period and the MHAOV period obtained using a single band | This work
IAR_phi* | Level of autocorrelation using a discrete-time representation of a DRW model | Eyheramendy et al. (2018)
MHPS parameters* | Obtained from a MHPS analysis (three in total) | Arévalo et al. (2012)
positive_fraction | Fraction of detections in the difference images of a given band which are brighter than the template image | This work
Power_rate* ⋄ | Ratio between the power of the multiband periodogram obtained for the best period candidate (P) and that obtained at 2P, 3P, 4P, P/2, P/3, and P/4 | This work
PPE* ⋄ | Multiband Periodogram Pseudo Entropy | This work
(g-r)_max ⋄ | g−r color obtained using the brightest lc_diff magnitude in each band | This work
(g-r)_max_corr ⋄ | g−r color obtained using the brightest lc_corr magnitude in each band | This work
(g-r)_mean ⋄ | g−r color obtained using the mean lc_diff magnitude of each band | This work
(g-r)_mean_corr ⋄ | g−r color obtained using the mean lc_corr magnitude of each band | This work
delta_mag_fid | Difference between maximum and minimum observed magnitude in a given band | This work
ExcessVar** | Measure of the intrinsic variability amplitude | Allevato et al. (2013)
GP_DRW_tau** | Relaxation time τ from DRW modeling | Graham et al. (2017)
GP_DRW_sigma** | Amplitude of the variability at short timescales (t ≪ τ), from DRW modeling | Graham et al. (2017)
Harmonics parameters* | Obtained by fitting a harmonic series up to the seventh harmonic (14 in total) | Stellingwerf & Donohoe (1986)
Multiband_period* ⋄ | Period obtained using the multiband MHAOV periodogram | Mondrik et al. (2015)
Pvar** | Probability that the source is intrinsically variable | McLaughlin et al. (1996)
SF_ML_amplitude** | rms magnitude difference of the SF, computed over a 1 yr timescale | Schmidt et al. (2010)
SF_ML_gamma** | Logarithmic gradient of the mean change in magnitude | Schmidt et al. (2010)
SPM features* | Supernova parametric model features (seven in total) | Villar et al. (2019b)
Amplitude | Half of the difference between the median of the maximum 5% and of the minimum 5% magnitudes | Richards et al. (2011)
AndersonDarling | Test of whether a sample of data comes from a population with a specific distribution | Nun et al. (2015)
Autocor_length | Lag value where the auto-correlation function becomes smaller than Eta_e | Kim et al. (2011)
Beyond1Std | Percentage of points with photometric mag that lie beyond 1σ from the mean | Richards et al. (2011)
Con | Number of three consecutive data points brighter/fainter than 2σ of the light curve | Kim et al. (2011)
Eta_e | Ratio of the mean of the squares of successive mag differences to the variance of the light curve | Kim et al. (2014)
Gskew | Median-based measure of the skew | Nun et al. (2015)
LinearTrend | Slope of a linear fit to the light curve | Richards et al. (2011)
MaxSlope | Maximum absolute magnitude slope between two consecutive observations | Richards et al. (2011)
Meanvariance | Ratio of the standard deviation to the mean magnitude | Nun et al. (2015)
MedianAbsDev | Median discrepancy of the data from the median data | Richards et al. (2011)
MedianBRP | Fraction of photometric points within amplitude/10 of the median mag | Richards et al. (2011)
PairSlopeTrend | Fraction of increasing first differences minus fraction of decreasing first differences over the last 30 time-sorted mag measures | Richards et al. (2011)
PercentAmplitude | Largest percentage difference between either the max or min mag and the median mag | Richards et al. (2011)
Psi_CS | Range of a cumulative sum applied to the phase-folded light curve | Kim et al. (2011)
Psi_eta | Eta_e index calculated from the folded light curve | Kim et al. (2014)
Q31 | Difference between the 3rd and the 1st quartile of the light curve | Kim et al. (2014)
Rcs | Range of a cumulative sum | Kim et al. (2011)
Skew | Skewness measure | Richards et al. (2011)
SmallKurtosis | Small sample kurtosis of the magnitudes | Richards et al. (2011)
Std | Standard deviation of the light curve | Nun et al. (2015)
StetsonK | Robust kurtosis measure | Kim et al. (2011)
3.1.1. Further description of some relevant detection features

Table 2 summarizes the definitions of the detection features used by the light curve classifier. However, some of these features are worth describing in more detail, either because they are new features, or because the changes we made to them are sufficiently relevant that further explanation is in order (see also Appendix A):

• Multiband period: The period is estimated using the Multi Harmonic Analysis of Variance (MHAOV) periodogram (Schwarzenberg-Czerny 1996), which is based on fitting periodic orthogonal polynomials to the data. A single period estimate per light curve is computed by fitting both bands using the MHAOV multiband extension proposed by Mondrik et al. (2015). We denote this period as Multiband_period. For sources with detections only in g or only in r, the Multiband_period reports the single-band period. To avoid overfitting the data when few samples are available, we set the number of harmonics to one. This might not capture the best period for non-sinusoidal light curves, e.g., detached and semi-detached eclipsing binaries, returning a harmonic instead. We found that having a harmonic of the true period is in general sufficient to classify non-sinusoidal light curves correctly, given that other features such as the power rate are included. We chose MHAOV for this analysis as it provides a good trade-off between performance and computational complexity.

• Periodogram Pseudo Entropy: To have an estimate of the confidence of the candidate period (obtained with the multiband MHAOV method), we developed a heuristic based on the entropy of the normalized periodogram peaks, which we denote the periodogram pseudo entropy (PPE). This value is computed by recovering the 100 largest values of the periodogram, normalizing them so that they add to 1, and computing the entropy of that vector:

PPE = 1 + \frac{1}{\log(100)} \sum_{i=1}^{100} \frac{p_i}{Z} \log\left(\frac{p_i}{Z}\right),   (1)

where p_i is the value of the i-th largest peak of the periodogram and Z = \sum_{i=1}^{100} p_i. This feature takes values between zero (no clear period stands out) and one (the periodogram has a single large peak).

• Power Rate: Ratio between the power of the multiband periodogram obtained for the best period candidate (P) and the power of the multiband periodogram obtained for n times this period [Power_rate_n = Power(P)/Power(n × P)], where n can take values of 2, 3, 4, 1/2, 1/3, and 1/4. We computed these ratios in order to detect cases where we measure an aliased multiple of the period instead of the true period, which is a particularly common issue for some classes of eclipsing binaries (e.g., Catelan et al. 2013; Graham et al. 2013; McWhirter et al. 2018; VanderPlas 2018, and references therein).

• Harmonics parameters: Harmonic series (Stellingwerf & Donohoe 1986) are commonly used to model and classify periodic light curves (Debosscher et al. 2007; Sarro et al. 2009; Richards et al. 2011; Elorrieta et al. 2016). In this work we fit a harmonic series up to the seventh harmonic, according to the expression

y(t_j) = \sum_{k=1}^{7} \left[ A_k \cos\left(\frac{2\pi k t_j}{P}\right) + B_k \sin\left(\frac{2\pi k t_j}{P}\right) \right] + C,   (2)

where t_j corresponds to the observational time of the j-th detection, P is the best candidate period computed from the multiband periodogram as above, and y(t_j) is the magnitude estimated by the harmonic model. Even though we use the Multiband_period, the harmonic model is computed using the detections in each band independently. The parameters A_k, B_k, and C for k = 1, ..., 7 are found by minimizing the weighted squared error

\min_{A_k, B_k, C} \sum_{j=1}^{J} \frac{\left[\mathrm{mag}(t_j) - y(t_j)\right]^2}{\mathrm{sigma}(t_j)^2},   (3)

where J is the number of observations, mag(t_j) is the observed magnitude at time t_j, and sigma(t_j) is the observational error at time t_j. The solution to the weighted least squares optimization problem is found using the Moore-Penrose pseudoinverse. This solution has the additional property of having the minimum Euclidean norm when the problem is underdetermined (Ben-Israel & Greville 2003), which in this case corresponds to having fewer than 15 observations. Once the parameters are learnt, Equation 2 can be rewritten as

y(t_j) = \sum_{k=1}^{7} M_k \cos\left(\frac{2\pi k t_j}{P} - \phi_k\right) + C,   (4)

with M_k = \sqrt{A_k^2 + B_k^2} and \phi_k = \arctan(B_k/A_k). In this way, the harmonics are now described by the amplitude and phase of each component.
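The PPE and Power rate features both derive from the periodogram values. A minimal numpy sketch (toy periodograms, not the MHAOV implementation) of Eq. 1 and of the Power_rate_n ratio is:

```python
import numpy as np

def ppe(power, n_top=100):
    """Periodogram Pseudo Entropy (Eq. 1): close to 1 for a single
    dominant peak, close to 0 when no period stands out."""
    p = np.sort(np.asarray(power, dtype=float))[-n_top:]  # 100 largest values
    q = p / p.sum()                                       # normalize to sum 1
    return 1.0 + float(np.sum(q * np.log(q))) / np.log(n_top)

def power_rate(power_best, power_at_multiple):
    """Power_rate_n = Power(P) / Power(n x P), n in {2, 3, 4, 1/2, 1/3, 1/4}."""
    return power_best / power_at_multiple

flat = np.ones(200)           # no period stands out -> PPE ~ 0
peaked = np.full(200, 1e-6)
peaked[50] = 1.0              # one dominant peak   -> PPE ~ 1
```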
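The harmonic fit of Eqs. 2-4 can be sketched as a weighted least-squares problem solved with the Moore-Penrose pseudoinverse, as described above. The function name and the synthetic light curve are illustrative only:

```python
import numpy as np

def fit_harmonics(t, mag, sigma, P, n_harm=7):
    """Weighted least-squares harmonic fit (Eqs. 2-4) via np.linalg.pinv.
    Returns amplitudes M_k, phases phi'_k shifted to zero phase in the
    first harmonic, and the constant C."""
    t = np.asarray(t, float)
    mag = np.asarray(mag, float)
    w = 1.0 / np.asarray(sigma, float)            # row weights 1/sigma(t_j)
    omega = 2.0 * np.pi * np.arange(1, n_harm + 1) / P
    X = np.hstack([np.cos(np.outer(t, omega)),
                   np.sin(np.outer(t, omega)),
                   np.ones((len(t), 1))])
    coef = np.linalg.pinv(X * w[:, None]) @ (mag * w)
    A, B, C = coef[:n_harm], coef[n_harm:2 * n_harm], float(coef[-1])
    M = np.hypot(A, B)                            # M_k = sqrt(A_k^2 + B_k^2)
    phi = np.arctan2(B, A)                        # phi_k
    phi_shifted = phi - np.arange(1, n_harm + 1) * phi[0]  # phi'_k = phi_k - k*phi_1
    return M, phi_shifted, C

# Recover a pure sinusoid: mag = 15 + 0.5 cos(2 pi t / P)
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 30.0, 80))
P = 1.7
mag = 15.0 + 0.5 * np.cos(2.0 * np.pi * t / P)
M, phi, C = fit_harmonics(t, mag, np.full_like(t, 0.01), P)
```

With fewer than 15 observations the system is underdetermined, and the pseudoinverse returns the minimum-norm solution mentioned in the text.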
Themodel is shifted in time in order to have zero phasein the first harmonic, which is done following theexpression φ (cid:48) k = φ k − kφ , replacing φ k by φ (cid:48) k inEq. 4.Finally, the parameters M k for k = 1 , . . . , φ (cid:48) k for i = 2 , . . . , Harmonics mag 1 ,...,
Harmonics mag 7 , Harmonics phase 2 , ...,
Harmonics phase 7 , Harmonics mse , respectively. • Supernova parametric model (SPM): Villar et al.(2019b) introduced an analytic model describingSN light curves as a six parameter function, whichthey used to characterize and classify SN lightcurves from the Pan-STARRS1 Medium-deep Sur-vey (Chambers et al. 2016). This model is anextension of previous empirical efforts to analyti-cally describe supernova light curves, including theeffects of different explosion times, normalizationfactors, initial rise timescales, rate of decline afterpeak, plateau lengths, or tail decay timescales. Weintroduce two modifications to this model. Firstwe reparametrize the function to always remainpositive in a simple validity range, i.e., one definedby a set of inequalities involving at most one vari-able per inequality, changing the validity range ac-cordingly. After the first modification, the modelis the following: F = A (cid:16) − β (cid:48) t − t t − t (cid:17) (cid:16) − t − t τ rise (cid:17) if t < t A (1 − β (cid:48) ) exp (cid:16) − t − t τ fall (cid:17) (cid:16) − t − t τ rise (cid:17) if t ≥ t , (5) he ALeRCE Light Curve Classifier γ ≡ t − t as a parameter in-stead of t . This function is positive valued when A > γ > τ rise > τ fall > < β (cid:48) < σ ( t ) = 1 / (1 + exp( − t )), which allows a soft tran-sition between zero and one. As the parameter t defines the transition between the two pieces of themodel in Eq. 5, it cannot be optimized properlyusing first-order methods. Our proposed modelallows using this technique effectively to learn theparameters of the model, which is given by thefollowing equation: F = A (cid:16) − β (cid:48) t − t t − t (cid:17) (cid:16) − t − t τ rise (cid:17) · (cid:20) − σ (cid:18) t − t (cid:19)(cid:21) + A (1 − β (cid:48) ) exp (cid:16) − t − t τ fall (cid:17) (cid:16) − t − t τ rise (cid:17) · (cid:20) σ (cid:18) t − t (cid:19)(cid:21) . 
In this particular model, for all the sources we use the light curves based on the difference images (lc_diff). This is done to avoid contamination from unrelated host-galaxy emission, which can distort the real shape of the SN light curves. We also subtract from t the MJD value of the first detection observed for a given source. We computed A (SPM_A), β′ (SPM_beta), t_0 (SPM_t0), γ (SPM_gamma), τ_rise (SPM_tau_rise), and τ_fall (SPM_tau_fall) for each band independently. In addition, we computed the reduced χ² of the fit for the light curve, denoted as SPM_chi. The parameters are found using the function curve_fit provided by the Scipy library (Virtanen et al. 2020).

• Irregular autoregressive (IAR) model: Eyheramendy et al. (2018) introduced this model. It is a discrete-time representation of the continuous autoregressive model of order 1 [CAR(1) or DRW], which has desirable statistical properties such as strict stationarity and ergodicity without a distributional assumption. The IAR model is defined by

y_{t_j} = φ^{t_j − t_{j−1}} y_{t_{j−1}} + σ sqrt(1 − φ^{2(t_j − t_{j−1})}) ε_{t_j},   (7)

where ε_{t_j} is a white noise sequence with zero mean and unit variance, σ is the standard deviation of y_{t_j}, and {t_j} are the observational times for j = 1, ..., n. We used a modified version of the IAR model, which considers the estimated variance of the measurement errors δ²_{t_j} in the likelihood of the model. Thus, by assuming a Gaussian distribution, the negative log-likelihood of the process is given by

ℓ(θ) = (n/2) log(2π) + (1/2) Σ_{j=1}^{n} log ν_{t_j} + (1/2) Σ_{j=1}^{n} e²_{t_j}/ν_{t_j},   (8)

where e_{t_1} = y_{t_1}, ν_{t_1} = σ² + δ²_{t_1}, and ŷ_{t_1} = 0 are the initial values, while ŷ_{t_j} = φ^{t_j − t_{j−1}} y_{t_{j−1}}, e_{t_j} = y_{t_j} − ŷ_{t_j}, and ν_{t_j} = σ²(1 − φ^{2(t_j − t_{j−1})}) + δ²_{t_j} for j = 2, ..., n. In particular, φ describes the autocorrelation function of order 1 for a given light curve.
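A minimal numerical sketch of Eqs. 7 and 8 (the observation times, φ, and noise levels below are illustrative): simulate an irregularly sampled IAR series, then recover φ by minimizing the negative log-likelihood, which is in essence how the IAR_phi feature is obtained.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)

# Irregular observation times and (illustrative) true parameters.
t = np.sort(rng.uniform(0.0, 200.0, 150))
phi_true, sigma, delta = 0.9, 1.0, 0.1      # delta: measurement-error std

# Simulate the IAR recursion of Eq. 7, then add measurement noise.
y = np.empty(t.size)
y[0] = sigma * rng.normal()
for j in range(1, t.size):
    a = phi_true ** (t[j] - t[j - 1])
    y[j] = a * y[j - 1] + sigma * np.sqrt(1.0 - a**2) * rng.normal()
y_obs = y + delta * rng.normal(size=t.size)

def neg_log_like(phi, t, y, sigma, delta):
    """Negative log-likelihood of Eq. 8, with measurement variance delta^2."""
    a = phi ** np.diff(t)                    # phi^(t_j - t_{j-1})
    e = np.concatenate(([y[0]], y[1:] - a * y[:-1]))
    nu = np.concatenate(([sigma**2 + delta**2],
                         sigma**2 * (1.0 - a**2) + delta**2))
    n = y.size
    return 0.5 * (n * np.log(2.0 * np.pi) + np.sum(np.log(nu)) + np.sum(e**2 / nu))

res = minimize_scalar(neg_log_like, bounds=(1e-4, 0.9999), method="bounded",
                      args=(t, y_obs, sigma, delta))
iar_phi = res.x    # analogue of the IAR_phi feature
```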
We computed the maximum likelihood estimation of the parameter φ (obtained directly from the light curves), and we used this as a feature for our classifier. We denote this parameter as IAR_phi.

• Mexican hat power spectrum (MHPS): Arévalo et al. (2012) proposed a method to compute low-resolution power spectra from data with gaps, where the light curves are convolved with a Mexican hat filter: F(x) ∝ [1 − (x/σ)²] e^{−x²/(2σ²)}. Gaps, or generally uneven sampling, are corrected for by convolving a unit-valued mask with the same sampling as the light curve and dividing the convolved light curve by it. This method can be used to isolate structures with a characteristic timescale (t ∼ √2 πσ) in a given light curve, in order to estimate the light curve variance associated with that timescale. We compute the light curve variance at two different timescales of 10 and 100 days. The variance associated with the 10 day timescale ("high" frequency) is denoted MHPS_high, while the variance associated with the 100 day timescale ("low" frequency) is denoted MHPS_low. We also compute the ratio between the low and high frequency variances for a given band, denoted as MHPS_ratio. The logarithm of MHPS_ratio is therefore an estimate of the power-law slope of the power spectrum of the light curve.
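The filtering-plus-mask correction described above can be sketched as follows. This is a simplified, regular-grid illustration of the Arévalo et al. (2012) approach, in which the Mexican hat is approximated as a difference of two Gaussian smoothings; the grid step, the width ratio of the two Gaussians, and the toy light curve are our assumptions, not values from the paper.

```python
import numpy as np

def _gauss_smooth(vals, mask, step, width):
    """Gaussian smoothing with the gap correction of Arevalo et al. (2012):
    smooth the masked data and the unit mask identically, then divide."""
    x = np.arange(-4.0 * width, 4.0 * width + step, step)
    g = np.exp(-0.5 * (x / width) ** 2)
    g /= g.sum()
    num = np.convolve(vals * mask, g, mode="same")
    den = np.convolve(mask, g, mode="same")
    out = np.full_like(num, np.nan)
    ok = den > 0
    out[ok] = num[ok] / den[ok]
    return out

def mhps_variance(t, y, timescale, step=1.0):
    """Light-curve variance at `timescale` (days); the difference of a
    narrow and a wide Gaussian smoothing approximates the Mexican hat."""
    grid = np.arange(t.min(), t.max() + step, step)
    vals = np.zeros(grid.size)
    mask = np.zeros(grid.size)
    i = np.clip(np.searchsorted(grid, t), 0, grid.size - 1)
    vals[i] = y - np.mean(y)
    mask[i] = 1.0
    sigma = timescale / (np.sqrt(2.0) * np.pi)  # filter width for this timescale
    eps = 0.25                                   # Gaussian width ratio (assumed)
    wide = _gauss_smooth(vals, mask, step, sigma * (1.0 + eps))
    narrow = _gauss_smooth(vals, mask, step, sigma / (1.0 + eps))
    return np.nanvar((narrow - wide)[mask > 0])

# Toy irregular light curve: a slow sinusoid sampled with gaps.
rng = np.random.default_rng(1)
t = np.sort(rng.choice(np.arange(500.0), 200, replace=False))
y = np.sin(2.0 * np.pi * t / 100.0) + 0.05 * rng.normal(size=t.size)

mhps_low = mhps_variance(t, y, 100.0)    # MHPS_low analogue
mhps_high = mhps_variance(t, y, 10.0)    # MHPS_high analogue
mhps_ratio = mhps_low / mhps_high
```

For this 100 day sinusoid most of the variance is captured at the 100 day timescale, so the ratio comes out well above one, as expected for a red power spectrum.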
Non-detection Features
For each detection, the ZTF alert stream includes 5σ magnitude limits (diffmaglim), which are computed from the g and r difference images of the same area of the sky obtained in the previous 30 days, where the target associated with the alert was not detected (non-detections). These non-detections are very informative, since they can, for instance, inform us whether a transient has not been detected before; whether a non-variable source has begun to exhibit variable behavior; or which range of observed magnitudes we should expect to measure when there are no significant differences between the science and template images and an alert is not generated. The light curve classifier uses nine different features defined using all the non-detections associated with a given source, computed for both the g and r bands, yielding a total of 18 non-detection features. Note that all of these features are new, and have not been used before for classification. Table 3 lists the non-detection features used by the light curve classifier. As before, non-detection features ending with "_1" are computed for the g band, and features ending with "_2" are computed for the r band.

4. CLASSIFICATION ALGORITHMS

The labeled set used in this work presents a very high imbalance (see Table 1). For instance, QSOs represent 75.4% of the stochastic sources, while CV/Novae represent just 2.5%. To deal with this issue, we looked for ML algorithms available in the literature that are designed to mitigate the imbalance problem. In particular, we worked with the imbalanced-learn Python package (Lemaître et al. 2017). Imbalanced-learn includes implementations of several re-sampling algorithms that are commonly used to handle data sets with strong between-class imbalance. The algorithms available in this package are fully compatible with scikit-learn methods. In the following sections we describe the Random Forest (RF) algorithm used by the light curve classifier, as well as the other ML algorithms we tested. To train each classifier we randomly split the labeled set into a training set (80%) and a testing set (20%) in a stratified fashion, preserving the percentage of samples for each class.
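The stratified 80/20 split described above can be sketched in a few lines of NumPy (the label names and counts below are illustrative; in practice this is scikit-learn's train_test_split with stratify=y):

```python
import numpy as np

def stratified_split(labels, test_frac=0.2, seed=0):
    """Split indices so that each class keeps the same proportion
    in the training and testing sets."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        n_test = max(1, int(round(test_frac * idx.size)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Imbalanced toy label set: e.g. many QSOs, very few CV/Novae.
labels = np.array(["QSO"] * 754 + ["CV/Nova"] * 25 + ["Blazar"] * 221)
train, test = stratified_split(labels)
```

Per-class splitting guarantees that even the rarest classes appear in both sets, which a plain random split cannot promise for a 2.5% class.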
Balanced Hierarchical Random Forest
A Decision Tree (Rokach & Maimon 2008) is a predictive algorithm that uses a tree structure to perform successive partitions of the data according to a certain criterion (e.g., a cut-off value in one of the descriptors or features), producing possible decision paths and a final outcome for each path (the leaves of the tree). Decision Trees are commonly used for classification, where each final leaf is associated with a given class. RFs (Breiman 2001) are algorithms that build multiple Decision Trees, where each tree is trained using a random sub-sample of elements from a given training set, selected allowing repetition (a bootstrap sample of the training set), and using a random selection of features. The final classification is obtained by averaging the classifications provided by each single tree. This average score can be interpreted as the probability (P_RF) that the input element belongs to a given class. One of the main advantages of RF is that it naturally provides a ranking of the features used for classification, obtained by counting the number of times each feature is selected to split the data.

Chen et al. (2004) proposed a modified RF that can deal with imbalanced data. In their model, each individual tree is trained using a sub-sample of the training set that is defined by generating a bootstrap sample from the minority class, and then randomly selecting the same number of cases, with replacement, from the majority classes. Imbalanced-learn implements the balanced RF classifier proposed by Chen et al. (2004). For the ALeRCE light curve classifier we use their BalancedRandomForestClassifier method, selecting the hyper-parameters (number of trees, maximum number of features per tree, and maximum depth of each tree) with the K-fold cross-validation procedure available in scikit-learn, with k = 5 folds and using the "macro-recall" as the target score (see its definition in Section 5.1).

4.1.1. The hierarchical classifier approach
Considering the hierarchical structure of the taxonomy (see Section 2.2), we decided to construct a balanced hierarchical RF (BHRF) classifier. There are various approaches to performing a hierarchical classification (e.g., Silla & Freitas 2011; Naik & Rangwala 2018). In the particular case of the light curve classifier, we use a two-level scheme. The first level consists of a single classifier that separates the sources into three broad classes. The second level consists of three distinct classifiers, which further resolve each class in the first level into subclasses. We then use the probabilities obtained for each independent classifier to obtain the final classification.

In more detail, the first level (top level hereafter) consists of a single classifier which classifies every source as periodic, stochastic, or transient. The second level (bottom level hereafter) consists of three distinct classifiers: Transient, Stochastic, and Periodic. The classes considered by each of these three classifiers are the ones shown in Table 1 and Figure 3. Each classifier in the bottom level is trained using a training subset having only those classes included in the primary top class (for instance, the Transient classifier only includes sources classified as SNIa, SNIbc, SNII, and SLSN).

Table 3. List of non-detection features used by the light curve classifier. Note that "x" stands for either the g or r band.

dmag_first_det_fid: Difference between the last non-detection diffmaglim in band "x" before the first detection in any band and the first detected magnitude in band "x"
dmag_non_det_fid: Difference between the median non-detection diffmaglim in the "x" band before the first detection in any band and the minimum detected magnitude (peak) in the "x" band
last_diffmaglim_before_fid: Last non-detection diffmaglim in the "x" band before the first detection in any band
max_diffmaglim_before_fid: Maximum non-detection diffmaglim in the "x" band before the first detection in any band
max_diffmaglim_after_fid: Maximum non-detection diffmaglim in the "x" band after the first detection in any band
median_diffmaglim_before_fid: Median non-detection diffmaglim in the "x" band before the first detection in any band
median_diffmaglim_after_fid: Median non-detection diffmaglim in the "x" band after the first detection in any band
n_non_det_before_fid: Number of non-detections in the "x" band before the first detection in any band
n_non_det_after_fid: Number of non-detections in the "x" band after the first detection in any band

It is important to note that these four classifiers are independent and process the same input feature set described in Section 3. The final classification is constructed by multiplying the probabilities obtained for each class of the top level [P_top(transient), P_top(stochastic), and P_top(periodic)] with the individual probabilities obtained by their corresponding classifier in the bottom level. Namely, the probabilities of the Transient classifier (P_T) are multiplied by P_top(transient), the probabilities of the Stochastic classifier (P_S) are multiplied by P_top(stochastic), and the probabilities of the Periodic classifier (P_P) are multiplied by P_top(periodic). We denote the product of these probabilities as P. For instance, the probability of a given source being an RRL corresponds to the product of its probability of being periodic (according to the top level) and its probability of being an RRL (according to the Periodic classifier):

P(RRL) = P_top(periodic) · P_P(RRL),

while the probability of being a Blazar is computed as:

P(Blazar) = P_top(stochastic) · P_S(Blazar).

Following this, the sum of the probabilities of the 15 classes for a given source adds up to one. Finally, the class of a given object is determined by selecting the class with the maximum P.

The best cross-validation performance was obtained with the following hyper-parameter settings: 500 trees in each classifier; no limit on the maximum depth of the trees (the nodes are expanded until all leaves are pure); and a maximum number of features equal to the square root of the total number of features, except for the Stochastic classifier, where we used 20% of the features. In Section 5 we present the results obtained when applying the BHRF classifier to the ZTF data.

4.2. Additional ML algorithms tested
In addition to RF, we also tested two other supervised classification algorithms: Gradient Boosting and Multilayer Perceptron. These tests were done as a complementary analysis, with the purpose of guiding future efforts to improve the light curve classifier. Neither of these methods has a Python implementation specifically designed to handle imbalanced data sets; however, using imbalanced-learn we can generate balanced training sets. We present the results obtained using both classifiers in Section 5.

4.2.1. Gradient Boosting
Gradient Boosting (GBoost; Friedman 2001) is an ML algorithm that uses an ensemble of weak prediction models (e.g., Decision Trees) to produce a more robust classification. The method implements a boosting algorithm (using Gradient Descent) that trains a sequence of weak models, each compensating for the weaknesses of its predecessors. eXtreme Gradient Boosting (XGBoost; Chen & Guestrin 2016) is a package available in several computing languages (including Python) that implements GBoost algorithms for classification and regression in an efficient and scalable way. It has become one of the most widely used packages for regression and classification in recent years.

For the case of GBoost we followed the same two-level hierarchical strategy described in Section 4.1.1. However, since the current version of the XGBoost multi-class classifier was not designed to deal with highly imbalanced data sets (e.g., Wang et al. 2019), we tested a model that uses XGBoost and is trained with a balanced training set. We constructed this balanced training set using the RandomUnderSampler and RandomOverSampler methods available in imbalanced-learn. For the top level, Periodic, and Stochastic classifiers, we constructed a balanced training set by generating 10 random samples using the RandomUnderSampler method, resampling all classes ('all' sampling strategy), and concatenating them, in order to obtain a training set with more than 10,000 objects in total for each classifier. For the Transient classifier we used the RandomOverSampler method, resampling all classes, to generate one random sample with ∼600 objects. Each classifier uses the default hyper-parameters defined by the XGBoost Python package, with the exception of the number of boosting rounds, where we used 500, and the objective function, which was set up to do multi-class classification using the softmax function ('multi:softmax'). As in the case of the BHRF model, the class of a given object is determined by selecting the class with the maximum probability (obtained by multiplying the probabilities of the top and bottom levels).
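The balanced-set construction described above reduces to simple index resampling. A NumPy sketch of both strategies (the class names and sizes are illustrative; imbalanced-learn's RandomUnderSampler and RandomOverSampler implement the same ideas):

```python
import numpy as np

def undersample_indices(labels, rng):
    """One balanced draw: take min-class-size objects from every class
    without replacement (the undersampling 'all' strategy)."""
    classes, counts = np.unique(labels, return_counts=True)
    n = counts.min()
    return np.concatenate([rng.choice(np.flatnonzero(labels == c), n, replace=False)
                           for c in classes])

def oversample_indices(labels, rng):
    """One balanced draw: take max-class-size objects from every class
    with replacement (the oversampling strategy)."""
    classes, counts = np.unique(labels, return_counts=True)
    n = counts.max()
    return np.concatenate([rng.choice(np.flatnonzero(labels == c), n, replace=True)
                           for c in classes])

rng = np.random.default_rng(0)
labels = np.array(["QSO"] * 900 + ["AGN"] * 80 + ["Blazar"] * 20)  # illustrative

# Stochastic-style classifier: concatenate 10 undersampled draws.
train_idx = np.concatenate([undersample_indices(labels, rng) for _ in range(10)])

# Transient-style classifier: a single oversampled draw.
over_idx = oversample_indices(labels, rng)
```

Concatenating several undersampled draws keeps the classes balanced while letting more of the majority-class objects be seen across the ensemble than a single draw would allow.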
Multilayer Perceptron
Artificial neural networks (ANNs) are mathematical models inspired by the human brain. ANNs are composed of elemental computational units called neurons (Haykin 1994), and can be used to perform complex tasks such as classification or regression. A Multilayer Perceptron (MLP) corresponds to an ANN whose neurons are ordered in layers, where all neurons belonging to a given layer receive the same input vector and each unit processes this vector independently according to its own parameters. The outputs of all neurons in a layer are grouped and form the input vector for the next layer. For classification, when using the softmax activation function, the final layer provides the probabilities that a given element belongs to each class. One way of obtaining the final class is to assign the label with the maximum probability in the output layer.

For this model, we also followed the two-level hierarchical strategy described in Section 4.1.1. We tested different MLP architectures, changing the number of layers and the number of neurons per layer. We used the Keras API provided by the Python version of TensorFlow 2.0 (Abadi et al. 2016). We split the original training set defined above into a new training set (80%) and a validation set (20%). In order to deal with the high imbalance of the training set we used the balanced mini-batch generator for Keras provided by the imbalanced-learn Python package. The best performance (considering the categorical cross-entropy loss and accuracy curves for the training and validation sets) was obtained using MLPs with two hidden layers with 256 and 128 neurons for all the classifiers (top level, Transient, Stochastic, and Periodic). Regularization via the dropout method (Srivastava et al. 2014) is used to prevent overfitting, with the dropout fraction set to 0.50.

5. RESULTS

5.1. Results for the BHRF classifier
In order to test the performance of our BHRF classifier, we generated 20 different training and testing sets using the ShuffleSplit iterator provided by scikit-learn, which uses random permutations to split the data, using each time 80% of the labeled set as the training set and 20% as the testing set, preserving the percentage of samples for each class in the original labeled set. We then trained 20 different BHRF models using each training set, and tested their performance using the corresponding testing sets. We emphasize that the testing sets are never used in training their respective models.

Table 4 lists three different scores: precision, recall, and F1-score. Despite the high imbalance present in the labeled set, all classes are equally important; thus we computed macro-averaged scores:

Precision = (1/n_cl) Σ_{i=1}^{n_cl} TP_i / (TP_i + FP_i),   (9)

Recall = (1/n_cl) Σ_{i=1}^{n_cl} TP_i / (TP_i + FN_i),   (10)

F1-score = 2 × (Precision × Recall) / (Precision + Recall),   (11)

where n_cl is the total number of classes, and TP_i, FP_i, and FN_i are the numbers of true positives, false positives, and false negatives, respectively, for a given class i. For the particular case of the BHRF classifier, Table 4 reports the mean and the standard deviation of the macro-averaged scores obtained by the 20 models when applied to their respective testing sets.

In addition, Figures 6 and 7 show the confusion matrices obtained for the top and bottom levels, respectively. To generate these confusion matrices we used the results obtained when applying each of the 20 BHRF models to their corresponding testing sets, providing for each level the median and the 5th and 95th percentiles of the confusion matrices obtained by the 20 models. The confusion matrix of the top level shows that the classifier can recover more than 97% of the true labels, and that the contamination between classes is below 3%. The scores obtained reflect the good performance of the top level classifier.

Table 4. Macro-averaged scores (precision, recall, and F1-score) obtained for each classifier on the testing set.

Figure 6. Confusion matrix for the top level of the BHRF classifier. The confusion matrix was obtained by generating 20 different training and testing sets, and by training 20 independent models using each training set separately. After training, each model is applied to its respective testing set. We provide the median and the 5th and 95th percentiles of the confusion matrices obtained for the 20 testing sets. The reported values correspond to the percentages obtained by normalizing the matrices with the true labels. This level shows a high degree of accuracy, with a low percentage of misclassifications.
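Eqs. 9 to 11 follow directly from the confusion matrix; a small self-contained check (the 2×2 matrix is a made-up example, with true classes in rows and predictions in columns):

```python
import numpy as np

def macro_scores(cm):
    """Macro-averaged precision, recall, and F1 (Eqs. 9-11) from a
    confusion matrix with true classes in rows and predictions in columns."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as class i but truly another class
    fn = cm.sum(axis=1) - tp   # truly class i but predicted as another class
    precision = np.mean(tp / (tp + fp))
    recall = np.mean(tp / (tp + fn))
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = macro_scores([[8, 2], [3, 7]])
```

Note that, as written in Eq. 11, the F1-score is computed from the macro-averaged precision and recall, not by averaging per-class F1-scores; the two conventions differ in general.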
For the case of the bottom level we obtained an F1-score of 0.59, implying significant confusion between classes. From Figure 7 we can see that the fraction of true positives in the confusion matrix of the bottom level has values between 50% and 100%, with a mean, median, and standard deviation of 78%, 76%, and 14%, respectively. In addition, the figure shows that the confusion is most often observed among classes with similar characteristics, such as among the SN classes (particularly SNIa versus SNIbc, and SNII versus SLSN); among the Blazar, AGN, and QSO classes; and among various periodic classes. The highest standard deviation of the predictions is observed for SLSN, a result of the low number of SLSN in the labeled set.

To complement our analysis, in Appendix B we provide the results obtained by a one-level multi-class RF model. From its results we can conclude that the light curve classifier improves considerably when a hierarchical strategy is followed.
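The median and 5th/95th percentile matrices reported in Figures 6 and 7 can be aggregated in one step; in the sketch below, random diagonally dominant matrices stand in for the 20 per-split results:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-ins for the row-normalized confusion matrices of the 20 models.
cms = []
for _ in range(20):
    cm = np.eye(3) * 80.0 + rng.uniform(0.0, 10.0, (3, 3))  # diagonally dominant
    cms.append(cm / cm.sum(axis=1, keepdims=True))          # normalize by true label
cms = np.stack(cms)

# Element-wise statistics across the 20 runs.
cm_median = np.median(cms, axis=0)
cm_p05, cm_p95 = np.percentile(cms, [5, 95], axis=0)
```

Reporting the element-wise spread across resampled training sets shows which confusion-matrix entries are stable and which depend strongly on the particular split.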
Comparison with the GBoost and MLP classifiers
In this work we tested two other classifiers: GBoost and MLP. For these models we present the results obtained by using 80% of the labeled set as a training set and the remaining 20% as a testing set, preserving the percentage of samples for each class in the original labeled set. The scores obtained by these classifiers on the testing set are shown in Table 4. The confusion matrices obtained for the bottom level of the GBoost and MLP classifiers on the testing set are presented on the left and right sides of Figure 8, respectively.

The precision and F1-score obtained by GBoost are in general better than those obtained by the BHRF classifier, with the exception of the recall score of the bottom level. However, the fractions of true positives in the confusion matrix of the bottom level range between 5% and 100%, with a mean, median, and standard deviation of 72%, 83%, and 29%, respectively, which explains the lower recall obtained by GBoost compared to the BHRF classifier. In addition, the classes with the largest fraction of true positives in the confusion matrix of GBoost correspond to the most populated classes in the labeled set, like QSOs, which represent 75.4% of the stochastic sources; SNIa, which represent 74.0% of the transients; or LPV, E, and RRL, which represent 16.2%, 43.5%, and 37.3% of the periodic sources, respectively. This is not observed in the results obtained by BHRF, where there is no evidence of correlation between the representation of a given class and its fraction of true positives in the confusion matrix shown in Figure 7.

Figure 7. As in Figure 6, but for the bottom level of the BHRF classifier. The black squares highlight the three classes of the top level (from top to bottom: transient, stochastic, and periodic). This matrix is quite diagonal, but shows more misclassification among related subtypes compared to the matrix obtained for the top level.

The results obtained using GBoost are promising. We obtained good scores even though the current versions of GBoost available in the literature have not been designed to deal with high imbalance in multi-class classification. Therefore, further efforts should be made to implement GBoost in future versions of the light curve classifier. In particular, new implementations of GBoost for multi-class classification that follow approaches similar to those proposed by Chen et al. (2004) or Wang et al. (2019) should be tested, as should combinations of GBoost with data augmentation techniques (i.e., generating synthetic light curves of less populated classes using physical and/or statistical models).

For the case of the MLP classifier, the scores obtained are in general lower compared to BHRF and GBoost. Its confusion matrix for the bottom level (see Figure 8) presents the same issues already discussed for GBoost. The fraction of true positives in the confusion matrix ranges between 11% and 98%, with a mean, median, and standard deviation of 69%, 74%, and 22%, respectively. Therefore, we conclude that more work is needed before an MLP-based classifier can match the performance of the BHRF model.
Results for the BHRF classifier excluding AllWISEdata
We tested a version of the BHRF classifier that excludes the features computed using AllWISE data. The macro-averaged precision, recall, and F1-score of the top level are 0.93, 0.97, and 0.95, respectively. For the bottom level, the macro-averaged precision, recall, and F1-score are 0.53, 0.72, and 0.55, respectively. These scores are slightly smaller than the ones obtained using the original version of the BHRF classifier. As can be observed in the confusion matrix shown in Figure 9, the stochastic classes are the most affected by the lack of AllWISE data, particularly YSOs and Blazars, whose fractions of true positives decreased by 10% and 16%, respectively. This happens because of the similarities observed between the light curves of these and other classes, which are not easily separated using variability features alone. However, the results obtained by this version of the classifier are still good enough to be used when AllWISE data are not available, as occurs, for instance, for faint objects.

5.2. Performance of the BHRF classifier as a function of magnitude and number of detections
As can be seen in Figure 5, for some classes, like YSOs, CEPs, and LPVs, a non-negligible fraction of sources in the labeled set have photometry available in only one band. It is therefore important to know how well the classifier behaves when a single band is available for a given source.

To evaluate this, we created 20 new testing sets, defined by considering only those sources with ≥6 detections, drawn from the testing sets generated with the ShuffleSplit iterator (see Section 5.1). We then classified each new testing set with its respective model, considering: a) the features available for the g and r bands; b) the features available only for the g band; and c) the features available only for the r band. Figures 10 and 11 illustrate the results of this analysis. Figure 10 shows the recall as a function of the average magnitude for each subclass, with the exception of the SN classes, which are grouped into one class called SN, while Figure 11 shows the recall as a function of the number of detections. From both figures we can infer that in general the best results are obtained when photometry from both g and r is available, with the exception of the QSO, CEP, and Periodic-Other classes.

From Figure 10 we can also conclude that the reliability of the classification versus the average magnitude is different for each class. These distributions in general follow the magnitude distribution in the labeled set of each class considered in this model (with the exception of RRL). For instance, in the labeled set the CEP class corresponds to one of the brightest classes, having in general r < 16, while the LPV class covers a broader range of magnitudes. On the other hand, from Figure 11 we can infer that in general the classification improves when more detections are available in both bands, with the exceptions of the QSO and Periodic-Other classes.

The results obtained for Periodic-Other are not surprising, since this class includes all the periodic classes not considered in the classifier (including several different types of pulsating stars, as well as rotational variables). The results observed for the CEP class are probably due to the large fraction of CEPs in the labeled set with photometry only in the g band, which is produced by the saturation limit of the ZTF survey (12.5 to 13.2 magnitudes) and the fact that CEPs tend to be very bright, particularly in the r band; thus for bright CEPs the r-band light curve is not available. For SNe, there is a decrease in the recall curve in the g band, presumably because the g-band light curves of SNe tend to decay faster than the r-band light curves, producing shorter g-band light curves with fewer detections. This trend can be seen in the SN shown in Figure 2, as well as more generally in the light curve statistics for SNe: the average number of detections of SN light curves is 12 and 16 in the g and r bands, respectively, and the total time length of SN light curves is 53 and 64 days in the g and r bands, respectively. The low recall obtained for bright RRL when only the g band is available may be produced by differences in the variability features measured for different RRL sub-types. This issue is further discussed in Appendix C. The results obtained for AGNs and QSOs are likely related to incorrect labels, which we discuss further in Appendix D.

5.3. The deployed BHRF classifier
In order to use the BHRF classifier to classify the ZTF alert stream, we need to train a single BHRF model. We call this model "the deployed BHRF classifier". As in the previous sections, we trained the deployed BHRF model using 80% of the labeled set as the training set and the remaining 20% as the testing set. Table 5 shows the classification report obtained for the bottom level of this classifier. From the table we can see that in general the precision scores are lower than the recall scores. The results obtained for SNIa, SNII, QSO, AGN, YSO, CV/Nova, LPV, E, and RRL are good, with F1-score values higher than or equal to 0.60. The table also shows that some classes, such as SNIbc, Periodic-Other, DSCT, and CEP, require further work in order to improve the results obtained by the classifier; this may involve data augmentation procedures and better period estimations, as discussed in the following sections.

Figure 8. Confusion matrices of the bottom level obtained when applying the GBoost (left) and the MLP (right) classifiers to the testing set. The black squares highlight the three hierarchical classes (from top to bottom: transient, stochastic, and periodic). The reported values correspond to the percentages obtained by normalizing the matrix with the true labels. These matrices present a high percentage of misclassifications compared to the bottom level of the BHRF model (Figure 7).

Figure 9. As in Figure 7, but for a model that excludes AllWISE data. We drop the errors, which are comparable to those listed in Figure 7, for simplicity. The results obtained for this model are reasonable, although the classification of some stochastic classes is affected by the lack of AllWISE data.
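The final probabilities served by the deployed classifier follow the top×bottom product of Section 4.1.1; a sketch with made-up probability vectors for a single source:

```python
import numpy as np

# Illustrative outputs for one source: the top level and the three
# bottom-level classifiers (each vector sums to one).
p_top = {"transient": 0.05, "stochastic": 0.15, "periodic": 0.80}
p_transient = {"SNIa": 0.7, "SNIbc": 0.1, "SNII": 0.15, "SLSN": 0.05}
p_stochastic = {"QSO": 0.5, "AGN": 0.2, "Blazar": 0.1, "YSO": 0.1, "CV/Nova": 0.1}
p_periodic = {"LPV": 0.05, "E": 0.1, "DSCT": 0.05, "RRL": 0.6, "CEP": 0.1,
              "Periodic-Other": 0.1}

# P(class) = P_top(branch) * P_branch(class); the 15 products sum to one
# because each classifier's probabilities already sum to one.
final = {}
for branch, probs in [("transient", p_transient), ("stochastic", p_stochastic),
                      ("periodic", p_periodic)]:
    for cls, p in probs.items():
        final[cls] = p_top[branch] * p

best = max(final, key=final.get)   # here P(RRL) = 0.80 * 0.6 = 0.48
```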
Feature ranking of the deployed BHRF classifier
In Table 6 we list the feature ranking (top 30) foreach classifier within the two-level BHRF classifier (toplevel, Transient, Stochastic, and Periodic). The featureranking is computed considering which features separatebetter the subclasses within each classifier, with moreinformative features having higher ranks (for more de-tails see Hastie et al. 2009). From the table we can seethat for all the classifiers, a considerable fraction of thetop 30 features correspond to colors computed using the he ALeRCE Light Curve Classifier
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0 R e c a ll SN
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0
QSO
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0
AGN
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0 R e c a ll Blazar
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0
YSO
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0
CV/Nova
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0 R e c a ll LPV
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0 E
13 14 15 16 17 18 19 20 210.00.20.40.60.81.0
DSCT
13 14 15 16 17 18 19 20 21 average magnitude R e c a ll RRL
13 14 15 16 17 18 19 20 21 average magnitude
CEP
13 14 15 16 17 18 19 20 21 average magnitude
Periodic-Other g+rgr
Figure 10.
Recall for each stochastic and periodic subclass, as well as all transients (grouped as SN), as a function of theaverage magnitude. The x-axis ranges from 13 to 21 magnitudes, this range includes ∼
90% of the sources. In black triangles weshow the recall curves obtained when g and r photometries are available (considering the average magnitude in the g band), ingreen circles when only the g band is available, and in red squares when only the r band is available. The shaded regions wereobtained by generating 20 different training and testing sets, and training 20 independent models using each of these sets. Wereport the median and 5 and 95 percentile values obtained from the 20 models. There is a truly wide variety of behaviors (seediscussion in the text). S´anchez-S´aez et al. R e c a ll SNe QSO AGN R e c a ll Blazar YSO CV/Nova R e c a ll LPV E DSCT R e c a ll RRL CEP Periodic-Other g+rgr
Figure 11.
As in Figure 10, but plotting the recall as a function of the number of detections in the light curve (on a logarithmic scale). The x-axis ranges from 6 to 150 detections; this range includes ∼90% of the sources. Again, there is a wide variety of behaviors (see discussion in the text).

Table 5.
Classification report of the bottom level of the deployed BHRF classifier. The last row shows the macro average of each score.

Class            precision  recall  f1-score
SNIa                  0.92    0.76      0.83
SNIbc                 0.14    0.58      0.23
SNII                  0.69    0.55      0.61
SLSN                  0.28    1.00      0.43
QSO                   0.97    0.87      0.92
AGN                   0.65    0.85      0.73
Blazar                0.40    0.75      0.53
YSO                   0.73    0.78      0.75
CV/Nova               0.52    0.71      0.60
LPV                   0.99    0.98      0.98
E                     0.95    0.74      0.83
DSCT                  0.15    0.89      0.26
RRL                   0.95    0.88      0.91
CEP                   0.18    0.75      0.29
Periodic-Other        0.14    0.73      0.24
macro avg             0.58    0.79      0.61
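Scores like those reported in Table 5, including the macro averages in the last row, follow the standard definitions and can be obtained in one call with scikit-learn (labels below are invented for illustration):

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = ["SNIa", "QSO", "QSO", "RRL", "RRL", "RRL"]
y_pred = ["SNIa", "QSO", "RRL", "RRL", "RRL", "QSO"]

# Macro averaging gives every class the same weight regardless of how many
# sources it has, which is why it is the summary of choice for highly
# imbalanced labeled sets.
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=["SNIa", "QSO", "RRL"],
    average="macro", zero_division=0,
)
```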
AllWISE and ZTF photometry, as well as new detection features (i.e., features not included in the FATS package) and non-detection features. Moreover, it can be observed that the ranking of features changes for each classifier.

The top-level classifier is dominated by different types of features: ZTF and AllWISE colors, morphological properties of the images (sgscore1), variability features related with the amplitude of the variability at short and long timescales (MHPS_low, GP_DRW_sigma, Meanvariance, ExcessVar, and SPM_A), variability features that detect a smooth decrease or increase of the luminosity (LinearTrend, SPM_tau_rise, SPM_tau_fall), features related with the quality of a supernova parametric model fit (SPM_chi), and features related with transient appearance or disappearance (positive_fraction and n_non_det_after_fid).

On the other hand, the Transient classifier is dominated by the SPM features (e.g., SPM_beta, SPM_t0, SPM_tau_rise, and SPM_tau_fall). Other relevant features are the optical colors at the peak and the mean of the light curve, measured from the difference-image light curves; features that detect a smooth increase or decrease of the observed flux (LinearTrend); features that measure the level of correlation in the light curve (IAR_phi); features related with the amplitude of the variability (MHPS_low); and features related with the appearance of a transient source (dmag_first_det_fid, last_diffmaglim_before_fid_1). Note that SN rise-related features, such as SPM_t0 and SPM_tau_rise, are some of the most relevant features for the classification of transients, and are crucial for the early classification of SNe. Also note that SPM_t0 is not the explosion time, but a characteristic time at which the SN has risen significantly.

For the Stochastic classifier, 12 of the top 30 features are related with color, morphology, and distance to the Galactic plane, and the rest correspond to features related with the amplitude of the variability observed at different timescales (e.g., ExcessVar, SPM_A, Meanvariance, GP_DRW_sigma, and Amplitude), and features related with the timescale of the variability (IAR_phi, GP_DRW_tau).

Finally, the Periodic classifier is clearly dominated by the Multiband_period feature, but also by different colors, by features related with the amplitude of the variability (e.g., delta_mag_fid, Amplitude, ExcessVar, Meanvariance, and GP_DRW_sigma), and by features related with the timescale of the variations (e.g., GP_DRW_tau and IAR_phi).

5.3.2. Results for the unlabeled ZTF set
We now turn to discuss the results obtained when applying the final BHRF classifier to all the sources in ZTF with ≥ 6 detections in g or ≥ 6 detections in r. Considering the data obtained by ZTF until 2020/06/09, there are 868,371 sources that satisfy this condition, hereafter defined as the "unlabeled ZTF set". We define the class of a given object in the unlabeled ZTF set by selecting the class with the maximum probability obtained by the deployed BHRF classifier. However, users of this classifier can use the obtained probabilities to make their own probability cuts and select samples for their science. The features, classifications, and probabilities obtained for the unlabeled ZTF set with data until 2020/06/09 are provided in a catalog that can be downloaded at http://ml.alerce.online/lc_pred.

It is important to note that the classifications obtained by the light curve classifier are updated every day, as new alerts are received. Whenever a new alert is received for a given source, the ALeRCE pipeline recomputes its variability features and provides an updated classification. These updated classifications can be found at the ALeRCE Explorer website, using the "light curve classifier" tab and specifying the desired class. Considering the results shown in Figure 11, we expect that the quality of the classification for a given source will improve as more detections are added to the light curve. In addition, with new alerts, more objects will satisfy the condition of having ≥ 6 detections in g or ≥ 6 detections in r, and thus, the size of the unlabeled ZTF set
Table 6.
Feature ranking (top 30) for each layer of the deployed BHRF classifier.
(Suffixes _1 and _2 denote features measured in the g and r bands, respectively.)

Rank  Top level              Transient                     Stochastic          Periodic
1     W1-W2                  SPM_t0_1                      W1-W2               Multiband_period
2     sgscore1               SPM_beta_2                    sgscore1            g-W2
3     positive_fraction_2    SPM_tau_rise_2                r-W2                r-W2
4     positive_fraction_1    SPM_tau_rise_1                (g-r)_mean_corr     (g-r)_max_corr
5     SPM_tau_rise_1         (g-r)_max                     g-W2                g-W3
6     LinearTrend_2          SPM_t0_2                      gal_b               (g-r)_mean
7     SPM_chi_1              LinearTrend_2                 g-W3                (g-r)_max
8     g-W2                   AndersonDarling_2             (g-r)_max_corr      GP_DRW_tau_1
9     g-W3                   SPM_beta_1                    ExcessVar_2         (g-r)_mean_corr
10    n_non_det_after_fid_2  SPM_tau_fall_2                Meanvariance_2      IAR_phi_1
11    W2-W3                  dmag_first_det_fid_1          (g-r)_mean          Amplitude_1
12    SPM_beta_1             MHPS_low_1                    delta_mag_fid_2     ExcessVar_1
13    SPM_tau_rise_2         LinearTrend_1                 r-W3                delta_mag_fid_1
14    SPM_A_2                (g-r)_mean                    W2-W3               Meanvariance_1
15    SPM_chi_2              MHPS_ratio_1                  Amplitude_2         r-W3
16    SPM_A_1                SPM_tau_fall_1                Std_2               Std_1
17    r-W2                   SPM_gamma_2                   (g-r)_max           GP_DRW_sigma_1
18    ExcessVar_1            MHPS_ratio_2                  SPM_A_2             GP_DRW_tau_2
19    ExcessVar_2            Skew_2                        PercentAmplitude_2  PercentAmplitude_1
20    r-W3                   sgscore1                      SPM_A_1             W1-W2
21    Rcs_2                  SPM_gamma_1                   ExcessVar_1         W2-W3
22    GP_DRW_sigma_2         Power_rate_2                  IAR_phi_1           SF_ML_amplitude_1
23    SPM_tau_fall_1         IAR_phi_2                     GP_DRW_sigma_2      SPM_A_1
24    Meanvariance_2         dmag_first_det_fid_2          delta_mag_fid_1     Gskew_1
25    LinearTrend_1          IAR_phi_1                     Pvar_2              Q31_1
26    SPM_beta_2             last_diffmaglim_before_fid_1  IAR_phi_2           Autocor_length_1
27    Pvar_2                 PPE                           GP_DRW_tau_2        IAR_phi_2
28    MHPS_low_2             Harmonics_mag_6_1             GP_DRW_tau_1        SF_ML_gamma_1
29    SF_ML_amplitude_2      MHPS_low_2                    MHPS_high_1         Amplitude_2
30    (g-r)_max_corr         Gskew_2                       MHPS_low_1          delta_mag_fid_2
increases every day. Moreover, with new detections the size of the labeled set will increase, allowing the training of new BHRF models. Updates regarding any possible modification to the light curve classifier (e.g., labeled set and models) will be published on the ALeRCE Science website.

Figure 12 shows the number of candidates per class obtained for the 868,371 sources with enough alerts until 2020/06/09. Compared to Figure 4, it can be noticed that there is no correlation between the number of sources per class for the unlabeled set and the number of sources per class for the labeled set. The Periodic-Other, E, and LPV classes have the highest numbers of candidates, while the SN classes have the lowest. The distribution of candidates per class is consistent with the astrophysical number densities (i.e., we are likely not misclassifying large numbers of sources). For instance, Blazars are relativistically beamed (and thus visible to larger distances), but only over very small viewing angles, and hence are expected to be less common than QSOs and AGNs. In the case of SNe, not factoring in the amount of time a particular SN is above the magnitude limits of the search (the "control time"), we find ratios of SNe II/SNe Ia and SNe Ibc/SNe Ia of 0.21 and 0.41, respectively. Computing these ratios using the numbers of such classes reported from ASAS-SN discoveries in Holoien et al. (2019) yields 0.36 and 0.09, respectively. The significant difference between the SNe Ibc/SNe Ia ratios implies that we are strongly overestimating the number of SNe Ibc; given the similarities between SN Ibc and SN Ia light curves, we are likely classifying a non-negligible fraction of SNe Ia as SNe Ibc. This highlights the importance of including distance estimations to improve the classification of transients.
We are currently working to include distance-based features in future versions of the classifier.

Figure 12. Number of candidates per class for all the sources in the unlabeled ZTF set. It can be seen that the number of sources per class for the unlabeled ZTF set does not correlate with the number of sources per class for the labeled set (see Figure 4).

To investigate the quality of the predictions, we plotted the probability distributions of the top and bottom levels on the left and right sides of Figure 13. The red lines denote the position of the median probability for each class, and the green lines denote the 5th and 95th percentiles. It is clear from the figure that the distributions of probabilities for the top level are higher compared to those of the bottom level. For the top level, the classes with the lowest probabilities are CV/Nova and YSO. For the bottom level, the classes with the highest probabilities are LPV, QSO, and AGN, while the lowest probabilities are found for the different classes of SNe, YSO, CV/Nova, and some periodic variables.

The low probabilities obtained for some classes are related with the confusion between classes observed in Figure 7. For instance, in Figure 7 we can see that the SN classes present a high confusion among them. Moreover, the SNIa, SNIbc, SNII, and SLSN median probabilities of sources classified as SNII are 0.16, 0.19, 0.28, and 0.19, respectively. The high confusion among the SN classes may be due to the low number of sources in the labeled set, but also to the intrinsic similarities among these classes. For example, the physical mechanism responsible for the main peak of the light curves of SNe Ia and SNe Ib/c is the same: the diffusion of energy deposited by radioactive Ni (Arnett 2008). Indeed, Villar et al. (2019a) report that 15% of their Type Ibc SNe are classified as Ia. This might be improved by performing data augmentation using, for example, Gaussian process modeling (e.g., Boone 2019).
On the other hand, the low probabilities observed for CV/Novae and YSOs can be produced by the similarities between their colors and the colors of some periodic sources, and by the fact that some CV/Novae and YSOs present very rapid variability compared to the ZTF cadence, which produces light curves with low auto-correlation, and thus low values of the IAR_phi parameter, as is normally observed for periodic sources (excluding LPVs). These similarities can be seen in Figure 14, where we show the distributions of IAR_phi_1 and g−W3 for YSOs and CV/Novae (grouped), the rest of the stochastic classes, and periodic sources from the labeled set.

Figure 15 shows the normalized r-band magnitude distribution of the different classes, considering sources present in the labeled set and candidates from the unlabeled ZTF set. In general, the distributions of magnitudes of the candidates are similar to or fainter than those found among sources from the labeled set. These results show that the classifier is able to detect faint and bright candidates, regardless of the luminosity biases present in the labeled set, which can be dominated by the brightest tail of the true magnitude distribution of each class.

A simple way to test the performance of the BHRF classifier is to verify whether the results obtained when the model is applied to the unlabeled ZTF set are in agreement with what is astrophysically expected from previous works. For instance, younger Galactic targets like YSOs, Classical Cepheids, and LPVs should reside near the Galactic plane (e.g., Catelan & Smith 2015; Mowlavi et al. 2018), while extragalactic sources like AGNs, QSOs, Blazars, and SNe should have roughly isotropic distributions, perhaps with fewer sources near the Galactic plane due to attenuation/reddening by gas and dust (e.g., Calzetti et al. 2000; Padovani et al. 2017). On the left side of Figure 16 the sky distribution, in Galactic coordinates, of LPV, CEP, and YSO candidates is shown. It is clear from the figure that most of them are located in the Galactic plane, and that sources located outside the plane have a low BHRF probability. This is consistent with the results obtained by previous works (e.g., Mowlavi et al. 2018; Rimoldini et al. 2019). The right panel of Figure 16 shows the Galactic latitude versus the g − r color, obtained using the mean magnitude of the light curves in each band, for extragalactic candidates (QSO, AGN, Blazar, SNIa, SNIbc, SNII, and SLSN).
From the figure we can see that the fraction of extragalactic candidates observed around the Galactic plane is low, and that most of the candidates located in the plane have low probabilities. Moreover, the g − r colors of the extragalactic candidates are consistent with what is expected for these classes, with clear evidence of reddening for the candidates located around the Galactic plane. The sky distribution, in Galactic coordinates, of the extragalactic candidates can be found in Figure 22 of the appendix.

Figure 13.
Left: normalized probability distributions of the top level of the deployed BHRF classifier, split by subclass, for candidates from the unlabeled ZTF set. The reported values correspond to the probabilities obtained for each class of the top level, as indicated below the class name. Right: normalized probability distributions of the bottom level of the deployed BHRF classifier, split by subclass, for candidates from the unlabeled ZTF set. The red lines show the median probability for each class. The green lines show the 5th and 95th percentiles of the probabilities. Some subclasses show broad distributions extending to low values, implying that they are not so well represented or characterized by the highest-ranked features within the hierarchical class.

Figure 14.
Normalized IAR φ distribution (top) in the g band (IAR_phi_1; cut in the y-axis at 0.4), and normalized g−W3 distribution (bottom), for YSOs and CV/Novae (blue), stochastic sources (red; excluding CV/Novae and YSOs), and periodic sources (yellow). It can be seen that some CV/Novae and YSOs have IAR_phi_1 and g−W3 values similar to those of periodic sources.

6. CONCLUSIONS

6.1. Summary
In this paper we presented the first version of the ALeRCE light curve classifier. This classifier uses a total of 152 features, including variability features computed from ZTF light curves with ≥ 6 detections in the g or r bands, and colors computed using ZTF and AllWISE photometry (see Section 3), to classify each source into 15 subclasses, including periodic, transient, and stochastic variable sources (see Section 2.2). The light curve classifier uses a balanced hierarchical RF classifier (see
Figure 15.
Normalized magnitude distributions in the r band for sources in the labeled set (LS; red histograms) and candidates from the unlabeled ZTF set (cand.; blue histograms).

Section 4.1), constructed with a two-level scheme. The first level (top level) classifies each source as periodic, stochastic, or transient. The second level (bottom level) consists of three classifiers that further resolve each hierarchical class into subclasses.
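The two-level scheme can be sketched as follows; multiplying branch and within-branch probabilities is one generic way to combine the levels into 15 final scores, shown here with toy numbers as an illustration rather than the deployed model's exact outputs:

```python
# Top-level probabilities over the three hierarchical classes, and
# bottom-level probabilities within each branch (toy numbers only).
top = {"transient": 0.2, "stochastic": 0.1, "periodic": 0.7}
bottom = {
    "transient": {"SNIa": 0.5, "SNIbc": 0.2, "SNII": 0.2, "SLSN": 0.1},
    "stochastic": {"QSO": 0.4, "AGN": 0.3, "Blazar": 0.1,
                   "YSO": 0.1, "CV/Nova": 0.1},
    "periodic": {"LPV": 0.1, "E": 0.2, "DSCT": 0.05, "RRL": 0.5,
                 "CEP": 0.05, "Periodic-Other": 0.1},
}

# Combine the levels: P(subclass) = P(branch) * P(subclass | branch).
final = {sub: top[branch] * p
         for branch, subs in bottom.items() for sub, p in subs.items()}
best = max(final, key=final.get)  # here 'RRL', with 0.7 * 0.5 = 0.35
```

Because each level's probabilities sum to one, the 15 combined scores also sum to one.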
Figure 16.
Left: Galactic latitude (gal_b, in degrees) versus Galactic longitude (gal_l, in degrees) for candidates expected to be mostly observed around the Galactic plane (LPV, CEP, and YSO classes). Right: Galactic latitude (gal_b, in degrees) versus g − r max for extragalactic candidates (QSO, AGN, Blazar, SNIa, SNIbc, SNII, and SLSN classes). The bottom-level probabilities computed by the deployed BHRF classifier are color-coded according to the colorbar to the right. The red contours show the density of points in each plot. It can be seen that most of the extragalactic candidates are located outside the Galactic plane, while most of the YSO, LPV, and CEP candidates are located in the Galactic plane.

We trained and tested the BHRF classifier using a labeled set obtained by cross-matching the ZTF database with different catalogs of transient, stochastic, and periodic sources (see Section 2.2.1). For the top level we obtained macro-averaged precision, recall, and F1-score values of 0.96, 0.99, and 0.97, respectively, while for the bottom level we obtained macro-averaged precision, recall, and F1-score values of 0.57, 0.76, and 0.59, respectively.

We used the BHRF classifier to classify 868,371 sources from ZTF (the unlabeled ZTF set), obtaining results that are in agreement with what we expect astrophysically. For instance, most of the high-probability extragalactic candidates are located outside the Galactic plane, and most of the high-probability YSO, LPV, and CEP candidates are located in the Galactic plane (see Figure 16).

The condition of ≥ 6 detections in g or r normally equates to a timespan between 3 and 30 days since the first detection. Whenever a new detection is received for an object, the ALeRCE pipeline processes it and provides an updated classification.

6.2. Final remarks and perspectives
One of the main challenges found during the development of the ALeRCE light curve classifier was the high imbalance present in the labeled set. For instance, transient sources represent 1.4% of the labeled set, while periodic sources represent 70.5%. Each hierarchical class also suffers from high imbalance among its subclasses; for example, in the case of the transient class, SNIa comprise 74.0% of the sample, while SLSN correspond to only 1.4%. We addressed this problem by using the balanced RF implementation of the imbalanced-learn Python package, which follows the procedure proposed by Chen et al. (2004). This method uses a majority-class downsampling technique to train each tree with a balanced sub-sample. We also tested two other algorithms, GBoost and MLP, but concluded that more work is needed to obtain better results from those.
IAR_phi parameter, the MHPS features, and the non-detection features. In addition, during the development of the light curve classifier we realized that some stochastic classes are hard to separate using variability features alone. In particular, the separation of YSOs from the other stochastic classes improved significantly once we included AllWISE colors in the set of features (as can be seen in Figures 7 and 9).

Furthermore, the computation of reliable periods was quite challenging, particularly considering that the ALeRCE pipeline requires fast computation of features. Huijse et al. (2018) demonstrated that very good results can be achieved with quadratic mutual information (QMI) estimators; however, these techniques are computationally expensive [O(n²)]. We solved this issue by using the MHAOV periodogram, which provides less reliable periods (see Figure 10 in Huijse et al. 2018) but is much faster to compute [O(n)]. Periods become increasingly unreliable as the number of data points decreases, but the classification of periodic variables can still be accurate (see the RRL, E, or LPV panels in Figure 11), as other features can compensate for the decreasing quality of the periodogram (e.g., features related with the amplitude and the timescale of the variability). ALeRCE is currently working to implement methods for period estimation that are both accurate and fast. Computing the periodogram is expensive for sources with a large number of detections. We are currently exploring so-called "online" periodograms, which are updated as new samples arrive at a fraction of the computational cost of recomputing the period each time from scratch (Zorich et al. 2020), as well as other techniques that might work better with eclipsing binary light curves (e.g., Kovács et al. 2002; Mighell & Plavchan 2013).

Moreover, the classification of the different SN classes was particularly challenging.
First, the number of SNe in the labeled set was very small compared to other classes; and second, the light curves of the SN classes can present similarities, which makes their separation difficult, as discussed in Section 5.3.2. We solved this issue by using the BalancedRandomForestClassifier method from imbalanced-learn, and by including the SPM features, whose definition is a modification of the work of Villar et al. (2019a). In the future we plan to test other techniques to improve the separation of the SN classes. Previous works have performed Gaussian process regression to model SN light curves and generate new light curves with different cadences therefrom (e.g., Boone 2019). Moreover, better results can be obtained if we use information regarding the SN host galaxy (e.g., Foley & Mandel 2013; Baldeschi et al. 2020).

In the future we also plan to perform data augmentation to improve the classification of persistent variable objects. For the case of variable stars, light curves can be modeled with Gaussian processes or with a combination of harmonics, and basic transformations can then be applied to these models to obtain light curves with different periods and amplitudes (e.g., Elorrieta et al. 2016; Martínez-Palomera et al. 2018; Castro et al. 2018; Aguirre et al. 2019). To the best of our knowledge, for the case of AGNs, QSOs, and Blazars no previous attempts to perform data augmentation have been made. A promising option is to use synthetic light curve generators that consider the physical processes behind the variability (e.g., Sartori et al. 2019).

Most of the features used by this classifier can be implemented and used to classify light curves from other data sets. In particular, for the case of LSST, the non-detection features can be adapted to work with the forced photometry that will be provided for each alert (DIAForcedSources in the Data Products Definition Document; Jurić et al. 2019). LSST will also benefit from the multiband ugrizy light curves. As we demonstrated in this work, in general the light curve classification improves when both ZTF g and r data are available. For the case of LSST this would be the same, and probably we should even be able to further resolve some of the subclasses presented in this work. For instance, using the zy light curves we should be able to separate local type 1 and type 2 active galactic nuclei, since for low-redshift sources we can detect variability from the dusty torus at these wavelengths (see Sánchez et al. 2017 and references therein), or identify high-redshift QSOs, whose emission is expected to be absorbed in the bluer bands. We encourage researchers interested in classifying stochastic and transient sources in particular to use the novel (or modified) features presented in this work, like the IAR_phi parameter, the MHPS features, the SPM features, and the non-detection features.

It is worth noting that the ALeRCE light curve classifier is being constantly improved, and this work describes its Version 1.0. Future versions of this classifier may include new classes of variable and transient objects, as well as sub-classes of sources already present in the taxonomy (e.g., RRL types ab and c; classical
S´anchez-S´aez et al. and type II CEPs; contact, detached, and semi-detachedeclipsing binaries; among others). We are also workingto find new features and techniques that can improvethe performance of the classifier. Future work will re-port any changes included in the classifier model, likethe inclusion of data augmentation, or the use of otherclassification strategies (e.g., semi supervised training).We recommend the users of this classifier to check theALeRCE Science website to get updates related with thedifferent classifiers and the data processing. We are ex-ploring different classification algorithms which are notbased on manually designed features, but on automat-ically derived, recurrent, implicitly extracted features,via deep learning (e.g. Naul et al. 2018; Muthukrishnaet al. 2019; Becker et al. 2020). However, up to thispoint we have found that the former produce better re-sults when applied to real data. Most likely, a combina-tion of simulated and real data will be required to trainreliable deep learning classification models in the future,as found by Carrasco-Davis et al. (2019).ACKNOWLEDGMENTSThe authors acknowledge support from: ANID Mil-lennium Science Initiative ICN12 009; ANID grantsPIA AFB-170001 (F.F., I.R., C.V., E.C., D.R., L.S.G.,C.S.C., D.R.U.), PIA ACT172033 (P.A.), Basal-CATAAFB-170002 (P.S.S., F.E.B., M.C., D.D.), FONDECYTRegular N 1200710 (F.F.), 1190818 (F.E.B.), 1200495(F.E.B.), 1171678 (P.E.) and 1171273 (M.C.), FONDE-CYT Initiation N 11191130 (G.C.V.), and FONDECYTPostdoctorado N 3200250 (P.S.S.), and 3200222 (D.D.),the Max-Planck Society, Germany, through a PartnerGroup grant (P.A.). Powered@NLHPC: This researchwas partially supported by the supercomputing infras-tructure of the NLHPC (ECM-02).Based on observations obtained with the SamuelOschin 48-inch Telescope at the Palomar Observatory as part of the Zwicky Transient Facility project. ZTFis supported by the National Science Foundation underGrant No. 
AST-1440341 and a collaboration including Caltech, IPAC, the Weizmann Institute for Science, the Oskar Klein Center at Stockholm University, the University of Maryland, the University of Washington, Deutsches Elektronen-Synchrotron and Humboldt University, Los Alamos National Laboratories, the TANGO Consortium of Taiwan, the University of Wisconsin at Milwaukee, and Lawrence Berkeley National Laboratories. Operations are conducted by COO, IPAC, and UW. AllWISE makes use of data from WISE, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, and NEOWISE, which is a project of the Jet Propulsion Laboratory/California Institute of Technology. WISE and NEOWISE are funded by the National Aeronautics and Space Administration.
Software:
Apache Kafka (https://kafka.apache.org/), Apache Spark (Zaharia et al. 2016), ASTROIDE (Brahem et al. 2020), Astropy (Astropy Collaboration et al. 2013, 2018), Dask (Rocklin 2015), FATS (Nun et al. 2015), IAR (Eyheramendy et al. 2018; Python and R implementations are available at https://github.com/felipeelorrieta/IAR Model), Imbalanced-learn (Lemaître et al. 2017), Jupyter (Kluyver et al. 2016), Keras (Chollet et al. 2015), Matplotlib (Hunter 2007), Numpy & Scipy (van der Walt et al. 2011), P4J (Huijse et al. 2018), Pandas (McKinney et al. 2010), PostgreSQL, Python (Van Rossum & Drake Jr 1995), Scikit-learn (Pedregosa et al. 2012), Seaborn (Waskom et al. 2017), Tensorflow (Abadi et al. 2016), TOPCAT (Taylor 2005), XGBoost (Chen & Guestrin 2016).

APPENDIX

A. FURTHER DESCRIPTION OF SOME VARIABILITY FEATURES

In this section we provide additional descriptions of some of the features listed in Table 2 (those marked with **). These features correspond to new variants of features included in the FATS package and other works:

• Damped Random Walk (DRW) parameters: a DRW model is defined by a stochastic differential equation which includes a damping term that pushes the signal back to its mean: dX(t) = −(1/τ_DRW) X(t) dt + σ_DRW √dt ε(t) + b dt, with τ_DRW, σ_DRW, t > 0, where τ_DRW corresponds to the characteristic time for the time series to become roughly uncorrelated, σ_DRW corresponds to the amplitude of the variability at short timescales (t ≪ τ_DRW), and ε(t) is a white noise process with zero mean and variance equal to 1. DRW modeling is typically used to describe light curves of active galactic nuclei (Kelly et al. 2009). In this case we obtained the σ_DRW and τ_DRW parameters using Gaussian process regression, with an Ornstein-Uhlenbeck kernel, as in Graham et al. (2017). GP_DRW_sigma denotes σ_DRW, while GP_DRW_tau denotes τ_DRW.

• Excess Variance (σ_rms²): a measure of the intrinsic variability amplitude in a given band (see Sánchez et al. 2017, and references therein): σ_rms² = (σ_LC² − σ_m²)/m̄², where σ_LC is the standard deviation of the light curve, σ_m is the average photometric error, and m̄ is the average magnitude. We denote σ_rms² as ExcessVar.

• P_var: the probability that the source is intrinsically variable in a given band (see Paolillo et al. 2004, and references therein). It considers the χ² of the light curve with respect to its mean, and calculates the probability P_var = P(χ²) that a χ² lower than or equal to the observed value could occur by chance for an intrinsically non-variable source, assuming that for each light curve the χ² follows a probability distribution described by an incomplete gamma function Γ(ν/2, χ²/2), where ν corresponds to the degrees of freedom. We denote P_var as Pvar.

• Structure Function (SF) parameters: the SF quantifies the amplitude of the variability as a function of the time difference between pairs of detections (τ). In this work we consider the definition provided by Caplar et al. (2017). We model the SF as a power law: SF(τ) = A_SF (τ / 1 yr)^γ_SF, where γ_SF corresponds to the logarithmic gradient of the change in magnitude and A_SF corresponds to the amplitude of the variability at 1 yr. SF_ML_amplitude denotes A_SF, while SF_ML_gamma denotes γ_SF.

B. ONE-LEVEL MULTI-CLASS RF MODEL

The first model tested for the ALeRCE light curve classifier was a simple one-level RF model with 15 classes, implemented using the imbalanced-learn
Python package. This model uses 500 trees, grown to maximum depth, and a maximum number of features equal to the square root of the total number of features. Figure 17 shows the confusion matrix obtained by this model. The precision, recall, and F1-score obtained are 0.49, 0.68, and 0.50, respectively. Clearly, this model has a lower performance compared to the BHRF classifier.

C. THE PARTICULAR CASE OF RRL

In Section 5.2 we claimed that the low recall values obtained in the g band for bright RRL can be explained by the differences in the variability features of the RRL sub-types. Figure 18 shows the Multiband_period versus the
Meanvariance measured in the g and r bands, for RRL split into their sub-classes 'ab' and 'c', and for E, CEP, and DSCT (grouped as a single class). From the figure we can notice that ab-type RRL tend to have larger Meanvariance in the g band, which helps to distinguish them from E/CEP/DSCT, while c-type RRL have values of Meanvariance in both bands similar to those of E/CEP/DSCT, which makes it difficult to tell them apart. The RRL class in our labeled set is dominated by ab-type RRL ( ∼ g ≤

D. THE PARTICULAR CASE OF AGN, QSO AND BLAZAR

In this work we present the first attempt to separate different types of active galactic nuclei according to their variability properties. As we mentioned in Section 2.2, we separate active galactic nuclei in the following way:

• AGN: type 1 Seyfert galaxies (i.e., active galactic nuclei whose emission is dominated by the host galaxy), selected from MILLIQUAS (broad type "A") and from Oh et al. (2015).

• QSO: type 1 nucleus-dominated active galactic nuclei (i.e., active galactic nuclei whose emission is dominated by their active nuclei), selected from MILLIQUAS (broad type "Q").

• Blazar: BL Lac objects and Flat Spectrum Radio Quasars (FSRQ), selected from ROMABZCAT and MILLIQUAS.
Figure 17.
Confusion matrix for the one-level multi-class RF model. The black squares highlight the three hierarchical classes (from top to bottom: transient, stochastic, and periodic). The reported values correspond to the percentages obtained by normalizing the matrix over the true labels. The performance of this model is poorer than that of the BHRF classifier (see Figure 7).
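As an outline of the one-level baseline described above, the sketch below trains a multi-class RF with the quoted hyperparameters (500 trees, max_features equal to the square root of the number of features). The synthetic three-class feature table is purely illustrative, and scikit-learn's RandomForestClassifier with balanced subsampling stands in for the imbalanced-learn balanced RF used in the paper:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(0)
# Hypothetical stand-in for the real feature table (here: 3 classes, 10 features).
X = rng.normal(size=(600, 10))
y = rng.integers(0, 3, size=600)
X[y == 1] += 2.0  # shift class means so the classes are partly separable
X[y == 2] -= 2.0

# One-level multi-class RF with the hyperparameters quoted in the text.
clf = RandomForestClassifier(
    n_estimators=500,
    max_features="sqrt",
    class_weight="balanced_subsample",  # approximates a balanced RF
    random_state=0,
)
clf.fit(X[:400], y[:400])

# Macro-averaged precision, recall, and F1-score, as reported for Figure 17.
prec, rec, f1, _ = precision_recall_fscore_support(
    y[400:], clf.predict(X[400:]), average="macro"
)
```

The macro average weights every class equally, which is the appropriate summary when, as here, the labeled set is strongly imbalanced across classes.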
Figure 18.
Logarithm of the Multiband period versus the
Meanvariance_1 (g band, left) and the Meanvariance_2 (r band, right), for RRL split according to their sub-classes 'ab' (green) and 'c' (red), and for E, CEP, and DSCT (grey). We show a zoom into the area where most of the RRL lie. The contours show the density of points of each RRL sub-class.
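To make the variability features used throughout these appendices concrete, the following sketch implements Meanvariance, the normalized excess variance, Pvar, and a structure-function power-law fit under common conventions. The function names, the ν = N − 1 choice, and the lag-binning scheme are assumptions for illustration, not the ALeRCE implementation:

```python
import numpy as np
from scipy import stats

def meanvariance(mag):
    # Meanvariance: standard deviation of the light curve over its mean magnitude.
    return np.std(mag, ddof=1) / np.mean(mag)

def excess_variance(mag, err):
    # Normalized excess variance: light-curve variance minus the mean squared
    # photometric error, normalized by the squared mean magnitude.
    return (np.var(mag, ddof=1) - np.mean(err ** 2)) / np.mean(mag) ** 2

def pvar(mag, err):
    # Pvar: probability that a chi^2 lower than or equal to the observed one
    # occurs by chance for a non-variable source, i.e. the chi^2 CDF with
    # nu = N - 1 degrees of freedom (an incomplete gamma Gamma(nu/2, chi^2/2)).
    chi2 = np.sum((mag - np.mean(mag)) ** 2 / err ** 2)
    return stats.chi2.cdf(chi2, df=len(mag) - 1)

def structure_function_fit(t, mag, n_bins=10):
    # SF(tau) = A_SF * (tau / 1 yr)^gamma_SF, fit by least squares in log
    # space after binning the magnitude differences of all detection pairs
    # by their time lag (a rough sketch of the Caplar et al. 2017 approach).
    i, j = np.triu_indices(len(t), k=1)
    tau = t[j] - t[i]                       # time lags in days
    dmag = np.abs(mag[j] - mag[i])
    edges = np.logspace(np.log10(tau.min()), np.log10(tau.max()), n_bins + 1)
    idx = np.clip(np.digitize(tau, edges) - 1, 0, n_bins - 1)
    lag = [tau[idx == b].mean() for b in range(n_bins) if np.any(idx == b)]
    amp = [dmag[idx == b].mean() for b in range(n_bins) if np.any(idx == b)]
    gamma_sf, log_a = np.polyfit(np.log10(np.array(lag) / 365.25),
                                 np.log10(amp), 1)
    return 10 ** log_a, gamma_sf            # (A_SF, gamma_SF)
```

A strongly variable light curve yields a Pvar close to 1 and a clearly positive excess variance, while pure photometric noise yields an excess variance consistent with zero.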
Figure 19.
Similar to Figures 10 and 11, but treating AGNs, QSOs, and Blazars as a single class. When grouped together, the recall is much better behaved than for any individual active galactic nucleus class, highlighting the remaining difficulty of distinguishing between these classes.
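The grouped-class recall curves of Figure 19 can be reproduced in outline by merging the three labels into one class before computing recall in each magnitude bin. The helper below is a sketch; its name, signature, and binning are illustrative, not the ALeRCE code:

```python
import numpy as np

def recall_per_mag_bin(y_true, y_pred, mag, bin_edges, group):
    """Recall of a merged class (`group`) in bins of average magnitude."""
    recalls = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # True members of the merged class falling in this magnitude bin.
        sel = (mag >= lo) & (mag < hi) & np.isin(y_true, group)
        if not sel.any():
            recalls.append(np.nan)  # no true members in this bin
        else:
            # A prediction counts as correct if it lands anywhere in the group.
            recalls.append(float(np.mean(np.isin(y_pred[sel], group))))
    return np.array(recalls)
```

With this definition, a QSO predicted as AGN still counts as a correct grouped-class prediction, which is exactly why the merged recall in Figure 19 is better behaved than the per-class curves of Figures 10 and 11.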
In Figure 11 we showed the recall curves for each class as a function of the number of detections, and we observed that these curves decrease for QSOs and AGNs when more detections are available, particularly in the r band. These results are puzzling, considering that we would normally expect the classification of a given class to improve as more detections are included in the light curves. In order to better understand the origin of these results, we show in Figure 19 the recall curves as a function of average magnitude and number of detections for AGNs, QSOs, and Blazars grouped as a single class. In this case, the recall curves lie between 0.8 and 1.0 for every bin of magnitude (especially for g > 16 or r > 16) and number of detections. From this we can infer that the light curve classifier performs very well when selecting active galactic nuclei as a single class, but some issues remain regarding the separation of AGNs, QSOs, and Blazars. There are two main explanations for these results: a) the method cannot properly separate QSOs from AGNs and Blazars, or b) there are sources in the labeled set with incorrect labels.

A possible way to explore how well the light curve classifier can discriminate among AGNs, QSOs, and Blazars is to check whether the features available in the classifier can separate these three populations. Figure 20 shows six different features used by the classifier, (g-r) mean_corr, Meanvariance_1, ExcessVar_1, sgscore1, Meanvariance_2, and
ExcessVar_2, for the QSOs, AGNs, and Blazars from the labeled set (most of these features are in the top 30-ranked features shown in Table 6). From the figure we can see that these three classes have different color distributions, different morphologies, and different variability properties. AGNs and Blazars tend to be redder than QSOs (see (g-r) mean_corr), Blazars and AGNs tend to have larger amplitudes (see Meanvariance_1 and ExcessVar_1), and AGNs tend to have more extended morphologies compared to QSOs and Blazars. These are just some examples of features that can be used to separate the three classes mentioned above. After a visual inspection of the feature distributions of AGNs, QSOs, and Blazars, we found that more than 30 features can be used to separate them, including for instance
PercentileAmplitude, Q31, GP_DRW_sigma, GP_DRW_tau, MHPS_low, MHPS_high, and SF_ML_amplitude, among others. From this we can infer that the light curve classifier should be able to separate these three populations properly.

In addition, from Figure 20 we can notice that, in the case of AGNs and QSOs, the distribution of features obtained from the g band light curves presents larger differences compared to the features obtained from the r band. This behavior is also seen in other features, such as PercentileAmplitude, GP_DRW_sigma, and Std, among other features related to the amplitude of the variability. These differences might be produced by the combined effect of higher contamination from the host galaxy in the r band and an intrinsically lower variability amplitude in the r band, due to the well known anti-correlation between variability amplitude and emission wavelength (see Sánchez et al. 2017 and references therein). From this, we can understand the differences observed in the g and r band recall curves of AGNs and QSOs shown in Figure 11, which should be produced by these differences in the feature distributions.

On the other hand, MILLIQUAS uses a luminosity-based divider to separate AGNs and QSOs (see Section 5 of Flesch 2015), and therefore we expect that a fraction of sources could have an incorrect classification if the host galaxy component is not corrected properly. In order to see whether there are sources in the labeled set with an incorrect QSO or AGN classification, we crossmatched our labeled set with the SIMBAD database. There are 26168 QSOs in the labeled set (all obtained from MILLIQUAS), and 1590 of them (6%) are classified as Seyfert or AGN by SIMBAD, with 1580 having a reported redshift < 1.

Figure 20. Distribution of (g-r) mean_corr, Meanvariance_1, ExcessVar_1, sgscore1, Meanvariance_2, and ExcessVar_2, for the QSO (blue), AGN (red), and Blazar (yellow) classes from the labeled set. These features (among others) can be used to separate these classes of active galactic nuclei.

On the other hand, there are 4667 AGNs in the labeled set, and 830 (17%) of them are classified as QSO in SIMBAD. Therefore, there are 2420 sources in the labeled set with inconsistent classifications in MILLIQUAS and SIMBAD. The light curve classifier classifies 920 of these sources as QSO and 1319 as AGN. To understand more properly the differences between these AGN and QSO candidates, we plot in Figure 21 the same features plotted in Figure 20 for the sources with inconsistent classifications in MILLIQUAS and SIMBAD, splitting them according to their predicted class from the BHRF model. The distribution of features of these AGN and QSO candidates is similar to the general distributions observed for AGNs and QSOs from the labeled set. Again, AGNs tend to have higher variability amplitudes, redder colors, and extended morphologies.

From these results, we can infer that the decrease in the recall as a function of the number of detections observed for QSOs and AGNs is produced by the discrepancy in the QSO/AGN classification obtained from different catalogs. When more epochs are available, it is easier for the light curve classifier to perform a correct variability-based classification, and therefore to identify an originally incorrect classification provided by a given catalog. We propose that by using the variability properties of active galactic nuclei we can more easily separate them as nucleus-dominated or host-dominated, or in other words, as regular QSOs or low-luminosity AGNs.

E. SKY DISTRIBUTION OF THE EXTRAGALACTIC CANDIDATES

Figure 22 shows the sky distribution (in Galactic coordinates) of the extragalactic candidates (QSO, AGN, Blazar, SNIa, SNIbc, SNII, and SLSN). As expected, only a few extragalactic candidates are observed in the Galactic plane, which is confirmed in Figure 22.

REFERENCES
Abadi, M., Barham, P., Chen, J., et al. 2016, in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI)

Figure 21. Distribution of (g-r) mean_corr, Meanvariance_1, ExcessVar_1, sgscore1, Meanvariance_2, and ExcessVar_2, for sources from the labeled set with discrepancies in their classification according to MILLIQUAS and SIMBAD, classified as AGN (red) and QSO (blue) by the light curve classifier. The distribution of features is similar to that presented in Figure 20 for QSOs and AGNs.

Aguirre, C., Pichara, K., & Becker, I. 2019, MNRAS, 482, 5078, doi: 10.1093/mnras/sty2836
Allevato, V., Paolillo, M., Papadakis, I., & Pinto, C. 2013, ApJ, 771, 9, doi: 10.1088/0004-637X/771/1/9
Arévalo, P., Churazov, E., Zhuravleva, I., Hernández-Monteagudo, C., & Revnivtsev, M. 2012, MNRAS, 426, 1793, doi: 10.1111/j.1365-2966.2012.21789.x
Arnett, D. 2008, in American Institute of Physics Conference Series, Vol. 1053, ed. S. K. Chakrabarti & A. S. Majumdar, 237–242, doi: 10.1063/1.3009489
Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33, doi: 10.1051/0004-6361/201322068
Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123, doi: 10.3847/1538-3881/aabc4f
Baldeschi, A., Miller, A., Stroh, M., Margutti, R., & Coppejans, D. L. 2020, arXiv e-prints, arXiv:2005.00155
Becker, I., Pichara, K., Catelan, M., et al. 2020, MNRAS, 493, 2981, doi: 10.1093/mnras/staa350
Bellm, E. 2014, in The Third Hot-wiring the Transient Universe Workshop, ed. P. R. Wozniak, M. J. Graham, A. A. Mahabal, & R. Seaman, 27–33, arXiv:1410.8185
Bellm, E. C., Kulkarni, S. R., Graham, M. J., et al. 2019, PASP, 131, 018002, doi: 10.1088/1538-3873/aaecbe
Ben-Israel, A., & Greville, T. N. E. 2003, Generalized Inverses: Theory and Applications, ISBN 0-387-00293-6, doi: 10.1007/b97366
Boone, K. 2019, AJ, 158, 257, doi: 10.3847/1538-3881/ab5182
Brahem, M., Zeitouni, K., & L., Y. 2020, ApJ
Breiman, L. 2001, Machine Learning, 45, 5, doi: 10.1023/A:1010933404324
Butler, N. R., & Bloom, J. S.
2011, AJ, 141, 93, doi: 10.1088/0004-6256/141/3/93
Calzetti, D., Armus, L., Bohlin, R. C., et al. 2000, ApJ, 533, 682, doi: 10.1086/308692
Figure 22. Sky distribution, in Galactic coordinates, of the extragalactic candidates (QSO, AGN, Blazar, SNIa, SNIbc, SNII, and SLSN).