The ZTF Source Classification Project: I. Methods and Infrastructure
Jan van Roestel, Dmitry A. Duev, Ashish A. Mahabal, Michael W. Coughlin, Przemek Mróz, Kevin Burdge, Andrew Drake, Matthew J. Graham, Lynne Hillenbrand, Eric C. Bellm, Alexandre Delacroix, C. Fremling, V. Zach Golkhou, David Hale, Russ R. Laher, Frank J. Masci, Reed Riddle, Philippe Rosnet, Ben Rusholme, Roger Smith, Maayane T. Soumagnac, Richard Walters, Thomas A. Prince, and S. R. Kulkarni
Draft version February 24, 2021
Typeset using LaTeX twocolumn style in AASTeX63
Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA
School of Physics and Astronomy, University of Minnesota, Minneapolis, Minnesota 55455, USA
DIRAC Institute, Department of Astronomy, University of Washington, 3910 15th Avenue NE, Seattle, WA 98195, USA
Caltech Optical Observatories, California Institute of Technology, Pasadena, CA 91125, USA
Cahill Center for Astrophysics, California Institute of Technology, MC 249-17, 1200 E California Boulevard, Pasadena, CA 91125, USA
IPAC, California Institute of Technology, 1200 E. California Blvd, Pasadena, CA 91125, USA
Université Clermont Auvergne, CNRS/IN2P3, LPC, Clermont-Ferrand, France
Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
Department of Particle Physics and Astrophysics, Weizmann Institute of Science, Rehovot 76100, Israel

(Received August 5, 2020; Accepted February 24, 2021)
Submitted to AJ

ABSTRACT

The Zwicky Transient Facility (ZTF) has been observing the entire northern sky since the start of 2018 down to a magnitude of 20.5 (5σ for 30 s exposure) in g, r, and i filters. Over the course of two years, ZTF has obtained light curves of more than a billion sources, each with 50-1000 epochs per light curve in g and r, and fewer in i. To be able to use the information contained in the light curves of variable sources for new scientific discoveries, an efficient and flexible framework is needed to classify them. In this paper, we introduce the methods and infrastructure which will be used to classify all ZTF light curves. Our approach aims to be flexible and modular and allows the use of a dynamical classification scheme and labels, continuously evolving training sets, and the use of different machine learning classifier types and architectures. With this setup, we are able to continuously update and improve the classification of ZTF light curves as new data becomes available, training samples are updated, and new classes need to be incorporated.

Keywords: editorials, notices — miscellaneous — catalogs — surveys

1. INTRODUCTION

Astronomy, like many other branches of science, has been experiencing an explosive increase in data volumes, which are doubling roughly every two years. This revolution has driven a renaissance in many areas of astronomy, most notably in the time domain. At some level, all astronomical sources exhibit changes in their
Corresponding author: Jan van Roestel
[email protected]

brightness with time, driven by a myriad of different phenomena. The study of source variability has benefited greatly from the data deluge, providing insight into a broad range of astrophysical processes and phenomena.

The light curves of variable objects contain information about the nature of the objects and the physical processes that are responsible for the observed changes. Variable objects are a key tool in astrophysics and are the main science driver in many fields. While it is nearly impossible to list all their astrophysical applications, variable stars have been used as distance indicators (e.g.,
Pietrzyński et al. 2013, 2019; Riess et al. 2018), tracers of the structure and kinematics of the Milky Way and nearby galaxies (e.g., Skowron et al. 2019; Chen et al. 2019; Jacyszyn-Dobrzeniecka et al. 2016, 2017), or tracers of the chemical evolution of galaxies (e.g., Genovali et al. 2015). Studying stellar variability also helps us to understand the evolution and physics of stars themselves – detailed modeling of eclipses has enabled precise measurements of masses and radii of all types of stars (e.g., Torres et al. 2010), asteroseismology is being used to great effect to study the interior structure of stars (e.g., Aerts 2019), and the irregular variability of cataclysmic variables (CV), young stellar objects (YSO), and active galactic nuclei (AGN) offers insight into accretion physics on all scales (Scaringi et al. 2015).

The necessary first step in enabling all these applications is to identify variable sources and classify them into known object types while simultaneously looking for new classes.

The astronomical community has extensive experience in dealing with large samples of light curve data, enabled by survey telescope automation and advances in both camera technology and data processing and analysis techniques. Notable examples of large-scale surveys are: the All Sky Automated Survey (ASAS; Pojmanski 1997), the All Sky Automated Survey for Supernovae (ASAS-SN; Shappee et al. 2014), the Asteroid Terrestrial-impact Last Alert System (ATLAS; Tonry et al. 2018), the Catalina Real-Time Transient Survey (CRTS; Drake et al. 2014), EROS (Tisserand et al. 2007), Gaia (Gaia Collaboration et al. 2016), MACHO (Alcock et al. 2000), the Northern Sky Variability Survey (NSVS; Woźniak et al. 2004b), the Optical Gravitational Lensing Experiment (OGLE; Udalski 2003; Udalski et al. 2015), Pan-STARRS1 (Chambers et al. 2016), and the VISTA Variables in the Via Lactea (VVV; Minniti et al.
2010). To deal with the massive amount of data involved, these projects usually employ machine learning (ML) techniques to detect and classify variable sources (e.g., Woźniak et al. 2004a; Debosscher et al. 2007; Kim et al. 2011, 2014; Palaversa et al. 2013; Masci et al. 2014; Armstrong et al. 2016; Heinze et al. 2018; Holl et al. 2018; Jayasinghe et al. 2019, 2020). However, a more traditional approach – with the light curves vetted by a human expert – also proves to be successful (e.g., Drake et al. 2014; Soszyński et al. 2014, 2015, 2016a,b; Udalski et al. 2018). To date, over a million variable stars have been detected and classified, the majority of which were found by OGLE (Soszyński 2018).

Astronomical light curve data are typically sparsely and unevenly sampled, incomplete, heteroskedastic, and come with many different biases. It is challenging to apply standard time series processing and analysis techniques developed in other areas to such data.

A common approach to classification is to first compute a set of summary statistics (features), such as the mean or median flux, interquartile range (IQR), von Neumann ratio, period(s)/amplitude(s), etc. These features encode the light curves (with different cadences and numbers of epochs) as a vector of finite length, which allows for direct comparison of objects. Debosscher et al. (2007); Bloom et al. (2012); Nun et al. (2015); Kim & Bailer-Jones (2016) use those to classify the objects. This task can be done by humans (often by inspecting only two features at a time), but the scale of the problem essentially forces one to use machine learning methods.

At the forefront of the revolution in time-domain astronomy, the Zwicky Transient Facility (ZTF) project uses the 48-inch (1.2 meter) Samuel Oschin Schmidt telescope at Palomar Observatory in Southern California to observe the sky every night. Science observations began on March 17th, 2018 (Graham et al. 2019; Bellm et al. 2019).
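The feature-vector encoding described above can be sketched as follows. This is an illustrative sketch, not the ZTF feature pipeline (the full feature set is much larger; see Coughlin et al. 2020a), and the function name and the small feature subset shown here are our own.

```python
# Sketch: map a magnitude series of arbitrary length to a fixed-size
# vector of summary statistics, as in feature-based classification.
from statistics import median

def light_curve_features(mags):
    """Compute a few common light-curve summary statistics."""
    m = sorted(mags)
    n = len(m)
    # interquartile range: a spread measure robust to outliers
    q1 = m[n // 4]
    q3 = m[(3 * n) // 4]
    iqr = q3 - q1
    # von Neumann ratio: mean squared successive difference over the
    # (population) variance; small values indicate smooth variability
    mean = sum(mags) / n
    var = sum((x - mean) ** 2 for x in mags) / n
    eta = sum((mags[i + 1] - mags[i]) ** 2
              for i in range(n - 1)) / (n - 1) / var
    return {"median": median(m), "iqr": iqr, "von_neumann": eta}
```

Because every light curve, regardless of cadence or number of epochs, is reduced to the same fixed-length vector, objects become directly comparable inputs for a classifier.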
The median magnitude limit is 20.5 in the r band for a nominal 30-second exposure time (5σ detection). ZTF has been performing frequent accurate measurements of more than a billion astronomical objects observable from Palomar Observatory (declination > −31°).

ZTF light curves of variable stars have already been used to make exciting discoveries by using targeted searches. Searching for very short period variability, Burdge et al. (2019) discovered one of the shortest period binary systems known, with a period of just 7 minutes. ZTF light curves have also been used to discover new types of variable stars: Kupfer et al. (2020a) discovered a new type of compact object binary, and Kupfer et al. (2020b) found a new type of pulsating star. Vanderbosch et al. (2019) discovered a white dwarf with exocomets, the second such system known. ZTF has also been used to discover large numbers of outbursting or flaring objects: cataclysmic variables and microlensing events (Szkody et al. 2020; Mróz et al. 2020).

In this paper, we present the framework designed by the ZTF project to identify and classify variable objects in all ZTF data. Section 2 describes the ZTF light curve data, including pre-processing and feature extraction. In Section 3, we introduce the classification scheme adopted for ZTF and describe the ML algorithms used therein. In Section 4, we present the active learning approach to labeled data set assembly and classifier training. The performance of the resulting classifiers is discussed in Section 5.

2. ZTF LIGHT CURVES AND PRE-PROCESSING

2.1. ZTF light curves
The ZTF camera uses 16 separate 6k × 6k CCD detectors and has a total field of view of 47 square degrees with a pixel size of 1.01″ (Bellm et al. 2019). ZTF pointings are organized in two grids with rows of equal declination to cover the entire northern sky, ranging from declination −31° to the northern celestial pole. The primary grid uses 637 pointings of slightly overlapping "fields", covering 88% of the observable sky. The remaining area falls in the gaps between CCD detectors. To cover this missing area, a secondary grid (897 pointings), offset in both right ascension and declination from the primary grid, is used.

Several surveys are carried out by ZTF which use different filters, cadences, sky areas, and exposure times. The main public survey (40% of the time) is an all-sky survey in g and r with a cadence of 3 days (Bellm et al. 2019). Smaller, dedicated surveys are carried out by the ZTF partnership (40%) and in time available to Caltech (20%). The largest survey is the supernova survey, with six observations per night in the g and r bands (deep-drilling is only done in the r band), but a small fraction of the observations are in the i band. Most of the surveys focus on observing fields from the primary grid.

At the time of writing, the median (min, max) numbers of epochs for primary grid fields for all surveys combined are 184 (26, 1079) in g, 338 (23, 1263) in r, and 23 (1, 165) in i. This broad range in the number of epochs per field is partially due to observability (lower declination fields tend to have fewer epochs), but mostly because of the smaller surveys, which tend to accumulate many epochs for small sets of fields. The median number of epochs per field in the secondary grid is much lower (median < 50 for all filters). The low number of epochs makes classification challenging, and we did not include them in this study but will do so in future work.

All ZTF images are processed and data products are automatically generated. This includes light curves of all persistent sources in the science images, which are the main data product for this work. Here, we summarise the process; for a full description, see Masci et al. (2019). First, reference source lists are generated by running a source finding algorithm on reference images. Reference images are constructed by combining at least 15 images of good quality. When new science images are available,
SExtractor (Bertin & Arnouts 1996) is applied, and sources within 1.5″ of a reference source are linked to that reference source to construct a light curve. Each filter and each CCD-quadrant (each CCD has four readout channels) per ZTF pointing is processed completely separately from all other data. This means that a single astrophysical object will have multiple ZTF light curves, one for each filter, and if it occurs in multiple ZTF fields (which can happen in the overlap between fields or between the primary and secondary grids), it will have multiple light curves even for the same filter.

In this work, we use the individual light curves as the basis for our classification; see for example Fig. 1. While combining light curves potentially allows for better classification, we choose to classify the individual light curves instead of combining them. We do this for several reasons. First, the large field of view of ZTF makes perfect absolute calibration of light curves difficult. Combining light curves with small but significant calibration differences will introduce spurious variability. Second, image artifacts (ghosts on the CCD, bad pixels, etc.) are position-dependent and typically only affect one light curve of an object. Keeping the light curves separate allows the objects to be classified using the unaffected light curves. An additional motivation to not combine light curves is that not all objects have light curves in the different band-passes, especially faint and/or red objects. Classifying only single-band light curves allows for a more uniform classification. Note that we do inspect all light curves simultaneously when labeling, and light curves of the same object share the same label.

Besides the light curves, images are also processed by a difference image pipeline, and any source more significant than 5σ on the difference images (positive or negative) is reported as an alert (Masci et al. 2018).
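The positional linking step described above can be illustrated with a toy nearest-neighbour matcher. This is not the ZTF pipeline code: all names are ours, a brute-force search is used for clarity, and a small-angle approximation (with the cos(dec) correction) stands in for a proper spherical match.

```python
# Illustrative sketch: link new detections to the nearest reference
# source within a matching radius (1.5" in the text).
import math

MATCH_RADIUS_ARCSEC = 1.5

def link_detections(ref_positions, detections):
    """Return, per detection, the index of the matched reference
    source, or None if nothing lies within the matching radius.
    Positions are (ra, dec) pairs in degrees."""
    links = []
    for ra, dec in detections:
        best, best_sep = None, None
        for i, (rra, rdec) in enumerate(ref_positions):
            # angular separation in arcsec, small-angle approximation
            dra = (ra - rra) * math.cos(math.radians(dec)) * 3600.0
            ddec = (dec - rdec) * 3600.0
            sep = math.hypot(dra, ddec)
            if best_sep is None or sep < best_sep:
                best, best_sep = i, sep
        ok = best_sep is not None and best_sep <= MATCH_RADIUS_ARCSEC
        links.append(best if ok else None)
    return links
```

At survey scale a spatial index (e.g. a kd-tree or HEALPix binning) would replace the brute-force loop, but the linking logic is the same.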
While this pipeline is mainly designed to study transients and moving objects, variable sources also generate alerts. While, in principle, the alerts do not contain new information, the separate pipeline allows for a consistency check, which is useful to identify image and processing artifacts. We therefore include some information from the alert pipeline in our analysis.

2.2. Light curve pre-processing
Two main approaches to the light curve classification problem have been employed by the community, differing by what is fed into a machine learning system: either pre-computed features (e.g. Blomme et al. 2010;
Figure 1.
Example light curves from ZTF. The left column shows two periodic variables, an RR Lyrae and an eclipsing binary. The right column shows an outbursting cataclysmic variable and an irregularly variable young stellar object. Note that we classify the individual light curves for objects.
Richards et al. 2012) or the light curves directly (Naul et al. 2017; Muthukrishna et al. 2019; Jamal & Bloom 2020).

The first approach inevitably causes certain information loss, even though the computed features provide a powerful and standardized insight into the raw data. The choice of such features, referred to as "feature engineering" in the ML world, is a highly non-trivial problem on its own.

The recent success of techniques that use artificial many-layer neural networks (deep learning, DL; McCulloch & Pitts 1943) is in large part attributed to the ability of such systems to discover and extract relevant features directly from the data. DL systems frequently outperform more traditional approaches; however, it is challenging to apply those techniques to astronomical data due to the intrinsic characteristics of the data discussed above (Naul et al. 2017).

In this work, we employ a hybrid approach to retain the advantages of both methods. We rely on the light curve features while simultaneously striving to preserve more information contained in the time series by using a two-dimensional second-order mapping of the light curves based on the changes in magnitude (dm) over the available time differences (dt) (Mahabal et al. 2017).

2.2.1. Calculation of light curve features
For each light curve, we calculate a number of simple statistics, determine the best period and significance using a period-finding algorithm, and evaluate features that are the result of fitting the phase-folded light curve with a multi-harmonic sinusoid. Here, we briefly summarise the procedure; for full details, please see Coughlin et al. (2020a). First, we remove any light curve epochs which are flagged as taken in bad conditions, which is about 6% of all data. We also skip any object within an empirically chosen radius of 13″ of "bright stars," taken to be stars in Gaia (Gaia Collaboration 2018) brighter than 13th magnitude or any object in the Yale Bright Star Catalog (Hoffleit & Jaschek 1991). Early experiments showed that the feature values are strongly affected by the presence of deep-drilling data in the light curves. E.g., deep-drilling observations of long-period variables at one particular phase of the light curve significantly skew many of the light curve statistics. Because this would severely limit the use of the features, we decided to mask deep-drilling data when calculating the
χ², inter-percentile ranges). These features allowed us to better assess the quality of the light curves.

2.2.2. Period finding and Fourier features
The period finding strategy relies on a hierarchical technique, where two fast algorithms, conditional entropy (CE; Graham et al. 2013) and Lomb-Scargle (LS; Lomb 1976; Scargle 1982), are used to identify high-significance candidate periods, which are then passed to a slower, more comprehensive algorithm, multi-harmonic analysis of variance (AOV; Schwarzenberg-Czerny 1998). Our fast algorithms are implemented on Graphics Processing Units (GPU) in CUDA; the specific implementation of CE can be found in Katz et al. 2020 (https://github.com/mikekatz04/gce), and the LS implementation can be found at https://github.com/johnh2o2/cuvarbase. A CPU-based AOV is then applied to the top 50 frequencies identified by each of the algorithms to identify the best period. We again masked any deep-drilling data as it strongly affects the period-finding performance.

Once the best period has been identified, we fit the light curve with a simple model that combines an offset and slope with a series of sinusoids using that period. The model is described by:

$M(t) = st + c + \sum_{n=1}^{5} \left[ a_n \sin(2\pi n t / P) + b_n \cos(2\pi n t / P) \right]$ (1)

The parameters $a_n$ and $b_n$ are converted to amplitudes and phases, and the amplitudes and phases of the harmonics are normalized to the amplitude of the first harmonic. To determine the goodness-of-fit of this model, we use the Bayesian Information Criterion (BIC) value (Schwarz 1978). The number of harmonics used is determined by the lowest BIC value.

2.2.3. Magnitude-time histograms – 'dmdt'
As additional input for the deep-learning-based classifiers, we calculate a 2D histogram from all pairs of magnitude and time differences (dm and dt, respectively). This method encodes the one-dimensional light curves of various lengths into a two-dimensional array of fixed dimensions (an image), which is much easier for a classifier to interpret (Mahabal et al. 2017). We use 26 approximately logarithmically spaced time bins and 26 magnitude bins, approximately logarithmic in both positive and negative magnitude differences. We did include deep-drilling data in the calculation of the histograms, as the high cadence data would fall mostly in the low-dt bins that are not populated by the rest of the data points.

2.3. External data
In addition to the data based purely on the ZTF light curves, some of our classifiers use data extracted from external catalogs. We spatially cross-matched all of the ZTF objects with the AllWISE (Wright et al. 2010), Gaia DR2 (Gaia Collaboration et al. 2018), and Pan-STARRS1 DR1 (Chambers et al. 2016) catalogs using a match radius of 2″ and extracted the following data (and a catalog ID) for the closest corresponding object within that radius:

• Gaia DR2: the G, BP, and RP magnitudes, and the parallax and proper motion with their associated uncertainties

• Pan-STARRS1 DR1: the grizy magnitudes with their uncertainties

• AllWISE: the W1, W2, W
3, and W4 magnitudes

2.4. Data storage and access
Efficient data storage and access, given the data set size, represent a substantial problem. We solved it by employing
Kowalski (https://github.com/dmitryduev/kowalski), an open-source system used internally at Caltech to store the ZTF alert and light curve data together with external catalogs and access those through a standardized API (Duev et al. 2019a). We used Kowalski to efficiently feed the feature computation pipeline (Coughlin et al. 2020a) with the ZTF light curve data and store the results. Additionally, the (versioned) classifier predictions have been stored in a dedicated database that fed the active learning process described in Section 4.

3. CLASSIFICATION SCHEME

Astronomical ground-based light curve data are usually sparse, unevenly sampled, and heteroskedastic, and ZTF is no exception to this general rule. A variable object classification framework must tackle these challenges. First of all, the input image data used to generate the light curves are affected by a broad range of factors such as the weather, the observability of fields, and the cadences of different sub-surveys within ZTF. In addition, the accuracy of the photometry decreases for fainter objects. The result is that objects belonging to the same class will have different noise levels and appear different to the classifier.

The second problem is that for some types of variable objects, a light curve in a single filter is insufficient for correct classification. Frequently, additional observations are needed in a different bandpass, either optical, infrared, radio, or high-energy. In some cases even this is insufficient, and the intrinsic luminosity must be known (e.g. by using the distance from, say, the Gaia parallax). Evidently, these data are not always available for all ZTF sources, making the external data to be used by the classification framework inhomogeneous and potentially biased towards specific object subsets.

Further, there are several challenges specific to particular classes of variable objects.
Some types of objects are more abundant than others, so that one frequently has to deal with very imbalanced data sets, with class examples ranging from hundreds of thousands down to just a handful. In addition, the source taxonomy (i.e., the scheme of classification) adds to the challenge, as classes can be overlapping. For example, an accreting white dwarf – red dwarf binary (a cataclysmic variable) can be both outbursting (e.g., a dwarf nova) and eclipsing, or a pulsating star (Cepheid) can be in an eclipsing binary system (e.g., Pietrzyński et al. 2010). This is often caused by the class definitions being a mix of phenomenological and "ontological" (or intrinsic) characteristics of sources.

To tackle these challenges, we employ a hierarchical approach to classification and use a set of independent binary classifiers, each of which categorizes the input data set into two groups (e.g., whether or not an object belongs to some class A).

The main advantage of this approach is significantly greater flexibility as compared to the typically used multi-class classifiers, where an object is assumed to have a single correct label of many, or multi-label classifiers, where a single system outputs probabilistic predictions of object class membership for multiple classes at once. If the performance on a particular class is deemed insufficient, retraining the classifier with new training data (or employing a different architecture) does not affect the system performance on other classes. Adding new types of variable objects is straightforward and also does not affect other classifiers.
As ZTF continues operations and the temporal baseline and number of epochs increase, new types of variables become detectable, which only requires new classifiers to be added instead of having to rebuild an entire multi-class or multi-label classifier.

Another advantage is more flexibility for the end-user. Depending on the (astrophysical) class and the scientific goal, requirements for completeness and purity can be very different. This trade-off is easier to interpret when using binary classifiers. For example, even though our classifiers are completely independent, they are conceptually organized in a hierarchy so that the "upstream" classifiers (high-level classes encompassing a broader range of objects, which are typically trained on larger collections) may be used to increase the sample purity for the "downstream" classifiers.

If a classifier is trained on two specific types, the results can be erratic when it is confronted with out-of-distribution objects. The binary classifiers we are using – those that separate their inputs into a given type versus everything else – help alleviate such a problem, making them more robust by allowing the user to impose thresholds along multiple dimensions simultaneously.

Finally, our approach implicitly allows for anomaly detection (for example, the user can select all light curves marked as variable, flaring, and periodic, and not belonging to any other class with high confidence).

These benefits come at a price: the main disadvantage of our approach is that it is computationally expensive, both at training and at inference: one would need to train, tune, evaluate, and then use for inference a large number of models instead of a single one.

We organize our labels/classes and the corresponding classifiers into two conceptual groups – phenomenological and ontological (see Fig. 2).
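The independent one-vs-rest organization described above can be sketched as follows. The "classifier" used here is a deliberately simple stand-in (a nearest-centroid scorer), not the DNN or gradient-boosted tree models of Section 3.1, and all class and function names are illustrative.

```python
# Sketch: a bank of independent binary classifiers, one per label.
# Each label can be added or retrained without touching the others.
import math

class CentroidBinary:
    """Toy binary scorer: logistic of the margin between the distances
    to the positive- and negative-class centroids in feature space."""
    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.c1 = [sum(col) / len(pos) for col in zip(*pos)]
        self.c0 = [sum(col) / len(neg) for col in zip(*neg)]
        return self
    def predict_proba(self, x):
        d0 = math.dist(x, self.c0)
        d1 = math.dist(x, self.c1)
        # closer to the positive centroid -> score near 1
        return 1.0 / (1.0 + math.exp(d1 - d0))

class BinaryClassifierBank:
    """One independent binary model per label (one-vs-rest)."""
    def __init__(self):
        self.models = {}
    def train_label(self, label, X, y):
        # retraining one label leaves all other models untouched
        self.models[label] = CentroidBinary().fit(X, y)
    def scores(self, x):
        return {label: m.predict_proba(x)
                for label, m in self.models.items()}
```

Because each label's score is produced independently, an end-user can impose separate thresholds per label, and an object scoring low on every label is a natural anomaly candidate.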
The classifiers of the first group characterize each ZTF object according to the phenomenological properties of the corresponding ZTF light curve, e.g. is the object variable, periodic, flaring, eclipsing, etc. The classifiers may act as high-level filters, allowing the end-users to efficiently identify objects of interest without imposing a detailed classification scheme. The aim of the "phenomenological" classifiers is to be as complete and unbiased as possible. Therefore, these classifiers do not use any external data for the classification, to avoid biases and enable independent analyses.

Our second group of classifiers – "ontological" – is geared towards the categorization of specific types of

(Completeness is quantified by recall, or true positive rate, i.e. how many relevant items are selected by the classifier; purity is quantified by precision, i.e. how many items selected by the classifier are relevant. Figure 2 was generated using the tdtax library, https://github.com/profjsb/timedomain-taxonomy.)
Figure 2.
Conceptual hierarchical classification tree of independent binary labels/classifiers used in this work. The filled circles indicate labels for which classifiers were trained.

variable objects based on as much information as is available for a particular object. The result can then be used to easily obtain a large, pure sample of that particular type of variable. Alternately, by also including lower-scoring examples, one can use the result as the input for specialized pipelines to discover new sources (e.g. fitting eclipsing binary light curves with binary star models). We note that these classifiers use features from external catalogs in addition to ZTF data and are therefore prone to non-ZTF-specific biases.

3.1. Machine learning algorithms
To automatically classify all ZTF light curves, we use supervised machine learning algorithms. Supervised machine learning algorithms "learn" mappings between the input and the output spaces from a training set (for which both the input and output are known). This is achieved by solving an optimization problem of minimizing a loss function that quantifies the gap between prediction and ground truth. How this mapping is constructed depends on the machine learning algorithm and can be tuned by changing the values of "hyperparameters". In this work, we use two different types of supervised machine learning methods.

In the first case, we employ deep learning methods, referred to hereafter as deep neural networks (DNN). DNNs are universal function approximators that can learn arbitrary mappings between the input and the output spaces. The network's output is produced using multiple simple non-linear transformations organized in interconnected "layers". The networks are typically "trained" by alternating forward and backward passes – computing a prediction and then updating the trainable transformation parameters (weights and biases) to decrease the loss function. Neural networks are extremely flexible, with the number of layers and the number of nodes per layer as some of the most important hyperparameters.

The second type of classifier is the gradient boosted decision tree classifier (Friedman 2001), implemented in
XGBoost (Chen & Guestrin 2016). This type of classifier is based on a series of decision trees used as weak learners. They have real-valued outputs that can be added together and used to implement splits. The trees are gradually grown, with the additions being weighted such that the classifier performance improves on the earlier values. The growth is carried out in a greedy fashion, based on purity scores and minimization of the loss function. The thresholds for the accumulating values, the number of trees, etc., can be used as hyperparameters, making this method extremely adaptable and general. As in random forests (Ho 1995), random subsets of the features and the data are used per iteration.

4. DATA SET ASSEMBLY AND CLASSIFIER TRAINING

As we noted above, labeled light curves from a multitude of previous and current surveys are available. However, we decided not to blindly use those, because that would inevitably introduce survey-specific biases. Instead, we employed an active-learning approach of alternating between data labeling and classifier training, with subsequent sampling of their predictions, both confident ones and those near the decision boundary.

To streamline data labeling, we have built a dedicated extension of the ZTF Variable Marshal (https://github.com/dmitryduev/ztf-variable-marshal), an open-source web application for interactive exploration, analysis, and annotation of the ZTF variable sources (see Fig. 3). The API-driven interface displays the ZTF light curves for each filter per object, along with an additional set of light curves that are phase-folded to any period (or periods) associated with an object. As additional information, the location on the Gaia observational HR diagram and a Pan-STARRS image cutout are displayed. Labels can be assigned using a set of range sliders representing the class labels.
The slider values are quantized to 0, 0.25, 0.5, 0.75, and 1, to enable a human scanner to indicate how certain they are of their classification, information that can be used in classifier training. Being part of the ZTF Variable Marshal, all interactions
Figure 3.
Labeling interface of the ZTF Variable Marshal.

with the interface can be carried out programmatically via API calls.

Contrary to a common misconception, data labeling is actually a job for highly skilled, trained professionals; it takes most of the time and is one of the most important parts of the work to build any successful ML system. We started with multiple experts performing classification using different user accounts, but later moved to regular multi-expert classification sprints that used a single user account. This approach proved to be superior, as it effectively averaged input from multiple experts and minimized the number of mistakes while labeling (this is somewhat similar to the agile software development technique of pair programming).

To test our pipeline during development, we selected a subset of the ZTF data. In order to obtain a representative set, we chose ten pairs of ZTF fields, taking into account the RA and Dec and the Galactic latitude; see Fig. 4 and Table 4. These fields contain a diverse range of Galactic environments and also span a range of different cadences and total numbers of epochs. We use the
Figure 4.
The location on the sky of the 10 pairs of ZTF fields (red) we used for testing our pipeline. The figure uses equatorial coordinates and a Mollweide projection. The background shows the stellar density according to Gaia DR2, using a logarithmic scaling.

g, r, and i band light curves of these fields (a total of ≈ 34 million light curves). No simple selection method could be found which yields a sample that is sufficiently clean and simultaneously representative of the data. Instead, to efficiently increase the size of the training sample while keeping it representative, we used the input from human scanners to build a 'seed' variable/non-variable ("vnv") classifier. We used the labels obtained from the initial scanning effort to build a simple classifier (see below) and inspected random samples of its predictions.

Similarly to the seed "vnv" classifier, the predictions were randomly sampled for low- and high-scoring candidates as well as "abstained" examples (meaning that their scores were close to 0.5, the classifier decision threshold used at training), and the resulting sets were inspected and labeled by human experts. This process was repeated multiple times over.

Next, we applied this set of classifiers to the stars from the CRTS sample of periodic variables (Drake et al. 2014). We visually inspected all objects for which the prediction did not match the CRTS label. We then added the CRTS labels to the training sample, which we used to train the next set of classifiers.

Finally, we performed several more rounds of the train-infer-sample-label active learning process. As expected, with each completed cycle, we observed a gradual improvement of the classifier performance (as determined from a "hold-out" set). We stress that the resulting labeled data set is very much a living entity.

Figure 5.
The parameter distribution of the training set, with high-level classes indicated with different colors. The top panel shows the 90% interval (a measure of amplitude) and the median magnitude. Objects with a large amplitude are often variables, but so are many "bogus" light curves (often artifacts due to bright stars). This figure also shows that there is no clear separation between variables and non-variables. The bottom panel shows the median magnitude distribution of the high-level classes.
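The sampling step of the train-infer-sample-label loop described above, drawing confident and "abstained" (near the 0.5 decision boundary) predictions for expert labeling, can be sketched as follows. The thresholds, function name, and in-memory representation are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def sample_for_labeling(scores, n_each=5, margin=0.05, rng=None):
    """Split classifier scores into confident-positive, confident-negative,
    and 'abstained' (near the 0.5 decision threshold) groups, and draw a
    random subset of each for human labeling."""
    rng = np.random.default_rng(rng)
    idx = np.arange(len(scores))
    high = idx[scores > 0.9]
    low = idx[scores < 0.1]
    abstained = idx[np.abs(scores - 0.5) < margin]
    # draw at most n_each indices from each group, without replacement
    pick = lambda g: rng.choice(g, size=min(n_each, len(g)), replace=False) if len(g) else g
    return {"high": pick(high), "low": pick(low), "abstained": pick(abstained)}
```

The sampled indices would then be pushed to the labeling interface; in the paper the scores live in a database rather than in memory.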
Several characteristics of the training set as of the time of writing (internal tag d11) are shown in Fig. 5 (with the total number of objects per class in Table 1). This shows that there are approximately the same number of variables and non-variables. Note that the magnitude distribution is different for the high-level classes. This is partially due to the intrinsic distribution of objects, but mostly due to selection biases in the training set. From the training data, we separated a few sets of ≈
100 objects each for the ontological classes, the 'gold' samples. The light curves in these sets were selected as very easy to classify examples of those particular classes. These sets are meant as verification sets, to be used as a 'sanity check' for both the phenomenological and ontological classifiers.

The workflow described above is summarized in Fig. 6.

4.1.
Training process
4.1.1. DNN
Figure 6.
Flowchart of the workflow. Features are extracted from the pre-processed ZTF light curves and combined with external features from Gaia DR2, Pan-STARRS1 DR1, and AllWISE via a spatial cross-match. The resulting feature data set is sampled for a small "seed" set for human expert labeling. Externally-labeled data are inspected by the experts as well. The blue arrows show the active learning process for iteratively building the training set and improving the classifier performance. The labeled examples are assembled into a training/validation/test set that is used for classifier training. The phenomenological classifiers use only the ZTF-based features in the process, while the ontological ones additionally use the external features. The resulting set of trained classifiers is evaluated on the full light curve feature data set. The resulting (versioned) scores are stored in a database and sampled, both for confident and near-the-decision-boundary predictions, and passed for labeling to begin a new active learning cycle.
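One of the features extracted in this workflow is the dmdt histogram, a 26-by-26 2D histogram of magnitude-difference versus time-difference pairs (Mahabal et al. 2017; Table 2). A minimal numpy sketch, with placeholder bin edges rather than the published ones:

```python
import numpy as np

def dmdt_histogram(t, m, dm_edges, dt_edges):
    """Build a 2D histogram of all pairwise (dm, dt) values from a light
    curve, in the spirit of the 26x26 dmdt images of Mahabal et al. (2017).
    The bin edges are free parameters here, not the published ones."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    i, j = np.triu_indices(len(t), k=1)      # all epoch pairs with j > i
    dt = t[j] - t[i]
    dm = m[j] - m[i]
    hist, _, _ = np.histogram2d(dm, dt, bins=[dm_edges, dt_edges])
    return hist
```

In production the counts would be normalized and fed to the convolutional branch of the DNN as an image.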
Figure 7.
Schematic of the conceptual DNN architecture.
When building the DNN classifiers, we had to explore a vast hyperparameter space. Figure 7 illustrates the conceptual DNN architecture that we iterated on:

• The phenomenological classifiers use the pre-computed light curve features and dmdt histograms as input. The ontological classifiers additionally use the external features.

• A dense neural network containing multiple fully-connected layers is used to process the features.

• A convolutional neural network is used to process the dmdt's.

• The resulting feature maps are fused and passed through a fully-connected "head" network that outputs the final classification score.

The classifiers were implemented using
TensorFlow software and its high-level
Keras
API (Abadi et al. 2015; Chollet et al. 2015). We used the binary cross-entropy loss function, the Adam optimizer (Kingma & Ba 2014), a batch size of 64, and an 81% / 9% /
10% training/validation/test data split with a fixed random seed for reproducibility. The input features were normalized; the same norms were used for all classifiers. We did class balancing of the training sets for the classifiers with a small number of positive examples, and used all available data for the classifiers with a large number of available examples. In the first case, the classifier performance was checked on the originally dropped negative examples, and the number of misclassifications was small (typically on the order of 1 …). For the seed vnv classifier, we used a simple architecture that followed the schematic in Fig. 7 and demonstrated satisfying performance, with a minimal number (chosen arbitrarily) of fully-connected and convolutional layers. As we expanded the data sets and added more classifiers, we ran several rounds of hyperparameter tuning using the keras-tuner library (O'Malley et al. 2019; https://github.com/keras-team/keras-tuner). The following hyperparameters were tuned:

(a) Architecture used in production. (b) Example of a more complicated architecture.

Figure 8.
Best-performing DNN architectures. Panel (a) shows the architecture used in the production phenomenological classifiers. The same architecture is used for the ontological classifiers, with the difference being the input feature vector size (69 vs 40). Panel (b) shows an example of a more complicated architecture that tends to show higher variance compared to (a).

• Inclusion of the fully-connected branch in the architecture or not (provided the convolutional branch is included):

– Number of layers (from 1 to 4) and neurons therein (from 32 to 512 with a step of 32).

• Inclusion of the convolutional branch in the architecture or not (provided the fully-connected branch is included):

– Number of filters (from 16 to 64 with a step of 16), their size (3x3, 5x5, or 7x7), and type (regular or separable convolution).

– Flattening the output of the last convolutional block or using global average pooling instead.

• Number of layers and neurons in the head network (from 0 to 3).

• Dropout rates (from 0.15 to 0.55 with a step of 0.1).

• Activation functions (ReLU, leaky ReLU, sigmoid, tanh).

• Initial learning rate (from 1e-4 to 1e-3 with a step of 1e-4).

Several best-performing architectures were evaluated on the test sets described below in Sec. 5. As expected, the more complicated architectures tended to show higher variance, so for production we selected the simplest architecture that yielded the most robust performance in most cases (see Fig. 8). The architecture includes both the fully-connected and the convolutional branches, confirming that using dmdt's indeed improves classifier performance. It uses separable convolutions (see e.g. Chollet 2016), ReLU activation functions for all hidden trainable layers, and a sigmoid activation function for the output layer that produces a score from 0.0 to 1.0. Dropout layers with a rate of 0.25 are used for regularization.
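The training setup described above (fixed-seed data split, binary cross-entropy loss) can be sketched in plain numpy. The 9% validation fraction is an assumption inferred from the 81%/10% train/test fractions, and the helper names are ours:

```python
import numpy as np

def split_indices(n, fracs=(0.81, 0.09, 0.10), seed=42):
    """Reproducible train/validation/test index split with a fixed seed.
    The validation fraction is an assumption (81% + 9% + 10% = 100%)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train, n_val = int(fracs[0] * n), int(fracs[1] * n)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy, the loss used for the per-class classifiers."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))
```

In the actual pipeline these steps are handled inside TensorFlow/Keras; the sketch only makes the arithmetic explicit.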
4.1.2. XGBoost

Similar to the DNN, XGBoost has a large number of hyperparameters. These can be categorized as general parameters, tree boosting parameters, learning parameters, etc. A thorough hyperparameter tuning is generally not possible and, indeed, not practical. Various methods are adopted to find near-optimal values for some of the parameters that should be tuned. Some of the critical parameters are:

• max_depth: this indicates the depth of the tree, with greater depth indicating more complex models, in turn implying models that are more prone to overfitting.

• min_child_weight: this is a parameter that determines when further partitioning of a tree will stop. Larger numbers indicate a more conservative approach.

(In ML, variance is usually defined as the error from sensitivity to small fluctuations in the training set.)

• subsample: this determines the fraction of the data that the boosting algorithm will use at each boosting iteration.

• colsample_bytree: this is a counterpart to subsample, but pertaining to the columns. In other words, it is the number of features that will get used in each tree.

• eta: this is the learning rate and is applied after every boosting step.

All of these parameters affect tree boosting. We tuned these parameters and, since we use all the available data, which is very unbalanced, we tuned one more, viz.

• scale_pos_weight: this parameter decides if one of the classes needs to be given extra weight while fitting because it has fewer samples. Given the way XGBoost determines the splits using its complex parameters, even for unbalanced classes, one is often fine with leaving this parameter set to one.

We started with scale_pos_weight, giving it four choices, viz. [1, CR/2, CR, 2*CR], where CR is the ratio of samples belonging to the two classes. Then we tuned max_depth (from 3 to 7 at a spacing of 2) and min_child_weight (from 1 to 5 at a spacing of 2) simultaneously, sampling the grid at nine points.
Then we sampled near the optimal point at a spacing of 1, thus covering [2, 8] for max_depth and [1, 6] for min_child_weight. This was followed by similar simultaneous tuning of subsample (from 0.6 to 1.0 at a spacing of 0.2) and colsample_bytree (from 0.6 to 1.0 at a spacing of 0.2). Here too we did a second round of tuning near the optimal point at a spacing of 0.1, resulting in a cover of [0.5, 1.0] each for subsample and colsample_bytree. This was then followed by tuning eta at the values [0.3, 0.2, 0.1, 0.05]. Then we went back to scale_pos_weight to ensure that the value we had determined at the start was still the best value. In all cases, scale_pos_weight was 1 or close to one.

We did two sets of classifications with different input sets of features: (1) for the phenomenological classes we used 40 features determined from the ZTF light curves alone (see Table 2), and (2) for the ontological classes, we used the 29 external features from AllWISE, Gaia, and Pan-STARRS along with the 40 ZTF features, as with the DNN classifiers (see Table 3). Metrics from these runs are given in Table 1.
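The staged tuning above, a coarse grid followed by a finer grid centered on the best point, can be sketched generically. The objective and grids below are toy stand-ins for cross-validated XGBoost scores, not the paper's actual search:

```python
import itertools

def staged_grid_search(objective, coarse, refine_step):
    """Two-stage tuning of a pair of hyperparameters: scan a coarse grid,
    then re-scan a finer grid around the best coarse point. `objective`
    maps a (p1, p2) tuple to a score (higher is better)."""
    best = max(itertools.product(*coarse), key=objective)
    # refine: step to either side of the coarse optimum in each dimension
    fine = [[best[i] - refine_step[i], best[i], best[i] + refine_step[i]]
            for i in range(2)]
    return max(itertools.product(*fine), key=objective)
```

With coarse grids ([3, 5, 7], [1, 3, 5]) and a refine step of 1, this mirrors the nine-point max_depth/min_child_weight grid followed by the spacing-1 refinement described above.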
4.2. T-SNE analysis of the training set

As described in Section 4, we built our training set through a series of iterative steps. Both DNN and
Figure 9.
An overview of the training set using t-SNE. Top: variables and non-variables. Bottom: leaf-level ontological classes within the set of variables.
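The maps in Fig. 9 come from t-SNE, which arranges points by minimizing the Kullback-Leibler divergence between the high- and low-dimensional neighbor distributions. For discrete distributions, the divergence itself is simple to write down (a minimal numpy sketch; in practice one would call sklearn.manifold.TSNE rather than implement the optimization):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(P || Q) between two discrete
    probability distributions; t-SNE minimizes this quantity between
    the neighbor distributions of the two embeddings."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # eps guards against log(0) for empty bins
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

KL(P || Q) is zero when the distributions match and grows as Q fails to place mass where P has it, which is why t-SNE preserves local neighborhoods.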
XGBoost use a set of features for classification. We passed these sets of features for our training sample to t-distributed Stochastic Neighbor Embedding (t-SNE; van der Maaten & Hinton 2008), a dimensionality-reduction technique. t-SNE maps points near each other in a high-dimensional space to nearby points in a low-dimensional counterpart by minimizing the KL divergence (Kullback & Leibler 1951) between the two probability distributions using gradient descent. In Fig. 9, we plot variables and non-variables separately and then plot the leaf-level ontological classes by leaving out the non-variables. Many classes are seen to be clustered, but there is also overlap between some others. This is to be expected, especially for classes with relatively fewer examples, and the overlaps can be used to predict classes with possible ambiguities when running inference on light curves of unknown objects.

5. CLASSIFIER PERFORMANCE

We have tested our classifiers on different labeled sets. The test performance is based on a random split of the training set (10% of the examples for a given class).
≈ 5% of the periods do not seem to match what is expected for their respective classes. We also inspected the distribution as a function of the number of epochs in the light curves. This shows that the classifiers are not confident for objects with fewer than 100 epochs, but this varies by classifier. We finally inspected the spatial distribution, and it does not show any anomalies.

6. DISCUSSION

6.1.
Example usages and real-life performance
The classifier performance on the test, gold, and external high-purity sets indicates the precision (and, to a much lesser extent, the recall) of the models; however, it says little about the "real-life" performance when evaluated on the full corpus of ZTF light curve data. To explore the performance in a production setting, we ran our classifiers on all light curves in the 20 test fields.

6.1.1.
RR Lyrae
RR Lyrae pulsators are a well-defined class of pulsating stars that are relatively easy to identify. As a typical example, we query the classification results to obtain a clean set of RR Lyrae. The criteria we use are score(variable) > 0.9 and score(rrlyr) > 0.9. A total of 2102 out of 34 million light curves pass these criteria. Because objects can have multiple light curves, this corresponds to a total of 1199 astrophysical objects. A visual inspection of the light curves shows that 1073 objects (89%) are RR Lyrae variables. False positives include a few Delta Scuti stars (22) and Cepheid variables (21), which have the same light curve shape (and are closely related to the RR Lyrae pulsators) but have different pulsation periods. Other false positives are mostly high-amplitude irregular variables.

This analysis shows that the classifier works on 'real-life' data. As expected, the real-life performance is slightly lower than the test performance. It also indicates that there is some confusion with irregular variables, which could be solved by adding more examples of the latter to the training set.
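The selection above is just a conjunction of score cuts. As a minimal in-memory sketch (field names and the list-of-dicts representation are illustrative; the project queries versioned scores from a database):

```python
# Demand high scores from both the variability ("vnv") and RR Lyrae
# classifiers; the 0.9 threshold matches the criteria quoted above.
def select_rrlyr(records, threshold=0.9):
    """Keep light curves whose variability and RR Lyrae scores both
    exceed `threshold`."""
    return [r for r in records
            if r["variable"] > threshold and r["rrlyr"] > threshold]
```

Combining several independent classifier scores this way is what drives the sample toward high purity at some cost in completeness.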
6.1.2. Finding YSOs
Young stellar objects exhibit variability on a wide range of timescales, from hours to months, that may be periodic or quasi-periodic when associated with stellar rotation, or aperiodic/irregular when related to accretion from a circumstellar disk onto the central star, which is a more stochastic process. Previous attempts to find and classify YSOs using machine learning techniques have not been particularly successful, having both low completeness and low reliability.

As an example of a challenging classification task, we inspect the sample of high-probability YSOs in the light curves of the 20 test fields. We select all light curves with score(YSO) > 0.9. A visual inspection shows that approximately 26% of the classified YSOs can be confirmed as bona fide young variables. Contaminants include the AGN/QSO/Seyfert classes, which have aperiodic variability similar to YSOs (both object categories are often described as a damped random walk), as well as pulsating AGB (e.g., Mira and SRVs), post-AGB (e.g., RV Tau), and other types of LPVs, with which YSOs also share some features. YSOs can also have flaring
Table 1.
Test set performance of our classifiers using a score threshold of 0.5. Labeled data set version d11; total number of light curves in the set: 124,037. See the appendix for the definition of each of the classes. The first half of the table shows phenomenological classes, the second half the ontological classes. The second column shows the total number of labeled examples of the corresponding class in the set; the classifiers were evaluated on 10% of those. For the phenomenological classes only features from ZTF data were used (excluding dmdt for XGBoost).

behavior similar to the CV class, though contamination from this category was < … .

6.2. Comparison with ZTF alert brokers
ZTF alert brokers, e.g., ANTARES (Saha et al. 2014), ALeRCE (Sánchez-Sáez et al. 2020), Lasair (Smith et al. 2019), and FINK (Möller et al. 2020), use ZTF alerts to identify and classify objects which exhibit variability in the ZTF data, a goal similar to that of this project. The approach and focus are different, however. The alert brokers (currently) only use the ZTF alert data, which are generated by sources detected at 5 standard deviations on difference images, and lack any information on lower-amplitude variability. The aim of this work is specifically to identify and classify all stars, including low-amplitude variables. We therefore use PSF photometry of all persistent point sources in the ZTF science images to classify them all. Therefore, we also use different processing methods (most importantly, period finding).
6.3. Deficiencies of the classifiers and improvements
While the classifiers are working very well, we have identified a few deficiencies. First, the classification performance drops off for objects which are fainter than 20th magnitude. We expect the classification performance to decrease with magnitude simply because of the lower precision of the light curves. However, inspecting light curves of faint, misclassified variables shows that a human is able to easily classify these light curves, and thus the machine learning algorithm should be able to as well. Inspection of the training set shows that there is a relative lack of faint variable objects in the training sample (see Fig. 5). With our active learning framework, we are able to remedy this by labeling a set of faint objects with a high variability score.

A second issue is a large number of misclassifications of irregular variables and "bogus" objects. In building the training sample, we have focused mostly on identifying periodic variable stars, since they are easy to identify. Analyses of the classification results show that many irregular variables are not classified correctly, and
(a) RR Lyrae ab. (b) RR Lyrae c. (c) Flaring stars. (d) EA. (e) EB. (f) EW.

Figure 10.
Score distributions, color-coded on a logarithmic scale, of all the DNN classifiers on different object types from the gold set.

also that many seemingly high-amplitude variables turn out to be "bogus" (internal reflections in the ZTF telescope). We expect to solve this issue automatically while using the classifier: as objects are misclassified, we will encounter them while using the classifiers. They will be added to the training sample, and the next iteration of classifiers will learn to better classify similar false positives.

6.4.
Meta-classification
In this work, we have run DNN and XGBoost independently. Each of them works very differently, and yet it is heartening to see their consistent performance, with high precision and recall for almost all classes. The small number of misclassifications are of two types: (a) outliers, which will be misclassified by both types of classifiers, and (b) objects with a subset of properties not quite captured by the classifier, which will likely differ between the two classifier types. By combining the classifications from the two classifier types we can obtain even purer samples. The misclassifications, or more specifically their deviant properties, will provide an additional facet to the active learning training regime we have employed here. That will be our next goal as we bootstrap from the sources classified in this work.

7. SUMMARY AND FUTURE WORK

In this paper, we have established the framework and infrastructure for the machine learning classification of ZTF light curves.

Figure 11. The cumulative period distributions for different classes (dscu, ew, rrlyr, ea, blyr, lpv, ceph). The samples have been selected on score(variable) and the per-class score, excluding objects in the training set. The distributions are generally what can be expected for each class. A few percent of systems have periods which are either very short or very long.

In future work, we will use the framework to construct the ZTF variable object catalog, which provides light curve features and classifications for all ZTF light curves. The catalog will allow astronomers to efficiently search the ZTF light curves for objects of interest. In addition to the catalog, we will also make the training set available for users who wish to run their own classifiers (e.g., alert brokers). The variable catalog and training set will be updated periodically to incorporate improvements in the classification.

The classification performance will be improved in a number of ways. First of all, as ZTF keeps accumulating data, both the time baseline and the number of epochs will increase. This will improve the classification of longer-timescale phenomena, but also allow for the detection of more subtle variability (e.g., the detection of low-amplitude periodic variables or narrow features like eclipses).

As astronomers use the classifications and visually inspect the light curves, they will continue to label data. This will further improve the machine learning classifiers by correcting misclassifications and adding them to the training sample. In addition, exploration of the data using a combination of light curve features and phenomenological classes (e.g., periodic variables that do not fall into any of the known ontological classes) allows us to identify rarer classes and add them to the classification scheme.

Future work also includes testing of improved and different machine learning methods. Neural networks and XGBoost are currently state of the art, but new machine learning methods are being developed at a rapid pace.
The currently implemented methods will serve as a baseline benchmark to test novel methods. For example, recurrent neural networks can be used to classify variable-length time series directly, without the need for features. In addition, unsupervised machine learning methods can be applied to find anomalous light curves.

ACKNOWLEDGMENTS

We thank the referee for useful and constructive feedback on the manuscript.

Based on observations obtained with the Samuel Oschin Telescope 48-inch and the 60-inch Telescope at the Palomar Observatory as part of the Zwicky Transient Facility project. ZTF is supported by the National Science Foundation under Grant No. AST-1440341 and a collaboration including Caltech, IPAC, the Weizmann Institute for Science, the Oskar Klein Center at Stockholm University, the University of Maryland, the University of Washington (UW), Deutsches Elektronen-Synchrotron and Humboldt University, Los Alamos National Laboratories, the TANGO Consortium of Taiwan, the University of Wisconsin at Milwaukee, and Lawrence Berkeley National Laboratories. Operations are conducted by Caltech Optical Observatories, IPAC, and UW.

DAD acknowledges support from the Heising-Simons Foundation under Grant No. 12540303. AAM acknowledges support from the NSF grant OAC-1640818. MWC acknowledges support from the National Science Foundation with grant number PHY-2010970. The authors acknowledge support from Google Cloud.

The authors acknowledge the Minnesota Supercomputing Institute (MSI) at the University of Minnesota for providing resources that contributed to the research results reported within this paper under project "Identification of Variable Objects in the Zwicky Transient Facility." This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No.
DE-AC02-05CH11231, under project "Towards a complete catalog of variable sources to support efficient searches for compact binary mergers and their products." This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. This work used the XSEDE COMET at SDSU through allocation AST200016.

(a) RR Lyrae. (b) Cepheids.

Figure 12.
Score distributions, color-coded on a logarithmic scale, of all the DNN classifiers on different object types from the Chen et al. set.
Facilities:
ZTF
Software: astropy (Astropy Collaboration et al. 2018), keras (Chollet et al. 2015), keras-tuner (O'Malley et al. 2019), kowalski (Duev et al. 2019a), matplotlib (Hunter 2007), numpy (van der Walt et al. 2011), pandas (pandas development team 2020), tensorflow (Abadi et al. 2015), xgboost (Chen & Guestrin 2016)
J. van Roestel et al
APPENDIX

A. FEATURES

This section shows all the light curve statistics (Table 2) and external statistics (Table 3). The statistics and processing of the light curves are discussed in detail in Coughlin et al. (2020). In this appendix, we present the equations for non-standard statistics or statistics for which multiple definitions exist. We refer readers to the appropriate references in the table in other cases. In the equations, we use m for the magnitudes, t for the observation times, N for the total number of epochs, and i to indicate individual epochs.

A.1. Amplitude statistics
We calculate a few simple statistics which are measures of amplitude: the MAD (median absolute deviation), the inter-quartile range, and the inter-percentile ranges for 60, 70, 80, and 90 percent.

\mathrm{MAD} = \mathrm{median}(|m_i - m_\mathrm{median}|) \quad (A1)

\mathrm{IQR} = \mathrm{percentile}(m_i, 75) - \mathrm{percentile}(m_i, 25) \quad (A2)

f_X = \mathrm{percentile}(m_i, 50 + X/2) - \mathrm{percentile}(m_i, 50 - X/2) \quad (A3)
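These amplitude measures translate directly to numpy. The inter-X% range is computed between the (50 ± X/2)th percentiles, which is our reading of the definition:

```python
import numpy as np

def amplitude_stats(m):
    """Amplitude measures for a magnitude array m: the median absolute
    deviation, the inter-quartile range, and the inter-X% ranges
    (f60..f90) between the (50 +/- X/2)th percentiles."""
    m = np.asarray(m, dtype=float)
    mad = np.median(np.abs(m - np.median(m)))
    iqr = np.percentile(m, 75) - np.percentile(m, 25)
    inter = {f"f{X}": np.percentile(m, 50 + X / 2) - np.percentile(m, 50 - X / 2)
             for X in (60, 70, 80, 90)}
    return mad, iqr, inter
```

Unlike the peak-to-peak amplitude, all of these statistics are robust to a few outlier epochs, which is why they are preferred as light curve features.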
A.2. Higher order moments

We calculate the higher order moments with the equations given below.

\mathrm{Skewness} = \frac{N}{(N-1)(N-2)} \sum_i \left( \frac{m_i - m_\mathrm{mean}}{\sigma} \right)^3 \quad (A4)

\mathrm{Kurtosis} = \frac{N(N+1)}{(N-1)(N-2)(N-3)} \sum_i \left( \frac{m_i - m_\mathrm{mean}}{\sigma} \right)^4 - \frac{3(N-1)^2}{(N-2)(N-3)} \quad (A5)
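A numpy sketch of the bias-corrected sample moments, using the sample standard deviation for σ (the per-epoch uncertainties were garbled in our copy of the equations, so plain σ is an assumption):

```python
import numpy as np

def skewness(m):
    """Bias-corrected sample skewness."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    s = np.std(m, ddof=1)
    return n / ((n - 1) * (n - 2)) * np.sum(((m - m.mean()) / s) ** 3)

def kurtosis(m):
    """Bias-corrected sample excess kurtosis (zero for a normal distribution)."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    s = np.std(m, ddof=1)
    g = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * np.sum(((m - m.mean()) / s) ** 4)
    return g - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
```

A symmetric light curve (e.g., a sinusoid) has skewness near zero, while eclipsing binaries, with their one-sided dips, are strongly skewed; this is what makes the moments useful classification features.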
A.3. Von Neumann ratio
The von Neumann ratio measures the ratio between the correlated variance and the variance:

\eta = \left( m_\mathrm{var} \sum_i \frac{1}{\Delta t_i^2} \right)^{-1} \sum_i \left( \frac{\Delta m_i}{\Delta t_i} \right)^2 \quad (A6)

with \Delta t_i = t_{i+1} - t_i and \Delta m_i = m_{i+1} - m_i.
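A numpy sketch of the inverse von Neumann ratio, the form listed as a feature in Table 2. The 1/Δt² weighting follows our reconstruction of the (garbled) equation above and should be treated as an assumption:

```python
import numpy as np

def inv_von_neumann(t, m):
    """Inverse time-weighted von Neumann ratio: the variance divided by the
    (1/dt^2-weighted) successive-difference variance. Larger values indicate
    smoother, correlated variability; pure noise gives small values."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    dm, dt = np.diff(m), np.diff(t)
    eta = np.sum((dm / dt) ** 2) / (np.var(m) * np.sum(1.0 / dt ** 2))
    return 1.0 / eta
```

The classical, unweighted ratio simply uses the mean squared successive difference over the variance; the time weighting makes the statistic robust to irregular sampling.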
A.4. Welch & Stetson statistics
We use the Welch-Stetson I and the Stetson J & K statistics from Stetson (1996):

\delta_i = \sqrt{\frac{N}{N-1}} \, \frac{m_i - m_\mathrm{wmean}}{\sigma_i}, \quad P_i = \delta_i \delta_{i+1}, \quad J = \sum_i \mathrm{sign}(P_i) \sqrt{|P_i|}, \quad K = \frac{\frac{1}{N} \sum_i |\delta_i|}{\sqrt{\frac{1}{N} \sum_i \delta_i^2}} \quad (A7)

7 …: χ² value after subtracting the mean
8 roms: robust median statistic (Rose & Hintz 2007)
9 norm_peak_to_peak_amp: normalised peak-to-peak amplitude (Sokolovsky et al. 2009)
10 norm_excess_var: normalised excess variance (Nandra et al. 1997)
11 MAD: median absolute deviation
12 IQR: the inter-quartile range
13 f60: the inter-60% range
14 f70: the inter-70% range
15 f80: the inter-80% range
16 f90: the inter-90% range
17 skew: the skewness (3rd moment)
18 smallkurt: the kurtosis (4th moment)
19 inv_vonneumannratio: the inverse von Neumann ratio (von Neumann 1941, 1942)
20 welch_i: the Welch I statistic (Welch & Stetson 1993)
21 stetson_j: the Stetson J statistic (Stetson 1994)
22 stetson_k: the Stetson K statistic (Stetson 1994)
23 ad: Anderson-Darling test (Stephens 1974)
24 sw: Shapiro-Wilk test (Shapiro & Wilk 1965)
25 f1_power: (χ₀² − χ²)/χ₀² of the fit
26 f1_BIC: difference in BIC value (Schwarz 1978)
27 f1_s: slope
28 f1_c: constant
29 f1_amp: amplitude of the fundamental period
30 f1_phi0: phase of the fundamental period
31 f1_relamp1: relative amplitude of the 1st harmonic
32 f1_relphi1: relative phase of the 1st harmonic
33 f1_relamp2: relative amplitude of the 2nd harmonic
34 f1_relphi2: relative phase of the 2nd harmonic
35 f1_relamp3: relative amplitude of the 3rd harmonic
36 f1_relphi3: relative phase of the 3rd harmonic
37 f1_relamp4: relative amplitude of the 4th harmonic
38 f1_relphi4: relative phase of the 4th harmonic
39 n_ztf_alerts: number of alerts within 2″ (Duev et al. 2019b)
40 mean_ztf_alert_braai: the mean 'real-bogus' score of the alerts (Duev et al. 2019b)
41 dmdt: a 26 by 26 histogram of all dm-dt pairs (Mahabal et al. 2017)

Table 2.
ZTF features we calculated for each of the light curves. The 'f1' features are parameters from a fit on the phase-folded light curves. All features are used by the phenomenological classifiers (barring n, the number of points in a light curve); XGBoost excluded dmdt as well. See Coughlin et al. 2020b for more information.

21 PS1_DR1_gMeanPSFMag: Mean PSF AB magnitude from g filter
22 PS1_DR1_gMeanPSFMagErr: Error in the magnitude from g filter
23 PS1_DR1_rMeanPSFMag: Mean PSF AB magnitude from r filter
24 PS1_DR1_rMeanPSFMagErr: Error in the magnitude from r filter
25 PS1_DR1_iMeanPSFMag: Mean PSF AB magnitude from i filter
26 PS1_DR1_iMeanPSFMagErr: Error in the magnitude from i filter
27 PS1_DR1_zMeanPSFMag: Mean PSF AB magnitude from z filter
28 PS1_DR1_zMeanPSFMagErr: Error in the magnitude from z filter
29 PS1_DR1_yMeanPSFMag: Mean PSF AB magnitude from y filter
30 PS1_DR1_yMeanPSFMagErr: Error in the magnitude from y filter
31 PS1_DR1_qualityFlag: binary flag denoting real or false positive

Table 3.
Features external to ZTF. Barring the quality flags, these were used by the ontological classifiers in addition to the ZTF features.

B.
CLASS DESCRIPTION

B.1.
Phenomenological classes

• Variable (vnv): a 'variable' source is any ZTF source which shows variability in its light curve of astrophysical origin. This excludes variability due to blended photometry, bright nearby stars, or any CCD artifact. In a sense, this step can be regarded as a 'real-bogus' filter (Duev et al. 2019b).

• Periodic (pnp): the ZTF light curve features astrophysical periodic variability. These are typically pulsators, rotating stars, and (eclipsing) binaries. This does not include semi-periodic or quasi-periodic variability. It also excludes variability due to a varying background (e.g., due to the Moon), or spurious periodic variability due to nearby bright stars or other artifacts.

• Flaring (fla): any ZTF light curve that shows flares, rapidly rising and fading events. These are mostly cataclysmic variables, some young stellar objects, and some AGN.
• Irregular (i): objects which show irregular variability. These are mostly accreting objects: AGN, CVs, and YSOs.

• Long timescale (longt): any object which shows variability on timescales of ≳100 days. This can be a steady increase/decrease in luminosity, e.g., AGN and CVs. Long-timescale periodic variables like Miras, and the more irregular semi-regular variables, are generally included in this category.

• Eclipsing (e): any source which shows eclipses in the light curve. These are predominantly eclipsing binaries. Eclipsing planetesimals or planets with large rings would also fall in this category. The subtypes are EW (overcontact binaries), EB (semi-detached binaries), and EA (detached binaries).

• Bogus: any light curve that seems variable, but not due to any astrophysical variability. These are galaxies which can seem to vary due to PSF variations, image artifacts like 'ghosts', blended stars, or diffraction spikes.

B.2.
Ontological classes

• Active Galactic Nuclei (agn): extra-galactic objects which tend to vary irregularly. They often show slowly rising or fading light curves, and can also show outbursts in rare cases.

• Long Period Variables (lpv): Long Period Variables are cool giant stars. Nearly all stars of this type show some variability.
Mira variables are AGB stars which show very high amplitude (> … mag) variability. Semi-Regular Variables (srv) show more irregular, lower-amplitude variability than Miras.

• Pulsator (puls): any kind of pulsating star.

• Cepheid (ceph): Cepheids are radially pulsating giant stars in the instability strip. Periods range between 1 and 50 days, with extreme examples of 200 days. Light curve shapes range from asymmetric, with a steep rise and slow decay, to almost sinusoidal.

• Delta Scuti (dscu): Delta Scuti are pulsating A & F main-sequence stars. The pulsation period ranges between 0.03 and 0.3 days, and the amplitude is typically 0.2 mag but can reach up to 0.8 mag. Their light curves are asymmetric, with a rapid rise and slow decay.

• RR Lyrae (rrlyr): RR Lyrae are radial pulsators on the horizontal branch; they are helium-core burning and hydrogen-shell burning. The pulsation period ranges between 0.2 and 1.0 days.
RR Lyrae ab pulsate in their fundamental mode; they show amplitudes of up to 1 magnitude and have asymmetric light curves with a steep rising phase. RR Lyrae c are first-overtone pulsators; they have amplitudes of up to 0.5 mag and show more sinusoidal light curves. RR Lyrae d pulsate at two periods ('d' stands for 'double').
RR Lyrae Blazhko are RR Lyrae which show evolution in their light-curve shape, known as the Blazhko effect.
• Binary star (bis): any object which is a binary star.
• RS CVn (rscvn): a binary star in which at least one of the components has large stellar spots. The light-curve shape is sinusoidal, with periods of a few hours to 14 days. The shape of the light curve changes over timescales of months to years.
• Beta Lyrae (blyr): binary systems where one of the components has evolved into a subgiant or giant star and is filling its Roche lobe, transferring mass through a disk. The light curve of these systems is of type 'EB'. The period ranges between 0.3 and 200 days; systems with a period of >100 days contain a supergiant.
• Young Stellar Objects (yso): pre-main-sequence stars. They typically have an accretion disk and surrounding dust. This results in a light curve that shows irregular behavior, sometimes with outbursts or dust-obscuration events.

C. OVERVIEW OF THE 20 TEST-FIELDS

In Table 4 we show the basic properties of the test-fields we used. They were selected as pairs, chosen to make sure that they represent different Galactic latitudes and ZTF coverage.
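The period ranges quoted in the class descriptions above lend themselves to a quick plausibility filter: given a measured period, which pulsator or binary labels are even consistent? A minimal sketch, where the ranges are taken from the text and the function name and inclusive bounds are our own choices:

```python
# Plausible period ranges (days) as quoted in the class descriptions above.
PERIOD_RANGES = {
    "dscu":  (0.03, 0.3),   # Delta Scuti
    "rrlyr": (0.2, 1.0),    # RR Lyrae
    "ceph":  (1.0, 50.0),   # Cepheids (extreme examples reach 200 d)
    "blyr":  (0.3, 200.0),  # Beta Lyrae type binaries
}

def plausible_classes(period_days):
    """Return class labels whose quoted period range contains the period."""
    return sorted(label for label, (lo, hi) in PERIOD_RANGES.items()
                  if lo <= period_days <= hi)

print(plausible_classes(0.1))   # ['dscu']
print(plausible_classes(0.5))   # ['blyr', 'rrlyr']
print(plausible_classes(10.0))  # ['blyr', 'ceph']
```

Such a filter cannot assign a class on its own (the ranges overlap heavily), but it can cheaply veto inconsistent labels before light-curve-shape features are considered.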
Table 4. The ZTF field IDs and statistics for the fields we have used for testing our procedures while developing the pipeline. Columns: field ID, field center (RA, Dec, Galactic l and b, all in degrees), and the number of epochs in the g, r, and i bands (a dash marks fields without i-band data in the source).

FieldID    RA        Dec       l          b         N_g    N_r    N_i
  296     15.7910   -17.05    141.274    -79.1454   126    165     23
  297     22.8729   -17.05    168.678    -75.7386   108    156     26
  423    192.250     -2.65    303         59.95      85    126      -
  424    199.340     -2.65    316.926     59.1829    89    115      -
  487    281.193      4.55     36.5577     3.0284   230    598     52
  488    288.119      4.55     39.7486    -3.0953   221    584     52
  562     88.373     18.95    189.844     -2.9775   245    595     16
  563     95.578     18.95    193.164      2.9745   266   1047     14
  682    266.856     33.35     58.6098    26.7346   822    790     79
  683    274.709     33.35     60.8553    20.5066   773   1013     61
  699     45.975     40.55    148.921    -15.1819   220    430      6
  700     54.523     40.55    154.444    -11.5358   241    430      6
  717    200.867     40.55     96.8087    75.0527   613    654    166
  718    209.484     40.55     80.2113    70.6464   636    683    167
  777     49.315     54.95    143.261     -1.7248   324    709      4
  778     60.116     54.95    148.223      1.9883   300    577      4
  841    145.714     69.42    142.515     40.204    368    354     31
  842    162.857     69.35    137.094     44.7025   390    359     31
  852    334.286     69.35    110.25      10.5906   262    292      -
  853    351.429     69.35    115.729      7.9313   232    269      -
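The Galactic coordinates (l, b) in Table 4 follow directly from the field centers. As a reference, a self-contained equatorial-to-Galactic conversion using the standard J2000 North Galactic Pole constants; the helper name `eq2gal` is our own. It is sanity-checked below against the Galactic center rather than the table rows, since the tabulated field centers are rounded:

```python
import math

# J2000 position of the North Galactic Pole and the Galactic
# longitude of the North Celestial Pole (standard IAU values).
RA_NGP, DEC_NGP, L_NCP = 192.85948, 27.12825, 122.93192

def eq2gal(ra, dec):
    """Convert equatorial (ra, dec) to Galactic (l, b), all in degrees."""
    ra, dec = math.radians(ra), math.radians(dec)
    ra_g, dec_g = math.radians(RA_NGP), math.radians(DEC_NGP)
    # Latitude from the spherical law of cosines.
    sb = (math.sin(dec) * math.sin(dec_g)
          + math.cos(dec) * math.cos(dec_g) * math.cos(ra - ra_g))
    b = math.asin(sb)
    # Longitude measured from the North Celestial Pole direction.
    l = math.radians(L_NCP) - math.atan2(
        math.cos(dec) * math.sin(ra - ra_g),
        math.sin(dec) * math.cos(dec_g)
        - math.cos(dec) * math.sin(dec_g) * math.cos(ra - ra_g))
    return math.degrees(l) % 360.0, math.degrees(b)

# Sanity check against Sgr A* (RA 266.41683, Dec -29.00781):
# expect roughly l ~ 359.94, b ~ -0.05.
l, b = eq2gal(266.41683, -29.00781)
print(round(l, 2), round(b, 2))
```

In production one would use `astropy.coordinates.SkyCoord(...).galactic` instead; the explicit formula is shown only to make the transformation behind the l, b columns transparent.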