A Morphological Classification Model to Identify Unresolved PanSTARRS1 Sources II: Update to the PS1 Point Source Catalog
Draft version January 21, 2021
Typeset using LaTeX twocolumn style in AASTeX63
A. A. Miller (1, 2) and X. Hall (3)

1. Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 1800 Sherman Road, Evanston, IL 60201, USA
2. The Adler Planetarium, Chicago, IL 60605, USA
3. Cahill Center for Astrophysics, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125, USA

(Received January 21, 2021; Revised; Accepted)
Submitted to PASP
Abstract
We present an update to the PanSTARRS-1 Point Source Catalog (PS1 PSC), which provides morphological classifications of PS1 sources. The original PS1 PSC adopted stringent detection criteria that excluded hundreds of millions of PS1 sources from the PSC. Here, we adapt the supervised machine learning methods used to create the PS1 PSC and apply them to different photometric measurements that are more widely available, allowing us to add ∼144 million new classifications and expand the total number of sources in the PS1 PSC. The addition of ∼144 million new classifications to the PS1 PSC will improve the efficiency with which transients are discovered.
Keywords:
Catalogs – Surveys – Astronomy data analysis – Astrostatistics
1. Introduction
The proliferation of wide-field time-domain surveys over the past ∼decade has led to the discovery of a bevy of novel extragalactic transients (e.g., Quimby et al. 2011; Gezari et al. 2012; Drout et al. 2014; Gal-Yam et al. 2014; Abbott et al. 2017; Prentice et al. 2018; IceCube Collaboration et al. 2018). While these wide-field surveys have been enabled by significant advances in detector technology, software has proven equally important (e.g., Masci et al. 2017, 2019; Smith et al. 2020; Jones et al. 2020), as many of these critical discoveries have been facilitated by the rapid identification and dissemination of new transient candidates in near real time (e.g., Patterson et al. 2019).

Reliable catalogs identifying stars and galaxies, or similarly unresolved and resolved sources, are an essential cog in the machinery necessary to identify extragalactic transients. On a nightly basis, time-domain surveys are inundated with transient candidates, the vast majority of which are considered "bogus" (e.g., Bloom et al. 2012). Despite sophisticated software capable of whittling down the number of likely transients by several orders of magnitude (e.g., Brink et al. 2013; Goldstein et al. 2015; Duev et al. 2019; Smith et al. 2020), the number of candidates still vastly outpaces the spectroscopic resources necessary to classify everything that varies (e.g., Kulkarni 2020). The aforementioned star–galaxy catalogs therefore play an essential role in the search for transients by removing stellar-like objects that are likely to be Galactic in origin.

The PanSTARRS-1 Point Source Catalog (PS1 PSC; Tachibana & Miller 2018), which provides probabilistic point-source classifications for ∼1.5 billion PS1 sources, is based on photometric features from the PS1
StackObjectAttributes table (see §2). Of the sources in the StackObjectAttributes table, the vast majority of those missing from the PS1 PSC are either spurious or have an extremely low signal-to-noise ratio (S/N), such that the methods in Tachibana & Miller (2018) would not provide a reliable classification. Additional sources are missing from the PS1 PSC because there are multiple rows within the PS1 StackObjectAttributes table that have the same ObjID and primaryDetection = 1. By definition this should not happen, and therefore these sources were excluded. For PS1 sources that are not in the PS1 PSC, ZTF reports a probability score of 0.5.

Here, we provide classifications for ∼
144 million sources that were previously "missing" from the catalog. These classifications are made using different photometric measurements from the ones adopted in Tachibana & Miller (2018). While our new method performs slightly worse than the one in Tachibana & Miller (2018), we nevertheless achieve a similar level of accuracy with the new model. We apply our new model to the ∼426 million "missing" sources (classifying ∼34% of them), providing a new and useful supplement to the PS1 PSC.

Alongside this paper, we have released the open-source software needed to recreate the analysis in this study. It is available online at https://github.com/adamamiller/PS1_star_galaxy.
2. ML Model Data
PS1 conducted a five-filter (g_PS1, r_PS1, i_PS1, z_PS1, y_PS1) time-domain survey covering ∼3/4 of the sky. (During the preparation of this manuscript, Beck et al. 2020 published a new machine learning catalog, PS1-STRM, to classify the sources in the PS1 ForcedMeanObject table. We highlight differences and similarities between the Beck et al. catalog and this work in §7.) There are mean flux measurements from the individual PS1 exposures of each field, there are stack flux measurements from the deeper stack images that co-add individual exposures, and there are forced-flux measurements that measure the flux in individual exposures at the location of all sources detected in the stack images. The mean photometry is limited by the depth of the individual exposures, while the stack photometry has a difficult-to-model point spread function (PSF) because images must be warped before they can be co-added. The forced-flux measurements provide an intermediate compromise: they are deeper than the mean flux measurements, while in principle having a more stable PSF than the stack images. Tachibana & Miller (2018) show that the stack photometry works best when morphologically classifying resolved, extended sources and unresolved point sources. The methodology that we adopt here is extremely similar to Tachibana & Miller (2018), but we instead use PS1 forced photometry to classify sources that do not have suitable stack photometry. The forced-photometry-based model leads to slightly lower quality classifications (see §5).

As a training set for the model, we use deep observations of the COSMOS field from the
Hubble Space Telescope (HST). The superior resolution of HST enables reliable morphological classifications for sources as faint as ∼25 mag (Leauthaud et al. 2007). There are 80,867 bright HST sources from Leauthaud et al. (2007) that have PS1 counterparts (within a 1″ match radius; see Tachibana & Miller 2018) in the PS1 ForcedMeanObject table with nDetections ≥ 3 that meet our detection criteria (see §3).
3. ML Model Features
Regardless of the choice of algorithm, the basic goal of a machine learning model is to build a map between source features, numerical and/or categorical properties that can be measured for an individual source, and labels, the target output, often a classification, of the model. This mapping is learned via a training set, a collection of sources with known labels. (For this work a source is considered "detected" only if the FPSFFlux, FPSFFluxErr, FKronFlux, FKronFluxErr, FApFlux, and FApFluxErr are all > 0.)

Machine learning models are limited by their training sets: there is no guarantee that their empirical mapping will correctly extend beyond the boundaries enclosed by the training set. Given the significant systematic uncertainties associated with Galactic reddening, and the tendency for spectroscopic samples, which are typically used to define training sets, to be biased in their target selection (see e.g., Miller et al. 2017), the motivation for "white flux" features becomes clear: they reduce potential biases in the final classifications due to selection effects in how the training set sources were targeted. Therefore, as in Tachibana & Miller (2018), we use "white flux" features in this study.

The PS1
StackObjectAttributes table provides both flux and shape (e.g., second moment of the radiation intensity) measurements in each of the five PS1 filters, whereas the PS1 ForcedMeanObject table only provides flux measurements. (The PS1 ForcedMeanObject table provides average measurements across all epochs on which a PS1 source is observed, and the average second moment of the radiation intensity is somewhat meaningless as the orientation of the detector and observing conditions vary image to image.) To create the feature set for our machine learning model, we create "white flux" features for the six different flux measurements available in the ForcedMeanObject table (FPSFFlux, FKronFlux, FApFlux, FmeanflxR5, FmeanflxR6, FmeanflxR7), as well as the E1 and E2 measurements, which represent the mean polarization parameters from Kaiser et al. (1995). We use flux ratios, rather than the raw flux measurements, which provide morphological classifications that are independent of S/N (Lupton et al. 2001). Only filters in which the source is detected are included in the sum; see Equations 1 and 2 in Tachibana & Miller (2018). (The original PS1 PSC and the PS1-STRM catalogs are both constructed using the first PS1 data release. This study uses measurements from the second PS1 data release, which corrects a percent-level flat-field correction that was applied with the wrong sign in DR1; Beck et al. 2020.)
Our final model includes nine features. Five flux ratios: whiteFPSFApRatio = whiteFPSFFlux / whiteFApFlux, whiteFPSFKronRatio = whiteFPSFFlux / whiteFKronFlux, whiteFPSFFmeanflxR5Ratio = whiteFPSFFlux / whiteFmeanflxR5Flux, whiteFPSFFmeanflxR6Ratio = whiteFPSFFlux / whiteFmeanflxR6Flux, and whiteFPSFFmeanflxR7Ratio = whiteFPSFFlux / whiteFmeanflxR7Flux; the white polarization parameters whiteE1 and whiteE2; and two "simple" distance measures, whiteFPSFKronDist and whiteFPSFApDist (see below). We find that whiteFPSFApRatio is the most useful feature, aside from the "simple" features, to separate resolved and unresolved sources. This intuitively makes sense as PS1 ApFlux measurements are matched to the seeing, whereas the R5flx, R6flx, R7flx measurements use fixed aperture sizes. With multiple images taken under different observing conditions contributing to the final forced flux measurements, fixed aperture measurements should be more noisy.
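The flux-ratio feature construction can be sketched as follows. This is a minimal illustration, not the paper's exact recipe: the per-filter fluxes are hypothetical, and summing over detected filters is a simplification of the "white flux" definition in Equations 1 and 2 of Tachibana & Miller (2018).

```python
import numpy as np

def white_flux(flux, flux_err):
    """Sum an (n_source, n_filter) flux array over the filters in which each
    source is detected, here taken to mean flux and uncertainty both > 0
    (a simplification of Eqs. 1-2 in Tachibana & Miller 2018)."""
    detected = (flux > 0) & (flux_err > 0)
    return np.where(detected, flux, 0.0).sum(axis=1)

# hypothetical grizy PSF fluxes for two sources; -999 marks a non-detection
psf = np.array([[1.0, 1.2, 1.1, 0.9, 1.0],
                [5.0, -999.0, 4.8, 5.2, 5.1]])
psf_err = np.ones_like(psf)
# hypothetical aperture fluxes: extended sources have ApFlux > PSFFlux
ap = np.where(psf > 0, psf * 1.25, psf)
ap_err = np.ones_like(ap)

whiteFPSFApRatio = white_flux(psf, psf_err) / white_flux(ap, ap_err)
# lower ratios suggest extended profiles; ratios near 1 suggest point sources
```

The same pattern yields the other four flux ratios by swapping in the Kron and fixed-aperture (R5, R6, R7) fluxes.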
Tachibana & Miller (2018) introduced a "simple" model to classify sources based solely on their measured whitePSFFlux and whiteKronFlux. The model was inspired by the use of flux ratios, which have been shown to provide a good discriminant between resolved and unresolved sources (e.g., the SDSS morphological CLASS parameter; Lupton et al. 2001). At moderate to low S/N, however, flux ratios no longer provide accurate classifications (see e.g., Figure 1). The simple model from Tachibana & Miller (2018) leverages this fact by measuring the distance of each source from a line drawn in the whitePSFFlux–whiteKronFlux plane. Unlike a flux ratio, the simple model preserves information about the S/N, meaning sources with large absolute distances from the dividing line can be classified with greater confidence.

Following from Equation 3 in Tachibana & Miller (2018), "simple" features can be calculated as:

whiteF1F2Dist(a) = (whiteF1 − a × whiteF2) / √(1 + a²),   (1)

where whiteF1 and whiteF2 are the "white flux" measurements introduced above (e.g., whiteFPSFFlux and whiteFKronFlux), a is the slope of the line in the whiteF1–whiteF2 plane,
Figure 1. The primary square panels show Gaussian KDEs of the PDF for each of the "white flux" features as a function of whiteFKronMag (= −2.5 log10[whiteFKronFlux/3631 Jy]) for the unresolved and resolved sources in the training set.

Figure 2.
The distribution of whiteFPSFApDist values for resolved, extended sources and unresolved point sources from the training set as a function of whiteKronMag. The colors and contours are the same as Figure 1. The horizontal dashed line shows the optimal threshold on whiteFPSFApDist for resolved–unresolved classification. The upper-right inset shows a zoom-out highlighting the stark difference between stars and galaxies at the bright end.

and whiteF1F2Dist is the orthogonal distance of a source from the line (sources above the line have positive values). For this study we construct two simple features for inclusion in our machine learning model: whiteFPSFKronDist and whiteFPSFApDist.

We determine the optimal value of a for the simple features via cross validation, selecting for each of the whiteFPSFKronDist and whiteFPSFApDist features the value of a that maximizes the FoM (see §4). We find that whiteFPSFApDist is better at separating resolved and unresolved sources than whiteFPSFKronDist, and therefore the "simple" model, discussed below, is based on whiteFPSFApDist. The whiteFPSFApDist and whiteFPSFKronDist distributions of resolved and unresolved sources are shown in Figures 2 and 3, respectively.
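Equation 1 is straightforward to vectorize; a minimal sketch (the slope a is set via the cross validation described in the text):

```python
import numpy as np

def white_dist(f1, f2, a):
    """Orthogonal distance of each source from the line f1 = a * f2 in the
    (f2, f1) plane, per Equation 1; sources above the line get positive
    values, and the magnitude of the distance preserves S/N information."""
    return (f1 - a * f2) / np.sqrt(1.0 + a * a)
```

For example, with a = 1 a source at (whiteF2, whiteF1) = (1, 2) lies 1/√2 above the line, while a source exactly on the line has distance 0.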
4. Training the ML Model
We construct a model to maximize the figure of merit (FoM) for our morphological classification model. Our aim is to retain nearly all the resolved, extended sources while excluding as many unresolved point sources as possible. Thus, our FoM is defined as the true positive rate (TPR) at a fixed false positive rate (FPR) of 0.005. Here TPR = TP / (TP + FN), where TP is the total number of true positive classifications and FN is the number of false negatives, and FPR = FP / (FP + TN), where FP is the number of false positives and TN is the number of true negatives.

Figure 3.
Same as Figure 2, but showing the distribution for whiteFPSFKronDist. A horizontal line is not shown as we do not recommend the use of only whiteFPSFKronDist for resolved–unresolved classification.
Using the nine features from §3, we use the random forest (RF) algorithm (Breiman 2001), as implemented in scikit-learn (Pedregosa et al. 2011), to classify PS1 sources as resolved or unresolved. Briefly, the RF algorithm constructs an ensemble of decision trees (Breiman et al. 1984), where each tree is constructed using a bootstrapped sample of the training set (a method known as "bagging"; Breiman 1996) and the split for each branch within the tree is selected from a random subset of the full feature set. The result is a lower variance estimator than is possible from a single decision tree.

To train the RF model, we replicate the procedure in Tachibana & Miller (2018). We use k-fold cross validation (CV) to optimize the model tuning parameters, namely the number of trees in the forest N_tree, the random number of features for splitting at each node m_try, and the minimum number of sources in a terminal leaf of the tree nodesize. Our CV procedure utilizes both an inner and outer loop, each with k = 10 folds. In the inner loop, a k = 10 fold CV grid search is performed over the three tuning parameters, while predictions from the optimal grid location are applied to the 1/10 of the training set that was withheld in the outer loop. This process is then repeated for the remaining 9 folds in the outer loop. We adopt the average results from the 10 different grid searches to arrive at optimal model parameters of: N_tree = 900, m_try = 3, and nodesize = 2. The RF model results are not strongly dependent on the final choice of tuning parameters.
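The nested (inner/outer) CV procedure described above can be sketched with scikit-learn as follows. The data, grid values, and the 3-fold inner loop are illustrative stand-ins; the paper uses k = 10 in both loops and a grid over N_tree, m_try, and nodesize.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# stand-in for the nine-feature HST-labeled training set
X, y = make_classification(n_samples=300, n_features=9, random_state=0)

param_grid = {
    "n_estimators": [25, 50],      # N_tree (paper: optimum 900)
    "max_features": [2, 3],        # m_try (paper: optimum 3)
    "min_samples_leaf": [1, 2],    # nodesize (paper: optimum 2)
}

outer = KFold(n_splits=10, shuffle=True, random_state=0)
fold_params, fold_scores = [], []
for train_idx, test_idx in outer.split(X):
    # inner-loop grid search over the tuning parameters
    grid = GridSearchCV(RandomForestClassifier(random_state=0),
                        param_grid, cv=3, scoring="roc_auc")
    grid.fit(X[train_idx], y[train_idx])
    fold_params.append(grid.best_params_)
    # held-out predictions from the optimal grid location for this fold
    fold_scores.append(grid.predict_proba(X[test_idx])[:, 1])
```

Averaging `fold_params` over the 10 outer folds mirrors how the final tuning parameters are adopted in the text.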
5. Results
Figure 4. ROC curves comparing the relative performance of the PS1, simple, and RF models for HST sources with i_PS1 detections. The thick slate gray, green, and purple lines show the ROC curves for the PS1, simple, and RF models, respectively. The light, thin lines show the ROC curves for the individual CV folds. The inset on the right shows a zoom in around FPR = 0.005, shown as a dotted vertical line, corresponding to the FoM (the PS1 model is not shown in the inset, because it has a very low FoM).
Our aim is to maximize the FoM of the RF model. We show receiver operating characteristic (ROC) curves of the RF, simple, and PS1 models in Figure 4. From Figure 4, it is clear that the RF and simple models greatly outperform the PS1 model. (The PS1 model is defined by a single hard cut on the PSF–Kron flux ratio measured in the i_PS1 band; for further details see Tachibana & Miller 2018.) Furthermore, while the gains are modest, the inclusion of all the "white flux" features and use of machine learning is justified as the RF model produces a higher FoM than the simple model.

The FoM of each of the three models is summarized in Table 1. In addition to providing the largest FoM, the RF model is also the most accurate and it has the largest area under the ROC curve (ROC AUC). We robustly conclude that, of the models considered here, the RF model is best. Comparing with Table 1 in Tachibana & Miller (2018), we find that the forced-photometry features derived in this study do not provide the same discriminating power as the PS1 stack-photometry features used in Tachibana & Miller (2018). Our new model performs ∼7% worse than the one in Tachibana & Miller (2018). In §6, we argue that this slight reduction in performance is more than offset by the ∼144 million additional sources that are now classified using the forced-photometry features.
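The FoM (TPR at a fixed FPR of 0.005) can be read directly off an ROC curve by interpolation. A minimal sketch with synthetic classifier scores (all values here are illustrative), assuming scikit-learn's roc_curve:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# synthetic scores: the positive class scores higher on average
y_true = np.r_[np.zeros(5000), np.ones(5000)]
scores = np.r_[rng.normal(0.0, 1.0, 5000), rng.normal(2.0, 1.0, 5000)]

# roc_curve returns fpr in increasing order, so np.interp applies directly
fpr, tpr, _ = roc_curve(y_true, scores)
fom = np.interp(0.005, fpr, tpr)  # TPR at FPR = 0.005
```

The same interpolation applied to each CV fold's held-out scores yields the per-fold FoM values behind the uncertainties quoted in Table 1.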
Table 1.
CV Results for the Training Set

model     FoM        Accuracy   ROC AUC
RF        … ± …      … ± …      … ± …
simple    … ± …      … ± …      … ± …
PS1       … ± …      … ± …      … ± …

Note—Uncertainties represent the sample standard deviation for the 10 individual folds used in CV.
We show the CV accuracy of the RF, simple, and PS1 models as a function of whiteFKronMag in Figure 5. As in Tachibana & Miller (2018), we find that the RF model provides more accurate classifications than the alternatives.

The accuracy of each model shown in Figure 5 decreases for lower S/N sources. The accuracy curves for the RF and simple models feature a slight departure from expectation in that they do not decrease much from 22 to 24 mag. This quasi-plateau in the model accuracy can be understood as the result of two components of the training set: (i) resolved sources completely dominate the source counts at these magnitudes, and (ii) the well-defined locus of unresolved sources in the training set (see Figure 1) becomes heavily blended with the resolved source population at these brightness levels. Taken together, the model will be biased towards classifying all faint sources as resolved, despite the fact that we do not explicitly include flux measurements in the feature set. With 88.5% of the whiteFKronMag > 22 mag training-set sources being resolved, an accuracy of ∼88% makes sense. This is confirmed in the bottom panel of Figure 5, which shows the RF model true positive rate (TPR) for both resolved and unresolved sources as a function of whiteFKronMag. A near 100% TPR for faint resolved sources combined with a few correctly classified unresolved sources leads to the observed quasi-plateau in Figure 5.
With a new RF model in hand, we can now provide morphological classifications for the PS1 sources that are currently missing from the PS1 PSC. Of the ∼426 million sources missing from the PS1 PSC, ∼144 million have PS1 DR2 ForcedMeanObject photometry that passes our detection criteria (see Appendix A for more details). A histogram showing the distribution of the RF classification score for these newly classified sources is shown in Figure 6.

Figure 5. Top: Model accuracy as a function of whiteFKronMag for HST sources with i_PS1 detections. Accuracy curves for the PS1, simple, and RF models are shown as slate gray pentagons, green triangles, and purple circles, respectively. The bin widths are 0.5 mag, and the error bars represent the 68% interval from bootstrap resampling. Additionally, a Gaussian KDE of the PDF for the training set, as well as the unresolved point sources and resolved, extended objects in the same subset, is shown in the shaded gray, red, and green regions, respectively. The amplitudes of the star and galaxy PDFs have been normalized by their relative ratio compared to the full i_PS1-band subset. Bottom: accuracy of resolved and unresolved classifications as a function of whiteFKronMag from the RF model (i.e., the TPR when treating each class as the positive class). Nearly all the resolved sources are correctly classified, because they dominate by number at low S/N (see text), while only bright unresolved sources are correctly classified.

Figure 6 shows that there are relatively few high-confidence classifications (i.e., very likely extended sources with RF score ≈ 0, or very likely point sources with RF score ≈ 1) among the "missing" sources. Figure 6 also reveals the likely explanation for this outcome: the vast majority of the newly classified sources are in the Galactic plane. Of the ∼144 million newly classified sources, ∼57% have galactic latitude |b| < 5 deg, while > 95% are in the Galactic plane (|b| < 15 deg). The HST COSMOS field, from which we derive our training set, has b ≈ 42 deg and as a result includes very few stellar blends, which are common at low galactic latitudes. The PS1 PSC also has significantly lower confidence classifications in the Galactic plane (see Figure 8 in Tachibana & Miller 2018).

Figure 6. Histogram showing the RF classification scores for the ∼144 million newly classified sources from PS1. All of the newly classified sources are shown in blue, while Galactic plane sources (|b| < 5°) are shown in orange, and high galactic latitude sources (|b| > 30°) are shown in grey. The vertical dotted line shows the conservative classification threshold adopted in Tachibana & Miller (2018); sources to the right of the line are considered point sources. The vast majority of the newly classified sources are in the Galactic plane.

That these sources were not "detected" in the PS1 stack images also suggests that it is difficult to make reliable photometric measurements using the PS1 data, which could also contribute to the lower confidence classifications. The upcoming third data release from the space-based Gaia telescope (Perryman et al. 2001) will improve this situation by classifying many of these ambiguous sources as stars via parallax and proper motion measurements.

Ultimately, this update to the PS1 PSC has identified 17,945,494 likely point sources using the optimized threshold from Tachibana & Miller (2018). In addition to the ∼734 million point sources in the original PS1 PSC, these ∼18 million newly identified point sources would otherwise pass filters looking for extragalactic transients in the ZTF alert stream. Their removal will reduce the number of false positive transient candidates.
6. Deployment in the ZTF Real-Time Pipeline
The ZTF real-time pipeline (Masci et al. 2019) provides AVRO alert packets (see Patterson et al. 2019) containing information (e.g., flux, position, nearest neighbors) about any newly discovered sources of variability. The packets include morphological classifications, based on the PS1 PSC (Tachibana & Miller 2018), for the three closest sources in the ZTF Stars table that are within 30″ of the newly observed variable source (see Appendix A for a summary of the PS1 sources included in the ZTF Stars table). There are ∼426 million PS1 sources in the ZTF Stars table that are not classified in the original PS1 PSC (see Appendix A).

The Gaia Early Data Release 3 includes high-precision astrometric measurements collected over a 34 month timespan for ∼1.8 billion sources. As in Tachibana & Miller (2018), we supplement the catalog with Gaia stars, which are identified via high-significance parallax and proper motion detections. A common threshold for determining "high-significance" is S/N ≥ 5, which in the case of gaussian uncertainties corresponds to a ∼3 × 10⁻⁷ probability that the observed signal is the result of noise. We can therefore select stars from Gaia sources with high S/N parallax or proper motion measurements. We adopt conservative significance thresholds because the formal uncertainties from
Gaia are slightly underestimated (Fabricius et al. 2020) and because most of the "missing" sources in the ZTF Stars table are in the Galactic Plane (e.g., Figure 6). Fabricius et al. (2020) estimate that Gaia parallax measurements underestimate the uncertainties by as much as ∼60% in crowded regions. Similarly, proper motions are found to be underestimated by as much as ∼80% in crowded regions (Fabricius et al. 2020). We therefore only consider Gaia sources with a parallax S/N or a total proper motion S/N above these conservative thresholds. (The total proper motion is estimated by adding the proper motion in Right Ascension and Declination in quadrature; see Tachibana & Miller 2018 for the corresponding uncertainty on this quantity.) From the Gaia archive (https://gea.esac.esa.int/archive/) we find there are 18,662,985 sources with either a high-significance parallax or proper motion detection in the ZTF Stars table that lack a classification in the original PS1 PSC. We flag these sources (11,479,512 of which have RF scores from §5) as point sources in the ZTF Stars table. This effectively excludes each of these sources from filters designed to find extragalactic transients in the ZTF alert stream.
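The astrometric star selection described above can be sketched as follows. Column names follow the Gaia archive convention, and snr_min is a placeholder: the conservative thresholds actually adopted are those discussed in the text, not this value.

```python
import numpy as np

def gaia_star_mask(parallax, parallax_error, pmra, pmra_error,
                   pmdec, pmdec_error, snr_min=8.0):
    """Flag likely stars via a high-significance parallax OR total proper
    motion. The total proper motion adds pmra and pmdec in quadrature, with
    its uncertainty propagated accordingly (cf. Tachibana & Miller 2018).
    snr_min is a placeholder threshold, not the value adopted in the paper."""
    plx_snr = parallax / parallax_error
    pm = np.hypot(pmra, pmdec)
    # error propagation for pm = sqrt(pmra^2 + pmdec^2)
    pm_err = (np.sqrt((pmra * pmra_error) ** 2 + (pmdec * pmdec_error) ** 2)
              / np.maximum(pm, 1e-12))
    # guard against pm = 0, where the S/N is undefined
    pm_snr = np.divide(pm, pm_err, out=np.zeros_like(pm), where=pm_err > 0)
    return (plx_snr >= snr_min) | (pm_snr >= snr_min)
```

In practice the equivalent cut can be expressed directly in an ADQL query against the Gaia archive, avoiding the need to download the full table.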
Moving forward, ZTF alert packets now include the new classifications presented here, alongside the classifications from Tachibana & Miller (2018). The addition of these new classifications to the ZTF AVRO packets should not affect existing alert-stream filters, as we describe below.

While a one-to-one mapping of point-source classification scores cannot be made between Tachibana & Miller (2018) and this study, the similarity between the two methodologies leads to classifications that are highly similar. Table 2 summarizes the TPR and FPR for different classification thresholds using the model from Tachibana & Miller (2018) and the RF model created in this study. The PS1 stack photometry used in Tachibana & Miller (2018) consistently produces a higher TPR at fixed FPR than the forced-photometry model applied to sources in the ZTF Stars table.
7. Discussion
During the preparation of this manuscript, Beck et al. (2020) published the Pan-STARRS1 Source Types and Redshifts with Machine learning (PS1-STRM) catalog, which includes the machine learning classification of PS1 sources as either stars, galaxies, or quasars. Like this study, Beck et al. (2020) use PS1 forced photometry to provide classifications. There are a couple of differences between the catalogs: the PS1-STRM classifies all ∼3 billion sources in the PS1 ForcedMeanObject table, while the updated PS1 PSC only classifies ∼half that many sources. (We note that the majority of the additional classifications in the PS1-STRM have nDetections ≤ 2.) Another difference between the two catalogs is that the PS1-STRM uses a neural-network classifier, whereas the PS1 PSC uses the RF algorithm. Finally, the PS1-STRM uses full color information in their classifier, whereas the PS1 PSC uses "white flux" features (see §3). (The inclusion of the new classifications in the ZTF alert packets will occur upon the journal acceptance of this paper.)

Table 2.
TPR and FPR for TM18 Thresholds

Catalog      Threshold   0.829   0.724   0.597   0.397   0.224
TM18         TPR         0.734   0.792   0.843   0.904   0.947
             FPR         0.005   0.01    0.02    0.05    0.1
This work    TPR         0.684   …       …       …       …
             FPR         0.005   …       …       …       …

Note—The table reports the TPR and FPR for different classification thresholds given in Table 3 in Tachibana & Miller (2018). To estimate the TPR and FPR we perform 10-fold CV on the entire training set, but only include sources with nDetections ≥ 3.

The most important distinction between the two catalogs, in our estimation, is their training sets. The PS1-STRM is trained using spectroscopic labels that predominantly come from the Sloan Digital Sky Survey (SDSS; Abolfathi et al. 2018), whereas the PS1 PSC is trained via morphological classifications from HST. An SDSS-based training set has two distinct advantages: it is nearly two orders of magnitude larger than the HST training set and it includes redshift information (which can be used to estimate photometric redshifts, as is done in the PS1-STRM).

When considering only morphological classification, or similarly star–galaxy separation, an SDSS-based training set produces biased classifications (Miller et al. 2017; Tachibana & Miller 2018). The SDSS spectroscopic targeting algorithm was biased towards specific source classes, such as luminous red galaxies, and as a result SDSS spectra are not representative of the average source in PS1 (see Figure 1 in Tachibana & Miller 2018). Furthermore, the SDSS training set is distinctly biased towards point sources at the faint end (r ≳
21 mag), which leads to models that overestimate the prevalence of point sources at these brightness levels (see e.g., Figure 7 in Tachibana & Miller 2018). It is for these reasons that we adopt the HST training set for the PS1 PSC, despite its relatively modest size.

Ultimately, we recommend the use of both catalogs. Despite the different methodologies and training sets, we expect the classifications to largely be in agreement for bright sources (r ≲ 20 mag). In cases where the catalogs agree, the classifications can be treated as extremely confident. Most of the disagreements will occur at the faint end, where both catalogs will provide noisier estimates. For faint sources where the catalogs disagree, users should consider applying an additional prior based on the observed source counts in the Universe (e.g., Henrion et al. 2011). At high galactic latitudes, nearly all the very faint sources are galaxies, while within the Galactic plane nearly everything will be a star.
8. Conclusions
We have presented an update to the PS1 PSC (Tachibana & Miller 2018), classifying ∼144 million sources that were previously "missing." The new classifications are made using a new RF model that utilizes photometric and shape features from the PS1 DR2 ForcedMeanObject table.

The training set and methodology are nearly identical to those used in Tachibana & Miller (2018), with the major difference being that that study used features from the PS1 DR1 StackObjectAttributes table. The similarity in methodology is intentional, as it allows new classifications for the previously "missing" sources to be incorporated into the PS1 PSC without a need for significant revisions to existing filters that are applied to the ZTF alert stream. We find that the new model performs ∼7% worse than the original, a slight reduction that is more than offset by the ∼144 million newly classified sources. The update to the PS1 PSC presented here will improve the extragalactic transient search efficiency for ZTF.

Spectroscopic observations from SDSS have now fueled the training sets for machine learning models to separate stars and galaxies for more than a decade (e.g., Ball et al. 2006; Beck et al. 2020). These labels have proven extremely valuable as they have been applied to several surveys beyond SDSS (e.g., Miller et al. 2017; Beck et al. 2020). Our ability to use methods built on empirical training sets is going to be severely limited by the Vera C. Rubin Observatory, whose images will be predominantly populated by extremely faint sources (r ≈ 24 mag; Ivezić et al. 2019). With few spectroscopic classifications of any kind at these depths, the separation of stars and galaxies in Rubin Observatory data is going to largely rely on data from the Rubin Observatory itself. In this regime machine learning is unlikely to play a leading role, and purely photometric methods will be required to separate stars and galaxies (e.g., Slater et al. 2020) and triage the Rubin Observatory alert stream to remove stellar variables prior to the search for extragalactic transients.
Acknowledgments
This work would not have been possible without the public release of the PS1 data. We thank F. Masci and R. Laher for helping us identify sources that were not classified in the ZTF Stars table.

A.A.M. is funded by the Large Synoptic Survey Telescope Corporation (LSSTC), the Brinson Foundation, and the Moore Foundation in support of the LSSTC Data Science Fellowship Program; he also receives support as a CIERA Fellow by the CIERA Postdoctoral Fellowship Program (Center for Interdisciplinary Exploration and Research in Astrophysics, Northwestern University). X.H. is supported by LSSTC, through an Enabling Science Grant.
Facilities:
PS1 (Chambers et al. 2016)
Software: astropy (Astropy Collaboration et al. 2013, 2018), scipy (Virtanen et al. 2020), matplotlib (Hunter 2007), pandas (McKinney 2010), scikit-learn (Pedregosa et al. 2011)
Appendix A: The ZTF–PS1 Morphological Catalog
The ZTF database contains a table (Stars) with sources selected from the PS1 DR1 that are used to provide morphological classifications in the ZTF alert packets. The ZTF Stars table was seeded from the PS1 MeanObject table and includes all PS1 MeanObject sources with nDetections ≥ 3. There are 1,919,106,844 sources in the ZTF Stars table. Of these, 1,484,281,394 are classified in the PS1 PSC and another 8,520,167 are classified as point sources based on Gaia parallax and/or proper motion measurements (Tachibana & Miller 2018). Therefore, there are 426,305,283 sources in the ZTF Stars table that did not meet the quality cuts necessary to be included in the PS1 PSC.

For the ∼
426 million ZTF Stars table sources not in the PS1 PSC, 5,885,633 had multiple rows in the PS1 StackObjectAttributes table with primaryDetection = 1, while the rest were not "detected" in the PS1 stacks. As described in §5, ∼144 million of these sources satisfy our ForcedMeanObject "detection" criteria (see §3) and are now classified. The remaining ∼281 million sources do not have reliable PS1 stack or forced photometry, and as a result remain in the ZTF Stars table with an ambiguous score of 0.5. About 8% of the still unclassified ZTF Stars table sources are not present in PS1 DR2 (mostly because they have declination δ < −30 deg). Furthermore, ∼34% of these ∼281 million sources have nDetections = 3, and ∼55% have nDetections ≤ 5. That these sources have so few detections in PS1 increases the probability that they may be spurious, and even if they are not spurious, they are otherwise very low S/N detections, which do not produce highly confident classifications.
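As a consistency check, the source counts quoted in this appendix add up: subtracting the PS1 PSC classifications and the Gaia-identified point sources from the full ZTF Stars table leaves exactly the quoted number of unclassified sources.

```python
total_stars = 1_919_106_844   # rows in the ZTF Stars table
in_psc      = 1_484_281_394   # classified in the original PS1 PSC
gaia_points = 8_520_167       # point sources via Gaia astrometry

missing = total_stars - in_psc - gaia_points
print(missing)  # 426305283, i.e., the ~426 million sources discussed above
```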
(Immediately after the release of PS1 DR1 it was recommended that sources detected on at least three individual PS1 images were unlikely to be spurious; hence the use of this selection cut for the ZTF Stars table. Only sources with a single row designated as the primaryDetection in the PS1 StackObjectAttributes table and a stack "detection," i.e., the PSF, Kron, and aperture flux all > 0, received classifications in the original PS1 PSC. See https://outerspace.stsci.edu/display/PANSTARRS/PS1+DR2+caveats.)