FLEET: A Redshift-Agnostic Machine Learning Pipeline to Rapidly Identify Hydrogen-Poor Superluminous Supernovae
Sebastian Gomez, Edo Berger, Peter K. Blanchard, Griffin Hosseinzadeh, Matt Nicholl, V. Ashley Villar, Yao Yin
Draft version September 7, 2020
Typeset using LaTeX twocolumn style in AASTeX62
Center for Astrophysics | Harvard & Smithsonian, 60 Garden Street, Cambridge, MA 02138-1516, USA
Center for Interdisciplinary Exploration and Research in Astrophysics and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208-3112, USA
Birmingham Institute for Gravitational Wave Astronomy and School of Physics and Astronomy, University of Birmingham, Birmingham B15 2TT, UK
Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill EH9 3HJ, UK
ABSTRACT

Over the past decade wide-field optical time-domain surveys have increased the discovery rate of transients to the point that ≲10% are being spectroscopically classified. Despite this, these surveys have enabled the discovery of new and rare types of transients, most notably the class of hydrogen-poor superluminous supernovae (SLSN-I), with about 150 events confirmed to date. Here we present a machine-learning classification algorithm targeted at rapid identification of a pure sample of SLSN-I to enable spectroscopic and multi-wavelength follow-up. This algorithm is part of the FLEET (Finding Luminous and Exotic Extragalactic Transients) observational strategy. It utilizes both light curve and contextual information, but without the need for a redshift, to assign each newly-discovered transient a probability of being a SLSN-I. This classifier can achieve a maximum purity of about 85% (with 20% completeness) when observing a selection of SLSN-I candidates. Additionally, we present two alternative classifiers that use either redshifts or complete light curves and can achieve an even higher purity and completeness. At the current discovery rate, the FLEET algorithm can provide about 20 SLSN-I candidates per year for spectroscopic follow-up with 85% purity; with the Legacy Survey of Space and Time we anticipate this number will rise substantially.

Keywords: supernovae: general – methods: statistical – surveys

1. INTRODUCTION

Type I Superluminous Supernovae (hereafter, SLSN-I) are a class of astrophysical transients that exceed the luminosity of normal SNe by up to two orders of magnitude. They were originally classified based on their luminosity, since most have typical peak absolute magnitudes of ≲ −21 (Chomiuk et al. 2011; Quimby et al. 2011). However, events with spectroscopic signatures that match those of SLSN-I have been discovered at lower luminosities (e.g., Lunnan et al. 2013), and they are now classified based on their hydrogen-free spectra, strong O II absorption lines at early times, and a blue continuum (Angus et al. 2019).

Corresponding author: Sebastian Gomez ([email protected])

At present, about 150
SLSN-I have been spectroscopically classified; see Table A.1 for a listing and references.

While the energy source of SLSN-I was intensely debated for a few years following their discovery, it now appears that radioactive decay of ⁵⁶Ni (as in normal Type I SNe) and circumstellar interaction (as in Type IIn SNe) cannot explain the bulk of the population. Instead, the most likely energy source appears to be the spin-down of a millisecond magnetar produced in the explosion (Kasen & Bildsten 2010; Metzger et al. 2015). This model can explain the diverse light curve behavior (Nicholl et al. 2017c), the early-time UV spectra (Mazzali et al. 2016), the late-time light curve flattening (Blanchard et al. 2018; Nicholl et al. 2018), and the nebular spectra (Dessart et al. 2012; Nicholl et al. 2019) of SLSN-I. Still, the nature of SLSN-I progenitors, their environments, and their relation to those of other stripped-envelope explosions remain areas of active investigation (e.g., Blanchard et al. 2020). Similarly, the ubiquity and origin of unusual light curve and spectroscopic features seen in some SLSN-I, such as late-time "bumps" (Nicholl et al. 2016; Inserra et al. 2017; Blanchard et al. 2018; Lunnan et al. 2019), double-peaked light curves (Nicholl et al. 2015), or potential helium lines (Yan et al. 2020), remain unclear.

Making progress on these open questions requires a substantial increase in the identification rate of SLSN-I, preferably at early times to enable spectroscopic follow-up. A significant challenge is that SLSN-I are intrinsically rare: at a volumetric rate of ∼90 SNe yr⁻¹ Gpc⁻³ at a weighted redshift of z = 1.13, they represent ≲0.5% of the detection rate in magnitude-limited surveys (Villar et al. 2019; Fremling et al. 2020). Currently, only ∼10% of all optical transients are classified spectroscopically, and with the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory, this fraction will decline even further. Several machine-learning classifiers (e.g., RAPID: Muthukrishna et al. 2019;
Avocado: Boone 2019) have been trained on synthetic data, such as the Photometric LSST Astronomical Time-series Classification project (PLAsTiCC; Kessler et al. 2019), but their performance with real data remains untested. Other classifiers such as SuperRAENN (Villar et al. 2020) or Superphot (Villar et al. 2019; Hosseinzadeh et al. 2020) have been trained on real survey data from the Pan-STARRS1 Medium Deep Survey (PS1/MDS). Overall, these classifiers have a fairly high success rate and recover ∼80% of SLSN-I, but only when using redshift information and fairly complete light curves. Additionally, the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker, which is currently providing real-time classifications for transients from ZTF (Sánchez-Sáez et al. 2020), is able to recover up to 100% of the SLSN-I in their training sample, but with a large standard deviation. FLEET is provided as a Python package on GitHub (https://github.com/gmzsebastian/FLEET) and Zenodo (Gomez et al. 2020), as well as included in the Python Package Index under the name fleet-pipe.

2. GUIDING PRINCIPLES

As discussed above, there are several efforts aimed at ML classification of astronomical transients, mainly based on light curve information from wide-field surveys. By design, some classifiers make choices that tend to optimize their overall classification success rate across a range of astronomical transients (e.g., Boone 2019; Muthukrishna et al. 2019; Gagliano et al. 2020; Hosseinzadeh et al. 2020; Villar et al. 2020). Here, we take a distinct approach by focusing on optimized classification of a single class of transients. Our algorithm is based on the following guiding principles:

1. Classifying only SLSN-I, with no regard for the classification success of other transients.
2. Obtaining the purest possible sample of SLSN-I, at the expense of sample completeness.
3. Prioritizing speed and computational resources over model complexity to allow for rapid classification.
4. Finding SLSN-I at early times to enable real-time follow-up.

This approach enables us to make efficient use of large-aperture telescopes for spectroscopic classification, as well as perform later follow-up studies. The sheer number of transients expected from LSST motivates our emphasis on computational speed, as well as purity at the expense of completeness. In particular, even if we manage to identify less than half of the SLSN-I in the data stream, but with a high success rate, then we can double the existing sample of SLSN-I by the time LSST commences.

We provide a main rapid version of the classifier in addition to two alternative classifiers with somewhat different motivations. First, a full light curve classifier that can more confidently classify SLSN-I, but at the expense of early discovery, mainly aimed at constructing large samples with only photometric data.
And second, a classifier that uses redshift information for higher purity classification, mainly in anticipation of robust photometric redshifts that will be provided by LSST.

3. TRAINING SET

To train our classifier we obtained all spectroscopically classified transients from the TNS: SNe, tidal disruption events (TDEs), active galactic nuclei (AGN) flares, and Galactic transients (e.g., cataclysmic variables and variable stars). In addition to those, we included the TDEs published in van Velzen et al. (2020), which are not yet reported to the TNS, and every unambiguous SLSN-I from the literature; see Table A.1. We also obtained all of the available photometry for each transient, from the Open Supernova Catalog (OSC; Guillochon et al. 2017) or the Zwicky Transient Facility (ZTF; Bellm et al. 2019). We require each transient to have at least 2 g-band and 2 r-band measurements to model their light curves. We restrict the list to transients within the footprint of the Pan-STARRS1 3π (PS1/3π) survey (Chambers & Pan-STARRS Team 2018) for the purpose of identifying host galaxies. Finally, we removed from the training set 44 transients with ambiguous host galaxy identifications or spurious data in order to have the cleanest data set possible; however, we kept these events in our test set to prevent any resulting biases. The resulting sample is composed of 1,813 transients, with the following distinct labels from the TNS: 800 SN Ia, 381 SN II, 156 SLSN-I, 95 CV, 71 SN IIn, 63 SN IIP, 59 SN Ic, 43 SLSN-II, 37 SN Ib, 33 SN IIb, 19 TDE, 16 SN Ic-BL, 13 SN Ibc, 12 AGN, 8 SN Ibn, and 7 Varstar (variable stars).

Table 1.
Observational Rates of Transients

Transient   Fremling        TNS             Target f
SN I        587 (77.1%)     6500 (70.8%)    73.9%
SN II       155 (20.4%)     2109 (23.0%)    19.6%
SLSN-I      12 (1.6%)       123 (1.3%)      1.5%
SLSN-II     7 (0.9%)        45 (0.5%)       0.9%
Nuclear     —               58 (0.6%)       0.6%
Star        —               340 (3.7%)      3.5%
Note — Observational rates for the relevant types of transients considered here. We normalize the rate of events in our test set to an expected Target rate f calculated from the Fremling et al. (2020) sample and the TNS sample, used for Equation 1.

Since the number of events per class varies substantially, making the training set unbalanced, the classification would be biased towards the more common classes. To mitigate this bias we over-sample each class to a total of 800 events, using the Synthetic Minority Over-sampling Technique (SMOTE; Chawla et al. 2002). This algorithm draws random samples along vectors joining every pair of objects in feature space until all classes have the same number of events. We tested an alternative multivariate-Gaussian (MVG) oversampling technique, as implemented in Villar et al. (2019), but find that when sampling features that are close to zero and constrained to be positive (e.g., redshift), SMOTE performs significantly better, even when imposing a positivity constraint on the MVG samples.
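The interpolation at the heart of SMOTE is simple to sketch. Below is a minimal NumPy stand-in (in practice a full implementation such as imbalanced-learn would be used; the 16-event class and the two features are purely illustrative) that draws synthetic events at random positions along the vectors joining each object to one of its nearest neighbors:

```python
import numpy as np

def smote_oversample(X, n_target, k=5, seed=None):
    """Minimal SMOTE-style oversampling of one minority class.

    New events are placed at random positions along the vector joining
    a randomly chosen real event to one of its k nearest neighbors,
    until the class contains n_target events.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    synthetic = []
    for _ in range(n_target - len(X)):
        i = rng.integers(len(X))
        d = np.linalg.norm(X - X[i], axis=1)    # distances to all events
        j = rng.choice(np.argsort(d)[1:k + 1])  # one of the k nearest
        lam = rng.random()                      # position along the vector
        synthetic.append(X[i] + lam * (X[j] - X[i]))
    return np.vstack([X, np.array(synthetic)])

# Oversample an illustrative 16-event minority class up to 800 events
rng = np.random.default_rng(42)
X_minority = rng.normal(loc=[50.0, 0.5], scale=[5.0, 0.1], size=(16, 2))
X_balanced = smote_oversample(X_minority, n_target=800, seed=0)
print(X_balanced.shape)  # (800, 2)
```

Because every synthetic event is a convex combination of two real events, features constrained to be positive (e.g., redshift) remain positive, which is the behavior that favors SMOTE over the MVG alternative.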
3.1. Test Set

We test the efficacy of our classifier on all of the events from the training set. In addition, we include in the test set the 44 transients that were removed in §3, to prevent any possible biases. We implement a leave-one-out cross-validation method, allowing us to train the classifier on every event except for one, and then predict the classification of that one event, cycling through all events. This allows us to robustly test our classifier without having to divide the data set into a training and test set, which would compromise the sample size.

We define completeness, classifier purity, and observed purity as useful metrics to test the efficacy of our algorithm:

    Completeness = SN_T / N_SLSN
    Classifier Purity = SN_T / (SN_T + SN_F)
    Observed Purity = SN_T / (SN_T + Σ_i η_i SN_F,i),
    η_i = (N_SLSN × f_i) / (N_i × f_SLSN),        (1)

where N_SLSN is the total number of SLSN-I in the test set, SN_T is the total number of true positive SLSN-I recovered, and SN_F is the total number of false positive SLSN-I. The relative fractions of each transient class in our test set, which we obtained directly from the TNS, do not reflect the true fractions of these transients in a magnitude-limited survey. To determine a purity that is representative of on-going and future surveys, we re-normalize the classifier purity into an observed purity, which more accurately represents the outcome of our pipeline in a real survey. Here, SN_F,i is the false positive rate for an individual transient class i, and f_i is the corresponding true observational rate for that class, listed in Table 1. We use the observational rates of SNe from Fremling et al. (2020) to estimate the expected Target rate, f, for any magnitude-limited survey. We then include nuclear transients (TDEs + AGN) and Galactic transients (CVs + variable stars) from the TNS, normalizing by the total number of classified transients from the TNS to the total number of SNe in the Fremling et al. (2020) sample.

Given that SLSN-I are over-represented in our test set compared to the rate they would have in a magnitude-limited survey, the observed purity will be lower than the classifier purity. For example, our test set has 800 SN Ia and 156 SLSN-I, or 0.20 SLSN-I for each SN Ia. But in a magnitude-limited survey, there is typically only 0.02 SLSN-I for each SN Ia. Therefore, if we wanted to predict how many SLSN-I we would be able to find in a real survey, we need to normalize the classifier purity, in this example by multiplying the false positive rate by a factor of ∼10.
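The metrics of Equation 1 translate directly into code. A short sketch (the function names are ours, not part of the FLEET package), reproducing the worked example above of 800 SN Ia versus 156 SLSN-I with the Table 1 rates:

```python
def eta(n_slsn, f_slsn, n_class, f_class):
    """Re-weighting factor eta_i of Eq. 1: scales the test-set class
    balance to the observational rates f of a magnitude-limited survey."""
    return (n_slsn * f_class) / (n_class * f_slsn)

def completeness(sn_true, n_slsn):
    return sn_true / n_slsn

def classifier_purity(sn_true, sn_false):
    return sn_true / (sn_true + sn_false)

def observed_purity(sn_true, false_pos, n_slsn, f_slsn, counts, rates):
    """false_pos, counts, rates are dicts keyed by contaminant class."""
    weighted = sum(eta(n_slsn, f_slsn, counts[c], rates[c]) * false_pos[c]
                   for c in false_pos)
    return sn_true / (sn_true + weighted)

# 156 SLSN-I (f = 1.5%) vs 800 SN Ia (f = 73.9%) in the test set:
print(round(eta(156, 0.015, 800, 0.739), 1))  # 9.6, i.e. the factor of ~10
```

A single SN Ia false positive therefore counts nearly ten times more heavily in the observed purity than in the classifier purity.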
Figure 1.
Galaxies (green) and stars (red) classified by the CFHTLS survey (D1 field), plotted in terms of the difference between their PSF and Kron magnitudes as a function of apparent i-band magnitude in PS1/3π. Using this calibration, we assign a probability of being a galaxy to all objects in the field of a transient based on their location in this diagram. The top panel shows the percent of objects for which our classification matches that of the CFHTLS as a function of apparent magnitude; a 90% match occurs at a magnitude of 22.5.

4. CONTEXTUAL INFORMATION

SLSN-I are known to prefer low-luminosity galaxies (Lunnan et al. 2014), and it is therefore advantageous to use contextual information in their classification. Here we describe our method of assigning a host galaxy to each transient. We obtain the PS1/3π grizy (Chambers & Pan-STARRS Team 2018) and SDSS ugriz (Alam et al. 2015; Ahumada et al. 2019) PSF and Kron magnitudes of every cataloged source in a 1′ radius region around the transient location. We use this information both to separate galaxies from stars, and to identify the most likely host galaxy.

4.1. Star-Galaxy Separation
The first step in identifying the host galaxy of each transient is to separate stars from galaxies. SDSS provides a classification for every object in its catalog, but since SDSS is shallower than PS1/3π and has a smaller footprint, this is not sufficient for our purposes. Instead, we develop a method to assign a probabilistic value (between 0 and 1) of how likely every object in SDSS and PS1/3π is to be a galaxy.

Figure 2. PS1/3π i-band image of a 1′ × 1′ field centered on the position of the SLSN-I SN 2013hy, indicating objects classified as galaxies (green) and stars (red) based on our star-galaxy separation algorithm (§4.1). The most likely host galaxy has P_cc ≈ 0.
03, as determined by the algorithm described in §4.2.

To train our star-galaxy separation algorithm we use data from the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS; Hudelot et al. 2012), which provides magnitudes and star-galaxy classifications down to magnitudes deeper than both SDSS and PS1/3π. We specifically use the D1 field (1 deg²) and cross-match with every overlapping object in SDSS and PS1/3π, yielding a large cross-matched training sample. Galaxies tend to have a larger difference between their PSF and Kron magnitudes than stars, so we use this specific feature (PSF − Kron) to separate them; see Figure 1 for an example in the i-band. The CFHTLS uses the CLASS_STAR classifier flag in SExtractor to separate stars from galaxies, which relies on a multi-layer feed-forward neural network (Bertin & Arnouts 1996).

In our galaxy-star separator we assign a probability of being a galaxy to any object in SDSS or PS1/3π by using a custom k-nearest-neighbors algorithm. Given an object's PSF and Kron magnitudes, we find the 20 nearest objects in the PSF versus PSF − Kron phase-space (Figure 1) and calculate its probability of being a galaxy as the fraction of those 20 neighbors from the CFHTLS training set that are galaxies. Experimenting with different numbers of neighbors, we find that at least 10 neighbors are required to produce robust estimates, with only marginal improvement in accuracy beyond 20 neighbors. For every object we calculate its probability of being a galaxy in every available filter, and adopt the average probability among all filters.

An alternative star-galaxy separator for objects in PS1/3π is presented in Tachibana & Miller (2018). Although it has a very high accuracy, it does not include objects from SDSS, for which we also require a classification when they are not in the PS1/3π catalog. We note that if we label objects with a probability of being a galaxy of P_G ≤
10% as stars, our classifier agrees with the classification from Tachibana & Miller (2018) at the 90% level. In Figure 2 we show an example of our star-galaxy separator applied to a field from PS1/3π centered on the location of the SLSN-I SN 2013hy.

We opt to only label objects with a galaxy probability of P_G <
10% as stars, to avoid missing a possible host galaxy identification. While this conservative cut retains more stars in the sample, these are rarely predicted to be the most likely host galaxy of a SN due to the small size of their PSF. We find that a stricter threshold results in a large number of host galaxies being rejected as stars. In the top panel of Figure 1 we show that, using the classification from the CFHTLS as a reference, our threshold for labeling stars yields a successful galaxy classification for essentially all objects brighter than ≈22 mag, and for ≈65% down to 23 mag.
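The custom k-nearest-neighbors separator of §4.1 amounts to a neighbor vote in the PSF versus PSF − Kron plane. A self-contained sketch follows; the training sample here is synthetic and purely illustrative (in the real pipeline the labels come from the CFHTLS, and the probability is averaged over all available filters):

```python
import numpy as np

def galaxy_probability(psf, kron, train_feats, train_is_galaxy, k=20):
    """Fraction of the k nearest training objects (in PSF vs PSF-Kron
    space) that are galaxies. train_feats is an (N, 2) array of
    [psf_mag, psf_minus_kron]; train_is_galaxy holds boolean labels."""
    point = np.array([psf, psf - kron])
    dist = np.linalg.norm(train_feats - point, axis=1)
    return float(train_is_galaxy[np.argsort(dist)[:k]].mean())

# Illustrative training set: stars have PSF ~ Kron; galaxies have PSF > Kron
rng = np.random.default_rng(1)
n = 500
stars = np.column_stack([rng.uniform(14, 23, n), rng.normal(0.0, 0.05, n)])
galaxies = np.column_stack([rng.uniform(14, 23, n), rng.normal(1.0, 0.3, n)])
feats = np.vstack([stars, galaxies])
labels = np.concatenate([np.zeros(n, bool), np.ones(n, bool)])

p_pointlike = galaxy_probability(18.0, 18.0, feats, labels)  # PSF = Kron
p_extended = galaxy_probability(18.0, 17.0, feats, labels)   # PSF - Kron = 1
print(p_pointlike, p_extended)
```

An object with PSF ≈ Kron lands among the stellar training points and receives a low galaxy probability, while an extended source (PSF − Kron ≈ 1) receives a high one.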
4.2. Host Identification
Once we have identified which objects in the field are likely to be galaxies, we can determine which galaxy is the most likely host for a given transient. First, we label transients as stellar if a star (i.e., an object with P_G < 0.1) lies within ∼1″ of the transient's position. Then, for the non-stellar transients we determine the probability of chance coincidence for each galaxy in the field relative to the transient's position. We follow the method of Bloom et al. (2002) and Berger (2010), using the measured number density of galaxies brighter than a magnitude m, Σ(≤m), to calculate the probability of chance coincidence:

    P_cc = 1 − e^(−π (d² + 4R²) Σ(≤m)),
    Σ(≤m) = 10^(0.33(m − 24) − 2.44) / (0.33 ln(10)),        (2)

where d is the angular separation between the center of a galaxy and the transient, and R is the half-light radius of the galaxy obtained from the SDSS catalog, or from the PS1/3π catalog if the object is not in the SDSS catalog. We consider the galaxy with the lowest value of P_cc to be the host, as long as P_cc ≤ 0.1. Otherwise, we designate the transient as "host-less," given the more likely situation that its host galaxy is fainter than the magnitude limit of SDSS and PS1/3π.
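Equation 2 is straightforward to evaluate; the example galaxy below (m = 22, R = 1″, at an offset of d = 1″) is illustrative:

```python
import math

def sigma_m(m):
    """Number density of galaxies brighter than magnitude m
    (arcsec^-2), following Eq. 2 (Bloom et al. 2002)."""
    return 10 ** (0.33 * (m - 24) - 2.44) / (0.33 * math.log(10))

def p_chance(d, r_half, m):
    """Probability of chance coincidence for a galaxy of magnitude m
    and half-light radius r_half at angular separation d (arcsec)."""
    return 1.0 - math.exp(-math.pi * (d ** 2 + 4 * r_half ** 2) * sigma_m(m))

p = p_chance(d=1.0, r_half=1.0, m=22.0)
print(f"{p:.3f}")  # 0.016 -> well below the P_cc <= 0.1 host threshold
```

A faint galaxy far from the transient instead yields P_cc near unity, and is rejected as a chance alignment.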
Table 2.
Feature Sets

Set   Features                        Peak Purity   Completeness   P(SLSN−I)
1     W + Δt + R_n + Δm               …             ≈16%           0.73
2     W + Δt + Δm                     …             ≈2%            0.81
3     W + Δt + R_n                    …             ≈20%           0.84
4     R_n + Δm                        …             ≈4%            0.89
5     W + R_n                         …             ≈16%           0.74
6     W + Δt + R_n + (g − r)          ≈91%          ≈17%           0.80
7     W + Δt + R_n + Δm + (g − r)     ≈82%          ≈2%            0.88
8     W + R_n + Δm + (g − r)          ≈94%          ≈3%            0.87
Note — Different sets of light curve and contextual features used to train our classifier. We list the highest classifier purity that each set of features achieves, as well as the corresponding completeness and classification probability P(SLSN−I) at that peak purity. W is the width of the light curve, R_n is the normalized host separation, Δm is the peak transient magnitude minus the host magnitude, Δt is the time of peak magnitude minus the time of discovery, and (g − r) is the light curve color at peak.

Figure 3.
Light curves of the SLSN-I SN 2011ke fit with the model described in Equation 3. The dashed lines show the fit using only data up to 20 days after detection (with a fixed value of A), while the solid lines use data up to 70 days after discovery (with A as a free parameter). The former is part of our main rapid classifier, while the latter is part of an alternative classifier that uses full light curves.

5. LIGHT CURVE MODEL

In addition to the contextual information, we use the light curves of each transient to predict which transients are most likely SLSN-I. We obtain photometric data from the OSC, as well as from ZTF using the Make Alerts Really Simple (MARS) broker (https://mars.lco.global/). We correct all the photometry for Galactic extinction using the Schlafly & Finkbeiner (2011) dust maps, assuming R_V = 3.1, and fit the light curves with the model

    m(t) = e^(W(t − φ)) − A × W(t − φ) + m₀,        (3)

where W is the effective width of the light curve, A modifies the decline time relative to the rise time, m₀ is the peak magnitude, and φ is a phase offset relative to the time of the first observation. An example of this function fit to a SLSN-I (SN 2011ke) is shown in Figure 3. We fit this model independently to the g- and r-band light curves using the emcee implementation (Foreman-Mackey et al. 2013) of the Goodman & Weare (2010) Markov chain Monte Carlo algorithm, and adopt the median of the posterior as the best estimate for each parameter. We use flat uninformative priors for all parameters, but initiate the walkers' positions at a value of m₀ equal to the brightest observed magnitude, and a value of φ that corresponds to the time of that measurement. We find that the model fits typically converge within ∼30 steps.

We use two versions of Equation 3 to test and evaluate the classifier. One version has a fixed value of A, used for our main rapid classifier; the adopted value of A has only a marginal effect on the results, since this version only uses data up to 20 days after detection, which do not encompass a decline phase. The second version uses data up to 70 days after discovery and has A as a free parameter to fit the light curve decline. This model is used for the full light curve classifier.

6. CLASSIFICATION ALGORITHM

To classify the transients we use the contextual and light curve information described in §§4 and 5, respectively, with an implementation of the random forest (RF) algorithm in the scikit-learn Python package (Pedregosa et al. 2012). In this manner, we assign to each transient a classification probability of being a SLSN-I. This algorithm takes various sub-samples of the training set and forms a number of decision tree classifiers to classify each object. The output classification probability is the result of averaging the output of all the trees in the forest. We run the classifier with 100 estimators to mitigate over-fitting and improve predictive accuracy. We also run each version of the model 25 times using different initial random seeds to estimate the classifier's uncertainties. We run the classifier using the Gini index as the criterion that minimizes the probability of misclassification. We optimize the depth of the trees in each RF by running a grid of models from a depth of 3 to 12 in steps of 1, and find that a depth of 7 performs best (depths of 6 and 8 performed similarly well, within the 1σ uncertainty derived from the different random seed iterations). Additionally, we optimize the grouping of transient classes into different sets.
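The RF configuration described above (100 estimators, Gini criterion, depth 7) can be sketched with scikit-learn; the three-class synthetic training set below is purely illustrative, standing in for the real features of Table 2 over all transient classes:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative stand-in features (e.g. width, delta_t, R_n, g-r) per class
rng = np.random.default_rng(0)
n = 300
X = np.vstack([
    rng.normal([60, 40, 0.5, -0.2], [10, 10, 0.3, 0.1], (n, 4)),  # "SLSN-I"
    rng.normal([20, 10, 1.5, 0.3], [5, 5, 0.5, 0.1], (n, 4)),     # "SN Ia"
    rng.normal([30, 15, 0.8, 0.5], [8, 5, 0.4, 0.1], (n, 4)),     # "SN II"
])
y = np.repeat(["SLSN-I", "SNIa", "SNII"], n)

clf = RandomForestClassifier(n_estimators=100, criterion="gini",
                             max_depth=7, random_state=0)
clf.fit(X, y)

# Tree-averaged class probabilities for one new event
proba = clf.predict_proba([[60, 40, 0.5, -0.2]])[0]
p_slsn = proba[list(clf.classes_).index("SLSN-I")]
p_not = proba.sum() - p_slsn  # probability of "not SLSN-I"
print(round(float(p_slsn + p_not), 6))  # 1.0
```

In the real pipeline this is repeated 25 times with different random seeds to estimate the uncertainty on P(SLSN−I).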
6.1. Feature Selection

Figure 4. Phase-spaces of features selected for the classifier, plotted for the various classes of transients. Top: The normalized host separation (R_n) versus the time difference between the light curve peak and the first detection (Δt). For host-less transients we set R_n = 0 (shown here at a small offset for clarity). Bottom: Light curve width in r-band, W_r, compared to the color of the transient during peak, (g − r).

Unlike newly-discovered transients, the transients in our training set have full light curves. Since a goal of FLEET is to find SLSN-I in real time, we test the algorithm using a varying cutoff time for the light curve data. Naturally, with more data the light curve models are better constrained, but this delays the identification and spectroscopic follow-up to a later phase when the SN is fainter. For our rapid classifier we find optimal results when using the first 20 days of data for each light curve, by which time most SLSN-I have not reached their peak luminosity.
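For reference, the light curve model of Equation 3 that underlies the width (W) and peak-time features can be written out directly. The parameter values below are illustrative; setting dm/dt = 0 shows that the peak (minimum magnitude) occurs at t = φ + ln(A)/W:

```python
import math

def model_mag(t, w, a, phi, m0):
    """Eq. 3: exponential rise and quasi-linear decline, in magnitudes."""
    return math.exp(w * (t - phi)) - a * w * (t - phi) + m0

w, a, phi, m0 = 0.1, 0.6, 10.0, 20.0  # illustrative parameter values
t_peak = phi + math.log(a) / w        # analytic peak time (before phi, A < 1)

# Confirm numerically on a grid of phases
grid = [phi - 60 + 0.1 * i for i in range(1200)]
t_min = min(grid, key=lambda t: model_mag(t, w, a, phi, m0))
print(round(t_peak, 2), round(t_min, 1))  # 4.89 4.9
```

In the rapid classifier A is held fixed and the model is fit (via emcee) only to the first 20 days of data, so the decline term is weakly constrained by design.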
Figure 5. Top: Correlated importance for the features used in the rapid version of our classifier. Bottom: Correlation matrix for the same features.
For the rapid classifier we have 6 light curve parameters (3 in each filter) that can be used as input features: the widths of the light curve, W, the phase offsets, φ, and the peak magnitudes, m₀. In addition to these we explore the use of two additional features: (i) Δt, the time difference from the first detection of a transient to its observed light curve peak in either g- or r-band, whichever one is brightest; and (ii) the g − r color at peak, using the model fits, where the time of peak is the one with the brightest observed magnitude in either g- or r-band.

For the contextual information features we test the use of several host galaxy parameters: the apparent magnitude of the host, m_h, its half-light radius in r-band, R, the projected angular separation between the transient and its host center, D, the projected angular separation normalized by the galaxy radius, R_n, and the difference between m₀ and m_h in r-band, Δm. For host-less transients we use the limiting magnitude of PS1/3π of r ≈ 23 mag for m_h, and set all other galaxy parameters to 0 (since those cannot be measured for a non-detected host).
Figure 6.
Fraction of SLSN-I correctly identified by the rapid version of our classifier amongst the top 20 objects predicted to be SLSN-I, as a function of days of light curve data used. The peak purity is about 90% when using ≳20 days of data. This purity is relevant for the training set, before normalizing to the observational rates in a magnitude-limited survey (§3.1). The dip at ≳70 days comes from a single CV with a 50-day-long outburst that was classified as a SLSN-I due to its long light curve and lack of a detected "host".
We tested several combinations of the available light curve and contextual features in order to determine which set yields the highest purity of SLSN-I while maintaining reasonable completeness; these are listed in Table 2. We find that the most relevant features for separating SLSN-I from other transients are W_g and W_r, Δt, R_n, Δm, and (g − r). In Figure 4 we show how the different classes of transients lie in feature-space. In Table 2 we list the highest purity, and associated uncertainty, achieved for each feature set, as well as the corresponding completeness and classification confidence, P(SLSN−I), at which this highest purity is achieved. We find that the rapid classifier performs best with the feature set comprising the light curve widths W in g- and r-band, the normalized host separation, R_n, the time of peak magnitude minus the time of discovery in either band, Δt, and the light curve color at peak, (g − r).

The importance of each feature used is not defined independently of the other features: if two features are correlated, then their relative importance might be affected. In the bottom panel of Figure 5 we show the correlation between features, and find that, with the exception of W_g and W_r, the features are mostly independent. In order to calculate the correlated importance we use the permutation importance method described in Breiman (2001). The correlated importance of each feature is shown in the top panel of Figure 5.

Figure 7. Cumulative distribution as a function of classification confidence (P) for transients classified as SLSN-I (red) and non-SLSN-I (blue). The crosses mark events that are misclassified. We find that for the SLSN-I sample, the misclassified events are mainly concentrated at P(SLSN−I) ≲ 0.6.

In Figure 6 we show how the rapid version of the classifier (trained on the first 20 days of light curve data) performs as a function of days of light curve data used, and include the contaminating classes of transients. When considering the top 20 transients with the highest predicted confidence P(SLSN−I), we find that the classifier performance rises for the first ∼20 days, and then plateaus at a peak classifier purity of about 90% (i.e., we correctly identify about 18 of the top 20 transients classified as SLSN-I). This purity is relevant for the training set, without normalizing for the observational rates described in §3.1.

6.2. Model Validation
We use three different methods to evaluate the performance of our classifier: a confusion matrix, a purity/completeness curve, and the fraction of SLSN-I recovered. Unless otherwise stated, the values reported in this section have been corrected for the observational rates expected in a magnitude-limited survey as described in §3.1, i.e., they represent the observed purity. Since we are not concerned with the classification of transients other than SLSN-I, we collapse the individual transient classifications into a binary SLSN-I versus non-SLSN-I classification. To calculate the probability of non-SLSN-I for each transient, we sum the probabilities of all other transient classes.

Figure 8. The observed purity and completeness for the best-performing sets of features described in Table 2. The purity curve represents the percent of transients that are SLSN-I as a function of the classifier confidence P(SLSN-I). The shaded region for the purity and the error bars for the completeness represent 1σ uncertainties.

Figure 9. Confusion matrix, indicating a purity of 80% for SLSN-I; based only on objects with a classification probability of P(SLSN-I) > 0.75 or P(not-SLSN-I) > 0.75, for a total of 1438 transients.

In Figure 7 we show how the rapid classifier performs at classifying SLSN-I and not misclassifying other objects, as a function of classification confidence level. We find that most of the misclassified SLSN-I are at P(SLSN−I) ≲ 0.
6, with only 4 misclassified SLSN-I at higher values of P(SLSN−I). The few objects that are true SLSN-I but were misclassified as something else with high confidence are usually SLSN-I with relatively bright host galaxies that were misclassified as Type II SNe, which have light curves that might also appear broad due to their late-time plateau.

The completeness and purity of the rapid classifier for the three top-performing feature sets are shown in Figure 8. As expected, the purity increases and the completeness declines as we restrict the sample to events with progressively higher values of classification confidence. For P(SLSN−I) > 0.
5, the observed purity increases toward a maximum of ≈85%, with a corresponding completeness of ≈20%. For a survey yielding ∼17,000 transients a year, assuming an observational rate of 1.5% for SLSN-I (Table 1), a 20% completeness corresponds to ∼
50 SLSN-I a year that could be discovered.

In Figure 9 we show the confusion matrix, namely, the label predicted by our classifier compared to the true label of the transient. We impose a confidence cut of P > 0.75 for either the SLSN-I or not-SLSN-I classes, corresponding to the peak classifier purity (Figure 8); this leads to a sample of 1438 events. We see that 14 out of the 18 transients predicted to be SLSN-I are correctly labeled, indicating a classifier purity of 80%.

We run an additional model validation to test for overfitting. Given the relatively small size of our data set, we split the entire data set into two independent sets, a training set (with 1209 objects) and a test set (with 604 objects), as opposed to a traditional training/test/validation split. We optimize the combination of transient class grouping, depth of the RF trees, and included features using a leave-one-out cross-validation method on the training set. We find that the best results (in terms of purity and completeness) are consistent with the main classifier presented in this section, with the exception that a depth of 5 is slightly preferred over a depth of 7 for the RF trees. We then test this classifier on the 604-object test set and find that it performs as expected, with a maximum classifier purity of 75% and a corresponding completeness of 15% for objects with p(SLSN-I) > 0.75.
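The binary collapse and the P > 0.75 confidence cut described above can be sketched with scikit-learn (listed in the paper's Software). The toy data, forest settings, and class encoding (class 0 standing in for SLSN-I) below are ours for illustration, not the trained FLEET model:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)

# Toy four-class problem: class 0 plays the role of SLSN-I.
y = rng.integers(0, 4, size=600)
X = rng.normal(size=(600, 4))
X[:, 0] += y                       # make the classes separable

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

clf = RandomForestClassifier(n_estimators=100, max_depth=7,
                             random_state=0).fit(X_train, y_train)

proba = clf.predict_proba(X_test)
p_slsn = proba[:, 0]               # P(SLSN-I)
p_not = proba[:, 1:].sum(axis=1)   # collapse all other classes

# Keep only high-confidence classifications, as in Figure 9.
confident = (p_slsn > 0.75) | (p_not > 0.75)
pred = (p_slsn > 0.5).astype(int)          # 1 = SLSN-I
truth = (y_test == 0).astype(int)
cm = confusion_matrix(truth[confident], pred[confident], labels=[1, 0])
print(cm)  # rows: true SLSN-I / not; columns: predicted SLSN-I / not
```

Summing `proba[:, 1:]` is the probability collapse described in the text; the confidence mask then restricts the confusion matrix to the events a follow-up program would actually act on.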
A full run of FLEET on a new transient takes only minutes on a personal computer, and about half that time to re-run on an existing transient once the required catalog data have been downloaded and stored locally. We note that since FLEET is designed to rapidly select the most promising SLSN-I candidates for follow-up, manual vetting of the top candidate events can further increase the sample purity. This is because some candidates might be due to obvious failure modes; for example, an AGN with a highly variable light curve might be classified as a SLSN-I due to its "broad" light curve, but manual inspection will reveal a variable nuclear source that is not SN-like. Another potential failure mode that can be mitigated with manual inspection is when SDSS and/or PS1/3π report large galaxies as multiple individual sources, causing the classifier to associate the transient with a small, dim source instead of the main galaxy.

To summarize, our rapid classifier, using basic light curve and contextual information (and no redshift information), can achieve a factor of 30−60 improvement over random selection for SLSN-I, with a completeness of ∼20%−30%.

7. ALTERNATIVE CLASSIFIERS

The rapid version of the FLEET classifier presented above is tailored to find a pure sample of SLSN-I before or near peak, so as to enable real-time follow-up. In this section we explore two alternative classifiers that utilize additional information: (i) using redshift as a feature, based on the expectation that LSST will provide photometric redshifts with ∼5% uncertainty for galaxies down to i ≈ 25 mag (Graham et al. 2018); and (ii) using more complete light curve information, including the decline phase, which may hinder spectroscopic classification, but will provide samples of SLSN-I for pure photometric population studies. We optimize these alternative classifiers in terms of feature selection, depth of the classifier's trees, and time span of the light curve used, in the same manner as for the main rapid classifier described in §6.

7.1. Redshift Classifier
A key advantage of our rapid classifier is that it does not rely on redshift information. However, with the advent of LSST it is expected that robust photometric redshifts will be available for galaxies down to i ≈ 25 mag. Since SLSN-I are generally more luminous than other SN classes, redshift information is certain to aid in the classification confidence. In Figure 10 we plot the peak absolute r-band magnitude as a function of redshift for all of the extragalactic transients in our training set, indicating how well SLSN-I can be separated when redshift information is available.

Figure 10. Peak absolute magnitude in r-band versus spectroscopic redshift for all the transients in our sample (excluding stars). As expected, SLSN-I separate well from other types of transients when the redshift is known.

To test this effect, we use here the known spectroscopic redshift of each transient in our training set (assigning Galactic transients a redshift of 0). As in the rapid classifier, we only use the first 20 days of data (designed to enable rapid follow-up) and optimize for RF depth and features. We find that the best-performing feature set yields an observed purity of about 60% and a completeness of about 60% at P(SLSN-I) > 0.5. At a fixed observed purity of 50%, the redshift classifier yields ≈
65 candidate SLSN-I, significantly higher than the ≈
27 candidate SLSN-I at 50% purity for the main classifier. Stated differently, the redshift classifier achieves 80% observed purity for the 27 top candidate SLSN-I, compared to the 50% observed purity for the main classifier. We therefore conclude that when robust redshift information is available it can significantly aid in the purity and completeness of the classifier.
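The separation in Figure 10 rests on converting an apparent peak magnitude to an absolute one at the known redshift. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3 (this excerpt does not state the adopted cosmology) and ignores K-corrections; it is an illustration of the conversion, not FLEET's code.

```python
import numpy as np

C_KMS, H0, OMEGA_M = 299792.458, 70.0, 0.3   # assumed cosmology

def luminosity_distance_mpc(z, n=4096):
    """Luminosity distance in a flat LambdaCDM cosmology, via
    trapezoidal integration of the inverse Hubble parameter."""
    zs = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(OMEGA_M * (1 + zs) ** 3 + (1 - OMEGA_M))
    d_c = (C_KMS / H0) * np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    return (1 + z) * d_c             # comoving -> luminosity distance

def peak_absolute_mag(m_app, z):
    """Absolute magnitude from apparent magnitude and redshift
    (distance modulus only; K-corrections ignored)."""
    mu = 5.0 * np.log10(luminosity_distance_mpc(z) * 1e6 / 10.0)
    return m_app - mu

# A transient peaking at m_r = 19.0 at z = 0.1 sits near M_r ≈ -19.3,
# normal-SN territory; a SLSN-I at that redshift would be brighter.
print(round(peak_absolute_mag(19.0, 0.1), 1))
```

Adding this absolute magnitude as a feature is what lets the redshift classifier exploit the luminosity gap between SLSN-I and other classes.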
7.2. Full Light Curve Classifier
The rapid classifier is trained on only the first 20 days of light curve data. Here we investigate the efficacy of using more complete light curves. This may inhibit the success of spectroscopic classification, since SLSN-I are on average about 2 mag fainter 70 days after discovery than 20 days after discovery. But using light curves well beyond peak allows for a more robust classification and can aid in the construction of more complete photometric SLSN-I samples once the events fade away. For this full light curve classifier we measure the decline rate by fitting for A in Equation 3. After optimizing the classifier we find that the feature set including W and A (Table 2), and a depth of 9 for the RF trees, provide the best results. We similarly find that using the first 70 days of light curve data provides optimal results; later-time data tend to be of lower quality and are more strongly affected by non-monotonic light curve features that cannot be captured in our simple light curve model. In Figure 11 we show how the full light curve classifier performs in terms of classification probability. We find an overall better performance than for the rapid classifier, achieving a comparable peak observed purity, but at P(SLSN-I) ≈ 0.65 instead of ≈ 0.80, and hence with a higher completeness of about 40% compared to 20% for the rapid classifier. As shown in Figure 12, this essentially means that the full light curve classifier can achieve 50% purity for a comparable number of top SLSN-I candidates as the redshift classifier, ≈65 events. Similarly, it can achieve an observed purity comparable to the peak observed purity of the rapid classifier, but for about 45 top SLSN-I candidates as opposed to about 27.
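Equation 3 itself is not reproduced in this excerpt, so as an illustration we fit the post-peak decline with a simple linear-in-magnitude model, m(t) = m_peak + A·t; the photometry below is synthetic, and this model form is only a stand-in for the paper's actual light curve model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic post-peak photometry: a decline rate of 0.03 mag/day
# over the first 70 days used by the full light curve classifier.
t = np.linspace(0, 70, 30)            # days since peak
m = 20.0 + 0.03 * t + rng.normal(0, 0.05, t.size)

# Least-squares fit of m(t) = m_peak + A * t; A is the decline-rate
# feature (larger A = faster fading, in mag/day).
A, m_peak = np.polyfit(t, m, 1)
print(round(A, 3))  # close to the input 0.03 mag/day
```

The fitted A would then be passed to the random forest alongside the width W and the other features.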
Figure 11.
Left: Completeness as a function of confidence for all three classifiers presented here. Right: Purity, corrected for observational rates, for the same classifiers. The shaded regions represent the 1σ uncertainties.
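The "purity, corrected for observational rates" quoted in Figures 11 and 12 can be illustrated as a prevalence correction: the raw purity measured on the training sample (where SLSN-I are ≈8.6% of events) is rescaled to the ≈1.5% SLSN-I fraction of a magnitude-limited survey (Table 1). The Bayes-style rescaling of the positive predictive value below is our sketch of such a correction, not necessarily FLEET's exact procedure:

```python
def observed_purity(raw_purity, f_train, f_survey=0.015):
    """Rescale a purity measured at class fraction f_train to the
    purity expected at the survey's class fraction f_survey."""
    # Relative selection rates of true and false positives implied
    # by the raw purity at the training-set prevalence.
    tp_rate = raw_purity / f_train
    fp_rate = (1.0 - raw_purity) / (1.0 - f_train)
    return (tp_rate * f_survey
            / (tp_rate * f_survey + fp_rate * (1.0 - f_survey)))

# A raw purity of 50% at the 8.6% training fraction corresponds to a
# substantially lower purity at the 1.5% survey rate.
print(round(observed_purity(0.50, 0.086), 3))  # → 0.139
```

When the training fraction already equals the survey fraction the correction is the identity, as expected.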
Figure 12.
Left: Completeness as a function of the top transients classified as SLSN-I for all three classifiers presented here. Right: Purity, corrected for observational rates, for the same classifiers. The shaded regions represent the 1σ uncertainties.

8. CONCLUSIONS

We have presented a random forest classifier, FLEET, designed specifically to rapidly identify SLSN-I with high purity, without the need for redshift information. We trained this classifier on a sample of about 1800 classified transients reported to the TNS, including 156 SLSN-I (i.e., 8.6% of the total sample). The classifier uses both light curve and contextual host galaxy information. We assess the observed purity achieved by FLEET for the actual rate of SLSN-I in a magnitude-limited survey, ≈1.5%. Our main conclusions are as follows:

• We find that the most important features are the light curve width, the g − r color at peak, and the projected angular separation between the transient and host galaxy normalized by the host radius.

• We find an observed purity of about 50% for events classified as SLSN-I with a probability confidence of P(SLSN-I) > 0.5. This is a factor of 33 improvement compared to random selection (i.e., compared to the 1.5% fraction of SLSN-I in a magnitude-limited survey). The completeness for this classification confidence threshold is about 30%.

• We find a peak observed purity of about 85% for SLSN-I, corresponding to a classification probability threshold of P(SLSN-I) > 0.80 and a total of ∼15 objects. The completeness for this classification confidence threshold is about 20%.

In addition to the main rapid classifier we also explored two alternative classifiers that use redshift information and full light curves, respectively. As expected, we find that these classifiers achieve better results, with a significant increase in completeness, by about a factor of 2, for an observed purity that matches the peak performance of the main rapid classifier.

Placing our results in context, we note that at present current surveys are reporting thousands of transients a year, out of which those with sufficient light curve coverage (g- and r-band) and localization (within the footprint of PS1/3π) can be classified by our algorithm. For an observational SLSN-I fraction of 1.5%, this sample contains about 90 SLSN-I per year. Our rapid classifier can therefore recover about 30 SLSN-I with a purity of 50%, thereby requiring about 60 follow-up spectra per year; or alternatively, about 18 SLSN-I per year with a purity of about 85%, requiring about 21 follow-up spectra. Looking forward to LSST, which is expected to have ∼10^4 SLSN-I per year in its data stream (Villar et al. 2018), our classifier could discover ∼140 SLSN-I a month, with ∼170 follow-up spectra. This would increase the existing sample by two orders of magnitude over the lifetime of LSST.

The Berger Time-Domain Group is supported in part by NSF grant AST-1714498. V.A.V. acknowledges support from a Ford Foundation Dissertation Fellowship. Operation of the Pan-STARRS1 telescope is supported by the National Aeronautics and Space Administration under grant No. NNX12AR65G and grant No. NNX14AM74G issued through the NEO Observation Program. This work has made use of data from the European Space Agency (ESA) mission
Gaia, processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This research has made use of NASA's Astrophysics Data System. This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France. Based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/IRFU, at the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l'Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at Terapix available at the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. This research has made use of the NASA/IPAC Extragalactic Database, which is funded by the National Aeronautics and Space Administration and operated by the California Institute of Technology.
Facilities:
ADS, TNS
Software:
Astropy (Astropy Collaboration 2018), extinction (Barbary 2016), Matplotlib (Hunter 2007), emcee (Foreman-Mackey et al. 2013), NumPy (van der Walt et al. 2011), scikit-learn (Pedregosa et al. 2012), SMOTE (Chawla et al. 2002)

REFERENCES
Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2019, arXiv e-prints, arXiv:1912.02905
Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
Angus, C. R., Smith, M., Sullivan, M., et al. 2019, MNRAS, 487, 2215
Astropy Collaboration. 2018, AJ, 156, 123
Barbary, K. 2016, extinction, v0.3.0, Zenodo, doi:10.5281/zenodo.804967
Bellm, E. C., Kulkarni, S. R., Graham, M. J., et al. 2019, PASP, 131, 018002
Berger, E. 2010, ApJ, 722, 1946
Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393
Blanchard, P. K. 2019, PhD thesis, Harvard University, http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029690
Blanchard, P. K., Berger, E., Nicholl, M., & Villar, V. A. 2020, ApJ, 897, 114
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2019, ApJ, 872, 90
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2017, ApJ, 843, 106
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2018, ApJ, 865, 9
Bloom, J. S., Kulkarni, S. R., & Djorgovski, S. G. 2002, AJ, 123, 1111
Boone, K. 2019, AJ, 158, 257
Breiman, L. 2001, Machine Learning, 45, 5
Chambers, K., & Pan-STARRS Team. 2018, in American Astronomical Society Meeting Abstracts, Vol. 231
Chen, T. W., Inserra, C., Fraser, M., et al. 2018, ApJL, 867, L31
Chomiuk, L., Chornock, R., Soderberg, A. M., et al. 2011, ApJ, 743, 114
Cooke, J., Sullivan, M., Gal-Yam, A., et al. 2012, Nature, 491, 228
Dahiwale, A., & Fremling, C. 2020, Transient Name Server Classification Report, 2020-1756, 1
De Cia, A., Gal-Yam, A., Rubin, A., et al. 2018, ApJ, 860, 100
Dessart, L., Hillier, D. J., Waldman, R., Livne, E., & Blondin, S. 2012, MNRAS, 426, L76
Drake, A. J., Djorgovski, S. G., Mahabal, A., et al. 2009, ApJ, 696, 870
Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306
Fraser, M., Reynolds, T., Mattila, S., & Yaron, O. 2016, Transient Name Server Classification Report, 2016-521, 1
Fremling, C., & Dahiwale, A. 2019, Transient Name Server Classification Report, 2019-1774, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018a, Transient Name Server Classification Report, 2018-1411, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018b, Transient Name Server Classification Report, 2018-1416, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018c, Transient Name Server Classification Report, 2018-1877, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019a, Transient Name Server Classification Report, 2019-32, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019b, Transient Name Server Classification Report, 2019-188, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019c, Transient Name Server Classification Report, 2019-598, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019d, Transient Name Server Classification Report, 2019-636, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019e, Transient Name Server Classification Report, 2019-952, 1
Fremling, C., Miller, A. A., Sharma, Y., et al. 2020, ApJ, 895, 32
Gagliano, A., Narayan, G., Engel, A., & Carrasco Kind, M. 2020, arXiv e-prints, arXiv:2008.09630
Gomez, S., Berger, E., Blanchard, P. K., et al. 2020, FLEET: Finding Luminous and Exotic Extragalactic Transients, 1.0.0, Zenodo, doi:10.5281/zenodo.4013965
Gomez, S., Berger, E., Nicholl, M., et al. 2019, ApJ, 881, 87
Goodman, J., & Weare, J. 2010, Communications in Applied Mathematics and Computational Science, 5, 65
Graham, M. L., Connolly, A. J., Ivezić, Ž., et al. 2018, AJ, 155, 1
Guillochon, J., Parrent, J., Kelley, L. Z., & Margutti, R. 2017, ApJ, 835, 64
Hosseinzadeh, G., Dauphin, F., Villar, V. A., et al. 2020, arXiv e-prints, arXiv:2008.04912
Howell, D. A., Kasen, D., Lidman, C., et al. 2013, ApJ, 779, 98
Hudelot, P., Cuillandre, J. C., Withington, K., et al. 2012, VizieR Online Data Catalog
Hunter, J. D. 2007, CSE, 9, 90
Inserra, C., Smartt, S. J., Jerkstrand, A., et al. 2013, ApJ, 770, 128
Inserra, C., Nicholl, M., Chen, T. W., et al. 2017, MNRAS, 468, 4642
Kasen, D., & Bildsten, L. 2010, ApJ, 717, 245
Kasliwal, M., & Cao, Y. 2019, Transient Name Server Discovery Report, 2019-259, 1
Kessler, R., Narayan, G., Avelino, A., et al. 2019, PASP, 131, 094501
Leloudas, G., Chatzopoulos, E., Dilday, B., et al. 2012, A&A, 541, A129
Lin, W. L., Wang, X. F., Li, W. X., et al. 2020, arXiv e-prints, arXiv:2006.16443
Liu, L.-D., Wang, L.-J., Wang, S.-Q., & Dai, Z.-G. 2018, ApJ, 856, 59
Lunnan, R., Chornock, R., Berger, E., et al. 2013, ApJ, 771, 97
Lunnan, R., Chornock, R., Berger, E., et al. 2014, ApJ, 787, 138
Lunnan, R., Fransson, C., Vreeswijk, P. M., et al. 2018a, Nature Astronomy, 2, 887
Lunnan, R., Chornock, R., Berger, E., et al. 2018b, ApJ, 852, 81
Lunnan, R., Yan, L., Perley, D. A., et al. 2019, arXiv e-prints, arXiv:1910.02968
Lyman, J., Homan, D., Magee, M., & Yaron, O. 2017, Transient Name Server Classification Report, 2017-881, 1
Mazzali, P. A., Sullivan, M., Pian, E., Greiner, J., & Kann, D. A. 2016, MNRAS, 458, 3455
McCrum, M., Smartt, S. J., Rest, A., et al. 2015, MNRAS, 448, 1206
Metzger, B. D., Margalit, B., Kasen, D., & Quataert, E. 2015, MNRAS, 454, 3311
Muthukrishna, D., Narayan, G., Mandel, K. S., Biswas, R., & Hložek, R. 2019, PASP, 131, 118002
Nicholl, M., Berger, E., Blanchard, P. K., Gomez, S., & Chornock, R. 2019, ApJ, 871, 102
Nicholl, M., Berger, E., Margutti, R., et al. 2017a, ApJL, 845, L8
Nicholl, M., Berger, E., Margutti, R., et al. 2017b, ApJL, 835, L8
Nicholl, M., Guillochon, J., & Berger, E. 2017c, ApJ, 850, 55
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2013, Nature, 502, 346
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2014, MNRAS, 444, 2096
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2015, ApJL, 807, L18
Nicholl, M., Berger, E., Margutti, R., et al. 2016, ApJL, 828, L18
Nicholl, M., Blanchard, P. K., Berger, E., et al. 2018, ApJL, 866, L24
Nicholl, M., Blanchard, P. K., Berger, E., et al. 2020, Nature Astronomy, arXiv:2004.05840
Papadopoulos, A., D'Andrea, C. B., Sullivan, M., et al. 2015, MNRAS, 449, 1215
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2012, arXiv e-prints, arXiv:1201.0490
Perley, D., Yan, L., Andreoni, I., et al. 2019a, Transient Name Server Classification Report, 2019-1712, 1
Perley, D., Yan, L., Lunnan, R., et al. 2019b, Transient Name Server Classification Report, 2019-2829, 1
Perley, D. A., Yan, L., Gal-Yam, A., et al. 2019c, Transient Name Server AstroNote, 79, 1
Perley, D. A., Quimby, R. M., Yan, L., et al. 2016, ApJ, 830, 13
Prajs, S., Sullivan, M., Smith, M., et al. 2017, MNRAS, 464, 3568
Prentice, S. J., Maguire, K., Skillen, K., Magee, M. R., & Clark, P. 2019, Transient Name Server Classification Report, 2019-2339, 1
Quimby, R. M., Aldering, G., Wheeler, J. C., et al. 2007, ApJL, 668, L99
Quimby, R. M., Kulkarni, S. R., Kasliwal, M. M., et al. 2011, Nature, 474, 487
Quimby, R. M., De Cia, A., Gal-Yam, A., et al. 2018, ApJ, 855, 2
Roy, R., Sollerman, J., Silverman, J. M., et al. 2016, A&A, 596, A67
Sánchez-Sáez, P., Reyes, I., Valenzuela, C., et al. 2020, arXiv e-prints, arXiv:2008.03311
Schlafly, E. F., & Finkbeiner, D. P. 2011, ApJ, 737, 103
Schulze, S., Krühler, T., Leloudas, G., et al. 2018, MNRAS, 473, 1258
Short, P., Nicholl, M., Muller, T., Angus, C., & Yaron, O. 2019, Transient Name Server Classification Report, 2019-772, 1
Tachibana, Y., & Miller, A. A. 2018, PASP, 130, 128001
van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, CSE, 13, 22
van Velzen, S., Gezari, S., Hammerstein, E., et al. 2020, arXiv e-prints, arXiv:2001.01409
Villar, V. A., Nicholl, M., & Berger, E. 2018, ApJ, 869, 166
Villar, V. A., Berger, E., Miller, G., et al. 2019, ApJ, 884, 83
Villar, V. A., Hosseinzadeh, G., Berger, E., et al. 2020, arXiv e-prints, arXiv:2008.04921
Vreeswijk, P. M., Savaglio, S., Gal-Yam, A., et al. 2014, ApJ, 797, 24
Vreeswijk, P. M., Leloudas, G., Gal-Yam, A., et al. 2017, ApJ, 835, 58
Whitesides, L., Lunnan, R., Kasliwal, M. M., et al. 2017, ApJ, 851, 107
Yan, L., Chen, Z., Perley, D., et al. 2019a, Transient Name Server Classification Report, 2019-2041, 1
Yan, L., Perley, D., Lunnan, R., et al. 2019b, Transient Name Server AstroNote, 45, 1
Yan, L., Quimby, R., Ofek, E., et al. 2015, ApJ, 814, 108
Yan, L., Lunnan, R., Perley, D. A., et al. 2017, ApJ, 848, 6
Yan, L., Perley, D., Schulze, S., et al. 2020, arXiv e-prints, arXiv:2006.13758
Young, D. 2016, Transient Name Server Classification Report, 2016-68, 1
APPENDIX

We show in Table A.1 the sample of all the SLSN-I used for this classifier, sorted by redshift.
Table A.1. Type-I SLSNe

Name Redshift Reference | Name Redshift Reference | Name Redshift Reference
SN2017egm 0.0307 49 | PS15cjz 0.2200 2 | DES17C3gyp 0.4700 2
PTF11hrq 0.0571 21 | SN2016wi 0.2240 27 | DES14C1rhg 0.4810 2
SN2018hti 0.0600 62 | SN2018gft 0.2300 56 | SN2019itq 0.4810 this work
SN2019unb 0.0635 1 | SN2010gx 0.2301 30 | SN2016aj 0.4850 60
SN2018bgv 0.0795 23 | SN2018ffj 0.2340 this work | SN2019kwq 0.5000 53
SN2012aa 0.0830 48 | SN2018gkz 0.2400 29 | PTF09atu 0.5015 10
SN2019hge 0.0866 61 | SN2011kf 0.2450 28 | PS1-14bj 0.5125 4
SN2017gci 0.0900 47 | iPTF16bad 0.2467 27 | SN2019otl 0.5140 this work
SN2010md 0.0987 10 | SN2019enz 0.2550 26† | PS1-12bqf 0.5220 4
SN2016eay 0.1013 46 | LSQ12dlf 0.2550 25 | PS1-11ap 0.5240 4
PTF12hni 0.1056 30 | LSQ14mo 0.2560 24 | DES16C3dmp 0.5620 2
PTF12dam 0.1070 41 | PTF09cnd 0.2584 10 | DES15S1nog 0.5650 2
SN2019neq 0.1075 45 | SN2019dlr 0.2600 53 | SN2019sgg 0.5726 54
SN2018kyt 0.1080 44 | SN2019hno 0.2600 53 | SN2019kwu 0.6000 53
SN2017ens 0.1086 43 | SN2018fd 0.2630 this work | DES14X3taz 0.6080 2
SN2015bn 0.1136 42 | SN2013dg 0.2650 25 | PS1-10bzj 0.6500 4
PTF10nmn 0.1237 10,21 | SN2018lfd 0.2700 55 | SN2013hy 0.6630 9,2
SN2007bi 0.1279 41 | iPTF13bjz 0.2712 30 | SN2019fiy 0.6700 53
SN2017dwh 0.1300 40 | SN2018bym 0.2740 23 | PS1-12zn 0.6740 52
SN2018avk 0.1320 23 | SN2011ep 0.2800 16 | DES17X1blv 0.6900 2
SN2020exj 0.1330 59 | SN2005ap 0.2832 22 | DES16C3cv 0.7270 2
SN2019lsq 0.1400 39 | PTF10uhf 0.2879 21 | PS1-11bdn 0.7380 4
SN2018ffs 0.1420 this work | SN2016inl 0.2980 this work | iPTF13ajg 0.7403 8
SN2011ke 0.1429 21 | MLS121104 0.3030 52 | SNLS07D3bs 0.7570 51
SN2019bgu 0.1480 58 | SN2019eot 0.3057 20 | DES15X3hm 0.8600 2
SN2019cdt 0.1530 38 | SN2017beq 0.3100 19 | DES14X2byo 0.8680 2
LSQ14an 0.1630 37 | PS1-12cil 0.3200 4 | PS1-13gt 0.8840 4
SN2019ujb 0.1647 this work | SN2019cwu 0.3200 53 | PS1-10awh 0.9084 7
SN2019obk 0.1656 61 | PTF12mxx 0.3296 10 | DES17X1amf 0.9200 2
SN2018ibb 0.1660 57 | iPTF13ehe 0.3434 18 | DES16C3ggu 0.9490 2
SN2019pvs 0.1670 this work | SN2019sgh 0.3440 this work | PS1-10ky 0.9558 7
PTF10bfz 0.1701 10 | LSQ14bdq 0.3450 17 | PS1-11aib 0.9970 4
SN2012il 0.1750 28 | SN2018lfe 0.3500 63 | DES16C2aix 1.0680 2
PTF12gty 0.1768 21 | SN2019kwt 0.3562 53 | PS1-10ahf 1.1000 4
CSS160710 0.1800 36 | PTF10bjp 0.3584 10 | DES15X1noe 1.1880 2
Table A.1 (continued)
Name Redshift Reference | Name Redshift Reference | Name Redshift Reference
SN2019gfm 0.1816 35 | LSQ14fxj 0.3600 16 | SCP06F6 1.1890 6
SN2009cb 0.1864 21 | SN2019zbv 0.3700 this work | PS1-10pm 1.2060 5
SN2009jh 0.1867 10,21 | SN2006oz 0.3760 15 | PS1-11tt 1.2830 4
iPTF16asu 0.1870 34 | SN2019zeu 0.3900 this work | DES14C1fi 1.3020 2
SN2019nhs 0.1900 33 | DES15C3hav 0.3920 2 | PS1-11afv 1.4070 4
SN2018cxa 0.1900 this work | iPTF13cjq 0.3962 30 | SNLS07d2bv 1.5000 3
SN2010hy 0.1901 10,30 | SN2019kcy 0.4000 53 | DES14S2qri 1.5000 2
SN2011kg 0.1924 10 | iPTF13bdl 0.4030 30 | PS1-13or 1.5200 4
SN2019kws 0.1977 53,61 | SN2019cca 0.4103 14 | PS1-11bam 1.5650 4
SN2019xaq 0.2000 this work | iPTF16eh 0.4270 13 | PS1-12bmy 1.5720 4
SN2016ard 0.2025 32 | CSS130912 0.4305 11,12 | SNLS06d4eu 1.5881 3
PTF10aagc 0.2060 10 | PTF10vqv 0.4518 10 | DES16C2nm 1.9980 2
SN2016els 0.2170 31 | CSS140925 0.4600 16 | SN2213 2.0500 50
Note — All the SLSN-I used to train our classifier. Note that there are more SLSNe candidates in the literature, but we keep only the unambiguous ones to avoid polluting the sample. References: 1: Prentice et al. (2019); 2: Angus et al. (2019); 3: Howell et al. (2013); 4: Lunnan et al. (2018b); 5: McCrum et al. (2015); 6: Quimby et al. (2011); 7: Chomiuk et al. (2011); 8: Vreeswijk et al. (2014); 9: Papadopoulos et al. (2015); 10: Perley et al. (2016); 11: Vreeswijk et al. (2017); 12: Liu et al. (2018); 13: Lunnan et al. (2018a); 14: Perley et al. (2019b); 15: Leloudas et al. (2012); 16: Schulze et al. (2018); 17: Nicholl et al. (2015); 18: Yan et al. (2015); 19: Kasliwal & Cao (2019); 20: Fremling et al. (2019e); 21: Quimby et al. (2018); 22: Quimby et al. (2007); 23: Lunnan et al. (2019); 24: Chen et al. (2017); 25: Nicholl et al. (2014); 26: Short et al. (2019); 27: Yan et al. (2017); 28: Inserra et al. (2013); 29: Fremling et al. (2018b); 30: De Cia et al. (2018); 31: Fraser et al. (2016); 32: Blanchard et al. (2018); 33: Perley et al. (2019a); 34: Whitesides et al. (2017); 35: Chen (2019); 36: Drake et al. (2009); 37: Inserra et al. (2017); 38: Fremling et al. (2019d); 39: Fremling & Dahiwale (2019); 40: Blanchard et al. (2019); 41: Nicholl et al. (2013); 42: Nicholl et al. (2016); 43: Chen et al. (2018); 44: Fremling et al. (2019b); 45: Perley et al. (2019c); 46: Nicholl et al. (2017b); 47: Lyman et al. (2017); 48: Roy et al. (2016); 49: Nicholl et al. (2017a); 50: Cooke et al. (2012); 51: Prajs et al. (2017); 52: Lunnan et al. (2014); 53: Yan et al. (2019b); 54: Yan et al. (2019a); 55: Fremling et al. (2019a); 56: Fremling et al. (2018a); 57: Fremling et al. (2018c); 58: Fremling et al. (2019c); 59: Dahiwale & Fremling (2020); 60: Young (2016); 61: Yan et al. (2020); 62: Lin et al. (2020); 63: Yin et al., in prep.

† We find that a redshift of z = 0.255 is a better match to the SN spectral features than the previously reported value.