FLEET: A Redshift-Agnostic Machine Learning Pipeline to Rapidly Identify Hydrogen-Poor Superluminous Supernovae
Sebastian Gomez, Edo Berger, Peter K. Blanchard, Griffin Hosseinzadeh, Matt Nicholl, V. Ashley Villar, Yao Yin
Draft version September 7, 2020
Typeset using LaTeX twocolumn style in AASTeX62
Center for Astrophysics | Harvard & Smithsonian, 60 Garden Street, Cambridge, MA 02138-1516, USA
Center for Interdisciplinary Exploration and Research in Astrophysics and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208-3112, USA
Birmingham Institute for Gravitational Wave Astronomy and School of Physics and Astronomy, University of Birmingham, Birmingham B15 2TT, UK
Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill EH9 3HJ, UK
ABSTRACT

Over the past decade wide-field optical time-domain surveys have increased the discovery rate of transients to the point that ≲10% are being spectroscopically classified. Despite this, these surveys have enabled the discovery of new and rare types of transients, most notably the class of hydrogen-poor superluminous supernovae (SLSN-I), with about 150 events confirmed to date. Here we present a machine-learning classification algorithm targeted at rapid identification of a pure sample of SLSN-I to enable spectroscopic and multi-wavelength follow-up. This algorithm is part of the FLEET (Finding Luminous and Exotic Extragalactic Transients) observational strategy. It utilizes both light curve and contextual information, but without the need for a redshift, to assign each newly-discovered transient a probability of being a SLSN-I. This classifier can achieve a maximum purity of about 85% (with 20% completeness) when observing a selection of SLSN-I candidates. Additionally, we present two alternative classifiers that use either redshifts or complete light curves and can achieve an even higher purity and completeness. At the current discovery rate, the FLEET algorithm can provide about 20 SLSN-I candidates per year for spectroscopic follow-up with 85% purity; with the Legacy Survey of Space and Time we anticipate this number will rise substantially.

Keywords: supernovae: general – methods: statistical – surveys

1. INTRODUCTION

Type I Superluminous Supernovae (hereafter, SLSN-I) are a class of astrophysical transients that exceed the luminosity of normal SNe by up to two orders of magnitude. They were originally classified based on their luminosity, since most have typical peak absolute magnitudes of ≲ −21 (Chomiuk et al. 2011; Quimby et al. 2011). However, events with spectroscopic signatures that match those of SLSN-I have been discovered at lower luminosities (e.g., Lunnan et al. 2013), and they are now classified based on their hydrogen-free spectra, strong O II absorption lines at early times, and a blue continuum (Angus et al. 2019).

Corresponding author: Sebastian Gomez ([email protected])

At present, about 150
SLSN-I have been spectroscopically classified; see Table A.1 for a listing and references.

While the energy source of SLSN-I was intensely debated for a few years following their discovery, it now appears that radioactive decay of ⁵⁶Ni (as in normal Type I SNe) and circumstellar interaction (as in Type IIn SNe) cannot explain the bulk of the population. Instead, the most likely energy source appears to be the spin-down of a millisecond magnetar produced in the explosion (Kasen & Bildsten 2010; Metzger et al. 2015). This model can explain the diverse light curve behavior (Nicholl et al. 2017c), the early-time UV spectra (Mazzali et al. 2016), the late-time light curve flattening (Blanchard et al. 2018; Nicholl et al. 2018), and the nebular spectra (Dessart et al. 2012; Nicholl et al. 2019) of SLSN-I. Still, the nature of SLSN-I progenitors, their environments, and their relation to those of other stripped-envelope explosions remain areas of active investigation (e.g., Blanchard et al. 2020). Similarly, the ubiquity and origin of unusual light curve and spectroscopic features seen in some SLSN-I, such as late-time "bumps" (Nicholl et al. 2016; Inserra et al. 2017; Blanchard et al. 2018; Lunnan et al. 2019), double-peaked light curves (Nicholl et al. 2015), or potential helium lines (Yan et al. 2020), remain unclear.

Making progress on these open questions requires a substantial increase in the identification rate of SLSN-I, preferably at early times to enable spectroscopic follow-up. A significant challenge is that SLSN-I are intrinsically rare: at a volumetric rate of ∼90 SNe yr⁻¹ Gpc⁻³ at a weighted redshift of z = 1.13, they represent ≲0.5% of the detection rate in magnitude-limited surveys (Villar et al. 2019; Fremling et al. 2020). Currently, only ∼10% of all optical transients are classified spectroscopically, and with the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory, this fraction will decline even further. Several machine-learning classifiers (e.g., RAPID: Muthukrishna et al. 2019;
Avocado: Boone 2019) have been trained on synthetic data, such as the Photometric LSST Astronomical Time-series Classification project (PLAsTiCC; Kessler et al. 2019), but their performance with real data remains untested. Other classifiers such as SuperRAENN (Villar et al. 2020) or Superphot (Villar et al. 2019; Hosseinzadeh et al. 2020) have been trained on real survey data from the Pan-STARRS1 Medium Deep Survey (PS1/MDS). Overall, these classifiers have a fairly high success rate and recover ∼80% of SLSN-I, but only when using redshift information and fairly complete light curves. Additionally, the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker, which is currently providing real-time classifications for transients from ZTF (Sánchez-Sáez et al. 2020), is able to recover up to 100% of the SLSN-I in their training sample, but with a large standard deviation. FLEET is provided as a Python package on GitHub (https://github.com/gmzsebastian/FLEET) and Zenodo (Gomez et al. 2020), as well as included in the Python Package Index under the name fleet-pipe.

2. GUIDING PRINCIPLES

As discussed above, there are several efforts aimed at ML classification of astronomical transients, mainly based on light curve information from wide-field surveys. By design, some classifiers make choices that tend to optimize their overall classification success rate across a range of astronomical transients (e.g., Boone 2019; Muthukrishna et al. 2019; Gagliano et al. 2020; Hosseinzadeh et al. 2020; Villar et al. 2020). Here, we take a distinct approach by focusing on optimized classification of a single class of transients. Our algorithm is based on the following guiding principles:

1. Classifying only SLSN-I, with no regard for the classification success of other transients.
2. Obtaining the purest possible sample of SLSN-I, at the expense of sample completeness.
3. Prioritizing speed and computational resources over model complexity to allow for rapid classification.
4. Finding SLSN-I at early times to enable real-time follow-up.

This approach enables us to make efficient use of large-aperture telescopes for spectroscopic classification, as well as perform later follow-up studies. The sheer number of transients expected from LSST motivates our emphasis on computational speed, as well as purity at the expense of completeness. In particular, even if we manage to identify less than half of the SLSN-I in the data stream, but with a high success rate, then we can double the existing sample of SLSN-I by the time LSST commences.

We provide a main rapid version of the classifier in addition to two alternative classifiers with somewhat different motivations. First, a full light curve classifier that can more confidently classify SLSN-I, but at the expense of early discovery, mainly aimed at constructing large samples with only photometric data.
And second, a classifier that uses redshift information for higher purity classification, mainly in anticipation of robust photometric redshifts that will be provided by LSST.

3. TRAINING SET

To train our classifier we obtained all spectroscopically classified transients from the TNS: SNe, tidal disruption events (TDEs), active galactic nuclei (AGN) flares, and Galactic transients (e.g., cataclysmic variables and variable stars). In addition to those, we included the TDEs published in van Velzen et al. (2020), which are not yet reported to the TNS, and every unambiguous SLSN-I from the literature; see Table A.1. We also obtained all of the available photometry for each transient, from the Open Supernova Catalog (OSC; Guillochon et al. 2017) or the Zwicky Transient Facility (ZTF; Bellm et al. 2019). We require each transient to have at least 2 g-band and 2 r-band measurements to model their light curves. We restrict the list to transients within the footprint of the Pan-STARRS1 3π (PS1/3π) survey (Chambers & Pan-STARRS Team 2018) for the purpose of identifying host galaxies. Finally, we removed from the training set 44 transients with ambiguous host galaxy identifications or spurious data in order to have the cleanest data set possible; however, we kept these events in our test set to prevent any resulting biases. The resulting sample is composed of 1,813 transients, with the following distinct labels from the TNS: 800 SN Ia, 381 SN II, 156 SLSN-I, 95 CV, 71 SN IIn, 63 SN IIP, 59 SN Ic, 43 SLSN-II, 37 SN Ib, 33 SN IIb, 19 TDE, 16 SN Ic-BL, 13 SN Ibc, 12 AGN, 8 SN Ibn, and 7 Varstar (variable stars).

Table 1.
Observational Rates of Transients

Transient   Fremling        TNS             Target f
SN I        587 (77.1%)     6500 (70.8%)    73.9%
SN II       155 (20.4%)     2109 (23.0%)    19.6%
SLSN-I      12 (1.6%)       123 (1.3%)      1.5%
SLSN-II     7 (0.9%)        45 (0.5%)       0.9%
Nuclear     —               58 (0.6%)       0.6%
Star        —               340 (3.7%)      3.5%
Note — Observational rates for the relevant types of transients considered here. We normalize the rate of events in our test set to an expected Target rate f calculated from the Fremling et al. (2020) sample and the TNS sample, used for Equation 1.

Since the number of events per class varies substantially, making the training set unbalanced, the classification would be biased towards the more common classes. To mitigate this bias we over-sample each class to a total of 800 events, using the Synthetic Minority Over-sampling Technique (SMOTE; Chawla et al. 2002). This algorithm draws random samples along vectors joining every pair of objects in feature space until all classes have the same number of events. We tested an alternative multivariate-Gaussian (MVG) oversampling technique, as implemented in Villar et al. (2019), but find that when sampling features that are close to zero and constrained to be positive (e.g., redshift), SMOTE performs significantly better, even when imposing a positivity constraint on the MVG samples.
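The interpolation at the heart of SMOTE is simple to sketch. Below is a minimal NumPy stand-in (in practice a full implementation such as imbalanced-learn would be used; the 16-event class and the two features are purely illustrative) that draws synthetic events at random positions along the vectors joining each object to one of its nearest neighbors:

```python
import numpy as np

def smote_oversample(X, n_target, k=5, seed=None):
    """Minimal SMOTE-style oversampling of one minority class.

    New events are placed at random positions along the vector joining
    a randomly chosen real event to one of its k nearest neighbors,
    until the class contains n_target events.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    synthetic = []
    for _ in range(n_target - len(X)):
        i = rng.integers(len(X))
        d = np.linalg.norm(X - X[i], axis=1)    # distances to all events
        j = rng.choice(np.argsort(d)[1:k + 1])  # one of the k nearest
        lam = rng.random()                      # position along the vector
        synthetic.append(X[i] + lam * (X[j] - X[i]))
    return np.vstack([X, np.array(synthetic)])

# Oversample an illustrative 16-event minority class up to 800 events
rng = np.random.default_rng(42)
X_minority = rng.normal(loc=[50.0, 0.5], scale=[5.0, 0.1], size=(16, 2))
X_balanced = smote_oversample(X_minority, n_target=800, seed=0)
print(X_balanced.shape)  # (800, 2)
```

Because every synthetic event is a convex combination of two real events, features constrained to be positive (e.g., redshift) remain positive, which is the behavior that favors SMOTE over the MVG alternative.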
3.1. Test Set

We test the efficacy of our classifier on all of the events from the training set. In addition, we include in the test set the 44 transients that were removed in §3, to prevent any possible biases. We implement a leave-one-out cross-validation method, allowing us to train the classifier on every event except for one, and then predict the classification of that one event, cycling through all events. This allows us to robustly test our classifier without having to divide the data set into a training and test set, which would compromise the sample size.

We define completeness, classifier purity, and observed purity as useful metrics to test the efficacy of our algorithm:

    Completeness = SN_T / N_SLSN
    Classifier Purity = SN_T / (SN_T + SN_F)
    Observed Purity = SN_T / (SN_T + Σ_i η_i SN_F,i),
    η_i = (N_SLSN × f_i) / (N_i × f_SLSN),        (1)

where N_SLSN is the total number of SLSN-I in the test set, SN_T is the total number of true positive SLSN-I recovered, and SN_F is the total number of false positive SLSN-I. The relative fractions of each transient class in our test set, which we obtained directly from the TNS, do not reflect the true fractions of these transients in a magnitude-limited survey. To determine a purity that is representative of on-going and future surveys, we re-normalize the classifier purity into an observed purity, which more accurately represents the outcome of our pipeline in a real survey. Here, SN_F,i is the false positive rate for an individual transient class i, and f_i is the corresponding true observational rate for that class, listed in Table 1. We use the observational rates of SNe from Fremling et al. (2020) to estimate the expected Target rate, f, for any magnitude-limited survey. We then include nuclear transients (TDEs + AGN) and Galactic transients (CVs + variable stars) from the TNS, normalizing by the total number of classified transients from the TNS to the total number of SNe in the Fremling et al. (2020) sample.

Given that SLSN-I are over-represented in our test set compared to the rate they would have in a magnitude-limited survey, the observed purity will be lower than the classifier purity. For example, our test set has 800 SN Ia and 156 SLSN-I, or 0.20 SLSN-I for each SN Ia. But in a magnitude-limited survey, there is typically only 0.02 SLSN-I for each SN Ia. Therefore, if we wanted to predict how many SLSN-I we would be able to find in a real survey, we need to normalize the classifier purity, in this example by multiplying the false positive rate by a factor of ∼10.
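The metrics of Equation 1 translate directly into code. A short sketch (the function names are ours, not part of the FLEET package), reproducing the worked example above of 800 SN Ia versus 156 SLSN-I with the Table 1 rates:

```python
def eta(n_slsn, f_slsn, n_class, f_class):
    """Re-weighting factor eta_i of Eq. 1: scales the test-set class
    balance to the observational rates f of a magnitude-limited survey."""
    return (n_slsn * f_class) / (n_class * f_slsn)

def completeness(sn_true, n_slsn):
    return sn_true / n_slsn

def classifier_purity(sn_true, sn_false):
    return sn_true / (sn_true + sn_false)

def observed_purity(sn_true, false_pos, n_slsn, f_slsn, counts, rates):
    """false_pos, counts, rates are dicts keyed by contaminant class."""
    weighted = sum(eta(n_slsn, f_slsn, counts[c], rates[c]) * false_pos[c]
                   for c in false_pos)
    return sn_true / (sn_true + weighted)

# 156 SLSN-I (f = 1.5%) vs 800 SN Ia (f = 73.9%) in the test set:
print(round(eta(156, 0.015, 800, 0.739), 1))  # 9.6, i.e. the factor of ~10
```

A single SN Ia false positive therefore counts nearly ten times more heavily in the observed purity than in the classifier purity.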
Figure 1.
Galaxies (green) and stars (red) classified by the CFHTLS survey (D1 field), plotted in terms of the difference between their PSF and Kron magnitudes as a function of apparent i-band magnitude in PS1/3π. Using this calibration, we assign a probability of being a galaxy to all objects in the field of a transient based on their location in this diagram. The top panel shows the percent of objects for which our classification matches that of the CFHTLS as a function of apparent magnitude; a 90% match occurs at a magnitude of 22.5.

4. CONTEXTUAL INFORMATION

SLSN-I are known to prefer low-luminosity galaxies (Lunnan et al. 2014), and it is therefore advantageous to use contextual information in their classification. Here we describe our method of assigning a host galaxy to each transient. We obtain the PS1/3π grizy (Chambers & Pan-STARRS Team 2018) and SDSS ugriz (Alam et al. 2015; Ahumada et al. 2019) PSF and Kron magnitudes of every cataloged source in a 1′ radius region around the transient location. We use this information both to separate galaxies from stars, and to identify the most likely host galaxy.

4.1. Star-Galaxy Separation
The first step in identifying the host galaxy of each transient is to separate stars from galaxies. SDSS provides a classification for every object in its catalog, but since SDSS is shallower than PS1/3π and has a smaller footprint, this is not sufficient for our purposes. Instead, we develop a method to assign a probabilistic value (between 0 and 1) of how likely every object in SDSS and PS1/3π is to be a galaxy.

Figure 2. PS1/3π i-band image of a 1′ × 1′ field centered on the position of the SLSN-I SN 2013hy, indicating objects classified as galaxies (green) and stars (red) based on our star-galaxy separation algorithm (§4.1). The most likely host galaxy has P_cc ≈ 0.
03, as determined by the algorithm described in §4.2.

To train our star-galaxy separation algorithm we use data from the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS; Hudelot et al. 2012), which provides magnitudes and star-galaxy classifications down to magnitudes deeper than both SDSS and PS1/3π. We specifically use the D1 field (1 deg²) and cross-match with every overlapping object in SDSS and PS1/3π, yielding a large cross-matched training sample. Galaxies tend to have a larger difference between their PSF and Kron magnitudes than stars, so we use this specific feature (PSF − Kron) to separate them; see Figure 1 for an example in the i-band. The CFHTLS uses the CLASS_STAR classifier flag in SExtractor to separate stars from galaxies, which relies on a multi-layer feed-forward neural network (Bertin & Arnouts 1996).

In our galaxy-star separator we assign a probability of being a galaxy to any object in SDSS or PS1/3π by using a custom k-nearest-neighbors algorithm. Given an object's PSF and Kron magnitudes, we find the 20 nearest objects in the PSF versus PSF − Kron phase-space (Figure 1) and calculate its probability of being a galaxy as the fraction of those 20 neighbors from the CFHTLS training set that are galaxies. Experimenting with different numbers of neighbors, we find that at least 10 neighbors are required to produce robust estimates, with only marginal improvement in accuracy beyond 20 neighbors. For every object we calculate its probability of being a galaxy in every available filter, and adopt the average probability among all filters.

An alternative star-galaxy separator for objects in PS1/3π is presented in Tachibana & Miller (2018). Although it has a very high accuracy, it does not include objects from SDSS, for which we also require a classification when they are not in the PS1/3π catalog. We note that if we label objects with a probability of being a galaxy of P_G ≤
10% as stars, our classifier agrees with the classification from Tachibana & Miller (2018) at the 90% level. In Figure 2 we show an example of our star-galaxy separator applied to a field from PS1/3π centered on the location of the SLSN-I SN 2013hy.

We opt to only label objects with a galaxy probability of P_G <
10% as stars, to avoid missing a possible host galaxy identification. While this conservative cut retains more stars in the sample, these are rarely predicted to be the most likely host galaxy of a SN due to the small size of their PSF. We find that a stricter threshold results in a large number of host galaxies being rejected as stars. In the top panel of Figure 1 we show that, using the classification from the CFHTLS as a reference, our threshold for labeling stars yields a successful galaxy classification for essentially all objects brighter than ≈22 mag, and for ≈65% down to 23 mag.
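The custom k-nearest-neighbors separator of §4.1 amounts to a neighbor vote in the PSF versus PSF − Kron plane. A self-contained sketch follows; the training sample here is synthetic and purely illustrative (in the real pipeline the labels come from the CFHTLS, and the probability is averaged over all available filters):

```python
import numpy as np

def galaxy_probability(psf, kron, train_feats, train_is_galaxy, k=20):
    """Fraction of the k nearest training objects (in PSF vs PSF-Kron
    space) that are galaxies. train_feats is an (N, 2) array of
    [psf_mag, psf_minus_kron]; train_is_galaxy holds boolean labels."""
    point = np.array([psf, psf - kron])
    dist = np.linalg.norm(train_feats - point, axis=1)
    return float(train_is_galaxy[np.argsort(dist)[:k]].mean())

# Illustrative training set: stars have PSF ~ Kron; galaxies have PSF > Kron
rng = np.random.default_rng(1)
n = 500
stars = np.column_stack([rng.uniform(14, 23, n), rng.normal(0.0, 0.05, n)])
galaxies = np.column_stack([rng.uniform(14, 23, n), rng.normal(1.0, 0.3, n)])
feats = np.vstack([stars, galaxies])
labels = np.concatenate([np.zeros(n, bool), np.ones(n, bool)])

p_pointlike = galaxy_probability(18.0, 18.0, feats, labels)  # PSF = Kron
p_extended = galaxy_probability(18.0, 17.0, feats, labels)   # PSF - Kron = 1
print(p_pointlike, p_extended)
```

An object with PSF ≈ Kron lands among the stellar training points and receives a low galaxy probability, while an extended source (PSF − Kron ≈ 1) receives a high one.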
4.2. Host Identification
Once we have identified which objects in the field are likely to be galaxies, we can determine which galaxy is the most likely host for a given transient. First, we label transients as stellar if a star (i.e., an object with P_G < 0.1) lies within ∼1″ of the transient's position. Then, for the non-stellar transients we determine the probability of chance coincidence for each galaxy in the field relative to the transient's position. We follow the method of Bloom et al. (2002) and Berger (2010), using the measured number density of galaxies brighter than a magnitude m, Σ(≤m), to calculate the probability of chance coincidence:

    P_cc = 1 − e^(−π (d² + 4R²) Σ(≤m)),
    Σ(≤m) = 10^(0.33(m − 24) − 2.44) / (0.33 ln(10)),        (2)

where d is the angular separation between the center of a galaxy and the transient, and R is the half-light radius of the galaxy obtained from the SDSS catalog, or from the PS1/3π catalog if the object is not in the SDSS catalog. We consider the galaxy with the lowest value of P_cc to be the host, as long as P_cc ≤ 0.1. Otherwise, we designate the transient as "host-less," given the more likely situation that its host galaxy is fainter than the magnitude limit of SDSS and PS1/3π.
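Equation 2 is straightforward to evaluate; the example galaxy below (m = 22, R = 1″, at an offset of d = 1″) is illustrative:

```python
import math

def sigma_m(m):
    """Number density of galaxies brighter than magnitude m
    (arcsec^-2), following Eq. 2 (Bloom et al. 2002)."""
    return 10 ** (0.33 * (m - 24) - 2.44) / (0.33 * math.log(10))

def p_chance(d, r_half, m):
    """Probability of chance coincidence for a galaxy of magnitude m
    and half-light radius r_half at angular separation d (arcsec)."""
    return 1.0 - math.exp(-math.pi * (d ** 2 + 4 * r_half ** 2) * sigma_m(m))

p = p_chance(d=1.0, r_half=1.0, m=22.0)
print(f"{p:.3f}")  # 0.016 -> well below the P_cc <= 0.1 host threshold
```

A faint galaxy far from the transient instead yields P_cc near unity, and is rejected as a chance alignment.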
Table 2.
Feature Sets

Set   Features                        Peak Purity   Completeness   P(SLSN−I)
1     W + Δt + R_n + Δm               …             ≈16%           0.73
2     W + Δt + Δm                     …             ≈2%            0.81
3     W + Δt + R_n                    …             ≈20%           0.84
4     R_n + Δm                        …             ≈4%            0.89
5     W + R_n                         …             ≈16%           0.74
6     W + Δt + R_n + (g − r)          ≈91%          ≈17%           0.80
7     W + Δt + R_n + Δm + (g − r)     ≈82%          ≈2%            0.88
8     W + R_n + Δm + (g − r)          ≈94%          ≈3%            0.87
Note — Different sets of light curve and contextual features used to train our classifier. We list the highest classifier purity that each set of features achieves, as well as the corresponding completeness and classification probability P(SLSN−I) at that peak purity. W is the width of the light curve, R_n is the normalized host separation, Δm is the peak transient magnitude minus the host magnitude, Δt is the time of peak magnitude minus the time of discovery, and (g − r) is the light curve color at peak.

Figure 3.
Light curves of the SLSN-I SN 2011ke fit with the model described in Equation 3. The dashed lines show the fit using only data up to 20 days after detection (with a fixed value of A), while the solid lines use data up to 70 days after discovery (with A as a free parameter). The former is part of our main rapid classifier, while the latter is part of an alternative classifier that uses full light curves.

5. LIGHT CURVE MODEL

In addition to the contextual information, we use the light curves of each transient to predict which transients are most likely SLSN-I. We obtain photometric data from the OSC, as well as from ZTF using the Make Alerts Really Simple (MARS) broker (https://mars.lco.global/). We correct all the photometry for Galactic extinction using the Schlafly & Finkbeiner (2011) dust maps, assuming R_V = 3.1, and fit the light curves with the model

    m(t) = e^(W(t − φ)) − A × W(t − φ) + m₀,        (3)

where W is the effective width of the light curve, A modifies the decline time relative to the rise time, m₀ is the peak magnitude, and φ is a phase offset relative to the time of the first observation. An example of this function fit to a SLSN-I (SN 2011ke) is shown in Figure 3. We fit this model independently to the g- and r-band light curves using the emcee implementation (Foreman-Mackey et al. 2013) of the Goodman & Weare (2010) Markov chain Monte Carlo algorithm, and adopt the median of the posterior as the best estimate for each parameter. We use flat uninformative priors for all parameters, but initiate the walkers' positions at a value of m₀ equal to the brightest observed magnitude, and a value of φ that corresponds to the time of that measurement. We find that the model fits typically converge within ∼30 steps.

We use two versions of Equation 3 to test and evaluate the classifier. One version has a fixed value of A, used for our main rapid classifier; the adopted value of A has only a marginal effect on the results, since this version only uses data up to 20 days after detection, which do not encompass a decline phase. The second version uses data up to 70 days after discovery and has A as a free parameter to fit the light curve decline. This model is used for the full light curve classifier.

6. CLASSIFICATION ALGORITHM

To classify the transients we use the contextual and light curve information described in §§4 and 5, respectively, with an implementation of the random forest (RF) algorithm in the scikit-learn Python package (Pedregosa et al. 2012). In this manner, we assign to each transient a classification probability of being a SLSN-I. This algorithm takes various sub-samples of the training set and forms a number of decision tree classifiers to classify each object. The output classification probability is the result of averaging the output of all the trees in the forest. We run the classifier with 100 estimators to mitigate over-fitting and improve predictive accuracy. We also run each version of the model 25 times using different initial random seeds to estimate the classifier's uncertainties. We run the classifier using the Gini index as the criterion that minimizes the probability of misclassification. We optimize the depth of the trees in each RF by running a grid of models from a depth of 3 to 12 in steps of 1, and find that a depth of 7 performs best (depths of 6 and 8 performed similarly well, within the 1σ uncertainty derived from the different random seed iterations). Additionally, we optimize the grouping of transient classes into different sets.
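The RF configuration described above (100 estimators, Gini criterion, depth 7) can be sketched with scikit-learn; the three-class synthetic training set below is purely illustrative, standing in for the real features of Table 2 over all transient classes:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative stand-in features (e.g. width, delta_t, R_n, g-r) per class
rng = np.random.default_rng(0)
n = 300
X = np.vstack([
    rng.normal([60, 40, 0.5, -0.2], [10, 10, 0.3, 0.1], (n, 4)),  # "SLSN-I"
    rng.normal([20, 10, 1.5, 0.3], [5, 5, 0.5, 0.1], (n, 4)),     # "SN Ia"
    rng.normal([30, 15, 0.8, 0.5], [8, 5, 0.4, 0.1], (n, 4)),     # "SN II"
])
y = np.repeat(["SLSN-I", "SNIa", "SNII"], n)

clf = RandomForestClassifier(n_estimators=100, criterion="gini",
                             max_depth=7, random_state=0)
clf.fit(X, y)

# Tree-averaged class probabilities for one new event
proba = clf.predict_proba([[60, 40, 0.5, -0.2]])[0]
p_slsn = proba[list(clf.classes_).index("SLSN-I")]
p_not = proba.sum() - p_slsn  # probability of "not SLSN-I"
print(round(float(p_slsn + p_not), 6))  # 1.0
```

In the real pipeline this is repeated 25 times with different random seeds to estimate the uncertainty on P(SLSN−I).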
6.1. Feature Selection

Figure 4. Phase-spaces of features selected for the classifier, plotted for the various classes of transients. Top: The normalized host separation (R_n) versus the time difference between the light curve peak and the first detection (Δt). For host-less transients we set R_n = 0 (shown here at a small offset for clarity). Bottom: Light curve width in r-band, W_r, compared to the color of the transient during peak, (g − r).

Unlike newly-discovered transients, the transients in our training set have full light curves. Since a goal of FLEET is to find SLSN-I in real time, we test the algorithm using a varying cutoff time for the light curve data. Naturally, with more data the light curve models are better constrained, but this delays the identification and spectroscopic follow-up to a later phase when the SN is fainter. For our rapid classifier we find optimal results when using the first 20 days of data for each light curve, by which time most SLSN-I have not reached their peak luminosity.
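For reference, the light curve model of Equation 3 that underlies the width (W) and peak-time features can be written out directly. The parameter values below are illustrative; setting dm/dt = 0 shows that the peak (minimum magnitude) occurs at t = φ + ln(A)/W:

```python
import math

def model_mag(t, w, a, phi, m0):
    """Eq. 3: exponential rise and quasi-linear decline, in magnitudes."""
    return math.exp(w * (t - phi)) - a * w * (t - phi) + m0

w, a, phi, m0 = 0.1, 0.6, 10.0, 20.0  # illustrative parameter values
t_peak = phi + math.log(a) / w        # analytic peak time (before phi, A < 1)

# Confirm numerically on a grid of phases
grid = [phi - 60 + 0.1 * i for i in range(1200)]
t_min = min(grid, key=lambda t: model_mag(t, w, a, phi, m0))
print(round(t_peak, 2), round(t_min, 1))  # 4.89 4.9
```

In the rapid classifier A is held fixed and the model is fit (via emcee) only to the first 20 days of data, so the decline term is weakly constrained by design.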
Figure 5. Top: Correlated importance for the features used in the rapid version of our classifier. Bottom: Correlation matrix for the same features.
For the rapid classifier we have 6 light curve parameters (3 in each filter) that can be used as input features: the widths of the light curve, W, the phase offsets, φ, and the peak magnitudes, m₀. In addition to these we explore the use of two additional features: (i) Δt, the time difference from the first detection of a transient to its observed light curve peak in either g- or r-band, whichever one is brightest; and (ii) the g − r color at peak, using the model fits, where the time of peak is the one with the brightest observed magnitude in either g- or r-band.

For the contextual information features we test the use of several host galaxy parameters: the apparent magnitude of the host, m_h, its half-light radius in r-band, R, the projected angular separation between the transient and its host center, D, the projected angular separation normalized by the galaxy radius, R_n, and the difference between m₀ and m_h in r-band, Δm. For host-less transients we use the limiting magnitude of PS1/3π of r ≈ 23 mag for m_h, and set all other galaxy parameters to 0 (since those cannot be measured for a non-detected host).
Figure 6.
Fraction of SLSN-I correctly identified by the rapid version of our classifier amongst the top 20 objects predicted to be SLSN-I, as a function of days of light curve data used. The peak purity is about 90% when using ≳20 days of data. This purity is relevant for the training set, before normalizing to the observational rates in a magnitude-limited survey (§3.1). The dip at ≳70 days comes from a single CV with a 50-day-long outburst that was classified as a SLSN-I due to its long light curve and lack of a detected "host".
We tested several combinations of the available light curve and contextual features in order to determine which set yields the highest purity of SLSN-I while maintaining reasonable completeness; these are listed in Table 2. We find that the most relevant features for separating SLSN-I from other transients are W_g and W_r, Δt, R_n, Δm, and (g − r). In Figure 4 we show how the different classes of transients lie in feature-space. In Table 2 we list the highest purity, and associated uncertainty, achieved for each feature set, as well as the corresponding completeness and classification confidence, P(SLSN−I), at which this highest purity is achieved. We find that the rapid classifier performs best with the feature set comprising the light curve widths W in g- and r-band, the normalized host separation, R_n, the time of peak magnitude minus the time of discovery in either band, Δt, and the light curve color at peak, (g − r).

The importance of each feature used is not defined independently of the other features: if two features are correlated, then their relative importance might be affected. In the bottom panel of Figure 5 we show the correlation between features, and find that, with the exception of W_g and W_r, the features are mostly independent. In order to calculate the correlated importance we use the permutation importance method described in Breiman (2001). The correlated importance of each feature is shown in the top panel of Figure 5.

Figure 7. Cumulative distribution as a function of classification confidence (P) for transients classified as SLSN-I (red) and non-SLSN-I (blue). The crosses mark events that are misclassified. We find that for the SLSN-I sample, the misclassified events are mainly concentrated at P(SLSN−I) ≲ 0.6.

In Figure 6 we show how the rapid version of the classifier (trained on the first 20 days of light curve data) performs as a function of days of light curve data used, and include the contaminating classes of transients. When considering the top 20 transients with the highest predicted confidence P(SLSN−I), we find that the classifier performance rises for the first ∼20 days, and then plateaus at a peak classifier purity of about 90% (i.e., we correctly identify about 18 of the top 20 transients classified as SLSN-I). This purity is relevant for the training set, without normalizing for the observational rates described in §3.1.

6.2. Model Validation
We use three different methods to evaluate the performance of our classifier: a confusion matrix, a purity/completeness curve, and the fraction of SLSN-I recovered. Unless otherwise stated, the values reported in this section have been corrected for the observational rates expected in a magnitude-limited survey as described in §3.1, i.e., they represent the observed purity. Since we are not concerned with the classification of transients other than SLSN-I, we collapse the individual transient classifications into a binary SLSN-I versus non-SLSN-I classification. To calculate the probability of non-SLSN-I for each transient, we sum the probabilities of all other transient classes.

Figure 8. The observed purity and completeness for the best-performing sets of features described in Table 2. The purity curve represents the percent of transients that are SLSN-I as a function of the classifier confidence P(SLSN-I). The shaded region for the purity and the error bars for the completeness represent 1σ uncertainties.

Figure 9. Confusion matrix, indicating a purity of 80% for SLSN-I; based only on objects with a classification probability of P(SLSN-I) > 0.75 or P(not-SLSN-I) > 0.75, for a total of 1438 transients.

In Figure 7 we show how the rapid classifier performs at classifying SLSN-I and not misclassifying other objects, as a function of classification confidence level. We find that most of the misclassified SLSN-I are at P(SLSN−I) ≲ 0.
6, with only 4 misclassified SLSN-I at higher values of P(SLSN−I). The few objects that are true SLSN-I but were misclassified as something else with high confidence are usually SLSN-I with relatively bright host galaxies that were misclassified as Type II SNe, which have light curves that might also appear broad due to their late-time plateau.

The completeness and purity of the rapid classifier for the three top-performing feature sets are shown in Figure 8. As expected, the purity increases and the completeness declines as we restrict the sample to events with progressively higher values of classification confidence. For P(SLSN−I) > 0.
5, the observed purity increases toward a maximum of ≈85%, with a corresponding completeness of ≈20%. For a survey yielding ∼17,000 transients a year, assuming an observational rate of 1.5% for SLSN-I (Table 1), a 20% completeness corresponds to ∼
50 SLSN-I a year that could be discovered.

In Figure 9 we show the confusion matrix, namely, the label predicted by our classifier compared to the true label of the transient. We impose a confidence cut of P > 0.75 for either the SLSN-I or not-SLSN-I classes, corresponding to the peak classifier purity (Figure 8); this leads to a sample of 1438 events. We see that 14 out of the 18 transients predicted to be SLSN-I are correctly labeled, indicating a classifier purity of 80%.

We run an additional model validation to test for overfitting. Given the relatively small size of our data set, we split the entire data set into two independent sets, a training set (with 1209 objects) and a test set (with 604 objects), as opposed to a traditional training/test/validation split. We optimize the combination of transient class grouping, depth of the RF trees, and included features using a leave-one-out cross-validation method on the training set. We find that the best results (in terms of purity and completeness) are consistent with the main classifier presented in this section, with the exception that a depth of 5 is slightly preferred over a depth of 7 for the RF trees. We then test this classifier on the 604-object test set and find that it performs as expected, with a maximum classifier purity of 75% and a corresponding completeness of 15% for objects with p(SLSN-I) > 0.75.
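The binary collapse and the P > 0.75 confidence cut described above can be sketched with scikit-learn (listed in the paper's Software). The toy data, forest settings, and class encoding (class 0 standing in for SLSN-I) below are ours for illustration, not the trained FLEET model:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)

# Toy four-class problem: class 0 plays the role of SLSN-I.
y = rng.integers(0, 4, size=600)
X = rng.normal(size=(600, 4))
X[:, 0] += y                       # make the classes separable

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

clf = RandomForestClassifier(n_estimators=100, max_depth=7,
                             random_state=0).fit(X_train, y_train)

proba = clf.predict_proba(X_test)
p_slsn = proba[:, 0]               # P(SLSN-I)
p_not = proba[:, 1:].sum(axis=1)   # collapse all other classes

# Keep only high-confidence classifications, as in Figure 9.
confident = (p_slsn > 0.75) | (p_not > 0.75)
pred = (p_slsn > 0.5).astype(int)          # 1 = SLSN-I
truth = (y_test == 0).astype(int)
cm = confusion_matrix(truth[confident], pred[confident], labels=[1, 0])
print(cm)  # rows: true SLSN-I / not; columns: predicted SLSN-I / not
```

Summing `proba[:, 1:]` is the probability collapse described in the text; the confidence mask then restricts the confusion matrix to the events a follow-up program would actually act on.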
A full run of FLEET on a new transient takes only minutes on a personal computer, and about half that time to re-run on an existing transient once the required catalog data have been downloaded and stored locally. We note that since FLEET is designed to rapidly select the most promising SLSN-I candidates for follow-up, manual vetting of the top candidate events can further increase the sample purity. This is because some candidates might be due to obvious failure modes; for example, an AGN with a highly variable light curve might be classified as a SLSN-I due to its "broad" light curve, but manual inspection will reveal a variable nuclear source that is not SN-like. Another potential failure mode that can be mitigated with manual inspection is when SDSS and/or PS1/3π report large galaxies as multiple individual sources, causing the classifier to associate the transient with a small, dim source instead of the main galaxy.

To summarize, our rapid classifier, using basic light curve and contextual information (and no redshift information), can achieve a factor of 30−60 improvement over random selection for SLSN-I, with a completeness of ∼20%−30%.

7. ALTERNATIVE CLASSIFIERS

The rapid version of the FLEET classifier presented above is tailored to find a pure sample of SLSN-I before or near peak, so as to enable real-time follow-up. In this section we explore two alternative classifiers that utilize additional information: (i) using redshift as a feature, based on the expectation that LSST will provide photometric redshifts with ∼5% uncertainty for galaxies down to i ≈ 25 mag (Graham et al. 2018); and (ii) using more complete light curve information, including the decline phase, which may hinder spectroscopic classification, but will provide samples of SLSN-I for pure photometric population studies. We optimize these alternative classifiers in terms of feature selection, depth of the classifier's trees, and time span of the light curve used, in the same manner as for the main rapid classifier described in §6.

7.1. Redshift Classifier
A key advantage of our rapid classifier is that it does not rely on redshift information. However, with the advent of LSST it is expected that robust photometric redshifts will be available for galaxies down to i ≈ 25 mag. Since SLSN-I are generally more luminous than other SN classes, redshift information is certain to aid in the classification confidence. In Figure 10 we plot the peak absolute r-band magnitude as a function of redshift for all of the extragalactic transients in our training set, indicating how well SLSN-I can be separated when redshift information is available.

Figure 10. Peak absolute magnitude in r-band versus spectroscopic redshift for all the transients in our sample (excluding stars). As expected, SLSN-I separate well from other types of transients when the redshift is known.

To test this effect, we use here the known spectroscopic redshift of each transient in our training set (assigning Galactic transients a redshift of 0). As in the rapid classifier, we only use the first 20 days of data (designed to enable rapid follow-up) and optimize for RF depth and features. We find that the best-performing feature set yields an observed purity of about 60% and a completeness of about 60% at P(SLSN-I) > 0.5. At a fixed observed purity of 50%, the redshift classifier yields ≈
65 candidate SLSN-I, significantly higher than the ≈
27 candidate SLSN-I at 50% purity for the main classifier. Stated differently, the redshift classifier achieves 80% observed purity for the 27 top candidate SLSN-I, compared to the 50% observed purity for the main classifier. We therefore conclude that when robust redshift information is available it can significantly aid in the purity and completeness of the classifier.
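The separation in Figure 10 rests on converting an apparent peak magnitude to an absolute one at the known redshift. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3 (this excerpt does not state the adopted cosmology) and ignores K-corrections; it is an illustration of the conversion, not FLEET's code.

```python
import numpy as np

C_KMS, H0, OMEGA_M = 299792.458, 70.0, 0.3   # assumed cosmology

def luminosity_distance_mpc(z, n=4096):
    """Luminosity distance in a flat LambdaCDM cosmology, via
    trapezoidal integration of the inverse Hubble parameter."""
    zs = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(OMEGA_M * (1 + zs) ** 3 + (1 - OMEGA_M))
    d_c = (C_KMS / H0) * np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    return (1 + z) * d_c             # comoving -> luminosity distance

def peak_absolute_mag(m_app, z):
    """Absolute magnitude from apparent magnitude and redshift
    (distance modulus only; K-corrections ignored)."""
    mu = 5.0 * np.log10(luminosity_distance_mpc(z) * 1e6 / 10.0)
    return m_app - mu

# A transient peaking at m_r = 19.0 at z = 0.1 sits near M_r ≈ -19.3,
# normal-SN territory; a SLSN-I at that redshift would be brighter.
print(round(peak_absolute_mag(19.0, 0.1), 1))
```

Adding this absolute magnitude as a feature is what lets the redshift classifier exploit the luminosity gap between SLSN-I and other classes.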
7.2. Full Light Curve Classifier
The rapid classifier is trained on only the first 20 days of light curve data. Here we investigate the efficacy of using more complete light curves. This may inhibit the success of spectroscopic classification, since SLSN-I are on average about 2 mag fainter 70 days after discovery than 20 days after discovery. But using light curves well beyond peak allows for a more robust classification and can aid in the construction of more complete photometric SLSN-I samples once the events fade away. For this full light curve classifier we measure the decline rate by fitting for A in Equation 3. After optimizing the classifier we find that the feature set including W and A (Table 2), and a depth of 9 for the RF trees, provide the best results. We similarly find that using the first 70 days of light curve data provides optimal results; later-time data tend to be of lower quality and are more strongly affected by non-monotonic light curve features that cannot be captured in our simple light curve model. In Figure 11 we show how the full light curve classifier performs in terms of classification probability. We find an overall better performance than for the rapid classifier, achieving a comparable peak observed purity, but at P(SLSN-I) ≈ 0.65 instead of ≈ 0.80, and hence with a higher completeness of about 40% compared to 20% for the rapid classifier. As shown in Figure 12, this essentially means that the full light curve classifier can achieve 50% purity for a comparable number of top SLSN-I candidates as the redshift classifier, ≈65 events. Similarly, it can achieve an observed purity comparable to the peak observed purity of the rapid classifier, but for about 45 top SLSN-I candidates as opposed to about 27.
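Equation 3 itself is not reproduced in this excerpt, so as an illustration we fit the post-peak decline with a simple linear-in-magnitude model, m(t) = m_peak + A·t; the photometry below is synthetic, and this model form is only a stand-in for the paper's actual light curve model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic post-peak photometry: a decline rate of 0.03 mag/day
# over the first 70 days used by the full light curve classifier.
t = np.linspace(0, 70, 30)            # days since peak
m = 20.0 + 0.03 * t + rng.normal(0, 0.05, t.size)

# Least-squares fit of m(t) = m_peak + A * t; A is the decline-rate
# feature (larger A = faster fading, in mag/day).
A, m_peak = np.polyfit(t, m, 1)
print(round(A, 3))  # close to the input 0.03 mag/day
```

The fitted A would then be passed to the random forest alongside the width W and the other features.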
Figure 11.
Left: Completeness as a function of confidence for all three classifiers presented here. Right: Purity, corrected for observational rates, for the same classifiers. The shaded regions represent the 1σ uncertainties.
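The "purity, corrected for observational rates" quoted in Figures 11 and 12 can be illustrated as a prevalence correction: the raw purity measured on the training sample (where SLSN-I are ≈8.6% of events) is rescaled to the ≈1.5% SLSN-I fraction of a magnitude-limited survey (Table 1). The Bayes-style rescaling of the positive predictive value below is our sketch of such a correction, not necessarily FLEET's exact procedure:

```python
def observed_purity(raw_purity, f_train, f_survey=0.015):
    """Rescale a purity measured at class fraction f_train to the
    purity expected at the survey's class fraction f_survey."""
    # Relative selection rates of true and false positives implied
    # by the raw purity at the training-set prevalence.
    tp_rate = raw_purity / f_train
    fp_rate = (1.0 - raw_purity) / (1.0 - f_train)
    return (tp_rate * f_survey
            / (tp_rate * f_survey + fp_rate * (1.0 - f_survey)))

# A raw purity of 50% at the 8.6% training fraction corresponds to a
# substantially lower purity at the 1.5% survey rate.
print(round(observed_purity(0.50, 0.086), 3))  # → 0.139
```

When the training fraction already equals the survey fraction the correction is the identity, as expected.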
Figure 12.
Left: Completeness as a function of the top transients classified as SLSN-I for all three classifiers presented here. Right: Purity, corrected for observational rates, for the same classifiers. The shaded regions represent the 1σ uncertainties.

8. CONCLUSIONS

We have presented a random forest classifier, FLEET, designed specifically to rapidly identify SLSN-I with high purity, without the need for redshift information. We trained this classifier on a sample of about 1800 classified transients reported to the TNS, including 156 SLSN-I (i.e., 8.6% of the total sample). The classifier uses both light curve and contextual host galaxy information. We assess the observed purity achieved by FLEET for the actual rate of SLSN-I in a magnitude-limited survey, ≈1.5%. Our main conclusions are as follows:

• We find that the most important features are the light curve width, the g − r color at peak, and the projected angular separation between the transient and host galaxy normalized by the host radius.

• We find an observed purity of about 50% for events classified as SLSN-I with a probability confidence of P(SLSN-I) > 0.5. This is a factor of 33 improvement compared to random selection (i.e., compared to the 1.5% fraction of SLSN-I in a magnitude-limited survey). The completeness for this classification confidence threshold is about 30%.

• We find a peak observed purity of about 85% for SLSN-I, corresponding to a classification probability threshold of P(SLSN-I) > 0.80 and a total of ∼15 objects. The completeness for this classification confidence threshold is about 20%.

In addition to the main rapid classifier we also explored two alternative classifiers that use redshift information and full light curves, respectively. As expected, we find that these classifiers achieve better results, with a significant increase in completeness, by about a factor of 2, for an observed purity that matches the peak performance of the main rapid classifier.

Placing our results in context, we note that at present current surveys are reporting thousands of transients a year, out of which those with sufficient light curve coverage (g- and r-band) and localization (within the footprint of PS1/3π) can be classified by our algorithm. For an observational SLSN-I fraction of 1.5%, this sample contains about 90 SLSN-I per year. Our rapid classifier can therefore recover about 30 SLSN-I with a purity of 50%, thereby requiring about 60 follow-up spectra per year; or alternatively, about 18 SLSN-I per year with a purity of about 85%, requiring about 21 follow-up spectra. Looking forward to LSST, which is expected to have ∼10^4 SLSN-I per year in its data stream (Villar et al. 2018), our classifier could discover ∼140 SLSN-I a month, with ∼170 follow-up spectra. This would increase the existing sample by two orders of magnitude over the lifetime of LSST.

The Berger Time-Domain Group is supported in part by NSF grant AST-1714498. V.A.V. acknowledges support from a Ford Foundation Dissertation Fellowship. Operation of the Pan-STARRS1 telescope is supported by the National Aeronautics and Space Administration under grant No. NNX12AR65G and grant No. NNX14AM74G issued through the NEO Observation Program. This work has made use of data from the European Space Agency (ESA) mission
Gaia, processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This research has made use of NASA's Astrophysics Data System. This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France. Based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/IRFU, at the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l'Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at Terapix available at the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. This research has made use of the NASA/IPAC Extragalactic Database, which is funded by the National Aeronautics and Space Administration and operated by the California Institute of Technology.
Facilities:
ADS, TNS
Software:
Astropy (Astropy Collaboration 2018), extinction (Barbary 2016), Matplotlib (Hunter 2007), emcee (Foreman-Mackey et al. 2013), NumPy (van der Walt et al. 2011), scikit-learn (Pedregosa et al. 2012), SMOTE (Chawla et al. 2002)

REFERENCES
Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2019, arXiv e-prints, arXiv:1912.02905
Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
Angus, C. R., Smith, M., Sullivan, M., et al. 2019, MNRAS, 487, 2215
Astropy Collaboration. 2018, AJ, 156, 123
Barbary, K. 2016, extinction, v0.3.0, Zenodo, doi:10.5281/zenodo.804967
Bellm, E. C., Kulkarni, S. R., Graham, M. J., et al. 2019, PASP, 131, 018002
Berger, E. 2010, ApJ, 722, 1946
Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393
Blanchard, P. K. 2019, PhD thesis, Harvard University, http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029690
Blanchard, P. K., Berger, E., Nicholl, M., & Villar, V. A. 2020, ApJ, 897, 114
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2019, ApJ, 872, 90
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2017, ApJ, 843, 106
Blanchard, P. K., Nicholl, M., Berger, E., et al. 2018, ApJ, 865, 9
Bloom, J. S., Kulkarni, S. R., & Djorgovski, S. G. 2002, AJ, 123, 1111
Boone, K. 2019, AJ, 158, 257
Breiman, L. 2001, Machine Learning, 45, 5
Chambers, K., & Pan-STARRS Team. 2018, in American Astronomical Society Meeting Abstracts, Vol. 231
Chen, T. W., Inserra, C., Fraser, M., et al. 2018, ApJL, 867, L31
Chomiuk, L., Chornock, R., Soderberg, A. M., et al. 2011, ApJ, 743, 114
Cooke, J., Sullivan, M., Gal-Yam, A., et al. 2012, Nature, 491, 228
Dahiwale, A., & Fremling, C. 2020, Transient Name Server Classification Report, 2020-1756, 1
De Cia, A., Gal-Yam, A., Rubin, A., et al. 2018, ApJ, 860, 100
Dessart, L., Hillier, D. J., Waldman, R., Livne, E., & Blondin, S. 2012, MNRAS, 426, L76
Drake, A. J., Djorgovski, S. G., Mahabal, A., et al. 2009, ApJ, 696, 870
Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306
Fraser, M., Reynolds, T., Mattila, S., & Yaron, O. 2016, Transient Name Server Classification Report, 2016-521, 1
Fremling, C., & Dahiwale, A. 2019, Transient Name Server Classification Report, 2019-1774, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018a, Transient Name Server Classification Report, 2018-1411, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018b, Transient Name Server Classification Report, 2018-1416, 1
Fremling, C., Dugas, A., & Sharma, Y. 2018c, Transient Name Server Classification Report, 2018-1877, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019a, Transient Name Server Classification Report, 2019-32, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019b, Transient Name Server Classification Report, 2019-188, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019c, Transient Name Server Classification Report, 2019-598, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019d, Transient Name Server Classification Report, 2019-636, 1
Fremling, C., Dugas, A., & Sharma, Y. 2019e, Transient Name Server Classification Report, 2019-952, 1
Fremling, C., Miller, A. A., Sharma, Y., et al. 2020, ApJ, 895, 32
Gagliano, A., Narayan, G., Engel, A., & Carrasco Kind, M. 2020, arXiv e-prints, arXiv:2008.09630
Gomez, S., Berger, E., Blanchard, P. K., et al. 2020, FLEET: Finding Luminous and Exotic Extragalactic Transients, 1.0.0, Zenodo, doi:10.5281/zenodo.4013965
Gomez, S., Berger, E., Nicholl, M., et al. 2019, ApJ, 881, 87
Goodman, J., & Weare, J. 2010, Communications in Applied Mathematics and Computational Science, 5, 65
Graham, M. L., Connolly, A. J., Ivezić, Ž., et al. 2018, AJ, 155, 1
Guillochon, J., Parrent, J., Kelley, L. Z., & Margutti, R. 2017, ApJ, 835, 64
Hosseinzadeh, G., Dauphin, F., Villar, V. A., et al. 2020, arXiv e-prints, arXiv:2008.04912
Howell, D. A., Kasen, D., Lidman, C., et al. 2013, ApJ, 779, 98
Hudelot, P., Cuillandre, J. C., Withington, K., et al. 2012, VizieR Online Data Catalog
Hunter, J. D. 2007, CSE, 9, 90
Inserra, C., Smartt, S. J., Jerkstrand, A., et al. 2013, ApJ, 770, 128
Inserra, C., Nicholl, M., Chen, T. W., et al. 2017, MNRAS, 468, 4642
Kasen, D., & Bildsten, L. 2010, ApJ, 717, 245
Kasliwal, M., & Cao, Y. 2019, Transient Name Server Discovery Report, 2019-259, 1
Kessler, R., Narayan, G., Avelino, A., et al. 2019, PASP, 131, 094501
Leloudas, G., Chatzopoulos, E., Dilday, B., et al. 2012, A&A, 541, A129
Lin, W. L., Wang, X. F., Li, W. X., et al. 2020, arXiv e-prints, arXiv:2006.16443
Liu, L.-D., Wang, L.-J., Wang, S.-Q., & Dai, Z.-G. 2018, ApJ, 856, 59
Lunnan, R., Chornock, R., Berger, E., et al. 2013, ApJ, 771, 97
Lunnan, R., Chornock, R., Berger, E., et al. 2014, ApJ, 787, 138
Lunnan, R., Fransson, C., Vreeswijk, P. M., et al. 2018a, Nature Astronomy, 2, 887
Lunnan, R., Chornock, R., Berger, E., et al. 2018b, ApJ, 852, 81
Lunnan, R., Yan, L., Perley, D. A., et al. 2019, arXiv e-prints, arXiv:1910.02968
Lyman, J., Homan, D., Magee, M., & Yaron, O. 2017, Transient Name Server Classification Report, 2017-881, 1
Mazzali, P. A., Sullivan, M., Pian, E., Greiner, J., & Kann, D. A. 2016, MNRAS, 458, 3455
McCrum, M., Smartt, S. J., Rest, A., et al. 2015, MNRAS, 448, 1206
Metzger, B. D., Margalit, B., Kasen, D., & Quataert, E. 2015, MNRAS, 454, 3311
Muthukrishna, D., Narayan, G., Mandel, K. S., Biswas, R., & Hložek, R. 2019, PASP, 131, 118002
Nicholl, M., Berger, E., Blanchard, P. K., Gomez, S., & Chornock, R. 2019, ApJ, 871, 102
Nicholl, M., Berger, E., Margutti, R., et al. 2017a, ApJL, 845, L8
Nicholl, M., Berger, E., Margutti, R., et al. 2017b, ApJL, 835, L8
Nicholl, M., Guillochon, J., & Berger, E. 2017c, ApJ, 850, 55
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2013, Nature, 502, 346
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2014, MNRAS, 444, 2096
Nicholl, M., Smartt, S. J., Jerkstrand, A., et al. 2015, ApJL, 807, L18
Nicholl, M., Berger, E., Margutti, R., et al. 2016, ApJL, 828, L18
Nicholl, M., Blanchard, P. K., Berger, E., et al. 2018, ApJL, 866, L24
Nicholl, M., Blanchard, P. K., Berger, E., et al. 2020, Nature Astronomy, arXiv:2004.05840
Papadopoulos, A., D'Andrea, C. B., Sullivan, M., et al. 2015, MNRAS, 449, 1215
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2012, arXiv e-prints, arXiv:1201.0490
Perley, D., Yan, L., Andreoni, I., et al. 2019a, Transient Name Server Classification Report, 2019-1712, 1
Perley, D., Yan, L., Lunnan, R., et al. 2019b, Transient Name Server Classification Report, 2019-2829, 1
Perley, D. A., Yan, L., Gal-Yam, A., et al. 2019c, Transient Name Server AstroNote, 79, 1
Perley, D. A., Quimby, R. M., Yan, L., et al. 2016, ApJ, 830, 13
Prajs, S., Sullivan, M., Smith, M., et al. 2017, MNRAS, 464, 3568
Prentice, S. J., Maguire, K., Skillen, K., Magee, M. R., & Clark, P. 2019, Transient Name Server Classification Report, 2019-2339, 1
Quimby, R. M., Aldering, G., Wheeler, J. C., et al. 2007, ApJL, 668, L99
Quimby, R. M., Kulkarni, S. R., Kasliwal, M. M., et al. 2011, Nature, 474, 487
Quimby, R. M., De Cia, A., Gal-Yam, A., et al. 2018, ApJ, 855, 2
Roy, R., Sollerman, J., Silverman, J. M., et al. 2016, A&A, 596, A67
Sánchez-Sáez, P., Reyes, I., Valenzuela, C., et al. 2020, arXiv e-prints, arXiv:2008.03311
Schlafly, E. F., & Finkbeiner, D. P. 2011, ApJ, 737, 103
Schulze, S., Krühler, T., Leloudas, G., et al. 2018, MNRAS, 473, 1258
Short, P., Nicholl, M., Muller, T., Angus, C., & Yaron, O. 2019, Transient Name Server Classification Report, 2019-772, 1
Tachibana, Y., & Miller, A. A. 2018, PASP, 130, 128001
van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, CSE, 13, 22
van Velzen, S., Gezari, S., Hammerstein, E., et al. 2020, arXiv e-prints, arXiv:2001.01409
Villar, V. A., Nicholl, M., & Berger, E. 2018, ApJ, 869, 166
Villar, V. A., Berger, E., Miller, G., et al. 2019, ApJ, 884, 83
Villar, V. A., Hosseinzadeh, G., Berger, E., et al. 2020, arXiv e-prints, arXiv:2008.04921
Vreeswijk, P. M., Savaglio, S., Gal-Yam, A., et al. 2014, ApJ, 797, 24
Vreeswijk, P. M., Leloudas, G., Gal-Yam, A., et al. 2017, ApJ, 835, 58
Whitesides, L., Lunnan, R., Kasliwal, M. M., et al. 2017, ApJ, 851, 107
Yan, L., Chen, Z., Perley, D., et al. 2019a, Transient Name Server Classification Report, 2019-2041, 1
Yan, L., Perley, D., Lunnan, R., et al. 2019b, Transient Name Server AstroNote, 45, 1
Yan, L., Quimby, R., Ofek, E., et al. 2015, ApJ, 814, 108
Yan, L., Lunnan, R., Perley, D. A., et al. 2017, ApJ, 848, 6
Yan, L., Perley, D., Schulze, S., et al. 2020, arXiv e-prints, arXiv:2006.13758
Young, D. 2016, Transient Name Server Classification Report, 2016-68, 1
APPENDIX

We show in Table A.1 the sample of all the SLSN-I used for this classifier, sorted by redshift.
Table A.1. Type-I SLSNe

Name Redshift Reference | Name Redshift Reference | Name Redshift Reference
SN2017egm 0.0307 49 | PS15cjz 0.2200 2 | DES17C3gyp 0.4700 2
PTF11hrq 0.0571 21 | SN2016wi 0.2240 27 | DES14C1rhg 0.4810 2
SN2018hti 0.0600 62 | SN2018gft 0.2300 56 | SN2019itq 0.4810 this work
SN2019unb 0.0635 1 | SN2010gx 0.2301 30 | SN2016aj 0.4850 60
SN2018bgv 0.0795 23 | SN2018ffj 0.2340 this work | SN2019kwq 0.5000 53
SN2012aa 0.0830 48 | SN2018gkz 0.2400 29 | PTF09atu 0.5015 10
SN2019hge 0.0866 61 | SN2011kf 0.2450 28 | PS1-14bj 0.5125 4
SN2017gci 0.0900 47 | iPTF16bad 0.2467 27 | SN2019otl 0.5140 this work
SN2010md 0.0987 10 | SN2019enz 0.2550 26† | PS1-12bqf 0.5220 4
SN2016eay 0.1013 46 | LSQ12dlf 0.2550 25 | PS1-11ap 0.5240 4
PTF12hni 0.1056 30 | LSQ14mo 0.2560 24 | DES16C3dmp 0.5620 2
PTF12dam 0.1070 41 | PTF09cnd 0.2584 10 | DES15S1nog 0.5650 2
SN2019neq 0.1075 45 | SN2019dlr 0.2600 53 | SN2019sgg 0.5726 54
SN2018kyt 0.1080 44 | SN2019hno 0.2600 53 | SN2019kwu 0.6000 53
SN2017ens 0.1086 43 | SN2018fd 0.2630 this work | DES14X3taz 0.6080 2
SN2015bn 0.1136 42 | SN2013dg 0.2650 25 | PS1-10bzj 0.6500 4
PTF10nmn 0.1237 10,21 | SN2018lfd 0.2700 55 | SN2013hy 0.6630 9,2
SN2007bi 0.1279 41 | iPTF13bjz 0.2712 30 | SN2019fiy 0.6700 53
SN2017dwh 0.1300 40 | SN2018bym 0.2740 23 | PS1-12zn 0.6740 52
SN2018avk 0.1320 23 | SN2011ep 0.2800 16 | DES17X1blv 0.6900 2
SN2020exj 0.1330 59 | SN2005ap 0.2832 22 | DES16C3cv 0.7270 2
SN2019lsq 0.1400 39 | PTF10uhf 0.2879 21 | PS1-11bdn 0.7380 4
SN2018ffs 0.1420 this work | SN2016inl 0.2980 this work | iPTF13ajg 0.7403 8
SN2011ke 0.1429 21 | MLS121104 0.3030 52 | SNLS07D3bs 0.7570 51
SN2019bgu 0.1480 58 | SN2019eot 0.3057 20 | DES15X3hm 0.8600 2
SN2019cdt 0.1530 38 | SN2017beq 0.3100 19 | DES14X2byo 0.8680 2
LSQ14an 0.1630 37 | PS1-12cil 0.3200 4 | PS1-13gt 0.8840 4
SN2019ujb 0.1647 this work | SN2019cwu 0.3200 53 | PS1-10awh 0.9084 7
SN2019obk 0.1656 61 | PTF12mxx 0.3296 10 | DES17X1amf 0.9200 2
SN2018ibb 0.1660 57 | iPTF13ehe 0.3434 18 | DES16C3ggu 0.9490 2
SN2019pvs 0.1670 this work | SN2019sgh 0.3440 this work | PS1-10ky 0.9558 7
PTF10bfz 0.1701 10 | LSQ14bdq 0.3450 17 | PS1-11aib 0.9970 4
SN2012il 0.1750 28 | SN2018lfe 0.3500 63 | DES16C2aix 1.0680 2
PTF12gty 0.1768 21 | SN2019kwt 0.3562 53 | PS1-10ahf 1.1000 4
CSS160710 0.1800 36 | PTF10bjp 0.3584 10 | DES15X1noe 1.1880 2
Table A.1 (continued)
Name Redshift Reference | Name Redshift Reference | Name Redshift Reference
SN2019gfm 0.1816 35 | LSQ14fxj 0.3600 16 | SCP06F6 1.1890 6
SN2009cb 0.1864 21 | SN2019zbv 0.3700 this work | PS1-10pm 1.2060 5
SN2009jh 0.1867 10,21 | SN2006oz 0.3760 15 | PS1-11tt 1.2830 4
iPTF16asu 0.1870 34 | SN2019zeu 0.3900 this work | DES14C1fi 1.3020 2
SN2019nhs 0.1900 33 | DES15C3hav 0.3920 2 | PS1-11afv 1.4070 4
SN2018cxa 0.1900 this work | iPTF13cjq 0.3962 30 | SNLS07d2bv 1.5000 3
SN2010hy 0.1901 10,30 | SN2019kcy 0.4000 53 | DES14S2qri 1.5000 2
SN2011kg 0.1924 10 | iPTF13bdl 0.4030 30 | PS1-13or 1.5200 4
SN2019kws 0.1977 53,61 | SN2019cca 0.4103 14 | PS1-11bam 1.5650 4
SN2019xaq 0.2000 this work | iPTF16eh 0.4270 13 | PS1-12bmy 1.5720 4
SN2016ard 0.2025 32 | CSS130912 0.4305 11,12 | SNLS06d4eu 1.5881 3
PTF10aagc 0.2060 10 | PTF10vqv 0.4518 10 | DES16C2nm 1.9980 2
SN2016els 0.2170 31 | CSS140925 0.4600 16 | SN2213 2.0500 50
Note — All the SLSN-I used to train our classifier. Note that there are more SLSNe candidates in the literature, but we keep only the unambiguous ones to avoid polluting the sample. References: 1: Prentice et al. (2019); 2: Angus et al. (2019); 3: Howell et al. (2013); 4: Lunnan et al. (2018b); 5: McCrum et al. (2015); 6: Quimby et al. (2011); 7: Chomiuk et al. (2011); 8: Vreeswijk et al. (2014); 9: Papadopoulos et al. (2015); 10: Perley et al. (2016); 11: Vreeswijk et al. (2017); 12: Liu et al. (2018); 13: Lunnan et al. (2018a); 14: Perley et al. (2019b); 15: Leloudas et al. (2012); 16: Schulze et al. (2018); 17: Nicholl et al. (2015); 18: Yan et al. (2015); 19: Kasliwal & Cao (2019); 20: Fremling et al. (2019e); 21: Quimby et al. (2018); 22: Quimby et al. (2007); 23: Lunnan et al. (2019); 24: Chen et al. (2017); 25: Nicholl et al. (2014); 26: Short et al. (2019); 27: Yan et al. (2017); 28: Inserra et al. (2013); 29: Fremling et al. (2018b); 30: De Cia et al. (2018); 31: Fraser et al. (2016); 32: Blanchard et al. (2018); 33: Perley et al. (2019a); 34: Whitesides et al. (2017); 35: Chen (2019); 36: Drake et al. (2009); 37: Inserra et al. (2017); 38: Fremling et al. (2019d); 39: Fremling & Dahiwale (2019); 40: Blanchard et al. (2019); 41: Nicholl et al. (2013); 42: Nicholl et al. (2016); 43: Chen et al. (2018); 44: Fremling et al. (2019b); 45: Perley et al. (2019c); 46: Nicholl et al. (2017b); 47: Lyman et al. (2017); 48: Roy et al. (2016); 49: Nicholl et al. (2017a); 50: Cooke et al. (2012); 51: Prajs et al. (2017); 52: Lunnan et al. (2014); 53: Yan et al. (2019b); 54: Yan et al. (2019a); 55: Fremling et al. (2019a); 56: Fremling et al. (2018a); 57: Fremling et al. (2018c); 58: Fremling et al. (2019c); 59: Dahiwale & Fremling (2020); 60: Young (2016); 61: Yan et al. (2020); 62: Lin et al. (2020); 63: Yin et al., in prep.

† We find that a redshift of z = 0.255 is a better match to the SN spectral features than the previously reported value.