Classifying High-cadence Microlensing Light Curves I: Defining Features
Somayeh Khakpash, Joshua Pepper, Matthew Penny, B. Scott Gaudi, R. A. Street
Draft version January 12, 2021
Typeset using LaTeX twocolumn style in AASTeX63
Department of Physics, Lehigh University, 16 Memorial Drive East, Bethlehem, PA 18015, USA
Department of Physics & Astronomy, 261-B Nicholson Hall, Tower Dr., Baton Rouge, LA 70803, USA
Department of Astronomy, The Ohio State University, 140 W. 18th Ave., Columbus, OH 43210
LCOGT, 6740 Cortona Dr, Goleta, CA 93117, USA
(Dated: January 12, 2021)
ABSTRACT

Microlensing is a powerful tool for discovering cold exoplanets, and the Roman Space Telescope microlensing survey will discover over 1000 such planets. Rapid, automated classification of Roman's microlensing events can be used to prioritize follow-up observations of the most interesting events. Machine learning is now often used for classification problems in astronomy, but the success of such algorithms can rely on the definition of appropriate features that capture essential elements of the observations that can map to parameters of interest. In this paper, we introduce tools that we have developed to capture features in simulated Roman light curves of different types of microlensing events, and evaluate their effectiveness in classifying microlensing light curves. These features are quantified as parameters that can be used to decide the likelihood that a given light curve is due to a specific type of microlensing event. This method leaves us with a list of parameters that describe features like the smoothness of the peak, symmetry, the number of peaks, and the width and height of small deviations from the main peak. This will allow us to quickly analyze a set of microlensing light curves and later use the resulting parameters as input to machine learning algorithms to classify the events.

1. INTRODUCTION

Microlensing is a phenomenon that happens when light emitted from a distant object (the source) is lensed by a closer, massive object (the lens), and as a result, multiple images of the source are formed.
These images are typically not resolved because their angular separation is much smaller than the angular resolution of both ground- and space-based telescopes, and consequently, we observe a brightening of the source. Microlensing is a powerful tool for detecting small and dim objects that are otherwise very hard to detect by their emitted light, and in particular, it is so far the only method capable of investigating planetary systems with terrestrial-mass planets orbiting beyond the snow line (Gaudi 2010). For several decades, ground-based surveys have searched the Galactic Bulge for microlensing events. The Optical Gravitational Lensing Experiment (OGLE; Udalski et al. 2015), the Microlensing Observations in Astrophysics (MOA; Bond et al. 2001), and the Korea Microlensing Telescope Network (KMTNet; Kim et al. 2016) surveys detect thousands of microlensing events. These observations have led to the successful discovery of over 100 exoplanets and some compact substellar objects (Griest et al. 1995; Bennett et al. 2002; Mao et al. 2002). Identifying microlensing events from the huge numbers of light curves obtained in these surveys is a challenge. It is useful to be able to distinguish microlensing events either in real time during survey operations as the events start, or after an observing campaign has finished and the data are being fully analyzed. Microlensing surveys have generally relied on circulating real-time alerts of potential microlensing events when there is an increase in the brightness of an observed source. This method can lead to alerts for any sudden rise in a light curve, which can include variability other than microlensing, such as cataclysmic variables (CVs). There has been recent work to increase the accuracy of these alerts, limiting them to genuine microlensing events, using machine learning (ML) algorithms that can distinguish microlensing light curves from other variability in real time (Kessler et al. 2019; Godines et al. 2019).
Real-time classification can increase the accuracy of alerts. It can also identify high-value events for which only partial coverage will be obtained by the survey, and allow observers to schedule supplemental follow-up observations to increase the coverage of the event. There is also a need for classification of microlensing survey data after the conclusion of observations. At that point, it is necessary to first distinguish microlensing events from other types of variability, and also to separate different types of microlensing events, such as those that include planetary signals. Surveys like the KMTNet survey have employed an automated approach to detect microlensing-like variability in complete light curves by fitting simple functions (Kim et al. 2018). Belokurov et al. (2003) also advocate the use of neural networks to distinguish well-sampled microlensing light curves from other types of variability. In this paper, we wish to explore the efficiency and efficacy of after-the-fact classification of microlensing signals. ML has now become a popular method for classifying astronomical time series. Some ML classifiers like the Random Forest (Liaw & Wiener 2002) and k-means (Lloyd 1982) classifiers take light curve features as input and try to find a connection between those features and the classification labels. In this scenario, "features" are quantitative statistical or morphological measurements of the time series. Some other methods like neural networks use the light curves themselves as input, and then find common patterns or higher-order correlated properties to classify them. As a specific example, Random Forest is a widely-used algorithm that classifies time series by making decisions based on features in the time series (Bluck et al. 2020; Pawlak 2019).
In order to use this algorithm efficiently, observable features in the light curves that are most closely related to the canonical model parameters must first be identified. Here, we focus on analyzing complete high-cadence microlensing light curves. These are simulated light curves for the Roman Galactic Bulge Exoplanet Survey. The Nancy Grace Roman Space Telescope (Roman), formerly known as the Wide Field Infrared Survey Telescope (WFIRST), is a future NASA space mission that is expected to be launched by the mid-2020s. One of its primary goals is to detect exoplanets using the microlensing method (Spergel et al. 2015). It is estimated that Roman will find about 54,000 microlensing events and will detect 1400 planets (Penny et al. 2019; Street et al. in prep.). Binary-lens light curves are often difficult to model because of the complicated features in their light curves and the large parameter space that needs to be fully searched (for a more thorough review refer to the introductions of Penny (2014) and Khakpash et al. (2019)). Those challenges highlight the importance of developing fast automated algorithms to quickly analyze light curves and estimate the microlensing model parameters. Our primary goal in this work is to identify a list of features specifically defined for microlensing light curves that can be generated by fast and efficient algorithms, and show that these features can help either in differentiating microlensing light curves among other types of variability or in classifying microlensing light curves into different types. This would enable fast detection of planetary-system lenses and other interesting lensing cases in the released datasets of large surveys like Roman. We present a collection of algorithms including various functional fits that are applied to the light curves, and from these fits, we extract parameters that quantify features of the light curve like smoothness of the peak, symmetry, number of peaks, similarity to microlensing single-lens curves, number of deviations from the peak, and width and height of the deviations from the main peak. We then show how effective each of these functions is in distinguishing between different types of microlensing events, like identifying single-lens versus multiple-lens systems, or planetary-system lenses versus stellar binary-lens systems, and in some cases in detecting microlensing events among other types of variability. Similar work was first done by Mao & Di Stefano (1994), who used features such as an estimate of the asymmetry about the peak to detect binary-lens signatures in the light curves.
In Section 2, we introduce the different models of microlensing events that need to be parameterized for the classification, and in Section 3, we discuss the properties of our test dataset. In Section 4, we introduce our algorithm package including the different functional fits and their respective output parameters. We also evaluate the effectiveness of each function in capturing the specific set of features in the light curve. In Section 6, we show preliminary tests of using these features as input to machine learning classifiers. Finally, in Section 7, we discuss our results and the applicability of our method to other datasets.

2. MICROLENSING MODELS

2.1. Point-source Point-lens Microlensing Light Curves
The Point-source Point-lens (PSPL) model describes a lensing event in which there is a single lens and a single source, and the source and the lens can both be approximated as point-like objects with zero angular size. When there is no blending of the source with the neighboring stars or light from the lens or companions to the lens or source, this model can be described by a simple Paczyński curve as in Equations 1 and 2, where A is the magnification of the source, t₀ is the time of the maximum magnification, u₀ is the impact parameter, and t_E is the Einstein crossing time. Figure 2 shows five different PSPL models with the same Einstein timescale and different values of u₀. In the presence of blending, another parameter is added to Equation 1, yielding Equation 3, where F(t) is the differential flux and f_s is the blending parameter, which determines what fraction of the total flux in the aperture is due only to the source. When there are no higher-order effects, these curves remain symmetric, and the sharpness of the peak in high-magnification events remains intact. Detecting any deviation from this model can help identify the presence of higher-order effects.

A(t) = (u(t)² + 2) / (u(t) √(u(t)² + 4))    (1)

u(t) = √(u₀² + ((t − t₀)/t_E)²)    (2)

F(t) = f_s × A(t) + (1 − f_s)    (3)
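As an illustration, Equations 1–3 translate directly into code. The following Python sketch (the function names and argument order are ours, not from the paper) evaluates the PSPL magnification and the blended flux:

```python
import numpy as np

def pspl_magnification(t, t0, tE, u0):
    """PSPL magnification A(t): Equation 2 for u(t), then Equation 1."""
    u = np.sqrt(u0 ** 2 + ((t - t0) / tE) ** 2)
    return (u ** 2 + 2) / (u * np.sqrt(u ** 2 + 4))

def pspl_flux(t, t0, tE, u0, fs):
    """Blended differential flux F(t) of Equation 3; fs is the blend fraction."""
    return fs * pspl_magnification(t, t0, tE, u0) + (1 - fs)
```

For example, at u = 1 the magnification is 3/√5 ≈ 1.34, and far from the peak F(t) returns to the baseline value of 1.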
2.2. Finite Source Effects

When the source is not point-like and has a finite size, the magnification function will be different, especially when the impact parameter is comparable to the source size (Witt & Mao 1994). In Figure 1, an example of the changes due to finite source effects can be seen in panel (b). For small impact parameters, this effect becomes stronger, and it smooths out the peak of the event. This effect can be used to estimate θ_E, the angular size of the Einstein ring, and therefore provide constraints on the lens mass (Yoo et al. 2004; Jiang et al. 2004).

2.3. Free-Floating Planets
Isolated, short-duration microlensing events can be due to free-floating planets (FFPs) or planets in very wide orbits. Compared to most other exoplanet detection methods, microlensing has the capability to detect these planets since the method only depends on the mass of the lens and not its luminosity. Microlensing events caused by FFPs have short timescales. The source size parameter, ρ, is defined as θ*/θ_E, where θ* is the apparent size of the source, and θ_E is the angular size of the Einstein ring. θ_E depends on the lens mass, and it becomes smaller for single planetary lenses, and
(Figure 1 panels: a) PSPL; b) FSPL; c) FFP; d) Stellar Binary-lens; e) Stellar Binary-lens; f) Caustic-crossing Planetary Binary-lens; g) Planetary Binary-lens - Major Image Perturbation; h) Planetary Binary-lens - Minor Image Perturbation. Axes: t (days) vs. W magnitude.)
Figure 1.
Light curves containing different types of major features in microlensing light curves. The top row includes light curves created by single lenses. Panel (a) is a single stellar lens with no significant finite source effect. Panel (b) shows the effects of the finite source effect on a single-lens stellar event. Panel (c) shows an event due to a free-floating planet. Panels (d) and (e) show two examples of events caused by binary star lens systems. Panels (g) and (h) are due to planetary systems and have no caustic crossings, whereas panel (f) is a planetary event that contains a caustic crossing.

(Figure 2 legend: t_E = 5 days; u₀ = 0.1, 0.2, 0.3, 0.4, 0.5.)
Figure 2.
Five PSPL models with different values of u₀. The figure shows how the shape of the curve changes for different values of the impact parameter.

therefore, ρ becomes larger, and the finite source effect is stronger. As a result, the top of the short-timescale peak becomes flattened, and the overall shape of the event begins to resemble a top-hat function. An example of such an event can be seen in panel (c) of Figure 1.

2.4. Binary-lens Microlensing Events
When there is more than one lens, the range of possible light curve morphologies increases dramatically. In binary-lens systems, the various shapes of the light curves depend on parameters such as q, the mass ratio; s, the projected separation; and α, the angle the path of the source makes with the line connecting the two lenses. Since α can be any value between zero and 360°, this quantity, along with q, s, and t_E, allows the time, height, width, and number of the features in the light curve to vary significantly. Based on OGLE (Udalski 2004) and MOA (Bond et al. 2001) observations, around 10% of microlensing light curves show lens binarity signatures (Penny et al. 2011). Fully modeling each of these light curves is challenging because there are multiple minima in the χ² surface as a function of their parameters, and therefore a wide range of parameter space must be searched. Furthermore, there is no direct mapping between the observable features and the canonical parameters (q, s, α, t₀, t_E, u₀).

2.4.1. Stellar Binary-lens Events
Stellar binary lenses are likely as common as or more common than planetary-system lenses, but the probability of seeing deviations from a PSPL curve is much higher in nearly equal-mass (q ≳ 0.1) binary-lens light curves than in light curves due to planetary mass-ratio (q ≲ 0.01) binaries, because the size of the caustics (the loci of infinite magnification for a point source) increases with increasing q. Therefore, they are typically easy to distinguish from planetary-system lenses since the deviations from the standard PSPL form last much longer, and in particular can be comparable to t_E. They usually have larger perturbations in their light curves, but can still be misinterpreted as planetary binary-lens events (Han et al. 2016). Two examples of such light curves can be seen in panels (d) and (e) of Figure 1.

2.4.2. Planetary Binary-lens Events
A lens system containing a star and a single planet is a binary-lens system with a very small mass ratio (q). Note that there is no strict delineation (in terms of q) between a stellar binary-lens event and a planetary binary-lens event, and this is one of the reasons that this is a challenging classification problem. In these cases, the light curve is dominated by the host star, such that there is a main peak which can be described by the PSPL model, with one or more small deviations caused by the planet (Gaudi 2012). As in a single-lens microlensing event, two images of the source will be formed as it passes close by the lens star: an image outside of the Einstein ring of the lens star, called the major image, and an image inside the Einstein ring of the lens star, called the minor image. Depending on whether the projected separation of the planet and the lens star (s) is larger or smaller than the Einstein ring, the planet will perturb one of these images to leave a signature on the light curve, and the perturbation will have different characteristics. The effect of the planet on the light curve can also be explained by the positions of caustics. In real cases, since the source star is not a point source, we observe sharp high-magnification peaks when there are caustic crossings. The numbers and sizes of these caustics depend on the mass ratio (q) and the projected separation (s). The structure of the caustics leads to the planetary features in the light curves of these events (Gaudi 2012). Caustics have three main topologies. Depending on the values of the projected separation and the mass ratio, there is either one (intermediate topology), two (wide topology), or three (close topology) caustics (for a thorough description of planetary caustics please refer to Gaudi (2012)).
These caustics are also classified into two classes, called the central caustic and the planetary caustic, and depending on which one is closer to the path of the source, the shapes of the light curve are different.
If the source crosses a caustic curve, two sharp deviations will be seen in the light curve (assuming sufficient sampling), corresponding to the entrance and exit of the source through the caustic. The shape of these deviations depends on the path of the source, which is determined by the source path angle (α). Panel (f) of Figure 1 includes an example of a caustic-crossing planetary microlensing event. In this work, we analyze these perturbations based on whether or not the source has crossed the caustics, and we investigate the features created in each of these cases. In cases with no caustic crossing, the source only passes close to the caustics, and that results in features like short bumps or troughs, or both, in the light curve. These deviations are typically small in magnitude and short in duration, which makes them sometimes very difficult to detect, especially in low-cadence surveys, and they can be created by either the central or the planetary caustics. In panels (g) and (h) of Figure 1, we show examples of non-caustic-crossing events caused by the planetary caustic. These perturbations happen when the source passes closely by the planetary caustic, and finding the time of the perturbation can help determine the location of the caustic, and therefore estimate the projected separation of the planetary system. The event in panel (g) is a perturbation in the major image and the event in panel (h) is a perturbation in the minor image. In cases with a caustic crossing, we need to identify the times of the two sharp deviations to find an estimate of the time the source spends inside the caustic. This allows us to estimate the size and location of the caustics, and therefore find an estimate of the mass ratio and the projected separation of the planetary system (e.g., Poleski et al. 2014).

3. SCOPE OF THE ANALYSIS

As mentioned in Section 1, classifying astronomical time series using ML algorithms relies on the data sets used for training and testing.
Features computed for a data set cannot always be used to analyze another data set; therefore, the connection between the dataset and the defined features should be discussed. In this paper, we focus on computing features for simulated Roman light curves. These light curves are simulated based on the Roman Cycle 7 mission design and have six 72-day seasons with 15-minute cadence. These light curves are similar to the light curves generated by Penny et al. (2013) and Penny et al. (2019), and the full details of the simulated data can be found in these papers. This data set was designed in particular to be used for the Roman microlensing data challenge (https://microlensing-source.org/data-challenge/) and has four classes of variability types, consisting of different types of microlensing light curves along with a particular type of stellar variability that can be misinterpreted as microlensing. These classes include cataclysmic variables, single-lens systems, stellar binary lenses, and planetary-system lenses. An example of each class of light curves used in this analysis can be seen in Figure 3. The dataset contains 4181 light curves, including 429 cataclysmic variables, 1626 single-lens systems, 1386 stellar binary lenses, and 688 planetary-system lenses. The CV class contains CV variability with either zero, one, or multiple instances of outbursts in the baseline. The single-lens systems include both stars and planets as lenses, and therefore their shapes vary in terms of the duration, amplitude, and morphology of the peak. As mentioned before, the stellar binaries and the planetary systems can result in very similar light curves, and the transition between these two classes is smooth in terms of the observables; distinguishing between them is one of our main concerns. The light curves selected for this project have a wide range of signal-to-noise ratios. To enter our sample, events had to pass a Δχ² > 500 cut of the true model relative to a flat line, and binary and planetary events had to pass a Δχ² > 160 cut of the true model relative to a single-lens light curve model (see Penny et al. (2019) for full details). While these Δχ² values suggest quite high formal detection significance, the cuts can be passed by light curves with signal-to-noise per data point less than unity; e.g., a t_E = 25 day single-lens event will have 4800 data points within one t_E of the event peak. Similarly, a Jupiter-mass planet with a planetary deviation lasting ∼ 1 day (∼ 96 data points) can pass the detection cut with an average contribution to Δχ² per data point of less than 2. This said, our selection cuts do not allow us to test our method's sensitivity to the lowest signal-to-noise ratio events that could potentially be formally detected.

4. FEATURE DETECTION AND PARAMETRIZATION

In Section 2, we described major types of microlensing variability with different morphologies and amplitudes. In this section, we introduce a package of statistical tests and model fits we have assembled into an algorithmic approach to detect the features of a set of light curves. These features individually can then be used to distinguish microlensing events from other variability, and to
(Figure 3 panels: a) CV; b) PSPL/FFP; c) Stellar Binary-lens; d) Planetary Binary-lens. Axes: time (days) vs. W magnitude.)
Figure 3.
Four classes of light curves in the dataset used for this analysis. From left: the first panel is an example of the light curve of a cataclysmic variable. The second panel shows a single-lens model, which contains a single symmetric peak. The third panel represents lens systems containing a stellar binary, and the last panel is an example of an event with a planetary system as the lens. The blue data points are from the original simulated light curves, and the black data points are smoothed light curves using the low-pass filter.

classify the different types of microlensing systems. Ultimately, in a subsequent paper, we will use the combined power of all of the features as input to a ML classifier. The package includes the following tools:

• Peak Finder
• Gould Two-parameter Point-source Point-lens Fit (G-PSPL)
• Symmetry Check
• Trapezoidal Function Fit
• Cauchy Distribution Fit
• Planetary Parameters Finder
• Chebyshev Polynomials Fit

In the sections below we introduce each of these tools, and explain how they are applied to the simulated data set and evaluated. Not all tools are used independently, and some are used in combination with other tools. Before applying some of these tools, we apply a low-pass filter to the light curve, which allows low-frequency signals to pass and therefore reduces the high-frequency noise in the data. This low-pass filter has a smoothing window of 10 subsequent data points. The width of the smoothing window was optimized for the simulated Roman data set analyzed in this work, but should be separately optimized for each data set so as to remove non-astrophysical noise as much as possible.
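The text specifies only that the low-pass filter uses a smoothing window of 10 subsequent data points; a running mean is one simple realization of such a filter (the function name and the use of `np.convolve` are our assumptions, not the paper's implementation):

```python
import numpy as np

def smooth_lightcurve(mag, window=10):
    """Low-pass filter the magnitudes with a running mean of `window` points."""
    kernel = np.ones(window) / window
    return np.convolve(mag, kernel, mode="same")
```

Note that with `mode="same"` the first and last few points are averaged over a partial window, so edge values are biased toward zero; a production implementation would handle the boundaries explicitly.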
4.1. Peak Finder
The primary idea behind this tool is to identify individual peaks in a light curve, which are often characteristic of microlensing events. After applying the low-pass filter, we bin the light curve in time, and then, in each bin, we count the number of data points with positive deviations larger than some number of standard deviations in that bin, which we define as the peak threshold. If a bin has more than one data point beyond that threshold, that bin is defined as a peak. When using this algorithm, two parameters are important in determining the success rate: the bin size and the peak threshold. Depending on the particular implementation, we can use this method to search for a single large peak, such as a single-lens microlensing event. Or, after identifying a single-lens event, we can subtract a model and then search for additional peaks in the light curve residuals that would indicate a binary-lens event. We can also search simultaneously for multiple large peaks at once. To examine the performance of this method in a particular case, we consider the case of searching for a single peak within a light curve. We test different values of the bin size and threshold parameters to determine what combination is most successful at detecting peaks. We define a successful detection as one where exactly one bin identifies a peak, and the light curve is known to be due to a microlensing event. We define a failed detection as one where a light curve labeled as containing a microlensing event did not have any peaks identified, or had multiple peaks identified. With this range of bin sizes, this algorithm will not detect narrow deviations in microlensing light curves, and will only look for the main event. Note that the algorithm is looking for single-peaked variations, so it might also find single-peaked transients. We expect that later stages of analysis will distinguish between these cases. This method has a low false-negative rate; it might, however, fail to detect a microlensing light curve at low thresholds when the event has a low magnification and is considered similar to the noise in the light curve. Figures 4, 5, and 6 show the performance of the algorithm across a range of bin sizes and thresholds. They display the True Positive Rate (TPR), False Positive Rate (FPR), and False Discovery Rate (FDR) for different options of bin size and peak threshold. Figure 4 shows that bin sizes larger than about 50 days and thresholds of 3σ to 5σ give the highest TPR of about 80%. Figure 5 shows the False Positive Rate, which represents what fraction of non-microlensing events are labeled as microlensing. Bin sizes do not consistently affect the FPR. Thresholds larger than 5σ give unreasonably low FPR, and that is because at these thresholds, most light curves will have zero peaks identified. Thresholds of 3σ to 5σ give a FPR of 50%. Since some fraction of non-microlensing events have a single peak, we expect that a group of non-microlensing events will be classified as microlensing even though the algorithm has successfully identified one peak in them. Figure 6 shows the False Discovery Rate, which represents the fraction of events labeled as microlensing that are in fact non-microlensing. For the same reason as mentioned above, thresholds larger than 5σ are not favorable, and thresholds of 3σ to 5σ have a reasonably low FDR of no larger than 7% for most bin sizes. Note that the extremely low FDR values are also an indication of the unbalanced dataset (a dataset in which there are unequal numbers of objects in each category), and our purpose is to compare how sensitive the FDR is to different thresholds and bin sizes. Using this peak finder algorithm with a bin size of 60 days and a threshold of 5σ, we can identify which light curves show a single-peak event. In Section 5, we use this feature in conjunction with the other feature detection tools.

Figure 4. Results of the Peak Finder algorithm applied to all of the light curves in the dataset for different values of bin size and threshold. The True Positive Rate (TPR), the fraction of microlensing events that are labeled as microlensing, is plotted versus bin size, with various thresholds denoted by symbol and color. The value of the TPR is roughly independent of the bin size, and thresholds of 3σ to 5σ yield the highest TPR.

Figure 5. As Figure 4, but showing the False Positive Rate (FPR), the fraction of non-microlensing events that are labeled as microlensing. The value of the FPR is only weakly dependent on the bin size for most thresholds.
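The binning-and-thresholding logic of the Peak Finder can be sketched as follows. The per-bin statistics (deviations measured from the in-bin median, in units of the in-bin standard deviation) and the function name are our assumptions; the paper does not specify these details:

```python
import numpy as np

def count_peak_bins(time, mag, bin_size=60.0, threshold=5.0):
    """Count bins in which more than one point brightens past `threshold` sigma.

    Brightenings are positive deviations in flux but negative in magnitude,
    so we measure (median - mag) in units of the in-bin standard deviation.
    """
    peak_bins = 0
    edges = np.arange(time.min(), time.max() + bin_size, bin_size)
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = mag[(time >= lo) & (time < hi)]
        if m.size < 2:
            continue
        dev = (np.median(m) - m) / np.std(m)  # positive = brighter than typical
        if np.sum(dev > threshold) > 1:       # more than one point beyond threshold
            peak_bins += 1
    return peak_bins
```

A light curve would then be flagged as a single-peak (microlensing-like) event when this count is exactly 1.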
4.2. Gould Two-parameter PSPL Fit
Figure 6. As Figure 4, but showing the False Discovery Rate (FDR), the fraction of events labeled as microlensing that are in fact non-microlensing. Overall, the values of the FDR are very low and also roughly independent of the bin size for most thresholds.

The PSPL function has been demonstrated to be a good fit to most microlensing light curves, and to not match most other types of astrophysical variability (Di Stefano & Perna 1997). Therefore, the goodness of fit to the PSPL model can be used as a feature of the light curve. In this section, we demonstrate employing a two-parameter PSPL model first introduced by Gould (1996) to detect curves similar to a Paczyński curve. The reason for not using the regular PSPL model is that brute-force searches over the three non-linear parameters (t_E, t₀, u₀) of a PSPL model are slow, while the two-parameter Gould approximate PSPL model (G-PSPL) captures the basic observables of a full PSPL model but requires many fewer computations. For more details refer to Gould (1996) and Woźniak & Paczyński (1997). In order to employ the PSPL model as a light curve feature, we use a version of the Gould (1996) two-parameter PSPL model (G-PSPL) that has been used by the KMTNet survey to detect microlensing events. For that survey, Kim et al. (2018) fit a combination of two two-parameter G-PSPL models: one with the assumption of u₀ = 1, and one with the assumption of u₀ being very close to zero. This approach reduces the number of fitted parameters, and can be used to detect both high- and low-magnification events. This double two-parameter G-PSPL fit is defined here as function F(Q):

F(Q) = f₁ × (A₁(Q) + A₂(Q)) + f₂    (4)

where

A₁(Q) = 1/√Q    (5)

and

A₂(Q) = 1/√(1 − Q⁻²)    (6)

are functions that represent good fits to high- and low-magnification events, respectively. The two parameters f₁ and f₂ do not have a physical meaning in this equation, but in the limits of u₀ = 1 and u₀ close to zero, they are related to the blending and source parameters, described in detail in Kim et al. (2018). A₁ and A₂ are both functions of Q,

Q(t) = 1 + ((t − t₀)/t_eff)².    (7)

Q is a function of time and depends on two parameters, t₀ and t_eff. t₀ is the time of the maximum magnification, and t_eff approximates u₀ × t_E in the limit of u₀ ≪ 1. To fit F(Q), we set the initial values of the parameters f₁ and f₂ to 0.5, and t₀ to the time of the maximum magnification in the light curve. For t_eff, we choose a list of seven initial values ranging from a fraction of a day up to 20 days. For high-magnification events, low values of t_eff usually provide a better fit, and for low-magnification events, larger values of t_eff are a better fit. The fit with the minimum χ² value is used as the selected model. Note that for the remainder of the paper, we specify whether we use a G-PSPL or a PSPL model fit.
If we identify a peak in a light curve and fit a two-parameter G-PSPL model to the data, we then employ a check on the symmetry of the event. In Section 4.2, we showed that this method can be used to detect microlensing events. Here, we calculate the reduced χ² statistic separately for the right and left sides of the peak in the light curve. If the peak is symmetric, the ratio (χ²_left)_reduced / (χ²_right)_reduced, which we define as β, is very close to unity. Since the numbers of data points in the right and left wings of the peak might not be exactly equal, we use the reduced χ². Deviations of β from unity can be related to additional physical details in the light curves, such as the existence of planetary or binary star companion lensing signatures, parallax effects, or the asymmetry of non-microlensing peaks like those of cataclysmic variables.

In Figure 7 we plot histograms showing the distributions of variability types for a set of ranges of β values. Each column represents events with a specific range of β. We find that for CV light curves, the asymmetry is extremely large, consistent with the morphology of CV light curves, which display steep rises and more gradual declines. Single-lens microlensing events are symmetric, with β values close to unity. Binary lenses have a wide range of β values depending on whether they are caused by a planetary system or a stellar binary system. These ratios can be used as a feature to obtain information about the kinds of deviations from the PSPL model, and to classify the light curves.

Figure 7. β is a measurement of the asymmetry of the peaks in the light curves that is obtained from the symmetry check algorithm. This plot shows the distribution of β for different classes of light curves in our dataset.

4.4. Trapezoidal Function Fit
As explained in Section 2.3, lensing events caused by free-floating planets have a short duration and low amplitude and are more often strongly affected by the finite source effect. Therefore, their shapes approximately resemble a trapezoidal function.

The parameters of the trapezoidal function are the baseline magnitude (a), the maximum magnitude (b), the time of the first rise (τ_1), the time when the maximum magnitude is reached (τ_2), the time when the maximum magnitude ends (τ_3), and the time when the magnitude returns to the baseline (τ_4). Figure 8 shows a diagram of the function and its six parameters. We must first find initial guesses for the six parameters, then fit the function and obtain more accurate estimates of the parameters. We also calculate the duration of the trapezoidal portion, Δτ_full, and the duration of the flat section of the trapezoid, Δτ_top, as presented in Equations 8 and 9 and shown in Figure 8:

Δτ_full = τ_4 − τ_1   (8)

Δτ_top = τ_3 − τ_2   (9)

Figure 8. A trapezoidal function with its defined parameters.

In order to find initial guesses for the six parameters, we first subtract the median baseline magnitude from the light curve, setting a to zero. We then select 20, 2, 0.2, or 0.02 days as initial guesses for the total duration (Δτ_full). We define t_0 as the time of maximum brightness, and assume Δτ_full = 2Δτ_top. We then define the remaining parameters τ_1, τ_2, τ_3, and τ_4 as the following quantities, respectively: t_0 − Δτ_full/2, t_0 − Δτ_full/4, t_0 + Δτ_full/4, and t_0 + Δτ_full/2. We fit the resulting models based on the initial guesses for Δτ_full, using a least-squares minimization approach. We finally select the fit that minimizes the χ² statistic.

We have applied this algorithm to our test dataset, with Figure 9 showing an example of a best-fit function for a simulated light curve. We find that half of the total duration (Δτ_full) is a good representation of the Einstein timescale (t_E) of the event, and we compare it with the true values of t_E of the events. Figure 10 shows the estimated versus true duration of single-lens microlensing events.
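The fit just described can be sketched with a piecewise-linear model and an off-the-shelf least-squares minimizer. A minimal illustration (function names are ours, and estimating the peak time from the midpoint of the brightest points is an implementation detail we assume):

```python
import numpy as np
from scipy.optimize import least_squares

def trapezoid(t, b, tau1, tau2, tau3, tau4):
    """Trapezoid with zero baseline (a = 0) and plateau height b."""
    return np.interp(t, [tau1, tau2, tau3, tau4], [0.0, b, b, 0.0],
                     left=0.0, right=0.0)

def fit_trapezoid(t, flux):
    """Least-squares trapezoid fit (baseline already subtracted).

    Tries each of the initial total-duration guesses quoted in the
    text (20, 2, 0.2, 0.02 days) with dt_top = dt_full / 2, and
    keeps the fit with the smallest sum of squared residuals.
    """
    t0 = float(np.median(t[flux >= 0.99 * np.max(flux)]))  # midpoint of brightest points
    b0 = float(np.max(flux))
    best = None
    for dt in (20.0, 2.0, 0.2, 0.02):
        x0 = [b0, t0 - dt / 2, t0 - dt / 4, t0 + dt / 4, t0 + dt / 2]
        res = least_squares(lambda p: trapezoid(t, *p) - flux, x0)
        if best is None or res.cost < best.cost:
            best = res
    return best.x  # [b, tau1, tau2, tau3, tau4]
```

The returned breakpoints give Δτ_full = τ_4 − τ_1 and Δτ_top = τ_3 − τ_2 directly.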
The two agree well for short-timescale events, to within about 0.06 days, while for events with larger t_E the trapezoidal estimate becomes less reliable. We define

κ = Δτ_top / Δτ_full

as the trapezoidal timescale ratio, which is the ratio of the duration of the flat part of the trapezoid to the total duration of the trapezoid. This quantity was initially set to 0.5 for the initial guesses of the function fitting. This parameter can be thought of as a measure of how square or peaked the event is, analogous to the kurtosis of the event. We show that using κ along with Δτ_full allows us to flag the events that are strongly affected by the finite source effect. We can characterize the approximate significance of the finite source effect using |u_0|/ρ, the ratio of the impact parameter to the finite source ratio. When |u_0|/ρ is smaller than unity, it implies the lens passes directly over the source, and thus that can be a sign of significant finite source effects.

The left panel of Figure 11 shows the trapezoidal timescale ratio (κ) versus Δτ_full for the subset of all light curves that meet the fitting criteria for single-lens models, which are described below in §5. Note that both of the quantities on the axes in the left panel are found experimentally using the trapezoidal function fit. In that panel, most of the green data points have larger values of κ, which implies a more square light curve shape caused by strong finite source effects. The right panel shows the distributions of the true values of ρ for all the data points inside and outside of the black box in the left panel. That panel indicates that light curves with small values of Δτ_full and large values of κ tend to have larger values of ρ (grey), and light curves with larger values of Δτ_full and smaller values of κ tend to have smaller values of ρ (hatched white). This is because events with short durations caused by free-floating planets are more affected by the finite source effect, and therefore more closely resemble a trapezoidal function, with Δτ_full and κ helping to indicate these events.

Figure 9. Trapezoidal function fitted to the light curve of a simulated event caused by a free-floating planet. The flat top of the trapezoid describes the duration of maximum brightness fairly well, and the duration of the event is well represented by the full trapezoidal duration.

Figure 10. True versus estimated Einstein timescale (t_E) of the events.

4.5. Cauchy Distribution Fit
For microlensing events with a strong finite source effect, in the absence of limb darkening effects, the top of the peak becomes flatter and will not look like a PSPL function. To detect this phenomenon, we need to parameterize the flatness of the top of the peak. Effectively, we would like to use a functional fit that can apply to the range of morphologies between that of a single-lens PSPL model and a trapezoid that much more closely fits a strong finite source effect. Our goal here is to find a correlation between the fitted parameters and the finite source ratios. For that purpose, we use a Cauchy distribution function to fit the light curves. The difference in the minimum χ² of the Cauchy fit and the PSPL fit (shown in Equations 3 and 1), along with one of the parameters of the Cauchy distribution, can be used as features.

The Cauchy distribution shown in Equation 10 has four parameters: the time of the peak (t_0), the duration of the peak (σ), the flatness of the peak (b), and the amplitude of the peak (a):

C(t) = a / (1 + |(t − t_0)/σ|^b).   (10)

The functions C(t) and PSPL have two common parameters, the time and duration of the peak. After fitting these two functions to a light curve, we have fitted values for t_E and t_0, which we refer to as t_{E,PSPL}, t_{E,Cauchy}, t_{0,PSPL}, and t_{0,Cauchy}. Note that t_{E,Cauchy} is equivalent to σ in Equation 10. We define the difference in the χ² of the PSPL fit and the Cauchy fit as a feature characterizing the flatness of the top of the curve, represented in Equation 11. In order to calculate ψ, we select a section of the curve at the peak, between t_0 − (t_{E,PSPL} × u_{0,PSPL}) and t_0 + (t_{E,PSPL} × u_{0,PSPL}):

ψ = χ²_PSPL − χ²_Cauchy.   (11)

The more the event is affected by the finite source effect, the more the top of its peak deviates from the PSPL model and the more similar it is to the Cauchy model.
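Given model light curves from the two fits, the flatness feature is a simple difference of χ² values. A minimal sketch (function names are ours; for simplicity the χ² here is unweighted and the peak-section selection is left to the caller):

```python
import numpy as np

def cauchy_profile(t, a, t0, sigma, b):
    """Generalized Cauchy peak of Eq. (10); larger b gives a flatter top."""
    return a / (1.0 + np.abs((t - t0) / sigma) ** b)

def flatness_feature(flux, model_pspl, model_cauchy):
    """psi = chi2(PSPL) - chi2(Cauchy) over the supplied peak section.

    psi > 0 means the flat-topped Cauchy profile is the better fit.
    """
    chi2_pspl = np.sum((flux - model_pspl) ** 2)
    chi2_cauchy = np.sum((flux - model_cauchy) ** 2)
    return float(chi2_pspl - chi2_cauchy)
```

In practice the two model arrays would come from the PSPL and Cauchy fits restricted to the peak interval defined above.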
Thus, for single-lens events with a flatter peak, ψ is positive. Even in cases where the Cauchy distribution fits much better than the PSPL fit, for events that are not caused by free-floating planets but that still experience strong finite source effects, the Cauchy distribution will typically be a better fit than the trapezoidal function. Figure 12 shows a single-lens event highly affected by the finite source effect, along with the best-fit PSPL and Cauchy models. The two curves deviate the most close to the peak of the event.

Figure 13 shows values of the Cauchy feature (ψ) versus ρ/u_0 for all single-lens events that have at least four observations within a time of t_{E,PSPL} × u_{0,PSPL} from the maximum. The upper panel shows positive values of ψ and the lower panel shows the negative values. The events with flatter peaks have a positive value of ψ and, on average, have a shorter duration than those with negative ψ. Events with large positive ψ (flatter peaks) appear in the upper right of this figure and have ρ/u_0 > 1, while events with negative ψ include both short and long duration events. The black lines show the one-to-one relation for events with positive and negative ψ, with median absolute deviations of 0.45 for positive ψ and 0.64 for negative ψ. Events in the lower region mostly have small negative values of ψ, indicating that neither the PSPL nor the Cauchy function is a significantly better fit to the data. Those with large negative values are those with sharp peaks that are well described by the PSPL function, which also have low u_0. This plot shows that ψ can help identify the flattest and sharpest microlensing peaks, and flag the ones that are more likely affected by the finite source effect.

Figure 11. Left Panel: Values of the trapezoidal timescale ratio (κ) are plotted versus the total trapezoidal duration (Δτ_full). The purple data points show events with |u_0|/ρ > 1, and the green data points represent events with |u_0|/ρ < 1, which indicates events with free-floating planets or other PSPL events affected by the finite source effect. κ is larger for shorter events, and most of the events with |u_0|/ρ < 1 have large κ and small Δτ_full. Therefore, the selected ranges of κ and Δτ_full can be used to identify likely events caused by free-floating planets. Right Panel: The distribution of |u_0|/ρ for all data points inside (grey) and outside (hatched white) of the black box in the left panel. Light curves with small values of Δτ_full and large values of κ tend to have smaller values of |u_0|/ρ, and light curves with larger values of Δτ_full and smaller values of κ tend to have larger values of |u_0|/ρ.

Figure 12. Cauchy and PSPL functions fitted to a Roman simulated single-lens microlensing light curve highly affected by the finite source effect. The blue data points are from the simulated light curve, the red curve is the fitted PSPL model, and the orange curve is the fitted Cauchy model.

The other feature that we use for this purpose is the parameter b in Equation 10. We expect that if the event exhibits strong finite source effects, which means a flatter top, this feature would take a larger value, and if the finite source effect is negligible, the feature would be very close to unity (in the absence of limb darkening effects). In reality, b does not correlate with ρ for most events, but once ρ is greater than 0.1, we do see a positive correlation with b, as seen in Figure 14. We therefore retain b as a potentially useful feature.

4.6. Planetary Parameters Finder
Our goal in this section is to measure the two physical parameters associated with planetary binary-lens events. These are s, the projected separation between the planet and the lens star, and q, the mass ratio of the planet and the lens star. The approach that we take here is to first fit the main event with a PSPL function, then find the deviations from the PSPL model by looking at the residual to identify peaks or troughs. We then seek to identify cases where these residuals have one significant peak or trough, or whether they have two significant peaks.

The reason we take this approach is that the single-deviation residuals can be characterized much more robustly than the double-peaked ones. For events of both groups, we fit the "busy" function introduced by Westmeier et al. (2014). This function is primarily used to describe double-peaked features in spectra, and with some changes in its parameters, we can also use it to fit single-peaked deviations. Khakpash et al. (2019) suggest fitting the single-peaked residuals with a Gaussian, and show that this approach is useful mostly for low mass-ratio events. Here, we take the same approach to find the initial parameters in the residual, but instead of a Gaussian, we fit all deviations with the busy function. After fitting the residual, in the next round of fitting, we fit a PSPL plus a busy function, and we use parameters obtained from the first two fits as initial parameters of this fit. Next, we calculate s and q using the parameters found by the final fit.

The busy function is commonly used to describe the double-horn profile of galaxy spectra (Westmeier et al. 2014). The function is shown in Equation 12, and has nine parameters, which are sketched in Figure 15. This function comprises two error functions and a polynomial of degree n.
The parameters x_e and x_p determine the locations of the middle of the two error functions and the middle of the polynomial, δ_1 and δ_2 are the steepnesses of the two error functions, n is the degree of the polynomial and determines the steepness of the middle trough, c determines the depth of the polynomial, w determines the distance of the error function zero points from x_e, a determines the height of the error functions, and ε is the horizontal scaling of the function:

A(t) = (a/4) × (erf[δ_1 (w + ε t − x_e)] + 1) × (erf[δ_2 (w − ε t + x_e)] + 1) × (c |ε t − x_p|^n + 1).   (12)

Depending on the values of the nine parameters, the shape of the function can differ significantly. We take advantage of this fact and use different shapes of the function to fit single-peaked and double-peaked deviations from the PSPL model. The two forms of the busy function used to fit the residuals are shown in Figure 16. The left panel shows a double-horn shape of the busy function, and the values of the parameters can be found on the plot. The three curves represent the shapes of the function when only the steepnesses of the two error functions are altered. This set of parameters results in a form that can describe the caustic-crossing features of planetary microlensing light curves. In the right panel, the value of c in Equation 12 is set to zero, and therefore the equation only comprises the two error functions. The shape of the function resembles the Gaussian function used by Khakpash et al. (2019) to fit single-peaked deviations, with an additional ability to describe asymmetric deviations. The three curves show how the shape of the function changes as the steepnesses of the two error functions are altered.

We apply this approach to the set of simulated Roman planetary microlensing light curves. We follow the procedure in Khakpash et al. (2019), and first fit a PSPL function as shown in Equations 1 and 3 to the light curves.
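Equation 12 translates directly into code. A minimal sketch (the function name and argument order are ours):

```python
import numpy as np
from scipy.special import erf

def busy(t, a, delta1, delta2, w, eps, xe, xp, c, n):
    """Busy function of Eq. (12) (Westmeier et al. 2014).

    Two error functions set the rising and falling edges; the
    polynomial term carves the central trough. Setting c = 0
    leaves a single (possibly asymmetric) peak.
    """
    rise = erf(delta1 * (w + eps * t - xe)) + 1.0
    fall = erf(delta2 * (w - eps * t + xe)) + 1.0
    trough = c * np.abs(eps * t - xp) ** n + 1.0
    return (a / 4.0) * rise * fall * trough
```

With c = 0 the function is a single flat-topped peak, and asymmetric choices of δ_1 ≠ δ_2 skew its two sides, matching the two regimes described above.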
We find an initial value for t_0 by finding the time of the maximum flux in the light curve, and we set the initial value of f_s to be 0.5. For an initial guess for the duration of the event, t_E, we first interpolate between the data points using a cubic interpolation to estimate a continuous version of the data; then, for events with u_{0,initial} < 0.5, we assume t_E is the interval between the times when the magnification is 1.34. For events with u_{0,initial} > 0.5, we set t_E equal to the interval between the times when the magnification is 1.06. We use Equations 1 and 3 to calculate an initial guess for u_0 based on the above assumed values at time t = t_0. Then, we fit the PSPL function using the initial guesses, and then apply the peak-finding algorithm to the residual.
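The t_E initial guess amounts to a threshold-crossing measurement on the magnification curve. A minimal sketch (the helper name is ours; we take the full interval above the threshold as the guess, following the prescription in the text, and omit the cubic interpolation step for brevity):

```python
import numpy as np

def estimate_tE(t, mag_curve, u0_initial):
    """Initial tE guess from threshold crossings of the magnification.

    A = 1.34 (the magnification at u = 1) is used for u0 < 0.5, and
    A = 1.06 (the magnification at u = 2) for u0 > 0.5; the guess is
    the time interval spent above that threshold.
    """
    threshold = 1.34 if u0_initial < 0.5 else 1.06
    above = t[mag_curve >= threshold]
    return float(above[-1] - above[0]) if above.size else float("nan")
```

For a low-u_0 PSPL event this interval is close to 2 t_E, so a constant rescaling may be applied before using it as a starting value.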
Figure 13.
The Cauchy feature (ψ) plotted versus ρ/u_0, restricted to cases with at least four observations near the peak maximum. The colors of the points represent the PSPL-estimated event duration t_{E,PSPL}, while the sizes of the points represent the PSPL-estimated value of u_0 (u_{0,PSPL}). The black trend lines show a one-to-one relation between ψ and ρ/u_0 (reversed in the bottom panel).

Figure 14.
Plot of the parameter b in Equation 10 versus the true value of ρ. It shows that the b values are in general larger for events with ρ larger than 0.1.

Next, we use the peak finder (§4.1), starting with a large bin size and a detection threshold expressed in σ. With these values, the algorithm finds the most significant deviation. Then, we decrease the bin size by 50% until it either finds two peaks or the bin size reaches 0.2 days. At any of these decreasing steps, if it finds two peaks that are separated by less than 10 days, we end the search. This condition aims at preventing the algorithm from selecting multiple peaks in a caustic-crossing microlensing event. If the search concludes with the detection of one deviation, we accept that and move on to the next step. If it finds zero deviations, we simply select the maximum or minimum data point in the residual and identify that as a single deviation.

Figure 15. A schematic plot of the busy function, with features associated with eight function parameters. The parameter ε (not shown on the figure) represents a horizontal scaling of the function without disrupting its shape.

At this point, we fit the modified busy function to the peaks identified in the residual. For this purpose, we identify the time of the single deviation, or the middle point of the double-peaked deviation, as initial guesses for x_e and x_p. We set these two values equal to zero, the values of δ_1 and δ_2 are set equal to 1, and the value of ε is set to 5. Parameters a, w, n, and c are set up differently depending on the number of deviations found in the residual. If there is one deviation, n and c are set equal to zero, a is set equal to the largest peak or trough in the residual, and w is set equal to unity. If there are two deviations, the initial values of n and c are set to 10 and 0.02, the value of a is set equal to the median of the residual values between the two peaks, and w is set equal to half of the distance between the two peaks.

Parameters describing the deviations in the residual are obtained by the busy function fit. Using these parameters along with the PSPL parameters as initial values, we then fit a PSPL plus a busy function to the light curve, and we find final values of 13 parameters: four of them describing the main event, and nine of them describing the deviation. Using these values, we calculate the two physical parameters s and q using two different approaches.

When only one deviation is found, the busy function reduces to two error functions with six parameters. After fitting the function to the deviation, we determine the duration of the deviation by looking at where the fitted model is not zero, and we assume half of that to be the duration t_{Ep}. t_p is set to x_e, and then we find the two values of s by solving Equation 13. If the deviation is positive, a secondary check is done to find whether there is a large trough close to this peak. If there is a large de-magnification close to a single caustic crossing, the event is very likely to have s < 1; otherwise, s > 1. q is then obtained from Equation 14:

|s − 1/s| = u,  where  u = √(u_0² + ((t_0 − t_p)/t_E)²),   (13)

q = (t_{Ep}/t_E)².   (14)

When there are two deviations, we assume that the deviations are caused by crossing the planetary caustic, and that these epochs correspond to when the source approaches two cusps of the caustic. Using these epochs, we calculate the distance of the source from the lens at these times using u_{1,2} = √(u_0² + ((t_0 − t_{1,2})/t_E)²). Han (2006) calculates the size of the planetary caustics in terms of s and q; we find the size of the caustic using simple geometry, and then we use their formulation to calculate s and q.

Figure 17 shows an example of the geometry involved in these lensing events.
In this figure, the planetary caustics of two systems with projected separations of 1.6 and 0.8 and a mass ratio of 0.03 are shown. The shapes and sizes of the caustics depend on s and q. An example path of the source through the system is shown in both panels. The distance between the lens and the center of the caustics, LX, gives us the value of |s − 1/s|. At this point, we do another secondary check, as described earlier, and determine whether we have a system with s > 1 or s < 1. Next, for systems with s > 1, we find the vertical dimension of the caustic (left panel of Figure 17), and for systems with s < 1, we find the distance between the two caustics by geometry (right panel of Figure 17). We refer to both of these values as the height of the caustics. Han (2006) provides approximate analytic expressions for these heights in terms of s and q, and we invert those expressions to find q. Table 1 summarizes the process we adopt to extract the system parameters from the light curve residuals in the cases of one and two deviations (Poleski et al. 2014; Bozza 2000).

Figure 16. Two different shapes of the busy function that are fitted to the single- and double-peaked residuals. Left Panel: A double-horn shape of the busy function (x_e = x_p = 0, a = 2, n = 4, w = 7, c = 0.002, ε = 5). The three curves represent the shapes of the function when only the steepnesses δ_1 and δ_2 of the two error functions are altered. Right Panel: The same parameters with the value of c in Equation 12 set to zero, so that the equation comprises only the two error functions and no polynomial. Only one peak is present, with the two error functions controlling the steepness of the two sides of the peak. The three curves show how the shape of the function changes as the steepnesses of the two error functions are altered.

Figures 18, 19, and 20 show three examples of the PSPL plus the busy function fitted to planetary light curves. Figure 18 shows the fit for a system with one deviation in its PSPL residual, where the true projected separation of the system is smaller than unity. The algorithm first detects sharp peaks or troughs, and after fitting the busy function to the light curve residuals from the PSPL fit, we then obtain the residual of the busy function fit. At the times where a single peak is found, we still cannot conclude that the event includes a major image perturbation. At this point, if the sum of the flux of the negative points in the busy function fit residual is larger than the sum of the flux of the positive points, the system is considered to be affected by a minor image perturbation, and s is chosen to be less than one. Otherwise, s is taken to be larger than one. The example in Figure 18 shows an event with a sharp positive deviation that appears to be a major image perturbation; however, the preceding trough indicates that this is a minor image perturbation, and the algorithm correctly decides that s is smaller than one. The true projected separation for this event is s = 0.63, and the estimated value is s = 0.64.

Figure 19 shows a double-peaked event for which the secondary check indicates s < 1. The busy function fit may appear to be a poor fit to the deviations, but this is not considered a failure; in fact, it helps with determining whether the deviation is a major or a minor image perturbation. After the secondary check, the algorithm correctly chooses s smaller than one. The true projected separation for this event is s = 0.62, and the estimated value is s = 0.59. Figure 20 shows a double-peaked event with s > 1; the true projected separation is s = 1.11, and the estimated value is s = 1.02.

We compare the estimated values of s and q with their true values, as demonstrated by Khakpash et al. (2019). Figure 21 shows the estimated values of s and q plotted versus their true values. Most values of s are well estimated by the method; the fitted values close to one arise from events caused by the central caustics. While the estimated values of q show a lot of scatter compared to the true values, they are generally estimated to within about one order of magnitude of the true values. Note that this algorithm is slower than the algorithm presented in Khakpash et al. (2019), but it fits a broader range of events.

4.7. Chebyshev Polynomials Fit
Di Stefano & Perna (1997) suggest that, unlike binary-lens microlensing light curves that are significantly perturbed, smooth binary-lens events can be easily misclassified, and that it is therefore useful to develop methods that can distinguish between this type of microlensing event and other types of similar variability. The method should be fast and effective at flagging light curves with smooth features.

Figure 17. Left panel: Planetary caustics of a system with a projected separation of 1.6 and a mass ratio of 0.03 (red). The shapes and sizes of the caustics depend on s and q. Right panel: Planetary caustics of a system with a projected separation of 0.8 and a mass ratio of 0.03 (red). At this range of s, there are two smaller planetary caustics. The distance between the lens and the center of the planetary caustic is equal to |s − 1/s|. The solid black line with the arrow in each panel represents the path of the source through the caustics. The left panel is a reproduction of a similar figure in Poleski et al. (2014).

Table 1. The process taken to calculate s and q.

One Deviation:
1. t_p is determined.
2. u = √(u_0² + ((t_p − t_0)/t_E)²)
3. |s − 1/s| = u is solved for s.
4. A secondary check determines whether s > 1 or s < 1.
5. q = (t_{E,p}/t_E)².

Two Deviations:
1. t_1 and t_2 are determined.
2. u_{1,2} = √(u_0² + ((t_{1,2} − t_0)/t_E)²)
3. LX is found by geometry.
4. |s − 1/s| − LX = 0 is solved for s.
5. A secondary check determines whether s > 1 or s < 1.
6. For s > 1: the height of the caustic (Han 2006 approximation) ⇒ q is found.
7. For s < 1: the distance between the caustics (Han 2006 approximation) ⇒ q is found.

One of the approaches to approximating a function is to expand it as a series of polynomials. For this purpose, Di Stefano & Perna (1997) suggest using the Chebyshev approximation on microlensing light curves. The idea is that the Chebyshev polynomials are useful for capturing the smooth features of binary-lens microlensing light curves without caustic crossings. We have implemented the method of Di Stefano & Perna (1997) and applied it to our simulated data set. For this purpose, we first detect the peaks in the light curve; then, for each peak, we choose the interval around the peak that includes the wings of the event up to a magnification of 1.06, which corresponds to an impact parameter of 2 R_E (Di Stefano & Perna 1997).
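The Chebyshev feature extraction can be sketched with NumPy's Chebyshev utilities (names are ours; we use `numpy.polynomial.chebyshev` in place of the Press et al. (1992) recipes, and note that numpy's convention absorbs the −c_0/2 term of Equation 15):

```python
import numpy as np

def chebyshev_features(t, flux, m=50):
    """Fit Chebyshev polynomials of degree m to a light-curve segment.

    The times are rescaled to [-1, 1], where the polynomials are
    defined, and the feature Lambda is built from the coefficients
    normalized by c0 (the k = 0 term contributes the subtracted 1).
    """
    x = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0  # map times to [-1, 1]
    c = np.polynomial.chebyshev.chebfit(x, flux, m)      # m + 1 coefficients
    lam = -np.log(np.sum((c / c[0]) ** 2) - 1.0)
    return c, lam
```

The coefficient array can also supply the individual even-numbered coefficients used later as classification features.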
Figure 18. An example of the PSPL plus the busy function fitted to the light curve of an event with a true projected separation of s = 0.63. The secondary check determines that s should be less than unity, and the estimated value is s = 0.64.

Figure 19. An example of the PSPL plus the busy function fitted to the light curve of a double-peaked event with a true projected separation of s = 0.62. The secondary check determines that s should be less than unity, and the estimated value is s = 0.59.

We then find the coefficients of the Chebyshev polynomials using the formulation described by Press et al. (1992) and expand the selected parts of the light curves using the expression in Equation 15. The polynomial T_k has k + 1 extrema in the interval [−1, 1] where the Chebyshev polynomials are defined. To use this method, the time coordinates of the selected portion of our light curves should be within this interval, and so we convert the interval of time to be between −1 and 1:

f(x) ≈ [Σ_{k=0}^{m} c_k T_k(x)] − c_0/2.   (15)

In this work, we choose the first 50 polynomials of this series (m = 50) to fit Equation 15 to the light curves, and then, using the coefficients c_k, we calculate

Λ = −log(Σ_{k=0}^{m} (c_k/c_0)² − 1)

as a feature parameter that can be used to distinguish between different light curve types in our simulated data set.

Figure 20. An example of the PSPL plus the busy function fitted to the light curve of a double-peaked event with a true projected separation of s = 1.11. The estimated value is s = 1.02.

In Figure 22, an example of a binary-lens event approximated by Chebyshev polynomials of degree 50 can be seen. It is important to note that the whole light curve is not approximated in this method; only the segment containing the event, excluding gaps, is selected. This is because the Chebyshev approximation is good for approximating functions that have a finite number of extrema in the interval [−1, 1].
Figure 21. The left panel shows the fitted values of q versus the true values; fitted values of q are mostly within one order of magnitude of the true values for the PSPL plus busy function fits (MAD = 1.15). The right panel shows fitted s versus the true values (MAD = 0.13). Most values of s are well estimated by the method; the fitted values close to one occur for events caused by the central caustics, for which our formalism breaks down. Blue dots are events without caustic crossings, and black dots are events with caustic crossings.

Figure 22.
The stellar binary-lens light curve is approximated with the Chebyshev polynomials of degree 50.

Since microlensing events are closer to being an even function than an odd function, the expansion only involves the even terms of the Chebyshev polynomials.

We then check how well different values of Λ match different light curve types. Figure 23 shows that the value of Λ correlates fairly well with four different light curve types. Note that the fractions are calculated for a given range of Λ across each light curve type, horizontally in the figure. Although the fractions of different variability types in the simulated Roman data set are not equal, there are large numbers of each variability type, so the distinctions seen in Figure 23 are still quite robust. We therefore expect that the continuous parameter Λ can be a useful input for an ML algorithm (such as a random forest) to classify event types.

In addition to the feature Λ, we also follow the suggestion of Di Stefano & Perna (1997) to include the individual even-numbered coefficients of the Chebyshev polynomials as classification features.

5. AN ALGORITHMIC APPROACH

In this section, we introduce a suggested algorithmic approach to use the produced features as input to ML classifiers. The goal of this section is to provide an efficient procedure to use these features to train classifiers on a high-cadence data set and detect different types of microlensing light curves. In Table 2, we give a summary of the tools in our package and the associated features. Each of these tools can be applied either to the whole dataset or to a subset of it, and there may be more than a single algorithmic approach to using these features.

Table 2. Summary of the introduced algorithms and their resulting features.

Peak Finder: Number of peaks.
G-PSPL Fit: χ², the goodness of the G-PSPL fit.
Symmetry Check: β, a measure of the asymmetry of the peak.
Trapezoidal Function Fit: κ, the ratio of the duration of the flat top of the trapezoidal function to the total duration; t_{E,trap}, the duration of the trapezoidal function.
Cauchy Distribution Fit: ψ, the difference between the goodness of the PSPL and Cauchy fits; b, a measure of the flatness of the top.
Planetary Parameter Finder: s, the projected separation in units of the Einstein radius; q, the mass ratio.
Chebyshev Polynomials Fit: Λ, a feature computed from the normalized Chebyshev coefficients; the first five even coefficients of the Chebyshev polynomials.

Figure 23. This figure shows the breakdown by variability type for each range of values of the Chebyshev feature Λ, where Λ = −log(Σ_{k=0}^{m} (c_k/c_0)² − 1).

In a paper under development (Khakpash et al., in preparation), we are implementing an algorithmic approach to employing the features discussed in this paper to comprehensively search for microlensing events, classify them by type, and derive preliminary system parameters. For any set of light curve features, there are numerous ways to conduct classification, and our approach is by no means certain to be optimal in every way. We encourage other astronomers to make use of the features described here to develop independent classification methods.

The particular structure or sequence of a classification approach can vary. One might use all the features to detect all the categories at once, or alternatively, find particular classes by doing a step-by-step classification. In an ideal case where there are thousands of examples of each class across a full range of empirical properties of each feature, it is likely better to use all the features to find all of the classes in a one-step classification. Since in this case we have very limited examples of some of the classes, we recommend a step-by-step classification approach.

The first step of classification is to distinguish microlensing light curves from other types of variability (e.g., CVs in our dataset). We refer to this step as classification step I. Although our goal in this work is not focused on this set, we believe some of the features identified in this work can be used to improve currently existing classifiers focused on this task. We have tested this type of classification using features such as the number of peaks found by the Peak Finder; Λ and several of the even Chebyshev coefficients produced by the Chebyshev fit; and χ²_PSPL, t_{E,PSPL}, and β generated by the G-PSPL fit and the Symmetry Check. Assuming we have identified all the microlensing light curves, the next type of classification is to classify them into single-lens versus binary-lens events, which we refer to as step II.
Note that in a real dataset, multiple-lens events also exist; these can either be added as a separate category or included in one category along with the binary-lens events. We use the same features as in step I, excluding the number of peaks.

Once we have single-lens and binary-lens events, we classify the binary-lens events into stellar binary-lens and planetary binary-lens systems (classification step III) using χ²_PSPL and t_E,PSPL produced by the PSPL fit, and s and q produced by the PSPL plus busy fit. Furthermore, single-lens events can be classified into isolated short-timescale events, likely caused by free-floating planets or planets on very wide orbits; stellar PSPL events; and stellar FSPL events (classification step IV). We suggest using κ, Δτ_full, ψ, and b produced by the Trapezoidal Fit and the Cauchy/PSPL Fit to obtain better results. This step will be investigated in future work.

Figure 24 displays a diagram showing our suggested algorithmic approach to using the different features produced in this paper as input to four ML classifiers: a k-nearest neighbors classifier, a decision tree classifier, a random forest classifier, and a neural network classifier. Preliminary results are presented in the next section.

6. PRELIMINARY TESTING OF ML ALGORITHMS

As stated above, there are many ways to implement an approach to identify and characterize microlensing events with classification based on light curve features. We are developing a thorough investigation of that question (Khakpash et al., in preparation), but for now we present a preliminary analysis using a few simple analysis steps based on the classification approach described in §
5. We present the results of training four ML classifiers: a k-nearest neighbors classifier (KNN), a decision tree classifier (DT), a random forest classifier (RF), and a neural network classifier (NN), using the features we introduced in this paper. In order to test these algorithms, we follow the step-by-step scheme of Figure 24. It is important to note again that step IV of the classification in Figure 24 is not tested here and will be pursued in future work. At each step, we set the size of our test set to be 20% of the whole dataset. The test set is then randomly chosen in a 5-fold cross-validation process, and the average scores are reported at the end.

In order to compare the results of these classifiers, we show confusion matrices made with the test set at each step, along with their receiver operating characteristic (ROC) curves. As mentioned in Section 5, we understand that the one-step classification is a more ideal approach, and we tested it with our current data set. However, the limited number of object instances in each class was insufficient for achieving an overall acceptable accuracy. Nevertheless, the isolated confusion matrices of the step-by-step classifications are valuable for evaluating the utility of the features presented in this paper. We plan to thoroughly investigate the one-step classification in a future paper.

At each step, we use the same dataset and features for all four classifiers, but we optimize their hyperparameters separately. The RF, KNN, and DT classifiers are implemented using the scikit-learn package in Python (Pedregosa et al. 2011). The NN classifier is implemented using the TensorFlow and Keras packages in Python (Abadi et al. 2015; Chollet et al. 2015). A summary of the features and hyperparameters of each classifier at each step is given in Table 3.

6.1. Classification Step I
As shown in Figure 24, the first step of the algorithmic approach is to detect microlensing light curves among other stellar variability. We should first note that our current light curve data set is not completely representative of what we expect from the Roman mission, since the non-microlensing variability in our current dataset consists only of cataclysmic variables. Most (but not all) forms of non-microlensing variability are expected to be periodic or quasi-periodic, and thus simply including CVs is a good starting point. Our dataset for this step contains 4181 light curves, among which there are 3752 microlensing light curves (labeled as 1) and 429 non-microlensing light curves (labeled as 0).

We find that the test set and training set accuracies for all four classifiers in this step are very close. A common way to evaluate the results of a classification model is to plot its confusion matrix, a table containing the percentages of both correctly and incorrectly classified objects for each class in the dataset. According to the confusion matrices shown in Figure 25, most of the ML tools can find the microlensing light curves with very small classification error, whereas about 40% of the CV light curves are misclassified. This could be a result of having a small training set or incomplete hyperparameter tuning, or it might indicate the need to include more features.

A more robust method of comparing different ML classifiers is to plot their ROC curves and calculate their values of the area under the curve (AUC); the closer the AUC is to unity, the better the performance of that classifier. An ROC curve plots the true positive rate (TPR) versus the false positive rate (FPR) for different decision thresholds, and the AUC is the area under that curve. Figure 26 shows the ROC curves of the four classifiers trained in step I. The diagonal dashed line represents the ROC curve of a random classification.
The ROC curves show that the NN and RF classifiers have better performance and can achieve a higher TPR while keeping the FPR low.

6.2.
Classification Step II
The second classification step distinguishes between single-lens and binary-lens microlensing light curves, as shown in Figure 24. Our dataset for this step contains 4181 light curves, among which there are 2143 binary-lens microlensing light curves (labeled as 1) and 1626 single-lens microlensing light curves (labeled as 0). For this step, we find that DT and RF have similar training and test accuracy and seem to work better than NN and KNN, although NN works much better than KNN.
Figure 24.
This diagram shows an algorithmic approach for using the features described in this paper as input to ML classifiers.

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 25.
Confusion matrices of the four trained classifiers of classification Step I. In this step, we aim to classify all of the light curves into two classes, CV and microlensing.
[ROC curve legend: KNN, AUC = 0.794; DT, AUC = 0.849; RF, AUC = 0.904; NN, AUC = 0.978.]
Figure 26.
ROC curves of the four trained classifiers of classification Step I along with their AUC values. In this step, we aim to classify all of the light curves into two classes, CV and microlensing. RF and NN have the largest AUC, implying that they are able to achieve a higher TPR while the FPR remains low.

Figure 27 shows confusion matrices of the four classifiers trained for this step. The confusion matrices suggest that RF works better at predicting both labels, whereas NN works best at finding a larger fraction of the binary-lens light curves. Figure 28 shows the ROC curves and AUC values for this step. The classifiers overall work better in this step, mainly because the dataset is more balanced here.

6.3.
Classification Step III
After finding the binary-lens light curves, the next classification step distinguishes between stellar binary-lens and planetary binary-lens microlensing light curves, as shown in Figure 24. Our dataset for this step contains 2074 light curves, among which there are 688 planetary binary-lens microlensing light curves (labeled as 1) and 1386 stellar binary-lens microlensing light curves (labeled as 0).

This classification step is particularly important in this context, since planetary binary-lens systems are the ones that astronomers would most like to distinguish from the rest of the dataset. In our tested example, the number of light curves is lower than in the previous classification steps, which decreases the accuracy of the classifiers; accordingly, we find that the overall accuracy values for this step are smaller than in the previous steps. Additionally, stellar and planetary binary-lens systems are much less distinguishable from each other than the classes in the previous tasks, and for this reason the simpler algorithms, DT and KNN, have lower accuracy. RF and NN have higher overall accuracy, but NN has a higher test accuracy, meaning a larger fraction of the test set is correctly labeled. It seems that an algorithm like NN is more capable of distinguishing between these two categories, which is expected, as NN is theoretically more complex and is designed to find complicated patterns in a data set.

Figure 29 shows confusion matrices of the four classifiers. The confusion matrix of the NN shows a larger value of the TPR compared to all the other confusion matrices, which is favorable since our ultimate goal is to detect planetary microlensing light curves. Figure 30 shows the ROC curves and AUC values of the four classifiers in step III. The dataset in this step is smaller than in the other two steps and is not well balanced; therefore, the performances of the different classifiers are not as well differentiated as in the other steps.
However, NN shows significantly better performance compared to the other classifiers.

7. CONCLUSION AND FUTURE WORK

Classifying light curves that manifest different types of stellar variability is still a major challenge. Although ML methods are a powerful tool for classifying astronomical time series, it is important to understand which method to choose and what steps to take in order to use these methods effectively. In this paper, we introduce a new approach to classifying microlensing light curves based on light curve morphologies. We introduce a package of tools, including several functional fits, that can be applied to a light curve in a fast and efficient way to extract information about its different morphological features. This information is quantified as features that can be used to make decisions about the light curve types.

This approach will be useful for analyzing large microlensing data sets from future high-cadence surveys like Roman (Spergel et al. 2015) and ongoing surveys like KMTNet (Kim et al. 2010). Our preliminary results in Section 6 show that the features produced by our tools can be used as input to ML classifiers like Random Forest to distinguish between different types of microlensing light curves, and to help prioritize the ones that are more likely to be caused by planetary systems.

Table 3. Classifier hyperparameters and features.
Step I
  Features: Λ, three even Chebyshev coefficients, β, number of peaks (bin size = …, threshold = … & …), χ²_PSPL, t_E,PSPL
  KNN: metric = "manhattan", n_neighbors = …
  DT: criterion = "gini", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, criterion = "entropy", bootstrap = False
  NN: dense layers, dropout layers, activation = "relu", batch = …, epochs = …

Step II
  Features: Λ, three even Chebyshev coefficients, β, χ²_PSPL, t_E,PSPL
  KNN: metric = "euclidean", n_neighbors = …
  DT: criterion = "entropy", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, max_depth = …
  NN: dense layers, dropout layers, activation = "relu", batch = …, epochs = …

Step III
  Features: Λ, three even Chebyshev coefficients, χ²_PSPL, t_E,PSPL, s, q
  KNN: metric = "minkowski", n_neighbors = …, p = …
  DT: criterion = "gini", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, max_depth = …
  NN: dense layers, dropout layers, activation = "tanh", batch = …, epochs = …

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 27.
Confusion matrices of the four trained classifiers of classification Step II. At this stage, we assume that the microlensing light curves have already been detected, and our goal is to classify them into groups of single-lens and binary-lens microlensing light curves.

[ROC curve legend: KNN, AUC = 0.92; DT, AUC = 0.918; RF, AUC = 0.943; NN, AUC = 0.939.]
Figure 28.
ROC curves of the four trained classifiers of classification Step II along with their AUC values. In this step, we classify the microlensing light curves into groups of single-lens and binary-lens microlensing light curves. The dataset is more balanced in this step, and all of the classifiers seem to work well; RF and NN still show better performance.
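The per-step hyperparameter optimization summarized in Table 3 can be reproduced with an exhaustive grid search. A minimal sketch for the Random Forest is shown below; the candidate values are illustrative assumptions, since the tuned values themselves are not reproduced here, and the synthetic feature table stands in for one classification step.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Hypothetical feature table standing in for one classification step.
X, y = make_classification(n_samples=300, n_features=8, random_state=2)

# Hyperparameter names follow Table 3; the candidate values are assumptions.
grid = {
    "n_estimators": [50, 100],
    "criterion": ["gini", "entropy"],
    "max_depth": [5, None],
}

# Exhaustive search over the grid with stratified 5-fold cross-validation,
# mirroring the separate per-classifier tuning described in the text.
search = GridSearchCV(
    RandomForestClassifier(random_state=2),
    grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=2),
)
search.fit(X, y)
best_params, best_score = search.best_params_, search.best_score_
```

The same pattern applies to the KNN and DT hyperparameters of Table 3 (e.g., `metric` and `n_neighbors` for `KNeighborsClassifier`), with each classifier tuned on the same feature table for its step.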
The simulated data used in this paper included CV-like events as the only non-microlensing instances. An ideal dataset would include a variety of stellar variability, and our package should be extended to include all other variability in its analysis. We believe that the same approach will be successful in recovering microlensing events from a wide variety of non-microlensing variability on its own.

There are currently a number of methods that attempt to categorize photometric variability in large data sets, such as parametric statistical methods and ML. ML methods for detecting different types of variability are becoming more common (e.g., Richards et al. 2011; Pichara 2013; Pashchenko 2017; Valenzuela 2017). For example, as mentioned before, Godines et al. (2019) trained a Random Forest classifier to detect microlensing light curves in a low-cadence survey dataset in real time. Our features can complement the feature sets of such algorithms, not only improving their results but also giving them the ability to detect possible planetary binary lenses.

In ML classifiers, increasing the number of object instances can significantly improve the results. Our largest training set included 4181 light curves, which is not a large number for most ML applications. Increasing this number to about ∼ ,
000 light curves would yield more robust and reliable results. Some of the tools presented in this paper produce other parameters as well, and including different subsets of those parameters could also improve the results. This needs to be done carefully, though, since adding unimportant features might result in overfitting. Additionally, a promising avenue for improving the results would be to test other classifiers, like the Support Vector Classifier and Naive Bayes, and to investigate deeper neural networks. The improvement of the data set and the ML algorithms will be presented in the second paper of this series.

ACKNOWLEDGEMENTS

S.K. thanks the LSSTC Data Science Fellowship Program, which is funded by LSSTC, NSF Cybertraining Grant
REFERENCES

Abadi, M., Agarwal, A., Barham, P., et al. 2015, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, software available from tensorflow.org
Belokurov, V., Evans, N. W., & Du, Y. L. 2003, Monthly Notices of the Royal Astronomical Society, 341, 1373
Bennett, D., Becker, A. C., Quinn, J., et al. 2002, The Astrophysical Journal, 579, 639
Bluck, A. F., Maiolino, R., Sánchez, S. F., et al. 2020, Monthly Notices of the Royal Astronomical Society, 492, 96
Bond, I., Abe, F., Dodd, R., et al. 2001, Monthly Notices of the Royal Astronomical Society, 327, 868
Bozza, V. 2000, Journal of Mathematical Physics, 41, 6284
Chollet, F., et al. 2015, Keras, https://keras.io
Di Stefano, R., & Perna, R. 1997, The Astrophysical Journal, 488, 55
Gaudi, B. S. 2010, arXiv preprint arXiv:1002.0332

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 29.
Confusion matrices of the four trained classifiers of classification Step III. Once the binary-lens microlensing light curves are found, we use this classification step to classify them into stellar and planetary binary-lens microlensing light curves.

Gaudi, B. S. 2012, Annual Review of Astronomy and Astrophysics, 50, 411
Godines, D., Bachelet, E., Narayan, G., & Street, R. 2019, Astronomy and Computing, 28, 100298
Gould, A. 1996, The Astrophysical Journal, 470, 201
Gould, A., & Gaucherel, C. 1996, arXiv preprint astro-ph/9606105
Griest, K., Alcock, C., Allsman, R., et al. 1995, arXiv preprint astro-ph/9506016
Han, C. 2006, The Astrophysical Journal, 638, 1080
Han, C., Bennett, D. P., Udalski, A., & Jung, Y. K. 2016, The Astrophysical Journal, 825, 8
Jiang, G., DePoy, D., Gal-Yam, A., et al. 2004, The Astrophysical Journal, 617, 1307
Johnson, S. A., Penny, M. T., Gaudi, B. S., et al. 2020, arXiv preprint arXiv:2006.10760
Kessler, R., Narayan, G., Avelino, A., et al. 2019, arXiv preprint arXiv:1903.11756
Khakpash, S., Penny, M., & Pepper, J. 2019, The Astronomical Journal, 158, 9
Kim, D.-J., Kim, H.-W., Hwang, K.-H., et al. 2018, The Astronomical Journal, 155, 76
Kim, S.-L., Park, B.-G., Lee, C.-U., et al. 2010, in Ground-based and Airborne Telescopes III, Vol. 7733, International Society for Optics and Photonics, 77333F
Kim, S.-L., Lee, C.-U., Park, B.-G., et al. 2016, Journal of the Korean Astronomical Society, 49, 37
Liaw, A., & Wiener, M. 2002, R News, 2, 18
Lloyd, S. 1982, IEEE Transactions on Information Theory, 28, 129
Mao, S., & Di Stefano, R. 1994, Interpretation of Gravitational Microlensing by Binary Systems, Tech. rep., SCAN-9411403
Mao, S., Smith, M. C., Woźniak, P., et al. 2002, Monthly Notices of the Royal Astronomical Society, 329, 349
Mróz, P., Udalski, A., Skowron, J., et al. 2017, Nature, 548, 183
Mróz, P., Ryu, Y.-H., Skowron, J., et al.
2018, The Astronomical Journal, 155, 121

[Figure 30: ROC curve legend: KNN, AUC = 0.661; DT, AUC = 0.68; RF, AUC = 0.724; NN, AUC = 0.835.]