Classifying High-cadence Microlensing Light Curves I: Defining Features
Somayeh Khakpash, Joshua Pepper, Matthew Penny, B. Scott Gaudi, R. A. Street
Draft version January 12, 2021
Typeset using LaTeX twocolumn style in AASTeX63
Department of Physics, Lehigh University, 16 Memorial Drive East, Bethlehem, PA 18015, USA
Department of Physics & Astronomy, 261-B Nicholson Hall, Tower Dr., Baton Rouge, LA 70803, USA
Department of Astronomy, The Ohio State University, 140 W. 18th Ave., Columbus, OH 43210
LCOGT, 6740 Cortona Dr, Goleta, CA 93117, USA
(Dated: January 12, 2021)
ABSTRACT

Microlensing is a powerful tool for discovering cold exoplanets, and the Roman Space Telescope microlensing survey will discover over 1000 such planets. Rapid, automated classification of Roman's microlensing events can be used to prioritize follow-up observations of the most interesting events. Machine learning is now often used for classification problems in astronomy, but the success of such algorithms can rely on the definition of appropriate features that capture essential elements of the observations that can map to parameters of interest. In this paper, we introduce tools that we have developed to capture features in simulated Roman light curves of different types of microlensing events, and evaluate their effectiveness in classifying microlensing light curves. These features are quantified as parameters that can be used to decide the likelihood that a given light curve is due to a specific type of microlensing event. This method leaves us with a list of parameters that describe features like the smoothness of the peak, symmetry, the number of peaks, and the width and height of small deviations from the main peak. This will allow us to quickly analyze a set of microlensing light curves and later use the resulting parameters as input to machine learning algorithms to classify the events.

1. INTRODUCTION

Microlensing is a phenomenon that happens when light emitted from a distant object (the source) is lensed by a closer, massive object (the lens), and as a result, multiple images of the source are formed.
These images are typically not resolved because their angular separation is much smaller than the angular resolution of both ground- and space-based telescopes, and consequently, we observe a brightening of the source. Microlensing is a powerful tool for detecting small and dim objects that are otherwise very hard to detect by their emitted light, and in particular, it is so far the only method capable of investigating planetary systems with terrestrial-mass planets orbiting beyond the snow line (Gaudi 2010). For several decades, ground-based surveys have searched the Galactic Bulge for microlensing events. The Optical Gravitational Lensing Experiment (OGLE; Udalski et al. 2015), the Microlensing Observations in Astrophysics (MOA; Bond et al. 2001), and the Korea Microlensing Telescope Network (KMTNet; Kim et al. 2016) surveys detect thousands of microlensing events. These observations have led to the successful discovery of over 100 exoplanets and some compact substellar objects (Griest et al. 1995; Bennett et al. 2002; Mao et al. 2002). Identifying microlensing events from the huge numbers of light curves obtained in these surveys is a challenge. It is useful to be able to distinguish microlensing events either in real time during survey operations as the events start, or after an observing campaign has finished and the data are being fully analyzed. Microlensing surveys have generally relied on circulating real-time alerts of potential microlensing events when there is an increase in the brightness of an observed source. This method can lead to alerts for any sudden rise in a light curve, which can include variability other than microlensing, such as cataclysmic variables (CVs). There has been recent work to increase the accuracy of these alerts, limiting them to genuine microlensing events, using machine learning (ML) algorithms that can distinguish microlensing light curves from other variability in real time (Kessler et al. 2019; Godines et al. 2019).
Real-time classification can increase the accuracy of alerts. It can also identify high-value events for which only partial coverage will be obtained by the survey, and allow observers to schedule supplemental follow-up observations to increase the coverage of the event. There is also a need for classification of microlensing survey data after the conclusion of observations. At that point, it is necessary to first distinguish microlensing events from other types of variability, and also to separate different types of microlensing events, such as those that include planetary signals. Surveys like the KMTNet survey have employed an automated approach to detect microlensing-like variability in complete light curves by fitting simple functions (Kim et al. 2018). Belokurov et al. (2003) also advocate the use of neural networks to distinguish well-sampled microlensing light curves from other types of variability. In this paper, we wish to explore the efficiency and efficacy of after-the-fact classification of microlensing signals. ML has now become a popular method for classifying astronomical time series. Some ML classifiers like the Random Forest (Liaw & Wiener 2002) and k-means (Lloyd 1982) classifiers take light curve features as input and try to find a connection between those features and the classification labels. In this scenario, "features" are quantitative statistical or morphological measurements of the time series. Some other methods like neural networks use the light curves themselves as input, and then find common patterns or higher-order correlated properties to classify them. As a specific example, Random Forest is a widely-used algorithm that classifies time series by making decisions based on features in the time series (Bluck et al. 2020; Pawlak 2019).
In order to use this algorithm efficiently, observable features in the light curves that are most closely related to the canonical model parameters must first be identified. Here, we focus on analyzing complete high-cadence microlensing light curves. These are simulated light curves for the Roman Galactic Bulge Exoplanet Survey. The Nancy Grace Roman Space Telescope (Roman), formerly known as the Wide Field Infrared Survey Telescope (WFIRST), is a future NASA space mission that is expected to be launched by the mid-2020s. One of its primary goals is to detect exoplanets using the microlensing method (Spergel et al. 2015). It is estimated that Roman will find about 54,000 microlensing events and will detect 1400 planets (Penny et al. 2019; Street et al. in prep.). Binary-lens light curves are often difficult to model because of the complicated features in their light curves and the large parameter space that needs to be fully searched (for a more thorough review refer to the introductions of Penny (2014) and Khakpash et al. (2019)). Those challenges highlight the importance of developing fast automated algorithms to quickly analyze light curves and estimate the microlensing model parameters. Our primary goal in this work is to identify a list of features specifically defined for microlensing light curves that can be generated by fast and efficient algorithms, and show that these features can help either in differentiating microlensing light curves among other types of variability or in classifying microlensing light curves into different types. This would enable fast detection of planetary-system lenses and other interesting lensing cases in the released datasets of large surveys like Roman. We present a collection of algorithms including various functional fits that are applied to the light curves, and from these fits, we extract parameters that quantify features of the light curve like smoothness of the peak, symmetry, number of peaks, similarity to microlensing single-lens curves, number of deviations from the peak, and width and height of the deviations from the main peak. We then show how effective each of these functions is in distinguishing between different types of microlensing events, like identifying single-lens versus multiple-lens systems, or planetary-system lenses versus stellar binary-lens systems, and in some cases in detecting microlensing events among other types of variability. Similar work was first done by Mao & Di Stefano (1994), who used features such as an estimate of the asymmetry about the peak to detect binary-lens signatures in the light curves.
In Section 2, we introduce the different models of microlensing events that need to be parameterized for the classification, and in Section 3, we discuss the properties of our test dataset. In Section 4, we introduce our algorithm package including the different functional fits and their respective output parameters. We also evaluate the effectiveness of each function in capturing the specific set of features in the light curve. In Section 6, we show preliminary tests of using these features as input to machine learning classifiers. Finally, in Section 7, we discuss our results and the applicability of our method to other datasets.

2. MICROLENSING MODELS

2.1. Point-source Point-lens Microlensing Light Curves
The Point-source Point-lens (PSPL) model describes a lensing event in which there is a single lens and a single source, and the source and the lens can both be approximated as point-like objects with zero angular size. When there is no blending of the source with the neighboring stars or light from the lens or companions to the lens or source, this model can be described by a simple Paczyński curve as in Equations 1 and 2, where A is the magnification of the source, t₀ is the time of the maximum magnification, u₀ is the impact parameter, and t_E is the Einstein crossing time. Figure 2 shows five different PSPL models with the same Einstein timescale and different values of u₀. In the presence of blending, another parameter is added to Equation 1, yielding Equation 3, where F(t) is the differential flux and f_s is the blending parameter, which determines what fraction of the total flux in the aperture is due only to the source. When there are no higher-order effects, these curves remain symmetric, and the sharpness of the peak in high-magnification events remains intact. Detecting any deviation from this model can help identify the presence of higher-order effects.

A(t) = (u(t)² + 2) / (u(t) √(u(t)² + 4))    (1)

u(t) = √(u₀² + ((t − t₀)/t_E)²)    (2)

F(t) = f_s × A(t) + (1 − f_s)    (3)
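As an illustration, Equations 1–3 translate directly into code. The following Python sketch (the function names and argument order are ours, not from the paper) evaluates the PSPL magnification and the blended flux:

```python
import numpy as np

def pspl_magnification(t, t0, tE, u0):
    """PSPL magnification A(t): Equation 2 for u(t), then Equation 1."""
    u = np.sqrt(u0 ** 2 + ((t - t0) / tE) ** 2)
    return (u ** 2 + 2) / (u * np.sqrt(u ** 2 + 4))

def pspl_flux(t, t0, tE, u0, fs):
    """Blended differential flux F(t) of Equation 3; fs is the blend fraction."""
    return fs * pspl_magnification(t, t0, tE, u0) + (1 - fs)
```

For example, at u = 1 the magnification is 3/√5 ≈ 1.34, and far from the peak F(t) returns to the baseline value of 1.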
2.2. Finite Source Effects

When the source is not point-like and has a finite size, the magnification function will be different, especially when the impact parameter is comparable to the source size (Witt & Mao 1994). In Figure 1, an example of the changes due to finite source effects can be seen in panel (b). For small impact parameters, this effect becomes stronger, and it smooths out the peak of the event. This effect can be used to estimate θ_E, the angular size of the Einstein ring, and therefore provide constraints on the lens mass (Yoo et al. 2004; Jiang et al. 2004).

2.3. Free-Floating Planets
Isolated, short-duration microlensing events can be due to free-floating planets (FFPs) or planets in very wide orbits. Compared to most other exoplanet detection methods, microlensing has the capability to detect these planets since the method only depends on the mass of the lens and not its luminosity. Microlensing events caused by FFPs have short timescales. The source size parameter, ρ, is defined as θ*/θ_E, where θ* is the apparent size of the source, and θ_E is the angular size of the Einstein ring. θ_E depends on the lens mass, and it becomes smaller for single planetary lenses, and
(Figure 1 panels: a) PSPL; b) FSPL; c) FFP; d) Stellar Binary-lens; e) Stellar Binary-lens; f) Caustic-crossing Planetary Binary-lens; g) Planetary Binary-lens - Major Image Perturbation; h) Planetary Binary-lens - Minor Image Perturbation. Axes: t (days) vs. W magnitude.)
Figure 1.
Light curves containing different types of major features in microlensing light curves. The top row includes light curves created by single lenses. Panel (a) is a single stellar lens with no significant finite source effect. Panel (b) shows the effects of the finite source effect on a single-lens stellar event. Panel (c) shows an event due to a free-floating planet. Panels (d) and (e) show two examples of events caused by binary star lens systems. Panels (g) and (h) are due to planetary systems and have no caustic crossings, whereas panel (f) is a planetary event that contains a caustic crossing.

(Figure 2 legend: t_E = 5 days; u₀ = 0.1, 0.2, 0.3, 0.4, 0.5.)
Figure 2.
Five PSPL models with different values of u₀. The figure shows how the shape of the curve changes for different values of the impact parameter.

therefore, ρ becomes larger, and the finite source effect is stronger. As a result, the top of the short-timescale peak becomes flattened, and the overall shape of the event begins to resemble a top-hat function. An example of such an event can be seen in panel (c) of Figure 1.

2.4. Binary-lens Microlensing Events
When there is more than one lens, the range of possible light curve morphologies increases dramatically. In binary-lens systems, the various shapes of the light curves depend on parameters such as q, the mass ratio; s, the projected separation; and α, the angle the path of the source makes with the line connecting the two lenses. Since α can be any value between zero and 360°, this quantity, along with q, s, and t_E, allows the time, height, width, and number of the features in the light curve to vary significantly. Based on OGLE (Udalski 2004) and MOA (Bond et al. 2001) observations, around 10% of microlensing light curves show lens binarity signatures (Penny et al. 2011). Fully modeling each of these light curves is challenging because there are multiple minima in the χ² surface as a function of their parameters, and therefore a wide range of parameter space must be searched. Furthermore, there is no direct mapping between the observable features and the canonical parameters (q, s, α, t₀, t_E, u₀).

2.4.1. Stellar Binary-lens Events
Stellar binary lenses are likely as common as or more common than planetary-system lenses, but the probability of seeing deviations from a PSPL curve is much higher in nearly equal-mass (q ≳ 0.1) binary-lens light curves than in light curves due to planetary mass-ratio (q ≲ 0.01) binaries, because the size of the caustics (the loci of infinite magnification for a point source) increases with increasing q. Therefore, they are typically easy to distinguish from planetary-system lenses since the deviations from the standard PSPL form last much longer, and in particular can be comparable to t_E. They usually have larger perturbations in their light curves, but can still be misinterpreted as planetary binary-lens events (Han et al. 2016). Two examples of such light curves can be seen in panels (d) and (e) of Figure 1.

2.4.2. Planetary Binary-lens Events
A lens system containing a star and a single planet is a binary-lens system with a very small mass ratio (q). Note that there is no strict delineation (in terms of q) between a stellar binary-lens event and a planetary binary-lens event, and this is one of the reasons that this is a challenging classification problem. In these cases, the light curve is dominated by the host star, such that there is a main peak which can be described by the PSPL model, with one or more small deviations caused by the planet (Gaudi 2012). As in a single-lens microlensing event, two images of the source will be formed as it passes close by the lens star: an image outside of the Einstein ring of the lens star, called the major image, and an image inside the Einstein ring of the lens star, called the minor image. Depending on whether the projected separation of the planet and the lens star (s) is larger or smaller than the Einstein ring, the planet will perturb one of these images to leave a signature on the light curve, and the perturbation will have different characteristics. The effect of the planet on the light curve can also be explained by the positions of caustics. In real cases, since the source star is not a point source, we observe sharp high-magnification peaks when there are caustic crossings. The numbers and sizes of these caustics depend on the mass ratio (q) and the projected separation (s). The structure of the caustics leads to the planetary features in the light curves of these events (Gaudi 2012). Caustics have three main topologies. Depending on the values of the projected separation and the mass ratio, there is either one (intermediate topology), two (wide topology), or three (close topology) caustics (for a thorough description of planetary caustics please refer to Gaudi (2012)).
These caustics are also classified into two classes, called the central caustic and the planetary caustic, and depending on which one is closer to the path of the source, the shapes of the light curve are different.
If the source crosses a caustic curve, two sharp deviations will be seen in the light curve (assuming sufficient sampling), corresponding to the entrance and exit of the source through the caustic. The shape of these deviations depends on the path of the source, which is determined by the source path angle (α). Panel (f) of Figure 1 includes an example of a caustic-crossing planetary microlensing event. In this work, we analyze these perturbations based on whether or not the source has crossed the caustics, and we investigate the features created in each of these cases. In cases with no caustic crossing, the source only passes close to the caustics, and that results in features like short bumps or troughs, or both, in the light curve. These deviations are typically small in magnitude and short in duration, which makes them sometimes very difficult to detect, especially in low-cadence surveys, and they can be created by either the central or the planetary caustics. In panels (g) and (h) of Figure 1, we show examples of non-caustic-crossing events caused by the planetary caustic. These perturbations happen when the source passes closely by the planetary caustic, and finding the time of the perturbation can help determine the location of the caustic, and therefore estimate the projected separation of the planetary system. The event in panel (g) is a perturbation in the major image and the event in panel (h) is a perturbation in the minor image. In cases with a caustic crossing, we need to identify the times of the two sharp deviations to find an estimate of the time the source spends inside the caustic. This allows us to estimate the size and location of the caustics, and therefore find an estimate of the mass ratio and the projected separation of the planetary system (e.g., Poleski et al. 2014).

3. SCOPE OF THE ANALYSIS

As mentioned in Section 1, classifying astronomical time series using ML algorithms relies on the data sets used for training and testing.
Features computed for a data set cannot always be used to analyze another data set; therefore, the connection between the dataset and the defined features should be discussed. In this paper, we focus on computing features for simulated Roman light curves. These light curves are simulated based on the Roman Cycle 7 mission design and have six 72-day seasons with 15-minute cadence. These light curves are similar to the light curves generated by Penny et al. (2013) and Penny et al. (2019), and the full details of the simulated data can be found in these papers. This data set was designed in particular to be used for the Roman microlensing data challenge (https://microlensing-source.org/data-challenge/) and has four classes of variability types, consisting of different types of microlensing light curves along with a particular type of stellar variability that can be misinterpreted as microlensing. These classes include cataclysmic variables, single-lens systems, stellar binary lenses, and planetary-system lenses. An example of each class of light curves used in this analysis can be seen in Figure 3. The dataset contains 4181 light curves, including 429 cataclysmic variables, 1626 single-lens systems, 1386 stellar binary lenses, and 688 planetary-system lenses. The CV class contains CV variability with either zero, one, or multiple instances of outbursts in the baseline. The single-lens systems include both stars and planets as lenses, and therefore their shapes vary in terms of the duration, amplitude, and morphology of the peak. As mentioned before, the stellar binaries and the planetary systems can result in very similar light curves, and the transition between these two classes is smooth in terms of the observables; distinguishing between them is one of our main concerns. The light curves selected for this project have a wide range of signal-to-noise ratios. To enter our sample, events had to pass a Δχ² > 500 cut of the true model relative to a flat line, and binary and planetary events had to pass a Δχ² > 160 cut of the true model relative to a single-lens light curve model (see Penny et al. (2019) for full details). While these Δχ² values suggest quite high formal detection significance, the cuts can be passed by light curves with signal-to-noise per data point less than unity; e.g., a t_E = 25 day single-lens event will have 4800 data points within one t_E of the event peak. Similarly, a Jupiter-mass planet with a planetary deviation lasting ∼ 1 day (∼ 96 data points) can pass the detection cut with an average contribution to Δχ² per data point of less than 2. This said, our selection cuts do not allow us to test our method's sensitivity to the lowest signal-to-noise ratio events that could potentially be formally detected.

4. FEATURE DETECTION AND PARAMETRIZATION

In Section 2, we described major types of microlensing variability with different morphologies and amplitudes. In this section, we introduce a package of statistical tests and model fits we have assembled into an algorithmic approach to detect the features of a set of light curves. These features individually can then be used to distinguish microlensing events from other variability, and to
(Figure 3 panels: a) CV; b) PSPL/FFP; c) Stellar Binary-lens; d) Planetary Binary-lens. Axes: time (days) vs. W magnitude.)
Figure 3.
Four classes of light curves in the dataset used for this analysis. From left: the first panel is an example of the light curve of a cataclysmic variable. The second panel shows a single-lens model, which contains a single symmetric peak. The third panel represents lens systems containing a stellar binary, and the last panel is an example of an event with a planetary system as the lens. The blue data points are from the original simulated light curves, and the black data points are smoothed light curves using the low-pass filter.

classify the different types of microlensing systems. Ultimately, in a subsequent paper, we will use the combined power of all of the features as input to a ML classifier. The package includes the following tools:

• Peak Finder
• Gould Two-parameter Point-source Point-lens Fit (G-PSPL)
• Symmetry Check
• Trapezoidal Function Fit
• Cauchy Distribution Fit
• Planetary Parameters Finder
• Chebyshev Polynomials Fit

In the sections below we introduce each of these tools, and explain how they are applied to the simulated data set and evaluated. Not all tools are used independently, and some are used in combination with other tools. Before applying some of these tools, we apply a low-pass filter to the light curve, which allows low-frequency signals to pass and therefore reduces the high-frequency noise in the data. This low-pass filter has a smoothing window of 10 subsequent data points. The width of the smoothing window was optimized for the simulated Roman data set analyzed in this work, but should be separately optimized for each data set so as to remove non-astrophysical noise as much as possible.
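The text specifies only that the low-pass filter uses a smoothing window of 10 subsequent data points; a running mean is one simple realization of such a filter (the function name and the use of `np.convolve` are our assumptions, not the paper's implementation):

```python
import numpy as np

def smooth_lightcurve(mag, window=10):
    """Low-pass filter the magnitudes with a running mean of `window` points."""
    kernel = np.ones(window) / window
    return np.convolve(mag, kernel, mode="same")
```

Note that with `mode="same"` the first and last few points are averaged over a partial window, so edge values are biased toward zero; a production implementation would handle the boundaries explicitly.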
4.1. Peak Finder
The primary idea behind this tool is to identify individual peaks in a light curve, which are often characteristic of microlensing events. After applying the low-pass filter, we bin the light curve in time, and then, in each bin, we count the number of data points with positive deviations larger than some number of standard deviations in that bin, which we define as the peak threshold. If a bin has more than one data point beyond that threshold, that bin is defined as a peak. When using this algorithm, two parameters are important in determining the success rate: the bin size and the peak threshold. Depending on the particular implementation, we can use this method to search for a single large peak, such as a single-lens microlensing event. Or, after identifying a single-lens event, we can subtract a model and then search for additional peaks in the light curve residuals that would indicate a binary-lens event. We can also search simultaneously for multiple large peaks at once. To examine the performance of this method in a particular case, we consider the case of searching for a single peak within a light curve. We test different values of the bin size and threshold parameters to determine what combination is most successful at detecting peaks. We define a successful detection as one where exactly one bin identifies a peak, and the light curve is known to be due to a microlensing event. We define a failed detection as one where a light curve labeled as containing a microlensing event did not have any peaks identified, or had multiple peaks identified. With this range of bin sizes, this algorithm will not detect narrow deviations in microlensing light curves, and will only look for the main event. Note that the algorithm is looking for single-peaked variations, so it might also find single-peaked transients. We expect that later stages of analysis will distinguish between these cases. This method has a low false-negative rate; it might, however, fail to detect a microlensing light curve at low thresholds when the event has a low magnification and is considered similar to the noise in the light curve. Figures 4, 5, and 6 show the performance of the algorithm across a range of bin sizes and thresholds. They display the True Positive Rate (TPR), False Positive Rate (FPR), and False Discovery Rate (FDR) for different options of bin size and peak threshold. Figure 4 shows that bin sizes larger than about 50 days and thresholds of 3σ to 5σ give the highest TPR of about 80%. Figure 5 shows the False Positive Rate, which represents what fraction of non-microlensing events are labeled as microlensing. Bin sizes do not consistently affect the FPR. Thresholds larger than 5σ give unreasonably low FPR, and that is because at these thresholds, most light curves will have zero peaks identified. Thresholds of 3σ to 5σ give a FPR of 50%. Since some fraction of non-microlensing events have a single peak, we expect that a group of non-microlensing events will be classified as microlensing even though the algorithm has successfully identified one peak in them. Figure 6 shows the False Discovery Rate, which represents the fraction of events labeled as microlensing that are in fact non-microlensing. For the same reason as mentioned above, thresholds larger than 5σ are not favorable, and thresholds of 3σ to 5σ have a reasonably low FDR of no larger than 7% for most bin sizes. Note that the extremely low FDR values are also an indication of the unbalanced dataset (a dataset in which there are unequal numbers of objects in each category), and our purpose is to compare how sensitive the FDR is to different thresholds and bin sizes. Using this peak finder algorithm with a bin size of 60 days and a threshold of 5σ, we can identify which light curves show a single-peak event. In Section 5, we use this feature in conjunction with the other feature detection tools.

Figure 4. Results of the Peak Finder algorithm applied to all of the light curves in the dataset for different values of bin size and threshold. The True Positive Rate (TPR), the fraction of microlensing events that are labeled as microlensing, is plotted versus bin size, with various thresholds denoted by symbol and color. The value of the TPR is roughly independent of the bin size, and thresholds of 3σ to 5σ yield the highest TPR.

Figure 5. As Figure 4, but showing the False Positive Rate (FPR), the fraction of non-microlensing events that are labeled as microlensing. The value of the FPR is only weakly dependent on the bin size for most thresholds.
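The binning-and-thresholding logic of the Peak Finder can be sketched as follows. The per-bin statistics (deviations measured from the in-bin median, in units of the in-bin standard deviation) and the function name are our assumptions; the paper does not specify these details:

```python
import numpy as np

def count_peak_bins(time, mag, bin_size=60.0, threshold=5.0):
    """Count bins in which more than one point brightens past `threshold` sigma.

    Brightenings are positive deviations in flux but negative in magnitude,
    so we measure (median - mag) in units of the in-bin standard deviation.
    """
    peak_bins = 0
    edges = np.arange(time.min(), time.max() + bin_size, bin_size)
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = mag[(time >= lo) & (time < hi)]
        if m.size < 2:
            continue
        dev = (np.median(m) - m) / np.std(m)  # positive = brighter than typical
        if np.sum(dev > threshold) > 1:       # more than one point beyond threshold
            peak_bins += 1
    return peak_bins
```

A light curve would then be flagged as a single-peak (microlensing-like) event when this count is exactly 1.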
4.2. Gould Two-parameter PSPL Fit
Figure 6. As Figure 4, but showing the False Discovery Rate (FDR), the fraction of events labeled as microlensing that are in fact non-microlensing. Overall, the values of the FDR are very low and also roughly independent of the bin size for most thresholds.

The PSPL function has been demonstrated to be a good fit to most microlensing light curves, and to not match most other types of astrophysical variability (Di Stefano & Perna 1997). Therefore, the goodness of fit to the PSPL model can be used as a feature of the light curve. In this section, we demonstrate employing a two-parameter PSPL model first introduced by Gould (1996) to detect curves similar to a Paczyński curve. The reason for not using the regular PSPL model is that brute-force searches over the three non-linear parameters (t_E, t₀, u₀) of a PSPL model are slow, while the two-parameter Gould approximate PSPL model (G-PSPL) captures the basic observables of a full PSPL model but requires many fewer computations. For more details refer to Gould (1996) and Woźniak & Paczyński (1997). In order to employ the PSPL model as a light curve feature, we use a version of the Gould (1996) two-parameter PSPL model (G-PSPL) that has been used by the KMTNet survey to detect microlensing events. For that survey, Kim et al. (2018) fit a combination of two two-parameter G-PSPL models: one with the assumption of u₀ = 1, and one with the assumption of u₀ being very close to zero. This approach reduces the number of fitted parameters, and can be used to detect both high- and low-magnification events. This double two-parameter G-PSPL fit is defined here as function F(Q):

F(Q) = f₁ × (A₁(Q) + A₂(Q)) + f₂    (4)

where

A₁(Q) = 1/√Q    (5)

and

A₂(Q) = 1/√(1 − Q⁻²)    (6)

are functions that represent good fits to high- and low-magnification events, respectively. The two parameters f₁ and f₂ do not have a physical meaning in this equation, but in the limits of u₀ = 1 and u₀ close to zero, they are related to the blending and source parameters, described in detail in Kim et al. (2018). A₁ and A₂ are both functions of Q,

Q(t) = 1 + ((t − t₀)/t_eff)².    (7)

Q is a function of time and depends on two parameters, t₀ and t_eff. t₀ is the time of the maximum magnification, and t_eff approximates u₀ × t_E in the limit of u₀ ≪ 1. To fit F(Q), we set the initial values of the parameters f₁ and f₂ to 0.5, and t₀ to the time of the maximum magnification in the light curve. For t_eff, we choose a list of seven initial values ranging from a fraction of a day up to 20 days. For high-magnification events, low values of t_eff usually provide a better fit, and for low-magnification events, larger values of t_eff are a better fit. The fit with the minimum χ² value is used as the selected model. Note that for the remainder of the paper, we specify whether we use a G-PSPL or a PSPL model fit.
If we identify a peak in a light curve and fit a two-parameter G-PSPL model to the data, we then employ a check on the symmetry of the event. In Section 4.2, we showed that this method can be used to detect microlensing events. Here, we calculate the reduced χ² statistic separately for the right and left sides of the peak in the light curve. If the peak is symmetric, the ratio (χ²_left)_reduced / (χ²_right)_reduced, which we define as β, is very close to unity. Since the numbers of data points in the right and left wings of the peak might not be exactly equal, we use the reduced χ². Deviations of β from unity can be related to additional physical details in the light curves, such as the existence of planetary or binary star companion lensing signatures, parallax effects, or the asymmetry of non-microlensing peaks like those of cataclysmic variables.

In Figure 7 we plot histograms showing the distributions of variability types for a set of ranges of β values. Each column represents events with a specific range of β. We find that for CV light curves, the asymmetry is extremely large, consistent with the morphology of CV light curves, which display steep rises and more gradual declines. Single-lens microlensing events are symmetric, with β values close to unity. Binary lenses have a wide range of β values depending on whether they are caused by a planetary system or a stellar binary system. These ratios can be used as a feature to obtain information about the kinds of deviations from the PSPL model, and to classify the light curves.

Figure 7. β is a measurement of the asymmetry of the peaks in the light curves that is obtained from the symmetry check algorithm. This plot shows the distribution of β for different classes of light curves in our dataset.

4.4. Trapezoidal Function Fit
As explained in Section 2.3, lensing events caused by free-floating planets have a short duration and low amplitude and are more often strongly affected by the finite source effect. Therefore, their shapes approximately resemble a trapezoidal function.

The parameters of the trapezoidal function are the baseline magnitude (a), the maximum magnitude (b), the time of the first rise (τ_1), the time when the maximum magnitude is reached (τ_2), the time when the maximum magnitude ends (τ_3), and the time when the magnitude returns to the baseline (τ_4). Figure 8 shows a diagram of the function and its six parameters. We must first find initial guesses for the six parameters, then fit the function and obtain more accurate estimates of the parameters. We also calculate the duration of the trapezoidal portion, Δτ_full, and the duration of the flat section of the trapezoid, Δτ_top, as presented in Equations 8 and 9 and shown in Figure 8:

Δτ_full = τ_4 − τ_1   (8)

Δτ_top = τ_3 − τ_2   (9)

Figure 8. A trapezoidal function with its defined parameters.

In order to find initial guesses for the six parameters, we first subtract the median baseline magnitude from the light curve, setting a to zero. We then select 20, 2, 0.2, or 0.02 days as initial guesses for the total duration (Δτ_full). We define t_0 as the time of maximum brightness, and assume Δτ_full = 2Δτ_top. We then define the remaining parameters τ_1, τ_2, τ_3, and τ_4 as the following quantities, respectively: t_0 − Δτ_full/2, t_0 − Δτ_full/4, t_0 + Δτ_full/4, and t_0 + Δτ_full/2. We fit the resulting models based on the initial guesses for Δτ_full, using a least-squares minimization approach. We finally select the fit that minimizes the χ² statistic.

We have applied this algorithm to our test dataset, with Figure 9 showing an example of a best-fit function for a simulated light curve. We find that half of the total duration (Δτ_full) is a good representation of the Einstein timescale (t_E) of the event, and we compare it with the true values of t_E of the events. Figure 10 shows the estimated versus true duration of single-lens microlensing events.
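The fit just described can be sketched with a piecewise-linear model and an off-the-shelf least-squares minimizer. A minimal illustration (function names are ours, and estimating the peak time from the midpoint of the brightest points is an implementation detail we assume):

```python
import numpy as np
from scipy.optimize import least_squares

def trapezoid(t, b, tau1, tau2, tau3, tau4):
    """Trapezoid with zero baseline (a = 0) and plateau height b."""
    return np.interp(t, [tau1, tau2, tau3, tau4], [0.0, b, b, 0.0],
                     left=0.0, right=0.0)

def fit_trapezoid(t, flux):
    """Least-squares trapezoid fit (baseline already subtracted).

    Tries each of the initial total-duration guesses quoted in the
    text (20, 2, 0.2, 0.02 days) with dt_top = dt_full / 2, and
    keeps the fit with the smallest sum of squared residuals.
    """
    t0 = float(np.median(t[flux >= 0.99 * np.max(flux)]))  # midpoint of brightest points
    b0 = float(np.max(flux))
    best = None
    for dt in (20.0, 2.0, 0.2, 0.02):
        x0 = [b0, t0 - dt / 2, t0 - dt / 4, t0 + dt / 4, t0 + dt / 2]
        res = least_squares(lambda p: trapezoid(t, *p) - flux, x0)
        if best is None or res.cost < best.cost:
            best = res
    return best.x  # [b, tau1, tau2, tau3, tau4]
```

The returned breakpoints give Δτ_full = τ_4 − τ_1 and Δτ_top = τ_3 − τ_2 directly.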
The two agree well for short-timescale events, to within about 0.06 days, while for events with larger t_E the trapezoidal estimate becomes less reliable. We define

κ = Δτ_top / Δτ_full

as the trapezoidal timescale ratio, which is the ratio of the duration of the flat part of the trapezoid to the total duration of the trapezoid. This quantity was initially set to 0.5 for the initial guesses of the function fitting. This parameter can be thought of as a measure of how square or peaked the event is, analogous to the kurtosis of the event. We show that using κ along with Δτ_full allows us to flag the events that are strongly affected by the finite source effect. We can characterize the approximate significance of the finite source effect using |u_0|/ρ, the ratio of the impact parameter to the finite source ratio. When |u_0|/ρ is smaller than unity, it implies the lens passes directly over the source, and thus that can be a sign of significant finite source effects.

The left panel of Figure 11 shows the trapezoidal timescale ratio (κ) versus Δτ_full for the subset of all light curves that meet the fitting criteria for single-lens models, which are described below in §5. Note that both of the quantities on the axes in the left panel are found experimentally using the trapezoidal function fit. In that panel, most of the green data points have larger values of κ, which implies a more square light curve shape caused by strong finite source effects. The right panel shows the distributions of the true values of ρ for all the data points inside and outside of the black box in the left panel. That panel indicates that light curves with small values of Δτ_full and large values of κ tend to have larger values of ρ (grey), and light curves with larger values of Δτ_full and smaller values of κ tend to have smaller values of ρ (hatched white). This is because events with short durations caused by free-floating planets are more affected by the finite source effect, and therefore more closely resemble a trapezoidal function, with Δτ_full and κ helping to indicate these events.

Figure 9. Trapezoidal function fitted to the light curve of a simulated event caused by a free-floating planet. The flat top of the trapezoid describes the duration of maximum brightness fairly well, and the duration of the event is well represented by the full trapezoidal duration.

Figure 10. True versus estimated Einstein timescale (t_E) of the events.

4.5. Cauchy Distribution Fit
For microlensing events with a strong finite source effect, in the absence of limb darkening effects, the top of the peak becomes flatter and will not look like a PSPL function. To detect this phenomenon, we need to parameterize the flatness of the top of the peak. Effectively, we would like to use a functional fit that can apply to the range of morphologies between that of a single-lens PSPL model and a trapezoid that much more closely fits a strong finite source effect. Our goal here is to find a correlation between the fitted parameters and the finite source ratios. For that purpose, we use a Cauchy distribution function to fit the light curves. The difference in the minimum χ² of the Cauchy fit and the PSPL fit (shown in Equations 3 and 1), along with one of the parameters of the Cauchy distribution, can be used as features.

The Cauchy distribution shown in Equation 10 has four parameters: the time of the peak (t_0), the duration of the peak (σ), the flatness of the peak (b), and the amplitude of the peak (a):

C(t) = a / (1 + |(t − t_0)/σ|^b).   (10)

The functions C(t) and PSPL have two common parameters, the time and duration of the peak. After fitting these two functions to a light curve, we have fitted values for t_E and t_0, which we refer to as t_{E,PSPL}, t_{E,Cauchy}, t_{0,PSPL}, and t_{0,Cauchy}. Note that t_{E,Cauchy} is equivalent to σ in Equation 10. We define the difference in the χ² of the PSPL fit and the Cauchy fit as a feature characterizing the flatness of the top of the curve, represented in Equation 11. In order to calculate ψ, we select a section of the curve at the peak, between t_0 − (t_{E,PSPL} × u_{0,PSPL}) and t_0 + (t_{E,PSPL} × u_{0,PSPL}):

ψ = χ²_PSPL − χ²_Cauchy.   (11)

The more the event is affected by the finite source effect, the more the top of its peak deviates from the PSPL model and the more similar it is to the Cauchy model.
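Given model light curves from the two fits, the flatness feature is a simple difference of χ² values. A minimal sketch (function names are ours; for simplicity the χ² here is unweighted and the peak-section selection is left to the caller):

```python
import numpy as np

def cauchy_profile(t, a, t0, sigma, b):
    """Generalized Cauchy peak of Eq. (10); larger b gives a flatter top."""
    return a / (1.0 + np.abs((t - t0) / sigma) ** b)

def flatness_feature(flux, model_pspl, model_cauchy):
    """psi = chi2(PSPL) - chi2(Cauchy) over the supplied peak section.

    psi > 0 means the flat-topped Cauchy profile is the better fit.
    """
    chi2_pspl = np.sum((flux - model_pspl) ** 2)
    chi2_cauchy = np.sum((flux - model_cauchy) ** 2)
    return float(chi2_pspl - chi2_cauchy)
```

In practice the two model arrays would come from the PSPL and Cauchy fits restricted to the peak interval defined above.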
Thus, for single-lens events with a flatter peak, ψ is positive. Even in cases where the Cauchy distribution fits much better than the PSPL fit, for events that are not caused by free-floating planets but that still experience strong finite source effects, the Cauchy distribution will typically be a better fit than the trapezoidal function. Figure 12 shows a single-lens event highly affected by the finite source effect, along with the best-fit PSPL and Cauchy models. The two curves deviate the most close to the peak of the event.

Figure 13 shows values of the Cauchy feature (ψ) versus ρ/u_0 for all single-lens events that have at least four observations within a time of t_{E,PSPL} × u_{0,PSPL} from the maximum. The upper panel shows positive values of ψ and the lower panel shows the negative values. The events with flatter peaks have a positive value of ψ and, on average, have a shorter duration than those with negative ψ. Events with large positive ψ (flatter peaks) appear in the upper right of this figure and have ρ/u_0 > 1, while events with negative ψ include both short and long duration events. The black lines show the one-to-one relation for events with positive and negative ψ, with median absolute deviations of 0.45 for positive ψ and 0.64 for negative ψ. Events in the lower region mostly have small negative values of ψ, indicating that neither the PSPL nor the Cauchy function is a significantly better fit to the data. Those with large negative values are those with sharp peaks that are well described by the PSPL function, which also have low u_0. This plot shows that ψ can help identify the flattest and sharpest microlensing peaks, and flag the ones that are more likely affected by the finite source effect.

Figure 11. Left Panel: Values of the trapezoidal timescale ratio (κ) are plotted versus the total trapezoidal duration (Δτ_full). The purple data points show events with |u_0|/ρ > 1, and the green data points represent events with |u_0|/ρ < 1, which indicates events with free-floating planets or other PSPL events affected by the finite source effect. κ is larger for shorter events, and most of the events with |u_0|/ρ < 1 have large κ and small Δτ_full. Therefore, the selected ranges of κ and Δτ_full can be used to identify likely events caused by free-floating planets. Right Panel: The distribution of |u_0|/ρ for all data points inside (grey) and outside (hatched white) of the black box in the left panel. Light curves with small values of Δτ_full and large values of κ tend to have smaller values of |u_0|/ρ, and light curves with larger values of Δτ_full and smaller values of κ tend to have larger values of |u_0|/ρ.

Figure 12. Cauchy and PSPL functions fitted to a Roman simulated single-lens microlensing light curve highly affected by the finite source effect. The blue data points are from the simulated light curve, the red curve is the fitted PSPL model, and the orange curve is the fitted Cauchy model.

The other feature that we use for this purpose is the parameter b in Equation 10. We expect that if the event exhibits strong finite source effects, which means a flatter top, this feature would take a larger value, and if the finite source effect is negligible, the feature would be very close to unity (in the absence of limb darkening effects). In reality, b does not correlate with ρ for most events, but once ρ is greater than 0.1, we do see a positive correlation with b, as seen in Figure 14. We therefore retain b as a potentially useful feature.

4.6. Planetary Parameters Finder
Our goal in this section is to measure the two physical parameters associated with planetary binary-lens events. These are s, the projected separation between the planet and the lens star, and q, the mass ratio of the planet and the lens star. The approach that we take here is to first fit the main event with a PSPL function, then find the deviations from the PSPL model by looking at the residual to identify peaks or troughs. We then seek to identify cases where these residuals have one significant peak or trough, or whether they have two significant peaks.

The reason we take this approach is that the single-deviation residuals can be characterized much more robustly than the double-peaked ones. For events of both groups, we fit the "busy" function introduced by Westmeier et al. (2014). This function is primarily used to describe double-peaked features in spectra, and with some changes in its parameters, we can also use it to fit single-peaked deviations. Khakpash et al. (2019) suggest fitting the single-peaked residuals with a Gaussian, and show that this approach is useful mostly for low mass-ratio events. Here, we take the same approach to find the initial parameters in the residual, but instead of a Gaussian, we fit all deviations with the busy function. After fitting the residual, in the next round of fitting, we fit a PSPL plus a busy function, and we use parameters obtained from the first two fits as initial parameters of this fit. Next, we calculate s and q using the parameters found by the final fit.

The busy function is commonly used to describe the double-horn profile of galaxy spectra (Westmeier et al. 2014). The function is shown in Equation 12, and has nine parameters, which are sketched in Figure 15. This function comprises two error functions and a polynomial of degree n.
The parameters x_e and x_p determine the locations of the middle of the two error functions and the middle of the polynomial, δ_1 and δ_2 are the steepnesses of the two error functions, n is the degree of the polynomial and determines the steepness of the middle trough, c determines the depth of the polynomial, w determines the distance of the error function zero points from x_e, a determines the height of the error functions, and ε is the horizontal scaling of the function:

A(t) = (a/4) × (erf[δ_1 (w + ε t − x_e)] + 1) × (erf[δ_2 (w − ε t + x_e)] + 1) × (c |ε t − x_p|^n + 1).   (12)

Depending on the values of the nine parameters, the shape of the function can differ significantly. We take advantage of this fact and use different shapes of the function to fit single-peaked and double-peaked deviations from the PSPL model. The two forms of the busy function used to fit the residuals are shown in Figure 16. The left panel shows a double-horn shape of the busy function, and the values of the parameters can be found on the plot. The three curves represent the shapes of the function when only the steepnesses of the two error functions are altered. This set of parameters results in a form that can describe the caustic-crossing features of planetary microlensing light curves. In the right panel, the value of c in Equation 12 is set to zero, and therefore the equation only comprises the two error functions. The shape of the function resembles the Gaussian function used by Khakpash et al. (2019) to fit single-peaked deviations, with an additional ability to describe asymmetric deviations. The three curves show how the shape of the function changes as the steepnesses of the two error functions are altered.

We apply this approach to the set of simulated Roman planetary microlensing light curves. We follow the procedure in Khakpash et al. (2019), and first fit a PSPL function as shown in Equations 1 and 3 to the light curves.
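Equation 12 translates directly into code. A minimal sketch (the function name and argument order are ours):

```python
import numpy as np
from scipy.special import erf

def busy(t, a, delta1, delta2, w, eps, xe, xp, c, n):
    """Busy function of Eq. (12) (Westmeier et al. 2014).

    Two error functions set the rising and falling edges; the
    polynomial term carves the central trough. Setting c = 0
    leaves a single (possibly asymmetric) peak.
    """
    rise = erf(delta1 * (w + eps * t - xe)) + 1.0
    fall = erf(delta2 * (w - eps * t + xe)) + 1.0
    trough = c * np.abs(eps * t - xp) ** n + 1.0
    return (a / 4.0) * rise * fall * trough
```

With c = 0 the function is a single flat-topped peak, and asymmetric choices of δ_1 ≠ δ_2 skew its two sides, matching the two regimes described above.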
We find an initial value for t_0 by finding the time of the maximum flux in the light curve, and we set the initial value of f_s to be 0.5. For an initial guess for the duration of the event, t_E, we first interpolate between the data points using a cubic interpolation to estimate a continuous version of the data; then, for events with u_{0,initial} < 0.5, we assume t_E is the interval between the times when the magnification is 1.34. For events with u_{0,initial} > 0.5, we set t_E equal to the interval between the times when the magnification is 1.06. We use Equations 1 and 3 to calculate an initial guess for u_0 based on the above assumed values at time t = t_0. Then, we fit the PSPL function using the initial guesses, and then apply the peak-finding algorithm to the residual.
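The t_E initial guess amounts to a threshold-crossing measurement on the magnification curve. A minimal sketch (the helper name is ours; we take the full interval above the threshold as the guess, following the prescription in the text, and omit the cubic interpolation step for brevity):

```python
import numpy as np

def estimate_tE(t, mag_curve, u0_initial):
    """Initial tE guess from threshold crossings of the magnification.

    A = 1.34 (the magnification at u = 1) is used for u0 < 0.5, and
    A = 1.06 (the magnification at u = 2) for u0 > 0.5; the guess is
    the time interval spent above that threshold.
    """
    threshold = 1.34 if u0_initial < 0.5 else 1.06
    above = t[mag_curve >= threshold]
    return float(above[-1] - above[0]) if above.size else float("nan")
```

For a low-u_0 PSPL event this interval is close to 2 t_E, so a constant rescaling may be applied before using it as a starting value.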
Figure 13.
The Cauchy feature (ψ) plotted versus ρ/u_0, restricted to cases with at least four observations near the peak maximum. The colors of the points represent the PSPL-estimated event duration t_{E,PSPL}, while the sizes of the points represent the PSPL-estimated value of u_0 (u_{0,PSPL}). The black trend lines show a one-to-one relation between ψ and ρ/u_0 (reversed in the bottom panel).

Figure 14.
Plot of the parameter b in Equation 10 versus the true value of ρ. It shows that the b values are in general larger for events with ρ larger than 0.1.

Next, we use the peak finder (§4.1), starting with a large bin size and a detection threshold expressed in σ. With these values, the algorithm finds the most significant deviation. Then, we decrease the bin size by 50% until it either finds two peaks or the bin size reaches 0.2 days. At any of these decreasing steps, if it finds two peaks that are separated by less than 10 days, we end the search. This condition aims at preventing the algorithm from selecting multiple peaks in a caustic-crossing microlensing event. If the search concludes with the detection of one deviation, we accept that and move on to the next step. If it finds zero deviations, we simply select the maximum or minimum data point in the residual and identify that as a single deviation.

Figure 15. A schematic plot of the busy function, with features associated with eight function parameters. The parameter ε (not shown on the figure) represents a horizontal scaling of the function without disrupting its shape.

At this point, we fit the modified busy function to the peaks identified in the residual. For this purpose, we identify the time of the single deviation, or the middle point of the double-peaked deviation, as initial guesses for x_e and x_p. We set these two values equal to zero, the values of δ_1 and δ_2 are set equal to 1, and the value of ε is set to 5. Parameters a, w, n, and c are set up differently depending on the number of deviations found in the residual. If there is one deviation, n and c are set equal to zero, a is set equal to the largest peak or trough in the residual, and w is set equal to unity. If there are two deviations, the initial values of n and c are set to 10 and 0.02, the value of a is set equal to the median of the residual values between the two peaks, and w is set equal to half of the distance between the two peaks.

Parameters describing the deviations in the residual are obtained by the busy function fit. Using these parameters along with the PSPL parameters as initial values, we then fit a PSPL plus a busy function to the light curve, and we find final values of 13 parameters: four of them describing the main event, and nine of them describing the deviation. Using these values, we calculate the two physical parameters s and q using two different approaches.

When only one deviation is found, the busy function reduces to two error functions with six parameters. After fitting the function to the deviation, we determine the duration of the deviation by looking at where the fitted model is not zero, and we assume half of that to be the duration t_{Ep}. t_p is set to x_e, and then we find the two values of s by solving Equation 13. If the deviation is positive, a secondary check is done to find whether there is a large trough close to this peak. If there is a large de-magnification close to a single caustic crossing, the event is very likely to have s < 1; otherwise, s > 1. q is then obtained from Equation 14:

|s − 1/s| = u,  where  u = √(u_0² + ((t_0 − t_p)/t_E)²),   (13)

q = (t_{Ep}/t_E)².   (14)

When there are two deviations, we assume that the deviations are caused by crossing the planetary caustic, and that these epochs correspond to when the source approaches two cusps of the caustic. Using these epochs, we calculate the distance of the source from the lens at these times using u_{1,2} = √(u_0² + ((t_0 − t_{1,2})/t_E)²). Han (2006) calculates the size of the planetary caustics in terms of s and q; we find the size of the caustic using simple geometry, and then we use their formulation to calculate s and q.

Figure 17 shows an example of the geometry involved in these lensing events.
In this figure, the planetary caustics of two systems with projected separations of 1.6 and 0.8 and a mass ratio of 0.03 are shown. The shapes and sizes of the caustics depend on s and q. An example path of the source through the system is shown in both panels. The distance between the lens and the center of the caustics, LX, gives us the value of |s − 1/s|. At this point, we do another secondary check, as described earlier, and determine whether we have a system with s > 1 or s < 1. Next, for systems with s > 1, we find the vertical dimension of the caustic (left panel of Figure 17), and for systems with s < 1, we find the distance between the two caustics by geometry (right panel of Figure 17). We refer to both of these values as the height of the caustics. Han (2006) provides approximate analytic expressions for these heights in terms of s and q, and we invert those expressions to find q. Table 1 summarizes the process we adopt to extract the system parameters from the light curve residuals in the cases of one and two deviations (Poleski et al. 2014; Bozza 2000).

Figure 16. Two different shapes of the busy function that are fitted to the single- and double-peaked residuals. Left Panel: A double-horn shape of the busy function (x_e = x_p = 0, a = 2, n = 4, w = 7, c = 0.002, ε = 5). The three curves represent the shapes of the function when only the steepnesses δ_1 and δ_2 of the two error functions are altered. Right Panel: The same parameters with the value of c in Equation 12 set to zero, so that the equation comprises only the two error functions and no polynomial. Only one peak is present, with the two error functions controlling the steepness of the two sides of the peak. The three curves show how the shape of the function changes as the steepnesses of the two error functions are altered.

Figures 18, 19, and 20 show three examples of the PSPL plus the busy function fitted to planetary light curves. Figure 18 shows the fit for a system with one deviation in its PSPL residual, where the true projected separation of the system is smaller than unity. The algorithm first detects sharp peaks or troughs, and after fitting the busy function to the light curve residuals from the PSPL fit, we then obtain the residual of the busy function fit. At the times where a single peak is found, we still cannot conclude that the event includes a major image perturbation. At this point, if the sum of the flux of the negative points in the busy function fit residual is larger than the sum of the flux of the positive points, the system is considered to be affected by a minor image perturbation, and s is chosen to be less than one. Otherwise, s is taken to be larger than one. The example in Figure 18 shows an event with a sharp positive deviation that appears to be a major image perturbation; however, the preceding trough indicates that this is a minor image perturbation, and the algorithm correctly decides that s is smaller than one. The true projected separation for this event is s = 0.63, and the estimated value is s = 0.64.

Figure 19 shows a double-peaked event for which the secondary check indicates s < 1. The busy function fit may appear to be a poor fit to the deviations, but this is not considered a failure; in fact, it helps with determining whether the deviation is a major or a minor image perturbation. After the secondary check, the algorithm correctly chooses s smaller than one. The true projected separation for this event is s = 0.62, and the estimated value is s = 0.59. Figure 20 shows a double-peaked event with s > 1; the true projected separation is s = 1.11, and the estimated value is s = 1.02.

We compare the estimated values of s and q with their true values, as demonstrated by Khakpash et al. (2019). Figure 21 shows the estimated values of s and q plotted versus their true values. Most values of s are well estimated by the method; the fitted values close to one arise from events caused by the central caustics. While the estimated values of q show a lot of scatter compared to the true values, they are generally estimated to within about one order of magnitude of the true values. Note that this algorithm is slower than the algorithm presented in Khakpash et al. (2019), but it fits a broader range of events.

4.7. Chebyshev Polynomials Fit
Di Stefano & Perna (1997) suggest that, unlike binary-lens microlensing light curves that are significantly perturbed, smooth binary-lens events can be easily misclassified, and that it is therefore useful to develop methods that can distinguish between this type of microlensing event and other types of similar variability. The method should be fast and effective at flagging light curves with smooth features.

Figure 17. Left panel: Planetary caustics of a system with a projected separation of 1.6 and a mass ratio of 0.03 (red). The shapes and sizes of the caustics depend on s and q. Right panel: Planetary caustics of a system with a projected separation of 0.8 and a mass ratio of 0.03 (red). At this range of s, there are two smaller planetary caustics. The distance between the lens and the center of the planetary caustic is equal to |s − 1/s|. The solid black line with the arrow in each panel represents the path of the source through the caustics. The left panel is a reproduction of a similar figure in Poleski et al. (2014).

Table 1. The process taken to calculate s and q.

One Deviation:
1. t_p is determined.
2. u = √(u_0² + ((t_p − t_0)/t_E)²)
3. |s − 1/s| = u is solved for s.
4. A secondary check determines whether s > 1 or s < 1.
5. q = (t_{E,p}/t_E)².

Two Deviations:
1. t_1 and t_2 are determined.
2. u_{1,2} = √(u_0² + ((t_{1,2} − t_0)/t_E)²)
3. LX is found by geometry.
4. |s − 1/s| − LX = 0 is solved for s.
5. A secondary check determines whether s > 1 or s < 1.
6. For s > 1: the height of the caustic (Han 2006 approximation) ⇒ q is found.
7. For s < 1: the distance between the caustics (Han 2006 approximation) ⇒ q is found.

One of the approaches to approximating a function is to expand it as a series of polynomials. For this purpose, Di Stefano & Perna (1997) suggest using the Chebyshev approximation on microlensing light curves. The idea is that the Chebyshev polynomials are useful for capturing the smooth features of binary-lens microlensing light curves without caustic crossings. We have implemented the method of Di Stefano & Perna (1997) and applied it to our simulated data set. For this purpose, we first detect the peaks in the light curve; then, for each peak, we choose the interval around the peak that includes the wings of the event up to a magnification of 1.06, which corresponds to an impact parameter of 2 R_E (Di Stefano & Perna 1997).
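The Chebyshev feature extraction can be sketched with NumPy's Chebyshev utilities (names are ours; we use `numpy.polynomial.chebyshev` in place of the Press et al. (1992) recipes, and note that numpy's convention absorbs the −c_0/2 term of Equation 15):

```python
import numpy as np

def chebyshev_features(t, flux, m=50):
    """Fit Chebyshev polynomials of degree m to a light-curve segment.

    The times are rescaled to [-1, 1], where the polynomials are
    defined, and the feature Lambda is built from the coefficients
    normalized by c0 (the k = 0 term contributes the subtracted 1).
    """
    x = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0  # map times to [-1, 1]
    c = np.polynomial.chebyshev.chebfit(x, flux, m)      # m + 1 coefficients
    lam = -np.log(np.sum((c / c[0]) ** 2) - 1.0)
    return c, lam
```

The coefficient array can also supply the individual even-numbered coefficients used later as classification features.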
Figure 18. An example of the PSPL plus the busy function fitted to the light curve of an event with a true projected separation of s = 0.63. The secondary check determines that s should be less than unity, and the estimated value is s = 0.64.

Figure 19. An example of the PSPL plus the busy function fitted to the light curve of a double-peaked event with a true projected separation of s = 0.62. The secondary check determines that s should be less than unity, and the estimated value is s = 0.59.

We then find the coefficients of the Chebyshev polynomials using the formulation described by Press et al. (1992) and expand the selected parts of the light curves using the expression in Equation 15. The polynomial T_k has k + 1 extrema in the interval [−1, 1] where the Chebyshev polynomials are defined. To use this method, the time coordinates of the selected portion of our light curves should be within this interval, and so we convert the interval of time to be between −1 and 1:

f(x) ≈ [Σ_{k=0}^{m} c_k T_k(x)] − c_0/2.   (15)

In this work, we choose the first 50 polynomials of this series (m = 50) to fit Equation 15 to the light curves, and then, using the coefficients c_k, we calculate

Λ = −log(Σ_{k=0}^{m} (c_k/c_0)² − 1)

as a feature parameter that can be used to distinguish between different light curve types in our simulated data set.

Figure 20. An example of the PSPL plus the busy function fitted to the light curve of a double-peaked event with a true projected separation of s = 1.11. The estimated value is s = 1.02.

In Figure 22, an example of a binary-lens event approximated by Chebyshev polynomials of degree 50 can be seen. It is important to note that the whole light curve is not approximated in this method; only the segment containing the event, excluding gaps, is selected. This is because the Chebyshev approximation is good for approximating functions that have a finite number of extrema in the interval [−1, 1].
Figure 21. The left panel shows the fitted values of q versus the true values; fitted values of q are mostly within one order of magnitude of the true values for the PSPL plus busy function fits (MAD = 1.15). The right panel shows fitted s versus the true values (MAD = 0.13). Most values of s are well estimated by the method; the fitted values close to one occur for events caused by the central caustics, for which our formalism breaks down. Blue dots are events without caustic crossings, and black dots are events with caustic crossings.

Figure 22.
The stellar binary-lens light curve is approximated with the Chebyshev polynomials of degree 50.

Since microlensing events are closer to being an even function than an odd function, the expansion only involves the even terms of the Chebyshev polynomials.

We then check how well different values of Λ match different light curve types. Figure 23 shows that the value of Λ correlates fairly well with four different light curve types. Note that the fractions are calculated for a given range of Λ across each light curve type, horizontally in the figure. Although the fractions of different variability types in the simulated Roman data set are not equal, there are large numbers of each variability type, so the distinctions seen in Figure 23 are still quite robust. We therefore expect that the continuous parameter Λ can be a useful input for an ML algorithm (such as a random forest) to classify event types.

In addition to the feature Λ, we also follow the suggestion of Di Stefano & Perna (1997) to include the individual even-numbered coefficients of the Chebyshev polynomials as classification features.

5. AN ALGORITHMIC APPROACH

In this section, we introduce a suggested algorithmic approach to use the produced features as input to ML classifiers. The goal of this section is to provide an efficient procedure to use these features to train classifiers on a high-cadence data set and detect different types of microlensing light curves. In Table 2, we give a summary of the tools in our package and the associated features. Each of these tools can be applied either to the whole dataset or to a subset of it, and there may be more than a single algorithmic approach to using these features.

Table 2. Summary of the introduced algorithms and their resulting features.

Peak Finder: Number of peaks.
G-PSPL Fit: χ², the goodness of the G-PSPL fit.
Symmetry Check: β, a measure of the asymmetry of the peak.
Trapezoidal Function Fit: κ, the ratio of the duration of the flat top of the trapezoidal function to the total duration; t_{E,trap}, the duration of the trapezoidal function.
Cauchy Distribution Fit: ψ, the difference between the goodness of the PSPL and Cauchy fits; b, a measure of the flatness of the top.
Planetary Parameter Finder: s, the projected separation in units of the Einstein radius; q, the mass ratio.
Chebyshev Polynomials Fit: Λ, a feature computed from the normalized Chebyshev coefficients; the first five even coefficients of the Chebyshev polynomials.

Figure 23. This figure shows the breakdown by variability type for each range of values of the Chebyshev feature Λ, where Λ = −log(Σ_{k=0}^{m} (c_k/c_0)² − 1).

In a paper under development (Khakpash et al., in preparation), we are implementing an algorithmic approach to employing the features discussed in this paper to comprehensively search for microlensing events, classify them by type, and derive preliminary system parameters. For any set of light curve features, there are numerous ways to conduct classification, and our approach is by no means certain to be optimal in every way. We encourage other astronomers to make use of the features described here to develop independent classification methods.

The particular structure or sequence of a classification approach can vary. One might use all the features to detect all the categories at once, or alternatively, find particular classes by doing a step-by-step classification. In an ideal case where there are thousands of examples of each class across a full range of empirical properties of each feature, it is likely better to use all the features to find all of the classes in a one-step classification. Since in this case we have very limited examples of some of the classes, we recommend a step-by-step classification approach.

The first step of classification is to distinguish microlensing light curves from other types of variability (e.g., CVs in our dataset). We refer to this step as classification step I. Although our goal in this work is not focused on this set, we believe some of the features identified in this work can be used to improve currently existing classifiers focused on this task. We have tested this type of classification using features such as the number of peaks found by the Peak Finder; Λ and several of the even Chebyshev coefficients produced by the Chebyshev fit; and χ²_PSPL, t_{E,PSPL}, and β generated by the G-PSPL fit and the Symmetry Check. Assuming we have identified all the microlensing light curves, the next type of classification is to classify them into single-lens versus binary-lens events, which we refer to as step II.
Note that in a real dataset, multiple-lens events also exist; these can either be added as a separate category or included in one category along with the binary-lens events. We use the same features as in step I, excluding the number of peaks.

Once we have single-lens and binary-lens events, we classify the binary-lens events into stellar binary-lens and planetary binary-lens systems (classification step III) using χ²_PSPL and t_E,PSPL produced by the PSPL fit, and s and q produced by the PSPL plus busy fit. Furthermore, single-lens events can be classified into isolated short-timescale events, likely caused by free-floating planets or planets on very wide orbits; stellar PSPL events; and stellar FSPL events (classification step IV). We suggest using κ, Δτ_full, ψ, and b produced by the Trapezoidal Fit and the Cauchy/PSPL Fit to obtain better results. This step will be investigated in future work.

Figure 24 displays a diagram showing our suggested algorithmic approach to using the different features produced in this paper as input to four ML classifiers: a k-nearest neighbors classifier, a decision tree classifier, a random forest classifier, and a neural network classifier. Preliminary results are presented in the next section.

6. PRELIMINARY TESTING OF ML ALGORITHMS

As stated above, there are many ways to implement an approach to identify and characterize microlensing events with classification based on light curve features. We are developing a thorough investigation of that question (Khakpash et al., in preparation), but for now we present a preliminary analysis using a few simple analysis steps based on the classification approach described in §
5. We present the results of training four ML classifiers: a k-nearest neighbors classifier (KNN), a decision tree classifier (DT), a random forest classifier (RF), and a neural network classifier (NN), using the features we introduced in this paper. In order to test these algorithms, we follow the step-by-step scheme of Figure 24. It is important to note again that step IV of the classification in Figure 24 is not tested here and will be pursued in future work. At each step, we set the size of our test set to be 20% of the whole dataset. The test set is then randomly chosen in a 5-fold cross-validation process, and the average scores are reported at the end.

In order to compare the results of these classifiers, we show confusion matrices made with the test set at each step, along with their receiver operating characteristic (ROC) curves. As mentioned in Section 5, we understand that the one-step classification is a more ideal approach, and we tested it with our current data set. However, the limited number of object instances in each class was insufficient for achieving an overall acceptable accuracy. Nevertheless, the isolated confusion matrices of the step-by-step classifications are valuable for evaluating the utility of the features presented in this paper. We plan to thoroughly investigate the one-step classification in a future paper.

At each step, we use the same dataset and features for all four classifiers, but we optimize their hyperparameters separately. The RF, KNN, and DT classifiers are implemented using the scikit-learn package in Python (Pedregosa et al. 2011). The NN classifier is implemented using the TensorFlow and Keras packages in Python (Abadi et al. 2015; Chollet et al. 2015). A summary of the features and hyperparameters of each classifier at each step is given in Table 3.

6.1. Classification Step I
As shown in Figure 24, the first step of the algorithmic approach is to detect microlensing light curves among other stellar variability. We should first note that our current light curve data set is not completely representative of what we expect from the Roman mission, since the non-microlensing variability in our current dataset consists only of cataclysmic variables. Most (but not all) forms of non-microlensing variability are expected to be periodic or quasi-periodic, and thus simply including CVs is a good starting point. Our dataset for this step contains 4181 light curves, among which there are 3752 microlensing light curves (labeled as 1) and 429 non-microlensing light curves (labeled as 0).

We find that the test set and training set accuracies for all four classifiers in this step are very close. A common way to evaluate the results of a classification model is to plot its confusion matrix, a table containing the percentages of both correctly and incorrectly classified objects for each class in the dataset. According to the confusion matrices shown in Figure 25, most of the ML tools can find the microlensing light curves with very small classification error, whereas about 40% of the CV light curves are misclassified. This could be a result of having a small training set or incomplete hyperparameter tuning, or it might indicate the need to include more features.

A more robust method of comparing different ML classifiers is to plot their ROC curves and calculate their values of the area under the curve (AUC); the closer the AUC is to unity, the better the performance of that classifier. An ROC curve plots the true positive rate (TPR) versus the false positive rate (FPR) for different decision thresholds, and the AUC is the area under that curve. Figure 26 shows the ROC curves of the four classifiers trained in step I. The diagonal dashed line represents the ROC curve of a random classification.
The ROC curves show that the NN and RF classifiers have better performance and can achieve a higher TPR while keeping the FPR low.

6.2.
Classification Step II
The second classification step distinguishes between single-lens and binary-lens microlensing light curves, as shown in Figure 24. Our dataset for this step contains 4181 light curves, among which there are 2143 binary-lens microlensing light curves (labeled as 1) and 1626 single-lens microlensing light curves (labeled as 0). For this step, we find that DT and RF have similar training and test accuracy and seem to work better than NN and KNN, although NN works much better than KNN.
Figure 24.
This diagram shows an algorithmic approach for using the features described in this paper as input to ML classifiers.

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 25.
Confusion matrices of the four trained classifiers of classification Step I. In this step, we aim to classify all of the light curves into two classes, CV and microlensing.
[ROC curve legend: KNN, AUC = 0.794; DT, AUC = 0.849; RF, AUC = 0.904; NN, AUC = 0.978.]
Figure 26.
ROC curves of the four trained classifiers of classification Step I along with their AUC values. In this step, we aim to classify all of the light curves into two classes, CV and microlensing. RF and NN have the largest AUC, implying that they are able to achieve a higher TPR while the FPR remains low.

Figure 27 shows confusion matrices of the four classifiers trained for this step. The confusion matrices suggest that RF works better at predicting both labels, whereas NN works best at finding a larger fraction of the binary-lens light curves. Figure 28 shows the ROC curves and AUC values for this step. The classifiers overall work better in this step, mainly because the dataset is more balanced here.

6.3.
Classification Step III
After finding the binary-lens light curves, the next classification step distinguishes between stellar binary-lens and planetary binary-lens microlensing light curves, as shown in Figure 24. Our dataset for this step contains 2074 light curves, among which there are 688 planetary binary-lens microlensing light curves (labeled as 1) and 1386 stellar binary-lens microlensing light curves (labeled as 0).

This classification step is particularly important in this context, since planetary binary-lens systems are the ones that astronomers would most like to distinguish from the rest of the dataset. In our tested example, the number of light curves is lower than in the previous classification steps, which decreases the accuracy of the classifiers; accordingly, we find that the overall accuracy values for this step are smaller than in the previous steps. Additionally, stellar and planetary binary-lens systems are much less distinguishable from each other than the classes in the previous tasks, and for this reason the simpler algorithms, DT and KNN, have lower accuracy. RF and NN have higher overall accuracy, but NN has a higher test accuracy, meaning a larger fraction of the test set is correctly labeled. It seems that an algorithm like NN is more capable of distinguishing between these two categories, which is expected, as NN is theoretically more complex and is designed to find complicated patterns in a data set.

Figure 29 shows confusion matrices of the four classifiers. The confusion matrix of the NN shows a larger value of the TPR compared to all the other confusion matrices, which is favorable since our ultimate goal is to detect planetary microlensing light curves. Figure 30 shows the ROC curves and AUC values of the four classifiers in step III. The dataset in this step is smaller than in the other two steps and is not well balanced; therefore, the performances of the different classifiers are not as well differentiated as in the other steps.
However, NN shows significantly better performance compared to the other classifiers.

7. CONCLUSION AND FUTURE WORK

Classifying light curves that manifest different types of stellar variability is still a major challenge. Although ML methods are a powerful tool for classifying astronomical time series, it is important to understand which method to choose and what steps to take in order to use these methods effectively. In this paper, we introduce a new approach to classifying microlensing light curves based on light curve morphologies. We introduce a package of tools, including several functional fits, that can be applied to a light curve in a fast and efficient way to extract information about its different morphological features. This information is quantified as features that can be used to make decisions about the light curve types.

This approach will be useful for analyzing large microlensing data sets from future high-cadence surveys like Roman (Spergel et al. 2015) and ongoing surveys like KMTNet (Kim et al. 2010). Our preliminary results in Section 6 show that the features produced by our tools can be used as input to ML classifiers like Random Forest to distinguish between different types of microlensing light curves, and to help prioritize the ones that are more likely to be caused by planetary systems.

Table 3. Classifier hyperparameters and features.
Step I
  Features: Λ, three even Chebyshev coefficients, β, number of peaks (bin size = …, threshold = … & …), χ²_PSPL, t_E,PSPL
  KNN: metric = "manhattan", n_neighbors = …
  DT: criterion = "gini", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, criterion = "entropy", bootstrap = False
  NN: dense layers, dropout layers, activation = "relu", batch = …, epochs = …

Step II
  Features: Λ, three even Chebyshev coefficients, β, χ²_PSPL, t_E,PSPL
  KNN: metric = "euclidean", n_neighbors = …
  DT: criterion = "entropy", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, max_depth = …
  NN: dense layers, dropout layers, activation = "relu", batch = …, epochs = …

Step III
  Features: Λ, three even Chebyshev coefficients, χ²_PSPL, t_E,PSPL, s, q
  KNN: metric = "minkowski", n_neighbors = …, p = …
  DT: criterion = "gini", max_depth = …, splitter = "best", max_features = …, min_samples_leaf = …
  RF: n_estimators = …, max_depth = …
  NN: dense layers, dropout layers, activation = "tanh", batch = …, epochs = …

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 27.
Confusion matrices of the four trained classifiers of classification Step II. At this stage, we assume that the microlensing light curves have already been detected, and our goal is to classify them into groups of single-lens and binary-lens microlensing light curves.

[ROC curve legend: KNN, AUC = 0.92; DT, AUC = 0.918; RF, AUC = 0.943; NN, AUC = 0.939.]
Figure 28.
ROC curves of the four trained classifiers of classification Step II along with their AUC values. In this step, we classify the microlensing light curves into groups of single-lens and binary-lens microlensing light curves. The dataset is more balanced in this step, and all of the classifiers seem to work well; RF and NN still show better performance.
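The per-step hyperparameter optimization summarized in Table 3 can be reproduced with an exhaustive grid search. A minimal sketch for the Random Forest is shown below; the candidate values are illustrative assumptions, since the tuned values themselves are not reproduced here, and the synthetic feature table stands in for one classification step.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Hypothetical feature table standing in for one classification step.
X, y = make_classification(n_samples=300, n_features=8, random_state=2)

# Hyperparameter names follow Table 3; the candidate values are assumptions.
grid = {
    "n_estimators": [50, 100],
    "criterion": ["gini", "entropy"],
    "max_depth": [5, None],
}

# Exhaustive search over the grid with stratified 5-fold cross-validation,
# mirroring the separate per-classifier tuning described in the text.
search = GridSearchCV(
    RandomForestClassifier(random_state=2),
    grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=2),
)
search.fit(X, y)
best_params, best_score = search.best_params_, search.best_score_
```

The same pattern applies to the KNN and DT hyperparameters of Table 3 (e.g., `metric` and `n_neighbors` for `KNeighborsClassifier`), with each classifier tuned on the same feature table for its step.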
The simulated data used in this paper included CV-like events as the only non-microlensing instances. An ideal dataset would include a variety of stellar variability, and our package should be extended to include all other variability in its analysis. We believe that the same approach will be successful in recovering microlensing events from a wide variety of non-microlensing variability on its own.

There are currently a number of methods that attempt to categorize photometric variability in large data sets, such as parametric statistical methods and ML. ML methods for detecting different types of variability are becoming more common (e.g., Richards et al. 2011; Pichara 2013; Pashchenko 2017; Valenzuela 2017). For example, as mentioned before, Godines et al. (2019) trained a Random Forest classifier to detect microlensing light curves in a low-cadence survey dataset in real time. Our features can complement the feature sets of such algorithms, not only improving their results but also giving them the ability to detect possible planetary binary lenses.

In ML classifiers, increasing the number of object instances can significantly improve the results. Our largest training set included 4181 light curves, which is not a large number for most ML applications. Increasing this number to about ∼ ,
000 light curves would yield more robust and reliable results. Some of the tools presented in this paper produce other parameters as well, and including different subsets of those parameters could also improve the results. This needs to be done carefully, though, since adding unimportant features might result in overfitting. Additionally, a promising avenue for improving the results would be to test other classifiers, like the Support Vector Classifier and Naive Bayes, and to investigate deeper neural networks. The improvement of the data set and the ML algorithms will be presented in the second paper of this series.

ACKNOWLEDGEMENTS

S.K. thanks the LSSTC Data Science Fellowship Program, which is funded by LSSTC, NSF Cybertraining Grant
REFERENCES

Abadi, M., Agarwal, A., Barham, P., et al. 2015, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, software available from tensorflow.org
Belokurov, V., Evans, N. W., & Du, Y. L. 2003, Monthly Notices of the Royal Astronomical Society, 341, 1373
Bennett, D., Becker, A. C., Quinn, J., et al. 2002, The Astrophysical Journal, 579, 639
Bluck, A. F., Maiolino, R., Sánchez, S. F., et al. 2020, Monthly Notices of the Royal Astronomical Society, 492, 96
Bond, I., Abe, F., Dodd, R., et al. 2001, Monthly Notices of the Royal Astronomical Society, 327, 868
Bozza, V. 2000, Journal of Mathematical Physics, 41, 6284
Chollet, F., et al. 2015, Keras, https://keras.io
Di Stefano, R., & Perna, R. 1997, The Astrophysical Journal, 488, 55
Gaudi, B. S. 2010, arXiv preprint arXiv:1002.0332

[Figure panels: (a) Random Forest classifier; (b) Neural Network classifier; (c) Decision Tree classifier; (d) k-nearest Neighbors.]

Figure 29.
Confusion matrices of the four trained classifiers of classification Step III. Once the binary-lens microlensing light curves are found, we use this classification step to classify them into stellar and planetary binary-lens microlensing light curves.

Gaudi, B. S. 2012, Annual Review of Astronomy and Astrophysics, 50, 411
Godines, D., Bachelet, E., Narayan, G., & Street, R. 2019, Astronomy and Computing, 28, 100298
Gould, A. 1996, The Astrophysical Journal, 470, 201
Gould, A., & Gaucherel, C. 1996, arXiv preprint astro-ph/9606105
Griest, K., Alcock, C., Allsman, R., et al. 1995, arXiv preprint astro-ph/9506016
Han, C. 2006, The Astrophysical Journal, 638, 1080
Han, C., Bennett, D. P., Udalski, A., & Jung, Y. K. 2016, The Astrophysical Journal, 825, 8
Jiang, G., DePoy, D., Gal-Yam, A., et al. 2004, The Astrophysical Journal, 617, 1307
Johnson, S. A., Penny, M. T., Gaudi, B. S., et al. 2020, arXiv preprint arXiv:2006.10760
Kessler, R., Narayan, G., Avelino, A., et al. 2019, arXiv preprint arXiv:1903.11756
Khakpash, S., Penny, M., & Pepper, J. 2019, The Astronomical Journal, 158, 9
Kim, D.-J., Kim, H.-W., Hwang, K.-H., et al. 2018, The Astronomical Journal, 155, 76
Kim, S.-L., Park, B.-G., Lee, C.-U., et al. 2010, in Ground-based and Airborne Telescopes III, Vol. 7733, International Society for Optics and Photonics, 77333F
Kim, S.-L., Lee, C.-U., Park, B.-G., et al. 2016, Journal of the Korean Astronomical Society, 49, 37
Liaw, A., & Wiener, M. 2002, R News, 2, 18
Lloyd, S. 1982, IEEE Transactions on Information Theory, 28, 129
Mao, S., & Di Stefano, R. 1994, Interpretation of Gravitational Microlensing by Binary Systems, Tech. rep., SCAN-9411403
Mao, S., Smith, M. C., Woźniak, P., et al. 2002, Monthly Notices of the Royal Astronomical Society, 329, 349
Mróz, P., Udalski, A., Skowron, J., et al. 2017, Nature, 548, 183
Mróz, P., Ryu, Y.-H., Skowron, J., et al.
2018, The Astronomical Journal, 155, 121

[Figure 30: ROC curve legend: KNN, AUC = 0.661; DT, AUC = 0.68; RF, AUC = 0.724; NN, AUC = 0.835.]