The digest2 NEO Classification Code
Sonia Keys, Peter Vereš, Matthew J. Payne, Matthew J. Holman, Robert Jedicke, Gareth V. Williams, Tim Spahr, David J. Asher, Carl Hergenrother
DD RAFT VERSION A PRIL
22, 2019Typeset using L A TEX twocolumn style in AASTeX61
THE digest2
NEO CLASSIFICATION CODE S ONIA K EYS , P ETER V EREŠ , M ATTHEW
J. P
AYNE , M ATTHEW
J. H
OLMAN , R OBERT J EDICKE , G ARETH
V. W
ILLIAMS , T IM S PAHR , D AVID
J. A
SHER , AND C ARL H ERGENROTHER Harvard-Smithsonian Center for Astrophysics, 60 Garden St., MS 51, Cambridge, MA 02138, USA Institute for Astronomy, 2680 Woodlawn Drive, Honolulu, HI 96822, USA NEO Sciences, LLC, 1308 Applebriar Lane, Marlborough, MA 01752, USA Armagh Observatory and Planetarium, College Hill, Armagh, BT61 9DG, UK Lunar and Planetary Laboratory, University of Arizona, Tucson, AZ 85721, USA
Submitted to PASP, Day Month YearABSTRACTWe describe the digest2 software package, a fast, short-arc orbit classifier for small Solar System bodies. The digest2 algorithmhas been serving the community for more than 13 years. The code provides a score, D , which represents a pseudo-probabilitythat a tracklet belongs to a given Solar System orbit type. digest2 is primarily used as a classifier for Near-Earth Object (NEO)candidates, to identify those to be prioritized for follow-up observation. We describe the historical development of digest2 anddemonstrate its use on real and synthetic data. We find that digest2 can accurately and precisely distinguish NEOs from non-NEOs. At the time of detection, 14% of NEO tracklets and 98.5% of non-NEOs tracklets have D below the critical value of D = 65. 94% of our simulated NEOs achieved the maximum D = 100 and 99.6% of NEOs achieved D ≥
65 at least onceduring the simulated 10-year timeframe. We demonstrate that D varies as a function of time, rate of motion, magnitude andsky-plane location, and show that NEOs tend to have lower D at low Solar elongations close to the ecliptic. We use our findingsto recommend future development directions for the digest2 code. Keywords: short-arc orbit determination, asteroids, Near-Earth Object, Minor Planet Center
Corresponding author: Peter Vereš[email protected] a r X i v : . [ a s t r o - ph . E P ] A p r K EYS ET AL . INTRODUCTIONNear-Earth Objects (NEOs), defined as any small SolarSystem body having perihelion less than 1.3 au, are of sig-nificant interest for a number of reasons. These include plan-etary defense, spacecraft missions, commercial development,and investigations into the origin and evolution of the SolarSystem. The NEO discovery rate has risen rapidly in recentdecades, driven by the 1998 Congressional mandate to dis-cover 90% of NEOs larger than 1 km (the Spaceguard Sur-vey Morrison 1992), and the subsequent 2005 Congressionalmandate to discover 90% of NEOs larger than 140 m (GeorgeE. Brown, Jr. NEO Survey Act ).Solar System objects are typically identified through theirapparent motion in a series of images spanning minutes tohours. Given such a sequence of exposures, it is straight-forward to identify “tracklets” (Kubica et al. 2007), sets ofdetections that are consistent with an object with a fixed rateof motion. A related term is an arc , which may contain anynumber of tracklets, possibly from different observatories.The time span from first to last observation is termed the arc-length .It is straightforward to determine whether a tracklet rep-resents a known object with a well defined orbit. In thatcase, the observations coincide with the predicted positionsfor the known object. Otherwise, the tracklet likely repre-sents a newly discovered object. It is then of interest to deter-mine what type of orbit that object has, and in particular, ifit is a new NEO. Given limited telescopic resources, it is notpossible to re-observe all objects to immediately determinetheir orbits. It is therefore essential to identify which objectsare more likely to be NEOs and prioritize those for furtherobservation.Although a tracklet does not uniquely determine an orbit,characteristic sky-plane motion can be used to infer a possi-ble orbit type (e.g. Jedicke 1996). The tool currently used forthis is digest2 . The digest2 code uses only a single motionvector, derived from a tracklet or short arc, to identify all pos-sible elliptical orbits consistent with that motion. This set ofpossible orbits is then divided into disjoint orbital categories.Given a population model that represents the number of solarsystem objects of each type, we can estimate the likelihoodthat the tracklet represent a member of each category.With a suitable threshold, digest2 can serve as an NEObinary classifier. Tracklets with high digest2 scores, D ,are posted on the Near-Earth Object Confirmation Page Section 321 of the NASA Authorization Act of 2005 (Public Law No.109-155) There is no known history of a digest1; this is not version 2 of a pro-gram. Also nothing is known about the origin or meaning of the name. Theprogram name is simply digest2 . (NEOCP ). These objects are prioritized by the NEO follow-up community for additional observations.In this manuscript we describe the digest2 code. The codeas implemented has been used by the community for a num-ber of years. Our primary goal is to describe and documentthe major elements of digest2 and the practical consequencesof its design and implementation. Although we highlightpossible areas of improvement, these are left for future work. METHODOLOGYThe observations from a short-arc tracklet directly con-strain the position, ( α, δ ), and motion, ( ˙ α, ˙ δ ), of the asteroidin the sky, but the topocentric radial-distance, ρ , and radial-velocity, ˙ ρ , between the asteroid and the observer are essen-tially unconstrained. This is the fundamental challenge oforbit determination.An assumed ρ allows a heliocentric position to be derived(see Appendix A), and with the assumption of a topocentric ˙ ρ , a heliocentric velocity can also be calculated. Milani et al.(2004) refer to the “admissible region” as being the set of( ρ, ˙ ρ ) for which the resultant orbit is heliocentrically bound(elliptical w.r.t. the Sun).Many authors have addressed the problem of short-arc or-bit determination and constraint (e.g. Väisälä 1939; Mars-den 1985; Bowell 1989; Bowell et al. 1990; Marsden 1991;Tholen & Whiteley 2000; Milani et al. 2004; Milani &Kneževi´c 2005; Oszkiewicz et al. 2009; Spoto et al. 2018,and discussions therein). Of particular relevance to digest2 ,in the late 1980s and early 1990s, R. H. McNaught of theSiding Spring Observatory undertook an extensive effort todetermine allowable orbital element ranges and class-specificobject probabilities based on two observations and a magni-tude (P ANGLOSS
McNaught 1999). The method proceededby stepping through a range of possible topocentric distancesand angles (between the velocity vector and the line of sight)which were consistent with the observations and bound to thesun. P
ANGLOSS used a population model to assign weightsto different orbit classes. The P
ANGLOSS code provided thefoundation for the digest2 code described in this manuscript.We review two related methods for addressing the problemof short-arc orbits. The first concerns the technique referredto as “statistical ranging” (Virtanen et al. 2001), in whichtwo observations are selected from the tracklet, thus fixing( α, ˙ α, δ, ˙ δ ), and then topocentric ranges at each of the epochsof the two observations are randomly chosen and a corre-sponding orbit computed from the admissible region that iscompatible with the observational tracklet data. This processis repeated over many random topocentric distances to gen-erate a set of compatible orbits. HE digest2 NEO C
LASSIFICATION C ODE ρ, ˙ ρ ) space is performed, generating anorbit for each and comparing to all tracklet observations. TheRMS of the fit residuals for each ( ρ, ˙ ρ ) point indicates thequality of the fit, and χ probabilities can be used to deriveconfidence regions in ( ρ, ˙ ρ ) space.While digest2 is not directly derived from either Virtanenet al. (2001) or Chesley (2005) (and the P ANGLOSS rangingcode, from which digest2 is derived, predates both), the di-gest2 code uses the same fundamental approach that is com-mon to both statistical and systematic ranging techniques:sets of bound heliocentric orbits are generated that all satisfythe short-arc observations.2.1.
Historical Development of digest2As described above, the early origins of the digest2 werethe P
ANGLOSS code developed by R. McNaught. The cur-rent version of digest2 evolved from a FORTRAN 77 code(“ ”) employed in Jedicke (1996), and then furtherdeveloped by C. Hergenrother and T. Spahr from the P AN - GLOSS code of R. McNaught. It accepted observation files inthe MPC’s 80-character format , encoded the vector solutiondescribed in section 3.2, employed the parabolic limit de-scribed in appendix A, and used the model population look-up described in section 3.3.Since the first publication, digest2 has undergone many in-cremental improvements. In Appendix B.1, we provide a de-tailed description of the key changes to the digest2 . COMPUTATIONAL ELEMENTS OF digest2
In this section we describe the main components of the cur-rent version of digest2 and explain the algorithmic choices inits development. We emphasize that we are providing a fac-tual report of how the code was implemented, and hence howthe code has been operating for more than a decade. Sig-nificant improvements are possible in future versions, but thefocus of this paper is documenting the development, features,and performance of the existing code.We discuss digest2 ’s key algorithmic steps in Sections 3.1to 3.5, and then summarize them in Section 3.6 and Algo-rithm Table 1. 3.1.
Endpoint Synthesis
The algorithm starts by reading tracklet and population-model data from input files. Since version 11 (June 2011), digest2 considers all observations in the arc and synthesizesend points to define a motion vector, with the aim of improv-ing the robustness of digest2 against both bad observations https://minorplanetcenter.net/iau/info/ObsFormat.html Figure 1.
Schematic illustration of the construction of two-elementtracklets within digest2 for short-arcs ( < hrs : top-left), space-based data (top-right), and long-arcs ( ≥ hrs : bottom). Detectionsare plotted as red circles along their underlying path (gray dottedline); The two-element tracklets constructed by digest2 are the bluesquares and dashed lines. A detailed description of the algorithmused to construct the two-element tracklets is provided in the text of§3.1. and statistical observational error. Figure 1 demonstrates anumber of cases handled by the code. Short-Arcs:
Panel A illustrates relatively short arcs, withall observations from the same site. At the top, both obser-vations in the two-observation-arc are used as-is. For thesecond arc, a more typical tracklet with four observations,a great circle linear motion vector is fit to the observations.Endpoints are synthesized near the 17th and 83rd percentileof the observations in the arc, instead of near the extremeendpoints. This is not to account for positional uncertainty,but rather is an approximate method to account for cases inwhich excessive measurement error or poor astrometry cansignificantly shift the endpoint, and thus change the shapeof the admissible region. . For the third arc, synthesizingendpoints near the 17th and 83rd percentile yields syntheticobservations within the tracklet. The fourth arc shows sig-nificant curvature, with the synthesized linear motion vectorexhibiting significant residuals w.r.t all observations. The justification for the particular choice of 17th and 83rd percentilesis unknown to the surviving authors. Moreover, we emphasize that futureversions of the digest2 code will improve on the handling of measurementuncertainty and tracklet construction. K EYS ET AL . Space-based observations : As illustrated in Panel B,space-based observations often show strong curvature on thesky due to parallax as the spacecraft orbits the earth. In thiscase interpolation on a great circle fit is not meaningful. Ifany observations in an arc are space-based, two actual obser-vations are selected, near the 17th and 83rd percentile.
Longer-Arcs:
Panel C shows cases of longer arcs and/orarcs with multiple observing sites. Here, two sub-arcs areselected where each sub-arc is < 3 hours and all detectionsin a sub-arc are from the same site. These two sub-arcs arethen separately reduced to yield the motion vector end points.In the top arc of Panel C, sub-arcs can be selected to use allavailable observations. A separate great circle linear motionfit is then applied to each sub-arc and observations are syn-thesized, again near the 17th and 83rd percentiles. The sec-ond arc illustrates how the conditions that sub-arcs be < 3hours and all from the same site may leave a number of ob-servations unused. The third arc shows a case where the 83rdpercentile is outside of the end sub-arc. In this case no pointis synthesized outside the sub-arc, rather the first observationof the sub-arc is used as-is.3.1.1.
Dithering and Observational Uncertainty
Figure 2.
Schematic illustration of the construction of “dithered”end-points to a two-point tracklet constructed as per Figure 1, Sec-tion 3.1.1. The nominal end-points (large blue squares) are variedwithin the uncertainties, σ , of the input detections by adding ± . σ to one-or-both of the RA and Dec values of the end points, produc-ing 8 additional deterministic variations (small gray symbols). Werefer the reader to the main text of Section 3.1.1 for a brief discus-sion of the short-comings of this approach. Once a two-point tracklet has been constructed as de-scribed in Section 3.1, the end-points are varied within theuncertainties, σ , of the input detections. The nominal end- https://minorplanetcenter.net/iau/info/SatelliteObs.html points are varied by adding ± . σ to either (or both) of theRA and Dec values of the initial end points, producing a to-tal of 9 variant tracklets: the initial one, plus the additional 8variations illustrated in Figure 2.The astrometric uncertainties used for σ can be definedper observatory code in the digest2 configuration file. Thecurrent MPC settings of uncertainties and keywords are pre-sented in Table 6 of Appendix D.While this approach has the effect of producing a range ofdifferent motion vectors, it should be emphasized that thismethod of dealing with astrometric uncertainty is not idealfrom a statistical viewpoint, omitting occasional astrometrythat is worse than our statistics, or significantly better dueto improved astrometric catalog or measurement. Improve-ments expected in the near future will employ astrometricuncertainties directly reported with the submitted astrometryon an individual detection basis to handle observational un-certainty. 3.2. Orbit Solution
Given a nominated observer-object distance, D , and a nom-inated angle, α , between the observer-object unit vector, (cid:126) d ,and the object velocity vector, a specific orbit can be com-puted (Appendix A). The H -magnitude can be calculated foreach orbit using the geocentric vector (cid:126) ∆ = D (cid:126) d and the helio-centric distance of the object, (cid:126) r , to get the object phase angle Φ , which, with the apparent magnitude V , gives H in the IAUadopted H-G magnitude system Bowell et al. (1989).3.3. Population Model used by digest2The digest2 code requires a population model againstwhich tracklet-derived-orbits can be compared, representingthe number of solar system objects in a variety of dynamicalcategories. digest2 uses two different population models.The first is a full population model including all objectsdown to a given diameter. The second is the difference be-tween the full population model and the known orbit catalog,representing the undiscovered population.3.3.1.
Full Population Model digest2 uses the term “raw” to indicate when the full pop-ulation model is used to score tracklet-derived orbits. Thecurrent version of digest2 uses the Pan-STARRS SyntheticSolar System Model (S3M) of Grav et al. (2011), consistingof over 14 million simulated Keplerian orbits. digest2 bins S3M into 15 different orbit classes (Table 8 inAppendix D). The model uses bins in perihelion ( q ), eccen-tricity ( e ), inclination ( i ) and absolute magnitude ( H ). Bin-ning in perihelion, rather than semi-major axis a , enables abin cut at q = 1 . HE digest2 NEO C
LASSIFICATION C ODE Desig. RMS Int NEO N22 N18 Other PossibilitiesNE00030 0.15 100 100 36 0NE00199 0.56 98 98 17 0 (MC 2) (JFC 1)NE00269 0.42 24 23 4 0 (MC 7) (Hun 3) (Pho 15) (MB1 <1) (Han <1) (MB2 41) (MB3 5) (JFC 1)
Figure 3.
Sample output from the digest2 code showing scores for 3 tracklets. Columns from left: Designation of a tracklet, root-mean-square(RMS) of great-circle-fit of the tracklet in arcseconds, and D for the orbit classes (Int, NEO, N22, N18, Other Possibilities) described in Table 8of Appendix D. space, while simultaneously reducing the number of emptyand near-empty bins in low density regions. The bin bound-aries and counts for the full population are depicted in Fig-ures 19 in Appendix E. The binned population and modelreduction is done by the MUK program .3.3.2. Unknown Population digest2 also allows a comparison of tracklet-derived or-bits against the likely population of “undiscovered” or “un-known” objects, referred to as the “no ID” model. An ar-gument can be made that “unknown” objects can be moreaccurately scored against a population model that excludesobjects typical of those that are already known.Given a binned version of a full population model (as de-scribed in Section 3.3.1), the desired “unknown” popula-tion model is constructed by reducing bin populations bythe number of cataloged objects with well determined or-bits. Given a metric for orbit quality, an orbit is selectedfrom a catalog of known objects and the orbit is graded asto whether it would likely be identified. If the catalog or-bit is identifiable, the count of objects in the appropriate binwithin the full, modelled population can then be reduced. Re-peating this procedure for all well-known objects reduces thefull population model to the “unknown” population model.This model reduction and bin selection is performed by theabove-mentioned program
MUK .The known catalog currently used by
MUK is astorb.dat Bowell et al. (1994), Bowell (2018). Its orbit quality metric is astorb field 24 (Muinonen & Bowell 1993), correspondingto the peak ephemeris uncertainty over a ten year period. ThePeak ephemeris uncertainty is observationally motivated, of-fers a direct comparison regarding how secure the object isagainst getting lost, and is the most generic way to compressthe information of the covariance matrix into a scalar.
MUK considers such an orbit to be determined well enough thatits identification with a tracklet would be trivial. For each ofthese orbits, the corresponding model bin is decremented byone, unless the bin population reaches zero. MUK is available from https://bitbucket.org/mpcdev/d2model/src/master/muk.c astorb.dat , and is available from ftp://ftp.lowell.edu/pub/elgb/astorb.dat.gz A binned population model divided into bins for “raw” and“no id” populations is distributed with the digest2 sourcecode and updated when a new version of digest2 code isavailable. 3.4.
Searching for bins
Given an orbit calculated as described in Section 3.2, thenext step is to assign the orbit to the appropriate ( q , e , i , H )population-bin. The algorithm searches over a range of dis-tances, D , and angles, α , within the admissible region (Sec-tion 2). There are a number of possible approaches to select-ing a set of points for evaluation within this region. digest2 does not use a χ (or RMS) approach, nor does it generate afixed grid of points, but instead utilizes a binary search ap-proach (See Appendix C).3.5. Class Score
Given the set of bins identified in Section 3.4, a “classscore” is calculated over those bins. These scores are cal-culated for any of the classes listed in Section 3.3 that areselected by the user at run-time as being of interest.For a specified orbit class of interest, a sum Σ class is accu-mulated based on the modeled population consistent with theorbit class. Another sum, Σ other , is accumulated for the mod-eled population consistent with orbits not of the class. Theclass score S c is given by S c = 100 Σ class Σ class + Σ other (1)Therefore, the digest2 score, D , ranges from 0 to 100 inany given class. The example in Figure 3 demonstrates arange of D values for three tracklets of different objects, af-ter execution of the compiled digest2 program with its sup-plied sample input files .The output generated by digest2 lists scores for tracklets onindividual rows. The number of displayed columns as wellas the configuration of digest2 can be controlled using key-words in the configuration file (See Table 7 of Appendix D).The first object (NE00030) appears to be an NEO and an “In-teresting” object based on D = 100. However, its scores for https://bitbucket.org/mpcdev/digest2/overview https://bitbucket.org/mpcdev/digest2/downloads/ K EYS ET AL .being a large NEO ( N N
18) are rather low, so this objectis therefore most likely a small NEO with H >
22. The sec-ond object ( NE NE D < NE Algorithmic Summary
In Algorithm Table 1, we summarize the key algorithmicsteps employed in digest2 , These are: (1) the reduction of thetracklet observations to a single motion vector (Line 4), (2)the sampling of the distance, D , and velocity angle, α (Lines7-11), (3) the generation of orbital elements and populationcomparison (Lines 13-14), leading to (4) the accumulation ofa score (Lines 15-16). Algorithm 1: digest2 essential algorithmSee Appendix A and C for more details. Open observation file Load binned population model into memory for each arc of identical designations in file do Select or derive two positions from arc Compute a magnitude V from arc Initialize score accumulators for a number of nominated distances D do Compute distance dependent vectors { (cid:126) v D } Compute angle limits α , α from { (cid:126) v D } Compute H from V and { (cid:126) v D } for nominated angles α in range α , α do Compute state vector at α Compute orbital elements from state vector Look-up bin population for elements Accumulate values needed for scoring Compute score from accumulated values Output arc results, minimally designation and score
As described in Section 3.2, digest2 generates a large num-ber of different observer-object distances ( D ) and angles ( α )between the observer-object unit vector and the object ve-locity vector, each pair of ( D , α ) yielding different keplerianorbital elements. The distance, D , and angle α , are selectedin order to construct a bound heliocentric orbit.For each candidate orbit, the elements index a bin (Sec-tion 3.4) of a population model of the Solar System (Sec-tion 3.3). Indexed bin populations then represent populationsof modeled objects consistent with the input arc. For a dy-namic class of interest, such as NEOs, population sums canbe accumulated that allow score calculation (Section 3.5). A Table 1.
Individual NEOs (top) and non-NEOs (bottom) selectedfor detailed study in Section 4.1. q e i
Designation H PHA Orbit type [ au ] − [ deg ](198752) 19.8 No NEO Amor 1 .
21 0 .
52 1 . .
02 0 .
22 5 . .
89 0 .
67 14 . .
95 0 .
21 23 . .
80 0 .
12 6 . .
89 0 .
40 25 . .
56 0 .
40 29 . .
60 0 .
13 23 . .
12 0 .
22 14 . .
87 0 .
06 16 . score indicates a quasi-likelihood that the input arc representsan object of the class of interest. digest2 SCORE ANALYSISThe primary use of the digest2 code has been as a binaryclassifier for NEOs, in which a tracklet’s NEO D score is compared to a critical value, D , crit Tracklets with D > D , crit are considered by the MPC to be eligible for submis-sion to the Near-Earth Object Confirmation Page (NEOCP).The rationale and history for selecting a given score is de-scribed later at the end of Section 4.3. Submitting newly dis-covered tracklets to the NEOCP allows them to be prioritizedfor follow-up by the community, allowing their arcs to be ex-tended, and a more precise orbit to be determined for them,ultimately determining whether the object is indeed an NEO.The value of D , crit used to decide whether a tracklet isadmissible to the NEOCP has varied since the introductionof digest2 . This has changed both the number of trackletsadmitted to the NEOCP and the purity of the tracklets (whatfraction are NEOs) on the NEOCP. The detailed effects ofthese changes are discussed at length in Section 4.3.4.1. Application of digest2 to Individual Objects
We provide illustrative examples of the digest2 score fora number of representative objects from the MPC database.The objects and their orbital classifications are listed in Table1. Because the observing cadence for object tracklets sub-mitted to the MPC is rather sparse, we generated dailyephemerides and synthesized tracklets for our sample over a Unless otherwise indicated, in the remainder of this work we will un-derstand D to refer to the NEO "no id" (NID) digest2 score. Since 2012, D , crit = 65. HE digest2 NEO C
LASSIFICATION C ODE Figure 4. Top:
Examples of the variation in digest2 score as afunction of time for the NEOs in Table 1.
Bottom:
Examples ofthe variation in digest2 score as a function of time for the non-NEOobjects in Table 1. Thick lines indicate when the objects were practi-cably observable. All plotted NEOs have a maximum digest2 score ∼ <
65. The sample non-NEOs display a wide range of behaviors.Some, such as the MBA “2006 EW1” always display low digest2 scores. In contrast, some objects such as the Mars-crosser “2013BX45” spend sizable fractions of the time with a score >
65. Inthe remainder of Section 4 we examine the primary drivers of thisvariation. period of 10 years. For each object, the orbit was numericallyintegrated and an ephemeris was generated for two epochsper night, separated by 20 minutes. The pair of points withina night is then transformed into Geocentric (RA,Dec) coor-dinates, and used to define a tracklet for each night. Eachtracklet is then used as an input to the digest2 code, and ascore generated.All of the NEOs plotted in Figure 4 have a maximum di-gest2 score ∼ Table 2.
Description of population data sets used in this work. The
MPC D , , MPC D , and MPC A , data sets contain real observations,the Sim set contains synthetic observations based on real orbits,while
LSST and
Granvik contain synthetic observations based onsynthetic orbits.Data Set Type Timespan Tracklets Objects
MPC D , Real (All) 1-Year 14,003 14,003
MPC D , Real (All) 1-Year 18,715 18,715
MPC A , Real (All) 1-Month 374,406 143,983
Sim
Synthetic (NEO) 10-Years 3 . × . × . × Granvik Synthetic (NEO) 1-Year 618,951 14,458 which their score is <
65. The non -NEOs in Figure 4 havea wide range of behaviors. Some, such as the MBA “2006EW1” always display low digest2 scores. In contrast, someobjects such as the Mars-crosser “2013 BX45” spend sizablefractions of the time with a score > digest2 score D , crit = 65 is used by theMPC to post objects to the NEOCP and prioritize them forobservational follow-up, it follows from Figure 4 that, (a) anNEO could fall below D , crit for some fraction of the time it isvisible, and hence “miss-out” on being sent to the NEOCP,and (b) non-NEOs can gain scores higher than D , crit , andhence be “unnecessarily” sent to the NEOCP. In the follow-ing sections we quantify the frequency of such scenarios.In Figure 5 we plot D as a function of ecliptic coordinates(with respect to opposition) and the rate of motion. The plot-ted positions are for the time of visibility depicted in Fig-ure 4. It is clear from the middle plot of Figure 5 that thevalue of the digest2 score is strongly dependent on the rate-of-motion and latitude. While this generally serves to dis-tinguish NEOs from non-NEOs, we can see that some of thenon-NEOs spend some fraction of their time with relativelyhigh rates-of-motion and/or high-ecliptic latitudes. This in-evitably leads to confusion between NEO and non-NEO, asevidenced by the high D scores for some of the non-NEOs.4.2. Application of digest2 to Populations
Data Sets and Simulations
To investigate the behaviour of D for a large population,we computed D for the data sets listed in Table 2. Thedata sets MPC D , and MPC D , contain all discovery track-lets submitted to the MPC in 2011 and 2017 respectively,thus containing tracklets of all types of orbit. We use both A discovery tracklet is the first reported tracklet of an object K EYS ET AL . Figure 5.
Colormaps of the variation in digest2 score as a function of opposition-centered ecliptic latitude, longitude and the rate of motionduring the times of visibility of the individual objects from Figure 4. The large diameter circles are the NEOs, and the small points are thenon-NEOs. It is clear from the center plot that the value of the NEO digest2 score is strongly dependent on the rate-of-motion and latitude.
MPC D , and MPC D , data sets because in 2011 the MPCused D , Crit = 50 for posting to the NEOCP, whereas in 2017the MPC used D , Crit = 65, allowing us insight into the effectsof different value of D , Crit used in the past.
MPC A , contains a month’s worth of all tracklets submit-ted to the MPC, containing all types of submitted orbit. Weselected January 2017 as a time when Jupiter’s Trojans wereobservable near opposition. We only selected tracklets fainterthan 18 magnitude, to work in the regime of newly discov-ered objects.The Sim data set was obtained by integrating the knowncatalog of NEOs (as of July 2018) with H >
18 for 10 years,starting on January 1, 2008 and deriving a daily ephemerisfrom the Geocenter, providing a two-detection tracklet span-ning 1-hour for D computation. To simulate accessibilityfor ground-based telescopes, we constrained the maximumV-band magnitude to 22.5 and distance from opposition tobe within ±
100 degrees in longitude and ±
70 degrees in lat-itude. 90% of objects in the catalog were observable withinthe
Sim data.The
LSST data set contains tracklets from 9-nights worthof a simulated LSST survey (Vereš & Chesley 2017). Thedata set contains NEOs from Granvik et al. (2018) and Main-Belt asteroids from Grav et al. (2011).The
Granvik data set contains NEO tracklets for the H >
18 objects in the synthetic population of Granvik et al. (2018)that we propagated for one calendar year and then usedthe resulting ephemerides to compute the value of D on anightly basis. We employed the same observability crite-ria for the limiting V-band magnitude and the opposition-centered ecliptic coordinates as we used on the Sim data set.4.2.2.
Distribution of digest2
Scores
The overall distribution of scores for the
MPC A , data setcan be seen in Figure 6. This data set contains various ob- Figure 6. Top:
Distribution of digest2 scores for 1-Month’s worthof all tracklets submitted to the MPC (
MPC A , ). Bottom:
Cumula-tive version of the top plot. The majority of NEOs have D ∼ D ≈
0. Trojans are particu-larly numerous for D ⊂ (55 − ject types. We plot histograms of the D score at the top,and fractional cumulative distributions below. Most Main-Belt asteroids have D ∼
0, and most NEOs have D ∼ HE digest2 NEO C
LASSIFICATION C ODE D is interestingbecause those with D < D , Crit are not recognized as NEOsand will not be placed onto the NEOCP for follow-up. In the
MPC A , data set, 14% (98.5%) of NEO (non-NEO) track-lets had D < D , Crit . Figure 6 shows that Trojans and Mars-Crossers are objects that “mimic NEOs”, as these two popu-lations of object have the highest fraction with D > D , Crit .Trojans dominate numerically for 55 < D <
90, while Mars-Crossers contribute equally across the entire range of D val-ues.4.2.3. Comparison of NEO digest2
Scores Across Data Sets
Figure 7.
Cumulative distribution of digest2 scores for all
NEO datasets. The blue (
MPC A , ) line is reproduced from Figure 6. Whilemost NEO tracklets have D = 100 at any given time, the distributionof smaller D differs between data sets. In the unbiased Sim sampleabout 30% of NEO tracklets have D < D , Crit . Figure 7 shows the cumulative distribution of D for theNEOs generated within the data sets of Table 2. The data setsdiffer at small D , introducing differing biases. The discov-ery data sets ( MPC D , , MPC D , ) have the smallest fractionof tracklets in the low D range and clearly display the ef-fects of imposing a D , Crit -cut-off on objects submitted to theNEOCP.The data sets containing all submitted tracklets (
MPC A , )or simulated tracklets ( Granvik , LSST , and
Sim ) have in-creased ratios of low-scoring NEOs because (a) they are notsubject to the D , Crit threshold placed on objects sent to theNEOCP, and (b) they contain objects observed repeatedly.The largest fraction of low scoring tracklets comes from the
Sim data set, which may be a consequence of the long-termsimulation window (10-years) and our optimistic nightly ca-dence over the entire visible night sky, allowing it to containobjects with large synodic periods, and objects that only oc-casionally have D > D , Crit . The
Sim data set has a widerange of NEOs: 28% of all visible tracklets had D <
65 at any given time. However, we note that within the data set,some NEOs are visible for only a single one night, whileothers are visible for many months.4.2.4.
Detailed Analysis of digest2 scores for NEOs
We use the
Sim data set to examine the frequency withwhich NEOs and their subclasses (Amor, Apollo and Atentype orbits, PHAs and low-MOID (<0.05 AU) NEOs)have tracklets with low values of D . Figure 8.
Cumulative histogram of maximum, mean and minimum digest2 score per object for NEOs and their orbital subclasses from
Sim data set. Only 0.4% of NEOs never reach D > Figure 8 shows the cumulative histograms of the minimum,mean and maximum D per object from the Sim data set, andthe differences in D seen between the different orbit classes.NEOs can, at times, have low scores: 53% of NEOs have D <
65 at some point while observable. But in general,NEOs (and their subclasses) achieve high D in almost allcases: 94% of NEOs reach D = 100 (Table 3). This ratio is Potentially Hazardous Asteroids have MOID ≤ . H ≥ Minimum Orbit Intersection Distance with the Earth.
EYS ET AL .highest for objects that come close to the Earth or the Sun (Aten, Apollo and low-MOID classes). The ratio is smallestin the case of Amors, only 87% of which ever reach D = 100,because Amors remain distant and often mimic Main-beltmotion when near aphelion. Only 0.4% of NEOs never reachD > while visible. Table 3.
Fraction of NEOs that achieved a given D in the Sim dataset.
Orbit type min D >
65 max D <
65 max D = 100(%) (%) (%)NEO 46.7 0.4 93.9PHA 22.8 0.4 99.2Aten 65.9 0.2 99.6Apollo 46.4 0.2 99.5Amor 43.8 0.6 86.4Low-MOID 59.9 0.1 99.5 To understand the NEOs with low D , in Figure 9 we plotthe rate-of-motion and sky-plane location (in ecliptic coordi-nates centered at opposition ) of each tracklet from the Sim data set. There are two regions in the morning and eveningsky where NEOs tend to have lower D . NEOs in these re-gions are moving very slowly and mimic the Main-belt rate.These regions are coincident with so-called “sweet-spots” atlow solar elongations, that are the only regions of the sky inwhich many NEOs become observable (e.g. Stokes & Yeo-mans 2003; Chesley & Spahr 2004; Boattini et al. 2007).There is also an area near opposition where low digest scoresare possible for NEOs moving with rates of motion similarto those of MBAs. Interestingly, if the object moves evenslower, e.g. below 0.1 degree per day, its score is very high.All fast moving objects (rates above ∼ . D = 100.4.3. Accuracy, Precision and the NEOCP
To understand how accurate D is when used as an NEOclassifier, we can calculate the value of D for trackletsfrom a population of known objects (both NEOs and non-NEOs). We can then compare D to an imposed critical value D , Crit , and calculate for the population the rate of “TruePositives”(
T P ), “False Positives” ( FP ), “False Negatives”( FN ), and “True Negatives” ( T N ). Repeating this gives thevalues of these quantities as a function of D , Crit . We havesummarized these quantities (and some additional metrics)in Table 4.As described in Section 4, unknown tracklets with D > D , Crit are submitted to the NEOCP, increasing the likelihoodof follow-up observations being obtained and hence of theobject being confirmed as real. The precision,
PRE , is a
Table 4. Top:
Error matrix for digest2 when used as a classifier, i.e.when the digest2 score, D , is compared to a critical value D , Crit . Bottom:
Definition of additional useful metrics.NEO Non-NEO D > = D , Crit
True Positives,
T P
False Positives,
FPD < D , Crit
False Negatives, FN True Negatives,
T N
False Positive Rate,
FPR = FP / ( FP + T N )False Negative Rate,
FNR = FN / ( FN + T P )Precision,
PRE = T P / ( T P + FP )Accuracy, ACC = (
T P + T N ) / ( T P + T N + FP + FN ) measure of the purity of the tracklets that will end-up on theNEOCP (i.e. what fraction of the tracklets on the NEOCPwill actually be NEOs).In Figure 10 we plot the False Positive Rate , False Nega-tive Rate , Precision , and
Accuracy as functions of the critical digest2 score, D , Crit . We do this for both real (
MPC D , , MPC D , and MPC A , ) and synthetic ( LSST ) data sets.These plots show that the discovery data sets (
MPC D , and MPC D , ) both display “kinks” in the various performancemetrics at the values of D , Crit used at the time of submis-sion. Above the threshold, most NEOs are followed-up andorbits computed. However, NEOs below the threshold at thetime of first observation are less likely to be immediatelyfollowed-up. In contrast, no selection effects are imposed oneither the
MPC A , or LSST data sets, hence no kinks appear.We find that the false negative rate is the largest inthe
MPC A , and LSST data sets, essentially becausetracklets with D < D , crit are absent from the other( MPC D , & MPC D , ) data-sets.We note that the value of D , Crit = 65 was chosen with theaim of generating approximately half of the objects on theNEOCP being NEOs. In 2010, the D , Crit was lowered to D , Crit = 50 based on the request of the community and avail-ability of more follow-up and discovery assets. After a year,due to the lack of follow-up of the low-scoring NEO can-didates, MPC increased D , Crit back to D , Crit = 65 in mid-2012. It is clear from both the curves in Figure 10 and byVereš et al. (2018) that this choice was successful in generat-ing the approximate desired level of precision. If members ofthe community wish to advocate for changes to the adoptedvalue of D , Crit = 65, they should weigh-up the pros and consof such a change: Increasing the value of D , Crit will causethe false positive rate to drop (cutting down on the number ofnon-NEOs “unnecessarily” followed-up), but the false nega-tive rate will rise (hence losing NEOs). HE digest2 NEO C
LASSIFICATION C ODE Figure 9. digest2 score for all of the NEOs from the
Sim data set in opposition-centered ecliptic coordinates per-bin mean digest2 . digest2 isgenerally smaller at low solar elongation. Figure 10. Top Left:
MPC D , , All discovery tracklets from 2011; Top Right:
MPC D , , All discovery tracklets from 2017; BottomLeft:
MPC A , , All tracklets from January 2017. Bottom Right:
LSST , All tracklets from LSST sim. Each panel plots the False-Positive-Rate (FPR, blue), the False-Negative-Rate (FNR, orange),the Precision (Green), and the Accuracy (red). The precision isa measure of the purity of the tracklets that will end-up on theNEOCP. The “kink” that is particularly obvious in the “Precision”measure for
MPC D , is driven by the D , Crit = 65 NEOCP threshold.
Nuances of Practical digest2
Usage
There are a number of practical details relating to the al-gorithmic design of digest2 that should be considered whenusing the digest2 code in practice.4.4.1.
Effects of Random-Choice of α The digest2 code uses a pseudo-Monte Carlo method togenerate a range of variant orbits (see Appendix A). By de-fault, the pseudo random number generator is seeded ran- domly. Therefore, even though the admissible region is thesame, the sampling of available orbits could hit different pop-ulation bins during an independent instance of the program.We demonstrate an extreme example of the D variation us-ing a real NEO candidate ( P10Gj15 ) that was observed onJanuary 17, 2018 as a 3-detection tracklet. The tracklet had D = 66 and therefore it was posted to the NEOCP. When were-ran digest2 D in the left-panel of Figure 11 shows that on many occa-sions the D value was below 65.To understand this result, we show two specific runs, oneleading to D = 66 and one to D = 60 in the center of Fig-ure 11, demonstrating that the sampled orbits inhabit almostidentical regions of phase space. However, counting the binshit by generated variant orbits in the two runs (illustrated inthe right-hand side of Figure 11), we see that different pop-ulation bins were hit, giving different scores. In the caseof P G j
15, further follow-up observations allowed the or-bit determination of the object that became announced as2018 BE Granvik
NEO data set, running the digest2 code 100-times for each. We found that 82% of the track-lets had no change in the integer value of D . In particular,tracklets with D ∼ D ∼
100 had almost no variation in D . Only a small fraction of tracklets, 0.7%, had a variation ∆ D >
3. Most of the varying D -tracklets were in the range40 < D <
60, the region where both NEO and Main-Beltpopulations seem to be probable for a given velocity vector.The small variation is caused by the binned-populationmodel and the algorithmic design. Future versions of thecode should reduce/eliminate this effect.2 K
EYS ET AL . Figure 11. Left:
One thousand runs of digest2 on a single Pan-STARRS1 tracklet P10Gj15, resulting in a range of ’raw’ and ’nid’ NEO scoresdue to the random selection of α . This is a particularly extreme example of the variation that is seen. Center:
Variant orbits generated intwo independent runs of digest2 (which resulted in different D values: Red D = 60 and black D = 66) plotted in perihelion distance (q) andabsolute magnitude (H). Right:
Population bins populated by the variant orbits from the center panel. The different runs hit different bins,causing skews in the calculated score.
We emphasize that if the user wishes to suppress these vari-ations, the keyword repeatable can be added to the configura-tion file. The random angle α is then reseeded with a constantvalue for each tracklet, yielding repeatable scores in indepen-dent runs. 4.4.2. Effects of Observational Uncertainty
As illustrated in Section 3.1.1, “dithered” tracklets are con-structed from two end-points or two generated points of thetracklet. The dithering itself is governed by the expected as-trometric uncertainty. The default astrometric uncertaintyused by digest2 is 1.0 arcsecond. However, in our workand at the MPC, astrometric uncertainties are assigned in the digest2 configuration file, using the long-term average foreach observatory code (Table 6, Appendix D). To demon-strate the effect of changing the assumed uncertainty on asingle tracklet, we again chose the Pan-STARRS tracklet P G j
15. Figure 12 shows how D depends on the assumeduncertainty. The variation in score is driven by the same ef-fect demonstrated in Section 4.4.1 (Figure 11): varying theuncertainty changes the population bins included in the track-let score calculation. In the example illustrated, when the as-trometric uncertainty is overestimated, P G j D leads tomore variant orbits that hit the Main-Belt bins.To study the significance of the assumed astrometric uncer-tainties for a large data set, we selected 30,000 NEOs fromthe Granvik data set, generated two-detection tracklets with0.2" astrometric errors and computed D using three differentassumed astrometric uncertainties: underestimated (0.05"),nominal (0.2") and overestimated (1.0"). Note, that the anal-ysis was undertaken with the digest2 keyword “repeatable”set to avoid variation due to angle α randomness.When the astrometric uncertainty is overestimated by 0.8"(underestimated by 0.15"), the value of D varies by anamount ∆ D ≈ − ∆ D ≈ + . D (20 to 80), Figure 12. digest2 score computed for a Pan-STARRS1 trackletP10Gj15, assuming different astrometric uncertainty for each run.The vertical line shows the assumed Pan-STARRS1 uncertainty.The variation in score is driven by the same effect demonstratedin Figure 11: varying the uncertainty changes the population binsused to calculate the tracklet’s score. where overestimation (underestimation) lead to ∆ D ≈ − ∆ D ≈ + . D values near zero and 100 do not showany variation. Overall there was no change for 90% of the"underestimated" and 78% of the "overestimated" tracklets.4.4.3. Great Circle Departure
As described in Section 3.1.1, the digest2 algorithm uses apair of synthetic detections derived from the set of observedpositions. In a typical short arc tracklet spanning ∼ digest2 code to output the root- HE digest2 NEO C
LASSIFICATION C ODE Figure 13. D score of close-approaching NEOs derived for 4-detection tracklets. Correlations between great-circle-residual rms , geocentricdistance and rate of motion are depicted. Nearby objects exhibit large rms and high rates-of-motion, but also always have D = 100. Largepoints are used to emphasize those NEOs with low D . There are no fast-moving, nearby NEOs or NEOs with non-linear motion, that have low digest score. mean-square of the great circle residuals for the positions re-ported in a tracklet (if the tracklet has ≥ rms score is within the expected astrometric uncertainties,then, linear motion is a good approximation for the digest2 method. A large rms can be due to either bad astrometry, orto diurnal parallax.To test the effect of great circle deviations we selectedclose approach events from the CNEOS web site for as-teroids that approached the Earth within 0 . F
51 observatory codeas the topocentric position and “smeared” the generated as-trometry by an expected astrometric uncertainty of 0.2 arc-second. Only tracklets >
60 degrees from the Sun were con-sidered.Figure 13 illustrates the rate of motion, the rms of thegreat-circle-residuals, and the geocentric distance of eachtracklet. As expected, the rms increases with decreasinggeocentric distance. In addition, the correlation betweenrate of motion and rms , as well as between rate of motionand geocentric distance is obvious. Low scoring NEOs had rms < . Earth quasi-satellites, co-orbitals and Trojans
Although the Moon is the only known natural satellite ofthe Earth, temporarily captured NEOs, NEOs in 1:1 mean-motion resonance with the Earth, Earth Trojans or proposedminimoons(Granvik et al. 2012; Bolin et al. 2014) can followgeocentric orbits for days or months. (469219) 2016 HO3is the closest and most stable Earth’s quasi-satellite (de laFuente Marcos & de la Fuente Marcos 2016). The only Earth https://cneos.jpl.nasa.gov/ca/ Trojan discovered so far is 2010 TK7(Connors et al. 2011;Mainzer et al. 2012). We selected a small group of knownquasi-satellites or co-orbitals , generated daily ephemeridesfrom 2000 until 2020, created 20-minute tracklets and sim-ulated sky-plane visibility down to +22.5 V − band magni-tude. Their sky-plane motion and digest2 score are shown inFigure 14. The analyzed co-orbitals typically display largevaluess of D , unless they are in the low-scoring “sweet-spots”. The overall distribution of digest2 scores is similarto MPC A , and Granvik from Figure 7 in Section 4.2.3. Theonly known Earth Trojan 2010 TK7 has a digest2 value os-cillating between 93 and 100, and 40% of its tracklets had DD = 100. 4.6. NEOs at low Solar elongations
Most of our analyses were focused on NEOs observed infavorable locations of the sky within 100 degrees of the op-position (e.g. Figure 9). In Section 4.2.4 we identified re-gions near the ecliptic roughly 100 to 60 degrees from theopposition where the mean digest2 score is relatively low.However, some orbit types, such as Atens or Atiras, rarely ornever cross the opposition region. Searches close to the Sunare required to find these objects. Ground-based observationsnear the Sun are limited by large airmass and associated tar-get brightness losses, as well as limited observing time atdawn or dusk. Low Solar elongation observations are idealfor space-based observatories. Figure 15 shows the mean di-gest2 score for the
Sim data set in the Sun-centered eclipticcoordinates. Interestingly, closer to the Sun digest2 is al-most always above 65 and Atiras have particularly very large D values. We demonstrated that digest2 score can identifyNEOs in low-solar elongations. (3753) Cruithne, (164207), (277810), (419624), (469219), 2002 AA29,2006 RH120, 2010 TK7 EYS ET AL . Figure 14. digest2 score for known Earth quasi-satellites and their sky-plane motion in ecliptic coordinates centered at the opposition. digest2 score is mostly large. Low digest2 scores corresponds to the “sweet-spots” areas. Pattern of the horseshoe orbits is apparent at low solarelongations.
Figure 15. digest2 score for all of the NEOs, Atiras and Atens from the
Sim data set in Sun-centered ecliptic coordinates per-bin mean digest2 . digest2 is generally large at low solar elongation. Yellow point in the center represents the Sun. Interstellar Objects and digest2The interstellar object (ISO) 1I/’Oumuamua (Bacci et al.2017) was discovered by the Pan-STARRS NEO survey andits first tracklet was reported to the NEOCP as a D = 100tracklet. Subsequent follow-up quickly revealed its hyper-bolic orbit. However, without a large NEO digest2 score, itwould never have appeared on the NEOCP or been followed-up. As mentioned earlier, one of the key constraints of di-gest2 are orbits bound to the Sun. Previous work (Engelhardtet al. 2017) showed that the NEO digest2 score could be usedto identify ISOs, demonstrating that 2/3 of ISOs in a simu-lated Pan-STARRS survey achieved D >
90 at some timewhile visible. Figure 16 shows the D , rate of motion and magnitude ofall reported 1I/’Oumuamua tracklets. From the time of dis-covery to November 12, 2017, 1I/’Oumuamua had D = 100at all times, and in addition, 1I/’Oumuamua had D = 100for the entire time it was visible to the main NEO surveys( V < ∼ DISCUSSIONWe have described the algorithm underlying the digest2 code, as well as its usage to calculate the score of a short-arctracklet with respect to a given orbit class. We focused on the
NEO digest2 score, D , as used by the Minor Planet Center(MPC) to classify the likelihood that submitted sets of detec-tions are NEOs. We showed that D changes for each objectas it moves across the sky and its brightness and rate of mo- HE digest2 NEO C
LASSIFICATION C ODE Figure 16. digest2 score, mean V-band magnitude and rate ofmotion of 1I/’Oumuamua tracklets. Dashed vertical lines displayepochs when object was observed by Hubble Space Telescope.1I/’Oumuamua has a high digest2 score for the entire time it wasvisible to surveys ( V < ∼ tion vary. For NEOs, the maximum D reaches the extremevalue of D = 100 for more than 93% of NEOs, and over 99%of NEOs have D , crit >
65 at some point while visible. Thus,the vast majority of NEOs have sufficiently high D to allowthem to be posted to the NEOCP and be rapidly followed-upby observers around the world.While the digest2 code identifies short-arc NEOs correctlyin the vast majority of cases, there are a small number ofcases when D is below the critical threshold, currently setto D , crit >
65. We identify the two locations on the sky, onthe ecliptic and at low solar elongations, where the mean D is significantly lower than elsewhere on the sky. This is dueto the observing geometry and rates of motion of the objects,that cause them to be confused with Main-Belt objects.We also demonstrate that without the "repeatable" parame-ter the digest2 code can generate varying output for the sametracklet due to its random selection of variant orbits. How-ever, the variation in D is small and does not affect scores near extremes ( D = 0 and D = 100). The MPC uses digest2 without the "repeatable" parameter to avoid bias.5.1. Future directions
As we move into an era of large surveys that can ob-tain huge volumes of high-precision observations and utilizehighly-precise stellar catalogues, it is clear that the digest2 code must evolve. Future developments to the code will in-clude: • It will be possible to characterize observational uncer-tainties on a per-observation basis, as provided for inthe ADES format . • A python wrapper and API to provide simplified usage. • A new NEO population model, such as the high-fidelity Granvik et al. (2018) model, will be required.Catalog of known orbits used for unknown populationreduction would be generated more frequently. • The discrete population bins (in q , e , i , H ) need to be re-placed with a smooth 4-dimensional function, perhapsalso incorporating additional orbital elements to allowimproved discrimination of (e.g.) Hildas and JupiterTrojans. • Replacing the synthetic two-detection approach witha robust statistical treatment of all submitted detec-tions, allowing for a significantly improved handlingof NEOs that exhibit non-linear motion during close-approach. • Further tests and improvements of the code are neededfor space-based NEO surveys, such the proposedNEOCam mission (Mainzer et al. 2015). • Replacing the astorb known orbit catalog and its or-bit quality metric (Muinonen & Bowell 1993) with theMPC’s orbit catalog and orbit uncertainty parame-ter .We emphasize that digest2 constitutes an evolving code-base and that we welcome reports from the community ofany bugs found, or of suggestions for future directions . https://minorplanetcenter.net/iau/info/ADES.html https://minorplanetcenter.net/iau/MPCORB.html https://minorplanetcenter.net/iau/info/UValue.html At the time of submission, the digest2 team are currently diagnosing thecause of a rare output-formatting error that appears to afflict approximately1 in every 50,000 evaluation attempts when running digest2 in a multi-coreenvironment. We anticipate that the solution of this issue will lead to a futureversion release.
EYS ET AL .We are grateful to Rob McNaught for his insights regard-ing the history of the P
ANGLOSS code, and for his many clar-ifications regarding the historical development of ranging.MJH and MJP gratefully acknowledge NASA grantsNNX12AE89G, NNX16AD69G, and NNX17AG87G, aswell as support from the Smithsonian 2015-2017 ScholarlyStudies programs. DJA acknowledges the N. Ireland Dept.for Communities support at Armagh.We are saddened to report that during the course of thework on this manuscript, lead author Sonia Keys died on 13August 2018 at the age of 57, after a long struggle with can-cer. Sonia worked as a commercial software developer from1983 to 2001 and then began working with the Astronomical Society of Kansas City, specializing in the tracking of Near-Earth Objects (NEOs). Her skill and reliability with NEOastrometry caught the attention of the Minor Planet Center(MPC), and that led to a position as an astronomer and soft-ware developer for the MPC. During her tenure at the MPC,Sonia was the lead developer of the digest2 software and wasthe point-of-contact for the many people in the NEO com-munity that utilized it for over a decade. We deeply regretSonia’s passing, and we hope that we have completed thismanuscript in a manner she would have approved of. Wehope this publication and note will serve as a small token ofthe esteem in which we held her. Sonia was the conscienceof the MPC. Her clarity of thought, her willingness to askdifficult questions, and her tenacity will be sorely missed.APPENDIX A. VARIANT ORBITS
Figure 17. Top:
Vector algebra for solving state vectors according to Appendix A. E i and A i are the positions of the Earth and the asteroidat times t i . Bottom: Because only the transverse component of the velocity vector is visible to the observer, the true velocity vector could bein the dotted area between A and A ∗ , that represents all possible solutions for bound orbits. (cid:126) ∆ i represent topocentric and (cid:126) r i the heliocentricvectors of the object and (cid:126) R i the heliocentric vectors of the observer at time t i . (cid:126) s is directed along the velocity vector of the object. In the following algorithmic description, all line references relate to Algorithm 1 of Section 3.The input file for the digest2 program contains astrometry of short-arc tracklets, such as those illustrated in Appendix F. Thefile can contain one or many tracklets (Line 1). After the input data are loaded, binned population model is loaded into memoryas well (Line 2, Section 3.3). Each tracklet is subsequently treated individually (Line 3): HE digest2 NEO C
LASSIFICATION C ODE N = 2) or generated ( N > A and A of the same object (Line 4), are defined by A t i =[ α i , δ i , V i ], where α i is the rightascension, δ i the declination, V i is the apparent magnitude at the time of observation t i .Because the tracklet consists of N ≥ V is computed (line 5) in the Johnson-Cousins V-band photometric system. The transfor-mation to V-band is simple (Vereš et al. 2018), the correction is − . B , or + . V , where no correction is applied. If no photometry is provided, V = 21 is assumed as a typical magnitude limit of the currentasteroid surveys.We can easily transform the observation [ α i , δ i ] to its topocentric cartesian unit vector (Figure 17) (cid:126) d i = [ x i , y i , z i ]:( x i , y i , z i ) = (cos α i cos δ i , sin α i cos δ i , sin δ i ) (A1)For further analysis, we will need to transform unit vectors from equatorial to ecliptic coordinate system (See Figure 17): (cid:126) d i = R T (cid:15) · (cid:126) d i (A2)where R (cid:15) is the rotation matrix defined by obliquity of the ecliptic (cid:15) : R (cid:15) = (cid:15) sin (cid:15) − sin (cid:15) cos (cid:15) (A3)To derive the position state vector (cid:126) r , we have to assume a topocentric distance D to the asteroid. In line 8 of Algorithm 1, Sun-observer vectors (labeled (cid:126) R , (cid:126) R in Figure 17) in ecliptic Cartesian coordinates are computed by local sidereal time and observersite parallax constants. Observer’s topocentric positions are loaded from MPC-managed list of observatory codes . Also theobserver-object unit vectors (cid:126) d , (cid:126) d are computed in ecliptic Cartesian coordinates.Then, the topocentric vector to the first observation, (cid:126) ∆ , will be a scalar multiple of D and the unit vector (cid:126) d . D is one of thevalues selected recursively (Appendix C) between the minimum (0.05 au) and maximum (100 au) distances within the allowedadmissible region: (cid:126) ∆ = D (cid:126) d (A4)Knowing the position vectors of the observer with respect to the Sun at the time of the two observations ( R and R ) in eclipticcoordinates, we can easily derive the heliocentric state vector (cid:126) r as: (cid:126) r = (cid:126) R + (cid:126) ∆ (A5)After the position vectors are derived, the algorithm continues in investigation of the velocity vectors. The observed motion ofthe asteroid on the celestial sphere is a projection of the actual velocity vector to the tangent plane. While we do not know theangle between the tangent plane and the velocity vector (cid:126) v , we can make progress by computing the possible ranges of angles(Figure 17 - bottom) under the assumption that the object is bound to the Sun. At a given distance from the Sun, r , an orbit isbound if its specific orbital energy (cid:15) <
0. We will find the angles of the (cid:126) v for the parabolic orbits from: (cid:15) = v − Ur = 0 (A6)where U = k is the Gaussian gravitational constant squared .To find (cid:126) v one has to solve triangle E , A , A on figure 17. The velocity vector will be derived as (cid:126) v = ( (cid:126) ∆ − (cid:126) ∆ ) / ( t − t ). https://minorplanetcenter.net/iau/info/ObservatoryCodes.html k = 0 . au / day − ( solar mass ) − / EYS ET AL .The angle θ (Figure 17 - bottom) can be derived by computing the intermediate vector (cid:126) ∆ from vector difference of (cid:126) r and (cid:126) R and the dot product of known unit vector (cid:126) d and derived (cid:126) ∆ : (cid:126) ∆ = (cid:126) r − (cid:126) R cos θ = (cid:126) ∆ · (cid:126) d | ∆ || | (A7)Then, by using the cosine rule, substituting v = s / ( t − t ) and using equation (A6), we can derive the length of the vector ∆ asfollows. Line 9 of Algorithm 1 involves solving a quadratic that gives two values for α corresponding to state vectors (cid:126) s and (cid:126) s ∗ that represent parabolic orbits: ∆ + ∆ − s − ∆ ∆ cos θ = 0 ∆ − ∆ ∆ cos θ + (cid:32) ∆ − U ( t − t ) r (cid:33) = 0 (A8)Equation (A8) has two roots for ∆ after doing simple substitution (A, B, C) solving:( A , B , C ) = (1 , − ∆ cos θ, ∆ − U ( t − t ) r ) ∆ = − B ± √ B − AC A (A9)Two roots of ∆ lead to the extreme solutions of angles at which the vector v of an asteroid is equal to escape velocity:Based on figure 17 and cosine rules and law of sines: s = (cid:113) ∆ + ∆ − ∆ ∆ cos θ cos α = s + ∆ − ∆ s ∆ sin α = ∆ sin θ s α = arctan (cid:18) sin α cos α (cid:19) (A10)Having the topocentric ( (cid:126) r ) and heliocentric ( (cid:126) ∆ ) distance, the mean V-band magnitude V and geometry ( α ), we are now ableto compute the object’s absolute magnitude H (Line 10) of Algorithm 1 by equation: H = V − ∆ r ) + f ( Φ , G ) (A11)where the phase angle Φ is the angle between (cid:126) ∆ and (cid:126) r and G is the slope parameter. The form of the phase function f ( Φ , G )is based on Bowell et al. (1989) and G = 0 .
15 is assumed.In the next step (Line 11) of Algorithm 1, the algorithm picks angles randomly for a selected topocentric distance D in a rangebetween the two extremes of α that represent a bound elliptical orbit.With any valid α and D , we can derive the velocity vector, again by using the rule of sines (Figure 17 - bottom): ∆ = ∆ sin α sin ( π − α − θ ) (A12) HE digest2 NEO C
LASSIFICATION C ODE (cid:126) v in ecliptic coordinates is (cid:126) v = ∆ (cid:126) d − (cid:126) ∆ t − t (A13)Having (cid:126) r , (cid:126) v at a given epoch, we have the orbit defined through its state vector (Line 12 of Algorithm 1) which is easy to convertinto a set of Keplerian elements (Line 13 of Algorithm 1).The entire procedure is repeated for a range of assumed distances, D j that lead to set of unique position vectors, (cid:126) r j , and velocityvectors, (cid:126) v j . Example of variant orbits generated by digest2 in a phase space of α and ∆ is shown in Figure 18. Figure 18.
Grid of bound variant orbits generated by digest2 for an NEO (left), Main Belt (center) and Centaur (right) tracklet. The asterisksdepict the true position. B. digest2
SOFTWAREB.1.
Detailed Historical Evolution
In Section 2.1 we provided a brief outline of the historical development of the digest2 algorithm. In Table 5 of this appendix,we provide the interested reader with a more detailed, chronologically-ordered description of the key changes made to the digest2 code. B.2.
Software Availability and Usage
The digest2 source code and the documentation is freely available at https://bitbucket.org/mpcdev/digest2/overview . The download section contains a zip archive of digest2 and the population model tar achive d2model .The synthetic solar system model binning and reduction source code
MUK is freely available at https://bitbucket.org/mpcdev/d2model/src/master/muk.c .The digest2 code can be executed directly after its compilation using the supplied input observation file (sample.obs) andconfiguration file (MPC.config) as follows: digest2 -c MPC.config sample.obs
Further detailed instructions for the operation of digest2 can be found in https://bitbucket.org/mpcdev/digest2/src/master/OPERATION.md . C. BINARY SEARCH ALGORITHM FOR POPULATION BINSThe initial algorithm (version 0.1) for constructing class scores was as in Algorithm 2.On line 5 the function
Interesting tests if candidate elements are in the class of interest. The single class of interest defined in digest2 versions-0.1 and -0.2 corresponded to the Minor Planet Center’s definition of “interesting” which included not just NEOs( q < .
3) but also high eccentricity ( e > .
5) and high inclination ( i >
40) orbits which would also lead to a discovery MPEC.The current code uses the classes described in Table 8 of Appendix D.Version-0.2 contained an improved score formula. The goal of the new score formula was to change the computation of thevariables neo and mb (Algorithm 2 lines 6 and 8) to be more like, neo = sum of bin populations over all represented NEO bins mb = sum of bin populations over all represented non-NEO bins , (C14)0 K EYS ET AL . Table 5.
Significant Changes and Versions of Digest2. Archived (and operational) code for the majority of the versions listed in this table areavailable in https://bitbucket.org/mpcdev/digest2/src/master/archive/ .Date Version Descriptionpre-2002 0.0223 Original FORTRAN 77 code (“ ”) built on P
ANGLOSS
Aug 2005 0.1 MIT-style license, catalog-reduced model.Sep 2005 0.2 Bin search algorithm (App. C)Feb 2010 0.3 Pan-STARRS S3M model, multiple orbit classes, improved scoring (§ 3.3). Parallelized across multiple CPU cores.Feb 2010 0.4 Supported space-based observations.Feb 2010 0.5 Great circle RMS output. Stand-alone GCR utility.Mar 2010 0.6 Improved stand-alone GCR utility.Nov 2010 0.7 Orbit classes H18, H22, bug fixes.Apr 2011 0.8 More flexible output, new config options. Removed stand-alone GCR utility, bug fixes.Apr 2011 0.9 Bug fixes.Apr 2011 0.10 Bug fixes.May 2011 0.11 Endpoint synthesis (§ 3.1). Observational Error (§ 4.4.2).May 2012 0.12 Bug fixes.Jun 2012 0.13 Bug fixes.Jan 2013 0.14 Bug fixes.Feb 2014 0.15 Allow more obscodes.Feb 2015 0.16 Command line option to limit parallelism.Jun 2015 0.17 CSV model, bug fixes.Jun 2015 0.18 Minor usability improvements. Bug fixes.Aug 2017 0.19 Bug fixes, including a serious uninitialized data problem.
Algorithm 2: digest2 scoring version 0.1 . . . for distance D ← .025 AU to 7 by .025 do for α ← α to α by 2 ◦ do Look-up
BinPopn ( a , e , i , H ) if Interesting ( a , e , i ) then neo + = BinPopn else mb + = BinPopn score ← ∗ neo / ( neo + mb ) where a best effort is made to locate all the population bins represented by the input arc. The reasons for this change ultimatelystem from a desire to (a) avoid double-counting population bins, and (b) avoid omitting population bins due to the use of overlylarge distance and/or angle step-sizes in Algorithm 2.Per-class binning enables enhancements to the bin tagging and scoring algorithms. In the current version of digest2 , separatebin tags are accumulated per class. Further, for each class, two sets of tags are accumulated per class. An in-class tag is set fora bin when a candidate orbit meets the class definition. A separate out-of-class tag is set otherwise. Two population sums arethen computed for each class, a sum of bin populations with in-class tags and sum of bin populations with out-of-class tags. The HE digest2 NEO C
LASSIFICATION C ODE α and α .At each call, a state vector and then orbital elements are computed, but then rather than accessing the indicated bin population,the corresponding tag is simply checked (in the distance-specific tags). If the tag was already set, subdivision ends and therecursive function returns. If the tag was not set, then the angle space is subdivided further and the recursive function is calledfor each subdivision.The distance, D , is subdivided similarly with a separate recursive function. On each call, computations are performed for asingle distance as just described, then the distance-specific tags are checked. If all bins tagged at that distance had previouslybeen tagged in the overall result, the recursive function simply returns. If new bins were tagged, they are merged into the overallresult, the distance space is subdivided further, and the recursive (distance) function is called for each subdivision. After allsubdivision functions return, the formulas above for neo and mb are computed as described in Algorithm 3. Algorithm 3:
Tag accumulation for each tag of overall tags do if tag set then if interesting bin then neo ← neo + BinPop ( a , e , i , H ) if non-interesting bin then mb ← mb + BinPop ( a , e , i , H )D. TABULATED VALUES AND SETTINGS FOR digest2
In Table 6 we list the assumed astrometric uncertainties adopted in the latest version of the digest2 code. The values are set inthe file MPC.
CONFIG and can be altered by the user. digest2 is configurable with keywords that allow control of the output and setup of some input variables. The keywords are setin the same file (MPC.config) as the astrometric uncertainties (Table 6). The available keywords are described in Table 7.The orbit classes used by digest2 are defined by the orbital elements and absolute magnitudes listed in Table 8. E. POPULATION BINSIn Figure 19 we illustrate the population model used by digest2 (see Section 3.3 above for further description).
Table 6.
Astrometric uncertainties currently used by digest2 by the MPC and digest2 code, based on long-term statistics a .observatory astrometric observatory astrometriccode uncertainty code uncertainty106 0.4" D29 0.5"291 0.4" E12 0.5"568 0.1" F51 0.2"691 0.4" F52 0.2"703 0.7" G96 0.3"704 0.7" H15 0.5"A50 0.5" J75 0.4"C51 0.7 other 1.0" a https://minorplanetcenter.net/iau/special/residuals.txt EYS ET AL . Figure 19.
Binned population model for the Solar System (odd columns) and NEOs (even columns) as functions of perihelion distance ( q ),eccentricity ( e ), inclination ( i ) and absolute magnitude ( H ) for the full (“raw”) population model. Note the unequal bin sizes and limits of theSolar System model. HE digest2 NEO C
LASSIFICATION C ODE Table 7.
Keywords available in digest2 . keyword meaningheadings show headingnoheadings hide headingrms show residual RMS from linear motion along a great circle in arc-secondsnorms do not show RMSraw show raw digest2 (Section 3.3.1)noid show noid digest2 (Section 3.3.2)repeatable random seed is defined and always the same (Section 4.4.1)random pseudo-random generation of α and resulting digest2 (Section 4.4.1)obserr fixed astrometric uncertainty set to a value in arc secondsposs show other non-zero resulting scores in addition to specified classes digest2 any orbit class(es) defined in Table 8 e.g. NEOuser defined astrometric uncertainty (arc seconds) per MPC observatory code e.g. obserrF51=0.2default keywords: noid, rms, headings, N22, N18, NEO, Int, obserr=1.0 Table 8.
Definitions of orbital classes in digest2 . orbit class abbreviation definition in keplerian elements and absolute magnitudeMPC Interesting Int q < 1.3 || e >= 0.5 || i >= 40 || Q > 10Near-Earth Object NEO q < 1.3Large Near-Earth Object N18 q < 1.3 & H<18.5Intermediate-size Near-Earth Objects N22 q < 1.3 & H<22.5Mars-Crossers MC q < 1.67 & q >= 1.3 & Q > 1.58Hungarias Hun a < 2 & a > 1.78 & e < 0.18 & i > 16 & i < 34Phocaeas Pho a < 2.45 & a > 2.2 & q > 1.5 & i > 20 & i < 27Inner Main Belt MB1 q > 1.67 & a < 2.5 & a > 2.1 & i < ((a - 2.1) / 0.4) * 10 + 7Pallas family Pal a < 2.8 & a > 2.5 & e < 0.35 & i > 24 & i < 37Hansas Han a < 2.72 & a > 2.55 & e < 0.25 & i > 20 & i < 23.5Central Main Belt MB2 a < 2.8 & a > 2.5 & e < 0.45 & i < 20Outer Main Belt MB3 e < 0.4 & a > 2.8 & a < 3.25 & i < ((a - 2.8) / 0.45) * 16 + 20Hildas Hil a > 3.9 & a < 4.02 & i < 18 & e < 0.4Jupiter Trojans JTr a > 5.05 & a < 5.35 & e < 0.22 & i < 38Jupiter Family Comets JFC q > 1.3 & T J > T J < T J - Tisserand parameter with respect to Jupiter, a - semimajor axis (AU), e - eccentricity, i - inclination (deg)Q - aphelion distance (AU), q - perihelion distance (AU), H - absolute magnitude F. EXAMPLE TRACKLETWe provide an example of a tracklet taken from a a typical Hungaria-group asteroid. The data is presented in a standardized80-column format - Minor Planet Packed Designation (K18B01E), Mode of Observation ("C" for CCD), Time of observations(Year, Month, Day), Right Ascension (Hours, Minutes, Seconds) and Declination (Degrees, Minutes, Seconds), Magnitude, Band("w"), Catalog Code ("U"), Packed Reference (" ∼ K18B01E* C2018 01 17.42780 08 01 58.347+41 29 48.20 21.4 wU~2VXlF51 https://minorplanetcenter.net/iau/info/ObsFormat.html EYS ET AL . K18B01E C2018 01 17.43919 08 01 57.028+41 29 44.58 21.2 wU~2VXlF51K18B01E C2018 01 17.46196 08 01 54.365+41 29 37.55 21.2 wU~2VXlF51
REFERENCES
Bacci, P., Maestripieri, M., Tesi, L., et al. 2017, Minor PlanetElectronic Circulars, 2017Boattini, A., Milani, A., Gronchi, G. F., Spahr, T., & Valsecchi,G. B. 2007, in IAU Symposium, Vol. 236, Near Earth Objects,our Celestial Neighbors: Opportunity and Risk, ed. G. B.Valsecchi, D. Vokrouhlický, & A. Milani, 291–300Bolin, B., Jedicke, R., Granvik, M., et al. 2014, Icarus, 241, 280,doi:
Bowell, E. 1989, in BAAS, Vol. 21, Bulletin of the AmericanAstronomical Society, 969Bowell, E. 2018, The Asteroid Orbital Elements Database. ftp://ftp.lowell.edu/pub/elgb/astorb.html
Bowell, E., Hapke, B., Domingue, D., et al. 1989, in Asteroids II,ed. R. P. Binzel, T. Gehrels, & M. S. Matthews, 524–556Bowell, E., Muinonen, K., & Wasserman, L. H. 1994, in IAUSymposium, Vol. 160, Asteroids, Comets, Meteors 1993, ed.A. Milani, M. di Martino, & A. Cellino, 477–481Bowell, E., Skiff, B. A., & Wasserman, L. H. 1990, in Asteroids,Comets, Meteors III, ed. C. I. Lagerkvist, H. Rickman, & B. A.Lindblad, 19Chesley, S. R. 2005, in IAU Colloq. 197: Dynamics of Populationsof Planetary Systems, ed. Z. Kneževi´c & A. Milani, 255–258Chesley, S. R., & Spahr, T. B. 2004, in Mitigation of HazardousComets and Asteroids, ed. M. J. S. Belton, T. H. Morgan, N. H.Samarasinha, & D. K. Yeomans, 22Connors, M., Wiegert, P., & Veillet, C. 2011, Nature, 475, 481,doi: de la Fuente Marcos, C., & de la Fuente Marcos, R. 2016,MNRAS, 462, 3441, doi:
Engelhardt, T., Jedicke, R., Vereš, P., et al. 2017, AJ, 153, 133,doi:
Farnocchia, D., Chesley, S. R., & Micheli, M. 2015, Icarus, 258,18, doi:
Granvik, M., Vaubaillon, J., & Jedicke, R. 2012, Icarus, 218, 262,doi:
Granvik, M., Morbidelli, A., Jedicke, R., et al. 2018, Icarus, 312,181, doi:
Grav, T., Jedicke, R., Denneau, L., et al. 2011, PASP, 123, 423,doi:
Jedicke, R. 1996, AJ, 111, 970, doi:
Kubica, J., Denneau, L., Grav, T., et al. 2007, Icarus, 189, 151,doi:
Mainzer, A., Grav, T., Bauer, J., et al. 2015, AJ, 149, 172,doi:
Mainzer, A., Grav, T., Masiero, J., et al. 2012, ApJL, 760, L12,doi:
Marsden, B. G. 1985, AJ, 90, 1541, doi: —. 1991, AJ, 102, 1539, doi:
McNaught, R. 1999, Journal of the British AstronomicalAssociation, 109, 294Milani, A., Gronchi, G. F., Vitturi, M. D., & Kneževi´c, Z. 2004,Celestial Mechanics and Dynamical Astronomy, 90, 57,doi:
Milani, A., & Kneževi´c, Z. 2005, Celestial Mechanics andDynamical Astronomy, 92, 1,doi:
Morrison, D., ed. 1992, The Spaceguard surveyMuinonen, K., & Bowell, E. 1993, Icarus, 104, 255,doi:
Oszkiewicz, D., Muinonen, K., Virtanen, J., & Granvik, M. 2009,Meteoritics and Planetary Science, 44, 1897,doi:
Spoto, F., Del Vigna, A., Milani, A., et al. 2018, ArXiv e-prints. https://arxiv.org/abs/1801.04004
Stokes, G. H., & Yeomans, D. K. 2003, AGU Fall MeetingAbstracts, P51ETholen, D. J., & Whiteley, R. J. 2000, in Bulletin of the AmericanAstronomical Society, Vol. 32, AAS/Division for PlanetarySciences Meeting Abstracts
Vereš, P., Payne, M. J., Holman, M. J., et al. 2018, AJ, 156, 5,doi: