[PDF] A Fast Template Periodogram for Detecting Non-sinusoidal Fixed-shape Signals in Irregularly Sampled Time Series

Abstract

Astrophysical time series often contain periodic signals. The large and growing volume of time series data from photometric surveys demands computationally efficient methods for detecting and characterizing such signals. The most efficient algorithms available for this purpose are those that exploit the \mathcal{O}(N\log N) scaling of the Fast Fourier Transform (FFT). However, these methods are not optimal for non-sinusoidal signal shapes. Template fits (or periodic matched filters) optimize sensitivity for a priori known signal shapes but at a significant computational cost. Current implementations of template periodograms scale as \mathcal{O}(N_f N_{obs}), where N_f is the number of trial frequencies and N_{obs} is the number of lightcurve observations, and due to non-convexity, they do not guarantee the best fit at each trial frequency, which can lead to spurious results. In this work, we present a non-linear extension of the Lomb-Scargle periodogram to obtain a template-fitting algorithm that is both accurate (globally optimal solutions are obtained except in pathological cases) and computationally efficient (scaling as \mathcal{O}(N_f\log N_f) for a given template). The non-linear optimization of the template fit at each frequency is recast as a polynomial zero-finding problem, where the coefficients of the polynomial can be computed efficiently with the non-equispaced fast Fourier transform. We show that our method, which uses truncated Fourier series to approximate templates, is an order of magnitude faster than existing algorithms for small problems (N\lesssim 10 observations) and 2 orders of magnitude faster for long base-line time series with N_{obs} \gtrsim 10^4 observations. An open-source implementation of the fast template periodogram is available at this https URL

Full PDF

DDraft version February 9, 2021

Preprint typeset using L A TEX style emulateapj v. 01/23/15

A FAST TEMPLATE PERIODOGRAM FOR DETECTING NON-SINUSOIDAL FIXED-SHAPE SIGNALS INIRREGULARLY SAMPLED TIME SERIES

J. Hoffman , J. Vanderplas , J. D. Hartman , G. ´A Bakos Draft version February 9, 2021

ABSTRACTAstrophysical time series often contain periodic signals. The large and growing volume of time seriesdata from photometric surveys demands computationally eﬃcient methods for detecting and charac-terizing such signals. The most eﬃcient algorithms available for this purpose are those that exploitthe O ( N log N ) scaling of the Fast Fourier Transform (FFT). However, these methods are not optimalfor non-sinusoidal signal shapes. Template ﬁts (or periodic matched ﬁlters) optimize sensitivity for a priori known signal shapes but at a signiﬁcant computational cost. Current implementations oftemplate periodograms scale as O ( N f N obs ), where N f is the number of trial frequencies and N obs isthe number of lightcurve observations, and due to non-convexity, they do not guarantee the best ﬁtat each trial frequency, which can lead to spurious results. In this work, we present a non-linear ex-tension of the Lomb-Scargle periodogram to obtain a template-ﬁtting algorithm that is both accurate(globally optimal solutions are obtained except in pathological cases) and computationally eﬃcient(scaling as O ( N f log N f ) for a given template). The non-linear optimization of the template ﬁt at eachfrequency is recast as a polynomial zero-ﬁnding problem, where the coeﬃcients of the polynomial canbe computed eﬃciently with the non-equispaced fast Fourier transform. We show that our method,which uses truncated Fourier series to approximate templates, is an order of magnitude faster thanexisting algorithms for small problems ( N (cid:46)

10 observations) and 2 orders of magnitude faster forlong base-line time series with N obs (cid:38) observations. An open-source implementation of the fasttemplate periodogram is available at github.com/PrincetonUniversity/FastTemplatePeriodogram. INTRODUCTIONAstronomical systems exhibit a wide range of time-dependent variability. By measuring and characterizingthis variability, astronomers are able to infer a variety ofimportant astrophysical properties about the underlyingsystem.Periodic signals in noisy astronomical timeseries canbe detected with a number of techniques, includingGaussian process regression (Foreman-Mackey et al.2017; Rasmussen & Williams 2005), least-squares spec-tral analysis (Lomb 1976; Scargle 1982; Barning 1963;Van´ıˇcek 1971), and information-theoretic methods (Gra-ham et al. 2013a; Huijse et al. 2012; Cincotta et al. 1995).For an empirical comparison of some of these techniquesapplied to several astronomical survey datasets, see Gra-ham et al. (2013b).For stationary periodic signals, least-squares spectralanalysis — also known as the Lomb-Scargle (LS) pe-riodogram (Lomb 1976; Scargle 1982; Barning 1963;Van´ıˇcek 1971) — is perhaps the most sensitive andcomputationally eﬃcient method of detection. The LSperiodogram can be made to scale as O ( N f log N f ),where N f is the number of trial frequencies, by utiliz-ing the non-equispaced fast Fourier transform (Keineret al. 2009; Dutt & Rokhlin 1993, NFFT) to evaluate [email protected] Xaxis, LLC, 3 World Trade Center, 175 Greenwich Street,30th Floor, New York, NY 10007 eScience Institute, University of Washington, Seattle, WA98195 Department of Astrophysical Sciences, Princeton University,Princeton NJ 08540 Institute for Advanced Study, 1 Einstein Drive, Princeton,NJ frequency-dependent sums, or by “extirpolating” irregu-larly spaced observations to a regular grid with Lagrangepolynomials (Press & Rybicki 1989).The LS periodogram ﬁts the following model to a setof observations:ˆ y LS ( t | θ, ω ) = θ cos ωt + θ sin ωt. (1)where ω = 2 π/P is the (angular) frequency of the un-derlying signal, and θ and θ are the amplitudes of thesignal. When the data y i is composed of a sinusoidalcomponent and a white noise component (i.e., when themeasurement uncertainties are uncorrelated and Gaus-sian), the LS periodogram provides a maximum likeli-hood estimate for the model parameters ( ω, θ , and θ ).The LS “power” P LS ( ω ) has several deﬁnitions in theliterature (Zechmeister & K¨urster 2009), but we adoptthe following deﬁnition throughout the paper: P ( ω ) = χ − χ ( ω ) χ (2)Where χ is the weighted sum of squared residuals fora constant ﬁt: χ = (cid:88) i w i ( y i − ¯ y ) (3)where ¯ y = (cid:80) i w i y i is the weighted mean of the ob-servations, and w i ∝ σ − i are the normalized weights foreach observation ( (cid:80) i w i = 1), and χ ( ω ) is the weightedsum of squared residuals for the best-ﬁt model ˆ y at a a r X i v : . [ a s t r o - ph . I M ] F e b Hoﬀman et al. 2021given trial frequency ω = 2 π/P where P is the period: χ ( ω ) = min θ (cid:88) i w i ( y i − ˆ y ( t i | ω, θ )) (4)1.1. Bayesian interpretation

We note that, while this formalism captures many dataand modeling scenarios, it is not completely general. Forexample, correlated uncertainties are not handled here.A more general Bayesian treatment of periodic models inastronomical timeseries is better handled by expressinga posterior over the model parameters.Assuming a Gaussian likelihood for the observations y i ∼ N (ˆ y ( t i | ω, θ ) , σ i ), and uniform priors on both thefrequency parameter ω and the non-frequency parame-ters θ , the posterior is p ( ω, θ | X ) = p ( X | ω, θ ) p ( ω, θ ) p ( X ) (5) ∝ p ( ω, θ ) N obs (cid:89) i =1 N (ˆ y ( t i | ω, θ ) , σ i ) (6)where we use X = { ( t i , y i , σ i ) | < i < N obs } as short-hand for the lightcurve observations. The logarithm ofthe posterior islog p ( ω, θ | X ) = − N obs (cid:88) i =1 (cid:18) y i − ˆ y ( t i | ω, θ ) σ i (cid:19) + const . (7)= − W N obs (cid:88) i =1 w i ( y i − ˆ y ( t i | ω, θ )) + const . (8)where W = (cid:80) i σ − i . Thus, χ ( ω ) = min θ (cid:88) i w i ( y i − ˆ y ( t i | ω, θ )) (9)= min θ (cid:18) − W log p ( ω, θ | X ) + const . (cid:19) (10)= − W max θ log p ( ω, θ | X ) + const . (11)and therefore, since P ( ω ) = 1 − χ ( ω ) /χ , P ( ω ) = 2 W χ max θ log p ( ω, θ | X ) + const . (12)The Lomb-Scargle power is a linear transformation ofthe maximum of the log posterior over the non-frequencyparameters, with the frequency parameter held ﬁxed.Thus, choosing the frequency that maximizes the peri-odogram value corresponds to ﬁnding a MAP estimateof the frequency parameter.A MAP interpretation is more general, and is applica-ble to scenarios not considered in this paper (e.g. corre-lated uncertainties, multi-dimensional timeseries, etc.),since all of these problems are amenable to MAP es-timation of their model parameters. However, in or-der to maintain consistency with notation in the Lomb- Scargle literature (e.g. Zechmeister & K¨urster 2009; Van-derPlas 2018), we keep our deﬁnition of the periodogramrestricted as above and only consider one-dimensionaltimeseries with heteroscedastic but uncorrelated uncer-tainties.Mortier et al. (2015a,b) explored a Bayesian interpre-tation of the periodogram. Their frequency periodogramdeﬁnition marginalizes over non-frequency parametersin contrast to a (scaled) MAP value of the log like-lihood discussed here. They use similar assumptions(uncorrelated, Normal measurement uncertainties) anduse uniform priors on non-frequency parameters, buttheir periodogram power was based on marginal prob-abilities P ( ω ) = p ( ω | X ) ∝ (cid:82) θ ∈ Θ p ( ω, θ | X ) p ( ω, θ ) wherehere the periodogram power is related to log-probabilitieslog p ( ω | X, ˆ θ ( ω )) where ω is the angular frequency, X is shorthand for observations, and ˆ θ ( ω ) are the non-frequency parameters that maximize log p ( ω, θ | X ) at agiven trial frequency.Additionally, Mortier et al. (2015a) do not speciﬁcallyconsider the problem examined in this paper (periodictemplate ﬁtting); they derive results for the classicalperiodogram model (sinusoidal model without constantoﬀset) derived in Lomb (1976) and Scargle (1982) aswell as the extension with a constant oﬀset derived inZechmeister & K¨urster (2009). However, their generalmethodology (deﬁning the periodogram power in termsof marginal probabilities), can be applied to our case andothers.As remarked in (VanderPlas 2018), Bayesian peri-odogram results should be taken with a grain of salt– usually the underlying assumptions of the statisticalmodel (uncorrelated, Normally-distributed measurementuncertainties, no systematics, etc.) are broken in manyreal-world astronomical timeseries..1.2. Extending Lomb-Scargle

The LS periodogram has numerous extensions to ac-count for, e.g., biased estimates of the mean brightness(Zechmeister & K¨urster 2009), non-sinusoidal signals(Schwarzenberg-Czerny 1996; Bretthorst 2003; Palmer2009; Baluev 2009, 2013c), multiple periodicities (Baluev2013a,b), multi-band observations (VanderPlas & Ivezi´c2015), and to mitigate overﬁtting of more ﬂexible mod-els via regularization (VanderPlas & Ivezi´c 2015). Baluev(2008) provided the framework for improving false alarmstatistics with extreme value theory, and this was gen-eralized further in Baluev (2013c). For a more detailedreview of the LS periodogram and its extensions, see Van-derPlas (2018).The multi-harmonic LS periodogram (Bretthorstet al. 1988; Schwarzenberg-Czerny 1996; Palmer 2009,MHGLS), provides a more ﬂexible model by adding har-monic components to the ﬁt:ˆ y MHLS ( t | θ, ω ) = θ + H (cid:88) n =1 θ n cos nωt + θ n − sin nωt. (13)Additional harmonics are important for modelingsignals that, while stationary and periodic, are non-sinusoidal (e.g. RR Lyrae, eclipsing binaries, etc.).However, the MHLS periodogram contains 2 H + 1ast Template Periodogram 3free parameters while the original LS periodogram con-tains only 2 (3 if the mean brightness is considered afree parameter). Including higher order harmonics addsmodel complexity, which can degrade the sensitivity ofthe MHLS periodogram to sinusoidal or approximatelysinusoidal signals.Tikhonov regularization (or L regularization), is onetool for mitigating overﬁtting of the higher order har-monics. However, adding an L regularization term tothe Fourier amplitudes adds bias to the model, and thatbias should be compared to the value added by the de-crease in model variance (VanderPlas & Ivezi´c 2015).1.3. Computational scaling

The LS periodogram naively scales as O ( N f N obs )where N f is the number of trial frequencies and N obs isthe number of observations. However, the limiting com-putations of the LS periodogram involve sums of trigono-metric functions over the observations. When the obser-vations are regularly sampled, the fast Fourier transform(FFT) (Cooley & Tukey 1965) can evaluate such sums ef-ﬁciently and the LS periodogram scales as O ( N f log N f ).When the data is not regularly sampled, as is the casefor most astronomical time series, the LS periodogramcan be evaluated quickly in one of two popular ways. Theﬁrst, by Press & Rybicki (1989) involves“extirpolating”irregularly sampled data onto a regularly sampled mesh,and then performing FFTs to evaluate the necessarysums. The second, as pointed out in Leroy (2012), isto use the non-equispaced FFT (Keiner et al. 2009; Dutt& Rokhlin 1993, NFFT) to evaluate the sums; this pro-vides roughly an order of magnitude speedup over thePress & Rybicki (1989) algorithm, and both algorithmsscale as O ( N f log N f ) (Leroy 2012).There is a growing population of alternative methodsfor detecting periodic signals in astrophysical data. Someof these methods can reliably outperform the LS peri-odogram, especially for non-sinusoidal signal shapes (seeGraham et al. (2013b) for a recent empirical review of pe-riod ﬁnding algorithms). However, a key advantage theLS periodogram and its extensions is speed. Virtuallyall other “phase-folding” methods scale as O ( N obs × N f ),where N obs is the number of observations and N f is thenumber of trial frequencies, while the Lomb-Scargle pe-riodogram scales as O ( N f log N f ). The virtual indepen-dence of Lomb-Scargle’s computation time with respectto the number of observations (assuming N f (cid:38) N obs ) isespecially valuable for lightcurves with N obs (cid:29) log N f ∼ O (10 ) observations of O (10 − ) stars. The Gaiatelescope (Gaia Collaboration et al. 2016) is set to pro-duce O (10 − O (10 ) stars. The LargeSynoptic Survey Telescope (LSST; LSST Science Collab-oration et al. (2009)) will make O (10 − ) observationsof O (10 ) stars during its operation starting in 2023.1.4. Template periodograms

When the shape of a stationary periodic signal isknown a priori , then the number of degrees of freedom is the same as the original LS periodogram (with a ﬂoatingmean component):ˆ y ( t | θ, ω ) = θ + θ M ( ωt − θ ) , (14)where M : [0 , π ) → R is a predeﬁned periodic tem-plate. We refer to the periodogram corresponding to thismodel as the “template periodogram.”As is the case for the LS periodogram, the templateperiodogram is equivalent to a maximum-likelihood es-timate of the model parameters under the assumptionthat measurement uncertainties are Gaussian and uncor-related (i.e. white noise).This paper develops new extensions of least-squaresspectral analysis for arbitrary signal shapes. For non-periodic signals this method is known as matched ﬁlteranalysis, and can be extended to search for periodic sig-nals by, e.g., phase folding the data at diﬀerent trial pe-riods.An analysis by Sesar et al. (2016) found that templateﬁtting signiﬁcantly improved period and amplitude es-timation for RR Lyrae in Pan-STARRS DR1 photome-try (Chambers et al. 2016). Since the signal shapes forRR Lyrae in various bandpasses are known a priori (seeSesar et al. (2010)), template ﬁtting provides an optimalestimate of amplitude and period, given that the objectis indeed an RR Lyrae star well modeled by at least oneof the templates. Templates were especially crucial forPan-STARRS data, since there are typically only 35 ob-servations per source over 5 bands (Hernitschek et al.2016), not enough to obtain accurate amplitudes empir-ically by phase-folding. By including domain knowledge(i.e. knowledge of what RR Lyrae lightcurves look like),template ﬁtting allows for accurate inferences of ampli-tude even for undersampled lightcurves.However, the improved accuracy comes at substantialcomputational cost: the template ﬁtting procedure took30 minutes per CPU per object, and Sesar et al. (2016)were forced to limit the number of ﬁtted lightcurves( (cid:46) (cid:46) DERIVATIONSWe deﬁne a template MM : [0 , π ) → R , (15) Hoﬀman et al. 2021as a mapping between the unit interval and the set ofreal numbers. We restrict our discussion to suﬃcientlysmooth templates such that M can be adequately de-scribed by a truncated Fourier seriesˆ M ( ωt | H ) = H (cid:88) n =1 ( c n cos nωt + s n sin nωt ) (16)for some ﬁnite H > c n and s n values are ﬁxed (i.e., they deﬁnethe template) is the crucial diﬀerence between the tem-plate periodogram and the multi-harmonic Lomb-Scargle(Palmer 2009; Bretthorst et al. 1988), where c n and s n are free parameters .We now construct a periodogram for this template.The periodogram assumes that an observed time series S = { ( t i , y i , σ i ) } Ni =1 can be modeled by a scaled, trans-posed template that repeats with period 2 π/ω , i.e. y i ≈ ˆ y ( ωt i | θ, M ) = θ M ( ωt i − θ ) + θ , (17)where θ = ( θ , θ , θ ) ∈ R is a set of model parameters.The optimal parameters are the location of a local min-imum of the (weighted) sum of squared residuals, χ ( θ, S ) ≡ (cid:88) i w i ( y i − ˆ y ( ωt i | θ )) , (18)and thus the following condition must hold for all threemodel parameters at the optimal solution θ = θ opt : ∂χ ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) θ = θ opt = 0 ∀ θ j ∈ θ. (19)Note that we have implicitly assumed χ ( θ, S ) is a C diﬀerentiable function of θ , which requires that M is a C diﬀerentiable function. Though this assumption couldbe violated if we considered a more complete set of tem-plates, (e.g. a box function), our restriction to truncatedFourier series ensures C diﬀerentiability.Note that we also implicitly assume σ i > i andwe will later assume that the variance of the observations y i is non-zero. If there are no measurement errors, i.e. σ i = 0 for all i , then uniform weights (setting σ i = 1)should be used. If the variance of the observations y is zero, the periodogram (as deﬁned in Equation 2) isundeﬁned for all frequencies. We do not consider thecase where σ i = 0 for some observations i and σ j > j .We can derive a system of equations for θ opt from thecondition given in Equation 19. The explicit conditionthat must be met for each parameter θ j is simpliﬁed be-low, using ˆ y i = ˆ y ( ωt i | θ ) (20)and ∂ j ˆ y i = ∂ ˆ y ( ωt | θ ) ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) t = t i (21) for brevity: 0 = ∂χ ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) θ = θ opt = − (cid:88) i w i ( y i − ˆ y i ) ( ∂ j ˆ y ) i (cid:88) i w i y i ( ∂ j ˆ y ) i = (cid:88) i w i ˆ y i ( ∂ j ˆ y ) i . (22)The above is a general result that extends to all leastsquares periodograms. To simplify derivations, we adoptthe following notation: (cid:104) X (cid:105) ≡ (cid:88) i w i X i (23) (cid:104) XY (cid:105) ≡ (cid:88) i w i X i Y i (24)Cov( X, Y ) ≡ (cid:104) XY (cid:105) − (cid:104) X (cid:105) (cid:104) Y (cid:105) (25)Var( X ) ≡ Cov(

X, X ) (26)In addition, the transposed template M θ = M ( ωt i − θ ) can be expressed as M θ ( ωt ) = (cid:88) n c n cos n ( ωt − θ ) (27)+ s n sin n ( ωt − θ ) (28)= (cid:88) n ( c n cos nθ − s n sin nθ ) cos nωt + (29)( s n cos nθ + c n sin nθ ) sin nωt (30)= (cid:88) n (cid:0) α n e inθ + α ∗ n e − inθ (cid:1) cos nωt + (31) (cid:0) − i (cid:2) α n e inθ − α ∗ n e − inθ (cid:3)(cid:1) sin nωt (32)= (cid:88) n (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) cos nωt + (33) (cid:0) − i (cid:2) α n ψ n − α ∗ n ψ − n (cid:3)(cid:1) sin nωt (34)= (cid:88) n A n ( ψ ) cos nωt + B n ( ψ ) sin nωt (35)where α n = ( c n + is n ) /

2, and ψ ≡ e iθ is a convenientchange of variable.We also deﬁne the following terms: (cid:100) Y M = (cid:104) y M θ (cid:105) (36) (cid:91) M M = (cid:10) M θ (cid:11) (37) M = (cid:104) M θ (cid:105) (38) M M = Var( M θ ) = (cid:91) M M − M (39) Y M = Cov( M θ , y ) = (cid:100) Y M − ¯ yM (40)For a given phase shift θ , the optimal amplitude andoﬀset are obtained from requiring the partial derivativesof the sum of squared residuals, χ , to be zero.ast Template Periodogram 5Namely, we obtain that0 = ∂χ ∂θ = 2 (cid:88) i w i ( y i − ˆ y i ) (cid:18) − ∂ ˆ y∂θ (cid:19) i (41)= (cid:88) i w i ( y i − θ M θ − θ ) M θ (42)= (cid:100) Y M − θ (cid:91) M M − θ M (43)and 0 = ∂χ ∂θ = 2 (cid:88) i w i ( y i − ˆ y i ) (cid:18) − ∂ ˆ y∂θ (cid:19) i (44)= (cid:88) i w i ( y i − θ M θ − θ ) (45)= ¯ y − θ M − θ (46)This system of equations can then be rewritten as (cid:18) (cid:91) M M MM (cid:19) (cid:18) θ θ (cid:19) = (cid:18) (cid:100) Y M ¯ y (cid:19) (47)which reduces to (cid:18) θ θ (cid:19) = 1 (cid:91) M M − M (cid:18) − M − M (cid:91) M M (cid:19) (cid:18) (cid:100)

Y M ¯ y (cid:19) (48)= 1 (cid:91) M M − M (cid:32) (cid:100) Y M − ¯ yM (cid:91) M M ¯ y − (cid:100) Y M M (cid:33) (49)Letting

M M = (cid:91) M M − M and Y M = (cid:100) Y M − ¯ yM , wehave (cid:18) θ θ (cid:19) = (cid:18) Y M/M M ¯ y − M ( Y M/M M ) (cid:19) (50)This means we can rewrite the model ˆ y = θ M θ + θ as ˆ y i = ¯ y + (cid:18) Y MM M (cid:19) ( M i − M ) (51)To obtain an expression for the periodogram, P = 1 − χ /χ , we ﬁrst compute χ χ = (cid:88) i w i ( y i − ˆ y i ) (52)= (cid:88) i w i ( y i − y i ˆ y i + ˆ y i ) (53)= Y Y − Y M ) M M + (

Y M ) M M (54)=

Y Y − ( Y M ) M M (55)Since, χ = Y Y , we have P ( ω ) = ( Y M ) Y Y · M M (56)We wish to maximize P ( ω ) with respect to the phaseshift parameter θ , ∂ θ P = 0 = Y MY Y · M M (cid:18) ∂ θ ( Y M ) − Y MM M ∂ θ ( M M ) (cid:19) (57)= 2 M M ∂ θ ( Y M ) − Y M ∂ θ ( M M ) . (58)The ﬁnal expression is the non-linear condition thatmust be satisﬁed by the optimal phase shift parameter θ .However, satisfying Equation 57 is not suﬃcient to guar-antee that θ is optimal. The value of the periodogramat each θ satisfying Equation 57 must be computed, andthe globally optimal solution chosen from this set.We seek a more explicit form for Equation 57. Wederive expressions for M M and

Y M , deﬁning CC nm ≡ Cov(cos nωt, cos mωt ) (59) CS nm ≡ Cov(cos nωt, sin mωt ) (60) SS nm ≡ Cov(sin nωt, sin mωt ) (61)

Y C n ≡ (cid:104) ( y − ¯ y ) cos nωt (cid:105) (62) Y S n ≡ (cid:104) ( y − ¯ y ) sin nωt (cid:105) , (63)all of which can be evaluated eﬃciently using the NFFT.The autocovariance of the template values M M , isgiven by

M M ≡ (cid:88) i w i M i − (cid:32)(cid:88) i w i M i (cid:33) (64)= (cid:88) i w i (cid:32)(cid:88) n A n cos ωnt i + B n sin ωnt i (cid:33) (65) − (cid:32)(cid:88) n A n C n + B n S n (cid:33) (66)= (cid:88) n,m A n A m CC nm + ( A n B m CS nm + B n A m ( CS T ) nm )+ B n B m SS nm (67)= (cid:88) n,m A n A m CC nm + 2 A n B m CS nm + B n B m SS nm , (68)using (cid:88) n,m A n B m CS nm = (cid:88) n,m A m B n CS mn . (69)We also derive the products A n A m , A n B m , B n B m : Hoﬀman et al. 2021 A n A m = (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) (cid:0) α m ψ m + α ∗ m ψ − m (cid:1) = α n α m ψ n + m + α ∗ n α m ψ m − n + α n α ∗ m ψ n − m + α ∗ n α ∗ m ψ − n − m ) (70) A n B m = − i (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) (cid:0) α m ψ m − α ∗ m ψ − m (cid:1) = − i (cid:8) α n α m ψ n + m + α ∗ n α m ψ m − n − α n α ∗ m ψ n − m − α ∗ n α ∗ m ψ − n − m ) (cid:111) (71) B n B m = − (cid:0) α n ψ n − α ∗ n ψ − n (cid:1) (cid:0) α m ψ m − α ∗ m ψ − m (cid:1) = − (cid:8) α n α m ψ n + m − α ∗ n α m ψ m − n − α n α ∗ m ψ n − m + α ∗ n α ∗ m ψ − n − m (cid:9) (72)Now we have that M M nm = A n A m CC nm + 2 A n B m CS nm + B n B m SS nm = α n α m (cid:103) CC nm ψ n + m + 2 α n α ∗ m (cid:103) CS nm ψ n − m + α ∗ n α ∗ m (cid:102) SS nm ψ − ( n + m ) (73)where (cid:103) CC nm = ( CC nm − SS nm ) − i (cid:0) CS + CS T (cid:1) nm (74) (cid:103) CS nm = ( CC nm + SS nm ) + i (cid:0) CS − CS T (cid:1) nm (75) (cid:102) SS nm = ( CC nm − SS nm ) + i (cid:0) CS + CS T (cid:1) nm (76)and for Y M : Y M k = A k Y C k + B k Y S k = α k Y C k ψ k + α ∗ k Y C k ψ − k − i (cid:0) α k Y S k ψ k − α ∗ k Y S k ψ − k (cid:1) = ( Y C k − iY S k ) α k ψ k + ( Y C k + iY S k ) α ∗ k ψ − k = α k (cid:103) Y C k ψ k + α ∗ k (cid:103) Y C ∗ k ψ − k . (77)We also deﬁne Y M (cid:48) = ψ H Y M and

M M (cid:48) = ψ H M M ,both of which are polynomials in ψ . Their derivativesare ∂Y M (cid:48) = Hψ H − Y M + ψ H ∂Y M (78) ∂M M (cid:48) = 2 Hψ H − M M + ψ H ∂M M. (79)A new polynomial condition can then be expressed interms of M M (cid:48) , Y M (cid:48) and their derivatives. 0 = 2

M M ∂ ( Y M ) − Y M ∂ ( M M ) (80)= ψ H +1 (2 M M ∂ ( Y M ) − Y M ∂ ( M M )) (81)= 2 ψ H M M (cid:0) ψ H +1 ∂ ( Y M ) (cid:1) − ψ H Y M (cid:0) ψ H +1 ∂ ( M M ) (cid:1) (82)= 2 M M (cid:48) ( ψ∂ ( Y M (cid:48) ) − H ( Y M (cid:48) )) − Y M (cid:48) ( ψ∂ ( M M (cid:48) ) − H ( M M (cid:48) )) (83)= ψ (2 M M (cid:48) ∂ ( Y M (cid:48) ) − Y M (cid:48) ∂ ( M M (cid:48) )) − H ( M M (cid:48)

Y M (cid:48) − Y M (cid:48)

M M (cid:48) ) (84)= 2

M M (cid:48) ∂ ( Y M (cid:48) ) − Y M (cid:48) ∂ ( M M (cid:48) ) (85)The last step assumes that ψ (cid:54) = 0, which is a validassumption since ψ = e iθ lies on the unit circle for allreal θ .We solve for the zeros of the polynomialcondition deﬁned by Equation 80 using the numpy.polynomial.polyroots function, which solvesfor the eigenvalues of the polynomial companion matrix.Solving for the zeros of a polynomial given a set ofcoeﬃcients is unstable in certain cases, since the coef-ﬁcients are represented as ﬂoating point numbers withﬁnite precision. Thus, we scale the roots by their modu-lus to ensure they lie on the unit circle. Alternatively, wecould use iterative schemes such as Newton’s method toimprove the estimate of the roots more robustly, howeverthis requires more computational power and the accuracyof the roots was not a problem for any of the cases theauthors have tested.2.1. Negative amplitude solutions

The model ˆ y = θ M θ + θ allows for θ < − cos x = cos( x − π ) and − sin x = sin( x − π ).However, for non-sinusoidal templates, M , negativeamplitudes do not generally correspond to a phase dif-ference. For example, a detached eclipsing binary tem-plate M EB ( x ) cannot be expressed in terms of a phase-shifted negative eclipsing binary template; i.e. M EB (cid:54) = − M EB ( x − φ ) for any φ ∈ [0 , π ).Negative amplitude solutions found by the fast tem-plate periodogram are usually undesirable, as they mayproduce false positives for lightcurves that resembleﬂipped versions of the desired template, and allowing for θ < P FTP ( ω ) = 0 if the optimal solution for θ is negative,but this complicates the interpretation of P FTP . Anotherpossible remedy is, for frequencies that have a θ < θ >

0, e.g. via non-linear optimization, butthis likely will eliminate the computational advantage ofFTP over existing methods.Thus, we allow for negative amplitude solutions in themodel ﬁt and caution the user to check that the best ﬁt θ is positive.ast Template Periodogram 72.2. Extending to multi-band observations

Multi-phase model

As shown in VanderPlas & Ivezi´c (2015), the multi-phase periodogram (their ( N base , N band ) = (0 ,

1) peri-odogram), for any model can be expressed as a linearcombination of single-band periodograms: P (0 , ( ω ) = (cid:80) Kk =1 χ ,k P k ( ω ) (cid:80) Kk =1 χ ,k (86)where K denotes the number of bands, χ ,k is theweighted sum of squared residuals between the data inthe k -th band and its weighted mean (cid:104) y (cid:105) , and P k ( ω ) isthe periodogram value of the k -th band at the trial fre-quency ω .With Equation 86, the template periodogram is readilyapplicable to multi-band time series, which is crucial forexperiments like LSST, SDSS, Pan-STARRS, and othercurrent and future photometric surveys.Other multi-band extensions of the template peri-odogram are provided in Appendix A.2.3. Computational requirements

For a given number of harmonics H , the task of deriv-ing the polynomial given in Equation 80 requires O ( H )computations, and ﬁnding the roots of this polynomialrequires O ( H ) computations. The degree of the ﬁnalpolynomial is 6 H − N f trial frequencies, the polynomialcomputation and root-ﬁnding step scales as O ( H N f ).The computation of the sums (Equations 59 – 63) scalesas O ( HN f log HN f ).Therefore, the entire template periodogram scales as O ( HN f log HN f + H N f ) . (87)However, an important consideration is that the peakwidth scales inversely with the number of harmonics δf peak ∝ /H and so the number of trial frequenciesneeded to resolve a peak increases linearly with the num-ber of harmonics in the template.The computational scaling factoring in an extra powerof H is therefore O ( H N f log HN f + H N f ) . (88)For a ﬁxed number of harmonics H , the template peri-odogram scales as O ( N f log N f ). However, for a constantnumber of trial frequencies N f , the template algorithmscales as O ( H ), and computational resources alone limit H to reasonably small numbers H (cid:46)

15 (see Figure 1). IMPLEMENTATIONAn open-source implementation of the template peri-odogram in Python is available. Polynomial algebra isperformed using the numpy.polynomial module (Joneset al. 2001–). The nfft

Python module, which pro-vides a Python implementation of the non-equispaced https://github.com/PrincetonUniversity/FastTemplatePeriodogram https://github.com/jakevdp/nfft H (number of harmonics) ¤ ¤ E x ec . t i m e [ s ] / ( H × N f ) < ( HN f ) ( HN f ) ( N f H ) N obs = 10 N f = 125 N obs = 100 N f = 1250 Fig. 1.—

Computation time of FTP scaled by NH for diﬀerentnumbers of harmonics. For H (cid:46)

3, FTP scales sublinearly in H (possibly due to a constant overhead per trial frequency, indepen-dent of H ). When 3 (cid:46) H (cid:46)

11, FTP scales approximately linearlyin H , and when H (cid:38)

11 FTP approaches the O ( H ) scaling limit. fast Fourier transform, is used to compute the necessarysums for a particular time series.No explicit parallelism is used anywhere in the cur-rent implementation, however certain linear algebra op-erations in Scipy use OpenMP via calls to BLAS librariesthat have OpenMP enabled.All timing tests were run on a quad-core 2.6 GHz IntelCore i7 MacBook Pro laptop (mid-2012 model) with 8GBof 1600 MHz DDR3 memory. The

Scipy stack (version0.18.1) was compiled with multi-threaded MKL libraries.3.1.

Comparison with non-linear optimization

In order to evaluate the accuracy and speed of the tem-plate periodogram, we have included slower alternativesolvers within the Python implementation of the FTPthat employ non-linear optimization to ﬁnd the best ﬁtparameters.Periodograms computed in Figures 2, 3, and 4 usedsimulated data. The simulated data has uniformlyrandom observation times, with Gaussian-random, ho-moskedastic, uncorrelated uncertainties. An eclipsing bi-nary template, generated by ﬁtting a well-sampled, highsignal-to-noise eclipsing binary in the HATNet dataset(BD+56 603) with a 10-harmonic truncated Fourier se-ries. 3.1.1.

Accuracy

For weak signals or signals folded at the incorrect trialperiod, there may be a large number of local χ min-ima in the parameter space, and thus non-linear opti-mization algorithms may have trouble ﬁnding the globalminimum. The FTP, on the other hand, solves for theoptimal parameters directly, and thus is able to recover Hoﬀman et al. 2021 Frequency [ d ] Multi-harmonic Lomb ScargleBox Least Squares0.0 0.5 1.0

PhaseH = 1H = 2H = 5H = 9H = 10 Template periodogramTemplate fits

Fig. 2.—

Template periodograms performed on a simulated eclipsing binary lightcurve (shown phase-folded in the left-hand plots). Thetop-most plot uses only one harmonic, equivalent to a Lomb-Scargle periodogram. Subsequent plots use an increasing number of harmonics,which produces a narrower and higher peak height around the correct frequency. For comparison, the multi-harmonic extension to Lomb-Scargle is plotted in blue, using the same number of harmonics as the FTP. The Box Least-Squares (Kov´acs et al. 2002) periodogram isshown in the ﬁnal plot. P FTP ( | H = 10) P s l o w () R = 0.995 Fig. 3.—

Comparing accuracy between previous methods thatrely on non-linear optimization at each trial frequency with thefast template periodogram described in this paper. Both methodsare applied to the same simulated data as shown in Figure 2. TheFTP consistently ﬁnds more optimal template ﬁts than those foundwith non-linear optimization, which do not guarantee convergenceto a globally optimal solution. The FTP solves for the optimal ﬁtparameters directly, and therefore is able to achieve greater accu-racy than template ﬁts done via non-linear optimization. optimal solutions even when the signal is weak or notpresent.Figure 3 illustrates the accuracy improvement withFTP. Many solutions found via non-linear optimizationare signiﬁcantly suboptimal compared to the solutionsfound by the FTP.Figure 4 compares FTP results obtained using the fulltemplate ( H = 10) with those obtained using smallernumbers of harmonics. The left-most plot compares the H = 1 case (weighted Lomb-Scargle), which, as alsodemonstrated in Figure 2, illustrates the advantage ofthe template periodogram for known, non-sinusoidal sig-nal shapes. 3.1.2. Computation time

FTP scales asymptotically as O ( N f H log N f H ) withrespect to the number of trial frequencies, N f and as O ( N f H ) with respect to the number of harmonics inwhich the template is expanded, H . For a given resolv-ing power, however, there is an additional factor of H ineach of these terms due to the number of trial frequenciesnecessary to resolve a periodogram peak being propor-tional to H . However, for reasonable cases ( N f (cid:46) when H = 5) the computation time is dominated by com-puting polynomial coeﬃcients and root ﬁnding, both ofwhich scale linearly in N f .The number of trial frequencies needed for ﬁnding as-trophysical signals in a typical photometric time seriesis N f = 1 . × (cid:18) H (cid:19) (cid:18) baseline10 yrs (cid:19) (cid:16) α (cid:17) (cid:18)

Comparing the template periodogram calculated with H = 10 harmonics to the template periodogram using a smaller numberof harmonics H <

10. The template and data used to perform the periodogram calculations are the same as those shown in Figure 2. Number of datapoints ¤ E x ec u t i o n t i m e [ s ] H = 1 H = 3 H = 6 Constant cadence

Fast template periodogramNon-linear optimization 10 Number of datapoints E x ec u t i o n t i m e [ s ] H = 3 N f = 499 Constant baseline

Fast template periodogramNon-linear optimization

Fig. 5.—

Computation time of FTP compared with alternative techniques that use non-linear optimization at each trial frequency.

Left : timing for the case when N f = 12 N obs , i.e. the cadence of the observations is constant. Right : timing for the case when N f isﬁxed, i.e. the baseline of the observations are constant. Non-linear optimization techniques scale as O ( N f N obs ) while the FTP scales as O ( HN f log HN f + N f H ), where H is the number of harmonics needed to approximate the template. where α represents the “oversampling factor,”∆ f peak / ∆ f , where ∆ f peak ∼ / baseline is the typ-ical width of a peak in the periodogram and ∆ f is thefrequency spacing of the periodogram.Extrapolating from the timing of a test case (500 obser-vations, 5 harmonics, 15,000 trial frequencies), the sum-mations account for approximately 5% of the computa-tion time when N f ∼ . If polynomial computationsand root-ﬁnding can be improved to the point where theyno longer dominate the computation time, this wouldprovide an order of magnitude speedup over the currentimplementation.Figure 5 compares the timing of the FTP with that ofprevious methods that employ non-linear optimization.For the case when N f ∝ N obs , FTP achieves a factor of 3speedup for even the smallest test case (15 datapoints),while for larger cases ( N ∼ ) FTP oﬀers 2-3 orders ofmagnitude speed improvement. For the constant baselinecase, FTP is a factor of ∼ ∼

20 faster for N obs ∼ . Futureimprovements to the FTP implementation could further improve speedups by 1-2 orders of magnitude over non-linear optimization. DISCUSSIONTemplate ﬁtting is a powerful technique for accuratelyrecovering the period and amplitude of objects with apriori known lightcurve shapes. It has been used inthe literature by, e.g. Stringer et al. (2019); Sesar et al.(2016, 2010), to analyze RR Lyrae in the SDSS, PS1, andDES datasets, where it has been shown to produce purersamples of RR Lyrae at a given completeness. The com-putational cost of current template ﬁtting algorithms,however, limits their application to larger datasets orwith a larger number of templates.We have presented a novel template ﬁtting algorithmthat extends the Lomb-Scargle periodogram (Lomb 1976;Scargle 1982; Barning 1963; Van´ıˇcek 1971) to handle non-sinusoidal signals that can be expressed in terms of atruncated Fourier series with a reasonably small numberof harmonics ( H (cid:46) O ( N f H log N f H + N f H ), while previoustemplate ﬁtting algorithms such as the one used in the gatspy library (VanderPlas 2016), scale as O ( N f N obs ).However, the FTP eﬀectively scales as O ( N f H ), sincethe time needed to compute polynomial coeﬃcients andperform zero-ﬁnding dominates the computational timefor all practical cases ( N f (cid:46) ). The H scaling ef-fectively restricts templates to those that are suﬃcientlysmooth to be explained by a small number of Fourierterms.FTP also improves the accuracy of previous templateﬁtting algorithms, which rely on non-linear optimizationat each trial frequency to minimize the χ of the templateﬁt. The FTP routinely ﬁnds superior ﬁts over non-linearoptimization methods.An open-source Python implementation of the FTP isavailable at GitHub. The current implementation couldlikely be improved by:1. Improving the speed of the polynomial coeﬃcient calculations and the zero-ﬁnding steps. This couldpotentially yield a speedup of ∼ − ∼ few forlightcurves with O (100) observations, and by an order ofmagnitude or more for objects with more than 1,000 ob-servations. These improvements, taken at face value, arenot enough to make template ﬁtting feasible on LSST-sized datasets. However, optimizing the polynomial com-putations could yield a factor of ∼ −

100 speedupover the current implementation, which would make theFTP 1-3 orders of magnitude faster than alternative tech-niques.APPENDIX

SHARED-PHASE MULTI-BAND TEMPLATE PERIODOGRAM

We derive a multi-band extension for the template periodogram for data taken in K ﬁlters, with N k observations inthe k -th ﬁlter. We use the same model as the one described in Sesar et al. (2016) in order to illustrate the applicabilityof the template periodogram to more sophisticated scenarios.The i -th observation in the k -th ﬁlter is denoted y ( k ) i . We wish to ﬁt a periodic, multi-band template M =( M (1) , M (2) , ..., M ( K ) ) : [0 , π ) → R K to all observations. We assume the same model used by Sesar et al. (2016),which assumes the relative amplitudes, phase shifts, and oﬀsets are shared across bands:ˆ y ( k ) ( t | θ ) = θ M ( k ) ( ωt − θ ) + θ + λ ( k ) (A1)where λ ( k ) is a ﬁxed relative oﬀset for band k . The χ for this model is χ = K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:16) y ( k ) i − ˆ y ( k ) i (cid:17) (A2)To make things simpler, we can set the λ ( k ) values to 0 simply by subtracting them oﬀ from our observations; thismeans we take all y ( k ) i → y ( k ) i − λ ( k ) . We have that χ = K (cid:88) k =1 W ( k ) χ k (A3)Where W ( k ) ≡ (cid:80) N k i =1 w ( k ) i and (cid:80) Kk =1 W ( k ) = 1. This means that the system of equations reduces to:0 = ∂χ ∂θ = K (cid:88) k =1 W ( k ) (cid:18) (cid:100) Y M ( k ) − θ (cid:91) M M ( k ) − θ M ( k ) (cid:19) (A4)for the θ parameter, where (cid:100) Y M ( k ) , (cid:91) M M ( k ) , M ( k ) are values from the single band case computed for each bandindividually, holding W ( k ) = 1 for each band. That is: https://github.com/PrincetonUniversity/ FastTemplatePeriodogram ast Template Periodogram 11 (cid:100) Y M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i y ( k ) i M ( k ) θ ( ωt i ) (A5) (cid:91) M M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i (cid:16) M ( k ) θ ( ωt i ) (cid:17) (A6) M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i M ( k ) θ ( ωt i ) (A7)For the oﬀset θ , we have 0 = ∂χ ∂θ = K (cid:88) k =1 W ( k ) (cid:16) ¯ y ( k ) − θ M ( k ) − θ (cid:17) (A8)Where ¯ y ( k ) is the weighted-mean for the k -th band (¯ y ( k ) ≡ (1 /W ( k ) ) (cid:80) N k i =1 w ( k ) i y ( k ) i ), again with y ( k ) i → y ( k ) i − λ ( k ) .So if we redeﬁne the quantities (cid:100) Y M , (cid:91) M M , M , and ¯ y as weighted averages across the bands, i.e. (cid:100) Y M = (cid:80) Kk =1 W ( k ) (cid:100) Y M ( k ) , etc., the solution for the optimal parameters has the same form: θ = (cid:18) Y MM M (cid:19) (A9) θ = ¯ y − M (cid:18) Y MM M (cid:19) (A10)which means that the model for the k -th band isˆ y ( k ) = ¯ y + (cid:18) Y MM M (cid:19) (cid:16) M ( k ) i − M (cid:17) (A11)The form of the periodogram has the same form as the single-band case: χ = K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:16) y ( k ) i − ˆ y ( k ) i (cid:17) (A12)= K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:18) ( y ( k ) i − ¯ y ) − (cid:18) Y MM M (cid:19) ( M ( k ) i − M ) (cid:19) (A13)= K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:32) ( y ( k ) i − ¯ y ) − (cid:18) Y MM M (cid:19) ( M ( k ) i − M )( y ( k ) i − ¯ y ) + (cid:18) Y MM M (cid:19) ( M ( k ) i − M ) (cid:33) (A14)= Y Y − Y M ) M M + (

Y M ) M M (A15)=

Y Y − ( Y M ) M M . (A16)Since there is a single shared oﬀset between the bands (i.e. we assume the mean magnitude is the same in all bandsafter subtracting λ ( k ) ), the variance for the signal, Y Y , is not (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i ( y ( k ) i − ¯ y ( k ) ) but (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i ( y ( k ) i − ¯ y ) .Construction of the polynomial for the multi-band case can be performed by ﬁrst computing the polynomial expres-sion for ψ H (cid:91) M M = (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i (cid:16) ψ H M ( k ) i (cid:17) and a separate polynomial expression for ψ H M , which can then besquared and subtracted from ψ H (cid:91) M M to ﬁnd

M M (cid:48) = ψ H M M .For

Y M (cid:48) = ψ H Y M , we merely compute ψ H (cid:100) Y M ( k ) for each band, take the weighted average of the polynomialcoeﬃcients for all bands ( ψ H (cid:100) Y M = (cid:80) Kk W ( k ) ψ H (cid:100) Y M ( k ) ) and subtract the ψ H ¯ yM polynomial to get Y M (cid:48) .After computing the polynomials

M M (cid:48) and

Y M (cid:48) , the polynomial in Equation 80 can be computed quickly and thefollowing steps for ﬁnding the optimal model parameters is the same as in the single band case.2 Hoﬀman et al. 2021The computational complexity of this model scales as O ( KN f H log N f H + N f H ) where K is the number of ﬁlters.Since the polynomial zero-ﬁnding step is the limiting computation in most real-world applications, the multi-bandtemplate periodogram corresponding to the Sesar et al. (2016) model should not be signiﬁcantly more computationallyintensive than the single-band case. DATA AVAILIBILITY