A Fast Template Periodogram for Detecting Non-sinusoidal Fixed-shape Signals in Irregularly Sampled Time Series
DDraft version February 9, 2021
Preprint typeset using L A TEX style emulateapj v. 01/23/15
A FAST TEMPLATE PERIODOGRAM FOR DETECTING NON-SINUSOIDAL FIXED-SHAPE SIGNALS INIRREGULARLY SAMPLED TIME SERIES
J. Hoffman , J. Vanderplas , J. D. Hartman , G. ´A Bakos Draft version February 9, 2021
ABSTRACTAstrophysical time series often contain periodic signals. The large and growing volume of time seriesdata from photometric surveys demands computationally efficient methods for detecting and charac-terizing such signals. The most efficient algorithms available for this purpose are those that exploitthe O ( N log N ) scaling of the Fast Fourier Transform (FFT). However, these methods are not optimalfor non-sinusoidal signal shapes. Template fits (or periodic matched filters) optimize sensitivity for a priori known signal shapes but at a significant computational cost. Current implementations oftemplate periodograms scale as O ( N f N obs ), where N f is the number of trial frequencies and N obs isthe number of lightcurve observations, and due to non-convexity, they do not guarantee the best fitat each trial frequency, which can lead to spurious results. In this work, we present a non-linear ex-tension of the Lomb-Scargle periodogram to obtain a template-fitting algorithm that is both accurate(globally optimal solutions are obtained except in pathological cases) and computationally efficient(scaling as O ( N f log N f ) for a given template). The non-linear optimization of the template fit at eachfrequency is recast as a polynomial zero-finding problem, where the coefficients of the polynomial canbe computed efficiently with the non-equispaced fast Fourier transform. We show that our method,which uses truncated Fourier series to approximate templates, is an order of magnitude faster thanexisting algorithms for small problems ( N (cid:46)
10 observations) and 2 orders of magnitude faster forlong base-line time series with N obs (cid:38) observations. An open-source implementation of the fasttemplate periodogram is available at github.com/PrincetonUniversity/FastTemplatePeriodogram. INTRODUCTIONAstronomical systems exhibit a wide range of time-dependent variability. By measuring and characterizingthis variability, astronomers are able to infer a variety ofimportant astrophysical properties about the underlyingsystem.Periodic signals in noisy astronomical timeseries canbe detected with a number of techniques, includingGaussian process regression (Foreman-Mackey et al.2017; Rasmussen & Williams 2005), least-squares spec-tral analysis (Lomb 1976; Scargle 1982; Barning 1963;Van´ıˇcek 1971), and information-theoretic methods (Gra-ham et al. 2013a; Huijse et al. 2012; Cincotta et al. 1995).For an empirical comparison of some of these techniquesapplied to several astronomical survey datasets, see Gra-ham et al. (2013b).For stationary periodic signals, least-squares spectralanalysis — also known as the Lomb-Scargle (LS) pe-riodogram (Lomb 1976; Scargle 1982; Barning 1963;Van´ıˇcek 1971) — is perhaps the most sensitive andcomputationally efficient method of detection. The LSperiodogram can be made to scale as O ( N f log N f ),where N f is the number of trial frequencies, by utiliz-ing the non-equispaced fast Fourier transform (Keineret al. 2009; Dutt & Rokhlin 1993, NFFT) to evaluate [email protected] Xaxis, LLC, 3 World Trade Center, 175 Greenwich Street,30th Floor, New York, NY 10007 eScience Institute, University of Washington, Seattle, WA98195 Department of Astrophysical Sciences, Princeton University,Princeton NJ 08540 Institute for Advanced Study, 1 Einstein Drive, Princeton,NJ frequency-dependent sums, or by “extirpolating” irregu-larly spaced observations to a regular grid with Lagrangepolynomials (Press & Rybicki 1989).The LS periodogram fits the following model to a setof observations:ˆ y LS ( t | θ, ω ) = θ cos ωt + θ sin ωt. (1)where ω = 2 π/P is the (angular) frequency of the un-derlying signal, and θ and θ are the amplitudes of thesignal. When the data y i is composed of a sinusoidalcomponent and a white noise component (i.e., when themeasurement uncertainties are uncorrelated and Gaus-sian), the LS periodogram provides a maximum likeli-hood estimate for the model parameters ( ω, θ , and θ ).The LS “power” P LS ( ω ) has several definitions in theliterature (Zechmeister & K¨urster 2009), but we adoptthe following definition throughout the paper: P ( ω ) = χ − χ ( ω ) χ (2)Where χ is the weighted sum of squared residuals fora constant fit: χ = (cid:88) i w i ( y i − ¯ y ) (3)where ¯ y = (cid:80) i w i y i is the weighted mean of the ob-servations, and w i ∝ σ − i are the normalized weights foreach observation ( (cid:80) i w i = 1), and χ ( ω ) is the weightedsum of squared residuals for the best-fit model ˆ y at a a r X i v : . [ a s t r o - ph . I M ] F e b Hoffman et al. 2021given trial frequency ω = 2 π/P where P is the period: χ ( ω ) = min θ (cid:88) i w i ( y i − ˆ y ( t i | ω, θ )) (4)1.1. Bayesian interpretation
We note that, while this formalism captures many dataand modeling scenarios, it is not completely general. Forexample, correlated uncertainties are not handled here.A more general Bayesian treatment of periodic models inastronomical timeseries is better handled by expressinga posterior over the model parameters.Assuming a Gaussian likelihood for the observations y i ∼ N (ˆ y ( t i | ω, θ ) , σ i ), and uniform priors on both thefrequency parameter ω and the non-frequency parame-ters θ , the posterior is p ( ω, θ | X ) = p ( X | ω, θ ) p ( ω, θ ) p ( X ) (5) ∝ p ( ω, θ ) N obs (cid:89) i =1 N (ˆ y ( t i | ω, θ ) , σ i ) (6)where we use X = { ( t i , y i , σ i ) | < i < N obs } as short-hand for the lightcurve observations. The logarithm ofthe posterior islog p ( ω, θ | X ) = − N obs (cid:88) i =1 (cid:18) y i − ˆ y ( t i | ω, θ ) σ i (cid:19) + const . (7)= − W N obs (cid:88) i =1 w i ( y i − ˆ y ( t i | ω, θ )) + const . (8)where W = (cid:80) i σ − i . Thus, χ ( ω ) = min θ (cid:88) i w i ( y i − ˆ y ( t i | ω, θ )) (9)= min θ (cid:18) − W log p ( ω, θ | X ) + const . (cid:19) (10)= − W max θ log p ( ω, θ | X ) + const . (11)and therefore, since P ( ω ) = 1 − χ ( ω ) /χ , P ( ω ) = 2 W χ max θ log p ( ω, θ | X ) + const . (12)The Lomb-Scargle power is a linear transformation ofthe maximum of the log posterior over the non-frequencyparameters, with the frequency parameter held fixed.Thus, choosing the frequency that maximizes the peri-odogram value corresponds to finding a MAP estimateof the frequency parameter.A MAP interpretation is more general, and is applica-ble to scenarios not considered in this paper (e.g. corre-lated uncertainties, multi-dimensional timeseries, etc.),since all of these problems are amenable to MAP es-timation of their model parameters. However, in or-der to maintain consistency with notation in the Lomb- Scargle literature (e.g. Zechmeister & K¨urster 2009; Van-derPlas 2018), we keep our definition of the periodogramrestricted as above and only consider one-dimensionaltimeseries with heteroscedastic but uncorrelated uncer-tainties.Mortier et al. (2015a,b) explored a Bayesian interpre-tation of the periodogram. Their frequency periodogramdefinition marginalizes over non-frequency parametersin contrast to a (scaled) MAP value of the log like-lihood discussed here. They use similar assumptions(uncorrelated, Normal measurement uncertainties) anduse uniform priors on non-frequency parameters, buttheir periodogram power was based on marginal prob-abilities P ( ω ) = p ( ω | X ) ∝ (cid:82) θ ∈ Θ p ( ω, θ | X ) p ( ω, θ ) wherehere the periodogram power is related to log-probabilitieslog p ( ω | X, ˆ θ ( ω )) where ω is the angular frequency, X is shorthand for observations, and ˆ θ ( ω ) are the non-frequency parameters that maximize log p ( ω, θ | X ) at agiven trial frequency.Additionally, Mortier et al. (2015a) do not specificallyconsider the problem examined in this paper (periodictemplate fitting); they derive results for the classicalperiodogram model (sinusoidal model without constantoffset) derived in Lomb (1976) and Scargle (1982) aswell as the extension with a constant offset derived inZechmeister & K¨urster (2009). However, their generalmethodology (defining the periodogram power in termsof marginal probabilities), can be applied to our case andothers.As remarked in (VanderPlas 2018), Bayesian peri-odogram results should be taken with a grain of salt– usually the underlying assumptions of the statisticalmodel (uncorrelated, Normally-distributed measurementuncertainties, no systematics, etc.) are broken in manyreal-world astronomical timeseries..1.2. Extending Lomb-Scargle
The LS periodogram has numerous extensions to ac-count for, e.g., biased estimates of the mean brightness(Zechmeister & K¨urster 2009), non-sinusoidal signals(Schwarzenberg-Czerny 1996; Bretthorst 2003; Palmer2009; Baluev 2009, 2013c), multiple periodicities (Baluev2013a,b), multi-band observations (VanderPlas & Ivezi´c2015), and to mitigate overfitting of more flexible mod-els via regularization (VanderPlas & Ivezi´c 2015). Baluev(2008) provided the framework for improving false alarmstatistics with extreme value theory, and this was gen-eralized further in Baluev (2013c). For a more detailedreview of the LS periodogram and its extensions, see Van-derPlas (2018).The multi-harmonic LS periodogram (Bretthorstet al. 1988; Schwarzenberg-Czerny 1996; Palmer 2009,MHGLS), provides a more flexible model by adding har-monic components to the fit:ˆ y MHLS ( t | θ, ω ) = θ + H (cid:88) n =1 θ n cos nωt + θ n − sin nωt. (13)Additional harmonics are important for modelingsignals that, while stationary and periodic, are non-sinusoidal (e.g. RR Lyrae, eclipsing binaries, etc.).However, the MHLS periodogram contains 2 H + 1ast Template Periodogram 3free parameters while the original LS periodogram con-tains only 2 (3 if the mean brightness is considered afree parameter). Including higher order harmonics addsmodel complexity, which can degrade the sensitivity ofthe MHLS periodogram to sinusoidal or approximatelysinusoidal signals.Tikhonov regularization (or L regularization), is onetool for mitigating overfitting of the higher order har-monics. However, adding an L regularization term tothe Fourier amplitudes adds bias to the model, and thatbias should be compared to the value added by the de-crease in model variance (VanderPlas & Ivezi´c 2015).1.3. Computational scaling
The LS periodogram naively scales as O ( N f N obs )where N f is the number of trial frequencies and N obs isthe number of observations. However, the limiting com-putations of the LS periodogram involve sums of trigono-metric functions over the observations. When the obser-vations are regularly sampled, the fast Fourier transform(FFT) (Cooley & Tukey 1965) can evaluate such sums ef-ficiently and the LS periodogram scales as O ( N f log N f ).When the data is not regularly sampled, as is the casefor most astronomical time series, the LS periodogramcan be evaluated quickly in one of two popular ways. Thefirst, by Press & Rybicki (1989) involves“extirpolating”irregularly sampled data onto a regularly sampled mesh,and then performing FFTs to evaluate the necessarysums. The second, as pointed out in Leroy (2012), isto use the non-equispaced FFT (Keiner et al. 2009; Dutt& Rokhlin 1993, NFFT) to evaluate the sums; this pro-vides roughly an order of magnitude speedup over thePress & Rybicki (1989) algorithm, and both algorithmsscale as O ( N f log N f ) (Leroy 2012).There is a growing population of alternative methodsfor detecting periodic signals in astrophysical data. Someof these methods can reliably outperform the LS peri-odogram, especially for non-sinusoidal signal shapes (seeGraham et al. (2013b) for a recent empirical review of pe-riod finding algorithms). However, a key advantage theLS periodogram and its extensions is speed. Virtuallyall other “phase-folding” methods scale as O ( N obs × N f ),where N obs is the number of observations and N f is thenumber of trial frequencies, while the Lomb-Scargle pe-riodogram scales as O ( N f log N f ). The virtual indepen-dence of Lomb-Scargle’s computation time with respectto the number of observations (assuming N f (cid:38) N obs ) isespecially valuable for lightcurves with N obs (cid:29) log N f ∼ O (10 ) observations of O (10 − ) stars. The Gaiatelescope (Gaia Collaboration et al. 2016) is set to pro-duce O (10 − O (10 ) stars. The LargeSynoptic Survey Telescope (LSST; LSST Science Collab-oration et al. (2009)) will make O (10 − ) observationsof O (10 ) stars during its operation starting in 2023.1.4. Template periodograms
When the shape of a stationary periodic signal isknown a priori , then the number of degrees of freedom is the same as the original LS periodogram (with a floatingmean component):ˆ y ( t | θ, ω ) = θ + θ M ( ωt − θ ) , (14)where M : [0 , π ) → R is a predefined periodic tem-plate. We refer to the periodogram corresponding to thismodel as the “template periodogram.”As is the case for the LS periodogram, the templateperiodogram is equivalent to a maximum-likelihood es-timate of the model parameters under the assumptionthat measurement uncertainties are Gaussian and uncor-related (i.e. white noise).This paper develops new extensions of least-squaresspectral analysis for arbitrary signal shapes. For non-periodic signals this method is known as matched filteranalysis, and can be extended to search for periodic sig-nals by, e.g., phase folding the data at different trial pe-riods.An analysis by Sesar et al. (2016) found that templatefitting significantly improved period and amplitude es-timation for RR Lyrae in Pan-STARRS DR1 photome-try (Chambers et al. 2016). Since the signal shapes forRR Lyrae in various bandpasses are known a priori (seeSesar et al. (2010)), template fitting provides an optimalestimate of amplitude and period, given that the objectis indeed an RR Lyrae star well modeled by at least oneof the templates. Templates were especially crucial forPan-STARRS data, since there are typically only 35 ob-servations per source over 5 bands (Hernitschek et al.2016), not enough to obtain accurate amplitudes empir-ically by phase-folding. By including domain knowledge(i.e. knowledge of what RR Lyrae lightcurves look like),template fitting allows for accurate inferences of ampli-tude even for undersampled lightcurves.However, the improved accuracy comes at substantialcomputational cost: the template fitting procedure took30 minutes per CPU per object, and Sesar et al. (2016)were forced to limit the number of fitted lightcurves( (cid:46) (cid:46) DERIVATIONSWe define a template MM : [0 , π ) → R , (15) Hoffman et al. 2021as a mapping between the unit interval and the set ofreal numbers. We restrict our discussion to sufficientlysmooth templates such that M can be adequately de-scribed by a truncated Fourier seriesˆ M ( ωt | H ) = H (cid:88) n =1 ( c n cos nωt + s n sin nωt ) (16)for some finite H > c n and s n values are fixed (i.e., they definethe template) is the crucial difference between the tem-plate periodogram and the multi-harmonic Lomb-Scargle(Palmer 2009; Bretthorst et al. 1988), where c n and s n are free parameters .We now construct a periodogram for this template.The periodogram assumes that an observed time series S = { ( t i , y i , σ i ) } Ni =1 can be modeled by a scaled, trans-posed template that repeats with period 2 π/ω , i.e. y i ≈ ˆ y ( ωt i | θ, M ) = θ M ( ωt i − θ ) + θ , (17)where θ = ( θ , θ , θ ) ∈ R is a set of model parameters.The optimal parameters are the location of a local min-imum of the (weighted) sum of squared residuals, χ ( θ, S ) ≡ (cid:88) i w i ( y i − ˆ y ( ωt i | θ )) , (18)and thus the following condition must hold for all threemodel parameters at the optimal solution θ = θ opt : ∂χ ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) θ = θ opt = 0 ∀ θ j ∈ θ. (19)Note that we have implicitly assumed χ ( θ, S ) is a C differentiable function of θ , which requires that M is a C differentiable function. Though this assumption couldbe violated if we considered a more complete set of tem-plates, (e.g. a box function), our restriction to truncatedFourier series ensures C differentiability.Note that we also implicitly assume σ i > i andwe will later assume that the variance of the observations y i is non-zero. If there are no measurement errors, i.e. σ i = 0 for all i , then uniform weights (setting σ i = 1)should be used. If the variance of the observations y is zero, the periodogram (as defined in Equation 2) isundefined for all frequencies. We do not consider thecase where σ i = 0 for some observations i and σ j > j .We can derive a system of equations for θ opt from thecondition given in Equation 19. The explicit conditionthat must be met for each parameter θ j is simplified be-low, using ˆ y i = ˆ y ( ωt i | θ ) (20)and ∂ j ˆ y i = ∂ ˆ y ( ωt | θ ) ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) t = t i (21) for brevity: 0 = ∂χ ∂θ j (cid:12)(cid:12)(cid:12)(cid:12) θ = θ opt = − (cid:88) i w i ( y i − ˆ y i ) ( ∂ j ˆ y ) i (cid:88) i w i y i ( ∂ j ˆ y ) i = (cid:88) i w i ˆ y i ( ∂ j ˆ y ) i . (22)The above is a general result that extends to all leastsquares periodograms. To simplify derivations, we adoptthe following notation: (cid:104) X (cid:105) ≡ (cid:88) i w i X i (23) (cid:104) XY (cid:105) ≡ (cid:88) i w i X i Y i (24)Cov( X, Y ) ≡ (cid:104) XY (cid:105) − (cid:104) X (cid:105) (cid:104) Y (cid:105) (25)Var( X ) ≡ Cov(
X, X ) (26)In addition, the transposed template M θ = M ( ωt i − θ ) can be expressed as M θ ( ωt ) = (cid:88) n c n cos n ( ωt − θ ) (27)+ s n sin n ( ωt − θ ) (28)= (cid:88) n ( c n cos nθ − s n sin nθ ) cos nωt + (29)( s n cos nθ + c n sin nθ ) sin nωt (30)= (cid:88) n (cid:0) α n e inθ + α ∗ n e − inθ (cid:1) cos nωt + (31) (cid:0) − i (cid:2) α n e inθ − α ∗ n e − inθ (cid:3)(cid:1) sin nωt (32)= (cid:88) n (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) cos nωt + (33) (cid:0) − i (cid:2) α n ψ n − α ∗ n ψ − n (cid:3)(cid:1) sin nωt (34)= (cid:88) n A n ( ψ ) cos nωt + B n ( ψ ) sin nωt (35)where α n = ( c n + is n ) /
2, and ψ ≡ e iθ is a convenientchange of variable.We also define the following terms: (cid:100) Y M = (cid:104) y M θ (cid:105) (36) (cid:91) M M = (cid:10) M θ (cid:11) (37) M = (cid:104) M θ (cid:105) (38) M M = Var( M θ ) = (cid:91) M M − M (39) Y M = Cov( M θ , y ) = (cid:100) Y M − ¯ yM (40)For a given phase shift θ , the optimal amplitude andoffset are obtained from requiring the partial derivativesof the sum of squared residuals, χ , to be zero.ast Template Periodogram 5Namely, we obtain that0 = ∂χ ∂θ = 2 (cid:88) i w i ( y i − ˆ y i ) (cid:18) − ∂ ˆ y∂θ (cid:19) i (41)= (cid:88) i w i ( y i − θ M θ − θ ) M θ (42)= (cid:100) Y M − θ (cid:91) M M − θ M (43)and 0 = ∂χ ∂θ = 2 (cid:88) i w i ( y i − ˆ y i ) (cid:18) − ∂ ˆ y∂θ (cid:19) i (44)= (cid:88) i w i ( y i − θ M θ − θ ) (45)= ¯ y − θ M − θ (46)This system of equations can then be rewritten as (cid:18) (cid:91) M M MM (cid:19) (cid:18) θ θ (cid:19) = (cid:18) (cid:100) Y M ¯ y (cid:19) (47)which reduces to (cid:18) θ θ (cid:19) = 1 (cid:91) M M − M (cid:18) − M − M (cid:91) M M (cid:19) (cid:18) (cid:100)
Y M ¯ y (cid:19) (48)= 1 (cid:91) M M − M (cid:32) (cid:100) Y M − ¯ yM (cid:91) M M ¯ y − (cid:100) Y M M (cid:33) (49)Letting
M M = (cid:91) M M − M and Y M = (cid:100) Y M − ¯ yM , wehave (cid:18) θ θ (cid:19) = (cid:18) Y M/M M ¯ y − M ( Y M/M M ) (cid:19) (50)This means we can rewrite the model ˆ y = θ M θ + θ as ˆ y i = ¯ y + (cid:18) Y MM M (cid:19) ( M i − M ) (51)To obtain an expression for the periodogram, P = 1 − χ /χ , we first compute χ χ = (cid:88) i w i ( y i − ˆ y i ) (52)= (cid:88) i w i ( y i − y i ˆ y i + ˆ y i ) (53)= Y Y − Y M ) M M + (
Y M ) M M (54)=
Y Y − ( Y M ) M M (55)Since, χ = Y Y , we have P ( ω ) = ( Y M ) Y Y · M M (56)We wish to maximize P ( ω ) with respect to the phaseshift parameter θ , ∂ θ P = 0 = Y MY Y · M M (cid:18) ∂ θ ( Y M ) − Y MM M ∂ θ ( M M ) (cid:19) (57)= 2 M M ∂ θ ( Y M ) − Y M ∂ θ ( M M ) . (58)The final expression is the non-linear condition thatmust be satisfied by the optimal phase shift parameter θ .However, satisfying Equation 57 is not sufficient to guar-antee that θ is optimal. The value of the periodogramat each θ satisfying Equation 57 must be computed, andthe globally optimal solution chosen from this set.We seek a more explicit form for Equation 57. Wederive expressions for M M and
Y M , defining CC nm ≡ Cov(cos nωt, cos mωt ) (59) CS nm ≡ Cov(cos nωt, sin mωt ) (60) SS nm ≡ Cov(sin nωt, sin mωt ) (61)
Y C n ≡ (cid:104) ( y − ¯ y ) cos nωt (cid:105) (62) Y S n ≡ (cid:104) ( y − ¯ y ) sin nωt (cid:105) , (63)all of which can be evaluated efficiently using the NFFT.The autocovariance of the template values M M , isgiven by
M M ≡ (cid:88) i w i M i − (cid:32)(cid:88) i w i M i (cid:33) (64)= (cid:88) i w i (cid:32)(cid:88) n A n cos ωnt i + B n sin ωnt i (cid:33) (65) − (cid:32)(cid:88) n A n C n + B n S n (cid:33) (66)= (cid:88) n,m A n A m CC nm + ( A n B m CS nm + B n A m ( CS T ) nm )+ B n B m SS nm (67)= (cid:88) n,m A n A m CC nm + 2 A n B m CS nm + B n B m SS nm , (68)using (cid:88) n,m A n B m CS nm = (cid:88) n,m A m B n CS mn . (69)We also derive the products A n A m , A n B m , B n B m : Hoffman et al. 2021 A n A m = (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) (cid:0) α m ψ m + α ∗ m ψ − m (cid:1) = α n α m ψ n + m + α ∗ n α m ψ m − n + α n α ∗ m ψ n − m + α ∗ n α ∗ m ψ − n − m ) (70) A n B m = − i (cid:0) α n ψ n + α ∗ n ψ − n (cid:1) (cid:0) α m ψ m − α ∗ m ψ − m (cid:1) = − i (cid:8) α n α m ψ n + m + α ∗ n α m ψ m − n − α n α ∗ m ψ n − m − α ∗ n α ∗ m ψ − n − m ) (cid:111) (71) B n B m = − (cid:0) α n ψ n − α ∗ n ψ − n (cid:1) (cid:0) α m ψ m − α ∗ m ψ − m (cid:1) = − (cid:8) α n α m ψ n + m − α ∗ n α m ψ m − n − α n α ∗ m ψ n − m + α ∗ n α ∗ m ψ − n − m (cid:9) (72)Now we have that M M nm = A n A m CC nm + 2 A n B m CS nm + B n B m SS nm = α n α m (cid:103) CC nm ψ n + m + 2 α n α ∗ m (cid:103) CS nm ψ n − m + α ∗ n α ∗ m (cid:102) SS nm ψ − ( n + m ) (73)where (cid:103) CC nm = ( CC nm − SS nm ) − i (cid:0) CS + CS T (cid:1) nm (74) (cid:103) CS nm = ( CC nm + SS nm ) + i (cid:0) CS − CS T (cid:1) nm (75) (cid:102) SS nm = ( CC nm − SS nm ) + i (cid:0) CS + CS T (cid:1) nm (76)and for Y M : Y M k = A k Y C k + B k Y S k = α k Y C k ψ k + α ∗ k Y C k ψ − k − i (cid:0) α k Y S k ψ k − α ∗ k Y S k ψ − k (cid:1) = ( Y C k − iY S k ) α k ψ k + ( Y C k + iY S k ) α ∗ k ψ − k = α k (cid:103) Y C k ψ k + α ∗ k (cid:103) Y C ∗ k ψ − k . (77)We also define Y M (cid:48) = ψ H Y M and
M M (cid:48) = ψ H M M ,both of which are polynomials in ψ . Their derivativesare ∂Y M (cid:48) = Hψ H − Y M + ψ H ∂Y M (78) ∂M M (cid:48) = 2 Hψ H − M M + ψ H ∂M M. (79)A new polynomial condition can then be expressed interms of M M (cid:48) , Y M (cid:48) and their derivatives. 0 = 2
M M ∂ ( Y M ) − Y M ∂ ( M M ) (80)= ψ H +1 (2 M M ∂ ( Y M ) − Y M ∂ ( M M )) (81)= 2 ψ H M M (cid:0) ψ H +1 ∂ ( Y M ) (cid:1) − ψ H Y M (cid:0) ψ H +1 ∂ ( M M ) (cid:1) (82)= 2 M M (cid:48) ( ψ∂ ( Y M (cid:48) ) − H ( Y M (cid:48) )) − Y M (cid:48) ( ψ∂ ( M M (cid:48) ) − H ( M M (cid:48) )) (83)= ψ (2 M M (cid:48) ∂ ( Y M (cid:48) ) − Y M (cid:48) ∂ ( M M (cid:48) )) − H ( M M (cid:48)
Y M (cid:48) − Y M (cid:48)
M M (cid:48) ) (84)= 2
M M (cid:48) ∂ ( Y M (cid:48) ) − Y M (cid:48) ∂ ( M M (cid:48) ) (85)The last step assumes that ψ (cid:54) = 0, which is a validassumption since ψ = e iθ lies on the unit circle for allreal θ .We solve for the zeros of the polynomialcondition defined by Equation 80 using the numpy.polynomial.polyroots function, which solvesfor the eigenvalues of the polynomial companion matrix.Solving for the zeros of a polynomial given a set ofcoefficients is unstable in certain cases, since the coef-ficients are represented as floating point numbers withfinite precision. Thus, we scale the roots by their modu-lus to ensure they lie on the unit circle. Alternatively, wecould use iterative schemes such as Newton’s method toimprove the estimate of the roots more robustly, howeverthis requires more computational power and the accuracyof the roots was not a problem for any of the cases theauthors have tested.2.1. Negative amplitude solutions
The model ˆ y = θ M θ + θ allows for θ < − cos x = cos( x − π ) and − sin x = sin( x − π ).However, for non-sinusoidal templates, M , negativeamplitudes do not generally correspond to a phase dif-ference. For example, a detached eclipsing binary tem-plate M EB ( x ) cannot be expressed in terms of a phase-shifted negative eclipsing binary template; i.e. M EB (cid:54) = − M EB ( x − φ ) for any φ ∈ [0 , π ).Negative amplitude solutions found by the fast tem-plate periodogram are usually undesirable, as they mayproduce false positives for lightcurves that resembleflipped versions of the desired template, and allowing for θ < P FTP ( ω ) = 0 if the optimal solution for θ is negative,but this complicates the interpretation of P FTP . Anotherpossible remedy is, for frequencies that have a θ < θ >
0, e.g. via non-linear optimization, butthis likely will eliminate the computational advantage ofFTP over existing methods.Thus, we allow for negative amplitude solutions in themodel fit and caution the user to check that the best fit θ is positive.ast Template Periodogram 72.2. Extending to multi-band observations
Multi-phase model
As shown in VanderPlas & Ivezi´c (2015), the multi-phase periodogram (their ( N base , N band ) = (0 ,
1) peri-odogram), for any model can be expressed as a linearcombination of single-band periodograms: P (0 , ( ω ) = (cid:80) Kk =1 χ ,k P k ( ω ) (cid:80) Kk =1 χ ,k (86)where K denotes the number of bands, χ ,k is theweighted sum of squared residuals between the data inthe k -th band and its weighted mean (cid:104) y (cid:105) , and P k ( ω ) isthe periodogram value of the k -th band at the trial fre-quency ω .With Equation 86, the template periodogram is readilyapplicable to multi-band time series, which is crucial forexperiments like LSST, SDSS, Pan-STARRS, and othercurrent and future photometric surveys.Other multi-band extensions of the template peri-odogram are provided in Appendix A.2.3. Computational requirements
For a given number of harmonics H , the task of deriv-ing the polynomial given in Equation 80 requires O ( H )computations, and finding the roots of this polynomialrequires O ( H ) computations. The degree of the finalpolynomial is 6 H − N f trial frequencies, the polynomialcomputation and root-finding step scales as O ( H N f ).The computation of the sums (Equations 59 – 63) scalesas O ( HN f log HN f ).Therefore, the entire template periodogram scales as O ( HN f log HN f + H N f ) . (87)However, an important consideration is that the peakwidth scales inversely with the number of harmonics δf peak ∝ /H and so the number of trial frequenciesneeded to resolve a peak increases linearly with the num-ber of harmonics in the template.The computational scaling factoring in an extra powerof H is therefore O ( H N f log HN f + H N f ) . (88)For a fixed number of harmonics H , the template peri-odogram scales as O ( N f log N f ). However, for a constantnumber of trial frequencies N f , the template algorithmscales as O ( H ), and computational resources alone limit H to reasonably small numbers H (cid:46)
15 (see Figure 1). IMPLEMENTATIONAn open-source implementation of the template peri-odogram in Python is available. Polynomial algebra isperformed using the numpy.polynomial module (Joneset al. 2001–). The nfft
Python module, which pro-vides a Python implementation of the non-equispaced https://github.com/PrincetonUniversity/FastTemplatePeriodogram https://github.com/jakevdp/nfft H (number of harmonics) ¤ ¤ E x ec . t i m e [ s ] / ( H × N f ) < ( HN f ) ( HN f ) ( N f H ) N obs = 10 N f = 125 N obs = 100 N f = 1250 Fig. 1.—
Computation time of FTP scaled by NH for differentnumbers of harmonics. For H (cid:46)
3, FTP scales sublinearly in H (possibly due to a constant overhead per trial frequency, indepen-dent of H ). When 3 (cid:46) H (cid:46)
11, FTP scales approximately linearlyin H , and when H (cid:38)
11 FTP approaches the O ( H ) scaling limit. fast Fourier transform, is used to compute the necessarysums for a particular time series.No explicit parallelism is used anywhere in the cur-rent implementation, however certain linear algebra op-erations in Scipy use OpenMP via calls to BLAS librariesthat have OpenMP enabled.All timing tests were run on a quad-core 2.6 GHz IntelCore i7 MacBook Pro laptop (mid-2012 model) with 8GBof 1600 MHz DDR3 memory. The
Scipy stack (version0.18.1) was compiled with multi-threaded MKL libraries.3.1.
Comparison with non-linear optimization
In order to evaluate the accuracy and speed of the tem-plate periodogram, we have included slower alternativesolvers within the Python implementation of the FTPthat employ non-linear optimization to find the best fitparameters.Periodograms computed in Figures 2, 3, and 4 usedsimulated data. The simulated data has uniformlyrandom observation times, with Gaussian-random, ho-moskedastic, uncorrelated uncertainties. An eclipsing bi-nary template, generated by fitting a well-sampled, highsignal-to-noise eclipsing binary in the HATNet dataset(BD+56 603) with a 10-harmonic truncated Fourier se-ries. 3.1.1.
Accuracy
For weak signals or signals folded at the incorrect trialperiod, there may be a large number of local χ min-ima in the parameter space, and thus non-linear opti-mization algorithms may have trouble finding the globalminimum. The FTP, on the other hand, solves for theoptimal parameters directly, and thus is able to recover Hoffman et al. 2021 Frequency [ d ] Multi-harmonic Lomb ScargleBox Least Squares0.0 0.5 1.0
PhaseH = 1H = 2H = 5H = 9H = 10 Template periodogramTemplate fits
Fig. 2.—
Template periodograms performed on a simulated eclipsing binary lightcurve (shown phase-folded in the left-hand plots). Thetop-most plot uses only one harmonic, equivalent to a Lomb-Scargle periodogram. Subsequent plots use an increasing number of harmonics,which produces a narrower and higher peak height around the correct frequency. For comparison, the multi-harmonic extension to Lomb-Scargle is plotted in blue, using the same number of harmonics as the FTP. The Box Least-Squares (Kov´acs et al. 2002) periodogram isshown in the final plot. P FTP ( | H = 10) P s l o w () R = 0.995 Fig. 3.—
Comparing accuracy between previous methods thatrely on non-linear optimization at each trial frequency with thefast template periodogram described in this paper. Both methodsare applied to the same simulated data as shown in Figure 2. TheFTP consistently finds more optimal template fits than those foundwith non-linear optimization, which do not guarantee convergenceto a globally optimal solution. The FTP solves for the optimal fitparameters directly, and therefore is able to achieve greater accu-racy than template fits done via non-linear optimization. optimal solutions even when the signal is weak or notpresent.Figure 3 illustrates the accuracy improvement withFTP. Many solutions found via non-linear optimizationare significantly suboptimal compared to the solutionsfound by the FTP.Figure 4 compares FTP results obtained using the fulltemplate ( H = 10) with those obtained using smallernumbers of harmonics. The left-most plot compares the H = 1 case (weighted Lomb-Scargle), which, as alsodemonstrated in Figure 2, illustrates the advantage ofthe template periodogram for known, non-sinusoidal sig-nal shapes. 3.1.2. Computation time
FTP scales asymptotically as O ( N f H log N f H ) withrespect to the number of trial frequencies, N f and as O ( N f H ) with respect to the number of harmonics inwhich the template is expanded, H . For a given resolv-ing power, however, there is an additional factor of H ineach of these terms due to the number of trial frequenciesnecessary to resolve a periodogram peak being propor-tional to H . However, for reasonable cases ( N f (cid:46) when H = 5) the computation time is dominated by com-puting polynomial coefficients and root finding, both ofwhich scale linearly in N f .The number of trial frequencies needed for finding as-trophysical signals in a typical photometric time seriesis N f = 1 . × (cid:18) H (cid:19) (cid:18) baseline10 yrs (cid:19) (cid:16) α (cid:17) (cid:18)
15 mins P min (cid:19) (89)ast Template Periodogram 9 P FTP ( | H = 1) P F T P ( | H = ) R = 0.120 P FTP ( | H = 2) R = 0.897 P FTP ( | H = 5) R = 0.964 P FTP ( | H = 9) R = 1.000 Fig. 4.—
Comparing the template periodogram calculated with H = 10 harmonics to the template periodogram using a smaller numberof harmonics H <
10. The template and data used to perform the periodogram calculations are the same as those shown in Figure 2. Number of datapoints ¤ E x ec u t i o n t i m e [ s ] H = 1 H = 3 H = 6 Constant cadence
Fast template periodogramNon-linear optimization 10 Number of datapoints E x ec u t i o n t i m e [ s ] H = 3 N f = 499 Constant baseline
Fast template periodogramNon-linear optimization
Fig. 5.—
Computation time of FTP compared with alternative techniques that use non-linear optimization at each trial frequency.
Left : timing for the case when N f = 12 N obs , i.e. the cadence of the observations is constant. Right : timing for the case when N f isfixed, i.e. the baseline of the observations are constant. Non-linear optimization techniques scale as O ( N f N obs ) while the FTP scales as O ( HN f log HN f + N f H ), where H is the number of harmonics needed to approximate the template. where α represents the “oversampling factor,”∆ f peak / ∆ f , where ∆ f peak ∼ / baseline is the typ-ical width of a peak in the periodogram and ∆ f is thefrequency spacing of the periodogram.Extrapolating from the timing of a test case (500 obser-vations, 5 harmonics, 15,000 trial frequencies), the sum-mations account for approximately 5% of the computa-tion time when N f ∼ . If polynomial computationsand root-finding can be improved to the point where theyno longer dominate the computation time, this wouldprovide an order of magnitude speedup over the currentimplementation.Figure 5 compares the timing of the FTP with that ofprevious methods that employ non-linear optimization.For the case when N f ∝ N obs , FTP achieves a factor of 3speedup for even the smallest test case (15 datapoints),while for larger cases ( N ∼ ) FTP offers 2-3 orders ofmagnitude speed improvement. For the constant baselinecase, FTP is a factor of ∼ ∼
20 faster for N obs ∼ . Futureimprovements to the FTP implementation could further improve speedups by 1-2 orders of magnitude over non-linear optimization. DISCUSSIONTemplate fitting is a powerful technique for accuratelyrecovering the period and amplitude of objects with apriori known lightcurve shapes. It has been used inthe literature by, e.g. Stringer et al. (2019); Sesar et al.(2016, 2010), to analyze RR Lyrae in the SDSS, PS1, andDES datasets, where it has been shown to produce purersamples of RR Lyrae at a given completeness. The com-putational cost of current template fitting algorithms,however, limits their application to larger datasets orwith a larger number of templates.We have presented a novel template fitting algorithmthat extends the Lomb-Scargle periodogram (Lomb 1976;Scargle 1982; Barning 1963; Van´ıˇcek 1971) to handle non-sinusoidal signals that can be expressed in terms of atruncated Fourier series with a reasonably small numberof harmonics ( H (cid:46) O ( N f H log N f H + N f H ), while previoustemplate fitting algorithms such as the one used in the gatspy library (VanderPlas 2016), scale as O ( N f N obs ).However, the FTP effectively scales as O ( N f H ), sincethe time needed to compute polynomial coefficients andperform zero-finding dominates the computational timefor all practical cases ( N f (cid:46) ). The H scaling ef-fectively restricts templates to those that are sufficientlysmooth to be explained by a small number of Fourierterms.FTP also improves the accuracy of previous templatefitting algorithms, which rely on non-linear optimizationat each trial frequency to minimize the χ of the templatefit. The FTP routinely finds superior fits over non-linearoptimization methods.An open-source Python implementation of the FTP isavailable at GitHub. The current implementation couldlikely be improved by:1. Improving the speed of the polynomial coefficient calculations and the zero-finding steps. This couldpotentially yield a speedup of ∼ − ∼ few forlightcurves with O (100) observations, and by an order ofmagnitude or more for objects with more than 1,000 ob-servations. These improvements, taken at face value, arenot enough to make template fitting feasible on LSST-sized datasets. However, optimizing the polynomial com-putations could yield a factor of ∼ −
100 speedupover the current implementation, which would make theFTP 1-3 orders of magnitude faster than alternative tech-niques.APPENDIX
SHARED-PHASE MULTI-BAND TEMPLATE PERIODOGRAM
We derive a multi-band extension for the template periodogram for data taken in K filters, with N k observations inthe k -th filter. We use the same model as the one described in Sesar et al. (2016) in order to illustrate the applicabilityof the template periodogram to more sophisticated scenarios.The i -th observation in the k -th filter is denoted y ( k ) i . We wish to fit a periodic, multi-band template M =( M (1) , M (2) , ..., M ( K ) ) : [0 , π ) → R K to all observations. We assume the same model used by Sesar et al. (2016),which assumes the relative amplitudes, phase shifts, and offsets are shared across bands:ˆ y ( k ) ( t | θ ) = θ M ( k ) ( ωt − θ ) + θ + λ ( k ) (A1)where λ ( k ) is a fixed relative offset for band k . The χ for this model is χ = K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:16) y ( k ) i − ˆ y ( k ) i (cid:17) (A2)To make things simpler, we can set the λ ( k ) values to 0 simply by subtracting them off from our observations; thismeans we take all y ( k ) i → y ( k ) i − λ ( k ) . We have that χ = K (cid:88) k =1 W ( k ) χ k (A3)Where W ( k ) ≡ (cid:80) N k i =1 w ( k ) i and (cid:80) Kk =1 W ( k ) = 1. This means that the system of equations reduces to:0 = ∂χ ∂θ = K (cid:88) k =1 W ( k ) (cid:18) (cid:100) Y M ( k ) − θ (cid:91) M M ( k ) − θ M ( k ) (cid:19) (A4)for the θ parameter, where (cid:100) Y M ( k ) , (cid:91) M M ( k ) , M ( k ) are values from the single band case computed for each bandindividually, holding W ( k ) = 1 for each band. That is: https://github.com/PrincetonUniversity/ FastTemplatePeriodogram ast Template Periodogram 11 (cid:100) Y M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i y ( k ) i M ( k ) θ ( ωt i ) (A5) (cid:91) M M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i (cid:16) M ( k ) θ ( ωt i ) (cid:17) (A6) M ( k ) ≡ W ( k ) N k (cid:88) i =1 w ( k ) i M ( k ) θ ( ωt i ) (A7)For the offset θ , we have 0 = ∂χ ∂θ = K (cid:88) k =1 W ( k ) (cid:16) ¯ y ( k ) − θ M ( k ) − θ (cid:17) (A8)Where ¯ y ( k ) is the weighted-mean for the k -th band (¯ y ( k ) ≡ (1 /W ( k ) ) (cid:80) N k i =1 w ( k ) i y ( k ) i ), again with y ( k ) i → y ( k ) i − λ ( k ) .So if we redefine the quantities (cid:100) Y M , (cid:91) M M , M , and ¯ y as weighted averages across the bands, i.e. (cid:100) Y M = (cid:80) Kk =1 W ( k ) (cid:100) Y M ( k ) , etc., the solution for the optimal parameters has the same form: θ = (cid:18) Y MM M (cid:19) (A9) θ = ¯ y − M (cid:18) Y MM M (cid:19) (A10)which means that the model for the k -th band isˆ y ( k ) = ¯ y + (cid:18) Y MM M (cid:19) (cid:16) M ( k ) i − M (cid:17) (A11)The form of the periodogram has the same form as the single-band case: χ = K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:16) y ( k ) i − ˆ y ( k ) i (cid:17) (A12)= K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:18) ( y ( k ) i − ¯ y ) − (cid:18) Y MM M (cid:19) ( M ( k ) i − M ) (cid:19) (A13)= K (cid:88) k =1 N k (cid:88) i =1 w ( k ) i (cid:32) ( y ( k ) i − ¯ y ) − (cid:18) Y MM M (cid:19) ( M ( k ) i − M )( y ( k ) i − ¯ y ) + (cid:18) Y MM M (cid:19) ( M ( k ) i − M ) (cid:33) (A14)= Y Y − Y M ) M M + (
Y M ) M M (A15)=
Y Y − ( Y M ) M M . (A16)Since there is a single shared offset between the bands (i.e. we assume the mean magnitude is the same in all bandsafter subtracting λ ( k ) ), the variance for the signal, Y Y , is not (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i ( y ( k ) i − ¯ y ( k ) ) but (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i ( y ( k ) i − ¯ y ) .Construction of the polynomial for the multi-band case can be performed by first computing the polynomial expres-sion for ψ H (cid:91) M M = (cid:80) Kk =1 (cid:80) N k i =1 w ( k ) i (cid:16) ψ H M ( k ) i (cid:17) and a separate polynomial expression for ψ H M , which can then besquared and subtracted from ψ H (cid:91) M M to find
M M (cid:48) = ψ H M M .For
Y M (cid:48) = ψ H Y M , we merely compute ψ H (cid:100) Y M ( k ) for each band, take the weighted average of the polynomialcoefficients for all bands ( ψ H (cid:100) Y M = (cid:80) Kk W ( k ) ψ H (cid:100) Y M ( k ) ) and subtract the ψ H ¯ yM polynomial to get Y M (cid:48) .After computing the polynomials
M M (cid:48) and
Y M (cid:48) , the polynomial in Equation 80 can be computed quickly and thefollowing steps for finding the optimal model parameters is the same as in the single band case.2 Hoffman et al. 2021The computational complexity of this model scales as O ( KN f H log N f H + N f H ) where K is the number of filters.Since the polynomial zero-finding step is the limiting computation in most real-world applications, the multi-bandtemplate periodogram corresponding to the Sesar et al. (2016) model should not be significantly more computationallyintensive than the single-band case. DATA AVAILIBILITY