Trend Filtering -- II. Denoising Astronomical Signals with Varying Degrees of Smoothness
Collin A. Politsch, Jessi Cisewski-Kehe, Rupert A. C. Croft, Larry Wasserman
MMNRAS , 1–15 (2020) Preprint 13 January 2020 Compiled using MNRAS L A TEX style file v3.0
Trend Filtering – II. Denoising Astronomical Signals withVarying Degrees of Smoothness
Collin A. Politsch, , , (cid:63) Jessi Cisewski-Kehe, Rupert A. C. Croft, , , and Larry Wasserman , , Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA 15213 Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213 McWilliams Center for Cosmology, Carnegie Mellon University, Pittsburgh, PA 15213 Department of Statistics and Data Science, Yale University, New Haven, CT 06520 Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213 School of Physics, University of Melbourne, VIC 3010, Australia
Accepted XXX. Received YYY; in original form 2019 August 20
ABSTRACT
Trend filtering—first introduced into the astronomical literature in Paper I of thisseries—is a state-of-the-art statistical tool for denoising one-dimensional signals thatpossess varying degrees of smoothness. In this work, we demonstrate the broad utilityof trend filtering to observational astronomy by discussing how it can contribute toa variety of spectroscopic and time-domain studies. The observations we discuss are(1) the Lyman- α forest of quasar spectra; (2) more general spectroscopy of quasars,galaxies, and stars; (3) stellar light curves with planetary transits; (4) eclipsing bi-nary light curves; and (5) supernova light curves. We study the Lyman- α forest in thegreatest detail—using trend filtering to map the large-scale structure of the intergalac-tic medium along quasar-observer lines of sight. The remaining studies share broadthemes of: (1) estimating observable parameters of light curves and spectra; and (2)constructing observational spectral/light-curve templates. We also briefly discuss theutility of trend filtering as a tool for one-dimensional data reduction and compression. Key words:
Methods: statistical, techniques: spectroscopic, cosmology: observations,stars: planetary systems, stars: binaries: eclipsing, supernovae: general
Many astronomical analyses can be described by the fol-lowing problem setup. Suppose we collect noisy measure-ments of an observable quantity (e.g., flux, magnitude, pho-ton counts) according to the data generating process f ( t i ) = f ( t i ) + (cid:15) i , i = , . . . , n , (1)where t , . . . , t n is an arbitrarily-spaced grid of one-dimensional inputs (e.g., times or wavelengths), (cid:15) i is meanzero noise, and f is a signal that may exhibit varying degreesof smoothness across the input domain (e.g., a smooth signalwith abrupt dips/spikes). Given the observed data sample,we then attempt to estimate (or denoise) the underlyingsignal f from the observations by applying appropriate sta-tistical methods. In Politsch et al. (2020)—hereafter referredto as Paper I—we introduced trend filtering (Tibshirani & (cid:63) E-mail: [email protected]
Taylor 2011; Tibshirani 2014) into the astronomical litera-ture. When the underlying signal is spatially heterogeneous,i.e. possesses varying degrees of smoothness, trend filteringis superior to popular statistical methods such as Gaussianprocess regression, smoothing splines, kernels, LOESS, andmany others (Nemirovskii et al. 1985; Nemirovskii 1985; Tib-shirani 2014). Furthermore, the trend filtering estimate canbe computed via a highly efficient and scalable convex op-timization algorithm (Ramdas & Tibshirani 2016) and onlyrequires data-driven selection of a single scalar hyperparam-eter. In this paper, we directly demonstrate the broad utilityof trend filtering to observational astronomy by using it tocarry out a diverse set of spectroscopic and time-domainanalyses.The outline of this paper is as follows. In Section 2, weuse trend filtering to study the Lyman- α forest of quasarspectra—a series of absorption features that can be usedas a tracer of the matter density distribution along quasar-observer lines of sight. We choose to study this application in © a r X i v : . [ a s t r o - ph . I M ] J a n C. A. Politsch et al. depth and then illustrate the breadth of trend filtering’s util-ity through our discussions in Section 3. The applications wediscuss in Section 3 can be grouped into two broad (and oftenintertwined) categories: (1) deriving estimates of observableparameters from trend-filtered observations; and (2) usingtrend filtering to construct spectral/light-curve templates ofastronomical objects/events. In Section 3.1, we discuss con-structing spectral template libraries for astronomical objectsby trend filtering coadded spectroscopic observations. We il-lustrate our approach on quasar, galaxy, and stellar spectrafrom the Sloan Digital Sky Survey. Emission-line parame-ters can also be robustly estimated by fitting radial basisfunctions (e.g., Gaussians) to trend-filtered estimates nearemission lines. In Section 3.2, we use relaxed trend filteringto model the detrended, phase-folded light curve of a Keplerstellar system with a transiting exoplanet. We derive esti-mates and full uncertainty distributions for the transit depthand total transit duration. In Section 3.3, we use trend filter-ing to denoise a detrended, phase-folded Kepler light curveof an eclipsing binary (EB) system. We illustrate that trendfiltering provides significant improvements upon the popular polyfit method of Prˇsa et al. (2008) that is used to modelKepler EB light curves and derive observable parameters.In Section 3.4, we discuss using trend filtering to constructlight-curve templates of supernova (SN) explosions. We il-lustrate this approach on a SN light curve obtained from theOpen Supernova Catalog (Guillochon et al. 2017). Further-more, we derive estimates and full uncertainty distributionsfor a set of observable parameters—namely, the maximumapparent magnitude, the time of maximum, and the declinerate. Finally, in Section 3.5, we briefly discuss a different,non-data-analysis application of trend filtering. Specifically,we discuss the use of trend filtering as a tool for fast and flex-ible one-dimensional data reduction and compression. Theflexibility of the trend filtering estimator, paired with its ef-ficient speed and storage capabilities, make it a potentiallypowerful tool to include in large-scale (one-dimensional) as-tronomical data reduction and storage pipelines.We utilize the Arnold et al. (2014) glmgen R implemen-tation of the Ramdas & Tibshirani (2016) trend filteringoptimization algorithm in this work. See Paper I for imple-mentations in other programming languages. α FOREST
The Lyman- α (Ly α ) forest is the name given to the absorp-tion features seen in quasar spectra which are caused byneutral hydrogen (H I ) in the intergalactic medium betweena quasar and an observer. When emitted from an accretiondisk close to the central black hole, the light from the quasarhas a relatively smooth spectrum—a continuum—causedby the summed black-body emission of gas with differenttemperatures at different disk radii (Osterbrock & Ferland2006). Emission lines are also seen, and their intensities andline ratios supply information on the physical conditions inthe line emitting gas. At least twenty broad emission lines,broadened by high velocities and temperatures can be mea-sured in a single AGN, along with a similar number of nar-row lines from colder gas (Marziani et al. 1996). The emit-ted spectrum therefore already consists of a superposition of components with varying degrees of smoothness. The Ly α forest arises when this spectrum is further processed withthe addition of absorption lines. Light moving towards theobserver is redshifted into resonance with the Ly α transitionof H I , and the strength of absorption features is dictated bythe densities of intergalactic material along the line of sight(Rauch 1998). The smoothness of the absorption lines variesdepending on the gas pressure, and thermal doppler broad-ening (Gnedin & Hui 1998; Peeples et al. 2010). Sharperabsorption features, metal lines, are also caused by otherintergalactic species, such as C IV , O VI and Mg II (Hellstenet al. 1998; Pieri et al. 2010). The usefulness of the Ly α forest as a cosmological probe (e.g., Palanque-Delabrouilleet al. 2015) stems from its relationship to the matter den-sity field in the Universe, effectively mapping out structurealong each quasar-observer line of sight (e.g., Croft et al.1997; Lee et al. 2014). In order to extract this informationfrom noisy spectra and separate it from other components,it is useful to have a method that can deal with the com-plexities outlined above, i.e. one that can naturally adapt tovarying degrees of smoothness without extensive tuning.The relative fluctuations in the Ly α forest transmittedflux fraction are of primary interest since they possess amonotonic relationship with the relative distribution of theabsorbing H I . We utilize trend filtering to first denoise thespatially heterogeneous flux signal in an observed Ly α for-est. Estimates for the fluctuations in transmitted flux dueto absorbing H I are then typically produced by couplingthe denoised Ly α forest with estimates for the quasar con-tinuum and the cosmic mean transmitted flux in the Ly α forest. We take an alternative approach: directly estimatingthe mean flux level—defined as the product of the contin-uum and cosmic mean transmitted flux—as in Croft et al.(2002). The mean flux level is a very smooth, spatially ho-mogeneous function within the truncated Ly α forest rest-frame. It is therefore appropriate to use a linear smoother(see Paper I, Section 2) for this stage of estimation. Specifi-cally, we use local polynomial regression (LOESS; Cleveland1979; Fan & Gijbels 1996; Loader 1999). In this section, weillustrate these methods on a mock quasar Ly α forest fromBautista et al. (2015) and a real quasar Ly α forest fromthe Baryon Oscillation Spectroscopic Survey Data Release12 (BOSS DR12; Alam et al. 2015) of the Sloan Digital SkySurvey III (SDSS-III; Eisenstein et al. 2011; Aihara et al.2011).Historically, Ly α forest analyses have typically utilizedkernel smoothers (e.g., Croft et al. 2002; Kim et al. 2004),wavelets (e.g., Theuns & Zaroubi 2000), or Gaussian pro-cesses (e.g., Palanque-Delabrouille et al. 2013). Suppose we observe a quasar located at redshift z = z . Ig-noring systematic effects such as sky contamination and in-terstellar extinction for the moment, the observational DGPof the Ly α forest can be assumed to follow the model f ( λ ) = f ( λ ) + (cid:15) ( λ ) , λ ∈ Λ ( z ) , (2) = F ( λ ) · C ( λ ) · (cid:0) + δ F ( λ ) (cid:1) + (cid:15) ( λ ) , (3)where f ( λ ) is the observed flux at wavelength λ , f ( λ ) is theflux signal, (cid:15) ( λ ) is zero mean white Gaussian noise, Λ ( z ) = MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering Input Definition Range (quasar at z = z ) λ Observed wavelength Λ ( z ) ν Rest wavelength Λ rest ( z ) = Λ ( z )/( + z ) z Redshift Π ( z ) = Λ ( z )/ λ Ly α − ζ Log-wavelength (scaled) Z ( z ) = · log ( Λ ( z )) Table 1.
Various input spaces utilized for the Ly α forest analysis.Notation of functions is held constant, e.g. δ F (·) , and an alterationof the input variable implicitly indicates a change of input spaces.Logarithmic wavelengths are scaled for numerical stability of thetrend filtering optimization algorithm. ( λ Ly β , λ Ly α ) · ( + z ) is the redshifted Ly α forest, C ( λ ) is theflux of the unabsorbed quasar continuum, F ( λ ) = f ( λ )/ C ( λ ) is the transmitted flux fraction, F ( λ ) = E [ F ( λ )] is the meantransmitted flux fraction (over the sky) in the Ly α forest atredshift z = λ / λ Ly α − , and δ F ( λ ) = F ( λ )/ F ( λ ) − (4)is the fluctuation about the mean Ly α transmitted flux atredshift z = λ / λ Ly α − . Here, δ F is the quantity we areprimarily interested in estimating since δ F ∝ δ − H I at eachfixed redshift, where δ H I is the density of H I . The estimationof the flux signal f is viewed as an ancillary step.Although, in principle, it is preferable to study the fullspectral range Λ ( z ) we have found that, in the nonpara-metric setting, estimating the quasar continuum near thelocalized Ly α and Ly β emission peaks at the boundaries ofthe Ly α forest reduces the estimation accuracy in the inte-rior of Λ ( z ) . Therefore, in this work we limit our analysisto the truncated Ly α forest range Λ ( z ) = ( ˚A , ˚A ) · ( + z ) . (5)We simplify notation in this work by changing the inputspace of the functions introduced above by merely alteringthe input variable. For example, with respect to δ F , we main-tain the notation δ F (·) for all inputs λ , ν , z , ζ , while it isunderstood that a proper change of input spaces has takenplace. The various input spaces are defined in Table 1. We use quadratic trend filtering (Tibshirani & Taylor 2011;Tibshirani 2014) to estimate the flux signal f of the obser-vational model (2). In both BOSS DR12 and the Bautistaet al. (2015) mock catalog, the quasar spectra are sampledon equally-spaced grids in logarithmic wavelength space with ∆ log ( λ i ) = − dex (in logarithmic angstroms). Further-more, flux measurement variances are provided by the BOSSpipeline (Bolton et al. 2012), accounting for the statisticaluncertainty introduced by photon noise, CCD read noise,and sky-subtraction error. We correct the BOSS spectrumfor interstellar extinction with the Cardelli et al. (1989) ex-tinction law and the Schlafly & Finkbeiner (2011) dust map.We fit the trend filtering estimator on the equally-spaced logarithmic grid and tune the complexity by mini-mizing Stein’s unbiased risk estimate (SURE) of the fixed-input mean-squared error (see Paper I). More precisely, wefit the trend filtering estimator in the input space Z ( z ) = · log ( Λ ( z )) , as defined in Table 1, where we add the scaling to unit spacing for numerical stability of the trendfiltering convex optimization. We utilize a modified Croft et al. (2002) approach to prop-agate the trend filtering estimate for the flux signal f fromSection 2.2 into an estimate for the fluctuation field δ F alonga line of sight to an observed quasar. Namely, given the trendfiltering estimate (cid:98) f , we directly estimate the smooth meanflux level defined as the product m = F · C and then define the δ F estimates via the transformation (cid:98) δ F : = (cid:98) f / (cid:98) m − . We carryout the estimation of m via a wide-kernel LOESS smooth ofthe trend filtering estimate, with the specific bandwidth ofthe kernel selected by optimizing over a large sample of mockspectra (detailed in Section 2.4). We find that regressing onthe fitted values of the trend filtering estimate—instead ofthe observational DGP (2)—significantly improves the ac-curacy and robustness of the δ F estimates. We carry outthe LOESS estimation in the Ly α restframe Λ rest ( z ) (seeTable 1) in order to remove the effect of redshifting on thesmoothness of m . The LOESS estimation of m is a fully non-parametric procedure and provides a reduction in bias overpopular parametric approaches such as low-order power lawsand principal components analyses (PCA). The sole assump-tion of our LOESS approach is that, in the restframe, themean flux level m always has a fixed degree of smoothness,defined by an optimal fixed kernel bandwidth. There is ofcourse a bias-variance tradeoff here; namely, the decreasedbias comes with a modest increase in variance comparedto a parametric power law or low-dimensional PCA model.Our decision in favor of the LOESS approach directly re-flects our stance that low bias is preferable to low variancein this context since statistical uncertainty due to estima-tor variability is tracked by our uncertainty quantification(Section 2.5), while uncertainty due to modeling bias is noteasily quantifiable. Therefore, we can be more confident thatsignificant fluctuations in the estimated δ F field are in factreal, and not due to statistical bias in the quasar continuumestimate.To be explicit, the LOESS estimator for m is a regressionon the data set {( ν i , (cid:98) f ( ν i ; (cid:98) γ )} ni = , ν i ∈ Λ rest ( z ) , (6)which can be viewed as arising from the DGP (cid:98) f ( ν i ; (cid:98) γ ) = m ( ν i ) + ρ i , (7)where (cid:98) f is the trend filtering estimate fixed at the minimumSURE hyperparameter (cid:98) γ , e i = (cid:98) f ( ν i ; (cid:98) γ ) − f ( ν i ) are the errorsof the trend filtering estimate, and ρ i = m ( ν i ) · δ F ( ν i ) + e i areautocorrelated fluctuations about zero. The LOESS estima-tor is the natural extension of kernel regression (Nadaraya1964; Watson 1964) to higher-order local polynomials. Givena kernel function K (·) with bandwidth h > , for each i = , . . . , n , the LOESS estimator is obtained by minimiz-ing n (cid:213) j = (cid:16) (cid:98) f ( ν j ; (cid:98) γ ) − φ ν i ( ν j ; β , . . . , β d ) (cid:17) K (cid:32) | ν j − ν i | h (cid:33) , (8)and letting (cid:98) m ( ν i ) = (cid:98) β , where φ ν i (· ; β , . . . , β d ) is a d th orderpolynomial centered at ν i . Specifically, we utilize the local MNRAS000
Various input spaces utilized for the Ly α forest analysis.Notation of functions is held constant, e.g. δ F (·) , and an alterationof the input variable implicitly indicates a change of input spaces.Logarithmic wavelengths are scaled for numerical stability of thetrend filtering optimization algorithm. ( λ Ly β , λ Ly α ) · ( + z ) is the redshifted Ly α forest, C ( λ ) is theflux of the unabsorbed quasar continuum, F ( λ ) = f ( λ )/ C ( λ ) is the transmitted flux fraction, F ( λ ) = E [ F ( λ )] is the meantransmitted flux fraction (over the sky) in the Ly α forest atredshift z = λ / λ Ly α − , and δ F ( λ ) = F ( λ )/ F ( λ ) − (4)is the fluctuation about the mean Ly α transmitted flux atredshift z = λ / λ Ly α − . Here, δ F is the quantity we areprimarily interested in estimating since δ F ∝ δ − H I at eachfixed redshift, where δ H I is the density of H I . The estimationof the flux signal f is viewed as an ancillary step.Although, in principle, it is preferable to study the fullspectral range Λ ( z ) we have found that, in the nonpara-metric setting, estimating the quasar continuum near thelocalized Ly α and Ly β emission peaks at the boundaries ofthe Ly α forest reduces the estimation accuracy in the inte-rior of Λ ( z ) . Therefore, in this work we limit our analysisto the truncated Ly α forest range Λ ( z ) = ( ˚A , ˚A ) · ( + z ) . (5)We simplify notation in this work by changing the inputspace of the functions introduced above by merely alteringthe input variable. For example, with respect to δ F , we main-tain the notation δ F (·) for all inputs λ , ν , z , ζ , while it isunderstood that a proper change of input spaces has takenplace. The various input spaces are defined in Table 1. We use quadratic trend filtering (Tibshirani & Taylor 2011;Tibshirani 2014) to estimate the flux signal f of the obser-vational model (2). In both BOSS DR12 and the Bautistaet al. (2015) mock catalog, the quasar spectra are sampledon equally-spaced grids in logarithmic wavelength space with ∆ log ( λ i ) = − dex (in logarithmic angstroms). Further-more, flux measurement variances are provided by the BOSSpipeline (Bolton et al. 2012), accounting for the statisticaluncertainty introduced by photon noise, CCD read noise,and sky-subtraction error. We correct the BOSS spectrumfor interstellar extinction with the Cardelli et al. (1989) ex-tinction law and the Schlafly & Finkbeiner (2011) dust map.We fit the trend filtering estimator on the equally-spaced logarithmic grid and tune the complexity by mini-mizing Stein’s unbiased risk estimate (SURE) of the fixed-input mean-squared error (see Paper I). More precisely, wefit the trend filtering estimator in the input space Z ( z ) = · log ( Λ ( z )) , as defined in Table 1, where we add the scaling to unit spacing for numerical stability of the trendfiltering convex optimization. We utilize a modified Croft et al. (2002) approach to prop-agate the trend filtering estimate for the flux signal f fromSection 2.2 into an estimate for the fluctuation field δ F alonga line of sight to an observed quasar. Namely, given the trendfiltering estimate (cid:98) f , we directly estimate the smooth meanflux level defined as the product m = F · C and then define the δ F estimates via the transformation (cid:98) δ F : = (cid:98) f / (cid:98) m − . We carryout the estimation of m via a wide-kernel LOESS smooth ofthe trend filtering estimate, with the specific bandwidth ofthe kernel selected by optimizing over a large sample of mockspectra (detailed in Section 2.4). We find that regressing onthe fitted values of the trend filtering estimate—instead ofthe observational DGP (2)—significantly improves the ac-curacy and robustness of the δ F estimates. We carry outthe LOESS estimation in the Ly α restframe Λ rest ( z ) (seeTable 1) in order to remove the effect of redshifting on thesmoothness of m . The LOESS estimation of m is a fully non-parametric procedure and provides a reduction in bias overpopular parametric approaches such as low-order power lawsand principal components analyses (PCA). The sole assump-tion of our LOESS approach is that, in the restframe, themean flux level m always has a fixed degree of smoothness,defined by an optimal fixed kernel bandwidth. There is ofcourse a bias-variance tradeoff here; namely, the decreasedbias comes with a modest increase in variance comparedto a parametric power law or low-dimensional PCA model.Our decision in favor of the LOESS approach directly re-flects our stance that low bias is preferable to low variancein this context since statistical uncertainty due to estima-tor variability is tracked by our uncertainty quantification(Section 2.5), while uncertainty due to modeling bias is noteasily quantifiable. Therefore, we can be more confident thatsignificant fluctuations in the estimated δ F field are in factreal, and not due to statistical bias in the quasar continuumestimate.To be explicit, the LOESS estimator for m is a regressionon the data set {( ν i , (cid:98) f ( ν i ; (cid:98) γ )} ni = , ν i ∈ Λ rest ( z ) , (6)which can be viewed as arising from the DGP (cid:98) f ( ν i ; (cid:98) γ ) = m ( ν i ) + ρ i , (7)where (cid:98) f is the trend filtering estimate fixed at the minimumSURE hyperparameter (cid:98) γ , e i = (cid:98) f ( ν i ; (cid:98) γ ) − f ( ν i ) are the errorsof the trend filtering estimate, and ρ i = m ( ν i ) · δ F ( ν i ) + e i areautocorrelated fluctuations about zero. The LOESS estima-tor is the natural extension of kernel regression (Nadaraya1964; Watson 1964) to higher-order local polynomials. Givena kernel function K (·) with bandwidth h > , for each i = , . . . , n , the LOESS estimator is obtained by minimiz-ing n (cid:213) j = (cid:16) (cid:98) f ( ν j ; (cid:98) γ ) − φ ν i ( ν j ; β , . . . , β d ) (cid:17) K (cid:32) | ν j − ν i | h (cid:33) , (8)and letting (cid:98) m ( ν i ) = (cid:98) β , where φ ν i (· ; β , . . . , β d ) is a d th orderpolynomial centered at ν i . Specifically, we utilize the local MNRAS000 , 1–15 (2020)
C. A. Politsch et al. linear regression estimator (LLR; d = ) and the Epanech-nikov kernel (Epanechnikov 1967) K ( t ) = ( − t ) {| t | < } . (9)The LLR estimator is described in full detail by Algorithm 1.Given the trend filtering estimate (cid:98) f and the LLR estimate (cid:98) m , the δ F estimates are then defined as (cid:98) δ F ( z i ; (cid:98) γ ) = (cid:98) f ( z i ; (cid:98) γ ) (cid:98) m ( z i ) − , z , . . . , z n ∈ Π ( z ) , (10)where we deliberately express (cid:98) m as “hyperparameter-less”since γ has already been fixed at the minimum SURE value (cid:98) γ and we provide the optimal LOESS bandwidth value h = ˚A—optimized over the large mock sample. Here, we havealso done a change of variables to redshift space—our desiredinput domain for studying the H I density fluctuations in theIGM. Algorithm 1
LOESS (local linear) estimator for mean flux level
Require:
Training Data {( ν i , (cid:98) f ( ν i ; (cid:98) γ ))} ni = , Bandwidth h = ˚A for all i do Let (cid:98) β , (cid:98) β minimize the locally weighted sum ofsquares n (cid:213) j = (cid:16) (cid:98) f ( ν j ; (cid:98) γ ) − β − β ( ν j − ν i ) (cid:17) · K (cid:32) | ν j − ν i | h (cid:33) . Let (cid:98) m ( ν i ) = (cid:98) β ( ν i ) . end forOutput: { (cid:98) m ( ν i )} ni = Although h is chosen to directly optimize (cid:98) δ F accuracy,an estimate for the quasar continuum C (·) arises intrinsically: (cid:98) C ( ν i ) = F ( ν i ) − · (cid:98) m ( ν i ) , ν i ∈ Λ rest ( z ) , (11)where precise estimates of F follow from a rich literature (e.g.Bernardi et al. 2003; McDonald et al. 2005; Faucher-Gigu`ereet al. 2008; Dall’Aglio et al. 2009; Pˆaris et al. 2011; Beckeret al. 2013). The δ F estimates could then be equivalentlyrestated as (cid:98) δ F ( z i ; (cid:98) γ ) = (cid:98) F ( z i ; (cid:98) γ ) F ( z i ) − , z , . . . , z n ∈ Π ( z ) , (12)where (cid:98) F ( z i ; (cid:98) γ ) = (cid:98) f ( z i ; (cid:98) γ )/ (cid:98) C ( z i ) . (13) We utilize a sample of 124,709 mock quasar spectra from theBautista et al. (2015) catalog to optimize the bandwidth ofthe LOESS estimator for the mean flux level that intrinsi-cally removes the effect of the quasar continuum. Our mockdata reduction is detailed in the appendix and the redshiftdistribution of the quasars is shown in the top panel of Fig-ure 1.For each mock quasar Ly α forest with DGP Q j , j = , Figure 1. Top : Distribution of mock quasar redshifts (datareduction detailed in the appendix). We utilize this sample of124,709 quasars to calibrate the optimal nonparametric contin-uum smoothness.
Bottom : Mean absolute deviation error curvefor selecting the optimal kernel bandwidth for the LOESS (locallinear) estimator of the mean flux level, averaged over the 124,709spectra in the mock sample. The optimal choice of bandwidth is h = ˚A. . . . , 124,709, we first compute the trend filtering hyperpa-rameter value that minimizes the true fixed-input mean-squared prediction error γ j = arg min γ> n n (cid:213) i = E Q j (cid:104)(cid:0) f ( ζ i ) − (cid:98) f ( ζ i ; γ ) (cid:1) (cid:12)(cid:12)(cid:12) ζ , . . . , ζ n (cid:105) . (14)Then, given the trend filtering restframe fitted values {( ν i , (cid:98) f ( ν i ; γ j )} ni = , we fit a LOESS estimator with bandwidth h and define the error (as a function of h ) of the resulting δ F estimator as the fixed-input mean absolute deviation (MAD)error R j ( h ) = n (cid:213) i E Q j (cid:104)(cid:12)(cid:12) δ F ( ν i ) − (cid:98) δ F ( ν i ; γ j , h ) (cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ν , . . . , ν n (cid:105) , (15)where E Q j denotes the mathematical expectation over the MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering randomness arising from the observational DGP Q j . Becausewe can repeatedly sample from each mock quasar DGP, theexpectations in (14) and (15) can be computed to an ar-bitrary precision. Here, we utilize 300 realizations of eachDGP to approximate the mathematical expectations.We then define the optimal choice of h as the minimizerof the summed error over the full sample of m = h = arg min h > m (cid:213) j = R j ( h ) . (16)The aggregate error curve is shown in the bottom panel ofFigure 1, yielding an optimal value of h = ˚A. We findthat defining R ( h ) as the conditional MAD error—insteadof the conditional MSE—provides an essential boost in ro-bustness that keeps the error from being dominated by avery small proportion of worst-case estimates. More complexchoices of bandwidth, e.g. variable bandwidths that dependon the S/N ratio of the trend filtering estimator, the rest-frame pixel spacing, and/or the redshift of the quasar, donot significantly improve upon the h = ˚A restframe fixedbandwidth. Given the assumed noise model (cid:15) i ∼ N ( , σ i ) , i = , . . . , n provided by the BOSS pipeline (Bolton et al. 2012), we canconstruct a pointwise variability band for (cid:98) δ F via an augmen-tation of the parametric bootstrap outlined in Algorithm 2of Paper I. Specifically, given the parametric bootstrap en-semble of trend filtering estimates provided by the paramet-ric bootstrap algorithm, we fit the mean flux level of eachwith the LOESS estimator detailed in Algorithm 1, and thendefine the (cid:98) δ F bootstrap ensemble (cid:98) δ ∗ F , b ( z i ) = (cid:98) f ∗ b ( z i ) (cid:98) m ∗ b ( z i ) − , i = , . . . , n , b = , . . . , B , (17)where, for each b = , . . . , B , (cid:98) m ∗ b is the LOESS estimate fit tothe data set (cid:8) ( ν i , (cid:98) f ∗ b ( ν i )) (cid:9) ni = . Note that refitting the LOESSestimator on each bootstrap realization allows us to trackthe extra variability introduced into the δ F estimates bythe uncertainty in the continuum estimate. A ( − α ) · quantile-based pointwise variability band for (cid:98) δ F is then givenby V − α ( z i ) = (cid:16)(cid:98) δ ∗ F ,α / ( z i ) , (cid:98) δ ∗ F , − α / ( z i ) (cid:17) , i = , . . . , n . (18)where (cid:98) δ ∗ F ,β ( z i ) = inf g (cid:40) g : 1 B B (cid:213) b = (cid:8)(cid:98) δ ∗ F , b ( z i ) ≤ g (cid:9) ≥ β (cid:41) , (19)for any β ∈ ( , ) . A mock quasar spectrum from Bautista et al. (2015) anda real quasar spectrum from the Baryon Oscillation Spec-troscopic Survey (BOSS; Alam et al. 2015) are displayed inFigure 2, with the results of our Ly α forest analysis overlaid. Starting with the top panel, the quadratic trend filtering es-timate is fit on the equally-spaced observations in logarith-mic wavelength space and then transformed to the restframewavelength space, where the LOESS estimate for the meanflux level—the product of the continuum and cosmic meanLy α absorption—is then fit to the trend filtering fitted val-ues. The δ F estimates are then computed according to (10)and displayed in redshift space in the second panel, wherethey closely track the true δ F defined by (4). Furthermore,the 95% bootstrap variability band defined in (18) can beseen to almost fully cover the true δ F signal. The estimated δ F can be interpreted as an inversely proportional proxy forthe deviations from the mean H I density in the interveningintergalactic medium at each redshift—with negative valuesof δ F corresponding to (epoch-relative) over-densities of H I and positive values corresponding to under-densities.The third and fourth panels are the analogous plots fora BOSS quasar spectrum Ly α forest (Data Release 12, Plate= 6487, MJD = 56362, Fiber = 647). The quasar is locatedin the northern galactic cap at (RA, Dec, z ) ≈ (196.680 ◦ ,31.078 ◦ , 2.560). The analysis of signals possessing varying degrees of smooth-ness permeates many areas of astronomy. In this section, wediscuss a variety of further applications for which trend fil-tering may find use. For the sake of brevity, we discuss theseapplications in less detail than the Ly α forest analysis in Sec-tion 2. Naturally, we expect trend filtering may find manyuses in astronomy beyond those we explicitly discuss. Automated spectral classification and redshift estimationpipelines require rich template libraries that span the fullspace of physical objects in the targeted set in order toachieve high statistical accuracy. In this section, we discussusing trend filtering to construct spectral templates fromobservational spectra. We describe the procedure here forgenerating a single spectral template from a well-sampledobserved spectrum. Suppose we observe coadded flux mea-surements of a targeted source at wavelengths λ , . . . , λ n ∈ Λ according to the data generating process f ( λ i ) = f ( λ i ) + (cid:15) i , i = , . . . , n , (20)where the observations have been corrected for systematiceffects (e.g., sky subtraction, interstellar extinction, etc.)and the (cid:15) i are mean-zero errors that arise from instrumentaluncertainty and systematic miscalibrations. After removal ofpotentially problematic pixels (e.g., near bright sky lines),let (cid:98) f ( λ ) , λ ∈ Λ denote the continuous trend filtering esti-mate fit to the observations. Given a confident object classi-fication and redshift estimate z (e.g., determined by visualinspection), we then define the restframe spectral template b ( λ rest ) = (cid:98) f ( λ /( z + )) , λ rest ∈ Λ /( z + ) , (21)and store it in the respective object template library.In Figure 3, we show three optical coadded spectra from MNRAS000
Bottom : Mean absolute deviation error curvefor selecting the optimal kernel bandwidth for the LOESS (locallinear) estimator of the mean flux level, averaged over the 124,709spectra in the mock sample. The optimal choice of bandwidth is h = ˚A. . . . , 124,709, we first compute the trend filtering hyperpa-rameter value that minimizes the true fixed-input mean-squared prediction error γ j = arg min γ> n n (cid:213) i = E Q j (cid:104)(cid:0) f ( ζ i ) − (cid:98) f ( ζ i ; γ ) (cid:1) (cid:12)(cid:12)(cid:12) ζ , . . . , ζ n (cid:105) . (14)Then, given the trend filtering restframe fitted values {( ν i , (cid:98) f ( ν i ; γ j )} ni = , we fit a LOESS estimator with bandwidth h and define the error (as a function of h ) of the resulting δ F estimator as the fixed-input mean absolute deviation (MAD)error R j ( h ) = n (cid:213) i E Q j (cid:104)(cid:12)(cid:12) δ F ( ν i ) − (cid:98) δ F ( ν i ; γ j , h ) (cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ν , . . . , ν n (cid:105) , (15)where E Q j denotes the mathematical expectation over the MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering randomness arising from the observational DGP Q j . Becausewe can repeatedly sample from each mock quasar DGP, theexpectations in (14) and (15) can be computed to an ar-bitrary precision. Here, we utilize 300 realizations of eachDGP to approximate the mathematical expectations.We then define the optimal choice of h as the minimizerof the summed error over the full sample of m = h = arg min h > m (cid:213) j = R j ( h ) . (16)The aggregate error curve is shown in the bottom panel ofFigure 1, yielding an optimal value of h = ˚A. We findthat defining R ( h ) as the conditional MAD error—insteadof the conditional MSE—provides an essential boost in ro-bustness that keeps the error from being dominated by avery small proportion of worst-case estimates. More complexchoices of bandwidth, e.g. variable bandwidths that dependon the S/N ratio of the trend filtering estimator, the rest-frame pixel spacing, and/or the redshift of the quasar, donot significantly improve upon the h = ˚A restframe fixedbandwidth. Given the assumed noise model (cid:15) i ∼ N ( , σ i ) , i = , . . . , n provided by the BOSS pipeline (Bolton et al. 2012), we canconstruct a pointwise variability band for (cid:98) δ F via an augmen-tation of the parametric bootstrap outlined in Algorithm 2of Paper I. Specifically, given the parametric bootstrap en-semble of trend filtering estimates provided by the paramet-ric bootstrap algorithm, we fit the mean flux level of eachwith the LOESS estimator detailed in Algorithm 1, and thendefine the (cid:98) δ F bootstrap ensemble (cid:98) δ ∗ F , b ( z i ) = (cid:98) f ∗ b ( z i ) (cid:98) m ∗ b ( z i ) − , i = , . . . , n , b = , . . . , B , (17)where, for each b = , . . . , B , (cid:98) m ∗ b is the LOESS estimate fit tothe data set (cid:8) ( ν i , (cid:98) f ∗ b ( ν i )) (cid:9) ni = . Note that refitting the LOESSestimator on each bootstrap realization allows us to trackthe extra variability introduced into the δ F estimates bythe uncertainty in the continuum estimate. A ( − α ) · quantile-based pointwise variability band for (cid:98) δ F is then givenby V − α ( z i ) = (cid:16)(cid:98) δ ∗ F ,α / ( z i ) , (cid:98) δ ∗ F , − α / ( z i ) (cid:17) , i = , . . . , n . (18)where (cid:98) δ ∗ F ,β ( z i ) = inf g (cid:40) g : 1 B B (cid:213) b = (cid:8)(cid:98) δ ∗ F , b ( z i ) ≤ g (cid:9) ≥ β (cid:41) , (19)for any β ∈ ( , ) . A mock quasar spectrum from Bautista et al. (2015) anda real quasar spectrum from the Baryon Oscillation Spec-troscopic Survey (BOSS; Alam et al. 2015) are displayed inFigure 2, with the results of our Ly α forest analysis overlaid. Starting with the top panel, the quadratic trend filtering es-timate is fit on the equally-spaced observations in logarith-mic wavelength space and then transformed to the restframewavelength space, where the LOESS estimate for the meanflux level—the product of the continuum and cosmic meanLy α absorption—is then fit to the trend filtering fitted val-ues. The δ F estimates are then computed according to (10)and displayed in redshift space in the second panel, wherethey closely track the true δ F defined by (4). Furthermore,the 95% bootstrap variability band defined in (18) can beseen to almost fully cover the true δ F signal. The estimated δ F can be interpreted as an inversely proportional proxy forthe deviations from the mean H I density in the interveningintergalactic medium at each redshift—with negative valuesof δ F corresponding to (epoch-relative) over-densities of H I and positive values corresponding to under-densities.The third and fourth panels are the analogous plots fora BOSS quasar spectrum Ly α forest (Data Release 12, Plate= 6487, MJD = 56362, Fiber = 647). The quasar is locatedin the northern galactic cap at (RA, Dec, z ) ≈ (196.680 ◦ ,31.078 ◦ , 2.560). The analysis of signals possessing varying degrees of smooth-ness permeates many areas of astronomy. In this section, wediscuss a variety of further applications for which trend fil-tering may find use. For the sake of brevity, we discuss theseapplications in less detail than the Ly α forest analysis in Sec-tion 2. Naturally, we expect trend filtering may find manyuses in astronomy beyond those we explicitly discuss. Automated spectral classification and redshift estimationpipelines require rich template libraries that span the fullspace of physical objects in the targeted set in order toachieve high statistical accuracy. In this section, we discussusing trend filtering to construct spectral templates fromobservational spectra. We describe the procedure here forgenerating a single spectral template from a well-sampledobserved spectrum. Suppose we observe coadded flux mea-surements of a targeted source at wavelengths λ , . . . , λ n ∈ Λ according to the data generating process f ( λ i ) = f ( λ i ) + (cid:15) i , i = , . . . , n , (20)where the observations have been corrected for systematiceffects (e.g., sky subtraction, interstellar extinction, etc.)and the (cid:15) i are mean-zero errors that arise from instrumentaluncertainty and systematic miscalibrations. After removal ofpotentially problematic pixels (e.g., near bright sky lines),let (cid:98) f ( λ ) , λ ∈ Λ denote the continuous trend filtering esti-mate fit to the observations. Given a confident object classi-fication and redshift estimate z (e.g., determined by visualinspection), we then define the restframe spectral template b ( λ rest ) = (cid:98) f ( λ /( z + )) , λ rest ∈ Λ /( z + ) , (21)and store it in the respective object template library.In Figure 3, we show three optical coadded spectra from MNRAS000 , 1–15 (2020)
C. A. Politsch et al.
Figure 2.
Results of Ly α forest analysis. Top panel : Ly α forest of a mock quasar spectrum (Bautista et al. 2015) in the restframe,with the quadratic trend filtering estimate shown in orange and the LOESS (local linear) estimate for the mean flux level shown in blue. Second panel : The redshift-space fluctuations in the Ly α transmitted flux fraction, with our estimate superposed. The fluctuationsinversely trace the relative under- and over-densities of H I in the intervening intergalactic medium between Earth and the quasar. Thirdand Fourth panels : Analogous plots for a real quasar Ly α forest from the twelfth data release of the Baryon Oscillation SpectroscopicSurvey (Alam et al. 2015; Plate = 6487, MJD = 56362, Fiber = 647). The quasar is located at (RA, Dec, z ) ≈ (196.680 ◦ , 31.078 ◦ , 2.560).MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering Figure 3.
Optical coadded spectra collected by the Baryon Oscillation Spectroscopic Survey (Alam et al. 2015) of the Sloan Digital SkySurvey III (Eisenstein et al. 2011; Aihara et al. 2011). From top to bottom, a quasar (DR12, Plate = 7140, MJD = 56569, Fiber = 58)located at (RA, Dec, z ) ≈ (349.737 ◦ , 33.414 ◦ , . ), a galaxy (DR12, Plate = 7140, MJD = 56569, Fiber = 68) located at (RA, Dec, z ) ≈ (349.374 ◦ , 33.617 ◦ , . ), and a star (DR12, Plate = 4055, MJD = 55359, Fiber = 84) located at (RA, Dec, z ) ≈ (236.834 ◦ , 0.680 ◦ , . ). We fit a quadratic trend filtering estimate to each spectrum in the logarithmic wavelength space in which the observations areequally spaced, and optimize the hyperparameter by minimizing Stein’s unbiased risk estimate. Given confidently determined redshifts(e.g., determined by visual inspection), the trend filtering estimate for each object can be scaled to the restframe and stored as a spectraltemplate. Furthermore, emission-line parameter estimates for a spectrum can be obtained by fitting Gaussian radial basis functions tothe emission lines of the trend filtering estimate.MNRAS000
Optical coadded spectra collected by the Baryon Oscillation Spectroscopic Survey (Alam et al. 2015) of the Sloan Digital SkySurvey III (Eisenstein et al. 2011; Aihara et al. 2011). From top to bottom, a quasar (DR12, Plate = 7140, MJD = 56569, Fiber = 58)located at (RA, Dec, z ) ≈ (349.737 ◦ , 33.414 ◦ , . ), a galaxy (DR12, Plate = 7140, MJD = 56569, Fiber = 68) located at (RA, Dec, z ) ≈ (349.374 ◦ , 33.617 ◦ , . ), and a star (DR12, Plate = 4055, MJD = 55359, Fiber = 84) located at (RA, Dec, z ) ≈ (236.834 ◦ , 0.680 ◦ , . ). We fit a quadratic trend filtering estimate to each spectrum in the logarithmic wavelength space in which the observations areequally spaced, and optimize the hyperparameter by minimizing Stein’s unbiased risk estimate. Given confidently determined redshifts(e.g., determined by visual inspection), the trend filtering estimate for each object can be scaled to the restframe and stored as a spectraltemplate. Furthermore, emission-line parameter estimates for a spectrum can be obtained by fitting Gaussian radial basis functions tothe emission lines of the trend filtering estimate.MNRAS000 , 1–15 (2020) C. A. Politsch et al. the twelfth data release of the Baryon Oscillation Spectro-scopic Survey (BOSS DR12; Alam et al. 2015) of the SloanDigital Sky Survey III (SDSS-III; Eisenstein et al. 2011; Ai-hara et al. 2011). The figure is zoomed to a subinterval ofthe optical range for visual clarity and the spectra are easilyidentifiable as a quasar, a galaxy, and a star, respectively.We fit an error-weighted quadratic trend filtering estimateto each spectrum in the logarithmic-angstrom wavelengthspace in which the BOSS spectra are gridded, and tune thehyperparameter to minimize Stein’s unbiased risk estimate(see Section 3.5 of Paper I). The trend filtering estimatesgive good results, adequately adapting to even the strongestemission and absorption features in each spectrum.As in the BOSS spectroscopic pipeline (Bolton et al.2012), after a spectral classification and redshift measure-ment have been precisely determined, emission-line parame-ter estimates can then be obtained by fitting Gaussian radialbasis functions to the emission lines of the spectrum “bestfit”—nonlinearly optimizing the amplitudes, centroids, andwidths. We propose that the trend filtering estimate of aspectrum be used as the “best fit” for this type of proce-dure, e.g. instead of the low-dimensional principal compo-nents models currently used by the BOSS pipeline. The rel-ative magnitudes of the emission-line parameter estimatescan then be used to determine object subclassifications.
In this section we discuss the use of trend filtering for model-ing the photometric time series of an extrasolar planet tran-siting its host star. Given a phase-folded stellar light curve,corrected for stellar variability and spacecraft motion sys-tematics, trend filtering can be used to automatically pro-duce fully nonparametric estimates and uncertainties for thetransit depth and total transit duration.We demonstrate our approach on long-cadence photo-metric observations of Kepler-10 (KOI-72, KIC 11904151;Borucki et al. 2010). Kepler-10 is confirmed to host atleast two exoplanets: Kepler-10b (KOI-72 b, KIC 11904151b; Batalha et al. 2011) and Kepler-10c (KOI-72 c, KIC11904151 c; Fressin et al. 2011). Each planet was first de-tected via the transit method—a measurable periodic dim-ming in the photometry caused by the planet crossing infront of the host star. Here, we use trend filtering to es-timate a nonparametric transit model for Kepler-10c andderive depth and duration measurements. The results ofour analysis are displayed in Figure 4. The top panel dis-plays a sample of the Kepler-10 long-cadence (30-min. in-crement), quarter-stitched, median detrended, relative fluxlight curve processed by the Kepler pipeline (Jenkins et al.2010), which we accessed from the NASA Exoplanet Archive(Akeson et al. 2013). The observed transit events of Kepler-10c are indicated by the vertical dashed lines. The middlepanel displays the light curve (with σ error bars) afterphase-folding with respect to the ∼ K -fold cross validation. The optimal relaxationhyperparameter is (cid:98) φ = . , indicating that relaxation sig-nificantly improves the traditional trend filtering estimate in this setting. In particular, we find that relaxation allowsthe estimate to faithfully capture the sharp transitions cor-responding to the beginning of the ingress phase and theend of the egress phase. The relaxed trend filtering estimateis overlaid on the phase-folded light curve, along with 95%variability bands. Estimates for the transit depth and totaltransit duration follow immediately from the relaxed trendfiltering estimate, as detailed below. The estimated incep-tion and termination of the transit event are indicated bythe vertical dashed lines in the middle panel. The nonpara-metric bootstrap sampling distributions (see Algorithm 1of Paper I) of the transit depth and total transit durationmeasurements are displayed in the bottom panel. Our pointestimates for the transit depth and total transit durationfor Kepler-10c are (cid:98) δ = . ppm and (cid:98) T = . hours,respectively.The knot-selection property of trend filtering is partic-ularly appealing in this setting because it provides interpre-tation as to where the transit event begins and ends. Specif-ically, we define our estimate of the inception of the transitevent (cid:98) T as the leftmost knot selected by the trend filteringestimator and we define our estimate of the termination ofthe transit event (cid:98) T as the rightmost knot . The total transitduration estimate then follows as (cid:98) T = (cid:98) T − (cid:98) T . Naturally, wedefine our transit depth estimate (cid:98) δ to be the minimum ofthe relaxed trend filtering estimate.It is thus far unclear to us whether trend filtering canalso reliably detect the end of the ingress and beginningof the egress phases—e.g. via knot selection or examiningchanges in the estimated derivates—and therefore also pro-vide nonparametric ingress/egress duration measurements.We recommend pairing trend filtering with the traditionalanalytical transit model search (e.g., Mandel & Agol 2002)for these particular measurements. That is, given the transitdepth and total transit duration measurements provided bytrend filtering, an analytical planet model can be fit overa parameter space that is constrained by the trend filter-ing parameter measurements. Coupling the two methods inthis way therefore also provides significant computationalspeedups over a traditional single-stage analytical modelsearch since it greatly reduces the dimensionality of the pa-rameter space to be searched over. Furthermore, the relaxedtrend filtering estimate may provide a benchmark χ statis-tic for the constrained analytical model comparison.A dedicated paper on this application of trend filteringis forthcoming. In this section we discuss how trend filtering can furtherimprove the work of the Kepler Eclipsing Binary workinggroup, specifically in regard to the Eclipsing Binaries viaArtificial Intelligence (EBAI) pipeline used to characterizeKepler eclipsing binary stars via their phase-folded lightcurves (Prˇsa et al. 2008, 2011; Slawson et al. 2011; Slaw-son, Prˇsa, Welsh, Orosz, Rucker, Batalha, Doyle, Engle,Conroy, Coughlin & et al. Mat). The EBAI pipeline uti-lizes an artificial neural network (ANN) to estimate a set Note that the boundary points of the input space are not con-sidered knots. MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering Figure 4.
Kepler-10c transit light curve analysis.
Top : Long-cadence (30-min.), quarter-stitched, median detrended, relative flux lightcurve (
LC_DETREND ) of the confirmed exoplanet host Kepler-10 (KOI-072, KIC 11904151; Batalha et al. 2011), processed by the Keplerpipeline (Jenkins et al. 2010) and obtained from the NASA Exoplanet Archive (Akeson et al. 2013). Vertical lines indicate the observedtransit events of the system’s second confirmed planet Kepler-10c (KOI-072 c, KIC 11904151 c; Fressin et al. 2011).
Middle : Phase-foldedtransit light curve for Kepler-10c ( ∼ σ error bars. The error-weighted relaxed trend filtering estimate,optimized by sequential K -fold cross validation, is superposed with 95% variability bands. The estimated inception and termination ofthe transit event are indicated by the vertical dashed lines. The estimated transit depth and total transit duration are (cid:98) δ = . ppmand (cid:98) T = . hours, respectively. Bottom : Bootstrap sampling distributions of the transit depth and transit duration estimates.MNRAS000
Middle : Phase-foldedtransit light curve for Kepler-10c ( ∼ σ error bars. The error-weighted relaxed trend filtering estimate,optimized by sequential K -fold cross validation, is superposed with 95% variability bands. The estimated inception and termination ofthe transit event are indicated by the vertical dashed lines. The estimated transit depth and total transit duration are (cid:98) δ = . ppmand (cid:98) T = . hours, respectively. Bottom : Bootstrap sampling distributions of the transit depth and transit duration estimates.MNRAS000 , 1–15 (2020) C. A. Politsch et al.
Figure 5.
Long-cadence (30-min. increment), detrended, median-normalized light curve of a Kepler eclipsing binary system (KIC 6048106;Borucki et al. 2010; Prˇsa et al. 2011). The vertical red lines mark the primary eclipses (the eclipses of the hotter star) and the verticalblue lines mark the secondary eclipses (the eclipses of the cooler star). KIC 6048106 has an orbital period of ∼ of physical parameters for each binary pair (the tempera-ture ratio, sum of fractional radii, photometric mass ratio,radial and tangential components of the eccentricity, filloutfactor, and inclination) from the observables of the phase-folded light curve (e.g., the eclipse widths, depths, and sep-arations). Prˇsa et al. (2008) outline the EBAI light curvepre-processing algorithm, which they call polyfit , that pro-vides the crucial step of taking a noisy, irregularly-spaced,phase-folded light curve (detrended for spacecraft motionand normalized by the median flux) and outputting a de-noised and gridded phase-folded light curve, which is thenfed to the ANN. We propose that trend filtering be used forthis pre-processing step instead of the polyfit algorithm forthe reasons detailed below.An eclipsing binary (EB) light curve is characterized byperiodic dips in the observed brightness that correspond toeclipse events along the line of sight to an observer. In par-ticular, there are two eclipses per orbital period—a primaryand a secondary eclipse. The primary eclipse occurs whenthe hotter star is eclipsed by the cooler star and produces acomparatively deep dip in observed brightness. Conversely,the secondary eclipse occurs when the cooler star is eclipsedby the hotter star and produces a comparatively shallow dipin the observed brightness. Depending on the effective tem-perature ratio and orbital period of the EB, the dips mayrange from very narrow and abrupt to very wide and smooth.In Figure 5, we display a detrended, median-normalized,long-cadence (30 min. increment) light curve of a Kepler EB(KIC 6048106; Borucki et al. 2010; Prˇsa et al. 2011), withthe primary eclipses and secondary eclipses designated bydashed red and blue lines, respectively. The orbital periodof KIC 6048106 is ∼ Phase = 0 ,the purpose of the EBAI light-curve pre-processing step is tofaithfully extract the signal of the phase-folded light curveand evaluate it on a regular grid so that it can then be inputinto the EBAI ANN. We show a comparison of the poly-fit algorithm of Prˇsa et al. (2008) and our trend filteringapproach in Figure 6 on the phase-folded KIC 6048106 lightcurve. We choose a relatively high S/N light curve here in or- der to elucidate the significant statistical bias that underlies polyfit . The polyfit algorithm fits a piecewise quadraticpolynomial by weighted least-squares with four knots se-lected by a randomized computational search over the phasespace. The piecewise quadratic polynomial is forced to becontinuous, but no smoothness constraints are placed on thederivatives of the estimate at the knots. Recalling our dis-cussion in Section 2.1 of Paper I, this overly-stringent mod-eling assumption leads to significant statistical bias in thelight-curve estimate. The bias is particularly apparent byexamining the residuals of the polyfit estimate, which wedisplay below the light curve. Moreover, recalling our dis-cussion of variable-knot regression splines in Section 3.1.1 ofPaper I, a randomized partial search over the space of feasi-ble knots inherently provides no guarantee of finding a globalsolution—leaving the algorithm susceptible to extreme fail-ure scenarios. We display a quadratic trend filtering estimateof the KIC 6048106 phase-folded light curve in the bottompanel of Figure 6, with the hyperparameter chosen by K -fold cross validation. The trend filtering estimate accuratelyrecovers the signal of the light curve (clearly apparent hereby examining a high S/N ratio light curve) and produces adesirable random residual scatter about zero. Since the sta-tistical bias introduced by the polyfit pre-processing stagepropagates through to the EBAI ANN as systematic bias inthe input data, we are confident that the use of trend fil-tering will in turn improve the error rate of the EBAI ANNoutput-parameter estimates. In this section we demonstrate the use of trend filtering forgenerating light-curve templates of supernova (SN) eventsand estimating observable parameters. We illustrate our ap-proach on SN 2016coi (ASASSN-16fp; Holoien et al. 2016)by constructing a B-band light-curve template from the well-sampled observations of Brown et al. (2014) and Prenticeet al. (2018) and deriving nonparametric estimates and fulluncertainty distributions for the maximum apparent bright-ness, the time of maximum, and the decline rate param-
MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering Figure 6.
Comparison of the polyfit algorithm of Prˇsa et al. (2008) and our trend filtering approach for denoising phase-folded eclipsingbinary light curves. The light curve shown in this example comes from the Kepler eclipsing binary system KIC 6048106 (Borucki et al.2010; Prˇsa et al. 2011).
Top : The polyfit algorithm fits a piecewise quadratic polynomial by weighted least-squares with four knotsselected by a randomized search over the phase space. The estimate is constrained to be continuous but no constraints are enforcedon the derivatives at the knots. The overly-stringent assumed model leads to significant statistical bias, which is readily apparent byexamining the autocorrelation in the residuals.
Bottom : Trend filtering is sufficiently flexible to accurately denoise the diverse set ofsignals observed in phase-folded eclipsing binary light curves. Here, the goodness-of-fit is clear by the random, mean-zero residual scatter.MNRAS000
Bottom : Trend filtering is sufficiently flexible to accurately denoise the diverse set ofsignals observed in phase-folded eclipsing binary light curves. Here, the goodness-of-fit is clear by the random, mean-zero residual scatter.MNRAS000 , 1–15 (2020) C. A. Politsch et al. eter ∆ m ( B ) introduced by Phillips (1993). The improve-ment yielded by trend filtering as a tool for SN light-curvetemplate generation, compared to, for example, the SNooPy cubic smoothing spline approach of the Carnegie SupernovaProject (CSP; Contreras et al. 2010; Burns et al. 2010; Burnset al. 2014) primarily corresponds to light curves with an es-pecially bright peak magnitude and fast decline rate. In suchcases, trend filtering is better able to recover the abruptnessof the peak, the initial sharp decline, and the subsequentslow decay. This behavior is particularly characteristic ofType Ia SNe (e.g., Woosley et al. 2007). In cases wherethe peak is not particularly prominent, trend filtering andcubic smoothing splines produce nearly identical estimates.Our procedure for generating the SN light-curve templatesrequires reasonably well-sampled observations (in particu-lar, with the initial observation occurring before maximumlight). The resulting template libraries can then be used toclassify SNe with partially-sampled light curves and deriveparameter estimates (e.g., Belokurov et al. 2004).The SN light-curve template generation procedure isanalogous to the spectral template generation procedure dis-cussed in Section 3.1, so we discuss it in less formal detailhere. Naturally, the same procedure can also be implementedfor generating fixed-time spectral templates of SN events.Given a well-sampled light curve, corrected for systematiceffects (e.g., K -corrections and interstellar reddening), wepropose the use of quadratic trend filtering to generate a“best fit” to the observations. Given a confident type classi-fication, the trend filtering estimate can then be stored as alight-curve template in the respective library.We show a B-band light curve for SN 2016coi (ASASSN-16fp; Holoien et al. 2016) in the top panel of Figure 7. Thelight curve is an aggregation of observations collected byBrown et al. (2014) and Prentice et al. (2018), which we ac-cessed from the Open Supernova Catalog (Guillochon et al.2017). SN 2016coi was discovered on May 27, 2016 (UT 2016-05-27.55) by the All Sky Automated Survey for SuperNovae(ASAS-SN; Shappee et al. 2014) in the galaxy UGC 11868at redshift z ≈ . . The classification of SN 2016coi cur-rently remains intermediate between Type Ib and Type Ic(Elias-Rosa et al. 2016; Kumar et al. 2018; Prentice et al.2018; Terreran et al. 2019). We fit a quadratic trend filter-ing estimate to the observations, weighted by measurementuncertainties and with the hyperparameter selected by K -fold cross validation. The trend filtering estimate, along with95% nonparametric bootstrap variability bands, is overlaidin the top panel of the figure. In the bottom panels we showthe univariate nonparametric bootstrap sampling distribu-tions for the estimates of the maximum apparent magnitude,the time of maximum, and the decline rate—defined as therelative change in the B-magnitude light curve from max-imum light to the magnitude 15 days after the maximum(Phillips 1993). We also show a bivariate bootstrap sam-pling distribution for maximum apparent magnitude versusdecline rate. The bimodality in the bootstrap sampling dis-tributions arises from systematic discrepancies between thetwo separate sets of B-band observations that form the lightcurve (Brown et al. 2014; Prentice et al. 2018). Although the primary focus of this paper is the use of trendfiltering as a tool for astronomical data analysis, it also pos-sesses potential utility for large-scale reduction and com-pression of one-dimensional data sets. This owes to severalfactors: its speed, scalability, flexibility, and representationas a sum of basis functions with a sparse coefficient vector.Given a one-dimensional set of n observations ( t , f ( t )) , . . . , ( t n , f ( t n )) ∈ ( a , b ) × R , trend filtering can quicklyprovide a flexible, lower-dimensional approximation of thedata where the dimensionality is controlled by the choice ofthe hyperparameter γ . In this context γ is a subjective choicethat specifies the amount of (lossy) compression desired—unrelated to the discussion in Section 3.5 of Paper I. For anygiven choice of γ , let p be the number of knots selected bythe trend filtering estimator. The corresponding continuous-time representation of this lower-dimensional approximationis fully encoded by the knot locations and the sparse basisvector with p + k + nonzero entries, which can be stored effi-ciently. The falling factorial basis then serves as the “dictio-nary” from which the continuous-time representation can belosslessly recovered. Gridded uncertainty measurements forthe reduced observations can also be computed and stored,though not in a sparse format, via the methods discussed inSection 3.6 of Paper I. In order to illustrate the broad utility of trend filtering toastronomy, we demonstrated its promising performance on awide variety of problems across time-domain astronomy andastronomical spectroscopy. We studied the Lyman- α forestof quasar spectra with the most depth—using trend filter-ing to map the relative distribution of neutral hydrogen inthe high redshift intergalactic medium along quasar-observerlines of sight. Furthermore, we discussed how trend filter-ing can be used to (1) generate galaxy, quasar, and stel-lar spectral templates and estimate emission-line parame-ters; (2) produce nonparametric models for exoplanet tran-sit events in phase-folded stellar light curves, providing es-timates for the transit depth and total duration; (3) im-prove upon the polyfit algorithm utilized by the KeplerEclipsing Binary via Artificial Intelligence (EBAI) pipelinefor denoising phase-folded eclipsing binary light curves (asa preliminary step to estimating the physical parameters);(4) generate supernova light-curve templates and producenonparametric estimates of the maximum apparent magni-tude, the time of maximum, and the decline rate; (5) quicklyand efficiently compress large collections of one-dimensionaldata sets. Naturally, we expect trend filtering will find usesin astronomy beyond those that we explicitly discussed. ACKNOWLEDGEMENTS
We gratefully thank Ryan Tibshirani for his inspiration andgenerous feedback on this topic. This work was partiallysupported by NASA ATP grant NNX17AK56G, NASA ATPgrant 80NSSC18K1015, and NSF grant AST1615940.This research contains data collected by the Sloan Dig-ital Sky Survey III (SDSS-III). Funding for SDSS-III has
MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering Figure 7.
SN light-curve analysis (SN 2016coi).
Top : B-band photometry of the supernova SN 2016coi (ASASSN-16 fp; Holoien et al.2016) discovered on May 27, 2016 by the All Sky Automated Survey for SuperNovae (ASAS-SN; Shappee et al. 2014) in the galaxy UGC11868 at redshift z ≈ . . We fit a quadratic trend filtering estimate, tuned by K -fold cross validation, and overlay 95% nonparametricbootstrap variability bands. Bottom : Univariate/bivariate nonparametric bootstrap sampling distributions of the observable parameterestimates derived from the trend filtering light-curve estimate. The bimodality in the bootstrap parameter distributions arises fromsystematic discrepancies between the observations of the two separate observers (Brown et al. 2014; Prentice et al. 2018).MNRAS000
Top : B-band photometry of the supernova SN 2016coi (ASASSN-16 fp; Holoien et al.2016) discovered on May 27, 2016 by the All Sky Automated Survey for SuperNovae (ASAS-SN; Shappee et al. 2014) in the galaxy UGC11868 at redshift z ≈ . . We fit a quadratic trend filtering estimate, tuned by K -fold cross validation, and overlay 95% nonparametricbootstrap variability bands. Bottom : Univariate/bivariate nonparametric bootstrap sampling distributions of the observable parameterestimates derived from the trend filtering light-curve estimate. The bimodality in the bootstrap parameter distributions arises fromsystematic discrepancies between the observations of the two separate observers (Brown et al. 2014; Prentice et al. 2018).MNRAS000 , 1–15 (2020) C. A. Politsch et al. been provided by the Alfred P. Sloan Foundation, the Par-ticipating Institutions, the National Science Foundation, andthe U.S. Department of Energy Office of Science. The SDSS-III web site is . SDSS-III is man-aged by the Astrophysical Research Consortium for the Par-ticipating Institutions of the SDSS-III Collaboration includ-ing the University of Arizona, the Brazilian ParticipationGroup, Brookhaven National Laboratory, Carnegie MellonUniversity, University of Florida, the French ParticipationGroup, the German Participation Group, Harvard Univer-sity, the Instituto de Astrofisica de Canarias, the MichiganState/Notre Dame/JINA Participation Group, Johns Hop-kins University, Lawrence Berkeley National Laboratory,Max Planck Institute for Astrophysics, Max Planck Insti-tute for Extraterrestrial Physics, New Mexico State Univer-sity, New York University, Ohio State University, Pennsyl-vania State University, University of Portsmouth, PrincetonUniversity, the Spanish Participation Group, University ofTokyo, University of Utah, Vanderbilt University, Univer-sity of Virginia, University of Washington, and Yale Univer-sity. This research has made use of the NASA ExoplanetArchive, which is operated by the California Institute ofTechnology, under contract with the National Aeronauticsand Space Administration under the Exoplanet ExplorationProgram. This paper includes data collected by the Keplermission. Funding for the Kepler mission is provided by theNASA Science Mission Directorate. This research containsdata that was accessed from the Open Supernova Catalog,a centralized, open repository for supernova metadata, lightcurves, and spectra. The Open Supernova Catalog web siteis https://sne.space . The original sources of the supernovalight curve studied in this paper are credited in the text.
REFERENCES
Aihara H., et al., 2011, ApJS, 193, 29Akeson R. L., et al., 2013, PASP, 125, 989Alam S., et al., 2015, ApJS, 219Arnold T. B., Sadhanala V., Tibshirani R. J., 2014, Fast algo-rithms for generalized lasso problems, https://github.com/glmgen
Batalha N. M., et al., 2011, ApJ, 729, 27Bautista J. E., et al., 2015, JCAP, 1505Becker G. D., Hewett P. C., Worseck G., Prochaska J. X., 2013,MNRAS, 430, 2067Belokurov V., Evans N. W., Le Du Y., 2004, MNRAS, 352, 233Bernardi M., et al., 2003, AJ, 125, 32Bolton A. S., et al., 2012, AJ, 144, 144Borucki W. J., et al., 2010, Science, 327, 977Brown P. J., Breeveld A. A., Holland S., Kuin P., Pritchard T.,2014, Ap&SS, 354, 89Burns C. R., et al., 2010, AJ, 141, 19Burns C. R., et al., 2014, ApJ, 789, 32Cardelli J., Clayton G., Mathis J., 1989, ApJ, 345, 245Cleveland W. S., 1979, Journal of the American Statistical Asso-ciation, 74, 829Contreras C., et al., 2010, AJ, 139, 519Croft R. A. C., Weinberg D. H., Katz N., Hernquist L., 1997,ApJ, 488, 532Croft R. A. C., et al., 2002, ApJ, 581Dall’Aglio A., Wisotzki L., Worseck G., 2009Eisenstein D. J., et al., 2011, AJ, 142, 72Elias-Rosa N., et al., 2016, The Astronomer’s Telegram Epanechnikov V., 1967, Theory of Probability & its Applications,14, 153Fan J., Gijbels I., 1996, Local Polynomial Modeling and Its Ap-plications. Chapman and HallFaucher-Gigu`ere C.-A., Lidz A., Hernquist L., Zaldarriaga M.,2008, ApJ, 688, 85Fressin F., et al., 2011, ApJS, 197, 5Gnedin N. Y., Hui L., 1998, MNRAS, 296, 44Guillochon J., Parrent J., Kelley L. Z., Margutti R., 2017, ApJ,835, 64Hellsten U., Hernquist L., Katz N., Weinberg D. H., 1998, ApJ,499, 172Holoien T. W.-S., et al., 2016, The Astronomer’s Telegram, 9086Jenkins J. M., et al., 2010, ApJ, 713, L87Kim T.-S., Viel M., Haehnelt M. G., Carswell R. F., Cristiani S.,2004, MNRAS, 347, 355Kumar B., Singh A., Srivastav S., Sahu D. K., Anupama G. C.,2018, MNRAS, 473, 3776Lee K.-G., et al., 2014, ApJL, 795, L12Loader C., 1999, Local Regression and Likelihood. Springer-VerlagMandel K., Agol E., 2002, ApJ, 580, L171Marziani P., Sulentic J. W., Dultzin-Hacyan D., Calvani M.,Moles M., 1996, ApJS, 104, 37143McDonald P., et al., 2005, ApJ, 635, 761Nadaraya E. A., 1964, Theory of Probability & its Applications,9, 141Nemirovskii A., 1985, Izv. Akad. Nauk. SSSR Tekhn. Kibernet.(in Russian), 3, 50Nemirovskii A., Polyak B., Tsybakov A., 1985, Problems of In-formation Transmission, 21Osterbrock D. E., Ferland G. J., 2006, Astrophysics of gaseousnebulae and active galactic nucleiPalanque-Delabrouille N., et al., 2013, A&A, 559, A85Palanque-Delabrouille N., et al., 2015, JCAP, 2015, 011Pˆaris I., et al., 2011, A&A, 530Peeples M. S., Weinberg D. H., Dav´e R., Fardal M. A., Katz N.,2010, MNRAS, 404, 1281Phillips M. M., 1993, ApJL, 413, L105Pieri M. M., Frank S., Weinberg D. H., Mathur S., York D. G.,2010, ApJL, 724, L69Politsch C. A., Cisewski-Kehe J., Croft R. A. C., Wasserman L.,2020, MNRAS, 492Prentice S. J., et al., 2018, MNRAS, 478, 4162Prˇsa A., Guinan E. F., Devinney E. J., DeGeorge M., BradstreetD. H., Giammarco J. M., Alcock C. R., Engle S. G., 2008,ApJ, 687, 542Prˇsa A., et al., 2011, AJ, 141, 83Ramdas A., Tibshirani R. J., 2016, Journal of Computational andGraphical Statistics, 25, 839Rauch M., 1998, ARA&A, 36, 267Schlafly E., Finkbeiner D., 2011, ApJ, 737, 103Shappee B. J., et al., 2014, ApJ, 788, 48Slawson R. W., et al., 2011, AJ, 142, 160Terreran G., et al., 2019, ApJ, 883, 147Theuns T., Zaroubi S., 2000, MNRAS, 317, 989Tibshirani R. J., 2014, The Annals of Statistics, 42, 285Tibshirani R. J., Taylor J., 2011, The Annals of Statistics, 39,1335Watson G. S., 1964, Sankhy¯a: The Indian Journal of Statistics,Series A, 26, 359Woosley S. E., Kasen D., Blinnikov S., Sorokina E., 2007, ApJ,662, 487 MNRAS , 1–15 (2020) enoising Astronomical Signals via Trend Filtering APPENDIXMock quasar Lyman- α forest reduction The Bautista et al. [2015] mock quasar catalog is designedto mimic the observational data generating processes of thequasar spectra released in Data Release 11 of the BaryonOscillation Spectroscopic Survey (Alam et al. 2015). Wepool the first three realizations of the mock catalog, i.e.
M3_0_3/000 , M3_0_3/001 , and
M3_0_3/002 and remove alldamped Ly α systems (DLAs), Lyman-limit systems (LLS),and broad absorption line quasars (BALs). We assume nometal absorption in the Ly α forest and correct estimationand subtraction of the sky. We mask all pixels with and_mask (cid:44) or or_mask (cid:44) . Finally, we retain only the spectra with ≥ pixels in the truncated Ly α forest and those with afixed-input-optimal trend filtering hyperparameter satisfy-ing γ < . . Spectra with γ ≥ . correspond to the verylowest S/N ratio observations, where the trend filtering esti-mate typically reduces to a global power law fit (zero knots).The final mock sample contains 124,709 quasar spectra. This paper has been typeset from a TEX/L A TEX file prepared bythe author.MNRAS000