starry_process: Interpretable Gaussian processes for stellar light curves
sstarry_process: Interpretable Gaussian processes forstellar light curves
Rodrigo Luger
1, 2 , Daniel Foreman-Mackey , and Christina Hedges
3, 4 Center for Computational Astrophysics, Flatiron Institute, New York, NY Virtual PlanetaryLaboratory, University of Washington, Seattle, WA Bay Area Environmental Research Institute, P.O.Box 25, Moffett Field, CA 94035, USA NASA Ames Research Center, Moffett Field, CA
DOI:
DOIunavailable
Software • Review• Repository• Archive
Editor:
Pending Editor
Reviewers: • @Pending Reviewers
Submitted:
N/A
Published:
N/A
License
Authors of papers retaincopyright and release the workunder a Creative CommonsAttribution 4.0 InternationalLicense (CC BY 4.0).
Summary
The starry_process code implements an interpretable Gaussian process (GP) for modelingvariability in stellar light curves. As dark starspots rotate in and out of view, the total fluxreceived from a distant star will change over time. Unresolved flux time series therefore encodeinformation about the spatial structure of features on the stellar surface. The starry_process software package allows one to easily model the flux variability due to starspots, whetherone is interested in understanding the properties of these spots or marginalizing over thestellar variability when it is treated as a nuisance signal. The main difference between the GPimplemented here and typical GPs used to model stellar variability is the explicit dependenceof our GP on physical properties of the star, such as its period, inclination, and limb darkeningcoefficients, and on properties of the spots, such as their radius and latitude distributions(see Figure 1). This code is the
Python implementation of the interpretable GP algorithmdeveloped in Luger, Foreman-Mackey, & Hedges (2021). (a) rotations f l u x [ pp t ] (b) rotations f l u x [ pp t ] i n t e n s i t y i n t e n s i t y Figure 1:
Five random samples from our GP (columns) conditioned on two different hyperparametervectors θθθ • (rows). The samples are shown on the surface of the star in a Mollweide projectionalongside the corresponding light curves viewed at several different inclinations. (a) Samples from aGP describing a star with small mid-latitude spots. (b) Samples from a GP describing a star withlarger high-latitude spots. Luger, (2021). starry_process: Interpretable Gaussian processes for stellar light curves.
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable a r X i v : . [ a s t r o - ph . S R ] F e b tatement of need Mapping the surfaces of stars using time series measurements is a fundamental problem inmodern time-domain stellar astrophysics. This inverse problem is ill-posed and computationallyintractable, but in the associated AAS Journals publication submitted in parallel to this paper(Luger, Foreman-Mackey, & Hedges, 2021), we derive an interpretable effective GaussianProcess (GP) model for this problem that enables robust probabilistic characterization ofstellar surfaces using photometric time series observations. Our model builds on previouswork by Perger et al. (2020) on semi-interpretable Gaussian processes for stellar timeseriesdata and by Morris (2020) on approximate inference for large ensembles of stellar light curves.Implementation of our model requires the efficient evaluation of a set of special functionsand recursion relations that are not readily available in existing probabilistic programmingframeworks. The starry_process package provides the necessary elements to perform thisanalysis with existing and forthcoming astronomical datasets.
Implementation
We implement our interpretable GP in the user-friendly
Python package starry_process ,which can be installed via pip or from source on GitHub. The code is thoroughly unit-testedand well documented, with examples on how to use the GP in custom inference problems. Asdiscussed in the associated AAS Journals publication (Luger, Foreman-Mackey, & Hedges,2021), users can choose, among other options, whether or not to marginalize over the stellarinclination and whether or not to model a normalized process. Users can also choose thespherical harmonic degree of the expansion, although it is recommended to use l max = 15 (seebelow). Users may compute the mean vector and covariance matrix in either the sphericalharmonic basis or the flux basis, or they may sample from it or use it to compute marginallikelihoods. Arbitrary order limb darkening is implemented following Agol et al. (2020).The code was designed to maximize the speed and numerical stability of the computation.Although the computation of the GP covariance involves many layers of nested sums overspherical harmonic coefficients, these may be expressed as high-dimensional tensor products,which can be evaluated efficiently on modern hardware. Many of the expressions can also beeither pre-computed or computed recursively. To maximize the speed of the algorithm, thecode is implemented in hybrid C++ / Python using the just-in-time compilation capability of the
Theano package (Theano Development Team, 2016). Since all equations derived here haveclosed form expressions, these can be autodifferentiated in a straightforward and numericallystable manner, enabling the computation of backpropagated gradients within
Theano . As such, starry_process is designed to work out-of-the box with
Theano -based inference tools suchas
PyMC3 for NUTS/HMC or ADVI sampling (Salvatier et al., 2016).Figure 2 shows the computational scaling of the
Python implementation of the algorithmfor the case where we condition the GP on a specific value of the inclination (blue) and thecase where we marginalize over inclination (orange). Both curves show the time in seconds tocompute the likelihood (averaged over many trials to obtain a robust estimate) as a function ofthe number of points K in a single light curve. For K (cid:46) , the computation time is constantat − ms for both algorithms. This is the approximate time (on a typical modern laptop)taken to compute the GP covariance matrix given a set of hyperparameters θθθ • . For largervalues of K , the cost approaches a scaling of K . , which is dominated by the factorizationof the covariance matrix and the solve operation to compute the likelihood. The likelihoodmarginalized over inclination is only slightly slower to compute, thanks to the tricks discussedin Luger, Foreman-Mackey, & Hedges (2021). Luger, (2021). starry_process: Interpretable Gaussian processes for stellar light curves.
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable number of points10 li k e li h oo d e v a l t i m e [ s ] K . conditionalmarginal Figure 2:
Evaluation time in seconds for a single log-likelihood computation as a function of thenumber of points K in each light curve when conditioning on a value of the inclination (blue) andwhen marginalizing over the inclination (orange). At l max = 15 , computation of the covariance matrixof the GP takes about 20ms on a typical laptop. The dashed line shows the asymptotic scaling of thealgorithm, which is due to the Cholesky factorization and solve operations. l max l o g c o n d i t i o n nu m b e r Figure 3:
Log of the condition number of the covariance in the spherical harmonic basis as a functionof the spherical harmonic degree of the expansion, l max . Different lines correspond to different valuesof θθθ • drawn from a uniform prior (see text for details). In the majority of the cases, the matrix becomesill-conditioned above l max = 15 . Many modern GP packages (e.g., Ambikasaran et al., 2015; Foreman-Mackey et al., 2017)have significantly better asymptotic scalings, but these are usually due to specific structureimposed on the kernel functions, such as the assumption of stationarity. Our kernel structure isdetermined by the physics (or perhaps more accurately, the geometry) of stellar surfaces, andits nonstationarity is a consequence of the normalization step in relative photometry (Luger,Foreman-Mackey, Hedges, & Hogg, 2021). Moreover, and unlike the typical kernels used forGP regression, our kernel is a nontrivial function of the hyperparameters θθθ • , so its computationis necessarily more expensive. Nevertheless, the fact that our GP may be used for likelihoodevaluation in a small fraction of a second for typical datasets ( K ∼ , ) makes it extremelyuseful for inference.Our algorithm is also numerically stable over nearly all of the prior volume up to l max = 15 .Figure 3 shows the log of the condition number of the covariance matrix in the sphericalharmonic basis as a function of the spherical harmonic degree of the expansion for 100 drawsfrom a uniform prior over the domain of the hyperparameters. The condition number is nearlyconstant up to l max = 15 in almost all cases; above this value, the algorithm suddenly becomesunstable and the covariance is ill-conditioned. The instability occurs within the computation ofthe latitude and longitude moment integrals and is likely due to the large number of operations Luger, (2021). starry_process: Interpretable Gaussian processes for stellar light curves.
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable l max via careful reparametrization of some of thoseequations, we find that l max = 15 is high enough for most practical purposes. References
Agol, E., Luger, R., & Foreman-Mackey, D. (2020). Analytic Planetary Transit Light Curvesand Derivatives for Stars with Polynomial Limb Darkening. (3), 123. https://doi.org/10.3847/1538-3881/ab4feeAmbikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D. W., & O’Neil, M. (2015).Fast Direct Methods for Gaussian Processes.
IEEE Transactions on Pattern Analysis andMachine Intelligence , , 252. https://doi.org/10.1109/TPAMI.2015.2448083Foreman-Mackey, D., Agol, E., Ambikasaran, S., & Angus, R. (2017). Fast and ScalableGaussian Process Modeling with Applications to Astronomical Time Series. (6), 220.https://doi.org/10.3847/1538-3881/aa9332Luger, R., Foreman-Mackey, D., & Hedges, C. (2021). Mapping stellar surfaces II: Aninterpretable Gaussian process model for light curves. arXiv e-Prints , arXiv:2102.01697.http://arxiv.org/abs/2102.01697Luger, R., Foreman-Mackey, D., Hedges, C., & Hogg, D. W. (2021). Mapping stellar surfacesI: Degeneracies in the rotational light curve problem. arXiv e-Prints , arXiv:2102.00007.http://arxiv.org/abs/2102.00007Morris, B. (2020). fleck: Fast approximate light curves for starspot rotational modulation. The Journal of Open Source Software , (47), 2103. https://doi.org/10.21105/joss.02103Perger, M., Anglada-Escudé, G., Ribas, I., Rosich, A., Herrero, E., & Morales, J. C. (2020).Auto-correlation functions of astrophysical processes, and their relation to Gaussian pro-cesses; Application to radial velocities of different starspot configurations. arXiv e-Prints ,arXiv:2012.01862. http://arxiv.org/abs/2012.01862Salvatier, J., Wiecki, T. V., & Fonnesbeck, C. (2016). Probabilistic programming in Pythonusing PyMC3. PeerJ Computer Science , , e55.Theano Development Team. (2016). Theano: A Python framework for fast computation ofmathematical expressions. arXiv e-Prints , abs/1605.02688 . http://arxiv.org/abs/1605.02688 Luger, (2021). starry_process: Interpretable Gaussian processes for stellar light curves.
Journal of Open Source Software , TBD(TBD), TBD.https://doi.org/DOIunavailable, TBD(TBD), TBD.https://doi.org/DOIunavailable