deeplenstronomy: A dataset simulation package for strong gravitational lensing
Robert Morgan*1,2, Brian Nord3,4, Simon Birrer, Joshua Yao-Yu Lin, and Jason Poh
1 University of Wisconsin-Madison
2 Legacy Survey of Space and Time Data Science Fellowship Program
3 Fermi National Accelerator Laboratory
4 University of Chicago
5 Stanford University
6 University of Illinois Urbana-Champaign
DOI: 10.21105/joss.02854
Submitted: 10 November 2020
Published: 04 February 2021
License: Authors of papers retain copyright and release the work under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
Background
Astronomical observations and statistical modeling permit the high-fidelity analysis of strong gravitational lensing (SL) systems, which display an astronomical phenomenon in which light from a distant object is deflected by the gravitational field of another object along its path to the observer. These systems are of great scientific interest because they provide information about multiple astrophysical and cosmological phenomena, including the nature of dark matter, the expansion rate of the Universe, and characteristics of galaxy populations. They also serve as standing tests of the theory of General Relativity and modified theories of gravity.
Traditional searches for SL systems have involved time- and effort-intensive visual or manual inspection of images by humans to identify characteristic features such as arcs, particular color combinations, and object orientations. However, a comprehensive search using the traditional approach is prohibitively expensive for large numbers of images, like those in cosmological surveys, e.g., the Sloan Digital Sky Survey (York et al. 2000), the Dark Energy Survey (Abbott et al. 2018), and the Legacy Survey of Space and Time (LSST) (Ivezić et al. 2019). To automate the SL detection process, techniques based on machine learning (ML) are beginning to overtake traditional approaches for scanning astronomical images. In particular, deep learning techniques have been the focus, but they require large sets of labeled images for training. Because of the relatively low number of observed SL systems, simulated datasets of images are often needed. Thus, the composition and production of these simulated datasets have become integral parts of the SL detection process.
One of the premier tools for simulating and analyzing SL systems, lenstronomy (Birrer and Amara 2018), generates a single image from a user's specification of the properties of the physical systems and of how they are observed (e.g., telescope and camera), supplied through a Python-based application programming interface (API). Generating populations of SL systems that are fit for neural network training requires additional infrastructure.
Statement of need
Because the performance of ML approaches depends inherently on their training data, the deep learning approach to SL detection is in tension with scientific reproducibility unless there is a clear prescription for simulating the training data. There is a critical need for a tool that simulates full datasets in an efficient and reproducible manner, while enabling the use of all the features of the lenstronomy simulation API. Additionally, this tool should simplify user interaction with lenstronomy and organize the simulations and associated metadata into convenient data structures for deep learning problems.
Multiple packages have been developed to generate realistic training data by wrapping around lenstronomy: baobab (Park 2021) generates training sets for lens modeling and hierarchical inference, and the LSST Dark Energy Science Collaboration's SLSprinkler (Kalmbach et al. 2020) adds strongly lensed variable objects into catalogs and images. Nonetheless, the need remains for a simple, general tool capable of efficiently simulating any astronomical system in a reproducible manner while giving the user complete freedom to set the properties of objects.
* Corresponding author
Morgan et al., (2021). deeplenstronomy: A dataset simulation package for strong gravitational lensing. Journal of Open Source Software, 6(58), 2854. https://doi.org/10.21105/joss.02854
Summary
deeplenstronomy generates SL datasets by organizing and expediting user interaction with lenstronomy. The user creates a single yaml-style configuration file that describes the aspects of the dataset: number of images, properties of the telescope and camera, cosmological parameters, observing conditions, properties of the physical objects, and geometry of the SL systems. deeplenstronomy parses the configuration file and generates the dataset, producing both the images and the parameters that led to the production of each image as outputs. The configuration files can be shared easily, enabling users to reproduce each other's training datasets.
The premier objective of deeplenstronomy is to help astronomers make their training datasets as realistic as possible. To that end, deeplenstronomy contains built-in features for the following functionalities: use any stellar light profile or mass profile in lenstronomy; simulate a variety of astronomical systems such as single galaxies, foreground stars, galaxy clusters, supernovae, and kilonovae, as well as any combination of those systems; fully control the placements of objects in the simulations; use observing conditions of real astronomical surveys; draw any parameter from any probability distribution; introduce any correlation; and incorporate real images into the simulation. Furthermore, deeplenstronomy facilitates realistic time-domain studies by providing access to public spectral energy distributions of observed supernovae and kilonovae and incorporating the transient objects into time series of simulated images. Finally, deeplenstronomy provides data visualization functions to enable users to inspect their simulation outputs. These features and the path from configuration file to full dataset are shown in Figure 1.
deeplenstronomy makes use of multiple open-source software packages: lenstronomy is used for all gravitational lensing calculations and image simulation; numpy (Harris et al. 2020) Arrays are used internally to store image data and perform vectorized calculations; pandas (McKinney et al. 2010) DataFrames are utilized for storing simulation metadata and file reading and writing; scipy (Jones et al. 2001) is used for integration and interpolation; matplotlib (Hunter 2007) functions are used for image visualization; astropy (Astropy Collaboration et al. 2013) is used for cosmological calculations and color image production; h5py (Collette 2014) is utilized for saving images; and PyYAML (Simonov and Net 2006) is used to manage the configuration file. While not used directly, some python-benedict (Caccamo 2018) functionalities helped to create deeplenstronomy's data structures and internal search algorithms.
deeplenstronomy is packaged and disseminated via PyPI. Documentation and example notebooks are available on the deeplenstronomy website. Bug reports and feature requests can be opened as issues in the GitHub repository (Morgan 2020).
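The configuration-driven workflow described in this section can be illustrated with a minimal, standard-library-only sketch. It is not deeplenstronomy's implementation and does not use its API or its configuration schema: the CONFIG layout, the key names, and the helpers draw and sample_metadata are all hypothetical, chosen only to show how one shareable configuration plus a fixed seed can determine the parameters recorded for every simulated image. A real dataset would pass these parameters to an image simulator such as lenstronomy.

```python
import random

# Hypothetical configuration (NOT deeplenstronomy's schema). Each entry is
# either a fixed value or a small dict naming a probability distribution.
CONFIG = {
    "dataset": {"size": 5, "seed": 42},
    "survey": {
        "exposure_time": 90.0,
        "seeing": {"distribution": "uniform", "low": 0.8, "high": 1.2},
    },
    "lens": {
        "theta_E": {"distribution": "normal", "mean": 1.2, "std": 0.2},
        "redshift": 0.5,
    },
}


def draw(spec, rng):
    """Return spec unchanged if it is a fixed value, else sample from it."""
    if not isinstance(spec, dict):
        return spec
    kind = spec["distribution"]
    if kind == "uniform":
        return rng.uniform(spec["low"], spec["high"])
    if kind == "normal":
        return rng.gauss(spec["mean"], spec["std"])
    raise ValueError(f"unsupported distribution: {kind}")


def sample_metadata(config):
    """Build one metadata row (a dict of parameters) per simulated image."""
    # A private generator seeded from the configuration makes the draws
    # depend only on the configuration, not on global random state.
    rng = random.Random(config["dataset"]["seed"])
    rows = []
    for index in range(config["dataset"]["size"]):
        row = {"index": index}
        for section in ("survey", "lens"):
            for name, spec in config[section].items():
                row[f"{section}-{name}"] = draw(spec, rng)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in sample_metadata(CONFIG):
        print(row)
```

Because the random generator is seeded from the configuration itself, two users holding the same configuration obtain identical parameter tables, which is the reproducibility property the text emphasizes. Keeping one row of parameters per image also mirrors the idea of returning the images together with the parameters that produced them.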
Figure 1: The deeplenstronomy process. Dataset properties, camera and telescope properties, observing conditions, object properties (e.g., lenstronomy light and mass profiles, point sources, and temporal behavior), the geometry of the SL systems, and optional supplemental input files (e.g., probability distributions, covariance matrices, and image backgrounds) are specified in the main configuration file. deeplenstronomy then interprets the configuration file, calls lenstronomy simulation functionalities, and organizes the resulting images and metadata.
Acknowledgements
R. Morgan thanks the LSSTC Data Science Fellowship Program, which is funded by LSSTC, NSF Cybertraining Grant
References
Abbott, T. M. C., F. B. Abdalla, A. Alarcon, J. Aleksić, S. Allam, S. Allen, A. Amara, et al. 2018. "Dark Energy Survey year 1 results: Cosmological constraints from galaxy clustering and weak lensing" 98 (4): 043526. doi:10.1103/PhysRevD.98.043526.
Astropy Collaboration, T. P. Robitaille, E. J. Tollerud, P. Greenfield, M. Droettboom, E. Bray, T. Aldcroft, et al. 2013. "Astropy: A community Python package for astronomy" 558 (October): A33. doi:10.1051/0004-6361/201322068.
Birrer, Simon, and Adam Amara. 2018. "Lenstronomy: Multi-Purpose Gravitational Lens Modelling Software Package." Physics of the Dark Universe 22: 189–201. doi:10.1016/j.dark.2018.11.002.
Caccamo, Fabio. 2018. "Python-Benedict." https://github.com/fabiocaccamo/python-benedict.
Collette, Andrew. 2014. Python and HDF5. O'Reilly.
Harris, Charles R., K. Jarrod Millman, Stéfan J van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, et al. 2020. "Array Programming with NumPy." Nature.
Hunter. 2007. Computing in Science and Engineering.
Kalmbach et al. 2020. LSST-DESC/SLSprinkler: LSST DESC Strong Lensing Sprinkler for Simulations (version v1.1.0). Zenodo. doi:10.5281/zenodo.4480392.
McKinney, Wes, and others. 2010. "Data Structures for Statistical Computing in Python." In Proceedings of the 9th Python in Science Conference, 445:51–56. Austin, TX. doi:10.25080/majora-92bf1922-00a.
Morgan, Robert. 2020. "Deeplenstronomy." https://github.com/deepskies/deeplenstronomy.
Park, Ji Won. 2021. Jiwoncpark/Baobab: V0.1.2 (version v0.1.2). Zenodo. doi:10.5281/zenodo.4476822.
Simonov, K., and I. Net. 2006. "PyYaml." https://github.com/yaml/pyyaml.
York, Donald G., J. Adelman, John E. Anderson Jr., Scott F. Anderson, James Annis, Neta A. Bahcall, J. A. Bakken, et al. 2000. "The Sloan Digital Sky Survey: Technical Summary" 120 (3): 1579–87. doi:10.1086/301513.