The ANTARES Astronomical Time-Domain Event Broker

Thomas Matheson, Carl Stubens, Nicholas Wolf, Chien-Hsiu Lee (李見修), Gautham Narayan, Abhijit Saha, Adam Scott, Monika Soraisam, Adam S. Bolton, Benjamin Hauger, David R. Silva, John Kececioglu, Carlos Scheidegger, Richard Snodgrass, Patrick D. Aleo, Eric Evans-Jacquez, Navdeep Singh, Zhe Wang, Shuo Yang, and Zhenge Zhao

Draft version January 14, 2021
Typeset using LaTeX default style in AASTeX63

Affiliations:
NSF's National Optical-Infrared Astronomy Research Laboratory, 950 North Cherry Avenue, Tucson, AZ 85719, USA
Department of Astronomy, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Center for Astrophysical Surveys, National Center for Supercomputing Applications, Urbana, IL 61801, USA
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
University of Texas at San Antonio College of Sciences, One UTSA Circle, San Antonio, TX 78249, USA
Department of Computer Science, The University of Arizona, 1040 East 4th Street, Tucson, AZ 85721, USA
Wisconsin IceCube Particle Astrophysics Center, University of Wisconsin-Madison, Madison, WI 53703, USA
Amazon.com, Inc., 440 Terry Ave N, Seattle, WA 98109, USA
DiDi Research America, 450 National Avenue, Mountain View, CA 94043, USA

(Received; Revised; Accepted)
Submitted to

ABSTRACT

We describe the Arizona-NOIRLab Temporal Analysis and Response to Events System (ANTARES), a software instrument designed to process large-scale streams of astronomical time-domain alerts. With the advent of large-format CCDs on wide-field imaging telescopes, time-domain surveys now routinely discover tens of thousands of new events each night, more than can be evaluated by astronomers alone. The ANTARES event broker will process alerts, annotating them with catalog associations and filtering them to distinguish customizable subsets of events. We describe the data model of the system, the overall architecture, annotation, implementation of filters, system outputs, provenance tracking, system performance, and the user interface.
Corresponding author: Thomas Matheson
[email protected]
Keywords: astronomical methods, time-domain astronomy — astronomy software — computational methods

INTRODUCTION

The practice of time-domain astronomy is undergoing immense changes. The pathway for detection of transient and variable objects in the sky once relied on visual inspection and comparison of images. The advent of digital detectors, especially large-format CCDs that can be deployed on wide-field telescopes, has enabled a new era where image subtraction algorithms can identify virtually every difference between a current image and a template over thousands of square degrees of sky each night.

As an illustration of the change in scale, there were 4038 International Astronomical Union Circulars (IAUCs) issued between 1991 January and 2010 December with 10,358 entries. Some of those entries were for multiple objects, but the vast majority were single events, sometimes for follow-up observations of prior discoveries. Even under the generous assumption that every entry represents a unique discovery, this amounts to fewer than two reported events per day over those two decades. Modern synoptic surveys operate at a vastly different scale: the Zwicky Transient Facility (ZTF) generates alerts at a rate of order a hundred thousand per night, and LSST is expected to produce roughly 10 million alerts per night (see Section 10.1). Streams of this size cannot be evaluated by astronomers alone; they require automated alert brokers. Several such brokers have been developed, including MARS (https://mars.lco.global/), Lasair (Smith et al. 2019), Alert Management, Photometry and Evaluation of Lightcurves (AMPEL, Nordin et al. 2019), Automatic Learning for the Rapid Classification of Events (ALeRCE, Förster et al. 2020), and Fink (Möller et al. 2020). Alert brokers can filter alerts, add value through annotation from many sources of information, characterize and classify alerts, and distribute alerts to the consumer.

Overall, we view ANTARES as an astronomical instrument, but one that lets scientists collect and analyze time-domain alerts, rather than photons. ANTARES is designed to operate at LSST scale and beyond while ingesting alerts, annotating them with additional information, filtering them, and distributing alerts of interest to astronomers who request them. In this paper, we describe the principles that guided development, the overall architecture of the system, how the various components work, and the performance as deployed at the ZTF scale. Descriptions of previous versions of the system were presented by Saha et al. (2014, 2016) and Narayan et al. (2018).

ILLUSTRATIVE USE CASES

A common use case is a program that requires a steady supply of targets of a well-understood class, for example Type Ia supernovae (SNe) for cosmology. The program may have dedicated spectroscopic resources for only a short time, so a steady stream of Type Ia SNe is only needed for that period. Different groups will have different needs at different times. The broker must be capable of filtering alerts to generate streams of objects and flexible enough to provide these customized streams on demand.

Rare events detected by time-domain surveys will also be of great interest. The recent success in identifying an electromagnetic counterpart to a gravitational wave detection (e.g., Abbott et al. 2017a,b) highlights the scientific value in distinguishing the rare events from the many more common ones that will turn up in the same survey. Many multi-messenger events, whether gravitational waves (e.g., LIGO/Virgo, Aasi et al. 2015; Acernese et al. 2015), neutrinos (e.g., Adrián-Martínez et al. 2011; Aartsen et al. 2017), or cosmic rays (e.g., Aharonian et al. 2006; Holder et al. 2006; Knödlseder 2020), have poor sky localizations (e.g., Abbott et al. 2018), necessitating wide-area searches that will produce large numbers of alerts, only a few of which can be viable counterpart candidates. Rapid identification of those candidates will be valuable (e.g., Arcavi 2018). Even without a multi-messenger component, rare events will appear in time-domain surveys. At LSST rates, there could be several one-in-a-billion events every year. The broker must operate rapidly to filter the alerts and distinguish rare events.
One way of finding the needle in the haystack is to remove the hay, so the broker should be able to identify common transients and variables that would otherwise obscure rare events with their numbers. As an example, flares on dwarf stars could be distinguished using multiwavelength data, such as checking for coincidence with red sources in the WISE catalog (e.g., Davenport et al. 2012).

Activity on Solar System objects provides another tool for investigating their nature and thus elucidating aspects of Solar System formation (e.g., Jewitt et al. 2015). Identification of new moving objects in time-domain surveys employs a distinctly different approach than finding objects that merely change in brightness. It typically requires combining tracks of objects over multiple frames separated by hours to days (e.g., Trilling et al. 2017; Jones et al. 2018). Finding new moving objects is thus beyond the capacity of a broker such as ANTARES, and it is a task that will be a component of LSST data management (Juric et al. 2017). Known Solar System objects, however, can be distinguished if the ephemerides are correct. Both ZTF and LSST will flag known objects and these can then be evaluated to see if they have the expected brightness or if they are showing signs of activity. Such a specialized process on a large subset of a time-domain stream would likely require a separate broker, so a general-purpose broker must be able to efficiently redirect a significant portion of the alert stream to downstream, customized brokers.

Many variable objects do not need immediate spectroscopic or photometric follow-up, especially those with relatively long time scales. These objects, whether periodic or aperiodic, are most valuable to an astronomer once a full light curve is available. Depending on the cadence of the survey and the period of the variable, this may take many periods to sufficiently sample the light curve. To study these objects, real-time filtering is less useful than easy discovery in an archive of alerts. While surveys may retain searchable archives of raw alerts, the broker must keep a record of the full, annotated set of alerts. This value-added archive must be built up by the broker over the course of the survey. It must provide a database of alerts with a flexible query system that enables science on longer time-scale objects.

Sometimes variables (and some longer-lived transients) can behave in unexpected ways. Examples could include a known pulsating periodic variable suddenly undergoing an eclipse from a distant companion, or an eruption of a luminous blue variable. Wide-field time-domain surveys provide astronomers with a tool that monitors all of these objects. The broker must provide a way for users to provide watch lists of objects of interest and a method of communication to notify these users when an alert occurs for one of their objects.

Finally, as each new time-domain survey has begun, the novel aspects of parameter space that it reaches have inevitably yielded interesting objects that no one had yet detected, such as superluminous supernovae (e.g., Quimby et al. 2007; Gal-Yam 2012), luminous red novae (e.g., Martini et al. 1999; Kasliwal et al. 2011), calcium-rich transients (e.g., Kasliwal et al. 2012; Foley 2015), .Ia supernovae (e.g., Kasliwal et al. 2010; Perets et al. 2010), and fast, blue optical transients (e.g., Drout et al. 2014; Smartt et al. 2018; Margutti et al. 2019). Astronomy has traditionally been a science of discovery and exploration. Wide-field time-domain surveys continue this tradition, providing opportunities for discovery. There are, as yet, still empty regions in the energy-time diagram of Kasliwal (2012). LSST will produce time-domain alerts at an unprecedented rate, but also at an unprecedented depth, relatively high cadence, and in multiple filters, thus opening an entirely new discovery volume.
We may not know precisely what will appear in the LSST alert stream, but we can predict that new and extraordinary objects will be there. We need to have a broker that can sort through the alerts rapidly, filter them to recognize what we know, and then provide a substream of unusual events that will provide the next breakthrough in astronomy.

DATA MODEL

A data model is a set of data entities that exist within a system, the relationships between entities, and the way of representing these entities and relationships in a database. The ANTARES data model is driven by the nature of time-domain data. The primary data entities in ANTARES are the Alert and the Locus. Secondary entities include the Tag, LocusProperty, AlertProperty, WatchList, WatchObject, Catalog, and CatalogObject. The data model is visualized in Figure 1.
Figure 1.
ANTARES data model, simplified. Boxes indicate entities. Lines indicate relationships. Asterisks and numbers on lines indicate relationship type: one-to-many (1—*), many-to-one (*—1), or many-to-many (*—*). Data contained within entities are omitted. The Filter is a software entity, not a data entity, and is also omitted.
An Alert is a message received by ANTARES from a survey such as ZTF or LSST, plus additional properties generated by ANTARES. Typically an Alert is either an observed magnitude measurement or an "upper limit" magnitude measurement, but in principle an Alert can be any packet of data that has precise spatial coordinates and a time stamp.

A Locus (plural, Loci) is a point on the sky where Alerts cluster and is roughly equivalent to an astrophysical object. The association of Alerts to Loci uses a 1″ radius cone search. That is, each incoming Alert is associated with the nearest Locus within 1″ of the Alert. If no such Locus exists, a new Locus is created.

Tags are short strings that are associated with one or more Loci. Tags are used to flag Loci that meet specific criteria, e.g., "extragalactic," "nuclear transient," etc. Tags are used to generate output streams and are queryable in the database.

Alert properties are associated with a particular Alert, such as magnitude. Some originate from the incoming source alert, and others are annotations generated by Filters.

Locus properties are associated with a Locus (i.e., an astronomical object), not an Alert. Examples include color, or statistics of a light curve such as skewness.

CatalogObjects are entities in astronomical object catalogs. ANTARES 1.0 supports extended objects by modeling them as circular regions. Support for other shapes is possible in future versions.

WatchObjects and WatchLists allow users to define and track objects and regions of interest to them. A WatchObject is a coordinate and a radius, and a WatchList is a collection of such objects. Users may create their own WatchLists using the ANTARES Portal (Section 11.1 describes the portal) and browse matching Loci. ANTARES can notify users of new hits to their WatchLists over Slack (https://slack.com/).

The Provenance object stores information about the version and configuration of ANTARES that processes each Alert. It is described further in Section 8.

Region-based alerts (e.g., gravitational wave detections from LIGO/Virgo) do not have definitive coordinate locations, and are handled independently of the Alert/Locus concept. We have implemented a system for ingesting LIGO/Virgo alerts that can be generalized to any alert source with imprecise localizations (and thus large areas associated with the alert), including the neutrino and cosmic ray observatories mentioned above, and networks that collate multi-messenger alerts like the Astrophysical Multimessenger Observatory Network (AMON, Ayala Solares et al. 2020). Filters in ANTARES have access to all LIGO/Virgo data within the past 30 days. This makes it possible to write a filter that flags all Alerts within a particular confidence level region of a recent LIGO/Virgo detection. Combining this with other criteria, such as galaxy catalog associations or light curve history, allows the production of filters that identify likely LIGO/Virgo correlates.

SYSTEM ARCHITECTURE

The ANTARES system is designed around a parallel stream-processing architecture. The system ingests alerts from input Apache Kafka streams (Kafka is an industry-standard technology for data streaming; Kreps et al. 2011), processes them, stores them in a database, and produces streams of output alerts using Kafka. The system is composed of ten subsystems that perform specific functions. Figure 2 depicts the layering of systems and activities. Figure 3 depicts the architecture of the system and connections between components.
Figure 4 depicts the behavior of the Alert Pipeline subsystem that processes alert streams. The remainder of this section describes these concepts in more detail.

4.1. System Layers
At an abstract level, the system is composed of three layers, as shown in Figure 2. From bottom to top, these layers are systems administration, software engineering, and science. Each layer contains responsibilities and systems that facilitate those of the layer above. The systems administration layer is responsible for hosting and managing components that run on bare-metal hardware. This includes the databases, the Kafka streaming message broker, and the Kubernetes (a software container orchestration system) compute cluster. On top of this layer is the software engineering layer, which includes two types of systems: live systems that run on Kubernetes, and distributable software packages that users download and run outside the ANTARES cluster. On top of the software engineering layer is the science layer. The science layer consists of tasks and components written by researchers who use ANTARES as a science tool. This includes data analysis tasks, model training, filter development, and downstream processing of ANTARES output streams.

4.2. System Components
We show the architecture of the system in Figure 3. This illustrates the connections between components and the flow of data through the system.

ANTARES processes alerts concurrently in multiple instances of a program called the "Pipeline Worker." We run one instance for each partition in the Kafka topic from which they receive data. Each worker processes alerts in the order they were received and uses Kafka's "commit" mechanism to track progress and remain resilient to system failures. The alerts are processed according to the configured science workload and the results are stored in the ANTARES database as well as distributed to the community. The Pipeline Workers load Filter code from the MySQL database upon startup. Loci, Alerts, and their annotations are stored in the Alert Database, which is implemented with Apache Cassandra. The Alert Database is the single source of truth for ANTARES data, although its content is also indexed separately by the Search Engine. The Alert Database schema is designed for fast access by the Pipeline Workers and thus does not support complex queries (see Section 9). The Pipeline Workers perform Locus associations and execute filter code on the Loci (see Section 4.3). After each Locus is processed, updated information about the Locus is sent to the Search Engine (implemented using Elasticsearch) via Kafka and the Index Worker microservice. The Search Engine indexes all Loci and makes them queryable by the Application Programming Interface (API). The API service is a Representational State Transfer (REST) Hypertext Transfer Protocol (HTTP) API that is used by the Portal web site and by the Client Library. The API uses the Search Engine to perform searches for Loci, and loads the complete data for each Locus from the Alert Database. The Client Library and the Portal interface with users, providing visibility into the database.

The Pipeline Workers also produce two other forms of output: Slack notifications and output Kafka Alerts. Slack messages are sent using an asynchronous cluster of job workers that process tasks in a job queue stored in the Redis in-memory cache system. The workers implement automatic retry and exponential backoff in the event that Slack is not responding. This system is designed to facilitate other forms of notification in the future, such as web hooks, emails, or Short Message Service (SMS) text messaging as needed. In ANTARES 1.0, only Slack notifications are implemented.

Also shown in the architecture diagram is the Filter Devkit, described in Section 11.3.
Figure 2. "Layercake" diagram showing the layers of systems and activities in ANTARES. The Systems Administration layer provides compute and database services. The Software Engineering layer provides software systems. The Science layer consists of activities performed by science staff and researchers from the community. This is an abstract representation, not a system architecture diagram.
Figure 3.
Architecture diagram of ANTARES. Input Alerts enter the system on the left (from ZTF, LSST, etc.) and output Alerts exit on the right (to downstream systems, TOMs, etc.). Many subsystems represent clusters of processes on multiple machines. Subsystems marked with the blue Kubernetes logo run on a shared pool of hardware resources in a Kubernetes cluster. Subsystems not marked with the Kubernetes logo run on dedicated hardware.
Figure 4.
Logical structure of the Alert Pipeline Worker. Input Alert streams are consumed at the top of the diagram and output streams are produced at the bottom.
4.3. Alert Pipeline

The internal structure of the Pipeline Workers is shown in Figure 4. Alerts enter at the top and outputs are produced at the bottom. The first stage of processing is the L1 Filters, which determine whether to run the rest of the pipeline on a given Alert, or whether to simply store it in the database and perform no other processing. This capability can be used to ignore Alerts that are determined to be false detections or of poor quality by some metric. The L1 Filter capability is optional, depending on ANTARES configuration.

The next step is Locus Aggregation, in which the incoming Alert is assigned to a Locus object. This process uses a 1″ radius cone search. That is, each incoming Alert is associated with the nearest Locus within 1″ of the Alert, with precedence given to survey-provided associations. If no such Locus exists, a new Locus is created there and is assigned a unique "locus id." If there is a pre-existing Locus, then all historical Alerts associated with it are loaded from the database. From this point onward in the pipeline, the object being processed is a Locus object, not an Alert object.

The next stages of our pipeline are CatalogObject Association and WatchObject Association, in which CatalogObjects and WatchObjects are loaded for the given Locus. This is the first step in annotating the Locus with value-added data. This is described in detail in Section 5.

With all known data loaded for the Locus, the pipeline executes the L2 Filters or "Science Filters." These are Filters developed by the ANTARES science staff (e.g., Soraisam et al. 2020) or submitted by the community. Filters are described in Section 6. They implement science use cases such as classification and other decision making.

Finally, the new Alert is written to the database and the Locus is updated with whatever new properties and associations it has acquired. Output Alerts are broadcast to downstream systems using Kafka and the Search Engine is updated with the new Locus information.
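To make the Locus Aggregation rule concrete, the following sketch implements the nearest-Locus-within-1″ association in isolation. It is an illustration, not the production code, which resolves the candidate Loci through the HTM index and gives precedence to survey-provided associations:

import math

ASSOCIATION_RADIUS_ARCSEC = 1.0

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds; inputs in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def associate(alert_ra, alert_dec, nearby_loci, new_locus):
    """Return the nearest Locus within 1 arcsec of the alert, else a new Locus.
    `nearby_loci` stands in for the candidate set returned by the HTM index;
    `new_locus` is a callable that creates a Locus at the given position."""
    best, best_sep = None, ASSOCIATION_RADIUS_ARCSEC
    for locus in nearby_loci:
        sep = separation_arcsec(alert_ra, alert_dec, locus.ra, locus.dec)
        if sep <= best_sep:
            best, best_sep = locus, sep
    return best if best is not None else new_locus(alert_ra, alert_dec)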
4.4. Fault Tolerance
ANTARES runs Filter code submitted by the community, which is difficult to rigorously test. Therefore, ANTARES expects and handles filter crashes. Filters are immediately disabled if they crash and a notification is sent to the author via Slack. The notification contains a "crash log id" and the "locus id" of the Locus that was being processed, allowing the author to inspect the error and replicate the fault using the Devkit (see Section 11.3). The author may then improve the Filter, test it on actual data, and submit a new version. Other than temporary downtime of the Filter, there are no negative consequences to this occurring. From experience we have found that this capability is essential when developing new Filters.

Every component in the main ANTARES data pathway is a distributed system, allowing continuous operations in the event of a hardware failure of one machine at a time. Simultaneous failure of multiple machines may require a hands-on systems-administration response.

ANTARES' fault tolerance capabilities benefit from the properties of Kubernetes and Kafka. Specifically, Alerts that are received are not "committed" (i.e., marked as received) in Kafka until their processing is complete. Therefore, when a Pipeline Worker crashes in an unexpected way, it can pick up where it left off when it reboots. Kubernetes automatically restarts containers that fail.

Each system in the main data pathway is idempotent. This means that if the same Alert is received more than once, the final stored data are correct. For example, duplicate Alerts are detected and skipped.

ANNOTATION

In addition to the base data included in the original alert, ANTARES attaches data and associations to Alerts and Loci. This "value-added" data we call Annotations. Annotations have three purposes: to provide richer input data for Filters, to allow Filters to store computed properties on Loci and Alerts, and to allow searchability of the database by meaningful criteria. ANTARES includes several types of annotations, including Locus properties, Alert properties, Locus Tags, astronomical catalog object associations, and associations with watched objects.

Catalog associations are performed by first modeling each catalog object as a circular region of sky of a particular radius. Each Locus is annotated with all catalog objects whose circular region the Locus falls within. In the case of extended objects, the radius of each object region is taken from the catalog data. In the case of point sources, the object is given a default value of 1″.

The catalog search algorithm uses a hierarchical triangular mesh (HTM, Kunszt et al. 2001) ID lookup-table (HTMLUT). The HTMLUT allows for efficient and scalable object associations. Each catalog object is entered into the HTMLUT multiple times, once for each HTM trixel ID that the catalog object's circular region intersects. Because HTM trixel regions are triangular and not circular, the region for each object in the HTMLUT is larger than the circular region representing the catalog object. Therefore, false positive associations occur when using the HTMLUT alone. To resolve this, the position and radius of each object are checked after loading their data, and false positives are removed.

The HTMLUT is multi-level, meaning that it supports HTM IDs at multiple levels of tesselation. Each catalog object is represented at a level of tesselation calculated from its radius. The algorithm is tuned to select an HTM tesselation level such that each object intersects 2.5 HTM trixels on average. This level provides an approximate balance between minimizing the number of entries in the HTMLUT and minimizing the number of false positives that have to be loaded, checked, and removed.

The HTMLUT scheme can represent object regions of any size and shape, given an algorithm to represent the region as a set of HTM trixels at various tesselation levels. In ANTARES 1.0, circular regions have been implemented. Ellipses or other shapes can be implemented in the future.

The exact complement of catalogs will evolve as new catalogs are made available or old ones are updated, but LSST data will be incorporated as catalogs are published. The current set of catalogs in ANTARES is listed in Table 1, many derived from catsHTM (Soumagnac & Ofek 2018). Several catalogs in ANTARES 1.0 make use of variable radii, including the Third Reference Catalogue of Bright Galaxies (RC3), the Revised Catalog of GALEX Ultraviolet Sources (GALEX), and the Two Micron All Sky Survey (2MASS) extended source catalogs. In the case of RC3, we adopted a search radius associated with the apparent major isophotal diameter. We are using D25, measured at the surface brightness level µ_B = 25.0 B-mag per square arcsecond. For GALEX, we adopted a search radius associated with the Kron radius in the NUV. In practice, the Kron radius is expressed in the form of NUV_KRON_RADIUS * NUV_A_WORLD in the GALEX catalog (Bianchi et al. 2017). We thus adopt this form to define our search radius for GALEX. For the 2MASS extended source catalog, we use the semi-major axis (in arcseconds) of a fiducial ellipse at isophote K = 20 mag/arcsec², taken from the catalog data field "r_k20fe."

Table 1. External Catalogs Used in ANTARES

Catalog | Reference
The Two Micron All Sky Survey | Skrutskie et al. (2006)
AllWISE Data Release | Cutri et al. (2014)
ASAS-SN Catalog of Variable Stars | Shappee et al. (2014); Jayasinghe et al. (2018, 2019a,b)
Second-Generation Guide Star Catalog | Lasker et al. (2008)
Chandra Source Catalog | Evans et al. (2010)
Catalina Surveys | Drake et al. (2013a,b, 2014); Torrealba et al. (2015); Drake et al. (2017)
Preferred Tidal Disruption Hosts | French & Zabludoff (2018)
Gaia Data Release 2 | Gaia Collaboration et al. (2018)
The NASA/IPAC Extragalactic Database (a) | Helou et al. (1995)
New York University Value-Added Galaxy Catalog | Blanton et al. (2005)
Third Reference Catalogue of Bright Galaxies | de Vaucouleurs et al. (1991); Corwin et al. (1994)
Sloan Digital Sky Survey Data Release 12 | Alam et al. (2015)
A Catalogue of Quasars and Active Nuclei: 13th Edition | Véron-Cetty & Véron (2010)
The Third XMM-Newton Serendipitous Source Catalogue | Rosen et al. (2016)
Revised Catalog of GALEX Ultraviolet Sources | Bianchi et al. (2017)

(a) The NASA/IPAC Extragalactic Database (NED) is funded by the National Aeronautics and Space Administration and operated by the California Institute of Technology.

Catalogs used by ANTARES are stored locally within the Cassandra database. This allows our HTMLUT system to be used and eliminates the latency and reliability issues associated with querying external databases hundreds of times per second.
FILTERS
The Filter is the primary way in which users of ANTARES can winnow a stream of time-domain alerts. By evaluating data, both from the alert and annotations, algorithms implemented in filters can identify objects of interest and eliminate irrelevant objects, producing a subset of alerts that astronomers can then pursue with other observational resources. ANTARES Filters are snippets of Python (Van Rossum & Drake 2009) code that process incoming Alerts. Filters can be built into the ANTARES codebase, or can be submitted by the community using the ANTARES web Portal. Filters are executed in a sequence called the Filter Pipeline (see Figure 4). The Pipeline runs on each Locus when a new Alert is received on that Locus. Filters can use data files such as statistical models, neural networks, lookup tables, etc. These repositories represent distillations of astronomical knowledge derived from larger samples (e.g., Soraisam et al. 2020). These larger samples are what we call a Touchstone for ANTARES, but the Touchstone itself is external to the Pipeline described here. It is a separate system that enables the design, training, and deployment of filters.
6.1. Filter Inputs and Outputs
Filters have access to a variety of information sources in ANTARES at runtime, including the Locus object and its properties and Tags, all catalog objects associated with the Locus, all Alerts associated with the Locus and their properties, and recent LIGO/Virgo detection reports. Timeseries data such as Alert properties over time are available as "Astropy.TimeSeries" objects (Astropy Collaboration et al. 2013) or as Pandas dataframes (McKinney 2010). Other data are represented as Python data structures.

Based on the input data, Filters may take several actions, including setting properties on the Locus, setting properties on the newly arrived Alert (but not on historical Alerts), adding Tags to the Locus, or halting the pipeline and thus preventing subsequent Filters from running. This last ability is restricted to prevent misuse. Properties may be of type float64, string, or long integer.
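As a concrete illustration of these inputs and outputs (using the class structure detailed in Section 6.2), the sketch below tags Loci whose light curve spans a large magnitude range. The attribute locus.lightcurve (a Pandas dataframe) and the property name ztf_magpsf follow Devkit conventions for ZTF data, but both should be checked against the current Devkit documentation:

import antares.devkit as dk

class LargeAmplitude(dk.Filter):
    OUTPUT_TAGS = [
        {
            'name': 'large_amplitude',
            'description': 'Locus light curve spans more than 2 magnitudes.',
        },
    ]

    def run(self, locus):
        # Light curve of all Alerts on this Locus, as a Pandas dataframe.
        mags = locus.lightcurve['ztf_magpsf'].dropna()
        if len(mags) >= 2 and mags.max() - mags.min() > 2.0:
            locus.tag('large_amplitude')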
6.2. Filter Structure
Filters are Python classes that inherit from class “antares.devkit.Filter” and implement, at minimum, a method run(self, locus).
A simple filter is the HelloWorld filter, shown below. The HelloWorld filter declares that it may produce a Tag called "hello_world" and gives the Tag a text description. Then, in the run method, the filter adds the tag to the locus. There is no conditional logic around adding the tag, so this tag will be added to every locus that the Filter sees.

import antares.devkit as dk

class HelloWorld(dk.Filter):
    OUTPUT_TAGS = [
        {
            'name': 'hello_world',
            'description': 'This tag is added to EVERY Locus.',
        },
    ]

    def run(self, locus):
        locus.tag('hello_world')
The HelloWorld filter is a trivial example to demonstrate the basic format of a filter. Appendix A contains a more complex, real-world example. There, we present a high signal-to-noise ratio filter that explicitly declares all inputs and outputs, performs initial setup, executes a simple computation, conditionally adds a Tag depending on input data, and handles a potential error case.

OUTPUTS

The output of ANTARES takes three forms: our searchable database, Slack notifications through our Slack workspace (https://antares-noao.slack.com), and Kafka streams. The database is searchable using the ANTARES Portal and can also be queried using the HTTP API, either by users or by autonomous systems. Slack notifications are used to inform users and teams in real time of events. This includes newly tagged Loci, new Alerts on tagged Loci, and hits to WatchLists. ANTARES output Kafka streams are intended to be consumed by downstream systems. Streams are produced by tagging Loci. Each stream is configurable to include a union or intersection of one or more Tags. For example, given a Filter that tags some Loci "extragalactic" and another Filter that tags some Loci "sn1a candidate," a stream could be configured to include all Alerts whose Loci have both tags. A system could then connect to that stream and receive all such alerts from ANTARES. Connections to the HTTP API and Kafka streams are facilitated by the ANTARES Client Library, discussed in Section 11.2.
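A downstream system might consume such a stream with the Client Library (Section 11.2). The sketch below follows the general pattern of the client's streaming interface; the class name, method names, and credential parameters should be verified against the client documentation:

from antares_client import StreamingClient

# Kafka credentials are issued by the ANTARES team on request (Section 11.2).
client = StreamingClient(
    topics=["extragalactic"],      # an output stream defined by Tags
    api_key="YOUR_KEY",            # placeholder credentials
    api_secret="YOUR_SECRET",
)
for topic, locus in client.iter():  # blocks, yielding Loci as alerts arrive
    print(topic, locus.locus_id)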
PROVENANCE TRACKING

We considered two primary use cases for tracking of provenance data associated with processing of alerts by ANTARES: first, reproducing offline the state of the ANTARES system at a time in the past, and, second, viewing how filter decision making changed over time on a Locus. An example of the first use case would be an astronomer who wishes to inspect an Alert that was processed by ANTARES in the past. This individual could, based on the provenance data, run a copy of ANTARES with the same configuration as ran in production and reproduce the decisions made while evaluating that alert. As an example of the second use case, given a Filter that produces a classification prediction, a user might wish to view how their Filter's classification of a given Locus changed as each new Alert was received and processed.

To address the first use case, ANTARES logs all of its configuration and state on startup. When each ANTARES process boots up, it records information about the version of the system, the version of all dependencies, catalogs, filters, etc. The full content of the Provenance Log is shown in Table 2. This information is stored as a Provenance Log object in the database, indexed by a provenance id. Each incoming Alert is annotated with the provenance id of the ANTARES process that received it. Thus, the state of the system that processed a given Alert can be retrieved and, if necessary, reproduced offline. To simplify the state of ANTARES, the system does not add or remove Filters while running. Instead, the system reboots automatically every 24 hours and updates its filter set and configuration at that time only. If necessary, the system can be manually rebooted more frequently with no negative effect on operations.

Table 2. ANTARES Provenance Log Content

Provenance Label | Description
antares version | ANTARES package version
config | ANTARES system configuration
filters | Active Filters with version numbers
catalogs | Active Catalogs
python packages | Python package versions

To address the second provenance use case, ANTARES allows Filters to record time series of variables. This is implemented using Alert properties. Alert properties can be written to the current (i.e., new) Alert under consideration by a Filter and are immutable thereafter. For example, a Filter could exist that predicts the probability that a Locus is an RR Lyrae variable star. Each time the Filter runs, it could store that probability as an Alert property on the new Alert. Then, the Filter's author can view this value over time using the Client library or the Devkit.
DATABASE DESIGN

The data model described above is implemented using two databases: the alert database and the search index. These databases are implemented using Cassandra and Elasticsearch, respectively. Cassandra stores all data in a high-performance and scalable manner, but is indexed only by Locus ID and sky coordinate (represented by HTM IDs). Elasticsearch provides search indexes over values such as Locus properties, Tags, and catalog matches. To reduce capacity requirements, Elasticsearch stores only Locus data, not Alert data.
9.1. Apache Cassandra – ANTARES Alert Database
Cassandra was chosen for its horizontal scalability and performance. The final size of the ANTARES data set is estimated at between 100 TB and 2 PB. The estimate varies depending on many factors, such as the final specification of the LSST alert packet contents and the number of annotations that Filters produce. Taking the upper limit of 2 PB, Cassandra handles volumes of this size in many organizations; Apple famously operates the largest publicly known cluster which, as of 2015, stored 10 PB of data on 75,000 nodes. Cassandra supports the Apache Spark distributed cluster-computing framework, which opens the possibility of developing batch-processing jobs to run over the entire ANTARES data set in the future. ANTARES uses a Cassandra replication factor of 3 to ensure durability of data.

ANTARES uses custom tooling to monitor and operate our Cassandra cluster, which contains 6 nodes at the time of the ANTARES 1.0 release. These tools report real-time performance and status metrics, gather logs, monitor the health of the cluster, and send Slack notifications in the event of problems. The tools also allow the scheduling and triggering of Cassandra's built-in anti-entropy repair feature, which we run weekly. Another custom tool, called "my2cass," allows object catalogs stored in MySQL to be automatically migrated into Cassandra. This entails inspecting the MySQL table schema, creating an equivalent table in Cassandra, copying data into Cassandra using the ANTARES distributed job queue system, and populating the HTMLUT. These investments in operational capability are essential, as the Cassandra cluster will grow significantly over the 10 years of LSST operations.

9.2. Elasticsearch – ANTARES Search Engine
Our Cassandra database is designed to handle the particular set of queries described in Section 4.4. This allows it to handle LSST-scale throughput but prevents users from writing more complex queries to search for data they want. In order to allow for flexible searches over the ANTARES data holdings, we additionally store information about all of the Loci we've processed in an Elasticsearch cluster. Elasticsearch is a search engine built around Apache Lucene that allows our team and users to find Loci and compute aggregate statistics. It was chosen for its efficient addition of new indexes to existing tables, and its horizontal scalability. Because new Locus properties, Tags, and Catalogs will be added continuously over time, the Search Engine needs to be able to efficiently add new indexes. ANTARES automatically creates appropriately typed indexes in Elasticsearch as this occurs. At ZTF scale, a single-node Elasticsearch cluster suffices to store the search index. At LSST scale, the Elasticsearch cluster will need to be expanded to multiple nodes.
PERFORMANCE

Over the 10 years of LSST operations, the lengths of the light curves of variable objects will increase linearly from zero to hundreds of data points per Locus. Because ANTARES allows Filters to process the complete Locus history data each time an Alert is received, the load on the system will increase linearly over time. This is true both in computational load and read load on the database. Experience with pre-Cassandra designs of ANTARES running on ZTF data has taught us to plan ahead. In the first few months of ZTF data, a single MySQL node could easily act as the Alert Database. After more than a year of ZTF data, this was untenable; a distributed database was required. This linearly increasing load demands that long-term scaling plans be made in advance. No static system configuration that works at year 1 is guaranteed to work at year 10, or even at year 3.

Alert streams have the useful property that each Alert can be processed independently of other nearby Alerts, as long as they are not close enough together that they may belong to the same object (1″). ANTARES makes use of this property by processing data in parallel. The default configuration in our load tests has been 10 ANTARES processes, each with 10 threads. The number of processes and threads per process can be increased at any time, given sufficient hardware. Likewise, the database needs to scale over time. This is a matter of both capacity and performance. Response times must be low enough that ANTARES keeps up with the incoming data.

There is one barrier to arbitrarily scaling the system that can be addressed when needed. Kafka topics from ZTF (and from LSST) are internally split into multiple partitions. Kafka distributes messages approximately evenly between partitions. Kafka topic partitions allow a consumer (technically, a "consumer group") to pull messages from a Kafka topic using multiple processes simultaneously. The number of processes can only be as large as the number of partitions. If, in the future, this becomes a bottleneck, ANTARES may simply re-partition the Kafka stream into more partitions by mirroring it into a local Kafka cluster with more partitions.

Redis serves two functions in the pipeline. First, a mutex locking mechanism (a mutex, or mutual exclusion object, allows multiple program threads to share the same resource, but not to use the resource simultaneously) prevents two Alerts on the same Locus from being processed at the same time. This mechanism is called the HTM Region Lock. Implemented using Redis and Lua scripts, the region lock allows processes to lock access to sets of HTM trixels in an atomic, thread-safe manner. This is a bottleneck; however, the load on this system is constant over time and does not increase as light curves lengthen. In testing, the region lock introduces 10 ms to 100 ms of latency into the pipeline when processing Alerts at a test rate of 10 alerts/second. This small latency cost is acceptable to prevent race conditions. If this latency grows to unacceptable levels at LSST alert rates, then we will spread the load over multiple Redis instances.

Second, in the event of creating a new Locus object, a unique "locus id" is generated using a counter stored in memory in Redis. Earlier ANTARES designs used random GUIDs such as "1c165092-a540-11ea-94be-324c811e1e2." By using an integer counter and converting the integer to a compressed string format, Locus IDs can instead take a human-readable form such as "ANT2020so7ia." This system was inspired by the ZTF Object ID scheme (e.g., "ZTF18aabejqj") and produces shorter, more recognizable identifiers than UUIDs.
As with the HTM Region Lock, the performance is acceptable. The system recovers gracefully from Redis failures.
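The counter-to-identifier conversion amounts to a base-N encoding of the counter value. The sketch below is illustrative only; the production alphabet, year handling, and exact format are internal details of ANTARES:

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # assumed alphabet (base 36)

def encode_counter(n):
    """Encode a non-negative integer counter as a compact string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, r = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

def make_locus_id(counter, year):
    """Compose a human-readable Locus ID in the style of 'ANT2020so7ia'."""
    return f"ANT{year}{encode_counter(counter)}"

# e.g. make_locus_id(42, 2020) == 'ANT2020bg'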
10.1. Scalability Properties

We conjecture that ANTARES scales in proportion to three variables: the rate of incoming alerts, R; the number of historical alerts stored in the database, N (the integral of R over time); and the total computational complexity of the filters, F. Note that N is somewhat related to the average length of all light curves in the database, which grows over time as the survey operates. We further conjecture that the number of worker processes and the number of Cassandra database nodes both must scale in proportion to a function of R and N, while the CPU usage of the system scales in proportion to a function of all three variables R, N, and F.

Note that the value of N increases linearly with time as it integrates R. Therefore, even with constant R, the load on ANTARES increases over time.

An estimate of the alert rate, R, can be made based on LSST's estimated value of 10 million alerts per night and a typical night lasting 10 hours. This gives an average alert rate of about 280 alerts per second over the course of the night. Over shorter time intervals the rate will vary greatly, with spikes of much higher rates. Since alerts are queued and buffered in Kafka, ANTARES does not need to meet the maximum spike throughput rate, but it does need to easily handle the average rate over the course of each night.
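In LaTeX form, the accumulated alert count and the average-rate estimate above are simply:

    N(t) = \int_0^t R(t')\, dt', \qquad
    \bar{R}_{\mathrm{LSST}} \approx \frac{10^7\ \mathrm{alerts}}{10\ \mathrm{hr} \times 3600\ \mathrm{s\,hr^{-1}}} \approx 280\ \mathrm{alerts\ s^{-1}}.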
10.2. ZTF-Scale Load Test Results
We present the results of a load test of ANTARES using 1.23 million alerts taken from eight consecutive nights of ZTF data. The alerts were loaded into a single Kafka topic (stream) with 20 partitions on a Kafka cluster of 10 nodes running on a cloud service provider. ANTARES was deployed on-site at NOIRLab and was configured to connect to this stream with 20 worker processes. Each process had 10 threads, for a total of 200 threads. The system under test was the ANTARES infrastructure and database, not a particular combination of filters. A single filter was enabled with an average execution time less than 1 millisecond. Alert processing rate over time is shown in Figure 5. A steady-state throughput rate of approximately 45 alerts/second is visible for the first 6 hours. The rate then drops stepwise until all alerts have finished being processed at approximately t = 11 hours.

The stepwise decrease in throughput rate towards the end of the processing run has been observed during multiple tests, and appears to be scale-invariant with respect to the number of alerts in the run. We conjecture that this phenomenon is an artifact of Apache Kafka. Kafka divides each stream into multiple parallel sub-streams called partitions. A group of processes that together consume a single topic is called a "consumer group." Each individual consumer process in the consumer group may consume from multiple partitions, but each partition may only be consumed by one consumer process at a time. Some partitions finish being processed before others, and their consumers then sit idle while the slower partitions finish. We suspect that this causes a "tail" effect on plots of throughput over time, and it is consistent with the stepwise nature of the tail towards the end of a processing run. We will continue to investigate this.

We have observed that the steady-state throughput rate of ANTARES depends on which Kafka input cluster is used and on the number of partitions in the topic. We measure CPU, memory, and local network usage to be well below full utilization.
Figure 5. Alert throughput over time as ANTARES processes 1.23 million ZTF alerts in approximately 11 hours. The data set was created by merging eight consecutive nights of ZTF data into a single input stream for ANTARES.
This leads us to speculate that the capability of ANTARES on current hardware is limited by the input stream rather than by the database, local network, or CPU and memory constraints. This may be due to the internet connection to the Kafka cluster, to the number of nodes in the Kafka cluster, or to the number of partitions in the topic. We will investigate this further by using a local Kafka cluster directly connected to the ANTARES local network, and by varying the number of nodes and the number of partitions.
10.3. Alert Processing Time
Figure 6 shows the typical times that ANTARES takes to process alerts. We define this as the time that an alert spends in the ANTARES pipeline from beginning to end. These data were taken from 7 days of real-time processing of ZTF alerts. Because ANTARES is multi-process and multi-threaded, many alerts are in some stage of the pipeline at any given moment. The total throughput has been tested at 45 alerts/second as described above, but each alert spends multiple seconds in the pipeline. In Figure 6, we show that the median alert processing time is around 11 seconds and the maximum is typically between 30 and 60 seconds.
10.4. Opportunities for Performance Increases
As discussed above, every subsystem of ANTARES is scalable, including the Cassandra database and the Alert Pipeline. LSST throughput rates will be about 10x higher than the test result presented above. Also, LSST operations will demand continuous scaling over time because the size of the alert database will grow linearly. We expect to achieve the required scale by expanding the ANTARES server clusters. In addition to simply adding servers, we will continue to pursue performance improvements to the system. Cassandra is highly tunable and we will continue to investigate how best to configure it for our particular query paths.

In the event that technology or funding limits our ability to scale the system, we have contingency options. One option is to use Level 1 Filters more aggressively to filter out artifacts and other non-astrophysical alerts. Another is to discard the majority of input alert properties and keep only an essential subset. Depending on the content of the LSST alert packet, this could significantly decrease our database capacity needs while preserving our science use cases. Another option is to store all data, but only process a subset of Loci in real time according to some filtering. Other data could be stored and deferred for daytime processing (although the day only adds a factor of two, as night will fall again). We consider these options to be undesirable and we do not plan to implement them. We expect to achieve the required scale through adding nodes and by tuning Cassandra and the ANTARES pipeline.

Figure 6. Data from several days of operation showing the time that an alert spends in the ANTARES pipeline. Blue dots represent median values while red crosses show the maximum time spent in the pipeline for each of the operating windows plotted.
USER INTERFACE

The ANTARES user interface has three components: the Web Portal, the API and companion Client Library, and the Filter Devkit.
11.1. Web Portal
The "Portal" is a client-side web application that lets scientists submit filters and visually explore our database. The homepage (https://antares.noirlab.edu/) shows the most recent data that the system has processed, aggregate statistics about the ANTARES data holdings (see Figure 7), and details about particular loci (see Figure 8). Because manually constructing Elasticsearch queries (for the locus search engine) is difficult, we allow users to interactively build search queries for loci that meet certain criteria and then to export them for use in the ANTARES client (see Section 11.2). The Portal also provides an administrative interface for submitting, approving, and managing filters and watch lists in the pipeline.
11.2. Application Programming Interface and Client Library
ANTARES allows for interactive and programmatic data access through an HTTP API. This API exposes "resources," views into our database that generally align with the data models described in Section 3. For example, the request
GET /loci will receive as response a collection of "locus" resources that each have a light curve, a set of locus properties, etc. We chose to build the API in accordance with version 1.0 of the JavaScript Object Notation:API (JSON:API) specification (https://jsonapi.org/format/1.0/). This decision was motivated by a number of considerations. JSON:API supports partial fetching of resources (e.g., fetch just the light curve of a locus) and the ability to include related resources in a request (e.g., fetch a locus and its most recent alert). It also allows for hypermedia-driven interactions that improve discoverability of our data and simplify maintaining backwards compatibility as we continue development work. In short, it allowed us to build an API that supports a rich range of queries, minimizes network traffic, and leverages tools from an existing ecosystem of software for producing and consuming JSON:API-conformant data.
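As an illustration, such a request can be issued with any HTTP client. In the sketch below the base URL is a placeholder and the attribute names are illustrative, while page[limit] and fields[...] are standard JSON:API query parameters:

import requests

API_BASE = "https://api.antares.example/v1"  # placeholder base URL

# Fetch one page of loci, asking only for selected attributes
# (JSON:API "sparse fieldsets") to minimize network traffic.
response = requests.get(
    f"{API_BASE}/loci",
    params={"page[limit]": 10, "fields[locus]": "ra,dec,tags"},
    headers={"Accept": "application/vnd.api+json"},
)
response.raise_for_status()
for resource in response.json()["data"]:
    print(resource["id"], resource["attributes"])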
Figure 7. Web-based search interface for the ANTARES Portal. Users can interactively refine search result sets along a number of different dimensions, including properties of the object (its location and the number of measurements there), catalog associations, and annotations added to the locus by filters (tags). Users are also able to bookmark and make private annotations on loci.

We also provide a user-friendly Python library, the "ANTARES Client," that supports scientists in consuming streaming data through Kafka streams and in analyses of the ANTARES data holdings through the HTTP API. It also includes a command-line interface that enables common tasks such as running a daemon to save a local copy of all the data from a particular stream. A copy of this client library is hosted on the Python Package Index (https://pypi.org/project/antares-client/) so it can be installed with the Python package manager "pip." Currently, the interfaces provided in the client access resources through the API that don't require the user to be authenticated. We plan to expand these in the future so that users can programmatically access data such as objects of interest that they have bookmarked. Consuming data from Kafka streams requires credentials that the ANTARES team provides upon request. Documentation for installing and using the client library is provided as a standalone website (https://noao.gitlab.io/antares/client/).

The Client Library was used to discover the first R Coronae Borealis star from the ZTF public survey (Lee et al. 2020b). Candidates were preselected using color-color cuts, and their long-term light curves were loaded from the alert database. Upon further inspection of the light curves, we discovered ZTF18abhjrcf showing large (greater than 5 magnitudes) brightness variation over an extended period of time (about 100 days). This is similar to the behavior of known R Coronae Borealis stars. Further spectroscopic follow-up with the Las Cumbres Observatory telescope has confirmed its R Coronae Borealis classification.
Figure 8. Viewing the details of a locus on the ANTARES Portal. Users can see a plot of the light curve of the object, a table of observed values, links to external catalogs, the positions of detections relative to the locus center, and a thumbnail view of the region powered by Aladin Lite (Bonnarel et al. 2000; Boch & Fernique 2014).
11.3. Filter Devkit
The Devkit allows filter authors to develop and test their filter code using real data from the ANTARES database. It is designed to be used in NOIRLab's Data Lab Jupyter environment (https://datalab.noao.edu/), where it has direct read-only access to the ANTARES databases. Any user may sign up for Data Lab and use the Devkit. The Devkit allows users to fetch Locus data from the database and run filter code on this data in a Python environment that mimics the ANTARES production system, as demonstrated in the following example.

import antares.devkit as dk
dk.init()
Users can also construct custom Locus data to test their filters, either by modifying real data or constructing it completely from scratch. The Devkit documentation describes how to do this. Some filters require access to data files such as statistical models, neural networks, lookup tables, etc., as described in Section 6. These data files can be uploaded into ANTARES using the Devkit. Full documentation for the Devkit is provided as a standalone website. The Portal's FAQ page provides links to all documentation.
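For example, a filter under development can be run against a Locus fetched from the production database. The call names below (dk.get_locus, dk.run_filter) reflect our reading of the Devkit documentation and should be verified there; the Locus ID is illustrative:

import antares.devkit as dk
dk.init()

# Assuming the HelloWorld filter class from Section 6.2 is defined in
# this session, fetch a real Locus by ID and run the filter on it.
locus = dk.get_locus('ANT2020ho42c')       # Locus ID is illustrative
report = dk.run_filter(HelloWorld, locus)  # report summarizes tags/properties set
print(report)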
ANTARES IN THE ERA OF LSST

The data instrument we have described here is already functional at the scale of alerts produced by ZTF. The step to LSST scale will be at least an order of magnitude greater, if not two. We are confident that the design of the system will scale to match LSST, with only database tuning and hardware expansion necessary to accommodate the larger number of alerts. Specifically, we expect the Cassandra cluster to require significant expansion, both for capacity and for throughput. In addition to providing real-time reactivity to alerts, ANTARES will provide a database of all LSST alerts over the operational period, and is designed to allow future work on batch processing of this data set. ANTARES will be a general-use broker for the US and world community as it takes advantage of all the time-domain science opportunities that LSST will provide.

ACKNOWLEDGMENTS

The ANTARES team would like to thank the following individuals for their support and advice in the development of this software system: Tim Axelrod, Robert Blum, Todd Boroson, Glenn Eychaner, Mike Fitzpatrick, Michael Fox, Steve Howell, Tim Jenness, Tod Lauer, Nirav Merchant, Catherine Merrill, Robert Nikutta, Knut Olsen, Stephen Ridgway, Robert Seaman, Claire Taylor, Adam Thornton, Jackson Toeniskoetter, Alistair Walker, and John Wregglesworth. The ANTARES team gratefully acknowledges financial support from the National Science Foundation through a cooperative agreement with the Association of Universities for Research in Astronomy (AURA) for the operation of the NSF's National Optical-Infrared Astronomy Research Laboratory, through an NSF INSPIRE grant to the University of Arizona (CISE AST-1344024, PI: R. Snodgrass), and through a grant from the Heising-Simons Foundation (2018-0909, PI: T. Matheson).
Software: Python (Van Rossum & Drake 2009), Astropy (Astropy Collaboration et al. 2013), Pandas (McKinney 2010; The Pandas Development Team 2020), Kafka, Apache Cassandra, Kubernetes, MySQL, Slack, Elasticsearch, Redis, Lua, JSON

APPENDIX

A. EXAMPLE FILTER

This filter selects alerts that have a high signal-to-noise ratio (SNR), based on data intrinsic to the alert itself. Downstream filters can use this information to operate only on those alerts whose detection is more secure.
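Only a fragment of this filter survives in the extracted text. The sketch below reconstructs it to match the description in Section 6: declared inputs and outputs, an initial setup step, a simple computation, a conditional Tag, and a handled error case. Property names such as ztf_sigmapsf and the per-band thresholds are illustrative rather than a verbatim copy of the production filter.

import antares.devkit as dk

class HighSNR(dk.Filter):
    ERROR_SLACK_CHANNEL = ""  # Slack channel to notify on filter crashes (optional)
    INPUT_ALERT_PROPERTIES = [
        'passband',
        'ztf_sigmapsf',  # PSF-fit magnitude uncertainty from the ZTF alert
    ]
    OUTPUT_TAGS = [
        {
            'name': 'high_snr',
            'description': 'Alert has a high signal-to-noise ratio.',
        },
    ]

    def setup(self):
        """Runs once when the filter is loaded,
        eg: loading files, constructing datastructures, etc."""
        # Per-band SNR thresholds (values are illustrative).
        self.snr_threshold = {'g': 50.0, 'R': 55.0}

    def run(self, locus):
        alert = locus.alert  # the newly arrived Alert
        passband = alert.properties['passband']
        try:
            threshold = self.snr_threshold[passband]
        except KeyError:
            return  # unknown passband: handled error case, do nothing
        # Approximate SNR from the magnitude uncertainty.
        snr = 1.0 / alert.properties['ztf_sigmapsf']
        if snr > threshold:
            locus.tag('high_snr')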
REFERENCES