The VO: A powerful tool for global astronomy
Christophe Arviset, Mark Allen, Alessandra Aloisi, Bruce Berriman, Catherine Boisson, Baptiste Cecconi, David Ciardi, Janet Evans, Giuseppina Fabbiano, Francoise Genova, Tim Jenness, Bob Mann, Tom McGlynn, William O'Mullane, David Schade, Felix Stoehr, Andrea Zacchi
arXiv [astro-ph.IM]
ESAC Science Data Centre, ESA, Spain; [email protected]
CDS, Université de Strasbourg, CNRS, France
MAST, STScI, USA
IPAC, Caltech, USA
LUTH, Observatoire de Paris, CNRS, France
LESIA, Observatoire de Paris, CNRS, France
SAO, CXC, USA
LSST Project Management Office, Tucson, AZ, USA
Wide-Field Astronomy Unit, University of Edinburgh, UK
NASA, HEASARC, USA
Gaia, ESAC, ESA, Spain
CADC, Canada
ALMA, Germany
INAF-OATs, Euclid, Italy
Abstract.
Since its inception in the early 2000s, the Virtual Observatory (VO), developed as a collaboration of many national and international projects, has become a major factor in the discovery and dissemination of astronomical information worldwide. The International Virtual Observatory Alliance (IVOA) has been coordinating all these efforts worldwide to ensure a common VO framework that enables transparent access to and interoperability of astronomy resources (data and software) around the world.

The VO is not a magic solution to all astronomy data management challenges, but it does bring useful solutions in many areas, as borne out by the fact that VO interfaces are broadly found in astronomy’s major data centres and projects worldwide. Astronomy data centres have been building VO services on top of their existing data services to increase interoperability with other VO-compliant data resources and to take advantage of the continuous and increasing development of VO applications. VO applications have made multi-instrument and multi-wavelength science, a difficult and fruitful part of astronomy, somewhat easier.

More recently, several major new astronomy projects have been directly adopting VO standards to build their data management infrastructure, giving birth to ‘VO built-in’ archives. Embracing the VO framework from the beginning brings the double gain of not needing to reinvent the wheel and of ensuring interoperability with other astronomy VO resources from the start. Some of the IVOA standards are also starting to be used by neighbouring disciplines such as planetary sciences.

There is still quite a lot to be done on the VO, in particular tackling the upcoming big data challenge and finding interoperable solutions to the new data analysis paradigm of bringing and running the software close to the data.

We report on the current status and also wish to encourage others to adopt VO technology and to engage them in the effort of developing the VO. In this way, we wish to ensure that the VO standards fit the requirements and needs of new astronomy projects.
1. IVOA today and VO successes
The International Virtual Observatory Alliance was created in 2002 and today consists of twenty diverse member projects worldwide. Over the years, some new national projects have joined and some others have withdrawn (some due to lack of funding), but most of them have persisted. Such international collaboration is undoubtedly the IVOA’s greatest success: people from all over the planet have managed to agree on (sometimes with great difficulty) and implement astronomical data interoperability standards.

The two annual IVOA interoperability meetings continue to be well attended, with 70 to 100 participants organised in working groups (Applications, Data Access Layer, Data Model, Grid and Web Services, Registry and Semantics) and interest groups (Data Curation and Preservation, Education, Theory, Time Domain, Operations, Knowledge Discovery in Databases). The IVOA Executive Committee, with representatives from each VO project, makes decisions in a collaborative manner and coordinates and oversees all IVOA activities.

After the first years of general brainstorming and of getting the VO off the ground, the standards development process has now become very mature. Standards definition represents the main activity of the IVOA, and in 2010 an effort was made to define the IVOA architecture more clearly and carefully (Arviset et al. 2012). The architecture has remained very stable since then, with well-established interoperability standards for tables, images, spectra and registries. IVOA work has always been done in an open and collaborative environment, with shareable software and infrastructure components such as registry validators, TAP libraries, VOTable parsers, and data publishing software and tools (e.g. DaCHS).

On one side, many major astronomical data collections are now registered in and accessible through the VO, making them easily discoverable through VO portals (e.g. MAST, Datascope, ESASky).
This was usually done by adding a VO layer on top of existing data archives, enabling interoperability amongst them. More recently, the VO protocols have been used to actually build astronomical data management infrastructures (such as the CADC, SkyMapper and the Gaia archive), making these the first ‘VO built-in’ archives, which will benefit directly from interoperability with other VO-compliant archives.

On the other side, more and more VO applications now enable end users to more easily access and combine data that come from various data centres, regardless of how and where these products are stored. The two most notable examples are Aladin and Topcat, which have now become widely used by astronomers.

Although specific VO funding has decreased or has been shifted over recent years, the VO is still recognised as an important e-infrastructure. In Europe, the VO is included in the ASTRONET European Infrastructure Roadmap, and the new ASTERICS project aims to make data from ESFRI (European Strategy Forum on Research Infrastructures) projects and their pathfinders available for discovery. In the USA, VO support is provided by NASA data centres. The VO effort is becoming an integral part of data centre and laboratory data management resources, rather than a side project. It is also noteworthy that the VO standards and protocols are being re-used by other scientific disciplines, such as planetary science (EuroPlaNet), and for molecular and line databases (VAMDC).
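To illustrate the table standard mentioned in this section, the following is a minimal sketch of reading a VOTable in its TABLEDATA serialisation using only the Python standard library. The embedded document, its table name and its column values are hypothetical and hand-written for illustration; they do not come from any real service. The helper `read_votable` is likewise an assumption of this sketch, and it handles only the `double` and character datatypes used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written VOTable (not the output of any real service).
VOTABLE_XML = """<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE name="sources">
      <FIELD name="ra" datatype="double" unit="deg"/>
      <FIELD name="dec" datatype="double" unit="deg"/>
      <FIELD name="name" datatype="char" arraysize="*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>41.27</TD><TD>M31</TD></TR>
          <TR><TD>83.82</TD><TD>-5.39</TD><TD>M42</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

# Namespace of VOTable 1.3 documents, bound to the prefix "v" for lookups.
NS = {"v": "http://www.ivoa.net/xml/VOTable/v1.3"}


def read_votable(xml_text):
    """Return the rows of the first TABLE as a list of dicts keyed by FIELD name."""
    root = ET.fromstring(xml_text)
    table = root.find(".//v:TABLE", NS)
    fields = table.findall("v:FIELD", NS)
    names = [f.get("name") for f in fields]
    types = [f.get("datatype") for f in fields]

    rows = []
    for tr in table.findall("v:DATA/v:TABLEDATA/v:TR", NS):
        cells = [td.text for td in tr.findall("v:TD", NS)]
        row = {}
        for name, dtype, cell in zip(names, types, cells):
            # Only "double" is converted; every other datatype stays a string.
            row[name] = float(cell) if dtype == "double" else cell
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in read_votable(VOTABLE_XML):
        print(row)
```

In practice a dedicated VOTable parser, for example the `astropy.io.votable` package, would be preferable, since it also covers the binary serialisations, null values and the full set of datatypes that this sketch ignores.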
2. VO problem areas and challenges
Along with its successes, the VO is also experiencing some difficulties. Due to the decrease of direct VO funding in the UK, at ESO and more recently in the USA, the perception by some is that the VO is flickering out. Sometimes the VO is seen as a closed shop, whose standards are complex and difficult to implement. The initial ‘simple’ access protocols (Cone Search, Simple Image Access) were easier to implement, while more effort is required for the ‘richer’ VO services (synchronous and asynchronous Table Access Protocol (TAP) with table upload and cross-match functions, image cut-outs, multi-dimensional data, workflow building, storage of results on the server through VOSpace, etc.). These more sophisticated services are required to respond to the more complex data access and exploitation science cases.

On the other hand, the initial expectations of the VO, as they were sometimes stated, were probably unrealistic. There was a need at the time to get it off the ground with great vision and ideas, but the VO was probably oversold. Many people might have had a wrong perception of the VO. The VO is not an astronomy ‘killer’ application but rather a data management interoperability infrastructure. The IVOA’s role is to define the VO ecosystem and its interoperability standards, and it is up to the astronomy projects and data centres to build VO services and VO applications.

In the era of very large data and with the ever-growing need of the scientific community to connect datasets from different projects and archives, the VO can play a crucial role in helping the community access and utilise the data, but it still needs greater community take-up, in particular by new projects such as LSST. The IVOA needs to find a way to better engage large data centres and projects, capturing their requirements in the standards development process and trying to align their constraints and priorities with those of the IVOA (data cubes, time domain, big data, and bringing the software to the data).
Two VO implementation models can be envisaged: (1) a VO layer, supported by VO publishing tools, for small data centres with little IT expertise, and (2) VO built-in, supported by sets of VO software libraries, for bigger data centres and projects where more expertise usually resides. The VO is being used more and more by data centres to build their infrastructure, and this probably represents the future of the VO. Therefore, the IVOA needs to engage big projects and data centres as ‘participants’, not as ‘customers’.

With an increasing number of VO services and VO built-in infrastructures, the VO has become operational. Hence, the IVOA now needs to play a more active role in ensuring that the VO ecosystem is reliable and trustworthy. This will be coordinated through the recently created IVOA Operations Group, by curating the resources registered in the IVOA registries, monitoring their uptime and their compliance with standards, and encouraging the service providers to take appropriate measures to improve their services.

Measuring the success of the VO is another challenge for the IVOA. The VO is in use, but how can we see that? How do we define success (or failure) metrics? Counting the VO scientific publications probably won’t capture VO use, as scientists are unlikely to acknowledge the VO infrastructure that enables them to do new science. Some may not even know that the services and software they use are based on VO protocols. Nobody acknowledges the web, although all scientists use it! Another option would be to try to define usage statistics for VO services, but how can we uniformly collect and compare them? When astronomers use Topcat, MAST, Aladin, CADC services or ESASky, do they know they are using the VO? Do they need to know? Probably not. It could well be that the success of the VO is to be ‘invisible’!

The VO greatly facilitates data discovery and quick exploration of data from various data sources and data centres.
Interoperability between SAMP-compliant tools and archives allows data to be transferred easily to dedicated external applications for display and analysis. This mechanism is in place and works well for tables and images, but still needs to be improved for spectra, multi-dimensional data and time domain data. The VO cannot do the science for astronomers, but its goal is to make access to and use of data much easier, enabling new science that could not easily be done otherwise.
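To make the contrast between the ‘simple’ and the ‘richer’ protocols discussed earlier in this section more concrete, the sketch below builds the requests for a Simple Cone Search and for a synchronous TAP query without contacting any server. The parameter names (`RA`, `DEC`, `SR` for Cone Search; `REQUEST`, `LANG`, `QUERY` and the `/sync` endpoint for TAP) follow the IVOA standards. The base URLs, the table name `my_catalogue` and the two helper functions are hypothetical and serve only as illustration.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search GET URL (position and radius in decimal degrees)."""
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


def tap_sync_request(tap_base_url, adql):
    """Return (endpoint, form-encoded body) for a synchronous TAP query."""
    endpoint = tap_base_url.rstrip("/") + "/sync"
    form = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return endpoint, urlencode(form)


if __name__ == "__main__":
    # Hypothetical service addresses; no network access is performed here.
    print(cone_search_url("https://example.org/scs", 10.68, 41.27, 0.1))

    # The same cone expressed in ADQL against a hypothetical table.
    adql = (
        "SELECT TOP 10 ra, dec FROM my_catalogue "
        "WHERE 1=CONTAINS(POINT('ICRS', ra, dec), "
        "CIRCLE('ICRS', 10.68, 41.27, 0.1))"
    )
    endpoint, body = tap_sync_request("https://example.org/tap", adql)
    print(endpoint)
    print(body)
```

The Cone Search request is fully described by three numbers, whereas the TAP request carries an arbitrary ADQL query. A TAP service therefore has to parse and execute a query language and, for asynchronous use, manage jobs and stored results, which is one way to see why the richer services take more effort to implement.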
3. Conclusions