AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Archiving and Quality Control
Boliang He, Chenzhou Cui, Dongwei Fan, Changhua Li, Jian Xiao, Ce Yu, Chuanjun Wang, Zihuang Cao, Junyi Chen, Weimin Yi, Shanshan Li, Linying Mi, Sisi Yang
ASP Conference Series, Astronomical Society of the Pacific
National Astronomical Observatories, Chinese Academy of Sciences (CAS), 20A Datun Road, Beijing 100012, China
Tianjin University, 92 Weijin Road, Tianjin 300072, China
Yunnan Astronomical Observatory, CAS, P.O. Box 110, Kunming 650011, China
Abstract.
AstroCloud is a cyber-infrastructure for astronomy research initiated by the Chinese Virtual Observatory (China-VO) with funding support from the NDRC (National Development and Reform Commission) and CAS (Chinese Academy of Sciences) (Cui et al. 2014). To archive astronomical data in China, we present the implementation of the Astronomical Data Archiving System (ADAS). Data archiving and quality control form the infrastructure of AstroCloud. Throughout the entire data life cycle, the archiving system standardizes data, transfers data, logs observational data, archives ambient data, and stores these data and their metadata in a database. Quality control covers the whole process and all aspects of data archiving.
1. Introduction
There are tens of telescopes running in China. Every day and night they produce several terabytes of data. To archive and manage this volume of data, we present an implementation of an Astronomical Data Archiving System (ADAS). The data types to be archived are observation data and ambient data. Observation data, such as image FITS files, spectral FITS files and observation logs, are produced by the telescopes and the data reduction pipelines. Ambient data are environmental data, such as weather data, seeing data and all-sky camera images.

Archived data are first stored in the observatory data center and then transferred to the AstroCloud data center via ADAS. In AstroCloud, we built a Data Access API for users and programs to access the data. The following telescopes, located at multiple sites in China, already use this archiving system to archive their data: Guo Shoujing Telescope (LAMOST), Lijiang GMG 2.4m Telescope, Xinglong 2.16m Telescope, Delingha 50Bin Telescope, Huairou Solar Radio Telescope, Huairou Solar Multi-Channel Telescope and Fuxian 1m New Vacuum Solar Telescope (NVST).

http://astrocloud.china-vo.org

Figure 1. Data archive framework. Components shown: observation data (images, spectra, observation logs); ambient data (weather, seeing, all-sky camera); on-site data center; NAOC data center (AstroCloud data center); Data Access API.
2. Data Model
The raw data consist of files and tables. FITS files mainly contain the raw data; a FITS file can hold an image, a spectrum, etc. The tables are catalog tables, ambient data tables, observational logs, etc.

Metadata consists of two types:
• Schema Metadata: stores information on all the databases, schemas, tables and columns. The database schema is similar to the IVOA TAP schemas (IVO 2010).
• Archive Metadata: stores the FITS file header information. The required fields in the database schema are shown in Table 1. Usually, each telescope has one table in the archive database.
Table 1. Archive Metadata database-schema
Column Name   Definition                    Description
id            SERIAL                        Auto-increasing integer, primary key
filename      VARCHAR(30)                   FITS file name
object        VARCHAR(30)                   Observation object
RA            NUMERIC(12,8)                 Right ascension, default J2000
Dec           NUMERIC(12,8)                 Declination, default J2000
filesize      INTEGER                       File size (bytes)
checksum      VARCHAR(64)                   MD5 checksum
recTime       TIMESTAMP WITHOUT TIME ZONE   Recorded time
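The columns of Table 1 can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; the SERIAL type in Table 1 suggests a PostgreSQL-style database, and SERIAL is approximated here by an auto-incrementing integer key. The table name archive_example and the helper names are hypothetical and do not come from ADAS.

```python
import sqlite3

# Stand-in schema for Table 1. "archive_example" is a hypothetical table name;
# the paper only states that each telescope usually has one archive table.
DDL = """
CREATE TABLE IF NOT EXISTS archive_example (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- SERIAL in Table 1
    filename  VARCHAR(30),                        -- FITS file name
    object    VARCHAR(30),                        -- observation object
    ra        NUMERIC(12,8),                      -- right ascension, J2000
    dec       NUMERIC(12,8),                      -- declination, J2000
    filesize  INTEGER,                            -- file size in bytes
    checksum  VARCHAR(64),                        -- MD5 checksum (hex)
    rectime   TIMESTAMP                           -- recorded time
)
"""


def create_archive_table(conn):
    """Create the stand-in archive metadata table if it does not exist."""
    conn.execute(DDL)
    conn.commit()


def insert_record(conn, record):
    """Insert one metadata record (a dict keyed by column name); return its id."""
    cur = conn.execute(
        "INSERT INTO archive_example "
        "(filename, object, ra, dec, filesize, checksum, rectime) "
        "VALUES (:filename, :object, :ra, :dec, :filesize, :checksum, :rectime)",
        record,
    )
    conn.commit()
    return cur.lastrowid
```

Each archived FITS file would then correspond to one row, with the id assigned by the database.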
3. Archiving Software Architecture
Figure 2. Software Architecture

The system consists of four submodules (Laher et al. 2014):

1. Data Transfer System (DTS). Data are transferred over the network on a schedule. In the central data center at NAOC, we set up a Transfer Server to accept data transfers. We chose rsync to run this service because it is open source and performs very well (Zampieri et al. 2009).
2. Data Ingest System (DIS). DIS loads the data into the database. This procedure parses the FITS header and selects the necessary fields to record in the database. We use AstroPy (Ast 2014) to manipulate the FITS files, which makes it easy to collect the FITS file header (Dobrzycki et al. 2012).
3. Logging System (LGS). All operations are logged in the database. LGS is the procedure that logs each operation: data transfer, data ingest, database replication, etc.
4. Archive Backup System (BKS). BKS consists of file backup, database replication, and database backup. These operations are scheduled.
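To make the ingest step more concrete, the following is a minimal sketch of what "parse the FITS header and select the necessary fields" could look like. ADAS uses AstroPy for this; the sketch deliberately uses only the Python standard library and a simplified parser (80-character cards, no continuation cards or escaped quotes) so that it runs without extra packages. The function names and the keyword mapping FIELD_MAP are assumptions for illustration; real keyword names vary between telescopes.

```python
CARD = 80  # a FITS header is a sequence of 80-character cards


def _to_number(text):
    """Return int or float when the text parses as one, else the text itself."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_primary_header(raw: bytes) -> dict:
    """Simplified parse of 'KEYWORD = value / comment' cards up to END."""
    header = {}
    for offset in range(0, len(raw), CARD):
        card = raw[offset:offset + CARD].decode("ascii", errors="replace")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY and blank cards carry no value
        text = card[10:].strip()
        if text.startswith("'"):
            # Quoted string: take everything up to the closing quote.
            end = text.find("'", 1)
            header[keyword] = (text[1:end] if end != -1 else text[1:]).rstrip()
        else:
            # Unquoted value: drop the trailing '/ comment' part.
            header[keyword] = _to_number(text.split("/", 1)[0].strip())
    return header


# Hypothetical mapping from Table 1 columns to FITS keywords.
FIELD_MAP = {"object": "OBJECT", "ra": "RA", "dec": "DEC"}


def header_to_row(filename, header, filesize):
    """Select the fields to be recorded in the archive table."""
    row = {"filename": filename, "filesize": filesize}
    for column, keyword in FIELD_MAP.items():
        row[column] = header.get(keyword)  # None when the keyword is absent
    return row
```

With AstroPy available, the parsing function would normally be replaced by the library's own header reader; only the field selection step is specific to the archive table.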
4. Archiving Pipeline
The SkyTools (Sky 2014) replication procedure replicates the database to the Query Databases so that other users or systems, such as the Data Publish System (Fan et al. 2014), can access it.
5. Quality Control
Data quality is controlled throughout the data archiving process. In DTS, an MD5 checksum is computed for every file before transfer; after transfer, the transfer procedure validates the checksum. The database is checked and validated on a schedule.
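The checksum comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the ADAS code: the function names md5sum and verify_transfer are hypothetical, and the checksum recorded before transfer is assumed to be available as a hexadecimal string.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_checksum, received_path):
    """Return True when the received file matches the pre-transfer checksum."""
    return md5sum(received_path) == source_checksum.lower()
```

Reading in chunks keeps memory use constant for large FITS files. A file that fails the comparison would be a candidate for retransfer.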
6. Conclusions
We developed and implemented an astronomical data archiving system that operates automatically. When data are produced, the archiving procedure runs in the background. When the procedure finishes, the operator receives an email with the job details.
Acknowledgments.
This paper is funded by the National Natural Science Foundation of China (U1231108), the Ministry of Science and Technology of China (2012FY120500), and the Chinese Academy of Sciences (XXH12503-05-05). Data resources are supported by the Chinese Astronomical Data Center.
References

http://pgfoundry.org/projects/skytools

Cui, C., Yu, C., Xiao, J., He, B., Li, C., Fan, D., Wang, C., Hong, Z., Li, S., Mi, L., Wan, W., Cao, Z., Wang, J., Yin, S., Fan, Y., Wang, J., & Yang, S. 2014, in ADASS XXIV, edited by A. R. Taylor, & J. M. Stil (San Francisco: ASP), vol. TBD of ASP Conf. Ser., TBD

Dobrzycki, A., da Rocha, C., Vera, I., Vuong, M.-H., Bierwirth, T., Forchì, V., Fourniol, N., Moins, C., & Zampieri, S. 2012, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 8451 of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series

Fan, D., He, B., Xiao, J., Li, S., Li, C., Cui, C., Yu, C., Hong, Z., Yin, S., Wang, C., Cao, Z., Fan, Y., Mi, L., Wan, W., & Wang, J. 2014, in ADASS XXIV, edited by A. R. Taylor, & J. M. Stil (San Francisco: ASP), vol. TBD of ASP Conf. Ser., TBD

Laher, R. R., Surace, J., Grillmair, C. J., Ofek, E. O., Levitan, D., Sesar, B., van Eyken, J. C., Law, N. M., Helou, G., Hamam, N., Masci, F. J., Mattingly, S., Jackson, E., Hacopeans, E., Mi, W., Groom, S., Teplitz, H., Desai, V., Hale, D., Smith, R., Walters, R., Quimby, R., Kasliwal, M., Horesh, A., Bellm, E., Barlow, T., Waszczak, A., Prince, T. A., & Kulkarni, S. R. 2014, PASP, 126, 674

Zampieri, S., Forchi, V., Gebbinck, M. K., Moins, C., & Padovan, M. 2009, in Astronomical Data Analysis Software and Systems XVIII, edited by D. A. Bohlender, D. Durand, & P. Dowler, vol. 411 of Astronomical Society of the Pacific Conference Series, 540