Reproducing Scientific Experiment with Cloud DevOps
Department: Electronic Engineering
Editor: Name, xxxx@email
Feng Zhao
Tsinghua University
Xingzhi Niu
University of Washington, Tacoma
Shao-Lun Huang
Tsinghua University
Lin Zhang
Tsinghua University
Abstract—The reproducibility of scientific experiments is vital for the advancement of disciplines that build on previous work. To achieve this goal, many researchers focus on complex methodology and self-invented tools, which are difficult to use in practice. In this article, we introduce the DevOps infrastructure from the software engineering community and show how DevOps can be used effectively to reproduce experiments in computer science related disciplines. DevOps can be enabled using freely available cloud computing machines for medium-sized experiments and self-hosted computing engines for large-scale computing, thus empowering researchers to share their experimental results with others in a more reliable way.
INTRODUCTION
With the development of Big Data, scientific computing encompasses more disciplines and is much more complex than before. Cloud computing offers many convenient infrastructures which have proven useful in scientific experiment scenarios. For example, some highly parallel experiments can be done on cloud serverless platforms at affordable cost [1]. There are also other emerging domains related to scientific computational experiments. These experiments require more dedicated toolchains, specific workflows and expensive computational resources, which puts new challenges on experiment reproducibility.

To solve the reproducibility issue, there are three kinds of approaches: tools, platforms and methodology. Many tools [2] are provided which can capture the running environment information or store the experiment results. These tools are valuable but may suffer from bad maintainability and difficult configuration. Indeed, they are made by domain-specific scientists, not by experienced full-time software engineers. These tools can store the experiment results, which requires users to configure a database locally, but researchers may not be experienced with databases. Also, the local data is difficult to share. Furthermore, most of these tools are not programming language neutral, which means researchers cannot use them from other programming languages. Still, something is better than nothing if researchers use these tools to manage their experiments.

For the platform solution, traditionally containerization is used. In recent years, it has been shown that cloud computing is suitable for scientific research purposes [3]. Configuring a cloud environment from scratch is difficult for inexperienced researchers, and it is better to use a cloud service built specifically for research purposes. For example, there are Code Ocean and other commercial cloud systems [4], which can tackle the reproducibility problem to some extent. However, their free tiers are limited, and researchers are probably not willing to pay for extra computational resources. Budget limitation is an important factor when it comes to buying cloud computing resources.

Methodology, or best practice in reproducibility, usually discusses general principles [5] or combines tools and platforms to explore the best practice [6]. Generally speaking, methodology is hard to follow, as it tends to be idealized and researchers may not be familiar with, or have the ability to set up, the toolchain used.

All of the above three aspects have pros and cons for experiment reproducibility. The key is how to combine the three aspects to make the best use of their advantages. This is what DevOps tries to solve. This idea is not newly proposed. Boettiger gives a try using Docker containers for experiment reproducibility [7].
He also mentioned the DevOps philosophy and acknowledged its limitations. There are other research projects which borrow the ideas of DevOps to conduct sophisticated experiments [8]. These previous works are valuable, but they are limited to specific domains and local environments.

There are also dedicated systems on the cloud which try to solve domain-specific problems related to experiment reproducibility. Devops@mech, based on the DevOps methodology, is developed for a certain institute [9]. For public services, we have RAMP for the data science domain [10] and VCR for computational results indexing [11]. Though these services were available when the corresponding papers were written, they are unavailable now. Everest, which claims to simplify the use of clouds for scientific computing, is still available, but users are required to attach their own computing resources before actually using it [12]. Just like existing tools, these lab-made services suffer from bad maintenance.

From the above analysis, we see that previous combinations of DevOps with scientific experiments have some shortcomings. In this article, we propose the Cloud DevOps approach, which uses DevOps from a cloud service point of view. It has the following advantages, which are not completely present in previous approaches:

• High availability of the service and good maintenance of the infrastructure
• Ease of use and flexible configuration
• Unlimited usage and rich computing resources

In the following sections, we give an introduction to Cloud DevOps and show the feasibility of incorporating existing tools into Cloud DevOps. We then investigate the reproducibility problem with some proof-of-concept examples. These examples take advantage of Cloud DevOps while integrating with old toolchains. We believe Cloud DevOps can help researchers be more productive in their experiments and help others follow their research more easily. All too often, helping others actually helps yourself.
INFRASTRUCTURE
Originally, DevOps refers to the software engineering approach of automating the process of building and deploying software products, which is summarized by its two core components, "Continuous Integration and Continuous Deployment (CICD)" [13]. A DevOps service (server) can be self-hosted or centrally hosted. Either way, it requires some other computing machines (called agents or runners) to actually run the submitted jobs. Usually the jobs are not submitted by hand but triggered by an update of the code repository. A DevOps server is quite complex, and a self-hosted solution is not suitable for sharing results with others. Therefore it is preferred to use a public cloud DevOps service, which provides some free, time-unlimited computing power (cloud agents). Besides, a self-hosted computing agent (client) can be used if the publicly provided agents are not suitable to reproduce the experiment due to computing resource limitations. In this article, we only consider cloud-hosted DevOps services and call them Cloud DevOps for short.

Table 1. Comparison of Cloud DevOps providers (as of 2019)

             AppVeyor        Azure Pipelines  CircleCI  GitLab CICD   Travis
Platform     Windows, Linux  All              All       Linux docker  All
Parallel
Self-host    Y               Y                N         Y             N
Artifact     N               Y                Y         Y             N

There are some similarities between Cloud DevOps and the Everest infrastructure [12]. Both Cloud DevOps and Everest allow dynamic provisioning of computing resources from public cloud service providers and support computing agents attached by users. The computation can be triggered by the user at the click of a button via a web interface. However, Everest suffers from the problems mentioned in the last section. From the workflow management point of view, Cloud DevOps is similar to the Pegasus system [14]. While the latter is more suitable for large-scale distributed computing management, Cloud DevOps is scalable and covers the needs from small experiments to large-scale experiments as well.

There are many freely available Cloud DevOps service providers for open source projects, which greatly empowers individual developers and the open source community.
Table 1 compares several providers. A provider supports all platforms (Platform = All in Table 1) if it supports Windows, MacOS and Linux. Cross-platform support is an important topic in software engineering. In the scientific community, most research experiments can only be reproduced on a specific version of one operating system. This is understandable, since researchers may not have machines with other operating systems, or they have no time to make their code run on different platforms. A recent study found a flaw in a Python script of an article published in Nature which produces different results on different operating systems [15]. This incident could have been avoided if the researchers had tested their experiment code on different operating systems. Cloud DevOps provides easy configuration for different environments, and researchers are encouraged to test their code on different operating systems without learning too much new knowledge or spending too much time. At the least, researchers can choose the cloud environment most similar to their local development environment and make the experiment able to run on the cloud. At best, it is beneficial if newly developed algorithms and experiments can be run on more platforms.

Parallelism is a valuable capability of Cloud DevOps. In the software engineering community, it is often used to run different tests in parallel. Artifacts are build products which are ready to be deployed to other places. Some DevOps service providers give the opportunity to save artifacts permanently. For the scientific experiment scenario, independent experiments can be run in parallel jobs, and the results (like figures) can be saved automatically for each job and viewed by the public.

Cloud DevOps uses a configuration file to determine the running environment and workflow instructions. Usually the configuration file is written in YAML format. Different Cloud DevOps providers have different schemas in this format, but they all do the same thing. Below we give a short introduction of how to configure Cloud DevOps to run the experiment.
Choosing Environment for Agent
Users first choose the actual running environment of their code. Usually, it is a combination of the following items:

1) virtual machine or Docker container;
2) public cloud service or local runner;
3) programming language and version.
For example, on Travis users can get an Ubuntu 16.04 environment with Python 3.6 simply by requesting it in the following way:
Listing 1. Environment configuration

os: linux
dist: xenial
language: python
python: 3.6

In this configuration, we use the Linux virtual machine provided by the cloud service. We also fix the version of certain software. Such a shortcut makes installing dependencies in later workflow management much easier, as we do not need to install Python or other pre-installed software manually.

Besides virtual machines, many DevOps infrastructures support Docker containers as well, which provide a more flexible way to configure the environment. Generally speaking, virtualization is better than a bare-metal OS for experiment reproducibility [3]. Hence Cloud DevOps can do a good job by providing out-of-the-box virtual machines.

Usually Cloud DevOps is used in cooperation with a source code repository. The system diagram in Figure 1 shows how the DevOps server interacts with the agent and the code repository.

Figure 1. Interaction of the DevOps server with the agent and the code repository. The server fetches code from the repository, a repository update triggers a build on the agent, and the agent uploads logs and artifacts.

Attaching self-hosted machines as agents is also possible: by installing a client software, one can enjoy the advantages of Cloud DevOps without losing the computing ability of self-hosted servers.
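The Docker container option mentioned above could be pinned down in much the same way. As a hypothetical sketch in GitLab CICD syntax (the image tag is an assumption, not part of the original experiment):

```yaml
# Hypothetical GitLab CICD fragment: pin the environment to a Docker image
# instead of a provider-supplied virtual machine. The image tag is an assumption.
image: python:3.6-slim

before_script:
  - python --version   # record the exact interpreter version in the build log
```

Because the image tag names a specific version, every rerun starts from the same environment regardless of which runner picks up the job.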
Describe Workflow for Agent
In this step, users determine how to execute their code sequentially. The basic workflow can be summarized in Figure 2.

Figure 2. CICD pipeline illustration: Info, Install, Build, Test, Run, Report and Deploy. The steps within blue boxes are specific stages for scientific experiments.
The first few steps are common. We need to capture enough information about the running machine (Info) and install the necessary software dependencies (Install). Then we build our source code into a binary executable (Build) and run some tests to verify whether it works for simple cases (Test). In the software engineering community, DevOps ends with the deployment step. But for a scientific experiment, the story just begins after packing your algorithm into a reusable package. Therefore, we use blue boxes to emphasize the steps unique to scientific experiments in the Cloud DevOps infrastructure. After the tests, we begin to run the experiment (Run), and finally the results need to be collected and further processed to produce the artifacts (Report).

The Info step is done automatically by the DevOps server. For the other steps, shell scripts are used to tell the running machine how to install, build and run the code. Not all steps are necessary. For example, no Build step is needed for an interpreted programming language. Suppose a researcher writes his experiment code in the Python programming language; then he can write his workflow as follows:
Listing 2. Workflow description

install:
  - pip install -r requirements.txt
script:
  - python main.py

In the above workflow description, the Build, Test and Deploy steps are omitted. This is common for many researchers; often they do not test or deploy their code. That is acceptable as long as the experiment results are all right. Still, it is better to do some test and deployment tasks. Deployment makes it easier for other researchers to compare their results with your method, without copying your code into their own repository and modifying it to fit their needs.

The configuration of Cloud DevOps is transparent to all users, and its mechanism is totally determined by the configuration file and a specific version of the source code. Therefore, other researchers can trust the output logs and artifacts of DevOps as evidence of experiment reproducibility. Rerunning the code is very easy: just use the same service provider, and the code can be run under a different account after replicating the code repository. We acknowledge that this convenience is not applicable to self-hosted agents. For a self-hosted agent, the environment configuration part is usually not written in a file but determined by the type of agent attached. This makes reproducibility less easy. Still, the logs and artifacts are available to be examined by the public, since they are uploaded to the public Cloud DevOps server from the local agent. To make the story of experiment reproducibility complete, we encourage researchers to run a partial, small-scale version of the experiment on a public DevOps server and run the full experiment on a self-hosted server using the same code.
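As a hedged sketch of how the workflow above could be extended with a Test stage, parallel Run jobs and a Report stage in Travis syntax (the script names test_basic.py and make_figures.py are assumptions for illustration, not part of the original experiment):

```yaml
# Hypothetical extension of Listing 2. Travis expands the env list
# into one parallel job per entry; script names are assumptions.
install:
  - pip install -r requirements.txt
script:
  - python test_basic.py            # Test: sanity check on small cases
  - python main.py --case "$CASE"   # Run: the actual experiment
  - python make_figures.py          # Report: collect results into figures
env:
  - CASE=small
  - CASE=large
```

Each entry in the env list yields an independent job, so the two experiment cases run in parallel and each job's log can be inspected separately.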
CASE STUDIES
In the previous section, we briefly overviewed the common practice in DevOps and how it relates to scientific experiment reproducibility. Different domains may still face different problems in practice. In this section, we use experiments from the domains of graph computing and bioinformatics to show how Cloud DevOps can be used to solve reproducibility problems. We believe Cloud DevOps can be used in experiments of other domains as well.
Using public agent
Generally, if researchers develop a new algorithm for a specific domain, the workflow shown in Figure 2 can be further decomposed into two phases: an algorithm library build phase and an experiment running phase. The output of the first phase is the reusable library, which is one of the inputs to the second phase. Using DevOps in the first phase is nearly identical to how DevOps is used in the software community. The code can be tested against different environments, and the reusable library can be deployed to a publicly available package repository.

Following this two-phase philosophy, we consider a simple triangle counting algorithm and apply it to a considerably large graph. The code is available at https://github.com/zhaofeng-shu33/triangle_counting. In the first phase, we compile the code and deploy the package to an Ubuntu PPA. We also demonstrate that the code can be compiled and run successfully on Windows by using AppVeyor. In the second phase, we just install the deployed package and run the actual experiment on the agent provided by Travis, which has 2 CPU cores and 7.5 GB of memory. The log of this experiment can be checked publicly on Travis, which shows our program consumes 3.1 GB of memory at peak and finishes in 4.3 minutes.
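The two-phase decomposition above could be sketched, for instance, in GitLab CICD syntax (the job names, build command and library file name are assumptions for illustration):

```yaml
# Hypothetical two-phase pipeline: build the reusable library first,
# then run the experiment that consumes it. Names are assumptions.
stages:
  - build
  - experiment

build-library:
  stage: build
  script:
    - make                    # compile the algorithm library
  artifacts:
    paths:
      - libtriangle.so        # assumed library name, handed to the next stage

run-experiment:
  stage: experiment
  script:
    - python main.py          # uses the library built in the previous stage
```

Splitting the stages this way lets the library be rebuilt and tested independently of the (typically much longer) experiment run.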
Using self-hosted agent
Cloud DevOps public agents are useful for general purpose tasks but are not suitable for long-running experiments due to the time limitation on a single run. For this kind of experiment, a self-hosted agent should be used. Self-hosted agents include laptops, workstations, lab bare-metal servers, paid cloud virtual machines, etc. For our triangle counting experiment, we use a lab bare-metal server to run the larger experiment. Since we use OpenMP for algorithm-level parallelism, the multi-core CPUs on the server help a lot in accelerating the experiment. For this run we choose a larger dataset, which requires 18 GB of peak memory to process. We use GitLab as the DevOps service provider, and our self-hosted agent is a head node in an HPC cluster, which we use to submit jobs to the computing nodes. The relationship is illustrated in Figure 3. Using an agent to submit jobs has the extra advantage that the running logs are preserved in a continuous way without messing things up. Since the dependencies of the experiment can be compiled and prepared beforehand, generally only the blue-box workflow in Figure 2 is executed on the self-hosted agent. Going through the whole DevOps pipeline costs extra time but makes the experiment more reliable.
Figure 3. Using a self-hosted HPC cluster to connect to the DevOps server: a repository update triggers the build, the agent submits jobs to the computing grid, and logs and artifacts are uploaded back. The blue ellipse part is the self-hosted resources.
Each of our computing nodes has 56 CPUs and 256 GB of memory, and we need around 5 hours to run this experiment. The log of each run on the self-hosted agent is publicly available on GitLab.
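A minimal sketch of such a job, assuming a GitLab runner registered on the HPC head node under the tag hpc and a Slurm scheduler (both the tag and sbatch usage are assumptions about the cluster setup):

```yaml
# Hypothetical .gitlab-ci.yml job for a self-hosted HPC agent.
large-experiment:
  tags:
    - hpc                               # route the job to the self-hosted runner
  script:
    - sbatch --wait run_experiment.sh   # submit to computing nodes and block until done
  artifacts:
    paths:
      - results/                        # logs and figures uploaded back to GitLab
```

Because sbatch --wait blocks until the submitted job finishes, the CI job's exit status and log mirror the experiment run on the computing nodes.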
Incorporating other cloud infrastructure
Cloud DevOps is not exclusive and can incorporate other infrastructure as well. The experiment we use here is a sequence alignment algorithm applied to protein sequences. The code is available at https://github.com/zhaofeng-shu33/ssw_experiment. Originally the experiment was run on AWS Lambda, a serverless infrastructure provided by Amazon [16]. Serverless infrastructure allows many concurrent experiments to run in isolated environments, and we need a client to coordinate them. The client program is run on a public agent provided by Microsoft Azure. We first compile the experiment source code and deploy it to the AWS Lambda platform using Travis. Then the client program is run to invoke the Lambda functions and collect the experiment results. The overall experiment finishes in 2 minutes and produces publicly viewable logs on Azure Pipelines.

From this example we also notice that Cloud DevOps is not bound to a specific provider. We can use DevOps provided by Microsoft to trigger experiments deployed on Amazon.
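The client side of this cross-provider setup could be sketched in Azure Pipelines syntax as follows (the client script name is an assumption; the deployment to Lambda happens in a separate Travis configuration):

```yaml
# Hypothetical azure-pipelines.yml fragment: run the coordinating client
# that invokes the Lambda functions deployed earlier. Script name is assumed.
steps:
  - script: pip install -r requirements.txt
    displayName: Install dependencies
  - script: python client.py
    displayName: Invoke Lambda functions and collect results
```

The public Azure agent only needs to run the lightweight client; the heavy, highly parallel work happens inside the isolated Lambda invocations.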
CONCLUSION
DevOps infrastructure is actively maintained by the software engineering community and evolves towards better usability. It will be beneficial for scientists if they can incorporate DevOps into their daily research. Researchers can run medium-sized or partial versions of their experiments directly on a public DevOps service and complete their full experiments using self-hosted agents. Currently, the accessibility of DevOps to the scientific community beyond scientific software development is unknown. Since DevOps is easily configurable and compatible with existing tools, we believe it will spread to more disciplines in the future.
ACKNOWLEDGMENT
This work is supported by the Natural Science Foundation of China (61807021), the Shenzhen Science and Technology Research and Development Funds (JCYJ20170818094022586), and the Innovation and Entrepreneurship Project for Overseas High-Level Talents of Shenzhen (KQJSCX20180327144037831).
REFERENCES
1. E. Jonas, J. Schleier-Smith, V. Sreekanti, C. Tsai, A. Khandelwal, Q. Pu, V. Shankar, J. Carreira, K. Krauth, N. J. Yadwadkar, J. E. Gonzalez, R. A. Popa, I. Stoica, and D. A. Patterson, "Cloud programming simplified: A Berkeley view on serverless computing," CoRR, vol. abs/1902.03383, 2019. [Online]. Available: http://arxiv.org/abs/1902.03383
2. K. Greff, A. Klein, M. Chovanec, F. Hutter, and J. Schmidhuber, "The Sacred infrastructure for computational research," in Proceedings of the Python in Science Conferences (SciPy), 2017.
3. B. Howe, "Virtual appliances, cloud computing, and reproducible research," Comput. Sci. Eng., vol. 14, no. 4, pp. 36–41, Jul. 2012.
4. J. M. Perkel, "Data visualization tools drive interactivity and reproducibility in online publishing," Nature, vol. 554, no. 7690, pp. 133–134, Feb. 2018.
5. V. Stodden and S. Miguez, "Best practices for computational science: Software infrastructure and environments for reproducible and extensible research," Journal of Open Research Software, vol. 2, no. 1, Jul. 2014.
6. R. Qasha, J. Cala, and P. Watson, "A framework for scientific workflow reproducibility in the cloud," in eScience. IEEE Computer Society, 2016, pp. 81–90.
7. C. Boettiger, "An introduction to Docker for reproducible research," SIGOPS Oper. Syst. Rev., vol. 49, no. 1, pp. 71–79, Jan. 2015.
8. M. Chwalisz, K. Geissdoerfer, and A. Wolisz, "Walker: DevOps inspired workflow for experimentation," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, Apr. 2019, pp. 277–282.
9. J. Philips and H. Bruyninckx, "Devops@mech: a cloud infrastructure for reproducible research," in International Conference on Robotics and Automation, Montreal, Canada, 2019.
10. B. Kégl, A. Boucaud, M. Cherti, A. Kazakci, A. Gramfort, G. Lemaitre, J. Van den Bossche, D. Benbouzid, and C. Marini, "The RAMP framework: from reproducibility to transparency in the design and optimization of scientific workflows," in ICML 2018 - RML Workshop, July 2018.
11. M. Gavish and D. L. Donoho, "Three dream applications of verifiable computational results," Comput. Sci. Eng., vol. 14, no. 4, pp. 26–31, Jul. 2012.
12. S. Volkov and O. Sukhoroslov, "Simplifying the use of clouds for scientific computing with Everest," Procedia Computer Science, vol. 119, pp. 112–120, 2017.
13. L. Bass, I. Weber, and L. Zhu, DevOps: A Software Architect's Perspective. Addison-Wesley Professional, 2015.
14. E. Deelman, K. Vahi, M. Rynge, R. Mayani, R. F. da Silva, G. Papadimitriou, and M. Livny, "The evolution of the Pegasus workflow management software," Comput. Sci. Eng., vol. 21, no. 4, pp. 22–36, Jul. 2019.
15. J. Bhandari Neupane, R. P. Neupane, Y. Luo, W. Y. Yoshida, R. Sun, and P. G. Williams, "Characterization of leptazolines A–D, polar oxazolines from the cyanobacterium Leptolyngbya sp., reveals a glitch with the Willoughby–Hoye scripts for calculating NMR chemical shifts," Organic Letters, 2019.
16. X. Niu, D. Kumanov, L.-H. Hung, W. Lloyd, and K. Y. Yeung, "Leveraging serverless computing to improve performance for sequence comparison," in Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. ACM, 2019, pp. 683–687.
Feng Zhao is currently with Tsinghua University, P.R. China. He received the B.S. degree and is pursuing a Ph.D. degree at the Department of Electronic Engineering. His research interests focus on machine learning, graph computing and scientific computing. Contact him at [email protected].

Xingzhi Niu is a Master's student in computer science at the University of Washington, Tacoma, Tacoma, WA. Contact him at [email protected].

Shao-Lun Huang is a Professor with the Tsinghua-Berkeley Shenzhen Institute. Contact him at [email protected].

Lin Zhang is a Professor with the Tsinghua Shenzhen International Graduate School. Contact him at [email protected].