Reproducing Scientific Experiment with Cloud DevOps
Department: Electronic Engineering
Editor: Name, xxxx@email
Feng Zhao
Tsinghua University
Xingzhi Niu
University of Washington, Tacoma
Shao-Lun Huang
Tsinghua University
Lin Zhang
Tsinghua University
Abstract—The reproducibility of scientific experiments is vital for the advancement of disciplines that build on previous work. To achieve this goal, many researchers focus on complex methodology and self-invented tools, which are difficult to use in practice. In this article, we introduce the DevOps infrastructure from the software engineering community and show how DevOps can be used effectively to reproduce experiments in computer science related disciplines. DevOps can be enabled using freely available cloud computing machines for medium-sized experiments and self-hosted computing engines for large-scale computing, thus empowering researchers to share their experimental results with others in a more reliable way.
INTRODUCTION
With the development of Big Data, scientific computing encompasses more disciplines and is much more complex than before. Cloud computing offers many convenient infrastructures which have proven useful in scientific experiment scenarios. For example, some highly parallel experiments can be done on cloud serverless platforms at affordable cost [1]. There are also other emerging domains related to scientific computational experiments. These experiments require more dedicated toolchains, specific workflows and expensive computational resources, which puts new challenges on experiment reproducibility.

To solve the reproducibility issue, there are three kinds of approaches: tools, platforms and methodology. Many tools [2] are provided which can capture the running environment information or store the experiment results. These tools are valuable but may suffer from bad maintainability and difficult configuration. Indeed, they are made by domain-specific scientists, not by experienced full-time software engineers. These tools can store the experiment results, which requires users to configure a database locally, but researchers may not be experienced with databases. Also, the local data is difficult to share. Furthermore, most of these tools are not programming language neutral, which means researchers cannot use them from other programming languages. Still, something is better than nothing if researchers use these tools to manage their experiments.

For the platform solution, traditionally containerization is used. In recent years, it has been shown that cloud computing is suitable for scientific research purposes [3]. Configuring a cloud environment from scratch is difficult for inexperienced researchers, and it is better to use a cloud service built specifically for research purposes. For example, there are Code Ocean and other commercial cloud systems [4], which can tackle the reproducibility problem to some extent. However, their free tiers are limited, and researchers are probably not willing to pay for extra computational resources. Budget limitation is an important factor when it comes to buying cloud computing resources.

Methodology, or best practice in reproducibility, usually discusses general principles [5] or combines tools and platforms to explore the best practice [6]. Generally speaking, methodology is hard to follow, as it tends to be idealized and researchers may not be familiar with, or have the ability to set up, the toolchain used.

All of the above three aspects have pros and cons for experiment reproducibility. The key is how to combine the three aspects to make the best use of their advantages. This is what DevOps tries to solve. This idea is not newly proposed. Boettiger gives a try using Docker containers for experiment reproducibility [7].
He also mentioned the DevOps philosophy and acknowledged its limitations. There are other research projects which borrow the ideas of DevOps to conduct sophisticated experiments [8]. These previous works are valuable, but they are limited to specific domains and local environments.

There are also dedicated systems on the cloud which try to solve domain-specific problems related to experiment reproducibility. Devops@mech, based on the DevOps methodology, is developed for a certain institute [9]. For public services, we have RAMP for the data science domain [10] and VCR for computational results indexing [11]. Though these services were available when the corresponding papers were written, they are unavailable now. Everest, which claims to simplify the use of clouds for scientific computing, is still available, but users are required to attach their own computing resources before actually using it [12]. Just like existing tools, these lab-made services suffer from bad maintenance.

From the above analysis, we see that previous combinations of DevOps with scientific experiments have some shortcomings. In this article, we propose the Cloud DevOps approach, which uses DevOps from a cloud service point of view. It has the following advantages, which are not completely present in previous approaches:

• High availability of the service and good maintenance of the infrastructure
• Ease of use and flexible configuration
• Unlimited usage and rich computing resources

In the following sections, we give an introduction to Cloud DevOps and show the feasibility of incorporating existing tools into Cloud DevOps. We then investigate the reproducibility problem with some proof-of-concept examples. These examples take advantage of Cloud DevOps while integrating with old toolchains. We believe Cloud DevOps can help researchers be more productive in their experiments and help others follow their research more easily. All too often, helping others actually helps yourself.
INFRASTRUCTURE
Originally, DevOps refers to the software engineering approach of automating the process of building and deploying software products, which is summarized by its two core components, "Continuous Integration and Continuous Deployment (CICD)" [13]. A DevOps service (server) can be self-hosted or centrally hosted. Either way, it requires some other computing machines (called agents or runners) to actually run the submitted jobs. Usually the jobs are not submitted by hand but triggered by an update of the code repository. A DevOps server is quite complex, and a self-hosted solution is not suitable for sharing results with others. Therefore it is preferred to use a public cloud DevOps service, which provides some free, time-unlimited computing power (cloud agents). Besides, a self-hosted computing agent (client) can be used if the publicly provided agents are not suitable to reproduce the experiment due to computing resource limitations. In this article, we only consider cloud-hosted DevOps services and call them Cloud DevOps for short.

Table 1. Comparison of Cloud DevOps providers (as of 2019)

             AppVeyor        Azure Pipelines  CircleCI  GitLab CICD   Travis
Platform     Windows, Linux  All              All       Linux docker  All
Parallel
Self-host    Y               Y                N         Y             N
Artifact     N               Y                Y         Y             N

There are some similarities between Cloud DevOps and the Everest infrastructure [12]. Both Cloud DevOps and Everest allow dynamic provisioning of computing resources from public cloud service providers and support computing agents attached by users. The computation can be triggered by the user at the click of a button via a web interface. However, Everest suffers from the problems mentioned in the last section. From the workflow management point of view, Cloud DevOps is similar to the Pegasus system [14]. While the latter is more suitable for large-scale distributed computing management, Cloud DevOps is scalable and covers the needs from small experiments to large-scale experiments as well.

There are many freely available Cloud DevOps service providers for open source projects, which greatly empowers individual developers and the open source community.
Table 1 compares several providers. A provider supports all platforms (Platform = All in Table 1) if it supports Windows, MacOS and Linux. Cross-platform support is an important topic in software engineering. In the scientific community, most research experiments can only be reproduced on a specific version of one operating system. This is understandable, since researchers may not have machines with other operating systems, or they have no time to make their code run on different platforms. A recent study found a flaw in a Python script of an article published in Nature which produces different results on different operating systems [15]. This incident could have been avoided if the researchers had tested their experiment code on different operating systems. Cloud DevOps provides easy configuration for different environments, and researchers are encouraged to test their code on different operating systems without learning too much new knowledge or spending too much time. At the least, researchers can choose the cloud environment most similar to their local development environment and make the experiment able to run on the cloud. At best, it is beneficial if newly developed algorithms and experiments can be run on more platforms.

Parallelism is a valuable capability of Cloud DevOps. In the software engineering community, it is often used to run different tests in parallel. Artifacts are build products which are ready to be deployed to other places. Some DevOps service providers give the opportunity to save artifacts permanently. For the scientific experiment scenario, independent experiments can be run in parallel jobs, and the results (like figures) can be saved automatically for each job and viewed by the public.

Cloud DevOps uses a configuration file to determine the running environment and workflow instructions. Usually the configuration file is written in YAML format. Different Cloud DevOps providers have different schemas in this format, but they all do the same thing. Below we give a short introduction of how to configure Cloud DevOps to run the experiment.
Choosing Environment for Agent
Users first choose the actual running environment of their code. Usually, it is a combination of the following items:

1) virtual machine or Docker container;
2) public cloud service or local runner;
3) programming language and version.
For example, on Travis users can get an Ubuntu 16.04 environment with Python 3.6 simply by requesting it in the following way:
Listing 1. Environment configuration

os: linux
dist: xenial
language: python
python: 3.6

In this configuration, we use the Linux virtual machine provided by the cloud service. We also fix the version of certain software. Such a shortcut makes installing dependencies in later workflow management much easier, as we do not need to install Python or other pre-installed software manually.

Besides virtual machines, many DevOps infrastructures support Docker containers as well, which provide a more flexible way to configure the environment. Generally speaking, virtualization is better than a bare-metal OS for experiment reproducibility [3]. Hence Cloud DevOps can do a good job by providing out-of-the-box virtual machines.

Usually Cloud DevOps is used in cooperation with a source code repository. The system diagram in Figure 1 shows how the DevOps server interacts with the agent and the code repository.

Figure 1. Interaction of the DevOps server with the agent and the code repository. The server fetches code from the repository, a repository update triggers a build on the agent, and the agent uploads logs and artifacts.

Attaching self-hosted machines as agents is also possible: by installing a client software, one can enjoy the advantages of Cloud DevOps without losing the computing ability of self-hosted servers.
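The Docker container option mentioned above could be pinned down in much the same way. As a hypothetical sketch in GitLab CICD syntax (the image tag is an assumption, not part of the original experiment):

```yaml
# Hypothetical GitLab CICD fragment: pin the environment to a Docker image
# instead of a provider-supplied virtual machine. The image tag is an assumption.
image: python:3.6-slim

before_script:
  - python --version   # record the exact interpreter version in the build log
```

Because the image tag names a specific version, every rerun starts from the same environment regardless of which runner picks up the job.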
Describe Workflow for Agent
In this step, users determine how to execute their code sequentially. The basic workflow can be summarized in Figure 2.

Figure 2. CICD pipeline illustration: Info, Install, Build, Test, Run, Report and Deploy. The steps within blue boxes are specific stages for scientific experiments.
The first few steps are common. We need to capture enough information about the running machine (Info) and install the necessary software dependencies (Install). Then we build our source code into a binary executable (Build) and run some tests to verify whether it works for simple cases (Test). In the software engineering community, DevOps ends with the deployment step. But for a scientific experiment, the story just begins after packing your algorithm into a reusable package. Therefore, we use blue boxes to emphasize the steps unique to scientific experiments in the Cloud DevOps infrastructure. After the tests, we begin to run the experiment (Run), and finally the results need to be collected and further processed to produce the artifacts (Report).

The Info step is done automatically by the DevOps server. For the other steps, shell scripts are used to tell the running machine how to install, build and run the code. Not all steps are necessary. For example, no Build step is needed for an interpreted programming language. Suppose a researcher writes his experiment code in the Python programming language; then he can write his workflow as follows:
Listing 2. Workflow description

install:
  - pip install -r requirements.txt
script:
  - python main.py

In the above workflow description, the Build, Test and Deploy steps are omitted. This is common for many researchers; often they do not test or deploy their code. That is acceptable as long as the experiment results are all right. Still, it is better to do some test and deployment tasks. Deployment makes it easier for other researchers to compare their results with your method, without copying your code into their own repository and modifying it to fit their needs.

The configuration of Cloud DevOps is transparent to all users, and its mechanism is totally determined by the configuration file and a specific version of the source code. Therefore, other researchers can trust the output logs and artifacts of DevOps as evidence of experiment reproducibility. Rerunning the code is very easy: just use the same service provider, and the code can be run under a different account after replicating the code repository. We acknowledge that this convenience is not applicable to self-hosted agents. For a self-hosted agent, the environment configuration part is usually not written in a file but determined by the type of agent attached. This makes reproducibility less easy. Still, the logs and artifacts are available to be examined by the public, since they are uploaded to the public Cloud DevOps server from the local agent. To make the story of experiment reproducibility complete, we encourage researchers to run a partial, small-scale version of the experiment on a public DevOps server and run the full experiment on a self-hosted server using the same code.
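As a hedged sketch of how the workflow above could be extended with a Test stage, parallel Run jobs and a Report stage in Travis syntax (the script names test_basic.py and make_figures.py are assumptions for illustration, not part of the original experiment):

```yaml
# Hypothetical extension of Listing 2. Travis expands the env list
# into one parallel job per entry; script names are assumptions.
install:
  - pip install -r requirements.txt
script:
  - python test_basic.py            # Test: sanity check on small cases
  - python main.py --case "$CASE"   # Run: the actual experiment
  - python make_figures.py          # Report: collect results into figures
env:
  - CASE=small
  - CASE=large
```

Each entry in the env list yields an independent job, so the two experiment cases run in parallel and each job's log can be inspected separately.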
CASE STUDIES
In the previous section, we briefly overviewed the common practice in DevOps and how it relates to scientific experiment reproducibility. Different domains may still face different problems in practice. In this section, we use experiments from the domains of graph computing and bioinformatics to show how Cloud DevOps can be used to solve reproducibility problems. We believe Cloud DevOps can be used in experiments of other domains as well.
Using public agent
Generally, if researchers develop a new algorithm for a specific domain, the workflow shown in Figure 2 can be further decomposed into two phases: an algorithm library build phase and an experiment running phase. The output of the first phase is the reusable library, which is one of the inputs to the second phase. Using DevOps in the first phase is nearly identical to how DevOps is used in the software community. The code can be tested against different environments, and the reusable library can be deployed to a publicly available package repository.

Following this two-phase philosophy, we consider a simple triangle counting algorithm and apply it to a considerably large graph. The code is available at https://github.com/zhaofeng-shu33/triangle_counting. In the first phase, we compile the code and deploy the package to an Ubuntu PPA. We also demonstrate that the code can be compiled and run successfully on Windows by using AppVeyor. In the second phase, we just install the deployed package and run the actual experiment on the agent provided by Travis, which has 2 CPU cores and 7.5 GB of memory. The log of this experiment can be checked publicly on Travis, which shows our program consumes 3.1 GB of memory at peak and finishes in 4.3 minutes.
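The two-phase decomposition above could be sketched, for instance, in GitLab CICD syntax (the job names, build command and library file name are assumptions for illustration):

```yaml
# Hypothetical two-phase pipeline: build the reusable library first,
# then run the experiment that consumes it. Names are assumptions.
stages:
  - build
  - experiment

build-library:
  stage: build
  script:
    - make                    # compile the algorithm library
  artifacts:
    paths:
      - libtriangle.so        # assumed library name, handed to the next stage

run-experiment:
  stage: experiment
  script:
    - python main.py          # uses the library built in the previous stage
```

Splitting the stages this way lets the library be rebuilt and tested independently of the (typically much longer) experiment run.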
Using self-hosted agent
Cloud DevOps public agents are useful for general purpose tasks but are not suitable for long-running experiments due to the time limitation on a single run. For this kind of experiment, a self-hosted agent should be used. Self-hosted agents include laptops, workstations, lab bare-metal servers, paid cloud virtual machines, etc. For our triangle counting experiment, we use a lab bare-metal server to run the larger experiment. Since we use OpenMP for algorithm-level parallelism, the multi-core CPUs on the server help a lot in accelerating the experiment. For this run we choose a larger dataset, which requires 18 GB of peak memory to process. We use GitLab as the DevOps service provider, and our self-hosted agent is a head node in an HPC cluster, which we use to submit jobs to the computing nodes. The relationship is illustrated in Figure 3. Using an agent to submit jobs has the extra advantage that the running logs are preserved in a continuous way without messing things up. Since the dependencies of the experiment can be compiled and prepared beforehand, generally only the blue-box workflow in Figure 2 is executed on the self-hosted agent. Going through the whole DevOps pipeline costs extra time but makes the experiment more reliable.
Figure 3. Using a self-hosted HPC cluster to connect to the DevOps server: a repository update triggers the build, the agent submits jobs to the computing grid, and logs and artifacts are uploaded back. The blue ellipse part is the self-hosted resources.
Each of our computing nodes has 56 CPUs and 256 GB of memory, and we need around 5 hours to run this experiment. The log of each run on the self-hosted agent is publicly available on GitLab.
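A minimal sketch of such a job, assuming a GitLab runner registered on the HPC head node under the tag hpc and a Slurm scheduler (both the tag and sbatch usage are assumptions about the cluster setup):

```yaml
# Hypothetical .gitlab-ci.yml job for a self-hosted HPC agent.
large-experiment:
  tags:
    - hpc                               # route the job to the self-hosted runner
  script:
    - sbatch --wait run_experiment.sh   # submit to computing nodes and block until done
  artifacts:
    paths:
      - results/                        # logs and figures uploaded back to GitLab
```

Because sbatch --wait blocks until the submitted job finishes, the CI job's exit status and log mirror the experiment run on the computing nodes.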
Incorporating other cloud infrastructure
Cloud DevOps is not exclusive and can incorporate other infrastructure as well. The experiment we use here is a sequence alignment algorithm applied to protein sequences. The code is available at https://github.com/zhaofeng-shu33/ssw_experiment. Originally the experiment was run on AWS Lambda, a serverless infrastructure provided by Amazon [16]. Serverless infrastructure allows many concurrent experiments to run in isolated environments, and we need a client to coordinate them. The client program is run on a public agent provided by Microsoft Azure. We first compile the experiment source code and deploy it to the AWS Lambda platform using Travis. Then the client program is run to invoke the Lambda functions and collect the experiment results. The overall experiment finishes in 2 minutes and produces publicly viewable logs on Azure Pipelines.

From this example we also notice that Cloud DevOps is not bound to a specific provider. We can use DevOps provided by Microsoft to trigger experiments deployed on Amazon.
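The client side of this cross-provider setup could be sketched in Azure Pipelines syntax as follows (the client script name is an assumption; the deployment to Lambda happens in a separate Travis configuration):

```yaml
# Hypothetical azure-pipelines.yml fragment: run the coordinating client
# that invokes the Lambda functions deployed earlier. Script name is assumed.
steps:
  - script: pip install -r requirements.txt
    displayName: Install dependencies
  - script: python client.py
    displayName: Invoke Lambda functions and collect results
```

The public Azure agent only needs to run the lightweight client; the heavy, highly parallel work happens inside the isolated Lambda invocations.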
CONCLUSION
DevOps infrastructure is actively maintained by the software engineering community and evolves towards better usability. It will be beneficial for scientists if they can incorporate DevOps into their daily research. Researchers can run medium-sized or partial versions of their experiments directly on a public DevOps service and complete their full experiments using self-hosted agents. Currently, the accessibility of DevOps to the scientific community beyond scientific software development is unknown. Since DevOps is easily configurable and compatible with existing tools, we believe it will spread to more disciplines in the future.
ACKNOWLEDGMENT
This work is supported by the Natural Science Foundation of China (61807021), the Shenzhen Science and Technology Research and Development Funds (JCYJ20170818094022586), and the Innovation and Entrepreneurship Project for Overseas High-Level Talents of Shenzhen (KQJSCX20180327144037831).
REFERENCES
1. E. Jonas, J. Schleier-Smith, V. Sreekanti, C. Tsai, A. Khandelwal, Q. Pu, V. Shankar, J. Carreira, K. Krauth, N. J. Yadwadkar, J. E. Gonzalez, R. A. Popa, I. Stoica, and D. A. Patterson, "Cloud programming simplified: A Berkeley view on serverless computing," CoRR, vol. abs/1902.03383, 2019. [Online]. Available: http://arxiv.org/abs/1902.03383
2. K. Greff, A. Klein, M. Chovanec, F. Hutter, and J. Schmidhuber, "The Sacred infrastructure for computational research," in Proceedings of the Python in Science Conferences (SciPy), 2017.
3. B. Howe, "Virtual appliances, cloud computing, and reproducible research," Comput. Sci. Eng., vol. 14, no. 4, pp. 36–41, Jul. 2012.
4. J. M. Perkel, "Data visualization tools drive interactivity and reproducibility in online publishing," Nature, vol. 554, no. 7690, pp. 133–134, Feb. 2018.
5. V. Stodden and S. Miguez, "Best practices for computational science: Software infrastructure and environments for reproducible and extensible research," Journal of Open Research Software, vol. 2, no. 1, Jul. 2014.
6. R. Qasha, J. Cala, and P. Watson, "A framework for scientific workflow reproducibility in the cloud," in eScience. IEEE Computer Society, 2016, pp. 81–90.
7. C. Boettiger, "An introduction to Docker for reproducible research," SIGOPS Oper. Syst. Rev., vol. 49, no. 1, pp. 71–79, Jan. 2015.
8. M. Chwalisz, K. Geissdoerfer, and A. Wolisz, "Walker: DevOps inspired workflow for experimentation," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, Apr. 2019, pp. 277–282.
9. J. Philips and H. Bruyninckx, "Devops@mech: a cloud infrastructure for reproducible research," in International Conference on Robotics and Automation, Montreal, Canada, 2019.
10. B. Kégl, A. Boucaud, M. Cherti, A. Kazakci, A. Gramfort, G. Lemaitre, J. Van den Bossche, D. Benbouzid, and C. Marini, "The RAMP framework: from reproducibility to transparency in the design and optimization of scientific workflows," in ICML 2018 - RML Workshop, July 2018.
11. M. Gavish and D. L. Donoho, "Three dream applications of verifiable computational results," Comput. Sci. Eng., vol. 14, no. 4, pp. 26–31, Jul. 2012.
12. S. Volkov and O. Sukhoroslov, "Simplifying the use of clouds for scientific computing with Everest," Procedia Computer Science, vol. 119, pp. 112–120, 2017.
13. L. Bass, I. Weber, and L. Zhu, DevOps: A Software Architect's Perspective. Addison-Wesley Professional, 2015.
14. E. Deelman, K. Vahi, M. Rynge, R. Mayani, R. F. da Silva, G. Papadimitriou, and M. Livny, "The evolution of the Pegasus workflow management software," Comput. Sci. Eng., vol. 21, no. 4, pp. 22–36, Jul. 2019.
15. J. Bhandari Neupane, R. P. Neupane, Y. Luo, W. Y. Yoshida, R. Sun, and P. G. Williams, "Characterization of leptazolines A–D, polar oxazolines from the cyanobacterium Leptolyngbya sp., reveals a glitch with the Willoughby–Hoye scripts for calculating NMR chemical shifts," Organic Letters, 2019.
16. X. Niu, D. Kumanov, L.-H. Hung, W. Lloyd, and K. Y. Yeung, "Leveraging serverless computing to improve performance for sequence comparison," in Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. ACM, 2019, pp. 683–687.
Feng Zhao is currently with Tsinghua University, P.R. China. He received the B.S. degree and is pursuing a Ph.D. degree at the Department of Electronic Engineering. His research interests focus on machine learning, graph computing and scientific computing. Contact him at [email protected].

Xingzhi Niu is a Master's student in computer science at the University of Washington, Tacoma, Tacoma, WA. Contact him at [email protected].

Shao-Lun Huang is a Professor with the Tsinghua-Berkeley Shenzhen Institute. Contact him at [email protected].

Lin Zhang is a Professor with the Tsinghua Shenzhen International Graduate School. Contact him at [email protected].