How to Read a Research Compendium
Daniel Nüst, Institute for Geoinformatics, University of Münster, Münster, Germany ([email protected])
Carl Boettiger, Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, California, United States ([email protected])
Ben Marwick, Department of Anthropology, University of Washington, Seattle, Washington, United States ([email protected])
Abstract
Researchers spend a great deal of time reading research papers. Keshav (2007) provides a three-pass method for researchers to improve their reading skills. This article extends Keshav's method for reading a research compendium. Research compendia are an increasingly used form of publication, which packages not only the research paper's text and figures, but also all data and software for better reproducibility. We introduce the existing conventions for research compendia and suggest how to utilise their shared properties in a structured reading process. Unlike the original, this article is not built upon a long history but intends to provide guidance at the outset of an emerging practice.
1. Introduction
Research compendia are an increasingly used form of publication and scholarly communication. They comprise not only the research paper's text and figures, but also all data and software used to conduct the computational workflow and create all outputs. They provide a lot of added value by revealing more of the research process to readers, but, if not done well, they can increase the difficulty of understanding the research. To help readers better understand how to read a research compendium, we extend Keshav's three-pass method targeted at improving skills for reading a research paper (Keshav 2007) with additional steps relevant to a research compendium's content.

Unlike the first version of the original (Keshav 2007), we cannot draw from a long history of experience, because until recently research compendia have been relatively rare. Our intention here is to provide guidance at the outset of an emerging practice to both readers and authors of research compendia to help them understand each other's perspectives and needs and improve their communication. Authors can use this guide to improve their research compendium's structure and content by better anticipating their readers' needs. They should not be held back by unwarranted concerns, like providing support (Barnes 2010). Readers can avoid the trap of falling too deep into technological challenges by an iterative approach to reading and using that gives attention to the scientific issues. Ultimately research compendia can enhance and deepen the reading experience, if done right. Keshav's following introduction applies directly to research compendia:
Researchers must read papers for several reasons: to review them for a conference or a class, to keep current in their field, or for a literature survey of a new field. A typical researcher will likely spend hundreds of hours every year reading papers.

Learning to efficiently read a paper is a critical but rarely taught skill. Beginning graduate students, therefore, must learn on their own using trial and error. Students waste much effort in the process and are frequently driven to frustration.

For many years I have used a simple 'three-pass' approach to prevent me from drowning in the details of a paper before getting a bird's-eye-view. It allows me to estimate the amount of time required to review a set of papers. Moreover, I can adjust the depth of paper evaluation depending on my needs and how much time I have. This paper describes the approach and its use in doing a literature survey. (Keshav 2016)

The additions made in this work to accommodate the content of a research compendium are quite extensive. This stems from the complexity that an interactive compendium has compared to a classic static "paper", because a research compendium goes well beyond the "mere advertising of the scholarship" (Claerbout 1994). We see the breadth of additions as a sign of potential, namely for unprecedented transparency, openness, and collaboration.

1.2 Structure
In the remainder of this paper, the excellent original work is taken over completely. It is set in italic font based on the most recent online version: Keshav (2016). The term "paper" was not replaced with "research compendium" for better readability. First we briefly introduce research compendia and existing conventions. We further list relevant resources for authors related to research compendia. Then, matching the original paper's section numbering, Section 2 extends the "Three-pass Approach" to include research compendium features in the reading process. Section 3 extends "Doing a Literature Survey" with aspects relevant to reviewing many research compendia.
1.3 Research Compendia

The term research compendium was coined by Gentleman and Lang (2007), who "introduce[d] the concept of a compendium as both a container for the different elements that make up the document and its computations (i.e. text, code, data, ...), and as a means for distributing, managing and updating the collection." According to Marwick, Boettiger, and Mullen (2018), it provides "a standard and easily recognisable way for organising the digital materials of a research project to enable other researchers to inspect, reproduce, and extend the research". This standard may differ between scientific domains, yet the intentions and benefits are the same. Research compendia are Open Science culture put into practice: they improve transparency (Nosek et al. 2015), "make more published research true" (Ioannidis 2014), and enable enhanced review and publication workflows (Nüst et al. 2017). They answer readers' needs to understand complex analyses through inspection and manipulation (Konkol and Kray 2018) and enable other researchers to reproduce and extend the research (Marwick, Boettiger, and Mullen 2018). Research compendia improve citations since code and data are openly available (Vandewalle 2012). Ultimately, their goal is to improve reproducibility (see Barba (2018) for definitions of terms) in the light of claims of a "reproducibility crisis" in several fields. Infrastructures to support the creation, scientific publication, inspection of, and collaboration on research compendia are an active field of research, but none has been widely deployed yet (Nüst et al. (2017); Brinckman et al. (2018); Stodden, Miguez, and Seiler (2015); Kluyver et al. (2016); Green and Clyburne-Sherin (2018)).

As this article is focused on providing hands-on guidance on using, and to some extent also creating, research compendia, we refer the reader to the references for more specific details.
For the remainder of this work, we assume a minimal view of a research compendium suitable for readers who examine a research compendium directly. A research compendium has three integral parts: text, code, and data. Text can be instructions, software documentation, or a full manuscript with figures. Code can be scripts, software packages, specifications of dependencies and computational environments, or even virtual machines. Data can be just about anything, but probably comprises plain text or binary files that are used as input to the workflow and produced as output from executing the workflow.

For authors, there is a wealth of generic recommendations guiding researchers in creating open research (software), for example Sandve et al. (2013), Taschuk and Wilson (2017), Prlić and Procter (2012), Stodden and Miguez (2014), and Wilson et al. (2017). When a research compendium is published, one can assume the authors have the intention to help the reader understand the work and accept that there are "no excuses" for not publishing code (Barnes 2010). Authors may attempt to reach the ideals of having one "main" file that can be executed with "one click" (Pebesma 2013), of enabling re-use with proper licensing (Stodden 2009), and of interweaving code and text following the literate programming paradigm (Knuth 1984).

The following conventions are specifically for research compendia:

• Marwick, Boettiger, and Mullen (2018) and the rOpenSci community's rrrpkg (https://github.com/ropensci/rrrpkg) discuss the standards and tooling of the R programming language and software engineering tools for a variety of disciplines with real-world examples, including several templates.
• Jimenez et al. (2017) apply software engineering best practices from the Open Source software domain to research (see also http://falsifiable.us/).
• Konkol, Kray, and Pfeiffer (2018) derive recommendations for authors from issues encountered reproducing research compendia in the geosciences.
• Gentleman and Lang (2007) recommend using programming languages' packaging mechanisms for research compendia, more specifically R and Python packages.
• Chirigati et al. (2016) describe the tool ReproZip (https://reprozip.org) to support capture and reproduction of a research compendium.
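The three integral parts can be made concrete with a small sketch. The following Python snippet inventories a compendium directory against a set of hypothetical file and folder names; the names are our assumption, loosely inspired by the R-based conventions in Marwick, Boettiger, and Mullen (2018), and any real compendium may organise its parts differently.

```python
from pathlib import Path

# Hypothetical names for the three integral parts of a research compendium;
# these are illustrative assumptions, not a fixed standard.
EXPECTED = {
    "text": ["README.md", "paper.Rmd", "paper/paper.Rmd"],
    "code": ["Makefile", "analysis.R", "R/", "scripts/"],
    "data": ["data/", "data/raw/", "data/derived/"],
}

def inventory(compendium: Path) -> dict:
    """Report which candidate files/folders for text, code, and data exist."""
    return {
        part: [c for c in candidates if (compendium / c).exists()]
        for part, candidates in EXPECTED.items()
    }
```

Running `inventory(Path("my-compendium"))` gives a rough map of where to look first; an empty list for one of the three parts is itself useful information during a first pass.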
2. The three-pass approach
The key idea is that you should read the paper in up to three passes, instead of starting at the beginning and plowing your way to the end. Each pass accomplishes specific goals and builds upon the previous pass: The first pass gives you a general idea about the paper. The second pass lets you grasp the paper's content, but not its details. The third pass helps you understand the paper in depth. (Keshav 2016)
The first pass is a quick scan to get a bird's-eye view of the paper. You can also decide whether you need to do any more passes. This pass should take about five to ten minutes and consists of the following steps: (Keshav 2016)

1. Carefully read the title, abstract, and introduction
2. Read the section and sub-section headings, but ignore everything else
3. Glance at the mathematical content (if any) to determine the underlying theoretical foundations
4. Read the conclusions
5. Glance over the references, mentally ticking off the ones you've already read
6. Glance over the text looking for (a) URLs and formatted names referencing software and data products or repositories not yet mentioned in the sections read so far, mentally ticking off the ones you've heard about or used, and (b) tables or figures describing computational environments, deployments, or execution statistics
At the end of the first pass, you should be able to answer the seven Cs:

1. Category: What type of paper is this? A measurement paper? An analysis of an existing system? A description of a research prototype?
2. Context: Which other papers is it related to? Which theoretical bases were used to analyze the problem?
3. Correctness: Do the assumptions appear to be valid?
4. Contributions: What are the paper's main contributions?
5. Clarity: Is the paper well written?
6. Construction: What are the building blocks of the analysis workflow and how accessible are they (data set(s), programming language(s), tools, algorithms, scripts)? Under what licenses are code and data published?
7. Complexity: What is the scale of the analysis (e.g. HPC, required OS/cores/memory, typical execution time, data size) and of the software (number of dependencies, and is installation possible with dependency management tools)?
Using this information, you may choose not to read further (and not print it out, thus saving trees). This could be because the paper doesn't interest you, or you don't know enough about the area to understand the paper, or that the authors make invalid assumptions. (Keshav 2016)

You may also choose not to pursue the parts of the research compendium further, i.e. not running the workflow or looking at data or code, thus saving resources. Reasons to not read further that relate specifically to code and data may be that you don't have the expertise, or the access to resources, needed to re-use the data and code.
The first pass is adequate for papers that aren't in your research area, but may someday prove relevant. (Keshav 2016)

This first pass suits research compendia comprising potentially re-usable components, like workflows or algorithms using data sets, or generic software that is directly transferable to your field of research. After the first pass, you should be able to judge whether the software is useful and whether it works.
Incidentally, when you write a paper, you can expect most reviewers (and readers) to make only one pass over it. Take care to choose coherent section and sub-section titles and to write concise and comprehensive abstracts. If a reviewer cannot understand the gist after one pass, the paper will likely be rejected; if a reader cannot understand the highlights of the paper after five minutes, the paper will likely never be read. For these reasons, a 'graphical abstract' that summarizes a paper with a single well-chosen figure is an excellent idea and can be increasingly found in scientific journals. (Keshav 2016)

When you write a paper, take care to add instructions on how a reader can reproduce your work and provide all required parts, i.e. publish a research compendium. The instructions should start from a "blank" system and be specific, i.e. ready for copy & paste, including expected or experienced execution times and resources. Such instructions give readers a good idea about what is needed to recreate your environment and execute the analysis. If your work requires specialised or bespoke hardware (HPC, specific GPUs), consider creating an exemplary, reduced analysis that runs in regular environments.

Also ensure your code and data are properly deposited, citable, and licensed. If you don't do this, these core parts of your work will likely never be properly evaluated or re-used. See the section "Research Compendia", above, for recommendations and further reading on how to make your reviewers' and readers' lives easier.

In the second pass, read the paper with greater care, but ignore details such as proofs. It helps to jot down the key points, or to make comments in the margins, as you read.
Dominik Grusemann from Uni Augsburg suggests that you "note down terms you didn't understand, or questions you may want to ask the author." If you are acting as a paper referee, these comments will help you when you are writing your review, and to back up your review during the program committee meeting. (Keshav 2016)

1. Look carefully at the figures, diagrams and other illustrations in the paper. Pay special attention to graphs. Are the axes properly labelled? Are results shown with error bars, so that conclusions are statistically significant? Common mistakes like these will separate rushed, shoddy work from the truly excellent. (Keshav 2016)
2. Remember to mark relevant unread references for further reading (this is a good way to learn more about the background of the paper). (Keshav 2016)
3. Skim over data and source code files without opening them. Are they reasonably named (Bryan 2015)? Do they follow a well-defined structure (e.g. a Python package or a research compendium convention)? Is there a README file and/or structured documentation for functionalities?
4. Visit the online source code repository, if available. Is it established and well maintained, or orphaned? Is there only one author or are there contributors? How responsive are they to issues? Does the repository have signs of public recognition (i.e. GitHub "stars" and "forks")? Are there regular releases, using semantic versioning?
5. Follow the instructions to install the required software and execute the research compendium's workflow with the provided parameters and input or sample data. Note down errors or warnings, but do not try to fix anything but trivial or known problems (e.g. fixing a path or installing an undocumented dependency).
6. Compare the outputs with the expected ones reported in the paper. Also check for differences in output figures: Do labels, legends etc. match those in the paper?

Points 3 and 4 above hint at how to estimate the quality of a software project, but we recommend being realistic about what to expect and careful not to judge too fast. The project you evaluate might be the work of a single researcher who is not a professional programmer, working under a lot of pressure to write code for a single use case. In these situations one might find low levels of code documentation, but further documentation might be quickly provided by the authors once you as an external user show interest. Also, no recent changes or releases at a source code repository can mean the software is stable and simply works with no problems!
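Parts of points 3 and 4 can be mechanised. The sketch below is a rough screen of our own devising, not a tool from the compendium literature: it checks for a README and flags file names containing spaces or special characters, with heuristics loosely following the advice in Bryan (2015).

```python
import re
from pathlib import Path

def skim(compendium: Path) -> dict:
    """Second-pass skim: check for a README and for awkward file names.

    The naming heuristic (allow only word characters, dots, and hyphens)
    loosely follows Bryan (2015); it is a rough screen, not a verdict.
    """
    files = [p for p in compendium.rglob("*") if p.is_file()]
    awkward = [p.name for p in files if not re.fullmatch(r"[\w.-]+", p.name)]
    has_readme = any(p.name.lower().startswith("readme") for p in files)
    return {"n_files": len(files), "awkward_names": awkward, "has_readme": has_readme}
```

A compendium that fails this screen is not necessarily bad science; as noted above, the result should only inform, not replace, your own judgement.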
The second pass should take up to an hour for an experienced reader. (Keshav 2016)

This does not include the computation time of workflows in a research compendium. Use this time to complete first passes for one or several other compendia. If the software used is familiar, you may attempt to reduce the computation time by sub-setting data or simplifying the workflow. As an author, consider adding a reduced example to your research compendium for easier access by readers.
After this pass, you should be able to grasp the content of the paper. (Keshav 2016)

You should have re-executed the provided workflow or understand why you could not. You should be able to complete the second pass even if you are unfamiliar with the actual language the software is written in or if you are not a developer yourself. However, we recommend not diving too deep, i.e. not going beyond the provided instructions for the research compendium's workflow. At this stage, it is the author's responsibility to guide you through their work. Still, you may face unsolvable problems, like access to specific infrastructure. But if you encounter issues or have questions, you should communicate these to the authors, for example in the software's public code repository, if available. It is important to do this respectfully, and to give the authors a chance to fix bugs or respond to issues (Kahneman 2014). Also let the authors know if your reproduction was successful, especially if you used a different operating system or software version than reported.

At this point you should be able to judge whether the software works and if it is sustainable. Based on this evaluation you can decide to re-use parts of the analysis, i.e. software, data, or method, for your own work.
You should be able to summarize the main thrust of the paper, with supporting evidence, to someone else. This level of detail is appropriate for a paper in which you are interested, but which does not lie in your research speciality. Sometimes you won't understand a paper even at the end of the second pass. This may be because the subject matter is new to you, with unfamiliar terminology and acronyms. Or the authors may use a proof or experimental technique that you don't understand, so that the bulk of the paper is incomprehensible. The paper may be poorly written with unsubstantiated assertions and numerous forward references. (Keshav 2016)

The research compendium may have incomplete documentation, rely on unavailable software (e.g. proprietary) or data (e.g. sensitive), or require infrastructure not available to you (e.g. high-performance computing, HPC). It may use a programming language or programming paradigms unfamiliar to you.
Or it could just be that it's late at night and you're tired. You can now choose to: (a) set the paper aside, hoping you don't need to understand the material to be successful in your career, (b) return to the paper later, perhaps after reading background material, or (c) persevere and go on to the third pass. (Keshav 2016)
To fully understand a paper, particularly if you are a reviewer, requires a third pass. The key to the third pass is to attempt to virtually re-implement the paper: that is, making the same assumptions as the authors, re-create the work. By comparing this re-creation with the actual paper, you can easily identify not only a paper's innovations, but also its hidden failings and assumptions. This pass requires great attention to detail. (Keshav 2016)

If a best practice or established convention for structuring data and code was followed, familiarise yourself with it now.
You should identify and challenge every assumption in every statement. Moreover, you should think about how you yourself would present a particular idea. This comparison of the actual with the virtual lends a sharp insight into the proof and presentation techniques in the paper and you can very likely add this to your repertoire of tools. (Keshav 2016)

Take a close look at data, metadata, source code including the embedded code comments, and further documentation. You now leave the realm of the mere software user for the developer's perspective. This can be a time-consuming, very close study of the materials. If data is not publicly available, e.g. because it contains information about human subjects, decide if you have a reasonable request to contact the original authors and ask for data access. Work through the examples and analysis scripts included in the research compendium. Pay close attention not only to code, but also to code comments, as they should include helpful information. A good entry point for your code reading may be a "main" script (if provided by the author), a makefile, or a literate programming document (e.g. an R Markdown file or Jupyter Notebook). If none of these are available, then start with the code creating the figures for the article (e.g. look for "plot" statements in the code) and trace your way back through the code until you reach a statement where the input data is read. Your impression of the code can help to inform your impression of the article's quality.

If you did not succeed before but the work is relevant for you, spend more time on getting the analysis to run on your computer. Do not hesitate to contact the authors of the paper or authors of the software for help, but follow common error reporting guidelines (e.g. Stack Overflow (2018) or Tatham (n.d.)). For authors it is a great experience to be contacted by an interested and respectful reader!

With regard to the analysis, you may re-implement core parts or the full workflow with a different software.
For example, using a tool you know but which was not used in the research compendium. Does your code lead to the same results, or does it give different ones? Can the differences be explained, or are they not significant? Note that such a replication is of very high value for science and you should share your findings with the research compendium's authors and also with the scientific community. Depending on the effort you put in, write a blog post or even publish a replication research compendium for one or more evaluated research compendia.

If a full replication is not feasible, explore the assumptions you challenge with data and code. Play around with input parameters to get a feel for the changing results. Create exploratory plots for the data as if you wanted to analyse it from scratch, without the knowledge of the existing workflow. With your understanding of the code you can extend the method to a new problem or apply it to a different dataset. This deep evaluation of code and data increases your understanding of the authors' reasoning and decisions, and may lead to new questions.

To make sure you can trace your own hands-on changes against the original code and configuration, we recommend initiating a local git repository when starting this pass. You can create branches for specific explorations and easily reset to the original functional state.
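If setting up a git repository feels heavyweight, a minimal complement we suggest here (our own addition, not part of the original text) is to record checksums of every file before experimenting; comparing snapshots afterwards lists exactly which files your explorations touched.

```python
import hashlib
from pathlib import Path

def snapshot(compendium: Path) -> dict:
    """Record a SHA-256 checksum for every file before you start experimenting."""
    return {
        str(p.relative_to(compendium)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(compendium.rglob("*")) if p.is_file()
    }

def changed_files(before: dict, after: dict) -> list:
    """Files that were modified, added, or removed during exploration."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

Unlike git branches, this only detects changes; it cannot restore the original state, so keep a pristine copy of the compendium as well.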
During this pass, you should also jot down ideas for future work. This pass can take many hours for beginners and more than an hour or two even for an experienced reader. At the end of this pass, you should be able to reconstruct the entire structure of the paper from memory, as well as be able to identify its strong and weak points. In particular, you should be able to pinpoint implicit assumptions, missing citations to relevant work, and potential issues with experimental or analytical techniques. (Keshav 2016)

You should be able to come up with useful extensions of the used software stack and be able to judge the transferability and reusability of the analysis' building blocks. You should most certainly have improved your programming skills by reading and evaluating other people's code, or even by trying to extend or improve it.
3. Doing a literature survey
Paper reading skills are put to the test in doing a literature survey. This will require you to read tens of papers, perhaps in an unfamiliar field. What papers should you read? Here is how you can use the three-pass approach to help. First, use an academic search engine such as Google Scholar or CiteSeer and some well-chosen keywords to find three to five recent highly-cited papers in the area. (Keshav 2016)

No search capability comparable to that for scientific articles exists for research compendia, though you can of course use generic and academic search engines. More and more journals encourage reproducible research and software and data publication, so extending your regular search with keywords such as "reproduction", "reproducible", or "open data/software/code" may improve your results.

In addition, you can search online platforms where research compendia have been published and tagged as a research compendium (research-compendium):

• GitHub topic: https://github.com/topics/research-compendium
• Zenodo community: https://zenodo.org/communities/research-compendium

There is no journal specifically for research compendia yet, but the following ones feature reproducibility, computational studies, or openness in a prominent way and can be a starting point for finding research compendia, if they fit your topic:

• ReScience: https://rescience.github.io/
• Information Systems
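Such platform searches can also be scripted. The sketch below only constructs query URLs for the public GitHub search API and the Zenodo records API; the endpoints and parameters are our assumptions based on those services' documented interfaces, and fetching and paging through the results is left to the reader.

```python
from urllib.parse import urlencode

def compendium_search_urls(keyword: str = "") -> dict:
    """Build search URLs for items tagged 'research-compendium' (assumed endpoints)."""
    github_query = ("topic:research-compendium " + keyword).strip()
    return {
        # GitHub repository search, filtered by topic (assumed endpoint).
        "github": "https://api.github.com/search/repositories?"
                  + urlencode({"q": github_query}),
        # Zenodo records API, filtered by community (assumed endpoint).
        "zenodo": "https://zenodo.org/api/records?"
                  + urlencode({"communities": "research-compendium", "q": keyword}),
    }
```

For example, `compendium_search_urls("geoscience")` yields one URL per platform that you can open in a browser or pass to an HTTP client.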
Do one pass on each paper to get a sense of the work, then read their related work sections. You will find a thumbnail summary of the recent work, and perhaps, if you are lucky, a pointer to a recent survey paper. If you can find such a survey, you are done. Read the survey, congratulating yourself on your good luck. Otherwise, in the second step, find shared citations and repeated author names in the bibliography. These are the key papers and researchers in that area. (Keshav 2016)
You can also find shared software or data and use them as a seed for the next iteration.
Download the key papers and set them aside. Then go to the websites of the key researchers and see where they've published recently. That will help you identify the top conferences in that field because the best researchers usually publish in the top conferences. (Keshav 2016)
Also check where they publish their code and data. It will give you an idea where this community interacts online and can even lead you to research compendia under development.
The third step is to go to the website for these top conferences and look through their recent proceedings. A quick scan will usually identify recent high-quality related work. These papers, along with the ones you set aside earlier, constitute the first version of your survey. Make two passes through these papers. If they all cite a key paper that you did not find earlier, obtain and read it, iterating as necessary. (Keshav 2016)

If a majority cites or uses a key software, technology, or dataset, then evaluate it and include it in the next iteration.
4. Related work
If you are reading a paper to do a review, you should also read Timothy Roscoe's paper on "Writing reviews for systems conferences" (Roscoe 2007). If you're planning to write a technical paper, you should refer both to Henning Schulzrinne's comprehensive web site (Schulzrinne n.d.) and George Whitesides's excellent overview of the process (Whitesides 2004). Finally, Simon Peyton Jones has a website that covers the entire spectrum of research skills (Peyton Jones n.d.). Iain H. McLean of Psychology, Inc. has put together a downloadable 'review matrix' that simplifies paper reviewing using the three-pass approach for papers in experimental psychology (McLean 2012), which can probably be used, with minor modifications, for papers in other areas. (Keshav 2016)

We are working on an extended version of this matrix to provide space for notes about software, data, results of the reproduction, and application of the methods. See the corresponding repository issue for details and provide your feedback: https://github.com/nuest/how-to-read-a-research-compendium/issues/2

If you are reviewing a research compendium, a more detailed checklist is given in the "rOpenSci Analysis Best Practice Guidelines" (rOpenSci 2017), which are even partially automated for R-based research compendia (DeCicco et al. 2018), and in the Journal of Open Research Software's guidelines for reviewing research software (JORS Editorial Team 2018).
5. Acknowledgements
The first version of this document was drafted by my students: Hossein Falaki, Earl Oliver, and Sumair Ur Rahman. My thanks to them. I also benefited from Christophe Diot's perceptive comments and Nicole Keshav's eagle-eyed copy-editing. I would like to make this a living document, updating it as I receive comments. Please take a moment to email me any comments or suggestions for improvement. Thanks to encouraging feedback from many correspondents over the years. (Keshav 2016)

In the spirit of the original paper, we would like to make this a living document and invite readers to provide comments or suggestions for improvement via email, as part of this preprint, or on the GitHub repository: https://github.com/nuest/how-to-read-a-research-compendium. The repository also includes open questions and is where the paper's authors discuss openly.
References
Barba, Lorena A. 2018. "Terminologies for Reproducible Research." arXiv:1802.03311 [cs], February. http://arxiv.org/abs/1802.03311.

Barnes, Nick. 2010. "Publish Your Computer Code: It Is Good Enough." Nature News 467 (7317):753. https://doi.org/10.1038/467753a.

Brinckman, Adam, Kyle Chard, Niall Gaffney, Mihael Hategan, Matthew B. Jones, Kacper Kowalik, Sivakumar Kulasekaran, et al. 2018. "Computing Environments for Reproducibility: Capturing the 'Whole Tale'." Future Generation Computer Systems, February. https://doi.org/10.1016/j.future.2017.12.029.

Bryan, Jenny. 2015. "Naming Things." Speaker Deck. https://speakerdeck.com/jennybc/how-to-name-files.

Chirigati, Fernando, Rémi Rampin, Dennis Shasha, and Juliana Freire. 2016. "ReproZip: Computational Reproducibility with Ease." In Proceedings of the 2016 International Conference on Management of Data.

Gentleman, Robert, and Duncan Temple Lang. 2007. "Statistical Analyses and Reproducible Research." Journal of Computational and Graphical Statistics 16 (1):1–23. https://doi.org/10.1198/106186007X178663.

Green, Seth Ariel, and April Clyburne-Sherin. 2018. "Computational Reproducibility via Containers in Social Psychology." PsyArXiv, February. https://doi.org/10.17605/OSF.IO/MF82T.

Ioannidis, John P. A. 2014. "How to Make More Published Research True." PLOS Medicine 11 (10):e1001747. https://doi.org/10.1371/journal.pmed.1001747.

Jimenez, I., M. Sevilla, N. Watkins, C. Maltzahn, J. Lofstead, K. Mohror, A. Arpaci-Dusseau, and R. Arpaci-Dusseau. 2017. "The Popper Convention: Making Reproducible Systems Evaluation Practical." In 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1561–70. https://doi.org/10.1109/IPDPSW.2017.157.

JORS Editorial Team. 2018. "Journal of Open Research Software – Editorial Policies, Peer Review Process." http://openresearchsoftware.metajnl.com/about/editorialpolicies/.

Kahneman, Daniel. 2014. "A New Etiquette for Replication." Social Psychology 45 (4):310.

Keshav, S. 2007. "How to Read a Paper." SIGCOMM Comput. Commun. Rev. 37 (3):83–84. https://doi.org/10.1145/1273445.1273458.

———. 2016. "How to Read a Paper." Manuscript. Waterloo, ON, Canada. http://blizzard.cs.uwaterloo.ca/keshav/home/Papers/data/07/paper-reading.pdf.

Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonier, Jonathan Frederic, Kyle Kelley, et al. 2016. "Jupyter Notebooks – a Publishing Format for Reproducible Computational Workflows." In Positioning and Power in Academic Publishing: Players, Agents and Agendas, 87–90. https://doi.org/10.3233/978-1-61499-649-1-87.

Knuth, Donald E. 1984. "Literate Programming." Comput. J. 27 (2):97–111. https://doi.org/10.1093/comjnl/27.2.97.

Konkol, Markus, and Christian Kray. 2018. "In-Depth Examination of Spatio-Temporal Figures in Open Reproducible Research." EarthArXiv, April. https://doi.org/10.17605/OSF.IO/Q53M8.

Konkol, Markus, Christian Kray, and Max Pfeiffer. 2018. "The State of Reproducibility in the Computational Geosciences." https://doi.org/10.17605/osf.io/kzu8e.

Marwick, Ben, Carl Boettiger, and Lincoln Mullen. 2018. "Packaging Data Analytical Work Reproducibly Using R (and Friends)." The American Statistician 72 (1):80–88. https://doi.org/10.1080/00031305.2017.1375986.

McLean, Iain H. 2012. Literature Review Matrix. http://archive.org/details/LiteratureReviewMatrix.

Nosek, B. A., G. Alter, G. C. Banks, D. Borsboom, S. D. Bowman, S. J. Breckler, S. Buck, et al. 2015. "Promoting an Open Research Culture." Science 348 (6242):1422–25. https://doi.org/10.1126/science.aab2374.

Nüst, Daniel, Markus Konkol, Edzer Pebesma, Christian Kray, Marc Schutzeichel, Holger Przibytzin, and Jörg Lorenz. 2017. "Opening the Publication Process with Executable Research Compendia." D-Lib Magazine 23 (1/2). https://doi.org/10.1045/january2017-nuest.

Pebesma, Edzer. 2013. "Earth and Planetary Innovation Challenge (EPIC) Submission 'One-Click-Reproduce'." http://pebesma.staff.ifgi.de/epic.pdf.

Peyton Jones, Simon. n.d. "Simon Peyton Jones at Microsoft Research."

Prlić, Andreas, and James B. Procter. 2012. "Ten Simple Rules for the Open Development of Scientific Software." PLOS Computational Biology 8 (12):e1002802. https://doi.org/10.1371/journal.pcbi.1002802.

rOpenSci. 2017. "rOpenSci Analysis Best Practice Guidelines." Google Docs. https://docs.google.com/document/d/1OYcWJUk-MiM2C1TIHB1Rn6rXoF5fHwRX-7_C12Blx8g/edit?usp=embed_facebook.

Roscoe, Timothy. 2007. "Writing Reviews for Systems Conferences," March. https://people.inf.ethz.ch/troscoe/pubs/review-writing.pdf.

Sandve, Geir Kjetil, Anton Nekrutenko, James Taylor, and Eivind Hovig. 2013. "Ten Simple Rules for Reproducible Computational Research." PLoS Comput Biol 9 (10):e1003285. https://doi.org/10.1371/journal.pcbi.1003285.

Schulzrinne, Henning. n.d. "Writing Technical Articles." http://www.cs.columbia.edu/~hgs/etc/writing-style.html.

Stack Overflow. 2018. "How to Create a Minimal, Complete, and Verifiable Example." Stack Overflow. https://stackoverflow.com/help/mcve.

Stodden, Victoria. 2009. "The Legal Framework for Reproducible Scientific Research: Licensing and Copyright." Computing in Science & Engineering 11 (1):35–40.

Stodden, Victoria, and Sheila Miguez. 2014. "Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research." Journal of Open Research Software 2 (1):e21.

Stodden, Victoria, Sheila Miguez, and Jennifer Seiler. 2015. "ResearchCompendia.org: Cyberinfrastructure for Reproducibility and Collaboration in Computational Science." Computing in Science & Engineering 17 (1):12–19. https://doi.org/10.1109/MCSE.2015.18.

Taschuk, Morgan, and Greg Wilson. 2017. "Ten Simple Rules for Making Research Software More Robust." PLOS Computational Biology 13 (4):e1005412.

Tatham, Simon. n.d. "How to Report Bugs Effectively." https://www.chiark.greenend.org.uk/~sgtatham/bugs.html.

Vandewalle, Patrick. 2012. "Code Sharing Is Associated with Research Impact in Image Processing." Computing in Science & Engineering 14 (4):42–47.

Whitesides, G. M. 2004. "Whitesides' Group: Writing a Paper." Advanced Materials 16 (15):1375–77. https://doi.org/10.1002/adma.200400767.

Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K. Teal. 2017. "Good Enough Practices in Scientific Computing." PLOS Computational Biology 13 (6):e1005510.