An Environment for Sustainable Research Software in Germany and Beyond: Current State, Open Challenges, and Call for Action
Hartwig Anzt, Felix Bach, Stephan Druskat, Frank Löffler, Axel Loewe, Bernhard Y. Renard, Gunnar Seemann, Alexander Struck, Elke Achhammer, Piush Aggarwal, Franziska Appel, Michael Bader, Lutz Brusch, Christian Busse, Gerasimos Chourdakis, Piotr W. Dabrowski, Peter Ebert, Bernd Flemisch, Sven Friedl, Bernadette Fritzsch, Maximilian D. Funk, Volker Gast, Florian Goth, Jean-Noël Grad, Sibylle Hermann, Florian Hohmann, Stephan Janosch, Dominik Kutra, Jan Linxweiler, Thilo Muth, Wolfgang Peters-Kottig, Fabian Rack, Fabian H.C. Raters, Stephan Rave, Guido Reina, Malte Reißig, Timo Ropinski, Joerg Schaarschmidt, Heidi Seibold, Jan P. Thiele, Benjamin Uekerman, Stefan Unger, Rudolf Weeber
POSITION PAPER
Affiliations: Steinbuch Centre for Computing, Karlsruhe Institute of Technology (KIT), Germany; Innovative Computing Lab, University of Tennessee, Knoxville, TN, USA; Department of English Studies, Friedrich Schiller University Jena, Germany; Institute of Software Technology, German Aerospace Center (DLR), Germany; Department of Computer Science, Humboldt-Universität zu Berlin, Germany; Heinz Nixdorf Chair for Distributed Information Systems, Friedrich Schiller University Jena, Germany; Michael Stifel Center Jena, Germany; Center for Computation and Technology, Louisiana State University, Baton Rouge, USA; Institute of Biomedical Engineering, Karlsruhe Institute of Technology (KIT), Germany; Bioinformatics Unit (MF1), Robert Koch Institute, Berlin, Germany; Hasso Plattner Institute, Faculty of Digital Engineering, University of Potsdam, Germany; Institute for Experimental Cardiovascular Medicine, University Heart Centre Freiburg Bad Krozingen, Germany; Faculty of Medicine, University of Freiburg, Freiburg, Germany; Matters of Activity. Image Space Material. Cluster of Excellence at Humboldt-Universität zu Berlin, Germany; Technische Universität München, Germany; Language Technology Lab, Universität Duisburg-Essen, Germany; Leibniz Institute of Agricultural Development in Transition Economies (IAMO), Halle (Saale), Germany; Department of Informatics, Technical University of Munich, Germany; Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Germany; Deutsches Krebsforschungszentrum, Heidelberg, Germany; Chair of Scientific Computing in Computer Science, Technical University Munich, Germany; School of Computing, Communication and Business, Hochschule für Technik und Wirtschaft Berlin, Germany; Center for Bioinformatics Saar, Saarland Informatics Campus, Germany; Institute for Modelling Hydraulic and Environmental Systems, University of Stuttgart, Germany; Berlin Institute of Health, Germany; Scientific Computing, Alfred Wegener Institute, Helmholtz Center for Polar and Marine Research Bremerhaven, Germany; Max-Planck-Gesellschaft e.V.; Institut für Theoretische Physik und Astrophysik, Universität Würzburg, Germany; Institut für Computerphysik, University of Stuttgart, Germany; University Library, University of Stuttgart, Germany; Zentrum für Medien-, Kommunikations- und Informationsforschung (ZeMKI), Universität Bremen, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; European Molecular Biology Laboratory, Heidelberg, Germany; Center for Mechanics, Uncertainty and Simulation in Engineering, Technische Universität Braunschweig, Germany; eScience Division, Federal Institute for Materials Research and Testing, Berlin, Germany; Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB), Germany; FIZ Karlsruhe - Leibniz Institute for Information Infrastructure, Germany; Department of Economics, University of Goettingen, Germany; Applied Mathematics, University of Münster, Germany; Visualization Research Center, University of Stuttgart, Germany; Institute for Advanced Sustainability Studies e.V.; Institute of Media Informatics, Ulm University, Germany; Department of Science and Technology, Linköping University, Sweden; Institute of Nanotechnology, Karlsruhe Institute of Technology (KIT), Germany; LMU Munich, Germany; Bielefeld University, Germany; Helmholtz Zentrum Munich, Germany; Institute of Applied Mathematics, Leibniz University Hannover, Germany; Energy Technology, Eindhoven University of Technology, The Netherlands; Julius Kühn-Institut (JKI), Federal Research Centre for Cultivated Plants, Quedlinburg, Germany.
* [email protected], [email protected]; † Contributed equally.
Abstract
Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability.
Hence, a change is needed in the way research software development and maintenance are currently motivated, incentivized, funded, structurally and infrastructurally supported, and legally treated. Failing to do so will threaten the quality and validity of research. In this paper, we identify challenges for research software sustainability in Germany and beyond, in terms of motivation, selection, research software engineering personnel, funding, infrastructure, and legal aspects. Besides researchers, we specifically address political and academic decision-makers to increase awareness of the importance and needs of sustainable research software practices. In particular, we recommend strategies and measures to create an environment for sustainable research software, with the ultimate goal to ensure that software-driven research is valid, reproducible and sustainable, and that software is recognized as a first-class citizen in research. This paper is the outcome of two workshops run in Germany in 2019, at deRSE19 - the first International Conference of Research Software Engineers in Germany - and a dedicated DFG-supported follow-up workshop in Berlin.
Key words: Sustainable Software Development; Academic Software; Software Infrastructure; Software Training; Software Licensing; Research Software
Background
Meet Kim, who is currently a post-grad PhD student in researchonomy at the University of Arcadia (UofA). We will follow Kim’s fictional career in order to understand different aspects of research software sustainability. Note that in Kim’s world, many of the changes this paper calls for have already been implemented. (In our example, Kim is a female person. Of course, research software engineers (RSEs) can be of any gender.)
Computational analysis of large data sets, computer-based simulations, and software technology in general play a central role for virtually all scientific breakthroughs of at least the 21st century. The first image of a black hole may be the most prominent recent example where astrophysical experiments and the collection and processing of data had to be complemented with sophisticated algorithms and software to enable research excellence [1, 2]. Similarly, it is research software that allows us to get a glimpse of the consequences our actions today have on the climate of tomorrow. However, an implication of computer-based research is that findings and data can only be reproduced, understood, and validated if the software that was used in the research process is sustained and its functionality maintained.
At the same time, sustaining research software, and in particular open research software, comes with a number of challenges. Commercial research software often has revenue flows that can facilitate sustainable software development, maintenance, and documentation as well as the operation of adequate infrastructure. However, a large share of researchers base their research on software that was developed in-house or as a community effort. Many of these software stacks cannot be sustained – often because research software was not a first-class deliverable in a research project and hence remained in a prototype state, or because of missing incentives and resources to maintain the software after project funding ended. Another fundamental difference to industrial software development is that most developers of academic research software (often doctoral students or postdoctoral researchers) never receive training in sustainable software development [3]. In particular, as they usually see themselves as the primary user of a software product, there are virtually no incentives to invest in sustainability measures such as code documentation or portability.
In combination with the predominance of temporary positions in research, this results in a highly inefficient system where millions of lines of code are generated every year that will not be re-used after the termination of the developer’s position. Part of the problem is the reluctance to accept research software engineering as an academic profession, which results in a lack of incentives to produce high-quality software: producing high-quality software requires sufficient resources, and although the San Francisco Declaration on Research Assessment (DORA [4]) demands a change in the academic credit system, many institutions base promotion and appointments on traditional metrics like the Hirsch index [5]. It is obvious that an extraordinary amount of idealism is required to write sustainable code including documentation and installation routines, as well as to run infrastructure and give support to others, when resources can be used more profitably in writing scientific publications based on fragile prototype software [6, 7].
Thus, one main factor for the poor sustainability of research software is the lack of long-term funding for research software engineers (RSEs) [8] who take care of the appropriate architecture, organization, implementation, documentation, and community interaction for the software, paired with the implementation of measures towards making the software sustainable during and beyond the development process [9].
In this paper, we describe the state of the practice and current challenges for research software sustainability, and suggest measures towards improvements that can solve these challenges.
The paper is the result of a community effort, with work undertaken during two workshops and subsequent collaborative work across the larger RSE community in Germany. It was initiated during a half-day workshop at the first International Conference for Research Software Engineers in Germany (deRSE19) in Potsdam, Germany on June 5th, 2019 [10], and continued during a dedicated two-day workshop in Berlin, Germany on November 7th and 8th, 2019, which was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG). Subsequently, the draft produced during the latter event was opened up for collaborative discussion by the German RSE community through de-RSE e.V. - Society for Research Software.
We mainly focus on the situation of research software and RSEs in Germany, where funding bodies increasingly acknowledge the importance and value of sustainable research software and related infrastructures. The DFG, the largest funding body for fundamental research in Germany, for example, opened a call for sustainable research software development [11] at the end of 2016 and a second call for quality management in research software [12] in June 2019. The first call was oversubscribed by a factor of 10-15, a strong indicator of unmet demand. As another example, the 2019 “Guidelines for Safeguarding Good Research Practice” codex of the DFG [13] now explicitly lists software side-by-side with other research results and data. The FAIR principles for research data [14] provide guidelines for data archiving, but enabling full reproducibility and traceability of research software requires additional steps [15]. In consequence, there are ongoing discussions on whether software should be considered as a specific kind of research data or as a separate entity [16].
These positive developments notwithstanding, guidelines and policies for sustainable research software development in Germany are unfortunately still lacking, and long-term funding strategies are missing.
This all leads to unmet requirements and unsolved challenges that we want to highlight in this paper by elaborating on (1) why research software engineering needs to be considered an integral part of academic research; (2) how to decide which software to sustain; (3) who sustains research software; (4) how software can be funded sustainably; (5) what infrastructure is needed for sustainable software development; and (6) legal aspects of research software development in academia. While we specifically focus on the research software landscape in Germany, we are convinced that many of the analyses, findings, and recommendations may carry beyond. We want to address RSEs who are experiencing similar challenges and newcomers to the field of research software development, but first and foremost political and academic decision makers to raise awareness of the importance of and requirements for sustainable software development. As a community we work hard on overcoming the challenges of software development in an academic setting, but we need support – and reliable funding options and institutional recognition in particular – for the sake of better research.
Public reviews for this de-RSE Position Paper were conducted between 23 January 2020 and 09 February 2020. The paper was accepted as an official position of de-RSE e.V. – Society for Research Software on 03 April 2020. The paper is publicly available under the persistent identifier arXiv:2005.01469 [cs.GL].
Why Sustainable Research Software in the First Place?
After graduation, Kim joins a fixed-term researchonomical research project. For her PhD thesis, she wants to crunch some data. Her colleague recommends learning some Boa, which is an all-purpose programming language often used in researchonomy. Luckily, the UofA runs regular Software Plumbery courses for researchers, including a Boa course. Kim takes the course and gains a solid understanding of the basics of the Hash shell, version control with Tig, and the basics of Boa. She starts writing scripts, which help her a lot with the data processing. Unfortunately, Kim’s scripts are quite slow and actually break after she installs a newer version of Boa. She visits the weekly Code Café organized by her university’s central RSE team. The RSEs not only help her update her scripts but also suggest some changes which speed up the computation by a factor of 25.
During the next meeting with her PhD supervisor, Kim presents her collection of scripts. The supervisor encourages Kim to create a Boa library from them, as they will be very useful to other researchonomists. Thankfully, Kim’s project PI had applied for 3 RSE person months in their grant, so the project enlists an RSE from the central team. Over the next three months, Kim and the RSE work together to build the library, document it, test it, license it under the permissive Comanche license, update the TigLab repository to let others contribute, introduce automated builds for every code change via a continuous integration platform, and make the library citable. Finally, they release the first major version of the library, named hal9k, and publish it through the university library’s software portal, where they get a DOI (Digital Object Identifier) for the version as well as a concept DOI for any future versions of the library.
Working with the RSE, Kim has gained a good understanding of some methods in software engineering, and she’s thrilled because this also means she’ll be able to get a job with a local tech company once her fixed-term contract has run out.
Kim passes her PhD - of which hal9k is an important part - with flying colors, and soon citations to her library start appearing in the researchonomic literature. To Kim’s surprise, she also reads a blog post about a citizen science maker project which has used hal9k to process researchonomic data measured in a neighborhood of her hometown. She is invited to give a talk at the local office of Siren, a global tech company, which is looking to adopt hal9k and pays Kim a generous speaker honorarium. So generous, in fact, that Kim can pay a student assistant for a full year from the money.
Our credibility as researchers in society hinges on the notion of proper research conduct, also known as “good research practice”. The digitalization of research has introduced complex digital research outputs, such as software and data sets. Although first recommendations [17] and policies [18] exist, they are far from being widely adopted. It is still somewhat unclear how to translate good research practice into good research software practice, for example in terms of validity and reproducibility, but also pertaining to the responsible use of resources.
The damage that failing to do so is causing both to the progress of the research community and to the credibility of academic research in society is becoming increasingly clear with the growth of the replication crisis. While the lack of universally agreed-upon and supported good research software practice is not the main reason for that crisis, it clearly is a contributing factor.
While it is obvious that software qualifies as a potentially re-usable digital artifact, the additional benefit of not just reproducing a given scenario, but transferring software use to new problems, domains, and/or applications, justifies developing research software with a long-term perspective as sustainable research software.
In order to support research, sustainable software must be correct [19–21], validatable, understandable, documented, publicly released, adequately published (i.e. in persistently identifiable form as software source code [22], and potentially in an additional paper which describes the software concept, design decisions, and development rationale), actively maintained, and (re-)usable [23–25]. We also argue that truly sustainable research software must ideally be published under a Free/Libre Open Source Software (FLOSS) license, and follow an open development model, to (1) enable the validation of research results that have been produced using the software, (2) enable the reproducibility of software-based research, (3) enable improvement and (re-)use of the software to support more and better research, and reduce resources to be spent on software development, (4) reduce legal issues (see section below), (5) meet ethical obligations from public funding, and (6) open research software to the general public, i.e., the stakeholder group with arguably the greatest interest in furthering research knowledge and improving research for the benefit of all.
To make software-based research (and with that almost any research) reproducible, the software used must continue to exist.
Furthermore, it must continue to be usable, understandable, and return consistent results (or potential changes to results and bug fixes must be clearly documented) in the evolving software and hardware environment. Moreover, the software should support reuse scenarios to avoid duplication of efforts and unneeded drain of resources. Therefore, if research software is publicly funded, it should be freely available under a FLOSS license.
Currently, creating and using sustainable research software is not sufficiently incentivized. To evaluate in which area this shortcoming should be addressed, we have identified the following challenges:
• Lack of benefit for the individual: Currently, the primary motivation for sustainable research software is the common benefit, rather than the individual benefit. It is clearly beneficial for the research community as a whole to direct resources towards sustainable research software, as it enables better and more research by freeing funds for domain research rather than (repetitive) software development. But the developers are often even at a disadvantage (e.g., they publish fewer papers [6, 7]), which in turn prevents sustainable research software.
• Lack of suitable incentive systems: Contributions to research that are not traditional text-based products (i.e., papers or monographs) are still not sufficiently rewarded, or not rewarded at all, due to the missing implementation of mandatory software citation [22, 26–34], among other reasons. Interestingly, one third of research software repositories have a lifespan (defined as the time from the first time any code was uploaded to the last contribution) of less than one day (median: 15 days [15]), indicating that many codes are only made available publicly for the publication in a journal (as increasingly encouraged or required by journals [35] and associated with higher impact [36]) but are not maintained thereafter.
• Lack of awareness: Research software sustainability (see [37–40]) and its importance lack visibility as well as acceptance, and research software engineering in its implementation as sustainable software development and software maintenance is not sufficiently supported, both in Germany and beyond [9, 41, 42].
• Lack of expertise: Knowledge about how to create, maintain, and support sustainable research software is emerging [43–45] but has not yet permeated related activities within organizations - specifically teaching, mentoring, and consultancy. This lack of expertise can also lead to divergence between software design and community uptake, e.g., if the software fails to meet the needs of the target group, or is insufficiently usable. RSEs combine sustainable software engineering expertise with experience in one or more research domains.
• Heterogeneous research community: There are significant differences with respect to how software is developed, published, used, and valued in the different academic disciplines. Additionally, there is even heterogeneity within a community in terms of application and approach. This also makes it hard to train researchers for sustainable software development, as beyond basic training in computational research such as provided by The Carpentries, advanced courses for research software engineering are not widely available (with the notable exception of the CodeRefinery project [46]). Targeted curricula must be developed and updated regularly, and specialized instructors need to be trained.
• Lack of impact measures: It is unclear how to measure the impact of research software with respect to its quality, reusability, and benefit for the research community. This exceeds the implementation of research software citation (which is work in progress [22, 33, 34, 47]), and pertains to sustainability and policy studies.
• Infrastructure issues: Due to a lack of knowledge about how sustainability features impact the application of research software, there is not yet enough evidence for whether centralized or decentralized facilities should be favored to further research software sustainability [48–50]. This in turn leads to a lack of infrastructure as a whole.
• Legal issues: Many obstacles for research software pertain to legal issues, such as applicable licensing and compatibility of licenses [51], and decisions about license types.
• Funding issues: Despite some individual initiatives [11, 12, 52, 53], funding for the creation, maintenance, and support of sustainable research software is still scarce.
• Slow adoption of research software engineering as a profession: Career options for research software work are not fully determined, although career paths are emerging in some regions. Initially, the RSE initiative in the UK has made progress in this area, and RSE groups have been installed in many institutions. In Germany, the US, and the Netherlands, this is still work in progress [9, 54–56]. It is also not yet determined how to match research software engineering roles in public institutions with industry roles (see [57]).
In summary, the necessary but resource-intensive practice of creating, maintaining, supporting, and funding sustainable research software is not yet sufficiently incentivized and enabled by research institutions and funding agencies, nor does it align well with the publish-or-perish culture that is still prominent in most fields.
Therefore, it is necessary to comprehensively motivate sustainable research software practice. In the following, we identify stakeholders of research software (see [58–60]), and explicate their particular motivations for sustainable research software. Subsequently, we specify challenges towards satisfying the demands of the individual stakeholders.
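The repository lifespan mentioned in the list of challenges above (the time from the first upload of any code to the last contribution) can be made concrete with a minimal sketch. The function names, the input format (a mapping from repository name to contribution timestamps), and the example data are assumptions made purely for illustration; they are not taken from the study cited as [15].

```python
from datetime import datetime, timedelta
from statistics import median


def lifespan(contribution_times):
    """Time between the first and the last contribution to one repository."""
    return max(contribution_times) - min(contribution_times)


def summarize_lifespans(repositories):
    """Summarize lifespans for a mapping of repository name -> timestamps.

    Repositories without any recorded contribution are skipped.
    """
    spans = [lifespan(times) for times in repositories.values() if times]
    if not spans:
        raise ValueError("no repository with at least one contribution")
    # Count repositories whose whole history fits into less than one day.
    short_lived = sum(1 for span in spans if span < timedelta(days=1))
    return {
        "share_under_one_day": short_lived / len(spans),
        "median_days": median(span / timedelta(days=1) for span in spans),
    }


# Hypothetical example data, not taken from the cited study.
example = {
    "repo-a": [datetime(2019, 1, 1, 9), datetime(2019, 1, 1, 17)],
    "repo-b": [datetime(2019, 3, 1), datetime(2019, 3, 16)],
    "repo-c": [datetime(2018, 5, 1), datetime(2019, 5, 1)],
}
summary = summarize_lifespans(example)
```

A measure of this kind only captures activity in the public repository; it says nothing about quality or reuse, which is why the list above treats impact measurement as a separate open challenge.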
Stakeholder Motivations for Research Software Sustainability
While a wide range of stakeholders share an interest in sustainable software, we argue that their individual motivations can differ quite significantly:
The general public benefits from research which supports the common good, in other words: creates a better world, faster. Taxpayers have an interest in economical use of their tax money, to which duplicated or flawed efforts to create research software – in contrast to software reuse – run contrary. A subset of this group may be interested in sustainable, i.e., re-usable and understandable, software as part of citizen science.
Domain researchers benefit from better software to do more, better, and faster research. Sustainable research software supports this through validated functionality (e.g., correct algorithms), the potential for reuse, and general availability. Sustainable software also potentially simplifies building upon previous research results by re-using the involved software to produce additional data or by extending the software’s functionality. In light of recent updates to definitions of good research practice [13], sustainable research software also allows domain researchers to comply with guidelines and best practices. Additionally, using software that is sustainable enough to establish itself as a standard tool in a field signifies inclusion in a research community. Less directly, researchers may benefit from the existence of sustainable standard tools as they yield standard formats, which in themselves facilitate reuse of research data.
Research software engineers (RSEs) have an intrinsic interest in sustainable research software. They create better software for research, which enables more and better research. RSEs have an inherent interest in developing and working with high-quality software, as part of professional ethics as well as good research practice. RSEs build their reputation on high-quality software and software citation [22, 33], which will open up new career paths. Finally, for RSEs, creating sustainable research software is part of an attractive, intellectually challenging, and satisfying work environment.
Research leaders as well as research performing organizations mainly focus on the economic aspects and management of research, i.e., available funds, people, and time employed to optimize research output. Both need to make sure that their employees continually improve their qualification and generate impact to improve their standing in the various research communities and ensure continued funding. Overseeing and enabling the creation of sustainable research software advances their visibility in the field and makes their research endeavors both more future-proof and more easily traceable, reproducible, and verifiable and thus more likely to attract additional resources (including human resources).
Research performing organizations can additionally benefit from sustainable research software if it can be reused in other areas, creating synergies between different research disciplines. These synergies typically free resources that can then be used in areas other than software development and maintenance. Finally, organizations can gain highly competitive positions in terms of funding and hiring opportunities, as well as a reputation for being on the cutting edge of research, through early adoption of research software engineering units, and the implementation of sustainable research software policy and practice.
Research funding organizations have an inherent interest in – and directly benefit from – the existence of sustainable research software as it allows them to direct more resources towards actual research (rather than recreation of software) and increase return on investment. At the same time, funding organizations can create incentives for sustainable software by imposing policies that reflect the necessity of research software sustainability and creating respective funding opportunities.
Geopolitical units have a strategic interest in being independent of other geopolitical units to ensure that research can continue seamlessly regardless of geopolitical developments and ensuing embargoes on information flow. Reuse of sustainable software additionally frees up funding for uses other than software development. Well-established, sustainable software systems can also attract researchers and companies in the research and technology sector.
Libraries (also registries, indices) benefit from sustainable research software, as it will undergo a formal publishing process and be properly described in its metadata. Libraries can extend their portfolio beyond text-based research objects and stake claims as organizations harnessing the digitalization of research. In turn, they help to increase visibility and discoverability for research software through their services and advance the competitiveness of their organization or geopolitical unit. In addition, libraries also use research software and would thus benefit directly from a more sustainable research software landscape. Last but not least, by using FLOSS research software, libraries could avoid expensive licenses and often insufficiently adapted commercial software.
Infrastructure units, such as supercomputing facilities and university computing centers, benefit from sustainable software as it makes their daily work in terms of software installation and user support easier. Additionally, they can position themselves at the forefront of research by bundling expertise on the creation and maintenance of sustainable research software and installing research software engineering teams.
Industry benefits from sustainable research software, as the process of creating and maintaining research software produces a highly skilled workforce. Depending on the employed licensing model, sustainable research software can also be adopted by industry partners to reduce cost in corporate research and development. Helping to sustain research software may also enable positive outreach for companies across industry and into society.
Independent (open source) developers can get involved in re-search software, even if they are not employed by a researchinstitution. This can help them get in contact with other de-velopers in the field and may potentially lead to collaborationsor job opportunities in research based on this extended experi-ence.
How to Decide Which Software to Sustain?
Requirements and Challenges
The sustained funding of all existing software efforts is not only impossible but would risk overly splintering the community and eventually become counterproductive to the efficiency of the research community. Therefore, it is important to agree on a list of transparent criteria that qualify a software product for sustained funding. We recognize that defining research software engineering criteria for software evaluation will also lead to activities aiming at optimizing scores to achieve these criteria. Hence, the criteria have to be designed such that all score-pushing effort truly advances the value of the software. Criteria that can be manipulated without effectively adding value, i.e., wasting resources, should be excluded. The list of criteria presented in this chapter could be the basis for a structured review process that facilitates an unbiased evaluation of software tools from various fields. Therefore, this list must be general enough to be applied to research software from various research disciplines while also respecting differences between fields (e.g. citation rates between humanities and life sciences). The challenge of doing justice to a wide spectrum is, e.g., reflected by suggesting criteria comprising different levels [61].
One of the major challenges in the endeavor to define a selection scheme for sustainable funding of research software is to organize a fair and transparent review process. We believe that it is important that the review process is conducted by experts, or teams of experts, that have a strong background both in software engineering and in the domain-specific aspects, the latter because certain criteria often exist on a spectrum that is most likely shaped by the specific demands of the respective research community.
Kim’s PI is happy because Kim writes a longer section on hal9k for the final project report and provides a software management plan alongside it, which ticks off a box in the template that the PI had previously worried about. The PI does not want to let Kim go and instead offers her to be co-PI on a follow-up project to test new methods on the data, and integrate them into hal9k as well. They are positive that such a project proposal has a good chance to be funded, as they can show impact of their first project via their university’s current research information system (CRIS) and through the number of citations of hal9k and the publications for which it was used. While they write the proposal, the faculty dean approaches the two to tell them that based on Kim’s work, they will now negotiate about two new RSEs for the central RSE team with the university’s provost for research and plan to consider candidates with a background in researchonomics.
When they get the decision letter from the research funding organization, Kim and her co-PI are happy to learn that their new project has won the grant. The reviewers specifically point out the value of extending Kim’s Boa library to include the proposed new methods, as well as the significant reuse potential of hal9k for the researchonomic community as a direct effect of its well-engineered architecture and modularity. Additionally, they stress that it was really easy to evaluate the software due to the comprehensive test suite, documentation, and example data. In fact, during the first month of the new project, three other researchonomic research projects approach them to ask whether they can contribute to Kim’s library and offer to fund six months of RSE work for this. Kim uses this money to also parallelize hal9k together with the RSEs and works with her university’s computing center to offer it as a standard tool for researchonomic supercomputing.
While an assessment based purely on quantitative metrics would allow for seemingly objective comparisons between programs, the definition of valid and robust quantitative metrics that can be evaluated with reasonable effort is a major challenge. On the other hand, a structured qualitative assessment with scores for groups of criteria can provide a middle ground. It is clear that both preparing an application for a review against these criteria from the applicant side and the evaluation by the reviewers require significant effort. We believe that the added value significantly outweighs the investment but appropriate resources need to be factored in.
Sustainability of research software should be considered from the beginning for new projects. The criteria listed below, or a subset such as the “good enough” practices proposed by Wilson et al. [45], are valuable throughout the development process (including early phases) for almost all types of research software applications. “Classical” research funding schemes should acknowledge the need to follow best practices during the development of new software and allow factoring in appropriate resources to design and implement for sustainability. In this section, we focus on the question of which software to support in dedicated sustainability funding schemes. For such sustained funding, only software in application class 2 or 3 as defined by Schlauch et al. [62], i.e., with significant use beyond personal or institutional purposes, would likely be considered. Excellence as reflected in funded projects, publications, and software adoption, i.e., backing by a community, should be considered during selection. Nevertheless, we believe a good scheme should strike a balance between consolidating the field to few well-established software packages on one side and stimulating innovation and cooperation promoting diversity in terms of more than one monopolistic package on the other side. Last but not least, there is an inherent conflict between the long-term goals of sustainability funding for a piece of software and the necessary reevaluation to monitor the state of the software over time.
Selection Criteria
Several evaluation schemes for research software have been proposed before and led to the formulation of first recommendations [17, 18]. Gomez-Diaz & Recio suggested the CDUR scheme based on Citation, Dissemination (including aspects like license, web site, contact point), Use, and Research (output) [63]. Lamprecht et al. rephrased the FAIR data principles [14] for research software [16]. Hasselbring et al. found that the adoption of FAIR principles is different between fields with an emphasis on reuse in computer science as opposed to a reproducibility focus in computational science [15]. Fehr et al. collected a set of best practices for the setup and publication of numerical experiments [64]. Jiménez et al. boiled it down to four best practices [65]: public source code, community registry, license, and governance. Hsu et al. [66] proposed a framework of seven sustainability influences (outputs modified, code repository used, champion present, workforce stability, support from other organizations, collaboration/partnership, and integration with policy). They found that the various outputs are widely accessible but not necessarily sustained or maintained. Projects with most sustainability influences often became institutionalized and met required needs of the community [66]. In the field of open source software, the CHAOSS (Community Health Analytics Open Source Software) project has developed metrics to evaluate sustainability [67]. One objective of CHAOSS is to automatically generate project health reports based on software that evaluates the metrics, with most of the metrics already covered. The UK Software Sustainability Institute (SSI) suggested both a subjective tutorial-based and a more objective criteria-based software evaluation scheme [68], the latter being available as an online form [69]. ROpenSci [70] provides software reviews for R developers, which have been very successful in the community. The review criteria of the Journal of Open Source Software (JOSS) [71] focus on the aspects license, documentation, functionality, and tests. This list of essential items should be fulfilled by all research software that wants to be considered not only for publication but also for sustained funding.
We drew inspiration from all these works and suggest a set of criteria to base reviews for sustainable funding on. This set comprises mandatory, hard criteria that we think have to be fulfilled across domains (highlighted in italics) and additional desirable, soft criteria that can be implemented to different degrees depending on the use case and domain-specific software development requirements. The soft criteria should be evaluated in a structured way by the reviewers with a specific response for each section rather than one running text. The fact that most of these criteria will be considered in any software management plan (SMP, [72]) highlights its importance for sustainable research software.
Usage and Impact
Requirements qualifying software for sustained funding are (1) its use beyond a single research group, (2) the scientific relevance and validity of the software documented in at least one peer-reviewed scientific publication. Ideally a paper also describes the scope, performance, and design of the software. (3) The use of the software in publications is a measure of impact but quantitative assessment brings about additional challenges [29]. Therefore, other, potentially domain-specific, impact measures, such as influence on policy and practice as well as use in other software and products should be considered as well to evaluate relevance for academia and society. Considerable attendance at training and networking events can be considered as a proof of use as well. (4) A market analysis needs to show that the software is important to a user base of relevant size and either unique or one of the main players in a field with several existing solutions. Geographical or political aspects can be considered as well, e.g. to support the maintenance of a European solution. A convergence process of (parts of) a research community towards a specific software stack, i.e., documented transition of several research groups to a common software, would be a strong indicator of impact. (5) As community uptake and benefits are a central goal of sustained software funding, outreach and appropriate training material for new users of the software are essential.
Software Quality
As mandatory criteria of software quality that have to be fulfilled, we consider (6) the public availability of the source code in both a code repository and an archive (for long-term availability), developed using (7) version control with meaningful commit messages and linked to an issue tracker (ideally maintained, but at least mirrored on a public platform). (8) Documentation of the software needs to be publicly available comprising both user documentation (requirements, installation, getting started, user manual, release notes) and developer documentation (with a development guide and API documentation within the code, e.g. using Doxygen) [73]. (9) The license under which the software is distributed must be defined. Publicly funded software should be published under a Free/Libre Open Source Software (FLOSS) license by default, although exceptions to this might apply (e.g. excluding commercial use). (10) Dependencies on libraries and technologies must be defined.
We acknowledge that some additional criteria have to be evaluated under consideration of the research domain. These comprise (11) the availability of examples (comprising input data and reference results), (12) mechanisms for extensibility (software modularity) as one aspect of software architecture [74] and (13) interoperability (APIs / common and open data formats for input and output), (14) a test suite (including at least some of the following: unit tests, regression tests, integration tests, end-to-end tests, performance tests; ideally run in an automated fashion in a continuous integration environment), (15) tagged releases (considering their frequency, and availability for end users in terms of binary packages for major operating systems, or availability via package managers or containers), (16) no large-scale re-implementations for functionality for which good solutions already exist. Many of these aspects require appropriate infrastructure (see page 9).
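Criteria (11) and (14) can be illustrated with a small sketch in Python. The function normalize and the reference values below are hypothetical and not taken from any software discussed in this paper; the sketch only shows how a unit test differs from a regression test that compares against stored reference results.

```python
import unittest


def normalize(values):
    """Scale a list of numbers so that they sum to 1 (hypothetical example function)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [v / total for v in values]


# Example input and reference results as they might ship with the software
# (criterion 11). The concrete numbers are assumptions made for this sketch.
REFERENCE_INPUT = [1.0, 1.0, 2.0]
REFERENCE_OUTPUT = [0.25, 0.25, 0.5]


class NormalizeTests(unittest.TestCase):
    def test_unit_sums_to_one(self):
        # Unit test: checks a property of a single function in isolation.
        self.assertAlmostEqual(sum(normalize([3.0, 5.0, 8.0])), 1.0)

    def test_unit_rejects_zero_sum(self):
        # Unit test: invalid input must fail loudly instead of returning garbage.
        with self.assertRaises(ValueError):
            normalize([0.0, 0.0])

    def test_regression_against_reference(self):
        # Regression test: the output must still match the stored reference results.
        result = normalize(REFERENCE_INPUT)
        for got, expected in zip(result, REFERENCE_OUTPUT):
            self.assertAlmostEqual(got, expected)
```

Saved in a file, such tests can be run with `python -m unittest`, either locally or in a continuous integration environment.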
Maturity
The research software applying for sustained funding must have already reached a certain level of maturity (typically class 2 or 3 as defined by Schlauch et al. [62]). A mandatory requirement is (17) a comprehensive and up-to-date software management plan [72]. The software should (18) be maintainable with an appropriate amount of resources as detailed in a sustainability section of the software management plan. The software has (19) a well-maintained website with a clearly defined point of contact and a communication channel to inform users about news regarding the software such as new releases. Besides an active user community, sustainable software requires (20) a group of developers (i.e., definitely more than 1 developer) documented, e.g. by contributions to the code base or participation in documented, public discussions or issue tracking. Another criterion is (21) whether potential contributors are invited to participate in a clearly defined process (e.g., a CONTRIBUTING document). The group of developers should have defined a governance model for their project and easy ways for users to provide input regarding their needs.
Recommendations
Given the diversity in the software technology landscape, and the domain-specific software development cultures [75], some of the above-mentioned criteria have to be evaluated against domain-specific requirements. Therefore, we highly recommend basing the selection process on a combination of (1) a software quality-based review and (2) a domain-specific scientific review. In particular, the former should ideally be performed by a central institution (e.g. at funding bodies or other independent agencies such as a software sustainability institute). Only criteria for which improvement truly advances the value of the software should be considered in evaluation schemes, i.e., no criteria that can be gamed. After rejecting software not fulfilling the mandatory criteria in a first stage of the review process, the second stage of the selection process should be realized as a transparent procedure ideally allowing the reviewers to interact with the PIs of the software (e.g. remote meetings, forum-like discussions) and put the software quality and development efforts into the domain-specific context. The outcome of this second stage should be a structured review assessing each criterion explicitly and a rating for each of the dimensions Usage and Impact, Software Quality, and Maturity.
For sustained software funding, it is important to audit the performance, relevance, impact, progress, and level of sustainability of funded software after reasonable time frames. Such a reevaluation should revisit the criteria under consideration of evolving software technology and scientific standards, without requiring that a completely new proposal be submitted. We envision funding periods of 5 years to provide sufficient security for funded software projects, while allowing for adaptation of the portfolio of funded software to novel research directions and community needs. Failure to meet the reevaluation criteria should lead to the decision to phase out sustainable funding. The phase-out process may come with a 1-year funding program based on a consolidation plan with clear goals regarding the archiving and preservation of the software, documentation, and all existing resources.
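The two-stage procedure recommended in this section can be sketched as follows. This is a minimal illustration in Python, not a proposed implementation: the class and function names, the 1 to 5 rating scale, and the use of a mean per dimension are assumptions made for the example. The set of mandatory criteria below contains only those the text explicitly phrases as requirements or mandatory, and treating exactly this set as the stage-one filter is likewise an assumption.

```python
from dataclasses import dataclass, field
from statistics import mean

# Criteria numbers the text explicitly phrases as requirements or mandatory.
# Using exactly this set as the stage-one filter is an assumption of this sketch.
MANDATORY = {1, 2, 6, 7, 8, 9, 10, 17}

DIMENSIONS = ("Usage and Impact", "Software Quality", "Maturity")


@dataclass
class Application:
    name: str
    fulfilled: set                               # numbers of the criteria that are met
    ratings: dict = field(default_factory=dict)  # dimension -> list of reviewer scores (1..5)


def stage_one(app):
    """Return the mandatory criteria that are not fulfilled (empty set means pass)."""
    return MANDATORY - app.fulfilled


def stage_two(app):
    """Return one rating per dimension, here the mean of the reviewers' scores."""
    return {dim: mean(app.ratings[dim]) for dim in DIMENSIONS}


def review(app):
    """Reject on unmet mandatory criteria, otherwise return the structured ratings."""
    missing = stage_one(app)
    if missing:
        return {"decision": "rejected", "missing": sorted(missing)}
    return {"decision": "reviewed", "ratings": stage_two(app)}


# Hypothetical application used only to exercise the functions above.
example = Application(
    name="example-tool",
    fulfilled={1, 2, 6, 7, 8, 9, 10, 17, 20},
    ratings={
        "Usage and Impact": [4, 5],
        "Software Quality": [3, 4],
        "Maturity": [4, 4],
    },
)
print(review(example))
```

A real review would of course attach a written assessment to each criterion; the sketch only captures the control flow of rejecting first and rating afterwards.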
Who Sustains Research Software?
Research relies on software and software relies on the people developing and maintaining it. Sustainable research requires sustainable software, and this in turn requires continuity for those who develop and maintain it.
Requirements
Possibly the most important demand is the need for an increase in recognition and awareness of research software as a first-class citizen in research [18, 76, 77]. For sustainability of research software, long-term commitments of the respective software leads are crucial, but very few professional RSE profiles currently exist. In consequence, it is essential to create career paths for RSEs that are attractive and include permanency perspectives. While creating permanent positions in the German academic system below the faculty level is an actively discussed topic overall [78], we specifically focus on the needs originating from the development and maintenance of research software here.
As already mentioned, research software development not only requires domain expertise, but also software development education, skills, and competencies. Currently, most of the domain researchers developing and maintaining domain-specific software technology never received professional training on software development [3, 43]. To enhance the productivity and sustainability of computer-based research, it is essential to integrate software development training into the education of domain researchers.
Currently, a significant portion of the existing research software is developed by individuals or in small groups, primarily to serve their own requirements. This situation is unsatisfactory in terms of collaboration and inefficient in terms of several groups spending resources on generating similar or even the same functionality. To enable and promote synergies, it is important to allocate resources for research software development and to build communities, as described in [79].
Kim wants to broaden her research portfolio within researchonomics and applies for postdoctoral positions at other institutions. Her library hal9k is growing in popularity within researchonomics, and she wants to continue working on it. As her university has adopted an open science policy, hal9k is free software under a Free/Libre Open Source Software (FLOSS) license, and Kim is free to continue her work on the library even after moving away from UofA. Due to her involvement in the creation of hal9k as well as her previous success in attracting funding, Kim has the choice between multiple, attractive positions and decides to move to the researchonomics group at Eden University (EdU). She has already extended hal9k in multiple directions in the past and plans to continue this work at EdU. Her group leader at EdU would like to continue funding her but due to a law called the Fixed-term Research Contract Bill, EdU is not allowed to extend her contract, and neither third-party funding for her own position nor a permanent position are available. After having developed a now widely-used research tool, several publications in software and paper form, as well as having attracted funding, Kim finds herself looking for a job again.
Challenges
We are currently facing a lack of awareness for the importance of research software as discussed above. Moreover, there is little recognition for the efforts put into software development and maintenance. In consequence, software development in academic settings is mostly considered a means to an end: sustainability is often not considered in project planning and grant proposals, and software work contributes little to progressing research careers [4, 80]. The main challenge here is the continued use of metrics that primarily leverage traditionally published articles and article citation numbers.
In academia, developers of research software are typically domain researchers, and in particular if new areas are explored, the software development process itself has research character. Obviously, developing research software requires not only domain knowledge but also software development skills, and the researchers leading the software development process are often domain experts with substantial software development experience, making them extremely valuable members of the research community. However, the current academic system in Germany does not provide a defined RSE role. Limited-term positions are, at least currently within the main German academic system, often effectively the end of their career path, sometimes even a dead end. The challenge here is the lack of available permanent positions within the non-professorial academic faculty (“Mittelbau”) in Germany, compounded by a lack of access to these few permanent positions for RSEs due to the already mentioned lack of recognition for efforts concerning research software for faculty appointments within domain sciences.
In order to develop sustainable software, researchers need to have the skills and expertise to build software that is easy to maintain and extend [81]. However, most of the researchers are self-taught developers [3, 43]. Ideally, these skills have to be built into the domain science curricula, which could generally be done in two different ways (or a combination of them). One obvious approach is to offer additional courses that focus on these topics. The main challenge here is to decide which other topic(s) to possibly drop due to the limited volume of any given curriculum. A different approach is to incorporate software-related topics into existing domain science courses. While this would provide the benefit of showcasing the usage of specific software skills directly within the domain science, the challenge here is the amount of work necessary to change existing lecture material, let alone the need of the lecturers to acquire those skills themselves in the first place.
As long as the necessary software skills within domain sciences are not yet widespread, building a network from those that have acquired relevant skills is difficult. Community efforts that concentrate on questions regarding research software can help to fill this gap. Examples of such efforts include the Software Carpentries, national and international RSE societies (e.g., within Germany de-RSE e.V.). However, since research software is such an interdisciplinary topic, it is hard to get recognition and find funding within any specific discipline. As a result, existing communities often have to rely heavily on volunteers. This is challenging, because despite benefits to domain science, volunteers hardly receive recognition for their work “back home”, i.e., within their domain, underlining again the importance of our first demand.
Recommendations
Increasing recognition and awareness is a challenge that calls for both immediate action and perseverance. Nevertheless, some measures will likely show positive effects comparatively soon. Similarly to plans for research data management, funding agencies should request that applicants include considerations about how software developed in a project can be sustained beyond the end of the funded project. A follow-up on these plans during and after the project lifetime, i.e., a dedicated software management plan, is crucial.
Another recommendation is aimed at decision makers concerning recruitment for academic positions: broaden the definition of research impact beyond traditional scientific publications to also include other impactful results. Not all researchers thinking of themselves as RSEs pursue a faculty position as their main career goal. However, permanent academic non-faculty positions are rare within the German academic system, also due to the lack of a defined RSE role. We recommend that research institutions leverage the benefit of dedicated RSEs by establishing attractive long-term career options in the academic environment. The long-term solution in order to gain sufficient software development skills should be education that is included early in the career path, ideally already at the Bachelor level. For the time being, however, efforts involving workshops and seminars that provide easy access to hands-on training on software-related questions should be promoted and supported as much as possible.
It is important to provide an environment where communities can form and flourish by allocating resources for research software development and for building communities around it [65, 79, 82]. The identification with a community of like-minded people and personal action [83] can lead to a permanent establishment of sustainable research software as a valuable research output. Thus, research institutions as well as funding agencies should not only be open-minded regarding existing volunteer organizations, but should actively promote the creation of such groups.
How can Research Software be Sustainably Funded?
Hal9k has grown into a widely used software in researchonomics, and Kim is proactively asked to apply for - and is subsequently awarded - a permanent RSE position at the institute for researchonomy at UofA, based on her work on the library. She works closely with the central RSE team, but mostly due to bureaucracy and the high demand for her library, Kim does not have enough time to maintain and further develop hal9k alone anymore. Together with the dean she develops a course for the researchonomics curriculum which teaches data processing with hal9k. As a lesson from her own career, she starts the course with sessions on the Hash shell, version control with Tig, Boa, and two whole sessions on basics of sustainable software development. This is very fruitful, and due to the implementation of a new research software funding scheme at UofA, Kim is able to hire one of the course students, who has shown great RSE skills, straight into a long-term position at her institute, where they focus on the maintenance and development of hal9k, work with the computing center to support hal9k-based supercomputing on a new, dedicated FGPA cluster, develop training materials for external users, and organize the yearly hal9k users and developers conference. Kim gets to travel the world to visit researchonomics groups who are using hal9k.
Requirements
Sustainable funding for research software boils down to funding the four main pillars enabling sustainable software development: (1) Personnel with expertise in research software development; (2) Infrastructure for developing, testing, validating, and benchmarking research software, and distributed versioning systems for collaborative software development; (3) Training in software design and sustainable software development; and (4) Community management and events for creating synergies between research groups and software efforts.
Challenges
Short-term engagement of (young) researchers raises the question of how to maintain a constant level of expertise within a developer team and prevent knowledge drain concerning domain knowledge and software engineering skills. Conversely, the permanent engagement of qualified personnel requires offering career perspectives, especially since academia competes with industry for the same people. A challenge specific to Germany is posed by the shortage of permanent positions and by the restrictions for temporary positions due to the German Wissenschaftszeitvertragsgesetz [84].
Sustainable software development requires hardware technology to develop, test, validate, and benchmark features in a continuous integration cycle. The challenge in this context is the persistent evolution of the hardware landscape. Hence, for creating an environment promoting sustainable software development, it is important to provide access to a wide hardware portfolio and to support a development cycle based on continuous integration.
Expertise in sustainable research software development is a scarce resource, and training is heavily needed as one way of building up more expertise. However, while integrating interdisciplinary software engineering courses into the education curriculum can build up basic skills, some expertise is domain-specific and requires inter-institutional training activities. Furthermore, no financial incentives exist for creating software-specific documentation and tutorials or for providing other forms of support.
While the creation of research software communities is one of the major assets in sustaining research software technology, promoting this process requires the installation of new funding instruments. Traditionally, research grants are limited to rather short time frames and support personnel, material, hardware, and to a limited degree also travel and research visits. Creating a research software community, however, requires funding for community and training events as well as “virtual hardware” such as webspace, versioning systems, task-managing systems, and compute cycles. These demands can hardly be met without third-party funding [48, 85–87].
Recommendation: Creation of Adequate Funding Schemes
Funding is a crucial factor for sustaining research software. Currently available sources and instruments are not adequately shaped for the challenges and solutions outlined above. We recommend actions on the individual, organizational, and national level.
Existing project-focused funding instruments on the local, national, and international level need to be complemented with funding instruments specifically designed for research software development and sustained research software maintenance to make research software a first-class citizen in the research landscape. For example, software projects enhancing research and fulfilling the sustainability criteria detailed in section How to Decide Which Software to Sustain? on p. 5 may be eligible for sustained funding as long as they live up to the standards and remain a central component of the research landscape.
Computing centers and supercomputing facilities for research need to receive earmarked resources for the support of sustainable software development. This funding is necessary to provide continuous integration services, a hardware portfolio for developing, testing, and benchmarking software, as well as personnel for training domain researchers in software design and the proper usage of the services.
The creation and maintenance of training materials for general research software engineering education and the creation of software-specific documentation and tutorials need to be reflected in funding opportunities. This can either happen by dedicating modules of research or software grants to providing support and the generation of training material, or by opening funding schemes focusing on interdisciplinary software development education. The latter may include research that looks at research software development as a process to analyze which measures, interactions, and team compositions make research software successful. Additionally, funding instruments fostering the formation of research software communities have to be established.
Which Infrastructure is Needed to Sustain Research Software?
As the hal9k community grows, so does the need for infrastructure. Kim and her team collaborate with the National RSE Consortium to set up hal9k on the Consortium’s distributed TigHub instance, and organize world-wide access to it via the NRSEC-AAI federation. Going forward, the Consortium’s Research Software Hub - a registry and Software Heritage Archive-based [88] long-term repository for research software on a national level - ingests hal9k releases with complete metadata: citation metadata, the hal9k provenance graph and computational environment information, ORCID iDs [89], etc. and provides its own DOIs for versions under a concept (umbrella) DOI. The community reviews all code and documentation changes that are contributed to hal9k via the central TigHub, and the Hub’s CI system Alfred builds, tests, and pushes new releases automatically to the registered supercomputing clusters. Especially the community efforts become better and more streamlined by the day, as research software development training is now offered as part of most curricula, and skilled RSEs are now much easier to find and hire by research institutions.
Project Management Tools
Research software is developed by individual researchers, in small teams within a single institution, or in larger teams distributed across multiple institutions. Particularly when software development is distributed across institutions, there is an urgent need for frameworks and tools enabling collaborative code development, software feature planning, and software management. As research software development typically includes bleeding-edge research and development that the researchers do not want to disclose for a certain time to preserve intellectual property, distributed research software development also needs a global Authentication and Authorization Infrastructure (AAI). We recommend the development and/or deployment of tools for distributed software development and software management as central research infrastructure. An important aspect in this context is the cataloging of research software to reduce the duplication of development efforts. This can be realized efficiently by promoting the registration of all research software with a unique identifier and developing a tool for exploring the research software landscape. Research software contributors should have an ORCID iD [89] to be uniquely identifiable and referable. While some funding for such tools and software repositories is emerging (e.g. the bio.tools catalogue of bioinformatics tools funded as part of the European ELIXIR project [90]), a standardized extension of such efforts to the RSE community as a whole is necessary. However, as the experiences from ELIXIR demonstrate, this is a non-trivial effort that requires significant dedicated and long-term funding.
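To illustrate what "uniquely identifiable" can mean in practice for a software registry, the following minimal sketch checks whether a contributor's ORCID iD is well formed before it is stored. It assumes the publicly documented ORCID structure (16 characters in four hyphen-separated groups, the last character being an ISO 7064 MOD 11-2 check digit); the function names are hypothetical and not part of any existing registry.

```python
import re

# Pattern for the hyphenated form, e.g. 0000-0002-1825-0097.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has the expected shape and a matching check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only catches typing errors; whether the iD actually belongs to the contributor would still have to be confirmed through the AAI or the ORCID service itself.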
Developer Training, Motivation, and Knowledge Exchange
As elaborated, training in sustainable software development is key to achieve sustainability in research software. At the same time, it is not clear how such training should be facilitated and institutionalized. Furthermore, for deriving software quality standards, evaluating the quality of software, and providing a code review service, central resources are necessary that individuals and groups in the research software landscape can draw from.

We consider Software Carpentry and similar efforts like the creation of the Data Science Academy HIDA [91] in the Helmholtz Association of German Research Centers helpful solutions to exchange and distribute knowledge. Local chapters of RSE groups and (inter-)national conferences will further foster networking and community building. We strongly recommend the creation of a national Software Sustainability Institute (involving funded positions to establish web platforms and training material) similar to the existing institute in the UK [92], which serves as a national contact for all aspects related to research software. The UK SSI also publishes best practice guidelines [93] for research software engineering.

Research Software Discovery and Publication
Proper software publication and possibilities for the community to find existing software solutions for a given problem are a prerequisite to optimally exploit synergies and avoid redundant development. However, we observe that today, many funding proposals lack a thorough state-of-the-art report of software that could possibly be reused. This is most often caused by insufficient information retrieval strategies, lack of knowledge about relevant repositories, and an abundance of locations where software is collaboratively developed and stored [94]. Discovery requires publication in a globally accessible location with appropriate metadata, e.g., Citation File Format (CFF) [95] and CodeMeta [96]. Comprehensive metadata (e.g. contributors, contact, keywords, linked publications, etc.) and publishing platforms have to enable persistent citing, which in turn benefits research evaluation. Selection and curation of software (probably by a data/software librarian) for publication and discovery are certainly challenging.

We consider GitLab or GitHub appropriate collaborative working environments and repositories like Zenodo appropriate publication platforms, because the latter mint DOIs, allow versioning, and are publicly funded for long-term access. GitHub, Figshare, and Mendeley Data are examples of commercial enterprises with business cases in the background, which leverage research results. Besides the aforementioned metadata standards, it is advisable to document source code, e.g., using Markdown (with Doxygen tooling). Metadata and citations play a role in beneficial tools like PIDgraph, DataCite.org, and CrossRef, which utilize Persistent Identifiers (PIDs) like DOIs. Another route to discovery is offered by (mostly) disciplinary software indices like swMATH [97] or the Astronomy Source Code Library [98] as well as language-focused systems like CRAN [99] for R. Most of them started as national endeavors and became platforms of global importance.
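As an illustration of the citation metadata mentioned above, the following minimal sketch writes the text of a CITATION.cff file in Citation File Format 1.2.0. The keys used (cff-version, message, title, version, authors, family-names, given-names, orcid) are taken from the CFF schema; the helper name `cff_text`, the software name, and the author are hypothetical placeholders, and a real project would more likely use a YAML library or an existing CFF tool.

```python
import json


def quoted(value: str) -> str:
    """Quote a string as a YAML double-quoted scalar (JSON strings are valid YAML)."""
    return json.dumps(value, ensure_ascii=False)


def cff_text(title: str, version: str, authors: list) -> str:
    """Build minimal CITATION.cff content from a title, version, and author list.

    Each author is a dict with 'family' and 'given' keys and an optional
    'orcid' key holding the bare 16-character iD.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it using these metadata."',
        f"title: {quoted(title)}",
        f"version: {quoted(version)}",
        "authors:",
    ]
    for author in authors:
        lines.append(f"  - family-names: {quoted(author['family'])}")
        lines.append(f"    given-names: {quoted(author['given'])}")
        if author.get("orcid"):
            # CFF expects the ORCID iD as a full URL.
            lines.append(f"    orcid: {quoted('https://orcid.org/' + author['orcid'])}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example values, not a real software record.
    example = cff_text(
        title="hal9k",
        version="1.0.0",
        authors=[{"family": "Doe", "given": "Kim", "orcid": "0000-0002-1825-0097"}],
    )
    print(example)
```

Placing such a file in the root of a repository keeps the citation information versioned together with the code, so the metadata for a given release can be harvested at publication time.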
For Germany, we assume that the Nationale Forschungsdateninfrastruktur (NFDI) will put effort into creating or supporting discovery platforms at a central point that ease information retrieval. At the same time, all stakeholders should be aware of and counteract potential institutional “fear” of losing “their” data, software, and intellectual property.

Especially in interdisciplinary environments, it would be helpful to have access to a meta software repository index, similar to what re3data [100] does for research data repositories. We recommend the creation of such a meta index covering important (disciplinary) software indexes in order to ease discovery of relevant software locations. Evaluation of discovered software is an unsolved problem. Here, anonymous telemetry of usage may provide information for the selection of relevant software. Publishing software, its dependencies, and its environment in containers may also ease evaluation and further reuse. These suggestions require significant investment in long-term infrastructure. When publishing research software, it is recommended to make use of integration schemes like GitHub with Zenodo or local GitLab instances with publication platforms. Such indices and publication outlets may benefit national federated research indexing and archiving systems, similar to the hierarchy of library catalogs [101].

Archiving
Software preservation aims to extend the lifetime of software that is no longer actively maintained. There are different approaches, which vary in the effort required and the likelihood of success. Software archiving is one important aspect of software preservation: the process of storing a copy of a piece of software so that it may be referred to in the future. The publication of a certain software version for reference in research articles requires simple ways to archive research software on a long-term basis. Furthermore, its integration with collaborative software development environments such as GitLab or GitHub and with publication repositories is needed to facilitate archiving of referenced software versions based on sustainable frameworks (e.g. Invenio [102] for GitHub to Zenodo integration).

A challenge for software archiving is the need to (ideally) preserve the runtime environment and all dependencies of the software. This could improve reproducibility, especially when running the software in its original state. If research data are needed to reproduce results, they should also be archived with the software or the publication. Specialized and unique hardware - like high performance computing resources - can be part of the runtime environment, which may not be accessible in the future. To overcome this, an emulation of hardware may be a (challenging) solution. Emulation involves the encapsulation and distribution of the complete hardware and software stacks, including the operating system and driver interdependencies. This can result in intellectual property issues when offered as a service.

There are both local and global approaches to software conservation. One solution to keep the software in an executable state by preserving its context and runtime environment is to use containers such as Docker. However, to archive the Docker containers, additional metadata should be added and stored with the software in an archive container format that allows exchange between repositories and exit strategies, such as the BagIt container format [103]. Application or platform conservation is also achieved by conservational efforts where unmaintainable (virtual) machines are sandboxed to keep the platform in a secure but running state. Another threat is losing project repositories on global platforms like GitHub or BitBucket. Here, global platforms like Software Heritage [104] harvest those repositories and prevent loss by long-term archiving.
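The BagIt packaging mentioned above can be sketched with the Python standard library alone. The example below is a minimal illustration under stated assumptions, not a complete implementation of the BagIt specification (RFC 8493): it copies a directory (which could, for instance, hold a container image exported as a tar file) into the `data/` payload folder, writes a SHA-256 payload manifest and the `bagit.txt` declaration, and stores optional metadata in `bag-info.txt`. The function name `make_bag` and the example paths are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir, bag_dir, metadata=None):
    """Package the files in source_dir as a minimal BagIt bag at bag_dir."""
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    # The payload lives under data/; copytree also creates bag_dir itself.
    shutil.copytree(source_dir, data_dir)

    # One manifest line per payload file: "<checksum>  <path relative to the bag>".
    manifest_lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        manifest_lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )

    # Bag declaration required by the specification.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )

    # Optional descriptive metadata as "Label: value" lines.
    if metadata:
        info = "".join(f"{label}: {value}\n" for label, value in metadata.items())
        (bag_dir / "bag-info.txt").write_text(info, encoding="utf-8")
    return bag_dir


if __name__ == "__main__":
    # Hypothetical paths; "export" would contain the files to be archived.
    make_bag("export", "hal9k-1.0.0-bag", {"Source-Organization": "Example Institute"})
```

Because the manifest records a checksum for every payload file, a receiving repository can verify that the bag arrived intact, which is what makes the format suitable for exchange between repositories.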
Legal Aspects
More and more industrial partners enter the hal9k community, and they bring their lawyers. Together with UofA’s research software task force, the RSE team, the researchonomy institute, the corporate lawyers, and community representatives, Kim decides to create a foundation to govern hal9k and its environment: the Fullest Possible Use Foundation for Open Researchonomy, funded by the Ministry of Research and Education and a consortium of corporate partners. As a first step, they re-license hal9k under the OSI-approved MIT license.

A common situation in research software creation is that the developer has no knowledge or awareness of legal aspects and therefore did not consider them early enough. Thus, we think the main legal demands for research software development are raising awareness of legal aspects and empowering all levels of responsible persons in academia (from researchers and RSEs through PIs to research performing organizations and research funding organizations). This will hopefully lead to general legal certainty before, during, and after the research software development process and thus enable better options for collaborations between universities, non-commercial research institutions, and other national or international partners. Legal aspects always have to be considered with regard to the relevant jurisdiction. Though similar issues arise in all jurisdictions, the following will focus on the European and specifically German legal framework.

Challenges and Clarifications
Clarification of Rights
Software development is a creative activity. The main relevant law governing legal aspects is therefore copyright law. It regulates the rights and obligations of the parties involved. Chapter 8 of the German Act on Copyright and Related Rights (UrhG) contains specific provisions applicable to computer programs and is based on the EU computer programs directive. Copyright laws protecting the creator of software in similar ways exist in nearly all legal systems. It is important for the identification of rights that software, in the sense of (German) law, includes not only the source code but also the design materials [105]. The challenge in the use, distribution, and commercialization of software is to determine the chain of rights and to identify all right holders. The owner of the copyright is not necessarily the owner of the right of use. For Germany, the Copyright Act regulates the rights for employment relationships [106]. In such cases, the right of use is automatically transferred to the employer. This means that in most cases of employed software developers and research staff, the institution holds the rights of use for the software work. This is not automatically the case for students, freelancers, and individual external cooperation partners. Employment and service contracts with contributors could contain regulations regarding the transfer of rights of use. For researchers who conduct free research not subject to directives, the German constitution guarantees freedom of research, so that the rights of use for their work initially remain with the natural person. In addition to the rights of the people directly involved, other rights of third parties may also be relevant. Existing source code (e.g., other Free/Libre Open Source Software (FLOSS)), external libraries, and contributions from institutional cooperation partners are published and provided under certain licenses, and their conditions must be observed (which, due to incompatibilities even among FLOSS licenses, may well mean that individually reusable pieces of software cannot be reused together or in a new context). The nature of research careers often brings additional complications to the chain of rights. Researchers often take their software with them when they change institutions and develop it further during their career. Here, the former employer may be entitled to some rights of use. In third-party funded projects, in particular with industry but also with public funding, rules regarding rights of use are often defined. Last but not least, the software can also be affected by other (intellectual) property rights such as patents or trademarks. Software itself is usually not patentable, but it may implement a technical invention covered by patents. When using or distributing such software, an additional matching patent license may be necessary. Some licenses (for example, GNU GPLv3) automatically grant related patent licenses along with the software license. This should be considered when exploitation of the patent is planned.
Liability
Issues of warranty and liability for faulty software must be taken into account. We consider the possibilities of contractual limitation of liability in licenses. Full exclusions of liability are generally invalid under German law. Limitations of liability usually depend on the form of distribution: the limitation options are broader if the rights of use are granted free of charge, e.g. provision “as is” as defined in the BSD 3-clause license.
Ideas for Solutions
In order to meet the legal challenges mentioned, it is absolutely necessary for the software developer (team) to document the rights chain comprehensively during the software development (see, e.g., supplementary material). Contributions of individual persons must be traceable and their (labor law) status must be named. At best, contracts with rules on the transfer of rights of use should be concluded before work begins. Declarations of assignment of rights can be made for existing works. License conditions for external contributions must be evaluated with regard to further rights of use and possible sub-licensing. Contracts and funding conditions must be conscientiously documented and analyzed with regard to rules on rights of use. If different parts of the software are based on different conditions and rights of third parties, individual modules of the new software could be published under different licenses and merged accordingly.

A national research software sustainability institute could be established. This institute would support local research software task forces and thereby the respective researchers and research teams in the licensing of research software and related legal issues. For this purpose, a legal help desk would be set up, to which all members of the respective research performing organizations can apply. If researchers want to publish the research software under a Free/Libre Open Source Software license, the organization could bundle the necessary rights beforehand. This is particularly useful when teams of researchers, often international, write software. In addition, the sustainability institute may serve as a one-stop shop for the licensing of research software.
Recommendations
We see it as an essential part of the sustainability of research to enable the free distribution of research software. There are a variety of open source software licensing models (ranging from permissive to copyleft; for further information, see [51, 107, 108]). The use of an FSF- or OSI-approved FLOSS license, for example, would enable a truly free model and also reduce legal issues. We recommend that research funding organizations such as the DFG discuss whether they expect all funded software to be published under these licenses, following the paradigm of “public money, public code” [109].

Also for legal aspects, we believe it is important that all (German) research performing organizations install a research software task force, especially since the new DFG Code of Conduct [13] was released. Besides organization and bundling of technical and infrastructural support for local RSEs and researchers (see previous sections), this group should organize a local legal help desk, organize educational offers, e.g. for the legal topics presented, and (if not implemented yet) develop the software policy of the research performing organization. As an example, with the help of on-boarding processes performed by the research software task force, RSEs should be able to keep the clearance of rights as simple as possible right from the start. One way local legal help desks could structure their work is shown in the decision tree in Fig. 1. A more complete suggestion for decision trees for both legal help desks and interested RSEs can be found in the supplementary material. We suggest that the local task forces build a network with the other research performing organizations for exchange of ideas but also for generating a bottom-up strategy to organize RSE standards for Germany and beyond and possibly be the origin of the aforementioned software sustainability institute.

Figure 1. Decision tree for contributors. This tree helps to figure out whether the academic institution where the software is developed owns the intellectual property (copyright). (Nodes include: Sustainable software?; Who is contributing? External or internal, e.g. students, professor, staff; Does the institution have access rights to the extent needed?; Subject to directives [weisungsgebunden]; License agreement with contributor?; Legal setup gives further obligations, e.g. third-party finance, by-laws; Licensing planned; Go to legal; Template by institution, §69b UrhG; Documentation template. Expansion needed: depends on individual requirements. If unsure, check with the legal department or responsible person named in the policy.)
Conclusions
We find that the research software ecosystem is notoriously lacking resources despite its strategic importance. If funding and support do not improve, the success story of science based on academic research software may be at stake. We recommend the installation of infrastructure that enables sustainable software development including platforms for collaboration, continuous integration, testing, discovery, and long-term preservation. We suggest the establishment of a nationwide institution similar to the Software Sustainability Institute (SSI) to provide project consulting and code review services as well as sustainable software development training. We think that sustainable software development should become an integral component of the universities’ teaching curriculum. We encourage the research funding bodies to reflect on the licensing models for academic software development, and to decide whether the “public money, public code” paradigm justifies the requirement that all publicly funded software has to be publicly available under a Free/Libre Open Source Software (FLOSS) license. Ultimately, we strongly advise the implementation of funding schemes for sustainably supporting the development and maintenance of research software based on clear and transparent criteria, for creating incentives to produce high-quality community software, and for enabling career paths for research software engineers (RSEs).
Declarations
Glossary
domain researchers: The people doing the research to advance knowledge in a field.
general public: Lay people who do not necessarily have specific insight regarding a research domain.
geopolitical units: Governed public units, ranging from cities and councils, over federal states and countries, up to political unions such as the EU. In the context of this paper, the discussion usually focuses on the larger units (countries and political unions).
independent (open source) developers: Project-external software developers who are not employed by the institution(s) carrying out the project.
industry: Companies that conduct research or profit from available academic research software which they can directly or indirectly apply to their field.
infrastructure units: Computing centers of research bodies such as universities and other research centers, as well as high-performance computing facilities.
libraries (also registries, indices): Infrastructure units of research bodies such as universities, or independent organizations, which gather research outputs and their structured metadata, and provide indices, search, etc.
research funding organizations: Public research funding bodies but potentially also companies, foundations, associations, etc.
research performing organizations: Research groups, departments, faculties, research institutions (universities, research institutions, cross-institutional research groups, etc.), umbrella organizations, such as Helmholtz-Gemeinschaft Deutscher Forschungszentren, Max-Planck-Gesellschaft zur Förderung der Wissenschaften, Leibniz-Gemeinschaft, etc.
research leaders: Heads of research groups, such as professors and other people with staff responsibility.
research software engineers (RSEs): People creating and maintaining research software; this group ranges from research-focused software developers to software engineers with a focus on research; other definitions include other roles, such as research software managers.
Consent for Publication
Not applicable
Competing Interests
The authors declare that they have no competing interests.
Funding
The authors thank the DFG for funding a meeting (Rundgespräch, grants LO 2093/3-1 and SE 1758/6-1) during which the initial draft of this paper was created. We are particularly grateful for the support from Dr. Matthias Katerbow (DFG).

This work was additionally supported by Research Software Sustainability grants funded by the DFG [110]: Aggarwal: 390886566; PI: Zesch. Appel: 391099391; PI: Balmann. Bach & Loewe & Seemann: 391128822; PIs: Loewe / Scholze / Seemann / Selzer / Streit / Upmeier. Bader: 391134334; PIs: Bader / Gabriel / Frank. Brusch: 391070520; PI: Brusch. Druskat & Gast: 391160252; PIs: Gast / Lüdeling. Ebert: 391137747; PI: Marschall. Flemisch & Hermann: 391049448; PIs: Boehringer / Flemisch / Hermann. Hohmann: 391054082; PI: Hepp. Goth: 390966303; PI: Assaad. Grad & Weeber: 391126171; PI: Holm. Kutra: 391125810; PI: Kreshuk. Mehl & Uekermann: 391150578; PIs: Bungartz / Mehl / Uekermann. Peters-Kottig: 391087700; PIs: Gleixner / Peters-Kottig / Shinano / Sperber. Raters: 39099699; PI: Herwartz. Reina: 391302154; PIs: Ertl / Reina. Muth & Renard: 391179955; PIs: Renard / Fuchs. Ropinski: 391107954; PI: Ropinski.
Authors’ Contributions
We are a group of software-providing researchers, RSEs, and infrastructural as well as legal supporters. Initially, a group of representatives of funded projects of the first DFG sustainability call [110] met during the first German RSE conference (deRSE19) [10] in June 2019 in a grass-roots workshop on sustainable research software addressing the software-based research community. During this workshop, we realized that a lot of valuable experience and good ideas are present in the group, and we decided to start working on this paper together with other interested practitioners. We followed the generous invitation of the DFG for the above-mentioned two-day meeting at the Robert Koch Institute in Berlin in November 2019 to sharpen the focus of this paper.

The individual contributions of authors to this work are detailed below, following the CASRAI CRediT (Contributor Roles Taxonomy):
• Conceptualization: Anzt, Bach, Druskat, Loewe, Löffler, Renard, Seemann, Struck
• Funding acquisition: Loewe, Seemann
• Investigation: Bach, Druskat, Loewe, Löffler, Renard, Seemann
• Project administration: Loewe, Seemann
• Visualization: Unger, Friedl, Löffler
• Writing - original draft: Achhammer, Aggarwal, Anzt, Appel, Bach, Bader, Brusch, Druskat, Ebert, Flemisch, Friedl, Funk, Grad, Goth, Hermann, Hohmann, Kutra, Linxweiler, Loewe, Löffler, Muth, Peters-Kottig, Rack, Raters, Rave, Reina, Renard, Ropinski, Schaarschmidt, Seemann, Struck, Thiele, Uekermann, Unger, Weeber
• Writing - review & editing: Anzt, Appel, Bach, Bader, Brusch, Busse, Chourdakis, Dabrowski, Druskat, Friedl, Fritzsch, Funk, Gast, Hermann, Janosch, Loewe, Löffler, Rack, Reina, Reißig, Renard, Seemann, Seibold, Struck, Thiele, Uekermann
References
1. The Event Horizon Telescope Collaboration, Akiyama K, Alberdi A, Alef W, Asada K, Azulay R, et al. First M87 Event Horizon Telescope Results. IV. Imaging the Central Supermassive Black Hole. The Astrophysical Journal 2019 Apr;875(1):L4. http://stacks.iop.org/2041-8205/875/i=1/a=L4?key=crossref.f93f0b1565bb46b48cb890351d9fef13.
2. Nowogrodzki A. How to support open-source software and stay sane. Nature 2019 Jul;571(7763):133–134. https://doi.org/10.1038/d41586-019-02046-0.
3. Philippe O, Hammitzsch M, Janosch S, van der Walt A, van Werkhoven B, Hettrick S, et al., softwaresaved/international-survey: Public release for 2018 results; 2019. doi.org/10.5281/zenodo.2585783.
4. San Francisco Declaration on Research Assessment; 2012. sfdora.org/read.
5. Hirsch JE. An index to quantify an individual's scientific research output. PNAS 2005 Nov;102(46):16569–16572.
6. Bangerth W, Heister T. Quo Vadis, Scientific Software? SIAM News 2014;47(1):8.
7. Prins P, de Ligt J, Tarasov A, Jansen RC, Cuppen E, Bourne PE. Toward effective software solutions for big biology. Nature Biotechnology 2015 Jul;33(7):686–687.
8. Richardson C, Croucher M, Research Software Engineer: A New Career Track?; 2018. sinews.siam.org/Details-Page/research-software-engineer-a-new-career-track.
9. Brett A, Croucher M, Haines R, Hettrick S, Hetherington J, Stillwell M, et al., Research Software Engineers: State of the Nation Report 2017; 2017.
10. de-RSE e V - Society for Research Software, Konferenz für ForschungssoftwareentwicklerInnen in Deutschland; 2019. web.archive.org/web/20191213123440/de-rse.org/en/conf2019.
11. DFG, Nachhaltigkeit von Forschungssoftware; 2016.
12. DFG, Qualitätssicherung von Forschungssoftware durch ihre nachhaltige Nutzbarmachung; 2019.
13. DFG, Guidelines for Safeguarding Good Research Practice; 2019.
14. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 2016 Mar;3(1):160018. doi.org/10.1038/sdata.2016.18.
15. Hasselbring W, Carr L, Hettrick S, Packer H, Tiropanis T, FAIR and Open Computer Science Research Software; 2019. https://arxiv.org/abs/1908.05986.
16. Lamprecht AL, Garcia L, Kuzak M, Martinez C, Arcila R, Martin Del Pico E, et al. Towards FAIR principles for research software. Data Science 2019.
17. Katerbow M, Feulner G, Recommendations on the development, use and provision of Research Software; 2018. https://doi.org/10.5281/zenodo.1172988.
18. Scheliga K, Pampel H, Konrad U, Fritzsch B, Schlauch T, Nolden M, et al. Dealing with research software: Recommendations for best practices. Helmholtz Open Science Coordination Office; 2019.
19. Hatton L. The Chimera of Software Quality. Computer 2007 Aug;40(8):104–103.
20. Chang G, Roth CB, Reyes CL, Pornillos O, Chen YJ, Chen AP. Retraction. Science 2006 Dec;314(5807):1875–1875. science.sciencemag.org/content/314/5807/1875.2.
21. Matthews BW. Five Retracted Structure Reports: Inverted or Incorrect? Protein Science 2007;16(6):1013–1016. onlinelibrary.wiley.com/doi/abs/10.1110/ps.072888607.
22. Smith AM, Katz DS, Niemeyer KE, FORCE11 Software Citation Working Group. Software Citation Principles. PeerJ Computer Science 2016;2:e86. doi.org/10.7717/peerj-cs.86.
23. Merali Z. Computational Science: ...Error. Nature 2010 Oct;467(7317):775–777.
24. Barnes N. Publish Your Computer Code: It Is Good Enough. Nature 2010 Oct;467(7317):753.
25. Tse H. Computer Code: More Credit Needed. Nature 2010 Nov;468(7320):37.
26. Hafer L, Kirkpatrick AE. Assessing Open Source Software As a Scholarly Contribution. Commun ACM 2009 Dec;52(12):126–129. http://doi.acm.org/10.1145/1610252.1610285.
27. Howison J, Bullard J. Software in the Scientific Literature: Problems with Seeing, Finding, and Using Software Mentioned in the Biology Literature. Journal of the Association for Information Science and Technology 2016;67(9):2137–2155. http://dx.doi.org/10.1002/asi.23538.
28. Li K, Yan E, Feng Y. How Is R Cited in Research Outputs? Structure, Impacts, and Citation Standard. Journal of Informetrics 2017 Nov;11(4):989–
29. Li K, Chen PY, Yan E. Challenges of measuring software impact through citations: An examination of the lme4 R package. Journal of Informetrics 2019 Feb;13(1):449–461. doi.org/10.1016/j.joi.2019.02.007.
30. Park H, Wolfram D. Research Software Citation in the Data Citation Index: Current Practices and Implications for Research Software Sharing and Reuse. Journal of Informetrics 2019 May;13(2):574–582.
31. Pan X, Yan E, Cui M, Hua W. How Important Is Software to Library and Information Science Research? A Content Analysis of Full-Text Publications. Journal of Informetrics 2019 Feb;13(1):397–406.
32. Doerr A, Rusk N, Vogt N, Strack R, Tang L, Arunima S, et al. Giving Software Its Due. Nature Methods 2019 Mar;16(3):207–207. doi.org/10.1038/s41592-019-0350-x.
33. Druskat S. Software and Dependencies in Research Citation Graphs. Computing in Science & Engineering 2020 Mar;22(2):8–21. https://doi.org/10.1109/MCSE.2019.2952840.
34. Katz DS, Bouquin D, Hong NPC, Hausman J, Jones C, Chivvis D, et al., Software Citation Implementation Challenges; 2019. arxiv.org/abs/1905.08674.
35. Resnik DB, Morales M, Landrum R, Shi M, Minnier J, Vasilevsky NA, et al. Effect of impact factor and discipline on journal data sharing policies. Accountability in Research 2019;26(3):139–156.
36. Vandewalle P. Code Sharing Is Associated with Research Impact in Image Processing. Computing in Science & Engineering 2012 July;14(4):42–47.
37. Venters CC, Jay C, Lau LMS, Griffiths MK, Holmes V, Ward RR, et al. Software Sustainability: The Modern Tower of Babel. In: Proceedings of the Third International Workshop on Requirements Engineering for Sustainable Systems Co-Located with 22nd International Conference on Requirements Engineering (RE 2014), vol. 1216 Karlskrona, Sweden: CEUR-WS; 2014. p. 7–12. http://ceur-ws.org/Vol-1216/paper2.pdf.
38. Goble C. Better Software, Better Research. IEEE Internet Computing 2014 Sep;18(5):4–8.
39. Druskat S. A Proposal for the Measurement and Documentation of Research Software Sustainability in Interactive Metadata Repositories. In: Allen G, Carver J, Choi SCT, Crick T, Crusoe MR, Gesing S, et al., editors. Proceedings of the Fourth Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4), vol. 1686 Manchester, UK: CEUR-WS; 2016. http://ceur-ws.org/Vol-1686/WSSSPE4_paper_20.pdf.
40. Katz DS, Fundamentals of Software Sustainability; 2018. web.archive.org/web/20191213110119/danielskatzblog.wordpress.com/2018/09/26/fundamentals-of-software-sustainability.
41. Akhmerov A, Cruz M, Drost N, Hof C, Knapen T, Kuzak M, et al. Raising the Profile of Research Software: Recommendations for Funding Agencies and Research Institutions. NWO (The Netherlands Organisation for Scientific Research); 2019.
42. Casties R, Czmiel A, Damerow J, Ionov M, Meroño Peñuela A, Ranford S, et al., DH Research Software Engineers - For We Are Many; 2019. http://web.archive.org/web/20190829122818/https://dh-tech.github.io/dhrse-whitepaper/.
43. Wilson G, Aruliah DA, Brown CT, Hong NPC, Davis M, Guy RT, et al. Best Practices for Scientific Computing. PLoS Biology 2014 Jan;12(1):e1001745. doi.org/10.1371/journal.pbio.1001745.
44. Stodden V, Miguez S. Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research. Journal of Open Research Software 2014 Jul;2(1):e21. http://openresearchsoftware.metajnl.com/articles/10.5334/jors.ay.
45. Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, Teal TK. Good enough practices in scientific computing. PLOS Computational Biology 2017 Jun;13(6):e1005510. doi.org/10.1371/journal.pcbi.1005510.
46. The CodeRefinery Project, CodeRefinery - Lessons. http://web.archive.org/web/20180531130106/http://coderefinery.org/lessons/.
47. Li K, Lin X, Greenberg J. Software Citation, Reuse and Metadata Considerations: An Exploratory Study Examining LAMMPS. Proceedings of the Association for Information Science and Technology 2016;53(1):1–10. doi.wiley.com/10.1002/pra2.2016.14505301072.
48. Kuchinke W, Ohmann C, Stenzhorn H, Anguista A, Sfakianakis S, Graf N, et al. Ensuring sustainability of software tools and services by cooperation with a research infrastructure. Personalized Medicine 2016 Jan;13(1):43–55. doi.org/10.2217/pme.15.43.
49. Loewe A, Seemann G, Wülfers EM, Huang YL, Sánchez J, Bach F, et al. SuLMaSS - Sustainable Lifecycle Management for Scientific Software. In: E-Science-Tage 2019: Data to Knowledge; 2019. dx.doi.org/10.11588/HEIDOK.00026843.
50. Druskat S, Krause T, Lüdeling A, Gast V. Infrastrukturstrategien für nachhaltige Forschungssoftware in befristeten Projekten. In: deRSE19 - Conference for Research Software Engineers in Germany. Potsdam, Germany; 2019. https://doi.org/10.6084/m9.figshare.11277764.v1.
51. Morin A, Urban J, Sliz P. A Quick Guide to Software Licensing for the Scientist-Programmer. PLOS Computational Biology 2012 Jul;8(7):e1002598. journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002598.
52. Katz DS, Ramnath R, Looking at Software Sustainability and Productivity Challenges from NSF; 2015. http://arxiv.org/abs/1508.03348.
53. Chan Zuckerberg Initiative, Essential Open Source Software for Science. web.archive.org/web/20191213112602/chanzuckerberg.com/rfa/essential-open-source-software-for-science.
54. de-RSE e V - Society for Research Software, de-RSE.org - Research Software Engineers (RSEs) - The People behind Research Software.
55. The US Research Software Engineer Association, US-RSE. web.archive.org/web/20191213123453/us-rse.org.
56. The Netherlands Research Software Engineer community, The Netherlands Research Software Engineer Community. web.archive.org/web/20191213123459/nl-rse.org.
57. Rodríguez-Sánchez F, Marwick B, Lazowska E, VanderPlas J. Academia’s Failure to Retain Data Scientists. Science 2017 Jan;355(6323):357–358. science.sciencemag.org/content/355/6323/357.3.
58. Katz DS, Druskat S, Haines R, Jay C, Struck A. The State of Sustainable Research Software: Learning from the Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE5.1). Journal of Open Research Software 2019 Apr;7(1):11. openresearchsoftware.metajnl.com/articles/10.5334/jors.242.
59. Druskat S, Katz DS. Mapping the Research Software Sustainability Space. In: 2018 IEEE 14th International Conference on E-Science (e-Science); 2018. p. 25–30. doi.org/10.1109/eScience.2018.00014.
60. Ye Y, Boyce RD, Davis MK, Elliston K, Davatzikos C, Fedorov A, et al., Open Source Software Sustainability Models: Initial White Paper from the Informatics Technology for Cancer Research Sustainability and Industry Partnership Work Group; 2019.
61. Hong NC. Minimal information for reusable scientific software. In: 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2); 2014. figshare.com/articles/Minimal_information_for_reusable_scientific_software/1112528.
62. Schlauch T, Meinel M, Haupt C. DLR Software Engineering Guidelines. Deutsches Zentrum für Luft- und Raumfahrt (DLR); 2018.
63. Gomez-Diaz T, Recio T. On the evaluation of research software: the CDUR procedure. F1000Research 2019 Aug;8:1353. doi.org/10.12688/f1000research.19994.1.
64. Fehr J, Heiland J, Himpe C, Saak J. Best practices for replicability, reproducibility and reusability of computer-based experiments exemplified by model reduction software. AIMS Mathematics 2016;1(3):261–281. doi.org/10.3934/math.2016.3.261.
65. Jiménez RC, Kuzak M, Alhamdoosh M, Barker M, Batut B, Borg M, et al. Four simple recommendations to encourage best practices in research software. F1000Research 2017 Jun;6:876. doi.org/10.12688/f1000research.11407.1.
66. Hsu L, Hutchison VB, Langseth ML.
Measuring sustain-ability of seed-funded earth science informatics projects.PLOS ONE 2019 Oct;14(10):e0222807. doi.org/10.1371/journal.pone.0222807 .67. CHAOSS, CHAOSS Metrics; 2019. chaoss.community/metrics .68. Jackson M, Crouch S, Baxter R, Software Evaluation Guide;2019. .69. SSI, Online sustainability evaluation; 2019. .70. rOpenSci, Anderson B, Chamberlain S, Krystalli A, MullenL, Ram K, et al. Software Peer Review, Why? What?In: rOpenSci Packages: Development, Maintenance,and Peer Review Zenodo; 2019. https://doi.org/10.5281/zenodo.2554759 .71. Review Criteria - JOSS documentation;. http://web.archive.org/web/20200317125049/https://joss.readthedocs.io/en/latest/review_criteria.html .72. SSI, Writing and using a software managementplan; 2019. .73. Lee BD. Ten simple rules for documenting scien-tific software. PLOS Computational Biology 2018Dec;14(12):e1006561. doi.org/10.1371/journal.pcbi.1006561 .74. Venters CC, Capilla R, Betz S, Penzenstadler B, Crick T,Crouch S, et al. Software sustainability: Research andpractice from a software architecture viewpoint. Journalof Systems and Software 2018 Apr;138:174–188. doi.org/10.1016/j.jss.2017.12.026 .75. Johanson A, Hasselbring W. Software engineering forcomputational science: Past, present, future. Computingin Science & Engineering 2018;20(2):90–109.76. Akhmerov A, Cruz M, Drost N, Hof C, Knapen T, Kuzak M,et al., Making Research Software a First-Class Citizen inResearch; 2019. https://zenodo.org/record/2647436 .77. Chue Hong N, Making Software A First-Class Citizen; 2019. https://figshare.com/articles/Making_Software_A_First-Class_Citizen/9862835 .78. Vereinigung der Kanzlerinnen und Kanzler der Uni-versitäten Deutschlands, Bayreuther Erklärung zubefristeten Beschäftigungsverhältnissen mit wis-senschaftlichem und künstlerischem Personal in Univer-sitäten; 2019. .79. Katz DS, McInnes LC, Bernholdt DE, Mayes AC, HongNPC, Duckles J, et al. 
Community Organizations: Chang-ing the Culture in Which Research Software Is Developedand Sustained. Computing in Science & Engineering 2019March;21(2):8–24.80. Science Guide, Room for everyone’s talent; 2019. .81. Carver JC, Hong NPC, Thiruvathukal GK. Software engi-neering for science. CRC Press; 2016.82. Iaffaldano G, Steinmacher I, Calefato F, Gerosa M, Lanu-bile F, Why do developers take breaks from contributingto OSS projects? A preliminary analysis; 2019.83. Allen A, Aragon C, Becker C, Carver J, Chis A, Combemale B,et al. Engineering Academic Software (Dagstuhl Perspec-tives Workshop 16252). Dagstuhl Manifestos 2017;6(1):1–20. drops.dagstuhl.de/opus/volltexte/2017/7146 .84. Bundesministerium der Justiz und für Verbraucherschutz,Gesetz über befristete Arbeitsverträge in der Wissenschaft;2017. .85. Chang V, Mills H, Newhouse S. From Open Source tolong-term sustainability: Review of Business Models andCase studies. In: Proceedings of the UK e-Science AllHands Meeting 2007 University of Edinburgh/Universityof Glasgow (acting through the NeSC); 2007. eprints.leedsbeckett.ac.uk/649 .86. Aartsen W, Peeters P, Wagers S, Williams-Jones B. GettingDigital Assets from Public–Private Partnership ResearchProjects through “The Valley of Death, ” and MakingThem Sustainable. Frontiers in Medicine 2018 Mar;5:65. doi.org/10.3389/fmed.2018.00065 .87. Gabella C, Durinx C, Appel R. Funding knowledgebases:Towards a sustainable funding model for the UniProt usecase. F1000Research 2018 Mar;6:2051. doi.org/10.12688/f1000research.12989.2 .88. Abramatic JF, Di Cosmo R, Zacchiroli S. Building theUniversal Archive of Source Code. Communications ofthe ACM 2018 Sep;61(10):29–31. https://doi.org/10.1145/3183558 .89. ORCID; 2019. orcid.org/ .90. Ison J, Rapacki K, Ménager H, Kalaš M, Rydza E, Chmura P,et al. Tools and Data Services Registry: A Community Ef-fort to Document Bioinformatics Resources. Nucleic AcidsResearch 2016 Jan;44(Database issue):D38–D47. 
https://doi.org/10.1093/nar/gkv1116 .91. HIDA;. .92. Institute USS, UK Software Sustainability Institute; 2019. .93. Institute USS, Software Systems Development LifeCycle(SDLC); 2019. .94. Struck A. Research Software Discovery: An Overview.In: 2018 IEEE 14th International Conference on e-ScienceIEEE; 2018. doi.org/10.1109/escience.2018.00016 .95. Druskat S, Spaaks JH, Chue Hong N, Haines R, Baker J,Citation File Format (CFF) - Specifications; 2019. doi.org/10.5281/zenodo.3515946 .96. The CodeMeta Project, The CodeMeta Project; 2019. codemeta.github.io . | de-RSE Position Paper
97. swMATH, An information service for mathematical soft-ware; 2019. swmath.org .98. ASCL net, Astrophysics Source Code Library; 2019. ascl.net .99. The Comprehensive R Archive Network (CRAN);. cran.r-project.org .100. re3data.org – Registry of Research Data Repositories;. .101. Mönnich MW. KVK - a meta catalog of libraries. LIBERQuarterly 2001;11(2):121–127.102. CERN; 2019. invenio-software.org .103. Kunze J, Scancella J, Adams C, Littman J. The bagIt filepackaging format (v1. 0). RFC Editor; 2018.104. Software Heritage;. .105. Bundesministerium der Justiz und für Verbraucherschutz,§69a subsection (1) UrhG; 2014. .106. Bundesministerium der Justiz und für Verbraucherschutz,§69b UrhG; 2014. .107. ifrOSS, Lizenz Center; 2019. ifross.github.io/ifrOSS/Lizenzcenter .108. tl;drLegal, Software Licenses in Plain English; 2019. tldrlegal.com .109. Free Software Foundation Europe, Public Money PublicCode; 2017. publiccode.eu .110. DFG, Geförderte Projekte Ausschreibung NachhaltigkeitForschungssoftware; 2018. gepris.dfg.de/gepris/suche/projekt/research_software . nzt, Bach, Druskat, Löffler, Loewe, Renard, Seemann, Struck et al. | Supplementary Material
Decision Trees for Legal Topics
The decision trees presented here are intended to help legal help desks and developers identify risks regarding the mandate of the software. In a perfect world, one would address the legal aspects at the start of a project. It is crucial to know about these aspects in order to create sustainable software. We strongly recommend documenting the answers and outcomes. Please keep in mind that only restrictions from copyright law are addressed. In some projects, you might also have to consider patents, trademarks, etc.
Before you can publish, use, and/or license software, you have to check:
• The policy of the institution (Fig. S1)
• The rights restrictions imposed by the persons who "create" the software (Fig. S2)
• The rights restrictions imposed by the environment (Fig. S2)
• Whether third-party code is incorporated (Fig. S3)
We also built a tree for the scenario in which you have to check for already existing software (Fig. S4).
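The recommendation to document answers and outcomes could be followed with a small structured record, for example. The following is only a minimal sketch: the names `CheckRecord`, `LegalChecklist`, `record`, and `open_topics` are hypothetical and not part of the paper, and the example answer and outcome strings are placeholders. It only mirrors the four checks listed above and does not encode the decision trees themselves.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The four checks listed above, paired with the figure that covers each one.
CHECKS = [
    ("Policy of the institution", "Fig. S1"),
    ("Rights restrictions imposed by the persons who create the software", "Fig. S2"),
    ("Rights restrictions imposed by the environment", "Fig. S2"),
    ("Incorporated third-party code", "Fig. S3"),
]


@dataclass
class CheckRecord:
    """One documented check (hypothetical structure, for illustration only)."""
    topic: str
    figure: str
    answer: Optional[str] = None   # free-text answer; None until the check is done
    outcome: Optional[str] = None  # e.g. "go to legal department"


@dataclass
class LegalChecklist:
    """Collects the answers and outcomes for one software project."""
    project: str
    records: List[CheckRecord] = field(
        default_factory=lambda: [CheckRecord(t, f) for t, f in CHECKS]
    )

    def record(self, topic: str, answer: str, outcome: str) -> None:
        """Store the answer and outcome for the check with the given topic."""
        for rec in self.records:
            if rec.topic == topic:
                rec.answer, rec.outcome = answer, outcome
                return
        raise KeyError(topic)

    def open_topics(self) -> List[str]:
        """Return the topics that have no documented outcome yet."""
        return [rec.topic for rec in self.records if rec.outcome is None]


# Usage: document one check and list what is still open.
checklist = LegalChecklist("example-project")
checklist.record(
    "Policy of the institution",
    answer="A policy on intellectual property exists",  # placeholder answer
    outcome="go to legal department",                    # placeholder outcome
)
print(checklist.open_topics())
```

Keeping such a record in the project repository is one possible way to make the answers traceable later; the actual documentation template referred to in the figures may look different.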
[Figure S1: decision tree ①. Node labels: "Is there a policy regarding Intellectual Property? (copyright)"; "Restriction on development?"; "Restriction on publication?"; "on infrastructure"; "on people"; "on code"; "(internal) review process"; "license"; branches labelled YES / NO.]
Figure S1. Policy. This tree recommends closely checking any policies implemented in the software developers' organization.
[Figure S2: decision tree ②. Node labels: "Sustainable software?"; "Who is contributing?"; "External"; "Internal (e.g. students, professor, staff...)"; "Does the institution have access rights to the extent needed?"; "Subject to directives [German: weisungsgebunden]"; "Maybe later?"; "License agreement? (with contributor)"; "Legal setup gives further obligations [3rd party finance, by-laws...]"; "Go to legal"; "Licensing planned"; "Template by institution"; "§69b UrhG"; "Documentation template"; branches labelled YES / NO. Note in figure: "Expansion needed: depends on individual requirements. If unsure, check with the legal department or responsible person named in the policy."]
Figure S2. Contributors. This tree helps to find out whether the academic institution where the software development is located is the owner of the intellectual property (copyright).
If the outcome is a prohibition sign, we believe there is no other solution than to rewrite parts of the code or the whole code. If the outcome is a green checkmark, we believe you have the rights you need to proceed. The other outcomes are self-explanatory (e.g. go to legal department).
[Figure S3: decision tree ③. Node labels: "Are you going to use 100% internal code?"; "Go to Who is contributing"; "License of incorporated code"; "Proprietary"; "License with supplier / manufacturer"; "License compliance"; "Matrix on compatibility; Checklist license obligations"; "Documentation template"; "Check ② to see if copyright belongs to your institution."; "Check ② who is contributing."; branches labelled YES / NO.]
Figure S3. Code history. The code history tree points out tasks for projects that incorporate existing code.
[Figure S4: decision tree ④. Node labels: "Do you want to distribute source code?"; "Legal obligations? See ①, ②, and ③"; "Freedom to choose license"; "② or ③ = [icon] OR Questions about ①"; "①, ② and ③ = [icon]"; "Go to legal"; "Open access (no source)?"; "Contact technology transfer office or patent utilization agency"; "License short list"; "Documentation template"; "Commercial"; "Dual/Multi"; "Free"; "FLOSS or (just) FREE"; branches labelled YES / NO.]