Analysing the Requirements for an Open Research Knowledge Graph: Use Cases, Quality Requirements and Construction Strategies
Arthur Brack, Anett Hoppe, Markus Stocker, Sören Auer, Ralph Ewerth
Abstract
Current science communication has a number of drawbacks and bottlenecks which have been the subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KG) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.
Keywords scholarly communication · research knowledge graph · design science research · requirements analysis

Arthur Brack, E-mail: [email protected]
Anett Hoppe, E-mail: [email protected]
Markus Stocker, E-mail: [email protected]
Sören Auer, E-mail: [email protected]
Ralph Ewerth, E-mail: [email protected]
TIB – Leibniz Information Centre for Science and Technology, Hannover, Germany
L3S Research Center, Leibniz University, Hannover, Germany
1 Introduction

Today's scholarly communication is a document-centred process and, as such, rather inefficient. Scientists spend considerable time finding, reading and reproducing research results from PDF files consisting of static text, tables, and figures. The explosion in the number of published articles [14] aggravates this situation further: It gets harder and harder to stay on top of current research, that is, to find relevant works, compare and reproduce them and, later on, to make one's own contribution known for its quality.

Some of the available infrastructures in the research ecosystem already use knowledge graphs (KG) to enhance their services. Academic search engines, for instance, such as the Microsoft Academic Knowledge Graph [37] or the
Literature Graph [1] utilise metadata-based graph structures which link research articles based on citations, shared authors, venues and keywords.

Recently, initiatives have promoted the usage of KGs in science communication, but on a deeper, semantic level [3,49,55,72,77,83,112]. They envision the transformation of the dominant document-centred knowledge exchange into knowledge-based information flows by representing and expressing knowledge through semantically rich, interlinked KGs. Indeed, they argue that a shared structured representation of scientific knowledge has the potential to alleviate some of science communication's current issues: Relevant research could be easier to find, comparison tables automatically compiled, own insights rapidly placed in the current ecosystem. Such a powerful data structure could, more than the current document-based system, also encourage the interconnection of research artefacts such as datasets and source code much more than current approaches (like Digital Object Identifier (DOI) references etc.), allowing for easier reproducibility and comparison. To come closer to the vision of knowledge-based information flows, research articles should be enriched and interconnected through machine-interpretable semantic content.

Acknowledging that knowledge graph is vaguely defined, we adopt the following definition: A knowledge graph (KG) consists of (1) an ontology describing a conceptual model (e.g. with classes and relation types), and (2) the corresponding instance data (e.g. objects, literals, and <subject, predicate, object>-triples) following the constraints posed by the ontology (e.g. instance-of relations). The construction of a KG involves ontology design and population with instances.
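As a minimal illustration of this working definition (all class, relation, and instance names below are invented for illustration), a KG can be sketched in plain Python as an ontology plus instance data, with a simple check of the constraints the ontology poses on triples:

```python
# Minimal sketch of the KG definition above: an ontology (classes and
# relation types with domain/range constraints) plus instance data as
# <subject, predicate, object> triples. All names are hypothetical.

ontology = {
    "classes": {"Paper", "Dataset", "Task"},
    "relations": {  # relation type -> (domain class, range class)
        "evaluates-on": ("Paper", "Dataset"),
        "addresses": ("Paper", "Task"),
    },
}

instance_of = {  # instance data: instance-of assertions
    "paper:123": "Paper",
    "dataset:SciERC": "Dataset",
    "task:NER": "Task",
}

triples = [  # instance data: <subject, predicate, object> triples
    ("paper:123", "evaluates-on", "dataset:SciERC"),
    ("paper:123", "addresses", "task:NER"),
]

def conforms(triple):
    """Check a triple against the ontology's domain/range constraints."""
    s, p, o = triple
    if p not in ontology["relations"]:
        return False
    domain, range_ = ontology["relations"][p]
    return instance_of.get(s) == domain and instance_of.get(o) == range_

print(all(conforms(t) for t in triples))  # → True
```

Ontology design then amounts to choosing the classes and relation types; population amounts to adding conforming instance data.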
The usage of Papers With Code [79] in the machine learning community and Jaradeh et al.'s study [55] indicate that authors are also willing to contribute structured descriptions of their research articles.

The work of a researcher is manifold, but current proposals usually focus on a specific use case (e.g. the aforementioned examples focus on enhancing academic search). In this paper, we present a detailed analysis of common literature-related tasks in a scientist's daily life and analyse (a) how they could be supported by an ORKG, (b) what requirements result for the design of (b1) the KG and (b2) the surrounding system, and (c) how different use cases overlap in their requirements and can benefit from each other. Our analysis is led by the following research questions:

1. Which use cases should be supported by an ORKG?
   (a) Which user interfaces are necessary?
   (b) Which machine interfaces are necessary?
2. What requirements can be defined for the underlying ontologies to support these use cases?
   (a) Which granularity of information is needed?
   (b) To what degree is domain specialisation needed?
3. What requirements can be defined for the instance data in the context of the respective use cases?
   (a) Which completeness is sufficient for the instance data?
   (b) Which correctness is sufficient for the instance data?
   (c) Which approaches (human vs. machine) are suitable to populate the ORKG?

We follow the design science research (DSR) methodology [50]. In this study, we focus on the first phase of DSR and conduct a requirements analysis. The objective is to chart necessary (and desirable) requirements for successful KG-based science communication and, consequently, provide a map for future research.

Compared to our paper at the 24th International Conference on Theory and Practice of Digital Libraries 2020 [16], this journal paper has been modified and extended as follows: The related work section is updated and extended with the new sections Quality of knowledge graphs and Systematic literature reviews. The new Appendix A contains comparative overviews of datasets for research knowledge graph population tasks such as sentence classification, relation extraction, and concept extraction. To be consistent with the terminology in related work, we use the term "completeness" instead of "coverage" and "correctness" instead of "quality". The requirements analysis in Section 3 is revised and contains more details, with more justifications for the posed requirements and approaches.

The remainder of the paper is organised as follows. Section 2 summarises related work on research knowledge graphs, scientific ontologies, KG construction, data quality requirements, and systematic literature reviews. The requirements analysis is presented in Section 3, while Section 4 discusses implications and possible approaches for ORKG construction. Finally, Section 5 concludes the requirements analysis and outlines areas of future work. Appendix A contains comparative overviews for the tasks of sentence classification, relation extraction, and concept extraction.
2 Related work

This section gives a brief overview of (a) existing research KGs, (b) ontologies for scholarly knowledge, (c) approaches for KG construction, (d) quality dimensions of KGs, and (e) processes in systematic literature reviews.

2.1 Research knowledge graphs

Academic search engines (e.g. Google Scholar, Microsoft Academic, Semantic Scholar) exploit graph structures such as the Microsoft Academic Knowledge Graph [37], SciGraph [110], the Literature Graph [1], or the Semantic Scholar Open Research Corpus (S2ORC) [69]. These graphs interlink research articles through metadata, e.g. citations, authors, affiliations, grants, journals, or keywords.

To help reproduce research results, initiatives such as Research Graph [2], Research Objects [7] and OpenAIRE [72] interlink research articles with research artefacts such as datasets, source code, software, and video presentations. Scholarly Link Exchange (Scholix) [20] aims to create a standardised ecosystem to collect and exchange links between research artefacts and literature.

Some approaches connect articles at a more semantic level: Papers With Code [79] is a community-driven effort to supplement machine learning articles with tasks, source code and evaluation results to construct leaderboards. Ammar et al. [1] link entity mentions in abstracts with DBpedia [65] and the Unified Medical Language System (UMLS) [11], and Cohan et al. [23] extend the citation graph with citation intents (e.g. citation as background or used method). Various scholarly applications benefit from semantic content representation, e.g. academic search engines by exploiting general-purpose KGs [109], and graph-based research paper recommendation systems [8] by utilising citation graphs and mentioned entities. However, the coverage of science-specific concepts in general-purpose KGs is rather low [1], e.g.
the task "geolocation estimation of photos" from Computer Vision is neither present in Wikipedia nor in the Computer Science Ontology (CSO) [94].

2.2 Scientific ontologies

Various ontologies have been proposed to model metadata such as bibliographic resources and citations [82]. Iniesta and Corcho [92] reviewed ontologies to describe scholarly articles. In the following, we describe some ontologies that conceptualise the semantic content in research articles.

Several ontologies focus on the rhetorical [106,48,27] (e.g. Background, Methods, Results, Conclusion), argumentative [103,68] (e.g. claims, contrastive and comparative statements about other work) or activity-based [83] (e.g. sequence of research activities) structure of research articles. Others describe scholarly knowledge with linked entities such as problem, method, theory, statement [49,19], or focus on the main research findings and characteristics of research articles described in surveys, with concepts such as problems, approaches, implementations, and evaluations [39,104].

Various domain-specific ontologies exist, for instance, for mathematics [64] (e.g. definitions, assertions, proofs), machine learning [61,73] (e.g. dataset, metric, model, experiment), and physics [95] (e.g. formation, model, observation). The EXPeriments Ontology (EXPO) is a core ontology for scientific experiments that conceptualises experimental design, methodology, and results [97].

Taxonomies for domain-specific research areas support the characterisation and exploration of a research field. Salatino et al. [94] give an overview, e.g. Medical Subject Headings (MeSH), Physics Subject Headings (PhySH), and the Computer Science Ontology (CSO). The Gene Ontology [26] and Chemical Entities of Biological Interest (ChEBI) [30] are KGs for genes and molecular entities.

2.3 Construction of knowledge graphs

Nickel et al. [76] classify KG construction methods into four groups: (1) curated approaches, i.e.
triples created manually by a closed group of experts, (2) collaborative approaches, i.e. triples created manually by an open group of volunteers, (3) automated semi-structured approaches, i.e. triples extracted automatically from semi-structured text via hand-crafted rules, and (4) automated unstructured approaches, i.e. triples extracted automatically from unstructured text.
Wikidata [105] is one of the most popular KGs with semantically structured, encyclopaedic knowledge curated manually by a community. As of January 2021, Wikidata comprises 92M entities curated by almost 27,000 active contributors. The community also maintains a taxonomy of categories and "infoboxes" which define common properties of certain entity types. Furthermore, Papers With Code [79] is a community-driven effort to interlink machine learning articles with tasks, source code and evaluation results. KGs such as the Gene Ontology [26] or WordNet [40] are curated by domain experts. Research article submission portals such as EasyChair require authors to provide machine-readable metadata. Librarians and publishers tag new articles with keywords and subjects [110]. Virtual research environments enable the execution of data analyses on an interoperable infrastructure and store the data and results in KGs [99].
Petasis et al. [84] present a review on ontology learning, that is, ontology creation from text, while Lubani et al. [71] review ontology population systems. Pujara and Singh [87] give an overview of the tasks involved in KG population: (a) information extraction to extract a graph from text with entity extraction and relation extraction, and (b) graph construction to clean and complete the extracted graph, as it is usually ambiguous, incomplete and inconsistent. Coreference resolution [17,70] clusters different mentions of the same entity in text, and entity linking [62] maps mentions in text to entities in the KG. Entity resolution [102] identifies objects in the KG that refer to the same underlying entity. For taxonomy population, Salatino et al. [94] provide an overview of methods based on rule-based natural language processing (NLP), clustering, and statistical methods.

The Computer Science Ontology (CSO) has been automatically populated from research articles [94]. The AI-KG was automatically generated from 333,000 research papers in the artificial intelligence (AI) domain [32]. It contains five entity types (tasks, methods, metrics, materials, others) linked by 27 relation types. Kannan et al. [57] create a multimodal KG for deep learning papers from text and images and the corresponding source code. Brack et al. [17] generate a KG for 10 different science domains with the concept types material, method, process, and data. Zhang et al. [112] suggest a rule-based approach to mine research problems and proposed solutions from research papers.
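A toy version of such a population pipeline, with its two steps of information extraction and graph construction, might look as follows. The patterns, concept types, and sample text are invented for illustration; real systems use learned extractors rather than two regular expressions:

```python
import re

# Toy KG population pipeline: (a) information extraction with
# hand-crafted rules (concept and relation extraction), and
# (b) graph construction with a naive coreference step that
# clusters mentions by lower-cased surface form.
# Patterns, concept types, and the sample text are invented.

TEXT = ("We evaluate BERT on the SciERC dataset. "
        "The BERT model outperforms the baseline.")

def normalise(mention):
    """Naive coreference resolution: one cluster per lower-cased surface form."""
    return mention.lower()

def build_graph(text):
    triples = set()
    # Concept extraction: a capitalised token followed by a cue word.
    for m in re.finditer(r"\b([A-Z][\w-]+)\s+(dataset|model)\b", text):
        triples.add((normalise(m.group(1)), "instance-of", m.group(2).capitalize()))
    # Relation extraction: an "evaluate X on the Y" pattern.
    for m in re.finditer(r"evaluate\s+(\w+)\s+on\s+the\s+(\w+)", text):
        triples.add((normalise(m.group(1)), "evaluated-on", normalise(m.group(2))))
    return triples

print(sorted(build_graph(TEXT)))
```

Because both "BERT" mentions normalise to the same cluster, the two sentences contribute facts about a single entity, illustrating why the graph construction step matters.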
Information extraction from scientific text:
Information extraction is the first step in an automatic KG population pipeline. Nasar et al. [74] survey methods for information extraction from scientific text. Beltagy et al. [9] present benchmarks for several scientific datasets, and Peng et al. [81] especially for the biomedical domain. Appendix A presents comparative overviews of datasets for the tasks of sentence classification, relation extraction, and concept extraction, respectively, in research papers.
There are datasets annotated at sentence level for several domains, e.g. biomedical [31,59], computer graphics [42], computer science [24], chemistry and computational linguistics [103], or algorithmic metadata [93]. They cover either only abstracts [31,59,24] or full articles [42,68,93,103]. The datasets differentiate between five and twelve concept classes (e.g. Background, Objective, Results). Machine learning approaches achieve F1 scores ranging from 66% to 92% on datasets consisting of abstracts, and from 51% to 78% on datasets with full papers (see Table 2).

More recent corpora, annotated at phrasal level, aim at constructing a fine-grained KG from scholarly abstracts with the tasks of concept extraction [4,43,70,15,88], binary relation extraction [70,44,4], n-ary relation extraction [58,54,56], and coreference resolution [17,25,70]. They cover several domains, e.g. material sciences [43]; computational linguistics [44,88]; computer science, material sciences, and physics [4]; machine learning [70]; biomedicine [25,56,63]; or a set of ten scientific, technical and medical domains [15,17,36]. The datasets differentiate between four and seven concept classes (like Task, Method, Tool) and between two and seven binary relation types (like used-for, part-of, evaluate-for). The extraction of n-ary relations involves the extraction of relations among multiple concepts, such as drug-gene-mutation interactions in medicine [56], experiments related to solid oxide fuel cells with the involved materials and measurement conditions in material sciences [43], or task-dataset-metric-score tuples for leaderboard construction for machine learning tasks [58].

Approaches for concept extraction achieve F1 scores ranging from 56.6% to 96.9% (see Table 4), for coreference resolution F1 scores range from 46.0% to 61.4% [17,25,70], and for binary relation extraction from 28.0% to 83.6% (see Table 3).
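For orientation, the F1 scores reported for these extraction tasks are the harmonic mean of precision and recall, typically computed over exact matches between predicted and gold-standard spans. A minimal computation with invented toy spans (not drawn from any of the cited datasets):

```python
# Span-level precision, recall, and F1 as typically reported for
# concept extraction: exact-match comparison of predicted spans
# (start, end, label) against gold-standard spans. Toy data only.

gold = {(0, 4, "Task"), (10, 18, "Method"), (25, 31, "Material")}
predicted = {(0, 4, "Task"), (10, 18, "Tool"), (40, 44, "Method")}

true_positives = len(gold & predicted)  # exact span + label matches
precision = true_positives / len(predicted)
recall = true_positives / len(gold)
f1 = 2 * precision * recall / (precision + recall)

print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # → P=0.33 R=0.33 F1=0.33
```

Note that the second prediction covers the correct span but the wrong label, so under exact matching it counts as both a false positive and a false negative.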
The task of n-ary relation extraction, with F1 scores from 28.7% to 56.4% [56,58], is especially challenging, since such relationships usually span sentences or even sections and thus machine learning models require an understanding of the whole document. The inter-coder agreement for the task of concept extraction ranges from 0.6 to 0.96 (Table 4), for relation extraction from 0.6 to 0.9 (see also Table 3), while for coreference resolution a value of 0.68 was reported in two different studies [17,70]. These results suggest that the tasks are not only difficult for machines but, in most cases, also for humans.

2.4 Quality of knowledge graphs

KGs may contain billions of machine-readable facts about the world or a certain domain. However, do these KGs also have an appropriate quality? Data quality (DQ) is defined as fitness for use by a data consumer [107]. Thus, to evaluate data quality, it is important to know the needs of the data consumer since, in the end, the consumer judges whether or not a product is fit for use. Wang et al. [107] propose a data quality evaluation framework for information systems consisting of 15 dimensions grouped into four categories:

1. Intrinsic DQ: accuracy, objectivity, believability, and reputation.
2. Contextual DQ: value-added, relevancy, timeliness, completeness, and an appropriate amount of data.
3. Representational DQ: interpretability, ease of understanding, representational consistency, and concise representation.
4. Accessibility DQ: accessibility and access security.

Bizer [10] and Zaveri [111] propose further dimensions for the Linked Data context, like consistency, verifiability, offensiveness, licensing and interlinking. Pipino et al. [86] subdivide completeness into schema completeness, i.e. the extent to which classes and relations are missing in the ontology to support a certain use, column completeness (also known as
Partial Closed World Assumption [46]), i.e. the extent to which facts are not missing, and population completeness, i.e. the extent to which instances for a certain class are missing. Färber et al. [38] comprehensively evaluate and compare the data quality of popular KGs (e.g. DBpedia, Freebase, Wikidata, YAGO) using such dimensions.

To evaluate the correctness of instance data (also known as precision), the facts in the KG have to be compared against a ground truth. For that, humans annotate a set of facts as true or false. YAGO was found to be 95% correct [101]. The automatically populated AI-KG has a precision of 79% [32]. The KG automatically populated by the Never-Ending Language Learner (NELL) has a precision of 74% [21].

To evaluate the completeness of instance data (also known as coverage and recall), small ground-truth collections capturing all knowledge for a certain ontology are necessary, which are usually difficult to obtain [108]. However, some studies estimate the completeness of several KGs. Galárraga et al. [45] suggest a rule mining approach to predict missing facts. In Freebase [12], 71% of people have an unknown place of birth and 75% an unknown nationality [35]. Suchanek et al. [100] report that 69%-99% of instances in popular KGs (e.g. YAGO, DBpedia) lack at least one property that other instances of the same class have. The AI-KG has a recall of 81.2% [32].

2.5 Systematic literature reviews

Literature reviews are one of the main tasks of researchers, since a clear identification of a contribution to the present scholarly knowledge is a crucial step in scientific work [50]. This requires a comprehensive elaboration of the present scholarly knowledge for a certain research question. Furthermore, systematic literature reviews help to identify research gaps and to position new research activities [60].

A literature review can be conducted systematically or in a non-systematic, narrative way. Following Fink's [41] definition, a systematic literature review is "a systematic, explicit, comprehensive, and reproducible method identifying, evaluating, and synthesising the existing body of completed and recorded work". Guidelines for systematic literature reviews have been suggested for several scientific disciplines, e.g. for software engineering [60], for information systems [78] and for health sciences [41]. A systematic literature review typically consists of the activities depicted in Figure 1, subdivided into the phases plan, conduct, and report:

Plan: (1) Define research questions; (2) Develop a review protocol and data extraction forms.
Conduct: (3) Find related work; (4) Assess the relevance; (5) Extract relevant data.
Report: (6) Assess the quality of the data; (7) Analyse and combine the data; (8) Write the review.
Fig. 1: Activities within a systematic literature review.

The activities may differ in detail for the specific scientific domains [60,78,41]. In particular, a data extraction form defines which data has to be extracted from the reviewed papers. Data extraction requirements vary from review to review, so that the form is tailored to the specific research questions investigated in the review.

3 Requirements analysis

As the discussion of related work reveals, existing knowledge graphs for research information focus on specific use cases (e.g. improve search engines, help to reproduce research results) and mainly manage metadata and research artefacts about articles. We envision a KG in which research articles are linked through a deep semantic representation of their content to enable further use cases. In the following, we formulate the problem statement and describe our research method. This motivates our use case analysis in Section 3.1, from which we derive requirements for an ORKG.
Problem statement:
Scholarly knowledge is very heterogeneous and diverse. Therefore, an ontology that conceptualises scholarly knowledge comprehensively does not exist. Besides, due to the complexity of the task, the population of comprehensive ontologies requires domain and ontology experts. Current automatic approaches can only populate rather simple ontologies and achieve moderate accuracy (see Section 2.3 and Appendix A).
On the one hand, we desire an ontology that can comprehensively capture scholarly knowledge, and instance data with high correctness and completeness. On the other hand, we are faced with a "knowledge acquisition bottleneck".

Research method:
To illuminate the problem statement, we perform a requirements analysis. We follow the design science research (DSR) methodology [52,18]. The requirements analysis is a central phase in DSR, as it is the basis for design decisions and the selection of methods to construct effective solutions systematically [18]. The objective of DSR in general is the innovative, rigorous and relevant design of information systems for solving important business problems, or the improvement of existing solutions [18,50].

To elicit requirements, we studied guidelines for (a) systematic literature reviews (see Section 2.5) and (b) data quality requirements for information systems (see Section 2.4), and (c) interviewed members of the ORKG and Visual Analytics team at TIB (https://projects.tib.eu/orkg/project/team/), who are software engineers and researchers in the fields of computer science and environmental sciences. Based on the requirements, we elaborate possible approaches to construct an ORKG, which were identified through a literature review (see Section 2.3). To verify our assumptions on the presented requirements and approaches, ORKG and Visual Analytics team members reviewed them in an iterative refinement process.

3.1 Overview of the use cases

We define functional requirements with use cases, which are a popular technique in software engineering [13]. A use case describes the interaction between a user and the system from the user's perspective to achieve a certain goal. Furthermore, a use case introduces a motivating scenario to guide the design of a supporting ontology, and the use case analysis helps to figure out which kind of information is necessary [29].
Use cases: find related work; get research field overview; assess relevance; extract relevant information; get recommended articles; reproduce results; obtain deep understanding.
External systems: article repositories, data repositories, code repositories, virtual research environments, external knowledge bases, scholarly portals (e.g. DataCite, Dataset Search, GitHub, beaker.org, WikiData, Wikipedia, TIB AV-portal).
Fig. 2: UML use case diagram for the main use cases between the actor researcher, an Open Research Knowledge Graph (ORKG), and external systems.

There are many use cases (e.g. literature reviews, plagiarism detection, peer reviewer suggestion) and several stakeholders (e.g. researchers, librarians, peer reviewers, practitioners) that may benefit from an ORKG. Nguyen et al. [75] discuss some research-related tasks of scientists for information foraging at a broader level. In this study, we focus on use cases that support researchers in (a) conducting literature reviews (see also Section 2.5), (b) obtaining a deep understanding of a research article, and (c) reproducing research results. A full discussion of all possible use cases of graph-based knowledge management systems in the research environment is far beyond the scope of this article. With the chosen focus, we hope to cover the most frequent, literature-oriented tasks of scientists.

Figure 2 depicts the main identified use cases, which are described briefly in the following. Please note that we focus on how semantic content can improve these use cases, not further metadata.
Get research field overview:
Survey articles provide an overview of a particular research field, e.g. a certain research problem or a family of approaches. The results in such surveys are sometimes summarised in structured and comparative tables (an approach usually followed in domains such as computer science, but not as systematically practised in other fields). However, once survey articles are published, they are no longer updated. Moreover, they usually represent only the perspective of the authors, i.e. of very few researchers in the field. To support researchers in obtaining an up-to-date overview of a research field, the system should maintain such surveys in a structured way and allow for dynamics and evolution. A researcher interested in such an overview should be able to search or browse the desired research field in a user interface for ORKG access. Then, the system should retrieve related articles and available overviews, e.g. in a table or a leaderboard chart.

While an ORKG user interface should allow for showing tabular leaderboards or other visual representations, the backend should represent information semantically to allow for the exploitation of overlaps in conceptualisations between research problems or fields. Furthermore, faceted drill-down methods based on the properties of semantic descriptions of research approaches could empower researchers to quickly filter and zoom into the most relevant literature.
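Such faceted drill-down over semantic descriptions can be sketched as follows; the records and property names are invented examples of the structured descriptions an ORKG might hold:

```python
# Faceted drill-down over structured descriptions of research approaches:
# each record describes one paper; selecting facet values narrows the
# overview. Records and property names are invented examples.

papers = [
    {"title": "A", "task": "sentence classification", "domain": "biomedical", "f1": 0.92},
    {"title": "B", "task": "sentence classification", "domain": "computer science", "f1": 0.78},
    {"title": "C", "task": "relation extraction", "domain": "biomedical", "f1": 0.60},
]

def facet_values(records, prop):
    """Available facet values for a property, as a drill-down UI would list them."""
    return sorted({r[prop] for r in records})

def drill_down(records, **filters):
    """Keep only records matching all selected facet values."""
    return [r for r in records if all(r[k] == v for k, v in filters.items())]

print(facet_values(papers, "task"))   # → ['relation extraction', 'sentence classification']
selected = drill_down(papers, task="sentence classification", domain="biomedical")
print([r["title"] for r in selected])  # → ['A']
```

Each additional filter narrows the result set, which is exactly the "filter and zoom" interaction described above.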
Find related work:
Finding relevant research articles is a daily core activity of researchers. The primary goal of this use case is to find research articles which are relevant to a certain research question. A broad research question is often broken down into smaller, more specific sub-questions, which are then converted to search queries [41]. For instance, in this paper, we explored the following sub-questions: (a) Which ontologies exist to represent scholarly knowledge? (b) Which scientific knowledge graphs exist and which information do they contain? (c) Which datasets exist for scientific information extraction? (d) What are current state-of-the-art methods for scientific information extraction? (e) Which approaches exist to construct a knowledge graph?
An ORKG should support the answering of queries related to such questions, which can be fine-grained or broad search intents. Preferably, the system should support natural language queries, as approached by semantic search and question answering engines [6]. The system has to return a set of relevant articles.
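A minimal sketch of such query answering over a KG (invented triples and surface forms; a production system would use semantic parsing and a proper triple store) could link query phrases to KG entities and intersect the articles connected to them:

```python
# Toy retrieval for fine-grained search intents: link phrases in the
# query to KG entities, then return the papers connected to all of
# them. Triples and surface-form labels are invented examples.

triples = [
    ("paper:1", "addresses", "task:sentence-classification"),
    ("paper:1", "uses", "dataset:PubMed-RCT"),
    ("paper:2", "addresses", "task:relation-extraction"),
    ("paper:2", "uses", "dataset:SciERC"),
]

labels = {  # surface forms used to link query terms to KG entities
    "sentence classification": "task:sentence-classification",
    "relation extraction": "task:relation-extraction",
    "scierc": "dataset:SciERC",
}

def answer(query):
    """Link query phrases to entities and intersect the matching papers."""
    entities = [e for phrase, e in labels.items() if phrase in query.lower()]
    papers = None
    for entity in entities:
        hits = {s for s, _, o in triples if o == entity}
        papers = hits if papers is None else papers & hits
    return papers or set()

print(answer("Which datasets exist for sentence classification?"))  # → {'paper:1'}
```

Adding a second recognised phrase (e.g. a dataset name) narrows the intersection, which mirrors how fine-grained intents constrain the result set.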
Assess relevance:
Given a set of relevant articles, the researcher has to assess whether the articles match the criteria of interest. Usually, researchers skim through the title and abstract. Often, the introduction and conclusions also have to be considered, which is cumbersome and time-consuming. This process can be boosted if only the most important paragraphs of an article are presented to the researcher in a structured way. Such information snippets might include, for instance, text passages that describe the problem tackled in the research work, the main contributions, the employed methods or materials, or the yielded results.
Research question: "Which datasets exist for scientific sentence classification?" Data extraction form: (1) Which domains are covered by the dataset? (2) Who were the annotators? (3) What is the inter-annotator agreement? Workflow: (1) Define research question and data extraction form; (2) Extract entities in the search query (e.g. dataset, task), find relevant papers and rank them; (3) Present relevant papers with extracted text.
Fig. 3: An example research question with a corresponding data extraction form, and the extracted text passages from relevant research articles for the respective (data extraction form) fields, presented in tabular form.
Extract relevant information:
To tackle a particular research question, the researcher has to extract relevant information from research articles. In a systematic literature review, the information to be extracted can be defined through a data extraction form (see Section 2.5). Such extracted information is usually compiled into written text or comparison tables in a related work section or in survey articles. For instance, for the question "Which datasets exist for scientific sentence classification?", a researcher who focuses on a new annotation study could be interested in (a) the domains covered by the dataset and (b) the inter-coder agreement (see Table 2 as an example). Another researcher might follow the same question but focus on machine learning, and thus could be more interested in (c) evaluation results and (d) the feature types used.

The system should support the researcher with tailored information extraction from a set of research articles: (1) the researcher defines a data extraction form as proposed in systematic literature reviews (e.g. with the fields (a)-(d)), and (2) the system presents the extracted information as suggestions for the corresponding data extraction form and articles in a comparative table. Figure 3 illustrates a data extraction form with corresponding fields in the form of questions, and a possible approach to visualise the extracted text passages from the articles for the respective fields in tabular form.
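This two-step interaction can be sketched as follows; the article names and field values are invented, and in a real system the per-article values would be suggestions produced by an information extraction component:

```python
# Compiling a comparison table from a user-defined data extraction form:
# the researcher defines the fields, the system fills one row per article
# with extracted suggestions. Article names and values are invented.

extraction_form = ["domains", "inter-coder agreement",
                   "evaluation results", "feature types"]

extracted = {  # per-article suggestions, as an extraction component might return them
    "Article A": {"domains": "biomedical", "inter-coder agreement": "0.70",
                  "evaluation results": "F1 84%", "feature types": "word embeddings"},
    "Article B": {"domains": "computer graphics", "inter-coder agreement": "0.66",
                  "evaluation results": "F1 72%", "feature types": "hand-crafted"},
}

def comparison_table(form, per_article):
    """Render articles x form fields as rows for a comparative table."""
    header = ["article"] + form
    rows = [[article] + [fields.get(field, "") for field in form]
            for article, fields in per_article.items()]
    return [header] + rows

for row in comparison_table(extraction_form, extracted):
    print(row)
```

Because the form is user-defined, two researchers following the same question can derive different tables from the same underlying graph, as in the scenario above.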
Get recommended articles:
When the researcher focuses on a particular article, further related articles could be recommended by the system utilising an ORKG, for instance, articles that address the same research problem or apply similar methods.
Obtain deep understanding:
The system should help the researcher to obtain a deep understanding of a research article (e.g. equations, algorithms, diagrams, datasets). For this purpose, the system should connect the article with artefacts such as conference videos, presentations, source code, datasets, etc., and visualise the artefacts appropriately. Also, text passages can be linked, e.g. to explanations of methods in Wikipedia, to source code snippets of an algorithm implementation, or to equations described in the article.
Reproduce results:
The system should offer researchers links to all necessary artefacts to help reproduce research results, e.g. datasets, source code, virtual research environments, materials describing the study, etc. Furthermore, the system should maintain semantic descriptions of domain-specific and standardised evaluation protocols and guidelines, such as machine learning reproducibility checklists [85] and bioassays in the medical domain.

3.2 Knowledge graph requirements

As outlined in Section 2.4, data quality requirements should be considered within the context of a particular use case ("fitness for use"). In this section, we first describe the dimensions we used to define non-functional requirements for an ORKG. Then, we discuss these requirements within the context of our identified use cases.
In the following, we describe the dimensions that we use to define the requirements for ontology design and instance data. We selected these dimensions since we assume that they are the most relevant and also the most challenging ones when constructing an ORKG with appropriate data to support the various use cases. For ontology design, i.e. how comprehensively an ontology should conceptualise scholarly knowledge to support a certain use case, we use the following dimensions:

A)
Domain specialisation of the ontology:
How domain-specific should the concepts and relation types in the ontology be? An ontology with high domain specialisation targets a specific (sub-)domain and uses domain-specific terms. An ontology with low domain specialisation targets a broad range of domains and uses rather domain-independent terms. For instance, various ontologies (e.g. [83,15]) propose domain-independent concepts (e.g. Process, Method, Material). In contrast, Klampanos et al. [61] present a very domain-specific ontology for artificial neural networks.

B)
Granularity of the ontology:
Which granularity of the ontology is required to conceptualise scholarly knowledge? An ontology with high granularity conceptualises scholarly knowledge with many classes that have very detailed, fine-grained properties and relations. An ontology with low granularity has only a few classes and relation types. For instance, the annotation schemes for scientific corpora (see Section 2.3) have a rather low granularity, as they do not have more than 10 classes and 10 relation types. In contrast, various ontologies (e.g. [49,83]) with 20 to 35 classes and 20 to 70 relations and properties are fine-grained and have a relatively high granularity.

Although there is usually a correlation between domain specialisation and granularity of the ontology (e.g. an ontology with high domain specialisation also has a high granularity), there exist also rather domain-independent ontologies with a high granularity, e.g. the Scholarly Ontology [83], and ontologies with high domain specialisation and low granularity, e.g. the PICO criterion in Evidence-Based Medicine [59,91], which stands for Population (P), Intervention (I), Comparison (C), and Outcome (O). Thus, we use both dimensions independently. Furthermore, a high domain specialisation requirement for a use case implies that each sub-domain requires a separate ontology for the specific use case. These domain-specific ontologies can be organised in a taxonomy.

For the instance data, we use the following dimensions:

C)
Completeness of the instance data:
Given an ontology, to which extent do all possible instances (i.e. instances for classes and facts for relation types) in all research articles have to be represented in the KG? Low completeness: it is tolerable for the use case if a considerable amount of instance data is missing for the respective ontology. High completeness: it is mandatory for the use case that, for the respective ontology, a considerable amount of instances is present in the instance data. For instance, given an ontology with a class "Task" and a relation type "subTaskOf" to describe a taxonomy of tasks, the instance data for that ontology would be complete if all tasks mentioned in all research articles are present (population completeness) and no "subTaskOf" facts between the tasks are missing (column completeness).

D)
Correctness of the instance data:
Given an ontology, which correctness is necessary for the corresponding instances? Low correctness: it is tolerable for the use case that some instances (e.g. 30%) are not correct. High correctness: it is mandatory for the use case that instance data must not be wrong, i.e. all instances present in the KG must conform to the ontology and properly reflect the content of the research articles. For instance, an article is correctly assigned to the task addressed in the article, the F1 score in the evaluation results is correctly extracted, etc.

It should be noted that completeness and correctness of instance data can be evaluated only for a given ontology. For instance, let A be an ontology with the class "Deep Learning Model" without properties, and let B be an ontology that also has a class "Deep Learning Model" and additionally further relation types describing the properties of the deep learning model (e.g. drop-out, loss functions, etc.). In this example, the instance data of ontology A would be considered to have high completeness if it covers most of the important deep learning models. However, for ontology B, the completeness of the same instance data would be rather low, since the properties of the deep learning models are missing. The same holds for correctness: if ontology B has, for instance, a sub-type "Convolutional Neural Network", then the instance data would have a rather low correctness for ontology B if all "Deep Learning Model" instances are typed only with the generic class "Deep Learning Model".
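Read as retrieval-style measures against a gold standard for a given ontology, completeness is recall-like and correctness is precision-like. A minimal sketch; the function names and the toy task sets are our own, purely illustrative:

```python
def population_completeness(kg_instances: set, gold_instances: set) -> float:
    """Recall-like: fraction of instances mentioned in the articles (gold)
    that are actually present in the KG."""
    if not gold_instances:
        return 1.0
    return len(kg_instances & gold_instances) / len(gold_instances)

def correctness(kg_instances: set, gold_instances: set) -> float:
    """Precision-like: fraction of instances present in the KG
    that conform to the gold annotations."""
    if not kg_instances:
        return 1.0
    return len(kg_instances & gold_instances) / len(kg_instances)

# Toy example for an ontology with a class "Task" (made-up data):
tasks_in_articles = {"NER", "Relation Extraction", "Coreference Resolution"}
kg_tasks = {"NER", "Relation Extraction", "Sentiment"}  # "Sentiment" is spurious

print(round(population_completeness(kg_tasks, tasks_in_articles), 2))  # 0.67
print(round(correctness(kg_tasks, tasks_in_articles), 2))              # 0.67
```

The ontology dependence described above follows directly: evaluating the same instance data against a richer ontology B enlarges the gold set with property facts, which lowers the completeness score.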
Next, we discuss the seven main use cases with regard to the required level of ontology domain specialisation and granularity, as well as completeness and correctness of instance data. Table 1 summarises the requirements for the use cases along the four dimensions at ordinal scale. Use cases are grouped together when they have (1) similar justifications for the requirements and (2) a high overlap in ontology concepts and instances.
Table 1: Requirements and approaches for the main use cases. The upper part describes the minimum requirements for the ontology (domain specialisation and granularity) and the instance data (completeness and correctness). The bottom part lists possible approaches for manual, automatic and semi-automatic curation of the KG for the respective use cases. "X" indicates that the approach is suitable for the use case, while "(x)" denotes that the approach is only appropriate with human supervision. The left part (delimited by the vertical triple line) groups use cases suitable for manual approaches, the right side those suitable for automatic approaches. Vertical double lines group use cases with similar requirements.

|                                        | Extract relevant info | Research field overview | Deep understanding | Reproduce results | Find related work | Recommend articles | Assess relevance |
|----------------------------------------|------|------|------|------|------|------|------|
| Ontology: Domain specialisation        | high | high | med  | med  | low  | low  | med  |
| Ontology: Granularity                  | high | high | med  | med  | low  | low  | low  |
| Instance data: Completeness            | low  | med  | low  | med  | high | high | med  |
| Instance data: Correctness             | med  | high | high | high | low  | low  | med  |
| Manual: Maintain terminologies         | -    | X    | -    | -    | X    | X    | -    |
| Manual: Define templates               | X    | X    | -    | -    | -    | -    | -    |
| Manual: Fill in templates              | X    | X    | X    | X    | -    | -    | -    |
| Manual: Maintain overviews             | X    | X    | -    | -    | -    | -    | -    |
| Automatic: Entity/relation extraction  | (x)  | (x)  | (x)  | (x)  | X    | X    | X    |
| Automatic: Entity linking              | (x)  | (x)  | (x)  | (x)  | X    | X    | X    |
| Automatic: Sentence classification     | (x)  | -    | (x)  | -    | X    | -    | X    |
| Automatic: Template-based extraction   | (x)  | (x)  | (x)  | (x)  | -    | -    | -    |
| Automatic: Cross-modal linking         | -    | -    | (x)  | (x)  | -    | -    | -    |
Extract relevant information & get research field overview:
The information to be extracted from relevant research articles for a data extraction form within a literature review is very heterogeneous and depends highly on the intent of the researcher and the research questions. Thus, the ontology has to be domain-specific and fine-grained to offer all possible kinds of desirable information. However, missing information for certain questions in the KG may be tolerable for a researcher. Furthermore, it is tolerable for a researcher if some of the extracted suggestions are wrong, since the researcher can correct them.

Research field overviews are usually the result of a literature review. The data in such an overview also has to be very domain-specific and fine-grained. In addition, this information must have high correctness, e.g. an F1 score of an evaluation result must not be wrong. Furthermore, an overview of a particular research field should have appropriate completeness and must not miss any relevant research papers. However, it is acceptable if overviews for some research fields are missing.
Obtain deep understanding & reproduce results:
The information required for these use cases has to achieve a high level of correctness (e.g. accurate links to datasets, source code, videos, articles, research infrastructures). An ontology for the representation of default artefacts can be rather domain-independent (e.g. Scholix [20]). However, the semantic representation of evaluation protocols requires domain-dependent ontologies (e.g. EXPO [97]). Missing information is tolerable for these use cases.
Find related work & get recommended articles:
When searching for related work, it is essential not to miss relevant articles. Previous studies revealed that more than half of the search queries in academic search engines refer to scientific entities [109]. However, the coverage of scientific entities in general-purpose KGs (e.g. WikiData) is rather low, since the introduction of new concepts in research literature occurs at a faster pace than KG curation [1]. Despite the low completeness, Xiong et al. [109] could improve the ranking of search results in academic search engines by exploiting general-purpose KGs. Hence, the instance data for the "find related work" use case should have high completeness with fine-grained scientific entities. However, semantic search engines leverage latent representations of KGs and text (e.g. graph and word embeddings) [6]. Since a non-perfect ranking of the search results is tolerable for a researcher, lower correctness of the instance data could be acceptable. Furthermore, due to latent feature representations, the ontology can be kept rather simple and domain-independent. For instance, the STM corpus [15] introduces four domain-independent concepts.

Graph- and content-based research paper recommendation systems [8] have similar requirements, since they also leverage latent feature representations and require fine-grained scientific entities. Also, non-perfect recommendations are tolerable for a researcher.
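In the simplest case, ranking with such latent representations reduces to nearest-neighbour search in embedding space. A toy sketch with made-up three-dimensional vectors; real systems use learned graph and word embeddings of much higher dimensionality:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings: a queried scientific entity and candidate papers.
query_vec = [0.9, 0.1, 0.0]
papers = {
    "paper-A": [0.8, 0.2, 0.1],  # close to the queried entity
    "paper-B": [0.0, 0.1, 0.9],  # unrelated topic
}

# Rank papers by similarity to the query entity.
ranked = sorted(papers, key=lambda p: cosine(query_vec, papers[p]), reverse=True)
print(ranked)  # paper-A ranks first
```

Because relevance is scored in this latent space rather than by exact symbolic matching, a moderately noisy KG can still yield a useful (if non-perfect) ranking, which is why lower correctness is acceptable here.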
Assess relevance:
To help the researcher assess the relevance of an article according to her needs, the system should highlight the most essential zones in the article to provide a quick overview. The completeness and correctness of the presented information must not be too low, as otherwise user acceptance may suffer. However, it can be suboptimal, since it is acceptable for a researcher if some of the highlighted information is not essential or if some important information is missing. The ontology to represent essential information should be rather domain-specific (i.e. using terms that the researcher understands) and quite simple (cf. ontologies for scientific sentence classification in Section 2.3.2).

Fig. 4: The virtuous cycle of data network effects by combining manual and automatic data curation approaches [22].
In this section, we discuss the implications for the design and construction of an ORKG and outline possible approaches, which are mapped to the use cases in Table 1. Based on the discussion in the previous section, we can subdivide the use cases into two groups: (1) those requiring high correctness and high domain specialisation with rather low requirements on completeness (left side in Table 1), and (2) those requiring high completeness with rather low requirements on correctness and domain specialisation (right side in Table 1). The first group requires manual approaches, while the second group could be accomplished with fully automatic approaches. To ensure trustworthiness, data records should contain provenance information, i.e. who or what system curated the data.

Manually curated data can also support use cases with automatic approaches, and vice versa. Furthermore, automatic approaches can complement manual approaches by providing suggestions in user interfaces. Such synergy between humans and algorithms may lead to a "data flywheel" (also known as data network effects, see Figure 4): users produce data which enables building a smarter product with better algorithms, so that more users use the product and thus produce more data, and so on.

4.1 Manual approaches
Ontology design:
The first group of use cases requires rather domain-specific and fine-grained ontologies. We suggest developing novel ontologies or reusing ontologies that fit the respective use case and the specific domain (e.g. EXPO [97] for experiments). Moreover, appropriate and simple user interfaces are necessary for efficient and easy population. However, such ontologies can evolve with the help of the community, as demonstrated by WikiData and Wikipedia with "infoboxes" (see Section 2.3). Therefore, the system should enable the maintenance of templates, which are predefined and very specific forms consisting of fields with certain types (see Figure 5). For instance, to automatically generate leaderboards for machine learning tasks, a template would have the fields Task, Model, Dataset and Score, which can then be filled in by a curator for articles providing such kinds of results, in a user interface generated from the template. Such an approach is based on meta-modelling [13], as the meta-model for templates enables the definition of concrete templates, which are then instantiated for articles.
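The meta-modelling idea can be sketched with a few data classes: the meta-model defines what a Template is, a concrete template (the leaderboard example from the text) is defined against it, and a TemplateInstance captures the values a curator fills in for one article. A minimal sketch; class names follow Fig. 5, while attribute details and all filled-in values are our own illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Field:
    name: str
    description: str = ""
    field_type: str = "string"  # simplified stand-in for FieldType

@dataclass
class Template:
    name: str
    description: str = ""
    fields: List[Field] = field(default_factory=list)

@dataclass
class FieldValue:
    field: Field
    value: Any

@dataclass
class TemplateInstance:
    template: Template
    values: List[FieldValue] = field(default_factory=list)

# A concrete template for machine learning leaderboards ...
leaderboard = Template(
    name="Leaderboard",
    fields=[Field("Task"), Field("Model"), Field("Dataset"),
            Field("Score", field_type="number")],
)

# ... instantiated for one article by a curator (values are made up):
entry = TemplateInstance(
    template=leaderboard,
    values=[
        FieldValue(leaderboard.fields[0], "Named Entity Recognition"),
        FieldValue(leaderboard.fields[1], "ExampleModel"),
        FieldValue(leaderboard.fields[2], "ExampleDataset"),
        FieldValue(leaderboard.fields[3], 0.95),
    ],
)
```

A form-based user interface can be generated directly from the Template definition, which is what makes community-maintained templates lighter-weight than full ontology engineering.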
Knowledge graph population:
Several user interfaces are required to enable manual population: (1) populate semantic content for a research article by (1a) choosing relevant templates or ontologies and (1b) filling in the values; (2) terminology management (e.g. domain-specific research fields); (3) maintain research field overviews by (3a) assigning relevant research articles to the research field, (3b) defining corresponding templates, and (3c) filling in the templates for the relevant research articles.

Furthermore, the system should also offer Application Programming Interfaces (APIs) to enable population by third-party applications, e.g.:
– Submission portals, e.g. during submission of an article.
– Authoring tools, e.g. during writing.
– Virtual research environments [99] to store evaluation results and links to datasets and source code during experimenting and data analysis.

To encourage crowd-sourced content, we see the following options:
– Top-down enforcement via submission portals and publishers.
– Incentive models: Researchers want their articles to be cited; semantic content helps other researchers to find, explore and understand an article. This is also related to the concept of enlightened self-interest, i.e. acting to further the interests of others to serve one's own self-interest.
– Provide public acknowledgements for curators.
– Bring together experts (e.g. librarians, researchers from different institutions) who curate and organise content for specific research problems or disciplines.

Fig. 5: Conceptual meta-model in UML for templates and interface design for an external template-based information extractor. [Figure content: a TemplateInformationExtractor interface with the methods getTemplate(): Template, couldBeRelevant(a: Article): boolean, and extractTemplateFields(p: Article): TemplateInstance; a Template has a name, a description and Fields; a Field has a name, a description, a FieldType and values; a TemplateInstance instantiates a Template and holds FieldValues; a FieldValue has a value and refers to a Field; TemplateInstances are linked to Articles via properties.]

4.2 (Semi-)automatic approaches
Ontology design:
The second group of use cases requires a high completeness, while a relatively low correctness and domain specialisation are acceptable. For these use cases, rather simple or domain-independent ontologies should be developed or reused. Although approaches for automatic ontology learning exist (see Section 2.3), the quality of their results is not sufficient to generate a meaningful ORKG with complex conceptual models and relations. Therefore, meaningful ontologies should be designed by human experts.
Knowledge graph population:
Various approaches can be used to (semi-)automatically populate an ORKG. Methods for entity and relation extraction (see Section 2.3) can help to populate fine-grained KGs with high completeness, and entity linking approaches can link mentions in text with entities in KGs. For cross-modal linking, Singh et al. [96] suggest an approach to automatically detect URLs of datasets in research articles, while the Scientific Software Explorer [51] connects text passages in research articles with code fragments. To extract relevant information at the sentence level, approaches for sentence classification in scientific text can be applied (see Section 2.3). To support the curator in filling in templates semi-automatically, template-based extraction can (1) suggest relevant templates for a research article and (2) pre-fill template fields with appropriate values. For pre-filling, approaches such as n-ary relation extraction [43,53,56,58] or end-to-end question answering [90,33] could be applied.

Furthermore, the system should enable plugging in external information extractors, developed for certain scientific domains to extract specific types of information. For instance, as depicted in Figure 5, an external template information extractor has to implement an interface with three methods. This enables the system (1) to filter relevant template extractors for an article and (2) to extract field values from an article.
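The plugin interface can be sketched as an abstract base class. A minimal sketch; the three method names follow Fig. 5 (rendered in snake_case), while the stand-in model classes and the toy keyword-matching extractor are our own illustrative assumptions:

```python
from abc import ABC, abstractmethod

# Minimal stand-ins for the meta-model classes (details assumed).
class Article:
    def __init__(self, text: str):
        self.text = text

class Template:
    def __init__(self, name: str, fields: list):
        self.name, self.fields = name, fields

class TemplateInstance:
    def __init__(self, template: Template, values: dict):
        self.template, self.values = template, values

class TemplateInformationExtractor(ABC):
    """Interface an external extractor has to implement (cf. Fig. 5)."""
    @abstractmethod
    def get_template(self) -> Template: ...
    @abstractmethod
    def could_be_relevant(self, a: Article) -> bool: ...
    @abstractmethod
    def extract_template_fields(self, p: Article) -> TemplateInstance: ...

# A toy plugin using naive keyword matching (purely illustrative):
class LeaderboardExtractor(TemplateInformationExtractor):
    def get_template(self) -> Template:
        return Template("Leaderboard", ["Task", "Model", "Dataset", "Score"])

    def could_be_relevant(self, a: Article) -> bool:
        return "F1" in a.text or "accuracy" in a.text

    def extract_template_fields(self, p: Article) -> TemplateInstance:
        # A real extractor would run n-ary relation extraction or QA here.
        return TemplateInstance(self.get_template(), {"Score": "0.83"})

# The system first filters relevant extractors, then extracts field values:
article = Article("We report an F1 of 0.83 on the task ...")
extractors = [LeaderboardExtractor()]
relevant = [e for e in extractors if e.could_be_relevant(article)]
print([e.get_template().name for e in relevant])  # ['Leaderboard']
```

Keeping relevance filtering (could_be_relevant) separate from extraction lets the system cheaply skip expensive extractors for articles outside their domain.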
In this paper, we have presented a requirements analysis for an Open Research Knowledge Graph (ORKG). An ORKG should represent the content of research articles in a semantic way to enhance or enable a wide range of use cases. We identified literature-related core tasks of a researcher that can be supported by an ORKG and formulated them as use cases. For each use case, we discussed specificities and requirements for the underlying ontology and the instance data. In particular, we identified two groups of use cases: (1) the first group requires instance data with high correctness and rather fine-grained, domain-specific ontologies, but with moderate completeness; (2) the second group requires a high completeness, but the ontologies can be kept rather simple and domain-independent, and a moderate correctness of the instance data is sufficient. Based on the requirements, we have described possible manual and semi-automatic approaches (necessary for the first group) and automatic approaches (appropriate for the second group) for KG construction. In particular, we propose a framework with light-weight ontologies that can evolve through community curation. Furthermore, we have described the interdependence with external systems, user interfaces, and APIs for third-party applications to populate an ORKG.

The results of our work aim to give a holistic view of the requirements for an ORKG and to guide further research. The suggested approaches have to be refined, implemented and evaluated in an iterative and incremental process (see for the current progress). Additionally, our analysis can serve as a foundation for a discussion on ORKG requirements with other researchers and practitioners.
Conflict of interest
The authors declare that they have no conflict of interest.
A Comparative Overviews for Information Extraction Datasets from Scientific Text
Table 2, Table 3, and Table 4 show comparative overviews of some datasets from research papers of various disciplines for the tasks sentence classification, relation extraction, and concept extraction, respectively.
References
1. Ammar, W., Groeneveld, D., Bhagavatula, C., Beltagy, I., Crawford, M., Downey, D., Dunkelberger, J., Elgohary, A., Feldman, S., Ha, V., Kinney, R., Kohlmeier, S., Lo, K., Murray, T., Ooi, H., Peters, M.E., Power, J., Skjonsberg, S., Wang, L.L., Wilhelm, C., Yuan, Z., van Zuylen, M., Etzioni, O.: Construction of the literature graph in semantic scholar. In: S. Bangalore, J. Chu-Carroll, Y. Li (eds.) Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 3 (Industry Papers), pp. 84–91. Association for Computational Linguistics (2018). DOI 10.18653/v1/n18-3011. URL https://doi.org/10.18653/v1/n18-3011
2. Aryani, A., Wang, J.: Research graph: Building a distributed graph of scholarly works using research data switchboard. In: Open Repositories Conference (2017). DOI 10.4225/03/58c696655af8a. URL https://figshare.com/articles/Research_Graph_Building_a_Distributed_Graph_of_Scholarly_Works_using_Research_Data_Switchboard/4742413
3. Auer, S., Mann, S.: Towards an open research knowledge graph. The Serials Librarian (1-4), 35–41 (2019). DOI 10.1080/0361526X.2019.1540272. URL https://doi.org/10.1080/0361526X.2019.1540272
4. Augenstein, I., Das, M., Riedel, S., Vikraman, L., McCallum, A.: SemEval 2017 task 10: ScienceIE - extracting keyphrases and relations from scientific publications. In: S. Bethard, M. Carpuat, M. Apidianaki, S.M. Mohammad, D.M. Cer, D. Jurgens (eds.) Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval@ACL 2017, Vancouver, Canada, August 3-4, 2017, pp. 546–555. Association for Computational Linguistics (2017). DOI 10.18653/v1/S17-2091. URL https://doi.org/10.18653/v1/S17-2091
5. Badie, K., Asadi, N., Mahmoudi, M.T.: Zone identification based on features with high semantic richness and combining results of separate classifiers. J. Inf. Telecommun. (4), 411–427 (2018). DOI 10.1080/24751839.2018.1460083. URL https://doi.org/10.1080/24751839.2018.1460083
6. Balog, K.: Entity-oriented search. Springer (2018). DOI 10.1007/978-3-319-93935-3. URL https://eos-book.org
7. Bechhofer, S., Buchan, I.E., Roure, D.D., Missier, P., Ainsworth, J.D., Bhagat, J., Couch, P.A., Cruickshank, D., Delderfield, M., Dunlop, I., Gamble, M., Michaelides, D.T., Owen, S., Newman, D.R., Sufi, S., Goble, C.A.: Why linked data is not enough for scientists. Future Gener. Comput. Syst. (2), 599–611 (2013). DOI 10.1016/j.future.2011.08.004. URL https://doi.org/10.1016/j.future.2011.08.004
8. Beel, J., Gipp, B., Langer, S., Breitinger, C.: Research-paper recommender systems: a literature survey. Int. J. Digit. Libr. (4), 305–338 (2016). DOI 10.1007/s00799-015-0156-0. URL https://doi.org/10.1007/s00799-015-0156-0
9. Beltagy, I., Lo, K., Cohan, A.: SciBERT: A pretrained language model for scientific text. In: K. Inui, J. Jiang, V. Ng, X. Wan (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, pp. 3613–3618. Association for Computational Linguistics (2019). DOI 10.18653/v1/D19-1371. URL https://doi.org/10.18653/v1/D19-1371
10. Bizer, C.: Quality-Driven Information Filtering in the Context of Web-Based Information Systems. VDM Verlag, Saarbrücken, DEU (2007)
11. Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. (Database-Issue), 267–270 (2004). DOI 10.1093/nar/gkh061. URL https://doi.org/10.1093/nar/gkh061
12. Bollacker, K.D., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: J.T. Wang (ed.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008, pp. 1247–1250. ACM (2008). DOI 10.1145/1376616.1376746. URL https://doi.org/10.1145/1376616.1376746
13. Booch, G., Rumbaugh, J., Jacobson, I.: Unified Modeling Language User Guide, The (2nd Edition) (Addison-Wesley Object Technology Series). Addison-Wesley Professional (2005)
14. Bornmann, L., Mutz, R.: Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references. J. Assoc. Inf. Sci. Technol. (11), 2215–2222 (2015). DOI 10.1002/asi.23329. URL https://doi.org/10.1002/asi.23329
15. Brack, A., D'Souza, J., Hoppe, A., Auer, S., Ewerth, R.: Domain-independent extraction of scientific concepts from research articles. In: J.M. Jose, E. Yilmaz, J. Magalhães, P. Castells, N. Ferro, M.J. Silva, F. Martins (eds.) Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part I, Lecture Notes in Computer Science, vol. 12035, pp. 251–266. Springer (2020). DOI 10.1007/978-3-030-45439-5_17. URL https://doi.org/10.1007/978-3-030-45439-5_17
16. Brack, A., Hoppe, A., Stocker, M., Auer, S., Ewerth, R.: Requirements analysis for an open research knowledge graph. In: M.M. Hall, T. Mercun, T. Risse, F. Duchateau (eds.) Digital Libraries for Open Knowledge - 24th International Conference on Theory and Practice of Digital Libraries, TPDL 2020, Lyon, France, August 25-27, 2020, Proceedings, Lecture Notes in Computer Science, vol. 12246, pp. 3–18. Springer (2020). DOI 10.1007/978-3-030-54956-5_1. URL https://doi.org/10.1007/978-3-030-54956-5_1
17. Brack, A., Müller, D., Hoppe, A., Ewerth, R.: Coreference resolution in research papers from multiple domains. In: Proceedings of ECIR 2021 (accepted for publication) (2021)
18. Braun, R., Benedict, M., Wendler, H., Esswein, W.: Proposal for requirements driven design science research. In: B. Donnellan, M. Helfert, J. Kenneally, D.E. VanderMeer, M.A. Rothenberger, R. Winter (eds.) New Horizons in Design Science: Broadening the Research Agenda - 10th International Conference, DESRIST 2015, Dublin, Ireland, May 20-22, 2015, Proceedings, Lecture Notes in Computer Science, vol. 9073, pp. 135–151. Springer (2015). DOI 10.1007/978-3-319-18714-3_9. URL https://doi.org/10.1007/978-3-319-18714-3_9
19. Brodaric, B., Reitsma, F., Qiang, Y.: Skiing with DOLCE: toward an e-science knowledge infrastructure. In: C. Eschenbach, M. Grüninger (eds.) Formal Ontology in Information Systems, Proceedings of the Fifth International Conference, FOIS 2008, Saarbrücken, Germany, October 31st - November 3rd, 2008,
Frontiers in Artificial Intelligence and Applications, vol. 183, pp. 208–219. IOS Press (2008). DOI 10.3233/978-1-58603-923-3-208. URL https://doi.org/10.3233/978-1-58603-923-3-208

Table 2: Characteristics of datasets and performance measures for sentence classification in research papers. [Table garbled in extraction; columns: Dataset, Domains, Papers Coverage, Sentence Classes, Inter-coder Agreement, Performance. Recoverable rows: PubMed (Biomedicine, abstracts; Background, Objective, Methods, Results, Conclusion), NICTA-PIBOSO (Biomedicine, abstracts; Background, Intervention, Study, Population, Outcome, Other), CSABSTRUCT (Computer Science, abstracts; Background, Objective, Method, Result, Other), CS-Abstracts (Computer Science, abstracts; Background, Objective, Methods, Results, Conclusions), Emerald100k (Management, Information Science, Engineering; abstracts; Purpose, Design/methodology/approach, Findings, Originality/value, Social implications, Practical implications, Research limitations/implications), MAZEA (Physics, Engineering, Life and Health Sciences; abstracts; Background, Gap, Purpose, Method, Result, Conclusion), Safder et al. (Computer Science, full text; Algorithmic Efficiency, Dataset Description, Algorithmic Time Complexity, Other), Dr. Inventor (Computer Graphics, full text; Background, Challenge, Approach, Outcome, Future Work), ART/CoreSC (Chemistry, Computational Linguistics, full text; Background, Motivation, Goal, Hypothesis, Object, Model, Method, Experiment, Result, Observation, Conclusion). Numeric values not recoverable.]

Table 3: Characteristics of datasets and performance measures for binary and n-ary relation extraction in research papers. *For the SOFC-Exp corpus, performance values were obtained with ground truth concept mentions. [Table garbled in extraction; columns: Dataset, Domains, Papers Coverage, Cardinality, Relation Types, Scope, Inter-coder Agreement, Relations, Performance. Recoverable rows: SemEval (Computer Science, Material Sciences, Physics; abstracts; binary; synonym-of, hyponym-of; intra-sentence), SemEval (Computational Linguistics; abstracts; binary; usage, result, model, part-whole, topic, comparison; intra-sentence), ChemProt (Biomedicine; abstracts; binary; UPREGULATOR, ACTIVATOR, DOWNREGULATOR, INHIBITOR, AGONIST, ANTAGONIST, SUBSTRATE; intra-sentence), SciERC (Artificial Intelligence; abstracts; binary; hyponym-of, compare, part-of, conjunction, evaluate-for, feature-of, used-for; cross-sentence), PWC (Artificial Intelligence; full text; n-ary; (Task, Dataset, Metric, Score); document-level), CKB (Biomedicine; full text; n-ary; (Drug, Gene, Mutation); document-level), SOFC-Exp (Material Sciences; full text; n-ary; (Anode Material, Cathode Material, Device, Electrolyte Material, Fuel Used, Interlayer Material, Open Circuit Voltage, Power Density, Resistance, Working Temperature); document-level). Numeric values not recoverable.]

Table 4: Characteristics of datasets and performance measures for scientific concept extraction in research papers. *For the SOFC-Exp corpus, performance values were obtained with ground truth sentences describing experiments. [Table garbled in extraction; columns: Dataset, Domains, Papers, Concepts Coverage, Concept Types, Inter-coder Agreement, Performance. Recoverable rows: SemEval (Computer Science, Material Sciences, Physics; abstracts; Process, Task, Material), STM (Agriculture, Astronomy, Biology, Chemistry, Computer Science, Earth Science, Engineering, Materials Science, Mathematics, Medicine; abstracts; Process, Method, Material, Data), SciERC (Artificial Intelligence; abstracts; Task, Method, Metric, Material, Other, Generic), ACL (Computational Linguistics; abstracts; Method, Tool, Language Resource (LR), LR Product, Model, Measures/Measurements, Other), BC5CDR (Biomedicine; abstracts; Chemical, Disease), NCBI-disease (Biomedicine; abstracts; Disease), SOFC-Exp (Material Sciences; full text; Material, Device, Value). Numeric values not recoverable.]

20. Burton, A., et al.: The Scholix framework for interoperability in data-literature information exchange. D-Lib Magazine (1/2) (2017). DOI 10.1045/january2017-burton. URL https://doi.org/10.1045/january2017-burton
21. Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka Jr., E.R., Mitchell, T.M.: Toward an architecture for never-ending language learning. In: M. Fox, D. Poole (eds.) Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010. AAAI Press (2010)
22. CB Insights: The data flywheel: How enlightened self-interest drives data network effects. Accessed: 2020-11-10
23. Cohan, A., Ammar, W., van Zuylen, M., Cady, F.: Structural scaffolds for citation intent classification in scientific publications. In: J. Burstein, C. Doran, T. Solorio (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pp. 3586–3596. Association for Computational Linguistics (2019). DOI 10.18653/v1/n19-1361. URL https://doi.org/10.18653/v1/n19-1361
24. Cohan, A., Beltagy, I., King, D., Dalvi, B., Weld, D.S.: Pretrained language models for sequential sentence classification. In: K. Inui, J. Jiang, V. Ng, X. Wan (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, pp. 3691–3697. Association for Computational Linguistics (2019). DOI 10.18653/v1/D19-1383. URL https://doi.org/10.18653/v1/D19-1383
25. Cohen, K.B., Lanfranchi, A., Choi, M.J., Bada, M., Baumgartner Jr., W.A., Panteleyeva, N., Verspoor, K., Palmer, M., Hunter, L.E.: Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles. BMC Bioinform. (1), 372:1–372:14 (2017). DOI 10.1186/s12859-017-1775-9. URL https://doi.org/10.1186/s12859-017-1775-9
26. Consortium, T.G.O.: The Gene Ontology resource: 20 years and still going strong. Nucleic Acids Res. (Database-Issue), D330–D338 (2019). DOI 10.1093/nar/gky1055. URL https://doi.org/10.1093/nar/gky1055
27. Constantin, A., Peroni, S., Pettifer, S., Shotton, D.M., Vitali, F.: The Document Components Ontology (DoCO). Semantic Web (2), 167–181 (2016). DOI 10.3233/SW-150177. URL https://doi.org/10.3233/SW-150177
28. Dayrell, C., Candido Jr., A., Lima, G., Machado Jr., D., Copestake, A.A., Feltrim, V.D., Tagnin, S.E.O., Aluísio, S.M.: Rhetorical move detection in English abstracts: Multi-label sentence classifiers and their annotated corpora. In: N. Calzolari, K. Choukri, T. Declerck, M.U. Dogan, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis (eds.) Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, Istanbul, Turkey, May 23-25, 2012, pp. 1604–1609. European Language Resources Association (ELRA) (2012)
29. Degbelo, A.: A snapshot of ontology evaluation criteria and strategies. In: R. Hoekstra, C. Faron-Zucker, T. Pellegrini, V. de Boer (eds.) Proceedings of the 13th International Conference on Semantic Systems, SEMANTICS 2017, Amsterdam, The Netherlands, September 11-14, 2017, pp. 1–8. ACM (2017). DOI 10.1145/3132218.3132219. URL https://doi.org/10.1145/3132218.3132219
30. Degtyarenko, K., de Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M., Ashburner, M.: ChEBI: a database and ontology for chemical entities of biological interest. pp. 344–350 (2008). DOI 10.1093/nar/gkm791. URL https://doi.org/10.1093/nar/gkm791
31. Dernoncourt, F., Lee, J.Y.: PubMed 200k RCT: a dataset for sequential sentence classification in medical abstracts. In: G. Kondrak, T. Watanabe (eds.) Proceedings of the Eighth International Joint Conference on Natural Language Processing, IJCNLP 2017, Taipei, Taiwan, November 27 - December 1, 2017, Volume 2: Short Papers, pp. 308–313. Asian Federation of Natural Language Processing (2017)
32. Dessì, D., Osborne, F., Recupero, D.R., Buscaldi, D., Motta, E., Sack, H.: AI-KG: an automatically generated knowledge graph of artificial intelligence. In: J.Z. Pan, V.A.M. Tamma, C. d'Amato, K. Janowicz, B. Fu, A. Polleres, O. Seneviratne, L. Kagal (eds.) The Semantic Web - ISWC 2020 - 19th International Semantic Web Conference, Athens, Greece, November 2-6, 2020, Proceedings, Part II, Lecture Notes in Computer Science, vol. 12507, pp. 127–143. Springer (2020). DOI 10.1007/978-3-030-62466-8_9. URL https://doi.org/10.1007/978-3-030-62466-8_9
33. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: J. Burstein, C. Doran, T. Solorio (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). DOI 10.18653/v1/n19-1423. URL https://doi.org/10.18653/v1/n19-1423
34. Dogan, R.I., Leaman, R., Lu, Z.: NCBI disease corpus: A resource for disease name recognition and concept normalization. J. Biomed. Informatics, 1–10 (2014). DOI 10.1016/j.jbi.2013.12.006. URL https://doi.org/10.1016/j.jbi.2013.12.006
35. Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., Strohmann, T., Sun, S., Zhang, W.: Knowledge Vault: a web-scale approach to probabilistic knowledge fusion. In: S.A. Macskassy, C. Perlich, J. Leskovec, W. Wang, R. Ghani (eds.) The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA - August 24 - 27, 2014, pp. 601–610. ACM (2014). DOI 10.1145/2623330.2623623. URL https://doi.org/10.1145/2623330.2623623
36. D'Souza, J., Hoppe, A., Brack, A., Jaradeh, M.Y., Auer, S., Ewerth, R.: The STEM-ECR dataset: Grounding scientific entity references in STEM scholarly content to authoritative encyclopedic and lexicographic sources. In: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (eds.) Proceedings of The 12th Language Resources and Evaluation Conference, LREC 2020, Marseille, France, May 11-16, 2020, pp. 2192–2203. European Language Resources Association (2020)
37. Färber, M.: The Microsoft Academic Knowledge Graph: A linked data source with 8 billion triples of scholarly data. In: C. Ghidini, O. Hartig, M. Maleshkova, V. Svátek, I.F. Cruz, A. Hogan, J. Song, M. Lefrançois, F. Gandon (eds.) The Semantic Web - ISWC 2019 - 18th International Semantic Web Conference, Auckland, New Zealand, October 26-30, 2019, Proceedings, Part II, Lecture Notes in Computer Science, vol. 11779, pp. 113–129. Springer (2019). DOI 10.1007/978-3-030-30796-7_8. URL https://doi.org/10.1007/978-3-030-30796-7_8
38. Färber, M., Bartscherer, F., Menne, C., Rettinger, A.: Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semantic Web (1), 77–129 (2018). DOI 10.3233/SW-170275. URL https://doi.org/10.3233/SW-170275
39. Fathalla, S., Vahdati, S., Auer, S., Lange, C.: Towards a knowledge graph representing research findings by semantifying survey articles. In: J. Kamps, G. Tsakonas, Y. Manolopoulos, L.S. Iliadis, I. Karydis (eds.) Research and Advanced Technology for Digital Libraries - 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Thessaloniki, Greece, September 18-21, 2017, Proceedings, Lecture Notes in Computer Science, vol. 10450, pp. 315–327. Springer (2017). DOI 10.1007/978-3-319-67008-9_25. URL https://doi.org/10.1007/978-3-319-67008-9_25
40. Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. Language, Speech, and Communication. MIT Press, Cambridge, MA (1998)
41. Fink, A.: Conducting Research Literature Reviews: From the Internet to Paper. SAGE Publications (2014)
42. Fisas, B., Saggion, H., Ronzano, F.: On the discoursive structure of computer graphics research papers. In: A. Meyers, I. Rehbein, H. Zinsmeister (eds.) Proceedings of The 9th Linguistic Annotation Workshop, LAW@NAACL-HLT 2015, June 5, 2015, Denver, Colorado, USA, pp. 42–51. The Association for Computer Linguistics (2015). DOI 10.3115/v1/w15-1605. URL https://doi.org/10.3115/v1/w15-1605
43. Friedrich, A., Adel, H., Tomazic, F., Hingerl, J., Benteau, R., Marusczyk, A., Lange, L.: The SOFC-Exp corpus and neural approaches to information extraction in the materials science domain. In: D. Jurafsky, J. Chai, N. Schluter, J.R. Tetreault (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, pp. 1255–1268. Association for Computational Linguistics (2020). DOI 10.18653/v1/2020.acl-main.116. URL https://doi.org/10.18653/v1/2020.acl-main.116
44. Gábor, K., Buscaldi, D., Schumann, A., QasemiZadeh, B., Zargayouna, H., Charnois, T.: SemEval-2018 task 7: Semantic relation extraction and classification in scientific papers. In: M. Apidianaki, S.M. Mohammad, J. May, E. Shutova, S. Bethard, M. Carpuat (eds.) Proceedings of The 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2018, New Orleans, Louisiana, USA, June 5-6, 2018, pp. 679–688. Association for Computational Linguistics (2018). DOI 10.18653/v1/s18-1111. URL https://doi.org/10.18653/v1/s18-1111
45. Galárraga, L., Razniewski, S., Amarilli, A., Suchanek, F.M.: Predicting completeness in knowledge bases. In: M. de Rijke, M. Shokouhi, A. Tomkins, M. Zhang (eds.) Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, WSDM 2017, Cambridge, United Kingdom, February 6-10, 2017, pp. 375–383. ACM (2017). DOI 10.1145/3018661.3018739. URL https://doi.org/10.1145/3018661.3018739
46. Galárraga, L.A., Teflioudi, C., Hose, K., Suchanek, F.M.: AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In: D. Schwabe, V.A.F. Almeida, H. Glaser, R. Baeza-Yates, S.B. Moon (eds.) 22nd International World Wide Web Conference, WWW '13, Rio de Janeiro, Brazil, May 13-17, 2013, pp. 413–422. International World Wide Web Conferences Steering Committee / ACM (2013). DOI 10.1145/2488388.2488425. URL https://doi.org/10.1145/2488388.2488425
47. Gonçalves, S., Cortez, P., Moro, S.: A deep learning classifier for sentence classification in biomedical and computer science abstracts. Neural Comput. Appl. (11), 6793–6807 (2020). DOI 10.1007/s00521-019-04334-2. URL https://doi.org/10.1007/s00521-019-04334-2
48. Groza, T., Handschuh, S., Möller, K., Decker, S.: SALT - semantically annotated LaTeX for scientific publications. In: E. Franconi, M. Kifer, W. May (eds.) The Semantic Web: Research and Applications, 4th European Semantic Web Conference, ESWC 2007, Innsbruck, Austria, June 3-7, 2007, Proceedings, Lecture Notes in Computer Science, vol. 4519, pp. 518–532. Springer (2007). DOI 10.1007/978-3-540-72667-8_37. URL https://doi.org/10.1007/978-3-540-72667-8_37
49. Hars, A.: Structure of scientific knowledge, pp. 83–185. Springer Berlin Heidelberg, Berlin, Heidelberg (2003). DOI 10.1007/978-3-540-24737-1_3. URL https://doi.org/10.1007/978-3-540-24737-1_3
50. Hevner, A.R., March, S.T., Park, J., Ram, S.: Design science in information systems research. MIS Q. (1), 75–105 (2004). URL http://misq.org/design-science-in-information-systems-research.html
51. Hoppe, A., Hagen, J., Holzmann, H., Kniesel, G., Ewerth, R.: An analytics tool for exploring scientific software and related publications. In: E. Méndez, F. Crestani, C. Ribeiro, G. David, J.C. Lopes (eds.) Digital Libraries for Open Knowledge, 22nd International Conference on Theory and Practice of Digital Libraries, TPDL 2018, Porto, Portugal, September 10-13, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11057, pp. 299–303. Springer (2018). DOI 10.1007/978-3-030-00066-0_27. URL https://doi.org/10.1007/978-3-030-00066-0_27
52. Horvath, I.: Comparison of three methodological approaches of design research. In: S.n. (ed.) Proceedings of the 16th International Conference on Engineering Design, ICED'07, pp. 1–11. Ecole Central Paris (2007)
53. Hou, Y., Jochim, C., Gleize, M., Bonin, F., Ganguly, D.: Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboards construction. In: A. Korhonen, D.R. Traum, L. Màrquez (eds.) Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28 - August 2, 2019, Volume 1: Long Papers, pp. 5203–5213. Association for Computational Linguistics (2019). DOI 10.18653/v1/p19-1513. URL https://doi.org/10.18653/v1/p19-1513
54. Jain, S., van Zuylen, M., Hajishirzi, H., Beltagy, I.: SciREX: A challenge dataset for document-level information extraction. In: D. Jurafsky, J. Chai, N. Schluter, J.R. Tetreault (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, pp. 7506–7516. Association for Computational Linguistics (2020). DOI 10.18653/v1/2020.acl-main.670. URL https://doi.org/10.18653/v1/2020.acl-main.670
55. Jaradeh, M.Y., Oelen, A., Prinz, M., Stocker, M., Auer, S.: Open Research Knowledge Graph: A system walkthrough. In: A. Doucet, A. Isaac, K. Golub, T. Aalberg, A. Jatowt (eds.) Digital Libraries for Open Knowledge - 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019, Proceedings, Lecture Notes in Computer Science, vol. 11799, pp. 348–351. Springer (2019). DOI 10.1007/978-3-030-30760-8_31. URL https://doi.org/10.1007/978-3-030-30760-8_31
56. Jia, R., Wong, C., Poon, H.: Document-level n-ary relation extraction with multiscale representation learning. In: J. Burstein, C. Doran, T. Solorio (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pp. 3693–3704. Association for Computational Linguistics (2019). DOI 10.18653/v1/n19-1370. URL https://doi.org/10.18653/v1/n19-1370
57. Kannan, A.V., Fradkin, D., Akrotirianakis, I., Kulahcioglu, T., Canedo, A., Roy, A., Yu, S., Malawade, A.V., Faruque, M.A.A.: Multimodal knowledge graph for deep learning papers and code. In: M. d'Aquin, S. Dietze, C. Hauff, E. Curry, P. Cudré-Mauroux (eds.) CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020, pp. 3417–3420. ACM (2020). DOI 10.1145/3340531.3417439. URL https://doi.org/10.1145/3340531.3417439
58. Kardas, M., Czapla, P., Stenetorp, P., Ruder, S., Riedel, S., Taylor, R., Stojnic, R.: AxCell: Automatic extraction of results from machine learning papers. In: B. Webber, T. Cohn, Y. He, Y. Liu (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pp. 8580–8594. Association for Computational Linguistics (2020). DOI 10.18653/v1/2020.emnlp-main.692. URL https://doi.org/10.18653/v1/2020.emnlp-main.692
59. Kim, S., Martínez, D., Cavedon, L., Yencken, L.: Automatic classification of sentences to support evidence based medicine. BMC Bioinform. (S-2), S5 (2011). DOI 10.1186/1471-2105-12-S2-S5. URL https://doi.org/10.1186/1471-2105-12-S2-S5
60. Kitchenham, B.A., Charters, S.: Guidelines for performing systematic literature reviews in software engineering. Tech. Rep. EBSE 2007-001, Keele University and Durham University Joint Report (2007)
61. Klampanos, I.A., Davvetas, A., Koukourikos, A., Karkaletsis, V.: ANNETT-O: an ontology for describing artificial neural network evaluation, topology and training. Int. J. Metadata Semant. Ontologies (3), 179–190 (2019). DOI 10.1504/IJMSO.2019.099833. URL https://doi.org/10.1504/IJMSO.2019.099833
62. Kolitsas, N., Ganea, O., Hofmann, T.: End-to-end neural entity linking. In: A. Korhonen, I. Titov (eds.) Proceedings of the 22nd Conference on Computational Natural Language Learning, CoNLL 2018, Brussels, Belgium, October 31 - November 1, 2018, pp. 519–529. Association for Computational Linguistics (2018). DOI 10.18653/v1/k18-1050. URL https://doi.org/10.18653/v1/k18-1050
63. Kringelum, J., Kjærulff, S.K., Brunak, S., Lund, O., Oprea, T.I., Taboureau, O.: ChemProt-3.0: a global chemical biology diseases mapping. Database J. Biol. Databases Curation (2016). DOI 10.1093/database/bav123. URL https://doi.org/10.1093/database/bav123
64. Lange, C.: Ontologies and languages for representing mathematical knowledge on the semantic web. Semantic Web (2), 119–158 (2013). DOI 10.3233/SW-2012-0059. URL https://doi.org/10.3233/SW-2012-0059
65. Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web (2), 167–195 (2015). DOI 10.3233/SW-140134. URL https://doi.org/10.3233/SW-140134
66. Li, J., Sun, Y., Johnson, R.J., Sciaky, D., Wei, C., Leaman, R., Davis, A.P., Mattingly, C.J., Wiegers, T.C., Lu, Z.: BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database J. Biol. Databases Curation (2016). DOI 10.1093/database/baw068. URL https://doi.org/10.1093/database/baw068
67. Liakata, M., Saha, S., Dobnik, S., Batchelor, C.R., Rebholz-Schuhmann, D.: Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinform. (7), 991–1000 (2012). DOI 10.1093/bioinformatics/bts071. URL https://doi.org/10.1093/bioinformatics/bts071
68. Liakata, M., Teufel, S., Siddharthan, A., Batchelor, C.R.: Corpora for the conceptualisation and zoning of scientific papers. In: N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner, D. Tapias (eds.) Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, 17-23 May 2010, Valletta, Malta. European Language Resources Association (2010)
69. Lo, K., Wang, L.L., Neumann, M., Kinney, R., Weld, D.S.: S2ORC: the Semantic Scholar open research corpus. In: D. Jurafsky, J. Chai, N. Schluter, J.R. Tetreault (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, pp. 4969–4983. Association for Computational Linguistics (2020). DOI 10.18653/v1/2020.acl-main.447. URL https://doi.org/10.18653/v1/2020.acl-main.447
70. Luan, Y., He, L., Ostendorf, M., Hajishirzi, H.: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction. In: E. Riloff, D. Chiang, J. Hockenmaier, J. Tsujii (eds.) Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pp. 3219–3232. Association for Computational Linguistics (2018). DOI 10.18653/v1/d18-1360. URL https://doi.org/10.18653/v1/d18-1360
71. Lubani, M., Noah, S.A.M., Mahmud, R.: Ontology population: Approaches and design aspects. J. Inf. Sci. (4) (2019). DOI 10.1177/0165551518801819. URL https://doi.org/10.1177/0165551518801819
72. Manghi, P., Bardi, A., Atzori, C., Baglioni, M., Manola, N., Schirrwagen, J., Principe, P.: The OpenAIRE research graph data model (2019). DOI 10.5281/zenodo.2643199. URL https://doi.org/10.5281/zenodo.2643199
73. Mesbah, S., Fragkeskos, K., Lofi, C., Bozzon, A., Houben, G.: Semantic annotation of data processing pipelines in scientific publications. In: E. Blomqvist, D. Maynard, A. Gangemi, R. Hoekstra, P. Hitzler, O. Hartig (eds.) The Semantic Web - 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 - June 1, 2017, Proceedings, Part I, Lecture Notes in Computer Science, vol. 10249, pp. 321–336 (2017). DOI 10.1007/978-3-319-58068-5_20. URL https://doi.org/10.1007/978-3-319-58068-5_20
74. Nasar, Z., Jaffry, S.W., Malik, M.K.: Information extraction from scientific articles: a survey. Scientometrics (3), 1931–1990 (2018). DOI 10.1007/s11192-018-2921-5. URL https://doi.org/10.1007/s11192-018-2921-5
75. Nguyen, V.B., Svátek, V., Rabby, G., Corcho, Ó.: Ontologies supporting research-related information foraging using knowledge graphs: Literature survey and holistic model mapping. In: C.M. Keet, M. Dumontier (eds.) Knowledge Engineering and Knowledge Management - 22nd International Conference, EKAW 2020, Bolzano, Italy, September 16-20, 2020, Proceedings, Lecture Notes in Computer Science, vol. 12387, pp. 88–103. Springer (2020). DOI 10.1007/978-3-030-61244-3_6. URL https://doi.org/10.1007/978-3-030-61244-3_6
76. Nickel, M., Murphy, K., Tresp, V., Gabrilovich, E.: A review of relational machine learning for knowledge graphs. Proc. IEEE (1), 11–33 (2016). DOI 10.1109/JPROC.2015.2483592. URL https://doi.org/10.1109/JPROC.2015.2483592
77. Oelen, A., Jaradeh, M.Y., Stocker, M., Auer, S.: Generate FAIR literature surveys with scholarly knowledge graphs. In: R. Huang, D. Wu, G. Marchionini, D. He, S.J. Cunningham, P. Hansen (eds.) JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020, pp. 97–106. ACM (2020). DOI 10.1145/3383583.3398520. URL https://doi.org/10.1145/3383583.3398520
78. Okoli, C.: A guide to conducting a standalone systematic literature review. Commun. Assoc. Inf. Syst., 43 (2015). URL http://aisel.aisnet.org/cais/vol37/iss1/43
79. Papers with Code. https://paperswithcode.com/. Accessed: 2019-09-12
80. Park, S., Caragea, C.: Scientific keyphrase identification and classification by pre-trained language models intermediate task transfer learning. In: D. Scott, N. Bel, C. Zong (eds.) Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online), December 8-13, 2020, pp. 5409–5419. International Committee on Computational Linguistics (2020). DOI 10.18653/v1/2020.coling-main.472. URL https://doi.org/10.18653/v1/2020.coling-main.472
81. Peng, Y., Yan, S., Lu, Z.: Transfer learning in biomedical natural language processing: An evaluation of BERT and ELMo on ten benchmarking datasets. In: D. Demner-Fushman, K.B. Cohen, S. Ananiadou, J. Tsujii (eds.) Proceedings of the 18th BioNLP Workshop and Shared Task, BioNLP@ACL 2019, Florence, Italy, August 1, 2019, pp. 58–65. Association for Computational Linguistics (2019). DOI 10.18653/v1/w19-5006. URL https://doi.org/10.18653/v1/w19-5006
82. Peroni, S., Shotton, D.M.: FaBiO and CiTO: Ontologies for describing bibliographic resources and citations. J. Web Semant., 33–43 (2012). DOI 10.1016/j.websem.2012.08.001. URL https://doi.org/10.1016/j.websem.2012.08.001
83. Pertsas, V., Constantopoulos, P.: Scholarly Ontology: modelling scholarly practices. Int. J. Digit. Libr. (3), 173–190 (2017). DOI 10.1007/s00799-016-0169-3. URL https://doi.org/10.1007/s00799-016-0169-3
84. Petasis, G., Karkaletsis, V., Paliouras, G., Krithara, A., Zavitsanos, E.: Ontology population and enrichment: State of the art. In: G. Paliouras, C.D. Spyropoulos, G. Tsatsaronis (eds.) Knowledge-Driven Multimedia Information Extraction and Ontology Evolution - Bridging the Semantic Gap, Lecture Notes in Computer Science, vol. 6050, pp. 134–166. Springer (2011). DOI 10.1007/978-3-642-20795-2_6. URL https://doi.org/10.1007/978-3-642-20795-2_6
85. Pineau, J., Vincent-Lamarre, P., Sinha, K., Larivière, V., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Larochelle, H.: Improving reproducibility in machine learning research (A report from the NeurIPS 2019 reproducibility program). CoRR abs/2003.12206 (2020). URL https://arxiv.org/abs/2003.12206
86. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM (4), 211–218 (2002). DOI 10.1145/505248.506010. URL https://doi.org/10.1145/505248.506010
87. Pujara, J., Singh, S.: Mining knowledge graphs from text. In: Y. Chang, C. Zhai, Y. Liu, Y. Maarek (eds.) Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, Marina Del Rey, CA, USA, February 5-9, 2018, pp. 789–790. ACM (2018). DOI 10.1145/3159652.3162011. URL https://doi.org/10.1145/3159652.3162011
88. Q. Zadeh, B., Handschuh, S.: The ACL RD-TEC: A dataset for benchmarking terminology extraction and classification in computational linguistics. In: Proceedings of the 4th International Workshop on Computational Terminology (Computerm), pp. 52–63. Association for Computational Linguistics and Dublin City University, Dublin, Ireland (2014). DOI 10.3115/v1/W14-4807
89. QasemiZadeh, B., Schumann, A.: The ACL RD-TEC 2.0: A language resource for evaluating term extraction and entity recognition methods. In: N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (eds.) Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016. European Language Resources Association (ELRA) (2016)
90. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ questions for machine comprehension of text. In: J. Su, X. Carreras, K. Duh (eds.) Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pp. 2383–2392. The Association for Computational Linguistics (2016). DOI 10.18653/v1/d16-1264. URL https://doi.org/10.18653/v1/d16-1264
91. Richardson, S., Wilson, M., Nishikawa, J., Hayward, R.: The well-built clinical question: a key to evidence-based decisions. ACP Journal Club (3), A12–13 (1995)
92. Ruiz-Iniesta, A., Corcho, Ó.: A review of ontologies for describing scholarly and scientific documents. In: A.G. Castro, C. Lange, P.W. Lord, R. Stevens (eds.) Proceedings of the 4th Workshop on Semantic Publishing co-located with the 11th Extended Semantic Web Conference (ESWC 2014), Anissaras, Greece, May 25th, 2014, CEUR Workshop Proceedings, vol. 1155. CEUR-WS.org (2014). URL http://ceur-ws.org/Vol-1155/paper-07.pdf
93. Safder, I., Hassan, S., Visvizi, A., Noraset, T., Nawaz, R., Tuarob, S.: Deep learning-based extraction of algorithmic metadata in full-text scholarly documents. Inf. Process. Manag. (6), 102269 (2020). DOI 10.1016/j.ipm.2020.102269. URL https://doi.org/10.1016/j.ipm.2020.102269
94. Salatino, A.A., Thanapalasingam, T., Mannocci, A., Birukou, A., Osborne, F., Motta, E.: The Computer Science Ontology: A comprehensive automatically-generated taxonomy of research areas. Data Intell. (3), 379–416 (2020). DOI 10.1162/dint_a_00055. URL https://doi.org/10.1162/dint_a_00055
95. Say, A., Fathalla, S., Vahdati, S., Lehmann, J., Auer, S.: Semantic representation of physics research data. In: D. Aveiro, J.L.G. Dietz, J. Filipe (eds.) Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2020, Volume 2: KEOD, Budapest, Hungary, November 2-4, 2020, pp. 64–75. SCITEPRESS (2020). DOI 10.5220/0010111000640075. URL https://doi.org/10.5220/0010111000640075
96. Singh, M., Barua, B., Palod, P., Garg, M., Satapathy, S., Bushi, S., Ayush, K., Rohith, K.S., Gamidi, T., Goyal, P., Mukherjee, A.: OCR++: A robust framework for information extraction from scholarly articles. In: N. Calzolari, Y. Matsumoto, R. Prasad (eds.) COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 3390–3400. ACL (2016)
97. Soldatova, L.N., King, R.D.: An ontology of scientific experiments. Journal of The Royal Society Interface (11), 795–803 (2006). DOI 10.1098/rsif.2006.0134. URL https://royalsocietypublishing.org/doi/abs/10.1098/rsif.2006.0134
98. URL https://aclweb.org/anthology/papers/U/U19/U19-1016/
99. Stocker, M., Prinz, M., Rostami, F., Kempf, T.: Towards research infrastructures that curate scientific information: A use case in life sciences. In: S. Auer, M. Vidal (eds.) Data Integration in the Life Sciences - 13th International Conference, DILS 2018, Hannover, Germany, November 20-21, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11371, pp. 61–74. Springer (2018). DOI 10.1007/978-3-030-06016-9_6. URL https://doi.org/10.1007/978-3-030-06016-9_6
[Fragments of further reference entries; authors and titles lost in extraction:]
Lecture Notes in Computer Science, vol. 7031, pp. 697–713. Springer (2011). DOI 10.1007/978-3-642-25073-6_44. URL https://doi.org/10.1007/978-3-642-25073-6_44
URL https://doi.org/10.1145/1242572.1242667
Lecture Notes in Computer Science, vol. 11799, pp. 375–379. Springer (2019). DOI 10.1007/978-3-030-30760-8_37. URL https://doi.org/10.1007/978-3-030-30760-8_37
(10), 78–85 (2014). DOI 10.1145/2629489. URL https://doi.org/10.1145/2629489
CEUR Workshop Proceedings, vol. 206. CEUR-WS.org (2006). URL http://ceur-ws.org/Vol-206/paper8.pdf
(4), 5–33 (1996). URL
abs/2009.11564 (2020). URL https://arxiv.org/abs/2009.11564
URL https://doi.org/10.1145/3038912.3052558
OASIcs, vol. 70, pp. 15:1–15:8. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). DOI 10.4230/OASIcs.LDK.2019.15. URL https://doi.org/10.4230/OASIcs.LDK.2019.15
(1), 63–93 (2016). DOI 10.3233/SW-150175. URL https://doi.org/10.3233/SW-150175
, 38 (2019). DOI 10.3389/fdata.2019.00038. URL