Generate FAIR Literature Surveys with Scholarly Knowledge Graphs
Allard Oelen
L3S Research Center, Leibniz University Hannover, [email protected]
Mohamad Yaser Jaradeh
L3S Research Center, Leibniz University Hannover, [email protected]
Markus Stocker
TIB Leibniz Information Centre for Science and Technology, [email protected]
Sören Auer
TIB Leibniz Information Centre for Science and Technology & L3S Research Center, [email protected]
ABSTRACT
Reviewing scientific literature is a cumbersome, time consuming but crucial activity in research. Leveraging a scholarly knowledge graph, we present a methodology and a system for comparing scholarly literature, in particular research contributions describing the addressed problem, utilized materials, employed methods and yielded results. The system can be used by researchers to quickly get familiar with existing work in a specific research domain (e.g., a concrete research question or hypothesis). Additionally, it can be used to publish literature surveys following the FAIR Data Principles. The methodology to create a research contribution comparison consists of multiple tasks, specifically: (a) finding similar contributions, (b) aligning contribution descriptions, (c) visualizing and finally (d) publishing the comparison. The methodology is implemented within the Open Research Knowledge Graph (ORKG), a scholarly infrastructure that enables researchers to collaboratively describe, find and compare research contributions. We evaluate the implementation using data extracted from published review articles. The evaluation also addresses the FAIRness of comparisons published with the ORKG.
KEYWORDS
Scholarly Knowledge Comparison; Scholarly Information Systems; Comparison User Interface; Digital Libraries; Scholarly Communication; FAIR Data Principles
When conducting scientific research, reviewing the existing literature is an essential activity [33]. Familiarity with the state-of-the-art is required to effectively contribute to advancing it and do relevant research. Mainly because published scholarly knowledge is unstructured [17], it is currently very tedious to review existing literature. Relevant literature has to be found among hundreds and increasingly thousands of PDF articles. This activity is supported by library catalogs and online search engines, such as Scopus or Google Scholar [18]. Because the search is keyword based, typically large numbers of articles are returned by search engines. Researchers have to manually identify the relevant papers. Having identified the relevant papers, the relevant pieces of information need to be extracted in order to obtain an overview of the literature. Overall, these are manual and time consuming steps. We argue that a key issue is that the scholarly knowledge communicated in the literature does not meet the FAIR Data Principles [40]. While PDF articles can be found and accessed (assuming Open Access or an institutional subscription), the scholarly literature is insufficiently interoperable and reusable, especially for machines. For units more granular than the PDF article, such as a specific result, findability and accessibility score low even for humans.

We present a methodology and its implementation integrated into the Open Research Knowledge Graph (ORKG) [15] that can be used to generate and publish literature surveys in the form of machine actionable, comparable descriptions of research contributions. Machine actionability of research contributions relates to the ability of machines to access and interpret the contribution data. The benefits for researchers of such an infrastructure are (at least) twofold. Firstly, it supports researchers in creating state-of-the-art overviews for specific research problems efficiently. Secondly, it supports researchers in publishing literature surveys that adhere to the FAIR principles, thus contributing substantially to the reuse of state-of-the-art overviews and the information they contain, for both humans and machines.

Literature reviews are articles that focus on analysing existing literature. Among other things, reviews can be used to gain understanding about a research problem or to identify further research directions [8, 29]. Reviews can be used by authors to quickly obtain an overview of either emerging or mature research topics [36]. Review papers are important for research fields to develop. When review papers are lacking, the development of a research field is weakened [38]. Compiling literature review papers is a complicated task [39] and is often more time consuming than performing original research [38]. The structure of such articles often consists of tables that compare published research contributions. Although in the literature the terms "literature review" and "literature survey" are sometimes used interchangeably, we make the following distinction. We refer to the tables in review articles as literature surveys. Together with a (textual) analysis and explanation, they form the literature review. The state-of-the-art (SoTA) analysis is a special kind of literature review with the objective of comparing the latest and most relevant papers in a specific domain.

We implement the presented methodology in the ORKG. The ORKG is a scholarly infrastructure designed to acquire, publish and process structured scholarly knowledge published in the literature [14].
The ORKG is part of a larger research agenda aiming at machine actionable scholarly knowledge, which understands the ability to more efficiently compare literature as a key feature. We tackle the following research questions:

(1) How to generate literature surveys using scholarly knowledge graphs?
(2) How to ensure that published literature surveys comply with the FAIR principles?
(3) How to effectively specify and visualize literature surveys in a user interface?

In support of the first research question, we present a methodology that describes the steps required to generate literature surveys. In support of the second research question, we describe how the FAIRness of the published literature review is ensured. Finally, in support of the third research question, we demonstrate how the methodology is implemented within the ORKG.

The paper is structured as follows. Section 2 motivates the work. Section 3 reviews related work. Section 4 presents the system design, the underlying methodology and its implementation. Section 5 explains how the knowledge graph is populated with data. Section 6 presents the evaluation of the system, specifically system FAIRness and performance. Finally, Section 7 discusses the presented and future work.

We motivate our work by means of two use cases that underscore the usefulness of a literature survey generation system. In the first use case, a researcher wants to obtain an overview of state-of-the-art research addressing a specific problem. The second use case describes how a researcher can publish a FAIR compliant literature review with the ORKG.
Familiarize with the state-of-the-art.
A state-of-the-art (SoTA) analysis reviews new and emerging research. Such analyses are useful for multiple reasons. Firstly, they provide a broad overview of a research problem and support understanding. Secondly, they juxtapose different approaches for a problem. Thirdly, they can support claims on why certain research is relevant by giving an overview of the breadth of research addressing a problem. The proposed approach enables automated generation of surveys to quickly obtain an overview of state-of-the-art research, as well as sharing of surveys for others to reuse.
Publishing of literature reviews.
Literature reviews typically consist of multiple (survey) tables in which different approaches from original papers are compared based on a set of properties. These tables can be seen as the main contribution and most informative part of the review paper, since the tables juxtapose and compare existing work. Comparison tables are published in review papers as static content in PDF documents. This presentational format is generated from datasets that typically contain more (structured) information than what is presented in the published table. However, the additional information is not published. It is "dark data" which is not stored or indexed and likely lost over time [12]. Furthermore, published tables are not machine actionable. Their overall low FAIRness hinders reusability of the published content. With the presented service, it is possible to publish a literature survey with high FAIRness, i.e., one that is compliant with the FAIR principles to a high degree. Section 3 discusses this aspect in more detail.
Summary of weaknesses of the current approach to literature review.
The weaknesses of the current approach to literature review can be summarized as follows:

• Static – reviews are static, since once published as PDF they are rarely updated and there are no possibilities or incentives for creating new or updated reviews for considerable time.
• Lack of machine assistance – machine assistance is hardly possible, since the PDF representation of reviews is only human readable and relevant raw data is mostly not published along with the review.
• Delay – reviews are produced and published with significant delay (often years) after the original research work was done.
• Coverage – due to the amount of work required, reviews are often only performed for relatively popular research topics and are stale or missing for less popular topics.
• Lacking collaboration – collaboration on reviews is not possible and reviews currently represent only the viewpoint of the few authors, not the community.
• Missing overarching systematic semantic representation – the overlap between different reviews and related work sections in individual original research papers is not explicit and cannot be exploited.

We deem that these weaknesses of the current approach to scholarly literature review and synthesis significantly hinder scientific progress.
The task of comparing research contributions can be viewed in light of the more general task of comparing resources (or entities) in a knowledge graph. While this is a well-known task in multiple domains (for instance in e-commerce systems [42]), not much work has focused on comparison in knowledge graphs specifically. One of the few works with this focus is by Petrova et al. [28], who created a framework for comparing entities in RDF graphs using SPARQL queries. In order to compare contributions, they first have to be found. Finding is an information retrieval problem. As a well-known technique, TF-IDF [21] can be used for this task. More sophisticated techniques can be used to determine the structural similarity between graphs (e.g., [20]) and to match semantically similar predicates. This relates to dataset interlinking [1] or, more generally, ontology alignment [34]. For property alignment, techniques of interest include edit distance (e.g., Jaro-Winkler [41] or Levenshtein [19]) and vector distance. Gromann and Declerck [10] found that fastText [4] performs best for ontology alignment.

In light of the FAIR Data Principles [40], scholarly data should be Findable, Accessible, Interoperable and Reusable both for humans and machines. Due to the publication format, literature survey tables published in scholarly articles weakly adhere to the FAIR guidelines, particularly so for machines. Scholarly data should be considered first-class objects [35], including data used to create literature surveys. Rodríguez-Iglesias et al. [30] describe the difficulties of making data FAIR within the plant sciences. They argue that it is more complicated than reformatting data. On the other hand, they suggest that most FAIR principles can be implemented relatively easily by using off-the-shelf technologies. Boeckhout et al. [3] argue that the FAIR principles alone are not sufficient to lead to responsible data sharing. More applied principles are needed to ensure better scholarly data. This claim is supported by the findings of Mons et al. [22], who suggest that there are very diverse interpretations of the guidelines. In their work, they try to clarify what is FAIR and what is not.

An efficient literature comparison relies on scholarly knowledge being represented in a structured way. There is substantial related work on representing scholarly knowledge in structured form [31]. Building on the work of numerous philosophers of science, Hars [11] proposed a comprehensive scientific knowledge model that includes concepts such as theory, methodology and statement. More recently, ontologies were engineered to describe different aspects of the scholarly communication process. Semantic Publishing and Referencing (SPAR) is a collection of ontologies (http://purl.org/spar/{cito,c4o,fabio,biro,pro,pso,pwo,doco,deo}) that can be used to describe scholarly publishing and referencing of documents [5, 9, 26, 27]. Ruiz Iniesta and Corcho [31] reviewed the state-of-the-art ontologies to describe scholarly articles. Sateli and Witte [32] use some of these scholarly ontologies to add semantic representations of scholarly articles to the Linked Open Data cloud. A literature survey comparing scholarly ontologies is available via the ORKG. Most of these ontologies are designed to capture metadata about and structure of scholarly articles, not the content communicated in articles. Another literature survey is created to compare approaches for semantically representing scholarly communication.

An initial attempt at semantifying review articles was made in [7].
The work comprises a relatively rigid ontology for describing contributions (mainly centered around research problems, approaches, implementations and evaluations) and a prototypical implementation using Semantic MediaWiki. We relax this constraint, since we are not limited by a rigid ontology schema but rather allow arbitrary domain-specific semantic structures for research contributions. The work by Vahdati et al. [37] focuses on semantic article representations for generating literature overviews. Their method is to use crowdsourcing to generate the overviews. Kohl et al. [16] present CADIMA, a system that supports systematic literature reviews. The tool supports the formal process of performing a literature review but does not, for example, publish data in machine actionable form for reuse.
We now present the system design of the literature comparison service. It consists of a methodology that describes how to perform a comparison of research contributions. An early version of this methodology has been presented at the 3rd SciKnow workshop [24]. The methodology consists of five steps: 1) finding comparison candidates, 2) selecting related statements, 3) aligning contribution descriptions, 4) visualizing comparisons and 5) publishing FAIR comparisons. The methodology is depicted in Figure 1.
Figure 1: Research contribution comparison methodology.

First, we discuss the data structure of the ORKG, which forms the foundation of the comparison. Then, each step of the methodology is described in more detail. Finally, we discuss the implementation.
In the ORKG, each paper is typed with the paper class. A paper consists of at least one research contribution, which addresses at least one research problem. Research contributions consist of contribution data that describe the contribution. For instance, a paper in Computer Science might have descriptions for materials, methods, implementation and results as contribution data. These predefined core concepts can be easily extended with domain specific research problems, methods, etc. in ORKG curation using crowdsourcing or other curation approaches. The underlying data structure uses the notion of statements. Statements are triples that consist of a subject, a predicate (also called a property) and an object. The granularity of a comparison is the research contribution, meaning that contributions are compared rather than papers. For simplicity, we use the terms "paper comparison" and "contribution comparison" interchangeably. Because a comparison happens on contribution level, it is possible to compare specific elements of a paper instead of the complete paper. The benefit of this is that a comparison does not contain data from irrelevant contributions. The ORKG OWL ontology is available online (https://gitlab.com/TIBHannover/orkg/orkg-ontology).

To perform a comparison, a starting contribution is needed. This contribution is called the main contribution and is always manually selected by a user. The main contribution is compared against other comparison contributions. There are two different approaches for selecting the comparison contributions. The first approach automatically selects comparison contributions based on similarity. The second approach lets users manually select contributions.
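To make the statement-based structure more concrete, the following sketch models a paper and one of its contributions as plain subject-predicate-object triples in Python. The identifiers and property names are invented for illustration and do not reproduce the actual ORKG ontology.

# Hypothetical triples describing one paper; identifiers and property names
# are illustrative only, not the real ORKG vocabulary.
statements = [
    ("paper:R1", "rdf:type", "orkg:Paper"),
    ("paper:R1", "orkg:hasContribution", "contribution:R2"),
    ("contribution:R2", "orkg:addressesProblem", "problem:question-answering"),
    ("contribution:R2", "orkg:hasImplementation", "resource:TBSL"),
    ("contribution:R2", "orkg:hasEvaluation", "evaluation:R3"),
    ("evaluation:R3", "orkg:fMeasure", "0.4250"),  # a literal value
]

# Contributions, not papers, are the unit of comparison: collect the statements
# that directly describe a single contribution.
def describe(subject, triples):
    return [(p, o) for s, p, o in triples if s == subject]

print(describe("contribution:R2", statements))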
Comparing contributions only makes sense when contributions can sensibly be compared. For example, it does not make (much) sense to compare a biology paper to a history paper. We thus argue that it only makes sense to compare contributions that are similar. More specifically, contributions that share the same (or a similar set of) properties are good comparison candidates. For instance, a paper about question answering has the property orkg:disambiguationTask (orkg: denotes the ontology of the ORKG system described in Section 4.1) and another paper uses the same property to describe which disambiguation tasks are performed. Since they share the same property, they are likely candidates for comparison. Finding similar contributions is therefore based on finding contributions that share the same or similar informative description properties. To achieve this, each comparison contribution is converted into a string by concatenating all properties of the contribution. TF-IDF [21] is used to query these strings with the string of the main contribution as query. The search returns the most similar contributions, weighting the most informative properties higher due to TF-IDF. The top-k contributions are selected and form a set of contributions that are used in the next step.

Figure 2 displays how the similar contribution selection is implemented. As depicted, three similar contributions are suggested to the user (with the corresponding similarity percentage displayed next to the paper title). These suggested contributions can be directly compared.

Figure 2: Implementation of the first step of the methodology: the selection of comparison candidates, showing both the similarity-based and the manual selection approaches.

There are scenarios where comparison based on similarity computation is not suitable or desired. For example, a researcher wants to compare a specific set of implementations to see which performs best. Therefore, the manual selection method is implemented in a similar fashion to an e-commerce shopping cart. When the "Add to comparison" checkbox is checked, a box appears listing the selected contributions (Figure 3).

Figure 3: Box showing the manually selected contributions.
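A minimal sketch of this similarity-based candidate selection, using scikit-learn's TF-IDF vectorizer over the concatenated property labels of each contribution; the contribution IDs and property strings below are made up, and the production service may differ in detail.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Each contribution is represented by the concatenation of its property labels
# (illustrative data only).
contributions = {
    "R10": "disambiguation task phrase mapping task implementation evaluation",
    "R11": "disambiguation task implementation f-measure",
    "R12": "study population sampling method intervention outcome",
}

main_id = "R10"  # the manually selected main contribution
ids = list(contributions)
docs = [contributions[i] for i in ids]
matrix = TfidfVectorizer().fit_transform(docs)

# Rank the other contributions by cosine similarity to the main contribution;
# TF-IDF automatically weights informative (rare) properties higher.
scores = cosine_similarity(matrix[ids.index(main_id)], matrix).ravel()
ranking = sorted(((i, s) for i, s in zip(ids, scores) if i != main_id),
                 key=lambda x: x[1], reverse=True)
print(ranking[:3])  # top-k comparison candidates suggested to the user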
This step selects the statements from the graph related to the set of contributions selected in the previous step. Statements are selected transitively to match contributions in subject or object position. This search is performed until a predefined maximum transitive depth δ has been reached. The intuition is that the deeper a property is nested, the less likely it is relevant for the comparison. The process of selecting statements is repeated until the maximum depth δ has been reached.
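A sketch of this transitive selection: starting from the selected contributions, statements whose subject or object is reachable from the current frontier are collected, up to the maximum depth δ. The triple format follows the earlier sketch; this is not the actual ORKG implementation.

def select_statements(start_nodes, triples, max_depth):
    # Collect statements reachable from the start nodes within max_depth hops.
    selected = []
    frontier, visited = set(start_nodes), set(start_nodes)
    for _ in range(max_depth):
        next_frontier = set()
        for s, p, o in triples:
            # A statement is selected if its subject or object is on the frontier.
            if (s in frontier or o in frontier) and (s, p, o) not in selected:
                selected.append((s, p, o))
                for node in (s, o):
                    if node not in visited:
                        visited.add(node)
                        next_frontier.add(node)
        frontier = next_frontier
    return selected

# With the triples from the earlier sketch:
# select_statements({"contribution:R2"}, statements, max_depth=2)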
As described in the first step, comparisons are built using shared or similar properties of contributions. In case the same property has been used between contributions, these properties are grouped and form one comparison row. However, often different properties are used to describe the same concept. This occurs for various reasons. The most obvious reason is when two different ontologies are used to describe the same property. For example, for describing the population of a city, DBpedia uses dbo:populationTotal while Wikidata uses wikidata:population (actually the property identifier is P1082; for the purpose here we use the label). When comparing contributions, these properties should be considered as equivalent. Especially for community-created knowledge graphs, differently identified properties likely exist that are, in fact, equivalent.

To overcome this problem, we use pre-trained fastText [4] word embeddings to determine the similarity of properties. If the similarity is higher than a predetermined threshold τ, the properties are considered equivalent and are grouped. A similarity vector γ is generated:

\gamma_{p_i} = \left\langle \cos(\vec{p}_i, \vec{p}_j) \right\rangle \quad (1)

with \cos(\cdot) as the cosine similarity of vector embeddings for property pairs (p_i, p_j) \in P, whereby P is the set of all properties. Furthermore, we create a mask matrix Φ that selects properties of contributions c_i \in C, whereby C is the set of contributions to be compared. Formally,

\Phi_{i,j} = \begin{cases} 1 & \text{if } p_j \in c_i \\ 0 & \text{otherwise} \end{cases} \quad (2)

Next, we create the matrix φ that slices Φ to include only similar properties. Formally,

\varphi_{i,j} = \left(\Phi_{i,j}\right)_{c_i \in C,\; p_j \in \mathrm{sim}(p)} \quad (3)

where sim(p) is the set of properties with similarity values γ[p] ≥ τ with respect to property p. Finally, φ is used to efficiently compute the common set of properties [14]. This process is displayed in Algorithm 1.

Algorithm 1 Align contribution descriptions
procedure AlignProperties(properties, threshold)
    for each property p_i ∈ properties do
        for each property p_j ∈ properties do
            similarity ← cos(Emb(p_i), Emb(p_j))
            if similarity > threshold then
                similarProps ← similarProps ∪ {p_i, p_j}
    return similarProps
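The loop of Algorithm 1 can be sketched in Python as follows. The embed function stands in for a lookup of pre-trained fastText vectors (for multi-word labels, e.g., the average of the word vectors), and the threshold value is an arbitrary placeholder for τ.

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def align_properties(properties, embed, threshold=0.85):
    # Group property labels whose embedding similarity exceeds the threshold;
    # each group later forms a single comparison row.
    groups = []
    for p in properties:
        vec = embed(p)
        for group in groups:
            if any(cosine(vec, embed(q)) >= threshold for q in group):
                group.add(p)
                break
        else:
            groups.append({p})
    return groups

With such a grouping, dbo:populationTotal and wikidata:population would end up in the same comparison row, provided their label embeddings are sufficiently close.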
The next step of the workflow is to visualize the comparison and present the data in a human understandable format. Tabular format is often appropriate for visualizing comparisons, since tables provide a good overview of data. Another aspect of the visualization is determining which properties should be displayed and which ones should be hidden. A property is displayed when it is shared among a predetermined number α of contributions, where α mainly depends on comparison use and can be determined based on the total number of contributions in the comparison. By default, only properties that are common to at least two contributions (α ≥ 2) are displayed.

Another aspect of comparison visualization is the possibility to customize the resulting table. This is needed because of the similarity-based matching of properties and the use of predetermined thresholds. For example, users should be able to enable or disable properties. They should also get feedback on property provenance (i.e., the property's path in the graph). Ultimately, this contributes to a better user experience, with the possibility to manually correct mistakes made by the system.

Figure 4 displays a comparison for research contributions related to visualization tools published in the literature. In this example, four properties are displayed. Literals are displayed as plain text while resources are displayed as links. When a resource link is selected, a popup is displayed showing the statements related to this resource. The UI implements some additional features that are particularly useful to compare research contributions.
Customization.
Users can customize comparisons, including transposing the table as well as hiding and rearranging the properties. Especially the option to hide properties is helpful when contributions with many statements are compared. Only properties considered relevant by the user can be selected for display. Customizing the comparison table can be useful before exporting or sharing the comparison.
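As an illustration, the two customizations described here (hiding properties and transposing the table) correspond to simple table manipulations; the sketch below uses pandas with invented data and is not the ORKG front end code.

import pandas as pd

# Comparison table: rows are properties, columns are contributions (invented data).
table = pd.DataFrame(
    {"Contribution A": ["TBSL", "0.4250", "yes"],
     "Contribution B": ["SystemX", "0.3100", "no"]},
    index=["has implementation", "f-measure", "open source"],
)

visible = table.drop(index=["open source"])  # hide a property deemed irrelevant
print(visible.T)                             # transpose: contributions become rows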
Sharing and persistence.
Comparisons can be shared using a persistent link. Especially when sharing the comparison for research purposes, it is important to refer to the original comparison. Since contribution descriptions may change over time, comparisons may also change. To support persistency, the whole state of the comparison is stored in a document-oriented database and retrieved when the permalink is invoked.
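A rough sketch of what storing the whole comparison state could look like; the field names and the file-based store below are assumptions for illustration, whereas the ORKG uses a document-oriented database behind its API.

import json
import secrets
from pathlib import Path

STORE = Path("comparisons")  # stand-in for the document-oriented database
STORE.mkdir(exist_ok=True)

def save_comparison(contribution_ids, properties, transposed=False):
    # Persist the full comparison state and return a short ID for the permalink.
    comparison_id = secrets.token_urlsafe(6)
    state = {"contributions": contribution_ids,
             "properties": properties,      # including hidden/reordered ones
             "transposed": transposed}
    (STORE / f"{comparison_id}.json").write_text(json.dumps(state))
    return comparison_id

def load_comparison(comparison_id):
    return json.loads((STORE / f"{comparison_id}.json").read_text())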
Export.
It is possible to export comparisons in different output formats such as PDF, CSV, RDF and LaTeX. The LaTeX export is useful for direct integration in research papers. Together with the LaTeX table, a BibTeX file containing the bibliographic information of the papers used in the comparison is also generated. Also, a persistent link referring back to the comparison in the ORKG is shown as a table footnote.
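To illustrate the export step, a minimal LaTeX table generator for a comparison; the cell values are invented and the real ORKG export format is not reproduced here (the sketch emits booktabs-style rules).

def to_latex(properties, papers, values):
    # Render a comparison as a LaTeX tabular; values[prop][paper] holds cell text.
    cols = "l" * (len(papers) + 1)
    lines = ["\\begin{tabular}{" + cols + "}", "\\toprule",
             "Property & " + " & ".join(papers) + " \\\\", "\\midrule"]
    for prop in properties:
        row = [values.get(prop, {}).get(p, "-") for p in papers]
        lines.append(prop + " & " + " & ".join(row) + " \\\\")
    lines += ["\\bottomrule", "\\end{tabular}"]
    return "\n".join(lines)

print(to_latex(
    ["has implementation", "f-measure"],
    ["Paper A", "Paper B"],
    {"has implementation": {"Paper A": "TBSL", "Paper B": "SystemX"},
     "f-measure": {"Paper A": "0.4250"}}))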
Figure 4: Comparison of research contributions related to visualization tools.
Visualized and customized comparison tables can be stored. Storing tables is part of the publishing process and therefore only needed when a generated table is going to be used in a paper. In order to regenerate the table, the whole state of the comparison should be saved. The knowledge graph from which the comparison was generated changes over time and thus storing just the URIs of the respective papers would not suffice. While saving a comparison, the user can provide additional metadata to ensure findability, an aspect of the FAIR principles. Metadata include a comparison title, which would normally consist of a one sentence description of the comparison. Additionally, a longer textual description can be provided. This metadata is extended with machine generated data, such as the creation date and the creator of the comparison. The metadata is stored in the knowledge graph to support easy access and interoperability. In Figure 5, the structure of the metadata is displayed using the Dublin Core Metadata Terms (https://dublincore.org/specifications/dublin-core/dcmi-terms). The comparison data itself is stored in a document-oriented database. An RDF export of both the metadata and the comparison data can be generated. The comparison data is modeled with the RDF Data Cube Vocabulary. A unique identifier is attached when the comparison is saved. This ID is used when the comparison is shared or when it is referenced in a paper. The literature comparison can also be performed without publishing. Although the workflow and the steps to create a comparison stay the same, the goal is different. Instead of creating a comparison that will be published and referenced in a paper, the comparison will be used by the researcher herself.

The user interface of the comparison feature is seamlessly integrated with the ORKG front end, which is written in JavaScript and is publicly available (https://gitlab.com/TIBHannover/orkg/orkg-frontend). The back end of the comparison feature is a service separate from the ORKG back end, written in Python and also available open source (https://gitlab.com/TIBHannover/orkg/orkg-similarity). The comparison back end is responsible for steps two and three of the comparison methodology.
Figure 5: The graph structure of the metadata for a published comparison. The dcterms: prefix denotes the Dublin Core Metadata Terms ontology.

The input in step two is the set of contribution IDs. The API selects the related statements, aligns the properties and returns the data needed to visualize the comparison. This data includes the list of papers, the list of all properties and the values per property.
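A sketch of the metadata graph shown in Figure 5, built with rdflib; the comparison IRI, the creator IRI and the :hasUrl property are placeholders, the literal values are invented, and only the Dublin Core terms from the figure are used.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, XSD

ORKG = Namespace("http://orkg.org/orkg/")             # placeholder namespace
comparison = URIRef("http://orkg.org/orkg/c/XYZ123")  # placeholder comparison IRI

g = Graph()
g.bind("dcterms", DCTERMS)
g.add((comparison, DCTERMS.description, Literal("Comparison of question answering systems")))
g.add((comparison, DCTERMS.date, Literal("2020-06-01", datatype=XSD.date)))
g.add((comparison, DCTERMS.creator, URIRef("http://orkg.org/orkg/user/42")))
g.add((comparison, ORKG.hasUrl, Literal("https://orkg.org/orkg/comparison/XYZ123")))
g.add((comparison, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by-sa/2.0")))

print(g.serialize(format="turtle"))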
In order to generate useful literature reviews, it is crucial for the knowledge graph to contain sufficient and relevant papers. Populating the knowledge graph with high quality paper descriptions is not straightforward. Structured descriptions of papers should be created in such a way that it is possible to compare papers based on shared properties. Both published papers and papers that will be published in the future should be added to the ORKG, retrospectively or prospectively. Although a comprehensive description of how to populate the ORKG is out of scope here, we now briefly describe how we envision populating the ORKG in a manner that would facilitate comparing contributions.

Prospectively, authors can become part of generating structured descriptions of their papers. This should be done in a crowdsourced manner and can become part of the paper submission process. Input templates that collect relevant properties can be used to ensure structured and comparable paper descriptions. Retrospectively, automated (machine learning) methods can be helpful to ensure scalability of the process of adding a paper.
To populate the ORKG with comparable paper descriptions, we leverage the data published in review papers. Review papers consist of high quality, curated and often structured data that is collected from a set of papers that address the same (or a similar) research problem. Hence, using reviews to populate a scholarly knowledge graph is a relatively straightforward approach to obtain high quality structured paper descriptions. We now present a methodology to convert survey paper data into a knowledge graph structure. The steps are as follows:
(1) Survey paper selection.
The first step is the selection of survey papers that are suitable for building a knowledge graph. Firstly, the survey should compare peer-reviewed scientific articles. For instance, a comparison of different systems without a reference to peer-reviewed work is not suitable for the scholarly knowledge graph. Secondly, the review should compare the papers' content in a structured way and should not merely list work in a field. Especially reviews that present their results and literature comparisons in tabular format are suitable. The result of this step is a list of papers that will be added to the ORKG.
(2) Table selection.
Given the selected survey papers, tables have to be selected. Some surveys contain only one table while in others multiple tables are presented. In some cases a collection of tables can be joined into one larger table.
(3) Data modeling.
Given the selected tables, a suitable graph structure has to be determined. The data structure has to be modeled. For instance, when implemented systems are compared, a suitable structure could be: [has implementation] -> System name. The referenced system can be described with a list of properties to be compared. Additionally, a research problem has to be defined, which is typically the same for all papers that are part of the table.
(4) Metadata collection.
Next, the metadata for the papers that are referenced in the survey table is collected. In case a referenced paper has a DOI (Digital Object Identifier), the metadata can be automatically retrieved via a lookup service (e.g., Crossref). Otherwise, at least the title, authors and publication date have to be collected.

(5) Data ingestion.
Finally, the paper data is ingested into the knowledge graph. The paper data consists of both the paper's metadata and the extracted data from the comparison table. This does not result in a single description of the survey paper; each paper referenced in the survey table is ingested individually. In order to speed up the process of adding papers, we developed a Python package (https://gitlab.com/TIBHannover/orkg/orkg-pypi) that provides a function to add a paper to the knowledge graph (a sketch of the metadata collection and ingestion steps follows below).

This methodology has been used to populate the ORKG with comparable paper data. The data is used to evaluate the presented literature review tool. The imported paper data is not only useful for the evaluation, but also provides significant value to the ORKG itself.

In total, four review papers were selected for importing into the ORKG. The Python script for importing the table data is available online (https://gitlab.com/TIBHannover/orkg/orkg-papers). From those papers, 12 different tables were imported. Together, 169 papers were reviewed in those four survey papers. This resulted in a total of 3750 statements being added to the knowledge graph. Table 1 lists the imported review papers and tables. The survey papers address different research problems. Figure 6 depicts an excerpt of the resulting graph for one particular paper. A set of comparison tables made with the imported data is available online (https://orkg.org/orkg/featured-comparisons). This list includes some alternative comparison tables that were generated with the same data.
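Illustrating the metadata collection and data ingestion steps above, the following sketch looks up a paper's metadata via the public Crossref REST API and assembles a paper description; the contribution data and property names are invented, and the actual ingestion is performed with the ORKG Python package rather than this structure.

import requests

def crossref_metadata(doi):
    # Fetch title, authors and publication year for a DOI from Crossref.
    msg = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10).json()["message"]
    return {
        "title": msg["title"][0],
        "authors": [f"{a.get('given', '')} {a.get('family', '')}".strip()
                    for a in msg.get("author", [])],
        "year": msg["issued"]["date-parts"][0][0],
    }

# One row of a survey table, modeled as contribution data (invented values).
row = {"research problem": "Question answering",
       "has implementation": "TBSL",
       "f-measure": "0.4250"}

paper = {"metadata": crossref_metadata("10.1007/s10115-017-1100-y"),  # DOI of reference [6]
         "contributions": [row]}
print(paper)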
In this section, we present an evaluation of multiple aspects of the presented comparison methodology and implementation. Firstly, we evaluate information representation. Then, we evaluate the FAIRness of published reviews. Finally, we present a performance evaluation that tests the scalability.

Table 1: List of imported survey tables in the ORKG. The paper and table reference can be used to identify the original table.

Paper reference | Table reference | Research problem | Papers | ORKG representation | Information loss
Bikakis and Sellis [2] | Table 1 | Generic visualizations | 11 | https://orkg.org/orkg/c/pdLJDk | No
Bikakis and Sellis [2] | Table 2 | Graph visualizations | 21 | https://orkg.org/orkg/c/Rx476Z | No
Diefenbach et al. [6] | Table 2 | Question answering evaluations | 33 | https://orkg.org/orkg/c/gaVisD | No
Diefenbach et al. [6] | Table 3,4,5,6 | Question answering systems | 26 | https://orkg.org/orkg/c/IuEWl2 | No
Hussain and Asghar [13] | Table 4 | Author name disambiguation | 5 | https://orkg.org/orkg/c/vDxKdr | No
Hussain and Asghar [13] | Table 5 | Author name disambiguation | 6 | https://orkg.org/orkg/c/XXg8Wg | No
Hussain and Asghar [13] | Table 6 | Author name disambiguation | 9 | https://orkg.org/orkg/c/9rOwPV | No
Hussain and Asghar [13] | Table 7 | Author name disambiguation | 6 | https://orkg.org/orkg/c/mB7kIK | No
Naidu et al. [23] | Table 4 | Text summarization | 52 | https://orkg.org/orkg/c/OUqYB9 | No
Figure 6: Partial graph structure of an imported paper. Orange colored resources indicate potentially interesting values for a paper comparison.
This part of the evaluation focuses on the aspect of information representation. We use the data from the imported review papers, as described in Section 5. In order to build and publish useful and correct literature reviews, at a minimum our service should display the same information that was originally presented in the review tables. This means that there should not be information loss when review tables are published using our service. If there is no information loss, our service can be used as an alternative to the current way of publishing review tables. Apart from generating the same table, the added value comes from the ability to aggregate new (tabular) views using the same data as well as the increased FAIRness of the data published via our service.

For each of the imported review tables, listed in Table 1, we can evaluate whether the same table can be generated with our service. For this, we have compared the table from the review paper to the table generated by the ORKG comparison service. A collection of 169 papers with 9 distinct literature views/tables is part of this evaluation. These tables can be viewed online; the links are listed in the "ORKG representation" column. The results of this evaluation are displayed in the same table, in the column "Information loss". As the results show, using our service it is possible to recreate the same tabular views as originally published in the review papers.
As described before, with the presented service it is possible to publish a generated comparison that adheres to the FAIR principles. Because the service leverages a knowledge graph to generate and save comparisons, complying with the FAIR principles is more obvious for the ORKG comparison service than for tables in published PDF articles. In order to evaluate the FAIRness of a published comparison, we evaluate each of the four FAIR principles in detail (for a more detailed definition of the FAIR principles, see https://go-fair.org/fair-principles). Wilkinson et al. [40] described each principle by assigning sub-principles. We discuss the relevant sub-principles and explain how they are met. We use the term (meta)data to refer to both the actual comparison data (i.e., the data that is used to create the comparison table) and the associated metadata (e.g., the title, description and creator of a comparison). Table 2 presents an overview of the evaluation of the FAIR principles.
Findable.
To make data findable for both humans and machines (i.e., agents), a unique and persistent identifier should be attached to the data (F1). Additionally, metadata should describe the data (F2). In the metadata, the unique identifier of the data should be mentioned (F3). Also, a search interface should be available to find the data (F4). To ensure the findability of comparisons, users can title and describe them. Furthermore, machine generated metadata is attached to a comparison (e.g., the number of papers and the creation date). A unique identifier is generated and attached to the data and included in the metadata. Finally, the ORKG search interface allows users to search the whole graph and has a dedicated filter to specifically find comparisons. Additionally, comparisons can be indexed and found by third-party search engines (such as Google or Bing).
Accessible.
Having found data, agents need to know how to access it. This principle is primarily about using accessible standardised communication protocols (A1). Additionally, metadata should be available even when the data is not (A2).
To ensure the interoperability of data, it shoulduse a formal language for knowledge representation (I1) and shoulduse vocabularies that are FAIR (I2). Finally, references or links toother (meta)data should be made (I3). As argued before, thanks tohighly structured data and the integration of shared vocabularies,interoperability is an inherent feature of knowledge graphs. Datais (partially) described using the ORKG core ontology and otherontologies we use to canonicalize the representation of relevantinformation content types. Links to other data are present in theknowledge graph. For example, if a comparison uses the “Web”resource to specify the domain of an application, this resource isgeneric, can be shared among paper descriptions and comparisons,and can be described in more detail, independently of a particularcomparison.
Reusable.
Finally, data should be reusable. This can be accomplished by adding relevant (meta)data (R1). Required are an accessible data license (R1.1) and detailed provenance data (R1.2). Finally, (meta)data should use community standards to describe data (R1.3). It is possible to add additional metadata to a comparison, e.g., metadata about the scope of the comparison, which could be a reference to the paper in which the comparison is being used. This metadata complements the metadata that is already part of the Findability principle, e.g., provenance data about the creator of the comparison. The data license of the graph data is CC BY-SA (Attribution-ShareAlike, https://creativecommons.org/licenses/by-sa/2.0), which allows reuse of the data. There is currently no community standard to describe the comparison data. However, standard ontologies are used to describe metadata (e.g., Dublin Core).

The evaluation of the FAIR principles shows that comparisons published with our service rank high in FAIRness, which can be even further increased with some effort from users. Users are mainly responsible for adding the correct information to the comparison and reusing vocabularies. Otherwise, findability, accessibility and to some extent also interoperability are largely handled by the service.

In order to evaluate the performance of the overall comparison, we compared the implemented ORKG approach to a naive approach for comparing multiple resources. The naive approach compares each property against all other properties to perform the property alignment. Table 3 shows the time needed to generate comparisons, for both the naive and the ORKG approach. In total, eight papers are compared with on average ten properties per paper. In the naive approach, the "Align contribution descriptions" step does not scale well, since each property is compared against all others: the number of property similarity checks grows quadratically with the total number of properties. Table 3 shows that the ORKG approach outperforms the naive approach. The total number of papers used for the evaluation is limited to eight because the naive approach does not scale to larger sets.

Table 2: Overview of FAIR principles compliance.
Principle | Level | Explanation

Findable
F1 | 3 | Unique IDs exist, DOI assignment for future work
F2 | 2 | Machine and user generated metadata is attached
F3 | 1 | Properties used to link data to metadata
F4 | 1 | Comparisons are findable via a search interface

Accessible
A1 | 2 | Data is accessed over HTTP (via REST or a user interface), requires user effort to integrate the ORKG API specification
A1.1 | 1 | The protocol is free and widely used
A1.2 | 1 | No authentication is required to access the data
A2 | 1 | Metadata is stored in a persistent way and available without the data itself

Interoperable
I1 | 1 | RDF (with type assertions) and CSV export of comparisons
I2 | 2 | Reuse of ontologies where possible (ORKG core, Dublin Core, RDF Data Cube Vocabulary); user responsible for other ontology reuse
I3 | 3 | For comparisons, the compared paper metadata is linked; more references are needed and can be created by users

Reusable
R1 | 1 | Machine and user generated metadata is created while publishing
R1.1 | 1 | CC BY-SA license
R1.2 | 1 | If a registered user publishes a comparison, the user is associated with the published data
R1.3 | 2 | Users can describe contributions using domain-relevant ontologies
Table 3: Time (in seconds) to perform comparisons with 2-8 contributions using the naive and ORKG approaches.
One of the aims of the contribution comparison functionality is to support literature reviews and make this activity less cumbersome and time consuming for researchers. To live up to this aim, more structured contribution descriptions are needed. Existing scholarly knowledge graph initiatives focus primarily on scholarly metadata, while with the ORKG we focus on making the actual research contributions machine readable. Currently, the ORKG does not yet contain sufficient contribution descriptions for the comparison functionality to be practically useful for researchers. Furthermore, for an evaluation of the effectiveness of certain components of the methodology (such as finding related papers or aligning similar properties), more contribution data is needed. Publishing surveys does not rely on data quantity and is therefore evaluated more extensively in this work. The performance evaluation results indicate that the comparison feature performs well. This means the technical infrastructure is in place for the literature survey service.

In the evaluation, we focused on the aspects of the system that are necessary for researchers to use the system in practice. The information representation evaluation is a straightforward evaluation to see if existing survey tables can be regenerated with the ORKG. This is a minimal requirement for researchers when using the system, since they should at least be able to recreate tables. This evaluation does not give insight into the usefulness and usability of the system, but still provides an indication that the service can be successfully used to publish literature surveys. One of the reasons for using the service is that also "dark data" in comparisons is published (as discussed in Section 2).

Another interesting aspect of the service is that published literature surveys rank high in FAIRness. Therefore, the second part of the evaluation focuses on how the FAIR principles are met. Merely publishing data as RDF is not sufficient to fully meet the FAIR principles. Hence, we conducted a more detailed evaluation that describes how the service complies with each sub-principle. Since FAIR is not a standard, the principles are permissive and not prescriptive [22]. No technical requirements are specified. Both the implementation and evaluation of the guidelines are therefore subject to interpretation. With respect to data interoperability and reusability, certain aspects of the service can be improved. For example, to improve interoperability, the contribution data should reuse existing vocabularies where possible. Additionally, although most of the FAIRification is done by the system, the researcher is responsible for adding correct and relevant metadata while publishing a survey.

As indicated earlier, the usefulness of the presented tool depends on the number of papers present in the knowledge graph. Therefore, future work will focus on data collection, both in a crowdsourced and automated manner. We plan on extending the methodology presented in Section 5 with automated extraction of data and tables from literature review papers. With the extracted review data, the knowledge graph can be extended more quickly than with the previously presented manual method. It could form the basis of a high quality scholarly knowledge graph that contains relevant and FAIR survey table data. Furthermore, in the future we will assign (DataCite) DOIs to published surveys. They will serve as persistent identifiers for the survey data [25].
Reviewing existing literature is an important but cumbersome and time consuming activity. To address this problem, we presented a methodology and service that can be used to generate literature surveys from a scholarly knowledge graph. This service can be used by researchers in order to get familiar with existing literature. Additionally, the tool can be used to publish literature surveys in a way that they largely adhere to the FAIR data principles. The presented methodology addresses multiple aspects, including finding suitable contributions, aligning contribution descriptions, visualization and publishing. The methodology is implemented within the Open Research Knowledge Graph (ORKG). Since the comparison relies on structured scholarly knowledge, we discussed how to populate the ORKG with relevant data. This is done by extracting tabular survey data from existing literature reviews. In order to evaluate whether the proposed service can be used to publish literature surveys, the original survey table representations were compared with the ones generated by our service. As the results indicate, it is possible to use the service as an addition to, or potentially even a replacement of, the current publishing approach, since the same tables can be generated. The evaluation also showed how the published literature surveys largely adhere to the FAIR data principles. This is crucial for data reusability and machine actionability. To conclude, the proposed literature comparison service addresses multiple weaknesses of the current survey publishing approach and can be used by researchers to generate, publish and reuse literature surveys.
ACKNOWLEDGMENTS
This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. We want to thank Kheir Eddine Farfar for his contributions to this work.
REFERENCES

[1] Samur Araujo, Jan Hidders, Daniel Schwabe, and Arjen P. De Vries. 2011. SERIMI - Resource description similarity, RDF instance matching and interlinking. CEUR Workshop Proceedings 814 (2011), 246–247.
[2] Nikos Bikakis and Timos Sellis. 2016. Exploration and visualization in the web of big linked data: A survey of the state of the art. CEUR Workshop Proceedings.
[3] European Journal of Human Genetics 26, 7 (2018), 931–936. https://doi.org/10.1038/s41431-018-0160-0
[4] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics.
[5] Semantic Web 7, 2 (2016), 167–181. https://doi.org/10.3233/SW-150177
[6] Dennis Diefenbach, Vanessa Lopez, Kamal Singh, and Pierre Maret. 2018. Core techniques of question answering systems over knowledge bases: a survey. Knowledge and Information Systems 55, 3 (2018), 529–569. https://doi.org/10.1007/s10115-017-1100-y
[7] Said Fathalla, Sahar Vahdati, Sören Auer, and Christoph Lange. 2017. Towards a Knowledge Graph Representing Research Findings by Semantifying Survey Articles. In International Conference on Theory and Practice of Digital Libraries. Springer, 315–327. https://doi.org/10.1007/978-3-319-67008-9_25
[8] Meredith D. Gall and Walter R. Borg. 1996. Educational Research: An introduction (sixth edition). White Plains, NY: Longman Publishers USA (1996).
[9] Aldo Gangemi, Silvio Peroni, David Shotton, and Fabio Vitali. 2017. The Publishing Workflow Ontology (PWO). Semantic Web 8, 5 (2017), 703–718. https://doi.org/10.3233/SW-160230
[10] Dagmar Gromann and Thierry Declerck. 2019. Comparing pretrained multilingual word embeddings on an ontology alignment task. LREC 2018 - 11th International Conference on Language Resources and Evaluation (2019), 230–236.
[11] Alexander Hars. 2001. Designing Scientific Knowledge Infrastructures: The Contribution of Epistemology. Information Systems Frontiers 3, 1 (2001), 63–73. https://doi.org/10.1023/A:1011401704862
[12] Patrick B. Heidorn. 2008. Shedding light on the dark data in the long tail of science. Library Trends 57, 2 (2008), 280–299. https://doi.org/10.1353/lib.0.0036
[13] Ijaz Hussain and Sohail Asghar. 2017. A survey of author name disambiguation techniques: 2010–2016. The Knowledge Engineering Review 32 (2017), 1–24. https://doi.org/10.1017/s0269888917000182
[14] Mohamad Yaser Jaradeh, Allard Oelen, Manuel Prinz, Jennifer D'Souza, Gábor Kismihók, Markus Stocker, and Sören Auer. 2019. Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge. In Proceedings of the 10th International Conference on Knowledge Capture (K-CAP '19). ACM. https://doi.org/10.1145/3360901.3364435
[15] Mohamad Yaser Jaradeh, Allard Oelen, Manuel Prinz, Markus Stocker, and Sören Auer. 2019. Open Research Knowledge Graph: A System Walkthrough. In International Conference on Theory and Practice of Digital Libraries. Springer, 348–351. https://doi.org/10.1007/978-3-030-30760-8_31
[16] Christian Kohl, Emma J. McIntosh, Stefan Unger, Neal R. Haddaway, Steffen Kecke, Joachim Schiemann, and Ralf Wilhelm. 2018. Online tools supporting the conduct and reporting of systematic reviews and systematic maps: A case study on CADIMA and review of existing tools. Environmental Evidence 7, 1 (2018), 1–17. https://doi.org/10.1186/s13750-018-0115-5
[17] Tobias Kuhn, Christine Chichester, Michael Krauthammer, Núria Queralt-Rosinach, Ruben Verborgh, and George Giannakopoulos. 2016. Decentralized provenance-aware publishing with nanopublications. PeerJ Computer Science (2016), 1–29. https://doi.org/10.7717/peerj-cs.78
[18] Elaine M. Lasda Bergman. 2012. Finding Citations to Social Work Literature: The Relative Benefits of Using Web of Science, Scopus, or Google Scholar. Journal of Academic Librarianship 38, 6 (2012), 370–379. https://doi.org/10.1016/j.acalib.2012.08.002
[19] Vladimir I. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, Vol. 10. 707–710.
[20] Pierre Maillot and Carlos Bobed. 2019. Measuring structural similarity between RDF graphs. Proceedings of the 33rd Annual ACM Symposium on Applied Computing (2019), 1960–1967.
[21] Carme Pinya Medina and Maria Rosa Rosselló Ramon. 2015. Using TF-IDF to Determine Word Relevance in Document Queries. New Educational Review 42, 4 (2015), 40–51. https://doi.org/10.15804/tner.2015.42.4.03
[22] Barend Mons, Cameron Neylon, Jan Velterop, Michel Dumontier, Luiz Olavo Bonino Da Silva Santos, and Mark D. Wilkinson. 2017. Cloudy, increasingly FAIR; Revisiting the FAIR Data guiding principles for the European Open Science Cloud. Information Services and Use 37, 1 (2017), 49–56. https://doi.org/10.3233/ISU-170824
[23] Reddy Naidu, Santosh Kumar Bharti, Korra Sathya Babu, and Ramesh Kumar Mohapatra. 2018. Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers. In Smart Innovation, Systems and Technologies, Vol. 77. 555–564. https://doi.org/10.1007/978-981-10-5544-7_54
[24] Allard Oelen, Mohamad Yaser Jaradeh, Kheir Eddine Farfar, Markus Stocker, and Sören Auer. 2019. Comparing Research Contributions in a Scholarly Knowledge Graph. In Proceedings of the Third International Workshop on Capturing Scientific Knowledge (SciKnow19). 21–26.
[25] Norman Paskin. 2010. Digital object identifier (DOI®) system. Encyclopedia of library and information sciences.
[26] Journal of Web Semantics 17 (2012), 33–43. https://doi.org/10.1016/j.websem.2012.08.001
[27] Silvio Peroni and David Shotton. 2018. The SPAR ontologies. In International Semantic Web Conference. Springer, 119–136. https://doi.org/10.1007/978-3-030-00668-6_8
[28] Alina Petrova, Evgeny Sherkhonov, Bernardo Cuenca Grau, and Ian Horrocks. 2017. Entity Comparison in RDF Graphs. In International Semantic Web Conference. 526–541. https://doi.org/10.1007/978-3-319-68288-4_31
[29] Justus J. Randolph. 2009. A guide to writing the dissertation literature review. Practical Assessment, Research and Evaluation 14, 13 (2009). https://doi.org/10.7275/b0az-8t74
[30] Alejandro Rodríguez-Iglesias, Alejandro Rodríguez-González, Alistair G. Irvine, Ane Sesma, Martin Urban, Kim E. Hammond-Kosack, and Mark D. Wilkinson. 2016. Publishing FAIR data: An exemplar methodology utilizing PHI-base. Frontiers in Plant Science.
[31] Workshop on Semantic Publishing (SePublica) (CEUR Workshop Proceedings).
[32] Bahar Sateli and René Witte. 2015. Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud. PeerJ Computer Science.
[33] Research Methodologies in Supply Chain Management. Physica-Verlag HD, Heidelberg, 91–106. https://doi.org/10.1007/3-7908-1636-1_7
[34] Pavel Shvaiko and Jérôme Euzenat. 2013. Ontology matching: State of the art and future challenges. IEEE Transactions on Knowledge and Data Engineering.
[35] PeerJ Computer Science.
[36] Human Resource Development Review 4, 3 (2005), 356–367. https://doi.org/10.1177/1534484305278283
[37] Sahar Vahdati, Said Fathalla, Sören Auer, Christoph Lange, and Maria-Esther Vidal. 2019. Semantic Representation of Scientific Publications. In International Conference on Theory and Practice of Digital Libraries. 375–379. https://doi.org/10.1007/978-3-030-30760-8_37
[38] Jane Webster and Richard T. Watson. 2002. Analyzing the Past to Prepare for the Future: Writing a Literature Review. MIS Quarterly 26, 2 (2002), xiii–xxiii.
[39] Bert Van Wee and David Banister. 2016. How to Write a Literature Review Paper? Transport Reviews 36, 2 (2016), 278–288. https://doi.org/10.1080/01441647.2015.1065456
[40] Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan Willem Boiten, Luiz Bonino da Silva Santos, Philip E. Bourne, Jildau Bouwman, Anthony J. Brookes, Tim Clark, Mercè Crosas, Ingrid Dillo, Olivier Dumon, Scott Edmunds, Chris T. Evelo, Richard Finkers, Alejandra Gonzalez-Beltran, Alasdair J.G. Gray, Paul Groth, Carole Goble, Jeffrey S. Grethe, Jaap Heringa, Peter A.C. 't Hoen, Rob Hooft, Tobias Kuhn, Ruben Kok, Joost Kok, Scott J. Lusher, Maryann E. Martone, Albert Mons, Abel L. Packer, Bengt Persson, Philippe Rocca-Serra, Marco Roos, Rene van Schaik, Susanna Assunta Sansone, Erik Schultes, Thierry Sengstag, Ted Slater, George Strawn, Morris A. Swertz, Mark Thompson, Johan Van Der Lei, Erik Van Mulligen, Jan Velterop, Andra Waagmeester, Peter Wittenburg, Katherine Wolstencroft, Jun Zhao, and Barend Mons. 2016. Comment: The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data.
[41] Proceedings of the Section on Survey Research, American Statistical Association (1990), 354–359. https://doi.org/10.1007/978-1-4612-2856-1_101
[42] Paweł Ziemba, Jarosław Jankowski, and Jarosław Wątróbski. 2017. Online comparison system with certain and uncertain criteria based on multi-criteria decision analysis method. In International Conference on Computational Collective Intelligence. Springer, 579–589. https://doi.org/10.1007/978-3-319-67077-5_56