Online Inf. Rev. | 2021

The unknown knowns: a graph-based approach for temporal COVID-19 literature mining


Abstract


Purpose: The COVID-19 pandemic has sparked a remarkable volume of research literature, and scientists increasingly need intelligent tools to cut through the noise and uncover relevant research directions. In response, the authors propose a novel framework in which they develop a weighted semantic graph model to compress the research studies efficiently. The authors also present two analyses on this graph that offer alternative ways to uncover additional aspects of COVID-19 research.

Design/methodology/approach: The authors construct the semantic graph by applying state-of-the-art natural language processing (NLP) techniques to COVID-19 publication texts (>100,000 texts). Next, the authors conduct an evolutionary analysis to capture the changes in COVID-19 research across time. Finally, the authors apply a link prediction study to detect novel COVID-19 research directions that have so far gone undiscovered.

Findings: The findings reveal the success of the semantic graph in capturing scientific knowledge and its evolution. Meanwhile, the prediction experiments achieve 79% accuracy in returning intelligible links, showing the reliability of the methods for predicting novel connections that could help scientists discover potential new directions.

Originality/value: To the authors’ knowledge, this is the first study to propose a holistic framework that encodes scientific knowledge in a semantic graph, demonstrates an evolutionary examination of past and ongoing research, and offers scientists tools to generate new hypotheses and research directions through predictive modeling and deep machine learning techniques.
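
The graph construction and link prediction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how edge weights are computed, and the authors report using deep machine learning for prediction. Here, as labeled assumptions, the edge weight is the number of documents in which two concepts co-occur, and the Adamic-Adar index stands in as a simple neighborhood-based link predictor. The function names and the toy concept lists are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def build_graph(docs):
    """Weighted co-occurrence graph (assumed weighting, not the paper's).

    Each document is a list of extracted concepts; the weight of an edge
    is the number of documents that mention both of its endpoints.
    """
    weights = Counter()
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            weights[(a, b)] += 1
    return weights


def adjacency(weights):
    """Map each concept to the set of concepts it is linked to."""
    adj = {}
    for a, b in weights:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def predict_links(weights, top_k=5):
    """Rank currently unlinked concept pairs by the Adamic-Adar index."""
    adj = adjacency(weights)
    scores = {}
    for a, b in combinations(sorted(adj), 2):
        if (a, b) in weights:
            continue  # already connected, so not a candidate
        # A shared neighbor has degree >= 2, so log() is always positive.
        score = sum(1.0 / log(len(adj[z])) for z in adj[a] & adj[b])
        if score > 0:
            scores[(a, b)] = score
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy input invented for illustration; real input would be concepts
# extracted from publication texts by an NLP pipeline.
docs = [
    ["covid-19", "ace2", "spike protein"],
    ["covid-19", "cytokine storm", "dexamethasone"],
    ["ace2", "hypertension"],
]
graph = build_graph(docs)
candidates = predict_links(graph)
```

Building one such graph per time window and comparing edge weights across windows would give a rough analogue of the evolutionary analysis, again under the same assumptions.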

Volume 45
Pages 687-708
DOI 10.1108/OIR-12-2020-0562
Language English
Journal Online Inf. Rev.
