IEEE Access | 2021

On Approximation of Concept Similarity Measure in Description Logic ELH With Pre-Trained Word Embedding

 

Abstract


Data-driven and knowledge-driven methods are the two mainstream approaches to developing artificial intelligence systems. Data-driven methods learn a decision model from real-world observations, but they struggle to explain their results in human terms. Knowledge-driven methods, which employ symbolic reasoning based on the formal semantics of a knowledge base, are more interpretable and explainable, but they cannot cope with incomplete modeling of the structured knowledge base. This work tackles these issues for ontology similarity by proposing a general framework that combines the strengths of both approaches to measure the semantic similarity of concepts in a description logic (DL) ontology. More specifically, a neuro-symbolic integrated framework is defined that exploits pre-trained word embeddings together with the semantic definitions in an ontology to yield an explainable degree of concept similarity. To demonstrate its applicability, we develop a concrete similarity measure $\textsf{sim}_\epsilon$ that conforms to the proposed framework, and we introduce an efficient algorithm that extracts an explanation of why a given degree is obtained. Correctness is shown by analyzing the theoretical properties that the measure is guaranteed to preserve and by an empirical evaluation with the medical ontology SNOMED CT and the medical pre-trained embedding BioWordVec. The results show that the proposed method retains both interpretability and explainability while achieving performance comparable to state-of-the-art data-driven and knowledge-driven approaches.

Volume 9
Pages 61429-61443
DOI 10.1109/ACCESS.2021.3073730
Language English
Journal IEEE Access

Full Text