ACM SIGMOD Record | 2019

Domain- and Structure-Agnostic End-to-End Entity Resolution with JedAI


Abstract


We present JedAI, a new open-source toolkit for end-to-end Entity Resolution. JedAI is domain-agnostic in the sense that it does not depend on background expert knowledge, applying seamlessly to data of any domain with minimal human intervention. JedAI is also structure-agnostic, as it can process any type of data, ranging from structured (relational) to semi-structured (RDF) and unstructured (free-text) entity descriptions. JedAI consists of two parts: (i) JedAI-core is a library of numerous state-of-the-art methods that can be mixed and matched to form (thousands of) end-to-end workflows, making it easy to benchmark their relative performance. (ii) JedAI-gui is a user-friendly desktop application that facilitates the composition of complex workflows via a wizard-like interface. It is suitable for both lay and power users, offering concrete guidelines and automatic configuration, as well as manual configuration options, visual exploration, and detailed statistics for each method's performance. In this paper, we also delve into the new features of JedAI's latest version (2.1), and demonstrate its performance experimentally.

Volume 48
Pages 30 - 36
DOI 10.1145/3385658.3385664
Language English
Journal ACM SIGMOD Record
