Progress in Artificial Intelligence | 2019
WordificationMI: multi-relational data mining through multiple-instance propositionalization
Abstract
Multi-relational data mining (MRDM) looks for patterns in a relational database. One established approach to MRDM is propositionalization, which transforms a relational database into a simpler representation, commonly a single table. Another approach that has proven effective for learning problems involving one-to-many relationships in the data is multiple-instance learning. In this paper, we propose a new technique for transforming relational data, called WordificationMI, which takes advantage of the strengths of multiple-instance learning. The proposal builds on the bag-of-words representation introduced in the Wordification methodology, but differs in that it transforms a relational database into a multiple-instance representation. Additionally, we propose a feature selection method, named MICHI ($\chi_{\mathrm{MI}}^{2}$), for reducing the dimensionality of the datasets obtained with WordificationMI. We also present an empirical evaluation with ten relational databases and four learning techniques that shows the effectiveness of the proposed methods.
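To make the idea of a multiple-instance bag-of-words representation concrete, the following is a minimal sketch on a toy database. It is not the authors' algorithm: the tables `customers` and `orders`, the helper names `row_to_words` and `to_bags`, and the word format `table_attribute_value` are all assumptions chosen for illustration. Each row of the main table becomes a bag, and each related row of the secondary table becomes one instance, described by the counts of its words.

```python
from collections import Counter

# Hypothetical toy database (illustrative data, not from the paper):
# a main table and a secondary table linked by a one-to-many relationship.
customers = [
    {"id": 1, "label": "good"},
    {"id": 2, "label": "bad"},
]
orders = [
    {"customer_id": 1, "item": "book", "payment": "card"},
    {"customer_id": 1, "item": "pen", "payment": "cash"},
    {"customer_id": 2, "item": "book", "payment": "cash"},
]


def row_to_words(table, row, skip):
    """Turn one row into words of the assumed form table_attribute_value."""
    return [
        f"{table}_{attr}_{value}"
        for attr, value in row.items()
        if attr not in skip  # key columns carry no descriptive information
    ]


def to_bags(main_rows, child_rows, key, foreign_key, child_table):
    """Build one bag per main-table row; each related child row is an instance."""
    bags = {}
    for main in main_rows:
        instances = [
            Counter(row_to_words(child_table, child, {foreign_key}))
            for child in child_rows
            if child[foreign_key] == main[key]
        ]
        bags[main[key]] = {"label": main["label"], "instances": instances}
    return bags


bags = to_bags(customers, orders, "id", "customer_id", "orders")
for bag_id, bag in bags.items():
    print(bag_id, bag["label"], [dict(inst) for inst in bag["instances"]])
```

In this sketch, customer 1 yields a bag with two instances and customer 2 a bag with one, whereas a single-table propositionalization would collapse each customer into one feature vector. A multiple-instance learner would then be trained on the labeled bags.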