2019 8th Brazilian Conference on Intelligent Systems (BRACIS) | 2019

Dynamic Correlation-Based Feature Selection for Feature Drifts in Data Streams

 
 
 

Abstract


Learning from data streams requires efficient algorithms that build and update a model as new instances arrive. These data stream learners must respond quickly, in real time, and above all must adapt to changes in the data distribution, a condition known as concept drift. However, recent work has shown that changes in the relevant feature subset over time, called feature drift, can have a significant impact on the learning process, even though they have so far been commonly disregarded in the underlying concept of a data stream. To improve the classification of feature-drifting data streams, we present DCFS (Dynamic Correlation-based Feature Selection), an algorithm that determines which features are most important at each moment of a data stream. Using an adaptive strategy based on a drift monitor, DCFS applies a correlation-based feature selection method to dynamically update the relevant feature subset. The experimental results show that embedding our feature selection algorithm in an incremental, online classifier leads the model to perform well on data stream datasets with feature drift, in some cases surpassing state-of-the-art data stream classifiers.
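
The abstract does not state which correlation measure, search strategy, or drift monitor DCFS uses, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes absolute Pearson correlation, a greedy forward search over the usual CFS merit (k * r_cf / sqrt(k + k(k-1) * r_ff)), a fixed-size sliding window, and a drift flag supplied by the caller in place of a real drift monitor. All names (pearson, cfs_merit, select_features, DynamicSelector, demo) are hypothetical.

```python
import math
import random
from collections import deque


def pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))


def cfs_merit(subset, rows, labels):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    cols = {j: [r[j] for r in rows] for j in subset}
    # Mean feature-class correlation.
    r_cf = sum(pearson(cols[j], labels) for j in subset) / k
    # Mean feature-feature correlation (0.0 for a single feature).
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(pearson(cols[a], cols[b]) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def select_features(rows, labels):
    """Greedy forward search; stops when the merit no longer improves."""
    n_features = len(rows[0])
    chosen, best = [], 0.0
    while len(chosen) < n_features:
        candidates = [j for j in range(n_features) if j not in chosen]
        merit, j = max((cfs_merit(chosen + [j], rows, labels), j) for j in candidates)
        if merit <= best:
            break
        chosen.append(j)
        best = merit
    return sorted(chosen)


class DynamicSelector:
    """Keeps a sliding window and re-selects features when drift is flagged."""

    def __init__(self, window_size=200):
        self.window = deque(maxlen=window_size)
        self.selected = None  # None means "no selection yet, use all features"

    def update(self, x, y, drift_detected):
        self.window.append((x, y))
        if drift_detected:  # assumption: flag comes from an external drift monitor
            rows = [r for r, _ in self.window]
            labels = [l for _, l in self.window]
            self.selected = select_features(rows, labels)
        return self.selected


def demo():
    """Synthetic stream: the label depends on feature 0, then on feature 2."""
    rng = random.Random(0)
    selector = DynamicSelector(window_size=200)
    history = []
    for t in range(400):
        x = [rng.random() for _ in range(4)]
        relevant = 0 if t < 200 else 2  # feature drift at t = 200
        y = 1 if x[relevant] > 0.5 else 0
        # Simplification: the flag is raised once the window holds one concept only.
        selected = selector.update(x, y, drift_detected=(t in (199, 399)))
        if t in (199, 399):
            history.append(selected)
    return tuple(history)


if __name__ == "__main__":
    print(demo())
```

In this toy stream the selected subset moves from feature 0 to feature 2 after the drift. In a real setting the drift flag would come from a monitor watching the classifier's error or the data distribution, and the classifier would be trained only on the currently selected features.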

Pages 198-203
DOI 10.1109/BRACIS.2019.00043
Language English
Journal 2019 8th Brazilian Conference on Intelligent Systems (BRACIS)
