Neurocomputing | 2021

Self-adjusting k nearest neighbors for continual learning from multi-label drifting data streams

Abstract


Drifting data streams and multi-label data are both challenging problems. Multi-label instances may be associated with many labels simultaneously, and classifiers must predict the complete set of labels. Learning from data streams requires algorithms able to learn from potentially unbounded data that is constantly changing. When multi-label data arrives as a stream, the challenges of both problems must be addressed, and additional challenges unique to the combined problem also arise: each label may experience different concept drifts, simultaneously or separately, and the optimal parameter settings may differ between labels. In this paper we present a self-adapting algorithm for drifting multi-label data streams that can adapt to a variety of concept drifts, is robust to data-level difficulties, and reduces the need to tune multiple parameters. The window of retained instances self-adjusts its size to retain only the current concept, enabling an efficient response to abrupt concept drift. The value of k self-adapts for each label, removing the need for tuning and allowing k to change over time for each label individually. A novel label-based mechanism disables individual labels that contribute to error, while a second punitive measure removes erroneous instances entirely, increasing robustness to noise, concept drift, and differences between labels. Extensive experiments on 35 multi-label streams and generators demonstrate the superiority of the proposed self-adapting mechanisms over existing state-of-the-art methods.
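The core ideas of the abstract, a sliding window of recent instances and a k value that adapts per label, can be illustrated with a minimal sketch. This is not the authors' implementation; the class name `MLSkNNSketch` and the parameters `window_size` and `k_candidates` are hypothetical, and the sketch omits the self-adjusting window size, the label-disabling mechanism, and the punitive instance removal. It shows only the per-label selection of k by tracking, for each label, which candidate k has accumulated the fewest prediction errors on the stream so far.

```python
# Illustrative sketch (assumed names, not the paper's algorithm): a
# fixed-size sliding-window multi-label kNN where k self-adapts per label.
from collections import deque
import numpy as np

class MLSkNNSketch:
    def __init__(self, n_labels, window_size=200, k_candidates=(1, 3, 5, 7)):
        self.window = deque(maxlen=window_size)  # retains only recent instances
        self.ks = k_candidates
        self.n_labels = n_labels
        # per-label error count for each candidate k
        self.errors = np.zeros((n_labels, len(k_candidates)), dtype=int)

    def _vote(self, x, k, label):
        # majority vote for one label among the k nearest window instances
        X = np.array([w[0] for w in self.window])
        y = np.array([w[1] for w in self.window])
        d = np.linalg.norm(X - np.asarray(x, float), axis=1)
        nn = np.argsort(d)[:k]
        return int(y[nn, label].mean() >= 0.5)

    def predict(self, x):
        if not self.window:
            return np.zeros(self.n_labels, dtype=int)
        preds = np.zeros(self.n_labels, dtype=int)
        for lbl in range(self.n_labels):
            # each label uses the candidate k with the fewest errors so far
            best_k = self.ks[int(np.argmin(self.errors[lbl]))]
            preds[lbl] = self._vote(x, best_k, lbl)
        return preds

    def partial_fit(self, x, y):
        # test-then-train: update each candidate k's per-label error count,
        # then add the labeled instance to the window
        if self.window:
            for lbl in range(self.n_labels):
                for j, k in enumerate(self.ks):
                    if self._vote(x, k, lbl) != y[lbl]:
                        self.errors[lbl, j] += 1
        self.window.append((np.asarray(x, float), np.asarray(y, int)))
```

Because the window has a fixed `maxlen`, old instances are discarded automatically as new ones arrive; the paper's method instead adjusts the window size itself in response to drift, which this sketch does not attempt.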

Volume 442
Pages 10-25
DOI 10.1016/j.neucom.2021.02.032
Language English
Journal Neurocomputing
