Cognitive Computation | 2019

EPF: A General Framework for Supporting Continuous Top-k Queries Over Streaming Data

 
 
 

Abstract


Continuous top-k query over a sliding window is a fundamental problem in streaming data management: it monitors the query window and retrieves the k objects with the highest scores whenever the window slides. The key to supporting this query is maintaining a subset of the objects in the window and trying to retrieve answers from that subset when the window slides. The state-of-the-art approach, SAP, uses a partition technique to support top-k searches. Its key idea is to find a proper partition so that the query can be supported with as few high-quality candidates as possible. However, it incurs a relatively high computation cost in evaluating whether the partition is proper and in re-scanning the window. In this paper, we propose an ELM-based framework named EPF, which improves SAP by learning the nature of the streaming data. If we learn that the distribution of the streaming data is predictable, we can construct a suitable prediction model for partitioning the window more efficiently. Furthermore, we propose a novel algorithm to reduce the re-scanning cost. We conduct a thorough experimental study of this technique on real and synthetic datasets and show a significant performance improvement when the technique is applied to existing algorithms.
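
To make the query semantics concrete, the following is a minimal Python sketch of a naive baseline that re-scans the whole window on every slide. It is not SAP or EPF, and it is not taken from the paper; the function name `continuous_top_k`, the count-based window, and the parameters `window_size` and `slide` are assumptions chosen for illustration. It shows the re-scanning cost that candidate-maintenance approaches aim to avoid.

```python
import heapq
from collections import deque


def continuous_top_k(scores, window_size, slide, k):
    """Yield the top-k (score, arrival_index) pairs each time the window slides.

    Hypothetical naive baseline: the full window is re-scanned on every slide.
    The window is count-based (it holds the last `window_size` objects).
    """
    window = deque(maxlen=window_size)  # oldest objects expire automatically
    for i, score in enumerate(scores):
        window.append((score, i))
        arrived = i + 1
        # Report once the window is full, then after every `slide` arrivals.
        if arrived >= window_size and (arrived - window_size) % slide == 0:
            yield heapq.nlargest(k, window)


# Example stream of scores; the values are made up for illustration.
results = list(continuous_top_k([5, 1, 9, 3, 7, 2, 8], window_size=4, slide=2, k=2))
```

Each reported answer costs a scan of all objects in the window, which is why maintaining a small candidate subset matters when the window is large or slides frequently.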

Volume 12
Pages 176-194
DOI 10.1007/s12559-019-09661-z
Language English
Journal Cognitive Computation
