2019 14th International Conference on Computer Science & Education (ICCSE) | 2019
Research on DFF-TopK algorithm based on dynamic feature selection
Abstract
Water level time series are affected by rainfall, temperature, upstream and downstream nodes, and other factors. They exhibit temporal fluctuation and spatial complexity, and the interaction between nodes makes prediction results uncertain. Existing time series prediction algorithms require complex data preprocessing and do not support dynamic changes in feature attributes. To address these problems, a dynamic feature-filtering algorithm, DFF-TopK (Dynamic Feature Filter-TopK), is proposed to reduce reliance on prior knowledge and to support dynamic changes in feature attributes. The algorithm first builds an initial random forest classifier directly from the data of all existing features and sorts the features by importance. The features, mapped to their importance, are then placed in a priority queue, and the K features with the highest priority are selected as inputs. Once the length of the priority queue is fixed, the importance of the input features within a given period is adjusted dynamically along with the priority queue. The algorithm also dynamically recognizes the degree of influence of the opening and closing of upstream and downstream nodes and of the dry and rainy seasons. This reduces time complexity, resolves the uncertainty caused by stacking a large number of features, and avoids the influence that traditional partitioning based on prior knowledge has on the results. Experiments comparing the algorithm with the existing global static RF and the gradient boosting algorithm DS-TopK show that it improves both time complexity and prediction accuracy, which verifies its effectiveness.
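The top-K selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-period importance scores have already been produced (for example by a random forest), and the function names `top_k_features` and `select_per_window`, the feature names, and the scores are all hypothetical values chosen for illustration.

```python
import heapq
from typing import Dict, List


def top_k_features(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance, best first."""
    # heapq.nlargest iterates over the dict keys and ranks them by score.
    return heapq.nlargest(k, importance, key=importance.get)


def select_per_window(
    window_importances: List[Dict[str, float]], k: int
) -> List[List[str]]:
    """Re-run the top-K selection for each period's importance scores."""
    return [top_k_features(scores, k) for scores in window_importances]


# Hypothetical importance scores for two periods (assumed, not from the paper).
dry_season = {
    "rainfall": 0.10,
    "temperature": 0.25,
    "upstream_level": 0.45,
    "downstream_level": 0.20,
}
rainy_season = {
    "rainfall": 0.50,
    "temperature": 0.05,
    "upstream_level": 0.30,
    "downstream_level": 0.15,
}

selected = select_per_window([dry_season, rainy_season], k=2)
print(selected)
```

In this sketch the selected inputs change from `upstream_level` and `temperature` in the first period to `rainfall` and `upstream_level` in the second, which mirrors the idea that the input set follows the priority queue as feature importance shifts between seasons.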