2019 11th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC) | 2019

Research on Multi-Attribute Data Completion Method Considering Data Distribution Characteristics

 
 

Abstract


Among existing data completion methods, maximum likelihood estimation is suited to large data sets, and the K-nearest neighbor method considers only the linear relationship between the same attributes of different records. The BP neural network accounts for the nonlinear relationship between data attributes, but its completion performance is strongly affected by the sample distribution. In the proposed method, DBSCAN is used to cluster the sample data, analyze its distribution characteristics, eliminate noise data, and select training samples; a BP neural network is then used to fit the nonlinear relationship between data attributes and predict the missing values. Analysis of an example data set shows that the BP neural network that considers data distribution characteristics achieves the best completion accuracy.
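The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `DBSCAN` for the clustering step and uses `MLPRegressor` (a feed-forward network trained by backpropagation) as a stand-in for the BP neural network. The function name `complete_missing`, the values of `eps` and `min_samples`, the hidden layer size, and the assumption that only one attribute has missing values are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def complete_missing(data, target_col, eps=0.5, min_samples=5, seed=0):
    """Fill NaN entries in one column of `data`.

    Assumes (hypothetically) that only `target_col` contains missing values.
    Returns the filled copy and a boolean noise mask over the complete rows.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data[:, target_col])
    complete = data[~missing]

    # Step 1: cluster the complete records on standardized attributes and
    # discard the records DBSCAN labels as noise (label -1).
    scaled = StandardScaler().fit_transform(complete)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scaled)
    noise = labels == -1
    train = complete[~noise]

    # Step 2: fit a backpropagation-trained network that maps the other
    # attributes to the target attribute, using only the retained samples.
    feature_cols = [c for c in range(data.shape[1]) if c != target_col]
    x_scaler = StandardScaler().fit(train[:, feature_cols])
    y_mean = train[:, target_col].mean()
    y_std = train[:, target_col].std() or 1.0  # avoid dividing by zero
    net = MLPRegressor(hidden_layer_sizes=(16,), solver="lbfgs",
                       max_iter=2000, random_state=seed)
    net.fit(x_scaler.transform(train[:, feature_cols]),
            (train[:, target_col] - y_mean) / y_std)

    # Step 3: predict the missing values and write them into a copy.
    filled = data.copy()
    if missing.any():
        x_missing = x_scaler.transform(data[missing][:, feature_cols])
        filled[missing, target_col] = net.predict(x_missing) * y_std + y_mean
    return filled, noise
```

In this sketch the noise records are removed only from the training set; the records with missing values are still completed regardless of where they fall in the distribution. Whether the paper treats them the same way is not stated in the abstract.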

Volume 2
Pages 138-141
DOI 10.1109/IHMSC.2019.10128
Language English
Journal 2019 11th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC)
