Advances in Intelligent Systems and Computing | 2021

Active Learning-Based Data Collection in Crowd Replication

 
 

Abstract


Crowd replication, which combines crowd sensing, direct observation, and mathematical modeling to enable efficient and accurate evaluation of crowds, is a low-effort, easy-to-adopt, and cost-effective mechanism for crowd data collection and analysis. In crowd replication, the quality of data collection is particularly important, and it depends on how representative the sample drawn from the target population is. The two main target selection strategies, population-based sampling and cluster sampling, are labor-intensive and time-consuming when the goal is stable, reliable, and valid data. Therefore, this paper proposes a novel method of data collection in crowd replication based on active learning, a modern machine learning technique, which aims to reduce the sample size and complexity while increasing the accuracy of the data tasks as much as possible with less data. We apply active learning to obtain a dataset with high representativeness and informativeness. Experimental results show that, compared with traditional probability-based sampling strategies, our method stably captures more representative samples and datasets.
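To make the idea of active learning-based sample selection concrete, the following is a minimal sketch of pool-based uncertainty sampling. It is not the method of the paper, whose details the abstract does not give. The nearest-centroid classifier, the function names (`centroid`, `uncertainty`, `active_sample`), the `oracle` callback that stands in for a human observer, and the toy 2-D pool are all assumptions made for illustration.

```python
import math


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def uncertainty(point, centroids):
    """Score a point by how close it is to being equidistant from the two
    nearest class centroids. A higher score means a more ambiguous point."""
    distances = sorted(math.dist(point, c) for c in centroids.values())
    return -(distances[1] - distances[0])


def active_sample(pool, oracle, seed_indices, budget):
    """Label up to `budget` items from `pool`, choosing at each step the
    unlabeled item the current (hypothetical) classifier is least sure about.

    pool         -- list of 2-D points (candidate observation targets)
    oracle       -- callable returning the true label of a point
    seed_indices -- indices labeled up front; should cover at least two classes
    budget       -- total number of labels to collect, seeds included
    """
    labeled = {i: oracle(pool[i]) for i in seed_indices}
    while len(labeled) < budget:
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break  # the whole pool has been labeled

        # Fit the stand-in model: one centroid per class seen so far.
        by_class = {}
        for i, label in labeled.items():
            by_class.setdefault(label, []).append(pool[i])
        centroids = {label: centroid(pts) for label, pts in by_class.items()}

        if len(centroids) < 2:
            pick = candidates[0]  # no decision boundary yet, take any point
        else:
            pick = max(candidates, key=lambda i: uncertainty(pool[i], centroids))
        labeled[pick] = oracle(pool[pick])
    return labeled


if __name__ == "__main__":
    # Toy pool: a 20 x 10 grid on the unit square, labeled by which half it is in.
    grid = [(i / 19, j / 9) for i in range(20) for j in range(10)]
    result = active_sample(grid, lambda p: int(p[0] > 0.5), [0, 199], 12)
    for index, label in sorted(result.items()):
        print(index, grid[index], label)
```

Uncertainty sampling covers only the informativeness side of selection. A criterion for representativeness, such as preferring points in dense regions of the pool, would have to be added separately and is left out of this sketch.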

DOI 10.1007/978-3-030-73113-7_5
Language English
Journal Advances in Intelligent Systems and Computing

Full Text