IEEE Transactions on Knowledge and Data Engineering | 2019

Multi-Label Learning from Crowds


Abstract


We consider multi-label crowdsourcing learning in two scenarios. In the first scenario, we aim to infer instances’ ground truth from the crowds’ annotations. We propose two approaches, NAM and RAM (Neighborhood/Relevance Aware Multi-label crowdsourcing), which model the crowds’ expertise and label correlations from different perspectives. Extended from single-label crowdsourcing methods, NAM models the crowds’ expertise on individual labels. In addition, based on the idea that rational workers should give similar annotations to instances that are similar in the feature space, NAM utilizes information from the feature space and incorporates the local influence of neighborhoods’ annotations. RAM builds on the observation that crowds tend to act in an effort-saving manner when annotating multiple labels: rather than carefully annotating every proper label, they prefer scanning and tagging a few of the most relevant labels. RAM therefore models the crowds’ expertise as their ability to distinguish the relevance between label pairs. In the second scenario, we address cost-efficient crowdsourcing, where the labeling and learning processes are conducted in tandem. We extend NAM/RAM to the active paradigm and propose instance, label, and worker selection criteria such that the labeling cost is significantly reduced compared to passive learning without labeling control. The effectiveness of the proposals is validated on simulated and real data.
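The neighborhood idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the NAM model from the paper: it does not model worker expertise or label correlations, and it only blends each instance's vote share with the vote share of its nearest neighbors in feature space. The function names, the blending weight `alpha`, the neighborhood size `k`, the threshold, and the toy data are all assumptions made for illustration.

```python
import math

def knn(features, i, k):
    """Indices of the k instances closest to instance i (Euclidean), excluding i."""
    dists = [(math.dist(features[i], f), j) for j, f in enumerate(features) if j != i]
    return [j for _, j in sorted(dists)[:k]]

def mean_vote(annotations, i, label):
    """Share of workers who tagged `label` on instance i; None if nobody annotated i."""
    votes = [w[i][label] for w in annotations if w[i] is not None]
    return sum(votes) / len(votes) if votes else None

def neighborhood_vote(features, annotations, n_labels, k=2, alpha=0.5, threshold=0.5):
    """Blend each instance's own vote share with that of its k nearest neighbors.

    annotations[w][i] is a 0/1 list over labels, or None if worker w skipped instance i.
    alpha is a hypothetical weight on the neighbors' vote share.
    """
    result = []
    for i in range(len(features)):
        neighbors = knn(features, i, k)
        row = []
        for label in range(n_labels):
            own = mean_vote(annotations, i, label)
            nb = [v for v in (mean_vote(annotations, j, label) for j in neighbors)
                  if v is not None]
            nb_mean = sum(nb) / len(nb) if nb else None
            if own is None and nb_mean is None:
                score = 0.0          # no evidence at all: default to "label absent"
            elif own is None:
                score = nb_mean      # rely on neighbors only
            elif nb_mean is None:
                score = own          # rely on the instance's own votes only
            else:
                score = (1 - alpha) * own + alpha * nb_mean
            row.append(1 if score >= threshold else 0)
        result.append(row)
    return result

# Toy data (made up): two tight clusters in feature space, two labels, three workers.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
annotations = [
    [[1, 0], [1, 0], [0, 1], [0, 1]],   # worker 0
    [[1, 0], [1, 0], [0, 1], None],     # worker 1 skipped the last instance
    [[0, 0], [1, 0], [0, 1], [1, 0]],   # worker 2 is noisy on instances 0 and 3
]
labels = neighborhood_vote(features, annotations, n_labels=2, k=1)
print(labels)
```

In this toy data the last instance receives two conflicting annotations, so its own vote share is tied on both labels; the annotations on its nearest neighbor break the tie. That is the intuition behind using the feature space, although the paper's actual formulation is a different model.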

Volume 31
Pages 1369-1382
DOI 10.1109/TKDE.2018.2857766
Language English
Journal IEEE Transactions on Knowledge and Data Engineering
