Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining | 2021

Aggregating Complex Annotations via Merging and Matching

Abstract

Human annotations are critical for training and evaluating supervised learning models, yet annotators often disagree with one another, especially as annotation tasks increase in complexity. A common strategy to improve label quality is to ask multiple annotators to label the same item and then aggregate their labels. While many aggregation models have been proposed for simple annotation tasks, how can we reason about and resolve annotator disagreement for more complex annotation tasks (e.g., continuous, structured, or high-dimensional), without needing to devise a new aggregation model for every different complex annotation task? We address two distinct challenges in this work. First, how can a general aggregation model support merging of complex labels across diverse annotation tasks? Second, for multi-object annotation tasks that require annotators to provide multiple labels for each item being annotated (e.g., labeling named entities in a text or visual entities in an image), how do we match which annotator label refers to which entity, such that only matching labels are aggregated across annotators? Using general constructs for merging and matching, our model not only supports diverse tasks, but also delivers equal or better results than prior aggregation models, both general and task-specific.

DOI 10.1145/3447548.3467411

Language English

Journal Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
