Archive | 2021

Intrinsic Bias Metrics Do Not Correlate with Application Bias

Abstract

Natural Language Processing (NLP) systems learn harmful societal biases that cause them to amplify and spread inequality as they are deployed in more and more situations. To address this, the NLP community has come to rely on a variety of metrics to identify and quantify bias in black-box models; these metrics are used to monitor model behaviour and to guide debiasing efforts. Some of these metrics are intrinsic, measured in word embedding spaces, and some are extrinsic, measured in the downstream tasks that the word embeddings are plugged into. This research examines whether intrinsic metrics (which are easy to measure) correlate well with extrinsic metrics (which reflect real-world bias). We measure both intrinsic and extrinsic bias across hundreds of trained models covering different tasks and experimental conditions, and find that there is no reliable correlation between these metrics that holds in more than extremely specific settings. We advise that efforts to debias embedding spaces always be paired with measurement of downstream model bias, and suggest that the community direct more effort into making downstream measurement simpler and easier.

Pages 1926-1940

DOI 10.18653/v1/2021.acl-long.150

Language English
