Emergent Unfairness: Normative Assumptions and Contradictions in Algorithmic Fairness-Accuracy Trade-Off Research
A. Feder Cooper, Ellen Abrams
Cornell University, Department of Computer Science; Cornell University, Society for the Humanities
[email protected], [email protected]
Abstract
Across machine learning (ML) sub-disciplines, researchers make explicit mathematical assumptions in order to facilitate proof-writing. We note that, specifically in the area of fairness-accuracy trade-off optimization scholarship, similar attention is not paid to the normative assumptions that ground this approach. Such assumptions presume that 1) accuracy and fairness are in inherent opposition to one another, 2) strict notions of mathematical equality can adequately model fairness, 3) it is possible to measure the accuracy and fairness of decisions independent from historical context, and 4) collecting more data on marginalized individuals is a reasonable solution to mitigate the effects of the trade-off. We argue that such assumptions, which are often left implicit and unexamined, lead to inconsistent conclusions: While the intended goal of this work may be to improve the fairness of machine learning models, these unexamined, implicit assumptions can in fact result in emergent unfairness. We conclude by suggesting a concrete path forward toward a potential resolution.
Introduction

Optimization is a problem formulation technique that lies at the core of multiple engineering domains. Given some fixed or limited resource, we can model its usage to effectively solve a problem. The optimal solution is the one that either minimizes some cost function or maximizes some utility function—functions that measure how well the model performs on a particular objective. Often there is more than one objective to satisfy simultaneously, and those objectives can be in tension with one another. In this case, it is possible to pose the problem as optimizing a trade-off (Yang 2010).

For an intuitive example, consider a company that has a fixed amount of steel, which it can use to build cars and planes that it then sells to earn a profit. The company has to decide how to allocate the steel to maximize that profit and can formulate the decision as an optimization problem. The blue curve in Figure 1 models possible ways to do this optimally; picking a specific point on the curve corresponds to the company's choice for how to balance the trade-off.

Figure 1: Illustrating a trade-off. Given a fixed amount of steel, a company can build a combination of cars and planes. To optimally utilize the steel for maximizing profit, it can manufacture any point on the blue curve. Any combination above the curve is not possible because there is insufficient steel; any below is suboptimal because the same amount of steel could either produce more cars or more planes.
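To make the framing concrete, here is a minimal sketch of the steel-allocation example in Python. All quantities (the steel budget, per-unit steel requirements, and per-unit profits) are hypothetical numbers chosen only for illustration, not values from any real problem.

# A minimal sketch of the steel-allocation trade-off described above.
# All numbers (steel budget, per-unit steel, per-unit profit) are hypothetical.
STEEL_BUDGET = 1_000                                 # tons of steel available
STEEL_PER_CAR, STEEL_PER_PLANE = 1, 50               # tons needed per unit
PROFIT_PER_CAR, PROFIT_PER_PLANE = 2_000, 150_000    # profit per unit

best = None
for planes in range(STEEL_BUDGET // STEEL_PER_PLANE + 1):
    # Spend the remaining steel on cars: points on (or below) the blue curve.
    cars = (STEEL_BUDGET - planes * STEEL_PER_PLANE) // STEEL_PER_CAR
    profit = cars * PROFIT_PER_CAR + planes * PROFIT_PER_PLANE
    if best is None or profit > best[2]:
        best = (cars, planes, profit)

print("optimal allocation (cars, planes, profit):", best)

Each feasible (cars, planes) pair is a point at or below the curve in Figure 1; the loop simply picks the point that maximizes the single profit objective.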
Optimization science has informed much of the last decade's spectacular spate of statistical machine learning (ML) publications. It is at the core of how many ML algorithms learn. For example, learned classifiers use training data examples to fit a curve that optimizes for both classifying those examples correctly and generalizing to new, unclassified examples (Bishop 1995; Hastie, Tibshirani, and Friedman 2009). This process of automating classification decisions has in the past been framed as optimal in another sense; automation brought with it the hope of rooting out the (suboptimal) human whims of decisionmaking—of eliminating the ugliest human biases, such as sexism and racism, that afflict high-impact decision processes. Yet, as has been well-documented, this hope cannot be passively realized merely by substituting humans with automated decision agents. Issues with biased data and biased model selection processes can in the worst case magnify, rather than replace, human biases (Abebe et al. 2020; Selbst et al. 2019).

In response, there has been a widespread push to actively engineer algorithmic fairness. This has become remarkably urgent as automated decision systems are being deployed in domains that have a significant impact on the quality of human life—in deciding judicial bail conditions, loan allocation, college admissions, hiring outcomes, COVID vaccine distribution, etc. (Monahan and Skeem 2016; Barocas and Selbst 2014). Models no longer just need to be correct; they also need to be fair.

It has become common to position accuracy in opposition to fairness and to formalize a mathematical trade-off between the two. For example, in the context of criminal justice and bail decisions, the accuracy of decisions has been framed as how best to "maximize public safety," in contrast to satisfying "formal fairness constraints" that aim to reduce racial disparities in decision outcomes (Corbett-Davies et al. 2017). This kind of problem formulation is the norm in a growing area of research, in which the trade-off between fairness and accuracy in ML is considered "inherent" or "unavoidable." Prior work suggests various ways of implementing the trade-off: At best, under very particular conditions, the tension between the two can be dissolved to achieve both; at worst, fairness is sacrificed in favor of accuracy, while the remaining cases fall somewhere in the middle (Dutta et al. 2020; Chen, Johansson, and Sontag 2018; Bakker et al. 2019; Menon and Williamson 2018; Noriega-Campero et al. 2019).

Our Contribution
Our work looks both at and beyond the fairness-accuracy trade-off, drawing from prior critiques of algorithmic fairness and studies of sociotechnical systems. We examine the choice—and our work here will show that it is a choice, not a requirement—to model assumptions that cast fairness in direct opposition to accuracy. Regardless of the particulars of specific implementations, this framing does not just involve math, but also implicates normative concerns regarding how to value fairness and accuracy both independently and in relation to each other (Flanagan and Nissenbaum 2014; Friedman and Hendry 2019; Selbst et al. 2019). Our contribution is to extract and explore patterns of these concerns across trade-off scholarship that arise at three different stages: the initial modeling assumption to treat accuracy and fairness together in an optimization problem, the move from abstract framing to concrete problem formulation, and the "optimal solutions" that result from those formulations.

More specifically, we examine how the choice to operationalize the relationship between fairness and accuracy using the language of optimization inherently puts the two in conflict, rather than leaving open the possibility for them to be in accord. We discuss how this choice falls into what Selbst et al. (2019) call the "Framing Trap"—a failure to take full account of social criteria when drawing the boundaries of a problem—and broader trends of "solutionism," in which math is (mistakenly) bestowed special authority to "solve" social problems (Abebe et al. 2020).

Beyond these overarching framing assumptions, there are other underlying, unexamined normative assumptions (not just explicit mathematical ones) that take root in how the trade-off is formalized: That strict notions of mathematical equality can model fairness, that it is possible to measure the accuracy and fairness of decisions independent from historical context, and that collecting more data on marginalized individuals—a practice called active fairness—is a reasonable solution to mitigate the effects of the trade-off.

If we take the time to clarify these implicit assumptions, we note that the conclusions that follow can actually perpetuate unfairness: The mathematical proofs may be sound—a particular choice of fairness metric may even be optimized—but the implicit normative assumptions and accompanying broader normative results suggest that these methods will not ensure fairer outcomes in practical applications.

In summary:
• Using the language of optimization to situate fairness and accuracy in intrinsic opposition is an example of both solutionism and the Framing Trap (Section 3).
• Underlying mathematical assumptions bring unexamined normative dimensions, which can lead to emergent unfairness (Section 4).
• In light of these observations, we close by suggesting a path forward to avoid these pitfalls (Section 5).
Background

We begin by providing the background necessary for understanding the problem formulation of the fairness-accuracy trade-off. Before clarifying what the trade-off actually characterizes, we address each component in turn, summarizing common quantifiable metrics that ML researchers map to the values of "accuracy" and "fairness."
Accuracy Metrics
In brief, accuracy measures how often an ML model correctly predicts or infers a decision outcome after training. So, to understand accuracy, we need to understand how ML models are trained. For the classification problems that dominate much of the fairness literature, training the model usually entails fitting a curve to a set of training data points for which classification labels are already known. After training, when supplied with a previously-unseen data point, the model can infer the classification label for that data point. For example, in building a model that infers whether or not to grant an applicant a loan, the curve-fitting training process occurs with past data concerning loan-granting decisions; inference corresponds to the model receiving a new loan applicant's data and classifying whether that applicant should receive a loan.

A model's accuracy tends to be measured during a validation process in between training and inference. Rather than using all labeled data for training, researchers reserve a portion to validate how well the trained model classifies unseen data points with known classification labels. In other words, accuracy is often a measure of label alignment; it is the percentage of correctly classified validation data points, where correctness is determined by whether the model's classification decision matches the known label. There are other metrics that researchers use, such as Chernoff information (Dutta et al. 2020); however, label alignment is a popular accuracy metric, in part due to its simplicity.

This simplicity can be misleading, both in terms of what the math is actually measuring and the normative implications of that measurement. The broader algorithmic fairness community (and corresponding community of critics) has paid ample attention to this issue in relation to fairness (Binns 2018; Selbst et al. 2019; Powles and Nissenbaum 2018; Abdurahman 2019); however, in fairness-accuracy trade-off literature, where accuracy is also explicitly centered as a value, parallel analyses of accuracy have been relatively sparse. We examine this in Section 4; for now, we emphasize that something as simple as a percentage metric can raise normative concerns.

One can see this from work in the broader ML community, in which accuracy issues often get cast as a problem of label bias: The classification labels in the training and validation data can be incorrect, in terms of some abstract notion of "ground truth." As an innocuous example, consider a labeled image dataset of dogs and cats. The individual who labeled the dataset incorrectly (though perhaps understandably) mis-labeled Pomeranians as cats. This mis-labeling in turn leads the learned ML model to mistakenly consider Pomeranians to be cats.

The results of mis-labeling can be devastating for applications that impact human lives. Consider again an automated decision system that grants and denies loan applications. In the US, systemic racism against Black loan applicants, specifically manifested in the practice of redlining, has entailed denying loans to qualified Black applicants. These applicants, in terms of "ground truth," should have been granted loans, but the recorded, observed data and ensuing learned models may not reflect this.
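As a concrete illustration of label alignment, the sketch below computes accuracy as the fraction of validation points whose predicted label matches the recorded label. The labels and predictions are hypothetical placeholders; the point is only that "correctness" here is defined against the observed labels in the data, not against any notion of ground truth.

# A minimal sketch of accuracy as "label alignment" on held-out validation data.
from typing import Sequence

def label_alignment_accuracy(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Fraction of validation points whose predicted label matches the recorded label.

    Correctness is defined against the *observed* labels in the data, not against
    any abstract ground truth (see the discussion of label bias above).
    """
    assert len(predicted) == len(observed)
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Example: 1 = "grant loan", 0 = "deny loan" on a small hypothetical validation set.
print(label_alignment_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # prints 0.75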
Fairness Metrics
Algorithmic fairness has dozens of mathematical definitions that can inform optimization problem formulation. All definitions, regardless of the specifics, involve some treatment of protected attributes, such as race and sex, along which decision outcomes can be evaluated for "fair" treatment. Broadly speaking, there are two families of fairness metrics: Those that measure individual-focused fairness and those that evaluate it in terms of groups defined by protected attributes. Individual fairness, as the name suggests, centers analyzing automated decisions in terms of the individual (Dwork et al. 2012; Joseph et al. 2016; Bakker et al. 2019). In contrast, group fairness centers demographic group membership and typically aims to ensure that membership in a protected class does not correlate with decision outcomes (Dwork et al. 2018; Chen, Johansson, and Sontag 2018). A particularly popular metric is Hardt, Price, and Srebro (2016)'s formulation of equality of opportunity, which in essence only requires that there is no discrimination based on demographics for those assigned the positive outcome classification. For the example of granting loans, paying back (not defaulting on) the loan represents the positive class. Equality of opportunity corresponds to making sure that those in this class do not mistakenly get classified as defaulters—the negative class. We discuss this further in Section 4; for now, it is important to note that this formulation is inextricably tied to accuracy. Training a model to reduce mistaken negative classification directly depends on training data that contains negative labels. Those labels, however, may not align with what is correct in terms of "ground truth"—e.g., Black individuals erroneously marked as defaulters.

(The implications of "true" classification, including the simplifying assumptions that inform such classification, are out of scope for our purposes; we refer the reader to the rich literatures in sociology and STS on categorization and classification, notably Velocci (2021) and Bowker and Star (1999). Similar to our discussion of accuracy and classification, our work will not focus on the normative limits of defining fairness mathematically, such as the challenges of formulating fairness problems that account for intersectional protected identities (Selbst et al. 2019; Chen, Johansson, and Sontag 2018; Dwork et al. 2018; Buolamwini and Gebru 2018).)

Lastly, it is also worth noting that multiple different mathematical notions of fairness cannot be satisfied simultaneously. This incompatibility has been formalized as impossibility results (Kleinberg, Mullainathan, and Raghavan 2017; Chouldechova 2017) and has placed a more significant emphasis on how computer scientists choose which fairness metric to study (Friedler, Scheidegger, and Venkatasubramanian 2016). While these findings are extremely well-cited, they are not surprising when considering fairness beyond its definition as a metric. In a pluralistic world, values like fairness depend on the time and place in which they are defined; different, incompatible definitions can hold simultaneously, depending on context (Berlin 2013).
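For reference, the equality of opportunity criterion of Hardt, Price, and Srebro (2016) discussed above can be written, in our own notation, as a parity condition on the positive class:

\[
\Pr\big[\hat{Y} = 1 \mid A = a,\ Y = 1\big] \;=\; \Pr\big[\hat{Y} = 1 \mid A = b,\ Y = 1\big] \quad \text{for all groups } a, b,
\]

where \(\hat{Y}\) is the model's decision, \(Y\) is the recorded label (1 denotes the positive class, e.g., repaying the loan), and \(A\) is the protected attribute. Equivalently, false negative rates on the positive class are equalized across groups. Note that the condition is stated relative to the recorded labels \(Y\), which is exactly where the "ground truth" concerns raised above enter.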
The “Inherent” Trade-Off
Algorithmic fairness tends to pose accuracy and fairness in an "inherent" or "unavoidable" trade-off: An increase in fairness necessarily comes with a decrease in accuracy; increasing accuracy necessarily decreases fairness (Chen, Johansson, and Sontag 2018; Menon and Williamson 2018; Bakker et al. 2019; Corbett-Davies et al. 2017; Dwork et al. 2018; Sabato and Yom-Tov 2020; Zhao and Gordon 2019).

How did this trade-off problem formulation come about? Much of the literature that engages this trade-off does so empirically; that is, the authors perform or cite experiments that convey the intuition that a trade-off exists, so they decide that a mathematical trade-off is an appropriate way to model the problem. However, most do not characterize or quantify the trade-off theoretically. The work that has attempted a theoretical treatment suggests that—at least in theory—the existence of the fairness-accuracy trade-off can be less rigidly described (Dutta et al. 2020; Wick, panda, and Tristan 2019). Nevertheless, the general practice in the field is to tacitly accept the trade-off as fact, regardless of the particular fairness and accuracy metrics under consideration.

Computer scientists further observe that the ramifications of this trade-off, particularly in high-stakes domains, can be significant. As a result, they sometimes wade into the murkiness of how to optimize the trade-off implementation in specific "sensitive" applications. For example, several computer scientists have noted that in areas like healthcare, trade-off implementations should favor accuracy, as privileging fairness can have "devastating" consequences, such as misdiagnosing cancer (Chen, Johansson, and Sontag 2018; Srivastava, Heidari, and Krause 2019).

Other researchers have posited why this trade-off exists in the first place. For example, one popular explanation comes from Friedler, Scheidegger, and Venkatasubramanian (2016). The authors of this paper reason that there is an "abstract construct space" that represents the features we actually want to measure but cannot observe (e.g., intelligence). Instead, we see features in the "observed space" of the actual world (e.g., SAT score), and there is a mapping from features in the construct space to features in the observed space (e.g., SAT score is the mapped feature in the observed space, standing in place for intelligence in the construct space). According to Friedler et al., the trade-off between classification accuracy and fairness exists in the real world due to "noisier mappings" for less privileged groups from the construct space to the observed space. They contend that this noise comes from historic differences, particularly in opportunity and representation, which makes positive and negative data points less distinguishable (in comparison to privileged groups) for the learned classifier. It is this decrease in separability that leads to less fair classification for less privileged groups. In the example of SATs, this means that the scores are less reliable in terms of conveying information about intelligence for underprivileged groups. While this posited explanation may seem reasonable, work that engages with it rarely (if ever) supports the explanation with data, as it is usually not the specific fairness problem under mathematical consideration (Dutta et al. 2020).
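Although each paper formalizes the trade-off differently, a schematic version (our own abstraction, not any single paper's notation) is the constrained problem

\[
\min_{\theta}\; \mathcal{L}_{\text{acc}}(\theta) \quad \text{subject to} \quad \Delta_{\text{fair}}(\theta) \le \epsilon,
\]

or its penalized form \(\min_{\theta}\, \mathcal{L}_{\text{acc}}(\theta) + \lambda\, \Delta_{\text{fair}}(\theta)\), where \(\mathcal{L}_{\text{acc}}\) is a classification loss and \(\Delta_{\text{fair}}\) is a chosen disparity measure (e.g., a gap in group error rates). Sweeping \(\epsilon\) (or \(\lambda\)) traces a curve analogous to Figure 1, which is why results in this literature tend to be read as points on a fairness-accuracy frontier.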
In our critiques of emergent unfairness in Section 4, we explicitly examine the fairness-accuracy trade-off model as a sociotechnical system. The choice to formulate algorithmic fairness as an optimization problem produces a particular kind of knowledge about fairness that cannot be detached from its broader social context. In the field of Science and Technology Studies, the term "sociotechnical" is used to indicate the presence and interaction of both social and technical components in systems like the electric power grid (Hughes 1993), uranium mining practices (Hecht 2002), Wikipedia organization (Geiger 2017), and algorithmic decision-making (Selbst et al. 2019).

In their discussion of algorithmic fairness research, Selbst et al. (2019) use the concept of sociotechnical systems to draw attention to five "traps" that imperil well-meaning data scientific approaches to fairness. Overall, the authors argue that the use of "bedrock" computer science concepts in fairness research can lead to problematic outcomes. By focusing on one such bedrock concept—optimization—that characterizes a newly popular approach to algorithmic fairness research, we offer additional specificity and insight to existing critiques.

One of the potential pitfalls that Selbst et al. (2019) identify is called the "Framing Trap," which the authors describe as the "failure to model the entire system over which a social criterion, such as fairness, will be enforced." A sociotechnical lens, they argue, might suggest new ways for researchers to draw the boundaries of fairness problems to include social relations and dynamics that may have otherwise been excluded.

Defining fairness and accuracy in trade-off exemplifies falling into the Framing Trap. The language of defining trade-offs along a curve (Figure 1) necessarily requires a give-and-take relationship between the factors under consideration, as those factors are cast in opposition to one another. In the fairness-accuracy trade-off, fairness and accuracy are framed as inherently competing goals, in which we must give up some of one in order to gain some of the other (even if "some" cannot be quantified definitively). Based on this framing, it is consistent that much of the literature uses language that describes the "cost of fairness" (Chen, Johansson, and Sontag 2018; Menon and Williamson 2018; Corbett-Davies et al. 2017; Dutta et al. 2020), depending on where on the trade-off optimization curve a particular implementation lies. This cost, however, can be described as cutting both ways: It is similarly reasonable to talk about the "cost of accuracy." Yet, with few exceptions (Corbett-Davies et al. 2017), the literature in this area chooses not to discuss costs in this way.

This decision is perhaps due to the tendency within the field of ML more broadly to privilege accuracy during model design; nonetheless, this framing shifts the burden of defensibility to fairness, in the sense that it implies that fairness' "costs" require justification. Moreover, this framing as a trade-off does not leave room for definitions of accuracy that are generally in accord with fairness—that the accurate thing to do could be equivalent to the fair thing to do—unless one specifies particular conditions for which it is possible to demonstrate that the trade-off does not exist (Dutta et al. 2020; Wick, panda, and Tristan 2019).

In Section 4, we suggest that a solution to the "Framing Trap" should also include redrawing the boundaries of the problem to account for social and technical considerations in historical context. A suitable frame for fairness research can neither be blind to past historical context nor ignore the future. Selbst et al. (2019) similarly attend to the time-based nature of fairness solutions in their discussion of "ripple effects" and the need to be aware of how the implementation of new systems can alter sociotechnical relationships or reinforce existing power dynamics in problematic ways.

Selbst et al. (2019) also identify a "Portability Trap" that likely stems from the value placed on tools in computer science that are transferable from one problem to another. Likewise, our analysis demonstrates some of the challenges of using trade-off and optimization tools in algorithmic fairness research. Concretely, framing the problem as a trade-off problem to be "solved" using math, instead of using other interventions to address inequity, falls under the highly-critiqued practice of technological "solutionism" (Selbst et al. 2019; Abebe et al. 2020)—the notion that technology is uniquely capable of solving social problems. We contend that it is unlikely that the same ideas used to solve problems of steel allocation (Figure 1) will transfer without issue to questions of fair hiring practices. We suggest that attending to the sociotechnical context of each situation may help prevent the emergent unfairness we identify in the following sections.
Choosing to convey fairness and accuracy as a trade-off is a mathematical modeling assumption. As we discussed in Section 2, researchers have observed a pattern in their empirical results concerning accuracy and fairness, and deem a trade-off to be a useful way to formulate the mathematical problem of characterizing the relationship between the two. As we suggest above, fairness and accuracy metrics have normative dimensions; so, too, does the modeling assumption that poses them in trade-off.

There are also numerous other mathematical assumptions, each of which carries its own implicit normative dimensions. We observe that, based on these implicit assumptions, fairness-accuracy trade-off scholarship is plagued with gaps and oversights. These issues can lead to conclusions that actually perpetuate unfairness. There are dozens of examples of particular assumptions specific to each paper in fairness-accuracy trade-off scholarship, and it is not possible to be exhaustive regarding each mathematical assumption's corresponding normative assumptions. Instead, we isolate three patterns of implicit, unexamined assumptions in the discipline, and the emergent unfairness that can result: Unfairness from assuming 1) strict notions of equality can substitute for fairness, 2) historical context is irrelevant when formulating the trade-off, and 3) that collecting more data on marginalized groups is a reasonable mechanism for alleviating the trade-off.
Unfairness from Assuming Fairness = Equality
One assumption prevalent in fairness-accuracy trade-off literature concerns how different papers choose to measure fairness. Most of the work in this subfield relies on parity-based definitions. Algorithmic fairness definitions like this effectively make the modeling assumption to represent "fairness" as "equality." To be clear, we mean "equality" in the strict sense of the values of metrics being as equal as possible, by minimizing some form of measured inequality. This kind of strict equality can stand in for "fairness" in terms of what is actually being modeled; fairness is being framed as a problem of strict equality. This is easily discernible in the popular equality of opportunity metric (Section 2), used in numerous trade-off papers (Chen, Johansson, and Sontag 2018; Dutta et al. 2020; Bakker et al. 2019; Noriega-Campero et al. 2019), which tries to minimize discrepancies in false negative classification decisions among different demographic groups; it literally tries to make those rates as equal as possible.

However, what is fair and what is equal are not always the same thing, and framing them as equivalent can actually lead to unfair outcomes. For example, when addressing historic or systemic inequity, it can be necessary to take corrective or reparative action for some demographic groups in order to create the conditions of more-equal footing. Such actions necessarily diverge from equality in the strict mathematical sense, so strictly parity-based fairness metrics cannot capture this kind of nuance.

(While such equality-based notions of fairness dominate the literature more generally, not just fairness-accuracy trade-off work, there are a growing number of exceptions, such as definitions framed in terms of Rawlsian social welfare (Rawls 1971; Heidari et al. 2018; Joseph et al. 2016).)
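To spell out the strict-equality reading described above, the sketch below measures "unfairness" as the gap in false negative rates across groups, which an equality-of-opportunity constraint drives toward zero. The data and group labels are hypothetical; driving this number to zero says nothing about whether the underlying labels themselves encode past inequity.

# A minimal sketch of "fairness as strict equality": the gap in false negative
# rates across groups. Data and group names are hypothetical.
import numpy as np

def false_negative_rate(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0)) if positives.any() else 0.0

def fnr_parity_gap(y_true, y_pred, group) -> float:
    """Largest difference in FNR between groups; 0 means 'perfectly fair'
    under this strict-equality reading, regardless of historical context."""
    rates = [false_negative_rate(y_true[group == g], y_pred[group == g])
             for g in np.unique(group)]
    return max(rates) - min(rates)

y_true = np.array([1, 1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "b", "b", "b"])
print(fnr_parity_gap(y_true, y_pred, group))  # prints 0.5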
The ongoing debate in the United States around the fairness of affirmative action policy can help illustrate this distinction, as well as the complications that arise when defining fairness as equality. In brief, affirmative action is a social policy aimed at increasing the representation of historically marginalized groups in university student and workforce populations; it attempts to implement a fairer playing field by providing individuals from marginalized backgrounds with the chance to have the same opportunities as those from more privileged backgrounds.

While affirmative action has existed as official policy in the US for decades (Kennedy 1961), it is extremely contentious. Some white Americans and white supremacists, who do not feel personally responsible for systemic discrimination against BIPOC populations (an acronym for "Black, Indigenous, and People of Color," used particularly in the US and Canada), feel that affirmative action puts them at a disadvantage. They claim that the policy is unfair, and in fact is responsible for "reverse discrimination" (Budryk 2020; Newkirk 2017; Pham 2018). This belief comes in part from the idea that affirmative action does not lead to "equal" comparisons in the strictest sense—comparing SAT scores or GPAs point for point. Instead, in the language of Friedler, Scheidegger, and Venkatasubramanian (2016), one could say that affirmative action attempts to repair or normalize for the "noisy mappings" that these scores convey for unprivileged populations, in order to promote fairer outcomes.

(It is also interesting to note that this controversy has found its way into the language of algorithmic fairness literature. Dwork et al. (2012, 2018) use the term "fair affirmative action" in their work; they seem to be attempting to distinguish their notion from some imagined, alternative, unfair variant. This term is arguably redundant, since affirmative action is fundamentally about trying to promote fairer outcomes, even if that notion of fairness does not align with parity-based notions in algorithmic fairness.)

In short, the goal of affirmative action illustrates how notions of fairness and equality can diverge. The policy's existence is predicated on the notion that strictly equal treatment, without attending to past inequity, can potentially perpetuate unfairness.
Unfairness from Assuming the Irrelevance of Context
Fundamentally, the issue with the assumption that strict notions of equality can stand in for fairness has to do with how the assumption treats—or rather discounts—context. Fairness metrics like equality of opportunity are only able to evaluate the local, immediate decision under consideration. As discussed above using the example of affirmative action, this type of equality cannot accommodate reparative interventions that attempt to correct for past inequity. This similarly implicates issues with how we measure accuracy, since such metrics measure the correctness of current classification decisions in relation to past ones. We next examine this issue, as it presents fundamental contradictions in the framing and formulation of the fairness-accuracy trade-off problem.

Ignoring the Past
As discussed in Section 1, optimization involves minimizing a loss function. In statistical terminology, minimizing the expected loss depends on the true class labels (Bishop 1995). In fairness-related application domains we rarely, if ever, have access to true class labels. To return to an earlier example, Black people have systematically been denied loans in the US due to their race; in many cases, while a Black person's "true" label should be that they would not default on a loan, past loan-granting decisions (and therefore the corresponding data) mark them as a defaulter. This captures the problem of label bias: Misalignment between the "ground truth" label and the actual, observed label in real-world data (Section 2). In a sense, this bias is what has motivated the entire field of algorithmic fairness in the first place: Automated decision systems that do not account for systemic discrimination in training data end up magnifying that discrimination (Barocas, Hardt, and Narayanan 2018; Abebe et al. 2020); to avoid this, such systems need to be proactive about being fair.

Label bias presents an inherent issue with how we measure accuracy: If labels are wrong, particularly for individuals in the groups for which we want to increase fairer outcomes, then there are cases where misclassification is in fact the correct thing to do. In other words, how we measure accuracy is not truly accurate.

Yet, in the fairness-accuracy trade-off literature, it is very common to assume label bias can be ignored. Much of the work in this space does not mention label bias at all, or claims that it is out of scope for the research problem under consideration (Chen, Johansson, and Sontag 2018). This presents a contradiction: Simultaneously acknowledging that labels in the observed space are noisy representations of the ground truth (i.e., there is bias in the labels), but then explicitly assuming those labels in the training data (i.e., the observed space labels) are the same as the true labels (Dutta et al. 2020). In other words, because the labels are biased, the corresponding accuracy measurements that depend on them are also biased, which this work explicitly ignores in its trade-off formulation. (Wick, panda, and Tristan (2019) is a notable exception, acknowledging this contradiction in stark terms: "...there is a pernicious modeling-evaluating dualism bedeviling fair machine learning in which phenomena such as label bias are appropriately acknowledged as a source of unfairness when designing fair models, only to be tacitly abandoned when evaluating them.")

If the accuracy metric is conditioned on past unfairness, what is the trade-off between fairness and accuracy actually measuring? What does it mean to "increase" or "decrease" accuracy in this context? If accuracy metrics encode past unfairness for unprivileged groups, the fairness-accuracy trade-off is effectively positioning fairness in trade-off with unfairness, which is tautological. Giving validity to an accuracy metric that has a dependency on unfairness inherently advantages privileged groups; it is aligned with maintaining the status quo, as there is no way to splice out the past unfairness on which it is conditioned. (To the best of our knowledge, no scholarship in this space has attempted to put a Bayesian prior on this existing unfairness. Such an approach would explicitly assume and model the existing unfairness due to a history of discrimination against certain demographics. Such modeling would not come without concern, as it would require fairness researchers to introduce a different set of mathematical modeling assumptions that carry their own normative implications.)
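Restating the point above in our own shorthand (not notation drawn from any of the cited papers), measured accuracy aligns predictions with observed labels rather than true ones:

\[
\mathrm{Acc}_{\mathrm{obs}}(f) = \Pr\big[f(X) = Y_{\mathrm{obs}}\big] \qquad \text{versus} \qquad \mathrm{Acc}_{\mathrm{true}}(f) = \Pr\big[f(X) = Y_{\mathrm{true}}\big].
\]

When label bias makes \(Y_{\mathrm{obs}}\) diverge from \(Y_{\mathrm{true}}\) for a marginalized group, a model can raise \(\mathrm{Acc}_{\mathrm{obs}}\) precisely by reproducing historically biased decisions, so "increasing accuracy" in the trade-off need not mean increasing \(\mathrm{Acc}_{\mathrm{true}}\).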
Being Blind to the Future
Similarly, studying specific, local classification decisions in terms of balancing fairness and accuracy does not provide insight about the more global, long-term effects that such decisions potentially have. This also presents a contradiction: Some scholarship concerning the trade-off explicitly aims to support the goals of policymakers, but policymaking by its very nature takes a long-tailed view. Current policy interventions do not just have a local impact, but rather also have desired cascading effects that carry into the future.

Ironically, this contradiction is clearly spelled out in some of the trade-off literature as an intentional assumption. For example, Corbett-Davies et al. (2017) explicitly state: "Among the rules that satisfy a chosen fairness criterion, we assume policymakers would prefer the one that maximizes immediate utility." They intentionally examine the "proximate costs and benefits" of the fairness-accuracy trade-off, assuming that this is the temporal resolution that would be most useful to policymakers. This approach enables simplifying mathematical assumptions, as it does not require evaluating how the specific automated decision under consideration has potential ramifications in the future. In Corbett-Davies et al. (2017), in which they examine risk-assessment decisions for granting bail, they specifically do not need to look at how the immediate decision to detain someone may in fact be predictive of (even causally linked to) future arrests. However, if such decisions are applied unfairly across racial demographic groups (even if somewhere slightly "fairer" on the optimization curve), then they would just repeat patterns of bias existing in past data.
Unfairness of "Active Fairness" Trade-Off Remedies
Some work regarding the fairness-accuracy trade-off goes beyond observing, characterizing, or implementing the trade-off for different applications, as we have discussed above. These authors note that, while they agree with the notion that the trade-off is inherent, its effects can perhaps be mitigated by increasing both accuracy and fairness—essentially, moving the trade-off optimization curve up and to the right (Figure 1). The trade-off still exists in this scenario, but perhaps is less of an issue since the models perform better overall in terms of both accuracy and fairness. Concretely, authors recommend a technique they call active feature acquisition or active fairness (Noriega-Campero et al. 2019), which promotes the idea that "data collection is often a means to reduce discrimination without sacrificing accuracy" (Chen, Johansson, and Sontag 2018)—that collecting more features for the unprivileged group will help ensure fairer outcomes (Dutta et al. 2020; Bakker et al. 2019). The rationale is that additional feature collection alleviates the bias in the existing data for unprivileged groups, which will result in reduced bias in the classification results for those groups. Moreover, the authors note that gathering more features for the unprivileged group leads to these benefits without impacting the privileged group; the privileged group's accuracy and fairness metrics remain unchanged.

Setting aside that it might not even be possible to collect more features in practice, there are important implicit assumptions in this choice of solution. In particular, this work seems to pose data collection as a value-neutral solution to ensure greater fairness. This assumption is clearly false. It is widely accepted, particularly in sociotechnical literature, that data collection is often a form of surveillance (Zuboff 2018; Clarke 1994; Brayne 2017; Cohen 2011). It is not a neutral act, and it is generally not equally applied across demographic groups in the US.

The choice to collect more data raises a normative question directly in contradiction with their goal of increased fairness for unprivileged groups: Do we really want to collect more data on unprivileged groups—groups that already tend to be surveilled at higher rates than those with more privilege? In the US in particular there is a long history of tracking non-white individuals: Black Americans, from Martin Luther King to Black Lives Matter activists; Japanese Americans during World War II; non-white Muslims, particularly since 9/11; Latine individuals in relation to immigration status (Bedoya 2016; DeSilver, Lipka, and Fahmy 2020; Speri 2019; Painter 2011). In a more global treatment of fairness, is it fair to collect more data on these populations just to ensure we are optimizing some local fairness metric?

One could make the argument that machine learning, broadly speaking, pushes toward greater surveillance. The field is pushing to scale, training larger and larger model specifications, which tend to require training on larger and larger datasets. These data-hungry methods in turn push for greater data collection and surveillance in general (Zuboff 2018; Kaplan et al. 2020).
Yet, the proposed techniques in active fairness to alleviate the fairness-accuracy trade-off stand apart: They specifically advocate for increasing data collection of already-surveilled groups. They tend to leave the data for the privileged group untouched in order to decrease classification disparities between groups. In essence, this unfairly shifts the burden of producing fair classification results to the unprivileged group, affording the additional privilege (i.e., even less relative surveillance) to the already privileged group. Put another way, their solution to the fairness-accuracy optimization problem introduces another, unexplored objective function—an objective function concerning the burden of surveillance, whose solution in this case causes residual unfairness for the marginalized group.

Some work in active fairness does acknowledge that additional data collection is not costless; however, this work often discusses it as a necessary cost for increased fairness rather than investigating it as a potential source for increased unfairness (Chen, Johansson, and Sontag 2018). Noriega-Campero et al. (2019) state that it would be useful to model the cost of each feature in the dataset, where costs implicate monetary, privacy, and opportunity concerns. Beyond noting this idea, they do not attempt to formalize it in their work. Bakker et al. (2019) go a step further by including cost in their problem formulation, associating a vector of costs with each feature. However, it is unclear how they pick the values of those costs and, perhaps more importantly, they make the assumption that the vector of costs is the same for each individual in the population. This assumes that different values for different features do not incur different social impacts, which is demonstrably not the case. For example, consider individuals of transgender identity: Trans people, in comparison to cis people, face significant discrimination in response to disclosing their identity. In the language of Bakker et al. (2019), it is more costly for trans people to have features about gender identity collected than it is for cis people. In fact, such disparate costs can be thought of as the basis for needing to acknowledge and do policymaking using protected demographic attributes in the first place.
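As a minimal sketch of the uniform-cost assumption we are questioning (feature names, groups, and cost values are hypothetical, and this is not Bakker et al.'s actual formulation):

# A minimal sketch of the uniform-cost assumption in cost-aware feature acquisition.
# Feature names, groups, and cost values are hypothetical, for illustration only.
FEATURE_COSTS = {"income_history": 1.0, "gender_identity": 1.0}  # one vector for everyone

def acquisition_cost(features_to_collect, costs=FEATURE_COSTS):
    """Total cost of collecting extra features under the uniform-cost assumption:
    every individual is assumed to bear the same cost for disclosing a feature."""
    return sum(costs[f] for f in features_to_collect)

# The critique: disclosure costs are not uniform. A group-dependent cost table
# (e.g., a higher social cost of disclosing gender identity for trans individuals)
# would already change which acquisitions look "cheap."
GROUP_COSTS = {
    "cis":   {"income_history": 1.0, "gender_identity": 1.0},
    "trans": {"income_history": 1.0, "gender_identity": 5.0},  # hypothetical weight
}
print(acquisition_cost(["gender_identity"]))                      # 1.0 for everyone
print(sum(GROUP_COSTS["trans"][f] for f in ["gender_identity"]))  # 5.0 for this group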
Making Normative Assumptions Explicit

Writing mathematical proofs requires assumptions. For example, in machine learning, when writing proofs about an algorithm's properties, it is common to assume that the objective function we are trying to optimize is convex. Assumptions like this enable us to guarantee certain logical conclusions about an algorithm's behavior, such as bounds on its convergence rate. While fairness-accuracy trade-off researchers are accustomed to stating mathematical assumptions like this, and ensuring that sound mathematical conclusions follow, we have shown that they do not pay similar attention to normative assumptions and their ensuing contradictory conclusions. We contend that researchers should take the time to make explicit such assumptions underlying their work. Being rigorous and clear about normative assumptions enables them to be reviewed just as rigorously as mathematical assumptions, and will facilitate greater scrutiny about the appropriateness of proposed algorithmic fairness solutions. In the language of Selbst et al. (2019), this could help prevent falling into the Framing Trap.

ML researchers should engage the assistance of social scientists if they believe they lack the expertise to do this work independently. Moreover, this process should facilitate researchers being introspective about how their individual backgrounds might inform the assumptions they bring into their work. This would be one necessary (though not on its own sufficient) way to address critiques of fairness research being dominated by white voices (Abdurahman 2019).
Tweaking Normative Assumptions for Robustness
Making normative assumptions explicit could also help facilitate more robust ML fairness research. When investigating algorithmic robustness, researchers are generally comfortable with relaxing or changing certain mathematical proof assumptions and reasoning out the resulting changes (or stasis) in algorithm behavior. As the economist Leamer (1983) notes:

...an inference is not believable if it is fragile, if it can be reversed by minor changes in assumptions. . . . A researcher has to decide which assumptions or which sets of alternative assumptions are worth reporting.

As a test of normative robustness, we similarly recommend that fairness-accuracy trade-off researchers perturb their normative assumptions and investigate how this may alter normative outcomes. For example, when considering surveillance of the unprivileged via active feature acquisition as an appropriate mechanism for alleviating the trade-off, it would be useful to state this as an explicit assumption, and then consider surveillance as a constraint for the problem. In other words, one could ask, how much surveillance is tolerable for increased fairness? Perhaps none, but perhaps there is a small set of high-quality features that could be collected to serve this purpose, rather than just indiscriminately collecting a lot of additional features.
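One way to read this suggestion as a formal perturbation is sketched below, with entirely hypothetical feature names, fairness-gain estimates, and intrusiveness weights: treat surveillance as a budgeted constraint and ask which small set of features, if any, is worth acquiring.

# A minimal sketch of "surveillance as an explicit constraint" on feature acquisition.
# All names and numbers are hypothetical; this is not any published formulation.
from itertools import combinations

CANDIDATE_FEATURES = {"employment_history": 0.04, "utility_payments": 0.03,
                      "social_media_activity": 0.09}   # estimated fairness-gap reduction
SURVEILLANCE_COST  = {"employment_history": 2, "utility_payments": 1,
                      "social_media_activity": 10}     # relative intrusiveness
SURVEILLANCE_BUDGET = 3   # how much additional surveillance is deemed tolerable

best_subset, best_gain = (), 0.0
for r in range(len(CANDIDATE_FEATURES) + 1):
    for subset in combinations(CANDIDATE_FEATURES, r):
        cost = sum(SURVEILLANCE_COST[f] for f in subset)
        gain = sum(CANDIDATE_FEATURES[f] for f in subset)
        if cost <= SURVEILLANCE_BUDGET and gain > best_gain:
            best_subset, best_gain = subset, gain

# Collect only the small, high-value set that fits the surveillance budget,
# rather than indiscriminately acquiring every additional feature.
print(best_subset, best_gain)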
Borrowing tools from adjacent fields in computer science not only affects the results of fairness research, but it also helps to characterize the disciplinary status of fairness research itself. In this case, the operationalization of fairness as a mathematical problem may help situate questions of fairness within the realm of "science," thus conferring a particular legitimacy that science connotes (Gieryn 1983; Porter 1995). Situating the fairness question as a scientific question and ensconcing it in the language of trade-offs and optimization suggests that it is reasonable to try to solve for an "optimal," best answer. This tendency toward "solutionism" (Selbst et al. 2019; Abebe et al. 2020)—the notion that technology can solve social problems—grants special legitimacy to algorithmic fairness research, legitimacy absent in other fields that have tackled but have not "solved" the fairness problem from a social perspective.

According to Passi and Barocas (2019), "Whether we consider a data science project fair often has as much to do with the formulation of the problem as any property of the resulting model." Furthermore, the work of problem formulation is "rarely worked out with explicit normative considerations in mind" (Passi and Barocas 2019). As we have shown in this article, formulating algorithmic fairness as a trade-off between fairness and accuracy involves a variety of normative assumptions that can lead to various forms of emergent unfairness. When it comes to the critical issue of algorithmic fairness, it may be time to reconsider the framing of trade-offs altogether.
Acknowledgments
This work was made possible by funding from a Cornell Institute for the Social Sciences (ISS) Team Grant and the Cornell Humanities Scholars Program. We would additionally like to thank the following individuals for feedback on earlier versions of this work: Bilan A.H. Ali, Harry Auster, Kate Donahue, Christopher De Sa, Jessica Zosa Forde, Kweku Kwegyir-Aggrey, and Alec Pollak.
References
Abdurahman, J. K. 2019. A Response to Racial Categories of Machine Learning by Sebastian Benthall and Bruce Haynes. URL https://medium.com/@blacksirenradio/fat-be-wilin-deb56bf92539.

Abebe, R.; Barocas, S.; Kleinberg, J.; Levy, K.; Raghavan, M.; and Robinson, D. G. 2020. Roles for Computing in Social Change. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* '20, 252–260. New York, NY, USA: Association for Computing Machinery. ISBN 9781450369367.

Bakker, M. A.; Noriega-Campero, A.; Tu, D. P.; Sattigeri, P.; Varshney, K. R.; and Pentland, A. S. 2019. On Fairness in Budget-Constrained Decision Making. KDD Workshop on Explainable Artificial Intelligence.

Barocas, S.; Hardt, M.; and Narayanan, A. 2018. Fairness and Machine Learning. fairmlbook.org.

Barocas, S.; and Selbst, A. D. 2014. Big Data's Disparate Impact. SSRN eLibrary.

Bedoya, A. M. 2016. The Color of Surveillance. Slate. URL https://slate.com/technology/2016/01/what-the-fbis-surveillance-of-martin-luther-king-says-about-modern-spying.html.

Berlin, I. 2013. The Pursuit of the Ideal. In Hardy, H., ed., The Crooked Timber of Humanity, chapter 1, 1–20. Princeton University Press.

Binns, R. 2018. Fairness in Machine Learning: Lessons from Political Philosophy. In Friedler, S. A.; and Wilson, C., eds., Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, 149–159. New York, NY, USA: PMLR.

Bishop, C. M. 1995. Neural Networks for Pattern Recognition. USA: Oxford University Press, Inc. ISBN 0198538642.

Bowker, G. C.; and Star, S. L. 1999. Sorting Things Out: Classification and Its Consequences. Cambridge, Mass.: MIT Press.

Brayne, S. 2017. Big Data Surveillance: The Case of Policing. American Sociological Review.

Budryk. 2020. The Hill. URL https://thehill.com/blogs/blog-briefing-room/news/508608-two-white-students-sue-ut-austin-claiming-they-were-denied.

Buolamwini, J.; and Gebru, T. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Friedler, S. A.; and Wilson, C., eds., Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, 77–91. New York, NY, USA: PMLR.

Chen, I. Y.; Johansson, F. D.; and Sontag, D. 2018. Why is My Classifier Discriminatory? In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS'18, 3543–3554. Red Hook, NY, USA: Curran Associates Inc.

Chouldechova, A. 2017. Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data.

Clarke, R. 1994. The Digital Persona and its Application to Data Surveillance. Inf. Soc. 10: 77–92.

Cohen, N. 2011. The Valorization of Surveillance: Towards a Political Economy of Facebook. Democratic Communiqué.

Corbett-Davies, S.; Pierson, E.; Feller, A.; Goel, S.; and Huq, A. 2017. Algorithmic Decision Making and the Cost of Fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '17, 797–806. New York, NY, USA: Association for Computing Machinery. ISBN 9781450348874.

DeSilver, D.; Lipka, M.; and Fahmy, D. 2020. 10 things we know about race and policing in the U.S. Pew Research Center.

Dutta, S.; Wei, D.; Yueksel, H.; Chen, P.-Y.; Liu, S.; and Varshney, K. R. 2020. Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, 2803–2813. PMLR.

Dwork, C.; Hardt, M.; Pitassi, T.; Reingold, O.; and Zemel, R. 2012. Fairness through Awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS '12, 214–226. New York, NY, USA: Association for Computing Machinery. ISBN 9781450311151.

Dwork, C.; Immorlica, N.; Kalai, A. T.; and Leiserson, M. 2018. Decoupled Classifiers for Group-Fair and Efficient Machine Learning. In Friedler, S. A.; and Wilson, C., eds., Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, 119–133. New York, NY, USA: PMLR.

Flanagan, M.; and Nissenbaum, H. 2014. Values at Play in Digital Games. The MIT Press.

Friedler, S. A.; Scheidegger, C.; and Venkatasubramanian, S. 2016. On the (im)possibility of fairness.

Friedman, B.; and Hendry, D. G. 2019. Value Sensitive Design: Shaping Technology with Moral Imagination. The MIT Press. ISBN 0262039532.

Geiger, R. S. 2017. Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture. Big Data & Society.

Gieryn, T. F. 1983. Boundary-Work and the Demarcation of Science from Non-Science. American Sociological Review.

Hardt, M.; Price, E.; and Srebro, N. 2016. Equality of Opportunity in Supervised Learning. In Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc.

Hastie, T.; Tibshirani, R.; and Friedman, J. 2009. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, 2 edition.

Hecht, G. 2002. Rupture-Talk in the Nuclear Age: Conjugating Colonial Power in Africa. Social Studies of Science.

Heidari, H.; Ferrari, C.; Gummadi, K. P.; and Krause, A. 2018. Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making. In Advances in Neural Information Processing Systems, volume 31, 1265–1276. Curran Associates, Inc.

Hughes, T. P. 1993. Networks of Power: Electrification in Western Society, 1880-1930. Johns Hopkins University Press, 1st edition.

Joseph, M.; Kearns, M. J.; Morgenstern, J.; Neel, S.; and Roth, A. 2016. Rawlsian Fairness for Machine Learning. CoRR abs/1610.09559.

Kaplan, J.; McCandlish, S.; Henighan, T.; Brown, T. B.; Chess, B.; Child, R.; Gray, S.; Radford, A.; Wu, J.; and Amodei, D. 2020. Scaling Laws for Neural Language Models.

Kennedy, J. F. 1961. Executive Order 10925.

Kleinberg, J. M.; Mullainathan, S.; and Raghavan, M. 2017. Inherent Trade-Offs in the Fair Determination of Risk Scores. In Papadimitriou, C. H., ed., 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), volume 67 of LIPIcs, 43:1–43:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik.

Leamer, E. 1983. Let's Take the Con Out of Econometrics. American Economic Review.

Menon, A. K.; and Williamson, R. C. 2018. The Cost of Fairness in Binary Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, 107–118. New York, NY, USA: PMLR.

Monahan, J.; and Skeem, J. L. 2016. Risk Assessment in Criminal Sentencing. Annual Review of Clinical Psychology.

Newkirk. 2017. The Atlantic.

Noriega-Campero, A.; Bakker, M. A.; Garcia-Bulle, B.; and Pentland, A. S. 2019. Active Fairness in Algorithmic Decision Making. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, AIES '19, 77–83. New York, NY, USA: Association for Computing Machinery. ISBN 9781450363242.

Painter, N. I. 2011. The History of White People. W. W. Norton.

Passi, S.; and Barocas, S. 2019. Problem Formulation and Fairness. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* '19, 39–48. New York, NY, USA: Association for Computing Machinery. ISBN 9781450361255.

Pham, E. D. 2018. Fellow Asian-Americans, Back Off of Affirmative Action. The Harvard Crimson.

Porter, T. M. 1995. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton University Press. ISBN 9780691029085.

Powles, J.; and Nissenbaum, H. 2018. The Seductive Diversion of Solving Bias in Artificial Intelligence.

Rawls, J. 1971. A Theory of Justice. Cambridge, Massachusetts: Belknap Press of Harvard University Press.

Sabato, S.; and Yom-Tov, E. 2020. Bounding the Fairness and Accuracy of Classifiers from Population Statistics. In III, H. D.; and Singh, A., eds., Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, 8316–8325. PMLR.

Selbst, A. D.; Boyd, D.; Friedler, S. A.; Venkatasubramanian, S.; and Vertesi, J. 2019. Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* '19, 59–68. New York, NY, USA: Association for Computing Machinery. ISBN 9781450361255.

Speri, A. 2019. The FBI Spends a Lot of Time Spying on Black Americans. The Intercept. URL https://theintercept.com/2019/10/29/fbi-surveillance-black-activists/.

Srivastava, M.; Heidari, H.; and Krause, A. 2019. Mathematical Notions vs. Human Perception of Fairness: A Descriptive Approach to Fairness for Machine Learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '19, 2459–2468. New York, NY, USA: Association for Computing Machinery. ISBN 9781450362016.

Velocci, B. 2021. Binary Logic: Race, Expertise, and the Persistence of Uncertainty in American Sex Research. PhD Dissertation, Yale University. Chapter 1, "Unsolved Problems of Anomalous Sex: Managing Sexual Multiplicity in Nineteenth-Century Animal Studies".

Wick, M.; panda, s.; and Tristan, J.-B. 2019. Unlocking Fairness: a Trade-off Revisited. In Wallach, H.; Larochelle, H.; Beygelzimer, A.; d'Alché-Buc, F.; Fox, E.; and Garnett, R., eds., Advances in Neural Information Processing Systems, volume 32, 8783–8792. Curran Associates, Inc.

Yang, X.-S. 2010. Engineering Optimization: An Introduction with Metaheuristic Applications. Wiley Publishing, 1st edition. ISBN 0470582464.

Zhao, H.; and Gordon, G. 2019. Inherent Tradeoffs in Learning Fair Representations. In Wallach, H.; Larochelle, H.; Beygelzimer, A.; d'Alché-Buc, F.; Fox, E.; and Garnett, R., eds., Advances in Neural Information Processing Systems, volume 32, 15675–15685. Curran Associates, Inc.

Zuboff, S. 2018. The Age of Surveillance Capitalism. PublicAffairs.