Bias and Discrimination in AI: a cross-disciplinary perspective
Xavier Ferrer, Tom van Nuenen, Jose M. Such, Mark Coté, Natalia Criado
King's College London, United Kingdom
Corresponding author: X. Ferrer (email: xavier.ferrer [email protected]).
Abstract—With the widespread and pervasive use of Artificial Intelligence (AI) for automated decision-making systems, AI bias is becoming more apparent and problematic. One of its negative consequences is discrimination: the unfair or unequal treatment of individuals based on certain characteristics. However, the relationship between bias and discrimination is not always clear. In this paper, we survey relevant literature about bias and discrimination in AI from an interdisciplinary perspective that embeds technical, legal, social and ethical dimensions. We show that finding solutions to bias and discrimination in AI requires robust cross-disciplinary collaborations.
I. INTRODUCTION
Operating at a large scale and impacting large groups of people, automated systems can make consequential and sometimes contestable decisions. Automated decisions can impact a range of phenomena, from credit scores to insurance payouts to health evaluations. These forms of automation can become problematic when they place certain groups or people at a systematic disadvantage. These are cases of discrimination, which is legally defined as the unfair or unequal treatment of an individual (or group) based on certain characteristics such as income, education, gender or ethnicity. When the unfair treatment is caused by automated decisions, usually taken by intelligent agents or other AI-based systems, we talk about digital discrimination. Digital discrimination has been found in a diverse range of fields, such as in risk assessment systems for policing and credit scoring [1].

Digital discrimination is becoming a serious problem, as more and more decisions are delegated to systems increasingly based on AI techniques such as machine learning. While a significant amount of research has been undertaken from different disciplinary angles to understand this challenge, from computer science to law to sociology, none of these fields has been able to resolve the problem on its own terms. For instance, computational methods to verify and certify bias-free datasets and algorithms do not account for socio-cultural or ethical complexities, and do not distinguish between bias and discrimination. Both of these terms have a technical inflection, but are predicated on legal and ethical principles.

In this paper, we propose a synergistic approach that allows us to explore bias and discrimination in AI by supplementing technical literature with social, legal and ethical perspectives. Through a critical survey of related literature, we compare and evaluate the sometimes contradictory priorities within these fields, and discuss how disciplines might collaborate to resolve the problem. We also highlight a number of interdisciplinary challenges in attesting and addressing discrimination in AI.

II. BIAS AND DISCRIMINATION
Technical literature in the area of discrimination typically refers to the related issue of bias. Yet, despite playing an important role in discriminatory processes, bias does not necessarily lead to discrimination. Bias means a deviation from the standard, sometimes necessary to identify the existence of some statistical patterns in the data or language used [2], [3]. Classifying and finding differences between instances would be impossible without bias.

In this paper, we follow the most common definition of bias used in the literature and focus on the problematic instances of bias that may lead to discrimination by AI-based automated decision-making systems. Three main, well-known causes of bias have been distinguished [2]:

a) Bias in modelling: Bias may be deliberately introduced, e.g., through smoothing or regularisation parameters to mitigate or compensate for bias in the data, which is called algorithmic processing bias; or introduced while modelling, when objective categories are used to make subjective judgements, which is called algorithmic focus bias.

b) Bias in training: Algorithms learn to make decisions or predictions based on datasets that often contain past decisions. If a dataset used for training purposes reflects existing prejudices, algorithms will very likely learn to make the same biased decisions. Moreover, if the data does not correctly represent the characteristics of different populations, representing an unequal ground truth, it may result in biased algorithmic decisions (a minimal check of this kind is sketched at the end of this section).

c) Bias in usage: Algorithms can result in bias when they are used in a situation for which they were not intended. An algorithm utilised to predict a particular outcome in a given population can lead to inaccurate results when applied to a different population, a form of transfer context bias. Further, the potential misinterpretation of an algorithm's outputs can lead to biased actions through what is called interpretation bias.

A significant amount of literature focuses on forms of bias that may or may not lead to discriminatory outcomes, i.e., the relationship between bias and discrimination is not always clear or understood. Most literature assumes that systems free from biases do not discriminate; hence, reducing or eliminating biases reduces or eliminates the potential for discrimination. However, whether an algorithm can be considered discriminatory or not depends on the context in which it is deployed and the task it is intended to perform. For instance, consider a possible case of algorithmic bias in usage, in which an algorithm is biased towards hiring young people. At first glance, it may seem that the algorithm discriminates against older people. However, this (biased) algorithm should only be considered to discriminate if the context in which it is intended to be deployed does not justify hiring more young people than older people. Therefore, statistically reductionist approaches, such as estimating the ratio between younger and older people hired, are insufficient to attest whether the algorithm is discriminating without considering this socially and politically fraught context; it remains ethically unclear where we need to draw the line between biased and discriminating outcomes. As a result, AI and technical researchers often: i) use discrimination and bias as equivalent; or ii) focus on measuring biases without actually attending to the problem of whether or not there is discrimination. Our aim, in what follows, is to disentangle some of these issues.
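Returning to bias in training: as a simple illustration of how an unequal ground truth might be surfaced before any model is trained, the following minimal sketch (in Python, assuming a hypothetical pandas DataFrame with a group column and a binary past-decision column) tabulates how often each group appears in the data and how often it received the positive outcome. Strongly skewed counts or rates are a warning sign, though not in themselves proof of discrimination.

```python
import pandas as pd

def training_representation(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.DataFrame:
    """Per-group share of the training data and rate of positive past decisions."""
    summary = df.groupby(group_col)[outcome_col].agg(
        n="size",             # number of training examples per group
        positive_rate="mean"  # share of positive (e.g. 'hired', 'granted') outcomes
    )
    summary["data_share"] = summary["n"] / summary["n"].sum()
    return summary
```

For instance, if one group accounts for 5% of the rows but received almost none of the positive outcomes, a model trained on these past decisions is likely to reproduce that pattern.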
III. MEASURING BIASES
To assess whether an algorithm is free from biases, there is a need to analyse the entirety of the algorithmic process. This entails first confirming that the algorithm's underlying assumptions and its modelling are not biased; second, that its training and test data do not include biases and prejudices; and finally, that it is adequate to make decisions for that specific context and task. More often than not, however, we do not have access to this information. A number of issues prevent such an analysis. The data used to train a model, for instance, is typically protected since it contains personal information, rendering the task of attesting training bias impossible. Access to the algorithm's source code might also be withheld from the general public, removing the possibility of identifying modelling biases; this is common, as algorithms are valuable private assets of companies. Third, the specifics of where and how the algorithm will be deployed might be unknown to an auditor. Depending on what is available, different types of bias attesting might be possible, both in terms of the process and in terms of the metrics used to measure it.
A. Procedural vs Relational Approaches
We can distinguish between two general approaches to measure bias: i) procedural approaches, which focus on identifying biases in the decision-making process of an algorithm [4], and ii) relational approaches, which focus on identifying (and preventing) biased decisions in the dataset or algorithmic output. While ensuring unbiased outcomes is useful to attest whether a specific algorithm has a discriminatory impact on a population, focusing on the algorithmic process itself can help yield insights about why it happened in the first place.

Procedural approaches focus on identifying biases in the algorithmic "logic". Such ante-hoc interventions are hard to implement for two main reasons: (i) AI algorithms are often sophisticated and complex since, in addition to being trained on huge datasets, they usually make use of complex learning structures that might prove difficult to trace and understand (e.g. neural networks), and (ii) the source code of the algorithm is rarely available. Procedural approaches will become more beneficial with further progress in explainable AI [4]. Being able to understand the process behind an algorithmic discriminatory decision can help us understand possible problems in the algorithm's code and behaviour, and thus act accordingly towards the creation of non-discriminatory algorithms. As such, current literature on non-discriminatory AI promotes the introduction of explanations into the model itself, e.g., through inherently interpretable models such as decision trees, association rules, or causal reasoning, which provide coarse approximations of how a system behaves by explaining the weights and relationships between variables in (a segment of) a model [5], [6], [7]. Notice, however, that attesting that an algorithmic process is free from biases does not ensure a non-discriminatory algorithmic output, since discrimination can arise as a consequence of biases in training or in usage [8].

While procedural approaches attend to the algorithmic process, relational approaches measure biases in the dataset and the algorithmic output. Such approaches are popular in the literature, as they do not require insights into the algorithmic process. Besides evaluating biases in the data itself where it is available (e.g. by looking at statistical parity), implementations can compare the algorithmic outcomes obtained by two different sub-populations in the dataset [9], or make use of counterfactual or contrastive explanations, asking questions such as "Why X instead of Y?". Bias, here, is only located at testing time. One example is the post-hoc approach of Local Interpretable Model-Agnostic Explanations (LIME), which explains individual predictions by fitting an interpretable surrogate model to perturbed samples around the instance of interest [4]. Other approaches evaluate the correlation between algorithmic inputs and biased outputs, in order to identify those features that may lead to biased actions that affect protected sub-populations [10]. Since implementations often ignore the context in which the algorithm will be deployed, the decision whether a biased output results in a case of discrimination is often left to the user to assess [6].
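To make the contrastive, black-box flavour of relational approaches concrete, here is a minimal sketch (in Python; the `predict` function, the pandas test set and the 0/1 attribute encoding are all hypothetical) that re-queries a model with only a protected attribute swapped and reports the share of decisions that change.

```python
import numpy as np
import pandas as pd

def attribute_flip_rate(predict, X: pd.DataFrame, attr: str, values=(0, 1)) -> float:
    """Share of test instances whose black-box decision changes when only the
    protected attribute `attr` is swapped between the two given values."""
    X_a, X_b = X.copy(), X.copy()
    X_a[attr], X_b[attr] = values[0], values[1]
    preds_a = np.asarray(predict(X_a))  # decisions with attr set to the first value
    preds_b = np.asarray(predict(X_b))  # decisions with attr set to the second value
    return float(np.mean(preds_a != preds_b))
```

A non-zero flip rate only shows that the model's output depends on the protected attribute; as discussed above, whether that dependence amounts to discrimination still depends on the deployment context.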
B. Bias Metrics
The metrics for measuring bias can be organised into three different categories: statistical measures, similarity-based measures, and causal reasoning. While reviews such as [11] offer an extensive description of some of these metrics, below we discuss the intuition behind the most common types of metrics used in the literature.

Statistical measures to attest biases represent the most intuitive notion of bias, and focus on exploring the relationships or associations between the algorithm's predicted outcome for the different (input) demographic distributions of subjects and the actual outcome that is achieved. These measures include, first, group fairness (also named statistical parity), which requires that an equal proportion of each group of distinct individuals should receive each possible algorithmic outcome. For instance, if four out of five applicants of the advantaged group were given a mortgage, the same ratio of applicants from the protected group should obtain the mortgage as well. Second, predictive parity is satisfied if both protected and unprotected groups have equal positive predictive value, that is, the probability that an individual predicted to belong to the positive class actually belongs to it. Finally, the principle of well-calibration states that the probability estimates provided by the decision-making algorithm should agree with the outcomes actually observed. Despite the popularity of statistical metrics, it has been shown that statistical definitions are insufficient to establish the absence of biases in algorithmic outcomes, as they often assume the availability of verified outcomes necessary to estimate them, and often ignore attributes of the classified subject other than the sensitive ones [12].

Similarity measures, on the other hand, focus on defining a similarity value between individuals. Causal discrimination is an example of such measures, stating that a classifier is not biased if it produces the same classification for any two subjects with the same non-protected attributes. A more complex bias metric based on a similarity measure between individuals is fairness through awareness [12], which states that, for fairness to hold, the distance between the distributions of outputs for two individuals should be at most the distance between the two individuals as estimated by means of a similarity metric. The difficulty in using this metric lies in accurately defining a similarity measure that correctly represents the complexity of the situation in question, which is often impossible to generalise. Moreover, the similarity measure between individuals can suffer from the implicit biases of the expert who defines it, resulting in a biased similarity estimator.

Finally, definitions based on causal reasoning assume bias can be attested by means of a directed causal graph. In the graph, attributes are represented as nodes joined by edges which, by means of equations, represent the relations between attributes [7]. By exploring the graph, the effects that the different protected attributes have on the algorithm's output can be assessed and analysed. Causal fairness approaches are limited by the assumption that a valid causal graph able to describe the problem can be constructed, which is not always feasible due to the sometimes unknown and complex relations between attributes and the impact they have on the output.
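The intuition behind the statistical measures above can be captured in a few lines. The following sketch (Python with NumPy; binary labels and a 0/1 protected-group indicator are assumed, and the encoding is purely illustrative) computes the statistical parity difference and the predictive parity difference; values close to zero indicate parity under the respective definition.

```python
import numpy as np

def statistical_parity_diff(y_pred, group):
    """Difference in positive-outcome rates between the advantaged (group == 0)
    and protected (group == 1) sub-populations."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

def predictive_parity_diff(y_true, y_pred, group):
    """Difference in positive predictive value (precision) between the two groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    def ppv(mask):
        predicted_pos = mask & (y_pred == 1)
        return y_true[predicted_pos].mean() if predicted_pos.any() else np.nan
    return ppv(group == 0) - ppv(group == 1)
```

As noted above, such scores say nothing by themselves about whether an observed disparity is justified in a given context [12].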
IV. ATTESTING AND ADDRESSING DISCRIMINATION
The first step explored in the related literature to identify discriminatory outputs is determining the groups whose algorithmic outputs are going to be compared. Technical approaches to select the sub-populations of interest vary: either i) they consider sub-populations as already defined [6], [13]; or ii) the sub-populations are selected by means of a heuristic that aggregates individuals who share one or more protected or proxy attributes, as in FairTest's framework for detecting biases in datasets (https://github.com/columbia/fairtest). Protected attributes are encoded in legislation (cf. Sect. V) and usually include attributes such as sex, gender, and ethnicity, while proxy attributes are attributes strongly correlated with protected attributes, e.g. weightlifting ability (strongly correlated with gender). However, the process of selecting individuals or groups based on these attributes is non-trivial, since groups often result from the intersection of multiple protected and proxy attributes (cf. Sect. VI).

Once the protected and the potentially advantaged groups have been selected, implementations apply different bias metrics (cf. Sect. III-B) to compare and identify relevant differences in the algorithm's outcomes for the different groups. If these differences are a consequence of protected attributes, it is likely that the algorithm's decisions can be considered discriminatory.

To alleviate the contextual problem of whether an algorithmic outcome may constitute a case of discrimination, approaches often incorporate explanatory attributes: user attributes on which it is deemed acceptable to differentiate, even if this leads to apparent discrimination on protected attributes [13]. Some relevant tools are the open-source IBM AI Fairness 360 toolkit (https://github.com/IBM/AIF360), which contains techniques developed by IBM and the research community to help detect and mitigate bias in machine learning models throughout the AI application lifecycle, and Google's What-If Tool (https://pair-code.github.io/what-if-tool/), which offers an interactive visual interface that allows researchers to investigate model performance for a range of features in the dataset and optimisation strategies.

Despite these efforts in parameterising context uncertainty in technical implementations, the interpretive dimension that separates bias and discrimination remains a challenge. As a response, some approaches base their implementations on various anti-discrimination laws that focus on the relationships between protected attributes and decision outcomes. For instance, the US four-fifths rule and the Castaneda rule are used as general, and often arguably adequate, prima facie evidence of discrimination; see Section V for more details on these rules.

Approaches that intervene on problematic biases focus on (i) removing protected attributes from the data, in an attempt to impede the algorithm from using these protected attributes to make discriminatory decisions (fairness through blindness [12], [8]), or on (ii) debiasing algorithms' outputs [14]. An issue here is that removing protected attributes from the input data often results in a significant loss of accuracy in the algorithm [12]. Moreover, excluded attributes can often be correlated with proxy attributes that remain in the dataset, meaning bias may still be present (e.g. certain residential areas have specific demographics that play the role of proxy variables for ethnicity). These approaches can also be criticised because they alter the model of the world that an AI makes use of, instead of altering how that AI perceives and acts on bias [12]. A sketch of this proxy problem follows at the end of this section.

On a broader level, debiasing an algorithm's output requires a specific definition of its context and, as such, is difficult to achieve from a technical perspective only. A myriad of lingering questions remains to be answered: how much bias does an algorithm need to encode in order to consider its outputs discriminating? How can we reflect on the peculiarity of the data on which these algorithms are operating, data which often reflects the inequities of its time? In short, a clearer definition of the relation between algorithmic biases and discrimination is needed. We argue that such a definition can only be provided by a cross-disciplinary approach that takes legal, social and ethical considerations into account. In response, in the next sections we engage critically with related work from legal, social and ethical perspectives.
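As a minimal sketch of why blindness alone is fragile, the following (Python/pandas; it assumes numerically encoded features, a hypothetical protected column name, and uses plain linear correlation, which will miss more subtle dependencies) lists the features that remain most correlated with a protected attribute after that attribute itself is dropped. Any feature above the arbitrary threshold is a candidate proxy through which bias can re-enter.

```python
import pandas as pd

def proxy_audit(df: pd.DataFrame, protected: str, threshold: float = 0.3) -> pd.Series:
    """Absolute correlation of each remaining feature with the protected attribute,
    flagging potential proxies that survive 'fairness through blindness'."""
    corr = (
        df.drop(columns=[protected])
          .apply(lambda col: col.corr(df[protected]))  # correlation of each feature with the protected column
          .abs()
          .sort_values(ascending=False)
    )
    return corr[corr >= threshold]
```

A high-correlation feature such as a residential-area code would be exactly the kind of proxy described above: removing the protected column does not remove the information it carries.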
V. LEGAL PERSPECTIVE

Legislation designed to prevent discrimination against particular groups of people that share one or more protected attributes, namely protected groups, is known as anti-discrimination law. Anti-discrimination laws vary across countries. For instance, European anti-discrimination legislation is organised in directives, such as Directive 2000/43/EC against discrimination on grounds of race and ethnic origin, or Chapter 3 of the EU Charter of Fundamental Rights. Anti-discrimination laws in the US are described in Title VII of the Civil Rights Act of 1964 and in other federal and state statutes, supplemented by court decisions. For instance, Title VII prohibits discrimination in employment on the basis of race, sex, national origin and religion, and the Equal Pay Act prohibits wage disparity based on sex by employers and unions.

The main issues in trials related to discrimination consist of determining [15]: (1) the relevant population affected by the discrimination case, and to which groups it should be compared; (2) the discrimination measure that formalises group under-representation, e.g., disparate treatment or disparate impact [13], [16]; and (3) the threshold that constitutes prima facie evidence of discrimination. Note that these three issues coincide with the problems explored in the technical approaches presented earlier. With respect to the last point, no strict threshold has been laid down by the European Union. In the US, the four-fifths rule from the Equal Employment Opportunity Commission (1978), which states that a job selection rate for the protected group of less than 4/5 of the selection rate for the unprotected group constitutes an adverse impact, is sometimes used as prima facie evidence. The Castaneda rule, which states that the number of people of the protected group selected from a relevant population should not fall more than 3 standard deviations below the number expected in a random selection, is also used [16]. While such rules can help resolve discriminatory issues, more complex scenarios can arise. For instance, Hildebrandt and Koops mention the legally grey area of price discrimination, where consumers in different geographical areas can be offered different prices based on differences in average income [17].

More recent regulations, such as the General Data Protection Regulation (GDPR), have been offered as a framework to alleviate some of the enforcement problems of anti-discrimination law, and include clauses on automated decision-making related to procedural regularity and accountability, introducing a right of explanation for all individuals to obtain meaningful explanations of the logic involved when automated decision-making takes place. However, these solutions often assume white-box scenarios which, as we have seen, may be difficult to achieve technically; and even when they are achieved, they may not necessarily provide the answers sought to assess whether discrimination is present or not. Generally speaking, current laws are badly equipped to address algorithmic discrimination [16]. Leese [18], for instance, notes that anti-discrimination frameworks typically require the establishment of a causal chain between indicators on the theoretical level (e.g. sex or race) and their representation in the population under scrutiny. Data-driven analytics, however, create aggregates of individual profiles, and as such are prone to the production of arbitrary categories instead of real communities. As such, even if data subjects are granted procedural and relational explanations, the question remains at which point potential biases can reasonably be considered forms of discrimination.
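The two thresholds above can be operationalised in a few lines. The following sketch (Python; the function names, parameters, and the binomial model of random selection used for the Castaneda rule are our own illustrative choices, not a legal standard) returns whether each rule's prima facie threshold is met.

```python
import math

def four_fifths_check(sel_prot: int, n_prot: int, sel_unprot: int, n_unprot: int) -> bool:
    """True if the protected group's selection rate is at least 4/5 of the
    unprotected group's rate (the EEOC four-fifths guideline)."""
    rate_prot = sel_prot / n_prot
    rate_unprot = sel_unprot / n_unprot
    return rate_prot >= 0.8 * rate_unprot

def castaneda_check(sel_prot: int, n_selected: int, share_prot: float) -> bool:
    """True if the number of protected-group members selected is within
    3 standard deviations of the expectation under random (binomial) selection."""
    expected = n_selected * share_prot
    sd = math.sqrt(n_selected * share_prot * (1.0 - share_prot))
    return sel_prot >= expected - 3.0 * sd
```

For example, four_fifths_check(sel_prot=20, n_prot=100, sel_unprot=40, n_unprot=100) returns False, since the protected group's selection rate (0.20) is below 4/5 of the unprotected rate (0.8 × 0.40 = 0.32).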
VI. SOCIAL PERSPECTIVE

Digital discrimination is not only a technical phenomenon regulated by law, but one that also needs to be considered from a socio-cultural perspective in order to be rigorously understood. Defining what constitutes discrimination is a matter of understanding the particular social and historical conditions and ideas that inform it, and needs to be re-evaluated according to its implementation context. Bias in usage, as defined above, forms a challenge to any kind of generalist AI solution.

One complication highlighted by a social perspective is the potential of digital discrimination to reinforce existing social inequalities. This point becomes increasingly pressing when multiple identities and experiences of exclusion and subordination start interacting, a phenomenon called intersectionality [19]. One example is formed by the multiple ways that race and gender interact with class in the labour market, effectively generating new identity categories. From a legislation perspective, anti-discrimination laws can be applied when discrimination is experienced by a population that shares one or more protected attributes. However, this problem can grow exponentially in complexity when also considering proxy variables and the intersection of different features [10].

On a cultural and ideological level, the call for ever-expanding transparency of AI systems needs to be seen as an ideal as much as a form of 'truth production' [20]. Further, no standard evaluation methodology exists among AI researchers to ethically assess their bias classifications, as the explanation of a classification serves different functions in different contexts, and is arguably assessed differently by different people (the way a dataset is defined and curated, for instance, depends on the assumptions and values of its creator) [21]. Conducting a set of experimental studies to elicit people's responses to a range of algorithmic decision scenarios and explanations of these decisions, [22] find a strong split in their respondents: some find the general idea of algorithmic discrimination immoral, while others resist imputing morality to a computer system altogether ('the computer is just doing its job') [22]. While algorithmic decision-making implicates dimensions of justice, its claim to objectivity may also preclude public awareness of these dimensions.
Given the differing stances on discrimination in society, providing explanations to the public targeted by algorithmic decision-making systems is key, as it allows individuals to make up their own minds about their evaluations of these systems. Hildebrandt and Koops [17], for instance, call for smart transparency: designing the socio-technical infrastructures responsible for decision-making in a way that allows individuals to anticipate and respond to how they are profiled. In this context of public evaluation, it also becomes important to question which moral standards can or should be encoded in AI, and which considerations of discrimination can be expected to be most readily shared by a widely differing range of citizens [23]. While such frameworks can always be criticised as reductionist approaches to the complexity of social values, taking into account what kinds of values are important in society can go some way towards helping to establish how discrimination can be defined.
VII. ETHICAL PERSPECTIVE
Finally, we need to bring in an ethical perspective; as Tasioulas argues, discrimination does not need to be unlawful in order to be unfair [24]. Yet moral standards are historically dynamic, and continuously evolve due to technological developments. This explains why law and encoded social morality often lag behind technical developments. In light of the discriminatory risks (and benefits) that AI might pose, moral standards need to be reassessed in order to enable new definitions of discriminatory impact. It is telling that one of the most famous attempts to address this question in robotics derives from fiction: Isaac Asimov's Three Laws of Robotics. More recently, the AI community has attempted to codify ethical principles for AI, such as the Asilomar AI Principles (https://futureoflife.org/ai-principles/). However, these principles are criticised as being vague, mainly due to their level of abstraction, making them not necessarily helpful [24].

More grounded and detailed frameworks for AI ethics have recently been proposed, such as the standards being defined by the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems (https://ethicsinaction.ieee.org/), which aim to provide an incubation space for new solutions relevant to the ethical implementation of intelligent technologies. Another noteworthy contribution is presented in [24], which organises the ethical questions related to the usage of AI into three interconnected levels. The first level involves laws to govern AI-related activities, including public standards backed up by public institutions and enforcement mechanisms, which claim to be morally binding on all citizens in virtue of their formal enactment. Some efforts discussed in Section V can be seen as examples of this. However, this evades the problem that not all of the socially entrenched standards that govern our lives are legal standards. We rely not only on the law to discourage people from wrongful behaviour, but also on moral standards that are instilled in us from childhood and reinforced by society.

The second level is the social morality around AI. The definition of such a morality is problematic, as it involves a potential infinity of reference points as well as the cultivation of emotional responses such as guilt, indignation and empathy, both of which are effects of human consciousness and cognition [24]. The third and final level includes individuals and their engagement with AI. Individuals and associations will still need to exercise their own moral judgement by, for instance, devising their own codes of practice. However, how (or to what extent) these levels can be operationalised from a technical AI point of view is not yet clear.

VIII. OPEN CHALLENGES
Addressing and attesting digital discrimination and remedying its corresponding deficiencies will remain a problem for technical, legal, social, and ethical reasons. Technically, there are a number of practical limits to what can be accomplished, particularly regarding the ability to automatically determine the relationship between biases and discrimination. Current legislation is poorly equipped to address the classificatory complexities arising from algorithmic discrimination. Social inequalities and differing attitudes towards computation further obfuscate the distinction between bias and discrimination. From an ethical perspective, existing moral standards need to be reassessed in light of the risks and benefits AI might pose.

In sum, the design and evaluation of AI systems is rooted in different perspectives, concerns and goals. To posit the existence of a predefined path through these perspectives would be misleading. What is needed, instead, is sensitivity to the distinctions concerning what constitutes desirable AI implementation, and a dialogical orientation towards design processes. Finding solutions to discrimination in AI requires robust cross-disciplinary collaborations. We conclude by summarising what we believe to be some of the most important cross-disciplinary challenges to advance research and solutions for attesting and avoiding discrimination in AI.
A. How Much Bias Is Too Much?
Whether a biased decision can be considered discriminatory or not depends on many factors, such as the context in which the AI is going to be deployed, the groups compared in the decision, and other factors like the trade-off between individualist-meritocratic and outcome-egalitarian values. To simplify these problems, technical implementations tend to borrow definitions from the legal literature, such as the thresholds that constitute prima facie evidence of discrimination, and use them as general rules to attest algorithmic discrimination. Yet the problem cannot be addressed by simply encoding the legal, social and ethical context, which in and of itself is non-trivial. Bias and discrimination have a different ontological status: while the former may seem easy to define in terms of programmatic solutions, the latter involves a host of social and ethical issues that are challenging to resolve from a positivist framework.
B. Critical AI Literacy
Another challenge is the need to improve critical AI literacy. We have noted the need to take into account the end users of AI decision-making systems, and the extent to which their literacy regarding these systems can be targeted and improved. In part, this entails end-user knowledge of particularities such as the attributes being used in a dataset, as well as the ability to compare explained decisions and the moral rules underlying those choices. This is, however, not solely a technical exercise, as decision-making systems render end users into algorithmically constructed data subjects. This challenge could be addressed through a socio-technical approach that considers both the technical dimensions and the complex social contexts in which these systems are deployed. Building public confidence and greater democratic participation in AI systems requires ongoing development not just of explainable AI, but of better human-AI interaction methods and socio-technical platforms, tools and public engagement to increase critical public understanding and agency.
C. Discrimination-aware AI
Third, AI should not just be seen as a potential cause of discrimination, but also as a great opportunity to mitigate existing issues. The fact that AI can pick up on discrimination suggests it can be made aware of it. For instance, AI could help spot digital forms of discrimination and assist in acting upon them. For this aim to become a reality we would need, as explored in this work, a better understanding of social, ethical, and legal principles, as well as dialogically constructed solutions in which this knowledge is incorporated into AI systems. Two ways to achieve this goal are: i) using data-driven approaches like machine learning to learn from previous cases of discrimination and try to spot them in the future; and ii) using model-based and knowledge-based AI that operationalises the socio-ethical and legal principles mentioned above (e.g., normative approaches that include non-discrimination norms as part of the knowledge of an AI system to influence its decision making). This would, for instance, allow an AI system to realise that the knowledge it gathered or learned is resulting in discriminatory decisions when deployed in specific contexts. The AI system could then alert a human expert about this, and/or proactively address the issue spotted.

ACKNOWLEDGMENT

This work was supported by EPSRC under grant EP/R033188/1. It is part of the Discovering and Attesting Digital Discrimination (DADD) project (see https://dadd-project.org).
REFERENCES

[1] C. O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Broadway Books, 2017.
[2] D. Danks and A. London, "Algorithmic bias in autonomous systems," in IJCAI, 2017.
[3] X. Ferrer, T. van Nuenen, J. M. Such, and N. Criado, "Discovering and categorising language biases in Reddit," 2020.
[4] S. Mueller, R. Hoffman, W. Clancey, A. Emrey, and G. Klein, "Explanation in Human-AI Systems," 2019. [Online]. Available: https://arxiv.org/pdf/1902.01876.pdf
[5] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi, "A survey of methods for explaining black box models," ACM Computing Surveys (CSUR), vol. 51, no. 5, p. 93, 2018.
[6] S. Ruggieri, D. Pedreschi, and F. Turini, "Integrating induction and deduction for finding evidence of discrimination," AI and Law, vol. 18, pp. 1–43, 2010.
[7] N. Kilbertus, M. Rojas-Carulla, G. Parascandolo, M. Hardt, D. Janzing, and B. Schölkopf, "Avoiding discrimination through causal reasoning," in NIPS'17, 2017, pp. 656–666.
[8] T. Calders and I. Žliobaitė, "Why unbiased computational processes can lead to discriminative decision procedures," in Discrimination and Privacy in the Information Society. Springer, 2013, pp. 43–57.
[9] N. Criado, X. Ferrer, and J. M. Such, "A normative approach to attest digital discrimination," 2020.
[10] N. Grgić-Hlača, M. Zafar, K. Gummadi, and A. Weller, "Beyond distributive fairness in algorithmic decision making," AAAI, pp. 51–60, 2018. [Online]. Available: https://people.mpi-sws.org/~nghlaca/papers/fair_feature_selection.pdf
[11] S. Verma and J. Rubin, "Fairness definitions explained," in IEEE/ACM FairWare, 2018.
[12] C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel, "Fairness through awareness," in ITCS 2012. ACM, 2012, pp. 214–226.
[13] M. Feldman, S. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, "Certifying and removing disparate impact," in ACM SIGKDD'15. ACM, 2015, pp. 259–268.
[14] T. Bolukbasi, K. Chang, J. Y. Zou, V. Saligrama, and A. Kalai, "Man is to computer programmer as woman is to homemaker? Debiasing word embeddings," in NIPS'16, 2016, pp. 4349–4357.
[15] A. Romei and S. Ruggieri, "A multidisciplinary survey on discrimination analysis," The Knowledge Engineering Review, vol. 29, no. 5, pp. 582–638, 2014.
[16] S. Barocas and A. Selbst, "Big Data's disparate impact," Cal. Law Rev., vol. 104, pp. 671–729, 2016. [Online]. Available: https://ssrn.com/abstract=2477899
[17] M. Hildebrandt and B. Koops, "The challenges of ambient law and legal protection in the profiling era," SSRN, 2010.
[18] M. Leese, "The new profiling: Algorithms, black boxes, and the failure of anti-discriminatory safeguards in the European Union," Security Dialogue, vol. 45, no. 5, pp. 494–511, 2014.
[19] S. Walby, J. Armstrong, and S. Strid, "Intersectionality: Multiple inequalities in social theory," Sociology, vol. 46, no. 2, pp. 224–240, 2012.
[20] M. Ananny and K. Crawford, "Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability," New Media and Society, vol. 20, no. 3, pp. 973–989, 2018.
[21] T. van Nuenen, X. Ferrer, J. M. Such, and M. Coté, "Transparency for whom? Assessing discriminatory AI."
[22] R. Binns, M. Van Kleek, M. Veale, U. Lyngs, J. Zhao, and N. Shadbolt, "'It's reducing a human being to a percentage': Perceptions of justice in algorithmic decisions," in CHI 2018. ACM, 2018, p. 377.
[23] O. Curry, D. Mullins, and H. Whitehouse, "Is it good to cooperate? Testing the theory of morality-as-cooperation in 60 societies," Current Anthropology, vol. 60, no. 1, 2019.
[24] J. Tasioulas, "First steps towards an ethics of robots and artificial intelligence,"