Justicia: A Stochastic SAT Approach to Formally Verify Fairness
Bishwamittra Ghosh
School of Computing, National University of Singapore, Singapore
Debabrota Basu
Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden
Kuldeep S. Meel
School of Computing, National University of Singapore, Singapore
Abstract
As a technology, ML is oblivious to societal good or bad, and thus the field of fair machine learning has stepped up to propose multiple mathematical definitions, algorithms, and systems to ensure different notions of fairness in ML applications. Given the multitude of propositions, it has become imperative to formally verify the fairness metrics satisfied by different algorithms on different datasets. In this paper, we propose a stochastic satisfiability (SSAT) framework, Justicia, that formally verifies different fairness measures of supervised learning algorithms with respect to the underlying data distribution. We instantiate Justicia on multiple classification and bias-mitigation algorithms, and on multiple datasets, to verify different fairness metrics such as disparate impact, statistical parity, and equalized odds. Justicia is scalable, accurate, and operates on non-Boolean and compound sensitive attributes, unlike existing distribution-based verifiers such as FairSquare and VeriFair. Being distribution-based by design, Justicia is also more robust than verifiers, such as AIF360, that operate on specific test samples. We also theoretically bound the finite-sample error of the verified fairness measure.
1. Introduction
Machine learning (ML) is becoming the omnipresent technology of our time. ML algorithms are being used for high-stake decisions such as college admissions, crime recidivism prediction, insurance, and loan approval. Thus, human lives are now pervasively influenced by data, ML, and their inherent bias.
Example 1

Let us consider an example (Figure 1) of deciding eligibility for health insurance depending on the fitness and income of individuals in two age groups (20-40 and 40-60). Typically, incomes of individuals increase as their ages increase, while their fitness deteriorates. We assume the relation of income and fitness to age follows the Normal distributions in Figure 1. Now, suppose we train a decision tree (Narodytska et al., 2018) on these fitness and income indicators to decide the eligibility of an individual for health insurance.

Figure 1: A trained decision tree to learn eligibility for health insurance using age-dependent fitness and income indicators. The tree first branches on 'age < 40' and then on fitness and income thresholds to predict the eligibility Ŷ.

We observe that the 'optimal' decision tree (ref. Figure 1) selects persons above and below 40 years with unequal probabilities. This simple example demonstrates that even if an ML algorithm does not explicitly learn to differentiate on the basis of a sensitive attribute, it discriminates between age groups due to the utilitarian sense of accuracy that it tries to optimize.

Fair ML.
Statistical discriminations caused by ML algorithms have motivated researchers to develop several frameworks to ensure fairness and several algorithms to mitigate bias. Existing fairness metrics mostly belong to three categories: independence, separation, and sufficiency (Mehrabi et al., 2019). Independence metrics, such as demographic parity, statistical parity, and group parity, try to ensure that the outcomes of an algorithm are independent of the groups that the individuals belong to (Feldman et al., 2015; Dwork et al., 2012). Separation metrics, such as equalized odds, define an algorithm to be fair if the probability of getting the same outcome is the same for different groups (Hardt et al., 2016). Sufficiency metrics, such as counterfactual fairness, constrain the probability of outcomes to be independent of an individual's sensitive data given identical non-sensitive data (Kusner et al., 2017).

In Figure 1, independence is satisfied if the probability of getting insurance is the same for both age groups. Separation is satisfied if the numbers of 'actually' (ground-truth) ineligible and eligible people getting the insurance are the same across groups. Sufficiency is satisfied if the eligibility is independent of age given that the other attributes are the same. Thus, we see that the metrics of fairness can be contradictory or complementary depending on the application and the data (Corbett-Davies and Goel, 2018). Different algorithms have also been devised to ensure one or multiple of these fairness definitions. These algorithms try to rectify and mitigate the bias in the data, and thus in the prediction model, in three ways: pre-processing the data (Kamiran and Calders, 2012; Zemel et al., 2013; Calmon et al., 2017), in-processing the algorithm (Zhang et al., 2018), and post-processing the outcomes (Kamiran et al., 2012; Hardt et al., 2016).

Fairness Verifiers.
Due to the abundance of fairness metrics and the difference in algorithms to achieve them, it has become necessary to verify different fairness metrics over datasets and algorithms. In order to verify fairness as a model property on a dataset, verifiers like FairSquare (Albarghouthi et al., 2017) and VeriFair (Bastani et al., 2019) have been proposed. These verifiers are referred to as distributional verifiers owing to the fact that their inputs are a probability distribution of the attributes in the dataset and a model of a suitable form, and their objective is to verify fairness w.r.t. the distribution and the model. Though FairSquare and VeriFair are robust and have asymptotic convergence guarantees, we observe that they scale poorly with the size of inputs and do not generalize to non-Boolean and compound sensitive attributes. In contrast to the distributional verifiers, another line of work, referred to as sample-based verifiers, has focused on the design of testing methodologies on a given fixed data sample (Galhotra et al., 2017; Bellamy et al., 2018). Since sample-based verifiers are dataset-specific, they generally do not provide robustness over the distribution. Thus, a unified formal framework to verify different fairness metrics of an ML algorithm, which is scalable, capable of handling compound protected groups, robust with respect to the test data, and operational on real-life datasets and fairness-enhancing algorithms, is missing in the literature.

Our Contribution.
From this vantage point, we propose to model verifying different fairness metrics as a Stochastic Boolean Satisfiability (SSAT) problem (Littman et al., 2001). SSAT was originally introduced by Papadimitriou (1985) to model games against nature. In this work, we primarily focus on reductions to the exist-random quantified fragment of SSAT, which is also known as E-MAJSAT (Littman et al., 2001). SSAT is a conceptual framework that has been employed to capture several fundamental problems in AI, such as computation of maximum a posteriori (MAP) hypotheses (Fremont et al., 2017), propositional probabilistic planning (Majercik, 2007), circuit verification (Lee and Jiang, 2018), and so on. Furthermore, our choice of SSAT as a target formulation is motivated by the recent algorithmic progress that has yielded efficient SSAT tools (Lee et al., 2017, 2018).

Our contributions are summarised below:
• We propose a unified SSAT-based approach, Justicia, to verify independence and separation metrics of fairness for different datasets and classification algorithms.
• Unlike previously proposed formal distributional verifiers, namely FairSquare and VeriFair, Justicia verifies fairness for compound and non-Boolean sensitive attributes.
• Our experiments validate that our method is more accurate and scalable than the distributional verifiers, such as FairSquare and VeriFair, and more robust than the sample-based empirical verifiers, such as AIF360.
• We prove a finite-sample error bound on our estimated fairness metrics, which is stronger than the existing asymptotic guarantees.

It is worth remarking that significant advances in AI bear testimony to the right choice of formulation, for example, the formulation of planning as SAT (Kautz et al., 1992). In this context, we view that the formulation of fairness as SSAT has potential to spur future work from both the modeling and encoding perspective as well as core algorithmic improvements in the underlying SSAT solvers.
2. Background: Fairness and SSAT
In Section 2.1, we define different fairness metrics for a supervised learning problem. Following that, we discuss the Stochastic Boolean Satisfiability (SSAT) problem in Section 2.2.
2.1 Fairness Metrics

Let us represent a dataset D as a collection of triads (X, A, Y) sampled from an underlying data-generating distribution 𝒟. X ≜ {X_1, ..., X_m} ∈ R^m is the set of non-protected (or non-sensitive) attributes. A ≜ {A_1, ..., A_n} is the set of categorical protected attributes. Y is the binary label (or class) of (X, A). A compound protected attribute a = {a_1, ..., a_n} is a valuation of all A_i's and represents a compound protected group. For example, consider A = {race, sex}, where race ∈ {Asian, Colour, White} and sex ∈ {female, male}. Then a = {Colour, female} is a compound protected group. We define M ≜ Pr(Ŷ | X, A) to be a binary classifier trained from samples of the distribution 𝒟. Here, Ŷ is the predicted label (or class) of the corresponding data.

As we illustrated in Example 1, a classifier M that solely optimizes accuracy, i.e., the average number of times Ŷ = Y, may discriminate certain compound protected groups over others (Chouldechova and Roth, 2020). We now describe two families of fairness metrics that compute the bias induced by a classifier and are later verified by Justicia.

2.1.1 Independence Metrics of Fairness

The independence (or calibration) metrics of fairness state that the output of the classifier should be independent of the compound protected group. A notion of independence, referred to as group fairness, specifies an equal positive predictive value (PPV) across all compound protected groups for an algorithm M, i.e., Pr[Ŷ = 1 | A = a, M] = Pr[Ŷ = 1 | A = b, M], ∀a, b ∈ A. Since satisfying group fairness exactly is hard, relaxations of group fairness, such as disparate impact and statistical parity (Dwork et al., 2012; Feldman et al., 2015), have been proposed.

Disparate impact (DI) (Feldman et al., 2015) measures the ratio of PPVs between the most favored group and the least favored group, and prescribes it to be close to 1.
Formally, a classifier satisfies (1 − ε)-disparate impact if, for ε ∈ [0, 1],

min_{a ∈ A} Pr[Ŷ = 1 | a, M] ≥ (1 − ε) max_{b ∈ A} Pr[Ŷ = 1 | b, M].

Another popular relaxation of group fairness, statistical parity (SP), measures the difference of PPV among the compound groups, and prescribes it to be near zero. Formally, an algorithm satisfies ε-statistical parity if, for ε ∈ [0, 1] and for all a, b ∈ A,

|Pr[Ŷ = 1 | a, M] − Pr[Ŷ = 1 | b, M]| ≤ ε.

For both disparate impact and statistical parity, a lower value of ε indicates higher group fairness of the classifier M.

2.1.2 Separation Metrics of Fairness

In the separation (or classification parity) notion of fairness, the predicted label Ŷ of a classifier M is independent of the sensitive attributes A given the actual class label Y. For binary classifiers, a popular separation metric is equalized odds (EO) (Hardt et al., 2016), which computes the difference of false positive rates (FPR) and the difference of true positive rates (TPR) among all compound protected groups. A lower value of equalized odds indicates better fairness. A classifier M satisfies ε-equalized odds if, for all compound protected groups a, b ∈ A,

|Pr[Ŷ = 1 | A = a, Y = 0] − Pr[Ŷ = 1 | A = b, Y = 0]| ≤ ε, and
|Pr[Ŷ = 1 | A = a, Y = 1] − Pr[Ŷ = 1 | A = b, Y = 1]| ≤ ε.

In this paper, we formulate verifying the aforementioned independence and separation metrics of fairness as a stochastic Boolean satisfiability (SSAT) problem, which we define next.
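Once per-group probabilities are available (Justicia computes them later via SSAT), the three metrics above reduce to simple arithmetic over a table of group-wise values. A minimal sketch, where all numerical probabilities are hypothetical:

```python
def disparate_impact(ppv):
    """Ratio of the least favored group's PPV to the most favored group's PPV."""
    return min(ppv.values()) / max(ppv.values())

def statistical_parity(ppv):
    """Largest pairwise PPV difference across compound groups."""
    return max(ppv.values()) - min(ppv.values())

def equalized_odds(tpr, fpr):
    """Worst-case gap in TPR and FPR across compound groups."""
    return max(max(tpr.values()) - min(tpr.values()),
               max(fpr.values()) - min(fpr.values()))

# Hypothetical PPVs for two compound protected groups:
ppv = {("White", "male"): 0.62, ("Colour", "female"): 0.31}
print(disparate_impact(ppv))    # 0.5: the classifier satisfies (1 - 0.5)-disparate impact
print(statistical_parity(ppv))  # 0.31: the classifier satisfies 0.31-statistical parity
```

A lower ε means higher group fairness, so the ratio closer to 1 and the difference closer to 0 are the fair regimes.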
2.2 Stochastic Boolean Satisfiability (SSAT)

Let B = {B_1, ..., B_m} be a set of Boolean variables. A literal is a variable B_i or its complement ¬B_i. A propositional formula φ defined over B is in Conjunctive Normal Form (CNF) if φ is a conjunction of clauses and each clause is a disjunction of literals. Let σ be an assignment to the variables B_i ∈ B such that σ(B_i) ∈ {0, 1}, where 1 is logical TRUE and 0 is logical FALSE. The propositional satisfiability problem (SAT) (Biere et al., 2009) finds an assignment σ to all B_i ∈ B such that the formula φ evaluates to 1. In contrast to the SAT problem, the Stochastic Boolean Satisfiability (SSAT) problem (Littman et al., 2001) is concerned with the probability of satisfaction of the formula φ. An SSAT formula is of the form

Φ = Q_1 B_1, ..., Q_m B_m, φ,    (1)

where Q_i ∈ {∃, ∀, R^{p_i}} is either the existential (∃), universal (∀), or randomized (R^{p_i}) quantifier over the Boolean variable B_i, and φ is a quantifier-free CNF formula. In the SSAT formula Φ, the quantifier part Q_1 B_1, ..., Q_m B_m is known as the prefix of the formula φ. In case of randomized quantification R^{p_i}, p_i ∈ [0, 1] is the probability of B_i being assigned to 1.

Given an SSAT formula Φ, let B_1 be the outermost variable in the prefix. The satisfying probability of Φ can be computed by the following rules:
1. Pr[TRUE] = 1, Pr[FALSE] = 0,
2. Pr[Φ] = max{Pr[Φ|B_1], Pr[Φ|¬B_1]} if B_1 is existentially quantified,
3. Pr[Φ] = min{Pr[Φ|B_1], Pr[Φ|¬B_1]} if B_1 is universally quantified,
4. Pr[Φ] = p_1 Pr[Φ|B_1] + (1 − p_1) Pr[Φ|¬B_1] if B_1 is randomized quantified with probability p_1 of being TRUE,

where Φ|B_1 and Φ|¬B_1 denote the SSAT formulas derived by eliminating the outermost quantifier of B_1 by substituting the value of B_1 in the formula φ with 1 and 0, respectively. In this paper, we focus on two specific types of SSAT formulas: random-exist (RE) SSAT and exist-random (ER) SSAT. In the ER-SSAT (resp. RE-SSAT) formula, all existentially (resp. randomized) quantified variables are followed by randomized (resp. existentially) quantified variables in the prefix.

Lemma 1
Solving the ER-SSAT and RE-SSAT problems is NP^PP-hard (Littman et al., 2001).

The problem of SSAT and its variants have been pursued by theoreticians and practitioners alike for over three decades (Majercik and Boots, 2005; Fremont et al., 2017; Huang et al., 2006). We refer the reader to (Lee et al., 2017, 2018) for a detailed survey. It is worth remarking that the past decade has witnessed significant performance improvements thanks to close integration of techniques from SAT solving with advances in weighted model counting (Sang et al., 2004; Chakraborty et al., 2013, 2014).

3. Justicia: An SSAT Framework to Verify Fairness Metrics
In this section, we present the primary contribution of this paper, Justicia, an SSAT-based framework for verifying independence and separation metrics of fairness. Given a binary classifier M and a probability distribution over the dataset, (X, A, Y) ∼ 𝒟, our goal is to verify whether M achieves independence and separation metrics with respect to the distribution 𝒟. We focus on classifiers that can be translated to a CNF formula over Boolean variables B. The probability p_i of B_i ∈ B being assigned to 1 is induced by the data-generating distribution 𝒟. In order to verify fairness metrics over compound protected groups, we discuss an enumeration-based approach in Section 3.1 and an equivalent learning-based approach in Section 3.2. We conclude this section with a theoretical analysis giving a high-probability error bound on the fairness metric in Section 3.3.

3.1 Enumeration Approach

In order to verify independence and separation metrics, the core component of
Justicia is to compute the positive predictive value Pr[Ŷ = 1 | A = a] for a compound protected group a. For simplicity, we initially make some assumptions and discuss their practical relaxations in Section 3.4. We first assume that the classifier M is representable as a CNF formula, namely φ_Ŷ, such that Ŷ = 1 when φ_Ŷ is satisfied and Ŷ = 0 otherwise. Since a Boolean CNF classifier is defined over Boolean variables, we assume all attributes in X and A to be Boolean. Finally, we assume independence of non-protected attributes from protected attributes, and let p_i be the probability of the attribute X_i being assigned to 1 for any X_i ∈ X.

Now, we define an RE-SSAT formula Φ_a to compute the probability Pr[Ŷ = 1 | A = a]. In the prefix of Φ_a, all non-protected Boolean attributes in X are assigned randomized quantification, and they are followed by the protected Boolean attributes in A with existential quantification. The CNF formula φ in Φ_a is constructed such that φ encodes the event inside the target probability Pr[Ŷ = 1 | A = a]. In order to encode the conditional A = a, we take the conjunction of the Boolean variables in A that symbolically specifies the compound protected group a. For example, we represent two protected attributes, race ∈ {White, Colour} and sex ∈ {male, female}, by the Boolean variables R and S, respectively. Thus, the compound groups {White, male} and {Colour, female} are represented by R ∧ S and ¬R ∧ ¬S, respectively. The RE-SSAT formula for computing the probability Pr[Ŷ = 1 | A = a] is

Φ_a := R^{p_1} X_1, ..., R^{p_m} X_m (non-protected attributes), ∃A_1, ..., ∃A_n (protected attributes), φ_Ŷ ∧ (A = a).

In Φ_a, the existentially quantified variables A_1, ..., A_n are assigned values according to the constraint A = a. Therefore, by solving the SSAT formula Φ_a, the SSAT solver finds the probability Pr[Φ_a] for the protected group A = a given the random values of X_1, ..., X_m, which is the PPV of the protected group a for the distribution 𝒟 and algorithm M.

For simplicity, we have described computing the PPV of each compound protected group without considering the correlation between the protected and non-protected attributes. In reality, such correlation exists, and the non-protected attributes may have different conditional distributions for different protected groups. We incorporate these conditional distributions in Justicia_enum by evaluating the conditional probability p_i = Pr[X_i = TRUE | A = a] instead of the marginal probability Pr[X_i = TRUE] for any X_i ∈ X. We illustrate this method in Example 2.

Example 2 (RE-SSAT encoding)
Here, we illustrate the RE-SSAT formula for calculating the PPV of the protected group 'age ≥ 40' in the decision tree of Figure 1. We assign three Boolean variables F, I, J to the three internal nodes of the tree, such that the literals F, I, and J denote the fitness and income threshold predicates of Figure 1, respectively. We consider another Boolean variable A, where the literal A represents the protected group 'age ≥ 40'. Thus, the CNF formula for the decision tree is (¬F ∨ I) ∧ (F ∨ J). From the distribution in Figure 1, we obtain the marginal probabilities Pr[F], Pr[I], and Pr[J]. Given this information, we calculate the PPV for the protected group 'age ≥ 40' by solving the RE-SSAT formula:

Φ_A := R^{Pr[F]} F, R^{Pr[I]} I, R^{Pr[J]} J, ∃A, (¬F ∨ I) ∧ (F ∨ J) ∧ A.

From the solution to this SSAT formula, we obtain Pr[Φ_A]. Similarly, to calculate the PPV for the group 'age < 40', we replace the unit (single-literal) clause A with ¬A in the CNF of Φ_A and construct another SSAT formula Φ_¬A. If Pr[F], Pr[I], and Pr[J] are computed independently of A and ¬A, both age groups demonstrate equal PPV, as the protected attribute is not explicitly present in the classifier. However, there is an implicit bias in the data distribution for different protected groups, and the classifier unintentionally learns it. To capture this implicit bias, we calculate the conditional probabilities Pr[F | A], Pr[I | A], and Pr[J | A] from the distribution and use them in Φ_A; we similarly use Pr[F | ¬A], Pr[I | ¬A], and Pr[J | ¬A] in Φ_¬A. With these conditional probabilities, Pr[Φ_A] and Pr[Φ_¬A] differ, and Justicia_enum thus detects the discrimination of the classifier among the protected groups. An astute reader would observe that I and J are not independent. Following (Chavira and Darwiche, 2008), we can capture relationships between the variables using constraints and, if needed, auxiliary variables. In this case, it suffices to add the constraint J → I.
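For a small formula such as Φ_A, the satisfying probability can be computed by brute-force weighted enumeration, which makes the encoding concrete. The sketch below evaluates the decision-tree CNF (¬F ∨ I) ∧ (F ∨ J) of Example 2 under two sets of conditional probabilities; all numerical values are hypothetical stand-ins for the ones read off Figure 1:

```python
from itertools import product

def tree_accepts(F, I, J):
    # CNF of the decision tree in Figure 1: (not F or I) and (F or J)
    return (not F or I) and (F or J)

def ppv(probs):
    """Pr[Y-hat = 1] when each attribute is an independent Bernoulli(probs[var])."""
    total = 0.0
    for F, I, J in product([True, False], repeat=3):
        weight = 1.0
        for var, val in (("F", F), ("I", I), ("J", J)):
            weight *= probs[var] if val else 1.0 - probs[var]
        if tree_accepts(F, I, J):
            total += weight
    return total

# Hypothetical conditional distributions Pr[. | A] and Pr[. | not A]:
ppv_older = ppv({"F": 0.3, "I": 0.8, "J": 0.2})    # group 'age >= 40'
ppv_younger = ppv({"F": 0.7, "I": 0.5, "J": 0.4})  # group 'age < 40'
print(ppv_older, ppv_younger)  # unequal PPVs reveal the implicit bias
```

With group-independent marginals the two calls return the same value, matching the observation above that equal PPVs arise when the conditional distributions are ignored.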
1. An RE-SSAT formula becomes an R-SSAT formula when the assignment to the existential variables is fixed.

Algorithm 1 Justicia: SSAT-based Fairness Verifier
1: function Justicia_enum(X, A, Ŷ)
2:   φ_Ŷ := CNF(Ŷ = 1)
3:   for all a ∈ A do
4:     p_i ← CalculateProb(X_i | a), ∀X_i ∈ X
5:     φ := φ_Ŷ ∧ (A = a)
6:     Φ_a := R^{p_1} X_1, ..., R^{p_m} X_m, ∃A_1, ..., ∃A_n, φ
7:     Pr[Φ_a] ← SSAT(Φ_a)
8:   return max_a Pr[Φ_a], min_a Pr[Φ_a]
9: function Justicia_learn(X, A, Ŷ)
10:   φ_Ŷ := CNF(Ŷ = 1)
11:   p_i ← CalculateProb(X_i), ∀X_i ∈ X
12:   Φ_ER := ∃A_1, ..., ∃A_n, R^{p_1} X_1, ..., R^{p_m} X_m, φ_Ŷ
13:   Φ'_ER := ∃A_1, ..., ∃A_n, R^{p_1} X_1, ..., R^{p_m} X_m, ¬φ_Ŷ
14:   return SSAT(Φ_ER), 1 − SSAT(Φ'_ER)

Measuring Fairness Metrics.
As we compute the probability Pr[Ŷ = 1 | A = a] by solving the SSAT formula Φ_a, we use Pr[Φ_a] to measure different fairness metrics. For that, we compute Pr[Φ_a] for all compound groups a ∈ A, which requires solving an exponential (in n) number of SSAT instances. We elaborate this enumeration approach, Justicia_enum, in Algorithm 1 (Lines 1–8).

We calculate the ratio of the minimum and the maximum probabilities according to the definition of disparate impact in Section 2. We compute statistical parity by taking the difference between the maximum and the minimum of all Pr[Φ_a]. Moreover, to measure equalized odds, we solve two SSAT instances for each compound group with modified values of p_i. Specifically, to compute TPR, we use the conditional probability p_i = Pr[X_i | Y = 1] on samples with class label Y = 1 and take the difference between the maximum and the minimum probabilities over all compound groups. In addition, to compute FPR, we use the conditional probability p_i = Pr[X_i | Y = 0] on samples with Y = 0 and take the difference similarly. Thus, Justicia_enum allows us to compute different fairness metrics in a unified algorithmic framework.
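The SSAT(·) oracle used in Algorithm 1 can be pictured as a direct implementation of the four semantics rules from Section 2.2. The sketch below is an illustrative exponential-time evaluator (actual solvers such as those of Lee et al. are far more sophisticated); a prefix entry is a (quantifier, variable, probability) triple and φ is a CNF over named variables:

```python
def holds(cnf, assignment):
    """Evaluate a CNF: a list of clauses, each a list of (variable, polarity) literals."""
    return all(any(assignment[var] == pol for var, pol in clause) for clause in cnf)

def ssat(prefix, cnf, assignment=None):
    """Satisfying probability of an SSAT formula, following Rules 1-4 of Section 2.2."""
    assignment = assignment or {}
    if not prefix:                                  # Rule 1: formula fully assigned
        return 1.0 if holds(cnf, assignment) else 0.0
    (quant, var, p), rest = prefix[0], prefix[1:]
    hi = ssat(rest, cnf, {**assignment, var: True})
    lo = ssat(rest, cnf, {**assignment, var: False})
    if quant == "exists":                           # Rule 2: maximize over the variable
        return max(hi, lo)
    if quant == "forall":                           # Rule 3: minimize over the variable
        return min(hi, lo)
    return p * hi + (1.0 - p) * lo                  # Rule 4: weighted average

# RE-SSAT toy instance: R^0.4 x, exists y, (x or y); the existential picks y = 1
prefix = [("random", "x", 0.4), ("exists", "y", None)]
print(ssat(prefix, [[("x", True), ("y", True)]]))  # 1.0
```

The RE-SSAT instances of Justicia_enum put all randomized variables before the existential ones, so the recursion above averages over the data distribution first and then optimizes the protected bits.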
3.2 Learning Approach: Justicia_learn

In most practical problems, there can be exponentially many compound groups arising from the different combinations of valuations of the protected attributes. Therefore, the enumeration approach of Section 3.1 may suffer from scalability issues. Hence, we propose efficient SSAT encodings to learn the most favored group and the least favored group for given M and 𝒟, and to compute their PPVs to measure different fairness metrics.

Learning the Most Favored Group.
In an SSAT formula Φ, the order of quantification of the Boolean variables in the prefix carries a distinct interpretation of the satisfying probability of Φ. In an ER-SSAT formula, the satisfying probability of Φ is the maximum satisfying probability over the existentially quantified variables given the randomized quantified variables (by Rule 2, Sec. 2.2). In this paper, we leverage this property to compute the most favored group, i.e., the one with the highest PPV. We consider the following ER-SSAT formula:

Φ_ER := ∃A_1, ..., ∃A_n, R^{p_1} X_1, ..., R^{p_m} X_m, φ_Ŷ.    (2)

The CNF formula φ_Ŷ is the CNF translation of the classifier output Ŷ = 1 without any specification of the compound protected group. Therefore, as we solve Φ_ER, we find the assignment to the existentially quantified variables A_1 = a_1^max, ..., A_n = a_n^max for which the satisfying probability Pr[Φ_ER] is maximum. Thus, we compute the most favored group a_fav ≜ {a_1^max, ..., a_n^max} achieving the highest PPV.

Learning the Least Favored Group.
In order to learn the least favored group in terms of PPV, we compute the minimum satisfying probability of the classifier φ_Ŷ given the random values of the non-protected variables X_1, ..., X_m. In order to do so, we have to solve a 'universal-random' (UR) SSAT formula (Eq. (3)) with universal quantification over the protected variables and randomized quantification over the non-protected variables (by Rule 3, Sec. 2.2):

Φ_UR := ∀A_1, ..., ∀A_n, R^{p_1} X_1, ..., R^{p_m} X_m, φ_Ŷ.    (3)

A UR-SSAT formula returns the minimum satisfying probability of φ over the universally quantified variables, in contrast to an ER-SSAT formula, which returns the maximum satisfying probability over the existentially quantified variables. Due to practical issues in solving UR-SSAT formulas, in this paper we leverage the duality between the UR-SSAT formula (Eq. (3)) and the ER-SSAT formula (Eq. (4)),

Φ'_ER := ∃A_1, ..., ∃A_n, R^{p_1} X_1, ..., R^{p_m} X_m, ¬φ_Ŷ,    (4)

and solve the UR-SSAT formula on the CNF φ using the ER-SSAT formula on the complemented CNF ¬φ (Littman et al., 2001). Lemma 2 encodes this duality.

Lemma 2
Given Eq. (3) and (4), Pr[Φ_UR] = 1 − Pr[Φ'_ER].

As we solve Φ'_ER, we obtain the assignment to the protected attributes a_unfav ≜ {a_1^min, ..., a_n^min} that maximizes Φ'_ER. If p is the maximum satisfying probability of Φ'_ER then, according to Lemma 2, 1 − p is the minimum satisfying probability of Φ_UR, which is the PPV of the least favored group a_unfav. We present the algorithm for this learning approach, namely Justicia_learn, in Algorithm 1 (Lines 9–14).

In the ER-SSAT formula of Eq. (4), we need to negate the classifier φ_Ŷ into another CNF formula ¬φ_Ŷ. The naïve approach of negating a CNF into another CNF generates an exponential number of new clauses. Here, we can apply the Tseitin transformation, which increases the number of clauses linearly while introducing a linear number of new variables (Tseitin, 1983). As an alternative, when possible, we directly encode the classifier M for the negative class label Ŷ = 0 as a CNF formula and pass it to Φ'_ER. The last approach is generally more efficient than the others, as the resulting CNF is often smaller.

Example 3 (ER-SSAT encoding)
Here, we illustrate the ER-SSAT encodings for learning the most favored and the least favored groups in the presence of multiple protected attributes. As the example in Figure 1 is degenerate for this purpose, we introduce another protected attribute, 'sex ∈ {male, female}'. Consider a Boolean variable S for 'sex', where the literal S denotes 'sex = male'. With this new protected attribute, let the classifier be M ≜ (¬F ∨ I ∨ S) ∧ (F ∨ J), where F, I, J have the same distributions as discussed in Example 2. Hence, we obtain the ER-SSAT formula of M to learn the most favored group:

Φ_ER := ∃S, ∃A, R^{Pr[F]} F, R^{Pr[I]} I, R^{Pr[J]} J, (¬F ∨ I ∨ S) ∧ (F ∨ J).

As we solve Φ_ER, we learn that the assignment to the existential variables is σ(S) = 1, σ(A) = 0; i.e., 'male individuals with age < 40' constitute the most favored group, with PPV computed as Pr[Φ_ER]. Similarly, to learn the least favored group, we negate the CNF of the classifier M to obtain the following ER-SSAT formula:

Φ'_ER := ∃S, ∃A, R^{Pr[F]} F, R^{Pr[I]} I, R^{Pr[J]} J, ¬((¬F ∨ I ∨ S) ∧ (F ∨ J)).

Solving Φ'_ER, we learn the assignment σ(S) = 0, σ(A) = 0 with Pr[Φ'_ER] = 0.57. Thus, 'female individuals with age < 40' constitute the least favored group, with PPV 1 − 0.57 = 0.43. Thus, Justicia_learn allows us to learn the most and least favored groups and the corresponding discrimination.
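Lemma 2's duality can be sanity-checked by brute force on Example 3's classifier (¬F ∨ I ∨ S) ∧ (F ∨ J): minimizing Pr[φ] over the protected assignments (the UR view) must coincide with one minus the maximized probability of ¬φ (the ER view on the negated classifier). The marginals below are hypothetical:

```python
from itertools import product

def classifier(F, I, J, S):
    # Example 3's classifier M = (not F or I or S) and (F or J); A does not appear in M
    return (not F or I or S) and (F or J)

def sat_prob(predicate, probs, group):
    """Pr[predicate] over the randomized variables, with protected bits held fixed."""
    total = 0.0
    for F, I, J in product([True, False], repeat=3):
        weight = 1.0
        for var, val in (("F", F), ("I", I), ("J", J)):
            weight *= probs[var] if val else 1.0 - probs[var]
        if predicate(F, I, J, **group):
            total += weight
    return total

probs = {"F": 0.4, "I": 0.6, "J": 0.3}   # hypothetical marginals
groups = [{"S": s} for s in (True, False)]

# UR-SSAT view: minimum satisfying probability over protected assignments
ur = min(sat_prob(classifier, probs, g) for g in groups)
# ER-SSAT view on the negated classifier: maximum probability of not-phi
er_neg = max(sat_prob(lambda F, I, J, S: not classifier(F, I, J, S), probs, g)
             for g in groups)
print(ur, 1.0 - er_neg)  # equal, as Lemma 2 predicts
```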
We use the PPVs of the most and least favored groups to compute fairness metrics as described in Section 3.1. We prove the equivalence of Justicia_enum and Justicia_learn in Lemma 3.

Lemma 3
Let Φ_a be the RE-SSAT formula for computing the PPV of the compound protected group a ∈ A. If Φ_ER is the ER-SSAT formula for learning the most favored group and Φ_UR is the UR-SSAT formula for learning the least favored group, then max_a Pr[Φ_a] = Pr[Φ_ER] and min_a Pr[Φ_a] = Pr[Φ_UR].

3.3 Theoretical Analysis: Finite-Sample Error Bound

We access the data-generating distribution through a finite number of samples observed from it. This finite sample set introduces errors in the computed probabilities of the randomized quantifiers being 1. These finite-sample errors in the computed probabilities induce further errors in the computed positive predictive value (PPV) and fairness metrics. In this section, we provide a bound on this finite-sample error.

Let us consider that p̂_i is the probability of a Boolean variable B_i being assigned to 1, estimated from k samples, and p_i is the true probability according to 𝒟. The true satisfying probability p of Φ is the weighted sum over all satisfying assignments σ of the CNF φ: p = Σ_σ Π_{B_i ∈ σ} p_i. This probability is estimated as p̂ using the k samples from the data-generating distribution 𝒟 such that p̂ ≤ εp for ε ≥ 1.

Theorem 4
For an ER-SSAT problem, the sample complexity is given by

k = O( (n + ln(1/δ)) ln m / ln ε ),

where p̂/p ≤ ε with probability 1 − δ, for ε ≥ 1.

Table 1: Results on a synthetic benchmark. '—' denotes that the verifier cannot compute the metric.

Metric              Exact   Justicia   FairSquare   VeriFair   AIF360
Disparate impact    0.26    0.25       0.99         0.99       0.
Statistical parity  0.53    0.54       —            —          0.

Corollary 5 If k samples are drawn from the data-generating distribution in Justicia such that

k = O( (n + ln(1/δ)) ln m / ln ε ),

then the estimated disparate impact D̂I and statistical parity ŜP satisfy, with probability 1 − δ, D̂I ≤ ε·DI and ŜP ≤ ε·SP.
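The quantity bounded above can also be observed empirically: estimate each p_i from k samples, recompute the satisfying probability of a small CNF (here Example 2's (¬F ∨ I) ∧ (F ∨ J)), and compare against the value under the true marginals. A minimal sketch with hypothetical true marginals:

```python
import random
from itertools import product

def sat_prob(probs):
    """Pr[(not F or I) and (F or J)] under independent Bernoulli attributes."""
    total = 0.0
    for F, I, J in product([True, False], repeat=3):
        weight = 1.0
        for var, val in (("F", F), ("I", I), ("J", J)):
            weight *= probs[var] if val else 1.0 - probs[var]
        if (not F or I) and (F or J):
            total += weight
    return total

def empirical(p, k, rng):
    """Estimate a Bernoulli parameter from k samples."""
    return sum(rng.random() < p for _ in range(k)) / k

rng = random.Random(0)
p_true = {"F": 0.4, "I": 0.7, "J": 0.3}   # hypothetical true marginals
for k in (100, 1000, 10000):
    p_hat = {v: empirical(p, k, rng) for v, p in p_true.items()}
    print(k, abs(sat_prob(p_hat) - sat_prob(p_true)))
```

As k grows, the estimated satisfying probability concentrates around the true one, in line with the sample complexity stated above.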
3.4 Extending Justicia to Practical Settings

In this section, we relax the assumptions of Boolean classifiers and Boolean attributes and extend Justicia to verify fairness metrics in the more practical settings of decision trees, linear classifiers, and continuous attributes.
Extending to Decision Trees and Linear Classifiers.
In the SSAT approach of Section 3, we assume that the classifier M is represented as a CNF formula. In the literature on interpretable machine learning, several studies have addressed learning CNF classifiers in the supervised learning setting, including but not limited to the work of Angelino et al. (2017), Malioutov and Meel (2018), and Ghosh and Meel (2019). Additionally, we extend Justicia beyond CNF classifiers to decision trees and linear classifiers², which are widely used in fairness studies (Zemel et al., 2013; Raff et al., 2018; Zhang and Ntoutsi, 2019).

Extending to Continuous Attributes.
In practical problems, attributes are generally real-valued or categorical, but CNF-representable classifiers are usually trained on a Boolean abstraction of the input attributes. In order to perform this Boolean abstraction, each categorical attribute is one-hot encoded and each real-valued attribute is discretized into a set of Boolean attributes (Lakkaraju et al., 2019; Ghosh et al., 2020). Detailed design choices are deferred to Appendix B.
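A plausible sketch of this Boolean abstraction (the attribute names and cut points below are illustrative, not the paper's actual design from Appendix B): one Boolean per categorical value (one-hot) and one Boolean per threshold of a real-valued attribute.

```python
def booleanize(record, categories, thresholds):
    """Map a raw record to named Boolean attributes for a CNF classifier."""
    bits = {}
    for attr, values in categories.items():
        for v in values:                       # one-hot: one Boolean per category
            bits[f"{attr}={v}"] = (record[attr] == v)
    for attr, cuts in thresholds.items():
        for c in cuts:                         # one Boolean per discretization cut
            bits[f"{attr}>={c}"] = (record[attr] >= c)
    return bits

# Illustrative schema (hypothetical attribute names and cut points):
categories = {"race": ["Asian", "Colour", "White"], "sex": ["female", "male"]}
thresholds = {"income": [30000, 60000], "fitness": [0.5]}
record = {"race": "Colour", "sex": "female", "income": 45000, "fitness": 0.7}
print(booleanize(record, categories, thresholds))
```

Each resulting Boolean becomes a variable of the CNF classifier; its probability of being 1 is then estimated from the data, as required by the randomized quantifiers.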
4. Empirical Performance Analysis
In this section, we discuss empirical studies evaluating the performance of Justicia in verifying different fairness metrics. We first discuss the experimental setup and the objectives of the experiments, and then evaluate the experimental results.
2. Linear classifiers can be encoded to CNF using pseudo-Boolean encoding (Roussel and Manquinho, 2009).

Table 2: Scalability of different verifiers in terms of execution time (in seconds). DT and LR refer to decision tree and logistic regression, respectively. '—' refers to timeout.

Dataset:     Ricci     Titanic    COMPAS    Adult
Classifier:  DT | LR   DT | LR    DT | LR   DT | LR

Table 3: Numbers in bold refer to fairness improvement compared against the unprocessed (orig.) dataset. RW and OP refer to the reweighing and optimized-preprocessing algorithms, respectively. Results for the German dataset are deferred to the Appendix.

Dataset →            Adult                        COMPAS
Protected →          Race          Sex            Race          Sex
Algorithm →          orig. RW OP   orig. RW OP    orig. RW OP   orig. RW OP
Logistic regression: Disparate impact, Stat. parity, Equalized odds
Decision tree:       Disparate impact, Stat. parity, Equalized odds

We have implemented a prototype of
Justicia in Python (version 3 . . Justicia relies on solving SSAT formulas using an off-the-shelf SSAT solver. To this end,we employ the state of the art RE-SSAT solver of (Lee et al., 2017) and the ER-SSATsolver of (Lee et al., 2018). Both solvers output the exact satisfying probability of the SSATformula.For comparative evaluation of
Justicia, we have experimented with two state-of-the-art distributional verifiers, FairSquare and VeriFair, and a sample-based fairness-measuring tool, AIF360. In the experiments, we have studied three types of classifiers: CNF learners, decision trees, and logistic regression. Decision tree and logistic regression are implemented using the scikit-learn module of Python (Pedregosa et al., 2011), and we use the MaxSAT-based CNF learner IMLI of (Ghosh and Meel, 2019). We have used the PySAT library (Ignatiev et al., 2018) for encoding the decision function of the logistic regression classifier into a CNF formula. We have also verified two fairness-enhancing algorithms: the reweighing algorithm (Kamiran and Calders, 2012) and the optimized pre-processing algorithm (Calmon et al., 2017). We have experimented on multiple datasets containing multiple protected attributes: the UCI Adult and German-credit datasets (Dua and Graff, 2017), ProPublica's COMPAS recidivism dataset (Angwin et al., 2016), the Ricci dataset (McGinley, 2010), and the Titanic dataset.

Figure 2: Fairness metrics (disparate impact and statistical parity) measured by Justicia for different protected groups in the Adult dataset: race (5), race and sex (10), race and age (20), and race, sex, and age (40). The number within parentheses in the x-ticks denotes the total compound groups.

Our empirical studies have the following objectives:
1. How accurate and scalable is Justicia with respect to the existing fairness verifiers FairSquare and VeriFair?
2. Can Justicia verify the effectiveness of different fairness-enhancing algorithms on different datasets?
3. Can Justicia verify fairness in the presence of compound sensitive groups?
4. How robust is Justicia in comparison to sample-based tools like AIF360 for varying sample sizes?

Our experimental studies validate that Justicia is more accurate and scalable than the state-of-the-art verifiers FairSquare and VeriFair. Justicia is able to verify the effectiveness of different fairness-enhancing algorithms for multiple fairness metrics and datasets. Justicia achieves scalable performance in the presence of compound sensitive groups, which the existing verifiers cannot handle. Finally, Justicia is more robust than sample-based tools such as AIF360.

Accuracy: Less than 1% Error. In order to assess the accuracy of different verifiers, we have considered the decision tree in Figure 1, for which the fairness metrics are analytically computable. In Table 1, we show the fairness metrics computed by Justicia, FairSquare, VeriFair, and AIF360. We observe that Justicia and AIF360 yield more accurate estimates of DI and SP compared against the ground truth, with less than 1% error. FairSquare and VeriFair estimate the disparate impact to be 0.99 and are thus unable to verify the fairness violation. Hence, Justicia is significantly more accurate than the existing formal verifiers FairSquare and VeriFair.

Figure 3: Standard deviation in the estimation of disparate impact (DI) and statistical parity (SP) for different sample sizes. Justicia is more robust to variation of the sample size than AIF360.
Scalability: Orders-of-Magnitude Speed-up. We have tested the scalability of Justicia, FairSquare, and VeriFair on practical benchmarks with a timeout of 900 seconds and reported the execution times of these verifiers on decision tree and logistic regression in Table 2. We observe that Justicia shows impressive scalability compared to the competing verifiers. In particular, Justicia is 1 to 2 orders of magnitude faster than FairSquare and 1 to 3 orders of magnitude faster than VeriFair. Additionally, FairSquare times out in most benchmarks. Thus, Justicia is not only more accurate but also more scalable than the existing verifiers.
Verification: Detecting Compounded Discrimination in Protected Groups. We have tested Justicia on datasets consisting of multiple protected attributes and reported the results in Figure 2. Justicia operates on datasets with as many as 40 compound protected groups, and can potentially scale beyond that, while the state-of-the-art fairness verifiers (e.g., FairSquare and VeriFair) consider only a single protected attribute. Thus, Justicia removes an important limitation in practical fairness verification. Additionally, we observe in most datasets that the disparate impact decreases, and thus discrimination increases, as more compound protected groups are considered. For instance, when we increase the total groups from 5 to 40 in the Adult dataset, the disparate impact decreases to around 0.3, thereby detecting higher discrimination. Thus, Justicia detects that marginalized individuals of a specific type (e.g., 'race') are even more discriminated against and marginalized when they also belong to a marginalized group of another type (e.g., 'sex').
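Compound protected groups are simply the Cartesian product of the value sets of the individual protected attributes, so the group count grows multiplicatively. A small sketch (the value labels are placeholders; the counts 5 × 2 × 4 = 40 match the largest compound setting reported for the Adult dataset):

```python
from itertools import product

def compound_groups(protected):
    """Enumerate compound protected groups as the Cartesian product
    of the value sets of the protected attributes."""
    names = sorted(protected)
    return [dict(zip(names, combo))
            for combo in product(*(protected[n] for n in names))]

# Illustrative value sets: 5 race categories, 2 sexes, 4 age bins.
protected = {
    "race": ["r1", "r2", "r3", "r4", "r5"],   # placeholder labels
    "sex": ["female", "male"],
    "age_bin": ["a1", "a2", "a3", "a4"],
}
groups = compound_groups(protected)
assert len(groups) == 40   # 5 * 2 * 4 compound groups
```

Verifying fairness per compound group is what makes naïve enumeration expensive, motivating the learning-based encoding evaluated in Appendix C.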
Verification: Fairness of Algorithms on Datasets. We have experimented with two fairness-enhancing algorithms: the reweighing (RW) algorithm and the optimized-preprocessing (OP) algorithm. Both pre-process the dataset to remove statistical bias. We study the effectiveness of these algorithms using Justicia on three datasets, each with two different protected attributes. In Table 3, we report different fairness metrics for logistic regression and decision tree. We observe that Justicia verifies fairness improvement as the bias-mitigating algorithms are applied. For example, for the Adult dataset with 'race' as the protected attribute, the disparate impact increases from 0.23 to 0.85 upon applying the reweighing algorithm to the logistic regression classifier. In addition, statistical parity decreases from 0.09 to 0.01, and equalized odds decreases from 0.13 to 0.03, thereby showing the effectiveness of the reweighing algorithm on all three fairness metrics. Justicia also finds instances where the fairness algorithms fail, especially for the decision tree classifier. Thus, Justicia enables verification of the different fairness-enhancing algorithms in the literature.
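For reference, the reweighing algorithm of Kamiran and Calders (2012) assigns each (group, label) pair the weight P(group) · P(label) / P(group, label), so that group membership and label become statistically independent under the reweighed distribution. A minimal sketch on toy data (not from the benchmark datasets):

```python
from collections import Counter

def reweigh(groups, labels):
    """Kamiran-Calders reweighing: weight w(a, y) = P(a) * P(y) / P(a, y)."""
    n = len(labels)
    p_a = Counter(groups)
    p_y = Counter(labels)
    p_ay = Counter(zip(groups, labels))
    return {
        (a, y): (p_a[a] / n) * (p_y[y] / n) / (p_ay[(a, y)] / n)
        for (a, y) in p_ay
    }

# Toy data: group "b" receives fewer positive labels than group "a".
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0]
w = reweigh(groups, labels)
# Positive examples of the under-favored group are up-weighted (> 1),
# positive examples of the over-favored group are down-weighted (< 1).
assert w[("b", 1)] > 1 > w[("a", 1)]
```

Training a classifier with these instance weights is what produces the RW columns that Justicia then verifies.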
Robustness: Stability to Sample Size. We have compared the robustness of Justicia with that of AIF360 by varying the sample size and reporting the standard deviation of different fairness metrics. In Figure 3, AIF360 shows a higher standard deviation for lower sample sizes, and the deviation decreases as the sample size increases. In contrast, Justicia shows a significantly lower (up to ∼100×) standard deviation across sample sizes. The reason is that AIF360 measures fairness empirically on a fixed test dataset, whereas Justicia provides estimates over the data-generating distribution. Thus, Justicia is more robust than the sample-based verifier AIF360.
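The instability of sample-based estimates can be reproduced with a small simulation on a synthetic distribution (the group rates below are illustrative, not the paper's data): the standard deviation of the empirical statistical parity shrinks roughly as 1/√n, which is why measurements on a fixed small test set are noisy.

```python
import random
import statistics

random.seed(0)

def empirical_sp(n, p_priv=0.6, p_unpriv=0.4):
    """Statistical parity |rate_priv - rate_unpriv| estimated from
    n samples per group, drawn from a known synthetic distribution."""
    r1 = sum(random.random() < p_priv for _ in range(n)) / n
    r2 = sum(random.random() < p_unpriv for _ in range(n)) / n
    return abs(r1 - r2)

def sp_std(n, trials=200):
    """Standard deviation of the sample-based SP estimate over repeated draws."""
    return statistics.pstdev(empirical_sp(n) for _ in range(trials))

# Larger test samples -> lower variance of the sample-based estimate.
assert sp_std(1000) < sp_std(50)
```

A distribution-based verifier sidesteps this variance by computing over the estimated distribution rather than over one drawn test set.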
5. Discussion and Future Work
Though formal verification of different fairness metrics of an ML algorithm on different datasets is an important question, existing verifiers are not scalable, accurate, or extendable to non-Boolean attributes. We propose a stochastic SAT-based approach, Justicia, that formally verifies independence and separation metrics of fairness for different classifiers and distributions over compound protected groups. Experimental evaluations demonstrate that Justicia achieves higher accuracy and scalability than the state-of-the-art verifiers FairSquare and VeriFair, while yielding higher robustness than sample-based tools such as AIF360.

Our work opens up several new directions of research. One direction is to develop SSAT models and verifiers for popular classifiers such as deep networks and SVMs. Another direction is to develop SSAT solvers that can accommodate continuous variables and conditional probabilities by design.
References

Aws Albarghouthi, Loris D'Antoni, Samuel Drews, and Aditya V. Nori. FairSquare: Probabilistic verification of program fairness. Proceedings of the ACM on Programming Languages, 1(OOPSLA):1–30, 2017.

Elaine Angelino, Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer, and Cynthia Rudin. Learning certifiably optimal rule lists for categorical data. The Journal of Machine Learning Research, 18(1):8753–8830, 2017.

Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias: Risk assessments in criminal sentencing. ProPublica, May 23, 2016.

Osbert Bastani, Xin Zhang, and Armando Solar-Lezama. Probabilistic verification of fairness properties via concentration. Proceedings of the ACM on Programming Languages, 3(OOPSLA):1–27, 2019.

Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias, October 2018. URL https://arxiv.org/abs/1810.01943.

Armin Biere, Marijn Heule, and Hans van Maaren. Handbook of Satisfiability, volume 185. IOS Press, 2009.

Flavio Calmon, Dennis Wei, Bhanukiran Vinzamuri, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney. Optimized pre-processing for discrimination prevention. In Advances in Neural Information Processing Systems, pages 3992–4001, 2017.

Supratik Chakraborty, Kuldeep S. Meel, and Moshe Y. Vardi. A scalable approximate model counter. In International Conference on Principles and Practice of Constraint Programming, pages 200–216. Springer, 2013.

Supratik Chakraborty, Daniel J. Fremont, Kuldeep S. Meel, Sanjit A. Seshia, and Moshe Y. Vardi. Distribution-aware sampling and weighted model counting for SAT. arXiv preprint arXiv:1404.2984, 2014.

Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7):772–799, 2008.

Alexandra Chouldechova and Aaron Roth. A snapshot of the frontiers of fairness in machine learning. Communications of the ACM, 63(5):82–89, 2020.

Sam Corbett-Davies and Sharad Goel. The measure and mismeasure of fairness: A critical review of fair machine learning. arXiv preprint arXiv:1808.00023, 2018.

Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml.

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214–226, 2012.

Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268, 2015.

Daniel J. Fremont, Markus N. Rabe, and Sanjit A. Seshia. Maximum model counting. In AAAI, pages 3885–3892, 2017.

Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, and Derek Roth. A comparative study of fairness-enhancing interventions in machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 329–338, 2019.

Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. Fairness testing: Testing software for discrimination. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pages 498–510, 2017.

Bishwamittra Ghosh and Kuldeep S. Meel. IMLI: An incremental framework for MaxSAT-based learning of interpretable classification rules. In Proc. of AIES, 2019.

Bishwamittra Ghosh, Dmitry Malioutov, and Kuldeep S. Meel. Classification rules in relaxed logical form. In Proceedings of ECAI, June 2020.

Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, pages 3315–3323, 2016.

Jinbo Huang et al. Combining knowledge compilation and search for conformant probabilistic planning. In ICAPS, pages 253–262, 2006.

Alexey Ignatiev, Antonio Morgado, and Joao Marques-Silva. PySAT: A Python toolkit for prototyping with SAT oracles. In SAT, pages 428–437, 2018. doi: 10.1007/978-3-319-94144-8_26. URL https://doi.org/10.1007/978-3-319-94144-8_26.

Faisal Kamiran and Toon Calders. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33, 2012.

Faisal Kamiran, Asim Karim, and Xiangliang Zhang. Decision theory for discrimination-aware classification. pages 924–929. IEEE, 2012.

Henry A. Kautz, Bart Selman, et al. Planning as satisfiability. In ECAI, volume 92, pages 359–363. Citeseer, 1992.

Matt J. Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In Advances in Neural Information Processing Systems, pages 4066–4076, 2017.

Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. Faithful and customizable explanations of black box models. In Proc. of AIES, 2019.

Nian-Ze Lee and Jie-Hong R. Jiang. Towards formal evaluation and verification of probabilistic design. IEEE Transactions on Computers, 67(8):1202–1216, 2018.

Nian-Ze Lee, Yen-Shi Wang, and Jie-Hong R. Jiang. Solving stochastic Boolean satisfiability under random-exist quantification. In IJCAI, pages 688–694, 2017.

Nian-Ze Lee, Yen-Shi Wang, and Jie-Hong R. Jiang. Solving exist-random quantified stochastic Boolean satisfiability via clause selection. In IJCAI, pages 1339–1345, 2018.

Michael L. Littman, Stephen M. Majercik, and Toniann Pitassi. Stochastic Boolean satisfiability. Journal of Automated Reasoning, 27(3):251–296, 2001.

Stephen M. Majercik. APPSSAT: Approximate probabilistic planning using stochastic satisfiability. International Journal of Approximate Reasoning, 45(2):402–419, 2007.

Stephen M. Majercik and Byron Boots. DC-SSAT: A divide-and-conquer approach to solving stochastic satisfiability problems efficiently. In AAAI, pages 416–422, 2005.

Dmitry Malioutov and Kuldeep S. Meel. MLIC: A MaxSAT-based framework for learning interpretable classification rules. In International Conference on Principles and Practice of Constraint Programming, pages 312–327. Springer, 2018.

Ann C. McGinley. Ricci v. DeStefano: A masculinities theory analysis. Harv. JL & Gender, 33:581, 2010.

Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635, 2019.

Nina Narodytska, Alexey Ignatiev, Filipe Pereira, Joao Marques-Silva, and IS RAS. Learning optimal decision trees with SAT. In IJCAI, pages 1362–1368, 2018.

Christos H. Papadimitriou. Games against nature. Journal of Computer and System Sciences, 31(2):288–301, 1985.

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct):2825–2830, 2011.

Tobias Philipp and Peter Steinke. PBLib – a library for encoding pseudo-Boolean constraints into CNF. In International Conference on Theory and Applications of Satisfiability Testing, pages 9–16. Springer, 2015.

Edward Raff, Jared Sylvester, and Steven Mills. Fair forests: Regularized tree induction to minimize model bias. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 243–250, 2018.

Olivier Roussel and Vasco M. Manquinho. Pseudo-Boolean and cardinality constraints. Handbook of Satisfiability, 185:695–733, 2009.

Tian Sang, Fahiem Bacchus, Paul Beame, Henry A. Kautz, and Toniann Pitassi. Combining component caching and clause learning for effective model counting. In SAT, 2004.

Grigori S. Tseitin. On the complexity of derivation in propositional calculus. In Automation of Reasoning, pages 466–483. Springer, 1983.

Depeng Xu, Shuhan Yuan, and Xintao Wu. Achieving differential privacy and fairness in logistic regression. In Companion Proceedings of The 2019 World Wide Web Conference, pages 594–599, 2019.

Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey, and Pierre Le Bodic. Computing optimal decision sets with SAT. arXiv preprint arXiv:2007.15140, 2020.

Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rogriguez, and Krishna P. Gummadi. Fairness constraints: Mechanisms for fair classification. In Artificial Intelligence and Statistics, pages 962–970, 2017.

Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. Learning fair representations. In International Conference on Machine Learning, pages 325–333, 2013.

Brian Hu Zhang, Blake Lemoine, and Margaret Mitchell. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 335–340, 2018.

Wenbin Zhang and Eirini Ntoutsi. FAHT: An adaptive fairness-aware decision tree classifier. arXiv preprint arXiv:1907.07237, 2019.

Appendix A. Proofs of Theoretical Results
Lemma 1 Solving the ER-SSAT and RE-SSAT problems is NP^PP-hard (Littman et al., 2001).

Proof [Proof of Lemma 1] The decision version of the ER-SSAT problem is

Φ := ∃a_1, …, ∃a_n, R^{p_1} x_1, …, R^{p_m} x_m. Pr[φ_ŷ] ≥ t,

where t is a threshold in [0, 1]; this problem is NP^PP-hard (Littman et al., 2001). If there is no random variable and t = 1, ER-SSAT reduces to a SAT problem, which is NP-hard. If there is no existential variable, ER-SSAT reduces to a MAJSAT problem, which is PP-hard. Similar arguments also hold for the RE-SSAT problem.

Lemma 2 Given Eq. (3) and (4), Pr[Φ_UR] = 1 − Pr[Φ′_ER].

Proof [Proof of Lemma 2] Both Φ_UR and Φ′_ER have the random quantified variables in the identical order in the prefix. According to the definition of SSAT formulas,

Pr[Φ_UR] = min_{a_1,…,a_n} Pr[φ_Ŷ] and Pr[Φ′_ER] = max_{a_1,…,a_n} Pr[¬φ_Ŷ].

We can show the following duality between ER-SSAT and UR-SSAT:

Pr[Φ′_ER] = max_{a_1,…,a_n} Pr[¬φ_Ŷ] = max_{a_1,…,a_n} (1 − Pr[φ_Ŷ]) = 1 − min_{a_1,…,a_n} Pr[φ_Ŷ] = 1 − Pr[Φ_UR].

Lemma 3
Let Φ_a be the RE-SSAT formula for computing the PPV of the compound protected group a ∈ A. If Φ_ER is the ER-SSAT formula for learning the most favored group and Φ_UR is the UR-SSAT formula for learning the least favored group, then max_a Pr[Φ_a] = Pr[Φ_ER] and min_a Pr[Φ_a] = Pr[Φ_UR].

Proof [Proof of Lemma 3] It is trivial that the PPV of the most favored group a_fav is the maximum PPV over all compound groups a ∈ A. Similarly, the PPV of the least favored group a_unfav is the minimum PPV over all compound groups a ∈ A. By construction of the SSAT formulas, the PPVs of a_fav and a_unfav are Pr[Φ_ER] and Pr[Φ_UR] respectively. Since Pr[Φ_a] is the PPV of the compound group a,

max_a Pr[Φ_a] = Pr[Φ_ER] and min_a Pr[Φ_a] = Pr[Φ_UR].

Theorem 4 For an ER-SSAT problem, the sample complexity is given by

k = O( ((n + ln(1/δ)) ln m) / ln ε ),

where p̂/p ≤ ε with probability 1 − δ, for ε ≥ 1.

Corollary 5 If k samples are considered from the data-generating distribution in Justicia such that

k = O( ((n + ln(1/δ)) ln m) / ln ε ),

the estimated disparate impact D̂I and statistical parity ŜP satisfy, with probability 1 − δ, D̂I ≤ ε · DI and ŜP ≤ ε · SP.

Proof [Proof of Corollary 5] By Theorem 4, we get that for k samples obtained from the data-generating distribution, where k ≥ ((n + ln(1/δ)) ln m) / ln ε, the estimated probabilities of satisfaction for the most and least favored groups, p̂_max and p̂_min, satisfy

p̂_max ≤ ε · max_a Pr[Φ_a] and p̂_min ≤ ε · min_a Pr[Φ_a]

with probability 1 − δ. Thus, the estimated disparate impact satisfies

D̂I ≜ p̂_min / p̂_max ≤ ε · (p_min / p_max) = ε · DI,

and the statistical parity satisfies

ŜP ≜ |p̂_max − p̂_min| ≤ ε · |p_max − p_min| = ε · SP,

with probability 1 − δ.

Appendix B. Practical Extensions and Design Choices
In this section, we relax the assumptions of Boolean classifiers and Boolean attributes and extend Justicia to verify fairness metrics in a more practical setting. We first discuss the input classifiers of Justicia.
B.1 Beyond CNF Classifiers.
In the presented SSAT approach for verifying fairness, we assume the classifier Ŷ to be represented as a CNF formula. In the literature of interpretable machine learning, several studies have been conducted on learning CNF classifiers in the supervised learning setting, including but not limited to (Angelino et al., 2017; Malioutov and Meel, 2018; Ghosh and Meel, 2019; Yu et al., 2020). However, Justicia can be extended beyond CNF classifiers, in particular to decision trees and linear classifiers, which are widely adopted in ML fairness studies (Zemel et al., 2013; Zafar et al., 2017; Xu et al., 2019; Zhang and Ntoutsi, 2019; Raff et al., 2018; Friedler et al., 2019).

Encoding Decision Trees as CNF.
Existing rule-based classifiers, for example binary decision trees, can be trivially encoded as CNF formulas. In a binary decision tree, each internal node corresponds to a literal. A path from the root to a leaf is a conjunction of literals (hence, a path yields a clause), and the tree itself is a disjunction of all paths. In order to derive a CNF representation φ of the decision tree, we first construct a DNF by considering all paths terminating at leaves with the negative class label (ŷ = 0) and then complement it into a CNF using De Morgan's rule. Therefore, any input that is classified positive by the decision tree satisfies φ, and vice versa. In Justicia_learn, for learning the least favored group, we can construct the negated CNF classifier in Eq. (4) by instead including only the paths terminating at positive-labeled leaves.
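This encoding can be sketched compactly: collect the negative-labeled root-to-leaf paths as a DNF and negate each path literal-wise via De Morgan's rule. The tree below is hypothetical, with node literals written as signed integers in DIMACS style.

```python
def tree_to_cnf(negative_paths):
    """Encode a binary decision tree as CNF.

    negative_paths: list of root-to-leaf paths ending in label 0,
    each a list of signed ints (node literals, DIMACS-style).
    The negative region of the tree is the DNF of these paths; its
    complement, by De Morgan's rule, is the CNF returned here: an
    input is classified positive iff it falsifies at least one
    literal on every negative path.
    """
    return [[-lit for lit in path] for path in negative_paths]

def satisfies(cnf, assignment):
    """assignment: {var: bool}; each clause is a disjunction of literals."""
    return all(
        any(assignment[abs(l)] == (l > 0) for l in clause)
        for clause in cnf
    )

# Hypothetical tree over nodes 1 and 2: paths [1, 2] and [-1] end in label 0.
cnf = tree_to_cnf([[1, 2], [-1]])            # -> [[-1, -2], [1]]
assert satisfies(cnf, {1: True, 2: False})   # a positive leaf
assert not satisfies(cnf, {1: True, 2: True})  # the negative path [1, 2]
```

The negated classifier for the least-favored-group formula is obtained the same way, starting from the positive-labeled paths instead.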
Encoding Linear Classifiers as CNF.
Linear classifiers on Boolean attributes can be encoded into CNF formulas using pseudo-Boolean encoding (Philipp and Steinke, 2015). We consider a linear classifier W · X + b ≥ 0 over Boolean attributes X, with weights W ∈ R^|X| and bias b ∈ R. We first normalize W and b to [−1, 1] and then round to integers so that the decision boundary becomes a pseudo-Boolean constraint, e.g., an at-least-k constraint. We then apply a pseudo-Boolean-constraints-to-CNF translation to encode the decision boundary into CNF. This encoding usually introduces additional Boolean variables and results in a large CNF. In order to generate a smaller CNF, we can apply thresholding on the weights W to consider only attributes with higher weights. For instance, if |w_i| ≤ λ for a threshold λ and w_i ∈ W, we set w_i = 0. Thus, lower-weighted (hence less important) attributes do not appear in the encoded CNF. Finally, to construct the negated classifier in the SSAT formula in Eq. (4), we encode W · X + b < 0 using an at-most-k encoding.

In practical problems, attributes are generally real-valued or categorical. We next discuss how Justicia can work beyond Boolean attributes.
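The normalization, thresholding, and rounding steps for linear classifiers can be sketched as follows; the scale factor and threshold below are illustrative choices, and the resulting integer constraint would then be handed to a pseudo-Boolean-to-CNF encoder such as PBLib or PySAT's pb module.

```python
def to_pseudo_boolean(weights, bias, scale=100, threshold=0.05):
    """Turn a linear decision rule  W . X + b >= 0  over Boolean X
    into an integer pseudo-Boolean constraint  sum_i w_i x_i >= k.

    Weights and bias are normalized to [-1, 1], small weights
    (|w| <= threshold) are zeroed to keep the CNF small, and the
    rest are scaled and rounded to integers.
    """
    m = max(abs(bias), max(abs(w) for w in weights))
    norm = [w / m for w in weights]
    kept = [0 if abs(w) <= threshold else round(w * scale) for w in norm]
    # W . X + b >= 0  <=>  sum_i w_i x_i >= -b  (after the same scaling)
    k = -round(bias / m * scale)
    return kept, k

def predict(int_weights, k, x):
    """Evaluate the integer at-least-k constraint on a Boolean input x."""
    return sum(w * xi for w, xi in zip(int_weights, x)) >= k

w, k = to_pseudo_boolean([0.8, -0.5, 0.01], bias=-0.4)
assert predict(w, k, [1, 0, 0])       # the strong positive weight fires
assert not predict(w, k, [0, 1, 0])   # only the negative-weight attribute
```

The thresholding step is what trades a small loss in fidelity for a substantially smaller CNF.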
B.2 Beyond Boolean Attributes.
Classifiers that are already represented in CNF are usually trained on a Boolean abstraction of the input attributes, where each categorical attribute is one-hot encoded and each real-valued attribute is discretized into a set of Boolean attributes (Lakkaraju et al., 2019; Ghosh et al., 2020). Thus, Justicia can verify CNF classifiers readily.
Decision Trees.
In the case of binary decision tree classifiers, the input attributes are numerical or categorical, but each attribute is compared against a constant at each internal node of the tree. Hence, we fix a Boolean variable for each internal node, where the Boolean assignment to the variable decides which of the two branches to take from the current node.
Linear Classifiers.
Linear classifiers are generally trained on numerical attributes, to which we apply the following discretization. Consider a numerical attribute x with weight w. We want to discretize x into a set B of Boolean attributes and recalculate the weights of the variables in B from w. For discretization, we consider a simple interval-based approach: for each interval (or bin) in the continuous space of x, we introduce a Boolean variable b_i ∈ B such that b_i is assigned ⊤ (or 1) when the value of x lies within the interval and ⊥ (or 0) otherwise. Let µ_i be the mean of the interval for which b_i can be ⊤. We then fix the revised weight of b_i to µ_i · w. One can show trivially that, with infinitely many intervals, x ≈ Σ_i µ_i b_i.

Figure 4: Runtime comparison of different encodings (learn, cond, enum) for varying total protected groups in the Adult dataset, for the decision tree and the CNF learner IMLI.

Appendix C. Additional Experimental Details
C.1 Experimental Setup
Since both Justicia and FairSquare take a probability distribution over the attributes as input, we perform five-fold cross-validation: we use the training set for learning the classifier, compute the distribution on the test set, and finally verify fairness metrics such as disparate impact and statistical parity difference on that distribution.
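The fairness metrics computed per fold follow directly from the group-conditional positive prediction rates of the most and least favored groups. A minimal sketch of that final step (the rates below are hypothetical, and the verifier call itself is abstracted away; this is not Justicia's actual API):

```python
def fairness_metrics(rates):
    """Disparate impact and statistical parity from group-conditional
    positive prediction rates, using the most/least favored groups:
    DI is the ratio of the least to the most favored group's rate,
    SP is the absolute difference of the two rates."""
    p_max, p_min = max(rates.values()), min(rates.values())
    return {"DI": p_min / p_max, "SP": p_max - p_min}

# Hypothetical positive-prediction rates per protected group on one fold.
rates = {"group_a": 0.60, "group_b": 0.15}
m = fairness_metrics(rates)
assert round(m["DI"], 2) == 0.25 and round(m["SP"], 2) == 0.45
```

In the actual pipeline these two rates correspond to the satisfying probabilities of the ER-SSAT and UR-SSAT formulas rather than empirical counts.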
C.2 Comparative Evaluation of Two Encodings
While both Justicia_enum and Justicia_learn have the same output, the Justicia_learn encoding improves exponentially in runtime over Justicia_enum on both decision tree and Boolean CNF classifiers as we vary the total compound groups in Figure 4. This analysis shows that the naïve enumeration-based approach cannot verify large-scale fairness problems containing multiple protected attributes.

Table 4: Verification of different fairness-enhancing algorithms for multiple datasets and classifiers using Justicia. Numbers in bold refer to fairness improvement compared against the unprocessed (orig.) dataset. RW and OP refer to the reweighing and optimized-preprocessing algorithms respectively. Disparate impact and statistical parity are reported for logistic regression on the German dataset with 'age' and 'sex' as protected attributes.