Exclusion of Extreme Jurors and Minority Representation: The Effect of Jury Selection Procedures

Abstract

We compare two established jury selection procedures meant to safeguard against the inclusion of biased jurors that are also perceived as causing minorities to be under-represented in juries. The Strike and Replace procedure presents potential jurors one-by-one to the parties, while the Struck procedure presents all potential jurors before the parties exercise vetoes. In equilibrium, Struck more effectively excludes extreme jurors than Strike and Replace but leads to a worse representation of minorities. Simulations suggest that the advantage of Struck in terms of excluding extremes is sizable in a wide range of cases. In contrast, Strike and Replace only provides a significantly better representation of minorities if the minority and majority are heavily polarized. When parameters are estimated to match the parties' selection of jurors by race with jury-selection data from Mississippi in trials against black defendants, the procedures' outcomes are substantially different, and the size of the trade-off between objectives can be quantitatively evaluated.


Andrea Moro and Martin Van der Linden

February 16, 2021


JEL Classification: K40, K14, J14, J16

Keywords: Jury selection, Peremptory challenge, Minority representation, Gender representation

Moro: Vanderbilt University, [email protected]. Van der Linden: Emory University, [email protected].

Introduction

In the U.S. legal system, it is customary to let the parties involved in a jury trial dismiss some of the potential jurors without justification. These dismissals, known as peremptory challenges, are meant to enable “each side to exclude those jurors it believes will be most partial toward the other side” thereby “eliminat[ing] extremes of partiality on both sides”. In the last decades, however, peremptory challenges have been criticized, mainly because they are perceived as causing some groups, in particular minorities, to be under-represented in juries.

Footnote: Holland v. Illinois, 493 U.S. 474, 484 (1990).

Footnote: For examples of this line of argument against peremptory challenges, see Sacks (1989), Broderick (1992), Hochman (1993), Marder (1994), and Smith (2014). Despite these attacks, the U.S. has so far resisted abandoning peremptory challenges altogether (unlike other countries, like the U.K., where they were abolished in 1988). Peremptory challenges remain pervasive in all U.S. jurisdictions and have been affirmed by the U.S. Supreme Court as “one of the most important rights secured to the accused” (Swain v. Alabama [...]

The procedure used to let the parties exercise their challenges varies greatly across jurisdictions and is sometimes left to the discretion of the judge. Two classes of procedures are most frequently used in the U.S. In the Struck procedure (henceforth: STR), the parties can observe and extensively question all the jurors who could potentially serve on their trial before exercising their challenges (this questioning process is known as voir dire). In contrast, in the Strike and Replace procedure (henceforth: S&R), smaller groups of jurors are sequentially presented to the parties. The parties observe and question the group they are presented with (sometimes a single juror) but must exercise their challenges on that group without knowing the identity of the next potential jurors.

Footnote: For example, in criminal cases in Illinois, “[State Supreme Court] Rule 434(a) expressly grants a trial court the discretion to alter the traditional procedure for impaneling juries so long as the parties have adequate notice of the system to be used and the method does not unduly restrict the use of peremptory challenges” (People v. McCormick, 328 Ill.App.3d 378, 766 N.E.2d 671 (2d Dist., 2002)).

The goal of this paper is to shed light on a debate that emerged in the legal doctrine over the relative effectiveness of STR and S&R at satisfying the two objectives of excluding extreme jurors and ensuring adequate group representation. Bermant and Shapard (1981, pp. 93-94), for example, argue that, by avoiding uncertainty, STR “always gives advocates more information on which to base their challenges, and, therefore, [...] is always to be preferred”. Bermant further notes that “a primary purpose of peremptory challenges is to [...]

STR facilitates the exclusion of some groups from juries. Although in Batson v. Kentucky and J. E. B. v. Alabama the Supreme Court found it unconstitutional to challenge potential jurors based on their race or gender, proving that a challenge is based on race or gender is often difficult and the Supreme Court's mandate is notoriously hard to implement. Interestingly, in response, judges themselves have turned to the design of the challenge procedure and the use of S&R as an instrument to foster adequate group representation. For example, in a memorandum on judges' practices regarding jury selection, Shapard and Johnson (1994) report about judges believing that by “prevent[ing] counsel from knowing who might replace a challenged juror” S&R procedures “make it more difficult to pursue a strategy prohibited by Batson”.

Footnote: 476 U.S. 79 (1986) and 511 U.S. 127 (1994), respectively. In terms of legal procedures, the response to these decisions has consisted in allowing the parties to appeal peremptories from their opponent, allowing them to nullify a peremptory if they can show that it was indeed based on race. These appeals are known as Batson appeals.

Footnote: See Raphael and Ungvarsky (1993): “In virtually any situation, an intelligent plaintiff can produce a plausible neutral explanation for striking Pat despite the plaintiff's having acted on racial bias. Consequently, given the current case law, a plaintiff who wishes to offer a pretext for a race-based strike is unlikely to encounter difficulty in crafting a neutral explanation.” See also Marder (2012) or Daly (2016) for why judges rarely rule in favor of Batson appeals.

To inform this debate, we extend in Section 2 the model of jury selection proposed in Brams and Davis (1978) by allowing potential jurors to belong to two different groups. In the model, each potential juror is characterized by a probability to vote in favor of the defendant's conviction. This probability is drawn from a distribution that depends on the juror's group-membership. The group distributions are common knowledge but the parties to the trial, a plaintiff and a defendant, only observe their realization for a particular juror upon questioning that juror. The parties have opposing goals: the plaintiff wants to maximize the probability of conviction, whereas the defendant seeks to maximize the probability of acquittal.

A jury must be formed to decide the outcome of the trial and the parties can influence its composition by challenging (i.e., vetoing) a certain number of potential jurors. Challenges are exercised according to S&R or STR procedures which, as explained above, differ mainly in the timing of jurors' questioning (and, as a consequence, in the parties' ability to observe [...]

STR is more effective than S&R at excluding jurors from the tails of the conviction probability distribution, but is less likely to select minority jurors.

The rest of the paper is devoted to characterizing conditions under which these results extend beyond the illustrative example of Section 3. In Section 4 we call a juror extreme if its conviction probability falls below (above) a given threshold. We prove that there always exists a low enough threshold such that STR is more likely than S&R to exclude extreme jurors. Moreover, we show that STR always selects fewer extreme jurors than a random selection would, but that there are some (admittedly somewhat unusual) circumstances in which S&R would not. Simulations assuming a wide range of conviction probability distributions reveal that, in terms of excluding extreme jurors, the advantage of STR over S&R can be substantial, even for relatively high thresholds.

Section 5 compares procedures according to their ability to select minorities and identifies conditions under which S&R selects more minority jurors than STR. Our proof uses a limiting argument showing that the result holds when the minority is vanishingly small and the distributions of conviction probabilities for each group minimally overlap (i.e., groups are polarized). However, simulations again suggest that the result remains true when the size of the minority is relatively high and the overlap between distributions is significant. In Section 6, we explore how changing the number of challenges affects the results of Sections 4 and 5.

Depending on the extent to which jurors of different races have polarized preferences for conviction, the model has different empirical implications for the selection of jurors by race. In Section 7 we exploit peremptory challenge data on a version of STR adopted in the Fifth Circuit Court District of Mississippi to estimate the groups' distributions of conviction probabilities, and to simulate the outcomes of counterfactual procedures. Results show that groups appear to be substantially polarized in their preferences for convictions, and that the choice of procedure affects both exclusion of extreme jurors and minority representation substantially.

In Section 8 we show how our main theoretical results extend to a different definition of extreme juries (i.e., a jury in which the highest (lowest) conviction-probability juror is below (above) a given threshold). We also explore how the procedures compare in selecting members of groups that are about equal size (such as males and females, as opposed to minorities which induce groups of unequal sizes).

Related Literature

This paper belongs to a relatively small literature formalizing jury selection procedures. Brams and Davis (1978) model S&R as a game and derive its subgame-perfect equilibrium strategies which we use in our theoretical results and simulations. Perhaps closest to our paper is Flanagan (2015) who shows that, compared to randomly selecting jurors, STR increases the probability that all jurors come from one particular side of the median of the conviction probability distribution (because STR induces correlation between the conviction probability of the selected jurors). To our knowledge, this literature is silent on the implications of jury selection for group representation and on the trade-off between excluding extreme jurors and ensuring adequate group representation induced by using different procedures. These are the focus and main contributions of this paper.

While the group-composition of a jury has been shown to influence the outcome of a trial (Anwar et al., 2012; Flanagan, 2018), legal scholars often argue in favor of representative juries regardless of their effect on verdicts. Diamond et al. (2009) for example argue that “unrepresentative juries [...] threaten the public's faith in the legitimacy of the legal system”. In an experiment on jury-eligible individuals, they show that participants rate the outcome of trials as significantly fairer when the jury is racially heterogeneous than when it is not. This motivates us to consider group-representativity itself as a desirable feature of jury selection procedures.

Footnote: One might also be interested in the impact of group-representation on the conviction of defendants who themselves belong to different groups. Without taking groups into account or attempting to compare procedures, Flanagan (2015) studies the impact of jury selection procedures on conviction rates. His results in terms of conviction rates require assuming that the parties have correct beliefs about the probability that jurors eventually vote for conviction (as well as that these probabilities are independent of one another). In contrast, our results about group-representation and exclusion of extremes do not require that the parties' belief at the moment of jury selection be accurate (at least if we are concerned with extremes as perceived by the parties, as the U.S. Supreme Court seems to be when saying that the main purpose of peremptory challenges is to enable “each side to exclude those jurors it believes will be most partial toward the other side”, see Footnote 1 and associated quote).

Footnote: Diamond et al. (2009) take advantage of a feature of civil cases in Florida where juries are made of six jurors unless one of the parties requests a jury of twelve jurors and pays for the costs associated with such a larger jury.

[...] In Section 6, we show that limiting the number of challenges (while keeping the number of selected jurors fixed) can have a similar effect, though at the expense of a less effective exclusion of extreme jurors.

There are two parties to a trial, the defendant, $D$, and the plaintiff, $P$. The outcome of the trial is decided by a jury of $j$ jurors who must be selected from the population. The parties share a common belief about the probability that a juror $i$ will vote to convict the defendant. We denote this probability $c_i \in [0, 1]$ [...] $C$, with probability distribution $f(c)$. We denote its cumulative with $F(c)$ and its expected value with $\mu$. Throughout, we assume that $C$ is continuous. To simplify the notation, we also assume that the boundaries of the support of $C$ are 0 and 1.

Footnote: This assumption is without loss of generality and all our results hold if $C$ is re-scaled in such a way that $F(c) = 0$ or $[1 - F(1 - c)] = 0$ for some $c \in (0, 1)$.

To address the issue of group representation, we assume that jurors belong to one of two groups $a$ or $b$. The parties have common beliefs about the probability that jurors from each group vote to convict the defendant. We index the distributions representing these beliefs and their averages with subscript $g \in \{a, b\}$: $f_g(c)$, $F_g(c)$, and $\mu_g$. The corresponding [...] $C_a$ and $C_b$. Although throughout conviction probabilities and their distributions across groups should only be viewed as representing the parties' common beliefs, we henceforth lighten the terminology and speak directly of conviction probabilities (rather than parties' beliefs about conviction probabilities).

Footnote: Empirical evidence, including the one we report in Section 7, shows that parties use their challenges unevenly across groups; see also the Related Literature section of the Introduction.

We let $r$ denote the proportion of group-$a$ jurors in the population, and when discussing group representation, we assume that $C$ is obtained by drawing from $C_a$ with probability $r$ and from $C_b$ with probability $(1 - r)$ (in particular, $f(c) = r f_a(c) + (1 - r) f_b(c)$).

Following the majority of the literature (Brams and Davis, 1978; Flanagan, 2015), we assume that, at the level of jury selection, the parties do not account for the process of jury deliberations and (perhaps as a way to cope with the complexity of jury selection) view the probabilities that jurors vote for conviction as independent from one another. Since conviction in most U.S. trials requires a unanimous jury, the parties then consider that a jury composed of jurors with conviction probabilities $\{c_i\}_{i=1}^{j}$ convicts the defendant with probability $\prod_{i=1}^{j} c_i$. The defendant, therefore, aims at minimizing $\prod_{i=1}^{j} c_i$ while the plaintiff wants to maximize the same product.

To influence the composition of the jury, the defendant and the plaintiff are allowed to challenge (veto) up to $d$ and $p$ of the jurors in a panel of $n = j + d + p$ potential jurors randomly and independently drawn from the population (sometimes also called the pool). To avoid trivial cases, we assume throughout that $d, p \geq 1$. The parties use these challenges in the course of a veto procedure $M$ (formally, an extensive game-form). The jury resulting from the procedure is called the effective jury.

Footnote: In the legal literature, what we call “panel” is sometimes called “venire” (though terminology varies and the latter term is sometimes used to speak of what we call the population).

The two veto procedures we study are the STRuck procedure (STR) and the Strike And Replace procedure (S&R). For comparison, we also consider the Random procedure (RAN) which simply draws $j$ jurors independently at random from the population. In all procedures, we assume that once a potential juror $i$ is presented to the parties, the parties observe the realized value of $c_i$ for that juror. The two procedures however differ in the timing [...]

Footnote: This is motivated by the practice of letting parties extensively question every juror they are presented with, a process known in the legal terminology as voir dire. In turn, the fact that the parties have the same assessment of the probability a juror will vote for conviction is motivated by the fact that voir dire occurs in the presence of both parties, and that the parties therefore have access to the same information about the jurors' demographics, background, and opinions.
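The primitives of the model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`draw_juror`, `draw_panel`, `random_jury`, `conviction_probability`) are hypothetical, and the group distributions are taken to be $C_a \sim U[0, 0.5]$ and $C_b \sim U[0.5, 1]$ purely for concreteness. The sketch draws jurors from the mixture $f(c) = r f_a(c) + (1 - r) f_b(c)$, builds a panel of $n = j + d + p$ jurors, forms a jury under RAN, and evaluates the unanimity probability $\prod_{i=1}^{j} c_i$.

```python
import math
import random


def draw_juror(rng, r):
    """Draw one juror as (group, c): group a with probability r, else group b.

    Assumption for illustration only: C_a ~ U[0, 0.5] and C_b ~ U[0.5, 1].
    """
    if rng.random() < r:
        return "a", rng.uniform(0.0, 0.5)
    return "b", rng.uniform(0.5, 1.0)


def draw_panel(rng, j, d, p, r):
    """Draw a panel of n = j + d + p potential jurors independently."""
    n = j + d + p
    return [draw_juror(rng, r) for _ in range(n)]


def random_jury(rng, j, r):
    """RAN procedure: j jurors drawn independently at random from the population."""
    return [draw_juror(rng, r) for _ in range(j)]


def conviction_probability(jury):
    """Probability of a unanimous vote to convict: the product of the c_i."""
    return math.prod(c for _, c in jury)
```

For instance, `conviction_probability(random_jury(random.Random(0), j=12, r=0.1))` returns the parties' assessment of the probability that a randomly selected twelve-person jury convicts.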

Under STR, the entire panel of $j + d + p$ potential jurors is presented to the parties before they have the opportunity to use any of their challenges. Each party, therefore, observes the value of $c_i$ for every juror in the panel. The defendant and the plaintiff then choose to challenge up to $d$ and $p$ of the jurors in the panel, respectively. In practice, there are several types of STR procedures that differ in the way the parties exercise their challenges after having questioned the jurors in the panel. For concreteness and tractability, we focus in this paper on the STR procedure in which the parties have a single opportunity to exercise their challenges on the whole panel. In equilibrium, this leads the plaintiff to challenge the $p$ jurors in the panel with lowest conviction probabilities, and the defendant to challenge the $d$ jurors with highest conviction probabilities. Whether these challenges happen simultaneously or sequentially has no impact on the equilibrium and our results for STR apply in either case.

Footnote: Alternative methods used in the field include procedures in which the parties challenge sequentially out of subgroups of jurors from the panel only. As long as the procedure remains of the struck type (i.e., the entire panel, and not only the first subgroup, is questioned before the parties start exercising their challenges), the equilibrium effective jury is often the same as under the STR procedure we consider here. Other outcome-irrelevant aspects of the equilibrium might, however, be different, such as the number of challenges used by the parties (e.g., if the first group is made of the $j$ “middle” jurors in the panel, they may in some cases be selected as effective jurors without the parties exercising any of their challenges).

Footnote: Since $C$ is continuous, the probability that two jurors in a panel have the same conviction probability and one of the parties does not use all of its challenges in equilibrium is zero and this eventuality can therefore be neglected.

In contrast, under S&R, groups of potential jurors are randomly drawn from the population and sequentially presented to the parties. Unlike under STR procedures, the parties must exercise their challenges on jurors from a given group without knowing the identity of jurors from subsequent groups. There is variation among S&R procedures used in practice in the size of the groups that are presented in each round. Again, for concreteness and tractability, we focus in this paper on the S&R procedure in which jurors are presented to the parties one at a time. The defendant and the plaintiff start the procedure with $d$ and $p$ challenges left, respectively. After each draw, the plaintiff and the defendant observe the potential juror's conviction probability and, if they have at least one challenge left, choose whether or not to challenge the juror. If a juror is not challenged by either party, it becomes a member of [...] $j$ members is formed.

Footnote: As well as in the ability of the parties to challenge, in a later round, potential jurors who were left unchallenged in previous rounds, a practice known as “backstriking”.

The (subgame perfect) equilibrium of S&R was characterized by Brams and Davis (1978) and takes the form of threshold strategies. In every subgame, $D$ challenges the presented juror $i$ if $c_i$ is above a certain threshold $t_D$, $P$ challenges $i$ if $c_i$ is below some threshold $t_P$, and neither of the parties challenges $i$ if $c_i \in [t_P, t_D]$. We will sometimes refer to these values as challenge thresholds. As Brams and Davis (1978) show, in any subgame, $t_P < t_D$ and, even if the challenges happen simultaneously and both parties are charged for their challenges when they both decide to challenge the presented juror, the latter (i.e., a challenge by both parties) never occurs in equilibrium. The equilibrium is therefore unaffected by the timing of challenges in each round and our results for S&R apply regardless of this timing.

Footnote: Each subgame can be characterized by the number of jurors $\kappa$ that remain to be selected, the number of challenges left to the defendant $\delta$, and the number of challenges left to the plaintiff $\pi$. The parties' thresholds in subgame $(\kappa, \delta, \pi)$ are a function of the values of subgames $(\kappa - 1, \delta, \pi)$, $(\kappa, \delta - 1, \pi)$, and $(\kappa, \delta, \pi - 1)$ (which can result from the parties' actions in $(\kappa, \delta, \pi)$) and the distribution of $C$, see Brams and Davis (1978).

Footnote: By “timing”, here, we mean the order (potentially simultaneous) in which the parties decide whether or not to challenge the presented juror.

In our description of S&R, Nature moves in each round to draw a new potential juror from the population to present to the parties. To facilitate conditional comparisons between STR and S&R based on a particular fixed panel, it will sometimes be useful to consider an equivalent description of S&R in which Nature first draws a panel of $n$ jurors $\{c_1, \dots, c_n\}$ (which the parties are not aware of) and in each round $k$ presents juror $c_k$ to the parties. For similar purposes, it will sometimes be useful to view RAN as first drawing a panel of $n$ jurors and then (uniformly at random) selecting $j$ jurors among these $n$ to form the effective jury.

To illustrate the differences between the two procedures, consider the simple case $d = p = j = 1$ together with distributions $C_a \sim U[0, 0.5]$ and $C_b \sim U[0.5, 1]$ and $r = 0.1$, i.e., there is a minority of 10% of group-$a$ jurors in the population.

Figure 1: Illustrative example, equilibrium outcomes under STR.

[Figure omitted: tree from the population distribution to the four possible group-compositions of the panel, with arrow probabilities .001, .03, .24, and .73, the selected juror circled in each panel, and the distribution of the selected juror's conviction probability under each box.]

Note:

The figure describes the equilibrium of STR assuming $j = p = d = 1$, $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, and $r = 0.10$. The initial node illustrates distribution $C = 0.1 \cdot C_a + 0.9 \cdot C_b$. The numbers on each arrow indicate the probability of drawing a panel with the group-composition represented in the pointed boxes (conditional on each panel composition, the circled letter in the box corresponds to the group-membership of the selected juror). Dashed arrows correspond to outcomes that lead to the selection of a group-$a$ juror and the graph underneath each box shows the distribution of conviction probabilities for the selected juror.

Let $U^n_x[a, b]$ denote the $x$-th order statistic for a $U[a, b]$ random sample of size $n$. With this notation, Figure 1 shows the group-membership and distribution of conviction probability for the juror selected under STR, conditional on the composition of the panel. Observe that in this example, if there are group-$a$ jurors in the panel, one of them is systematically challenged by the plaintiff. Therefore, for a group-$a$ juror (i.e., a minority juror) to be selected under STR, there need to be at least two group-$a$ jurors in the panel of $n = 3$ presented to the parties. This occurs with probability [...]

[...] a group-$a$ juror can be selected under S&R even if the panel contains a single group-$a$ juror. To understand why, consider the equilibrium of S&R which is illustrated in Figure 2. If a group-$b$ prospective juror with a sufficiently low conviction probability ($c_i \in$ [...]) [...] a group-$a$ juror is more likely to be selected than if a juror was randomly drawn from the population. In particular, any group-$a$ juror presented at the beginning of this later subgame is left unchallenged by the defendant and selected to be the effective juror (even if this juror is the only group-$a$ juror in the panel because the third juror, who in this case is never presented to the parties, happens to be a group-$b$ juror). This course of action follows from $P$'s choice to challenge a group-$b$ juror with low conviction probability in the first round, which leaves $P$ without challenges left in the second round. This choice of $P$ is optimal from the perspective of the first round of S&R (before the plaintiff learns that the second juror in the panel is a group-$a$ juror), but suboptimal under STR where, having observed the conviction probability of all jurors in the panel, the plaintiff would have challenged the group-$a$ juror instead.

Figure 2: Illustrative example, equilibrium strategies and outcomes under S&R.

[Figure omitted: game tree of S&R over rounds 1 to 3, showing for each round the ranges of $c_i$ in which each party challenges, the branch probabilities, and the group of the selected juror.]

Note: The figure describes the equilibrium strategies conditional on the conviction probability of the juror drawn in each round for the case $j = d = p = 1$, $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, and $r = 0.10$. Dashed arrows correspond to paths that may lead to the selection of a group-$a$ juror. The numbers on each arrow indicate the probability of the path conditional on reaching the previous node. The second row of text inside boxes indicates an equilibrium action, whereas bold text below boxes indicates the group of the selected juror in the game outcome. In round 3, challenges from both parties are exhausted and the parties do not take any action.

Considering only the branch of the S&R game-tree that starts with a challenge from $P$, the probability of selecting a group-$a$ juror is almost $0.05$ [...] $D$ challenges in the first round followed by a challenge from $P$ in the second round (which happens with probability [...]) [...] S&R is 0.067. This is larger than the probability under STR, 0.03, yet smaller than under RAN, 0.10.

In this example, the better representation of minority jurors produced by S&R comes at the expense of selecting more extreme jurors. Suppose for the sake of illustration that jurors are considered extreme if they come from the top or bottom 5th percentile of $C$. In our example, the bottom and top 5th percentile corresponds to conviction probabilities below 0.25 and above 0.94, respectively. The selected juror is within the bottom range with probability 0.015 under STR versus 0.033 under S&R, and in the top range with probability 0.076 under STR versus 0.083 under S&R.

To understand the source of these differences, let us consider the bottom 5th percentile $[0, 0.25]$ (a symmetric explanation applies to the top [...] STR selects a group-$a$ juror (the type of juror whose conviction probability could possibly be in the bottom 5th percentile), the distribution of that juror's conviction probability follows the middle or upper order-statistics of a random sample from $C_a$. These order-statistics are unlikely to result in the selection of a juror with conviction probability in the bottom 5th percentile. In contrast, as Figure 2 illustrates, all paths leading S&R to select a group-$a$ juror result in the juror's conviction probability being drawn from $U[0, 0.5]$ [...] S&R more likely to select a juror in the bottom 5th percentile than STR.

In the next two sections, we investigate the extent to which the advantages of S&R in terms of minority-representation and of STR in terms of exclusion of extremes generalize beyond this illustrative example.
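The minority-selection probabilities quoted for the illustrative example ($d = p = j = 1$, $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, $r = 0.1$) can be checked numerically. The sketch below is not the authors' code and rests on two assumptions: that the S&R challenge thresholds in the first round equal the continuation values $E[\min(C, \mu)]$ and $E[\max(C, \mu)]$, which is how we read the Brams and Davis (1978) recursion in this one-juror case, and that all function names are hypothetical. STR is simulated by striking the lowest and highest juror of a three-juror panel.

```python
import random

R = 0.10  # share of group-a (minority) jurors in the example


def cdf(c, r=R):
    """CDF of the mixture C = r * U[0, 0.5] + (1 - r) * U[0.5, 1]."""
    fa = min(max(c / 0.5, 0.0), 1.0)
    fb = min(max((c - 0.5) / 0.5, 0.0), 1.0)
    return r * fa + (1 - r) * fb


def integral_survival(lo, hi, steps=20000):
    """Midpoint-rule integral of 1 - F(c) over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(1 - cdf(lo + (k + 0.5) * h) for k in range(steps)) * h


def snr_minority_probability(r=R):
    """Probability that S&R seats a group-a juror when d = p = j = 1."""
    mu = integral_survival(0.0, 1.0)       # E[C]: value once no challenges remain
    t_p = integral_survival(0.0, mu)       # E[min(C, mu)]: value after P's challenge
    t_d = mu + integral_survival(mu, 1.0)  # E[max(C, mu)]: value after D's challenge
    p_first = cdf(t_p)                     # P challenges the first juror
    d_first = 1 - cdf(t_d)                 # D challenges the first juror
    # After P's challenge: any group-a juror (c <= 0.5 < mu) is accepted;
    # a juror above mu is struck by D and round 3 is a random draw.
    after_p = r + (1 - cdf(mu)) * r
    # After D's challenge: P strikes any juror below mu, then a random draw.
    after_d = cdf(mu) * r
    return p_first * after_p + d_first * after_d


def str_minority_exact(r=R):
    """STR seats the median of 3 jurors: group a iff at least two are group a."""
    return 3 * r**2 * (1 - r) + r**3


def str_minority_frequency(trials=100_000, r=R, seed=0):
    """Monte Carlo check of STR: P strikes the lowest c, D the highest."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        panel = []
        for _ in range(3):  # n = j + d + p = 3
            if rng.random() < r:
                panel.append((rng.uniform(0.0, 0.5), "a"))
            else:
                panel.append((rng.uniform(0.5, 1.0), "b"))
        panel.sort()
        hits += panel[1][1] == "a"  # the median juror is seated
    return hits / trials
```

Under these assumptions, `str_minority_exact()` gives 0.028 and `snr_minority_probability()` gives roughly 0.067, in line with the ordering STR < S&R < RAN reported in the text.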

In the United States, one of the objectives of the jury selection process is to guarantee animpartial jury as dictated by the Sixth Amendment of the Constitution. In this respect,the peremptory challenge procedures implemented in U.S. jurisdictions are often viewedas a way to foster impartiality by preventing extreme potential jurors from serving on the These are the only cases in which a minority juror can be selected under

S&R . In particular, jurorsaccepted in the ﬁrst round are always group- b jurors ( c i ∈ [0 . , . D is the ﬁrst round ( c i ∈ [0 . , In the context of our model, we interpret this goal as that of limiting thepresence in the jury of jurors from the tails of the distributions of conviction probabilities.In Sections 4 to 6, we refer to a juror i as extreme if its conviction probability c i lies below or above given thresholds (we refer the reader to Section 8 for results underan alternative deﬁnition). For brevity, we will focus on jurors who qualify as extremebecause their conviction probability lies below some threshold c >

0. This is without lossof generality and all our results about extreme jurors apply symmetrically to jurors whoseconviction probability lies above a given threshold c < C are selected lessoften under STR than

S&R . This is not true in general. Fixing a particular threshold c >

0— or percentile of C — to characterize jurors as extreme, there always exists distributions of C and values of d , p , and j such that S&R selects fewer extreme jurors than

STR . However,our ﬁrst result shows that regardless of the distribution and value of the parameters, therealways exists a threshold suﬃciently small such that, if jurors are called “extreme” belowthat threshold, the probability of selecting extreme jurors is greater under

S&R than under

STR .Let T M ( x ; c ) denote the probability that there are at least x jurors with convictionprobability smaller or equal to c in the jury selected by procedure M . Proposition 1.

For any $x \in \{1, \ldots, j\}$, there exists $\bar{c} > 0$ such that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $c \in (0, \bar{c})$.

All proofs are in the appendix. A symmetric statement, which we omit, applies for extreme jurors at the right end of the distribution. Note that Proposition 1 can be rephrased in terms of stochastic dominance. Let $N^c_M$ denote the number of jurors of type $c_i \le c$ in the jury selected by procedure $M$. Then, Proposition 1 says that there exists $\bar{c} > 0$ such that $N^c_{S\&R}$ first-order stochastically dominates $N^c_{STR}$ for all $c \in (0, \bar{c})$. A direct corollary of Proposition 1 is therefore that the expected number of extreme jurors is larger under S&R than under STR.

Footnote: See Footnote 1 and its associated quote. For legal arguments in favor of peremptory challenges based on the Sixth Amendment, see, among others, Beck (1998), Biedenbender (1991), Bonebrake (1988), Horwitz (1992), and Keene (2009).

For some intuition about Proposition 1, consider the case $x = 1$. As illustrated in Section 3, the panel must be composed of more than one extreme juror for STR to select at least one such juror (since, if there is a single extreme juror in the panel, that juror is systematically challenged by the plaintiff). In contrast, even in panels with a single extreme juror, the extreme juror can be part of the effective jury resulting from S&R. This happens, for example, if the extreme juror is presented to the parties after they both exhausted all their challenges. The single extreme juror can also be accepted by both parties if its conviction probability is sufficiently close to $c$ and it is presented after the plaintiff used most of its challenges on non-extreme potential jurors. The proof then follows from the fact that, as $c$ tends to zero, the probability that the panel contains more than one extreme juror goes to zero faster than the probability that the panel contains a single extreme juror.

Footnote: Subgames in which the defendant has more challenges left than the plaintiff can lead the plaintiff to be conservative and accept jurors who are "barely extreme" ($c_i \approx c$) in order to save its few challenges left for "very extreme" jurors ($c_i \approx 0$).

Footnote: Proposition 1 crucially depends on averaging across all possible panels and does not state that STR rejects more extreme jurors than S&R for any particular realization of the panel. The latter would obviously imply Proposition 1 but turns out to be false in general. For a counterexample, let $j = d = p = 1$. Consider a panel of three jurors with c < c < c and c > c and the index of the jurors indicating the order in which they are presented under S&R. For this panel, STR always leads to the selection of extreme juror 3. In contrast, provided c falls between the challenge thresholds of the defendant and the plaintiff in the first round (which happens with positive probability), S&R selects non-extreme juror 2.

Proposition 1 is silent about the value of the threshold $\bar{c}$ below which STR selects fewer extreme jurors than S&R, as well as the size of $T_{S\&R}(x; c) - T_{STR}(x; c)$ for $c < \bar{c}$. These values depend on the model's parameters. To illustrate, we simulate $T_{STR}(1; c)$ and $T_{S\&R}(1; c)$ using $j = 12$, $d = 6$, and $p = 6$, a typical combination of jury size and number of peremptory challenges in U.S. jurisdictions. For the distribution of conviction probabilities in the population, we use symmetric mixtures of beta distributions that represent a population made of two groups with polarized views, which allows easier comparison with the results from Section 5, in which we study group representation. We provide simulation results for three mixtures of the distributions illustrated in Figure 3, which are meant to represent extreme (Panel (a)), moderate (Panel (b)), and mild (Panel (c)) levels of polarization. Additional simulation results using $U[0, 1]$ instead are reported in Appendix B.

Figure 3: Distributions of conviction probabilities by group under extreme, moderate, and mild group-polarization. Panels: (a) Extreme, $f_a(c)$: Beta(1, 5), $f_b(c)$: Beta(5, 1); (b) Moderate, $f_a(c)$: Beta(2, 4), $f_b(c)$: Beta(4, 2); (c) Mild, $f_a(c)$: Beta(3, 4), $f_b(c)$: Beta(4, 3). Horizontal axis: $c$.

Using these parameters, STR is found to exclude more extreme jurors than S&R even when the threshold for defining jurors as extreme is relatively high. As illustrated in Figure 4, the difference between the propensity of STR and S&R to select extreme jurors is sizable. For example, in all three sets of simulations, only about 1% of juries selected by STR feature at least one juror with conviction probability below the 10th percentile of the distribution (the 10th percentile corresponds to 0.01 under the extreme polarization distribution, 0.17 under moderate polarization, and 0.25 under mild polarization). Under S&R, the proportion of juries with at least one juror below the 10th percentile rises to 56% with extreme polarization, 35% with moderate polarization, and remains quite high at 30% even under mild polarization. For comparison, a random selection would have resulted in about 73% of the juries featuring at least one such juror.

In these simulations, both procedures select fewer extreme jurors than a random draw from the population. Somewhat surprisingly, this is not true in general. There exist distributions and values of the parameters $d$, $p$, and $j$ for which S&R selects more extreme jurors than RAN, no matter how small the threshold below which a juror is considered extreme. In contrast, as we show in the next proposition, STR always selects fewer extreme jurors than RAN.

Proposition 2.

For any $x \in \{1, \ldots, j - 1\}$, there exists $\bar{c} > 0$ such that $T_{STR}(x; c) < T_{RAN}(x; c)$ for all $c \in (0, \bar{c})$.

Footnote: Proposition 2 generalizes Theorem 2 in Flanagan (2015), which shows that there always exists $\bar{c} > 0$ such that $T_{STR}(n; c) < T_{RAN}(n; c)$ for all $c \in (0, \bar{c})$.

Figure 4: Fraction of juries with at least one extreme juror. Panels: (a) Extreme, (b) Moderate, (c) Mild. Horizontal axis: $c$; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axis are averages across 50,000 simulated jury selections, fixing $j = 12$, $d = p = 6$, and $C \sim 0.5 \cdot C_a + 0.5 \cdot C_b$ throughout (with the distributions for $C_a$ and $C_b$ illustrated in Figure 3). Each line illustrates the fraction of juries with at least one extreme juror, where a juror is considered extreme if her conviction probability falls below the threshold $c$ corresponding to the value on the horizontal axis.

Figure 5 illustrates Proposition 2 and the fact that a similar statement does not hold for S&R. For the simulations in the figure, we let $j = d = p = 1$ and adopt an extremely polarized distribution of conviction probabilities with $C \sim$ … $\cdot\, U[0, 0.1] +$ … $\cdot\, U[$…$, 1]$. STR excludes extreme jurors more often than RAN because, for any realization of the panel, the juror with the lowest conviction probability is never selected under STR (whereas the same juror is selected with positive probability under RAN). Under S&R, however, if the distribution is sufficiently right-skewed, the plaintiff is more likely than the defendant to challenge in the first round. A challenge by the plaintiff in the first round leads to a subgame in which only the defendant has challenges left and the selection of an extreme juror is more likely than under a random draw. When they are sufficiently large, (i) the added probability of selecting an extreme juror when the defendant has more challenges left than the plaintiff, coupled with (ii) the probability of a challenge by the plaintiff in the first round can, as in the simulation depicted in Figure 5, lead to S&R selecting more extreme jurors than RAN.

Figure 5: Fraction of juries with at least one extreme juror (case in which S&R is more likely to pick extreme jurors than RAN). Horizontal axis: $c$; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axis are averages across 50,000 simulated jury selections, fixing $j = d = p = 1$, and $C \sim$ … $\cdot\, U[0, 0.1] +$ … $\cdot\, U[$…$, 1]$ throughout. Each line illustrates the fraction of juries with at least one extreme juror, where a juror is considered extreme if her conviction probability falls below the threshold $c$ corresponding to the value on the horizontal axis.

We could not fully characterize the situations in which S&R selects more extreme jurors than RAN, and we never observed such a situation in simulations where $C$ is a symmetric mixture of beta or uniform distributions. The example in Figure 5 (as well as other examples we found) requires extreme skewness in the distribution, which may be viewed as unlikely. In this sense, situations in which S&R selects more extreme jurors than RAN might represent worst-case scenarios for S&R's ineffectiveness at excluding extreme jurors (rather than ordinary situations).
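The comparison between STR and RAN in this section can be checked without simulation. The sketch below is only illustrative and rests on an assumption drawn from the closed form the paper reports for the median case: under STR the panel has j + d + p members and the plaintiff's p strikes remove the p lowest conviction probabilities, so at least x selected jurors lie below a threshold only if at least x + p panel members do. The function names and the argument `q` (the population share below the threshold) are hypothetical.

```python
from math import comb


def binom_tail(n: int, q: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, q)."""
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))


def t_ran(x: int, q: float, j: int) -> float:
    """P(at least x of j randomly drawn jurors lie below the threshold)."""
    return binom_tail(j, q, x)


def t_str(x: int, q: float, j: int, d: int, p: int) -> float:
    """P(at least x selected jurors lie below the threshold) under STR.

    Assumption: the panel has j + d + p members and the p lowest are struck,
    so x jurors below the threshold survive only if the panel holds x + p.
    """
    return binom_tail(j + d + p, q, x + p)


j, d, p = 12, 6, 6
q = 0.10  # share of the population below the threshold (10th percentile)
print(f"RAN: {t_ran(1, q, j):.3f}  STR: {t_str(1, q, j, d, p):.3f}")
```

With these inputs the two probabilities come out at roughly 0.72 for RAN and below 0.01 for STR, the same order of magnitude as the simulated figures quoted earlier in this section. No comparable closed form is attempted for S&R, whose outcome depends on the order of presentation and on the parties' equilibrium thresholds.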

In this section, we study the extent to which STR's tendency to exclude more extreme jurors than S&R impacts the representation of minorities under the two procedures. Without loss of generality, we let group-a be the minority group. Since the parties do not care intrinsically about group-membership, any asymmetry in the use of their challenges arises from heterogeneity in preferences for conviction between groups. In our simulations, we assume that group-a is biased in favor of acquittal in the sense that $C_b$ first-order stochastically dominates $C_a$.

Footnote: We also simulated the scenario in which the minority is biased towards conviction; the results, which we report in the Appendix, are symmetrically very close.

As suggested by Proposition 1, which procedure better represents minorities strongly depends on the polarization between the two groups, and the concentration of minority jurors at the tails of the distribution of conviction probabilities. To illustrate, suppose that $d = p = j = 1$ and $C \sim U[0, 1]$. The distributions of the conviction probability of the juror selected under RAN, STR, and S&R are displayed in Figure 6(a). Consistent with Proposition 1, below some threshold $\bar{c} \approx 0.25$, the probability of selecting a juror $i$ with $c_i < c$ is lower under STR than under S&R. If the two groups are polarized and the distribution of $C_a$ is sufficiently concentrated below $\bar{c}$, it follows that STR selects a minority juror less often than S&R. But the same is not true if the distributions lack polarization or the minority is too large. For example, decompose $C$ as follows: $C \sim U[0, 1] = r\,U[0, r] + (1 - r)\,U[r, 1]$. Because the parties do not care about group-membership per se, the value of $r$ in these decompositions does not affect the distributions of conviction probabilities for the juror selected under RAN, STR, or S&R. Then, letting $C_a \sim U[0, r]$ and $C_b \sim U[r, 1]$, lower values of $r$ (which concentrate minorities at the bottom of the distribution) make S&R select more minorities than STR, whereas higher values of $r$ (which spread the minority over a larger range of conviction-types) make STR select more minorities than S&R.

From this example, we see that non-overlapping group-distributions are not sufficient to guarantee that S&R selects more minority jurors than STR. Neither is making the minority arbitrarily small. For example, regardless of the size of the minority $r$, concentrating the support of the minority distribution inside the interval [0.…, 0.3] would result in STR selecting more minority jurors, as can be seen from Panel 6(a). However, combining a small minority with group-distributions that minimally overlap concentrates the distribution of group-a at the tails which, as suggested by Proposition 1, makes S&R select more minorities than STR.

Figure 6: Jury selection and minority representation in size-1 juries. Panels: (a) Distribution of $c$ for selected juror (horizontal axis: $c$; vertical axis: density); (b) Minority representation in juries (horizontal axis: $r$; vertical axis: fraction of group-a in juries). Lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axes are averages across 20,000 simulated jury selections, fixing $j = 1$, $d = p = 1$, and $C \sim r \cdot U[0, r] + (1 - r) \cdot U[r, 1]$ throughout. The distribution in panel (a) is independent of $r$, whereas the lines in panel (b) interpolate results from 20 values of $r$.

Formally, consider a sequence of triples $\{(C^i_a, C^i_b, r^i)\}_{i=1}^{\infty}$. If

(i) $r^i \in (0, 1]$ for all $i \in \mathbb{N}$ with $\lim_{i \to \infty} r^i = 0$, and
(ii) $C^i_a$ and $C^i_b$ converge in distribution to $C^*_a$ and $C^*_b$, with either $P(C^*_a < C^*_b) = 0$ or $P(C^*_a > C^*_b) = 0$,

then we say that there is a vanishing minority and group-distributions that do not overlap in the limit. For any such sequence, let $A^i_M(x)$ denote the probability that there are at least $x$ minority jurors in the jury selected by procedure $M$ when group-distributions are $C^i_a$ and $C^i_b$ and the proportion of minority jurors in the population is $r^i$.

Proposition 3.

Suppose that, under $\{(C^i_a, C^i_b, r^i)\}_{i=1}^{\infty}$, there is a vanishing minority and group-distributions that do not overlap in the limit. Then for all $x \in \{1, \ldots, j\}$, there exists $j$ sufficiently large such that $A^i_{S\&R}(x) > A^i_{STR}(x)$ for all $i > j$.

Footnote: Note that, despite the argument presented in the motivating example illustrated in Figure 6, Proposition 3 does not follow directly from Proposition 1. The reason is that, unlike in the motivating example, most of the sequences $\{(C^i_a, C^i_b, r^i)\}_{i=1}^{\infty}$ covered by Proposition 3 are such that $C^i = r^i C^i_a + (1 - r^i) C^i_b$ varies across the sequence (i.e., $C^j \neq C^h$ for most $j, h \in \mathbb{N}$).

Table 1: Representation of group-a when group-a is a minority of the pool

(a) Group-a represents 25% of the jury pool

Polarization                         Extreme        Moderate       Mild           (All)
Procedure                            S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities       0.10   0.08    0.18   0.16    0.23   0.23    0.25
Standard deviation                   0.11   0.11    0.12   0.12    0.12   0.12    0.12
Fraction of juries with at least 1   0.57   0.45    0.88   0.84    0.96   0.95    0.97

(b) Group-a represents 10% of the jury pool

Polarization                         Extreme        Moderate       Mild           (All)
Procedure                            S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities       0.02   0.00    0.05   0.04    0.09   0.08    0.10
Standard deviation                   0.04   0.01    0.07   0.06    0.08   0.08    0.09
Fraction of juries with at least 1   0.17   0.02    0.47   0.38    0.67   0.64    0.72

Note: The rows report the average fraction and standard deviation of group-a jury members, and the fraction of juries with at least one group-a juror, out of 50,000 simulations of jury selection with parameters $j = 12$ and $d = p = 6$. Conviction probabilities are drawn from Beta(5, 1) and Beta(1, 5) (Extreme), from Beta(4, 2) and Beta(2, 4) (Moderate), and from Beta(4, 3) and Beta(3, 4) (Mild); see Figure 3 for the shape of these distributions.

Given the result in Proposition 3, it is natural to wonder how small the minority and the overlap between the group-distributions must be for S&R to select more minority jurors than STR. When the latter is true, one may also wonder how large $A_{S\&R}(x; r) - A_{STR}(x; r)$ is. Again, the answer naturally depends on the model's parameters. To inform these questions, we ran a set of simulations with $d = p = 6$ and $j = 12$ using the distributions displayed in Figure 3, where the green lines in each panel represent $f_a$ and the yellow lines $f_b$. The results of our simulations, displayed in Table 1, suggest that S&R might select more minority jurors than STR even when the size of the minority is relatively high (as high as 25%) and the overlap between the group-distributions is significant. However, without stark polarization across groups, differences between the procedures' propensities to select minority jurors appear to be small. For example, under the distributions we labeled as "extreme group heterogeneity" and with minorities representing 10% of the population, only 2.3% of juries selected by STR include at least one minority juror, whereas this number rises to 17.1% under S&R (random selection would generate over 70% of such juries). However, under the distributions we labeled as "mild group heterogeneity", the same numbers become 66.5% under S&R and 64.5% under STR (random selection would generate 71.9% of juries with at least one minority juror in this second case).
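The STR and RAN columns of Table 1 can be approximated with a short Monte Carlo sketch. This is an illustration under stated assumptions, not the authors' code: it assumes that, in equilibrium, STR removes the p lowest and the d highest conviction probabilities from a panel of j + d + p draws, and it does not attempt S&R, whose equilibrium requires solving the sequential game. The function name `simulate` and its arguments are hypothetical.

```python
import random


def simulate(r, f_a, f_b, j=12, d=6, p=6, n_sims=20_000, seed=0):
    """Monte Carlo comparison of minority representation under RAN and STR.

    r is the minority share; f_a and f_b are (alpha, beta) pairs for the
    minority and majority groups. Returns {procedure: (mean minority fraction,
    share of juries with at least one minority juror)}.
    """
    rng = random.Random(seed)
    totals = {"RAN": [0, 0], "STR": [0, 0]}
    for _ in range(n_sims):
        panel = []
        for _ in range(j + d + p):
            minority = rng.random() < r
            c = rng.betavariate(*(f_a if minority else f_b))
            panel.append((c, minority))
        juries = {
            # RAN: the first j draws form an unfiltered random jury.
            "RAN": panel[:j],
            # STR (assumed): drop the p lowest and d highest conviction types.
            "STR": sorted(panel)[p:p + j],
        }
        for name, jury in juries.items():
            n_min = sum(is_min for _, is_min in jury)
            totals[name][0] += n_min
            totals[name][1] += n_min > 0
    return {k: (s / (n_sims * j), a / n_sims) for k, (s, a) in totals.items()}


# Moderate polarization with a 25% minority, as in panel (a) of Table 1.
result = simulate(r=0.25, f_a=(2, 4), f_b=(4, 2))
for name, (frac, at_least_one) in result.items():
    print(f"{name}: mean fraction {frac:.2f}, at least one {at_least_one:.2f}")
```

Changing `r`, `f_a`, and `f_b` to the other pairs of beta distributions gives a rough check on the remaining STR and RAN cells; exact agreement with the table should not be expected, since the seed and number of replications differ.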

So far, we have compared STR and S&R assuming that the number of challenges the parties can use, $d$ and $p$, was the same under each procedure. This was motivated by the fact that judges often have a lot of freedom in selecting the procedure through which the parties use their challenges (see Footnote 3). In contrast, the number of challenges that the parties can use is typically specified more rigidly by state rules of criminal procedure.

In the last decades, several states have, however, reduced the number of challenges the parties can use. In some instances, these reforms also clarify or alter the jury selection procedures used in the state. In the context of such broader reforms, it is natural to ask how the ability to change both the number of challenges the parties are entitled to and the procedure through which the parties exert their challenges affects the trade-off between the exclusion of extreme jurors and the representation of minorities.

Footnote: Recall that $C_a$ and $C_b$ represent the parties' beliefs that randomly drawn group-a or group-b jurors eventually vote to convict the defendant. Polarized $C_a$ and $C_b$, therefore, correspond to groups that are perceived by the parties to have different probabilities of voting for conviction (whether or not this materializes when jurors actually vote on conviction at the end of the trial).

Footnote: Examples include California's Senate Bill 843, passed in 2016, which reduces the number of challenges a criminal defendant is entitled to from 10 to 6 (for charges carrying a maximal punishment of one year in prison, or less).

Footnote: Examples include the 2003 reform of jury selection in Tennessee where some aspects of the jury selection procedure were codified to apply uniformly across the state, while the number of peremptory challenges was also slightly reduced (see Cohen and Cohen, 2003).

Figure 7: The effect of varying the number of challenges. Panels: (a) Fraction of extreme jurors (vertical axis: fraction of juries); (b) Fraction of minority jurors (vertical axis: fraction of minorities). Lines: S&R, STR.

Note: Fraction of juries with at least one juror below the 10th percentile (left panel) and fraction of minority jurors (right panel). For each set of parameters, results on the vertical axes are averages across 50,000 simulated jury selections, fixing $j = 12$ and $C \sim$ … $\cdot\, C_a +$ … $\cdot\, C_b$ throughout (with the distributions for $C_a \sim$ Beta(2, 4) and $C_b \sim$ Beta(4, 2)). The values of $d = p$ are on the horizontal axes.

Throughout this section, we fix an arbitrary value of $j$ and consider varying $d = p$. For any procedure $M$, let $M$-$y$ denote the version of $M$ when $d = p = y$. The notation for the two previous sections then carries over, with $T_{M\text{-}y}(x; c)$ denoting the probability that at least $x$ jurors with conviction probability below $c$ are selected under $M$-$y$, and $A_{M\text{-}y}(x)$ the probability that at least $x$ minority jurors are selected under $M$-$y$.

Footnote: Again, in the case of extreme jurors, we focus on jurors who qualify as extreme because their conviction probability falls below a certain threshold $c$, though all of our results hold symmetrically for jurors who qualify as extreme because their conviction probability lies above a certain threshold $c$.

For illustration purposes, we first consider the case $C \sim$ … $\cdot\, C_a +$ … $\cdot\, C_b$, $C_a \sim$ Beta(2, 4), $C_b \sim$ Beta(4, 2) ($C_a$ and $C_b$ are illustrated in Figure 3(b)), and consider a juror as extreme if its conviction probability falls in the bottom 10th percentile of $C$ (which here equals 0.…). The fraction of extreme jurors decreases as the number of challenges awarded to the parties increases, regardless of the procedure that is used (Figure 7(a)). Conversely, the fraction of minority jurors decreases with the number of challenges. For both STR and S&R, more challenges lead to fewer extreme jurors being selected at the expense of a worse representation of minorities.

As Figure 7(a) illustrates, however, increasing the number of challenges decreases the selection of extreme jurors much faster under STR than under S&R. As a consequence, for all values of $y \in \{$…$\}$, there exists $w < y$ such that STR-$w$ performs better than S&R-$y$ in terms of both objectives.

The latter is not true in general. Even when there exists $w$ such that STR-$w$ better represents minorities than S&R-$y$, STR-$w$ might still exclude fewer extreme jurors than S&R-$y$ if jurors are considered extreme when their conviction probability falls below an arbitrary $c > 0$. However, an extension of Proposition 1 shows that when such a $w$ exists, there also exists $\bar{c} > 0$ such that, for any threshold below $\bar{c}$, STR-$w$ performs better than S&R-$y$ in terms of both objectives.

Proposition 4.

Consider any $x \in \{1, \ldots, j\}$ and any $y \geq$ …. Suppose that there exists $w \geq$ … such that $A_{STR\text{-}w}(x) > A_{S\&R\text{-}y}(x)$. Then for some $\bar{c} > 0$, we also have $T_{STR\text{-}w}(x; c) < T_{S\&R\text{-}y}(x; c)$ for all $c \in (0, \bar{c})$.

Footnote: Specifically, in this example, for any $y \in \{$…$\}$, there exists $w \in \{$…$, y -$ …$\}$ such that $A_{STR\text{-}w}(1) > A_{S\&R\text{-}y}(1)$ and $T_{STR\text{-}w}(1; 0.$…$) < T_{S\&R\text{-}y}(1; 0.$…$)$.

As emphasized in the analysis so far, group asymmetries in jury representation exist to the extent that groups have polarized preferences for conviction. In this section, we use jury selection data to estimate the distribution of conviction probabilities and provide quantitative evidence of the effect of jury selection procedures and their differences.

Jury selection data is, to our knowledge, relatively scarce. For the purposes of this section, we exploit data from Craft (2018) on peremptory strikes in the Fifth Circuit Court District of Mississippi from 1992 to 2017, where a version of STR was used to select jurors. For each trial, the data reports the race and gender of the potential jurors, whether a juror was struck by the defendant or the state, and the race and gender composition of the seated jury and alternate jurors. This allows the computation of jury composition by race, and the computation of challenges by race for each party.

Footnote: Besides the data used in this section, another important source is the data of jury selection in North Carolina described in Wright et al. (2018) and analyzed in Flanagan (2018). We do not use this source because the jury selection procedures adopted in these jurisdictions do not conform to the rules we study in this paper.

Footnote: While the adopted procedure differs in some details from the stylized version we analyzed in this paper, we assume that in equilibrium, its outcome conforms to that of STR. In addition, the number of jurors selected and the number of challenges available sometimes differ by type of trial.

We limit our analysis to the juries' racial composition, focusing on Black and White jurors only. Assuming that the distributions of conviction probabilities in each group belong to the class of beta distributions, the model parameters are five: the fraction of whites in the jury pool, $1 - r$, which we directly observe in the data, and the four parameters of $f_{Blacks}$ and $f_{Whites}$. The data we observe does not allow us to identify both of these distributions. Given $r$, for any given $f_{Blacks} =$ Beta($\alpha_a$, $\beta_a$), it is always possible to find $f_{Whites} =$ Beta($\alpha_b$, $\beta_b$) that replicates the same proportion of whites struck by the defendant and by the State of Mississippi, the plaintiff (which in turn determine the fraction of whites in the jury). Intuitively, the reason behind this lack of identification is that it is possible to shift some mass of both distributions to the right without changing, on average, the racial composition of the juries. While this shift would cause the conviction frequency at trial to change, using this moment for identification would not change the outcomes we focus on in this paper for STR (see Footnote 34).

Footnote: As we explain below, there is some variation in the number of jurors in the data and in the number of challenges used by parties (due to variation in the kind of offenses being prosecuted as well as in judges' decisions in the allocation of additional challenges for the selection of alternate jurors). However, the moments we use for identification rely only on race ratios and are relatively stable across juries of different size.

Footnote: The full sample includes almost 15,000 jurors, of which 26% are Black, 42% are White, and 32% are of unknown race; only 3 are Latino and 1 is Asian, whom we pool with the Whites.

Footnote: With beta distributions, matching these two moments also matches the proportion of juries with $x$ jurors of a given race, for all $x \in \{1, \ldots, j\}$, making it impossible to use higher moments for identification.

In Table 2 we report some summary statistics from the data. The sample contains 292 trials, of which 229 include black defendants. We exclude all jurors dismissed by the judge for causes that are not the focus of our analysis. Hence, we define the size of the panel as the sum of the number of jurors, alternate jurors, and jurors dismissed by either the state or the defendant. There is some variation in the size of both the juries and the panel, in part due to the fact that the process of selecting alternate jurors is separate. Unfortunately, the data does not distinguish between jurors who were dismissed in the course of selecting regular jurors, or in the course of selecting alternates. We present data for 5 samples that vary depending on the race of the defendant, the size of the panel, and whether or not we include panels containing jurors of unknown race. These show that the racial composition of juries and challenges is affected by the race of the defendant but is only weakly affected by the way we select our sample.

Table 2: Summary statistics

Sample selection                     (1)     (2)     (3)     (4)     (5)
Defendant                            White   Black   Black   Black   Black
Size of jury pool                    Any     Any     ≤ 27    Any     ≤ …

Trial statistics
Average size of jury pool            26.1    26.9    23.7    26.2    23.5
(std)                                (5.0)   (5.8)   (2.5)   (5.7)   (2.6)
Average size of jury                 12.0    12.0    12.0    12.0    12.0
(std)                                (0.3)   (0.4)   (0.4)   (0.2)   (0.2)
% with unknown race in jury pool     31.2    30.7    26.9    0.0     0.0

Percentage of whites*
in jury pool                         63.1    62.7    63.1    66.5    65.9
in jury                              61.0    66.8    67.8    70.5    69.7
among struck by the defendant        86.2    91.4    92.3    93.1    92.9
among struck by the state            40.8    23.6    21.4    23.5    21.6

Note: Standard deviation in parentheses. * Percentage of white jurors in samples (1), (2), and (3) computed among jurors that have been classified as either whites or blacks.

The average size of the jury (excluding alternates) is 12 in all samples, though the panels are slightly over 24, mainly because they include potential alternate jurors (and because, in some cases, judges may grant additional challenges to the parties). Challenge behavior is affected by the race of the defendant: juries with black defendants have a higher percentage of whites than the panel does, whereas juries with white defendants include fewer whites. When the defendant is black, the defense challenges a higher fraction of white jurors, and the state a higher fraction of black jurors. Variation in the size of the jury pool has little impact on the racial composition of the juries or challenged jurors (for either party). Focusing on trials with Black defendants, the fraction of whites in the pool is quite stable across all 5 samples (between 62.7 and 66.5 percent). This is predicted by our theory when jurors have polarized views that favor defendants of their own race. The behavior of the parties differs substantially by race: in sample (5), which we use to estimate our model, 93% of the jurors struck by the defendant are White, whereas only 22% of the jurors struck by the state are White. We use these two moments to estimate the distribution of conviction probabilities.

We proceed by assuming $f_{Blacks} =$ Beta(2, 4) and estimating $f_{Whites}$ to match the fraction of white jurors struck by the defendant and the plaintiff (the last two moments of Table 2) using sample (5). The estimated parameters of $f_{Whites}$ are (alpha = […]

Figure 8: Counterfactual analysis: Juries with at least one extreme juror. Horizontal axis: $c$; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: […] $C_a \sim$ Beta(2, 4) and $C_b \sim$ Beta(5.…, …).

Figure 9: Counterfactual analysis: Number of challenges. Panels: (a) Fraction of extreme jurors (vertical axis: fraction of juries); (b) Fraction of minority jurors (vertical axis: fraction of minorities). Lines: S&R, STR.

Note: Fraction of juries with at least one juror below the 10th percentile (left panel) and fraction of minority jurors (right panel). For each set of parameters, results on the vertical axes are averages across 50,000 simulated jury selections, fixing $j = 12$, $d = p = 6$, and $C \sim$ … $\cdot\, C_a +$ … $\cdot\, C_b$ throughout (with the distributions for $C_a \sim$ Beta(2, 4) and $C_b \sim$ Beta(5.…, …)). The values of $d = p$ are on the horizontal axes.

Figure 8 reports the results of simulations computed with the estimated parameters. The figure reveals that the procedure adopted by this jurisdiction (a version of STR where each party is allowed 6 challenges) is much more effective at excluding extreme jurors than a counterfactual S&R. The adopted procedure excludes nearly every juror below the 10th percentile, $c = 0.21$, whereas S&R with the same number of challenges would produce about 27% of juries with at least one juror more extreme than 0.21.

Figure 9, however, suggests that a change to S&R could improve the representation of minorities. Keeping the number of challenges at 6, S&R would include 6% more minorities than STR (about 27% vs 25%) and would produce a jury with 4 black jurors (about the same as the black representation in the jury pool) 12% more often (about 41% vs 37%). To reach a similar representation, the number of challenges in STR would have to be reduced to 4, though this would increase the fraction of juries with jurors below the 10th percentile from almost zero to 4.4%.

Footnote 34: Standard errors computed by bootstrapping 200 replications of the data set. We also tried assuming a left-skewed $f_{Blacks} =$ Beta(10, …) […] $f_{Whites} =$ Beta(23.…, …). […] STR, the results obtained with these alternative distributions are almost identical to the ones reported in Figure 9. S&R selects about the same number of minorities across all numbers of challenges, but excludes jurors below the 10th percentile less often, by about 7 percentage points.

This analysis suggests that the data is consistent with the parties believing in a distribution that makes the two procedures significantly different in their ability to exclude extreme jurors. The data is also consistent with beliefs in sizeable heterogeneity between juror-groups which, in turn, implies that the procedures also differ in their ability to select minorities.
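The two moments used for estimation, the share of whites among jurors struck by the defendant and among jurors struck by the state, can be simulated for any candidate pair of beta distributions. The sketch below is a hedged illustration only: it assumes that under STR the state strikes the p lowest and the defendant the d highest conviction probabilities in a panel of j + d + p, and the value Beta(5, 2) used for white jurors is a hypothetical placeholder, not the paper's estimate. The function name `strike_shares` is likewise hypothetical.

```python
import random


def strike_shares(share_white, f_white, f_black, j=12, d=6, p=6,
                  n_sims=20_000, seed=0):
    """Simulate the share of whites among defense and state strikes under STR.

    Assumption: the state (plaintiff) strikes the p lowest conviction
    probabilities in the panel and the defendant strikes the d highest.
    Returns (white share of defense strikes, white share of state strikes).
    """
    rng = random.Random(seed)
    white_def = white_state = 0
    for _ in range(n_sims):
        panel = []
        for _ in range(j + d + p):
            white = rng.random() < share_white
            c = rng.betavariate(*(f_white if white else f_black))
            panel.append((c, white))
        panel.sort()
        white_state += sum(w for _, w in panel[:p])   # lowest p: state strikes
        white_def += sum(w for _, w in panel[-d:])    # highest d: defense strikes
    return white_def / (n_sims * d), white_state / (n_sims * p)


# Pool share of whites from sample (5); Beta(5, 2) for whites is hypothetical.
def_share, state_share = strike_shares(0.659, f_white=(5, 2), f_black=(2, 4))
print(f"whites among defense strikes: {def_share:.2f}")
print(f"whites among state strikes:   {state_share:.2f}")
```

An estimation routine along these lines would search over the two parameters of the white distribution until the simulated shares match the observed 93% and 22%; the search itself is omitted here.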

The primary purpose of jury selection is to prevent extreme potential jurors from serving onthe eﬀective jury (see Footnote 1 and its associated quote). In our model, it seems naturalto interpret this goal as that of limiting the selection of jurors coming from the tail of thedistribution. This is the interpretation of extreme that we have studied thus far.Although it is perhaps less clear that it aligns with the goals of practitioners, anotherapproach could be to consider the extremism of juries as a whole . For example, extremejuries could be viewed as juries in which the juror with the highest or lowest convictionprobability is extreme. Through variants of the arguments in the proofs of Propositions 1and 2, one can show that, in that sense too,

STR is more eﬀective than both

S&R and

RAN at excluding extreme juries. Another measure of juries’ extremism, proposed by Flanagan (2015), is whether a juryis excessively “unbalanced” in the sense of featuring a disproportionate proportion of ju-rors coming from one side of the median of C . Interestingly, Flanagan shows that STR introduces correlation between the selected jurors, which leads the procedure to select moreunbalanced juries than

RAN . Even though panels are the result of independent draws fromthe population, jurors selected under

STR have conviction probabilities between that of Speciﬁcally, for any x ∈ { , . . . , j − } , there exists c > c <

1, such that (a) for every c ∈ (0 , c ),the probability that the lowest conviction-probability in the jury is smaller than c is larger under S&R and

RAN than under

STR , and (b) for every c ∈ (¯ c, c is larger under S&R and

RAN than under

STR . .

25 and 0 .

75 indicates that challenges were used on jurors withconviction probabilities outside the [0 . , .

75] range. The latter makes it more likely that

STR selected additional jurors in the [0 . , . d = p ), the probability that all selected jurors come from one side of the median is larger under STR than under

RAN .Our next proposition generalizes this result. Using a new proof technique, we show that for any x larger than half the jury-size, the probability of selecting at least x jurors from oneside of the median is larger under STR than under

RAN . Similar to Section 4, we focus forbrevity on the probability that the selected jurors are below the median. All our results arehowever symmetrical and apply identically to the probability of selecting jurors above themedian. Let med [ C ] denote the median of C . Proposition 5. If d = p , then for any x ∈ { n/ , . . . , n } if n is even, and any x ∈{ n/ . , . . . , n } if n is odd, we have T STR (cid:0) x ; med [ C ] (cid:1) > T RAN (cid:0) x ; med [ C ] (cid:1) . Figure 10 illustrates Proposition 5 and that a similar statement does not hold for

S&R. For $M \in \{STR, RAN\}$, the value of $T_M(x; \mathrm{med}[C])$ can be computed analytically and does not depend on the distribution of C. For $M = S\&R$, the value of $T_M(x; \mathrm{med}[C])$ depends on the distribution in a complex fashion, and it is not possible to generally compare S&R with the two other procedures in terms of $T_M(x; \mathrm{med}[C])$. As the figure illustrates, the probability of selecting at least x jurors below $\mathrm{med}[C]$ can, in some cases (in the figure, x = 7 and, barely, x = 8 jurors), be larger under S&R than under both RAN and STR. In other cases, however, the same probability is lower under S&R than under both RAN and STR.

Figure 10 displays the result of simulations when the distribution of C is highly polarized (a mixture of Beta(1, 5) and

Beta (5 , S&R to more often select a majority of jurors below the medianthan

STR . Also, for lower levels of polarization,

S&R more often selects fewer juries made of a majority of jurors below the median than RAN.

[Footnote 36: Specifically, $T_{RAN}(x; \mathrm{med}[C]) = P(Bi[j, 0.5] \geq x)$ whereas $T_{STR}(x; \mathrm{med}[C]) = P(Bi[j + d + p, 0.5] \geq x + p)$.]

[Footnote: Because the parties' actions under S&R are influenced by the mean of the distribution but not in any clear way by the median (and because of the complexity of the game tree), we were unable to formalize the effect of polarization on these comparisons in terms of the model parameters.]

Figure 10: Selection of jurors below the median
[Plot not reproduced. Vertical axis: fraction of juries (difference with RAN). Series: STR, and S&R for three values of r.]
Note: Fraction of juries with at least a given number of jurors below the median of C under STR (green dashed line) and S&R (continuous lines) relative to the same fraction under RAN (i.e., $T_M(x; \mathrm{med}[C]) - T_{RAN}(x; \mathrm{med}[C])$). Throughout, we fix j = 12, d = p = 6 and $C \sim r \cdot Beta(1, 5) + (1 - r) \cdot Beta(5, 1)$ (for three values of r), whereas the number of jurors below the median is on the horizontal axis. For each set of parameters, results for S&R are averages across 50,000 simulated jury selections, whereas values for RAN and STR are computed analytically and are independent of r (see Footnote 36).

Concerns about the effect of jury selection on group-representation often focus on the representation of racial minorities. Though the U.S. Supreme Court initially banned challenges based on race in Batson v. Kentucky (1986), it later also banned challenges based on gender in J.E.B. v. Alabama (1994). In this context, it is natural to ask whether the advantages of S&R in terms of minority representation come at the cost of a worse representation of gender groups.

Unlike minorities, which correspond to groups of unequal sizes represented by small r, gender-groups can be thought of as even-sized groups and are better modeled using r ≈ 0.5. With groups of similar sizes, both procedures almost always select at least a few members from either group. It is therefore more interesting to compare procedures in terms of the proportion of group-a jurors they select (rather than in terms of the probability of selecting at least x members from group-a, as we did before).

In this last section, we let r = 0.5 and consider the proportions of group-a jurors selected under STR and S&R. We denote these proportions $r_{STR}$ and $r_{S\&R}$ and focus on how close $r_{STR}$ and $r_{S\&R}$ are to the 50% of group-a jurors that prevails in the population.

As in the last two sections, it is not possible to generally compare

STR and

S&R interms of the procedures’ ability to select an even proportion of group- a and group- b jurors.In some cases, r STR can be further away from 50% than r S&R , and the converse may betrue in other cases. For example, with d = p = 6 and j = 12, if C a ∼ U [0 ,

1] and C b ∼ Beta (1 , r STR = 43 .

7% whereas r S&R = 45 . C a ∼ Beta [4 ,

2] and C b ∼ Beta (1 , r STR = 50 .

3% whereas r S&R = 52 . r STR get closer to 50% . Our next proposition conﬁrms this pattern. If thegroup-distributions are symmetric or if they do not overlap, and if d = p , then r STR = 50%whereas

S&R does not necessarily select an even proportion of jurors from each group. The latter follows from the fact that, even when r = 50% and distributions are symmetrical, the multiplicative utility function that the parties use to assess the value of a jury (which is itself a consequence of the fact that juries must reach unanimous decisions) creates asymmetries in the use of challenges under S&R. We say that random variables $C_a$ and $C_b$ are symmetric if $f_a(c) = f_b(1 - c)$ for all $c \in [0, 1]$.

[Footnote: Flanagan (2015) shows that, in this symmetrical case, the asymmetry of the payoffs still forces the defendant to be more conservative than the plaintiff when using its challenges, hence leading to an uneven selection of jurors from the two groups.]

Proposition 6. Suppose that r = 0.5 and d = p. If (a) the two group distributions do not overlap, or (b) $C_a$ and $C_b$ are symmetric, then $r_{STR} = r_{RAN}$.

[Footnote: Previous results are stronger in the sense that they establish a first-order stochastic dominance between the number of jurors with certain characteristics (extremism or group-membership) selected under STR and S&R. As we explain after Proposition 1, showing, for example, that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $x \in \{1, \dots, j\}$ directly implies that the expected proportion of selected jurors with conviction probability $c_i < c$ is lower under STR than under S&R (whereas the converse is not true).]

Table 3(a) illustrates Proposition 6 and the fact that a similar statement does not hold for S&R. Unlike STR, S&R can select unequal numbers of group-a and group-b jurors even when distributions are symmetrical across groups. Therefore, as a consequence of Proposition 6, $r_{S\&R}$ can in these cases be further away than $r_{STR}$ from the 50% of group-a jurors that prevails in the population.

Table 3(a) however suggests that these differences may be quantitatively small, and that sizable differences may require high levels of polarization between groups. Tables 3(b) and 3(c) also report the results of simulations in which the symmetries required for Proposition 6 to hold are slightly relaxed. These indicate that the advantage of STR in the representation of balanced groups established in Proposition 6 (i.e., the fact that $r_{STR}$ is closer to 50% than $r_{S\&R}$) may not be robust to even mild relaxations of these symmetries. In particular, when r = 0 . r STR is consistently closer than r S&R to the 55% of group-a that prevail in the population (see Table 1). Also, when r = 0 . r S&R are identical except in the most polarized case.
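The balanced-group property of STR in Proposition 6(b) can be illustrated with a small Monte Carlo sketch. It rests on assumptions that are not taken from the paper's own code: the panel is assumed to contain exactly j + d + p jurors, the plaintiff is assumed to strike the p lowest conviction probabilities and the defendant the d highest, and the group distributions Beta(1, 5) and Beta(5, 1) are an illustrative symmetric pair. The function names are hypothetical.

```python
import random


def struck_jury_group_a_count(j, d, p, r, draw_a, draw_b, rng):
    """Simulate one Struck (STR) selection and count group-a jurors in the jury.

    Assumption: the panel has j + d + p members; the plaintiff removes the p
    lowest conviction probabilities and the defendant removes the d highest.
    """
    panel_size = j + d + p
    panel = []
    for _ in range(panel_size):
        if rng.random() < r:
            panel.append((draw_a(rng), "a"))
        else:
            panel.append((draw_b(rng), "b"))
    panel.sort(key=lambda juror: juror[0])
    jury = panel[p:panel_size - d]  # the j jurors left after all strikes
    return sum(1 for _, group in jury if group == "a")


def mean_group_a_share(n_sims=20000, seed=0):
    """Average share of group-a jurors under STR with r = 0.5 and d = p."""
    rng = random.Random(seed)
    j, d, p, r = 12, 6, 6, 0.5

    def draw_a(g):
        return g.betavariate(1, 5)  # illustrative: group a leans toward acquittal

    def draw_b(g):
        return g.betavariate(5, 1)  # mirror image of group a around 0.5

    total = sum(
        struck_jury_group_a_count(j, d, p, r, draw_a, draw_b, rng)
        for _ in range(n_sims)
    )
    return total / (n_sims * j)


if __name__ == "__main__":
    print(round(mean_group_a_share(), 3))
```

Under these assumptions the simulated share should sit close to 0.5, in line with the STR columns of Table 3(a). Simulating S&R would additionally require the equilibrium challenge thresholds, which this sketch does not attempt.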

In this paper, we study the relative performance of two stylized jury-selection procedures. Strike and Replace presents potential jurors one-by-one to the parties, whereas the Struck procedure presents all potential jurors before they exercise vetoes. When jurors differ in their probability of voting for the defendant's conviction, and in group membership, we show that, when groups have polarized views, Struck is more effective at excluding jurors with extreme views but generally selects fewer members of a minority group than Strike and Replace, leading to a conflict between these two goals.

Sociologists Small and Pager (2020) argue that systemic factors may lead to disparate outcomes even in the absence of taste-based or statistical discrimination, the traditional explanations provided in Economic Theory. This paper formalizes an example in which the pursuit of one legitimate objective (preventing extreme jurors from serving on juries) may lead to group disparities.

[Footnote to Proposition 6(a): That is, either $P(C_a > C_b) = 0$ or $P(C_b > C_a) = 0$. The same result would apply if the two distributions did not overlap in the limit, as in Proposition 3.]

Table 3: Representation of Group-a jurors with balanced group sizes

Polarization                     Extreme        Moderate       Mild           (All)
Procedure                        S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities   0.48   0.50    0.49   0.50    0.50   0.50    0.50
Standard deviation               0.18   0.20    0.16   0.17    0.15   0.15    0.14
(a) Group-a proportion r = 0.5, group distributions as in Figure 3.

Polarization                     Extreme        Moderate       Mild           (All)
Procedure                        S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities   0.39   0.40    0.42   0.42    0.45   0.44    0.45
Standard deviation               0.18   0.20    0.16   0.17    0.15   0.15    0.14
(b) Group-a proportion r = 0 . , group distributions as in Figure 3.

Polarization                     Extreme∗       Moderate∗      Mild∗          (All)
Procedure                        S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities   0.47   0.50    0.49   0.48    0.49   0.48    0.50
Standard deviation               0.18   0.20    0.15   0.16    0.15   0.16    0.14
(c) Group-a proportion r = 0 . , group distributions slightly asymmetric∗

∗ In panel (c), Extreme∗ corresponds to C_a ∼ Beta(1, 5) and C_b ∼ Beta(5, ), Moderate∗ to C_a ∼ Beta(2, ) and C_b ∼ Beta(4, ), and Mild∗ to C_a ∼ Beta(3, 4) and C_b ∼ Beta(4, ).

Note: The rows report the average number and standard deviation of group-a jury members out of 50,000 simulations of jury selection with parameters j = 12 and d = p = 6.

Appendix: Proofs

A.1 Preliminary technical results

A.1.1 Limit of a ratio of binomial probabilities

Lemma 1. For all $\eta \in \mathbb{N}$ and any $k \in \{1, \dots, \eta - 1\}$,
$$\lim_{\pi \to 0} \frac{P[Bi(\eta, \pi) = k]}{P[Bi(\eta, \pi) > k]} = \infty.$$

Proof. Using the standard formula for the probability mass function of a binomial and the representation of the c.d.f. of the binomial with the regularized incomplete beta function, we can rewrite the ratio as
$$\frac{P[Bi(\eta, \pi) = k]}{1 - P[Bi(\eta, \pi) \leq k]} = \frac{\binom{\eta}{k} \pi^{k} (1 - \pi)^{\eta - k}}{1 - (\eta - k) \binom{\eta}{k} \int_0^{1 - \pi} x^{\eta - k - 1} (1 - x)^{k} \, dx}. \tag{1}$$
As $\pi \to 0$, both the numerator and the denominator tend to 0. We use L'Hôpital's rule to complete the proof:
$$\frac{(\partial / \partial \pi) \binom{\eta}{k} \pi^{k} (1 - \pi)^{\eta - k}}{(\partial / \partial \pi) \left( 1 - \left[ (\eta - k) \binom{\eta}{k} \int_0^{1 - \pi} x^{\eta - k - 1} (1 - x)^{k} \, dx \right] \right)} = \frac{\binom{\eta}{k} \left[ k \pi^{k - 1} (1 - \pi)^{\eta - k} - \pi^{k} (\eta - k) (1 - \pi)^{\eta - k - 1} \right]}{-(\eta - k) \binom{\eta}{k} \left[ (-1) \cdot (1 - \pi)^{\eta - k - 1} \pi^{k} \right]}$$
$$= \frac{k \pi^{k - 1} (1 - \pi)^{\eta - k}}{(\eta - k) (1 - \pi)^{\eta - k - 1} \pi^{k}} - \frac{\pi^{k} (\eta - k) (1 - \pi)^{\eta - k - 1}}{(\eta - k) (1 - \pi)^{\eta - k - 1} \pi^{k}} = \frac{k (1 - \pi)}{(\eta - k) \pi} - 1 \xrightarrow[\pi \to 0]{} \infty. \qquad \blacksquare$$

A.1.2 Continuity of challenge thresholds in

S&R as $C^i$ converges in distribution

Lemma 2. Consider a sequence of random variables $\{C^i\}_{i=1}^{\infty}$ that converges in distribution to some random variable $C^*$. Let $t_I(\gamma, C^i)$ denote the challenge threshold used by party $I \in \{D, P\}$ in an arbitrary subgame $\gamma$ of S&R when the distribution of conviction probabilities is $C^i$. For any such subgame $\gamma$, we have $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C^*)$.

Proof. In any subgame $\tilde{\gamma}$, $t_I(\tilde{\gamma}, C^i)$ is the ratio of the values of the continuation subgames if I challenges the presented juror, or if both parties abstain from challenging (Brams and Davis, 1978). Therefore, $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C^*)$ follows directly if we show that the value of any subgame, which we denote $V(\gamma, C^i)$, converges to $V(\gamma, C^*)$ as i tends to infinity.

[Footnote: Because we assume that all distributions of conviction probabilities are continuous, there are no issues related to the possibility for the bottom of one of these ratios to converge to zero.]

The latter follows directly from the recursive characterization of $V(\gamma, C^i)$ in Brams and Davis (1978). Recall that each subgame $\gamma$ can be characterized by the number of jurors $\kappa$ that remain to be selected, the number of challenges left to the defendant $\delta$, and the number of challenges left to the plaintiff $\pi$. With this notation, the recursive proof that for all $\kappa, \delta, \pi \geq 0$, $V([\kappa, \delta, \pi], C^i)$ converges to $V([\kappa, \delta, \pi], C^*)$ as i tends to infinity can be decomposed in a number of cases. Let $F^i(c)$ denote the c.d.f. of $C^i$, $F^*(c)$ the c.d.f. of $C^*$, and $F(c)$ the c.d.f. of an arbitrary distribution C, with $\mu^i$, $\mu^*$, and $\mu$ being the corresponding expected values. In each step, the initial formula for $V([\kappa, \delta, \pi], C)$ is taken from Brams and Davis (1978).

Case 1: $\kappa = 0$, $\delta \geq 0$, $\pi \geq 0$. In this case, $V([0, \delta, \pi], C) = 1$ for all C and the convergence of $V([0, \delta, \pi], C^i)$ to $V([0, \delta, \pi], C^*)$ follows trivially.

Case 2: $\kappa > 0$, $\delta = 0$, $\pi = 0$. In this case, $V([\kappa, 0, 0], C) = \mu^{\kappa}$ for all C and the convergence of $V([\kappa, 0, 0], C^i)$ to $V([\kappa, 0, 0], C^*)$ follows from the fact that $C^i$ converges in distribution to $C^*$.

Case 3: $\kappa > 0$, $\delta = 0$, $\pi > 0$. In this case, for all C,
$$V([\kappa, 0, \pi], C) = V([\kappa - 1, 0, \pi], C) \cdot \left[ 1 - \int_{t_P([\kappa, 0, \pi], C)}^{1} F(c) \, dc \right],$$
and $t_P([\kappa, 0, \pi], C) = V([\kappa, 0, \pi - 1], C) / V([\kappa - 1, 0, \pi], C)$. The convergence of $V([\kappa, 0, \pi], C^i)$ to $V([\kappa, 0, \pi], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$.

Case 4: $\kappa > 0$, $\delta > 0$, $\pi = 0$. In this case, for all C,
$$V([\kappa, \delta, 0], C) = V([\kappa, \delta - 1, 0], C) - V([\kappa - 1, \delta, 0], C) \cdot \int_{0}^{t_D([\kappa, \delta, 0], C)} F(c) \, dc,$$
where $t_D([\kappa, \delta, 0], C) = V([\kappa, \delta - 1, 0], C) / V([\kappa - 1, \delta, 0], C)$. The convergence of $V([\kappa, \delta, 0], C^i)$ to $V([\kappa, \delta, 0], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$.

Case 5: $\kappa > 0$, $\delta > 0$, $\pi > 0$. In this case, for all C,
$$V([\kappa, \delta, \pi], C) = V([\kappa, \delta - 1, \pi], C) - V([\kappa - 1, \delta, \pi], C) \cdot \int_{t_P([\kappa, \delta, \pi], C)}^{t_D([\kappa, \delta, \pi], C)} F(c) \, dc,$$
where $t_D([\kappa, \delta, \pi], C) = V([\kappa, \delta - 1, \pi], C) / V([\kappa - 1, \delta, \pi], C)$ and $t_P([\kappa, \delta, \pi], C) = V([\kappa, \delta, \pi - 1], C) / V([\kappa - 1, \delta, \pi], C)$. The convergence of $V([\kappa, \delta, \pi], C^i)$ to $V([\kappa, \delta, \pi], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$. $\blacksquare$

A.1.3 Comparative statics of probabilities from a symmetric binomial

Lemma 3. $P[Bi(\eta + 2, 0.5) \geq k + 1] > P[Bi(\eta, 0.5) \geq k]$ if and only if $k > \frac{\eta}{2} + \frac{1}{2}$.

Proof. We can decompose $P[Bi(\eta + 2, 0.5) \geq k + 1]$ in terms of $Bi(\eta, 0.5)$ and $Bi(2, 0.5)$:
$$P[Bi(\eta + 2, 0.5) \geq k + 1] = P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k] \cdot P[Bi(2, 0.5) \geq 1] + P[Bi(\eta, 0.5) = k - 1] \cdot P[Bi(2, 0.5) = 2]$$
$$= P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k] \cdot 0.75 + P[Bi(\eta, 0.5) = k - 1] \cdot 0.25.$$
We also have
$$P[Bi(\eta, 0.5) \geq k] = P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k].$$
Together, the last two equalities imply that $P[Bi(\eta + 2, 0.5) \geq k + 1] > P[Bi(\eta, 0.5) \geq k]$ if and only if
$$P[Bi(\eta, 0.5) = k] \cdot 0.75 + P[Bi(\eta, 0.5) = k - 1] \cdot 0.25 > P[Bi(\eta, 0.5) = k]$$
$$P[Bi(\eta, 0.5) = k - 1] \cdot 0.25 > P[Bi(\eta, 0.5) = k] \cdot 0.25$$
$$P[Bi(\eta, 0.5) = k - 1] > P[Bi(\eta, 0.5) = k]$$
$$\binom{\eta}{k - 1} 0.5^{k - 1} \, 0.5^{\eta - (k - 1)} > \binom{\eta}{k} 0.5^{k} \, 0.5^{\eta - k}$$
$$\frac{\eta!}{(\eta - [k - 1])! \, (k - 1)!} > \frac{\eta!}{(\eta - k)! \, k!}$$
$$\frac{(\eta - k)!}{(\eta - [k - 1])!} > \frac{(k - 1)!}{k!}$$
$$\frac{1}{\eta - k + 1} > \frac{1}{k}$$
$$k > \frac{\eta}{2} + \frac{1}{2}. \qquad \blacksquare$$

A.1.4 Relationship between order statistics of symmetric distributions

For any number of draws w and any $k \leq w$, let $C_g^{k,w}$ denote the k-th order statistic out of w draws from distribution $C_g$, and $f_g^{k,w}(x)$ the corresponding probability density function.

Lemma 4. Suppose that $C_a$ and $C_b$ are symmetric. Then, for any $w \in \mathbb{N}$ and any $k \in \{1, \dots, w\}$, we have $f_a^{k,w}(c) = f_b^{w - k + 1, w}(1 - c)$ for all $c \in [0, 1]$.

Proof. Recall that, by definition, $C_a$ and $C_b$ being symmetric implies $f_a(c) = f_b(1 - c)$ for all $c \in [0, 1]$, and hence $F_a(c) = 1 - F_b(1 - c)$ for all $c \in [0, 1]$. Therefore,
$$f_a^{k,w}(c) = k \binom{w}{k} f_a(c) [F_a(c)]^{k - 1} [1 - F_a(c)]^{w - k}$$
$$= k \binom{w}{k} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [1 - (1 - F_b(1 - c))]^{w - k}$$
$$= k \frac{w!}{(w - k)! \, k!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \frac{w!}{(w - k + 1)! \, (k - 1)!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \frac{w!}{(w - k + 1)! \, (w - (w - k + 1))!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \binom{w}{w - k + 1} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= f_b^{w - k + 1, w}(1 - c). \qquad \blacksquare$$

A.2 Section 4: Effectiveness at excluding extremes

A.2.1 Proof of Proposition 1

Consider an arbitrary $c \in (0, 1)$ and let us refer to jurors with conviction probability no larger than c as extreme jurors. Let $T_M(x; c \mid k)$ denote the probability that at least x extreme jurors are selected by procedure M conditional on there being exactly k extreme jurors in the panel of n. By the Law of Total Probability,
$$T_M(x; c) = \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] T_M(x; c \mid k). \tag{2}$$

Consider first the STR procedure. Note that for all c, we have $T_{STR}(x; c \mid x) = 0$ because if there are exactly x extreme jurors in the panel, one of them is necessarily challenged by the plaintiff under STR (recall that $p \geq 1$). Therefore,
$$T_{STR}(x; c) = \sum_{k = x + 1}^{n} P\left[ Bi(n, F(c)) = k \right] T_{STR}(x; c \mid k) \leq P\left[ Bi(n, F(c)) > x \right], \tag{3}$$
where the last inequality follows from the fact that $T_{STR}(x; c \mid k) \in [0, 1]$ for all k (as $T_{STR}(x; c \mid k)$ is a probability).

Next, consider procedure S&R. Our goal is to construct a lower bound for the probability of selecting an extreme juror and show that, as $c \to 0$, this lower bound does not converge to 0 as fast as (3). To do so, we introduce a decreasing function $\sigma(c) > 0$ such that, when c is sufficiently small, $T_{S\&R}(x; c \mid k) \geq \sigma(c)$ for any $k \geq x$. To construct $\sigma$, consider the restricted sample space in which there are k extreme jurors in the panel.

Let $t_P$ be the lowest challenge threshold used by the plaintiff in any subgame of the S&R procedure. Clearly, $t_P > 0$. Henceforth, we focus on $c \in (0, t_P)$.

[Footnote: Formally, if $\Gamma$ denotes the set of subgames of S&R and $t_P(\gamma)$ the plaintiff's challenge threshold in any subgame $\gamma \in \Gamma$, then $t_P = \min_{\gamma \in \Gamma} t_P(\gamma)$ (the minimum is well-defined since $\Gamma$ is of finite size). In any subgame $\gamma$ of S&R, there is always a conviction probability $c > 0$ such that, if the juror presented in $\gamma$ is of type c, the plaintiff will challenge that juror. Therefore, $t_P > 0$.]

We first consider the function $\alpha(c)$ defined as the probability that $c_j \in (c, t_P)$ for all the $(n - k)$ non-extreme jurors in the panel. Because C is continuous and 0 is the lower bound of its support, there exists $y > 0$ such that $\alpha(c) > 0$ for all $c \in [0, y]$. Also, $\alpha(c)$ is weakly decreasing in c.

[Footnote: By definition of the support, because 0 is the lower bound of the support, $P(C \in [0, \epsilon]) > 0$ for all $\epsilon > 0$. Because C is continuous, there must therefore exist some $\delta > 0$ such that $P(C \in [\delta/2, \delta]) > 0$. We then have $\alpha(c) > 0$ for all $c < \delta$.]

By construction of $t_P$, for such panels (with k extreme jurors and $c_j \in (c, t_P)$ for all the $(n - k)$ non-extreme jurors), the plaintiff uses all its challenges on the p first jurors it is presented with, and the defendant never uses any challenges. Therefore, for these panels, the probability that all k extreme jurors are selected is the probability that none of these jurors are among the p first jurors presented to the parties, i.e., $\binom{n - p}{k} / \binom{n}{k}$. Overall, for $c \in (0, t_P)$, we have $T_{S\&R}(x; c \mid k) \geq \alpha(c) \cdot \binom{n - p}{k} / \binom{n}{k}$, and $\sigma(c) := \alpha(c) \cdot \binom{n - p}{k} / \binom{n}{k}$ has the desired property.

[Footnote: The latter follows from the fact that, in any subgame, the threshold used by the defendant is always higher than the threshold used by the plaintiff (in equilibrium, the defendant and the plaintiff never both want to challenge the presented juror).]

Applying $T_{S\&R}(x; c \mid k) \geq \sigma(c)$ to (2) with M = S&R, we obtain for all c sufficiently small (specifically $c \in (0, t_P)$)
$$T_{S\&R}(x; c) \geq \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] \cdot \sigma(c) \geq P\left[ Bi(n, F(c)) = x \right] \cdot \sigma(c). \tag{4}$$
Overall, combining (3) and (4) yields
$$\lim_{c \to 0} \frac{T_{S\&R}(x; c)}{T_{STR}(x; c)} \geq \lim_{c \to 0} \frac{P\left[ Bi(n, F(c)) = x \right] \cdot \sigma(c)}{P\left[ Bi(n, F(c)) > x \right]} = \infty, \tag{5}$$
where the last equality follows from Lemma 1 and the fact that $\sigma(c) > 0$ for c sufficiently small. In turn, $\lim_{c \to 0} T_{S\&R}(x; c) / T_{STR}(x; c) = \infty$ and the fact that $\lim_{c \to 0} T_{S\&R}(x; c) = \lim_{c \to 0} T_{STR}(x; c) = 0$ together imply that there exists some $\bar{c} > 0$ such that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $c \in (0, \bar{c})$.

A.2.2 Proof of Proposition 2

Using the same notation as in the proof of Proposition 1, we have
$$T_{RAN}(x; c) \geq P\left[ Bi(n, F(c)) = x \right] \cdot T_{RAN}(x; c \mid x). \tag{6}$$
Note that $T_{RAN}(x; c \mid x)$ is the probability that a Hypergeometric random variable with x successes, $n - x$ failures, and j draws results in the draw of exactly x successes. Therefore, $T_{RAN}(x; c \mid x) > 0$. Finally, combining (6) and (3) yields
$$\lim_{c \to 0} \frac{T_{RAN}(x; c)}{T_{STR}(x; c)} \geq \lim_{c \to 0} \frac{P\left[ Bi(n, F(c)) = x \right] \cdot T_{RAN}(x; c \mid x)}{P\left[ Bi(n, F(c)) > x \right]} = \infty,$$
where the last equality follows from Lemma 1 and the fact that $T_{RAN}(x; c \mid x) > 0$. The result then follows as in the proof of Proposition 1.

[Footnote: To apply Lemma 1, note that because C is continuous and the lower bound of the support of C is 0, we have $F(c) > 0$ for all $c > 0$ and $\lim_{c \to 0} F(c) = 0$.]

A.3 Section 5: Representation of minorities

A.3.1 Proof of Proposition 3

The structure of the proof is similar to that of the previous propositions. We focus on the case we analyzed in the main paper, where the minority uniformly favors the defendant, i.e., $\lim_{i \to \infty} P(C_a^i > C_b^i) = 0$. The proof for the other case is symmetrical.

For now, consider arbitrary $C_a^i$, $C_b^i$, and $r^i$. Similar to the previous proofs, for any triple $(C_a^i, C_b^i, r^i)$, we first decompose $A_{STR}^i(x)$ and $A_{S\&R}^i(x)$ by conditioning on the number of minority jurors in the panel.

First, consider STR and let us decompose $A_{STR}^i(x)$ conditional, on the one hand, on the panel containing more than x minority jurors, which occurs with probability $P[Bi(n, r^i) > x]$, and on the other, on the panel containing exactly x minority jurors, which occurs with probability $P[Bi(n, r^i) = x]$. In the first case (i.e., more than x minority jurors in the panel), the probability that the panel contains at least x minority jurors is an upper bound on the probability that STR selects them. In the second case (i.e., exactly x minority jurors in the panel), STR selects at least x minority jurors provided that none of the minority jurors in the panel are challenged. This occurs with a probability no larger than the probability that the lowest conviction probability among minorities is larger than the p-th conviction probability among majority jurors (since the latter is required for the plaintiff not to challenge any of the minority jurors in the panel). Recall that for any number of draws w and any $k \leq w$, we let $C_g^{k,w}$ denote the k-th order statistic out of w draws from group $g \in \{a, b\}$. With this notation, we therefore have
$$A_{STR}^i(x) \leq P[Bi(n, r^i) > x] + P[Bi(n, r^i) = x] \cdot P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right). \tag{7}$$
Note that because $\lim_{i \to \infty} P(C_a^i > C_b^i) = 0$, we have $\lim_{i \to \infty} P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right) = 0$.

Second, consider S&R. Clearly, $A_{S\&R}^i(x)$ is no smaller than the probability for S&R to select at least x minority jurors when there are exactly x minority jurors in the panel. The latter is equal to $P[Bi(n, r^i) = x] \cdot \sigma(x; r^i, C_a^i, C_b^i)$, where $\sigma(x; r^i, C_a^i, C_b^i)$ denotes the probability that S&R selects x minority jurors conditional on having x minority jurors in the panel, as a function of $r^i$, $C_a^i$, and $C_b^i$. In summary, with this notation, we have
$$A_{S\&R}^i(x) \geq P[Bi(n, r^i) = x] \cdot \sigma(x; r^i, C_a^i, C_b^i). \tag{8}$$

We now show that $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$. For all $i \in \mathbb{N}$, let $C^i = r^i C_a^i + (1 - r^i) C_b^i$. Observe that because $\lim_{i \to \infty} r^i = 0$ and because $C_b^i$ converges in distribution to $C_b^*$, $C^i$ converges in distribution to $C_b^*$. By Lemma 2, this implies that for any subgame $\gamma$ of S&R and both $I \in \{D, P\}$, we have $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C_b^*)$. Note that $t_I(\gamma, C_b^*)$ lies in the interior of the support of $C_b^*$ for both $I \in \{D, P\}$. Also recall that in the limit, the supports of $C_a^i$ and $C_b^i$ do not overlap as we have $P(C_a^* > C_b^*) = 0$. Therefore, in the limit, the defendant never challenges a minority juror, which in turn implies that

(a) as i tends to infinity, the probability that the defendant challenges one of the x minority jurors in the panel tends to zero.

Because $t_I(\gamma, C_b^*)$ lies in the interior of the support of $C_b^*$ for both $I \in \{D, P\}$, there is also a range of conviction probabilities $[\underline{c}, \bar{c}]$ low enough inside the support of $C_b^*$ such that $P(C_b^* \in [\underline{c}, \bar{c}]) > 0$ and P challenges the juror presented in subgame $\gamma$ if her conviction probability lies within $[\underline{c}, \bar{c}]$. Furthermore, the probability that a juror with conviction probability in $[\underline{c}, \bar{c}]$ is a majority juror is strictly positive (and tends to one as i tends to infinity). Overall, in the limit,

(b) the probability that the plaintiff challenges a majority juror presented in subgame $\gamma$ is strictly positive.

Combining (a) and (b), in the limit and given a panel containing x minority jurors, there is a positive probability that p majority jurors are presented first, are all challenged by P, and are followed by the x minority jurors, which are left unchallenged by the parties (resulting in a jury composed of at least x minority jurors). That is, $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$.

Combining (7) and (8), we obtain
$$\lim_{i \to \infty} \frac{A_{STR}^i(x)}{A_{S\&R}^i(x)} \leq \lim_{i \to \infty} \frac{P[Bi(n, r^i) > x] + P[Bi(n, r^i) = x] \cdot P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{P[Bi(n, r^i) = x] \cdot \sigma(x; r^i, C_a^i, C_b^i)}$$
$$= \lim_{i \to \infty} \left[ \frac{P[Bi(n, r^i) > x]}{P[Bi(n, r^i) = x] \cdot \sigma(x; r^i, C_a^i, C_b^i)} + \frac{P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{\sigma(x; r^i, C_a^i, C_b^i)} \right]$$
$$= \underbrace{\lim_{i \to \infty} \frac{P[Bi(n, r^i) > x]}{P[Bi(n, r^i) = x]}}_{= 0 \text{ by Lemma 1}} \cdot \underbrace{\lim_{i \to \infty} \frac{1}{\sigma(x; r^i, C_a^i, C_b^i)}}_{< \infty \text{ since } \lim_{i \to \infty} \sigma > 0} + \underbrace{\lim_{i \to \infty} \frac{P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{\sigma(x; r^i, C_a^i, C_b^i)}}_{= 0} = 0.$$
The last term vanishes because $\lim_{i \to \infty} P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right) = 0$ and $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$.

In turn, $\lim_{i \to \infty} A_{STR}^i(x) / A_{S\&R}^i(x) \leq 0$ and $\lim_{i \to \infty} A_{STR}^i(x) = \lim_{i \to \infty} A_{S\&R}^i(x) = 0$ together imply that there exists some j sufficiently large such that $A_{S\&R}^i(x) > A_{STR}^i(x)$ for all $i > j$.

A.4 Section 2: Changing the number of challenges

A.4.1 Proof of Proposition 4

The structure of the proof is similar to that of the previous propositions. Observe that (3) and (4) are true regardless of the number of challenges awarded to the parties in STR or S&R. That is, by the same arguments as in the proof of Proposition 1, the following two inequalities hold regardless of the values of w, y, $A_{STR\text{-}w}(x)$, or $A_{S\&R\text{-}y}(x)$:
$$T_{STR\text{-}w}(x; c) = \sum_{k = x + 1}^{n} P\left[ Bi(n, F(c)) = k \right] T_{STR\text{-}w}(x; c \mid k) \leq P\left[ Bi(n, F(c)) > x \right],$$
$$T_{S\&R\text{-}y}(x; c) \geq \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] \cdot \sigma(c) \geq P\left[ Bi(n, F(c)) = x \right] \cdot \sigma(c). \tag{9}$$
The proposition then follows from the same argument as in the proof of Proposition 1 (in particular, see (5)). Recall that the proposition assumes $w, y \geq 1$.

A.5 Section 8: Extensions: Unbalanced juries and representation of balanced groups

A.5.1 Proof of Proposition 5
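As a numerical companion to Proposition 5, the two closed forms $T_{RAN}(x; \mathrm{med}[C]) = P[Bi(j, 0.5) \geq x]$ and $T_{STR}(x; \mathrm{med}[C]) = P[Bi(j + d + p, 0.5) \geq x + p]$ can be evaluated directly. The sketch below assumes d = p, as the proposition does; the function names are illustrative and not part of the paper.

```python
from math import comb


def binom_tail(n, k):
    """Return P[Bi(n, 0.5) >= k] for a symmetric binomial."""
    k = max(k, 0)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def t_ran(x, j):
    """Probability that a random jury of size j has at least x jurors on one side of the median."""
    return binom_tail(j, x)


def t_str(x, j, d):
    """Same probability under STR, assuming d = p so that the panel has j + 2d jurors."""
    return binom_tail(j + 2 * d, x + d)


if __name__ == "__main__":
    j, d = 12, 6
    for x in range(j + 1):
        # Difference T_STR - T_RAN for each threshold x
        print(x, round(t_str(x, j, d) - t_ran(x, j), 4))
```

With j = 12 and d = p = 6, the difference is positive for every x from 7 to 12 and negative at x = 6, which matches the threshold $x > n/2 + 1/2$ used in the proof.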

The probability that STR selects at least x jurors with conviction probability above the median is the probability that at least $x + d$ of the jurors in the panel have conviction probability above the median (since d of these jurors are challenged by the defendant). Because $d = p$, for any $x \in \{1, \dots, n\}$, we therefore have
$$T_{STR}(x; \mathrm{med}[C]) = P[Bi(j + d + p, 0.5) \geq x + d] = P[Bi(j + 2d, 0.5) \geq x + d].$$
In contrast, we have
$$T_{RAN}(x; \mathrm{med}[C]) = P[Bi(j, 0.5) \geq x].$$
Therefore, by repeated application of Lemma 3, $x > n/2 + 1/2$ implies
$$T_{STR}(x; \mathrm{med}[C]) > T_{RAN}(x; \mathrm{med}[C]).$$
Since n is integer-valued, the last inequality corresponds to $x \geq n/2 + 1$ if n is even and $x \geq n/2 + 1.5$ if n is odd.

A.5.2 Proof of Proposition 6

Part (a).

Under STR, since the group-distributions do not overlap, each party first uses all of its challenges on one of the two groups before challenging the lowest conviction probability jurors from the other group. For concreteness and without loss of generality, suppose that group a favors the defendant (i.e., $P(C_a > C_b) = 0$). Let m denote the number of jurors from group-a in the panel.

Note that because r = 0.5, the probability that $m = k$ is the same as the probability that $m = n - k$ for all $k \in \{0, \dots, \lfloor n/2 \rfloor\}$. Also, because $d = p$, the number of group-a jurors who are selected when $m = k$ is equal to the number of group-b jurors who are selected when $m = n - k$. Therefore, the expected number of group-a jurors in the jury selected by STR is exactly $j/2$.

[Footnote: First, suppose that $k \leq p$. Then, if $m = k$, no jurors from group-a (and j jurors from group-b) are selected, whereas if $m = n - k$, no jurors from group-b (and j jurors from group-a) are selected. Second, suppose that $k \in \{p + 1, \dots, \lfloor n/2 \rfloor\}$. Then, if $m = k$, $k - p = k - d$ jurors from group-a (and $j - (k - p) = j - (k - d)$ jurors from group-b) are selected, whereas if $m = n - k$, $k - d = k - p$ jurors from group-b (and $j - (k - d) = j - (k - p)$ jurors from group-a) are selected.]

Part (b). The proof is similar to the proof of Part (a). Consider the set of panel configurations $\{a, b\}^n$ where, for example, vector $(a, b, a, \dots, b, b, b) \in \{a, b\}^n$ indicates that the juror with the lowest conviction probability in the panel is a group-a juror, the juror with second-lowest conviction probability is a group-b juror, the juror with the third-lowest conviction probability is a group-a juror, ..., and the jurors with the three highest conviction probabilities are all group-b jurors. To explain the structure of the proof, suppose that n is even (we explain below how the argument generalizes to any n). We first construct a partition of $\{a, b\}^n$ into two subsets $S_a$ and $S_b$ of equal size and construct a bijection q between $S_a$ and $S_b$. We then show that for every panel configuration $l \in S_a$ which results in $m_l$ group-a jurors being selected, (a) the panel configuration $q[l]$ results in $j - m_l$ group-a jurors being selected, and (b) panel configurations l and $q[l]$ are equally likely. As in the proof of Part (a), the result then follows directly.

Similar to the proof of Part (a), the bijection $q[l]$ is obtained by (i) mirroring l around the $\lfloor n/2 \rfloor$ position, and (ii) inverting the group of each juror in the resulting panel configuration. For example, panel configuration $q[(a, a, b, a)]$ is obtained by mirroring $(a, a, b, a)$ around position $\lfloor n/2 \rfloor$, which results in $(a, b, a, a)$, and then inverting the group of each juror in $(a, b, a, a)$, which results in $(b, a, b, b)$. Formally, if $inv[l]$ denotes the configuration that results from turning all the a's in l into b's and all the b's in l into a's, then $q[(l_1, l_2, \dots, l_{n-1}, l_n)] = inv[(l_n, l_{n-1}, \dots, l_2, l_1)]$.

Let $S_a$ and $S_b$ be two sets that together contain all l for which $l \neq q[l]$ and are such that $l \in S_i$ implies $q[l] \notin S_i$. Since $q[q[l]] = l$, the sets $S_a$ and $S_b$ have equal sizes. Also let $S^*$ contain all l for which $l = q[l]$, if any ($S^* \neq \emptyset$ if and only if n is even). Note that $\{S_a, S_b, S^*\}$ forms a partition of $\{a, b\}^n$. Therefore, if we let $(m \mid l)$ denote the number of group-a jurors that are selected conditional on configuration l and $P(l)$ the probability of configuration l, we have
$$r_{STR} = \sum_{l \in S_a} \left[ P(l) \cdot (m \mid l) + P(q[l]) \cdot (m \mid q[l]) \right] + \sum_{l \in S^*} P(l) \cdot (m \mid l).$$
Part (b) then follows from the fact that (A) $P(l) = P(q[l])$ for all $l \in S_a$, (B) $(m \mid l) = j - (m \mid q[l])$ for all $l \in S_a$, and (C) $(m \mid l) = j/2$ for all $l \in S^*$.

Properties (B) and (C) follow directly from the construction of q and the fact that $d = p$. Property (A), on the other hand, follows from Lemma 4, which establishes the symmetry of order statistics for symmetric distributions. A formal proof of (A) using Lemma 4 requires heavy and tedious notation.
Instead, we show how (A) follows from Lemma 4 in a simple example that clarifies how the argument generalizes to other cases.

Consider the case of $(a, a, b)$ for which $q[(a, a, b)] = (a, b, b)$. We can obtain the probability of any configuration by integrating the p.d.f. of the appropriate order statistics from the bottom to the top of $[0, 1]$:

\[ P[(a, a, b)] = P[m = 2] \cdot P[(a, a, b) \mid m = 2] = P[Bi(3, 0.5) = 2] \cdot \int_0^1 f_{1,a}(x) \left[ \int_x^1 f_{2,a}(y) \left( \int_y^1 f_{1,b}(w) \, dw \right) dy \right] dx. \tag{10} \]

We can also obtain the probability of any configuration by reversing the list of order statistics and integrating from the top to the bottom of $[0, 1]$:

\[ P[(a, b, b)] = P[m = 1] \cdot P[(a, b, b) \mid m = 1] = P[Bi(3, 0.5) = 1] \cdot \int_0^1 f_{2,b}(1 - x) \left[ \int_x^1 f_{1,b}(1 - y) \left( \int_y^1 f_{1,a}(1 - w) \, dw \right) dy \right] dx. \tag{11} \]

Finally, by Lemma 4, $f_{1,a}(x) = f_{2,b}(1 - x)$, $f_{2,a}(y) = f_{1,b}(1 - y)$, and $f_{1,b}(w) = f_{1,a}(1 - w)$, which together with the symmetry of the binomial with 0.5 probability of success implies that the expressions in (10) and (11) are equal.

B External Appendix: Additional simulations
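The kind of simulation reported in this appendix can be approximated for RAN and STR along the following lines. This is a minimal Python sketch under assumptions that are not stated in this section: conviction probabilities are i.i.d. uniform on $[0, 1]$, STR is taken to remove the $p$ lowest and the $d$ highest conviction probabilities from a panel of $n = j + d + p$ jurors, RAN draws $j$ jurors directly, and an "extreme" juror is one whose conviction probability falls below a hypothetical `threshold`. S&R is omitted because it requires the parties' equilibrium strike rules.

```python
import random

def simulate(j=12, d=6, p=6, threshold=0.1, n_sims=20_000, seed=0):
    """Estimate the fraction of juries with at least one extreme juror.

    'Extreme' here means conviction probability below `threshold`
    (an illustrative definition, not taken from the paper)."""
    rng = random.Random(seed)
    n = j + d + p
    hits_ran = 0
    hits_str = 0
    for _ in range(n_sims):
        # STR (assumed outcome): sort the panel, drop p lowest and d highest.
        panel = sorted(rng.random() for _ in range(n))
        str_jury = panel[p:n - d]
        # RAN: j jurors drawn at random, with no strikes.
        ran_jury = [rng.random() for _ in range(j)]
        hits_str += any(c < threshold for c in str_jury)
        hits_ran += any(c < threshold for c in ran_jury)
    return {"RAN": hits_ran / n_sims, "STR": hits_str / n_sims}

result = simulate()
print(result)
```

Under these assumptions the RAN estimate should be close to $1 - (1 - 0.1)^{12} \approx 0.72$, while STR retains an extreme juror only when more than $p$ panel members fall below the threshold, which is rare.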

B.1 Excluding extremes, uniform distribution of conviction probabilities

Figure B.1: Fraction of juries with at least one extreme juror. [Plot: fraction of juries (vertical axis) against $c$ (horizontal axis), with series for RAN, S&R, and STR.]

Note: Results from 50,000 simulations of jury selections with parameters $j = 12$, $d = p = 6$, and $C \sim U[0, 1]$.

B.2 Minority representation when minorities favor conviction

Table B.1: Representation of Group-a jurors in the effective jury when Group-a is a minority of the jury pool

| Polarization | Extreme | Extreme | Moderate | Moderate | Mild | Mild | (All) |
|---|---|---|---|---|---|---|---|
| Procedure | S&R | STR | S&R | STR | S&R | STR | RAN |
| Average fraction of minorities | 0.12 | 0.08 | 0.18 | 0.16 | 0.23 | 0.23 | 0.25 |
| Standard deviation | 0.11 | 0.11 | 0.12 | 0.12 | 0.12 | 0.12 | 0.12 |
| Fraction of juries with at least 1 | 0.76 | 0.45 | 0.89 | 0.85 | 0.96 | 0.95 | 0.97 |

(a) Group-a represents 25% of the jury pool

| Polarization | Extreme | Extreme | Moderate | Moderate | Mild | Mild | (All) |
|---|---|---|---|---|---|---|---|
| Procedure | S&R | STR | S&R | STR | S&R | STR | RAN |
| Average fraction of minorities | 0.01 | 0.00 | 0.05 | 0.04 | 0.09 | 0.08 | 0.10 |
| Standard deviation | 0.03 | 0.02 | 0.06 | 0.06 | 0.08 | 0.08 | 0.09 |
| Fraction of juries with at least 1 | 0.09 | 0.02 | 0.44 | 0.38 | 0.66 | 0.64 | 0.72 |

(b) Group-a represents 10% of the jury pool
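The RAN column of Table B.1 can be compared with a simple binomial benchmark. The sketch below assumes (this is not stated in the table) that under RAN each of the $j = 12$ jurors independently belongs to group-a with probability equal to the group's share of the jury pool; the function name `ran_benchmarks` is hypothetical.

```python
from math import sqrt

def ran_benchmarks(share, j=12):
    """Binomial benchmarks for a randomly drawn jury of size j.

    Returns (mean fraction of group-a jurors, standard deviation of
    that fraction, probability of at least one group-a juror)."""
    mean_fraction = share
    sd_fraction = sqrt(share * (1 - share) / j)
    p_at_least_one = 1 - (1 - share) ** j
    return mean_fraction, sd_fraction, p_at_least_one

for share in (0.25, 0.10):
    mean, sd, p1 = ran_benchmarks(share)
    print(f"share={share:.2f}: mean={mean:.2f}, sd={sd:.3f}, P(at least 1)={p1:.2f}")
```

Under this assumption the probability of at least one group-a juror is about 0.97 for a 25% share and about 0.72 for a 10% share, which agrees with the RAN entries in the table; the simulated standard deviations may differ slightly from the analytical values because of rounding and sampling noise.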

Note: [...] Beta(4, 2) (Moderate), and from Beta(3, 4) and Beta(4, 3) (Mild); see Figure 3 for the shape of these distributions.

B.3 Excluding unbalanced juries, simulations from mild and moderate polarization

Figure B.2: Probability of selecting jurors below the median, difference with RAN. [Plot: two panels, (a) Moderate polarization and (b) Mild polarization; horizontal axis: number of jurors; vertical axis: probability (difference with RAN); series for STR and for S&R at three values of $r$.]

Note:

The chart displays the probability of selecting a number of jurors with $c_i$ below the median under STR (green dashed line) and S&R (orange lines) relative to the same probability under RAN, i.e. $T_M(x; med[C]) - T_{RAN}(x; med[C])$. The model parameters are $j = 12$, $d = p = 6$ and $C \sim r \cdot Beta(2, 4) + (1 - r) \cdot Beta(4, 2)$ for Panel (a) and $C \sim r \cdot Beta(3, 4) + (1 - r) \cdot Beta(4, 3)$ for Panel (b), with $r = \{\dots\}$. Values for S&R are the results from 50,000 simulations of jury selection, whereas values for RAN and
STR are computed analytically and are independent of $r$ (see Footnote 36).