Exclusion of Extreme Jurors and Minority Representation: The Effect of Jury Selection Procedures
Andrea Moro and Martin Van der Linden ∗
February 16, 2021
Abstract
We compare two established jury selection procedures meant to safeguard against the inclusion of biased jurors that are also perceived as causing minorities to be under-represented in juries. The Strike and Replace procedure presents potential jurors one-by-one to the parties, while the Struck procedure presents all potential jurors before the parties exercise vetoes. In equilibrium, Struck more effectively excludes extreme jurors than Strike and Replace but leads to a worse representation of minorities. Simulations suggest that the advantage of Struck in terms of excluding extremes is sizable in a wide range of cases. In contrast, Strike and Replace only provides a significantly better representation of minorities if the minority and majority are heavily polarized. When parameters are estimated to match the parties’ selection of jurors by race with jury-selection data from Mississippi in trials against black defendants, the procedures’ outcomes are substantially different, and the size of the trade-off between objectives can be quantitatively evaluated.
JEL Classification: K40, K14, J14, J16
Keywords: Jury selection, Peremptory challenge, Minority representation, Gender representation
∗ Moro: Vanderbilt University. Van der Linden: Emory University.
Introduction
In the U.S. legal system, it is customary to let the parties involved in a jury trial dismiss some of the potential jurors without justification. These dismissals, known as peremptory challenges, are meant to enable “each side to exclude those jurors it believes will be most partial toward the other side” thereby “eliminat[ing] extremes of partiality on both sides”. In the last decades, however, peremptory challenges have been criticized, mainly because they are perceived as causing some groups, in particular minorities, to be under-represented in juries. The procedure used to let the parties exercise their challenges varies greatly across jurisdictions and is sometimes left to the discretion of the judge. Two classes of procedures are most frequently used in the U.S. In the Struck procedure (henceforth: STR), the parties can observe and extensively question all the jurors who could potentially serve on their trial before exercising their challenges (this questioning process is known as voir dire). In contrast, in the Strike and Replace procedure (henceforth: S&R), smaller groups of jurors are sequentially presented to the parties. The parties observe and question the group they are presented with (sometimes a single juror) but must exercise their challenges on that group without knowing the identity of the next potential jurors.

The goal of this paper is to shed light on a debate that emerged in the legal doctrine over the relative effectiveness of STR and S&R at satisfying the two objectives of excluding extreme jurors and ensuring adequate group representation. Bermant and Shapard (1981, pp. 93-94), for example, argue that, by avoiding uncertainty, STR “always gives advocates more information on which to base their challenges, and, therefore, [...] is always to be preferred”. Bermant further notes that “a primary purpose of peremptory challenges is to

Holland v. Illinois, 493 U.S. 474, 484 (1990).

For examples of this line of argument against peremptory challenges, see Sacks (1989), Broderick (1992), Hochman (1993), Marder (1994), and Smith (2014). Despite these attacks, the U.S. has so far resisted abandoning peremptory challenges altogether (unlike other countries, like the U.K., where they were abolished in 1988). Peremptory challenges remain pervasive in all U.S. jurisdictions and have been affirmed by the U.S. Supreme Court as “one of the most important rights secured to the accused;” (Swain v. Alabama).

For example, in criminal cases in Illinois, “[State Supreme Court] Rule 434(a) expressly grants a trial court the discretion to alter the traditional procedure for impaneling juries so long as the parties have adequate notice of the system to be used and the method does not unduly restrict the use of peremptory challenges” (People v. McCormick, 328 Ill.App.3d 378, 766 N.E.2d 671 (2d Dist., 2002)).
STR facilitates the exclusion of some groups from juries. Although in Batson v. Kentucky and J. E. B. v. Alabama the Supreme Court found it unconstitutional to challenge potential jurors based on their race or gender, proving that a challenge is based on race or gender is often difficult and the Supreme Court’s mandate is notoriously hard to implement. Interestingly, in response, judges themselves have turned to the design of the challenge procedure and the use of S&R as an instrument to foster adequate group representation. For example, in a memorandum on judges’ practices regarding jury selection, Shapard and Johnson (1994) report that judges believe that by “prevent[ing] counsel from knowing who might replace a challenged juror” S&R procedures “make it more difficult to pursue a strategy prohibited by Batson”.

To inform this debate, we extend in Section 2 the model of jury selection proposed in Brams and Davis (1978) by allowing potential jurors to belong to two different groups. In the model, each potential juror is characterized by a probability of voting in favor of the defendant’s conviction. This probability is drawn from a distribution that depends on the juror’s group-membership. The group distributions are common knowledge but the parties to the trial, a plaintiff and a defendant, only observe their realization for a particular juror upon questioning that juror. The parties have opposing goals: the plaintiff wants to maximize the probability of conviction, whereas the defendant seeks to maximize the probability of acquittal.

A jury must be formed to decide the outcome of the trial and the parties can influence its composition by challenging (i.e., vetoing) a certain number of potential jurors. Challenges are exercised according to S&R or STR procedures which, as explained above, differ mainly in the timing of jurors’ questioning (and, as a consequence, in the parties’ ability to observe

476 U.S. 79 (1986) and 511 U.S. 127 (1994). In terms of legal procedures, the response to these decisions has consisted in allowing the parties to appeal peremptories from their opponent, allowing them to nullify a peremptory if they can show that it was indeed based on race. These appeals are known as Batson appeals.

See Raphael and Ungvarsky (1993): “In virtually any situation, an intelligent plaintiff can produce a plausible neutral explanation for striking Pat despite the plaintiff’s having acted on racial bias. Consequently, given the current case law, a plaintiff who wishes to offer a pretext for a race-based strike is unlikely to encounter difficulty in crafting a neutral explanation.” See also Marder (2012) or Daly (2016) for why judges rarely rule in favor of Batson appeals.
STR is more effective than S&R at excluding jurors from the tails of the conviction probability distribution, but is less likely to select minority jurors.

The rest of the paper is devoted to characterizing conditions under which these results extend beyond the illustrative example of Section 3. In Section 4 we call a juror extreme if its conviction probability falls below (above) a given threshold. We prove that there always exists a low enough threshold such that STR is more likely than S&R to exclude extreme jurors. Moreover, we show that STR always selects fewer extreme jurors than a random selection would, but that there are some (admittedly somewhat unusual) circumstances in which S&R would not. Simulations assuming a wide range of conviction probability distributions reveal that, in terms of excluding extreme jurors, the advantage of STR over S&R can be substantial, even for relatively high thresholds.

Section 5 compares procedures according to their ability to select minorities and identifies conditions under which S&R selects more minority jurors than STR. Our proof uses a limiting argument showing that the result holds when the minority is vanishingly small and the distributions of conviction probabilities for each group minimally overlap (i.e., groups are polarized). However, simulations again suggest that the result remains true when the size of the minority is relatively high and the overlap between distributions is significant. In Section 6, we explore how changing the number of challenges affects the results of Sections 4 and 5.

Depending on the extent to which jurors of different races have polarized preferences for conviction, the model has different empirical implications for the selection of jurors by race. In Section 7 we exploit peremptory challenge data on a version of STR adopted in the Fifth Circuit Court District of Mississippi to estimate the groups’ distributions of conviction probabilities, and to simulate the outcomes of counterfactual procedures. Results show that groups appear to be substantially polarized in their preferences for convictions, and that the choice of procedure affects both exclusion of extreme jurors and minority representation substantially.

In Section 8 we show how our main theoretical results extend to a different definition of extreme juries (i.e., a jury in which the highest (lowest) conviction-probability juror is below (above) a given threshold). We also explore how the procedures compare in selecting members of groups that are about equal size (such as males and females, as opposed to minorities, which induce groups of unequal sizes).
Related Literature
This paper belongs to a relatively small literature formalizing jury selection procedures. Brams and Davis (1978) model S&R as a game and derive its subgame-perfect equilibrium strategies which we use in our theoretical results and simulations. Perhaps closest to our paper is Flanagan (2015) who shows that, compared to randomly selecting jurors, STR increases the probability that all jurors come from one particular side of the median of the conviction probability distribution (because STR induces correlation between the conviction probability of the selected jurors). To our knowledge, this literature is silent on the implications of jury selection for group representation and on the trade-off between excluding extreme jurors and ensuring adequate group representation induced by using different procedures. These are the focus and main contributions of this paper.

While the group-composition of a jury has been shown to influence the outcome of a trial (Anwar et al., 2012; Flanagan, 2018), legal scholars often argue in favor of representative juries regardless of their effect on verdicts. Diamond et al. (2009) for example argue that “unrepresentative juries [...] threaten the public’s faith in the legitimacy of the legal system”. In an experiment on jury-eligible individuals, they show that participants rate the outcome of trials as significantly fairer when the jury is racially heterogeneous than when it is not. This motivates us to consider group-representativity itself as a desirable feature of jury selection procedures.

One might also be interested in the impact of group-representation on the conviction of defendants who themselves belong to different groups. Without taking groups into account or attempting to compare procedures, Flanagan (2015) studies the impact of jury selection procedures on conviction rates. His results in terms of conviction rates require assuming that the parties have correct beliefs about the probability that jurors eventually vote for conviction (as well as that these probabilities are independent of one another). In contrast, our results about group-representation and exclusion of extremes do not require that the parties’ beliefs at the moment of jury selection be accurate (at least if we are concerned with extremes as perceived by the parties, as the U.S. Supreme Court seems to be when saying that the main purpose of peremptory challenges is to enable “each side to exclude those jurors it believes will be most partial toward the other side”, see Footnote 1 and associated quote).

In Section 6, we show that limiting the number of challenges (while keeping the number of selected jurors fixed) can have a similar effect, though at the expense of a less effective exclusion of extreme jurors.

Diamond et al. (2009) take advantage of a feature of civil cases in Florida where juries are made of six jurors unless one of the parties requests a jury of twelve jurors and pays for the costs associated with such a larger jury.

There are two parties to a trial, the defendant, D, and the plaintiff, P. The outcome of the trial is decided by a jury of $j$ jurors who must be selected from the population. The parties share a common belief about the probability that a juror $i$ will vote to convict the defendant. We denote this probability $c_i \in [0, 1]$, which is a draw from a random variable $C$ with probability distribution $f(c)$. We denote its cumulative with $F(c)$ and its expected value with $\mu$. Throughout, we assume that $C$ is continuous. To simplify the notation, we also assume that the boundaries of the support of $C$ are 0 and 1.

To address the issue of group representation, we assume that jurors belong to one of two groups $a$ or $b$. The parties have common beliefs about the probability that jurors from each group vote to convict the defendant. We index the distributions representing these beliefs and their averages with subscript $g \in \{a, b\}$: $f_g(c)$, $F_g(c)$, and $\mu_g$. The corresponding random variables are denoted $C_a$ and $C_b$. Although throughout conviction probabilities and their distributions across groups should only be viewed as representing the parties’ common beliefs, we henceforth lighten the terminology and speak directly of conviction probabilities (rather than parties’ beliefs about conviction probabilities).

This assumption is without loss of generality and all our results hold if $C$ is re-scaled in such a way that $F(c) = 0$ or $[1 - F(1 - c)] = 0$ for some $c \in (0, 1)$.

Empirical evidence, including the one we report in Section 7, shows that parties use their challenges unevenly across groups; see also the Related Literature section of the Introduction.

We let $r$ denote the proportion of group-$a$ jurors in the population, and when discussing group representation, we assume that $C$ is obtained by drawing from $C_a$ with probability $r$ and from $C_b$ with probability $(1 - r)$ (in particular, $f(c) = r f_a(c) + (1 - r) f_b(c)$).

Following the majority of the literature (Brams and Davis, 1978; Flanagan, 2015), we assume that, at the level of jury selection, the parties do not account for the process of jury deliberations and (perhaps as a way to cope with the complexity of jury selection) view the probabilities that jurors vote for conviction as independent from one another. Since conviction in most U.S. trials requires a unanimous jury, the parties then consider that a jury composed of jurors with conviction probabilities $\{c_i\}_{i=1}^{j}$ convicts the defendant with probability $\prod_{i=1}^{j} c_i$. The defendant, therefore, aims at minimizing $\prod_{i=1}^{j} c_i$ while the plaintiff wants to maximize the same product.

To influence the composition of the jury, the defendant and the plaintiff are allowed to challenge (veto) up to $d$ and $p$ of the jurors in a panel of $n = j + d + p$ potential jurors randomly and independently drawn from the population (sometimes also called the pool). To avoid trivial cases, we assume throughout that $d, p \geq 1$. The parties use these challenges in the course of a veto procedure $M$ (formally, an extensive game-form). The jury resulting from the procedure is called the effective jury.

The two veto procedures we study are the STRuck procedure (STR) and the Strike And Replace procedure (S&R). For comparison, we also consider the Random procedure (RAN) which simply draws $j$ jurors independently at random from the population. In all procedures, we assume that once a potential juror $i$ is presented to the parties, the parties observe the realized value of $c_i$ for that juror. The two procedures however differ in the timing

In the legal literature, what we call “panel” is sometimes called “venire” (though terminology varies and the latter term is sometimes used to speak of what we call the population).

This is motivated by the practice of letting parties extensively question every juror they are presented with, a process known in the legal terminology as voir dire. In turn, the fact that the parties have the same assessment of the probability a juror will vote for conviction is motivated by the fact that voir dire occurs in the presence of both parties, and that the parties therefore have access to the same information about the jurors’ demographics, background, and opinions.
Under STR, the entire panel of $j + d + p$ potential jurors is presented to the parties before they have the opportunity to use any of their challenges. Each party, therefore, observes the value of $c_i$ for every juror in the panel. The defendant and the plaintiff then choose to challenge up to $d$ and $p$ of the jurors in the panel, respectively. In practice, there are several types of STR procedures that differ in the way the parties exercise their challenges after having questioned the jurors in the panel. For concreteness and tractability, we focus in this paper on the STR procedure in which the parties have a single opportunity to exercise their challenges on the whole panel. In equilibrium, this leads the plaintiff to challenge the $p$ jurors in the panel with lowest conviction probabilities, and the defendant to challenge the $d$ jurors with highest conviction probabilities. Whether these challenges happen simultaneously or sequentially has no impact on the equilibrium and our results for STR apply in either case.

In contrast, under S&R, groups of potential jurors are randomly drawn from the population and sequentially presented to the parties. In contrast with STR procedures, the parties must exercise their challenges on jurors from a given group without knowing the identity of jurors from subsequent groups. There is variation among S&R procedures used in practice in the size of the groups that are presented in each round. Again, for concreteness and tractability, we focus in this paper on the S&R procedure in which jurors are presented to the parties one at a time. The defendant and the plaintiff start the procedure with $d$ and $p$ challenges left, respectively. After each draw, the plaintiff and the defendant observe the potential juror’s conviction probability and, if they have at least one challenge left, choose whether or not to challenge the juror. If a juror is not challenged by either party, it becomes a member of the effective jury. The procedure continues until a jury of $j$ members is formed.

Alternative methods used in the field include procedures in which the parties challenge sequentially out of subgroups of jurors from the panel only. As long as the procedure remains of the struck type (i.e., the entire panel, and not only the first subgroup, is questioned before the parties start exercising their challenges), the equilibrium effective jury is often the same as under the STR procedure we consider here. Other outcome-irrelevant aspects of the equilibrium might, however, be different, such as the number of challenges used by the parties (e.g., if the first group is made of the $j$ “middle” jurors in the panel, they may in some cases be selected as effective jurors without the parties exercising any of their challenges).

Since $C$ is continuous, the probability that two jurors in a panel have the same conviction probability and one of the parties does not use all of its challenges in equilibrium is zero and this eventuality can therefore be neglected.

As well as in the ability of the parties to challenge, in a later round, potential jurors who were left unchallenged in previous rounds, a practice known as “backstriking”.

The (subgame perfect) equilibrium of S&R was characterized by Brams and Davis (1978) and takes the form of threshold strategies. In every subgame, D challenges the presented juror $i$ if $c_i$ is above a certain threshold $t_D$, P challenges $i$ if $c_i$ is below some threshold $t_P$, and neither of the parties challenges $i$ if $c_i \in [t_P, t_D]$. We will sometimes refer to these values as challenge thresholds. As Brams and Davis (1978) show, in any subgame, $t_P < t_D$ and even if the challenges happen simultaneously and both parties are charged for their challenges when they both decide to challenge the presented juror, the latter (i.e., a challenge by both parties) never occurs in equilibrium. The equilibrium is therefore unaffected by the timing of challenges in each round and our results for S&R apply regardless of this timing.

In our description of
S&R, Nature moves in each round to draw a new potential juror from the population to present to the parties. To facilitate conditional comparisons between STR and S&R based on a particular fixed panel, it will sometimes be useful to consider an equivalent description of S&R in which Nature first draws a panel of $n$ jurors $\{c_1, \ldots, c_n\}$ (which the parties are not aware of) and in each round $k$ presents juror $c_k$ to the parties. For similar purposes, it will sometimes be useful to view RAN as first drawing a panel of $n$ jurors and then (uniformly at random) selecting $j$ jurors among these $n$ to form the effective jury.

To illustrate the differences between the two procedures, consider the simple case $d = p = j = 1$ together with distributions $C_a \sim U[0, 0.5]$ and $C_b \sim U[0.5, 1]$, and $r = 0.1$, i.e., there is a minority of 10% of group-$a$ jurors in the population.

Each subgame can be characterized by the number of jurors $\kappa$ that remain to be selected, the number of challenges left to the defendant $\delta$, and the number of challenges left to the plaintiff $\pi$. The parties’ thresholds in subgame $(\kappa, \delta, \pi)$ are a function of the values of subgames $(\kappa - 1, \delta, \pi)$, $(\kappa, \delta - 1, \pi)$, and $(\kappa, \delta, \pi - 1)$ (which can result from the parties’ actions in $(\kappa, \delta, \pi)$) and the distribution of $C$, see Brams and Davis (1978).

By “timing”, here, we mean the order (potentially simultaneous) in which the parties decide whether or not to challenge the presented juror.

Figure 1: Illustrative example, equilibrium outcomes under STR
[Figure: tree from the distribution $C$ to the four possible panel compositions, reached with probabilities .001 (a a a), .03 (a a b), .24 (a b b), and .73 (b b b). The selected juror is from group $a$ in the first two cases and from group $b$ in the last two.]
Note:
The figure describes the equilibrium of STR assuming $j = p = d = 1$, $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, and $r = 0.10$. The initial node illustrates distribution $C = 0.1 \cdot C_a + 0.9 \cdot C_b$. The numbers on each arrow indicate the probability of drawing a panel with the group-composition represented in the pointed boxes (conditional on each panel composition, the circled letter in the box corresponds to the group-membership of the selected juror). Dashed arrows correspond to outcomes that lead to the selection of a group-$a$ juror and the graph underneath each box shows the distribution of conviction probabilities for the selected juror.

Let $U^n_x[a, b]$ denote the $x$-th order statistic for a $U[a, b]$ random sample of size $n$. With this notation, Figure 1 shows the group-membership and distribution of conviction probability for the juror selected under STR, conditional on the composition of the panel. Observe that in this example, if there are group-$a$ jurors in the panel, one of them is systematically challenged by the plaintiff. Therefore, for a group-$a$ juror (i.e., a minority juror) to be selected under STR, there need to be at least two group-$a$ jurors in the panel of $n = 3$ presented to the parties. This occurs with probability 0.03.

In contrast, a group-$a$ juror can be selected under S&R even if the panel contains a single group-$a$ juror. To understand why, consider the equilibrium of S&R which is illustrated in Figure 2. If a group-$b$ prospective juror with a sufficiently low conviction probability ($c_i \in [0.5, 0.62]$) is presented in the first round, the plaintiff challenges that juror, and in the subgame that follows a group-$a$ juror is more likely to be selected than if a juror was randomly drawn from the population. In particular, any group-$a$ juror presented at the beginning of this later subgame is left unchallenged by the defendant and selected to be the effective juror (even if this juror is the only group-$a$ juror in the panel because the third juror, who in this case is never presented to the parties, happens to be a group-$b$ juror). This course of action follows from P’s choice to challenge a group-$b$ juror with low conviction probability in the first round, which leaves P without challenges left in the second round. This choice of P is optimal from the perspective of the first round of S&R (before the plaintiff learns that the second juror in the panel is a group-$a$ juror), but suboptimal under STR where, having observed the conviction probability of all jurors in the panel, the plaintiff would have challenged the group-$a$ juror instead.

Figure 2: Illustrative example, equilibrium strategies and outcomes under S&R
[Figure: three-round game tree in which each round draws one juror from $C$. Branch probabilities are .40, .29, and .31 in round 1; .54 and .47 on one side and .54, .36, and .10 on the other in round 2; and .90 and .10 in round 3.]
Note: The figure describes the equilibrium strategies conditional on the conviction probability of the juror drawn in each round for the case $j = d = p = 1$, $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, and $r = 0.10$. Dashed arrows correspond to paths that may lead to the selection of a group-$a$ juror. The numbers on each arrow indicate the probability of the path conditional on reaching the previous node. The second row of text inside boxes indicates an equilibrium action, whereas bold text below boxes indicates the group of the selected juror in the game outcome. In round 3, challenges from both parties are exhausted and the parties do not take any action.

Considering only the branch of the S&R game-tree that starts with a challenge from P, the probability of selecting a group-$a$ juror is almost $0.05 \approx 0.31 \cdot (0.10 + 0.54 \cdot 0.10)$. A group-$a$ juror can also be selected when D challenges in the first round, followed by a challenge from P in the second round and the draw of a group-$a$ juror in the third round (which happens with probability $0.40 \cdot 0.47 \cdot 0.10 \approx 0.02$). Overall, the probability of selecting a group-$a$ juror under S&R is 0.067. This is larger than the probability under STR, 0.03, yet smaller than under RAN, 0.10.

In this example, the better representation of minority jurors produced by
S&R comes at the expense of selecting more extreme jurors. Suppose for the sake of illustration that jurors are considered extreme if they come from the top or bottom 5th percentile of $C$. In our example, the bottom and top 5th percentiles correspond to conviction probabilities below 0.25 and above 0.94, respectively. The selected juror is within the bottom range with probability 0.015 under STR versus 0.033 under S&R, and in the top range with probability 0.076 under STR versus 0.083 under S&R.

To understand the source of these differences, let us consider the bottom 5th percentile $[0, 0.25]$ (a symmetric explanation applies to the top 5th percentile). When STR selects a group-$a$ juror (the type of juror whose conviction probability could possibly be in the bottom 5th percentile), the distribution of that juror’s conviction probability follows the middle or upper order-statistics of a random sample from $C_a$. These order-statistics are unlikely to result in the selection of a juror with conviction probability in the bottom 5th percentile. In contrast, as Figure 2 illustrates, all paths leading S&R to select a group-$a$ juror result in the juror’s conviction probability being drawn from $U[0, 0.5]$, which makes S&R more likely to select a juror in the bottom 5th percentile than STR.

In the next two sections, we investigate the extent to which the advantages of S&R in terms of minority-representation and of STR in terms of exclusion of extremes generalize beyond this illustrative example.
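The minority-selection probabilities quoted in this example can be checked with a short computation. The sketch below is only an illustration under stated assumptions, not the authors' code: it assumes the $j = d = p = 1$ setup with $C_a \sim U[0, 0.5]$, $C_b \sim U[0.5, 1]$, and $r = 0.1$, and it uses the fact that, with a single juror to select, a party challenges whenever the value of the subgame that follows its challenge is better for it than the presented juror. All function and variable names are hypothetical.

```python
from math import comb

R = 0.1                                        # population share of group a (minority)
GROUPS = {"a": (0.0, 0.5), "b": (0.5, 1.0)}    # uniform supports of C_a and C_b
WEIGHTS = {"a": R, "b": 1.0 - R}


def cdf(c, lo, hi):
    """P(C_g <= c) for a uniform group distribution on [lo, hi]."""
    return min(max((c - lo) / (hi - lo), 0.0), 1.0)


def mix_cdf(c):
    """CDF of the population mixture C."""
    return sum(WEIGHTS[g] * cdf(c, *GROUPS[g]) for g in GROUPS)


def mean_clipped(t, lo, hi, cap_above):
    """E[min(c, t)] if cap_above else E[max(c, t)], for c ~ U[lo, hi]."""
    q = cdf(t, lo, hi)                 # P(c <= t)
    t_in = min(max(t, lo), hi)         # t projected onto the support
    if cap_above:
        return q * (lo + t_in) / 2 + (1 - q) * t
    return q * t + (1 - q) * (t_in + hi) / 2


def mix_mean_clipped(t, cap_above):
    return sum(WEIGHTS[g] * mean_clipped(t, *GROUPS[g], cap_above) for g in GROUPS)


# Subgame values (expected conviction probability of the selected juror).
MU = sum(WEIGHTS[g] * (lo + hi) / 2 for g, (lo, hi) in GROUPS.items())  # no challenges left
T_P = mix_mean_clipped(MU, cap_above=True)    # after P's challenge: D strikes if c > MU
T_D = mix_mean_clipped(MU, cap_above=False)   # after D's challenge: P strikes if c < MU
# In round 1, P challenges if c < T_P and D challenges if c > T_D.


def prob_a_between(lo_c, hi_c):
    """P(juror is from group a and lo_c <= c <= hi_c)."""
    lo, hi = GROUPS["a"]
    return R * (cdf(hi_c, lo, hi) - cdf(lo_c, lo, hi))


def snr_prob_a():
    """P(selected juror is from group a) under S&R with j = d = p = 1."""
    accepted_round_1 = prob_a_between(T_P, T_D)
    # P challenged first: D accepts c <= MU, else challenges and round 3 is a random draw.
    after_p = prob_a_between(0.0, MU) + (1 - mix_cdf(MU)) * R
    # D challenged first: P accepts c >= MU, else challenges and round 3 is a random draw.
    after_d = prob_a_between(MU, 1.0) + mix_cdf(MU) * R
    return accepted_round_1 + mix_cdf(T_P) * after_p + (1 - mix_cdf(T_D)) * after_d


def str_prob_a(n=3):
    """P(selected juror is from group a) under STR. The median of the panel is
    from group a iff at least two panelists are (the supports do not overlap)."""
    return sum(comb(n, k) * R**k * (1 - R) ** (n - k) for k in range(2, n + 1))


RAN_PROB_A = R   # a random draw selects a group-a juror with probability r

print(round(str_prob_a(), 3), round(snr_prob_a(), 3), RAN_PROB_A)  # 0.028 0.067 0.1
```

Under these assumptions the first-round thresholds come out at about 0.62 and 0.78, and the ordering STR < S&R < RAN for minority selection is reproduced. Exact branch probabilities can differ from the rounded labels of Figure 2 in the last digit.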
These are the only cases in which a minority juror can be selected under S&R. In particular, jurors accepted in the first round are always group-$b$ jurors ($c_i \in [0.62, 0.78]$), as are jurors challenged by D in the first round ($c_i \in [0.78, 1]$).

In the United States, one of the objectives of the jury selection process is to guarantee an impartial jury as dictated by the Sixth Amendment of the Constitution. In this respect, the peremptory challenge procedures implemented in U.S. jurisdictions are often viewed as a way to foster impartiality by preventing extreme potential jurors from serving on the jury. In the context of our model, we interpret this goal as that of limiting the presence in the jury of jurors from the tails of the distributions of conviction probabilities.

In Sections 4 to 6, we refer to a juror $i$ as extreme if its conviction probability $c_i$ lies below or above given thresholds (we refer the reader to Section 8 for results under an alternative definition). For brevity, we will focus on jurors who qualify as extreme because their conviction probability lies below some threshold $c > 0$. This is without loss of generality and all our results about extreme jurors apply symmetrically to jurors whose conviction probability lies above a given threshold $c < 1$.

One might expect that jurors from the tails of $C$ are selected less often under STR than S&R. This is not true in general. Fixing a particular threshold $c > 0$ (or percentile of $C$) to characterize jurors as extreme, there always exist distributions of $C$ and values of $d$, $p$, and $j$ such that S&R selects fewer extreme jurors than STR. However, our first result shows that regardless of the distribution and value of the parameters, there always exists a threshold sufficiently small such that, if jurors are called “extreme” below that threshold, the probability of selecting extreme jurors is greater under S&R than under STR.

Let $T_M(x; c)$ denote the probability that there are at least $x$ jurors with conviction probability smaller than or equal to $c$ in the jury selected by procedure $M$.

Proposition 1.
For any $x \in \{1, \ldots, j\}$, there exists $\bar{c} > 0$ such that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $c \in (0, \bar{c})$.

All proofs are in the appendix. A symmetric statement, which we omit, applies for extreme jurors at the right-end of the distribution. Note that Proposition 1 can be rephrased in terms of stochastic dominance. Let $N^c_M$ denote the number of jurors of type $c_i \leq c$ in the jury selected by procedure $M$. Then, Proposition 1 says that there exists $\bar{c} > 0$ such that $N^c_{S\&R}$ has first-order stochastic dominance over $N^c_{STR}$ for all $c \in (0, \bar{c})$. A direct corollary of Proposition 1 is therefore that the expected number of extreme jurors is larger under S&R than under STR.

For some intuition about Proposition 1, consider the case $x = 1$. As illustrated in Section 3, the panel must be composed of more than one extreme juror for STR to select at least one such juror (since, if there is a single extreme juror in the panel, that juror is systematically challenged by the plaintiff). In contrast, even in panels with a single extreme juror, the extreme juror can be part of the effective jury resulting from S&R. This happens, for example, if the extreme juror is presented to the parties after they both exhausted all their challenges. The single extreme juror can also be accepted by both parties if its conviction probability is sufficiently close to $c$ and it is presented after the plaintiff used most of its challenges on non-extreme potential jurors. The proof then follows from the fact that, as $c$ tends to zero, the probability that the panel contains more than one extreme juror goes to zero faster than the probability the panel contains a single extreme juror.

See Footnote 1 and its associated quote. For legal arguments in favor of peremptory challenges based on the Sixth Amendment, see, among others, Beck (1998), Biedenbender (1991), Bonebrake (1988), Horwitz (1992), and Keene (2009).

Proposition 1 is silent about the value of the threshold $\bar{c}$ below which STR selects fewer extreme jurors than S&R, as well as the size of $T_{S\&R}(x; c) - T_{STR}(x; c)$ for $c < \bar{c}$. These values depend on the model’s parameters. To illustrate, we simulate $T_{STR}(1; c)$ and $T_{S\&R}(1; c)$ using $j = 12$, $d = 6$, and $p = 6$, a typical combination of jury size and number of peremptory challenges in U.S. jurisdictions. For the distribution of conviction probabilities in the population, we use symmetric mixtures of beta distributions that represent a population made of two groups with polarized views, which allows easier comparison with the results from Section 5 in which we study group-representation. We provide simulation results for three mixtures of the distributions illustrated in Figure 3, which are meant to represent extreme (Panel (a)), moderate (Panel (b)), and mild levels of polarization (Panel (c)). Additional simulation results using $U[0, 1]$ instead are reported in Appendix B.

Using these parameters,
STR is found to exclude more extreme jurors than
S&R even when the threshold for defining jurors as extreme is relatively high. As illustrated in Figure 4, the difference between the propensity of STR and S&R to select extreme jurors is sizable. For example, in all three sets of simulations, only about 1% of juries selected by STR feature at least one juror with conviction probability below the 10th percentile of the distribution (the 10th percentile corresponds to 0.01 under the extreme polarization distribution, 0.17 under moderate polarization, and 0.25 under mild polarization). Under S&R, the proportion of juries with at least one juror below the 10th percentile rises to 56% with extreme polarization, 35% with moderate polarization, and remains quite high at 30% even under mild polarization. For comparison, a random selection would have resulted in about 73% of the juries featuring at least one such juror.

Footnote: Subgames in which the defendant has more challenges left than the plaintiff can lead the plaintiff to be conservative and accept jurors who are “barely extreme” ($c_i \approx c$) in order to save its few remaining challenges for “very extreme” jurors ($c_i \approx 0$).

Footnote: Proposition 1 crucially depends on averaging across all possible panels and does not state that STR rejects more extreme jurors than S&R for any particular realization of the panel. The latter would obviously imply Proposition 1 but turns out to be false in general. For a counterexample, let j = d = p = 1. Consider a panel of three jurors with $c_1 < c_3 < \bar{c}$ and $c_2 > \bar{c}$, the index of the jurors indicating the order in which they are presented under S&R. For this panel, STR always leads to the selection of extreme juror 3. In contrast, provided $c_2$ falls between the challenge thresholds of the defendant and the plaintiff in the first round (which happens with positive probability), S&R selects non-extreme juror 2.

Figure 3: Distributions of conviction probabilities by group under extreme, moderate, and mild group-polarization. Panels: (a) Extreme, (b) Moderate, (c) Mild. Each panel plots the group densities $f_a(c)$ and $f_b(c)$, both beta distributions.

In these simulations, both procedures select fewer extreme jurors than a random draw from the population. Somewhat surprisingly, this is not true in general. There exist distributions and values of the parameters d, p and j for which S&R selects more extreme jurors than
RAN, no matter how small the threshold below which a juror is considered extreme. In contrast, as we show in the next proposition, STR always selects fewer extreme jurors than RAN.

Proposition 2. For any $x \in \{1, \ldots, j - 1\}$, there exists $\bar{c} > 0$ such that $T_{\text{STR}}(x; c) < T_{\text{RAN}}(x; c)$ for all $c \in (0, \bar{c})$.

Footnote: Proposition 2 generalizes Theorem 2 in Flanagan (2015), which shows that there always exists $\bar{c} > 0$ such that $T_{\text{STR}}(n; c) < T_{\text{RAN}}(n; c)$ for all $c \in (0, \bar{c})$.

Figure 4: Fraction of juries with at least one extreme juror. Panels: (a) Extreme, (b) Moderate, (c) Mild. Horizontal axis: c; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axis are averages across 50,000 simulated jury selections, fixing j = 12, d = p = 6, and $C \sim 0.5 \cdot C_a + 0.5 \cdot C_b$ throughout (with the distributions for $C_a$ and $C_b$ illustrated in Figure 3). Each line illustrates the fraction of juries with at least one extreme juror, where a juror is considered extreme if her conviction probability falls below the threshold c corresponding to the value on the horizontal axis.

Figure 5 illustrates Proposition 2 and the fact that a similar statement does not hold for
S&R. For the simulations in the figure, we let j = d = p = 1 and adopt an extremely polarized distribution of conviction probabilities with C ∼ . · U [0 , . 1] + 0 . · U [0 . , 1]. STR excludes extreme jurors more often than RAN because, for any realization of the panel, the juror with the lowest conviction probability is never selected under STR (whereas the same juror is selected with positive probability under RAN). Under S&R, however, if the distribution is sufficiently right-skewed, the plaintiff is more likely than the defendant to challenge in the first round. A challenge by the plaintiff in the first round leads to a subgame in which only the defendant has challenges left and the selection of an extreme juror is more likely than under a random draw. When they are sufficiently large, (i) the added probability of selecting an extreme juror when the defendant has more challenges left than the plaintiff, coupled with (ii) the probability of a challenge by the plaintiff in the first round, can, as in the simulation depicted in Figure 5, lead to S&R selecting more extreme jurors than RAN.

Figure 5: Fraction of juries with at least one extreme juror (case in which S&R is more likely to pick extreme jurors than RAN). Horizontal axis: c; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axis are averages across 50,000 simulated jury selections, fixing j = d = p = 1, and C ∼ . · U [0 , . 1] + 0 . · U [0 . , 1] throughout. Each line illustrates the fraction of juries with at least one extreme juror, where a juror is considered extreme if her conviction probability falls below the threshold c corresponding to the value on the horizontal axis.
We could not fully characterize the situations in which S&R selects more extreme jurors than RAN, and we never observed such a situation in simulations where C is a symmetric mixture of beta or uniform distributions. The example in Figure 5 (as well as other examples we found) requires extreme skewness in the distribution, which may be viewed as unlikely. In this sense, situations in which S&R selects more extreme jurors than RAN might represent worst-case scenarios for S&R's ineffectiveness at excluding extreme jurors (rather than ordinary situations).
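The STR and RAN benchmarks discussed in this section can be computed in closed form under one simplifying assumption, which is a sketch rather than the paper's own derivation: that under STR the plaintiff strikes the p lowest and the defendant the d highest conviction probabilities in a panel of j + d + p, so the jury is made of the middle j panel members. Under that assumption, a jury contains at least x jurors below c exactly when the panel contains at least x + p such jurors. The function names below are hypothetical, and `q` stands for the population probability that a juror's conviction probability falls below c. S&R is not covered, since it requires solving the sequential game.

```python
from math import comb


def binom_tail(n, q, k):
    """Return P(Bin(n, q) >= k)."""
    k = max(k, 0)
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))


def t_ran(x, q, j):
    """P(at least x of j randomly drawn jurors fall below the threshold)."""
    return binom_tail(j, q, x)


def t_str(x, q, j, d, p):
    """Same probability under STR, ASSUMING the jury is the middle j of a
    panel of j + d + p (plaintiff strikes the p lowest, defendant the d
    highest). The jury has x jurors below the threshold if and only if the
    panel has at least x + p of them."""
    return binom_tail(j + d + p, q, x + p)


if __name__ == "__main__":
    # Threshold at the 10th percentile (q = 0.10), j = 12, d = p = 6.
    print("RAN:", round(t_ran(1, 0.10, 12), 4))
    print("STR:", round(t_str(1, 0.10, 12, 6, 6), 4))
```

With q = 0.10, j = 12, and d = p = 6, the random benchmark comes out at about 0.72 and the STR value below 0.01, which is in line with the simulated figures quoted in this section. Because only q enters the formulas, these two benchmarks do not depend on the shape of the distribution of C.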
In this section, we study the extent to which STR's tendency to exclude more extreme jurors than S&R impacts the representation of minorities under the two procedures. Without loss of generality, we let group-a be the minority group. Since the parties do not care intrinsically about group-membership, any asymmetry in the use of their challenges arises from heterogeneity in preferences for conviction between groups. In our simulations, we assume that group-a is biased in favor of acquittal in the sense that $C_b$ first-order stochastically dominates $C_a$.

As suggested by Proposition 1, which procedure better represents minorities strongly depends on the polarization between the two groups and on the concentration of minority jurors at the tails of the distribution of conviction probabilities. To illustrate, suppose that d = p = j = 1 and C ∼ U[0, 1]. The distributions of the selected juror's conviction probability under RAN, STR, and S&R are displayed in Figure 6(a). Consistent with Proposition 1, below some threshold c ≈ 0.25, the probability of selecting a juror i with $c_i < c$ is lower under STR than under S&R. If the two groups are polarized and the distribution of $C_a$ is sufficiently concentrated below c, it follows that STR selects a minority juror less often than S&R. But the same is not true if the distributions lack polarization or the minority is too large. For example, decompose C as follows: C ∼ U[0, 1] = r U[0, r] + (1 − r) U[r, 1]. Since the parties do not care about group-membership per se, the value of r in these decompositions does not affect the distributions of conviction probabilities for the juror selected under RAN, STR, or S&R. Then, letting $C_a$ ∼ U[0, r] and $C_b$ ∼ U[r, 1], lower values of r (which concentrate minorities at the bottom of the distribution) make S&R select more minorities than STR, whereas higher values of r (which spread the minority over a larger range of conviction-types) make STR select more minorities than S&R.

From this example, we see that non-overlapping group-distributions are not sufficient to guarantee that S&R selects more minority jurors than STR. Neither is making the minority arbitrarily small. For example, regardless of the size of the minority r, concentrating the support of the minority distribution inside the interval [0 . , . 3] would result in STR selecting more minority jurors, as can be seen from Panel 6(a). However, combining a small minority with group-distributions that minimally overlap concentrates the distribution of group-a at the tails which, as suggested by Proposition 1, makes S&R select more minorities than
STR.

Footnote: We also simulated the scenario in which the minority is biased towards conviction; the results, which we report in the Appendix, are symmetric and very close.

Formally, consider a sequence of triples $\{(C_a^i, C_b^i, r^i)\}_{i=1}^{\infty}$. If

(i) $r^i \in (0, 1]$ for all $i \in \mathbb{N}$ with $\lim_{i \to \infty} r^i = 0$, and
(ii) $C_a^i$ and $C_b^i$ converge in distribution to $C_a^*$ and $C_b^*$, with either $P(C_a^* < C_b^*) = 0$ or $P(C_a^* > C_b^*) = 0$,

then we say that there is a vanishing minority and group-distributions that do not overlap in the limit. For any such sequence, let $A_M^i(x)$ denote the probability that there are at least x minority jurors in the jury selected by procedure M when group-distributions are $C_a^i$ and $C_b^i$ and the proportion of minority jurors in the population is $r^i$.

Proposition 3. Suppose that, under $\{(C_a^i, C_b^i, r^i)\}_{i=1}^{\infty}$, there is a vanishing minority and group distributions that do not overlap in the limit. Then for all $x \in \{1, \ldots, j\}$, there exists j sufficiently large such that $A_{\text{S\&R}}^i(x) > A_{\text{STR}}^i(x)$ for all i > j.

Footnote: Note that, despite the argument presented in the motivating example illustrated in Figure 6, Proposition 3 does not follow directly from Proposition 1. The reason is that, unlike in the motivating example, most of the sequences $\{(C_a^i, C_b^i, r^i)\}_{i=1}^{\infty}$ covered by Proposition 3 are such that $C^i = r^i C_a^i + (1 - r^i) C_b^i$ varies across the sequence (i.e., $C^j \neq C^h$ for most $j, h \in \mathbb{N}$).

Figure 6: Jury selection and minority representation in size-1 juries. Panels: (a) Distribution of c for selected juror (horizontal axis: c; vertical axis: density); (b) Minority representation in juries (horizontal axis: r; vertical axis: fraction of group-a in juries). Lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axes are averages across 20,000 simulated jury selections, fixing j = 1, d = p = 1, and C ∼ r · U[0, r] + (1 − r) · U[r, 1] throughout. The distribution in panel (a) is independent of r, whereas the lines in panel (b) interpolate results from 20 values of r.

Table 1: Representation of Group-a when Group-a is a minority of the pool

(a) Group-a represents 25% of the jury pool

Polarization                          Extreme        Moderate       Mild           (All)
Procedure                             S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities        0.10   0.08    0.18   0.16    0.23   0.23    0.25
Standard deviation                    0.11   0.11    0.12   0.12    0.12   0.12    0.12
Fraction of juries with at least 1    0.57   0.45    0.88   0.84    0.96   0.95    0.97

(b) Group-a represents 10% of the jury pool

Polarization                          Extreme        Moderate       Mild           (All)
Procedure                             S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities        0.02   0.00    0.05   0.04    0.09   0.08    0.10
Standard deviation                    0.04   0.01    0.07   0.06    0.08   0.08    0.09
Fraction of juries with at least 1    0.17   0.02    0.47   0.38    0.67   0.64    0.72

Note: The rows report the average fraction and standard deviation of group-a jury members, and the fraction of juries with at least one group-a juror, out of 50,000 simulations of jury selection with parameters j = 12 and d = p = 6. Conviction probabilities are drawn from the pairs of beta distributions labeled Extreme, Moderate, and Mild; see Figure 3 for the shape of these distributions.

Given the result in Proposition 3, it is natural to wonder how small the minority and the overlap between the group-distributions must be for S&R to select more minority jurors than STR. When the latter is true, one may also wonder about the size of $A_{\text{S\&R}}(x; r) - A_{\text{STR}}(x; r)$. Again, the answer naturally depends on the model's parameters. To inform these questions, we ran a set of simulations with d = p = 6 and j = 12 using the distributions displayed in Figure 3, where the green lines in each panel represent $f_a$ and the yellow lines $f_b$. The results of our simulations, displayed in Table 1, suggest that S&R might select more minority jurors than STR even when the size of the minority is relatively high (as high as 25%) and the overlap between the group-distributions is significant. However, without stark polarization across groups, differences between the procedures' propensities to select minority jurors appear to be small. For example, under the distributions we labeled as “extreme group heterogeneity” and with minorities representing 10% of the population, only 2.3% of juries selected by STR include at least one minority juror, whereas this number rises to 17.1% under S&R (random selection would generate over 70% of such juries). However, under the distributions we labeled as “mild group heterogeneity”, the same numbers become 66.5% under S&R and 64.5% under STR (random selection would generate 71.9% of juries with at least one minority juror in this second case).
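The STR and RAN columns of Table 1 can be approximated with a short Monte Carlo sketch. It rests on assumptions that are not spelled out in this section and should be read as such: STR is modeled as seating the middle j members of a panel of j + d + p (the plaintiff strikes the p lowest conviction probabilities, the defendant the d highest), RAN as seating j independent draws, and the Moderate case as group-a ∼ Beta(2, 4) and group-b ∼ Beta(4, 2). The function name and its arguments are hypothetical. S&R is omitted because it requires solving the sequential game.

```python
import random


def simulate_str_vs_ran(r, beta_a, beta_b, j=12, d=6, p=6, n_sims=5000, seed=0):
    """Monte Carlo sketch of group-a representation under STR and RAN.

    r       : population share of group-a (the minority)
    beta_a  : (alpha, beta) of group-a conviction probabilities (assumed)
    beta_b  : (alpha, beta) of group-b conviction probabilities (assumed)
    """
    rng = random.Random(seed)
    panel_size = j + d + p
    totals = {"STR": 0, "RAN": 0}
    at_least_one = {"STR": 0, "RAN": 0}
    for _ in range(n_sims):
        panel = []
        for _ in range(panel_size):
            in_group_a = rng.random() < r
            alpha, beta = beta_a if in_group_a else beta_b
            panel.append((rng.betavariate(alpha, beta), in_group_a))
        juries = {
            # RAN: the first j panel members are i.i.d. population draws.
            "RAN": panel[:j],
            # STR (assumed): drop the p lowest and d highest conviction
            # probabilities and seat the remaining j panel members.
            "STR": sorted(panel)[p:p + j],
        }
        for name, jury in juries.items():
            k = sum(in_a for _, in_a in jury)
            totals[name] += k
            at_least_one[name] += k > 0
    return {
        name: {
            "avg_share": totals[name] / (j * n_sims),
            "at_least_one": at_least_one[name] / n_sims,
        }
        for name in totals
    }


if __name__ == "__main__":
    results = simulate_str_vs_ran(0.25, (2, 4), (4, 2))
    for name, stats in results.items():
        print(name, round(stats["avg_share"], 3), round(stats["at_least_one"], 3))
```

Running it with r = 0.25 gives numbers that can be set against the Moderate STR and RAN columns of panel (a). Any gap between the two reflects the simplifying assumptions above rather than an error in the table.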
Footnote: Recall that $C_a$ and $C_b$ represent the parties' beliefs that randomly drawn group-a or group-b jurors eventually vote to convict the defendant. Polarized $C_a$ and $C_b$, therefore, correspond to groups that are perceived by the parties to have different probabilities of voting for conviction (whether or not this materializes when jurors actually vote on conviction at the end of the trial).

So far, we have compared STR and S&R assuming that the number of challenges the parties can use, d and p, was the same under each procedure. This was motivated by the fact that judges often have a lot of freedom in selecting the procedure through which the parties use their challenges (see Footnote 3). In contrast, the number of challenges that the parties can use is typically specified more rigidly by state rules of criminal procedure.

In the last decades, several states have, however, reduced the number of challenges the parties can use. In some instances, these reforms also clarify or alter the jury selection procedures used in the state. In the context of such broader reforms, it is natural to ask how the ability to change both the number of challenges the parties are entitled to and the procedure through which the parties exert their challenges affects the trade-off between the exclusion of extreme jurors and the representation of minorities.

Footnote: Examples include California's Senate Bill 843, passed in 2016, which reduces the number of challenges a criminal defendant is entitled to from 10 to 6 (for charges carrying a maximum punishment of one year in prison or less).

Footnote: Examples include the 2003 reform of jury selection in Tennessee, where some aspects of the jury selection procedure were codified to apply uniformly across the state, while the number of peremptory challenges was also slightly reduced (see Cohen and Cohen, 2003).

Figure 7: The effect of varying the number of challenges. Panels: (a) Fraction of extreme jurors (vertical axis: fraction of juries); (b) Fraction of minority jurors (vertical axis: fraction of minorities). Lines: S&R, STR.

Note: Fraction of juries with at least one juror below the 10th percentile (left panel) and fraction of minority jurors (right panel). For each set of parameters, results on the vertical axes are averages across 50,000 simulated jury selections, fixing j = 12 and C ∼ . · C a + 0 . · C b throughout (with the distributions for $C_a$ ∼ Beta(2, 4) and $C_b$ ∼ Beta(4, 2)). The numbers of challenges d = p are on the horizontal axes.

Throughout this section, we fix an arbitrary value of j and consider varying d = p. For any procedure M, let M-y denote the version of M when d = p = y. The notation for the two previous sections then carries over, with $T_{M\text{-}y}(x; c)$ denoting the probability that at least x jurors with conviction probability below c are selected under M-y, and $A_{M\text{-}y}(x)$ the probability that at least x minority jurors are selected under M-y.

Footnote: Again, in the case of extreme jurors, we focus on jurors who qualify as extreme because their conviction probability falls below a certain threshold c, though all of our results hold symmetrically for jurors who qualify as extreme because their conviction probability lies above a certain threshold c.

For illustration purposes, we first consider the case C ∼ . · C a + 0 . · C b, $C_a$ ∼ Beta(2, 4), $C_b$ ∼ Beta(4, 2) ($C_a$ and $C_b$ are illustrated in Figure 3(b)), and consider a juror as extreme if its conviction probability falls in the bottom 10th percentile of C. The fraction of juries with at least one extreme juror decreases as the number of challenges awarded to the parties increases, regardless of the procedure that is used (Figure 7(a)). Conversely, the fraction of minority jurors decreases with the number of challenges (Figure 7(b)). Under both STR and S&R, more challenges lead to fewer extreme jurors being selected at the expense of a worse representation of minorities.

As Figure 7(a) illustrates, however, increasing the number of challenges decreases the selection of extreme jurors much faster under STR than under S&R. As a consequence, for all values of y ∈ { , . . . , }, there exists w < y such that STR-w performs better than S&R-y in terms of both objectives.

The latter is not true in general. Even when there exists w such that STR-w better represents minorities than S&R-y, STR-w might still exclude fewer extreme jurors than S&R-y if jurors are considered extreme when their conviction probability falls below an arbitrary c > 0. However, an extension of Proposition 1 shows that when such a w exists, there also exists $\bar{c} > 0$ such that, for all thresholds $c < \bar{c}$, STR-w performs better than S&R-y in terms of both objectives.

Proposition 4. Consider any $x \in \{1, \ldots, j\}$ and any y ≥ . Suppose that there exists w ≥ such that $A_{\text{STR-}w}(x) > A_{\text{S\&R-}y}(x)$. Then for some $\bar{c} > 0$, we also have $T_{\text{STR-}w}(x; c) < T_{\text{S\&R-}y}(x; c)$ for all $c \in (0, \bar{c})$.

As emphasized in the analysis so far, group asymmetries in jury representation exist to the extent that groups have polarized preferences for conviction. In this section, we use jury selection data to estimate the distribution of conviction probabilities and provide quantitative evidence of the effect of jury selection procedures and their differences.

Jury selection data is, to our knowledge, relatively scarce. For the purposes of this section, we exploit data from Craft (2018) on peremptory strikes in the Fifth Circuit Court District of Mississippi from 1992 to 2017, where a version of
STR was used to select jurors.

Footnote: Specifically, in this example, for any y ∈ { , . . . , }, there exists w ∈ { , . . . , y − } such that A STR-w (1) > A S&R-y (1) and T STR-w (1; 0 . ) < T S&R-y (1; 0 . ).

Footnote: Besides the data used in this section, another important source is the data of jury selection in North Carolina described in Wright et al. (2018) and analyzed in Flanagan (2018). We do not use this source because the jury selection procedures adopted in these jurisdictions do not conform to the rules we study in this paper.

Footnote: While the adopted procedure differs in some details from the stylized version we analyzed in this paper, we assume that in equilibrium, its outcome conforms to that of STR. In addition, the number of jurors selected and the number of challenges available sometimes differ by type of trial. As we explain below, there is some variation in the number of jurors in the data and in the number of challenges used by parties (due to variation in the kind of offenses being prosecuted as well as in judges' decisions in the allocation of additional challenges for the selection of alternate jurors). However, the moments we use for identification rely only on race ratios and are relatively stable across juries of different size.

For each trial, the data reports the race and gender of the potential jurors, whether a juror was struck by the defendant or the state, and the race and gender composition of the seated jury and alternate jurors. This allows the computation of jury composition by race, and the computation of challenges by race for each party.

We limit our analysis to the juries' racial composition, focusing on Black and White jurors only. Assuming that the distributions of conviction probabilities in each group belong to the class of beta distributions, the model parameters are five: the fraction of whites in the jury pool, 1 − r, which we directly observe in the data, and the four parameters of $f_{\text{Blacks}}$ and $f_{\text{Whites}}$. The data we observe does not allow us to identify both of these distributions. Given r, for any given $f_{\text{Blacks}} = Beta(\alpha_a, \beta_a)$, it is always possible to find $f_{\text{Whites}} = Beta(\alpha_b, \beta_b)$ that replicates the same proportion of whites struck by the defendant and by the State of Mississippi, the plaintiff (which in turn determine the fraction of whites in the jury). Intuitively, the reason behind this lack of identification is that it is possible to shift some mass of both distributions to the right without changing, on average, the racial composition of the juries. While this shift would cause the conviction frequency at trial to change, using this moment for identification would not change the outcomes we focus on in this paper for STR (see Footnote 34).

Footnote: The full sample includes almost 15,000 jurors, of which 26% are Black, 42% are White, 32% are of unknown race, and only 3 Latinos and 1 Asian, which we pool with the Whites.

Footnote: With beta distributions, matching these two moments also matches the proportion of juries with x jurors of a given race, for all x ∈ { , . . . , j }, making it impossible to use higher moments for identification.

In Table 2 we report some summary statistics from the data. The sample contains 292 trials, of which 229 include black defendants. We exclude all jurors dismissed by the judge for causes that are not the focus of our analysis. Hence, we define the size of the panel as the sum of the number of jurors, alternate jurors, and jurors dismissed by either the state or the defendant. There is some variation in the size of both the juries and the panel, in part due to the fact that the process of selecting alternate jurors is separate. Unfortunately, the data does not distinguish between jurors who were dismissed in the course of selecting regular jurors and those dismissed in the course of selecting alternates. We present data for 5 samples that vary depending on the race of the defendant, the size of the panel, and whether or not we include panels containing jurors of unknown race. These show that the racial composition of juries and challenges is affected by the race of the defendant but is only weakly affected by the way we select our sample.

Table 2: Summary statistics

Sample selection                      (1)      (2)      (3)      (4)      (5)
Defendant                             White    Black    Black    Black    Black
Size of jury pool                     Any      Any      ≤ 27     Any      ≤

Trial statistics
Average size of jury pool             26.1     26.9     23.7     26.2     23.5
  (std)                               (5.0)    (5.8)    (2.5)    (5.7)    (2.6)
Average size of jury                  12.0     12.0     12.0     12.0     12.0
  (std)                               (0.3)    (0.4)    (0.4)    (0.2)    (0.2)
% with unknown race in jury pool      31.2     30.7     26.9     0.0      0.0

Percentage of whites*
  in jury pool                        63.1     62.7     63.1     66.5     65.9
  in jury                             61.0     66.8     67.8     70.5     69.7
  among struck by the defendant       86.2     91.4     92.3     93.1     92.9
  among struck by the state           40.8     23.6     21.4     23.5     21.6

Standard deviations in parentheses. * Percentage of white jurors in samples (1), (2), and (3) computed among jurors that have been classified as either whites or blacks.

The average size of the jury (excluding alternates) is 12 in all samples, though the panels are slightly over 24, mainly because they include potential alternate jurors (and because, in some cases, judges may grant additional challenges to the parties). Challenge behavior is affected by the race of the defendant: juries with black defendants have a higher percentage of whites than the panel does, whereas juries with white defendants include fewer whites. When the defendant is black, the defense challenges a higher fraction of white jurors, and the state a higher fraction of black jurors. Variation in the size of the jury pool has little impact on the racial composition of the juries or challenged jurors (for either party). Focusing on trials with Black defendants, the fraction of whites in the pool is quite stable across samples (between 62.7 and 66.5 percent). This is predicted by our theory when jurors have polarized views that favor defendants of their own race. The behavior of the parties differs substantially by race: in sample (5), which we use to estimate our model, 93% of the jurors struck by the defendant are White, whereas only 22% of the jurors struck by the state are White. We use these two moments to estimate the distribution of conviction probabilities.

We proceed by assuming $f_{\text{Blacks}} = Beta(2, 4)$ and estimating $f_{\text{Whites}}$ to match the fraction of white jurors struck by the defendant and the plaintiff (the last two moments of Table 2) using sample (5). The estimated parameters of $f_{\text{Whites}}$ are (alpha =

Figure 8: Counterfactual analysis: Juries with at least one extreme juror. Horizontal axis: c; vertical axis: fraction of juries; lines: RAN, S&R, STR.

Note: For each set of parameters, results on the vertical axis are averages across 50,000 simulated jury selections, fixing j = 12, d = p = 6, and C ∼ . · C a + 0 . · C b throughout (with the distributions for $C_a$ ∼ Beta(2, 4) and $C_b$ ∼ Beta (5 . , . ).

Figure 9: Counterfactual analysis: Number of challenges. Panels: (a) Fraction of extreme jurors (vertical axis: fraction of juries); (b) Fraction of minority jurors (vertical axis: fraction of minorities). Lines: S&R, STR.

Note: Fraction of juries with at least one juror below the 10th percentile (left panel) and fraction of minority jurors (right panel). For each set of parameters, results on the vertical axes are averages across 50,000 simulated jury selections, fixing j = 12, d = p = 6, and C ∼ . · C a + 0 . · C b throughout (with the distributions for $C_a$ ∼ Beta(2, 4) and $C_b$ ∼ Beta (5 . , . ). The numbers of challenges d = p are on the horizontal axes.

Figure 8 reports the results of simulations computed with the estimated parameters. The figure reveals that the procedure adopted by this jurisdiction, a version of STR where each party is allowed 6 challenges, is much more effective at excluding extreme jurors than a counterfactual S&R. The adopted procedure excludes nearly every juror below the 10th percentile, c = 0.21, whereas S&R with the same number of challenges would produce about 27% of juries with at least one juror more extreme than 0.21.

Figure 9, however, suggests that a change to S&R could improve the representation of minorities. Keeping the number of challenges at 6, S&R would include 6% more minorities than STR (about 27% vs 25%) and would produce a jury with 4 black jurors (about the same as the black representation in the jury pool) 12% more often (about 41% vs 37%). To reach a similar representation, the number of challenges in STR would have to be reduced to 4, though this would increase the fraction of juries with jurors below the 10th percentile from almost zero to 4.4%.

Footnote: Standard errors computed by bootstrapping 200 replications of the data set. We also tried assuming a left-skewed f Blacks = Beta (10 , f Whites = Beta (23 . , . STR, the results obtained with these alternative distributions are almost identical to the ones reported in Figure 9. S&R selects about the same number of minorities across all numbers of challenges, but is capable of excluding fewer jurors below the 10th percentile by about 7 percentage points.

This analysis suggests that the data is consistent with the parties believing in a distribution that makes the two procedures significantly different in their ability to exclude extreme jurors. The data is also consistent with beliefs in sizeable heterogeneity between juror-groups which, in turn, implies that the procedures also differ in their ability to select minorities.
The primary purpose of jury selection is to prevent extreme potential jurors from serving on the effective jury (see Footnote 1 and its associated quote). In our model, it seems natural to interpret this goal as that of limiting the selection of jurors coming from the tail of the distribution. This is the interpretation of extreme that we have studied thus far.

Although it is perhaps less clear that it aligns with the goals of practitioners, another approach could be to consider the extremism of juries as a whole. For example, extreme juries could be viewed as juries in which the juror with the highest or lowest conviction probability is extreme. Through variants of the arguments in the proofs of Propositions 1 and 2, one can show that, in that sense too, STR is more effective than both S&R and RAN at excluding extreme juries.

Another measure of juries' extremism, proposed by Flanagan (2015), is whether a jury is excessively “unbalanced” in the sense of featuring a disproportionate share of jurors coming from one side of the median of C. Interestingly, Flanagan shows that STR introduces correlation between the selected jurors, which leads the procedure to select more unbalanced juries than
RAN. Even though panels are the result of independent draws from the population, jurors selected under STR have conviction probabilities between those of the challenged jurors. For example, a selected juror with conviction probability between 0.25 and 0.75 indicates that challenges were used on jurors with conviction probabilities outside the [0.25, 0.75] range. The latter makes it more likely that STR selected additional jurors in the [0.25, 0.75] range. Flanagan shows that (when d = p), the probability that all selected jurors come from one side of the median is larger under STR than under

Footnote: Specifically, for any $x \in \{1, \ldots, j - 1\}$, there exist $\underline{c} > 0$ and $\bar{c} < 1$ such that (a) for every $c \in (0, \underline{c})$, the probability that the lowest conviction probability in the jury is smaller than c is larger under S&R and RAN than under STR, and (b) for every $c \in (\bar{c}, 1)$, the probability that the highest conviction probability in the jury is larger than c is larger under S&R and RAN than under STR.
RAN.

Our next proposition generalizes this result. Using a new proof technique, we show that for any x larger than half the jury-size, the probability of selecting at least x jurors from one side of the median is larger under STR than under RAN. Similar to Section 4, we focus for brevity on the probability that the selected jurors are below the median. All our results are, however, symmetrical and apply identically to the probability of selecting jurors above the median. Let med[C] denote the median of C.

Proposition 5. If d = p, then for any $x \in \{n/2 + 1, \ldots, n\}$ if n is even, and any $x \in \{n/2 + 0.5, \ldots, n\}$ if n is odd, we have $T_{\text{STR}}(x; \mathrm{med}[C]) > T_{\text{RAN}}(x; \mathrm{med}[C])$.

Figure 10 illustrates Proposition 5 and that a similar statement does not hold for
S&R. For $M \in \{STR, RAN\}$, the value of $T_M(x; \mathrm{med}[C])$ can be computed analytically and does not depend on the distribution of $C$. For $M = S\&R$, the value of $T_M(x; \mathrm{med}[C])$ depends on the distribution in a complex fashion, and it is not possible to generally compare S&R with the two other procedures in terms of $T_M(x; \mathrm{med}[C])$. As the figure illustrates, the probability of selecting at least $x$ jurors below $\mathrm{med}[C]$ can, in some cases (in the figure, $x = 7$ and, barely, $x = 8$ jurors), be larger under S&R than under both RAN and STR. In other cases, however, the same probability is lower under S&R than under both RAN and STR.

Figure 10 displays the result of simulations when the distribution of $C$ is highly polarized (a mixture of $Beta(1, 5)$ and $Beta(5, 1)$), which leads S&R to more often select a majority of jurors below the median than STR. Also, for lower levels of polarization, S&R selects juries made of a majority of jurors below the median less often than RAN.

Footnote 36: Specifically, $T_{RAN}(x; \mathrm{med}[C]) = P(Bi[j, 0.5] \geq x)$ whereas $T_{STR}(x; \mathrm{med}[C]) = P(Bi[j + d + p, 0.5] \geq x + p)$.

Figure 10: Selection of jurors below the median
[Plot not reproduced. Horizontal axis: number of jurors below the median. Vertical axis: fraction of juries (difference with RAN). Series: STR, and S&R for three values of $r$.]
Note: Fraction of juries with at least a given number of jurors below the median of $C$ under STR (green dashed line) and S&R (continuous lines) relative to the same fraction under RAN (i.e., $T_M(x; \mathrm{med}[C]) - T_{RAN}(x; \mathrm{med}[C])$). Throughout, we fix $j = 12$, $d = p = 6$ and $C \sim r \cdot Beta(1, 5) + (1 - r) \cdot Beta(5, 1)$ (for three values of $r$), whereas the number of jurors below the median is on the horizontal axis. For each set of parameters, results for S&R are averages across 50,000 simulated jury selections, whereas values for RAN and STR are computed analytically and are independent of $r$ (see Footnote 36).

Concerns about the effect of jury selection on group representation often focus on the representation of racial minorities. Though the U.S. Supreme Court initially banned challenges based on race in
Batson v. Kentucky (1986), it later also banned challenges based on gender in J.E.B. v. Alabama (1994). In this context, it is natural to ask whether the advantages of S&R in terms of minority representation come at the cost of a worse representation of gender groups.

Footnote: Because the parties' actions under S&R are influenced by the mean of the distribution but not in any clear way by the median (and because of the complexity of the game tree), we were unable to formalize the effect of polarization on these comparisons in terms of the model parameters.

Unlike minorities, which correspond to groups of unequal sizes represented by small $r$, gender groups can be thought of as even-sized groups and are better modeled using $r \approx 0.5$. With groups of similar sizes, both procedures almost always select at least a few members from either group. It is therefore more interesting to compare procedures in terms of the proportion of group-a jurors they select (than in terms of the probability of selecting at least $x$ members from group-a, as we did before).

In this last section, we let $r = 0.5$ and consider the proportion of group-a jurors selected under STR and S&R. We denote these proportions $r_{STR}$ and $r_{S\&R}$ and focus on how close $r_{STR}$ and $r_{S\&R}$ are to the 50% of group-a jurors that prevail in the population.

As in the last two sections, it is not possible to generally compare
STR and S&R in terms of the procedures' ability to select an even proportion of group-a and group-b jurors. In some cases, $r_{STR}$ can be further away from 50% than $r_{S\&R}$, and the converse may be true in other cases. For example, with $d = p = 6$ and $j = 12$, if C a ∼ U [0 ,
1] and C b ∼ Beta (1 , r STR = 43 .
7% whereas r S&R = 45 . C a ∼ Beta [4 ,
2] and C b ∼ Beta (1 , r STR = 50 .
3% whereas r S&R = 52 . r STR get closer to 50% . Our next proposition confirms this pattern. If the group distributions are symmetric or if they do not overlap, and if $d = p$, then $r_{STR} = 50\%$, whereas S&R does not necessarily select an even proportion of jurors from each group. The latter follows from the fact that, even when $r = 50\%$ and distributions are symmetrical, the multiplicative utility function that the parties use to assess the value of a jury (which is itself a consequence of the fact that juries must reach unanimous decisions) creates asymmetries in the use of challenges under S&R. We say that random variables $C_a$ and $C_b$ are symmetric if $f_a(c) = f_b(1 - c)$ for all $c \in [0, 1]$.

Proposition 6.
Suppose that $r = 0.5$ and $d = p$. If (a) the two group distributions do not overlap, or (b) $C_a$ and $C_b$ are symmetric, then $r_{STR} = r_{RAN}$.

Table 3(a) illustrates Proposition 6 and the fact that a similar statement does not hold for S&R. Unlike STR, S&R can select unequal numbers of group-a and group-b jurors even when distributions are symmetrical across groups. Therefore, as a consequence of Proposition 6, $r_{S\&R}$ can in these cases be further away than $r_{STR}$ from the 50% of group-a jurors that prevail in the population.

Table 3(a), however, suggests that these differences may be quantitatively small, and that sizable differences may require high levels of polarization between groups. Tables 3(b) and 3(c) also report the results of simulations in which the symmetries required for Proposition 6 to hold are slightly relaxed. These indicate that the advantage of STR in the representation of balanced groups established in Proposition 6 (i.e., the fact that $r_{STR}$ is closer to 50% than $r_{S\&R}$) may not be robust to even mild relaxations of these symmetries. In particular, when r = 0 . r STR is consistently closer than r S&R to the 55% of group-a that prevail in the population (see Table 1). Also, when r = 0 . r S&R are identical except in the most polarized case.

Footnote: Previous results are stronger in the sense that they establish a first-order stochastic dominance between the number of jurors with certain characteristics (extremism or group membership) selected under STR and S&R. As we explain after Proposition 1, showing, for example, that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $x \in \{1, \dots, j\}$ directly implies that the expected proportion of selected jurors with conviction probability $c_i < c$ is lower under STR than under S&R (whereas the converse is not true).

Footnote: Flanagan (2015) shows that, in this symmetrical case, the asymmetry of the payoffs still forces the defendant to be more conservative than the plaintiff when using its challenges, hence leading to an uneven selection of jurors from the two groups.
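The STR part of Proposition 6 can be checked numerically. The sketch below is illustrative only and is not the authors' simulation code. It assumes, following the description of STR in the proofs, that the plaintiff strikes the p panel members with the lowest conviction probabilities and the defendant strikes the d members with the highest. The symmetric pair Beta(1, 5) and Beta(5, 1) is an assumed example of polarized group distributions, and the function name and default arguments are hypothetical.

```python
import random


def simulate_str_group_share(j=12, d=6, p=6, r=0.5,
                             params_a=(1, 5), params_b=(5, 1),
                             n_sims=20000, seed=0):
    """Average share of group-a jurors in juries selected by STR.

    Assumption: under STR the plaintiff removes the p lowest conviction
    probabilities in the panel and the defendant removes the d highest.
    """
    rng = random.Random(seed)
    n = j + d + p  # panel size
    group_a_selected = 0
    for _ in range(n_sims):
        panel = []
        for _ in range(n):
            # Each panel member belongs to group a with probability r.
            if rng.random() < r:
                panel.append((rng.betavariate(*params_a), "a"))
            else:
                panel.append((rng.betavariate(*params_b), "b"))
        panel.sort()             # ascending conviction probability
        jury = panel[p:n - d]    # drop p lowest and d highest
        group_a_selected += sum(1 for _, group in jury if group == "a")
    return group_a_selected / (n_sims * j)


if __name__ == "__main__":
    print(simulate_str_group_share())
```

With r = 0.5, d = p, and symmetric group distributions, the simulated share should be close to 0.5, in line with Proposition 6. Passing an asymmetric pair of Beta parameters lets one explore how the share moves once the symmetry is relaxed.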
In this paper, we study the relative performance of two stylized jury-selection procedures. Strike and Replace presents potential jurors one-by-one to the parties, whereas the Struck procedure presents all potential jurors before the parties exercise vetoes. When jurors differ in their probability of voting for the defendant's conviction and in group membership, we show that, when groups have polarized views, Struck is more effective at excluding jurors with extreme views but generally selects fewer members of a minority group than Strike and Replace, leading to a conflict between these two goals.

Sociologists Small and Pager (2020) argue that systemic factors may lead to disparate outcomes even in the absence of taste-based or statistical discrimination, the traditional explanations provided in economic theory. This paper formalizes an example in which the pursuit of one legitimate objective, preventing extreme jurors from serving on juries, may lead to group disparities.

Footnote: That is, either $P(C_a > C_b) = 0$ or $P(C_b > C_a) = 0$. The same result would apply if the two distributions did not overlap in the limit as in Proposition 3.

Table 3: Representation of Group-a jurors with balanced group sizes

Polarization                      Extreme         Moderate        Mild            (All)
Procedure                         S&R     STR     S&R     STR     S&R     STR     RAN
Average fraction of minorities    0.48    0.50    0.49    0.50    0.50    0.50    0.50
Standard deviation                0.18    0.20    0.16    0.17    0.15    0.15    0.14
(a) Group-a proportion r = 0.5, group distributions as in Figure 3.

Polarization                      Extreme         Moderate        Mild            (All)
Procedure                         S&R     STR     S&R     STR     S&R     STR     RAN
Average fraction of minorities    0.39    0.40    0.42    0.42    0.45    0.44    0.45
Standard deviation                0.18    0.20    0.16    0.17    0.15    0.15    0.14
(b) Group-a proportion r = 0.…, group distributions as in Figure 3.

Polarization                      Extreme*        Moderate*       Mild*           (All)
Procedure                         S&R     STR     S&R     STR     S&R     STR     RAN
Average fraction of minorities    0.47    0.50    0.49    0.48    0.49    0.48    0.50
Standard deviation                0.18    0.20    0.15    0.16    0.15    0.16    0.14
(c) Group-a proportion r = 0.…, group distributions slightly asymmetric*

* In panel (c), Extreme* corresponds to $C_a \sim Beta(1, 5)$ and $C_b \sim Beta(5, …)$, Moderate* to $C_a \sim Beta(2, …)$ and $C_b \sim Beta(4, …)$, and Mild* to $C_a \sim Beta(3, 4)$ and $C_b \sim Beta(4, …)$.

Note: The rows report the average number and standard deviation of group-a jury members out of 50,000 simulations of jury selection with parameters j = 12 and d = p = 6.

Appendix: Proofs
A.1 Preliminary technical results
A.1.1 Limit of a ratio of binomial probabilities

Lemma 1. For all $\eta \in \mathbb{N}$ and any $k \in \{1, \dots, \eta - 1\}$,
$$\lim_{\pi \to 0} \frac{P[Bi(\eta, \pi) = k]}{P[Bi(\eta, \pi) > k]} = \infty.$$

Proof. Using the standard formula for the p.d.f. of a binomial and the representation of the c.d.f. of the binomial with the regularized incomplete beta function, we can rewrite the ratio as
$$\frac{P[Bi(\eta, \pi) = k]}{1 - P[Bi(\eta, \pi) \leq k]} = \frac{\binom{\eta}{k} \pi^k (1 - \pi)^{\eta - k}}{1 - (\eta - k)\binom{\eta}{k} \int_0^{1 - \pi} x^{\eta - k - 1} (1 - x)^k \, dx}. \tag{1}$$
As $\pi \to 0$, both the numerator and the denominator tend to 0. We use L'Hopital's rule to complete the proof:
$$\frac{(\partial / \partial \pi) \binom{\eta}{k} \pi^k (1 - \pi)^{\eta - k}}{(\partial / \partial \pi) \left( 1 - \left[ (\eta - k) \binom{\eta}{k} \int_0^{1 - \pi} x^{\eta - k - 1} (1 - x)^k \, dx \right] \right)} = \frac{\binom{\eta}{k} \left[ k \pi^{k - 1} (1 - \pi)^{\eta - k} - \pi^k (\eta - k)(1 - \pi)^{\eta - k - 1} \right]}{- (\eta - k) \binom{\eta}{k} \left[ (-1) \cdot (1 - \pi)^{\eta - k - 1} \pi^k \right]}$$
$$= \frac{k \pi^{k - 1} (1 - \pi)^{\eta - k}}{(\eta - k)(1 - \pi)^{\eta - k - 1} \pi^k} - \frac{\pi^k (\eta - k)(1 - \pi)^{\eta - k - 1}}{(\eta - k)(1 - \pi)^{\eta - k - 1} \pi^k} = \frac{k (1 - \pi)}{(\eta - k) \pi} - 1 \xrightarrow[\pi \to 0]{} \infty. \qquad \square$$

A.1.2 Continuity of challenge thresholds in
S&R as $C^i$ converges in distribution

Lemma 2. Consider a sequence of random variables $\{C^i\}_{i=1}^{\infty}$ that converges in distribution to some random variable $C^*$. Let $t_I(\gamma, C^i)$ denote the challenge threshold used by party $I \in \{D, P\}$ in an arbitrary subgame $\gamma$ of S&R when the distribution of conviction probabilities is $C^i$. For any such subgame $\gamma$, we have $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C^*)$.

Proof. In any subgame $\tilde{\gamma}$, $t_I(\tilde{\gamma}, C^i)$ is the ratio of the value of continuation subgames if $I$ challenges the presented juror, or if both parties abstain from challenging (Brams and Davis, 1978). Therefore, $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C^*)$ follows directly if we show that the value of any subgame, which we denote $V(\gamma, C^i)$, converges to $V(\gamma, C^*)$ as $i$ tends to infinity.

The latter follows directly from the recursive characterization of $V(\gamma, C^i)$ in Brams and Davis (1978). Recall that each subgame $\gamma$ can be characterized by the number of jurors $\kappa$ that remain to be selected, the number of challenges left to the defendant $\delta$, and the number of challenges left to the plaintiff $\pi$. With this notation, the recursive proof that for all $\kappa, \delta, \pi \geq 0$, $V([\kappa, \delta, \pi], C^i)$ converges to $V([\kappa, \delta, \pi], C^*)$ as $i$ tends to infinity can be decomposed into a number of cases. Let $F^i(c)$ denote the c.d.f. of $C^i$, $F^*(c)$ the c.d.f. of $C^*$, and $F(c)$ the c.d.f. of an arbitrary distribution $C$, with $\mu^i$, $\mu^*$, and $\mu$ being the corresponding expected values. In each step, the initial formula for $V([\kappa, \delta, \pi], C^i)$ is taken from Brams and Davis (1978).

Case 1: $\kappa = 0$, $\delta \geq 0$, $\pi \geq 0$. In this case, $V([0, \delta, \pi], C) = 1$ for all $C$ and the convergence of $V([0, \delta, \pi], C^i)$ to $V([0, \delta, \pi], C^*)$ follows trivially.

Case 2: $\kappa > 0$, $\delta = 0$, $\pi = 0$. In this case, $V([\kappa, 0, 0], C) = \mu^{\kappa}$ for all $C$ and the convergence of $V([\kappa, 0, 0], C^i)$ to $V([\kappa, 0, 0], C^*)$ follows from the fact that $C^i$ converges in distribution to $C^*$.

Case 3: $\kappa > 0$, $\delta = 0$, $\pi > 0$. In this case, for all $C$,
$$V([\kappa, 0, \pi], C) = V([\kappa - 1, 0, \pi], C) * \left[ 1 - \int_{t_I([\kappa, 0, \pi], C)}^{1} F(c) \, dc \right],$$
and $t_I([\kappa, 0, \pi], C) = V([\kappa, 0, \pi - 1], C) / V([\kappa - 1, 0, \pi], C)$. The convergence of $V([\kappa, 0, \pi], C^i)$ to $V([\kappa, 0, \pi], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$.

Case 4: $\kappa > 0$, $\delta > 0$, $\pi = 0$. In this case, for all $C$,
$$V([\kappa, \delta, 0], C) = V([\kappa, \delta - 1, 0], C) - V([\kappa - 1, \delta, 0], C) * \int_0^{t_D([\kappa, \delta, 0], C)} F(c) \, dc,$$
where $t_D([\kappa, \delta, 0], C) = V([\kappa, \delta - 1, 0], C) / V([\kappa - 1, \delta, 0], C)$. The convergence of $V([\kappa, \delta, 0], C^i)$ to $V([\kappa, \delta, 0], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$.

Case 5: $\kappa > 0$, $\delta > 0$, $\pi > 0$. In this case, for all $C$,
$$V([\kappa, \delta, \pi], C) = V([\kappa, \delta - 1, \pi], C) - V([\kappa - 1, \delta, \pi], C) * \int_{t_I([\kappa, \delta, \pi], C)}^{t_D([\kappa, \delta, \pi], C)} F(c) \, dc,$$
where $t_D([\kappa, \delta, \pi], C) = V([\kappa, \delta - 1, \pi], C) / V([\kappa - 1, \delta, \pi], C)$ and $t_I([\kappa, \delta, \pi], C) = V([\kappa, \delta, \pi - 1], C) / V([\kappa - 1, \delta, \pi], C)$. The convergence of $V([\kappa, \delta, \pi], C^i)$ to $V([\kappa, \delta, \pi], C^*)$ then follows recursively from the previous cases and from $C^i$ converging in distribution to $C^*$. $\square$

Footnote: Because we assume that all distributions of conviction probabilities are continuous, there are no issues related to the possibility for the bottom of one of these ratios to converge to zero.

A.1.3 Comparative statics of probabilities from a symmetric binomial

Lemma 3. $P[Bi(\eta + 2, 0.5) \geq k + 1] > P[Bi(\eta, 0.5) \geq k]$ if and only if $k > \frac{\eta}{2} + 0.5$.

Proof. We can decompose $P[Bi(\eta + 2, 0.5) \geq k + 1]$ in terms of $Bi(\eta, 0.5)$ and $Bi(2, 0.5)$:
$$P[Bi(\eta + 2, 0.5) \geq k + 1] = P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k] * P[Bi(2, 0.5) \geq 1] + P[Bi(\eta, 0.5) = k - 1] * P[Bi(2, 0.5) = 2]$$
$$= P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k] * 0.75 + P[Bi(\eta, 0.5) = k - 1] * 0.25.$$
Also,
$$P[Bi(\eta, 0.5) \geq k] = P[Bi(\eta, 0.5) \geq k + 1] + P[Bi(\eta, 0.5) = k].$$
Together, the last two equalities imply that $P[Bi(\eta + 2, 0.5) \geq k + 1] > P[Bi(\eta, 0.5) \geq k]$ if and only if
$$P[Bi(\eta, 0.5) = k] * 0.75 + P[Bi(\eta, 0.5) = k - 1] * 0.25 > P[Bi(\eta, 0.5) = k]$$
$$P[Bi(\eta, 0.5) = k - 1] * 0.25 > P[Bi(\eta, 0.5) = k] * 0.25$$
$$P[Bi(\eta, 0.5) = k - 1] > P[Bi(\eta, 0.5) = k]$$
$$\binom{\eta}{k - 1} 0.5^{k - 1} \, 0.5^{\eta - (k - 1)} > \binom{\eta}{k} 0.5^{k} \, 0.5^{\eta - k}$$
$$\frac{\eta!}{(\eta - [k - 1])! \, (k - 1)!} > \frac{\eta!}{(\eta - k)! \, k!}$$
$$\frac{(\eta - k)!}{(\eta - [k - 1])!} > \frac{(k - 1)!}{k!}$$
$$\frac{1}{\eta - k + 1} > \frac{1}{k}$$
$$k > \frac{\eta}{2} + 0.5.$$

A.1.4 Relationship between order statistics of symmetric distributions
For any number of draws $w$ and any $k \leq w$, let $C_g^{k,w}$ denote the $k$-th order statistic out of $w$ draws from distribution $C_g$, and $f_g^{k,w}(x)$ the corresponding probability density function.

Lemma 4. Suppose that $C_a$ and $C_b$ are symmetric. Then, for any $w \in \mathbb{N}$ and any $k \in \{1, \dots, w\}$, we have $f_a^{k,w}(c) = f_b^{w - k + 1, w}(1 - c)$ for all $c \in [0, 1]$.

Proof. Recall that, by definition, $C_a$ and $C_b$ being symmetric implies $f_a(c) = f_b(1 - c)$ for all $c \in [0, 1]$, and hence $F_a(c) = 1 - F_b(1 - c)$ for all $c \in [0, 1]$. Therefore,
$$f_a^{k,w}(c) = k \binom{w}{k} f_a(c) [F_a(c)]^{k - 1} [1 - F_a(c)]^{w - k}$$
$$= k \binom{w}{k} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [1 - (1 - F_b(1 - c))]^{w - k}$$
$$= k \frac{w!}{(w - k)! \, k!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \frac{w!}{(w - k + 1)! \, (k - 1)!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \frac{w!}{(w - k + 1)! \, (w - (w - k + 1))!} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= (w - k + 1) \binom{w}{w - k + 1} f_b(1 - c) [1 - F_b(1 - c)]^{k - 1} [F_b(1 - c)]^{w - k}$$
$$= f_b^{w - k + 1, w}(1 - c). \qquad \square$$

A.2 Section 4: Effectiveness at excluding extremes
A.2.1 Proof of Proposition 1
Consider an arbitrary $c \in (0, 1)$ and let us refer to jurors with conviction probability no larger than $c$ as extreme jurors. Let $T_M(x; c \mid k)$ denote the probability that at least $x$ extreme jurors are selected by procedure $M$ conditional on there being exactly $k$ extreme jurors in the panel of $n$. By the Law of Total Probability,
$$T_M(x; c) = \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] T_M(x; c \mid k). \tag{2}$$
Consider first the STR procedure. Note that for all $c$, we have $T_{STR}(x; c \mid x) = 0$ because if there are exactly $x$ extreme jurors in the panel, one of them is necessarily challenged by the plaintiff under STR (recall that $p \geq 1$). Therefore,
$$T_{STR}(x; c) = \sum_{k = x + 1}^{n} P\left[ Bi(n, F(c)) = k \right] T_{STR}(x; c \mid k) \leq P\left[ Bi(n, F(c)) > x \right], \tag{3}$$
where the last inequality follows from the fact that $T_{STR}(x; c \mid k) \in [0, 1]$ for all $k$ (as $T_{STR}(x; c \mid k)$ is a probability).

Next, consider procedure S&R. Our goal is to construct a lower bound for the probability of selecting an extreme juror and show that, as $c \to 0$, this lower bound does not converge to 0 as fast as (3). To do so, we introduce a decreasing function $\sigma(c) > 0$ such that, when $c$ is sufficiently small, $T_{S\&R}(x; c \mid k) \geq \sigma(c)$ for any $k \geq x$. To construct $\sigma$, consider the restricted sample space in which there are $k$ extreme jurors in the panel.

Let $t_P$ be the lowest challenge threshold used by the plaintiff in any subgame of the S&R procedure. Clearly, $t_P > 0$ (see footnote (i) below). Henceforth, we focus on $c \in (0, t_P)$. We first consider the function $\alpha(c)$ defined as the probability that $c_j \in (c, t_P)$ for all the $(n - k)$ non-extreme jurors in the panel. Because $C$ is continuous and 0 is the lower bound of its support, there exists $y > 0$ such that $\alpha(c) > 0$ for all $c \in [0, y]$ (see footnote (ii) below). Also, $\alpha(c)$ is weakly decreasing in $c$.

By construction of $t_P$, for such panels (with $k$ extreme jurors and $c_j \in (c, t_P)$ for all the $(n - k)$ non-extreme jurors), the plaintiff uses all its challenges on the $p$ first jurors it is presented with, and the defendant never uses any challenges (see footnote (iii) below). Therefore, for these panels, the probability that all $k$ extreme jurors are selected is the probability that none of these jurors are among the $p$ first jurors presented to the parties, i.e., $\binom{n - p}{k} / \binom{n}{k}$. Overall, for $c \in (0, t_P)$, we have $T_{S\&R}(x; c \mid k) \geq \alpha(c) \cdot \binom{n - p}{k} / \binom{n}{k}$, and $\sigma(c) := \alpha(c) \cdot \binom{n - p}{k} / \binom{n}{k}$ has the desired property.

Applying $T_{S\&R}(x; c \mid k) \geq \sigma(c)$ to (2) with $M = S\&R$, we obtain for all $c$ sufficiently small (specifically $c \in (0, t_P)$)
$$T_{S\&R}(x; c) \geq \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] * \sigma(c) \geq P\left[ Bi(n, F(c)) = x \right] * \sigma(c). \tag{4}$$
Overall, combining (3) and (4) yields
$$\lim_{c \to 0} \frac{T_{S\&R}(x; c)}{T_{STR}(x; c)} \geq \lim_{c \to 0} \frac{P\left[ Bi(n, F(c)) = x \right] * \sigma(c)}{P\left[ Bi(n, F(c)) > x \right]} = \infty, \tag{5}$$
where the last equality follows from Lemma 1 and the fact that $\sigma(c) > 0$ is decreasing in $c$. In turn, $\lim_{c \to 0} T_{S\&R}(x; c) / T_{STR}(x; c) = \infty$ and the fact that $\lim_{c \to 0} T_{S\&R}(x; c) = \lim_{c \to 0} T_{STR}(x; c) = 0$ together imply that there exists some $\bar{c} > 0$ such that $T_{STR}(x; c) < T_{S\&R}(x; c)$ for all $c \in (0, \bar{c})$.

Footnote (i): Formally, if $\Gamma$ denotes the set of subgames of S&R and $t_P(\gamma)$ the plaintiff's challenge threshold in any subgame $\gamma \in \Gamma$, then $t_P = \min_{\gamma \in \Gamma} t_P(\gamma)$ (the minimum is well-defined since $\Gamma$ is of finite size). In any subgame $\gamma$ of S&R, there is always a conviction probability $c > 0$ such that, if the juror presented in $\gamma$ is of type $c$, the plaintiff will challenge that juror. Therefore, $t_P > 0$.

Footnote (ii): By definition of the support, because 0 is the lower bound of the support, $P(C \in [0, \epsilon]) > 0$ for all $\epsilon > 0$. Because $C$ is continuous, there must therefore exist some $\delta > 0$ such that $P(C \in [\delta/2, \delta]) > 0$. We then have $\alpha(c) > 0$ for all $c < \delta$.

Footnote (iii): The latter follows from the fact that, in any subgame, the threshold used by the defendant is always higher than the threshold used by the plaintiff (in equilibrium, the defendant and the plaintiff never both want to challenge the presented juror).

A.2.2 Proof of Proposition 2
Using the same notation as in the proof of Proposition 1, we have
$$T_{RAN}(x; c) \geq P\left[ Bi(n, F(c)) = x \right] * T_{RAN}(x; c \mid x). \tag{6}$$
Note that $T_{RAN}(x; c \mid x)$ is the probability that a Hypergeometric random variable with $x$ successes, $n - x$ failures, and $j$ draws results in the draw of exactly $x$ successes. Therefore, $T_{RAN}(x; c \mid x) > 0$. Finally, combining (6) and (3) yields
$$\lim_{c \to 0} \frac{T_{RAN}(x; c)}{T_{STR}(x; c)} \geq \lim_{c \to 0} \frac{P\left[ Bi(n, F(c)) = x \right] * T_{RAN}(x; c \mid x)}{P\left[ Bi(n, F(c)) > x \right]} = \infty,$$
where the last equality follows from Lemma 1 and the fact that $T_{RAN}(x; c \mid x) > 0$. The result then follows as in the proof of Proposition 1.

Footnote: To apply Lemma 1, note that because $C$ is continuous and the lower bound of the support of $C$ is 0, we have $F(c) > 0$ for all $c > 0$ and $\lim_{c \to 0} F(c) = 0$.

A.3 Section 5: Representation of minorities

A.3.1 Proof of Proposition 3
The structure of the proof is similar to that of the previous propositions. We focus on the case we analyzed in the main paper, where the minority uniformly favors the defendant, i.e., $\lim_{i \to \infty} P(C_a^i > C_b^i) = 0$. The proof for the other case is symmetrical.

For now, consider arbitrary $C_a^i$, $C_b^i$, and $r^i$. Similar to the previous proofs, for any triple $(C_a^i, C_b^i, r^i)$, we first decompose $A^i_{STR}(x)$ and $A^i_{S\&R}(x)$ by conditioning on the number of minority jurors in the panel.

First, consider STR and let us decompose $A^i_{STR}(x)$ conditional, on the one hand, on the panel containing more than $x$ minority jurors, which occurs with probability $P[Bi(n, r^i) > x]$, and on the other, on the panel containing exactly $x$ minority jurors, which occurs with probability $P[Bi(n, r^i) = x]$. In the first case (i.e., more than $x$ minority jurors in the panel), the probability that the panel contains at least $x$ minority jurors is an upper bound on the probability that STR selects them. In the second case (i.e., exactly $x$ minority jurors in the panel), STR selects at least $x$ minority jurors provided that none of the minority jurors in the panel are challenged. This occurs with a probability no larger than the probability that the lowest conviction probability among minorities is larger than the $p$-th conviction probability among majority jurors (since the latter is required for the plaintiff not to challenge any of the minority jurors in the panel). Recall that for any number of draws $w$ and any $k \leq w$, we let $C_g^{k,w}$ denote the $k$-th order statistic out of $w$ draws from group $g \in \{a, b\}$. With this notation, we therefore have
$$A^i_{STR}(x) \leq P[Bi(n, r^i) > x] + P[Bi(n, r^i) = x] * P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right). \tag{7}$$
Note that because $\lim_{i \to \infty} P(C_a^i > C_b^i) = 0$, we have $\lim_{i \to \infty} P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right) = 0$.

Second, consider S&R. Clearly, $A^i_{S\&R}(x)$ is no smaller than the probability for S&R to select at least $x$ minority jurors when there are exactly $x$ minority jurors in the panel. The latter is equal to $P[Bi(n, r^i) = x] * \sigma(x; r^i, C_a^i, C_b^i)$, where $\sigma(x; r^i, C_a^i, C_b^i)$ denotes the probability that S&R selects $x$ minority jurors conditional on having $x$ minority jurors in the panel, as a function of $r^i$, $C_a^i$, and $C_b^i$. In summary, with this notation, we have
$$A^i_{S\&R}(x) \geq P[Bi(n, r^i) = x] * \sigma(x; r^i, C_a^i, C_b^i). \tag{8}$$
We now show that $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$. For all $i \in \mathbb{N}$, let $C^i = r^i C_a^i + (1 - r^i) C_b^i$. Observe that because $\lim_{i \to \infty} r^i = 0$ and because $C_b^i$ converges in distribution to $C_b^*$, $C^i$ converges in distribution to $C_b^*$. By Lemma 2, this implies that for any subgame $\gamma$ of S&R and both $I \in \{D, P\}$, we have $\lim_{i \to \infty} t_I(\gamma, C^i) = t_I(\gamma, C_b^*)$. Note that $t_I(\gamma, C_b^*)$ lies in the interior of the support of $C_b^*$ for both $I \in \{D, P\}$. Also recall that in the limit, the supports of $C_a^i$ and $C_b^i$ do not overlap as we have $P(C_a^* > C_b^*) = 0$. Therefore, in the limit, the defendant never challenges a minority juror, which in turn implies that

(a) as $i$ tends to infinity, the probability that the defendant challenges one of the $x$ minority jurors in the panel tends to zero.

Because $t_I(\gamma, C_b^*)$ lies in the interior of the support of $C_b^*$ for both $I \in \{D, P\}$, there is also a range of conviction probabilities $[\underline{c}, \bar{c}]$ low enough inside the support of $C_b^*$ such that $P(C_b^* \in [\underline{c}, \bar{c}]) > 0$ and $P$ challenges the juror presented in subgame $\gamma$ if her conviction probability lies within $[\underline{c}, \bar{c}]$. Furthermore, the probability that a juror with conviction probability in $[\underline{c}, \bar{c}]$ is a majority juror is strictly positive (and tends to one as $i$ tends to infinity). Overall, in the limit,

(b) the probability that the plaintiff challenges a majority juror presented in subgame $\gamma$ is strictly positive.

Combining (a) and (b), in the limit and given a panel containing $x$ minority jurors, there is a positive probability that $p$ majority jurors are presented first, are all challenged by $P$, and are followed by the $x$ minority jurors, which are left unchallenged by the parties (resulting in a jury composed of at least $x$ minority jurors). That is, $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$.

Combining (7) and (8), we obtain
$$\lim_{i \to \infty} \frac{A^i_{STR}(x)}{A^i_{S\&R}(x)} \leq \lim_{i \to \infty} \frac{P[Bi(n, r^i) > x] + P[Bi(n, r^i) = x] * P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{P[Bi(n, r^i) = x] * \sigma(x; r^i, C_a^i, C_b^i)}$$
$$= \lim_{i \to \infty} \left\{ \frac{P[Bi(n, r^i) > x]}{P[Bi(n, r^i) = x] * \sigma(x; r^i, C_a^i, C_b^i)} + \frac{P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{\sigma(x; r^i, C_a^i, C_b^i)} \right\}$$
$$= \underbrace{\lim_{i \to \infty} \frac{P[Bi(n, r^i) > x]}{P[Bi(n, r^i) = x]}}_{= 0 \text{, by Lemma 1}} * \underbrace{\lim_{i \to \infty} \frac{1}{\sigma(x; r^i, C_a^i, C_b^i)}}_{< \infty \text{, by } \lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0} + \underbrace{\lim_{i \to \infty} \frac{P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right)}{\sigma(x; r^i, C_a^i, C_b^i)}}_{= 0} = 0,$$
where the last term vanishes because $\lim_{i \to \infty} P\left( [C_a^i]^{1,x} > [C_b^i]^{p, n - x} \right) = 0$ and $\lim_{i \to \infty} \sigma(x; r^i, C_a^i, C_b^i) > 0$.

In turn, $\lim_{i \to \infty} A^i_{STR}(x) / A^i_{S\&R}(x) \leq 0$ and the fact that $\lim_{i \to \infty} A^i_{STR}(x) = \lim_{i \to \infty} A^i_{S\&R}(x) = 0$ together imply that there exists some $j$ sufficiently large such that $A^i_{S\&R}(x) > A^i_{STR}(x)$ for all $i > j$.

A.4 Section 2: Changing the number of challenges
A.4.1 Proof of Proposition 4
The structure of the proof is similar to that of the previous propositions. Observe that (3) and (4) are true regardless of the number of challenges awarded to the parties in STR or S&R. That is, by the same arguments as in the proof of Proposition 1, the following two inequalities hold regardless of the values of $w$, $y$, $A_{STR\text{-}w}(x)$, or $A_{S\&R\text{-}y}(x)$,
$$T_{STR\text{-}w}(x; c) = \sum_{k = x + 1}^{n} P\left[ Bi(n, F(c)) = k \right] T_{STR\text{-}w}(x; c \mid k) \leq P\left[ Bi(n, F(c)) > x \right],$$
$$T_{S\&R\text{-}y}(x; c) \geq \sum_{k = x}^{n} P\left[ Bi(n, F(c)) = k \right] * \sigma(c) \geq P\left[ Bi(n, F(c)) = x \right] * \sigma(c). \tag{9}$$
The proposition then follows from the same argument as in the proof of Proposition 1 (in particular, see (5)).

Footnote: Recall that the proposition assumes $w, y \geq 1$.

A.5 Section 8: Extensions: Unbalanced juries and representation of balanced groups

A.5.1 Proof of Proposition 5
The probability that STR selects at least $x$ jurors with conviction probability above the median is the probability that at least $x + d$ of the jurors in the panel have conviction probability above the median (since $d$ of these jurors are challenged by the defendant). Because $d = p$, for any $x \in \{1, \dots, n\}$, we therefore have
$$T_{STR}(x; \mathrm{med}[C]) = P[Bi(j + d + p, 0.5) \geq x + d] = P[Bi(j + 2d, 0.5) \geq x + d].$$
In contrast, we have
$$T_{RAN}(x; \mathrm{med}[C]) = P[Bi(j, 0.5) \geq x].$$
Therefore, by repeated application of Lemma 3, $x > n/2 + 1/2$ implies $T_{STR}(x; \mathrm{med}[C]) > T_{RAN}(x; \mathrm{med}[C])$. Since $n$ is integer-valued, the last inequality corresponds to $x \geq n/2 + 1$ if $n$ is even and $x \geq n/2 + 1.5$ if $n$ is odd.

A.5.2 Proof of Proposition 6

Part (a).
Under STR, since the group distributions do not overlap, each party first uses all of its challenges on one of the two groups before challenging the lowest conviction probability jurors from the other group. For concreteness and without loss of generality, suppose that group a favors the defendant (i.e., $P(C_a > C_b) = 0$). Let $m$ denote the number of jurors from group-a in the panel.

Note that because $r = 0.5$, the probability that $m = k$ is the same as the probability that $m = n - k$ for all $k \in \{0, \dots, \lfloor n/2 \rfloor\}$. Also, because $d = p$, the number of group-a jurors who are selected when $m = k$ is equal to the number of group-b jurors who are selected when $m = n - k$. Therefore, the expected number of group-a jurors in the jury selected by STR is exactly $j/2$.

Footnote: First, suppose that $k \leq p$. Then, if $m = k$, no jurors from group-a (and $j$ jurors from group-b) are selected, whereas if $m = n - k$, no jurors from group-b (and $j$ jurors from group-a) are selected. Second, suppose that $k \in \{p + 1, \dots, \lfloor n/2 \rfloor\}$. Then, if $m = k$, $k - p = k - d$ jurors from group-a (and $j - (k - p) = j - (k - d)$ jurors from group-b) are selected, whereas if $m = n - k$, $k - d = k - p$ jurors from group-b (and $j - (k - d) = j - (k - p)$ jurors from group-a) are selected.

Part (b). The proof is similar to the proof of Part (a). Consider the set of panel configurations $\{a, b\}^n$ where, for example, vector $(a, b, a, \dots, b, b, b) \in \{a, b\}^n$ indicates that the juror with the lowest conviction probability in the panel is a group-a juror, the juror with the second-lowest conviction probability is a group-b juror, the juror with the third-lowest conviction probability is a group-a juror, ..., and the jurors with the three highest conviction probabilities are all group-b jurors. To explain the structure of the proof, suppose that $n$ is even (we explain below how the argument generalizes to any $n$). We first construct a partition of $\{a, b\}^n$ into two subsets $S_a$ and $S_b$ of equal size and construct a bijection $q$ between $S_a$ and $S_b$. We then show that for every panel configuration $l \in S_a$ which results in $m_l$ group-a jurors being selected, (a) the panel configuration $q[l]$ results in $j - m_l$ group-a jurors being selected, and (b) panel configurations $l$ and $q[l]$ are equally likely. As in the proof of Part (a), the result then follows directly.

Similar to the proof of Part (a), the bijection $q[l]$ is obtained by (i) mirroring $l$ around the $\lfloor n/2 \rfloor$ position, and (ii) inverting the group of each juror in the resulting panel configuration. For example, panel configuration $q[(a, a, b, a)]$ is obtained by mirroring $(a, a, b, a)$ around position $\lfloor n/2 \rfloor$, which results in $(a, b, a, a)$, and then inverting the group of each juror in $(a, b, a, a)$, which results in $(b, a, b, b)$. Formally, if $inv[l]$ denotes the configuration that results from turning all the a's in $l$ into b's and all the b's in $l$ into a's, then $q[(l_1, l_2, \dots, l_{n-1}, l_n)] = inv[(l_n, l_{n-1}, \dots, l_2, l_1)]$.

Let $S_a$ and $S_b$ be two sets that together contain all $l$ for which $l \neq q[l]$ and are such that $l \in S_i$ implies $q[l] \notin S_i$. Since $q[q[l]] = l$, the sets $S_a$ and $S_b$ have equal sizes. Also let $S^*$ contain all $l$ for which $l = q[l]$, if any ($S^* \neq \emptyset$ if and only if $n$ is even). Note that $\{S_a, S_b, S^*\}$ forms a partition of $\{a, b\}^n$. Therefore, if we let $(m \mid l)$ denote the number of group-a jurors that are selected conditional on configuration $l$ and $P(l)$ the probability of configuration $l$, we have
$$r_{STR} = \sum_{l \in S_a} \left[ P(l) * (m \mid l) + P(q[l]) * (m \mid q[l]) \right] + \sum_{l \in S^*} P(l) * (m \mid l).$$
Part (b) then follows from the fact that (A) $P(l) = P(q[l])$ for all $l \in S_a$, (B) $(m \mid l) = j - (m \mid q[l])$ for all $l \in S_a$, and (C) $(m \mid l) = j/2$ for all $l \in S^*$.

Properties (B) and (C) follow directly from the construction of $q$ and the fact that $d = p$. Property (A), on the other hand, follows from Lemma 4, which establishes the symmetry of order statistics for symmetric distributions. A formal proof of (A) using Lemma 4 requires heavy and tedious notation. Instead, we show how (A) follows from Lemma 4 in a simple example that clarifies how the argument generalizes to other cases.

Consider the case of $(a, a, b)$ for which $q[(a, a, b)] = (a, b, b)$. We can obtain the probability of any configuration by integrating the p.d.f. of the appropriate order statistics from the bottom to the top of $[0, 1]$:
$$P[(a, a, b)] = P[m = 2] * P[(a, a, b) \mid m = 2] = P[Bi(3, 0.5) = 2] * \int_0^1 f_a^{1,2}(x) \left[ \int_x^1 f_a^{2,2}(y) \left( \int_y^1 f_b^{1,1}(w) \, dw \right) dy \right] dx. \tag{10}$$
We can also obtain the probability of any configuration by reversing the list of order statistics and integrating from the top to the bottom of $[0, 1]$:
$$P[(a, b, b)] = P[m = 1] * P[(a, b, b) \mid m = 1] = P[Bi(3, 0.5) = 1] * \int_0^1 f_b^{2,2}(1 - x) \left[ \int_x^1 f_b^{1,2}(1 - y) \left( \int_y^1 f_a^{1,1}(1 - w) \, dw \right) dy \right] dx. \tag{11}$$
Finally, by Lemma 4, $f_a^{1,2}(x) = f_b^{2,2}(1 - x)$, $f_a^{2,2}(y) = f_b^{1,2}(1 - y)$, and $f_b^{1,1}(w) = f_a^{1,1}(1 - w)$, which together with the symmetry of the binomial with 0.5 probability of success implies that the expressions in (10) and (11) are equal.

External Appendix: Additional simulations
B.1 Excluding extremes, uniform distribution of conviction probabilities
Figure B.1: Fraction of juries with at least one extreme juror
[Figure: fraction of juries (vertical axis) plotted against c (horizontal axis) for RAN, S&R, and STR]
Note: Results from 50,000 simulations of jury selections with parameters $j = 12$, $d = p = 6$, and $C \sim U[0, 1]$.

B.2 Minority representation when minorities favor conviction

Table B.1: Representation of Group-a jurors in the effective jury when Group-a is a minority of the jury pool
Polarization                          Extreme        Moderate       Mild           (All)
Procedure                             S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities        0.12   0.08    0.18   0.16    0.23   0.23    0.25
Standard deviation                    0.11   0.11    0.12   0.12    0.12   0.12    0.12
Fraction of juries with at least 1    0.76   0.45    0.89   0.85    0.96   0.95    0.97
(a) Group-a represents 25% of the jury pool
Polarization                          Extreme        Moderate       Mild           (All)
Procedure                             S&R    STR     S&R    STR     S&R    STR     RAN
Average fraction of minorities        0.01   0.00    0.05   0.04    0.09   0.08    0.10
Standard deviation                    0.03   0.02    0.06   0.06    0.08   0.08    0.09
Fraction of juries with at least 1    0.09   0.02    0.44   0.38    0.66   0.64    0.72
(b) Group-a represents 10% of the jury pool
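The RAN and STR columns of Table B.1 can be approximated with a short simulation. The sketch below is not the authors' code and rests on assumptions that this section does not state explicitly: the panel has $n = j + d + p$ jurors; under STR the defendant strikes the $d$ jurors with the highest conviction probability and the prosecution strikes the $p$ jurors with the lowest, so the middle $j$ jurors are seated; under RAN the jury is a uniform random draw of $j$ panel members; and the minority group (group $a$) is the one drawn from the conviction-leaning distribution. All function names are hypothetical, and S&R is omitted because it requires solving the sequential game.

```python
import random


def draw_panel(n, share_a, beta_a, beta_b, rng):
    """Draw n jurors as (conviction probability, group) pairs."""
    panel = []
    for _ in range(n):
        group = 'a' if rng.random() < share_a else 'b'
        alpha, beta = beta_a if group == 'a' else beta_b
        panel.append((rng.betavariate(alpha, beta), group))
    return panel


def jury_ran(panel, j, rng):
    """RAN (assumed): seat j panel members uniformly at random."""
    return rng.sample(panel, j)


def jury_str(panel, d, p):
    """STR (assumed): drop the p lowest and d highest conviction probabilities."""
    ranked = sorted(panel)
    return ranked[p:len(ranked) - d]


def simulate(share_a, beta_a, beta_b, j=12, d=6, p=6, sims=5000, seed=0):
    """Average fraction of group-a jurors under RAN and STR."""
    rng = random.Random(seed)
    n = j + d + p  # assumed panel size
    totals = {'RAN': 0.0, 'STR': 0.0}
    for _ in range(sims):
        panel = draw_panel(n, share_a, beta_a, beta_b, rng)
        juries = {'RAN': jury_ran(panel, j, rng), 'STR': jury_str(panel, d, p)}
        for name, jury in juries.items():
            totals[name] += sum(group == 'a' for _, group in jury) / j
    return {name: total / sims for name, total in totals.items()}


# Extreme polarization with a 25% minority that leans toward conviction
# (assumed: group a ~ Beta(5, 1), group b ~ Beta(1, 5)).
print(simulate(0.25, (5, 1), (1, 5)))
```

Under these assumptions, the RAN average should sit close to the minority's pool share, while STR should seat noticeably fewer minority jurors when polarization is extreme, which is the qualitative pattern in the table.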
Note: The rows report the average fraction and standard deviation of group-$a$ jury members, and the fraction of juries with at least one group-$a$ juror, out of 50,000 simulations of jury selection with parameters $j = 12$ and $d = p = 6$. Conviction probabilities are drawn from $Beta(1, 5)$ and $Beta(5, 1)$ (Extreme), from $Beta(2, 4)$ and $Beta(4, 2)$ (Moderate), and from $Beta(3, 4)$ and $Beta(4, 3)$ (Mild); see Figure 3 for the shape of these distributions.

B.3 Excluding unbalanced juries, simulations from mild and moderate polarization

Figure B.2: Probability of selecting jurors below the median, difference with RAN
[Figure: two panels, (a) Moderate polarization and (b) Mild polarization; horizontal axis: number of jurors; vertical axis: probability (difference with RAN); series: STR, and S&R for three values of r]
Note:
The chart displays the probability of selecting a number of jurors with $c_i$ below the median under STR (green dashed line) and S&R (orange lines) relative to the same probability under RAN, i.e., $T_M(x; med[C]) - T_{RAN}(x; med[C])$. The model parameters are $j = 12$, $d = p = 6$, and $C \sim r \cdot Beta(2, 4) + (1 - r) \cdot Beta(4, 2)$ for Panel (a) and $C \sim r \cdot Beta(3, 4) + (1 - r) \cdot Beta(4, 3)$ for Panel (b), with r = { . , . , . }. Values for S&R are the results from 50,000 simulations of jury selection, whereas values for RAN and
STR are computed analytically and are independent of $r$ (see Footnote 36).