Crime Aggregation, Deterrence, and Witness Credibility

Abstract

We present a model for the equilibrium frequency of offenses and the informativeness of witness reports when potential offenders can commit multiple offenses and witnesses are subject to retaliation risk and idiosyncratic reporting preferences. We compare two ways of handling multiple accusations discussed in legal scholarship: (i) When convictions are based on the probability that the defendant committed at least one, unspecified offense and entail a severe punishment, potential offenders induce negative correlation in witnesses' private information, which leads to uninformative reports, information aggregation failures, and frequent offenses in equilibrium. Moreover, lowering the punishment in case of conviction can improve deterrence and the informativeness of witnesses' reports. (ii) When accusations are treated separately to adjudicate guilt and conviction entails a severe punishment, witness reports are highly informative and offenses are infrequent in equilibrium.


Harry Pei, Bruno Strulovici

September 15, 2020


Keywords: soft evidence, deterrence, negative correlation, coordination.

JEL Codes: D82, D83, K42.

∗ Preliminary versions of this project have been circulated under various titles. We thank S. Nageeb Ali, Bocar Ba, Sandeep Baliga, Arjada Bardhi, Laura Doval, Mehmet Ekmekci, Alex Frankel, Chishio Furukawa, George Georgiadis, Bob Gibbons, Andrei Gomberg, Yingni Guo, Andreas Kleiner, Anton Kolotilin, Frances Xu Lee, Annie Liang, Matt Notowidigdo, Wojciech Olszewski, Alessandro Pavan, Joyce Sadka, Larry Samuelson, Ron Siegel, Vasiliki Skreta, Juuso Toikka, Rakesh Vohra, Alex White, Alex Wolitzky, Boli Xu, and our seminar audiences for helpful comments. Strulovici acknowledges financial support from the National Science Foundation (NSF Grant No. 1151410). Pei: Department of Economics, Northwestern University. Strulovici: Department of Economics, Northwestern University.

When a defendant faces multiple charges, the legal norm is to consider these charges separately and to convict the defendant if there are specific charges whose corresponding evidence meets the appropriate standard of proof. While this separation of charges is standard, its desirability for deterrence and fairness is by no means obvious. Consider a defendant who may have committed two offenses with probability 0.8 each, independent of each other. If the conviction threshold for each offense is set above 0.8, the defendant is acquitted on both counts, even though the probability that he is guilty of at least one offense is $1 - 0.2 \times 0.2 = 0.96$. By contrast, a defendant accused of a single offense may be convicted even if his probability of guilt only just exceeds that threshold, and is thus lower than the first defendant’s. This issue is most salient when defendants face multiple accusations that are hard to prove beyond a reasonable doubt, such as abuses of power, extortions, and sexual assaults. In such cases, evidence often relies on witness testimonies and may thus be affected by witnesses’ incentives to tell the truth.

Legal scholarship has explored the possibility of aggregating charges into an overall probability of guilt instead of treating charges separately (see Cohen 1977, Bar Hillel 1984, Robertson and Vignaux 1993). In particular, Harel and Porat (2009) define the Aggregate Probabilities Principle (“APP”) as follows: a defendant is convicted if the probability that he committed at least one unspecified offense exceeds a given threshold. They compare APP to the Distinct Probabilities Principle (“DPP”), which requires that the principal be convicted if there is at least one specific offense for which the probability that the principal committed this offense exceeds some exogenous threshold. Harel and Porat argue that APP can reduce adjudication errors, improve deterrence, and reduce the cost of enforcement, and advocate using APP to varying degrees in both civil and criminal instances of the law.

While the object of these studies is of clear importance, their arguments rely on the assumptions that the distribution of the defendant’s guilt across charges is exogenous, the quality of evidence across charges is exogenous, and guilt is independently distributed across charges. This approach ignores any strategic consideration in the behaviors of potential offenders and witnesses. In particular, it cannot speak to how the introduction of APP may affect the incentives of potential offenders and the informativeness of witness testimonies.

This paper compares the effects of APP and DPP from a strategic perspective. In our model, the probability of committing offenses and the informativeness of accusations (or absence thereof) are endogenous. Since potential offenders choose their actions strategically, distinct offenses need not be independently distributed. In fact, we show that APP can induce potential offenders to introduce negative correlation in witnesses’ private information and severely undermine the informativeness of witnesses’ reports in equilibrium. This negative correlation violates the independence assumption used in the legal literature to analyze APP and undermines its conclusions.

In our model, a potential offender, the principal, has several opportunities to commit offenses, each of which is associated with a distinct witness, or agent, who observes whether the corresponding offense takes place.
For example, the principal may be an employer with multiple opportunities of violating the law and agents may be employees in a position to witness these violations, as victims or potential whistleblowers. Our model encompasses various types of reportable misbehavior, from the most serious crimes to trivial violations of laws and social norms.

Agents simultaneously decide whether to accuse the principal on the basis of three considerations: (i) a preference for punishing offenses, (ii) a risk of facing retaliation or social stigma, which is higher when accusations fail to get the principal convicted, and (iii) some (possibly small) idiosyncratic private benefits or costs of getting the principal convicted, which are independent of whether offenses have taken place.

A Bayesian judge then observes agents’ reports (accusations, or absence thereof) and decides whether to convict or acquit the principal. When contemplating the commission of offenses, the principal trades off the utility from committing offenses with the expected cost of punishment from conviction.

Footnote: In our model, some abuses go unreported and some charges of abuse are not deemed credible enough to lead to a conviction. Both features are consistent with the empirical evidence on abuses and reports of abuse. These patterns are documented in a study of harassment in the U.S. military by the RAND Corporation (2018) and studies of police brutality or inaction by Ba (2018) and Ba and Rivera (2019) using data from the city of Chicago. Similar patterns arise in a 2016 survey conducted by the USMSPB, which concluded that 21% of women and 8.7% of men experienced at least one of 12 categorized behaviors of sexual harassment, of which only a small fraction was followed by charges. According to data released by USMSPB, among the harassment charges filed in 2017, only 16% led to “merit resolutions,” i.e., to outcomes favorable to the charging parties.
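To make the contrast between the two criteria concrete, the following minimal sketch evaluates both on the two-offense example that opens the introduction. It is only an illustration under stated assumptions: the function names are hypothetical, the judge's beliefs are given as an explicit joint distribution over offense vectors, the threshold of 0.9 is an assumed value chosen to lie between 0.8 and 0.96, and ties at the threshold are resolved as acquittals for simplicity.

```python
def prob_any_offense(joint):
    """Probability that at least one offense was committed.

    `joint` maps offense vectors (tuples of 0/1) to probabilities.
    """
    return sum(p for theta, p in joint.items() if any(theta))


def offense_marginals(joint):
    """Probability of each specific offense, one entry per offense."""
    n = len(next(iter(joint)))
    return [sum(p for theta, p in joint.items() if theta[i] == 1)
            for i in range(n)]


def app_convicts(joint, threshold):
    """Aggregate criterion: compare Pr(at least one offense) to the threshold."""
    return prob_any_offense(joint) > threshold


def dpp_convicts(joint, threshold):
    """Distinct criterion: compare the largest per-offense probability."""
    return max(offense_marginals(joint)) > threshold


def independent_joint(p1, p2):
    """Joint distribution of two independent offenses with probabilities p1, p2."""
    return {(t1, t2): (p1 if t1 else 1 - p1) * (p2 if t2 else 1 - p2)
            for t1 in (0, 1) for t2 in (0, 1)}


beliefs = independent_joint(0.8, 0.8)
THRESHOLD = 0.9  # assumed for illustration; not a value taken from the paper

print(round(prob_any_offense(beliefs), 2))   # 0.96
print(app_convicts(beliefs, THRESHOLD))      # True
print(dpp_convicts(beliefs, THRESHOLD))      # False
```

Any threshold strictly between 0.8 and 0.96 produces the same split verdict, which is the sense in which the two criteria disagree in that example.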

We compare the principal’s incentive to commit offenses and the informativeness of witness reports when either APP or DPP is used to adjudicate guilt. APP and DPP are identical when there is only one offense under consideration, but differ when the principal may have committed two or more offenses. For example, a defendant who is surely guilty of exactly one of three accusations, uniformly across accusations, will surely be convicted if APP is the criterion used for conviction, no matter what standard of proof is applied, but will be acquitted under DPP even if the conviction standard is as weak as the 50% threshold used for preponderance of evidence. Understanding the comparative benefits of these criteria is relevant not only for criminal law and civil law but also for corporate decisions such as whether to fire an employee facing multiple allegations of misconduct.

To highlight the main forces at play, we start in Section 3 by comparing APP and DPP when (i) the principal can commit at most two offenses and (ii) conviction entails a large punishment relative to the principal’s gain from the commission of these offenses. Theorem 1 shows that when APP is used as the adjudication criterion, the principal commits at most one offense in every equilibrium. This strategic restraint induces negative correlation in agents’ private information: when an agent observes an offense, he believes that the other agent is unlikely to observe one. This negative correlation exposes agents to the risk of contradicting each other and to retaliation. In equilibrium, this reduces the informativeness of witness reports and, despite the threat of a large punishment in case of conviction, prompts the principal to commit offenses with high probability.

When DPP is used, by contrast, the principal’s decisions to commit distinct offenses are independently distributed in equilibrium and, hence, so are witnesses’ private observations (Theorem 2).
As the punishment in case of conviction becomes arbitrarily large, agents’ accusations become arbitrarily informative and the probability of offense converges to zero.

The logic of Theorem 1 can be described more explicitly in two steps. First, when conviction entails a large enough punishment, the principal is convicted in equilibrium only if both agents accuse him (Lemma 3.1). Intuitively, suppose that one accusation sufficed to convict the principal with positive probability. Since each accusation strictly increases the probability that the principal is guilty of at least one offense, a Bayesian judge would then surely convict the principal when both agents accuse him. If conviction entails a large punishment, this would give the principal a strict incentive not to commit any offense. This, in turn, would prompt a Bayesian judge never to convict the principal and lead to a contradiction (Lemma 2.1).

This first step implies that the principal’s decisions to commit offenses are strategic substitutes, because he goes unpunished when he faces only one accusation but is convicted with positive probability when facing two accusations (Lemma 3.2). It also implies that agents’ decisions to accuse the principal are strategic complements, because an agent who accuses the principal is more likely to convict him if the other agent also accuses. These [...] negative correlation in agents’ private information. Witnessing an offense reduces the probability that the other agent also witnessed an offense and, due to agents’ reporting complementarity, weakens the incentive to accuse the principal.

Footnote: In the benchmark model with only one offense and one witness, APP and DPP coincide. Proposition 1 shows that the agent’s report becomes arbitrarily informative and the probability of offense vanishes as the punishment to the convicted principal increases.
Likewise, not witnessing an offense leads to a higher probability that the other agent witnessed an offense and raises the incentive to accuse the principal. Agents’ incentives are distorted in a way that reduces the informativeness of their reports. This, in turn, increases the equilibrium probability of offenses because reports are less responsive to the commission of offenses.

By contrast, DPP preserves the informativeness of witnesses’ reports by disentangling a potential offender’s incentives to commit distinct offenses and, hence, by removing any correlation in witnesses’ private information. Unlike the probability that the principal is guilty of at least one offense, the probability that he is guilty of a specific offense need not increase, other things equal, when a larger set of agents accuse him of other offenses. With DPP, the probability that the principal is convicted is linear in the number of accusations and the principal’s decisions to commit distinct offenses are neither complements nor substitutes. This implies that the principal’s incentives to commit distinct offenses operate independently of one another and that agents’ private observations are uncorrelated. As a result, agents’ coordination motive no longer undermines the informativeness of their reports.
Finally, these linear conviction probabilities are consistent with the DPP criterion, because the judge’s belief concerning the occurrence of each offense reaches the conviction threshold if and only if the agent who can observe this offense accuses the principal, and is unaffected by the reports of other agents.

In practice, one rationale for aggregating different charges against a defendant is that these charges may shed light on the defendant’s propensity to commit offenses, i.e., on the defendant’s type, or “character.” In Section 4, we account for this potential heterogeneity by allowing the principal to be either a virtuous type, whose benefit from committing an offense is either zero or negative, or an opportunistic type, whose benefit from committing an offense is strictly positive. The principal’s type is unobserved by other parties.

The insights of Theorems 1 and 2 extend to this setting. When conviction entails a sufficiently large punishment and the judge uses APP, two reports are required to convict the principal in every equilibrium. In contrast to the baseline model, an opportunistic principal may now commit two offenses with strictly positive probability. However, agents’ private observations remain negatively correlated. As in the baseline model, this negative correlation undermines the informativeness of agents’ reports and leads to a high probability of offense in equilibrium.

By contrast, when DPP is used as the conviction criterion, the offenses witnessed by the agents are uncorrelated and the probability with which the principal is convicted is a linear function of the number of accusations.
As the punishment in case of conviction becomes arbitrarily large (relative to an opportunistic type’s benefit from committing an offense), agents’ reports become arbitrarily informative and offenses become arbitrarily infrequent.

In Section 5, we explore the effects of APP and DPP when the principal can commit more than two offenses and, correspondingly, face accusations by more than two witnesses. When APP is used and conviction entails a large enough punishment, the informativeness of each agent’s report becomes so weak that conviction occurs only when all agents accuse the principal. Theorem 3 shows that, as the number of agents increases, the aggregate informativeness of all reports pooled together decreases even though there are more reports available, and the frequency of offenses increases in equilibrium. However, the equilibrium probability that an agent accuses the principal is increasing in the number of agents. This distinguishes our result from theories of public good provision, in which each agent contributes less as the number of agents increases.

In summary, our results provide a rationale for using DPP rather than APP in judicial settings. Our analysis is also relevant for firms and organizations considering decisions such as whether to fire an employee or to sanction some of their members. In these cases, DPP raises concerns for fairness and justice and APP can be justified ex post. Schauer and Zeckhauser (1996) observe that “... although sound reasons for the criminal law’s refusal to cumulate multiple low-probability accusations exist, the reasons for such refusal are often inapt in other settings. Taking adverse decisions based on cumulating multiple low-probability charges is often justifiable both morally and mathematically.” However, we show that enshrining APP in corporate decisions can have an undesirable effect on ex ante incentives for both potential offenders and potential witnesses.
Our findings echo some critiques of procedures that link accusations across potential victims.

Motivated by applications in which the use of APP is unavoidable and the use of DPP is inconvenient, we show in Proposition 3 that mitigating punishment to the convicted principal can improve deterrence. More precisely, there exist intermediate punishment levels for which the probability of conviction is concave in the number of accusations, and a single accusation suffices to convict the principal with positive probability. The principal’s decisions to commit offenses are now strategic complements, which induces positive correlation in witnesses’ private information. By a reverse logic to the high-punishment case, this increases the informativeness of witness reports and lowers the frequency of offenses. When the principal does commit at least one offense, he may now commit multiple offenses. In equilibrium, the frequency of guilty principals is lower but the severity of their actions is higher than in the high-punishment case. Our result thus points to a tradeoff between reducing the fraction of offenders and reducing the severity of offenders’ actions.

While we did not specify any social welfare function, our analysis unveils several tradeoffs between improving the quality of judicial decisions and deterrence, between ex ante incentives and ex post fairness, and between the frequency and the severity of committed offenses. We summarize these lessons in Section 6, which are useful [...]

Footnote: For example, Keith Hiatt, director of the Technology Program at the Human Rights Center at UC Berkeley School of Law, noted concerning the multiple-accusations approach taken by the online platform Callisto that “it may also codify an entrenched attitude that women need to have corroborating evidence to be believed.” New York Times, “The War on Campus Sexual Assault Goes Digital”.
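The negative correlation at the heart of Theorem 1 can be illustrated with a small numerical sketch. The strategy below, in which the principal commits at most one of two offenses, uses made-up probabilities and is not an equilibrium computed from the model; the function name and all numbers are assumptions chosen only to show how an agent's own observation shifts his belief about what the other agent observed.

```python
def belief_about_other(joint, i, j, observed):
    """Pr(theta_j = 1 | theta_i = observed) under the joint distribution `joint`.

    `joint` maps offense vectors (tuples of 0/1) to probabilities.
    """
    consistent = {theta: p for theta, p in joint.items() if theta[i] == observed}
    total = sum(consistent.values())
    return sum(p for theta, p in consistent.items() if theta[j] == 1) / total


# Hypothetical strategy with strategic restraint: never two offenses at once.
at_most_one = {(1, 0): 0.4, (0, 1): 0.4, (0, 0): 0.2}

# Hypothetical benchmark: two independent offenses, each with probability 0.8.
independent = {(1, 1): 0.64, (1, 0): 0.16, (0, 1): 0.16, (0, 0): 0.04}

for name, joint in [("at most one", at_most_one), ("independent", independent)]:
    saw_offense = belief_about_other(joint, 0, 1, observed=1)
    saw_nothing = belief_about_other(joint, 0, 1, observed=0)
    print(f"{name}: Pr(other saw offense | I saw one) = {saw_offense:.3f}, "
          f"Pr(other saw offense | I saw none) = {saw_nothing:.3f}")
```

Under the restrained strategy, an agent who witnesses an offense is certain the other agent did not, while an agent who witnesses nothing assigns the other agent's offense a probability of two thirds. Under independence the two conditional beliefs coincide, so the agent's own observation carries no information about the other's.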

Related Literature:

Models of information aggregation and strategic communication often assume that the fact of interest is exogenously generated. In applications to civil and criminal law, however, facts are generated by individuals whose incentives interact with how information is aggregated, communicated, and ultimately incorporated into judicial decisions. By endogenizing agents’ private information and incentives through the strategic commission of offenses, this paper provides a new perspective on information aggregation and strategic communication, which emphasizes the interaction between the “state of the world” and the choices made at later stages to aggregate and use information, and illustrates the significant, complex, and perhaps unintended, consequences of this interaction.

First, our results provide a novel explanation for information aggregation failures. In Scharfstein and Stein (1990), Banerjee (1992), Bikhchandani et al. (1992), Ottaviani and Sørensen (2000), and Smith and Sørensen (2000), agents fail to act on their private information because they can observe informative actions taken by their predecessors. Here, by contrast, agents move simultaneously and information aggregation fails because of the negative correlation in agents’ private information, combined with agents’ incentives to coordinate their reports.

Our paper contributes to the literature on voting by studying a game in which both the voting rule and the correlation between agents’ private information are endogenous. This stands in contrast to existing works in this literature in which at least one of these two ingredients is exogenous.
This includes voting models with endogenous information acquisition (Persico 2004), voting with negatively correlated private information or payoffs (Schmitz and Tröger 2012 and Ali, Mihm, and Siga 2018), and dynamic voting models in which information acquisition may induce negative correlation in voters’ continuation values (Strulovici 2010).

Second, our paper contributes to the literature on strategic information transmission with multiple senders (Battaglini 2002, 2017, Ambrus and Takahashi 2008, and Ekmekci and Lauermann 2019). In contrast to these works, senders in our model communicate information about the principal’s actions and the correlation between [...]

Our results justify a key feature of criminal justice systems, which is to treat distinct accusations separately in conviction decisions. Our finding (Proposition 3) that a lower punishment can reduce the probability of offense stands in contrast to Becker’s (1968) well-known observation that maximal punishments save on law-enforcement costs.

Lee and Suen (2020) study the timing of reports by victims and libelers when a criminal commits offenses against two agents with exogenous probability. They provide an explanation for the well-documented fact that victims sometimes delay their accusations. Their analysis and ours consider complementary aspects of witnesses’ reporting incentives.

Footnote: Information aggregation can also fail due to individual biases (Morgan and Stocken 2008) and voters using pivotal reasoning (Austen-Smith and Banks 1996, Bhattacharya 2013).

Footnote: Strulovici (2020) studies a sequential learning model in which an agent is less likely to have an informative signal, other things equal, if another agent has found such a signal. This creates negative correlation in the informativeness of agents’ signals, rather than the direction of these signals, which hampers social learning.
Cheng and Hsiaw (2020) adopt a global game perspective to study the reporting incentives of a continuum of agents who observe conditionally independent signals of the state of the world. Naess (2020) also considers reporting incentives and, among other results, finds that making reporting costly may improve social welfare. The principal’s strategic restraint that emerges endogenously in our model and the negative correlation that it induces on the agents’ private information are distinctive features of our analysis.

Overview:

We consider a game between a potential offender (the “principal”), $n$ potential witnesses or victims (the “agents”), and a Bayesian judge, which unfolds in three stages.

In the first stage, the principal privately observes his benefit from committing offenses and then chooses which offenses to commit within a given set of opportunities. For tractability, our analysis considers two principal types: a “virtuous” type who has no benefit from committing the offense and an “opportunistic” type who has a strictly positive benefit per committed offense. Our results are first presented when the principal is opportunistic with probability 1 (Section 3). This allows us to describe as simply as possible the forces at play and to show that our results do not rely on the presence of type heterogeneity. We then extend our analysis to allow for both principal [...]

[...] aggregate probabilities principle (“APP”, defined in (2.4)) and the distinct probabilities principle (“DPP”, defined in (2.5)). In the baseline model, we consider a binary punishment decision: either the principal is punished (which may be interpreted as losing his job or reputation, or receiving a harsh sentence), or he is not. In Section 7, we discuss an extension in which the punishment to a convicted principal is higher if he is believed to have committed multiple offenses. As explained in that section, this possibility reinforces our main insights.

The principal trades off his (possibly null) benefit from committing offenses with the risk of punishment. Agents choose their reports based on the following considerations: (i) a preference for convicting a guilty principal, (ii) a cost of retaliation or social stigma if they accuse the principal, which is strictly higher if the principal is acquitted than if he is convicted, (iii) an idiosyncratic taste shock (e.g., affinity or grudge) for seeing the principal acquitted or convicted, and, in an extension, (iv) a desire for telling the truth irrespective of the adjudication outcome. For tractability, the components of the principal’s and the agents’ payoffs are assumed to be additively separable. To guarantee that all report profiles may occur in equilibrium, we also include an infinitesimal fraction of behavioral agents, who with some probability always accuse the principal and with the complement probability always abstain from accusing him. We do not explicitly microfound the judge’s preference. However, both adjudication rules maximize some quadratic payoff functions that we could stipulate. The conviction threshold is taken as exogenous and could also easily be microfounded as a tradeoff between type I and type II errors.

Footnote: Silva (2019) studies a model with multiple suspects and constructs a mechanism that elicits truthful confessions among suspects. In Baliga, Bueno de Mesquita and Wolitzky (2020), only one of the potential assailants has an opportunity to commit an offense. In both papers, the negative correlation in whether the suspects are guilty or innocent is exogenous.

Footnote: Stigler (1970) observes that several punishment levels should be used when criminals can choose between different levels of crime. The rationale there is to provide marginal incentives not to commit the worst crimes. In this scenario, applying the maximal punishment to the worst crimes remains optimal from the perspective of deterring crimes.

Footnote: One could consider more types; for instance, some ethical principals could have a negative benefit from committing the offense. In our model they would behave identically to the virtuous type.

Footnote: This also addresses an ideological and legal concern about assuming heterogeneous defendant propensities to commit offenses. With heterogeneous types, accusing someone of an offense affects the belief about the person’s type or “character” and, consequently, the probability of guilt regarding other accusations. This may conflict with rules that forbid the use of “character evidence”, such as Federal Rule of Evidence 404 in the United States.

We consider a three-stage game between a principal, $n$ agents, and a judge. In stage 1, the principal privately observes his type $t \in \{t_v, t_o\}$, where $t = t_v$ means that the principal is virtuous and $t = t_o$ means that he is opportunistic. The principal then chooses an $n$-dimensional vector $\theta \equiv (\theta_1, \ldots, \theta_n) \in \{0,1\}^n$, where $\theta_i = 1$ ($\theta_i = 0$) means that the principal commits (does not commit) an offense witnessed by agent $i$.

In stage 2, each agent $i \in \{1, 2, \ldots, n\}$ privately observes $\theta_i$, the realization of a payoff shock $\omega_i \in \mathbb{R}$, and whether he is strategic or behavioral. Each agent $i$ then chooses between accusing the principal ($a_i = 1$) or not ($a_i = 0$). If agent $i$ is strategic, he chooses $a_i$ to maximize his expected payoff. If he is behavioral, with probability $\alpha \in (0,1)$ he always accuses the principal and with probability $1 - \alpha$ he never accuses him.

In stage 3, the judge observes $a \equiv (a_1, \ldots, a_n) \in \{0,1\}^n$ and makes a decision $s \in \{0,1\}$, where $s = 1$ stands for convicting the principal and $s = 0$ stands for acquitting him.

Type Distributions:

The random variables $t$, $\{\omega_i\}_{i=1}^n$, and agents’ types (strategic vs. behavioral) are independently distributed. We let $\pi_o \in (0,1)$ denote the probability that the principal is opportunistic. The variables $\omega_i$ for $i = 1, \ldots, n$ are drawn from a normal distribution with mean $\mu$ and variance $\sigma^2$. We let $\Phi(\cdot)$ and $\phi(\cdot)$ denote their cdf and pdf. Each agent is strategic with probability $\delta \in (0,1)$. We are interested in asymptotic results as $\delta$ goes to 1, which means that the fraction of behavioral agents is arbitrarily small.

Payoffs:

The principal’s realized payoff is

$$\mathbf{1}\{t = t_o\} \cdot \sum_{i=1}^{n} \theta_i - sL, \tag{2.1}$$

which means that a virtuous principal does not benefit from committing offenses, and the opportunistic type trades off the benefit (normalized to 1) from committing each offense with the cost of being convicted $L > 0$.

If agent $i$ is strategic, his realized payoff is

$$u_i(\omega_i, \theta_i, a_i) \equiv \begin{cases} 0 & \text{if } s = 1, \\ \omega_i - b\Big((1-\gamma)\theta_i + \gamma f(\theta_i, \theta_{-i})\Big) - c\,a_i & \text{if } s = 0, \end{cases} \tag{2.2}$$

where $b > 0$, $c > 0$, and $\gamma \in [0,1]$ are parameters, and $f(\theta_i, \theta_{-i})$ is strictly increasing in both arguments.

This specification normalizes to 0 agents’ utility when the principal is convicted. This is without loss of generality for the purpose of analyzing agents’ incentives, since an agent’s reporting decision depends only on the difference between his utility when the principal is convicted and his utility when the principal is acquitted. The parameter $b > 0$ captures each agent’s preference for punishing offenders; $c > 0$ captures the retaliation or [...] and $\gamma \in [0,1]$ captures the intensity of an agent’s social preference, i.e., the extent to which his payoff internalizes offenses committed against (or witnessed by) other agents. Finally, $\omega_i$ is an idiosyncratic shock that captures agent $i$’s preference for acquitting the principal.

Footnote: The assumption that each agent may be behavioral, even with an infinitesimally small probability, guarantees that all reporting profiles are on path and refines away unreasonable equilibria in which the principal commits offenses against all agents with probability 1 and is surely convicted even when no agent accuses him. Our results hold for more general specifications of behavioral agents’ strategies, in which agents’ reports depend on their observation of $\theta_i$.

Footnote: This formulation does rule out the possibility that agents have a preference for reporting the truth irrespective of the adjudication outcome. We study this extension in Section 7 and generalize our main results.
A lower $\omega_i$ means that agent $i$ has a stronger incentive to accuse the principal.

We compare equilibrium outcomes when the judge uses either APP or DPP to adjudicate guilt. These rules differ only in terms of how to measure the probability of guilt when the principal is capable of committing multiple offenses. Both rules may be microfounded by assigning some appropriate quadratic payoff function to the judge. We let $\pi^* \in (0,1)$ denote the standard of proof used by the judge to adjudicate guilt. We assume that $\pi^*$ is exogenous although it, too, could be microfounded by specifying costs for making type I and type II errors. Let

$$\overline{\theta} \equiv \max_{i \in \{1, 2, \ldots, n\}} \theta_i. \tag{2.3}$$

The principal has committed at least one offense if $\overline{\theta} = 1$, and no offense if $\overline{\theta} = 0$.

The first adjudication rule, APP, requires that

$$s \begin{cases} = 1 & \text{if } \Pr\big(\overline{\theta} = 1 \,\big|\, a\big) > \pi^* \\ \in \{0,1\} & \text{if } \Pr\big(\overline{\theta} = 1 \,\big|\, a\big) = \pi^* \\ = 0 & \text{if } \Pr\big(\overline{\theta} = 1 \,\big|\, a\big) < \pi^*. \end{cases} \tag{2.4}$$

This rule convicts the principal when the probability that he has committed at least one offense exceeds the exogenous threshold $\pi^*$. This rule has been advocated when an individual is accused of wrongdoing outside the criminal process, for example when managers are charged with discriminating against minority workers, or when supervisors are charged with abusing their subordinates. Schauer and Zeckhauser (1996) argue that the aggregation of multiple low-probability accusations to discipline agents is often appropriate in corporate and other non-judicial settings. Harel and Porat (2009, p. 263) advocate the use of APP in some judicial settings, observing that acquitting defendants who are almost surely guilty of some unspecified crime is neither just nor efficient.

The second adjudication rule, DPP, requires that

$$s \begin{cases} = 1 & \text{if } \max_{i \in \{1, 2, \ldots, n\}} \Pr\big(\theta_i = 1 \,\big|\, a\big) > \pi^* \\ \in \{0,1\} & \text{if } \max_{i \in \{1, 2, \ldots, n\}} \Pr\big(\theta_i = 1 \,\big|\, a\big) = \pi^* \\ = 0 & \text{if } \max_{i \in \{1, 2, \ldots, n\}} \Pr\big(\theta_i = 1 \,\big|\, a\big) < \pi^*. \end{cases} \tag{2.5}$$

DPP is close to the criterion used to adjudicate guilt by most criminal justice systems when a defendant is charged [...] specific offense exceeds some evidentiary threshold.

When there is only one agent, (2.4) and (2.5) coincide. When there are two or more agents, (2.4) and (2.5) are distinct criteria. In the example presented in the introduction, a defendant who commits two offenses with probability 0.8 each is convicted under decision rule (2.4) but acquitted under decision rule (2.5) when $\pi^*$ is the conviction threshold used in that example.

Footnote: In the baseline model, we normalize the social stigma (or loss from retaliation) to zero when the principal is convicted. Our results hold as long as the social stigma (or loss from retaliation) is strictly larger when the principal is acquitted rather than convicted.

Solution Concept:

We focus on proper equilibria (Myerson 1978) that satisfy two refinements, described below, which will simply be referred to as equilibria. Formally, a proper equilibrium consists of a strategy profile $\{\sigma_o, \sigma_v, (\sigma_i)_{i=1}^{n}, q\}$ where
• $\sigma_o \in \Delta(\{0,1\}^n)$ and $\sigma_v \in \Delta(\{0,1\}^n)$ are the strategies followed by an opportunistic type and a virtuous type of principal, respectively;
• $\sigma_i : \mathbb{R} \times \{0,1\} \to \Delta\{0,1\}$ is agent $i$'s strategy when he is strategic and maps each realization of $\omega_i$ and $\theta_i$ into a probability of accusing the principal;
• $q : \{0,1\}^n \to [0,1]$ is the judge's strategy and defines the probability that the judge convicts the principal after observing $a \in \{0,1\}^n$: $q(a) = \Pr(s = 1 \mid a)$.

Our first refinement requires that the principal be acquitted if no agent accuses him.

Refinement 1 (No Conviction Unless Accused). $q(0, 0, \dots, 0) = 0$.

Refinement 1 rules out equilibria in which an opportunistic principal commits offenses against all agents with probability one and is convicted with probability one regardless of agents' reports. In these equilibria, the principal is convicted on the sole basis of a judge's prior belief. These equilibria violate the legal principle of the presumption of innocence. Our second refinement gives meaning to agents' messages.

Refinement 2 (Monotonicity). For every $i \in \{1, 2, \dots, n\}$, $q(1, a_{-i}) \geq q(0, a_{-i})$ for every $a_{-i} \in \{0,1\}^{n-1}$, and there exists $a_{-i} \in \{0,1\}^{n-1}$ such that $q(1, a_{-i}) > q(0, a_{-i})$.

Refinement 2 requires, first, that the conviction probability be nondecreasing in the set of agents who accuse the principal and, second, that each agent's report have a nontrivial influence on the adjudication outcome, for at least one reporting profile of other agents. This refinement implies that each accusation is a move against the principal, consistent with the interpretation of $c$ as a retaliation cost. Intuitively, a principal retaliates against messages that reduce his payoff and does not retaliate against other messages.

Footnote: Actual sentencing procedures are of course more complex. In some cases, sentences are cumulated across offenses for which the defendant is found guilty; in others, they are substitutes and only the most severe punishment among the charges for which the defendant was found guilty is meted out. Our analysis can accommodate these variations, as explained in Section 7. The key feature that we, and legal scholars before us, emphasize is that DPP treats charges independently of one another to adjudicate guilt.

Lemma 2.1 establishes the existence of a proper equilibrium that satisfies both refinements, and provides preliminary observations that apply to both APP and DPP and to any number of agents.

Lemma 2.1.

For every punishment $L$ above some threshold $\underline{L} \in \mathbb{R}_+$, there exists a proper equilibrium that satisfies Refinements 1 and 2. In every such equilibrium:
1. For every $i \in \{1, 2, \dots, n\}$, agent $i$'s strategy is characterized by cutoffs $\omega_i^* > \omega_i^{**}$ such that agent $i$ accuses the principal when either $\theta_i = 1$ and $\omega_i \leq \omega_i^*$, or $\theta_i = 0$ and $\omega_i \leq \omega_i^{**}$.
2. The prior probability $\Pr(\overline{\theta} = 1)$ that the principal commits at least one offense is strictly between 0 and 1.
3. A virtuous principal never commits any offense.

The proof appears in Appendix A, except for the proof of existence, which is in Supplementary Appendix J. The existence of an equilibrium satisfying Refinement 1 requires $L$ to be sufficiently large. For example, if $L < 1$ and $\pi_o > \pi^*$, the principal commits an offense against all agents with probability 1 when he is opportunistic, in every proper equilibrium, and is convicted even when no agent accuses him. In this case, no Nash equilibrium satisfies Refinement 1. To get a sense of how large $L$ has to be for the existence of an equilibrium, we performed some numerical simulations. For example, when $\pi^* = 0.$[…], $\omega_i$ follows a normal distribution with mean […] and variance […], the fraction of strategic agents is $\delta = 0.$[…], $b = 1$ and $c = 10$, a proper equilibrium that satisfies both refinements exists when $L \geq$ […].

Hidden vs. Public Reports

For tractability, our analysis is conducted under the assumption that agents' reports are public. The issues exposed in our analysis are not easily dispelled by relaxing this assumption. To see this, consider the case of two agents and suppose that agents' reports are made public only if both agents accuse the principal. This frees agents from the risk of social stigma or retaliation but, by the same token, allows agents to accuse the principal with impunity and replaces false negatives with false positives. It is easy to see that hiding reports in this way can never lead to highly informative reports and low offense frequency (see Remarks 1 and 2). We show that the effects of retaliation (or social stigma) costs are subtle and depend critically on the correlation in agents' private information.

Footnote: A microfoundation for this monotonicity refinement is provided by Chassang and Padró i Miquel (2019), in which the principal optimally commits to a retaliation plan $\tilde{c}_i : \{0,1\} \to [0, c]$ privately against agent $i$, which maps agent $i$'s report to his loss from the principal's retaliation. Retaliation can only be carried out when the principal is acquitted, and $c$ is the maximal damage the principal can inflict on each agent. The principal's optimal strategy is to inflict maximal retaliation against a message that increases his probability of conviction and zero retaliation against the other message.

Simultaneous-Move vs Sequential-Move:

In our baseline model, agents simultaneously decide whether to file accusations against a potential offender. In practice, such decisions are often made sequentially and agents can often choose not only whether to file an accusation but also when to file it. We explain in Section 8 that the forces underlying our results also apply when reporting is sequential.

To illustrate this, suppose that there are two agents, who make their reports in a predetermined order. An agent who witnessed an offense and who is the first to report may be concerned that the second agent will not accuse the principal, which is more likely if offenses are negatively correlated. Moreover, the second-reporting agent will not accuse the principal, regardless of what he has witnessed, if the first-moving agent remains silent. Similar issues arise if agents' order of move is uncertain.

General Punishment Function:

In our baseline model, the conviction decision is binary. In principle, the punishment administered to a convicted defendant could depend on the entire probability distribution of the number of offenses that the defendant is guilty of committing. For example, a defendant almost surely guilty of at least two offenses could be given a higher punishment than a defendant almost surely guilty of just one offense.

Our main result is robust with respect to such concerns. Theorem 1 establishes that when the punishment in case of conviction is large enough, an opportunistic principal commits at most one offense in equilibrium even when the punishment for committing multiple offenses is the same as the punishment for committing a single offense. The incentives to commit two or more offenses are a fortiori even weaker when the punishment for defendants likely to have committed multiple offenses is higher, which reinforces our results.

We show that when there are multiple potential witnesses, APP leads to uninformative reports and ineffective deterrence (Theorem 1), while DPP leads to informative reports and effective deterrence (Theorem 2).

To highlight the forces at play in their simplest form, this section assumes that the principal is surely opportunistic, i.e., $\pi_o = 1$, and that agents do not directly care about offenses that they cannot observe, i.e., $\gamma = 0$, and compares decision rules (2.4) and (2.5) when there are one or two potential witnesses. The results are generalized to multiple principal types in Section 4, three or more agents in Section 5, and agents who directly care about all offenses in Section 7.

3.1 Benchmark: Single Potential Witness

When there is only one agent, Refinement 1 implies that the principal may be convicted only if the agent accuses him, i.e., $q(0) = 0$ and $q(1) > 0$. If $\theta = 1$, the agent accuses the principal when $\omega - b \leq (1 - q(1))(\omega - b - c)$ or, equivalently,
$$\omega \leq \omega^* \equiv b - c\,\frac{1 - q(1)}{q(1)}. \tag{3.1}$$
If $\theta = 0$, the agent accuses the principal when $\omega \leq (1 - q(1))(\omega - c)$ or, equivalently,
$$\omega \leq \omega^{**} \equiv -c\,\frac{1 - q(1)}{q(1)}. \tag{3.2}$$
Let $\Pr(\theta = 1)$ be the prior probability that the principal commits the offense and $\Pr(\theta = 1 \mid a = 1)$ be the judge's posterior belief when the agent accuses the principal. Bayes rule implies that
$$\underbrace{\frac{\Pr(\text{agent reports} \mid \theta = 1)}{\Pr(\text{agent reports} \mid \theta = 0)}}_{\equiv\, \mathcal{I}} \cdot \frac{\Pr(\theta = 1)}{1 - \Pr(\theta = 1)} = \frac{\Pr(\theta = 1 \mid a = 1)}{1 - \Pr(\theta = 1 \mid a = 1)}. \tag{3.3}$$
The ratio $\mathcal{I}$ measures the informativeness of the agent's accusation about the principal's guilt.

Proposition 1.

Suppose that $n = 1$.
1. There exists $\underline{L} > 0$ such that for any $L > \underline{L}$ and any equilibrium, the judge assigns probability $\pi^*$ to the principal being guilty when the agent accuses the principal.
2. As $L$ goes to $+\infty$ and the fraction of behavioral agents $1 - \delta$ goes to 0, the informativeness ratio $\mathcal{I}$ goes to $+\infty$ and the prior probability of offense $\Pr(\theta = 1)$ goes to 0.

The proof is in Appendix B. Proposition 1 shows that when the punishment in case of conviction is large and there can be at most one accusation against the principal, the outcome of any equilibrium has highly desirable properties: accusations are arbitrarily informative and the frequency of offenses becomes arbitrarily close to zero.

Proposition 1 may be understood as follows: In equilibrium, the ex ante probability of offense is interior (Lemma 2.1), and the principal must be indifferent between committing an offense and abstaining from it. This indifference condition implies that $q(1)$ vanishes to 0 as $L$ goes to infinity. From (3.1) and (3.2), the agent's reporting cutoffs are decreasing in $L$, but their distance $\omega^* - \omega^{**}$ remains constant and equal to $b$.

As the fraction $1 - \delta$ of behavioral agents goes to 0, the informativeness ratio $\mathcal{I}$ becomes approximately equal to $\Phi(\omega^*) / \Phi(\omega^* - b)$. Since $\omega^* - \omega^{**} = b$, the ratio $\Phi(\omega^*) / \Phi(\omega^* - b)$ goes to infinity as $\omega^* \to -\infty$. Since $q(1)$ is strictly between 0 and 1 when $L$ is large enough, the judge's posterior belief equals $\pi^*$ after observing the agent's report. Equation (3.3) then implies that the prior probability of offense vanishes to 0.

Footnote: Throughout the paper, the limit is taken first with respect to $\delta$ and then with respect to $L$. For example, Proposition 1 states that $\lim_{L \to +\infty} \big(\lim_{\delta \to 1} \Pr(\theta = 1)\big) = 0$.

Footnote: This is a property of the Gaussian distribution and, more generally, of thin-tailed distributions.

Remark 1:

Proposition 1 fails when the agent's retaliation cost $c$ is exactly equal to zero. In this case, it is straightforward to see that the agent's accusation thresholds are fixed, equal to $b$ and 0, independently of $L$. This implies that the prior probability of committing an offense is also fixed in equilibrium, independently of $L$. Introducing a slightly positive retaliation cost when $L$ is large leads to arbitrarily informative reports and reduces the probability of offense to arbitrarily low levels in equilibrium.

Suppose now that there are two agents and that adjudication rule (2.4) is used. Recalling that $\overline{\theta}$ describes whether at least one offense took place (from equation (2.3)), Bayes rule implies that for every $a \in \{0,1\}^2$,
$$\frac{\Pr(a \mid \overline{\theta} = 1)}{\Pr(a \mid \overline{\theta} = 0)} \cdot \frac{\Pr(\overline{\theta} = 1)}{1 - \Pr(\overline{\theta} = 1)} = \frac{\Pr(\overline{\theta} = 1 \mid a)}{1 - \Pr(\overline{\theta} = 1 \mid a)}. \tag{3.4}$$
The ratio
$$\mathcal{I}(a) \equiv \frac{\Pr(a \mid \overline{\theta} = 1)}{\Pr(a \mid \overline{\theta} = 0)} \tag{3.5}$$
is a sufficient statistic for the judge's posterior belief after observing $a$, and therefore, it measures the informativeness of reporting profile $a$.

Theorem 1.

When $n = 2$ and the judge uses decision rule (2.4), there exists $\underline{L}$ such that when $L > \underline{L}$,
1. Endogenous Negative Correlation Between Offenses: In every equilibrium,
$$\Pr(\theta_1 = 1 \mid \theta_2 = 1) < \Pr(\theta_1 = 1 \mid \theta_2 = 0) \quad \text{and} \quad \Pr(\theta_2 = 1 \mid \theta_1 = 1) < \Pr(\theta_2 = 1 \mid \theta_1 = 0). \tag{3.6}$$
For every $\varepsilon > 0$, there exists $L_\varepsilon \in \mathbb{R}_+$ such that in every equilibrium when $L > L_\varepsilon$ and $\delta \in (0, 1)$,
2. Low Informativeness of Reports: $\max_{a \in \{0,1\}^2} \mathcal{I}(a) < 1 + \varepsilon$.
3. Ineffective Deterrence: $\Pr(\overline{\theta} = 1) > \pi^* - \varepsilon$.

Theorem 1 shows that when conviction entails a large punishment relative to the principal's benefit from committing an offense, aggregating the probabilities of distinct offenses to adjudicate guilt induces negative correlation across the occurrences of distinct offenses. This negative correlation undermines the informativeness of agents' reports and results in a high probability of offenses. As $L \to +\infty$, agents' reports become arbitrarily uninformative and the equilibrium probability of offense converges to $\pi^*$. These conclusions stand in sharp contrast to the single-agent benchmark, in which the probability of offense goes to 0 and the informativeness ratio goes to infinity as $L$ becomes arbitrarily large.

Footnote: This must be true to maintain a judge's incentive to convict the principal with interior probability when the agent accuses him.

Remark 2:

Theorem 1 requires that the cost of social stigma or retaliation $c$ be strictly positive but does not impose any lower bound on $c$. In practice, one may devise mechanisms that shield agents, to some extent, from stigma and retaliation. For example, agents' reports may be kept private unless multiple agents accuse the principal. Even in this case, the expected cost of retaliation is still strictly positive as long as there is a strictly positive probability that information is leaked to the principal.

Theorem 1 suggests that deterrence can be improved by lowering the conviction threshold $\pi^*$, which was taken as exogenously given. However, lowering $\pi^*$ increases the risk of convicting an innocent defendant. In our model, the probability that a convicted defendant is innocent equals $1 - \pi^*$ in equilibrium. For example, if any judge were to use a conviction threshold $\pi^*$ of […], then in every equilibrium, each convicted individual has a […] chance of being innocent. In general, the conviction threshold may be chosen so as to strike a compromise between deterrence and wrongful convictions, and different values of $\pi^*$ may be justified by different welfare costs of punishing the innocent.

One may conjecture that the lower witness credibility that arises with two agents is driven by agents' incentives to free-ride on others' reports, which is the intuition behind inefficient public good provision (e.g., Chamberlin 1974). The following comparative static result refutes this conjecture by showing that, other things equal, increasing the number of agents results in a strictly higher probability that agents accuse the principal in every equilibrium. This comparative static is generalized to three or more agents in Theorem 3.

Proposition 2.

When the judge uses decision rule (2.4), there exists $\underline{L} > 0$ such that the following holds for every $L > \underline{L}$: Let $\omega^*_{i,2}(L)$ and $\omega^{**}_{i,2}(L)$ denote agent $i$'s accusation cutoffs in some equilibrium of the two-agent setting, and let $\omega^*(L)$ and $\omega^{**}(L)$ denote the accusation cutoffs in some equilibrium of the single-agent setting. Then, $\omega^*_{i,2}(L) > \omega^*(L)$ and $\omega^{**}_{i,2}(L) > \omega^{**}(L)$ for every $i \in \{1, 2\}$.

The remainder of this section explains in more detail why agents' private observations are negatively correlated and how this negative correlation interacts with agents' coordination motive.
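As a point of reference for the single-agent cutoffs $\omega^*(L)$ and $\omega^{**}(L)$ that Proposition 2 compares against, the following is a minimal numerical sketch of the single-agent benchmark. It is an illustration under stated assumptions, not the computation used in the paper: $\Phi$ is taken to be the standard normal CDF, behavioral agents accuse with a hypothetical probability `alpha` regardless of what they observe, the default $\delta = 1$ corresponds to the limit taken first in Proposition 1, and the values `b = 1`, `c = 1`, `pi_star = 0.9` are chosen only for illustration. The sketch finds $q(1)$ from the principal's indifference condition $L \, q(1) \, [\Pr(a = 1 \mid \theta = 1) - \Pr(a = 1 \mid \theta = 0)] = 1$ and then backs out the prior from (3.3) with the posterior set to $\pi^*$.

```python
import math

def normal_cdf(x):
    # Standard normal CDF via erfc, which stays accurate in the left tail.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def cutoffs(q1, b, c):
    """Reporting cutoffs (3.1) and (3.2) given q1 = q(1)."""
    penalty = c * (1.0 - q1) / q1
    return b - penalty, -penalty

def accusation_probs(q1, b, c, delta, alpha):
    """Pr(a=1 | theta=1) and Pr(a=1 | theta=0).

    Assumption: a behavioral agent (probability 1 - delta) accuses with
    probability alpha, independently of theta.
    """
    w_star, w_2star = cutoffs(q1, b, c)
    p_guilty = delta * normal_cdf(w_star) + (1.0 - delta) * alpha
    p_innocent = delta * normal_cdf(w_2star) + (1.0 - delta) * alpha
    return p_guilty, p_innocent

def solve_single_agent(L, b=1.0, c=1.0, delta=1.0, alpha=0.5, pi_star=0.9):
    """Bisect on q(1) so that the principal is indifferent about offending."""
    def excess(q1):
        p_guilty, p_innocent = accusation_probs(q1, b, c, delta, alpha)
        return L * q1 * (p_guilty - p_innocent) - 1.0

    lo, hi = 1e-9, 1.0
    if excess(hi) <= 0.0:
        raise ValueError("L is too small for an interior q(1) in this sketch")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    q1 = 0.5 * (lo + hi)

    w_star, w_2star = cutoffs(q1, b, c)
    p_guilty, p_innocent = accusation_probs(q1, b, c, delta, alpha)
    informativeness = p_guilty / p_innocent  # the ratio I in (3.3)
    # Judge indifference: posterior after an accusation equals pi_star.
    prior_odds = (pi_star / (1.0 - pi_star)) / informativeness
    return {
        "q1": q1,
        "w_star": w_star,
        "w_2star": w_2star,
        "informativeness": informativeness,
        "prior": prior_odds / (1.0 + prior_odds),
    }

if __name__ == "__main__":
    for L in (1e2, 1e4):
        print(L, solve_single_agent(L))
```

Under these assumptions, raising `L` lowers `q1` and both cutoffs while their distance stays at `b`, so the informativeness ratio rises and the implied prior falls, in line with the discussion of Proposition 1. With `delta` fixed below 1 the ratio is bounded by the behavioral agents' noise, which is why the paper takes the limit in $\delta$ first.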

Step 1: Equilibrium Conviction Probability

We first explain why, when $L$ is large enough, the principal is convicted with positive probability only when he is accused by both agents:

Lemma 3.1. There exists $\underline{L} \in \mathbb{R}_+$ such that for any $L > \underline{L}$ and any corresponding equilibrium,
$$q(0,0) = q(1,0) = q(0,1) = 0 \quad \text{and} \quad q(1,1) \in (0,1). \tag{3.7}$$
This result is formally proved in Online Appendix F. The argument proceeds by contradiction: Suppose that a single accusation suffices to convict the principal with strictly positive probability. Since each agent is strictly more likely to accuse the principal when he has witnessed an offense, each additional accusation increases the probability that the principal has committed at least one offense. Since a Bayesian judge weakly prefers to convict the principal with only one accusation, she strictly prefers to convict him if both agents file accusations.

Building on this observation, we show in the appendix that the incremental probability that the principal is convicted when he commits an additional offense (whether starting from zero offenses or from a single offense) is uniformly bounded away from zero. As the punishment $L$ in case of conviction becomes arbitrarily large, this implies that the marginal cost from committing an additional offense must exceed the marginal benefit from the offense, which gives the principal a strict incentive not to commit any offense. This contradicts Lemma 2.1, which states that the equilibrium probability of offense is strictly positive, and, hence, shows that (3.7) must hold.

Footnote: Theorem 1 is stated for the same order of limits as the one used in Proposition 1, i.e., $\lim_{L \to \infty} \lim_{\delta \to 1}$.

Step 2: Principal's Incentives & Endogenous Negative Correlation

We examine the principal's incentives to commit offenses and establish a necessary and sufficient condition for his choices of $\theta_1$ and $\theta_2$ to be strategic substitutes or strategic complements.

Lemma 3.2. When the principal is opportunistic, offenses $\theta_1$ and $\theta_2$ are strategic substitutes if and only if
$$q(1,1) + q(0,0) - q(1,0) - q(0,1) > 0 \tag{3.8}$$
and strategic complements if and only if $q(1,1) + q(0,0) - q(1,0) - q(0,1) < 0$.

Proof of Lemma 3.2: For $i \in \{1, 2\}$, let $\Psi_i^* \equiv \delta \Phi(\omega_i^*) + (1 - \delta)\alpha$ and let $\Psi_i^{**} \equiv \delta \Phi(\omega_i^{**}) + (1 - \delta)\alpha$. The difference in the probability of conviction conditional on $(\theta_1, \theta_2) = (0,0)$ and on $(\theta_1, \theta_2) = (1,0)$ is
$$(\Psi_1^* - \Psi_1^{**}) \Big( (1 - \Psi_2^{**}) \big( q(1,0) - q(0,0) \big) + \Psi_2^{**} \big( q(1,1) - q(0,1) \big) \Big), \tag{3.9}$$
while the difference in the probability of conviction conditional on $(\theta_1, \theta_2) = (0,1)$ and on $(\theta_1, \theta_2) = (1,1)$ is
$$(\Psi_1^* - \Psi_1^{**}) \Big( (1 - \Psi_2^{*}) \big( q(1,0) - q(0,0) \big) + \Psi_2^{*} \big( q(1,1) - q(0,1) \big) \Big). \tag{3.10}$$
The two offenses are strategic substitutes if and only if (3.9) is less than (3.10) or, equivalently, if
$$(\Psi_1^* - \Psi_1^{**})(\Psi_2^* - \Psi_2^{**}) \Big( q(1,0) + q(0,1) - q(0,0) - q(1,1) \Big) < 0.$$
Since $\omega_i^* > \omega_i^{**}$ for every $i \in \{1, 2\}$, we have $(\Psi_1^* - \Psi_1^{**})(\Psi_2^* - \Psi_2^{**}) > 0$, which implies (3.8).

Lemmas 3.1 and 3.2 imply that in every equilibrium with $L$ large enough, conviction probabilities satisfy (3.8). This implies that the principal views offenses as strategic substitutes and, in particular, that he cannot be indifferent between committing no offense and committing two offenses. Since the equilibrium probability of offense is interior, by Lemma 2.1, the principal must be indifferent between committing no offense and committing one offense. This shows that $\theta_1$ and $\theta_2$ are negatively correlated.

Step 3: Agents' Coordination Motive

From (3.7), the principal is convicted only if both agents accuse him. Therefore, an agent's incentive to accuse the principal is stronger when he expects the other agent to accuse the principal with higher probability. Formally, if $\theta_i = 1$, then agent $i$ prefers to accuse the principal when
$$\omega_i \leq \omega_i^* \equiv b - c\,\frac{1 - q(1,1) Q_{1,j}}{q(1,1) Q_{1,j}} = b + c - \frac{c}{q(1,1) Q_{1,j}}, \tag{3.11}$$
where $Q_{1,j}$ is the probability that agent $j \neq i$ accuses the principal conditional on $\theta_i = 1$. Similarly, if $\theta_i = 0$, then agent $i$ prefers to accuse the principal when
$$\omega_i \leq \omega_i^{**} \equiv -c\,\frac{1 - q(1,1) Q_{0,j}}{q(1,1) Q_{0,j}} = c - \frac{c}{q(1,1) Q_{0,j}}, \tag{3.12}$$
where $Q_{0,j}$ is the probability that agent $j \neq i$ accuses the principal conditional on $\theta_i = 0$. One can verify that $\omega_i^*$ is strictly increasing in $Q_{1,j}$, and $\omega_i^{**}$ is strictly increasing in $Q_{0,j}$.

Since each agent is more likely to accuse the principal when he has witnessed an offense (Lemma 2.1), the negative correlation between $\theta_1$ and $\theta_2$ implies that $Q_{1,j} < Q_{0,j}$ and, hence, that $\omega_i^* - \omega_i^{**} \in (0, b)$. We show in Appendix C that $\omega_i^* - \omega_i^{**} \to 0$ as $L \to \infty$. This stands in sharp contrast to the single-agent benchmark, in which the distance between the two reporting cutoffs is equal to $b$ regardless of the other parameters of the model. The decrease in the distance between reporting cutoffs undermines the informativeness of agents' reports as measured by $\max_{a \in \{0,1\}^2} \mathcal{I}(a)$. This informativeness measure converges to 1, which corresponds to completely uninformative reports, as $L \to +\infty$. From (3.7), the judge assigns a posterior probability of guilt equal to $\pi^*$ after observing $a = (1,1)$. From (3.4),
$$\underbrace{\frac{\Pr(a = (1,1) \mid \overline{\theta} = 1)}{\Pr(a = (1,1) \mid \overline{\theta} = 0)}}_{\equiv\, \mathcal{I}(1,1) \,\leq\, \max_{a \in \{0,1\}^2} \mathcal{I}(a)} \cdot \frac{\Pr(\overline{\theta} = 1)}{1 - \Pr(\overline{\theta} = 1)} = \frac{\pi^*}{1 - \pi^*},$$
which implies that the prior probability of offense $\Pr(\overline{\theta} = 1)$ must be close to $\pi^*$ when $L$ is large enough.

Footnote: When the principal can be either opportunistic or virtuous, and the probability that the principal is virtuous exceeds $1 - \pi^*$, we show in Section 4 that the opportunistic type must be indifferent between committing one and two offenses in equilibrium, and commits two offenses with strictly positive probability. Nevertheless, we show that agents' private observations remain negatively correlated in this case.

We now turn to equilibrium outcomes when there are two agents and the judge uses DPP to adjudicate guilt, i.e., conviction is determined based on the maximal probability that the principal is guilty of some specific offense.
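The cutoff formulas (3.11) and (3.12) can be evaluated directly to see how the distance $\omega_i^* - \omega_i^{**}$ responds to the other agent's conditional accusation probabilities. The sketch below is only an illustration: the function names are hypothetical, and the inputs `q11`, `Q1`, `Q0` are arbitrary numbers chosen for the example, not equilibrium values. It covers the three cases that matter for the comparison of the two rules: $Q_{1,j} < Q_{0,j}$ (negatively correlated offenses), $Q_{1,j} = Q_{0,j}$ (uncorrelated offenses), and $Q_{1,j} > Q_{0,j}$ (positively correlated offenses).

```python
def two_agent_cutoffs(q11, Q1, Q0, b, c):
    """Cutoffs (3.11) and (3.12) when only a = (1, 1) can lead to conviction.

    q11: conviction probability q(1,1)
    Q1:  Pr(other agent accuses | theta_i = 1)
    Q0:  Pr(other agent accuses | theta_i = 0)
    """
    w_star = b + c - c / (q11 * Q1)   # agent i witnessed an offense
    w_2star = c - c / (q11 * Q0)      # agent i witnessed no offense
    return w_star, w_2star

def cutoff_gap(q11, Q1, Q0, b, c):
    """Distance between the two reporting cutoffs."""
    w_star, w_2star = two_agent_cutoffs(q11, Q1, Q0, b, c)
    return w_star - w_2star

if __name__ == "__main__":
    b, c, q11 = 1.0, 1.0, 0.5  # illustrative values only
    print("negative correlation:", cutoff_gap(q11, 0.28, 0.30, b, c))
    print("no correlation:      ", cutoff_gap(q11, 0.30, 0.30, b, c))
    print("positive correlation:", cutoff_gap(q11, 0.30, 0.28, b, c))
```

Algebraically the gap equals $b - \frac{c}{q(1,1)}\big(\frac{1}{Q_{1,j}} - \frac{1}{Q_{0,j}}\big)$, so it falls below $b$ exactly when $Q_{1,j} < Q_{0,j}$ and equals $b$ when the two probabilities coincide, as in the single-agent benchmark.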

Theorem 2.

When $n = 2$ and the judge uses decision rule (2.5), there exists $\underline{L} \in \mathbb{R}_+$ such that for every $L > \underline{L}$ and every equilibrium,
1. Uncorrelated Offenses: $\Pr(\theta_i = 1 \mid \theta_j = 1) = \Pr(\theta_i = 1 \mid \theta_j = 0)$ for every $i \neq j$.
2. Linear Conviction Probability: $\Pr(\theta_i = 1 \mid a_i = 1) = \pi^*$ for every $i \in \{1, 2\}$, and the conviction probability is linear in the number of accusations.
As the fraction of behavioral agents goes to zero and $L \to +\infty$, the following asymptotic results hold:
3. Effective Deterrence: The equilibrium probability of offense $\Pr(\overline{\theta} = 1)$ converges to 0.
4. Highly Informative Reports: For every $i \in \{1, 2\}$, the informativeness of agent $i$'s report about $\theta_i$, measured by $\frac{\Pr(a_i = 1 \mid \theta_i = 1)}{\Pr(a_i = 1 \mid \theta_i = 0)}$, goes to $+\infty$.

The proof is in Appendix D. Theorem 2 shows that the offenses observed by distinct agents are uncorrelated, and that the probability of convicting the principal is linear in the number of accusations. From Lemma 3.2, this implies that the principal's incentives to commit distinct offenses are neither complements nor substitutes and, consequently, the principal's incentives to commit distinct offenses operate independently of each other. This feature preserves the credibility of the agents' reports and lowers the probability of offense in equilibrium.

We explain the intuition underlying Theorem 2 in three steps, focusing primarily on why agents' private observations of offense are uncorrelated and why uncorrelated private information leads to linear conviction probabilities, informative reports, and effective deterrence in equilibrium.

Footnote: As noted in Proposition 1, the results are obtained by first taking the limit with respect to $\delta$ and then with respect to $L$.

Footnote: The same conclusion applies under other measures of informativeness, for example, the one developed in Theorem 1, in which case one can show that $\max_{a \in \{0,1\}^2} \mathcal{I}(a)$ also goes to infinity in the $L \to \infty$ limit.

Ruling Out Correlation: Suppose first, by way of contradiction, that $\theta_1$ and $\theta_2$ are negatively correlated. In this case,
$$\Pr(\theta_1 = 1 \mid a = (1,1)) < \Pr(\theta_1 = 1 \mid a = (1,0))$$
because an accusation by agent 2 increases the probability that $\theta_2 = 1$ and, given the negative correlation between offenses, reduces the probability that $\theta_1 = 1$. A similar logic implies that $\Pr(\theta_2 = 1 \mid a = (1,1)) < \Pr(\theta_2 = 1 \mid a = (0,1))$. Under decision rule (2.5), the above inequalities imply that $q(1,0) \geq q(1,1)$ and $q(0,1) \geq q(1,1)$. For this to satisfy Refinement 2, it has to be the case that
$$q(1,1) = q(1,0) = q(0,1) = 1 \quad \text{and} \quad q(0,0) = 0. \tag{3.13}$$
When $L$ is large enough, one can show similarly to Lemma 3.1 that under such conviction probabilities, the principal has a strict incentive not to commit any offense, which contradicts Lemma 2.1's claim that the equilibrium probability of offense is interior.

Next, suppose that $\theta_1$ and $\theta_2$ are positively correlated, in which case $a = (1,1)$ is the unique maximizer of both $\Pr(\theta_1 = 1 \mid a)$ and $\Pr(\theta_2 = 1 \mid a)$. Under decision rule (2.5), we have
$$q(1,1) \geq \max\{q(1,0), q(0,1)\} \geq q(0,0) = 0 \tag{3.14}$$
where the first inequality is strict unless $\max\{q(1,0), q(0,1)\} = 1$. When $L$ is large enough, the same logic as in Lemma 3.1 shows that the principal has an incentive to commit an offense only if $q(1,1) \in (0,1)$. The positive correlation between $\theta_1$ and $\theta_2$ then implies that $q(0,1) = q(1,0) = 0$. This, together with Lemma 3.2, shows that the principal's decisions to commit offenses are strategic substitutes and contradicts the hypothesis that $\theta_1$ and $\theta_2$ are positively correlated.

Linear Conviction Probabilities:

Since $\theta_1$ and $\theta_2$ are uncorrelated, the principal's decisions to commit distinct offenses are neither substitutes nor complements. Lemma 3.2 and Refinement 1 then imply that $q(1,1) = q(1,0) + q(0,1)$. Since the equilibrium probability of offenses is interior (Lemma 2.1), the principal plays a completely mixed strategy in equilibrium. Taken together, these observations imply that conviction probabilities are symmetric, i.e., $q(1,0) = q(0,1)$. Otherwise, the principal would strictly prefer $\theta = (1,0)$ to $\theta = (0,1)$ or vice versa.

Informative Report & Effective Deterrence:

The absence of correlation between $\theta_1$ and $\theta_2$ implies that an agent's belief about whether the other agent has witnessed an offense is independent of the first agent's observation. From (3.11) and (3.12), the independence of $\theta_1$ and $\theta_2$ implies that $Q_{1,j} = Q_{0,j}$ for $j \in \{1, 2\}$, which further implies that $\omega_j^* - \omega_j^{**} = b$.

As in the single-agent case, agents' reporting cutoffs converge to $-\infty$ as $L$ becomes arbitrarily large, and the informativeness ratio of each agent's accusation, which is approximately equal to $\Phi(\omega^*) / \Phi(\omega^* - b)$, goes to $+\infty$. Since the judge's posterior about $\theta_i$ is equal to $\pi^*$ after observing $a_i = 1$, and since the informativeness ratio goes to $+\infty$, the prior probability that $\theta_i = 1$ must converge to 0 as $L$ becomes arbitrarily large.

Remark:

Theorems 1 and 2 provide a justification for the use of DPP in criminal justice systems. They explain why aggregating probabilities of different offenses to adjudicate guilt can severely distort incentives of various actors and offer a rationale for prohibiting character evidence to establish the guilt of a defendant.

Implementing DPP in practice requires commitment power on the part of the adjudicator (judge, board of trustees of a firm, etc.). This aspect may be illustrated by the following thought experiment: Consider a setting with two defendants and three plaintiffs. The offenses committed by Defendant 1 against plaintiffs 1 and 2 are perfectly negatively correlated, as described in the following table:

              Pr(offense against plaintiff 1)   Pr(offense against plaintiff 2)   Pr(at least one offense)
Defendant 1   49.5%                             49.5%                             99%

              Pr(offense against plaintiff 3)   Pr(at least one offense)
Defendant 2   51%                               51%
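The table can be checked mechanically against the two adjudication rules. The sketch below assumes the preponderance-of-evidence threshold $\pi^* = 50\%$ used in this thought experiment and ignores the tie case in (2.4) and (2.5); the dictionary layout and function names are hypothetical and serve only this illustration.

```python
PI_STAR = 0.50  # preponderance-of-evidence threshold from the thought experiment

# Probabilities copied from the table above.
DEFENDANTS = {
    "Defendant 1": {"per_offense": [0.495, 0.495], "at_least_one": 0.99},
    "Defendant 2": {"per_offense": [0.51], "at_least_one": 0.51},
}

def convicts_under_app(defendant, pi_star=PI_STAR):
    """Rule (2.4): convict if Pr(at least one offense) exceeds pi_star."""
    return defendant["at_least_one"] > pi_star

def convicts_under_dpp(defendant, pi_star=PI_STAR):
    """Rule (2.5): convict if the most likely single offense exceeds pi_star."""
    return max(defendant["per_offense"]) > pi_star

if __name__ == "__main__":
    for name, d in DEFENDANTS.items():
        print(name, "APP:", convicts_under_app(d), "DPP:", convicts_under_dpp(d))
```

The two rules agree on Defendant 2 and disagree on Defendant 1, whose individual offense probabilities each fall just short of the threshold while the probability of at least one offense is 99%.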

Suppose that DPP is used together with the preponderance of evidence criterion ($\pi^* = 50\%$). Then, Defendant 1 is acquitted and Defendant 2 is convicted despite the fact that Defendant 1 is almost certainly guilty.

Using DPP in this case seems particularly problematic ex post, and it may be difficult to implement DPP in practice if the adjudicator lacks commitment power. In non-legal settings, firms face more social pressure to fire a manager whose probabilities of abusing subordinates are given by the first row, and political parties have incentives to ostracize party members with bad reputations (e.g., individuals who are believed to have committed at least some offenses with high probability) in order to restore their popularity.

Motivated by this commitment problem, we explore alternative remedies to reduce the probability of offense when conviction decisions are made according to APP. Proposition 3 suggests that reducing the magnitude of punishment can help deter offenses by inducing a positive correlation in agents' private information. This positive correlation, together with agents' coordination motive, encourages each agent to accuse the principal when he has witnessed an offense. This improves the informativeness of all agents' accusations.

Proposition 3.

When the judge uses (2.4), there exists, for every $c > 0$, a nonempty interval $(\underline{L}(c), \overline{L}(c)) \subset \mathbb{R}_+$ such that
$$q(1,1) + q(0,0) - q(1,0) - q(0,1) < 0$$
for every $L$ in this interval and corresponding equilibrium. For every $\varepsilon > 0$, there exists $\overline{c} > 0$ such that when $c > \overline{c}$ and $L \in (\underline{L}(c), \overline{L}(c))$, the equilibrium probability of offense is less than $\varepsilon$.

By Lemma 3.2, this inequality means that the principal's decisions to commit the two offenses are strategic complements. Since the equilibrium probability of guilt is interior, this implies that the principal must be indifferent between committing no offense and committing two offenses. This induces positive correlation in agents' private information. Since an agent is less likely to face retaliation when the other agent has witnessed an offense, this positive correlation encourages an agent to accuse the principal after witnessing an offense and discourages him from doing so when he has not witnessed any offense. Unlike the case of negatively correlated private information, here the reporting cost $c$ improves the informativeness of agents' reports and decreases the equilibrium probability and the expected number of offenses.

Proposition 3 suggests that when the judge aggregates the probabilities of different offenses, the optimal punishment that minimizes the probability of offense $\Pr(\overline{\theta} = 1)$ is interior. This finding stands in sharp contrast to Becker's (1968) seminal analysis of criminal justice and law enforcement, according to which increasing the magnitude of punishment is efficient in reducing offenses. Our finding suggests a new rationale for avoiding harsh punishments. This finding applies to settings in which evidence primarily consists of witness testimonies that are hard to corroborate with other forms of evidence.

While reducing $L$ may improve reports' informativeness and deter offenses, it also increases the number of offenses conditional on at least one offense being committed. When the principal is guilty, he systematically commits multiple offenses. Viewed from this perspective, Proposition 3 shows a tradeoff between reducing the proportion of guilty individuals and reducing the severity of actions committed by these individuals, as measured by the number of offenses that they commit.

In addition, implementing the solution proposed by Proposition 3 is challenging in practice since it requires a careful calibration of $L$. This issue is particularly salient when the benefit from committing offenses is uncertain, since $L$ represents the magnitude of punishment relative to the benefit from committing an offense.

While the analysis so far points to several weaknesses of APP, a potential advantage of aggregating probabilities across offenses is to extract information about a defendant's "character," defined as the propensity to commit offenses. This propensity may run the gamut from virtuous individuals who incur a disutility from the commission of offenses to serial offenders who experience a high utility from committing offenses.

This section considers this possibility, focusing for tractability on the case of the two principal types introduced in Section 2: a virtuous one and an opportunistic one. Theorems 1 and 2 are extended to this setting.

We show that, under both decision rules, the opportunistic type may commit multiple offenses with positive probability. Nevertheless, agents' private observations are still negatively correlated under decision rule (2.4) and uncorrelated under decision rule (2.5). Our predictions concerning the effect of these rules on the informativeness of witness testimonies and deterrence remain unchanged. Recall that $\pi_o \in (0, 1)$ is the probability that the principal is opportunistic. Theorem 1' generalizes Theorem 1:

Theorem 1'.

When $n = 2$ and the judge uses decision rule (2.4), there exists $\bar L$ such that when $L > \bar L$,

1. Endogenous Negative Correlation Between Offenses: In every equilibrium, $\Pr(\theta_1 = 1 \mid \theta_2 = 1) < \Pr(\theta_1 = 1 \mid \theta_2 = 0)$ and $\Pr(\theta_2 = 1 \mid \theta_1 = 1) < \Pr(\theta_2 = 1 \mid \theta_1 = 0)$.

For every $\varepsilon > 0$, there exists $L_\varepsilon \in \mathbb{R}_+$ such that in every equilibrium when $L > L_\varepsilon$,

2. Low Informativeness of Reports:

$$\max_{a \in \{0,1\}^2} I(a) < \frac{\pi^*}{1-\pi^*} \Big/ \frac{\min\{\pi^*, \pi^o\}}{1-\min\{\pi^*, \pi^o\}} + \varepsilon. \tag{4.1}$$

3. Ineffective Deterrence:

$$\Pr(\bar\theta = 1) > \min\{\pi^*, \pi^o\} - \varepsilon. \tag{4.2}$$

Theorem 1' shows that the negative correlation in agents' private information, the lack of informativeness of agents' reports, and the high frequency of offenses still arise when there is some unobserved heterogeneity in the principal's propensity to commit offenses. Theorem 1 and Theorem 1' provide different asymptotic upper bounds on the informativeness of agents' reports and lower bounds on the probability of offenses:

• When the fraction $\pi^o$ of opportunistic types exceeds $\pi^*$, the equilibrium probability of offenses is close to the conviction cutoff $\pi^*$. The informativeness of agents' accusations, measured by $\max_{a \in \{0,1\}^2} I(a)$, is arbitrarily close to 1 as the punishment from conviction increases.

• When the probability $\pi^o$ that the principal is opportunistic is less than $\pi^*$, the prior probability of offense equals $\pi^o$. This is the highest possible ex ante probability of guilt given that the virtuous type never commits any offense (Lemma 2.1) and, in this sense, the worst possible outcome.

Given that in every equilibrium, there exists $a \in \{0,1\}^2$ such that the principal is convicted with positive probability when the judge observes $a$, the judge's posterior belief that at least one offense has taken place is no less than $\pi^*$ after observing $a$. This suggests a lower bound on the informativeness of agents' accusations:

$$\max_{a \in \{0,1\}^2} I(a) \ge I_{\min} \equiv \frac{\pi^*}{1-\pi^*} \Big/ \frac{\pi^o}{1-\pi^o}. \tag{4.3}$$

Theorem 1' shows that this lower bound $I_{\min}$ is attained in all equilibria.

Theorem 1' is proved similarly to Theorem 1. Lemma 3.1 and Lemma 3.2 generalize to heterogeneous principal types and imply that the opportunistic type is never indifferent between committing zero and two offenses. The opportunistic type is either indifferent between committing zero and one offense or indifferent between committing one and two offenses. This leads to two disjoint cases in Theorem 1', whose separation depends on the prior probability of the virtuous type:

1. When $\pi^o \ge \pi^*$, the opportunistic type is indifferent between committing zero and one offense, and equilibria have the same features as those predicted by Theorem 1.

2. When $\pi^o < \pi^*$ and $L$ is large enough, the opportunistic type always commits at least one offense. Otherwise, agents' reports would be arbitrarily uninformative as $L$ becomes arbitrarily large, by the same logic as in Theorem 1. The posterior probability would have to be strictly lower than $\pi^*$ even when both agents accuse the principal. This would imply that the principal is never convicted and lead to a contradiction. Hence, the opportunistic type must be indifferent between committing one and two offenses.

Nevertheless, $\theta_1$ and $\theta_2$ remain negatively correlated. To understand why, suppose by way of contradiction that $\theta_1$ and $\theta_2$ were independent or positively correlated. We would then have $Q_{0,j} \le Q_{1,j}$, and the expressions for the reporting cutoffs (3.11) and (3.12) would imply that $\omega^*_j - \omega^{**}_j \ge b$. From Lemma 3.1, $q(1,1)$ must converge to 0 as $L \to +\infty$ and agents' reporting cutoffs both go to $-\infty$. As in the single-agent benchmark, the informativeness of agents' reports would then go to $+\infty$. Since an opportunistic type commits at least one offense, the prior probability that the defendant commits at least one offense is equal to $\pi^o$. Therefore, the posterior probability that the defendant committed an offense when both agents accuse the principal would have to exceed $\pi^*$. This contradicts Lemma 3.1, which requires that the judge be indifferent between convicting and acquitting the defendant when both agents accuse him (since $q(1,1) \in (0,1)$) and, hence, that the judge's posterior be equal to the conviction threshold $\pi^*$.

The next result generalizes Theorem 2, which studies equilibrium outcomes under DPP. The proof is similar to the proof of Theorem 2 and omitted.

Theorem 2'.

When $n = 2$ and the decision rule is (2.5), there exists $\bar L \in \mathbb{R}_+$ such that when $L > \bar L$,

1. Uncorrelated Offenses: $\Pr(\theta_i = 1 \mid \theta_j = 1) = \Pr(\theta_i = 1 \mid \theta_j = 0)$ for every $i \neq j$.

2. Linear Conviction Probability: $\Pr(\theta_i = 1 \mid a_i = 1) = \pi^*$ for every $i \in \{1, 2\}$, and the conviction probability is linear in the number of accusations.

As the fraction of behavioral agents goes to zero and $L \to +\infty$, the following asymptotic results hold:

3. Effective Deterrence: The equilibrium probability of an offense taking place, $\Pr(\bar\theta = 1)$, converges to 0.

4. Highly Informative Reports: For every $i \in \{1, 2\}$, the informativeness of agent $i$'s report about $\theta_i$, measured by $\Pr(a_i = 1 \mid \theta_i = 1) / \Pr(a_i = 1 \mid \theta_i = 0)$, goes to $+\infty$.

This section extends Theorems 1 and 2 to an arbitrary number of agents, reverting to the case of a single principal type, and provides a comparative static result (Theorem 3) on the number of agents that generalizes Proposition 2.

We start by generalizing the measure of informativeness used in Theorem 1 to three or more agents. Bayes rule implies that

$$\frac{\Pr(a \mid \bar\theta = 1)}{\Pr(a \mid \bar\theta = 0)} \cdot \frac{\Pr(\bar\theta = 1)}{1 - \Pr(\bar\theta = 1)} = \frac{\Pr(\bar\theta = 1 \mid a)}{1 - \Pr(\bar\theta = 1 \mid a)} \quad \text{for every } a \in \{0,1\}^n. \tag{5.1}$$

Therefore, the ratio

$$I(a) \equiv \frac{\Pr(a \mid \bar\theta = 1)}{\Pr(a \mid \bar\theta = 0)} \tag{5.2}$$

measures the change in a judge's belief after observing report profile $a$, which we use to measure the informativeness of $a$. For tractability, we focus our analysis on symmetric equilibria, i.e., equilibria in which the principal treats all agents symmetrically and all agents' equilibrium strategies are the same. Proposition 4 generalizes the insights of Theorem 1 to the case of three or more agents.

Proposition 4.

Suppose n ≥ and the judge uses decision rule (2.4). For every ε > , there exists L ε > such that for every L > L ε and δ ∈ (0 , ,1. There exists a symmetric equilibrium that satisﬁes Reﬁnement 1.

2. In every symmetric equilibrium that satisﬁes Reﬁnement 1,(a)

$\Pr(\max_{j \neq i} \theta_j = 1 \mid \theta_i = 1) < \Pr(\max_{j \neq i} \theta_j = 1 \mid \theta_i = 0)$ for every $i \in \{1, 2, ..., n\}$;

(b) $\max_{a \in \{0,1\}^n} I(a) < 1 + \varepsilon$ and $\Pr(\bar\theta = 1) > \pi^* - \varepsilon$.

As before, the results are obtained by first taking the limit with respect to $\delta$ and then with respect to $L$. We show in Online Appendix H.4 that every symmetric Bayes Nash equilibrium that satisfies Refinement 1 is a proper equilibrium that satisfies Refinement 2.

Statement 2(a) means that an agent who witnessed an offense assigns a lower probability to other agents having witnessed offenses than if he did not witness any offense. We show that in every symmetric equilibrium, the principal's offenses across agents are negatively correlated. Statement 2(b) shows that agents' reports become arbitrarily uninformative and the probability that the principal commits at least one offense converges to $\pi^*$ as the punishment $L$ becomes large relative to the benefit from committing offenses. This is similar to the prediction in Theorem 1.

The following result provides comparative statics with respect to the number of agents, showing in particular that each agent is more likely to accuse the principal as the number of agents increases.

Theorem 3.

Suppose the judge uses decision rule (2.4). For every $k, n \in \mathbb{N}$ with $k > n$, there exists $\bar L > 0$ such that for every $L > \bar L$, comparing any symmetric equilibrium with $k$ agents to any symmetric equilibrium with $n$ agents:

1. Lower Informativeness: $\max_{a \in \{0,1\}^k} I(a) < \max_{a \in \{0,1\}^n} I(a)$.

2. Higher Probability of Offense: The equilibrium probability of offense $\Pr(\bar\theta = 1)$ is strictly higher with $k$ agents than with $n$ agents.

3. Higher Probability of Filing Accusations: Each agent's reporting cutoffs $(\omega^*, \omega^{**})$ are both strictly higher with $k$ agents than with $n$ agents.

The proof is in Appendix E. Theorem 3 shows that as the number of potential offenses increases, the probability of offense increases, the informativeness of reports decreases, and each agent is more likely to accuse the principal. This last feature distinguishes our result from those on public good provision (e.g., Chamberlin 1974), in which inefficiencies arise because agents free ride on one another's contributions. In our model, the lower informativeness of agents' reports as $n$ increases is due to an increased number of false accusations, not to agents abstaining at a higher rate from revealing observed offenses.

Next, we study the game's equilibrium outcomes when the judge uses DPP to adjudicate guilt. Proposition 5 establishes the existence of equilibria in which agents' private information is uncorrelated and the conviction probability is a linear function of the number of accusations against the principal. These equilibria feature arbitrarily informative reports and a vanishing probability of offense as the punishment $L$ becomes large.

Proposition 5.

When the judge uses decision rule (2.5), there exists $\bar L > 0$ such that for every $L > \bar L$, there exists an equilibrium in which:

1. $\theta_i$ and $\theta_j$ are uncorrelated for every $i \neq j$,

2. the probability that the principal is convicted is linear in the number of accusations.

In the limit as $L \to \infty$ and $\delta \to 1$,

3. The probability $\Pr(\bar\theta = 1)$ that the principal commits an offense converges to 0.

4. For every $i \in \{1, 2, \ldots, n\}$, the informativeness of agent $i$'s report, $\Pr(a_i = 1 \mid \theta_i = 1) / \Pr(a_i = 1 \mid \theta_i = 0)$, goes to $+\infty$.

While we did not specify any social welfare function, our analysis unveils several tradeoffs between improving the quality of judicial decisions and deterrence, between ex ante incentives and ex post fairness, and between the frequency and the severity of committed offenses. Our results may inform a social planner considering various adjudication rules (for example, choosing between APP and DPP, choosing $L$, $\pi^*$, and $c$ under various constraints) to maximize an objective function that incorporates the various facets of these tradeoffs.

First, Theorems 1 and 2 show that when (i) a defendant necessarily incurs a large disutility from conviction relative to the benefit of committing the offense (a disparity that may be due, e.g., to the loss of one's job or reputation), and (ii) the conviction threshold $\pi^*$ is beyond the control of a social planner, APP induces a significantly higher offense rate than DPP.

Second, we obtain explicit formulas to describe a tradeoff between deterrence and fairness. Under both APP and DPP, a fraction $1 - \pi^*$ of convicted defendants are innocent and a fraction $\pi^*$ of acquitted defendants are guilty. While offenses become less frequent when the standard of proof is lowered ($\pi^*$ decreases), this increases the fraction of innocent individuals among convicted defendants. The optimal cutoff $\pi^*$ could arise from maximizing a social welfare function that trades off the effectiveness of deterrence with the fraction of false positives among cases that resulted in a conviction.
Similarly, suppose that a judge commits to convict the principal whenever at least one report was filed. This would eliminate agents' need to coordinate their reports and, when the punishment $L$ is large, would give the principal a strict incentive not to commit any offense. However, in order to fulfill his commitment, the judge would have to convict the principal whenever some agent accuses the principal, even when the principal never commits any offense in equilibrium. This would lead to an undesirable outcome since all convicted individuals would be innocent and the probability of convicting innocent individuals would be significant.

The tradeoff between deterrence and wrongful convictions has been discussed in reduced form by Harris (1970) and Miceli (1991), but it does not arise in Becker (1968) and Landes (1970), who ignore wrongful convictions. In Kaplow (2011), punishments are expressed in terms of fines, i.e., zero-sum transfers that do not affect the social surplus. The paradox that all convicted individuals are innocent commonly arises in plea bargaining models, in which agents who reject pleas but are convicted at trial are known to be innocent (Grossman and Katz 1983, Reinganum 1988, and Siegel and Strulovici 2020).

Third, our example in Section 3.3 illustrates how DPP may lead to ex post unfair outcomes: the defendant who is less likely to be guilty is convicted while the defendant who is almost surely guilty is acquitted. When ex post fairness has a strong influence on adjudication decisions, the use of APP may be viewed as a constraint on the set of [...] rule to adjudicate guilt, it is justified as a way to capture an ex post social objective function and show, paradoxically, that it may be socially undesirable to use APP as a rule even when the social objective is the one that motivates APP in the first place.
Intuitively, APP may be self-defeating once incentives are taken into account.

Fifth, deterrence may, counter-intuitively, be stronger if the punishment to a convicted defendant takes on intermediate values rather than extreme ones, as it induces positive correlation in witnesses' private observations and reduces their exposure to retaliation or social stigma. Proposition 3 shows that using a more lenient sentence in case of conviction can be effective in deterrence, without increasing the probability of convicting the innocent. These intermediate punishment levels result in an increase in the number of offenses committed by guilty defendants, which points to a tradeoff between the frequency and the severity of committed offenses.

Lastly, the cost of reporting $c$ plays a subtle role in social welfare. Beyond the ethical concerns of protecting whistleblowers, Theorem 1 suggests that under APP, reducing $c$ can also improve the credibility of witness testimonies and reduce crime. By contrast, having a strictly positive reporting cost under DPP improves deterrence by lowering the fraction of agents submitting false positive reports.

This section presents several extensions of the baseline model, which convey the robustness of our results. The details of the analysis can be found in Supplementary Appendix K.

Decreasing Marginal Beneﬁts from Committing Offenses:

Our results continue to hold when the principal faces decreasing marginal returns from committing offenses or receives a punishment larger than $L$ when he is believed to have committed multiple offenses. These changes motivate the principal to commit fewer offenses and, as in the baseline model, induce negative correlation in the agents' private information. Agents' coordination motive undermines the informativeness of their reports and increases the probability of offenses. In this extension, formalized in Supplementary Appendix K.3, the principal receives punishment $L$ if the probability that he committed at least one offense exceeds $\pi^*$ and receives punishment $L'$ ($> L$) if the probability that he committed two offenses exceeds some other cutoff $\pi^{**} \in (0,1)$. When $L$ is large, the principal commits at most one offense in every symmetric equilibrium that satisfies Refinement 1.

Accusation Costs:

An agent who accuses the principal may be subjected to retaliation or social stigma even when the principal is convicted. As long as the cost of retaliation or social stigma is strictly higher when the principal is acquitted than when he is convicted, this does not affect our results. More generally, the results generalize when each agent's loss from retaliation is decreasing in the number of accusations leveled against the principal, as long as this loss is strictly positive when the principal is acquitted. In fact, these variations strengthen agents' coordination motive and leave unchanged the negative correlation affecting their private information.

Interdependent Preferences:

As anticipated in Section 2, an agent may directly care about other offenses committed by the principal, in addition to the offense that he may observe. This situation is modeled as follows: when agent $i$ is strategic, his payoff is normalized to 0 when the principal is convicted and is equal to

$$\omega_i - b\big((1-\gamma)\theta_i + \gamma f(\theta_i, \theta_{-i})\big) - c a_i \tag{7.1}$$

when the principal is acquitted, where $f(\theta_i, \theta_{-i})$ is increasing in all arguments. The term $\gamma f(\theta_i, \theta_{-i})$ captures how $i$'s payoff depends on all the offenses committed by the principal.

We show in Supplementary Appendix K.1 that agents' reports become arbitrarily uninformative and the probability that the principal commits some offense converges to $\pi^*$ as $L$ becomes arbitrarily large. In fact, these social preferences undermine the informativeness of agents' reports even further than in the baseline model. Intuitively, when agent $i$'s payoff depends directly on $\theta_j$, agent $i$'s report $a_i$ becomes more responsive to his belief about $\theta_j$. This, together with the negative correlation between $\theta_i$ and $\theta_j$, causes $i$'s report to become even less responsive to $\theta_i$ and less informative than in the baseline model.

Preference for Truth-telling:

The baseline model took a consequentialist approach to model agents' preferences for telling the truth: an agent who observed an offense had a higher payoff when the principal was convicted, but not from accusing the principal per se. Suppose now that an agent who observed an offense receives a benefit $d$ ($> 0$) from accusing the principal, regardless of the adjudication outcome. Letting $l^* = \pi^*/(1-\pi^*)$, we show in Supplementary Appendix K.1 that when there are two agents and $d < \frac{l^*}{l^*+2}\, c$, the informativeness

$$\frac{(1-\delta)\alpha + \delta\Phi(\omega^*)}{(1-\delta)\alpha + \delta\Phi(\omega^{**})}$$

of agents' reports is bounded above by

$$\frac{c\, l^*}{c\, l^* - (l^*+2)\, d}. \tag{7.2}$$

This upper bound collapses to 1 (i.e., reports become completely uninformative) when the preference for truth-telling $d$ converges to 0, or when the retaliation cost $c$ becomes arbitrarily large. In particular, our results generalize when agents experience a small benefit from telling the truth.

Ex Post Evidence & Punishing False Accusations:

Suppose that if the principal is wrongfully convicted, exonerating evidence arrives exogenously with probability $p^*$ and causes every accuser to be penalized by some constant $\ell \ge 0$. These punishments are equivalent to increasing the benefit $b$ from convicting the principal after witnessing an offense, and this extension is formally equivalent to the baseline model, as explained in Supplementary Appendix K.1.

Uncertain Number of Offense Opportunities:

In applications such as workplace bullying, physical assaults and discrimination, the number of potential offenses that the principal may commit is unobserved by the judge and by the victims. Suppose that nature randomly selects a subset $\tilde N$ of $\{1, 2, ..., n\}$ of offense opportunities for the principal. We assume that (i) only agents with index in $\tilde N$ can credibly accuse the principal and (ii) agents who do not accuse the principal do not file any report. Assumption (i) is justified for instance if agents outside of $\tilde N$ are inactive (equivalently, the set of agents in the model is stochastic, equal to $\tilde N$, and only observed by the principal) or if they are active but the principal could easily refute any accusation by such agents, e.g., by providing an alibi. The principal privately observes the set $\tilde N$. Agents in $\tilde N$ observe the same information as in the baseline model and cannot observe the realization of $\tilde N$.

We informally argue that the logic behind our results is even stronger in this case than in the baseline model. Since the judge does not observe the cardinality of $\tilde N$, his verdict depends only on the number of accusations (assuming for simplicity a symmetric treatment of accusers), not on the number of potential victims. Since the probability of convicting the principal is increasing in the number of accusations, the principal has a stronger incentive to commit offenses when there are fewer agents overall who can accuse him (i.e., when $|\tilde N|$ is smaller). An agent who observes an offense therefore infers that $|\tilde N|$ is more likely to be small, other things equal, and, hence, that the number of accusations by other agents is also likely to be small. This effect dampens an agent's incentive to accuse the principal when he has witnessed an offense and lowers the informativeness of agents' reports in equilibrium, by the same logic as in the baseline model.

Behavioral Agents and Retaliation Cost:

There is a close relationship between behavioral agents whose reporting strategy depends on their observation and strategic agents who are immune to retaliation. Suppose that a behavioral agent accuses the principal with probability $\alpha_1$ when he has observed an offense and with probability $\alpha_0$ when he has not, with $1 > \alpha_1 \ge \alpha_0 > 0$. This formulation is equivalent to one in which these agents are strategic but immune to retaliation, so that their realized payoff is given by $(\omega_i - b\theta_i)(1 - s)$. In equilibrium, such an agent uses reporting cutoffs $b$ and 0, depending on whether the agent has witnessed an offense or not, which corresponds to accusation probabilities $\alpha_1 = \Phi(b)$ and $\alpha_0 = \Phi(0)$, respectively. Therefore, this strategic agent behaves as though he were behavioral and playing an informative cutoff strategy. As noted in Section 2, all our results hold for such behavioral agents.

Concluding Remarks

Relation to the Legal Scholarship:

Treating each charge made against an individual in isolation from other charges may seem a priori arbitrary, unfair, and ineffective. This observation has been formalized and explored in the legal scholarship and takes on a particular importance for individuals who face multiple accusations, each of which is hard to establish at the sufficient level of certainty.

The premise, common in the legal literature, that offenses are exogenously and independently distributed is particularly problematic when analyzing the aggregation of offense probabilities. This aggregation creates an incentive for defendants to strategically restrict the number of offenses, which introduces negative correlation in the occurrence of offenses and violates the premise that offenses are independently distributed. Indeed, our analysis shows that aggregating offense probabilities has severe drawbacks once the incentives of potential offenders and accusers are taken into account.

Underlying this difficulty, the Aggregate Probabilities Principle (APP) describes a rule of punishment rather than a social objective function. While it may be socially desirable to punish a defendant deemed sufficiently likely to have committed some offense, even an unspecified one, the principle may be self-defeating and suboptimal from a social welfare perspective once incentives are taken into account.

Equilibrium Analysis vs. Nonequilibrium Adjustments:

Our results are derived from an equilibrium analysis, which presumes that players have correct expectations about the consequences of their actions and other players' strategies. When social rules change, as in the case of a sudden crackdown on a specific type of offense, the introduction of new regulation, a drastic shift in social norms, or the emergence of new social media that change the social consequences of one's actions, equilibrium analysis may be viewed as a potential harbinger of issues that will emerge as economic and social actors learn to interact under these new rules or norms. This distinction seems particularly relevant in the context of the recent Me Too movement, since abusers before the emergence of the movement likely underestimated the legal and professional consequences of their abusive behavior.

Shielding Accusers from Stigma through Secret Accusations:

To address the potential pressure that is sometimes experienced by lone accusers, institutions have been developed under which reports are submitted to a third party and are only released when enough of them have been filed. Such provisions increase the risk of wrongful accusations. Indeed, an agent holding a grudge against the principal has an opportunity to secretly accuse him in the hope that other agents, rightfully or not, may also accuse him.

(Foundations of Nash equilibrium based on players learning one another's strategies have a long history in economics. See, e.g., Fudenberg and Levine (1995).)

Simultaneous vs Sequential Reporting:

The forces that underlie our results are also present in dynamic versions of our model, in which reports may be filed sequentially. First, the negative correlation between the agents' private information ($\theta_i$) continues to arise endogenously whenever a strategic principal is concerned about having too many reports made against him. Second, an individual agent has an incentive to coordinate with other agents whenever he is unsure about whether his report is pivotal or not. In a dynamic setting, this incentive can materialize after a cold start (i.e., where very few people have reported before and no agent wants to be the first accuser). It can also occur when an agent has observed many reports and is unsure of the number of reports needed to convict the principal (for example, if he faces uncertainty about the conviction standard $\pi^*$ used by the judge). The inefficiencies and lack of credibility caused by the agents' coordination motive thus still arise in a dynamic environment. See Lee and Suen (2020) for a model of strategic accusation in which the timing of accusation plays a major role.

Proof of Lemma 2.1

Statement 1:

When $i$ is strategic and observes $(\omega_i, \theta_i)$, he chooses $a_i = 1$ if and only if

$$\omega_i \sum_{a_{-i} \in \{0,1\}^{n-1}} \sigma_{-i}(a_{-i}) \big( q(1, a_{-i}) - q(0, a_{-i}) \big) \le b\theta_i \sum_{a_{-i} \in \{0,1\}^{n-1}} \sigma_{-i}(a_{-i}) \big( q(1, a_{-i}) - q(0, a_{-i}) \big) - c \sum_{a_{-i} \in \{0,1\}^{n-1}} \sigma_{-i}(a_{-i}) \big( 1 - q(1, a_{-i}) \big). \tag{A.1}$$

Refinement 2 implies that $q(1, a_{-i}) - q(0, a_{-i}) \ge 0$ for every $a_{-i} \in \{0,1\}^{n-1}$, with a strict inequality for some $a_{-i}$. The existence of behavioral agents implies that every $a_{-i} \in \{0,1\}^{n-1}$ occurs with strictly positive probability. Therefore,

$$\sum_{a_{-i} \in \{0,1\}^{n-1}} \sigma_{-i}(a_{-i}) \big( q(1, a_{-i}) - q(0, a_{-i}) \big) > 0.$$

This, together with inequality (A.1), implies that agent $i$ accuses the principal if and only if $\omega_i$ lies below some cutoff that is strictly higher when $\theta_i = 1$ than when $\theta_i = 0$.

Statement 2:

Suppose toward a contradiction that $\Pr(\bar\theta = 1) = 0$ or, equivalently, $\theta_1 = ... = \theta_n = 0$ with probability 1. Since every $a$ occurs with strictly positive probability, this implies that $\Pr(\bar\theta = 1 \mid a) = 0$ for every $a \in \{0,1\}^n$. Since $\pi^* \in (0,1)$, the principal is convicted with probability 0 regardless of $a$. Therefore, an opportunistic principal has a strict incentive to commit an offense, which contradicts $\Pr(\bar\theta = 1) = 0$.

Statement 3:

Since $\omega^*_i > \omega^{**}_i$ for every $i \in \{1, 2, ..., n\}$ and since every report vector $a \in \{0,1\}^n$ occurs with positive probability, Refinement 2 implies that for every $\theta \succ \theta'$, the principal is convicted with strictly higher probability under $\theta$ than under $\theta'$. Since a virtuous principal does not benefit from committing offenses, he strictly prefers to commit no offense in any proper equilibrium that satisfies Refinements 1 and 2.

B Proof of Proposition 1

Let $q = q(1)$ denote the probability of conviction when the principal is accused. The principal's expected cost of committing an offense is

$$\delta L q \left( \Phi\left( b - c\,\frac{1-q}{q} \right) - \Phi\left( -c\,\frac{1-q}{q} \right) \right), \tag{B.1}$$

which is a continuous and strictly positive function of $q$ that converges to 0 as $q \to 0$. Since the equilibrium probability of offense is strictly positive (Statement 2 of Lemma 2.1), the value of (B.1) is less than or equal to 1.

Suppose toward a contradiction that for every $\bar L > 0$, there exists $L > \bar L$ under which in some proper equilibrium that satisfies Refinements 1 and 2, the value of (B.1) is strictly less than 1. An opportunistic principal then has a strict incentive to commit an offense, i.e., $\Pr(\theta = 1) = \pi^o$. Moreover, $q$ goes to 0 and $\Phi(\omega^*)/\Phi(\omega^{**})$ goes to $+\infty$ as $L \to \infty$. This implies the existence of $\bar L > 0$ such that for every $L > \bar L$, and for every $q$ such that (B.1) is strictly less than 1, $\Pr(\theta = 1 \mid a = 1)$ is strictly greater than $\pi^*$. This implies that the judge surely convicts the principal and contradicts the fact that $q$ goes to 0. Therefore, (B.1) must be equal to 1 when $L$ is large enough, $q$ lies in $(0,1)$, and $\Pr(\theta = 1 \mid a = 1) = \pi^*$.

When $L \to +\infty$, $1/(\delta L)$ converges to 0. Suppose toward a contradiction that $q$ converges to some limit $\bar q > 0$ along some sequence $\{L_n\}_{n=1}^{\infty}$ with $\lim_{n \to \infty} L_n = \infty$. Then, $\omega^*$ and $\omega^{**}$ respectively converge to $b - c(1-\bar q)/\bar q$ and $-c(1-\bar q)/\bar q$. The LHS of (3.5) converges to

$$\delta \bar q \left( \Phi\left( b - c\,\frac{1-\bar q}{\bar q} \right) - \Phi\left( -c\,\frac{1-\bar q}{\bar q} \right) \right), \tag{B.2}$$

which is strictly positive. This leads to a contradiction and shows that $q \to 0$. The expressions for $\omega^*$ and $\omega^{**}$ in (3.1) and (3.2) imply that both cutoffs go to $-\infty$. This shows that

$$\lim_{\omega^* \to -\infty} \lim_{\delta \to 1} \frac{\delta\Phi(\omega^*) + (1-\delta)\alpha}{\delta\Phi(\omega^* - b) + (1-\delta)\alpha} = +\infty, \tag{B.3}$$

where we use the observation that $\lim_{\omega \to -\infty} \Phi(\omega)/\Phi(\omega - b) = +\infty$ for every $b > 0$. From (3.3) and the fact that $\Pr(\theta = 1 \mid a = 1) = \pi^*$, we conclude that $\Pr(\theta = 1)$ converges to 0.
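The limit in (B.3), and the role played by the order of limits, can be checked numerically. The sketch below is a minimal illustration that assumes $\Phi$ is the standard normal distribution function (an assumption made only for this example) and uses hypothetical parameter values $b = 1$ and $\alpha = 0.5$. It evaluates the ratio in (B.3) at $\delta = 1$, where it grows without bound as the cutoff falls, and at a fixed $\delta < 1$, where the behavioral term $(1-\delta)\alpha$ dominates both numerator and denominator and the ratio tends to 1.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the left tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def report_informativeness(omega: float, b: float, delta: float, alpha: float) -> float:
    """Ratio inside (B.3): accusation probability when guilty over when innocent.

    Assumes a standard normal Phi; delta is the share of strategic agents and
    alpha the accusation probability of behavioral agents.
    """
    guilty = delta * normal_cdf(omega) + (1.0 - delta) * alpha
    innocent = delta * normal_cdf(omega - b) + (1.0 - delta) * alpha
    return guilty / innocent


if __name__ == "__main__":
    b, alpha = 1.0, 0.5  # hypothetical values chosen for illustration
    for omega in (-1.0, -5.0, -10.0, -20.0):
        print(
            omega,
            report_informativeness(omega, b, delta=1.0, alpha=alpha),
            report_informativeness(omega, b, delta=0.99, alpha=alpha),
        )
```

This is why the inner limit in (B.3) is taken with respect to $\delta$ before the cutoff is sent to $-\infty$.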
C Proof of Theorems 1 and 1’

The following lemma establishes that all equilibria are symmetric.

Lemma C.1. When $n = 2$ and the judge uses conviction rule (2.4), there exists $\bar L > 0$ such that when $L > \bar L$, the events $(\theta_1, \theta_2) = (1, 0)$ and $(\theta_1, \theta_2) = (0, 1)$ have the same probability and $(\omega^*_1, \omega^{**}_1) = (\omega^*_2, \omega^{**}_2)$.

The proof of Lemma C.1 is in Supplementary Appendix I and the proof of Lemma 3.1, which is used to prove Theorems 1 and 1', is in Online Appendix F. We now prove Theorems 1 and 1' taking Lemmas 3.1 and C.1 as given. We consider two cases separately, depending on the order of $\pi^o$ and $\pi^*$.

C.1 Case 1: $\pi^o \ge \pi^*$

Lemma 3.1 implies that $q(1,1) + q(0,0) - q(1,0) - q(0,1) > 0$ and $\Pr(\bar\theta = 1 \mid a) \le \pi^*$ for every $a \in \{0,1\}^2$. Therefore, $\Pr(\bar\theta = 1) < \pi^* \le \pi^o$, which implies that an opportunistic principal chooses $\theta = (0,0)$ with positive probability. From Lemma 3.2, the principal's actions are strategic substitutes. Therefore, the principal chooses $\theta = (1,1)$ with zero probability, which implies statement 1.

Let $\pi$ be the ex ante probability of an offense taking place. Lemma C.1 implies that $(\theta_1, \theta_2)$ equals $(1,0)$ and $(0,1)$ with equal probabilities. Conditional on $\theta_i = 0$, the probability that $\theta = (0,0)$ is

$$\beta \equiv \frac{1 - \pi}{1 - \pi/2}. \tag{C.1}$$

Let $Q_0 \equiv Q_{0,1} = Q_{0,2}$ and $Q_1 \equiv Q_{1,1} = Q_{1,2}$, where the equalities are guaranteed to hold by Lemma C.1. Since $\theta = (1,1)$ occurs with zero probability,

$$Q_1 = \delta\Phi(\omega^{**}) + (1-\delta)\alpha \tag{C.2}$$

and

$$Q_0 = \delta\big( \beta\Phi(\omega^{**}) + (1-\beta)\Phi(\omega^*) \big) + (1-\delta)\alpha. \tag{C.3}$$

Subtracting (3.12) from (3.11) yields

$$\omega^* - \omega^{**} = b - \frac{c}{q(1,1)} \cdot \frac{1 - Q_1/Q_0}{Q_1}. \tag{C.4}$$

Lemma C.2. $\omega^* - \omega^{**} \in (0, b)$.

Proof of Lemma C.2: According to (C.2) and (C.3), $\omega^* - \omega^{**} > 0$ is equivalent to $Q_0 > Q_1$. To see that this holds, suppose by way of contradiction that $Q_0 \le Q_1$. Equation (C.4) implies that $\omega^* \ge \omega^{**} + b > \omega^{**}$. The comparison between (3.11) and (3.12) then yields $Q_0 > Q_1$, the desired contradiction. Since $Q_0 > Q_1$, the term $\frac{1 - Q_1/Q_0}{Q_1}$ is strictly positive, which shows that $\omega^* - \omega^{**} < b$.

Let I ≡

Pr( a = a = 1 | θ = 1)Pr( a = a = 1 | θ = 0) . Since an opportunistic principal mixes between (0 , , (1 , , and (0 , , we have I ≡ (cid:0) δ Φ( ω ∗ ) + (1 − δ ) α (cid:1)(cid:0) δ Φ( ω ∗∗ ) + (1 − δ ) α (cid:1)(cid:0) δ Φ( ω ∗∗ ) + (1 − δ ) α (cid:1) = δ Φ( ω ∗ ) + (1 − δ ) αδ Φ( ω ∗∗ ) + (1 − δ ) α . (C.5)Since q (1 , ∈ (0 , , the judge assigns probability π ∗ to θ = 1 after observing ( a , a ) = (1 , . This impliesthat π − π = l ∗ I where l ∗ ≡ π ∗ − π ∗ . (C.6)35lugging (C.6) into (C.1), we obtain the following expressions for β and − β : β = 2 I l ∗ + 2 I and − β = l ∗ l ∗ + 2 I . (C.7)Plugging (C.7) into (C.2) and (C.3) then yields Q Q = β + (1 − β ) I = ( l ∗ + 2) I l ∗ + 2 I . (C.8)Plugging (3.11) and (3.12) into (C.8), we obtain | ω ∗ − c − b || ω ∗∗ − c | = ( l ∗ + 2) I l ∗ + 2 I . (C.9)This leads to the following lemma. Lemma C.3. If ω ∗ → −∞ , then I → and π → π ∗ .Proof of Lemma C.3: Since ω ∗ − ω ∗∗ ∈ (0 , b ) , the difference between | ω ∗ − c − b | and | ω ∗∗ − c | is at most b . TheLHS of (C.9) converges to as ω ∗ → −∞ . Since the RHS of (C.9) is strictly increasing in I and is equal to 1when I = 1 , I must converge to 1 as ω ∗ → −∞ . Equation (C.6) then shows that π converges to π ∗ .We now show that ω ∗ → −∞ as L → + ∞ . Recall an opportunistic principal is indifference betweencommitting zero and one offense if ( δL ) − = q (1 , (cid:16) δ Φ( ω ∗∗ ) + (1 − δ ) α (cid:17)(cid:16) Φ( ω ∗ ) − Φ( ω ∗∗ ) (cid:17) . (C.10)Suppose that there exists a sequence { L ( n ) , ω ∗ ( n ) , ω ∗∗ ( n ) , q ( n ) , π ( n ) (cid:9) ∞ n =1 such that1. L ( n ) ≥ L for every n ∈ N , and lim n →∞ L ( n ) = ∞ ;2. for each n ∈ N , ( ω ∗ ( n ) , ω ∗∗ ( n ) , q ( n ) , π ( n )) is an equilibrium when L = L ( n ) ;3. lim n →∞ ω ∗∗ ( n ) = ω ∗∗ for some ﬁnite ω ∗∗ ∈ R .Since δ Φ( ω ∗∗ ( n )) + (1 − δ ) α is bounded below away from , (C.10) implies that • either there exists a subsequence { k n } ∞ n =1 ⊂ N such that: lim n →∞ q ( k n ) = 0 . 
• or there exists a subsequence $\{k_n\}_{n=1}^{\infty} \subset \mathbb{N}$ such that $\lim_{n\to\infty}\big(\Phi(\omega^*(k_n)) - \Phi(\omega^{**}(k_n))\big) = 0$. The finite limit of $\omega^{**}$ imposed by Condition 3 above then implies that $\lim_{n\to\infty}\big(\omega^*(k_n) - \omega^{**}(k_n)\big) = 0$.

First, suppose that $\lim_{n\to\infty} q(k_n) = 0$ for some subsequence $\{k_n\}_{n=1}^{\infty}$. Then (3.11) and (3.12) imply that $\omega^*(k_n)$ and $\omega^{**}(k_n)$ both go to $-\infty$, which contradicts the condition that $\omega^{**}(n)$ converges to some finite number. Next, suppose that $\lim_{n\to\infty}(\omega^*(k_n) - \omega^{**}(k_n)) = 0$ for some subsequence $\{k_n\}_{n=1}^{\infty}$. Since $\omega^{**}(k_n)$ converges to a finite limit, both $Q_1(k_n)$ and $Q_0(k_n)$ are bounded below away from 0. Given the expressions for $Q_1$ and $Q_0$, this implies that $Q_0(k_n)/Q_1(k_n)$ converges to 1 as $n \to \infty$. From the previous step, we know that there does not exist any subsequence of $\{k_n\}_{n=1}^{\infty}$ such that $q(k_n)$ converges to 0. Equivalently, there exists $\eta > 0$ such that $q(k_n) \ge \eta$ for every $n \in \mathbb{N}$. Expression (C.4) then implies that $\omega^*(k_n) - \omega^{**}(k_n)$ converges to $b$, which contradicts the hypothesis that $\lim_{n\to\infty}(\omega^*(k_n) - \omega^{**}(k_n)) = 0$.

C.2 Case 2: $\pi^o < \pi^*$

First, we show that an opportunistic principal commits two offenses with positive probability in every equilibrium. Suppose toward a contradiction that he never commits two offenses. From Lemma 2.1, a virtuous principal never commits any offense. Therefore, the equilibrium probability that an offense occurs cannot exceed $\pi^o$, which is strictly less than $\pi^*$. As a result,

$$\mathcal{I} \equiv \frac{\Pr(a_1 = a_2 = 1 \mid \bar\theta = 1)}{\Pr(a_1 = a_2 = 1 \mid \bar\theta = 0)} \ge \frac{\pi^*}{1-\pi^*} \Big/ \frac{\pi^o}{1-\pi^o} > 1. \tag{C.11}$$

The expressions for $Q_1$ and $Q_0$ in (C.2) and (C.3) also apply to this setting. The derivations contained in Appendix C.1 imply that for every $\varepsilon > 0$, there exists $L_\varepsilon > 0$ such that when $L \ge L_\varepsilon$, the informativeness ratio is less than $1 + \varepsilon$. This contradicts (C.11), which requires that the informativeness ratio be strictly bounded away from 1.

Since $q(1,1) \in (0,1)$ and $q(1,1) + q(0,0) - q(1,0) - q(0,1) > 0$, Lemma 3.2 implies that an opportunistic principal cannot be indifferent between committing zero and two offenses. Lemma C.1 implies that he chooses $\theta = (0,1)$, $(1,0)$ and $(1,1)$ with positive probability, and chooses $\theta = (0,0)$ with zero probability. Therefore, $\Pr(\bar\theta = 1) = \pi^o$. The informativeness ratio is pinned down by Bayes rule:

$$\mathcal{I}\,\frac{\Pr(\bar\theta = 1)}{1 - \Pr(\bar\theta = 1)} = \frac{\Pr(\bar\theta = 1 \mid a_1 = a_2 = 1)}{1 - \Pr(\bar\theta = 1 \mid a_1 = a_2 = 1)}. \tag{C.12}$$

We now show that $\theta_1$ and $\theta_2$ are negatively correlated, as claimed in the first statement of both theorems. Suppose by way of contradiction that $\theta_1$ and $\theta_2$ are independent or positively correlated. Lemma C.1 then implies that $Q_1 \ge Q_0$. The reporting cutoff equations (3.11) and (3.12) then show that $\omega^* - \omega^{**} \ge b$. When $L$ is large enough, we have $q(1,1) \to 0$ and $\omega^* \to -\infty$, and the analysis of the single-agent case shows that $\mathcal{I} \to \infty$. This contradicts (C.12) and the conclusion $\Pr(\bar\theta = 1 \mid a_1 = a_2 = 1) = \pi^*$ of Lemma 3.1.

D Proofs of Theorem 2 and 2’

Our proof uses the following result, which is proved in Online Appendix F.

Lemma D.1. There exists $\bar{L} > 0$ such that $q(1,1) < 1$ for every $L > \bar{L}$ and corresponding equilibrium.

Uncorrelated Offenses: First, suppose by way of contradiction that $\Pr(\theta_1 = 1 \mid \theta_2 = 1) < \Pr(\theta_1 = 1 \mid \theta_2 = 0)$. Since $\theta_1$ and $\theta_2$ are both binary, $\Pr(\theta_2 = 1 \mid \theta_1 = 1) < \Pr(\theta_2 = 1 \mid \theta_1 = 0)$. Since $\omega_i^* > \omega_i^{**}$ for $i \in \{1,2\}$, $\Pr(\theta_1 = 1 \mid a = (1,1)) < \Pr(\theta_1 = 1 \mid a = (1,0))$ and $\Pr(\theta_2 = 1 \mid a = (1,1)) < \Pr(\theta_2 = 1 \mid a = (0,1))$. Therefore,

$$\max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (1,1)) < \max\Big\{ \max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (1,0)),\ \max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (0,1)) \Big\}. \tag{D.1}$$

Decision rule (2.5) implies that $\max\{q(0,1), q(1,0)\} \ge q(1,1)$ and Refinement 2 requires that $q(1,1) \ge \max\{q(0,1), q(1,0)\}$. These inequalities imply that $q(1,1) = \max\{q(0,1), q(1,0)\}$. Refinement 1 requires that $q(0,0) = 0$ and Lemma 2.1 implies that $\max\{q(1,1), q(1,0), q(0,1), q(0,0)\} > 0$. Therefore, $q(1,1) = \max\{q(0,1), q(1,0)\} > 0$. Inequality (D.1) and decision rule (2.5) rule out the possibility that $q(1,1) = \max\{q(0,1), q(1,0)\} \in (0,1)$. Therefore, $q(1,1) = \max\{q(0,1), q(1,0)\} = 1$, which contradicts Lemma D.1.

Next, suppose by way of contradiction that $\Pr(\theta_1 = 1 \mid \theta_2 = 1) > \Pr(\theta_1 = 1 \mid \theta_2 = 0)$. Since $\theta_1$ and $\theta_2$ are both binary, $\Pr(\theta_2 = 1 \mid \theta_1 = 1) > \Pr(\theta_2 = 1 \mid \theta_1 = 0)$. Since $\omega_i^* > \omega_i^{**}$ for $i \in \{1,2\}$, $\Pr(\theta_1 = 1 \mid a = (1,1)) > \Pr(\theta_1 = 1 \mid a = (1,0))$ and $\Pr(\theta_2 = 1 \mid a = (1,1)) > \Pr(\theta_2 = 1 \mid a = (0,1))$. Therefore,

$$\max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (1,1)) > \max\Big\{ \max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (1,0)),\ \max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (0,1)) \Big\}. \tag{D.2}$$

From Lemma D.1 and monotonicity, we have $q(1,1) \in (0,1)$. Decision rule (2.5) then requires that

$$\max_{i \in \{1,2\}} \Pr(\theta_i = 1 \mid a = (1,1)) = \pi^*. \tag{D.3}$$

As a result, the RHS of (D.2) is strictly less than $\pi^*$, which according to decision rule (2.5) leads to $q(0,1) = q(1,0) = 0$. Therefore, $q(1,1) + q(0,0) - q(1,0) - q(0,1) > 0$. Under such conviction probabilities, the two offenses are strategic substitutes, which contradicts the hypothesis that $\Pr(\theta_1 = 1 \mid \theta_2 = 1) > \Pr(\theta_1 = 1 \mid \theta_2 = 0)$.

Linear Conviction Probabilities:

First, we show that choosing θ = (0 , is weakly optimal for an opportunisticprincipal. Suppose toward a contradiction that it is not. Lemma 2.1 then implies that Pr( θ = 1) = π o . The38onclusion in the ﬁrst part suggests that Pr( θ = 1) = 1 − √ − π o , | ω ∗ i − ω ∗∗ i | = b for i ∈ { , } , and Pr( θ = 1 | a = (1 , θ = 1 | a = (1 , δ Φ( ω ∗ ) + (1 − δ ) αδ Φ( ω ∗∗ ) + (1 − δ ) α Pr( θ = 1) . For any π o > , the RHS is strictly greater than π ∗ when L is sufﬁciently large. Under decision rule (2.5), thisimplies that q (1 ,

1) = 1 , which contradicts Lemma D.1.Since

Pr( θ = 1 | θ = 1) = Pr( θ = 1 | θ = 0) and θ = (0 , is weakly optimal for an opportunisticprincipal, for every π o ∈ (0 , , an opportunistic principal must be indifferent between all offense proﬁles θ ∈{ (0 , , (0 , , (1 , , (1 , } : his actions are neither strict complements nor strict substitutes. Lemma 3.2 thenshows that q (1 ,

1) + q (0 ,

0) = q (1 ,

0) + q (0 , . Reﬁnement 1 requires that q (0 ,

0) = 0 . Suppose toward acontradiction that q (1 , > q (0 , . Then, an opportunistic principal strictly prefers θ = (0 , to θ = (1 , ,which contradicts the previous conclusion that he is indifferent between these proﬁles. Limiting Properties:

Parts 1 and 2 of our proof imply that | ω ∗ i − ω ∗∗ i | = b and that an opportunistic principalis indifferent between θ ∈ { (0 , , (0 , , (1 , , (1 , } . The principal’s indifference condition implies that ω ∗ i →−∞ as lim L → + ∞ lim δ → . As in Proposition 1, the informativeness ratio of each agent’s report converges toinﬁnity and Pr( θ i = 1) → . The latter implies that Pr( θ = 1) → . E Proofs of Theorem 3 and Proposition 2

Our proof uses the following lemma, which is shown in Online Appendix H.

Lemma E.1. Fix any $n \ge 2$ and suppose that the judge uses decision rule (2.4). There exists $\bar{L} > 0$ such that for every $L > \bar{L}$ and every symmetric Bayes Nash equilibrium that satisfies Refinement 1:

1. the principal is convicted with positive probability only if $a = (1, 1, \dots, 1)$;
2. the principal commits at most one offense.

We derive formulas for agents' reporting cutoffs $(\omega_n^*, \omega_n^{**})$, the informativeness of reports $\mathcal{I}_n$ and the equilibrium probability of $\bar\theta = 1$, denoted by $\pi_n$. Agent $i$'s reporting cutoff is

$$\omega_n^* = b + c - \frac{c}{q_n Q_{1,n}} \quad \text{when } \theta_i = 1 \tag{E.1}$$

and

$$\omega_n^{**} = c - \frac{c}{q_n Q_{0,n}} \quad \text{when } \theta_i = 0 \tag{E.2}$$

where

$$Q_{1,n} \equiv \Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-1}, \tag{E.3}$$

$$Q_{0,n} \equiv \frac{n\mathcal{I}_n}{(n-1)l^* + n\mathcal{I}_n}\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-1} + \frac{(n-1)l^*}{(n-1)l^* + n\mathcal{I}_n}\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-2}\Big(\delta\Phi(\omega_n^{*}) + (1-\delta)\alpha\Big). \tag{E.4}$$

In any symmetric equilibrium, the aggregate informativeness of reports, defined in (5.2), can be written as

$$\mathcal{I}_n = \frac{\delta\Phi(\omega_n^*) + (1-\delta)\alpha}{\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha}.$$

Since the judge is indifferent between convicting and acquitting the principal when there are $n$ accusations, we have

$$\mathcal{I}_n = \frac{\pi^*}{1-\pi^*} \Big/ \frac{\pi_n}{1-\pi_n}. \tag{E.5}$$

When $L$ is large enough, the principal is indifferent between committing an offense against a single agent and committing no offense, which leads to the indifference condition

$$\frac{1}{\delta L} = q_n \Big(\Phi(\omega_n^*) - \Phi(\omega_n^{**})\Big)\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-1}. \tag{E.6}$$

Reporting Cutoffs & Distance Between Cutoffs:

In this part, we show that $\omega_k^* > \omega_n^*$. Suppose toward a contradiction that $\omega_k^* \le \omega_n^*$. From (E.1), we have

$$q_k\Big(\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha\Big)^{k-1} \le q_n\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-1}. \tag{E.7}$$

Therefore, $q_k Q_{1,k} \le q_n Q_{1,n}$, which is equivalent to

$$q_k\Big(\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha\Big)^{k-1}\Big(\Phi(\omega_n^*) - \Phi(\omega_n^{**})\Big) \le q_n\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-1}\Big(\Phi(\omega_n^*) - \Phi(\omega_n^{**})\Big) = q_k\Big(\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha\Big)^{k-1}\Big(\Phi(\omega_k^*) - \Phi(\omega_k^{**})\Big),$$

where the equality follows from applying (E.6) to both $n$ and $k$. This implies that

$$\Phi(\omega_n^*) - \Phi(\omega_n^{**}) \le \Phi(\omega_k^*) - \Phi(\omega_k^{**}). \tag{E.8}$$

Since $\omega_k^* \le \omega_n^*$, (E.8) holds only if

$$\omega_n^* - \omega_n^{**} \le \omega_k^* - \omega_k^{**}, \tag{E.9}$$

which in turn implies that $\omega_k^{**} \le \omega_n^{**}$ and, therefore, that $q_k Q_{0,k} \le q_n Q_{0,n}$. Computing the two sides of (E.9) by subtracting (E.2) from (E.1) for $n$ and $k$, we obtain

$$\omega_n^* - \omega_n^{**} = b - \frac{c}{q_n}\cdot\frac{Q_{0,n} - Q_{1,n}}{Q_{0,n}Q_{1,n}} \quad \text{and} \quad \omega_k^* - \omega_k^{**} = b - \frac{c}{q_k}\cdot\frac{Q_{0,k} - Q_{1,k}}{Q_{0,k}Q_{1,k}}.$$

Since we have shown that $q_k Q_{1,k} \le q_n Q_{1,n}$ and $q_k Q_{0,k} \le q_n Q_{0,n}$, (E.9) can hold only if

$$q_n(Q_{0,n} - Q_{1,n}) \ge q_k(Q_{0,k} - Q_{1,k}). \tag{E.10}$$

We have

$$Q_{0,n} - Q_{1,n} = \frac{(n-1)l^*}{(n-1)l^* + n\mathcal{I}_n}\,\delta\Big(\Phi(\omega_n^*) - \Phi(\omega_n^{**})\Big)\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-2}$$

and

$$\delta\Big(\Phi(\omega_n^*) - \Phi(\omega_n^{**})\Big)\Big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\Big)^{n-2} = \frac{L^{-1}q_n^{-1}}{\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha}.$$

Combining this with (E.6) and (E.10) yields

$$\frac{(n-1)l^*}{(n-1)l^*\big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\big) + n\big(\delta\Phi(\omega_n^{*}) + (1-\delta)\alpha\big)} \ge \frac{(k-1)l^*}{(k-1)l^*\big(\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha\big) + k\big(\delta\Phi(\omega_k^{*}) + (1-\delta)\alpha\big)},$$

which may be re-expressed as

$$(n-1)(k-1)l^*\big(\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha\big) + (n-1)k\big(\delta\Phi(\omega_k^{*}) + (1-\delta)\alpha\big) \ge (n-1)(k-1)l^*\big(\delta\Phi(\omega_n^{**}) + (1-\delta)\alpha\big) + (k-1)n\big(\delta\Phi(\omega_n^{*}) + (1-\delta)\alpha\big).$$

This inequality cannot hold because $\delta\Phi(\omega_k^{**}) + (1-\delta)\alpha < \delta\Phi(\omega_n^{**}) + (1-\delta)\alpha$, $\delta\Phi(\omega_k^{*}) + (1-\delta)\alpha < \delta\Phi(\omega_n^{*}) + (1-\delta)\alpha$ and $(n-1)k < (k-1)n$, where the last inequality comes from the assumption that $k > n$. This leads to a contradiction and shows that $\omega_k^* > \omega_n^*$ whenever $k > n$.

The relation $k > n$ was only used in the last step. Using the fact that $\omega_k^* > \omega_n^*$ and repeating the earlier argument up until (E.9), we obtain

$$\omega_n^* - \omega_n^{**} > \omega_k^* - \omega_k^{**}. \tag{E.11}$$

This, together with $\omega_k^* > \omega_n^*$, implies that $\omega_k^{**} > \omega_n^{**}$.

Report Informativeness and Probability of Offense: We show that $\mathcal{I}_n > \mathcal{I}_k$. Since $q(1, 1, \dots, 1) = \pi^*$, this together with (5.1) proves that the ex ante probability of offense is ranked as claimed by Theorem 3.

Applying (E.1) and (E.2) to both $n$ and $k$, we get

$$\frac{\omega_n^* - b - c}{\omega_k^* - b - c} = \frac{q_k Q_{1,k}}{q_n Q_{1,n}} \quad \text{and} \quad \frac{\omega_n^{**} - c}{\omega_k^{**} - c} = \frac{q_k Q_{1,k}\big(\beta_k + (1-\beta_k)\mathcal{I}_k\big)}{q_n Q_{1,n}\big(\beta_n + (1-\beta_n)\mathcal{I}_n\big)}. \tag{E.12}$$

First, we show that

$$\frac{\omega_n^* - b - c}{\omega_k^* - b - c} > \frac{\omega_n^{**} - c}{\omega_k^{**} - c}. \tag{E.13}$$

Suppose toward a contradiction that the RHS of (E.13) is at least as large as the LHS of (E.13). Then,

$$\frac{\omega_n^{**} - c - (\omega_n^* - b - c)}{\omega_k^{**} - c - (\omega_k^* - b - c)} \ge \frac{\omega_n^* - b - c}{\omega_k^* - b - c}. \tag{E.14}$$

The RHS of (E.14) is strictly greater than 1 since $0 > \omega_k^* > \omega_n^*$ when $L$ is large enough. This implies that the LHS of (E.14) is greater than 1, which is equivalent to

$$b - (\omega_n^* - \omega_n^{**}) > b - (\omega_k^* - \omega_k^{**}).$$

This contradicts (E.11), which was established earlier, and proves (E.13). This, together with (E.12), implies that

$$\beta_k + (1-\beta_k)\mathcal{I}_k < \beta_n + (1-\beta_n)\mathcal{I}_n.$$

Plugging in the weights $\beta_n = \frac{n\mathcal{I}_n}{(n-1)l^* + n\mathcal{I}_n}$ and $\beta_k = \frac{k\mathcal{I}_k}{(k-1)l^* + k\mathcal{I}_k}$ from (E.4), which rest on (E.5), we get

$$\mathcal{I}_k\big(k + (k-1)l^*\big)\big(n\mathcal{I}_n + (n-1)l^*\big) < \mathcal{I}_n\big(n + (n-1)l^*\big)\big(k\mathcal{I}_k + (k-1)l^*\big).$$

Letting $\Delta \equiv \mathcal{I}_k - \mathcal{I}_n$, the previous inequality reduces to

$$(k-n)\mathcal{I}_n(\mathcal{I}_k - 1) = (k-n)\mathcal{I}_n(\mathcal{I}_n + \Delta - 1) < k\Delta - \Big(l^*(k-1)(n-1) + nk\Big)\Delta.$$

Suppose toward a contradiction that $\Delta \ge 0$. Then, the LHS is strictly positive since $\mathcal{I}_k > 1$ and $k > n$. The RHS is weakly negative since $l^*(k-1)(n-1) + nk > k$. This leads to the desired contradiction, and implies that $\Delta < 0$ or, equivalently, that $\mathcal{I}_n > \mathcal{I}_k$. Equation (5.1) then implies that $\Pr(\bar\theta = 1)$ increases when the number of agents increases from $n$ to $k$.

References

[1] Ali, S. Nageeb, Maximilian Mihm and Lucas Siga (2018) “Adverse Selection in Distributive Politics,” Working Paper.
[2] Ambrus, Attila, and Satoru Takahashi (2008) “Multi-sender Cheap Talk with Restricted State Spaces,” Theoretical Economics, 3, 1-27.
[3] Austen-Smith, David and Jeffrey Banks (1996) “Information Aggregation, Rationality, and the Condorcet Jury Theorem,”

American Political Science Review, 90(1), 34-45.
[4] Ba, Bocar (2018) “Going the Extra Mile: The Cost of Complaint Filing, Accountability, and Law Enforcement Outcomes in Chicago,” Working Paper.
[5] Ba, Bocar and Roman Rivera (2019) “The Effect of Police Oversight on Crime and Allegations of Misconduct: Evidence from Chicago,” Working Paper.
[6] Baliga, Sandeep, Ethan Bueno de Mesquita and Alexander Wolitzky (2020) “Deterrence with Imperfect Attribution,” American Political Science Review, forthcoming.
[7] Banerjee, Abhijit (1992) “A Simple Model of Herd Behavior,” Quarterly Journal of Economics, 107(3), 797-817.
[8] Bar-Hillel, Maya (1984) “Probabilistic Analysis in Legal Factfinding,” Acta Psychologica, 56, 267-284.
[9] Battaglini, Marco (2002) “Multiple Referrals and Multidimensional Cheap Talk,” Econometrica, 70(4), 1379-1401.
[10] Battaglini, Marco (2017) “Public Protests and Policy Making,” Quarterly Journal of Economics, 132(1), 485-549.
[11] Becker, Gary (1968) “Crime and Punishment: An Economic Approach,” Journal of Political Economy, 76(2), 169-217.
[12] Bhattacharya, Sourav (2013) “Preference Monotonicity and Information Aggregation in Elections,” Econometrica, 81(3), 1229-1247.
[13] Bikhchandani, Sushil, David Hirshleifer, and Ivo Welch (1992) “A Theory of Fads, Fashion, Custom, and Cultural Change as Information Cascades,” Journal of Political Economy, 100, 992-1026.
[14] Chamberlin, John (1974) “Provision of Collective Goods As a Function of Group Size,” The American Political Science Review, 68(2), 707-716.
[15] Chassang, Sylvain and Gerard Padró i Miquel (2019) “Corruption, Intimidation and Whistle-Blowing: A Theory of Inference from Unverifiable Reports,” Review of Economic Studies, forthcoming.
[16] Cheng, Ing-Haw and Alice Hsiaw (2020) “Reporting Sexual Misconduct in the MeToo Era,” Working Paper.
[17] Cohen, Jonathan (1977) “The Probable and The Provable,” Oxford University Press.
[18] Dobbie, Will, Jacob Goldin, and Crystal S. Yang (2018) “The Effects of Pretrial Detention on Conviction, Future Crime, and Employment: Evidence from Randomly Assigned Judges,” American Economic Review, 108(2), 201-240.
[19] Ekmekci, Mehmet and Stephan Lauermann (2019) “Informal Elections with Dispersed Information,” Working Paper.
[20] Fudenberg, Drew and David Levine (1995) “The Theory of Learning in Games,” MIT Press.
[21] Grossman, Gene and Michael Katz (1983) “Plea Bargaining and Social Welfare,” American Economic Review, 73(4), 749-757.
[22] Harel, Alon and Ariel Porat (2009) “Aggregating Probabilities Across Cases: Criminal Responsibility for Unspecified Offenses,” Minnesota Law Review, 482, 261-310.
[23] Harris, John (1970) “On the Economics of Law and Order,” Journal of Political Economy, 78(1), 165-174.
[24] Kaplow, Louis (2011) “On the Optimal Burden of Proof,” Journal of Political Economy, 119(6), 1104-1140.
[25] Landes, William (1971) “An Economic Analysis of the Courts,” Journal of Law and Economics, 14(1), 61-107.
[26] Lee, Frances Xu and Wing Suen (2020) “Credibility of Crime Allegations,” American Economic Journal-Microeconomics, 12, 220-259.
[27] Lynch, Gerard (1987) “RICO: The Crime of Being a Criminal,” Columbia Law Review, 87(4), 661-764.
[28] Miceli, Thomas (1991) “Optimal Criminal Procedure: Fairness and Deterrence,” International Review of Law and Economics ...
[29] ... American Economic Review, 98(3), 864-896.
[30] Morgan, Rachel and Grace Kena (2016) “Criminal Victimization, 2016,” Bulletin, Bureau of Justice Statistics.
[31] Myerson, Roger (1978) “Refinements of the Nash Equilibrium Concept,” International Journal of Game Theory, 7(2), 73-80.
[32] Naess, Ole-Andreas Elvik (2020) “Under-reporting of Crime,” Working Paper.
[33] Ottaviani, Marco and Peter Norman Sørensen (2000) “Herd Behavior and Investment: Comment,” American Economic Review, 90(3), 695-704.
[34] Persico, Nicola (2004) “Committee Design with Endogenous Information,” Review of Economic Studies, 70(1), 1-27.
[35] RAND Corporation (2018) “Sexual Assault and Sexual Harassment in the US Military,” Technical Report.
[36] Reinganum, Jennifer (1988) “Plea Bargaining and Prosecutorial Discretion,” American Economic Review, 78(4), 713-728.
[37] Scharfstein, David and Jeremy Stein (1990) “Herd Behavior and Investment,” American Economic Review, 80(3), 465-479.
[38] Robertson, Bernard and G. A. Vignaux (1993) “Probability–The Logic of the Law,” Oxford Journal of Legal Studies, 13(4), 457-478.
[39] Schauer, Frederick and Richard Zeckhauser (1996) “On the Degree of Confidence for Adverse Decisions,” Journal of Legal Studies, 25(1), 27-52.
[40] Schmitz, Patrick and Thomas Tröger (2011) “The Suboptimality of the Majority Rule,” Games and Economic Behavior, 651-665.
[41] Siegel, Ron and Bruno Strulovici (2020) “Judicial Mechanism Design,” Working Paper.
[42] Silva, Francesco (2019) “If We Confess Our Sins,” International Economic Review, 60(3), 1389-1412.
[43] Smith, Lones and Peter Norman Sørensen (2000) “Pathological Outcomes of Observational Learning,” Econometrica, 68(2), 371-398.
[44] Stigler, George (1970) “The Optimal Enforcement of Laws,” Journal of Political Economy, 78(3), 526-536.
[45] Strulovici, Bruno (2010) “Learning while Voting: Determinants of Collective Experimentation,” Econometrica, 78(3), 933-971.
[46] Strulovici, Bruno (2020) “Can Society Function without Ethical Agents? An Informational Perspective,” Working Paper, Northwestern University.
[47] U.S. Equal Employment Opportunity Commission (2017) “Fiscal Year 2017 Enforcement And Litigation Data,” Research Brief.
[48] USMSPB (2018) “Update on Sexual Harassment in the Federal Workplace,” Research Brief.

Online Appendix

Crime Entanglement, Deterrence, and Witness Credibility

Harry Pei, Bruno Strulovici

September 15, 2020

F. Properties of Conviction Probabilities

We show the following proposition, which implies Lemma 3.1 and Lemma D.1 of the main text.

Proposition F. There exists $\bar{L} > 0$ such that for any $L > \bar{L}$ and proper equilibrium that satisfies Refinements 1 and 2,

1. if the judge uses decision rule (2.4), then $q(0,0) = q(1,0) = q(0,1) = 0$;
2. if the judge uses decision rule (2.5), then there exists no $a \in \{0,1\}^2$ such that $q(a) = 1$.

We consider three cases separately depending on whether the following expression is strictly positive, strictly negative, or equal to 0:

$$q(1,1) + q(0,0) - q(1,0) - q(0,1). \tag{1}$$

Lemma 3.2 implies that the principal's actions are complements if (1) is negative and vice versa.

F.1 The value of (1) is strictly positive

First, we show that $q(1,1)$ must be strictly less than 1 regardless of the decision rule used to adjudicate guilt, which implies that $q(a) < 1$ for any $a$. We then show that under decision rule (2.4), $\max\{q(0,1), q(1,0)\} > 0$ implies that $q(1,1) = 1$.

Suppose toward a contradiction that $q(1,1) = 1$. Refinement 1 imposes that $q(0,0) = 0$. This leads to the following expressions for agent 1's reporting cutoffs when he has and has not witnessed an offense:

$$\omega_1^* \equiv b - c\,\frac{(1 - \Psi_2^{**})(1 - q(1,0))}{q(1,0) + \Psi_2^{**}\big(1 - q(1,0) - q(0,1)\big)}, \tag{2}$$

$$\omega_1^{**} \equiv -c\,\frac{(1 - X)(1 - q(1,0))}{q(1,0) + X\big(1 - q(1,0) - q(0,1)\big)}, \tag{3}$$

where

$$X \equiv \frac{1 - p_1 - p_2}{1 - p_1}\,\Psi_2^{**} + \frac{p_2}{1 - p_1}\,\Psi_2^{*} \tag{4}$$

and $p_i$ is the probability with which $\theta_i = 1$. We note that $\omega_1^*$ is increasing in $\Psi_2^{**}$ and $q(1,0)$ and decreasing in $q(0,1)$, and that $\omega_1^{**}$ is increasing in $X$ and $q(1,0)$ and decreasing in $q(0,1)$. The distance between the two cutoffs is

$$\omega_1^* - \omega_1^{**} = b - (\Psi_2^* - \Psi_2^{**})\,C \tag{5}$$

where

$$C \equiv c\,\big(1 - q(0,1)\big)\big(1 - q(1,0)\big) \times \frac{p_2}{1 - p_1} \times \frac{1}{q(1,0) + X\big(1 - q(1,0) - q(0,1)\big)} \times \frac{1}{q(1,0) + \Psi_2^{**}\big(1 - q(1,0) - q(0,1)\big)}. \tag{6}$$

By symmetry, one can obtain the expressions for $\omega_2^*$, $\omega_2^{**}$, and their difference. Conditional on $\theta_2 = 0$, choosing $\theta_1 = 1$ rather than $\theta_1 = 0$ increases the principal's conviction probability by

$$(\Psi_1^* - \Psi_1^{**})\Big(q(1,0) + \Psi_2^{**}\big(1 - q(1,0) - q(0,1)\big)\Big). \tag{7}$$

Similarly, choosing $\theta_2 = 1$ rather than $\theta_2 = 0$ given that $\theta_1 = 0$ increases the conviction probability by

$$(\Psi_2^* - \Psi_2^{**})\Big(q(0,1) + \Psi_1^{**}\big(1 - q(1,0) - q(0,1)\big)\Big). \tag{8}$$

In every equilibrium, both (7) and (8) are bounded below by $1/L$. In what follows, we establish a lower bound for the maximum of these two expressions that is independent of $L$ and that will deliver the desired contradiction when $L$ is large enough. Throughout the proof, we assume without loss of generality that $\omega_1^* \ge \omega_2^*$. The following lemma provides a comparison between $q(1,0)$ and $q(0,1)$.

Lemma F. In every equilibrium such that $\omega_1^* \ge \omega_2^*$, we have $q(1,0) \ge q(0,1)$.

Lemma F is proved in Section F.4 and taken for granted for now.
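As a numerical sanity check on the comparative statics stated for cutoff (2) and on the marginal conviction probability (7), the following Python sketch evaluates both expressions under this subsection's maintained case $q(1,1) = 1$ and $q(0,0) = 0$. The function names and all parameter values are illustrative assumptions and do not come from the paper; the sketch only confirms the direction of each effect at one hypothetical point where expression (1) is strictly positive.

```python
def cutoff_after_offense(b, c, psi2_no, q10, q01):
    """Agent 1's cutoff after witnessing an offense, as in (2), with q(1,1)=1 and q(0,0)=0.

    psi2_no is the probability that agent 2 accuses when theta_2 = 0.
    """
    # Increase in the conviction probability caused by agent 1's accusation.
    pivotal = q10 + psi2_no * (1.0 - q10 - q01)
    # Probability that the principal is acquitted even though agent 1 accuses.
    acquitted_after_accusing = (1.0 - psi2_no) * (1.0 - q10)
    return b - c * acquitted_after_accusing / pivotal


def marginal_conviction(psi1_yes, psi1_no, psi2_no, q10, q01):
    """Expression (7): extra conviction probability from theta_1 = 1 when theta_2 = 0."""
    return (psi1_yes - psi1_no) * (q10 + psi2_no * (1.0 - q10 - q01))


# Hypothetical values with q10 + q01 < 1, so that expression (1) is strictly positive.
b, c = 1.0, 2.0
base = cutoff_after_offense(b, c, psi2_no=0.3, q10=0.2, q01=0.1)
higher_psi = cutoff_after_offense(b, c, psi2_no=0.4, q10=0.2, q01=0.1)
higher_q10 = cutoff_after_offense(b, c, psi2_no=0.3, q10=0.3, q01=0.1)
higher_q01 = cutoff_after_offense(b, c, psi2_no=0.3, q10=0.2, q01=0.2)
```

At these values the cutoff rises with `psi2_no` and `q10` and falls with `q01`, in line with the monotonicity noted after (4).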

Lower Bound on ω ∗ : For every ǫ > ,1. If q (1 , ≥ ǫ , then ω ∗∗ ≥ − c − ǫǫ . 2. If q (1 , < ǫ , then q (0 , ∈ (0 , ǫ ) , by Lemma F. Therefore, ω ∗ = b − c (1 − Ψ ∗∗ )(1 − q (0 , q (0 ,

1) + Ψ ∗∗ (1 − q (1 , − q (0 , ≥ b − c (cid:16) − δ Φ( ω ∗ − b ) − (1 − δ ) α (cid:17) (1 − q (0 , q (0 ,

1) + (cid:16) δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17)(cid:16) − q (1 , − q (0 , (cid:17) ≥ b − c (cid:16) − δ Φ( ω ∗ − b ) − (1 − δ ) α (cid:17) (1 − q (0 , q (0 ,

1) + (cid:16) δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17)(cid:16) − q (1 , − q (0 , (cid:17) ≥ b − c − δ Φ( ω ∗ − b ) − (1 − δ ) α (1 − ǫ ) (cid:16) δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17) . (9)Supplementary Appendix J, which proves the existence of an equilibrium, also proves that there exists asolution to the equation ω ∗ = b − c − δ Φ( ω ∗ − b ) − (1 − δ ) α (1 − ǫ ) (cid:16) δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17) . The RHS of (9) is bounded below for each ǫ , uniformly over its argument Φ( ω ∗ . This lower, which wedenote ω ∗ ( ǫ ) , is decreasing in ǫ . This yields a lower bound for ω ∗ given by ω ∗ ≡ sup ǫ ∈ [0 , n min (cid:8) b − c − ǫǫ , ω ∗ ( ǫ ) (cid:9)o , (10)which is ﬁnite and independent of L . Upper Bound on C : We provide an upper bound for q (1 ,

0) + Ψ ∗∗ (1 − q (1 , − q (0 , . (11)For every ǫ > , there are two cases:1. If q (1 , ≥ ǫ , then (11) is no more than /ǫ .2. If q (1 , < ǫ , then Lemma F implies that q (0 , < ǫ . Let ω ∗∗ ( ǫ ) be the smallest root of the followingequation: ω ≡ − c − δ Φ( ω ) − (1 − δ ) α (cid:16) δ Φ( ω ) + (1 − δ ) α (cid:17) (1 − ǫ ) . (12)3ince q (1 , , q (0 , ∈ [0 , ǫ ] , ω ∗∗ ( ǫ ) is a lower bound for ω ∗∗ . An upper bound on (11) is given by q (1 ,

0) + Ψ ∗∗ (1 − q (1 , − q (0 , ≤ ω ∗∗ ( ǫ ))(1 − ǫ ) . (13)In summary: C ≤ cY (14)where Y ≡ inf ǫ ∈ [0 , n max (cid:8) /ǫ, ω ∗∗ )(1 − ǫ ) (cid:9)o . Lower Bound on the Maximum of (7) and (8):

Since φ ≥ is the derivative of Φ , we have for all ω ′ > ω ′′ Φ( ω ′ ) − Φ( ω ′′ ) ≥ ( ω ′ − ω ′′ ) min ω ∈ [ ω ′ ,ω ′′ ] φ ( ω ) . (15)We consider two cases. First, suppose that Φ( ω ∗ ) − Φ( ω ∗∗ ) ≥ Φ( ω ∗ ) − Φ( ω ∗∗ ) . Using the fact that Ψ ∗ i − Ψ ∗∗ i = δ (Φ( ω ∗ i ) − Φ( ω ∗∗ i )) , we have δ min ω ∈ [ ω ∗∗ ,ω ∗ ] φ ( ω ) (cid:16) Φ( ω ∗ ) − Φ( ω ∗∗ ) (cid:17) ≥ ω ∗ − ω ∗∗ = b − C (Ψ ∗ − Ψ ∗∗ ) ≥ b − C (Ψ ∗ − Ψ ∗∗ ) . (16)This, together with (14), gives an lower bound on Ψ ∗ − Ψ ∗∗ . Moreover, q (1 ,

0) + Ψ ∗∗ (1 − q (1 , ≥ q (1 ,

0) + Ψ ∗∗ (cid:0) − q (1 , − q (0 , (cid:1) ≥ c (cid:0) − q (1 , (cid:1) (1 − Ψ ∗∗ ) | ω ∗ | , (17)where the last inequality uses (2) and the fact that ω ∗ ≤ ω ∗ . This provides a lower bound for q (1 , and impliesa lower bound on (7).Second, consider the case in which Φ( ω ∗ ) − Φ( ω ∗∗ ) < Φ( ω ∗ ) − Φ( ω ∗∗ ) and let β ≡ ω ∗ − ω ∗∗ b . (18)Since X > Ψ ∗∗ , we have β ∈ (0 , . Recalling that ω ∗ ≤ ω , we have δ (Ψ ∗ − Ψ ∗∗ ) = Φ( ω ∗ ) − Φ( ω ∗∗ ) ≥ βbφ ( ω ∗ − b ) . (19)4oreover, (5) and (14) imply that Ψ ∗ − Ψ ∗∗ = (1 − β ) b/C ≥ (1 − β ) bY c (20)Since the pdf of ω i is increasing in ω for ω < , (20) yields a lower bound on ω ∗∗ . We denote this lower boundby e ω ( β ) . By construction, e ω ( β ) is decreasing in β .1. When β ≥ / , (19) implies a lower bound for Φ( ω ∗ ) − Φ( ω ∗∗ ) . Inequality (17) then yields a lowerbound for q (1 , and implies a lower bound on (7).

2. When β < / , we have ω ∗∗ ≥ e ω (1 / and Ψ ∗ − Ψ ∗∗ ≥ b C . The lower bound on ω ∗∗ also delivers a lower bound on q (0 ,

1) + Ψ ∗∗ (1 − q (1 , − q (0 , , since (3)implies that e ω (1 / ≤ ω ∗∗ ≤ ω ∗ = − c (1 − Ψ ∗∗ )(1 − q (0 , q (0 ,

1) + Ψ ∗∗ (1 − q (1 , − q (0 , , which leads to q (0 ,

1) + Ψ ∗∗ (1 − q (1 , − q (0 , ≥ (1 − Ψ ∗∗ )(1 − q (0 , − e ω (1 / /c . (21)Since − Ψ ∗∗ ≥ δ − δ Φ(0) and − q (0 , ≥ / , the lower bound on q (0 , ∗∗ (1 − q (1 , − q (0 , is strictly bounded below away from . This leads to a uniform lower bound on (8).Next, we show that under decision rule (2.4), equilibrium that satisﬁes (1) max { q (0 , , q (1 , } > mustalso satisfy q (1 ,

1) = 1 . Suppose toward a contradiction that both q (1 , and q (1 , are strictly between and . Then, agent ’s accusation does not affect the posterior belief about θ . This implies that a is uninformativeabout θ . This is only possible if ω ∗ = ω ∗∗ , which contradicts Lemma 2.1. F.2 The value of (1) is strictly negative

Next, we study the case in which $q(1,0) + q(0,1) > q(0,0) + q(1,1)$, i.e., $\theta_1$ and $\theta_2$ are strategic complements. Lemma 3.2 implies that, conditional on committing an offense against one agent, the principal has a strict incentive to commit an offense against the other agent. Therefore, in such equilibria, the principal commits either both offenses or no offense.

(Footnote to inequality (17): The validity of inequality (17) does not depend on the sign of $\Psi_1^* - \Psi_1^{**} - \Psi_2^* + \Psi_2^{**}$.)
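The link between the sign of (1) and complementarity can be checked numerically. The Python sketch below is an illustration only: it assumes that agents' reports are independent conditional on the offense profile, with agent $i$ accusing with probability $\Psi_i^*$ after an offense and $\Psi_i^{**}$ otherwise, and all names and numbers are hypothetical. Under that assumption, the cross difference of the conviction probability equals $(\Psi_1^* - \Psi_1^{**})(\Psi_2^* - \Psi_2^{**})$ times expression (1), so a negative value of (1) makes the second offense cheaper once the first has been committed.

```python
from itertools import product


def conviction_prob(theta, q, psi_yes, psi_no):
    """Conviction probability given the offense profile theta = (theta_1, theta_2).

    q maps report profiles (a_1, a_2) to conviction probabilities; psi_yes[i] and
    psi_no[i] are agent i's accusation probabilities with and without an offense.
    Reports are treated as independent conditional on theta (sketch assumption).
    """
    total = 0.0
    for a in product((0, 1), repeat=2):
        weight = 1.0
        for i in range(2):
            p_accuse = psi_yes[i] if theta[i] == 1 else psi_no[i]
            weight *= p_accuse if a[i] == 1 else 1.0 - p_accuse
        total += weight * q[a]
    return total


def cross_difference(q, psi_yes, psi_no):
    """P(1,1) - P(1,0) - P(0,1) + P(0,0), where P is the conviction probability."""
    def p(theta):
        return conviction_prob(theta, q, psi_yes, psi_no)
    return p((1, 1)) - p((1, 0)) - p((0, 1)) + p((0, 0))


# Hypothetical numbers with q(1,0) + q(0,1) > q(0,0) + q(1,1), i.e. (1) is negative.
q = {(0, 0): 0.0, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.9}
psi_yes, psi_no = (0.5, 0.6), (0.2, 0.3)
expression_1 = q[(1, 1)] + q[(0, 0)] - q[(1, 0)] - q[(0, 1)]
closed_form = (psi_yes[0] - psi_no[0]) * (psi_yes[1] - psi_no[1]) * expression_1
```

Comparing `cross_difference(q, psi_yes, psi_no)` with `closed_form` confirms the factorization at this point; swapping in conviction probabilities with (1) positive flips the sign, which corresponds to the substitutes case of Section F.1.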

We proceed in two steps, as in the previous case. First, we prove by contradiction that $q(1,1) < 1$ regardless of which decision rule is applied, which implies that $q(a) < 1$ for any $a$ and decision rule. We then show that under decision rule (2.4), $\max\{q(0,1), q(1,0)\} > 0$ only if q (1 ,

1) = 1 .When q (1 ,

0) + q (0 , > q (0 ,

0) + q (1 , , Lemma 3.2 implies that θ and θ are strategic complements.Therefore, agent i assigns a higher probability to agent j accusing the principal when θ i = 0 than when θ i = 1 .This implies that min { ω ∗ − ω ∗∗ , ω ∗ − ω ∗∗ } ≥ b. (22)By setting θ = θ = 1 , the principal raises the probability that he is convicted by at least (Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗ (1 − q (0 , − Ψ ∗ ) q (1 , (cid:17) + (Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗∗ (1 − q (1 , − Ψ ∗∗ ) q (0 , (cid:17) (23)compared to the case in which he sets θ = θ = 0 . Therefore, the value of (23) cannot exceed /L . Theremainder of this proof establishes a strictly positive lower bound on (23) that applies uniformly across all L .This will imply that when L is large enough, equilibria that exhibit strategic complementarities between θ and θ do not exist.First, max { q (0 , , q (1 , } ≥ / since q (0 ,

1) + q (1 , ≥ . Without loss of generality, we assume that q (1 , ≥ / . Second, it is a dominant strategy for agent i to abstain from accusing the principal when ω i > ,which implies that − Ψ ∗ i ≥ δ (1 − Φ(0)) . Third, player ’s reporting threshold when θ = 1 is ω ∗ = b − c (1 − Q H )(1 − q (1 , Q H (1 − q (0 , − Q H ) q (1 , , (24)where Q H is the probability that agent accuses the principal conditional on θ = 1 . The RHS of (24) isstrictly increasing in Q H . Therefore, ω ∗ ≥ b − c − q (1 , q (1 , ≥ b − c. From (22), we have δ (Ψ ∗ − Ψ ∗∗ ) = Φ( ω ∗ ) − Φ( ω ∗∗ ) ≥ b min ω ∈ [ − b − c, φ ( ω ) . (25)This yields the desired lower bound for (23): (Ψ ∗ − Ψ ∗∗ ) | {z } bounded by (25) (cid:16) Ψ ∗ (1 − q (0 , | {z } ≥ + (1 − Ψ ∗ ) | {z } ≥ δ (1 − Φ(0)) q (1 , | {z } ≥ / (cid:17) + (Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗∗ (1 − q (1 , − Ψ ∗∗ ) q (0 , | {z } ≥ (cid:17) δ b − Φ(0)) min ω ∈ [ − b − c, φ ( ω ) (26)and implies that q (1 , < when L is large enough.Next, we show that under decision rule (2.4), q (1 ,

1) + q (0 , < q (1 ,

0) + q (0 , implies that q (1 ,

1) = 1 .Suppose toward a contradiction that q (1 , ∈ (0 , and q (1 ,

0) + q (0 , > q (0 ,

0) + q (1 , . Then, at leastone of inequalities q (0 , > and q (1 , > must hold. If q (0 , > , the judge’s posterior belief about θ is independent of agent ’s report. This gives an opportunistic principal a strict incentive to choose θ = 1 andleads to a contradiction. The case q (1 , > leads to a similar contradiction. F.3 The value of (1) is Part I:

We show that in every equilibrium where the value of (1) is , each agent witnesses an offense withstrictly positive probability and q (1 ,

1) = 1 . This implies that:1. q (1 ,

0) + q (0 ,

1) = 1 .2. The marginal cost of committing an offense is the same across agents, i.e., (Ψ ∗ − Ψ ∗∗ ) q (1 ,

0) = (Ψ ∗ − Ψ ∗∗ ) q (0 , . (27)First, suppose toward a contradiction that the principal chooses a = 1 with probability . Then, agent ’sreport does not affect the judge’s posterior belief about the value of θ θ . Therefore, q (1 ,

0) = q (0 ,

0) = 0 .Since the value of (1) is , we have q (0 ,

1) = q (1 , ∈ (0 , . This contradicts the conclusion of Lemma 2.1since q is not responsive to agent ’s report.Next, suppose toward a contradiction that q (1 , ∈ (0 , and that each agent witnesses an offense withpositive probability. Then, at least one of the following conditions must hold: q (1 , ∈ (0 , or q (0 , ∈ (0 , . The previous paragraph has ruled out equilibria in which either q (1 , or q (0 , is equal to . Supposethat q (1 , , q (0 , , q (1 , ∈ (0 , . Then, the report proﬁles a = (1 , , (1 , , (0 , must lead to thesame posterior probability that the principal has committed at least one offense. For i ∈ { , } , let p i bethe probability that θ i = 1 and θ − i = 0 conditional on the principal having committed at least one offense.Since the posterior probability of guilt is the same for reporting proﬁles (1 , and (1 , , we have (1 − p − p ) Ψ ∗ Ψ ∗ Ψ ∗∗ Ψ ∗∗ + p Ψ ∗ Ψ ∗∗ + p Ψ ∗ Ψ ∗∗ = (1 − p − p ) Ψ ∗ (1 − Ψ ∗ )Ψ ∗∗ (1 − Ψ ∗∗ ) + p Ψ ∗ Ψ ∗∗ + p − Ψ ∗ − Ψ ∗∗ . (28)7ince Ψ ∗ Ψ ∗ Ψ ∗∗ Ψ ∗∗ > Ψ ∗ (1 − Ψ ∗ )Ψ ∗∗ (1 − Ψ ∗∗ ) and Ψ ∗ Ψ ∗∗ > − Ψ ∗ − Ψ ∗∗ , the LHS of (28) is strictly greater than the RHS of (28) unless p = 1 . By assumption, the principal commitsan offense against each agent with strictly positive probability, so either − p − p > or p > , whichviolates (28). Part II:

We show that q (1 , < when L is large enough.Suppose toward a contradiction that q (1 ,

1) = 1 . We derive a lower bound on (27) that holds for all L .Without loss of generality, we assume that q (1 , ≥ q (0 , , which implies that q (1 , ≥ / . Agent ’sreporting cutoffs satisfy ω ∗ = b − c q (0 , q (1 , (cid:16) − p x Ψ ∗ − (1 − p x )Ψ ∗∗ (cid:17) and ω ∗∗ = − c q (0 , q (1 , (cid:16) − p y Ψ ∗ − (1 − p y )Ψ ∗∗ (cid:17) where p x , p y ∈ [0 , represent agent ’s beliefs about θ conditional on each realization of θ . This impliesthat ω ∗ − ω ∗∗ = b − c q (0 , q (1 ,

0) ( p x − p y )(Ψ ∗ − Ψ ∗∗ ) . (29)The absolute value of c q (0 , q (1 ,

0) ( p x − p y ) is at most c . To bound the LHS of (27) from below, we proceed according to the following two steps. Step 1: Lower bound on ω ∗ The formula for ω ∗ and the assumption that q (1 , ≥ q (0 , imply that ω ∗ ≥ b − c (cid:16) − p x Ψ ∗ − (1 − p x )Ψ ∗∗ (cid:17) ≥ b − cδ (1 − Φ(0)) . (30)We note the lower bound on the RHS by ω ∗ . Step 2: Lower bound on (27)

Since q (1 , ≥ q (0 , and q (1 ,

0) + q (0 , ≥ q (1 ,

1) = 1 , we have q (1 , ≥ / . Therefore, (27) will be bounded below if we establish a strictly positive lower bound on min { Ψ ∗ − Ψ ∗∗ , q (0 , ∗ − Ψ ∗∗ ) } . 8f p x − p y ≤ , we have ω ∗ − ω ∗∗ ≥ b . The lower bound on ω ∗ then implies a strictly positive lower boundon Ψ ∗ − Ψ ∗∗ , as desired. If p x − p y > , we follow same derivation as in the last step of Section F.1. Moreprecisely, we consider two cases.First, suppose that Ψ ∗ − Ψ ∗∗ ≥ Ψ ∗ − Ψ ∗∗ . Then, we have Ψ ∗ − Ψ ∗∗ φ ( ω ∗ − b ) ≥ ω ∗ − ω ∗∗ = b − c (Ψ ∗ − Ψ ∗∗ ) ≥ b − c (Ψ ∗ − Ψ ∗∗ ) . (31)This yields a strictly positive lower bound on Ψ ∗ − Ψ ∗∗ .Second, suppose that Ψ ∗ − Ψ ∗∗ < Ψ ∗ − Ψ ∗∗ . Then, the variable β ≡ ( ω ∗ − ω ∗∗ ) /b lies between and due to the assumption that p x − p y > . Equality (29) implies that ω ∗ − ω ∗∗ = b − c q (0 , q (1 ,

0) ( p x − p y )(Ψ ∗ − Ψ ∗∗ ) ≥ b − c (Ψ ∗ − Ψ ∗∗ ) , which yields Ψ ∗ ≥ Ψ ∗ − Ψ ∗∗ ≥ (1 − β ) b/c. (32)This provides a lower bound on ω ∗ that is decreasing in β and that we denote e ω ( β ) . We also have δ (Ψ ∗ − Ψ ∗∗ ) = Φ( ω ∗ ) − Φ( ω ∗∗ ) ≥ βbφ ( ω ∗ − b ) . (33)We consider two subcases, depending on the value of β relative to / .1. If β ≥ / , then (33) implies that Ψ ∗ − Ψ ∗∗ ≥ bδφ ( ω ∗ − b ) / . (34)2. If β < / , then (32) implies that Ψ ∗ − Ψ ∗∗ ≥ b/ c. (35)We have ω ∗ = b − c (1 − Q ) q (1 , q (0 , ≥ ω ( β ) (36)where Q is a number between and (1 − δ ) α + δ Φ(0) . This yields the following lower bound on q (0 , : q (0 , ≥ b − c (1 − Q ) q (1 , ω ( β ) ≥ b − c ω ( β ) . (37)This expression is bounded below away from for all β < / . This, together with (35), lead to the9ollowing lower bound on the RHS of (27): q (0 , ∗ − Ψ ∗∗ ) ≥ ( b − c ) b cω ( β ) . (38) F.4 Proof of Lemma F

Suppose toward a contradiction that there exists an equilibrium in which the value of (1) is strictly positive, ω ∗ > ω ∗ , and q (1 , < q (0 , . Then, (2) implies that Φ( ω ∗∗ ) > Φ( ω ∗∗ ) or, equivalently, that ω ∗∗ > ω ∗∗ .This, together with ω ∗ > ω ∗∗ and ω ∗ > ω ∗∗ , implies that ω ∗∗ < ω ∗∗ < ω ∗ < ω ∗ . (39)We start by showing that p , p > . Suppose that p = 0 and p > . Then, (4) implies that X = Ψ ∗∗ and,hence, that ω ∗ − ω ∗∗ = b > ω ∗ − ω ∗∗ , which contradicts (39). Now suppose that p > and p = 0 . Then, p Ψ ∗ Ψ ∗∗ + p − Ψ ∗ − Ψ ∗∗ > p Ψ ∗ Ψ ∗∗ + p − Ψ ∗ − Ψ ∗∗ . (40)This means that the judge attaches a higher probability to θ = 1 when only agent accuses the principal thanwhen only agent does. This implies that q (1 , ≥ q (0 , , which leads to a contradiction.Having established that p , p both lie in (0 , , we conclude that (7) and (8) are equal to each other.Applying (2) to both agents, we have (cid:12)(cid:12)(cid:12) ω ∗ − bω ∗ − b (cid:12)(cid:12)(cid:12) = 1 − Ψ ∗∗ − Ψ ∗∗ · − q (1 , − q (0 , · q (0 ,

1) + Ψ ∗∗ (1 − q (1 , − q (0 , q (1 ,

0) + Ψ ∗∗ (1 − q (1 , − q (0 , − Ψ ∗∗ − Ψ ∗∗ · − q (1 , − q (0 , · Ψ ∗ − Ψ ∗∗ Ψ ∗ − Ψ ∗∗ . (41)Since − Ψ ∗∗ − Ψ ∗∗ < Ψ ∗ − Ψ ∗∗ Ψ ∗ − Ψ ∗∗ ≤ Ψ ∗ − Ψ ∗∗ Ψ ∗ − Ψ ∗∗ , we get ≥ (cid:12)(cid:12)(cid:12) ω ∗ − bω ∗ − b (cid:12)(cid:12)(cid:12) > − q (1 , − q (0 , . (42)The RHS of (42) is greater than since q (1 , < q (0 , . This yields the desired contradiction.10 . Proof of Proposition 3 In Section G.1, we construct an open interval of L over which there exist equilibria in which the principal’sactions are strategic complements. In Section G.2, we show that for those values of L , the conviction probabilityis strictly concave in the number of accusations in all proper equilibria that satisfy Reﬁnements 1 and 2. G.1 Existence of an Equilibrium

We construct an interval of $L$ such that under APP (i.e., decision rule (2.4)), there exists a proper symmetric Bayes Nash Equilibrium that satisfies Refinements 1 and 2 in which $q(1,1) = 1$, $q(1,0) = q(0,1) = q$, and $q(0,0) = 0$ with $q > 1/2$, which implies that

$$q(1,1) + q(0,0) - q(1,0) - q(0,1) \tag{43}$$

is strictly negative. From Lemma 3.2, an opportunistic principal's decisions to commit offenses against the two agents are strategic complements. In equilibrium, the principal chooses either $\theta_1 = \theta_2 = 1$ or $\theta_1 = \theta_2 = 0$ but not $\theta = (1,0)$ or $(0,1)$. Since each agent observes an offense with interior probability, this BNE is a proper equilibrium.

First, we derive formulas for the reporting cutoffs. When $\theta_i = 1$, agent $i$ accuses the principal if

$$\omega_i \le \omega^* \equiv b - c\,\frac{(1-q)(1-\Psi^*)}{q + \Psi^*(1-2q)}. \tag{44}$$

When $\theta_i = 0$, agent $i$ accuses the principal if

$$\omega_i \le \omega^{**} \equiv -c\,\frac{(1-q)(1-\Psi^{**})}{q + \Psi^{**}(1-2q)}. \tag{45}$$

The principal's indifference condition is given by

$$\frac{2}{L} = (\Psi^* - \Psi^{**})\Big((1-2q)(\Psi^* + \Psi^{**}) + 2q\Big), \tag{46}$$

where $\Psi^* \equiv \delta\Phi(\omega^*) + (1-\delta)\alpha$ and $\Psi^{**} \equiv \delta\Phi(\omega^{**}) + (1-\delta)\alpha$. Moreover, the equilibrium probability that the principal commits an offense, denoted by $\pi_m$, is implicitly determined by

$$\frac{\Psi^*(1-\Psi^*)}{\Psi^{**}(1-\Psi^{**})} = \frac{\pi^*}{1-\pi^*} \Big/ \frac{\pi_m}{1-\pi_m}, \tag{47}$$

where $\mathcal{I} \equiv \frac{\Psi^*(1-\Psi^*)}{\Psi^{**}(1-\Psi^{**})}$ measures the aggregate informativeness of accusations. In the equilibria of interest, $\mathcal{I}$ is the right measure of informativeness because one accusation is sufficient to convict the principal, which implies that the judge is indifferent between $s = 0$ and $s = 1$ when exactly one agent accuses the principal.

Comparing (44) to (45), we conclude that $\omega^* - \omega^{**} > b$. We re-express (44) and (45) as

$$\frac{\omega^* - b}{c} = \frac{\Psi^* - 1}{\Psi^* + (1-\Psi^*)\frac{q}{1-q}} \tag{48}$$

and

$$\frac{\omega^{**}}{c} = \frac{\Psi^{**} - 1}{\Psi^{**} + (1-\Psi^{**})\frac{q}{1-q}}. \tag{49}$$

For every $q \in [1/2, 1)$, the function $\frac{\Psi - 1}{\Psi + (1-\Psi)\frac{q}{1-q}}$ is a convex function of $\Psi$. Moreover, the pdf of $\omega$ is strictly increasing for $\omega \le 0$. As a result, the function

$$\frac{\Psi(\omega) - 1}{\Psi(\omega) + (1-\Psi(\omega))\frac{q}{1-q}} \tag{50}$$

is strictly increasing and convex for $\omega \le 0$. Moreover, it takes values in $[-1, 0]$.

When $\omega^* = b$, the LHS of (48) is strictly greater than its RHS. This, together with the convexity of (50), implies that there is a unique value $\omega^*$ of $\omega$ such that the LHS of (48) is equal to (50). This pins down the value of $\omega^*$ in equilibrium. Similarly, (49) admits a unique solution, which pins down the value of $\omega^{**}$ in equilibrium.

The RHS of (48) and (49) are both increasing in $q$, which implies that $\omega^*$ and $\omega^{**}$ are also increasing in $q$. Moreover, since equation (B.6) admits a unique solution $\omega^*$, the LHS of (B.6) does not depend on $q$, and the RHS is continuously increasing in $q$, the solution $\omega^*$ is continuous in $q$. Similarly, $\omega^{**}$ is also continuously increasing in $q$.

Let $\omega^*(c,q)$ and $\omega^{**}(c,q)$ denote the values of the cutoffs in equilibrium, and $L(c,q)$ denote the value of $L$, as defined by (46), when $\omega^* = \omega^*(c,q)$ and $\omega^{**} = \omega^{**}(c,q)$. For every $c > 0$ and

$$L \in \Big[\min_{q \in [1/2,1]} L(c,q),\ \max_{q \in [1/2,1]} L(c,q)\Big], \tag{51}$$

there exists $q \in (1/2, 1)$ such that when the retaliation cost is $c$ and the punishment level is $L$, there exists an equilibrium such that $q(1,1) = 1$, $q(1,0) = q(0,1) = q$ and $q(0,0) = 0$.

G.2 Conviction Probabilities
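The objects this section builds on, namely the cutoffs $\omega^*(c,q)$ and $\omega^{**}(c,q)$ and the indifference level $L(c,q)$ from Section G.1, can be computed numerically. The sketch below is only an illustration under stated assumptions: it takes $\Phi$ to be the standard normal cdf, reads the cutoff equations as $\omega = \text{shift} - c(1-q)(1-\Psi)/(q + \Psi(1-2q))$ with shift $b$ for (44) and $0$ for (45), reads (46) as $2/L = (\Psi^*-\Psi^{**})((1-2q)(\Psi^*+\Psi^{**})+2q)$, and uses hypothetical parameter values. The function names are ours, not the paper's.

```python
import math


def normal_cdf(x):
    """Standard normal cdf; an assumed stand-in for the paper's Phi."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def accusation_prob(omega, delta, alpha):
    """Psi(omega) = delta * Phi(omega) + (1 - delta) * alpha."""
    return delta * normal_cdf(omega) + (1.0 - delta) * alpha


def solve_cutoff(shift, c, q, delta, alpha, iters=200):
    """Bisection root of omega = shift - c(1-q)(1-Psi)/(q + Psi(1-2q)).

    shift = b corresponds to omega* in (44); shift = 0 to omega** in (45).
    """
    def residual(omega):
        psi = accusation_prob(omega, delta, alpha)
        penalty = c * (1.0 - q) * (1.0 - psi) / (q + psi * (1.0 - 2.0 * q))
        return omega - shift + penalty

    # For q in [1/2, 1] the penalty lies in [0, c], so the residual is
    # negative at shift - c - 1 and non-negative at shift.
    lo, hi = shift - c - 1.0, shift
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def punishment_level(b, c, q, delta, alpha):
    """Return (omega*, omega**, L(c, q)) using the reading of (46) above."""
    w_hi = solve_cutoff(b, c, q, delta, alpha)
    w_lo = solve_cutoff(0.0, c, q, delta, alpha)
    psi_hi = accusation_prob(w_hi, delta, alpha)
    psi_lo = accusation_prob(w_lo, delta, alpha)
    bracket = (1.0 - 2.0 * q) * (psi_hi + psi_lo) + 2.0 * q
    return w_hi, w_lo, 2.0 / ((psi_hi - psi_lo) * bracket)


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    for q in (0.5, 0.75, 1.0):
        w_hi, w_lo, level = punishment_level(b=1.0, c=1.0, q=q, delta=0.9, alpha=0.1)
        print(f"q={q:.2f}  omega*={w_hi:.4f}  omega**={w_lo:.4f}  L={level:.4f}")
```

Scanning $q$ over a grid in $[1/2, 1]$ with `punishment_level` gives a rough picture of the range of $L(c,q)$ in (51) for a given $c$; bisection is used because it needs only a sign change, not derivative information.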

For every $(c,q)$, the thresholds $\omega^*$ and $\omega^{**}$ are computed as explained in the previous section. Let $L(c,q)$ denote the punishment level that makes the principal indifferent, as defined by (46), and let $L(c) \equiv \max_{q \in [1/2,1]} L(c,q)$.

Proposition G.

For every c > , there exists ε > such that when L ∈ [ L ( c ) , L ( c ) + ε ] , in every properequilibrium that satisﬁes Reﬁnements 1 and 2, q (0 ,

0) + q (1 , − q (1 , − q (0 , < . Equivalently, the principal’s decisions are strategic complements in every equilibrium, and committing anoffense against only one agent is strictly suboptimal. Since L ( c, ≥ L ( c ) and L ( c, / ≥ L ( c ) . To proveProposition G, it is sufﬁcient to show that in every equilibrium such that q (0 , q (1 , − q (1 , − q (0 , ≥ , L is strictly above L ( c, or strictly above L ( c, / . Let ( ω ∗ , ω ∗∗ ) denote the unique solution to ω ∗ − bc = Ψ ∗ − and ω ∗∗ c = Ψ ∗∗ − , where Ψ ∗ ≡ δ Φ( ω ∗ ) + (1 − δ ) and Ψ ∗∗ ≡ δ Φ( ω ∗∗ ) + (1 − δ ) . By construction, ω ∗ and ω ∗∗ are the reporting cutoffs when q (1 ,

0) = q (0 ,

1) = 1 / . Therefore, L ( c, /

2) = 2Ψ ∗ − Ψ ∗∗ . (52)The rest of the proof consists of two parts. In Section G.3, we rule out equilibria in which q (0 ,

1) = q (1 ,

0) = 0 .In Section G.4, we rule out equilibria in which max { q (0 , , q (1 , } > but q (0 ,

1) + q (1 , ≤ . G.3 Equilibria in which q ( , ) = q ( , ) = Supplementary Appendix I shows that all proper equilibria that satisfy Reﬁnements 1 and 2 with q (0 ,

0) = q (1 ,

0) = q (0 ,

1) = 0 must be symmetric. Let q ≡ q (1 , ∈ (0 , be the conviction probability when bothagents accuse the principal. Let ω ∗ m and ω ∗∗ m denote agent’s reporting cutoffs, which satisfy (3.8) and (3.9) inthe main text, respectively. Let Ψ ∗ ≡ δ Ψ( ω ∗ m ) + (1 − δ ) α and let Ψ ∗∗ ≡ δ Ψ( ω ∗∗ m ) + (1 − δ ) α . The principal’sindifference condition implies that L = q Ψ ∗∗ (Ψ ∗ − Ψ ∗∗ ) . We now show that L > L ( c, / . It sufﬁces toshow that Ψ ∗ − Ψ ∗∗ > Ψ ∗∗ (Ψ ∗ − Ψ ∗∗ ) . (53)The expressions for the reporting cutoffs ω ∗ and ω ∗∗ imply that ω ∗ = b + c − cq Ψ ∗∗ ≤ b + c − c Ψ ∗ ≤ b + c (Ψ ∗ − and ω ∗ = b + c (Ψ ∗ − . Since c (Ψ( ω ) − is strictly convex in ω when ω < , and the value of c (Ψ( ω ) − is strictly negative when ω = 0 , we conclude that ω ∗ < ω ∗ . Since ω ∗ − ω ∗∗ > b > ω ∗ − ω ∗∗ , we conclude that Ψ ∗ − Ψ ∗∗ > Ψ ∗ − Ψ ∗∗ . (54)This in turn implies (53). 13 .4 q (1 , or q (0 , is strictly positive, and (1) is positive Suppose toward a contradiction that there exists an equilibrium such that, ﬁrst, q (1 ,

0) + q (0 , < q (1 ,

1) + q (0 , , and, second, at least one of the probabilities q (0 , and q (1 , is strictly positive. According to Lemma3.2, the principal’s incentives to commit crimes are strategic substitutes. According to Lemma 2.1, the principalcommits offense with interior probability, and therefore, he is indifferent between committing no offense andcommitting one offense, but he commits two offenses with zero probability.Let q ≡ q (1 , and q ≡ q (0 , . For i ∈ { , } , let p i be the probability that θ i = 1 , and ω ∗ i and ω ∗∗ i denote agent i ’s reporting cutoffs, which satisfy ω ∗ i = b − c (1 − Ψ ∗∗ j )(1 − q i ) q i + Ψ ∗∗ j (1 − q − q ) (55)and ω ∗∗ i = − c (1 − X j )(1 − q i ) q i + X j (1 − q − q ) (56)where j denotes the agent other than i and X i ≡ − p − p − p i Ψ ∗∗ i + p j − p i Ψ ∗ i . For i ∈ { , } , let I i ≡ p i p i + p j Ψ ∗ i Ψ ∗∗ i + p j p i + p j − Ψ ∗ j − Ψ ∗∗ j . The prior probability of guilt of the principal is p + p . Since the principal is convicted withpositive probability after one accusation, we have max {I , I } ≥ l ∗ − p − p p + p ≥ min {I , I } . Step 1:

We rule out equilibria in which the principal’s expected cost of committing the offense observedby agent , relative to committing no offense, is different from the expected cost of committing the offenseobserved by agent . Suppose toward a contradiction that the cost of committing the offense observed by agent is strictly higher than the cost of committing the offense observed by agent . This implies that p = 0 and p > and that I = Ψ ∗ Ψ ∗∗ > > − Ψ ∗ − Ψ ∗∗ = I . Therefore, q = 0 , q > , and the marginal cost of committing the offense associated with agent conditionalon θ = 0 is L (Ψ ∗ − Ψ ∗∗ )(1 − q )Ψ ∗∗ . The marginal cost of committing the offense associated with agent conditional on θ = 0 is L (Ψ ∗ − Ψ ∗∗ ) (cid:0) (1 − q )Ψ ∗∗ + q (cid:1) , which equals in equilibrium. Since the marginalcost of committing offense against agent is strictly higher, Ψ ∗ Ψ ∗∗ − Ψ ∗∗ Ψ ∗ Ψ ∗ − Ψ ∗∗ ≥ q − q . (57)14his implies that Ψ ∗ / Ψ ∗∗ > Ψ ∗ / Ψ ∗∗ Since the strategy proﬁle is a proper equilibrium, we have ω ∗ − ω ∗∗ = b > ω ∗ − ω ∗∗ . This can occur only if ω ∗∗ < ω ∗∗ . Since the density of ω is strictly increasing when ω < , Ψ ∗ − Ψ ∗∗ < Ψ ∗ − Ψ ∗∗ < Ψ(0) − Ψ( − b ) . (58)When L is close to L ( c ) , the equilibrium conditions imply that ∗ − Ψ ∗∗ ) (cid:16) (1 − q )Ψ ∗∗ + q (cid:17) ≤ L ( c, , or equivalently, q ≥ − Ψ ∗∗ − Ψ ∗∗ + (Ψ(0) − Ψ( − b ))(2 − Ψ(0) − Ψ( − b ))2(1 − Ψ ∗∗ )(Ψ ∗ − Ψ ∗∗ ) . Since

Ψ(0) , Ψ( − b ) < / , this implies that q ≥ − − Ψ ∗∗ + Ψ(0) − Ψ( − b )2(1 − Ψ ∗∗ )(Ψ ∗ − Ψ ∗∗ ) . (59)Plugging (59) into (57) yields Ψ ∗ Ψ ∗∗ − Ψ ∗∗ Ψ ∗ Ψ ∗ − Ψ ∗∗ ≥ (Ψ(0) − Ψ( − b )) − ∗∗ (Ψ ∗ − Ψ ∗∗ )2(Ψ ∗ − Ψ ∗∗ ) − (Ψ(0) − Ψ( − b )) . (60)Letting ∆ ≡ Ψ(0) − Ψ( − b ) and ∆ i ≡ Ψ ∗ i − Ψ ∗∗ i , the previous inequality can be re-expressed as ∆ Ψ ∗∗ ≥ ∆(Ψ ∗ (1 − Ψ ∗∗ ) − (1 − Ψ ∗ )Ψ ∗∗ ) = ∆∆ (1 − Ψ ∗∗ ) + ∆∆ Ψ ∗∗ . (61)However, (58) and the inequality > Ψ ∗∗ + Ψ ∗∗ are incompatible with (61), which yields the desiredcontradiction. Step 2:

From Step 1, the principal incurs the same marginal cost from committing the offense associated withagent and the offense associated with agent . This leads to the following indifference condition: L = 1(Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗∗ (1 − q − q ) + q (cid:17) = 1(Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗∗ (1 − q − q ) + q (cid:17) . (62)Without loss of generality, we assume q ≤ q . Since q + q ≤ , we have L = 1(Ψ ∗ − Ψ ∗∗ ) (cid:16) Ψ ∗∗ (1 − q − q ) + q (cid:17) ≥ ∗ − Ψ ∗∗ .

15n what follows, we show that ∗ − Ψ ∗∗ > L ( c,

1) = 2(Ψ(0) − Ψ( − b ))(2 − Ψ(0) − Ψ( − b )) . Since Ψ( b ) < / and Ψ(0) < / , the above inequality is implied by Ψ( b ) − Ψ(0) > Ψ ∗ − Ψ ∗∗ , (63)which holds because ω ∗ − ω ∗∗ < b , ω ∗ < and the density of ω is strictly increasing when ω < . G.5 q (1 , or q (0 , is strictly positive, and (1) is equal to zero We start by ruling out equilibria in which one of the agents never witnesses any offense. Suppose toward acontradiction that the principal chooses θ = 0 with probability . Then, as in Section G.4, we have q (1 ,

0) = 0 .Since also q (0 ,

0) = 0 , (1) implies q (0 ,

1) = q (1 , , which contradicts Lemma 2.1.Next, we rule out equilibria in which q (1 , < . Suppose that q (1 , ∈ (0 , . Then, either q (0 , and q (1 , both lie in (0 , , which is ruled out in Section F.1 of this document, or one of the probabilities q (1 , and q (0 , is equal to 0, which is ruled out by Lemma 2.1 and Reﬁnement 1.Therefore, we focus without loss of generality on equilibria with the following features: (1) both offensesentail the same marginal cost, and (2) q (1 ,

1) = 1 , q (1 , and q (0 , both lie in (0 , , and q (0 , q (1 ,

0) = 1 .Since the marginal costs associated with both offenses are the same, we have L = 1 q (Ψ ∗ − Ψ ∗∗ ) = 1 q (Ψ ∗ − Ψ ∗∗ ) . (64) Step 1:

First, we consider the symmetric case in which q = q = 1 / . We need to show that ∗ − Ψ ∗∗ > L ( c,

1) = 2(Ψ(0) − Ψ( − b ))(2 − Ψ(0) − Ψ( − b )) . This inequality is implied by

Ψ(0) − Ψ( − b ) > Ψ ∗ − Ψ ∗∗ . (65)To show (65), let ω be deﬁned by Ψ( ω ) ≡ Ψ( ω ∗ ) − Ψ(0) + Ψ( − b ) . Inequality (65) is equivalent to ω < ω ∗∗ ,or equivalently, Ψ( ω ) − > ω + bc . (66)16lugging in the expression c = ω ∗ Ψ( ω ∗ ) − into (66), we have (cid:16) Ψ( ω ∗ ) − Ψ(0) + Ψ( − b ) − (cid:17) ω ∗ Ψ( ω ∗ ) − > ω + b .This is equivalent to ω ∗ + (Ψ(0) − Ψ( − b )) ω ∗ − Ψ( ω ∗ ) | {z } > > ω + b. (67)Since the density of ω is strictly increasing when ω < and ω ∗ < , we ahve Ψ( ω ∗ ) − Ψ( ω ∗ − b ) < Ψ(0) − Ψ( − b ) . This implies that ω ∗ > ω + b , which veriﬁes inequality (67). Step 2:

We consider asymmetric equilibria in which q = q . Agent i ’s reporting cutoffs are bounded by ω ∗ i c ≤ q j q i (Ψ( ω ∗ j ) − and ω ∗∗ i + bc ≥ q j q i (Ψ( ω ∗∗ j ) − , where Ψ( ω ) ≡ (1 − δ ) α + δ Φ( ω ) . The function Ψ( · ) isconvex for ω small enough. For ﬁxed q and q such that q + q = 1 , this implies that ( ω ∗ , ω ∗ ) is boundedabove by the largest solution to ω ∗ c = q q (Ψ( ω ∗ ) − and ω ∗ c = q q (Ψ( ω ∗ ) − . The largest solution iswell-deﬁned since Ψ( · ) is a strictly increasing function, and therefore, for every pair of solutions ( ω ∗ , ω ∗ ) and ( ω ′ , ω ′ ) , if ω ∗ > ω ′ , then ω ∗ > ω ′ . Similarly, ( ω ∗∗ , ω ∗∗ ) is bounded below by the smallest solution to ω ∗∗ + bc = q q (Ψ( ω ∗∗ ) − and ω ∗∗ + bc = q q (Ψ( ω ∗∗ ) − . The smallest solution is also well-deﬁned since Ψ( · ) is a strictly increasing function, and therefore, for every pair of solutions ( ω ∗∗ , ω ∗∗ ) and ( ω ′ , ω ′ ) , if ω ∗∗ > ω ′ ,then ω ∗∗ > ω ′ .Let { ω ∗ i ( q ) } i =1 be the largest solution to the ﬁrst system of equations and let { ω ∗∗ i ( q ) } i =1 be the smallestsolution to the second system of equations. The minimum L in this class of equilibria is bounded below by max i ∈{ , } q i (cid:16) Ψ( ω ∗ i ( q )) − Ψ( ω ∗∗ i ( q )) (cid:17) . (68)We start by showing that when q < / , we have ω ∗ ( q ) > ω ∗ ( q ) and ω ∗∗ ( q ) > ω ∗∗ ( q ) . Supposetoward a contradiction that ω ∗ ( q ) ≤ ω ∗ ( q ) . Let α ≡ ( q /q ) , which is strictly greater than 1. We have α (Ψ ∗ ( q ) − − (Ψ ∗ ( q ) − > , which is equivalent to (1 − α )(1 − Ψ ∗ ( q )) | {z } < + Ψ ∗ ( q ) − Ψ ∗ ( q ) | {z } < by hypothesis > . This leads to a contradiction.Next, we show that ω ∗ ( q ) < ω ∗ (1 / < ω ∗ ( q ) and ω ∗∗ ( q ) < ω ∗∗ (1 / < ω ∗∗ ( q ) . These inequalitieshold because for every q ∈ (0 , / , ω ∗ ( q ) ω ∗ ( q ) c = (cid:16) Ψ ∗ ( q ) − (cid:17)(cid:16) Ψ ∗ ( q ) − (cid:17) . (69)17ince Ψ( ω ) is strictly convex for ω < , we have ω ≥ Ψ( ω ) − if and only if ω ≥ ω ∗ (1 / . 
This together with(69) implies that ω ∗ ( q ) < ω ∗ (1 / < ω ∗ ( q ) . Similarly, one can show that ω ∗∗ ( q ) < ω ∗∗ (1 / < ω ∗∗ ( q ) .Last, suppose toward a contradiction that

12 (Ψ ∗ − Ψ ∗∗ ) < min { q (Ψ ∗ − Ψ ∗∗ ) , q (Ψ ∗ − Ψ ∗∗ ) } . (70)This implies the following inequality, which violates Ψ ∗ ’s convexity and yields the desired contradiction: (cid:16) ω ∗ (1 / − ( ω ∗∗ (1 /

2) + b ) (cid:17) ≥ q q (cid:16) ω ∗ ( q ) − ( ω ∗∗ ( q ) + b ) (cid:17)(cid:16) ω ∗ ( q ) − ( ω ∗∗ ( q ) + b ) (cid:17) . H. Results with Three or More Agents

We prove Lemma E.1 in Section H.1. Sections H.2 and H.3 prove Propositions 4 and 5 in the main text. Finally, Section H.4 shows that in symmetric equilibria, Refinement 1 implies Refinement 2.

H.1 Proof of Lemma E.1

First, we show that when $L$ is large enough, $q(1,1,\dots,1) < 1$ in every symmetric BNE that satisfies Refinement 1. Suppose toward a contradiction that for every $L' \in \mathbb{R}_+$, there exists $L \ge L'$ and a symmetric equilibrium for punishment $L$ that satisfies Refinement 1 such that $q(1,1,\dots,1) = 1$. We establish a lower bound on the marginal increase in conviction probabilities that applies uniformly across all $L$. For every $a \succ a'$, we have $\Pr(\theta = 1 \mid a) > \Pr(\theta = 1 \mid a')$. As a result, there exist $m \in \{1,2,\dots,n\}$ and $q \in [0,1)$ such that the principal is convicted for sure when there are $m$ accusations or more, and is convicted with probability $q$ when there are $m-1$ accusations. Refinement 1 requires that $q = 0$ when $m = 1$.

From Lemma 2.1, an agent's equilibrium strategy is summarized by two cutoffs, $\omega^*$ and $\omega^{**}$, such that for every $i \in \{1,2,\dots,n\}$, agent $i$ accuses the principal when $\theta_i = 1$ and $\omega_i \le \omega^*$, or when $\theta_i = 0$ and $\omega_i \le \omega^{**}$. Let $\Psi^* \equiv (1-\delta)\alpha + \delta\Phi(\omega^*)$ and $\Psi^{**} \equiv (1-\delta)\alpha + \delta\Phi(\omega^{**})$. We have $\Psi^* > \Psi^{**}$.

For every $m \le n-1$, let $Q(m, \theta_{-i})$ be the probability with which agents other than $i$ submit $m$ accusations given $\theta_{-i}$. Fixing $\theta_{-i}$ and changing $\theta_i$ from $0$ to $1$, the marginal increase in conviction probability is given by

$$(\Psi^* - \Psi^{**})\,P(m,q,\theta_{-i}), \tag{71}$$

where

$$P(m,q,\theta_{-i}) \equiv qQ(m-2,\theta_{-i}) + \sum_{j=m-1}^{n-1} Q(j,\theta_{-i}) - qQ(m-1,\theta_{-i}) - \sum_{j=m}^{n-1} Q(j,\theta_{-i}). \tag{72}$$

This yields

$$P(m,q,\theta_{-i}) = (1-q)\,Q(m-1,\theta_{-i}) + q\,Q(m-2,\theta_{-i}). \tag{73}$$

Since $\theta$ is binary and the equilibrium is symmetric, the functions $Q(m,\theta_{-i})$ and $P(m,q,\theta_{-i})$ depend on $\theta_{-i}$ only through the number of $1$s in the entries of $\theta_{-i}$. Let $|\theta_{-i}|$ be the number of $1$s in the vector $\theta_{-i}$. Abusing notation, we rewrite $Q(m,\theta_{-i})$ as $Q(m,|\theta_{-i}|)$, and $P(m,q,\theta_{-i})$ as $P(m,q,|\theta_{-i}|)$. For any fixed values of $m$ and $q$, one of these three statements is true:

1. $P(m,q,|\theta_{-i}|)$ is strictly increasing in $|\theta_{-i}|$,
2. $P(m,q,|\theta_{-i}|)$ is strictly decreasing in $|\theta_{-i}|$,
3. $P(m,q,|\theta_{-i}|)$ is first increasing and then decreasing in $|\theta_{-i}|$.

In equilibrium, the principal is indifferent between committing an offense against $k$ agents and not committing any offense, where $k$ satisfies

$$k \in \arg\min_{\tilde{k} \in \{1,\dots,n\}} \frac{1}{\tilde{k}} \sum_{j=0}^{\tilde{k}-1} P(m,q,j), \tag{74}$$

with the third argument $j$ standing for the value of $|\theta_{-i}|$. This property must hold because, from Lemma 2.1, the principal must be indifferent between committing no offense and a positive number $k$ of offenses. For this value of $k$, the average cost of committing an offense when the principal commits $k$ offenses is equal to $1$, and there does not exist $k' \in \{1,2,\dots,n\}$ such that if the principal commits $k'$ offenses, his average cost of committing an offense is strictly less than $1$.

The value of $k$ depends on the monotonicity of $P(m,q,\cdot)$. When $P(m,q,\cdot)$ is strictly increasing, $k = 1$ and the principal is indifferent between committing only one offense and committing no offense. When $P(m,q,\cdot)$ is strictly decreasing, $k = n$ and the principal is indifferent between committing no offense and committing an offense against all agents. When $P(m,q,\cdot)$ is first increasing and then decreasing, $k$ is either $1$ or $n$, depending on the parameters. In what follows, we consider the two values of $k$ separately.

Strategic Substitutes:

When k = 1 , an agent’s reporting cutoff when he has witnessed an offense is ω ∗ = b − c n − qQ ( m − , − n − X j = m − Q ( j, o P ( m, q, (75)19imilarly, an agent’s reporting cutoff when he has not witnessed any offense is ω ∗∗ = − c − β n qQ ( m − ,

0) + n − X j = m − Q ( j, o − (1 − β ) n qQ ( m − ,

1) + n − X j = m − Q ( j, o βP ( m, q,

0) + (1 − β ) P ( m, q, (76)where β is the probability with which θ = ... = θ n = 0 conditional on θ i = 0 .In the ﬁrst step, we show that ω ∗ < ω ∗∗ + b . This inequality comes from the fact that k = 1 , which impliesthat P ( m, q, > P ( m, q, . Moreover, qQ ( m − ,

1) + n − X j = m − Q ( j, > qQ ( m − ,

0) + n − X j = m − Q ( j, . Therefore, ω ∗ − ( ω ∗∗ + b ) c = 1 − β n qQ ( m − ,

0) + n − X j = m − Q ( j, o − (1 − β ) n qQ ( m − ,

1) + n − X j = m − Q ( j, o βP ( m, q,

0) + (1 − β ) P ( m, q, − n − qQ ( m − , − n − X j = m − Q ( j, o P ( m, q, < . In the second step, we bound ω ∗ from below using the facts that | ω ∗ − ω ∗∗ | < b and q (1 , , ...,

1) = 1 . First,for every m ∈ { , , ..., n − } and q , P ( m, q, ≥ P ( n − , ,

0) = (Ψ ∗∗ ) n − . From (75), we know that ω ∗ − bc ≥ − (Ψ ∗∗ ) − ( n − ≥ − (cid:16) δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17) − ( n − . Since the RHS of the last formula is bounded below, there exists ω ∗ ∈ R − , independent of L , such that ω ∗ ≥ ω ∗ .In the third step, we bound the value of Ψ ∗ − Ψ ∗∗ from below. Let X ≡ qQ ( m − ,

0) + n − X j = m − Q ( j, and X ≡ qQ ( m − ,

1) + n − X j = m − Q ( j, . From (75) and (76), ω ∗ − ω ∗∗ c = bc − (1 − β ) P ( m, q, − X ) − P ( m, q, − X ) P ( m, q, (cid:0) βP ( m, q,

0) + (1 − β ) P ( m, q, (cid:1) . (77)20e start by bounding P ( m, q, − X ) − P ( m, q, − X )Ψ ∗ − Ψ ∗∗ (78)from above. Since P ( m, q, − X ) − P ( m, q, − X ) = ( X − X ) P ( m, q,

0) + (1 − X )( P ( m, q, − P ( m, q, , and − X as well as P ( m, q, are bounded from above by , we only need to bound X − X Ψ ∗ − Ψ ∗∗ and P ( m,q, − P ( m,q, ∗ − Ψ ∗∗ from above. Notice that Q ( j, − Q ( j, ∗ − Ψ ∗∗ = (cid:18) n − j − (cid:19) (Ψ ∗∗ ) j − (1 − Ψ ∗∗ ) n − − j − (cid:18) n − j (cid:19) (Ψ ∗∗ ) j (1 − Ψ ∗∗ ) n − − j which is bounded from above by (cid:0) n − j − (cid:1) . Since X − X and P ( m, q, − P ( m, q, are both linear combinationsof terms of the form of Q ( j, − Q ( j, , X − X Ψ ∗ − Ψ ∗∗ and P ( m,q, − P ( m,q, ∗ − Ψ ∗∗ are also bounded above. Let C ∈ R + be the upper bound on (78). Since P ( m, q, (cid:0) βP ( m, q,

0) + (1 − β ) P ( m, q, (cid:1) is bounded away from , wecan also bound ∗ − Ψ ∗∗ · P ( m, q, − X ) − P ( m, q, − X ) P ( m, q, (cid:0) βP ( m, q,

0) + (1 − β ) P ( m, q, (cid:1) . (79)Letting C ∈ R + denote this bound, we have ω ∗ − ω ∗∗ c ≥ bc − δ (1 − β ) C (Φ( ω ∗ ) − Φ( ω ∗∗ )) . (80)Letting C ≡ δ (1 − β ) C , we obtain ω ∗ − ω ∗∗ c + C (Φ( ω ∗ ) − Φ( ω ∗∗ )) ≥ bc . (81)We have shown in the previous step that ω ∗ > ω ∗ . Let ǫ ∈ R + denote the unique solution of the equation ǫc + C ǫφ ( ω ∗ − ǫ ) = bc , whose LHS is continuous, strictly increasing in ǫ , strictly greater than bc for ǫ → ∞ , and strictly less than bc when ǫ → −∞ . From (81), we have ω ∗ = ω ∗∗ ≥ ǫ , and, hence, Ψ ∗ − Ψ ∗∗ = δ (Φ( ω ∗ ) − Φ( ω ∗∗ )) ≥ δǫφ ( ω ∗ − ǫ ) . (82)The principal’s incentive constraint is P ( m, q, ∗ − Ψ ∗∗ ) L = 1 . Since P ( m, q, is bounded below by21 δ Φ( ω ∗ − b ) + (1 − δ ) α (cid:17) − ( n − and Ψ ∗ − Ψ ∗∗ is bounded below by (82), the LHS must go to + ∞ as L → + ∞ , which leads to a contradiction. Strategic Complements:

When k = n , an agent’s reporting cutoff when he has witnessed an offense is ω ∗ = b − c n − qQ ( m − , n − − n − X j = m − Q ( j, n − o P ( m, q, n − (83)Similarly, an agent’s reporting cutoff when he has not witnessed any offense is ω ∗∗ = − c n − qQ ( m − , − n − X j = m − Q ( j, o P ( m, q, (84)Since k = n , P ( m, q, n − > P ( m, q, . Moreover, since Ψ ∗ − Ψ ∗∗ > , qQ ( m − , n − − n − X j = m − Q ( j, n − > qQ ( m − , − n − X j = m − Q ( j, . (85)These inequalities imply that the distance between the cutoffs exceeds b . The inequality ω ∗ c ≥ − (cid:16) (1 − δ ) α (cid:17) − ( n − (86)provides a lower bound on ω ∗ , which we denote by ω ∗ . The principal’s marginal cost of committing anotheroffense is bounded below by L (Ψ ∗ − Ψ ∗∗ ) P ( m, q, n − ≥ Lδbφ ( ω ∗ − b )(1 − δ ) n − α n − . (87)The RHS goes to inﬁnity as L → ∞ , which leads to a contradiction. Principal’s Equilibrium Strategy:

Since $q(1,1,\dots,1) \in (0,1)$ and $\omega^*_i > \omega^{**}_i$ in any symmetric equilibrium that satisfies Refinement 1, we have $\Pr(\theta = 1 \mid a) < \Pr(\theta = 1 \mid (1,1,\dots,1)) = \pi^*$ for every $a \neq (1,1,\dots,1)$. Therefore, $q(a) = 0$ for every $a \neq (1,1,\dots,1)$. Let $q \equiv q(1,1,\dots,1)$. When the principal commits an offense in addition to $m \in \{0,1,2,\dots,n-1\}$ offenses, the probability that he gets convicted increases by

$$q\,(\Psi^* - \Psi^{**})\,(\Psi^{**})^{n-m-1}\,(\Psi^*)^m,$$

which is a strictly increasing function of $m$. Therefore, if the principal commits $m \ge 2$ offenses with positive probability, the probability $\Pr(\theta = 1)$ that he commits at least one offense is equal to $1$, which contradicts Lemma 2.1. Therefore, the principal must be indifferent between committing one offense and committing no offense, which proves the second part of Lemma E.1.

H.2 Proof of Proposition 4
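Lemma E.1, which this proof exploits, rests on the observation that when conviction requires all $n$ agents to accuse, the marginal increase in the conviction probability, $q(\Psi^* - \Psi^{**})(\Psi^{**})^{n-m-1}(\Psi^*)^m$, rises with the number $m$ of offenses already committed. The following minimal sketch checks this closed form against direct differences of the conviction probability $q(\Psi^*)^m(\Psi^{**})^{n-m}$. The function names and the numerical values of $n$, $q$, $\Psi^*$ and $\Psi^{**}$ are hypothetical and chosen only for illustration.

```python
def conviction_prob(m, n, q, psi_hi, psi_lo):
    """Conviction probability when m of the n agents witnessed an offense
    and the principal is convicted (with probability q) only if all n accuse."""
    return q * psi_hi**m * psi_lo**(n - m)


def marginal_increase(m, n, q, psi_hi, psi_lo):
    """Closed form q (psi_hi - psi_lo) psi_lo^(n-m-1) psi_hi^m."""
    return q * (psi_hi - psi_lo) * psi_lo**(n - m - 1) * psi_hi**m


if __name__ == "__main__":
    n, q, psi_hi, psi_lo = 4, 0.5, 0.6, 0.3  # hypothetical values
    for m in range(n):
        step = (conviction_prob(m + 1, n, q, psi_hi, psi_lo)
                - conviction_prob(m, n, q, psi_hi, psi_lo))
        print(m, round(step, 6), round(marginal_increase(m, n, q, psi_hi, psi_lo), 6))
```

Because each additional offense multiplies the increment by $\Psi^*/\Psi^{**} > 1$, a principal willing to commit a second offense would strictly prefer committing the first, which is the step the lemma uses.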

Exploiting Lemma E.1, we establish the properties common to all symmetric equilibria that satisfy Refinement 1. We start by deriving formulas for the agents' reporting cutoffs $(\omega^*_n, \omega^{**}_n)$, the informativeness of accusations $\mathcal{I}_n$ and the equilibrium probability of offense $\pi_n$. For every $i \in \{1,2,\dots,n\}$, agent $i$'s reporting cutoff when $\theta_i = 1$ is

$$\omega^*_n = b + c - \frac{c}{q_n Q_{1,n}}. \tag{88}$$

His reporting cutoff when $\theta_i = 0$ is

$$\omega^{**}_n = c - \frac{c}{q_n Q_{0,n}}, \tag{89}$$

where

$$Q_{1,n} \equiv \Big(\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha\Big)^{n-1} \tag{90}$$

and

$$Q_{0,n} \equiv \frac{n\mathcal{I}_n}{(n-1)l^* + n\mathcal{I}_n}\Big(\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha\Big)^{n-1} + \frac{(n-1)l^*}{(n-1)l^* + n\mathcal{I}_n}\Big(\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha\Big)^{n-2}\Big(\delta\Phi(\omega^*_n) + (1-\delta)\alpha\Big). \tag{91}$$

From Lemma E.1, the principal commits at most one offense, which yields the informativeness ratio

$$\mathcal{I}_n = \frac{\delta\Phi(\omega^*_n) + (1-\delta)\alpha}{\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha}.$$

The judge is indifferent between convicting and acquitting a principal facing $n$ accusations. This implies that

$$\mathcal{I}_n = \frac{\pi^*}{1-\pi^*} \Big/ \frac{\pi_n}{1-\pi_n}. \tag{92}$$

When $L$ is large enough, the principal's indifference condition for committing zero or one offense is

$$\frac{1}{\delta L} = q_n\Big(\Phi(\omega^*_n) - \Phi(\omega^{**}_n)\Big)\Big(\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha\Big)^{n-1}. \tag{93}$$

Using these formulas, we now show that $\omega^*_n - \omega^{**}_n \in (0,b)$. Suppose toward a contradiction that $\omega^*_n - \omega^{**}_n \le 0$. Then, comparing (90) and (91) yields $Q_{1,n} \ge Q_{0,n}$. Plugging this into (88) and (89) implies that $\omega^*_n \ge \omega^{**}_n + b$, which contradicts the presumption that $\omega^*_n - \omega^{**}_n \le 0$ and implies $\omega^*_n - \omega^{**}_n > 0$. Since $\omega^*_n - \omega^{**}_n > 0$, we have $Q_{1,n} < Q_{0,n}$. The expressions for the cutoffs then imply that $\omega^*_n - \omega^{**}_n < b$.

Next, we show that $\mathcal{I}_n \to 1$ as $\omega^*_n \to -\infty$. Equations (88) and (89) yield

$$\frac{|\omega^*_n - b - c|}{|\omega^{**}_n - c|} = \frac{Q_{0,n}}{Q_{1,n}} = \frac{(n-1)l^*}{(n-1)l^* + n\mathcal{I}_n}\,\mathcal{I}_n + \frac{n\mathcal{I}_n}{(n-1)l^* + n\mathcal{I}_n}. \tag{94}$$

Since $\omega^*_n - \omega^{**}_n \in (0,b)$, the LHS converges to $1$ as $\omega^*_n \to -\infty$, which implies that the RHS also converges to $1$.
This can occur only if $\mathcal{I}_n \to 1$.

In the last step, we show that $\omega^*_n \to -\infty$ as $L \to +\infty$. Suppose toward a contradiction that there exists a finite accumulation point $\omega^* \in \mathbb{R}_-$ for $\omega^*_n$. The LHS of (93) converges to $0$ when $L \to +\infty$. Therefore, at least one of the following properties must hold along some subsequence: $q_n \to 0$ or $\Phi(\omega^*_n) - \Phi(\omega^{**}_n) \to 0$. Since $\omega^*_n \to \omega^*$, $\Phi(\omega^*_n) - \Phi(\omega^{**}_n) \to 0$ implies that $\omega^*_n - \omega^{**}_n \to 0$.

First, suppose toward a contradiction that $q_n \to 0$ along some subsequence. From (88), $\omega^*_n \to -\infty$ along this subsequence, which leads to a contradiction.

Second, suppose toward a contradiction that $q_n$ is bounded away from $0$ along some subsequence, i.e., strictly greater than some $\underline{q} > 0$. Since the LHS of (93) converges to $0$, we need $\omega^*_n - \omega^{**}_n \to 0$ along this subsequence. Subtracting the expression for $\omega^{**}_n + b$ from that for $\omega^*_n$, we obtain

$$\frac{q_n}{c}\Big(\omega^*_n - (\omega^{**}_n + b)\Big) = \frac{(n-1)l^*}{(n-1)l^* + n}\left\{\frac{1}{\delta\Phi(\omega^*_n) + (1-\delta)\alpha} - \frac{1}{\delta\Phi(\omega^{**}_n) + (1-\delta)\alpha}\right\}. \tag{95}$$

The absolute value of the LHS is no less than $\underline{q}\,b/c$ in the limit as $\omega^*_n - \omega^{**}_n \to 0$. The absolute value of the RHS converges to $0$ as $\Phi(\omega^*_n) - \Phi(\omega^{**}_n) \to 0$, leading to a contradiction. This implies that $\omega^*_n \to -\infty$ in every equilibrium as $L \to +\infty$.

The three parts together imply that as $L \to +\infty$, $\omega^*_n$ and $\omega^{**}_n$ go to $-\infty$, $\mathcal{I}_n \to 1$ and $\pi_n \to \pi^*$.

H.3 Proof of Proposition 5

Equilibrium existence is established in Supplementary Appendix J. This section establishes the limiting properties of such equilibria. Since the principal is indifferent between all action profiles, his marginal cost of committing an offense equals his marginal benefit:

$$qL(\Psi^* - \Psi^{**}) = 1,$$

where $q \in (0,1)$ is the incremental probability of conviction after an agent accuses him, $\Psi^*$ is the probability of $a_i = 1$ conditional on $\theta_i = 1$, and $\Psi^{**}$ is the probability of $a_i = 1$ conditional on $\theta_i = 0$. The reporting cutoffs are

$$\omega^* \equiv -\frac{c(P - q)}{q} + b \quad\text{and}\quad \omega^{**} \equiv -\frac{c(P - q)}{q},$$

where $P \equiv \Pr(s = 0 \mid a_i = 0)$. Therefore, $\omega^* - \omega^{**} = b$, which implies that $\Psi^* - \Psi^{**} \to 0$ if and only if $\omega^* \to -\infty$. As $L \to +\infty$, the indifference condition implies that either $q \to 0$ or $\Psi^* - \Psi^{**} \to 0$ or both. The formulas for the reporting cutoffs imply that $\omega^* \to -\infty$ if and only if $q \to 0$. This implies that $\omega^* \to -\infty$ as $L \to \infty$. In the double limit $\lim_{L \to \infty} \lim_{\delta \to 1}$, the informativeness ratio

$$\mathcal{I} \equiv \frac{\delta\Phi(\omega^*_i) + (1-\delta)\alpha}{\delta\Phi(\omega^{**}_i) + (1-\delta)\alpha}$$

converges to $+\infty$.

H.4 Equilibrium Refinements
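Before turning to the refinements, the role of the order of limits in Proposition 5 (Section H.3) can be illustrated numerically. With $\omega^{**} = \omega^* - b$, the informativeness ratio is $(\delta\Phi(\omega^*) + (1-\delta)\alpha)/(\delta\Phi(\omega^* - b) + (1-\delta)\alpha)$. The sketch below assumes that $\Phi$ is the standard normal cdf and uses hypothetical values of $b$ and $\alpha$; it is an illustration, not part of the proof. The function name `informativeness` is ours.

```python
import math


def normal_cdf(x):
    """Standard normal cdf; an assumed stand-in for the paper's Phi."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def informativeness(omega_star, b, delta, alpha):
    """Ratio of accusation probabilities with cutoffs omega* and omega* - b."""
    num = delta * normal_cdf(omega_star) + (1.0 - delta) * alpha
    den = delta * normal_cdf(omega_star - b) + (1.0 - delta) * alpha
    return num / den


if __name__ == "__main__":
    b, alpha = 1.0, 0.1  # hypothetical values
    for delta in (1.0, 0.9):
        ratios = [informativeness(w, b, delta, alpha) for w in (-2.0, -4.0, -6.0)]
        print(delta, [round(r, 6) for r in ratios])
```

Under these assumptions, with $\delta = 1$ the ratio grows as $\omega^*$ falls, whereas for a fixed $\delta < 1$ the term $(1-\delta)\alpha$ dominates both numerator and denominator and the ratio approaches $1$. This is why the statement takes $\delta \to 1$ first and $L \to \infty$ second.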

Lemma H. When the judge uses decision rule (2.4), every symmetric Bayes Nash Equilibrium that satisfies Refinement 1 is a proper equilibrium and satisfies Refinement 2.

Proof: As with Lemma 2.1, the principal commits an offense with positive probability in every Bayes Nash Equilibrium. Furthermore, Refinement 1 implies that the principal commits an offense with probability less than $1$. Therefore, in any symmetric Bayes Nash equilibrium, the principal chooses $\theta_i = 1$ and $\theta_i = 0$ with probabilities that are both strictly positive and identical across $i$.

Therefore, in these equilibria players' information sets all occur with positive probability on the equilibrium path, implying that these equilibria are proper.

Fix a symmetric Bayes Nash equilibrium and let $P_0 \equiv \Pr(a_i = 1 \mid \theta_i = 0)$ and $P_1 \equiv \Pr(a_i = 1 \mid \theta_i = 1)$. We show that the equilibrium satisfies Refinement 2 by showing that $P_1 > P_0$.

First, suppose toward a contradiction that $P_1 = P_0$. Since the judge's decision is measurable with respect to $a$, the probability of convicting the principal is independent of $\theta$. This implies that the principal has a strict incentive to choose $\theta = (1,1,\dots,1)$. Therefore, $\Pr(\theta = 1 \mid a) = 1$ for every $a \in \{0,1\}^n$. This implies that $q(a) = 1$ for every $a \in \{0,1\}^n$, which violates Refinement 1.

Next, suppose that $P_1 < P_0$. Symmetry implies that $\Pr(\theta = 1 \mid a) < \Pr(\theta = 1 \mid a')$ whenever $a$ contains strictly more 1's than $a'$. From decision rule (2.4), $q(0,0,\dots,0) \ge q(a)$ for every $a \neq (0,0,\dots,0)$. Refinement 1 requires that $q(0,\dots,0) = 0$, which implies that $q(a) = 0$ for every $a \in \{0,1\}^n$. Under these conviction probabilities, the principal has a strict incentive to choose $\theta = (1,1,\dots,1)$, according to which $q(a) = 1$ for every $a \in \{0,1\}^n$, which leads to a contradiction.

Since $P_1 > P_0$, $\Pr(\theta = 1 \mid a) > \Pr(\theta = 1 \mid a')$ whenever $a$ contains strictly more 1's than $a'$. Under conviction rule (2.4), $q(a) \ge q(a')$ for every $a > a'$. Suppose toward a contradiction that there exists $i \in \{1,2,\dots,n\}$ such that $q(0,a_{-i}) = q(1,a_{-i})$ for every $a_{-i} \in \{0,1\}^{n-1}$. Then, the principal has a strict incentive to choose $\theta_i = 1$, which implies that $\Pr(\theta = 1 \mid a) = 1$ for every $a \in \{0,1\}^n$. This leads to a contradiction, and establishes Refinement 2.

Supplementary Appendix (Not for Publication)
Crime Entanglement, Deterrence, and Witness Credibility

Harry Pei, Bruno Strulovici

September 15, 2020

I. Symmetry of Equilibrium

From Lemma 3.1, $q(0,0) = q(1,0) = q(0,1) = 0$ and $q(1,1) \in (0,1)$. Let $q \equiv q(1,1)$, $\Psi_i^* \equiv \delta\Phi(\omega_i^*) + (1-\delta)\alpha$, and $\Psi_i^{**} \equiv \delta\Phi(\omega_i^{**}) + (1-\delta)\alpha$. Agent $i$'s reporting cutoffs are
$$\omega_i^* = b - c\,\frac{1 - qQ_{1,j}}{qQ_{1,j}} \tag{1}$$
and
$$\omega_i^{**} = -c\,\frac{1 - qQ_{0,j}}{qQ_{0,j}}, \tag{2}$$
where $Q_{1,j}$ is the probability of $a_j = 1$ conditional on $\theta_i = 1$, given by
$$\Pr(\theta_j = 1 \mid \theta_i = 1)\Psi_j^* + \big(1 - \Pr(\theta_j = 1 \mid \theta_i = 1)\big)\Psi_j^{**}, \tag{3}$$
and $Q_{0,j}$ is the probability of $a_j = 1$ conditional on $\theta_i = 0$, given by
$$\Pr(\theta_j = 1 \mid \theta_i = 0)\Psi_j^* + \big(1 - \Pr(\theta_j = 1 \mid \theta_i = 0)\big)\Psi_j^{**}. \tag{4}$$

When $\theta$ changes from $(0,0)$ to $(1,0)$, the probability of conviction increases by $\delta q \Phi(\omega_2^{**})\big(\Phi(\omega_1^*) - \Phi(\omega_1^{**})\big)$. When $\theta$ changes from $(0,1)$ to $(1,1)$, the probability of conviction increases by $\delta q \Phi(\omega_2^{*})\big(\Phi(\omega_1^*) - \Phi(\omega_1^{**})\big)$. $\theta$ takes on the values $(1,0)$ and $(0,1)$ with the same, strictly positive probability only if
$$\frac{\Phi(\omega_1^*)}{\Phi(\omega_1^{**})} = \frac{\Phi(\omega_2^*)}{\Phi(\omega_2^{**})}, \tag{5}$$
and $\theta$ takes on the value $(1,0)$ with a strictly higher probability than $(0,1)$ only if
$$\frac{\Phi(\omega_1^*)}{\Phi(\omega_1^{**})} \leq \frac{\Phi(\omega_2^*)}{\Phi(\omega_2^{**})}, \tag{6}$$
with equality when $\theta$ is equal to $(0,1)$ with strictly positive probability.

First, we show that the symmetry in the principal's equilibrium strategy implies the symmetry of the agents' reporting cutoffs. Suppose toward a contradiction that $\omega_i^* > \omega_j^*$. Then $Q_{1,j} > Q_{1,i}$, which implies that $\omega_i^{**} < \omega_j^{**}$, and that
$$\frac{\Phi(\omega_i^*)}{\Phi(\omega_i^{**})} > \frac{\Phi(\omega_j^*)}{\Phi(\omega_j^{**})}. \tag{7}$$
This contradicts the hypothesis that the principal chooses $\theta = (1,0)$ and $\theta = (0,1)$ with the same probability.

Next, we assume by way of contradiction that $\theta$ equals $(1,0)$ with a strictly higher probability than $(0,1)$. This implies inequality (6). The rest of the proof is divided in two cases: Section I.1 considers the case $\pi^o \geq \pi^*$ and Section I.2 considers the case $\pi^o < \pi^*$.

I.1 Case 1: $\pi^o \geq \pi^*$

Lemma 2.1 implies that an opportunistic principal commits no offense with strictly positive probability.
Lemmas3.1 and 3.2 then imply that the principal commits two offenses with zero probability.First, consider the case in which the principal chooses θ = (0 , with positive probability. As arguedearlier, (6) holds with equality, and for each j ∈ { , } we have Q ,j = Ψ ∗∗ j and Q ,j = Pr( θ j = 1 | θ i = 0)Ψ ∗ j + (1 − Pr( θ j = 1 | θ i = 0))Ψ ∗∗ j . (8)Therefore,1. If ω ∗ ≥ ω ∗ , (1) implies that ω ∗∗ < ω ∗∗ , which contradicts (6).2. If ω ∗ < ω ∗ , we must have ω ∗∗ > ω ∗∗ , and Φ( ω ∗ )Φ( ω ∗∗ ) < Φ( ω ∗ )Φ( ω ∗∗ ) , which leads to a contradiction.Second, consider the case in which the principal chooses θ = (0 , with zero probability. Upon observing θ = 1 , agent attaches probability to θ = (0 , in every proper equilibrium , because the principal’s2xpected payoff is strictly lower when he chooses θ = (1 , than when he chooses θ = (0 , . Therefore, (8)still applies, and ω ∗ = ω ∗∗ + b and ω ∗ − ω ∗∗ ∈ (0 , b ) .First, we show that ω ∗∗ < ω ∗∗ . Suppose toward a contradiction that ω ∗∗ ≥ ω ∗∗ . Then, comparing theexpressions for ω ∗∗ and ω ∗∗ yields Ψ ∗∗ ≥ π Ψ ∗ + (1 − π )Ψ ∗∗ > Ψ ∗∗ , a contradiction. This shows that ω ∗∗ < ω ∗∗ . Second, (6) implies that ω ∗ < ω ∗ , and therefore, Q , < Q , .Since Pr( θ = 1 | θ = 1) = Pr( θ = 1 | θ = 1) = 0 , we have ω ∗∗ < ω ∗∗ . This contradicts the previousconclusion that ω ∗∗ < ω ∗∗ . I.2 π o < π ∗ If an opportunistic principal chooses θ = (1 , with zero probability, then the proof follows the same steps asthe previous case. If he chooses θ = (1 , with strictly positive probability, then Lemmas 3.1 and 3.2 implythat he chooses θ = (0 , with zero probability. First, consider the case in which θ = (0 , is chosen withstrictly positive probability.1. If ω ∗ ≥ ω ∗ , then (1) implies that Q , ≥ Q , or, equivalently, that Pr( θ = 1 | θ = 1)Ψ ∗ +(1 − Pr( θ = 1 | θ = 1))Ψ ∗∗ ≥ Pr( θ = 1 | θ = 1)Ψ ∗ +(1 − Pr( θ = 1 | θ = 1))Ψ ∗∗ . 
The hypothesis that θ = (1 , has a strictly higher probability than θ = (0 , implies that Pr( θ =1 | θ = 1) < Pr( θ = 1 | θ = 1) . The hypothesis that ω ∗ ≥ ω ∗ implies that Ψ ∗ > Ψ ∗ . From Lemma 2.1,we have Ψ ∗ i > Ψ ∗∗ i for every i ∈ { , } . Therefore, the above inequality holds only if ω ∗∗ < ω ∗∗ , whichcontradicts (6).2. If ω ∗ < ω ∗ , then ω ∗∗ > ω ∗∗ and Φ( ω ∗ )Φ( ω ∗∗ ) < Φ( ω ∗ )Φ( ω ∗∗ ) , which leads to a contradiction. J. Existence of Equilibrium

Section J.1 proves existence of a proper equilibrium that satisfies Refinements 1 and 2 under the aggregate probabilities principle (or "APP"). Section J.2 proves the existence of a proper equilibrium that satisfies Refinements 1 and 2 under the distinct probabilities principle (or "DPP").

J.1 Aggregate Probabilities Principle: Decision Rule (2.4)

Case 1: $\pi^o \geq \pi^*$. We show that when $L$ is large enough and $\pi^o \geq \pi^*$, there exists a symmetric Bayes Nash equilibrium that satisfies Refinements 1 and 2 and possesses the following three properties:

1. $q(a) > 0$ if and only if $a = (1, 1, \dots, 1)$.
2. The opportunistic principal commits either no offense or exactly one offense.
3. Each agent witnesses an offense with probability strictly between 0 and 1.

This equilibrium is a proper equilibrium since all the information sets for all players occur with strictly positive probability. For every $\omega^*, \omega^{**} \in \mathbb{R}$, let $\Psi^* \equiv \delta\Phi(\omega^*) + (1-\delta)\alpha$ and $\Psi^{**} \equiv \delta\Phi(\omega^{**}) + (1-\delta)\alpha$.

Proposition J.1. There exists $\overline{L} > 0$ such that for every $L > \overline{L}$, there exists a triple $(\omega^*, \omega^{**}, q) \in \mathbb{R}_- \times \mathbb{R}_- \times (0,1)$ that solves the following three equations:
$$\frac{q}{c}(\omega^* - c - b) = -\frac{1}{(\Psi^{**})^{n-1}}, \tag{9}$$
$$\frac{q}{c}(\omega^{**} - c) = -\frac{n}{n + (n-1)l^*}\,\frac{1}{(\Psi^{**})^{n-1}} - \frac{(n-1)l^*}{n + (n-1)l^*}\,\frac{1}{(\Psi^{**})^{n-2}\Psi^*}, \tag{10}$$
$$\frac{1}{\delta L} = q(\Psi^{**})^{n-1}(\Psi^* - \Psi^{**}). \tag{11}$$

Proof. The proof consists of two steps. In Step 1, we show that the following expression is bounded below away from 0:
$$A \equiv \inf_{(\omega^*, \omega^{**}) \text{ that solves (9) and (10) when } q = 1} \delta\big(\Phi(\omega^*) - \Phi(\omega^{**})\big)(\Psi^{**})^{n-1}. \tag{12}$$
This is shown by establishing lower bounds on $\Phi(\omega^{**})$ and $\Phi(\omega^*) - \Phi(\omega^{**})$, respectively. Since $(\omega^*, \omega^{**})$ solves (9) and (10) when $q = 1$, we have
$$\omega^{**} \geq \omega^* - b \geq c - \frac{c}{(1-\delta)^{n-1}\alpha^{n-1}}$$
and therefore,
$$\Phi(\omega^{**}) \geq \Phi\Big(c - \frac{c}{(1-\delta)^{n-1}\alpha^{n-1}}\Big). \tag{13}$$
Next, we introduce the variable $\Delta \equiv \omega^* - \omega^{**}$, which is strictly between 0 and $b$. Subtracting equation (10) from (9) and plugging in $q = 1$, we have
$$\frac{b - \Delta}{c} = \frac{l^*(n-1)}{l^*(n-1) + n}\,(\Psi^*)^{-1}(\Psi^{**})^{-(n-1)}(\Psi^* - \Psi^{**}). \tag{14}$$
We consider two cases separately.

1. When $\Delta \geq b/2$,
$$\Psi^* - \Psi^{**} \geq \frac{b\delta}{2}\,\phi(\omega^{**}) \geq \frac{b\delta}{2}\,\phi\Big(c - \frac{c}{(1-\delta)^{n-1}\alpha^{n-1}}\Big). \tag{15}$$
2. When $\Delta < b/2$, (14) implies that
$$\Psi^* - \Psi^{**} \geq \frac{b\big((n-1)l^* + n\big)}{2(n-1)l^* c}\,\Psi^*(\Psi^{**})^{n-1} \geq \frac{b\big((n-1)l^* + n\big)}{2(n-1)l^* c}\Big(\delta\Phi\Big(c - \frac{c}{(1-\delta)^{n-1}\alpha^{n-1}}\Big) + (1-\delta)\alpha\Big)^n. \tag{16}$$

Taking the minimum of the right-hand sides of (15) and (16), we obtain a lower bound for $\Psi^* - \Psi^{**}$. This together with (13) implies a strictly positive lower bound for (12), which we denote by $\underline{A}$.

In Step 2, we show that when $L > \underline{A}^{-1}$, there exists a solution to (9), (10) and (11). For every $(\Phi^*, \Phi^{**}, q) \in [0,1]^2 \times [1/L, 1]$, let $f \equiv (f_1, f_2, f_3) : [0,1]^2 \times [1/L, 1] \to [0,1]^2 \times [1/L, 1]$ be the following mapping:
$$f_1(\Phi^*, \Phi^{**}, q) = \Phi\Big(b + c - \frac{c}{q(\Psi^{**})^{n-1}}\Big), \tag{17}$$
$$f_2(\Phi^*, \Phi^{**}, q) = \Phi\Big(c - \frac{c(n-1)l^*}{q\big((n-1)l^* + n\big)}\,\frac{1}{\Psi^*(\Psi^{**})^{n-2}} - \frac{cn}{q\big((n-1)l^* + n\big)}\,\frac{1}{(\Psi^{**})^{n-1}}\Big), \tag{18}$$
$$f_3(\Phi^*, \Phi^{**}, q) = \min\Big\{1, \frac{1}{\delta L}\,\frac{1}{(\Psi^{**})^{n-1}(\Phi^* - \Phi^{**})}\Big\}, \tag{19}$$
where $\Psi^* \equiv \delta\Phi^* + (1-\delta)\alpha$ and $\Psi^{**} \equiv \delta\Phi^{**} + (1-\delta)\alpha$. Since $f$ is continuous, Brouwer's fixed point theorem implies the existence of a fixed point.

Next, we show that if $(\Phi^*, \Phi^{**}, q)$ is a fixed point, then $q < 1$. This implies that every solution to the fixed point problem solves the system of equations (9), (10) and (11), as (19) and (11) are the same when $q < 1$. Suppose toward a contradiction that $q = 1$. Then $\Phi^{-1}(\Phi^*)$ and $\Phi^{-1}(\Phi^{**})$ solve (9) and (10). From Step 1 of the proof, the assumption that $L > \underline{A}^{-1}$ implies that
$$\frac{1}{\delta L}\,\frac{1}{(\Psi^{**})^{n-1}(\Phi^* - \Phi^{**})} < 1.$$
Therefore the RHS of (19) is strictly less than 1. This contradicts the claim that $(\Phi^*, \Phi^{**}, 1)$ is a fixed point of $f$. Therefore, the value of $q$ at the fixed point is strictly less than 1.

Given the tuple $(\omega^*, \omega^{**}, q) \in \mathbb{R}_- \times \mathbb{R}_- \times (0,1)$, one can then uniquely pin down the equilibrium probability of an offense $\pi \equiv \Pr(\overline{\theta} = 1)$ via the equation
$$\frac{\delta\Phi(\omega^*) + (1-\delta)\alpha}{\delta\Phi(\omega^{**}) + (1-\delta)\alpha} = l^* \Big/ \frac{\pi}{1-\pi}. \tag{20}$$
To see this, note that the principal is convicted with interior probability after the judge observes $a = (1, 1, \dots, 1)$, and therefore, $\Pr(\overline{\theta} = 1 \mid a = (1, 1, \dots, 1)) = \pi^*$. Since equations (9), (10), (11) and (20) are sufficient conditions for a symmetric Bayes Nash equilibrium that satisfies Refinement 1, the existence of a fixed point in (9), (10), and (11) implies the existence of a proper equilibrium that satisfies Refinements 1 and 2.
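The map $f$ in (17) to (19) can be evaluated numerically. The following is a minimal Python sketch, not the authors' code. It assumes that $\Phi$ is the standard normal CDF, and every parameter value (`n`, `b`, `c`, `delta`, `alpha`, `l_star`, `L`) is hypothetical and chosen only for illustration. The convention $f_3 = 1$ when $\Phi^* \leq \Phi^{**}$ is also an assumption of this sketch. The helper `damped_iterate` is an illustrative search heuristic that stays inside the box $[0,1]^2 \times [1/L, 1]$.

```python
import math


def Phi(x):
    """Standard normal CDF (an assumption about the shock distribution)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f_map(phi1, phi0, q, *, n, b, c, delta, alpha, l_star, L):
    """Evaluate (f1, f2, f3) from (17)-(19) at (Phi*, Phi**, q)."""
    psi1 = delta * phi1 + (1.0 - delta) * alpha  # Psi*
    psi0 = delta * phi0 + (1.0 - delta) * alpha  # Psi**
    w = (n - 1) * l_star + n
    f1 = Phi(b + c - c / (q * psi0 ** (n - 1)))
    f2 = Phi(
        c
        - c * (n - 1) * l_star / (q * w * psi1 * psi0 ** (n - 2))
        - c * n / (q * w * psi0 ** (n - 1))
    )
    gap = phi1 - phi0
    if gap <= 0.0:
        f3 = 1.0  # assumed convention when the gap is not positive
    else:
        f3 = min(1.0, 1.0 / (delta * L * psi0 ** (n - 1) * gap))
    return f1, f2, f3


def damped_iterate(x0, params, lam=0.2, steps=500):
    """Damped iteration x <- (1 - lam) x + lam f(x); convergence is not guaranteed."""
    x = x0
    for _ in range(steps):
        fx = f_map(*x, **params)
        x = tuple((1.0 - lam) * xi + lam * fi for xi, fi in zip(x, fx))
    return x


# Hypothetical parameter values, for illustration only.
params = dict(n=2, b=1.0, c=1.0, delta=0.9, alpha=0.1, l_star=2.0, L=1000.0)
candidate = damped_iterate((0.5, 0.2, 0.5), params)
```

Brouwer's theorem guarantees that a fixed point exists but says nothing about whether this iteration reaches it, so `candidate` should be checked against $f$ before it is treated as a solution.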
Case 2: $\pi^o < \pi^*$. We show that when $L$ is large enough and $\pi^o < \pi^*$, there exists a symmetric Bayes Nash equilibrium that satisfies Refinements 1 and 2, and possesses the following three properties:

1. $q(a) > 0$ if and only if $a = (1, 1, \dots, 1)$.
2. There exist $1 \leq k \leq n$ and $r \in [0,1]$ such that an opportunistic principal commits $k - 1$ offenses with probability $r$ and commits $k$ offenses with probability $1 - r$.
3. Each agent witnesses an offense with probability strictly between 0 and 1.

Such a Bayes Nash equilibrium is a proper equilibrium since all the information sets for all players occur with strictly positive probability. For every $\Psi^*$, $\Psi^{**}$, $k$ and $r$, let
$$Q_1(\Psi^*, \Psi^{**}, k, r) \equiv \frac{r(k-1)\Psi^{**} + (1-r)k\Psi^*}{k - r}\,(\Psi^*)^{k-2}(\Psi^{**})^{n-k} \tag{21}$$
and
$$Q_0(\Psi^*, \Psi^{**}, k, r) \equiv \frac{r\pi^o\big(1 - \frac{k-1}{n}\big)(\Psi^*)^{k-1}(\Psi^{**})^{n-k} + (1-r)\pi^o\big(1 - \frac{k}{n}\big)(\Psi^*)^{k}(\Psi^{**})^{n-k-1} + (1-\pi^o)(\Psi^{**})^{n-1}}{r\pi^o\big(1 - \frac{k-1}{n}\big) + (1-r)\pi^o\big(1 - \frac{k}{n}\big) + (1-\pi^o)}. \tag{22}$$
Intuitively, $Q_1(\Psi^*, \Psi^{**}, k, r)$ is an agent's expected probability that all other agents will accuse the principal conditional on $\theta_i = 1$, and $Q_0(\Psi^*, \Psi^{**}, k, r)$ is an agent's expected probability that all other agents will accuse the principal conditional on $\theta_i = 0$. For every $k \in \{1, 2, \dots, n\}$ and $r \in [0,1]$, let $\Lambda(k - r)$ be the set of values for $(\Psi^*, \Psi^{**}, q) \in [0,1] \times [0,1] \times [0,1]$ such that
$$qL(\Psi^*)^{k-1}(\Psi^{**})^{n-k}(\Psi^* - \Psi^{**}) = 1 \quad \text{if } r \in (0,1),$$
$$qL(\Psi^*)^{k-1}(\Psi^{**})^{n-k}(\Psi^* - \Psi^{**}) \geq 1 \geq qL(\Psi^*)^{k-2}(\Psi^{**})^{n-k+1}(\Psi^* - \Psi^{**}) \quad \text{if } r = 1,$$
and
$$qL(\Psi^*)^{k}(\Psi^{**})^{n-k-1}(\Psi^* - \Psi^{**}) \geq 1 \geq qL(\Psi^*)^{k-1}(\Psi^{**})^{n-k}(\Psi^* - \Psi^{**}) \quad \text{if } r = 0.$$
Letting
$$l^* \equiv \frac{\pi^*(1 - \pi^o)}{(1 - \pi^*)\pi^o}, \tag{23}$$
we have $l^* > 1$. Let $\Psi^* \equiv (1-\delta)\alpha + \delta\Phi(\omega^*)$ and $\Psi^{**} \equiv (1-\delta)\alpha + \delta\Phi(\omega^{**})$. We show the following proposition.

Proposition J.2. There exists $\overline{L} > 0$ such that for every $L > \overline{L}$, there exists a tuple $(\omega^*, \omega^{**}, q, k, r) \in \mathbb{R}_- \times \mathbb{R}_- \times (0,1) \times \{1, 2, \dots, n\} \times [0,1]$ such that
$$(1-r)\Big(\frac{\Psi^*}{\Psi^{**}}\Big)^{k} + r\Big(\frac{\Psi^*}{\Psi^{**}}\Big)^{k-1} = l^*, \tag{24}$$
$$(\Psi^*, \Psi^{**}, q) \in \Lambda(k - r), \tag{25}$$
$$\frac{q}{c}(\omega^* - c - b) = -\frac{1}{Q_1(\Psi^*, \Psi^{**}, k, r)}, \tag{26}$$
and
$$\frac{q}{c}(\omega^{**} - c) = -\frac{1}{Q_0(\Psi^*, \Psi^{**}, k, r)}. \tag{27}$$

Proof.

As in the proof of Proposition J.1, we show that A ≡ inf ( ω ∗ ,ω ∗∗ ) that solves (24) and (25) when q = 1 (Ψ ∗ ) k − (Ψ ∗∗ ) n − k (Ψ ∗ − Ψ ∗∗ ) (28)is strictly bounded below away from . This bound comes from the fact that (Ψ ∗ ) k − (Ψ ∗∗ ) n − k ≥ (Ψ ∗∗ ) n , and a similar argument as the one used in Proposition J.1 to establish strictly positive lower bounds for Ψ ∗∗ and Ψ ∗ − Ψ ∗∗ . The resulting lower bound for A is denoted by A .In what follows, we focus on L ≥ A − . Let I ≡ Ψ ∗ / Ψ ∗∗ and β ≡ k − r . We introduce the followingcorrespondence: F : [0 , × [1 , − δ )(1 − α ) ] × [1 /L, × [0 , n ] ⇒ [0 , × [1 , − δ )(1 − α ) ] × [1 /L, × [0 , n ] , F (Ψ ∗∗ , I , q, k − r ) is uniquely pinned down by (1 − r ) F (Ψ ∗∗ , I , q, k − r ) k + rF (Ψ ∗∗ , I , q, k − r ) k − = l ∗ , (29) F (Ψ ∗∗ , I , q, k − r ) is the set of min { , q ′ } such that (cid:16) Ψ ∗∗ F (Ψ ∗∗ , I , q, k − r ) , Ψ ∗∗ , q ′ (cid:17) ∈ Λ( k − r ) , (30) F (Ψ ∗∗ , I , q, k − r ) is the set of Ψ ′ such that there exists q ∈ F (Ψ ∗∗ , I , q, k − r ) such that Ψ ′ = δ Φ (cid:16) c − cq Q ( F (Ψ ∗∗ , I , q, k − r )Ψ ∗∗ , Ψ ∗∗ , k, r ) (cid:17) + (1 − δ ) α, (31)and F (Ψ ∗∗ , I , q, k − r ) is the set of integers k − r such that there exists q ∈ F (Ψ ∗∗ , I , q, k − r ) for which qbc = 1 Q − Q . (32)Since F is deﬁned on a non-empty, convex, and compact set and F has a closed graph, Kakutani’s ﬁxed pointtheorem implies that F has a ﬁxed point. Proceeding as in the proof of Proposition J.1, one can show that inevery ﬁxed point (Φ ∗∗ , I , q, k − r ) , q must be strictly less than . Therefore, every ﬁxed point of correspondence F solves (24), (25), (26) and (27). J.2 Distinct Probabilities Principle: Decision Rule (2.5)

We construct for $L$ large enough a symmetric proper equilibrium that satisfies Refinements 1 and 2. This also shows the existence part of Proposition 5.1. The equilibrium has the following properties:

1. $q(\theta)$ is linear in the number of 1s contained in $\theta$, with $q(0, 0, \dots, 0) = 0$.
2. For every $i \neq j$, $\Pr(\theta_i = 1 \mid \theta_j = 1) = \Pr(\theta_i = 1 \mid \theta_j = 0)$ and $(\omega_i^*, \omega_i^{**}) = (\omega_j^*, \omega_j^{**})$.
3. An opportunistic principal is indifferent between every $\theta \in \{0,1\}^n$.

We establish the following proposition.

Proposition J.3. There exists $(\Phi^*, q^*, r^*) \in [0,1] \times (0, \frac{1}{n}) \times [0, \pi^*]$ such that
$$q^*\big(\Phi^* - \Phi^{**}\big) = \frac{1}{\delta L}, \tag{33}$$
$$\Phi^{-1}(\Phi^*) = b + c + c(1-\delta)\alpha + c\delta\big(r^*\Phi^* + (1 - r^*)\Phi^{**}\big) - \frac{c}{q^*}, \tag{34}$$
and
$$\frac{\delta\Phi^* + (1-\delta)\alpha}{\delta\Phi^{**} + (1-\delta)\alpha}\,\frac{r^*}{1 - r^*} = \frac{\pi^*}{1 - \pi^*}, \tag{35}$$
where $\Phi^{**} \equiv \Phi\big(\Phi^{-1}(\Phi^*) - b\big)$.

Mapping this into the game studied in the main text, $\Phi^*$ is a strategic agent's probability of accusing the principal when he has witnessed an offense, $q^*$ is the incremental probability of conviction when one additional agent accuses the principal, and $r^*$ is the probability that $\theta_i$ equals 1. Equation (33) is the principal's indifference condition when deciding whether to choose $\theta_i = 1$ or $\theta_i = 0$, Equation (34) characterizes each agent's reporting cutoff when he has witnessed an offense, and Equation (35) implies that, upon observing $a_i = 1$, the judge attaches probability $\pi^*$ to $\theta_i = 1$. Since $q(\cdot)$ is linear in the number of accusations and $\theta_i$ is independent of $\theta_j$, the distance between the reporting cutoffs of each agent is exactly equal to $b$.

Proof.

First, we calculate a lower bound on δ (Φ ∗ − Φ ∗∗ ) where (Φ ∗ , r ∗ ) is the solution to (34) and (35) when q ∗ is ﬁxed to be /n . Let ω ∗ ≡ Φ − (Φ ∗ ) . Equation (34)provides upper and lower bounds for ω ∗ : b + c + c (1 − δ ) α + cδ − cn ≥ ω ∗ ≥ b + c + c (1 − δ ) α − cn, which shows that δ (Φ ∗ − Φ ∗∗ ) ≥ δ min ω ∈ [ b + c + c (1 − δ ) α − cn,b + c + c (1 − δ ) α + cδ − cn ] n Φ( ω ) − Φ( ω − b ) o ≡ A (36)In what follows, let L > nA − . For every (Φ ∗ , q ∗ , r ∗ ) ∈ [0 , × [ nA , n ] × [0 , π ∗ ] , denite f ≡ ( f , f , f ) :[0 , × [ nA , n ] × [0 , π ∗ ] → [0 , × [ nA , n ] × [0 , π ∗ ] by f (Φ ∗ , q ∗ , r ∗ ) ≡ Φ (cid:16) b + c + c (1 − δ ) α + cδ (cid:0) r ∗ Φ ∗ + (1 − r ∗ )Φ (cid:0) Φ − (Φ ∗ ) − b (cid:1)(cid:1) − cq ∗ (cid:17) , (37) f (Φ ∗ , q ∗ , r ∗ ) ≡ min n n , δL ∗ − Φ (cid:16) Φ − (Φ ∗ ) − b (cid:17) o , (38)and f (Φ ∗ , q ∗ , r ∗ ) ≡ π ∗ ( δ Φ (cid:0) Φ − (Φ ∗ ) − b (cid:1) + (1 − δ ) α ) δ (cid:16) π ∗ Φ (cid:0) Φ − (Φ ∗ ) − b (cid:1) + (1 − π ∗ )Φ ∗ (cid:17) + (1 − δ ) α . (39)9rouwer’s ﬁxed point theorem that f has a ﬁxed point. We show that q ∗ < /n for every ﬁxed point of f ,which implies that any ﬁxed point solves (33), (34) and (35). Suppose toward a contradiction that (Φ ∗ , q ∗ , r ∗ ) is a ﬁxed point of f with q ∗ = 1 /n . Then, δL (cid:16) Φ ∗ − Φ (cid:0) Φ − (Φ ∗ ) − b (cid:1)(cid:17) ≤ n. (40)However, from (36) and the fact that L > nA − , the LHS of (40) is strictly greater than n , which yields thedesired contradiction. K. Extensions

We study two sets of extensions. Section K.1 concerns alternative speciﬁcations of agents’ payoffs. SectionK.2 concerns other strategies for agents’ behavioral types.

K.1 Alternative Speciﬁcations of Agents’ Payoffs

Social Preferences:

Recall that $\overline{\theta} \equiv \max\{\theta_1, \theta_2, \dots, \theta_n\}$. Suppose that agent $i$'s payoff is 0 when the principal is convicted and
$$\omega_i - b\big((1-\gamma)\theta_i + \gamma\overline{\theta}\big) - ca_i \tag{41}$$
when the principal is acquitted. Intuitively, an agent's payoff depends not only on whether the principal has committed an offense observed by the agent himself, but also on whether the principal is guilty of committing any offense. The parameter $\gamma \in [0,1]$ measures the intensity of agents' social preferences. The baseline model corresponds to $\gamma = 0$.

The agents decide whether to accuse the principal based on their beliefs about $\overline{\theta}$ after observing their own $\theta_i$. As a result, an agent's strategy in any given equilibrium is still characterized by two cutoffs: $\omega^*$ when $\theta_i = 1$ and $\omega^{**}$ when $\theta_i = 0$. Whether the principal's incentives to commit offenses are strategic complements or substitutes is still determined by the sign of $q(0,0) + q(1,1) - q(0,1) - q(1,0)$. When $L$ is large, two accusations are needed to convict the principal with positive probability and the principal's decisions to commit offenses are strategic substitutes. This induces negative correlation in agents' private information.

Agents' reporting cutoffs are given by
$$\omega^* = b + c - \frac{c}{q_m\Psi^{**}} \tag{42}$$
and
$$\omega^{**} = c + \frac{b\Psi^*(1-\beta)\gamma}{\beta\Psi^{**} + (1-\beta)\Psi^*} - \frac{c}{q_m\big(\beta\Psi^{**} + (1-\beta)\Psi^*\big)}, \tag{43}$$
where $q_m \in (0,1)$ is the probability of conviction when there are two accusations, $\Psi^* \equiv \delta\Phi(\omega^*) + (1-\delta)\alpha$, $\Psi^{**} \equiv \delta\Phi(\omega^{**}) + (1-\delta)\alpha$ and $\beta$ is the probability of $\theta_j = 0$ conditional on $\theta_i = 0$. Compared to the baseline model, (43) contains the new term
$$\frac{b\Psi^*(1-\beta)\gamma}{\beta\Psi^{**} + (1-\beta)\Psi^*}, \tag{44}$$
which measures the impact of social preferences on an agent's equilibrium strategy. We have
$$0 \leq \frac{b\Psi^*(1-\beta)\gamma}{\beta\Psi^{**} + (1-\beta)\Psi^*} \leq \gamma b. \tag{45}$$
Proceeding as in Lemma 3.2 of the main text, we can show that $\omega^* - \omega^{**} \in [0, b]$. From (42) and (43), we have
$$\frac{|\omega^* - c - b|}{\Big|\omega^{**} - c - \frac{b\Psi^*(1-\beta)\gamma}{\beta\Psi^{**} + (1-\beta)\Psi^*}\Big|} = \frac{\beta\Psi^{**} + (1-\beta)\Psi^*}{\Psi^{**}} = \beta + (1-\beta)\frac{\Psi^*}{\Psi^{**}} = \frac{(l^* + 2)\mathcal{I}_m}{l^* + 2\mathcal{I}_m}, \tag{46}$$
where $\mathcal{I}_m \equiv \Psi^*/\Psi^{**}$ measures the aggregate informativeness of agents' accusations.

As $L \to +\infty$, one can show that both $\omega^*$ and $\omega^{**}$ go to $-\infty$, proceeding as in the proof of Theorem 1. Since the difference between the denominator and the numerator of the LHS of (46) is at most $b$, the value of (46) converges to 1. This implies that $\mathcal{I}_m$ converges to 1. Consequently, agents' accusations become arbitrarily uninformative and the equilibrium probability of offense converges to $\pi^*$.

Ex Post Evidence and Punishment of False Accusers:

Suppose that when an innocent principal is convicted (i.e., $\theta_1 = \dots = \theta_n = 0$ and $s = 1$), some ex post evidence arrives with probability $p^*$ that reveals his innocence and results in a penalty $l$ for every agent who has accused the principal. Our earlier analysis and results continue to apply, because this new setting is formally equivalent to increasing $b$ in the baseline setting. To see this, agent $i$'s indifference condition when $\theta_i = 0$ is now given by
$$q_m Q_0\,\omega_i = -c(1 - q_m Q_0) - q_m Q_0\,p^* l \tag{47}$$
and the lower cutoff is given by
$$\omega_m^{**} \equiv -p^* l - c\,\frac{1 - q_m Q_0}{q_m Q_0} = -p^* l + c - \frac{c}{q_m Q_0}. \tag{48}$$
This expression is the same as in the main text except that $b$ is replaced with $\tilde{b} \equiv b + p^* l$.

Intrinsic Motive for Truth Telling:

Suppose that each agent receives an extra benefit $d$, strictly less than $c$, from accusing the principal whenever he has observed an offense, regardless of whether the principal gets convicted. Agents' reporting cutoffs are now given by
$$\omega^* = b + c - \frac{c - d}{q\Psi^{**}} \quad \text{and} \quad \omega^{**} = c - \frac{c}{q\big(\beta\Psi^{**} + (1-\beta)\Psi^*\big)},$$
where $q \in (0,1)$ is the probability of conviction conditional on both agents accusing the principal, and $\beta$ is the probability that an agent who has not witnessed any offense assigns to the other agent not having witnessed any offense either. Letting $\pi$ denote the equilibrium probability that an offense takes place, we have
$$\beta = \frac{1 - \pi}{1 - \pi/2}.$$
Let $l^* \equiv \pi^*/(1 - \pi^*)$ and $\mathcal{I} \equiv \Psi^*/\Psi^{**}$. We have $\beta = \frac{2\mathcal{I}}{l^* + 2\mathcal{I}}$. The principal's incentive constraint is given by
$$\frac{1}{L} = q\Psi^{**}(\Psi^* - \Psi^{**}).$$
Therefore, as $L \to +\infty$, both $\omega^*$ and $\omega^{**}$ go to $-\infty$. The expressions for agents' reporting cutoffs yield
$$\frac{c + b - \omega^*}{c - d} = \frac{1}{q\Psi^{**}} \quad \text{and} \quad \frac{c - \omega^{**}}{c} = \frac{1}{q\big(\beta\Psi^{**} + (1-\beta)\Psi^*\big)}.$$
Therefore,
$$\frac{(c + b - \omega^*)/(c - d)}{(c - \omega^{**})/c} = \frac{(l^* + 2)\mathcal{I}}{l^* + 2\mathcal{I}}.$$
In what follows, we show that as $L \to +\infty$, the LHS of the previous equation is bounded between 1 and $\frac{c}{c-d}$. The lower bound is straightforward to derive. To derive the upper bound $\frac{c}{c-d}$, notice that since $\omega^* > \omega^{**}$ and $\Psi^* > \Psi^{**}$, we have
$$\frac{c + b - \omega^*}{c - d} > \frac{c - \omega^{**}}{c} > \frac{c - \omega^*}{c}.$$
We consider two cases. First, when $\omega^* > \omega^{**} + b$, we have
$$\frac{c + b - \omega^*}{c - \omega^{**}} < \frac{c - \omega^{**}}{c - \omega^{**}} = 1, \quad \text{so that} \quad \frac{(c + b - \omega^*)/(c - d)}{(c - \omega^{**})/c} < \frac{c}{c - d}.$$
Second, when $\omega^* \leq \omega^{**} + b$, both $\omega^*$ and $\omega^{**}$ go to $-\infty$ and
$$\frac{c + b - \omega^*}{c - \omega^{**}} \to 1, \quad \text{which implies that} \quad \frac{(c + b - \omega^*)/(c - d)}{(c - \omega^{**})/c} \to \frac{c}{c - d}.$$
Therefore, we have
$$1 \leq \frac{(l^* + 2)\mathcal{I}}{l^* + 2\mathcal{I}} \leq \frac{c}{c - d}$$
as $L \to +\infty$, which yields an upper bound on the informativeness ratio $\mathcal{I}$. This upper bound is nontrivial if and only if $l^* > \frac{c}{c-d}$, which requires $d$ to be sufficiently small.

K.2 Alternative Behavioral Types

We examine the robustness of our findings against alternative specifications of behavioral types' strategies. We allow the behavioral types' accusations to be informative about the principal's offenses and show that when behavioral types are rare and the principal's loss from being convicted is sufficiently large, the informativeness of accusations converges to 1 and the principal's probability of guilt converges to $\pi^*$, as in the baseline model. We focus on comparing outcomes with one agent and two agents.

K.2.1 Model and Result

Consider the following modiﬁcation of the baseline model. With probability δ ∈ (0 , , the agent is a strategictype who maximizes payoff function given by (2.4) in the main text. With probability − δ , the agent is abehavioral type whose reporting cutoff is ω when θ i = 1 and ω when θ i = 0 . We assume that both ω and ω areﬁnite and that ω ≥ ω , That is, the behavioral type’s accusation can be informative about θ . When there is only one agent, the agent’s reporting cutoffs ω ∗ s and ω ∗∗ s are given by (3.1) and (3.2),respectively. The probability that the principal is convicted following one accusation is q s , where the tuple Our analysis also applies when behavioral types are using arbitrary strategies contingent on ( θ i , ω i ) , as long as conditional on eachrealization of θ i , the probability that the behavioral type accuses the principal is interior, and is weakly higher when θ i = 1 than when θ i = 0 . q s , ω ∗ s , ω ∗∗ s ) satisﬁes q s (cid:16) δ (Φ( ω ∗ s ) − Φ( ω ∗∗ s )) + (1 − δ )(Φ( ω ) − Φ( ω )) (cid:17) = 1 /L. (49)One can show that when δ → and L is larger than some cutoff L ( δ ) , the informativeness of the agent’saccusation I s ≡ δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) δ Φ( ω ∗∗ s ) + (1 − δ )Φ( ω ) diverges to + ∞ . That is, the agent’s accusation becomes arbitrarily informative in the limit.In the two-agent case, for every i ∈ { , } , agent i ’s probability of accusing the principal is Ψ ∗ ≡ δ Φ( ω ∗ m )+(1 − δ )Φ( ω ) conditional on θ i = 1 and Ψ ∗∗ ≡ δ Φ( ω ∗∗ m ) + (1 − δ )Φ( ω ) conditional on θ i = 0 . A strategicagent’s reporting cutoffs are given by ω ∗ m ≡ b + c − cq m Ψ ∗∗ and ω ∗∗ m ≡ c − cq m (cid:16) β Ψ ∗∗ + (1 − β )Ψ ∗ (cid:17) . (50)Let I m ≡ Ψ ∗ / Ψ ∗∗ . When L is large enough, the conviction probabilities in every equilibrium must satisfy q (0 ,

0) = q (0 ,

1) = q (1 ,

0) = 0 and q (1 , ∈ (0 , . Therefore, the expressions for β and − β are the sameas in (3.15). The distance between the two cutoffs is given by ω ∗ m − ω ∗∗ m = b − cq m (1 − β )( I m − ∗∗ ( β + (1 − β ) I m ) = b − cq m Ψ ∗∗ l ∗ l ∗ I m − I m . (51)This implies ω ∗ m − ω ∗∗ m < b , because for ω ∗ m − ω ∗∗ m to weakly exceed b , we need I m ≤ , which can only betrue when ω ∗ m ≤ ω ∗∗ m , leading to a contradiction.In contrast to the baseline model, when behavioral types’ accusations are informative about the principal’sinnocence, the strategic type agents’ coordination motives can reverse the ordering between the two cutoffs.That is to say, ω ∗ m can be strictly smaller than ω ∗∗ m in equilibrium. As a result, the argument showing that I m → when ω ∗ m → −∞ in Lemma 3.3 of the main text no longer applies. In principle, ω ∗ m could be muchsmaller than ω ∗∗ m , and the ratio between the absolute values in (3.17) could converge to a limit strictly above as ω ∗ m and ω ∗∗ m diverge to −∞ . To circumvent this problem, we take an alternative approach based on thecomparison between ω ∗ m and ω ∗ s to establish the following proposition. Proposition K.

There exists L : R + × (0 , → R + such that an equilibrium exists for each L > L ( c, δ ) .Compared to the single-agent benchmark, q m > q s , ω ∗ m > ω ∗ s and ω ∗∗ m > ω ∗∗ s .Moreover, as δ → and L → + ∞ with the relative speed of convergence satisfying L ≥ L ( c, δ ) , we have ω ∗ m , ω ∗∗ m → −∞ , I m → and π m → π ∗ . regular case where ω ∗ m ≥ ω ∗∗ m ,one can still apply the ratio condition (3.17) to show that as ω ∗ m → −∞ , the LHS converges to which impliesthat I m → . In the irregular case where ω ∗ m < ω ∗∗ m , the distance between | ω ∗ m − b − c | and | ω ∗∗ m − c | can bestrictly larger than b and can explode as ω ∗ m → −∞ . However, since ω ∗ m > ω ∗ s and the informativeness in thesingle-agent benchmark grows without bound as L → + ∞ , it places an upper bound on the informativeness ofaccusations in the two-agent scenario. Since informativeness is entirely contributed by the behavioral types inthe irregular case, the value of the aforementioned upper bound converges to as I s → + ∞ . Summing up thetwo cases together, we know that the agents’ accusations are arbitrarily uninformative in the limit even whenthe behavioral types’ accusations are informative. K.2.2 Proof of Proposition K

We start by comparing between the single-agent and two-agent cases when behavioral types’ accusations canbe informative about θ , captured by the two exogenous reporting cutoffs ω and ω with ω ≥ ω .Suppose toward a contradiction that ω ∗ m ≤ ω ∗ s . The expressions for these cutoffs imply that q m (cid:16) δ Φ( ω ∗∗ m ) + (1 − δ )Φ( ω ) (cid:17) ≤ q s . Therefore, q m Ψ ∗∗ (cid:16) δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) − δ Φ( ω ∗∗ s ) − (1 − δ )Φ( ω ) (cid:17) ≤ q s (cid:16) δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) − δ Φ( ω ∗∗ s ) − (1 − δ )Φ( ω ) (cid:17) = 1 /L = q m Ψ ∗∗ (cid:16) δ Φ( ω ∗ m ) + (1 − δ )Φ( ω ) − δ Φ( ω ∗∗ m ) − (1 − δ )Φ( ω ) (cid:17) or, equivalently, Φ( ω ∗ m ) − Φ( ω ∗∗ m ) ≥ Φ( ω ∗ s ) − Φ( ω ∗∗ s ) . (52)Since ω ∗ m − ω ∗∗ m < b = ω ∗ s − ω ∗∗ s and ω ∗ m < ω ∗ s , we have Φ( ω ∗ m ) − Φ( ω ∗∗ m ) < Φ( ω ∗ s ) − Φ( ω ∗∗ s ) , (53)which contradicts (52). This shows that ω ∗ m > ω ∗ s . Since ω ∗ m − ω ∗∗ m < b = ω ∗ s − ω ∗∗ s , we have ω ∗∗ m > ω ∗∗ s .Moreover, ω ∗ m > ω ∗ s implies that q m Ψ ∗∗ > q s , or ≥ Ψ ∗∗ > q s /q m . This implies that q m > q s .Next, we evaluate the informativeness of agents’ accusations when there are two agents and δ and L aresufﬁciently large. First, for every X ∈ R + , there exists δ ∈ (0 , and L ∗ : ( δ, → R + such that when δ > δ L > L ∗ ( δ ) , the ﬁrst cutoff in the single-agent case satisﬁes δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) δ Φ( ω ∗ s − b ) + (1 − δ )Φ( ω ) > X, (54)which implies that δ Φ( ω ∗ s ) > (1 − δ ) (cid:16) X Φ( ω ) − Φ( ω ) (cid:17) . (55)Next, we establish an upper bound on the informativeness of accusations in the limit of the two-agent case.Consider the two-agent setting with parameter values ( L, c, δ ) such that L ≥ L ( c, δ ) , which guarantees thatthere exist proper equilibria satisfying Reﬁnements 1 and 2. In every equilibrium such that ω ∗ m ≥ ω ∗∗ m , theexpressions for ω ∗ m and ω ∗∗ m imply that | ω ∗ m − c − b || ω ∗∗ m − c | = ( l ∗ + 2) I m l ∗ + 2 I m . (56)The LHS converges to as ω ∗ m → −∞ . 
The expression on the RHS then implies that I m → .In equilibria such that ω ∗ m < ω ∗∗ m we have, since ω ∗ s < ω ∗ m , I m ≤ δ Φ( ω ∗ m ) + (1 − δ )Φ( ω ) δ Φ( ω ∗ m ) + (1 − δ )Φ( ω ) ≤ |{z} since I m > and ω ∗ m >ω ∗ s δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) δ Φ( ω ∗ s ) + (1 − δ )Φ( ω ) ≤ (1 − δ ) (cid:16) X Φ( ω ) − Φ( ω ) (cid:17) + (1 − δ )Φ( ω )(1 − δ ) (cid:16) X Φ( ω ) − Φ( ω ) (cid:17) + (1 − δ )Φ( ω ) = X Φ( ω ) X Φ( ω ) − Φ( ω ) + Φ( ω ) , (57)which also converges to as X → + ∞ .In summary, since ω ∗ m → −∞ and X → + ∞ as δ → and L → + ∞ , the informativeness ratio I m converges to irrespective of whether ω ∗ m ≥ ω ∗∗ m or ω ∗ m < ω ∗∗ m . K.3 Principal’s Payoffs

We allow the punishment to the principal to depend on the number of offenses that he is believed to have committed, focusing on the case of two agents. The principal receives a penalty $L$ if $\Pr(\overline{\theta} = 1 \mid a) \geq \pi^*$, and a penalty $L'$ ($> L$) if $\Pr(\theta_1 = \theta_2 = 1 \mid a) \geq \pi^{**}$. The evaluator is indifferent between both sentences at the cutoff $\pi^{**}$. If the judge's posterior belief after observing $a$ meets both of these requirements, the principal receives a penalty $L''$ that is at least $\max\{L, L'\}$.

In every proper equilibrium that satisfies Refinements 1 and 2, each agent's strategy takes the form of two cutoffs. For $i \in \{1, 2\}$, let $\Psi_i^*$ be the probability that agent $i$ accuses the principal when $\theta_i = 1$ and $\Psi_i^{**}$ be the corresponding probability when $\theta_i = 0$. Our refinements imply that $\Psi_i^* > \Psi_i^{**}$ for $i \in \{1, 2\}$. From the principal's perspective, whether $\theta_1$ and $\theta_2$ are strategic complements or strategic substitutes depends only on the expected penalty under each $a$. Let $P : \{0,1\}^2 \to [0, +\infty)$ be the mapping from $a$ to expected penalties. The principal's decisions are strategic complements if
$$P(1,1) + P(0,0) < P(1,0) + P(0,1) \tag{58}$$
and are strategic substitutes otherwise. We show the following lemma.

Lemma K. For every $a$ and $a'$ such that $a > a'$,
$$\Pr\big(\overline{\theta} = 1 \,\big|\, a\big) > \Pr\big(\overline{\theta} = 1 \,\big|\, a'\big), \tag{59}$$
and
$$\Pr\big(\theta_1 = \theta_2 = 1 \,\big|\, a\big) > \Pr\big(\theta_1 = \theta_2 = 1 \,\big|\, a'\big). \tag{60}$$

Lemma K implies that if $P(0,1)$ or $P(1,0)$ is strictly positive, then every proper equilibrium that satisfies Refinements 1 and 2 must also satisfy $P(1,1) \geq L$. This property comes from the fact that when the equilibrium is proper and $L$ is large enough, every agent has witnessed an offense with strictly positive probability. As in the proof of Theorem 1 in Online Appendix F, this leads to a uniform lower bound on the marginal increase in the expected probability of receiving punishment when the principal commits one extra offense.

Proof.

The second inequality follows from Lemma 3.2. For the ﬁrst inequality, it sufﬁces to compare thefollowing two ratios: I ≡ Pr( a = a = 1 | θ = θ = 1)Pr( a = a = 1 | P i =1 θ i ≤ and I ≡ Pr( a = 0 , a = 1 | θ = θ = 1)Pr( P i =1 a i = 1 | P i =1 θ i ≤ . Let p be the probability that ( θ , θ ) = (0 , conditional on ( θ , θ ) = (1 , , p be the probability that ( θ , θ ) = (1 , conditional on ( θ , θ ) = (1 , , and p be the probability that ( θ , θ ) = (0 , conditional on ( θ , θ ) = (1 , . We have I − = p Ψ ∗∗ Ψ ∗ Ψ ∗∗ Ψ ∗ + p Ψ ∗∗ Ψ ∗ + p Ψ ∗∗ Ψ ∗ , and I − = p − Ψ ∗∗ − Ψ ∗ Ψ ∗∗ Ψ ∗ + p Ψ ∗∗ Ψ ∗ + p − Ψ ∗∗ − Ψ ∗ . Ψ ∗ i > Ψ ∗∗ i for every i ∈ { , } , we have I − < I − , or, equivalently, I < I . L. Distribution of Payoff Shocks

We revisit the specification of the distribution of agents' payoff shock $\omega_i$. First, we motivate our model's assumption that the support of $\omega_i$ is unbounded below. Second, we explore the robustness of our results under alternative distributions of $\omega_i$.

Unbounded Support:

The assumption that $\omega_i$ is unbounded from below is equivalent to the following property: under every conviction rule $q : \{0, 1\}^n \to [0, 1]$ that is responsive to each agent's accusation, that is,

(*) for every $i \in \{1, \dots, n\}$, there exists $a_{-i} \in \{0, 1\}^{n-1}$ such that $q(1, a_{-i}) \neq q(0, a_{-i})$,

each strategic-type agent accuses the principal with strictly positive probability.

Recall that (*) is required by Refinement 2. When $L$ is large, the conviction probability $q(\cdot)$ is low for all $a$. Therefore, the unbounded support assumption ensures that (strategic) victims accuse the principal with positive probability in all equilibria that satisfy Refinement 1 when $L$ is large enough.

Next, we show that under the alternative assumption that the support of $\omega_i$ is bounded below, all symmetric equilibria of the game violate Refinement 1. As noted earlier, these equilibria contradict the key principle that defendants should not be convicted based on the judge's prior belief:

Lemma L. If $\omega_i \geq \underline{\omega}$ with probability 1 for some $\underline{\omega} \in \mathbb{R}$, there exists $\bar{L} \in \mathbb{R}_+$ such that for every $L \geq \bar{L}$, there exists no symmetric equilibrium that satisfies Refinement 1.

Proof. We rule out equilibria in which the probability of offense is 0 or interior. First, suppose toward a contradiction that there exists an equilibrium in which the probability of offense is 0. Since a behavioral agent accuses the principal with interior probability, the posterior probability of guilt is 0 for every report profile $a$, and the principal is never punished. The principal thus has a strict incentive to commit offenses, which leads to a contradiction. Next, suppose toward a contradiction that there exists an equilibrium in which the probability of offense is interior. For every $L \in \mathbb{R}_+$, there exists $q_L$ such that

$$\max_{a \in \{0, 1\}^n} q(a) \leq q_L. \tag{61}$$

Since the principal is indifferent between committing one and zero offenses, $q_L \to 0$ as $L \to +\infty$. When $\theta_i = 1$, agent $i$'s reporting cutoff is bounded above by

$$\omega^{*}_i \leq b + c - \frac{c}{q_L}. \tag{62}$$

When $L$ is large enough for the RHS of (62) to be less than $\underline{\omega}$, agent $i$ accuses the principal with probability 0 regardless of $\theta_i$. This implies that the principal has a strict incentive to commit offenses and leads to a contradiction.

Alternative Distributions:

Let $\Phi$ be the cdf of $\omega_i$. We introduce two properties of $\Phi$:

Definition 1. $\Phi$ is regular if it admits a continuous density and there exists $\bar{\omega} \in \mathbb{R}$ such that
1. the support of $\Phi$ contains $(-\infty, \bar{\omega})$;
2. the density is strictly increasing when $\omega \leq \bar{\omega}$;
3. $\Phi(\omega + b) / \Phi(\omega)$ is non-increasing in $\omega$ for some $b \in \mathbb{R}_+$ when $\omega < \bar{\omega}$.

$\Phi$ has thin left tail if there exists $b \in \mathbb{R}_+$ such that

$$\lim_{\omega \to -\infty} \Phi(\omega + b) / \Phi(\omega) = +\infty. \tag{63}$$

Our result in the single-agent benchmark (Proposition 1) and our results on DPP (Theorems 2 and 2') require that $\Phi$ be regular and have a thin left tail. Our insights that APP hurts witness credibility and harms deterrence, namely Theorems 1 and 1', remain valid for all distributions with support that is unbounded below. The comparative statics results (Proposition 2 and Theorem 3) require $\Phi$ to be regular. The second requirement of regularity