Probabilistic Verification in Mechanism Design
Ian Ball†  Deniz Kattwinkel‡

August 16, 2019
Abstract
We introduce a model of probabilistic verification in a mechanism design setting. The principal verifies the agent's claims with statistical tests. The agent's probability of passing each test depends on his type. In our framework, the revelation principle holds. We characterize whether each type has an associated test that best screens out all the other types. In that case, the testing technology can be represented in a tractable reduced form. In a quasilinear environment, we solve for the revenue-maximizing mechanism by introducing a new expression for the virtual value that encodes the effect of testing.
Keywords: probabilistic verification, testing, revelation principle, ordering tests, evidence.
JEL Codes: D82, D86.

∗We thank our advisors, Dirk Bergemann and Stephan Lauermann, for continual guidance. For helpful discussions, we thank Tilman Börgers, Marina Halac, Johannes Hörner, Navin Kartik, Andreas Kleiner, Daniel Krähmer, Bart Lipman, Benny Moldovanu, Jacopo Perego, Larry Samuelson, Sebastian Schweighofer-Kodritsch, Philipp Strack, Roland Strausz, and Juuso Välimäki. We are grateful for funding from the German Research Foundation (DFG) through CRC TR 224 (Project B04). An extended abstract for this paper appeared in EC '19.
†Department of Economics, Yale University, [email protected].
‡Department of Economics, University of Bonn, [email protected].

1 Introduction
In the standard mechanism design paradigm, the principal can commit to an arbitrary mechanism, which induces a game among the agents. There are no explicit constraints on the mechanism, but there is an implicit assumption that the outcome of the induced game does not depend directly on the agents' types. Thus, each type can freely mimic every other type. The principal learns an agent's type only if the mechanism makes it optimal for that agent to reveal it.

In practice, claims about private information are often verified. If an employee applies for disability benefits, the provider performs a medical exam to assess the employee's condition. If a driver makes an insurance claim after a car accident, the insurer checks the claim against a police report. If a consumer reports his income to a lender, the lender requests a monthly pay stub for confirmation. In these examples, verification is noisy—medical tests are imperfect, witnesses fallible, and pay stubs incomplete.

The seminal paper of Green and Laffont (1986) first incorporates partial verification into mechanism design. In a principal–agent setting, they illustrate how verification relaxes incentive compatibility and hence makes more social choice functions implementable. In their model, mechanisms are direct and each type faces an exogenous restriction on which reports he can send to the principal. The following interpretation is suggested. As a function of the agent's type, certain reports are always detected as false while other reports never are. The punishment for detection is prohibitively costly, so the agent chooses among the reports he can send without being detected.

Partial verification cannot capture the noisiness of real verification. Under partial verification, the agent knows with certainty which reports will be detected as false. But in many applications, each report is detected with an associated probability.
Thus, agents trade off the benefits of successful misreporting against the risk of detection.

In this paper, we model probabilistic verification by endowing a principal with a stochastic testing technology. We consider a principal–agent setting. The agent has a private type. The principal has full commitment power and controls decisions (which may or may not include transfers). To this standard setting, we add a family of pass–fail tests. The principal elicits a message from the agent and then conducts a test. The agent sees the test and privately chooses whether to exert effort, which is costless. If he exerts effort, then his passage probability depends on his type and on the test; if he does not exert effort, then he fails with certainty. The principal observes the result of the test—but not the agent's effort—and then takes a decision.

We analyze which social choice functions can be implemented with a given testing technology. We reduce the class of mechanisms in two steps.

First, we simplify communication. Since testing does not intrude into the communication stage, we get a version of the revelation principle (Theorem 1): There is no loss in restricting attention to direct mechanisms that induce the agent to report truthfully and to exert effort on every test. We contrast our result with the failure of the revelation principle in the setting of Green and Laffont (1986).

Second, we simplify the choice of tests. In general, which test is best for verifying a particular type report depends on which types the principal would like to screen away. We introduce for each type θ an associated order over tests. One test is more θ-discerning than another if it can better screen all other types away from type θ.
This family of orders is the appropriate analogue of Blackwell's (1953) informativeness order for our testing setting. These discernment orders provide a unified generalization of various conditions imposed in the deterministic verification literature, such as nested range (Green and Laffont, 1986), full reports (Lipman and Seppi, 1995), and normality (Bull and Watson, 2007).

A function assigning to each type θ a most θ-discerning test is a most-discerning testing function. The sufficiency part of our main implementation result (Theorem 2) states: If there exists a most-discerning testing function, then every implementable social choice function can be implemented using that testing function. In this case, the testing technology induces an authentication rate, which specifies the probabilities with which each type can pass the test assigned to each other type. The principal's problem reduces to an optimization over decision rules, subject to incentive constraints involving the authentication probabilities.

The reduction from a testing technology to an authentication rate can be inverted. We provide a necessary and sufficient condition for an authentication rate to be induced by a most-discerning testing function. In applications, we can directly specify an authentication rate that satisfies our condition—testing need not be modeled explicitly. If the authentication rate takes values 0 and 1 only, then our condition reduces to the nested range condition that Green and Laffont (1986) use to recover the revelation principle.

We are the first to analyze verification with the first-order approach. Partial verification is not amenable to this approach because the authentication probability jumps discontinuously from 0 to 1. Under probabilistic verification, the authentication rate can depend continuously on the agent's report, so each local constraint is loosened but not eliminated.
In a quasilinear environment, we aggregate these loosened local constraints to derive a virtual value that encodes the testing technology.

We use this virtual value to solve for revenue-maximizing mechanisms in three classical settings: nonlinear pricing, selling a single good, and auctions. With verification, the revenue-maximizing allocations have their usual expressions, except that our virtual value appears in place of the classical virtual value. The associated transfers are higher in the presence of verification. If the tests are completely uninformative, then our virtual value equals the classical virtual value. As the tests become more precise, our virtual value increases toward the agent's true value, and the revenue-maximizing allocation becomes more efficient. When selling a single good, a posted price is not optimal. Instead, the price depends on the agent's report. To study auction settings, we extend the model to allow for competing agents who submit reports and are tested separately. A virtual value is defined for each agent. The revenue-maximizing auction allocates the good to the agent whose virtual value is highest.

Finally, we consider tests with more than two results. Our discernment orders naturally extend. As in the baseline model, if there exists a most-discerning testing function, then there is no loss in using that testing function only.

The rest of the paper is organized as follows. Section 2 presents our model of testing. Section 3 establishes the revelation principle. Section 4 introduces the discernment orders and states the most-discerning implementation result. Section 5 reduces the verification technology to an authentication rate. Section 6 considers applications to revenue maximization. We extend the model to allow for multiple agents in Section 7 and nonbinary tests in Section 8. Section 9 connects our model to previous models of verification in economics and computer science; other relevant literature is referenced throughout the text.
Section 10 concludes. Measure-theoretic definitions are in Appendix A. Proofs are in Appendices B and C.

2 Model

There are two players: a principal (she) and an agent (he). The agent draws a private type θ ∈ Θ from a commonly known distribution. The principal takes a decision x ∈ X. Preferences depend on the decision x and on the agent's type θ. The Bernoulli utility functions for the agent and the principal are u: X × Θ → R and v: X × Θ → R. A social choice function, denoted f: Θ → ∆(X), assigns a decision lottery to each type. Decisions are completely abstract; they may or may not include transfers.

We make the following standing technical assumptions. Each set is a Polish space endowed with its Borel σ-algebra. The space of Borel probability measures on a Polish space Z is denoted ∆(Z). All primitive functions are measurable; social choice functions and other measure-valued maps satisfy a weaker measurability condition, universal measurability (see Appendix A).

2.2 Verification technology

To the principal–agent setting we add a verification technology. There is a set T of tests, with generic element τ. Each test generates a binary result—pass or fail, denoted 1 or 0. The distribution of test results is determined by the passage rate π: T × Θ → [0, 1], which assigns to each pair (τ, θ) the probability with which type θ can pass test τ. The passage rate π is common knowledge, as is the rest of the setting, except the agent's private type.

The principal can conduct one test from the set T. The procedure is as follows. First the principal selects a test τ. Next the agent observes τ and chooses whether or not to exert effort. Effort is costless. If the agent exerts effort, nature draws the test result 1 with probability π(τ|θ) and the test result 0 otherwise. If the agent does not exert effort, the test result is 0 with certainty. The principal observes the test result, but not the agent's effort choice.

Effort is an inalienable choice of the agent, as in models of evidence in which the agent chooses whether to produce the evidence he possesses (Bull and Watson, 2007).
Indeed, if the passage rate takes values 0 and 1 only, then each test can be interpreted as a request for a particular piece of hard evidence; see Example 2. In general, the passage rate takes interior values, so the agent is unsure whether he will pass each test. The agent can, however, intentionally fail each test by exerting no effort. The active role played by the agent and the resulting asymmetry between passage and failure are what distinguish tests from statistical experiments.
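The testing procedure is easy to state in code. The following sketch is our own illustration, not part of the paper's formal apparatus; the dictionary encoding of the passage rate π and all names are hypothetical. A binary result is drawn from π(τ|θ) only if the agent exerts effort:

```python
import random

def run_test(pi, tau, theta, effort, rng):
    """Draw a test result: if the agent exerts effort, pass (1) with
    probability pi[(tau, theta)]; without effort, fail (0) for sure."""
    if not effort:
        return 0
    return 1 if rng.random() < pi[(tau, theta)] else 0

# A deterministic test that only the "sick" type can pass.
pi = {("exam", "healthy"): 0.0, ("exam", "sick"): 1.0}
rng = random.Random(0)
```

The asymmetry in the model shows up directly: effort makes the result stochastic, while no effort forces failure regardless of type.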
The principal commits to a mechanism, which induces an extensive-form game that proceeds as follows. The agent sends a message to the principal. Based on the message, the principal selects a test. The agent observes the test and chooses whether to exert effort. Nature draws the test result, as prescribed by the passage rate and the agent's effort. The principal observes this result and takes a decision. In view of this timing, we define mechanisms and strategies.

We consider nonbinary tests in Section 8; our main results go through. The same procedure is used in Deb and Stewart (2018). They allow the principal to conduct tests (termed tasks) sequentially before making a binary classification. In DeMarzo et al. (2019), a seller of an asset can conduct a test of the asset's quality. Each test has a null result, which the seller can always claim to have received. If there is only one non-null result, then this technology is equivalent to ours.

Figure 1.
Social choice function induced by a profile
Definition 1 (Mechanism). A mechanism (M, t, g) consists of a message space M together with a testing rule t: M → ∆(T) and a decision rule g: M × T × {0, 1} → ∆(X).

Once the principal commits to a mechanism, the agent faces a dynamic decision problem. First he sends a message. Then, after observing the selected test, he chooses effort.
Definition 2 (Strategy). A strategy (r, e) for the agent consists of a reporting strategy r: Θ → ∆(M) and an effort strategy e: Θ × M × T → [0, 1], specifying the probability of exerting effort.

A mechanism (M, t, g) and a strategy (r, e) together constitute a profile, which induces a social choice function by composition, as indicated in Figure 1. The functions e, π, and g are represented by dashed arrows as a reminder that these functions depend on histories, not just their source sets in the diagram. Using the notation for composition from Markov processes—think of a probability row vector right-multiplied by stochastic matrices—the induced social choice function f is (r × t × (eπ))g. The map r × t × (eπ) from Θ to ∆(M × T × {0, 1}) is applied before the map g. For measure-theoretic definitions of these operations, see Appendix A.1.

The players use expected utility to evaluate lotteries over decisions. A profile (M, t, g; r, e) is incentive compatible if the strategy (r, e) is a best response to the mechanism (M, t, g), i.e., the strategy (r, e) maximizes the agent's utility over all strategies in the game induced by the mechanism (M, t, g). A profile (M, t, g; r, e) implements a social choice function f if (M, t, g; r, e) is incentive compatible and f = (r × t × (eπ))g. A social choice function is implementable if there exists a profile that implements it.

Figure 2.
Feasible reports
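For finite spaces, the composition f = (r × t × (eπ))g is just repeated marginalization over messages, tests, and results. The sketch below is our own illustration with hypothetical dictionary encodings (strategies and rules as finite distributions); it is not the paper's measure-theoretic construction:

```python
def induced_scf(types, r, t, e, pi, g):
    """Compose reporting strategy r, testing rule t, effort strategy e,
    passage rate pi, and decision rule g into the induced social choice
    function f : type -> distribution over decisions."""
    f = {}
    for th in types:
        dist = {}
        for m, p_m in r[th].items():
            for tau, p_t in t[m].items():
                # Result is 1 with probability e * pi, else 0.
                p_pass = e[(th, m, tau)] * pi[(tau, th)]
                for s, p_s in ((1, p_pass), (0, 1.0 - p_pass)):
                    for x, p_x in g[(m, tau, s)].items():
                        dist[x] = dist.get(x, 0.0) + p_m * p_t * p_s * p_x
        f[th] = dist
    return f

# One type, one message, one test: pass w.p. 0.7, get the good iff pass.
f = induced_scf(
    types=["hi"],
    r={"hi": {"m": 1.0}},
    t={"m": {"tau": 1.0}},
    e={("hi", "m", "tau"): 1.0},
    pi={("tau", "hi"): 0.7},
    g={("m", "tau", 1): {"good": 1.0}, ("m", "tau", 0): {"nothing": 1.0}},
)
```

Reading the nested loops left to right reproduces the row-vector-times-stochastic-matrix reading of (r × t × (eπ))g in the text.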
We now begin our analysis of which social choice functions can be implemented, given a testing technology. The space of incentive-compatible profiles is large. We reduce this space by establishing a version of the revelation principle. First we revisit the failure of the revelation principle in Green and Laffont's (1986) model of partial verification.
Example 1 (Partial verification, Green and Laffont, 1986). The agent has one of three types, labeled θ1, θ2, and θ3. Type θ1 can report θ1 or θ2; type θ2 can report θ2 or θ3; and type θ3 can report only θ3. This correspondence is represented as a directed graph in Figure 2. Each type is a node, and edges connect each type to each of his feasible reports.

The principal chooses whether to allocate a single good to the agent, who prefers to receive it, no matter his type. Consider the social choice function that allocates the good if and only if the agent's type is θ2 or θ3. To truthfully implement this allocation, the principal must allocate the good if the agent reports θ2. But then type θ1 can report θ2 in order to get the good. Therefore, this allocation cannot be implemented truthfully. It can, however, be implemented untruthfully: The principal allocates the good if and only if the agent reports θ3. Types θ2 and θ3 both report θ3, but type θ1 cannot.

What goes wrong in this example? In the partial verification model, reports are not cheap-talk messages. Instead, each report serves as a test that certain types can pass. Truthful implementation implicitly assigns to each type θ the test "report θ", regardless of which other types can pass that test. By contrast, our testing framework separates communication from verification. Reports retain their usual meaning and thus the revelation principle holds.

In the standard mechanism design setting, the revelation principle states that there is no loss in restricting to direct and truthful profiles. In our testing framework, we demand also that the agent exert effort on every test.

Definition 3 (Canonical). A profile (M, t, g; r, e) is canonical if (i) M = Θ, (ii) r = id, and (iii) e(θ, θ, τ) = 1 for all θ ∈ Θ and τ ∈ T.

Here the identity, id, maps each type θ to the point mass δθ in ∆(Θ). Condition (i) says that the mechanism is direct. Condition (ii) says that the agent reports truthfully. Condition (iii) says that after reporting truthfully the agent exerts effort on every test. (We sometimes identify a map y ↦ f(y) from Y to Z with the map y ↦ δf(y) from Y to ∆(Z).) A social choice function is canonically implementable if it can be implemented by a canonical profile. Our revelation principle states that there is no loss in restricting to canonical implementation.
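The logic of Example 1 can be verified by brute force over allocation rules. The sketch below is our own encoding of that example (labels t1, t2, t3 are hypothetical stand-ins for the three types), not part of the paper:

```python
# Feasible reports: t1 -> {t1, t2}, t2 -> {t2, t3}, t3 -> {t3}.
feasible = {"t1": {"t1", "t2"}, "t2": {"t2", "t3"}, "t3": {"t3"}}
target = {"t1": 0, "t2": 1, "t3": 1}  # allocate iff the type is t2 or t3

def implements(allocate_on):
    """Each type best-responds: he gets the good iff some feasible
    report of his lies in the set of reports the principal rewards."""
    return all(
        int(any(m in allocate_on for m in msgs)) == target[th]
        for th, msgs in feasible.items()
    )
```

The truthful rule rewards reports t2 and t3 and fails because t1 can mimic t2; the untruthful rule rewarding only t3 succeeds, exactly as in the example.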
Theorem 1 (Revelation principle)
Every implementable social choice function is canonically implementable.
The structure of the proof is similar to that of the standard revelation principle. Given an arbitrary profile (M, t, g; r, e) that implements a social choice function f, we construct a canonical profile that also implements f. Play proceeds as follows. The agent truthfully reports his type θ. The principal feeds this report θ into the reporting strategy r to draw a message m, which is then passed to the testing rule t to draw a test τ. The agent exerts effort, so nature draws the test result 1 with probability π(τ|θ) and the test result 0 otherwise. If the test result is 0, the principal feeds 0 into the decision rule g. If the test result is 1, the principal feeds into g the result 1 with probability e(θ, m, τ) and the result 0 otherwise. Therefore, the input to g is 1 with probability π(τ|θ)e(θ, m, τ).

This canonical profile induces f. To check that this profile is incentive compatible, we show that for any deviation in the canonical mechanism, there is a corresponding deviation in the original mechanism that induces the same (stochastic) decision. The original profile is incentive compatible, so this deviation cannot be profitable. Suppose that the agent reports θ′ and then follows the strategy of exerting effort with probability ẽ(τ) on each test τ. The agent can get the same decision in the original mechanism by playing as follows. Feed θ′ into the strategy r to draw a message m to send to the principal. If test τ is conducted, exert effort with probability ẽ(τ)e(θ′, m, τ).

By separating communication from testing, we recovered the revelation principle. Conceptually, our testing framework elucidates the role of verification in eliciting private information. Computationally, our progress is less clear. The complexity of untruthful reporting has been replaced by the complexity of testing rules. Reducing this class of testing rules is what we turn to next.
The agent's off-path behavior can be ignored because the best-response condition is in strategic form. Alternatively, condition (iii) could be strengthened to require that e(θ, θ′, τ) = 1 for all θ, θ′ ∈ Θ and τ ∈ T. The results would not change. In the proof, we use a more general procedure that works also for nonbinary tests. Technically, this is not a mechanism but a generalized mechanism (Mertens et al., 2015, Exercise 10, p. 70) because the distribution of the randomization device—the message the principal draws—depends on the agent's type report. In the proof, we eliminate this device by applying the disintegration theorem.

4 Ordering tests
In this section, we identify a smaller class of testing rules that suffices for implementation.To define this class, we introduce a family of orders over tests.
The basic question is whether one test can always be used in place of another. More precisely, fix a type θ and tests τ and ψ. Suppose that the principal conducts test ψ on type θ. Is it always possible to replace test ψ with test τ and then adjust the decision rule so that (i) the decision for type θ is preserved, and (ii) no new deviations are introduced? The key is to convert each score on test τ into an equivalent score on test ψ. The principal feeds this converted score into the original decision rule.

A score conversion is a Markov transition k on {0, 1}, which associates to each score s in {0, 1} a measure k_s in ∆({0, 1}). A transition k on {0, 1} is monotone if k_1 first-order stochastically dominates k_0, denoted k_1 ≥_SD k_0. For each test τ and type θ, denote by π_{τ|θ} the measure on {0, 1} that puts probability π(τ|θ) on 1. When a test result is drawn from the measure π_{τ|θ} and a Markov transition k is applied, the resulting distribution is denoted π_{τ|θ}k, which can be viewed as the product of a row vector and a stochastic matrix.

Definition 4 (Discernment order). Fix a type θ. A test τ is more θ-discerning than a test ψ, denoted τ ≽_θ ψ, if there exists a monotone Markov transition k on {0, 1} satisfying:

(i) π_{τ|θ}k = π_{ψ|θ};

(ii) π_{τ|θ′}k ≤_SD π_{ψ|θ′} for all types θ′ with θ′ ≠ θ.

Conditions (i) and (ii) correspond to parts (i) and (ii) of the motivating question above. Each condition compares two testing procedures. On the left side, the agent exerts effort on test τ and his score is converted by k into a score on test ψ. On the right side, the agent exerts effort on test ψ and his score is drawn. Condition (i) says that for type θ these two procedures give the same score distribution. Condition (ii) says that for all other types the converted score distribution is first-order stochastically dominated by the unconverted score distribution on test ψ.
The conversion k is required to be monotone so that effort weakly improves the distribution of the converted score. In short, the definition ensures that the conversion k from τ-scores to ψ-scores is fair for type θ but (weakly) unfavorable for all other types.

The relation ≽_θ is reflexive and transitive. For reflexivity, take k to be the identity transition. For transitivity, compose the score conversions and note that monotone transitions preserve first-order stochastic dominance and are closed under composition. We use the notation ∼_θ for equivalence under ≽_θ, and ≻_θ for the strict part of ≽_θ. (We use score synonymously with result. Tests are pass–fail, but the language of scoring is more intuitive and our constructions immediately extend to the nonbinary case.)

The θ-discernment order resembles Blackwell's (1953) informativeness order between experiments. Given a state space Ω and a signal space S, an experiment is a map from Ω to ∆(S). In an experiment, the signal realizations are drawn exogenously by nature. A garbling is a Markov transition on S. An experiment τ is more Blackwell informative than an experiment ψ if there exists a garbling g such that

τg = ψ. (1)

To bring out the connection with the discernment orders, set Ω = Θ, and denote by π_{τ|θ′} the distribution of signals from experiment τ in state θ′. Then (1) can be expressed as

π_{τ|θ′}g = π_{ψ|θ′} for all θ′ ∈ Θ. (2)

The Blackwell order is concerned with information, not incentives. The garbled signal from experiment τ must have the same distribution as the ungarbled signal from experiment ψ, in every state of the world. No state is privileged, and no structure on the signal space is required. In contrast, the discernment orders reflect the agent's incentives to report truthfully and to exert effort. There is a family of discernment orders, one associated with each type θ.
For the distinguished type θ, the converted score on test τ must have the same distribution as the unconverted score on test ψ. For all other types—the potential deviators—the converted score on test τ need only be stochastically dominated by the score on test ψ. A conversion, unlike a garbling, is required to be monotone so that effort weakly improves the distribution of the converted score. For dominance and monotonicity to make sense, scores must be totally ordered. We are interested in maximum tests with respect to the discernment orders.
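For binary tests, a monotone transition k on {0, 1} is pinned down by two numbers a = k_0(1) and b = k_1(1) with b ≥ a, and first-order dominance on {0, 1} is just comparison of pass probabilities. Definition 4 can therefore be checked by a grid search. The sketch below is our own illustration with a hypothetical finite encoding of π (the example data anticipate the Green–Laffont tests of Example 3 below):

```python
import itertools

def more_discerning(pi, tau, psi, theta, types, grid=101):
    """Search for a monotone transition k witnessing tau >=_theta psi:
    (i) the converted pass probability matches pi(psi|theta) for theta;
    (ii) it is weakly below pi(psi|theta') for every other type."""
    def converted(a, b, th):
        p = pi[(tau, th)]
        return (1.0 - p) * a + p * b  # pass probability after conversion
    pts = [i / (grid - 1) for i in range(grid)]
    for a, b in itertools.product(pts, pts):
        if b < a:
            continue  # k must be monotone
        if abs(converted(a, b, theta) - pi[(psi, theta)]) > 1e-9:
            continue  # condition (i) fails
        if all(converted(a, b, th) <= pi[(psi, th)] + 1e-9
               for th in types if th != theta):
            return True  # condition (ii) holds
    return False

types = ["t1", "t2", "t3"]
pi = {("tau1", "t1"): 1, ("tau1", "t2"): 0, ("tau1", "t3"): 0,
      ("tau2", "t1"): 1, ("tau2", "t2"): 1, ("tau2", "t3"): 0,
      ("tau3", "t1"): 0, ("tau3", "t2"): 1, ("tau3", "t3"): 1}
```

The grid search is crude but makes the definition concrete: the witness for tau1 ≽_{t1} tau2 is the identity transition (a = 0, b = 1).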
Definition 5 (Most discerning). A test τ is most θ-discerning if τ ≽_θ ψ for all ψ ∈ T. A function t: Θ → T is most discerning if, for each type θ, the test t(θ) is most θ-discerning.

To state the implementation result, we need a few definitions. Given a type space Θ, a testing environment consists of a test set T and a passage rate π: T × Θ → [0, 1]. A decision environment consists of a decision set X and a utility function u: X × Θ → R for the agent. Given a testing rule ˆt: Θ → ∆(T), a social choice function f is canonically implementable with ˆt if there exists a decision rule g such that the mechanism (ˆt, g) canonically implements f.

Theorem 2 (Most-discerning implementation)

Fix a type space Θ and a testing environment (T, π). For a measurable testing function ˆt: Θ → T, the following are equivalent.

1. ˆt is most discerning.

2. In every decision environment (X, u), every implementable social choice function is canonically implementable with ˆt.

The implication from condition 1 to condition 2 means that a single most-discerning testing function suffices for implementation. The proof formalizes the test replacement that motivated our definition of the discernment orders. By the revelation principle (Theorem 1), there is no loss in considering only canonical profiles. Suppose that in a canonical profile, some report θ is assigned a test ψ with ψ ≠ ˆt(θ). Since ˆt(θ) ≽_θ ψ, the principal can use a score conversion to replace test ψ with test ˆt(θ), without introducing any new deviations. We perform this replacement simultaneously for every type to construct an incentive-compatible canonical profile with testing rule ˆt.

The implication from condition 2 to condition 1 means that the most-discerning property is necessary. If a testing function ˆt is not most discerning, then in some decision environment there is an implementable social choice function that cannot be canonically implemented with testing rule ˆt.
The construction is as follows. Since ˆt is not most discerning, there exist some type θ and some test τ such that ˆt(θ) ⋡_θ τ. We start with a social choice function that can be implemented by assigning test τ to type θ. To replace test τ with test ˆt(θ), we would need a score conversion from τ to ˆt(θ) that preserves the decision for type θ. But any such conversion is either nonmonotone or violates the dominance condition (ii). If the conversion is nonmonotone, then type θ can improve his passage probability by not exerting effort. If (ii) is violated, then there is some other type θ′ whose score distribution after reporting θ is improved by the conversion. In the proof, we construct a decision environment in which these deviations are profitable.

The most-discerning property takes a simple form when tests are deterministic.

Example 2 (Deterministic tests and evidence). If the passage rate π is {0, 1}-valued, then our testing framework reduces to Bull and Watson's (2007) model of hard evidence. Each test can be interpreted as a request for a piece of evidence. Define a correspondence E: Θ ↠ T by

E(θ) = {τ ∈ T : π(τ|θ) = 1}.

Type θ can provide the evidence requested by test τ if and only if τ is in E(θ). It can be shown that a test τ in E(θ) is most θ-discerning if and only if, for every test ψ in E(θ),

E⁻¹(τ) ⊆ E⁻¹(ψ).

That is, test τ is the hardest test that type θ can pass. The existence of a most-discerning testing function is equivalent to Bull and Watson's (2007) evidentiary normality condition, which in turn is equivalent to Lipman and Seppi's (1995) full reports condition.

If a most-discerning testing function does not exist, we can still reduce the class of testing rules that we need to consider.

Definition 6 (Most-discerning correspondence). A subset T′ of T is most θ-discerning if for each test ψ ∈ T there exists a test τ ∈ T′ such that τ ≽_θ ψ.
A correspondence ˆT: Θ ↠ T is most discerning if for each type θ the set ˆT(θ) is most θ-discerning.

A testing function ˆt: Θ → T is a selection from a correspondence ˆT: Θ ↠ T if ˆt(θ) ∈ ˆT(θ) for each θ ∈ Θ. We extend this notion to stochastic testing rules. A testing rule ˆt: Θ → ∆(T) is supported on a correspondence ˆT: Θ ↠ T if supp ˆt_θ ⊆ ˆT(θ) for each θ ∈ Θ.

The next result says that if a correspondence is most discerning, then we can restrict attention to the testing rules supported on that correspondence. To avoid measurability problems, we impose additional regularity conditions.
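When the passage rate is {0, 1}-valued, Example 2's criterion reduces the search for most θ-discerning tests to set inclusions on E⁻¹. A sketch for finite sets (our own encoding; labels are hypothetical):

```python
def most_discerning_tests(E, types, tests):
    """For evidence sets E[theta], return for each type the tests in
    E[theta] whose set of passing types is contained in that of every
    other test the type can pass (the criterion of Example 2)."""
    E_inv = {t: {th for th in types if t in E[th]} for t in tests}
    return {th: [t for t in E[th]
                 if all(E_inv[t] <= E_inv[s] for s in E[th])]
            for th in types}

# Evidence structure of the Green-Laffont example:
# t1 can pass tau1, tau2; t2 can pass tau2, tau3; t3 can pass tau3.
E = {"t1": {"tau1", "tau2"}, "t2": {"tau2", "tau3"}, "t3": {"tau3"}}
best = most_discerning_tests(E, ["t1", "t2", "t3"],
                             ["tau1", "tau2", "tau3"])
```

Here t2 has no most t2-discerning test, which is exactly the situation the most-discerning correspondence of Definition 6 and Theorem 3 is designed to handle.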
Theorem 3 (Implementation with a most-discerning correspondence)
Suppose that the passage rate π is continuous. Let ˆT be a correspondence from Θ to T with closed values and a measurable graph. If ˆT is most discerning, then for every implementable social choice function f, there exists a testing rule ˆt supported on ˆT such that f is canonically implementable with ˆt.

If ˆT is singleton-valued, then exactly one testing rule is supported on ˆT, and Theorem 3 reduces to the sufficiency part of Theorem 2. In general, Theorem 3 allows us to restrict attention to testing rules supported on a fixed correspondence ˆT. For example, suppose each type θ in some subset Θ₀ of Θ has a most θ-discerning test ˆt(θ). Take ˆT(θ) = {ˆt(θ)} for θ ∈ Θ₀ and ˆT(θ) = T for θ ∉ Θ₀.

We apply Theorem 3 to the Green–Laffont example, which we reformulate with tests.

(Argue directly from the definition using deterministic score conversions. Alternatively, apply Proposition 2, stated below. In this case, the continuity assumption on π is unnecessary. We use the regularity assumptions only to apply a measurable selection theorem, which ensures that there is a measurable way to assign each pair (θ, ψ) a test τ in ˆT(θ) satisfying τ ≽_θ ψ. If it can be shown independently that such an assignment exists, then the conclusion of Theorem 3 follows without any assumptions on π or ˆT. When ˆT is singleton-valued, there is at most one such assignment.)

Figure 3.
Passage correspondence
Example 3 (Green–Laffont with tests). Set Θ = {θ1, θ2, θ3} and T = {τ1, τ2, τ3}. Type θi can pass test τj if and only if, in Example 1, type θi can report θj. This {0, 1}-valued passage rate is represented in Figure 3 as a directed graph on Θ ∪ T. Edges connect each type to each of the tests he can pass. The discernment relations are given by

τ1 ≻_{θ1} τ2 ≻_{θ1} τ3,   τ2, τ3 ≻_{θ2} τ1,   τ3 ≻_{θ3} τ1 ∼_{θ3} τ2.

Tests τ2 and τ3 are incomparable under ≽_{θ2} because τ2 screens away type θ3 (but not θ1) and τ3 screens away θ1 (but not θ3). The following correspondence ˆT is most discerning:

ˆT(θ1) = {τ1},   ˆT(θ2) = {τ2, τ3},   ˆT(θ3) = {τ3}.

By Theorem 3, there is no loss in restricting to testing rules supported on ˆT. But we cannot assume that type θ2 is assigned test τ2—this is the crux of the original counterexample.

We provide a more practical characterization of each discernment order. Fix a type θ and tests τ and ψ. To determine whether τ is more θ-discerning than ψ, we first characterize the monotone Markov transitions k satisfying π_{τ|θ}k = π_{ψ|θ}. We parameterize these transitions as convex combinations of two extreme points.

The first extreme point is obtained by matching quantiles, much like scores are converted between the SAT and the ACT. Figure 4 shows this Markov transition, separated into two cases. In the left panel, π(τ|θ) ≥ π(ψ|θ), so a score of 0 on τ is never converted to a 1 on ψ. In the right panel, π(τ|θ) < π(ψ|θ), so a score of 1 on τ is never converted to a 0 on ψ. To formally define this transition, we construct Markov transitions that are analogues of distribution and quantile functions. Given a type θ and a test τ, the associated cumulative distribution transition F̃_{τ|θ}: {0, 1} → ∆([0, 1]) maps the scores 0 and 1 to the uniform distributions on [0, 1 − π(τ|θ)] and [1 − π(τ|θ), 1], respectively. (Compare this graph on Θ ∪ T with the graph on Θ in Figure 2.)
Figure 4.
Conversion from test τ to test ψ for type θ

The associated quantile transition Q̃_{τ|θ}: [0, 1] → ∆({0, 1}) maps points in [0, 1 − π(τ|θ)] to δ₀ and points in (1 − π(τ|θ), 1] to δ₁. The quantile-matching transition is the composition F̃_{τ|θ} Q̃_{ψ|θ}.

The second extreme point is the constant transition that maps both scores 0 and 1 to the measure π_{ψ|θ}. This transition is denoted π_{ψ|θ} as well.

Proposition 1 (Score conversion characterization)

Fix a type θ and tests τ and ψ. For a Markov transition k on {0, 1}, the following are equivalent.

1. k is monotone and π_{τ|θ} k = π_{ψ|θ}.

2. k = λ F̃_{τ|θ} Q̃_{ψ|θ} + (1 − λ) π_{ψ|θ} for some λ ∈ [0, 1].

We characterize the discernment order in terms of this parameter λ ∈ [0, 1]. For a passage rate π, define the associated failure rate π̄ by π̄ = 1 − π.

Proposition 2 (Discernment order characterization)
Fix a type θ and tests τ and ψ.

1. Suppose π(τ|θ) ≥ π(ψ|θ). We have τ ⪰_θ ψ if and only if there exists λ ∈ [0, 1] such that, for all types θ′,

[λπ(τ|θ′) + (1 − λ)π(τ|θ)] π(ψ|θ) ≤ π(ψ|θ′) π(τ|θ).   (3)
2. Suppose π(τ|θ) < π(ψ|θ). We have τ ⪰_θ ψ if and only if there exists λ ∈ [0, 1] such that, for all types θ′,

[λπ̄(τ|θ′) + (1 − λ)π̄(τ|θ)] π̄(ψ|θ) ≥ π̄(ψ|θ′) π̄(τ|θ).   (4)

Footnote: To handle an edge case in Proposition 1, we redefine Q̃_{τ|θ} to map 1 to δ₁ even if π(τ|θ) = 0. For more general definitions of these transitions and for some of their properties, see Appendix A.2.

Figure 5.
More θ-discerning with λ = 1/2

Remark. If π(τ|θ) ≥ π(τ|θ′) for all θ′, then (3) and (4) are each weakest when λ = 1, so we can equivalently require λ = 1 in the statement of Proposition 2. If, in addition, the passage rates are interior, then (3) and (4) can be expressed as

π(τ|θ)/π(τ|θ′) ≥ π(ψ|θ)/π(ψ|θ′)   and   π̄(τ|θ)/π̄(τ|θ′) ≤ π̄(ψ|θ)/π̄(ψ|θ′).

For the relation τ ⪰_θ ψ, the passage (failure) rate ratio is what matters if type θ is more likely to pass (fail) test τ than test ψ.

Example 4 (More θ-discerning with λ = 1/2). For simplicity, we consider a type θ and two tests τ and ψ such that π(τ|θ) and π(ψ|θ) are equal and nonzero. In this case, test τ is more θ-discerning than test ψ if and only if there exists λ ∈ [0, 1] such that

λπ(τ|θ′) + (1 − λ)π(τ|θ) ≤ π(ψ|θ′) for all θ′ ∈ Θ.   (5)

The passage rates for these tests are plotted in Figure 5. The type space is an interval, plotted on the horizontal axis. For test τ, the passage rate is an increasing affine function; for test ψ, the passage rate is increasing and convex. The dotted line takes the constant value π(τ|θ), and the dashed line is the average of the passage rate π(τ|·) and the constant π(τ|θ). From the graph we see that (5) is satisfied with λ = 1/2, so τ ⪰_θ ψ. Moreover, the tangency at the point (θ, π(τ|θ)) shows that 1/2 is the only λ for which (5) holds.

Footnote: Algebraically, Θ = [1/4, 3/4] and θ = 1/2. The passage rates are π(τ|θ′) = 1/4 + (1/2)(θ′ − 1/2) and π(ψ|θ′) = θ′(θ′ − 3/4) + 3/8.

Figure 6. Relative performance and scaling

Example 5 (Relative performance and scaling tests). Consider a type θ and tests τ, ψ₁, and ψ₂ whose passage rates are plotted in Figure 6. Test τ is the test that type θ is least likely to pass, but τ is more θ-discerning than ψ₁ and ψ₂. This is possible because test τ is difficult for every type. The performance of type θ relative to the other types is better on test τ than on tests ψ₁ and ψ₂.

This example also illustrates the subtle effect of scaling the passage rate. The passage rate on ψ₂ is a scaling of the passage rate on ψ₁: π(ψ₂|·) is a constant multiple, less than one, of π(ψ₁|·). Since π(ψ₁|θ) > π(ψ₂|θ), the relation ψ₁ ⪰_θ ψ₂ depends on the relative passage rates; it is satisfied. The reverse relation depends on the relative failure rates; it is not satisfied. Thus, ψ₁ ≻_θ ψ₂.

Footnote: Again, Θ = [1/4, 3/4] and θ = 1/2.

Lastly, we study equivalence with respect to each discernment order. Two tests are θ-equivalent if they are equivalent with respect to ⪰_θ. Two tests are equal if their passage rates are equal. A type θ is minimal on a test τ if π(τ|θ) ≤ π(τ|θ′) for all θ′ ∈ Θ.

Proposition 3 (θ-discernment equivalence)

Fix a type θ. Tests τ₁ and τ₂ are θ-equivalent if and only if (i) τ₁ and τ₂ are equal, or (ii) θ is minimal on τ₁ and τ₂.

If two unequal tests are θ-equivalent, then these tests have no power to screen other types away from type θ. Whichever of the two tests is used, every decision that is feasible for type θ is also feasible for all other types. Proposition 3 considers θ-equivalence for a fixed type θ. If two tests are θ-equivalent for every type θ, then their passage rates are constant, but the constants are not necessarily equal.

Testing in reduced form
If there exists a most-discerning testing function, then our framework takes a reduced form in which tests do not appear explicitly.
When the agent makes a type report, his passage probability depends on the test that the principal conducts after that report. If there is a most-discerning testing function t̂, then by Theorem 2 we can restrict attention to mechanisms in which each report θ′ is assigned the test t̂(θ′). In this case, the passage probabilities for each report are pinned down for each type.

Definition 7 (Induced authentication rate). Given a most-discerning testing function t̂: Θ → T, the authentication rate induced by t̂ is the function α: Θ × Θ → [0, 1] given by

α(θ′|θ) = π(t̂(θ′)|θ).

There can be multiple most-discerning testing functions. Each induces a different authentication rate. But Theorem 2 guarantees that every most-discerning testing function can be used to implement the same set of social choice functions. There is a corresponding equivalence between the authentication rates induced by different most-discerning testing functions. We need a few definitions. A testing environment is most discerning if it admits a most-discerning testing function. A type θ is minimal for an authentication rate α if α(θ|θ) ≤ α(θ|θ′) for all θ′ ∈ Θ. Authentication rates α₁ and α₂ are essentially equal if (i) α₁ and α₂ have the same minimal types, and (ii) α₁(θ|·) = α₂(θ|·) for all types θ that are not minimal for α₁ and α₂. Essential equality is an equivalence relation.

Proposition 4 (Essential uniqueness)
In a most-discerning testing environment, the authentication rates induced by most-discerning testing functions are all essentially equal.
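On a finite type space, minimality and essential equality are straightforward to check. A sketch (the authentication rates below are hypothetical numbers, not from the paper):

```python
def minimal_types(alpha, types):
    """Types θ with α(θ|θ) ≤ α(θ|θ') for every θ'; alpha[r][t] = α(r|t)."""
    return {th for th in types
            if all(alpha[th][th] <= alpha[th][other] + 1e-12 for other in types)}

def essentially_equal(a1, a2, types):
    """Same minimal types, and identical rows α(θ|·) at every non-minimal θ."""
    m1, m2 = minimal_types(a1, types), minimal_types(a2, types)
    if m1 != m2:
        return False
    return all(abs(a1[th][t] - a2[th][t]) < 1e-12
               for th in types if th not in m1 for t in types)

# Two rates that differ only in the row of a minimal type are essentially equal.
types = [0, 1]
a1 = {0: {0: 0.5, 1: 0.9}, 1: {0: 0.2, 1: 1.0}}
a2 = {0: {0: 0.5, 1: 0.7}, 1: {0: 0.2, 1: 1.0}}
# Type 0 is minimal for both (0.5 ≤ 0.9 and 0.5 ≤ 0.7); type 1 is not (1.0 > 0.2).
assert essentially_equal(a1, a2, types)
```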
Hereafter, we identify authentication rates that are essentially equal, so we speak of the authentication rate induced by a most-discerning testing environment.
Given a most-discerning testing environment (T, π), we reformulate the principal's problem in terms of the authentication rate α induced by (T, π). When the agent reports θ′, he is authenticated if he passes the associated most θ′-discerning test. A reduced-form mechanism consists of functions g₁ and g₀ from Θ to ∆(X). When the agent reports θ′, the principal takes the decision g₁(θ′) if the agent is authenticated and the decision g₀(θ′) otherwise. If type θ reports θ′ and exerts effort on the associated test, his interim utility u(θ′|θ) is given by

u(θ′|θ) = α(θ′|θ) u(g₁(θ′), θ) + (1 − α(θ′|θ)) u(g₀(θ′), θ).

On the right side, the function u is extended linearly from X to ∆(X). Even in this reduced form, the cost of lying is determined endogenously by the mechanism, in contrast to models of lying costs. The incentive-compatibility constraint becomes

u(θ|θ) ≥ u(θ′|θ) ∨ u(g₀(θ′), θ) for all θ, θ′ ∈ Θ.   (IC)

The right side is the interim utility for type θ if he reports θ′ and then chooses effort to maximize his utility. The principal selects a reduced-form mechanism to solve

maximize E[α(θ|θ) v(g₁(θ), θ) + (1 − α(θ|θ)) v(g₀(θ), θ)]
subject to (IC).

In the applications below, we also impose participation constraints.
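The reduced-form problem is easy to audit numerically. A sketch with two types, quasilinear utility u((q, t), θ) = θq − t, and a hypothetical authentication rate (all numbers are ours):

```python
# Decisions are (quantity, transfer); g1 applies when the agent is
# authenticated and g0 otherwise.
types = [1.0, 2.0]
alpha = {(1.0, 1.0): 1.0, (2.0, 2.0): 1.0,   # truthful reports always pass
         (2.0, 1.0): 0.5, (1.0, 2.0): 0.5}   # alpha[(report, type)]
g1 = {1.0: (0.5, 0.4), 2.0: (1.0, 1.5)}      # (q, t) after authentication
g0 = {1.0: (0.0, 0.0), 2.0: (0.0, 0.0)}      # outside option otherwise

def u(decision, theta):
    q, t = decision
    return theta * q - t

def interim(report, theta):
    a = alpha[(report, theta)]
    return a * u(g1[report], theta) + (1 - a) * u(g0[report], theta)

# (IC): truth-telling beats both lying-with-effort and lying-and-shirking
# (the shirking payoff is u(g0(θ'), θ), the right operand of ∨ in (IC)).
for theta in types:
    for report in types:
        assert interim(theta, theta) >= max(interim(report, theta),
                                            u(g0[report], theta)) - 1e-12
```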
We showed that a most-discerning testing environment has a simpler representation as an authentication rate. Can we start with the authentication rate as a primitive? To retain the testing interpretation, a primitive authentication rate must be induced by a most-discerning testing environment. We characterize when this is the case.

An authentication rate α implicitly associates to each report θ′ a test t̂(θ′) with passage rate π(t̂(θ′)|·) = α(θ′|·). To check whether this testing function t̂ is most discerning, we must specify which other tests are in the test set. We claim that there is no loss in choosing the minimal test set t̂(Θ) = {t̂(θ′): θ′ ∈ Θ}. If t̂ is not most discerning with this test set, then it cannot be most discerning with any larger test set because adding tests adds constraints to Definition 4. We translate this condition—that the testing function t̂ is most discerning with the test set t̂(Θ)—into a condition on the authentication rate α. For an authentication rate α, let ᾱ = 1 − α.

Footnote: In models of lying costs, the agent's utility is the difference between his consumption utility and an exogenous lying cost, which depends on the agent's true type and the agent's report. Lying costs relax the incentive constraints. See, for example, Lacker and Weinberg (1989), Maggi and Rodríguez-Clare (1995), Crocker and Morgan (1998), Kartik et al. (2007), Kartik (2009), and Deneckere and Severinov (2017). In computer science, Kephart and Conitzer (2016) show that, with lying costs, the revelation principle holds if the lying cost function satisfies the triangle inequality.

Definition 8 (Most-discerning authentication). An authentication rate α is most discerning if the following hold for all types θ₁ and θ₂.

1. If α(θ₁|θ₁) ≥ α(θ₂|θ₁), then there exists λ ∈ [0, 1] such that, for all types θ₃,

[λα(θ₁|θ₃) + (1 − λ)α(θ₁|θ₁)] α(θ₂|θ₁) ≤ α(θ₂|θ₃) α(θ₁|θ₁).

2. If α(θ₁|θ₁) < α(θ₂|θ₁), then there exists λ ∈ [0, 1] such that, for all types θ₃,

[λᾱ(θ₁|θ₃) + (1 − λ)ᾱ(θ₁|θ₁)] ᾱ(θ₂|θ₁) ≥ ᾱ(θ₂|θ₃) ᾱ(θ₁|θ₁).

From our characterization of the discernment order (Proposition 2), we get the following characterization for authentication rates.
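Definition 8 can be verified by brute force on a finite type space, searching a grid of λ for each pair of types. As a sketch, we test two {0, 1}-valued rates; the message correspondences are our own rendering of the Green–Laffont example and of a nested variant:

```python
def most_discerning(alpha, types, grid=51):
    """Check Definition 8; alpha[(r, t)] = α(r|t), the probability that
    type t is authenticated after reporting r."""
    bar = lambda x: 1.0 - x
    lams = [i / (grid - 1) for i in range(grid)]
    for t1 in types:
        for t2 in types:
            if alpha[(t1, t1)] >= alpha[(t2, t1)]:
                ok = any(all((lam * alpha[(t1, t3)]
                              + (1 - lam) * alpha[(t1, t1)]) * alpha[(t2, t1)]
                             <= alpha[(t2, t3)] * alpha[(t1, t1)] + 1e-9
                             for t3 in types) for lam in lams)
            else:
                ok = any(all((lam * bar(alpha[(t1, t3)])
                              + (1 - lam) * bar(alpha[(t1, t1)])) * bar(alpha[(t2, t1)])
                             >= bar(alpha[(t2, t3)]) * bar(alpha[(t1, t1)]) - 1e-9
                             for t3 in types) for lam in lams)
            if not ok:
                return False
    return True

types = [1, 2, 3]
# Non-nested correspondence: M(θ1) = {θ1, θ2}, M(θ2) = {θ2, θ3}, M(θ3) = {θ3}.
gl = {(r, t): 1.0 if r in {1: (1, 2), 2: (2, 3), 3: (3,)}[t] else 0.0
      for r in types for t in types}
# Nested variant: M(θi) = {θ1, ..., θi} satisfies the nested range condition.
nested = {(r, t): 1.0 if r <= t else 0.0 for r in types for t in types}
assert not most_discerning(gl, types)
assert most_discerning(nested, types)
```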
Theorem 4 (Authentication rate characterization)
An authentication rate α is induced by some most-discerning testing environment if and only if α is most discerning.

Remark. If α(θ|θ) ≥ α(θ|θ′) for all types θ and θ′, then α is most discerning if and only if

α(θ₁|θ₃) α(θ₂|θ₁) ≤ α(θ₂|θ₃) α(θ₁|θ₁),   (6)

for all θ₁, θ₂, θ₃ ∈ Θ.

If α is {0, 1}-valued and α(θ|θ) = 1 for all θ, then α induces a message correspondence M: Θ ։ Θ defined by

M(θ) = {θ′: α(θ′|θ) = 1}.

This correspondence M satisfies θ ∈ M(θ) for each θ, as in Green and Laffont (1986). In terms of M, (6) becomes

θ₁ ∈ M(θ₃) & θ₂ ∈ M(θ₁) ⟹ θ₂ ∈ M(θ₃),

which is exactly Green and Laffont's (1986) nested range condition.

We solve for revenue-maximizing mechanisms with the local first-order approach. The solutions use a new expression for the virtual value.

Quasilinear setting with verification
Consider the nonlinear pricing setting from Mussa and Rosen (1978). The agent's type θ ∈ Θ = [θ̲, θ̄] is drawn from a distribution function F with strictly positive density f. The principal allocates a quantity q ∈ R₊ and receives a transfer t ∈ R. Utilities for the agent and the principal are

u(q, t, θ) = θq − t   and   v(q, t) = t − c(q).

Here, c is the cost of production. Assume that c(0) = c′(0) = 0 and that the marginal cost c′ is strictly increasing and unbounded.

The verification technology is represented by a measurable most-discerning authentication rate α: Θ × Θ → [0, 1] that satisfies the following conditions.

(i) α(θ|θ) = 1 for all types θ.

(ii) For each θ′ ∈ Θ, the function θ ↦ α(θ′|θ) is absolutely continuous.

(iii) For each θ ∈ Θ, the right and left partial derivatives D₊α(θ|θ) and D₋α(θ|θ) exist, and the functions θ ↦ D₊α(θ|θ) and θ ↦ D₋α(θ|θ) are integrable.

Condition (i) ensures that the agent is authenticated if he reports truthfully. The regularity conditions (ii) and (iii) allow us to apply the envelope theorem. Since α is most discerning, (i) implies that α(θ₁|θ₂) α(θ₂|θ₃) ≤ α(θ₁|θ₃) for all θ₁, θ₂, θ₃ ∈ Θ. Figure 7 plots an authentication rate that satisfies our assumptions. The agent's type is on the horizontal axis. Each curve corresponds to a fixed report. In this example, the authentication probability decays exponentially in the absolute difference between the agent's type and the agent's report.

We assume that the agent is free to walk away at any time, so we impose an ex post participation constraint. Whether or not the agent is authenticated, his utility must be nonnegative. Without these constraints, the principal could apply severe punishments to effectively prohibit the agent from making any report that is not authenticated with certainty. In that case, the model reduces to partial verification, as in Caragiannis et al. (2012).

We work with reduced-form mechanisms. Since the agent is always authenticated on-path, there is no loss in holding him to his outside option if he is not authenticated. We take g₀(θ) = (0, 0) for all θ, and we optimize over the decision rule g₁. Without loss, we restrict g₁ to be deterministic. The component functions of g₁ are denoted q and t.

Footnote: The pair (q, t) is the decision x in the general model. In applications, t always denotes transfers, not a testing rule. Since we work directly with authentication rates, we make no reference to tests.

Footnote: These derivatives are defined by D₊α(θ′|θ) = lim_{h↓0} [α(θ′|θ + h) − α(θ′|θ)]/h and D₋α(θ′|θ) = lim_{h↓0} [α(θ′|θ) − α(θ′|θ − h)]/h.

Footnote: In particular, we rule out upfront payments like those used in Border and Sobel (1987).

Figure 7. Exponential authentication rate

The principal selects a quantity function q: Θ → R₊ and a transfer function t: Θ → R to solve

maximize   ∫_{θ̲}^{θ̄} [t(θ) − c(q(θ))] f(θ) dθ
subject to θq(θ) − t(θ) ≥ α(θ′|θ)[θq(θ′) − t(θ′)] for all θ, θ′ ∈ Θ,
           θq(θ) − t(θ) ≥ 0 for all θ ∈ Θ.   (7)

The first constraint is incentive compatibility. The second is ex post participation, conditional upon being authenticated. The utility u(θ′|θ) takes a simple form because the agent gets zero utility if he is not authenticated. The maximum operation from (IC) is dropped because it is subsumed by the participation constraint.

To motivate our new expression for the virtual value, we use the envelope theorem to compute the agent's equilibrium utility function U, defined by

U(θ) = u(θ|θ) = max_{θ′ ∈ Θ} u(θ′|θ).

In the classical setting without verification, u(θ′|θ) = θq(θ′) − t(θ′). By the envelope theorem, U′(θ) = q(θ). Integrating gives

U(θ) = ∫_{θ̲}^{θ} q(z) dz.   (8)

With verification, the equilibrium utility U takes a different form. We sketch the derivation here. From (7), the interim utility is given by

u(θ′|θ) = α(θ′|θ)[θq(θ′) − t(θ′)].

The envelope theorem gives the bounds

q(θ) + D₊α(θ|θ) U(θ) ≤ U′(θ) ≤ q(θ) + D₋α(θ|θ) U(θ).

Since α(θ|θ) = 1, we have D₊α(θ|θ) ≤ 0 ≤ D₋α(θ|θ). If α has a cusp, as in the example in Figure 7, these inequalities are strict and the derivative of U is not pinned down. To maximize the principal's revenue, we set U′(θ) equal to the lower bound. Define the precision function λ: Θ → R₊ by

λ(θ) = −D₊α(θ|θ).

The larger is λ(θ), the steeper is the function α(θ|·) to the right of θ. For θ′ ≤ θ, let

Λ(θ′|θ) = exp(−∫_{θ′}^{θ} λ(w) dw).
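For the exponential example in Figure 7, the precision function and Λ can be recovered numerically. A sketch assuming α(θ′|θ) = exp(−λ₀|θ − θ′|) with a constant λ₀ of our own choosing:

```python
import math

lam0 = 2.0
alpha = lambda report, theta: math.exp(-lam0 * abs(theta - report))

def precision(theta, h=1e-7):
    # λ(θ) = −D₊α(θ|θ): minus the right derivative in the true type,
    # holding the report fixed at θ.
    return -(alpha(theta, theta + h) - alpha(theta, theta)) / h

def Lambda(lo, hi, n=1000):
    # Λ(θ'|θ) = exp(−∫_{θ'}^{θ} λ(w) dw), via a midpoint Riemann sum.
    h = (hi - lo) / n
    return math.exp(-sum(precision(lo + (i + 0.5) * h) for i in range(n)) * h)

assert abs(precision(0.5) - lam0) < 1e-5
# For this exponential rate, α(θ'|θ) = Λ(θ'|θ): the lower bound in
# Proposition 6 below holds with equality.
assert abs(Lambda(0.2, 0.5) - alpha(0.2, 0.5)) < 1e-4
```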
The minimum equilibrium utility U is given by

U(θ) = ∫_{θ̲}^{θ} Λ(z|θ) q(z) dz.   (9)

With this expression for equilibrium utility, we define the virtual value. Recall Myerson's (1981) virtual value

ϕ_M(θ) = θ − (1 − F(θ))/f(θ) = θ − (1/f(θ)) ∫_θ^{θ̄} f(z) dz.   (10)

The virtual value of type θ is the marginal expected revenue with respect to the quantity q(θ). There are two parts. First, the principal can extract the additional consumption utility from type θ, so the marginal revenue from type θ equals θ. Second, the quantity q(θ) pushes up the equilibrium utility according to (8), so the marginal revenue from each type z above θ is −1; this effect is integrated against the relative density f(z)/f(θ). Verification does not change the marginal revenue from type θ, but the marginal revenue from each higher type z becomes −Λ(θ|z), by (9). We define the virtual value by

ϕ(θ) = θ − (1/f(θ)) ∫_θ^{θ̄} Λ(θ|z) f(z) dz.   (11)

Figure 8. Virtual value for different precision functions λ

Comparing (10) and (11) gives the inequality

ϕ_M(θ) ≤ ϕ(θ) ≤ θ.

The virtual value ϕ(θ) tends towards these bounds in limiting cases.

Proposition 5 (Testing precision)

As λ converges to 0 pointwise, ϕ(θ) converges to ϕ_M(θ) for each type θ. As λ converges to ∞ pointwise, ϕ(θ) converges to θ for each type θ.

Figure 8 illustrates these limits in a simple example, where the agent's type is uniformly distributed on the unit interval and the precision function λ is constant.

Remark. If λ(θ) = 0 for all θ, then Λ(θ|z) = 1 for θ ≤ z. Therefore, our virtual value coincides with the classical virtual value, and by the results below, the revenue-maximizing mechanism is unaffected by verification. This holds in particular if α has no kink on the diagonal, e.g., if α(θ′|θ) = 1 − |θ′ − θ|^σ with σ > 1.

The virtual value is derived from the envelope representation of the equilibrium utility, which uses only local incentive constraints. We assume that the virtual value is increasing. But because the interim utility is not linear in the agent's type (due to the authentication rate α), we need further assumptions to ensure that the local incentive constraints imply the global incentive constraints. This implication holds in particular for the exponential authentication rates

α(θ′|θ) = exp(−∫_{θ′}^{θ} λ(z) dz),

for integrable functions λ: Θ → R₊. We permit a larger class of authentication rates. We impose a global condition on the relative values of α and Λ. The function Λ is determined by the behavior of α in a neighborhood of the diagonal in Θ × Θ. Because α is most discerning, it follows that Λ is a global lower bound for α.

Proposition 6 (Lower bound)
For all types θ′ and θ with θ′ ≤ θ, we have α(θ′|θ) ≥ Λ(θ′|θ).

For the exponential authentication rates, this inequality holds with equality. We require that α not be much greater than Λ. The precise condition depends on the optimal quantity function q⋆, which will be defined in the theorem statement. The global upper bound states that, for all types θ′ and θ with θ′ ≤ θ,

α(θ′|θ) ≤ Λ(θ′|θ) A(θ′|θ),

where

A(θ′|θ) = ∫_{θ̲}^{θ} Λ(z|θ̄) q⋆(z) dz / ∫_{θ̲}^{θ} Λ(z ∧ θ′|θ̄) q⋆(z ∧ θ′) dz.

For the quantity function q⋆ in the theorem statement, A(θ′|θ) ≥ 1 for θ′ ≤ θ.

Proposition 7 (Optimal nonlinear pricing)
Suppose that the virtual value ϕ is increasing. The optimal quantity function q⋆ and transfer function t⋆ are unique and given by

c′(q⋆(θ)) = ϕ(θ)⁺,   t⋆(θ) = θq⋆(θ) − ∫_{θ̲}^{θ} Λ(z|θ) q⋆(z) dz,

provided that the global upper bound is satisfied.

In the optimal mechanism, type θ receives the quantity that is efficient for type ϕ(θ)⁺, just as in Mussa and Rosen (1978), except that ϕ is our new virtual value. Transfers are pinned down by the equilibrium utility U from (9). The faster the virtual value ϕ increases, the faster the optimal quantity function q⋆ increases and the more permissive is the global upper bound.

Selling a single good

Suppose that the principal is selling a single indivisible good, which she does not value. The agent's type is his valuation for the good. The principal allocates the good with probability q ∈ [0, 1] and receives a transfer t ∈ R. Utilities are

u(q, t, θ) = θq − t   and   v(q, t) = t.

Without verification, the revenue-maximizing mechanism is a posted price (Riley and Zeckhauser, 1983). With verification, the price may depend on the agent's report.
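For concreteness, take θ uniform on [0, 1] and a constant precision λ, so that Λ(θ′|θ) = e^{−λ(θ−θ′)} and (11) has the closed form ϕ(θ) = θ − (1 − e^{−λ(1−θ)})/λ. The sketch below checks the limits in Proposition 5, computes the cutoff inf{θ : ϕ(θ) ≥ 0}, and verifies, for the exponential rate α = Λ, that downward misreports are unprofitable under the price schedule of the next proposition (all parameter values are ours):

```python
import math

lam = 2.0  # constant precision λ (an assumption for this illustration)

def phi(theta, l=lam):
    # eq. (11) with F uniform on [0, 1] and Λ(θ|z) = exp(−l(z − θ))
    return theta - (1 - math.exp(-l * (1 - theta))) / l

# Proposition 5 limits: λ → 0 recovers Myerson's 2θ − 1; λ → ∞ gives θ.
assert abs(phi(0.3, 1e-8) - (2 * 0.3 - 1)) < 1e-6
assert abs(phi(0.3, 1e6) - 0.3) < 1e-5

# Cutoff type inf{θ : φ(θ) ≥ 0}, by bisection (φ is increasing here).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if phi(mid) < 0 else (lo, mid)
cutoff = hi
assert cutoff < 0.5  # verification lowers the cutoff below Myerson's 1/2

def price(theta):
    # t*(θ) = θ − ∫_{θ*}^{θ} e^{−λ(θ−z)} dz = θ − (1 − e^{−λ(θ−θ*)})/λ
    return theta - (1 - math.exp(-lam * (theta - cutoff))) / lam

# Downward misreports are unprofitable: the deviation payoff
# α(r|θ)(θ − t*(r)) never beats the truthful payoff θ − t*(θ).
for theta in (0.6, 0.8, 1.0):
    truthful = theta - price(theta)
    for i in range(101):
        r = cutoff + (theta - cutoff) * i / 100
        assert truthful >= math.exp(-lam * (theta - r)) * (theta - price(r)) - 1e-9
```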
Proposition 8 (Optimal sale of a single good)
Suppose that the virtual value ϕ is increasing. The optimal quantity and transfer functions are unique and given as follows, provided that the global upper bound is satisfied. Let θ⋆ = inf{θ: ϕ(θ) ≥ 0}. Each type below θ⋆ receives nothing and pays nothing. Each type θ above θ⋆ receives the good and pays

t⋆(θ) = θ − ∫_{θ⋆}^{θ} Λ(z|θ) dz.

As in the no-verification solution, there is a cutoff type θ⋆ who receives the good and pays his valuation. Each type below the cutoff is excluded; each type above the cutoff receives the good and pays less than his valuation. The allocation probability takes values 0 and 1 only—there is no randomization. Verification increases the virtual value relative to the classical virtual value, so the cutoff type is lower and more types receive the good. The price is (weakly) increasing in the agent's report, and strictly increasing if λ is strictly positive. Nevertheless, types above the cutoff cannot profit by misreporting downward—the benefit of a lower price is outweighed by the risk of not being authenticated.

We extend our model to allow for n agents, labeled i = 1, …, n. Each agent i independently draws his type θᵢ ∈ Θᵢ from a commonly known distribution μᵢ ∈ ∆(Θᵢ). Set Θ = ∏ᵢ₌₁ⁿ Θᵢ. The decision set is denoted by X, as before. Each agent i has utility function uᵢ: X × Θ → R; the principal has utility function v: X × Θ → R.

For each agent i, there is a set Tᵢ of tests and a passage rate πᵢ: Tᵢ × Θᵢ → [0, 1], where πᵢ(τᵢ|θᵢ) is the probability with which type θᵢ can pass test τᵢ. Set T = ∏ᵢ₌₁ⁿ Tᵢ. Each agent sees his own test—but not the tests of the other agents—and then chooses whether to exert effort. Nature draws the test result for each agent independently. A mechanism specifies a message set Mᵢ for each agent i. Set M = ∏ᵢ₌₁ⁿ Mᵢ. The rest of the mechanism consists of a testing rule t: M → ∆(T) and a decision rule g: M × T × {0, 1}ⁿ → ∆(X).
The test conducted on each agent can depend on the messages sent by other agents. For each agent i, a strategy consists of a reporting strategy rᵢ: Θᵢ → ∆(Mᵢ) and an effort strategy eᵢ: Θᵢ × Mᵢ × Tᵢ → [0, 1]. For each agent i and each type θᵢ in Θᵢ, the θᵢ-discernment order ⪰_{θᵢ} over Tᵢ is defined as in the baseline model with πᵢ in place of π. Given testing functions tᵢ: Θᵢ → Tᵢ for each i, define the product testing function ⊗ᵢ tᵢ: Θ → T by

(⊗ᵢ tᵢ)(θ₁, …, θₙ) = (t₁(θ₁), …, tₙ(θₙ)).

If for each agent i there is a most-discerning testing function, then the product of these testing functions suffices for implementation. In particular, the test for agent i depends only on agent i's report.

Theorem 5 (Most-discerning implementation with multiple players)
Fix a type space Θ and a testing environment (T, π). For a testing function t̂ = ⊗ᵢ t̂ᵢ, the following are equivalent.

1. t̂ᵢ is most discerning for all i.

2. In every decision environment (X, u), every implementable social choice function is canonically implementable with t̂.

For applications, we make the same assumptions as in the single-agent case. Each agent is free to walk away, so we impose ex post participation constraints. For each agent i, the testing environment is represented by a measurable most-discerning authentication rate αᵢ: Θᵢ × Θᵢ → [0, 1] that satisfies assumptions (i)–(iii). For each i, define λᵢ and Λᵢ by putting αᵢ in place of α in the definitions of λ and Λ.

Consider an auction for a single indivisible good. Each agent i independently draws his type θᵢ ∈ Θᵢ = [θ̲ᵢ, θ̄ᵢ] from a distribution function Fᵢ with positive density fᵢ. The principal allocates the good to each agent i with probability qᵢ ∈ [0, 1], where q₁ + ⋯ + qₙ ≤ 1; the principal receives transfers t₁, …, tₙ ∈ R. Set q = (q₁, …, qₙ) and t = (t₁, …, tₙ). For simplicity, we assume that the principal does not value the good. Utilities are given by

uᵢ(q, t, θ) = θᵢqᵢ − tᵢ   and   v(q, t) = Σᵢ₌₁ⁿ tᵢ.

Let f₋ᵢ(θ₋ᵢ) denote ∏_{j≠i} fⱼ(θⱼ). For quantity functions qᵢ: Θ → R, interim expectations are denoted with capital letters:

Qᵢ(θᵢ) = ∫_{Θ₋ᵢ} qᵢ(θᵢ, θ₋ᵢ) f₋ᵢ(θ₋ᵢ) dθ₋ᵢ.

As in the single-agent case, we impose a condition that depends on the optimal quantity function q⋆, which is defined in Proposition 9. For each agent i, the global upper bound states that, for all θᵢ, θᵢ′ ∈ Θᵢ with θᵢ′ ≤ θᵢ, we have

αᵢ(θᵢ′|θᵢ) ≤ Aᵢ(θᵢ′|θᵢ) Λᵢ(θᵢ′|θᵢ),

where

Aᵢ(θᵢ′|θᵢ) = ∫_{θ̲ᵢ}^{θᵢ} Λᵢ(zᵢ|θ̄ᵢ) Q⋆ᵢ(zᵢ) dzᵢ / ∫_{θ̲ᵢ}^{θᵢ} Λᵢ(zᵢ ∧ θᵢ′|θ̄ᵢ) Q⋆ᵢ(zᵢ ∧ θᵢ′) dzᵢ.

For each i and all types θᵢ, θᵢ′ ∈ Θᵢ with θᵢ′ ≤ θᵢ, we have Aᵢ(θᵢ′|θᵢ) ≥ 1.

Proposition 9 (Optimal auctions)
Suppose that each virtual value ϕᵢ is increasing. The seller's maximum revenue is achieved by the allocation function q⋆ and transfer function t⋆ given by

q⋆ᵢ(θ) = 1 if ϕᵢ(θᵢ) > 0 ∨ max_{j≠i} ϕⱼ(θⱼ), and q⋆ᵢ(θ) = 0 otherwise,

and

t⋆ᵢ(θ) = q⋆ᵢ(θ) [θᵢ − ∫_{θ̲ᵢ}^{θᵢ} Λᵢ(zᵢ|θᵢ) Q⋆ᵢ(zᵢ)/Q⋆ᵢ(θᵢ) dzᵢ],

provided that the global upper bound is satisfied for each agent i.

The transfers are chosen so that each agent pays only if he receives the good, thus ensuring that the ex post participation constraints are satisfied. The allocation coincides with Myerson's (1981) solution, except that ϕ is our new virtual value. In Myerson's (1981) solution, the allocation rule disadvantages bidders whose valuation distributions are greater in the sense of hazard rate dominance. In our solution, the allocation rule also advantages bidders who can be verified more precisely, in the sense that their local precision functions are pointwise greater.

Nonbinary tests
In the main model, we consider pass–fail tests because of their natural connection with partial verification and evidence. Here we extend the baseline principal–agent model to allow for tests with more than two results. There is a family T of tests. Each test generates scores in a finite subset S of R. The passage rate

π: T × Θ → ∆(S)

assigns to each pair (τ, θ) the score distribution π_{τ|θ} in ∆(S) for type θ on test τ.
1] and passage probabilities in [0 , π ( τ | θ )]. We can equiva-lently model the agent as choosing the passage probability directly, subject to a stochasticdominance constraint. This alternative definition of a strategy extends immediately tononbinary tests. A performance strategy is a map p : Θ × M × T → ∆( S ) , satisfying p θ,m,τ ≤ SD π τ | θ for all ( θ, m, τ ) ∈ Θ × M × T . The other components of a profileare defined as before with S in place of { , } .The revelation principle (Theorem 1) is proved with performance strategies, so it appliesto nonbinary tests. To define the discernment orders, say that a Markov transition k on S is monotone if k r ≥ SD k s whenever r > s . Put S in place of { , } in the definition of most-discerning. As before, a single most-discerning testing function suffices for implementation. Theorem 6 (Most-discerning implementation with nonbinary tests)
If a testing function t̂: Θ → T is most discerning, then every implementable social choice function is canonically implementable with t̂.

Verification has been modeled in many ways, in both economics and computer science. We organize our discussion around the taxonomy in Table 1, which focuses on the primitives in each model.

Green and Laffont (1986) introduce partial verification. They restrict their analysis to direct mechanisms. Verification is represented as a correspondence M: Θ ։ Θ satisfying θ ∈ M(θ) for all θ ∈ Θ. Each type θ can report any type θ′ in M(θ). In particular, each
Taxonomy of verification models type can report truthfully. In this framework the revelation principle does not hold, asGreen and Laffont (1986) illustrate with a three-type counterexample, which we adapt inExample 1. The revelation principle does hold, however, if the correspondence M satisfiesthe nested range condition , which requires that the relation associated to M is transitive.Without the revelation principle, it is generally difficult to determine whether a particularsocial choice function is implementable (Nisan and Ronen, 2001; Singh and Wittman, 2001;Fotakis and Zampetakis, 2015; Auletta et al., 2011; Yu, 2011; Rochet, 1987; Vohra, 2011).Bull and Watson (2004, 2007) and Lipman and Seppi (1995) model verification with hard evidence . They introduce an evidence set E and an evidence correspondence E : Θ ։ E . Type θ possesses the evidence in E ( θ ); he can present one piece of evidencefrom E ( θ ) to the principal. The evidence environment is normal if each type θ has a pieceof evidence e ( θ ) in E ( θ ) that is maximal in the following sense: Every other type θ ′ whohas e ( θ ) also has every other piece of evidence in E ( θ ). Therefore, type θ ′ can mimic type θ if and only if E ( θ ′ ) contains e ( θ ). A normal evidence environment induces an abstractmimicking correspondence that satisfies the nested range condition. Normality is a specialcase of our most-discerning condition; see Example 2.In computer science, Caragiannis et al. (2012) and Ferraioli and Ventre (2018) studya reduced-form model of probabilistic verification in mechanism design. They restricttheir analysis to direct mechanisms, and they specify the probabilities with which eachtype can successfully mimic each other type. Dziuda and Salas (2018) and Balbuzanov(2019) study a setting without commitment in which these probabilities are constant. Our testing framework microfounds these models of partial verification, provided that theprimitive authentication rate is most discerning. 
If the primitive authentication rate is not Strausz (2016) recovers the revelation principle by modeling verification as a component of the outcome. Evidence was introduced in games (without commitment) by Milgrom (1981) and Grossman(1981); for recent work on evidence games, see Hart et al. (2017), Ben-Porath et al. (2017), andKoessler and Perez-Richet (2017). For some fixed p ∈ (0 , α ( θ ′ | θ ) equals p if θ ′ = θ and 1 if θ ′ = θ . Thisauthentication satisfies (6) and hence is most discerning. If one type cannot mimic another type perfectly, thenhe risks being detected and facing a prohibitive fine. Therefore, this setting reduces topartial verification. In our model, the agent can walk away at any time, so punishment islimited.If the environment is most discerning, tests can also be interpreted as stochastic ev-idence . Each test τ in T corresponds to a request for a particular piece of evidence. Theagent is asked to send a cheap talk message to the principal after he learns his payoff typebut before he learns which evidence is available to him. With probability π ( τ | θ ), type θ willhave the evidence requested by test τ . Deneckere and Severinov (2008) study a differentkind of stochastic evidence. In their model the agent simultaneously learns his payoff typeand his set of feasible evidence messages.In economics, “verification” traditionally means that the principal can learn the agent’stype perfectly by taking some action, e.g., paying a fee or allocating a good. This lit-erature began with Townsend (1979) who studied costly verification in debt contracts.Ben-Porath et al. (2019) connect costly verification and evidence. When monetary transfersare infeasible, costly verification is often used as a substitute; see Ben-Porath et al. (2014);Erlanson and Kleiner (2015); Halac and Yared (2017); Li (2017); Mylovanov and Zapechelnyuk(2017).
10 Conclusion
We model probabilistic verification as a family of stochastic tests available to the principal. Our testing framework provides a unified generalization of previous verification models. Because verification is noisy, our framework is amenable to the local first-order approach. We illustrate this approach in a few classical revenue-maximization problems. We believe this local approach will make verification tractable in other settings as well.

As the precision of the verification technology varies, our setting continuously interpolates between private information and complete information. Thus we can quantify the value to the principal of a particular verification technology. This is the first step toward analyzing a richer setting in which the principal decides how much to invest in verification.

Another difference is that Caragiannis et al. (2012) investigate which allocation rules can be supported by some transfer rule. We view transfers as part of the outcome, and we study revenue maximization.

A Measure theory
A.1 Markov transitions
This section introduces Markov transitions, which are continuous generalizations of stochastic matrices. For further details, see Kallenberg (2017, Chapter 1).
Definition 9 (Markov transition). Let (X, X) and (Y, Y) be measurable spaces. A Markov transition from (X, X) to (Y, Y) is a function k : X × Y → [0, 1] satisfying:

(i) for each x ∈ X, the map B ↦ k(x, B) is a probability measure on (Y, Y);
(ii) for each B ∈ Y, the map x ↦ k(x, B) is a measurable function on (X, X).

A Markov transition k from (X, X) to (Y, Y) is sometimes written as k : X → ∆(Y). Each measure k(x, ·) on Y is denoted k_x, and we write k_x(B) for k(x, B). A Markov transition from (X, X) to (X, X) is called a Markov transition on (X, X). When the σ-algebras are clear, we will speak of Markov transitions between sets.

We introduce three operations between Markov transitions—composition, products, and outer products.

First we define composition. Let k be a Markov transition from (X, X) to (Y, Y) and ℓ a Markov transition from (Y, Y) to (Z, Z). The composition of k and ℓ, denoted kℓ, is the Markov transition from (X, X) to (Z, Z) defined by

(kℓ)(x, C) = ∫_Y k(x, dy) ℓ(y, C), for all x ∈ X and C ∈ Z.

Here the function y ↦ ℓ(y, C) is integrated over Y with respect to the measure k_x. Inside the integral, it is standard to place the measure before the integrand so that the sequencing of the variables mirrors the timing of the process. Similarly, a measure µ on (X, X) can be composed with a Markov transition k from (X, X) to (Z, Z): the composition µk is the measure on (Z, Z) defined by

(µk)(B) = ∫_X µ(dx) k(x, B), for all B ∈ Z.

Next we define products. Let k be a Markov transition from (X, X) to (Y, Y) as before, and let m be a Markov transition from (X × Y, X ⊗ Y) to (Z, Z). Here X ⊗ Y is the product σ-algebra generated by the measurable rectangles A × B for A ∈ X and B ∈ Y. The product of k and m, denoted k ⊗ m, is the unique Markov transition from (X, X) to (Y × Z, Y ⊗ Z) satisfying

(k ⊗ m)(x, B × C) = ∫_B k(x, dy) m((x, y), C), for all x ∈ X, B ∈ Y, and C ∈ Z.

Here the function y ↦ m((x, y), C) is integrated over the set B with respect to the measure k_x. If m is a Markov transition from (Y, Y) to (Z, Z), we use the same notation, with the understanding that the integrand is m(y, C) rather than m((x, y), C).

Finally, we define outer products.
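When the underlying sets are finite, these operations are ordinary matrix algebra: a Markov transition is a row-stochastic matrix, composition is matrix multiplication, and the outer product defined next is a Kronecker product. A minimal pure-Python sketch (all numerical entries are hypothetical illustrations):

```python
import math

# Finite Markov transitions are row-stochastic matrices: k[x][y] = k_x({y}).

def compose(k, l):
    """Composition (kl)(x, C) = sum_y k(x, y) l(y, C): matrix multiplication."""
    return [[sum(k[x][y] * l[y][z] for y in range(len(l)))
             for z in range(len(l[0]))]
            for x in range(len(k))]

def push(mu, k):
    """A measure mu composed with k: (mu k)(B) = sum_x mu(x) k(x, B)."""
    return [sum(mu[x] * k[x][y] for x in range(len(mu)))
            for y in range(len(k[0]))]

def outer(k1, k2):
    """Outer product: (k1 (x) k2)((x1,x2),(y1,y2)) = k1(x1,y1) * k2(x2,y2)."""
    return [[k1[x1][y1] * k2[x2][y2]
             for y1 in range(len(k1[0])) for y2 in range(len(k2[0]))]
            for x1 in range(len(k1)) for x2 in range(len(k2))]

k = [[0.7, 0.3], [0.2, 0.8]]   # transition from X to Y
l = [[0.5, 0.5], [0.1, 0.9]]   # transition from Y to Z

# Every operation returns a row-stochastic matrix, i.e. a Markov transition.
for row in compose(k, l) + outer(k, l):
    assert math.isclose(sum(row), 1.0)
assert math.isclose(sum(push([0.4, 0.6], k)), 1.0)

# Associativity of composition is just matrix-product associativity.
m = [[1.0, 0.0], [0.3, 0.7]]
a, b = compose(compose(k, l), m), compose(k, compose(l, m))
assert all(math.isclose(a[i][j], b[i][j]) for i in range(2) for j in range(2))
print("ok")
```

The infinite-dimensional definitions in this appendix replace the sums above by the corresponding integrals.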
Let k₁ be a Markov transition from (X₁, X₁) to (Y₁, Y₁) and k₂ a Markov transition from (X₂, X₂) to (Y₂, Y₂). The outer product of k₁ and k₂, denoted k₁ ⊗ k₂, is the unique Markov transition from (X₁ × X₂, X₁ ⊗ X₂) to (Y₁ × Y₂, Y₁ ⊗ Y₂) satisfying

(k₁ ⊗ k₂)((x₁, x₂), B₁ × B₂) = k₁(x₁, B₁) · k₂(x₂, B₂),

for all x₁ ∈ X₁, x₂ ∈ X₂, B₁ ∈ Y₁, and B₂ ∈ Y₂. All three operations are associative. This holds trivially for outer products; for composition and products, see Kallenberg (2017, Lemma 1.17, p. 33). We drop parentheses when there is no ambiguity.

Markov transitions are also called kernels, or Markov/stochastic/probability/transition kernels.

A.2 Markov transitions on the real line
The real line R is endowed with its usual Borel σ-algebra. First, we define Markov transitions that are analogues of the cumulative distribution and quantile functions. Let µ be a measure on R, and let F_µ : R → [0, 1] be the associated right-continuous cumulative distribution function. Suppose µ has compact support S. Define the left-continuous quantile function Q_µ : [0, 1] → S by

Q_µ(p) = inf { s ∈ S : F_µ(s) ≥ p }.

The cumulative distribution transition associated to µ, denoted F̃_µ, is the Markov transition from R to [0, 1] that assigns to each point s in R the uniform measure on [F_µ(s−), F_µ(s)], where F_µ(s−) is the left limit of F_µ at s. In particular, if F_µ is continuous at s, then F_µ(s−) = F_µ(s) and this uniform measure is the Dirac measure δ_{F_µ(s)}. The quantile transition associated to µ, denoted Q̃_µ, is the Markov transition from [0, 1] to R that assigns to each number p in [0, 1] the Dirac measure δ_{Q_µ(p)}.

These Markov transitions extend the usual properties of distribution and quantile functions to nonatomic distributions. Let U[0, 1] denote the uniform measure on [0, 1].

Lemma 1 (Distribution and quantile transitions). For measures µ and ν on R with compact support, the following hold:

(i) µF̃_µ = U[0, 1];
(ii) U[0, 1]Q̃_ν = ν;
(iii) µF̃_µQ̃_ν = ν.

For measures µ and ν on the real line, µ first-order stochastically dominates ν, denoted µ ≥_SD ν, if F_µ(x) ≤ F_ν(x) for all real x. In particular, first-order stochastic dominance is reflexive.

To state the next results, we make the standing assumption that S is a compact subset of R, endowed with the restriction of the Borel σ-algebra. A Markov transition d on S is downward if d(s, (−∞, s] ∩ S) = 1 for all s ∈ S.

Lemma 2 (Downward transitions). For measures µ and ν on S, the following are equivalent:

(i) µ ≥_SD ν;
(ii) F̃_µQ̃_ν is downward;
(iii) µd = ν for some downward transition d.

A Markov transition m on S is monotone if s > t implies m_s ≥_SD m_t, for all s, t ∈ S.

Lemma 3 (Monotone transitions). (i) A Markov transition m on S is monotone if and only if µm ≥_SD νm for all measures µ and ν on S satisfying µ ≥_SD ν. (ii) The composition of monotone Markov transitions is monotone.

A.3 Measurability and universal completions
Our definition of θ-discernment includes an inequality for every type. To ensure that we can select score conversions in a measurable way, we enlarge the Borel σ-algebra to its universal completion, which we introduce here.

Let (X, X) and (Y, Y) be measurable spaces. A function f from X to Y is X/Y-measurable if f⁻¹(B) is in X for all B in Y. This condition is written more compactly as f⁻¹(Y) ⊂ X. If the σ-algebra Y is understood, we say f is X-measurable, and if both σ-algebras are understood, we say that f is measurable.

Let (X, X, µ) be a probability space. A set A in X is a µ-null set if µ(A) = 0. The µ-completion of the σ-algebra X, denoted X^µ, is the smallest σ-algebra that contains every set in X and every subset of every µ-null set. It is straightforward to check that a subset A of X is a member of X^µ if and only if there are sets A₁ and A₂ in X such that A₁ ⊂ A ⊂ A₂ and µ(A₂ \ A₁) = 0. The universal completion X* of X is the σ-algebra on X defined by X* = ∩_µ X^µ, where the intersection is taken over all probability measures µ on (X, X).

It is convenient to work with the universal completion because of the following measurable projection theorem (Cohn, 2013, Proposition 8.4.4, p. 264).

Theorem 7 (Measurable projection).
Let (X, X) be a measurable space, Y a Polish space, and C a set in the product σ-algebra X ⊗ B(Y). Then the projection of C onto X belongs to the universal completion X*.

By taking universal completions, we do not lose any Markov transitions.
Lemma 4 (Completing transitions). A Markov transition k from (X, X) to (Y, Y) can be uniquely extended to a Markov transition k̄ from (X, X*) to (Y, Y*).

Next we consider the universal completion of a product σ-algebra.

Lemma 5 (Product spaces). For measurable spaces (X, X) and (Y, Y),

X* ⊗ Y* ⊂ (X ⊗ Y)* = (X* ⊗ Y*)*.

We adopt the convention that when a Markov transition is defined between Polish spaces, both the domain and codomain are endowed with the universal completions of their Borel σ-algebras. In particular, when we take a product of Markov transitions, we extend this transition. By Lemma 5, it does not matter whether the component transitions are extended first.

We conclude with one note of caution about universal completions. Let k be a Markov transition from (X × Y, (X ⊗ Y)*) to (Z, Z). Since (X ⊗ Y)* is generally larger than X* ⊗ Y*, the section k_x may not be a Markov transition. For fixed x ∈ X and B ∈ Z, the map y ↦ k(x, y, B) is defined, but may not be Y-measurable. But if Y is countable, and Y = 2^Y, then measurability is automatic.

A.4 Defining mechanisms and strategies
In the model, we make the following technical assumptions. The sets Θ, T, and X are all Polish spaces. The finite signal space S is endowed with the discrete topology. The Markov transition π is from (Θ × T, B(Θ × T)) to (S, B(S)). Recall that for Polish spaces Y and Z, we have B(Y × Z) = B(Y) ⊗ B(Z). In a mechanism, the message space M is Polish. Testing rules, decision rules, reporting strategies, and passage strategies are all Markov transitions, with the domain and codomain endowed with the universal completions of their Borel (product) σ-algebras.

B Proofs
B.1 Proof of Theorem 1
We begin with new notation. To avoid confusion in the commutative diagrams below, we denote the message set in a direct mechanism by Θ′. This set is a copy of Θ, but it is helpful to keep the sets Θ and Θ′ distinct.

We work with performance strategies, as defined in Section 8. Let f be an implementable social choice function. Select a profile (M, t, g; r, p) that implements f. We construct a direct mechanism (t̂, ĝ) that canonically implements f.

Let t̂ be the composition rt, which is a Markov transition from Θ′ to T. By disintegration of Markov transitions (Kallenberg, 2017, Theorem 1.25, p. 39), there is a Markov transition h from Θ′ × T to M such that r × t = t̂ ⊗ h. Define a Markov transition d from Θ′ × M × T × S to S as follows. For each (θ′, m, τ) ∈ Θ′ × M × T, set

d_{θ′,m,τ} = F̃_{τ|θ′} Q̃_{θ′,m,τ},

where F̃_{τ|θ′} is the distribution Markov transition corresponding to π_{τ|θ′} and Q̃_{θ′,m,τ} is the quantile Markov transition corresponding to p_{θ′,m,τ}; see Appendix A.2 for the definitions. By Lemma 1,

π_{τ|θ′} d_{θ′,m,τ} = p_{θ′,m,τ}, (12)

and by Lemma 2, d_{θ′,m,τ} is downward because p_{θ′,m,τ} ≤_SD π_{τ|θ′}.

Define the direct decision rule ĝ as the composition shown in the following commutative diagram, where j denotes the identity transition from Θ to Θ′.

We cannot directly apply the disintegration result to the universal completions of the Borel σ-algebras. Argue as follows. First, restrict r × t to a Markov transition from (Θ′, B(Θ′)) to (M × T, B(M) ⊗ B(T)) and t̂ to a Markov transition from (Θ′, B(Θ′)) to (T, B(T)). By Kallenberg (2017, Theorem 1.25, p. 39), there exists a Markov transition h from (Θ′ × T, B(Θ′) ⊗ B(T)) to (M, B(M)) that satisfies the desired equality for the restricted transitions from (Θ′, B(Θ′)) to (M × T, B(M) ⊗ B(T)). By Lemma 4, we can extend h to a Markov transition from (Θ′ × T, B(Θ′) ⊗ B(T)) to (M, B(M)).
To keep the diagrams uncluttered, we adopt the following conventions. The labeled Markov transition maps a subproduct of the source space into a subproduct of the target space. The other sets in the target space are carried along by the identity. Mechanisms act on the report set Θ′, not the type space Θ; π acts on Θ.

[Commutative diagram: starting from Θ, the transitions j, r, t̂, t, π, h, d, and g map through Θ × Θ′, Θ × M, Θ × Θ′ × T, Θ × M × T, Θ′ × T × S, Θ′ × M × T × S, and M × T × S into X, defining ĝ.]

By construction, this diagram commutes, so the direct mechanism (t̂, ĝ) induces f. To prove incentive compatibility, we show that for any strategy (ẑ, q̂) in the direct mechanism, there is a strategy (z, q) in the original mechanism that induces the same social choice function. Given (ẑ, q̂), define (z, q) by the following commutative diagrams.

[Commutative diagrams (13): z is defined from ẑ and r through Θ′ and M; q is defined from q̂, ẑ, and d through Θ × Θ′ × M × T, Θ × M × T, and Θ′ × M × T × S into S.]

Since each Markov transition d_{θ′,m,τ} is downward, it follows that q is feasible. These deviations induce the same social choice function because the following diagram commutes.

[Commutative diagram: the diagram defining ĝ above, with ẑ, z, q̂, and q in place of the truthful strategies.]

Extension to multiple agents

Following the notation in Section 7, let Θ = ∏_i Θ_i, T = ∏_i T_i, and S = ∏_i S_i. Set π = ⊗_i π_i. With this notation, π is a Markov transition from T × Θ to S, as in the single-agent case. After setting M = ∏_i M_i, a mechanism in the multi-agent setting maps between the same sets. Similarly for the agents' strategies, set r = ⊗_i r_i and p = ⊗_i p_i. With these new definitions, construct h as before. Define d_i separately for each agent i and then set d = ⊗_i d_i. The proof is exactly as before. To consider a deviation by a fixed player j, take ẑ = ẑ_j ⊗ r_{−j} and q̂ = q̂_j ⊗ p_{−j}. Then the deviation (z, q) constructed in (13) will have the form z = z_j ⊗ r_{−j} and q = q_j ⊗ p_{−j}.

B.2 Proof of Theorem 2
We begin with some notation. As in the proof of Theorem 1, denote by Θ′ the message set in a direct mechanism. Thus, Θ′ is a copy of Θ. Similarly, denote by T′ a copy of T that will be the codomain of a most-discerning testing function. Keeping these copies separate will be helpful for the commutative diagrams below.

The proof is organized as follows. First, we select score conversions in a measurable way. Next, we prove sufficiency and then necessity.

Selecting score conversions
Let K denote the space ∆(S)^S of Markov transitions on S, viewed as a subset of R^{S×S}, with the usual Euclidean topology and inner product ⟨·, ·⟩. For k ∈ K, denote by k(s, s′) the transition probability from s to s′. Define the domain

D = { (θ, τ, ψ) ∈ Θ′ × T′ × T : τ ⪰_θ ψ }.

Define the correspondence K : D ⇉ K by putting K(θ, τ, ψ) equal to the set of monotone Markov transitions k in K satisfying (i) π_{τ|θ}k = π_{ψ|θ}, and (ii) π_{τ|θ′}k ≤_SD π_{ψ|θ′} for all types θ′. By the choice of the domain D, the correspondence K is nonempty-valued.

Endow D with the restriction of the σ-algebra B(Θ′ × T′ × T). To prove that there exists a measurable selection k̂ from K, we apply the Kuratowski–Ryll-Nardzewski selection theorem (Aliprantis and Border, 2006, 18.13, p. 600). The correspondence K has compact convex values, so it suffices to check that the associated support functions for K are measurable (Aliprantis and Border, 2006, 18.31, p. 611).

Fix ℓ ∈ R^{S×S}. Define the map C : D → R by

C(θ, τ, ψ) = max_{k ∈ K(θ,τ,ψ)} ⟨k, ℓ⟩.

It suffices to show that C is B(Θ′ × T′ × T)-measurable. Define a sequence of auxiliary functions C_m : D × (Θ′)^m → R as follows. Let C_m(θ, τ, ψ, θ′₁, . . . , θ′_m) be the value of the program

maximize ⟨k, ℓ⟩
subject to k ∈ K,
k is monotone,
π_{τ|θ}k = π_{ψ|θ},
π_{τ|θ′_j}k ≤_SD π_{ψ|θ′_j}, j = 1, . . . , m.

This is a standard linear programming problem with a compact feasible set. By Berge's theorem (Aliprantis and Border, 2006, 17.30, p. 569), the value of the linear program is upper semicontinuous (and hence Borel) as a function of the coefficients appearing in the constraints. Since π is Borel, so is each function C_m. By the measurable projection theorem (Theorem 7), each map

(θ, τ, ψ) ↦ inf_{θ′ ∈ (Θ′)^m} C_m(θ, τ, ψ, θ′)

is B(Θ′ × T′ × T)-measurable.
A compactness argument shows that

C(θ, τ, ψ) = inf_m inf_{θ′ ∈ (Θ′)^m} C_m(θ, τ, ψ, θ′),

so C is also B(Θ′ × T′ × T)-measurable.

Sufficiency
Fix a decision environment (X, u) and let f be an implementable social choice function. By the revelation principle (Theorem 1), there is a direct mechanism (t, g) that canonically implements f. We now construct a decision rule ĝ such that the direct mechanism (t̂, ĝ) canonically implements f. Define ĝ by the following commutative diagram, where j denotes the identity transition from Θ to Θ′.

We claim that for each positive ε there exist a natural number m and a vector θ′ ∈ (Θ′)^m such that C_m(θ, τ, ψ, θ′) < C(θ, τ, ψ) + ε. Suppose not. For each θ′ ∈ Θ′, let K_{θ′} be the compact set of monotone Markov transitions k ∈ K satisfying (i) π_{τ|θ}k = π_{ψ|θ}, (ii) π_{τ|θ′}k ≤_SD π_{ψ|θ′}, and (iii) ⟨k, ℓ⟩ ≥ C(θ, τ, ψ) + ε. This family has the finite intersection property, but the intersection over all θ′ ∈ Θ′ is empty, which is a contradiction.

[Commutative diagram (14): starting from Θ, the transitions id, t, t̂, π, k̂, and g map through Θ × Θ′, Θ × Θ′ × T′, Θ × Θ′ × T, Θ′ × T′ × S, Θ′ × T′ × T × S, and Θ′ × T × S into X, defining ĝ.]

Since the diagram commutes, the direct mechanism (t̂, ĝ) induces f. For incentive compatibility, we show that for any strategy (ẑ, q̂) in the new mechanism, there is a strategy (z, q) in the old mechanism inducing the same social choice function. Given (ẑ, q̂), set z = ẑ and define q by the following commutative diagram.

[Commutative diagram (15): q is defined from q̂, t̂, and k̂ through Θ × Θ′ × T′ × T, Θ × Θ′ × T, and Θ′ × T′ × T × S into S.]

Since each transition k̂_{θ,τ,ψ} is downward, it follows that q is feasible. These deviations induce the same social choice function because the following diagram commutes.

[Commutative diagram (16): the diagram in (14), with ẑ and q̂ in place of the truthful strategies.]

Necessity

If t̂ is not most discerning, then there exists a fixed type θ such that t̂(θ) is not most θ-discerning. Set τ = t̂(θ).
Select a test ψ such that τ ⋡_θ ψ. Define the decision set

X = { x_{θ′} : θ′ ∈ Θ } ∪ { x̄ },

where x̄, x_{θ′}, and x_{θ′′} are distinct for all distinct types θ′ and θ′′. Define utilities as follows. Type θ gets utility 1 from x_θ and utility 0 from every other decision in X. For θ′ ≠ θ, type θ′ gets utility 1 from decision x_θ, utility π(ψ|θ′) from x_{θ′}, and utility 0 from all other decisions.

Let f be the social choice function that assigns to each type θ′ with θ′ ≠ θ the decision x_{θ′} with certainty, and assigns to type θ decision x_θ with probability π(ψ|θ) and decision x̄ with probability 1 − π(ψ|θ). Then f can be canonically implemented by the mechanism (t, ĝ), where t is any function satisfying t(θ) = ψ and ĝ is the decision rule specified as follows. If the agent reports θ′ ≠ θ, assign x_{θ′} no matter the test result; if the agent reports θ, select x_θ if the agent passes test ψ and x̄ if the agent fails test ψ.

We claim that f cannot be canonically implemented with the testing function t̂. Suppose for a contradiction that there is a decision rule ĝ such that f is canonically implemented by the mechanism (t̂, ĝ). We separate into two cases.

First suppose π(τ|θ) = 0. Then every type can get the good with probability π(ψ|θ). Since τ ⋡_θ ψ, there is some type θ′ ≠ θ such that π(ψ|θ′) < π(ψ|θ), so type θ′ has a profitable deviation.

Next suppose π(τ|θ) >
0. Define a Markov transition k on S, represented by a vector in [0, 1]^S, by letting k(s) be the probability that the measure g_{θ,τ,s} places on decision x_θ. Then π_{τ|θ}k = π_{ψ|θ}. Since τ ⋡_θ ψ, either k is not monotone, in which case type θ can profitably deviate by reporting type θ and failing the test, or there is some type θ′ ≠ θ such that π_{τ|θ′}k >_SD π_{ψ|θ′}. In this case, type θ′ can profitably deviate by reporting θ′ and exerting effort.

B.3 Proof of Theorem 3
First we use the regularity assumptions to prove that there exists a measurable test selection. Then we follow the proof of the sufficiency part of Theorem 2.
Measurable test selection
We prove that there exists a measurable function t̄ from (Θ′ × T, B(Θ′ × T)) to (T′, B(T′)) such that the test t̄(θ, ψ) is in T̂(θ) and satisfies t̄(θ, ψ) ⪰_θ ψ, for each θ ∈ Θ and ψ ∈ T. Define a correspondence H : Θ′ × T ⇉ T′ by

H(θ, ψ) = { τ ∈ T : τ ⪰_θ ψ }.

Since π is continuous, the graph of H is closed in Θ′ × T × T′. Since the graph of T̂ is Borel, so is the set { (θ, ψ, τ) : τ ∈ T̂(θ) } and also the intersection

{ (θ, ψ, τ) : τ ⪰_θ ψ and τ ∈ T̂(θ) }.

By the measurable projection theorem (Theorem 7), the associated correspondence from (Θ′ × T, B(Θ′ × T)) to T′ is measurable. Moreover, this correspondence has closed values, so we can apply the Kuratowski–Ryll-Nardzewski selection theorem (Aliprantis and Border, 2006, 18.13, p. 600) to obtain the desired function t̄.

Sufficiency
With t̄ in hand, the proof is almost the same as the proof of Theorem 2 in Appendix B.2, but we are not given t̂. Instead, t̂ is defined by the commutative diagram. The second and third rows of (14) and (16) become

[Diagram rows: Θ × Θ′ → Θ × Θ′ × T′ → Θ × Θ′ × T, via the transitions t, t̂, and t̄.]

In (15), put t̄ in place of t̂. The rest of the proof is completed as above.

B.4 Proof of Proposition 1
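Before the formal case analysis, the characterizations in this proposition and in Proposition 2 can be checked numerically in the binary-signal case. The sketch below hard-codes the two-case solution segment derived in this proof and the λ-existence test used in the proof of Proposition 2; both closed forms are assumptions of the sketch (restating what the proofs establish), and all passage rates are hypothetical.

```python
# Binary-signal score conversions (Proposition 1): the monotone k = (k(0), k(1))
# with pi_{tau|theta} k = pi_{psi|theta} form a segment between two vertices.
# Below, p = pi(tau|theta) and q = pi(psi|theta).

def vertices(p, q):
    """Endpoints of the solution segment (assumed closed forms)."""
    if p >= q:                                   # compare passage probabilities
        return (0.0, q / p), (q, q)
    return (1.0 - (1.0 - q) / (1.0 - p), 1.0), (q, q)  # compare failure probabilities

def feasible(k0, k1, p, q, tol=1e-9):
    """Monotone (k1 >= k0) and maps passage rate p to passage rate q."""
    return (-tol <= k0 <= k1 + tol <= 1 + tol
            and abs(p * k1 + (1 - p) * k0 - q) < tol)

for p in (0.2, 0.5, 0.9):
    for q in (0.1, 0.5, 0.8):
        v, w = vertices(p, q)
        for lam in (0.0, 0.25, 0.5, 1.0):        # every convex combination works
            assert feasible(lam * v[0] + (1 - lam) * w[0],
                            lam * v[1] + (1 - lam) * w[1], p, q)

# theta-discernment (Proposition 2): tau is weakly more theta-discerning than
# psi iff some lambda in [0, 1] satisfies the relevant inequality for all types.
def discerns(pi_tau, pi_psi, theta, grid=201):
    p, q = pi_tau[theta], pi_psi[theta]
    for i in range(grid):
        lam = i / (grid - 1)
        if p >= q:
            ok = all((lam * pi_tau[t] + (1 - lam) * p) * q
                     <= pi_psi[t] * p + 1e-12 for t in pi_tau)
        else:
            ok = all((lam * (1 - pi_tau[t]) + (1 - lam) * (1 - p)) * (1 - q)
                     >= (1 - pi_psi[t]) * (1 - p) - 1e-12 for t in pi_tau)
        if ok:
            return True
    return False

pi_tau = {"hi": 0.9, "mid": 0.3, "lo": 0.1}      # tau separates type "hi" sharply
pi_psi = {"hi": 0.8, "mid": 0.5, "lo": 0.4}
print(discerns(pi_tau, pi_psi, "hi"))  # True: tau screens "hi" at least as well
print(discerns(pi_psi, pi_tau, "hi"))  # False
```

The grid search over λ is crude but sufficient here because the constraints are linear in λ, so feasibility on a fine grid tracks exact feasibility up to the stated tolerances.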
A Markov transition k on {0, 1} can be represented as a vector (k(0), k(1)) ∈ [0, 1]², where k(s) is the probability of transitioning from s ∈ {0, 1} to 1. A Markov transition k is monotone and satisfies π_{τ|θ}k = π_{ψ|θ} if and only if the vector (k(0), k(1)) satisfies k(1) ≥ k(0) and

π(τ|θ)k(1) + (1 − π(τ|θ))k(0) = π(ψ|θ).

We separate the solution into cases.

1. If π(τ|θ) ≥ π(ψ|θ), then the solutions are given by

(k(0), k(1)) = λ (0, π(ψ|θ)/π(τ|θ)) + (1 − λ) (π(ψ|θ), π(ψ|θ)), for λ ∈ [0, 1].

2. If π(τ|θ) < π(ψ|θ), then the solutions are given by

(k(0), k(1)) = λ (1 − π̄(ψ|θ)/π̄(τ|θ), 1) + (1 − λ) (π(ψ|θ), π(ψ|θ)), for λ ∈ [0, 1].

In each case, the left term is the vector representation of F̃_{τ|θ}Q̃_{ψ|θ} and the right term is the vector representation of the constant Markov transition π_{ψ|θ}.

(Proof that the graph of H is closed, as claimed in the proof of Theorem 3.) Take a sequence (θ_n, ψ_n, τ_n) in gr H converging to a limit (θ, ψ, τ) in Θ′ × T × T′. For each n, there is a monotone Markov transition k_n on S such that (i) π_{τ_n|θ_n}k_n = π_{ψ_n|θ_n}, and (ii) π_{τ_n|θ′}k_n ≤_SD π_{ψ_n|θ′} for all θ′ ∈ Θ. The space of Markov transitions on S is compact, so after passing to a subsequence, we may assume that k_n converges to a limit k, which must be monotone. Since π is continuous, taking limits gives (i) π_{τ|θ}k = π_{ψ|θ}, and (ii) π_{τ|θ′}k ≤_SD π_{ψ|θ′} for each θ′ ∈ Θ. Therefore, (θ, ψ, τ) is in gr H.

B.5 Proof of Proposition 2
Fix a type θ and tests τ and ψ. If k = λF̃_{τ|θ}Q̃_{ψ|θ} + (1 − λ)π_{ψ|θ}, then by Lemma 1,

π_{τ|θ′}k = λπ_{τ|θ′}F̃_{τ|θ}Q̃_{ψ|θ} + (1 − λ)π_{ψ|θ} = (λπ_{τ|θ′} + (1 − λ)π_{τ|θ})F̃_{τ|θ}Q̃_{ψ|θ}.

Therefore, by Proposition 1, we have τ ⪰_θ ψ if and only if there exists λ ∈ [0, 1] such that

(λπ_{τ|θ′} + (1 − λ)π_{τ|θ})F̃_{τ|θ}Q̃_{ψ|θ} ≤_SD π_{ψ|θ′}, (17)

for all θ′ ∈ Θ. Now work with cases.

1. If π(τ|θ) = π(ψ|θ) = 0, the result is clear.

2. If π(τ|θ) ≥ π(ψ|θ) and π(τ|θ) > 0, we compare the probability of passage. The inequality holds if and only if

[λπ(τ|θ′) + (1 − λ)π(τ|θ)] π(ψ|θ)/π(τ|θ) ≤ π(ψ|θ′).

This reduces to the desired inequality.

3. If π(τ|θ) < π(ψ|θ), we compare the probability of failure. The inequality holds if and only if

[λπ̄(τ|θ′) + (1 − λ)π̄(τ|θ)] π̄(ψ|θ)/π̄(τ|θ) ≥ π̄(ψ|θ′).

B.6 Proof of Proposition 3
Fix a type θ and tests τ₁ and τ₂.

One direction is clear. If τ₁ and τ₂ are equal, then τ₁ and τ₂ are clearly θ-equivalent. If type θ is minimal on τ₁ and τ₂, take λ = 1 in Proposition 2 to see that τ₁ and τ₂ are θ-equivalent.

For the other direction, suppose τ₁ and τ₂ are θ-equivalent. Without loss, we may assume π(τ₁|θ) ≥ π(τ₂|θ). We separate into cases according to whether θ is minimal on τ₁.

First suppose θ is minimal on τ₁. Since τ₁ ⪰_θ τ₂, there are probabilities k(0) and k(1) such that for all types θ′,

π(τ₂|θ) = k(0) + (k(1) − k(0))π(τ₁|θ) ≤ k(0) + (k(1) − k(0))π(τ₁|θ′) ≤ π(τ₂|θ′).

Now suppose θ is not minimal for τ₁, so there exists some type θ′′ such that π(τ₁|θ′′) < π(τ₁|θ). In particular, π(τ₁|θ) > 0.

Claim. π(τ₁|θ) = π(τ₂|θ).

Suppose for a contradiction that π(τ₁|θ) > π(τ₂|θ). By Proposition 2, there are constants λ₁ and λ₂ in [0, 1] such that for all types θ′,

[λ₁π(τ₁|θ′) + (1 − λ₁)π(τ₁|θ)]π(τ₂|θ) ≤ π(τ₂|θ′)π(τ₁|θ),
[λ₂π̄(τ₂|θ′) + (1 − λ₂)π̄(τ₂|θ)]π̄(τ₁|θ) ≥ π̄(τ₁|θ′)π̄(τ₂|θ).

With θ′ = θ′′, the first inequality is weakest when λ₁ = 1, so

π(τ₁|θ′′)π(τ₂|θ) ≤ π(τ₂|θ′′)π(τ₁|θ). (18)

Taking λ₂ = 0 in the second inequality, and noting that π̄(τ₂|θ) = 1 − π(τ₂|θ) > 0, yields the contradiction π̄(τ₁|θ) ≥ π̄(τ₁|θ′′). Therefore, the inequality must hold with λ₂ = 1, so

π̄(τ₂|θ′′)π̄(τ₁|θ) ≥ π̄(τ₁|θ′′)π̄(τ₂|θ). (19)

We show that (18) and (19) are incompatible. In (19), subtract π̄(τ₂|θ)π̄(τ₁|θ) from both sides to get

[π̄(τ₂|θ′′) − π̄(τ₂|θ)]π̄(τ₁|θ) ≥ [π̄(τ₁|θ′′) − π̄(τ₁|θ)]π̄(τ₂|θ),

which is, equivalently,

[π(τ₂|θ) − π(τ₂|θ′′)]π̄(τ₁|θ) ≥ [π(τ₁|θ) − π(τ₁|θ′′)]π̄(τ₂|θ).

The right side is strictly positive and π̄(τ₁|θ) < π̄(τ₂|θ), so

π(τ₂|θ) − π(τ₂|θ′′) > π(τ₁|θ) − π(τ₁|θ′′). (20)

Now negate (18) and add π(τ₁|θ)π(τ₂|θ) to both sides to obtain

[π(τ₁|θ) − π(τ₁|θ′′)]π(τ₂|θ) ≥ [π(τ₂|θ) − π(τ₂|θ′′)]π(τ₁|θ).

But π(τ₂|θ) < π(τ₁|θ), so (20) gives the opposite inequality.

With the claim established, we now complete the proof. By Proposition 2 there are constants λ₁ and λ₂ in [0, 1] such that for all types θ′,

[λ₁π(τ₁|θ′) + (1 − λ₁)π(τ₁|θ)]π(τ₂|θ) ≤ π(τ₂|θ′)π(τ₁|θ),
[λ₂π(τ₂|θ′) + (1 − λ₂)π(τ₂|θ)]π(τ₁|θ) ≤ π(τ₁|θ′)π(τ₂|θ).

After cancelling the common value of π(τ₁|θ) and π(τ₂|θ), which is nonzero by assumption, we have

λ₁π(τ₁|θ′) + (1 − λ₁)π(τ₁|θ) ≤ π(τ₂|θ′), (21)
λ₂π(τ₂|θ′) + (1 − λ₂)π(τ₂|θ) ≤ π(τ₁|θ′). (22)

It suffices to show that λ₁ = λ₂ = 1, for then π(τ₁|θ′) = π(τ₂|θ′) for all types θ′. Take θ′ = θ′′ in both inequalities. We have π(τ₁|θ) = π(τ₂|θ) > π(τ₁|θ′′), so (22) implies that π(τ₂|θ′′) ≤ π(τ₁|θ′′). Substituting this into (21), we get λ₁ = 1. Then π(τ₁|θ′′) ≤ π(τ₂|θ′′), so (22) implies λ₂ = 1.

B.7 Proof of Proposition 4
We simply translate Proposition 3 into the language of authentication rates. Suppose t̂₁ and t̂₂ are most-discerning testing functions, and let α₁ and α₂ be the induced authentication rates. For each type θ, we know t̂₁(θ) and t̂₂(θ) are most θ-discerning tests. Apply Proposition 3 and translate the conclusion into the language of authentication rates.

B.8 Proof of Theorem 4
Let α be an authentication rate. First, suppose that α is most discerning. Let T = { τ_{θ′} : θ′ ∈ Θ }, and define the passage rate π by π(τ_{θ′}|θ) = α(θ′|θ) for all types θ and θ′. Combining Definition 8 and Proposition 2, we see that the testing function θ ↦ τ_θ is most discerning. By construction, this testing function induces α.

Now suppose α is induced by a most-discerning testing function t̂ in a testing environment (T, π). Substitute the equality α(θ′|θ) = π(t̂(θ′)|θ) into Proposition 2 to conclude that α is most discerning.

B.9 Proof of Proposition 5
We apply the dominated convergence theorem. As λ converges to 0 pointwise, Λ(z|θ) converges to 1 for all z and θ with z ≤ θ. Hence ϕ(θ) converges to ϕ_M(θ), for each θ. Likewise, as λ converges to ∞ pointwise, Λ(z|θ) converges to 0 for all z and θ with z < θ. Hence ϕ(θ) converges to θ, for each θ.

B.10 Proof of Proposition 6
We will prove a stronger result that we use below. Define functions λ₊ and λ₋ from Θ to [0, ∞) by

λ₊(θ) = −D₊α(θ|θ),   λ₋(θ) = D₋α(θ|θ).

In the main text, we only work with λ₊, which is denoted λ. Extend the function Λ to Λ : Θ × Θ → [0, 1] by

Λ(θ′|θ) = exp( −∫_{θ′}^{θ} λ₊(s) ds ) if θ ≥ θ′,   Λ(θ′|θ) = exp( −∫_{θ}^{θ′} λ₋(s) ds ) if θ < θ′.

With these definitions, we now prove that α(θ′|θ) ≥ Λ(θ′|θ) for all types θ′ and θ. Fix θ and θ′. For each h, transitivity gives

α(θ′|θ + h) ≥ α(θ′|θ)α(θ|θ + h).

Subtract α(θ′|θ) from each side to get

α(θ′|θ + h) − α(θ′|θ) ≥ α(θ′|θ)(α(θ|θ + h) − 1) = α(θ′|θ)[α(θ|θ + h) − α(θ|θ)].

Dividing by h and passing to the limit as h ↓ 0 and h ↑ 0 gives

D₊α(θ′|θ) ≥ −λ₊(θ)α(θ′|θ)   and   D₋α(θ′|θ) ≤ λ₋(θ)α(θ′|θ).

Now we use absolute continuity to convert these local bounds into global bounds. Fix a report θ′. Define the function ∆ on [θ̲, θ̄] by

∆(θ) = α(θ′|θ) / Λ(θ′|θ).

By construction, ∆(θ′) = 1. We will argue that ∆(θ) ≥ 1 for all θ. For θ′ < θ, if D₊Λ(θ′|θ) exists, then

D₊∆(θ) = (1/Λ(θ′|θ)) ( D₊α(θ′|θ) + λ₊(θ)α(θ′|θ) ) ≥ 0.

For θ′ > θ, if D₋Λ(θ′|θ) exists, then

D₋∆(θ) = (1/Λ(θ′|θ)) ( D₋α(θ′|θ) − λ₋(θ)α(θ′|θ) ) ≤ 0.

Since Λ(θ′|·) is absolutely continuous, these inequalities hold almost everywhere. Moreover, the product of absolutely continuous functions on a compact set is absolutely continuous, so ∆ is absolutely continuous, and hence the fundamental theorem of calculus gives ∆(θ) ≥ 1, as desired.
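The bound just proved can be illustrated numerically. The authentication rate below, exponential in the distance between report and type, is a hypothetical specimen of my own (not from the paper): it is multiplicatively transitive, its hazard λ₊ is the constant c, and so the lower bound Λ holds (here with equality) on a grid of types.

```python
import math

# Illustration of alpha(theta'|theta) >= Lambda(theta'|theta) for a
# multiplicatively transitive authentication rate. The specimen alpha and
# the rate c = 2 are hypothetical assumptions of this sketch.

c = 2.0
def alpha(report, typ):
    return math.exp(-c * abs(typ - report))

pts = [i / 10 for i in range(11)]

# Transitivity: alpha(r|t) >= alpha(r|s) * alpha(s|t), by the triangle inequality.
for t in pts:
    for s in pts:
        for r in pts:
            assert alpha(r, t) >= alpha(r, s) * alpha(s, t) - 1e-12

# The hazard lambda_plus(theta) = -d/dh alpha(theta|theta + h) at h = 0+ is c.
h = 1e-6
for t in pts:
    lam_plus = -(alpha(t, t + h) - alpha(t, t)) / h
    assert abs(lam_plus - c) < 1e-3

# Hence Lambda(theta'|theta) = exp(-c (theta - theta')) for theta >= theta',
# and the exponential lower bound of Proposition 6 holds.
for t in pts:
    for r in pts:
        if r <= t:
            assert alpha(r, t) >= math.exp(-c * (t - r)) - 1e-9
print("ok")
```

A rate that mimics some types more easily than the triangle inequality suggests would satisfy the bound strictly; equality here reflects that the exponential specimen is exactly the extremal case.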
B.11 Proof of Proposition 7
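Before the argument, a numerical sketch of the objective being maximized. It assumes the reduced-form virtual value ϕ(θ) = θ − (1/f(θ)) ∫_θ^θ̄ Λ(θ|z) f(z) dz, specialized to F uniform on [0, 1] and a constant hazard λ, so that Λ(θ|z) = exp(−λ(z − θ)); these closed forms are assumptions of the sketch, chosen to match the limits in Proposition 5 (Myerson's virtual value as λ → 0, complete information as λ → ∞).

```python
import math

# Virtual value with testing, for F uniform on [0, 1] and constant hazard lam:
#   phi(theta) = theta - (1 - exp(-lam * (1 - theta))) / lam,
# a hypothetical closed form assumed for this sketch.

def phi(theta, lam):
    if lam == 0.0:
        return 2 * theta - 1            # Myerson: theta - (1 - F)/f
    return theta - (1 - math.exp(-lam * (1 - theta))) / lam

for theta in [0.0, 0.3, 0.7, 1.0]:
    myerson = 2 * theta - 1
    assert abs(phi(theta, 1e-6) - myerson) < 1e-5    # lam -> 0: no verification
    assert abs(phi(theta, 1e6) - theta) < 1e-5       # lam -> infinity: perfect
    # For intermediate lam, phi lies between the two benchmarks.
    assert myerson - 1e-12 <= phi(theta, 1.0) <= theta + 1e-12

# With cost c(q) = q^2 / 2, the pointwise maximizer of phi*q - c(q) is
# q_star = max(phi, 0): better verification raises the optimal quantity.
q_star = lambda theta, lam: max(phi(theta, lam), 0.0)
assert q_star(0.3, 5.0) >= q_star(0.3, 0.5) >= q_star(0.3, 0.0)
print("ok")
```

The proof below derives the general decomposition of the principal's objective into virtual surplus; the sketch only illustrates how the testing technology moves the virtual value between the two familiar benchmarks.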
First we introduce notation. For a given quantity function q, there is a one-to-one correspondence between the transfer function t and the utility function U, given by U(θ) = θq(θ) − t(θ). We will interchangeably refer to such a mechanism as (q, t) or (q, U). Let

u(θ′|θ) = α(θ′|θ)[θq(θ′) − t(θ′)],   U(θ) = u(θ|θ) = max_{θ′ ∈ Θ} u(θ′|θ).

Lemma 6 (Utility bound). Let q be a bounded quantity function, and let U be a utility function. If (q, U) is incentive compatible, then for every type θ, we have

U(θ) ≥ ∫_{θ̲}^{θ} Λ(z|θ)q(z) dz. (23)

If the function θ ↦ Λ(θ|θ̲)q(θ) is increasing and the global upper bound is satisfied, it is incentive compatible for (23) to hold with equality.

Lemma 6 is proved in Appendix B.12. Here we prove the theorem, taking Lemma 6 as given. There is no loss in restricting attention to bounded quantity functions. Pick a bounded quantity function q : Θ → R₊. The principal's objective function can be decomposed as the difference between the total surplus and the agent's rents:

∫_{θ̲}^{θ̄} [θq(θ) − c(q(θ))] f(θ) dθ − ∫_{θ̲}^{θ̄} U(θ) f(θ) dθ.

Plug in the bound from Lemma 6 and switch the order of integration to obtain the following upper bound on the principal's objective:

V(q) = ∫_{θ̲}^{θ̄} [ϕ(θ)q(θ) − c(q(θ))] f(θ) dθ.

The quantity function q⋆ from the theorem statement maximizes the expression in brackets pointwise, and t⋆ is the corresponding transfer that achieves the utility bound. Since ϕ is increasing and the global upper bound is satisfied, this mechanism is incentive compatible. Since c′ is strictly increasing, the pointwise maximizer is unique, so this quantity function q⋆ is unique almost everywhere.

B.12 Proof of Lemma 6
Let (q, t) be a bounded incentive compatible mechanism. The first step is showing that the equilibrium utility function U is absolutely continuous. Fix types θ and θ′. We have

U(θ′) ≥ α(θ′|θ)(θ′q(θ) − t(θ))₊ ≥ Λ(θ′|θ)(θ′q(θ) − t(θ))₊ ≥ Λ(θ′|θ)(θ′q(θ) − t(θ)),

where the first inequality uses individual rationality and incentive compatibility, and the second uses the inequality between α and Λ established in Appendix B.10. Therefore,

U(θ) − U(θ′) ≤ θq(θ) − t(θ) − Λ(θ′|θ)(θ′q(θ) − t(θ))
= θq(θ) − t(θ) − Λ(θ′|θ)((θ′ − θ)q(θ) + θq(θ) − t(θ))
= (1 − Λ(θ′|θ))(θq(θ) − t(θ)) − Λ(θ′|θ)(θ′ − θ)q(θ)
≤ (1 − Λ(θ′|θ))(θ̄‖q‖∞ + ‖t‖∞) + |θ′ − θ|‖q‖∞.

To avoid separating into cases according to the relative sizes of θ and θ′, set λ̂ = λ₊ ∨ λ₋. Since λ̂ ≤ λ₊ + λ₋, we know λ̂ is integrable. Using the inequality e⁻ˣ ≥ 1 − x, we have

1 − Λ(θ′|θ) ≤ 1 − exp( −∫_{θ′∧θ}^{θ′∨θ} λ̂(z) dz ) ≤ ∫_{θ′∧θ}^{θ′∨θ} λ̂(z) dz.

Pick a quantity q̄ such that θ̄q̄ = c(q̄). Offering more than q̄ will always result in weakly negative profits, so we can remove those offerings from the menu and increase the principal's revenue. Therefore, there is no loss in focusing on quantity functions that are bounded above by q̄.

Therefore,

U(θ) − U(θ′) ≤ C ∫_{θ′∧θ}^{θ′∨θ} (λ̂(z) + 1) dz,

where C = max{1, θ̄}‖q‖∞ + ‖t‖∞. Switching the roles of θ and θ′ gives the same inequality, so we conclude that

|U(θ) − U(θ′)| ≤ C ∫_{θ′∧θ}^{θ′∨θ} (λ̂(z) + 1) dz,

which proves the desired absolute continuity.

Now we use the absolute continuity of U to establish the bound.
Define the auxiliary function $\Delta$ on $[\underline{\theta}, \bar{\theta}]$ by
\[
\Delta(\theta) = \Lambda(\theta \mid \bar{\theta}) \left( U(\theta) - \int_{\underline{\theta}}^{\theta} \Lambda(z \mid \theta)\, q(z) \,\mathrm{d}z \right)
= \Lambda(\theta \mid \bar{\theta})\, U(\theta) - \int_{\underline{\theta}}^{\theta} \Lambda(z \mid \bar{\theta})\, q(z) \,\mathrm{d}z.
\]
The function $\Delta$ is absolutely continuous since it is the product of absolutely continuous functions on a compact set. By Theorem 1 in Milgrom and Segal (2002), whenever $U$ is differentiable, we have
\[
q(\theta) - \lambda_+(\theta)\, U(\theta) = D_+ u(\theta \mid \theta) \le U'(\theta) \le D_- u(\theta \mid \theta) = q(\theta) + \lambda_-(\theta)\, U(\theta).
\]
At each point $\theta$ where all the functions involved are differentiable, which holds almost everywhere, we have
\[
\Delta'(\theta) = \lambda_+(\theta)\, \Lambda(\theta \mid \bar{\theta})\, U(\theta) + \Lambda(\theta \mid \bar{\theta})\, U'(\theta) - \Lambda(\theta \mid \bar{\theta})\, q(\theta)
= \Lambda(\theta \mid \bar{\theta}) \big[ U'(\theta) - \big(q(\theta) - \lambda_+(\theta)\, U(\theta)\big) \big] \ge 0.
\]
Since $\Delta(\underline{\theta}) = \Lambda(\underline{\theta} \mid \bar{\theta})\, U(\underline{\theta}) \ge 0$ by individual rationality, the fundamental theorem of calculus implies that $\Delta(\theta) \ge 0$ for every $\theta$, as desired.

It remains to check that the global incentive constraints are satisfied when (23) holds with equality, provided that $\Lambda(\theta \mid \bar{\theta})\, q(\theta)$ is increasing and the global upper bound is satisfied. Expressing incentive compatibility in terms of $U$, we need to show that for all types $\theta$ and $\theta'$,
\[
U(\theta) \ge \alpha(\theta' \mid \theta) \big( U(\theta') + (\theta - \theta')\, q(\theta') \big). \tag{24}
\]
We consider upward and downward deviations separately. First suppose $\theta' > \theta$. Write (24) as
\[
U(\theta) + \alpha(\theta' \mid \theta)(\theta' - \theta)\, q(\theta') \ge \alpha(\theta' \mid \theta)\, U(\theta'),
\]
or equivalently,
\[
\int_{\underline{\theta}}^{\theta} \Lambda(z \mid \theta)\, q(z) \,\mathrm{d}z + \alpha(\theta' \mid \theta) \int_{\theta}^{\theta'} q(\theta') \,\mathrm{d}z
\ge \alpha(\theta' \mid \theta) \int_{\underline{\theta}}^{\theta} \Lambda(z \mid \theta')\, q(z) \,\mathrm{d}z + \alpha(\theta' \mid \theta) \int_{\theta}^{\theta'} \Lambda(z \mid \theta')\, q(z) \,\mathrm{d}z.
\]
Since $\alpha(\theta' \mid \theta) \le 1$ and $\Lambda(z \mid \theta) \ge \Lambda(z \mid \theta')$ for $z \le \theta \le \theta'$, we get the inequality between the first terms. For the inequality between the second terms, multiply by $\Lambda(\theta' \mid \bar{\theta}) / \alpha(\theta' \mid \theta)$ and use the fact that $\Lambda(z \mid \bar{\theta})\, q(z)$ is increasing in $z$.

Now suppose $\theta' < \theta$. Express (24) as
\[
\int_{\underline{\theta}}^{\theta'} \Lambda(z \mid \theta)\, q(z) \,\mathrm{d}z + \int_{\theta'}^{\theta} \Lambda(z \mid \theta)\, q(z) \,\mathrm{d}z
\ge \alpha(\theta' \mid \theta) \left[ \int_{\underline{\theta}}^{\theta'} \Lambda(z \mid \theta')\, q(z) \,\mathrm{d}z + \int_{\theta'}^{\theta} q(\theta') \,\mathrm{d}z \right]
= \frac{\alpha(\theta' \mid \theta)}{\Lambda(\theta' \mid \theta)} \left[ \int_{\underline{\theta}}^{\theta'} \Lambda(z \mid \theta)\, q(z) \,\mathrm{d}z + \int_{\theta'}^{\theta} \Lambda(\theta' \mid \theta)\, q(\theta') \,\mathrm{d}z \right].
\]
Multiply both sides by $\Lambda(\theta \mid \bar{\theta})$ and rearrange to get the equivalent inequality
\[
\alpha(\theta' \mid \theta) \le \Lambda(\theta' \mid \theta) \, \frac{\int_{\underline{\theta}}^{\theta'} \Lambda(z \mid \bar{\theta})\, q(z) \,\mathrm{d}z + \int_{\theta'}^{\theta} \Lambda(z \mid \bar{\theta})\, q(z) \,\mathrm{d}z}{\int_{\underline{\theta}}^{\theta'} \Lambda(z \mid \bar{\theta})\, q(z) \,\mathrm{d}z + \int_{\theta'}^{\theta} \Lambda(\theta' \mid \bar{\theta})\, q(\theta') \,\mathrm{d}z}
= \Lambda(\theta' \mid \theta) \int_{\underline{\theta}}^{\theta} \Lambda(z \mid \bar{\theta})\, q(z) \,\mathrm{d}z \bigg/ \int_{\underline{\theta}}^{\theta} \Lambda(z \wedge \theta' \mid \bar{\theta})\, q(z \wedge \theta') \,\mathrm{d}z,
\]
which holds by the global upper bound.

B.13 Proof of Proposition 8
The same argument as in the proof of Proposition 7 (Appendix B.11) shows that the principal's value as a function of $q$ can be written as
\[
V(q) = \int_{\underline{\theta}}^{\bar{\theta}} \varphi(\theta)\, q(\theta)\, f(\theta) \,\mathrm{d}\theta.
\]
The quantity function $q^{\star}$ from the theorem statement maximizes the integrand pointwise, and $t^{\star}$ is the corresponding transfer function. Since $q^{\star}$ is monotone, this mechanism is incentive compatible and hence optimal. Except at points $\theta$ where $\varphi(\theta) = 0$, the pointwise maximizer is unique, and hence the mechanism is unique almost everywhere outside the set $\varphi^{-1}(0)$.

B.14 Proof of Theorem 5
The proof of Theorem 2 (Appendix B.2) extends to multiple agents by taking products, as in the proof of Theorem 1 (Appendix B.1). For sufficiency, define $\hat{k}_i$ for each agent $i$ and set $\hat{k} = \otimes_i \hat{k}_i$. For necessity, if there is some $j$ such that $\hat{t}_j$ is not most-discerning, apply the construction above to agent $j$, assuming that every type of every other agent is indifferent over all decisions.

B.15 Proof of Proposition 9

Applying the same argument player by player gives
\[
V(Q) = \int_{\Theta} \Big( \sum_{i=1}^{n} \varphi_i(\theta_i)\, q_i(\theta_i) \Big) f(\theta) \,\mathrm{d}\theta.
\]
This is maximized by the quantity functions $q^{\star}$ in the theorem statement, which induce monotone interim quantity functions. For each agent $i$, the transfer function is pinned down by the envelope expression for $U_i$. We then choose a transfer function $t^{\star}$ consistent with these interim transfer functions that also satisfies the ex post participation constraints.

C Supplementary proofs
C.1 Proof of Lemma 1

(i) Fix $p \in [0,1]$. We must show that $(\mu \tilde{F}_\mu)[0,p] = p$. For each $s \in \mathbf{R}$, let $F_\mu(s-)$ denote the left limit of $F_\mu$ at $s$. With this notation, we have
\[
\tilde{F}_\mu(s, [0,p]) =
\begin{cases}
1 & \text{if } F_\mu(s) \le p, \\
\dfrac{p - F_\mu(s-)}{F_\mu(s) - F_\mu(s-)} & \text{if } F_\mu(s-) \le p < F_\mu(s), \\
0 & \text{if } F_\mu(s-) > p.
\end{cases}
\]
Set $F^+_\mu(p) = \sup\{ t \in \mathbf{R} : F_\mu(t-) \le p \}$. (The function $F^+_\mu$ is the right-continuous inverse of $F_\mu$; it is more commonly defined as $\inf\{ t \in \mathbf{R} : F_\mu(t) > p \}$.) By the left-continuity of the map $t \mapsto F_\mu(t-)$, we have $F_\mu(F^+_\mu(p)-) \le p$, with equality if $\mu$ is continuous at $F^+_\mu(p)$. If $\mu$ is continuous at $F^+_\mu(p)$, then
\[
(\mu \tilde{F}_\mu)[0,p] = \mu(-\infty, F^+_\mu(p)] = F_\mu(F^+_\mu(p)) = p.
\]
If $\mu$ is discontinuous at $F^+_\mu(p)$, then
\[
(\mu \tilde{F}_\mu)[0,p] = F_\mu(F^+_\mu(p)-) + \mu(\{F^+_\mu(p)\}) \, \frac{p - F_\mu(F^+_\mu(p)-)}{F_\mu(F^+_\mu(p)) - F_\mu(F^+_\mu(p)-)},
\]
and the right side simplifies to $p$.

(ii) Fix $s \in \mathbf{R}$. For each $p \in [0,1]$, we have $Q_\nu(p) \le s$ if and only if $p \le F_\nu(s)$. Therefore,
\[
\tilde{Q}_\nu(p, (-\infty, s]) =
\begin{cases}
1 & \text{if } p \le F_\nu(s), \\
0 & \text{if } p > F_\nu(s).
\end{cases}
\]
We conclude that
\[
(U[0,1]\, \tilde{Q}_\nu)(-\infty, s] = U[0,1]\,[0, F_\nu(s)] = F_\nu(s) = \nu(-\infty, s].
\]

C.2 Proof of Lemma 2
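Before the proof, here is a numerical illustration of the transitions from Lemma 1 and of the downward coupling $k = \tilde{F}_\mu \tilde{Q}_\nu$ constructed below. The two distributions are assumptions chosen purely for the sketch.

```python
import random

random.seed(0)

# Concrete illustration (distributions assumed for this sketch):
#   mu = Uniform(0,1);  nu = (1/2)*(point mass at 0) + (1/2)*Uniform(0,1).
# On [0,1], F_mu(s) = s <= 1/2 + s/2 = F_nu(s), so mu stochastically
# dominates nu.

def F_tilde_nu():
    # One draw of nu pushed through its randomized CDF: at the atom {0},
    # draw uniformly from [F_nu(0-), F_nu(0)] = [0, 1/2] (as in Lemma 1(i)).
    x = 0.0 if random.random() < 0.5 else random.random()
    return random.uniform(0.0, 0.5) if x == 0.0 else 0.5 + 0.5 * x

def Q_nu(p):
    # Left-continuous quantile of nu: 0 for p <= 1/2, else 2p - 1.
    return 0.0 if p <= 0.5 else 2.0 * p - 1.0

n = 100_000
# Lemma 1(i): nu pushed through its randomized CDF should be Uniform[0,1].
p_draws = [F_tilde_nu() for _ in range(n)]

# Lemma 2: since mu is continuous, F_tilde_mu(x) = x, so the coupling
# k = F_tilde_mu Q_tilde_nu sends a mu-draw x to Q_nu(x) <= x ("downward"),
# and the image draws are distributed according to nu.
mu_draws = [random.random() for _ in range(n)]
t_draws = [Q_nu(x) for x in mu_draws]

print(abs(sum(p_draws) / n - 0.5) < 0.01)              # uniform mean recovered
print(all(t <= x for t, x in zip(t_draws, mu_draws)))  # coupling is downward
print(abs(sum(1 for t in t_draws if t == 0.0) / n - 0.5) < 0.02)  # atom of nu
```

All three checks print `True`: the randomized CDF turns the atom into a uniform stretch, and the composed transition moves each draw weakly downward while reproducing the dominated distribution.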
By Lemma 1 (iii), we have (ii) $\Longrightarrow$ (iii). We prove that (i) $\Longrightarrow$ (ii) and (iii) $\Longrightarrow$ (i).

(i) $\Longrightarrow$ (ii). Suppose $\mu \succeq_{\mathrm{SD}} \nu$. Set $k = \tilde{F}_\mu \tilde{Q}_\nu$. Fix $s \in S$ and let $S_0 = (-\infty, s] \cap S$. Recall that the left-continuous quantile function satisfies the Galois inequality
\[
Q_\nu(p) \le s \iff p \le F_\nu(s).
\]
Thus,
\[
k(s, S_0) = \int \tilde{F}_\mu(s, \mathrm{d}p)\, \tilde{Q}_\nu(p, S_0) = \int_0^{F_\nu(s)} \tilde{F}_\mu(s, \mathrm{d}p).
\]
By first-order stochastic dominance, $F_\nu(s) \ge F_\mu(s)$, so the right side is at least $\tilde{F}_\mu(s, [0, F_\mu(s)])$, which equals $1$. Hence $k$ is downward.

(iii) $\Longrightarrow$ (i). Let $k$ be a downward transition with $\mu k = \nu$. Fix $s \in S$, and let $S_0 = S \cap (-\infty, s]$. We have
\[
\nu(S_0) = (\mu k)(S_0) = \int_{\mathbf{R}} \mu(\mathrm{d}t)\, k(t, S_0) \ge \int_{(-\infty, s]} \mu(\mathrm{d}t)\, k(t, S_0) = \mu(S_0),
\]
where the last equality uses that $k$ is downward.

C.3 Proof of Lemma 3

(i) Suppose $m$ is monotone on $S$. If $\mu \succeq_{\mathrm{SD}} \nu$, then by Lemma 2 (iii) there is a downward transition $k$ on $S$ such that $\mu k = \nu$. Fix $s \in S$, and set $S_0 = (-\infty, s] \cap S$. Then
\[
(\nu m)(S_0) - (\mu m)(S_0) = (\mu k m)(S_0) - (\mu m)(S_0)
= \int_S \mu(\mathrm{d}t) \big[ (km)(t, S_0) - m(t, S_0) \big]
= \int_S \mu(\mathrm{d}t) \int_{-\infty}^{t} k(t, \mathrm{d}z) \big[ m(z, S_0) - m(t, S_0) \big] \ge 0,
\]
where the last equality holds because $k$ is downward (so $k(t, \cdot)$ is supported on $(-\infty, t]$), and the inequality holds because $m$ is monotone.

For the other direction, take $\mu = \delta_s$ and $\nu = \delta_t$ for $s > t$. Then $m_s = \delta_s m \succeq_{\mathrm{SD}} \delta_t m = m_t$.

(ii) Let $m$ and $m'$ be monotone transitions on $S$. Suppose $\mu$ and $\nu$ are measures on $S$ satisfying $\mu \succeq_{\mathrm{SD}} \nu$. Applying (i) twice, we have $\mu m \succeq_{\mathrm{SD}} \nu m$ and hence
\[
\mu(m m') = (\mu m)\, m' \succeq_{\mathrm{SD}} (\nu m)\, m' = \nu(m m').
\]
By (i), $m m'$ is monotone.

C.4 Proof of Lemma 4

Consider a Markov transition $k \colon X \times \mathcal{Y} \to [0,1]$.
Define $\bar{k} \colon X \times \bar{\mathcal{Y}} \to [0,1]$ by setting $\bar{k}_x$ equal to the extension of $k_x$ to $\bar{\mathcal{Y}}$, for each $x$ in $X$. For each $B$ in $\mathcal{Y}$, we have $\bar{k}(x, B) = k(x, B)$, so the map $x \mapsto \bar{k}(x, B)$ is measurable and hence universally measurable. We need to check universal measurability for the remaining sets $B$ in $\bar{\mathcal{Y}}$. Let $\mu$ be an arbitrary probability measure on $(X, \mathcal{X})$. It suffices to show that $\bar{k}(\cdot, B)$ is $\mathcal{X}^{\mu}$-measurable for all $B$ in $\mathcal{Y}^{(\mu k)}$. If $B$ is in $\mathcal{Y}^{(\mu k)}$, then we can sandwich $B$ between sets $B_0 \subset B \subset B_1$ in $\mathcal{Y}$ satisfying
\[
0 = (\mu k)(B_1 \setminus B_0) = \mu(k(\cdot, B_1)) - \mu(k(\cdot, B_0)).
\]
So the function $\bar{k}(\cdot, B)$ is sandwiched between the $\mathcal{X}$-measurable functions $k(\cdot, B_0)$ and $k(\cdot, B_1)$, which agree $\mu$-almost surely. Hence $\bar{k}(\cdot, B)$ is $\mathcal{X}^{\mu}$-measurable.

C.5 Proof of Lemma 5
For the first inclusion, it suffices to show that for every probability measure $\mu$ on $(X \times Y, \mathcal{X} \otimes \mathcal{Y})$, we have $\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}} \subset (\mathcal{X} \otimes \mathcal{Y})^{\mu}$. Fix such a probability measure $\mu$. Define measures $\mu_1$ and $\mu_2$ on $\mathcal{X}$ and $\mathcal{Y}$ by
\[
\mu_1(A) = \mu(A \times Y) \quad \text{and} \quad \mu_2(B) = \mu(X \times B).
\]
If $A$ is in $\mathcal{X}^{\mu_1}$, then there exist $A_0$ and $A_1$ in $\mathcal{X}$ sandwiching $A$ such that
\[
0 = \mu_1(A_1 \setminus A_0) = \mu((A_1 \setminus A_0) \times Y) = \mu((A_1 \times Y) \setminus (A_0 \times Y)),
\]
so $A \times Y$ is in $(\mathcal{X} \otimes \mathcal{Y})^{\mu}$. Similarly, if $B$ is in $\mathcal{Y}^{\mu_2}$, then $X \times B$ is in $(\mathcal{X} \otimes \mathcal{Y})^{\mu}$. Taking intersections, we conclude that $A \times B$ is in $(\mathcal{X} \otimes \mathcal{Y})^{\mu}$. Therefore,
\[
\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}} \subset \mathcal{X}^{\mu_1} \otimes \mathcal{Y}^{\mu_2} \subset (\mathcal{X} \otimes \mathcal{Y})^{\mu}.
\]
Now we turn to the last equality. For each probability measure $\mu$ on $\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}}$, let $\mu_0$ be the restriction of $\mu$ to $\mathcal{X} \otimes \mathcal{Y}$. Then
\[
\overline{\mathcal{X} \otimes \mathcal{Y}} \subset (\mathcal{X} \otimes \mathcal{Y})^{\mu_0} \subset (\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}})^{\mu}.
\]
Taking the intersection over all such $\mu$ gives $\overline{\mathcal{X} \otimes \mathcal{Y}} \subset \overline{\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}}}$.

Now we prove the reverse inclusion. Each probability measure $\nu$ on $\mathcal{X} \otimes \mathcal{Y}$ has a complete extension $\bar{\nu}$ to $(\mathcal{X} \otimes \mathcal{Y})^{\nu}$. Let $\nu_0$ be the restriction of $\bar{\nu}$ to $\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}}$. Then
\[
(\mathcal{X} \otimes \mathcal{Y})^{\nu} = (\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}})^{\nu_0} \supset \overline{\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}}}.
\]
Taking the intersection over all such $\nu$ gives $\overline{\mathcal{X} \otimes \mathcal{Y}} \supset \overline{\bar{\mathcal{X}} \otimes \bar{\mathcal{Y}}}$.

References
Aliprantis, C. D. and K. C. Border (2006): Infinite Dimensional Analysis: A Hitchhiker's Guide, Springer, 3rd ed.

Auletta, V., P. Penna, G. Persiano, and C. Ventre (2011): "Alternatives to Truthfulness are Hard to Recognize," Autonomous Agents and Multi-Agent Systems, 22, 200–216.

Balbuzanov, I. (2019): "Lies and Consequences," International Journal of Game Theory, 1–38.

Ben-Porath, E., E. Dekel, and B. L. Lipman (2014): "Optimal Allocation with Costly Verification," American Economic Review, 104, 3779–3813.

——— (2017): "Disclosure and Choice," Review of Economic Studies, 85, 1471–1501.

——— (2019): "Mechanisms with Evidence: Commitment and Robustness," Econometrica, 87, 529–566.

Blackwell, D. (1953): "Equivalent Comparisons of Experiments," Annals of Mathematical Statistics, 24, 265–272.

Border, K. C. and J. Sobel (1987): "Samurai Accountant: A Theory of Auditing and Plunder," Review of Economic Studies, 54, 525–540.

Bull, J. and J. Watson (2004): "Evidence Disclosure and Verifiability," Journal of Economic Theory, 118, 1–31.

——— (2007): "Hard Evidence and Mechanism Design," Games and Economic Behavior, 58, 75–93.

Caragiannis, I., E. Elkind, M. Szegedy, and L. Yu (2012): "Mechanism Design: From Partial to Probabilistic Verification," in Proceedings of the 13th ACM Conference on Electronic Commerce, New York, NY, USA: ACM, EC '12, 266–283.

Cohn, D. L. (2013): Measure Theory, Birkhäuser Advanced Texts Basler Lehrbücher, Birkhäuser Basel, 2nd ed.

Crocker, K. J. and J. Morgan (1998): "Is Honesty the Best Policy? Curtailing Insurance Fraud through Optimal Incentive Contracts," Journal of Political Economy, 106, 355–375.

Deb, R. and C. Stewart (2018): "Optimal Adaptive Testing: Informativeness and Incentives," Theoretical Economics, 13, 1233–1274.

DeMarzo, P. M., I. Kremer, and A. Skrzypacz (2019): "Test Design and Minimum Standards," American Economic Review, 109, 2173–2207.

Deneckere, R. and S. Severinov (2008): "Mechanism Design with Partial State Verifiability," Games and Economic Behavior, 64, 487–513.

——— (2017): "Screening, Signalling and Costly Misrepresentation," Working paper.

Dziuda, W. and C. Salas (2018): "Communication with Detectable Deceit," Available at SSRN 3234695.

Erlanson, A. and A. Kleiner (2015): "Costly Verification in Collective Decisions," Working paper.

Ferraioli, D. and C. Ventre (2018): "Probabilistic Verification for Obviously Strategyproof Mechanisms," in Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems, AAMAS '18, 1930–1932.

Fotakis, D. and E. Zampetakis (2015): "Truthfulness Flooded Domains and the Power of Verification for Mechanism Design," ACM Transactions on Economics and Computation, 3, 20:1–29.

Green, J. R. and J.-J. Laffont (1986): "Partially Verifiable Information and Mechanism Design," Review of Economic Studies, 53, 447–456.

Grossman, S. J. (1981): "The Informational Role of Warranties and Private Disclosure about Product Quality," Journal of Law and Economics, 24, 461–483.

Halac, M. and P. Yared (2017): "Commitment vs. Flexibility with Costly Verification," Working paper.

Hart, S., I. Kremer, and M. Perry (2017): "Evidence Games: Truth and Commitment," American Economic Review, 107, 690–713.

Kallenberg, O. (2017): Random Measures, Theory and Applications, vol. 77 of Probability Theory and Stochastic Modelling, Springer.

Kartik, N. (2009): "Strategic Communication with Lying Costs," Review of Economic Studies, 76, 1359–1395.

Kartik, N., M. Ottaviani, and F. Squintani (2007): "Credulity, Lies, and Costly Talk," Journal of Economic Theory, 134, 93–116.

Kephart, A. and V. Conitzer (2016): "The Revelation Principle for Mechanism Design with Reporting Costs," in Proceedings of the 2016 ACM Conference on Economics and Computation, 85–102.

Koessler, F. and E. Perez-Richet (2017): "Evidence Reading Mechanisms," Working paper.

Lacker, J. M. and J. A. Weinberg (1989): "Optimal Contracts under Costly State Falsification," Journal of Political Economy, 97, 1345–1363.

Li, Y. (2017): "Mechanism Design with Costly Verification and Limited Punishments," Working paper.

Lipman, B. L. and D. J. Seppi (1995): "Robust Inference in Communication Games with Partial Provability," Journal of Economic Theory, 66, 370–405.

Maggi, G. and A. Rodríguez-Clare (1995): "Costly Distortion of Information in Agency Problems," RAND Journal of Economics, 26, 675–689.

Mertens, J.-F., S. Sorin, and S. Zamir (2015): Repeated Games, vol. 55 of Econometric Society Monographs, Cambridge University Press.

Milgrom, P. and I. Segal (2002): "Envelope Theorems for Arbitrary Choice Sets," Econometrica, 70, 583–601.

Milgrom, P. R. (1981): "Good News and Bad News: Representation Theorems and Applications," Bell Journal of Economics, 380–391.

Mussa, M. and S. Rosen (1978): "Monopoly and Product Quality," Journal of Economic Theory, 18, 301–317.

Myerson, R. B. (1981): "Optimal Auction Design," Mathematics of Operations Research, 6, 58–73.

Mylovanov, T. and A. Zapechelnyuk (2017): "Optimal Allocation with Ex Post Verification and Limited Penalties," American Economic Review, 107, 2666–2694.

Nisan, N. and A. Ronen (2001): "Algorithmic Mechanism Design," Games and Economic Behavior, 35, 166–196.

Riley, J. and R. Zeckhauser (1983): "Optimal Selling Strategies: When to Haggle, When to Hold Firm," Quarterly Journal of Economics, 98, 267–289.

Rochet, J.-C. (1987): "A Necessary and Sufficient Condition for Rationalizability in a Quasi-Linear Context," Journal of Mathematical Economics, 16, 191–200.

Singh, N. and D. Wittman (2001): "Implementation with Partial Verification," Review of Economic Design, 6, 63–84.

Strausz, R. (2016): "Mechanism Design with Partially Verifiable Information," Cowles Foundation Discussion Paper No. 2040.

Townsend, R. M. (1979): "Optimal Contracts and Competitive Markets with Costly State Verification," Journal of Economic Theory, 21, 265–293.

Vohra, R. V. (2011): Mechanism Design: A Linear Programming Approach, Econometric Society Monographs, Cambridge University Press.

Yu, L. (2011): "Mechanism Design with Partial Verification and Revelation Principle,"