Data and Incentives

Abstract

Many firms, such as banks and insurers, condition their level of service on a consumer's perceived "quality," for instance their creditworthiness. Increasingly, firms have access to consumer segmentations derived from auxiliary data on behavior, and can link outcomes across individuals in a segment for prediction. How does this practice affect consumer incentives to exert (socially-valuable) effort, e.g. to repay loans? We show that the impact of an identified linkage on behavior and welfare depends crucially on the structure of the linkage---namely, whether the linkage reflects quality (via correlations in types) or a shared circumstance (via common shocks to observed outcomes).


Annie Liang †, Erik Madsen ‡
April 26, 2020


JEL classification: D62, D83, D40

Acknowledgments: We are grateful to Eduardo Azevedo, Dirk Bergemann, Alessandro Bonatti, Sylvain Chassang, Yash Deshpande, Ben Golub, Yizhou Jin, Navin Kartik, Rishabh Kirpalani, Alessandro Lizzeri, Steven Matthews, Xiaosheng Mu, Larry Samuelson, Juuso Toikka, and Weijie Zhong for useful conversations, and to National Science Foundation Grant SES-1851629 for financial support. We thank Changhwa Lee for valuable research assistance on this project.

† Department of Economics, University of Pennsylvania
‡ Department of Economics, New York University

Introduction

Many important economic transactions involve provision of a service whose profitability depends on an unobserved characteristic, or "quality," of the recipient. For instance, the profitability of a car insurance policy depends on the insuree's driving ability, while the profitability of issuing a credit card or personal loan depends on the borrower's creditworthiness. While a recipient's quality is not directly observable, service providers can use data to help forecast it, with these forecasts used to set the terms of service, e.g. an insurance premium or interest rate.

This paper is about the interaction between two kinds of data which inform these forecasts: traditional past outcome data (e.g. insurance claims rates or credit card repayment) and novel consumer segment data identifying "similar" individuals based on aggregated online activity and other digitally tracked behaviors. These segments can identify individuals associated with a diverse range of preferences, lifestyle choices, and recent life events.

To understand the interaction between these two kinds of data, consider two prototypical consumers, Alice and Bob. In the absence of any identified linkages between these individuals, each recipient's past outcomes are useful only for predicting his or her own quality: e.g. if Alice was in an automobile accident last claims cycle, that event is informative about her driving ability, but not about Bob's. If, on the other hand, the provider learns that Alice and Bob both enjoy extreme sports, then Alice's accident from the last claims cycle may be informative about Bob's future accident risk as well.

Our goal is to build a general model of identified "data linkages" across individuals, and to characterize how these linkages reshape incentives for productive effort: for instance, driving more carefully in an auto insurance context, or exercising financial prudence in a consumer credit market. We then use the model to shed light on the welfare implications of these emerging practices.
Footnote: See Appendix A for a list of actual consumer segments compiled by data brokers.

Footnote: Our assumption that effort is socially valuable distinguishes our setting from Frankel and Kartik (2019) and Ball (2019), who model effort as a "gaming" device that degrades signal quality but yields no social value.

Footnote: A number of instances of organizations bolstering prediction with novel datasets have already come to light. In 2008, the subprime lender CompuCredit was revealed to have reduced credit lines based on visits to various "red flag" establishments, including marriage counselors and nightclubs (see ). Several health insurance companies have reportedly purchased datasets on purchasing and consumption habits from data brokers like LexisNexis to help predict anticipated healthcare costs (see ). The car insurance company Allstate recently filed a patent for adjusting insurance rates based on routes and historical accident patterns (see ). And perhaps most strikingly, China's "social credit" system determines whether an individual is a good citizen based on detailed attributes ranging from the size of their social network to how often they play video games (see https://foreignpolicy.com/2018/04/03/life-inside-chinas-social-credit-laboratory).

The main takeaway of our analysis is that the structure of an identified linkage across individuals matters for predicting behavior and welfare. In particular, linkages identifying correlations across persistent traits (i.e. intrinsic quality) have very different consequences than do ones identifying correlations across transient shocks (i.e. shared circumstances). Thus, regulations which treat all "big data" homogeneously are too crude to achieve socially optimal outcomes, and should be tailored based on the role that data plays in forecasting.

Our framework is a multiple-agent version of the classic career concerns model (Holmström, 1999). Each agent has an unknown type (e.g. creditworthiness), which a principal (a bank) would like to predict. Agents choose whether to opt-in to interaction with the principal (sign up for a credit card), and any agent opting-in receives a transfer from the principal. The principal observes an outcome (the agent's past repayment behavior) from each agent who opts in, which is informative about the agent's underlying type, but also perhaps about the types of others in his segment. The agent can manipulate his own outcome via costly effort (exercising financial prudence), with the goal of improving his perceived type and accruing a reputational payoff. If agents were unrelated, each agent's type would be forecast on the basis of their own past outcome alone. We consider an environment in which a data linkage identifies correlations between the outcomes of agents in the population, so that each agent's perceived quality is determined by the outcomes of other participating agents, in addition to the agent's own outcome.

Footnote: The desire of an agent to be perceived as a high type contrasts with models of price discrimination, in which agents prefer to be perceived as a low type and receive a lower price.

In general the correlation structure between outcomes and types can be quite complex. Moreover, consumers may have uncertainty about which segments are used by the provider, or precisely what correlations are implied by those segments. But we show that this potentially complex structure boils down to two forces captured by two distinct kinds of linkages. First, some linkages identify agents with correlated types; we call these quality linkages. This linkage may be a lifestyle pattern (e.g. "Frequent Flier," "Fitness Enthusiast") or personal characteristic (e.g. "Working-class Mom," "Spanish Speaker"). Second, some linkages identify agents who have encountered similar shocks to their outcomes. We refer to these as circumstance linkages. For example, drivers who commute on the same roads to work are exposed to similar variations in local road conditions, e.g. construction or bad weather.

Quality and circumstance linkages turn out to have opposing effects on incentives for effort. Under quality linkages, the outcomes of other agents are substitutes for inferring an agent's type, while under circumstance linkages, they are complements. Consider first the impact of a quality linkage. In this case, observation of outcomes from other agents in the segment helps the principal to learn an average quality for the segment, reducing the marginal informativeness of a given agent's outcome about his type. Exerting effort to distort one's outcome thus has a smaller influence on the principal's perception about one's type. In contrast, under a circumstance linkage, observation of outcomes for other agents in the segment is informative about the size of the average shock to outcomes.
Each agent's outcome therefore becomes more informative about his type (once debiased by the estimated common shock), increasing the value of exerting effort to improve one's outcome.

We establish these effort comparative statics in a model with very general type and noise distributions, imposing only a standard log-concavity condition ensuring that posterior estimates of latent variables are monotone in signal realizations. Reasoning about incentives for effort distortion in such an environment is challenging, because outside special cases the principal's posterior expectation is a complex nonlinear function of signal realizations. A technical contribution of our paper is the development of techniques for establishing comparative statics of marginal incentives for effort as the number of correlated signals grows.

Footnote: Our comparative statics can be viewed as adapting the results of Dewatripont et al. (1999) to an additive signal structure and generalizing them to many signals, although we derive our results independently using different techniques.

The effort comparative statics outlined above have direct implications for consumer payoffs from participation: In our model, as in Holmström (1999), the principal correctly infers the equilibrium level of effort and can de-bias observed outcomes. Since effort is costly, higher equilibrium effort necessarily means lower payoffs for agents. (This need not imply lower social welfare, as we discuss below.) Thus under a quality linkage, agent participation decisions are strategic complements: Participation by one agent improves the payoffs to participation for other agents by decreasing equilibrium effort. We show existence of a unique equilibrium in which all agents choose to opt-in to interaction with the principal. In contrast, under a circumstance linkage, participation creates a negative externality on other agents by increasing equilibrium effort. For small populations, all agents opt-in in equilibrium, while for large populations, agents must mix over entry in the unique symmetric equilibrium.

Footnote: Frankel and Kartik (2019) introduces uncertainty in the ability of agents to manipulate outcomes, so that the principal cannot perfectly de-bias the impact of effort. In such settings a reduction in incentives for effort improves the precision of forecasts, creating a tradeoff for the principal when effort and precise forecasts are both valuable.

We next use these equilibrium characterizations to analyze the impact of data sharing on consumer and social welfare. As a benchmark, we consider a "no linkages" environment in which the principal is permitted to use only an agent's own past data to predict his type. (This may correspond either to absence of consumer segment data, or to an environment in which use of consumer segment data has been prohibited by regulation.) We compare equilibrium outcomes against this benchmark, under different assumptions about how transfers are determined.

We first suppose transfers are held fixed when a linkage is introduced, and that the transfer is generous enough that all agents would participate in the absence of a linkage. This assumption reflects regulated environments in which service providers can't discriminate toward or against consumers solely on the basis of a data linkage. When agents share a quality linkage, aggregation of data across agents leads to a reduction in both consumer and social welfare. In contrast, when agents share a circumstance linkage, consumer welfare declines while social welfare increases for small populations. These results suggest that the type of data being used to link agents is a crucial determinant of the welfare effect of data linkages.

We next suppose that the principal is a monopolist who freely sets the transfer to maximize profits, potentially adjusting it in response to a linkage. As agents possess no private information about their type, they are always held to their outside option, and the principal extracts all surplus whether or not a linkage is present. Thus in this environment, the principal chooses a transfer which maximizes social welfare. We find that welfare rises under a circumstance linkage and falls under a quality linkage for any population size.
Additionally, we show that while full participation is ensured under a circumstance linkage, the principal may optimally induce only partial entry under a quality linkage.

Finally, we consider an environment in which multiple principals compete to serve agents, and use the results to comment on a current policy debate regarding whether firms should have proprietary ownership of their data, or if this data should be shared across an industry (as for example recently recommended by the European Commission). To model competition between principals, we extend our model by having several principals each set a transfer simultaneously, after which agents choose which firm (if any) to participate with. We consider two different data regimes: under proprietary data, an agent's reputational payoff is determined exclusively based on the outcomes of other agents participating at their chosen firm, while under data sharing, the outcomes of all agents are shared across firms for use in forecasting types. We show that regardless of whether agents are linked by quality or circumstance, data sharing leads to an increase in consumer welfare. Market forces play a key role in this result: in particular, if firms were not able to freely choose transfers, then the welfare implications of data sharing would depend on the nature of the linkage.

Footnote: As reported in European Commission (2020): "[T]he Commission will explore the need for legislative action on issues that affect relations between actors in the data-agile economy to provide incentives for horizontal data sharing across sectors." Such action might "support business-to-business data sharing, in particular addressing issues related to usage rights for co-generated data...typically laid down in private contracts. The Commission will also seek to identify and address any undue existing hurdles hindering data sharing and to clarify rules for the responsible use of data (such as legal liability). The general principle shall be to facilitate voluntary data sharing."

Our paper contributes to an emerging literature regarding the welfare consequences of data markets and algorithmic scoring. This literature has tackled several important social questions, such as whether predictive algorithms discriminate (Chouldechova, 2017; Kleinberg et al., 2017); how to protect consumers from loss of privacy (Acquisti et al., 2015; Dwork and Roth, 2014; Fainmesser et al., 2019; Eilat et al., 2019); how to price data (Bergemann et al., 2018; Agarwal et al., 2019); whether seller or advertiser access to big data harms consumers (Jullien et al., 2018; Gomes and Pavan, 2019); and how to aggregate big data into market segments or consumer scores (Ichihashi, 2019; Bonatti and Cisternas, 2019; Yang, 2019; Hidir and Vellodi, 2019; Elliott and Galeotti, 2019). There is additionally a growing literature about strategic interactions with machine learning algorithms: see Eliaz and Spiegler (2018) on the incentives to truthfully report characteristics to a machine learning algorithm, and Olea et al. (2018) on how economic markets select certain models for making predictions over others.

In particular, Acemoglu et al. (2019) and Bergemann et al. (2019) also consider externalities created by social data. Different from us, these papers study data sharing in environments where consumers may sell their data. In Bergemann et al. (2019), one agent's information improves a firm's ability to price-discriminate against other agents, which can decrease consumer surplus. In Acemoglu et al. (2019), agents value privacy, and thus information collected about one agent imposes a direct negative externality on other agents when types are correlated. The externality of interest in the present paper is how information provided by other agents reshapes incentives to exert costly effort.
As we show, this externality can be positive or negative: in particular, when agents are connected by a quality linkage, their equilibrium payoffs turn out to be increasing in other agents' participation.

At a theoretical level, our paper builds on the career concerns model of Holmström (1999), the classic framework for analyzing the role of reputation-building in motivating effort. The interaction of this incentive effect with informational externalities from other agents' behavior is the main focus of our analysis. The literature following Holmström (1999) has largely focused on signal extraction about a single agent's type in dynamic settings, while we are interested in the externalities of social data in a multiple-agent setting. Our paper is most closely related to Dewatripont et al. (1999), which studies how auxiliary data impacts agents' incentives for effort. That paper considers the externality of a single exogenous auxiliary signal, while we endogenize the auxiliary data as information from other players, who strategically decide whether or not to provide data. Thus, the number of auxiliary signals is determined in equilibrium, and may also be uncertain; this requires comparison of equilibrium actions across various information structures.

Our circumstance linkage model, in which the principal uses outcomes from some agents to help de-bias the outcomes of other agents, is reminiscent of the team production and tournament literatures (Holmström, 1982; Lazear and Rosen, 1981; Green and Stokey, 1983; Shleifer, 1985). In these papers, the observable output of each agent depends both on the agent's effort as well as on a common shock experienced by all agents. In such environments the relative output of an agent is a more precise signal of effort than the absolute output. Thus the principal may be able to extract more effort through rewarding good relative outcomes rather than good absolute outcomes.
Although we do not consider a contracting environment here, similar forces in our model permit the principal to extract more effort from agents when their outcomes are related by correlated shocks.

Finally, our paper contributes to work on strategic manipulation of information. Recent papers in this category include: Frankel and Kartik (2020) and Ball (2019), which characterize the degree to which a principal with commitment power should link his decision to a manipulated signal about the agent's type; Hu et al. (2019), which shows that heterogeneous manipulation costs across different social groups can lead to inequities in outcomes; and Georgiadis and Powell (2019), which studies optimal information acquisition for a designer setting a wage contract. Our paper contributes to this literature by exploring the role of correlations across data in an individual's incentives to manipulate an observed outcome.

Footnote: A small set of papers, e.g. Auriol et al. (2002), study career concerns in a multiple-agent setting. These papers typically look at effort externalities instead of informational externalities. One exception is Meyer and Vickers (1997), which considers the impact of adding an additional agent with correlated outcomes in the context of a ratchet effect model with incentive contracts.

A single principal interacts with $N < \infty$ agents, who have been identified as belonging to a common population segment. Each agent $i$ has a type $\theta_i \in \mathbb{R}$, which is unknown to all parties (including agent $i$) and is commonly believed to be drawn from the distribution $F_\theta$ with mean $\mu > 0$ and finite variance $\sigma_\theta^2 > 0$. Types are drawn symmetrically but may not be independent across agents.

As in the classic career concerns model of Holmström (1999), each agent's payoffs are increasing in the principal's perception of his type, and the agent can exert costly effort to influence an outcome realization that the principal observes (Section 2.2). Different from Holmström (1999), we introduce a preliminary stage at which the agent first chooses whether to opt-in or out of interaction with the principal (Section 2.1), and, most importantly, we allow the principal to aggregate the outcomes of multiple agents for prediction (Section 2.3).

The model unfolds over three periods, with opt-in/out decisions made in period $t = 0$, effort exerted in period $t = 1$, and forecasts of each agent's type based on outcomes updated in period $t = 2$.

At period $t = 0$, each agent $i$ first chooses whether to opt-in or opt-out of an interaction with the principal, where this decision is observed by the principal, but not by other agents. Opting out yields a payoff that we normalize to zero. The set of agents who opt-in is denoted $I_{\text{opt-in}} \subseteq \{1, \ldots, N\}$.

In period $t = 1$, each agent $i \in I_{\text{opt-in}}$ privately chooses a costly effort level $a_i \in \mathbb{R}_+$ to influence an observable outcome.
The outcome, $S_i$, is related to the agent's type and effort level via
$$S_i = \theta_i + a_i + \varepsilon_i,$$
where $\varepsilon_i \sim F_\varepsilon$ is a noise shock with mean $E[\varepsilon_i] = 0$ and finite variance $E[\varepsilon_i^2] = \sigma_\varepsilon^2 > 0$. The agent's payoff in this period is $R - C(a_i)$, where $R \in \mathbb{R}$ is a monetary opt-in reward from the principal (possibly negative), and $C(a_i)$ is the cost to choosing effort $a_i$. We suppose that the cost function is twice continuously differentiable and satisfies $C(0) = C'(0) = 0$, $C''(a_i) > 0$ for all $a_i$, and a lower bound on $\lim_{a_i \to \infty} C'(a_i)$.

Footnote: None of our results would change if we gave the principal access to additional privately observed covariates for use in forecasting. Specifically, we could allow $\theta_i$ to be decomposable as $\theta_i = \tilde\theta_i + \Delta\theta_i$, where $\tilde\theta_i$ is commonly unobserved with mean 0 while $\Delta\theta_i$ is an idiosyncratic type shifter, independent of $\tilde\theta_i$ with mean $\mu$, which is privately observed by the principal.

2.3 Period 2: Principal's Forecast of Agent's Type

In a second (and final) period, each agent $i \in I_{\text{opt-in}}$ receives the principal's forecast of the agent's type $\theta_i$. The principal's forecast is based on the observed outcomes of all agents who have opted-in; thus, agent $i$'s payoff in the second period is
$$E[\theta_i \mid S_j,\ j \in I_{\text{opt-in}}]. \tag{1}$$
Note that since each agent's effort choice is private, the forecast is based on a conjectured effort choice, which in equilibrium is simply the equilibrium effort level. This payoff is a stand-in for the reputational consequences of the agent's period-1 outcome. Note that the agent's payoff is increasing in the principal's forecast of their type, reflecting the role of $\theta_i$ as a quality variable determining average outcomes.

The quantity in (1) depends on the (random) realizations of output; thus, the agent optimizes over his expectation of (1). We will discuss this iterated expectation of $\theta_i$ in detail in Section 3.1. Finally, the agent's total payoff is the sum of his expected payoffs across the two periods. This timeline is summarized in Figure 1.
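A short worked derivation may help fix ideas about the effort choice implied by these payoffs. It is a sketch under added assumptions rather than a statement taken from the text: $a^*$ denotes a conjectured symmetric equilibrium effort, the optimum is assumed interior, and the weights $w_{ij}$ and the Gaussian special case are introduced here purely for illustration.

```latex
% Agent i's problem after opting in, given the principal's conjecture a^*:
\max_{a_i \ge 0} \; R - C(a_i)
  + E\Big[\, E\big[\theta_i \mid S_j,\ j \in I_{\text{opt-in}};\ a^*\big] \,\Big|\, a_i \Big],
\qquad S_i = \theta_i + a_i + \varepsilon_i .

% Raising a_i shifts S_i one-for-one, so an interior optimum at a_i = a^* satisfies
C'(a^*) = E\left[ \frac{\partial}{\partial S_i}
  E\big[\theta_i \mid S_j,\ j \in I_{\text{opt-in}};\ a^*\big] \right].

% Illustrative Gaussian case (assumed): the forecast is linear in outcomes,
E[\theta_i \mid S] = \mu + \sum_{j \in I_{\text{opt-in}}} w_{ij}\,(S_j - a^* - \mu),
\qquad \text{so} \qquad C'(a^*) = w_{ii}.
```

In this linear case the marginal return to effort is simply the weight the forecast places on the agent's own outcome, so any change in that weight caused by observing additional linked outcomes carries over directly to equilibrium effort.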
[Figure 1: Timeline. At $t = 0$ each agent opts in or opts out; an agent who opts out receives zero in periods 1 and 2. An agent who opts in exerts effort $a_i$ at $t = 1$, generates outcome $S_i = \theta_i + a_i + \varepsilon_i$, and receives $R - C(a_i)$; at $t = 2$ he receives $E[\theta_i \mid S_j,\ j \in I_{\text{opt-in}}]$.]

Footnote: Formally, one could view this payoff as representing the agent's payoff in a second-period market where multiple firms compete to serve the agent.

So far we have not described how agent outcomes are correlated, a specification which is crucial for computing the posterior expectation in (1). Our main analysis contrasts two kinds of relationships across agents, one in which agents within a segment have related qualities, and another in which they share a related circumstance:

Quality Linkage. Suppose first that agents within the segment have correlated qualities. We model this by decomposing $\theta_i$ as
$$\theta_i = \bar\theta + \theta_i^\perp,$$
where $\bar\theta \sim F_{\bar\theta}$ is a common component of the type and $\theta_i^\perp \sim F_{\theta^\perp}$ is a personal or idiosyncratic component, with each $\theta_i^\perp$ independent of $\bar\theta$ and all $\theta_j^\perp$ for $j \neq i$. Without loss, we assume $E[\bar\theta] = \mu$ while $E[\theta_i^\perp] = 0$. In contrast, the shocks $\varepsilon_i$ are mutually independent.

Circumstance Linkage. Another possibility is that agents within the segment don't have qualities which are intrinsically related, but instead have experienced a shared shock to outcomes. Formally, we suppose that the noise shock can be decomposed as
$$\varepsilon_i = \bar\varepsilon + \varepsilon_i^\perp,$$
where $\bar\varepsilon \sim F_{\bar\varepsilon}$ is shared across agents and $\varepsilon_i^\perp \sim F_{\varepsilon^\perp}$ is idiosyncratic, with each $\varepsilon_i^\perp$ independent of $\bar\varepsilon$ and all $\varepsilon_j^\perp$ for $j \neq i$. In contrast, agents' types $\theta_i$ are mutually independent.

The distinction between quality and circumstance linkages can be interpreted in at least two ways. One interpretation is that $\theta_i$ is the portion of the outcome that is valuable to the principal, while $\varepsilon_i$ is a confounder that has an effect on the observed outcome, but is not payoff-relevant. Another interpretation is that the type $\theta_i$ is a permanent component of the agent's performance while $\varepsilon_i$ is a shock that affects performance only temporarily. The examples of circumstance linkages in Section 2.4 follow the latter interpretation, with $\varepsilon_i$ reflecting a transient characteristic that affected outcomes in a previous observation cycle, but is no longer present in future interactions. For example, if an agent was pregnant during the determination of $S_i$, but has since given birth, then the principal should optimally de-noise the "pregnancy effect" from the prior outcome when predicting future behaviors. Throughout the paper, we consider these two models of linkage separately in order to clarify the difference between them.

Note that while the correlation structure across agent outcomes differs in the two models, we will hold the marginal distributions of each agent's type and noise shock fixed across models (see Assumption 2).

Commuters and auto-insurers.

The principal is an auto-insurer and the agents are commuters. Agent $i$'s type $\theta_i$ is a function of his accident risk while driving, with higher-type commuters experiencing a lower risk of accidents while driving to work. Each commuter decides whether to own a car versus commuting via rideshares or public transit. Conditional on owning a car, the commuter then chooses how much effort to exert to drive safely. The insurance company observes his claims rate during an initial enrollment period, and uses that outcome to predict his future claims rates.

Examples of quality linkage segments include drivers who share similar commutes to work, e.g. routes primarily through surface streets or via highways, where these routes are discoverable from geolocational data. If we suspect that commutes are stable and that the route taken contributes to the risk of accident, then claims rates for other drivers in the segment are directly informative about the future accident risk for a given driver. Examples of circumstance linkage segments include drivers who passed through routes that were previously affected by unusual road or weather conditions. Crucially, these conditions are not expected to persist into the subsequent period. The principal can use claims rates from drivers in this segment to learn the size and direction of the "road shock" or "weather shock," allowing them to de-bias observed accident rates.
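The de-biasing logic in this example can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: types and idiosyncratic noise are Gaussian, effort has already been netted out of outcomes, the common "road shock" is fixed at a hypothetical value of 1.5, and the function name `simulate_debiasing` and all parameter values are made up for illustration.

```python
import numpy as np


def simulate_debiasing(n_agents=2000, common_shock=1.5, sd_type=1.0,
                       sd_idio_noise=0.5, mu=0.0, seed=0):
    """Simulate one circumstance-linked segment and de-bias its outcomes.

    Returns the estimated common shock and the root-mean-square gap between
    each agent's outcome and true type, before and after de-biasing.
    """
    rng = np.random.default_rng(seed)
    theta = mu + sd_type * rng.standard_normal(n_agents)        # independent types
    idio_noise = sd_idio_noise * rng.standard_normal(n_agents)  # idiosyncratic shocks
    outcomes = theta + common_shock + idio_noise                 # effort netted out

    # The segment average reveals the shared shock, since types average to mu.
    shock_estimate = outcomes.mean() - mu
    debiased = outcomes - shock_estimate

    rmse_raw = float(np.sqrt(np.mean((outcomes - theta) ** 2)))
    rmse_debiased = float(np.sqrt(np.mean((debiased - theta) ** 2)))
    return float(shock_estimate), rmse_raw, rmse_debiased


estimate, raw_error, debiased_error = simulate_debiasing()
print(f"estimated common shock: {estimate:.3f}")
print(f"RMSE of raw outcomes vs. types:       {raw_error:.3f}")
print(f"RMSE of de-biased outcomes vs. types: {debiased_error:.3f}")
```

With many drivers in the segment, the average outcome pins down the shared shock closely, so what remains in each de-biased outcome is essentially the driver's own type plus idiosyncratic noise.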

Consumers and credit-card issuers.

The principal is a bank issuing a credit card and agents are consumers. Agent $i$'s type $\theta_i$ is his creditworthiness, with more creditworthy consumers being better able to pay back short-term loans. Each agent decides whether to sign up for a credit card versus making payments by debit card or cash. If an agent signs up for a credit card, he decides how much effort to exert in order to ensure repayment (e.g. by increasing income or avoiding activities that risk financial loss), and the card issuer observes his repayment behavior during an initial enrollment period.

Quality linkages relevant to creditworthiness include lifestyles ("Frequent Flier") and financial sophistication ("Subscriber to Financial Newsletter"), categories which can be revealed by social media usage and online subscription databases. Circumstance linkages include whether a consumer's child was previously attending college (but has since graduated) and whether a family member was previously experiencing a serious illness (but has since improved), as inferred for example from purchasing and travel histories.

Footnote: If the conditions are persistent, we would consider the consumers instead to be related by a quality linkage.

We study Nash equilibria in which agents choose symmetric participation strategies and pure strategies in effort. Our focus on symmetric participation reflects the ex-ante symmetry of consumers in our model, and their anonymity with respect to one another in most data [...]

We additionally impose a refinement on out-of-equilibrium beliefs. Since agents choose participation and effort simultaneously in our model, Nash equilibrium puts no restrictions on the principal's inference about effort in the event that an agent unexpectedly enters. We require that if an agent unilaterally deviates to entry, the principal expects that the agent will exert the equilibrium effort choice from a single-agent game with exogenous entry. This refinement mimics sequential rationality in a modified model in which agents make entry and effort decisions sequentially rather than simultaneously. In what follows, we will use the term equilibrium without qualification to refer to symmetric equilibria in pure effort strategies satisfying this refinement.
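To make these equilibrium objects concrete, the following is a minimal numerical sketch under assumptions that are not part of the model as stated: all primitives are Gaussian, the cost is quadratic with $C(a) = k a^2 / 2$, and all $n$ agents opt in. Under these assumptions the forecast is linear in outcomes, so the first-order condition equates marginal cost with the weight on the agent's own outcome, and the expected forecast in equilibrium is $\mu$. The function names and the variance values (chosen so both linkage types share the same marginal variances) are hypothetical.

```python
import numpy as np


def own_weight(n, v_type_common, v_type_idio, v_noise_common, v_noise_idio):
    """Weight on agent 0's own outcome in the Gaussian forecast of theta_0."""
    common = v_type_common + v_noise_common  # covariance of two distinct outcomes
    idio = v_type_idio + v_noise_idio        # extra variance on the diagonal
    cov_outcomes = common * np.ones((n, n)) + idio * np.eye(n)
    # Covariance between theta_0 and each outcome S_j.
    cov_type_outcomes = np.full(n, v_type_common, dtype=float)
    cov_type_outcomes[0] += v_type_idio
    weights = np.linalg.solve(cov_outcomes, cov_type_outcomes)
    return float(weights[0])


# Same marginals in both models: Var(theta_i) = 1 and Var(eps_i) = 1.
QUALITY = dict(v_type_common=0.5, v_type_idio=0.5,
               v_noise_common=0.0, v_noise_idio=1.0)
CIRCUMSTANCE = dict(v_type_common=0.0, v_type_idio=1.0,
                    v_noise_common=0.5, v_noise_idio=0.5)


def equilibrium_effort(n, cost_scale, variances):
    """Solve C'(a) = k * a = own weight for the assumed quadratic cost."""
    return own_weight(n, **variances) / cost_scale


def full_participation_payoff(n, reward, mu, cost_scale, variances):
    """Expected payoff R + mu - C(a*) when all n agents opt in."""
    effort = equilibrium_effort(n, cost_scale, variances)
    return reward + mu - 0.5 * cost_scale * effort ** 2


for n in (1, 2, 5, 20):
    q = equilibrium_effort(n, 1.0, QUALITY)
    c = equilibrium_effort(n, 1.0, CIRCUMSTANCE)
    print(f"n={n:2d}  effort under quality linkage={q:.3f}  "
          f"under circumstance linkage={c:.3f}")
```

In this sketch effort falls with the number of participants under the quality linkage and rises under the circumstance linkage. A negative value of `full_participation_payoff` indicates that entry by every agent cannot be sustained at that population size, since opting out yields zero.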

We impose several regularity conditions on the distributions $F_{\bar\theta}$, $F_{\theta^\perp}$, $F_{\bar\varepsilon}$, and $F_{\varepsilon^\perp}$, which are maintained throughout the paper. Assumptions 1 through 4 are purely technical, and ensure that all distributions have full support and are smooth enough for appropriate derivatives of conditional expectations to exist. Assumptions 5 and 6 are substantive, and ensure monotonicity of inferences about latent variables in outcomes and sufficiency of the first-order approach for characterizing equilibrium effort.

Assumption 1 (Regularity of densities). The distribution functions $F_{\bar\theta}, F_{\theta^\perp}, F_{\bar\varepsilon}, F_{\varepsilon^\perp}$ admit strictly positive, $C^1$ density functions $f_{\bar\theta}, f_{\theta^\perp}, f_{\bar\varepsilon}, f_{\varepsilon^\perp}$ with bounded first derivatives on $\mathbb{R}$.

Assumption 2 (Invariance of marginal densities). In each model, the distribution functions $F_\theta$ and $F_\varepsilon$ have density functions $f_\theta$ and $f_\varepsilon$ satisfying $f_\theta = f_{\bar\theta} * f_{\theta^\perp}$ and $f_\varepsilon = f_{\bar\varepsilon} * f_{\varepsilon^\perp}$, where $*$ is the convolution operator.

In each model one half of Assumption 2 is redundant, as in the quality linkage model $\theta_i = \bar\theta + \theta_i^\perp$ while in the circumstance linkage model $\varepsilon_i = \bar\varepsilon + \varepsilon_i^\perp$. The remaining half of the assumption ensures that $\theta_i$ and $\varepsilon_i$ have the same marginal distributions across models. The following corollary reflects the fact that convolutions of variables satisfying the properties of Assumption 1 inherit those properties.

Corollary 1. $f_\theta$ and $f_\varepsilon$ are strictly positive, $C^1$, and have bounded first derivatives on $\mathbb{R}$.

Footnote: When agents mix over effort, then even under the assumptions imposed in Section 2.6 higher output is not guaranteed to lead to higher inferences about types. Depending on the equilibrium distribution of effort, the principal may instead attribute a positive output shock to high realized effort. See Rodina (2017) for further discussion.
The following assumption ensures that posterior expectations are smooth enough to com-pute ﬁrst and second derivatives of an agent’s value function, and to compute the marginalimpact of a change in one agent’s outcome on the forecast of another agent’s type. Let S = ( S , ..., S N ) be the vector of outcomes for all agents, with a = ( a , ..., a N ) the vector ofactions for all agents. Assumption 3 (Regularity of posterior expectations) . For each model, population size N ,agent i ∈ { , ..., N } , and outcome-action proﬁle ( S , a ) : • ∂∂S j E [ θ i | S ; a ] exists and is continuous in S for every j ∈ { , ..., N } , • ∂ ∂S i E [ θ i | S ; a ] exists. The following assumption is a slight strengthening of the requirement that the Fisherinformation of S i about its common component ( θ in the quality linkage model or ε in thecircumstance linkage model) be ﬁnite. Let f ε + θ ⊥ ≡ f θ ⊥ ∗ f ε and f θ + ε ⊥ ≡ f θ ∗ f ε ⊥ . Assumption 4 (Finite Fisher information) . For each f ∈ { f ε + θ ⊥ , f θ + ε ⊥ } , there exists a ∆ > and a dominating function J : R → R + such that (cid:18) f ( z − ∆) − f ( z ) f ( z ) (cid:19) ≤ J ( z ) for all z ∈ R and ∆ ∈ (0 , ∆) and (cid:90) J ( z ) f ( z ) dz < ∞ . Roughly, this assumption ensures that ﬁnite-diﬀerence approximations to the Fisher infor-mation are also ﬁnite and uniformly bounded as the approximation becomes more precise. The following assumption imposes enough structure on the distributions of the compo-nents of each agent’s outcome to ensure that higher outcome realizations imply monotonicallyhigher forecasts of the components of the outcome. A suﬃcient condition for Assumption 4 is that f ε + θ ⊥ and f θ + ε ⊥ don’t vanish at the tails “much faster”than their derivatives: speciﬁcally, for each f ∈ { f ε + θ ⊥ , f θ + ε ⊥ } there should exist a K > > ε ∈ R , ∆ ∈ [0 , ∆] (cid:12)(cid:12)(cid:12)(cid:12) f (cid:48) ( ε − ∆) f ( ε ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ K. 
This sufficient condition is satisfied, for example, by the t-distribution and the logistic distribution. It is not satisfied by the normal distribution, although we show in Appendix O.2.1 using other methods that the normal distribution does satisfy Assumption 4. We are not aware of any commonly-used distributions which violate Assumption 4.

Assumption 5 (Monotone forecasts). The density functions $f_{\bar\theta}$, $f_{\theta^\perp}$, $f_{\bar\varepsilon}$, and $f_{\varepsilon^\perp}$ are strictly log-concave.

One basic property of strictly log-concave functions is that the convolution of two strictly log-concave functions is also strictly log-concave. Thus an immediate corollary of Assumption 5 is the following:

Corollary 2. $f_\theta$ and $f_\varepsilon$ are strictly log-concave.

Assumption 5 implies monotonicity of forecasts for the following reason. In general, given three random variables $X, Y, Z$ such that $X = Y + Z$ and $Y$ and $Z$ are independent, strict log-concavity of the density function of $Z$ is both necessary and sufficient for the distribution of $X$ to satisfy a strict monotone likelihood-ratio property in $Y$ (Saumard and Wellner, 2014):
$$\frac{f_{X|Y}(x' \mid y')}{f_{X|Y}(x \mid y')} > \frac{f_{X|Y}(x' \mid y)}{f_{X|Y}(x \mid y)} \quad \text{for all } x' > x,\ y' > y.$$
This monotone likelihood-ratio property is the canonical sufficient condition ensuring monotonicity of the conditional expectation of $Y$ in the observed value of $X$ (Milgrom, 1981). Assumption 5 guarantees that the appropriate monotone likelihood-ratio properties are satisfied in our model; see Appendix B.1 for details.

Finally, we assume the cost function is “sufficiently convex” that effort choices satisfying a first-order condition are globally optimal. The assumption is a joint condition on the cost function and the distribution of the outcome, since the required amount of convexity depends on how sensitive the posterior expectation is to the realization of individual outcomes.

Assumption 6 (Sufficient convexity). There exists a $K \in \mathbb{R}$ such that $C''(x) > K$ for every $x \in \mathbb{R}_+$, and for every population size $N$ and agent $i \in \{1, \dots, N\}$, $\frac{\partial^2}{\partial S_i^2} E[\theta_i \mid \mathbf{S}; \mathbf{a}] \le K$ for every $(\mathbf{S}, \mathbf{a})$.

One important set of models satisfying these regularity conditions is Gaussian uncertainty.

Example (Gaussian). For each agent $i$,
$$\begin{pmatrix} \bar\theta \\ \theta_i^\perp \\ \bar\varepsilon \\ \varepsilon_i^\perp \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2_{\bar\theta} & 0 & 0 & 0 \\ 0 & \sigma^2_{\theta^\perp} & 0 & 0 \\ 0 & 0 & \sigma^2_{\bar\varepsilon} & 0 \\ 0 & 0 & 0 & \sigma^2_{\varepsilon^\perp} \end{pmatrix} \right).$$

We verify in Appendix O.2.1 that Assumptions 1 through 5 are all met in this case, and Assumption 6 is satisfied by any strictly convex cost function.

Footnote: A function $g > 0$ is strictly log-concave if $\log g$ is strictly concave.

Footnote: The Gaussian versions of our quality and circumstance linkage models represent special cases of the information environment considered in Meyer and Vickers (1997) and Bergemann et al. (2019), both of whom allow for correlation between both types and shocks. The Gaussian version of our quality linkage model also corresponds to a symmetric version of the environment considered in Acemoglu et al. (2019).
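The monotone-forecast property behind Assumption 5 can be checked numerically. The sketch below is illustrative only: the logistic densities, their scales, and the integration grid are assumptions chosen for the example (the logistic density is strictly log-concave), and the helper names `logistic_pdf` and `posterior_mean` are hypothetical. It computes $E[Y \mid X = x]$ for $X = Y + Z$ on a grid and confirms that the forecast rises with the observed value.

```python
import numpy as np

def logistic_pdf(x, scale=1.0):
    """Logistic density with the given scale, written in a numerically stable form."""
    return 0.25 / (scale * np.cosh(x / (2.0 * scale)) ** 2)

def posterior_mean(x, y_grid, scale_y=1.0, scale_z=2.0):
    """E[Y | X = x] for X = Y + Z with independent logistic Y and Z.

    The posterior density of Y is proportional to f_Y(y) * f_Z(x - y);
    the uniform grid spacing cancels in the ratio of sums.
    """
    weights = logistic_pdf(y_grid, scale_y) * logistic_pdf(x - y_grid, scale_z)
    return float(np.sum(y_grid * weights) / np.sum(weights))

# Assumed grid: wide enough that both densities are negligible at the edges.
y_grid = np.linspace(-60.0, 60.0, 24001)
xs = np.linspace(-6.0, 6.0, 25)
means = np.array([posterior_mean(x, y_grid) for x in xs])

# With a strictly log-concave noise density, the forecast is increasing in x.
is_monotone = bool(np.all(np.diff(means) > 0))
```

Swapping in a noise density that is not log-concave (for instance a heavy-tailed one) is a simple way to see why the assumption is substantive rather than technical.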

We begin our analysis by studying effort choices in a restricted model in which the set of agents who opt-in is exogenously specified. Without loss, we suppose that all $N$ agents participate.

In equilibrium, agents choose effort such that the marginal impact of effort on the principal's forecast in the second period, which we will refer to as the marginal value of effort, equals its marginal cost. Here we define the marginal value of effort and explore its properties.

Fix an equilibrium effort profile $(a_1^*, \dots, a_N^*)$. The principal believes that each outcome is distributed $S_i = \theta_i + a_i^* + \varepsilon_i$, and any agent $i$ who chooses the equilibrium effort level $a_i^*$ believes the same. But if some agent $i$ deviates to a non-equilibrium action $a_i \ne a_i^*$, then he knows that his outcome is distributed $S_i = \theta_i + a_i + \varepsilon_i$. This means that the agent's expected period-2 reward (i.e. the agent's expectation of the principal's forecast of his type) is an iterated expectation with respect to two different probability measures over the space of types and outcomes.

Formally, let $E^\Delta$ denote expectations when agent $i$ chooses effort level $a_i^* + \Delta$. For any profile of realized outcomes $(S_1, \dots, S_N)$, the principal's expectation of agent $i$'s type is $E[\theta_i \mid S_1, \dots, S_N]$. If agent $i$ exerts effort $a_i = a_i^* + \Delta$, then his ex-ante expectation of the principal's forecast is
$$\mu_N(\Delta) \equiv E^\Delta\big[ E[\theta_i \mid S_1, \dots, S_N] \big].$$
Note that if the agent does not distort his effort away from the equilibrium level, then $\mu_N(0) = \mu$, reflecting the usual martingale property of posterior expectations.

When $\Delta \ne 0$, posterior expectations under the principal's beliefs are not a martingale from agent $i$'s perspective: as we show in Appendix C, $\mu_N(\Delta)$ is strictly increasing in $\Delta$. Thus, increasing effort beyond the expected effort level always leads to a higher expected value of the principal's expectation. The agent's incentives to distort effort away from its equilibrium level are governed by the marginal value of effort $MV(N)$, which is defined as
$$MV(N) \equiv \mu_N'(0).$$
Our notation reflects the fact that $\mu_N'(0)$, thus also $MV(N)$, is independent of the equilibrium effort levels $a_1^*, \dots, a_N^*$, due to the additive dependence of outcomes on effort.

Footnote: Kartik et al. (2019) showed that if two agents with differing priors update beliefs in response to signals about an unknown state, the more optimistic agent expects the other's expectation of the state to increase. Our Lemma C.1 complements this result, finding an analogous effect when two agents share a common prior but disagree about the correlation between the state and the signal.

Example. In the Gaussian model described in Section 2.6, an agent who exerts effort $a = a^* + \Delta$ expects the principal's forecast of his type to be
$$\mu_N(\Delta) = \mu + \beta(N) \cdot \Delta$$
for a function $\beta(N)$ that is independent of $\Delta$ and $a$. See Online Appendix O.2.2 for the closed-form expression for $\beta(N)$ (which differs depending on whether we assume a quality linkage or circumstance linkage). The existence of closed-form expressions, as well as linearity of $\mu_N(\Delta)$, are particular to Gaussian uncertainty, although independence with respect to the equilibrium effort level is general. The marginal value of effort $MV(N) = \mu_N'(0)$ is then simply the constant slope $\beta(N)$ in this Gaussian setting.

Throughout, we use $MV_Q(N)$ and $MV_C(N)$ to denote the marginal value functions in the quality linkage and circumstance linkage models, dropping the subscript when a statement holds in both models.

Since agents are symmetric, they share the same marginal value and marginal cost of effort. There is therefore a unique effort level $a^*(N)$ satisfying each agent's equilibrium first-order condition
$$MV(N) = C'(a^*(N)) \tag{2}$$
equating the marginal value of effort $MV(N)$ with its equilibrium marginal cost $C'(a^*(N))$. This condition is both necessary and sufficient to ensure that, when the principal expects all agents to exert effort $a^*(N)$, each agent's optimal effort choice is indeed $a^*(N)$. The unique equilibrium of the exogenous-entry model then entails choice of
$$a^*(N) = (C')^{-1}(MV(N)) \tag{3}$$
by every agent. When we wish to denote equilibrium effort in the quality linkage or the circumstance linkage model specifically, we will write $a_Q^*(N)$ or $a_C^*(N)$ respectively. Note that $a_Q^*(1) = a_C^*(1)$; that is, the equilibrium action is the same in the single-agent version of both models.

We now characterize how the number of participating agents impacts each agent's incentives to exert effort. This comparative static plays a key role in characterizing equilibrium in the full model.
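In the Gaussian example the slope $\beta(N)$ can also be obtained numerically, without the closed form from the Online Appendix. The sketch below is an illustration under stated assumptions: it uses the standard Gaussian conditioning formula (the forecast $E[\theta_i \mid S]$ is linear in $S$, so shifting $S_i$ by $\Delta$ moves the forecast by the regression weight on $S_i$), the variance values are hypothetical, and the function name `beta` is our own. Marginal variances of $\theta_i$ and $\varepsilon_i$ are held equal across the two models, in the spirit of Assumption 2.

```python
import numpy as np

def beta(N, model, vt_bar=1.0, vt_perp=1.0, ve_bar=1.0, ve_perp=1.0):
    """Weight on S_i in E[theta_i | S_1..S_N] for agent i = 0 (Gaussian case).

    model = "quality":      theta_i = theta_bar + theta_perp_i, shocks independent.
    model = "circumstance": eps_i = eps_bar + eps_perp_i, types independent.
    All variances are hypothetical illustration values.
    """
    vt, ve = vt_bar + vt_perp, ve_bar + ve_perp  # common marginal variances
    eye, ones = np.eye(N), np.ones((N, N))
    if model == "quality":
        cov_S = vt_bar * ones + (vt_perp + ve) * eye   # Cov(S_j, S_k)
        cov_theta_S = vt_bar * np.ones(N)              # Cov(theta_0, S_j)
        cov_theta_S[0] += vt_perp
    elif model == "circumstance":
        cov_S = ve_bar * ones + (vt + ve_perp) * eye
        cov_theta_S = np.zeros(N)
        cov_theta_S[0] = vt
    else:
        raise ValueError("model must be 'quality' or 'circumstance'")
    weights = np.linalg.solve(cov_S, cov_theta_S)      # regression coefficients
    return float(weights[0])

for n in (1, 2, 5, 50):
    print(n, beta(n, "quality"), beta(n, "circumstance"))
```

With a quadratic cost $C(a) = c a^2 / 2$ (again an assumption for illustration), equation (3) reduces to $a^*(N) = \beta(N) / c$, so the same function traces out equilibrium effort.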

Lemma 1. The marginal value of effort exhibits the following comparative static in population size:
(a) $MV_Q(N)$ is strictly decreasing in $N$ and $\lim_{N \to \infty} MV_Q(N) > 0$.
(b) $MV_C(N)$ is strictly increasing in $N$ and $\lim_{N \to \infty} MV_C(N) < 1$.

That is, the marginal value of effort declines in the number of agents in the quality linkage model, and increases in the circumstance linkage model. Since $C'$ is strictly increasing, it is immediate from this lemma and (3) that the equilibrium actions $a^*(N)$ display the same comparative statics.

Proposition 1. Equilibrium effort in the exogenous entry model exhibits the following comparative static in population size:
(a) $a_Q^*(N)$ is strictly decreasing in $N$ and $\lim_{N \to \infty} a_Q^*(N) > 0$.
(b) $a_C^*(N)$ is strictly increasing in $N$ and $\lim_{N \to \infty} a_C^*(N) < \infty$.

The key to this result is understanding how the number of observations $N$ impacts the sensitivity of the principal's forecast of $\theta_i$ to the realization of $S_i$. All else equal, the stronger the dependence of this forecast on $i$'s outcome, the stronger the incentive to manipulate its distribution. In the circumstance linkage model, other agents' data (which are informative about the common component of the noise term $\bar\varepsilon$) complements agent $i$'s outcome, improving its marginal informativeness. Thus, the larger $N$ is, the more weight the principal puts on $i$'s outcome in its forecast of $\theta_i$. This force incentivizes effort. In the limit as $N \to \infty$, the principal learns $\bar\varepsilon$ perfectly and can de-bias the outcomes accordingly, so the incentives for agent $i$ to exert effort are the same as in a single-agent model with $S_i = \theta_i + \varepsilon_i^\perp$.

Footnote: Meyer and Vickers (1997) establish the same comparative static in a Gaussian setting with up to two agents; see their Proposition 1.

By contrast, in the quality linkage model other agents' data (which are informative about the common part of the type $\bar\theta$) substitutes for $i$'s signal; thus, the larger $N$ is, the less weight the principal puts on the realization of $i$'s outcome in its forecast of $\theta_i$. This force de-incentivizes effort. In the limit as $N \to \infty$, the principal can extract $\bar\theta$ perfectly from the outcomes of other agents but retains uncertainty about $\theta_i^\perp$, so manipulation of $S_i$ is still valuable. Specifically, the marginal value of effort is the same as in a single-agent model with $S_i = \theta_i^\perp + \varepsilon_i$.

Although this intuition is straightforward, we do not in general have access to the distribution of the principal's posterior expectation in closed form, so we cannot directly quantify the “strength” of the posterior expectation's dependence on the outcome $S_i$. Moreover, although it is straightforward to show that the sequence of functions $\mu_N(\Delta)$ converges pointwise to a limiting function $\mu_\infty(\Delta)$, the rates of this convergence may vary across $\Delta$. Since we are interested in the limiting marginal value $\lim_{N \to \infty} MV(N) = \lim_{N \to \infty} \mu_N'(0)$, we need the stronger property of uniform convergence of $\mu_N(\Delta)$ around $\Delta = 0$. In Appendix C.2.2, we show that the expected impact of increasing effort by $\Delta$, i.e. $\mu_N(\Delta) - \mu_N(0)$, can be bounded by an expression that shrinks (for Part (a)) or grows (for Part (b)) in $N$ uniformly in $\Delta$. This establishes that the marginal value of deviating from equilibrium effort at finite $N$, $\mu_N'(0)$, indeed converges to the marginal value of effort in the limiting model, $\mu_\infty'(0)$, which we can separately characterize.

We now return to the main model, where the agents who participate (and thus the segment size $N$ from the previous section) are endogenously determined.
In this section we assume that $R$ is exogenously fixed, and does not change in response to introduction of a linkage. In Sections 5 and 6 we explore extensions of the model in which $R$ is chosen endogenously.

Footnote: An implication of Lemma 1 is that as $N \to \infty$, the agent's expectation of the principal's forecast converges to the agent's own expectation of his type; that is, $\mu$. This implication has the flavor of the classic Blackwell and Dubins (1962) result on merging of opinions, which says that if two agents have different prior beliefs which are absolutely continuous with respect to one another, then given sufficient information, their posterior beliefs must converge. The difference is that the Blackwell and Dubins (1962) result demonstrates almost-sure convergence, while we are interested in $L^1$-convergence under a shifted measure, that is, whether the agent's expectation of the principal's expectation converges to the agent's own expectation given sufficient data, where the agent and principal use different priors. Neither of these two notions of convergence directly implies the other.

4.1 Equilibrium

In equilibrium, the principal correctly de-biases the impact of effort on observed outcomes. The agent's expected payoff in the second period is thus the prior mean $\mu$, no matter the equilibrium effort level. Therefore opt-in is (weakly) optimal as part of an equilibrium strategy if and only if the agent's equilibrium action $a^*$ satisfies
$$R + \mu - C(a^*) \ge 0.$$
We impose the following lower bound on $R$, which guarantees that agents would find it optimal to opt-in when no other agents are present in the segment. This restricts attention to settings in which a functioning market existed prior to identification of linkages across consumers.

Assumption 7 (Individual Entry). $R \ge C(a^*(1)) - \mu$, where $a^*(1)$ is the equilibrium effort in the exogenous-entry game with a single agent (as defined in (2) with $N = 1$).
In light of Assumption 7, there exists no equilibrium (respecting the refinement introduced in Section 2.5) featuring no entry. This is because in any no-entry equilibrium, an agent deviating to entry and choosing effort $a^*(1)$ would receive a payoff of $R + \mu - C(a^*(1)) > 0$, since the principal expects effort $a^*(1)$ following such a deviation.

Our main results characterize how the equilibrium implications of quality and circumstance linkages differ:

Theorem 1. In the quality linkage model, there is a unique equilibrium for all population sizes $N$. In this equilibrium, each agent opts-in and chooses effort $a_Q^*(N)$.

Theorem 2. In the circumstance linkage model, there is a unique equilibrium for all population sizes $N$. There exists an $N^* \in \{1, 2, \dots\} \cup \{\infty\}$ such that:
• If $N \le N^*$, each agent opts-in and chooses effort $a_C^*(N)$,
• If $N > N^*$, each agent opts-in with probability $p(N) \in (0, 1)$ and chooses effort $a^{**} \in [a_C^*(N^*), a_C^*(N^* + 1))$. The effort level $a^{**}$ is independent of $N$, while the opt-in probability $p(N)$ is strictly decreasing in $N$ and satisfies $\lim_{N \to \infty} p(N) = 0$.
The threshold $N^*$ is increasing in $R$, and is finite for all $R$ sufficiently small.

The equilibrium actions characterized in Theorems 1 and 2 are depicted in Figure 2. When the segment size is small, Assumption 7 ensures that opting-in is strictly profitable for all agents in each model, and so the equilibrium effort levels $a_Q^*(N)$ and $a_C^*(N)$ are the same as in the previous section. Thus, the equilibrium effort levels inherit the properties described in Proposition 1.

[Figure 2: The relationship between population size and equilibrium effort. Horizontal axis: population size $N$; vertical axis: effort. Curves: quality linkage equilibrium effort $a_Q^*(N)$ and circumstance linkage equilibrium effort $a_C^*(N)$, with $N^*$ and $a^{**}$ marked.]

As the population size grows, opting-in becomes increasingly attractive in the quality linkage model, since equilibrium effort $a_Q^*(N)$ decreases in $N$. As a result, all agents participate no matter how large the population. But in the circumstance linkage model, effort $a_C^*(N)$ increases in $N$ and so participation becomes less attractive as the population of entering agents grows. If $N$ is large enough that the total cost of participation $C[a^*(N)]$ exceeds the expected reward $R + \mu$, then full participation cannot be an equilibrium. We let $N^*$ denote the largest $N$ for which $R + \mu \ge C[a^*(N)]$. Then for any $N > N^*$, agents randomize over entry in equilibrium. In this mixed equilibrium, agents must enter at a rate $p(N) < 1$ and choose an action $a^{**}$ so as to satisfy two conditions:
1. Agents are indifferent over entry: $R + \mu = C(a^{**})$,
2. The marginal value of distortion equals its marginal cost: $E\big[ MV(1 + \tilde N) \,\big|\, \tilde N \sim \mathrm{Bin}(N - 1, p(N)) \big] = C'(a^{**})$.
The entry condition pins down the action level $a^{**}$, which is independent of the population size. The entry rate $p(N)$ is then pinned down by the requirement that the expected marginal value of effort must equal the marginal cost when agents who enter take action level $a^{**}$. Since the marginal value of effort is increasing in the number of entrants, $p(N)$ must drop with $N$ to equilibrate marginal values and costs.

Footnote: If the opt-in reward $R$ is large enough, it may be that $N^* = \infty$ and all agents enter no matter how large the population, as even the limiting effort level for very large populations is worth incurring for the large entry reward. The value $N^*$ is finite whenever $R$ is not too large.

We now analyze the welfare implications of the equilibrium outcomes derived in Section 4.1. Following Holmström (1999), we consider outcomes to represent socially valuable surplus generated by service provision, while effort is socially costly. In addition, we consider the forecast $E[\theta_i \mid S_j, j \in I_{\text{opt-in}}]$ to reflect surplus that the agent receives, e.g. through future service. These factors contribute to social surplus only for participating agents, since surplus is not generated by agents who opt-out. Meanwhile, we take the reward $R$ to represent a monetary transfer, which affects the split of surplus but not the amount generated.

For any symmetric strategy profile $(p, a)$ chosen by a population of $N$ agents, where $p$ is the opt-in probability and $a$ is an action choice, we define total expected welfare to be
$$W(p, a, N) = E\left[ \sum_{i=1}^{N} \mathbb{1}(\text{opt-in}) \times \big[ S_i + E(\theta_i \mid S_j, j \in I_{\text{opt-in}}) - C(a) \big] \right] = pN \cdot (a + 2\mu - C(a)). \tag{4}$$
Total welfare is divided between the principal and agents as follows: the principal receives the outcome $S_i$ and pays a reward $R$ to every participating agent $i$, yielding expected profits
$$\Pi(p, a, N) = pN \cdot (a + \mu - R).$$
Meanwhile every participating agent receives reward $R$ and the reputational payoff $E[\theta_i \mid S_j, j \in I_{\text{opt-in}}]$, and incurs effort cost $C(a)$. Total consumer welfare is therefore
$$CS(p, a, N) = pN \cdot (R + \mu - C(a)).$$
Note that $W(p, a, N) = \Pi(p, a, N) + CS(p, a, N)$, so all surplus goes to either the principal or one of the agents.

We consider how each of these welfare measures compares to a “no data linkages” benchmark in which the principal does not observe the linkage across agents, and uses only agent $i$'s outcome $S_i$ to predict their type $\theta_i$. That is, the principal's forecast is $E(\theta_i \mid S_i)$. In equilibrium in this benchmark, each agent opts-in (by Assumption 7), and chooses effort level
$$a_{NDL} \equiv a^*(1) \tag{5}$$
i.e. the action that would be taken for a population of size 1. (Recall that this action is the same for both linkage models.) In a similar spirit to Assumption 7, we assume that serving agents is profitable absent a linkage:

Assumption 8 (Profitable market). $a^*(1) + \mu > R$.

This assumption ensures that a functioning market existed prior to linkages becoming available, and that the principal would not prefer to drop out rather than serve the market.

Footnote: In general, this probability $p(N)$ is not the same as the probability $p^*(N)$ satisfying $MV(1 + p^*(N) \cdot (N - 1)) = C'(a^{**})$, i.e. the opt-in probability such that equilibrium effort is $a^{**}$ given deterministic entry of $p^*(N) \cdot (N - 1)$ other agents. In the Gaussian setting (and we suspect more generally) $MV(N)$ is a concave function of $N$, implying that uncertainty about the number of entrants increases the equilibrium rate of entry.

Footnote: In Section 7.3 we consider how results change if improved prediction also contributes to social welfare.
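To make the entry conditions and the welfare decomposition concrete, the following sketch solves for the circumstance-linkage equilibrium in a Gaussian setting and evaluates $W$, $\Pi$, and $CS$. Everything numerical here is an assumption for illustration: the variances, the quadratic cost $C(a) = c a^2 / 2$, and the values of $\mu$ and $R$ are hypothetical, and the expression in `mv_c` is derived from standard Gaussian conditioning rather than taken from the Online Appendix. The function names are our own.

```python
import math

# Hypothetical parameters: Var(theta_i), Var(eps_bar), Var(eps_perp_i).
VAR_THETA, VAR_EPS_BAR, VAR_EPS_PERP = 2.0, 1.0, 1.0
COST, MU, R = 1.0, 0.1, 0.07  # quadratic cost scale, prior mean, reward

def cost(a):
    return 0.5 * COST * a * a

def mv_c(n):
    """Weight on S_i in E[theta_i | S_1..S_n] under a circumstance linkage."""
    b = VAR_THETA + VAR_EPS_PERP
    return (VAR_THETA / b) * (1.0 - VAR_EPS_BAR / (b + n * VAR_EPS_BAR))

def expected_mv(n, p):
    """E[MV(1 + K)] where K ~ Binomial(n - 1, p) counts the other entrants."""
    return sum(math.comb(n - 1, k) * p**k * (1 - p)**(n - 1 - k) * mv_c(1 + k)
               for k in range(n))

def equilibrium(n, tol=1e-12):
    """Return (opt-in probability, effort) for population size n."""
    a_full = mv_c(n) / COST                # first-order condition (2)
    if cost(a_full) <= R + MU:             # full entry is individually rational
        return 1.0, a_full
    a_mixed = math.sqrt(2.0 * (R + MU) / COST)   # indifference: R + mu = C(a)
    lo, hi = 0.0, 1.0                      # expected_mv is increasing in p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_mv(n, mid) < COST * a_mixed:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), a_mixed

def welfare(p, a, n):
    """Total welfare, principal profit, and consumer welfare."""
    total = p * n * (a + 2 * MU - cost(a))
    profit = p * n * (a + MU - R)
    consumer = p * n * (R + MU - cost(a))
    return total, profit, consumer

def benchmark_welfare(n):
    """No-data-linkage benchmark: full entry at the single-agent effort."""
    return welfare(1.0, mv_c(1) / COST, n)[0]
```

Under these assumed parameters, calling `equilibrium(n)` for increasing `n` shows full entry for small populations and a falling entry probability at a constant effort level once entry becomes mixed; comparing `welfare(*equilibrium(n), n)[0]` with `benchmark_welfare(n)` illustrates how the welfare ranking can flip as the population grows.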

Consumer welfare depends only on the action each agent is induced to take upon entry, and not on equilibrium entry rates. This is because agents randomize over entry only when opting-in and opting-out yield the same payoff. So consumer welfare can be computed as if every agent entered and exerted the equilibrium effort level, and this welfare is declining in effort. Therefore consumer welfare rises under any quality linkage (which lowers effort) and drops under any circumstance linkage (which raises effort), no matter the population size.

Principal profits are rising in effort, and also in the participation rate whenever per-agent profits are positive. When agents within a segment have correlated quality, Theorem 1 indicates that use of the linkage for prediction (increasing the effective population size from 1 to $N$) will lead to depressed effort by agents without affecting participation, thus reducing firm profits relative to the no-linkage benchmark. Firms may therefore prefer to commit not to use big data analytics for forecasting outcomes based on such linkages.

On the other hand, when agents experience shared circumstances (that affect current-period outcomes but are not reflective of underlying quality), Theorem 2 shows that use of the linkage will boost agent effort but may reduce participation. For small segments, firms benefit from the effort boost, and the linkage is profitable. However, for sufficiently large segments the effect of dampened participation outweighs this benefit (since $p(N) \to 0$ as $N \to \infty$ but effort levels are bounded), and the linkage becomes unprofitable.

4.2.3 Social surplus

While firm profits are always increasing in effort and consumer welfare is always decreasing, social welfare is non-monotone in effort. Each participating agent generates a surplus of $a + 2\mu - C(a)$, which is maximized at the unique effort level $a_{FB}$ satisfying $C'(a_{FB}) = 1$. Since $\mu > 0$, surplus is strictly positive at this effort level, and so aggregate surplus is maximized when all agents enter and exert effort $a_{FB}$.

We first show that equilibrium actions are below the first-best action in both models no matter how many agents participate. This result implies that, fixing the level of participation, linkages which boost effort improve social welfare.

Lemma 2. For every population size $N$, equilibrium effort is inefficiently low in both models: $a^*(N) < a_{FB}$. As $N$ increases:
• Effort in the circumstance linkage model $a_C^*(N)$ becomes more efficient but is bounded below the efficient level: $\lim_{N \to \infty} a_C^*(N) < a_{FB}$.
• Effort in the quality linkage model $a_Q^*(N)$ becomes less efficient.

Recall that the equilibrium action $a^*$ satisfies $C'(a^*) = MV(N)$ while the first-best action $a_{FB}$ satisfies $C'(a_{FB}) = 1$. The lemma is proved by demonstrating that $MV(N) < 1$ for all $N$. Intuitively, some effort is always dissipated, since the realization of the outcome is noisy, so the principal's forecast of $\theta_i$ moves less than 1-to-1 with the outcome. This result generalizes a classic result from Holmström (1999), which demonstrated that $a^*(1) < a_{FB}$ in the case of Gaussian random variables.

The following proposition builds on the previous result and compares $W_{NDL}(N)$, $W_Q(N)$, and $W_C(N)$, which respectively denote social welfare under the no-linkage benchmark, a quality linkage, and a circumstance linkage.

Proposition 2. For every $N > 1$, $W_Q(N) < W_{NDL}(N)$. There exists a population threshold $\bar N$ such that $W_{NDL}(N) < W_C(N)$ for all $1 < N < \bar N$ while $W_C(N) < W_{NDL}(N)$ for all $N > \bar N$.

For all populations with $N \ge 2$, all agents opt-in under a quality linkage, so the comparison of actions completely determines welfare, and $a_{NDL}(N) = a_Q^*(1) > a_Q^*(N)$.

In contrast, under a circumstance linkage, the comparison depends on the population size $N$. In small populations, all agents opt-in, so again the action comparison completely determines welfare. Since $a_{NDL}(N) = a_C^*(1) < a_C^*(N)$, data linkages lead to an improvement in social welfare. In large populations, depressed entry dominates and results in lower social welfare despite increased effort levels from participating agents. (Both regimes exist whenever the population threshold $N^*$ above which agents randomize over entry is finite and larger than 1.) These results suggest that a social planner should restrict use of big data to identify linkages over quality while encouraging use of big data to identify linkages over circumstances that are shared by small populations.

In the previous section we considered the impact of a data linkage in a setting in which the principal's transfer $R$ to agents was held fixed. We now consider the implications of allowing the principal to adjust $R$ freely to maximize profits subject to each agent's participation constraint. Formally, we augment our baseline setup with an initial stage in which the principal chooses $R$, following which agents play the game described in Section 2. We continue to restrict attention to equilibria in which players choose symmetric participation strategies. Further, whenever multiple equilibria exist in the game among agents, we select the principal's optimal equilibrium.

Our first result is that social surplus increases under a circumstance linkage and decreases under a quality linkage, no matter the population size. Meanwhile consumers are indifferent to introduction of a linkage, as the principal extracts all consumer surplus in either case.
To state this result formally, let $W^\dagger_{NDL}(N)$, $W^\dagger_Q(N)$, and $W^\dagger_C(N)$ denote total social surplus in a population of $N$ agents under monopoly pricing given no data linkage, a quality linkage, and a circumstance linkage. (Social surplus is defined as in Section 5.2.)

Lemma 3. Suppose that $R$ is chosen optimally by the principal. Then for every $N > 1$,
$$W^\dagger_Q(N) < W^\dagger_{NDL}(N) < W^\dagger_C(N).$$
Total consumer welfare is zero with or without a data linkage.

Footnote: In Section 4.1 we established equilibrium uniqueness whenever the inequality $\mu + R \ge C(a^*(1))$ is satisfied. When this inequality is violated, there can exist multiple equilibria in the quality linkage model, and the principal's profit-maximizing choice of $R$ depends on the equilibrium selection.

The intuition for this result follows from the fact that agents have no private information about their willingness to pay, and so the principal can always extract all surplus from an interaction through the transfer $R$, which may be negative (in which case it represents a price consumers must pay to participate). Given this fact, the principal's choice of $R$ seeks to maximize the surplus it can appropriate. Since higher effort is achievable under a circumstance linkage than without one, and since higher effort is efficiency-enhancing, total welfare increases under a circumstance linkage. On the other hand, under a quality linkage it is never possible to induce full entry at effort level $a^*(1)$ no matter the choice of $R$, and so surplus falls under a quality linkage.

Our second result describes how equilibrium patterns of effort and participation change under a linkage. Note that without a linkage, all agents participate and exert effort $a^*(1)$ under an optimal choice of $R$.

Lemma 4. Under a quality linkage, agents exert effort $a^\dagger_Q(N) \in [a_Q^*(N), a^*(1))$ and enter with probability $p^\dagger_Q(N) \in (0, 1]$. Under a circumstance linkage, agents exert effort $a_C^*(N)$ and enter with probability 1.

In contrast to the baseline model, when the principal optimally chooses $R$, participation may fall under a quality linkage but not under a circumstance linkage. The result for the circumstance linkage is straightforward: increasing the number of participating agents boosts surplus both through the value of the transaction and via increased effort by all participating agents. Thus, the principal optimally sets $R$ so that all agents enter. The quality linkage result is more subtle, because in that setting increased participation has countervailing effects, increasing the surplus generated by the transaction, but reducing effort by participating agents. Depending on model parameters, the latter effect may dominate the former, leading the principal to suppress entry and extract more effort from agents who do participate.

To establish the result for the quality-linkage model we demonstrate that it is in fact possible for the principal to coordinate agents on a partial-entry equilibrium. This result may be surprising in light of our result from the baseline model that all agents participate in the unique equilibrium. The key to that result was Assumption 7, which ensured $R$ was large enough for participation to be profitable in a single-agent model. However, when $R$ is flexible, the principal may contemplate choosing $R$ low enough that a single agent wouldn't enter if he expected to exert effort $a^*(1)$. So long as $R$ is sufficiently large that entry is profitable at effort level $a_Q^*(N)$, there still exists a full-entry equilibrium. However, there also exist two additional equilibria due to the strategic complementarity of agents' entry choices: a no-entry equilibrium (which the principal never prefers) and a partial-entry equilibrium.
The principal must then choose between the full-entry equilibrium induced by the reward $R = C(a_Q^*(N)) - \mu$, and the range of partial-entry equilibria induced by rewards $R \in (C(a_Q^*(N)) - \mu,\ C(a^*(1)) - \mu]$. In the proof, we show that the latter may be optimal, depending on model parameters. In particular, if the curvature of the effort cost function is low, then small changes in participation induce large changes in effort, making partial-entry equilibria especially profitable.

So far we have considered the implications of data linkages for a single firm which uses data to inform predictions about consumer behavior. This focus allowed us to isolate the direct effect of data linkages on consumer effort and participation. When multiple firms compete for consumers, additional important questions regarding behavior and welfare arise which we can leverage our model to answer.

In this section we address a recent policy debate regarding data sharing. In many markets, a consumer's business brings with it data on the consumer's behavior, which by default is privately owned by the organization with which the consumer interacts. Recently, proposals have been made to form so-called “data commons” to make this data freely accessible to all organizations in the market. For example, the European Commission has begun exploring legislative action that would support “business-to-business data sharing,” and new platforms for data sharing, such as Data Republic, permit organizations to share anonymised data with one another. We study here the impact of such data sharing on effort provision and consumer welfare. To do this, we extend our model to $K \ge 2$ firms competing for $N$ consumers according to the following timeline:

t = −1: Each firm $k$ simultaneously chooses a reward $R_k$. These transfers are publicly observed.
t = 0: Each consumer chooses a firm to participate with (if any).
t = 1: Participating consumers choose what level of effort to exert, without observing the participation decisions of other consumers.
t = 2: Participating consumers receive their firm's forecast of their type.

Payoffs and consumer welfare are as in the single-principal model.

Footnote: Our focus on consumer welfare mirrors recent policy discussions regarding data collection and sharing, which have been mostly concerned with the impact of these activities on consumers. Our main findings would be similar if we instead analyzed total social surplus. In particular, an analog of Proposition 3 holds when considering the impact of data sharing on social surplus.

We contrast a proprietary data regime, under which each firm observes only the outcomes of the consumers who interact with them, with a data sharing regime, under which the outcomes of all participating agents are shared across firms. These settings differ only in the information that firms have access to when making their forecasts at time $t = 2$. We assume that whether data is proprietary or shared is common knowledge.

As our solution concept, we use subgame-perfect Nash equilibria in pure strategies (which we henceforth refer to simply as an equilibrium). Throughout, we maintain a restriction on out-of-equilibrium beliefs analogous to the refinement imposed in the single-principal model: at any information set in which agent $i$ participates with principal $k$, principal $k$ expects agent $i$ to choose the action $a_i$ satisfying
$$MV\big(1 + N^k_{-i}\big) = C'(a_i),$$
where $N^k_{-i}$ is the number of agents $j \ne i$ who participate with principal $k$ under their equilibrium strategies. This refinement ensures that each principal expects every participating agent $i$ to choose the equilibrium action from a game with exogenous participation of $1 + N^k_{-i}$ agents, even when participation by agent $i$ is out-of-equilibrium.

We do not provide a full characterization of the equilibrium set, as there exists a large set of equilibria under proprietary data.
Despite this fact, we can show that the shift This restriction diﬀers slightly from the one we used in the single-principal model: we require that agentsnot mix over participation, but we allow agents to make asymmetric participation decisions. Imposing theserestrictions in the single-principal model would not substantively impact the analysis. In particular, equilibriawould be identical except in the circumstance linkage model with

N > N ∗ . In that regime there exist pure-strategy equilibria with asymmetric entry decisions, which exhibit the same comparative statics in eﬀort andparticipation rates as the symmetric mixed equilibrium. For a given set of transfers, strategic substitutibility or complementarity between consumer participationdecisions allow for existence of a multiplicity of participation patterns. The selection of participation patternsacross subgames can then support a variety of equilibrium rewards by ﬁrms.

Proposition 3.

In both the quality linkage and circumstance linkage models, consumer welfare is higher under data sharing than under proprietary data.

This result arises from the interplay of two forces: how data sharing impacts the total surplus generated from the market via participation and effort, and how it changes the split of this surplus between consumers and firms. Under data sharing, firms are identical from the consumers’ perspective, since all firms have access to the same outcomes regardless of the pattern of participation. This forces firm profits to zero and transfers all surplus to consumers. On the other hand, data sharing has a potentially ambiguous impact on total surplus. Total surplus is rising in effort, and effort is rising in the number of participating agents under circumstance linkages, but falling under quality linkages (Proposition 1). (More precisely, total surplus is rising in effort on the interval $[0, a^{FB}]$, where $a^{FB}$ is the first-best action satisfying $C'(a^{FB}) = 1$. We showed in Lemma 2 that equilibrium actions are bounded below first-best. Thus, on the relevant domain, total surplus is rising in effort.) So while consumer welfare clearly rises in the circumstance linkage model, the result under quality linkages is more subtle.

We establish the result for the quality linkage model by proving that under proprietary data, in every equilibrium agents endogenously choose to interact with a single firm (Lemma E.2). This means that data sharing does not increase the effective population size, and aggregate surplus is the same with or without data sharing. The impact of data sharing on consumer welfare is then completely determined by the split of surplus, which we already observed is maximized for consumers under data sharing. So consumer welfare must be at least as large under this regime.

Proposition 3 indicates that under either kind of linkage across consumer outcomes, the introduction of data sharing is welfare-improving for consumers. This result does not imply that under data sharing, the identification of linkages always increases consumer welfare. As noted in Section 4.2, introduction of a quality linkage increases consumer welfare, but introduction of a circumstance linkage diminishes it.

Thus, data sharing (the pooling of information across competitive firms) and data linkages (the identification of relationships among consumers that make one consumer’s outcomes predictive of another’s), while related, play very different roles: data linkages determine how the size of a firm’s consumer base impacts the effort that each consumer exerts, while data sharing determines the pattern of participation across the firms and how surplus is divided between consumers and firms. The results of this section reveal that data linkages
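The comparative static invoked above (effort falls with the number of participants under a quality linkage and rises under a circumstance linkage) can be illustrated numerically. The sketch below is not taken from the paper's proofs: it assumes jointly Gaussian terms, a forecast equal to the posterior mean, and, for the effort interpretation, a hypothetical quadratic cost $C(a) = a^2/2$ so that the first-order condition $MV(N) = C'(a)$ gives $a^* = MV(N)$. The function names and variance parameters are illustrative.

```python
def mv_quality(n, var_common, var_idio, var_noise):
    """Weight on S_i in E[theta_i | S_1..S_n] under a quality linkage.

    Assumed Gaussian model: S_i = a_i + theta_bar + theta_perp_i + eps_i,
    and the forecast target is theta_i = theta_bar + theta_perp_i.
    """
    a = var_common            # variance shared by every outcome
    b = var_idio + var_noise  # outcome-specific variance
    # Var(S) = a * 11' + b * I, inverted with the Sherman-Morrison formula,
    # applied to Cov(theta_i, S) = a * 1 + var_idio * e_i; entry i is returned.
    return ((a + var_idio) - a * (n * a + var_idio) / (b + n * a)) / b


def mv_circumstance(n, var_type, var_common_shock, var_idio_shock):
    """Weight on S_i in E[theta_i | S_1..S_n] under a circumstance linkage.

    Assumed Gaussian model: S_i = a_i + theta_i + eps_bar + eps_perp_i,
    with types theta_i independent across agents.
    """
    a = var_common_shock
    b = var_type + var_idio_shock
    # Same covariance structure; here Cov(theta_i, S) = var_type * e_i.
    return (var_type / b) * (1.0 - a / (b + n * a))


if __name__ == "__main__":
    # With C(a) = a^2 / 2 (an assumption), equilibrium effort equals the weight.
    for n in (1, 2, 5, 20, 100):
        print(n, mv_quality(n, 1.0, 1.0, 1.0), mv_circumstance(n, 1.0, 1.0, 1.0))
```

Under these assumptions the quality-linkage weight declines toward $\sigma^2_{\theta^\perp} / (\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)$ as $n$ grows, while the circumstance-linkage weight rises toward $\sigma^2_\theta / (\sigma^2_\theta + \sigma^2_{\varepsilon^\perp})$.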

We have so far supposed that consumers know the total population size $N$ and the structure of correlation across the outcomes $S_i$. In practice, consumers may not have this kind of detailed knowledge about their segment. We show next that our qualitative findings remain unchanged when agents have uncertainty about the strength of correlation across outcomes and about the population size, so long as agents know whether consumers in their segment are related by quality or circumstance.

Formally, suppose that in the quality linkage model agents may be grouped into any of $K$ “quality linkage” segments, each of which corresponds to a different correlation structure across types; that is, $\bar\theta \sim F^k_{\bar\theta}$, $\theta^\perp_i \sim F^k_{\theta^\perp}$, and $\varepsilon_i \sim F^k_\varepsilon$ for segment $k = 1, \dots, K$. All agents share a common belief about the probability that they are in each segment. (The case of $K$ “circumstance linkage” segments may be similarly defined.) At the same time, suppose that the number of agents $N$ is a random variable, potentially dependent on the segment, with distribution $N \sim G^k_\gamma$, where $\gamma$ is a scale factor known to all agents such that for each segment $k$, $G^k_\gamma$ first-order stochastically dominates $G^k_{\gamma'}$ whenever $\gamma > \gamma'$.

Under this specification, the first-order condition characterizing optimal effort when agents enter with probability $p$ may be written

$$E\left[ MV(1 + \tilde N, k) \right] = C'(a^*),$$

where $MV(N', k)$ is the marginal value of distortion when $N'$ agents enter and the consumer is part of segment $k$, $\tilde N \sim \mathrm{Bin}(N - 1, p)$, and $N$ and $k$ are both random variables. Note that for each segment $k$, $MV(N, k)$ changes with $N$ just as in Lemma 1. Then conditional on the segment $k$, $E[MV(1 + \tilde N, k) \mid k]$ decreases with $p$ and $\gamma$ in the quality linkage model, and increases with $p$ and $\gamma$ in the circumstance linkage model. Since this property holds for every segment $k$, it must also hold for the unconditional expected marginal value $E[MV(1 + \tilde N, k)]$.

The reasoning of the previous paragraph yields the conclusion that the expected marginal value of distortion moves with the population scale factor $\gamma$ and the entry rate $p$ just as it does with respect to $N$ and $p$ in the baseline model. So the following corollary holds:

Corollary.

In the model with uncertainty over segment and population size, equilibrium effort and participation rates exhibit the same comparative statics in $\gamma$ as with respect to $N$ in Theorems 1 and 2.

That is, an increase in $\gamma$, which shifts up the distribution for the number of participants no matter the realized segment, leads to higher effort under circumstance linkages and lower effort under quality linkages. (The threshold $N^*$ at which participation rates begin to drop in the “circumstance linkage” case would, however, depend on details of their beliefs about the segment.)

So far we have conducted our analysis supposing that each consumer is identified as part of a single segment. In practice a consumer may belong to several demographic and lifestyle segments, each of which may be used by an organization to improve predictions of the consumer’s type. We now show that aggregation of outcomes from multiple segments for prediction creates a natural amplification of the effort effect identified in Proposition 1: as the number of identifiable quality linkages for a consumer increases (e.g. because the organization has purchased data about additional covariates), his effort declines; and as the number of identifiable circumstance linkages for a consumer increases, his effort rises.

To formally model variation in the number of segments, we focus on the effort exerted by a single agent, whom we refer to as agent 0. We decompose the agent’s outcome $S_0$ as the sum of a number of components, some common and some idiosyncratic. In the quality linkage context, we write

$$S_0 = a_0 + \sum_{j=1}^{J} \bar\theta^j + \theta^\perp_0 + \varepsilon_0,$$

where $\theta^\perp_0$ and $\varepsilon_0$ are idiosyncratic persistent and transient components of the outcome. Each $\bar\theta^j$ is a persistent component of the outcome which is held in common with a segment $j$ consisting of $N_j$ agents. The outcomes of agents in segment $j$ are observed by the principal, and each agent $i$ in this segment has an outcome distributed as

$$S^j_i = a^j_i + \bar\theta^j + \theta^{\perp,j}_i + \varepsilon^j_i,$$

where $\theta^{\perp,j}_i$ and $\varepsilon^j_i$ are idiosyncratic. As usual, the principal wishes to predict $\theta_0 = \sum_{j=1}^{J} \bar\theta^j + \theta^\perp_0$.

Analogously, in the circumstance linkage model we decompose the agent’s outcome as

$$S_0 = a_0 + \theta_0 + \sum_{j=1}^{J} \bar\varepsilon^j + \varepsilon^\perp_0,$$

and each agent $i$ from group $j$ has an outcome distributed as

$$S^j_i = a^j_i + \theta^j_i + \bar\varepsilon^j + \varepsilon^{\perp,j}_i.$$

(For simplicity, we do not model agents in other groups as having multiple linkages. Extending the model to allow such linkages would not impact results in any way so long as no group $j$ is linked to another group $j'$ also linked to agent 0.)

As in the baseline model, all type and shock terms are mutually independent. In each model we impose analogs of the assumptions in Section 2.6 on the relevant densities and posterior means. Participation of all agents is exogenously given.

Proposition 4 below demonstrates a comparative static in the number of linkages observed by the principal. A principal who observes $m$ linkages understands the correlation structure of each $\bar\theta^j$ (or $\bar\varepsilon^j$) with the segment-$j$ outcomes $(S^j_1, \dots, S^j_{N_j})$ for $j = 1, \dots, m$, but believes that for $j = m + 1, \dots, J$ each $\bar\theta^j$ (or $\bar\varepsilon^j$) term is idiosyncratic. This could, for example, correspond to the principal knowing which of their consumers are charitable givers, but not knowing which consumers are single parents. Let $a^\dagger_Q(m)$ be agent 0’s equilibrium action when the principal observes $m$ linkages in the quality linkage model, with $a^\dagger_C(m)$ similarly defined for the circumstance linkage model. The following result characterizes how agent 0’s equilibrium action changes with $m$.

Proposition 4. $a^\dagger_Q(m)$ is strictly decreasing in $m$, while $a^\dagger_C(m)$ is strictly increasing in $m$.

For simplicity we have restricted attention to multiple linkages of the same type. However, the basic logic of Proposition 4 holds even when the agent may be linked to other segments via both quality and circumstance linkages. Given any initial set of linkages (each of which may be either a quality or circumstance linkage), identification of an additional quality linkage decreases equilibrium effort, while identification of an additional circumstance linkage increases equilibrium effort. (We omit the proof, which follows straightforwardly along the lines of the proof of Proposition 4.)
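The direction of Proposition 4 can be checked in a simple special case. The sketch below is only an illustration under assumptions that are not part of the paper's general argument: all components are Gaussian, every segment shares one hypothetical idiosyncratic variance `seg_noise_var`, and the quantity computed is the weight that the posterior mean of agent 0's type places on $S_0$ (which equals effort if one additionally assumes a quadratic cost $C(a) = a^2/2$). Observing a linkage shrinks the residual variance of the corresponding common term; an unobserved linkage leaves it at its prior variance.

```python
def posterior_var(prior_var, seg_size, seg_noise_var):
    """Residual variance of a common term after observing seg_size outcomes,
    each equal to the common term plus independent Gaussian noise."""
    noise = seg_noise_var / seg_size  # noise variance of the segment mean
    return prior_var * noise / (prior_var + noise)


def residual_common_var(m, common_vars, seg_sizes, seg_noise_var):
    """Total residual variance of the common terms when the first m
    linkages are observed and the remaining ones are treated as idiosyncratic."""
    return sum(
        posterior_var(v, n, seg_noise_var) if j < m else v
        for j, (v, n) in enumerate(zip(common_vars, seg_sizes))
    )


def weight_quality(m, common_vars, seg_sizes, seg_noise_var, var_idio, var_noise):
    """Weight on S_0 when forecasting theta_0 = sum_j theta_bar_j + theta_perp_0."""
    signal = residual_common_var(m, common_vars, seg_sizes, seg_noise_var) + var_idio
    return signal / (signal + var_noise)


def weight_circumstance(m, shock_vars, seg_sizes, seg_noise_var, var_type, var_idio_shock):
    """Weight on S_0 when forecasting theta_0 and the common terms are shocks."""
    noise = residual_common_var(m, shock_vars, seg_sizes, seg_noise_var) + var_idio_shock
    return var_type / (var_type + noise)


if __name__ == "__main__":
    variances, sizes = [1.0, 1.0, 1.0], [5, 5, 5]  # hypothetical values
    for m in range(4):
        print(m,
              weight_quality(m, variances, sizes, 2.0, 1.0, 1.0),
              weight_circumstance(m, variances, sizes, 2.0, 1.0, 1.0))
```

In this special case each additional observed quality linkage removes persistent variation that $S_0$ would otherwise be needed to learn about, lowering the weight on $S_0$, whereas each additional observed circumstance linkage removes noise from $S_0$, raising it.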

So far we have considered prediction of an agent’s type relevant for social welfare only insofar as it generates incentives for the agent to exert effort to influence the prediction. However, in some applications, better tailoring of a service level to fit the agent’s type may involve changes in allocation which improve welfare. For instance, a bank extending loans to small businesses may increase total output if it is able to more accurately match loan amounts to the profitability of each business.

When better prediction improves welfare, the social welfare results of Proposition 2 are qualitatively the same for circumstance linkages, but may change under a quality linkage. Identification of a circumstance linkage now has two positive forces on per-agent welfare, improving both the effort exerted and the forecast precision of each participating agent’s type (given a fixed entry rate). Since the participation rate still drops to zero when the population size becomes large, circumstance linkages improve welfare for small populations but decrease it for large populations, identical to the baseline model.

Under a quality linkage, the impacts of the linkage on effort and on prediction accuracy have countervailing effects on welfare. For large populations the total effect is determined by comparing the drop in effort from $a^*(1)$ to $\lim_{N \to \infty} a^*(N)$ with the gains from accurate prediction of $\theta$. When the value of improved prediction is small, quality linkages decrease welfare for large populations (as in our baseline model), while the opposite is true when the value of improved prediction is large.

As firms and governments move towards collecting large datasets of consumer transactions and behavior as inputs to decision-making, the question of whether and how to regulate the usage of consumer data has emerged as an important policy question. Recent regulations, such as the European Union’s General Data Protection Regulation (GDPR), have focused on protecting consumers’ privacy and improving transparency regarding what kind of data is being collected. An important complementary consideration when designing regulations is how data impacts social and economic behaviors.

In the present paper, we analyze one such impact: the effect that consumer segmentations identified by novel datasets have on consumer incentives for socially valuable effort. We find that the behavioral and welfare consequences depend crucially on how consumers in a segment are linked.

These results suggest that regulations should take into account not just whether individual data is informative about other consumers, but whether that data is primarily useful for inferring quality or denoising observations.

In practice, the usage of a particular dataset is likely to differ across domains, and may have as much to do with the underlying correlation structure of the data as it does with the algorithms used to aggregate that data. We hope that even the reduced-form models of data aggregation that we have considered here make clear that regulation of the “amount” of data is too crude for many objectives: the structure of that data, and how it is used for prediction, can have important consequences.

Finally, our analysis in Section 6 of the interaction between market forces and data linkages points to another interesting avenue for subsequent work. Since participation is a strategic complement under quality linkages but a strategic substitute under circumstance linkages, the former encourages the emergence of a single firm that serves all consumers, while the latter discourages it. This suggests that identification of linkages across consumers affects not just those consumers and their behavior, but can also have important implications for market structure and antitrust policy.

References

Acemoglu, D., A. Makhdoumi, A. Malekian, and A. Ozdaglar (2019): “Too Much Data: Prices and Inefficiencies in Data Markets,” Working Paper.

Acquisti, A., L. Brandimarte, and G. Loewenstein (2015): “Privacy and Human Behavior in the Age of Information,” Science, 347, 509–514.

Agarwal, A., M. Dahleh, and T. Sarkar (2019): “A Marketplace for Data: An Algorithmic Solution,” in Proceedings of the 2019 ACM Conference on Economics and Computation, 701–726.

Auriol, E., G. Friebel, and L. Pechlivanos (2002): “Career Concerns in Teams,” Journal of Labor Economics, 20, 289–307.

Ball, I. (2019): “Scoring Strategic Agents,” Working Paper.

Bergemann, D., A. Bonatti, and T. Gan (2019): “The Economics of Social Data,” Working Paper.

Bergemann, D., A. Bonatti, and A. Smolin (2018): “The Design and Price of Information,” American Economic Review, 108, 1–48.

Blackwell, D. and L. Dubins (1962): “Merging of Opinions with Increasing Information,” The Annals of Mathematical Statistics, 33, 882–886.

Bonatti, A. and G. Cisternas (2019): “Consumer Scores and Price Discrimination,” Review of Economic Studies, forthcoming.

Chouldechova, A. (2017): “Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments,” Big Data, 5, 153–163.

Dewatripont, M., I. Jewitt, and J. Tirole (1999): “The Economics of Career Concerns, Part I: Comparing Information Structures,” Review of Economic Studies, 66, 183–198.

Dwork, C. and A. Roth (2014): “The Algorithmic Foundations of Differential Privacy,” Found. Trends Theor. Comput. Sci., 9, 211–407.

Eilat, R., K. Eliaz, and X. Mu (2019): “Optimal Privacy-Constrained Mechanisms,” Working Paper.

Eliaz, K. and R. Spiegler (2018): “Incentive-Compatible Estimators,” Working Paper.

Elliott, M. and A. Galeotti (2019): “Market Segmentation through Information,” Working Paper.

European Commission (2020): “A European strategy for data.”

Fainmesser, I. P., A. Galeotti, and R. Momot (2019): “Digital Privacy,” Working Paper.

Federal Trade Commission (2014): “Data Brokers: A Call for Transparency and Accountability.”

Frankel, A. and N. Kartik (2019): “Muddled Information,” Journal of Political Economy, 127, 1739–1776.

Frankel, A. and N. Kartik (2020): “Improving Information from Manipulable Data,” Working Paper.

Georgiadis, G. and M. Powell (2019): “Optimal Incentives under Moral Hazard: From Theory to Practice,” Working Paper.

Gomes, R. and A. Pavan (2019): “Price Customization and Targeting in Matching Markets,” Working Paper.

Green, J. and N. Stokey (1983): “A Comparison of Tournaments and Contracts,” Journal of Political Economy, 91, 349–364.

Hidir, S. and N. Vellodi (2019): “Personalization, Discrimination and Information Disclosure,” Working Paper.

Holmström, B. (1982): “Moral Hazard in Teams,” Bell Journal of Economics, 13, 324–340.

Holmström, B. (1999): “Managerial Incentive Problems: A Dynamic Perspective,” Review of Economic Studies, 66, 169–182.

Hu, L., N. Immorlica, and J. W. Vaughan (2019): “The Disparate Effects of Strategic Manipulation,” in Proceedings of the Conference on Fairness, Accountability, and Transparency, 259–268.

Ichihashi, S. (2019): “Online Privacy and Information Disclosure by Consumers,” American Economic Review, 110, 569–595.

Jullien, B., Y. Lefouili, and M. H. Riordan (2018): “Privacy Protection and Consumer Retention,” Working Paper.

Kartik, N., F. X. Lee, and W. Suen (2019): “A Theorem on Bayesian Updating and Applications to Communication Games,” Working Paper.

Kleinberg, J., S. Mullainathan, and M. Raghavan (2017): “Inherent Trade-Offs in the Fair Determination of Risk Scores,” in , vol. 67, 43:1–43:23.

Lazear, E. and S. Rosen (1981): “Rank-Order Tournaments as Optimum Labor Contracts,” Journal of Political Economy, 89, 841–864.

Meyer, M. A. and J. Vickers (1997): “Performance Comparisons and Dynamic Incentives,” Journal of Political Economy, 105, 547–581.

Milgrom, P. (1981): “Good News and Bad News: Representation Theorems and Applications,” The Bell Journal of Economics, 12, 380–391.

Olea, J. L. M., P. Ortoleva, M. M. Pai, and A. Prat (2018): “Competing Models,” Working Paper.

Rodina, D. (2017): “Information Design and Career Concerns,” Working Paper.

Saumard, A. and J. A. Wellner (2014): “Log-Concavity and Strong Log-Concavity: A Review,” Statistics Surveys, 8, 45–114.

Senate Committee on Commerce, Science, and Transportation (2013): “A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes.”

Shleifer, A. (1985): “A Theory of Yardstick Competition,” RAND Journal of Economics, 16, 319–327.

Yang, K. H. (2019): “Selling Consumer Data for Profit: Optimal Market-Segmentation Design and its Consequences,” Working Paper.

Appendix

The appendices are structured as follows: Appendix A reports a list of actual consumer data segmentations sold by data brokers. Appendix B establishes technical results used in the proofs of the results in the body of the paper. The remaining appendices present proofs of all results in the body of the paper.

Table of Contents

A Consumer Segments Provided by Data Brokers

B Preliminary Results
B.1 Smooth MLRP
B.2 SFOSD of Posterior Distributions
B.3 Monotonicity of Posterior Expectations

C Proofs for Section 3 (Exogenous Entry)
C.1 Equilibrium Characterization
C.2 Proof of Lemma 1

D Proofs for Section 4 (Main Results)
D.1 Proofs of Theorems 1 and 2
D.2 Proof of Lemma 2
D.3 Proof of Proposition 2

E Proofs for Section 6 (Data Sharing, Markets, and Consumer Welfare)
E.1 Proof of Proposition 3

O For Online Publication
O.1 Distributional Regularity Results
O.2 Proofs for the Gaussian Setting
O.3 Proofs for Section 7 (Extensions)

A Consumer Segments Provided by Data Brokers

In this appendix we produce a list of examples of actual consumer segmentations produced by data brokers, as reported in Federal Trade Commission (2014) and Senate Committee on Commerce, Science, and Transportation (2013).

Table 1: Examples of Consumer Segments

Quality Linkage                        Circumstance Linkage
Outdoor/Hunting & Shooting             Sending a Kid to College
Santa Fe/Native American Lifestyle     Expectant Parents
Media Channel Usage - Daytime TV       Buying a Home
Bible Lifestyle                        Getting Married
New Age/Organic Lifestyle              Dieters
Plus-size Apparel                      Families with Newborns
Biker/Hell’s Angels                    Hard Times
Leans Left                             New Mover/Renter/Owner
Fitness Enthusiast                     Death in the Family
Working-class Mom
Thrifty Elders
Health & Wellness Interest
Very Spartan
Small Town Shallow Pockets
Established Elite
Frugal Families
McMansions & Minivans

We have informally categorized segments according to whether they might represent a quality linkage or a circumstance linkage; in practice, this categorization would depend also on the time frame for forecasting. For example, a segment of “consumers with children in college” during a particular observation cycle is a quality linkage segment while the children remain in college, but a circumstance linkage segment once the children have graduated.

Besides these named categories, data brokers also provide segmentation based on numerous demographic, health, interest, financial, and social media indicators, including: miles traveled in the last 4 weeks, number of whiskey drinks consumed in the past 30 days, whether the individual or household is a pet owner, whether the individual donates to charitable causes, whether the individual enjoys reading romance novels, whether the individual participates in sweepstakes or contests, whether the individual suffers from allergies, whether the individual is a member of five or more social networks, and whether the individual is a heavy Twitter user, among countless others.

B Preliminary Results

In this section we establish a number of first-order stochastic dominance and monotonicity results used in proofs of results in the body of the paper. Throughout this appendix, fix a segment size $N$ and assume that all agents opt in. (All results extend immediately to any set of agents $I \subset \{1, \dots, N'\}$ of size $N$ entering from a segment of size $N' > N$.) Let $G^M_i$ denote the distribution function of agent $i$’s outcome in model $M \in \{Q, C\}$, with $M = Q$ the quality linkage model and $M = C$ the circumstance linkage model. We will write $g^M_i$ for the density function associated with $G^M_i$. For the joint distribution of the outcomes of agents $i$ through $j$, we will write $G^M_{i:j}$.

B.1 Smooth MLRP

A classic result of Milgrom (1981) demonstrates that if a signal satisfies the monotone likelihood ratio property (MLRP), then posterior beliefs can be ordered by first-order stochastic dominance. For our results we desire not just that the posterior distribution is strictly decreasing in the conditioning variable, but that it be differentiable and that the derivative be strictly negative. We define a smooth form of the MLRP sufficient to achieve this result.

Definition B.1 (Smooth MLRP). A family of conditional density functions $\{f(x \mid y)\}_{y \in Y}$ on $\mathbb{R}$ for some $Y \subset \mathbb{R}$ satisfies the smooth monotone likelihood ratio property (SMLRP) in $y$ if:

• $f(x \mid y)$ is a strictly positive, $C^{1,0}$ function of $(x, y)$,

• $f(x \mid y)$ and $\frac{\partial}{\partial x} f(x \mid y)$ are both uniformly bounded for all $(x, y)$,

• The likelihood ratio function $\ell(x; y, y') \equiv \frac{f(x \mid y)}{f(x \mid y')}$ satisfies $\frac{\partial \ell}{\partial x}(x; y, y') > 0$ for every $x$ and $y > y'$.

(A function $f : \mathbb{R}^2 \to \mathbb{R}$ lies in the class $C^{1,0}$ if it is continuous everywhere and $\frac{\partial f}{\partial x}(x, y)$ exists and is continuous everywhere.)

Note that

$$\frac{\partial \ell}{\partial x}(x; y, y') = \frac{f(x \mid y)}{f(x \mid y')} \left( \frac{\partial}{\partial x} \log f(x \mid y) - \frac{\partial}{\partial x} \log f(x \mid y') \right).$$

Thus the condition on the likelihood ratio function imposed by SMLRP is equivalent to the condition that $\frac{\partial}{\partial x} \log f(x \mid y)$ be a strictly increasing function of $y$ for every $x$.

The following lemma establishes an important class of random variables satisfying SMLRP.
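Before turning to the lemma, a quick check of Definition B.1 in a familiar case may be helpful. The Gaussian location family used here is an illustrative assumption, not a restriction imposed in the paper; $\sigma^2 > 0$ denotes a hypothetical known variance.

```latex
% Illustrative assumption: Gaussian location family with known variance \sigma^2.
\[
  f(x \mid y) = \frac{1}{\sqrt{2\pi\sigma^2}}
                \exp\!\left( -\frac{(x - y)^2}{2\sigma^2} \right),
  \qquad
  \frac{\partial}{\partial x} \log f(x \mid y) = -\frac{x - y}{\sigma^2}.
\]
% The log-derivative is strictly increasing in y for every x, so the third
% condition of SMLRP holds. Equivalently, for y > y':
\[
  \ell(x; y, y')
    = \exp\!\left( \frac{(y - y')\,x}{\sigma^2} - \frac{y^2 - y'^2}{2\sigma^2} \right),
  \qquad
  \frac{\partial \ell}{\partial x}(x; y, y')
    = \frac{y - y'}{\sigma^2}\, \ell(x; y, y') > 0 .
\]
% f is strictly positive and jointly continuous, and both f and
% \partial f / \partial x are uniformly bounded, so the first two
% conditions hold as well.
```

This family corresponds to $X = Y + \text{noise}$ with Gaussian noise, which is an additive structure with a strictly log-concave noise density.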

Lemma B.1. Let $X$ and $Y$ be two independent random variables with density functions $f_X$ and $f_Y$ which are each $C^1$, strictly positive, strictly log-concave functions, and which each have bounded first derivative. Let $Z = k + X + Y$ for a constant $k$. Then the conditional densities $f_{Z|X}(z \mid x)$ and $f_{Z|Y}(z \mid y)$ satisfy the SMLRP in $x$ and $y$, respectively.

Proof. First take $k = 0$. We prove the result for $f_{Z|X}$, with the result for $f_{Z|Y}$ following symmetrically. Note that $f_{Z|X}(z \mid x) = f_Y(z - x)$. By Lemma O.1, $f_Y$ is bounded. This result along with the additional assumptions on $f_Y$ ensures that $f_{Z|X}$ satisfies the first two conditions of SMLRP. As for the likelihood ratio condition, it is sufficient to establish that $\frac{\partial}{\partial z} \log f_{Z|X}(z \mid x) = \frac{\partial}{\partial z} \log f_Y(z - x)$ is strictly increasing in $x$ for each $z$. But since $f_Y$ is strictly log-concave, $\frac{\partial}{\partial z} \log f_Y(z - x) > \frac{\partial}{\partial z} \log f_Y(z - x')$ whenever $z - x < z - x'$, i.e. whenever $x > x'$. So the likelihood ratio condition is satisfied as well.

Now suppose $k \neq 0$. Then the result applied to the random variable $X + Y$ establishes that $f_{X+Y|X}(z \mid x)$ and $f_{X+Y|Y}(z \mid y)$ satisfy the SMLRP in $x$ and $y$, respectively. As $f_{Z|X}(z \mid x) = f_{X+Y|X}(z - k \mid x)$ and $f_{Z|Y}(z \mid y) = f_{X+Y|Y}(z - k \mid y)$, and since each of the conditions of the SMLRP is invariant to shifts in the first argument, these densities satisfy the SMLRP as well.

The following lemma is the main result of this appendix. It strengthens the FOSD result of Milgrom (1981) to ensure that the posterior distribution function is smooth and has a strictly negative derivative with respect to the conditioning variable. The sufficient conditions are that the likelihood function satisfy SMLRP and that the density function of the unobserved variable be continuous. The proof here establishes the sign of the derivative, with the proof of smoothness relegated to Lemma O.2 in the Online Appendix.

Lemma B.2 (Smooth FOSD). Let $X$ and $Y$ be two random variables for which the density $g(y)$ for $Y$ and the conditional densities $f(x \mid y)$ for $X \mid Y$ exist. Suppose that $f(x \mid y)$ satisfies the SMLRP in $y$ and $g(y)$ is continuous. Then $H(x, y) \equiv \Pr(Y \leq y \mid X = x)$ is a $C^1$ function of $(x, y)$ and $\frac{\partial H}{\partial x}(x, y) < 0$ everywhere.

Proof. Lemma O.2 establishes that $H$ is a $C^1$ function. To sign its derivative with respect to $x$, note that the derivative of $\widehat H(x, y) \equiv H(x, y)^{-1} - 1$ with respect to $x$ is

$$\frac{\partial \widehat H}{\partial x}(x, y) = \left( \int_{-\infty}^{y} f(x \mid y'') \, dG(y'') \right)^{-2} \times \int_{y}^{\infty} dG(y') \int_{-\infty}^{y} dG(y'') \left( f(x \mid y'') \frac{\partial}{\partial x} f(x \mid y') - f(x \mid y') \frac{\partial}{\partial x} f(x \mid y'') \right).$$

(See the proof of Lemma O.2 for a detailed derivation.) The integrand may be rewritten

$$f(x \mid y'') \frac{\partial}{\partial x} f(x \mid y') - f(x \mid y') \frac{\partial}{\partial x} f(x \mid y'') = f(x \mid y'')^2 \left( \frac{\frac{\partial}{\partial x} f(x \mid y')}{f(x \mid y'')} - \frac{f(x \mid y') \frac{\partial}{\partial x} f(x \mid y'')}{f(x \mid y'')^2} \right) = f(x \mid y'')^2 \, \frac{\partial}{\partial x} \ell(x; y', y'').$$

Now, as $y' > y > y''$ on the interior of the domain of integration, $\frac{\partial}{\partial x} \ell(x; y', y'') > 0$ by the SMLRP, and hence $\frac{\partial \widehat H}{\partial x}(x, y) > 0$. Therefore

$$\frac{\partial H}{\partial x}(x, y) = -\frac{\frac{\partial \widehat H}{\partial x}(x, y)}{\left( \widehat H(x, y) + 1 \right)^2} < 0,$$

as desired.

B.2 SFOSD of Posterior Distributions

We now develop smooth first-order stochastic dominance results regarding posterior distributions of various latent variables as outcomes shift. These results rely heavily on the SFOSD result established in Lemma B.2. Application of that lemma requires checking smoothness and boundedness conditions of the underlying likelihood functions, which are straightforward but tedious in our environment. We relegate proofs of these regularity conditions to Online Appendix O.1.

The following result establishes that as an agent’s outcome increases, inferences about the common component of the outcome increase as well.

Lemma B.3. For agent $i \in \{1, \dots, N\}$ and outcome-action profile $(S_{-i}, a)$:

• $F^Q_{\bar\theta}(\bar\theta \mid S; a)$ is a $C^1$ function of $(S_i, \bar\theta)$ satisfying $\frac{\partial}{\partial S_i} F^Q_{\bar\theta}(\bar\theta \mid S; a) < 0$ for all $(S_i, \bar\theta)$,

• $F^C_{\bar\varepsilon}(\bar\varepsilon \mid S; a)$ is a $C^1$ function of $(S_i, \bar\varepsilon)$ satisfying $\frac{\partial}{\partial S_i} F^C_{\bar\varepsilon}(\bar\varepsilon \mid S; a) < 0$ for all $(S_i, \bar\varepsilon)$.

Proof. For convenience, we suppress the dependence of distributions on $a$ in this proof. Fix $S_{-i}$. We will prove the first result, with the second following from nearly identical work by permuting the roles of $\theta$ and $\varepsilon$.

The result follows from Lemma B.2 provided that 1) $f^Q_{\bar\theta}(\bar\theta \mid S_{-i})$ is continuous with respect to $\bar\theta$, and 2) $g^Q_i(S_i \mid \bar\theta, S_{-i})$ satisfies SMLRP with respect to $\bar\theta$. As for the first condition, Bayes’ rule gives

$$f^Q_{\bar\theta}(\bar\theta \mid S_{-i}) = \frac{f_{\bar\theta}(\bar\theta) \prod_{j \neq i} g_j(S_j \mid \bar\theta)}{g_{-i}(S_{-i})} = \frac{f_{\bar\theta}(\bar\theta) \prod_{j \neq i} f_{\theta^\perp + \varepsilon}(S_j - \bar\theta - a_j)}{g_{-i}(S_{-i})}.$$

Then as $f_{\bar\theta}$ and $f_{\theta^\perp + \varepsilon}$ are both continuous functions, $f^Q_{\bar\theta}(\bar\theta \mid S_{-i})$ is a continuous function of $\bar\theta$. It therefore suffices to establish condition 2.

Note that conditional on $\bar\theta$, $S_i$ is independent of $S_{-i}$ in the quality linkage model; so $g^Q_i(S_i \mid \bar\theta, S_{-i}) = g^Q_i(S_i \mid \bar\theta)$. So it suffices to establish that $g^Q_i(S_i \mid \bar\theta)$ satisfies SMLRP with respect to $\bar\theta$. Recall that in the quality linkage model, $S_i = a_i + \bar\theta + \theta^\perp_i + \varepsilon_i$, where by assumption $\bar\theta$, $\theta^\perp_i$ and $\varepsilon_i$ all have $C^1$, strictly positive, strictly log-concave density functions with bounded derivatives. Lemma O.1 ensures that these densities are additionally bounded. These properties are all inherited by the density function of the sum $\theta^\perp_i + \varepsilon_i$, which is just the convolution of the density functions for $\theta^\perp_i$ and $\varepsilon_i$. Lemma B.1 then implies that $g^Q_i(S_i \mid \bar\theta)$ satisfies SMLRP with respect to $\bar\theta$, as desired.

The following lemma establishes smooth stochastic dominance of a posterior distribution arising in analysis of the quality linkage model. While the property is the same one established by Lemma B.2, the boundedness conditions of that lemma cannot be guaranteed and so slightly different techniques are required to reach the result.

Lemma B.4.

For every outcome-action profile $(S_1, a_1)$ and type $\theta_1$, the function $F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta; a_1)$ is continuously differentiable with respect to $\bar\theta$ everywhere, and $\frac{\partial}{\partial \bar\theta} F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta; a_1) < 0$.

Proof. For convenience, we suppress the dependence of distributions on $a$ in this proof. By Bayes’ rule,

$$F^Q_{\theta_1}(t \mid S_1, \bar\theta) = \frac{\int_{-\infty}^{t} f^Q_{\bar\theta}(\bar\theta \mid \theta_1 = t', S_1) \, f_{\theta_1}(t' \mid S_1) \, dt'}{\int_{-\infty}^{\infty} f^Q_{\bar\theta}(\bar\theta \mid \theta_1 = t', S_1) \, f_{\theta_1}(t' \mid S_1) \, dt'}.$$

Note that $f^Q_{\bar\theta}(\bar\theta \mid \theta_1, S_1)$ is independent of $S_1$, as $(\theta_1, S_1)$ contains the same information as $(\theta_1, \varepsilon_1)$ and $\bar\theta$ is independent of $\varepsilon_1$. So $f^Q_{\bar\theta}(\bar\theta \mid \theta_1, S_1) = f^Q_{\bar\theta}(\bar\theta \mid \theta_1)$. Another application of Bayes’ rule reveals that

$$f^Q_{\bar\theta}(\bar\theta \mid \theta_1) = \frac{f^Q_{\theta_1}(\theta_1 \mid \bar\theta) \, f_{\bar\theta}(\bar\theta)}{f_{\theta_1}(\theta_1)} = \frac{f_{\theta^\perp}(\theta_1 - \bar\theta) \, f_{\bar\theta}(\bar\theta)}{f_{\theta_1}(\theta_1)},$$

while

$$f_{\theta_1}(\theta_1 \mid S_1) = \frac{g_1(S_1 \mid \theta_1) \, f_{\theta_1}(\theta_1)}{g_1(S_1)} = \frac{f_\varepsilon(S_1 - \theta_1 - a_1) \, f_{\theta_1}(\theta_1)}{g_1(S_1)}.$$

Inserting back into the previous expression for $F^Q_{\theta_1}(t \mid S_1, \bar\theta)$ yields

$$F^Q_{\theta_1}(t \mid S_1, \bar\theta) = \frac{\int_{-\infty}^{t} f_{\theta^\perp}(t' - \bar\theta) \, f_\varepsilon(S_1 - t' - a_1) \, dt'}{\int_{-\infty}^{\infty} f_{\theta^\perp}(t' - \bar\theta) \, f_\varepsilon(S_1 - t' - a_1) \, dt'}.$$

Using the change of variables $t'' = S_1 - t' - a_1$ yields

$$F^Q_{\theta_1}(t \mid S_1, \bar\theta) = \frac{\int_{S_1 - t - a_1}^{\infty} f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, dF_\varepsilon(t'')}{\int_{-\infty}^{\infty} f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, dF_\varepsilon(t'')}.$$

Now, as $f'_{\theta^\perp}$ exists and is bounded, the Leibniz integral rule ensures that derivatives of the numerator and denominator with respect to $\bar\theta$ may be moved inside the integral sign. So $F^Q_{\theta_1}(t \mid S_1, \bar\theta)$ is differentiable with respect to $\bar\theta$. And as $f'_{\theta^\perp}$ is additionally continuous, the dominated convergence theorem ensures that these derivatives are continuous. Meanwhile the numerator and denominator themselves are each continuous in $\bar\theta$ given that $f_{\theta^\perp}$ is continuous and bounded. Thus $F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta)$ is continuously differentiable with respect to $\bar\theta$.

To sign the derivative, we may equivalently sign

$$H(\bar\theta) \equiv F^Q_{\theta_1}(t \mid S_1, \bar\theta)^{-1} - 1 = \frac{\int_{-\infty}^{S_1 - t - a_1} f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') \, dF_\varepsilon(t')}{\int_{S_1 - t - a_1}^{\infty} f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, dF_\varepsilon(t'')}.$$

Differentiating and re-arranging yields

$$H'(\bar\theta) = \left( \int_{S_1 - t - a_1}^{\infty} f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, dF_\varepsilon(t'') \right)^{-2} \times \int_{-\infty}^{S_1 - t - a_1} dF_\varepsilon(t') \int_{S_1 - t - a_1}^{\infty} dF_\varepsilon(t'') \times \Big( -f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') + f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') \, f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \Big).$$

The integrand may be rewritten

$$-f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') + f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') \, f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'')$$
$$= f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'') \, f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t') \times \left( -\frac{f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t')}{f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t')} + \frac{f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'')}{f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'')} \right).$$

Throughout the domain of integration $t'' > t'$, and so because $f_{\theta^\perp}$ is strictly log-concave,

$$\frac{f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'')}{f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t'')} > \frac{f'_{\theta^\perp}(S_1 - a_1 - \bar\theta - t')}{f_{\theta^\perp}(S_1 - a_1 - \bar\theta - t')}.$$

Thus the integrand is strictly positive everywhere, meaning $H'(\bar\theta) > 0$. In other words,

$$\frac{\partial}{\partial \bar\theta} F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta) = -\frac{H'(\bar\theta)}{(H(\bar\theta) + 1)^2} < 0,$$

as desired.

The following lemma establishes how inferences about one agent’s quality change as another agent’s outcome changes. Note that the result depends critically on the model. For simplicity, the result is stated in terms of inferences about agent 1’s type as agent $N$’s outcome shifts. By symmetry analogous results hold for any other pair of agents.

Lemma B.5.

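For every outcome-action profile $(S_{-N}, a)$,
$$\frac{\partial}{\partial S_N} F^Q_{\theta_1}(\theta_1 \mid S; a) < 0 \quad \text{and} \quad \frac{\partial}{\partial S_N} F^C_{\theta_1}(\theta_1 \mid S; a) > 0$$
for every $(\theta_1, S_N)$. Proof.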

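For convenience, we suppress the dependence of distributions on $a$ in this proof. Fix $S_{-N}$. Recall that Lemma O.4 established that $F^M_{\theta_1}(\theta_1 \mid S)$ is a $C^1$ function of $(S_N, \theta_1)$ for each model $M \in \{Q, C\}$.

Consider first the quality linkage model. Then
$$F^Q_{\theta_1}(\theta_1 \mid S) = \int_{-\infty}^{\infty} F^Q_{\theta_1}(\theta_1 \mid S, \bar\theta)\, dF^Q_{\bar\theta}(\bar\theta \mid S).$$
Conditional on $\bar\theta$, $\theta_1$ depends on $S$ only through $S_1$, so this can be written
$$F^Q_{\theta_1}(\theta_1 \mid S) = \int_{-\infty}^{\infty} F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta)\, dF^Q_{\bar\theta}(\bar\theta \mid S).$$
Lemma B.3 establishes that $F^Q_{\bar\theta}(\bar\theta \mid S)$ is a $C^1$ function of $(S_N, \bar\theta)$ satisfying $\frac{\partial}{\partial S_N} F^Q_{\bar\theta}(\bar\theta \mid S) < 0$. Hence $F^Q_{\bar\theta}(\bar\theta \mid S) - q$ is a $C^1$ function of $(S_N, \bar\theta, q)$, with Jacobian $f^Q_{\bar\theta}(\bar\theta \mid S)$ wrt $\bar\theta$. By Bayes' rule,
$$f^Q_{\bar\theta}(\bar\theta \mid S) = \frac{f_{\bar\theta}(\bar\theta) \prod_{i=1}^{N} g_i(S_i \mid \bar\theta)}{\int d\theta'\, f_{\bar\theta}(\theta') \prod_{i=1}^{N} g_i(S_i \mid \theta')}.$$
Since $g_i(S_i \mid \bar\theta) = f_{\theta^\perp + \varepsilon}(S_i - \bar\theta - a_i)$ and $f_{\bar\theta}$ and $f_{\theta^\perp + \varepsilon}$ are both strictly positive, $f^Q_{\bar\theta}(\bar\theta \mid S) > 0$ everywhere. The implicit function theorem then implies that there exists a $C^1$ function $\phi(q, S_N)$ such that $F^Q_{\bar\theta}(\phi(q, S_N) \mid S) = q$ for all $(q, S_N)$, and further that
$$\frac{\partial \phi}{\partial S_N}(q, S_N) = -\left[ \frac{1}{f^Q_{\bar\theta}(t \mid S)} \frac{\partial}{\partial S_N} F^Q_{\bar\theta}(t \mid S) \right]_{t = \phi(q, S_N)} > 0.$$
A change of variables allows $F^Q_{\theta_1}(\theta_1 \mid S)$ to be integrated with respect to quantiles of $\bar\theta$ using the quantile function $\phi$, yielding
$$F^Q_{\theta_1}(\theta_1 \mid S) = \int_0^1 F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta = \phi(q, S_N))\, dq.$$
Then for any $\Delta > 0$,
$$-\frac{1}{\Delta}\Big( F^Q_{\theta_1}(\theta_1 \mid S_N = s_N + \Delta, S_{-N}) - F^Q_{\theta_1}(\theta_1 \mid S_N = s_N, S_{-N}) \Big) = \int_0^1 -\frac{1}{\Delta}\Big( F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta = \phi(q, s_N + \Delta)) - F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta = \phi(q, s_N)) \Big)\, dq.$$
Since $F^Q_{\theta_1}(\theta_1 \mid S)$ is differentiable wrt $S_N$, the limit of both sides as $\Delta \downarrow 0$ exists. Lemma B.4 established that $\frac{\partial}{\partial \bar\theta} F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta)$ exists, is continuous in $\bar\theta$, and is strictly negative everywhere. Meanwhile we showed above that $\phi(q, S_N)$ is strictly increasing in $S_N$. This means that the interior of the integrand is strictly positive for every $q$ and $\Delta > 0$, implying by Fatou's lemma and the chain rule that
$$-\frac{\partial}{\partial S_N} F^Q_{\theta_1}(\theta_1 \mid S) \ge -\int_0^1 \frac{\partial}{\partial \bar\theta} F^Q_{\theta_1}(\theta_1 \mid S_1, \bar\theta) \bigg|_{\bar\theta = \phi(q, S_N)} \frac{\partial \phi}{\partial S_N}(q, S_N)\, dq.$$
As the first term in the integrand is strictly negative while the second is strictly positive, this inequality in turn implies $\frac{\partial}{\partial S_N} F^Q_{\theta_1}(\theta_1 \mid S) < 0$.

Now consider the circumstance linkage model. Virtually all of the work for the quality linkage model goes through with $\bar\varepsilon$ exchanged for $\bar\theta$, with the key exception that the existence, continuity, and sign of $\frac{\partial}{\partial \bar\varepsilon} F^C_{\theta_1}(\theta_1 \mid S_1, \bar\varepsilon)$ must be established separately. (Lemma B.4 applies only to the quality linkage model.) Note that $F^C_{\theta_1}(\theta_1 \mid S_1 = s, \bar\varepsilon = t) = F^C_{\theta_1}(\theta_1 \mid \tilde S_1 = s - t)$, where $\tilde S_1 \equiv a_1 + \theta_1 + \varepsilon^\perp_1$. It is therefore sufficient to analyze $\frac{\partial}{\partial \tilde S_1} F^C_{\theta_1}(\theta_1 \mid \tilde S_1)$. Let $\tilde g(\tilde S_1 \mid \theta_1)$ be the density function of $\tilde S_1$ conditional on $\theta_1$. We invoke Lemma B.1 to conclude that $\tilde g(\tilde S_1 \mid \theta_1)$ satisfies SMLRP in $\theta_1$. As additionally $f_{\theta}(\theta_1)$ is continuous by assumption, Lemma B.2 ensures that $\frac{\partial}{\partial \tilde S_1} F^C_{\theta_1}(\theta_1 \mid \tilde S_1)$ exists, is continuous, and is strictly negative everywhere. Thus $\frac{\partial}{\partial \bar\varepsilon} F^C_{\theta_1}(\theta_1 \mid S_1, \bar\varepsilon)$ exists, is continuous, and is strictly positive everywhere.

In light of this result, the final steps of the proof from the quality linkage case, adapted to the circumstance linkage model, show that
$$\frac{1}{\Delta}\Big( F^C_{\theta_1}(\theta_1 \mid S_N = s_N + \Delta, S_{-N}) - F^C_{\theta_1}(\theta_1 \mid S_N = s_N, S_{-N}) \Big) = \int_0^1 \frac{1}{\Delta}\Big( F^C_{\theta_1}(\theta_1 \mid S_1, \bar\varepsilon = \phi(q, s_N + \Delta)) - F^C_{\theta_1}(\theta_1 \mid S_1, \bar\varepsilon = \phi(q, s_N)) \Big)\, dq,$$
where the interior of the right-hand side is strictly positive for all $\Delta > 0$. Then by Fatou's lemma and the chain rule
$$\frac{\partial}{\partial S_N} F^C_{\theta_1}(\theta_1 \mid S) \ge \int_0^1 \frac{\partial}{\partial \bar\varepsilon} F^C_{\theta_1}(\theta_1 \mid S_1, \bar\varepsilon) \bigg|_{\bar\varepsilon = \phi(q, S_N)} \frac{\partial \phi}{\partial S_N}(q, S_N)\, dq > 0.$$

B.3 Monotonicity of Posterior Expectations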

This appendix establishes a series of monotonicity results about how posterior expectations of various latent variables change as some agent's outcome shifts. These results are consequences of the SFOSD results derived in Appendix B.2. Several of the results require smoothness or positivity conditions on underlying distribution and density functions, which are straightforward but tedious to check in our environment. We relegate proofs of these properties to Online Appendix O.1.

We first establish that the posterior expectation of an agent's type increases in his own signal, and that the rate of increase is bounded strictly between 0 and 1.
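As a concrete illustration of the monotonicity results in this appendix, consider the jointly Gaussian special case, in which posterior means are linear in outcomes. The sketch below is not part of the paper's argument: it assumes that every latent variable is normally distributed, and the function name `posterior_weights` and its parameters are hypothetical. It computes the weights $w_j$ in $E[\theta_1 \mid S] = \text{const} + \sum_j w_j S_j$, so that $w_1$ is the own-outcome sensitivity and $w_j$ for $j \neq 1$ is the sensitivity to another agent's outcome.

```python
def posterior_weights(model, n, var_common, var_idio, var_other):
    """Weights w with E[theta_1 | S] = const + sum_j w[j] * S_j (Gaussian case).

    Illustrative assumption: all latent variables are independent normals.
    model "Q" (quality linkage):      theta_i = common + idio_i,
                                      var_other = Var(eps_i).
    model "C" (circumstance linkage): eps_i = common + idio_i,
                                      var_other = Var(theta_i).
    In both models S_i = a_i + theta_i + eps_i.
    """
    c = var_common                 # covariance between any two outcomes
    d = var_idio + var_other       # agent-specific variance of each outcome
    # k[j] = Cov(theta_1, S_j)
    if model == "Q":
        k = [c + var_idio] + [c] * (n - 1)
    elif model == "C":
        k = [var_other] + [0.0] * (n - 1)
    else:
        raise ValueError("model must be 'Q' or 'C'")
    # Var(S) = d*I + c*11', inverted with the Sherman-Morrison formula:
    # Var(S)^{-1} k = (k - c * sum(k) / (d + n*c)) / d, elementwise.
    shrink = c * sum(k) / (d + n * c)
    return [(kj - shrink) / d for kj in k]


if __name__ == "__main__":
    for model in ("Q", "C"):
        print(model, posterior_weights(model, 2, 1.0, 1.0, 1.0))
```

With two agents and unit variances, the weights are $(0.625, 0.125)$ in the quality linkage model and $(0.375, -0.125)$ in the circumstance linkage model: the own-outcome sensitivity lies strictly between 0 and 1 in both cases, while the cross sensitivity is positive under quality linkage and negative under circumstance linkage.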

Lemma B.6 (Forecast sensitivity). For each agent $i \in \{1, ..., N\}$ and outcome-action profile $(S, a)$,
$$0 < \frac{\partial}{\partial S_i} E[\theta_i \mid S; a] < 1.$$
Proof.

For convenience, we suppress the dependence of distributions on $a$ throughout this proof. Also wlog consider agent $i = 1$. We establish the result for the quality linkage model, with the result for the circumstance linkage model following by nearly identical work.

Fix a vector of signal realizations $S_{-1}$. First note that $g^Q_1(S_1 \mid \theta_1, S_{-1}) = g^Q_1(S_1 \mid \theta_1)$, and $S_1$ is the sum of a constant plus the independent random variables $\theta_1$ and $\varepsilon_1$, each of which has a $C^1$, strictly positive, strictly log-concave density function with bounded derivative. Thus by Lemma B.1 $g^Q_1(S_1 \mid \theta_1, S_{-1})$ satisfies SMLRP with respect to $\theta_1$. Further, $f^Q_{\theta_1}(\theta_1 \mid S_{-1})$ is continuous in $\theta_1$ by Lemma O.3. Lemma B.2 then ensures that $F^Q_{\theta_1}(\theta_1 \mid S)$ is a $C^1$ function of $(\theta_1, S_1)$ and $\frac{\partial}{\partial S_1} F^Q_{\theta_1}(\theta_1 \mid S) < 0$.

Next, conditional on $S_{-1}$, $S_1$ can be written $S_1 = a_1 + \tilde\theta + \theta^\perp_1 + \varepsilon_1$, where $\tilde\theta$ is independent of $\theta^\perp_1$ and $\varepsilon_1$ and has density function $f_{\tilde\theta}$ defined by $f_{\tilde\theta}(t) \equiv f^Q_{\bar\theta}(\bar\theta = t \mid S_{-1})$. We first show that $f_{\tilde\theta}$ is a $C^1$, strictly positive, strictly log-concave function with bounded derivative. By Bayes' rule,
$$f_{\tilde\theta}(t) = \frac{f_{\bar\theta}(t) \prod_{i > 1} g^Q_i(S_i \mid \bar\theta = t)}{g^Q(S_{-1})} = \frac{f_{\bar\theta}(t) \prod_{i > 1} f_{\varepsilon + \theta^\perp}(S_i - t - a_i)}{g^Q(S_{-1})},$$
where $f_{\varepsilon + \theta^\perp}$ is the convolution of $f_{\theta^\perp}$ and $f_{\varepsilon}$. Since $f_{\theta^\perp}$ and $f_{\varepsilon}$ are both $C^1$, strictly positive, strictly log-concave functions with bounded derivatives, so is $f_{\varepsilon + \theta^\perp}$. It follows immediately that $f_{\tilde\theta}$ is a strictly positive, $C^1$ function with bounded derivative. Further, taking logarithms yields
$$\log f_{\tilde\theta}(t) = \log f_{\bar\theta}(t) - \log g^Q(S_{-1}) + \sum_{i > 1} \log f_{\varepsilon + \theta^\perp}(S_i - t - a_i).$$
Hence $\log f_{\tilde\theta}$ is a sum of constant and strictly concave functions, meaning it is strictly concave. Thus $f_{\tilde\theta}$ is strictly log-concave.

This means that conditional on $S_{-1}$, $S_1$ is the sum of a constant plus the independent random variables $\varepsilon_1$ and $\tilde\theta + \theta^\perp_1$, each of which has a $C^1$, strictly positive, strictly log-concave density function with bounded derivative. So by Lemma B.1, $g^Q_1(S_1 \mid \varepsilon_1, S_{-1})$ satisfies SMLRP with respect to $\varepsilon_1$. Further, $f^Q_{\varepsilon_1}(\varepsilon_1 \mid S_{-1}) = f_{\varepsilon}(\varepsilon_1)$ is continuous in $\varepsilon_1$ by assumption. Lemma B.2 then ensures that $F^Q_{\varepsilon_1}(\varepsilon_1 \mid S)$ is a $C^1$ function of $(\varepsilon_1, S_1)$ and $\frac{\partial}{\partial S_1} F^Q_{\varepsilon_1}(\varepsilon_1 \mid S) < 0$.

Now, $E[\theta_1 \mid S]$ is equal to
$$E[\theta_1 \mid S] = \int_{-\infty}^{\infty} \theta_1\, dF^Q_{\theta_1}(\theta_1 \mid S).$$
We will perform a change of measure to take the expectation over quantiles of $\theta_1$ rather than $\theta_1$ itself. Fix $S_{-1}$. The previous paragraphs ensure that $F^Q_{\theta_1}(t \mid S) - q$ is a $C^1$ function of $(t, S_1, q)$ everywhere, while Lemma O.3 ensures that the Jacobian of this function wrt $t$ is $f^Q_{\theta_1}(t \mid S) > 0$. Then by the implicit function theorem there exists a continuously differentiable quantile function $\phi(q, S_1)$ such that $F^Q_{\theta_1}(\phi(q, S_1) \mid S) = q$ and
$$\frac{\partial \phi}{\partial S_1}(q, S_1) = -\left[ \frac{1}{f^Q_{\theta_1}(t \mid S)} \frac{\partial}{\partial S_1} F^Q_{\theta_1}(t \mid S) \right]_{t = \phi(q, S_1)} > 0$$
for every $q \in (0, 1)$ and $S_1 \in \mathbb{R}$. Changing measure, $E[\theta_1 \mid S]$ may be expressed as an expectation over quantiles of $\theta_1$, yielding
$$E[\theta_1 \mid S] = \int_0^1 \phi(q, S_1)\, dq.$$
So for every $\Delta > 0$,
$$\frac{1}{\Delta}\Big( E[\theta_1 \mid S_1 + \Delta, S_{-1}] - E[\theta_1 \mid S] \Big) = \int_0^1 \frac{1}{\Delta}\big( \phi(q, S_1 + \Delta) - \phi(q, S_1) \big)\, dq.$$
By Assumption 3, $E[\theta_1 \mid S]$ is differentiable wrt $S_1$ everywhere. So the limit of each side is well-defined as $\Delta \downarrow 0$. Further, as $\phi(q, S_1)$ is strictly increasing in $S_1$ for each $q$, the interior of the integrand is everywhere positive. Then by Fatou's lemma
$$\frac{\partial}{\partial S_1} E[\theta_1 \mid S] \ge \int_0^1 \frac{\partial \phi}{\partial S_1}(q, S_1)\, dq > 0.$$
Now, recall that $S_1 = a_1 + \theta_1 + \varepsilon_1$, so that
$$S_1 = E[S_1 \mid S] = a_1 + E[\theta_1 \mid S] + E[\varepsilon_1 \mid S].$$
Then in particular $E[\varepsilon_1 \mid S]$ must be differentiable wrt $S_1$ given that the remaining terms in the identity are. Very similar work to the previous paragraph then implies that $\frac{\partial}{\partial S_1} E[\varepsilon_1 \mid S] > 0$. Finally, differentiate the identity relating $E[\theta_1 \mid S]$ and $E[\varepsilon_1 \mid S]$ to obtain
$$1 = \frac{\partial}{\partial S_1} E[\theta_1 \mid S] + \frac{\partial}{\partial S_1} E[\varepsilon_1 \mid S].$$
Since each term on the right-hand side is strictly positive, each must also be strictly less than 1.

We next establish that the posterior expectation of the common component of the outcome in each model is increasing in each agent's outcome, with the rate of increase bounded strictly above 0.

Lemma B.7.

For each agent $i \in \{1, ..., N\}$ and outcome-action profile $(S, a)$:
• In the quality linkage model, $\frac{\partial}{\partial S_i} E[\bar\theta \mid S] > 0$,
• In the circumstance linkage model, $\frac{\partial}{\partial S_i} E[\bar\varepsilon \mid S] > 0$.

Proof. For convenience, we suppress the dependence of distributions on $a$ in this proof. We establish the result for the quality linkage model, with the proof for the circumstance linkage model following by nearly identical work. By definition of $E[\bar\theta \mid S]$,
$$E[\bar\theta \mid S] = \int \bar\theta\, dF^Q_{\bar\theta}(\bar\theta \mid S).$$
Recall from the proof of Lemma B.5 (which invokes Lemma B.3 for agent $N$; the same holds for any agent $i$ by symmetry) that $F^Q_{\bar\theta}(\bar\theta \mid S)$ is a $C^1$ function of $(\bar\theta, S_i)$, and $\frac{\partial}{\partial S_i} F^Q_{\bar\theta}(\bar\theta \mid S) < 0$. Hence $F^Q_{\bar\theta}(\bar\theta \mid S) - q$ is a $C^1$ function of $(q, \bar\theta, S_i)$ with Jacobian $f^Q_{\bar\theta}(\bar\theta \mid S)$ wrt $\bar\theta$. By Bayes' rule
$$f^Q_{\bar\theta}(\bar\theta \mid S) = \frac{f_{\bar\theta}(\bar\theta) \prod_{i=1}^{N} g_i(S_i \mid \bar\theta)}{\int d\theta'\, f_{\bar\theta}(\theta') \prod_{i=1}^{N} g_i(S_i \mid \theta')},$$
and as $f_{\bar\theta}(\bar\theta)$ and $g_i(S_i \mid \bar\theta) = f_{\theta^\perp + \varepsilon}(S_i - \bar\theta - a_i)$ are all strictly positive by assumption, $f^Q_{\bar\theta}(\bar\theta \mid S) > 0$ everywhere. Fix $S_{-i}$. Then by the implicit function theorem there exists a $C^1$ quantile function $\phi(q, S_i)$ such that $F^Q_{\bar\theta}(\phi(q, S_i) \mid S) = q$ everywhere, and
$$\frac{\partial \phi}{\partial S_i}(q, S_i) = -\left[ \frac{1}{f^Q_{\bar\theta}(\bar\theta \mid S)} \frac{\partial}{\partial S_i} F^Q_{\bar\theta}(\bar\theta \mid S) \right]_{\bar\theta = \phi(q, S_i)} > 0.$$
By a change of measure, $E[\bar\theta \mid S]$ may be expressed as an integral with respect to quantiles of $\bar\theta$ as
$$E[\bar\theta \mid S] = \int_0^1 \phi(q, S_i)\, dq.$$
Then for every $\Delta > 0$,
$$\frac{1}{\Delta}\Big( E[\bar\theta \mid S_{-i}, S_i = s_i + \Delta] - E[\bar\theta \mid S_{-i}, S_i = s_i] \Big) = \int_0^1 \frac{1}{\Delta}\big( \phi(q, s_i + \Delta) - \phi(q, s_i) \big)\, dq.$$
Assumption 3 guarantees that $\frac{\partial}{\partial S_i} E[\bar\theta \mid S]$ exists. So the limit of each side as $\Delta \downarrow 0$ exists. And as $\phi(q, S_i)$ is strictly increasing in $S_i$, the integrand on the rhs is well-defined and strictly positive. Then by Fatou's lemma,
$$\frac{\partial}{\partial S_i} E[\bar\theta \mid S] \ge \int_0^1 \frac{\partial \phi}{\partial S_i}(q, S_i)\, dq > 0.$$

We next establish how the posterior expectation of each agent's type changes as some other agent's outcome shifts. Note that the result depends critically on the model. For simplicity we consider how agent 1's variables shift as agent $N$'s outcome changes. By symmetry an analogous result holds for any pair of agents.

Lemma B.8.

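For every outcome-action profile $(S, a)$, $\frac{\partial}{\partial S_N} E[\theta_1 \mid S; a] > 0$ in the quality linkage model, while $\frac{\partial}{\partial S_N} E[\theta_1 \mid S; a] < 0$ in the circumstance linkage model.

Proof. For convenience, we suppress the dependence of distributions on $a$ in this proof. By definition $E[\theta_1 \mid S]$ is given by
$$E[\theta_1 \mid S] = \int_{-\infty}^{\infty} \theta_1\, dF^M_{\theta_1}(\theta_1 \mid S).$$
Fix $S_{-N}$. By Lemma O.4, $F^M_{\theta_1}(\theta_1 \mid S)$ is a $C^1$ function of $(S_N, \theta_1)$, and so $F^M_{\theta_1}(\theta_1 \mid S) - q$ is a $C^1$ function of $(S_N, \theta_1, q)$ with Jacobian $f^M_{\theta_1}(\theta_1 \mid S)$ wrt $\theta_1$. By Lemma O.3 the Jacobian is strictly positive everywhere, hence by the implicit function theorem there exists a $C^1$ quantile function $\phi(q, S_N)$ satisfying $F^M_{\theta_1}(\phi(q, S_N) \mid S) = q$ everywhere, with derivative
$$\frac{\partial \phi}{\partial S_N}(q, S_N) = -\left[ \frac{1}{f^M_{\theta_1}(\theta_1 \mid S)} \frac{\partial}{\partial S_N} F^M_{\theta_1}(\theta_1 \mid S) \right]_{\theta_1 = \phi(q, S_N)}.$$
By Lemma B.5, $\frac{\partial}{\partial S_N} F^Q_{\theta_1}(\theta_1 \mid S) < 0$ and $\frac{\partial}{\partial S_N} F^C_{\theta_1}(\theta_1 \mid S) > 0$, so $\frac{\partial \phi}{\partial S_N}(q, S_N) > 0$ in the quality linkage model and $\frac{\partial \phi}{\partial S_N}(q, S_N) < 0$ in the circumstance linkage model. By a change of measure, $E[\theta_1 \mid S]$ may be expressed as an integral over quantiles of $\theta_1$ as
$$E[\theta_1 \mid S] = \int_0^1 \phi(q, S_N)\, dq.$$
Consider first the quality linkage model. For every $\Delta > 0$,
$$\frac{1}{\Delta}\Big( E[\theta_1 \mid S_{-N}, S_N = s_N + \Delta] - E[\theta_1 \mid S_{-N}, S_N = s_N] \Big) = \int_0^1 \frac{1}{\Delta}\big( \phi(q, s_N + \Delta) - \phi(q, s_N) \big)\, dq,$$
where the integrand is strictly positive for every $\Delta > 0$ because $\frac{\partial \phi}{\partial S_N}(q, S_N) > 0$. By Assumption 3, $E[\theta_1 \mid S]$ is differentiable wrt $S_N$ everywhere, so the limits of both sides must exist as $\Delta \downarrow 0$. Then by Fatou's lemma,
$$\frac{\partial}{\partial S_N} E[\theta_1 \mid S] \ge \int_0^1 \frac{\partial \phi}{\partial S_N}(q, S_N)\, dq > 0.$$
Analogously, in the circumstance linkage model
$$-\frac{1}{\Delta}\Big( E[\theta_1 \mid S_{-N}, S_N = s_N + \Delta] - E[\theta_1 \mid S_{-N}, S_N = s_N] \Big) = \int_0^1 -\frac{1}{\Delta}\big( \phi(q, s_N + \Delta) - \phi(q, s_N) \big)\, dq,$$
where the integrand is strictly positive for every $\Delta > 0$, and so by Fatou's lemma
$$-\frac{\partial}{\partial S_N} E[\theta_1 \mid S] \ge -\int_0^1 \frac{\partial \phi}{\partial S_N}(q, S_N)\, dq > 0,$$
or equivalently $\frac{\partial}{\partial S_N} E[\theta_1 \mid S] < 0$.

C Proofs for Section 3 (Exogenous Entry)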

C.1 Equilibrium Characterization

In this section we establish that there exists a unique equilibrium to the exogenous-entry model, which is characterized by the first-order condition described in the body of the paper. Fix a population size $N$, and assume all agents in the segment enter in the first period. For every $\alpha \in \mathbb{R}^N_+$ and $\Delta \ge -\alpha_1$, define
$$\mu(\Delta; \alpha) \equiv E\big[ E[\theta_1 \mid S; a = \alpha] \mid a = (\alpha_1 + \Delta, \alpha_{-1}) \big]$$
to be agent 1's expected second-period payoff from exerting effort $\alpha_1 + \Delta$ when the principal expects each agent $i \in \{1, ..., N\}$ to exert effort $\alpha_i$.

Lemma C.1.

The value function $\mu(\Delta; \alpha)$ and its derivatives satisfy the following properties:
(a) $\mu(\Delta; \alpha)$ is independent of $\alpha$ and is continuous and strictly increasing in $\Delta$.
(b) $\mu'(\Delta; \alpha)$ exists, is continuous in $\Delta$, and satisfies $0 < \mu'(\Delta; \alpha) < 1$ for every $\Delta$.
(c) $D^+ \mu'(\Delta; \alpha) \le K$ for every $\Delta$.
Proof.

Fix a model M ∈ { Q, C } . The quantity µ (∆; α ) can be written explicitly as µ (∆; α ) = (cid:90) dG M ( S = s | a = ( α + ∆ , α − )) E [ θ | S = s ; a = α ] . Further, E [ θ | S = s ; a = α ] = (cid:90) θ dF Mθ ( θ | S = s ; a = α ) , Given a function f : R → R , the Dini derivative D + is a generalization of the derivative existing forarbitrary functions and deﬁned by D + f ( x ) = lim sup h ↓ ( f ( x + h ) − f ( x )) /h . When f is diﬀerentiable at apoint x, D + f ( x ) = f (cid:48) ( x ) . f Mθ ( θ | S = s ; a = α ) = g M ( S = s | θ ; a = α ) f θ ( θ ) g M ( S = s | a = α ) . Since eﬀort aﬀects the outcome as an additive shift, g M ( S = s | a = α ) = g M ( S = s − α | a = ) and g M ( S = s | θ ; a = α ) = g M ( S = s − α | θ ; a = ) . So f Mθ ( θ | S = s ; a = α ) = g M ( S = s − α | θ ; a = ) f θ ( θ ) g M ( S = s − α | a = )= f Mθ ( θ | S = s − α ; α = ) . Thus E [ θ | S = s ; a = α ] = (cid:90) θ dF Mθ ( θ | S = s − α ; a = ) = E [ θ | S = s − α ; a = ] . Then µ (∆; α ) may be equivalently written µ (∆; α ) = (cid:90) dG M ( S = s − α | a = (∆ , )) E [ θ | S = s − α ; a = ] . Using the change of variables s (cid:48) = s − α then reveals that µ (∆; α ) = µ (∆; ) , so µ is indeedindependent of α. Now ﬁx ∆ and ∆ (cid:48) < ∆ . Since eﬀort aﬀects the outcome as an additive shift, G M ( S = s | a = ( α + ∆ , α − )) = G M ( S = ( s − (∆ − ∆ (cid:48) ) , s − ) | a = ( α + ∆ , α − )) for every s . Thendeﬁning a change of variables via s (cid:48) = s − (∆ − ∆ (cid:48) ) and s (cid:48)− i = s − i , the previous integralexpression for µ (∆; α ) may be equivalently written µ (∆; α ) = (cid:90) dG M ( S = s (cid:48) | a = ( α + ∆ (cid:48) , α − )) E [ θ | S = ( s (cid:48) + (∆ − ∆ (cid:48) ) , s (cid:48)− ); a = α ] . 
Now, by Assumption 3 ∂∂S E [ θ | S ; a ] exists and is continuous everywhere, and Lemma B.6established that 0 < ∂∂S E [ θ | S ; a ] < µ (cid:48) (∆; α ) exists and µ (cid:48) (∆; α ) = (cid:90) dG M ( S = s (cid:48) | a = α ) ∂∂ ∆ E [ θ | S = ( s (cid:48) + ∆ , s (cid:48)− ); a = α ] , and in particular 0 < µ (cid:48) (∆; α ) < . An immediate corollary is that µ (∆; α ) is continuousand strictly increasing everywhere. Further, the dominated convergence theorem impliesthat µ (cid:48) (∆; α ) is continuous in ∆ everywhere.Next, by Assumption 6 ∂ ∂ ∆ E [ θ | S = ( s (cid:48) + ∆ , s (cid:48)− ); a = α ]51xists and is bounded in the interval ( −∞ , K ] everywhere. Then for each δ > s , a , ∆) , the mean value theorem implies that1 δ (cid:18) ∂∂ ∆ E [ θ | S = ( s (cid:48) + ∆ + δ, s (cid:48)− ); a = α ] − ∂∂ ∆ E [ θ | S = ( s (cid:48) + ∆ , s (cid:48)− ); a = α ] (cid:19) = ∂ ∂ ∆ E [ θ | S = ( s (cid:48) + ∆ + δ (cid:48) , s (cid:48)− ); a = α ] ≤ K for some δ (cid:48) ∈ [0 , δ ] . Reverse Fatou’s lemma then implies that D + µ (cid:48) (∆; α ) ≤ K . Lemma C.2. µ (∆; α ) − C ( α + ∆) is a strictly concave function of ∆ for any α. Proof.

Fix an α, and deﬁne φ (∆) ≡ µ (∆; α ) − C ( α + ∆) . By Lemma C.1, φ (cid:48) exists and iscontinuous everywhere. We establish the necessary and suﬃcient condition for strict concav-ity that φ (cid:48) is strictly decreasing. We invoke the basic monotonicity theorem from analysisthat any function f which is continuous and satisﬁes D + f ≥ − µ (cid:48) (∆; α ) + K ∆ . Using basic properties of the Diniderivatives D + and D + , we have D + ( − µ (cid:48) (∆; α )) = − D + µ (cid:48) (∆; α ) ≥ − D + µ (cid:48) (∆; α ) . Then since K ∆ is diﬀerentiable and D + µ (cid:48) (∆; α ) ≤ K from Lemma C.1, we have D + ( − µ (cid:48) (∆; α )+ K ∆) = D + ( − µ (cid:48) (∆; α )) + K ≥ . So µ (cid:48) (∆; α ) − K ∆ is nonincreasing everywhere. So choose any ∆and ∆ (cid:48) > ∆ . Then φ (cid:48) (∆ (cid:48) ) = µ (cid:48) (∆ (cid:48) ; α ) − K ∆ (cid:48) + K ∆ (cid:48) − C (cid:48) ( α + ∆ (cid:48) ) ≤ µ (cid:48) (∆; α ) + K (∆ (cid:48) − ∆) − C (cid:48) ( α + ∆ (cid:48) ) . But also by Assumption 6, C (cid:48)(cid:48) ( α + ∆ (cid:48)(cid:48) ) > K for every ∆ (cid:48)(cid:48) ∈ (∆ , ∆ (cid:48) ) , so C (cid:48) ( α + ∆ (cid:48) ) >C (cid:48) ( α + ∆) + K (∆ (cid:48) − ∆) . Thus φ (cid:48) (∆ (cid:48) ) < µ (cid:48) (∆; α ) − C (cid:48) ( α + ∆) = φ (cid:48) (∆) , as desired. Proposition C.1.

There exists a unique equilibrium action proﬁle characterized by a i = a ∗ i ( N ) for each player i, where a ∗ i ( N ) is the unique solution to µ (cid:48) (0; a ∗ ( N )) = C (cid:48) ( a ∗ ( N )) . Proof.

Lemma C.1 established that $\mu'(0; a^*(N))$ is well-defined, independent of $a^*(N)$, and bounded in the interval $[0, 1]$. Then as $C'$ is continuous, strictly increasing, and satisfies $C'(0) = 0$ and $C'(\infty) = \infty$, there exists a unique solution to the stated first-order condition. This solution constitutes an equilibrium so long as $\Delta = 0$ maximizes the objective function $\mu(\Delta; a^*(N)) - C(a^*(N) + \Delta)$, which is guaranteed by the fact, established in Lemma C.2, that this function is strictly concave in $\Delta$.

Define $\mu_N(\Delta) \equiv \mu(\Delta; a^*(N))$ and $MV(N) \equiv \mu'_N(0)$ for each $N$. When we wish to make the model clear, we will write $MV^M(N)$ for $M \in \{Q, C\}$. An immediate implication of Lemma C.1 is that $0 < MV(N) < 1$ for every $N$. We conclude this appendix by establishing that these bounds also hold strictly in the limit as $N \to \infty$.

Lemma C.3. $0 < \lim_{N \to \infty} MV(N) < 1$. Proof.

Consider first the quality linkage model. The proof of Lemma 1 establishes that $\lim_{N \to \infty} MV^Q(N) = MV^Q(\infty)$, where $MV^Q(\infty)$ is the equilibrium marginal value of effort in a one-agent model where the common component $\bar\theta$ is observed by the principal. In this case the agent's equilibrium expected value of distortion is
$$\mu_\infty(\Delta; a^*(\infty)) = E[\bar\theta] + E\big[ E[\theta^\perp_1 \mid \tilde S_1; a_1 = a^*(\infty)] \mid a_1 = a^*(\infty) + \Delta \big],$$
where $\tilde S_1 \equiv a_1 + \theta^\perp_1 + \varepsilon_1$. Since the contribution of $\bar\theta$ to the agent's payoff is not influenced by effort, it has no incentive effect. The marginal value of effort in this setting is then just the marginal value of effort in a one-agent model where the agent's type has density $f_{\theta^\perp}$. As this distribution satisfies the same regularity conditions as $f_{\theta}$, the reasoning establishing that $0 < MV^Q(1) < 1$ also implies that $0 < MV^Q(\infty) < 1$.

Now consider the circumstance linkage model. The proof of Lemma 1 establishes that $\lim_{N \to \infty} MV^C(N) = MV^C(\infty)$, where $MV^C(\infty)$ is the equilibrium marginal value of effort in a one-agent model where the common component $\bar\varepsilon$ is observed by the principal. In this case the agent's equilibrium expected value of distortion is
$$\mu_\infty(\Delta; a^*(\infty)) = E\big[ E[\theta_1 \mid \tilde S_1; a_1 = a^*(\infty)] \mid a_1 = a^*(\infty) + \Delta \big],$$
where $\tilde S_1 \equiv a_1 + \theta_1 + \varepsilon^\perp_1$. The marginal value of effort in this setting is then just the marginal value of effort in a one-agent model where the noise distribution has density $f_{\varepsilon^\perp}$. As this distribution satisfies the same regularity conditions as $f_{\varepsilon}$, the reasoning establishing that $0 < MV^C(1) < 1$ also implies that $0 < MV^C(\infty) < 1$.

C.2 Proof of Lemma 1
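Before the general argument, the claims of Lemma 1 and Lemma C.3 can be checked in closed form in a Gaussian special case. The sketch below is illustrative only: it assumes all latent variables are independent normals (so that $\mu_N$ is linear in $\Delta$ and $MV(N)$ equals the own-outcome weight in the posterior mean), and the function names, parameters, and the cost function with marginal cost $C'(a) = 2a$ are hypothetical choices, not taken from the paper. The equilibrium effort solves the first-order condition $MV(N) = C'(a^*(N))$ by bisection.

```python
def marginal_value(model, n, var_common, var_idio, var_other):
    """MV(N) in the Gaussian case: the weight on S_1 in E[theta_1 | S].

    model "Q": theta_i = common + idio_i, var_other = Var(eps_i).
    model "C": eps_i = common + idio_i,   var_other = Var(theta_i).
    Normality is an illustrative assumption, not part of the paper.
    """
    c = var_common
    d = var_idio + var_other
    if model == "Q":
        return var_idio / d + c * var_other / (d * (d + n * c))
    if model == "C":
        return (var_other / d) * (1.0 - c / (d + n * c))
    raise ValueError("model must be 'Q' or 'C'")


def limit_marginal_value(model, var_idio, var_other):
    """MV(infinity): one-agent model with the common component observed."""
    d = var_idio + var_other
    if model == "Q":
        return var_idio / d
    if model == "C":
        return var_other / d
    raise ValueError("model must be 'Q' or 'C'")


def equilibrium_effort(mv, marginal_cost, upper=1.0, tol=1e-12):
    """Solve marginal_cost(a) = mv by bisection.

    Assumes marginal_cost is continuous and strictly increasing with
    marginal_cost(0) = 0 and unbounded above.
    """
    while marginal_cost(upper) < mv:   # expand the bracket if needed
        upper *= 2.0
    lower = 0.0
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if marginal_cost(mid) < mv:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        mv_q = marginal_value("Q", n, 1.0, 1.0, 1.0)
        mv_c = marginal_value("C", n, 1.0, 1.0, 1.0)
        print(n, mv_q, mv_c, equilibrium_effort(mv_q, lambda a: 2.0 * a))
```

In this special case $MV^Q(N)$ falls with $N$ and $MV^C(N)$ rises with $N$, and both converge to interior limits, matching the direction of the monotonicity claims proved below.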

Throughout this proof, we will without loss of generality consider agent 1’s problem. Tocompare results across segments of diﬀering sizes, we will consider there to be a singleunderlying vector S = ( S , S , ... ) of outcomes for a countably inﬁnite set of agents, with the N -agent model corresponding to observation of the outcomes of the ﬁrst N agents. We willwrite a ∗ ( N ) to indicate the N -vector with entries a ∗ ( N ) , and similarly a ∗ ( N + 1) to indicatethe N + 1-vector with entries a ∗ ( N + 1) . Given any ﬁnite or countably inﬁnite vector x withat least j elements, we will use x i : j to indicate the subvector of x consisting of elements i through j. For the distribution function of the outcome vector S i : j , we will write G Mi : j . .2.1 Monotonicity in N We ﬁrst establish the monotonicity claims of the lemma. Fix a model M ∈ { Q, C } and asegment size N. By deﬁnition, the expected value of distortion µ N (∆) is µ N (∆) = (cid:90) dG M N ( S N = s N | a N = ( a ∗ ( N ) + ∆ , a ∗ ( N ) N )) × E [ θ | S N = s N ; a N = a ∗ ( N )] . By Lemma C.1, the value of distortion is independent of the action vector expected by theprincipal, so we may equivalently write µ N (∆) = (cid:90) dG M N ( S N = s N | a N = ( a ∗ ( N + 1) + ∆ , a ∗ ( N + 1) N )) × E [ θ | S N = s N ; a N = a ∗ ( N + 1) N ] (C.1)replacing a ∗ ( N ) everywhere with a ∗ ( N + 1). Further, the additive structure of the modelimplies that the distribution function G M N satisﬁes the identity G M N ( S N = s N | a N ) = G M N ( S N = s N + b N | a N + b N )for any outcome realization s N , action vector a N , and shift vector b N . 
Then taking b N = ( − ∆ , N − ) , the representation of µ N (∆) in (C.1) may be rewritten µ N (∆) = (cid:90) dG M N ( S N = s (cid:48) N | a N = a ∗ ( N + 1) N ) × E [ θ | S N = ( s (cid:48) + ∆ , s (cid:48) N ); a N = a ∗ ( N + 1) N ] , where we have changed variables to the integrator s (cid:48) N = s N + b N .Meanwhile, the value of distortion with N + 1 agents is µ N +1 (∆) = (cid:90) dG M N +1 ( S N +1 = s N +1 | a N +1 = ( a ∗ ( N + 1) + ∆ , a ∗ ( N + 1) N +1 )) × E [ θ | S N +1 = s N +1 ; a N +1 = a ∗ ( N + 1)] . Using the same transformation as in the N -agent model, this expression may be equivalentlywritten µ N +1 (∆) = (cid:90) dG M N +1 ( S N +1 = s (cid:48) N +1 | a N +1 = a ∗ ( N + 1)) × E [ θ | S N +1 = ( s (cid:48) + ∆ , s (cid:48) N +1 ); a N +1 = a ∗ ( N + 1)] . For the remainder of the proof, all distributions will be conditioned on the action proﬁle a N +1 = a ∗ ( N + 1) , so conditioning of distributions on actions will be suppressed.54o compare the expressions for µ N (∆) and µ N +1 (∆) just derived, we use the law ofiterated expectations. In the N -agent model we have E [ θ | S N = ( s + ∆ , s N )] = (cid:90) dG MN +1 ( S N +1 = s N +1 | S N = ( s + ∆ , s N )) × E [ θ | S N +1 = ( s + ∆ , s N +1 )] . So µ N (∆) = (cid:90) dG M N ( S N = s N ) × (cid:90) dG MN +1 ( S N +1 = s N +1 | S N = ( s + ∆ , s N )) × E [ θ | S N +1 = ( s + ∆ , s N +1 )] . Meanwhile in the N + 1-agent model the law of iterated expectations may be applied to theunconditional expectation over S N +1 to obtain µ N +1 (∆) = (cid:90) dG M N +1 ( S N = s N ) × (cid:90) dG MN +1 ( S N +1 = s N +1 | S N = s N ) × E [ θ | S N +1 = ( s + ∆ , s N +1 )] . So deﬁne a function ψ by ψ ( δ , δ , s N ) ≡ (cid:90) dG MN +1 ( S N +1 = s N +1 | S N = ( s + δ , s N )) × E [ θ | S N +1 = ( s + δ , s N +1 )] . 
Then the values of distortion with N and N + 1 agents may be written in the common form µ N (∆) = (cid:90) dG M N ( S N ) ψ (∆ , ∆ , S N )while µ N +1 (∆) = (cid:90) dG M N ( S N ) ψ (0 , ∆ , S N ) . Then for any ∆ > ,

1∆ ( µ N (∆) − µ N +1 (∆)) = (cid:90) dG M N ( S N ) 1∆ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) . Now, as

M V ( N ) = µ (cid:48) N (0) and M V ( N + 1) = µ (cid:48) N +1 (0) both exist and are ﬁnite by LemmaC.1, it follows that M V ( N ) − M V ( N + 1) = lim ∆ ↓

1∆ ( µ N (∆) − µ ) − lim ∆ ↓

1∆ ( µ N +1 (∆) − µ )= lim ∆ ↓

1∆ ( µ N (∆) − µ N +1 (∆))55xists, so that M V ( N ) − M V ( N + 1) = lim ∆ ↓ (cid:90) dG M N ( S N ) 1∆ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) , and in particular the limit on the rhs also exists. To bound the right-hand side and completethe proof, we analyze the behavior of ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) as ∆ tends to zero.Consider ﬁrst the quality linkage model. Using the law of total probability, we mayre-write ψ ( δ , δ , s N ) as ψ ( δ , δ , s N ) = (cid:90) dF Qθ ( θ | S N = ( s + δ , s N )) × (cid:90) dG QN +1 ( S N +1 = s N +1 | θ, S N = ( s + δ, s N )) × E [ θ | S N +1 = ( s + δ , s N +1 )] . As S N +1 is independent of S N conditional on θ, this is equivalently ψ ( δ , δ , s N ) = (cid:90) dF Qθ ( θ | S N = ( s + δ , s N )) × (cid:90) dG QN +1 ( S N +1 = s N +1 | θ ) E [ θ | S N +1 = ( s + δ , s N +1 )] . Inverting q = G QN +1 ( S N +1 = s N +1 | θ ) = F θ ⊥ + ε ( s N +1 − θ − a ∗ ( N + 1))yields the quantile function s N +1 = F − θ ⊥ + ε ( q ) + θ + a ∗ ( N + 1) , so by a change of variables ψ may be equivalently written ψ ( δ , δ , s N ) = (cid:90) dF Qθ ( θ | S N = ( s + δ , s N )) × (cid:90) dq E [ θ | S N +1 = ( s + δ , s N , F − θ ⊥ + ε ( q ) + θ + a ∗ ( N + 1))] . Now ﬁx s N , and write the integrand of this representation as ζ ( θ, δ, q ) ≡ E [ θ | S N +1 = ( s + δ, s N , F − θ ⊥ + ε ( q ) + θ + a ∗ ( N + 1))] . By Lemma B.5, F Qθ ( θ | S N = ( s + δ, s N )) is a C function of ( θ, δ ) satisfying ∂∂δ F Qθ ( θ | S N = ( s + δ, s N )) < F Qθ ( θ | S N = ( s + δ, s N )) − q (cid:48) is a C functionof ( q (cid:48) , δ, δ ) , with Jacobian f Qθ ( θ | S N = ( s + δ, s N )) wrt θ. By Lemma O.3 this Jacobianis strictly positive everywhere. 
Then by the implicit function theorem there exists a C quantile function φ ( q (cid:48) , δ ) satisfying F Qθ ( φ ( q (cid:48) , δ ) | S N = ( s + δ, s N )) = q (cid:48) for all ( q (cid:48) , δ ) and ∂φ∂δ ( q (cid:48) , δ ) = − (cid:34) f Qθ ( θ | S N = ( s + δ, s N )) ∂∂δ F Qθ ( θ | S N = ( s + δ, s N )) (cid:35) θ = φ ( q (cid:48) ,δ ) > . ψ ( δ , δ , s N ) may be written ψ ( δ , δ , s N ) = (cid:90) dq (cid:48) (cid:90) dq ζ ( φ ( q (cid:48) , δ ) , δ , q ) . By Lemma B.8 ∂∂S N +1 E [ θ | S N +1 ] > ∂φ/∂δ > , it followsthat ζ ( φ ( q (cid:48) , ∆) , ∆ , q ) > ζ ( φ ( q (cid:48) , , ∆ , q )for all ( q, q (cid:48) ) and every ∆ > . Hence ( ψ (∆ , ∆ , s N ) − ψ (0 , ∆ , s N )) > > . This argument holds independent of the choice of s N . Thus Fatou’s lemma implies

M V ( N ) − M V ( N + 1) ≥ (cid:90) dG M N ( S N ) lim inf ∆ ↓

1∆ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) . A further application of Fatou’s lemma yieldslim inf ∆ ↓

1∆ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) ≥ (cid:90) dq (cid:48) (cid:90) dq lim ∆ ↓

1∆ ( ζ ( φ ( q (cid:48) , ∆) , ∆ , q ) − ζ ( φ ( q (cid:48) , , ∆ , q )) . Recall that by Assumption 3, ∂∂S i E [ θ | S N +1 ] exists and is continuously diﬀerentiablein S N +1 for every i. Thus E [ θ | S N +1 ] is totally diﬀerentiable wrt S N +1 everywhere. Sowrite the integrand of the previous expression for lim inf ∆ ↓ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N ))as 1∆ ( ζ ( φ ( q (cid:48) , ∆) , ∆ , q ) − ζ ( φ ( q (cid:48) , , ∆ , q ))= 1∆ ( ζ ( φ ( q (cid:48) , ∆) , ∆ , q ) − ζ ( φ ( q (cid:48) , , , q )) −

1∆ ( ζ ( φ ( q (cid:48) , , ∆ , q, q (cid:48) ) − ζ ( φ ( q (cid:48) , , , q )) . Taking ∆ ↓ ∆ ↓

1∆ ( ζ ( φ ( q (cid:48) , ∆) , ∆ , q ) − ζ ( φ ( q (cid:48) , , ∆ , q ))= ∂ζ∂θ ( φ ( q (cid:48) , , , q ) ∂φ∂δ ( q (cid:48) ,

0) + ∂ζ∂δ ( φ ( q (cid:48) , , , q ) − ∂ζ∂δ ( φ ( q (cid:48) , , , q )= ∂ζ∂θ ( φ ( q (cid:48) , , , q ) ∂φ∂δ ( q (cid:48) , . The fact that ∂∂S N +1 [ θ | S N +1 ] > ∂ζ∂θ ( φ ( q (cid:48) , , , q ) >

0, and as previouslynoted ∂φ/∂δ >

0. It follows that this limit is strictly positive. Thuslim inf ∆ ↓

1∆ ( ψ (∆ , ∆ , S N ) − ψ (0 , ∆ , S N )) > M V ( N ) > M V ( N + 1) . ∂∂S N +1 E [ θ | S N +1 ] < ∆ ↓

1∆ ( ζ ( φ ( q (cid:48) , , ∆ , q ) − ζ ( φ ( q (cid:48) , ∆) , ∆ , q )) > ψ (0 , ∆ , s N ) − ψ (∆ , ∆ , s N )) > M V ( N + 1) − M V ( N ) = lim ∆ ↓ (cid:90) dG M N ( S N ) 1∆ ( ψ (0 , ∆ , S N − ψ (∆ , ∆ , S N )) ≥ (cid:90) dG M N ( S N ) lim inf ∆ ↓

1∆ ( ψ (0 , ∆ , S N − ψ (∆ , ∆ , S N )) > , or M V ( N ) < M V ( N + 1) . C.2.2 The N → ∞ limit Consider a limiting model in which the principal observes a countably inﬁnite vector ofoutcomes S = ( S , S , ... ). By the law of large numbers, in the quality linkage model thismeans that the principal perfectly infers θ , while in the circumstance linkage model theprincipal perfectly infers ε. Deﬁne µ (∆; α ) analogously to the ﬁnite-population case. In eachmodel,reasoning very similar to the proof of Lemma C.1 implies that µ (cid:48) (0 , α ) exists, is inde-pendent of α, and lies in [0 , . So there exists a unique, ﬁnite a ∗ ( ∞ ) satisfying µ (cid:48) (0; a ∗ ( ∞ )) = C (cid:48) ( a ∗ ( ∞ )) . Deﬁne µ ∞ (∆) ≡ µ (∆; a ∗ ( ∞ )) and M V ( ∞ ) ≡ µ (cid:48)∞ (0) in each model. Lemma C.3establishes that 0 < M V ( ∞ ) < . We will show that lim N →∞ M V ( N ) = M V ( ∞ ) . LemmaC.3 establishes that this result implies 0 < lim N →∞ M V ( N ) < . To prove the result, we will need the ability to change measure between the distributionof outcomes at the equilibrium action proﬁle, and one in which a single agent, without lossagent 1, deviates to a diﬀerent action. For each model, deﬁne a reference probability space(Ω , F , P a ) , containing all relevant random variables for arbitrary segment sizes. For thequality linkage model this space supports the latent types θ, θ ⊥ , θ ⊥ , ... and shocks ε , ε , ... as well as the outcomes S , S , ... Similarly, in the circumstance linkage model the spacesupports the latent types θ , θ , ... , shocks ε, ε ⊥ , ε ⊥ , ..., and outcomes S , S , ... In each modelthe probability measure P a depends on the vector of agent actions a = ( a , a , ... ), as thedistributions of the outcomes depend on the actions.We will use F ∞ to denote the σ -algebra generated by the full vector of outcomes S , S , ... Note that by the LLN all latent types may be taken to be measurable with respect to F ∞ . 
For each $N$, we will let $P^*_N$ denote the restriction of the measure $P^{\mathbf{a}^*(N)}$ to $(\Omega, \mathcal{F}_\infty)$, and similarly let $P_{\Delta,N}$ denote the restriction of the measure $P^{(a^*(N)+\Delta,\, \mathbf{a}^*(N))}$ to $(\Omega, \mathcal{F}_\infty)$. These measures represent the distributions over outcomes induced when all agents take actions $a^*(N)$ and when agent 1 deviates to action $a^*(N)+\Delta$, respectively.

Lemma C.4.

The Radon-Nikodym derivative for the change of measure from $(\Omega, \mathcal{F}_\infty, P^*_N)$ to $(\Omega, \mathcal{F}_\infty, P_{\Delta,N})$ is
$$\frac{dP_{\Delta,N}}{dP^*_N} = \frac{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)}{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N))}$$
in the quality linkage model and
$$\frac{dP_{\Delta,N}}{dP^*_N} = \frac{g^C_1(S_1 \mid \bar\varepsilon;\, a_1 = a^*(N)+\Delta)}{g^C_1(S_1 \mid \bar\varepsilon;\, a_1 = a^*(N))}$$
in the circumstance linkage model.

Proof. For convenience we suppress the dependence of distributions on all actions other than $a_1$ in this proof. We derive the derivative for the quality linkage model, with the expression for the circumstance linkage model following from nearly identical work. Fix any $\mathcal{F}_\infty$-measurable random variable $X$. Then there exists a measurable function $x : \mathbb{R}^\infty \to \mathbb{R}$ such that $X = x(\mathbf{S})$ a.s. Thus
$$E[X \mid a_1 = a^*(N)+\Delta] = \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)\, dG^Q_{-1}(\mathbf{S}_{-1} \mid \bar\theta, S_1;\, a_1 = a^*(N)+\Delta)\, x(\mathbf{S}).$$
As $\mathbf{S}_{-1}$ is independent of $S_1$ conditional on $\bar\theta$ in the quality linkage model, $G^Q_{-1}(\mathbf{S}_{-1} \mid \bar\theta, S_1;\, a_1 = a^*(N)+\Delta) = G^Q_{-1}(\mathbf{S}_{-1} \mid \bar\theta)$. So this expression may be equivalently written
$$\begin{aligned}
E[X \mid a_1 = a^*(N)+\Delta] &= \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)\, dG^Q_{-1}(\mathbf{S}_{-1} \mid \bar\theta)\, x(\mathbf{S}) \\
&= \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N))\, dG^Q_{-1}(\mathbf{S}_{-1} \mid \bar\theta)\, \frac{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)}{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N))}\, x(\mathbf{S}) \\
&= E\left[\frac{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)}{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N))}\, X \,\middle|\, a_1 = a^*(N)\right].
\end{aligned}$$

As this argument holds for arbitrary $\mathcal{F}_\infty$-measurable $X$, it must be that
$$\frac{dP_{\Delta,N}}{dP^*_N} = \frac{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)}{g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N))}.$$

To establish the desired limiting result, we will prove that for any $\Delta$ and $N$,
$$|\mu_N(\Delta) - \mu_\infty(\Delta)| \le \kappa_N(\Delta)\, \frac{\beta}{\sqrt{N}},$$
where
$$\kappa_N(\Delta) \equiv \left(E\left[\left(\frac{dP_{\Delta,N}}{dP^*_N} - 1\right)^2 \,\middle|\, \mathbf{a} = \mathbf{a}^*(N)\right]\right)^{1/2}$$
and $\beta$ is a finite constant independent of $N$ and $\Delta$ whose value depends on the model. The following lemma establishes several important properties of $\kappa_N$.

Lemma C.5. $\kappa_N(\Delta)$ is independent of $N$, $\kappa_N(0) = 0$, and $\kappa'_{N,+}(0) = \limsup_{\Delta \downarrow 0} \kappa_N(\Delta)/\Delta < \infty$.

Proof. We prove the lemma for the quality linkage model, with nearly identical work establishing the result for the circumstance linkage model. Note that when $\Delta = 0$, $dP_{\Delta,N}/dP^*_N = 1$, and so trivially $\kappa_N(0) = 0$. To see that $\kappa_N(\Delta)$ is independent of $N$, note that the distribution of each outcome satisfies the translation invariance property $G^Q_i(S_i = s_i \mid \bar\theta;\, a_i = \alpha) = G^Q_i(S_i = s_i - \alpha \mid \bar\theta;\, a_i = 0)$ for any $s_i$ and $\alpha$. So $\kappa_N(\Delta)$ may be written
$$\begin{aligned}
\kappa_N(\Delta)^2 &= \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 = s_1 \mid \bar\theta;\, a_1 = a^*(N)) \left(\frac{g^Q_1(S_1 = s_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta)}{g^Q_1(S_1 = s_1 \mid \bar\theta;\, a_1 = a^*(N))} - 1\right)^2 \\
&= \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 = s_1 - a^*(N) \mid \bar\theta;\, a_1 = 0) \left(\frac{g^Q_1(S_1 = s_1 - a^*(N) \mid \bar\theta;\, a_1 = \Delta)}{g^Q_1(S_1 = s_1 - a^*(N) \mid \bar\theta;\, a_1 = 0)} - 1\right)^2.
\end{aligned}$$
So perform a change of variables to $s_1' \equiv s_1 - a^*(N)$ to obtain the representation
$$\kappa_N(\Delta)^2 = \int dF_{\bar\theta}(\bar\theta)\, dG^Q_1(S_1 = s_1' \mid \bar\theta;\, a_1 = 0) \left(\frac{g^Q_1(S_1 = s_1' \mid \bar\theta;\, a_1 = \Delta)}{g^Q_1(S_1 = s_1' \mid \bar\theta;\, a_1 = 0)} - 1\right)^2,$$
which is independent of $N$, as desired.

Now, let $\xi \equiv \theta_1^\perp + \varepsilon_1$. Let $f_\xi$ be the convolution of $f_{\theta^\perp}$ and $f_\varepsilon$. Then for any $\Delta$, $g^Q_1(S_1 \mid \bar\theta;\, a_1 = a^*(N)+\Delta) = f_\xi(S_1 - \bar\theta - a^*(N) - \Delta) = f_\xi(\xi - \Delta)$ under the measure $P^*_N$.
Hence
$$\kappa_N(\Delta) = \left(E\left[\left(\frac{dP_{\Delta,N}}{dP^*_N} - 1\right)^2 \,\middle|\, \mathbf{a} = \mathbf{a}^*(N)\right]\right)^{1/2} = \left(E\left[\left(\frac{f_\xi(\xi-\Delta)}{f_\xi(\xi)} - 1\right)^2 \,\middle|\, \mathbf{a} = \mathbf{a}^*(N)\right]\right)^{1/2} = \left(\int dF_\xi(\xi) \left(\frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2\right)^{1/2}.$$
We must therefore show that the limit
$$\limsup_{\Delta \downarrow 0} \frac{\kappa(\Delta)}{\Delta} = \limsup_{\Delta \downarrow 0} \left(\int dF_\xi(\xi)\, \frac{1}{\Delta^2} \left(\frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2\right)^{1/2} = \left(\limsup_{\Delta \downarrow 0} \int dF_\xi(\xi) \left(\frac{1}{\Delta}\, \frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2\right)^{1/2}$$
exists and is finite. By Assumption 4, for $\Delta$ sufficiently close to 0 there exists a non-negative, integrable function $J(\cdot)$ such that
$$\left(\frac{1}{\Delta}\, \frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2 \le J(\xi)$$
for all $\xi$. Then by reverse Fatou's lemma,
$$\limsup_{\Delta \downarrow 0} \int dF_\xi(\xi) \left(\frac{1}{\Delta}\, \frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2 \le \int dF_\xi(\xi)\, \limsup_{\Delta \downarrow 0} \left(\frac{1}{\Delta}\, \frac{f_\xi(\xi-\Delta) - f_\xi(\xi)}{f_\xi(\xi)}\right)^2 \le \int dF_\xi(\xi)\, J(\xi) < \infty,$$
as desired.

The bound on $|\mu_N(\Delta) - \mu_\infty(\Delta)|$ just claimed implies the desired result because for $\Delta > 0$,
$$\left|\frac{\mu_N(\Delta) - \mu}{\Delta} - \frac{\mu_\infty(\Delta) - \mu}{\Delta}\right| \le \frac{\kappa_N(\Delta) - \kappa_N(0)}{\Delta}\, \frac{\beta}{\sqrt{N}},$$
and thus by taking $\Delta \downarrow 0$,
$$|\mu'_N(0) - \mu'_\infty(0)| \le \kappa'_{N,+}(0)\, \frac{\beta}{\sqrt{N}}$$
must hold. Then as $\kappa'_{N,+}(0)$ is finite and independent of $N$, $\mu'_N(0) \to \mu'_\infty(0)$ as $N \to \infty$, as desired.

We now derive the claimed bound. To streamline notation, we will write $E^*_N$ to represent expectations conditioning on $\mathbf{a} = \mathbf{a}^*(N)$, and $E_{\Delta,N}$ to represent expectations conditioning on $a_1 = a^*(N)+\Delta$ with all other agents taking action $a^*(N)$. Note first that the expected value of the principal's posterior estimate of $\theta_1$ is a function only of the size of agent 1's distortion $\Delta$, but not of the equilibrium action inference.
Thus
$$\mu_\infty(\Delta) = E\big[E[\theta_1 \mid \mathbf{S};\, \mathbf{a} = \mathbf{a}^*(\infty)] \,\big|\, \mathbf{a} = (a^*(\infty)+\Delta,\, \mathbf{a}^*(\infty))\big] = E\big[E[\theta_1 \mid \mathbf{S};\, \mathbf{a} = \mathbf{a}^*(N)] \,\big|\, \mathbf{a} = (a^*(N)+\Delta,\, \mathbf{a}^*(N))\big] = E_{\Delta,N}\big[E^*_N[\theta_1 \mid \mathbf{S}]\big].$$
So we may write
$$\mu_N(\Delta) - \mu_\infty(\Delta) = E_{\Delta,N}\big[E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big].$$
Now, performing a change of measure,
$$\begin{aligned}
E_{\Delta,N}\big[E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big] &= E^*_N\left[\frac{dP_{\Delta,N}}{dP^*_N}\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)\right] \\
&= E^*_N\left[\left(\frac{dP_{\Delta,N}}{dP^*_N} - 1\right)\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)\right] + E^*_N\big[E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big] \\
&= E^*_N\left[\left(\frac{dP_{\Delta,N}}{dP^*_N} - 1\right)\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)\right],
\end{aligned}$$
with the last line following by the law of iterated expectations. Then by an application of the Cauchy-Schwarz inequality,
$$|\mu_N(\Delta) - \mu_\infty(\Delta)| \le \kappa_N(\Delta) \left(E^*_N\Big[\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)^2\Big]\right)^{1/2}.$$
We will bound the right-hand side for the quality linkage model, with the result for the circumstance linkage model following by nearly identical work.

Define the family of random variables $\hat\theta_N(z) \equiv E^*_N[\theta_1 \mid S_1, \bar\theta = z]$ for $z \in \mathbb{R}$. Note that $\hat\theta_N(\bar\theta) = E^*_N[\theta_1 \mid \mathbf{S}]$, as $\mathbf{S}$ allows the principal to perfectly infer $\bar\theta$, and $\theta_1$ is independent of the vector of outcomes $\mathbf{S}_{-1}$ conditional on $\bar\theta$. Further, $E^*_N[\theta_1 \mid \mathbf{S}_N] = E^*_N\big[E^*_N[\theta_1 \mid \mathbf{S}] \,\big|\, \mathbf{S}_N\big]$ is the mean-square minimizing estimator of $\hat\theta_N(\bar\theta)$ conditional on the performance vector $\mathbf{S}_N$. Another estimator of $\hat\theta_N(\bar\theta)$ is $\hat\theta_N(\tilde\theta_N)$, where
$$\tilde\theta_N \equiv \frac{1}{N} \sum_{i=1}^N (S_i - \mu^\perp), \qquad \mu^\perp = E[\theta_i^\perp].$$
So
$$E^*_N\Big[\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)^2\Big] \le E^*_N\Big[\big(\hat\theta_N(\tilde\theta_N) - E^*_N[\theta_1 \mid \mathbf{S}]\big)^2\Big].$$
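The estimator $\tilde\theta_N$ is a sample mean, so its error variance falls like $1/N$; this is what produces the $\sqrt{N}$ rate in the bound. The sketch below checks that rate by simulation. It assumes Gaussian noise and works directly with the centred error terms $\theta_i^\perp - \mu^\perp$ and $\varepsilon_i$; both choices, as well as the function names and parameter values, are assumptions of this illustration and not part of the paper's argument.

```python
import random
import statistics


def simulated_error_variance(n_agents, sd_theta_perp, sd_eps,
                             n_draws=20000, seed=0):
    """Monte Carlo variance of theta_tilde_N - theta_bar.

    The error is the average over agents of (theta_perp_i - mu_perp) + eps_i,
    drawn here as independent mean-zero Gaussians (a sketch assumption).
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_draws):
        total = sum(rng.gauss(0.0, sd_theta_perp) + rng.gauss(0.0, sd_eps)
                    for _ in range(n_agents))
        errors.append(total / n_agents)
    return statistics.pvariance(errors)


def theoretical_error_variance(n_agents, sd_theta_perp, sd_eps):
    """(sigma_theta_perp^2 + sigma_eps^2) / N for independent components."""
    return (sd_theta_perp ** 2 + sd_eps ** 2) / n_agents


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n,
              simulated_error_variance(n, 1.0, 0.5),
              theoretical_error_variance(n, 1.0, 0.5))
```

Doubling the segment size should roughly halve both columns, consistent with a root-mean-square error proportional to $1/\sqrt{N}$.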
Given that shifts in $\bar\theta$ affect the outcome $S_1$ additively, $E^*_N[\theta_1 \mid S_1 = s_1, \bar\theta = z] = E^*_N[\theta_1 \mid S_1 = s_1 - z, \bar\theta = 0]$ for every $s_1$ and $z$. The proof of Lemma C.3 establishes that $E^*_N[\theta_1 \mid S_1, \bar\theta]$ is differentiable with respect to $S_1$, with a derivative lying in $(0,1)$ everywhere. Hence $\hat\theta_N(z)$ is differentiable and $\hat\theta'_N(z) \in (-1, 0)$ for all $z$. Thus by the fundamental theorem of calculus,
$$\big|\hat\theta_N(\tilde\theta_N) - \hat\theta_N(\bar\theta)\big| = \left|\int_{\bar\theta}^{\tilde\theta_N} \hat\theta'_N(z)\, dz\right| \le \left|\int_{\bar\theta}^{\tilde\theta_N} |\hat\theta'_N(z)|\, dz\right| \le |\tilde\theta_N - \bar\theta|.$$
Further note that
$$\tilde\theta_N - \bar\theta = \frac{1}{N} \sum_{i=1}^N (\theta_i^\perp - \mu^\perp + \varepsilon_i),$$
which has mean 0 and variance $(\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)/N$ given that $\theta_i^\perp$ and $\varepsilon_i$ are independent. So
$$E^*_N\Big[\big(E^*_N[\theta_1 \mid \mathbf{S}_N] - E^*_N[\theta_1 \mid \mathbf{S}]\big)^2\Big] \le \frac{\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon}{N},$$
implying the desired bound with $\beta = \sqrt{\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon}$.

D Proofs for Section 4 (Main Results)

D.1 Proofs of Theorems 1 and 2

Opt-In Equilibrium.

In any pure-strategy equilibrium in which all agents opt-in, the equilibrium effort level $a^*$ must satisfy two conditions:
$$MV(N) = C'(a^*) \tag{D.1}$$
$$R + \mu - C(a^*) \ge 0 \tag{D.2}$$
The marginal value $MV(N)$ is independent of $a^*$, and $C'$ is strictly monotone. Thus (D.1) pins down a unique effort level $a^* = C'^{-1}(MV(N))$. Since $C$ is everywhere increasing, the conditions in (D.1) and (D.2) can be simultaneously satisfied if and only if $0 \le C'^{-1}[MV(N)] \le a^{**} \equiv C^{-1}(R+\mu)$, or equivalently,
$$0 = C'(0) \le MV(N) \le C'(a^{**}),$$
noting that $C'^{-1}$ is everywhere increasing.

By Assumption 7, $R + \mu > C(a^*(1))$. Since the cost function $C$ has positive first and second derivatives, $R + \mu > C(a^*(1))$ and $R + \mu = C(a^{**})$ imply that $a^*(1) < a^{**}$, which further implies $C'(a^*(1)) < C'(a^{**})$. By Lemma 1, $MV(1) = MV_Q(1) \ge MV_Q(N)$. Thus
$$MV_Q(N) \le MV_Q(1) = C'(a^*(1)) \le C'(a^{**}),$$
and a symmetric all opt-in equilibrium exists in the quality linkage model. In contrast, in the circumstance linkage model,
$$MV_C(N) \ge MV_C(1) = C'(a^*(1)) \tag{D.3}$$
so the inequality $MV_C(N) \le C'(a^{**})$ is not guaranteed to hold. An opt-in equilibrium exists if and only if $N$ is sufficiently small; specifically, $N \le N^*$ where
$$N^* \equiv \sup\{N : MV_C(N) \le C'(a^{**})\}.$$
(It is possible that $N^*$ is infinite if $MV_C(N) \le C'(a^{**})$ for all $N$.)

Finally, for the parameters $N \le N^*$ where an opt-in equilibrium exists in both models, it is possible to rank equilibrium effort levels as follows. Define $a^*_C$ and $a^*_Q$ to be the respective equilibrium effort levels. Then, since $MV_C(N) \ge MV(1) \ge MV_Q(N)$ for all $N$,
$$a^*_C = C'^{-1}(MV_C(N)) \ge C'^{-1}(MV_Q(N)) = a^*_Q,$$
so equilibrium effort is higher in the circumstance linkage model.

Opt-Out Equilibrium.

Under the imposed refinement on the principal's off-equilibrium belief about the agent's action, the optimal action conditional on entry is $a^*(1)$. Thus in an all opt-out equilibrium, the equilibrium action $a^*$ must satisfy
$$R + \mu - C(a^*(1)) < 0. \tag{D.4}$$

Mixed Equilibrium.

For any probability $p \in [0,1]$ and $M \in \{Q, C\}$, let
$$MV_M(p, N) = E\big[MV_M(\tilde N + 1) \,\big|\, \tilde N \sim \text{Binomial}(N-1, p)\big]$$
be the expected marginal impact for agent $i$ of exerting additional effort beyond the principal's expectation, when agent $i$ opts-in and all other agents opt-in with independent probability $p$. Note that because $MV_C(N)$ is increasing in $N$, and increasing $p$ shifts up the distribution of $\tilde N$ in the FOSD sense, $MV_C(p, N)$ is increasing in $p$. Further, because increasing $p$ shifts $\Pr(\tilde N \le n)$ strictly downward for every $n < N-1$, this monotonicity is strict whenever $MV_C(n)$ is not constant over the range $\{1, \ldots, N\}$. For the same reasons, $MV_C(p, N)$ is increasing in $N$ for fixed $p$, and strictly increasing whenever $p \in (0,1)$ and $MV_C(n)$ is not constant over $\{1, \ldots, N\}$.

In a mixed equilibrium, the equilibrium effort level $a^*$ and probability $p$ assigned to opting-in must jointly satisfy
$$R + \mu - C(a^*) = 0 \tag{D.5}$$
$$MV(p, N) = C'(a^*). \tag{D.6}$$
The expression in (D.5) pins down the equilibrium action, which is identical to the action defined as $a^{**}$ above. Moreover, $C'(a^{**})$ is independent of both the mixing probability $p$ and the fixed segment size $N$. Therefore an equilibrium exists if and only if $MV(p, N) = C'(a^{**})$ for some $p \in [0,1]$.

In the quality linkage model, for every $p \in [0,1]$,
$$MV_Q(p, N) \le \max_{1 \le N' \le N} MV_Q(N') = MV_Q(1) = C'(a^*(1)) < C'(a^{**}),$$
using that $MV_Q$ is a decreasing function of $N$ (Lemma 1). Thus the quality linkage model does not admit a strictly mixed equilibrium.

Similarly, if $MV_C(N) < C'(a^{**})$, then
$$MV_C(p, N) \le \max_{1 \le N' \le N} MV_C(N') = MV_C(N) < C'(a^{**}),$$
since $MV_C$ is a strictly increasing function of $N$ (Lemma 1). So there does not exist a strictly mixed equilibrium in the circumstance linkage model either. Indeed, this is exactly the range for $N$ that supports the symmetric all opt-in equilibrium in the circumstance linkage model.

If however $MV_C(N) \ge C'(a^{**})$, then
$$MV_C(1) = MV_C(0, N) < C'(a^{**}) \le MV_C(1, N) = MV_C(N).$$
This implies in particular that $MV_C$ is not constant over the range $\{1, \ldots, N\}$, so that $MV_C(p, N)$ is strictly increasing in $p$. Since $MV_C(p, N)$ is also continuous in $p$, the intermediate value theorem yields existence of a unique $p^*(N) \in (0,1]$ satisfying $MV_C(p^*(N), N) = C'(a^{**})$. If $N \le N^*$, i.e. $MV_C(N) = C'(a^{**})$, then it must be that $p^*(N) = 1$. Thus in particular the opt-in equilibrium is unique whenever it exists. Otherwise $p^*(N) < 1$, in which case the fact that $MV_C(p, N)$ is strictly increasing in $N$ for fixed $p \in (0,1)$ further implies that $p^*(N)$ must be strictly decreasing in $N$. Finally, the effort level $a^{**}$ chosen in this equilibrium weakly exceeds the effort level $a^*_C$ chosen in the symmetric opt-in equilibrium in the circumstance linkage model, since $R + \mu \ge C(a^*_C)$ by (D.2), while $R + \mu = C(a^{**})$ by (D.5).

D.2 Proof of Lemma 2

Comparisons between equilibrium actions correspond directly to comparisons of marginal values of effort. It is therefore sufficient to establish that $MV(N) < 1$ for all $N$, and that $MV_Q(N)$ is decreasing while $MV_C(N)$ is increasing in $N$, with $\lim_{N\to\infty} MV_C(N) < 1$. These facts in particular imply that $MV_Q(N) \le MV(1) \le MV_C(N)$, with $MV(1)$ dictating equilibrium effort in the no-data linkages benchmark. Lemma 1 establishes the desired monotonicity of the marginal value of effort, while the upper bound on $MV$ and the limiting value of $MV_C$ are established in Appendix C.

D.3 Proof of Proposition 2

Suppose all agents in a segment of size $N$ enter and choose action $a$. Social welfare
$$W(1, a, N) = N \cdot (2\mu + a - C(a))$$
is strictly increasing on $a \in [0, a_{FB})$. Thus the comparison $a^*_Q(N) \le a_{NDL} < a_{FB}$ immediately implies that for all $N$, welfare is ranked
$$W_Q(N) \le W_{NDL}(N),$$
where the inequality is strict for all $N > 1$. When the segment size $N < N^*$, the equilibrium action in the circumstance linkage model satisfies $a^*_C(N) \in [a_{NDL}, a_{FB})$ (Theorem 2), so the same argument implies
$$W_{NDL}(N) \le W_C(N),$$
with the inequality strict for $N > 1$. When the segment size $N > N^*$,
$$W_C(N) = N \cdot p(N) \cdot [a^{**} - C(a^{**}) + 2\mu].$$
Since $p(N) \to 0$ as $N \to \infty$, it follows that for $N$ sufficiently large, $W_C(N), W_Q(N) < W_{NDL}(N)$.

E Proofs for Section 6 (Data Sharing, Markets, and Consumer Welfare)

E.1 Proof of Proposition 3

Deﬁnition E.1.

The competitive transfer for a segment of $n$ consumers is
$$R^*(n) = a^*(n) + \mu \tag{E.1}$$
while the monopolist transfer is
$$R^*_M(n) = C[a^*(n)] - \mu \tag{E.2}$$
We first show that in any equilibrium under data sharing, consumers must receive all of the generated surplus.

Lemma E.1.

Consider either the quality linkage or circumstance linkage model. In any equilibrium under data sharing, firms receive zero payoffs, and consumer welfare is $N \times (2\mu + a^*(N) - C(a^*(N)))$.

Proof.

Fix any subset of firms $F$ where $|F| \ge 2$. Suppose each firm $f \in F$ sets the competitive transfer $R^*(N)$ (as defined in (E.1)), while each firm $f \notin F$ chooses a transfer weakly below $R^*(N)$. Consumers opt-out if no firm offers a transfer above $R^*(N)$. Otherwise, consumers participate with the firm offering the highest transfer, and exert effort $a^*(N)$.

We now show that this is an equilibrium. By choosing the transfer $R^*(N)$, firms $f \in F$ receive a payoff of $-R^*(N) + \mu + a^*(N) = 0$ per consumer. They cannot profitably deviate, since reducing their transfer would lose all of their consumers, while increasing their transfer would result in a negative payoff. Firms $f \notin F$ acquire consumers only by setting a transfer strictly above $R^*(N)$, which leads to a negative payoff. So there are no profitable deviations for firms. Consumers also have no profitable deviations: participation with any firm in $F$ leads to the same (strictly positive) payoff, while participation with any firm $f \notin F$ involves the same equilibrium effort but a lower transfer. So the described strategies constitute an equilibrium.

Moreover these equilibria are the only equilibria under data sharing. Suppose towards contradiction that some firm $f$ receiving consumers sets $R_f < R^*(N)$. If another firm $f'$ offers a transfer $R_{f'} \in (R_f, R^*(N)]$, then consumers participating with firm $f$ can profitably deviate to participating with firm $f'$. If no firms $f'$ offer transfers in the interval $(R_f, R^*(N)]$, then firm $f$ can profitably deviate by raising its transfer. So transfers below $R^*(N)$ are ruled out for firms receiving consumers. If any firm receiving consumers sets a transfer exceeding $R^*(N)$ (which yields a negative payoff), then that firm can strictly profit by deviating to $R^*(N)$ (which yields a payoff of zero). So transfers above $R^*(N)$ are ruled out as well. In equilibrium, it must therefore be that all firms that receive consumers set transfer $R^*(N)$. Firms not receiving consumers must set transfers weakly below $R^*(N)$, or consumers could profitably deviate to one of these firms. Finally, since in equilibrium agents know the number of agents participating with their firm, uniqueness of agent effort follows from arguments given already in the proofs for exogenous transfers (see Section 4).

We show next that (E.1) is an upper bound on achievable consumer welfare under proprietary data in the circumstance linkage setting. Consider any equilibrium, and let $N_f$ be the number of agents participating with firm $f$ in that equilibrium. We can obtain an upper bound on consumer welfare by evaluating consumer payoffs supposing that all firms set the competitive transfer. Then, each agent interacting with firm $f$ achieves a payoff of $R^*(N_f) + \mu - C(a^*_C(N_f))$. But
$$R^*(N_f) + \mu - C(a^*_C(N_f)) = a^*_C(N_f) + 2\mu - C[a^*_C(N_f)] \le a^*_C(N) + 2\mu - C(a^*_C(N)),$$
since the function $\xi(n) = a^*_C(n) - C(a^*_C(n))$ is increasing, and $N \ge N_f$. Thus consumer welfare is bounded above by $N \times (a^*_C(N) + 2\mu - C(a^*_C(N)))$ as desired. Since this bound holds uniformly across all allocations of consumers to firms, welfare must be weakly higher under data sharing than in any equilibrium with proprietary data.

Now consider the quality linkage model. We first show that in equilibrium, all consumers must be served by a single firm.

Lemma E.2.

In the quality linkage model under proprietary data, in every equilibrium exactly one firm receives consumers.

Proof.

Suppose towards contradiction that there is an equilibrium in which two firms $f = 1, 2$ set transfers $R_f$ and receive $N_f > 0$ consumers. Agents participating with firm $f$ must choose the effort level $a^*_Q(N_f)$. Agents' IC constraints are described as follows. First,
$$R_1 + \mu - C(a^*_Q(N_1)) \ge R_2 + \mu - C(a^*_Q(N_2 + 1)),$$
or agents participating with firm 1 could profitably deviate to participating with firm 2. Likewise it must be that
$$R_2 + \mu - C(a^*_Q(N_2)) \ge R_1 + \mu - C(a^*_Q(N_1 + 1)),$$
or agents participating with firm 2 could profitably deviate to participating with firm 1. These displays simplify to
$$R_1 - R_2 \ge C(a^*_Q(N_1)) - C(a^*_Q(N_2 + 1)),$$
$$R_2 - R_1 \ge C(a^*_Q(N_2)) - C(a^*_Q(N_1 + 1)).$$
Summing these inequalities, we have
$$0 \ge C(a^*_Q(N_1)) + C(a^*_Q(N_2)) - C(a^*_Q(N_1 + 1)) - C(a^*_Q(N_2 + 1)).$$
But $C(a^*_Q(n))$ is strictly decreasing in $n$, so the right-hand side of the above display must be strictly positive, leading to a contradiction.

Now suppose no firms receive consumers in equilibrium. If there exists a firm offering a transfer $R > R^*(1)$, then it is strictly optimal for a consumer to deviate to interaction with that firm at effort $a^*(1)$. Otherwise, it is strictly optimal for a firm to deviate to any transfer $R \in (R^*_M(1), R^*(1))$ and receive consumers.

The lemma says that only one firm receives a strictly positive number of consumers in equilibrium; without loss, let this be firm 1. Consumer welfare is maximized when firm 1 sets the competitive transfer $R^*(N)$, in which case consumers receive (E.1), so consumer welfare under proprietary data must be weakly lower than under data sharing, completing our proof.

O For Online Publication

O.1 Distributional Regularity Results

To establish our main results we rely heavily on boundedness and smoothness of various likelihood and posterior distribution functions. In this section we prove a number of technical lemmas ensuring sufficient smoothness of functions invoked in proofs elsewhere.

We first prove a general result showing that log-concave density functions are necessarily bounded.

Lemma O.1.

Let $f : \mathbb{R} \to \mathbb{R}$ be any strictly positive, strictly log-concave function satisfying $\int_{-\infty}^{\infty} f(x)\, dx < \infty$. Then $f$ is bounded.

Proof. As $f$ is bounded below by 0, it suffices to show that it is bounded above. Since $\log f$ is strictly concave everywhere, it is either a strictly monotone function, or else has a global maximizer. Suppose that $\log f$ is strictly increasing everywhere. Then $f$ must be strictly increasing everywhere as well. But then as $f > 0$,
$$\int_{-\infty}^{\infty} f(x)\, dx \ge \int_{0}^{\infty} f(x)\, dx \ge \int_{0}^{\infty} f(0)\, dx = \infty,$$
a contradiction of our assumption. So $\log f$ cannot be strictly increasing everywhere. Suppose instead that $\log f$ is strictly decreasing everywhere. Then $f$ must be strictly decreasing everywhere as well. But then as $f > 0$,
$$\int_{-\infty}^{\infty} f(x)\, dx \ge \int_{-\infty}^{0} f(x)\, dx \ge \int_{-\infty}^{0} f(0)\, dx = \infty,$$
another contradiction. So $f$ must have a global maximizer, meaning that it is bounded above as desired.

Corollary O.1. $f_{\bar\theta}$, $f_\theta$, $f_{\theta^\perp}$, $f_{\bar\varepsilon}$, $f_\varepsilon$, $f_{\varepsilon^\perp}$ are each bounded.

The following lemma establishes a set of regularity conditions on a likelihood function sufficient to ensure that its associated posterior distribution function is $C^1$ in both its arguments. Note that these conditions amount to the regularity conditions imposed in SMLRP, plus a continuity condition on the density of the unobserved variable.

Lemma O.2.

Let $X$ and $Y$ be two random variables for which the density $g(y)$ for $Y$ and the conditional densities $f(x \mid y)$ for $X \mid Y$ exist. Suppose that:
• $f(x \mid y)$ is a $C^{1,0}$ function and $g(y)$ is continuous,
• $f(x \mid y)$ and $\frac{\partial}{\partial x} f(x \mid y)$ are both uniformly bounded for all $(x, y)$.
Then $H(x, y) \equiv \Pr(Y \le y \mid X = x)$ is a $C^1$ function of $(x, y)$.

Footnote (Lemma O.1): Since $\log f$ is strictly concave everywhere, it is continuous everywhere. Then so is $f$, meaning that $f$ is a measurable function.

Proof. Let $G$ be the distribution function for $Y$. By Bayes' rule,
$$H(x, y) = \frac{\int_{-\infty}^{y} f(x \mid y')\, dG(y')}{\int_{-\infty}^{\infty} f(x \mid y'')\, dG(y'')}.$$
We first establish continuity of this function. It is sufficient to establish continuity of the numerator and denominator separately. As for the denominator, $f(x \mid y'')$ is continuous in $x$ and uniformly bounded for all $(x, y'')$, so by the dominated convergence theorem the denominator is continuous in $x$, thus also in $(x, y)$ given its independence of $y$. As for the numerator, write
$$\int_{-\infty}^{y} f(x \mid y')\, dG(y') = \int_{-\infty}^{\infty} \mathbf{1}\{y' \le y\}\, f(x \mid y')\, dG(y').$$
Consider any sequence converging to $(x_0, y_0)$. Given the continuity of $f(x \mid y)$, the integrand converges pointwise $G$-a.e. to $\mathbf{1}\{y' \le y_0\}\, f(x_0 \mid y')$. (The only point of potential nonconvergence is at $y' = y_0$, but since $Y$ is a continuous distribution this point is assigned measure zero under $G$.) As the integrand is also uniformly bounded above for all $(x, y, y')$, the dominated convergence theorem ensures that the numerator is continuous in $(x, y)$.
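For intuition, the Bayes' rule expression for $H(x, y)$ can be evaluated directly in a simple case. The sketch below assumes $Y \sim N(0,1)$ and $X \mid Y = y \sim N(y, 1)$, a Gaussian example chosen purely for illustration (it is not used in the proof, and the function names are hypothetical). It compares a midpoint-rule evaluation of the ratio of integrals with the known closed form $Y \mid X = x \sim N(x/2, 1/2)$.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_cdf_numeric(x, y, lo=-12.0, hi=12.0, steps=24000):
    """H(x, y) as a ratio of integrals of f(x | y') g(y'), by the midpoint rule.

    The infinite integration range is truncated to [lo, hi], where the
    Gaussian integrand is negligible outside (a numerical approximation).
    """
    h = (hi - lo) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        y_prime = lo + (k + 0.5) * h
        weight = normal_pdf(x, mean=y_prime) * normal_pdf(y_prime)
        denominator += weight
        if y_prime <= y:  # indicator 1{y' <= y}
            numerator += weight
    return numerator / denominator


def posterior_cdf_closed_form(x, y):
    """Pr(Y <= y | X = x) when Y | X = x is N(x/2, 1/2)."""
    return normal_cdf((y - x / 2.0) / math.sqrt(0.5))


if __name__ == "__main__":
    for x, y in ((0.0, 0.3), (1.0, 0.3), (-2.0, -1.0)):
        print(x, y, posterior_cdf_numeric(x, y), posterior_cdf_closed_form(x, y))
```

In this example $H$ varies smoothly in both arguments and is decreasing in $x$, since a higher signal shifts the posterior on $Y$ upward.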
Next, note that $\partial H/\partial y$ exists and is given by
$$\frac{\partial H}{\partial y}(x, y) = \frac{f(x \mid y)\, g(y)}{\int_{-\infty}^{\infty} f(x \mid y'')\, dG(y'')},$$
which is continuous everywhere given that the denominator is continuous by the argument of the previous paragraph while $f(x \mid y)$ and $g(y)$ are continuous by assumption.

Finally, consider $\partial H/\partial x$. Let $\hat H(x, y) \equiv H(x, y)^{-1} - 1$. Then $\frac{\partial H}{\partial x}(x, y)$ exists and satisfies $\frac{\partial H}{\partial x}(x, y) < 0$ if and only if $\frac{\partial \hat H}{\partial x}(x, y)$ exists and satisfies $\frac{\partial \hat H}{\partial x}(x, y) > 0$. Note that $\hat H(x, y)$ may be written
$$\hat H(x, y) = \frac{\int_{y}^{\infty} f(x \mid y')\, dG(y')}{\int_{-\infty}^{y} f(x \mid y'')\, dG(y'')}.$$
Because $\frac{\partial}{\partial x} f(x \mid y)$ exists and is uniformly bounded for all $x$ and $y$, the Leibniz integral rule ensures that this expression is differentiable with respect to $x$ with derivative
$$\frac{\partial \hat H}{\partial x}(x, y) = \frac{\int_{y}^{\infty} \frac{\partial}{\partial x} f(x \mid y')\, dG(y')}{\int_{-\infty}^{y} f(x \mid y'')\, dG(y'')} - \frac{\left(\int_{y}^{\infty} f(x \mid y')\, dG(y')\right)\left(\int_{-\infty}^{y} \frac{\partial}{\partial x} f(x \mid y'')\, dG(y'')\right)}{\left(\int_{-\infty}^{y} f(x \mid y'')\, dG(y'')\right)^2}.$$
With some rearrangement, this may be equivalently written
$$\frac{\partial \hat H}{\partial x}(x, y) = \left(\int_{-\infty}^{y} f(x \mid y'')\, dG(y'')\right)^{-2} \times \int_{y}^{\infty} dG(y') \int_{-\infty}^{y} dG(y'') \left(f(x \mid y'')\, \frac{\partial}{\partial x} f(x \mid y') - f(x \mid y')\, \frac{\partial}{\partial x} f(x \mid y'')\right).$$
This function is continuous if both
$$\int_{-\infty}^{\infty} \mathbf{1}\{y'' \le y\}\, f(x \mid y'')\, dG(y'')$$
and
$$\int_{-\infty}^{\infty} dG(y') \int_{-\infty}^{\infty} dG(y'')\, \mathbf{1}\{y' \ge y\}\, \mathbf{1}\{y'' \le y\} \left(f(x \mid y'')\, \frac{\partial}{\partial x} f(x \mid y') - f(x \mid y')\, \frac{\partial}{\partial x} f(x \mid y'')\right)$$
are continuous. We have already seen that the former is continuous, so consider the latter expression. By assumption $f(x \mid y)$ and $\frac{\partial}{\partial x} f(x \mid y)$ are both continuous in $(x, y)$. Thus for any sequence converging to $(x_0, y_0)$, the integrand converges to
$$\mathbf{1}\{y' \ge y_0\}\, \mathbf{1}\{y'' \le y_0\} \left(f(x_0 \mid y'')\, \frac{\partial}{\partial x} f(x \mid y')\Big|_{x = x_0} - f(x_0 \mid y')\, \frac{\partial}{\partial x} f(x \mid y'')\Big|_{x = x_0}\right)$$
except possibly at points $(y', y'')$ such that $y' = y_0$ or $y'' = y_0$, a set which is assigned zero measure under $G \times G$ given the continuity of the distribution of $Y$. Further, since $f(x \mid y)$ and $\frac{\partial}{\partial x} f(x \mid y)$ are both uniformly bounded for all $(x, y)$, so is
$$f(x \mid y)\, \frac{\partial}{\partial x} f(x \mid y') - f(x \mid y')\, \frac{\partial}{\partial x} f(x \mid y)$$
for all $x, y, y'$. Then the dominated convergence theorem ensures that the entire expression converges to its value at $(x_0, y_0)$, as desired.

The next lemma establishes that the density functions of $\theta_i$ and $\varepsilon_i$ remain continuous when conditioned on a set of outcomes.

Lemma O.3.

For each model $M \in \{Q, C\}$, agent $i \in \{1, \ldots, N\}$, and outcome-action profile $(\mathbf{S}, \mathbf{a})$:
• The conditional densities $f^M_{\theta_i}(\theta_i \mid \mathbf{S}; \mathbf{a})$ and $f^M_{\theta_i}(\theta_i \mid \mathbf{S}_{-j}; \mathbf{a})$ for each $j \in \{1, \ldots, N\}$ are strictly positive and continuous in $\theta_i$ everywhere,
• The conditional densities $f^M_{\varepsilon_i}(\varepsilon_i \mid \mathbf{S}; \mathbf{a})$ and $f^M_{\varepsilon_i}(\varepsilon_i \mid \mathbf{S}_{-j}; \mathbf{a})$ for each $j \in \{1, \ldots, N\}$ are strictly positive and continuous in $\varepsilon_i$ everywhere.

Proof. Throughout the proof we suppress explicit dependence of distributions on the action profile $\mathbf{a}$. We prove the result for the quality linkage model, with the circumstance linkage model following by permuting the roles of $\theta_i$ and $\varepsilon_i$.

Consider first the density of $\theta_i$ conditional on $\mathbf{S}$. By Bayes' rule,
$$f^Q_{\theta_i}(\theta_i \mid \mathbf{S}) = \frac{g^Q_N(\mathbf{S} \mid \theta_i)\, f_\theta(\theta_i)}{g^Q_N(\mathbf{S})},$$
where
$$g^Q_N(\mathbf{S} \mid \theta_i) = g^Q_i(S_i \mid \theta_i) \int dF^Q_{\bar\theta}(\bar\theta \mid \theta_i) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta)$$
and
$$g^Q_N(\mathbf{S}) = \int dF_{\bar\theta}(\bar\theta) \prod_{j=1}^{N} g^Q_j(S_j \mid \bar\theta).$$
As $g^Q_N(\mathbf{S} \mid \theta_i)$, $g^Q_N(\mathbf{S})$, and $f_\theta(\theta_i)$ are all strictly positive, so is $f^Q_{\theta_i}(\theta_i \mid \mathbf{S})$. Further, $g^Q_i(S_i \mid \theta_i) = f_\varepsilon(S_i - \theta_i - a_i)$ is continuous in $\theta_i$ given the continuity of $f_\varepsilon$. Then $f^Q_{\theta_i}(\theta_i \mid \mathbf{S})$ is continuous in $\theta_i$ so long as
$$f_\theta(\theta_i) \int dF^Q_{\bar\theta}(\bar\theta \mid \theta_i) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta) = \int dF_{\bar\theta}(\bar\theta)\, f_{\theta^\perp}(\theta_i - \bar\theta) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta)$$
is. As $f_{\theta^\perp}$ is bounded and continuous and $\int dF_{\bar\theta}(\bar\theta) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta) = g^Q_{-i}(\mathbf{S}_{-i})$ is finite, the dominated convergence theorem ensures that this final term is continuous, as desired. The result for the density of $\theta_i$ conditional on $\mathbf{S}_{-j}$ for any $j \ne i$ follows from nearly identical work.

Next consider the density of $\theta_i$ conditional on $\mathbf{S}_{-i}$.
Now Bayes' rule gives
$$f^Q_{\theta_i}(\theta_i \mid \mathbf{S}_{-i}) = \frac{g^Q_{-i}(\mathbf{S}_{-i} \mid \theta_i)\, f_\theta(\theta_i)}{g^Q_{-i}(\mathbf{S}_{-i})},$$
where
$$g^Q_{-i}(\mathbf{S}_{-i} \mid \theta_i) = \int dF^Q_{\bar\theta}(\bar\theta \mid \theta_i) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta)$$
and
$$g^Q_{-i}(\mathbf{S}_{-i}) = \int dF_{\bar\theta}(\bar\theta) \prod_{j \ne i} g^Q_j(S_j \mid \bar\theta).$$
As each of these terms is strictly positive, so is $f^Q_{\theta_i}(\theta_i \mid \mathbf{S}_{-i})$. Further, $g^Q_{-i}(\mathbf{S}_{-i} \mid \theta_i)\, f_\theta(\theta_i)$ was already shown to be continuous in the previous paragraph. So $f^Q_{\theta_i}(\theta_i \mid \mathbf{S}_{-i})$ is continuous in $\theta_i$, as desired.

Next, consider the density of $\varepsilon_i$ conditional on $\mathbf{S}$. Bayes' rule gives
$$f^Q_{\varepsilon_i}(\varepsilon_i \mid \mathbf{S}) = \frac{g^Q_N(\mathbf{S} \mid \varepsilon_i)\, f_\varepsilon(\varepsilon_i)}{g^Q_N(\mathbf{S})},$$
where
$$g^Q_N(\mathbf{S} \mid \varepsilon_i) = g^Q_i(S_i \mid \varepsilon_i) \prod_{j \ne i} g^Q_j(S_j).$$
Then as $g^Q_i(S_i \mid \varepsilon_i) = f_\theta(S_i - \varepsilon_i - a_i)$ is continuous in $\varepsilon_i$ given the continuity of $f_\theta$, so is $g^Q_N(\mathbf{S} \mid \varepsilon_i)$. The result for the density of $\varepsilon_i$ conditional on $\mathbf{S}_{-j}$ for any $j \ne i$ follows by nearly identical work.

Finally, consider the density of $\varepsilon_i$ conditional on $\mathbf{S}_{-i}$. In the quality linkage model $\varepsilon_i$ is independent of $\mathbf{S}_{-i}$, so $f^Q_{\varepsilon_i}(\varepsilon_i \mid \mathbf{S}_{-i}) = f_\varepsilon(\varepsilon_i)$, which is strictly positive and continuous by assumption.

The following pair of lemmas establishes that the posterior distribution functions of the agent's type conditional on the vector of outcomes satisfy a smoothness condition. To economize on notation, the lemma is established with respect to agent 1's latent variables, as the signal of agent $N$ moves. By symmetry an analogous result applies to all other pairs of agents.

Lemma O.4. For each model $M \in \{Q, C\}$ and outcome-action profile $(\mathbf{S}_{-N}, \mathbf{a})$, $F^M_{\theta_1}(\theta_1 \mid \mathbf{S}; \mathbf{a})$ is a $C^1$ function of $(S_N, \theta_1)$.

Proof. For convenience, we suppress the dependence of distributions on $\mathbf{a}$ in this proof. Fix $\mathbf{S}_{-N}$.
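The result follows from Lemma O.2 so long as 1) $f^M_{\theta_1}(\theta_1 \mid \mathbf{S}_{-N})$ is continuous in $\theta_1$, and 2) $g^M_N(S_N \mid \theta_1, \mathbf{S}_{-N})$ is a $C^{1,0}$ function of $(S_N, \theta_1)$ and both it and its derivative wrt $S_N$ are uniformly bounded. Lemma O.3 ensures that condition 1 holds, so we need only establish condition 2.

Consider first the quality linkage model. Write $\mathbf{S}_{2:N-1}$ for the outcomes of agents 2 through $N-1$. In this case $g^Q_N(S_N \mid \theta_1, \mathbf{S}_{-N}) = g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1})$, as $S_N$ is independent of $S_1$ conditional on $\theta_1$. And by the law of total probability,
$$g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1}) = \int g^Q_N(S_N \mid \bar\theta, \theta_1, \mathbf{S}_{2:N-1})\, dF^Q_{\bar\theta}(\bar\theta \mid \theta_1, \mathbf{S}_{2:N-1}).$$
As $S_N$ is independent of $(\theta_1, \mathbf{S}_{2:N-1})$ conditional on $\bar\theta$, this is equivalently
$$g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1}) = \int g^Q_N(S_N \mid \bar\theta)\, dF^Q_{\bar\theta}(\bar\theta \mid \theta_1, \mathbf{S}_{2:N-1}).$$
Since $g^Q_N(S_N \mid \bar\theta) = f_{\theta^\perp + \varepsilon}(S_N - \bar\theta - a_N)$, which is uniformly bounded by some $M$ for all $(S_N, \bar\theta)$, $g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1})$ is uniformly bounded by $M$ as well for all $(S_N, \theta_1)$. Further, by Bayes' rule,
$$f^Q_{\bar\theta}(\bar\theta \mid \theta_1, \mathbf{S}_{2:N-1}) = \frac{f^Q_{\theta_1}(\theta_1 \mid \bar\theta, \mathbf{S}_{2:N-1})\, f_{\bar\theta}(\bar\theta \mid \mathbf{S}_{2:N-1})}{f^Q_{\theta_1}(\theta_1 \mid \mathbf{S}_{2:N-1})}.$$
Now, $\theta_1$ is independent of $\mathbf{S}_{2:N-1}$ conditional on $\bar\theta$, and so $f^Q_{\theta_1}(\theta_1 \mid \bar\theta, \mathbf{S}_{2:N-1}) = f^Q_{\theta_1}(\theta_1 \mid \bar\theta) = f_{\theta^\perp}(\theta_1 - \bar\theta)$. Then $f^Q_{\bar\theta}(\bar\theta \mid \theta_1, \mathbf{S}_{2:N-1})$ is equivalently
$$f^Q_{\bar\theta}(\bar\theta \mid \theta_1, \mathbf{S}_{2:N-1}) = \frac{f_{\theta^\perp}(\theta_1 - \bar\theta)\, f_{\bar\theta}(\bar\theta \mid \mathbf{S}_{2:N-1})}{f^Q_{\theta_1}(\theta_1 \mid \mathbf{S}_{2:N-1})}.$$
Inserting this into the previous expression for $g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1})$ yields
$$g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1}) = \frac{1}{f^Q_{\theta_1}(\theta_1 \mid \mathbf{S}_{2:N-1})} \int f_{\theta^\perp + \varepsilon}(S_N - \bar\theta - a_N)\, f_{\theta^\perp}(\theta_1 - \bar\theta)\, dF^Q_{\bar\theta}(\bar\theta \mid \mathbf{S}_{2:N-1}).$$
Applying Lemma O.3 to an $(N-1)$-agent model, $f^Q_{\theta_1}(\theta_1 \mid \mathbf{S}_{2:N-1})$ is continuous in $\theta_1$. Meanwhile by assumption $f_{\theta^\perp + \varepsilon}(S_N - \bar\theta - a_N)$ and $f_{\theta^\perp}(\theta_1 - \bar\theta)$ are both continuous in $(S_N, \theta_1)$ for every $\bar\theta$, and are uniformly bounded above for every $(\theta_1, S_N, \bar\theta)$. Then by the dominated convergence theorem the integral is also continuous in $(S_N, \theta_1)$, ensuring that $g^Q_N(S_N \mid \theta_1, \mathbf{S}_{2:N-1})$ is a continuous function of $(S_N, \theta_1)$. Finally, consider differentiating wrt $S_N$.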
As f (cid:48) θ ⊥ + ε exists and is uniformlybounded, and f θ ⊥ is also uniformly bounded, the Leibniz integral rule ensures that ∂∂S N g QN ( S N | θ , S N − ) = 1 f Qθ ( θ | S N − ) (cid:90) f (cid:48) θ ⊥ + ε ( S N − θ − a N ) f θ ⊥ ( θ − θ ) dF Qθ ( θ | S N − ) . ince f (cid:48) θ ⊥ + ε is also continuous, this expression is continuous in ( S N , θ ) following the same logicwhich ensured that g QN ( S N | θ , S N − ) is continuous. Finally, let M be an upper bound on | f (cid:48) θ ⊥ + ε | . Then as f Qθ ( θ | S N − ) = (cid:90) f θ ⊥ ( θ − θ ) dF Qθ ( θ | S N − ) , it follows that (cid:12)(cid:12)(cid:12) ∂∂S N g QN ( S N | θ , S N − ) (cid:12)(cid:12)(cid:12) is uniformly bounded above by M as well. So g QN ( S N | θ , S N − ) satisﬁes condition 2.Now consider the circumstance linkage model. In this model g CN ( S N | θ = t, S = s, S N − ) = g CN ( S N | ε = s − t − a , S N − ) , as ε = S − θ − a and S N is independent of S conditional on ε . It is therefore enough toestablish that g CN ( S N | ε , S N − ) is a C , function of ( S N , ε ) with uniform bounds on it and itsderivative wrt S N . This follows from work nearly identical to the previous paragraph, substituting ε for θ and ε for θ. O.2 Proofs for the Gaussian Setting

O.2.1 Veriﬁcation of Assumptions in 2.6

Here we verify that Gaussian uncertainty satisfies the stated assumptions. Assumptions 1, 3, and 5 are immediate. Assumption 6 is satisfied for any strictly convex cost function, since the second derivative of the posterior expectation in each signal realization is zero. Assumption 4 is verified in the following lemma:

Lemma O.5. Suppose $\xi \sim N(0, \sigma^2)$. Then for any $\bar\Delta > 0$, the function
$$J^*(\xi) = \frac{1}{\bar\Delta^2} \left( \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) + \exp\left( \frac{\bar\Delta |\xi|}{\sigma^2} \right) - 2 \right)^2$$
satisfies $|J(\xi, \Delta)| \le J^*(\xi)$ for every $\xi \in \mathbb{R}$ and $\Delta \in [-\bar\Delta, \bar\Delta]$, and $E[J^*(\xi)] < \infty$.

Proof. Under the distributional assumption on $\xi$, the density function $f_\xi$ has the form
$$f_\xi(\xi) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\xi^2}{2\sigma^2} \right).$$
Therefore
$$\frac{f_\xi(\xi - \Delta) - f_\xi(\xi)}{f_\xi(\xi)} = \exp\left( \frac{1}{\sigma^2} \Delta (\xi - \Delta/2) \right) - 1.$$
Now, we may equivalently write
$$\frac{1}{\Delta} \frac{f_\xi(\xi - \Delta) - f_\xi(\xi)}{f_\xi(\xi)} = \frac{1}{\sigma^2} \int_{\Delta/2}^{\xi} \exp\left( \frac{1}{\sigma^2} \Delta (\tilde\xi - \Delta/2) \right) d\tilde\xi = \frac{\exp\left( -\frac{\Delta^2}{2\sigma^2} \right)}{\sigma^2} \int_{\Delta/2}^{\xi} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi.$$
Hence
$$\left| \frac{1}{\Delta} \frac{f_\xi(\xi - \Delta) - f_\xi(\xi)}{f_\xi(\xi)} \right| = \frac{\exp\left( -\frac{\Delta^2}{2\sigma^2} \right)}{\sigma^2} \int_{\min\{\Delta/2, \xi\}}^{\max\{\Delta/2, \xi\}} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \int_{\min\{\Delta/2, \xi\}}^{\max\{\Delta/2, \xi\}} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi.$$
Let
$$H(\xi, \Delta) \equiv \frac{1}{\sigma^2} \int_{\min\{\Delta/2, \xi\}}^{\max\{\Delta/2, \xi\}} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi.$$
We will show that $H(\xi, \Delta) \le \sqrt{J^*(\xi)}$ for all $\xi$ and $\Delta \in [-\bar\Delta, \bar\Delta]$ in cases, depending on the signs of $\xi$, $\Delta$, and $\xi - \Delta/2$.

Case 1: $\xi \ge \Delta/2 \ge 0$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\Delta/2}^{\xi} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \int_{0}^{\xi} \exp\left( \frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta \xi}{\sigma^2} \right) - 1 \right) \le \sqrt{J^*(\xi)}.$$

Case 2: $\xi \ge 0 > \Delta/2$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\Delta/2}^{\xi} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \left( \int_{0}^{\xi} \exp\left( \frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi + \int_{-\bar\Delta/2}^{0} \exp\left( -\frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \right) = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta \xi}{\sigma^2} \right) + \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) - 2 \right) = \sqrt{J^*(\xi)}.$$

Case 3: $\Delta/2 > \xi \ge 0$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\xi}^{\Delta/2} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \int_{0}^{\bar\Delta/2} \exp\left( \frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) - 1 \right) \le \sqrt{J^*(\xi)}.$$

Case 4: $\Delta/2 > 0 > \xi$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\xi}^{\Delta/2} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \left( \int_{0}^{\bar\Delta/2} \exp\left( \frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi + \int_{\xi}^{0} \exp\left( -\frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \right) = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) + \exp\left( \frac{\bar\Delta |\xi|}{\sigma^2} \right) - 2 \right) = \sqrt{J^*(\xi)}.$$

Case 5: $0 \ge \Delta/2 > \xi$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\xi}^{\Delta/2} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \int_{\xi}^{0} \exp\left( -\frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta |\xi|}{\sigma^2} \right) - 1 \right) \le \sqrt{J^*(\xi)}.$$

Case 6: $0 > \xi \ge \Delta/2$. Then
$$H(\xi, \Delta) = \frac{1}{\sigma^2} \int_{\Delta/2}^{\xi} \exp\left( \frac{\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi \le \frac{1}{\sigma^2} \int_{-\bar\Delta/2}^{0} \exp\left( -\frac{\bar\Delta \tilde\xi}{\sigma^2} \right) d\tilde\xi = \frac{1}{\bar\Delta} \left( \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) - 1 \right) \le \sqrt{J^*(\xi)}.$$

This establishes that $|J(\xi, \Delta)| \le H(\xi, \Delta)^2 \le J^*(\xi)$ for every $\xi$ and $\Delta \in [-\bar\Delta, \bar\Delta]$, as desired. It remains only to show that $J^*$ is $P$-integrable. This follows because
$$J^*(\xi) \le \frac{1}{\bar\Delta^2} \left( \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) + \exp\left( \frac{\bar\Delta |\xi|}{\sigma^2} \right) \right)^2 = \frac{1}{\bar\Delta^2} \left( \exp\left( \frac{\bar\Delta^2}{\sigma^2} \right) + 2 \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) \exp\left( \frac{\bar\Delta |\xi|}{\sigma^2} \right) + \exp\left( \frac{2\bar\Delta |\xi|}{\sigma^2} \right) \right)$$
$$\le \frac{1}{\bar\Delta^2} \left( \exp\left( \frac{\bar\Delta^2}{\sigma^2} \right) + 2 \exp\left( \frac{\bar\Delta^2}{2\sigma^2} \right) \left( \exp\left( \frac{\bar\Delta \xi}{\sigma^2} \right) + \exp\left( -\frac{\bar\Delta \xi}{\sigma^2} \right) \right) + \exp\left( \frac{2\bar\Delta \xi}{\sigma^2} \right) + \exp\left( -\frac{2\bar\Delta \xi}{\sigma^2} \right) \right).$$
The first term is a constant, while each of the remaining terms is proportional to a lognormal random variable. Thus each term has finite mean, and hence so does $J^*(\xi)$.

O.2.2 Marginal Value of Effort

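Consider the quality linkage model, and suppose that agent $i$ chooses effort $a_i = a^* + \Delta$ while all agents $j \neq i$ choose the equilibrium effort level $a^*$. The principal's posterior belief about $\bar\theta + \theta_i^\perp$ is independent of $S_{-i}$ conditional on $\bar\theta$. Thus, using standard formulas for updating to normal signals, we can first update the principal's belief about $\bar\theta$ to
$$\bar\theta \mid S_{-i} \sim N\left( \hat\mu_{\bar\theta}, \hat\sigma^2_{\bar\theta} \right),$$
where
$$\hat\mu_{\bar\theta} \equiv \frac{(N-1)\sigma^2_{\bar\theta} \cdot (\bar S_{-i} - a^*) + (\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon) \cdot \mu}{(N-1)\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon}, \qquad \hat\sigma^2_{\bar\theta} \equiv \frac{\sigma^2_{\bar\theta}(\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)}{(N-1)\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon},$$
and $\bar S_{-i}$ is the average outcome of agents $j \neq i$. The principal's expectation of $\bar\theta + \theta_i^\perp$ after further updating to $S_i$ is
$$E(\bar\theta + \theta_i^\perp \mid S) = \frac{\sigma^2_\varepsilon}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} \cdot \hat\mu_{\bar\theta} + \frac{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp}}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} \cdot (S_i - a^*).$$
Taking an expectation with respect to the agent's prior belief, under which $E(\hat\mu_{\bar\theta}) = \mu$ and $E(S_i - a^*) = \mu + \Delta$, we have
$$\mu_N(\Delta) = E\left[ E(\bar\theta + \theta_i^\perp \mid S) \right] = \frac{\sigma^2_\varepsilon}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} \cdot \mu + \frac{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp}}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} \cdot (\mu + \Delta) = \mu + \frac{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp}}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} \cdot \Delta,$$
and the marginal value of effort is
$$\mu'_N(\Delta) = \frac{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp}}{\hat\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} = \left( \frac{\sigma^2_{\bar\theta}(\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)}{(N-1)\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} + \sigma^2_{\theta^\perp} \right) \Bigg/ \left( \frac{\sigma^2_{\bar\theta}(\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)}{(N-1)\sigma^2_{\bar\theta} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon} + \sigma^2_{\theta^\perp} + \sigma^2_\varepsilon \right). \tag{O.1}$$
It is straightforward to verify that this expression is independent of $\Delta$, decreasing in $N$, and converges to $\sigma^2_{\theta^\perp} / (\sigma^2_{\theta^\perp} + \sigma^2_\varepsilon)$ as $N \to \infty$.

Consider now the circumstance linkage model. Using parallel arguments to those above, the principal's posterior belief about the common part of the noise shock $\bar\varepsilon$ after updating to $S_{-i}$ is
$$\bar\varepsilon \mid S_{-i} \sim N\left( \frac{(N-1)\sigma^2_{\bar\varepsilon}}{(N-1)\sigma^2_{\bar\varepsilon} + \sigma^2_\theta + \sigma^2_{\varepsilon^\perp}} \cdot \left( \bar S_{-i} - a^* - \mu \right),\ \frac{\sigma^2_{\bar\varepsilon}(\sigma^2_{\varepsilon^\perp} + \sigma^2_\theta)}{(N-1)\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp} + \sigma^2_\theta} \right) \equiv N(\eta, \hat\sigma^2_{\bar\varepsilon}),$$
and the principal's posterior expectation of $\theta_i$ after further updating to $S_i$ is
$$E(\theta_i \mid S) = \frac{\sigma^2_\theta}{\sigma^2_\theta + \hat\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp}} \cdot (S_i - a^* - \eta) + \frac{\hat\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp}}{\sigma^2_\theta + \hat\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp}} \cdot \mu.$$
Since in the agent's prior, $E(S_i - a^*) = \mu + \Delta$ and $E(\eta) = 0$, the agent's expectation of the principal's forecast is
$$\mu_N(\Delta) = E\left[ E(\theta_i \mid S) \right] = \mu + \frac{\sigma^2_\theta}{\sigma^2_\theta + \hat\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp}} \cdot \Delta,$$
implying that the marginal value of effort is
$$\mu'_N(\Delta) = \frac{\sigma^2_\theta}{\sigma^2_\theta + \hat\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp}} = \sigma^2_\theta \Bigg/ \left( \sigma^2_\theta + \frac{\sigma^2_{\bar\varepsilon}(\sigma^2_{\varepsilon^\perp} + \sigma^2_\theta)}{(N-1)\sigma^2_{\bar\varepsilon} + \sigma^2_{\varepsilon^\perp} + \sigma^2_\theta} + \sigma^2_{\varepsilon^\perp} \right). \tag{O.2}$$
This expression is constant in $\Delta$, increasing in $N$, and converges to $\sigma^2_\theta / (\sigma^2_\theta + \sigma^2_{\varepsilon^\perp})$ as $N$ grows large.

O.3 Proofs for Section 7 (Extensions)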

O.3.1 Proof of Proposition 4

Consider first the quality linkage model. Let $\mu_m(\Delta)$ be the agent's value of distortion when $m \le J$ linkages have been identified. As in the main model, this value is differentiable and independent of the action the principal expects the agent to take. (See the proof of Lemma C.1.) Agent 0's equilibrium effort is then determined by $\mu'_m(0) = C'(a)$. We prove that $\mu'_m(0) > \mu'_{m+1}(0)$ for every $m$.

Let $S^j = (S^j_1, \ldots, S^j_{N_j})$ be the vector of signal realizations for each segment $j$, and write $\mathbf{S}^m$ for the matrix of signal realizations from segments 1 through $m$. We will write $G^j$ for the distribution function of each $S^j$, and $\mathbf{G}^m$ for the distribution function of $(S_0, \mathbf{S}^m)$. Dropping explicit conditioning on actions for convenience, a change of variables as in the proof of Lemma 1 allows us to write $\mu_m(\Delta)$ and $\mu_{m+1}(\Delta)$ as
$$\mu_m(\Delta) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m)\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^m]$$
and
$$\mu_{m+1}(\Delta) = \int d\mathbf{G}^{m+1}(S_0 = s_0, \mathbf{S}^{m+1})\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^{m+1}]$$
for some common set of actions. The law of iterated expectations applied to $E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^m]$ allows the previous expression for $\mu_m(\Delta)$ to be expanded as
$$\mu_m(\Delta) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m) \times \int dG^{m+1}(S^{m+1} \mid S_0 = s_0 + \Delta, \mathbf{S}^m)\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^{m+1}].$$
Meanwhile the law of iterated expectations applied to the outer expectation allows $\mu_{m+1}(\Delta)$ to be expanded as
$$\mu_{m+1}(\Delta) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m) \times \int dG^{m+1}(S^{m+1} \mid S_0 = s_0, \mathbf{S}^m)\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^{m+1}].$$
Each of these inner integrals may be further expanded using the law of total probability, yielding
$$\mu_m(\Delta) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m) \times \int dF_{\bar\theta^{m+1}}(\bar\theta^{m+1} \mid S_0 = s_0 + \Delta, \mathbf{S}^m) \times \int dG^{m+1}(S^{m+1} \mid \bar\theta^{m+1})\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^{m+1}]$$
and
$$\mu_{m+1}(\Delta) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m) \times \int dF_{\bar\theta^{m+1}}(\bar\theta^{m+1} \mid S_0 = s_0, \mathbf{S}^m) \times \int dG^{m+1}(S^{m+1} \mid \bar\theta^{m+1})\, E[\theta_0 \mid S_0 = s_0 + \Delta, \mathbf{S}^{m+1}],$$
where we have used the fact that $S^{m+1}$ is independent of $(S_0, \mathbf{S}^m)$ conditional on $\bar\theta^{m+1}$ to drop extraneous conditioning in the inner expectation.

So define a function $\psi(\delta_1, \delta_2, s_0, \mathbf{S}^m)$ by
$$\psi(\delta_1, \delta_2, s_0, \mathbf{S}^m) \equiv \int dF_{\bar\theta^{m+1}}(\bar\theta^{m+1} \mid S_0 = s_0 + \delta_2, \mathbf{S}^m) \times \int dG^{m+1}(S^{m+1} \mid \bar\theta^{m+1})\, E[\theta_0 \mid S_0 = s_0 + \delta_1, \mathbf{S}^{m+1}].$$
Then for every $\Delta > 0$,
$$\frac{1}{\Delta}\left( \mu_m(\Delta) - \mu_{m+1}(\Delta) \right) = \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m)\, \frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right).$$
Since
$$\mu'_m(0) - \mu'_{m+1}(0) = \lim_{\Delta \downarrow 0} \frac{1}{\Delta}(\mu_m(\Delta) - \mu) - \lim_{\Delta \downarrow 0} \frac{1}{\Delta}(\mu_{m+1}(\Delta) - \mu) = \lim_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \mu_m(\Delta) - \mu_{m+1}(\Delta) \right),$$
it is therefore sufficient to determine the limiting behavior of
$$\frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right)$$
as $\Delta \downarrow 0$.

Note that
$$S^{m+1}_i = \bar\theta^{m+1} + \theta^{\perp, m+1}_i + \varepsilon_i,$$
where the densities of $\theta^{\perp, m+1}_i$ and $\varepsilon_i$ each exist and are bounded by assumption. Then there exists a differentiable distribution function $H$ with bounded derivative such that $G^{m+1}_i(S^{m+1}_i \mid \bar\theta^{m+1}) = H(S^{m+1}_i - \bar\theta^{m+1})$ for each agent $i$ in segment $m+1$. Since the elements of $S^{m+1}$ are independent conditional on $\bar\theta^{m+1}$, we may write
$$G^{m+1}(S^{m+1} \mid \bar\theta^{m+1}) = \prod_{i=1}^{N_{m+1}} H(S^{m+1}_i - \bar\theta^{m+1}).$$
A change of variables therefore yields
$$\int dG^{m+1}(S^{m+1} \mid \bar\theta^{m+1})\, E[\theta_0 \mid S_0 = s_0 + \delta_1, \mathbf{S}^{m+1}] = \int_0^1 dq_1 \cdots \int_0^1 dq_{N_{m+1}}\, E\left[ \theta_0 \mid S_0 = s_0 + \delta_1, \mathbf{S}^m, S^{m+1} = (H^{-1}(q_i) + \bar\theta^{m+1})_{i=1,\ldots,N_{m+1}} \right].$$
Now fix $s_0$ and $\mathbf{S}^m$, and denote the integrand of this representation
$$\zeta(z, \delta, q) \equiv E\left[ \theta_0 \mid S_0 = s_0 + \delta, \mathbf{S}^m, S^{m+1} = (H^{-1}(q_i) + z)_{i=1,\ldots,N_{m+1}} \right],$$
where $q \equiv (q_1, \ldots, q_{N_{m+1}})$. Using techniques very similar to those used to prove Lemma B.3, it can be shown that there exists a $C^1$ quantile function $\phi(q_0, \delta)$ satisfying $\partial\phi/\partial\delta > 0$ and
$$F_{\bar\theta^{m+1}}(\phi(q_0, \delta) \mid S_0 = s_0 + \delta, \mathbf{S}^m) = q_0$$
for every $q_0$ and $\delta$. Then by a further change of variables, $\psi$ may be written
$$\psi(\delta_1, \delta_2, s_0, \mathbf{S}^m) = \int_0^1 dq_0 \cdots \int_0^1 dq_{N_{m+1}}\, \zeta(\phi(q_0, \delta_2), \delta_1, q).$$
By assumption, $E[\theta_0 \mid S_0, \mathbf{S}^{m+1}]$ is differentiable wrt each $S^{m+1}_i$, and by arguments very similar to those used to prove Lemma B.8, it can be shown that
$$\frac{\partial}{\partial S^{m+1}_i} E[\theta_0 \mid S_0, \mathbf{S}^{m+1}] > 0$$
for every $i = 1, \ldots, N_{m+1}$. Since additionally $\partial\phi/\partial\delta > 0$, it follows that
$$\zeta(\phi(q_0, \Delta), \Delta, q) > \zeta(\phi(q_0, 0), \Delta, q)$$
for every $\Delta > 0$ and $(q_0, q)$, and thus that
$$\frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right) = \int_0^1 dq_0 \cdots \int_0^1 dq_{N_{m+1}}\, \frac{1}{\Delta}\left( \zeta(\phi(q_0, \Delta), \Delta, q) - \zeta(\phi(q_0, 0), \Delta, q) \right)$$
is strictly positive for every $\Delta > 0$. Since this result holds for every $(s_0, \mathbf{S}^m)$, Fatou's lemma therefore implies that
$$\mu'_m(0) - \mu'_{m+1}(0) \ge \int d\mathbf{G}^m(S_0 = s_0, \mathbf{S}^m)\, \liminf_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right)$$
and
$$\liminf_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right) \ge \int_0^1 dq_0 \cdots \int_0^1 dq_{N_{m+1}}\, \lim_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \zeta(\phi(q_0, \Delta), \Delta, q) - \zeta(\phi(q_0, 0), \Delta, q) \right).$$
Further, the integrand of the previous expression can be equivalently written
$$\frac{1}{\Delta}\left( \zeta(\phi(q_0, \Delta), \Delta, q) - \zeta(\phi(q_0, 0), \Delta, q) \right) = \frac{1}{\Delta}\left( \zeta(\phi(q_0, \Delta), \Delta, q) - \zeta(\phi(q_0, 0), 0, q) \right) - \frac{1}{\Delta}\left( \zeta(\phi(q_0, 0), \Delta, q) - \zeta(\phi(q_0, 0), 0, q) \right).$$
Now, by assumption $E[\theta_0 \mid S_0, \mathbf{S}^{m+1}]$ is differentiable wrt $S_0$ and each $S^{m+1}_i$, and each derivative is continuous in $(S_0, \mathbf{S}^{m+1})$. Hence $E[\theta_0 \mid S_0, \mathbf{S}^{m+1}]$ is a totally differentiable function of $(S_0, \mathbf{S}^{m+1})$ everywhere. Writing $E[\theta_0 \mid \cdot\,]$ as shorthand for $E[\theta_0 \mid S_0 = s_0, \mathbf{S}^m, S^{m+1} = (H^{-1}(q_i) + \phi(q_0, 0))_{i=1,\ldots,N_{m+1}}]$, the chain rule gives
$$\lim_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \zeta(\phi(q_0, \Delta), \Delta, q) - \zeta(\phi(q_0, 0), \Delta, q) \right) = \frac{\partial}{\partial S_0} E[\theta_0 \mid \cdot\,] + \sum_{i=1}^{N_{m+1}} \frac{\partial}{\partial S^{m+1}_i} E[\theta_0 \mid \cdot\,]\, \frac{\partial\phi}{\partial\Delta}(q_0, 0) - \frac{\partial}{\partial S_0} E[\theta_0 \mid \cdot\,] = \sum_{i=1}^{N_{m+1}} \frac{\partial}{\partial S^{m+1}_i} E[\theta_0 \mid \cdot\,]\, \frac{\partial\phi}{\partial\Delta}(q_0, 0).$$
As noted earlier, each of these derivatives is strictly positive, and so it follows that the entire limit is strictly positive. Thus
$$\liminf_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \psi(\Delta, \Delta, s_0, \mathbf{S}^m) - \psi(\Delta, 0, s_0, \mathbf{S}^m) \right) > 0$$
for every $(s_0, \mathbf{S}^m)$, and so $\mu'_m(0) - \mu'_{m+1}(0) > 0$. In other words, the marginal value of effort is declining in $m$ in the quality linkage model.

The result for the circumstance linkage model proceeds nearly identically, with the key difference that now an analog of Lemma B.8 implies that
$$\frac{\partial}{\partial S^{m+1}_i} E[\theta_0 \mid S_0, \mathbf{S}^{m+1}] < 0$$
for every $i$. Thus
$$\lim_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \zeta(\phi(q_0, 0), \Delta, q) - \zeta(\phi(q_0, \Delta), \Delta, q) \right) > 0,$$
so that
$$\liminf_{\Delta \downarrow 0} \frac{1}{\Delta}\left( \psi(\Delta, 0, s_0, \mathbf{S}^m) - \psi(\Delta, \Delta, s_0, \mathbf{S}^m) \right) > 0,$$
and hence $\mu'_{m+1}(0) - \mu'_m(0) > 0$. So the marginal value of effort is rising in $m$ in the circumstance linkage model.
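For intuition about the direction of these comparative statics, the single-segment Gaussian closed forms (O.1) and (O.2) from Section O.2.2 can be evaluated numerically. The sketch below is only an illustration under stated assumptions: it varies the segment size N rather than the number of linked segments m, all function and parameter names are hypothetical, and the closed forms are cross-checked against the coefficient on the agent's own outcome in the linear projection of the type on all outcomes, which is what a jointly Gaussian posterior mean reduces to.

```python
def quality_marginal_value(n, var_common, var_idio, var_noise):
    """Closed form in the spirit of (O.1): weight on S_i under a quality linkage."""
    # Posterior variance of the common type component after observing n - 1 peers.
    post_var = var_common * (var_idio + var_noise) / (
        (n - 1) * var_common + var_idio + var_noise)
    return (post_var + var_idio) / (post_var + var_idio + var_noise)


def circumstance_marginal_value(n, var_type, var_common_noise, var_idio_noise):
    """Closed form in the spirit of (O.2): weight on S_i under a circumstance linkage."""
    # Posterior variance of the common shock after observing n - 1 peers.
    post_var = var_common_noise * (var_idio_noise + var_type) / (
        (n - 1) * var_common_noise + var_idio_noise + var_type)
    return var_type / (var_type + post_var + var_idio_noise)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weight_on_own_signal(n, var_type_common, var_type_idio,
                         var_noise_common, var_noise_idio):
    """Coefficient on S_1 when projecting theta_1 on (S_1, ..., S_n)."""
    shared = var_type_common + var_noise_common          # Cov(S_i, S_j), i != j
    own = shared + var_type_idio + var_noise_idio        # Var(S_i)
    cov_s = [[own if i == j else shared for j in range(n)] for i in range(n)]
    # Cov(theta_1, S_1) first, then Cov(theta_1, S_j) for each peer j.
    cov_theta_s = [var_type_common + var_type_idio] + [var_type_common] * (n - 1)
    return solve(cov_s, cov_theta_s)[0]


if __name__ == "__main__":
    for n in (1, 2, 5, 20, 100):
        print(n,
              round(quality_marginal_value(n, 1.0, 1.0, 1.0), 4),
              round(circumstance_marginal_value(n, 1.0, 1.0, 1.0), 4))
```

With unit variances, the first column of values falls toward 1/2 and the second rises toward 1/2 as N grows, matching the limits stated after (O.1) and (O.2).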