Learning from Neighbors about a Changing State
SOCIAL LEARNING IN A DYNAMIC ENVIRONMENT
KRISHNA DASARATHA, BENJAMIN GOLUB, AND NIR HAK
Abstract.
Agents learn about a state using private signals and the past actions of their neighbors. In contrast to most models of social learning in a network, the target being learned about is moving around. We ask: when can a group aggregate information quickly, keeping up with the changing state? First, if each agent has access to neighbors with sufficiently diverse kinds of signals, then Bayesian learning achieves good information aggregation. Second, without such diversity, there are cases in which Bayesian information aggregation necessarily falls far short of efficient benchmarks. Third, good aggregation requires agents who understand correlations in neighbors' actions with the sophistication needed to concentrate on recent developments and filter out older, outdated information. In stationary equilibrium, agents' learning rules incorporate past information by taking linear combinations of other agents' past estimates (as in the simple DeGroot heuristic), and we characterize the coefficients in these linear combinations.
Date: February 7, 2019. Department of Economics, Harvard University, Cambridge, U.S.A., [email protected], [email protected], [email protected]. We thank the Foundations for Human Behavior Initiative at Harvard for financial support. We are grateful to (in random order) Alireza Tahbaz-Salehi, Evan Sadler, Muhamet Yildiz, Philipp Strack, Drew Fudenberg, Nageeb Ali, Tomasz Strzalecki, Jeroen Swinkels, Margaret Meyer, Michael Powell, Michael Ostrovsky, Leeat Yariv, Andrea Galeotti, Eric Maskin, Elliot Lipnowski, Jeff Ely, Kevin He, Eddie Dekel, Annie Liang, Iosif Pinelis, Ozan Candogan, Ariel Rubinstein, Bob Wilson, Omer Tamuz, Jeff Zwiebel, Matthew O. Jackson, Xavier Vives, Matthew Rabin, and Andrzej Skrzypacz for valuable conversations, and especially to David Hirshleifer for detailed comments on a draft. We also thank numerous seminar participants for helpful questions and comments.

1. Introduction
Consider a group learning over a period of time about an evolving fundamental state, such as future conditions in a market. In addition to making use of public information, individuals learn from their own private information and also from the estimates of others. For instance, farmers who are trying to assess the demand for a crop they produce may learn from neighbors' actions (e.g., how much they are investing in the crop), which reflect those neighbors' estimates of market conditions. In another example, economic analysts or forecasters have their own data and calculations, but they may also have access to the reports of some other analysts. Importantly, in many such settings, people have access to the estimates of only some others. Therefore, without a central information aggregation device, aggregation of information occurs locally and estimates may differ across a population.

Given that the fundamental state in question is changing over time, a key question is: When can the group respond to the environment quickly, aggregating dispersed information efficiently in real time? In contrast, when are estimates of present conditions confounded? These questions are important from a positive perspective, to better understand the determinants of information aggregation and the welfare implications. They will also be relevant in design decisions—e.g., for a planner who influences group composition or information endowments and wants to facilitate better learning.

The question of whether decentralized communication can facilitate efficient adaptation to a changing world is a fundamental one in economic theory, related to questions raised by Hayek (1945). Nevertheless, there is relatively little modeling of dynamic states in the large literature on social learning and information aggregation in networks, though we discuss some very important antecedents that we build on—including Frongillo, Schoenebeck, and Tamuz (2011), Shahrampour, Rakhlin, and Jadbabaie (2013), and Alatas, Banerjee, Chandrasekhar, Hanna, and Olken (2016)—in Section 6.

A literature in economic development studies situations in which the flow of information crucial to production decisions is constrained by geographic or social distance; see, e.g., Jensen (2007); Srinivasan and Burrell (2013).

In a class of models of over-the-counter markets, agents learn about others' valuations during trade (see, e.g., Vives, 1993; Duffie and Manso, 2007; Duffie, Malamud, and Manso, 2009; Babus and Kondor, 2018).

"If we can agree that the economic problem of society is mainly one of rapid adaptation to changes in the particular circumstances of time and place . . . there still remains the problem of communicating to [each individual] such further information as he needs." Hayek's main concern was aggregation of information through markets, but the same questions apply more generally.

See, among many others, DeMarzo, Vayanos, and Zweibel (2003), Acemoglu, Dahleh, Lobel, and Ozdaglar (2011), Mueller-Frank (2013), Eyster and Rabin (2014), Mossel, Sly, and Tamuz (2015), Lobel and Sadler (2015a), Akbarpour et al. (2017), and Molavi, Tahbaz-Salehi, and Jadbabaie (2018). There is also a literature in finance on information aggregation in complex environments, where the models are mainly static: see, for instance, Malamud and Rostek (2017), Lambert, Ostrovsky, and Panov (2018), and the papers cited there.

Our first contribution is to define and study equilibria in a dynamic environment that captures two essential dimensions emphasized above: the state of the world changes over time, and communication occurs in an arbitrary network. These equilibria take a simple form and relate to certain canonical learning rules, such as that of DeGroot (1974), which have been used extensively in the literature on learning in networks. The second contribution is to derive conditions under which decentralized information aggregation works well.

Our main substantive finding is that in large populations decentralized learning can approach an essentially optimal benchmark, as long as (i) each individual has access to a set of neighbors that is sufficiently diverse, in the sense of having different signal distributions from each other; and (ii) updating rules are Bayesian, responding to correlations in a sophisticated way. If signal endowments are not diverse, then social learning can be inefficiently confounded and far from optimal, even though each agent has access to an unbounded number of observations, each containing independent information. Diversity is, in a sense, more important than precision: giving everyone better signals can hurt aggregation severely if it makes those signals homogeneous. A key mechanism behind the value of diverse signal endowments is that diversity helps agents concentrate on new developments in the dynamic environment, and successfully filter out older, less useful information. To take advantage of this, however, agents must have a sophisticated understanding of the correlations in their neighbors' estimates.

We now describe our dynamic model and some of the main results. The state, $\theta_t$, drifts around according to a stationary, discrete-time AR(1) process given by $\theta_{t+1} = \rho\theta_t + \nu_{t+1}$, with $0 < \rho < 1$, and agents receive conditionally independent Gaussian signals of its current value. The population consists of overlapping generations of decision-makers (agents), located in a network. An agent's action is an estimate $a_{i,t}$: she sets it to her expectation, given all her information, of the current state $\theta_t$. In each period, the information the agent observes consists of some past estimates of her neighbors in an arbitrary network and an independent, normal signal $s_{i,t} \sim \mathcal{N}(\theta_t, \sigma_i^2)$ of the current state; these estimates are relevant for estimating $\theta_t$ because they depend on recent states, which, in turn, are correlated with the current state. Her estimate is then used by her neighbors in the next round in the same way. We vary three features of the environment.

1. The distributions of individuals' information (the precisions $\sigma_i^{-2}$ of private signals).
2. The structure of the network: the sizes and compositions of individuals' neighborhoods.
3. How agents update their beliefs: the baseline model is that agents take correct Bayesian conditional expectations of the state of interest; an important alternative is that agents do not optimally account for redundancies in their observations (in ways we will be specific about).

A helpful feature of the model is that stationary equilibrium learning rules take a simple, time-invariant form: agents form their next-period estimates by taking linear combinations of their neighbors' earlier estimates and their own private signals. Technically, the problem facing each agent is a standard one: estimating a linear statistical model. However, the data about underlying fundamentals that an agent observes depend on others' strategies, i.e., updating rules. Each agent chooses her updating rule to optimally respond to the past updating of others, and we study the stationary equilibria of this system. Our main outcome of interest is the equilibrium quality of learning—the equilibrium error rates in agents' estimates of the state.

We will now summarize our findings on the efficiency of information aggregation and the importance of diverse signal endowments. Afterward, we will present an example to illustrate the key intuition behind why diversity of signal endowments helps with inference.

First, consider agents who are Bayesian. Suppose there is sufficient diversity of private information: there are at least two possible private signal precisions, and each individual is exposed to sufficiently many neighbors with each kind of signal. We show—within a random graph model that can capture essentially arbitrary network heterogeneity—these properties are enough to guarantee an equilibrium in which information aggregation is as good as it can possibly be. More precisely, each agent can figure out an arbitrarily good estimate of the previous period's state, which is the best information that she could hope to extract from others' actions, and then combine it with her own current private signal.

On the other hand, without sufficient diversity of private information—if all agents have the same kind of private signals—good information aggregation may fail. This can occur at the unique stationary equilibrium, even if each individual has access to the estimates of very many neighbors, all of whom get conditionally independent signals of the recent state.

In engineering terminology, each agent uses a Kalman filter, but the distribution of her observations is determined by other agents' behavior; thus, we analyze a set of Kalman filters in a network in Nash equilibrium; cf. Olfati-Saber (2007) and Shahrampour, Rakhlin, and Jadbabaie (2013), which model a set of distributed Kalman filters controlled by a planner.

Equilibrium weights satisfy a system of polynomial equations, but the system usually has high degree and is not amenable to explicit solutions, so additional work is required to characterize properties of these solutions.

Our key theoretical results on the quality of learning use large random graphs with flexible structure. To argue that neither the technical conditions assumed on these graphs nor a
reliance on very large numbers is driving the conclusions, we calculate equilibria numerically for real-world social networks both for diverse and non-diverse signal endowments, and show that signal diversity does enable much better learning, even in networks with several hundred agents, with agents having 10-20 neighbors.

We will also discuss some implications of our results for designers who wish to facilitate better learning, and what distributions of expertise they would prefer. In particular, our results provide a distinctive rationale for informational specialization in organizations, which we flesh out in Section 7.2.

Efficient aggregation depends on another factor, beyond the sufficient diversity of signal precisions we have been discussing: sophisticated behavior by individuals. In particular, even when correct inference is possible, to achieve it, agents must understand the correlations among their observations to remove confounds (Eyster and Rabin, 2014). To make the point that such understanding is essential to good aggregation, we examine some canonical learning rules, adapted to our setting, in which it is absent. There, information aggregation is essentially guaranteed to fall short of good aggregation benchmarks for all agents. We also make this point on fixed finite networks, where we show such learning strategies are necessarily Pareto inefficient.

Beyond these substantive results on what makes for efficient and inefficient information aggregation, a methodological contribution of the paper is a model in which individuals' learning takes a simple, tractable form (reminiscent of the DeGroot (1974) linear updating rule). Our model is one in which such updating arises from Bayesian behavior, and the coefficients in the linear rules are determined endogenously. This gives a new framework for comparative statics, welfare analysis, counterfactuals, and estimation exercises in social learning settings.
The value of diversity: An example.
We now present a simple example that illustrates the value of diverse signal endowments. This illustration can be done in a stripped-down setting—with a fixed state and a few sequential decisions in a simple network—but it demonstrates forces crucial in our general model. Consider an environment with a single well-informed source $S$, many media outlets $M_1, \ldots, M_n$ with access to the source as well as some independent private information, and the general public. The public consists of many individuals who learn only from the media outlets, and we are interested in how well a typical member of the public could learn by following many media outlets. More precisely, we consider the example shown in Figure 1.1 and think of $P$ as a generic member of the large public.
Figure 1.1.
The network used in the value of diversity example. [Figure: the source $S$, the media outlets $M_1, \ldots, M_n$, and a member of the public $P$.]

In the first period, the source receives a noisy signal of the state $\theta$, which we call $s_S = \theta + \eta_S$. The source announces its posterior belief of the state, which (taking an improper prior for the state $\theta$) will be $s_S$. After this, the media outlets receive noisy private signals $s_{M_i} = \theta + \eta_{M_i}$ and announce their posterior means of $\theta$, which we denote by $a_{M_i}$. A typical such estimate is a linear combination of $s_{M_i}$ and $s_S$. In particular, taking all signal errors to be mean-zero, independent normal draws, the estimate can be expressed as
$$a_{M_i} = w_i s_{M_i} + (1 - w_i) s_S,$$
where the weight $w_i$ on the media outlet's signal is increasing in the precision of that signal. The member of the public then makes an estimate based on the observations $a_{M_1}, \ldots, a_{M_n}$.

Suppose first that the media outlets have identically distributed private signals. Because the member of the public observes many symmetric media outlets, it turns out that her best estimate of the state, $a_P$, is simply the average of the estimates of the media outlets. Since each of these outlets uses the same weight $w$ on its private signal, we may write
$$a_P = w \sum_{i=1}^{n} \frac{s_{M_i}}{n} + (1 - w) s_S \approx w\theta + (1 - w) s_S.$$
In the approximate equality, we have used the fact that an average of many private signals is approximately equal to the state, by our assumption of independent errors. Despite the large number of media outlets that each have independent information, the public's beliefs are biased in the direction of the error in the source's signal, even with fully Bayesian updating.

What if, instead, half of the media outlets (say $M_1, \ldots, M_{n/2}$) have more precise private signals than the other half, perhaps because these outlets have invested more heavily in covering this topic? The media outlets with more precise signals will then place weight $w_A$ on their private signals, while the media outlets with less precise signals use a smaller weight $w_B$. We will now argue that a member of the public can extract more information from the media in this setting. In particular, she can first compute the averages of the two groups' actions:
$$w_A \sum_{i=1}^{n/2} \frac{s_{M_i}}{n/2} + (1 - w_A) s_S \approx w_A \theta + (1 - w_A) s_S,$$
$$w_B \sum_{i=n/2+1}^{n} \frac{s_{M_i}}{n/2} + (1 - w_B) s_S \approx w_B \theta + (1 - w_B) s_S.$$
Then, since $w_A > w_B$, the public knows two distinct linear combinations of $\theta$ and the confound coming from the source's signal error. The parameter $\theta$ is identified from these. So the member of the public can form a very precise estimate of $\theta$, and this implies that the Bayesian estimate of $\theta$ must be very precise as $n$ becomes large. The key force is that the two groups of media outlets give different mixes of the source's bias and the state, and by understanding this, the public can infer both.

This illustration has a number of unrealistic features: one-directional links; no communication among the media outlets or public; only one round of updating for each agent; and a particular sequencing of these rounds. These features are important to the illustration. But, crucially, the intuition of how diversity affects inference plays a central role in our dynamic model. There, quite generally, diversity of signal endowments allows agents to concentrate on new developments in the state while filtering out old, less relevant information, analogously to how they filtered out the confounding source bias in this example.
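As a rough numerical companion to this illustration, the following sketch (ours, not from the paper; the variances and the number of outlets are arbitrary choices) simulates the two-group case and recovers $\theta$ from the two group averages while the naive average stays biased toward the source's error.

```python
# Sketch of the diversity example: a single source signal confounds every outlet's
# estimate, but with two outlet precisions the public can difference the confound out.
# All parameter values here are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(1)
n, theta = 10_000, 0.0
sigma_S, sigma_A, sigma_B = 1.0, 1.0, 2.0            # standard deviations of signal errors

s_S = theta + rng.normal(scale=sigma_S)               # the source's (single) signal draw
# Bayesian weight an outlet puts on its own signal versus the source's announcement
w_A = sigma_S**2 / (sigma_S**2 + sigma_A**2)
w_B = sigma_S**2 / (sigma_S**2 + sigma_B**2)

s_A = theta + rng.normal(scale=sigma_A, size=n // 2)  # precise outlets' signals
s_B = theta + rng.normal(scale=sigma_B, size=n // 2)  # imprecise outlets' signals
avg_A = w_A * s_A.mean() + (1 - w_A) * s_S            # ~ w_A * theta + (1 - w_A) * s_S
avg_B = w_B * s_B.mean() + (1 - w_B) * s_S            # ~ w_B * theta + (1 - w_B) * s_S

# Two distinct linear combinations of (theta, s_S): invert them to identify theta.
M = np.array([[w_A, 1 - w_A],
              [w_B, 1 - w_B]])
theta_hat, s_S_hat = np.linalg.solve(M, np.array([avg_A, avg_B]))
print(theta_hat, (avg_A + avg_B) / 2)   # theta_hat is near 0; the naive average is pulled toward s_S
```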
Outline. Section 2 sets up the basic model and discusses its interpretation. Section 3 defines our equilibrium concept and shows that equilibria exist. In Section 4, we give our main results on the quality of learning and information aggregation. In Section 5, we discuss learning outcomes with naive agents and more generally without anti-imitation. Section 6 relates our model and results to the social learning literature. In Section 7, we
discuss structural estimation of our model, multidimensional states, and the role of signal structure.
2. Model
2.1. Description.
We describe the environment and game; complete details are formalized in Appendix A.
State of the world.
At each discrete instant (also called period) of time, $t \in \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, there is a state of the world, a random variable $\theta_t$ taking values in $\mathbb{R}$. This state evolves as an AR(1) stochastic process. That is,
$$\theta_{t+1} = \rho\theta_t + \nu_{t+1},$$
where $\rho$ is a constant with $0 < |\rho| \leq 1$ and the $\nu_{t+1} \sim \mathcal{N}(0, \sigma_\nu^2)$ are independent innovations. We can write explicitly
$$\theta_t = \sum_{\ell=0}^{\infty} \rho^\ell \nu_{t-\ell}, \quad \text{and thus} \quad \theta_t \sim \mathcal{N}\!\left(0, \frac{\sigma_\nu^2}{1-\rho^2}\right).$$
We make the normalization $\sigma_\nu^2 = 1$ throughout.
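A minimal simulation sketch of this state process (ours; the parameter values are illustrative, and $\sigma_\nu^2$ is normalized to one as above):

```python
# Simulate the AR(1) state theta_{t+1} = rho * theta_t + nu_{t+1} (innovation variance 1)
# and one node's private signals s_{i,t} = theta_t + eta_{i,t} with variance sigma_i^2.
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_i, T = 0.9, 1.5, 200                               # illustrative values

theta = np.empty(T)
theta[0] = rng.normal(scale=np.sqrt(1.0 / (1.0 - rho**2)))    # draw from the stationary law
for t in range(1, T):
    theta[t] = rho * theta[t - 1] + rng.normal()              # nu_{t+1} ~ N(0, 1)

signals = theta + rng.normal(scale=sigma_i, size=T)           # eta_{i,t} ~ N(0, sigma_i^2)
```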
Information and observations.

The set of nodes is $N = \{1, 2, \ldots, n\}$. Each node $i$ has a set $N_i \subseteq N$ of other nodes that $i$ can observe, called its neighborhood.

Each node is populated by a sequence of agents in overlapping generations. At each time $t$, there is a node-$i$ agent, labeled $(i, t)$, who takes that node's action $a_{i,t}$. When taking her action, the agent $(i, t)$ can observe the actions in her node's neighborhood in the $m$ periods leading up to her decision. That is, she observes $a_{j,t-\ell}$ for all nodes $j \in N_i$ and lags $\ell \in \{1, 2, \ldots, m\}$. (One interpretation is that the agent $(i, t)$ is born at time $t - m$ at a certain location (node) and has $m$ periods to observe the actions taken around her before she acts.) She also sees a private signal,
$$s_{i,t} = \theta_t + \eta_{i,t},$$
where $\eta_{i,t} \sim \mathcal{N}(0, \sigma_i^2)$ has a variance $\sigma_i^2 > 0$; the $\eta_{i,t}$ and $\nu_t$ are independent of each other. A vector of all of agent $(i, t)$'s observations—$s_{i,t}$ and the neighbors' past actions—defines her information. An important special case will be $m = 1$, where there is one period of memory, so that the agent's information is $(s_{i,t}, (a_{j,t-1})_{j \in N_i})$. The observation structure is common knowledge, as is the informational environment (i.e., all precisions, etc.). We will sometimes take the network
$G$ to mean the set of nodes $N$ together with the set of links $E$, defined as the subset of pairs $(i, j) \in N \times N$ such that $j \in N_i$.

Preferences and best responses.
As stated above, in each period $t$, agent $(i, t)$ at each node $i$ chooses an action $a_{i,t} \in \mathbb{R}$. Utility is given by
$$u_{i,t}(a_{i,t}) = -\mathbb{E}\left[(a_{i,t} - \theta_t)^2\right].$$
The agent makes the optimal choice for the current period given her information—i.e., does not seek to affect future actions. By a standard fact about squared-error loss functions, given the distribution of $(a_{N_i,t-\ell})_{\ell=1}^m$, she sets
$$(2.1)\qquad a_{i,t} = \mathbb{E}\left[\theta_t \mid s_{i,t}, (a_{N_i,t-\ell})_{\ell=1}^m\right].$$
Here the notation $a_{N_i,t}$ refers to the vector $(a_{j,t})_{j \in N_i}$. An action can be interpreted as an agent's estimate of the state, and we will sometimes use this terminology.

The conditional expectation (2.1) depends on the prior of agent $(i, t)$ about $\theta_t$, which can be any normal distribution or a uniform improper prior (in which case all of $i$'s beliefs about $\theta_t$ come from her own signal and her neighbors' actions). We take priors, like the information structure and network, to be common knowledge. In the rest of the paper, we formally analyze the case where all agents have improper priors. Because actions under a normal prior are related to actions under the improper prior by a simple bijection—and thus have the same information content for other agents—all results extend to the general case.
2.2. Interpretation.
The agents are fully Bayesian given the information they have access to. Much of our analysis is done for an arbitrary finite $m$; we view the restriction to finite memory as an assumption that avoids technical complications, but because $m$ can be arbitrarily large, this restriction has little substantive content. The model generalizes "Bayesian without Recall" agents from the engineering and computer science literature (e.g., Rahimian and Jadbabaie, 2017), which, within our notation, is the case of $m = 1$. Even when $m$ is small, observed actions will indirectly incorporate signals from further in the past, and so they can convey a great deal of information.

Note that an agent does not have access to the past private signals observed either at her own node or at neighboring ones. This is not a critical choice—our main results are robust to changing this assumption—but it is worth explaining.

In Section 2.2 we discuss this assumption and how it relates to applications.

With $0 < \rho < 1$, one natural choice for a prior is the stationary distribution of the state.
Figure 2.1. An illustration of the overlapping generations structure of the model for $m = 2$. At time $t - 1$, agent $(i, t)$ is born and observes estimates from time $t - 2$. At time $t$, agent $(i, t)$ observes estimates from $t - 1$ and her private signal $s_{i,t}$, and submits her estimate $a_{i,t}$.
Whereas $a_{i,t}$ is an observable choice, such as a published evaluation of an asset or a mix of inputs actually used by an agent in production, the private signals are not shareable.

Finally, our agents act once and do not consider future payoffs, which shuts down the possibility that they may distort reports to manipulate the future path of social learning for their successors' benefit. Equivalently, we could simply assume that agents sincerely announce their subjective expectations of the state, as in Geanakoplos and Polemarchakis (1982) and the literature following it. For discussions of this type of assumption in social learning models, and ways to relax it, see, for instance, Mueller-Frank (2013) and Mossel et al. (2015).

We discuss extensions of the basic model in various directions in Section 7.
3. Equilibrium
In this section we present the substance of our notion of equilibrium and the basic existence result.
Though we model the signals for convenience as real numbers, a more realistic interpretation of these is an aggregation of all of an agent's experiences, impressions, etc., and these may be difficult to summarize or convey.

Because time in this game is doubly infinite, there are some subtleties in definitions, which are dealt with in Appendix A.
3.1. Equilibrium in linear strategies.
A strategy of an agent is linear if the action taken is a linear function of the variables in her information set. We will focus on stationary equilibria in linear strategies—ones in which all agents' strategies are linear with time-invariant coefficients—though, of course, we will allow agents to consider deviating at each time to arbitrary strategies, including non-linear ones. Once we establish the existence of such equilibria, we will refer to them simply as equilibria for the rest of the paper.

We first argue that in studying agents' best responses to stationary linear strategies, we may restrict attention to linear strategies. If linear strategies have been played up to time $t$, we can express each action up until time $t$ as a weighted summation of past signals. Because all innovations $\nu_t$ and signal errors $\eta_{i,t}$ are independent and Gaussian, it follows that the joint distribution of any finite random vector of the past errors $(a_{i,t-\ell'} - \theta_t)_{i \in N,\, \ell' \geq 1}$ is multivariate Gaussian. Thus, $\mathbb{E}[\theta_t \mid s_{i,t}, (a_{N_i,t-\ell})_{\ell=1}^m]$ is a linear function of $s_{i,t}$ and $(a_{N_i,t-\ell})_{\ell=1}^m$ (see (3.1) below for details). It follows that solving for equilibrium can be reduced to solving for the weights agents place on the variables in their information sets.

A reason for focusing on equilibria in linear strategies comes from considering how agents would behave in a variant of the model where time began at $t = 0$. In that first period, agents would see only their own signals, and therefore play linear strategies; after that, inductively applying the argument in the previous paragraph shows that strategies would be linear at all future times. This is the thought experiment that motivates our focus on linear strategies; taking time to extend infinitely backward is an idealization that allows us to focus on exactly stationary behavior.

3.2. Covariance matrices.
The optimal weights for an agent to place on her sources of information depend on the precisions and covariances of these sources, and so we now study these.

Given a linear strategy profile played up until time $t$, let $V_t$ be the $nm \times nm$ covariance matrix of the vector $(\rho^\ell a_{i,t-\ell} - \theta_t)_{i \in N,\, 0 \leq \ell \leq m-1}$. The entries of this vector are the differences between the best predictors of $\theta_t$ based on actions $a_{i,t-\ell}$ during the past $m$ periods and the current state of the world. (In the case $m = 1$, this is simply the covariance matrix $V_t = \mathrm{Cov}(a_{i,t} - \theta_t)$.) The matrix $V_t$ records covariances of action errors: diagonal entries measure the accuracy of each action, while off-diagonal entries indicate how correlated the two agents' action errors are. The entries of $V_t$ are denoted by $V_{ij,t}$.

3.3. Best-response weights.
A strategy profile is an equilibrium if the weights each agent places on the variables in her information set minimize her posterior variance.
We now characterize these in terms of the covariance matrices we have defined. Consider an agent at time $t$, and suppose some linear strategy profile has been played up until time $t$. Let $V_{N_i,t-1}$ be a sub-matrix of $V_{t-1}$ that contains only the rows and columns corresponding to neighbors of $i$, and let
$$C_{i,t-1} = \begin{pmatrix} V_{N_i,t-1} & \mathbf{0} \\ \mathbf{0} & \sigma_i^2 \end{pmatrix}.$$
Conditional on observations $(a_{N_i,t-\ell})_{\ell=1}^m$ and $s_{i,t}$, the state $\theta_t$ is normally distributed with mean
$$(3.1)\qquad \frac{\mathbf{1}^{\mathsf{T}} C_{i,t-1}^{-1}}{\mathbf{1}^{\mathsf{T}} C_{i,t-1}^{-1} \mathbf{1}} \begin{pmatrix} \rho\, a_{N_i,t-1} \\ \vdots \\ \rho^m a_{N_i,t-m} \\ s_{i,t} \end{pmatrix}$$
(see Example 4.4 of Kay (1993)). This gives $\mathbb{E}[\theta_t \mid s_{i,t}, (a_{N_i,t-\ell})_{\ell=1}^m]$ (recall that this is the $a_{i,t}$ the agent will play). Expression (3.1) is a linear combination of the agent's signal and the observed actions; the coefficients in this linear combination depend on the matrix $V_{t-1}$ (but not on realizations of any random variables). In (3.1) we use our assumption of an improper prior.

We denote by $(W_t, w_t^s)$ a weight profile in period $t$, with $w_t^s \in \mathbb{R}^n$ being the weights agents place on their private signals and $W_t$ recording the weights they place on their other information. When $m = 1$, we refer to the weight agent $i$ places on $a_{j,t-1}$ (agent $j$'s action yesterday) as $W_{ij,t}$ and the weight on $s_{i,t}$, her private signal, as $w_{i,t}^s$.

In view of the formula (3.1) for the optimal weights, we can compute the resulting next-period covariance matrix $V_t$ from the previous covariance matrix. This defines a map $\Phi: \mathcal{V} \to \mathcal{V}$, given by
$$(3.2)\qquad \Phi: V_{t-1} \mapsto V_t,$$
which we study in characterizing equilibria.

Explicitly, $V_{N_i,t-1}$ collects the covariances of $(\rho^\ell a_{j,t-\ell} - \theta_t)$ for all $j \in N_i$ and $\ell \in \{1, \ldots, m\}$.

As we have mentioned, this is for convenience and without loss of generality. Our analysis applies equally to any proper normal prior for $\theta_t$: to get an agent's estimate of $\theta_t$, the formula in (3.1) would simply be averaged with a constant term accounting for the prior, and everyone could invert this deterministic operation to recover the same information from others' actions.

We do not need to describe the indexing of coefficients in $W_t$ explicitly in general; this would be a bit cumbersome because there are weights on actions at various lags.
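The following sketch (ours) spells out the weight computation in (3.1) under the improper prior: stacking the rescaled past actions and the private signal as unbiased estimates of $\theta_t$ with error covariance $C_{i,t-1}$, the optimal weights are $C_{i,t-1}^{-1}\mathbf{1} / (\mathbf{1}^{\mathsf{T}} C_{i,t-1}^{-1}\mathbf{1})$.

```python
# Best-response weights as in (3.1): precision weighting of unbiased estimates of theta_t.
import numpy as np

def best_response_weights(C):
    """C: error covariance matrix of the stacked observations (neighbors' rescaled past
    actions and the agent's own signal), each an unbiased estimate of theta_t.
    Returns weights that sum to one; the resulting error variance is 1 / (1' C^{-1} 1)."""
    ones = np.ones(C.shape[0])
    u = np.linalg.solve(C, ones)      # C^{-1} 1
    return u / (ones @ u)
```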
3.4. Equilibrium existence.
Consider the map $\Phi$ defined in (3.2). Stationary equilibria in linear strategies correspond to fixed points of the map $\Phi$. Our first result concerns the existence of equilibrium:
Proposition 1.
A stationary equilibrium in linear strategies exists, and is associated with a covariance matrix $\widehat{V}$ such that $\Phi(\widehat{V}) = \widehat{V}$.

The proof appears in Appendix B. At the stationary equilibrium, the covariance matrix and all agent strategies are time-invariant. Actions are linear combinations of observations with stationary weights (which we refer to as $\widehat{W}_{ij}$ and $\widehat{w}_i^s$). The form of these rules has some resemblance to static equilibrium notions studied in the rational expectations literature (e.g., Vives (1993); Babus and Kondor (2018); Lambert, Ostrovsky, and Panov (2018); Mossel, Mueller-Frank, Sly, and Tamuz (2018)), but here we explicitly examine the dynamic environment in which these emerge as steady states. We discuss the relationship between our model and DeGroot learning, which has a related form, in Section 6.

The idea of the argument is as follows. The goal is to apply the Brouwer fixed-point theorem to show there is a covariance matrix $\widehat{V}$ that remains unchanged under updating. To find a compact set to which we can apply the fixed-point theorem, we use the fact that when agents best respond to any beliefs about prior actions, all variances are bounded above and bounded away from zero below. This is because all agents' actions must be at least as precise in estimating $\theta_t$ as their private signals, and cannot be more precise than estimates given perfect knowledge of yesterday's state combined with the private signal. Because the Cauchy-Schwarz inequality bounds covariances in terms of the corresponding variances, it follows that there is a compact set containing the image of $\Phi$. This, along with the continuity of $\Phi$, allows us to apply the Brouwer fixed-point theorem.
Example 1.
In the case of $m = 1$, we can write out the map $\Phi$ explicitly, which yields the fixed-point condition for the equilibrium variances and covariances $\widehat{V}$:
$$(3.3)\qquad \widehat{V}_{ii} = (\widehat{w}_i^s)^2 \sigma_i^2 + \sum_{k,k'} \widehat{W}_{ik} \widehat{W}_{ik'} (\rho^2 \widehat{V}_{kk'} + 1) \quad \text{and} \quad \widehat{V}_{ij} = \sum_{k,k'} \widehat{W}_{ik} \widehat{W}_{jk'} (\rho^2 \widehat{V}_{kk'} + 1) \quad \text{for } i \neq j.$$
More generally, for any $m$, we can obtain a formula in terms of $\widehat{V}$ for the weights $\widehat{W}_{ij}$ and $\widehat{w}_i^s$ in the best response to $\widehat{V}$, in order to write the equilibrium $\widehat{V}_{ij}$ as the solutions to a system of polynomial equations. These equations have large degree and cannot be solved analytically except in very simple cases, but they can readily be used to solve for equilibria numerically.

The main insight is that we can find equilibria by studying action covariances; this idea applies equally to many extensions of our model. We give two examples: (1) We assume that agents observe neighbors perfectly, but one could define other observation structures. For instance, agents could observe actions with noise, or they could observe some set of linear combinations of neighbors' actions with noise. (2) We assume agents are Bayesian and best-respond rationally to the distribution of actions, but the same proof would also show that equilibria exist under other behavioral rules.

We show later, as part of Proposition 2, that there is a unique stationary linear equilibrium in networks having a particular structure. In general, uniqueness of the equilibrium is an open question that we leave for future work; our efforts to use standard approaches for proving uniqueness have run into obstacles. Nevertheless, in computing equilibria numerically for many examples, we have not been able to find a case of equilibrium multiplicity (see Section 4.5 for more on numerical results).

We now briefly touch on how agents could come to play the strategies posited above. If other agents are using stationary equilibrium strategies, then best-responding is easy to do under some conditions. For instance, if historical empirical data on neighbors' error variances and covariances are available (i.e., the entries of the matrix $V_{N_i,t}$ discussed in Section 3.3), then the agent needs only to use these to compute a best estimate of $\theta_{t-1}$, which is essentially a linear regression problem.

More generally, one can use this map to study how covariances of actions evolve given any initial distribution of play. Note that the map $\Phi$ is deterministic, so we can study this evolution without considering the particular realizations of signals.

When $m = 1$, the proof gives bounds $\widehat{V}_{ii} \in [(\sigma_i^{-2} + 1)^{-1}, \sigma_i^2]$ on equilibrium variances and $\widehat{V}_{ij} \in [-\sigma_i \sigma_j, \sigma_i \sigma_j]$ on equilibrium covariances.
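To make the numerical procedure concrete, here is a sketch (our own implementation, following the logic of (3.1) and (3.3) for the $m = 1$ case under the normalization $\sigma_\nu^2 = 1$; it is not the authors' code) that looks for a stationary covariance matrix by repeatedly applying $\Phi$.

```python
# Sketch: compute best responses to a covariance matrix V and iterate the map Phi (m = 1).
import numpy as np

def phi(V, A, sigma2, rho):
    """One application of Phi. V: current n x n covariance of action errors a_{j,t-1} - theta_{t-1};
    A[i, j] = 1 if node i observes node j; sigma2: array of private signal variances."""
    n = V.shape[0]
    S = rho**2 * V + 1.0                      # error covariance of rho * a_{j,t-1} about theta_t
    W = np.zeros((n, n))                      # weights on neighbors' (rescaled) past actions
    ws = np.zeros(n)                          # weights on own private signals
    for i in range(n):
        nbrs = np.flatnonzero(A[i])
        k = len(nbrs)
        C = np.zeros((k + 1, k + 1))
        C[:k, :k] = S[np.ix_(nbrs, nbrs)]
        C[k, k] = sigma2[i]                   # own-signal error is independent of the rest
        u = np.linalg.solve(C, np.ones(k + 1))
        w = u / u.sum()                       # best-response weights, as in (3.1)
        W[i, nbrs], ws[i] = w[:k], w[k]
    return W @ S @ W.T + np.diag(ws**2 * sigma2)   # next-period error covariance, as in (3.3)

def stationary_covariance(A, sigma2, rho, iters=500):
    """Iterate Phi from 'signals only' play until (numerically) stationary."""
    V = np.diag(sigma2).astype(float)
    for _ in range(iters):
        V = phi(V, A, sigma2, rho)
    return V
```

This mirrors the procedure the authors describe of repeatedly applying $\Phi$ to an initial covariance matrix.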
4. When is there fast information aggregation in large networks?

In this section, we consider the quality of learning outcomes in equilibrium, and when good diffusion can be achieved. To make this exercise precise, we first define a benchmark of good information aggregation. Our main results show that, under certain conditions on the distribution of signals in the population, this benchmark can actually be achieved in a class of large networks. Indeed, we will show that if signals are distributed in a suitably heterogeneous way across the population, the aggregation benchmark is achieved robustly. We also show this result is tight, in the sense that without such signal diversity, aggregation can fail.

Indeed, we have used numerical solutions to study the system and to conjecture many of our results. In practice, a fixed point $\widehat{V}$ is found by repeatedly applying $\Phi$, as written in (3.3), to an initial covariance matrix. In all our experiments, the same fixed point has been found, independent of starting conditions.

What is important in the proof is that actions depend continuously on the covariance structure of an agent's observations; the action variances are uniformly bounded under the rule agents play; and there is a decaying dependence of behavior on the very distant past.

We have checked numerically that $\Phi$ is not, in general, a contraction in any of the usual norms (entrywise sup, Euclidean operator norm, etc.), nor does it seem clear how to prove uniqueness by defining a Lyapunov function.
4.1. The aggregation benchmark.
Because agents cannot learn a moving state exactly, we must define what it means for agents to learn well. Our benchmark is the expected payoff that an agent would obtain given her private signal and perfect knowledge of the state in the previous period. (The state in the previous period is the maximum that an agent can hope to learn from neighbors' information, since social information arrives with a one-period delay.) Let $V_{ii}^{\text{benchmark}}$ be the error variance that player $i$ achieves at this benchmark: namely, $V_{ii}^{\text{benchmark}} = (\sigma_i^{-2} + 1)^{-1}$.

Definition 1.
An equilibrium achieves the $\varepsilon$-perfect aggregation benchmark if, for all $i$,
$$\frac{\widehat{V}_{ii}}{V_{ii}^{\text{benchmark}}} \leq 1 + \varepsilon.$$
This says that all agents do nearly as well as if each knew her private signal and yesterday's state. Note agents can never infer yesterday's state perfectly from observed actions in any finite network, and so we must have $\widehat{V}_{ii}/V_{ii}^{\text{benchmark}} > 1$ for each $i$ on any fixed network. We give conditions under which $\varepsilon$-perfect aggregation is achieved for any $\varepsilon > 0$. We fix $\rho$ and consider a sequence of networks $(G_n)_{n=1}^{\infty}$, where $G_n$ has $n$ nodes.
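For concreteness, the small check below (ours; it reuses the hypothetical `stationary_covariance` sketch from Section 3 and arbitrary parameter choices) computes the left-hand side of this definition on a complete network with two signal types.

```python
# Check the epsilon-perfect aggregation criterion: max_i V_hat_ii / V_benchmark_ii - 1.
import numpy as np

A = np.ones((50, 50)) - np.eye(50)          # complete network on 50 nodes
sigma2 = np.array([2.0, 0.5] * 25)          # two signal types, alternating (illustrative)
V_hat = stationary_covariance(A, sigma2, rho=0.9)
benchmark = 1.0 / (1.0 / sigma2 + 1.0)      # (sigma_i^{-2} + 1)^{-1}
print(np.max(np.diag(V_hat) / benchmark - 1.0))   # small when aggregation is near-perfect
```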
Example 2. We use a very simple example to demonstrate that the $\varepsilon$-perfect aggregation benchmark can be achieved for any $\varepsilon > 0$. Suppose that in each $G_n$ (for $n \geq 2$) agent 1 observes agent 2, and that the signal variances are $\sigma_1^2 = 1$ and $\sigma_2^2 = 1/n$. Then agent 2's weight on her own signal converges to 1 as $n \to \infty$. So $\widehat{V}_{11}$ converges to $V_{11}^{\text{benchmark}} = (\sigma_1^{-2} + 1)^{-1} = 1/2$ as $n \to \infty$. Thus, the learning benchmark is achieved.

The environment we have devised in this example is quite special: agent 1 can essentially infer last period's state because someone else has arbitrarily precise information. A much more interesting question is whether anything similar can occur without anyone having extremely precise signals. In the next section, we address this and show that perfect aggregation can be achieved by all agents simultaneously even without anyone having very precise signals.

4.2. Distributions of networks and signals.
To study learning in large populations, we specify two aspects of the environment: network distributions and signal distributions.
In terms of network distributions, we define a stochastic model that makes the analysis of large networks tractable, but is flexible in that it allows us to encode rich heterogeneity in network positions. We also specify signal distributions: how signal precisions are allocated to agents, in a way that may depend on network position. We now describe these two primitives of the model and make some assumptions that are maintained in Section 4.

Fix a set of network types $k \in \mathcal{K} = \{1, 2, \ldots, K\}$. There is a probability $p_{kk'}$ for each pair of network types, which is the probability that an agent of network type $k$ has a link to a given agent of network type $k'$. An assumption we maintain on these probabilities is that each network type $k$ observes at least one network type (possibly $k$ itself) with positive probability. There is also a vector $(\alpha_1, \ldots, \alpha_K)$ of population shares of each type, which we assume are all positive. Jointly, $(p_{kk'})_{k,k' \in \mathcal{K}}$ and $\alpha$ specify the network distribution. These parameters can encode differences in expected degree and also features such as homophily (where some groups of types are linked to each other more densely than to others).

We next define signal distributions, which describe the allocation of signal variances to network types. Fix a finite set $\mathcal{S}$ of private signal variances, which we call signal types. We let $q_{k\tau}$ be the share of agents of network type $k$ with signal type $\tau$; $(q_{k\tau})_{k \in \mathcal{K}, \tau \in \mathcal{S}}$ defines the signal distribution.

Generating the networks.
Let the nodes in network $n$ be a disjoint union of sets $N_n^1, N_n^2, \ldots, N_n^K$, with the cardinality $|N_n^k|$ equal to $\lfloor \alpha_k n \rfloor$ or $\lceil \alpha_k n \rceil$ (rounding so that there are $n$ agents in the network). We (deterministically) set the signal variances $\sigma_i^2$ equal to elements of $\mathcal{S}$ in accordance with the signal shares (again rounding as needed). Let $(G_n)_{n=1}^{\infty}$ be a sequence of undirected random networks with these nodes, so that $i \in N_n^k$ and $j \in N_n^{k'}$ are linked with probability $p_{kk'}$; these realizations are all independent.

Diversity of signals.
The environment is described by the linking probabilities $(p_{kk'})_{k,k' \in \mathcal{K}}$, the type shares $\alpha$, and the signal distribution $(q_{k\tau})_{k \in \mathcal{K}, \tau \in \mathcal{S}}$. We say that a signal type $\tau$ is represented in a network type $k$ if $q_{k\tau} > 0$.

Definition 2.
We say that the environment satisfies signal diversity if at least two distinct signal types are represented in each network type.

We will study environments that satisfy this condition as well as ones that do not, and show that it is pivotal for information aggregation.

This type of network is known as a stochastic block model (Holland et al., 1983).

The assumptions of finitely many signal types and network types are purely technical, and could be relaxed.
4.3. Diverse signals.
Our first main result is that signal diversity is sufficient for good aggregation. Under this condition, the benchmark is achieved independently of the structural properties of the network.

We say an event occurs asymptotically almost surely if, for any $\varepsilon > 0$, the event occurs with probability at least $1 - \varepsilon$ for $n$ sufficiently large.

Theorem 1.
Let $\varepsilon > 0$. If an environment satisfies signal diversity, asymptotically almost surely $G_n$ has an equilibrium where the $\varepsilon$-perfect aggregation benchmark is achieved.

So on large networks, society is very likely to aggregate information as well as possible. The uncertainty in this statement is over the network, as there is always a small probability of a realized network which obstructs learning (e.g., an agent has no neighbors). We give an outline of the argument next, and the proof appears in Appendix C.
Outline of the argument.
To give intuition for the result, we first describe why the theorem holds on the complete network in the $m = 1$ case. This echoes the intuition of the example in the introduction. We then discuss the challenges involved in generalizing the result to our general stochastic block model networks, and the techniques we use to overcome those challenges.

Consider a time-$t$ agent, $(i, t)$. We define her social signal $r_{i,t}$ to be the optimal estimate of $\theta_{t-1}$ based on the actions she has observed in her neighborhood. On the complete network, all players have the same social signal, which we call $r_t$. At any equilibrium, each agent's action is a weighted average of her private signal and this social signal:
$$(4.1)\qquad a_{i,t} = \widehat{w}_i^s s_{i,t} + (1 - \widehat{w}_i^s)\, r_t.$$
The weight $\widehat{w}_i^s$ depends only on the precision of agent $i$'s signal. We call the weights of two distinct signal types $\widehat{w}_A^s$ and $\widehat{w}_B^s$.

Now observe that each time-$(t+1)$ agent can average the time-$t$ actions of each type, which can be written as follows using (4.1) and $s_{i,t} = \theta_t + \eta_{i,t}$:
$$\frac{1}{n_A} \sum_{i : \sigma_i^2 = \sigma_A^2} a_{i,t} = \widehat{w}_A^s \theta_t + (1 - \widehat{w}_A^s)\, r_t + O(n^{-1/2}),$$
$$\frac{1}{n_B} \sum_{i : \sigma_i^2 = \sigma_B^2} a_{i,t} = \widehat{w}_B^s \theta_t + (1 - \widehat{w}_B^s)\, r_t + O(n^{-1/2}).$$
Here $n_A$ and $n_B$ denote the numbers of agents of each type, and the $O(n^{-1/2})$ error terms come from averaging the signal noises $\eta_{i,t}$ of agents in each group. In other words, by the law of large numbers, each time-$(t+1)$ agent can obtain precise estimates of two different convex combinations of $\theta_t$ and $r_t$. Because the two weights, $\widehat{w}_A^s$ and $\widehat{w}_B^s$, are distinct, she can approximately solve for $\theta_t$ as a linear combination of the average actions from each type (up to signal error). It follows that in the equilibrium we are considering, the agent must have an estimate at least as precise as what she can obtain by the strategy we have described, and will thus be very close to the benchmark. The estimator of $\theta_t$ that this strategy gives will place negative weight on the average action of one of the two signal types, thus anti-imitating those agents. It can be shown that the equilibrium we construct in which agents learn will also have agents anti-imitating others.

To use the same approach in general, we need to show that each individual observes a large number of neighbors of each signal type with similar social signals. More precisely, the proof shows that agents with the same network type have highly correlated social signals. This is not easy, because the social signals at an equilibrium are endogenous, and in a general network will depend to some extent on many details of the network.

A key insight allowing us to overcome this difficulty and get a handle on social signals is that the number of paths of length two between any two agents is nearly deterministic in our random graph model. While any two agents of the same network type may have very different neighborhoods, their connections at distance two will typically look very similar. (In fact, "length two" is not essential: in sparser random graphs, this statement holds with a different length, and the same arguments go through, as we discuss below.)

Note this is a special case of the stochastic block model.

In particular, agent $(i, t)$ sees everyone's past action, including $a_{i,t-1}$.

Agent $i$'s weights on her observations $s_{i,t}$ and $\rho a_{j,t-1}$ sum to 1, because the optimal action is an unbiased estimate of $\theta_t$.
This gives us a nice expression for the social signal as a combination of private signals and the social signals from two periods earlier. Using this expression, we show that if agents of the same network type have similar social signals two periods ago, the same will hold in the current period. We use this to show that $\Phi$ maps the neighborhood of covariance matrices where all social signals are close to perfect to itself, and then we apply a fixed point theorem. $\square$

We have assumed that agents know the signal types of their neighbors exactly, but this assumption could be relaxed. For example, if each agent were instead to receive only a noisy signal about each of her neighbors' signal types, she could solve her estimation problem in a similar way. By conditioning on the observable correlate of signal type, an agent could form enough distinct linear combinations reflecting the previous state and older social signals to
form a precise estimate of the previous state, thus achieving the benchmark. Of course, in finite populations the precision of this inference would depend on the details.

Concerning the rate of learning as $n$ grows, the proof implies that, under the assumptions of the theorem, the error in agents' estimates of $\theta_{t-1}$ is $O(n^{-1/2})$; thus they learn at the same rate as in the central limit theorem, though the constants will depend considerably on the network. Section 4.5 offers numerical evidence on the quality of aggregation in networks of practically relevant sizes.

Sparser random graphs.
The random graphs we defined have held $(p_{kk'})$ fixed for simplicity. This yields expected degrees that grow linearly in the population size, which may not be the desired asymptotic model. While it is important to have neighborhoods "large enough" (i.e., growing in $n$) to permit the application of laws of large numbers, their rate of growth can be considerably slower than linear: for example, our proof extends directly to degrees that scale as $n^\alpha$ for any $\alpha > 0$. Instead of studying $\Phi$ and second-order neighborhoods, we apply the same analysis to $\Phi^k$ and $k$th-order neighborhoods for $k$ larger than $1/\alpha$.

4.4. Non-diverse signals.
So far we have seen that good aggregation can be obtained robustly across network distributions assuming signal diversity. We now show this result is tight: without signal diversity, there are environments in which equilibrium aggregation is much worse.

To gain an intuition for this, note that it is essential to the argument from the previous subsection that different agents have different signal precisions. Recall the complete graph case we examined in our outline of the argument. From the perspective of an agent $(i, t+1)$, the fact that type A and type B neighbors place different weights on the social signal $r_t$ allows $i$ to prevent the social signal used by her neighbors from confounding her estimate of $\theta_t$. We now show that without diversity in signal quality, information aggregation may be much worse.

We first study a class of networks with certain symmetries and show that for this class there is a unique equilibrium, and at this equilibrium good aggregation is not achieved. We then present a corollary of this result showing that improving some agents' signals can hurt learning, which distinguishes this regime not only in terms of its outcomes but also in its comparative statics.

Finally, we show that outside that class of networks, there exists a similar equilibrium without good aggregation on random graphs. This shows that some variation in network positions need not give individuals enough power to identify the state, as they can under signal diversity.
4.4.1. Graphs with symmetric neighbors.
Definition 3.
A network $G$ has symmetric neighbors if $N_j = N_{j'}$ for any $j, j' \in N_i$.

In the undirected case, the graphs with symmetric neighbors are the complete network and complete bipartite networks. For directed graphs, the condition allows a larger variety of networks.

Consider a sequence $(G_n)_{n=1}^{\infty}$ of strongly connected graphs with symmetric neighbors. Assume that all signal qualities are the same, equal to $\sigma^2$, and that $m = 1$.

Proposition 2.
Under the assumptions in the previous paragraph, each $G_n$ has a unique equilibrium. There exists $\varepsilon > 0$ such that the $\varepsilon$-perfect aggregation benchmark is not achieved at this equilibrium for any $n$.

All agents are bounded away from our learning benchmark at the unique equilibrium. So all agents learn poorly compared to the diverse signals case. The proof of this proposition, and the proofs of all subsequent results, appear in Appendix D.

This immediately implies that in environments not satisfying signal diversity, there are network distributions for which aggregation fails in a strong sense.

This failure of good aggregation is not due simply to a lack of sufficient information in the environment: on the complete graph with exchangeable (i.e., non-diverse) signals, a social planner who exogenously set weights for all agents could achieve $\varepsilon$-perfect aggregation for any $\varepsilon > 0$ when $n$ is large. See Appendix F for a formal statement, proof, and numerical results.

These are both special cases of our stochastic block model from Section 4.3.

We thank Alireza Tahbaz-Salehi for suggesting this analysis.

The proof of the proposition establishes uniqueness by showing that $\Phi$ is a contraction in a suitable sense.

We now give intuition for Proposition 2. In a graph with symmetric neighbors, in the unique equilibrium, the actions of any agent's neighbors are exchangeable. So actions must be unweighted averages of observations. This prevents the sort of inference of $\theta_t$ that occurred with diverse signals. This is easiest to see on the complete graph, where all observations are exchangeable. So, in any equilibrium, each agent's action at time $t+1$ is equal to a weighted average of his own signal and $\frac{1}{n}\sum_{j \in N_i} a_{j,t}$:
$$(4.2)\qquad a_{i,t+1} = \widehat{w}_i^s s_{i,t+1} + (1 - \widehat{w}_i^s)\, \frac{1}{n} \sum_{j \in N_i} a_{j,t}.$$
By iteratively using this equation, we can see that actions must place substantial weight on the average of signals from, e.g., two periods ago. Although the effect of signal errors
$\eta_{i,t}$ vanishes as $n$ grows large, the correlated error from past changes in the state $\nu_t$ never "washes out" of estimates, and this is what prevents perfect aggregation.

We can also explicitly characterize the limit action variances and covariances. Consider again the complete graph and the (unique) symmetric equilibrium. Let $V_\infty$ denote the limit, as $n$ grows large, of the variance of any agent's error $(a_{i,t} - \theta_t)$. Let $\mathrm{Cov}_\infty$ denote the limit covariance of any two agents' errors. By direct computations, these can be seen to be related by the following equations, which have a unique solution:
$$(4.3)\qquad V_\infty = \frac{1}{\sigma^{-2} + (\rho^2 \mathrm{Cov}_\infty + 1)^{-1}}, \qquad \mathrm{Cov}_\infty = \frac{(\rho^2 \mathrm{Cov}_\infty + 1)^{-1}}{\left[\sigma^{-2} + (\rho^2 \mathrm{Cov}_\infty + 1)^{-1}\right]^2}.$$
This variance and covariance describe behavior not only in the complete graph, but in any graph with symmetric neighbors where degrees tend uniformly to $\infty$. In such graphs, too, the variances of all agents converge to $V_\infty$ and the covariances of all pairs of agents converge to $\mathrm{Cov}_\infty$ as $n \to \infty$. This implies that, in large graphs, the equilibrium action distributions are close to symmetric. Indeed, it can be deduced that these actions are equal to an appropriately discounted sum of past $\theta_{t-\ell}$, up to error terms (arising from $\eta_{i,t-\ell}$) that vanish asymptotically.
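A small sketch (ours) that solves the system (4.3), as reconstructed above, by fixed-point iteration; the parameter values are illustrative.

```python
# Solve for the limit variance and covariance on the complete graph with one signal type.
def limit_variance(sigma2, rho, iters=10_000):
    cov = 0.0
    v = sigma2
    for _ in range(iters):
        prec_social = 1.0 / (rho**2 * cov + 1.0)   # precision of the (common) social signal
        total_prec = 1.0 / sigma2 + prec_social
        v = 1.0 / total_prec                        # each agent's error variance
        cov = prec_social / total_prec**2           # covariance of two agents' errors
    return v, cov

v_inf, cov_inf = limit_variance(sigma2=2.0, rho=0.9)
print(v_inf, 1.0 / (1.0 / 2.0 + 1.0))               # v_inf stays well above the benchmark 2/3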
4.4.2. A corollary: Perverse consequences of improving signals. As a consequence of Theorem 1 and Proposition 2, we can give an example where making one agent's private information less precise helps all agents.
Corollary 1.
There exists a network $G$ and an agent $i \in G$ such that increasing $\sigma_i^2$ gives a Pareto improvement in equilibrium variances.

To prove the corollary, we consider the complete graph with homogeneous signals and $n$ large. By Proposition 2, all agents do substantially worse than perfect aggregation. If we instead give agent 1 a very uninformative signal, all players can anti-imitate agent 1 and achieve nearly perfect aggregation. When the signals at the initial configuration are sufficiently imprecise, this gives a Pareto improvement.

4.4.3. Non-diverse signals in large random graphs.
This is established by the same argument as in the proof of Proposition 3.

Our results on non-diverse signals have used graphs with symmetric neighbors. In those graphs, the unique prediction is that learning outcomes fall far short of the perfect aggregation benchmark. We would like to show that exact symmetry is not essential, and that the lack of good aggregation is robust to adding noise. To this end, we now show that in Erdős-Rényi random networks, there
is an equilibrium with essentially the same learning outcomes when signal precisions are homogeneous.

Let $(G_n)_{n=1}^{\infty}$ be a sequence of undirected random networks, with $G_n$ having $n$ nodes, with any pair of distinct nodes linked (i.i.d.) with positive probability $p$. We continue to assume all signal variances are equal to $\sigma^2$ and $m = 1$.

Proposition 3.
Under the assumptions in the previous paragraph, there exists $\varepsilon > 0$ such that asymptotically almost surely there is an equilibrium on $G_n$ where the $\varepsilon$-perfect aggregation benchmark is not achieved.

The equilibrium covariances in this equilibrium again converge to $V_\infty$ and $\mathrm{Cov}_\infty$ (for any value of $p$). Thus, when there is only one signal type, we obtain the same learning outcomes asymptotically on a variety of networks.

4.5. Aggregation and its absence without asymptotics: Numerical results.
The results presented in this section so far can be summarized as saying that, to achieve the aggregation benchmark of essentially knowing the previous period's state, there need to be at least two different private signal variances in the network. Formally, this is a knife-edge result: as long as private signal variances differ at all, then as $n \to \infty$, perfect aggregation is achieved; with exactly homogeneous signal endowments, agents' variances are much higher. In this section, we show numerically that for fixed values of $n$, the transition from the first regime to the second is actually gradual: action error remains well above the perfect aggregation benchmark when signal qualities differ slightly.

In Figure 4.1, we study the complete network with $\rho = 0.9$. The private signal variance of agents of signal type A is fixed at $\sigma_A^2 = 2$. We then vary the private signal variance of agents of type B (the horizontal axis), and compute the equilibrium variance of $a_{i,t} - \theta_t$ for agents of type A (plotted on the vertical axis). The variance of type A agents at the benchmark is $2/3$. We note several features: First, the change in aggregation quality is continuous, and indeed reasonably gradual, for $n$ in the hundreds as we vary $\sigma_B^2$. Second, as $n$ increases, we can see that the curve is moving toward the theoretical limit: a discontinuity at $\sigma_B^2 = 2$. Third, there are nevertheless considerable gains to increasing $n$, the number of agents: going from $n = 200$ to $n = 600$ results in a gain of 5.2% in precision when $\sigma_B^2 = 3$.

We repeatedly apply $\Phi$, as written in (3.3), to an initial covariance matrix. In doing full grid searches on many networks, we did not find any instances in which the fixed point found depended on the choice of initial point.

To examine whether the large network results above work in realistic networks with moderate degrees, we present numerical evidence based on the data in Banerjee, Chandrasekhar, Duflo, and Jackson (2013). This data set contains the social networks of villages in rural
Figure 4.1. Distinct Variances Result in Learning. (Vertical axis: Group A variance; horizontal axis: type B signal variance; σ_A² = 2; curves shown for n = 200, 400, 600, and one larger value.)
India. There are 43 networks in the data, with an average network size of 212 nodes (standard deviation = 53.5) and an average degree of 19 (standard deviation = 7.5). For each network, we calculated the equilibrium for two different situations. The first is the homogeneous case, with all signal variances set to 2. The second is a heterogeneous case, where a majority has the same signal distribution as in the first case, but a minority has a substantially worse signal. More precisely, we kept the signal variances of people who have access to electricity (92% of the nodes) at 2, while setting the signal variances of the rest at 5. (We take the networks that were used in the estimation in Banerjee, Chandrasekhar, Duflo, and Jackson (2013). As in their work, we take every reported relationship to be reciprocal for the purposes of sharing information; this makes the graphs undirected. We made the heterogeneous signals dependent on electricity status because we believe signal precision would in practice be correlated with, e.g., access to communication technology or similar attributes. In the figure, we plot outcomes of the nodes with access to electricity, i.e., those whose signal variances did not change in our exercise.)
In Figure 4.2(a), the green points show that in the vast majority of networks, the median agent in terms of learning quality has a lower error variance (i.e., more precise estimates of the state) in the heterogeneous case. Now consider an agent who is at the 25th percentile in terms of error variance (and thus estimates the state better than 75 percent of agents); the red points show that the advantage of the heterogeneous case becomes even more stark for these agents. In Figure 4.2(b), we pool all the agents together across all networks and depict the empirical distribution of error variance. In the homogeneous case (red histogram), there is bunching around the asymptotic variance for the homogeneous-signal case.
Figure 4.2. Prediction Variance in Indian Villages. (a) The error variance of the agent at the 25th, 50th, and 75th percentiles in each village, in the homogeneous and heterogeneous cases (heterogeneous-signal prediction variance plotted against homogeneous-signal prediction variance). (b) Histograms of error variance, pooling all agents across all networks, for the homogeneous (red) and heterogeneous (blue) cases. Vertical lines show the asymptotic variance for the complete graph as n → ∞ for the two cases.
When we introduce heterogeneity in signal quality (blue histogram), a substantial share of households have prediction variance below this boundary, thus benefiting from the heterogeneity. Overall, we see that even in networks with relatively small degree our qualitative results hold: adding heterogeneity helps learning in the population, even with a small group of agents with the new signal type.
5. The importance of understanding correlations
In the proof of our positive result on achieving the perfect aggregation benchmark (Theorem 1), a key aspect of the argument involved agents filtering out confounding information from their neighbors' estimates by optimally responding to the correlation structure of these estimates. In this section, we demonstrate that this sort of behavior is essential for nearly perfect aggregation, and that more naively imitative heuristics yield outcomes far from the benchmark. Empirical studies have found evidence (depending on the setting and the subjects) consistent with both Bayesian behavior and correlation neglect in the presence of correlated observations (e.g., Eyster, Rabin, and Weizsäcker (2015); Dasaratha and He (2017); Enke and Zimmermann (2019)).
We begin with a canonical model of agents who do not account for correlations among their neighbors' estimates conditional on the state, and show by example that naive agents achieve much worse learning than Bayesian agents, and thus fail to reach the perfect aggregation benchmark. We then formalize the idea that accounting for correlations in neighbors' actions is crucial to reaching the benchmark. This is done by demonstrating a general lack of asymptotic learning by naive agents, and the techniques extend easily to alternative behavioral specifications. Finally, we show that even in fixed, finite networks, any positive weights chosen by optimizing agents will be Pareto-dominated.
5.1. Naive agents.
In this part we introduce agents who misunderstand the distribution of the signals they are facing and who therefore do not update as Bayesians with a correct understanding of their environment. (A seminal paper studying boundedly rational learning rules in networks is Bala and Goyal (1998).) We consider a particular form of misspecification that simplifies solving for equilibria analytically:
Definition 4.
We call an agent naive if she believes that all neighbors choose actions equal to their private signals and maximizes her expected utility given these incorrect beliefs.
Equivalently, a naive agent believes her neighbors all have empty neighborhoods. This is the analogue, in our model, of "best-response trailing naive inference" (Eyster and Rabin, 2010). So naive agents understand that their neighbors' actions from the previous period are estimates of θ_{t−1}. But they think each such estimate is independent given the state, and that the precision of the estimate is equal to the signal precision of the corresponding agent.
In Figure 5.1, we compare Bayesian and naive learning outcomes. As in Figure 4.1, we consider a complete network where half of the agents have signal variance σ_A² = 3 and we vary the signal variance σ_B² of the remaining agents. We observe that naive agents learn substantially worse than rational agents, whether signals are diverse or not. Formal analysis and formulas for variances under naive learning can be found in Appendix E. (There are a number of possible variants of our behavioral assumption, and it is straightforward to numerically study alternative specifications of behavior in our model; Alatas, Banerjee, Chandrasekhar, Hanna, and Olken (2016) consider one such variant.)
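To make the naive rule concrete, the following sketch (our own illustrative code, not taken from the paper) implements one period of naive updating. It assumes the agent treats the stationary prior as diffuse and combines her social estimate and her private signal by Gaussian precision weighting; the function name and argument names are ours.

```python
import numpy as np

def naive_action(own_signal, neighbor_actions, neighbor_sigma2, own_sigma2, rho):
    """One period of naive updating, sketching Definition 4.

    The agent treats each neighbor's lagged action as if it were that
    neighbor's private signal about theta_{t-1}: unbiased, independent
    across neighbors, with variance equal to that neighbor's signal
    variance.  She precision-weights them into an estimate of theta_{t-1},
    scales by rho (adding the unit innovation variance) to obtain a belief
    about theta_t, and finally mixes in her own signal by precision
    weighting.  The order of these operations and the diffuse prior are
    assumptions of this sketch.
    """
    prec = 1.0 / np.asarray(neighbor_sigma2, dtype=float)
    social_est = np.sum(prec * np.asarray(neighbor_actions, dtype=float)) / np.sum(prec)
    social_var = rho**2 / np.sum(prec) + 1.0   # believed variance of rho*social_est about theta_t
    w_social = (1.0 / social_var) / (1.0 / social_var + 1.0 / own_sigma2)
    return w_social * rho * social_est + (1.0 - w_social) * own_signal
```

Because the misspecified model makes neighbors' actions conditionally independent, all weights here are positive and the agent never anti-imitates; this is precisely the kind of behavior that Proposition 4 below shows cannot aggregate well.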
5.2. Understanding correlation is essential for reaching the benchmark.
We now show more generally that naive agents all fail to achieve perfect aggregation on any sequence of growing networks. Consider a sequence of undirected networks (G_n)_{n=1}^∞ with n agents in G_n.
Figure 5.1. Bayesian and Naive Learning. (Group A variance, vertical axis, against the type B signal variance, horizontal axis, for naive and Bayesian agents; n = 600.)
Proposition 4.
Assume that all private signal variances are bounded below by some σ > 0. Fix any sequence of naive equilibria on G_n. Then there is an ε > 0 such that, for all n, the ε-perfect aggregation benchmark is not achieved by any agent i at the naive equilibrium.
The essential idea is that at time t + 1, observed time-t actions all put weight on actions from period
t − 1, which causes θ_{t−1} to have a (positive weight) contribution to all observed actions. Agents do not know θ_{t−1} and, with positive weights, cannot take any linear combination that would recover it. Even with a very large number of observations, this confound prevents agents from learning yesterday's state precisely.
To make the argument more precise, assume toward a contradiction that agent i achieves the ε-perfect aggregation benchmark for an arbitrarily small ε. Because of the confounding discussed in the last paragraph, she would have to observe many neighbors who place almost all of their weight on their private signals. Because the network is undirected, though, these neighbors themselves see i. Since i's action in this hypothetical reflects the state very precisely, the neighbors would do better by placing substantial weight on agent i and not just on their private signals. So we cannot have such an agent i.
In summary, bidirectional observation presents a fundamental obstruction to attaining the best possible benchmark of aggregation. This is related to a basic observation about learning from multivariate Gaussian signals about a parameter: if the signals (here, social observations), conditional on the state of interest (θ_t), are all correlated and the correlation is bounded away from zero (here this occurs because all involve some indirect weight on θ_{t−1}), then the amount one can learn from these signals is bounded, even if there are infinitely many of them. Related obstructions to learning play an important role in Harel, Mossel, Strack, and Tamuz (2017).
The statement of the proposition uses our functional form specification of naiveté. However, it is evident from the proof that the method extends to alternative specifications of agents who estimate θ_{t−1} by averaging their observations with nonnegative weights, and then combine this social information and private signals in a reasonable (e.g., Bayesian) manner. This nests various other specifications of correlation neglect. Moreover, the same proof shows that in any sequence of Bayesian equilibria where all agents use positive weights, no agent can learn well.
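The Gaussian observation just mentioned can be packaged in a stylized calculation of our own (the notation p, v, c below is ours and is not used elsewhere in the paper). Suppose an agent's observations take the form
\[
x_j = \theta + c + \varepsilon_j, \qquad j = 1, \dots, n,
\]
where θ is the quantity of interest with prior variance p, the common confound c (here, the unknown contribution of θ_{t−1} carried by all neighbors' actions) has variance v > 0, and the ε_j are noise terms, all Gaussian and mutually independent. Each x_j is a function of (θ + c, ε_j), and the ε_j carry no information about θ, so
\[
\operatorname{Var}(\theta \mid x_1, \dots, x_n) \;\ge\; \operatorname{Var}(\theta \mid \theta + c) \;=\; \frac{p\,v}{p + v} \;>\; 0 \qquad \text{for every } n.
\]
No number of such observations pushes the error below this bound, which is the sense in which a common confound bounds learning.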
5.3. Without anti-imitation, outcomes are Pareto-inefficient.
The previous section argued that anti-imitation is critical to achieving the perfect aggregation benchmark. We now show that even in small networks, where that benchmark is not relevant, any equilibrium without anti-imitation is Pareto-inefficient relative to another steady state. This result complements our asymptotic analysis by showing a different sense (relevant for small networks) in which anti-imitation is necessary to make the best use of information.
Our result in this section defines a profile of behavior that results in a steady state Pareto-dominating a given equilibrium. To make this formal, we make the following definition:
Definition 5.
The steady state associated with weights W and w^s is the (unique) covariance matrix V* such that if actions have a variance-covariance matrix given by V_t = V* and next-period actions are set using the weights (W, w^s), then V_{t+1} = V* as well.
In this definition of steady state, instead of optimizing (as at equilibrium), agents use fixed weights in all periods. By a straightforward application of the contraction mapping theorem, if agents use any non-negative weights under which covariances remain bounded at all times, there is a unique steady state.
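As a computational illustration, the following sketch (our own code) iterates the covariance update induced by fixed weights to its steady state. It assumes the linear rule a_{i,t} = Σ_j W_{ij} ρ a_{j,t−1} + w^s_i s_{i,t} with Σ_j W_{ij} = 1 − w^s_i, memory m = 1, and unit innovation variance; these normalizations, and all function and variable names, are assumptions of the sketch rather than notation from the paper.

```python
import numpy as np

def steady_state_cov(W, w_s, sigma2, rho, tol=1e-12, max_iter=100_000):
    """Iterate the fixed-weight covariance update until it stops moving.

    Under the assumed rule, the action errors satisfy
        e_{i,t} = rho * sum_j W[i, j] * e_{j,t-1} - (1 - w_s[i]) * nu_t + w_s[i] * eta_{i,t},
    so the error covariance matrix V updates as
        V' = rho^2 * W V W^T + b b^T + diag(w_s^2 * sigma2),   with b = 1 - w_s.
    """
    w_s = np.asarray(w_s, dtype=float)
    b = 1.0 - w_s
    noise = np.diag(w_s**2 * np.asarray(sigma2, dtype=float))
    V = np.eye(len(w_s))
    for _ in range(max_iter):
        V_next = rho**2 * W @ V @ W.T + np.outer(b, b) + noise
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    return V

# Small usage example: four agents on an undirected circle with equal fixed weights.
n, rho = 4, 0.9
w_s = np.full(n, 0.4)
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = (1.0 - w_s[i]) / 2.0
print(np.round(steady_state_cov(W, w_s, sigma2=np.full(n, 2.0), rho=rho), 3))
```

Simple iteration converges here because ρ² (1 − w^s_i)² < 1 for every agent, echoing the contraction-mapping argument invoked above.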
Proposition 5. Suppose the network G is strongly connected and some agent has more than one neighbor. Given any naive equilibrium or any Bayesian equilibrium where all weights are positive, the variances at that equilibrium are Pareto-dominated by variances at another steady state.
The basic argument behind Proposition 5 is that if agents place marginally more weight on their private signals, this introduces more independent information that eventually benefits everyone. In the proof in Appendix D, we state and prove a more general result with weaker hypotheses on behavior.
In a review of sequential learning experiments, Weizsäcker (2010) finds that subjects weight their private signals more heavily than is optimal (given the empirical behavior of others they observe). Proposition 5 implies that in our environment with optimizing agents, it is actually welfare-improving for individuals to "overweight" their own information relative to best-response behavior.
Discussion of conditions in the proposition.
We next briefly discuss the sufficient conditions in the proposition statement. First, it is clear that some condition on neighborhoods is needed: if every agent has exactly one neighbor and updates rationally or naively, there are no externalities and the equilibrium weights are Pareto optimal. Second, the condition on equilibrium weights says that no agent anti-imitates any of her neighbors. This assumption makes the analysis tractable, but we believe the basic force also works in finite networks with some anti-imitation.
Proof sketch.
The idea of the proof of the rational case is to begin at the steady state and then marginally shift the rational agent's weights toward her private signal. By the envelope theorem, this means agents' actions are less correlated but not significantly worse in the next period. We show that if all agents continue using these new weights, the decreased correlation eventually benefits everyone. In the last step, we use the absence of anti-imitation, which implies that the updating function associated with agents using fixed (as opposed to best-response) weights is monotonic in terms of the variances of guesses. To first order, some covariances decrease while others do not change after one period under the new weights. Monotonicity of the updating function and strong connectedness imply that eventually all agents' variances decrease.
The proof in the naive case is simpler. Here a naive agent is overconfident about the quality of her social information, so she would benefit from shifting some weight from her social information to her signal. This deviation also reduces her correlation with other agents, so it is Pareto-improving.
An illustration.
An example illustrates the phenomenon:
Example 3.
Consider n = 100 agents in an undirected circle, i.e., each agent observes the agent to her left and the agent to her right. Let σ_i = σ be equal for all agents and ρ = 0.9.
The equilibrium strategies place weight ŵ^s on private signals and weight (1 − ŵ^s) on the observed actions. (In fact, the result of Proposition 5, with the same proof, applies to a larger class of networks: it is sufficient that, starting at each agent, there are two paths of some length k to a rational agent and another distinct agent.)
When σ = 10, the equilibrium weight is ŵ^s = 0.192, while the welfare-maximizing symmetric weight on the private signal is larger. When σ = 1, the equilibrium weight is ŵ^s = 0.570, and again the welfare-maximizing symmetric weight exceeds it.
6. Related literature
We now put our contribution in the context of the extensive literature on social learning and learning in networks.
6.1. DeGroot and other network models.
Play in the stationary linear equilibria of our model closely resembles behavior in the DeGroot (1974) and Friedkin and Johnsen (1997) heuristics, where agents update by linearly aggregating network neighbors' past estimates, with constant weights on neighbors over time. We now discuss how our model compares to existing work on these kinds of models, both in terms of foundations and outcomes.
DeMarzo, Vayanos, and Zweibel (2003) justified the DeGroot heuristic by assuming that agents have an oversimplified model of their environment. In their model, the state is drawn once and for all at time zero, and each agent receives one signal about it; then agents repeatedly observe each other's conditional expectations of the state and form estimates. At time zero, assuming all randomness is Gaussian, the Bayesian estimation rule is linear with certain weights. DeMarzo, Vayanos, and Zweibel (2003) made the behavioral assumption that in subsequent periods, agents treat the informational environment as being identical to that of the first period (even though past learning has, in fact, induced redundancies and correlations). In that case, the agents behave according to the DeGroot rule, using the same weights over time. Recently, Jadbabaie, Molavi, Sandroni, and Tahbaz-Salehi (2012) and Molavi, Tahbaz-Salehi, and Jadbabaie (2018) have offered powerful new analyses of these types of heuristics, and have introduced flexible forms suited to a state that changes over time. We give an alternative, Bayesian microfoundation for the same sort of rule by studying a different environment. Our foundation relies on the fact that the environment is stationary, so the joint distribution of the random variables in the model (neighbors' estimates and the state of interest) is actually stationary.
(For surveys of different parts of this literature, see, among others, Acemoglu and Ozdaglar (2011), Golub and Sadler (2016), and Mossel and Tamuz (2017). Indeed, agents behaving according to the DeGroot heuristic even when it is not appropriate might have to do with their experiences in stationary environments where it is closer to optimal.)
Concerning learning outcomes under the DeGroot learning rule, DeMarzo, Vayanos, and Zweibel (2003) emphasized that in their model, the stationary rule could in general be far from optimal in finite populations. Golub and Jackson (2010) showed that DeGroot agents could nevertheless converge to precise estimates in large networks as long as no agent has too prominent a network position, and not otherwise. Less demanding sufficient conditions for good DeGroot-style learning were given by Jadbabaie, Molavi, Sandroni, and Tahbaz-Salehi (2012) in a world with a fixed state but ongoing arrival of information. An overall message that emerges from these papers is that in these fixed-state environments, certain simple heuristics (requiring no sophistication about correlations between neighbors' behavior) can allow agents to guess the state quite precisely. Our findings highlight new obstructions to naive learning arising in a world with a changing state: agents need a sophisticated response to the correlation in neighbors' estimates that arises from those neighbors' past learning.
Moreover, in contrast to the environment of DeMarzo, Vayanos, and Zweibel (2003), even Bayesian agents who understand the environment perfectly are not guaranteed to be able to aggregate information well (Section 4.4). Bayesians' good learning in our environment, and its failure, depend on conditions, namely, signal diversity throughout the network, that differ markedly from the ones that play a role in the papers discussed above.
6.2. Recent models with evolving states.
Several recent papers in computer science and engineering study environments similar to ours. Frongillo, Schoenebeck, and Tamuz (2011) study (in our notation) a θ_t that follows a random walk (ρ = 1). They examine agents who learn using fixed (exogenous) weights on arbitrary networks. They characterize the steady-state distribution of behavior with arbitrary (non-equilibrium) fixed weights. They also examine best-response (equilibrium) weights on a complete network, where all agents observe everyone's last-period action. Their main result concerning these shows that the equilibrium weights can be inefficient. This is generalized by our Proposition 5 on Pareto-inefficiency on an arbitrary graph. Our existence result (Proposition 1) generalizes the construction in their paper from the symmetric case of the complete network to arbitrary networks. (Of course, when the model is enriched in various directions, the findings are not uniformly optimistic for naive learning. For instance, Akbarpour et al. (2017) show that with a fixed state and a changing society, DeGroot learning can be quite inefficient, because agents do not know, even approximately, how precise the guesses of various contacts are. For an early model of repeated learning about a changing state based on social information, see Ellison and Fudenberg (1995); that model differs in that there is no persistence in the state over time.)
The stochastic process and information structure in Shahrampour, Rakhlin, and Jadbabaie (2013) are also the same as ours, though their analysis does not consider optimizing agents. The authors consider a class of fixed weights and study heuristics, computing or bounding various measures of welfare. When we study Pareto inefficiency, we compare welfare under such fixed exogenous weights with the welfare obtained by optimizing agents at equilibrium. Because weights in our model are determined in a Nash equilibrium, we can consider how they respond endogenously to changes in the environment (e.g., the network). We also give conditions for good learning even when agents are optimizing for themselves, as opposed to being programmed to achieve a global objective.
In economics, the model in Alatas, Banerjee, Chandrasekhar, Hanna, and Olken (2016) most closely resembles ours. There, agents are not fully Bayesian, ignoring the correlation between social observations. The model is estimated using data on social learning in Indonesian villages, where the state variables are the wealths of villagers. As we show, how rational agents are in their inferences plays a major role in the accuracy of such aggregation processes. Our model provides foundations for structural estimation with Bayesian behavior as well as testing of the Bayesian model against behavioral alternatives such as that of Alatas, Banerjee, Chandrasekhar, Hanna, and Olken (2016); we discuss this below in Section 7.1.
6.3. Classical social learning models.
A canonical model of social learning involves infinitely many agents choosing, in sequence, from finitely many (often two) actions to match a fixed state, with access to predecessors' actions (Bikhchandani, Hirshleifer, and Welch (1992); Banerjee (1992); Smith and Sørensen (2000); Eyster and Rabin (2010)). The first models were worked out with observation of all predecessors, but recent papers have developed analyses where each agent sees only some subset of predecessors (Acemoglu, Dahleh, Lobel, and Ozdaglar, 2011; Eyster and Rabin, 2014; Lobel and Sadler, 2015a,b). These models thus feature an incomplete network of observation opportunities.
A major concern of this literature is the potential for information aggregation to stop after some finite time due to inference problems. The discreteness of individuals' actions often plays an important role. Our focus is different in that we study a moving continuous state and continuous actions, and ask how well agents aggregate information, in steady state, about the relatively recent past. These modeling differences allow new insights to emerge: for example, heterogeneity of signal endowments turns out to be critical for good aggregation in the Bayesian case, which is very different from the kinds of conditions that play a role in Smith and Sørensen (2000), Acemoglu, Dahleh, Lobel, and Ozdaglar (2011), and Lobel and Sadler (2015a). (Those conditions require either that some signals are very informative, or that some agents have access to a large number of samples of behavior that are not based on any common signals, neither of which we assume in our main results.)
Some other recent models consider Gaussian environments, and the prospects for good learning there depend, as in our model, on agents' ability to infer information from neighbors' actions in the presence of confounds. This is the case, for example, in Sethi and Yildiz (2012) and Harel et al. (2017), both of which study a fixed state. In Sethi and Yildiz (2012), learning outcomes depend on whether individuals' (heterogeneous) priors are independent or correlated. In Harel et al. (2017), mutual observation opportunities keep learning far from an efficient benchmark.
Another point of contact with the classical social learning literature concerns the modeling of changing states: Moscarini, Ottaviani, and Smith (1998) (see also van Oosten (2016)) study learning models where the binary state evolves as a two-state Markov chain. Their results focus largely on the frequency and dynamics of information cascades: changes in the state can break cascades/herds and renew learning. Our main focus is on the aggregation properties.
Finally, a robust aspect of rational learning in sequential models is anti-imitation. Eyster and Rabin (2014) give general conditions for fully Bayesian agents to anti-imitate in the sequential model. We find that anti-imitation also is an important feature in our dynamic model, and in our context is crucial for good learning. Despite this similarity, there is an important contrast between our findings and standard sequential models. In those models, while rational agents do prefer to anti-imitate, in many cases individuals as well as society as a whole could obtain good outcomes using heuristics without any anti-imitation: for instance, by combining the information that can be inferred from one neighbor with one's
own private signal, as in Acemoglu, Dahleh, Lobel, and Ozdaglar (2011) and Lobel and Sadler (2015a). Our dynamic learning environment is different, as shown in Proposition 4: to have any hope of approaching good aggregation benchmarks, agents must respond in a sophisticated way, with anti-imitation, to their neighbors' (correlated) estimates.
7. Discussion and extensions
7.1. Identification and testable implications.
One of the main advantages of the parametrization we have studied is that standard methods can easily be applied to estimate the model and test hypotheses within it. The key feature making the model econometrically well-behaved is that, in the solutions we focus on, agents' actions are linear functions of the random variables they observe. Moreover, the evolution of the state and the arrival of information create exogenous variation. We briefly sketch how these features can be used for estimation and testing.
Assume the following. The analyst obtains noisy measurements ā_{i,t} = a_{i,t} + ξ_{i,t} of agents' actions (where the ξ_{i,t} are i.i.d., mean-zero error terms). He knows the parameter ρ governing the stochastic process, but may not know the network structure or the qualities of private signals (σ_i)_{i=1}^n. Suppose also that the analyst observes the state θ_t ex post (perhaps with a long delay). (We can instead assume that the analyst observes a proxy for the private signal s_{i,t} of agent i; we mention how below.)
Now, consider any steady state in which agents put constant weights W_{ij} on their neighbors and w^s_i on their private signals over time. We will discuss the case of m = 1 to save on notation, though all the statements here generalize readily to arbitrary m.
We first consider how to estimate the weights agents are using, and to back out the structural parameters of our model when it applies. The strategy does not rely on uniqueness of equilibrium. We can identify the weights agents are using through standard vector autoregression methods. In steady state,
(7.1) ā_{i,t} = Σ_j W_{ij} ρ ā_{j,t−1} + w^s_i θ_t + ζ_{i,t},
where ζ_{i,t} = w^s_i η_{i,t} − Σ_j W_{ij} ρ ξ_{j,t−1} + ξ_{i,t} are error terms i.i.d. across time. (This system defines a VAR(1) process, or generally a VAR(m) process for memory length m.) The first term of this expression for ζ_{i,t} is the error of the signal that agent i receives at time t. The summation combines the measurement errors from the observations ā_{j,t−1} from the previous period. Thus, we can obtain consistent estimators W̃_{ij} and w̃^s_i for W_{ij} and w^s_i, respectively.
We now turn to the case in which agents are using equilibrium weights. First, and most simply, our estimates of agents' equilibrium weights allow us to recover the network structure. If the weight Ŵ_{ij} is non-zero for any i and j, then agent i observes agent j. Generically the converse is true: if i observes j, then the weight Ŵ_{ij} is non-zero. Thus, network links can generically be identified by testing whether the recovered social weights are nonzero. For such tests (and more generally) the standard errors of the estimators can be obtained by standard techniques. (Methods involving regularization may be practically useful in identifying links in the network. Manresa (2013) proposes a regularization (LASSO) technique for identifying such links (peer effects). In a dynamic setting such as ours, with serial correlation, the techniques required will generally be more complicated.)
Now we examine the more interesting question of how structural parameters can be identified assuming an equilibrium is played, and also how to test the assumption of equilibrium. The first step is to compute the empirical covariances of action errors from observed data; we call these Ṽ_{ij}. Under the assumption of equilibrium, we now show how to determine the signal variances using the fact that equilibrium is characterized by Φ(V̂) = V̂ and recalling the explicit formula (3.3) for Φ. In view of this formula, the signal variances σ_i² are uniquely determined by the other variables:
(7.2) V̂_{ii} = Σ_j Σ_k Ŵ_{ij} Ŵ_{ik} (ρ² V̂_{jk} + 1) + (ŵ^s_i)² σ_i².
Replacing the model parameters other than σ_i by their empirical analogues, we obtain a consistent estimate σ̃_i of σ_i.
This estimate could be directly useful, for example, to an analyst who wants to choose an "expert" from the network and ask about her private signals directly.
Note that our basic VAR for recovering the weights relies only on constant linear strategies and does not assume that agents are playing any particular strategy within this class. Thus, if agents are using some other behavioral rule (e.g., optimizing in a misspecified model), we can replace (7.2) by a suitable analogue that reflects the bounded rationality in agents' inference. If such a steady state exists, then, using the results in this section, one can construct an econometric test that is suitable for testing how agents are behaving. For instance, we can test the hypothesis that they are Bayesian against the naive alternative of our Section 5.1.
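A minimal version of the estimation strategy just described can be sketched as follows (our own code, using the reconstruction of (7.1) and (7.2) above; all function and variable names are ours, and the plain least-squares step ignores the attenuation caused by measurement error in the lagged-action regressors, which a serious implementation would address, e.g., by instrumenting with further lags).

```python
import numpy as np

def estimate_weights_and_sigma(A_bar, theta, neighbors, rho):
    """Recover weights and signal variances from panel data (a sketch).

    A_bar:     (T, n) array of measured actions a_bar_{i,t}
    theta:     (T,) array of ex-post observed states theta_t
    neighbors: dict mapping each agent i to the list of agents she observes
    """
    T, n = A_bar.shape
    W_hat = np.zeros((n, n))
    ws_hat = np.zeros(n)
    # Step 1: per-agent regression of a_bar_{i,t} on rho * a_bar_{j,t-1} and
    # theta_t, following (7.1).
    for i in range(n):
        nbrs = list(neighbors[i])
        X = np.column_stack([rho * A_bar[:-1, j] for j in nbrs] + [theta[1:]])
        coef, *_ = np.linalg.lstsq(X, A_bar[1:, i], rcond=None)
        W_hat[i, nbrs] = coef[:-1]
        ws_hat[i] = coef[-1]

    # Step 2: empirical covariances of action errors.
    err = A_bar - theta[:, None]
    V_hat = np.cov(err, rowvar=False)

    # Step 3: back out signal variances from the steady-state relation (7.2):
    # V_ii = sum_{j,k} W_ij W_ik (rho^2 V_jk + 1) + (w^s_i)^2 sigma_i^2.
    sigma2_hat = np.empty(n)
    for i in range(n):
        social = W_hat[i] @ (rho**2 * V_hat + 1.0) @ W_hat[i]
        sigma2_hat[i] = (V_hat[i, i] - social) / ws_hat[i] ** 2
    return W_hat, ws_hat, sigma2_hat
```

Links can then be read off by testing which entries of W_hat differ significantly from zero, and the hypothesis of Bayesian behavior can be compared against behavioral alternatives by checking whether the recovered weights and variances satisfy the corresponding fixed-point relation.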
7.2. Multidimensional states and informational specialization.
So far, we have been working with a one-dimensional state and one-dimensional signals, which varied only in their precisions. Our message about the value of diversity is, however, better interpreted in a mathematically equivalent multidimensional model.
Consider Bayesian agents who learn and communicate about two independent dimensions simultaneously (each one working as in our model). If all agents have equally precise signals about both dimensions, then society may not learn well about either of them. In contrast, if half the agents have superior signals about one dimension and inferior signals about the other (and the other half has the reverse), then society can learn well about both dimensions. Thus, the designer has a strong preference for an organization with informational specialization, where some, but not all, agents are expert in a particular dimension.
Of course, there are many familiar reasons for specialization, in information or any other activity. For instance, it may be that more total information can be collected in this case, or that incentives are easier to provide. Crucially, specialization is valuable in our setting for a reason distinct from all these: it helps agents with their inference problems. (This raises important questions about what information agents would acquire, and whom they would choose to observe, which are the focus of a growing literature. For recent papers in environments most closely related to ours, see Sethi and Yildiz (2016), Myatt and Wallace (2017), Fudenberg et al. (2019), and Liang and Mu (2019), among others.)
7.3. General distributions and dynamic networks.
The example of the previous subsection involved trivially extending our model to several independent dimensions. We now briefly discuss a more substantive extension, which applies to more realistic signal structures.
Our analysis of stationary linear learning rules relied crucially on the assumptions that the innovations ν_t and signal errors η_{i,t} are Gaussian random variables. However, we believe the basic logic of our result about good aggregation with signal diversity (Theorem 1) does not depend on this particular distributional assumption, or on the exact functional form of the AR(1) process. Suppose we have
θ_t = T(θ_{t−1}, ν_t) and s_{i,t} = S(θ_t, η_{i,t}),
and consider more general distributions of innovations ν_t and signal errors η_{i,t}. (These states and signals may now be multidimensional.) For simplicity, consider the complete graph and m = 1. Because θ_{t−1} is still a sufficient statistic for the past, an agent's action in period t will still be a function of her subjective distribution over θ_{t−1} and her private signal. An agent with type τ (which is observable) who believes
θ_{t−1} is distributed according to D takes an action equal to f(τ, D, s_{i,t}). Here, τ could reflect the distribution of agent i's signal, but also perhaps her preferences. We no longer assume that an agent's action is her posterior mean of the random variable: it might be some other statistic, and might be multi-dimensional. Similarly, information need not be one-dimensional, or characterized only by its precision.
This framework gives an abstract identification condition: agents can learn well if, for any feasible distribution D over θ_{t−1}, the state θ_t can be inferred from the observed distribution of actions, i.e., the distribution of (τ, f(τ, D, s_{i,t})), which each agent would essentially know given enough observations.
Now consider a time-t agent i. Suppose now that any possible distribution that time-(t −
1) agents might have over the earlier state can be fully described by a finite tuple of parameters d ∈ R^p (e.g., a finite number of moments). For each type τ of time-(t − 1) agent that i observes, the distribution of those agents' actions f(τ, d, ·) gives agent i a different measurement of d, which is a summary of beliefs about the earlier state, and of θ_{t−1}. Assuming there is not too much "collinearity," these measurements of the finitely many parameters of interest should, in fact, provide independent information about θ_{t−1}. Thus, as long as the set of signal types τ is sufficiently rich, we would expect the identification condition to hold.
Throughout the paper, we have studied an unchanging network. However, we can also consider neighborhoods that change over time, which is obviously an important possibility in reality. Notice that even if the neighborhood of an agent changes from period to period, if some of the individuals in that neighborhood are randomly sampled, this can provide information about the empirical distribution of actions of various types as described above. That, in turn, can facilitate the identification of recent states. Considering dynamic networks requires different solution concepts, and is thus left for future work, but the ideas sketched here suggest that conditions for good learning can be formulated in that setting.
References
Acemoglu, D., M. Dahleh, I. Lobel, and A. Ozdaglar (2011): "Bayesian Learning in Social Networks,"
The Review of Economic Studies, 78, 1201–1236.
Acemoglu, D. and A. Ozdaglar (2011): "Opinion dynamics and learning in social networks," Dynamic Games and Applications, 1, 3–49.
Akbarpour, M., A. Saberi, and A. Shameli (2017): "Information Aggregation in Overlapping Generations," Available at SSRN, ssrn.com/abstract=3035178.
Alatas, V., A. Banerjee, A. G. Chandrasekhar, R. Hanna, and B. A. Olken (2016): "Network structure and the aggregation of information: Theory and evidence from Indonesia," The American Economic Review, 106, 1663–1704.
Babus, A. and P. Kondor (2018): "Trading and information diffusion in over-the-counter markets," Econometrica, forthcoming.
Bala, V. and S. Goyal (1998): "Learning from neighbours," The Review of Economic Studies, 65, 595–621.
Banerjee, A., A. G. Chandrasekhar, E. Duflo, and M. O. Jackson (2013): "The diffusion of microfinance," Science, 341, 1236498.
Banerjee, A. V. (1992): "A simple model of herd behavior," Quarterly Journal of Economics, 107, 797–817.
Bikhchandani, S., D. Hirshleifer, and I. Welch (1992): "A theory of fads, fashion, custom, and cultural change as informational cascades," Journal of Political Economy, 100, 992–1026.
Dasaratha, K. and K. He (2017): "Network Structure and Naive Sequential Learning," arXiv preprint arXiv:1703.02105.
DeGroot, M. H. (1974): "Reaching a consensus," Journal of the American Statistical Association, 69, 118–121.
DeMarzo, P., D. Vayanos, and J. Zweibel (2003): "Persuasion bias, social influence, and unidimensional opinions," The Quarterly Journal of Economics, 118, 909–968.
Duffie, D., S. Malamud, and G. Manso (2009): "Information percolation with equilibrium search dynamics," Econometrica, 77, 1513–1574.
Duffie, D. and G. Manso (2007): "Information percolation in large markets," American Economic Review, 97, 203–209.
Ellison, G. and D. Fudenberg (1995): "Word-of-mouth communication and social learning," The Quarterly Journal of Economics, 110, 93–125.
Enke, B. and F. Zimmermann (2019): "Correlation neglect in belief formation."
Eyster, E. and M. Rabin (2010): "Naive herding in rich-information settings," American Economic Journal: Microeconomics, 2, 221–243.
——— (2014): "Extensive Imitation is Irrational and Harmful," Quarterly Journal of Economics, 129, 1861–1898.
Eyster, E., M. Rabin, and G. Weizsäcker (2015): "An Experiment on Social Mislearning."
Friedkin, N. E. and E. C. Johnsen (1997): "Social positions in influence networks," Social Networks, 19, 209–222.
Frongillo, R., G. Schoenebeck, and O. Tamuz (2011): "Social Learning in a Changing World," Internet and Network Economics, 146–157.
Fudenberg, D., P. Strack, and T. Strzalecki (2019): "Stochastic choice and optimal sequential sampling," American Economic Review, forthcoming.
Geanakoplos, J. D. and H. M. Polemarchakis (1982): "We can't disagree forever," Journal of Economic Theory, 28, 192–200.
Golub, B. and M. Jackson (2010): "Naive learning in social networks and the wisdom of crowds," American Economic Journal: Microeconomics, 2, 112–149.
Golub, B. and E. Sadler (2016): "Learning in Social Networks," in The Oxford Handbook of the Economics of Networks, ed. by Y. Bramoullé, A. Galeotti, and B. Rogers, Oxford University Press, chap. 19, 504–542.
Harel, M., E. Mossel, P. Strack, and O. Tamuz (2017): "Groupthink and the Failure of Information Aggregation in Large Groups," arXiv preprint arXiv:1412.7172.
Hayek, F. A. (1945): "The use of knowledge in society," The American Economic Review, 35, 519–530.
Holland, P. W., K. B. Laskey, and S. Leinhardt (1983): "Stochastic blockmodels: First steps," Social Networks, 5, 109–137.
Jadbabaie, A., P. Molavi, A. Sandroni, and A. Tahbaz-Salehi (2012): "Non-Bayesian social learning," Games and Economic Behavior, 76, 210–225.
Jensen, R. (2007): "The digital provide: Information (technology), market performance, and welfare in the South Indian fisheries sector," The Quarterly Journal of Economics, 122, 879–924.
Kay, S. M. (1993): Fundamentals of Statistical Signal Processing, Prentice Hall PTR.
Lambert, N. S., M. Ostrovsky, and M. Panov (2018): "Strategic trading in informationally complex environments," Econometrica, 86, 1119–1157.
Liang, A. and X. Mu (2019): "Complementary Information and Learning Traps," PIER Working Paper No. 18-008. Available at SSRN: https://ssrn.com/abstract=3057805.
Lobel, I. and E. Sadler (2015a): "Information diffusion in networks through social learning," Theoretical Economics, 10, 807–851.
——— (2015b): "Preferences, homophily, and social learning," Operations Research, 64, 564–584.
Malamud, S. and M. Rostek (2017): "Decentralized exchange," American Economic Review, 107, 3320–62.
Manresa, E. (2013): "Estimating the structure of social interactions using panel data," Unpublished Manuscript, CEMFI, Madrid.
Molavi, P., A. Tahbaz-Salehi, and A. Jadbabaie (2018): "A Theory of Non-Bayesian Social Learning,"
Econometrica, 86, 445–490.
Moscarini, G., M. Ottaviani, and L. Smith (1998): "Social Learning in a Changing World," Economic Theory, 11, 657–665.
Mossel, E., M. Mueller-Frank, A. Sly, and O. Tamuz (2018): "Social learning equilibria," arXiv:1207.5895.
Mossel, E., A. Sly, and O. Tamuz (2015): "Strategic learning and the topology of social networks," Econometrica, 83, 1755–1794.
Mossel, E. and O. Tamuz (2017): "Opinion exchange dynamics," Probability Surveys, 14, 155–204.
Mueller-Frank, M. (2013): "A general framework for rational learning in social networks," Theoretical Economics, 8, 1–40.
Myatt, D. and C. Wallace (2017): "Information Acquisition and Use by Networked Players," University of Warwick, Department of Economics, CRETA Discussion Paper Series (32), available at http://wrap.warwick.ac.uk/90449/.
Olfati-Saber, R. (2007): "Distributed Kalman filtering for sensor networks," in Decision and Control, 2007 46th IEEE Conference on, IEEE, 5492–5498.
Pinelis, I. (2018): "Inverse of matrix with blocks of ones," MathOverflow, URL: https://mathoverflow.net/q/296933 (version: 2018-04-04).
Rahimian, M. A. and A. Jadbabaie (2017): "Bayesian learning without recall," IEEE Transactions on Signal and Information Processing over Networks.
Sethi, R. and M. Yildiz (2012): "Public disagreement," American Economic Journal: Microeconomics, 4, 57–95.
——— (2016): "Communication with unknown perspectives," Econometrica, 84, 2029–2069.
Shahrampour, S., S. Rakhlin, and A. Jadbabaie (2013): "Online learning of dynamic parameters in social networks," Advances in Neural Information Processing Systems.
Smith, L. and P. Sørensen (2000): "Pathological outcomes of observational learning," Econometrica, 68, 371–398.
Srinivasan, J. and J. Burrell (2013): "Revisiting the fishers of Kerala, India," in Proceedings of the Sixth International Conference on Information and Communication Technologies and Development: Full Papers-Volume 1, ACM, 56–66.
van Oosten, R. (2016): "Learning from Neighbors in a Changing World," Master's Thesis.
Vives, X. (1993): "How fast do rational agents learn?" The Review of Economic Studies, 60, 329–347.
Weizsäcker, G. (2010): "Do we follow others when we should? A simple test of rational expectations," The American Economic Review, 100, 2340–2360.
Appendix A. Details of definitions
A.1.
Exogenous random variables.
Fix a probability space (Ω, F, P). Let (ν_t, η_{i,t})_{t∈Z, i∈N} be normal, mutually independent random variables, with ν_t having variance 1 and η_{i,t} having variance σ_i². Also take a stochastic process (θ_t)_{t∈Z} such that for each t ∈ Z and i ∈ N we have (for 0 < |ρ| ≤ 1)
θ_t = ρ θ_{t−1} + ν_t.
Such a stochastic process exists by standard constructions of the AR(1) process or, in the case of ρ = 1, of the Gaussian random walk on a doubly infinite time domain. Define s_{i,t} = θ_t + η_{i,t}.
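For readers who want to simulate these primitives, the following sketch (our own code) approximates the doubly infinite time domain with a burn-in period; the burn-in is an implementation device, not part of the model, and the names used are ours.

```python
import numpy as np

def simulate_primitives(n_agents, T, rho, sigma2, burn_in=500, seed=0):
    """Draw theta_t = rho*theta_{t-1} + nu_t and s_{i,t} = theta_t + eta_{i,t}.

    nu_t has unit variance and eta_{i,t} has variance sigma2[i], as in A.1;
    the finite burn-in approximates a process started at t = -infinity.
    """
    rng = np.random.default_rng(seed)
    total = T + burn_in
    theta = np.zeros(total)
    for t in range(1, total):
        theta[t] = rho * theta[t - 1] + rng.normal()
    eta = rng.normal(scale=np.sqrt(sigma2), size=(total, n_agents))
    signals = theta[:, None] + eta
    return theta[burn_in:], signals[burn_in:]

theta, s = simulate_primitives(n_agents=5, T=1_000, rho=0.9, sigma2=np.full(5, 2.0))
```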
A.2. Formal definition of game and stationary linear equilibria.
Players and strategies. The set of players (or agents) is A = {(i, t) : i ∈ N, t ∈ Z}. The set of (pure) responses of an agent (i, t) is defined to be the set of all Borel-measurable functions σ_{(i,t)} : R × (R^{|N(i)|})^m → R, mapping her own signal and her neighborhood's actions, (s_{i,t}, (a_{N_i, t−ℓ})_{ℓ=1}^m), to a real-valued action a_{i,t}. We call the set of these functions Σ̃_{(i,t)}. Let Σ̃ = ∏_{(i,t)∈A} Σ̃_{(i,t)} be the set of response profiles. We now define the set of (unambiguous) strategy profiles, Σ ⊂ Σ̃. We say that a response profile σ ∈ Σ̃ is a strategy profile if the following two conditions hold:
1. There is a tuple of real-valued random variables (a_{i,t})_{i∈N, t∈Z} on (Ω, F, P) such that for each (i, t) ∈ A, we have a_{i,t} = σ_{(i,t)}(s_{i,t}, (a_{N_i, t−ℓ})_{ℓ=1}^m).
2. Any two tuples of real-valued random variables (a_{i,t})_{i∈N, t∈Z} satisfying Condition 1 are equal almost surely.
That is, a response profile is a strategy profile if there is an essentially unique specification of behavior that is consistent with the responses: i.e., if the responses uniquely determine the behavior of the population, and hence payoffs. (Condition 1 is necessary to rule out response profiles such as the one given by σ_{i,t}(s_{i,t}, a_{i,t−1}) = |a_{i,t−1}| + 1. This profile, despite consisting of well-behaved functions, does not correspond to any specification of behavior for the whole population, because time extends infinitely backward. Condition 2 is necessary to rule out response profiles such as the one given by σ_{i,t}(s_{i,t}, a_{i,t−1}) = a_{i,t−1}, which have many satisfying action paths, leaving payoffs undetermined.) Note that if σ ∈ Σ, then σ̃ = (σ′_{(i,t)}, σ_{−(i,t)}) ∈ Σ whenever σ′_{(i,t)} ∈ Σ̃_{(i,t)}. This is because any Borel-measurable function of a random variable is itself a well-defined random variable. Thus, if we start with a strategy profile and consider agent (i, t)'s deviations, they are unrestricted: she may consider any response.
Payoffs.
The payoff of an agent (i, t) under any strategy profile σ ∈ Σ is
u_{i,t}(σ) = −E[(a_{i,t} − θ_t)²] ∈ [−∞, 0],
where the actions a_{i,t} are taken according to σ_{(i,t)} and the expectation is taken in the probability space we have described. This expectation is well-defined because inside the expectation there is a nonnegative, measurable random variable, for which an expectation is always defined, though it may be infinite.
Equilibria.
A (Nash) equilibrium is defined to be a strategy profile σ ∈ Σ such that, for each (i, t) ∈ A and each σ̃ ∈ Σ such that σ̃ = (σ′_{(i,t)}, σ_{−(i,t)}) for some σ′_{(i,t)} ∈ Σ̃_{(i,t)}, we have u_{i,t}(σ̃) ≤ u_{i,t}(σ).
For p ∈ Z, we define the shift operator T^p to translate variables to time indices shifted p steps forward. This definition may be applied, for example, to Σ. (I.e., σ′ = T^p σ is defined by σ′_{(i,t)} = σ_{(i,t−p)}.) A strategy profile σ ∈ Σ is stationary if, for all p ∈ Z, we have T^p σ = σ. We say σ ∈ Σ is a linear strategy profile if each σ_{(i,t)} is a linear function. Our analysis focuses on stationary, linear equilibria.
Appendix B. Existence of equilibrium: Proof of Proposition 1
Recall from Section 3.3 the map Φ, which gives the next-period covariance matrix Φ(V_t) for any V_t. The expression given there for this map ensures that its entries are continuous functions of the entries of V_t. Our strategy is to show that this function maps a compact set, K, to itself, which, by Brouwer's fixed-point theorem, ensures that Φ has a fixed point V̂. We will then argue that this fixed point corresponds to a stationary linear equilibrium.
We begin by defining the compact set K. Because memory is arbitrary, entries of V_t are covariances between pairs of neighbor actions from any periods available in memory. Let k, l be two indices of such actions, corresponding to actions taken at nodes i and j respectively, and let σ̄_i² = max{ σ_i², ρ^{2(m−1)} σ_i² + (1 − ρ^{2(m−1)})/(1 − ρ²) }. Now let K ⊂ V be the subset of symmetric positive semi-definite matrices V_t such that, for any such k, l,
V_{kk,t} ∈ [ min{ (1 + σ_i^{−2})^{−1}, ρ^{2(m−1)} (1 + σ_i^{−2})^{−1} + (1 − ρ^{2(m−1)})/(1 − ρ²) }, max{ σ_i², ρ^{2(m−1)} σ_i² + (1 − ρ^{2(m−1)})/(1 − ρ²) } ],
V_{kl,t} ∈ [ −σ̄_i σ̄_j, σ̄_i σ̄_j ].
This set is closed and convex, and we claim that Φ(K) ⊂ K.
To show this claim, we will first find upper and lower bounds on the variance of any neighbor's action (at any period in memory). For the upper bound, note that a Bayesian agent will not choose an action with a larger variance than her signal, which has variance σ_i². For a lower bound, note that if she knew the previous period's state and her own signal, then the variance of her action would be (1 + σ_i^{−2})^{−1}. Thus an agent observing only noisy estimates of θ_t and her own signal can do no better.
By the same reasoning applied to the node-i agent from m periods ago, the variance of the estimate of θ_{t−1} based on i's action from m periods ago is at most ρ^{2(m−1)} σ_i² + (1 − ρ^{2(m−1)})/(1 − ρ²) and at least ρ^{2(m−1)} (1 + σ_i^{−2})^{−1} + (1 − ρ^{2(m−1)})/(1 − ρ²). This establishes bounds on V_{kk,t} for observations k coming from either the most recent or the oldest available period. The corresponding bounds for the periods between t − m + 1 and t are always weaker than at least one of the two bounds we have described, so we need only take minima and maxima over two terms.
This establishes the claimed bound on the variances. The bounds on the covariances follow from the Cauchy–Schwarz inequality.
We have now established that there is a variance-covariance matrix V̂ such that Φ(V̂) = V̂. By the definition of Φ, this means there exists some weight profile (W, w^s) such that the weights, when applied to prior actions that have variance-covariance matrix V̂, produce variance-covariance matrix V̂. However, it still remains to show that this is the variance-covariance matrix reached when agents have been using the weights (W, w^s) forever.
To show this, first observe that if agents have been using the weights (W, w^s) forever, the variance-covariance matrix V_t in any period is uniquely determined and does not depend on t; call this V̌. This is because actions can be expressed as linear combinations of private signals with coefficients depending only on the weights. Second, it follows from our construction above of the matrix V̂ and the weights (W, w^s) that there is a distribution of actions where the variance-covariance matrix is V̂ in every period and agents are using weights (W, w^s) in every period. Combining the two statements shows that in fact V̌ = V̂, and this completes the proof.
(The variance-covariance matrices are well-defined because the (W, w^s) weights yield unambiguous strategy profiles in the sense of Appendix A.)
Appendix C. Proof of Theorem 1
C.1.
Notation and key notions.
Let S be the (by assumption finite) set of all possible signal variances, and let σ̄² be the largest of them. The proof will focus on the covariances of errors in social signals. Take two arbitrary agents i and j. Recall that both r_{i,t} and r_{j,t} have mean θ_{t−1}, because each is an unbiased estimate of θ_{t−1} (each is a linear combination, with coefficients summing to 1, of unbiased estimates of θ_{t−1}); we will thus focus on the errors r_{i,t} − θ_{t−1}. Let A_t denote the variance-covariance matrix (Cov(r_{i,t} − θ_{t−1}, r_{j,t} − θ_{t−1}))_{i,j}, and let W be the set of such covariance matrices. For all i, j note that Cov(r_{i,t} − θ_{t−1}, r_{j,t} − θ_{t−1}) ∈ [−σ̄², σ̄²], using the Cauchy–Schwarz inequality and the fact that Var(r_{i,t} − θ_{t−1}) ∈ [0, σ̄²] for all i. This fact about variances says that no social signal is worse than putting all weight on an agent who follows only her private signal. Thus the best-response map Φ is well-defined and induces a map Φ̃ on W.
Next, for any δ, ζ > 0, define W_{δ,ζ} ⊂ W to be the set of covariance matrices in W such that both of the following hold:
1. for any pair of distinct agents i and j, with network types k and k′ respectively,
Cov(r_{i,t} − θ_{t−1}, r_{j,t} − θ_{t−1}) = δ_{kk′} + ζ_{ij},
where (i) δ_{kk′} depends only on the network types of the two agents (k and k′, which may be the same); (ii) |δ_{kk′}| < δ; and (iii) |ζ_{ij}| < ζ;
2. for any single agent i of network type k,
Var(r_{i,t} − θ_{t−1}) = δ_k + ζ_{ii},
where (i) δ_k only depends on the network type of the agent; (ii) |δ_k| < δ; and (iii) |ζ_{ii}| < ζ.
This is the space of covariance matrices such that each covariance is split into two parts. Considering (1) first, δ_{kk′} is an effect that depends only on i's and j's network types, while ζ_{ij} adjusts for the individual-level heterogeneity arising from different link realizations. The description of the decomposition in (2) is analogous.
C.2. Proof strategy.
C.2.1.
A set W_{δ,ζ} of outcomes with good learning. Our goal is to show that as n grows large, Var(r_{i,t} − θ_{t−1}) becomes very small, which then implies that the agents asymptotically learn. We will take δ and ζ to be arbitrarily small numbers and show that for large enough n, with high probability (which we abbreviate "asymptotically almost surely" or "a.a.s.") the equilibrium outcome has a social error covariance matrix A_t in the set W_{δ,ζ}. In particular, Var(r_{i,t} − θ_{t−1}) becomes arbitrarily small in this limit. In our constructions, the ζ_{ij} (resp., ζ_{ii}) terms will be set to much smaller values than the δ_{kk′} (resp., δ_k) terms, because group-level covariances are more predictable and less sensitive to idiosyncratic realizations. (Throughout this proof, we abuse terminology by referring to agents and nodes interchangeably when the relevant t is clear or specified nearby.)
C.2.2. Approach to showing that W_{δ,ζ} contains an equilibrium. To show that the equilibrium outcome has (a.a.s.) a social error covariance matrix A_t in the set W_{δ,ζ}, the plan is to construct a set W′ so that (a.a.s.) W′ ⊂ W_{δ,ζ} and Φ̃(W′) ⊂ W′. This set will contain an equilibrium by the Brouwer fixed point theorem, and therefore so will W_{δ,ζ}.
To construct the set W′, we will fix a positive constant β (to be determined later), and define
W′ = W_{β/n, 1/n} ∪ Φ̃(W_{β/n, 1/n}).
We will then prove that, for large enough n, (i) Φ̃(W′) ⊆ W′ and (ii) for another suitable positive constant λ, W′ ⊂ W_{β/n, λ/n}. This will allow us to establish that (a.a.s.)
W′ ⊂ W_{δ,ζ} and Φ̃(W′) ⊂ W′, with δ and ζ being arbitrarily small numbers.
The following two lemmas will allow us to deduce (immediately after stating them) properties (i) and (ii) of W′.
Lemma 1.
For all large enough β and all λ ≥ λ(β), asymptotically almost surely we have Φ̃(W_{β/n, 1/n}) ⊂ W_{β/n, λ/n}.
For all large enough β, asymptotically almost surely the set W_{β/n, 1/n} is invariant under Φ̃², i.e., Φ̃²(W_{β/n, 1/n}) ⊂ W_{β/n, 1/n}.
Putting these lemmas together, a.a.s. we have Φ̃²(W_{β/n, 1/n}) ⊂ W_{β/n, 1/n} and Φ̃(W_{β/n, 1/n}) ⊂ W_{β/n, λ/n}. From this it follows that W′ = W_{β/n, 1/n} ∪ Φ̃(W_{β/n, 1/n}) is invariant under Φ̃ and contained in W_{β/n, λ/n}, as claimed.
C.2.3. Proving the lemmas by analyzing how Φ̃ and Φ̃² act on sets W_{δ,ζ}. The lemmas are about how
Φ̃ and Φ̃² act on the covariance matrix A_t, assuming it is in a certain set W_{δ,ζ}, to yield new covariance matrices. Thus, we will prove these lemmas by studying two periods of updating. The analysis will come in five steps. (The notation Φ̃² means the operator Φ̃ applied twice.)
Step 1: No-large-deviations (NLD) networks and the high-probability event.
Step 1 concerns the "with high probability" part of the lemmas. In the entire argument, we condition on the event of a no-large-deviations (NLD) network realization, which says that certain realized statistics in the network (e.g., the number of paths between two nodes) are close to their expectations. The expectations in question depend only on agents' types. Therefore, on the NLD realization, the realized statistics do not vary much based on which exact agents we focus on, but rather depend only on their types. Step 1 defines the NLD event E formally and shows that it has high probability. We use the structure of the NLD event throughout our subsequent steps, as we mention below.
Step 2: Weights in one step of updating are well-behaved.
We are interested in
Φ̃ and Φ̃², which are about how the covariance matrix A_t of social signal errors changes under updating. How this works is determined by the "basic" updating map Φ, and so we begin by studying the weights involved in it and then make deductions about the matrix A_t. The present step establishes that in one step of updating, the weight W_{ij,t′} that agent (i, t′), where t′ = t + 1, places on the action of another agent j in period t does not depend too much on the identities of i and j. It only depends on their (network and signal) types. This is established by using our explicit formula for weights in terms of covariances. We rely on (i) the fact that covariances are assumed to start out in a suitable W_{δ,ζ}, and (ii) our conditioning on the NLD event E. The NLD event is designed so that the network quantities that go into determining the weights depend only on the types of i and j (because the NLD event forbids too much variation conditional on type). The restriction to A_t ∈ W_{δ,ζ} ensures that covariances in the initial period t did not depend too much on type, either.
Step 3: Lemma 1: Φ̃(W_{β/n, 1/n}) ⊂ W_{β/n, λ/n}. Once we have analyzed one step of updating, it is natural to ask what that does to the covariance matrix. Because we now have a bound on how much weights can vary after one step of updating, we can compute bounds on covariances. This step shows that the initial covariances A_t being in W_{β/n, 1/n} implies that after one step, covariances are in W_{β/n, λ/n}. Note that the introduction of another parameter λ on the right-hand side implies that this step might worsen our control on covariances somewhat, but in a bounded way. This establishes Lemma 1.
Step 4: Weights in two steps of updating are well-behaved.
The fourth step estab-lishes that the statement made in Step 2 remains true when we replace t (cid:48) by t + 2. By thesame sort of reasoning as in Step 2, an additional step of updating cannot create too muchfurther idiosyncratic variation in weights. Proving this requires analyzing the covariance OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 45 matrices of various social signals (i.e., the A t +1 that the updating induces), which is whywe needed to do Step 3 first. Step 5: Lemma 2: (cid:101) Φ ( W βn , n ) ⊂ W βn , n . Now we use our understanding of weights fromthe previous steps, along with additional structure, to show the key remaining fact. Whatwe have established so far about weights allows us to control the weight that a given agent’sestimate at time t + 2 places on the social signal of another agent at time t . This is Step5(a). In the second part, Step 5(b), we use that to control the covariances in A t +2 . It isimportant in this part of the proof that different agents have very similar “second-orderneighborhoods”: the paths of length 2 beginning from an agent are very similar, in terms oftheir counts and what types of agents they go through. We carefully separate the variation(across agents) in covariances in A t into three pieces and use our control of second-orderneighborhoods to bound this variation such that A t +2 ∈ W βn , n .C.3. Carrying out the steps.
C.3.1. Step 1.
Here we formally define the NLD event, which we call $E$. It is given by $E = \bigcap_{i=1}^{5} E_i$, where the events $E_i$ will be defined next.

($E_1$) Let $X^{(1)}_{i,\tau k}$ be the number of agents having signal type $\tau$ and network type $k$ who are observed by $i$. The event $E_1$ is that this quantity is close to its expected value in the following sense, simultaneously for all possible values of the subscript:
$$(1-\zeta)\,\mathbb{E}[X^{(1)}_{i,\tau k}] \;\le\; X^{(1)}_{i,\tau k} \;\le\; (1+\zeta)\,\mathbb{E}[X^{(1)}_{i,\tau k}].$$

($E_2$) Let $X^{(2)}_{ii',\tau k}$ be the number of agents having signal type $\tau$ and network type $k$ who are observed by both $i$ and $i'$. The event $E_2$ is that this quantity is close to its expected value in the following sense, simultaneously for all possible values of the subscript:
$$(1-\zeta)\,\mathbb{E}[X^{(2)}_{ii',\tau k}] \;\le\; X^{(2)}_{ii',\tau k} \;\le\; (1+\zeta)\,\mathbb{E}[X^{(2)}_{ii',\tau k}].$$

($E_3$) Let $X^{(3)}_{i,\tau k,j}$ be the number of agents having signal type $\tau$ and network type $k$ who are observed by agent $i$ and who observe agent $j$. The event $E_3$ is that this quantity is close to its expected value in the following sense, simultaneously for all possible values of the subscript:
$$(1-\zeta)\,\mathbb{E}[X^{(3)}_{i,\tau k,j}] \;\le\; X^{(3)}_{i,\tau k,j} \;\le\; (1+\zeta)\,\mathbb{E}[X^{(3)}_{i,\tau k,j}].$$

($E_4$) Let $X^{(4)}_{ii',\tau k,j}$ be the number of agents having signal type $\tau$ and network type $k$ who are observed by both agent $i$ and agent $i'$ and who observe $j$. The event $E_4$ is that this quantity is close to its expected value in the following sense, simultaneously for all possible values of the subscript:
$$(1-\zeta)\,\mathbb{E}[X^{(4)}_{ii',\tau k,j}] \;\le\; X^{(4)}_{ii',\tau k,j} \;\le\; (1+\zeta)\,\mathbb{E}[X^{(4)}_{ii',\tau k,j}].$$

($E_5$) Let $X^{(5)}_{i,\tau k,jj'}$ be the number of agents of signal type $\tau$ and network type $k$ who are observed by agent $i$ and who observe both $j$ and $j'$. The event $E_5$ is that this quantity is close to its expected value in the following sense, simultaneously for all possible values of the subscript:
$$(1-\zeta)\,\mathbb{E}[X^{(5)}_{i,\tau k,jj'}] \;\le\; X^{(5)}_{i,\tau k,jj'} \;\le\; (1+\zeta)\,\mathbb{E}[X^{(5)}_{i,\tau k,jj'}].$$

We claim that the probability of the complement of the event $E$ vanishes exponentially. We can check this by showing that the probability of the complement of each $E_i$ vanishes exponentially. For $E_1$, for example, the bounds will hold unless at least one agent has degree outside the specified range. The probability of this is bounded above by the sum of the probabilities of each individual agent having degree outside the specified range. By standard concentration bounds for sums of independent indicators, the probability that a given agent has degree outside this range vanishes exponentially. Because there are $n$ agents in $G_n$, this sum vanishes exponentially as well. The other cases are similar.

For the rest of the proof, we condition on the event $E$.
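The following small simulation (ours, not part of the proof; the observation probability and sizes are illustrative, and we use a single type, the multi-type case simply applies the same check block by block) illustrates the kind of concentration the NLD event asserts: per-agent observation counts and pairwise common-observation counts stay within a shrinking relative factor of their expectations as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(1)

def max_relative_deviation(n, p_obs=0.3):
    """One network/signal type for simplicity.  Each agent observes every other
    agent independently with probability p_obs.  Returns the largest relative
    deviation of (i) per-agent observation counts and (ii) pairwise
    common-observation counts from their expectations."""
    obs = rng.random((n, n)) < p_obs            # obs[i, j]: does i observe j?
    np.fill_diagonal(obs, False)
    deg = obs.sum(axis=1)
    dev1 = np.max(np.abs(deg - p_obs * (n - 1)) / (p_obs * (n - 1)))
    common = obs.astype(int) @ obs.T.astype(int)  # common[i, i']: # observed by both
    np.fill_diagonal(common, 0)
    expected = p_obs**2 * (n - 2)
    mask = ~np.eye(n, dtype=bool)
    dev2 = np.max(np.abs(common[mask] - expected) / expected)
    return dev1, dev2

for n in (100, 400, 1600):
    print(n, max_relative_deviation(n))          # both deviations shrink with n
```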
C.3.2. Step 2. As a shorthand, let $\delta = \beta/n$ for a sufficiently large constant $\beta$, and let $\zeta = 1/n$. Lemma 3.
Suppose that in period t the matrix A = A t of covariances of social signalssatisfies A ∈ W δ,ζ and all agents are optimizing in period t + 1 . Then there is a γ so thatfor all n sufficiently large, W ij,t +1 W i (cid:48) j (cid:48) ,t +1 ∈ (cid:104) − γn , γn (cid:105) . whenever i and i (cid:48) have the same network and signal types and j and j (cid:48) have the samenetwork and signal types. To prove this lemma, we will use our weights formula: W i,t +1 = T C − i,t T C − i,t . This says that in period t + 1, agent i ’s weight on agent j is proportional to the sum ofthe entries of column j of C − i,t . We want to show that the change in weights is small asthe covariances of observed social signals vary slightly. To do so we will use the Taylor OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 47 expansion of f ( A ) = C − i,t around the covariance matrix A (0) at which all δ kk (cid:48) = 0, δ k = 0and ζ ij = 0.We begin with the first partial derivative of f at A (0) in an arbitrary direction. Let A ( x ) be any perturbation of A in one parameter, i.e., A ( x ) = A (0) + xM for someconstant matrix M with entries in [ − , C i ( x ) be the matrix of covariances ofthe actions observed by i given that the covariances of agents’ social signals were A ( x ).There exists a constant γ depending only on the possible signal types such that each entryof C i ( x ) − C i ( x (cid:48) ) has absolute value at most γ ( x − x (cid:48) ) whenever both x and x (cid:48) are small.We will now show that the column sums of C i ( x ) − are close to the column sums of C (0) − i . To do so, we will evaluate the formula(C.1) ∂f ( A ( x )) ∂x = ∂ C i ( x ) − ∂x = C i ( x ) − ∂ C i ( x ) ∂x C i ( x ) − at zero. If we can bound each column sum of this expression (evaluated at zero) by aconstant (depending only on the signal types and the number of network types K ), thenthe first derivative of f will also be bounded by a constant.Recall that S is the set of signal types and let S = | S | ; index the signal types by numbersranging from 1 to S . To bound the column sums of C i (0) − , suppose that the agent observes r i agents from each signal type 1 ≤ i ≤ S . Reordering so that all agents of each signal typeare grouped together, we can write C i (0) = a r × r + b I r a r × r a S r × r S a r × r a r × r + b I r .... . . a S r S × r · · · a SS r s × r s + b s I r s Therefore, C i (0) can be written as a block matrix with blocks a ij r i × r j + b i δ ij I r i where1 ≤ i, j ≤ S and δ ij = 1 for i = j and 0 otherwise.We now have the following important approximation of the inverse of this matrix. Lemma 4 (Pinelis (2018)) . Let C be a matrix consisting of S × S blocks, with its (i,j) blockgiven by a ij r i × r j + b i δ ij I r i and let A = a ij r i × r j be an invertible matrix. As n → ∞ , then the ( i, i ) block of C − isequal to We are very grateful to Iosif Pinelis for suggesting this argument.
OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 48 b i I r i − b i r i r i × r i + O (1 /n ) while the off-diagonal blocks are O (1 /n ) .Proof. First note that the ij -block of C − has the form c ij r i × r j + d i δ ij I r i for some real c ij and d i .Therefore, CC − can be written in matrix form as (cid:80) k ( a ik r i × r k + b i δ ik I r i )( c kj r k × r j + d k δ kj I r k ) = a ij d j + (cid:80) k ( a ik r k + δ ik b k ) c kj r i × r j + b i d i δ ij I r i . (C.2)Note that the last summand is the identity matrix.Let D d denote the diagonal matrix with d i in the ( i, i ) diagonal entry, let D /b denotethe diagonal matrix with 1 /b i in the ( i, i ) diagonal entry, etc. Breaking up the previousdisplay (C.2) into its diagonal and off-diagonal parts, we can write AD d + ( AD r + D b ) C = 0 and D d = D /b . Hence, C = − ( AD r + D b ) − AD d = − ( I q + D − r A − D b ) − ( AD r ) − AD /b = − ( I q + D − r A − D b ) − D / ( br ) = − D / ( br ) + O (1 /n )where br := ( b r , . . . , b q r q ). Therefore as n → ∞ the off-diagonal blocks will be O (1 /n )while the diagonal blocks are 1 b i I r i − b i r i r i × r i + O (1 /n )as desired. (cid:3) Using Lemma 4 we can analyze the column sums of C i (0) − M C i (0) − . Recall we wrote A ( x ) = A (0) + xM , and in (C.1) we expressed the derivative of f in x in terms of thematrix we exhibit here. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 49
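On our reading of Lemma 4, the $(i,i)$ block of $C^{-1}$ is approximately $\frac{1}{b_i}I_{r_i} - \frac{1}{b_i r_i}\mathbf{1}\mathbf{1}^{\top}$ and the off-diagonal blocks have entries of lower order. The quick numerical check below is ours; the matrices $A$, $b$ and the block sizes are arbitrary choices, and the printed errors shrink as the $r_i$ grow.

```python
import numpy as np

def block_matrix(A, b, r):
    """C whose (i, j) block is A[i, j] * ones((r[i], r[j])), plus b[i] * I on diagonal blocks."""
    S = len(b)
    rows = []
    for i in range(S):
        row = []
        for j in range(S):
            blk = A[i, j] * np.ones((r[i], r[j]))
            if i == j:
                blk = blk + b[i] * np.eye(r[i])
            row.append(blk)
        rows.append(np.hstack(row))
    return np.vstack(rows)

A = np.array([[2.0, 1.0], [1.0, 2.0]])     # invertible, as the lemma requires
b = np.array([1.0, 1.5])
r = [300, 300]

C = block_matrix(A, b, r)
Cinv = np.linalg.inv(C)
top = Cinv[:r[0], :r[0]]                                   # the (1, 1) block
approx = np.eye(r[0]) / b[0] - np.ones((r[0], r[0])) / (b[0] * r[0])
print(np.abs(top - approx).max())        # entrywise error of the approximation: tiny
print(np.abs(Cinv[:r[0], r[0]:]).max())  # off-diagonal block entries: also vanishing
```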
In more detail, we use the formula of the lemma to estimate both copies of C i (0) − , andthen expand this to write an expression for any column sum of C i (0) − M C i (0) − . It followsstraightforwardly from this calculation that all these column sums are O (1 /n ) whenever allentries of M are in [ − , x , we obtain an expression for the k th derivativein terms of C i (0) − and M : f ( k ) (0) = k ! C i (0) − M C i (0) − M C i (0) − · . . . · M C i (0) − , where M appears k times in the product. By the same argument as above, we can showthat the column sums of f ( k ) (0) k ! are bounded by a constant independent of n . The Taylorexpansion is f ( A ) = (cid:88) k f ( k ) (0) k ! x k . Since we take A ∈ W δ,ζ , we can assume that x is O (1 /n ). Because the column sums of eachsummand are bounded by a constant times x k , the column sums of f ( A ) are bounded bya constant.Finally, because the variation in the column sums is O (1 /n ) and the weights are propor-tional to the column sums, each weight varies by at most a multiplicative factor of γ /n for some γ . We find that the first part of the lemma, which bounded the ratios betweenweights W ij,t +1 /W i (cid:48) j (cid:48) ,t +1 , holds.C.3.3. Step 3.
We complete the proof of Lemma 1, which states that the covariance matrix of $r_{i,t+1}$ is in $\mathcal{W}_{\delta,\zeta'}$. Recall that $\zeta' = \lambda/n$ for some constant $\lambda$, so we are showing that if the covariance matrix of the $r_{i,t}$ is in a neighborhood $\mathcal{W}_{\delta,\zeta}$, then the covariance matrix in the next period is in a somewhat larger neighborhood $\mathcal{W}_{\delta,\zeta'}$. The remainder of the argument then follows by the same arguments as in the proof of the first part of the lemma: we now bound the change in time-$(t+2)$ weights as we vary the covariances of time-$(t+1)$ social signals within this neighborhood.

Recall that we decomposed each covariance $\mathrm{Cov}(r_{i,t} - \theta_{t-1},\, r_{j,t} - \theta_{t-1}) = \delta_{kk'} + \zeta_{ij}$ into a term $\delta_{kk'}$ depending only on the types of the two agents and a term $\zeta_{ij}$, and similarly for variances. To show the covariance matrix is contained in $\mathcal{W}_{\delta,\zeta'}$, we bound each of these terms suitably.
We begin with ζ ij (and ζ i ). We can write r i,t +1 = (cid:88) j W ij,t +1 − w si,t +1 a i,t = (cid:88) j W ij,t +1 − w si,t +1 ( w sj,t s j,t + (1 − w sj,t ) r j,t ) . By the first part of the lemma, the ratio between any two weights (both of the form W ij,t +1 , w si,t +1 , or w sj,t ) corresponding to pairs of agents of the same types is in [1 − γ /n, γ /n ]for a constant γ . We can use this to bound the variation in covariances of r i,t +1 withintypes by ζ (cid:48) : we take the covariance of r i,t +1 and r j,t +1 using the expansion above and thenbound the resulting summation by bounding all coefficients.Next we bound δ kk (cid:48) (and δ k ). It is sufficient to show that Var( r i,t +1 − θ t ) is at most δ . To do so, we will give an estimator of θ t with variance less than β/n , and this willimply Var( r i,t +1 − θ t ) < β/n = δ (recall r i,t +1 is the estimate of θ t given agent i ’s socialobservations in period t + 1). Since this bounds all the variance terms by δ , the covarianceterms will also be bounded by δ in absolute value.Fix an agent i of network type k and consider some network type k (cid:48) such that p kk (cid:48) > A and B , such that i observes Ω( n ) agentsof each of these signal types in G kn . The basic idea will be that we can approximate θ t well by taking a linear combination of the average of observed agents of network type k and signal type A and the average of observed agents of network type k and signal type B.In more detail: Let N i,A be the set of agents of type A in network type k observed by i and N i,B be the set of agents of type B in network type k observed by i . Then fixing someagent j of network type k, | N i,A | (cid:88) j ∈ N i,A a j,t − = σ − A σ − A θ t + 11 + σ − A r j ,t − + noise where the noise term has variance of order 1 /n and depends on signal noise, variation in r j,t , and variation in weights. Similarly1 | N i,B | (cid:88) j ∈ N i,B a j,t = σ − B σ − B θ t + 11 + σ − B r j ,t − + noise where the noise term has the same properties. Because σ A (cid:54) = σ B , we can write θ t as a linearcombination of these two averages with coefficients independent of n up to a noise term oforder 1 /n . We can choose β large enough such that this noise term has variance most β/n for all n sufficiently large. This completes the Proof of Lemma 1. We use the notation Ω( n ) to mean greater than Cn for some constant C > n is large. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 51
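The estimator used in this step, a linear combination of the average actions of two observed signal-type groups that isolates the state, is easy to illustrate numerically. The sketch below is ours; the weights $w_A$, $w_B$ stand in for the equilibrium weights, the social signal is collapsed to a single common value $r$, and the point is that the error variance shrinks like $1/n$ and that the construction needs $\sigma_A \neq \sigma_B$.

```python
import numpy as np

rng = np.random.default_rng(2)

def group_average_estimate(theta, r, sigma_A, sigma_B, n):
    """Estimate theta from the average actions of n/2 type-A and n/2 type-B
    agents, where each action is w*signal + (1-w)*r with w depending on the
    signal type.  The 2x2 system picks the combination of the two group
    averages whose weight on theta is 1 and whose weight on r is 0."""
    wA = sigma_A**-2 / (sigma_A**-2 + 1.0)
    wB = sigma_B**-2 / (sigma_B**-2 + 1.0)
    bar_A = np.mean(wA * (theta + sigma_A * rng.standard_normal(n // 2)) + (1 - wA) * r)
    bar_B = np.mean(wB * (theta + sigma_B * rng.standard_normal(n // 2)) + (1 - wB) * r)
    lam = np.linalg.solve(np.array([[wA, wB], [1 - wA, 1 - wB]]), np.array([1.0, 0.0]))
    return lam[0] * bar_A + lam[1] * bar_B

for n in (100, 400, 1600):
    errs = [group_average_estimate(1.0, 0.3, sigma_A=1.0, sigma_B=2.0, n=n) - 1.0
            for _ in range(2000)]
    print(n, np.var(errs))   # decreases roughly like 1/n
```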
C.3.4. Step 4. We now give the two-step version of Lemma 3.
Lemma 5.
Suppose that in period $t$ the matrix $A = A_t$ of covariances of social signals satisfies $A \in \mathcal{W}_{\delta,\zeta}$ and all agents are optimizing in periods $t+1$ and $t+2$. Then there is a $\gamma$ so that for all $n$ sufficiently large,
$$\frac{W_{ij,t+2}}{W_{i'j',t+2}} \in \left[1 - \frac{\gamma}{n},\; 1 + \frac{\gamma}{n}\right]$$
whenever $i$ and $i'$ have the same network and signal types and $j$ and $j'$ have the same network and signal types.

Given what we established about covariances in Step 3, the lemma follows by the same argument as the proof of Lemma 3.
Step 5:
Now that Lemma 5 is proved, we can apply it to show that (cid:101) Φ ( W δ,ζ ) ⊂ W δ,ζ . We will do this by first writing the time-( t + 2) behavior in terms of agents’ time- t obser-vations (Step 5(a)), which comes from applying (cid:101) Φ twice. This gives a formula that can beused for bounding the covariances of time-( t + 2) actions in terms of covariances of time- t actions. Step 5(b) then applies this formula to show we can take ζ ij and ζ i to be sufficientlysmall. (Recall the notation introduced in Section C.1 above.) We split our expression for r i,t +2 into several groups of terms and show that the contribution of each group of termsdepends only on agents’ types up to a small noise term. Step 5(c) notes that we can alsotake δ kk (cid:48) and δ k to be sufficiently small. Step 5(a):
We calculate: r i,t +2 = (cid:88) j W ij,t +2 − w si,t +2 ρa j,t +1 = ρ ( (cid:88) j W ij,t +2 − w si,t +2 w sj,t +1 s j,t +1 + (cid:88) j,j (cid:48) W ij,t +2 − w si,t +2 W jj (cid:48) ,t +1 ρa j (cid:48) ,t )= ρ ( (cid:88) j W ij,t +2 − w si,t +2 w sj,t +1 s j,t +1 + ρ ( (cid:88) j,j (cid:48) W ij,t +2 − w si,t +2 W jj (cid:48) ,t +1 w sj (cid:48) ,t s j (cid:48) ,t + (cid:88) j,j (cid:48) W ij,t +2 − w si,t +2 W jj (cid:48) ,t +1 (1 − w sj (cid:48) ,t ) r j (cid:48) ,t )) . We take this term to refer to variances, as well.
OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 52
Let c ij (cid:48) ,t be the coefficient on r j (cid:48) ,t in this expansion of r i,t +2 . Explicitly, c ij (cid:48) ,t = (cid:88) j W ij,t +2 − w si,t +2 W jj (cid:48) ,t +1 (1 − w sj (cid:48) ,t ) . The coefficient c ij (cid:48) ,t adds up the influence of r j (cid:48) ,t on r i,t +2 over all paths of length two.First, we establish a lemma about how much these weights vary. Lemma 6.
For n sufficiently large, when i and i (cid:48) have the same network types and j (cid:48) and j (cid:48)(cid:48) have the same network and signal types, the ratio c ij (cid:48) ,t /c i (cid:48) j (cid:48)(cid:48) ,t is in [1 − γ/n, γ/n ] .Proof. Suppose i ∈ G k and j (cid:48) ∈ G k (cid:48) . For each network type k (cid:48)(cid:48) , the number of agents j of type k (cid:48)(cid:48) who are observed by i and who observe j (cid:48) varies by at most a factor ζ as wechange i in G k and j (cid:48) in G k (cid:48) . For each such j , the contribution of that agent’s action to c ij (cid:48) ,t is W ij,t +2 − w si,t +2 W jj (cid:48) ,t +1 (1 − w sj (cid:48) ,t ) . By Lemma 3 applied to each term, this expression varies by at most a factor of γ/n as wechange i in G k and j (cid:48) in G k (cid:48) . Combining these facts for each type k (cid:48)(cid:48) shows the lemma. (cid:3) Step 5(b):
We first show that fixing the values of δ kk (cid:48) and δ k in period t , the variation inthe covariances Cov( r i,t +2 − θ t +1 , r i (cid:48) ,t +2 − θ t +1 ) of these terms as we vary i and i (cid:48) over networktypes is not larger than ζ . From the formula above, we observe that we can decompose r i,t +2 − θ t +1 as a linear combination of three mutually independent groups of terms:(i) signal error terms η j,t +1 and η j (cid:48) ,t ;(ii) the errors r j (cid:48) ,t − θ t in the social signals from period t ; and(iii) changes in state ν t and ν t +1 between periods t and t + 2.Note that the terms r j (cid:48) ,t − θ t are linear combinations of older signal errors and changesin the state. We bound each of the three groups in turn: (i) Signal Errors: We first consider the contribution of signal errors. When i and i (cid:48) aredistinct, the number of such terms is close to its expected value because we are conditioningon the events E and E defined in Section C.1. Moreover the weights are close to theirexpected values by Step 2, so the variation is bounded suitably. When i and i (cid:48) are equal,we use the facts that the weights are close to their expected values and the variance of anaverage of Ω( n ) signals is small. (ii) Social Signals: We now consider terms r j (cid:48) ,t − θ t , which correspond to the thirdsummand in our expression for r i,t +2 . Since we will analyze the weight on ν t below, it issufficient to study the terms r j (cid:48) ,t − θ t − . OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 53
By Lemma 6, the coefficients placed on r j (cid:48) ,t by i and on r j (cid:48)(cid:48) ,t by i (cid:48) vary by a factor of atmost 2 γ/n . Moreover, the absolute value of each of these covariances is bounded above by δ and the variation in these terms is bounded above by ζ . We conclude that the variationfrom these terms has order 1 /n . (iii) Innovations: Finally, we consider the contribution of the innovations ν t and ν t +1 .We treat ν t +1 first. We must show that any two agents of the same types place the sameweight on the innovation ν t +1 (up to an error of order n ). This will imply that thecontributions of timing to the covariances Cov( r i,t +2 − θ t +1 , r i (cid:48) ,t +2 − θ t +1 ) can be expressedas a term that can be included in the relevant δ kk (cid:48) and a lower-order term which can beincluded in ζ ii (cid:48) .The weight an agent places on ν t +1 is equal to the weight she places on signals fromperiod t + 1. So this is equivalent to showing that the total weight ρ (cid:88) j W ij,t +2 − w si,t +2 w sj,t +1 agent i places on period t + 1 depends only on the network type k of agent i and O (1 /n )terms. We will first show the average weight placed on time-( t + 1) signals by agents ofeach signal type depends only on k . We will then show that the total weights on agents ofeach signal type do not depend on n .Suppose for simplicity here that there are two signal types A and B ; the general case isthe same. We can split the sum from the previous paragraph into the subgroups of agentswith signal types A and B : ρ (cid:88) j : σ j = σ A W ij,t +2 − w si,t +2 w sj,t +1 + ρ (cid:88) j : σ j = σ B W ij,t +2 − w si,t +2 w sj,t +1 . Letting W Ai = (cid:80) σ j = σ A W ij,t +2 − w si,t +2 be the total weight placed on agents with signal type A andsimilarly for signal type B , we can rewrite this as: W Ai ρ (cid:88) j : σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) w sj,t +1 + W Bi ρ (cid:88) j : σ j = σ B W ij,t +2 W Bi (1 − w si,t +2 ) w sj,t +1 . The coefficients W ij,t +2 W Ai (1 − w si,t +2 ) in the first sum now sum to one, and similarly for the second.We want to check that the first sum (cid:80) j : σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) w sj,t +1 does not depend on k , andthe second sum is similar. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 54
For each j in group A , w sj,t +1 = σ − A σ − A + ( ρ κ j,t +1 + 1) − , where we recall that κ j,t +1 = Var( r j,t +1 − θ t ). Because κ j,t +1 is close to zero, we canapproximate w sj,t +1 locally as a linear function µ κ j,t +1 + µ where µ < n terms).So we can write the sum of interest as (cid:88) j : σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) ( µ (cid:88) j (cid:48) ,j (cid:48)(cid:48) W jj (cid:48) ,t +1 W jj (cid:48)(cid:48) ,t +1 ( ρ V j (cid:48) j (cid:48)(cid:48) ,t + 1) + µ ) . By Lemma 3, the weights vary by at most a multiplicative factor contained in [1 − γ/n, γ/n ]. The number of paths from i to j (cid:48) passing through agents of any network type k (cid:48)(cid:48) and any signal type is close to its expected value (which depends only on i ’s networktype), and the weight on each path depends only on the types involved up to a factor in[1 − γ/n, γ/n ]. The variation in V j (cid:48) j (cid:48)(cid:48) ,t consists of terms of the form δ k (cid:48) k (cid:48)(cid:48) , δ k (cid:48) , and ζ j (cid:48) j (cid:48)(cid:48) ,all of which are O (1 /n ), and terms from signal errors η j (cid:48) ,t . The signal errors only contributewhen j = j (cid:48) , and so only contribute to a fraction of the summands of order 1 /n . So wecan conclude the total variation in this sum as we change i within the network type k hasorder 1 /n . Now that we know each the average weight on private signals of the observed agents ofeach signal type depends only on k , it remains to check that W Ai and W Bi only depend on k . The coefficients W Ai and W Bi are the optimal weights on the group averages (cid:88) j : σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) ρa j,t +1 and (cid:88) j : σ j = σ B W ij,t +2 W Bi (1 − w si,t +2 ) ρa j,t +1 , so we need to show that the variances and covariance of these two terms depend only on k . We check the variance of the first sum: we can expand (cid:88) σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) ρa j,t +1 = (cid:88) σ j = σ A W ij,t +2 W Ai (1 − w si,t +2 ) ρ ( w sj,t +1 s j,t +1 + (1 − w sj,t +1 ) r j,t +1 ) . We can again bound the signal errors and social signals as in the previous parts of thisproof, and show that the variance of this term depends only on k and O (1 /n ) terms. Thesecond variance and covariance are similar, so W Ai and W Bi depend only on k and O (1 /n )terms. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 55
This takes care of the innovation ν t +1 . Because we have included any innovations priorto ν t in the social signals r j (cid:48) ,t , to complete Step 5(b) we need only show the weight on ν t depends only on the network type k of an agent.The analysis is a simpler version of the analysis of the weight on ν t +1 . It is sufficient toshow the total weight placed on period t social signals depends only on the network typeof k of an agent i . This weight is equal to ρ (cid:88) j,j (cid:48) W ij,t +2 − w si,t +2 · W jj (cid:48) ,t +1 · (1 − w sj (cid:48) ,t ) . As in the ν t +1 case, we can approximate (1 − w sj (cid:48) ,t ) as a linear function of κ j (cid:48) ,t up to O (1 /n )terms. Because the number of paths to each agent j (cid:48) though a given type and the weightson each such path cannot vary too much within types, the same argument shows that thissum depends only on k and O (1 /n ) terms.Step 5(b) is complete. Step 5(c):
The final step is to verify that we can take δ kk (cid:48) and δ k to be smaller than δ .It is sufficient to show that the variance Var( r i,t +2 − θ t +1 ) of each social signal about θ t +1 is at most δ . The proof is the same as in Step 2(b). Appendix D. Remaining proofs (online appendix)
D.1. Proof of Proposition 2. We first check that there is a unique equilibrium and then prove the remainder of Proposition 2.
Lemma 7.
Suppose G has symmetric neighbors. Then there is a unique equilibrium.Proof of Lemma. We will show that when the network satisfies the condition in the propo-sition statement, Φ induces a contraction on a suitable space. For each agent, we canconsider the variance of the best estimator for yesterday’s state based on observed actions.These variances are tractable because they satisfy the envelope theorem. Moreover, thespace of these variances is a sufficient statistic for determining all agent strategies andaction variances.Let r i,t be i ’s social signal— the best estimator of θ t based on the period t − N i —and let κ i,t be the variance of r i,t − θ t .We claim that Φ induces a map (cid:101) Φ on the space of variances κ i,t , which we denote (cid:101) V .We must check the period t variances ( κ i,t ) i uniquely determine all period t + 1 variances( κ i,t +1 ) i : The variance V ii,t of agent i ’s action, as well as the covariances V ii (cid:48) ,t of all pairsof agents i , i (cid:48) with N i = N i (cid:48) , are determined by κ i,t . Moreover, by the condition on our OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 56 network, these variances and covariances determine all agents’ strategies in period t + 1,and this is enough to pin down all period t + 1 variances κ i,t +1 .The proof proceeds by showing (cid:101) Φ is a contraction on (cid:101) V in the sup norm.For each agent j , we have N i = N i (cid:48) for all i, i (cid:48) ∈ N j . So the period t actions of an agent i (cid:48) in N j are(D.1) a i (cid:48) ,t = ( ρ κ i,t + 1) − σ − i (cid:48) + ( ρ κ i,t + 1) − · r i,t + σ − i (cid:48) σ − i (cid:48) + ( ρ κ i,t + 1) − · s i (cid:48) ,t where s i (cid:48) ,t is agent ( i (cid:48) )’s signal in period t and r i,t the social signal of i (the same one that i (cid:48) has). It follows from this formula that each action observed by j is a linear combination ofa private signal and a common estimator r i,t , with positive coefficients which sum to one.For simplicity we write(D.2) a i (cid:48) ,t = b · r i,t + b i (cid:48) · s i (cid:48) ,t (where b and b i (cid:48) depend on i (cid:48) and t , but we omit these subscripts). We will use the facts0 < b < < b i (cid:48) < κ j,t = Var( r j,t − θ t ) depends on κ i,t − = Var( r i,t − − θ t − ).The estimator r j,t is a linear combination of observed actions a i (cid:48) ,t , and therefore can beexpanded as a linear combination of signals s i (cid:48) ,t and the estimator r i,t − . We can write(D.3) r j,t = c · ( ρr i,t − ) + (cid:88) i (cid:48) c i (cid:48) s i (cid:48) ,t and therefore (taking variances of both sides) κ j,t = Var( r j,t − θ t ) = c Var( ρr i,t − − θ t ) + (cid:88) i (cid:48) c i (cid:48) σ i (cid:48) = c ( κ i,t − + 1) + (cid:88) i (cid:48) c i (cid:48) σ i (cid:48) The desired result, that (cid:101)
Φ is a contraction, will follow if we can show that the derivative $\frac{d\kappa_{j,t}}{d\kappa_{i,t-1}} \in [0,\delta]$ for some $\delta < 1$. By the envelope theorem, when calculating this derivative, we can assume that the weights placed on actions $a_{i',t-1}$ by the estimator $r_{j,t}$ do not change as we vary $\kappa_{i,t-1}$, and therefore $c$ and the $c_{i'}$ above do not change. So it is enough to show that the coefficient $c$ on $\rho r_{i,t-1}$ is in $[0,\delta]$. $\square$
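To see the contraction numerically, consider the simplest symmetric special case (our simplification, not the general argument): every agent observes $d$ neighbors who all share one social signal with variance $\kappa$ about the previous state and have signal variance $\sigma^2$, and the next social signal is the equal-weight average of their actions. Iterating the induced map on $\kappa$ from very different starting points converges quickly to the same fixed point, and the numerical slope at the fixed point lies in $[0, 1)$.

```python
import numpy as np

def next_kappa(kappa, rho=0.9, sigma2=1.0, d=10):
    """One round of the variance-updating map in the symmetric special case:
    each observed action is  w*s + (1-w)*rho*r  with the Bayesian weight w,
    and the new social signal is the average of d such actions."""
    prior_var = rho**2 * kappa + 1.0          # variance of rho*r about the current state
    w = (1.0 / sigma2) / (1.0 / sigma2 + 1.0 / prior_var)
    return (1.0 - w)**2 * prior_var + w**2 * sigma2 / d

for kappa0 in (0.01, 50.0):
    kappa = kappa0
    for _ in range(60):
        kappa = next_kappa(kappa)
    print(kappa0, kappa)                      # same limit from both starting points

k_star = kappa
eps = 1e-6
print((next_kappa(k_star + eps) - next_kappa(k_star)) / eps)   # numerical slope in [0, 1)
```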
The intuition for the lower bound is that anti-imitation (agents placing negative weights on observed actions) only occurs if observed actions put too much weight on public information. But if $c < 0$, then the weight on public information is actually negative, so there is no reason to anti-imitate. This is formalized in the following lemma.
Lemma 8.
Agent j ’s social signal places non-negative weight on agent i ’s social signalfrom the previous period, i.e., c ≥ .Proof. To check this formally, suppose that c is negative. Then the social signal r j,t putsnegative weight on some observed action—say the action a k,t − of agent k . We want tocheck that the covariance of r j,t − θ t and a k,t − − θ t is negative. Using (D.2) and (D.3), wecompute that Cov( r j,t − θ t , a k,t − − θ t ) = Cov c ( ρr i,t − − θ t ) + (cid:88) i (cid:48) ∈ N j c i (cid:48) ( s i (cid:48) ,t − θ t )) , b ( ρr i,t − − θ t ) + b k ( s k,t − − θ t ) = c b Var( ρr i,t − − θ t ) + c k b k Var( s k,t − − θ t ) because all distinct summands above are mutually independent. We have b , b k >
0, while c < c k < r j,t puts negative weight on a k,t − .So the expression above is negative. Therefore, it follows from the usual Gaussian Bayesianupdating formula that the best estimator of θ t given r j,t and a k,t − puts positive weighton a k,t − . However, this is a contradiction: the best estimator of θ t given r j,t and a k,t − is simply r j,t , because r j,t was defined as the best estimator of θ t given observations thatincluded a k,t − .Now, for the upper bound c ≤ δ , the idea is that r j,t puts more weight on agents withbetter signals while these agents put little weight on public information, which keeps theoverall weight on public information from growing too large.Note that r j,t is a linear combination of actions ρa i (cid:48) ,t − for i ∈ N j , with coefficientssumming to 1. The only way the coefficient on ρr i,t − in r j,t could be at least 1 would beif some of these coefficients on ρa i (cid:48) ,t − were negative and the estimator r j,t placed greaterweight on actions a i (cid:48) ,t − which placed more weight on r j,t .Applying the formula (D.1) for a i (cid:48) ,t − , we see that the coefficient b on ρr i,t − is less than1 and increasing in σ i (cid:48) . On the other hand, it is clear that the weight on a i (cid:48) ,t − in the socialsignal r j,t is decreasing in σ i (cid:48) : more weight should be put on more precise individuals. Soin fact the estimator r j,t places less weight on actions a i (cid:48) ,t − which placed more weight on r i,t .Moreover, the coefficients placed on private signals are bounded below by a positiveconstant when we restrict to covariances in the image of (cid:101) Φ (because all covariances are
bounded as in the proof of Proposition 1). Therefore, each agent $i' \in N_j$ places weight at most $\delta$ on the estimator $\rho r_{i,t-1}$, for some $\delta < 1$. Agent $j$'s social signal $r_{j,t}$ is a sum of these agents' actions with coefficients summing to 1 and satisfying the monotonicity property above. We conclude that the coefficient on $\rho r_{i,t-1}$ in $r_{j,t}$ is bounded above by some $\delta < 1$. $\square$

This completes the proof of Lemma 7. We now prove Proposition 2.
Proof of Proposition 2.
By Lemma 7 there is a unique equilibrium on any network $G$ with symmetric neighbors. Let $\varepsilon > 0$ and consider an arbitrary agent $i$. Her neighbors have the same private signal qualities and the same neighborhoods (by the symmetric neighbors assumption). So there exists an equilibrium where, for all $i$, the actions of agent $i$'s neighbors are exchangeable. By uniqueness, this in fact holds at the sole equilibrium.

So agent $i$'s social signal is an average of her neighbors' actions:
$$r_{i,t} = \frac{1}{|N_i|}\sum_{j \in N_i} a_{j,t-1}.$$
Suppose the $\varepsilon$-perfect aggregation benchmark is achieved. Then all agents must place weight at least $\frac{(1+\varepsilon)^{-1}}{(1+\varepsilon)^{-1} + \sigma^{-2}}$ on their social signals. So at time $t$, the social signal $r_{i,t}$ places weight at least $\frac{(1+\varepsilon)^{-1}}{(1+\varepsilon)^{-1} + \sigma^{-2}}$ on signals from at least two periods ago. Since the variance of any linear combination of such signals is at least $1 + \rho^2$, for $\varepsilon$ sufficiently small the social signal $r_{i,t}$ is bounded away from a perfect estimate of $\theta_{t-1}$. This gives a contradiction. $\square$

D.2.
Proof of Corollary 1.
Consider a complete graph in which all agents have signal variance $\sigma^2$ and memory $m = 1$. By Proposition 2, as $n$ grows large the variances of all agents converge to some $A > (1 + \sigma^{-2})^{-1}$. Choose $\sigma$ large enough that $A > 1$. Now send agent 1's signal variance to $\infty$. Then $a_{1,t} = r_{1,t}$ in each period, so all agents can infer all private signals from the previous period. As $n$ grows large, the variance of agent 1 converges to 1 and the variances of all other agents to $(1 + \sigma^{-2})^{-1}$. By our choice of $\sigma$, this gives a Pareto improvement. We can see by continuity that the same argument holds for agent 1's signal variance finite but sufficiently large.

D.3. Proof of Proposition 3.
We outline the argument. In Step 1, we construct a sym-metric version of the Erdos-Renyi network and show there exists a symmetric equilibrium (cid:98) V sym ( n ) on this symmetric network. In Step 2, we show variances and covariances at theequilibrium (cid:98) V sym ( n ) converge to V ∞ and Cov ∞ . The remainder of the proof shows there isan equilibrium on G n near (cid:98) V sym ( n ). Step 3 defines a no-large-deviations event depending OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 59 on the realized Erdos-Renyi network, and we condition on this event. Step 4 shows that Φmaps a small neighborhood of (cid:98) V sym ( n ) to itself. Finally, in Step 5 we apply the Brouwerfixed point theorem to conclude there exists an equilibrium on G n in this neighborhood. Step 1:
We first consider a symmetric and deterministic version G symn of the network G n on which all agents observe exactly pn other agents and any pair of agents commonlyobserves exactly p n other agents.Let V sym ⊂ V be the space of covariance matrices for which each entry V ( n ) ij dependsonly on whether i and j are equal and not on the particular agents. Even if such a G symn network does not exist (for combinatorial reasons), updating as if on such a network inducesa well-defined map Φ sym : V sym → V sym . This map Φ sym must have a fixed point, which wecall (cid:98) V sym ( n ). We will next show that the variances and covariances at (cid:98) V sym ( n ) convergeto V ∞ and Cov ∞ . The remainder of the proof will show that for n large enough, thereexists an equilibrium (cid:98) V ( n ) on G n close to the equilibrium (cid:98) V sym ( n ) on G symn . Step 2: At (cid:98) V sym ( n ), each agent’s social signal is: r i,t = (cid:88) j ∈ N i ρa j,t − pn . So the variance of the social signal about θ t is κ i,t = ( ρ (cid:98) V sym ( n ) ,t + 1) pn + ( pn − ρ (cid:98) V sym ( n ) ,t + 1) pn . Thus the covariance of any two distinct agents solves (cid:98) V sym ( n ) ,t = κ − i,t ( σ − + κ − i,t ) (cid:32) ( ρ (cid:98) V sym ( n ) ,t + 1) p n + ( p n − ρ (cid:98) V sym ( n ) ,t + 1) p n (cid:33) . As n → ∞ , the right-hand side approaches( ρ (cid:98) V sym ( n ) ,t + 1) − [ σ − + ( ρ (cid:98) V sym ( n ) ,t + 1) − ] , and the unique real solution to this equation is Cov ∞ . Computing (cid:98) V sym ( n ) ,t in terms of (cid:98) V sym ( n ) ,t , we also see the variances converge to V ∞ . Step 3:
We will show that when ζ = n , the updating map Φ on the network G n mapsa small neighborhood around (cid:98) V sym ( n ) to itself. Let V n ⊂ V be the subset of covariancematrices such that V ( n ) ij ∈ [ (cid:98) V sym ( n ) − ζ, (cid:98) V sym ( n ) + ζ ]for all i and j . We will show in Steps 3 and 4 that Φ( V n ) ⊂ V n for n large enough. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 60
We first show that the network is close to symmetric with high probability. We willconsider the event E = E ∩ E , where the E i are defined by:( E ) : The degree of each agent i is between 1 − ζ times its expected value and 1 + ζ times its expected value, i.e., in [(1 − n ) pn/ , (1 + n ) pn ].( E ) : For any two agents i and i (cid:48) , the number of agents observed by both i and i (cid:48) between 1 − ζ times its expected value and 1 + ζ times its expected value, i.e., in [(1 − n ) p n/ , (1 + n ) p n ].We can show as in the proof of Theorem 1 that the probability of the complement ofevent E vanishes exponentially in n . We will condition on the event E , which occurs withprobability converging to 1, for the remainder of the proof. Step 4:
Assume all agents observe period t actions with covariances in V n and then actoptimally in period t + 1. We can show as in the proof of Lemma 3 that there exists aconstant γ such any agent’s weight W ij,t +1 on an observed neighbor is in [(1 − γ/n ) n , (1 + γ/n ) n ]. The relevant matrix C i (0) now has only one block because we have only signaltype, so the calculation is in fact simpler.We have r i,t +1 = (cid:88) j W ij,t +1 a j,t , and therefore for i and i (cid:48) distinct,Cov( r i,t +1 − θ t +1 , r i (cid:48) ,t +1 − θ t +1 ) = (cid:88) j,j (cid:48) W ij,t +1 W i (cid:48) j (cid:48) ,t +1 ( ρ V jj (cid:48) ,t + 1) . The terms W ij,t +1 W i (cid:48) j (cid:48) ,t +1 sum to 1, and each non-zero term is contained in [ (1 − γ/n ) n , (1+ γ/n ) n ].The terms V jj (cid:48) ,t are each contained in [ (cid:98) V sym ( n ) − n , (cid:98) V sym ( n ) + n ] (for j and j (cid:48) distinct)and the terms V jj,t are each contained in [ (cid:98) V sym ( n ) − n , (cid:98) V sym ( n ) + n ]. So (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Cov( r i,t +1 − θ t +1 , r i (cid:48) ,t +1 − θ t +1 ) − ( ρ (cid:98) V sym ( n ) ,t + 1) p n − ( p n − ρ (cid:98) V sym ( n ) ,t + 1) p n (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ρ n + O ( 1 n ) , where the terms of order n come from variation in weights and variation in the network.The term ( ρ (cid:98) V sym ( n ) ,t + 1) p n + ( p n − ρ (cid:98) V sym ( n ) ,t + 1) pn is the covariance of two distinct social signals in G symn . OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 61
Similarly (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)
Var( r i,t +1 − θ t +1 , r i (cid:48) ,t +1 − θ t +1 ) − ( ρ (cid:98) V sym ( n ) ,t + 1) pn − ( pn − ρ (cid:98) V sym ( n ) ,t + 1) pn (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ρ n + O ( 1 n ) . The term ( ρ (cid:98) V sym ( n ) ,t + 1) pn + ( pn − ρ (cid:98) V sym ( n ) ,t + 1) pn is the covariance of two distinct social signals in G symn .We compute from these inequalities that the variances V ii,t +1 and covariances V ii (cid:48) ,t +1 ofactions are within n of (cid:98) V sym ( n ) and (cid:98) V sym ( n ) , respectively. This shows that Φ( V n ) ⊂ V n . Step 5:
By the Brouwer fixed point theorem, there exists an equilibrium (cid:98) V ( n ) on G n with the desired properties. Because V ∞ > (1 + σ − ) − , there exists ε > ε -perfect aggregation benchmark is not achieved at this equilibrium for any n .D.4. Proof of Proposition 4.
Suppose that for all ε >
0, agent 1 acheives the ε -perfectaggregation benchmark on G n for some n . This implies the result after relabeling agentsas necessary. We can assume that agent 1 has at least one neighbor in each G n . We willdiscuss the case of rational agents using positive weights.If agent 1 achieve ε -perfect aggregation benchmark on G n for ε small enough, then (cid:98) V ii ( n ) < n so that V ( n ) < n forthe remainder of the proof).Then, at naive equilibrium, any agent i connected to 1 chooses an estimator r naivei,t of θ t based on observed actions which she believes has variance ( κ naivei,t ) less than 1 + ρ ≤ i ’s action a i,t = ( κ naivei,t ) − r i,t + σ − i s i,t ( κ naivei,t ) − + σ − i , puts weight at least σ on observed actions.Therefore, in period t , agent 1’s best estimator r naive ,t of θ t − (indirectly) puts weight atleast ρ σ on actions from period t −
2. BecauseVar( ρ (cid:88) b j a j,t − − θ t − ) = Var( ρ (cid:88) b j a j,t − − θ t − ) + Var( θ t − − θ t − ) ≥ positive coefficients b j summing to 1, the variance of r naivei,t − θ t − is at least ρ (2+ σ ) .But then agent 1’s action variance is bounded away from ( σ − i + 1) and this bound isindependent of ε and n , which contradicts our assumption that for all ε agent 1 achievesthe ε -perfect aggregation benchmark on some G n . OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 62
D.5. Proof of Proposition 5. We prove the following statement, which includes the proposition as special cases.
Proposition 6.
Suppose the network G is strongly connected. Consider weights W and w s and suppose they are all positive, with an associated steady state V t . Suppose either(1) there is an agent i whose weights are a Bayesian best response to V t , and some agentobserves that agent and at least one other neighbor; or(2) there is an agent whose weights are a naive best response to V t , and who observesmultiple neighbors.Then the steady state V t is Pareto-dominated by another steady state. We provide the proof in the case m = 1 to simplify notation. The argument carriesthrough with arbitrary finite memory.Case (1): Consider an agent l who places positive weight on a rational agent k andpositive weight on at least one other agent. Define weights W by W ij = W ij and w si = w si for all i (cid:54) = k , W kj = (1 − (cid:15) ) W kj for all j ≤ n , and w sk = (1 − (cid:15) ) w sk + (cid:15), where W ij and w si are the weights at the initial steady state. In words, agent k places weight (1 − (cid:15) ) onher equilibrium strategy and extra weight (cid:15) on her private signal. All other players use thesame weights as at the steady state.Suppose we are at the initial steady state until time t , but in period t and all subsequentperiods agents instead use weights W . These weights give an alternate updating functionΦ on the space of covariance matrices. Because the weights W are positive and fixed,all coordinates of Φ are increasing, linear functions of all previous period variances andcovariances. Explicitly, the diagonal terms are[Φ( V t )] ii = ( w si ) σ i + (cid:88) j,j (cid:48) ≤ n W ij W ij (cid:48) V jj (cid:48) ,t and the off-diagonal terms are[Φ( V t )] ii (cid:48) = (cid:88) j,j (cid:48) ≤ n W ij W i (cid:48) j (cid:48) V jj,t (cid:48) . So it is sufficient to show the variances Φ h ( V t ) after applying Φ for h periods Paretodominate the variances in V t for some h .In period t , the change in weights decreases the covariance V jk,t of k and some otheragent j , who l also observes, by f ( (cid:15) ) of order Θ( (cid:15) ). By the envelope theorem, the change inweights only increases the variance V kk by O ( (cid:15) ). Taking (cid:15) sufficiently small, we can ignore O ( (cid:15) ) terms. OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 63
There exists a constant δ > δ . Then each coordinate [Φ( V )] ii is linear with coefficient at least δ on each varianceor covariance of agents observed by i .Because agent l observes k and another agent, agent l ’s variance will decrease below itsequilibrium level by at least δ f ( (cid:15) ) in period t + 1. Because Φ is increasing in all entriesand we are only decreasing covariances, agent l ’s variance will also decrease below its initiallevel by at least δ f ( (cid:15) ) in all periods t (cid:48) > t + 1.Because the network is strongly connected and finite, the network has a diameter. After d + 1 periods, the variances of all agents have decreased by at least δ d +2 f ( (cid:15) ) from theirinitial levels. This gives a Pareto improvement.Case (2): Consider a naive agent k who observes at least two neighbors. We can writeagent k ’s period t action as a k,t = w sk s i,t + (cid:88) j ∈ N i W kj a j,t − . Define new weights W as in the proof of case (1). Because agent k is naive and the sum-mation (cid:80) j ∈ N i W kj a j,t − has at least two terms, she believes the variance of this summationis smaller than its true value. So marginally increasing the weight on s i,t and decreasingthe weight on this summation decreases her action variance. This deviation also decreasesher covariance with any other agent. The remainder of the proof proceeds as in case (1). Appendix E. Naive Agents (online appendix)
In this section we provide rigorous detail for the analysis given in Section 5.1. We will describe outcomes with two signal types, $\sigma^2_A$ and $\sigma^2_B$. We use the same random network model as in Section 4.3 and assume each network type contains equal shares of agents with each signal type. We can define variances
$$\text{(E.1)}\qquad V^{\infty}_A = \frac{\kappa_t + \sigma^{-2}_A}{\left(1 + \sigma^{-2}_A\right)^2}, \qquad V^{\infty}_B = \frac{\kappa_t + \sigma^{-2}_B}{\left(1 + \sigma^{-2}_B\right)^2},$$
where
$$\kappa_t^{-1} = 1 - \frac{\rho^2}{\left(\sigma^{-2}_A + \sigma^{-2}_B\right)^2}\left(\frac{\sigma^{-2}_A}{1 + \sigma^{-2}_A} + \frac{\sigma^{-2}_B}{1 + \sigma^{-2}_B}\right)^2.$$
Naive agents' equilibrium variances converge to these values. The general case, with many signal types, is similar.
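A quick way to sanity-check the limits in (E.1) (this code and its function names are ours): compute $\kappa_t$ and the group variances from the closed forms above, then verify that plugging the limiting covariances back into the fixed-point relation (E.2) reproduces the same $\kappa_t$.

```python
import numpy as np

def naive_limits(sigma2_A, sigma2_B, rho):
    """Limiting social-signal variance kappa and action variances V_A, V_B for
    naive agents with two signal types, using the closed forms in (E.1)."""
    pA, pB = 1.0 / sigma2_A, 1.0 / sigma2_B          # signal precisions
    inner = (pA / (1 + pA) + pB / (1 + pB)) / (pA + pB)
    kappa = 1.0 / (1.0 - rho**2 * inner**2)
    V_A = (kappa + pA) / (1 + pA) ** 2
    V_B = (kappa + pB) / (1 + pB) ** 2
    return kappa, V_A, V_B

kappa, V_A, V_B = naive_limits(sigma2_A=2.0, sigma2_B=4.0, rho=0.9)

# Consistency check against (E.2): the limiting covariances are
#   Cov_AA = kappa/(1+pA)^2, Cov_BB = kappa/(1+pB)^2, Cov_AB = kappa/((1+pA)(1+pB)),
# and plugging them into (E.2) should return the same kappa.
pA, pB = 0.5, 0.25        # precisions matching sigma2_A = 2 and sigma2_B = 4
cov_AA = kappa / (1 + pA) ** 2
cov_BB = kappa / (1 + pB) ** 2
cov_AB = kappa / ((1 + pA) * (1 + pB))
kappa_check = 0.9**2 / (pA + pB) ** 2 * (pA**2 * cov_AA + pB**2 * cov_BB + 2 * pA * pB * cov_AB) + 1.0
print(kappa, kappa_check, V_A, V_B)
```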
Proposition 7.
Under the assumptions in this subsection:(1) There is a unique equilibrium on G n .(2) Given any δ > , asymptotically almost surely all agents’ equilibrium variances arewithin δ of V ∞ A and V ∞ B .(3) There exists ε > such that asymptotically almost surely the ε -perfect aggregationbenchmark is not achieved, and when σ A = σ B asymptotically almost surely all agents’variances are larger than V ∞ . Aggregating information well requires a sophisticated response to the correlations inobserved actions. Because naive agents completely ignore these correlations, their learn-ing outcomes are poor. In particular their variances are larger than at the equilibria wediscussed in the Bayesian case, even when that equilibrium is inefficient ( σ A = σ B ).When signal qualities are homogeneous ( σ A = σ B ), we obtain the same limit on anynetwork with enough observations. That is, on any sequence ( G n ) ∞ n =1 of (deterministic)networks with the minimum degree diverging to ∞ and any sequence of equilibria, theequilibrium action variances of all agents converge to V ∞ A .E.1. Proof of Proposition 7.
We first check that there is a unique naive equilibrium.As in the Bayesian case, covariances are updated according to equations 3.3: V ii,t = ( w si,t ) σ i + (cid:88) W ik,t W ik (cid:48) ,t ( ρ V kk (cid:48) ,t − + 1) and V ij,t = (cid:88) W ik,t W i (cid:48) k (cid:48) ,t ( ρ V kk (cid:48) ,t − + 1) . The weights W ik,t and w si,t are now all positive constants that do not depend on V t − .So differentiating this formula, we find that all partial derivatives are bounded above by1 − w si,t <
1. So the updating map (which we call Φ naive ) is a contraction in the sup normon V . In particular, there is at most one equilibrium.The remainder of the proof characterizes the variances of agents at this equilibrium. Wefirst construct a candidate equilibrium with variances converging to V ∞ A and V ∞ B , and thenwe show that for n sufficiently large, there exists an equilibrium nearby in V .To construct the candidate equilibrium, suppose that each agent observes the same num-ber of neighbors of each signal type. Then there exists an equilibrium (cid:98) V sym where covari-ances depend only on signal types, i.e., (cid:98) V sym is invariant under permutations of indicesthat do not change signal types. We now show variances of the two signal types at thisequilibrium converge to V ∞ A and V ∞ B .To estimate θ t − , a naive agent combines observed actions from the previous period withweight proportional to their precisions σ − A or σ − B . The naive agent incorrectly believesthis gives an almost perfect estimate of θ t − . So the weight on older observations vanishes OCIAL LEARNING IN A DYNAMIC ENVIRONMENT 65 as n → ∞ . The naive agent then combines this estimate of θ t − with her private signal,with weights converging to the weights she uses if the estimate is perfect.Agent i observes | N i | neighbors of each signal type, so her estimate r naivei,t of θ t − isapproximately: r naivei,t = 2 | N i | ( σ − A + σ − B ) σ − A (cid:88) j ∈ N i ,σ j = σ A ρa j,t − + σ − B (cid:88) j ∈ N i ,σ j = σ B ρa j,t − . The actual variance of this estimate converges to:(E.2) Var( r naivei,t − θ t ) = ρ ( σ − A + σ − B ) (cid:2) σ − A Cov ∞ AA + σ − B Cov ∞ BB + 2 σ − A σ − B Cov ∞ AB (cid:3) + 1where Cov ∞ AA is the covariance of two distinct agents of signal type A and Cov ∞ BB and Cov ∞ AB are defined similarly.Since agents believe this variance is close to 1, the action of any agent with signal variance σ A is approximately: a i,t = r naivei,t + σ − A s i,t σ − A . We can then compute the limits of the covariances of two distinct agents of various signaltypes to be:
$$\mathrm{Cov}^{\infty}_{AA} = \frac{\kappa_t}{\left(1+\sigma^{-2}_A\right)^2}; \qquad \mathrm{Cov}^{\infty}_{BB} = \frac{\kappa_t}{\left(1+\sigma^{-2}_B\right)^2}; \qquad \mathrm{Cov}^{\infty}_{AB} = \frac{\kappa_t}{\left(1+\sigma^{-2}_A\right)\left(1+\sigma^{-2}_B\right)}.$$
Plugging into (E.2) we obtain
$$\kappa_t^{-1} = 1 - \frac{\rho^2}{\left(\sigma^{-2}_A + \sigma^{-2}_B\right)^2}\left(\frac{\sigma^{-2}_A}{1+\sigma^{-2}_A} + \frac{\sigma^{-2}_B}{1+\sigma^{-2}_B}\right)^2.$$
Using this formula, we can check that the limits of agent variances in $\widehat{V}^{\mathrm{sym}}$ match equations (E.1).

We must check there is an equilibrium near $\widehat{V}^{\mathrm{sym}}$ with high probability. Let $\zeta = 1/n$. Let $E$ be the event that for each agent $i$, the number of agents observed by $i$ with private signal variance $\sigma^2_A$ is within a factor of $[1-\zeta,\,1+\zeta]$ of its expected value, and similarly the number of agents observed by $i$ with private signal variance $\sigma^2_B$ is within a factor of $[1-\zeta,\,1+\zeta]$ of its expected value. This event implies that each agent observes a linear number of neighbors and observes approximately the same number of agents with each signal quality. We can show as in the proof of Theorem 1 that for $n$ sufficiently large, the event $E$ occurs with probability at least $1-\zeta$. We condition on $E$ for the remainder of the proof.

Let $V_{\varepsilon}$ be the $\varepsilon$-ball around $\widehat{V}^{\mathrm{sym}}$ in the sup norm. We claim that for $n$ sufficiently large, the updating map preserves this ball: $\Phi^{\mathrm{naive}}(V_{\varepsilon}) \subset V_{\varepsilon}$. We have $\Phi^{\mathrm{naive}}(\widehat{V}^{\mathrm{sym}}) = \widehat{V}^{\mathrm{sym}}$ up to terms of $O(1/n)$. As we showed in the first paragraph of this proof, the partial derivatives of $\Phi^{\mathrm{naive}}$ are bounded above by a constant less than one. For $n$ large enough, these facts imply $\Phi^{\mathrm{naive}}(V_{\varepsilon}) \subset V_{\varepsilon}$. We conclude there is an equilibrium in $V_{\varepsilon}$ by the Brouwer fixed point theorem.

Finally, we compare the equilibrium variances to perfect aggregation and to $V^{\infty}$. It is easy to see these variances are worse than the perfect aggregation benchmark, and therefore by Theorem 1 also asymptotically worse than the Bayesian case when $\sigma_A \neq \sigma_B$. In the case $\sigma_A = \sigma_B$, it is sufficient to show that Bayesian agents place more weight on their private signals (since asymptotically action error comes from past changes in the state and not signal errors). Call the private signal variance $\sigma^2$. For Bayesian agents, we showed in Theorem 1 that the weight on the private signal is equal to $\frac{\sigma^{-2}}{\sigma^{-2} + (\rho^2\mathrm{Cov}^{\infty}+1)^{-1}}$, where $\mathrm{Cov}^{\infty}$ solves
$$\mathrm{Cov}^{\infty} = \frac{(\rho^2\mathrm{Cov}^{\infty}+1)^{-1}}{\left[\sigma^{-2} + (\rho^2\mathrm{Cov}^{\infty}+1)^{-1}\right]^2}.$$
For naive agents, the weight on the private signal is equal to $\frac{\sigma^{-2}}{\sigma^{-2}+1}$, which is smaller since $\mathrm{Cov}^{\infty} > 0$.

Appendix F. Socially optimal learning outcomes with non-diverse signals (online appendix)
In this section, we show that a social planner can achieve asymptotically perfect aggrega-tion even when signals are non-diverse. Thus, the failure to achieve perfect aggregation atequilibrium with non-diverse signals is a consequence of individual incentives rather thana necessary feature of the environment.Let G n be the complete network with n agents. Suppose that σ i = σ for all i and m = 1. Proposition 8.
Let $\varepsilon > 0$. Under the assumptions in this section, for $n$ sufficiently large there exist weights $W$ and $w^s$ such that at the corresponding steady state on $G_n$, the $\varepsilon$-perfect aggregation benchmark is achieved.

Proof. An agent with a social signal equal to $\theta_{t-1}$ would place weight $\frac{\sigma^{-2}}{\sigma^{-2}+1}$ on her private signal and weight $\frac{1}{\sigma^{-2}+1}$ on her social signal. Let $w^s_A = \frac{\sigma^{-2}}{\sigma^{-2}+1} + \delta$ and $w^s_B = \frac{\sigma^{-2}}{\sigma^{-2}+1} - \delta$, where we will take $\delta > 0$ to be small.
Assume that the first $\lfloor n/2 \rfloor$ agents place weight $w^s_A$ on their private signals and weight $1 - w^s_A$ on a common social signal $r_t$ we will define, while the remaining agents place weight $w^s_B$ on their private signals and weight $1 - w^s_B$ on the social signal $r_t$. As in the proof of Theorem 2,
$$\frac{1}{\lfloor n/2 \rfloor}\sum_{j=1}^{\lfloor n/2 \rfloor} a_{j,t-1} = w^s_A\,\theta_{t-1} + (1 - w^s_A)\, r_{t-1} + O(n^{-1/2}),$$
$$\frac{1}{\lceil n/2 \rceil}\sum_{j=\lfloor n/2 \rfloor + 1}^{n} a_{j,t-1} = w^s_B\,\theta_{t-1} + (1 - w^s_B)\, r_{t-1} + O(n^{-1/2}).$$
There is a linear combination of these summations equal to $\theta_{t-1} + O(n^{-1/2})$, and we can take $r_t$ equal to this linear combination. Taking $\delta$ sufficiently small and then $n$ sufficiently large, we find that $\varepsilon$-perfect aggregation is achieved. $\square$

In Figure F.1, we conduct the same exercise as in Figure 4.1 with $n = 600$. The difference is that we now also add the prediction variance of group A when a social planner minimizes the total prediction variance (of both groups). The weights that each agent puts on her own private signal and the other agents are set to depend only on the groups. Under these socially optimal weights agents learn very well, and heterogeneity in signal variances only has a small impact.
Figure F.1. Social Planner and Bayesian Learning. [Figure: group A's prediction variance plotted against $\sigma^2_B$, with $\sigma^2_A = 2$ and $n = 600$, comparing Bayesian equilibrium learning with the social planner's weights.]
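The construction in the proof of Proposition 8 is easy to simulate. The code below is ours; the burn-in length, parameter values, and initialization are arbitrary choices. Two groups perturb the ideal private-signal weight by $\pm\delta$; each period the planner's common social signal is the linear combination of the two previous-period group averages that recovers $\theta_{t-1}$; the realized prediction variance then sits close to the benchmark $(1+\sigma^{-2})^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)

def planner_simulation(n=600, sigma2=2.0, rho=0.9, delta=0.05, T=2000):
    w_star = (1 / sigma2) / (1 / sigma2 + 1)       # weight if the social signal were theta_{t-1}
    wA, wB = w_star + delta, w_star - delta
    half = n // 2
    # Coefficients on the two group averages that return theta_{t-1} up to averaging
    # noise: lamA*wA + lamB*wB = 1 and lamA*(1-wA) + lamB*(1-wB) = 0.
    lamA, lamB = np.linalg.solve([[wA, wB], [1 - wA, 1 - wB]], [1.0, 0.0])
    theta, r = 0.0, 0.0
    sq_errors = []
    for t in range(T):
        theta = rho * theta + rng.standard_normal()
        s = theta + np.sqrt(sigma2) * rng.standard_normal(n)
        a = np.empty(n)
        a[:half] = wA * s[:half] + (1 - wA) * rho * r
        a[half:] = wB * s[half:] + (1 - wB) * rho * r
        if t > 50:                                  # burn-in
            sq_errors.append(np.mean((a - theta) ** 2))
        r = lamA * a[:half].mean() + lamB * a[half:].mean()   # next period's social signal
    return np.mean(sq_errors), 1.0 / (1.0 / sigma2 + 1.0)

print(planner_simulation())   # realized variance vs. the benchmark (1 + sigma^{-2})^{-1}
```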