Belief Error and Non-Bayesian Social Learning: Experimental Evidence

Abstract

This paper experimentally studies whether individuals hold a first-order belief that others apply Bayes' rule to incorporate private information into their beliefs, which is a fundamental assumption in many Bayesian and non-Bayesian social learning models. We design a novel experimental setting in which the first-order belief assumption implies that social information is equivalent to private information. Our main finding is that participants' reported reservation prices of social information are significantly lower than those of private information, which provides evidence that casts doubt on the first-order belief assumption. We also build a novel belief error model in which participants form a random posterior belief with a Bayesian posterior belief kernel to explain the experimental findings. A structural estimation of the model suggests that participants' sophisticated consideration of others' belief error and their exaggeration of the error both contribute to the difference in reservation prices.


Boğaçhan Çelen, Sen Geng, and Huihui Li
University of Melbourne; Xiamen University
November 20, 2020

Keywords: private information, social information, belief error, non-Bayesian social learning
JEL: C91, C92, D83

Footnote (*): We have benefited from the helpful comments of seminar participants at Huazhong University of Science and Technology, Monash University, and Xiamen University, and of participants at the Xiamen International Workshop on Experimental Economics (2014, 2016), the China Meeting of the Econometric Society (2017), the NYU CESS Meeting (2017), the China Greater Bay Area Experimental Economics Workshop (2018), and the ESA World Meeting (2018). Geng acknowledges financial support from the Fundamental Research Funds for the Central Universities (No. 20720151323) and China's NSF Grant (No. 71703134).

arXiv [econ.GN]

1 Introduction

In many settings where individuals have to make a choice without knowing the payoff-relevant underlying state, they can learn either from observing signals indicative of the underlying state (labeled private information) or from observing predecessors' choices in similar settings (labeled social information). A typical social learning environment often involves both private information and social information. How individuals aggregate the two types of information is a continuing focus of the literature on social learning.

As a useful benchmark, Bayesian social learning models assume that it is common knowledge that individuals apply Bayes' rule to incorporate private information and social information into their beliefs. Nevertheless, social information is often nested, in the sense that predecessors may base their choices on private information and social information, which may itself originate from private information and social information, and so on.
The complexity of the network structure underlying social information and ignorance about predecessors' information structures are likely to prevent individuals from interpreting social information rationally. Motivated by this observation, non-Bayesian social learning models deviate from the Bayesian benchmark mainly by relaxing the assumption that individuals incorporate social information in a Bayesian manner. Meanwhile, non-Bayesian models maintain the assumption that it is common knowledge that individuals apply Bayes' law to incorporate private information, i.e., individuals incorporate private information in a Bayesian manner and hold first-order and higher-order beliefs that others process private information in a Bayesian way.

We design a novel experiment to test the first-order belief assumption, namely that individuals think that others apply Bayes' law to incorporate private information into their beliefs. The multiplayer social learning game in our experiment consists of three stages. In stage 1, either urn 1 or urn 2 is selected for the game with equal chance. Urn 1 contains a fraction $p$ of black balls and a fraction $1 - p$ of white balls, and urn 2 contains a fraction $p$ of white balls and a fraction $1 - p$ of black balls. The fraction $p$ is public information, but the players do not know which of the two urns is selected. Each player draws a ball from the urn with replacement and privately observes the color of the drawn ball. Then, each player makes an initial binary choice, i.e., guesses whether urn 1 or urn 2 is used. In stage 2, each player receives an endowment and states her reservation price for receiving either additional private information (i.e., the colors of additional ball draws) or additional social information (i.e., the initial choices made by other players). In stage 3, the Becker-DeGroot-Marschak method (Becker et al., 1964) is employed to determine whether a player successfully purchases the additional information. Conditional on a successful purchase, the player observes the additional information and then makes a second binary choice between urn 1 and urn 2. The payoff of each player is a fixed reward for a correct final choice plus her remaining endowment. Our experiment varies three parameters: (1) information type, i.e., additional private information or additional social information; (2) signal quantity, i.e., one or three additional observations; and (3) signal quality, with $p$ increasing from 0.6 to 0.75 to 0.9.

Footnote: This article confines its discussion of social information to observing others' actions. More generally, social information also includes observing payoffs of others' actions, communication, and observation of others' beliefs and opinions.

Footnote: See Bikhchandani et al. (1992), Banerjee (1992), and Smith and Sørensen (2000) for an exploration of information aggregation in settings where each individual observes private information and all predecessors' choices. See Acemoglu et al. (2011) for an exploration of information aggregation in settings where each individual observes private information and a stochastically generated subset of predecessors' choices.

Given our experimental design, the first-order belief assumption implies that social information is identical to private information. Since both the network structure and the information generating process are sufficiently simple and are known to individuals, whether an individual views social information as identical to private information essentially depends on whether she believes that others make a Bayesian first choice, i.e., choose urn 1 (or urn 2) after observing a black (or white) ball. In addition to the first-order belief assumption that others apply Bayes' law, two other independent assumptions work together to make her believe that others make a Bayesian first choice. The first is that she believes that no unobservables prevent her and others from agreeing on the state-contingent payoff of each action. While the possibility of such unobservables cannot be completely ruled out, it is reasonable for her to accept the absence of such a possibility given the experimental design.
The second is that she believes that others choose optimally given their own beliefs. Since choosing optimally given one's own belief is treated as basic individual rationality and is the implicit guiding principle underlying incentivized economic experiments, an individual in our simple decision task is expected to hold this belief. Therefore, the first-order belief assumption that others apply Bayes' law to incorporate private information yields a testable implication: social information has the same value as private information.

Our main finding is that participants' reservation prices for observing additional social information are significantly lower than those for observing additional private information, with respect to both the mean and the empirical distribution of the reservation prices. This finding clearly rejects the null hypothesis that the two types of information are equivalent and in turn casts doubt on the first-order belief assumption of social learning models. We also find that participants do not always make a Bayesian first choice and, more interestingly, that the frequency of Bayesian choices increases as the signal quality increases. This finding casts doubt on the assumption that individuals apply Bayes' law to incorporate private information and suggests that signal quality affects whether participants follow Bayes' law. In addition, we find that participants generally report higher reservation prices than predicted under the Bayesian assumption and, in particular, report a considerably positive reservation price for observing one additional signal, which has zero informational value in the Bayesian paradigm.

A key question about the main finding is whether it can be fully explained by participants' sophisticated consideration of others' failure to apply Bayes' law to incorporate private information, i.e., whether the second finding rationalizes the first finding.
To address this question and gain deeper insight into these findings, we propose a novel belief error model with two main assumptions. The first assumption is that, after receiving information, participants form a random posterior belief with a Bayesian kernel due to belief error. Specifically, we assume that the value of a participant's random posterior belief follows a beta distribution with a mean equal to the corresponding Bayesian posterior probability and with a nonnegative parameter $\gamma$ measuring the degree of belief error. The model with this assumption predicts that the chance of participants following Bayes' law rises as the signal quality increases and that observing one additional signal has positive informational value. The second assumption is that participants hold the first-order belief that others also form a random posterior belief that follows a beta distribution with a mean equal to the corresponding Bayesian posterior probability and with a nonnegative parameter $\theta$ measuring the degree of others' belief error in their opinion. From this perspective, the social information setting degenerates into the private information setting when $\theta$ shrinks to zero. Overall, the belief error model predicts that the optimal reservation price is increasing in the value of the first-stage posterior belief on the interval $[0, 1/2]$ but decreasing on the interval $[1/2, 1]$, indicating that the more uncertain participants feel about the underlying state, the more highly they value additional information.

We employ maximum likelihood estimation to estimate a heterogeneous belief error model. We find that the average first-order belief about others' belief error, $\bar{\theta}$, is considerably higher than the average belief error, $\bar{\gamma}$. Additional generalized likelihood ratio tests confirm the difference. Our tests reject $\bar{\theta} = 0$ in favor of $\bar{\theta} > 0$, which suggests that a sophisticated consideration of others' belief error contributes to the gap in reservation prices between private information and social information. In addition, the hypothesis that $\bar{\theta} \leq \bar{\gamma}$ is also rejected, which suggests that an exaggeration of others' belief error also contributes to the gap in reservation prices.

Footnote: A few experimental studies (Bohm et al. 1997; Plott and Zeiler 2005; Cason and Plott 2014) show that participants may misperceive the Becker-DeGroot-Marschak elicitation method, that the elicited value may be sensitive to the choice of bounds of the randomly generated number, and that participants may overbid. The concern about the elicitation method may dampen the last finding of the considerably higher reservation price. Nevertheless, this concern should not adversely affect our main finding of the gap in reservation prices because its effect (if any) on the private information and social information treatments should be similar.

Our paper is related to the experimental studies on social learning games initiated by Anderson and Holt (1997). Subsequent experimental studies modify the baseline design and investigate systematic choice behavior off the Bayesian equilibrium path. Among these studies, Nöth and Weber (2003), Çelen and Kariv (2004), and Goeree et al. (2007) find, based on their model estimation, that private information is overweighted relative to social information. To circumvent confounding factors underlying this interpretation, such as the specific modeling of the decision process, Weizsäcker (2010) applies a reduced-form approach to perform a meta-analysis of 13 social learning experiments and finds that participants are much less likely to choose optimally in cases where the empirically optimal action contradicts their own signal. Nevertheless, the interpretation of this finding as evidence of participants' overweighting private information is undermined by the varying lucrativeness of choosing optimally.
Our experimental design can be viewed as a truncated version of the sequential social learning design with only the first two positions. The experiment is intentionally designed in this way to identify the effect of individuals' first-order beliefs about others' applying Bayes' law. Our paper contributes to this strand of literature first by providing unequivocal evidence that individuals value private information more than social information and by identifying their first-order beliefs that others do not apply Bayes' law as a reason. In addition, a recent study by De Filippis et al. (2017) investigates the first two positions of a sequential social learning game. They collect belief data and infer from the model estimation that private information is overweighted only if it contradicts the predecessor's belief. By contrast, we collect subjects' reservation price data and choice data and provide direct evidence [...]

Footnote: See Hung and Plott (2001), Nöth and Weber (2003), Kübler and Weizsäcker (2004), and Goeree et al. (2007), for example.

Footnote: One may define the lucrativeness of choosing optimally in any given information set as the increment in expected payoff from choosing optimally rather than sub-optimally or, given the binary choice and the normalized payoff, as the expected payoff from choosing the optimal action. In the meta-analysis, this value is equal to the fraction, among all decision rounds that include a specific information set, of decision rounds with an underlying true state in favor of the optimal action. Then, a case where following one's own signal is empirically optimal corresponds to an information set in which the fraction of decision rounds with an underlying true state in favor of one's own signal is greater than one-half. A case where contradicting one's own signal is empirically optimal corresponds to an information set in which the fraction of decision rounds with an underlying true state against one's own signal is greater than one-half. Conceivably, the fraction in the former case is on average larger than in the latter case. Thus, the finding in Weizsäcker (2010) that participants respond to incentives suggests that participants' greater reluctance to contradict their own signal may be due to the smaller incentive rather than to their overweighting of private information.

Footnote: Dominitz and Hung (2009) also collect subjects' belief data in a stylized information cascade experiment and find that private information is not overweighted.

Among these non-Bayesian models, most recently in Molavi et al. (2018), individuals are assumed to process private information in a Bayesian manner, which is treated as common knowledge among them. A class of non-Bayesian social learning models initiated by DeGroot (1974), which assumes a linear aggregation of private information and social information, also exists. By contrast, our belief error model focuses on a basic deviation from the first-order belief assumption and applies to social learning settings where the network structure is sufficiently simple and individuals are fully informed of others' information structure, which has a simple form. Therefore, our paper should be viewed as complementary to the existing work on non-Bayesian social learning models.

Moreover, our paper is related to the literature on non-Bayesian individual decision making. Individuals have been shown to deviate systematically from the Bayesian updating paradigm in many settings of individual decision making. Alternative models that capture a certain deviation from Bayesian updating have been proposed via two typical approaches.
In the first approach, non-Bayesian descriptive models are formalized to characterize individuals' misinterpretation of the signal generating process or of signals (Barberis et al., 1998; Rabin, 2002; Rabin and Schrag, 1999; Rabin and Vayanos, 2010). In the second approach, non-Bayesian decision models are built on axiomatic foundations and are applied to settings where the prior probability is subjectively determined and even adjusted in the presence of new observations (Epstein, 2006; Epstein et al., 2008; Ortoleva, 2012). By contrast, our non-Bayesian decision model applies even when both the prior probability and the signal generating process are given objectively and can be reasonably agreed on by individuals. Moreover, our model can explain experimental findings that are difficult to reconcile with the aforementioned non-Bayesian decision models, e.g., the monotonic relationship between signal quality and the frequency of making a Bayesian choice.

Footnote: See Bala and Goyal (1998), Rabin and Vayanos (2010), Guarino and Jehiel (2013), Eyster and Rabin (2010, 2014), and Bohren (2016), among others.

Footnote: See DeMarzo et al. (2003), Golub and Jackson (2010, 2012), and Jadbabaie et al. (2012), among others.

Footnote: See, for example, Tversky and Kahneman (1974), Holt and Smith (2009), and the surveys in Camerer (1995), Rabin (1998), and Camerer and Loewenstein (2004).

Footnote: For example, Epstein et al. (2010), in the spirit of Epstein (2006) and Epstein et al. (2008), assume that agents' posterior belief is a linearly weighted sum of the Bayesian posterior probability and the prior probability. This assumption always predicts a Bayesian first choice in our experiment, which is inconsistent with the data.

Footnote: Caplin and Dean (2015) also provide a rational inattention foundation for agents' stochastic choice without imposing specific assumptions about the functional form of the information cost.

[...] We argue in Section 5 that applying logit models with either foundation to our setting is inappropriate. Instead, we propose a novel belief error model to generate the stochastic binary choice. In terms of modeling error, our method of making assumptions on random posterior beliefs differs from the typical method of assuming an additively separable error term, as in the logit model.

The remaining parts of the paper proceed as follows. Section 2 presents a Bayesian benchmark model of our social learning game. Sections 3 and 4 report the experimental design and results, respectively. Section 5 proposes a model of belief error and structurally estimates the model parameters using maximum likelihood estimation. Finally, Section 6 concludes.

2 Bayesian Benchmark Model

In the benchmark model, we assume that Bayesian agents share a common prior belief about the state of the world, i.e., state 1 and state 2 each occur with probability $1/2$. Each agent has to make a binary action decision whose payoff is contingent on the underlying state. Specifically, action 1 delivers a payoff of 1 in state 1 and a payoff of 0 in state 2, and action 2 delivers a payoff of 0 in state 1 and a payoff of 1 in state 2.

Independent signals $\{s_n\}$, where $s_n \in \{B, W\}$, provide incomplete information about the underlying state. We assume that the signal structure is symmetric, i.e., $P(s_n = B \mid \text{state 1}) = P(s_n = W \mid \text{state 2}) = p$ and $P(s_n = W \mid \text{state 1}) = P(s_n = B \mid \text{state 2}) = q \equiv 1 - p$. We also assume that $p > 1/2$.
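The posterior implied by this symmetric signal structure can be checked with a few lines of code. The following is a minimal sketch, not part of the paper: the function name `posterior_state1` and the string encoding of signals as `"B"`/`"W"` are illustrative assumptions. Exact fractions are used so that the results are not blurred by floating-point rounding.

```python
from fractions import Fraction


def posterior_state1(signals, p, prior=Fraction(1, 2)):
    """Posterior probability of state 1 after a sequence of 'B'/'W' signals.

    Assumes the symmetric structure of the benchmark model:
    P(B | state 1) = P(W | state 2) = p and q = 1 - p.
    """
    q = 1 - p
    joint1 = prior       # running P(state 1) * P(signals so far | state 1)
    joint2 = 1 - prior   # running P(state 2) * P(signals so far | state 2)
    for s in signals:
        if s == "B":
            joint1 *= p
            joint2 *= q
        elif s == "W":
            joint1 *= q
            joint2 *= p
        else:
            raise ValueError(f"unknown signal: {s!r}")
    return joint1 / (joint1 + joint2)


if __name__ == "__main__":
    p = Fraction(3, 5)
    for seq in ("B", "W", "BW", "BBW"):
        print(seq, posterior_state1(seq, p))
```

With a prior of one-half, a single black ball yields a posterior of $p$ for state 1 and a single white ball yields $q$; one black and one white ball cancel and return the posterior to one-half.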

In a private information setting, an agent is endowed with a private signal $s$. Conditional on the realization of $s$, she makes a first choice between the two actions and may also decide to acquire a certain number of additional signals, conditional on which she is entitled to make a second choice. A social information setting is almost identical, except that the additional observations consist of a certain number of other agents' first choices, each of which is made after observing an independent realization of signal $s$.

Clearly, a Bayesian agent's posterior belief about state 1 is $p$ after observing a signal $s = B$ and is $q$ after observing a signal $s = W$. Bayesian agents who maximize expected utility should optimally choose action 1 in the former case and action 2 in the latter case. It can then be inferred from observing an agent's first choice of action 1 (action 2) in the social information setting that a signal with the realization $B$ ($W$) was observed. Thus, the informational content of a signal is identical to that inferred from observing an agent's first choice. It follows that an agent's willingness to pay for observing $n$ additional signals in the private information setting (denoted $W^{pri}_{s}(p, n)$) is equal to her willingness to pay for observing $n$ agents' first choices in the social information setting (denoted $W^{soc}_{s}(p, n)$).

Proposition 1. For Bayesian agents who maximize expected utility, $W^{pri}_{s}(p, n) = W^{soc}_{s}(p, n)$.

Since the prior belief about either state is $1/2$ and the signal structure is symmetric, $W^{pri}_{s}(p, n)$ remains the same regardless of the realization of signal $s$. For notational simplicity, we use $W(p, n)$ below to indicate $W^{pri}_{s}(p, n)$ or $W^{soc}_{s}(p, n)$.

If agents are further assumed to be risk neutral, their willingness to pay for additional observations can be characterized in closed form, specifically, a form that is monotone in the number of additional observations and concave in the informativeness of signals. We provide the proof of the proposition in the supplementary Appendix B.

Proposition 2. Assume that Bayesian agents maximize expected payoff. Then, the following properties hold.
(i) $W(p, 1) = 0$ and $W(p, 2n) = W(p, 2n+1) = \sum_{k=n+1}^{2n} C_{2n}^{k} \left( p^{k} q^{2n+1-k} - p^{2n+1-k} q^{k} \right)$ for $n \geq 1$.
(ii) $W(p, 2m) > W(p, 2n)$ whenever $m > n$, and $\lim_{n \to \infty} W(p, 2n) = 1 - p$.
(iii) $\partial W(p, 2n) / \partial p \gtrless 0$ for $p \lessgtr p_n^*$, where $p_n^* = \frac{1}{2} + \frac{1}{2} \left[ 1 - \sqrt[n]{\frac{2 \cdot 4 \cdot 6 \cdots 2n}{3 \cdot 5 \cdot 7 \cdots (2n+1)}} \right]^{1/2}$.
(iv) The threshold level of informativeness of signals that makes additional observations most valuable to an agent, $p_n^*$, decreases as the number of additional observations $n$ rises, and $\lim_{n \to \infty} p_n^* = \frac{1}{2}$.
(v) $\partial^2 W(p, 2n) / \partial p^2 < 0$.

[...] $\sum_{k=n+1}^{2n} C_{2n}^{k} \left( p^{k} q^{2n+1-k} - p^{2n+1-k} q^{k} \right)$ regardless of risk attitude. Nevertheless, the assumption of risk neutrality is necessary for a closed-form characterization of willingness to pay because utility levels at four points are involved.

3 Experimental Design and Hypotheses

We design an experiment to collect data on subjects' choices and their reservation prices for additional observations in both private information and social information settings. In each setting, we vary the signal quality and signal quantity parameters. Specifically, signal quality ($p$) varies from 0.6 to 0.75 to 0.9, i.e., from low to medium to high accuracy. Signal quantity ($n$) takes a value of either 1 or 3 and determines whether one or three additional observations are provided. A total of twelve treatment conditions are considered, and we implement a within-subject design. An experimental session consists of sixty decision rounds, of which the first twelve correspond to the twelve treatment conditions and are followed by four repetitions.

In addition to learning from private information and learning from social information, another type of learning is inherent in laboratory experiments: participants' learning about the experiment during the course of the experiment.
Specifically, for a multi-round experiment, whether participants receive feedback about their performance at the end of each round may have an impact. To investigate the potential effect of participants' learning about the experiment through feedback, we implement a between-subject design in which a decision round consists of the following three stages in no-feedback sessions, while in feedback sessions there is also a fourth stage in which feedback is provided.

In stage 1, a computer randomly selects either urn 1 or urn 2 for use in that decision round. Urn 1 of type 0.6 contains twelve black balls and eight white balls, and urn 2 of this type contains twelve white balls and eight black balls. The compositions of type 0.75 urns and type 0.9 urns are similarly determined. While the type of urns (i.e., 0.6, 0.75, or 0.9) is revealed to all subjects, the label of the urn (i.e., urn 1 or urn 2) is not known. The computer independently and randomly draws one ball from the urn with replacement [...] $\{0, 1, 2, \ldots, 299, 300\}$. The chosen number, say $b$, refers to her willingness to pay $b$ tokens for additional information. Before stating her reservation price, the subject is informed of the composition of the additional information. Specifically, in the private information treatment condition with $n \in \{1, 3\}$, the additional information includes the color(s) of $n$ ball(s) randomly drawn from the urn used in that round. In the social information treatment condition with $n \in \{1, 3\}$, the additional information is the first choices of one or three other subjects.

Footnote: Utility levels at four points are the utility of a payoff of one, the utility of a payoff of zero, the utility of a payoff of one less the willingness to pay, and the utility of a payoff of zero less the willingness to pay.

Footnote: The display order of the twelve treatment conditions in our experiment is: (i) private information, [...]

In stage 3, the Becker-DeGroot-Marschak method is employed to determine whether subjects successfully purchase additional information. Specifically, the computer draws an ask price, say $s$ tokens, uniformly at random from $\{0, 1, 2, \ldots, 299, 300\}$. If a subject's reported reservation price is at least the ask price ($b \geq s$), the subject is charged $s$ tokens (collected from her endowment of 300 tokens) and is provided with the additional information. After the additional observations, the subject is asked to make a second binary choice between urn 1 and urn 2. In this case, her second choice becomes her final choice in that round. If a subject's reported reservation price is below the ask price ($b < s$), the subject is neither charged nor provided with additional information. In this case, her first choice is also her final choice in that round. In feedback sessions, a fourth stage is implemented after stage 3, in which subjects are told which urn was actually used in that round.

Subjects are paid according to their final choices in a randomly selected decision round. Specifically, the computer randomly selects one of the sixty decision rounds to serve as the paid round. We say that a subject makes a correct choice if her final choice in the paid round is the same as the urn that is actually used in that round, and an incorrect choice otherwise. A subject earns 300 tokens for a correct choice and zero tokens for an incorrect choice. In addition, she retains her endowment of 300 tokens if she does not purchase additional information in the paid round. If she receives additional information at a cost of $s$ tokens in the paid round, she retains $300 - s$ tokens of the endowment. Finally, the total number of tokens earned is redeemed for Chinese yuan at an exchange rate of [...].

According to the Bayesian benchmark model in Section 2, we have the following two experimental hypotheses.

Hypothesis 1. The first choice in a decision round is urn 1 (urn 2) if a black (white) ball is observed. The second choice (if any) is urn 1 if the majority of signals are in favor of urn 1 and urn 2 if the majority of signals are in favor of urn 2.

Hypothesis 2. Participants' reservation prices for observing additional signals are identical in the private information and social information treatment conditions for any given signal quality and signal quantity. The reservation price in each treatment condition is shown in Table 1.
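The Bayesian benchmark behind Hypothesis 2 can be evaluated numerically. The sketch below is illustrative and not taken from the paper: it assumes risk neutrality, uses the closed form of Proposition 2(i), scales it by the 300-token reward stated in the design, and cross-checks it against a direct majority-rule calculation. The function names `wtp_closed_form` and `wtp_direct` are hypothetical.

```python
from fractions import Fraction
from math import comb

REWARD_TOKENS = 300  # reward for a correct final choice, as stated in the design


def wtp_closed_form(p, n_extra):
    """Proposition 2(i): W(p, 1) = 0 and W(p, 2m) = W(p, 2m + 1)."""
    q = 1 - p
    m = n_extra // 2
    return sum(
        comb(2 * m, k) * (p**k * q**(2 * m + 1 - k) - p**(2 * m + 1 - k) * q**k)
        for k in range(m + 1, 2 * m + 1)
    )


def wtp_direct(p, n_extra):
    """Gain in the probability of a correct choice from n_extra further signals.

    The agent follows the majority of all signals; a tie leaves the posterior
    at one-half, so either action is then correct with probability one-half.
    """
    q = 1 - p
    total = n_extra + 1  # the initial signal plus the additional ones
    win = sum(
        comb(total, k) * p**k * q**(total - k)
        for k in range(total // 2 + 1, total + 1)
    )
    tie = comb(total, total // 2) * (p * q) ** (total // 2) if total % 2 == 0 else 0
    return win + tie * Fraction(1, 2) - p


if __name__ == "__main__":
    for n_extra in (1, 3):
        for p in (Fraction(3, 5), Fraction(3, 4), Fraction(9, 10)):
            tokens = REWARD_TOKENS * wtp_closed_form(p, n_extra)
            print(f"n={n_extra}, p={float(p)}: {float(tokens)} tokens")
```

Under these assumptions the sketch gives a value of zero for one additional observation at every signal quality, and 14.4, 28.125, and 21.6 tokens for three additional observations at $p = 0.6$, $0.75$, and $0.9$, respectively. The non-monotone pattern in $p$ is what Proposition 2(iii) describes: for two (equivalently three) additional observations the value peaks at an interior signal quality of roughly 0.79.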

Table 1: The same value of private information and social information (unit: tokens). Columns: $p = 0.6$, $p = 0.75$, $p = 0.9$; rows: $n = 1$, $n = 3$.

4 Experimental Results

An observation of the experiment includes the parameters of the treatment condition (i.e., private or social information; the values of $p$ and $n$) [...]

Footnote: Let $B$ and $W$ refer to observing a black ball and a white ball, respectively, and let $C_1$ and $C_2$ refer to a choice of urn 1 and a choice of urn 2 in stage 1, respectively. Signals in favor of choosing urn 1 for the second time include BB, BC, 4B, and 3B W in the private information setting and 1B C, 1B C C, and 1W C in the social information setting. Signals in favor of choosing urn 2 for the second time include WW, WC, 4W, and 3W B in the private information setting and 1W C, 1W C C, and 1B C in the social information setting.

Different Valuations of Private Information and Social Information

We find both direct and indirect evidence that subjects treat private information and social information differently, which refutes experimental hypothesis 2.

Table 2 shows that subjects' average reservation prices are consistently much higher than the theoretical values based on the Bayesian benchmark model; more importantly, social information has a lower value than private information. We run a two-group mean test of the null hypothesis that reservation prices are identical in the private and the social information settings against the alternative hypothesis that the average reservation price in the social information setting is lower than that in the private information setting. The null hypothesis is rejected in all conditions of the no-feedback sessions and in all but one condition of the feedback sessions.

Figure 1 further illustrates the difference in the empirical distributions of subjects' reservation prices between the private information setting and the social information setting.
We run a Kolmogorov-Smirnov test of the equality of the distributions in the two settings for each of the six treatment conditions. The null hypothesis is rejected at the 5% significance level in all six cases.

Footnote: An analysis of later rounds of a session, i.e., rounds 37-60, produces similar results.

Footnote: We also test the difference in the empirical distributions of the reported reservation prices between the two information settings for feedback session data and for no-feedback session data, separately. We run a Kolmogorov-Smirnov test of the equality of the distributions in the two settings for the two subsamples of data, and the null hypothesis is rejected at the 1% significance level for each of the two subsamples. When we further divide each subsample of data into six treatment conditions and run a similar statistical test, we find that the null hypothesis is rejected at the 5% significance level in each treatment condition of the subsample of no-feedback sessions and that the null hypothesis cannot be rejected at the 5% significance level in each treatment condition of the subsample of feedback sessions. These results suggest that subjects may learn from feedback, which in turn has an impact on their valuation of the two types of information.

Table 2: Average reservation prices in the private and social information settings
[Table body not recoverable from the source.]
Notes: Standard errors are in parentheses, and $p$-values of one-tailed tests of the corresponding mean comparison are in the corresponding third row.

Figure 1: Empirical distribution of reservation prices

In the subsample of observations where subjects were asked to make a second choice, subjects' final choices were made on the basis of the initial information (the color of the ball that was shown in stage 1) and the new information, which includes either the colors of the additional balls or other subjects' first choices. If the impact of the new information on the second choice differs depending on the type of information, then subjects have a different valuation of private information and social information. Our indirect evidence is based on the finding that the effect is stronger when the new information is private information than when it is social information.

Specifically, we investigate in the subsample the percentage of decision rounds in which a subject's second choice contradicts the color of the ball in stage 1, for example, when the subject's second choice was urn 2 after observing a black ball in stage 1 or her second choice was urn 1 after observing a white ball in stage 1. We interpret a larger percentage as a bigger impact of the new information on subjects' second choices, everything else being equal. We find that, overall, the percentage is 22% in the private information setting and 16% in the social information setting. We run a two-group mean test under the null hypothesis that the percentage is the same in both settings and find that the hypothesis is rejected at the 10% significance level.

Signal Quality and Bayesian Choice

Since urn 1 consistently contains more black balls than white balls and urn 2 consistently includes more white balls than black balls and because the prior belief about either urn is $\tfrac{1}{2}$, we say that a subject's first choice is a Bayesian choice if she chooses urn 1 after observing a black ball or chooses urn 2 after observing a white ball. Table 3 reports the percentage of Bayesian choices in the first stage.
While it is not surprising that the percentage is less than 100%, i.e., 94%, it is interesting that the percentage increases as the signal quality increases.

Footnote: The overall percentage ignores the details of different realizations of the initial information and the new information. To address this issue, we introduce an explanatory variable $z$ that counts the number of signals in the new information in favor of the initial information. For example, the variable takes a value of 1 if the new information contains exactly one signal in favor of the initial information, a value of 2 if the new information contains two signals in favor of the initial information, and so on. When a tie occurs between signals in favor of either choice, i.e., $z = 0$ with $n = 1$ or $z = 1$ with $n = 3$, the percentage is higher in the private information setting than in the social information setting, indicating a stronger effect of private information.

Table 3: The effect of signal quality on Bayesian first choice
[Table body not recoverable from the source.]
Notes: The percentage refers to the extent to which the first choice is a Bayesian choice. The last three columns report $p$-values of two-tailed tests of the corresponding mean comparison.

Individual-level analysis

At the individual level, we find considerable heterogeneity across subjects. Figure 2 illustrates each subject's average reservation price for additional signals in the private information and social information settings for each condition of signal quality and signal quantity. Some subjects value private information more, some subjects value social information more, and others value the two types of information more or less equally.

5 Belief Error Model and Structural Estimation

In this section, we provide a belief error foundation for participants' systematic non-Bayesian choice behavior and their different valuations of private information and social information. A structural estimation of the model is then employed to gain deeper insights into the experimental data.

We are aware that the toolbox includes the classical logit choice model, which has the potential to explain our experimental findings. Nevertheless, we argue that an application of such a model to our experimental setting is not appropriate for the following reasons. When applying the logit model to generate a stochastic binary choice, the literature …

Footnote: An analysis of rounds 37-60 of the feedback sessions produces similar results.

Footnote: Nöth and Weber (2003) extend Anderson and Holt (1997)'s sequential social learning experimental design by introducing two signal qualities, i.e., $p$ = … and $p$ = … …

Footnote: Similarly, we say that a subject's second choice (if any) is a Bayesian choice if she chooses the urn favored by the majority of signals, as specified in Footnote 14.
Among the 636 observations in which the majority of signals are in favor of either urn 1 or urn 2, subjects made a Bayesian choice in 612 cases, which amounts to a Bayesian choice percentage of 96%.

Figure 2: Individual average reservation prices for private and social information (axis label: Average Reservation Price for Private Information)

Note: The diagonal line refers to the situation where reservation prices for the two types of information are equal.

… This prediction is not consistent with our experimental finding that subjects' reported reservation prices in the two cases are not significantly different ($p$-value = …).

According to the model with a rational inattention foundation, individuals are assumed to choose their information acquisition strategy optimally when they endogenously determine their way of acquiring information about actions' payoffs, and the information acquisition cost is assumed to take the form of the Shannon cost function (e.g., Matějka and McKay 2015). Under this assumption, individuals' posterior beliefs are optimally chosen, and action choices based on these posterior beliefs follow a logistic response function form. In our experimental setting, by contrast, information acquisition must take the form of drawing independent signals with an exogenously given signal structure, and after receiving the signals, there is minimal room for unobservables that may justify individuals' choosing their information strategy. Thus, it is not compelling to justify the application of the logit model with a rational inattention foundation in our setting.

Footnote: The intuition is that after not making a Bayesian first choice, subjects realize that they made a choice mistake and that the expected payoff is lower. Hence, the benefit of observing additional signals is larger given a fixed ex ante expected payoff of making a second choice.

Footnote: We also employ the payoff disturbance model to structurally estimate the experimental data by pretending to ignore the conceptual concern and find that it performs worse than our method, as illustrated in Section 5.3.

5.1 A Basic Model with Belief Error

The basic belief error model modifies the Bayesian benchmark model in the following three aspects.
First, subjects form random posterior beliefs with a Bayesian updating kernel and have a sophisticated consideration of their own random posterior beliefs. Second, subjects make a sophisticated consideration of others' random posterior beliefs when interpreting others' action decisions. Third, subjects also gain utility from the event of making a correct choice ex post in addition to the monetary payoffs realized from making a correct choice. Clearly, the generality and applicability of the three assumptions are ranked in descending order.

5.1.1 Random posterior belief with a Bayesian kernel

Subjects sometimes systematically deviate from Bayesian updating even in many simple settings of individual decision making and often exhibit behavioral biases (e.g., Tversky and Kahneman 1974; Holt and Smith 2009). Recently, De Filippis et al. (2017) collect subjects' belief data and find that while a high percentage of beliefs are in line with Bayesian updating, a considerable percentage are smaller or greater than the Bayesian ones. In addition, a small proportion of beliefs are even in the direction opposite to that specified by Bayesian updating.

We model the variation from Bayes' law by assuming that when updating beliefs from prior beliefs and observations, subjects form a random posterior belief dependent on the Bayesian posterior probability, i.e., the posterior probability calculated according to Bayes' law. Specifically, let $p_x$ be the Bayesian posterior probability of urn 1 given a signal set $x$. Subject $i$'s posterior probability of urn 1 takes the form of a random variable $\tilde p_{ix}$; correspondingly, her posterior probability of urn 2 is $1 - \tilde p_{ix}$. We assume that the value of the random posterior probability is realized only after observing signal set $x$ and that it is privately known to subject $i$. We also assume that sophisticated subjects take the randomness of posterior belief into account when extrapolating posterior beliefs in any future information set.
We may infer that the randomness comes from belief errors when the subject forms her posterior probability.

Since the posterior probability $\tilde p_{ix}$ is a random variable that takes a value from 0 to 1, an ideal probability distribution of the random variable should have support on the interval $[0, 1]$. Among common continuous distribution functions with support on a …

Footnote: The form of $\tilde p_{ix}$ implicitly assumes that in an information set including the first signal and additional signals, the subject forms her posterior belief by taking a new draw of noise in her belief without taking into account the old draw of noise in her belief after observing the first signal. With this simplification, the potential effect of the realized noise in belief after observing the first signal on the noise in belief after observing the first signal and additional signals is ignored.

Assumption 1. Subject $i$'s posterior belief $\tilde p_{ix}$ follows a beta distribution with two parameters $\gamma_i \ge 0$ and $p_x \in [0, 1]$. Specifically,

$$\tilde p_{ix} \sim \mathrm{Beta}\left(\frac{p_x}{\gamma_i}, \frac{1 - p_x}{\gamma_i}\right)$$

when $\gamma_i > 0$ and $p_x \in (0, 1)$. $\tilde p_{ix} = p_x$ almost surely when $\gamma_i = 0$, $\tilde p_{ix} = 0$ almost surely when $p_x = 0$, and $\tilde p_{ix} = 1$ almost surely when $p_x = 1$.

Assumption 1 has a few desirable features. First, it guarantees that $\tilde p_{ix}$, as a probability, is always bounded by $0 \le \tilde p_{ix} \le 1$, i.e., it lies in $[0, 1]$. Second, the expectation $E(\tilde p_{ix}) = p_x$, which suggests that the subject's belief is on average "right" in the sense that the random posterior belief overall has a Bayesian posterior belief kernel. Third, the variance

$$\mathrm{Var}(\tilde p_{ix}) = \frac{\gamma_i}{1 + \gamma_i}\, p_x (1 - p_x)$$

shrinks as the parameter $\gamma_i$ decreases and converges to zero as $\gamma_i$ approaches zero. Therefore, the parameter $\gamma_i$ can naturally be interpreted as the degree of subject $i$'s belief error: the smaller the parameter is, the smaller her belief error is. In the special case of $\gamma_i = 0$, the distribution of $\tilde p_{ix}$ is degenerate, or equivalently, $\tilde p_{ix} = p_x$ almost surely. Finally, for any fixed $\gamma_i$, the variance is proportional to $p_x (1 - p_x)$, which has the highest value when $p_x = \tfrac{1}{2}$. The variance is also symmetric around $\tfrac{1}{2}$ and decreases when $p_x$ changes in the direction of 0 or 1. In the boundary cases of $p_x = 0$ and $p_x = 1$, $\tilde p_{ix} = 0$ and $\tilde p_{ix} = 1$ almost surely, respectively.

Footnote: Alternatively, one may define subject $i$'s posterior belief to be the Bayesian posterior probability plus an error term, i.e., $\tilde p_{ix} = p_x + \epsilon_{ix}$. This definition will inevitably result in a truncated distribution if we assume that $\epsilon_{ix}$ follows a distribution with unbounded support such as the normal distribution. While the truncation problem could be resolved by applying the logit transformation twice, e.g., $\ln \frac{\tilde p_{ix}}{1 - \tilde p_{ix}} = \ln \frac{p_x}{1 - p_x} + \epsilon_{ix}$, the appealing feature of generating closed-form predictions will be lost.
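The mean and variance claims attached to Assumption 1 can be checked directly from the standard moments of a beta distribution. The following is a minimal sketch, not the authors' code; the function names `belief_error_moments` and `closed_form_variance` are hypothetical, and exact rational arithmetic is used only to make the identity visible without rounding.

```python
from fractions import Fraction


def belief_error_moments(p_x, gamma):
    """Mean and variance of Beta(p_x/gamma, (1 - p_x)/gamma).

    Degenerate cases (gamma = 0, or p_x at 0 or 1) are treated as a point
    mass at p_x, as stated in Assumption 1.
    """
    p_x, gamma = Fraction(p_x), Fraction(gamma)
    if gamma == 0 or p_x in (0, 1):
        return p_x, Fraction(0)
    a, b = p_x / gamma, (1 - p_x) / gamma
    mean = a / (a + b)                              # standard beta mean
    var = a * b / ((a + b) ** 2 * (a + b + 1))      # standard beta variance
    return mean, var


def closed_form_variance(p_x, gamma):
    """The variance expression gamma / (1 + gamma) * p_x * (1 - p_x)."""
    p_x, gamma = Fraction(p_x), Fraction(gamma)
    return gamma / (1 + gamma) * p_x * (1 - p_x)


if __name__ == "__main__":
    p_x, gamma = Fraction(7, 10), Fraction(1, 2)   # illustrative values only
    mean, var = belief_error_moments(p_x, gamma)
    print("mean equals p_x:", mean == p_x)
    print("variance matches closed form:", var == closed_form_variance(p_x, gamma))
```

The sketch makes the role of $\gamma_i$ concrete: the two shape parameters always sum to $1/\gamma_i$, so a smaller $\gamma_i$ concentrates the distribution around $p_x$.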

… is a fixed number. After observing signals conditional on the underlying binary state, which follows a Bernoulli distribution, subjects' posterior beliefs are a fixed number if Bayes' law is applied. It is only in our non-Bayesian paradigm that subjects' posterior beliefs are assumed to be random and follow a beta distribution. In summary, our paper is essentially different from theirs in terms of the information structure, the contents of the prior and posterior beliefs, and the Bayesian/non-Bayesian paradigm.

When maximizing expected utility, subject $i$'s optimal choice strategy according to the belief error model is similar to that of the Bayesian benchmark model: choose urn 1 if $\tilde p_{ix} > \tfrac{1}{2}$, choose urn 2 if $\tilde p_{ix} < \tfrac{1}{2}$, and be indifferent between the two options if $\tilde p_{ix} = \tfrac{1}{2}$. Let $P(\text{correct choice}_i \mid x, \tilde p_{ix})$ be the (subjective) probability of making a correct choice after subject $i$ observes signals $x$ and forms a realized value of her posterior belief $\tilde p_{ix}$. The optimal choice strategy implies that $P(\text{correct choice}_i \mid x, \tilde p_{ix}) = \max(\tilde p_{ix}, 1 - \tilde p_{ix})$. Then, the probability of making a correct choice for information set $x$ before the value of $\tilde p_{ix}$ is realized is

$$P(\text{correct choice}_i \mid x) = E\, P(\text{correct choice}_i \mid x, \tilde p_{ix}) = E \max(\tilde p_{ix}, 1 - \tilde p_{ix}),$$

which reflects the subject's assessment of her chance of making an ex post correct choice for a hypothetical information set $x$. For comparison, the corresponding assessment according to the Bayesian benchmark model is $\max(p_x, 1 - p_x)$. We show that the two assessments prescribe the same ranking of information sets.

Proposition 3. Assume that subjects maximize expected utility. Under Assumption 1, for signal sets $x$ and $y$,
(i) if $\max(p_x, 1 - p_x) = \max(p_y, 1 - p_y)$, i.e., $p_x = p_y$ or $p_x = 1 - p_y$, then $P(\text{correct choice}_i \mid x) = P(\text{correct choice}_i \mid y)$;
(ii) if $\max(p_x, 1 - p_x) > \max(p_y, 1 - p_y)$, then $P(\text{correct choice}_i \mid x) > P(\text{correct choice}_i \mid y)$.

Proof. See Appendix A.1.

Since the assessment depends on $p_x$, we denote $v(p_x) \equiv P(\text{correct choice}_i \mid x)$ after suppressing the index of subject $i$ and belief error measure $\gamma$. The second property of Proposition 3 implies that when $p_x > \tfrac{1}{2}$, the assessment according to the belief error model is increasing in $p_x$. The monotonicity of the assessment is trivially satisfied in the Bayesian benchmark model because $\max(p_x, 1 - p_x) = p_x$ when $p_x > \tfrac{1}{2}$. Another interesting observation is that when $p_x > \tfrac{1}{2}$, the assessment according to the belief error model increases at an increasing rate, i.e., $\partial^2 v(p_x) / \partial p_x^2 > 0$, whereas the assessment according to the Bayesian benchmark increases at a linear rate. We do not have a formal proof of the convexity of $v(p_x)$ due to the intricate interaction between the function form and the beta distribution, but our simulation exercise suggests that $\partial^2 v(p_x) / \partial p_x^2 > 0$ … $\gamma$.

It is straightforward to check that a subject who chooses optimally according to the belief error model also makes a Bayesian choice if and only if $\left(\tilde p_{ix} - \tfrac{1}{2}\right)\left(p_x - \tfrac{1}{2}\right) > 0$. So the chance of subject $i$'s making a Bayesian choice is $P\left(\left(\tilde p_{ix} - \tfrac{1}{2}\right)\left(p_x - \tfrac{1}{2}\right) > 0\right)$. Proposition 4 below shows that the chance is increasing in $p_x$ when $p_x > \tfrac{1}{2}$ and decreasing in $p_x$ when $p_x < \tfrac{1}{2}$. Since for a given prior belief, the Bayesian posterior probability $p_x$ is increasing in the signal quality $p$ when $p_x > \tfrac{1}{2}$ and decreasing in $p$ when $p_x < \tfrac{1}{2}$, the proposition implies that the chance of subjects' making a Bayesian choice increases as the signal quality increases, which is consistent with the experimental finding that the frequency of making a Bayesian first choice increases from $p$ = … to $p$ = ….

Proposition 4.

Assume that subjects maximize expected utility. Under Assumption 1, for signal sets $x$ and $y$, if $\left|p_x - \tfrac{1}{2}\right| > \left|p_y - \tfrac{1}{2}\right|$, then

$$P\left(\left(\tilde p_{ix} - \tfrac{1}{2}\right)\left(p_x - \tfrac{1}{2}\right) > 0\right) > P\left(\left(\tilde p_{iy} - \tfrac{1}{2}\right)\left(p_y - \tfrac{1}{2}\right) > 0\right).$$

Proof. See Appendix A.2.

We now show that the belief error model predicts a positive informational value of observing one additional signal. The intuition proceeds as follows. A subject's posterior belief after observing the first private signal in the first stage may not be equal to the Bayesian posterior probability, and a sophisticated subject extrapolates that she will form a random posterior belief for each subsequent information set in later stages. For any subsequent information set in later stages, she will choose optimally contingent on the realization of the random posterior probability, which is a better strategy than always choosing a certain action for all subsequent information sets. Thus, the belief error in the late stage universally increases the value of observing one additional signal. However, the effect of the belief error in the first stage is ambiguous and essentially depends on the value of the realized posterior in the first stage. Specifically, when the realized posterior is close to $\tfrac{1}{2}$, i.e., she is not confident about the true state, the second effect is also positive and the overall effect of observing one additional signal is positive. When the realized posterior is close to 0 or 1, i.e., she is confident about the true state, the second effect is negative and even dominates the first effect, so the overall effect is negative.

Footnote: We skip the tie case $\tilde p_{ix} = \tfrac{1}{2}$ since belief error is continuous according to Assumption 1.

Let $V^{i,\mathrm{pri}}_{B+1}$ be subject $i$'s expected payoff of observing one additional signal after having already observed a black ball in the first stage. We simplify the notation as $V^{\mathrm{pri}}_{B+1}$ whenever it is clear.

Proposition 5. Assume that sophisticated subjects with belief error maximize expected payoff. Then under Assumption 1,

$$V^{\mathrm{pri}}_{B+1} - \max(\tilde p_{iB}, 1 - \tilde p_{iB}) \begin{cases} > 0 & \text{if } \tilde p_{iB} \in (\underline{p}, \bar{p}), \\ \le 0 & \text{if } \tilde p_{iB} \le \underline{p} \text{ or } \tilde p_{iB} \ge \bar{p}, \end{cases}$$

where $\underline{p}, \bar{p}$ satisfy that $\underline{p} < \tfrac{1}{2} < p_B < \bar{p}$ and $\bar{p}$ is increasing in $p_B$.

Proof. See Appendix A.4.

Proposition 5 suggests that when the realized posterior belief in the first stage is in the same direction as the Bayesian posterior belief and does not hit the boundary of high confidence about the true state, i.e., $\tfrac{1}{2} < \tilde p_{iB} < \bar{p}$, there is positive value of observing one additional signal. For example, when the realized posterior belief is equal to the Bayesian posterior belief (i.e., $\tilde p_{iB} = p_B$), observing one additional signal has positive informational value.

Furthermore, the difference in expected payoffs is generally positive from the perspective of an outside observer. Let $E V^{\mathrm{pri}}_{B+1}$ be a subject's expected payoff, from the perspective of an observer, of observing one additional signal after having already observed a black ball. Since the posterior belief is unknown to the observer, $E V^{\mathrm{pri}}_{B+1} = (p^2 + q^2)\, v(p_{BB}) + 2pq\, v(p_{BW})$ given that $E \tilde p_{iB} = p_B = p$. Similarly, $E \max(\tilde p_{iB}, 1 - \tilde p_{iB}) = v(p_B)$ is the expected payoff in the first stage from the perspective of an observer. Since $(p^2 + q^2)\, p_{BB} + 2pq\, p_{BW} = p_B$, it is clear that $E V^{\mathrm{pri}}_{B+1} > v(p_B)$ if $v(p_x)$ is convex.

Finally, the belief error model also predicts that the value of observing three additional signals is always greater than the value of observing one additional signal, which is consistent with the experimental finding.
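The observer's comparison between $E V^{\mathrm{pri}}_{B+1}$ and $v(p_B)$ can be explored numerically. The sketch below is an illustration under stated assumptions rather than the paper's simulation: it takes a prior of 1/2, computes the Bayesian posteriors after a first black ball exactly, and estimates $v(p_x) = E \max(\tilde p_{ix}, 1 - \tilde p_{ix})$ by Monte Carlo using the standard-library `random.betavariate`. The function names, the seed, the number of draws, and the values of $p$ and $\gamma$ are all hypothetical choices.

```python
import random
from fractions import Fraction


def posteriors_after_black(p):
    """Exact Bayesian quantities after a first black ball (prior 1/2).

    Returns p_B and, for each second-ball outcome, the pair
    (probability of the outcome given B, posterior of urn 1).
    """
    p = Fraction(p)
    q = 1 - p
    prob_bb = p * p + q * q          # second ball black given first black
    prob_bw = 2 * p * q              # second ball white given first black
    branches = {
        "BB": (prob_bb, p * p / prob_bb),
        "BW": (prob_bw, Fraction(1, 2)),
    }
    return p, branches


def v(p_x, gamma, draws=50_000, seed=0):
    """Monte Carlo estimate of E max(p~, 1 - p~), p~ ~ Beta(p_x/gamma, (1-p_x)/gamma)."""
    p_x = float(p_x)
    if gamma == 0 or p_x in (0.0, 1.0):
        return max(p_x, 1 - p_x)     # degenerate cases of Assumption 1
    rng = random.Random(seed)
    a, b = p_x / gamma, (1 - p_x) / gamma
    total = 0.0
    for _ in range(draws):
        x = rng.betavariate(a, b)
        total += max(x, 1 - x)
    return total / draws


if __name__ == "__main__":
    p, gamma = Fraction(7, 10), 0.5  # illustrative values only
    p_b, branches = posteriors_after_black(p)
    # Posteriors average back to p_B (law of iterated expectations).
    print("mean posterior equals p_B:",
          sum(prob * post for prob, post in branches.values()) == p_b)
    ev_pri = sum(float(prob) * v(post, gamma) for prob, post in branches.values())
    print("E V_pri(B+1) estimate:", ev_pri)
    print("v(p_B) estimate:      ", v(p_b, gamma))
```

Because the estimates carry simulation noise, the printed comparison should be read as suggestive only; it does not substitute for a proof of convexity of $v$.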
We leave the formalization and the proof of this claim in the supplementary Appendix B.

5.1.2 First-order belief about others' random posterior beliefs

In the later stages of our social information setting, all signals except the first consist of other subjects' first choices, which are based on the colors of their own first balls drawn …

Footnote: The assumption that the first observed ball is black has no loss of generality, and we keep the simplified assumption henceforth. Subject $i$'s expected payoff after choosing optimally in the first stage is $\max(\tilde p_{iB}, 1 - \tilde p_{iB})$ and

$$V^{\mathrm{pri}}_{B+1} = \tilde p_{iB} \cdot \left[p \cdot E \max(\tilde p_{iBB}, 1 - \tilde p_{iBB}) + q \cdot E \max(\tilde p_{iBW}, 1 - \tilde p_{iBW})\right] + (1 - \tilde p_{iB}) \cdot \left[p \cdot E \max(\tilde p_{iBW}, 1 - \tilde p_{iBW}) + q \cdot E \max(\tilde p_{iBB}, 1 - \tilde p_{iBB})\right].$$

… $P_i(C_j \mid \text{urn } k)$ ($j = 1, 2$ and $k = 1, 2$), which reflects subject $i$'s belief about another subject's probability of choosing urn $j$ given the true state urn $k$. Subject $i$'s belief about another subject's decision strategy is characterized by $P_i(C_j \mid B)$ and $P_i(C_j \mid W)$, which reflect subject $i$'s belief about another subject's probabilities of choosing urns $j = 1, 2$ after the other subject observes a black ball and a white ball, respectively. In contrast to the private information setting, where the quality of a signal of the color of a drawn ball is exogenously given and is naturally agreed on among subjects, the quality of a signal of another subject's first choice in the social information setting may be interpreted differently and is endogenously determined as follows.

$$P_i(C_1 \mid \text{urn 1}) = P(B \mid \text{urn 1}) \cdot P_i(C_1 \mid B) + P(W \mid \text{urn 1}) \cdot P_i(C_1 \mid W) = p \cdot P_i(C_1 \mid B) + q \cdot P_i(C_1 \mid W),$$

$$P_i(C_2 \mid \text{urn 2}) = P(W \mid \text{urn 2}) \cdot P_i(C_2 \mid W) + P(B \mid \text{urn 2}) \cdot P_i(C_2 \mid B) = p \cdot P_i(C_2 \mid W) + q \cdot P_i(C_2 \mid B).$$

If a subject believes that other subjects form posterior beliefs in a Bayesian manner and choose optimally given their beliefs, then $P_i(C_1 \mid B) = P_i(C_2 \mid W) = 1$, which implies that $P_i(C_1 \mid \text{urn 1}) = P(B \mid \text{urn 1})$ and $P_i(C_2 \mid \text{urn 2}) = P(W \mid \text{urn 2})$. In this case, subject $i$ views the signal quality as identical regardless of whether it is a signal of another subject's first choice or it is a signal of the color of a drawn ball. Thus, she interprets a signal of another subject's choosing urn 1 the same as a signal of a black ball.

If a subject believes that other subjects form posterior beliefs in a non-Bayesian paradigm and choose optimally given their beliefs, then she may interpret that a signal of another subject's choosing urn 1 comes from observing a black ball or observing a white ball. In this case, the signal quality is generally viewed as different, i.e., $P_i(C_1 \mid \text{urn 1}) \ne P(B \mid \text{urn 1})$ and $P_i(C_2 \mid \text{urn 2}) \ne P(W \mid \text{urn 2})$. In addition, knowing that $P_i(C_1 \mid \text{urn 1}) > P(B \mid \text{urn 1})$ if and only if $P_i(C_1 \mid W) / P_i(C_2 \mid B) > p/q > 1$ and $P_i(C_2 \mid \text{urn 2}) > P(W \mid \text{urn 2})$ if and only if $P_i(C_1 \mid W) / P_i(C_2 \mid B) < q/p < 1$, it is impossible that both $P_i(C_1 \mid \text{urn 1}) > P(B \mid \text{urn 1})$ and $P_i(C_2 \mid \text{urn 2}) > P(W \mid \text{urn 2})$. In other words, for subject $i$'s arbitrary modeling of others' forming posterior beliefs, the quality of a signal of another subject's first choice can never be uniformly improved compared to the signal quality in the private information setting.

We now know that how subject $i$ interprets a signal of another subject's first choice depends on her belief about the subject's decision strategy. This belief in turn depends only on her belief about the other subject's method of forming posterior beliefs, as long as the assumption that subjects choose optimally given their beliefs is maintained. Similar to our method of modeling a subject forming a random posterior belief, we assume that subject $i$ holds the first-order belief that others form random posterior beliefs with a Bayesian kernel.

Assumption 2.

Subject $i$ believes that other subjects' posterior belief $\tilde p_{-ix}$ follows a beta distribution with two parameters $\theta_i \ge 0$ and $p_x \in [0, 1]$. Specifically,

$$\tilde p_{-ix} \sim \mathrm{Beta}\left(\frac{p_x}{\theta_i}, \frac{1 - p_x}{\theta_i}\right)$$

when $\theta_i > 0$ and $p_x \in (0, 1)$. $\tilde p_{-ix} = p_x$ almost surely when $\theta_i = 0$, $\tilde p_{-ix} = 0$ almost surely when $p_x = 0$, and $\tilde p_{-ix} = 1$ almost surely when $p_x = 1$.

Similar to Assumption 1, Assumption 2 suggests that subject $i$ thinks that other subjects' posterior beliefs are unbiased on average. The parameter $\theta_i$ describes subject $i$'s opinion of the degree of others' belief errors. The larger $\theta_i$ is, the greater she thinks others' belief errors are. In the special case of $\theta_i = 0$, $\tilde p_{-ix} = p_x$ almost surely because its distribution is degenerate. Therefore, subject $i$ believes that others' posterior beliefs all coincide with the Bayesian posterior belief, so their first choices are perfectly aligned with the colors of the balls they observe. In this case, the observation of a subject choosing urn 1/urn 2 is the same as observing a black/white ball, and the social information setting degenerates to the private information setting.

Moreover, we know that under Assumption 2, $P_i(C_1 \mid B) = P_i(C_2 \mid W) > \tfrac{1}{2}$ when $\theta_i > 0$, which implies that $P_i(C_1 \mid \text{urn 1}) < P(B \mid \text{urn 1})$ and $P_i(C_2 \mid \text{urn 2}) < P(W \mid \text{urn 2})$. In other words, subject $i$ interprets that the signal quality is always lower when it is a signal of another subject's first choice than when it is a signal of the color of a drawn ball. Her observation of another subject's first choice is equivalent to observing a signal with discounted quality.

Finally, the belief error model in the social information setting employs Assumption 2 to determine a subject's interpretation of others' first choices. The model uses both Assumption 1 and Assumption 2 to determine a subject's posterior belief after observing others' first choices. These two assumptions and the assumption of subjects' sophisticated consideration of belief error are needed when modeling subjects' extrapolation in the later stages.

Let $V^{i,\mathrm{soc}}_{B+1}$ be subject $i$'s ex ante expected payoff of observing another subject's first choice after having already observed a black ball in the first stage. We simplify the notation to $V^{\mathrm{soc}}_{B+1}$ whenever it is clear. Similar to the private information setting, we show that there is positive informational value of observing another subject's first choice, which contradicts the prediction according to the Bayesian benchmark model.
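The way a belief about others' decision strategy translates into a perceived signal quality can be made concrete with a small calculation. The sketch below simply evaluates the two expressions for $P_i(C_1 \mid \text{urn 1})$ and $P_i(C_2 \mid \text{urn 2})$ given above; the function name `perceived_choice_quality` and the numerical inputs are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def perceived_choice_quality(p, c1_given_b, c2_given_w):
    """Perceived quality of another subject's first choice as a signal.

    p           : quality of a ball-color signal, P(B | urn 1) = P(W | urn 2)
    c1_given_b  : subject i's belief P_i(C1 | B)
    c2_given_w  : subject i's belief P_i(C2 | W)
    Returns (P_i(C1 | urn 1), P_i(C2 | urn 2)).
    """
    q = 1 - p
    c1_given_w = 1 - c2_given_w      # the other subject picks one of two urns
    c2_given_b = 1 - c1_given_b
    quality_urn1 = p * c1_given_b + q * c1_given_w
    quality_urn2 = p * c2_given_w + q * c2_given_b
    return quality_urn1, quality_urn2


if __name__ == "__main__":
    p = Fraction(7, 10)              # illustrative signal quality
    # Others believed to be Bayesian: choices are as informative as ball colors.
    print(perceived_choice_quality(p, Fraction(1), Fraction(1)))
    # Symmetric belief error: both perceived qualities fall below p.
    print(perceived_choice_quality(p, Fraction(9, 10), Fraction(9, 10)))
    # Asymmetric belief: one quality can exceed p, but then the other falls below it.
    print(perceived_choice_quality(p, Fraction(1), Fraction(1, 2)))
```

The third call illustrates the impossibility result stated earlier: raising the perceived quality of a choice of urn 1 above $p$ requires that others choose urn 1 too readily after a white ball, which lowers the perceived quality of a choice of urn 2.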

Proposition 6. Assume that sophisticated subjects with belief error maximize expected payoff. Then under Assumptions 1 and 2,

$$V^{\mathrm{soc}}_{B+1} - \max(\tilde p_{iB}, 1 - \tilde p_{iB}) \begin{cases} > 0 & \text{if } \tilde p_{iB} \in (\underline{p}', \bar{p}'), \\ \le 0 & \text{if } \tilde p_{iB} \le \underline{p}' \text{ or } \tilde p_{iB} \ge \bar{p}', \end{cases}$$

where $\underline{p}', \bar{p}'$ satisfy that $\underline{p}' < \tfrac{1}{2} < p_B < \bar{p}'$.

Proof. See Appendix A.5.

Similar to the private information setting, the difference in expected payoffs is generally positive from the perspective of an outside observer. Let $E V^{\mathrm{soc}}_{B+1}$ be a subject's expected payoff of observing another subject's first choice in the first stage from the perspective of an observer. Then,

$$E V^{\mathrm{soc}}_{B+1} = [p\pi + q(1 - \pi)]\, v(p_{BC_1}) + [p(1 - \pi) + q\pi]\, v(p_{BC_2}),$$

where $\pi \equiv P_i(C_1 \mid \text{urn 1}) = P_i(C_2 \mid \text{urn 2})$. Since $[p\pi + q(1 - \pi)]\, p_{BC_1} + [p(1 - \pi) + q\pi]\, p_{BC_2} = p_B$, it is clear that $E V^{\mathrm{soc}}_{B+1} > v(p_B)$ if $v(p_x)$ is convex.

We finally discuss the implication of the belief error model for different valuations of social information and private information. Consider a random variable $X$ that takes the value $p_{BB}$ with probability $p^2 + q^2$ and the value $p_{BW}$ with probability $2pq$, and another random variable $Y$ that takes the value $p_{BC_1}$ with probability $p\pi + q(1 - \pi)$ and takes the value $p_{BC_2}$ with probability $p(1 - \pi) + q\pi$. It is straightforward to check that $X$ is a mean-preserving spread of $Y$ since $\pi < p$. Thus, a sufficient condition for $E V^{\mathrm{pri}}_{B+1} > E V^{\mathrm{soc}}_{B+1}$ is that $v(p_x)$ is convex in $p_x$. On the basis of the same logic, it is also a sufficient condition for $E V^{\mathrm{pri}}_{B+3} > E V^{\mathrm{soc}}_{B+3}$ in the case of three additional observations. Since our simulation exercise confirms that the convexity of $v(p_x)$ generally holds, the belief error model predicts that social information is less valuable than private information.

Footnote: To obtain a better picture of the deviation of our belief error model from the Bayesian benchmark model in the social information setting, one may introduce two intermediate non-Bayesian paradigms. Specifically, non-Bayesian paradigm 1 assumes that subjects form posterior beliefs in a Bayesian manner but believe that others form random posterior beliefs with a Bayesian kernel. Non-Bayesian paradigm 2 assumes that subjects form random posterior beliefs with a Bayesian kernel and believe that others use a similar method. Non-Bayesian paradigm 3, which is our belief error model in the social information setting, assumes that subjects make a sophisticated consideration of their following non-Bayesian paradigm 2. The introduction of Assumption 2 makes the model deviate from the Bayesian benchmark to non-Bayesian paradigm 1. Further introduction of Assumption 1 makes the model deviate from non-Bayesian paradigm 1 to non-Bayesian paradigm 2. The final introduction of the assumption of sophisticated consideration makes the model deviate from non-Bayesian paradigm 2 to our belief error model.

Footnote:

$$V^{\mathrm{soc}}_{B+1} = \tilde p_{iB} \left[P_i(C_1 \mid \text{urn 1}) \cdot E \max(\tilde p_{iBC_1}, 1 - \tilde p_{iBC_1}) + P_i(C_2 \mid \text{urn 1}) \cdot E \max(\tilde p_{iBC_2}, 1 - \tilde p_{iBC_2})\right] + (1 - \tilde p_{iB}) \left[P_i(C_1 \mid \text{urn 2}) \cdot E \max(\tilde p_{iBC_1}, 1 - \tilde p_{iBC_1}) + P_i(C_2 \mid \text{urn 2}) \cdot E \max(\tilde p_{iBC_2}, 1 - \tilde p_{iBC_2})\right].$$

5.1.3 Paying to seek confidence

We have shown that when subjects form random posterior beliefs with a Bayesian kernel, there is positive informational value for observing one additional signal. On the one hand, the positive informational value of additional signals due to belief error has a clear upper bound, e.g., $V^{\mathrm{pri}}_{B+1} - \max(\tilde p_{iB}, 1 - \tilde p_{iB}) < \tfrac{1}{2}$ when subjects are assumed to maximize expected payoff. This upper bound corresponds to a maximum reservation price of 150 tokens in our experiment. On the other hand, subjects' reported reservation price in each treatment condition of our experiment ranges from 0 tokens to 300 tokens. Therefore, an additional assumption is required to rationalize the full data sample.

We adopt the assumption that subjects pay to seek confidence, which has been demonstrated in Eliaz and Schotter (2010). In their experimental study, they provide compelling evidence that subjects are willing to pay for information on the likelihood that a decision is ex post optimal. They propose an explanation that subjects have an intrinsic preference for being "confident" in choosing the right decision. In a similar vein, we assume that a subject's utility consists of two components: the monetary reward from making a correct choice, and the psychological reward of making a correct choice. For simplicity, we assume that the second component of the utility is proportional to the chance of making a correct choice, and the interpretation is that the higher the chance is, the better the anticipatory feelings subjects have.

Assumption 3. In addition to gaining utility from the monetary reward, subject $i$ gains utility from the confidence in earning the monetary reward, which is assumed to be proportional to the chance of earning the monetary reward, characterized by a parameter $\alpha_i \ge 0$.

Suppose that subject $i$ has an endowment $w > 0$ and pays $s$ to observe additional signals. The reward of making a correct choice is $r > 0$, and the reward of an incorrect choice is 0. Her expected utility from choosing urn 1 given signal set $x$ and posterior belief $\tilde p^i_x$ is

$$\tilde p^i_x (w + r - s + \alpha_i) + (1 - \tilde p^i_x)(w - s) = w - s + \tilde p^i_x (r + \alpha_i).$$

Correspondingly, her expected utility from choosing urn 2 is $w - s + (1 - \tilde p^i_x)(r + \alpha_i)$.

Footnote: One may naturally wonder if introducing the assumption of risk aversion could rationalize subjects' considerably high reservation prices for additional signals in the data. While risk aversion does help to generate a higher reservation price than under the assumption of risk neutrality, the additional assumption is still not sufficient, because we would then need an unreasonably high coefficient of risk aversion to rationalize the reported reservation prices that are much higher than 150 tokens. Moreover, no coefficient of risk aversion is able to rationalize the reported reservation price of 300 tokens, which is the monetary reward of making a correct choice.

We now investigate subject $i$'s optimal strategy of bidding, i.e., the optimal reservation price for additional signals after observing the first signal. For simplicity, let us consider the scenario in which the first ball observed is black ($B$), since it can be shown that the subject employs the same bidding strategy in the scenario of the first ball being white ($W$). Let $\mathcal{X}^t_{B+n}$ denote the collection of all signal sets $x$ that contain a first black ball in setting $t \in \{pri, soc\}$ with $n \in \{1, 3\}$ additional signals. For example, when the additional signals consist of the color of one ball, $\mathcal{X}^{pri}_{B+1} = \{BB, BW\}$; when the additional signals consist of the colors of three balls, $\mathcal{X}^{pri}_{B+3} = \{4B, 3B1W, 2B2W, 1B3W\}$.

By experimental design, subject $i$ bids $b$ and eventually pays the ask price $s$ to observe additional signals when successfully purchasing information, i.e., when $s \le b$. When the ask price $s$ is uniformly distributed on $[0, w]$, sophisticated subject $i$ with a belief of urn 1, $\tilde p^i_B$, has an expected utility from bidding $b$ as follows:

$$
\begin{aligned}
U_{n,t}(b, \tilde p^i_B) ={}& \int_0^b \frac{1}{w} \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B) \Big\{ (w - s + r + \alpha_i)\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x) + (w - s)\big[1 - \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)\big] \Big\}\, ds \\
&+ \int_b^w \frac{1}{w} \Big\{ (w + r + \alpha_i) \max(\tilde p^i_B, 1 - \tilde p^i_B) + w\big[1 - \max(\tilde p^i_B, 1 - \tilde p^i_B)\big] \Big\}\, ds \\
={}& \frac{1}{w} \int_0^b \bigg\{ (w - s + r + \alpha_i) \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B)\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x) + (w - s)\bigg[1 - \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B)\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)\bigg] \bigg\}\, ds \\
&+ \frac{w - b}{w} \Big\{ (w + r + \alpha_i) \max(\tilde p^i_B, 1 - \tilde p^i_B) + w\big[1 - \max(\tilde p^i_B, 1 - \tilde p^i_B)\big] \Big\},
\end{aligned}
$$

where $P_i(x \mid B)$ refers to subject $i$'s belief about observing signal set $x$ conditional on observing the first black ball, and $\sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B) = 1$. Define

$$V^{i,t}_{B+n} = \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B)\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x),$$

and we simplify the notation to $V^t_{B+n}$ whenever it is clear. Since $\mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$ refers to subject $i$'s probability of making a correct choice after observing signal set $x$ before the posterior belief is realized, we interpret $V^t_{B+n}$ as the "average" or "expected" probability of making a correct choice by purchasing additional signals.
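The expected utility from bidding can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it takes the expected probability of a correct choice $V^t_{B+n}$ as a given number, uses the collapsed form of the integrand, $w - s + (r + \alpha_i)V$, and compares a grid search over bids with the interior solution $(r + \alpha_i)[V - \max(\tilde p_B, 1 - \tilde p_B)]$ truncated to $[0, w]$. The values `V = 0.8`, `p_tilde_B = 0.6`, `alpha = 150`, and `w = 300` are hypothetical; `r = 300` is the reward stated in the text.

```python
def expected_utility(b, V, p_tilde_B, w, r, alpha):
    """U(b) when the ask price s is Uniform[0, w] and the subject buys iff s <= b."""
    m = max(p_tilde_B, 1.0 - p_tilde_B)
    # Purchase branch: integral over s in [0, b] of (w - s + (r + alpha) * V) / w.
    buy = (b * (w + (r + alpha) * V) - b * b / 2.0) / w
    # No-purchase branch: probability (w - b) / w times utility without extra signals.
    keep = (w - b) / w * (w + (r + alpha) * m)
    return buy + keep


def closed_form_bid(V, p_tilde_B, w, r, alpha):
    """Interior first-order condition, truncated to the feasible range [0, w]."""
    m = max(p_tilde_B, 1.0 - p_tilde_B)
    return min(w, max(0.0, (r + alpha) * (V - m)))


def grid_search_bid(V, p_tilde_B, w, r, alpha):
    """Brute-force maximizer of U(b) over integer bids 0, 1, ..., w."""
    return max(range(int(w) + 1),
               key=lambda b: expected_utility(b, V, p_tilde_B, w, r, alpha))


# Hypothetical inputs for illustration only.
params = dict(V=0.8, p_tilde_B=0.6, w=300, r=300, alpha=150)
grid_bid = grid_search_bid(**params)
formula_bid = closed_form_bid(**params)
print(grid_bid, round(formula_bid, 6))
```

Since $U$ is a concave quadratic in $b$, the grid maximizer should agree with the truncated first-order condition up to the grid step.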
We show that $V^t_{B+n}$ is linear in the first-stage posterior belief $\tilde p^i_B$ and is increasing in $\tilde p^i_B$ in Appendix A.3. We then characterize subject $i$'s optimal bidding strategy below.

Proposition 7. Under Assumptions 1, 2, and 3, (i) when $\alpha_i \le \frac{2w}{V_1 + V_2 - 1} - r$, sophisticated subject $i$'s optimal bidding strategy is

$$
b(\tilde p^i_B) =
\begin{cases}
0 & \text{if } \tilde p^i_B < \frac{1 - V_2}{1 + V_1 - V_2} \text{ or } \tilde p^i_B > \frac{V_2}{1 + V_2 - V_1}, \\[4pt]
(r + \alpha_i)\big[V^t_{B+n} - \max(\tilde p^i_B, 1 - \tilde p^i_B)\big] & \text{if } \tilde p^i_B \in \Big[\frac{1 - V_2}{1 + V_1 - V_2}, \frac{V_2}{1 + V_2 - V_1}\Big],
\end{cases}
$$

and (ii) when $\alpha_i > \frac{2w}{V_1 + V_2 - 1} - r$, her optimal bidding strategy is

$$
b(\tilde p^i_B) =
\begin{cases}
0 & \text{if } \tilde p^i_B < \frac{1 - V_2}{1 + V_1 - V_2} \text{ or } \tilde p^i_B > \frac{V_2}{1 + V_2 - V_1}, \\[4pt]
(r + \alpha_i)\big[V^t_{B+n} - \max(\tilde p^i_B, 1 - \tilde p^i_B)\big] & \text{if } \tilde p^i_B \in \Big[\frac{1 - V_2}{1 + V_1 - V_2}, \frac{1 - V_2 + \Delta}{1 + V_1 - V_2}\Big] \cup \Big[\frac{V_2 - \Delta}{1 + V_2 - V_1}, \frac{V_2}{1 + V_2 - V_1}\Big], \\[4pt]
w & \text{if } \tilde p^i_B \in \Big(\frac{1 - V_2 + \Delta}{1 + V_1 - V_2}, \frac{V_2 - \Delta}{1 + V_2 - V_1}\Big),
\end{cases}
$$

where $\Delta = w/(r + \alpha_i)$ and $V_k = \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid \text{urn } k, B) \cdot \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$ for $k \in \{1, 2\}$. In addition, the bidding function $b(\tilde p^i_B)$ is non-decreasing on $[0, \tfrac12]$ and non-increasing on $[\tfrac12, 1]$.

Proof. See Appendix A.6.

Proposition 7 shows that subject $i$ bids the highest when her posterior belief in the first stage is close to $\tfrac12$ and bids the lowest when the posterior belief is close to 0 or 1. The interpretation is that the additional signals are the most helpful when she feels uncertain about the underlying state, i.e., $\tilde p^i_B = \tfrac12$. Moreover, the additional signals are the least helpful when she feels certain about the underlying state, i.e., $\tilde p^i_B = 0$ or $1$. When $b(\tilde p^i_B) = (r + \alpha_i)[V^t_{B+n} - \max(\tilde p^i_B, 1 - \tilde p^i_B)]$, subject $i$'s bid may be naturally decomposed into two parts: $r[V^t_{B+n} - \max(\tilde p^i_B, 1 - \tilde p^i_B)]$ and $\alpha_i[V^t_{B+n} - \max(\tilde p^i_B, 1 - \tilde p^i_B)]$. The first part is the expected increment in monetary reward due to the increment in the chance of making a correct choice, which we call the instrumental value of additional information. The second part is the expected increment in psychological reward due to the increment in the chance of making a correct choice, which we label the non-instrumental value of additional information.

Heterogeneous belief error model

Individual-level analysis of the experimental data in Section 4.3 demonstrates that considerable heterogeneity exists among subjects, especially in terms of the valuations of private information and social information. Since a basic model of belief error is captured by three parameters, our heterogeneous belief error model assumes that subjects are heterogeneous in that the model parameters $(\gamma_i, \theta_i, \alpha_i)$ vary across subjects. Specifically, we assume that the model parameters for each subject are independently drawn from a certain distribution and that the realization of the values is privately observable to the subject. Since a natural interpretation of the model parameters requires that $\gamma_i \ge 0$, $\theta_i \ge 0$, and $\alpha_i \ge 0$, we make the following assumption.

Assumption 4. $(\gamma_i, \theta_i, \alpha_i)$, $i = 1, 2, \ldots, N$, are independently and identically distributed. $\gamma_i$, $\theta_i$, and $\alpha_i$ are jointly independent and follow exponential distributions with means $\bar\gamma$, $\bar\theta$, and $\bar\alpha$, respectively. Specifically, the probability density function of $(\gamma_i, \theta_i, \alpha_i)$ is

$$\phi(\gamma_i, \theta_i, \alpha_i) = \frac{1}{\bar\gamma \bar\theta \bar\alpha} \exp\left(-\frac{\gamma_i}{\bar\gamma} - \frac{\theta_i}{\bar\theta} - \frac{\alpha_i}{\bar\alpha}\right), \quad \gamma_i, \theta_i, \alpha_i \ge 0,$$

when $\bar\gamma > 0$, $\bar\theta > 0$, and $\bar\alpha > 0$; if any of $\bar\gamma$, $\bar\theta$, or $\bar\alpha$ is zero, the distribution of the corresponding model parameter is degenerate with all probability mass at 0.

According to the heterogeneous belief error model, the model parameters $(\gamma_i, \theta_i, \alpha_i)$ remain constant for a subject once they are drawn from the exponential distribution. Given the parameters $(\gamma_i, \theta_i)$, the subject forms her posterior belief $\tilde p^i_x$ and her first-order belief about others' posterior belief $\tilde p^{-i}_x$ for any possible signal set $x$ according to the corresponding beta distributions specified in Assumptions 1 and 2. We assume that the subject's belief error and her belief about others' belief error are independent of each other and are both independent of the realization of the parameter $\alpha_i$ that characterizes the psychological reward of making a correct choice.

Assumption 5. For any $i \in \{1, 2, \ldots, N\}$, any realization of $(\gamma_i, \theta_i)$, and any possible signal set $x$, $\tilde p^i_x$, $\tilde p^{-i}_x$, and $\alpha_i$ are independent of each other.

Estimation strategy and results

We apply the heterogeneous belief error model to estimate the model parameters $(\bar\gamma, \bar\theta, \bar\alpha)$. The data consist of 6000 observations from $N = 100$ subjects over $J = 60$ rounds. Let $c_{ij}$ and $b_{ij}$ denote subject $i$'s first choice and her reservation price for additional signals in the $j$-th round. Let $\mathcal{J}_i \subseteq \{1, 2, \ldots, J\}$ denote the collection of rounds in which subject $i$ successfully purchases additional signals, and let $d_{ij}$ ($j \in \mathcal{J}_i$) be her second choice after the additional signals are observed. Then, given $(\gamma_i, \theta_i, \alpha_i)$, the probability that subject $i$ chooses $c_{ij}$ ($j \in \{1, 2, \ldots, J\}$) and $d_{ij}$ ($j \in \mathcal{J}_i$) and reports a reservation price of $b_{ij}$ ($j \in \{1, 2, \ldots, J\}$) is

$$L^*_i(\gamma_i, \theta_i, \alpha_i) = \prod_{j=1}^{J} f_j(c_{ij}, b_{ij} \mid \gamma_i, \theta_i, \alpha_i) \cdot \prod_{j \in \mathcal{J}_i} g_j(d_{ij} \mid \gamma_i, \theta_i),$$

where $f_j$ represents the probability of the first choice and the reservation price in round $j$, and $g_j$ represents the probability of the second choice. Integrating out the unobservable individual-specific $\gamma_i$, $\theta_i$, and $\alpha_i$ yields the likelihood for subject $i$'s behavioral data given model parameters $(\bar\gamma, \bar\theta, \bar\alpha)$ as

$$
\begin{aligned}
L_i(\bar\gamma, \bar\theta, \bar\alpha) &= \iiint_{\mathbb{R}^3_+} L^*_i(\gamma_i, \theta_i, \alpha_i)\, \phi(\gamma_i, \theta_i, \alpha_i)\, d\gamma_i\, d\theta_i\, d\alpha_i \\
&= \iiint_{\mathbb{R}^3_+} \Bigg[\prod_{j=1}^{J} f_j(c_{ij}, b_{ij} \mid \gamma_i, \theta_i, \alpha_i) \prod_{j \in \mathcal{J}_i} g_j(d_{ij} \mid \gamma_i, \theta_i)\Bigg] \phi(\gamma_i, \theta_i, \alpha_i)\, d\gamma_i\, d\theta_i\, d\alpha_i. \qquad (1)
\end{aligned}
$$

Footnote: Our choice of structural estimation of the full data sample based on the heterogeneous model instead of the basic model is made mainly for the following reasons. First, subjects in feedback sessions and no-feedback sessions may have different model parameters due to the possibility of learning from the feedback. Second, we find substantial heterogeneity across subjects, as illustrated in Section 4.3. Third, we apply a basic model of belief error to estimate the model parameters for each subject's data subsample, and the estimation results show large variation in the estimated values across subjects.

We use maximum likelihood estimation to obtain the estimates of $(\bar\gamma, \bar\theta, \bar\alpha)$,

$$(\hat{\bar\gamma}, \hat{\bar\theta}, \hat{\bar\alpha}) = \operatorname*{argmax}_{(\bar\gamma, \bar\theta, \bar\alpha) \in \mathbb{R}^3_+} \sum_{i=1}^{N} \ln L_i(\bar\gamma, \bar\theta, \bar\alpha),$$

and derive the likelihood function in detail in the supplementary Appendix B. To compute the triple integral in the likelihood function, we use the three-dimensional Gauss-Jacobi quadrature with 50 nodes in each dimension. In the estimation, we also normalize the subjects' reservation prices to the unit interval by dividing the reservation prices in units of tokens by the upper bound of the reservation price (i.e., 300 tokens). The estimates of the model parameters are reported in column 1 of Table 4, with the standard errors reported in parentheses.

Recall that parameter $\bar\gamma$ measures the degree of an average subject's belief error and that parameter $\bar\theta$ measures an average subject's belief about the degree of others' belief errors. A larger estimate of $\bar\theta$ relative to $\bar\gamma$ suggests that an average subject thinks that others' belief errors are greater than her own.

Footnote: As discussed in the beginning of Section 5, we also apply a heterogeneous logit QRE model to perform structural estimation by pretending to ignore the conceptual concern.
Specifically, the heterogeneous logit QRE model assumes that subject $i$'s payoff disturbance follows a Gumbel distribution with a scale parameter, that in her opinion others' payoff disturbances follow a Gumbel distribution with a potentially different scale parameter, and that the two parameters are independently drawn from exponential distributions. The maximum likelihood estimation shows that the estimated mean of one parameter is significantly larger than that of the other. In addition, the log-likelihood value of the estimation is -2879.85, which is considerably smaller than the log-likelihood value of the estimation based on our heterogeneous belief error model.

[Table 4: Estimation results of model parameters. Columns: (1) Unrestricted; (2) Test $H_0$: $\bar\theta = 0$; (3) Test $H_0$: $\bar\theta \le \bar\gamma$. Rows: estimates of $\bar\gamma$, $\bar\theta$, and $\bar\alpha$, log-likelihood, and $p$-value.]

Thus, a subject's lower valuation of observing social information is explained both by her awareness of others' belief errors and by her exaggeration of others' belief errors. In other words, subjects' taking into account the belief error leads to their discounting the quality of the signals of others' choices, and their exaggeration of others' belief errors leads to a further discount in the signal quality.

Footnote: Kübler and Weizsäcker (2004) estimate a logistic choice model and find that the estimated value of a subject's belief about others' choice disturbance is greater than that of her own choice disturbance. Goeree et al. (2007) estimate a quantal response model with non-rational expectations and find that a subject's belief about others' payoff disturbance is greater than that of her own payoff disturbance. In contrast, the estimation of our model suggests that subjects view others' belief errors as greater than theirs or greater than what they actually are. Our interpretation is conceptually different from theirs.

Finally, we conduct statistical tests to confirm the two forces driving the difference in reservation prices of private information and social information. When $\bar\theta$ approaches zero, $\theta_i$ becomes a degenerate distribution, and $\theta_i = 0$ for any $i$. According to the belief error model, subject $i$ does not think others have belief error in this case. Therefore, she believes that others' first choices perfectly coincide with the colors of the balls observed privately; thus, social information is equivalent to private information. This possibility is ruled out by testing $H_0$: $\bar\theta = 0$ versus $H_1$: $\bar\theta > 0$. Under the null hypothesis, the likelihood ratio test statistic satisfies

$$2\left[\max_{\bar\gamma, \bar\theta, \bar\alpha \ge 0} \sum_{i=1}^{N} \ln L_i(\bar\gamma, \bar\theta, \bar\alpha) - \max_{\bar\theta = 0,\, \bar\gamma, \bar\alpha \ge 0} \sum_{i=1}^{N} \ln L_i(\bar\gamma, \bar\theta, \bar\alpha)\right] \xrightarrow{d} \tfrac12 \chi^2_0 + \tfrac12 \chi^2_1,$$

where $\chi^2_0$ is a degenerate distribution at 0 and $\chi^2_1$ is a chi-square distribution with 1 degree of freedom. Given that the value of the test statistic is 26.1839 ($p$-value $= 1.552 \times 10^{-7}$), the null hypothesis is rejected at the significance level of 1%.

We also conduct the test $H_0$: $\bar\theta \le \bar\gamma$ versus $H_1$: $\bar\theta > \bar\gamma$ to verify the interpretation that subjects exaggerate others' belief error. Similarly, the likelihood ratio test statistic satisfies

$$2\left[\max_{\bar\gamma, \bar\theta, \bar\alpha \ge 0} \sum_{i=1}^{N} \ln L_i(\bar\gamma, \bar\theta, \bar\alpha) - \max_{\bar\gamma \ge \bar\theta \ge 0,\, \bar\alpha \ge 0} \sum_{i=1}^{N} \ln L_i(\bar\gamma, \bar\theta, \bar\alpha)\right] \xrightarrow{d} \tfrac12 \chi^2_0 + \tfrac12 \chi^2_1$$

under the null hypothesis. Since the $p$-value of this test is 0.0467, the test rejects the null hypothesis at the significance level of 5%.

Conclusions

This paper reveals experimentally that individuals value social information less than private information, even though the two are expected to be identical in the Bayesian paradigm. Additionally, a monotonic relationship exists between signal quality and the frequency of individuals' making a Bayesian choice, and there is positive informational value of observing an additional signal after already observing a signal, both of which contradict the Bayesian paradigm. These findings are explained by a belief error model in which individuals form a random posterior belief with a Bayesian kernel and individuals sophisticatedly consider their own and others' belief errors. Finally, maximum likelihood estimation of the heterogeneous belief error model suggests that individuals' sophisticated consideration of others' belief errors and their exaggeration of others' belief errors both contribute to their lower valuation of social information than private information.

This paper, to the best of our knowledge, is the first to test the first-order belief assumption that individuals believe that others process private information in a Bayesian manner. In our novel experimental design, the first-order belief assumption is identified with a testable implication about the equivalent reservation prices for private information and social information.
Our experimental evidence casts doubt on the first-order belief assumption and suggests that future non-Bayesian social learning models may need to reflect the failure of the assumption, given that it is fundamental to many existing Bayesian and non-Bayesian social learning models.

Our proposed belief error model first formalizes the noise in individuals' formation of posterior beliefs by making a beta distribution assumption about their random posterior beliefs. Compared to the method of modeling errors by assuming an additively separable error term and making a certain distribution assumption about the error term, we believe that our method is particularly useful for modeling belief error and, more generally, for modeling errors in economic variables whose values must fall within a bounded interval. In addition, our model has the advantage of retaining the feature that individuals are, on average, Bayesian, and it is flexible enough to allow some non-Bayesian choice behavior that cannot be predicted by existing non-Bayesian models. It is beyond the scope of this paper to provide a rational foundation for the beta distribution assumption about random posterior beliefs. We believe that an investigation of its rational foundation is an interesting research agenda, just as Matějka and McKay (2015) recently provide a rational inattention foundation for the logit model that has been used for decades.

A Proofs of Main Results

A.1 Proof of Proposition 3

Proof.

For simplicity of notation, we omit the index $i$ in the proof. Given Assumption 1, $\tilde p_x$ and $\tilde p_y$ follow the same distribution when $p_x = p_y$, so $\max(\tilde p_x, 1 - \tilde p_x)$ and $\max(\tilde p_y, 1 - \tilde p_y)$ are also identically distributed. Then their expectations over belief error, i.e., $P(\text{correct choice} \mid x)$ and $P(\text{correct choice} \mid y)$, should be equal.

According to the property of the beta distribution,

$$\tilde p_y \sim \mathrm{Beta}\left(\frac{p_y}{\gamma}, \frac{1 - p_y}{\gamma}\right) \;\Rightarrow\; 1 - \tilde p_y \sim \mathrm{Beta}\left(\frac{1 - p_y}{\gamma}, \frac{p_y}{\gamma}\right).$$

If $p_y = 1 - p_x$, then $1 - \tilde p_y$ and $\tilde p_x$ are identically distributed. So $\max(\tilde p_y, 1 - \tilde p_y) = \max(1 - \tilde p_y, \tilde p_y)$ and $\max(\tilde p_x, 1 - \tilde p_x)$ are identically distributed, and in turn their expectations over belief error should be equal. This establishes (i).

We now prove the monotonicity. According to property (i), it suffices to show that $p_x > p_y \ge \tfrac12$ implies that $P(\text{correct choice} \mid x) > P(\text{correct choice} \mid y)$.

Since $P(\text{correct choice} \mid x) = \mathbb{E}\max(\tilde p_x, 1 - \tilde p_x)$ where $\tilde p_x \sim \mathrm{Beta}\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big)$, we define random variables $X = \max(\tilde p_x, 1 - \tilde p_x)$ and $Y = \max(\tilde p_y, 1 - \tilde p_y)$, which have common support $[\tfrac12, 1]$. Then, the proposition states that $\mathbb{E}X > \mathbb{E}Y$ whenever $p_x > p_y \ge \tfrac12$. This is implied by $X$ first-order stochastically dominating $Y$; that is,

$$P(X \le u) < P(Y \le u) \quad \text{for any } u \in \left(\tfrac12, 1\right). \qquad (2)$$

We shall show (2) by showing that for a random variable $T \sim \mathrm{Beta}\big(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\big)$,

$$\frac{\partial P(\max(T, 1 - T) \le u)}{\partial p} < 0 \quad \text{for any } p > \tfrac12 \text{ and any } \tfrac12 < u < 1. \qquad (3)$$

First, for the beta function $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$ with $a, b > 0$,

$$\frac{\partial B(a, b)}{\partial a} = B(a, b)[\psi(a) - \psi(a + b)], \qquad \frac{\partial B(a, b)}{\partial b} = B(a, b)[\psi(b) - \psi(a + b)],$$

where $\psi(z) = \Gamma'(z)/\Gamma(z)$ is the digamma function. Then for $0 < z < 1$, the regularized incomplete beta function $I_z(a, b)$ satisfies

$$\frac{\partial I_z(a, b)}{\partial a} = \frac{\partial}{\partial a} \frac{\int_0^z t^{a-1}(1 - t)^{b-1}\, dt}{B(a, b)} = \frac{\int_0^z t^{a-1}(1 - t)^{b-1} \ln t\, dt}{B(a, b)} - I_z(a, b)[\psi(a) - \psi(a + b)],$$

and

$$\frac{\partial I_z(a, b)}{\partial b} = \frac{\int_0^z t^{a-1}(1 - t)^{b-1} \ln(1 - t)\, dt}{B(a, b)} - I_z(a, b)[\psi(b) - \psi(a + b)],$$

which gives

$$\frac{\partial I_z\big(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\big)}{\partial p} = \frac{1}{\gamma}\left\{\int_0^z f_T(t) \ln\frac{t}{1 - t}\, dt - I_z\left(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\right)\left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right]\right\}, \qquad (4)$$

where $f_T(t) = t^{\frac{p}{\gamma} - 1}(1 - t)^{\frac{1 - p}{\gamma} - 1} \big/ B\big(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\big)$ is the density function of $T$. Therefore, for $\tfrac12 < u < 1$,

$$P(\max(T, 1 - T) \le u) = P(1 - u \le T \le u) = I_u\left(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\right) - I_{1-u}\left(\frac{p}{\gamma}, \frac{1 - p}{\gamma}\right),$$

and then

$$
\begin{aligned}
\frac{\partial P(\max(T, 1 - T) \le u)}{\partial p} &= \frac{1}{\gamma}\left\{\int_{1-u}^{u} f_T(t) \ln\frac{t}{1 - t}\, dt - \int_{1-u}^{u} f_T(t)\, dt \cdot \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right]\right\} \\
&= \frac{\int_{1-u}^{u} f_T(t)\, dt}{\gamma}\left\{\frac{\int_{1-u}^{u} f_T(t) \ln\frac{t}{1 - t}\, dt}{\int_{1-u}^{u} f_T(t)\, dt} - \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right]\right\}. \qquad (5)
\end{aligned}
$$

Since $\int_{1-u}^{u} f_T(t)\, dt > 0$ for $u > \tfrac12$, the sign of $\partial P(\max(T, 1 - T) \le u)/\partial p$ is the same as the sign of the term

$$
\begin{aligned}
A &\equiv \frac{\int_{1-u}^{u} f_T(t) \ln\frac{t}{1 - t}\, dt}{\int_{1-u}^{u} f_T(t)\, dt} - \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right] \\
&= \frac{\int_{1/2}^{u} [f_T(t) - f_T(1 - t)] \ln\frac{t}{1 - t}\, dt}{\int_{1/2}^{u} [f_T(t) + f_T(1 - t)]\, dt} - \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right].
\end{aligned}
$$

Next, differentiating $A$ with respect to $u$ yields

$$
\begin{aligned}
\frac{\partial A}{\partial u} &= \frac{f_T(u) + f_T(1 - u)}{\int_{1/2}^{u} [f_T(t) + f_T(1 - t)]\, dt}\left\{\frac{f_T(u) - f_T(1 - u)}{f_T(u) + f_T(1 - u)} \ln\frac{u}{1 - u} - \int_{1/2}^{u} \frac{f_T(t) - f_T(1 - t)}{f_T(t) + f_T(1 - t)} \ln\frac{t}{1 - t} \cdot \frac{f_T(t) + f_T(1 - t)}{\int_{1/2}^{u} [f_T(v) + f_T(1 - v)]\, dv}\, dt\right\} \\
&= \frac{f_T(u) + f_T(1 - u)}{\int_{1/2}^{u} [f_T(t) + f_T(1 - t)]\, dt}\left\{g(u) - \int_{1/2}^{u} g(t) \cdot \frac{f_T(t) + f_T(1 - t)}{\int_{1/2}^{u} [f_T(v) + f_T(1 - v)]\, dv}\, dt\right\}, \qquad (6)
\end{aligned}
$$

where the function $g(t) \equiv \frac{f_T(t) - f_T(1 - t)}{f_T(t) + f_T(1 - t)} \ln\frac{t}{1 - t}$. Because $\frac{t}{1 - t}$ is increasing in $t$ on $(\tfrac12, 1)$,

$$\frac{f_T(t) - f_T(1 - t)}{f_T(t) + f_T(1 - t)} = \frac{t^{\frac{p}{\gamma} - 1}(1 - t)^{\frac{1 - p}{\gamma} - 1} - (1 - t)^{\frac{p}{\gamma} - 1} t^{\frac{1 - p}{\gamma} - 1}}{t^{\frac{p}{\gamma} - 1}(1 - t)^{\frac{1 - p}{\gamma} - 1} + (1 - t)^{\frac{p}{\gamma} - 1} t^{\frac{1 - p}{\gamma} - 1}} = 1 - \frac{2}{1 + \big(\frac{t}{1 - t}\big)^{\frac{2p - 1}{\gamma}}}$$

is increasing in $t$ given $p > \tfrac12$; and so is $g(t)$, since $\ln\frac{t}{1 - t}$ is increasing in $t$ on $(\tfrac12, 1)$ as well. Note that the integral term in (6) is a weighted average of the values of $g(t)$ over the interval $[\tfrac12, u]$. Thus, for any $u > \tfrac12$,

$$g(u) > \int_{1/2}^{u} g(t) \cdot \frac{f_T(t) + f_T(1 - t)}{\int_{1/2}^{u} [f_T(v) + f_T(1 - v)]\, dv}\, dt$$

due to the monotonicity of $g(\cdot)$, and then $\partial A/\partial u > 0$. When $u = 1$,

$$A = \frac{\int_0^1 f_T(t) \ln\frac{t}{1 - t}\, dt}{\int_0^1 f_T(t)\, dt} - \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right] = \mathbb{E}\left(\ln\frac{T}{1 - T}\right) - \left[\psi\left(\frac{p}{\gamma}\right) - \psi\left(\frac{1 - p}{\gamma}\right)\right] = 0.$$

Since $\partial A/\partial u > 0$, we have $A < 0$ for any $\tfrac12 < u < 1$. This proves (3) and completes the proof.

A.2 Proof of Proposition 4

Proof.

For simplicity of notation, we omit the index $i$ in the proof. We first show that $P\big(\tilde p_x \le \tfrac12\big)$ is strictly decreasing in $p_x$.

Footnote (to the evaluation of $A$ at $u = 1$ in the proof of Proposition 3): For the expectation of $\ln\frac{T}{1 - T}$, because for any $a, b > 0$,

$$\int_0^1 \frac{u^{a-1}(1 - u)^{b-1}}{B(a, b)} \ln u\, du = \frac{1}{B(a, b)} \int_0^1 \frac{\partial\, u^{a-1}(1 - u)^{b-1}}{\partial a}\, du = \frac{1}{B(a, b)} \frac{\partial B(a, b)}{\partial a} = \psi(a) - \psi(a + b),$$

it follows that $\mathbb{E}\ln T = \psi\big(\frac{p}{\gamma}\big) - \psi\big(\frac{1}{\gamma}\big)$ and $\mathbb{E}\ln(1 - T) = \psi\big(\frac{1 - p}{\gamma}\big) - \psi\big(\frac{1}{\gamma}\big)$. This result is also implied by (5): when $u = 1$, $P(\max(T, 1 - T) \le 1) = 1$ for any $p$, so the left-hand side of (5) is zero; it means the brace term on the right-hand side of (5) must be zero since $\int_0^1 f_T(t)\, dt/\gamma = 1/\gamma > 0$.

Since $\tilde p_x \sim \mathrm{Beta}\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big)$, $P\big(\tilde p_x \le \tfrac12\big) = I_{1/2}\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big)$. Then by (4),

$$\frac{\partial P\big(\tilde p_x \le \tfrac12\big)}{\partial p_x} = \frac{1}{\gamma B\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big)} \int_0^{1/2} t^{\frac{p_x}{\gamma} - 1}(1 - t)^{\frac{1 - p_x}{\gamma} - 1} \ln\frac{t}{1 - t}\, dt + \frac{I_{1/2}\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big)}{\gamma}\left[\psi\left(\frac{1 - p_x}{\gamma}\right) - \psi\left(\frac{p_x}{\gamma}\right)\right].$$

For $0 < t < \tfrac12$, the integral term is always negative as $\ln\frac{t}{1 - t} < 0$. Note that when the real part of $z$ is positive, the digamma function has the integral representation $\psi(z) = \int_0^\infty \big(\frac{e^{-t}}{t} - \frac{e^{-zt}}{1 - e^{-t}}\big)\, dt$, which is increasing in $z$. So when $\tfrac12 \le p_x < 1$, $\psi\big(\frac{1 - p_x}{\gamma}\big) \le \psi\big(\frac{p_x}{\gamma}\big)$ and therefore $\partial P\big(\tilde p_x \le \tfrac12\big)/\partial p_x < 0$. When $0 < p_x < \tfrac12$, using $I_{1/2}\big(\frac{p_x}{\gamma}, \frac{1 - p_x}{\gamma}\big) = 1 - I_{1/2}\big(\frac{1 - p_x}{\gamma}, \frac{p_x}{\gamma}\big)$, we can get

$$\frac{\partial P\big(\tilde p_x \le \tfrac12\big)}{\partial p_x} = -\frac{\partial}{\partial p_x} I_{1/2}\left(\frac{1 - p_x}{\gamma}, \frac{p_x}{\gamma}\right) = \frac{1}{\gamma B\big(\frac{1 - p_x}{\gamma}, \frac{p_x}{\gamma}\big)} \int_0^{1/2} t^{\frac{1 - p_x}{\gamma} - 1}(1 - t)^{\frac{p_x}{\gamma} - 1} \ln\frac{t}{1 - t}\, dt + \frac{I_{1/2}\big(\frac{1 - p_x}{\gamma}, \frac{p_x}{\gamma}\big)}{\gamma}\left[\psi\left(\frac{p_x}{\gamma}\right) - \psi\left(\frac{1 - p_x}{\gamma}\right)\right].$$

A similar argument shows that $\partial P\big(\tilde p_x \le \tfrac12\big)/\partial p_x < 0$ in this case as well. Hence $P\big(\tilde p_x \le \tfrac12\big)$ is strictly decreasing in $p_x$, and equivalently $P\big(\tilde p_x > \tfrac12\big)$ is strictly increasing in $p_x$.

We know that $\big|p_x - \tfrac12\big| > \big|p_y - \tfrac12\big|$ can be categorized into four cases: (i) $p_x > p_y > \tfrac12$, (ii) $p_x < p_y < \tfrac12$, (iii) $p_x > \tfrac12 > p_y$ and $p_x + p_y > 1$, and (iv) $p_y > \tfrac12 > p_x$ and $p_x + p_y < 1$. In the first case, $P\big((\tilde p_x - \tfrac12)(p_x - \tfrac12) > 0\big) = P\big(\tilde p_x > \tfrac12\big) > P\big(\tilde p_y > \tfrac12\big) = P\big((\tilde p_y - \tfrac12)(p_y - \tfrac12) > 0\big)$. In the second case, $P\big((\tilde p_x - \tfrac12)(p_x - \tfrac12) > 0\big) = P\big(\tilde p_x < \tfrac12\big) > P\big(\tilde p_y < \tfrac12\big) = P\big((\tilde p_y - \tfrac12)(p_y - \tfrac12) > 0\big)$. In the third case, let $p_z = 1 - p_y$; then it is straightforward to check that $P\big(\tilde p_z > \tfrac12\big) = P\big(\tilde p_y < \tfrac12\big)$. Since $p_x > p_z > \tfrac12$, $P\big((\tilde p_x - \tfrac12)(p_x - \tfrac12) > 0\big) = P\big(\tilde p_x > \tfrac12\big) > P\big(\tilde p_z > \tfrac12\big) = P\big(\tilde p_y < \tfrac12\big) = P\big((\tilde p_y - \tfrac12)(p_y - \tfrac12) > 0\big)$. In the fourth case, let $p_z = 1 - p_x$; then $P\big(\tilde p_z > \tfrac12\big) = P\big(\tilde p_x < \tfrac12\big)$. Since $p_z > p_y > \tfrac12$, $P\big((\tilde p_x - \tfrac12)(p_x - \tfrac12) > 0\big) = P\big(\tilde p_x < \tfrac12\big) = P\big(\tilde p_z > \tfrac12\big) > P\big(\tilde p_y > \tfrac12\big) = P\big((\tilde p_y - \tfrac12)(p_y - \tfrac12) > 0\big)$. This completes the proof.

A.3 Lemma and its proof

We follow the notation in subsection 5.1.3: let $\mathcal{X}^t_{B+n}$ denote the collection of all signal sets $x$ that contain a first black ball in setting $t \in \{pri, soc\}$, and define $V^{i,t}_{B+n} = \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid B)\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$. The following lemma shows that $V^{i,t}_{B+n}$ is linear in $\tilde p^i_B$ and is also increasing in $\tilde p^i_B$.

Lemma 1. $V^{i,t}_{B+n} = V_1 \tilde p^i_B + V_2 (1 - \tilde p^i_B)$ and $\tfrac12 < V_2 < V_1 < 1$, where $V_1 = \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid \text{urn 1}, B) \cdot \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$ and $V_2 = \sum_{x \in \mathcal{X}^t_{B+n}} P_i(x \mid \text{urn 2}, B) \cdot \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$.

Proof. First,

$$
\begin{aligned}
V^{i,t}_{B+n} &= \sum_{x \in \mathcal{X}^t_{B+n}} [P_i(x, \text{urn 1} \mid B) + P_i(x, \text{urn 2} \mid B)]\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x) \\
&= \sum_{x \in \mathcal{X}^t_{B+n}} [\tilde p^i_B P_i(x \mid \text{urn 1}, B) + (1 - \tilde p^i_B) P_i(x \mid \text{urn 2}, B)]\, \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x) \\
&= \tilde p^i_B V_1 + (1 - \tilde p^i_B) V_2.
\end{aligned}
$$

It remains to show that $\tfrac12 < V_2 < V_1 < 1$. Let $v(p_x) \equiv \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$. By Proposition 3, $v(p_x)$ is symmetric about $\tfrac12$ and strictly increasing on $(\tfrac12, 1)$. We shall first prove $V_1 > V_2$ by cases.

(i) Private information, $n = 1$. $V_1 = p\, v(p_{BB}) + q\, v(p_{BW})$ and $V_2 = q\, v(p_{BB}) + p\, v(p_{BW})$. Since $p > q$, $\frac{p^2}{p^2 + q^2} > \tfrac12$. It follows from the monotonicity of $v(\cdot)$ that

$$V_1 - V_2 = (p - q)[v(p_{BB}) - v(p_{BW})] = (p - q)\left[v\left(\frac{p^2}{p^2 + q^2}\right) - v\left(\tfrac12\right)\right] > 0.$$

(ii) Private information, $n = 3$. $V_1 = p^3 v(p_{4B}) + 3p^2 q\, v(p_{3B1W}) + 3pq^2 v(p_{2B2W}) + q^3 v(p_{1B3W})$ and $V_2 = q^3 v(p_{4B}) + 3pq^2 v(p_{3B1W}) + 3p^2 q\, v(p_{2B2W}) + p^3 v(p_{1B3W})$. Since $p > q$, $p^3 - q^3 > 0$, $3pq(p - q) > 0$, and

$$p_{4B} = \frac{p^4}{p^4 + q^4} > \frac{p^2}{p^2 + q^2} > \tfrac12 > \frac{q^2}{p^2 + q^2} = p_{1B3W}, \qquad p_{3B1W} = \frac{p^2}{p^2 + q^2} > \tfrac12 = p_{2B2W}.$$

Thus, by Proposition 3, $v(p_{1B3W}) < v(p_{4B})$, $v(p_{2B2W}) < v(p_{3B1W})$, and therefore

$$V_1 - V_2 = (p^3 - q^3)[v(p_{4B}) - v(p_{1B3W})] + 3pq(p - q)[v(p_{3B1W}) - v(p_{2B2W})] > 0.$$

(iii) Social information, $n = 1$. Subject $i$'s belief about observing $C_1$ in the true state of urn 1 is

$$
\begin{aligned}
P_i(C_1 \mid \text{urn 1}) &= P(B \mid \text{urn 1}) \cdot P_i\left(\tilde p^{-i}_B > \tfrac12\right) + P(W \mid \text{urn 1}) \cdot P_i\left(\tilde p^{-i}_W > \tfrac12\right) \\
&= p\left[1 - I_{1/2}\left(\frac{p_B}{\theta_i}, \frac{1 - p_B}{\theta_i}\right)\right] + q\left[1 - I_{1/2}\left(\frac{p_W}{\theta_i}, \frac{1 - p_W}{\theta_i}\right)\right] \\
&= p\, I_{1/2}\left(\frac{q}{\theta_i}, \frac{p}{\theta_i}\right) + q\, I_{1/2}\left(\frac{p}{\theta_i}, \frac{q}{\theta_i}\right) \qquad (7)
\end{aligned}
$$

as $p_B = p$ and $p_W = q$. Similarly, $P_i(C_1 \mid \text{urn 2}) = P(B \mid \text{urn 2}) \cdot P_i\big(\tilde p^{-i}_B > \tfrac12\big) + P(W \mid \text{urn 2}) \cdot P_i\big(\tilde p^{-i}_W > \tfrac12\big) = q\, I_{1/2}(q/\theta_i, p/\theta_i) + p\, I_{1/2}(p/\theta_i, q/\theta_i)$, $P_i(C_2 \mid \text{urn 1}) = 1 - P_i(C_1 \mid \text{urn 1})$, and $P_i(C_2 \mid \text{urn 2}) = 1 - P_i(C_1 \mid \text{urn 2})$.

Footnote: Since $\tilde p^{-i}_x$ is assumed to be continuous, we skip the tie cases where $\tilde p^{-i}_B = \tfrac12$ or $\tilde p^{-i}_W = \tfrac12$.

We omit the index $i$ in the remaining proof for the sake of exposition. Define $\pi \equiv P(C_1 \mid \text{urn 1}) = P(C_2 \mid \text{urn 2})$; then in this case, $V_1 = P(C_1 \mid \text{urn 1})\, v(p_{BC_1}) + P(C_2 \mid \text{urn 1})\, v(p_{BC_2}) = \pi v(p_{BC_1}) + (1 - \pi) v(p_{BC_2})$ and $V_2 = P(C_1 \mid \text{urn 2})\, v(p_{BC_1}) + P(C_2 \mid \text{urn 2})\, v(p_{BC_2}) = (1 - \pi) v(p_{BC_1}) + \pi v(p_{BC_2})$. Because $p > q$ and $\theta > 0$, we have

$$\tfrac12 < I_{1/2}\left(\frac{q}{\theta}, \frac{p}{\theta}\right) < 1, \qquad 0 < I_{1/2}\left(\frac{p}{\theta}, \frac{q}{\theta}\right) = 1 - I_{1/2}\left(\frac{q}{\theta}, \frac{p}{\theta}\right) < \tfrac12.$$

Footnote: $I_{1/2}(q/\theta, p/\theta) = 1 - I_{1/2}(p/\theta, q/\theta) = 1$ if $\theta = 0$.

Then by (7), $\pi > \tfrac12$, and

$$p_{BC_1} = \frac{p\pi}{p\pi + q(1 - \pi)} = \frac{1}{1 + \frac{q(1 - \pi)}{p\pi}} > \frac{1}{1 + \frac{q\pi}{p(1 - \pi)}} = p_{BC_2}.$$

It follows that $v(p_{BC_1}) > v(p_{BC_2})$ and therefore

$$V_1 - V_2 = (2\pi - 1)[v(p_{BC_1}) - v(p_{BC_2})] > 0.$$

(iv) Social information, $n = 3$. $V_1 = \pi^3 v(p_{B3C_1}) + 3\pi^2(1 - \pi) v(p_{B2C_1 1C_2}) + 3\pi(1 - \pi)^2 v(p_{B1C_1 2C_2}) + (1 - \pi)^3 v(p_{B3C_2})$ and $V_2 = (1 - \pi)^3 v(p_{B3C_1}) + 3\pi(1 - \pi)^2 v(p_{B2C_1 1C_2}) + 3\pi^2(1 - \pi) v(p_{B1C_1 2C_2}) + \pi^3 v(p_{B3C_2})$. Since $p > q$ and $\pi > \tfrac12$,

$$p_{B3C_1} = \frac{p\pi^3}{p\pi^3 + q(1 - \pi)^3} = \frac{1}{1 + \frac{q}{p}\big(\frac{1 - \pi}{\pi}\big)^3} > \frac{1}{1 + \frac{q}{p}\big(\frac{\pi}{1 - \pi}\big)^3} = \frac{p(1 - \pi)^3}{p(1 - \pi)^3 + q\pi^3} = p_{B3C_2}$$

and

$$p_{B3C_1} = \frac{1}{1 + \frac{q}{p}\big(\frac{1 - \pi}{\pi}\big)^3} > \frac{1}{1 + \frac{p}{q}\big(\frac{1 - \pi}{\pi}\big)^3} = \frac{q\pi^3}{p(1 - \pi)^3 + q\pi^3} = 1 - p_{B3C_2}.$$

In addition, note that

$$p_{B2C_1 1C_2} = p_{BC_1} = \frac{p\pi}{p\pi + q(1 - \pi)} > p_{B1C_1 2C_2} = p_{BC_2} = \frac{p(1 - \pi)}{p(1 - \pi) + q\pi} > \tfrac12.$$

So we have $v(p_{B3C_1}) > v(p_{B3C_2}) = v(1 - p_{B3C_2})$ and $v(p_{B2C_1 1C_2}) > v(p_{B1C_1 2C_2})$ by Proposition 3. Thus

$$V_1 - V_2 = [\pi^3 - (1 - \pi)^3][v(p_{B3C_1}) - v(p_{B3C_2})] + 3\pi(1 - \pi)(2\pi - 1)[v(p_{B2C_1 1C_2}) - v(p_{B1C_1 2C_2})] > 0.$$

It remains to show that $V_2 > \tfrac12$ and $V_1 < 1$. In the private information settings, $V_1$ and $V_2$ are either weighted averages of $v(p_{BB})$ and $v(p_{BW})$ or weighted averages of $v(p_{4B})$, $v(p_{3B1W})$, $v(p_{2B2W})$, and $v(p_{1B3W})$. In the social learning settings, $V_1$ and $V_2$ are either weighted averages of $v(p_{BC_1})$ and $v(p_{BC_2})$ or weighted averages of $v(p_{B3C_1})$, $v(p_{B2C_1 1C_2})$, $v(p_{B1C_1 2C_2})$, and $v(p_{B3C_2})$. Because for any signal set $x$,

$$\tfrac12 \le \max(\tilde p_x, 1 - \tilde p_x) \le 1 \;\Rightarrow\; \tfrac12 \le v(p_x) \le 1, \qquad (8)$$

we have $V_1, V_2 \in [\tfrac12, 1]$ in any case. However, the first equality of (8) holds if and only if $p_x = \tfrac12$ and $\gamma = 0$, and the second equality of (8) holds if and only if $p_x = 0$ or $1$ and $\gamma = 0$. It is impossible that all $p_x$'s being weighted are 0, 1, or $\tfrac12$ at the same time. Therefore, $V_1, V_2 \ne \tfrac12$ and $V_1, V_2 \ne 1$. This completes the proof.

A.4 Proof of Proposition 5

Proof.

By Lemma 1, $V^{pri}_{B+1} = \tilde p^i_B V_1 + (1 - \tilde p^i_B) V_2$ with $V_1 = p\, v(p_{BB}) + q\, v(p_{BW})$ and $V_2 = q\, v(p_{BB}) + p\, v(p_{BW})$, where $v(p_x) = \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$. Then,

$$
V^{pri}_{B+1} - \max(\tilde p^i_B, 1 - \tilde p^i_B) =
\begin{cases}
(V_1 - V_2 - 1)\tilde p^i_B + V_2 & \text{if } \tilde p^i_B > \tfrac12, \\
(1 + V_1 - V_2)\tilde p^i_B + V_2 - 1 & \text{if } \tilde p^i_B \le \tfrac12.
\end{cases}
$$

Since $\tfrac12 < V_2 < V_1 < 1$,

$$
V^{pri}_{B+1} - \max(\tilde p^i_B, 1 - \tilde p^i_B)
\begin{cases}
> 0 & \text{if } \tilde p^i_B \in (\underline{p}, \bar p), \\
\le 0 & \text{if } \tilde p^i_B \ge \bar p \text{ or } \tilde p^i_B \le \underline{p},
\end{cases}
$$

where $\underline{p} \equiv \frac{1 - V_2}{1 + V_1 - V_2} < \tfrac12 < \bar p \equiv \frac{V_2}{1 - V_1 + V_2}$. It remains to show that $\bar p > p_B \equiv p$ and that $\bar p$ is increasing in the Bayesian posterior probability $p$ when $p > \tfrac12$.

Since for any $p_x \in (0, 1)$, $v(p_x) = \mathbb{E}\max(\tilde p_x, 1 - \tilde p_x) > \max(\mathbb{E}\tilde p_x, 1 - \mathbb{E}\tilde p_x) = \max(p_x, 1 - p_x) \ge p_x$ by Assumption 1 and Jensen's inequality, then

$$
\begin{aligned}
V_1 p + V_2 (1 - p) &= p\left[p\, v\left(\frac{p^2}{p^2 + q^2}\right) + q\, v\left(\tfrac12\right)\right] + q\left[q\, v\left(\frac{p^2}{p^2 + q^2}\right) + p\, v\left(\tfrac12\right)\right] \\
&= (p^2 + q^2)\, v\left(\frac{p^2}{p^2 + q^2}\right) + 2pq\, v\left(\tfrac12\right) \\
&> (p^2 + q^2) \cdot \frac{p^2}{p^2 + q^2} + 2pq \cdot \tfrac12 = p,
\end{aligned}
$$

so $V_2 > p - V_1 p + V_2 p$, which implies that $\bar p = \frac{V_2}{1 - V_1 + V_2} > p$.

Note that $\partial \bar p/\partial p = \big[V_2 \frac{\partial V_1}{\partial p} + (1 - V_1)\frac{\partial V_2}{\partial p}\big] \big/ (1 - V_1 + V_2)^2$. Since $\frac{p^2}{p^2 + q^2}$ is increasing in $p$, and by Assumption 1, $v(p_x)$ is an increasing function for $p_x > \tfrac12$, we have $\partial v\big(\frac{p^2}{p^2 + q^2}\big)/\partial p > 0$. Then

$$
\begin{aligned}
\frac{\partial V_1}{\partial p} &= p \cdot \frac{\partial}{\partial p} v\left(\frac{p^2}{p^2 + q^2}\right) + v\left(\frac{p^2}{p^2 + q^2}\right) - v\left(\tfrac12\right) \\
&> q \cdot \frac{\partial}{\partial p} v\left(\frac{p^2}{p^2 + q^2}\right) + \left|v\left(\tfrac12\right) - v\left(\frac{p^2}{p^2 + q^2}\right)\right| \\
&> \left|q \cdot \frac{\partial}{\partial p} v\left(\frac{p^2}{p^2 + q^2}\right) + v\left(\tfrac12\right) - v\left(\frac{p^2}{p^2 + q^2}\right)\right| = \left|\frac{\partial V_2}{\partial p}\right|.
\end{aligned}
$$

Since $V_2 > \tfrac12 > 1 - V_1 > 0$, we have

$$V_2 \frac{\partial V_1}{\partial p} + (1 - V_1)\frac{\partial V_2}{\partial p} > (1 - V_1)\frac{\partial V_1}{\partial p} - (1 - V_1)\left|\frac{\partial V_2}{\partial p}\right| > 0,$$

so $\partial \bar p/\partial p > 0$, i.e., $\bar p$ is increasing in $p$.

A.5 Proof of Proposition 6

Proof.

By Lemma 1, $V^{soc}_{B+1} = \tilde p^i_B V_1 + (1 - \tilde p^i_B) V_2$ with $V_1 = \pi v(p_{BC_1}) + (1 - \pi) v(p_{BC_2})$ and $V_2 = (1 - \pi) v(p_{BC_1}) + \pi v(p_{BC_2})$, where $\pi = P_i(C_1 \mid \text{urn 1}) = P_i(C_2 \mid \text{urn 2})$ and $v(p_x) = \mathbb{E}\max(\tilde p^i_x, 1 - \tilde p^i_x)$. Then, applying the same argument as in proving Proposition 5, we have

$$
V^{soc}_{B+1} - \max(\tilde p^i_B, 1 - \tilde p^i_B)
\begin{cases}
> 0 & \text{if } \tilde p^i_B \in (\underline{p}', \bar p'), \\
\le 0 & \text{if } \tilde p^i_B \ge \bar p' \text{ or } \tilde p^i_B \le \underline{p}',
\end{cases}
$$

where $\underline{p}' \equiv \frac{1 - V_2}{1 + V_1 - V_2} < \tfrac12 < \bar p' \equiv \frac{V_2}{1 + V_2 - V_1}$. Using the property $v(p_x) \ge p_x$ for any $p_x \in (0, 1)$, we have

$$
\begin{aligned}
V_1 p + V_2 (1 - p) &= [p\pi + q(1 - \pi)]\, v\left(\frac{p\pi}{p\pi + q(1 - \pi)}\right) + [p(1 - \pi) + q\pi]\, v\left(\frac{p(1 - \pi)}{p(1 - \pi) + q\pi}\right) \\
&> [p\pi + q(1 - \pi)] \cdot \frac{p\pi}{p\pi + q(1 - \pi)} + [p(1 - \pi) + q\pi] \cdot \frac{p(1 - \pi)}{p(1 - \pi) + q\pi} \\
&= p\pi + p(1 - \pi) = p.
\end{aligned}
$$

It implies that $V_2 > p + V_2 p - V_1 p \;\Rightarrow\; \bar p' > p$.

A.6 Proof of Proposition 7

Proof.

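Since

\[
\begin{aligned}
\frac{\partial U_{n,t}(b, \tilde{p}^i_B)}{\partial b}
&\propto (w - b + r + \alpha_i) V^t_{B+n} + (w - b)(1 - V^t_{B+n}) \\
&\quad - (w + r + \alpha_i)\max(\tilde{p}^i_B, 1 - \tilde{p}^i_B) - w\left[1 - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B)\right] \\
&= (r + \alpha_i)\left[V^t_{B+n} - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B)\right] - b \equiv h_i(b, \tilde{p}^i_B),
\end{aligned}
\]

and $\partial^2 U_{n,t}(b, \tilde{p}^i_B) / \partial b^2 = -\frac{1}{w} < 0$, the bid maximizing the subject's expected utility, $b(\tilde{p}^i_B)$, is determined as follows:

(i) If $h_i(0, \tilde{p}^i_B) \leq 0$, $b(\tilde{p}^i_B) = 0$;
(ii) If $h_i(w, \tilde{p}^i_B) \geq 0$, $b(\tilde{p}^i_B) = w$;
(iii) If $h_i(0, \tilde{p}^i_B) > 0 > h_i(w, \tilde{p}^i_B)$, $b(\tilde{p}^i_B)$ is the unique root that satisfies $h_i(b(\tilde{p}^i_B), \tilde{p}^i_B) = 0$, that is, $b(\tilde{p}^i_B) = (r + \alpha_i)[V^t_{B+n} - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B)]$.

Then the optimal bidding function is

\[
b(\tilde{p}^i_B) = \min\left(w, \max\left(0, (r + \alpha_i)\left[V^t_{B+n} - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B)\right]\right)\right).
\]

By Lemma 1,

\[
V^t_{B+n} - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B) =
\begin{cases}
(V_1 - V_2 + 1)\,\tilde{p}^i_B - (1 - V_2) & \text{if } 0 \leq \tilde{p}^i_B \leq \tfrac{1}{2}, \\
(V_1 - V_2 - 1)\,\tilde{p}^i_B + V_2 & \text{if } \tfrac{1}{2} < \tilde{p}^i_B \leq 1.
\end{cases}
\]

Hence, when $h_i(0, \tilde{p}^i_B) > 0 > h_i(w, \tilde{p}^i_B)$, $b(\tilde{p}^i_B)$ is increasing on $\left[0, \tfrac{1}{2}\right]$ and decreasing on $\left[\tfrac{1}{2}, 1\right]$.

We now investigate the boundary situations in more detail. Note that $h_i(0, \tilde{p}^i_B) \leq 0 \Leftrightarrow \tilde{p}^i_B \leq \frac{1 - V_2}{1 + V_1 - V_2}$ when $\tilde{p}^i_B \leq \tfrac{1}{2}$, and $\tilde{p}^i_B \geq \frac{V_2}{1 + V_2 - V_1}$ when $\tilde{p}^i_B > \tfrac{1}{2}$. Since $\tfrac{1}{2} < V_2 < V_1 < 1$, we have $\frac{1 - V_2}{1 + V_1 - V_2} < \tfrac{1}{2}$ and $\frac{V_2}{1 + V_2 - V_1} > \tfrac{1}{2}$. Therefore, $b(\tilde{p}^i_B) = 0 \Leftrightarrow \tilde{p}^i_B \leq \frac{1 - V_2}{1 + V_1 - V_2}$ or $\tilde{p}^i_B \geq \frac{V_2}{1 + V_2 - V_1}$.

In addition, $h_i(w, \tilde{p}^i_B) \geq 0 \Leftrightarrow \tilde{p}^i_B \geq \frac{1 - V_2 + \Delta}{1 + V_1 - V_2}$ when $\tilde{p}^i_B \leq \tfrac{1}{2}$, and $\tilde{p}^i_B \leq \frac{V_2 - \Delta}{1 + V_2 - V_1}$ when $\tilde{p}^i_B > \tfrac{1}{2}$, where $\Delta = w / (r + \alpha_i)$. Note that

\[
\frac{1 - V_2 + \Delta}{1 + V_1 - V_2} > \tfrac{1}{2}
\;\Leftrightarrow\;
\frac{V_2 - \Delta}{1 + V_2 - V_1} < \tfrac{1}{2}
\;\Leftrightarrow\;
\alpha_i < \frac{2w}{V_1 + V_2 - 1} - r.
\]

Hence, when $\alpha_i \leq \frac{2w}{V_1 + V_2 - 1} - r$, for $\tilde{p}^i_B \in \left[\frac{1 - V_2}{1 + V_1 - V_2}, \frac{V_2}{1 + V_2 - V_1}\right]$ we have $h_i(w, \tilde{p}^i_B) \leq 0$, so $b(\tilde{p}^i_B) = (r + \alpha_i)[V^t_{B+n} - \max(\tilde{p}^i_B, 1 - \tilde{p}^i_B)]$ on this interval. When $\alpha_i > \frac{2w}{V_1 + V_2 - 1} - r$, $b(\tilde{p}^i_B) = w$ for $\tilde{p}^i_B \in \left[\frac{1 - V_2 + \Delta}{1 + V_1 - V_2}, \frac{V_2 - \Delta}{1 + V_2 - V_1}\right]$. This completes the proof.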
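As a numerical illustration of the bidding rule derived in this proof, the following Python sketch evaluates the bid $\min(w, \max(0, (r+\alpha_i)[V - \max(\tilde{p}, 1-\tilde{p})]))$, with $V = \tilde{p} V_1 + (1-\tilde{p}) V_2$ as in Lemma 1, together with the two beliefs at which the bid drops to zero. The function names and the parameter values ($V_1 = 0.8$, $V_2 = 0.7$, $w = 10$, $r = 20$) are hypothetical choices made only for illustration; they are not taken from the experiment or from the structural estimation.

```python
def value_gain(p, v1, v2):
    """Gain from the extra information: V - max(p, 1 - p), with V = p*v1 + (1 - p)*v2."""
    return p * v1 + (1 - p) * v2 - max(p, 1 - p)


def optimal_bid(p, v1, v2, w, r, alpha):
    """Bid min(w, max(0, (r + alpha) * gain)), i.e. the interior root clipped to [0, w]."""
    return min(w, max(0.0, (r + alpha) * value_gain(p, v1, v2)))


def zero_bid_thresholds(v1, v2):
    """Beliefs at which the gain is exactly zero; the bid is zero at or outside them."""
    lower = (1 - v2) / (1 + v1 - v2)
    upper = v2 / (1 + v2 - v1)
    return lower, upper


def cap_cutoff_alpha(v1, v2, w, r):
    """Value of alpha above which the bid reaches the cap w for beliefs near 1/2."""
    return 2 * w / (v1 + v2 - 1) - r


if __name__ == "__main__":
    # Hypothetical parameter values, chosen so that 1/2 < V2 < V1 < 1.
    V1, V2, W, R = 0.8, 0.7, 10.0, 20.0
    print("zero-bid thresholds:", zero_bid_thresholds(V1, V2))
    print("alpha cutoff:", cap_cutoff_alpha(V1, V2, W, R))
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        low_alpha = optimal_bid(p, V1, V2, W, R, alpha=0.0)
        high_alpha = optimal_bid(p, V1, V2, W, R, alpha=30.0)
        print(p, low_alpha, high_alpha)
```

Under these assumed values the cutoff $\frac{2w}{V_1 + V_2 - 1} - r$ is approximately 20, so a subject with $\alpha_i = 0$ never bids $w$, while a subject with $\alpha_i = 30$ bids $w$ for beliefs close to $\tfrac{1}{2}$.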
