Herding driven by the desire to differ
Sander Heinsalu ∗ April 2, 2019
Abstract
Observational learning often involves congestion: an agent gets lower payoff from an action when more predecessors have taken that action. This preference to act differently from previous agents may paradoxically increase all but one agent's probability of matching the actions of the predecessors. The reason is that when previous agents conform to their predecessors despite the preference to differ, their actions become more informative. The desire to match predecessors' actions may reduce herding by a similar reasoning.

Keywords: herding, information cascade, social preferences, congestion.
JEL classification: D83, D82, C73, C72.
This paper studies rational agents' learning from the choices of others when the information of others is not directly available. Payoffs are interdependent due to congestion costs: if more preceding agents choose an action, then an agent's payoff from taking that action falls. Congestion arises in many situations, such as when individuals choose a supermarket lane or other queue, a route to drive or a service provider to use. For firms, choosing a market that others have entered is less profitable, other things equal.

The model follows the seminal papers of Banerjee (1992) and Bikhchandani et al. (1992) on herding and information cascades. Agents choose in sequence between two actions, after observing the previous agents' actions and a private signal. All agents prefer their action to match a binary state, which is symmetrically unknown. The payoff of an agent also increases when the action differs from those of the preceding agents. Such social preferences are also assumed in Gaigl (2009) and Eyster et al. (2014).

∗ Research School of Economics, Australian National University, 25a Kingsley St, Acton ACT 2601, Australia. Email: [email protected], website: https://sanderheinsalu.com/
In equilibrium, the preference for an action different from that of previous agents may paradoxically increase all but one agent's probability of matching the actions of the predecessors, compared to the case when payoffs do not depend on others' actions. When previous agents choose the same action as their predecessors, congestion costs increase the informativeness of this action. The reason is that a stronger signal is required to induce an action when the preceding agents have chosen it. A more informative action in turn motivates imitation, even when congestion moderately increases the cost of imitating.

Similarly, the desire to conform to previous agents' actions may reduce herding. If past agents made the same choice as their predecessors, then these actions are less informative under a preference to match previous moves. The decreased informativeness of preceding actions allows an agent's private signal to outweigh the combined effect of previous moves and the desire to conform.

In contrast to the current work, both Gaigl (2009) and Eyster et al. (2014) show that congestion costs reduce herding and, if not too large, improve learning. Large enough congestion costs cause agents to alternate their actions (anti-herd), which decreases learning. Gaigl (2009) and Eyster et al. (2014) focus on asymptotic learning, but the present paper considers the probability of each agent matching the actions of his predecessors, as well as the correct action. As in the previous literature, when the desire to differ is small enough, all agents take the same action from some finite time onward. In that case, learning is bounded, i.e. there is positive probability of the wrong action as time goes to infinity.

In Callander and Hörner (2009), agents are differently informed and observe only the number of previous movers choosing an action, not who chose it. Following the minority is sometimes optimal. Other forms of social preference in herding have been studied. In Ali and Kartik (2012), agents prefer others to take the correct action. Callander (2007) assumes that agents want to match the eventual majority, thus payoff depends on future agents' choices, unlike in the current work.

The next section sets up the model where agents desire to differ from previous movers. The results are collected in Section 2 and discussed in Section 3. The appendix shows that a desire to conform may reduce herding.
1 Model
Time is discrete, with periods and players indexed by $i \in \mathbb{N}$. In period $i$, player $i$ observes a private signal $s_i \in \{L, \ell, r, R\}$ and chooses a public action $a_i \in \{L, R\}$. The public history of actions up to time $t$ is denoted $a^t = (a_1, \ldots, a_t)$. Action $a_i$ is called uninformative after history $a^{i-1}$ if $a_i(a^{i-1}, s_i)$ is constant in $s_i$, and informative otherwise. An information cascade is said to occur after history $a^t$ if actions $a_i$, $i > t$, are uninformative after any continuation history $a^{i-1} = (a^t, a_{t+1}, \ldots)$. Herding after history $a^t$ means that $a_{t+1} = a_t$ regardless of signals. Thus a herd is a special case of an information cascade.

An unknown state $\theta \in \{L, R\}$ determines payoffs via
\[
u_i(a_i, \theta) = \mathbf{1}\{a_i = \theta\} - \frac{k}{i-1} \sum_{j=1}^{i-1} \mathbf{1}\{a_j = a_i\},
\]
where $\mathbf{1}_S$ denotes the indicator function of set $S$ and $k \geq 0$. If $k = 0$, then the environment is standard herding with independent preferences. If $k > 0$, then each player prefers to take a different choice than the majority of the previous agents, other things equal.

The prior probability of state $R$ is $p_0 \in [\frac{1}{2}, 1)$ w.l.o.g. Denote by $p_S \in (0, 1)$ the unconditional probability of a strong signal $s_i \in \{L, R\}$. Conditional on the state, the probabilities of the signals are $\Pr(L|L) = \Pr(R|R) = Q \in \left(\frac{p_S}{2}, p_S\right)$ and $\Pr(\ell|L) = \Pr(r|R) = q \in \left(\frac{1-p_S}{2}, \frac{Q(1-p_S)}{p_S}\right)$. Therefore $\Pr(L|R) = \Pr(R|L) = p_S - Q$ and $\Pr(\ell|R) = \Pr(r|L) = 1 - p_S - q$. Bayesian updating determines each player's posterior belief $p_i = \Pr(R|a^{i-1}, s_i)$ and log likelihood ratio $l_i := \ln p_i - \ln(1 - p_i)$. Using $l_i$ instead of $p_i$ simplifies the exposition and is mathematically equivalent. Signals $L, \ell$ favour state $L$, in the sense of increasing the posterior probability of $L$. Similarly, $R, r$ favour state $R$. Calling signals $\ell, r$ weak and $L, R$ strong is justified by $\frac{q}{1-p_S} < \frac{Q}{p_S}$, which means that the posterior belief moves more in response to $L, R$ than to $\ell, r$. Assume $\frac{q}{1-p_S-q} > \frac{p_0}{1-p_0}$, equivalently $l_q > l_0$, to ensure signals are informative enough for even a weak signal to overturn the prior, i.e. for player 1 to believe after signal $\ell$ that state $L$ is more likely than $R$.

Denote the (public) log likelihood ratio of player $i$ before observing $s_i$ by $l_i(a^{i-1})$. Action $\tilde{a}_i \in \{L, R\}$ is called more informative than $\hat{a}_j \in \{L, R\}$ if $|l_{i+1}((a^{i-1}, \tilde{a}_i)) - l_i(a^{i-1})| \geq |l_{j+1}((a^{j-1}, \hat{a}_j)) - l_j(a^{j-1})|$ for any $a^{i-1}, a^{j-1}$, which means that $\tilde{a}_i$ moves the public log likelihood ratio $l_{i+1}$ more than $\hat{a}_j$ moves $l_{j+1}$.

To derive player $i$'s private log likelihood ratios $l_i(a^{i-1}, s_i)$ after $a^{i-1}, s_i$, define
\[
l_Q := \ln Q - \ln(p_S - Q) > 0, \qquad l_q := \ln q - \ln(1 - p_S - q) \in (0, l_Q),
\]
\[
l_{Qq} := \ln(Q + q) - \ln(1 - Q - q) \in (l_q, l_Q) \quad \text{and} \quad l_{\neg Q} := \ln(1 - p_S + Q) - \ln(1 - Q) \in (0, l_{Qq}),
\]
where $l_Q, l_q$ are the log likelihood ratios of strong and weak signals respectively. The log likelihood ratio $l_{Qq}$ does not distinguish strong and weak signals, only whether the signal favours $L$ or $R$. If the strong signal in favour of one state is distinguishable from the other three, but the latter look identical to an agent, then upon not seeing the distinguishable strong signal, the agent uses $l_{\neg Q}$ to update. The (private) log likelihood ratios of $i$ upon observing $s_i$ are
\[
l_i(a^{i-1}, L) = l_i(a^{i-1}) - l_Q, \qquad l_i(a^{i-1}, \ell) = l_i(a^{i-1}) - l_q,
\]
\[
l_i(a^{i-1}, R) = l_i(a^{i-1}) + l_Q, \qquad l_i(a^{i-1}, r) = l_i(a^{i-1}) + l_q.
\]
Note that $l_i(a^{i-1}, R) = 2l_i(a^{i-1}) - l_i(a^{i-1}, L)$ and $l_i(a^{i-1}, r) = 2l_i(a^{i-1}) - l_i(a^{i-1}, \ell)$.

The expected utility of player $i$ with log likelihood ratio $l$ from action $a_i = R$ if fraction $f$ of previous players chose $R$ is $\frac{\exp(l)}{1+\exp(l)} - fk$, but the expected utility from $a_i = L$ is $\frac{1}{1+\exp(l)} - (1-f)k$. The payoff difference
\[
\Delta(l, f) := \frac{\exp(l) - 1}{1 + \exp(l)} + (1 - 2f)k
\]
determines the best response: player $i$ chooses $R$ if $\Delta(l, f) > 0$ and $L$ if $\Delta(l, f) \leq 0$. Define the cutoff log likelihood ratio
\[
l_k(f) := \ln(1 - k + 2fk) - \ln(1 + k - 2fk) \tag{1}
\]
at which a player switches from action $L$ to $R$. Clearly $l_k(\frac{1}{2}) = 0$ and $l_k(1) = -l_k(0)$.

The next section derives the optimal action choices of the players and provides sufficient conditions for herding to increase when players want to take a different action from their predecessors.
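The payoff difference $\Delta$ and the cutoff in equation (1) can be checked numerically. The sketch below uses a hypothetical congestion cost $k$ and verifies that $l_k(\frac{1}{2}) = 0$, that $l_k(f) = -l_k(1-f)$, and that the sign of $\Delta(l, f)$ flips exactly at $l_k(f)$:

```python
import math

def delta(l, f, k):
    # Payoff difference between actions R and L for a player with log
    # likelihood ratio l when fraction f of predecessors chose R.
    return (math.exp(l) - 1) / (1 + math.exp(l)) + (1 - 2 * f) * k

def l_k(f, k):
    # Cutoff log likelihood ratio (equation (1)) at which the player
    # switches from action L to R.
    return math.log(1 - k + 2 * f * k) - math.log(1 + k - 2 * f * k)

k = 0.05  # hypothetical congestion cost
assert abs(l_k(0.5, k)) < 1e-12                    # l_k(1/2) = 0
for f in (0.0, 0.25, 0.75, 1.0):
    assert abs(l_k(f, k) + l_k(1 - f, k)) < 1e-12  # antisymmetry around 1/2
    assert delta(l_k(f, k) + 1e-6, f, k) > 0       # R just above the cutoff
    assert delta(l_k(f, k) - 1e-6, f, k) < 0       # L just below the cutoff
```

Under this sign convention a larger $f$ raises the cutoff, so matching a majority that chose $R$ requires a stronger belief in $R$.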
2 Results

Player 1 chooses $a_1 = L$ after signals $L, \ell$ and $a_1 = R$ after $R, r$, due to the assumption $l_q > l_0$. There are no predecessors for player 1, so the optimal action $a_1^*$ does not depend on $k$. Similarly, if exactly half the predecessors of an odd-numbered player $2i-1$ chose $L$, then $k$ does not affect $a_{2i-1}$. (More generally, $l_k$ is antisymmetric around $\frac{1}{2}$, i.e. $l_k(f) = -l_k(1-f)$ for any $f \geq \frac{1}{2}$.)

Given $l_q > l_0$, player 2's log likelihood ratios conditional on player 1's action $a_1$ are $l_2(L) = l_0 - l_{Qq}$ and $l_2(R) = l_0 + l_{Qq}$ before observing $s_2$. The interpretation of $a_1 = L$ is the pooling of signals $L$ and $\ell$, and similarly for $a_1 = R$. If the congestion cost is not too large and the prior not too extreme, then the action of player 2 responds to $s_2$. Lemma 1 characterises when the actions of the first two agents are informative.

Lemma 1. Player 1's action is informative if $l_Q > l_0$ and only if $l_Q \geq l_0$. Player 2's action is informative after any $a_1$ if $l_q > l_0$, $l_0 - l_{Qq} - l_Q < l_k(0)$ and $l_0 + l_{Qq} - l_Q < l_k(1)$.

Proof. If $l_Q > l_0$, then $l_1(L) < 0$, so $a_1(L) = L$. Due to $l_0 \geq 0$, $a_1(r) = a_1(R) = R$, thus $a_1$ is informative. If $l_Q < l_0$, then $l_1(L) > 0$, so $a_1 = R$ for any $s_1$.

Clearly $a_2(L, R) = R$ for any $k \geq 0$. Before observing $s_2$, if $l_q > l_0$, then $l_2(L) = l_0 - l_{Qq}$ and $l_2(R) = l_0 + l_{Qq}$. Then $l_0 - l_{Qq} - l_Q < l_k(0)$ implies $\Delta(l_2(L, L), f) < 0$, so $a_2(L, L) = L$, ensuring that $a_2$ is informative after $a_1 = L$.

The condition $l_0 + l_{Qq} + l_Q > l_k(1)$ ensuring $a_2(R, R) = R$ is implied by $l_0 - l_{Qq} - l_Q < l_k(0)$ and $l_k(0) = -l_k(1)$. If $l_0 + l_{Qq} - l_Q < l_k(1)$, then $a_2(R, L) = L$, thus $a_2$ is informative after $a_1 = R$. □

The maintained assumption $l_q > l_0$ implies $l_Q > l_0$, which ensures $a_1$ is informative by Lemma 1. The conditions sufficient for $a_2$ to be informative are not necessary. The interpretation of $l_0 + l_{Qq} - l_Q - l_k(1) < 0$ is that the prior probability of $R$ is low enough that a strong signal $s_2 = L$ in favour of $L$ together with the preference to differ outweighs the prior and player 1's action $a_1 = R$.

Next, sufficient conditions are provided for herding to increase after the introduction of the desire to differ from previous agents. Increased herding means that actions become uninformative after some histories, but not the reverse. The set of histories after which herding occurs under $k > 0$, but not under $k = 0$, can have probability close to 1, as the numerical example after Proposition 2 demonstrates. Proposition 2 proves increased herding for the first four players under $k > 0$ relative to $k = 0$. After that, Lemma 3 shows that player 5 also herds more under $k > 0$.

Proposition 2.
Assume $l_q > l_0$ and $l_0 - l_{Qq} - l_Q + l_k(1) < 0$.

(a) If $k = 0$, $l_0 - l_{Qq} + l_q < 0$ and $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$, then $a_3$ is informative after any $a^2$.

(b) If $k > 0$, $l_0 + l_{Qq} - l_q - l_k(1) < 0$ and $l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$, then $a_3$ is uninformative after $a_1 = a_2$, the probability of which is $(Q+q)^2 + (1-Q-q)^2 > \frac{1}{2}$. If in addition $l_0 + l_{Qq} - l_q - l_k\left(\frac{i+1}{2i+1}\right) < 0$, then $a_{2i+3}$ is uninformative after $a_{2i+1} = a_{2i+2}$.

(c) If $l_0 - l_{Qq} + l_{\neg Q} < 0$ and $a_4(a^3, s_4)$ is informative under $k > 0$, then also under $k = 0$.

Proof. (a) The condition $l_0 + l_{Qq} - l_Q - l_k(1) < 0$ is implied by $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$, since $l_{\neg Q} > 0$ and $l_k(1) \geq 0$. Recall $l_k(f) = -l_k(1-f)$.

If $l_q > l_0$, then $a_1(L) = a_1(\ell) = L$ and $a_1(R) = a_1(r) = R$. In this case, $l_0 - l_{Qq} - l_Q + l_k(1) < 0$ implies $a_2(L, L) = L$. If $k = 0$, $l_0 - l_{Qq} + l_q < 0$ and $l_0 + l_{Qq} - l_Q - l_k(1) < 0$, then player 2's actions are $a_2(R, s_2 \neq L) = R = a_2(L, R)$ and $a_2(R, L) = L = a_2(L, s_2 \neq R)$, so
\[
l_3(L, L) = l_0 - l_{Qq} - l_{\neg Q}, \qquad l_3(L, R) = l_0 - l_{Qq} + l_Q,
\]
\[
l_3(R, L) = l_0 + l_{Qq} - l_Q, \qquad l_3(R, R) = l_0 + l_{Qq} + l_{\neg Q}.
\]

When $k = 0$ and $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$, player 3's action is informative after $a_1 = a_2$:

private history $(a^2, s_3)$ | $l_3(a^2, s_3)$ | $a_3(a^2, s_3)$
$(R, R, L)$ | $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$ | $L$
$(R, R, s_3 \neq L)$ | $\geq l_0 + l_{Qq} + l_{\neg Q} - l_q > 0$ | $R$
$(L, L, R)$ | $l_0 - l_{Qq} - l_{\neg Q} + l_Q > 0$ | $R$
$(L, L, s_3 \in \{\ell, L\})$ | $\leq l_0 - l_{Qq} - l_{\neg Q} - l_q < 0$ | $L$

If $k = 0$, then $a_3$ is informative after $a_1 = L \neq a_2 = R$, because $l_3(L, R, L) = l_0 - l_{Qq} < 0$ due to $l_{Qq} > l_q > l_0$, and $l_3(L, R, r) = l_0 - l_{Qq} + l_Q + l_q > 0$. Action $a_3$ is informative after $a_1 = R \neq a_2 = L$, because $l_3(R, L, R) = l_0 + l_{Qq} > 0$ and $l_3(R, L, L) = l_0 + l_{Qq} - 2l_Q < 0$ due to $l_Q > l_{Qq} > l_0$.

(b) If $k > 0$, $l_q > l_0$, $l_0 + l_{Qq} - l_q - l_k(1) < 0$ and $l_0 - l_{Qq} + l_q + l_k(1) > 0$ (the latter implied by the former and $l_0 \geq 0$), then $a_2(a_1, L) = a_2(a_1, \ell) = L$ and $a_2(a_1, R) = a_2(a_1, r) = R$ for any $a_1$, so $l_3(L, L) = l_0 - 2l_{Qq}$, $l_3(L, R) = l_3(R, L) = l_0$ and $l_3(R, R) = l_0 + 2l_{Qq}$.

If $k > 0$ and $l_3(L, L, R) + l_k(1) = l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$, then $l_3(L, L, s_3) < l_k(0)$ for any $s_3$, so $a_3(L, L, s_3) = L$ for any $s_3$. The condition $l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$ implies $l_0 + 2l_{Qq} - l_Q - l_k(1) > 0$, so $l_3(R, R, s_3) > l_k(1)$ and $a_3(R, R, s_3) = R$ for any $s_3$. If player 3 herds after $a_1 = a_2$, then so do all subsequent players, because $f$ remains unchanged at 0 or 1. More generally, if player $i$ herds after some history $a^{i-1}$ and $|f - \frac{1}{2}|$ weakly decreases over time, then all subsequent players $j > i$ also herd after any continuation of $a^{i-1}$.

In the $k > 0$ case, after $a_1 \neq a_2$, the conditions $l_0 + l_{Qq} - l_q - l_k(1) < 0$ and $l_0 - l_{Qq} + l_q + l_k(1) > 0$ again apply, $l_3(a^2) = l_0$ and $f = \frac{1}{2}$, so $a_3$ is informative by $l_q > l_0$. For any period $j$, if $f = \frac{1}{2}$, then $j$ is odd, $l_j(a^{j-1}) = l_0$, and if $a_{j+1} \neq a_j$, then in period $j+2$, $f = \frac{(j-1)/2 + 1}{j+1} = \frac{1}{2}$. If $l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$, then $l_0 - 2l_{Qq} + l_Q + l_k(f) < 0$ for any $f \leq 1$. Whenever $f = \frac{1}{2}$ and $l_j(a^{j-1}) = l_0$, the game essentially restarts, with player $j$ in the role of player 1 and a reduced $l_k(f)$, because $f$ responds less to $a_{j+1} = a_j$ the longer the history. Therefore if a herd has not started after $a^{2i}$ (which implies $a_t \neq a_{t-1}$ for all $t \leq 2i$), then it starts after $(a^{2i}, L, L)$, and if $l_0 + l_{Qq} - l_q - l_k\left(\frac{i+1}{2i+1}\right) < 0$, then also after $(a^{2i}, R, R)$. The conditional probability of a herd is
\[
\Pr(a_{i+2} = a_{i+1} \mid a_{i+1} \neq a_i) = 1 - 2(Q+q)(1-Q-q) = (Q+q)^2 + (1-Q-q)^2.
\]

(c) Table 1 displays $l_4(a^3)$ in the cases $k = 0$ and $k > 0$, as well as the conditions for $a_4(a^3, s_4)$ to be informative. Sufficient for $a_4(a^3, s_4)$ to be informative under $k = 0$ is that $l_4(a^3, L) < 0 < l_4(a^3, R)$, which is how the fourth column of Table 1 is derived from the second. Under $k > 0$, if $a_1 = a_2$, then herding already started from $a_3$, so $a_4$ is uninformative. If $a_1 \neq a_2$, then player 4 faces the same decision problem as player 2, so by Lemma 1, $a_4$ is informative for any $a_3$. More generally, if $a_{t-1} \neq a_t$ for all $t < 2i$, then player $2i$ faces the same decision problem as player 2, so by Lemma 1, $a_{2i}$ is informative for any $a^{2i-1}$. Table 1 shows that if $a_4$ is informative under $k > 0$ and $l_0 - l_{Qq} + l_{\neg Q} < 0$, then $a_4$ is informative under $k = 0$. □

Table 1: Public log likelihood ratios of player 4 before seeing $s_4$, and conditions under which $a_4$ responds to $s_4$. Maintained assumptions: $l_0 - l_{Qq} + l_q < 0$, $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$, $l_0 + l_{Qq} - l_q - l_k(1) < 0$, $l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$.

history $a^3$ | $l_4(a^3)$, $k = 0$ | $l_4(a^3)$, $k > 0$ | responds if ($k = 0$) | ($k > 0$)
$(L, L, L)$ | $l_0 - l_{Qq} - 2l_{\neg Q}$ | $l_0 - 2l_{Qq}$ | $l_0 - l_{Qq} - 2l_{\neg Q} + l_Q > 0$ | never
$(R, R, R)$ | $l_0 + l_{Qq} + 2l_{\neg Q}$ | $l_0 + 2l_{Qq}$ | $l_0 + l_{Qq} + 2l_{\neg Q} - l_Q < 0$ | never
$(L, L, R)$ | $l_0 - l_{Qq} - l_{\neg Q} + l_Q$ | off-path | always | never
$(R, R, L)$ | $l_0 + l_{Qq} + l_{\neg Q} - l_Q$ | off-path | always | never
$(L, R, L)$ | $l_0 - l_{Qq} + l_Q - l_Q$ | $l_0 - l_{Qq}$ | always | always
$(R, L, R)$ | $l_0 + l_{Qq} - l_Q + l_Q$ | $l_0 + l_{Qq}$ | always | always
$(L, R, R)$ | $l_0 - l_{Qq} + l_Q + l_{\neg Q}$ | $l_0 + l_{Qq}$ | $l_0 - l_{Qq} + l_{\neg Q} < 0$ | always
$(R, L, L)$ | $l_0 + l_{Qq} - l_Q - l_{\neg Q}$ | $l_0 - l_{Qq}$ | always | always

Proposition 2 is not vacuous—two numerical examples satisfying the assumptions are presented next.

Example 1. Parameter combinations satisfying all the assumptions exist, for instance values with $Q + q \approx 0.99$ and $k \approx 0.01$. In both combinations, player 3's herding probability increases from 0 to $(Q+q)^2 + (1-Q-q)^2 \approx 0.98$. Herding by player 4 (and 5 and 6, as Lemmas 3 and 4 below show) increases after every history.

The intuition for the condition $l_q > l_0$ in Proposition 2 is that player 1's weak signal outweighs the prior, so player 1 always follows own signal. The assumption $l_0 - l_{Qq} + l_q < 0$ means that when $k = 0$, the prior $p_0$ is close enough to $\frac{1}{2}$ for player 2's weak signal in favour of state $R$ not to outweigh the "average" signal (which is player 1's action) favouring state $L$. The best response of player 2 is then to follow player 1 except when $s_2$ is strong and disagrees with $a_1$.

From player 3's perspective, observing $a_2 \neq a_1$ is equivalent to seeing a strong signal $s_2 \neq a_1$, but observing $a_2 = a_1$ conflates the three other signals $\ell$, $r$ and the strong $s_2 = a_1$, in which case player 3's log likelihood ratio moves by only $l_{\neg Q}$. The intuition for $l_0 + l_{Qq} + l_{\neg Q} - l_Q < 0$ is that the effect $l_Q$ of a strong signal outweighs the combined prior $l_0$, average signal $l_{Qq}$ and the conflation $l_{\neg Q}$ of three signals when $k = 0$. Thus player 3 always follows a strong signal $s_3 \in \{L, R\}$, regardless of whether $s_3 = a_2$, so $a_3$ is informative.
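The log likelihood ratio orderings used in the results above, and the herding probability $(Q+q)^2 + (1-Q-q)^2$ of Proposition 2(b), can be illustrated numerically. The parameter values below are hypothetical, chosen only to satisfy the model's signal assumptions, not taken from the paper's example:

```python
import math

# Hypothetical signal parameters, chosen only to satisfy the assumptions
# p_S/2 < Q < p_S and (1-p_S)/2 < q < Q(1-p_S)/p_S of the model.
pS, Q, q = 0.6, 0.55, 0.25
assert pS / 2 < Q < pS and (1 - pS) / 2 < q < Q * (1 - pS) / pS

llr = lambda num, den: math.log(num) - math.log(den)
l_Q  = llr(Q, pS - Q)            # strong signal
l_q  = llr(q, 1 - pS - q)        # weak signal
l_Qq = llr(Q + q, 1 - Q - q)     # "average" signal: direction only
l_nQ = llr(1 - pS + Q, 1 - Q)    # absence of the opposing strong signal

# Orderings used throughout the results section:
assert 0 < l_q < l_Qq < l_Q
assert 0 < l_nQ < l_Qq

# Probability that players 1 and 2, each following their own signal
# (as under k > 0 in Proposition 2(b)), take the same action:
herd_prob = (Q + q) ** 2 + (1 - Q - q) ** 2
assert herd_prob > 0.5
```

The inequalities match the intervals asserted in the definitions of the log likelihood ratios.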
Under $k > 0$, the assumption $l_0 + l_{Qq} - l_q - l_k(1) < 0$ means that the desire to differ from player 1 combines with the effect of a weak signal to outweigh the prior and the information derived from $a_1$. This choice of player 2 to follow $s_2$ increases the informativeness of $a_2 = a_1$, but decreases that of $a_2 \neq a_1$. The more informative event $a_2 = a_1$ together with $l_0 - 2l_{Qq} + l_Q + l_k(1) < 0$ makes player 3 herd: a strong signal $l_Q$ and the desire to differ $l_k(1)$ do not overcome the effect $2l_{Qq}$ of two "average" signals. If player 3 herds after $a_1 = a_2$, then so do all subsequent players, because they have the same signal strengths and desire to differ. (If $l_0 - l_{Qq} - l_q < -l_k(1)$, which is implied by $l_0 - 2l_{Qq} + l_Q < -l_k(1)$ and $l_Q > l_{Qq}$, then player 2 does not ignore $s_2$ just to ensure $a_2 \neq a_1$, i.e. $k$ is small enough not to induce an anti-herding information cascade.)

The less informative $a_2 \neq a_1$ under $k > 0$ does not reduce learning relative to $k = 0$: in both cases, player 3 follows a strong signal after $a_1 \neq a_2$. No additional assumptions are needed, because $a_2 \neq a_1$ is either a strong signal (if $k = 0$) or an average one (if $k > 0$) favouring the opposite state to $a_1$. The average signal from $a_2 \neq a_1$ neutralises $a_1$, so $l_q > l_0$ is sufficient for $a_3$ to respond to even weak signals. The strong signal from $a_2 \neq a_1$ under $k = 0$ is neutralised by player 3's strong private signal $s_3 \in \{L, R\} \setminus \{a_2\}$, in which case $a_3 \neq a_2$. On the other hand, if $s_3 = a_2$, then $a_3 = a_2 \neq a_1$, so the action of player 3 is informative in the $k = 0$ case as well.

In Table 1, the condition $l_0 + l_{Qq} + 2l_{\neg Q} - l_Q < 0$ in the $a^3 = (R, R, R)$ line is sufficient for $l_0 - l_{Qq} - 2l_{\neg Q} + l_Q > 0$ in the $a^3 = (L, L, L)$ line. The intuition for these conditions is that a strong signal $s_4$ overwhelms the effect of an "average" signal from $a_1$ plus two conflations ($a_2$ and $a_3$) of the three signals other than a strong one opposing $a_1$. The condition $l_0 + l_{Qq} - l_{\neg Q} > 0$ for $a_4(R, L, L, s_4)$ to respond to $s_4$ under $k = 0$ always holds (so is omitted from the last line of Table 1), because $l_0 \geq 0$ and $l_{Qq} > l_{\neg Q}$. The maintained assumption $l_0 - l_{Qq} + l_q < 0$ does not imply $l_0 - l_{Qq} + l_{\neg Q} < 0$ (the condition for $a_4(L, R, R, s_4)$ to respond to $s_4$), because both $l_q > l_{\neg Q}$ and $l_q < l_{\neg Q}$ are possible. The reason why $l_0 - l_{Qq} + l_{\neg Q} < 0$ is needed for $a_4(L, R, R, s_4)$ to respond to $s_4$ is that the strong signal from $a_2 = R \neq a_1 = L$ is cancelled by $s_4 = L$, resulting in $l_4(L, R, R, L) < 0$, but if $s_4 \in \{r, R\}$, then $l_4(L, R, R, s_4) > 0$.

The next result compares the informativeness of player 5's action under $k = 0$ and $k > 0$.

Lemma 3. If $l_0 - l_{Qq} + l_{\neg Q} < 0$ and $a_5(a^4, s_5)$ is informative under $k > 0$, then $a_5(a^4, s_5)$ is also informative under $k = 0$.

Proof. History $a^3 = (R, L, L)$. Under $k = 0$, if $l_4(R, L, L, r) = l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_q < 0$, then the public log likelihood ratios after the continuations of $a^3 = (R, L, L)$ are $l_5(R, L, L, L) = l_0 + l_{Qq} - l_Q - 2l_{\neg Q}$ and $l_5(R, L, L, R) = l_0 + l_{Qq} - l_{\neg Q}$, because player 4 chooses $L$ after a weak signal $s_4 = r$. In this case, $a_5$ responds to $s_5$ if $l_0 + l_{Qq} - 2l_{\neg Q} > 0$, because $l_5(R, L, L, L, R) = l_0 + l_{Qq} - 2l_{\neg Q}$ and $l_5(R, L, L, L, L) = l_0 + l_{Qq} - 2l_Q - 2l_{\neg Q} < 0$. After $(R, L, L, R)$, player 5's action always responds to the private signal. By comparison, recall that when $k > 0$, player 5 (and any odd player) herds if the preceding two players took the same action.

On the other hand, if $l_4(R, L, L, r) > 0$, then $l_5(R, L, L, L) = l_0 + l_{Qq} - l_Q - l_{\neg Q} - l_{Qq}$ and $l_5(R, L, L, R) = l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_{Qq}$, because player 4 chooses $R$ after a weak signal $s_4 = r$. In this case, $a_5$ responds to $s_5$ if $l_5(R, L, L, L, R) = l_0 - l_{\neg Q} > 0$ ($a_5$ always responds if $a^4$ contains an equal number of $L, R$). The condition $l_0 - l_{\neg Q} > 0$ may fail, but when $k > 0$, player 5 always herds after history $(R, L, L, L)$, so the implication of the lemma holds vacuously there.

History $a^3 = (L, R, R)$. If $l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q > 0$, then $l_5(L, R, R, L) = l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_Q$ and $l_5(L, R, R, R) = l_0 - l_{Qq} + l_Q + 2l_{\neg Q}$, because $a_4(L, R, R, \ell) = R$. The condition for $a_5$ to respond to $s_5$ is $l_5(L, R, R, R, L) = l_0 - l_{Qq} + 2l_{\neg Q} < 0$, analogous to the condition for $a_4$ to be informative after $a^3 = (L, R, R)$.

In contrast, if $l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q < 0$, then $l_5(L, R, R, L) = l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_{Qq}$ and $l_5(L, R, R, R) = l_0 - l_{Qq} + l_Q + l_{\neg Q} + l_{Qq}$, because player 4 chooses $L$ after a weak signal $s_4 = \ell$. Action $a_5$ is always informative after $(L, R, R, L)$, but never after $(L, R, R, R)$ (just like with $k > 0$), because $l_5(L, R, R, R, L) = l_0 + l_{\neg Q} > 0$.

Histories $a^3 = (R, L, R)$ and $(L, R, L)$ lead to $l_4(a^3) = l_0 \pm l_{Qq}$, so player 4 faces the same decision problem as player 2. Thus continuing from these histories, any player herds more under $k > 0$ than under $k = 0$. For histories in the top half of Table 1, herding has already started with player 3, so all subsequent players unambiguously herd more under $k > 0$. □

The next result compares the informativeness of the action $a_6$ of player 6 under $k = 0$ and $k > 0$, analogously to Lemma 3 for $a_5$.
Lemma 4. If $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$ and $a_6(a^5, s_6)$ is informative under $k > 0$, then $a_6(a^5, s_6)$ is also informative under $k = 0$.

Proof. Based on Proposition 2 and Lemma 3, the only histories continuing from which player 6 could conceivably herd more under $k = 0$ are $a^4 = (L, R, R, L)$ and $(R, L, L, R)$. In these continuations, under $k > 0$, player 6 faces the same decision as player 2, but this need not be the case under $k = 0$. Consider first $a^4 = (R, L, L, R)$. Separate two cases based on the sign of $l_4(R, L, L, r) = l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_q$.

If $l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_q < 0$, then $l_5(R, L, L, R) = l_0 + l_{Qq} - l_{\neg Q}$. In this case, if $l_0 + l_{Qq} - l_{\neg Q} - l_q < 0$, then $l_6(R, L, L, R, L) = l_0 - l_{\neg Q}$ (so $a_6$ is informative) and $l_6(R, L, L, R, R) = l_0 + 2l_{Qq} - l_{\neg Q}$. Then $a_6$ is informative if $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$. The opposite case $l_0 + l_{Qq} - l_{\neg Q} - l_q > 0$ implies $l_6(R, L, L, R, L) = l_0 + l_{Qq} - l_{\neg Q} - l_Q$ and $l_6(R, L, L, R, R) = l_0 + l_{Qq} - l_{\neg Q} + l_{\neg Q}$, for both of which $a_6$ is informative.

If $l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_q > 0$, then $l_5(R, L, L, R) = l_0 + 2l_{Qq} - l_Q - l_{\neg Q}$. In this case, if $l_0 + 2l_{Qq} - l_Q - l_{\neg Q} - l_q < 0$ (implied by $l_0 + 2l_{Qq} - l_Q - l_{\neg Q} < 0$) and $l_0 + 2l_{Qq} - l_Q - l_{\neg Q} + l_q > 0$ (implied by $l_0 + l_{Qq} - l_Q - l_{\neg Q} + l_q > 0$), then $l_6(R, L, L, R, L) = l_0 + l_{Qq} - l_Q - l_{\neg Q}$ (so $a_6$ is informative) and $l_6(R, L, L, R, R) = l_0 + 3l_{Qq} - l_Q - l_{\neg Q}$. Therefore if $l_0 + 3l_{Qq} - 2l_Q - l_{\neg Q} < 0$, which is implied by $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$, then $a_6$ is informative.

Consider next $a^4 = (L, R, R, L)$, so $l_4(L, R, R, \ell) = l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q$. If $l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q > 0$, then $l_5(L, R, R, L) = l_0 - l_{Qq} + l_{\neg Q}$. In this case, if $l_0 - l_{Qq} + l_{\neg Q} + l_q > 0$, then $l_6(L, R, R, L, R) = l_0 + l_{\neg Q}$ (so $a_6$ is informative) and $l_6(L, R, R, L, L) = l_0 - 2l_{Qq} + l_{\neg Q}$. Then $a_6$ is informative if $l_0 - 2l_{Qq} + l_{\neg Q} + l_Q > 0$, sufficient for which is $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$. The opposite case is $l_0 - l_{Qq} + l_{\neg Q} + l_q < 0$, which implies $l_6(L, R, R, L, R) = l_0 - l_{Qq} + l_{\neg Q} + l_Q$ and $l_6(L, R, R, L, L) = l_0 - l_{Qq} + l_{\neg Q} - l_{\neg Q}$, so $a_6(L, R, R, L, L, s_6)$ is informative. Action $a_6(L, R, R, L, R, s_6)$ is informative if $l_0 - l_{Qq} + l_{\neg Q} < 0$, which is implied by $l_0 - l_{Qq} + l_{\neg Q} + l_q < 0$.

If $l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q < 0$, then $l_5(L, R, R, L) = l_0 - 2l_{Qq} + l_Q + l_{\neg Q}$. In this case, if $l_0 - 2l_{Qq} + l_Q + l_{\neg Q} + l_q > 0$ (implied by $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$) and $l_0 - 2l_{Qq} + l_Q + l_{\neg Q} - l_q < 0$ (implied by $l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q < 0$), then $l_6(L, R, R, L, R) = l_0 - l_{Qq} + l_Q + l_{\neg Q}$, so $a_6(L, R, R, L, R, s_6)$ is informative, because $l_0 - l_{Qq} + l_{\neg Q} < l_0 - l_{Qq} + l_Q + l_{\neg Q} - l_q < 0$. Also, $l_6(L, R, R, L, L) = l_0 - 3l_{Qq} + l_Q + l_{\neg Q}$, thus if $l_0 - 3l_{Qq} + 2l_Q + l_{\neg Q} > 0$, which is implied by $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$, then $a_6$ is informative. □

The assumption $l_0 + 2l_{Qq} - l_{\neg Q} - l_Q < 0$ of Lemma 4 thus ensures that herding under $k > 0$ is at least as likely as under $k = 0$ for the first six players. In some long enough histories, the probability of herding under $k = 0$ may overtake that under $k > 0$. This is because the effective congestion cost decreases when $f$ approaches $\frac{1}{2}$, which occurs each time the history lengthens by two actions without a herd having started. Eyster et al. (2014) show that as the congestion cost approaches zero, learning increases in the limit as time goes to infinity. The comparison of the limiting probabilities of learning under $k = 0$ and $k > 0$ is left open here, due to the different behaviour of the effective congestion cost (which varies with $|f - \frac{1}{2}|$) in the present paper relative to that of the congestion cost in Eyster et al. (2014). What is clear is that discounting the benefit of agents in the far future learning makes the welfare impact of the large initial increase in herding under $k > 0$ the dominant factor in the comparison with $k = 0$. In other words, the discounted probability of correct decisions is significantly smaller when there is a desire to differ from previous movers.

3 Discussion

The result that herding may increase with the desire to differ from previous movers is robust to varying the informativeness of signals or the congestion cost within some bounds. The informativeness and cost may also differ to some extent across players. Unboundedly informative signals or a strong enough preference for non-conformity break herding, as established in the previous literature. If the congestion cost is small enough, then it does not affect players' actions, because it does not outweigh the weakest of the finitely many signals.

In some applications, the congestion cost depends only on the actions of some preceding agents, not all. For example, if a service provider is capacity constrained and can serve only $m$ agents at a time or finishes the service in at most $m$ periods, then an agent's payoff only depends on the choices of the $m$ immediate predecessors. The desire to differ may increase conformity also in this case, as is clear from redefining $f$ in Section 1 to be the fraction of agents among the preceding $m$ who choose $R$.

Even if congestion depends only on the immediately preceding agent, a more informative $a_2 = a_1$ can motivate player 3 to herd.
The less informative $a_2 \neq a_1$ cannot reduce player 3's herding compared to the $k = 0$ case if player 3 does not herd after $a_1 \neq a_2$ under $k = 0$. Thus the overall probability of herding may increase, as in the baseline model. The proofs simplify, because each time the belief returns to the prior, the subgame is identical to the whole game. In particular, the condition $l_0 + l_{Qq} - l_q - l_k\left(\frac{i+1}{2i+1}\right) < 0$ for a herd to start in period $i+3$ conditional on not having started earlier may be omitted w.l.o.g., because it reduces to $l_0 + l_{Qq} - l_q - l_k(1) < 0$.

A Herding reduced by the desire to conform

This section shows that a preference to match the actions of preceding agents may in fact reduce herding. The idea is similar to why the desire to differ may increase herding—the actions of previous players become more informative after some histories, less after others. In the current section, it is the less informative actions that matter. A strong signal overwhelms the effect of two previous less informative actions plus the desire to conform, but does not outweigh the more informative actions in the absence of a preference to follow previous movers.

Only the differences from the setup in Section 1 are mentioned. Payoffs are
\[
u_i(a_i, \theta) = \mathbf{1}\{a_i = \theta\} + \frac{k}{i-1} \sum_{j=1}^{i-1} \mathbf{1}\{a_j = a_i\},
\]
where $k \geq 0$, so each player's payoff from an action increases in the fraction $f$ of previous agents taking that action.

There are six possible signal realisations $s_i \in \{L, \ell, \lambda, \rho, r, R\}$, with $\ell, r$ interpreted as medium strength and $\lambda, \rho$ as weak. Signals $L, \ell, \lambda$ favour state $L$, the others $R$. The respective unconditional probabilities of a strong, medium and weak signal are $p_S := \Pr(L) + \Pr(R)$, $p_s := \Pr(\ell) + \Pr(r)$ and $p_\sigma := \Pr(\lambda) + \Pr(\rho)$. The conditional probabilities are $\Pr(L|L) = \Pr(R|R) =: Q$, $\Pr(\ell|L) = \Pr(r|R) =: q$ and $\Pr(\lambda|L) = \Pr(\rho|R) =: \eta$. Assume $\frac{1}{2} < \frac{\eta}{p_\sigma} < \frac{q}{p_s} < \frac{Q}{p_S} < 1$, which justifies the interpretations of the signals. Define
\[
l_q := \ln q - \ln(p_s - q) \in (0, l_Q), \qquad l_\eta := \ln \eta - \ln(p_\sigma - \eta) \in (0, l_q),
\]
\[
l_{Qq} := \ln(Q + q) - \ln(p_S + p_s - Q - q) \in (l_q, l_Q), \qquad l_{Qq\eta} := \ln(Q + q + \eta) - \ln(1 - Q - q - \eta) \in (l_\eta, l_Q),
\]
\[
l_{\neg qQ} := \ln(p_\sigma + q + Q) - \ln(1 - q - Q) \in (0, l_{Qq\eta}), \qquad l_{\neg Q} := \ln(p_s + p_\sigma + Q) - \ln(1 - Q) \in (0, l_{\neg qQ}).
\]

The next result is analogous to Proposition 2 and provides sufficient conditions for herding to decrease when conformism is introduced.

Proposition 5.
Assume $l_\eta > l_0$. If $k = 0$, $l_0 + l_{Qq\eta} - l_q < 0$, $l_0 - l_{Qq\eta} + l_\eta < 0$ and $l_0 - l_{Qq\eta} - l_{\neg qQ} + l_Q < 0$, then $a_3$ is uninformative after $a_1 = a_2$, the probability of which is $(Q + q + \eta)(p_\sigma + q + Q) + (1 - Q - q - \eta)(1 - q - Q)$. If $k > 0$, $l_0 - l_{Qq\eta} + l_q - l_k(1) < 0$ and $l_0 + l_{Qq\eta} + l_{\neg Q} - l_Q + l_k(1) < 0$, then $a_3$ is informative after any history.

Proof. The assumption $l_0 < l_\eta$ ensures that player 1 follows own signal. Then player 2's public log likelihood ratios are $l_2(L) = l_0 - l_{Qq\eta}$ and $l_2(R) = l_0 + l_{Qq\eta}$.

$k = 0$. Assume $l_0 + l_{Qq\eta} - l_q < 0$ and $l_0 - l_{Qq\eta} + l_\eta < 0$, so $a_2(R, \ell) = a_2(L, \rho) = L$ and by implication, $a_2(L, r) = a_2(R, \lambda) = R$. Player 3's log likelihood ratios before seeing $s_3$ are
\[
l_3(L, L) = l_0 - l_{Qq\eta} - l_{\neg qQ}, \qquad l_3(L, R) = l_0 - l_{Qq\eta} + l_{Qq},
\]
\[
l_3(R, L) = l_0 + l_{Qq\eta} - l_{Qq}, \qquad l_3(R, R) = l_0 + l_{Qq\eta} + l_{\neg qQ}.
\]
If $l_0 - l_{Qq\eta} - l_{\neg qQ} + l_Q < 0$, then $a_3$ is uninformative after $a_1 = a_2$, i.e. a herd starts. After $a_1 \neq a_2$, player 3's action always responds to signals.

$k > 0$. If $l_0 - l_{Qq\eta} + l_q - l_k(1) < 0$ and $l_0 + l_{Qq\eta} - l_Q + l_k(1) < 0$ (implied by $l_0 + l_{Qq\eta} + l_{\neg Q} - l_Q + l_k(1) < 0$), then $a_2(L, r) = a_2(R, L) = L$ and by implication, $a_2(R, \ell) = a_2(L, R) = R$. Player 3's log likelihood ratios before observing $s_3$ are then
\[
l_3(L, L) = l_0 - l_{Qq\eta} - l_{\neg Q}, \qquad l_3(L, R) = l_0 - l_{Qq\eta} + l_Q,
\]
\[
l_3(R, L) = l_0 + l_{Qq\eta} - l_Q, \qquad l_3(R, R) = l_0 + l_{Qq\eta} + l_{\neg Q}.
\]
If $l_0 + l_{Qq\eta} + l_{\neg Q} - l_Q + l_k(1) < 0$, then a strong signal $s_3 = L$ induces $a_3 = L$, so $a_3$ is informative after $a_1 = a_2 = R$ and by implication after any history. □

The next example exhibits parameter values satisfying the assumptions of Proposition 5.

Example 2. Let $l_0 = 0$ and take signal probabilities satisfying the assumptions with $k$ small. The probability of $a_1 = a_2$ is $(Q + q + \eta)(p_\sigma + q + Q) + (1 - Q - q - \eta)(1 - q - Q) \approx 0.92$ under $k = 0$, but $(Q + q + \eta)(p_\sigma + p_s + Q) + (1 - Q - q - \eta)(1 - Q) \approx 0.94$ under $k > 0$.
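The two matching probabilities can be recomputed under any parameter values satisfying the six-signal assumptions; the values in the sketch below are hypothetical, not those of the example:

```python
# Hypothetical six-signal parameters for the appendix model.
pS, ps, psig = 0.3, 0.35, 0.35           # strong / medium / weak totals
Q, q, eta = 0.28, 0.28, 0.22             # conditional accuracies
assert abs(pS + ps + psig - 1) < 1e-12
assert 0.5 < eta / psig < q / ps < Q / pS < 1  # signal-strength ordering

# Pr(a_1 = a_2): under k = 0 player 2 deviates from a_1 only after a
# contrary strong or medium signal; under k > 0 only after a contrary
# strong signal (conditions of Proposition 5).
match_k0 = (Q + q + eta) * (psig + q + Q) + (1 - Q - q - eta) * (1 - q - Q)
match_kpos = (Q + q + eta) * (ps + psig + Q) + (1 - Q - q - eta) * (1 - Q)
assert match_kpos > match_k0  # conformity raises the matching probability
```

The inequality `match_kpos > match_k0` holds for any admissible parameters, since $q < p_s$ and $1 - Q > 1 - q - Q$, consistent with conformity raising the probability of matched actions even while it keeps $a_3$ informative.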
The probability that player 3's action is informative thus increases from about 0.08 to 1 when the desire to conform is introduced.

References
Ali, S. N. and N. Kartik (2012): "Herding with collective preferences," Economic Theory, 51, 601–626.

Banerjee, A. V. (1992): "A simple model of herd behavior," The Quarterly Journal of Economics, 107, 797–817.

Bikhchandani, S., D. Hirshleifer, and I. Welch (1992): "A theory of fads, fashion, custom, and cultural change as informational cascades," Journal of Political Economy, 100, 992–1026.

Callander, S. (2007): "Bandwagons and momentum in sequential voting," The Review of Economic Studies, 74, 653–684.

Callander, S. and J. Hörner (2009): "The wisdom of the minority," Journal of Economic Theory, 144, 1421–1439.

Eyster, E., A. Galeotti, N. Kartik, and M. Rabin (2014): "Congested observational learning," Games and Economic Behavior, 87, 519–538.