Promotion through Connections: Favors or Information?

Abstract

Connections appear to be helpful in many contexts such as obtaining a job, a promotion, a grant, a loan or publishing a paper. This may be due to favoritism or to information conveyed by connections. Attempts at identifying both effects have relied on measures of true quality, generally built from data collected long after promotion. This empirical strategy faces important limitations. Building on earlier work on discrimination, we propose a new method to identify favors and information from classical data collected at time of promotion. Under natural assumptions, we show that promotion decisions look more random for connected candidates, due to the information channel. We obtain new identification results and show how probit models with heteroscedasticity can be used to estimate the strength of the two effects. We apply our method to the data on academic promotions in Spain studied in Zinovyeva & Bagues (2015). We find evidence of both favors and information effects at work. Empirical results are consistent with evidence obtained from quality measures collected five years after promotion.



Yann Bramoullé and Kenan Huremović*

August 2017


Keywords : Promotion, Connections, Social Networks, Favoritism, Information.

JEL classification: C3, I23, M51.

*Bramoullé: Aix-Marseille University (Aix-Marseille School of Economics) & CNRS; Huremović: IMT School for Advanced Studies. We thank participants in seminars and conferences and Habiba Djebbari, Bruno Decreuse, Mark Rosenzweig, Marc Sangnier, Adam Szeidl, Natalia Zinovyeva and Russell Davidson for helpful comments and suggestions. For financial support, Yann Bramoullé thanks the European Research Council (Consolidator Grant n. 616442).

I Introduction

Connections appear to be helpful in many contexts such as obtaining a job, a promotion, a grant, a loan or publishing a paper. Two main reasons help explain these wide-ranging effects. On one hand, connections may convey information on candidates, projects and papers. Connections then help recruiters, juries and editors make better decisions. On the other hand, decision-makers may unduly favor connected candidates, leading to worse decisions. These two reasons have opposite welfare implications and empirical researchers have been trying to tease out the different forces behind connections' impacts. Almost all existing studies do so by building measures of candidates' "true" quality. Researchers then compare the quality of connected and unconnected promoted candidates. Information effects likely dominate if connected promoted candidates have higher quality; favors likely dominate if connected promoted candidates have lower quality. For instance, articles published in top economics and finance journals by authors connected to editors tend to receive more citations, a sign that editors use their connections to identify better papers (Brogaard, Engelberg & Parsons, 2014). By contrast, Full Professors in Spain who were connected to members of their promotion jury publish less after promotion (Zinovyeva & Bagues, 2015), consistent with favoritism.

This empirical strategy, while widely used, faces three important limitations. First, building a measure of true quality may not be easy or feasible. Looking at researchers' publications or articles' citations requires a long enough time lag following promotion or publication. And such measures are in any case imperfect proxies of quality. Second, identification is only valid if the impact of promotion on measured quality is the same for connected and unconnected promoted candidates, see e.g. Zinovyeva & Bagues (2015, p.283). This assumption is critical but not necessarily plausible, and can generally not be tested.
Third, connections may convey both information and favors. This empirical [...]

Footnote: The literature on jobs and connections is large and expanding. Recent references include Beaman & Magruder (2012), Brown, Setren & Topa (2016), Hensvik & Skans (2016), Pallais & Sands (2016). On promotions, see Combes, Linnemer & Visser (2008), Zinovyeva & Bagues (2015). On grants, see Li (2017). On loans, see Engelberg, Gao & Parsons (2012). On publications, see Brogaard, Engelberg & Parsons (2014), Colussi (2017), Laband & Piette (1994).

Footnote: Favor exchange within a group might increase the group's welfare to the detriment of society, see Bramoullé & Goyal (2016). In this paper, we focus on the immediate negative implications of favoritism.

We develop a new method to identify why connections matter, building on earlier work on discrimination. Our method addresses these limitations. It does not rely on measures of true quality. Rather, it exploits classical data collected at time of promotion: information on candidates and whether they were promoted. It allows researchers to estimate the magnitudes of the two effects. The method does require exogenous shocks on connections. This is, in any case, a precondition of any study of the reasons behind the effect of connections.

The method is indirect and looks for revealing signs of information and favors on the relation between candidates' observables and promotion. Consider candidates applying for promotion. They are evaluated by a jury and some candidates are connected to jury members. When connections convey information, the jury has an extra signal on connected candidates' ability. This signal is unobserved by the econometrician and could be positive or negative. To the econometrician, then, the promotion decision looks more random for connected candidates. We show how the strength of the information channel can be recovered, under appropriate assumptions, from this excess variance in the latent error of connected candidates.
To recover favors, then, we estimate and compare the promotion thresholds faced by connected and unconnected candidates. Favors lead to systematic biases in evaluation and the difference between promotion thresholds measures the magnitude of the underlying favors.

Our econometric framework is based on normality assumptions. We make use of probit models with heteroscedasticity to detect and estimate excess variance. We clarify the conditions under which favors and information are identified. Identification fails to hold if the effects depend in an arbitrary way on candidates' observables (Proposition 1). Identification holds, however, under slight restrictions on this dependence, for instance in the presence of an exclusion restriction or under linearity assumptions (Theorem 1).

We then bring our method to data. We reanalyze the data on academic promotions in [...]

Footnote: In a context of grant applications, Li (2016) develops a new method to recover the respective strengths of favors and information. Her method relies on measures of true quality and on jury evaluations.

Footnote: A similar idea underlies Theorem 4 in Lu (2016); we discuss this relation in more detail below.

II A simple model

In this Section, we introduce a simple model to explain and illustrate our identification strategy. We develop our general model and derive formal identification results in Section III.

Footnote: This model is similar to models analyzed in Heckman & Siegelman (1993, Appendix 5.D), Neumark (2012) and Zinovyeva & Bagues (2015, Section I).

[...] These grades may be affected by connections to jury members, as described below. Let $a_e$ be the exam-specific promotion threshold: a candidate is promoted iff her grade is higher than or equal to $a_e$. This threshold may notably depend on the number of candidates applying for promotion in that wave and discipline.

We assume that candidate $i$'s true ability $a_i$ can be decomposed in three parts:

$$a_i = x_i\beta + u_i + v_i \tag{1}$$

where $x_i \in \mathbb{R}^m$ denotes a vector of $m$ characteristics observed by the econometrician and the jury, $u_i$ is unobserved by both the econometrician and the jury, and $v_i$ is observed by the jury but not the econometrician. In our empirical application, $x_i$ includes number of publications, age and gender; $u_i$ could capture creativity and $v_i$ the performance at the exam. Without loss of generality, we assume that $E(u_i|x_i) = E(v_i|x_i) = 0$. Thus, $u_i$ and $v_i$ represent parts of unobserved characteristics that cannot be explained by observables. Assume further that $E(u_i|v_i) = 0$ and that unobservables are normally distributed: $u_i \sim N(0, \sigma_u^2)$ and $v_i \sim N(0, \sigma_v^2)$. Denote by $\Phi$ the cumulative distribution function of a normal variable with mean 0 and variance 1.

Consider an unconnected candidate first. We assume that her grade is equal to the jury's expectation of her ability $E(a_i|x_i, v_i) = x_i\beta + v_i$. Thus, unconnected candidate $i$ is promoted iff $x_i\beta + v_i \geq a_e$. From the econometrician's point of view, the probability that an unconnected candidate with characteristics $x_i$ is promoted is equal to:

$$p_u(y_i = 1|x_i) = \Phi\left(\frac{x_i\beta - a_e}{\sigma_v}\right) \tag{2}$$

where $y_i = 1$ if candidate $i$ obtains the promotion and 0 otherwise.
Footnote: We develop our approach under the assumption that the econometrician does not have data on jury evaluations.

Footnote: If $E(u_i|x_i) \neq 0$, define $\hat{u}_i = u_i - E(u_i|x_i)$ and similarly for $\hat{v}_i$. Note that $E(\hat{u}_i|x_i) = 0$ while $E(u_i|x_i)$ is a function of $x_i$. Under linearity, this yields $a_i = x_i\hat{\beta} + \hat{u}_i + \hat{v}_i$, which is then equivalent to equation (1).

[...] $\theta_i = u_i + \varepsilon_i$ where $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, and updates his belief on the candidate's ability based on this additional information. On the other hand, the jury may want to favor the connected candidate. We assume that favors take the shape of a grade premium $B$ due to connections.

A connected candidate's grade is thus equal to her expected ability $E(a_i|x_i, v_i, \theta_i) = x_i\beta + E(u_i|\theta_i) + v_i$ plus the bias from favors $B$. Since $E(u_i|\theta_i) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}\theta_i$, connected candidate $i$ is hired iff $x_i\beta + \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}\theta_i + v_i + B \geq a_e$. From the econometrician's point of view, the signal $\theta_i$ enters in the latent error and generates extra variance on the jury's decision. Variance of the latent error is now equal to $\sigma_v^2 + \frac{\sigma_u^4}{\sigma_u^2 + \sigma_\varepsilon^2}$. Let $\sigma^2 = 1 + \frac{\sigma_u^4}{\sigma_v^2(\sigma_u^2 + \sigma_\varepsilon^2)} > 1$. The probability that a connected candidate with characteristics $x_i$ is promoted is then equal to:

$$p_c(y_i = 1|x_i) = \Phi\left(\frac{x_i\beta + B - a_e}{\sigma\sigma_v}\right) \tag{3}$$

Comparing equations (2) and (3), we see that information and favors have different impacts on the probability to be promoted. When a jury has better information on connected candidates, this reduces the magnitude of the impact of observable characteristics on the likelihood to be promoted. By contrast, favors lead to a shift in the effective promotion threshold, from $a_e$ to $a_e - B$, leaving the impact of observables unchanged.

We illustrate these effects in Figure 1. The solid black curve depicts $p_u(y_i = 1|x_i)$, the probability that an unconnected candidate is promoted as function of observed ability. The dashed curve depicts the probability that a connected candidate is promoted when information effects only are present. Note that the whole curve is less steep.
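The contrast between equations (2) and (3) can be checked numerically. The sketch below is illustrative only: the function names and all parameter values ($\sigma_u$, $\sigma_v$, $\sigma_\varepsilon$, $B$, $a_e$) are assumptions chosen for the example, not estimates from the paper.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def excess_sd(sigma_u, sigma_v, sigma_eps):
    """sigma in eq. (3): sigma^2 = 1 + sigma_u^4 / (sigma_v^2 (sigma_u^2 + sigma_eps^2))."""
    return sqrt(1.0 + sigma_u**4 / (sigma_v**2 * (sigma_u**2 + sigma_eps**2)))


def p_unconnected(xb, a_e, sigma_v):
    """Equation (2): promotion probability of an unconnected candidate."""
    return PHI((xb - a_e) / sigma_v)


def p_connected(xb, a_e, sigma_v, sigma, bias):
    """Equation (3): promotion probability of a connected candidate."""
    return PHI((xb + bias - a_e) / (sigma * sigma_v))


# Hypothetical values, chosen only for illustration.
A_E, SIGMA_V, BIAS = 0.0, 1.0, 0.5
SIGMA = excess_sd(sigma_u=1.0, sigma_v=SIGMA_V, sigma_eps=1.0)

for xb in (-1.0, 0.0, 1.0):
    base = p_unconnected(xb, A_E, SIGMA_V)
    info = p_connected(xb, A_E, SIGMA_V, SIGMA, bias=0.0)  # information only
    favor = p_connected(xb, A_E, SIGMA_V, 1.0, bias=BIAS)  # favors only
    print(f"x*beta={xb:+.1f}  unconnected={base:.3f}  info={info:.3f}  favors={favor:.3f}")
```

With information only, the connected probability lies above the unconnected one below the threshold and below it above the threshold; with favors only, it lies above everywhere.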
The observed probability to be promoted varies less with observed ability. Formally, an increase in $\sigma$ leads to a second-order stochastic dominance shift of the whole curve. This also implies that the apparent impact of connections is negative for very good candidates for which $x_i\beta \geq a_e$. This apparent negative impact is due to an asymmetry in the effects of good and bad news on candidates' unobservables. While good news does not improve already good chances by much, bad news significantly reduces the chances of good candidates. For the econometrician, connections then reduce the observed probability to be promoted for very good candidates.

The short-dashed curve depicts the probability that a connected candidate is promoted when only favors are present. The curve is now translated to the left, inducing a first-order stochastic dominance shift. The shape of the whole curve is preserved. The apparent impact of connections is now positive for all candidates. Finally, the grey curve depicts $p_c(y_i = 1|x_i)$ when both effects are present.

Figure 1: Effects of a connection. [Figure: probability of promotion as a function of observed ability $x$, with the thresholds $a_e - B$ and $a_e$ marked on the horizontal axis. Curves: Unconnected; Connected: information; Connected: favors; Connected: information + favors.]

Both effects can thus be identified from data on promotion. Differences in the impacts of observables between connected and unconnected can be used to recover information effects. Differences in estimated promotion thresholds between connected and unconnected can then be used to recover favors. From an econometric point of view, the differential information that the jury has on connected candidates generates a form of heteroscedasticity.

Footnote: Formally, identification in this model holds under the standard assumption that $\sigma_v = 1$ and is a direct consequence of Theorem 1 below.

III Identification

We now develop our general framework. We maintain the assumption that connections are random and extend the simple model in three directions. We incorporate baseline heteroscedasticity, varying information and varying favors. Information and favors may notably depend on the number and types of connections of a candidate to the jury. In line with the empirical application, we consider two types here, strong and weak ties; the framework and results easily extend to a finite number of types. Denote by $n_{iS}$ and $n_{iW}$ the number of strong and weak ties that candidate $i$ has to the jury.

We first assume that the variance of $v_i$ may depend on $i$'s observables $x_i$. Thus, $v_i \sim N(0, \sigma_v^2(x_i))$. In the empirical analysis, we adopt standard assumptions regarding heteroscedasticity in probit regressions, see Section IV. To state our identification results below, we only require that such baseline heteroscedasticity does not raise identification problems in classical probit estimations. More precisely, consider unconnected candidates. We have: $p(y_i = 1|n_{iS} = n_{iW} = 0, x_i) = \Phi[(x_i\beta - a_e)/\sigma_v(x_i)]$. We assume that $\beta$ and $\sigma_v(\cdot)$ are identified from the sample of unconnected candidates.

Second, we assume that the private signal received by the jury on a connected candidate may depend on the candidate's number and types of connections and on his other observable characteristics. Denote by $\sigma \geq 1$ the resulting excess variance: $\sigma = \sigma(n_{iS}, n_{iW}, x_i)$ where, by assumption, $\sigma(0, 0, x_i) = 1$. Third, the bias from favors $B$ may likewise depend on connections and observables: $B = B(n_{iS}, n_{iW}, x_i)$, with $B(0, 0, x_i) = 0$. While we generally expect both $\sigma$ and $B$ to be increasing in the number of connections, we do not impose it in what follows.

Footnote: As is well known, a probit model with coefficients $(\beta, a_e)$ and variance $\sigma_v^2(x_i)$ cannot be distinguished from one with coefficients $(\lambda\beta, \lambda a_e)$ and variance $\lambda^2\sigma_v^2(x_i)$. We therefore adopt the classical normalization assumption that $\sigma_v(0) = 1$ in our econometric specifications.
This yields the following probability to be hired, conditional on connections and observables:

$$p(y_i = 1|n_{iS}, n_{iW}, x_i) = \Phi\left[\frac{x_i\beta + B(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\,\sigma(n_{iS}, n_{iW}, x_i)}\right] \tag{4}$$

The simple model presented in Section II is a particular case with $\sigma_v(x_i) = \sigma_v$, and $B(n_{iS}, n_{iW}, x_i) = B$ and $\sigma(n_{iS}, n_{iW}, x_i) = \sigma$ as soon as $n_{iS} + n_{iW} \geq 1$. [...] $B(\cdot)$ and $\sigma(\cdot)$. If the bias $B$ varies with observable $x_i^k$ in a direction opposite from the direct effect $\beta_k$, this leads to an apparent reduction in the impact of $x_i^k$ on the likelihood to be promoted for connected candidates. If this happens on all observables and without further restrictions, it prevents the identification of the information effect. We next state this negative result and derive a formal proof in the Appendix.

Proposition 1

Consider model (4). Suppose that the precision of the signals conveyed by connections and the bias from favors depend in an arbitrary way on connections and on other observable characteristics of candidates. Then, favors and information effects cannot be identified from data on promotion only.
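A small numerical sketch may help see why identification fails here. It compares two hypothetical stories for connected candidates under model (4) with $\sigma_v = 1$ and a single scalar observable: pure information ($\sigma = 2$, no bias) versus no information ($\sigma = 1$) with a bias that moves against the direct effect of $x$. The parameter values and function names are assumptions made for illustration and are not taken from the paper or its Appendix proof.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

BETA, A_E = 1.0, 0.5  # hypothetical coefficient and promotion threshold


def p_promoted(x, sigma, bias):
    """Model (4) with sigma_v = 1 and a scalar observable x; `bias` is a function of x."""
    return PHI((x * BETA + bias(x) - A_E) / sigma)


def story_information(x):
    """Connections only convey information: sigma = 2, B = 0."""
    return p_promoted(x, sigma=2.0, bias=lambda _: 0.0)


def story_favors(x):
    """Connections only generate a bias that falls with x: sigma = 1, B(x) = -(x*beta - a_e)/2."""
    return p_promoted(x, sigma=1.0, bias=lambda z: -0.5 * (z * BETA - A_E))


grid = [i / 4 for i in range(-8, 17)]
max_gap = max(abs(story_information(x) - story_favors(x)) for x in grid)
print(f"largest difference in promotion probabilities: {max_gap:.2e}")
```

Both stories give the same probit index $(x\beta - a_e)/2$, so promotion data alone cannot tell them apart in this example.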

We now derive our main result. We show that identification holds under mild restrictions on bias and excess variance. We consider two types of restrictions: exclusion restrictions and parametric assumptions.

Theorem 1

Consider model (4).

(Exclusion restriction). Suppose that characteristic $k$ leaves $\sigma$ and $B$ unaffected and that $\beta_k \neq 0$. Then, the model is identified and the functions $\sigma(n_{iS}, n_{iW}, x_i^{-k})$ and $B(n_{iS}, n_{iW}, x_i^{-k})$ are non-parametrically identified.

(Linearity). Suppose that $\ln(\sigma(n_{iS}, n_{iW}, x_i)) = \delta(n_{iS}, n_{iW})x_i$ and $B(n_{iS}, n_{iW}, x_i) = \gamma_0(n_{iS}, n_{iW}) + \gamma_1(n_{iS}, n_{iW})x_i$, with $\gamma_0(0, 0) = 0$ and $\delta(0, 0) = \gamma_1(0, 0) = 0$. Then, the model is identified and the functions $\delta(n_{iS}, n_{iW})$, $\gamma_0(n_{iS}, n_{iW})$ and $\gamma_1(n_{iS}, n_{iW})$ are non-parametrically identified.

To see why the first part of Theorem 1 holds, suppose that $\sigma$ and $B$ do not depend on $x_i^k$. From data on the unconnected, we can recover $\beta_k$, the direct effect of $x_i^k$ on grade, and $\sigma_v(\cdot)$. Focus, then, on candidates with number of connections $n_{iS}$ and $n_{iW}$ and with other characteristics $x_i^{-k}$. From data on these candidates, we can recover the heteroscedasticity-corrected impact of $x_i^k$ on grade, equal to $\beta_k/\sigma(n_{iS}, n_{iW}, x_i^{-k})$. If $\beta_k \neq 0$, we obtain $\sigma(n_{iS}, n_{iW}, x_i^{-k})$. The bias $B(n_{iS}, n_{iW}, x_i^{-k})$ can then be obtained as the difference in inferred promotion thresholds between unconnected candidates and candidates with ties $n_{iS}$ and $n_{iW}$ and characteristics $x_i^{-k}$.

Therefore, our identification strategy operates as long as one exclusion restriction is present in the model. As with instrumental variables, the excluded variable should have a direct impact on the unconnected likelihood to be promoted and should not directly affect the precision of the signals conveyed by connections nor the bias from favors they may generate. In particular, a model where excess variance and bias from favors depend on connections but not on other observables is identified. We estimate several variants of such models in the empirical analysis below.

Even without exclusion restrictions, the model can still be identified thanks to functional form assumptions. The second part of Theorem 1, proved in the Appendix, shows that this notably holds when excess variance is log linear in observables while bias is an affine function of observables. In this case, again, dependence on connections can be arbitrary and is fully identified. To achieve non-parametric identification in practice may of course require a very large number of observations.
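The exclusion-restriction argument can be sketched as a two-step calculation. The example below assumes the simplest case: a single excluded observable, $\sigma_v = 1$, and probit coefficients already estimated separately on unconnected candidates and on candidates with a given connection profile. The function name and the numerical values are hypothetical.

```python
def recover_sigma_and_bias(slope_u, intercept_u, slope_c, intercept_c):
    """Back out excess variance sigma and bias B from two sets of probit coefficients.

    Unconnected candidates: slope_u = beta_k,          intercept_u = -a_e.
    Connected candidates:   slope_c = beta_k / sigma,  intercept_c = (B - a_e) / sigma.
    """
    if slope_u == 0 or slope_c == 0:
        raise ValueError("the excluded characteristic must have a non-zero effect")
    sigma = slope_u / slope_c                   # ratio of slopes gives sigma
    bias = sigma * intercept_c - intercept_u    # difference in promotion thresholds
    return sigma, bias


# Hypothetical "true" parameters used to build the coefficients.
beta_k, a_e, sigma_true, bias_true = 0.8, 1.0, 1.25, 0.3
sigma_hat, bias_hat = recover_sigma_and_bias(
    slope_u=beta_k,
    intercept_u=-a_e,
    slope_c=beta_k / sigma_true,
    intercept_c=(bias_true - a_e) / sigma_true,
)
print(sigma_hat, bias_hat)
```

In practice the two sets of coefficients would be estimated jointly with sampling error; the sketch only mirrors the logic of the identification argument.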
In the empirical analysis below, we adopt standard parametric assumptions on the way $\ln(\sigma)$ and $B$ vary with connections and observables. All models estimated in Section VI are covered by Theorem 1.

IV Data

We apply our framework to the data on academic promotions in Spain assembled and studied by Zinovyeva & Bagues (2015). We describe the main features of the data here and refer to their study for details. From 2002 to 2006, academics in Spain seeking promotion to Associate Professor (profesor titular) or Full Professor (catedrático) first had to qualify in a national exam (habilitación). All candidates in the same discipline in a given wave were evaluated by a common jury composed of 7 members. The jury had to allocate a predetermined number of positions. These exams were highly competitive and obtaining the national qualification essentially ensured promotion. A central feature of this system was that jury members were picked at random from a pool of eligible evaluators. The random draw was actually carried out by Ministry officials using urns and balls. The data contain information on all candidates to academic promotion during that period, their connections to eligible evaluators and to jury members, and their success or failure in the national exam.

Overall, there are 31,243 applications to 967 exams: 17,799 applications to 465 exams for Associate Professor (AP) positions and 13,444 to 502 exams for Full Professor (FP) positions. We have information on candidates' demographics and academic outcomes at time of application. Observable characteristics include gender, age, whether the candidate obtained his PhD in Spain, the number of publications, the number of publications weighted by journal quality, the number of PhD students supervised, the number of PhD committees of which the candidate had been a member, and the number of previous attempts at promotion. Table 1 provides descriptive statistics. Standards regarding research outputs may of course differ between disciplines. To analyze applications in a common framework, we follow Zinovyeva & Bagues (2015) and normalize research indicators to have mean 0 and variance 1 within exams. The data also contain information on six types of links between candidates and evaluators. We adopt Zinovyeva & Bagues (2015)'s classification of these links in strong and weak ties. A candidate is said to have strong ties to his PhD advisor, to his coauthors and to his colleagues. He has weak ties with members of his PhD committee, with members of the PhD committees of his PhD students and with other members of the PhD committees of which he was a member. Overall, 34.8% of candidates end up having at least one strong connection with a member of their jury and 20.6% have at least one weak connection. Table 2 provides further information on connections.

Footnote: We also normalize age and past experience to have mean 0 within exams.

Footnote: The data also contain information on indirect connections, for instance when a candidate and an evaluator have a common member on their PhD committees. Zinovyeva & Bagues (2015) do not find any effect of indirect connections and we do not include them in our analysis.

Footnote: A connection which is both strong and weak is classified as strong.

Table 1: Descriptive statistics

| | All | AP | FP | Eng. | H&L | Sci. | Soc. Sci. |
|---|---|---|---|---|---|---|---|
| Female | 0.34 (0.47) | 0.40 (0.49) | 0.27 (0.44) | 0.21 (0.41) | 0.45 (0.50) | 0.30 (0.46) | 0.39 (0.49) |
| Age | 41.21 (7.59) | 37.49 (6.41) | 46.14 (6.06) | 38.74 (7.07) | 41.86 (7.62) | 41.97 (7.55) | 40.39 (7.54) |
| PhD in Spain | 0.78 (0.42) | 0.83 (0.37) | 0.70 (0.46) | 0.83 (0.38) | 0.77 (0.42) | 0.76 (0.43) | 0.78 (0.42) |
| Past Experience | 0.81 (1.27) | 0.73 (1.27) | 0.91 (1.26) | 0.85 (1.36) | 0.63 (0.94) | 0.89 (1.40) | 0.88 (1.30) |
| Publications | 12.84 (18.31) | 8.12 (14.06) | 19.09 (21.18) | 7.76 (12.88) | 11.45 (11.39) | 16.99 (24.10) | 9.22 (11.61) |
| AIS | 0.72 (0.53) | 0.70 (0.57) | 0.74 (0.48) | 0.52 (0.37) | - | 0.80 (0.51) | 0.62 (0.75) |
| PhD Students | 1.00 (2.11) | 0.24 (0.88) | 2.00 (2.75) | 0.83 (1.61) | 0.61 (1.63) | 1.45 (2.60) | 0.66 (1.58) |
| PhD Committees | 3.61 (6.76) | 0.88 (2.55) | 7.23 (8.65) | 2.40 (4.42) | 3.04 (5.99) | 4.81 (8.21) | 2.67 (4.99) |
| Observations | 31243 | 17799 | 13444 | 4783 | 9005 | 12858 | 4597 |

Notes: Average values of the observable characteristics at the time of exam. Standard deviation in parentheses. FP and AP stand for exams for Full Professor and Associate Professor positions respectively. Eng., H&L, Sci., and Soc. Sci. are abbreviations for Engineering, Humanities and Law, Sciences, and Social Sciences, which are 4 broad scientific areas in our sample. AIS is the sum of international publications weighted by corresponding Article Influence Scores. The table partially replicates Table 2 in Zinovyeva & Bagues (2015).

Table 2: Connections to the jury

| | All | AP | FP | Eng. | H&L | Sci. | Soc. Sci. |
|---|---|---|---|---|---|---|---|
| Strong connections | 31.71 | 29.08 | 35.18 | 37.78 | 27.65 | 31.13 | 34.94 |
| Advisor | 3.17 | 2.97 | 3.43 | 4.60 | 3.29 | 2.43 | 3.50 |
| Coauthor | 5.44 | 3.26 | 8.32 | 6.10 | 2.84 | 7.44 | 4.24 |
| Colleague | 29.71 | 27.74 | 32.31 | 36.02 | 26.15 | 28.50 | 33.46 |
| Weak connections | 18.79 | 7.33 | 33.97 | 17.06 | 23.63 | 16.43 | 17.71 |
| PhD committee member | 7.08 | 5.31 | 9.43 | 8.05 | 10.22 | 4.39 | 7.48 |
| PhD committee of his PhD student | 4.45 | 0.69 | 9.42 | 4.70 | 4.81 | 4.27 | 3.96 |
| Same PhD committee member | 11.65 | 1.82 | 24.66 | 8.84 | 13.90 | 11.59 | 10.31 |

Notes: The percentage of candidates with at least one connection to the jury. The table partially replicates Table 3 in Zinovyeva & Bagues (2015).
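The within-exam normalization of research indicators described in this Section can be sketched as follows. The record layout (an `exam` identifier and a `publications` field) and the use of the population standard deviation are assumptions made for the illustration; the source does not specify either.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_within_exam(records, key):
    """Return z-scores of `key`, computed separately within each exam."""
    by_exam = defaultdict(list)
    for rec in records:
        by_exam[rec["exam"]].append(rec[key])
    # Mean and (population) standard deviation of the indicator in each exam.
    stats = {exam: (mean(vals), pstdev(vals)) for exam, vals in by_exam.items()}
    scores = []
    for rec in records:
        mu, sd = stats[rec["exam"]]
        # If everyone in the exam has the same value, the z-score is set to 0.
        scores.append((rec[key] - mu) / sd if sd > 0 else 0.0)
    return scores


# Hypothetical applications to two exams.
applications = [
    {"exam": "A", "publications": 2},
    {"exam": "A", "publications": 4},
    {"exam": "A", "publications": 6},
    {"exam": "B", "publications": 10},
    {"exam": "B", "publications": 30},
]
z_scores = normalize_within_exam(applications, "publications")
print([round(z, 3) for z in z_scores])
```

After this step, a candidate's indicator is measured relative to the other candidates in the same exam, which makes applications from different disciplines comparable.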

V Empirical Implementation

We now apply our identification strategy to the data on academic promotions in Spain. We discuss three key features of the empirical implementation: the random assignment of evaluators; the exam-specific promotion thresholds; and the specific models being estimated.

A Random assignment of jury members

Our identification result, Theorem 1, relies on the assumption that the distribution of unobservables for candidates with connections $(n_{iS}, n_{iW})$ does not depend on $(n_{iS}, n_{iW})$. In the data, random assignment of jury members ensures that this holds conditionally on the expected number of connections to the jury. That is, candidates may vary in the extent of their connections to eligible evaluators. From the number of eligible evaluators and the numbers of weak and strong ties to eligible evaluators, we can simply compute the expected number of actual connections of the candidate to the jury. Conditional on these expected numbers, actual numbers of connections are random. We present the corresponding balance tests in Table 3. Controlling for candidates' expected numbers of connections, we do not find significant correlations between observable characteristics and actual number of connections. Therefore, a conditional version of Theorem 1 holds in this context. The probability to be promoted for unconnected candidates $p(y_i = 1|n_{iS} = n_{iW} = 0, En_{iS}, En_{iW}, x_i)$, excess variance due to better information $\sigma(n_{iS}, n_{iW}, En_{iS}, En_{iW}, x_i)$ and bias from favors $B(n_{iS}, n_{iW}, En_{iS}, En_{iW}, x_i)$ may depend on the expected numbers of connections to the jury. Under the assumptions underlying Theorem 1, the conditional information and favor effects are identified. Note that the expected numbers of connections represent measures of social capital, built from information available to the jury. In the empirical analysis we therefore simply include them in the set of candidates' characteristics observable to the jury.

Footnote: To be consistent with our main regressions, we run balance tests conditioning directly on the expected numbers of connections. By contrast, Zinovyeva & Bagues (2015) control for expected connections through an extensive set of dummies, see Table 4 p.278. Incorporating these dummies raises computational issues in our non-linear setup. Results from Table 1 show that even in a simple linear formulation, actual connections are uncorrelated with observable characteristics.

Table 3: Balance tests

[Most table entries are lost in the source. Columns: AIS, Publications, PhD students, PhD committees, Past experience. Rows: Strong and Weak connections to the jury. Upper panel: without controls for the expected number of connections. Lower panel: including controls for the expected number of connections. Surviving entries in the upper panel, Strong row: AIS 0.009, Publications 0.010.]

Notes: Results of 10 regressions of observables (columns) on the number of strong and weak connections to the jury (rows). In regressions in the upper panel we do not control for the expected number of connections. Regressions in the lower panel include controls for the expected number of strong connections to the jury and the expected number of weak connections to the jury. OLS estimates. Standard errors clustered at the exam level are in parentheses. ∗ p < [...], ∗∗ p < [...], ∗∗∗ p < [...].

B Exam-specific promotion thresholds

Our approach relies on exam-specific promotion thresholds. This is an important element since the bias from favors is identified from differences in promotion thresholds between connected and unconnected candidates. We consider two ways to account for exam-specific thresholds empirically: exam fixed effects $a_e$ and exam grouped effects $a_e = z_e a$, where $z_e$ is a vector of exam-level characteristics. A first approach is to include a full set of exam fixed effects. In practice, regressions then include 967 exam dummies. While exam fixed effects impose, in principle, fewer restrictions, they raise several problems in practice. They may not be identified for exams with small numbers of candidates, due to full predictability. They raise computational difficulties caused by the high dimensionality of the non-linear optimization problem to be solved in the estimations. And in circumstances where grouped effects are appropriate, estimations based on fixed effects may be inefficient.

Alternatively, we consider exam grouped effects as in Bester & Hansen (2016). We allow promotion thresholds to depend on type, area and wave fixed effects (leading to 72 dummies in total) and on the number of candidates, the number of positions, the proportion of filled positions and the proportion of unconnected candidates. This model is of course nested in the model with exam fixed effects and we can then test whether it leads to a significant loss in explanatory power.

C Econometric model

In the empirical analysis, we estimate different specifications of model (4). The general model features three key ingredients: baseline heteroscedasticity $\sigma_v(x_i)$, excess variance from better information $\sigma(n_{iS}, n_{iW}, x_i)$ and bias from favors $B(n_{iS}, n_{iW}, x_i)$. Note that the first two elements are closely related, since $\sigma_v^2(x_i)\sigma^2(n_{iS}, n_{iW}, x_i)$ represents the variance of the latent error for candidates with connections $n_{iS}$, $n_{iW}$ and characteristics $x_i$.

We adopt a standard formulation for baseline heteroscedasticity, see Wooldridge (2010). We assume that the logarithm of the variance of $v_i$, the determinant of ability of unconnected candidates observed by the jury but not by the econometrician, is a linear function of observable characteristics:

$$\sigma_v(x_i) = \exp(\delta x_i) \tag{5}$$

where the constant is excluded from the $x_i$'s. To gain in statistical and computational efficiency, we do not include all characteristics in $\sigma_v$ in our preferred specification. We present our estimation procedure in the Appendix.

We model information effects by building on this heteroscedasticity formulation. We consider increasingly complex specifications: (1) constant information effects: $\sigma(n_{iS}, n_{iW}, x_i) = \exp(\delta_c)$ if $n_{iS} + n_{iW} \geq 1$; (2) information effects depending on numbers and types of links: $\sigma(n_{iS}, n_{iW}, x_i) = \exp(\delta_S n_{iS} + \delta_W n_{iW})$; and (3) information effects depending on numbers and types of links as well as other observable characteristics: $\sigma(n_{iS}, n_{iW}, x_i) = \exp[(\delta_S x_i)n_{iS} + (\delta_W x_i)n_{iW}]$. Thus, each new strong tie with the jury scales the standard deviation of the latent error by a factor $\exp(\delta_S)$ in formulation (2) and by $\exp(\delta_S x_i)$ in formulation (3). These assumptions allow us to study the determinants of the variance of the latent error in a common, coherent framework. In addition, observe that formulation (3) can be obtained as the first element of the Taylor approximation of $\ln(\sigma(n_{iS}, n_{iW}, x_i)/\sigma_v(x_i))$ with respect to $n_{iS}$, $n_{iW}$ and $x_i$, for any function $\sigma$.

We also model increasingly complex specifications of the bias from favors: (1) constant bias: $B(n_{iS}, n_{iW}, x_i) = B$ if $n_{iS} + n_{iW} \geq 1$; (2) bias depending on the numbers and types of links, linearly: $B(n_{iS}, n_{iW}, x_i) = \gamma_S n_{iS} + \gamma_W n_{iW}$, or in a quadratic way: $B(n_{iS}, n_{iW}, x_i) = \gamma_S n_{iS} + \gamma_{S2} n_{iS}^2 + \gamma_W n_{iW} + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW}$; and (3) bias depending on connections and other observables: $B(n_{iS}, n_{iW}, x_i) = (\gamma_S + \gamma_{Sx} x_i)n_{iS} + (\gamma_W + \gamma_{Wx} x_i)n_{iW} + \gamma_{S2} n_{iS}^2 + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW}$. Quadratic terms help capture decreasing marginal impacts of additional links. For instance in the quadratic variant of formulation (2), a new strong tie with the jury increases bias by $\gamma_S + \gamma_{S2}$ for an unconnected candidate and by $\gamma_S + 3\gamma_{S2}$ for a candidate who already had one strong tie.

VI Empirical Analysis

A Main results

We develop our empirical analysis in three stages. We first estimate a version of the simple model discussed in Section II, where the extent of information and favors are constant. We then account for the number and types of links, holding both effects independent of observables. Finally, we estimate a model with full dependence on links and observables.

We first examine the impact of having at least one connection of any kind to the jury. We estimate constant favors and information effects, accounting for baseline heteroscedasticity. Denote by $c_i$ the connection dummy: $c_i = 1$ if $n_{iS} + n_{iW} \geq 1$ and $c_i = 0$ otherwise. The model is:

$$p(y_i = 1 \mid x_i, c_i) = \Phi[(x_i\beta + B c_i - a_e)\exp[-(\delta x_i + \delta_c c_i)]] \tag{6}$$

We consider grouped exam effects in our main regressions, and justify this choice in Section VI.B. Results of the estimation of Model (6) are reported in Table 4.

Table 4: Binary connections: Model (6)

                           (All)       (AP)       (FP)
Bias (connected)           0.179∗∗∗    ∗∗         ∗∗
                           (0.055)     (0.084)    (0.091)
Information (connected)    0.174∗∗∗    ∗∗∗

Notes: All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. ∗, ∗∗ and ∗∗∗ denote increasing levels of statistical significance. For clarity, we do not report estimates of the impact of candidates' and exams' characteristics on promotion ($\beta$) and on baseline variance ($\delta$) in the Tables.

On the whole sample, both the estimated bias from favors $B$ and the estimated information effect $\delta_c$ are positive and statistically significant. They are also both positive and significant when estimated on promotions to Associate Professor. By contrast, we detect favors but no information effect on promotions to Full Professor. Thus, connected candidates appear to face lower promotion thresholds at both levels, and connected candidates to Associate Professor have excess variance in their latent errors. In other words, observable characteristics have lower power to explain promotion decisions in their case.

Are these effects quantitatively significant? How much do connections help? And how much does each motive contribute to the overall impact? To answer these questions, we compute for each candidate the predicted impact of a change in his connection status. We focus, for clarity, on unconnected candidates with at least one link to potential evaluators. These computations could easily be replicated on other subsamples. Consider, then, unconnected candidate $i$. Model (6) can be used to predict how much $i$'s probability to be promoted would change if $i$ became connected.
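The promotion probability in Model (6) can be sketched as follows. This is a minimal illustration rather than the authors' estimation code: the function names are hypothetical, the inputs `xb` and `xd` stand for the linear indices $x_i\beta$ and $\delta x_i$, and the values chosen for `xb`, `xd` and `a_e` are invented. Only the two coefficients 0.179 and 0.174 are taken from Table 4 (whole sample).

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def promotion_prob(xb, xd, c, a_e, bias, delta_c):
    """Model (6): Phi[(x*beta + B*c - a_e) * exp(-(delta*x + delta_c*c))].

    xb is the index x_i*beta, xd is the index delta*x_i, c is the
    connection dummy (0 or 1) and a_e is the exam threshold.
    """
    index = xb + bias * c - a_e
    scale = math.exp(-(xd + delta_c * c))
    return norm_cdf(index * scale)


B_HAT, DELTA_C_HAT = 0.179, 0.174   # Table 4, whole sample
xb, xd, a_e = -1.0, 0.0, 0.4        # hypothetical candidate and exam

p_unconnected = promotion_prob(xb, xd, 0, a_e, B_HAT, DELTA_C_HAT)
p_connected = promotion_prob(xb, xd, 1, a_e, B_HAT, DELTA_C_HAT)
print(p_unconnected, p_connected)
```

For a candidate below the threshold, both channels push in the same direction: the bias raises the index, and the larger latent variance shrinks a negative index toward zero.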
Denote estimated coefficients with hats. The difference in predicted promotion probabilities is equal to:

$$\Delta p_i/\Delta c_i = p(y_i = 1 \mid x_i, c_i = 1) - p(y_i = 1 \mid x_i, c_i = 0)$$

$$\Delta p_i/\Delta c_i = \Phi[(x_i\hat\beta + \hat B - \hat a_e)\exp[-(\hat\delta x_i + \hat\delta_c)]] - \Phi[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]]$$

We assume that the exam's promotion threshold $a_e$ is not affected by the change in connection status of candidate $i$. We can further decompose the overall impact of a change in connection status in two parts: one due to favors,

$$[\Delta p_i/\Delta c_i]_F = \Phi[(x_i\hat\beta + \hat B - \hat a_e)\exp[-\hat\delta x_i]] - \Phi[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]]$$

and another due to information,

$$[\Delta p_i/\Delta c_i]_I = \Phi[(x_i\hat\beta + \hat B - \hat a_e)\exp[-(\hat\delta x_i + \hat\delta_c)]] - \Phi[(x_i\hat\beta + \hat B - \hat a_e)\exp[-\hat\delta x_i]]$$

Thus, $\Delta p_i/\Delta c_i = [\Delta p_i/\Delta c_i]_F + [\Delta p_i/\Delta c_i]_I$. There are two ways to decompose the overall effect in two parts. Due to non-linearities, these two ways may not be equivalent. In practice they yield similar results, however, and we only present results from the decomposition described in the text. Finally, we compute the averages of these values over all individuals in the sample.

We depict the results of these counterfactual computations in Table 5 and Figure 2. The Table reports averages of initial predicted probability (first column), the average predicted change in promotion probability due to connection (second column), the part of this change due to information (third column) and the part due to favors (fourth column). Thus, an unconnected candidate with some link to potential evaluators only has, on average, a 0.08 chance to be promoted, reflecting the highly competitive nature of these promotions. Getting, by luck, connected to the jury leads to a relative increase in the promotion probability of 80%. This relative impact is higher for candidates at the Associate Professor level (+91%) than for candidates at the Full Professor level (+76%). To compute the impact of connectedness for a subsample, we rely on estimates of Model (6) for this subsample as presented in Table 4. The larger part of this effect is due to information for AP candidates (63% of the total impact). By contrast, favors are the main determinant of this impact for FP candidates (71% of the total impact). Overall, these numbers provide a quantitative picture of the impact of connections. Getting connected to the jury almost doubles the chances to obtain the promotion. Consistently with the estimation results, favors appear to dominate for FP candidates while the information effect dominates for AP candidates.

Table 5

        Baseline     Marginal effect
        Predicted    Total      Information   Bias
All     0.080        0.064      0.035         0.029
        (0.069)      (0.023)    (0.008)       (0.019)
AP      0.088        0.080      0.050         0.030
        (0.072)      (0.024)    (0.011)       (0.019)
FP      0.063        0.048      0.014         0.034
        (0.058)      (0.024)    (0.003)       (0.022)

Notes: Average marginal effect of being connected calculated for unconnected candidates with at least one connection to potential evaluators. Standard deviation of the effect is in parentheses.

Figure 2 then depicts how the change in predicted probability $\Delta p_i/\Delta c_i$ and its two components $[\Delta p_i/\Delta c_i]_F$ and $[\Delta p_i/\Delta c_i]_I$ vary with predicted probability $p_i = \Phi[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]]$. We see that $[\Delta p_i/\Delta c_i]_I$ has an inverted U-shape as a function of $p_i$. By contrast, $[\Delta p_i/\Delta c_i]_F$ is initially increasing over a larger range and only decreases, when it does, for high values of $p_i$. These qualitative patterns are consistent with Figure 1. In particular, and as discussed in Section III, better information on candidates appears to lower the promotion probability of candidates with very good CVs. On average for these candidates, the impact of bad news dominates the impact of good news. Overall, $\Delta p_i/\Delta c_i$ displays a clear inverted U shape for AP candidates, reaching a maximum around $p_i$ equal to 0.2, due to the key role of the information effect. By contrast, FP candidates with better observable characteristics benefit more from being connected to the jury.

We next assume that the bias from favors and the information effect may depend on the number and types of links. We estimate a model with linear bias and log-linear variance:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi[(x_i\beta + \gamma_S n_{iS} + \gamma_W n_{iW} - a_e)\exp[-(\delta x_i + \delta_S n_{iS} + \delta_W n_{iW})]] \tag{7}$$

as well as a model with quadratic bias and log-linear variance:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi[(x_i\beta + \gamma_S n_{iS} + \gamma_{S2} n_{iS}^2 + \gamma_W n_{iW} + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW} - a_e)\exp[-(\delta x_i + \delta_S n_{iS} + \delta_W n_{iW})]] \tag{8}$$

Results are reported in Table 6. In the left panel we report estimation results from Model (7). On the whole sample, the bias and information effects from strong ties are both positive and significant; they are positive but insignificant for weak ties. For Full Professor applications, we detect favors and information effects from strong ties and, in addition, favors from weak ties. For Associate Professor applications, we do not detect favors in this specification; we do detect strongly significant and positive information effects for both strong and weak ties. Note that in general, the effects of weak ties tend to be imprecisely estimated.

Figure 2: Marginal effect of being connected: Decomposition

[Figure 2 panels: All, AP, FP. Horizontal axis: predicted probability. Vertical axis: marginal effect. Series: Bias, Information, Total.]

Notes: Nonparametric fit using LOESS method. The grey region depicts 95% confidence intervals. Plots are constructed using estimated model (6) on subsamples indicated above each plot.

Table 6

                 Model (7)                       Model (8)
                 (All)     (AP)      (FP)        (All)     (AP)      (FP)
Bias
  n_S            ∗∗∗ ∗ ∗∗∗ ∗∗∗ ∗∗∗
                 (0.047)   (0.070)   (0.068)     (0.047)   (0.072)   (0.061)
  n_S^2                                          −∗∗∗      −∗∗∗      −∗∗∗
                                                 (0.008)   (0.014)   (0.006)
  n_W            − ∗∗ − ∗∗∗
                 (0.069)   (0.230)   (0.058)     (0.073)   (0.280)   (0.067)
  n_W^2                                          −∗        −         −∗∗∗
                                                 (0.014)   (0.240)   (0.010)
  n_S × n_W      −
Information
  n_S            ∗∗∗ ∗∗∗ ∗∗ ∗∗ ∗∗∗
  n_W            ∗∗∗ − ∗∗∗ −

Notes:

Estimation of Model (7) in the left panel and Model (8) in the right panel. All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. ∗, ∗∗ and ∗∗∗ denote increasing levels of statistical significance.

To sum up, strong connections to the jury lower the promotion threshold effectively faced by connected candidates. This impact is increasing in the number of strong ties at a decreasing rate. For applications to Full Professor, weak connections to the jury also lower the promotion threshold in a similar way. For applications to Associate Professor, we face a problem of statistical power caused by the relatively low number of weak ties. Both kinds of ties also appear to convey better information on candidates at the AP level. By contrast, we do not detect robust information effects at the FP level.

We next present the outcomes of counterfactual computations on the impact of connections in Table 7, based on Model (8). We now focus on unconnected candidates who have at least one strong tie and one weak tie to potential evaluators. For each such candidate, we compute the predicted promotion probability and the predicted increase in promotion probability caused by obtaining, by luck, one strong or weak connection to the jury. We also provide decompositions of these impacts into parts due to better information and to favors. We then average over all candidates in the subsample. We see that one strong tie increases the promotion probability by 74% for AP candidates and by 72% for FP candidates. By contrast, one weak tie increases the promotion probability by 51% for AP candidates and by 22% for FP candidates. Thus, strong ties have higher predicted impacts than weak ties. For FP candidates, favors dominate, quantitatively, for both weak and strong ties. For AP candidates, favors dominate for strong ties and information effects dominate for weak ties, consistently with the estimation results.

Table 7: Marginal effect of connections: Model (8)

        Baseline     Marginal effect
        Predicted    Total                Information          Bias
                     Strong     Weak      Strong     Weak      Strong     Weak
All     0.082        0.060      0.023     0.019      0.012     0.041      0.011
        (0.068)      (0.026)    (0.009)   (0.004)    (0.003)   (0.024)    (0.007)
AP      0.091        0.067      0.046     0.028      0.092     0.039      -0.046
        (0.072)      (0.025)    (0.023)   (0.006)    (0.026)   (0.023)    (0.031)
FP      0.068        0.049      0.015     0.019      -0.017    0.030      0.033
        (0.058)      (0.021)    (0.017)   (0.004)    (0.005)   (0.018)    (0.020)

Notes: Average marginal effects of strong and weak connections calculated for unconnected candidates with at least one strong connection and one weak connection to potential evaluators. Standard deviation of the effect is in parentheses.

Finally, we estimate the model with full dependence on links and observables:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi[(x_i\beta + (\gamma_S + \boldsymbol{\gamma}_S x_i) n_{iS} + (\gamma_W + \boldsymbol{\gamma}_W x_i) n_{iW} + \gamma_{S2} n_{iS}^2 + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW} - a_e)\exp[-(\delta x_i + (\boldsymbol{\delta}_S x_i) n_{iS} + (\boldsymbol{\delta}_W x_i) n_{iW})]] \tag{9}$$

We present estimation results in the Appendix, see Table A1 for AP candidates and Table A2 for FP candidates. A positive coefficient of the impact of some characteristic on bias means that favors due to connections tend to be stronger for candidates with higher values of this characteristic. Similarly, a positive coefficient on the information effect means that excess variance, and hence the quality of the extra information brought about by an additional connection, is higher for these candidates. Results are rich and complex and confirm that we can detect variations in the effects of connections. For instance, AP candidates having obtained their PhD in Spain appear to have higher information effects from both weak and strong ties and lower bias from weak ties. Results on information are consistent with the idea that having obtained a PhD abroad provides an informative signal on a candidate's ability. We present counterfactual computations obtained from Model (9) in Table 8. Comparing with Table 7, we see that predicted probabilities are quantitatively similar.

Table 8: Marginal effect of connections: Model (9)

        Baseline     Marginal effect
        Predicted    Total                Information          Bias
                     Strong     Weak      Strong     Weak      Strong     Weak
AP      0.090        0.078      0.055     0.032      0.066     0.047      -0.012
        (0.067)      (0.037)    (0.052)   (0.021)    (0.067)   (0.030)    (0.044)
FP      0.066        0.058      0.017     0.011      -0.003    0.047      0.020
        (0.056)      (0.036)    (0.027)   (0.025)    (0.039)   (0.028)    (0.026)

Notes:

The average marginal impacts of gaining one strong or weak link to the jury for unconnected candidates appear to be slightly lower under Model (9) than under Model (8). This means that unconnected candidates have, on average, observable characteristics for which connections' impacts are slightly weaker. Strong ties still have higher predicted impacts than weak ties. And the relative quantitative importance of the two factors is robust. Favors dominate for strong and weak ties at the FP level and for strong ties at the AP level. By contrast, information effects dominate for weak ties at the AP level.
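The counterfactual decomposition behind these tables can be sketched for the binary-connection Model (6). The code below is an illustrative sketch only: the function names and the candidate values are hypothetical, `xb` and `xd` stand for the indices $x_i\hat\beta$ and $\hat\delta x_i$, and only the coefficients 0.179 and 0.174 come from Table 4. It follows the decomposition used in the text, where favors are switched on first and information second.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def decompose(xb, xd, a_e, bias, delta_c):
    """Split the effect of becoming connected into favors and information."""
    baseline = norm_cdf((xb - a_e) * math.exp(-xd))
    favors_only = norm_cdf((xb + bias - a_e) * math.exp(-xd))
    connected = norm_cdf((xb + bias - a_e) * math.exp(-(xd + delta_c)))
    return {
        "baseline": baseline,
        "favors": favors_only - baseline,
        "information": connected - favors_only,
        "total": connected - baseline,
    }


B_HAT, DELTA_C_HAT = 0.179, 0.174  # Table 4, whole sample

# Hypothetical candidates: one below the threshold, one well above it.
weak_cv = decompose(xb=-1.0, xd=0.0, a_e=0.4, bias=B_HAT, delta_c=DELTA_C_HAT)
strong_cv = decompose(xb=2.0, xd=0.0, a_e=0.4, bias=B_HAT, delta_c=DELTA_C_HAT)

# Average marginal effects over the (hypothetical) sample.
sample = [weak_cv, strong_cv]
average = {key: sum(d[key] for d in sample) / len(sample) for key in weak_cv}
print(weak_cv, strong_cv, average)
```

In this sketch the information part is positive for the candidate below the threshold and negative for the candidate well above it, which mirrors the observation that better information can lower the promotion probability of candidates with very good CVs.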

B Robustness

In this section, we explore variations in the specification of two important features of the econometric model: exam-specific promotion thresholds and baseline variance. First, we contrast estimations with exam fixed effects $a_e$ and exam grouped effects $a_e = z_e a$. We compare estimation results of Model (6) under the two specifications in Table 9. The first column reports results of fixed effects estimations; the second column duplicates the results from Table 4. We see that the sign and statistical significance of both effects are similar for both specifications on the whole sample and on the subsample of FP candidates. On AP candidates, the information effect also has similar sign and significance. Bias from favors is positive and significant in the restricted model but positive and insignificant in the unrestricted model. Results from likelihood ratio tests show that we cannot reject the hypothesis that the grouped effects specification describes the data as well as the one with fixed effects, on each subsample as well as on the whole sample. We therefore consider grouped effects in our main regressions.

Table 9: Exam fixed effects vs. Exam grouped effects: Model (6)

               All                     AP                      FP
               FE         GE           FE         GE           FE         GE
Bias           0.304∗∗∗   0.179∗∗∗                ∗∗           ∗∗∗        ∗∗
               (0.088)    (0.055)      (0.078)    (0.084)      (0.097)    (0.091)
Information    0.166∗∗∗   0.174∗∗∗     ∗∗∗        ∗∗∗

Notes: The row LR reports the value of LR statistics of comparison of the restricted model (GE) and the unrestricted model (FE) in the preceding column. Heteroskedastic probit estimates. Standard errors clustered at the exam level are in parentheses. ∗, ∗∗ and ∗∗∗ denote increasing levels of statistical significance.

Second, we consider different specifications of baseline variance $\sigma_v(x_i)$. We contrast estimations under homoscedasticity, when all individual characteristics are included, and when a subset of characteristics is included, as described in the Appendix. Results are depicted in Table 10 for Model (6) and Table 11 for Model (7). We see that the sign and statistical significance of the main effects are essentially similar for the last two specifications on the whole sample and on each subsample. Likelihood ratio tests also show that we cannot reject the hypothesis that the parsimonious specification describes the data as well as the full-fledged specification, even on subsamples. By contrast, estimates of main effects differ under homoscedasticity, and the homoscedastic specification is rejected by a likelihood ratio test. This confirms the importance of properly accounting for baseline heteroscedasticity. For reasons of computational and statistical efficiency, we therefore adopt the more parsimonious heteroscedasticity specification in our main regressions.

Table 10: Robustness: Baseline heteroskedasticity: Model (6)

               All                              AP                               FP
               Hom.      Preferred  Full        Hom.      Preferred  Full        Hom.      Preferred  Full
Bias           0.419∗∗∗  ∗∗∗        ∗∗∗         ∗∗∗       ∗∗         ∗           ∗∗∗       ∗∗         ∗∗∗
               (0.058)   (0.055)    (0.064)     (0.076)   (0.084)    (0.095)     (0.096)   (0.091)    (0.072)
Information    − ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ − ∗∗∗ ∗∗∗ ∗∗∗

Notes: The row LR reports the value of LR statistics of comparison of the unrestricted model with the restricted model in the preceding column. Heteroskedastic probit estimates. Standard errors clustered at the exam level are in parentheses. ∗, ∗∗ and ∗∗∗ denote increasing levels of statistical significance.

Table 11

                       All                              AP                               FP
                       Hom.      Preferred  Full        Hom.      Preferred  Full        Hom.      Preferred  Full
Bias (strong)          0.221 ∗∗∗ ∗∗∗ ∗∗ ∗∗∗ ∗∗∗ ∗ − − − ∗ ∗∗ ∗∗
                       (0.055)   (0.069)    (0.077)     (0.174)   (0.230)    (0.228)     (0.046)   (0.058)    (0.054)
Information (strong)   0.089 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗ ∗∗∗
                       (0.031)   (0.044)    (0.042)     (0.041)   (0.055)    (0.057)     (0.044)   (0.064)    (0.066)
Information (weak)     0.089 ∗∗ ∗∗ ∗∗∗ ∗∗∗ − − ∗∗∗ ∗∗∗ ∗∗∗
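The likelihood ratio comparisons used in this section (grouped versus fixed exam effects, parsimonious versus full baseline variance) can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, the log-likelihood values and the number of restrictions are invented, and the chi-square tail probability is computed with a standard recursion for integer degrees of freedom so that the example needs only the standard library.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic: twice the gap in maximized log-likelihoods."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_sf(stat, df):
    """Upper tail probability of a chi-square variable with integer df."""
    z = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # closed form for df = 1
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0 - 1e-12:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical log-likelihoods: restricted (e.g. grouped effects) versus
# unrestricted (e.g. fixed effects), with 5 hypothetical restrictions.
stat = lr_statistic(loglik_restricted=-1520.0, loglik_unrestricted=-1517.5)
p_value = chi2_sf(stat, df=5)
print(stat, p_value)
```

A large p-value means the restricted specification cannot be rejected against the unrestricted one, which is the sense in which the grouped effects and the parsimonious variance specifications are retained above.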

In this article, we propose a new method to identify favors and information in the impact of connections, building on earlier work on discrimination. Our method combines natural experiments and semi-structural modelling. It requires exogenous shocks on connections and only exploits information collected at time of promotion. We develop an econometric framework based on probit regressions with heteroscedasticity. Our method can thus be implemented using standard statistical software. We show that better information on connected candidates yields excess variance in latent errors. Differences in estimated variances between connected and unconnected candidates can be used to identify and quantify the information effect. Differences in estimated promotion thresholds can then be used to identify the bias due to favors. We apply our method to the data assembled and studied in Zinovyeva & Bagues (2015). Our empirical results are consistent with, and help sharpen, findings obtained from data collected five years after promotion.

Our framework relies on a number of assumptions and, in particular, latent error normality, deterministic favors and jury risk-neutrality. We next discuss the robustness of our approach to relaxing these assumptions. First, we conjecture that this method can be extended to non-normal latent errors. The fact that better private information leads to excess variance is quite general, as shown by Lu (2016). It could be interesting, in future research, to try and extend this framework to logit or even non-parametric regressions.

Second, suppose that favors are stochastic. Bias from favors is the sum of a deterministic part and a stochastic part. If the stochastic part is independent of connections, our analysis and results go through without modifications. This stochastic part is simply subsumed in the latent error. Our approach must be modified, however, if the stochastic part of the bias is affected by connections. Current estimates of the information effect provide a lower bound of the true effect if bias variance decreases with connections and an upper bound if it increases with connections.

Third, consider a risk-averse jury. Risk aversion might lead the jury to promote a candidate with lower expected ability if the uncertainty on her ability is lower. In other words, the grade of candidates evaluated by a risk-averse jury may contain a risk penalty. This may invalidate the identification of favors. Note that it also invalidates the identification of favors in studies based on quality measures. For instance, Zinovyeva & Bagues (2015)'s finding that promoted candidates with strong ties publish less in the 5 years after promotion could also be explained by risk aversion. Interestingly, however, we suspect that the identification of the information effect might be robust to risk aversion. Developing empirical methods to identify risk aversion, favors and information effects provides an interesting challenge for future research.

To sum up, our method exploits variations in latent error variance and in promotion thresholds with connections. We clarify the conditions under which these variations yield identification of favors and information in the impact of connections. Even in circumstances when identification does not hold, however, these estimates may contain valuable information on why connections matter.

Finally, it would be interesting to combine our method with quality measures. This could, potentially, yield more precise estimates of favors and information effects and also allow researchers to test critical assumptions, such as whether promotion indeed has the same impact on quality for connected and unconnected candidates.

APPENDIX A

Proof of Proposition 1

A model with bias $B(.)$ and excess variance $\sigma(.)$ and an alternative model with bias $B'(.)$ and $\sigma'(.)$ yield the same conditional probability to be hired $p(y_i = 1 \mid n_{iS}, n_{iW}, x_i)$ if

$$\frac{x_i\beta + B(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\,\sigma(n_{iS}, n_{iW}, x_i)} = \frac{x_i\beta + B'(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\,\sigma'(n_{iS}, n_{iW}, x_i)}$$

Therefore, for any functions $B(.)$, $B'(.)$ and $\sigma(.)$, a model based on $B(.)$ and $\sigma(.)$ and one based on $B'(.)$ and

$$\sigma'(n_{iS}, n_{iW}, x_i) = \frac{x_i\beta + B'(n_{iS}, n_{iW}, x_i) - a_e}{x_i\beta + B(n_{iS}, n_{iW}, x_i) - a_e}\,\sigma(n_{iS}, n_{iW}, x_i)$$

have the same empirical implications. QED.

Proof of Theorem 1

Consider first the classical Probit model with heteroscedasticity:

$$p(y_i = 1 \mid x_i) = \Phi[(a + b x_i)\exp(-c x_i)]$$

Let us show that this model is identified if $(a, b) \neq (0, \mathbf{0})$. Identification holds if the mapping from parameters to the population distribution of outcomes is injective. Consider two sets of parameters $a, b, c$ and $a', b', c'$ such that $\forall x \in \mathbb{R}^k$, $\Phi[(a + bx)\exp(-cx)] = \Phi[(a' + b'x)\exp(-c'x)]$. We must show that $a = a'$, $b = b'$ and $c = c'$.

Applying $\Phi^{-1}$ yields: $\forall x$, $(a + bx)\exp(-cx) = (a' + b'x)\exp(-c'x)$. At $x = \mathbf{0}$, this yields: $a = a'$. Next, take the derivative with respect to $x_k$ and apply at $x = \mathbf{0}$. This yields $b_k - a c_k = b'_k - a c'_k$. Observe also that $b_k$ and $b'_k$ must have the same sign. Indeed if $x_l = 0$ when $l \neq k$, then $(a + bx)\exp(-cx) = (a + b_k x_k)\exp(-c_k x_k)$. As $x_k$ goes from $-\infty$ to $+\infty$, the sign of this expression can vary in one of three ways: it goes from negative to positive if $b_k > 0$; it goes from positive to negative if $b_k < 0$; or it stays constant if $b_k = 0$.

Assume first that $a \neq 0$ and $b \neq \mathbf{0}$. Consider $k$ such that $b_k \neq 0$, for instance $b_k > 0$. Set $x_l = 0$ for all $l \neq k$. For any $x_k$ large enough, $a + bx = a + b_k x_k > 0$. Taking logs yields: $\ln(a + b_k x_k) - c_k x_k = \ln(a + b'_k x_k) - c'_k x_k$. Take the derivative with respect to $x_k$: $b_k/(a + b_k x_k) - c_k = b'_k/(a + b'_k x_k) - c'_k$. Take the derivative twice more: $b_k^2/(a + b_k x_k)^2 = b_k'^2/(a + b'_k x_k)^2$ and $-b_k^3/(a + b_k x_k)^3 = -b_k'^3/(a + b'_k x_k)^3$. Since this holds for any $x_k$ large enough, this must hold for any $x_k$. At $x_k = 0$, this yields: $b_k^2 = b_k'^2$ and hence $b_k = b'_k$ and $c_k = c'_k$. If $b_k = 0$, then $b'_k = 0$ and $c_k = c'_k$.

Assume next that $b = \mathbf{0}$. Then $b' = \mathbf{0}$ and $\forall x$, $a\exp(-cx) = a\exp(-c'x)$ and hence $c = c'$. Finally, if $a = 0$ and $b_k > 0$, then for any $x_k > 0$, $\ln(b_k x_k) - c_k x_k = \ln(b'_k x_k) - c'_k x_k$ and hence $\ln(b_k) - c_k x_k = \ln(b'_k) - c'_k x_k$. This implies that $b_k = b'_k$ and $c_k = c'_k$. Thus, $b = b'$ and $cx = c'x$ for any $x$ such that $bx \neq 0$, which implies that $c = c'$. (If $a = 0$ and $b = \mathbf{0}$, then $\forall x$, $\Phi[(a + bx)\exp(-cx)] = 1/2$ and $c$ is not identified.)

Observe that injectivity and identification also hold if $x$ belongs to an open set $O$ of $\mathbb{R}^k$. The reason is that the function $x \to (a + bx)\exp(-cx)$ is analytic and that two analytic functions which are equal on an open set must be equal everywhere. Therefore, $\forall x \in O$, $\Phi[(a + bx)\exp(-cx)] = \Phi[(a' + b'x)\exp(-c'x)] \Rightarrow \forall x \in \mathbb{R}^k$, $(a + bx)\exp(-cx) = (a' + b'x)\exp(-c'x)$ and hence $a = a'$, $b = b'$ and $c = c'$.

Identification also holds with some binary characteristics. Suppose that $x_{1i} \in \{0, 1\}$ and denote by $x_{-1i} \in \mathbb{R}^{k-1}$ the vector of other characteristics. Then, $p(y_i = 1 \mid x_{1i} = 0, x_{-1i}) = \Phi[(a + b_{-1} x_{-1i})\exp(-c_{-1} x_{-1i})]$, yielding identification of $a$, $b_{-1}$ and $c_{-1}$. Next, $p(y_i = 1 \mid x_{1i} = 1, x_{-1i}) = \Phi[(a + b_1 + b_{-1} x_{-1i})\exp(-c_1 - c_{-1} x_{-1i})]$. Rewrite $\Phi^{-1}(p) = [e^{-c_1}(a + b_1) + e^{-c_1} b_{-1} x_{-1i}]\exp(-c_{-1} x_{-1i})$. Therefore, $e^{-c_1} b_{-1}$ is identified and hence $c_1$ is identified. Since $e^{-c_1}(a + b_1)$ is also identified, $b_1$ is identified.

Thus, as $n$ becomes arbitrarily large, the econometrician can obtain consistent estimates of $a$, $b$ and $c$ if observables have full rank.

Consider, next, the following model:

$$p(y_i = 1 \mid n_{iS}, n_{iW}, x_i) = \Phi[((\beta + \gamma_1(n_{iS}, n_{iW})) x_i + \gamma_0(n_{iS}, n_{iW}) - a_e)\exp[-(\delta + \delta_1(n_{iS}, n_{iW})) x_i]]$$

We apply the identification result on the Probit model with heteroscedasticity repeatedly. On unconnected candidates, we have: $p(y_i = 1 \mid n_{iS} = 0, n_{iW} = 0, x_i) = \Phi[(\beta x_i - a_e)\exp(-\delta x_i)]$ and hence $a_e$, $\beta$ and $\delta$ are identified. Similarly, for candidates with connections $n_{iS}$ and $n_{iW}$, the parameters $\gamma_0(n_{iS}, n_{iW}) - a_e$, $\beta + \gamma_1(n_{iS}, n_{iW})$ and $\delta + \delta_1(n_{iS}, n_{iW})$ are identified. Therefore, $\gamma_0(n_{iS}, n_{iW})$, $\gamma_1(n_{iS}, n_{iW})$ and $\delta_1(n_{iS}, n_{iW})$ are identified. Note that to obtain consistent estimates of $a_e$, $\beta$, $\delta$, $\gamma_0(n_{iS}, n_{iW})$, $\gamma_1(n_{iS}, n_{iW})$ and $\delta_1(n_{iS}, n_{iW})$, the number of observations within exams must become arbitrarily large and observables conditional on $(n_{iS}, n_{iW})$ must have full rank. QED.

Preferred specification for the baseline heteroscedasticity.

We first estimate model (4) on unconnected candidates, under the assumption that latent error variance is log-linear and depends on all observable characteristics. We thus estimate the following model:

$$p(y_i = 1 \mid x_i) = \Phi[(x_i\beta - a_e)\exp(-\delta x_i)]$$

on unconnected candidates. In our preferred specification for $\sigma_v$, we then include variables that are statistically significant as well as expected numbers of connections $E n_{iS}$, $E n_{iW}$. We include these expected numbers given their critical role in ensuring the exogeneity of actual connections. We exclude other variables. Our preferred specification includes the following 10 observables: expected number of strong connections, expected number of weak connections, PhD students advised, AIS, age, gender, number of candidates at the exam, share of unconnected candidates at the exam, type of exam, and the indicator if the broad area is Humanities and Law. As discussed in Section VI.B and following Davidson & MacKinnon (1984), we also test whether this restricted model indeed explains the data as well as the non-restricted model.

Additional estimation results.

Results of the estimation of Model (9) for subsamples of AP candidates and FP candidates are presented in Table A1 and Table A2 respectively.

Table A1: Estimation of Model (9): AP candidates

                  Bias                    Information
                  Strong      Weak        Strong      Weak
Const.            0.466∗∗     ∗∗          -           -
                  (0.185)     (0.245)     -           -
Strong            − ∗∗∗ − ∗∗ ∗∗∗ -0.002 − − − ∗∗ − ∗∗ ∗∗ − . ∗∗ − ∗∗∗
                  (0.021)     (0.215)     (0.041)     (0.164)
PhD in Spain      − − . ∗∗ ∗∗ ∗∗∗
                  (0.182)     (0.281)     (0.068)     (0.174)
Age               − − − − − ∗∗∗ −

Notes: Estimation of Model (9). All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. ∗, ∗∗ and ∗∗∗ denote increasing levels of statistical significance.

Table A2: Estimation of Model (9): FP candidates

                  Bias                    Information
                  Strong      Weak        Strong      Weak
Const.            0.366∗∗∗    ∗∗∗         -           -
                  (0.065)     (0.058)     -           -
Strong            − ∗∗∗ − − − − ∗∗ − ∗∗
                  (0.036)     (0.019)     (0.051)     (0.028)
PhD Committees    0.007 0.038∗ ∗∗∗ ∗∗ ∗∗∗ − − ∗∗∗
                  (0.032)     (0.034)     (0.047)     (0.037)
PhD Students      − − ∗∗ − ∗∗ − ∗∗ − ∗∗∗ − − − ∗∗∗
                  (0.006)     (0.027)     (0.025)     (0.036)
Expected strong   0.048∗∗ − ∗∗∗ − − ∗∗∗
                  (0.054)     (0.031)     (0.071)     (0.034)

Notes:

REFERENCES

Beaman, Lori and Jeremy Magruder. 2012. “Who Gets the Job Referral? Evidence from a Social Networks Experiment.” American Economic Review, 102(7): 3574-3593.

Bertrand, Marianne and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review, 94(4): 991-1013.

Bester, C. Alan and Christian B. Hansen. 2016. “Grouped Effects Estimators in Fixed Effects Models.” Journal of Econometrics,

Journal of Development Economics,

Journal of Financial Economics,

Journal of Labor Economics,

Review of Economics and Statistics, forthcoming.

Combes, Pierre-Phillipe, Linnemer, Laurent and Michael Visser. 2008. “Publish or Peer-rish? The Role of Skills and Networks in Hiring Economic Professors.” Labour Economics, 15: 423-441.

Davidson, Russell and James G. MacKinnon. 1984. “Convenient Specification Tests for Logit and Probit Models.” Journal of Econometrics, 25: 241-262.

Engelberg, Joseph, Gao, Pengjie and Christopher A. Parsons. 2012. “Friends with Money.” Journal of Financial Economics,

American Journal of Sociology,

Journal of Economic Perspectives,

Clear and Convincing Evidence: Measurement of Discrimination in America, M. Fix and R. Struyk, eds. Urban Institute.

Hensvik, Lena and Oskar Nordstrom Skans. 2016. “Social Networks, Employee Selection, and Labor Market Outcomes.” Journal of Labor Economics,

Journal of Political Economy,

American Economic Journal: Applied Economics,

Econometrica,

Journal of Human Resources,

Journal of Political Economy,