Promotion through Connections: Favors or Information?
Yann Bramoullé and Kenan Huremović*
August 2017
Abstract: Connections appear to be helpful in many contexts such as obtaining a job, a promotion, a grant, a loan or publishing a paper. This may be due to favoritism or to information conveyed by connections. Attempts at identifying both effects have relied on measures of true quality, generally built from data collected long after promotion. This empirical strategy faces important limitations. Building on earlier work on discrimination, we propose a new method to identify favors and information from classical data collected at time of promotion. Under natural assumptions, we show that promotion decisions look more random for connected candidates, due to the information channel. We obtain new identification results and show how probit models with heteroscedasticity can be used to estimate the strength of the two effects. We apply our method to the data on academic promotions in Spain studied in Zinovyeva & Bagues (2015). We find evidence of both favors and information effects at work. Empirical results are consistent with evidence obtained from quality measures collected five years after promotion.
Keywords : Promotion, Connections, Social Networks, Favoritism, Information.
JEL classification: C3, I23, M51.

* Bramoullé: Aix-Marseille University (Aix-Marseille School of Economics) & CNRS; Huremović: IMT School for Advanced Studies. We thank participants in seminars and conferences and Habiba Djebbari, Bruno Decreuse, Mark Rosenzweig, Marc Sangnier, Adam Szeidl, Natalia Zinovyeva and Russell Davidson for helpful comments and suggestions. For financial support, Yann Bramoullé thanks the European Research Council (Consolidator Grant n. 616442).

I Introduction
Connections appear to be helpful in many contexts such as obtaining a job, a promotion, a grant, a loan or publishing a paper. Two main reasons help explain these wide-ranging effects. On one hand, connections may convey information on candidates, projects and papers. Connections then help recruiters, juries and editors make better decisions. On the other hand, decision-makers may unduly favor connected candidates, leading to worse decisions. These two reasons have opposite welfare implications and empirical researchers have been trying to tease out the different forces behind connections’ impacts. Almost all existing studies do so by building measures of candidates’ “true” quality. Researchers then compare the quality of connected and unconnected promoted candidates. Information effects likely dominate if connected promoted candidates have higher quality; favors likely dominate if connected promoted candidates have lower quality. For instance, articles published in top economics and finance journals by authors connected to editors tend to receive more citations, a sign that editors use their connections to identify better papers (Brogaard, Engelberg & Parsons, 2014). By contrast, Full Professors in Spain who were connected to members of their promotion jury publish less after promotion (Zinovyeva & Bagues, 2015), consistent with favoritism.

This empirical strategy, while widely used, faces three important limitations. First, building a measure of true quality may not be easy or feasible. Looking at researchers’ publications or articles’ citations requires a long enough time lag following promotion or publication. And such measures are in any case imperfect proxies of quality. Second, identification is only valid if the impact of promotion on measured quality is the same for connected and unconnected promoted candidates, see e.g. Zinovyeva & Bagues (2015, p.283). This assumption is critical but not necessarily plausible, and can generally not be tested.
Third, connections may convey both information and favors. This empirical [...]

The literature on jobs and connections is large and expanding. Recent references include Beaman & Magruder (2012), Brown, Setren & Topa (2016), Hensvik & Skans (2016), Pallais & Sands (2016). On promotions, see Combes, Linnemer & Visser (2008), Zinovyeva & Bagues (2015). On grants, see Li (2017). On loans, see Engelberg, Gao & Parsons (2012). On publications, see Brogaard, Engelberg & Parsons (2014), Colussi (2017), Laband & Piette (1994).

Favor exchange within a group might increase the group’s welfare to the detriment of society, see Bramoullé & Goyal (2016). In this paper, we focus on the immediate negative implications of favoritism.

We develop a new method to identify why connections matter, building on earlier work on discrimination. Our method addresses these limitations. It does not rely on measures of true quality. Rather, it exploits classical data collected at time of promotion: information on candidates and whether they were promoted. It allows researchers to estimate the magnitudes of the two effects. The method does require exogenous shocks on connections. This is, in any case, a precondition of any study of the reasons behind the effect of connections.

The method is indirect and looks for revealing signs of information and favors on the relation between candidates’ observables and promotion. Consider candidates applying for promotion. They are evaluated by a jury and some candidates are connected to jury members. When connections convey information, the jury has an extra signal on connected candidates’ ability. This signal is unobserved by the econometrician and could be positive or negative. To the econometrician, then, the promotion decision looks more random for connected candidates. We show how the strength of the information channel can be recovered, under appropriate assumptions, from this excess variance in the latent error of connected candidates.
To recover favors, then, we estimate and compare the promotion thresholds faced by connected and unconnected candidates. Favors lead to systematic biases in evaluation and the difference between promotion thresholds measures the magnitude of the underlying favors.

Our econometric framework is based on normality assumptions. We make use of probit models with heteroscedasticity to detect and estimate excess variance. We clarify the conditions under which favors and information are identified. Identification fails to hold if the effects depend in an arbitrary way on candidates’ observables (Proposition 1). Identification holds, however, under slight restrictions on this dependence, for instance in the presence of an exclusion restriction or under linearity assumptions (Theorem 1).

We then bring our method to data. We reanalyze the data on academic promotions in [...]

In a context of grant applications, Li (2016) develops a new method to recover the respective strengths of favors and information. Her method relies on measures of true quality and on jury evaluations.

A similar idea underlies Theorem 4 in Lu (2016); we discuss this relation in more detail below.
II A simple model
In this Section, we introduce a simple model to explain and illustrate our identification strategy. We develop our general model and derive formal identification results in Section III.

This model is similar to models analyzed in Heckman & Siegelman (1993, Appendix 5.D), Neumark (2012) and Zinovyeva & Bagues (2015, Section I).

[...] These grades may be affected by connections to jury members, as described below. Let $a_e$ be the exam-specific promotion threshold: a candidate is promoted iff her grade is higher than or equal to $a_e$. This threshold may notably depend on the number of candidates applying for promotion in that wave and discipline.

We assume that candidate $i$’s true ability $a_i$ can be decomposed in three parts:

$$a_i = x_i \beta + u_i + v_i \tag{1}$$

where $x_i \in \mathbb{R}^m$ denotes a vector of $m$ characteristics observed by the econometrician and the jury, $u_i$ is unobserved by both the econometrician and the jury, and $v_i$ is observed by the jury but not the econometrician. In our empirical application, $x_i$ includes number of publications, age and gender; $u_i$ could capture creativity and $v_i$ the performance at the exam. Without loss of generality, we assume that $E(u_i \mid x_i) = E(v_i \mid x_i) = 0$. Thus, $u_i$ and $v_i$ represent parts of unobserved characteristics that cannot be explained by observables. Assume further that $E(u_i \mid v_i) = 0$ and that unobservables are normally distributed: $u_i \sim N(0, \sigma_u^2)$ and $v_i \sim N(0, \sigma_v^2)$. Denote by $\Phi$ the cumulative distribution function of a normal variable with mean 0 and variance 1.

Consider an unconnected candidate first. We assume that her grade is equal to the jury’s expectation of her ability $E(a_i \mid x_i, v_i) = x_i \beta + v_i$. Thus, unconnected candidate $i$ is promoted iff $x_i \beta + v_i \geq a_e$. From the econometrician’s point of view, the probability that an unconnected candidate with characteristics $x_i$ is promoted is equal to:

$$p_u(y_i = 1 \mid x_i) = \Phi\left(\frac{x_i \beta - a_e}{\sigma_v}\right) \tag{2}$$

where $y_i = 1$ if candidate $i$ obtains the promotion and 0 otherwise.
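As a quick sanity check on equation (2), the following Python sketch compares the closed-form probability with a simulated promotion frequency. The parameter values and the function names (`p_unconnected`, `simulate_unconnected`) are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def std_normal_cdf(z):
    """Phi(z): CDF of a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_unconnected(x_beta, a_e, sigma_v):
    """Equation (2): promotion probability of an unconnected candidate."""
    return std_normal_cdf((x_beta - a_e) / sigma_v)

def simulate_unconnected(x_beta, a_e, sigma_v, n_draws=200_000, seed=0):
    """Share of simulated candidates whose grade x_beta + v_i reaches a_e."""
    rng = random.Random(seed)
    promoted = sum(
        1 for _ in range(n_draws) if x_beta + rng.gauss(0.0, sigma_v) >= a_e
    )
    return promoted / n_draws

# Assumed illustrative values (not estimates from the paper).
closed_form = p_unconnected(x_beta=0.3, a_e=1.0, sigma_v=1.0)
simulated = simulate_unconnected(x_beta=0.3, a_e=1.0, sigma_v=1.0)
print(round(closed_form, 3), round(simulated, 3))
```

The two numbers should agree up to simulation noise, since $v_i$ is the only source of randomness from the econometrician's point of view.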
We develop our approach under the assumption that the econometrician does not have data on jury evaluations.

If $E(u_i \mid x_i) \neq 0$, define $\hat{u}_i = u_i - E(u_i \mid x_i)$ and similarly for $\hat{v}_i$. Note that $E(\hat{u}_i \mid x_i) = 0$ while $E(u_i \mid x_i)$ is a function of $x_i$. Under linearity, this yields $a_i = x_i \hat{\beta} + \hat{u}_i + \hat{v}_i$, which is then equivalent to equation (1).

[...] $\theta_i = u_i + \varepsilon_i$ where $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, and updates his belief on the candidate’s ability based on this additional information. On the other hand, the jury may want to favor the connected candidate. We assume that favors take the shape of a grade premium $B$ due to connections.

A connected candidate’s grade is thus equal to her expected ability $E(a_i \mid x_i, v_i, \theta_i) = x_i \beta + E(u_i \mid \theta_i) + v_i$ plus the bias from favors $B$. Since $E(u_i \mid \theta_i) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2} \theta_i$, connected candidate $i$ is hired iff $x_i \beta + \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2} \theta_i + v_i + B \geq a_e$. From the econometrician’s point of view, the signal $\theta_i$ enters in the latent error and generates extra variance on the jury’s decision. Variance of the latent error is now equal to $\sigma_v^2 + \frac{\sigma_u^4}{\sigma_u^2 + \sigma_\varepsilon^2}$. Let $\sigma^2 = 1 + \frac{\sigma_u^4}{\sigma_v^2 (\sigma_u^2 + \sigma_\varepsilon^2)} > 1$. The probability that a connected candidate with characteristics $x_i$ is promoted is then equal to:

$$p_c(y_i = 1 \mid x_i) = \Phi\left(\frac{x_i \beta + B - a_e}{\sigma \sigma_v}\right) \tag{3}$$

Comparing equations (2) and (3), we see that information and favors have different impacts on the probability to be promoted. When a jury has better information on connected candidates, this reduces the magnitude of the impact of observable characteristics on the likelihood to be promoted. By contrast, favors lead to a shift in the effective promotion threshold, from $a_e$ to $a_e - B$, leaving the impact of observables unchanged.

We illustrate these effects in Figure 1. The solid black curve depicts $p_u(y_i = 1 \mid x_i)$, the probability that an unconnected candidate is promoted as function of observed ability. The dashed curve depicts the probability that a connected candidate is promoted when information effects only are present.
Note that the whole curve is less steep. The observed probability to be promoted varies less with observed ability. Formally, an increase in $\sigma$ leads to a second-order stochastic dominance shift of the whole curve. This also implies that the apparent impact of connections is negative for very good candidates for which $x_i \beta \geq a_e$. This apparent negative impact is due to an asymmetry in the effects of good and bad news on candidates’ unobservables. While good news does not improve already good chances by much, bad news significantly reduces the chances of good candidates. For the econometrician, connections then reduce the observed probability to be promoted for very good candidates.

The short-dashed curve depicts the probability that a connected candidate is promoted when only favors are present. The curve is now translated to the left, inducing a first-order stochastic dominance shift. The shape of the whole curve is preserved. The apparent impact of connections is now positive for all candidates. Finally, the grey curve depicts $p_c(y_i = 1 \mid x_i)$ when both effects are present.

Figure 1: Effects of a connection. [Plot of the promotion probability against observed ability $x$, with the thresholds $a_e - B$ and $a_e$ marked on the horizontal axis. Curves: Unconnected; Connected: information; Connected: favors; Connected: information + favors.]

Both effects can thus be identified from data on promotion. Differences in the impacts of observables between connected and unconnected can be used to recover information effects. Differences in estimated promotion thresholds between connected and unconnected can then be used to recover favors. From an econometric point of view, the differential information that the jury has on connected candidates generates a form of heteroscedasticity.

Formally, identification in this model holds under the standard assumption that $\sigma_v = 1$ and is a direct consequence of Theorem 1 below.

III Identification
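Before generalizing, the simple model of Section II can be sketched numerically. The Python snippet below evaluates equations (2) and (3) under assumed, purely illustrative parameter values ($\sigma_u = \sigma_v = \sigma_\varepsilon = 1$, $a_e = 0$, $B = 0.5$); the function names are hypothetical and the numbers are not estimates from the paper.

```python
import math

def std_normal_cdf(z):
    """Phi(z): CDF of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def excess_sd(sigma_u, sigma_v, sigma_eps):
    """sigma, from sigma^2 = 1 + sigma_u^4 / (sigma_v^2 (sigma_u^2 + sigma_eps^2))."""
    return math.sqrt(
        1.0 + sigma_u**4 / (sigma_v**2 * (sigma_u**2 + sigma_eps**2))
    )

def p_promoted(x_beta, a_e, sigma_v, sigma=1.0, bias=0.0):
    """Equation (3); with sigma=1 and bias=0 it reduces to equation (2)."""
    return std_normal_cdf((x_beta + bias - a_e) / (sigma * sigma_v))

# Assumed illustrative values, not estimates from the paper.
A_E, SIGMA_V, BIAS = 0.0, 1.0, 0.5
SIGMA = excess_sd(sigma_u=1.0, sigma_v=SIGMA_V, sigma_eps=1.0)

curves = {}
for x_beta in (-1.0, 0.0, 1.0):
    curves[x_beta] = {
        "unconnected": p_promoted(x_beta, A_E, SIGMA_V),
        "information": p_promoted(x_beta, A_E, SIGMA_V, sigma=SIGMA),
        "favors": p_promoted(x_beta, A_E, SIGMA_V, bias=BIAS),
        "both": p_promoted(x_beta, A_E, SIGMA_V, sigma=SIGMA, bias=BIAS),
    }

for x_beta, row in curves.items():
    print(x_beta, {name: round(p, 3) for name, p in row.items()})
```

In this parameterization, information alone lowers the promotion probability of a candidate above the threshold and raises it for a candidate below, whereas favors raise it at every level of observed ability.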
We now develop our general framework. We maintain the assumption that connections are random and extend the simple model in three directions. We incorporate baseline heteroscedasticity, varying information and varying favors. Information and favors may notably depend on the number and types of connections of a candidate to the jury. In line with the empirical application, we consider two types here - strong and weak ties; the framework and results easily extend to a finite number of types. Denote by $n_{iS}$ and $n_{iW}$ the number of strong and weak ties that candidate $i$ has to the jury.

We first assume that the variance of $v_i$ may depend on $i$’s observables $x_i$. Thus, $v_i \sim N(0, \sigma_v^2(x_i))$. In the empirical analysis, we adopt standard assumptions regarding heteroscedasticity in probit regressions, see Section IV. To state our identification results below, we only require that such baseline heteroscedasticity does not raise identification problems in classical probit estimations. More precisely, consider unconnected candidates. We have: $p(y_i = 1 \mid n_{iS} = n_{iW} = 0, x_i) = \Phi[(x_i \beta - a_e)/\sigma_v(x_i)]$. We assume that $\beta$ and $\sigma_v(\cdot)$ are identified from the sample of unconnected candidates.

As is well known, a probit model with coefficients $(\beta, a_e)$ and variance $\sigma_v^2(x_i)$ cannot be distinguished from one with coefficients $(\lambda\beta, \lambda a_e)$ and variance $\lambda^2 \sigma_v^2(x_i)$. We therefore adopt the classical normalization assumption that $\sigma_v(\mathbf{0}) = 1$ in our econometric specifications.

Second, we assume that the private signal received by the jury on a connected candidate may depend on the candidate’s number and types of connections and on his other observable characteristics. Denote by $\sigma \geq 1$ the resulting excess variance: $\sigma = \sigma(n_{iS}, n_{iW}, x_i)$ where, by assumption, $\sigma(0, 0, x_i) = 1$. Third, the bias from favors may depend on the same variables: $B = B(n_{iS}, n_{iW}, x_i)$, with $B(0, 0, x_i) = 0$. While we generally expect both $\sigma$ and $B$ to be increasing in the number of connections, we do not impose it in what follows.
This yields the following probability to be hired, conditional on connections and observables:

$$p(y_i = 1 \mid n_{iS}, n_{iW}, x_i) = \Phi\left[\frac{x_i \beta + B(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\, \sigma(n_{iS}, n_{iW}, x_i)}\right] \tag{4}$$

The simple model presented in Section II is a particular case with $\sigma_v(x_i) = \sigma_v$, and $B(n_{iS}, n_{iW}, x_i) = B$ and $\sigma(n_{iS}, n_{iW}, x_i) = \sigma$ as soon as $n_{iS} + n_{iW} \geq 1$. [...] $B(\cdot)$ and $\sigma(\cdot)$. If the bias $B$ varies with observable $x_i^k$ in a direction opposite from the direct effect $\beta_k$, this leads to an apparent reduction in the impact of $x_i^k$ on the likelihood to be promoted for connected candidates. If this happens on all observables and without further restrictions, it prevents the identification of the information effect. We next state this negative result and derive a formal proof in the Appendix.

Proposition 1
Consider model (4). Suppose that the precision of the signals conveyed by connections and the bias from favors depend in an arbitrary way on connections and on other observable characteristics of candidates. Then, favors and information effects cannot be identified from data on promotion only.
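One way to see the logic of this negative result is to construct two parameterizations of model (4) that imply identical promotion probabilities for connected candidates. The Python sketch below is an illustrative construction under assumed values (a single observable, $\sigma_v$ normalized to 1, hypothetical function names); it is not the proof given in the Appendix.

```python
import math

def std_normal_cdf(z):
    """Phi(z): CDF of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed illustrative values; sigma_v is normalized to 1.
BETA, A_E = 1.0, 0.5
SIGMA_A, BIAS_A = 1.5, 0.4

def p_model_a(x):
    """Model A: constant information effect and constant favors."""
    return std_normal_cdf((x * BETA + BIAS_A - A_E) / SIGMA_A)

def bias_b(x):
    """Bias function that mimics model A without any information effect."""
    return (x * BETA + BIAS_A - A_E) / SIGMA_A - (x * BETA - A_E)

def p_model_b(x):
    """Model B: sigma = 1 (no information effect), bias depends on x."""
    return std_normal_cdf(x * BETA + bias_b(x) - A_E)

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
max_gap = max(abs(p_model_a(x) - p_model_b(x)) for x in grid)
bias_slope = bias_b(1.0) - bias_b(0.0)  # equals BETA * (1 / SIGMA_A - 1)
print(max_gap, bias_slope)
```

In model B the bias falls with the observable, that is, it moves in the direction opposite to $\beta$, and the resulting promotion probabilities coincide with those of model A. Data on promotion alone therefore cannot tell the two apart.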
We now derive our main result. We show that identification holds under mild restrictions on bias and excess variance. We consider two types of restrictions: exclusion restrictions and parametric assumptions.
Theorem 1
Consider model (4).

(Exclusion restriction). Suppose that characteristic $k$ leaves $\sigma$ and $B$ unaffected and that $\beta_k \neq 0$. Then, the model is identified and the functions $\sigma(n_{iS}, n_{iW}, x_i^{-k})$ and $B(n_{iS}, n_{iW}, x_i^{-k})$ are non-parametrically identified.

(Linearity). Suppose that $\ln(\sigma(n_{iS}, n_{iW}, x_i)) = \delta(n_{iS}, n_{iW})\, x_i$ and $B(n_{iS}, n_{iW}, x_i) = \gamma_0(n_{iS}, n_{iW}) + \gamma_1(n_{iS}, n_{iW})\, x_i$, with $\gamma_0(0, 0) = 0$ and $\delta(0, 0) = \gamma_1(0, 0) = \mathbf{0}$. Then, the model is identified and the functions $\delta(n_{iS}, n_{iW})$, $\gamma_0(n_{iS}, n_{iW})$ and $\gamma_1(n_{iS}, n_{iW})$ are non-parametrically identified.

To see why the first part of Theorem 1 holds, suppose that $\sigma$ and $B$ do not depend on $x_i^k$. From data on the unconnected, we can recover $\beta_k$, the direct effect of $x_i^k$ on grade, and $\sigma_v(\cdot)$. Focus, then, on candidates with number of connections $n_{iS}$ and $n_{iW}$ and with other characteristics $x_i^{-k}$. From data on these candidates, we can recover the heteroscedasticity-corrected impact of $x_i^k$ on grade, equal to $\beta_k / \sigma(n_{iS}, n_{iW}, x_i^{-k})$. If $\beta_k \neq 0$, we obtain $\sigma(n_{iS}, n_{iW}, x_i^{-k})$. The bias $B(n_{iS}, n_{iW}, x_i^{-k})$ can then be obtained as the difference in inferred promotion thresholds between unconnected candidates and candidates with ties $n_{iS}$ and $n_{iW}$ and characteristics $x_i^{-k}$.

Therefore, our identification strategy operates as long as one exclusion restriction is present in the model. As with instrumental variables, the excluded variable should have a direct impact on the unconnected likelihood to be promoted and should not directly affect the precision of the signals conveyed by connections nor the bias from favors they may generate. In particular, a model where excess variance and bias from favors depend on connections but not on other observables is identified. We estimate several variants of such models in the empirical analysis below.

Even without exclusion restrictions, the model can still be identified thanks to functional form assumptions. The second part of Theorem 1, proved in the Appendix, shows that this notably holds when excess variance is log linear in observables while bias is an affine function of observables. In this case, again, dependence on connections can be arbitrary and is fully identified. To achieve non-parametric identification in practice may of course require a very large number of observations.
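The argument behind the exclusion restriction amounts to simple arithmetic on probit indices. The Python sketch below assumes that the indices of unconnected and connected candidates have already been estimated without error, holds other characteristics fixed, and normalizes $\sigma_v$ to 1; the function name and all numerical values are hypothetical.

```python
def recover_information_and_favors(slope_u, intercept_u, slope_c, intercept_c):
    """Recover (sigma, B) from two probit indices in the excluded variable x_k.

    Unconnected index: intercept_u + slope_u * x_k,
        with slope_u = beta_k and intercept_u = -a_e (other terms held fixed).
    Connected index:   intercept_c + slope_c * x_k,
        with slope_c = beta_k / sigma and intercept_c = (B - a_e) / sigma.
    """
    if slope_u == 0 or slope_c == 0:
        raise ValueError("the excluded characteristic must have a non-zero effect")
    sigma = slope_u / slope_c                   # ratio of slopes gives sigma
    bias = sigma * intercept_c - intercept_u    # difference in thresholds gives B
    return sigma, bias

# Assumed "true" values used to build the two indices (illustrative only).
beta_k, a_e, sigma_true, bias_true = 0.8, 1.0, 1.25, 0.3
sigma_hat, bias_hat = recover_information_and_favors(
    slope_u=beta_k,
    intercept_u=-a_e,
    slope_c=beta_k / sigma_true,
    intercept_c=(bias_true - a_e) / sigma_true,
)
print(sigma_hat, bias_hat)
```

The check `slope_u == 0` mirrors the condition $\beta_k \neq 0$ in the theorem: without a direct effect of the excluded characteristic, the ratio of slopes carries no information on $\sigma$.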
In the empirical analysis below, we adopt standard parametric assumptions on the way $\ln(\sigma)$ and $B$ vary with connections and observables. All models estimated in Section VI are covered by Theorem 1.

IV Data
We apply our framework to the data on academic promotions in Spain assembled and studied by Zinovyeva & Bagues (2015). We describe the main features of the data here and refer to their study for details. From 2002 to 2006, academics in Spain seeking promotion to Associate Professor (profesor titular) or Full Professor (catedrático) first had to qualify in a national exam (habilitación). All candidates in the same discipline in a given wave were evaluated by a common jury composed of 7 members. The jury had to allocate a predetermined number of positions. These exams were highly competitive and obtaining the national qualification essentially ensured promotion. A central feature of this system was that jury members were picked at random from a pool of eligible evaluators. The random draw was actually carried out by Ministry officials using urns and balls. The data contain information on all candidates to academic promotion during that period, their connections to eligible evaluators and to jury members, and their success or failure in the national exam.

Overall, there are 31,243 applications to 967 exams: 17,799 applications to 465 exams for Associate Professor (AP) positions and 13,444 to 502 exams for Full Professor (FP) positions. We have information on candidates’ demographics and academic outcomes at time of application. Observable characteristics include gender, age, whether the candidate obtained his PhD in Spain, the number of publications, the number of publications weighted by journal quality, the number of PhD students supervised, the number of PhD committees of which the candidate had been a member, and the number of previous attempts at promotion. Table 1 provides descriptive statistics. Standards regarding research outputs may of course differ between disciplines. To analyze applications in a common framework, we follow Zinovyeva & Bagues (2015) and normalize research indicators to have mean 0 and variance 1 within exams. The data also contain information on six types of links between candidates and evaluators. We adopt Zinovyeva & Bagues (2015)’s classification of these links in strong and weak ties. A candidate is said to have strong ties to his PhD advisor, to his coauthors and to his colleagues. He has weak ties with members of his PhD committee, with members of the PhD committees of his PhD students and with other members of the PhD committees of which he was a member. Overall, 34.8% of candidates end up having at least one strong connection with a member of their jury and 20.6% have at least one weak connection. Table 2 provides further information on connections.

We also normalize age and past experience to have mean 0 within exams.

The data also contain information on indirect connections, for instance when a candidate and an evaluator have a common member on their PhD committees. Zinovyeva & Bagues (2015) do not find any effect of indirect connections and we do not include them in our analysis.

A connection which is both strong and weak is classified as strong.

Table 1

| | All | AP | FP | Eng. | H&L | Sci. | Soc. Sci. |
|---|---|---|---|---|---|---|---|
| Female | 0.34 (0.47) | 0.40 (0.49) | 0.27 (0.44) | 0.21 (0.41) | 0.45 (0.50) | 0.30 (0.46) | 0.39 (0.49) |
| Age | 41.21 (7.59) | 37.49 (6.41) | 46.14 (6.06) | 38.74 (7.07) | 41.86 (7.62) | 41.97 (7.55) | 40.39 (7.54) |
| PhD in Spain | 0.78 (0.42) | 0.83 (0.37) | 0.70 (0.46) | 0.83 (0.38) | 0.77 (0.42) | 0.76 (0.43) | 0.78 (0.42) |
| Past Experience | 0.81 (1.27) | 0.73 (1.27) | 0.91 (1.26) | 0.85 (1.36) | 0.63 (0.94) | 0.89 (1.40) | 0.88 (1.30) |
| Publications | 12.84 (18.31) | 8.12 (14.06) | 19.09 (21.18) | 7.76 (12.88) | 11.45 (11.39) | 16.99 (24.10) | 9.22 (11.61) |
| AIS | 0.72 (0.53) | 0.70 (0.57) | 0.74 (0.48) | 0.52 (0.37) | - | 0.80 (0.51) | 0.62 (0.75) |
| PhD Students | 1.00 (2.11) | 0.24 (0.88) | 2.00 (2.75) | 0.83 (1.61) | 0.61 (1.63) | 1.45 (2.60) | 0.66 (1.58) |
| PhD Committees | 3.61 (6.76) | 0.88 (2.55) | 7.23 (8.65) | 2.40 (4.42) | 3.04 (5.99) | 4.81 (8.21) | 2.67 (4.99) |
| Observations | 31243 | 17799 | 13444 | 4783 | 9005 | 12858 | 4597 |

Notes: Average values of the observable characteristics at the time of exam. Standard deviations in parentheses. FP and AP stand for exams for Full Professor and Associate Professor positions respectively. Eng., H&L, Sci., and Soc. Sci. are abbreviations for Engineering, Humanities and Law, Sciences, and Social Sciences, which are 4 broad scientific areas in our sample. AIS is the sum of international publications weighted by corresponding Article Influence Scores. The table partially replicates Table 2 in Zinovyeva & Bagues (2015).
Table 2

| | All | AP | FP | Eng. | H&L | Sci. | Soc. Sci. |
|---|---|---|---|---|---|---|---|
| Strong connections | 31.71 | 29.08 | 35.18 | 37.78 | 27.65 | 31.13 | 34.94 |
| Advisor | 3.17 | 2.97 | 3.43 | 4.60 | 3.29 | 2.43 | 3.50 |
| Coauthor | 5.44 | 3.26 | 8.32 | 6.10 | 2.84 | 7.44 | 4.24 |
| Colleague | 29.71 | 27.74 | 32.31 | 36.02 | 26.15 | 28.50 | 33.46 |
| Weak connections | 18.79 | 7.33 | 33.97 | 17.06 | 23.63 | 16.43 | 17.71 |
| PhD committee member | 7.08 | 5.31 | 9.43 | 8.05 | 10.22 | 4.39 | 7.48 |
| PhD committee of his PhD student | 4.45 | 0.69 | 9.42 | 4.70 | 4.81 | 4.27 | 3.96 |
| Same PhD committee member | 11.65 | 1.82 | 24.66 | 8.84 | 13.90 | 11.59 | 10.31 |

Notes: The percentage of candidates with at least one connection to the jury. The table partially replicates Table 3 in Zinovyeva & Bagues (2015).
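The within-exam normalization of research indicators described in this section can be sketched as follows. This is a minimal Python illustration on made-up records; the field names (`exam_id`, `publications`), the helper name and the use of the population standard deviation are assumptions, since the text does not specify these details.

```python
from collections import defaultdict
from statistics import fmean, pstdev

def standardize_within_exam(records, key, exam_key="exam_id"):
    """Rescale `key` to mean 0 and variance 1 within each exam."""
    groups = defaultdict(list)
    for record in records:
        groups[record[exam_key]].append(record[key])
    # Mean and (population) standard deviation of the indicator per exam.
    stats = {exam: (fmean(vals), pstdev(vals)) for exam, vals in groups.items()}
    result = []
    for record in records:
        mean, sd = stats[record[exam_key]]
        z = (record[key] - mean) / sd if sd > 0 else 0.0
        result.append({**record, key + "_std": z})
    return result

# Made-up applications, for illustration only.
applications = [
    {"exam_id": 1, "publications": 2},
    {"exam_id": 1, "publications": 4},
    {"exam_id": 2, "publications": 10},
    {"exam_id": 2, "publications": 30},
]
normalized = standardize_within_exam(applications, "publications")
print([row["publications_std"] for row in normalized])
```

After this step, a candidate's indicator is expressed relative to the other candidates in the same exam, which makes applications comparable across disciplines with different publication standards.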
V Empirical Implementation
We now apply our identification strategy to the data on academic promotions in Spain. We discuss three key features of the empirical implementation: the random assignment of evaluators; the exam-specific promotion thresholds; and the specific models being estimated.
A Random assignment of jury members
Our identification result, Theorem 1, relies on the assumption that the distribution of unobservables for candidates with connections $(n_{iS}, n_{iW})$ does not depend on $(n_{iS}, n_{iW})$. In the data, random assignment of jury members ensures that this holds conditionally on the expected number of connections to the jury. That is, candidates may vary in the extent of their connections to eligible evaluators. From the number of eligible evaluators and the numbers of weak and strong ties to eligible evaluators, we can simply compute the expected number of actual connections of the candidate to the jury. Conditional on these expected numbers, actual numbers of connections are random. We present the corresponding balance tests in Table 3. Controlling for candidates’ expected numbers of connections, we do not find significant correlations between observable characteristics and actual number of connections. Therefore, a conditional version of Theorem 1 holds in this context. The probability to be promoted for unconnected $p(y_i = 1 \mid n_{iS} = n_{iW} = 0, E n_{iS}, E n_{iW}, x_i)$, excess variance due to better information $\sigma(n_{iS}, n_{iW}, E n_{iS}, E n_{iW}, x_i)$ and bias from favors $B(n_{iS}, n_{iW}, E n_{iS}, E n_{iW}, x_i)$ may depend on the expected numbers of connections to the jury. Under the assumptions underlying Theorem 1, the conditional information and favor effects are identified. Note that the expected numbers of connections represent measures of social capital, built from information available to the jury. In the empirical analysis we therefore simply include them in the set of candidates’ characteristics observable to the jury.

To be consistent with our main regressions, we run balance tests conditioning directly on the expected numbers of connections. By contrast, Zinovyeva & Bagues (2015) control for expected connections through an extensive set of dummies, see Table 4 p.278. Incorporating these dummies raises computational issues in our non-linear setup. Results from Table 1 show that even in a simple linear formulation, actual connections are uncorrelated with observable characteristics.

Table 3: Balance tests. [Table not reproduced. Columns: AIS, Publications, PhD students, PhD committees, Past experience. Rows: Strong, Weak. Panels: without controls for the expected number of connections; including controls for the expected number of connections.]

Notes: Results of 10 regressions of observables (columns) on the number of strong and weak connections to the jury (rows). In regressions in the upper panel we do not control for the expected number of connections. Regressions in the lower panel include controls for the expected number of strong connections to the jury and the expected number of weak connections to the jury. OLS estimates. Standard errors clustered at the exam level are in parentheses. ∗ p < [...], ∗∗ p < [...], ∗∗∗ p < [...].

B Exam-specific promotion thresholds
Our approach relies on exam-specific promotion thresholds. This is an important element since the bias from favors is identified from differences in promotion thresholds between connected and unconnected candidates. We consider two ways to account for exam-specific thresholds empirically: exam fixed effects $a_e$ and exam grouped effects $a_e = z_e a$, where $z_e$ is a vector of exam-level characteristics. A first approach is to include a full set of exam fixed effects. In practice, regressions then include 967 exam dummies. While exam fixed effects impose, in principle, fewer restrictions, they raise several problems in practice. They may not be identified for exams with small numbers of candidates, due to full predictability. They raise computational difficulties caused by the high dimensionality of the non-linear optimization problem to be solved in the estimations. And in circumstances where grouped effects are appropriate, estimations based on fixed effects may be inefficient.

Alternatively, we consider exam grouped effects as in Bester & Hansen (2016). We allow promotion thresholds to depend on type, area and wave fixed effects - leading to 72 dummies in total - and on the number of candidates, the number of positions, the proportion of filled positions and the proportion of unconnected candidates. This model is of course nested in the model with exam fixed effects and we can then test whether it leads to a significant loss in explanatory power.

C Econometric model
In the empirical analysis, we estimate different specifications of model (4). The general model features three key ingredients: baseline heteroscedasticity $\sigma_v(x_i)$, excess variance from better information $\sigma(n_{iS}, n_{iW}, x_i)$ and bias from favors $B(n_{iS}, n_{iW}, x_i)$. Note that the first two elements are closely related, since $\sigma_v(x_i)\, \sigma(n_{iS}, n_{iW}, x_i)$ represents the variance of the latent error for candidates with connections $n_{iS}$, $n_{iW}$ and characteristics $x_i$.

We adopt a standard formulation for baseline heteroscedasticity, see Wooldridge (2010). We assume that the logarithm of the variance of $v_i$, the determinant of ability of unconnected candidates observed by the jury but not by the econometrician, is a linear function of observable characteristics:

$$\sigma_v(x_i) = \exp(\delta x_i) \tag{5}$$

where the constant is excluded from the $x_i$’s. To gain in statistical and computational efficiency, we do not include all characteristics in $\sigma_v$ in our preferred specification. We present our estimation procedure in the Appendix.

We model information effects by building on this heteroscedasticity formulation. We consider increasingly complex specifications: (1) constant information effects: $\sigma(n_{iS}, n_{iW}, x_i) = \exp(\delta_c)$ if $n_{iS} + n_{iW} \geq 1$; (2) information effects depending on numbers and types of links: $\sigma(n_{iS}, n_{iW}, x_i) = \exp(\delta_S n_{iS} + \delta_W n_{iW})$; and (3) information effects depending on numbers and types of links as well as other observable characteristics: $\sigma(n_{iS}, n_{iW}, x_i) = \exp[(\delta_S x_i) n_{iS} + (\delta_W x_i) n_{iW}]$. Thus, each new strong tie with the jury increases latent error variance by a factor of $\exp(\delta_S)$ in formulation (2) and of $\exp(\delta_S x_i)$ in formulation (3). These assumptions allow us to study the determinants of the variance of the latent error in a common, coherent framework. In addition, observe that formulation (3) can be obtained as the first element of the Taylor approximation of $\ln(\sigma(n_{iS}, n_{iW}, x_i)/\sigma_v(x_i))$ with respect to $n_{iS}$, $n_{iW}$ and $x_i$, for any function $\sigma$.

We also model increasingly complex specifications of the bias from favors: (1) constant bias: $B(n_{iS}, n_{iW}, x_i) = B$ if $n_{iS} + n_{iW} \geq 1$; (2) bias depending on the numbers and types of links, linearly: $B(n_{iS}, n_{iW}, x_i) = \gamma_S n_{iS} + \gamma_W n_{iW}$, or in a quadratic way: $B(n_{iS}, n_{iW}, x_i) = \gamma_S n_{iS} + \gamma_{SS} n_{iS}^2 + \gamma_W n_{iW} + \gamma_{WW} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW}$; and (3) bias depending on connections and other observables: $B(n_{iS}, n_{iW}, x_i) = (\gamma_S + \gamma_{Sx} x_i) n_{iS} + (\gamma_W + \gamma_{Wx} x_i) n_{iW} + \gamma_{SS} n_{iS}^2 + \gamma_{WW} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW}$. Quadratic terms help capture decreasing marginal impacts of additional links. For instance in the quadratic variant of formulation (2), a new strong tie with the jury increases bias by $\gamma_S + \gamma_{SS}$ for an unconnected candidate and by $\gamma_S + 3\gamma_{SS}$ for a candidate who already had one strong tie.

VI Empirical Analysis
A Main results
We develop our empirical analysis in three stages. We first estimate a version of the simplemodel discussed in Section II, where the extent of information and favors are constant.We then account for the number and types of links, holding both effects independent ofobservables. Finally, we estimate a model with full dependence on links and observables.We first examine the impact of having at least one connection of any kind to the jury. Weestimate constant favors and information effects, accounting for baseline heteroscedasticity.16enote by c i the connection dummy: c i = 1 if n iS + n iW ≥ p ( y i = 1 | x i , c i ) = Φ[( x i β + Bc i − a e ) exp[ − ( δ x i + δ c c i )]] (6)We consider grouped exam effects in our main regressions, and justify this choice in SectionVI.B. Results of the estimation of Model (6) are reported in Table 4.Table 4: Binary connections: Model (6) (All) (AP) (FP)Bias (connected) 0.179 ∗∗∗ ∗∗ ∗∗ (0.055) (0.084) (0.091)Information (connected) 0.174 ∗∗∗ ∗∗∗ Notes:
All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance. For clarity, we do not report estimates of the impact of candidates’ and exams’ characteristics on promotion ($\beta$) and on baseline variance ($\delta$) in the Tables.

On the whole sample, both the estimated bias from favors $B$ and the estimated information effect $\delta_c$ are positive and statistically significant. They are also both positive and significant when estimated on promotions to Associate Professor. By contrast, we detect favors but no information effect on promotions to Full Professor. Thus, connected candidates appear to face lower promotion thresholds at both levels and connected candidates to Associate Professor have excess variance in their latent errors. In other words, observable characteristics have lower power to explain promotion decisions in their case.

Are these effects quantitatively significant? How much do connections help? And how much does each motive contribute to the overall impact? To answer these questions, we compute for each candidate the predicted impact of a change in his connection status. We focus, for clarity, on unconnected candidates with at least one link to potential evaluators. These computations could easily be replicated on other subsamples. Consider, then, an unconnected candidate $i$. Model (6) can be used to predict how much $i$’s probability of being promoted would change if $i$ became connected.
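As a rough illustration of how Model (6) turns a connection into a change in promotion probability, the sketch below evaluates the model for a single candidate. The bias and information coefficients are the whole-sample point estimates from Table 4; the candidate's index $x_i\beta - a_e$ and baseline log-scale $\delta x_i$ are hypothetical values chosen only so that the baseline probability is close to 0.08, and the function names are illustrative rather than the authors' code.

```python
from math import erf, exp, sqrt


def norm_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def promotion_prob(index, baseline_log_sd, connected, bias, delta_c):
    """Model (6): Phi[(x*beta + B*c - a_e) * exp(-(delta*x + delta_c*c))].

    index           -- x_i*beta - a_e (hypothetical in the example below)
    baseline_log_sd -- delta*x_i, the log of the baseline latent scale
    connected       -- connection dummy c_i
    """
    c = 1 if connected else 0
    return norm_cdf((index + bias * c) * exp(-(baseline_log_sd + delta_c * c)))


# Whole-sample point estimates reported in Table 4.
B_HAT, DELTA_C_HAT = 0.179, 0.174

# Hypothetical candidate (assumption, not taken from the data).
index, baseline_log_sd = -1.405, 0.0
p0 = promotion_prob(index, baseline_log_sd, False, B_HAT, DELTA_C_HAT)
p1 = promotion_prob(index, baseline_log_sd, True, B_HAT, DELTA_C_HAT)
print(f"unconnected: {p0:.3f}, connected: {p1:.3f}, change: {p1 - p0:+.3f}")
```

Because the connection both shifts the index (favors) and rescales it (information), the change for a given candidate depends on where the candidate sits in the distribution of observables.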
Denote estimated coefficients with hats. The difference in predicted promotion probabilities is equal to:

$$\Delta p_i/\Delta c_i = p(y_i = 1 \mid x_i, c_i = 1) - p(y_i = 1 \mid x_i, c_i = 0)$$
$$\Delta p_i/\Delta c_i = \Phi\big[(x_i\hat\beta + \hat B - \hat a_e)\exp[-(\hat\delta x_i + \hat\delta_c)]\big] - \Phi\big[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]\big]$$

We assume that the exam’s promotion threshold $a_e$ is not affected by the change in connection status of candidate $i$. We can further decompose the overall impact of a change in connection status in two parts: one due to favors,

$$[\Delta p_i/\Delta c_i]_F = \Phi\big[(x_i\hat\beta + \hat B - \hat a_e)\exp[-\hat\delta x_i]\big] - \Phi\big[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]\big]$$

and another due to information,

$$[\Delta p_i/\Delta c_i]_I = \Phi\big[(x_i\hat\beta + \hat B - \hat a_e)\exp[-(\hat\delta x_i + \hat\delta_c)]\big] - \Phi\big[(x_i\hat\beta + \hat B - \hat a_e)\exp[-\hat\delta x_i]\big]$$

Thus, $\Delta p_i/\Delta c_i = [\Delta p_i/\Delta c_i]_F + [\Delta p_i/\Delta c_i]_I$. There are two ways to decompose the overall effect in two parts. Due to non-linearities, these two ways may not be equivalent. In practice they yield similar results, however, and we only present results from the decomposition described in the text. Finally, we compute the averages of these values over all individuals in the sample. To compute the impact of connectedness for a subsample, we rely on estimates of Model (6) for this subsample as presented in Table 4.

We depict the results of these counterfactual computations in Table 5 and Figure 2. The Table reports the average initial predicted probability (first column), the average predicted change in promotion probability due to connection (second column), the part of this change due to information (third column) and the part due to favors (fourth column). Thus, an unconnected candidate with some link to potential evaluators only has, on average, a 0.08 chance to be promoted, reflecting the highly competitive nature of these promotions. Getting, by luck, connected to the jury leads to a relative increase in the promotion probability of 80%. This relative impact is higher for candidates at the Associate Professor level (+91%) than for candidates at the Full Professor level (+76%). The larger part of this effect is due to information for AP candidates (63% of the total impact). By contrast, favors are the main determinant of this impact for FP candidates (71% of the total impact). Overall, these numbers provide a quantitative picture of the impact of connections. Getting connected to the jury almost doubles the chances to obtain the promotion. Consistently with the estimation results, favors appear to dominate for FP candidates while the information effect dominates for AP candidates.

Table 5
        Baseline     Marginal effect
        Predicted    Total      Information   Bias
All     0.080        0.064      0.035         0.029
        (0.069)      (0.023)    (0.008)       (0.019)
AP      0.088        0.080      0.050         0.030
        (0.072)      (0.024)    (0.011)       (0.019)
FP      0.063        0.048      0.014         0.034
        (0.058)      (0.024)    (0.003)       (0.022)
Notes: Average marginal effect of being connected calculated for unconnected candidates with at least one connection to potential evaluators. Standard deviations of the effect are in parentheses.

Figure 2 then depicts how the change in predicted probability $\Delta p_i/\Delta c_i$ and its two components $[\Delta p_i/\Delta c_i]_F$ and $[\Delta p_i/\Delta c_i]_I$ vary with predicted probability $p_i = \Phi[(x_i\hat\beta - \hat a_e)\exp[-\hat\delta x_i]]$. We see that $[\Delta p_i/\Delta c_i]_I$ has an inverted U-shape, reaching a maximum for $p_i$ close to 0.[…] $p_i$. By contrast, $[\Delta p_i/\Delta c_i]_F$ is initially increasing over a larger range and only decreases, when it does, for high values of $p_i$. These qualitative patterns are consistent with Figure 1. In particular, and as discussed in Section III, better information on candidates appears to lower the promotion probability of candidates with very good CVs. On average for these candidates, the impact of bad news dominates the impact of good news. Overall, $\Delta p_i/\Delta c_i$ displays a clear inverted U shape for AP candidates, reaching a maximum around $p_i$ equal to 0.2, due to the key role of the information effect. By contrast, FP candidates with better observable characteristics benefit more from being connected to the jury.

We next assume that the bias from favors and the information effect may depend on the number and types of links. We estimate a model with linear bias and log-linear variance:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi\big[(x_i\beta + \gamma_S n_{iS} + \gamma_W n_{iW} - a_e)\exp[-(\delta x_i + \delta_S n_{iS} + \delta_W n_{iW})]\big] \qquad (7)$$

as well as a model with quadratic bias and log-linear variance:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi\big[(x_i\beta + \gamma_{S1} n_{iS} + \gamma_{S2} n_{iS}^2 + \gamma_{W1} n_{iW} + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW} - a_e)\exp[-(\delta x_i + \delta_S n_{iS} + \delta_W n_{iW})]\big] \qquad (8)$$

Results are reported in Table 6. In the Left panel we report estimation results from Model (7). On the whole sample, the bias and information effects from strong ties are both positive and significant; they are positive but insignificant for weak ties. For Full Professor applications, we detect favors and information effects from strong ties and, in addition, favors from weak ties. For Associate Professor applications, we do not detect favors in this specification; we do detect strongly significant and positive information effects for both strong and weak ties. Note that in general, the effects of weak ties tend to be imprecisely estimated.

Figure 2: Marginal effect of being connected: Decomposition
[Figure 2 panels: All, AP, FP. Horizontal axis: predicted probability. Vertical axis: marginal effect. Series: Bias, Information, Total.]
Notes: Nonparametric fit using LOESS method. The grey region depicts 95% confidence intervals. Plots are constructed using estimated model (6) on subsamples indicated above each plot.

Table 6
Columns: Model (7): (All) (AP) (FP); Model (8): (All) (AP) (FP)
Bias
$n_S$: ∗∗∗ ∗ ∗∗∗ ∗∗∗ ∗∗∗
Standard errors: (0.047) (0.070) (0.068) (0.047) (0.072) (0.061)
$n_S^2$: −∗∗∗ −∗∗∗ −∗∗∗
Standard errors: (0.008) (0.014) (0.006)
$n_W$: −∗∗ −∗∗∗
Standard errors: (0.069) (0.230) (0.058) (0.073) (0.280) (0.067)
$n_W^2$: −∗ − −∗∗∗
Standard errors: (0.014) (0.240) (0.010)
$n_S \times n_W$: −
Information
$n_S$: ∗∗∗ ∗∗∗ ∗∗ ∗∗ ∗∗∗
$n_W$: ∗∗∗ − ∗∗∗ −
Notes:
Estimation of Model (7) - Left panel, and Model (8) - Right panel. All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

To sum up, strong connections to the jury lower the promotion threshold effectively faced by connected candidates. This impact is increasing in the number of strong ties at a decreasing rate. For applications to Full Professor, weak connections to the jury also lower the promotion threshold in a similar way. For applications to Associate Professor, we face a problem of statistical power caused by the relatively low number of weak ties. Both kinds of ties also appear to convey better information on candidates at the AP level. By contrast, we do not detect robust information effects at the FP level.

We next present the outcomes of counterfactual computations on the impact of connections in Table 7, based on Model (8). We now focus on unconnected candidates who have at least one strong tie and one weak tie to potential evaluators. For each such candidate, we compute the predicted promotion probability and the predicted increase in promotion probability caused by obtaining, by luck, one strong or weak connection to the jury. We also provide decompositions of these impacts into parts due to better information and to favors. We then average over all candidates in the subsample. We see that one strong tie increases the promotion probability by 74% for AP candidates and by 72% for FP candidates. By contrast, one weak tie increases the promotion probability by 51% for AP candidates and by 22% for FP candidates. Thus, strong ties have higher predicted impacts than weak ties. For FP candidates, favors dominate, quantitatively, for both weak and strong ties. For AP candidates, favors dominate for strong ties and information effects dominate for weak ties, consistently with the estimation results.

Table 7: Marginal effect of connections: Model (8)
        Baseline     Marginal effect
        Predicted    Total                Information          Bias
                     Strong     Weak      Strong     Weak      Strong     Weak
All     0.082        0.060      0.023     0.019      0.012     0.041      0.011
        (0.068)      (0.026)    (0.009)   (0.004)    (0.003)   (0.024)    (0.007)
AP      0.091        0.067      0.046     0.028      0.092     0.039      -0.046
        (0.072)      (0.025)    (0.023)   (0.006)    (0.026)   (0.023)    (0.031)
FP      0.068        0.049      0.015     0.019      -0.017    0.030      0.033
        (0.058)      (0.021)    (0.017)   (0.004)    (0.005)   (0.018)    (0.020)
Notes:
Average marginal effects of strong and weak connections calculated for unconnected candidates with at least one strong connection and one weak connection to potential evaluators. Standard deviations of the effect are in parentheses.

Model (9), with full dependence on links and observables:

$$p(y_i = 1 \mid x_i, n_{iS}, n_{iW}) = \Phi\big[(x_i\beta + (\gamma_{S1} + \gamma_{Sx} x_i) n_{iS} + (\gamma_{W1} + \gamma_{Wx} x_i) n_{iW} + \gamma_{S2} n_{iS}^2 + \gamma_{W2} n_{iW}^2 + \gamma_{SW} n_{iS} n_{iW} - a_e)\exp[-(\delta x_i + (\delta_S x_i) n_{iS} + (\delta_W x_i) n_{iW})]\big] \qquad (9)$$

We present estimation results in the Appendix, see Table A1 for AP candidates and Table A2 for FP candidates. A positive coefficient of the impact of some characteristic on bias means that favors due to connections tend to be stronger for candidates with higher values of this characteristic. Similarly, a positive coefficient on the information effect means that excess variance, and hence the quality of the extra information brought about by an additional connection, is higher for these candidates. Results are rich and complex and confirm that we can detect variations in the effects of connections. For instance, AP candidates having obtained their PhD in Spain appear to have higher information effects from both weak and strong ties and lower bias from weak ties. Results on information are consistent with the idea that having obtained a PhD abroad provides an informative signal on a candidate’s ability. We present counterfactual computations obtained from Model (9) in Table 8. Comparing with Table 7, we see that predicted probabilities are quantitatively similar.

Table 8: Marginal effect of connections: Model (9)
        Baseline     Marginal effect
        Predicted    Total                Information          Bias
                     Strong     Weak      Strong     Weak      Strong     Weak
AP      0.090        0.078      0.055     0.032      0.066     0.047      -0.012
        (0.067)      (0.037)    (0.052)   (0.021)    (0.067)   (0.030)    (0.044)
FP      0.066        0.058      0.017     0.011      -0.003    0.047      0.020
        (0.056)      (0.036)    (0.027)   (0.025)    (0.039)   (0.028)    (0.026)
Notes:
Average marginal effects of strong and weak connections calculated for unconnected candidates with at least one strong connection and one weak connection to potential evaluators. Standard deviations of the effect are in parentheses.
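The relative impacts quoted in the text can be recovered directly from the reported averages. The short sketch below, which is only an arithmetic check and not part of the estimation, divides the total marginal effect by the baseline predicted probability for the AP and FP rows of Tables 7 and 8, and reports how far each total is from the sum of its information and bias parts (differences of this size come from rounding in the published figures). The dictionary layout and function names are illustrative.

```python
# Per level: (baseline, {tie: (total, information, bias)}), copied from Tables 7 and 8.
TABLES = {
    "Table 7, Model (8)": {
        "AP": (0.091, {"strong": (0.067, 0.028, 0.039), "weak": (0.046, 0.092, -0.046)}),
        "FP": (0.068, {"strong": (0.049, 0.019, 0.030), "weak": (0.015, -0.017, 0.033)}),
    },
    "Table 8, Model (9)": {
        "AP": (0.090, {"strong": (0.078, 0.032, 0.047), "weak": (0.055, 0.066, -0.012)}),
        "FP": (0.066, {"strong": (0.058, 0.011, 0.047), "weak": (0.017, -0.003, 0.020)}),
    },
}


def relative_increase(baseline, total):
    """Predicted change in promotion probability relative to the baseline."""
    return total / baseline


def rounding_gap(total, information, bias):
    """Total minus (information + bias); nonzero only through rounding."""
    return total - (information + bias)


for table, rows in TABLES.items():
    for level, (baseline, ties) in rows.items():
        for tie, (total, info, bias) in ties.items():
            print(f"{table} | {level} | {tie}: "
                  f"{100 * relative_increase(baseline, total):.0f}% relative increase, "
                  f"gap {rounding_gap(total, info, bias):+.3f}")
```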
The average marginal impacts of gaining one strong or weak link to the jury for unconnected candidates appear to be slightly lower under Model (9) than under Model (8). This means that unconnected candidates have, on average, observable characteristics for which connections’ impacts are slightly weaker. Strong ties still have higher predicted impacts than weak ties. And the relative quantitative importance of the two factors is robust. Favors dominate for strong and weak ties at the FP level and for strong ties at the AP level. By contrast, information effects dominate for weak ties at the AP level.
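The counterfactual computations of this section follow one recipe: evaluate the model for an unconnected candidate, add the bias from one tie while holding the latent scale fixed (favors), then also rescale by the tie's information effect. A minimal sketch of that recipe for the quadratic-bias, log-linear-variance structure of Model (8) is given below. All coefficient values, candidate indices and function names are hypothetical and serve only to illustrate the mechanics; they are not the paper's estimates.

```python
from math import erf, exp, sqrt


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def bias(n_s, n_w, g):
    """Quadratic bias from favors, as in Model (8)."""
    return (g["s1"] * n_s + g["s2"] * n_s ** 2
            + g["w1"] * n_w + g["w2"] * n_w ** 2
            + g["sw"] * n_s * n_w)


def log_scale(n_s, n_w, d):
    """Log-linear excess scale of the latent error (information effect)."""
    return d["s"] * n_s + d["w"] * n_w


def prob(index, base_log_sd, n_s, n_w, g, d):
    """Promotion probability; index stands for x_i*beta - a_e."""
    shifted = index + bias(n_s, n_w, g)
    return norm_cdf(shifted * exp(-(base_log_sd + log_scale(n_s, n_w, d))))


def decompose_tie(index, base_log_sd, g, d, tie):
    """Effect of a first strong or weak tie, split into favors and information."""
    n_s, n_w = (1, 0) if tie == "strong" else (0, 1)
    p0 = prob(index, base_log_sd, 0, 0, g, d)
    # Favors only: shift the index, keep the baseline latent scale.
    p_fav = norm_cdf((index + bias(n_s, n_w, g)) * exp(-base_log_sd))
    p1 = prob(index, base_log_sd, n_s, n_w, g, d)
    return {"baseline": p0, "favors": p_fav - p0,
            "information": p1 - p_fav, "total": p1 - p0}


# Hypothetical coefficients and candidates (assumptions for illustration).
GAMMA = {"s1": 0.30, "s2": -0.03, "w1": 0.10, "w2": -0.01, "sw": 0.0}
DELTA = {"s": 0.10, "w": 0.15}
candidates = [(-1.8, 0.0), (-1.4, 0.1), (-0.9, -0.1)]  # (index, baseline log-scale)

for tie in ("strong", "weak"):
    parts = [decompose_tie(ix, ls, GAMMA, DELTA, tie) for ix, ls in candidates]
    avg = {k: sum(p[k] for p in parts) / len(parts) for k in parts[0]}
    print(tie, {k: round(v, 3) for k, v in avg.items()})
```

Applying the information effect first and the bias second gives the alternative decomposition; the two orderings need not coincide because the model is nonlinear.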
B Robustness
In this section, we explore variations in the specification of two important features of the econometric model: exam-specific promotion thresholds and baseline variance. First, we contrast estimations with exam fixed effects $a_e$ and exam grouped effects $a_e = z_e a$. We compare estimation results of Model (6) under the two specifications in Table 9. For each sample, the first column reports results of fixed effects estimations; the second column duplicates the results from Table 4. We see that the sign and statistical significance of both effects are similar for both specifications on the whole sample and on the subsample of FP candidates. On AP candidates, the information effect also has similar sign and significance. Bias from favors is positive and significant in the restricted model but positive and insignificant in the unrestricted model. Results from likelihood ratio tests show that we cannot reject the hypothesis that the grouped effects specification describes the data as well as the one with fixed effects, on each subsample as well as on the whole sample. We therefore consider grouped effects in our main regressions.

Table 9: Exam fixed effects vs. Exam grouped effects: Model (6)
Columns: All (FE, GE); AP (FE, GE); FP (FE, GE)
Bias: 0.304∗∗∗ ∗∗∗ ∗∗ ∗∗∗ ∗∗
Standard errors: (0.088) (0.055) (0.078) (0.084) (0.097) (0.091)
Information: 0.166∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗
Notes: The row LR reports the value of LR statistics of comparison of the restricted model (GE) and the unrestricted model (FE) in the preceding column. Heteroskedastic probit estimates. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

Second, we consider different specifications of baseline variance $\sigma_v(x_i)$. We contrast estimations under homoscedasticity, when all individual characteristics are included, and when a subset of characteristics are included, as described in the Appendix. Results are depicted in Table 10 for Model (6) and Table 11 for Model (7). We see that the sign and statistical significance of the main effects are essentially similar for the last two specifications on the whole sample and on each subsample. Likelihood ratio tests also show that we cannot reject the hypothesis that the parsimonious specification describes the data as well as the full-fledged specification, even on subsamples. By contrast, estimates of main effects differ under homoscedasticity and the homoscedastic specification is rejected by the likelihood ratio test. This confirms the importance of properly accounting for baseline heteroscedasticity. For reasons of computational and statistical efficiency, we therefore adopt the more parsimonious heteroscedasticity specification in our main regressions.

Table 10: Robustness: Baseline heteroskedasticity: Model (6)
Columns: All (Hom., Preferred, Full); AP (Hom., Preferred, Full); FP (Hom., Preferred, Full)
Bias: 0.419∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗ ∗ ∗∗∗ ∗∗ ∗∗∗
Standard errors: (0.058) (0.055) (0.064) (0.076) (0.084) (0.095) (0.096) (0.091) (0.072)
Information: − ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ − ∗∗∗ ∗∗∗ ∗∗∗
Notes:
The row LR reports the value of LR statistics of comparison of the unrestricted model with the restricted model in the preceding column. Heteroskedastic probit estimates. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

Table 11 (Model (7))
Columns: All (Hom., Preferred, Full); AP (Hom., Preferred, Full); FP (Hom., Preferred, Full)
Bias (strong): 0.221∗∗∗ ∗∗∗ ∗∗ ∗∗∗ ∗∗∗ ∗ − − − ∗ ∗∗ ∗∗
Standard errors: (0.055) (0.069) (0.077) (0.174) (0.230) (0.228) (0.046) (0.058) (0.054)
Information (strong): 0.089∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗ ∗∗∗
Standard errors: (0.031) (0.044) (0.042) (0.041) (0.055) (0.057) (0.044) (0.064) (0.066)
Information (weak): 0.089∗∗ ∗∗ ∗∗∗ ∗∗∗ − − ∗∗∗ ∗∗∗ ∗∗∗
Notes:
The row LR reports the value of LR statistics of comparison of the unrestricted model with the restricted model in the preceding column. Heteroskedastic probit estimates. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

VII Discussion and Conclusion
In this article, we propose a new method to identify favors and information in the impact of connections, building on earlier work on discrimination. Our method combines natural experiments and semi-structural modelling. It requires exogenous shocks on connections and only exploits information collected at time of promotion. We develop an econometric framework based on probit regressions with heteroscedasticity. Our method can thus be implemented using standard statistical software. We show that better information on connected candidates yields excess variance in latent errors. Differences in estimated variances between connected and unconnected candidates can be used to identify and quantify the information effect. Differences in estimated promotion thresholds can then be used to identify the bias due to favors. We apply our method to the data assembled and studied in Zinovyeva & Bagues (2015). Our empirical results are consistent with, and help sharpen, findings obtained from data collected five years after promotion.

Our framework relies on a number of assumptions and, in particular, latent error normality, deterministic favors and jury risk-neutrality. We next discuss the robustness of our approach to relaxing these assumptions. First, we conjecture that this method can be extended to non-normal latent errors. The fact that better private information leads to excess variance is quite general, as shown by Lu (2016). It could be interesting, in future research, to try and extend this framework to logit or even non-parametric regressions.

Second, suppose that favors are stochastic. Bias from favors is the sum of a deterministic part and a stochastic part. If the stochastic part is independent of connections, our analysis and results go through without modifications. This stochastic part is simply subsumed in the latent error. Our approach must be modified, however, if the stochastic part of the bias is affected by connections. Current estimates of the information effect provide a lower bound of the true effect if bias variance decreases with connections and an upper bound if it increases with connections.

Third, consider a risk-averse jury. Risk aversion might lead the jury to promote a candidate with lower expected ability if the uncertainty on her ability is lower. In other words, the grade of candidates evaluated by a risk-averse jury may contain a risk penalty. This may invalidate the identification of favors. Note that it also invalidates the identification of favors in studies based on quality measures. For instance, Zinovyeva & Bagues (2015)’s finding that promoted candidates with strong ties publish less in the 5 years after promotion could also be explained by risk aversion. Interestingly, however, we suspect that the identification of the information effect might be robust to risk aversion. Developing empirical methods to identify risk aversion, favors and information effects provides an interesting challenge for future research.

To sum up, our method exploits variations in latent error variance and in promotion thresholds with connections. We clarify the conditions under which these variations yield identification of favors and information in the impact of connections. Even in circumstances when identification does not hold, however, these estimates may contain valuable information on why connections matter.

Finally, it would be interesting to combine our method with quality measures. This could, potentially, yield more precise estimates of favors and information effects and also allow researchers to test critical assumptions, such as whether promotion indeed has the same impact on quality for connected and unconnected candidates.
APPENDIX
Proof of Proposition 1
A model with bias $B(\cdot)$ and excess variance $\sigma(\cdot)$ and an alternative model with bias $B'(\cdot)$ and $\sigma'(\cdot)$ yield the same conditional probability to be hired $p(y_i = 1 \mid n_{iS}, n_{iW}, x_i)$ if

$$\frac{x_i\beta + B(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\,\sigma(n_{iS}, n_{iW}, x_i)} = \frac{x_i\beta + B'(n_{iS}, n_{iW}, x_i) - a_e}{\sigma_v(x_i)\,\sigma'(n_{iS}, n_{iW}, x_i)}$$

Therefore, for any functions $B(\cdot)$, $B'(\cdot)$ and $\sigma(\cdot)$, a model based on $B(\cdot)$ and $\sigma(\cdot)$ and one based on $B'(\cdot)$ and

$$\sigma'(n_{iS}, n_{iW}, x_i) = \frac{x_i\beta + B'(n_{iS}, n_{iW}, x_i) - a_e}{x_i\beta + B(n_{iS}, n_{iW}, x_i) - a_e}\,\sigma(n_{iS}, n_{iW}, x_i)$$

have the same empirical implications. QED.

Proof of Theorem 1
Consider first the classical Probit model with heteroscedasticity:

$$p(y_i = 1 \mid x_i) = \Phi[(a + b x_i)\exp(-c x_i)]$$

Let us show that this model is identified if $(a, b) \neq 0$. (If $a = 0$ and $b = 0$, then $\forall x$, $\Phi[(a + bx)\exp(-cx)] = 1/2$ and $c$ is not identified.) Identification holds if the mapping from parameters to the population distribution of outcomes is injective. Consider two sets of parameters $a, b, c$ and $a', b', c'$ such that $\forall x \in \mathbb{R}^k$, $\Phi[(a + bx)\exp(-cx)] = \Phi[(a' + b'x)\exp(-c'x)]$. We must show that $a = a'$, $b = b'$ and $c = c'$.

Applying $\Phi^{-1}$ yields: $\forall x$, $(a + bx)\exp(-cx) = (a' + b'x)\exp(-c'x)$. At $x = 0$, this yields: $a = a'$. Next, take the derivative with respect to $x_k$ and apply at $x = 0$. This yields $b_k - a c_k = b'_k - a c'_k$. Observe also that $b_k$ and $b'_k$ must have the same sign. Indeed if $x_l = 0$ when $l \neq k$, then $(a + bx)\exp(-cx) = (a + b_k x_k)\exp(-c_k x_k)$. As $x_k$ goes from $-\infty$ to $+\infty$, the sign of this expression can vary in one of three ways: it goes from negative to positive if $b_k > 0$; it goes from positive to negative if $b_k < 0$; or it stays constant if $b_k = 0$.

Assume first that $a \neq 0$ and $b \neq 0$. Consider $k$ such that $b_k \neq 0$, for instance $b_k > 0$. Set $x_l = 0$ for all $l \neq k$. For any $x_k$ large enough, $a + bx = a + b_k x_k > 0$. Taking logs yields: $\ln(a + b_k x_k) - c_k x_k = \ln(a + b'_k x_k) - c'_k x_k$. Take the derivative with respect to $x_k$: $b_k/(a + b_k x_k) - c_k = b'_k/(a + b'_k x_k) - c'_k$. Take the derivative twice more: $b_k^2/(a + b_k x_k)^2 = b_k'^2/(a + b'_k x_k)^2$ and $-b_k^3/(a + b_k x_k)^3 = -b_k'^3/(a + b'_k x_k)^3$. Since this holds for any $x_k$ large enough, this must hold for any $x_k$. At $x_k = 0$, this yields: $b_k^3 = b_k'^3$ and hence $b_k = b'_k$ and $c_k = c'_k$. If $b_k = 0$, then $b'_k = 0$ and $c_k = c'_k$.

Assume next that $b = 0$. Then $b' = 0$ and $\forall x$, $a\exp(-cx) = a\exp(-c'x)$ and hence $c = c'$. Finally, if $a = 0$ and $b_k > 0$, then for any $x_k > 0$, $\ln(b_k x_k) - c_k x_k = \ln(b'_k x_k) - c'_k x_k$ and hence $\ln(b_k) - c_k x_k = \ln(b'_k) - c'_k x_k$. This implies that $b_k = b'_k$ and $c_k = c'_k$. Thus, $b = b'$ and $cx = c'x$ for any $x$ such that $bx \neq 0$, which implies that $c = c'$.

Observe that injectivity and identification also hold if $x$ belongs to an open set $O$ of $\mathbb{R}^k$. The reason is that the function $x \to (a + bx)\exp(-cx)$ is analytic and that two analytic functions which are equal on an open set must be equal everywhere. Therefore, $\forall x \in O$, $\Phi[(a + bx)\exp(-cx)] = \Phi[(a' + b'x)\exp(-c'x)] \Rightarrow \forall x \in \mathbb{R}^k$, $(a + bx)\exp(-cx) = (a' + b'x)\exp(-c'x)$ and hence $a = a'$, $b = b'$ and $c = c'$.

Identification also holds with some binary characteristics. Suppose that $x_{i1} \in \{0, 1\}$ and denote by $x_{i,-1} \in \mathbb{R}^{k-1}$ the vector of other characteristics. Then, $p(y_i = 1 \mid x_{i1} = 0, x_{i,-1}) = \Phi[(a + b_{-1} x_{i,-1})\exp(-c_{-1} x_{i,-1})]$, yielding identification of $a$, $b_{-1}$ and $c_{-1}$. Next, $p(y_i = 1 \mid x_{i1} = 1, x_{i,-1}) = \Phi[(a + b_1 + b_{-1} x_{i,-1})\exp(-c_1 - c_{-1} x_{i,-1})]$. Rewrite $\Phi^{-1}(p) = [e^{-c_1}(a + b_1) + e^{-c_1} b_{-1} x_{i,-1}]\exp(-c_{-1} x_{i,-1})$. Therefore, $e^{-c_1} b_{-1}$ is identified and hence $c_1$ is identified. Since $e^{-c_1}(a + b_1)$ is also identified, $b_1$ is identified.

Thus, as $n$ becomes arbitrarily large, the econometrician can obtain consistent estimates of $a$, $b$ and $c$ if observables have full rank.

Consider, next, the following model:

$$p(y_i = 1 \mid n_{iS}, n_{iW}, x_i) = \Phi\big[((\beta + \gamma_1(n_{iS}, n_{iW})) x_i + \gamma_0(n_{iS}, n_{iW}) - a_e)\exp[-(\delta + \delta_1(n_{iS}, n_{iW})) x_i]\big]$$

We apply the identification result on the Probit model with heteroscedasticity repeatedly. On unconnected candidates, we have: $p(y_i = 1 \mid n_{iS} = 0, n_{iW} = 0, x_i) = \Phi[(\beta x_i - a_e)\exp(-\delta x_i)]$ and hence $a_e$, $\beta$ and $\delta$ are identified. Similarly for candidates with connections $n_{iS}$ and $n_{iW}$, the parameters $\gamma_0(n_{iS}, n_{iW}) - a_e$, $\beta + \gamma_1(n_{iS}, n_{iW})$ and $\delta + \delta_1(n_{iS}, n_{iW})$ are identified. Therefore, $\gamma_0(n_{iS}, n_{iW})$, $\gamma_1(n_{iS}, n_{iW})$ and $\delta_1(n_{iS}, n_{iW})$ are identified. Note that to obtain consistent estimates of $a_e$, $\beta$, $\delta$, $\gamma_0(n_{iS}, n_{iW})$, $\gamma_1(n_{iS}, n_{iW})$ and $\delta_1(n_{iS}, n_{iW})$, the number of observations within exams must become arbitrarily large and observables conditional on $(n_{iS}, n_{iW})$ must have full rank. QED.

Preferred specification for the baseline heteroscedasticity.
We first estimate Model (4) on unconnected candidates, under the assumption that latent error variance is log-linear and depends on all observable characteristics. We thus estimate the following model:

$$p(y_i = 1 \mid x_i) = \Phi[(x_i\beta - a_e)\exp(-\delta x_i)]$$

on unconnected candidates. In our preferred specification for $\sigma_v$, we then include variables that are statistically significant as well as expected numbers of connections $E n_{iS}$, $E n_{iW}$. We include these expected numbers given their critical role in ensuring the exogeneity of actual connections. We exclude other variables. Our preferred specification includes the following 10 observables: expected number of strong connections, expected number of weak connections, PhD students advised, AIS, age, gender, number of candidates at the exam, share of unconnected candidates at the exam, type of exam, and an indicator for whether the broad area is Humanities and Law. As discussed in Section VI.B and following Davidson & MacKinnon (1984), we also test whether this restricted model indeed explains the data as well as the non-restricted model.

Additional estimation results.
Results of the estimation of Model (9) for subsamples of AP candidates and FP candidates are presented in Table A1 and Table A2 respectively.

Table A1: Estimation of Model (9): AP candidates
Columns: Bias (Strong, Weak); Information (Strong, Weak)
Const.: 0.466∗∗ .∗∗ - -
Standard errors: (0.185) (0.245) - -
Strong: −∗∗∗ −∗∗ ∗∗∗ -0.002 − − −∗∗ −∗∗ ∗∗ −.∗∗ −∗∗∗
Standard errors: (0.021) (0.215) (0.041) (0.164)
PhD in Spain: − −.∗∗ ∗∗ ∗∗∗
Standard errors: (0.182) (0.281) (0.068) (0.174)
Age: − − − − −∗∗∗ −
Notes:
Estimation of Model (9). All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

Table A2: Estimation of Model (9): FP candidates
Columns: Bias (Strong, Weak); Information (Strong, Weak)
Const.: 0.366∗∗∗ ∗∗∗ - -
Standard errors: (0.065) (0.058) - -
Strong: −∗∗∗ − − − −∗∗ −∗∗
Standard errors: (0.036) (0.019) (0.051) (0.028)
PhD Committees: 0.007 0.038∗ ∗∗∗ ∗∗ ∗∗∗ − −∗∗∗
Standard errors: (0.032) (0.034) (0.047) (0.037)
PhD Students: − −∗∗ −∗∗ −∗∗ −∗∗∗ − − −∗∗∗
Standard errors: (0.006) (0.027) (0.025) (0.036)
Expected strong: 0.048∗∗ −∗∗∗ − −∗∗∗
Standard errors: (0.054) (0.031) (0.071) (0.034)
Notes: Estimation of Model (9). All specifications include controls for the full set of observable characteristics, expected number of connections of each type, and the baseline heteroskedasticity. Heteroskedastic probit estimates. Exam grouped effects. Standard errors clustered at the exam level are in parentheses. Stars (∗, ∗∗, ∗∗∗) denote increasing levels of statistical significance.

REFERENCES
Beaman, Lori and Jeremy Magruder. 2012. “Who Gets the Job Referral? Evidence from a Social Networks Experiment.” American Economic Review, 102(7): 3574-3593.
Bertrand, Marianne and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review, 94(4): 991-1013.
Bester, C. Alan and Christian B. Hansen. 2016. “Grouped Effects Estimators in Fixed Effects Models.” Journal of Econometrics,
Journal of Development Economics,
Journal of Financial Economics,
Journal of Labor Economics,
Review of Economics and Statistics, forthcoming.
Combes, Pierre-Philippe, Linnemer, Laurent and Michael Visser. 2008. “Publish or Peer-rich? The Role of Skills and Networks in Hiring Economics Professors.” Labour Economics, 15: 423-441.
Davidson, Russell and James G. MacKinnon. 1984. “Convenient Specification Tests for Logit and Probit Models.” Journal of Econometrics, 25: 241-262.
Engelberg, Joseph, Gao, Pengjie and Christopher A. Parsons. 2012. “Friends with Money.” Journal of Financial Economics,
American Journal of Sociology,
Journal of Economic Perspectives,
Clear and Convincing Evidence: Measurement of Discrimination in America, M. Fix and R. Struyk, eds. Urban Institute.
Hensvik, Lena and Oskar Nordstrom Skans. 2016. “Social Networks, Employee Selection, and Labor Market Outcomes.” Journal of Labor Economics,
Journal of Political Economy,
American Economic Journal: Applied Economics,
Econometrica,
Journal of Human Resources,
Journal of Political Economy,