PERSONALITY TRAITS AND THE MARRIAGE MARKET


ARNAUD DUPUY (CEPS/INSTEAD, MAASTRICHT SCHOOL OF MANAGEMENT AND IZA) AND ALFRED GALICHON (SCIENCES PO, PARIS, IZA AND CEPR)

Abstract.

Which and how many attributes are relevant for the sorting of agents in a matching market? This paper addresses these questions by constructing indices of mutual attractiveness that aggregate information about agents’ attributes. The first $k$ indices for agents on each side of the market provide the best approximation of the matching surplus by a $k$-dimensional model. The methodology is applied on a unique Dutch household survey containing information about education, height, BMI, health, attitude toward risk and personality traits of spouses.

We thank five anonymous referees, the Editor (Phil Reny), as well as Raicho Bojilov, Odran Bonnet, Xavier Gabaix, Jim Heckman, Zuzanna Kosowska-Stamirowska, Jean-Marc Robin, Marko Terviö, Bertrand Verheyden, Simon Weber and seminar participants at the University of Chicago, Université de Montréal, Paris 1 Panthéon–Sorbonne, University of Alicante, Tilburg University, Sciences Po, Harvard-MIT, Université de Lausanne, and the 2013 EEA meeting for their comments, and Lex Borghans for useful discussions about the DNB data. Jinxin He provided excellent research assistance. Dupuy warmly thanks ROA at Maastricht University, where part of this paper was written. Galichon’s research has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. 313699, and from FiME, Laboratoire de Finance des Marchés de l’Energie.

Accepted for publication by the Journal of Political Economy, Volume 122, Number 6, December 2014. Article DOI: https://doi.org/10.1086/677191.

1. Introduction

Marriage, understood in a broad sense, is probably one of the most important factors for happiness. It also plays an important role in the generation of welfare and its redistribution across individuals. An in-depth understanding of marriage patterns is therefore of crucial importance for the study of a wide range of economic issues. A growing body of the economic literature studies the determinants of marriage, seen as a competitive matching market, both empirically and theoretically. This literature draws insights from the seminal model of the marriage market developed by Becker (1973). At the heart of Becker’s theory lies a two-sided assignment model with transferable utility where agents on both sides of the market (men and women) are characterized by a set of attributes only partly observed by the researcher. Each agent aims at matching with a member of the opposite sex so as to maximize his or her own payoff. This model is particularly interesting since, under certain conditions, one can identify and estimate features of agents’ preferences. A central question in this market is: which and how many attributes are relevant for the sorting of agents?

Footnote: See e.g. Stutzer and Frey (2006) and Zimmermann and Easterlin (2006).

A large body of literature has focused on the identification and estimation of preferences in the marriage market and in other matching markets; however, it has been constrained by some methodological limitations regarding the quantitative methods available to identify and estimate features of the joint utility function. In the current state of the art, no estimation tool can handle sorting on multiple continuous attributes in a convenient manner. Until recently, most empirical literature assumed that sorting occurs on a single continuous dimension, that is, a single aggregating index. Another limitation of the current empirical literature is related to the set of observable attributes available in the data. Most studies solely have access to data on education and earnings, and only a few observe other dimensions such as anthropometric measures captured by height and BMI or self-assessed measures of health (Chiappori, Oreffice and Quintana-Domeque, 2012 and Oreffice and Quintana-Domeque, 2010 are notable exceptions).

Footnote: For the marriage markets, see among others Becker (1991), Wong (2003), Anderberg (2004), Choo and Siow (2006), Browning et al. (2013), Chiappori and Oreffice (2008), Hitsch et al. (2010), Chiappori et al. (2010), Chiappori et al. (2012), Oreffice and Quintana-Domeque (2010), Bruze (2011), Charles et al. (2013), Echenique et al. (2013), and Jacquemet and Robin (2013). For other markets, see e.g. Fox (2010, 2011), Terviö (2008), Gabaix and Landier (2008).

Footnote: Recently, two papers have studied markets where sorting occurs on more than one dimension. Coles and Francesconi (2011) and Chiappori et al. (2012) study sorting on a single continuous index and a binary variable. Nesheim (2012) focuses on the identification of multivariate hedonic models without heterogeneity, and based on the observation of the price.

In the present paper, we contribute to the literature on three accounts.

First, on the modeling front, we extend: (i) the Choo and Siow matching model to account for possibly continuous multivariate attributes, and (ii) Galichon and Salanié’s (2010, 2013) surplus estimator of the Choo and Siow model to the continuous case. Extending Choo and Siow’s model to continuous regressors is an important [...]

Footnote: Neither of these two papers allows for continuous observable characteristics.

[...] $k$ indices (for males and females) provide a convenient approximation of the joint utility by a model where attributes are vectors of only $k$ dimensions. As a consequence, one can perform inference on the number of dimensions that are required to explain the equilibrium sorting by testing how many singular values differ from zero.

Third, on the empirical front, we make use of a dataset that allows us to observe a wide range of attributes of both spouses.
The set of attributes we observe in the data includes socio-economic variables such as education, anthropometric measures such as height and BMI, a measure of self-assessed health, as well as psychometric attributes such as risk aversion and the “Big Five” personality traits well-known in Psychology: conscientiousness, extraversion, agreeableness, emotional stability and autonomy. This paper is, to the extent of our knowledge, the first attempt to evaluate the importance of personality traits in the sorting of men and women in the marriage market. We will show that although education explains 28% of a couple’s observable joint utility, personality traits explain another 17%, and different personality traits matter differently for men and for women. Our results relate to the literature showing the importance of personality traits in making economic decisions (Borghans et al., 2008 for instance). Bowles et al. (2001) and Mueller and Plug (2006), among others, have shown the importance of personality traits for earnings inequality. Closer to our focus, Lundberg (2012) studies the impact of personality traits on the odds in and out of a relationship (marriage and divorce) and finds empirical evidence that personality traits significantly affect the extensive margin in the marriage market. In particular, conscientiousness increases the probability of marriage at the age of 35 for men and extraversion increases the odds of marriage at the age of 35 for women. In the present work, we study the intensive margin, that is, to whom conscientious men and extraverted women are the most attractive. We show among other things that conscientious men have preferences for conscientious women whereas extraverted women have preferences for autonomous and less agreeable men.

The outline of the rest of the paper is as follows.
Section 2 presents an important extension of the model of Choo and Siow to continuously distributed observables. Section 3 deals with parametric estimation of the joint utility function in this setting. Section 4 presents a methodology for deriving indices of mutual attractiveness that determine the principal dimensions on which sorting occurs. The problem of inferring the number of dimensions on which sorting occurs is dealt with in Section 5. Section 6 presents the data used for our empirical estimation and Section 7 discusses the results. Section 8 concludes.

2. The Continuous Choo and Siow model

2.1. The Becker-Shapley-Shubik model of marriage.

The setting is a one-to-one, bipartite matching model with transferable utility. Men and women are characterized by vectors of attributes, respectively denoted $x \in \mathcal{X} = \mathbb{R}^{d_x}$ for men and $y \in \mathcal{Y} = \mathbb{R}^{d_y}$ for women. Matched men and women are by definition in equal number; we let $P$ and $Q$ be the respective probability distributions of their attributes. Throughout the paper, $P$ and $Q$ are treated as exogenous, except in Appendix D where we show that incorporating singles leaves the analysis unchanged while allowing us to identify reservation utilities. $P$ and $Q$ are assumed to have densities with respect to the Lebesgue measure, denoted respectively $f$ and $g$. Without loss of generality, it is assumed throughout that $P$ and $Q$ are centered distributions, that is $E_P[X] = E_Q[Y] = 0$.

Footnote: While we present the case with continuous distributions for $x$ and $y$, our framework easily extends to the case where some dimensions of $x$ and $y$ are discrete.

A matching is the probability density $\pi(x, y)$ of occurrence of a couple with characteristics $(x, y)$ from the matched population. Quite obviously, this imposes that the marginals of $\pi$ should be $P$ and $Q$. Write $\pi \in \mathcal{M}(P, Q)$, where

$\mathcal{M}(P, Q) = \left\{ \pi : \pi(x, y) \geq 0,\ \int_{\mathcal{Y}} \pi(x, y)\,dy = f(x) \text{ and } \int_{\mathcal{X}} \pi(x, y)\,dx = g(y) \right\}.$

Let $\Phi(x, y)$ be the joint utility generated when a man $x$ and a woman $y$ match, which is shared endogenously between them. Let $\Phi(x, \emptyset)$ and $\Phi(\emptyset, y)$ be the utility of man $x$ and woman $y$ respectively if they remain single; in Appendix D, we shall show that $\Phi(x, \emptyset)$ and $\Phi(\emptyset, y)$ are identified if and only if the populations of singles are observed, but that the identification of $\Phi(x, y)$ is not impeded if singles are not observed. In the rest of the paper we shall assume that only the matched population is observed, so we will not focus on $\Phi(x, \emptyset)$ and $\Phi(\emptyset, y)$; as a result, the matching surplus $\Phi(x, y) - \Phi(x, \emptyset) - \Phi(\emptyset, y)$ will not be identified.
Shapley and Shubik (1972) have shown that the equilibrium matching $\pi$ maximizes the total utility

(2.1) $\max_{\pi \in \mathcal{M}(P,Q)} E_\pi[\Phi(X, Y)].$

Optimality condition (2.1) leads to very strong restrictions on $(X, Y)$, which are rarely met in practice. We need to incorporate some amount of unobserved heterogeneity in the model.

2.2. Adding heterogeneities.

Bringing the model to the data requires the additional step of acknowledging that sorting might also occur on attributes that are unobserved to the econometrician. In the case where men and women’s attributes are discrete, Choo and Siow (2006) introduced unobservable heterogeneities into the matching problem by considering that if a man $m$ of attributes $x_m = x$ and a woman $w$ of attributes $y_w = y$ match, they create a joint utility $\Phi(x, y) + \varepsilon_m(y) + \eta_w(x)$, where $\varepsilon_m(y)$ and $\eta_w(x)$ are unobserved random “sympathy shocks” drawn by individuals. Assuming that $(\varepsilon_m(y))_y$ and $(\eta_w(x))_x$ have i.i.d. centered Gumbel (extreme value type I) distributions with scaling parameter $\sigma/2$, Choo and Siow have shown that the joint utility $\Phi(x, y)$ can be split into $\Phi(x, y) = U(x, y) + V(x, y)$ such that the utility of man $m$ matching with a woman of type $y$ is given by $U(x, y) + \varepsilon_m(y)$ and, symmetrically, the utility of woman $w$ matching with a man of type $x$ is given by $V(x, y) + \eta_w(x)$. An important implication of this setting is that at equilibrium, agents are indifferent between partners with the same observable attributes: the matching utility of man $m$ at equilibrium depends only on the observable attributes of his partner. As a consequence, each agent in the market solves a discrete choice problem.

Footnote: In Appendix D, this will be shown to be a consequence of the Independence of Irrelevant Alternatives (IIA) property of the logit model.

Footnote: A basic result in the theory of optimal transportation (Brenier’s theorem) implies that when $\Phi(x, y) = x'Ay$, the optimal matching is characterized by $(AY)_i = \partial V(X)/\partial x_i$ where $V$ is some convex function. Hence as soon as $A$ is invertible, the matching is pure, in the sense that no two men of the same type may marry women of different types. This is obviously never observed in the data.

In the Choo and Siow model, partners are assumed to have i.i.d. Gumbel sympathy shocks for the discrete attributes of the opposite side of the market. However, in many applied settings, these attributes are continuous random vectors, and even though the data that the analyst handles are obviously discretized, there is a strong need for a continuous framework. To illustrate, we shall take a setting where only the height of the partners is relevant, and assume that the precision of the measure is poor, say it is rounded to the nearest foot. A direct implication of the Choo and Siow assumptions is that individuals’ sympathy shocks are perfectly correlated within a foot bracket, and perfectly independent across feet.
Suppose instead that height is measured to the nearest inch. Choo and Siow’s assumptions would now imply that individuals’ sympathy shocks are perfectly correlated within an inch bracket, and perfectly independent across inches, which is of course at odds with the previous assumptions. So, while it is always possible to apply the Choo and Siow setting to the discretized data, this implicitly leads to ad hoc assumptions which depend on the level of discretization of the available data.

In the present paper, we shall present an application where $x$ and $y$ measure height, BMI and various personality traits, which have a continuous multivariate distribution. Hence we need to model the random processes for $\varepsilon_m(y)$ and $\eta_w(x)$ accordingly. A legitimate candidate in the wake of Choo and Siow’s approach is the continuous logit model. Although very natural and particularly tractable, this setting has been surprisingly little used in economic modeling, with some notable exceptions. McFadden (1976) initiated the literature of continuous logit models by extending the definition of Independence of Irrelevant Alternatives (IIA) beyond finite sets. Ben-Akiva and Watanatada (1981) and Ben-Akiva, Litinas and Tsunekawa (1985) define continuous logit models by taking the limits of the discrete choice probabilities, with applications in particular to the context of spatial choice models. Cosslett (1988) and Dagsvik (1988) have independently suggested using max-stable processes to model continuous choice. We base our approach on their insights.

Assume that each man $m$ of type $x_m = x$ only knows a random subset of the total population of women, which we will call “acquaintances”, and that man $m$ only considers potential partners from his set of acquaintances. These acquaintances are indexed by $k \in \mathbb{N}$, and their observable attributes are represented by $y_k^m$.
Each of these acquaintances is associated with a random “sympathy shock” $\varepsilon_k^m$ which enters additively into the man’s utility, so that the utility of a man $m$ who marries a woman of attributes $y_k^m$ can be written as

(2.2) $U(x, y_k^m) + \frac{\sigma}{2}\,\varepsilon_k^m,$

where $U(x, y)$ is the “systematic” (in Choo and Siow’s term) part of the utility obtained by man $x$ matching with a woman with attributes $y$, whose existence and characterization will be provided in Theorem 1 below. Note that in contrast with the original setting of Choo and Siow described above, men do not have access to the whole population of women, but only to their randomly selected set of acquaintances, which is a subset of the whole population.

We have yet to specify the distribution of $y_k^m$ and $\varepsilon_k^m$. Following Cosslett and Dagsvik’s idea, we assume that $\{(y_k^m, \varepsilon_k^m),\ k \in \mathbb{N}\}$ are the points of a Poisson process on $\mathcal{Y} \times \mathbb{R}$ of intensity $dy \times e^{-\varepsilon}\,d\varepsilon$. This means that: (i) the probability that man $m$ has an acquaintance whose observable attributes are in a small set of infinitesimal size $dy$ around $y$ and with sympathy shock in a set of infinitesimal size $d\varepsilon$ around $\varepsilon$ is $e^{-\varepsilon}\,d\varepsilon\,dy$, and (ii) letting $S$ and $S'$ be two disjoint subsets of $\mathcal{Y} \times \mathbb{R}$, the events “$m$ has an acquaintance in $S$” and “$m$ has an acquaintance in $S'$” are independent. According to the standard theory of Poisson point processes, this implies that, for $S$ a subset of $\mathcal{Y} \times \mathbb{R}$, the probability that man $m$ has no acquaintance in set $S$ is $\exp\left(-\int_S e^{-\varepsilon}\,dy\,d\varepsilon\right)$. In Appendix A we show that this yields a continuous version of the multinomial logit choice model.

Footnote: As explained in Appendix A, it will result from the distributional assumptions that each man draws an infinite but countable number of acquaintances almost surely, so that these can be indexed by the set of integers.
As a result, the probability distribution of man $m$ choosing a woman with attributes $y$ is given by its density of probability

(2.3) $\pi_{Y|X}(y|x) = \dfrac{\exp\left(\frac{U(x,y)}{\sigma/2}\right)}{\int_{\mathcal{Y}} \exp\left(\frac{U(x,y')}{\sigma/2}\right) dy'},$

which is clearly the extension of the logit formalism to the continuous choice setting. Similarly, the utility of a woman $w$ with attributes $y_w = y$ who marries a man with attributes $x_l^w$ is

(2.4) $V(x_l^w, y) + \frac{\sigma}{2}\,\eta_l^w,$

where $V(x, y)$ is the systematic part of the utility, and $\{(x_l^w, \eta_l^w),\ l \in \mathbb{N}\}$ are the points of a Poisson process on $\mathcal{X} \times \mathbb{R}$ of intensity $dx \times e^{-\eta}\,d\eta$, so that the probability distribution of woman $w$ choosing a man with attributes $x$ is given by its density of probability

(2.5) $\pi_{X|Y}(x|y) = \dfrac{\exp\left(\frac{V(x,y)}{\sigma/2}\right)}{\int_{\mathcal{X}} \exp\left(\frac{V(x',y)}{\sigma/2}\right) dx'}.$

The continuous logit framework inherits the structural assumptions of the discrete multinomial logit model. In particular, it inherits the independence property, which implies that the sympathy shock for women whose attributes are in a small set around $y$ is perfectly uncorrelated with the sympathy shock for women whose attributes are in a small set around $y' \neq y$. Hence the logit framework (continuous or discrete) does not allow for a systematic sympathy shock, i.e. correlated sympathy shocks across observables. In the example where the attribute of interest is height, it may be desirable to accommodate a random sympathy shock for height (some men prefer taller women, some prefer shorter women, regardless of their own observable attributes). We conjecture, however, that if the amount of variation of the unobserved heterogeneity is small, the misspecification of the sympathy shocks has only a minor impact on the market outcome and the identification of the joint utility.

Taking the logarithm of Equations (2.3) and (2.5) respectively yields $U - a = (\sigma/2) \log \pi$ and $V - b = (\sigma/2) \log \pi$, where

(2.6) $a(x) = \dfrac{\sigma}{2} \log \displaystyle\int_{\mathcal{Y}} \frac{e^{U(x,y')/(\sigma/2)}}{f(x)}\,dy'$ and $b(y) = \dfrac{\sigma}{2} \log \displaystyle\int_{\mathcal{X}} \frac{e^{V(x',y)/(\sigma/2)}}{g(y)}\,dx',$

and since $\Phi = U + V$, one obtains by summation

(2.7) $\log \pi(x, y) = \dfrac{\Phi(x, y) - a(x) - b(y)}{\sigma}.$

We formalize this result in Theorem 1, which extends Galichon and Salanié (2010) to the continuous case.
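The step from (2.3) and (2.5) to (2.7) can be spelled out as follows. This is only a sketch of the algebra: it assumes the factorizations $\pi(x,y) = f(x)\,\pi_{Y|X}(y|x) = g(y)\,\pi_{X|Y}(x|y)$, which follow from $\pi \in \mathcal{M}(P,Q)$, and the $\sigma/2$ scaling of the sympathy shocks.

```latex
\begin{align*}
\log \pi(x,y)
  &= \log f(x) + \log \pi_{Y|X}(y|x)
   = \frac{U(x,y)}{\sigma/2}
     - \log \int_{\mathcal{Y}} \frac{e^{U(x,y')/(\sigma/2)}}{f(x)}\,dy'
   = \frac{U(x,y) - a(x)}{\sigma/2}, \\
\log \pi(x,y)
  &= \log g(y) + \log \pi_{X|Y}(x|y)
   = \frac{V(x,y)}{\sigma/2}
     - \log \int_{\mathcal{X}} \frac{e^{V(x',y)/(\sigma/2)}}{g(y)}\,dx'
   = \frac{V(x,y) - b(y)}{\sigma/2}, \\
\sigma \log \pi(x,y)
  &= \bigl(U(x,y) - a(x)\bigr) + \bigl(V(x,y) - b(y)\bigr)
   = \Phi(x,y) - a(x) - b(y).
\end{align*}
```

The last line adds the first two after multiplying each by $\sigma/2$, and uses $\Phi = U + V$.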

Theorem 1. Under the assumptions stated above, the following holds:

(i) The equilibrium matching $\pi$ maximizes the social gain

(2.8) $\max_{\pi \in \mathcal{M}(P,Q)} \displaystyle\iint_{\mathcal{X} \times \mathcal{Y}} \Phi(x, y)\,\pi(x, y)\,dx\,dy - \sigma \iint_{\mathcal{X} \times \mathcal{Y}} \log \pi(x, y)\,\pi(x, y)\,dx\,dy.$

(ii) In equilibrium, for any $x \in \mathcal{X}$, $y \in \mathcal{Y}$,

(2.9) $\pi(x, y) = \exp\left(\dfrac{\Phi(x, y) - a(x) - b(y)}{\sigma}\right),$

where the potentials $a(x)$ and $b(y)$ are determined such that $\pi \in \mathcal{M}(P, Q)$. They exist and are uniquely determined up to a constant.

(iii) A man $m$ of attributes $x$ who marries a woman $k^*$ from his set of acquaintances obtains utility

(2.10) $U(x, y_{k^*}^m) + \dfrac{\sigma}{2}\,\varepsilon_{k^*}^m = \max_k \left( U(x, y_k^m) + \dfrac{\sigma}{2}\,\varepsilon_k^m \right),$

where

(2.11) $U(x, y) = \dfrac{\Phi(x, y) + a(x) - b(y)}{2}.$

Similarly, a woman $w$ of attributes $y$ who marries man $l^*$ from her set of acquaintances obtains utility

(2.12) $V(x_{l^*}^w, y) + \dfrac{\sigma}{2}\,\eta_{l^*}^w = \max_l \left( V(x_l^w, y) + \dfrac{\sigma}{2}\,\eta_l^w \right),$

where

(2.13) $V(x, y) = \dfrac{\Phi(x, y) - a(x) + b(y)}{2}.$

As in Galichon and Salanié (2010; 2013) and, independently, Decker et al. (2013), part (i) of this result expresses the fact that the equilibrium matching reflects a trade-off between sorting on the observed attributes (which tends to maximize the term $\iint \Phi(x, y)\,\pi(x, y)\,dx\,dy$) and sorting on the unobserved attributes (which in turn tends to maximize the entropic term $-\iint \log \pi(x, y)\,\pi(x, y)\,dx\,dy$). The second term will therefore pull the solution towards the random matching, where partners are randomly assigned; the parameter $\sigma$, which captures the intensity of the unobserved heterogeneity, measures the intensity of this trade-off. The smaller the $\sigma$ (i.e. the less unobserved heterogeneity in the model), the closer the solution will be to the solution without heterogeneity. On the contrary, the higher the $\sigma$, the larger the probabilistic independence between the observed attributes of men and women. As an illustration, we consider the simple toy example below, in which this phenomenon is explicit.

Example 1.

When $P$ and $Q$ are the standard univariate Gaussian distribution $N(0, 1)$, and $\Phi(x, y) = -(x - y)^2/2$, the equilibrium matching $\pi$ is such that $\pi_{Y|X}(y|x) = N(tx, 1 - t^2)$, where $t = \sqrt{(\sigma/2)^2 + 1} - \sigma/2$. Hence, $\sigma = 0$ implies $t = 1$, in which case $Y = X$ (sorting predominates and we have positive assortative matching), while $\sigma \to \infty$ implies $t \to 0$, in the limit of which $Y$ becomes independent from $X$ (unobserved heterogeneity predominates, there is no sorting on observables). Closed-form formulae can also be provided in the multivariate case when $P$ and $Q$ are Gaussian and $\Phi$ is quadratic, see Bojilov and Galichon (2013).

Part (ii) of Theorem 1 is an expression of the first order optimality conditions. The program is an infinite dimensional linear programming problem where $a(x)$ and $b(y)$ are the Lagrange multipliers corresponding to the constraints $\int \pi(x, y)\,dy = f(x)$ and $\int \pi(x, y)\,dx = g(y)$ respectively. Equation (2.9), or more precisely its logarithmic transform Equation (2.7), will be the basis of our estimation strategy. Together with the constraint $\pi \in \mathcal{M}(P, Q)$, this equation provides a nonlinear system of equations in $a(\cdot)$ and $b(\cdot)$. In the applied mathematical literature it is known as the Schrödinger-Bernstein equation, or more commonly as the Schrödinger problem. Existence and uniqueness (up to a constant) are well studied under very general conditions on $P$ and $Q$, see for instance Rüschendorf and Thomsen (1993) and references therein. An efficient algorithm for the numerical determination of the solution based on a fixed point idea is studied in Rüschendorf (1995). For completeness it is further explained in Appendix C.

Part (iii) of Theorem 1 explains how the joint utility is shared at equilibrium. Unsurprisingly, just as in Choo and Siow (2006) and the ensuing literature, individuals do not transfer their sympathy shock at equilibrium, which is expressed by (2.10) and (2.12). Expressions (2.11) and (2.13) provide the formulae for the systematic part of the utility. As previously noted, $a(x)$ and $b(y)$ are the Lagrange multipliers of the scarcity constraints of men’s observable attributes $x$ and women’s attributes $y$. Hence a higher $a(x)$ shall imply a higher relative scarcity for $x$, and therefore a greater prospect for utility extraction.

Identification. From an identification perspective, note that equations (2.3) and (2.5) imply that the observation of $\pi(x, y)$ leads to identification of $U(x, y)$ up to an additive term $c(x)$, and similarly, $V(x, y)$ up to an additive term $d(y)$, by

$U(x, y) = \dfrac{\sigma}{2}\left(\log \pi_{Y|X}(y|x) + c(x)\right)$ and $V(x, y) = \dfrac{\sigma}{2}\left(\log \pi_{X|Y}(x|y) + d(y)\right),$

and thus

$\Phi(x, y) = \dfrac{\sigma}{2}\left(\log \pi_{Y|X}(y|x) + \log \pi_{X|Y}(x|y) + c(x) + d(y)\right).$

As a result, $\Phi(x, y)$ is identified up to an additively separable function, since we restrict our attention to the matched population. Since $\Phi(x, y)$ yields the same equilibrium matching as $\Phi(x, y) + c(x) + d(y)$, the identified quantity is actually the cross-derivative $\partial^2 \Phi(x, y)/\partial x \partial y$, while neither $\partial \Phi(x, y)/\partial x$ nor $\partial \Phi(x, y)/\partial y$ can be identified, nor can their signs. To illustrate, assume that there is only one dimension: education.
It may be that men and women who are more educated generate more utility, which we call “absolute attractiveness,” and which translates into $\partial \Phi(x, y)/\partial x \geq 0$ and $\partial \Phi(x, y)/\partial y \geq 0$. However, this is not identifiable in our model, because models where the joint utility is $\Phi(x, y)$ are observationally indistinguishable from models where the joint utility is $\Phi(x, y) + c(x) + d(y)$, and the terms $c$ and $d$ might be strongly negatively correlated with education. Instead, the present framework allows us to determine whether education is mutually attractive in the sense that $\partial^2 \Phi(x, y)/\partial x \partial y \geq 0$, meaning not only that highly educated men and women attract each other, but also that lower educated men and women attract each other. Hence, our model allows us to measure the strength of mutual attractiveness (or assortativeness) on various dimensions, but not absolute attractiveness.

Footnote: Appendix D explains how these results are extended when singles are observed.
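The fixed point idea mentioned in the discussion of part (ii) of Theorem 1 can be illustrated on a grid. The sketch below is not the algorithm of Appendix C. It is a plain iterative proportional fitting loop on a discretized version of Example 1, and the grid, the number of iterations, the helper name `ipfp` and the shifts `c` and `d` are all assumptions made for illustration. It compares the model covariance with the closed-form $t$ of Example 1, and checks that adding $c(x) + d(y)$ to $\Phi$ leaves the matching unchanged, as stated in the identification paragraph.

```python
import numpy as np

def ipfp(phi, p, q, sigma, n_iter=500):
    """Iterative proportional fitting for pi = exp((phi - a - b) / sigma).

    phi  : (n, m) array, joint utility evaluated on a grid
    p, q : marginal probability weights on the two grids (each sums to 1)
    Returns the (n, m) matching whose marginals are p and q.
    """
    K = np.exp(phi / sigma)
    u = np.ones_like(p)   # plays the role of exp(-a / sigma), up to grid weights
    v = np.ones_like(q)   # plays the role of exp(-b / sigma), up to grid weights
    for _ in range(n_iter):
        u = p / (K @ v)       # rescale rows to match the men's marginal
        v = q / (K.T @ u)     # rescale columns to match the women's marginal
    return u[:, None] * K * v[None, :]

# Discretized Example 1: standard Gaussian marginals, Phi(x, y) = -(x - y)^2 / 2.
grid = np.linspace(-5.0, 5.0, 201)
w = np.exp(-0.5 * grid**2)
w /= w.sum()
sigma = 1.0
phi = -0.5 * (grid[:, None] - grid[None, :]) ** 2

pi = ipfp(phi, w, w, sigma)
cov_xy = grid @ pi @ grid                        # E_pi[X Y] under the fitted matching
t_closed = np.sqrt((sigma / 2) ** 2 + 1) - sigma / 2

# Adding c(x) + d(y) to Phi should not change the equilibrium matching.
c = 0.3 * grid**2     # arbitrary illustrative shift in x
d = -0.7 * grid       # arbitrary illustrative shift in y
pi_shifted = ipfp(phi + c[:, None] + d[None, :], w, w, sigma)

print("model covariance:", cov_xy, "closed-form t:", t_closed)
print("max difference after shift:", np.abs(pi - pi_shifted).max())
```

Since both marginals have (approximately) unit variance, the covariance under the fitted matching should be close to $t$. For small $\sigma$ the kernel `K` can underflow, and a log-domain version of the same loop would be the safer choice.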

3. Parametric estimation

3.1. Specification of the matching utility.

In this section, we shall specify a parametric form for the joint utility function, the estimation of which shall be discussed in the next section. While Choo and Siow’s estimator is fully nonparametric, the fact that the variables under study are continuous reinforces the need for a parametric estimator. For the purpose of this discussion, we shall look back at the illustrative example from the introduction where only both partners’ heights are observed. The Choo and Siow analysis provides a nonparametric estimator for the joint utility $\Phi(x, y)$ of a match between a man of height $x$ and a woman of height $y$. If heights were to be rounded to the nearest inch and individuals’ heights in inches ranged, say, from 50 to 90, then the dimension of vector $\Phi(x, y)$ would be $40 \times 40 = 1600$. Note that this would worsen significantly if several characteristics were observed. But even in the single-dimensional case, there would be a serious missing data problem, since the odds that one would observe data for every pair of heights are virtually zero. Moreover, even if one were lucky enough to obtain the full nonparametric estimator of $\Phi(x, y)$, one would have to heavily process this information before being able to draw any stylized conclusion. This simple example highlights the need for a parametric estimation when considering continuous variables.

Throughout the rest of the paper, we shall assume a quadratic parametrization of $\Phi$: for $A$ a $d_x \times d_y$ matrix, we take

$\Phi_A(x, y) = x'Ay,$

where we call matrix $A$ the affinity matrix. One has

$A_{ij} = \dfrac{\partial^2 \Phi(x, y)}{\partial x_i \partial y_j}.$

The parameter $A_{ij}$ accounts for the strength of mutual attractiveness (which can be positive or negative) between dimensions $x_i$ and $y_j$. It measures how the (marginal) gain in joint utility from increasing the man’s $i$-th attribute evolves as the woman’s $j$-th attribute increases, that is, the interaction between attribute $x_i$ of man $x$ and attribute $y_j$ of woman $y$ in the joint utility.

Two comments about this parametric choice are noteworthy. First, this parametric choice is arguably the simplest one which captures nontrivial complementarities between any pair of attributes. If $A_{ij} > 0$, $x_i$ and $y_j$ are complements, and (all things else being equal) high $x_i$ tend to match with high $y_j$. It reflects positive assortative matching across men’s $i$-th attribute and women’s $j$-th attribute. For instance, the level of education of one of the partners may be complementary with the risk aversion of the other partner. On the contrary, if $A_{ij} < 0$, then $x_i$ and $y_j$ are substitutes, and there is negative assortative matching between $x_i$ and $y_j$. Note that attributes $x$ and $y$ should not be interpreted as an absolute quality (where a greater value of $x_i$, the $i$-th dimension of $x$, would be more socially desirable than a smaller value of $x_i$). In fact, the model is observationally indistinguishable from a model where $x$ is changed into $-x$ and $y$ is changed into $-y$.

Second, this quadratic setting where $\Phi$ is bilinear in $x$ and $y$ is less restrictive than it seems and can be extended to the case when the various observed attributes have nonlinear contributions to the joint utility. For instance it may be plausible that extraverted men are indifferent about the education and the height of their wives, but that if a woman is tall, then men prefer her with a higher education. Our setting can easily be extended to incorporate such nonlinear effects. We assume no restrictions on the attributes that enter $x$ and $y$, so that the observables can be enriched by the addition of nonlinear functions of them, i.e. adding $x_i^2$, $x_i^3$, etc. and $x_i x_j$ as observable attributes for men and similarly for women. This will allow $\Phi(x, y)$ to be any polynomial function of $x$ and $y$. Thus, our setting can easily incorporate any utility function which is a polynomial expression of the observable attributes.

Footnote: We thank a Referee for pointing this out.
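The enrichment of observables described in the second comment can be sketched as follows. The helper `augment` is hypothetical and only adds squares and pairwise products (higher powers would be handled the same way), the data are simulated, and the matrix `A_aug` is a random placeholder rather than an estimate. The augmented attributes are re-centered, in line with the zero-mean convention of Section 2.1.

```python
import numpy as np

def augment(z):
    """Append squares and pairwise products to an (n, d) attribute array,
    then re-center every column (hypothetical helper, for illustration)."""
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(z.shape[1], k=1)          # index pairs with i < j
    feats = np.hstack([z, z**2, z[:, i] * z[:, j]])
    return feats - feats.mean(axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))     # simulated attributes of men
y = rng.normal(size=(500, 3))     # simulated attributes of women

x_aug = augment(x)                # 2 linear + 2 squares + 1 product = 5 columns
y_aug = augment(y)                # 3 linear + 3 squares + 3 products = 9 columns

# A bilinear form in the augmented attributes is a polynomial in the originals.
A_aug = rng.normal(size=(x_aug.shape[1], y_aug.shape[1]))   # placeholder matrix
phi = np.einsum("ni,ij,nj->n", x_aug, A_aug, y_aug)         # x_aug' A_aug y_aug per couple
```

The estimation machinery is unchanged: only the dimension of the affinity matrix grows with the number of added terms.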

3.2. Inference.

We turn to the estimation of the affinity matrix $A$. The technique we apply here was introduced by Galichon and Salanié (2010); we discuss this extension to the continuous case. By taking the cross-derivative of Equation (2.7), one has

(3.1) $\dfrac{A_{ij}}{\sigma} = \dfrac{\partial^2 \log \pi(x, y)}{\partial x_i \partial y_j}.$

A seemingly natural procedure would consist in estimating $\pi$ nonparametrically, and obtaining $A$ from the cross derivatives with respect to $x_i$ and $y_j$. While feasible, this procedure faces a number of issues both in theory and in practice. First, it requires a nonparametric estimation of the second derivatives of the loglikelihood, which is quite challenging: the “curse of dimensionality” would fully apply. Second, since equation (3.1) is valid at any point $(x, y)$, this equation is an over-identifying restriction to the estimation of $A$. The right hand side of (3.1) depends on $(x, y)$, while the left hand side does not. One may certainly take some averaging of the right hand side of Equation (3.1), but it is not quite obvious how to weigh each point optimally, and it would only partially offset the problems stemming from the curse of dimensionality. As a result, this procedure will be statistically inefficient.

Footnote: In our application, both $x$ and $y$ have 10 dimensions, so $(x, y)$ is of dimension 20.

Instead, we prefer to resort to a moment matching procedure, which is relatively simple while achieving asymptotic statistical efficiency, as shown in Theorem 2 below. Let us provide intuition for this method. Each value of the matrix $A$ yields an equilibrium matching distribution, which we denote $\pi_A(x, y)$. As argued in Appendix C, $\pi_A$ can be computed efficiently using a fixed point method. Recall that we have assumed that the distributions of $X$ and $Y$ have zero mean, and introduce the cross-covariance matrix

(3.2) $\Sigma_{XY} = (E[X_i Y_j])_{ij} = E[XY'],$

which is observed in the data. The idea is to look for the value of $A$ such that for all $i$ and $j$, the covariances predicted by the model match the covariances observed in the data, that is

(3.3) $E_{\pi_A}[X_i Y_j] = E[X_i Y_j].$

This yields a map $A \to (E_{\pi_A}[X_i Y_j])_{ij}$ that is invertible. The inversion of this map (in order to estimate $A$) can be formulated as a convex optimization problem, thus making it easy to solve numerically. To see this, we shall recall that the equilibrium matching $\pi$ maximizes the social gain

(3.4) $\mathcal{W}_\sigma(A) := \max_{\pi \in \mathcal{M}(P,Q)} E_\pi[X'AY] - \sigma E_\pi[\ln \pi(X, Y)],$

and we see that models with parameters $(A, \sigma)$ and models with parameters $(A/\sigma, 1)$ yield the same equilibrium matching, since $\mathcal{W}_\sigma(A) = \sigma \mathcal{W}(A/\sigma)$, where $\mathcal{W} := \mathcal{W}_1$. By the envelope theorem, the predicted covariance between $X_i$ and $Y_j$ coincides with the partial derivative of $\mathcal{W}_\sigma$ with respect to $A_{ij}$, that is

(3.5) $E_{\pi_A}[X_i Y_j] = \dfrac{\partial \mathcal{W}_\sigma}{\partial A_{ij}}(A) = \dfrac{\partial \mathcal{W}}{\partial A_{ij}}(A/\sigma),$

which implies that, upon normalization $\sigma = 1$, the map $A \to (E_{\pi_A}[X_i Y_j])_{ij}$ is invertible since $\mathcal{W}$ is strictly convex (see Lemma 3). This led Galichon and Salanié (2013) to conclude, in a setting with discrete observable attributes, that $B = A/\sigma$ is identified as a solution to the following convex optimization program

(3.6) $\min_{B \in \mathcal{M}_{d_x d_y}(\mathbb{R})} \left\{ \mathcal{W}(B) - \sum_{ij} B_{ij} \Sigma_{XY}^{ij} \right\},$

whose first-order conditions are precisely $\partial \mathcal{W}(B)/\partial B_{ij} = \Sigma_{XY}^{ij}$, that is $E_{\pi_B}[X_i Y_j] = \Sigma_{XY}^{ij}$. In the present setting with continuous observable attributes, things work in an identical manner. Since the model is scale-invariant, only $A/\sigma$ is identified and we normalize $A$ so that $\|A\| = 1$, where $\|A\| = (\sum_{ij} A_{ij}^2)^{1/2}$. $A$ and $\sigma$ are then obtained by $A = B/\|B\|$ and $\sigma = 1/\|B\|$. Let us denote $A^{XY}$ the (unique) solution to this problem, which will be our estimator of the affinity matrix $A$. Affinity matrix $A^{XY}$ is “dual” to cross-covariance matrix $\Sigma_{XY}$ in the sense that there is a one-to-one correspondence between them by Equation (3.5). However, the former has a structural interpretation: it measures the strength of the interactions between pairs of attributes.

At this point, it is worth commenting on the relevance of the structural approach. Indeed, it does not suffice to just look at the variance-covariance matrix inside matches to infer the sign of complementarities, as illustrated by the following example. Imagine two observed characteristics, where the first dimension is education and the second dimension is risk aversion. Suppose we observe positive correlation in partners’ educations and in partners’ risk aversions (i.e., $\Sigma_{11} > 0$ and $\Sigma_{22} > 0$). This does not imply that $A_{11} > 0$ and $A_{22} > 0$. Indeed, if education and risk aversion are positively associated within individuals, it may be that $A_{22} < 0$ while the positive complementarity in education ($A_{11} > 0$) dominates the negative complementarity in risk aversion, thus leading to positive correlation in risk aversions inside matches. The structural approach makes it possible to avoid this misinterpretation by controlling for the marginal distributions (e.g. controlling for the fact that there is positive association between individuals’ education and risk aversion).

Once the affinity matrix $A^{XY}$ has been estimated, two questions arise. First, what is the rank of $A^{XY}$? This question is of importance since one would like to know the number of dimensions of $x$ and $y$ on which sorting occurs. Second, how can we construct “indices of mutual attractiveness” such that each pair of indices for men and women explains a mutually exclusive part of the matching utility? Many studies resort to a technique called “Canonical Correlation,” which essentially relies on a singular value decomposition of $\Sigma_{XY}$. In Dupuy and Galichon (2012), we argue that this technique is not well suited for studying assortative matching, and that the resulting procedure is inconsistent. Instead, in Section 4, we propose a method we call “Saliency Analysis” in order to accurately answer these two questions. This method is essentially based on the singular value decomposition of the affinity matrix $A^{XY}$ (instead of $\Sigma_{XY}$ as in Canonical Correlation). Testing the rank of the affinity matrix is equivalent to testing the number of (potentially multiple) singular values different from 0. Performing this decomposition allows one to construct the indices of mutual attractiveness that each explain a separate share of the joint utility.

4. Saliency Analysis

In this section we set out to determine the rank of the affinity matrix $A_{XY}$, and the principal dimensions in which it operates. For this, we introduce and describe a novel technique we call Saliency Analysis, which is similar in spirit to Canonical Correlation but does not suffer the pitfalls of the latter. Instead of performing a singular value decomposition of the (renormalized) cross-covariance matrix $\Sigma_{XY}$, we shall perform a singular value decomposition of the affinity matrix $A_{XY}$, properly renormalized. This idea is similar in spirit to the proposal of Heckman (2007), who interprets the assignment matrix as a sum of Cobb-Douglas technologies using a singular value decomposition in order to refine bounds on wages.

Recall that we have defined the cross-covariance matrix $\Sigma_{XY} = \mathbb{E}_\pi[XY']$, and let us introduce $S_X$ and $S_Y$ the diagonal matrices whose diagonal terms are respectively the variances of the $X_i$ and the $Y_j$, that is
$$S_X = \mathrm{diag}(\mathrm{var}(X_i),\ i = 1, ..., d_x), \qquad S_Y = \mathrm{diag}(\mathrm{var}(Y_j),\ j = 1, ..., d_y).$$
We shall work with the rescaled attributes $S_X^{-1/2} X$ and $S_Y^{-1/2} Y$, whose entries each have unit variance. By Lemma 1 in Appendix B.3, the affinity matrix between the rescaled attributes $S_X^{-1/2} X$ and $S_Y^{-1/2} Y$ is
$$\Theta = S_X^{1/2} A_{XY} S_Y^{1/2},$$
for which a singular value decomposition yields
$$\Theta = U' \Lambda V,$$
where $\Lambda$ is a diagonal matrix with nonincreasing elements $(\lambda_1, ..., \lambda_d)$, $d = \min(d_x, d_y)$, and $U$ and $V$ are orthogonal matrices. Define the vectors of indices of mutual attractiveness
$$\tilde{X} = U S_X^{-1/2} X \quad \text{and} \quad \tilde{Y} = V S_Y^{-1/2} Y,$$
and let $A_{\tilde{X}\tilde{Y}}$ be the affinity matrix on the rescaled vectors of characteristics $\tilde{X}$ and $\tilde{Y}$. From Lemma 1, it follows that $A_{\tilde{X}\tilde{Y}} = \Lambda$, and as a result
$$\Phi_A(x, y) = \sum_{i=1}^{d_x} \sum_{j=1}^{d_y} A_{ij} x_i y_j = \sum_{i=1}^{d} \lambda_i \tilde{x}_i \tilde{y}_i.$$
Hence, the new indices $\tilde{x}$ and $\tilde{y}$ are such that $\tilde{x}_i$ and $\tilde{y}_j$ are complements for $i = j$, and neither complements nor substitutes if $i \neq j$.
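The construction above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes NumPy is available, the function name `saliency_analysis` is hypothetical, and the 2×2 affinity matrix and variances are made-up inputs. It forms $\Theta$, takes its singular value decomposition, and returns the index weights (rows of $U S_X^{-1/2}$ and $V S_Y^{-1/2}$) together with the singular values.

```python
import numpy as np

def saliency_analysis(A, var_x, var_y):
    """Return (weights_x, weights_y, lam) for an affinity matrix A.

    var_x and var_y hold the attribute variances (diagonals of S_X and S_Y).
    """
    A = np.asarray(A, dtype=float)
    sx = np.sqrt(np.asarray(var_x, dtype=float))   # diagonal of S_X^{1/2}
    sy = np.sqrt(np.asarray(var_y, dtype=float))   # diagonal of S_Y^{1/2}
    theta = sx[:, None] * A * sy[None, :]          # Theta = S_X^{1/2} A S_Y^{1/2}
    # NumPy returns theta = P @ diag(lam) @ Qt with lam nonincreasing;
    # in the notation of the text, U = P' and V = Qt.
    P, lam, Qt = np.linalg.svd(theta, full_matrices=False)
    weights_x = P.T / sx[None, :]                  # rows of U S_X^{-1/2}
    weights_y = Qt / sy[None, :]                   # rows of V S_Y^{-1/2}
    return weights_x, weights_y, lam

# Hypothetical inputs, for illustration only.
A_demo = np.array([[0.5, 0.1], [0.2, 0.3]])
wx, wy, lam = saliency_analysis(A_demo, var_x=[4.0, 1.0], var_y=[1.0, 9.0])

# Check the identity Phi_A(x, y) = sum_i lam_i * x_tilde_i * y_tilde_i.
x = np.array([1.0, -2.0])
y = np.array([0.5, 3.0])
lhs = x @ A_demo @ y
rhs = np.sum(lam * (wx @ x) * (wy @ y))
```

In this sketch `lhs` and `rhs` agree up to floating-point error, which is the decomposition of the joint utility into pairs of indices stated above.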
In other words, there is positive assortative matching between $\tilde{x}_i$ and $\tilde{y}_j$ for $i = j$, and no assortativeness for $i \neq j$. This justifies the choice of terminology: $\tilde{x}_i$ and $\tilde{y}_i$ are “mutually attractive” because they are complementary with each other and only with each other. All things being equal, a man with a higher $\tilde{x}_i$ tends to match with a woman with a higher $\tilde{y}_i$.

The weights of each index of mutual attractiveness constructed by Saliency Analysis are given by the associated row of $U S_X^{-1/2}$ for men and $V S_Y^{-1/2}$ for women. The value $\lambda_i / (\sum_i \lambda_i)$ indicates the share of the observable matching utility of couples explained by the $i$-th pair of indices. The fact that $U$ and $V$ are orthogonal implies strong restrictions on how $\tilde{x}$ and $\tilde{y}$ are obtained from $S_X^{-1/2} x$ and $S_Y^{-1/2} y$. In particular, this mapping preserves distances between points; that is, the distance between $\tilde{x}$ and $\tilde{x}'$ is equal to the distance between $S_X^{-1/2} x$ and $S_X^{-1/2} x'$.

We observe that, in contrast with Canonical Correlation analysis, a convenient feature of Saliency Analysis is that the results do not change when the attributes are measured using different measurement units, as expressed in Lemma 2. For instance, if the partners' heights are measured in feet rather than in meters, the outcome of Saliency Analysis does not change.

For illustrative purposes, we give a stylized example of how Saliency Analysis operates in a simple two-dimensional situation.

Example 2. Assume that there are two dimensions on each side of the market, and that $S_X = S_Y = \mathrm{Id}$, so that $\Theta = A$. Suppose that
$$A = \begin{pmatrix} 0 & 4 \\ -1 & 0 \end{pmatrix}. \tag{4.1}$$
Then the singular value decomposition of $A$ is $A = U' \Lambda V$, where
$$U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \Lambda = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad V = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
The economic interpretation of this simple example is the following: if the joint utility is given by $\Phi(x, y) = 4 x_1 y_2 - x_2 y_1$, then the indices of mutual attractiveness should be given by $\tilde{x}_1 = x_1$, $\tilde{x}_2 = x_2$ and $\tilde{y}_1 = y_2$, $\tilde{y}_2 = -y_1$. One has $\Phi(\tilde{x}, \tilde{y}) = 4 \tilde{x}_1 \tilde{y}_1 + \tilde{x}_2 \tilde{y}_2$. The vectors $\tilde{x} = (x_1, x_2)$ and $\tilde{y} = (y_2, -y_1)$ can be interpreted as indices of mutual attractiveness, meaning that there is sorting between $\tilde{x}_1 = x_1$ and $\tilde{y}_1 = y_2$, and between $\tilde{x}_2 = x_2$ and $\tilde{y}_2 = -y_1$. If one were willing to approximate the model by a one-dimensional sorting model, then Saliency Analysis advocates keeping $\tilde{x}_1 = x_1$ as a proxy for the attributes of men and $\tilde{y}_1 = y_2$ as a proxy for the attributes of women. In this case the joint utility is approximated by $\Phi(\tilde{x}, \tilde{y}) = 4 \tilde{x}_1 \tilde{y}_1$.

Example 2 is the occasion to compare Singular Value Decomposition to another matrix decomposition, the Eigenvalue Decomposition, which economists may be more familiar with. Eigenvalue decomposition consists in writing, whenever possible, a square matrix $M$ as $M = R \Lambda R^{-1}$, with $\Lambda$ diagonal and $R$ invertible. In the context of Saliency Analysis, this decomposition cannot be performed on $A$ as $A$ is not necessarily a square matrix; further, as soon as $A$ is not symmetric, this decomposition does not necessarily exist. In particular, when $A$ is given by (4.1), it does not exist since $A$ has no real eigenvalue. (However, the singular values of $A$ can be interpreted as eigenvalues of a larger matrix. Indeed, letting $H$ be the constant Hessian matrix of the map $\Phi$, then $H$ is a symmetric matrix of size $(d_x + d_y)$ written blockwise with two zero blocks on its diagonal and $A$ and $A'$ as off-diagonal blocks, and the eigenvalues of $H$ are plus and minus the singular values of $A$; see Horn and Johnson, 1991, p. 135.)

As Example 2 also illustrates, the observation of vector $\Lambda$ will allow one to draw conclusions about the multivariate nature of the sorting, and on the number of dimensions on which the sorting occurs. In particular, testing for multidimensional sorting versus unidimensional sorting is equivalent to testing whether at least two singular values $\lambda_i$ are significantly larger than 0, as we shall elaborate in the next section.

Inferring the number of sorting dimensions

Assume that a finite sample of size $n$ is observed. For the sake of readability, dependence in $n$ of the estimators will be dropped from the notations. The vector of mutual attraction weights estimated on the sample is denoted $\hat{\Lambda}$, while the vector of mutual attraction weights in the population is denoted $\Lambda$. Similarly $\hat{A}$ is the estimator of $A$ (which was denoted $A_{XY}$ in Section 3.2, where the construction of that estimator is described). Let $\hat{S}_X$, $\hat{S}_Y$ and $\hat{\Sigma}_{XY}$ be the sample estimators of $S_X$, $S_Y$ and $\Sigma_{XY}$, respectively. For a given quantity $M$, we shall denote
$$\delta M = \hat{M} - M, \tag{5.1}$$
the difference between the estimator of $M$ and $M$.

Consider two important matrices associated to the large sample properties of the model. The Fisher Information matrix is defined by
$$\mathbf{F}_{ij}^{kl} = \mathbb{E}_\pi\left[ \frac{\partial \log \pi(X, Y)}{\partial A_{ij}} \frac{\partial \log \pi(X, Y)}{\partial A_{kl}} \right], \tag{5.2}$$
where we note that the rows of $\mathbf{F}$ are indexed by pairs of integers $(ij)$, just as the columns of $\mathbf{F}$ are indexed by pairs $(kl)$. (This is due to the fact that the parameter to be estimated, $A_{ij}$, is not a vector but a matrix.) Hence $\mathbf{F}$ is a “doubly-indexed matrix,” which we shall denote using bold font. Some basic formalism on doubly-indexed matrices is recalled in Appendix B, Section B.4.

Our next result expresses the asymptotic distribution of the estimators of $A$, $S_X$ and $S_Y$.
It will be the main building block for testing the rank of the affinity matrix (and the number of sorting dimensions). (In a discussion with one of the authors, Jim Heckman suggested the intuition of the approach proposed in this paper to test for multidimensional sorting.)

Theorem 2. The following convergence holds in distribution for $n \to +\infty$:
$$n^{1/2} (\delta A, \delta S_X, \delta S_Y) \Longrightarrow \mathcal{N}\left( 0, \begin{pmatrix} \mathbf{F}^{-1} & 0 & 0 \\ 0 & K_{XX} & K_{XY} \\ 0 & K_{XY}' & K_{YY} \end{pmatrix} \right)$$
where $\mathbf{F}$ has been defined in (5.2), $K_{XY}$ is defined by
$$(K_{XY})_{ij}^{kl} = 1\{i = j, k = l\}\, \mathrm{cov}_\pi(X_i X_j, Y_k Y_l)$$
and we define similarly $K_{XX}$ and $K_{YY}$ by
$$(K_{XX})_{ij}^{kl} = 1\{i = j, k = l\}\, \mathrm{cov}_\pi(X_i X_j, X_k X_l) \quad \text{and} \quad (K_{YY})_{ij}^{kl} = 1\{i = j, k = l\}\, \mathrm{cov}_\pi(Y_i Y_j, Y_k Y_l).$$

Note that, as shown in Lemma 5 in Appendix B.3, $\mathbf{F}$ can be evaluated numerically as the Hessian matrix of $\mathcal{W}$. Theorem 2 implies in particular that the asymptotic variance-covariance matrix of our estimator of $A$ is the inverse of the Fisher information matrix. As a result, our estimator attains asymptotic statistical efficiency.

Now denoting $\hat{\Theta} = \hat{S}_X^{1/2} \hat{A} \hat{S}_Y^{1/2}$ the estimated counterpart of $\Theta$, whose singular value decomposition is denoted $\hat{\Theta} = \hat{U}' \hat{\Lambda} \hat{V}$, we show in Appendix B.4 that $\hat{\Theta}$ is asymptotically normal, and we give an expression for its asymptotic variance-covariance matrix in Lemma 6. We use this asymptotic result to test the rank of the affinity matrix $A$. Testing the rank of a matrix is an important issue with a distinguished tradition in econometrics (see e.g. Robin and Smith, 2000 and references therein). Here, we use results from Kleibergen and Paap (2006). One wishes to test the null hypothesis $H_0$: the rank of the affinity matrix is equal to $p$, for $p = 1, 2, ..., d - 1$. Following Kleibergen and Paap, the singular value decomposition $\hat{\Theta} = \hat{U}' \hat{\Lambda} \hat{V}$ is written blockwise
$$\hat{\Theta} = \begin{pmatrix} \hat{U}_{11}' & \hat{U}_{21}' \\ \hat{U}_{12}' & \hat{U}_{22}' \end{pmatrix} \begin{pmatrix} \hat{\Lambda}_1 & 0 \\ 0 & \hat{\Lambda}_2 \end{pmatrix} \begin{pmatrix} \hat{V}_{11} & \hat{V}_{12} \\ \hat{V}_{21} & \hat{V}_{22} \end{pmatrix},$$
where $\hat{U}_{11}'$ and $\hat{V}_{11}$ are $p \times p$ square matrices. Define
$$\hat{T}_p = \left( \hat{U}_{22}' \hat{U}_{22} \right)^{-1/2} \hat{U}_{22}' \hat{\Lambda}_2 \hat{V}_{22} \left( \hat{V}_{22}' \hat{V}_{22} \right)^{-1/2}$$
$$\hat{A}_{p\perp} = \begin{pmatrix} \hat{U}_{21}' \\ \hat{U}_{22}' \end{pmatrix} \left( \hat{U}_{22}' \right)^{-1} \left( \hat{U}_{22}' \hat{U}_{22} \right)^{1/2}$$
$$\hat{B}_{p\perp} = \left( \hat{V}_{22}' \hat{V}_{22} \right)^{1/2} \hat{V}_{22}^{-1} \begin{pmatrix} \hat{V}_{21} & \hat{V}_{22} \end{pmatrix}$$
so that our next result provides a test for the number of sorting dimensions.

Theorem 3. Under the null hypothesis that the rank of the affinity matrix is $p$, the quantity $n^{1/2} \hat{T}_p$ is asymptotically normally distributed, and the expression of its variance-covariance matrix $\Omega_p$ is given in Appendix B.4, formula (B.18). As a result, the test statistic
$$n\, \hat{T}_p' \hat{\Omega}_p^{-1} \hat{T}_p$$
converges under the null hypothesis to a $\chi^2((d_x - p)(d_y - p))$ random variable.

The data

6.1. The dataset.

In this paper, we use the waves 1993-2002 of the DNB Household Survey (DHS) to estimate preferences in the marriage market. For a thorough description of the setup and the quality of this data we refer the reader to Nyhus (1996). This data is a representative panel of the Dutch population with respect to region, political preference, housing, income, degree of urbanization, and age of the head of the household, among others. The DHS data was collected via on-line terminal sessions, and each participating family was provided with a PC and a modem if necessary. The panel contains on average about 2,200 households in each wave.

This data includes three main features that are particularly attractive for our purposes. First, within each household, all persons aged 16 or over were interviewed. This implies that the data contains detailed information not only about the head of the household but about all individuals in the household. In particular, the data identifies “spouses” and “permanent partners” of the head of each household. This information reveals the nature of the relationship between the various individuals of each household and allows us to reconstruct “couples”.

Second, this data contains very detailed information about individuals. This rich set of information includes socio-demographic variables such as birth year and education, as well as variables about the anthropometry of respondents (height and weight), a self-assessed measure of health, and, above all, information about personality traits and risk attitude, which are included in the waves 1993-2002.

Finally, as with most panel data, the DHS data suffers from attrition problems. The attrition of households is on average 25% each year, cf. Das and van Soest (1999) among others. To remedy attrition, refreshment samples were drawn each year, such that, over the period 1993-2002, about 7,700 distinct households were interviewed at least once. Since the methodology implemented in this paper relies essentially on the availability of a cross-section of households, attrition and its remedy are in fact an asset of this data, as they give us access to a rather large pool of potential couples.

Note that our methodology could be applied to other panel datasets (such as GSOEP, for instance) that also include supplementary questionnaires enabling one to construct measures of personality traits and risk aversion together with socio-demographic and morphological variables. However, the main asset of the DNB dataset is that it allows us to measure all relevant variables in a single wave, whereas in GSOEP one would have to use the panel structure to match measures of BMI (from waves 2008 and 2009) and measures of personality traits (from waves 2006 and 2007).

6.2. Variables.

Educational attainment is measured from the respondent's reported highest level of education achieved. The respondents could choose among 13 categories (7 in the later waves), ranging from primary to university education. The reduction to 7 categories in the later waves implies that only three broad educational categories can be consistently constructed. We coded responses as follows:

(1) Lower [kindergarten, primary, elementary secondary] education,
(2) Intermediate [secondary, pre-university, vocational] education,
(3) Higher [university] education.

The respondents were also asked about their height and weight. The answers to these questions allowed us to calculate the Body Mass Index of each respondent as the weight in kilograms divided by the square of the height measured in meters.

The respondents were also asked to report their general health. The phrasing of the question was: “How do you rate your general health condition on a scale from 1, excellent, to 5, poor?”. We defined our measure of health by subtracting the answer to this question from 6, so that a higher value indicates better health. We make use of the panel structure to deal partly with nonresponses on socio-economic and health variables. When missing values for height, weight, education, year of birth etc. were encountered, values reported in adjacent years were imputed.

In Appendix E.2, we recall the methodology of Nyhus and Webley (2001) which we followed in order to construct five factors of personality traits. These factors were labeled as:

• Emotional stability: a high score indicates that the person is less likely to interpret ordinary situations as threatening, and minor frustrations as hopelessly difficult,
• Extraversion (outgoing): a high score indicates that the person is more likely to need attention and social interaction,
• Conscientiousness (meticulous): a high score indicates that the person is more likely to be self-disciplined and plan his/her actions,
• Agreeableness (flexibility): a high score indicates that the person is more likely to be pleasant with others and go out of their way to help others,
• Autonomy (tough-mindedness): a high score indicates that the person is more likely to be direct, rough and dominant.

6.3. Couples.

Our definition of a couple is a man and a woman living in the same household and reporting being either head of the household, spouse of the head or a permanent partner (not married) of the latter. To construct our dataset of couples, we first pool all the selected waves (1993-2002). We then keep only those respondents that report being head of the household, spouse of the head or permanent (not married) partner of the head. This sample contains roughly 13,000 men and women and identifies about 7,700 unique households. We then split this sample and create two datasets, one containing women and one containing men. Each dataset identifies about 6,500 different men and women. We then merge the men dataset to the women [...]

[...], slightly more educated, taller by 13 centimeters, have a BMI 1 kg/m² higher, are less conscientious (meticulous), less extraverted but more emotionally stable and more risk averse. On average, in our sample, men and women have similar (good) health and a comparable degree of agreeableness and autonomy.

Oreffice and Quintana-Domeque (2010) estimate features of the (observed) matching function between men and women in the marriage market using the PSID data for the US. Their strategy consists in regressing each attribute of men on all attributes of women and vice versa. This procedure can easily be replicated with our data in an attempt to compare features of the matching function in both datasets (US versus [...]

Note that using the subsample of legally married couples does not affect the three main results of our analysis mentioned in the abstract. These results are available from the authors upon request.

Note that the mean age at first marriage is relatively high in the Netherlands (30.1 and 32.8 for women and men respectively; source: United Nations Economic Commission for Europe, 2010 Statistical Database) compared to the USA (26.1 and 28.2), which is reflected in the relatively high mean age of men and women in our sample of couples.
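The variable coding and the construction of couples described in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually run on the DHS files: the field names (`household`, `role`, `sex`) and role labels are hypothetical, and keeping the first pooled record per person and household is an assumption, since the text does not say which wave is retained.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms over squared height in meters."""
    return weight_kg / height_m ** 2

def health_score(answer):
    """Reverse the 1 (excellent) .. 5 (poor) survey scale so higher is better."""
    return 6 - answer

# Hypothetical role labels for head, spouse and permanent partner.
KEPT_ROLES = {"head", "spouse", "partner"}

def build_couples(records):
    """Pair one man and one woman per household from pooled survey records.

    `records` is an iterable of dicts pooled over waves. Assumption: the
    first record seen for each household and sex is the one kept.
    """
    men, women = {}, {}
    for rec in records:
        if rec["role"] not in KEPT_ROLES:
            continue                               # drop children, other members
        side = men if rec["sex"] == "M" else women
        side.setdefault(rec["household"], rec)     # keep first observation only
    shared = sorted(men.keys() & women.keys())     # households with both sides
    return [(men[h], women[h]) for h in shared]

# Made-up records, for illustration only.
pooled = [
    {"household": 1, "role": "head", "sex": "M"},
    {"household": 1, "role": "spouse", "sex": "F"},
    {"household": 1, "role": "child", "sex": "F"},
    {"household": 2, "role": "head", "sex": "F"},
]
couples = build_couples(pooled)
```

Here household 2 contributes no couple because only one partner is observed, which mirrors the merge of the men and women datasets on the household identifier.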

Empirical results

We apply the Saliency Analysis, outlined in the previous section, to our sample of couples. The procedure first requires estimating the affinity matrix $A$. This is done by applying the technique presented in Section 3. The estimation results are reported in Table 3. It is important to note that the estimates reported in the table are obtained using standardized attributes rather than the original ones. The main advantage of using standardized attributes is that the magnitude of the coefficients is directly comparable across attributes, allowing a direct interpretation in terms of comparative statics.

The estimates of the affinity matrix reveal four important and remarkable features:

(1) On-diagonal: education is the single most important attribute in the marriage market. The largest coefficient of the affinity matrix is indeed observed on the diagonal for education. This coefficient is more than twice as large as the second largest coefficient obtained on the diagonal, for the variable BMI. Loosely speaking, this means that increasing the education of both spouses by 1 standard deviation increases the couple's joint utility by 0.56 units. To achieve a similar increase in utility, the BMI of both spouses should be increased by 1.63 standard deviations each.

(2) Off-diagonal: the table clearly indicates the importance of cross-gender interactions between the various attributes, as many off-diagonal coefficients of the affinity matrix are significantly different from 0. This implies that important trade-offs take place between the various attributes. For instance, men's emotional stability interacts positively with women's conscientiousness, i.e. 0.21. Stated otherwise, this means that increasing the husband's emotional stability increases the joint utility of couples whose wives are relatively conscientious. Other examples are noticeable: men's autonomy interacts negatively with women's conscientiousness, i.e. −0.11, but positively with women's extraversion, i.e. 0.11. Conversely, men's agreeableness interacts positively with women's conscientiousness, i.e. 0.13, but negatively with women's extraversion, i.e. −0.14.

(3) Asymmetry: the affinity matrix is not symmetric, indicating that preferences for attributes are not similar for men and women. For instance, increasing a wife's conscientiousness by 1 standard deviation increases the joint utility of couples with more agreeable men relatively more (significant coefficient of magnitude 0.13), while increasing the husband's conscientiousness by 1 standard deviation has the same impact on a couple's joint utility regardless of how agreeable his wife is.

(4) Personality traits: personality traits matter for preferences, not only directly (terms on the diagonal are significant for conscientiousness and risk aversion, with respective magnitudes 0.14 and 0.11) but mainly indirectly through their interactions with other attributes. For instance, the single most important interaction between observable attributes of men and women is found between the emotional stability of husbands and the conscientiousness of women, i.e. 0.21, a magnitude that matches the direct effect of BMI. Also, personality traits interact not only with other personality traits but also with anthropometry. The emotional stability of men interacts positively with women's BMI, i.e. 0.12.

Using the estimated affinity matrix, we then proceed to the Saliency Analysis as introduced in the previous section. This enables us i) to test whether sorting is unidimensional, i.e. occurs on a single index, and ii) to construct pairs of indices of mutual attractiveness for men and women.

We first test the dimensionality of the sorting in the marriage market. For $p = 1$, that is, testing the null hypothesis that sorting occurs on a single index, we find that $n \hat{T}_1' \hat{\Omega}_1^{-1} \hat{T}_1 = 273.45$, which is significant at the 1% level. This implies that sorting in the marriage market does not occur on a single index, as has been assumed in most of the previous literature. In fact, our test statistic never becomes insignificant. Even for $p = 9$ we have $n \hat{T}_9' \hat{\Omega}_9^{-1} \hat{T}_9 = 13.62$, which is still significant at the 1% level. This suggests that the affinity matrix has full rank and that sorting occurs on at least 10 observed indices. Our results therefore clearly highlight that sorting in the marriage market is multidimensional and individuals face important trade-offs between the attributes of their spouses.

Each pair of indices derived from Saliency Analysis explains a mutually exclusive part of the total observable matching utility of couples. The share explained by each of our 10 indices is reported in Table 4. The table shows that the share of the first 8 pairs of indices is significantly different from 0 at the 1% level.

As in Principal Component Analysis, the labeling of each dimension is subjective and becomes increasingly difficult to interpret as one considers more dimensions. Table 5 therefore only contains the 3 pairs of indices explaining most of the joint utility. Together these 3 pairs of indices explain about 60% of the total matching utility. The first pair, indexed I1, explains about 28% of the joint utility. These indices load heavily (in bold weight ≥ [...]

Summary and Discussion

This paper has introduced a novel technique, namely Saliency Analysis, to test for the dimensionality of the sorting in the marriage market and to derive indices of mutual attractiveness. This technique is grounded in the structural equilibrium model of Choo and Siow (2006), which we have extended to the continuous case in this paper. Indices of mutual attractiveness derived in Saliency Analysis, in contrast to Canonical Correlation for instance, have a structural interpretation and are therefore informative about agents' preferences.

Saliency Analysis has been performed on a dataset of Dutch households containing information about education, height, BMI, health, attitude towards risk and five personality traits of both spouses. The empirical results of this paper reveal two important features of the marriage market. First, our results clearly show that sorting occurs on multiple indices rather than just on a single one, as assumed in most of the current literature. This implies that individuals face important trade-offs between the attributes of their potential spouse. For instance, in the dataset we studied, more conscientious men prefer more conscientious women (0.14), but more autonomous men prefer less conscientious women (−0.11). Hence, women face a trade-off between being attractive for more conscientious men and being attractive for more autonomous men. Similarly, more conscientious women prefer more agreeable men (0.13) but more extraverted women prefer less agreeable men (−0.14). Men therefore face a trade-off between being attractive for more conscientious women and being attractive for more extraverted women.

Second, personality traits and attitude towards risk matter for the sorting of spouses in the marriage market. In fact, although education explains the largest share (28%) of the observable joint utility of spouses, personality traits explain a rather large share too (17%). Interestingly enough, different traits matter differently for men and women. For instance, women find emotionally stable men more attractive. Yet, men prefer conscientious women but are indifferent about the emotional stability of women.

The analysis presented in this paper opens up interesting possibilities for further research. In particular, our analysis could be applied to other markets besides the marriage one, such as the market for CEOs. A recent literature led by Bertrand and Schoar (2003), Falato, Li and Milbourn (2012) and Custodio, Ferreira and Matos (2013) acknowledges the multidimensionality of CEOs' talent, but assumes that sorting occurs on a single index. Our setting can then be used to extend the seminal contributions of Terviö (2008) and Gabaix and Landier (2008), who calibrate a single-dimensional multiplicative sorting model in order to explain CEO compensation. An important difference in the CEO compensation literature is that transfers (i.e. salaries) are typically observed, unlike in the case of the marriage market considered in the present paper. The observation of the transfers has interesting consequences for identification. Assume that $x$ is a CEO's vector of characteristics (say, track record, education, political inclinations, cultural affinities) and $y$ is a vector of the firm's characteristics. Let $\alpha(x, y)$ be the nonpecuniary utility of CEO $x$ working with firm $y$, and let $\gamma(x, y)$ be the productivity (in monetary units) of CEO $x$ if hired by firm $y$. In the case where transfers are unobserved, only the joint utility $\Phi = \alpha + \gamma$ is identified. However, in the case where transfers are observed, it is possible to identify $\alpha$ and $\gamma$ separately. Hence when CEO compensation data is available, the results of the present paper can be easily extended to identify simultaneously the CEO's productivity and his/her nonpecuniary utility for working with a given firm.

Lastly, we observe that the Poisson process approach which appears in the framework of this paper may provide the “missing link” between search models and matching with unobserved heterogeneity. Indeed, Poisson processes are central to search models, and the fact that they also play an important role in our model suggests that they may provide an interesting connection. The key difference, of course, comes from the fact that in search models, agents are faced with an optimal stopping problem: agents cannot know what their opportunities will be in advance, and they cannot retain offers, while in our framework they are fully aware of all their opportunities from the start. While we briefly elaborate on the formal connection in Appendix A, we leave a full exploration of the matter for future work.

Appendix A. Continuous logit formalism

In this paragraph, we expound the main ideas of Cosslett (1988) and Dagsvik (1994), who show how to obtain a continuous version of the multinomial logit model. Assume that $\{(y_k^m, \varepsilon_k^m), k \in \mathbb{N}\}$ are the points of a Poisson point process on $\mathcal{Y} \times \mathbb{R}$ of intensity $dy \times e^{-\varepsilon} d\varepsilon$. We recall that this implies that for $S$ a subset of $\mathcal{Y} \times \mathbb{R}$, the probability that man $m$ has no acquaintance in set $S$ is $\exp\left( -\int_S e^{-\varepsilon}\, dy\, d\varepsilon \right)$. From (2.2), man $m$ chooses woman $k$ among his acquaintances such that his utility is maximized, that is, man $m$ solves
$$\max_k \{ U(x, y_k^m) + \varepsilon_k^m \}.$$
Letting $Z$ be the value of this maximum, one has for any $c \in \mathbb{R}$
$$\Pr(Z \le c) = \Pr\left( U(x, y_k^m) + \varepsilon_k^m \le c \ \ \forall k \right)$$
which is exactly the probability that the Poisson point process $(y_k^m, \varepsilon_k^m)$ has no point in $\{(y, \varepsilon) : U(x, y) + \varepsilon > c\}$, thus
$$\log \Pr(Z \le c) = -\iint_{\mathcal{Y} \times \mathbb{R}} 1\left( U(x, y) + \varepsilon > c \right) dy\, e^{-\varepsilon} d\varepsilon = -\int_{\mathcal{Y}} \int_{c - U(x, y)}^{+\infty} e^{-\varepsilon} d\varepsilon\, dy = -\int_{\mathcal{Y}} e^{-c + U(x, y)} dy = -\exp\left( -c + \log \int_{\mathcal{Y}} \exp U(x, y)\, dy \right),$$
hence $Z$ is a $\left( \log \int_{\mathcal{Y}} \exp U(x, y)\, dy,\ 1 \right)$-Gumbel. In particular, up to an additive constant (Euler's constant) that does not depend on $U$,
$$\mathbb{E}\left[ \max_k \{ U(x, y_k^m) + \varepsilon_k^m \} \right] = \log \int_{\mathcal{Y}} \exp U(x, y)\, dy$$
and the choice probabilities are given by their density with respect to the Lebesgue measure
$$\pi(y \mid x) = \frac{\exp(U(x, y))}{\int_{\mathcal{Y}} \exp U(x, y')\, dy'}.$$

The same logic also implies that the maximum of $\{\varepsilon_k : k \in \mathbb{N}\}$ has a Gumbel distribution. Indeed, the probability that this Poisson point process has no element in the set $\{\varepsilon : \varepsilon > c\}$ is equal to
$$\exp\left( -\int_c^{+\infty} e^{-\varepsilon} d\varepsilon \right) = \exp(-\exp(-c))$$
which is equivalent to saying that $\Pr(\max_{k \in \mathbb{N}} \varepsilon_k \le c) = \exp(-\exp(-c))$. Finally, note that a similar argument would show that $m$ has almost surely an infinite, though countable, number of acquaintances, as announced.

Note that an interesting connection remains to be explored with the search literature (Shimer and Smith, 2000; Atakan, 2006). Assume that each man $m$ draws a Poisson sample of acquaintances $(y_k^m, \varepsilon_k^m)$, where $y_k^m$ is the type of partner of index $k$, and $\varepsilon_k^m$ is now the time at which this acquaintance is met. Assume that it has been agreed that if $x$ matches with $y$, $x$ will receive utility $U(x, y)$ out of the joint utility $\Phi(x, y)$. In the spirit of Atakan (2006), assume unmatched agents pay a utility cost equal to $\sigma$ per unit of time while unmatched, such that if $x$ matches with $y$ at time $\varepsilon$, his utility is $U(x, y) - \sigma \varepsilon$. If agents could perfectly foresee their opportunities (i.e., know the full sample $(y_k^m, \varepsilon_k^m)$ in advance), they would choose opportunity $k$ so as to maximize the quantity $U(x_m, y_k^m) - \sigma \varepsilon_k^m$ exactly as in the present paper. The difference, of course, comes from the fact that in search models, agents are faced with an optimal stopping problem: agents cannot know what their opportunities will be in advance, and they cannot retain offers. At each time $t$ they know only what opportunities have already been received up to time $t$, and they do not know about the set of $k$'s such that $\varepsilon_k^m > t$. This is an optimal stopping problem with a Poisson process, well studied in Probability Theory and Operations Research following seminal work by Elfving (1967). The basic idea is as follows: there exists a function $\psi : \mathbb{R} \to \mathbb{R}$ such that the partner chosen by $m$ is the first partner (in terms of meeting time) such that $U(x_m, y_k^m)$ exceeds $\psi(\varepsilon_k^m)$. $\psi$ can be characterized as a solution to an Ordinary Differential Equation, and in some cases can be expressed analytically.
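The Gumbel property of the maximum derived in this appendix can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $\mathcal{Y} = [0, 1]$ and a hypothetical utility $U(x, y) = y$, and it only simulates points with $\varepsilon > c_0$, where the truncation level $c_0 = -4$ is chosen low enough that the maximum is essentially never affected. The change of variable $t = e^{-\varepsilon}$ turns the intensity $e^{-\varepsilon} d\varepsilon$ into the Lebesgue measure, so the points can be drawn as arrivals of a unit-rate Poisson process on $(0, e^{-c_0})$.

```python
import math
import random

def simulate_max(utility, c0, rng):
    """One draw of Z = max_k U(y_k) + eps_k over the points with eps_k > c0."""
    horizon = math.exp(-c0)            # t = exp(-eps) maps eps > c0 to 0 < t < horizon
    t = rng.expovariate(1.0)           # first arrival of a unit-rate Poisson process
    best = -math.inf
    while t < horizon:
        y = rng.random()               # location drawn uniformly on Y = [0, 1]
        best = max(best, utility(y) - math.log(t))   # eps = -log(t)
        t += rng.expovariate(1.0)      # next arrival
    return best

def gumbel_cdf(c, location):
    """CDF of a Gumbel variable with the given location and unit scale."""
    return math.exp(-math.exp(-(c - location)))

rng = random.Random(0)
utility = lambda y: y                          # hypothetical U(x, y) = y
location = math.log(math.e - 1.0)              # log of the integral of exp(y) on [0, 1]
draws = [simulate_max(utility, -4.0, rng) for _ in range(5000)]

c = 1.0
empirical = sum(z <= c for z in draws) / len(draws)   # frequency of {Z <= c}
theoretical = gumbel_cdf(c, location)                  # exp(-exp(-c + location))
```

With these settings `empirical` should lie close to `theoretical`, up to Monte Carlo error of the order of one percentage point.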
Appendix B. Proofs

B.1. Proof of Theorem 1.

Proof. (i) The first part of the argument extends Galichon and Salanié (2010) to the continuous case; the argument is decomposed into four steps, which we now briefly describe. In Step 1, we shall show that the social welfare is given by
$$\min_{U,V} \int_{\mathcal{X}} G_x(U(x,\cdot))\,f(x)\,dx + \int_{\mathcal{Y}} H_y(V(\cdot,y))\,g(y)\,dy \quad \text{s.t. } U(x,y) + V(x,y) \ge \Phi(x,y), \tag{B.1}$$
where $U(x,y)$ (resp. $V(x,y)$) is the share of the systematic joint utility going to man $x$ (resp. woman $y$), and $G_x(U)$ (resp. $H_y(V)$) is the ex-ante indirect utility of a man of type $x$ (resp. a woman of type $y$), namely
$$G_x(U(x,\cdot)) = \mathbb{E}\left[\max_k \left\{U(x, y_k^m) + \frac{\sigma}{2}\varepsilon_k^m\right\}\right] \quad\text{and}\quad H_y(V(\cdot,y)) = \mathbb{E}\left[\max_l \left\{V(x_l^w, y) + \frac{\sigma}{2}\eta_l^w\right\}\right]. \tag{B.2}$$
Welfare expression (B.1) has a straightforward interpretation in terms of equilibrium. The constraint $U + V \ge \Phi$ is a stability condition, and the minimization of the sum of the individual ex-ante indirect utility functions expresses the absence of rents.

In Step 2, we shall express the dual of variational problem (B.1) as
$$\mathcal{W} = \sup_{\pi \in \mathcal{M}(P,Q)} \int \Phi\,d\pi - I(\pi),$$
where
$$I(\pi) = \sup_U \left(\iint_{\mathcal{X}\times\mathcal{Y}} U(x,y)\,d\pi(x,y) - \int_{\mathcal{X}} G_x(U(x,\cdot))\,dP(x)\right) + \sup_V \left(\iint_{\mathcal{X}\times\mathcal{Y}} V(x,y)\,d\pi(x,y) - \int_{\mathcal{Y}} H_y(V(\cdot,y))\,dQ(y)\right).$$

In Step 3, we shall show that under the distributional assumptions made on the heterogeneities, the expression of $I$ is given by
$$I(\pi) = \sigma \iint_{\mathcal{X}\times\mathcal{Y}} \log\frac{\pi(x,y)}{\sqrt{f(x)g(y)}}\,\pi(x,y)\,dx\,dy.$$

In Step 4, we shall show that, as a result, the social welfare, which is the value of variational problem (B.1), can be expressed up to irrelevant constants as
$$\max_{\pi \in \mathcal{M}(P,Q)} \iint_{\mathcal{X}\times\mathcal{Y}} \Phi(x,y)\,\pi(x,y)\,dx\,dy - \sigma \iint_{\mathcal{X}\times\mathcal{Y}} \log \pi(x,y)\,\pi(x,y)\,dx\,dy, \tag{B.3}$$
which will establish (i).

Step 1. Introduce $\varepsilon_m(\cdot)$, a stochastic process on $\mathcal{Y}$ defined by $\varepsilon_m(y) = \frac{\sigma}{2}\max_k\{\varepsilon_k^m : y_k = y\}$ if the set $\{k : y_k = y\}$ is nonempty, and $\varepsilon_m(y) = -\infty$ otherwise. Similarly, introduce $\eta_w(\cdot)$, a stochastic process on $\mathcal{X}$ defined by $\eta_w(x) = \frac{\sigma}{2}\max_l\{\eta_l^w : x_l = x\}$ if the set $\{l : x_l = x\}$ is nonempty, and $\eta_w(x) = -\infty$ otherwise. By the results of Shapley and Shubik (1972), extended to the continuous case by Gretsky, Ostroy and Zame (1992), the equilibrium matching solves the dual transportation problem which expresses the social welfare
$$\mathcal{W} = \inf_{u_m + v_w \ge \Phi(x_m, y_w) + \varepsilon_m(y_w) + \eta_w(x_m)} \int u_m\,dm + \int v_w\,dw. \tag{B.4}$$
Now, the constraint can be rewritten as
$$U(x,y) + V(x,y) \ge \Phi(x,y),$$
where $U$ and $V$ have been defined as
$$U(x,y) = \inf_m\,(u_m - \varepsilon_m(y)) \quad\text{and}\quad V(x,y) = \inf_w\,(v_w - \eta_w(x)),$$
the infima being taken over men $m$ of type $x$ and women $w$ of type $y$ respectively. This implies that $u_m$ and $v_w$ can be expressed in terms of $U(x,y)$ and $V(x,y)$ by
$$u_m = \sup_{y\in\mathcal{Y}}\,(U(x,y) + \varepsilon_m(y)) \quad\text{and}\quad v_w = \sup_{x\in\mathcal{X}}\,(V(x,y) + \eta_w(x)). \tag{B.5}$$
Therefore, replacing $u_m$ and $v_w$ by their expressions in $U$ and $V$, (B.4) rewrites as (B.1), with $G_x$ and $H_y$ given by (B.2).

Step 2. Rewrite (B.1) as a saddlepoint problem
$$\mathcal{W} = \inf_{U,V}\,\sup_\pi \left[\iint_{\mathcal{X}\times\mathcal{Y}} \Phi\,d\pi + \int_{\mathcal{X}} G_x(U(x,\cdot))\,dP(x) - \iint_{\mathcal{X}\times\mathcal{Y}} U\,d\pi + \int_{\mathcal{Y}} H_y(V(\cdot,y))\,dQ(y) - \iint_{\mathcal{X}\times\mathcal{Y}} V\,d\pi\right],$$
or in other words
$$\mathcal{W} = \sup_\pi \int \Phi\,d\pi - I(\pi),$$
where
$$I(\pi) = \sup_U \left(\iint_{\mathcal{X}\times\mathcal{Y}} U\,d\pi - \int_{\mathcal{X}} G_x(U(x,\cdot))\,dP(x)\right) + \sup_V \left(\iint_{\mathcal{X}\times\mathcal{Y}} V\,d\pi - \int_{\mathcal{Y}} H_y(V(\cdot,y))\,dQ(y)\right).$$

Step 3. From the derivation in Appendix A, we get that
$$G_x(U(x,\cdot)) = \frac{\sigma}{2}\log\int_{\mathcal{Y}} \exp\frac{U(x,y)}{\sigma/2}\,dy \quad\text{and}\quad H_y(V(\cdot,y)) = \frac{\sigma}{2}\log\int_{\mathcal{X}} \exp\frac{V(x,y)}{\sigma/2}\,dx.$$
Now, in order to get an expression for $I(\pi)$, it remains to compute
$$\sup_{U(x,y)} \iint_{\mathcal{X}\times\mathcal{Y}} U(x,y)\,\pi(x,y)\,dx\,dy - \int G_x(U(x,\cdot))\,f(x)\,dx \tag{B.6}$$
and the similar expression on the other side of the market. By the first-order conditions,
$$\pi(x,y) = f(x)\,\frac{\exp\frac{U(x,y)}{\sigma/2}}{\int_{\mathcal{Y}} \exp\frac{U(x,y')}{\sigma/2}\,dy'},$$
which requires $\int_{\mathcal{Y}} \pi(x,y)\,dy = f(x)$, in which case the value of (B.6) is
$$\frac{\sigma}{2}\iint_{\mathcal{X}\times\mathcal{Y}} \pi(x,y)\log\frac{\pi(x,y)}{f(x)}\,dx\,dy.$$
A symmetric expression is obtained for the other side of the market, and finally $I(\pi)$ obtains as
$$I(\pi) = \sigma \iint_{\mathcal{X}\times\mathcal{Y}} \log\frac{\pi(x,y)}{\sqrt{f(x)g(y)}}\,\pi(x,y)\,dx\,dy$$
if $\pi \in \mathcal{M}(P,Q)$, while $I(\pi) = +\infty$ otherwise.

Step 4. One has
$$I(\pi) = \sigma \iint_{\mathcal{X}\times\mathcal{Y}} \log\pi(x,y)\,\pi(x,y)\,dx\,dy - \frac{\sigma}{2}\int_{\mathcal{X}} \log f(x)\,f(x)\,dx - \frac{\sigma}{2}\int_{\mathcal{Y}} \log g(y)\,g(y)\,dy.$$
The last two terms do not depend on the particular matching $\pi \in \mathcal{M}(P,Q)$ and are thus irrelevant in the expression of the social welfare, which establishes (B.3) and point (i).

(ii) Letting
$$a(x) = -\frac{\sigma}{2}\log\frac{f(x)}{\int_{\mathcal{Y}} \exp\frac{U(x,y)}{\sigma/2}\,dy} \quad\text{and}\quad b(y) = -\frac{\sigma}{2}\log\frac{g(y)}{\int_{\mathcal{X}} \exp\frac{V(x,y)}{\sigma/2}\,dx},$$
one has
$$\log\pi(x,y) = \frac{U(x,y) - a(x)}{\sigma/2} \quad\text{and}\quad \log\pi(x,y) = \frac{V(x,y) - b(y)}{\sigma/2}.$$
Adding these two expressions and using $U + V = \Phi$ gives
$$\pi(x,y) = \exp\left(\frac{\Phi(x,y) - a(x) - b(y)}{\sigma}\right).$$

(iii) One has
$$U(x,y) = \frac{\sigma\log\pi(x,y)}{2} + a(x) = \frac{\Phi(x,y) + a(x) - b(y)}{2}$$
and similarly
$$V(x,y) = \frac{\Phi(x,y) - a(x) + b(y)}{2}.$$
By (B.5), one sees that if man $m$ of type $x$ marries a woman of type $y$, he gets utility
$$u_m = \sup_{y'\in\mathcal{Y}}\,(U(x,y') + \varepsilon_m(y')) = U(x,y) + \varepsilon_m(y).$$

B.2. Useful lemmas.

We state several lemmas which are useful in Sections 4 and 5, and in the proof of Theorem 2. First, we need a formula which expresses the affinity matrix of the rescaled attributes as a function of the affinity matrix between $X$ and $Y$. This is given in the following:

Lemma 1. For $M$ and $N$ two invertible matrices, one has
$$A^{MX,NY} = (M')^{-1} A^{XY} N^{-1}. \tag{B.7}$$

This result should be compared with the expression of the cross-covariance matrix between $MX$ and $NY$, namely $\Sigma_{MX,NY} = M\,\Sigma_{XY}\,N'$. A quick dimensional check is coherent, as the unit of $A^{XY}$ is the inverse of the product of the units of $X$ and $Y$, while the unit of $\Sigma_{XY}$ is the product of the units of $X$ and $Y$.

Proof of Lemma 1. Recall that every affinity matrix $A^{XY}$ is characterized by the fact that
$$\frac{\partial \mathcal{W}_{P,Q}}{\partial A_{ij}}\left(A^{XY}\right) = \Sigma^{ij}_{XY}. \tag{B.8}$$
Let $P_M$ (resp. $Q_N$) be the distribution of $MX$ (resp. $NY$). We therefore have
$$\frac{\partial \mathcal{W}_{P_M,Q_N}}{\partial A_{ij}}\left(A^{MX,NY}\right) = \Sigma^{ij}_{MX,NY} = \left(M\,\Sigma_{XY}\,N'\right)_{ij} = \left(M\,\frac{\partial \mathcal{W}_{P,Q}}{\partial A}\left(A^{XY}\right)N'\right)_{ij}. \tag{B.9}$$
At the same time,
$$\mathcal{W}_{P_M,Q_N}\left(A^{MX,NY}\right) = \mathcal{W}_{P,Q}\left(M' A^{MX,NY} N\right).$$
Taking the derivative with respect to $A$ yields
$$\frac{\partial \mathcal{W}_{P_M,Q_N}}{\partial A}\left(A^{MX,NY}\right) = M\,\frac{\partial \mathcal{W}_{P,Q}}{\partial A}\left(M' A^{MX,NY} N\right)N'. \tag{B.10}$$
By comparing (B.9) and (B.10), one gets
$$\frac{\partial \mathcal{W}_{P,Q}}{\partial A}\left(M' A^{MX,NY} N\right) = \frac{\partial \mathcal{W}_{P,Q}}{\partial A}\left(A^{XY}\right).$$
From the strict convexity of $\mathcal{W}_{P,Q}$, we therefore have $M' A^{MX,NY} N = A^{XY}$, and given that $M$ and $N$ are invertible, it follows that $A^{MX,NY} = (M')^{-1} A^{XY} N^{-1}$. QED.

As a consequence of Lemma 1, we are able to state that the results of Saliency Analysis are invariant with respect to a (linear) change in the measurement units.
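This invariance can be illustrated numerically. The sketch below is a minimal check in plain Python, assuming diagonal rescaling matrices in (B.7) and taking the matrix decomposed by Saliency Analysis to be $\Theta = \sigma_X A\,\sigma_Y$, with $\sigma_X$ and $\sigma_Y$ the diagonal matrices of standard deviations. All numbers and the names `theta`, `zeta` and `xi` are hypothetical and are not taken from the paper's data.

```python
import math


def theta(sd_x, A, sd_y):
    """Entrywise form of diag(sd_x) @ A @ diag(sd_y)."""
    return [[sd_x[i] * A[i][j] * sd_y[j] for j in range(len(sd_y))]
            for i in range(len(sd_x))]


# Hypothetical affinity matrix and standard deviations in the old units.
A = [[0.5, -0.1], [0.2, 0.3]]
sd_x = [7.2, 2.9]
sd_y = [6.4, 3.8]

# Hypothetical unit changes, e.g. centimetres to metres for the first attribute.
zeta = [0.01, 1.0]
xi = [0.01, 2.0]

# Lemma 1 with M = diag(zeta), N = diag(xi): A_hat = M^{-1} A N^{-1}.
A_hat = [[A[i][j] / (zeta[i] * xi[j]) for j in range(2)] for i in range(2)]

# Standard deviations scale with the units.
sd_x_hat = [zeta[i] * sd_x[i] for i in range(2)]
sd_y_hat = [xi[j] * sd_y[j] for j in range(2)]

theta_old = theta(sd_x, A, sd_y)
theta_new = theta(sd_x_hat, A_hat, sd_y_hat)

# The two matrices agree up to floating-point rounding.
same = all(math.isclose(theta_old[i][j], theta_new[i][j], rel_tol=1e-9)
           for i in range(2) for j in range(2))
print(same)
```

Since the two matrices coincide, so do their singular value decompositions, which is what the invariance claim amounts to in this diagonal case.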

Lemma 2. For $\zeta_i$ and $\xi_j$ two vectors of positive scalars, let $\hat X_i = \zeta_i X_i$ and $\hat Y_j = \xi_j Y_j$ be the values of partners' attributes measured under different measurement units. Then the outcome of Saliency Analysis under the new measurement units coincides with the outcome under the former.

Proof. Saliency Analysis consists in determining the Singular Value Decomposition of $\Theta = \sigma_X A^{X,Y} \sigma_Y$ under the old units, and of $\hat\Theta = \sigma_{\hat X} A^{\hat X,\hat Y} \sigma_{\hat Y}$ under the new units. Letting $D_\zeta = \mathrm{diag}(\zeta_i)$ and $D_\xi = \mathrm{diag}(\xi_j)$, one has
$$A^{\hat X,\hat Y} = D_\zeta^{-1} A^{X,Y} D_\xi^{-1}, \quad \sigma_{\hat X} = \sigma_X D_\zeta \quad\text{and}\quad \sigma_{\hat Y} = D_\xi\,\sigma_Y,$$
thus $\hat\Theta = \Theta$.

The next lemma shows that $\mathcal{W}(A)$ is strictly convex.

Lemma 3.

The map $A \to \mathcal{W}(A)$ is strictly convex.

Proof. Consider two matrices $A$ and $\tilde A$. Let $\pi$ be the matching associated to $\Phi(x,y) = x'Ay$, and $\tilde\pi$ be the matching associated to $\Phi(x,y) = x'\tilde A y$ (uniqueness of $\pi$ and $\tilde\pi$ follows from the uniqueness of the solution to the Schrödinger problem, see Rüschendorf and Thomsen 1993, Theorem 3). Then convexity of $\mathcal{W}$ implies
$$\mathcal{W}(\tilde A) \ge \mathcal{W}(A) + \left\langle \nabla\mathcal{W}(A),\,\tilde A - A\right\rangle, \tag{B.11}$$
where, by the Envelope Theorem, $\nabla\mathcal{W}(A) = \mathbb{E}_\pi[XY']$. In order to show strict convexity, we need to show that equality in (B.11) implies $A = \tilde A$. Assume (B.11) holds as an equality. One has
$$\mathcal{W}(\tilde A) = \mathcal{W}(A) + \left\langle \nabla\mathcal{W}(A),\,\tilde A - A\right\rangle = \mathbb{E}_\pi[X'AY] - \mathbb{E}_\pi[\ln\pi(X,Y)] + \mathbb{E}_\pi[X'\tilde A Y] - \mathbb{E}_\pi[X'AY] = \mathbb{E}_\pi[X'\tilde A Y] - \mathbb{E}_\pi[\ln\pi(X,Y)].$$
This implies that $\pi$ is optimal for the matching problem associated to $\Phi(x,y) = x'\tilde A y$. Again by the uniqueness of the solution to the Schrödinger problem mentioned above, it follows that $\pi = \tilde\pi$, and hence that
$$A = \frac{\partial^2 \ln\pi(x,y)}{\partial x\,\partial y} = \frac{\partial^2 \ln\tilde\pi(x,y)}{\partial x\,\partial y} = \tilde A.$$

The following lemma allows us to characterize the conditional expectations of the gradient of the log-likelihood.

Lemma 4.

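Let $\pi_A \in \mathcal{M}(P,Q)$ be the equilibrium matching computed for joint utility function $\Phi_A$. Then
$$\mathbb{E}\left[\frac{\partial\log\pi_A}{\partial A_{ij}}\,\Big|\,X = x\right] = \mathbb{E}\left[\frac{\partial\log\pi_A}{\partial A_{ij}}\,\Big|\,Y = y\right] = 0, \tag{B.12}$$
and
$$\sigma\,\mathbb{E}_\pi\left[\frac{\partial\log\pi_A(X,Y)}{\partial A_{ij}}\,\frac{\partial\log\pi_A(X,Y)}{\partial A_{kl}}\right] = \mathbb{E}_\pi\left[\frac{\partial\log\pi_A(X,Y)}{\partial A_{ij}}\,x_k y_l\right] = \mathbb{E}_\pi\left[x_i y_j\,\frac{\partial\log\pi_A(X,Y)}{\partial A_{kl}}\right]. \tag{B.13}$$

Proof of Lemma 4. It follows from equation (2.7) that
$$\sigma\,\frac{\partial\log\pi}{\partial A_{ij}}(x,y) = x_i y_j - \frac{\partial a}{\partial A_{ij}}(x) - \frac{\partial b}{\partial A_{ij}}(y).$$
But by differentiation of $\int_{\mathcal{Y}} \pi(x,y)\,dy = f(x)$ with respect to $A_{ij}$, one gets
$$\int_{\mathcal{Y}} \frac{\partial\log\pi}{\partial A_{ij}}(x,y)\,\pi(x,y)\,dy = 0,$$
thus
$$\mathbb{E}_\pi\left[\frac{\partial\log\pi}{\partial A_{ij}}(X,Y)\,\Big|\,X = x\right] = 0,$$
which proves (B.12). (B.13) then follows directly.

The final lemma in this section shows that the Hessian of $\mathcal{W}$ coincides with the Fisher information matrix $F$.

Lemma 5. The Hessian of $\mathcal{W}$ is given by
$$\frac{\partial^2\mathcal{W}}{\partial A_{ij}\,\partial A_{kl}} = F^{ij}_{kl},$$
where the expression of $F$ is given by 5.2.

Proof of Lemma 5. By the envelope theorem,
$$\frac{\partial\mathcal{W}}{\partial A_{ij}} = \int x_i y_j\,\pi_A(x,y)\,dx\,dy.$$
Thus
$$\frac{\partial^2\mathcal{W}}{\partial A_{ij}\,\partial A_{kl}} = \int x_i y_j\,\frac{\partial\log\pi_A(x,y)}{\partial A_{kl}}\,\pi_A(x,y)\,dx\,dy = F^{ij}_{kl},$$
where the second equality follows from Lemma 4.

B.3. Proof of Theorem 2.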

The proof of Theorem 2 relies on the auxiliary results derived in the previous paragraph. In the sequel, we assume $\sigma = 1$; by positive homogeneity, this is without loss of generality.

Proof of Theorem 2. Let
$$\hat\pi(x,y) = \frac{1}{n}\sum_{k=1}^n \delta(x - X_k)\,\delta(y - Y_k)$$
be the distribution of the empirical sample under observation, and let $\pi_A$ be the equilibrium matching computed for matching utility function $\Phi_A$ (we shall drop the subscript $A$ when there is no ambiguity). Recall that the (population) affinity matrix $A$ and its sample estimator $\hat A$ are respectively characterized by
$$\frac{\partial\mathcal{W}(A)}{\partial A_{ij}} = \Sigma^{ij}_{XY} \quad\text{and}\quad \frac{\partial\mathcal{W}(\hat A)}{\partial A_{ij}} = \hat\Sigma^{ij}_{XY}.$$
By the Delta method, we get
$$(F\cdot\delta A)_{ij} = \int \frac{\partial\log\pi_A}{\partial A_{ij}}\,(\hat\pi - \pi)\,dx\,dy + o_D\left(n^{-1/2}\right), \tag{B.14}$$
where $F$ is the Hessian of $\mathcal{W}$ at $A$, whose expression is
$$F^{ij}_{kl} = \mathbb{E}_\pi\left[\frac{\partial\log\pi_A(X,Y)}{\partial A_{ij}}\,\frac{\partial\log\pi_A(X,Y)}{\partial A_{kl}}\right],$$
and where $\pi \in \mathcal{M}(P,Q)$ is the equilibrium matching computed for the joint utility function $\Phi_A$. Further,
$$(\delta S_X)_{ij} = 1\{i = j\}\int x_i x_j\,(\hat\pi - \pi)\,dx\,dy + o_D\left(n^{-1/2}\right),$$
$$(\delta S_Y)_{kl} = 1\{k = l\}\int y_k y_l\,(\hat\pi - \pi)\,dx\,dy + o_D\left(n^{-1/2}\right),$$
hence
$$\mathbb{E}\left[(F\cdot\delta A)_{ij}\,(\delta S_X)_{kl}\right] = \mathrm{cov}\left(\frac{\partial\log\pi}{\partial A_{ij}},\,X_k X_l\right)1\{k = l\} = 0,$$
where we have used (B.12), and similarly, $\mathbb{E}\left[(\delta A)_{ij}\,(\delta S_Y)_{kl}\right] = 0$. This proves the asymptotic independence between $\delta A$ and $(\delta S_X, \delta S_Y)$. The conclusion follows by noting that the asymptotic variance-covariance matrix of $\delta A$ is $F^{-1}$, and that of $(\delta S_X, \delta S_Y)$ is
$$\begin{pmatrix} K_{XX} & K_{XY} \\ K'_{XY} & K_{YY} \end{pmatrix}.$$

B.4.

Proof of Theorem 3.

In order to give asymptotic distributions of matrix estimators, it is convenient to represent matrices as vectors, an operation which is called vectorization in matrix algebra. Linear operators acting on these vectorized matrices will therefore be called doubly-indexed matrices, for which we shall use the bold notation to distinguish them from simply-indexed matrices. If $R$ is a doubly-indexed matrix, its general term will be denoted $R^{ij}_{kl}$, where $ij$ indexes the rows and $kl$ indexes the columns of $R$. Then $R\cdot M$ will denote the (simple) matrix $N$ such that $N_{ij} = \sum_{kl} R^{ij}_{kl} M_{kl}$. We recall the definition of the Kronecker product: for two matrices $A$ and $B$, $A\otimes B$ is the doubly-indexed matrix $R$ such that $R^{ij}_{kl} = A_{ik}B_{jl}$.

Lemma 6. The following convergence holds in distribution for $n \to +\infty$:
$$n^{1/2}\left(\hat\Theta - \Theta\right) \Longrightarrow \mathcal{N}(0, V),$$
where
$$V = T_{XY}F^{-1}T'_{XY} + T_X K_{XX} T'_X + T_Y K_{YY} T'_Y + T_X K_{XY} T'_Y + T_Y K'_{XY} T'_X.$$

Proof of Lemma 6. As $\delta S_X^{1/2} = \left(I\otimes S_X^{1/2} + S_X^{1/2}\otimes I\right)^{-1}\delta S_X$, one has
$$\delta\Theta = \left(S_Y^{1/2}\otimes S_X^{1/2}\right)\delta A + \left(S_Y^{1/2}A'\otimes I\right)\delta S_X^{1/2} + \left(I\otimes S_X^{1/2}A\right)\delta S_Y^{1/2} = T_{XY}\,\delta A + T_X\,\delta S_X + T_Y\,\delta S_Y,$$
where
$$T_X = \left(S_Y^{1/2}A'\otimes I\right)\left(S_X^{1/2}\otimes I + I\otimes S_X^{1/2}\right)^{-1}, \tag{B.15}$$
$$T_{XY} = S_Y^{1/2}\otimes S_X^{1/2}, \tag{B.16}$$
$$T_Y = \left(I\otimes S_X^{1/2}A\right)\left(S_Y^{1/2}\otimes I + I\otimes S_Y^{1/2}\right)^{-1}. \tag{B.17}$$

The proof of Theorem 3 follows as an easy consequence.

Proof of Theorem 3. Let
$$\Omega_p = \left(B_{p\perp}\otimes A'_{p\perp}\right)V\left(B_{p\perp}\otimes A'_{p\perp}\right)'. \tag{B.18}$$
By Kleibergen and Paap (2006), Theorem 1, the convergence
$$n^{1/2}\,\hat T_p \Longrightarrow \mathcal{N}(0, \Omega_p)$$
holds for $n \to +\infty$, and Theorem 3 follows.

Appendix C. Computation
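The fixed-point iteration that this appendix derives in (C.2) and (C.3) lends itself to a short discretized sketch. The version below is a minimal illustration in plain Python, not the authors' implementation: it assumes that the types live on finite grids, so that $f$ and $g$ become probability vectors and the integrals become sums. The function name `ipfp`, the grids, the joint utility $\Phi(x,y) = xy$ and the tolerance are all hypothetical choices.

```python
import math


def ipfp(phi, f, g, sigma=1.0, tol=1e-10, max_iter=10_000):
    """Discretized IPFP: alternate the updates (C.3) and (C.2) until a_tilde settles.

    phi[i][j] is the joint utility on the grid; f and g are the marginal masses.
    Returns the matching pi and the scaling vectors a_tilde, b_tilde.
    """
    n, m = len(f), len(g)
    K = [[math.exp(phi[i][j] / sigma) for j in range(m)] for i in range(n)]
    a = [1.0] * n  # starting guess for a_tilde
    for _ in range(max_iter):
        # (C.3): b_tilde(y) = g(y) / sum_x a_tilde(x) K(x, y)
        b = [g[j] / sum(a[i] * K[i][j] for i in range(n)) for j in range(m)]
        # (C.2): a_tilde(x) = f(x) / sum_y b_tilde(y) K(x, y)
        a_new = [f[i] / sum(b[j] * K[i][j] for j in range(m)) for i in range(n)]
        err = max(abs(a_new[i] - a[i]) for i in range(n))
        a = a_new
        if err < tol:
            break
    # (C.1): pi(x, y) = a_tilde(x) b_tilde(y) K(x, y)
    pi = [[a[i] * b[j] * K[i][j] for j in range(m)] for i in range(n)]
    return pi, a, b


# Hypothetical example: one attribute per side, Phi(x, y) = x * y.
grid = [-1.0, 0.0, 1.0]
f = [0.25, 0.5, 0.25]
g = [0.25, 0.5, 0.25]
sigma = 0.5
phi = [[x * y for y in grid] for x in grid]

pi, a_tilde, b_tilde = ipfp(phi, f, g, sigma=sigma)
row_sums = [sum(row) for row in pi]
col_sums = [sum(pi[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

Because each pass ends with the update of $\tilde a$, the row sums reproduce $f$ exactly and the column sums approach $g$ as the iteration converges.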

Let $a$ and $b$ be the solutions of equation (2.7), and introduce
$$\tilde a(x) = \exp(-a(x)/\sigma) \quad\text{and}\quad \tilde b(y) = \exp(-b(y)/\sigma),$$
so equation (2.7) rewrites
$$\pi(x,y) = \tilde a(x)\,\tilde b(y)\,K(x,y), \tag{C.1}$$
where $K(x,y) = \exp(\Phi(x,y)/\sigma)$, and the system of equations formed by the constraints on the marginals rewrites
$$\tilde a(x) = f(x)\left(\int_{\mathcal{Y}} \tilde b(y)\,K(x,y)\,dy\right)^{-1}, \tag{C.2}$$
$$\tilde b(y) = g(y)\left(\int_{\mathcal{X}} \tilde a(x)\,K(x,y)\,dx\right)^{-1}. \tag{C.3}$$
Note that by (C.3), $\tilde b$ can be expressed as a function of $\tilde a$. Then $\tilde a$ solves a fixed point equation $\tilde a = F(\tilde a)$, where $F$ is given by
$$F(\tilde a)(x) = f(x)\left(\int_{\mathcal{Y}} g(y)\left(\int_{\mathcal{X}} \tilde a(x')\,K(x',y)\,dx'\right)^{-1} K(x,y)\,dy\right)^{-1}.$$
The Iterative Proportional Fitting Procedure (IPFP) consists in starting with some proper choice of $\tilde a(x)$ that ensures integrability of $x \to \tilde a(x)K(x,y)$, and iteratively applying $\tilde a_{k+1} = F(\tilde a_k)$. Details and proof of convergence are provided in Rüschendorf (1995); convergence is very quick in practice.

Appendix D. Incorporating singles

Throughout this appendix, the symbol $\emptyset$ stands for singlehood; this enlarges the sets of marital choices of men and women, which we denote $\mathcal{X}_0 = \mathcal{X}\cup\{\emptyset\}$ and $\mathcal{Y}_0 = \mathcal{Y}\cup\{\emptyset\}$. Let $\bar f(x)$ be the density of mass of men of type $x$, $f_0(x)$ be the density of mass of single men of type $x$, and, as in the rest of the paper, $f(x)$ be the density of mass of matched men of type $x$, so that $\bar f(x) = f_0(x) + f(x)$. Introduce similar notation on the other side of the market: $\bar g(y) = g_0(y) + g(y)$, and note that the total mass of men and women no longer needs to coincide, i.e. in general one has
$$\int_{\mathcal{X}} \bar f(x)\,dx \ne \int_{\mathcal{Y}} \bar g(y)\,dy.$$

The set of acquaintances of man $m$ is now expanded to include singlehood: $\{(y_k^m, \varepsilon_k^m),\,k\in\mathbb{N}\}$ are now the points of a Poisson process on $\mathcal{Y}_0\times\mathbb{R}$ of intensity $\lambda_0\times e^{-\varepsilon}d\varepsilon$, where for $B\subseteq\mathcal{Y}_0$,
$$\lambda_0(B) = 1\{\emptyset\in B\} + \lambda(B\setminus\{\emptyset\}),$$
where $\lambda$ is the Lebesgue measure on $\mathcal{Y}$. As in Appendix A, the utility of a man $m$ matching with acquaintance $k$ is determined at equilibrium by $U(x, y_k^m) + \frac{\sigma}{2}\varepsilon_k^m$, but $y_k^m$ can now take value $\emptyset$, in which case $U(x,\emptyset) = \Phi(x,\emptyset)$.

The indirect utility of man $m$ is thus given by $Z = \max_k\left\{U(x, y_k^m) + \frac{\sigma}{2}\varepsilon_k^m\right\}$, and one has
$$\log\Pr(Z\le c) = -\iint_{\mathcal{Y}_0\times\mathbb{R}} 1\left\{U(x,y) + \frac{\sigma}{2}\varepsilon > c\right\}d\lambda_0(y)\,e^{-\varepsilon}d\varepsilon = -\exp\left(-\frac{c}{\sigma/2} + \log\left(\exp\frac{\Phi(x,\emptyset)}{\sigma/2} + \int_{\mathcal{Y}} \exp\frac{U(x,y)}{\sigma/2}\,dy\right)\right),$$
so that
$$\frac{f_0(x)}{\bar f(x)} = \frac{\exp\frac{\Phi(x,\emptyset)}{\sigma/2}}{\exp\frac{\Phi(x,\emptyset)}{\sigma/2} + \int_{\mathcal{Y}} \exp\frac{U(x,y)}{\sigma/2}\,dy} \quad\text{and}\quad \frac{g_0(y)}{\bar g(y)} = \frac{\exp\frac{\Phi(\emptyset,y)}{\sigma/2}}{\exp\frac{\Phi(\emptyset,y)}{\sigma/2} + \int_{\mathcal{X}} \exp\frac{V(x,y)}{\sigma/2}\,dx},$$
while
$$\pi(y|x) = \frac{\exp\frac{U(x,y)}{\sigma/2}}{\int_{\mathcal{Y}} \exp\frac{U(x,y')}{\sigma/2}\,dy'} \quad\text{and}\quad \pi(x|y) = \frac{\exp\frac{V(x,y)}{\sigma/2}}{\int_{\mathcal{X}} \exp\frac{V(x',y)}{\sigma/2}\,dx'}.$$
Hence $\pi$ identifies $U(x,y)$ up to an additive term $c(x)$, and $V(x,y)$ up to an additive term $d(y)$, so that $U$ and $V$ are identified by
$$U(x,y) = \frac{\sigma}{2}\left(\log\pi(y|x) + c(x)\right), \quad V(x,y) = \frac{\sigma}{2}\left(\log\pi(x|y) + d(y)\right)$$
and
$$\Phi(x,y) = \frac{\sigma}{2}\left(\log\pi(y|x) + \log\pi(x|y) + c(x) + d(y)\right),$$
where $c(x)$ and $d(y)$ are undetermined. This is precisely the identification achieved in Section 2.2. The crucial conclusion is that the observation of singles does not change anything in the identification of $U$ and $V$. This is a consequence of the independence of irrelevant alternatives (IIA) of the logit model: indeed, the incentive for remaining single does not affect the odds ratios of the choices of partner types. As a result, the distributions of matched men and women $f(x)$ and $g(y)$ may be treated as exogenous.

Once $U$ and $V$ have been identified, one has
$$\frac{f_0(x)}{\bar f(x)} = \frac{\exp\frac{\Phi(x,\emptyset)}{\sigma/2}}{\exp\frac{\Phi(x,\emptyset)}{\sigma/2} + \exp c(x)} \quad\text{and}\quad \frac{g_0(y)}{\bar g(y)} = \frac{\exp\frac{\Phi(\emptyset,y)}{\sigma/2}}{\exp\frac{\Phi(\emptyset,y)}{\sigma/2} + \exp d(y)},$$
hence by inversion
$$\Phi(x,\emptyset) = \frac{\sigma}{2}\left(\log\frac{f_0(x)}{\bar f(x) - f_0(x)} + c(x)\right) \quad\text{and}\quad \Phi(\emptyset,y) = \frac{\sigma}{2}\left(\log\frac{g_0(y)}{\bar g(y) - g_0(y)} + d(y)\right),$$
which implies that the observation of single individuals allows one to identify the reservation utilities.
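The IIA argument and the inversion formula above can be illustrated in a discretized setting. The sketch below is a hedged illustration in plain Python: it assumes a man facing a finite list of partner types plus singlehood, with logit choice probabilities of scale $\sigma/2$ as in this appendix. The function names `choice_probabilities` and `reservation_utility` and all utility values are hypothetical.

```python
import math


def choice_probabilities(U, phi_single, sigma=1.0):
    """Logit probabilities with scale sigma/2 over {singlehood} and partner types.

    Returns the probability of remaining single and the distribution over
    partner types conditional on being matched.
    """
    s = sigma / 2.0
    w_single = math.exp(phi_single / s)
    w = [math.exp(u / s) for u in U]
    p_single = w_single / (w_single + sum(w))
    conditional = [wj / sum(w) for wj in w]
    return p_single, conditional


def reservation_utility(p_single, c, sigma=1.0):
    """Inversion formula: Phi(x, 0) = (sigma/2) * (log(f0 / (fbar - f0)) + c(x))."""
    return (sigma / 2.0) * (math.log(p_single / (1.0 - p_single)) + c)


U = [0.3, -0.2, 0.8]  # hypothetical systematic utilities U(x, y_j)
p_lo, cond_lo = choice_probabilities(U, phi_single=-1.0)
p_hi, cond_hi = choice_probabilities(U, phi_single=1.0)

# IIA: the conditional distribution over partners ignores the value of singlehood.
iia_holds = all(math.isclose(p, q) for p, q in zip(cond_lo, cond_hi))

# With sigma = 1, c(x) is the log of the sum of exp(U / (sigma/2)).
c = math.log(sum(math.exp(u / 0.5) for u in U))
recovered = reservation_utility(p_hi, c)
print(iia_holds, recovered)
```

Raising the value of singlehood changes the share of singles but leaves the conditional choice of partner types untouched, and the share of singles together with $c(x)$ returns the reservation utility used to generate it.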
As a result, the utility surplus from matching, $\Phi(x,y) - \Phi(x,\emptyset) - \Phi(\emptyset,y)$, is identified in the data, up to the factor $\sigma/2$, by
$$\log\left(\frac{\pi(y|x)\left(\bar f(x) - f_0(x)\right)}{f_0(x)}\cdot\frac{\pi(x|y)\left(\bar g(y) - g_0(y)\right)}{g_0(y)}\right) \tag{D.1}$$
and the ex-ante expected utility surpluses of men of type $x$ and women of type $y$ are given, up to the same factor, just as in Choo and Siow (2006) by
$$u(x) = \log\frac{\bar f(x)}{f_0(x)} \quad\text{and}\quad v(y) = \log\frac{\bar g(y)}{g_0(y)}. \tag{D.2}$$
[…] $\left(\mu_{xy}/(\mu_x\mu_y)\right)$, where $\mu_x$ and $\mu_y$ are respectively the numbers of single men and women of types $x$ and $y$, and $\mu_{xy}$ is the number of $xy$ pairs.

Appendix E. Further details on the dataset

E.1. Questionnaire about personality and attitudes.

Personality traits, the 16PA scale.

Now we would like to know how you would describe your personality. Below we have mentioned a number of personal qualities in pairs. The qualities are not always opposites. Please indicate for each pair of qualities which number would best describe your personality. If you think your personality is equally well characterized by the quality on the left as it is by the quality on the right, please choose number 4. If you really don't know, type 0 (zero).

Scale: 1 2 3 4 5 6 7
TEG1: oriented towards things | oriented towards people.
TEG2: slow thinker | quick thinker.
TEG3: easily get worried | not easily get worried.
TEG4: flexible, ready to adapt myself | stubborn, persistent.
TEG5: quiet, calm | vivid, vivacious.
TEG6: carefree | meticulous.
TEG7: shy | dominant.
TEG8: not easily hurt/offended | sensitive, easily hurt/offended.
TEG9: trusting, credulous | suspicious.
TEG10: oriented towards reality | dreamer.
TEG11: direct, straightforward | diplomatic, tactful.
TEG12: happy with myself | doubts about myself.
TEG13: creature of habit | open to changes.
TEG14: need to be supported | independent, self-reliant.

Attitude towards risk.

The following statements concern saving and taking risks. Please indicate for each statement to what extent you agree or disagree, on the basis of your personal opinion or experience.

totally disagree 1 2 3 4 5 6 7 totally agree
SPAAR1: I think it is more important to have safe investments and guaranteed returns, than to take a risk to have a chance to get the highest possible returns.
SPAAR2: I would never consider investments in shares because I find this too risky.
SPAAR3: if I think an investment will be profitable, I am prepared to borrow money to make this investment.
SPAAR4: I want to be certain that my investments are safe.
SPAAR5: I get more and more convinced that I should take greater financial risks to improve my financial position.
SPAAR6: I am prepared to take the risk to lose money, when there is also a chance to gain money.

E.2.

Construction of the “Big Five” personality factors.

The DHS panel contains three lists of items that would allow one to assess a respondent's personality traits.
(1) The first list contains 150 items and refers to the Five-Factor Personality Inventory measure, developed by Hendriks et al. (1999). This list was included in a supplement to the 1996 wave.
(2) The second list refers to the 16 Personality Adjective (16PA) scale developed by Brandstätter (1988) and was included in the module “Economic and Psychological Concepts” from 1993 until 2002.
(3) From 2003 on, the panel replaced the 16PA scale by the International Personality Item Pool (IPIP) developed by Goldberg (1999). The 10-item list version of the IPIP scale is used except for the 2005 wave, where the 50-item list was implemented.

Of the three scales, the 16PA scale covers the largest sample of individuals. For that reason, the 16PA scale was chosen to measure personality traits. This scale offers the respondents the opportunity to locate themselves on 16 personality dimensions. Each dimension is represented by two bipolar scales so that the full scale contains 32 items. Nyhus and Webley (2001) show that this scale distinguishes 5 factors. They labeled these factors as: Emotional stability, Extraversion, Conscientiousness, Agreeableness, and Autonomy. Of the 32 items associated with the 16PA measure, the first half was asked in 1993, 1995 and each year between 1997 and 2002, while the other half was asked in 1994 and 1996 only. Constructing the full scale, therefore, requires losing all respondents but those who responded in two successive years between 1993 and 1996. To avoid throwing out too many observations, we constructed the five dimensions using only those 16 items included in waves 1993, 1995 and 1997-2002. Since answers given to the same item by the same person in different waves are strongly correlated (see Nyhus and Webley, 2001), we simply collapse the data by individual using the person's median answer to each item.
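The collapsing step just described, followed by the standardization of items that this section uses when forming the factors, can be sketched as follows. This is only an illustration in plain Python under stated assumptions: the long-format layout, the person identifiers, the answers and the function names `collapse_by_median` and `standardize` are hypothetical and do not come from the DHS files.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

# Hypothetical long-format answers: (person_id, wave, item, answer on the 1-7 scale).
answers = [
    (1, 1993, "TEG3", 5), (1, 1995, "TEG3", 6), (1, 1997, "TEG3", 6),
    (1, 1993, "TEG12", 2), (1, 1995, "TEG12", 3),
    (2, 1993, "TEG3", 2), (2, 1995, "TEG3", 3),
    (2, 1993, "TEG12", 5), (2, 1997, "TEG12", 5),
    (3, 1995, "TEG3", 4), (3, 1995, "TEG12", 4),
]


def collapse_by_median(rows):
    """Keep one value per (person, item): the median answer across waves."""
    by_key = defaultdict(list)
    for person, _wave, item, answer in rows:
        by_key[(person, item)].append(answer)
    return {key: median(values) for key, values in by_key.items()}


def standardize(collapsed):
    """Z-score each item across persons (population standard deviation)."""
    by_item = defaultdict(dict)
    for (person, item), value in collapsed.items():
        by_item[item][person] = value
    z = {}
    for item, per_person in by_item.items():
        values = list(per_person.values())
        mu, sd = mean(values), pstdev(values)
        for person, value in per_person.items():
            z[(person, item)] = (value - mu) / sd
    return z


collapsed = collapse_by_median(answers)
z_scores = standardize(collapsed)
print(collapsed[(1, "TEG3")], collapsed[(1, "TEG12")])
```

Using the median rather than the mean keeps each collapsed answer close to the original 1 to 7 scale and limits the influence of a single atypical wave.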
Using the 1996 wave that contains both the FFPI module and the 16PA module, Nyhus and Webley (2001) checked the correlation between the 5 factors identified by the 16PA scale and the (big) five factors identified by the FFPI. The correlation is generally high though not perfect. This suggests that both sets of factors assess slightly different aspects of the latent factors. We followed Nyhus and Webley and use a slightly less general wording for the various dimensions identified from the 16PA scale.

We have constructed our five factors by adding the (standardized) items identified by Nyhus and Webley (2001) for the respective scales. In other words, “Emotional stability” is constructed using items:
• “oriented toward reality”/“dreamer”,
• “happy with myself”/“doubtful”,
• “need to be supported”/“independent”,
• “well-balanced”/“quick-tempered”,
• “slow-thinker”/“quick-thinker” and,
• “easily worried”/“not easily worried”.

“Agreeableness” is constructed using items:
• “creature of habit”/“open to changes”,
• “slow thinker”/“quick thinker”,
• “quiet, calm”/“vivid, vivacious”.

“Autonomy” is constructed based on:
• “direct, straightforward”/“diplomatic”,
• “quiet, calm”/“vivid, vivacious” and,
• “shy”/“dominant”.

“Extraversion” is based on:
• “oriented towards things”/“towards people”,
• “flexible”/“stubborn” and,
• “trusting, credulous”/“suspicious”.

“Conscientiousness” is constructed using:
• “little self-control”/“disciplined”,
• “carefree”/“meticulous” and,
• “not easily hurt”/“easily hurt, sensitive”.

As a robustness check, we constructed the full scale using the 1993, 1994, 1995 and 1996 waves. We followed Nyhus and Webley (2001) and constructed the five factors using Principal Component Analysis and varimax rotation on the five main factors. The correlation between each of the factors we constructed using only 16 items and the corresponding factor using the full scale varies between 0.42 for agreeableness and 0.76 for emotional stability.

References

Anderberg, D. (2004): “Getting Hitched: The Equilibrium Marriage Market Behaviour of a British Cohort,”

Royal Holloway, University of London: Discussion Papers in Economics.
Atakan, A. (2006): “Assortative Matching with Explicit Search Costs,” Econometrica, 74(3), 667–680.
Becker, G. S. (1973): “A Theory of Marriage: Part I,” Journal of Political Economy, 81(4), 813–46.
Becker, G. S. (1991): A Treatise on the Family. Harvard University Press.
Ben-Akiva, M., N. Litinas, and K. Tsunekawa (1985): “Continuous spatial choice: The continuous Logit model and distributions of trips and urban densities,” Transportation Research Part A: General, 19, 83–206.
Ben-Akiva, M., and T. Watanatada (1981): “Application of a continuous spatial choice logit model,” in Structural Analysis of Discrete Data with Econometric Applications, MIT Press.
Bertrand, M., and A. Schoar (2003): “Managing With Style: The Effect Of Managers On Firm Policies,” Quarterly Journal of Economics, 118(4), 1169–1208.
Bojilov, R., and A. Galichon (2013): “Closed-form formulas for multivariate matching,” mimeo.
Borghans, L., A. L. Duckworth, J. J. Heckman, and B. ter Weel (2008): “The Economics and Psychology of Personality Traits,” Journal of Human Resources, 43(4), 972–1059.
Bowles, S., H. Gintis, and M. Osborne (2001): “The Determinants of Earnings: A Behavioral Approach,” Journal of Economic Literature, 39(4), 1137–1176.
Brandstätter, H. (1988): “Sixteen personality adjective scales as a substitute for the 16PF in experiments and field studies,” Zeitschrift für Experimentelle und Angewandte Psychologie, 35, 370–390.
Browning, M., P.-A. Chiappori, and Y. Weiss (2013): Family Economics, Cambridge University Press, forthcoming.
Bruze, G. (2011): “Marriage Choices of Movie Stars: Does Spouse's Education Matter?,” Journal of Human Capital, 5(1), 1–28.
Charles, K. K., E. Hurst, and A. Killewald (2013): “Marital Sorting and Parental Wealth,” Demography, 50(1), 51–70.
Chiappori, P.-A., and S. Oreffice (2008): “Birth Control and Female Empowerment: An Equilibrium Analysis,” Journal of Political Economy, 116(1), 113–140.
Chiappori, P.-A., S. Oreffice, and C. Quintana-Domeque (2012): “Fatter attraction: anthropometric and socioeconomic matching on the marriage market,” Journal of Political Economy, 120(4), 659–695.
Chiappori, P.-A., S. Oreffice, and C. Quintana-Domeque (2011): “Matching with a Handicap: The Case of Smoking in the Marriage Market,” working paper.
Chiappori, P.-A., B. Salanié, and Y. Weiss (2010): “Assortative Matching on the Marriage Market: A Structural Investigation,” working paper.
Choo, E., and A. Siow (2006): “Who Marries Whom and Why,” Journal of Political Economy, 114(1), 175–201.
Coles, M., and M. Francesconi (2011): “On the emergence of toyboys: the timing of marriage with aging and uncertain careers,” International Economic Review

Journal of Financial Economics, forthcoming.
Dagsvik, J. (1994): “Discrete and Continuous Choice, Max-Stable Processes, and Independence from Irrelevant Attributes,” Econometrica, 62, 1179–1205.
Das, M., and A. van Soest (1999): “A panel data model for subjective information on household income growth,” Journal of Economic Behavior & Organization, 40(4), 409–426.
Decker, C., E. Lieb, R. McCann, and B. Stephens (2013): “Unique equilibria and substitution effects in a stochastic model of the marriage market,” Journal of Economic Theory, 148, 778–792.
Dupuy, A., and A. Galichon (2012): “Canonical Correlation and Assortative Matching: A Remark,” working paper.
Echenique, F., S. Lee, B. Yenmez, and M. Shum (2013): “The Revealed Preference Theory of Stable and Extremal Stable Matchings,” Econometrica, 81(1), 153–171.
Elfving, G. (1967): “A persistency problem connected with a point process,” Journal of Applied Probability, 4, 77–89.
Falato, A., D. Li, and T. T. Milbourn (2012): “Which Skills Matter in the Market for CEOs? Evidence from Pay for CEO Credentials,” working paper.
Fox, J. (2010): “Identification in Matching Games,” Quantitative Economics

Quarterly Journal of Economics, 123, 49–100.
Galichon, A., and B. Salanié (2010): “Matching with Trade-offs: Revealed Preferences over Competing Characteristics,” technical report.
Galichon, A., and B. Salanié (2013): “Cupid's Invisible Hand: Social Surplus and Identification in Matching Models,” working paper.
Graham, B. (2011): “Econometric Methods for the Analysis of Assignment Problems in the Presence of Complementarity and Social Spillovers,” in Handbook of Social Economics, ed. by J. Benhabib, A. Bisin, and M. Jackson. Elsevier.
Gretsky, N., J. Ostroy, and W. Zame (1992): “The nonatomic assignment model,” Economic Theory, 2(1), 103–127.
Heckman, J. J. (2007): Notes on Koopmans and Beckmann's “Assignment Problems and the Location of Economic Activities,” lecture notes, the University of Chicago.
Hendriks, A. A. J., W. K. B. Hofstee, B. De Raad, and A. Angleitner (1999): “The Five-Factor Personality Inventory (FFPI),” Personality and Individual Differences, 27, 307–325.
Hitsch, G. J., A. Hortaçsu, and D. Ariely (2010): “Matching and Sorting in Online Dating,” American Economic Review, 100(1), 130–163.
Horn, R., and C. Johnson (1991): Topics in Matrix Analysis. Cambridge University Press.
Jacquemet, N., and J.-M. Robin (2013): “Marriage with Labor Supply,” working paper.
Kleibergen, F., and R. Paap (2006): “Generalized reduced rank tests using the singular value decomposition,” Journal of Econometrics, 133, 97–126.
Lundberg, S. (2012): “Personality and Marital Surplus,” IZA Journal of Labor Economics,

Behavioral Travel-Demand Models, ed. by P. Stopher and A. Meyburg, pp. 305–314. Heath and Co.
Mueller, G., and E. Plug (2006): “Estimating the effect of personality on male and female earnings,” Industrial and Labor Relations Review, 60(1), 3–22.
Nesheim, L. (2012): “Identification in multidimensional hedonic models,” working paper.
Nyhus, E. K. (1996): “The VSB-CentER Savings Project: Data Collection Methods, Questionnaires and Sampling Procedures,” VSB-CentER Savings Project, Progress Report.
Nyhus, E. K., and P. Webley (2001): “The role of personality in household saving and borrowing behaviour,” European Journal of Personality, 15, S85–S103.
Oreffice, S., and C. Quintana-Domeque (2010): “Anthropometry and socioeconomics among couples: Evidence in the United States,” Economics & Human Biology, 8(3), 373–384.
Resnick, S., and R. Roy (1991): “Random USC functions, Max-stable processes and continuous choice,” Annals of Applied Probability, 1(2), 267–292.
Robin, J.-M., and R. J. Smith (2000): “Tests of rank,” Econometric Theory, 16, 151–175.
Rüschendorf, L. (1995): “Convergence of the iterative proportional fitting procedure,” Annals of Statistics, 23, 1160–1174.
Rüschendorf, L., and W. Thomsen (1993): “Note on the Schrödinger Equation and I-projections,” Statistics and Probability Letters, 17, 369–375.
Shapley, L., and M. Shubik (1972): “The Assignment Game I: The Core,” International Journal of Game Theory, 1, 111–130.
Shimer, R., and L. Smith (2000): “Assortative matching and Search,” Econometrica, 68, 343–369.
Stutzer, A., and B. Frey (2006): “Does marriage make people happy, or do happy people get married?” The Journal of Socio-Economics

American Economic Review, 98, 642–668.
Wong, L. Y. (2003): “Structural Estimation of Marriage Models,” Journal of Labor Economics, 21(3), 699–728.
Zimmermann, A., and R. Easterlin (2006): “Happily Ever After? Cohabitation, Marriage, Divorce, and Happiness in Germany,” Population and Development Review

Appendix F. Tables

Table 1.

Number of identified young couples and number of young couples with complete information for various subsets of variables.

                                                N
Identified couples                          2,897
Couples with complete information on:
  Education                                 2,883
  The above + Health, Height and BMI (a)

Notes: (1) We have excluded all individuals taller than 210cm or shorter than 145cm and all individuals lighter than 40kg; no one is heavier than 200kg in our data. These exclusions represent less than 1 percent of the sample of adults in the source data. (2) The selected sample for our analysis is the one from the last row.
(a) Excluding health produces exactly the same number of couples at this stage.
Source: DNB. Own calculation.

Table 2. Sample of young couples with complete information: summary statistics by gender.

                           Husbands                  Wives
                       N     mean    S.E.       N     mean    S.E.
Age                  1158   35.52    6.01     1158   32.78    4.84
Educational level    1158    2.01    0.57     1158    1.87    0.57
Height               1158  182.33    7.20     1158  169.35    6.41
BMI                  1158   24.53    2.94     1158   23.44    3.83
Health               1158    3.21    0.66     1158    3.11    0.69
Conscientiousness    1158   -0.25    0.64     1158    0.01    0.68
Extraversion         1158   -0.12    0.69     1158    0.16    0.60
Agreeableness        1158   -0.06    0.65     1158   -0.04    0.64
Emotional stability  1158    0.17    0.57     1158   -0.19    0.53
Autonomy             1158    0.00    0.67     1158   -0.01    0.69
Risk aversion        1158    0.06    0.68     1158   -0.12    0.88

S.E. means Standard Error.

Table 3. Estimates of the Affinity matrix: quadratic specification (N = 1158).

Columns (wives' attributes): Education, Height, BMI, Health, Consc., Extra., Agree., Emotio., Auto., Risk
Rows (husbands' attributes):
Education   -0.04 0.05 -0.04 0.04 0.02 0.00
Consc.      -0.06 -0.03 0.07 0.00 -0.11 0.11 -0.04 0.03 -0.09 0.01
Risk        0.00 0.02 -0.03 0.02 0.01 -0.01 -0.01 -0.05 0.05

Note: Bold coefficients are significant at the 5 percent level.

Table 4. Share of observed joint utility explained.

Index   Share of joint utility explained   Standard deviation of shares
I1      27.98***                           2.25
I2      16.60***                           1.55
I3      14.20***                           1.59
I4      10.07***                           1.54
I5       9.18***                           1.68
I6       8.51***                           1.48
I7       6.24***                           2.58
I8       4.14***                           1.91
I9       2.09                              2.26
I10      0.99                              1.03

I1-I10 indicates the 10 indices created by the Singular Value Decomposition of the affinity matrix.
*** significant at 1 percent.

Table 5. Indices of attractiveness.

Columns: I1 (M, W), I2 (M, W), I3 (M, W)
Education   -0.56
Health      -0.08 0.02 -0.20 0.02 -0.04 -0.02
Consc.      -0.17 -0.14 0.37 -0.59
Agree.      -0.02 -0.05 0.16 0.00