Two-Sided Matching Markets in the ELLIS 2020 PhD Program

Abstract

The ELLIS PhD program is a European initiative that supports excellent young researchers by connecting them to leading researchers in AI. In particular, PhD students are supervised by two advisors from different countries: an advisor and a co-advisor. In this work we summarize the procedure that, in its final step, matches students to advisors in the ELLIS 2020 PhD program. The steps of the procedure are based on the extensive literature on two-sided matching markets and the college admissions problem [Knuth and De Bruijn, 1997, Gale and Shapley, 1962, Roth and Sotomayor, 1992]. We introduce PolyGS, an algorithm for the case of two-sided markets with quotas on both sides (also known as many-to-many markets), which we use throughout the selection procedure of pre-screening, interview matching and final matching with advisors. The algorithm returns a stable matching in the sense that no unmatched persons prefer to be matched together rather than with their current partners (given their indicated preferences). Roth [1984] gives evidence that only stable matchings are likely to be adhered to over time. Additionally, the matching is student-optimal. Preferences are constructed based on the rankings each side gives to the other side and the overlaps of research fields. We present and discuss the matchings that the algorithm produces in the ELLIS 2020 PhD program.


Maximilian Mordig

MPI for Intelligent Systems and ETH Zürich

Riccardo Della Vecchia

Artificial Intelligence Lab, Institute for Data Science & Analytics, Bocconi University, Milano, Italy

February 8, 2021


The ELLIS PhD program is a European initiative that supports excellent young researchers by connecting them to leading researchers in AI. In particular, PhD students are supervised by two advisors from different countries. This document summarizes the procedure that, in its final step, matches students to advisors in the ELLIS 2020 PhD program. After an overview of the selection procedure, we present the theory of two-sided matching markets, which is used first to match the students to the evaluators for an initial pre-screening, then to match candidates to advisors for the interviews, and finally to match students to advisors in ELLIS 2020. We propose an algorithm that matches students to advisors. This matching is a suggestion and may be inspected and corrected manually to satisfy additional criteria. The co-advisor must be manually matched to a student and must be from a different country than the main advisor in ELLIS.

Briefly, each student ranks research fields and/or professors they would like to work with, and vice versa for professors. This happens in the following order. First, when students apply, they are pre-screened using just the preferences of students over professors and the overlaps between fields of students and advisors. After this, advisors also rank the acceptable candidates and the interviews are set. After the interviews, both students and professors can update their preferences accordingly and the final matching takes place, matching each student to an advisor. The co-advisor is added separately, taking care of possible constraints imposed by the program.

In [Mordig and Della Vecchia, 2021], we address the case when both advisors and co-advisors should be matched algorithmically to a student without human aid. We assume that all preferences are available to us. When the acquisition of preferences is too expensive, Charlin and Zemel [2013] use machine-learning approaches to interpolate sparse preference data to other persons.
Furthermore, our procedure relies heavily upon the seminal work on two-sided markets by Gale and Shapley [1962], which has had high practical impact since its introduction, leading to improved matching systems for high-school admissions and labor markets [Roth, 1984], house allocations with existing tenants [Abdulkadiroğlu and Sönmez, 1999], content delivery networks [Maggs and Sitaraman, 2015], and kidney exchanges [Roth et al., 2005].

We introduce the different phases of the selection procedure in a broad and schematic way in Section 2. In Section 3 we present the algorithm that is used (with different parameters) in the different phases. In Section 4 we state the choices of parameters and give additional details about the procedure adopted in the phases of the selection procedure.

The ELLIS PhD selection procedure consists of the following phases:

• Phase 1 (Student-Evaluator Matching for Pre-Screening): Students state up to five research fields and optionally rank up to 10 professors. Each professor provides up to five research fields and specifies an evaluator who pre-screens applications (e.g. one of her postdocs). Based on research fields and on the ranking of professors made by students, both students and evaluators are assigned preferences over the other side of the market. A matching algorithm runs to match students and evaluators. Each student is matched to three evaluators. Each evaluator is assigned the same maximum capacity of students to pre-screen. Evaluators score the assigned students.

• Phase 2 (Student-Advisor Matching for Interviews): After filtering out students with poor scores, professors rank students based on the pre-screening scores and other means. Each professor ranks a number of students from 1 - 4 (best to worst), reflecting their priorities for interviews. 5 and 6 indicate that the scoring advisor does not consider the applicant for interviews. 5 specifies that the student is not a good fit for the scoring advisor, but may be good for someone else; 6 marks the student as unfit. An advisor may be assigned to interview any of her ranked students. Each professor specifies her (maximum) interviewing capacity. Using the preferences of students from Phase 1, the ranking assigned by advisors to students in this phase, plus the overlap of research fields, the matching for interviews is computed. Each student must be interviewed by three advisors (ELLIS Excellence Criterion). The algorithm provides each advisor with a number of students to interview not greater than her interviewing capacity. She must interview the first 80% of the assigned students; the remaining 20% are provided on a voluntary basis.

• Phase 3 (Student-Advisor Matching for Hiring): Advisors and students rank the other side again. Advisors also provide their (maximum) hiring capacity. Based on their updated preferences, advisors and students are matched together. Each student is matched to at most one advisor and each advisor is matched up to her hiring capacity.

• Phase 4 (Student-Co-advisor Matching for Hiring): Advisors send out acceptance letters to all the students they got matched with. They make sure that they find a co-advisor for any student who accepts. This year, there is no algorithmic support to find co-advisors.

• Phase 5 (Matching the Unmatched): Students who are ranked for offers, but not matched, will enter the pool for the rematching phase. Advisors who were not matched or still have extra hiring capacity can see in the system which students are still unmatched but positively scored. This year, there is no algorithmic support for the rematching stage.

Due to the limited interviewing capacity of advisors and the high number of ranked students, it proved impossible to match all promising candidates to 3 interviews. In practice, then, students are also matched for 1 or 2 interviews.

Matching theory – polygamous market

This section explains the theory behind the matching algorithm (Algorithm 1) which is used in each of the phases of the ELLIS PhD program selection. Our work is based on an extensive literature [Knuth and De Bruijn, 1997, Gale and Shapley, 1962, Roth and Sotomayor, 1992] which culminated in the 2012 Economic Nobel Prize to Alvin E. Roth and Lloyd Shapley for their work on stable allocations and the practice of market design. Two-sided matching markets are game-theoretic abstractions which correspond to bipartite matchings. We recall some basic results for marriage markets in Appendix A and their extensions to the college admission problem in Appendix B. The theory below provides the theoretical background for the case of two-sided markets with quotas on both sides, also known as many-to-many markets.

Consider the setting where students need to be matched to advisors for interviews as in the case at hand. Each student can take a maximum number of interviews and each professor can interview a limited number of students. Furthermore, no student should be assigned to the same advisor more than once. Students and advisors form the two sides of a matching market. We can phrase this problem as follows.

We adopt the convention with men and women for consistency with the literature, and we want to stay out of the gender debate. We use the pronouns "his/her" to make clear that we refer to a man/woman in the model, so "person or himself" can also mean "person or herself".

Let $M$ and $W$ be finite sets of men and women, and let each person be endowed with a strict order of preference with respect to the members of the opposite sex. For each person $p \in W \cup M$, define $q_p$ to be the "quota"/"capacity" of this person, i.e., the number of spouses from the opposite sex this person seeks. We call this market a "polygamous market", in reference to the classical marriage market problem introduced by Gale and Shapley [1962]. In this case, we want to allow quotas on both sides, unlike college admission markets and marriage markets in Appendix A and B. Let us formally define preference lists and the polygamous market.

Definition 1 (preference lists). For a market over men $M$ and women $W$, each man $m$ has preferences $P(m)$ over all persons in $W \cup \{m\}$ defined by the binary relations $\ge_m$, $=_m$ (which define $>_m$, $<_m$, $\le_m$). A man has strict preferences if $=_m$ is equal to $=$, i.e. if he is not indifferent between any two women. Analogously, each woman $w$ has preferences $P(w)$ over all persons in $M \cup \{w\}$. Person $p'$ is acceptable to $p$ if $p' >_p p$. For persons $p, p'$ and a set of persons $K$, we define $p' >_p K \iff \exists\, \tau \in K : p' >_p \tau$.

We assume that the spots are independent such that individual preferences suffice to express the preferences over assignments of up to $q_p$ persons.

Definition 2 (polygamous market). A polygamous market over men $M$ and women $W$ is defined by the quadruple $(M, W, P, Q)$, where:

• $Q = \{q_m \mid m \in M\} \cup \{q_w \mid w \in W\}$ are the quotas of men and women,

• $P$ is the set of preference lists of men and women: $P = \{P(m_1), \ldots, P(m_{|M|}), P(w_1), \ldots, P(w_{|W|})\}$.

Definition 3 (valid matching). A matching on the polygamous market $(M, W, P, Q)$ is a multi-set (i.e. elements can appear more than once) $\mu \subset M \times W$ such that:

• $|\mu(w)| \le q_w \ \forall w \in W$, where $\mu(w) = \{m \mid (m, w) \in \mu\}$,

• $|\mu(m)| \le q_m \ \forall m \in M$, where $\mu(m) = \{w \mid (m, w) \in \mu\}$.

We have the property $m \in \mu(w) \iff w \in \mu(m)$. $|\mu(m)| \le q_m$ means that $m$ is matched $q_m - |\mu(m)|$ times to himself, and we fill $\mu(m)$ with $m$ up to size $q_m$; the same holds for women. Since a set does

When the quotas $Q$ satisfy the college admission market assumption, i.e. one side of the market has quotas all equal to one, this definition coincides with the definition given in Appendix B.

A matching $\mu$ is unstable if any two persons prefer to be together rather than with their assigned partners. From this, a new matching can be constructed. It should be valid in the sense that it matches any pair at most once. So $(m, w)$ is only a blocking pair if it is not matched already in $\mu$. On marriage and college admission markets, this definition coincides with the definitions in Appendix B.

Definition 4 (stability). A matching $\mu$ is unstable if there exists a pair $(m, w) \in (M \times W) \cup \{(m, m) \mid m \in M\} \cup \{(w, w) \mid w \in W\}$ such that $(m = w \lor m \notin \mu(w))$ and $w >_m \mu(m)$ and $m >_w \mu(w)$. $(m, w)$ is called a blocking pair of $\mu$.

In Algorithm 1, we introduce PolyGS, which is a direct reformulation of the Gale-Shapley (GS) algorithm to the case of quotas on both sides of the polygamous market. All men are initially unmatched and women start by matching with themselves as many times as they have capacity. At each step of the algorithm, a man with available quota proposes to the woman he prefers most among all women to whom he has not yet proposed. Next, the woman compares this offer with the least favourite man among all the ones she is provisionally engaged to. If the new proposal is worse according to her preference list, she directly rejects it. If the new proposal is better according to her preference list, she disengages the least favourite man and provisionally engages with the new one. As long as a man has not filled his quota, he continues proposing. If no acceptable woman is left, he fills the remaining spots with himself.

Furthermore, we use the notation $[w]^{q_w}$ to refer to the list $[w, \ldots, w]$ of size $q_w$. In Algorithm 1, the function offerNext$(P, m)$ returns $m$'s most preferred woman to whom he has not yet proposed (or $m$ himself if no such woman is left), given his preferences $P(m)$. In the algorithm, we define $\mu(m) = \{w \mid (m, w) \in \mu\}$, so it is not filled to capacity (during the algorithm). The function weakestMatch$(w, \mu(w), P)$ returns the weakest match $p \in \mu(w)$ of the woman $w$ ($p$ can be either a man or the woman herself) according to her preferences $P(w)$. Note that $m >_w$ weakestMatch$(w, \mu(w), P)$ is equivalent to $m >_w \mu(w)$.

Algorithm 1: PolyGS

Data: Market $(M, W, P, Q)$, quotas $Q$

Result: Matching $\mu$

    $\mu \leftarrow \bigcup_{w \in W} \bigcup_{i=1}^{q_w} \{(w, w)\}$   // match every woman $q_w$ times to herself
    while there is a man $m$ with available capacity, i.e. $|\mu(m)| < q_m$ do
        $w \leftarrow$ offerNext$(P, m)$   // best woman $w$ to which $m$ has not yet proposed, otherwise $m$
        if $w = m$ then   // man proposed to himself and accepts
            $\mu \leftarrow \mu \cup \{(m, m)\}$
        else if $m >_w m' =$ weakestMatch$(w, \mu(w), P)$ then
            $\mu \leftarrow \mu \setminus \{(m', w)\}$   // unmatch $(m', w)$ (once only)
            $\mu \leftarrow \mu \cup \{(m, w)\}$   // match $(m, w)$
        end
    end
    return $\mu$

The algorithm coincides with the GS algorithm on marriage markets and with the college GS algorithm on college admission markets. We now prove that this algorithm returns a matching which matches the same man and woman at most once. Additionally, it is stable and optimal for the side which is proposing. The proofs are adaptations of the many-to-one case [Gale and Shapley, 1962].
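As an illustration of Algorithm 1 and Definition 4, the following Python sketch implements a men-proposing PolyGS under simplifying assumptions: preferences are strict, every preference list contains only acceptable partners, and self-matches are left implicit (a person with fewer partners than quota is treated as matched to himself or herself for the remaining spots). The function names `poly_gs` and `blocking_pairs`, the dictionary-based data layout and the toy market are choices made for this sketch and are not taken from the ELLIS implementation. The stability check only looks for blocking pairs of a man and a woman; individual rationality holds by construction because only listed partners are ever matched.

```python
def poly_gs(men_prefs, women_prefs, quotas):
    """Men-proposing PolyGS sketch.

    men_prefs[m]   : acceptable women for m, best first (strict order)
    women_prefs[w] : acceptable men for w, best first (strict order)
    quotas[p]      : capacity of person p
    Returns a dict person -> list of partners (self-matches are implicit).
    """
    # rank[w][m] = position of m in w's list; lower is better
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next woman on m's list
    matches = {p: [] for p in list(men_prefs) + list(women_prefs)}
    free = [m for m in men_prefs if quotas[m] > 0]
    while free:
        m = free.pop()
        # m proposes while he has spare capacity and women left on his list
        while len(matches[m]) < quotas[m] and next_idx[m] < len(men_prefs[m]):
            w = men_prefs[m][next_idx[m]]
            next_idx[m] += 1
            if m not in rank[w] or quotas[w] == 0:
                continue  # w rejects: m is unacceptable or w has no capacity
            if len(matches[w]) < quotas[w]:
                # w still holds a self-match, weaker than any acceptable man
                matches[w].append(m)
                matches[m].append(w)
                continue
            weakest = max(matches[w], key=rank[w].__getitem__)
            if rank[w][m] < rank[w][weakest]:
                matches[w].remove(weakest)
                matches[weakest].remove(w)
                matches[w].append(m)
                matches[m].append(w)
                free.append(weakest)  # displaced man resumes proposing later
    return matches


def blocking_pairs(men_prefs, women_prefs, quotas, matches):
    """Pairs (m, w), not already matched, that block `matches` (Definition 4)."""
    def would_accept(p, prefs, candidate):
        if candidate not in prefs[p] or quotas[p] == 0:
            return False
        if len(matches[p]) < quotas[p]:
            return True  # candidate beats one of p's implicit self-matches
        pos = prefs[p].index
        return pos(candidate) < max(pos(x) for x in matches[p])

    return [(m, w) for m in men_prefs for w in women_prefs
            if w not in matches[m]
            and would_accept(m, men_prefs, w)
            and would_accept(w, women_prefs, m)]


# Hypothetical toy market: students propose to advisors.
men_prefs = {"s1": ["a1", "a2"], "s2": ["a1", "a2"], "s3": ["a1"]}
women_prefs = {"a1": ["s3", "s2", "s1"], "a2": ["s1", "s2"]}
quotas = {"s1": 2, "s2": 1, "s3": 1, "a1": 2, "a2": 1}
mu = poly_gs(men_prefs, women_prefs, quotas)
print(mu, blocking_pairs(men_prefs, women_prefs, quotas, mu))
```

In this toy market `s1` ends up with a single partner although his quota is 2, because `a1` prefers her two other matches; this corresponds to `s1` filling his remaining spot with himself in Algorithm 1.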

Proposition 5. PolyGS terminates and returns a valid and stable matching (Definition 4).

If $\mu$ were a multi-set, they could match several times. In this case, stable matchings can be obtained by running the traditional GS algorithm on the extended market, where both sides are replicated according to their capacities. The case $m = w$ is also known as individual rationality.

Proof. The algorithm terminates because each man proposes to each woman at most once. The returned matching $\mu$ satisfies the quotas because $|\mu(w)| = q_w$ is preserved over iterations for all women $w$ and the algorithm terminates only when $|\mu(m)| = q_m$ for all men $m$. Since each man proposes to each woman at most once, a man can be matched to a woman at most once. Therefore, the matching is valid.

To prove stability, we use the following property: over iterations, the weakest match of any woman $w$ cannot decrease. By contradiction, let $(m, w)$ be a blocking pair of the matching $\mu$. If $m = w := p \in M \cup W$, this implies $p >_p \mu(p)$, but this can never happen because only acceptable partners are matched. Otherwise, $m$ is a man and $w$ is a woman and there exist $m', w'$ (not necessarily man and woman) such that $w >_m w' \in \mu(m)$ and $m >_w m' \in \mu(w)$, where $m'$ is the weakest element of $\mu(w)$. Since $w >_m w' \in \mu(m)$, $m$ must have proposed to $w$. Since $m >_w m' \in \mu(w)$ and the weakest match can never decrease, $w$ would never have rejected $m$, which contradicts $m \notin \mu(w)$.

Looking at the proof of optimality in the college admission market (without passing via the extended market), we see that it can be extended without problems to this setting. The difference is that a matching $\mu$ is unstable only if the blocking pair $(m, w)$ is not already part of $\mu$, i.e. $m \notin \mu(w)$.

Proposition 6 (Optimality). Assume strict preferences and men propose (with quotas on both sides).

PolyGS returns a man-optimal result $\mu$, i.e. $\mu(m)_i \ge_m \mu'(m)_i$ for all men $m$ and all indices $i$, where the assignments $\mu(m), \mu'(m)$ are ordered from best to worst in terms of $\ge_m$ and $\mu'$ is any matching that is stable according to Definition 4. Under strict preferences, PolyGS returns a unique result (independently of the order in which men propose).

Proof. Uniqueness follows immediately from optimality. A woman $w$ is achievable to man $m$ if there exists a stable matching $\mu'$ such that $m \in \mu'(w)$. It is enough to prove that a man $m$ is never rejected by an achievable woman. Indeed, assume there exists a man $m$ and let $i$ be the first index such that $\mu(m)_i <_m \mu'(m)_i$. By assumption, $\mu(m)_j \ge_m \mu'(m)_j >_m \mu'(m)_i$ for all $j < i$. Also, $\mu'(m)_i >_m \mu(m)_i \ge_m \mu(m)_j$ for all $j \ge i$, and this means that man $m$ was rejected by $\mu'(m)_i$ since $m$ applied to $\mu(m)_i <_m \mu'(m)_i$ and is not matched to $\mu'(m)_i$.

Let $m$ be the first man who is rejected by an achievable woman $w$ during the execution of the algorithm. Since $w$ is achievable to $m$, let $\mu'$ be the stable matching in which $m$ is matched to $w$, i.e. $m \in \mu'(w)$ (and $w \in \mu'(m)$). Since $m$ was rejected (and is acceptable to $w$), there must be $q_w$ other men who are all preferred by the woman: $m_i >_w m \in \mu'(w) \ \forall i = 1, \ldots, q_w$. Because none of these other men $m_i$ was yet rejected by an achievable woman, $\forall i \ \exists j_i : w \ge_{m_i} \mu'(m_i)_{j_i}$. Indeed, by contradiction, assume that there exists an index $i$ such that $w <_{m_i} \mu'(m_i)_j \ \forall j = 1, \ldots, q_{m_i}$. Then $m_i$ cannot be matched to $w$. Since $m_i$ applied to $w$, this means that $m_i$ was rejected by the achievable $\mu'(m_i)_{j_i}$ for some $j_i$, which contradicts $m$ being the first man with this property.

Since $m \in \mu'(w)$, there must exist $m_i \notin \mu'(w)$. Thus $w \ne \mu'(m_i)_{j_i}$ and $w >_{m_i} \mu'(m_i)_{j_i} \in \mu'(m_i)$. Thus, $(m_i, w)$ (with the additional property $m_i \notin \mu'(w)$) blocks $\mu'$, which contradicts the stability of $\mu'$. Hence $w$ is not achievable to $m$.

In fact, one can also prove that it is woman-pessimal, i.e. $\mu'(w)_i \ge_w \mu(w)_i \ \forall w, i$ for all stable matchings $\mu'$.

As for the GS algorithm, women can propose and we obtain a woman-optimal matching. Optimality generally does not hold when preferences are not strict. When preferences are non-strict, we can break ties and the algorithm returns a matching that is also stable with respect to the original preferences.

As part of the selection procedure, ELLIS aims to match people on one side of the market to the other side, possibly several times. We will always use PolyGS with students proposing. It reduces to the traditional GS algorithm and the college admission algorithm when quotas are all equal to one or equal to one on one side, respectively. Students always propose to ensure student-optimality. This means that students get the best matches among all the stable ones. In each of the phases, we need to specify the market participants and their preferences as well as their capacities. When the preferences provided to the algorithm are non-strict, ties are broken arbitrarily. To have more control over the tie-breaking, we break some of the ties beforehand. We construct preferences based on the similarity score between fields of research. For person $p$, the research similarity score with person $p'$ is $S(p, p') = R(p)^T \cdot R(p')$, where $R(p)$ denotes the multi-one-hot encoding of the research interests of person $p$. Say $[A, B, C]$ are the available research fields and person $p$ is interested in fields $A$ and $C$, then $R(p) = [1, 0, 1]^T$. A person can use this score to rank the people on the other side of the market.

We now describe each of the phases. Students with incomplete, fake or blatantly bad applications are removed from the system before each phase. Removed students may be added again after manual inspection at each phase, e.g. students who were removed because they were not matched to three evaluators (Phase 1), three interviews (Phase 2) or to an advisor (Phase 3).

By December 1, students and professors specify their areas of interest. Students may additionally rank up to 10 professors. Each professor provides an evaluator to pre-screen applications. The evaluator's research fields are those of the corresponding professor and the market consists of evaluators and students. Evaluators rank students based on the research similarity score.
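The research similarity score and the resulting evaluator-side ranking can be sketched in a few lines of Python. This is only an illustration: the field list `FIELDS`, the function names and the student names are hypothetical, and the ranking is returned as a list of tie groups because the discrete score produces ties.

```python
FIELDS = ["A", "B", "C"]  # hypothetical list of available research fields


def multi_hot(interests, fields=FIELDS):
    """Encode a set of research interests as the 0/1 vector R(p) over `fields`."""
    return [1 if f in interests else 0 for f in fields]


def similarity(interests_p, interests_q, fields=FIELDS):
    """S(p, p') = R(p)^T R(p'), i.e. the number of shared research fields."""
    r_p = multi_hot(interests_p, fields)
    r_q = multi_hot(interests_q, fields)
    return sum(x * y for x, y in zip(r_p, r_q))


def rank_by_similarity(person_interests, others):
    """Group the other side into tie classes, highest similarity first."""
    scores = {name: similarity(person_interests, interests)
              for name, interests in others.items()}
    levels = sorted(set(scores.values()), reverse=True)
    return [sorted(n for n, s in scores.items() if s == level)
            for level in levels]


# An evaluator interested in fields A and C ranks four hypothetical students.
students = {"s1": {"A", "C"}, "s2": {"B"}, "s3": {"C"}, "s4": {"A"}}
ranking = rank_by_similarity({"A", "C"}, students)
print(multi_hot({"A", "C"}), ranking)
```

Keeping tie groups explicit, rather than flattening the ranking immediately, leaves room for a later tie-breaking step.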
Students give the best ranks to the (up to 10) advisors (evaluators) they listed, followed by all others ordered by research overlap. More precisely, given a student, let $A$ be the ordered set of (up to 10) advisors who were ranked by the student. Let $B$ be the ordered set of all advisors ordered by decreasing research overlap score. Then, the student's preferences become $A + (B \setminus A)$, where the $+$ operation appends the second list to the first list, preserving the order. Since preferences are based on discrete scores, ties can occur. Advisors break ties between students they are indifferent about such that they prefer students who listed them. For example, if $A = \{a_1, [a_2, a_3]\}$ and $B = \{[a_1, a_4], a_2, [a_3, a_5, a_6]\}$, the new preferences are $\{a_1, [a_2, a_3], a_4, [a_5, a_6]\}$. If advisor $a$ has preferences $\{[s_1, s_2, s_3, s_4], [s_5, s_6, s_7]\}$ and $s_1, s_2, s_5$ are the only students who listed $a$, the advisor's preferences become $\{[s_1, s_2], [s_3, s_4], s_5, [s_6, s_7]\}$. The remaining ties are broken arbitrarily. Students all have quota 3. Each evaluator has quota $\lceil 3|S| / |A| \rceil$, where $|S|$, $|A|$ are the total number of students and advisors.

In the period December 2 - 4, the algorithm runs. Students are assigned to evaluators. From December 5 - 10, evaluators score each student (using the scores "A", "A-B", "B", "B-C" and "C"). Because a student may not get assigned to three evaluators (e.g. in the scenario with only one evaluator), we break ties again randomly up to 10 times. If this is unfruitful, we remove (a subset of) insufficiently matched students and rerun the algorithm. This is repeated until a solution is found.

In Figure 1, we plot a graph that shows that a very large proportion of students gets matched to their first choice.

Bound on the number of removed students:

Since we remove students, the matching may not be stable with respect to the original preferences over all students (including the removed students). We can bound the maximum number of removed students. Let $q_s$ be the maximum (and target) capacity of students and $q_s^{\min}$ the minimum number of matches a student must have in order not to be removed. The evaluator capacity is $q_e = \lceil q_s |S| / |E| \rceil$. Suppose $k$ students were removed and consider the next iteration (one iteration corresponds to tie breaking up to 10 times). This means that at most $q_s(|S| - k)$ evaluator spots are occupied, i.e. at least $q_e |E| - q_s(|S| - k) \ge q_s k$ evaluator spots are free. In this iteration, a student cannot be matched if all these free spots are distributed over at most $q_s^{\min} - 1$ evaluators. Hence, if $q_s k > (q_s^{\min} - 1) \cdot q_e$, every student is guaranteed to find enough evaluators and the algorithm terminates. This holds when $q_s k > (q_s^{\min} - 1) \cdot (q_s |S| / |E| + 1)$. Therefore, the maximum number of removed students is

$k \le \lfloor (q_s^{\min} - 1) \cdot (|S| / |E| + 1 / q_s) + 1 \rfloor$.

The fraction of removed students is

$k / |S| \le (q_s^{\min} - 1) \cdot (1 / |E| + 1 / (|S| q_s)) + 1 / |S|$.

As expected, $k$ is smaller the smaller the ratio $|S| / |E|$ is. In Phase 1 of ELLIS, $q_s = 3$, $q_s^{\min} = 3$, $|E| \approx \ldots$, $|S| \approx \ldots$, so $k \le 11$ and $k / |S| \le \ldots$.

Figure 1: ... $i$th ranked person (with $i$ on the $x$-axis). The top-right is for the second-best match and the bottom-right is for the third-best match.

From December 14, 2020 to January 15, 2021, professors score students (1 - 4, 5, 6) based on the application documents, scores from the pre-screening phase and other means (in the system, professors can see which students listed them, so they are more likely to rank those). The score "5" means "not a good fit for me, but still good"; "6" means "this student should not be part of ELLIS" (and is an indication to remove this student from the system). If an advisor ranks a student "5" or "6", he will not get matched to this student. Professors also specify their interviewing capacity. Based on the scores and research overlaps (to break ties of students with the same scores), a ranking over scored students is created for each professor. A professor can only be matched to students he scored and did not score "5" or "6". Students are assigned preferences in the same fashion as before. Each student has quota 3 ("ELLIS Excellence Criterion"; to ensure the excellence of the hired students, this is one of the requirements decided by the ELLIS committee). While it is certainly possible that an advisor interviews all students he is interested in, the goal is to also give less "visible/good" candidates a chance to get interviewed and possibly hired. Therefore, even the best students can only get three assigned interviews. Additional interviews can be organized individually if necessary. Each advisor is assigned a quota which is 80% of her stated interviewing capacity (when the stated interview capacity is less than 3, it is left as is). We decrease the capacity to 80% because an advisor must interview all of these 80% and 100% may be too severe. Advisors are not constrained regarding their remaining capacity; we give suggestions for these remaining interviews as outlined below.

Because the ELLIS Excellence Criterion is hard to satisfy, we require each student to be matched to at least 2 interviews rather than 3. If this student should be hired in Phase 3, the criterion can be satisfied by manually arranging an interview. Therefore, each student has a minimum capacity of 2 and a maximum capacity of 3. When a student is assigned to fewer than 2 interview slots, he is removed and the algorithm reruns (as described in Phase 1). Since many students may be matched to fewer than 2 interviews, we remove at most 20 students at a time (based on average advisor ratings). Note that a student with one interview, who was not removed, can match to one interview slot previously taken by a removed student, and this avoids her removal. Finally, when a student has at least one "1" and at least one "5" or at least two "1", he is never removed even if he just finds one interview slot.

The algorithm runs in the time frame January 15 - January 22, 2021. To gain insight into the next-best matches, students and advisors participate again with their remaining quota (including the additional 20%). No students are removed in this second matching if they have too few interviews. These new matches are included in the optional list of candidates an advisor may wish to interview. However, ELLIS does not put any restrictions on how advisors eventually fill their additional 20% capacity. In Figure 2, we plot the distribution of the rank of the $i$th best match for students with $i = 1, \ldots, 3$, which is the analogue of Figure 1.

Figure 2: ... $i$th ranked person (with $i$ on the $x$-axis). The top-right is for the second-best match and the bottom-right is for the third-best match. If a person is matched less than three times, the unmatched spots are assigned rank -1 (top-right and bottom-left).
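The Phase 2 rules on advisor rankings and on the removal of insufficiently matched students can be sketched as follows. This is a hedged illustration, not the ELLIS code: all function names and the sample data are hypothetical, the final tie-break by name is an arbitrary choice for determinism, and removing the students with the worst (highest) average rating first is an assumption, since the text only says that removal is "based on average advisor ratings".

```python
def advisor_ranking(scores, overlap):
    """Rank the students an advisor scored 1-4 (1 is best); 5 and 6 are excluded.

    Ties in score are broken by larger research overlap, then by name
    (the name tie-break is an arbitrary choice made in this sketch).
    """
    eligible = [s for s, score in scores.items() if score <= 4]
    return sorted(eligible, key=lambda s: (scores[s], -overlap[s], s))


def is_protected(received):
    """At least one "1" together with at least one "5", or at least two "1"."""
    ones = received.count(1)
    return (ones >= 1 and 5 in received) or ones >= 2


def average(values):
    """Average rating; students without any rating sort last (assumption)."""
    return sum(values) / len(values) if values else float("inf")


def students_to_remove(n_interviews, received, min_interviews=2, batch=20):
    """Pick at most `batch` under-matched students, worst average rating first."""
    candidates = []
    for student, n in n_interviews.items():
        if n >= min_interviews:
            continue
        if n >= 1 and is_protected(received[student]):
            continue  # protected students survive with a single interview
        candidates.append(student)
    candidates.sort(key=lambda s: (-average(received[s]), s))
    return candidates[:batch]


# Hypothetical data for one advisor and four students.
ranking = advisor_ranking(
    {"s1": 2, "s2": 1, "s3": 2, "s4": 5, "s5": 6},
    {"s1": 1, "s2": 0, "s3": 3, "s4": 2, "s5": 0},
)
removed = students_to_remove(
    {"s1": 2, "s2": 1, "s3": 1, "s4": 0},
    {"s1": [1, 2], "s2": [1, 5], "s3": [3, 4], "s4": [4, 4, 6]},
)
print(ranking, removed)
```

After each removal batch, the matching would be recomputed on the remaining students, as described for Phase 1.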

Note: These sections will be updated in the coming months.

Between January 23 and February 19, 2021, students and professors arrange interviews. Professors specify their hiring capacity. Students and advisors update their preferences by February 19. Advisors can be indifferent (e.g. score two students "1"), but students cannot (their ranks are 1-10). These preferences are input unmodified to the matching algorithm (and not modified as before based on the overlap of research fields). Students have quota 1 and advisors have quota equal to their hiring capacity. The algorithm is run. Around March, advisors send out letters of acceptance to all students they matched with.

TODO: Ties are broken using one of the two criteria: breaking ties based on excellence (number of interviews or average of points they receive - maybe 6 negative weight and the other positive) or to maximise matchings.

4.4 Phase 4 - Co-Advisor Matching

Note: These sections will be updated in the coming months.

This phase is manual. When a student accepts, the advisor has to find a co-advisor from a different country as required for admission to the ELLIS PhD Program. Students and advisors may already discuss potential co-supervisors during the interview stage.

We want to thank the following persons for the initial idea and help with the practical aspects of this project: Nicolò Cesa-Bianchi, Bernhard Schölkopf, Andreas Geiger, Lynn Anthonissen, Leila Masri, and the other members of the ELLIS PhD Committee.

References

Atila Abdulkadiroğlu and Tayfun Sönmez. House allocation with existing tenants. Journal of Economic Theory, 88(2):233–260, 1999.

Laurent Charlin and Richard Zemel. The Toronto paper matching system: an automated paper-reviewer assignment system. 2013.

David Gale and Lloyd S Shapley. College admissions and the stability of marriage. The American Mathematical Monthly, 69(1):9–15, 1962.

Donald Ervin Knuth and NG De Bruijn. Stable marriage and its relation to other combinatorial problems: An introduction to the mathematical analysis of algorithms, volume 10. American Mathematical Soc., 1997.

Bruce M Maggs and Ramesh K Sitaraman. Algorithmic nuggets in content delivery. ACM SIGCOMM Computer Communication Review, 45(3):52–66, 2015.

Maximilian Mordig and Riccardo Della Vecchia. Three-sided matching markets ... (tbd). forthcoming, 2021.

Alvin E Roth. The evolution of the labor market for medical interns and residents: a case study in game theory. Journal of Political Economy, 92(6):991–1016, 1984.

Alvin E Roth and Marilda Sotomayor. Two-sided matching. Handbook of game theory with economic applications, 1:485–541, 1992.

Alvin E Roth, Tayfun Sönmez, and M Utku Ünver. Pairwise kidney exchange. Journal of Economic Theory, 125(2):151–188, 2005.

A Marriage-market

In the seminal work by Gale and Shapley [1962], the authors introduced two types of matching markets: the marriage markets that we are going to recall in this section, and the college admission markets (Section B). These notions are at the basis of the extension in Section 3. In a classical marriage market consisting of a set of men $M$, women $W$ and preferences of each person over the persons of the other side, the goal is to match each man to at most one woman. Each woman can get married to one man at most. In fact, a person may also choose to match with himself rather than match with some unacceptable partner. We start from some formal definitions and these are illustrated in Example 1.

Definition 7 (preference lists). For a market over men $M$ and women $W$, each man $m$ has preferences $P(m)$ over all persons in $W \cup \{m\}$ defined by the binary relations $\ge_m$, $=_m$ (which define $>_m$, $<_m$, $\le_m$). A man has strict preferences if $=_m$ is equal to $=$, i.e. he is not indifferent between any two people. Analogously, each woman $w$ has preferences $P(w)$ over all persons in $M \cup \{w\}$. Person $p'$ is acceptable to $p$ if $p' >_p p$. These preferences can be represented as ordered lists as shown in Example 1.
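One possible way to represent such preference lists in code is as an ordered list of tie groups, ending (for the acceptable part) with the person himself or herself. The sketch below assumes this representation; the helper names and the sample list `P_m1` are hypothetical and only illustrate the relations "strictly prefers" and "acceptable".

```python
def rank_of(prefs, person):
    """Index of the tie group containing `person`, or None if not listed."""
    for i, group in enumerate(prefs):
        if person in group:
            return i
    return None


def strictly_prefers(prefs, a, b):
    """True if the owner of `prefs` ranks a strictly above b.

    Anyone not listed is treated as ranked below everyone who is listed.
    """
    rank_a, rank_b = rank_of(prefs, a), rank_of(prefs, b)
    if rank_a is None:
        return False
    return rank_b is None or rank_a < rank_b


def acceptable(prefs, owner, other):
    """`other` is acceptable to `owner` if owner strictly prefers other to himself."""
    return strictly_prefers(prefs, other, owner)


# Hypothetical list for a man m1: indifferent between w1 and w2, then w3,
# then himself; w4 is not listed and therefore unacceptable.
P_m1 = [["w1", "w2"], ["w3"], ["m1"]]
print(strictly_prefers(P_m1, "w1", "w3"), acceptable(P_m1, "m1", "w4"))
```

Under strict preferences every tie group has exactly one element, and the representation reduces to a plain ordered list.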

Definition 8 (marriage market). A marriage market over men $M$ and women $W$ is defined by the triple $(M, W, P)$, where $P$ is the set of preference lists, $P = \{P(m_1), \dots, P(m_n), P(w_1), \dots, P(w_p)\}$.

A matching $\mu$ is valid if each person is matched to exactly one partner from the opposite sex or to himself.

Definition 9 (valid matching). A matching on the marriage market $(M, W, P)$ is a correspondence $\mu: M \cup W \to M \cup W$ such that:
• $\mu(m) \in W \cup \{m\}$,
• $\mu(w) \in M \cup \{w\}$,
• $\mu(m) = w \iff m = \mu(w)$.

While many valid matchings exist, a matching should not fall apart quickly because people find better partners and disregard it. This can be ensured if the matching is stable. A matching is stable if there does not exist a man $m$ and a woman $w$ who strictly prefer each other to the partners they are currently matched with. In addition, each person prefers to stay single rather than match with an unacceptable partner. This leads to the following definition.

Definition 10 (stability). A matching $\mu$ is unstable if there exists a pair $(m, w) \in (M \times W) \cup \{(m, m) \mid m \in M\} \cup \{(w, w) \mid w \in W\}$ such that $w >_m \mu(m)$ and $m >_w \mu(w)$. Such a pair $(m, w)$ is called a blocking pair of $\mu$.

Stability implies that it is enough to list all acceptable partners up to the position of the person himself: a person will always prefer to match to himself rather than to anyone who comes later in his preferences. At this point, let us consider an example.

Example 1. Consider the market with men M = { m , m } and women W = { w , w , w } and preferences P (expressed as a list in the order of decreasing preference): P ( m ) = { w , w , w , m } , P ( m ) = { w , w , m } , P ( w ) = { m , m , w } , P ( w ) = { m , m , w } , P ( w ) = { m , w } . We could equivalently write P ( m ) = { w , w , m , w } with woman w unacceptable to m , but this does not affect the set of stable matchings because m will always prefer himself to w . One can verify that the matching µ = { ( m , w ) , ( m , w ) , ( w , w ) } is stable with woman w matched to herself. Another stable matching is µ = { ( m , w ) , ( m , w ) , ( w , w ) } . Indifferent preferences are represented by square brackets: P ( m ) = { [ w , w ] , w , [ w , w ] , m } . It means that w = m w > m w > m w = m w > m m > m anyone else. A person can never be indifferent between himself and anyone else, i.e. [ w , m ] is not allowed in the preferences of m .

Though desirable, it is not obvious that a stable matching can be found in general. The celebrated Gale-Shapley algorithm (GS algorithm), presented in [Gale and Shapley, 1962], shows by construction that a stable matching exists in all marriage markets. The pseudocode of the GS algorithm is provided in Algorithm 2. The GS algorithm is typically presented with all men proposing at the same time; the version given here makes the generalization in Section 3 more straightforward.

Algorithm 2: Deferred Acceptance Algorithm or Gale-Shapley Algorithm
Data: Marriage market $(M, W, P)$
Result: Matching $\mu_M$
    $\mu \leftarrow \{(w, w) \mid w \in W\}$ // match every woman to herself
    while there is an unmatched man $m$ do
        $w \leftarrow$ offerNext$(P, m)$ // partner: woman or man himself
        if $w = m$ then
            $\mu \leftarrow \mu \cup \{(m, m)\}$ // man proposed to himself and accepts
        else if $m >_w \mu(w)$ then
            $\mu \leftarrow \mu \setminus \{(m', w)\}$ // unmatch $(m', w)$, where $m' = \mu(w)$
            $\mu \leftarrow \mu \cup \{(m, w)\}$ // match $(m, w)$
        end
    end
    return $\mu$

The GS algorithm works by letting men propose to women, and women conditionally accept unless they get an offer from a better man later on. It starts with all men unmatched and all women matched to themselves. As long as some man is unmatched, pick any unmatched man. He proposes to his most preferred woman among those he has not yet proposed to. The function that does this in Algorithm 2 is offerNext. In case of indifferent preferences, a man arbitrarily picks any of the equally preferred women. If a man has proposed to all of his acceptable women, he proposes to himself instead (he accepts and remains single). If the woman prefers the man to her current partner, she disengages from her old partner (a man or herself), leaving her old partner unmatched again, and engages with the new man. The algorithm stops once all men are matched (with a woman or themselves). The matched men and women are married. The algorithm is also termed the deferred acceptance algorithm since a woman confirms her engagement only once the algorithm terminates and may break it off for a better man at any time before. The algorithm does not specify the order in which free men are chosen or how a man chooses among women between whom he is indifferent. When preferences are strict, the returned matching is always the same.

Theorem 11 (Gale and Shapley).
The GS algorithm terminates and returns a valid and stable matching.

Proof. The algorithm terminates because the preference lists of men are finite and a man always accepts himself. The returned matching is valid because each man is matched to exactly one partner, himself or a woman, and each woman holds at most one man at any time.

We prove by contradiction that the matching is stable. Assume $(m, w)$ blocks $\mu$. If $m = w$, then $m$ is a man or a woman who is matched to an unacceptable partner in $\mu$. This is impossible because only acceptable partners are matched in the algorithm. (The case $m = w$ is also known as individual rationality in the literature.) Otherwise, $m$ is a man and $w$ a woman with $w >_m \mu(m)$ and $m >_w \mu(w)$. Since men propose in order of decreasing preference, $m$ proposed to $w$ before proposing to $\mu(m)$ and was rejected or later dropped by her. Since $w$'s partner can only improve over the iterations, this gives $\mu(w) \geq_w m$, contradicting $m >_w \mu(w)$.

Assuming that an unmatched man can be identified in $O(1)$, the running time is $O(|M| |W|)$ since each man proposes to each woman at most once. Whilst it is possible to find an unmatched man in $O(1)$ by storing the free men in a set, uniformly picking a free man is not $O(1)$. One could obtain a random sample from a set in O (

Proposition 12 (Optimality). Assume strict preferences and let $\mu$ be the matching returned by the GS algorithm. Then $\mu(m) \geq_m \mu'(m)$ for each man $m$ and any stable matching $\mu'$. Also, $\mu'(w) \geq_w \mu(w)$ for each woman $w$ and any stable matching $\mu'$.

The proof is a special case of the proof of Proposition 6. Since optimality implies uniqueness, the GS algorithm returns a unique matching under strict preferences, independently of the order in which men propose. By inverting the roles of men and women, an equivalent version of the GS algorithm lets women propose to men; the resulting matching is generally different. Under strict preferences, this matching is woman-optimal and the worst stable matching for the men.
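The deferred acceptance procedure of Algorithm 2 and the stability condition of Definition 10 can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it assumes strict preferences, it assumes that each list contains only acceptable partners in decreasing order of preference (so the person himself is left implicit at the end), and the names `gale_shapley`, `is_stable`, `next_offer` and `holder` are hypothetical. The market at the bottom is made up for illustration and is not the market of Example 1.

```python
from collections import deque


def gale_shapley(men_prefs, women_prefs):
    """Men-proposing deferred acceptance for strict preferences.

    men_prefs[m] lists the women acceptable to m, most preferred first;
    women_prefs[w] does the same for w. Returns a dict mapping each man
    to a woman, or to himself if he stays single.
    """
    # rank[w][m] = position of m in w's list (absent means unacceptable).
    rank = {w: {m: i for i, m in enumerate(ms)} for w, ms in women_prefs.items()}
    next_offer = {m: 0 for m in men_prefs}  # plays the role of offerNext
    holder = {w: w for w in women_prefs}    # every woman starts matched to herself
    mu = {}
    free = deque(men_prefs)                 # unmatched men
    while free:
        m = free.popleft()
        if next_offer[m] == len(men_prefs[m]):
            mu[m] = m                       # list exhausted: m matches himself
            continue
        w = men_prefs[m][next_offer[m]]
        next_offer[m] += 1
        old = holder[w]
        if m in rank[w] and (old == w or rank[w][m] < rank[w][old]):
            if old != w:
                del mu[old]                 # unmatch (old, w)
                free.append(old)
            holder[w] = m
            mu[m] = w                       # match (m, w)
        else:
            free.append(m)                  # rejected, m stays unmatched
    return mu


def is_stable(mu, men_prefs, women_prefs):
    """Return True if no pair blocks `mu` in the sense of Definition 10."""
    holder = {w: w for w in women_prefs}
    holder.update({w: m for m, w in mu.items() if w != m})

    def prefers(prefs, new, old, me):
        # Strict preference of `new` over `old`; `me` ranks below every listed partner.
        if new not in prefs:
            return False
        return old == me or prefs.index(new) < prefs.index(old)

    for m, w in mu.items():
        # Individual rationality: matched partners must be mutually acceptable.
        if w != m and (w not in men_prefs[m] or m not in women_prefs[w]):
            return False
    return not any(
        prefers(men_prefs[m], w, mu[m], m) and prefers(women_prefs[w], m, holder[w], w)
        for m in men_prefs for w in women_prefs
    )


# Hypothetical market, not the one from Example 1.
men = {"m1": ["w1", "w2"], "m2": ["w1"]}
women = {"w1": ["m2", "m1"], "w2": ["m1"]}
mu = gale_shapley(men, women)
print(mu, is_stable(mu, men, women))
```

Keeping the free men in a queue gives the $O(1)$ lookup of an unmatched man mentioned above, at the price of a fixed rather than random proposal order; under strict preferences the result does not depend on that order.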

B College Admission Problem

The marriage market can be generalized to the college admission setting. Instead of men and women, the market consists of colleges and students. A student can go to at most one college, but a college can accept more than one student, up to its capacity. We state the definitions more generally than in the previous section, so that some of them also apply to the extension in Section 3, where students can have capacities as well (polygamous market).

Definition 13 (college admission market). A college admission market over students $S$ and colleges $C$ is defined by the quadruple $(S, C, P, Q)$, where:
• $Q = \{q_c \mid c \in C\} \cup \{q_s \mid s \in S\}$ are the capacities of colleges and students,
• $P$ is the set of preference lists of colleges and students: $P = \{P(c_1), \dots, P(c_n), P(s_1), \dots, P(s_p)\}$.

We say that the quotas $Q$ satisfy the college admission market assumption if one side of the market has quotas all equal to one. Without loss of generality, we call students the side with quotas all equal to one. The college admission market is the market where only one side has quotas greater than one, i.e. $q_s = 1 \ \forall s \in S$.

Definition 14 (valid matching). A matching on the college admission market $(S, C, P, Q)$ is a set $\mu \subset S \times C$ such that:
• $|\mu(c)| \leq q_c \ \forall c \in C$, where $\mu(c) = \{s \mid (s, c) \in \mu\}$,
• $|\mu(s)| \leq q_s \ (= 1) \ \forall s \in S$, where $\mu(s) = \{c \mid (s, c) \in \mu\}$.

The sets $\mu(s)$ and $\mu(c)$ equivalently characterize the matching through the property $s \in \mu(c) \iff c \in \mu(s)$. $|\mu(c)| \leq q_c$ means that $c$ is matched $q_c - |\mu(c)|$ times to itself, and we fill $\mu(c)$ with $c$ up to size $q_c$; the same holds for students. We give an example of a college admission market.

Example 2. Consider the college admission market with students S = { s , s , s } and colleges C = { c , c } with capacities and . Preferences for students over colleges are expressed as before. Colleges express their preferences over groups of students of size less than or equal to their capacity. For some preferences, a valid matching could be µ = { ( c , { s , s } ) , ( c , {} ) , ( s , s ) } . This can be equivalently written by filling spots to capacity, µ = { ( c , { s , s } ) , ( c , { c , c , c } ) , ( s , s ) } , or by listing the set µ = { ( c , s ) , ( c , s ) } .

We define preferences between a single person and a set of persons as follows.

Definition 15. Given a set of persons $K$, we say that $p' >_p K$ if there exists $p'' \in K$ such that $p' >_p p''$.

The stability definition carries over and we restate it here for clarity. It assumes that $\mu(s)$ and $\mu(c)$ are filled up to their capacity, as described in Definition 14.

Definition 16 (stability). A matching $\mu$ is unstable if there exists a pair $(s, c) \in (S \times C) \cup \{(s, s) \mid s \in S\} \cup \{(c, c) \mid c \in C\}$ such that $s >_c \mu(c)$ and $c >_s \mu(s)$, i.e. there exist $\tau_c \in \mu(c)$, $\tau_s \in \mu(s)$ such that $s >_c \tau_c$ and $c >_s \tau_s$. In this case, $(s, c)$ is called a blocking pair of $\mu$.

Theorem 17.

The college admission market admits a valid and stable matching.

Proof (sketch). It is possible to find a stable matching by relying on the GS algorithm. This is illustrated in Algorithm 3. It works by mapping the college admission market to an extended marriage market, where college $c$ is replicated according to its capacity into $c_1, \dots, c_{q_c}$, each with the same preferences over students as in the original market (with $c$ replaced by $c_i$). Students equally prefer any of the replicated colleges, i.e. any occurrence of $c$ is replaced by the indifference class $[c_1, \dots, c_{q_c}]$, see Example 3. It is easy to show that a matching is stable on the college admission market if and only if it is stable on the extended marriage market. Therefore, the GS algorithm can be used to find a stable matching on the extended market, which is then mapped back.

Algorithm 3: College Admission Algorithm
Data: Market $(S, C, P, Q)$, quotas $Q$ for one side
Result: Matching $\mu$
    $\tilde{S}, \tilde{C}, \tilde{P}$, mapping $\leftarrow$ mapToExtendedMarket$(S, C, P, Q)$
    $\tilde{\mu} \leftarrow$ Gale-Shapley$(\tilde{S}, \tilde{C}, \tilde{P})$
    $\mu \leftarrow$ mapFromExtendedMarket$(\tilde{\mu}, \text{mapping})$
    return $\mu$

In the GS algorithm, either $\tilde{S}$ can propose to $\tilde{C}$ or vice versa. Traditionally, quotas are such that only one side has quotas greater than one, but the algorithm continues to work for quotas on both sides if we define $\mu$ to be a multi-set so that the same pair can match more than once. In Section 3, we consider the case when both sides have quotas, but the same pair can match at most once.

Example 3 (extended market construction). Consider students $S = \{s, \tilde{s}\}$ with capacities 2, 1 and colleges $C = \{c, \tilde{c}\}$ with capacities 1, 2 and preferences (omitting the person himself in the preferences):

$P(s) = \{c, \tilde{c}\}$, $P(\tilde{s}) = \{\tilde{c}\}$,
$P(c) = \{[\tilde{s}, s]\}$, $P(\tilde{c}) = \{s\}$.

The extended market maps $s \mapsto [s_1, s_2]$, $\tilde{s} \mapsto [\tilde{s}_1]$, $c \mapsto [c_1]$, $\tilde{c} \mapsto [\tilde{c}_1, \tilde{c}_2]$. It is defined over students $S_{ext} = \{s_1, s_2, \tilde{s}_1\}$ and colleges $C_{ext} = \{c_1, \tilde{c}_1, \tilde{c}_2\}$ with preferences:

$P(s_1) = \{[c_1], [\tilde{c}_1, \tilde{c}_2]\}$, $P(s_2) = P(s_1)$, $P(\tilde{s}_1) = \{[\tilde{c}_1, \tilde{c}_2]\}$,
$P(c_1) = \{[\tilde{s}_1, s_1, s_2]\}$, $P(\tilde{c}_1) = \{[s_1, s_2]\}$, $P(\tilde{c}_2) = P(\tilde{c}_1)$.

More precisely, $P(s_2) = P(s_1)$ means $P(s_1) = \{[c_1], [\tilde{c}_1, \tilde{c}_2], s_1\}$ and $P(s_2) = \{[c_1], [\tilde{c}_1, \tilde{c}_2], s_2\}$. Observe what happens to the indifferent preferences of $c$. The GS algorithm runs on this extended market and the obtained matching is transformed back to give a stable matching on the original market.
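The reduction of Algorithm 3 can be sketched in Python along the following lines. This is an illustrative sketch under simplifying assumptions, not the authors' code: students have capacity one and propose, preferences are strict and list only acceptable partners, and a student's indifference among the copies of a college is broken by copy index. The function names `deferred_acceptance` and `college_admission` and the market at the bottom are hypothetical.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Compact proposer-side Gale-Shapley on strict, acceptable-only lists."""
    rank = {r: {p: i for i, p in enumerate(ps)} for r, ps in receiver_prefs.items()}
    holds = {}                           # receiver -> proposer currently held
    nxt = {p: 0 for p in proposer_prefs}
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if nxt[p] == len(proposer_prefs[p]):
            continue                     # p stays unmatched (matched to itself)
        r = proposer_prefs[p][nxt[p]]
        nxt[p] += 1
        cur = holds.get(r)
        if p in rank[r] and (cur is None or rank[r][p] < rank[r][cur]):
            holds[r] = p
            if cur is not None:
                free.append(cur)         # displaced proposer is free again
        else:
            free.append(p)               # rejected, p proposes again later
    return {(p, r) for r, p in holds.items()}


def college_admission(student_prefs, college_prefs, capacity):
    """Students propose; each college c is replaced by capacity[c] copies."""
    # mapToExtendedMarket: copy (c, i) stands for c_i in the text.
    copies = {c: [(c, i) for i in range(capacity[c])] for c in college_prefs}
    # A student's indifference among the copies of c is broken by copy index.
    ext_students = {s: [cp for c in prefs for cp in copies[c]]
                    for s, prefs in student_prefs.items()}
    # Every copy inherits the preference list of its college.
    ext_colleges = {cp: prefs for c, prefs in college_prefs.items() for cp in copies[c]}
    ext_matching = deferred_acceptance(ext_students, ext_colleges)
    # mapFromExtendedMarket: forget the copy index.
    return {(s, c) for s, (c, _i) in ext_matching}


# Hypothetical market, not the one from Example 2 or Example 3.
student_prefs = {"s1": ["c1", "c2"], "s2": ["c1"], "s3": ["c1", "c2"]}
college_prefs = {"c1": ["s1", "s2", "s3"], "c2": ["s3", "s1"]}
capacity = {"c1": 2, "c2": 1}
print(sorted(college_admission(student_prefs, college_prefs, capacity)))
```

Breaking the tie among copies by index is one of many valid choices; the text leaves this choice open, and any fixed order yields a matching that is stable on the original market.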