Recovery thresholds in the sparse planted matching problem
Guilhem Semerjian, Gabriele Sicuro, and Lenka Zdeborová
Laboratoire de Physique de l'École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005 Paris, France
Université Paris-Saclay, CNRS, CEA, Institut de physique théorique, 91191, Gif-sur-Yvette, France
(Dated: May 25, 2020)
We consider the statistical inference problem of recovering an unknown perfect matching, hidden in a weighted random graph, by exploiting the information arising from the use of two different distributions for the weights on the edges inside and outside the planted matching. A recent work has demonstrated the existence of a phase transition, in the large size limit, between a full and a partial recovery phase for a specific form of the weights distribution on fully connected graphs. We generalize and extend this result in two directions: we obtain a criterion for the location of the phase transition for generic weights distributions and possibly sparse graphs, exploiting a technical connection with branching random walk processes, as well as a quantitatively more precise description of the critical regime around the phase transition.
I. INTRODUCTION
A matching of a graph is a subset of its edges such that each node belongs to at most one edge of the matching; in a perfect matching all the nodes are covered in this way [1]. In a weighted graph one defines the weight of a matching as the sum of the weights on its edges, and one can try to minimize or maximize this total weight under the (perfect) matching constraint [2]. This extremization is a problem of combinatorial optimization, widely studied in mathematics, computer science, and also in statistical physics. In this paper we study the planted matching problem, a statistical inference problem where one hides (plants) a perfect matching into a graph, the goal being to find it back. Planted matching problems arise in applications such as particle tracking systems used in experimental physics [3]; the present paper concentrates on a more fundamental aspect, namely the mathematical description of the peculiar type of phase transition this inference problem exhibits.
Statistical inference on graphs and networks is an area of recent interest including problems such as community detection [4], group testing [5], planted Hamiltonian cycle recovery [6], certain types of error correcting codes [7], and many others. All these problems share the common pattern of a signal being observed indirectly via the edges and weights of a graph, with the goal to infer the signal back from these observations. Interestingly, as a function of the signal-to-noise ratio and in the limit of large system sizes, one encounters sharp thresholds (phase transitions). A classical network-inference problem presenting a phase transition is the community detection in graphs created by the Stochastic Block Model (SBM) [4].
In the SBM on sparse graphs a phase transition happens between a no-recovery phase where an estimation of the signal better than a random guess is impossible and a partial-recovery phase where a positive (but bounded away from one) correlation with the signal can be obtained. This phase transition can be of second or first order depending on the details of the model; an interesting connection was put forward between phase transitions of first order and existence of algorithmically hard phases [4]. A typology of phase transitions in problems where the detectability transition (from zero correlation to positive one) appears was recently presented in [8]. In the SBM, in order to obtain full-recovery of the signal, i.e. a correlation with the signal converging to one, the average degree of the graph has to diverge as the logarithm of its size [9]. In low-density-parity-check error correcting codes [7], another widely studied example of inference problem on sparse graphs, there is also a first order phase transition but this time from the partial-recovery phase to the full-recovery phase where the signal (codeword) can be reconstructed with an error that vanishes as the system size diverges.
Here we study the planted matching problem, where a perfect matching is hidden in a graph by adding edges to it. The information about which edges were added comes through the distribution of the weights, with different distributions on the planted and non-planted edges. This problem was introduced in [3] as a toy model in a particle tracking problem, and was studied numerically by solving the corresponding recursive distributional equations for a particular case of the distribution of weights, suggesting a phase transition between full-recovery and partial-recovery phases.
More recently, [10] rigorously analysed another special case of the distribution of weights, and proved the existence of such a phase transition.
We generalize and extend these previous results in several directions (at the level of rigor of theoretical physics). We locate the recovery phase transition for generic weight distributions, considering also a sparse regime for the edges added to the planted configuration. As in [3] we use the standard cavity method related to the belief propagation approximation [11]. Our key contribution is an analytical insight into how the solution behaves, which allows us to derive a rather simple closed-form expression for the threshold, Eq. (45), that holds for generic distributions of weights and both sparse and fully connected graphs (the threshold for the sparse case converges to its fully connected limit exponentially fast in the average degree, see for example the phase diagram in Fig. 1). The results of [10] are also based on the cavity method, but apply only to fully connected graphs and only to the exponential distribution of weights on the planted edges. In this particular case the corresponding recursive distributional equations reduce to a closed system of differential equations. Instead we obtain the generic expression for the threshold by noticing a relation between the solution of the recursive distributional cavity equations and properties of branching random walk processes.
The latter, and more generically the phenomenon of front propagation for reaction-diffusion equations, appear in a variety of contexts and have been extensively studied both in physics and in mathematics [12–22]; the precise way in which this connection arises here is nevertheless, as far as we know, original in the context of mean-field inference problems.
Both our work and [10] show that the recovery threshold in the planted matching problem is of a rather different nature than the thresholds known in the stochastic block model, error correcting codes, or others discussed in the literature. Indeed it separates partial and full recovery phases while occurring at a finite average degree, and there is no sign of a computational gap between the information-theoretically optimal reconstruction accuracy and the one achievable by efficient algorithms. Another aspect in which this transition differs from more usual ones is its thermodynamic order: for the specific case studied in [10] we provide a quantitatively more precise description of the critical regime around the phase transition, Eq. (88), showing that it is of infinite order in the usual thermodynamic classification (all the derivatives of the order parameter vanish at the transition point).
The rest of the paper is organized as follows. In Section II we define more explicitly the problem under study and introduce two statistical estimators (block and symbol Maximum A Posteriori (MAP)) of the planted matching. In Sec. III we present the main equations (Belief Propagation and their probabilistic description) that govern the behavior of the problem. In Section IV we derive our first main result, namely the location of the phase transition for arbitrary weight distributions (for the block MAP estimator), that is illustrated in Sec. V on several examples. Our second main result, i.e. a quantitatively more precise description of the critical regime around the phase transition, is explained in Section VI.
We then study numerically in Sec. VII the threshold of the phase transition for the symbol MAP estimator, while conclusions and perspectives for future work are presented in Sec. VIII. Some more technical details are deferred to a series of Appendices.
II. DEFINITIONS
A. Planted random weighted graphs
We shall consider weighted graphs denoted $G = (V, E, w)$, where $V = \{1, \dots, N\}$ is the set of $N$ vertices, $N$ being an even integer, $E$ the set of edges (unordered pairs of distinct vertices of $V$), and $w = \{w_e : e \in E\}$ a collection of real weights assigned to each edge of the graph. We endow the set of weighted graphs with a probability distribution, the generation of $G$ from this law corresponding to the following steps:
• One first chooses a perfect matching $M$ of $V$ uniformly at random among the $(N-1)!!$ possible ones (a perfect matching $M$ contains $N/2$ edges, each vertex of $V$ belonging to exactly one edge of $M$).
• The edge set $E$ of $G$ is made of the disjoint union of $M$ and additional edges chosen at random: each of the $\binom{N}{2} - N/2$ possible edges not already included in $M$ is added to $E$ with probability $c/N$.
• The weights $w_e$ are independent random variables, with an absolutely continuous distribution given by the density $\hat{p}$ if $e \in M$ and $p$ if $e \in E \setminus M$.
We shall call planted (resp. non-planted) edges those in $M$ (resp. in $E \setminus M$). The parameters of this random ensemble of weighted graphs are thus the even integer $N$, the parameter $c$ controlling the density of non-planted edges, and the two distributions $\hat{p}$ and $p$ for the generation of the weights of the planted and non-planted edges. In formulas the probability to generate a graph $G$, given the choice of $M$, translates from the above description as:
$$ P(G | M) = \prod_{e \in M} \hat{p}(w_e) \prod_{e \in E \setminus M} p(w_e) \times \left(\frac{c}{N}\right)^{|E| - \frac{N}{2}} \left(1 - \frac{c}{N}\right)^{\binom{N}{2} - |E|} \mathbb{I}(M \subseteq E) , \tag{1} $$
where here and in the following $\mathbb{I}(A)$ denotes the indicator function of the event $A$. Note that the number of non-planted edges concentrates in the large size (thermodynamic) limit $N \to \infty$ around its average value $cN/2$ when $c$ remains fixed in this limit, and that these edges form essentially an Erdős-Rényi random graph of average degree $c$ (modulo the exclusion of the planted edges).
The model studied in [3, 10] corresponds to a dense, or fully-connected, version of the model defined above, in which $G$ is a complete weighted graph, $E$ containing all the possible edges between the $N$ vertices. In order to have a well-defined thermodynamic limit in this dense case it is necessary to rescale with $N$ the weights on the non-planted edges, i.e. to use an $N$-dependent distribution $p$. The simplest way to perform this rescaling is to use $p(w) = Q(w/N)/N$, where $Q$ is a density with a support included in the non-negative reals, and a positive density at the origin, $Q(0) > 0$ (in what follows $Q(0) = 1$).
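The three generation steps of the sparse ensemble listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sample_planted_graph`, the labelling of the vertices from 0 to N-1, and the representation of the weighted graph as a dictionary from sorted vertex pairs to weights are assumptions made here, not part of the original text. The two weight samplers are passed as arguments, so that any pair of densities can be plugged in.

```python
import itertools
import random


def sample_planted_graph(n, c, sample_planted, sample_nonplanted, rng):
    """Draw one instance of the sparse planted ensemble (hypothetical helper).

    Returns (planted, weights): the planted perfect matching as a set of
    sorted vertex pairs, and a dict mapping every edge of E to its weight.
    """
    if n % 2:
        raise ValueError("the number of vertices must be even")
    # Step 1: uniform perfect matching, obtained by pairing the consecutive
    # entries of a uniformly random permutation of the vertices.
    vertices = list(range(n))
    rng.shuffle(vertices)
    planted = {tuple(sorted(vertices[k:k + 2])) for k in range(0, n, 2)}
    # Step 3 for the planted edges: weights drawn from the planted density.
    weights = {e: sample_planted(rng) for e in planted}
    # Step 2: every other pair becomes an edge with probability c / n,
    # and receives a weight drawn from the non-planted density.
    for e in itertools.combinations(range(n), 2):
        if e not in planted and rng.random() < c / n:
            weights[e] = sample_nonplanted(rng)
    return planted, weights
```

As a usage example under assumed distributions, `sample_planted_graph(10, 3.0, lambda r: r.expovariate(1.0), lambda r: r.uniform(0.0, 3.0), random.Random(0))` draws an instance with exponential planted weights and non-planted weights uniform on [0, c]. The double loop over all pairs costs of order N² operations, which is fine for small illustrations only.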
The thermodynamic limit of this dense model is then equivalent to the large degree limit of the sparse one, $c \to \infty$ after $N \to \infty$, if one uses for the distribution $p$ of the non-planted edges the uniform distribution on the interval $[0, c]$. We shall thus study the richer sparse model, with finite $c$, and take the large degree limit when needed in order to compare our results with those of the dense case.
B. A statistical inference problem
The question we shall investigate in the following is whether the observation of a graph $G$ generated according to the procedure above allows one to infer the hidden matching $M$, assuming the observer knows the parameters $c$, $p$ and $\hat{p}$ of the model. In this setting all the information the observer can exploit to perform this task is contained in the posterior probability $P(M | G)$. From the expression (1) of the graph generation probability, and the knowledge that the prior probability on $M$ is uniform over the set of all perfect matchings, Bayes' theorem yields immediately the following expression for the posterior:
$$ P(M | G) \propto \prod_{e \in M} \hat{p}(w_e) \prod_{e \in E \setminus M} p(w_e) \, \mathbb{I}(M \subseteq E) \, \mathbb{I}_{\rm pm}(M) , $$
where the symbol $\propto$ hides a normalization constant independent of $M$, and the last term is the indicator function of the event "$M$ is a perfect matching". For notational simplicity it is convenient to encode a set of edges $M \subseteq E$ with binary variables, $m = \{m_e : e \in E\} \in \{0, 1\}^E$, where $m_e = 1$ if and only if $e \in M$, and rewrite the posterior as
$$ P(m | G) \propto \prod_{e \in E} \left( \frac{\hat{p}(w_e)}{p(w_e)} \right)^{m_e} \prod_{i=1}^{N} \mathbb{I}\left( \sum_{e \in \partial i} m_e = 1 \right) , \tag{2} $$
where $\partial i$ denotes the set of edges incident to the vertex $i$. The observer will now compute an estimator $\widehat{M}(G)$, this function of the observations being "as close as possible" to the hidden matching $M$. The optimal estimator actually depends on which notion of "closeness" between $M$ and the estimator $\widehat{M}(G)$ is used.
If the measure of the distance between them is simply the indicator function $\mathbb{I}(M \neq \widehat{M}(G))$, then the optimal estimator, optimal in the sense that it minimizes this distance averaged over all realizations of the problem, is the one maximizing the posterior,
$$ \widehat{M}_{\rm b}(G) = \mathrm{argmax}_{m} \, P(m | G) , \tag{3} $$
where we slightly abused notations and used freely the equivalence between $m$ and $M$.
Following the nomenclature of error-correcting codes [7] we shall call this the block Maximum A Posteriori (MAP) estimator, hence the subscript b. As we shall detail below the estimator $\widehat{M}_{\rm b}$ is the perfect matching of $G$ which minimizes the sum of some effective weights on the edges it contains.
If instead the distance to be minimized is the total number of misclassified edges, $|M \triangle \widehat{M}(G)|$, with $\triangle$ the symmetric difference between sets, or equivalently the Hamming distance between the binary strings $m$ and $\widehat{m}(G)$ encoding them, then the optimal estimator is the so-called symbol MAP one, denoted $\widehat{M}_{\rm s}(G)$, defined by the binary string $\widehat{m}(G)$ where, for all the edges $e \in E$,
$$ \widehat{m}_e(G) = \mathrm{argmax}_{m_e} \, P_e(m_e | G) , \tag{4} $$
with $P_e$ the marginal of the posterior probability (2) for the edge $e$. Note that this estimator is not necessarily a perfect matching; nevertheless it is the one that minimizes the distance $|M \triangle \widehat{M}(G)|$ on average over all realizations of the problem.
For future use let us define the (reduced) distance between the planted matching $M$ and an arbitrary estimator $\widehat{M}$ (i.e. a subset of the edge set $E$) as
$$ \varrho(M, \widehat{M}) = \frac{1}{N} |M \triangle \widehat{M}| \tag{5} $$
$$ = \frac{1}{N} \sum_{e \in M} \mathbb{I}(\widehat{m}_e = 0) + \frac{1}{N} \sum_{e \in E \setminus M} \mathbb{I}(\widehat{m}_e = 1) . \tag{6} $$
If $\widehat{M}$ contains exactly $N/2$ edges, as $M$ does, the two sums in (6) are equal and the distance reduces to
$$ \varrho(M, \widehat{M}) = \frac{2}{N} \sum_{e \in M} \mathbb{I}(\widehat{m}_e = 0) . \tag{7} $$
Our goal in the rest of the article is to discuss the quality of the estimators defined above, in the thermodynamic limit $N \to \infty$, as a function of the parameters of the model.
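The reduced distance of Eqs. (5)-(7) is straightforward to evaluate on a given instance. The short sketch below is only meant as an illustration; the function name `reduced_distance` and the representation of matchings as sets of sorted vertex pairs are assumptions of this example.

```python
def reduced_distance(planted, estimate, n):
    """Reduced distance of Eq. (5): size of the symmetric difference over n.

    `planted` and `estimate` are collections of edges (sorted vertex pairs),
    `n` is the number of vertices of the original graph.
    """
    return len(set(planted) ^ set(estimate)) / n


def missed_planted_fraction(planted, estimate, n):
    """Right-hand side of Eq. (7): twice the number of missed planted edges
    over n. It coincides with Eq. (5) when `estimate` has n / 2 edges."""
    return 2 * len(set(planted) - set(estimate)) / n
```

For instance, with the planted matching {(0, 1), (2, 3)} on four vertices and the estimate {(0, 1), (1, 2)}, one planted edge is missed and one non-planted edge is wrongly included, so both functions return 0.5.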
Following the studies of [3, 10] one expects to find phase transitions between full recovery phases, in which all but a vanishing fraction of the edges of $M$ can be recovered from the observation of $G$, characterized by a vanishing average reconstruction error $\mathbb{E}[\varrho] = 0$, and partial recovery phases where a positive fraction of the edges will be misclassified, $\mathbb{E}[\varrho] > 0$. Before entering the actual computations let us make two simple remarks in order to give the reader a first intuitive idea of the effect of the parameters on the inference difficulty. (i) The identification of the planted edges will be easier if the distributions $p$ and $\hat{p}$ are less similar one to the other; in the extreme case where $p = \hat{p}$ the weights contain absolutely no information on $M$, while if $p$ and $\hat{p}$ have disjoint supports $M$ can be identified by a simple inspection of the weights on the edges. (ii) For a fixed choice of $p$ and $\hat{p}$ the parameter $c$ corresponds to a noise level: if $c$ is very small $E$ contains essentially only the sought-for edges of $M$; increasing it the latter are hidden in the confusing non-planted edges.
III. CAVITY METHOD EQUATIONS
A. A first pruning of the graph
Before proceeding further, let us observe that the inference problem can be in general reduced in size after some simple, preliminary observations. Following the remark (i) in a less drastic case, suppose that the supports of $p$ and $\hat{p}$ are different (but not necessarily disjoint). Then an edge $e$ bearing a weight $w_e$ in the support of $p$ but not in the one of $\hat{p}$ is, without doubt, non-planted; conversely $e$ is certainly planted if $w_e$ is in the support of $\hat{p}$ but not in the one of $p$. All the edges identified in this way can be eliminated from $G$; moreover the two vertices belonging to an edge identified as planted can also be eliminated, as well as the other edges incident to them, that cannot be planted by definition of a perfect matching.
To put these remarks on a quantitative ground let us denote $\mathrm{supp}(\hat{p}) := \{w \in \mathbb{R} : \hat{p}(w) > 0\}$ (more precisely the closure of this set) the support of the distribution $\hat{p}$, and similarly $\mathrm{supp}(p) := \{w \in \mathbb{R} : p(w) > 0\}$ the support of $p$. We define
$$ \Gamma := \mathrm{supp}(p) \cap \mathrm{supp}(\hat{p}) , \tag{8a} $$
$$ \mu := \int_\Gamma p(w) \, \mathrm{d}w , \tag{8b} $$
$$ \hat{\mu} := \int_\Gamma \hat{p}(w) \, \mathrm{d}w . \tag{8c} $$
A non-planted edge $e$ has weight $w_e \notin \Gamma$, and can thus be identified, with probability $1 - \mu$. Similarly, a planted edge $e$ will have $w_e \notin \Gamma$ with probability $1 - \hat{\mu}$, and, in this case, it is surely an element of $M$. The set of planted edges immediately recognizable by means of these simple considerations is
$$ \{ e \in E : w_e \in \mathrm{supp}(\hat{p}) \setminus \mathrm{supp}(p) \} . \tag{9} $$
The edges in this set can be removed from the graph, along with their endpoints and all edges incident to them. After this pruning process, the obtained graph has, on average and in the large $N$ limit, $\hat{\mu} N$ surviving vertices, each of them with degree $1 + Z$, $Z$ being a Poisson random variable of mean $\gamma := c \mu \hat{\mu}$ (each non-planted edge is present with probability $\mu c / N$, but in a graph with $\hat{\mu} N$ vertices).
The distribution of the weights of the surviving edges is now conditioned on the fact that their values are in $\Gamma$.
On the new pruned graph, therefore, the weight distributions are
$$ P(w) := \frac{1}{\mu} \, p(w) \, \mathbb{I}(w \in \Gamma) , \tag{10a} $$
$$ \hat{P}(w) := \frac{1}{\hat{\mu}} \, \hat{p}(w) \, \mathbb{I}(w \in \Gamma) , \tag{10b} $$
for the non-planted and planted edges, respectively. From now on $G = (V, E, w)$ will denote the graph obtained after this pruning, its vertex and edge sets being subsets of those of the original graph.
B. The Belief Propagation equations
Here we present the belief propagation algorithm that was used in [3] for the planted matching problem. Let us introduce a positive parameter (fictitious inverse temperature) $\beta$, and consider the following probability distribution over the configurations $m = \{m_e : e \in E\} \in \{0, 1\}^E$ of binary variables on the edges of a weighted graph $G$,
$$ \nu(m) \propto e^{-\beta \sum_{e \in E} m_e \omega_e} \prod_{i \in V} \mathbb{I}\left( \sum_{e \in \partial i} m_e = 1 \right) , \tag{11} $$
where we introduced effective weights $\omega_e$ on the edges, that are computed from the observed weights $w_e$ as $\omega_e = \omega(w_e)$, with
$$ \omega(w) := - \ln \frac{\hat{P}(w)}{P(w)} . \tag{12} $$
To lighten the notation we kept implicit the dependency on $\beta$ and $G$ of the probability distribution $\nu$; for $\beta = 1$ it coincides with the posterior defined in (2), when $\beta \to \infty$ it concentrates on the configurations maximizing the posterior; these two values of $\beta$ thus allow us to deal with the symbol and block MAP estimators, respectively.
The probability distribution $\nu$ defined in Eq. (11) has the form of a Gibbs measure over all weighted perfect matchings of the graph $G$. The exact computation of its marginals is an intractable task in general; we shall instead study it in an approximate way, using the Belief Propagation (BP) algorithm (see for instance [11] for a general introduction to BP as well as chapter 16 therein for its application to matching problems), that is conjectured to provide an asymptotically exact description in the large size limit for these sparse random graphs. One can indeed consider Eq. (11) as a graphical model, with variable nodes $m_e$ living on the edges of $G$, and two types of interaction nodes: one on each vertex $i \in V$, that imposes that exactly one variable $m_e$ is equal to 1 around it, and one "local field" interaction $e^{-\beta m_e \omega_e}$ for each variable.
The BP equations are then obtained by introducing "messages" on the edges of this factor graph, that mimic the marginal probabilities in amputated graphical models and would become exact if the factor graph were a tree. For the model at hand these messages are of the form $\nu_{i \to e}(m)$, from a vertex $i$ to an edge $e = (i, j)$, and obey the following equations (one for each directed edge of the graph),
$$ \nu_{i \to e}(m) \propto \sum_{\{m_{\tilde{e}}\}_{\tilde{e} \in \partial i \setminus e}} \mathbb{I}\left[ m + \sum_{\tilde{e} \in \partial i \setminus e} m_{\tilde{e}} = 1 \right] \prod_{\tilde{e} = (r, i) \in \partial i \setminus e} \nu_{r \to \tilde{e}}(m_{\tilde{e}}) \, e^{-\beta m_{\tilde{e}} \omega_{\tilde{e}}} . \tag{13} $$
We adopt the convention $\sum_{a \in A} f(a) = 0$ and $\prod_{a \in A} f(a) = 1$ if $A = \emptyset$ for any function $f$. As the variables are binary, $m_e \in \{0, 1\}$ for each $e \in E$, the messages can be conveniently parametrized in terms of "cavity fields" $h_{i \to e}$, one real number for each directed edge, as
$$ \nu_{i \to e}(m) := \frac{e^{\beta m h_{i \to e}}}{1 + e^{\beta h_{i \to e}}} , \tag{14} $$
so that the BP equations become in terms of the cavity fields:
$$ h_{i \to e} = - \frac{1}{\beta} \ln \left[ \sum_{\tilde{e} = (r, i) \in \partial i \setminus e} e^{-\beta (\omega_{\tilde{e}} - h_{r \to \tilde{e}})} \right] . \tag{15} $$
Once a solution of the set of BP equations has been found (for instance by iterating them starting from a random or zero initial condition until convergence to a fixed point is reached), the BP approximation of the marginal probability of the variable $m_e$ on the edge $e = (i, j)$ is given by
$$ \nu_e(m) = \frac{e^{\beta m (h_{i \to e} + h_{j \to e} - \omega_e)}}{1 + e^{\beta (h_{i \to e} + h_{j \to e} - \omega_e)}} . \tag{16} $$
The BP approximation to the symbol MAP estimator defined in (4) is thus obtained by solving the BP equations with $\beta = 1$, and estimating as planted edges those for which $\nu_e(1) > 1/2$, namely
$$ \widehat{M}_{\rm s}(G) := \left\{ e \in E : \nu_e(1) > \frac{1}{2} \right\} = \{ e = (i, j) \in E : h_{i \to e} + h_{j \to e} > \omega_e \} . \tag{17} $$
We will keep the same rule (17) for the conversion of a solution of the BP equations into an estimator of $M$ for all values of $\beta$, and in particular for $\beta \to \infty$. If the block MAP configuration is unique, and if the marginal probabilities $\nu_e$ are computed exactly, then this is a legitimate way of determining the block MAP estimator (3). The BP algorithm is of course only an approximation here, but we conjecture it to be asymptotically exact, i.e. that the reduced Hamming distance between the block MAP estimator and its BP version vanishes in the thermodynamic limit. This relies on rigorous works that, even if they do not directly apply to the case considered here, have proven the exactness of the BP algorithm in similar settings. More precisely, [24] proved that for a given bipartite weighted graph, if the perfect matching with minimal weight is unique then the $\beta \to \infty$ version of the BP equations, associated to the inclusion rule (17), converges to the optimal configuration, in a number of iterations that scales with the gap between the optimal weight and the second minimum, and with the largest weight in the graph. [25] improved this convergence rate for typical bipartite graphs of a random ensemble, while [26, 27] removed the bipartiteness assumption but added a hypothesis on the absence of fractional solutions for the Linear Programming relaxation of the problem.
C. Recursive Distributional Equations
The BP equations have been written in Eq. (15) for a given instance of the graph $G$; to obtain the average error on the ensemble of all possible instances of our problem we need to describe the statistics of the solutions of the BP equations. This step is known as density evolution in the context of error-correcting codes, or as the cavity method in statistical mechanics [28]. We refer the reader to [29, 30] for similar studies of the matchings in sparse, non-planted random graphs, and to [31, 32] which considered the weighted case (still without a planted structure).
Suppose that an instance is generated at random, that the BP equations are solved on it, and that a directed planted edge is chosen uniformly at random, say $i \to e$; let us call $\hat{H}$ the random variable that has the law of the cavity field $h_{i \to e}$. We define similarly $H$ as the random variable distributed as $h_{i \to e}$ when one chooses a non-planted edge. Let us also introduce the random variables $\Omega = \omega(W)$ and $\hat{\Omega} = \omega(\hat{W})$, where $W$ (resp. $\hat{W}$) is a random variable with density $P$ (resp. $\hat{P}$). If one assumes that the typical realizations of $\nu$ have no long-range correlations (the so-called replica symmetric (RS) hypothesis), then (15) translates into recursive distributional equations (RDEs) between the random variables $H$ and $\hat{H}$. A vertex $i$ in a directed planted edge $i \to e$ is incident to a Poissonian number of other non-planted edges because of the Erdős-Rényi nature of the latter, and similarly if $i \to e$ is non-planted there will be exactly one planted edge incident to $i$, and other non-planted edges from the Erdős-Rényi part of the graph. With the RS assumption of independence of the incoming cavity fields one thus obtains:
$$ \hat{H} \overset{d}{=} - \frac{1}{\beta} \ln \left( \sum_{i=1}^{Z} e^{-\beta (\Omega_i - H_i)} \right) , \tag{18a} $$
$$ H \overset{d}{=} - \frac{1}{\beta} \ln \left( e^{-\beta (\hat{\Omega} - \hat{H})} + \sum_{i=1}^{Z} e^{-\beta (\Omega_i - H_i)} \right) \overset{d}{=} - \frac{1}{\beta} \ln \left( e^{-\beta (\hat{\Omega} - \hat{H})} + e^{-\beta \hat{H}'} \right) . \tag{18b} $$
In the equations above all random variables are independent, $Z$ is Poisson distributed with mean $\gamma$, the $\Omega_i$'s have the same law as $\Omega$, and similarly the $H_i$ are independent copies of $H$, and $\hat{H}'$ of $\hat{H}$.
The average of the reconstruction error defined in (6) can be computed in this setting, recalling the inclusion rule (17) and that the pruned graph contains on average $\hat{\mu} N / 2$ planted and $\hat{\mu} \gamma N / 2$ non-planted edges, as:
$$ \mathbb{E}[\varrho] = \frac{\hat{\mu}}{2} \, \mathbb{P}[\hat{H} + \hat{H}' \le \hat{\Omega}] + \frac{\hat{\mu} \gamma}{2} \, \mathbb{P}[H + H' > \Omega] . \tag{19} $$
D. A second pruning of the graph
Our goal in the following will be to understand the properties of the random variables solutions of (18), and their possible bifurcations when the parameters of the model are varied. As a first step in this direction we shall isolate the contribution of "hard-fields", in other words the probabilities that these random variables are infinite. To avoid ambiguities we write in this section $\mathbf{H}$ and $\hat{\mathbf{H}}$ (in bold) for the possibly infinite solutions of (18), and keep the symbols $H$ and $\hat{H}$ for their finite parts defined below. Observe indeed that $\mathbb{P}[Z = 0] > 0$, which yields $\hat{\mathbf{H}} = +\infty$ in (18a), and that this event implies $\mathbf{H} = -\infty$ in (18b). From both theoretical and practical points of view it is convenient to deal with these events explicitly; we shall thus introduce the probabilities $\hat{q}$ and $q$ of the events $\hat{\mathbf{H}} = +\infty$ and $\mathbf{H} = -\infty$ respectively, and two new random variables $\hat{H}$ and $H$ that have the law of $\hat{\mathbf{H}}$ and $\mathbf{H}$ conditional on being finite (we exclude the possibility of $\hat{\mathbf{H}} = -\infty$ and $\mathbf{H} = +\infty$; $\hat{H}$ and $H$ are finite with probability one). In formulas these definitions amount to
$$ \mathbf{H} \overset{d}{=} \begin{cases} -\infty & \text{with prob. } q , \\ H & \text{with prob. } 1 - q , \end{cases} \tag{20a} $$
$$ \hat{\mathbf{H}} \overset{d}{=} \begin{cases} +\infty & \text{with prob. } \hat{q} , \\ \hat{H} & \text{with prob. } 1 - \hat{q} . \end{cases} \tag{20b} $$
Let us insert them in (18) in order to obtain the equations obeyed by $q$, $\hat{q}$, $H$ and $\hat{H}$. In the right hand side of (18a) the numbers of infinite and finite fields among the $\mathbf{H}_i$'s are easily seen to be two independent Poisson random variables of parameters $\gamma q$ and $\gamma (1 - q)$ respectively. $\hat{\mathbf{H}}$ is infinite if and only if the second of these numbers vanishes, hence one has $\hat{q} = e^{-\gamma (1 - q)}$ and
$$ \hat{H} \overset{d}{=} - \frac{1}{\beta} \ln \left( \sum_{i=1}^{Z} e^{-\beta (\Omega_i - H_i)} \right) , \tag{21} $$
where $Z$ has the law of a Poisson random variable of parameter $\gamma (1 - q)$ conditioned to be strictly positive, i.e. $\mathbb{P}[Z = k] = \frac{(\gamma (1 - q))^k}{k! \, (e^{\gamma (1 - q)} - 1)}$ for $k \ge 1$. In (18b) one sees that $\mathbf{H}$ is infinite if and only if $\hat{\mathbf{H}}$ is infinite, hence $q = \hat{q}$ and
$$ H \overset{d}{=} \begin{cases} \hat{\Omega} - \hat{H} & \text{with prob. } \hat{q} , \\ - \frac{1}{\beta} \ln \left( e^{-\beta (\hat{\Omega} - \hat{H})} + e^{-\beta \hat{H}'} \right) & \text{with prob. } 1 - \hat{q} . \end{cases} \tag{22} $$
In summary the elimination of the hard fields amounts to finding the solution $q$ of
$$ q = e^{-\gamma (1 - q)} , \tag{23} $$
and to studying the finite random variables $H$ and $\hat{H}$ solutions of the RDEs
$$ \hat{H} \overset{d}{=} - \frac{1}{\beta} \ln \left( \sum_{i=1}^{Z} e^{-\beta (\Omega_i - H_i)} \right) , \tag{24a} $$
$$ H \overset{d}{=} \begin{cases} \hat{\Omega} - \hat{H} & \text{with prob. } q , \\ - \frac{1}{\beta} \ln \left( e^{-\beta (\hat{\Omega} - \hat{H})} + e^{-\beta \hat{H}'} \right) & \text{with prob. } 1 - q , \end{cases} \tag{24b} $$
where the variable $Z$ in Eq. (24a) has distribution $\mathbb{P}[Z = k] = \pi_k$ with
$$ \pi_k := \frac{q}{1 - q} \frac{[(1 - q) \gamma]^k}{k!} \quad \text{for } k \ge 1 . \tag{25} $$
The average reconstruction error (19) can be reexpressed in terms of the new random variables $H$ and $\hat{H}$; an error on an edge requires both incoming cavity fields to be finite, an event of probability $(1 - q)^2$, so that
$$ \mathbb{E}[\varrho] = \frac{\hat{\mu}}{2} (1 - q)^2 \, \mathbb{P}[\hat{H} + \hat{H}' \le \hat{\Omega}] + \frac{\hat{\mu}}{2} (1 - q)^2 \gamma \, \mathbb{P}[H + H' > \Omega] . \tag{26} $$
This procedure of "hard-fields" elimination that we explained on the RDEs also admits an interpretation on a single graph instance. As a matter of fact the presence of infinite fields on the planted edges can be traced back to the BP equation (15) which shows that $h_{i \to e} = +\infty$ if $i$ is a leaf of the graph, i.e. $i$ is of degree 1 and its only incident edge is $e$. But this fact allows us to unambiguously identify $e$ as an edge of the planted matching, which by definition covers all the vertices of the graph. Then $i$, the edge $e = (i, j)$, the vertex $j$ and all the edges incident to it can be removed, the latter being with certainty non-planted edges. This leaf removal procedure can be iterated until either all the graph has been pruned, or a non-trivial core without any leaf has been reached. The propagation of infinite fields in the BP equations is an equivalent way of describing this pruning algorithm.
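The leaf removal procedure just described can be sketched on a single instance as follows. This is a minimal illustration under assumptions made here: the graph is given as a collection of edges (pairs of vertices) that contains a perfect matching, the function name `leaf_removal` is hypothetical, and degrees are recomputed at every step for readability rather than efficiency.

```python
def leaf_removal(edges):
    """Iteratively identify the edges attached to leaves (hypothetical helper).

    Returns (identified, core): the edges recognized as planted, and the
    edges of the remaining core, in which every vertex has degree >= 2.
    """
    remaining = set(edges)
    identified = set()
    while True:
        # Degrees in the current pruned graph.
        degree = {}
        for i, j in remaining:
            degree[i] = degree.get(i, 0) + 1
            degree[j] = degree.get(j, 0) + 1
        # An edge touching a vertex of degree 1 is necessarily planted.
        leaf_edges = sorted(
            e for e in remaining if degree[e[0]] == 1 or degree[e[1]] == 1
        )
        if not leaf_edges:
            return identified, remaining
        e = leaf_edges[0]
        identified.add(e)
        # Remove e, its two endpoints, and every edge incident to them.
        remaining = {f for f in remaining if not set(f) & set(e)}
```

On the path with edges (0, 1), (1, 2), (2, 3) the procedure identifies (0, 1) and (2, 3) and leaves an empty core, whereas on a cycle of length four no vertex is a leaf and the whole graph is its own core.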
Note that such a leaf removal procedure has already been studied in the literature for standard Erdős-Rényi graphs [33, 34]; in this case a core percolation transition is found when the average degree of the graph crosses the value of Euler's number $e$: for sparser graphs the leaf removal procedure typically destroys the whole graph, while a non-trivial core survives for larger average degrees. In our case the core percolation transition happens at $\gamma = 1$ (which corresponds to the usual percolation transition of the Erdős-Rényi random graph superposed to the planted matching): if $0 < \gamma \le 1$ the equation (23) on $q$ only admits the solution $q = 1$, which means that the leaf removal procedure allows one to recover completely the planted matching (up to a subextensive number of edges in the thermodynamic limit). On the contrary for $\gamma > 1$ equation (23) admits a solution $q < 1$, and a non-trivial core survives the leaf removal. Full recovery can still be achieved for $\gamma > 1$, but in that case the simple leaf removal procedure is not able to identify all the edges of the planted matching; the full recovery is due to a non-trivial amplification effect, by the iterations of the BP equations, of the information contained in the weights of the edges of the core.
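The behaviour of Eq. (23) on both sides of $\gamma = 1$ can be checked numerically. The sketch below iterates the map $q \mapsto e^{-\gamma(1-q)}$ starting from $q = 0$, which converges to the smallest fixed point; choosing this initial condition and a fixed number of iterations is an assumption of this example, as are the function names. The second function evaluates the weights $\pi_k$ of Eq. (25), whose normalization follows from (23).

```python
import math


def hard_field_probability(gamma, n_iter=10_000):
    """Smallest solution q of q = exp(-gamma * (1 - q)), Eq. (23).

    Obtained by iterating the map from q = 0 (an assumption of this sketch);
    the convergence is slow close to gamma = 1.
    """
    q = 0.0
    for _ in range(n_iter):
        q = math.exp(-gamma * (1.0 - q))
    return q


def truncated_poisson_pmf(k, gamma, q):
    """Weight pi_k of Eq. (25), for k >= 1 and 0 < q < 1."""
    x = (1.0 - q) * gamma
    return q / (1.0 - q) * x**k / math.factorial(k)
```

For $\gamma = 0.5$ the iteration returns $q = 1$ up to numerical precision, while for $\gamma = 2$ it returns a value of $q$ close to 0.2, so that a finite fraction of the cavity fields remains finite.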
IV. THE LOCATION OF THE PHASE TRANSITION FOR THE BLOCK MAP ESTIMATOR
A. RDEs for the block MAP
As explained in Sec. II B the block MAP estimator, that maximizes the probability of correct identification of the whole planted matching, is the configuration $m$ that maximizes the posterior in Eq. (2), and can be obtained by taking the "zero-temperature" limit $\beta \to \infty$ in the probability distribution $\nu$ defined in (11). The BP equations that we wrote in (15) for a generic value of $\beta$ become in this limit
$$ h_{i \to e} = \min_{\tilde{e} = (r, i) \in \partial i \setminus e} \left[ \omega_{\tilde{e}} - h_{r \to \tilde{e}} \right] ; \tag{27} $$
the configuration maximizing the posterior (2) can be equivalently defined as the perfect matching of minimum cost on the weighted graph $(V, E, \omega)$, with the effective weights $\omega_e$ replacing the observed weights $w_e$. Hence these BP equations coincide with those written in [24] to study such minimum weight matching problems. With the inclusion criterion of (17) the BP approximation for the block MAP configuration is determined as
$$ m_e = \mathbb{I}(h_{i \to e} + h_{j \to e} - \omega_e > 0) . \tag{28} $$
The probabilistic treatment of the BP equations can also be specialized very easily to the limit case $\beta \to +\infty$; in particular the RDEs (24) yield
$$ \hat{H} \overset{d}{=} \min_{1 \le i \le Z} \left[ \Omega_i - H_i \right] , \tag{29a} $$
$$ H \overset{d}{=} \begin{cases} \hat{\Omega} - \hat{H} & \text{with prob. } q , \\ \min \left( \hat{\Omega} - \hat{H} , \hat{H}' \right) & \text{with prob. } 1 - q , \end{cases} \tag{29b} $$
with the law of the random variable $Z$ defined in (25). The average reconstruction error (26) can actually be simplified for this $\beta \to \infty$ situation into
$$ \mathbb{E}[\varrho] = \hat{\mu} (1 - q)^2 \, \mathbb{P}[\hat{H} + \hat{H}' \le \hat{\Omega}] ; \tag{30} $$
as discussed in Sec. II B (see in particular (7)) there are as many misclassified planted and non-planted edges when the estimator contains $N/2$ edges, which is the case for $\beta \to \infty$, hence the two terms in (26) are equal.
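To make the $\beta \to \infty$ update (27) and the inclusion rule (28) concrete, here is a minimal sketch of the corresponding message-passing iteration on a small graph. Several choices are assumptions of this example and not prescriptions of the text: the parallel update schedule, the zero initial condition, the fixed number of sweeps, and the data layout (edges as vertex pairs, effective weights in a dictionary `omega`). The minimum over an empty set is taken to be $+\infty$, consistently with the convention on empty sums stated after (13).

```python
import math


def min_sum_bp(edges, omega, n_iter=20):
    """Iterate Eq. (27) in parallel and apply the inclusion rule (28).

    edges: list of (i, j) pairs; omega: dict mapping each edge to its
    effective weight. Returns the set of edges estimated as planted.
    """
    incident = {}
    for e in edges:
        for v in e:
            incident.setdefault(v, []).append(e)
    # h[(i, e)] is the cavity field sent from vertex i to its edge e.
    h = {(v, e): 0.0 for e in edges for v in e}
    for _ in range(n_iter):
        new_h = {}
        for (i, e) in h:
            candidates = []
            for e2 in incident[i]:
                if e2 == e:
                    continue
                r = e2[0] if e2[1] == i else e2[1]  # other endpoint of e2
                candidates.append(omega[e2] - h[(r, e2)])
            # Empty minimum (i is a leaf): the field is +infinity.
            new_h[(i, e)] = min(candidates, default=math.inf)
        h = new_h
    return {e for e in edges if h[(e[0], e)] + h[(e[1], e)] > omega[e]}
```

On a tree the fields reach a fixed point after a number of sweeps of the order of the diameter. On graphs with cycles the fields themselves may keep drifting while the decisions (28) stabilize, as happens for instance on a cycle of length four with alternating effective weights 1 and 5, where the two light edges are selected.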
For completeness we show in Appendix A that this equality follows indeed from the RDE (29), modulo a hypothesis of continuity for the distributions of $H$ and $\hat H$, which mimics the hypothesis of uniqueness of the block MAP assignment.

The equalities in distribution between random variables stated in (29) can be equivalently rephrased as equations between the cumulative distribution functions of the variables $H$ and $\hat H$. For a random variable $X$ we shall define the c.d.f. $F_X$ and its reciprocal $\bar F_X$ according to $F_X(x) = \mathbb{P}[X \le x]$, $\bar F_X(x) = 1 - F_X(x) = \mathbb{P}[X > x]$. One obtains from (29)

$$ \bar F_H(h) = \left( q + (1-q) \bar F_{\hat H}(h) \right) \int_\Gamma F_{\hat H}(\omega(w) - h) \, \hat P(w) \, \mathrm{d}w , \tag{31a} $$

and

$$ \bar F_{\hat H}(h) = \frac{q}{1-q} \sum_{k=1}^{\infty} \frac{[\gamma(1-q)]^k}{k!} \left( \int_\Gamma F_H(\omega(w) - h) \, P(w) \, \mathrm{d}w \right)^{k} = \frac{ \exp\left[ -\gamma(1-q) \int_\Gamma \bar F_H(\omega(w) - h) \, P(w) \, \mathrm{d}w \right] - q }{1-q} . \tag{31b} $$

These are integral non-linear equations on the two functions $F_H$ and $F_{\hat H}$, describing the thermodynamic limit of the planted matching problem. These recursive equations can also be understood as describing the optimal matching problem on an infinite tree. The authors of [10] show rigorously that for fully connected graphs the optimal configuration of the finite graph locally converges to the optimal matching of the infinite tree.

The above integral non-linear equations are quite complicated to solve in general. It was shown in [10] that, for a rather specific case (in the large degree limit with an exponential distribution for the planted weights), these integral equations can be transformed into a system of ordinary differential equations (which we will detail in Section VI). Unfortunately such a simplification does not seem to hold besides this special case.

The question now is to understand the solution of (29) (or equivalently of (31)) as a function of the parameters of the model. It is easy to check that $\hat H = +\infty$, $H = -\infty$ (i.e. $F_{\hat H}(h) = 0$, $F_H(h) = 1$) is always a solution, for every choice of the parameters. If this is the correct solution then $\mathbb{E}[\varrho] = 0$, in other words one is in a full recovery phase. The picture that emerges from the previous works [3, 10] is that for some value of the parameters another solution of (29) exists, and is attractive when running BP from an initial condition uncorrelated with the planted matching. One is then in a partial recovery phase, with $\mathbb{E}[\varrho] > 0$. On the contrary, if $\hat H = +\infty$, $H = -\infty$ is the only solution of (29) one is in a full recovery phase, the hidden matching being sufficiently attractive to drive the iterations of the BP equations towards it. According to this description the phase transition between full and partial recovery corresponds to the disappearance of a non-trivial solution of the RDE (here and in the following non-trivial means distinct from $\hat H = +\infty$, $H = -\infty$). This can of course be studied numerically, and we shall display later on some results obtained in this way; in the next subsection we shall argue that, with some additional hypotheses on the nature of the transition, one can compute its location analytically.

B. Locating the transition
Let us assume that the quantities $p$, $\hat p$ and $c$ defining the model depend on some continuous parameter denoted $\lambda$, and that there exists a threshold value $\bar\lambda$ such that $\mathbb{E}[\varrho](\lambda) = 0$ for $\lambda > \bar\lambda$, and $\mathbb{E}[\varrho](\lambda) > 0$ for $\lambda < \bar\lambda$. We further assume that the transition at $\bar\lambda$ is continuous, i.e. $\mathbb{E}[\varrho](\lambda) \to 0$ as $\lambda \to \bar\lambda^-$. Under these hypotheses, the expression (30) of $\mathbb{E}[\varrho]$ reveals that $\mathbb{P}[\hat H + \hat H' > \hat\Omega] \to 1$ as $\lambda$ reaches its threshold value from the partial recovery phase. But if $\mathbb{P}[\hat H + \hat H' > \hat\Omega] = 1$ the minimum in (29b) is always realized by the first argument, which leads us to study the following, simplified form of the RDE (29):

$$ \hat K \overset{d}{=} \min_{1 \le r \le Z} \left[ \Omega_r - K_r \right] , \tag{32a} $$

$$ K \overset{d}{=} \hat\Omega - \hat K , \tag{32b} $$

which bear on a new couple of random variables $\hat K$ and $K$. The transition point will be characterized by the fact that the simplified RDE in Eqs. (32) has a non-trivial solution at $\bar\lambda$. To facilitate the discussion we define a new random variable

$$ \Xi \overset{d}{:=} \Omega - \hat\Omega , \tag{33} $$

in terms of which Eqs. (32) can be written as a distributional equation for $\hat K$ only,

$$ \hat K \overset{d}{=} \min_{1 \le r \le Z} \left[ \Xi_r + \hat K_r \right] . \tag{34} $$

This RDE is actually connected to the properties of Branching Random Walk (BRW) processes, a subject that has generated a vast literature both in physics and mathematics [12–22], in the more general context of front propagation for reaction-diffusion equations. For the convenience of the reader we summarize here the definitions and the main properties we need about BRWs; some more details can be found in Appendix B. A BRW describes the evolution of a population of particles that move along a continuous unidimensional spatial axis, and multiply as time increases in discrete steps (the equivalent process in continuous time being the Branching Brownian Motion). More explicitly, at the initial generation $n = 0$ there is a single particle at the origin, $X^{(0)}_1 = 0$.
Each generation $n$ is given by a set of particles in positions $\{X^{(n)}_k\}_k$ and is constructed iteratively. Each particle of the generation $n$, say the $i$-th one at position $X^{(n)}_i$, gives rise to a number (possibly infinite) of offspring in the next generation, located at the positions $X^{(n+1)}_{i,r} = X^{(n)}_i + \Xi_{i,r}$, where the displacements $\{\Xi_{i,r}\}_r$ between the positions of a parent particle and its offspring are, independently for each $i$, copies of an identical point process. In the simplest cases the number of offspring is $Z_i$, an independent copy of the random variable $Z$ for each parent $i$, and the displacements $\Xi_{i,r}$ are i.i.d. copies of a given random variable $\Xi$.

A realization of such a process is pictured on the example below:

[Figure: a realization of a BRW over generations $n = 0, 1, 2, 3$, starting from a single particle at $X^{(0)}_1$.]

Note that BRWs combine and generalize Galton-Watson branching processes, which are recovered if one only looks at the number of particles in the BRW and discards their positions, and unidimensional random walks: the positions of the particles along a single branch of the BRW follow the law of a simple random walk.

Among several properties of BRWs one that has attracted a lot of research effort is the asymptotic behavior in the large $n$ limit of the minimum of the process. Let us denote $\hat K^{(n)} = \min_k [X^{(n)}_k]$ the position of the leftmost particle in the $n$-th generation. Decomposing a BRW of depth $n+1$ into $Z$ BRWs of depth $n$ attached to the root via $Z$ displacements $\Xi_r$, it is easy to convince oneself that $\hat K^{(n)}$ obeys the following RDE,

$$ \hat K^{(n+1)} \overset{d}{=} \min_{1 \le r \le Z} \left[ \Xi_r + \hat K^{(n)}_r \right] , \tag{35} $$

with the initial condition $\hat K^{(0)} = 0$ and the convention $\hat K^{(n)} = -\infty$ if the process is extinct before the $n$-th generation. Such sequences of random variables have been extensively studied, and very precise mathematical results have been obtained.
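As an illustration, the generation-by-generation construction described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not part of the original analysis: the function name `simulate_brw_minimum` and its two sampler arguments are hypothetical, the offspring number and the displacements are taken i.i.d. (the "simplest case" of the text), and the sketch simply stops recording at extinction instead of assigning an infinite value to the minimum.

```python
import random


def simulate_brw_minimum(n_generations, sample_offspring, sample_displacement, rng):
    """Leftmost positions [K^(0), ..., K^(n)] of one BRW realization.

    sample_offspring(rng)    -> number of children of one particle (hypothetical helper)
    sample_displacement(rng) -> displacement Xi of one child (hypothetical helper)
    The returned list is shorter than n_generations + 1 if the process dies out.
    """
    positions = [0.0]          # generation 0: a single particle at the origin
    minima = [0.0]
    for _ in range(n_generations):
        children = []
        for x in positions:
            for _ in range(sample_offspring(rng)):
                # each child sits at the parent's position plus an independent displacement
                children.append(x + sample_displacement(rng))
        if not children:       # extinction: stop recording (a choice made for this sketch)
            break
        positions = children
        minima.append(min(positions))
    return minima


if __name__ == "__main__":
    rng = random.Random(0)
    # Binary branching with standard Gaussian displacements, 10 generations
    # (2**10 particles in the last generation, so this stays cheap).
    minima = simulate_brw_minimum(10, lambda r: 2, lambda r: r.gauss(0.0, 1.0), rng)
    print("empirical K^(n)/n at n = 10:", minima[-1] / 10)
```

The population grows exponentially with the number of generations, so a direct simulation of this kind is only practical for small depths; it is meant to make the recursion (35) concrete rather than to probe the large $n$ asymptotics.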
A first level of description [12–14] shows that, conditional on the non-extinction of the process, $\hat K^{(n)}$ has a ballistic behavior, namely $\hat K^{(n)}/n \xrightarrow[n \to +\infty]{\text{a.s.}} v$, with an almost sure convergence towards a velocity $v$ that can be computed in terms of the law of the point process of the displacements:

$$ v = - \inf_{\theta > 0} \frac{1}{\theta} \ln \mathbb{E}\left[ \sum_r e^{-\theta \Xi_r} \right] . \tag{36} $$

When the number of offspring is a random variable of law $Z$ and the displacements are i.i.d. copies of $\Xi$ this simplifies into

$$ v = - \inf_{\theta > 0} \frac{1}{\theta} \ln \left( \mathbb{E}[Z] \, \mathbb{E}\left[ e^{-\theta \Xi} \right] \right) , \tag{37} $$

see also Appendix B for a heuristic justification of this expression of the velocity. More recently a finer description of the limit has been obtained [18, 20, 22]: under some technical conditions there exists a constant $C$ such that, conditional on the non-extinction of the process,

$$ \hat K^{(n)} - n v - C \log n \xrightarrow[n \to +\infty]{d} L , \tag{38} $$

where the convergence is in law and $L$ is a finite random variable satisfying

$$ L \overset{d}{=} -v + \min_{1 \le r \le Z} \left[ \Xi_r + L_r \right] . \tag{39} $$

Note that this equation is invariant by translation: if $L$ is a solution then $L + a$ also is, for any constant $a$. Moreover the left tail behavior of the limit random variable $L$ was established in [20, 22] to be

$$ \mathbb{P}[L \le z] \sim \alpha \, z \, e^{\theta_* z} \quad \text{as } z \to -\infty , \tag{40} $$

where $\theta_*$ is the minimizer of (37), and $\alpha < 0$ is a constant.

Coming back to the RDE (34), we take for the law $Z$ of the offspring of the BRW the expression (25), and for $\Xi$ the random variable defined in (33), where we recall that $\Omega$ (resp. $\hat\Omega$) is the random variable $\omega(W)$ with the function $\omega$ of (12) and $W$ drawn with the distribution $P$ (resp. $\hat P$). Note that in our case $Z \ge 1$, so that the process never becomes extinct, and that the velocity (37) reads

$$ v = - \inf_{\theta > 0} \frac{\ln \left[ I(\theta) \, I(1 - \theta) \right]}{\theta} , \tag{41} $$

with

$$ I(\theta) = \sqrt{\gamma} \int_\Gamma \hat P(w)^{\theta} \, P(w)^{1-\theta} \, \mathrm{d}w , \tag{42} $$

where we remind the definition of $\Gamma$ in Eq. (8a). One realizes at this point that if the parameters of the problem are such that $v = 0$, then the random variable $L$ solution of (39) and constructed through the large generation limit of the BRW is a non-trivial solution of Eq.
(34): this is precisely the condition we argued to be satisfied at the continuous phase transition between full and partial recovery phases. The vanishing velocity criterion

$$ \inf_{\theta > 0} \frac{\ln \left[ I(\theta) \, I(1 - \theta) \right]}{\theta} = 0 \tag{43} $$

is actually equivalent to $I(1/2) = 1$ because the argument of the logarithm is a convex function symmetric around $\theta = 1/2$, and therefore this condition becomes

$$ \int_\Gamma \sqrt{ \hat P(w) \, P(w) } \, \mathrm{d}w = \frac{1}{\sqrt{\gamma}} , \tag{44} $$

or equivalently in terms of the original parameters:

$$ \int_\Gamma \sqrt{ \hat p(w) \, p(w) } \, \mathrm{d}w = \frac{1}{\sqrt{c}} . \tag{45} $$

Note that the Cauchy-Schwarz inequality implies

$$ \int_\Gamma \sqrt{ \hat P(w) \, P(w) } \, \mathrm{d}w \le 1 , \tag{46} $$

hence (44) cannot be satisfied if $\gamma < 1$. This is perfectly consistent with what we found in Sec. III D: if $\gamma < 1$ one has $q = 1$, signalling a phase where full recovery can be achieved by the leaf removal procedure.

The equation (45) is our first main result; it provides a prediction for the locus of the continuous phase transition in the parameter space $(p, \hat p, c)$ of the model. We shall simplify it in the large degree limit in Sec. IV C, and test it numerically on several examples in Sec. V. Before that we shall make a series of remarks on the reasoning which led to it and on its consequences.

(i) We have implicitly assumed that $v = 0$ is a necessary condition for (34) to have a non-trivial solution, in other words that the solution of (39) is unique (modulo the invariance under translations) and can thus be realized as the (properly shifted) large $n$ limit of the BRW construction. This uniqueness is actually an open question in mathematics, stated as open problem 46 in [19].

(ii) We justified the introduction of the simplified RDE (32) by an assumption on the continuity of $\mathbb{E}[\varrho]$. We can be more precise in some cases; suppose that the random variable $\hat\Omega$ is not bounded from above, i.e. that $\omega(w)$ diverges to $+\infty$ at some point in $\Gamma$ (as we will see later on there are non-trivial examples where this property can be true, or false). Then a continuously vanishing $\mathbb{E}[\varrho]$ in Eq. (30) can only occur if $\hat H$ diverges to $+\infty$ as $\lambda \to \bar\lambda^-$. In that case we can restate more precisely our hypothesis as the existence of a function $m(\lambda)$ that diverges to $+\infty$ as $\lambda \to \bar\lambda^-$, such that

$$ \hat H - m(\lambda) \xrightarrow[\lambda \to \bar\lambda^-]{d} \hat K , \tag{47} $$

with $\hat K$ solution of (34). The case studied in [10] falls in this category, and Sec. VI will be devoted to the determination of the divergence of $m(\lambda)$.

(iii) Independently of the continuity assumption, a point in parameter space with $v > 0$ must be in the full recovery phase. To see this, consider the iterative version of the RDE (29),

$$ \hat H^{(n+1)} \overset{d}{=} \min_{1 \le i \le Z} \left[ \Omega_i - H^{(n)}_i \right] , \tag{48a} $$

$$ H^{(n)} \overset{d}{=} \begin{cases} \hat\Omega - \hat H^{(n)}_1 & \text{with prob. } q , \\ \min\left( \hat\Omega - \hat H^{(n)}_1 , \hat H^{(n)}_2 \right) & \text{with prob. } 1 - q , \end{cases} \tag{48b} $$

that defines a sequence of random variables $\hat H^{(n)}$, with the initial condition $\hat H^{(0)} = 0$. Comparing these equations with (35) one can show by induction on $n$ that $\hat K^{(n)}$ is stochastically smaller [35] than $\hat H^{(n)}$; we detail this proof in Appendix B 2. Here we shall only recall that given two random variables $X$ and $Y$ one says that $X$ is stochastically smaller than $Y$, to be denoted $X \preceq Y$, if and only if $\mathbb{P}[X > x] \le \mathbb{P}[Y > x]$ for all $x$. This condition is equivalent to the existence of a coupling $(\hat X, \hat Y)$, i.e. a random vector with marginal laws equal to those of $X$ and $Y$ respectively, such that $\mathbb{P}[\hat X \le \hat Y] = 1$. In our case if $v > 0$, $\hat K^{(n)}$ diverges to $+\infty$ in the large $n$ limit, hence by this stochastic comparison argument this will also be the case of $\hat H^{(n)}$. With the assumption that a non-trivial solution of the fixed point equation (29), if it exists, will be reached as the large $n$ limit of the sequence $\hat H^{(n)}$, this allows to conclude that $v > 0$ implies full recovery.

C. The large degree limit
As explained in Sec. II the large degree limit, $c \to \infty$ taken here after the thermodynamic limit $N \to \infty$, allows to recover the dense models defined on complete graphs in [3, 10], if one performs an appropriate rescaling of the distribution of the weights on the non-planted edges. Consider indeed the condition (45): if $p$ and $\hat p$ are kept constant as $c \to \infty$ this becomes $\int_\Gamma \sqrt{\hat p(w) p(w)} \, \mathrm{d}w = 0$, which is never satisfied unless $\Gamma$ is empty (and in this case the planted edges can be identified by inspection of the weight on the edges, as $p$ and $\hat p$ have disjoint supports). Indeed if the weights on both types of edges are of the same order of magnitude, around one vertex the planted weight will be hidden among the $O(c)$ non-planted ones, and impossible to distinguish in the $c \to \infty$ limit. To have a nontrivial partial-full recovery transition for $c \to \infty$ it is therefore necessary to scale the non-planted edge weights, the simplest way being to take the non-planted weight distribution uniform on the interval $[0, c]$. The condition (45) becomes then

$$ \int_0^{+\infty} \sqrt{ \hat p(w) } \, \mathrm{d}w = 1 , \tag{49} $$

where we assumed for simplicity of notation that the support of $\hat p$ is included in the positive real axis.

V. EXAMPLES
We shall now confront our analytical prediction for the location of the phase transition with numerical results, obtained both for finite $N$ by solving the BP equations on single samples of the problem, and in the thermodynamic limit by solving numerically the RDEs. Some details on these numerical procedures are given in Sec. V D.

[Figure 1. Phase diagram of the planted matching problem with exponential planted weights on sparse graphs, in the $(c, \lambda)$ plane, exhibiting full-recovery (FR) and partial recovery (PR) phases. The red area corresponds to the phase where full recovery is achievable by the simple leaf removal procedure (FR, $q = 1$), and is delimited by the $\gamma = 1$ condition (red line). The blue area is the partial recovery phase enclosed by the vanishing velocity criterion (blue line), the white domain corresponding to full recovery. The red dots have been obtained from the numerical resolution of the RDEs (29) by a population dynamics algorithm with 10 fields, and mark the limit of existence of a non-trivial solution.]

For concreteness we will always take a uniform distribution for the non-planted weights,

$$ p(w) = \frac{1}{c} \, \mathbb{I}(0 \le w \le c) , \tag{50} $$

for which Eq. (45) further simplifies as

$$ \int_\Gamma \sqrt{ \hat p(w) } \, \mathrm{d}w = 1 , \qquad \Gamma = \mathrm{supp}(\hat p) \cap [0, c] . \tag{51} $$

We will present our results for different choices of $\hat p$, some of them partially investigated in the literature.

A. The exponential distribution
Let us start by considering the exponential distribution

$$ \hat p(w) = \lambda \, e^{-\lambda w} \, \mathbb{I}(w \ge 0) , \tag{52} $$

for which, in the large degree limit, [10] proved that $\lambda < 4$ corresponds to partial recovery and $\lambda > 4$ to full recovery. Our results for finite $c$ are summarized by the phase diagram in the $(c, \lambda)$ plane displayed in Fig. 1. The red line corresponds to the condition $\gamma = 1$ below which the leaf removal procedure described in Sec. III D recovers completely the hidden matching; here $\mu = 1$ and $\hat\mu = 1 - e^{-\lambda c}$, the equation of this line is thus $\lambda = -\log(1 - (1/c))/c$. The blue line is instead the vanishing velocity criterion (51), which becomes for this choice of $\hat p$:

$$ 1 = \int_0^c \sqrt{\lambda} \, e^{-\lambda w / 2} \, \mathrm{d}w = 2 \, \frac{1 - e^{-c \lambda / 2}}{\sqrt{\lambda}} , \tag{53} $$

a relation that can be inverted in

$$ c = - \frac{2}{\lambda} \ln \left( 1 - \frac{\sqrt{\lambda}}{2} \right) . \tag{54} $$

This blue line separates a domain, in blue in Fig. 1, where $v < 0$, corresponding to a partial recovery phase, from a full recovery phase (in white) with $v > 0$. Note that the curve has a minimal abscissa of $c \approx 1.23$, reached at $\lambda \approx 2.05$ (the minimum of (54) over $\lambda$). In the large degree limit the two branches of the blue line converge to $\lambda = 0$ and $\lambda = 4$; we thus recover the results of [10].

The red dots on this phase diagram have been obtained from a numerical resolution of Eqs. (29), using a population dynamics algorithm, and correspond to the limit values for which we found a non-trivial solution of the equations. They are in agreement, within numerical accuracy, with the analytical prediction. In Fig. 2 we compare our prediction for the average reconstruction error $\mathbb{E}[\varrho]$ (non-zero in the partial recovery phase) obtained by Eqs. (29, 30) (in the thermodynamic limit) with the numerical results obtained running a belief propagation based on Eq. (27) (on finite graphs). The agreement between these two procedures is very good, except for smoothening finite-size effects close to the phase transitions.

B. The folded Gaussian case
We have also considered a planted weight distribution of the folded Gaussian form,

$$ \hat p(w) = \sqrt{ \frac{2}{\pi \lambda} } \, e^{ - \frac{w^2}{2 \lambda} } \, \mathbb{I}(w \ge 0) , \tag{55} $$

that had been investigated previously in [3] in the large degree limit. Our prediction for this case, easily obtained by plugging this expression in (51) and taking the $c \to \infty$ limit, is of a phase transition at

$$ \bar\lambda = \frac{1}{2\pi} \approx 0.159 , \tag{56} $$

with a partial recovery phase for $\lambda > \bar\lambda$ and a full-recovery phase for $0 < \lambda < \bar\lambda$. This agrees qualitatively with the numerical investigations of [3], but not with the value of the threshold that was estimated in [3].

[Figure 2. Reconstruction error $\varrho$ as a function of $\lambda$ for the planted matching problem with exponential planted weights and different values of $c$ ($c = 2, 3, 4, 5, 10$ and $c \to +\infty$). The lines have been obtained numerically solving the RDEs (29) using a population dynamics algorithm with 10 fields. The dots are the results of the resolution of the BP equations (27) on graphs of N = 10 vertices, averaged over 10 instances, the $c \to +\infty$ case corresponding to the complete graph.]

[Figure 3. Planted matching problem with (folded) Gaussian weights, in the large degree limit; the curves have been obtained from a numerical resolution of Eqs. (31), the red line corresponding to the reconstruction error $\varrho$, the blue line to the median cavity fields on planted edges $M[\hat H]$. The light-red interval corresponds to the estimate for the transition point given in Ref. [3].]

[Figure 4. Phase diagram, in the $(\alpha, \lambda)$ plane, of the recovery transition for the planted matching problem with a truncated power-law distribution for the planted weights, showing the FR, FR ($q = 1$) and PR phases. The red line corresponds to the $\gamma = 1$ bound for the full-recovery transition. The circles are the transition points obtained running a population dynamics algorithm with 10 fields.]

C. The truncated power-law case
Let us consider as a final example the case where the density of the weights on the planted edges varies as a power-law on a finite interval,

$$ \hat p(w) = \frac{\alpha \, w^{\alpha - 1}}{\lambda^{\alpha}} \, \mathbb{I}(0 \le w \le \lambda) , \tag{57} $$

with $\alpha > 0$. For the sake of simplicity, we will restrict our analysis to the case $c > \lambda$. Then one finds immediately that $\Gamma = [0, \lambda]$, $\hat P = \hat p$, $P$ is the uniform distribution on $[0, \lambda]$ and $\gamma = \lambda$, in such a way that as soon as $c > \lambda$ the problem on the pruned graph is completely independent of $c$.

The transition condition $\gamma = 1$ for the full recoverability of the planted matching by the leaf removal algorithm is thus $\lambda = 1$, which yields the red domain in the phase diagram of Fig. 4. The vanishing velocity condition (51) becomes here

$$ \bar\lambda = \frac{(1 + \alpha)^2}{4 \alpha} , \tag{58} $$

which is plotted as a blue line in Fig. 4, the full-recovery phase corresponding to the domain $\lambda < \bar\lambda$. This prediction is in agreement with the numerical results obtained by a population dynamics resolution of the RDEs.

For $\alpha = 1$, which corresponds to the planted weights uniformly distributed on $[0, \lambda]$, one can actually solve the RDEs explicitly. Indeed in this case the function $\omega(w)$ in Eq. (12) vanishes, hence the equations (29) reduce to

$$ \hat H \overset{d}{=} - \max_{1 \le i \le Z} [ H_i ] , \tag{59a} $$

$$ H \overset{d}{=} \begin{cases} - \hat H & \text{with prob. } q , \\ \min\left( - \hat H , \hat H' \right) & \text{with prob. } 1 - q , \end{cases} \tag{59b} $$

which obviously admit the solution $H \overset{d}{=} \hat H \overset{d}{=} 0$. Indeed the effective weights on the pruned graph are all equal: the planted matching is one of the many perfect matchings of this reduced graph, but there is no information contained in the weights to decide which one. In a simple-minded application of the inclusion rule (17) one would include in the estimator the edges of the pruned graph independently with probability $1/2$, leading to an average estimation error $\mathbb{E}[\varrho] > 0$ for $\lambda > 1$, and $\mathbb{E}[\varrho] = 0$ for $\lambda \le 1$.

D. A note on the numerical procedures
Most of the thermodynamic limit results presented above have been obtained by a numerical resolution of Eqs. (29) via a population dynamics algorithm [28]. The idea of this method, which is very commonly used to solve RDEs, is to represent the law of a random variable $X$ as the empirical distribution of a sample $\{X_1, \dots, X_{\mathcal{N}}\}$ of its representatives, with $\mathcal{N} \gg 1$,

$$ F_X(x) \approx \frac{1}{\mathcal{N}} \sum_{i=1}^{\mathcal{N}} \mathbb{I}(X_i \le x) . \tag{60} $$

One considers then the iterative version of the RDE written in (48), and updates the population according to these rules. For instance each representative $\hat H_i$ at the iteration $n + 1$ is generated, independently, by drawing an integer $Z$ with the law (25), $Z$ copies of the random variable $\Omega$, and $Z$ representatives of $H$ at the iteration $n$, by a uniform choice over the $\mathcal{N}$ ones. These quantities are then combined according to the right hand side of (48a) to compute $\hat H_i$. The sample of representatives of $H$ at the iteration $n + 1$ is then generated similarly according to (48b). These steps are repeated a large number $n$ of times; the type of phase (partial or full recovery) is then decided according to the convergence or divergence to $+\infty$ of the population representing $\hat H^{(n)}$ in the large $n$ limit. The accurate determination of such a phase transition suffers from finite population size effects that are much more severe than in usual applications of the population dynamics algorithm.
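The update rules just described can be sketched as follows for the exponential model (uniform non-planted weights on $[0, c]$, exponential planted weights of parameter $\lambda$). This is a minimal illustration under stated assumptions rather than the code used for the figures: all function names are hypothetical, the law of $Z$ is taken to be the zero-truncated Poisson distribution of parameter $\gamma(1-q)$ that can be read off from the series in (31b), $q$ is obtained by iterating $q = e^{-\gamma(1-q)}$ with $\gamma = c\hat\mu$, and the effective weight is $\omega(w) = \lambda w - \ln(\lambda c / \hat\mu)$. The population size is deliberately small.

```python
import math
import random


def solve_q(gamma, n_iter=2000):
    """Smallest solution of q = exp(-gamma (1 - q)), by fixed-point iteration from q = 0."""
    q = 0.0
    for _ in range(n_iter):
        q = math.exp(-gamma * (1.0 - q))
    return q


def sample_truncated_poisson(rate, rng):
    """Draw Z >= 1 with P[Z = k] proportional to rate**k / k! (inverse-transform sampling)."""
    u = rng.random() * (1.0 - math.exp(-rate))
    k = 1
    term = rate * math.exp(-rate)        # unnormalized weight of k = 1
    cumulative = term
    while u > cumulative and term > 0.0:  # guard against round-off in the far tail
        k += 1
        term *= rate / k
        cumulative += term
    return k


def population_dynamics(lam, c, pop_size=2000, n_iter=30, seed=0):
    """Iterate (48a)-(48b) on populations; return (H_hat population, medians per step, q)."""
    rng = random.Random(seed)
    mu_hat = 1.0 - math.exp(-lam * c)
    gamma = c * mu_hat
    q = solve_q(gamma)
    shift = math.log(lam * c / mu_hat)   # omega(w) = lam * w - shift

    def omega_nonplanted():              # Omega = omega(W), W uniform on [0, c]
        return lam * rng.uniform(0.0, c) - shift

    def omega_planted():                 # hat-Omega: W exponential, truncated to [0, c]
        w = -math.log(1.0 - rng.random() * mu_hat) / lam
        return lam * w - shift

    h_hat = [0.0] * pop_size             # initial condition H_hat^(0) = 0
    medians = []
    for _ in range(n_iter):
        h = []
        for _ in range(pop_size):        # Eq. (48b)
            value = omega_planted() - rng.choice(h_hat)
            if rng.random() >= q:        # with probability 1 - q
                value = min(value, rng.choice(h_hat))
            h.append(value)
        new_h_hat = []
        for _ in range(pop_size):        # Eq. (48a)
            z = sample_truncated_poisson(gamma * (1.0 - q), rng)
            new_h_hat.append(min(omega_nonplanted() - rng.choice(h) for _ in range(z)))
        h_hat = new_h_hat
        medians.append(sorted(h_hat)[pop_size // 2])
    return h_hat, medians, q


if __name__ == "__main__":
    _, medians, q = population_dynamics(lam=2.0, c=10.0)
    print("q =", q, " last medians of H_hat:", medians[-3:])
```

Following the median of the $\hat H$ population over the iterations gives a crude diagnostic of the phase: a median that settles to a finite value points to a non-trivial solution, whereas a steady drift towards $+\infty$ points to full recovery. With populations as small as the one used here this diagnostic should not be trusted close to the transition.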
Indeed the transition is governed by an instability that manifests itself as a front propagation in the cumulative distribution function; such front propagations are generically driven by the behavior in the exponentially small tail far away from the front [15–17]. As the finite population size implies a cutoff of $1/\mathcal{N}$ on the smallest representable value of the cumulative distribution function, this translates into logarithmic finite population size effects on the velocity of the front and the location of the phase transition, at variance with the usual $1/\mathcal{N}$ corrections for the computation of observables as empirical averages. We refer the reader to [15] for a quantitative study of these logarithmic corrections in the velocity of a front in presence of a threshold in its tail.

We thus believe that the discrepancy in the folded Gaussian case between our analytical prediction $\bar\lambda = \frac{1}{2\pi} \approx 0.159$ and the numerical estimate of [3] is due to these finite $\mathcal{N}$ effects. The results presented in Fig. 3 that support this thesis have been obtained with another numerical procedure: instead of the population representation (60) of the cumulative distribution functions $F_H(h)$ and $F_{\hat H}(h)$ we stored their values in $M$ points $h_1 < h_2 < \dots < h_M$ over a given interval $[h_1, h_M]$, and updated them using Eqs. (31) until a certain convergence criterion was satisfied (until the distance between the solution at step $n$ and the solution at step $n - 1$ was smaller than a threshold $\epsilon$). The advantage of this method is that the cutoff $h_M$ can be taken arbitrarily large, in such a way that $\bar F_{\hat H}(h_M)$ is very small, hence bypassing the threshold at $1/\mathcal{N}$ of the population dynamics algorithm. After convergence the function $F_{\hat H}(h)$ can be used to estimate $\mathbb{E}[\varrho]$, e.g. by a Monte Carlo integration.

VI. A MORE PRECISE DESCRIPTION OF THE CRITICAL REGIME
Once the threshold value of a parameter has been determined it is natural to aim at a more quantitative description of the transition in its critical regime. In the case considered in this paper of planted models that undergo a continuous transition from partial recovery for $\lambda < \bar\lambda$ to full recovery for $\lambda > \bar\lambda$, this amounts to describing how the average reconstruction error $\mathbb{E}[\varrho]$ vanishes as $\lambda \to \bar\lambda^-$. This was raised as open question 2 in [10], and we shall study it in the model defined therein, i.e. with an exponential distribution for the planted weights, in the large degree limit. This case allows for some technical simplifications; as shown in [10] the RDEs can then be reduced to a system of Ordinary Differential Equations (ODEs), which we first recall in the next subsection before studying their solution in the critical regime.

A. The ODEs for the exponential model
Let us specialize our formalism with the following choices of weight distributions: $\hat p(w) = \lambda e^{-\lambda w}$ for $w \ge 0$, and $p(w) = 1/c$ for $w \in [0, c]$. The intersection of their supports is thus $\Gamma = [0, c]$, and one finds $\mu = 1$, $\hat\mu = 1 - e^{-\lambda c}$. The reduced distributions $\hat P$ and $P$ are then, on their common support $\Gamma$, $\hat P(w) = \lambda e^{-\lambda w} / \hat\mu$ and $P(w) = 1/c$, which gives an effective weight function $\omega(w) = \lambda w - \ln(\lambda c / \hat\mu)$. The parameter $q$ is the solution of $q = e^{-\gamma(1-q)}$ with $\gamma = c \hat\mu$.

To simplify the notations, and to get closer to the conventions used in [10], we shall define random variables $X$ and $Y$ that are affine transformations of $\hat H$ and $H$, respectively. More precisely we define their cumulative functions as

$$ F_X(x) := F_{\hat H}\left( \lambda x - \frac{1}{2} \ln\left( \frac{\lambda c}{\hat\mu} \right) \right) , \tag{61} $$

$$ F_Y(x) := F_H\left( \lambda x - \frac{1}{2} \ln\left( \frac{\lambda c}{\hat\mu} \right) \right) , \tag{62} $$

and keep the convention $\bar F = 1 - F$ for reciprocal cumulative distributions. The equations (31) become

$$ \bar F_Y(x) = \left( q + (1-q) \bar F_X(x) \right) \frac{\lambda}{\hat\mu} \int_0^c e^{-\lambda w} F_X(w - x) \, \mathrm{d}w , \tag{63a} $$

$$ \bar F_X(x) = \frac{ \exp\left[ - \hat\mu (1-q) \int_{-x}^{c-x} \bar F_Y(w) \, \mathrm{d}w \right] - q }{1-q} . \tag{63b} $$

Taking the limit $c \to +\infty$, in which $\hat\mu \to 1$ and $q \to 0$, one obtains

$$ \bar F_Y(x) = \bar F_X(x) \int_0^{+\infty} \lambda e^{-\lambda w} F_X(w - x) \, \mathrm{d}w , \tag{64a} $$

$$ \bar F_X(x) = \exp\left[ - \int_{-x}^{+\infty} \bar F_Y(w) \, \mathrm{d}w \right] . \tag{64b} $$

These equations between cumulative distribution functions correspond to the following RDEs on $X$ and $Y$:

$$ X \overset{d}{=} \min_i \{ \xi_i - Y_i \} , \tag{65a} $$

$$ Y \overset{d}{=} \min( \eta - X, X' ) , \tag{65b} $$

where in the first line the $\xi_i$'s are the points of a Poisson point process of intensity 1 on the positive real axis, and in the second line $\eta$ has an exponential distribution of parameter $\lambda$. It is convenient to introduce the auxiliary function $V(x)$ defined as the cumulative distribution of the random variable $X - \eta$, i.e.

$$ V(x) := \mathbb{P}[X - \eta \le x] = \int_0^{+\infty} \lambda e^{-\lambda w} F_X(w + x) \, \mathrm{d}w , \tag{66} $$

in such a way that (64a) can be rewritten $\bar F_Y(x) = \bar F_X(x) V(-x)$. Taking derivatives with respect to $x$ in (64b, 66), and denoting for simplicity $F = F_X$, one obtains

$$ F'(x) = (1 - F(x)) (1 - F(-x)) V(x) , \tag{67a} $$

$$ V'(x) = \lambda ( V(x) - F(x) ) , \tag{67b} $$

where the form of the equation on $V$ crucially depends on the exponential character of the distribution of the planted weight $\eta$. These two equations on $F$ and $V$ do not form a closed system of ODEs, since (67a) involves $F$ evaluated at $-x$ instead of $x$; to bypass this difficulty one introduces two additional functions, $G(x) := F(-x)$ and $W(x) := V(-x)$, in such a way that the four-dimensional vector $(F, G, V, W)(x)$ obeys an autonomous first-order ODE, from the solution of which the average reconstruction error is computed as

$$ \mathbb{E}[\varrho] = \mathbb{P}[X + X' \le \eta] = 2 \int_0^{\infty} (1 - F(x)) (1 - G(x)) V(x) W(x) \, \mathrm{d}x , \tag{68} $$

see [10] for the details of the derivation of the integral expression of $\mathbb{E}[\varrho]$.

The dimensionality of the problem can be reduced by exploiting the conservation law $F(x) W(x) + G(x) V(x) - V(x) W(x) = 0$ for all $x$. Introducing finally $U(x) = F(x)/V(x)$ it is shown in [10] that the problem reduces to solving, for $x \ge$
$0$, the following ODE on the three-dimensional vector $(U, V, W)(x)$:

$$ U'(x) = - \lambda U(x) (1 - U(x)) + (1 - U(x) V(x)) (1 - (1 - U(x)) W(x)) , \tag{69a} $$

$$ V'(x) = \lambda V(x) (1 - U(x)) , \tag{69b} $$

$$ W'(x) = - \lambda W(x) U(x) , \tag{69c} $$

with the initial conditions

$$ U(0) = \frac{1}{2} , \qquad V(0) = W(0) . \tag{70} $$

Even if the notation does not show it explicitly the solution of these ODEs depends of course on $\lambda$, both directly as $\lambda$ appears in (69), and indirectly through the initial condition $V(0) = W(0)$. It is indeed shown in [10] that for a given $\lambda < 4$ there is a unique choice of this initial condition such that $F$ and $V$ have the properties of cumulative distribution functions (non-decreasing and bounded between 0 and 1).

B. The divergence of $X$ in the limit $\lambda \to 4$

We present in Fig. 5 the cumulative distribution $F$ of the random variable $X$, for three values of $\lambda$ increasing towards the critical value $\lambda = 4$, obtained by a numerical resolution of the ODE (69). This plot suggests that $F$ drifts without deformation when approaching the transition; this impression is confirmed by the inset of the figure, which shows a very good collapse of the curves once shifted by the median $M(\lambda) = F^{-1}(1/2)$ of $X$ (any other quantile would have led to the same collapse, with an additional constant shift of the horizontal axis). This observation has two equivalent translations: from the probabilistic point of view it corresponds to the existence of a function $m(\lambda)$ and a random variable $\hat X$ such that

$$ X - m(\lambda) \xrightarrow[\lambda \to 4^-]{d} \hat X , \tag{71} $$

as was stated in (47) for generic weight distributions, $m(\lambda)$ differing from $M(\lambda)$ by an arbitrary constant. From the analytic point of view it means that the solution of the ODEs admits a scaling regime $x = z + m(\lambda)$ when $z$ is kept fixed while $\lambda \to 4^-$, described by functions $\hat F(z)$, $\hat V(z)$, $\hat U(z)$, defined as

$$ \hat F(z) = \lim_{\lambda \to 4^-} F(z + m(\lambda)) , \tag{72} $$

similar definitions holding for $\hat V$ and $\hat U$.

[Figure 5. Cumulative distribution $F(x)$ for three values of $\lambda$ increasing, from left to right, from $\lambda = 3$ to $\lambda = 3.92$. Inset: the three curves have been shifted horizontally by their medians $M(\lambda) = F^{-1}(1/2)$.]

The dots in the main panel of Fig. 6 represent the numerically determined value of $M(\lambda)$, and suggest a divergence of this quantity as $\lambda \to 4^-$. Supposing that $m(\lambda)$ indeed diverges one can simplify the ODEs (67) in this scaling regime, with $F(-x) \to$
$0$; this yields

$$ \hat F'(x) = (1 - \hat F(x)) \hat V(x) , \tag{73a} $$

$$ \hat V'(x) = 4 ( \hat V(x) - \hat F(x) ) . \tag{73b} $$

$\hat F$ is the cumulative distribution of the limit variable $\hat X$, solution of the simplified RDE

$$ \hat X \overset{d}{=} \min_i \{ \xi_i - \hat Y_i \} , \tag{74a} $$

$$ \hat Y \overset{d}{=} \eta - \hat X . \tag{74b} $$

Studying this simplified ODE in the $z \to -\infty$ limit where it can be linearized, or appealing to the theorems explained in (40) for the left tail behavior of the limit random variable in the BRW interpretation, one finds

$$ \hat F(z) \underset{z \to -\infty}{\sim} - A \, z \, e^{2z} , \qquad \hat U(z) \underset{z \to -\infty}{\sim} \frac{1}{2} - \frac{1}{4z} , \tag{75} $$

where $A > 0$ is a constant.

[Figure 6. Median $M(\lambda)$ of the cumulative distribution in the exponential case, compared to the analytical prediction (78) for $m(\lambda)$. Inset: the dots show $M(\lambda) - m(\lambda)$, the dashed line is a linear fit that confirms the convergence of $M(\lambda) - m(\lambda)$ to a finite constant in the limit $\lambda \to 4$.]

In order to determine the sought-for divergence of $m(\lambda)$ as $\lambda \to 4^-$ we need now to study the solution of the ODE (69) on another scaling regime, $x = t \, m(\lambda)$ with $t \in [0, 1)$ kept fixed in the limit, that will allow to take into account the initial condition (70). The derivation will then conclude by a matching argument at the common boundary of the two scaling regimes, $t \to 1^-$ and $z \to -\infty$.

In order to study the scaling regime $x = t \, m(\lambda)$ we first notice that $V(0) \to 0$ as $\lambda \to 4^-$, because $V$ is the cumulative distribution function of the random variable $X - \eta$ that diverges in this limit. It is thus instructive to solve first Eqs. (69) with $V(0) = W(0) = 0$, denoting $U_0$, $V_0$, $W_0$ its solution, even if this cannot be exactly a proper solution. One finds $V_0(x) = W_0(x) = 0$ for all $x \ge$
0, and (69a) simplifies into U (cid:48) ( x ) = − λU ( x )(1 − U ( x )) + 1 , (76)an equation that can be solved exactly for any λ < U (0) = 1 /
2, yielding U ( x ) = 12 + 12 (cid:114) − λλ tan (cid:16) x (cid:112) λ (4 − λ ) (cid:17) . (77)This expression diverges when the argument of the tan-gent reaches π/
2, which gives us a natural candidate forthe scale m ( λ ) at which the first regime ends, namely m ( λ ) = π √ − λ , (78)and a conjecture for the behavior of the solution U ( x ) ofthe full ODE in the first scaling regime, namelylim λ → − √ − λ (cid:18) U ( tm ( λ )) − (cid:19) = 14 tan (cid:16) t π (cid:17) . (79) We have assumed here that V (0), even if strictly non-zero for all λ <
4, is sufficiently small for the second linein (69a) to be negligible, and hence for U to coincidewith U at the dominant order in this scaling regime. Tocheck the self-consistency of this hypothesis we first givean exact expression of V and W in terms of U obtainedby integration of (69b,69c): V ( x ) = V (0) exp λ x (cid:90) (1 − U ( y ))d y , (80) W ( x ) = e − λx V ( x ) . (81)Inserting the scaling ansatz (79) into (80) we obtainlim λ → − V (0) V ( tm ( λ )) e − tm ( λ ) = cos (cid:16) t π (cid:17) , (82)the behavior of W being easy to deduce from the one of V thanks to (81). We fix now the initial condition V (0)by matching the behavior t → − of this expression withthe limit z → −∞ of the other regime, which from (75) isˆ V ( z ) ∼ − Az e z , with the correspondance t ∼ zm ( λ ) .This yields V (0) = 2 A e − m ( λ ) √ − λ , (83)and allows to check that indeed the first line of (69a) isdominant in the scaling regime t = m ( λ ) as long as t < U in (75) and (79) matches at theboundary of the two scaling regimes. Note that the in-determinacy of the constant A , because of the invarianceby translation of the equations on ˆ F and ˆ V , is related in(83) to the additive arbitrary constant that can be addedto m ( λ ).The inset of Fig. 6 presents a numerical confirmationof this reasoning: the difference between the numericallydetermined median of X and our formula (78) is seen toconverge to a finite constant when λ → − (with cor-rections that seem polynomial in 4 − λ ). We have alsochecked that the numerical results for U ( x ) and V ( x ) arecompatible with the scaling ansatz of (79,82). C. The critical behavior of E [ (cid:37) ] We would like now to use our prediction (78) for thedivergence of X in order to describe the way in which theaverage reconstruction error E [ (cid:37) ] vanishes at the transi-tion. The expression of the latter, given in (68), can berewritten as E [ (cid:37) ] = E [e − λX ] . 
(84)Indeed the exponential distribution of η is such that P [ η ≥ x ] = e − λx , and X and X (cid:48) in (68) are indepen-dent random variables with the same law. Recalling the6 . . . . . . . . λ E [ ̺ ] . . . . . . . . . . . . . . . Figure 7. The average reconstruction error E [ (cid:37) ] that vanishescontinuously as λ → − . Inset: E [ (cid:37) ] e π √ − λ (4 − λ ) / as afunction of λ , the convergence to a postive constant as λ → − confirms (88). convergence in distribution stated in (71), and the pre-diction (78) of m ( λ ), it would be tempting to write E [ (cid:37) ] ∝ λ → − e − π √ − λ E [e − X ] . (85)Unfortunately this result has to be amended: as the tailof ˆ X varies as e z for z → −∞ (cf. Eq. (75)) the ex-pectation value E [e − X ] is infinite. Using the integralexpression of E [ (cid:37) ] given in (68), one finds that the lead-ing contribution is given by the scaling regime x = tm ( λ )and is of the form2 m ( λ ) (cid:90) V ( x ) W ( x )d x = 2 m ( λ ) (cid:90) V ( x ) e − λx d x (86) ∝ V (0) (cid:90) cos (cid:16) t π (cid:17) m ( λ )d t . (87)The asymptotic form of the initial condition stated in(83), combined with the expression of m ( λ ) given in (78),yields finally the prediction E [ (cid:37) ] ∝ λ → − e − π √ − λ (4 − λ ) − / . (88)Note that all the derivatives of E [ (cid:37) ] vanish as λ → − be-cause of the essential singularity of the exponential term,the transition is thus of infinite order in the usual ther-modynamic classification.We present in Fig. 7 our numerical results for E [ (cid:37) ],computed by a numerical integration of the ODE and theintegral expression in (68). The main panel shows qual-itatively that E [ (cid:37) ] is indeed very flat close to the transi-tion; the rescaling performed in the inset is in agreementwith the asymptotic form (88). VII. THE SYMBOL MAP CASE
We have discussed above the phase diagram of the problem, and distinguished in particular full and partial recovery phases, considering the block MAP estimator, i.e. the $\beta \to \infty$ version of the BP equations. The phases were thus defined according to whether the average reconstruction error $\mathbb{E}[\varrho_{\rm b}]$ vanished in the thermodynamic limit or not, the subscript b specifying the use of the block MAP estimator in the computation. However, we explained in Sec. II B that the estimator that minimizes the average reconstruction error is the symbol MAP one, obtained with $\beta = 1$, with an average reconstruction error denoted $\mathbb{E}[\varrho_{\rm s}]$. As $\mathbb{E}[\varrho_{\rm s}] \le \mathbb{E}[\varrho_{\rm b}]$, the phases shown to be of the full recovery type for $\beta \to \infty$ are certainly so also for the symbol MAP estimator; one can nevertheless wonder if the converse is true, namely if some choices of parameters yield $0 = \mathbb{E}[\varrho_{\rm s}] < \mathbb{E}[\varrho_{\rm b}]$.

We have investigated this question numerically, by solving with a population dynamics algorithm the RDEs (24) with $\beta = 1$, and computed $\mathbb{E}[\varrho_{\rm s}]$ from (26). Our results are presented in Fig. 8; for concreteness we have used the exponential distribution of Eq. (52) for the planted weights, and several values of the average degree $c$ (the non-planted weight distribution being uniform on $[0, c]$). We found indeed that $\mathbb{E}[\varrho_{\rm s}] \le \mathbb{E}[\varrho_{\rm b}]$ (the block MAP results, previously presented in Fig. 2, are drawn with dashed lines). Within our numerical accuracy the transition to the full recovery phase occurs for the same values of the parameters in the symbol and block MAP cases; this is in agreement with a conjecture of [10], see open question 1 therein.

VIII. FUTURE WORK
Let us conclude by giving some thoughts on how our study could be extended. One could try to study the critical regime for generic distributions, i.e. extend the results of Sec. VI that were obtained only for the exponential distribution and in the large degree limit. We expect the exponent − / $\mathbb{E}[\varrho]$ should be much more dependent on the details of the models. One motivation for this direction of research is the difficulty of an accurate numerical determination of the location of the phase transition, as discussed in Sec. V D. The numerical accuracy problems should be less stringent further away from $\bar\lambda$ inside the partial recovery phase, hence an extrapolation of $M[\hat H]$, if one has a prediction for its functional form, should lead to more precise determinations of threshold parameters.

It would also be interesting to further investigate the possibility of discontinuous recovery phase transitions, for which the derivation presented in Sec. IV B would fail. We did not find evidence for their occurrence, but we cannot exclude this possibility because of the limited accuracy of our numerical results. Such situations might occur for contorted weight distributions, or if instead of Erdős-Rényi random graphs one hides the planted matching in a configuration model with some well-chosen degree distributions, for which [30] unveiled the existence of multiple BP fixed points.

Figure 8. The average reconstruction error for the symbol MAP estimator ($\beta = 1$) with exponentially distributed planted weights, at different values of $c$ (legend: $c = 2$, $c = 4$, $c = 10$). The solid lines have been obtained by numerically solving the RDEs in Eq. (24) with $\beta = 1$ using a population dynamics algorithm with 10 fields. The dots correspond to the same error rate estimated by running BP over 10 instances of the problem, on graphs with $N = 10$ vertices. The dashed lines correspond to the reconstruction error of the block MAP estimator presented in Fig. 2, which is larger than the symbol MAP one for the same value of $c$ (encoded by the color of the curve).

The coincidence of the thresholds for full recovery of the symbol and block MAP estimators observed numerically in Sec. VII also calls for further investigation and for an analytical argument supporting (or disproving) it. This point is also connected to the apparent absence of statistical-to-computational gaps in this problem: the block MAP estimator, being a minimal weight perfect matching, can be determined in polynomial time [2], and the results of [24–27] strongly suggest that it can be asymptotically (in the large size limit) obtained by the $\beta \to \infty$ BP equations. An exact computation of the symbol MAP estimator is instead a computationally hard problem, but it is tempting to conjecture that the BP algorithm with $\beta = 1$ asymptotically reaches the information-theoretically optimal reconstruction error $\mathbb{E}[\varrho_{\rm s}]$.

A $k$-factor of a graph is a set of edges such that each node belongs to exactly $k$ edges of the factor; a perfect matching is thus a special case of this definition with $k = 1$. It would therefore be interesting to study the planted $k$-factor problem for generic values of $k$. For $k = 2$ the problem is related to the planted Hamiltonian cycle that was considered in [6]. The planted $k$-factor could also be studied using the cavity approach and the associated belief propagation equations. At variance with the matching case there is, for generic $k$, no efficient algorithm even for the block MAP estimator; this opens the possibility for computationally hard phases in such a generalization.

Another natural direction for future work is a rigorous proof of our results, notably of the threshold given in Eq. (45) and the critical behaviour stated in Eq. (88). While the local-weak-convergence proof of [10] can likely be extended to generic weight distributions and to the sparse graph setting, it is not clear how to control rigorously the solution of the recursive distributional equations, in particular the reasoning at the beginning of Sec. IV B. The stochastic comparison argument explained in remark (iii) at the end of this section, and expanded upon in Appendix B 2, should provide a scheme for a rigorous proof of full recovery when $v > 0$; the much more challenging question is to prove partial recovery when $v < 0$.

Acknowledgments
We thank the authors of [10] for discussions and for sharing with us their results prior to publication. We also thank Florent Krzakala and Andrea Agazzi for useful discussions on the problem. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement CoSP No 823748, and from the French Agence Nationale de la Recherche under grant ANR-17-CE23-0023-01 PAIL.
Appendix A: The reconstruction error for the block MAP estimator
We prove in this Appendix that the equality of the two expressions (26) and (30) of the average reconstruction error when $\beta \to \infty$ follows from the RDE (29). We have thus to prove that $\mathbb{P}[\hat H + \hat H' \le \hat\Omega] = \gamma\, \mathbb{P}[H + H' > \Omega]$. We first notice that (29b) implies that for any real $x$ one has $\mathbb{P}[H \ge x] = \mathbb{P}[\hat\Omega - \hat H \ge x]\, (q + (1 - q)\, \mathbb{P}[\hat H \ge x])$, hence
$$\mathbb{P}[\hat\Omega - \hat H \ge x] = \frac{\mathbb{P}[H \ge x]}{q + (1 - q)\, \mathbb{P}[\hat H \ge x]} . \tag{A1}$$
Multiplying this expression by $-\frac{\rm d}{{\rm d}x} \mathbb{P}[\hat H \ge x]$, which is the density of the random variable $\hat H$, and integrating over $x$, we obtain
$$\begin{aligned}
\mathbb{P}[\hat H + \hat H' \le \hat\Omega] &= \int_{-\infty}^{+\infty} \left( - \frac{\rm d}{{\rm d}x} \mathbb{P}[\hat H \ge x] \right) \mathbb{P}[\hat\Omega - \hat H \ge x]\, {\rm d}x \\
&= \frac{1}{1 - q} \int_{-\infty}^{+\infty} \left( - \frac{\rm d}{{\rm d}x} \ln( q + (1 - q)\, \mathbb{P}[\hat H \ge x] ) \right) \mathbb{P}[H \ge x]\, {\rm d}x \\
&= \frac{1}{1 - q} \int_{-\infty}^{+\infty} \left( \frac{\rm d}{{\rm d}x} \mathbb{P}[H \ge x] \right) \ln( q + (1 - q)\, \mathbb{P}[\hat H \ge x] )\, {\rm d}x
\end{aligned} \tag{A2}$$
where we performed an integration by parts; the integrated term vanishes because there is no mass at infinity in the law of $H$ and $\hat H$.

We now exploit the other RDE (29a), which gives
$$\mathbb{P}[\hat H \ge x] = \sum_{k=1}^{\infty} \frac{q}{1 - q} \frac{[(1 - q) \gamma]^k}{k!}\, \mathbb{P}[\Omega - H \ge x]^k = \frac{q}{1 - q} \left( e^{(1 - q) \gamma\, \mathbb{P}[\Omega - H \ge x]} - 1 \right) .$$
This yields
$$\ln( q + (1 - q)\, \mathbb{P}[\hat H \ge x] ) = - (1 - q) \gamma\, \mathbb{P}[\Omega - H < x] , \tag{A3}$$
where we used the equation $q = e^{-\gamma (1 - q)}$ to simplify the expression. Inserting (A3) in (A2) gives
$$\mathbb{P}[\hat H + \hat H' \le \hat\Omega] = \gamma \int_{-\infty}^{+\infty} \left( - \frac{\rm d}{{\rm d}x} \mathbb{P}[H > x] \right) \mathbb{P}[\Omega - H < x]\, {\rm d}x = \gamma\, \mathbb{P}[H + H' > \Omega] ,$$
which proves our claim. Note that this derivation relies crucially on the hypothesis that $H$ and $\hat H$ have a continuous distribution, which allowed us to introduce their density and to perform an integration by parts to connect the two terms of (26). We expect this to be the case when the effective weight distribution is continuous, in such a way that the minimal weight perfect matching is unique (on a finite graph); a counterexample is discussed in Sec. V C.

Appendix B: On the simplified RDE in Sec. IV B
We provide in this Appendix some additional details about the simplified RDE defined in Sec. IV B; we first give a heuristic justification of the velocity (37) of the leftmost particle of a BRW, then we detail the stochastic ordering argument that leads to the divergence of $\hat H^{(n)}$ when $v > 0$.

1. Heuristic derivation of the velocity in the BRW process
We will present a reasoning typical of the physics literature on front propagation in reaction-diffusion systems and equations of the FKPP type, see for instance [15–17], that leads to the expression (37) for the velocity of the leftmost particle of the BRW.

We define the cumulative distribution function of $\hat K^{(n)}$ as $F(x, n) = \mathbb{P}[\hat K^{(n)} \le x]$. For a given time $n$ this is an increasing function of $x$, from 0 to 1 as $x$ increases from $-\infty$ to $+\infty$. The RDE (35) translates into an evolution equation for $F$ as the discrete time increases,
$$F(x, n + 1) = 1 - \sum_{k=1}^{\infty} \pi_k \left( 1 - \int F(x - \Xi, n)\, \chi(\Xi)\, {\rm d}\Xi \right)^k ,$$
where $\pi_k$ is the probability law of the random variable $Z$, and $\chi$ the density of $\Xi$. We assume that at large times $F$ exhibits a front propagating at a velocity $v$, and denote $F_v$ the shape of the front in the reference frame moving at this velocity: $F(z + vn, n) \to F_v(z)$ as $n \to \infty$. This gives the following equation on $F_v$:
$$F_v(z - v) = 1 - \sum_{k=1}^{\infty} \pi_k \left( 1 - \int F_v(z - \Xi)\, \chi(\Xi)\, {\rm d}\Xi \right)^k ,$$
which is equivalent to the RDE (39) on the limit random variable $L$. When $z \to -\infty$ the distribution function vanishes; in this limit we can thus linearize the equation on $F_v$, which yields:
$$F_v(z - v) = \left( \sum_{k=1}^{\infty} \pi_k\, k \right) \int F_v(z - \Xi)\, \chi(\Xi)\, {\rm d}\Xi . \tag{B1}$$
This linear (integral) equation admits solutions of the form $F_v(z) = e^{\theta z}$, with $\theta > 0$, provided $\theta$ and $v$ obey the condition
$$e^{-\theta v} = \left( \sum_{k=1}^{\infty} \pi_k\, k \right) \int e^{-\theta \Xi}\, \chi(\Xi)\, {\rm d}\Xi , \tag{B2}$$
which gives a relation $v = v(\theta)$ corresponding to (37). The linearized equation thus admits a family of solutions parametrized by the tail exponent $\theta > 0$, corresponding to velocities $v(\theta)$. The delicate point in this reasoning, for which we refer the reader to the literature, is the justification of the minimum velocity selection principle, namely the fact that the relevant solution of the full non-linear equation on $F_v$ is the one minimizing $v(\theta)$, as stated in (37).

Note that the minimizer $\theta_*$ of $v(\theta)$ corresponds to a double root of the characteristic equation of the linearized equation on $F_v$, which thus admits as solutions the linear combinations of $e^{\theta_* z}$ and $z\, e^{\theta_* z}$. This sheds light on the statement made in (40) for the left tail behavior of the limit random variable $L$, obtained rigorously in [20, 22].
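As an illustration of how the relation (B2) can be used in practice, the following minimal Python sketch evaluates $v(\theta)$ and locates its stationary point $\theta_*$, where the characteristic equation has a double root. It rests on assumptions that are not taken from the paper: the displacement $\Xi$ is given a hypothetical exponential density $\chi(\Xi) = r\, e^{-r \Xi}$ on $\Xi \ge 0$, so that $\int e^{-\theta \Xi} \chi(\Xi)\, {\rm d}\Xi = r/(r + \theta)$, and the offspring law enters only through its mean $\sum_k \pi_k k$ (parameter `mean_offspring`). The function names are illustrative. Whether the stationary point is a minimum or a maximum of $v(\theta)$ depends on the sign convention fixed in (37), which this sketch does not reproduce; it only solves ${\rm d}v/{\rm d}\theta = 0$.

```python
import math


def velocity(theta, mean_offspring, rate):
    """v(theta) solving exp(-theta*v) = mean_offspring * E[exp(-theta*Xi)], cf. (B2).

    Assumption (illustrative only): Xi is exponential with density
    rate*exp(-rate*xi) on xi >= 0, hence E[exp(-theta*Xi)] = rate/(rate+theta).
    """
    laplace = rate / (rate + theta)
    return -math.log(mean_offspring * laplace) / theta


def _theta2_dv(theta, mean_offspring, rate):
    """theta^2 * dv/dtheta for the exponential example; its zero is theta_*."""
    return (theta / (rate + theta)
            - math.log1p(theta / rate)
            + math.log(mean_offspring))


def stationary_theta(mean_offspring, rate, tol=1e-12):
    """Locate theta_* with dv/dtheta = 0 by bisection.

    For mean_offspring > 1 the function _theta2_dv is positive at theta = 0
    and strictly decreasing, so the root is unique.
    """
    if mean_offspring <= 1.0:
        raise ValueError("this sketch assumes mean_offspring > 1")
    lo, hi = 0.0, 1.0
    while _theta2_dv(hi, mean_offspring, rate) > 0.0:
        hi *= 2.0  # expand the bracket until the sign changes
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if _theta2_dv(mid, mean_offspring, rate) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    m, r = 2.0, 1.0  # hypothetical parameters
    theta_star = stationary_theta(m, r)
    print("theta_* =", theta_star)
    print("v(theta_*) =", velocity(theta_star, m, r))
```

In this particular example the stationarity condition implies $v(\theta_*) = 1/(r + \theta_*)$, which gives a quick consistency check of the numerical root. For other choices of $\chi$ one would replace the closed-form Laplace transform by a numerical integral.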
2. Stochastic ordering argument