[PDF] A Stochastic Theory of the Hierarchical Clustering I. Halo Mass Function

Abstract

We present a new theory for the hierarchical clustering of dark matter (DM) halos based on stochastic differential equations, that constitutes a change of perspective with respect to existing frameworks (e.g., the excursion set approach); this work is specifically focused on the halo mass function. First, we present a stochastic differential equation that describes fluctuations in the mass growth of DM halos, as driven by a multiplicative white (Gaussian) noise dependent on the spherical collapse threshold and on the power spectrum of DM perturbations. We demonstrate that such a noise yields an average drift of the halo population toward larger masses, that quantitatively renders the standard hierarchical clustering. Then, we solve the Fokker-Planck equation associated to the stochastic dynamics, and obtain the Press & Schechter mass function as a (stationary) solution. Moreover, generalizing our treatment to a mass-dependent collapse threshold, we obtain an exact analytic solution capable of fitting remarkably well the N-body mass function over a wide range in mass and redshift. All in all, the new perspective offered by the theory presented here can contribute to better understand the gravitational dynamics leading to the formation, evolution and statistics of DM halos across cosmic times.

Full PDF

AACCEPTED BY APJ

Preprint typeset using L A TEX style emulateapj v. 01/23/15

A STOCHASTIC THEORY OF THE HIERARCHICAL CLUSTERING

I. HALO MASS FUNCTION

Andrea Lapi , Luigi Danese

ACCEPTED BY APJ

ABSTRACTWe present a new theory for the hierarchical clustering of dark matter (DM) halos based on stochasticdiﬀerential equations, that constitutes a change of perspective with respect to existing frameworks(e.g., the excursion set approach); this work is speciﬁcally focused on the halo mass function. First, wepresent a stochastic diﬀerential equation that describes ﬂuctuations in the mass growth of DM halos, asdriven by a multiplicative white (Gaussian) noise dependent on the spherical collapse threshold and onthe power spectrum of DM perturbations. We demonstrate that such a noise yields an average driftof the halo population toward larger masses, that quantitatively renders the standard hierarchicalclustering. Then, we solve the Fokker-Planck equation associated to the stochastic dynamics, andobtain the Press & Schechter mass function as a (stationary) solution. Moreover, generalizing ourtreatment to a mass-dependent collapse threshold, we obtain an exact analytic solution capable ofﬁtting remarkably well the N -body mass function over a wide range in mass and redshift. All inall, the new perspective oﬀered by the theory presented here can contribute to better understand thegravitational dynamics leading to the formation, evolution and statistics of DM halos across cosmictimes. Subject headings:

Cosmology (343) — Dark matter (353) INTRODUCTIONThe halo mass function, namely the statistics describ-ing the number of dark matter (DM) halos per unit co-moving volume as a function of halo mass and redshift,is a fundamental quantity in astrophysics and cosmol-ogy (see textbooks by Mo et al. 2010 and Cimatti et al.2020). For example, it is a basic ingredient to developsensible galaxy formation and evolution models (see re-views by Silk & Mamon 2012 and Naab & Ostriker 2017),and it is routinely exploited in cosmological studies re-lying on the abundance and clustering of collapsed ob-jects and of large-scale structures (see reviews by Frenk& White 2012 and Wechsler & Tinker 2018).Clearly, the halo mass function can be estimated viahigh-resolution, large-volume, cosmological N -body sim-ulations (see Sheth & Tormen 1999; Jenkins et al. 2001;Warren et al. 2006; Tinker et al. 2008; Crocce et al.2010; Bhattacharya et al. 2011; Watson et al. 2013).However, given the natural limits on resolution, compu-tational time, and storing capacity, it can be probed onlyin limited mass and redshift ranges. Moreover, the re-sults of simulations depend somewhat on the algorithmused to identify collapsed halos (e.g., friend-of-friend vs.spherical overdensity), and on speciﬁc parameters relatedto the identiﬁcation of isolated objects (e.g., the linkinglength). On the other hand, to estimate the halo massfunction from observations is even more challenging (e.g.,Castro et al. 2016; Dong et al. 2019; Sonnenfeld et al.2019; Li et al. 2020), given the statistical and system-atic uncertainties that arise when linking the observable SISSA, Via Bonomea 265, 34136 Trieste, Italy IFPU - Institute for fundamental physics of the Universe,Via Beirut 2, 34014 Trieste, Italy INFN-Sezione di Trieste, via Valerio 2, 34127 Trieste, Italy INAF-Osservatorio Astronomico di Trieste, via Tiepolo 11,34131 Trieste, Italy quantities to the halo mass. Therefore, a deep theoreti-cal understanding on how the halo mass function is orig-inated from ﬁrst principles is of crucial importance.The modern theoretical framework to address the is-sue was born with the seminal work by Press & Schechter(1974); these authors were able to compute an analyticexpression for the halo mass function by prescribing thata halo would collapse if it resided within a suﬃcientlyoverdense region of the initial (Gaussian) perturbationﬁeld. Given that the overdensity around a spatial lo-cation depends on scale, they recognized that the haloabundance is simply related to the mass fraction in thedensity ﬁeld, smoothed on diﬀerent scales, which is abovea critical threshold for collapse. A drawback of this ap-proach is the so called ’cloud-in-cloud’ problem, i.e., at-tention must be paid not to double count overdense re-gions embedded within a larger collapsing perturbation;in other words, one has to consider only the perturba-tions that overcome the threshold on a given smoothingscale, but not on a larger one.The problem was solved with the development of theexcursion set framework by Bond et al. (1991), whichstill nowadays constitutes the standard theory. This en-visages the overdensity around a given spatial location toexecute a random walk when considered as a function ofthe smoothing scale; if the smoothing is performed witha sharp ﬁlter in Fourier space, the walk is Markovian.The collapse threshold here plays the role of a barrier,and the halo mass function is related to the distributionof ﬁrst crossing, i.e., the probability that a walk crossesthe barrier for the ﬁrst time on a speciﬁc scale. In theoriginal theory, the collapse threshold was gauged on thespherical collapse model of DM perturbations (Gunn &Gott 1972), and as such it was assumed to be indepen-dent on halo mass; subsequent developments adopted amass-dependent threshold, inspired from the ellipsoidal a r X i v : . [ a s t r o - ph . C O ] S e p LAPI & DANESEcollapse model (see Sheth & Tormen 2002), in order tobetter reproduce the results of N -body simulations.The excursion sets approach was then exploited to de-rive the ’conditional’ halo mass function (see Lacey &Cole 1993), describing the mass and redshift distributionof a halo’s progenitors, to build up Monte Carlo realiza-tions of the merging process known as merger trees (seeKauﬀmann & White 1993; Somerville & Kolatt 1999;Cole et al. 2000; Parkinson et al. 2008), and to developmodels for the large-scale halo bias (see Mo & White1996; Sheth & Lemson 1999). Further, more recent,reﬁnements include non-Markovian walks (Maggiore &Riotto 2010a; Musso & Sheth 2012), stochastic collapsethresholds (Maggiore & Riotto 2010b; Corasaniti & Achi-touv 2011), extension to peaks theory (see Paranjapeet al. 2012), descriptions of the void distribution (seeSheth & van de Weygaert 2004; Jennings et al. 2013),and non-standard cosmologies (e.g., von Braun-Bates &Devriendt 2018; Lovell 2020).Despite this rich literature focused on the theoreticalfoundations and a number of undoubtable successes inpractical applications, the excursion set framework isknown to hide some pitfalls and drawbacks: no exactanalytic expression of the mass function for a generalmass-dependent collapse threshold is known, but for verysimple shapes (see Zhang & Hui 2006; Lapi et al. 2013);the merging kernel associated to the excursion set the-ory is not symmetric, and this causes a mathematicalinconsistency or at least an ambiguity in the deﬁnitionof the merger rates (see Benson et al. 2005; Neistein &Dekel 2008; Zhang et al. 2008); even adopting a mass-dependent collapse threshold and other reﬁnements, theexcursion set formalism struggles to reproduce the haloprogenitors’ distributions extracted from N -body simu-lations (e.g., Parkinson et al. 2008; Jiang & van denBosch 2014); the relation between the probability of ﬁrstupcrossing and the mass function, which is at the heartof the excursion set framework, is correct on statisticalgrounds but cannot be strictly true for individual masselements (see discussion by Mo et al. 2010, their Sect.7.2.2b).In this paper we submit a new theory of the hierarchi-cal clustering and halo mass function based on stochas-tic diﬀerential equations in real space, that constitutesa change of perspective with respect to the excursionset formalism. First, we invent a stochastic diﬀerentialequation that describes ﬂuctuations in the mass growthof DM halos, as driven by a multiplicative white (Gaus-sian) noise dependent on the spherical collapse thresh-old and on the power spectrum of DM perturbations;in this approach it is the mass (or mass variance) in agiven region of the Universe to perform a (Markovian)random walk as a function of cosmic time. In Sect. 2 wedemonstrate that the noise yields an average drift towardlarger masses, that quantitatively renders the standardhierarchical clustering. Then, in Sect. 2.1 we solve theFokker-Planck equation associated to the stochastic dy-namics, and obtain as a solution the Press & Schechtermass function; in Sect. 2.2 we point out that the solu-tion is stationary when the original equation is writtenin convenient variables.In Sect. 3 we introduce a minimal modiﬁcation of thestochastic equation in terms of a mass-dependent col-lapse threshold. Using a parametric shape analogous to that adopted in the excursion set framework, we obtain aclosed-form analytical solution of the associated Fokker-Planck equation. Remarkably, such a solution has ashape similar to the empirical ﬁtting formula introducedsince Sheth & Tormen (1999); moreover, for speciﬁc val-ues of the parameters describing the mass dependence ofthe collapse threshold, our result reproduces remarkablywell the N -body mass function over an extended rangeof masses and redshifts.As an aside, in Sect. 4 we explore how to generalizeour framework when a colored rather than a white noiseis considered, so as to enforce a non-Markovian evolu-tion. Adopting for deﬁniteness a multiplicative Ornstein-Uhlenbeck noise and a constant threshold for collapse,we are able to solve exactly the corresponding Fokker-Planck equation and obtain a closed form solution. Withrespect to the white noise case, the redshift evolution ofthe mass function is found to be modiﬁed somewhat, in afashion depending on the correlation time characterizingthe noise. Finally, in Sect. 5 we summarize our ﬁndingsand envisage possible outlooks.Throughout this work, we adopt the standard ﬂatΛCDM cosmology (Planck Collaboration 2018) withrounded parameter values: matter density Ω M ≈ . Λ ≈ .

7, baryon density Ω b ≈ . H = 100 h km s − Mpc − with h ≈ . σ ≈ . h − Mpc. Themost relevant expressions are highlighted with a box. A STOCHASTIC EQUATION FOR THEHIERARCHICAL CLUSTERINGOur proposal is to capture the essence of the hierar-chical clustering via the following nonlinear stochasticdiﬀerential equation:dd t ln σ − = σδ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) / η ( t ) , (1)or equivalently, in terms of mass dd t M = σ | d σ/ d M | δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) / η ( t ) . (2)Here η ( t ) is a Gaussian white noise (physical dimension1 / √ time) with ensemble-average properties (cid:104) η ( t ) (cid:105) = 0and (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = 2 δ D ( t − t (cid:48) ), where δ D is the Dirac delta-function (the factor 2 is only a convention and clearly itcould be reabsorbed into the multiplicative term). Thequantity δ c ( z ) = δ c D (0) /D ( z ) is the critical thresholdfor collapse extrapolated from linear perturbationtheory; in a ﬂat Universe (see Mo et al. 2010; Weinberg2008), one can use the approximations δ c (cid:39) (12 π ) / [1 + 0 . Ω M ( z )] ≈ .

68 and D ( z ) ≈ Throughout the paper we adopt, in line with the majority ofthe physics community, the Stratonovich convention; this allowsto use the rules of ordinary calculus on stochastic variables, at theprice of originating a noise-induced drift term in the Fokker-Planckcoeﬃcients associated to a given stochastic diﬀerential equation(see Appendix A). The alternative Ito convention, mostly used bymathematicians, removes such a noise-induced drift, but requiresto develop new rules for the diﬀerential calculus of stochastic vari-ables.

STOCHASTIC THEORY OF HIERARCHICAL CLUSTERING I. 3

Fig. 1.—

Euler-Maruyama integration of the stochastic diﬀeren-tial Eq. (1), yielding the evolution of the mass variance ln σ (left y − axis) and of the mass M (right y − axis) as a function of cosmictime t ; the initial condition M ≈ M (cid:12) at z ∼

10 (correspondingto t ∼ . δ c ( t ).

52 Ω M ( z )1+ z (cid:104) + Ω M ( z ) − Ω M ( z ) + Ω / M ( z ) (cid:105) − with Ω M ( z ) ≡ Ω M (1 + z ) / [Ω Λ + Ω M (1 + z ) ]. Finally, σ is the mass variance ﬁltered on the mass scale M : σ ( M ) = 1(2 π ) (cid:90) d k P ( k ) ˜ W M ( k ) ; (3)here P ( k ) is the power spectrum of density ﬂuctuation,and ˜ W M ( k ) is the Fourier transform of a window func-tion whose volume in real space encloses the mass M ; forstandard cold dark matter power spectra (e.g., Bardeenet al. 1986), σ ( M ) is an inverse, convex, slowly-varyingfunction of M . Note that in the present theory the re-lation σ ( M ) between σ and M is purely deterministic,but both M ( t ) and σ ( t ) = σ ( M ( t )) are to be consideredstochastic variables that ﬂuctuate over cosmic time t un-der the inﬂuence of the noise. We stress the change ofperspective with respect to the standard excursion setformalism: in the latter the overdensity ﬁeld δ ( σ ) exe-cutes a random walk as a function of the mass variance σ , which plays the role of a pseudo-time variable; here themass M ( t ) or the mass variance σ ( M ( t )) are themselvesstochastic variables, undergoing a Markovian evolutionas a function of (real) time t . Note that in the excursionset approach the choice of the ﬁlter function in Eq. (3)has a crucial impact, since the random trajectories δ ( σ )are Markovian only when a sharp ﬁlter in Fourier space isadopted; in the present theory, assuming a diﬀerent ﬁlterfunction (e.g., Gaussian or top-hat in real space) changes only the deterministic relation σ ( M ) but has otherwiseno eﬀect on the Markovianity of the stochastic processes M ( t ) and σ ( M ( t )).The rationale naively followed to invent the aboveEq. (1) is simple. On the left hand side it appears thetime derivative of an adimensional function of the massthat incorporates the power spectrum and the ﬁlteringscale; in choosing ln σ we have been inspired by a num-ber of N -body simulations (e.g., Zhao et al. 2009), thatsuggest the mass growth of halos to be easily describedin terms of such a quantity. On the right hand side itappears a stochastic driving η ( t ), that for dimensionalconsistency must be multiplied by the (inverse) squareroot of a characteristic timescale. Since our aim here isto describe the growth of DM perturbations using quan-tities related to the linear regime, we ﬁnd it natural tochoose the timescale | ˙ δ c /δ c | (cid:39) | ˙ D ( t ) /D ( t ) | , where thefactor D ( t ) deﬁned below Eq. (2) eﬀectively describes thelinear growth of perturbations under gravity in a givencosmological background. Moreover, the adopted noise is‘multiplicative’, in that its strength depends on the stateof the system, and speciﬁcally on the ratio σ/δ c . Regionswith σ (cid:38) δ c tend to change their mass more abruptly,while the evolution is slower for σ (cid:46) δ c . In particular,positive variations of M (or in ln σ − ) within the ﬁlteredregion can be reasonably related to mergers among col-lapsed halos, or mass accretion from the ﬁeld; negativevariations can be interpreted in terms of mass loss due togravitational interactions with surrounding regions, tidalforces, stripping, and fragmentation. We stress the cru-cial role played by the multiplicative noise in Eq. (1); as η ( t ) ﬂuctuates, also the random variable σ and hence themultiplicative factor σ/δ c on the r.h.s. varies, and there-fore (cid:104) σ η/δ c (cid:105) is not null even if (cid:104) η (cid:105) is; this noise-induceddrift actually makes σ ( t ) to copy the decrease of δ c ( t )with time and, given the inverse convex shape of the de-terministic function σ ( M ), a net average increase in mass M ( t ) is enforced. Similar stochastic models with multi-plicative noise have been employed to describe a widerange of physical phenomena, from Brownian motion ininhomogeneous media or in close approach to physicalbarriers, to thermal ﬂuctuations in electronic circuits, tothe evolution of stock prices, to the heterogeneous re-sponse of biological systems and randomness in gene ex-pression; to our knowledge, this is the ﬁrst time they areapplied to describe the formation of collapsed structurein the Universe.A more quantitative hint that such a stochastic equa-tion has some value can be derived by performing asimple integration via the Euler-Maruyama method (e.g.Kloeden & Platen 1992; there are other ways to obtainmore accurate numerical solutions of stochastic diﬀeren-tial equations but these are not needed here); Eq. (1) canbe discretized on a time grid t i with i = 0 , . . . , n − σ ( t i +1 ) = ln σ ( t i ) + 12 σ ( t i ) δ c ( t i ) (cid:12)(cid:12)(cid:12)(cid:12) δ c ( t i +1 ) − δ c ( t i ) δ c ( t i ) (cid:12)(cid:12)(cid:12)(cid:12) − σ ( t i ) δ c ( t i ) (cid:12)(cid:12)(cid:12)(cid:12) δ c ( t i +1 ) − δ c ( t i ) δ c ( t i ) (cid:12)(cid:12)(cid:12)(cid:12) / w i , (4)where w i are random weights extracted from a normal distribution with zero mean and unit variance. In Fig. 1 LAPI & DANESEwe show the resulting evolution of ln σ ( M ( t )) as a func-tion of cosmic time, with initial condition M ∼ M (cid:12) or ln σ ∼ . z ∼

10 (reasonable initial conditions donot change signiﬁcantly the outcome). The grey lines are30 randomly chosen evolutionary tracks, while the redline and shaded area illustrate the median and the quar-tiles over 3000 realization of the noise, and the dashedblue line shows the evolution of the spherical collapsethreshold δ c ( t ). It is easily seen that the noise inducesa drift of ln σ , making it to decrease, and hence makingthe mass M to increase. Remarkably, after a burn-inperiod needed to erase memory of the initial condition, σ ( M ( t )) tends to copy the evolution of δ c ( t ); this meansthat the noise-induced drift eﬀectively renders the stan-dard hierarchical clustering of the halo population, asexpressed by the increase with time of the characteristicmass M c ( t ) set by the condition σ ( M c ( t )) ∼ δ c ( t ).2.1. Fokker-Planck equation and the Press &Schechter mass function

We now look for the probability density P ( M, t ) fora region to be in a state of mass between M and M +d M at time t ; this can be found by solving the Fokker-Planck equation associated to Eq. (2), which reads (seeAppendix A for details) ∂∂t P ( M, t ) = − T ( t ) ∂∂M [ D ( M ) D (cid:48) ( M ) P ( M, t )] + T ( t ) ∂ ∂M (cid:2) D ( M ) P ( M, t )] (cid:3) , (5)where we have deﬁned the two quantities D ( M ) ≡ σ / | d σ/ d M | and T ( t ) ≡ | ˙ δ c | / /δ / c such that ˙ M = D ( M ) T ( t ) η ( t ) factorizes the mass and time dependen-cies. The Fokker-Planck equation may also be writ-ten as a pure continuity equation ∂ t P + ∂ M J = 0in terms of a probability current J ( M, t ) = − T ( t ) D ( M ) ∂ M [ D ( M ) P ( M, t )]. The natural boundary con-ditions lim M →∞ P ( M, t ) = 0, P ( M,

0) = δ D ( M ) and theconstraint P ( M, t ) = 0 whenever

M <

J | M =0 = [ D ∂ M ( D P )] | M =0 = 0 at the M = 0 point (nonet probability current through M = 0). Then the prob-ability mass function P is normalized as (cid:82) ∞ d M P = 1and thus it must be related to the halo mass function byd N d M d V ( M, t ) = ¯ ρ M M P ( M, t ) , (6)in terms of the average comoving matter density ¯ ρ M .Now to solve the Fokker-Planck equation we employthe transformations:  X ≡ (cid:90) d MD ( M ) = 1 σY ≡ (cid:90) d t T ( t ) = 12 δ c W ( X, Y ) ≡ D ( M ) P ( M, t ) . (7)Then Eq. (5) turns into ∂ Y W ( X, Y ) = ∂ X W ( X, Y ) , (8)which is a standard diﬀusion equation. In terms ofthese new variables, the boundary conditions stated be-low Eq. (5) read lim X →∞ W = 0, W ( X,

0) = δ D ( X )and ( ∂ X W ) | X =0 = 0. These stem from the followingcircumstances: (i) for reasonable power spectra σ is aslowly-varying inverse function of M , so that X ∝ σ − tends to zero or inﬁnity as M does; (ii) the collapsethreshold δ c ∝ D − ( t ) scales inversely with t , so that Y ∝ δ − c tends to zero as t does; (iii) ﬁnally, W ( X, Y ) = D ( M ) P ( M, t ) vanishes for large X since P ( M, t ) is ex-pected to be exponentially suppressed for M → ∞ sooverwhelming any slow (at most powerlaw) divergenceof D ( M ) = σ / | d σ/ d M | .The solution of this diﬀerential problem is standard (itcan be easily found via a Fourier transform) and writes W ( X, Y ) = 1 √ π Y e − X / Y . (9)Coming back to the original variables we get P ( M, t ) = 1 D ( M ) (cid:112) π Y ( t ) e − X ( M ) / Y ( t ) , (10)which after Eqs. (6) and (7) yields the Press & Schechtermass function: N ( M, t ) = (cid:114) π ¯ ρ M δ c ( t ) M σ (cid:12)(cid:12)(cid:12)(cid:12) d σ d M (cid:12)(cid:12)(cid:12)(cid:12) e − δ c ( t ) / σ . (11)Three interesting remarks follow. First, note that thepresence of the multiplicative noise in Eqs. (1) and (2) isfundamental in originating a second-order, diﬀusion-liketerm in the associated Fokker-Planck equation (see alsoAppendix A); from the derivation above it is seen thatsuch a term yields the exponential cut-oﬀ of the massfunction at the high-mass end. Second, one can easilycompute the moments (cid:42)(cid:18) δ c σ (cid:19) k (cid:43) = (cid:90) ∞ d M ( δ c /σ ) k P = 2 k/ √ π Γ (cid:18) k (cid:19) . (12)For scale-free power spectra M ∝ σ − / ( n +3) holds interms of the eﬀective spectral index n > −

3, so thatthe above implies (cid:104) M k (cid:105) ( t ) ∝ δ − k/ ( n +3) c ; this in turnscales as (cid:104) M k (cid:105) ( t ) ∝ t k/ ( n +3) in the redshift range where δ c ∝ t − / applies. Third, it is interesting to note thatthe Fokker-Planck Eq. (5) highlights that the evolutionof the probability density function is driven by the twoterms on the r.h.s: the ﬁrst represents its noise-induceddrift toward larger masses because of hierarchical col-lapses, and the second describes its diﬀusive reshaping STOCHASTIC THEORY OF HIERARCHICAL CLUSTERING I. 5at the high-mass end due to the stochasticity in merg-ing/accretion events.2.2. Stationarity

An alternative derivation, that will be useful for thegeneralization in the next Section, is the following. Weagain start from Eq. (1) and change variable from σ to ν ≡ δ c ( t ) /σ ( M ):dd t ν = − (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ν + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) / η ( t ) ; (13)thus now the variable ν is seen to undergo a stochasticOrnstein-Uhlenbeck process. The corresponding Fokker-Planck equation reads ∂ t P ( ν, t ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ ν [ ν P ( ν, t ) + ∂ ν P ( ν, t )] . (14)We set the boundary conditions lim ν →∞ P = 0 and J | ν =0 = −| ˙ δ c /δ c | [ ν P + ∂ ν P ] | ν =0 = 0, implying (cid:82) ∞ d ν P = 1; the former is the natural boundary due tothe expected exponential suppression of the mass func-tion for large values of M , which correspond to large ν ∝ σ − ; the latter is the no-current boundary conditionat the ν = 0 point enforced by the constraint P ( ν, t ) = 0for ν < ν positively deﬁned.Having incorporated the time dependent quantity δ c ( t )into the new variable ν , under the ergodic hypothesisone expects that the relevant solution in terms of this variable should be stationary (e.g., Paul & Baschnagel2013), i.e., it should satisfy ∂ t P ( ν, t ) = 0. From Eq. (14), P ( ν, t ) = ¯ P ( ν ) is determined by ν ¯ P + dd ν ¯ P = 0 , (15)where the constant on the r.h.s. must be zero to satisfythe no-current boundary condition J | ν =0 = 0. The so-lution to this simple ordinary diﬀerential equation withnormalization (cid:82) ∞ d ν ¯ P ( ν ) = 1 is ¯ P = (cid:112) /π e − ν / . Themass function writes as N ( M, t ) = ¯ ρ M M (cid:12)(cid:12)(cid:12)(cid:12) d ν d M (cid:12)(cid:12)(cid:12)(cid:12) ¯ P ( ν ) ; (16)given that | d ν/ d M | = ( δ c /σ ) | d σ/ d M | , this is indeedeasily recognized to be again the Press & Schechter massfunction.One may wonder whether the general time-dependentsolution P ( ν, t | ν (cid:48) , t (cid:48) ) of Eq. (14) for a generic initialcondition P ( ν, t = t (cid:48) | ν (cid:48) , t (cid:48) ) = δ D ( ν − ν (cid:48) ) converges tothe stationary state, and over which timescale. Tothis purpose, we note that deﬁning a new time vari-able τ ∝ − ln δ c ( t ) brings Eq. (14) into a form thatcan be easily solved via a Fourier transform; the fun-damental solutions (a general result for Gaussian andMarkovian variables known as Doob’s theorem) read P ∝ e − ( ν ± ν (cid:48) e − τ ) / (1 − e − τ ) / √ − e − τ . Taking into ac-count the no-current boundary condition, and revertingto the original time variable one obtains P ( ν, t | ν (cid:48) , t (cid:48) ) = 1 (cid:112) π (1 − δ /δ (cid:48) ) (cid:26) exp (cid:20) −

12 ( ν − ν (cid:48) δ/δ (cid:48) ) − δ /δ (cid:48) (cid:21) + exp (cid:20) −

12 ( ν + ν (cid:48) δ/δ (cid:48) ) − δ /δ (cid:48) (cid:21)(cid:27) , (17)with δ ≡ δ c ( t ) and δ (cid:48) ≡ δ c ( t (cid:48) ). Plainly, away from any ini-tial condition, for δ substantially lower than δ (cid:48) , this con-verges to the stationary solution P ( ν, t ) derived above.Such transitional states could be possibly related to de-viation of the mass function from the self-similar shapeEq. (16) in terms of the variable ν . Note, in passing, thatthe transition probability Eq. (17) cannot be directly re-lated to the halo conditional mass function of the ex-tended Press & Schechter theory; we anticipate that toderive the latter a modiﬁcation of Eq. (13) is needed,but the issue demands an extended analysis that will bepresented in a forthcoming paper. MASS-DEPENDENT THRESHOLD: THE N -BODY MASS FUNCTIONIt is well known that the halo mass function derivedfrom N -body simulations deviates substantially from thePress & Schechter shape (see reference in Sect. 1 for de-tails); in particular, the former is ﬂatter than the latterboth at the high and at the low mass end, and evolvesmore slowly toward high redshift (see Fig. 2). This mis-match is usually cured by introducing a mass-dependent threshold for collapse: δ c ( σ, t ) = σ √ q ν (cid:20) β ( √ q ν ) γ (cid:21) ≡ σ B ( ν ) = δ c B ( ν ) ν , (18)where q , β , and γ are parameters to be set by ﬁtting the N -body outcomes. Such a modiﬁed threshold is generallyascribed, though a bit naively, to the fact that perturba-tions undergo an ellipsoidal rather than a spherical col-lapse (Sheth & Tormen 2002; see also discussion by Mo etal. 2010). Note, however, that the above shape is quitegeneral in describing a variety of phenomena that caninﬂuence the collapse, like tidal torques and angular mo-mentum, cosmological constant, dynamical friction (seeDel Popolo 2017 and references therein). In the excursionset framework, the parameters q ≈ . β ≈ .

47 and γ ≈ .

615 are required to ﬁt the N -body mass function;however, there is some degeneracy, in that for example asquare-root barrier with q ≈ . β ≈ . γ ≈ . γ > / δ ( σ ) do notcross the barrier at all, an occurrence thought to repre-sent fragmentation. Notice that no general analytical ex-pression exists for the excursion set mass function, apart LAPI & DANESE Fig. 2.—

Halo mass function at diﬀerent redshifts z = 0 (red), 1(magenta), 3 (green), 6 (blue), and 10 (cyan). Solid lines refer tothe mass function from this work based on the stationary Fokker-Planck solution (see Eqs. 23 and 24) for a mass-dependent collapsethreshold δ c ( σ, t ), see Sect. 3 for details. Filled dots illustrate the N -body results for FoF halos by Bhattacharya et al. (2011), sam-pled from their ﬁtting formula for − . ≤ ln[ σ ( M ) D ( z )] − ≤ . .

25 dex. Dashed line shows the mass functioncomputed from the excursion set theory. As a reference, dottedlines show the Press & Schechter mass function, which correspondsto the standard spherical collapse threshold δ c ( t ) independent ofmass. TABLE 1Solution parameters for amass-dependent threshold

Mass function q β γ

ST99 0 .

62 0 .

16 0 . .

69 0 .

09 0 . .

69 0 .

12 0 . Note . — Values of the parameters q , β and γ are set by ﬁtting the Fokker-Planck solutionEqs. (23) and (24) to the halo mass functionfrom the N -body simulations by Sheth & Tor-men (1999; ST99), Bhattacharya et al. (2011;Bh+11) and Watson et al. (2013; Wa+13), seeSect. 3 for details. from very particular barrier shapes (e.g., the constant,linear or square-root barriers; see for example Mahmoodet al. 2006; Giocoli et al 2007; Lapi et al. 2013) and ingeneral one must recur to numerical solutions (see Zhang& Hui 2006).In the framework presented here, we aim to show thata minimal modiﬁcation of the basic Eq. (1), which incor-porates a mass-dependent collapse threshold with shapeanalogous to the above Eq. (18), will yield a mass func-tion in excellent agreement with N -body simulations.Moreover, we will provide an analytic expression valid for any triples of parameters q , β , and γ < /

2, that inthe limit ν >> δ c ( t ) with the mass-dependent δ c ( σ, t ) given above; thus now the ratio σ/δ c ( σ, t ) modu-lates the noise toward enforcing collapse. We also retainthe term | ˙ δ c /δ c | / (cid:39) | ˙ D ( t ) /D ( t ) | / involving the char-acteristic timescale for the linear growth of perturbations(see discussion in Sect. 2). When formulated in terms of ν , Eq. (1) modiﬁed in such a way writes asdd t ν = − ν (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + νB ( ν ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) / η ( t ) , (19)where δ c ( t ) is the standard threshold for spherical col-lapse. The corresponding Fokker-Planck equation reads: ∂ t P ( ν, t ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ δ c δ c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ ν (cid:26) ν P ( ν, t ) + νB ( ν ) ∂ ν (cid:20) νB ( ν ) P ( ν, t ) (cid:21)(cid:27) , (20)with boundary conditions lim ν →∞ P = 0 and J | ν =0 = −| ˙ δ c /δ c | [ ν P + ( ν/B ) ∂ ν ( ν P /B )] | ν =0 = 0, implying theconstraint (cid:82) ∞ d ν P = 1.In analogy to the procedure followed in Sect. 2.2, welook for stationary solutions P ( ν, t ) = ¯ P ( ν ) with ∂ t ¯ P =0; one obtains the equation νB ( ν ) dd ν (cid:20) νB ( ν ) ¯ P (cid:21) = − ν ¯ P , (21)where the integration constant must be null to satisfy theno-current boundary condition J | ν =0 = 0. The aboveequation can be easily solved by multiplying both sidesby 1 /B and recognizing that it becomes separable for thefunction ( ν/B ) ¯ P ; we ﬁnd¯ P ( ν ) = A B ( ν ) ν exp (cid:20) − (cid:90) d ν B ( ν ) ν (cid:21) , (22)where the normalization constant A is determined bythe condition (cid:82) ∞ d ν ¯ P ( ν ) = 1. We stress that this resultholds for any mass-dependent collapse threshold that canbe expressed as δ c ( σ, t ) ≡ δ c ( t ) B ( ν ) /ν in terms of thescaled variable ν .Specializing now to the shape of B ( ν ) from Eq. (18),we note that for ν → ν ¯ P ( ν ) ∝ B ( ν ) ∝ ν − γ applies; to satisfy the normalization constraint (cid:82) ∞ d ν ¯ P = 1 one must require γ < /

2. Performingexplicitly the integration, we get the closed form expres-sion¯ P ( ν ) = A √ q (cid:20) β ( √ q ν ) γ (cid:21) exp (cid:26) − q ν (cid:20) β − γ √ q ν ) γ + β − γ √ q ν ) γ (cid:21)(cid:27) . (23) STOCHASTIC THEORY OF HIERARCHICAL CLUSTERING I. 7 Fig. 3.—

Evolution with redshift of the quantity (2 Y ) − / en-tering the colored noise solution of Sect. 4, that in the white-noisecase is simply the spherical collapse threshold δ c (dotted line). Col-ored curves illustrate the result for diﬀerent values of the volatility ζ and mean reversal rate ω characterizing the noise: blue line isfor ω = 10 Gyr − , green for 1 Gyr − , and red for 0 . − ; solidlines refer to ζ = ω , while dot-dashed line to ζ = 2 ω , and dashedline to ζ = ω/ ω = 1Gyr − ). The resulting mass function just writes N ( M, t ) = ¯ ρM σ (cid:12)(cid:12)(cid:12)(cid:12) d σ d M (cid:12)(cid:12)(cid:12)(cid:12) ν ¯ P ( ν ) ; (24)incidentally, note that the multiplicity function f (ln σ )used in some literature works is just f (ln σ ) ≡ ν ¯ P ( ν ).We stress that the above is an exact expression, valid forany triple of values q , β , γ < /

2; the Press & Schechterfunction is recovered for q = 1 and β = 0. Remarkably,the asymptotic behavior for ν >>

1, corresponding tolarge masses and/or early cosmic times, is seen to pro-duce a shape akin to the empirical ﬁt of N -body simu-lations adopted since Sheth & Tormen (1999); however,for ﬁnite ν the terms in the exponential are importantand must be taken into account.We now use a Levenberg-Marquardt least-squares min-imization routine to ﬁt the multiplicity function ν ¯ P ( ν )to the simulation outcomes for FoF halos by Sheth & Tor-men (1999), Bhattacharya et al. (2011) and Watson et al.(2013), sampled from their (somewhat diﬀerent) ﬁttingformulas for − . ≤ ln[ σ ( M ) D ( z )] − ≤ . .

25 dex. For Sheth & Tormen (1999), we ﬁnd bestﬁt parameters q ≈ . β ≈ .

16 and γ ≈ .

37, yield-ing a value of the normalization constant

A ≈ .

63. ForBhattacharya et al. (2011), we obtain q ≈ . β ≈ . γ ≈ .

42, yielding

A ≈ .

64. For Watson et al.(2013), we get q ≈ . β ≈ .

12 and γ ≈ .

37, yielding

A ≈ .

66. These triples of values, reported for conve-nience in Table 1, are consistent within the uncertaintiesin the simulation results and in the ﬁtting procedure. In Fig. 2 we compare the mass function N ( M, t ) fromEqs. (23) and (24) to the N -body results by Bhat-tacharya et al. (2011), ﬁnding an excellent agreementover a wide range of masses M ∼ − M (cid:12) andredshifts z ∼ −

10. In the same Figure we also plot forreference the Press & Schechter mass function; moreover,we show the mass function computed from the excur-sion set approach (following the numerical algorithm byZhang & Hui 2006) with the barrier shape of Eq. (20) andthe standardly adopted parameters q ≈ . β ≈ . γ ≈ .

615 (see above). COLORED NOISE AND NON-MARKOVIANWALKSIn Nature, white noise is never found to be perfectlyrealized, but rather constitutes an idealization of thestochastic driving force aﬀecting a physical phenomenon.Thus one may wonder whether the previous treatmentcan be extended to a colored instead of a white noise; thiswill correspondingly enforce a non-Markovian evolutionof the system. To have a grasp on the impact of colorednoise and non-Markovianity on the mass function, weconsider a multiplicative stochastic Ornstein-Uhlenbeckprocess; for simplicity we adopt a mass-independent col-lapse threshold, so that the endpoint of this computationshould be compared with the Press & Schechter massfunction. Speciﬁcally, we modify our Eq. (2) into thetwo-dimensional stochastic system  ˙ M = T ( t ) D ( M ) Γ( t )˙Γ = − ω Γ( t ) + ζ η ( t ) (25)where, besides already deﬁned quantities, Γ( t ) is anOrnstein-Uhlenbeck noise with average (cid:104) Γ( t ) (cid:105) = 0 anda nontrivial correlation between diﬀerent times (cid:104) Γ( t ) Γ( t (cid:48) ) (cid:105) = ζ ω [ e − ω | t − t (cid:48) | − e − ω ( t + t (cid:48) ) ] , (26)controlled by the parameters ω and ζ (both have physicaldimension of 1 / time); ζ represents the degree of volatil-ity, i.e. the sensitivity of the system to random changes,while ω is the dissipation rate at which the system tendsto reverse toward its zero mean. This is perhaps the sim-plest generalization of the white noise case, since Γ( t )is still Markovian, but of course M ( t ) is not. In thelimit ζ = ω → ∞ one recovers a pure white-noise, since (cid:104) Γ( t ) Γ( t (cid:48) ) (cid:105) (cid:39) lim ω →∞ ω e − ω | t − t (cid:48) | = 2 δ D ( t − t (cid:48) ).The Fokker-Planck equation regulating the dynamicsof the probability density function P ( M, Γ , t ) for theabove system is ∂∂t P ( M, Γ , t ) = − T ( t ) Γ ∂∂M [ D ( M ) P ( M, Γ , t )] + ω ∂∂ Γ [Γ P ( M, Γ , t )] + ζ ∂ ∂ Γ P ( M, Γ , t ) ; (27)in analogy with the one-dimensional case, we can write this is as a continuity equation ∂ t P + ∇ · J = 0 in terms LAPI & DANESEof the vectorial diﬀerential operator ∇ = [ ∂ M , ∂ Γ ] andprobability current J = [ J M , J Γ ] = [ T D Γ P , − ω Γ P − ζ ∂ Γ P ]. In particular, we are interested in the marginal-ized P ( M, t ) = (cid:82) dΓ P ( M, Γ , t ) with boundary condi-tions lim M →∞ P ( M, t ) = 0, P ( M,

0) = δ D ( M ) and( (cid:82) dΓ Γ P ) | M =0 = 0; the latter expresses in the two-dimensional space ( M, Γ) the requirement of a zero cur-rent on the M = 0 line, i.e. ( (cid:82) dΓ J M ) | M =0 = 0. InAppendix B we show that the Fokker-Planck equation issolved by P ( M, t ) = 1 D √ π Y e − X / Y , (28)in terms of:  X ( M ) = (cid:90) d MD ( M ) = 1 σ ( M ) Y ( t ) = ζ ω (cid:90) t d τ T ( τ ) e − ω τ (cid:90) τ d τ (cid:48) T ( τ (cid:48) ) [ e ω τ (cid:48) − e − ω τ (cid:48) ] , (29)This is the equivalent for colored noise of Eq. (10),which is recovered in the limit ζ = ω → ∞ as Y ( t ) → (cid:82) d t T ( t ) = 1 / δ c . The corresponding expression forthe mass function is written as: N ( M, t ) = 1 (cid:112) π Y ( t ) ¯ ρ M M σ (cid:12)(cid:12)(cid:12)(cid:12) d σ d M (cid:12)(cid:12)(cid:12)(cid:12) e − / σ Y ( t ) ; (30)this is similar to the Press & Schechter shape, but for amodiﬁed redshift evolution encoded in Y ( t ). The quan-tity (2 Y ) − / , that in the white-noise limit is just δ c , canbe regarded as a modiﬁed collapse threshold; in Fig. 3we show how its evolution and absolute value diﬀer from δ c ( t ), depending on the volatility ζ and mean-reversalrate ω characterizing the noise. We conclude that onlyvalues of ζ ∼ ω (cid:38) several Gyr − are required not to movefar away from the Press & Schechter mass function, andhence from simulations. SUMMARY AND OUTLOOKIn this paper we have submitted a new theory of thehierarchical clustering based on stochastic diﬀerentialequations in real space, that constitutes a change of per-spective with respect to the excursion set formalism; thiswork is speciﬁcally focused on the halo mass function.First, we have invented a stochastic diﬀerential equa-tion that describes ﬂuctuations in the mass growth ofDM halos, as driven by a multiplicative white (Gaussian)noise dependent on the spherical collapse threshold andon the power spectrum of DM perturbations. By numer-ically integrating such a stochastic diﬀerential equation,in Sect. 2 we have demonstrated that the noise yieldsan average drift of the halo population toward largermasses, that quantitatively renders the standard hier-archical clustering (see Fig. 1). Then, in Sect. 2.1 wehave solved the Fokker-Planck equation associated to thestochastic dynamics, and obtained as a solution the Press& Schechter mass function; in Sect. 2.2 we have pointed out that the solution is stationary when the original equa-tion is written in convenient variables.Then in Sect. 3 we have introduced a minimal mod-iﬁcation of the stochastic equation in terms of a mass-dependent collapse threshold. Using a parametric shapeanalogous to that adopted in the excursion set frame-work, we have obtained a closed-form analytical solu-tion of the associated Fokker-Planck equation. Remark-ably, such a solution has a limiting shape for largemasses/early times similar to the empirical ﬁtting for-mula introduced since Sheth & Tormen (1999); in fact,for speciﬁc values of the parameters describing the massdependence of the collapse threshold, our result repro-duces extremely well the N -body mass function over awide range of masses and redshifts (see Fig. 2).As an aside issue, in Sect. 4 we have generalized ourstochastic approach to a colored, instead of a white,noise; in particular, we have investigated the modiﬁca-tion to the Press & Schechter mass function when thestochastic dynamics is ruled by a multiplicative Ornstein-Uhlenbeck noise with ﬁnite volatility and mean-reversalrate. We have exactly solved the related Fokker-Planckequation, ﬁnding that the mass function has shape anal-ogous to the Press & Schechter one when expressed interms of a modiﬁed, eﬀective collapse threshold; the lat-ter may substantially diﬀer from the standard δ c ( t ) inabsolute value and time evolution, depending on the cor-relation parameters of the noise (see Fig. 3). We con-clude that values of such parameters larger than severalGyr − are required not to move far away from the Press& Schechter mass function, and hence from simulations.The next-order development of this work will concernthe computation of the conditional mass function, i.e.,the mass function of a halo’s progenitors. This investi-gation will naturally extend to merger rates, formationtime distributions, and large-scale halo bias. A more de-tailed comparison of our results with the outcomes of N -body simulations, that includes the speciﬁcity of boththe numerical experiments as well as of the theory, willbe welcome. Other future applications could involve are-examination of the two-phase mass growth of DM ha-los, the halo speciﬁc angular momentum distribution, thevoid mass function, and halo statistics in non-standardcosmological frameworks. We very much hope that thenew perspective oﬀered by the theory presented here willcontribute to a better understanding of the gravitationaldynamics leading to the formation and evolution of DMhalos and hosted baryonic structures across cosmic times.We thank our referee for a constructive report, and forthe insightful comments and helpful suggestions. We ac-knowledge Carlo Baccigalupi, Alessandro Bressan, andGiovanni Bussi for enlightening discussions and criticalreading. This work has been partially supported byPRIN MIUR 2017 prot. 20173ML3WW 002, ‘Openingthe ALMA window on the cosmic evolution of gas, starsand supermassive black holes’. A.L. has taken advantageof the MIUR grant ‘Finanziamento annuale individualeattivit´a base di ricerca’ and of the EU H2020-MSCA-ITN-2019 Project 860744 ‘BiD4BEST: Big Data appli-cations for Black hole Evolution STudies’. STOCHASTIC THEORY OF HIERARCHICAL CLUSTERING I. 9APPENDIX A. A PRIMER ON THE STOCHASTIC DIFFERENTIAL AND FOKKER-PLANCK EQUATIONS

Given that concepts and techniques related to the stochastic diﬀerential and Fokker-Planck equations are not verycommon among the astrophysics community, for the reader’s convenience we present here a short primer, in a modernnotation and systematic way. In particular, we focus on the derivation of the Fokker-Planck equation associated toa given stochastic system, in presence of a state-dependent, multiplicative noise; this is extensively used in the maintext. More details and applications can be found, e.g., in the book by Risken (1996).The derivation involves two steps: (i) an expression for the time derivative of the probability density in terms of aTaylor-series of the conditional moments, known as Kramers-Moyal expansion; (ii) the explicit computation of suchmoments for a random variable satisfying a stochastic diﬀerential equation with multiplicative noise. Suppose we aregiven a system characterized by a physical variable ξ , whose evolution ξ ( t ) as a function of time t is stochastic. If theevolution is Markovian, by deﬁnition the probability density function P ( x, t ) of ﬁnding the system in state ξ ( t ) = x attime t satisﬁes: P ( x, t + τ ) = (cid:90) d x (cid:48) P ( x, t + τ | x (cid:48) , t ) P ( x (cid:48) , t ) , (A1)in terms of the conditional (transition) probability P ( x, t + τ | x (cid:48) , t ) between the times t and t + τ ; in other words, fora Markovian system the transition probability depends only on the value at the next earlier time. We rewrite theintegrand as P ( x, t + τ | x (cid:48) , t ) P ( x (cid:48) , t ) = P ( x + ∆ − ∆ , t + τ | x − ∆ , t ) P ( x − ∆ , t ) in terms of ∆ ≡ x − x (cid:48) and then performa Taylor expansion in ∆ to obtain P ( x, t + τ | x (cid:48) , t ) P ( x (cid:48) , t ) (cid:39) ∞ (cid:88) n =0 ( − n n ! ∆ n ∂ nx [ P ( x + ∆ , t + τ | x, t ) P ( x, t )] . (A2)Now we insert this expression in Eq. (A1) and perform the integration after changing variable from x (cid:48) to ∆; notingthat in the n = 0 term (cid:82) d∆ P ( x + ∆ , t + τ | x, t ) = 1 holds since the conditional probability is normalized, we get P ( x, t + τ ) − P ( x, t ) (cid:39) ∞ (cid:88) n =1 n ! ( − ∂ x ) n M n ( x, t ; τ ) , (A3)where we have deﬁned the conditional moments M n ( x, t ; τ ) ≡ (cid:104)| ξ ( t + τ ) − ξ ( t ) | n (cid:105)| ξ ( t )= x = (cid:90) d∆ ∆ n P ( x + ∆ , t + τ | x, t ) . (A4)Now we Taylor-expand the moments with respect to τ as follows M n ( x, t ; τ ) /n ! (cid:39) D n ( x, t ) τ + ϑ ( τ ) ; (A5)note that terms of order τ cannot be present since P ( x + ∆ , t | x, t ) = δ D (∆) by deﬁnition and in Eq. (A4) all theconditional moments for n ≥ D n are deﬁned as D n ( x, t ) ≡ lim τ → n ! M n ( x, t ; τ ) τ = lim τ → n ! (cid:104)| ξ ( t + τ ) − ξ ( t ) | n (cid:105)| ξ ( t )= x τ . (A6)All in all, we obtain the so called Kramers-Moyal expansion in terms of the partial diﬀerential equation ∂ t P ( x, t ) (cid:39) ∞ (cid:88) n =1 ( − ∂ x ) n D n ( x, t ) ; (A7)this ends the ﬁrst step in the derivation.We now compute explicitly the coeﬃcients D n when the variable ξ ( t ) satisﬁes a stochastic diﬀerential equation˙ ξ = h ( ξ, t ) + g ( ξ, t ) η ( t ) , (A8)with inital condition ξ ( t ) = x . Here η ( t ) is a white (Gaussian) noise with ensemble-average properties (cid:104) η ( t ) (cid:105) = 0 and (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = 2 δ D ( t − t (cid:48) ); the coeﬃcient 2 in this last expression is only a convenient arbitrary choice, since it can bereabsorbed into the multiplicative function g without loss of generality. We start by transforming the diﬀerential intoan integral stochastic equation ξ ( t + τ ) − x = (cid:90) t + τt d t (cid:48) [ h ( ξ ( t (cid:48) ) , t (cid:48) ) + g ( ξ ( t (cid:48) ) , t (cid:48) ) η ( t (cid:48) )] . (A9)We expand near x the functions h ( ξ ( t (cid:48) ) , t (cid:48) ) (cid:39) h ( x, t (cid:48) ) + ∂ x h ( x, t (cid:48) ) ( ξ ( t (cid:48) ) − x ) + ... and g ( ξ ( t (cid:48) ) , t (cid:48) ) (cid:39) g ( x, t (cid:48) ) +0 LAPI & DANESE ∂ x g ( x, t (cid:48) ) ( ξ ( t (cid:48) ) − x ) + ... to obtain ξ ( t + τ ) − x (cid:39) (cid:90) t + τt d t (cid:48) h ( x, t (cid:48) ) + (cid:90) t + τt d t (cid:48) ∂ x h ( x, t (cid:48) ) ( ξ ( t (cid:48) ) − x ) + . . . + (cid:90) t + τt d t (cid:48) g ( x, t (cid:48) ) η ( t (cid:48) ) + (cid:90) t + τt d t (cid:48) ∂ x g ( x, t (cid:48) ) η ( t (cid:48) ) ( ξ ( t (cid:48) ) − x ) + . . . (A10)Then we iterate for ξ ( t (cid:48) ) − x in the integrand to get ξ ( t + τ ) − x = (cid:90) t + τt d t (cid:48) h ( x, t (cid:48) ) + (cid:90) t + τt d t (cid:48) ∂ x h ( x, t (cid:48) ) (cid:90) t (cid:48) t d t (cid:48)(cid:48) h ( x, t (cid:48)(cid:48) )++ (cid:90) t + τt d t (cid:48) ∂ x h ( x, t (cid:48) ) (cid:90) t (cid:48) t d t (cid:48)(cid:48) g ( x, t (cid:48)(cid:48) ) η ( t (cid:48)(cid:48) ) + . . . + (cid:90) t + τt d t (cid:48) g ( x, t (cid:48) ) η ( t (cid:48) ) + (cid:90) t + τt d t (cid:48) g ( x, t (cid:48) ) (cid:90) t (cid:48) t d t (cid:48)(cid:48) h ( x, t (cid:48)(cid:48) ) η ( t (cid:48)(cid:48) )++ (cid:90) t + τt d t (cid:48) ∂ x g ( x, t (cid:48) ) (cid:90) t (cid:48) t d t (cid:48)(cid:48) g ( x, t (cid:48)(cid:48) ) η ( t (cid:48) ) η ( t (cid:48)(cid:48) ) + . . . (A11)Now taking the ensemble average and using the properties of the white noise yields (cid:104) ξ ( t + τ ) − x (cid:105) = (cid:90) t + τt d t (cid:48) h ( x, t (cid:48) ) + (cid:90) t + τt d t (cid:48) (cid:90) t (cid:48) t d t (cid:48)(cid:48) h ( x, t (cid:48)(cid:48) ) ∂ x h ( x, t (cid:48) ) + . . . + (cid:90) t + τt d t (cid:48) g ( x, t (cid:48) ) ∂ x g ( x, t (cid:48) ) + . . . (A12)where in the last term we have used that (cid:82) t (cid:48) t d t (cid:48)(cid:48) δ D ( t (cid:48)(cid:48) − t (cid:48) ) g ( x, t (cid:48)(cid:48) ) = g ( x, t (cid:48) ) since the Dirac- δ operates on anextremal of the integration. Dividing by τ and taking the limit for τ → D = h ( x, t ) + g ( x, t ) ∂ x g ( x, t ). For higher-order coeﬃcients notice that terms containing the noise are proportionalto τ n where n is the number of integrals involved, and vanish for small τ ; actually only one of such terms, containingtwo integrals and two noises contributes and yields D = lim τ → (1 / τ ) (cid:82) t + τt d t (cid:48) (cid:82) t (cid:48) t d t (cid:48)(cid:48) g ( x, t (cid:48) ) g ( x, t (cid:48)(cid:48) ) 2 δ D ( t (cid:48) − t (cid:48)(cid:48) ) = g ( x, t ), while D n = 0 for any n ≥

3. This ends the second step of the derivation.Putting together the coeﬃcients just derived in the Kramers-Moyal expansion of Eq. (A7), one ﬁnds the Fokker-Planck equation corresponding to the original stochastic equation:  ∂ t P ( x, t ) = − ∂ x [ D ( x, t ) P ( x, t )] + ∂ x [ D ( x, t ) P ( x, t )] D ( x, t ) = h ( x, t ) + g ( x, t ) ∂ x g ( x, t ) D ( x, t ) = g ( x, t ) (A13)The quantity g ∂ x g appearing in the coeﬃcient D is a noise-induced drift; this stems from the fact that as η ( t )ﬂuctuates, also the random variable ξ ( t ) and so the function g ( ξ ( t ) , t ) varies and therefore (cid:104) g ( ξ ( t ) , t ) η ( t ) (cid:105) is not nulleven if (cid:104) η ( t ) (cid:105) = 0 is. Finally, simple algebra shows that the Fokker-Planck equation may be written as a source-freecontinuity equation: ∂ t P ( x, t ) + ∂ x J ( x, t ) = 0 (A14)in terms of a probability current J ( x, t ) ≡ h ( x, t ) P ( x, t ) − g ( x, t ) ∂ x [ g ( x, t ) P ( x, t )] . (A15) B. SOLUTION OF THE FOKKER-PLANCK EQUATION FOR COLORED NOISE

In this Appendix we show how to solve the Fokker-Planck equation derived in Sect. 4 ∂ t P ( M, Γ , t ) = − T ( t ) Γ ∂ M [ D P ( M, Γ , t )] + ω ∂ Γ [Γ P ( M, Γ , t )] + ζ ∂ P ( M, Γ , t ) . (B1) STOCHASTIC THEORY OF HIERARCHICAL CLUSTERING I. 11for the marginalized P ( M, t ) = (cid:82) dΓ P ( M, Γ , t ) with boundary conditions lim M →∞ P = 0, P ( M,

0) = δ D ( M ) and( (cid:82) dΓ Γ P ) | M =0 = 0.As a ﬁrst step, we introduce a new variable X ≡ (cid:82) d M/D ( M ) in place of M , and redeﬁne the probability densityas W ( X, Γ , t ) = D ( M ) P ( M, Γ , t ); then the above equation turns into ∂ t W = − T ( t ) Γ ∂ X W + ω ∂ Γ [Γ W ] + ζ ∂ W . (B2)We now perform a two-dimensional Fourier transform W ( X, Γ , t ) ∝ (cid:90) d k X d k Γ ˜ W ( k X , k Γ , t ) e i ( k X X + k Γ Γ) , (B3)and obtain the following equation for the Fourier modes ∂ t ˜ W = T ( t ) k X ∂ k Γ ˜ W − ω k Γ ∂ k Γ ˜ W − ζ k ˜ W . (B4)Given the boundary conditions, it is convenient to look for solutions with shape˜ W ( k X , k Γ , t ) ∝ e − k X Σ XX / − k Σ ΓΓ / − k X k Γ Σ X Γ (B5)where Σ XX ( t ), Σ X Σ ( t ), Σ ΣΣ ( t ) are only functions of time. Inserting this ansantz into the previous equation yields thefollowing ordinary diﬀerential equations  ˙Σ ΓΓ = − ω Σ ΓΓ + 2 ζ ˙Σ X Γ = T Σ ΓΓ − ω Σ X Γ ˙Σ XX = 2 T Σ X Γ . (B6)These can be straightforwardly solved as  Σ ΓΓ ( t ) = ζ ω (1 − e − ω t )Σ X Γ ( t ) = (cid:90) t d τ T ( τ ) e − ω ( t − τ ) Σ ΓΓ ( τ )Σ XX ( t ) = 2 (cid:90) t d τ T ( τ ) Σ X Γ ( τ ) . (B7)Inverting the Fourier transform in Eq. (B3) one ﬁnds out the solution W ( X, Γ , t ) ∝ (cid:112) || Σ || exp (cid:26) − X XX − [Σ XX Γ − Σ X Γ X ] XX || Σ || (cid:27) , (B8)where || Σ || = Σ XX Σ ΓΓ − Σ X Γ is the determinant of the 2 × X and function W readlim X →∞ W = 0, W ( X,

0) = δ D ( X ), and ( (cid:82) dΓ Γ W ) | X =0 = 0.Finally, marginalizing over Γ and coming back to the original variables, one obtains P ( M, t ) = 1 D √ π Y e − X / Y , (B9)in terms of the quantities:  X ( M ) = (cid:90) d MD ( M ) Y ( t ) ≡ Σ XX ζ ω (cid:90) t d τ T ( τ ) e − ω τ (cid:90) τ d τ (cid:48) T ( τ (cid:48) ) [ e ω τ (cid:48) − e − ω τ (cid:48) ] (B10)note that the correct white-noise limit is recovered for ζ = ω → ∞ since Y ( t ) → (cid:82) d t T ( t ) = 1 / δ c ( t ), and thatactually (2 Y ) − / constitute an eﬀective collapse threshold, dependent on the parameters of the colored noise. REFERENCES

Bardeen, J. M., Bond, J. R., Kaiser, N., & Szalay, A. S. 1986, ApJ,304, 15 Benson, A. J., Kamionkowski, M., & Hassani, S. H. 2005, MNRAS,357, 8472 LAPI & DANESE