Moderate deviations for the size of the giant component in a random hypergraph
JINGJIA LIU AND MATTHIAS LÖWE
Abstract.
We prove a moderate deviations principle for the size of the largest connected component in a random $d$-uniform hypergraph. The key tools are a version of the exploration process that is also used to investigate the giant component of an Erdős–Rényi graph, a moderate deviations principle for the martingale associated with this exploration process, and exponential estimates.

1. Introduction
The research on random graphs was initiated by Erdős and Rényi, see [14], [15]. Though it was originally motivated by questions from graph theory, random graphs quickly developed into an independent field with applications in many areas such as physics, neural networks, telecommunications, or the social sciences. Despite the fact that some of these applications ask for random graphs with a given degree distribution (see e.g. [12], [21] for very readable surveys, or [18] for a recent application), the by far most popular model of a random graph is still the Erdős–Rényi graph. In this graph, one realizes all possible connections between $N$ vertices $V = \{1, \ldots, N\}$ independently with equal probability $p$. This model is referred to as $G(N,p)$.

The corresponding random hypergraph model is the model $G_d(N,p)$. Here $d \geq 2$ is a fixed integer ($d = 2$ is nothing but the ordinary Erdős–Rényi graph). Thus a realization of $G_d(N,p)$ will be a hypergraph $G = (V, E)$, where all the edges in $E$ are subsets of $V$ with cardinality $d$. Moreover, in $G_d(N,p)$ all hyperedges of cardinality $d$ are selected independently with probability $p$. One of the most striking first results about $G(N,p)$ is that there is a sharp phase transition in the size of the largest connected component: if $p = \lambda/N$ and $\lambda < 1$, then the largest component will be of size $O(\log N)$, while for $\lambda > 1$ the largest component is of order $O(N)$, both with probability converging to 1. In the latter case, the size of the largest component with high probability is of order $\rho_\lambda N + o(N)$, where $\rho_\lambda$ satisfies
$$1 - \rho_\lambda = e^{-\lambda \rho_\lambda} \qquad (1.1)$$
(here we say that the random (hyper)graph $G_d(N,p)$ enjoys a certain property $A$ with high probability (w.h.p.) if the probability that $A$ holds in $G_d(N,p)$ converges to 1 as $N$ tends to infinity). A very detailed study of this and many other phenomena concerning this phase transition can be found in [16] or [21].

The corresponding results for the $d$-uniform random hypergraph model $G_d(N,p)$ were shown in [20], [17], and [8]: if for some $\varepsilon > 0$ one has $(d-1)\binom{N-1}{d-1}p < 1 - \varepsilon$, the resulting hypergraph consists of components of order $O(\log N)$, while for $(d-1)\binom{N-1}{d-1}p > 1 + \varepsilon$ there is a unique giant component of size $O(N)$. To make this more precise, we need a number of definitions. We set
$$p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}. \qquad (1.2)$$
For each fixed $\lambda > 1$, we define the dual branching process parameter $\lambda^* < 1$ by $\lambda^* e^{-\lambda^*} = \lambda e^{-\lambda}$. In case $d = 2$, we specify $\rho_\lambda$ given by (1.1) as $\rho_\lambda =: \rho_{2,\lambda}$, whereas for $d \geq 3$ we define $\rho_{d,\lambda}$ by the equation
$$1 - \rho_{d,\lambda} = (1 - \rho_{2,\lambda})^{1/(d-1)}. \qquad (1.3)$$
It can be checked that $\rho_{d,\lambda}$ satisfies
$$\lambda^* = \lambda (1 - \rho_{d,\lambda})^{d-1}. \qquad (1.4)$$
For fixed $d$ we abbreviate $\rho_\lambda = \rho_{d,\lambda}$. The role of $\rho_\lambda$ is that it determines the asymptotic size of the giant component. Indeed, if $\lambda > 1$, it has been shown in [14] and [8] that the unique giant component is of size $\rho_\lambda N + o(N)$ with high probability. This statement can be regarded as a law of large numbers for the size of the giant component, which we will henceforth call $C_{\max}$. Moreover it was shown that $\rho_\lambda$ can also be written as the unique solution of the transcendental equation (cf. [3]):
$$1 - \rho_\lambda = \exp\left(-\frac{\lambda}{d-1}\left(1 - (1 - \rho_\lambda)^{d-1}\right)\right). \qquad (1.5)$$
Combining (1.3) with (1.4), one sees that (1.1) is indeed the case $d = 2$ of (1.5). Note that $\rho_\lambda = \rho_{d,\lambda}$ depends on $d$, which is suppressed in the notation.

Let us assume for the rest of the paper that we are in the supercritical regime, i.e.
$$(d-1)\binom{N-1}{d-1}p > 1 + \varepsilon \quad \text{for some } \varepsilon > 0,$$
where the precise conditions on $\varepsilon$ will be given later explicitly. Note that this is equivalent to assuming that $\lambda > 1 + \varepsilon$.

Date: July 19, 2019.
2010 Mathematics Subject Classification. Primary: 60F10; Secondary: 05C65.
Key words and phrases. Large deviations, moderate deviations, random hypergraph, giant component, Erdős–Rényi graph.
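Equations (1.4) and (1.5) are easy to check numerically. The following sketch is our own illustration (the function name `rho` and the bisection approach are not from the paper); it solves the transcendental equation (1.5) and verifies both the dual relation $\lambda^* e^{-\lambda^*} = \lambda e^{-\lambda}$ and $\lambda^* < 1$ in the supercritical regime.

```python
from math import exp

def rho(d: int, lam: float, tol: float = 1e-12) -> float:
    """Solve 1 - r = exp(-lam/(d-1) * (1 - (1-r)**(d-1))) for r in (0, 1)."""
    # g(r) = 1 - r - exp(...) is positive just above 0 (since lam > 1)
    # and negative at 1, so bisection brackets the nontrivial root.
    g = lambda r: 1.0 - r - exp(-lam / (d - 1) * (1.0 - (1.0 - r) ** (d - 1)))
    lo, hi = tol, 1.0 - tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: d = 3, lam = 2 gives a nontrivial rho in (0, 1),
# and lam* = lam * (1 - rho)**(d-1) < 1, cf. (1.4).
r = rho(3, 2.0)
lam_star = 2.0 * (1.0 - r) ** 2
```

For $d = 2$ the same routine recovers the Erdős–Rényi fixed point of (1.1), since the exponent then reduces to $-\lambda r$.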
For both random graphs and random hypergraphs, fluctuations around the aforementioned law of large numbers for the size of $C_{\max}$ were investigated. A Central Limit Theorem (CLT, for short) for the size of $C_{\max}$ in $G(N,p)$ was proved e.g. in [2]; for a nice proof we also refer to [21], Section 4.5. Large deviations in the same situation go back to O'Connell in his nice paper [19], while moderate deviations were investigated in [1]. The corresponding CLT for $C_{\max}$ in $G_d(N,p)$ was established in [3] using Stein's method. In [4] a local CLT is proved, even for the joint distribution of the number of vertices and edges in $C_{\max}$. Another way to prove a CLT, which uses the so-called exploration process and is reminiscent of the proof for random graphs given in [21], was introduced by Grimmett and Riordan [5].

The aim of the present paper is to establish moderate deviations results for the number of vertices in $C_{\max}$ for the case of the random hypergraph model $G_d(N,p)$. To this end, we will modify the exploration process for hypergraphs introduced in [5] in such a way that it resembles the exploration process used in [1].

In order to formulate our main theorems we need to recall that a sequence of real-valued random variables $(Y_n)$ obeys a large deviation principle (LDP) with speed $a_n$ and good rate function $I(\cdot) : \mathbb{R} \to \mathbb{R}_0^+ \cup \{+\infty\}$ if
• For every $L \in \mathbb{R}_0^+$, the level sets of $I$, denoted by $N_L := \{x \in \mathbb{R} : I(x) \leq L\}$, are compact.
• For every open set $G \subseteq \mathbb{R}$ it holds that
$$\liminf_{n \to \infty} a_n \log P(Y_n \in G) \geq -\inf_{x \in G} I(x). \qquad (1.6)$$
• For every closed set $A \subseteq \mathbb{R}$ it holds that
$$\limsup_{n \to \infty} a_n \log P(Y_n \in A) \leq -\inf_{x \in A} I(x). \qquad (1.7)$$
As announced, in this paper we will prove a moderate deviation principle (MDP) for $|C_{\max}|$ (which is a function of $N$). Formally, there is no distinction between an MDP and an LDP. Usually, an LDP lives on the scale of a law of large numbers type ergodic phenomenon, while MDPs describe the probabilities on a scale between a law of large numbers and some sort of CLT. For both large deviation principles and MDPs, the three points mentioned above serve as a definition. Having this in mind, our central result reads as:

Theorem 1.1 (MDP for the size of the giant component in $G_d(N,p)$). Let $1/2 < \alpha < 1$. For each $d \geq 2$, set $p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}$ with $\lambda = 1 + \varepsilon$. Assume that $\varepsilon = O(1)$, as well as that there exists $\iota > 0$ such that
$$\varepsilon^3 N^\tau \to \infty \quad \text{where } \tau = \min\{1/2,\ 2 - 2\alpha - \iota\}. \qquad (1.8)$$
Then the sequence of random variables $(|C_{\max}| - \rho_\lambda N)/N^\alpha$ satisfies an MDP in $G_d(N,p)$ with speed $N^{1-2\alpha}$ and rate function
$$J(x) = \frac{x^2 (1 - \lambda^*)^2}{2c}. \qquad (1.9)$$
Here $c = \lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda)$ and $\lambda^* = \lambda(1 - \rho_{d,\lambda})^{d-1}$ as in (1.4).

Remarks 1.2. (1) For $d = 2$ this result is contained in [1].
(2) The asymptotic notation should be understood as $N \to \infty$. We use $X = O(Y)$ if there is an $M > 0$ such that $\limsup_{N \to \infty} |X/Y| \leq M$, and $X = o(Z)$ if there exists $c(N)$ such that $X \leq c(N) Z$, where $c(N) \to 0$ as $N \to \infty$. Furthermore, if such quantities $M$, $c(N)$ depend on some parameters, we will indicate this by subscripts, e.g. $X = O_\rho(Y)$ meaning $M = M(\rho)$.
(3) From the proof of Lemma 6.1 one might get the impression that requiring the (slightly more natural) condition $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1/2, 2 - 2\alpha\}$ would be enough. However, in the proof of Lemma 5.4 we need the slightly stronger condition (1.8).

The rest of this paper is organized as follows: In Section 2 we give a short introduction to the exploration process, which will be used in Sections 4, 5 and 6 to prove Theorem 1.1. Briefly, this exploration process starts with a number $k = k_N$ of vertices and investigates the union of their connected components. If $k_N$ is chosen appropriately, this union coincides with the giant component of the hypergraph up to negligible terms. On the other hand, the size of this union can be controlled by a martingale underlying the exploration process. In Section 3 we prove an MDP for this martingale. In Sections 4, 5, and 6 we will see that indeed this MDP helps to show our main Theorem 1.1.

2. An exploration process on hypergraphs
The aim of this section is to introduce an exploration process to investigate the components of a hypergraph. This exploration process is inspired by the corresponding process for graphs as defined e.g. in [21]. A similar, yet slightly different process for hypergraphs was introduced in [5]. We will also use results from this paper.

We start by taking the given enumeration of the vertices from $1, \ldots, N$. Vertices during this exploration process will get one of three labels: active, unseen, or explored. At time $t$ the sets of active, unseen, and explored vertices will be denoted by $\mathcal{A}_t$, $\mathcal{U}_t$, and $\mathcal{E}_t$, respectively. We start by declaring the first $k = k_N$ vertices active and the rest unseen. Now, in each step of the process, the first active vertex (with respect to the given enumeration) is selected and declared explored. At the same time, all of its unseen neighbors are set active. The process terminates when there are no active vertices. If we denote by $C_{\leq k}$ the union of the connected components of the first $k$ vertices, then at the end of the process all vertices in $C_{\leq k}$ are explored and all the others are unseen. We remark the following

Remarks 2.1. (1) For two sequences $X_N$ and $Y_N$, we write $X_N \sim Y_N$ if the limit $\lim_{N \to \infty} X_N / Y_N$ exists and equals 1.
(2) Obviously, since we add an explored vertex in every step, $|C_{\leq k_N}| = \min\{t \in \mathbb{N} : A_t = 0\}$, where $A_t := |\mathcal{A}_t|$.
(3) By construction, $A_0 = k_N$ (to be specified later) and, for all $t$ with $A_{t-1} > 0$, one has
$$A_t = A_{t-1} + \eta_t - 1.$$
Here $\eta_t$ is the number of unseen vertices that are set active in the $t$-th step, and $A_t = 0$ if $A_{t-1} = 0$.
(4) Consider the distribution of $A_t$ when we are investigating $G_d(N,p)$ with $p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}$ and $\lambda > 1$. After the $t$-th step, with $s$ active and $N - t - s$ unseen vertices, for each unseen vertex $u$ there are exactly
$$\nu_{t+1} = \binom{N-t-2}{d-2}$$
potential edges that contain $u$ and the vertex we are about to explore, but none of the vertices we have already explored. Hence the probability that $u$ becomes active during step $t+1$ is given by
$$\pi = \pi_t = 1 - (1-p)^{\nu_{t+1}}.$$
Note that for times $t \ll N$ and with our scaling of $p$ one has $\pi \sim 1 - e^{-\lambda/N}$, which is the same scaling as in the case $d = 2$. On the other hand,
$$\pi_t = p\nu_{t+1} + O\big(p^2\nu_{t+1}^2\big) = \lambda \frac{(N-t)^{d-2}}{N^{d-1}} + O\Big(\frac{1}{N^2}\Big).$$
This implies that $\eta_t$ given $A_{t-1} = s$ is distributed like $\sum_{i=1}^{N-(t-1)-s} Y_{ti}$, where each of the $Y_{ti}$ is an indicator with success probability $\pi_t$. Note that the $Y_{ti}$ are not independent, which establishes the major difference between the case $d = 2$ (when the $Y_{ti}$ are obviously independent) and the cases $d \geq 3$.
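The dynamics just described are easy to simulate. The sketch below is our own illustration (the function `explore` is not from the paper, and enumerating all $\binom{N}{d}$ candidate hyperedges is only feasible for small $N$): it samples $G_d(N,p)$, runs the exploration started from the first $k$ vertices, and returns the trajectory $(A_t)$ together with $|C_{\leq k}|$.

```python
import random
from itertools import combinations
from math import comb

def explore(N, d, lam, k, seed=0):
    """Exploration process on a sample of G_d(N, p) with
    p = lam / ((d-1) * C(N-1, d-1)); vertices 0..k-1 start active."""
    rng = random.Random(seed)
    p = lam / ((d - 1) * comb(N - 1, d - 1))
    edges = [e for e in combinations(range(N), d) if rng.random() < p]
    incident = {v: [] for v in range(N)}
    for e in edges:
        for v in e:
            incident[v].append(e)
    UNSEEN, ACTIVE, EXPLORED = 0, 1, 2
    label = [ACTIVE] * k + [UNSEEN] * (N - k)
    active = list(range(k))
    A_traj = [k]                       # A_0 = k
    while active:
        v = min(active)                # first active vertex in the enumeration
        active.remove(v)
        label[v] = EXPLORED
        for e in incident[v]:          # activate all unseen members of incident edges
            for w in e:
                if label[w] == UNSEEN:
                    label[w] = ACTIVE
                    active.append(w)
        A_traj.append(len(active))
    size = sum(1 for v in range(N) if label[v] == EXPLORED)
    return A_traj, size
```

Since exactly one vertex is explored per step, the returned trajectory illustrates Remark 2.1(2): $|C_{\leq k}|$ equals the first hitting time of 0 by $(A_t)$.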
To simplify matters, we will change our process slightly and call instead
$$A_t = A_{t-1} + \eta_t - 1, \qquad t \in \mathbb{N},$$
the exploration process. Of course, this process agrees with the one previously considered up to the first time the process hits 0. We will follow the ideas in [5] and rewrite $A_t$ (up to small errors) as the sum of a deterministic process and a martingale. This also motivates our study of moderate deviations of martingales with $N$-dependent martingale increments in the next section. To this end, let
$$D_t := \mathbb{E}[\eta_t - 1 \mid \mathcal{F}_{t-1}],$$
where $\mathcal{F}_t$ is the $\sigma$-algebra generated by the random variables $A_0, \eta_1, \ldots, \eta_t$. From the above we learn that, with $U_t := |\mathcal{U}_t|$,
$$\mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t] = U_t \pi = U_t p\nu_{t+1} + O\Big(\frac{1}{N}\Big).$$
For later use we also recall that in [5] it was shown that with
$$\pi_2 := 1 - (1-p)^{\binom{N-t-3}{d-3}} \sim p\binom{N-t-3}{d-3} \sim \lambda(d-2)\frac{(N-t)^{d-3}}{N^{d-1}}$$
and, as before, $\pi \sim p\nu_{t+1}$, we have that
$$\mathbb{V}[\eta_{t+1} \mid \mathcal{F}_t] = U_t(U_t-1)(\pi_2 + \pi^2) + U_t\pi - (U_t\pi)^2 \sim U_t^2\pi_2 + U_t\pi \sim \lambda(d-2)\Big(1-\frac{t}{N}\Big)^{d-3}\frac{U_t^2}{N^2} + \lambda\Big(1-\frac{t}{N}\Big)^{d-2}\frac{U_t}{N}. \qquad (2.1)$$
From these computations we obtain
$$D_{t+1} = U_t p\nu_{t+1} - 1 + O\Big(\frac{1}{N}\Big) = \alpha_{t+1}(N - t - A_t) - 1 + O\Big(\frac{1}{N}\Big),$$
where
$$\alpha_t = p\nu_t = p\binom{N-t-1}{d-2}.$$
Now set
$$\Delta_{t+1} := A_{t+1} - A_t - D_{t+1} = \eta_{t+1} - \mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t] \qquad (2.2)$$
such that
$$A_{t+1} = A_t + D_{t+1} + \Delta_{t+1} = (1-\alpha_{t+1})A_t + \alpha_{t+1}(N-t) - 1 + \Delta_{t+1} + O\Big(\frac{1}{N}\Big). \qquad (2.3)$$
By definition we have $\mathbb{E}[\Delta_{t+1} \mid \mathcal{F}_t] = 0$; thus $(\Delta_t)_t$ is a martingale difference sequence. In particular, a simple bound for the variance of $(\Delta_t)_t$ is given by

Lemma 2.2 ([7, Lemma 8]). Let $p$ and $\lambda > 1$ be given as in Theorem 1.1. Then there is a constant $M > 0$ such that for all $1 \leq t \leq N$ we have $\mathbb{V}(\Delta_t \mid \mathcal{F}_{t-1}) \leq M$ with probability 1.

Obviously, the process $A_t$ is a key quantity for the analysis of the size of $C_{\leq k_N}$ (and hence for the size of $C_{\max}$). We want to approximate it by the sum of a deterministic sequence and a martingale. To this end, we define $x_0 = 0$ and
$$x_{t+1} = (1-\alpha_{t+1})x_t + \alpha_{t+1}(N-t) - 1.$$
Then, with $A_0 = k_N$, we obtain
$$A_{t+1} - x_{t+1} = (1-\alpha_{t+1})(A_t - x_t) + \Delta_{t+1} + \varepsilon_{t+1},$$
where $\varepsilon_{t+1}$ is shorthand for the error term at level $t+1$, which is of order $O(1/N)$ (cf. (2.3); for more details we refer the reader to [5, (10)]). So, if we set
$$\beta_t := \prod_{i=1}^t (1-\alpha_i),$$
we arrive at
$$A_t - x_t = \sum_{i=1}^t \frac{\beta_t}{\beta_i}(\Delta_i + \varepsilon_i).$$
By defining
$$S_t := \sum_{i=1}^t \frac{\Delta_i}{\beta_i}, \qquad (2.4)$$
we observe that $(S_t)$ is a martingale. Thus our desired approximation is given by
$$\tilde{A}_t := x_t + \beta_t S_t, \qquad (2.5)$$
and we have by [5, Lemma 3] that
$$|A_t - \tilde{A}_t| = |A_t - x_t - \beta_t S_t| = \Big|\sum_{i=1}^t \frac{\beta_t}{\beta_i}(\Delta_i + \varepsilon_i) - \beta_t\sum_{i=1}^t \frac{\Delta_i}{\beta_i}\Big| = \Big|\sum_{i=1}^t \frac{\beta_t}{\beta_i}\varepsilon_i\Big| = O\Big(\frac{t}{N}\Big) \qquad (2.6)$$
uniformly in $1 \leq t \leq N$.
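The deterministic part $x_t$ can be computed directly from its recursion and compared with the limit shape $N g_{d,\lambda}(t/N)$ identified in (2.7)–(2.9) below. The sketch is our own (function names and the tolerance are ours; $\alpha_t = p\binom{N-t-1}{d-2}$ is taken as reconstructed above) and illustrates that the two trajectories stay within a bounded distance of each other:

```python
from math import comb, exp

def x_trajectory(N, d, lam):
    """Iterate x_{t+1} = (1 - a_{t+1}) x_t + a_{t+1} (N - t) - 1, x_0 = 0,
    with a_{t+1} = p * C(N - t - 2, d - 2) and p = lam / ((d-1) C(N-1, d-1))."""
    p = lam / ((d - 1) * comb(N - 1, d - 1))
    x, xs = 0.0, [0.0]
    for t in range(N):
        a = p * comb(max(N - t - 2, 0), d - 2)   # alpha_{t+1}
        x = (1 - a) * x + a * (N - t) - 1
        xs.append(x)
    return xs

def g(x, d, lam):
    """The function g_{d,lambda} of (2.7)."""
    return 1 - x - exp(-lam / (d - 1) * (1 - (1 - x) ** (d - 1)))
```

In rescaled time $s = t/N$, the recursion is an Euler step of $y' = \lambda(1-s)^{d-2}(1-s-y) - 1$, which $g_{d,\lambda}$ solves; this is the content of (2.9).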
Note that the behaviour of $x_t$ can be determined as well (see [5, (15)]). Indeed, define $g_{d,\lambda}$ as
$$g_{d,\lambda}(x) = 1 - x - \exp\left(-\frac{\lambda}{d-1}\left(1 - (1-x)^{d-1}\right)\right) \qquad (2.7)$$
and
$$f(t) := f_{N,d,\lambda}(t) := N g_{d,\lambda}\Big(\frac{t}{N}\Big). \qquad (2.8)$$
Then we obtain, uniformly in $0 \leq t \leq N$,
$$x_t = f(t) + O(1). \qquad (2.9)$$
In a nutshell, the idea is now the following: The first time $t$ at which $A_t$ is 0 is the size of the union of the connected components of the first $k_N$ vertices. If $k_N$ is chosen large enough, this union will contain the giant component with overwhelming probability (where we say that an event has "overwhelming probability" when the probability of the complement of the event is negligible on the chosen moderate deviation scale). On the other hand, also with overwhelming probability there is only one component with a size larger than $N^\xi$ for suitable $\xi < 1$, so that the union of the components of the first $k_N$ vertices determines the size of the largest component with overwhelming probability. Moreover, as observed above, we may safely replace $A_t$ by $\tilde{A}_t$ when considering these quantities on a moderate deviation scale. That is to say,
$$P(A_t/N^\alpha \geq x) \sim P(\tilde{A}_t/N^\alpha \geq x) \qquad (2.10)$$
for any $x$ and $\alpha > 0$, and likewise
$$P(A_t/N^\alpha \leq x) \sim P(\tilde{A}_t/N^\alpha \leq x). \qquad (2.11)$$
Moreover, notice that the stochastic behaviour of $\tilde{A}_t$ is governed by the martingale $(S_t)$ defined in (2.4). We will therefore analyze its moderate deviations in the next section.

3. Moderate deviations for the martingale $S_t$ defined in (2.4)

As we will see in Section 4, the moderate deviations for the size of the giant component can be played back to the moderate deviations for the martingale $(S_t)$ defined in (2.4); in this section we will prove a moderate deviations principle for this martingale. Our main tool is the Gärtner–Ellis theorem [11, Theorem 2.3.6]. Note that we cannot simply quote MDPs for martingales from [10] or [13], because in our context the distributions of the martingale differences depend on $N$.
The central result of this section is
Theorem 3.1.
Consider the process $(S_n)$ defined in (2.4). For $\zeta \in (-\infty, \infty)$ and $1/2 < \alpha < 1$, put $\gamma(N) = \rho_\lambda N + \zeta N^\alpha$ and assume for simplicity that $\gamma(N)$ is an integer to avoid rounding. Then, for any choice of $\zeta$, the sequence of random variables
$$\frac{\beta_{\gamma(N)} S_{\gamma(N)}}{N^\alpha}$$
satisfies an MDP with speed $N^{1-2\alpha}$ and rate function $I(x) = x^2/(2c)$, where $c$ is given by
$$c = c_{d,\lambda} = \lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda). \qquad (3.1)$$
Here we write $\rho_\lambda$ for $\rho_{d,\lambda}$, and $\lambda^*$ is the dual branching parameter given in (1.4).

The key idea will be to employ the Gärtner–Ellis theorem [11, Theorem 2.3.6]. To this end we need to study the moment generating function of $S_n$ on the level of moderate deviations, i.e. we need to establish the existence of
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1} \beta_{\gamma(N)} S_{\gamma(N)}\right)\right)$$
for $t \in \mathbb{R}$. We will expand the moment generating function into a Taylor series up to the third order. However, to compute this we need some preparation. Recall our definitions (2.2) and (2.4), from which we obtain $\Delta_{t+1} = \eta_{t+1} - \mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t]$ and $S_t = \sum_{i=1}^t \Delta_i/\beta_i$.

3.1. Moments of $\eta_i$. The essential point in the Taylor expansion is that the conditional variances of $\Delta_i$, and thus of $\eta_i$, depend on the number of unseen vertices at time $i$, $U_i$. We will therefore show a rough concentration result for $U_i$. To this end, recall that $\eta_i$ is the number of unseen vertices that are set active in the $i$-th step. Thus it holds that
$$\eta_i = \sum_{j \in \mathcal{U}_i} \mathbb{1}\{\exists \text{ an unexplored hyperedge containing } j \text{ and the currently active vertex}\},$$
and for any $k \geq 2$,
$$\eta_i^k = \sum_{j_{i_1}, \ldots, j_{i_k} \in \mathcal{U}_i,\ \{i_1, \ldots, i_k\} \subset \{1, \ldots, N\}} \mathbb{1}\{j_{i_1}, \ldots, j_{i_k} \text{ are activated in step } i\}.$$
Assume that from stage $i$ to $i+1$ the vertex $v_i$ is being explored. There are various ways to activate $j_1, \ldots, j_k$. Without loss of generality, we assume that the $j_i$ are pairwise different; otherwise this problem is reduced to estimating a lower moment of $\eta_i$. One needs to activate a set of hyperedges $e_1, \ldots, e_m$ such that $j_1, \ldots, j_k$ are contained in these hyperedges. Because we have to choose the remaining vertices of the hyperedge from the unexplored vertices, the probability that a fixed set of $l$ of the vertices is contained in one hyperedge is given by
$$\pi_l = 1 - (1-p)^{\binom{N-i-l-1}{d-l-1}} \sim p\binom{N-i-l-1}{d-l-1} \sim \lambda(d-2)\cdots(d-l)\frac{(N-i)^{d-l-1}}{N^{d-1}}.$$
That is, if $l_1, \ldots, l_m$ sum up to $k$, then $\prod_j \pi_{l_j} \leq D\lambda^m N^{-k}$ for some constant $D$. On the other hand, there is a constant $C$ such that there are at most $C^{k/d+1}$ ways to write $k$ as a sum of integers at most $d-1$. Altogether with (2.1), this gives for $k = 3$ and any $1 \leq i \leq N$
$$\mathbb{E}\left[|\eta_i|^3 \mid \mathcal{F}_{i-1}\right] \leq L\,\mathbb{V}(\eta_i \mid \mathcal{F}_{i-1}) \qquad (3.2)$$
for some constant $L > 0$, and consequently
$$\mathbb{E}\left[|\Delta_i|^3 \mid \mathcal{F}_{i-1}\right] \leq L\,\mathbb{V}(\Delta_i \mid \mathcal{F}_{i-1}). \qquad (3.3)$$

3.2. Exponential estimates.
Moreover, we will need a Hoeffding–Azuma-type inequality for $S_N$ (e.g. [7, Lemma 12]): for a constant $c > 0$, it holds that
$$P\Big(\max_{1 \leq t \leq N} S_t \geq y\Big) \leq \exp\big(-c\,y^2/N\big).$$
In particular, taking $y = N^\beta$ for $\beta > 1/2$,
$$P\Big(\max_{1 \leq t \leq N} S_t \geq N^\beta\Big) \leq \exp\big(-c\,N^{2\beta-1}\big). \qquad (3.4)$$
(This could, in fact, also be proved using the results in [9] together with our above considerations.)

3.3. Taylor expansion of $\Delta_i$ up to third order. As we know,
$$U_t = N - t - A_t, \qquad A_t = \tilde{A}_t + O\Big(\frac{t}{N}\Big), \qquad \tilde{A}_t = x_t + \beta_t S_t,$$
and the trajectory of $x_t$ is given by (2.8). Denote (cf. [5], equation (22))
$$u_i = N \exp\left(-\frac{\lambda}{d-1}\left(1 - \Big(1 - \frac{i}{N}\Big)^{d-1}\right)\right). \qquad (3.5)$$
Hence we have $U_i - u_i = -\beta_i S_i + O(1)$. Fix a $\beta$ with $1/2 < \alpha < \beta < 1$ and a sequence $b_N$ with $b_N \to \infty$ and $b_N = O(N^\beta)$, thus $b_N = o(N)$. Let us define the event
$$\Sigma_N = \{\forall\, 1 \leq i \leq N : |U_{\gamma(i)} - u_{\gamma(i)}| \leq b_N\},$$
which has probability at least $1 - \exp(-\eta N^{2\beta-1})$ for an appropriate $\eta > 0$: by (3.4), the complement $\{\exists\, i,\ 1 \leq i \leq N, \text{ such that } |U_{\gamma(i)} - u_{\gamma(i)}| > b_N\}$ has probability at most $\exp(-c' N^{2\beta-1})$ for some constant $c' > 0$. ($c'$ differs from $c$ in (3.4) by at most an absolute constant factor.)
We use the martingale property to get
$$\mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) = \mathbb{E}\left(\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right) \,\Big|\, \mathcal{F}_{\gamma(N)-1}\right]\right)$$
$$= \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)-1}\right)\,\mathbb{E}\left[\exp\left(t N^{\alpha-1}\Delta_{\gamma(N)}\right) \,\Big|\, \mathcal{F}_{\gamma(N)-1}\right]\right). \qquad (3.6)$$
By expanding up to third order, using that $\mathbb{E}[\Delta_{i+1} \mid \mathcal{F}_i] = 0$ for all $i$, applying (2.1) as well as the crude bounds on the third moment derived in (3.2), we obtain
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right) \Big| \mathcal{F}_{\gamma(i)}\right] = 1 + \frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\left(\lambda(d-2)\Big(1-\frac{\gamma(i)}{N}\Big)^{d-3}\frac{U_{\gamma(i)}^2}{N^2} + \lambda\Big(1-\frac{\gamma(i)}{N}\Big)^{d-2}\frac{U_{\gamma(i)}}{N}\right) + O\left(t^3 N^{3\alpha-3}\frac{U_{\gamma(i)}^2}{N^2}\right). \qquad (3.7)$$
Here we used the facts that $U_n \leq N$ and $\beta_t = \prod_{i=1}^t (1-\alpha_i)$. We arrive at
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right)\mathbb{1}_{\Sigma_N} \Big| \mathcal{F}_{\gamma(i)}\right] = \exp\left(\frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\left(\lambda(d-2)\Big(1-\frac{\gamma(i)}{N}\Big)^{d-3}\frac{u_{\gamma(i)}^2}{N^2} + \lambda\Big(1-\frac{\gamma(i)}{N}\Big)^{d-2}\frac{u_{\gamma(i)}}{N}\right) + O\Big(t^2 N^{2\alpha-2}\frac{b_N}{N} + t^3 N^{3\alpha-3}\Big)\right)(1 + o(1)).$$
Redoing this conditioning $\gamma(N)$ times, we finally see that
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right] = \exp\left(\frac{t^2 N^{2\alpha-2}}{2}\sum_{i=0}^{\gamma(N)-1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right) + o(N^{2\alpha-1})\right), \qquad (3.8)$$
which gives our desired Taylor expansion of the moment generating function conditioned on the event $\Sigma_N$.

3.4. Boundedness of the moment generating function. Let us briefly recall some notation:
$$\alpha_t = p\nu_t = p\binom{N-t-1}{d-2}, \quad \Delta_{t+1} = A_{t+1} - A_t - D_{t+1}, \quad A_t = A_{t-1} + \eta_t - 1,$$
$$D_{t+1} = \mathbb{E}[\eta_{t+1} - 1 \mid \mathcal{F}_t], \quad \beta_t = \prod_{i=1}^t (1-\alpha_i), \quad S_t = \sum_{i=1}^t \frac{\Delta_i}{\beta_i}.$$
Note that $\beta_t/\beta_i = \prod_{j=i+1}^t (1-\alpha_j)$ is between 0 and 1. Moreover, recall from Lemma 2.2 that $\mathbb{V}(\Delta_t \mid \mathcal{F}_{t-1}) \leq M$ for all $t$. Using the fact that $U_i \leq N$ for all $1 \leq i \leq \gamma(N)$, analogously to (3.7) we can derive the following expansion for all $0 \leq i \leq \gamma(N)$ and any fixed $t$:
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right) \Big| \mathcal{F}_{\gamma(i)}\right] = 1 + \frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\mathbb{V}\big(\Delta_{\gamma(i)+1} \mid \mathcal{F}_{\gamma(i)}\big) + O(t^3 N^{3\alpha-3})$$
$$\leq 1 + \frac{t^2}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2} M N^{2\alpha-2} + O(t^3 N^{3\alpha-3}) \leq \exp\left(t^2 C N^{2\alpha-2}\right),$$
where $C > 0$ is a suitable constant. Iterating over $0 \leq i \leq \gamma(N)$ and employing the martingale difference sequence property as in (3.6) yields
$$\mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \leq \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)-1}\right)\mathbb{E}\left[\exp\left(t N^{\alpha-1}\Delta_{\gamma(N)}\right) \Big| \mathcal{F}_{\gamma(N)-1}\right]\right) \leq \exp\left(t^2 \tilde{C} N^{2\alpha-1}\right), \qquad (3.9)$$
where $\tilde{C} > 0$ is a suitable constant.
3.5. The goal. With the above preparation, we are finally ready to prove Theorem 3.1.
Proof of Theorem 3.1.
On the one hand, by expanding the moment generating function in (3.8), taking the logarithm and multiplying by $N^{1-2\alpha}$, we see that
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \geq \lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right]$$
$$= \lim_{N \to \infty} \frac{t^2}{2N}\sum_{i=0}^{\gamma(N)-1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right)$$
$$= \frac{t^2}{2}\left(\lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda)\right) = \frac{t^2 c}{2}, \qquad (3.10)$$
where we used the abbreviation $\rho_\lambda = \rho_{d,\lambda}$, and the last line follows from the asymptotics in [5, (23)] for the sum appearing in the second line. Indeed, by $\gamma(N) = \rho_\lambda N + \zeta N^\alpha$ and the explicit formula for $u_i$ in (3.5), we have $u_i \leq cN$ for some $c > 0$ and all $0 \leq i \leq N$. Moreover, each $\beta_i$ is a constant between 0 and 1. Thus each summand in
$$\sum_{i=\rho_\lambda N}^{\rho_\lambda N + \zeta N^\alpha}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right)$$
is uniformly bounded. Therefore, the sum over $0 \leq i \leq \gamma(N)-1$ and the sum over $0 \leq i \leq \rho_\lambda N - 1$ of the same summands differ only by $o(N)$, so that computing
$$\frac{1}{N}\sum_{i=0}^{\rho_\lambda N - 1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right) + o(1)$$
yields the claim.

On the other hand, with the bound for the moment generating function derived in (3.9) and the fact that $1/2 < \alpha < \beta < 1$, it holds that
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N^c}\right] \leq \lim_{N \to \infty} N^{1-2\alpha} \log\left(\exp\left(t^2\tilde{C} N^{2\alpha-1}\right)P(\Sigma_N^c)\right)$$
$$\leq \lim_{N \to \infty} N^{1-2\alpha}\left(t^2\tilde{C} N^{2\alpha-1} + \log P(\Sigma_N^c)\right) \leq \lim_{N \to \infty} N^{1-2\alpha}\left(t^2\tilde{C} N^{2\alpha-1} - \eta N^{2\beta-1}\right) = -\infty. \qquad (3.11)$$
Therefore we can conclude
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \leq \lim_{N \to \infty} N^{1-2\alpha} \log\left(2\max\left\{\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right],\ \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N^c}\right]\right\}\right) = \frac{t^2 c}{2}.$$
Here the last-but-one line follows by combining equations (3.10) and (3.11). Since the parabola $ct^2/2$ is the Legendre transform of $x^2/(2c)$, the Gärtner–Ellis theorem applies and $x^2/(2c)$ is then the rate function in Theorem 3.1. □

4. Choice of $k_N$

Following the idea we sketched in the previous sections, we seek a smart choice for $k_N$ such that the union of the connected components of the first $k_N$ vertices does not essentially differ from the giant component with overwhelming probability. This union will be called $C_{\leq k_N}$ in the sequel; its size will be denoted by $|C_{\leq k_N}|$. Let us first recall two very useful results from the literature: it has been shown that $|C_{\max}|$ concentrates on $\rho_\lambda N$ and that the second largest component, denoted by $C_{\text{second}}$, is unlikely to be large as well.

Remark.
The notation $f(N) = \Omega(g(N))$ is used if there exist $N_0 \in \mathbb{N}$ and $C > 0$ such that $f(N) \geq C g(N)$ for all $N \geq N_0$.

Theorem 4.1 ([7, Theorem 4]). With the same assumptions as in Theorem 1.1, i.e. $\lambda = 1 + \varepsilon$ with $\varepsilon = O(1)$ as well as $\varepsilon^3 N \to \infty$: if $\omega = \omega(N) \to \infty$ and $\omega = O\big(\sqrt{\varepsilon^3 N}\big)$, then
$$P\left(\big||C_{\max}| - \rho_\lambda N\big| \geq \omega\sqrt{N/\varepsilon}\right) = \exp\big(-\Omega(\omega^2)\big).$$
Furthermore, if $L = L(N)$ fulfills $\varepsilon^3 L \to \infty$ and $L = O(\varepsilon N)$, then there exists $C > 0$ such that the second largest component $C_{\text{second}}$ in $G_d(N,p)$ satisfies
$$P(|C_{\text{second}}| > L) \leq \frac{C}{\varepsilon^3 N L}\exp(-\varepsilon^3 L/C) \qquad (4.1)$$
for large enough $N$.

Choice of $k_N$: we set $k_N = N^\gamma$ for a
$$2\alpha - 1 < \gamma < \alpha \qquad (4.2)$$
(recall that $\alpha \in (1/2, 1)$). The upper bound on $k_N$ will be discussed more precisely in Section 6, whereas the lower bound arises from the fact that the union of connected components starting with $k_N$ vertices should contain the giant component with overwhelming probability:

Proposition 4.2.
For any $1/2 < \alpha < 1$,
$$P\big(|C_{\max}| > |C_{\leq k_N}|\big) \leq \exp(-C k_N)$$
for some $C > 0$ and $N$ large enough.

Proof. Note that
$$P(|C_{\max}| > |C_{\leq k_N}|) = P(\forall\, 1 \leq i \leq k_N : i \notin C_{\max}) = P(1 \notin C_{\max})\,P(2 \notin C_{\max} \mid 1 \notin C_{\max}) \cdots P(k_N \notin C_{\max} \mid 1, \ldots, k_N - 1 \notin C_{\max}).$$
For all $\xi > 0$ it holds that
$$P(1 \notin C_{\max}) = P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| > \xi N) + P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N)$$
$$\leq P(||C_{\max}| - \rho_\lambda N| > \xi N) + P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N).$$
Let us set $0 < \xi < \rho_\lambda$ and apply the large deviation bound given in Theorem 4.1 with $\omega = \xi\sqrt{\varepsilon N}$; then the first term becomes
$$P(||C_{\max}| - \rho_\lambda N| > \xi N) \leq \exp(-c\xi^2\varepsilon N)$$
for some constant $c > 0$. This converges to 0 by the assumption on $\varepsilon$. On the other hand, note that on the event $\{\rho_\lambda N - \xi N \leq |C_{\max}| \leq \rho_\lambda N + \xi N\}$ there are at most $N - (\rho_\lambda N - \xi N)$ vertices which are not contained in $C_{\max}$. Then we can bound the second term by
$$P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N) \leq 1 - (\rho_\lambda - \xi).$$
Therefore we obtain
$$P(1 \notin C_{\max}) \leq 1 - (\rho_\lambda - \xi) + \exp(-c\varepsilon\xi^2 N),$$
and likewise
$$P(j \notin C_{\max} \mid 1, \ldots, j-1 \notin C_{\max}) \leq P(1 \notin C_{\max})$$
for all $2 \leq j \leq k_N$. Consequently, there exists an appropriate $C > 0$ such that
$$P(|C_{\max}| > |C_{\leq k_N}|) \leq \big(P(1 \notin C_{\max})\big)^{k_N} \leq \big(1 - (\rho_\lambda - \xi) + \exp(-c\varepsilon\xi^2 N)\big)^{k_N} \leq \exp(-C k_N). \qquad \Box$$

5. A moderate deviations principle for $|C_{\leq k_N}|$

In this section we are going to show an MDP for $|C_{\leq k_N}|$ — in other words, for the size of the union of the connected components obtained by setting the first $k_N$ vertices active in the exploration process — when $k_N$ is of the right size. We will prove this MDP using the MDP for the martingale part of the exploration process derived in Theorem 3.1. In the next section we will see that, if $k_N$ is large enough, $|C_{\leq k_N}|$ and $|C_{\max}|$ only differ by an amount that is negligible on the moderate deviations scale.

Theorem 5.1.
Consider a probability for the presence of a hyperedge as in (1.2) with $\lambda = 1 + \varepsilon$ as in Theorem 1.1. Take $k_N = N^\gamma$ for $2\alpha - 1 < \gamma < \alpha$ as in (4.2). Then for any $1/2 < \alpha < 1$ and $y > 0$ we have that
$$\lim_{N \to \infty} N^{1-2\alpha} \log P\big(\big||C_{\leq k_N}| - \rho_\lambda N\big| > yN^\alpha\big) = -J(y),$$
where $J$ is given by (1.9), i.e.
$$J(y) = I(y(1-\lambda^*)) = \frac{y^2(1-\lambda^*)^2}{2c},$$
and $I(\cdot)$ as well as $c$ are given explicitly in Theorem 3.1.

Remark 5.2. Due to the topological structure of $\mathbb{R}$, Theorem 5.1 is indeed a moderate deviations principle; see [11, Section 3.7].

We will break up the proof of Theorem 5.1 into several lemmas.
5.1. The upper bounds.

Lemma 5.3. In the situation and with the notation of Theorem 5.1, we have for any $1/2 < \alpha < 1$ and $y > 0$ that
$$\lim_{N \to \infty} N^{1-2\alpha} \log P\big(|C_{\leq k_N}| > yN^\alpha + \rho_\lambda N\big) \leq -J(y).$$
Let yN α + ρ λ N =: m + y . Firstly, recall the approximation process ˜ A t := x t + β t S t , and the trajectory of x t given by (2.9), and we see x m + y = f ( m + y ) + O (1) = N g d,λ ( m + y N ) + O (1) , (5.1)where g d,λ is given by (2.7): g d,λ ( x ) = 1 − x − exp (cid:18) − λd − (cid:0) − (1 − x ) d − (cid:1)(cid:19) . We define h ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) . Note that h ′ ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) (cid:0) − λ (1 − yN α − − ρ λ ) d − N α − (cid:1) , and h ′′ ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) (cid:0) λ (1 − yN α − − ρ λ ) d − N α − + λ ( d − − yN α − − ρ λ ) d − N α − (cid:1) . Thus we may expand h in y = 0 to obtain h ( y ) = exp (cid:18) − λd − (cid:0) − (1 − ρ λ ) d − (cid:1)(cid:19) (cid:0) − λ (1 − ρ λ ) d − yN α − (cid:1) + O ( N α − ) . Inserting the above calculation into g d,λ yields g d,λ ( yN α − + ρ λ )=1 − yN α − − ρ λ − h ( y )= − yN α − + exp (cid:18) − λd − (cid:0) − (1 − ρ λ ) d − (cid:1)(cid:19) (cid:0) λ (1 − ρ λ ) d − yN α − (cid:1) + O ( N α − )= − yN α − + λ (1 − ρ λ ) d − yN α − + O ( N α − ) , where we used the fact that g d,λ ( ρ λ ) = 0 by (1.5). Therefore, we conclude bythe definition of x t in (5.1) x m + y = N g d,λ ( yN α − + ρ λ ) + O (1)= − y (cid:0) − λ (1 − ρ λ ) d − (cid:1) N α + O ( N α − ) + O (1)= − y (1 − λ ∗ ) N α + o ( N α ) + O (1) , (5.2)where in the last line we used the abbreviation λ ∗ = λ (1 − ρ d,λ ) d − in (1.4).Finally, recall that ˜ A is the approximation process given by ˜ A t = x t + β t S t in(2.5) and E ˜ A t = x t (as well as (2.10), (2.11)). We observe P (cid:0) | C ≤ k N | > m + y (cid:1) = P (cid:0) ∀ m ≤ m + y : A m > (cid:1) ≤ P (cid:16) A m + y > (cid:17) = P (cid:18) A m + y − E A m + y N α > − E A m + y N α (cid:19) ∼ P ˜ A m + y − E ˜ A m + y N α > − E ˜ A m + y N α ! 
= P (cid:18) β m + y S m + y N α > − x m + y N α (cid:19) = P (cid:18) β m + y S m + y N α > y (1 − λ ∗ ) + o (1) (cid:19) . Applying Theorem 3.1, it yieldslim N →∞ N α − log P (cid:0) | C ≤ k N | > m + y (cid:1) ≤ lim N →∞ N α − log P (cid:18) β m + y S m + y N α > y (1 − λ ∗ ) + o (1) (cid:19) = − I ( y (1 − λ ∗ )) = − J ( y ) . (cid:3) Lemma 5.4.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < -yN^\alpha + \rho_\lambda N\right) \le -J(y).$$
Proof.
Analogously to $m_{+y}$, let us define $m_{-y} := -yN^\alpha + \rho_\lambda N$. We observe for each $\zeta$ with $\alpha < \zeta < 1$
$$P\left(|C^{\le k_N}| < m_{-y}\right) = P\left(\exists\, m < m_{-y} : A_m = 0\right) \le \sum_{m=-yN^\zeta+\rho_\lambda N}^{m_{-y}} P(A_m = 0) + P\left(\exists\, m < -yN^\zeta+\rho_\lambda N : A_m = 0\right) = \sum_{m=-yN^\zeta+\rho_\lambda N}^{m_{-y}} P(A_m = 0) + P\left(|C^{\le k_N}| < -yN^\zeta + \rho_\lambda N\right). \qquad (5.3)$$
In particular, the second term in (5.3) is negligible on the chosen moderate deviations scale. We pick $\omega = y\sqrt{\varepsilon}\,N^{\zeta-1/2}$, and one can verify that $\omega = O(\sqrt{\varepsilon N})$. Indeed, it is implied by Theorem 4.1 that for some constant $C > 0$
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C_{\max}| < -yN^\zeta + \rho_\lambda N\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\zeta\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > \omega\sqrt{N/\varepsilon}\right) \le \lim_{N\to\infty} -Cy^2\varepsilon N^{2\zeta-1} N^{1-2\alpha} = -\infty.$$
Here, for the last equality to hold we need to choose $\zeta$ appropriately. This is done as follows: recall that we require condition (1.8), i.e. $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a given $\iota > 0$.
(1) If $2-2\alpha-\iota < 1$, i.e. if $\alpha > \tfrac12 - \tfrac{\iota}{2}$, we get $\varepsilon^3 N^{2-2\alpha-\iota} \to \infty$. We set $\zeta > \max\{1-\tfrac{\iota}{2}, \alpha\}$, and thus obtain $2\zeta - 2\alpha > 2 - 2\alpha - \iota$. This ensures $\varepsilon N^{2\zeta-2\alpha} \to \infty$.
(2) If $2-2\alpha-\iota > 1$, i.e. if $\alpha < \tfrac12 - \tfrac{\iota}{2}$, we see that $\varepsilon^3 N \to \infty$. In this case define $\zeta > \alpha + \tfrac14$, which implies $2\zeta - 2\alpha > \tfrac12$. Hence, it follows that $\varepsilon N^{2\zeta-2\alpha} \to \infty$.
(3) If $2-2\alpha-\iota = 1$, any of the above choices for $\zeta$ can be applied.
Now we fix $y$ and $\zeta$ satisfying the above conditions. For $m \in [-yN^\zeta+\rho_\lambda N, m_{-y}]$, we can find a $\delta$ with $\alpha \le \delta \le \zeta$ such that $m = m_\delta := -yN^\delta + \rho_\lambda N$. Note that $\delta$ will depend on $N$.
We distinguish the following two cases via the sets $M_1$ and $M_2$ defined below (to simplify notation, we assume that the endpoints of the interval $[-yN^\zeta+\rho_\lambda N, m_{-y}]$ are integers, to avoid irrelevant rounding). Firstly,
$$M_1 := \left\{ m \in [-yN^\zeta+\rho_\lambda N, m_{-y}] : \frac{m - \rho_\lambda N}{m_{-y} - \rho_\lambda N} \xrightarrow{N\to\infty} \infty \right\}.$$
Note that $M_1$ contains exactly those $m$ for which $\liminf \delta = \liminf \delta_N > \alpha$, when we write $m$ in the form $m_\delta$ as above. Applying Theorem 3.1 for $m_\delta = -yN^\delta + \rho_\lambda N$ in the role of $\gamma(N)$ yields that $\beta_{m_\delta} S_{m_\delta}$ satisfies an MDP with speed $N^{2\delta-1}$ and rate function $I(x) = x^2/(2c)$, where $c$ was given explicitly in (3.1).
By (2.10) and (2.11), we apply Theorem 3.1 to the summands in the first term of (5.3) to obtain for all $m = m_\delta \in M_1$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} \le 0) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\tilde A_{m_\delta} \le 0\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right) \le \lim_{N\to\infty} N^{2\delta-1} N^{1-2\alpha} \left[N^{1-2\delta} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right)\right] = -\infty.$$
The last step follows, since $x_{m_\delta}/N^\delta$ converges in $\mathbb{R}$ so that the MDP for $\beta_{m_\delta} S_{m_\delta}$ applies, while $N^{2\delta-1}N^{1-2\alpha} = N^{2\delta-2\alpha} \to \infty$ on $M_1$. Indeed, we can expand $x_{m_\delta}$ analogously to (5.2) to see that $x_{m_\delta}$ is of the order $N^\delta$.
Secondly, define
$$M_2 := \left\{ m \in [-yN^\zeta+\rho_\lambda N, m_{-y}] : \frac{m - \rho_\lambda N}{m_{-y} - \rho_\lambda N} \xrightarrow{N\to\infty} \text{const} \right\}.$$
Note that $M_2$ contains exactly those $m$ for which $\lim \delta = \lim \delta_N = \alpha$, when we write $m$ in the form $m_\delta$ as above.
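The expansions used here and in (5.2) rest on the linearisation of $g_{d,\lambda}$ at $\rho_\lambda$, namely $g_{d,\lambda}(\rho_\lambda + u) \approx -(1-\lambda^*)u$ for small $u$. This can be checked numerically. The following Python sketch is our own illustration, assuming $g_{d,\lambda}$ as in (2.7), $\rho_\lambda$ as its positive zero as in (1.5), and $\lambda^* = \lambda(1-\rho_\lambda)^{d-1}$ as in (1.4); the helper names are ours.

```python
import math

def g(rho, d, lam):
    # g_{d,lam}(x) = 1 - x - exp(-(lam/(d-1)) * (1 - (1-x)^(d-1))),
    # our reading of (2.7); its positive zero is rho_lambda.
    return 1 - rho - math.exp(-(lam / (d - 1)) * (1 - (1 - rho) ** (d - 1)))

def rho_lambda(d, lam, tol=1e-12):
    # Positive zero of g via bisection: for lam > 1, g is positive
    # just to the right of 0 and negative at 1.
    lo, hi = 1e-6, 1.0
    assert g(lo, d, lam) > 0 > g(hi, d, lam)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid, d, lam) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

d, lam = 3, 2.0                          # a supercritical example, lam > 1
rho = rho_lambda(d, lam)
lam_star = lam * (1 - rho) ** (d - 1)    # dual branching factor lambda^*
u = 1e-6
slope = g(rho + u, d, lam) / u           # should be close to -(1 - lam_star)
print(rho, lam_star, slope)
```

For $d = 3$, $\lambda = 2$ this gives $\rho_\lambda \approx 0.549$ and $\lambda^* \approx 0.41 < 1$, and the difference quotient of $g_{d,\lambda}$ at $\rho_\lambda$ indeed matches $-(1-\lambda^*)$, which is the slope appearing in (5.2).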
Applying Theorem 3.1 together with (2.10), (2.11), it holds for all $m = m_\delta \in M_2$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} \le 0) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\tilde A_{m_\delta} \le 0\right) \le \lim_{N\to\infty} N^{2\delta-1} N^{1-2\alpha} \left[N^{1-2\delta} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right)\right] = -I(-y(1-\lambda^*)) = -J(y).$$
Therefore, since the number of summands in (5.3) is at most $N$ and hence negligible on this scale, we conclude
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < m_{-y}\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log \max_{m \in [-yN^\zeta+\rho_\lambda N,\, m_{-y}]} P(A_m = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \max_{m_\delta \in [-yN^\zeta+\rho_\lambda N,\, m_{-y}]} \log P(A_{m_\delta} = 0) \le -J(y). \qquad\square$$

5.2. The lower bounds.
In order to derive the corresponding lower bounds for our MDP, we need some preparations to get familiar with the properties of the exploration process.

5.2.1.
Recap of the exploration process.
Let $0 = t_0 < t_1 < t_2 < \cdots < t_l = N$ enumerate the set $\{t : A_t - A_{t-1} = -1\}$, which collects the moments where the exploration starts with a new component of the hypergraph. Let
$$C_t = \left|\{i : 0 \le i < t,\ A_i - A_{i-1} = -1\}\right|$$
be the number of components which have been explored by time $t$. Furthermore, we define the random walk $X_t = A_t - C_t$. Recall from the definition of $\lambda = 1+\varepsilon$ in Theorem 1.1 that we have $\varepsilon = O(1)$ as well as $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a given $\iota > 0$ (in particular, $\varepsilon^3 N \to \infty$). Next fix the function $\omega = \omega(N)$ satisfying
$$\omega = \omega(N) \to \infty \quad\text{and}\quad \omega \le C\sqrt{\varepsilon N} \qquad (5.4)$$
for some constant $C > 0$. We define
$$t^* := \omega\sqrt{N/\varepsilon}.$$
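To build intuition for these definitions, the exploration can be simulated. The following Python sketch is our own illustration and not the paper's formal construction; in particular, the sampling of $G_d(N,p)$ with $p = \lambda(d-2)!/N^{d-1}$ and the bookkeeping conventions for $A_t$, $C_t$ and $X_t = A_t - C_t$ are simplified assumptions.

```python
import itertools
import math
import random
from collections import defaultdict

def sample_hypergraph(N, d, lam, rng):
    # Sample G_d(N, p): every d-subset of the N vertices is a hyperedge
    # independently with p = lam*(d-2)!/N^(d-1) (assumed parametrisation).
    p = lam * math.factorial(d - 2) / N ** (d - 1)
    return [e for e in itertools.combinations(range(N), d) if rng.random() < p]

def explore(N, edges):
    # Vertex-by-vertex exploration: after t steps, A_t = number of active
    # vertices, C_t = number of components started, X_t = A_t - C_t.
    incident = defaultdict(list)
    for e in edges:
        for v in e:
            incident[v].append(e)
    status = ["unexplored"] * N
    A, C, X = [0], [0], [0]
    active, comps = [], 0
    for t in range(1, N + 1):
        if not active:  # previous component finished: start a new one
            v = next(u for u in range(N) if status[u] == "unexplored")
            status[v] = "active"
            active.append(v)
            comps += 1
        v = active.pop()
        status[v] = "explored"      # explore one vertex per time step
        for e in incident[v]:       # activate its unexplored neighbours
            for u in e:
                if status[u] == "unexplored":
                    status[u] = "active"
                    active.append(u)
        A.append(len(active))
        C.append(comps)
        X.append(len(active) - comps)
    return A, C, X

rng = random.Random(1)
N, d, lam = 60, 3, 2.0  # supercritical example: lam > 1
A, C, X = explore(N, sample_hypergraph(N, d, lam, rng))
print("components:", C[-1], "deepest dip of the walk:", min(X))
```

Each time step explores exactly one vertex, and each dip of the walk $X_t$ to a new minimum marks the completion of a component, mirroring the role of $Z$, $T_1$ and $T_2$ below.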
Furthermore, let us denote the number of components completely explored within time $t^*$ by
$$Z := -\inf\{X_t : t \le t^*\}.$$
Finally, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$, set
$$T_1 = \inf\{t : X_t = -Z\} \quad\text{and}\quad T_2 = \inf\{t : X_t = -Z - 1\},$$
where $T_1$ is the time point at which the last component within time $t^*$ is completely explored, while $T_2$ is the time when we finish exploring the next component. Simply by definition, we have $T_1 \le t^* < T_2$. From now on, ignoring the irrelevant rounding to integers, let us set $t^* = \rho_\lambda N$.

The component $\mathcal{C}_{1,2}$. Denote by $\mathcal{C}_{1,2}$ the component which we explore from time $T_1 + 1$ to time $T_2$. Recall that in the supercritical regime there is a unique giant component. Bollobás and Riordan show in [7, Lemma 16] that with probability $1 - \exp(-\Omega(\omega))$ (where $\omega$ is given in (5.4)), on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ is the unique giant component of $G_d(N,p)$. Moreover, the formula for the size of $\mathcal{C}_{1,2}$ (conditioned on $\{C_{\max} \subseteq C^{\le k_N}\}$) is then given by
$$|\mathcal{C}_{1,2}| = t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} \qquad (5.5)$$
in terms of the constructed approximation process $\tilde A_t$, see [6, (21)].

Lemma 5.5.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge -J(y).$$

Proof.
As implied by Proposition 4.2 we have
$$P\left(C_{\max} \not\subseteq C^{\le k_N}\right) \le P\left(|C_{\max}| > |C^{\le k_N}|\right) \le \exp(-Ck_N).$$
By our construction, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ exists, and it satisfies
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N,\ C_{\max} \subseteq C^{\le k_N}\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(|\mathcal{C}_{1,2}| > yN^\alpha + \rho_\lambda N,\ C_{\max} \subseteq C^{\le k_N}\right).$$
Indeed, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ is contained in the union of connected components $C^{\le k_N}$. Therefore, by inserting the approximation for $|\mathcal{C}_{1,2}|$ in (5.5) we observe
$$P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge P\left(|\mathcal{C}_{1,2}| > yN^\alpha + \rho_\lambda N \,\middle|\, C_{\max} \subseteq C^{\le k_N}\right) P\left(C_{\max} \subseteq C^{\le k_N}\right) \ge P\left(t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} > yN^\alpha + \rho_\lambda N \,\middle|\, C_{\max} \subseteq C^{\le k_N}\right)\left(1 - \exp(-Ck_N)\right).$$
Finally, using the notation $t^* = \rho_\lambda N$, we arrive at
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge \lim_{N\to\infty} N^{1-2\alpha}\left[\log P\left(t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} > yN^\alpha + \rho_\lambda N\right) + \log\left(1 - \exp(-Ck_N)\right)\right] \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{t^*} S_{t^*}}{N^\alpha} > y(1-\lambda^*)\right) \ge -I(y(1-\lambda^*)) = -J(y). \qquad\square$$

Lemma 5.6.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < -yN^\alpha + \rho_\lambda N\right) \ge -J(y).$$

Proof.
Again let $m_{-y} = -yN^\alpha + \rho_\lambda N$; by the expansion (5.2) (applied with $-y$ in place of $y$) we obtain
$$x_{m_{-y}} = y(1-\lambda^*)N^\alpha + o(N^\alpha) + O(1).$$
Therefore, using Theorem 3.1 and ignoring the irrelevant rounding between $<$ and $\le$, we obtain
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < m_{-y}\right) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\exists\, m \le m_{-y} : A_m = 0\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(A_{m_{-y}} \le 0\right) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_{-y}} S_{m_{-y}}}{N^\alpha} \le -\frac{x_{m_{-y}}}{N^\alpha}\right) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_{-y}} S_{m_{-y}}}{N^\alpha} \le -y(1-\lambda^*) + o(1)\right) = -I(-y(1-\lambda^*)) = -J(y). \qquad\square$$

6. Proof of Theorem 1.1: a moderate deviations principle for $|C_{\max}|$

6.1. Compare $|C_{\max}|$ and $|C^{\le k_N}|$. In this subsection we improve Proposition 4.2. We show that, if we allow an error term $r_N = o(N^\alpha)$, we obtain a bound on the probability of the event $|C_{\max}| + r_N < |C^{\le k_N}|$ that is negligible on the moderate deviations scale. Hence, the upper bound on $k_N$ given by (4.2) can be obtained. We remind the reader that $k_N = N^\gamma$ for $2\alpha - 1 < \gamma < \alpha$, where $\tfrac12 < \alpha < 1$.

Lemma 6.1.
Let $r_N = N^\xi$ for $\gamma < \xi < \alpha$. Then
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right)$$
for some constants $M, c > 0$ with $M$ small enough.

Proof. Let us denote the $i$-th largest component of $G_d(N,p)$ by $L_i$, whose size is given by $l_i = |L_i|$. Then, because $C^{\le k_N}$ can at most be as large as the union of the $k_N$ largest components, we get for each $\delta > 0$
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le P\left(|C_{\max}| + r_N < |C_{\max}| + \Big|\bigcup_{i=2}^{k_N} L_i\Big|\right) \le P\left(r_N < \Big|\bigcup_{j=2}^{k_N} L_j\Big|,\ |C_{\max}| > \delta N\right) + P\left(|C_{\max}| \le \delta N\right) \le \sum_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) + P\left(|C_{\max}| \le \delta N\right).$$
Define $\delta N := \rho_\lambda N - \varepsilon N$ with $\varepsilon$ again as defined in Theorem 1.1. We obtain by Theorem 4.1 with $\omega = \sqrt{\varepsilon^3 N}$ that
$$P\left(|C_{\max}| \le \delta N\right) \le P\left(\big||C_{\max}| - \rho_\lambda N\big| \ge \omega\sqrt{N/\varepsilon}\right) \le \exp\left(-c\varepsilon^3 N\right), \qquad (6.1)$$
where $c > 0$ is a constant. Since $\varepsilon = O(1)$ as well as $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a fixed small $\iota > 0$, we obtain
$$N^{1-2\alpha} \log P\left(|C_{\max}| \le \delta N\right) \le N^{1-2\alpha} \log \exp\left(-c\varepsilon^3 N\right) = -c\varepsilon^3 N^{2-2\alpha} \xrightarrow{N\to\infty} -\infty. \qquad (6.2)$$
Moreover, note that for fixed $a_2 > \cdots > a_{k_N}$ with $a_2 + \cdots + a_{k_N} > r_N$, it holds
$$P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) = \prod_{j=2}^{k_N} P\left(l_j = a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(|C_{\max}| > \delta N\right) \le \prod_{j=3}^{k_N} P\left(l_j = a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(\{l_2 \ge a_2\} \cap \{|C_{\max}| > \delta N\}\right) \le \prod_{j=3}^{k_N} P\left(l_j \ge a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(l_2 \ge a_2\right).$$
On the one hand, we obtain from (4.1) that for some constants $c, C > 0$
$$P(l_2 > a_2) \le C\,\frac{\varepsilon N}{a_2} \exp\left(-c\varepsilon^2 a_2\right),$$
where $\varepsilon$ is given by the branching factor $\lambda = 1+\varepsilon$ of the original hypergraph $G_d(N,p)$. On the other hand, for each $j \in \{3, \ldots, k_N\}$, the hypergraph $G_d(N,p)$ after removing the components $L_2, \ldots, L_{j-1}$ and $C_{\max}$, conditioned on $\{|C_{\max}| > \delta N\}$, is by [7, Lemma 8.1] (or common sense) again a hypergraph $G_d(N-s_j, p)$, where $s_j \le (1-\delta)N$ denotes the number of removed vertices. Recall that
$$p = \frac{\lambda(d-2)!}{N^{d-1}} = \frac{\lambda_j (d-2)!}{(N-s_j)^{d-1}},$$
so $G_d(N-s_j, p)$ is characterised by its branching factor
$$\lambda_j = \left(1 - \frac{s_j}{N}\right)^{d-1} \lambda,$$
where $\lambda = 1+\varepsilon$ by assumption. Now let us denote by $\varepsilon_j$ the following quantity:
$$\varepsilon_j = 1 - \left(1 - \frac{s_j}{N}\right)^{d-1}(1+\varepsilon).$$
Then we arrive at
$$\lambda_j = 1 - \varepsilon_j. \qquad (6.3)$$
Since $s_j \le (1-\delta)N$, we see by [7, (8.3)] that
$$\varepsilon_j \le 1 - \delta^{d-1}(1+\varepsilon), \qquad c_\delta\, \varepsilon \le \varepsilon_j \le C_\delta\, \varepsilon$$
for some constants $c_\delta, C_\delta > 0$ which depend on $\delta$ but not on $j$. Therefore the branching factor $\lambda_j$ of the new hypergraph $G_d(N-s_j, p)$, given in (6.3), belongs to the subcritical regime. From [7, Theorem 2] we obtain a large deviation bound for the largest component in $G_d(N-s_j, p)$, that is to say, for $L_j$ in $G_d(N,p)$. We thus obtain
$$P\left(l_j \ge a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) \le C\,\frac{\varepsilon_j N}{a_j}\exp\left(-c\varepsilon_j^2 a_j\right) \le C_\delta\,\frac{\varepsilon N}{a_j}\exp\left(-c_\delta \varepsilon^2 a_j\right).$$
Hence, we finally get for
$M > 0$
$$\sum_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) \le r_N^{k_N} \max_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} \frac{(C_\delta \varepsilon N)^{k_N}}{\prod_{j=2}^{k_N} a_j}\exp\left(-c_\delta \varepsilon^2 \sum_{j=2}^{k_N} a_j\right) \le \exp\left(-c_\delta \varepsilon^2 r_N + k_N \log(C_\delta \varepsilon r_N N)\right) \le \exp(-Mk_N). \qquad (6.4)$$
The last step follows, since not only $k_N = o(r_N)$, which is implied by $r_N = N^\xi$ with $\gamma < \xi < \alpha$ and $k_N = N^\gamma$ with $2\alpha - 1 < \gamma < \alpha$ in (4.2), but also $k_N = o(N)$ holds. Hence, (6.4) together with (6.1), (6.2) yields
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right)$$
by adjusting the constant $M > 0$. $\square$

6.2. Proof of Theorem 1.1.
As we described before, by an appropriate choice of $k_N$, $|C^{\le k_N}|$ and $|C_{\max}|$ only differ by an amount that is negligible on the moderate deviations scale.

Proof.
Note that we pick $r_N = N^\xi$ for $\gamma < \xi < \alpha$; in particular, it satisfies $r_N = o(N^\alpha)$. Now for all $y > 0$, we estimate the upper tail by applying Lemma 6.1 with the $M$ given there to obtain
$$P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) = P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| \le r_N\right) + P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| > r_N\right) \le P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right).$$
It suffices to apply the MDP for $|C^{\le k_N}|$ derived in Theorem 5.1; it yields
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log\left[P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N)\right] \le \lim_{N\to\infty} N^{1-2\alpha} \log \max\left\{P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right),\ \exp(-Mk_N)\right\} \le -J(y).$$
Similarly, for the lower tail we have for all $y > 0$
$$P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha\right) = P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| \le r_N\right) + P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| > r_N\right) \le P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right).$$
Again by Theorem 5.1, we arrive at
$$-J(y) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log \max\left\{P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right),\ \exp(-Mk_N)\right\} \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right).$$
Altogether, the claim
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) = -J(y)$$
follows for all $y > 0$. $\square$

Acknowledgements
Research of the authors was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy EXC 2044-390685587, Mathematics Münster: Dynamics - Geometry - Structure.
References

[1] J. Ameskamp and M. Löwe. Moderate deviations for the size of the largest component in a super-critical Erdős-Rényi graph. Markov Process. Related Fields, 17(3):369-390, 2011.
[2] D. Barraez, S. Boucheron, and W. Fernandez de la Vega. On the fluctuations of the giant component. Combin. Probab. Comput., 9(4):287-304, 2000.
[3] M. Behrisch, A. Coja-Oghlan, and M. Kang. The order of the giant component of random hypergraphs. Random Structures Algorithms, 36(2):149-184, 2010.
[4] M. Behrisch, A. Coja-Oghlan, and M. Kang. Local limit theorems for the giant component of random hypergraphs. Combin. Probab. Comput., 23(3):331-366, 2014.
[5] B. Bollobás and O. Riordan. Asymptotic normality of the size of the giant component in a random hypergraph. Random Structures Algorithms, 41(4):441-450, 2012.
[6] B. Bollobás and O. Riordan. Asymptotic normality of the size of the giant component via a random walk. J. Combin. Theory Ser. B, 102(1):53-61, 2012.
[7] B. Bollobás and O. Riordan. Exploring hypergraphs with martingales. Random Structures Algorithms, 50(3):325-352, 2017.
[8] A. Coja-Oghlan, C. Moore, and V. Sanwalani. Counting connected graphs and hypergraphs via the probabilistic method. Random Structures Algorithms, 31(3):288-329, 2007.
[9] V. H. de la Peña. A general class of exponential inequalities for martingales and ratios. Ann. Probab., 27(1):537-564, 1999.
[10] A. Dembo. Moderate deviations for martingales with bounded jumps. Electron. Comm. Probab., 1:no. 3, 11-17, 1996.
[11] A. Dembo and O. Zeitouni. Large deviations techniques and applications, volume 38 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2010. Corrected reprint of the second (1998) edition.
[12] R. Durrett. Random graph dynamics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2010.
[13] P. Eichelsbacher and M. Löwe. Lindeberg's method for moderate deviations and random summation. Preprint, arXiv:1705.03837, 2017.
[14] P. Erdős and A. Rényi. On random graphs. I. Publ. Math. Debrecen, 6:290-297, 1959.
[15] P. Erdős and A. Rényi. On the evolution of random graphs. Bull. Inst. Internat. Statist., 38:343-347, 1961.
[16] S. Janson, D. E. Knuth, T. Łuczak, and B. Pittel. The birth of the giant component. Random Structures Algorithms, 4(3):231-358, 1993. With an introduction by the editors.
[17] M. Karoński and T. Łuczak. The phase transition in a random hypergraph. J. Comput. Appl. Math., 142(1):125-135, 2002. Probabilistic methods in combinatorics and combinatorial optimization.
[18] M. Löwe and F. Vermet. Capacity of an associative memory model on random graph architectures. Bernoulli, (to appear), 2014.
[19] N. O'Connell. Some large deviation results for sparse random graphs. Probab. Theory Related Fields, 110(3):277-285, 1998.
[20] J. Schmidt-Pruzan and E. Shamir. Component structure in the evolution of random hypergraphs. Combinatorica, 5(1):81-94, 1985.
[21] R. van der Hofstad. Random graphs and complex networks. 2013.

(Jingjia Liu)
Fachbereich Mathematik und Informatik, Universität Münster, Einsteinstraße 62, 48149 Münster, Germany
E-mail address, Jingjia Liu: [email protected]

(Matthias Löwe)
Fachbereich Mathematik und Informatik, Universität Münster, Einsteinstraße 62, 48149 Münster, Germany
E-mail address, Matthias Löwe: