Moderate deviations for the size of the giant component in a random hypergraph
JINGJIA LIU AND MATTHIAS LÖWE
Abstract.
We prove a moderate deviations principle for the size of the largest connected component in a random $d$-uniform hypergraph. The key tools are a version of the exploration process that is also used to investigate the giant component of an Erdős–Rényi graph, a moderate deviations principle for the martingale associated with this exploration process, and exponential estimates.

1. Introduction
The research on random graphs was initiated by Erdős and Rényi, see [14], [15]. Though it was originally motivated by questions from graph theory, random graphs quickly developed into an independent field with applications in many areas such as physics, neural networks, telecommunications, or the social sciences. Despite the fact that some of these applications ask for random graphs with a given degree distribution (see e.g. [12], [21] for very readable surveys, or [18] for a recent application), the by far most popular model of a random graph is still the Erdős–Rényi graph. In this graph, one realizes all possible connections between $N$ vertices $V = \{1, \ldots, N\}$ independently with equal probability $p$. This model is referred to as $G(N,p)$.

The corresponding random hypergraph model is the model $G_d(N,p)$. Here $d \geq 2$ is a fixed integer ($d = 2$ is nothing but the ordinary Erdős–Rényi graph). Thus a realization of $G_d(N,p)$ will be a hypergraph $G = (V, E)$, where all the edges in $E$ are subsets of $V$ with cardinality $d$. Moreover, in $G_d(N,p)$ all hyperedges of cardinality $d$ are selected independently with probability $p$. One of the most striking first results about $G(N,p)$ is that there is a sharp phase transition in the size of the largest connected component: if $p = \lambda/N$ and $\lambda < 1$, then the largest component will be of size $O(\log N)$, while for $\lambda > 1$ the largest component is of order $O(N)$, both with probability converging to 1. In the latter case, the size of the largest component with high probability is of order $\rho_\lambda N + o(N)$, where $\rho_\lambda$ satisfies
$$1 - \rho_\lambda = e^{-\lambda \rho_\lambda} \qquad (1.1)$$
(here we say that the random (hyper)graph $G_d(N,p)$ enjoys a certain property $A$ with high probability (w.h.p.) if the probability that $A$ holds in $G_d(N,p)$ converges to 1 as $N$ tends to infinity). A very detailed study of this and many other phenomena concerning this phase transition can be found in [16] or [21].

The corresponding results for the $d$-uniform random hypergraph model $G_d(N,p)$ were shown in [20], [17], and [8]: if for some $\varepsilon > 0$ one has $(d-1)\binom{N-1}{d-1}p < 1 - \varepsilon$, the resulting hypergraph consists of components of order $O(\log N)$, while for $(d-1)\binom{N-1}{d-1}p > 1 + \varepsilon$ there is a unique giant component of size $O(N)$. To make this more precise, we need a number of definitions. We set
$$p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}. \qquad (1.2)$$
For each fixed $\lambda > 1$, we define the dual branching process parameter $\lambda^* < 1$ by $\lambda^* e^{-\lambda^*} = \lambda e^{-\lambda}$. In case $d = 2$, we specify $\rho_\lambda$ given by (1.1) as $\rho_\lambda =: \rho_{2,\lambda}$, whereas for $d \geq 3$ we define $\rho_{d,\lambda}$ by the equation
$$1 - \rho_{d,\lambda} = (1 - \rho_{2,\lambda})^{1/(d-1)}. \qquad (1.3)$$
It can be checked that $\rho_{d,\lambda}$ satisfies
$$\lambda^* = \lambda (1 - \rho_{d,\lambda})^{d-1}. \qquad (1.4)$$
For fixed $d$ we abbreviate $\rho_\lambda = \rho_{d,\lambda}$. The role of $\rho_\lambda$ is that it determines the asymptotic size of the giant component. Indeed, if $\lambda > 1$, it has been shown in [14] and [8] that the unique giant component is of size $\rho_\lambda N + o(N)$ with high probability. This statement can be regarded as a law of large numbers for the size of the giant component, which we will henceforth call $C_{\max}$. Moreover it was shown that $\rho_\lambda$ can also be written as the unique solution of the transcendental equation (cf. [3]):
$$1 - \rho_\lambda = \exp\left(-\frac{\lambda}{d-1}\left(1 - (1 - \rho_\lambda)^{d-1}\right)\right). \qquad (1.5)$$
Combining (1.3) with (1.4), one sees that (1.1) is indeed the case $d = 2$ of (1.5). Note that $\rho_\lambda = \rho_{d,\lambda}$ depends on $d$, which is suppressed in the notation.

Let us assume for the rest of the paper that we are in the supercritical regime, i.e.
$$(d-1)\binom{N-1}{d-1}p > 1 + \varepsilon \quad \text{for some } \varepsilon > 0,$$
where the precise conditions on $\varepsilon$ will be given later explicitly. Note that this is equivalent to assuming that $\lambda > 1 + \varepsilon$.

Date: July 19, 2019.
2010 Mathematics Subject Classification. Primary: 60F10; Secondary: 05C65.
Key words and phrases. Large deviations, moderate deviations, random hypergraph, giant component, Erdős–Rényi graph.
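Equations (1.4) and (1.5) are easy to check numerically. The following sketch is our own illustration (the function name `rho` and the bisection approach are not from the paper); it solves the transcendental equation (1.5) and verifies both the dual relation $\lambda^* e^{-\lambda^*} = \lambda e^{-\lambda}$ and $\lambda^* < 1$ in the supercritical regime.

```python
from math import exp

def rho(d: int, lam: float, tol: float = 1e-12) -> float:
    """Solve 1 - r = exp(-lam/(d-1) * (1 - (1-r)**(d-1))) for r in (0, 1)."""
    # g(r) = 1 - r - exp(...) is positive just above 0 (since lam > 1)
    # and negative at 1, so bisection brackets the nontrivial root.
    g = lambda r: 1.0 - r - exp(-lam / (d - 1) * (1.0 - (1.0 - r) ** (d - 1)))
    lo, hi = tol, 1.0 - tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: d = 3, lam = 2 gives a nontrivial rho in (0, 1),
# and lam* = lam * (1 - rho)**(d-1) < 1, cf. (1.4).
r = rho(3, 2.0)
lam_star = 2.0 * (1.0 - r) ** 2
```

For $d = 2$ the same routine recovers the Erdős–Rényi fixed point of (1.1), since the exponent then reduces to $-\lambda r$.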
For both random graphs and random hypergraphs, fluctuations around the aforementioned law of large numbers for the size of $C_{\max}$ were investigated. A Central Limit Theorem (CLT, for short) for the size of $C_{\max}$ in $G(N,p)$ was proved e.g. in [2]; for a nice proof we also refer to [21], Section 4.5. Large deviations in the same situation go back to O'Connell in his nice paper [19], while moderate deviations were investigated in [1]. The corresponding CLT for $C_{\max}$ in $G_d(N,p)$ was established in [3] using Stein's method. In [4] a local CLT is proved, even for the joint distribution of the number of vertices and edges in $C_{\max}$. Another way to prove a CLT, which uses the so-called exploration process and is reminiscent of the proof for random graphs given in [21], was introduced by Grimmett and Riordan [5].

The aim of the present paper is to establish moderate deviations results for the number of vertices in $C_{\max}$ for the case of the random hypergraph model $G_d(N,p)$. To this end, we will modify the exploration process for hypergraphs introduced in [5] in such a way that it resembles the exploration process used in [1].

In order to formulate our main theorems we need to recall that a sequence of real-valued random variables $(Y_n)$ obeys a large deviation principle (LDP) with speed $a_n$ and good rate function $I(\cdot) : \mathbb{R} \to \mathbb{R}_0^+ \cup \{+\infty\}$ if
• For every $L \in \mathbb{R}_0^+$, the level sets of $I$, denoted by $N_L := \{x \in \mathbb{R} : I(x) \leq L\}$, are compact.
• For every open set $G \subseteq \mathbb{R}$ it holds that
$$\liminf_{n \to \infty} a_n \log P(Y_n \in G) \geq -\inf_{x \in G} I(x). \qquad (1.6)$$
• For every closed set $A \subseteq \mathbb{R}$ it holds that
$$\limsup_{n \to \infty} a_n \log P(Y_n \in A) \leq -\inf_{x \in A} I(x). \qquad (1.7)$$
As announced, in this paper we will prove a moderate deviation principle (MDP) for $|C_{\max}|$ (which is a function of $N$). Formally, there is no distinction between an MDP and an LDP. Usually, an LDP lives on the scale of a law of large numbers type ergodic phenomenon, while MDPs describe the probabilities on a scale between a law of large numbers and some sort of CLT. For both large deviation principles and MDPs, the three points mentioned above serve as a definition. Having this in mind, our central result reads as:

Theorem 1.1 (MDP for the size of the giant component in $G_d(N,p)$). Let $1/2 < \alpha < 1$. For each $d \geq 2$, set $p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}$ with $\lambda = 1 + \varepsilon$. Assume that $\varepsilon = O(1)$, as well as that there exists $\iota > 0$ such that
$$\varepsilon^3 N^\tau \to \infty \quad \text{where } \tau = \min\{1/2,\ 2 - 2\alpha - \iota\}. \qquad (1.8)$$
Then the sequence of random variables $(|C_{\max}| - \rho_\lambda N)/N^\alpha$ satisfies an MDP in $G_d(N,p)$ with speed $N^{1-2\alpha}$ and rate function
$$J(x) = \frac{x^2 (1 - \lambda^*)^2}{2c}. \qquad (1.9)$$
Here $c = \lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda)$ and $\lambda^* = \lambda(1 - \rho_{d,\lambda})^{d-1}$ as in (1.4).

Remarks 1.2. (1) For $d = 2$ this result is contained in [1].
(2) The asymptotic notation should be understood as $N \to \infty$. We use $X = O(Y)$ if there is an $M > 0$ such that $\limsup_{N \to \infty} |X/Y| \leq M$, and $X = o(Z)$ if there exists $c(N)$ such that $X \leq c(N) Z$, where $c(N) \to 0$ as $N \to \infty$. Furthermore, if such quantities $M$, $c(N)$ depend on some parameters, we will indicate this by subscripts, e.g. $X = O_\rho(Y)$ meaning $M = M(\rho)$.
(3) From the proof of Lemma 6.1 one might get the impression that requiring the (slightly more natural) condition $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1/2, 2 - 2\alpha\}$ would be enough. However, in the proof of Lemma 5.4 we need the slightly stronger condition (1.8).

The rest of this paper is organized as follows: In Section 2 we give a short introduction to the exploration process, which will be used in Sections 4, 5 and 6 to prove Theorem 1.1. Briefly, this exploration process starts with a number $k = k_N$ of vertices and investigates the union of their connected components. If $k_N$ is chosen appropriately, this union coincides with the giant component of the hypergraph up to negligible terms. On the other hand, the size of this union can be controlled by a martingale underlying the exploration process. In Section 3 we prove an MDP for this martingale. In Sections 4, 5, and 6 we will see that indeed this MDP helps to show our main Theorem 1.1.

2. An exploration process on hypergraphs
The aim of this section is to introduce an exploration process to investigate the components of a hypergraph. This exploration process is inspired by the corresponding process for graphs as defined e.g. in [21]. A similar, yet slightly different process for hypergraphs was introduced in [5]. We will also use results from this paper.

We start by taking the given enumeration of the vertices from $1, \ldots, N$. Vertices during this exploration process will get one of three labels: active, unseen, or explored. At time $t$ the sets of active, unseen, and explored vertices will be denoted by $\mathcal{A}_t$, $\mathcal{U}_t$, and $\mathcal{E}_t$, respectively. We start by declaring the first $k = k_N$ vertices active and the rest unseen. Now, in each step of the process, the first active vertex (with respect to the given enumeration) is selected and declared explored. At the same time, all of its unseen neighbors are set active. The process terminates when there are no active vertices. If we denote by $C_{\leq k}$ the union of the connected components of the first $k$ vertices, then at the end of the process all vertices in $C_{\leq k}$ are explored and all the others are unseen. We remark the following

Remarks 2.1. (1) For two sequences $X_N$ and $Y_N$, we write $X_N \sim Y_N$ if the limit $\lim_{N \to \infty} X_N / Y_N$ exists and equals 1.
(2) Obviously, since we add an explored vertex in every step, $|C_{\leq k_N}| = \min\{t \in \mathbb{N} : A_t = 0\}$, where $A_t := |\mathcal{A}_t|$.
(3) By construction, $A_0 = k_N$ (to be specified later) and, for all $t$ with $A_{t-1} > 0$, one has
$$A_t = A_{t-1} + \eta_t - 1.$$
Here $\eta_t$ is the number of unseen vertices that are set active in the $t$-th step, and $A_t = 0$ if $A_{t-1} = 0$.
(4) Consider the distribution of $A_t$ when we are investigating $G_d(N,p)$ with $p = \frac{\lambda}{(d-1)\binom{N-1}{d-1}}$ and $\lambda > 1$. After the $t$-th step, with $s$ active and $N - t - s$ unseen vertices, for each unseen vertex $u$ there are exactly
$$\nu_{t+1} = \binom{N-t-2}{d-2}$$
potential edges that contain $u$ and the vertex we are about to explore, but none of the vertices we have already explored. Hence the probability that $u$ becomes active during step $t+1$ is given by
$$\pi = \pi_t = 1 - (1-p)^{\nu_{t+1}}.$$
Note that for times $t \ll N$ and with our scaling of $p$ one has $\pi \sim 1 - e^{-\lambda/N}$, which is the same scaling as in the case $d = 2$. On the other hand,
$$\pi_t = p\nu_{t+1} + O\big(p^2\nu_{t+1}^2\big) = \lambda \frac{(N-t)^{d-2}}{N^{d-1}} + O\Big(\frac{1}{N^2}\Big).$$
This implies that $\eta_t$ given $A_{t-1} = s$ is distributed like $\sum_{i=1}^{N-(t-1)-s} Y_{ti}$, where each of the $Y_{ti}$ is an indicator with success probability $\pi_t$. Note that the $Y_{ti}$ are not independent, which establishes the major difference between the case $d = 2$ (when the $Y_{ti}$ are obviously independent) and the cases $d \geq 3$.
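The dynamics just described are easy to simulate. The sketch below is our own illustration (the function `explore` is not from the paper, and enumerating all $\binom{N}{d}$ candidate hyperedges is only feasible for small $N$): it samples $G_d(N,p)$, runs the exploration started from the first $k$ vertices, and returns the trajectory $(A_t)$ together with $|C_{\leq k}|$.

```python
import random
from itertools import combinations
from math import comb

def explore(N, d, lam, k, seed=0):
    """Exploration process on a sample of G_d(N, p) with
    p = lam / ((d-1) * C(N-1, d-1)); vertices 0..k-1 start active."""
    rng = random.Random(seed)
    p = lam / ((d - 1) * comb(N - 1, d - 1))
    edges = [e for e in combinations(range(N), d) if rng.random() < p]
    incident = {v: [] for v in range(N)}
    for e in edges:
        for v in e:
            incident[v].append(e)
    UNSEEN, ACTIVE, EXPLORED = 0, 1, 2
    label = [ACTIVE] * k + [UNSEEN] * (N - k)
    active = list(range(k))
    A_traj = [k]                       # A_0 = k
    while active:
        v = min(active)                # first active vertex in the enumeration
        active.remove(v)
        label[v] = EXPLORED
        for e in incident[v]:          # activate all unseen members of incident edges
            for w in e:
                if label[w] == UNSEEN:
                    label[w] = ACTIVE
                    active.append(w)
        A_traj.append(len(active))
    size = sum(1 for v in range(N) if label[v] == EXPLORED)
    return A_traj, size
```

Since exactly one vertex is explored per step, the returned trajectory illustrates Remark 2.1(2): $|C_{\leq k}|$ equals the first hitting time of 0 by $(A_t)$.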
To simplify matters, we will change our process slightly and call instead
$$A_t = A_{t-1} + \eta_t - 1, \qquad t \in \mathbb{N},$$
the exploration process. Of course, this process agrees with the one previously considered up to the first time the process hits 0. We will follow the ideas in [5] and rewrite $A_t$ (up to small errors) as the sum of a deterministic process and a martingale. This also motivates our study of moderate deviations of martingales with $N$-dependent martingale increments in the next section. To this end, let
$$D_t := \mathbb{E}[\eta_t - 1 \mid \mathcal{F}_{t-1}],$$
where $\mathcal{F}_t$ is the $\sigma$-algebra generated by the random variables $A_0, \eta_1, \ldots, \eta_t$. From the above we learn that, with $U_t := |\mathcal{U}_t|$,
$$\mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t] = U_t \pi = U_t p\nu_{t+1} + O\Big(\frac{1}{N}\Big).$$
For later use we also recall that in [5] it was shown that with
$$\pi_2 := 1 - (1-p)^{\binom{N-t-3}{d-3}} \sim p\binom{N-t-3}{d-3} \sim \lambda(d-2)\frac{(N-t)^{d-3}}{N^{d-1}}$$
and, as before, $\pi \sim p\nu_{t+1}$, we have that
$$\mathbb{V}[\eta_{t+1} \mid \mathcal{F}_t] = U_t(U_t-1)(\pi_2 + \pi^2) + U_t\pi - (U_t\pi)^2 \sim U_t^2\pi_2 + U_t\pi \sim \lambda(d-2)\Big(1-\frac{t}{N}\Big)^{d-3}\frac{U_t^2}{N^2} + \lambda\Big(1-\frac{t}{N}\Big)^{d-2}\frac{U_t}{N}. \qquad (2.1)$$
From these computations we obtain
$$D_{t+1} = U_t p\nu_{t+1} - 1 + O\Big(\frac{1}{N}\Big) = \alpha_{t+1}(N - t - A_t) - 1 + O\Big(\frac{1}{N}\Big),$$
where
$$\alpha_t = p\nu_t = p\binom{N-t-1}{d-2}.$$
Now set
$$\Delta_{t+1} := A_{t+1} - A_t - D_{t+1} = \eta_{t+1} - \mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t] \qquad (2.2)$$
such that
$$A_{t+1} = A_t + D_{t+1} + \Delta_{t+1} = (1-\alpha_{t+1})A_t + \alpha_{t+1}(N-t) - 1 + \Delta_{t+1} + O\Big(\frac{1}{N}\Big). \qquad (2.3)$$
By definition we have $\mathbb{E}[\Delta_{t+1} \mid \mathcal{F}_t] = 0$; thus $(\Delta_t)_t$ is a martingale difference sequence. In particular, a simple bound for the variance of $(\Delta_t)_t$ is given by

Lemma 2.2 ([7, Lemma 8]). Let $p$ and $\lambda > 1$ be given as in Theorem 1.1. Then there is a constant $M > 0$ such that for all $1 \leq t \leq N$ we have $\mathbb{V}(\Delta_t \mid \mathcal{F}_{t-1}) \leq M$ with probability 1.

Obviously, the process $A_t$ is a key quantity for the analysis of the size of $C_{\leq k_N}$ (and hence for the size of $C_{\max}$). We want to approximate it by the sum of a deterministic sequence and a martingale. To this end, we define $x_0 = 0$ and
$$x_{t+1} = (1-\alpha_{t+1})x_t + \alpha_{t+1}(N-t) - 1.$$
Then, with $A_0 = k_N$, we obtain
$$A_{t+1} - x_{t+1} = (1-\alpha_{t+1})(A_t - x_t) + \Delta_{t+1} + \varepsilon_{t+1},$$
where $\varepsilon_{t+1}$ is shorthand for the error term at level $t+1$, which is of order $O(1/N)$ (cf. (2.3); for more details we refer the reader to [5, (10)]). So, if we set
$$\beta_t := \prod_{i=1}^t (1-\alpha_i),$$
we arrive at
$$A_t - x_t = \sum_{i=1}^t \frac{\beta_t}{\beta_i}(\Delta_i + \varepsilon_i).$$
By defining
$$S_t := \sum_{i=1}^t \frac{\Delta_i}{\beta_i}, \qquad (2.4)$$
we observe that $(S_t)$ is a martingale. Thus our desired approximation is given by
$$\tilde{A}_t := x_t + \beta_t S_t, \qquad (2.5)$$
and we have by [5, Lemma 3] that
$$|A_t - \tilde{A}_t| = |A_t - x_t - \beta_t S_t| = \Big|\sum_{i=1}^t \frac{\beta_t}{\beta_i}(\Delta_i + \varepsilon_i) - \beta_t\sum_{i=1}^t \frac{\Delta_i}{\beta_i}\Big| = \Big|\sum_{i=1}^t \frac{\beta_t}{\beta_i}\varepsilon_i\Big| = O\Big(\frac{t}{N}\Big) \qquad (2.6)$$
uniformly in $1 \leq t \leq N$.
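The deterministic part $x_t$ can be computed directly from its recursion and compared with the limit shape $N g_{d,\lambda}(t/N)$ identified in (2.7)–(2.9) below. The sketch is our own (function names and the tolerance are ours; $\alpha_t = p\binom{N-t-1}{d-2}$ is taken as reconstructed above) and illustrates that the two trajectories stay within a bounded distance of each other:

```python
from math import comb, exp

def x_trajectory(N, d, lam):
    """Iterate x_{t+1} = (1 - a_{t+1}) x_t + a_{t+1} (N - t) - 1, x_0 = 0,
    with a_{t+1} = p * C(N - t - 2, d - 2) and p = lam / ((d-1) C(N-1, d-1))."""
    p = lam / ((d - 1) * comb(N - 1, d - 1))
    x, xs = 0.0, [0.0]
    for t in range(N):
        a = p * comb(max(N - t - 2, 0), d - 2)   # alpha_{t+1}
        x = (1 - a) * x + a * (N - t) - 1
        xs.append(x)
    return xs

def g(x, d, lam):
    """The function g_{d,lambda} of (2.7)."""
    return 1 - x - exp(-lam / (d - 1) * (1 - (1 - x) ** (d - 1)))
```

In rescaled time $s = t/N$, the recursion is an Euler step of $y' = \lambda(1-s)^{d-2}(1-s-y) - 1$, which $g_{d,\lambda}$ solves; this is the content of (2.9).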
Note that the behaviour of $x_t$ can be determined as well (see [5, (15)]). Indeed, define $g_{d,\lambda}$ as
$$g_{d,\lambda}(x) = 1 - x - \exp\left(-\frac{\lambda}{d-1}\left(1 - (1-x)^{d-1}\right)\right) \qquad (2.7)$$
and
$$f(t) := f_{N,d,\lambda}(t) := N g_{d,\lambda}\Big(\frac{t}{N}\Big). \qquad (2.8)$$
Then we obtain, uniformly in $0 \leq t \leq N$,
$$x_t = f(t) + O(1). \qquad (2.9)$$
In a nutshell, the idea is now the following: The first time $t$ at which $A_t$ is 0 is the size of the union of the connected components of the first $k_N$ vertices. If $k_N$ is chosen large enough, this union will contain the giant component with overwhelming probability (where we say that an event has "overwhelming probability" when the probability of the complement of the event is negligible on the chosen moderate deviation scale). On the other hand, also with overwhelming probability there is only one component with a size larger than $N^\xi$ for suitable $\xi < 1$, so that the union of the components of the first $k_N$ vertices determines the size of the largest component with overwhelming probability. Moreover, as observed above, we may safely replace $A_t$ by $\tilde{A}_t$ when considering these quantities on a moderate deviation scale. That is to say,
$$P(A_t/N^\alpha \geq x) \sim P(\tilde{A}_t/N^\alpha \geq x) \qquad (2.10)$$
for any $x$ and $\alpha > 0$, and likewise
$$P(A_t/N^\alpha \leq x) \sim P(\tilde{A}_t/N^\alpha \leq x). \qquad (2.11)$$
Moreover, notice that the stochastic behaviour of $\tilde{A}_t$ is governed by the martingale $(S_t)$ defined in (2.4). We will therefore analyze its moderate deviations in the next section.

3. Moderate deviations for the martingale $S_t$ defined in (2.4)

As we will see in Section 4, the moderate deviations for the size of the giant component can be played back to the moderate deviations for the martingale $(S_t)$ defined in (2.4); in this section we will prove a moderate deviations principle for this martingale. Our main tool is the Gärtner–Ellis theorem [11, Theorem 2.3.6]. Note that we cannot simply quote MDPs for martingales from [10] or [13], because in our context the distributions of the martingale differences depend on $N$.
The central result of this section is
Theorem 3.1.
Consider the process $(S_n)$ defined in (2.4). For $\zeta \in (-\infty, \infty)$ and $1/2 < \alpha < 1$, put $\gamma(N) = \rho_\lambda N + \zeta N^\alpha$ and assume for simplicity that $\gamma(N)$ is an integer to avoid rounding. Then, for any choice of $\zeta$, the sequence of random variables
$$\frac{\beta_{\gamma(N)} S_{\gamma(N)}}{N^\alpha}$$
satisfies an MDP with speed $N^{1-2\alpha}$ and rate function $I(x) = x^2/(2c)$, where $c$ is given by
$$c = c_{d,\lambda} = \lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda). \qquad (3.1)$$
Here we write $\rho_\lambda$ for $\rho_{d,\lambda}$, and $\lambda^*$ is the dual branching parameter given in (1.4).

The key idea will be to employ the Gärtner–Ellis theorem [11, Theorem 2.3.6]. To this end we need to study the moment generating function of $S_n$ on the level of moderate deviations, i.e. we need to establish the existence of
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1} \beta_{\gamma(N)} S_{\gamma(N)}\right)\right)$$
for $t \in \mathbb{R}$. We will expand the moment generating function into a Taylor series up to the third order. However, to compute this we need some preparation. Recall our definitions (2.2) and (2.4), from which we obtain $\Delta_{t+1} = \eta_{t+1} - \mathbb{E}[\eta_{t+1} \mid \mathcal{F}_t]$ and $S_t = \sum_{i=1}^t \Delta_i/\beta_i$.

3.1. Moments of $\eta_i$. The essential point in the Taylor expansion is that the conditional variances of $\Delta_i$, and thus of $\eta_i$, depend on the number of unseen vertices at time $i$, $U_i$. We will therefore show a rough concentration result for $U_i$. To this end, recall that $\eta_i$ is the number of unseen vertices that are set active in the $i$-th step. Thus it holds that
$$\eta_i = \sum_{j \in \mathcal{U}_i} \mathbb{1}\{\exists \text{ an unexplored hyperedge containing } j \text{ and the currently active vertex}\},$$
and for any $k \geq 2$,
$$\eta_i^k = \sum_{j_{i_1}, \ldots, j_{i_k} \in \mathcal{U}_i,\ \{i_1, \ldots, i_k\} \subset \{1, \ldots, N\}} \mathbb{1}\{j_{i_1}, \ldots, j_{i_k} \text{ are activated in step } i\}.$$
Assume that from stage $i$ to $i+1$ the vertex $v_i$ is being explored. There are various ways to activate $j_1, \ldots, j_k$. Without loss of generality, we assume that the $j_i$ are pairwise different; otherwise this problem is reduced to estimating a lower moment of $\eta_i$. One needs to activate a set of hyperedges $e_1, \ldots, e_m$ such that $j_1, \ldots, j_k$ are contained in these hyperedges. Because we have to choose the remaining vertices of the hyperedge from the unexplored vertices, the probability that a fixed set of $l$ of the vertices is contained in one hyperedge is given by
$$\pi_l = 1 - (1-p)^{\binom{N-i-l-1}{d-l-1}} \sim p\binom{N-i-l-1}{d-l-1} \sim \lambda(d-2)\cdots(d-l)\frac{(N-i)^{d-l-1}}{N^{d-1}}.$$
That is, if $l_1, \ldots, l_m$ sum up to $k$, then $\prod_j \pi_{l_j} \leq D\lambda^m N^{-k}$ for some constant $D$. On the other hand, there is a constant $C$ such that there are at most $C^{k/d+1}$ ways to write $k$ as a sum of integers at most $d-1$. Altogether with (2.1), this gives for $k = 3$ and any $1 \leq i \leq N$
$$\mathbb{E}\left[|\eta_i|^3 \mid \mathcal{F}_{i-1}\right] \leq L\,\mathbb{V}(\eta_i \mid \mathcal{F}_{i-1}) \qquad (3.2)$$
for some constant $L > 0$, and consequently
$$\mathbb{E}\left[|\Delta_i|^3 \mid \mathcal{F}_{i-1}\right] \leq L\,\mathbb{V}(\Delta_i \mid \mathcal{F}_{i-1}). \qquad (3.3)$$

3.2. Exponential estimates.
Moreover, we will need a Hoeffding–Azuma-type inequality for $S_N$ (e.g. [7, Lemma 12]): for a constant $c > 0$, it holds that
$$P\Big(\max_{1 \leq t \leq N} S_t \geq y\Big) \leq \exp\big(-c\,y^2/N\big).$$
In particular, taking $y = N^\beta$ for $\beta > 1/2$,
$$P\Big(\max_{1 \leq t \leq N} S_t \geq N^\beta\Big) \leq \exp\big(-c\,N^{2\beta-1}\big). \qquad (3.4)$$
(This could, in fact, also be proved using the results in [9] together with our above considerations.)

3.3. Taylor expansion of $\Delta_i$ up to third order. As we know,
$$U_t = N - t - A_t, \qquad A_t = \tilde{A}_t + O\Big(\frac{t}{N}\Big), \qquad \tilde{A}_t = x_t + \beta_t S_t,$$
and the trajectory of $x_t$ is given by (2.8). Denote (cf. [5], equation (22))
$$u_i = N \exp\left(-\frac{\lambda}{d-1}\left(1 - \Big(1 - \frac{i}{N}\Big)^{d-1}\right)\right). \qquad (3.5)$$
Hence we have $U_i - u_i = -\beta_i S_i + O(1)$. Fix a $\beta$ with $1/2 < \alpha < \beta < 1$ and a sequence $b_N$ with $b_N \to \infty$ and $b_N = O(N^\beta)$, thus $b_N = o(N)$. Let us define the event
$$\Sigma_N = \{\forall\, 1 \leq i \leq N : |U_{\gamma(i)} - u_{\gamma(i)}| \leq b_N\},$$
which has probability at least $1 - \exp(-\eta N^{2\beta-1})$ for an appropriate $\eta > 0$: by (3.4), the complement $\{\exists\, i,\ 1 \leq i \leq N, \text{ such that } |U_{\gamma(i)} - u_{\gamma(i)}| > b_N\}$ has probability at most $\exp(-c' N^{2\beta-1})$ for some constant $c' > 0$. ($c'$ differs from $c$ in (3.4) by at most an absolute constant factor.)
We use the martingale property to get
$$\mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) = \mathbb{E}\left(\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right) \,\Big|\, \mathcal{F}_{\gamma(N)-1}\right]\right)$$
$$= \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)-1}\right)\,\mathbb{E}\left[\exp\left(t N^{\alpha-1}\Delta_{\gamma(N)}\right) \,\Big|\, \mathcal{F}_{\gamma(N)-1}\right]\right). \qquad (3.6)$$
By expanding up to third order, using that $\mathbb{E}[\Delta_{i+1} \mid \mathcal{F}_i] = 0$ for all $i$, applying (2.1) as well as the crude bounds on the third moment derived in (3.2), we obtain
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right) \Big| \mathcal{F}_{\gamma(i)}\right] = 1 + \frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\left(\lambda(d-2)\Big(1-\frac{\gamma(i)}{N}\Big)^{d-3}\frac{U_{\gamma(i)}^2}{N^2} + \lambda\Big(1-\frac{\gamma(i)}{N}\Big)^{d-2}\frac{U_{\gamma(i)}}{N}\right) + O\left(t^3 N^{3\alpha-3}\frac{U_{\gamma(i)}^2}{N^2}\right). \qquad (3.7)$$
Here we used the facts that $U_n \leq N$ and $\beta_t = \prod_{i=1}^t (1-\alpha_i)$. We arrive at
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right)\mathbb{1}_{\Sigma_N} \Big| \mathcal{F}_{\gamma(i)}\right] = \exp\left(\frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\left(\lambda(d-2)\Big(1-\frac{\gamma(i)}{N}\Big)^{d-3}\frac{u_{\gamma(i)}^2}{N^2} + \lambda\Big(1-\frac{\gamma(i)}{N}\Big)^{d-2}\frac{u_{\gamma(i)}}{N}\right) + O\Big(t^2 N^{2\alpha-2}\frac{b_N}{N} + t^3 N^{3\alpha-3}\Big)\right)(1 + o(1)).$$
Redoing this conditioning $\gamma(N)$ times, we finally see that
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right] = \exp\left(\frac{t^2 N^{2\alpha-2}}{2}\sum_{i=0}^{\gamma(N)-1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right) + o(N^{2\alpha-1})\right), \qquad (3.8)$$
which gives our desired Taylor expansion of the moment generating function conditioned on the event $\Sigma_N$.

3.4. Boundedness of the moment generating function. Let us briefly recall some notation:
$$\alpha_t = p\nu_t = p\binom{N-t-1}{d-2}, \quad \Delta_{t+1} = A_{t+1} - A_t - D_{t+1}, \quad A_t = A_{t-1} + \eta_t - 1,$$
$$D_{t+1} = \mathbb{E}[\eta_{t+1} - 1 \mid \mathcal{F}_t], \quad \beta_t = \prod_{i=1}^t (1-\alpha_i), \quad S_t = \sum_{i=1}^t \frac{\Delta_i}{\beta_i}.$$
Note that $\beta_t/\beta_i = \prod_{j=i+1}^t (1-\alpha_j)$ is between 0 and 1. Moreover, recall from Lemma 2.2 that $\mathbb{V}(\Delta_t \mid \mathcal{F}_{t-1}) \leq M$ for all $t$. Using the fact that $U_i \leq N$ for all $1 \leq i \leq \gamma(N)$, analogously to (3.7) we can derive the following expansion for all $0 \leq i \leq \gamma(N)$ and any fixed $t$:
$$\mathbb{E}\left[\exp\left(t N^{\alpha-1}\frac{\beta_{\gamma(N)}}{\beta_{\gamma(i)+1}}\Delta_{\gamma(i)+1}\right) \Big| \mathcal{F}_{\gamma(i)}\right] = 1 + \frac{t^2 N^{2\alpha-2}}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2}\mathbb{V}\big(\Delta_{\gamma(i)+1} \mid \mathcal{F}_{\gamma(i)}\big) + O(t^3 N^{3\alpha-3})$$
$$\leq 1 + \frac{t^2}{2}\frac{\beta_{\gamma(N)}^2}{\beta_{\gamma(i)+1}^2} M N^{2\alpha-2} + O(t^3 N^{3\alpha-3}) \leq \exp\left(t^2 C N^{2\alpha-2}\right),$$
where $C > 0$ is a suitable constant. Iterating over $0 \leq i \leq \gamma(N)$ and employing the martingale difference sequence property as in (3.6) yields
$$\mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \leq \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)-1}\right)\mathbb{E}\left[\exp\left(t N^{\alpha-1}\Delta_{\gamma(N)}\right) \Big| \mathcal{F}_{\gamma(N)-1}\right]\right) \leq \exp\left(t^2 \tilde{C} N^{2\alpha-1}\right), \qquad (3.9)$$
where $\tilde{C} > 0$ is a suitable constant.
3.5. The goal. With the above preparation, we are finally ready to prove Theorem 3.1.
Proof of Theorem 3.1.
On the one hand, by expanding the moment generating function in (3.8), taking the logarithm and multiplying by $N^{1-2\alpha}$, we see that
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \geq \lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right]$$
$$= \lim_{N \to \infty} \frac{t^2}{2N}\sum_{i=0}^{\gamma(N)-1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right)$$
$$= \frac{t^2}{2}\left(\lambda(1-\rho_\lambda)^2 - \lambda^*(1-\rho_\lambda)^2 + \rho_\lambda(1-\rho_\lambda)\right) = \frac{t^2 c}{2}, \qquad (3.10)$$
where we used the abbreviation $\rho_\lambda = \rho_{d,\lambda}$, and the last line follows from the asymptotics in [5, (23)] for the sum appearing in the second line. Indeed, by $\gamma(N) = \rho_\lambda N + \zeta N^\alpha$ and the explicit formula for $u_i$ in (3.5), we have $u_i \leq cN$ for some $c > 0$ and all $0 \leq i \leq N$. Moreover, each $\beta_i$ is a constant between 0 and 1. Thus each summand in
$$\sum_{i=\rho_\lambda N}^{\rho_\lambda N + \zeta N^\alpha}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right)$$
is uniformly bounded. Therefore, the sum over $0 \leq i \leq \gamma(N)-1$ and the sum over $0 \leq i \leq \rho_\lambda N - 1$ of the same summands differ only by $o(N)$, so that computing
$$\frac{1}{N}\sum_{i=0}^{\rho_\lambda N - 1}\frac{\beta_{\gamma(N)}^2}{\beta_{i+1}^2}\left(\lambda(d-2)\Big(1-\frac{i}{N}\Big)^{d-3}\frac{u_i^2}{N^2} + \lambda\Big(1-\frac{i}{N}\Big)^{d-2}\frac{u_i}{N}\right) + o(1)$$
yields the claim.

On the other hand, with the bound for the moment generating function derived in (3.9) and the fact that $1/2 < \alpha < \beta < 1$, it holds that
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N^c}\right] \leq \lim_{N \to \infty} N^{1-2\alpha} \log\left(\exp\left(t^2\tilde{C} N^{2\alpha-1}\right)P(\Sigma_N^c)\right)$$
$$\leq \lim_{N \to \infty} N^{1-2\alpha}\left(t^2\tilde{C} N^{2\alpha-1} + \log P(\Sigma_N^c)\right) \leq \lim_{N \to \infty} N^{1-2\alpha}\left(t^2\tilde{C} N^{2\alpha-1} - \eta N^{2\beta-1}\right) = -\infty. \qquad (3.11)$$
Therefore we can conclude
$$\lim_{N \to \infty} N^{1-2\alpha} \log \mathbb{E}\left(\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\right) \leq \lim_{N \to \infty} N^{1-2\alpha} \log\left(2\max\left\{\mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N}\right],\ \mathbb{E}\left[\exp\left(t N^{\alpha-1}\beta_{\gamma(N)} S_{\gamma(N)}\right)\mathbb{1}_{\Sigma_N^c}\right]\right\}\right) = \frac{t^2 c}{2}.$$
Here the last-but-one line follows by combining equations (3.10) and (3.11). Since the parabola $ct^2/2$ is the Legendre transform of $x^2/(2c)$, the Gärtner–Ellis theorem applies and $x^2/(2c)$ is then the rate function in Theorem 3.1. □

4. Choice of $k_N$

Following the idea we sketched in the previous sections, we seek a smart choice for $k_N$ such that the union of the connected components of the first $k_N$ vertices does not essentially differ from the giant component with overwhelming probability. This union will be called $C_{\leq k_N}$ in the sequel; its size will be denoted by $|C_{\leq k_N}|$. Let us first recall two very useful results from the literature: it has been shown that $|C_{\max}|$ concentrates on $\rho_\lambda N$ and that the second largest component, denoted by $C_{\text{second}}$, is unlikely to be large as well.

Remark.
The notation $f(N) = \Omega(g(N))$ is used if there exist $N_0 \in \mathbb{N}$ and $C > 0$ such that $f(N) \geq C g(N)$ for all $N \geq N_0$.

Theorem 4.1 ([7, Theorem 4]). With the same assumptions as in Theorem 1.1, i.e. $\lambda = 1 + \varepsilon$ with $\varepsilon = O(1)$ as well as $\varepsilon^3 N \to \infty$: if $\omega = \omega(N) \to \infty$ and $\omega = O\big(\sqrt{\varepsilon^3 N}\big)$, then
$$P\left(\big||C_{\max}| - \rho_\lambda N\big| \geq \omega\sqrt{N/\varepsilon}\right) = \exp\big(-\Omega(\omega^2)\big).$$
Furthermore, if $L = L(N)$ fulfills $\varepsilon^3 L \to \infty$ and $L = O(\varepsilon N)$, then there exists $C > 0$ such that the second largest component $C_{\text{second}}$ in $G_d(N,p)$ satisfies
$$P(|C_{\text{second}}| > L) \leq \frac{C}{\varepsilon^3 N L}\exp(-\varepsilon^3 L/C) \qquad (4.1)$$
for large enough $N$.

Choice of $k_N$: we set $k_N = N^\gamma$ for a
$$2\alpha - 1 < \gamma < \alpha \qquad (4.2)$$
(recall that $\alpha \in (1/2, 1)$). The upper bound on $k_N$ will be discussed more precisely in Section 6, whereas the lower bound arises from the fact that the union of connected components starting with $k_N$ vertices should contain the giant component with overwhelming probability:

Proposition 4.2.
For any $1/2 < \alpha < 1$,
$$P\big(|C_{\max}| > |C_{\leq k_N}|\big) \leq \exp(-C k_N)$$
for some $C > 0$ and $N$ large enough.

Proof. Note that
$$P(|C_{\max}| > |C_{\leq k_N}|) = P(\forall\, 1 \leq i \leq k_N : i \notin C_{\max}) = P(1 \notin C_{\max})\,P(2 \notin C_{\max} \mid 1 \notin C_{\max}) \cdots P(k_N \notin C_{\max} \mid 1, \ldots, k_N - 1 \notin C_{\max}).$$
For all $\xi > 0$ it holds that
$$P(1 \notin C_{\max}) = P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| > \xi N) + P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N)$$
$$\leq P(||C_{\max}| - \rho_\lambda N| > \xi N) + P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N).$$
Let us set $0 < \xi < \rho_\lambda$ and apply the large deviation bound given in Theorem 4.1 with $\omega = \xi\sqrt{\varepsilon N}$; then the first term becomes
$$P(||C_{\max}| - \rho_\lambda N| > \xi N) \leq \exp(-c\xi^2\varepsilon N)$$
for some constant $c > 0$. This converges to 0 by the assumption on $\varepsilon$. On the other hand, note that on the event $\{\rho_\lambda N - \xi N \leq |C_{\max}| \leq \rho_\lambda N + \xi N\}$ there are at most $N - (\rho_\lambda N - \xi N)$ vertices which are not contained in $C_{\max}$. Then we can bound the second term by
$$P(1 \notin C_{\max},\ ||C_{\max}| - \rho_\lambda N| \leq \xi N) \leq 1 - (\rho_\lambda - \xi).$$
Therefore we obtain
$$P(1 \notin C_{\max}) \leq 1 - (\rho_\lambda - \xi) + \exp(-c\varepsilon\xi^2 N),$$
and likewise
$$P(j \notin C_{\max} \mid 1, \ldots, j-1 \notin C_{\max}) \leq P(1 \notin C_{\max})$$
for all $2 \leq j \leq k_N$. Consequently, there exists an appropriate $C > 0$ such that
$$P(|C_{\max}| > |C_{\leq k_N}|) \leq \big(P(1 \notin C_{\max})\big)^{k_N} \leq \big(1 - (\rho_\lambda - \xi) + \exp(-c\varepsilon\xi^2 N)\big)^{k_N} \leq \exp(-C k_N). \qquad \Box$$

5. A moderate deviations principle for $|C_{\leq k_N}|$

In this section we are going to show an MDP for $|C_{\leq k_N}|$ — in other words, for the size of the union of the connected components obtained by setting the first $k_N$ vertices active in the exploration process — when $k_N$ is of the right size. We will prove this MDP using the MDP for the martingale part of the exploration process derived in Theorem 3.1. In the next section we will see that, if $k_N$ is large enough, $|C_{\leq k_N}|$ and $|C_{\max}|$ only differ by an amount that is negligible on the moderate deviations scale.

Theorem 5.1.
Consider a probability for the presence of a hyperedge as in (1.2) with $\lambda = 1 + \varepsilon$ as in Theorem 1.1. Take $k_N = N^\gamma$ for $2\alpha - 1 < \gamma < \alpha$ as in (4.2). Then for any $1/2 < \alpha < 1$ and $y > 0$ we have that
$$\lim_{N \to \infty} N^{1-2\alpha} \log P\big(\big||C_{\leq k_N}| - \rho_\lambda N\big| > yN^\alpha\big) = -J(y),$$
where $J$ is given by (1.9), i.e.
$$J(y) = I(y(1-\lambda^*)) = \frac{y^2(1-\lambda^*)^2}{2c},$$
and $I(\cdot)$ as well as $c$ are given explicitly in Theorem 3.1.

Remark 5.2. Due to the topological structure of $\mathbb{R}$, Theorem 5.1 is indeed a moderate deviations principle; see [11, Section 3.7].

We will break up the proof of Theorem 5.1 into several lemmas.
5.1. The upper bounds.

Lemma 5.3. In the situation and with the notation of Theorem 5.1, we have for any $1/2 < \alpha < 1$ and $y > 0$ that
$$\lim_{N \to \infty} N^{1-2\alpha} \log P\big(|C_{\leq k_N}| > yN^\alpha + \rho_\lambda N\big) \leq -J(y).$$
Let yN α + ρ λ N =: m + y . Firstly, recall the approximation process ˜ A t := x t + β t S t , and the trajectory of x t given by (2.9), and we see x m + y = f ( m + y ) + O (1) = N g d,λ ( m + y N ) + O (1) , (5.1)where g d,λ is given by (2.7): g d,λ ( x ) = 1 − x − exp (cid:18) − λd − (cid:0) − (1 − x ) d − (cid:1)(cid:19) . We define h ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) . Note that h ′ ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) (cid:0) − λ (1 − yN α − − ρ λ ) d − N α − (cid:1) , and h ′′ ( y ) = exp (cid:18) − λd − (cid:0) − (1 − yN α − − ρ λ ) d − (cid:1)(cid:19) (cid:0) λ (1 − yN α − − ρ λ ) d − N α − + λ ( d − − yN α − − ρ λ ) d − N α − (cid:1) . Thus we may expand h in y = 0 to obtain h ( y ) = exp (cid:18) − λd − (cid:0) − (1 − ρ λ ) d − (cid:1)(cid:19) (cid:0) − λ (1 − ρ λ ) d − yN α − (cid:1) + O ( N α − ) . Inserting the above calculation into g d,λ yields g d,λ ( yN α − + ρ λ )=1 − yN α − − ρ λ − h ( y )= − yN α − + exp (cid:18) − λd − (cid:0) − (1 − ρ λ ) d − (cid:1)(cid:19) (cid:0) λ (1 − ρ λ ) d − yN α − (cid:1) + O ( N α − )= − yN α − + λ (1 − ρ λ ) d − yN α − + O ( N α − ) , where we used the fact that g d,λ ( ρ λ ) = 0 by (1.5). Therefore, we conclude bythe definition of x t in (5.1) x m + y = N g d,λ ( yN α − + ρ λ ) + O (1)= − y (cid:0) − λ (1 − ρ λ ) d − (cid:1) N α + O ( N α − ) + O (1)= − y (1 − λ ∗ ) N α + o ( N α ) + O (1) , (5.2)where in the last line we used the abbreviation λ ∗ = λ (1 − ρ d,λ ) d − in (1.4).Finally, recall that ˜ A is the approximation process given by ˜ A t = x t + β t S t in(2.5) and E ˜ A t = x t (as well as (2.10), (2.11)). We observe P (cid:0) | C ≤ k N | > m + y (cid:1) = P (cid:0) ∀ m ≤ m + y : A m > (cid:1) ≤ P (cid:16) A m + y > (cid:17) = P (cid:18) A m + y − E A m + y N α > − E A m + y N α (cid:19) ∼ P ˜ A m + y − E ˜ A m + y N α > − E ˜ A m + y N α ! 
= P (cid:18) β m + y S m + y N α > − x m + y N α (cid:19) = P (cid:18) β m + y S m + y N α > y (1 − λ ∗ ) + o (1) (cid:19) . Applying Theorem 3.1, it yieldslim N →∞ N α − log P (cid:0) | C ≤ k N | > m + y (cid:1) ≤ lim N →∞ N α − log P (cid:18) β m + y S m + y N α > y (1 − λ ∗ ) + o (1) (cid:19) = − I ( y (1 − λ ∗ )) = − J ( y ) . (cid:3) Lemma 5.4.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < -yN^\alpha + \rho_\lambda N\right) \le -J(y).$$
Proof.
Analogously to $m_{+y}$, let us define $m_{-y} := -yN^\alpha + \rho_\lambda N$. We observe for each $\zeta$ with $\alpha < \zeta < 1$
$$P\left(|C^{\le k_N}| < m_{-y}\right) = P\left(\exists\, m < m_{-y} : A_m = 0\right) \le \sum_{m=-yN^\zeta+\rho_\lambda N}^{m_{-y}} P(A_m = 0) + P\left(\exists\, m < -yN^\zeta+\rho_\lambda N : A_m = 0\right) = \sum_{m=-yN^\zeta+\rho_\lambda N}^{m_{-y}} P(A_m = 0) + P\left(|C^{\le k_N}| < -yN^\zeta + \rho_\lambda N\right). \qquad (5.3)$$
In particular, the second term in (5.3) is negligible on the chosen moderate deviations scale. We pick $\omega = y\sqrt{\varepsilon}\,N^{\zeta-1/2}$, and one can verify that $\omega = O(\sqrt{\varepsilon N})$. Indeed, it is implied by Theorem 4.1 that for some constant $C > 0$
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C_{\max}| < -yN^\zeta + \rho_\lambda N\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\zeta\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > \omega\sqrt{N/\varepsilon}\right) \le \lim_{N\to\infty} -Cy^2\varepsilon N^{2\zeta-1} N^{1-2\alpha} = -\infty.$$
Here, for the last equality to hold we need to choose $\zeta$ appropriately. This is done as follows: recall that we require condition (1.8), i.e. $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a given $\iota > 0$.
(1) If $2-2\alpha-\iota < 1$, i.e. if $\alpha > \tfrac12 - \tfrac{\iota}{2}$, we get $\varepsilon^3 N^{2-2\alpha-\iota} \to \infty$. We set $\zeta > \max\{1-\tfrac{\iota}{2}, \alpha\}$, and thus obtain $2\zeta - 2\alpha > 2 - 2\alpha - \iota$. This ensures $\varepsilon N^{2\zeta-2\alpha} \to \infty$.
(2) If $2-2\alpha-\iota > 1$, i.e. if $\alpha < \tfrac12 - \tfrac{\iota}{2}$, we see that $\varepsilon^3 N \to \infty$. In this case define $\zeta > \alpha + \tfrac14$, which implies $2\zeta - 2\alpha > \tfrac12$. Hence, it follows that $\varepsilon N^{2\zeta-2\alpha} \to \infty$.
(3) If $2-2\alpha-\iota = 1$, any of the above choices for $\zeta$ can be applied.
Now we fix $y$ and $\zeta$ satisfying the above conditions. For $m \in [-yN^\zeta+\rho_\lambda N, m_{-y}]$, we can find a $\delta$ with $\alpha \le \delta \le \zeta$ such that $m = m_\delta := -yN^\delta + \rho_\lambda N$. Note that $\delta$ will depend on $N$.
We distinguish the following two cases via the sets $M_1$ and $M_2$ defined below (to simplify notation, we assume that the endpoints of the interval $[-yN^\zeta+\rho_\lambda N, m_{-y}]$ are integers, to avoid irrelevant rounding). Firstly,
$$M_1 := \left\{ m \in [-yN^\zeta+\rho_\lambda N, m_{-y}] : \frac{m - \rho_\lambda N}{m_{-y} - \rho_\lambda N} \xrightarrow{N\to\infty} \infty \right\}.$$
Note that $M_1$ contains exactly those $m$ for which $\liminf \delta = \liminf \delta_N > \alpha$, when we write $m$ in the form $m_\delta$ as above. Applying Theorem 3.1 for $m_\delta = -yN^\delta + \rho_\lambda N$ in the role of $\gamma(N)$ yields that $\beta_{m_\delta} S_{m_\delta}$ satisfies an MDP with speed $N^{2\delta-1}$ and rate function $I(x) = x^2/(2c)$, where $c$ was given explicitly in (3.1).
By (2.10) and (2.11), we apply Theorem 3.1 to the summands in the first term of (5.3) to obtain for all $m = m_\delta \in M_1$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} \le 0) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\tilde A_{m_\delta} \le 0\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right) \le \lim_{N\to\infty} N^{2\delta-1} N^{1-2\alpha} \left[N^{1-2\delta} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right)\right] = -\infty.$$
The last step follows, since $x_{m_\delta}/N^\delta$ converges in $\mathbb{R}$ so that the MDP for $\beta_{m_\delta} S_{m_\delta}$ applies, while $N^{2\delta-1}N^{1-2\alpha} = N^{2\delta-2\alpha} \to \infty$ on $M_1$. Indeed, we can expand $x_{m_\delta}$ analogously to (5.2) to see that $x_{m_\delta}$ is of the order $N^\delta$.
Secondly, define
$$M_2 := \left\{ m \in [-yN^\zeta+\rho_\lambda N, m_{-y}] : \frac{m - \rho_\lambda N}{m_{-y} - \rho_\lambda N} \xrightarrow{N\to\infty} \text{const} \right\}.$$
Note that $M_2$ contains exactly those $m$ for which $\lim \delta = \lim \delta_N = \alpha$, when we write $m$ in the form $m_\delta$ as above.
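The expansions used here and in (5.2) rest on the linearisation of $g_{d,\lambda}$ at $\rho_\lambda$, namely $g_{d,\lambda}(\rho_\lambda + u) \approx -(1-\lambda^*)u$ for small $u$. This can be checked numerically. The following Python sketch is our own illustration, assuming $g_{d,\lambda}$ as in (2.7), $\rho_\lambda$ as its positive zero as in (1.5), and $\lambda^* = \lambda(1-\rho_\lambda)^{d-1}$ as in (1.4); the helper names are ours.

```python
import math

def g(rho, d, lam):
    # g_{d,lam}(x) = 1 - x - exp(-(lam/(d-1)) * (1 - (1-x)^(d-1))),
    # our reading of (2.7); its positive zero is rho_lambda.
    return 1 - rho - math.exp(-(lam / (d - 1)) * (1 - (1 - rho) ** (d - 1)))

def rho_lambda(d, lam, tol=1e-12):
    # Positive zero of g via bisection: for lam > 1, g is positive
    # just to the right of 0 and negative at 1.
    lo, hi = 1e-6, 1.0
    assert g(lo, d, lam) > 0 > g(hi, d, lam)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid, d, lam) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

d, lam = 3, 2.0                          # a supercritical example, lam > 1
rho = rho_lambda(d, lam)
lam_star = lam * (1 - rho) ** (d - 1)    # dual branching factor lambda^*
u = 1e-6
slope = g(rho + u, d, lam) / u           # should be close to -(1 - lam_star)
print(rho, lam_star, slope)
```

For $d = 3$, $\lambda = 2$ this gives $\rho_\lambda \approx 0.549$ and $\lambda^* \approx 0.41 < 1$, and the difference quotient of $g_{d,\lambda}$ at $\rho_\lambda$ indeed matches $-(1-\lambda^*)$, which is the slope appearing in (5.2).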
Applying Theorem 3.1 together with (2.10), (2.11), it holds for all $m = m_\delta \in M_2$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \log P(A_{m_\delta} \le 0) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\tilde A_{m_\delta} \le 0\right) \le \lim_{N\to\infty} N^{2\delta-1} N^{1-2\alpha} \left[N^{1-2\delta} \log P\left(\frac{\beta_{m_\delta} S_{m_\delta}}{N^\delta} \le -\frac{x_{m_\delta}}{N^\delta}\right)\right] = -I(-y(1-\lambda^*)) = -J(y).$$
Therefore, since the number of summands in (5.3) is at most $N$ and hence negligible on this scale, we conclude
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < m_{-y}\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log \max_{m \in [-yN^\zeta+\rho_\lambda N,\, m_{-y}]} P(A_m = 0) \le \lim_{N\to\infty} N^{1-2\alpha} \max_{m_\delta \in [-yN^\zeta+\rho_\lambda N,\, m_{-y}]} \log P(A_{m_\delta} = 0) \le -J(y). \qquad\square$$

5.2. The lower bounds.
In order to derive the corresponding lower bounds for our MDP, we need some preparations to get familiar with the properties of the exploration process.

5.2.1.
Recap of the exploration process.
Let $0 = t_0 < t_1 < t_2 < \cdots < t_l = N$ enumerate the set $\{t : A_t - A_{t-1} = -1\}$, which collects the moments where the exploration starts with a new component of the hypergraph. Let
$$C_t = \left|\{i : 0 \le i < t,\ A_i - A_{i-1} = -1\}\right|$$
be the number of components which have been explored by time $t$. Furthermore, we define the random walk $X_t = A_t - C_t$. Recall from the definition of $\lambda = 1+\varepsilon$ in Theorem 1.1 that we have $\varepsilon = O(1)$ as well as $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a given $\iota > 0$ (in particular, $\varepsilon^3 N \to \infty$). Next fix the function $\omega = \omega(N)$ satisfying
$$\omega = \omega(N) \to \infty \quad\text{and}\quad \omega \le C\sqrt{\varepsilon N} \qquad (5.4)$$
for some constant $C > 0$. We define
$$t^* := \omega\sqrt{N/\varepsilon}.$$
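To build intuition for these definitions, the exploration can be simulated. The following Python sketch is our own illustration and not the paper's formal construction; in particular, the sampling of $G_d(N,p)$ with $p = \lambda(d-2)!/N^{d-1}$ and the bookkeeping conventions for $A_t$, $C_t$ and $X_t = A_t - C_t$ are simplified assumptions.

```python
import itertools
import math
import random
from collections import defaultdict

def sample_hypergraph(N, d, lam, rng):
    # Sample G_d(N, p): every d-subset of the N vertices is a hyperedge
    # independently with p = lam*(d-2)!/N^(d-1) (assumed parametrisation).
    p = lam * math.factorial(d - 2) / N ** (d - 1)
    return [e for e in itertools.combinations(range(N), d) if rng.random() < p]

def explore(N, edges):
    # Vertex-by-vertex exploration: after t steps, A_t = number of active
    # vertices, C_t = number of components started, X_t = A_t - C_t.
    incident = defaultdict(list)
    for e in edges:
        for v in e:
            incident[v].append(e)
    status = ["unexplored"] * N
    A, C, X = [0], [0], [0]
    active, comps = [], 0
    for t in range(1, N + 1):
        if not active:  # previous component finished: start a new one
            v = next(u for u in range(N) if status[u] == "unexplored")
            status[v] = "active"
            active.append(v)
            comps += 1
        v = active.pop()
        status[v] = "explored"      # explore one vertex per time step
        for e in incident[v]:       # activate its unexplored neighbours
            for u in e:
                if status[u] == "unexplored":
                    status[u] = "active"
                    active.append(u)
        A.append(len(active))
        C.append(comps)
        X.append(len(active) - comps)
    return A, C, X

rng = random.Random(1)
N, d, lam = 60, 3, 2.0  # supercritical example: lam > 1
A, C, X = explore(N, sample_hypergraph(N, d, lam, rng))
print("components:", C[-1], "deepest dip of the walk:", min(X))
```

Each time step explores exactly one vertex, and each dip of the walk $X_t$ to a new minimum marks the completion of a component, mirroring the role of $Z$, $T_1$ and $T_2$ below.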
Furthermore, let us denote the number of components completely explored within time $t^*$ by
$$Z := -\inf\{X_t : t \le t^*\}.$$
Finally, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$, set
$$T_1 = \inf\{t : X_t = -Z\} \quad\text{and}\quad T_2 = \inf\{t : X_t = -Z - 1\},$$
where $T_1$ is the time point at which the last component within time $t^*$ is completely explored, while $T_2$ is the time when we finish exploring the next component. Simply by definition, we have $T_1 \le t^* < T_2$. From now on, ignoring the irrelevant rounding to integers, let us set $t^* = \rho_\lambda N$.

The component $\mathcal{C}_{1,2}$. Denote by $\mathcal{C}_{1,2}$ the component which we explore from time $T_1 + 1$ to time $T_2$. Recall that in the supercritical regime there is a unique giant component. Bollobás and Riordan show in [7, Lemma 16] that with probability $1 - \exp(-\Omega(\omega))$ (where $\omega$ is given in (5.4)), on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ is the unique giant component of $G_d(N,p)$. Moreover, the formula for the size of $\mathcal{C}_{1,2}$ (conditioned on $\{C_{\max} \subseteq C^{\le k_N}\}$) is then given by
$$|\mathcal{C}_{1,2}| = t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} \qquad (5.5)$$
in terms of the constructed approximation process $\tilde A_t$, see [6, (21)].

Lemma 5.5.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge -J(y).$$

Proof.
As implied by Proposition 4.2 we have
$$P\left(C_{\max} \not\subseteq C^{\le k_N}\right) \le P\left(|C_{\max}| > |C^{\le k_N}|\right) \le \exp(-Ck_N).$$
By our construction, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ exists, and it satisfies
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N,\ C_{\max} \subseteq C^{\le k_N}\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(|\mathcal{C}_{1,2}| > yN^\alpha + \rho_\lambda N,\ C_{\max} \subseteq C^{\le k_N}\right).$$
Indeed, on the event $\{C_{\max} \subseteq C^{\le k_N}\}$ the component $\mathcal{C}_{1,2}$ is contained in the union of connected components $C^{\le k_N}$. Therefore, by inserting the approximation for $|\mathcal{C}_{1,2}|$ in (5.5) we observe
$$P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge P\left(|\mathcal{C}_{1,2}| > yN^\alpha + \rho_\lambda N \,\middle|\, C_{\max} \subseteq C^{\le k_N}\right) P\left(C_{\max} \subseteq C^{\le k_N}\right) \ge P\left(t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} > yN^\alpha + \rho_\lambda N \,\middle|\, C_{\max} \subseteq C^{\le k_N}\right)\left(1 - \exp(-Ck_N)\right).$$
Finally, using the notation $t^* = \rho_\lambda N$, we arrive at
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| > yN^\alpha + \rho_\lambda N\right) \ge \lim_{N\to\infty} N^{1-2\alpha}\left[\log P\left(t^* + \frac{\tilde A_{t^*}}{1-\lambda^*} > yN^\alpha + \rho_\lambda N\right) + \log\left(1 - \exp(-Ck_N)\right)\right] \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{t^*} S_{t^*}}{N^\alpha} > y(1-\lambda^*)\right) \ge -I(y(1-\lambda^*)) = -J(y). \qquad\square$$

Lemma 5.6.
In the situation and with the notation of Theorem 5.1 we have for any $\tfrac12 < \alpha < 1$ and $y > 0$ that
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < -yN^\alpha + \rho_\lambda N\right) \ge -J(y).$$

Proof.
Again let $m_{-y} = -yN^\alpha + \rho_\lambda N$; by the expansion (5.2) (applied with $-y$ in place of $y$) we obtain
$$x_{m_{-y}} = y(1-\lambda^*)N^\alpha + o(N^\alpha) + O(1).$$
Therefore, using Theorem 3.1 and ignoring the irrelevant rounding between $<$ and $\le$, we obtain
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(|C^{\le k_N}| < m_{-y}\right) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\exists\, m \le m_{-y} : A_m = 0\right) \ge \lim_{N\to\infty} N^{1-2\alpha} \log P\left(A_{m_{-y}} \le 0\right) \sim \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_{-y}} S_{m_{-y}}}{N^\alpha} \le -\frac{x_{m_{-y}}}{N^\alpha}\right) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\frac{\beta_{m_{-y}} S_{m_{-y}}}{N^\alpha} \le -y(1-\lambda^*) + o(1)\right) = -I(-y(1-\lambda^*)) = -J(y). \qquad\square$$

6. Proof of Theorem 1.1: a moderate deviations principle for $|C_{\max}|$

6.1. Compare $|C_{\max}|$ and $|C^{\le k_N}|$. In this subsection we improve Proposition 4.2. We show that, if we allow an error term $r_N = o(N^\alpha)$, we obtain a bound on the probability of the event $|C_{\max}| + r_N < |C^{\le k_N}|$ that is negligible on the moderate deviations scale. Hence, the upper bound on $k_N$ given by (4.2) can be obtained. We remind the reader that $k_N = N^\gamma$ for $2\alpha - 1 < \gamma < \alpha$, where $\tfrac12 < \alpha < 1$.

Lemma 6.1.
Let $r_N = N^\xi$ for $\gamma < \xi < \alpha$. Then
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right)$$
for some constants $M, c > 0$ with $M$ small enough.

Proof. Let us denote the $i$-th largest component of $G_d(N,p)$ by $L_i$, whose size is given by $l_i = |L_i|$. Then, because $C^{\le k_N}$ can at most be as large as the union of the $k_N$ largest components, we get for each $\delta > 0$
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le P\left(|C_{\max}| + r_N < |C_{\max}| + \Big|\bigcup_{i=2}^{k_N} L_i\Big|\right) \le P\left(r_N < \Big|\bigcup_{j=2}^{k_N} L_j\Big|,\ |C_{\max}| > \delta N\right) + P\left(|C_{\max}| \le \delta N\right) \le \sum_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) + P\left(|C_{\max}| \le \delta N\right).$$
Define $\delta N := \rho_\lambda N - \varepsilon N$ with $\varepsilon$ again as defined in Theorem 1.1. We obtain by Theorem 4.1 with $\omega = \sqrt{\varepsilon^3 N}$ that
$$P\left(|C_{\max}| \le \delta N\right) \le P\left(\big||C_{\max}| - \rho_\lambda N\big| \ge \omega\sqrt{N/\varepsilon}\right) \le \exp\left(-c\varepsilon^3 N\right), \qquad (6.1)$$
where $c > 0$ is a constant. Since $\varepsilon = O(1)$ as well as $\varepsilon^3 N^\tau \to \infty$ with $\tau = \min\{1, 2-2\alpha-\iota\}$ for a fixed small $\iota > 0$, we obtain
$$N^{1-2\alpha} \log P\left(|C_{\max}| \le \delta N\right) \le N^{1-2\alpha} \log \exp\left(-c\varepsilon^3 N\right) = -c\varepsilon^3 N^{2-2\alpha} \xrightarrow{N\to\infty} -\infty. \qquad (6.2)$$
Moreover, note that for fixed $a_2 > \cdots > a_{k_N}$ with $a_2 + \cdots + a_{k_N} > r_N$, it holds
$$P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) = \prod_{j=2}^{k_N} P\left(l_j = a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(|C_{\max}| > \delta N\right) \le \prod_{j=3}^{k_N} P\left(l_j = a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(\{l_2 \ge a_2\} \cap \{|C_{\max}| > \delta N\}\right) \le \prod_{j=3}^{k_N} P\left(l_j \ge a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) P\left(l_2 \ge a_2\right).$$
On the one hand, we obtain from (4.1) that for some constants $c, C > 0$
$$P(l_2 > a_2) \le C\,\frac{\varepsilon N}{a_2} \exp\left(-c\varepsilon^2 a_2\right),$$
where $\varepsilon$ is given by the branching factor $\lambda = 1+\varepsilon$ of the original hypergraph $G_d(N,p)$. On the other hand, for each $j \in \{3, \ldots, k_N\}$, the hypergraph $G_d(N,p)$ after removing the components $L_2, \ldots, L_{j-1}$ and $C_{\max}$, conditioned on $\{|C_{\max}| > \delta N\}$, is by [7, Lemma 8.1] (or common sense) again a hypergraph $G_d(N-s_j, p)$, where $s_j \le (1-\delta)N$ denotes the number of removed vertices. Recall that
$$p = \frac{\lambda(d-2)!}{N^{d-1}} = \frac{\lambda_j (d-2)!}{(N-s_j)^{d-1}},$$
so $G_d(N-s_j, p)$ is characterised by its branching factor
$$\lambda_j = \left(1 - \frac{s_j}{N}\right)^{d-1} \lambda,$$
where $\lambda = 1+\varepsilon$ by assumption. Now let us denote by $\varepsilon_j$ the following quantity:
$$\varepsilon_j = 1 - \left(1 - \frac{s_j}{N}\right)^{d-1}(1+\varepsilon).$$
Then we arrive at
$$\lambda_j = 1 - \varepsilon_j. \qquad (6.3)$$
Since $s_j \le (1-\delta)N$, we see by [7, (8.3)] that
$$\varepsilon_j \le 1 - \delta^{d-1}(1+\varepsilon), \qquad c_\delta\, \varepsilon \le \varepsilon_j \le C_\delta\, \varepsilon$$
for some constants $c_\delta, C_\delta > 0$ which depend on $\delta$ but not on $j$. Therefore the branching factor $\lambda_j$ of the new hypergraph $G_d(N-s_j, p)$, given in (6.3), belongs to the subcritical regime. From [7, Theorem 2] we obtain a large deviation bound for the largest component in $G_d(N-s_j, p)$, that is to say, for $L_j$ in $G_d(N,p)$. We thus obtain
$$P\left(l_j \ge a_j \,\middle|\, l_2 = a_2, \ldots, l_{j-1} = a_{j-1},\ |C_{\max}| > \delta N\right) \le C\,\frac{\varepsilon_j N}{a_j}\exp\left(-c\varepsilon_j^2 a_j\right) \le C_\delta\,\frac{\varepsilon N}{a_j}\exp\left(-c_\delta \varepsilon^2 a_j\right).$$
Hence, we finally get for
$M > 0$
$$\sum_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} P\left(l_2 = a_2, \ldots, l_{k_N} = a_{k_N},\ |C_{\max}| > \delta N\right) \le r_N^{k_N} \max_{\substack{a_2 > \cdots > a_{k_N} \\ a_2 + \cdots + a_{k_N} > r_N}} \frac{(C_\delta \varepsilon N)^{k_N}}{\prod_{j=2}^{k_N} a_j}\exp\left(-c_\delta \varepsilon^2 \sum_{j=2}^{k_N} a_j\right) \le \exp\left(-c_\delta \varepsilon^2 r_N + k_N \log(C_\delta \varepsilon r_N N)\right) \le \exp(-Mk_N). \qquad (6.4)$$
The last step follows, since not only $k_N = o(r_N)$, which is implied by $r_N = N^\xi$ with $\gamma < \xi < \alpha$ and $k_N = N^\gamma$ with $2\alpha - 1 < \gamma < \alpha$ in (4.2), but also $k_N = o(N)$ holds. Hence, (6.4) together with (6.1), (6.2) yields
$$P\left(|C_{\max}| + r_N < |C^{\le k_N}|\right) \le \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right)$$
by adjusting the constant $M > 0$. $\square$

6.2. Proof of Theorem 1.1.
As we described before, by an appropriate choice of $k_N$, $|C^{\le k_N}|$ and $|C_{\max}|$ only differ by an amount that is negligible on the moderate deviations scale.

Proof.
Note that we pick $r_N = N^\xi$ for $\gamma < \xi < \alpha$; in particular, it satisfies $r_N = o(N^\alpha)$. Now for all $y > 0$, we estimate the upper tail by applying Lemma 6.1 with the $M$ given there to obtain
$$P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) = P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| \le r_N\right) + P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| > r_N\right) \le P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right).$$
It suffices to apply the MDP for $|C^{\le k_N}|$ derived in Theorem 5.1; it yields
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log\left[P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N)\right] \le \lim_{N\to\infty} N^{1-2\alpha} \log \max\left\{P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right),\ \exp(-Mk_N)\right\} \le -J(y).$$
Similarly, for the lower tail we have for all $y > 0$
$$P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha\right) = P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| \le r_N\right) + P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha,\ \big||C_{\max}| - |C^{\le k_N}|\big| > r_N\right) \le P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right) + \exp(-Mk_N) + \exp\left(-c\varepsilon^3 N\right).$$
Again by Theorem 5.1, we arrive at
$$-J(y) = \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C^{\le k_N}| - \rho_\lambda N\big| > yN^\alpha\right) \le \lim_{N\to\infty} N^{1-2\alpha} \log \max\left\{P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha + o(N^\alpha)\right),\ \exp(-Mk_N)\right\} \le \lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right).$$
Altogether, the claim
$$\lim_{N\to\infty} N^{1-2\alpha} \log P\left(\big||C_{\max}| - \rho_\lambda N\big| > yN^\alpha\right) = -J(y)$$
follows for all $y > 0$. $\square$

Acknowledgements
Research of the authors was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy EXC 2044-390685587, Mathematics Münster: Dynamics - Geometry - Structure.
References

[1] J. Ameskamp and M. Löwe. Moderate deviations for the size of the largest component in a super-critical Erdős-Rényi graph. Markov Process. Related Fields, 17(3):369-390, 2011.
[2] D. Barraez, S. Boucheron, and W. Fernandez de la Vega. On the fluctuations of the giant component. Combin. Probab. Comput., 9(4):287-304, 2000.
[3] M. Behrisch, A. Coja-Oghlan, and M. Kang. The order of the giant component of random hypergraphs. Random Structures Algorithms, 36(2):149-184, 2010.
[4] M. Behrisch, A. Coja-Oghlan, and M. Kang. Local limit theorems for the giant component of random hypergraphs. Combin. Probab. Comput., 23(3):331-366, 2014.
[5] B. Bollobás and O. Riordan. Asymptotic normality of the size of the giant component in a random hypergraph. Random Structures Algorithms, 41(4):441-450, 2012.
[6] B. Bollobás and O. Riordan. Asymptotic normality of the size of the giant component via a random walk. J. Combin. Theory Ser. B, 102(1):53-61, 2012.
[7] B. Bollobás and O. Riordan. Exploring hypergraphs with martingales. Random Structures Algorithms, 50(3):325-352, 2017.
[8] A. Coja-Oghlan, C. Moore, and V. Sanwalani. Counting connected graphs and hypergraphs via the probabilistic method. Random Structures Algorithms, 31(3):288-329, 2007.
[9] V. H. de la Peña. A general class of exponential inequalities for martingales and ratios. Ann. Probab., 27(1):537-564, 1999.
[10] A. Dembo. Moderate deviations for martingales with bounded jumps. Electron. Comm. Probab., 1:no. 3, 11-17, 1996.
[11] A. Dembo and O. Zeitouni. Large deviations techniques and applications, volume 38 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2010. Corrected reprint of the second (1998) edition.
[12] R. Durrett. Random graph dynamics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2010.
[13] P. Eichelsbacher and M. Löwe. Lindeberg's method for moderate deviations and random summation. Preprint, arXiv:1705.03837, 2017.
[14] P. Erdős and A. Rényi. On random graphs. I. Publ. Math. Debrecen, 6:290-297, 1959.
[15] P. Erdős and A. Rényi. On the evolution of random graphs. Bull. Inst. Internat. Statist., 38:343-347, 1961.
[16] S. Janson, D. E. Knuth, T. Łuczak, and B. Pittel. The birth of the giant component. Random Structures Algorithms, 4(3):231-358, 1993. With an introduction by the editors.
[17] M. Karoński and T. Łuczak. The phase transition in a random hypergraph. J. Comput. Appl. Math., 142(1):125-135, 2002. Probabilistic methods in combinatorics and combinatorial optimization.
[18] M. Löwe and F. Vermet. Capacity of an associative memory model on random graph architectures. Bernoulli, (to appear), 2014.
[19] N. O'Connell. Some large deviation results for sparse random graphs. Probab. Theory Related Fields, 110(3):277-285, 1998.
[20] J. Schmidt-Pruzan and E. Shamir. Component structure in the evolution of random hypergraphs. Combinatorica, 5(1):81-94, 1985.
[21] R. van der Hofstad. Random graphs and complex networks. 2013.

(Jingjia Liu)
Fachbereich Mathematik und Informatik, Universität Münster, Einsteinstraße 62, 48149 Münster, Germany
E-mail address, Jingjia Liu: [email protected]

(Matthias Löwe)
Fachbereich Mathematik und Informatik, Universität Münster, Einsteinstraße 62, 48149 Münster, Germany
E-mail address, Matthias Löwe: