Is the space complexity of planted clique recovery the same as that of detection?
arXiv:… [cs.CC] Aug
Jay Mardia∗

Abstract
We study the planted clique problem, in which a clique of size k is planted in an Erdős-Rényi graph G(n, 1/2), and one is interested in either detecting or recovering this planted clique. This problem is interesting because it is widely believed to show a statistical-computational gap at clique size k = Θ(√n), and has emerged as the prototypical problem with such a gap from which average-case hardness of other statistical problems can be deduced. It also displays a tight computational connection between the detection and recovery variants, unlike other problems of a similar nature. This wide investigation into the computational complexity of the planted clique problem has, however, mostly focused on its time complexity. In this work, we ask:

Do the statistical-computational phenomena that make the planted clique an interesting problem also hold when we use 'space efficiency' as our notion of computational efficiency?
It is relatively easy to show that a positive answer to this question depends on the existence of an O(log n) space algorithm that can recover planted cliques of size k = Ω(√n). Our main result comes very close to designing such an algorithm. We show that for k = Ω(√n), the recovery problem can be solved in O((log* n − log*(k/√n)) · log n) bits of space.

1. If k = ω(√n log^(ℓ) n) for any constant integer ℓ > 0, the space usage is O(log n) bits. (Here log^(ℓ) n means we repeatedly take the logarithm of n ℓ times. For example, log^(3) n = log log log n.)
2. If k = Θ(√n), the space usage is O(log* n · log n) bits.

Our result suggests that there does exist an O(log n) space algorithm to recover cliques of size k = Ω(√n), since we come very close to achieving such parameters. This provides evidence that the statistical-computational phenomena that (conjecturally) hold for planted clique time complexity also (conjecturally) hold for space complexity.

∗ Department of Electrical Engineering, Stanford University. [email protected]

1 Introduction

The planted clique problem is a well-studied task in average-case computational complexity, in which a clique of size k is planted in an Erdős-Rényi graph of size n, G(n, 1/2). The problem comes in two flavours, detection (PC_D(n, k)) and recovery (PC_R(n, k)). In the former, we are given either a G(n, 1/2) graph or a planted clique graph and must identify which of the two we have been given. That is, we must detect whether or not the graph has a planted clique. In the latter, we are given a planted clique graph and must recover all the vertices in the clique.

The planted clique problem shows a variety of interesting phenomena in its time complexity. Not only does it exhibit a statistical-computational gap at clique size k = Θ(√n), it has also emerged as the central problem whose average-case hardness implies average-case hardness for many other problems with statistical-computational gaps. See [BBH18, BB20] for some examples. Further, the detection and recovery problems can both be solved efficiently in polynomial time for cliques of size k = Ω(√n), and less efficiently in quasi-polynomial time n^{O(log n)} for cliques larger than the information-theoretic threshold, k ≥ (2 + ε) log n. The widely believed Planted Clique Conjecture even states that if the clique size is small, k = O(n^{1/2−δ}) for any constant δ >
0, no polynomial time algorithm can solve the planted clique detection (and hence also the recovery) problem. We survey the results providing evidence for this conjecture in Section 1.1. The question we ask in this work is:
Do the statistical-computational phenomena that make the planted clique an interesting problem also hold when we use 'space efficiency' as our notion of computational efficiency?
To answer this question, we must first discuss what a 'space efficient' algorithm is. One of the most well studied classes of space bounded computation is that of logarithmic space. This is the class of algorithms that run in O(log n) bits of space on inputs of size poly(n), and it is widely considered a benchmark of 'space efficient' computation. Hence we will try to design algorithms with this space bound in mind.

Let us further motivate this target space bound. It is well known that any deterministic algorithm that uses at most s(n) bits of space must also run in time 2^{O(s(n))} [AB09, Theorem 4.3]. This means that deterministic logspace algorithms are a subset of polynomial time algorithms, which are considered 'efficient'. If the Planted Clique Conjecture is true, no deterministic O(log n) space algorithm can solve the planted clique detection (or recovery) problems for k = O(n^{1/2−δ}). If we can show that logarithmic space algorithms exist above the polynomial time threshold k = Ω(√n), we will show that the statistical-computational gap holds even for space complexity.

Also, as observed in [MAC20], the work [RS19] implies that if k = O(n^{1−δ}) for some constant δ > 0, any planted clique recovery algorithm requires Ω(n) time. By the space-time relation above, any successful planted clique algorithm therefore requires Ω(log n) bits of space, which makes the quest for an O(log n) space algorithm also the quest for an optimal deterministic algorithm.

Detection:
For detection, essentially the same straightforward algorithms that have been designed for time efficiency can also be implemented space efficiently. For clique sizes above the information theoretic threshold k ≥ (2 + ε) log n, the same 'exhaustively search over sets of Θ(log n) vertices' idea that gives a quasi-polynomial n^{O(log n)} time algorithm also gives an O(log² n) space algorithm. (Since the best known time complexity for this problem is n^{O(log n)}, we do not expect to solve it in o(log² n) bits of space.) For large cliques above the polynomial time threshold k = Ω(√n), the folklore 'sum test' or 'edge counting' algorithm (see for example Section 1.5 of [Lug17]) can be implemented in O(log n) bits of space.

(Strictly speaking, the theorem we point to relating space complexity to time complexity [AB09, Theorem 4.3] is for Turing machines. While it is convenient to define computational complexity classes using Turing machines, it is extremely inconvenient to design algorithms using them. Instead, we work with a slightly stronger model of computation that allows random access to the input to make algorithm design reasonable. However, the idea behind [AB09, Theorem 4.3] also holds in any reasonable RAM model, and so we ignore this distinction for the purposes of our discussion.)

We can hence conclude that planted clique detection shows a statistical-computational gap at k = Θ(√n) in terms of space complexity if it holds for time complexity.

Recovery:
But what about planted clique recovery? Before we go any further, we should clarify what we mean by a small space algorithm for planted clique recovery. The size of the output is k log n bits, which could be much larger than the space we are allowing the algorithm. However, the space bound applies only to the working space of the algorithm, and the output is written on a write-only area which does not count towards the space bound. This is standard in the space complexity literature, so we can write very large answers to the output. See, for example, Section 14.1 of [Wig19].

Just like for detection, simple pre-existing ideas can easily be used to obtain an O(log² n) space algorithm for recovering planted cliques above the information theoretic threshold, thus matching the detection space complexity in this range of parameters. We provide more details in Section 1.3. Also like for detection, we do not expect an O(log n) space algorithm in this regime because of the Planted Clique Conjecture and the relation between space and time complexity.

If we can design an O(log n) space algorithm that recovers large planted cliques k = Ω(√n), we will have shown two things:

• If the conjectured statistical-computational gap at k = Θ(√n) holds for the time complexity of the planted clique recovery problem, it also holds for space complexity.
• Assuming the above statistical-computational gap holds, the coarse-grained computational complexity of planted clique detection and recovery are indeed the same, no matter the notion of complexity we use - time or space.

1. Our first hope for an O(log n) space recovery algorithm for larger cliques k = Ω(√n) is to see if specific pre-existing algorithms are space efficient. However, none of the polynomial time algorithms designed for PC_R(n, k) above k = Ω(√n) run in small space.
They all require at least poly(n) bits of space, and in Section 1.3 we discuss, for each of them, why it seems hard to implement them in O(log n) bits of space.

Of course, the 'degree-counting' polynomial time recovery algorithm for large cliques of size k = Ω(√(n log n)) from [Kuč95] can easily be implemented in O(log n) space. This matches the space complexity for detection in this parameter range. For such large cliques, a simple threshold separates the degrees of non-clique vertices and clique vertices, so membership can easily be decided from a vertex's degree. A space efficient implementation exists because we can easily count the degree of a vertex (which takes O(log n) bits to store) and iterate over all vertices in logarithmic space, re-using the counter used to store the degree across vertices. However, this idea does not work for k = Ω(√n) with k = o(√(n log n)), and it is this parameter range in which most algorithmic work for the planted clique problem has been done in the past two decades. If we want to show that the statistical-computational phenomena that hold for time complexity also hold for space complexity, we will need to focus on these parameters.

(By coarse-grained we mean that the time complexity of detection and recovery are within polynomial factors of each other. If we were looking at a more fine-grained picture, a gap does emerge between detection and recovery: [MAC20] showed that for k = ω(n^{1/2}), planted clique detection can be solved in o(n²) time, whereas by results of [RS19] we know that any recovery algorithm must require Ω(n²) time.)

2. Our next hope is to recall that the lack of a detection-recovery gap in the time complexity of the planted clique problem is not merely an algorithmic coincidence. Section 4.3.3 of [AAK+07] gives a reduction from recovery to detection: for any vertex v, if v is in the clique, the subgraph induced on the vertex set that does not contain v or its neighbours is distributed as an Erdős-Rényi graph.
But, if v is not in the clique, this induced subgraph is distributed as a planted clique graph. Then we can simply run the detection algorithm to decide if v is in the planted clique or not. If we could use the edge counting detection algorithm and implement this reduction between recovery and detection in small space, then it seems we would be done. What is more, such a reduction can be implemented in small space! However, there is a slight issue.

The statistical success of the reduction in [AAK+07] requires the failure probability of the detection algorithm to be at most o(1/n). This is because we need to repeat the detection algorithm n times, once for each vertex in the original graph, and thus need to take a union bound. However, as we can see from Section 1.5 in [Lug17], the failure probability of the edge counting test is exp(−Θ(k²/n)). This means the failure probability is o(1/n) only for k = Ω(√(n log n)), which is not a huge improvement over the degree counting algorithm.

Due to the discussion above, we need some new ideas to get small space recovery algorithms for planted cliques of size k = Ω(√n). Our main result, stated informally below, is one that falls just short of our aim of an O(log n) space algorithm. For a formal statement, see Theorem 1 in Section 2.

For some large enough constant C >
0, for planted cliques of size k ≥ C√n, the recovery problem PC_R(n, k) can be solved in O((log* n − log*(k/√n)) · log n) bits of space.

1. If k = ω(√n log^(ℓ) n) for any constant integer ℓ > 0, the space usage is indeed O(log n) bits, which was our target.
2. However, if k = C√n, the space usage is O(log* n · log n) bits, which is just shy of what we were aiming for.

Our result suggests that there does exist an O(log n) space algorithm to recover cliques of size k = Ω(√n), since we come very close to achieving such parameters. We fail to answer our titular question, but only just. We provide strong evidence that the answer is 'yes', and that the statistical-computational phenomena that (conjecturally) hold for planted clique time complexity also (conjecturally) hold for space complexity. We have thus initiated the study of high dimensional statistical problems in terms of their space complexity.

As we see in Section 1.1, a long line of work on restricted models of computation has been used to show hardness of the planted clique problem. On the other hand, this work (like [MAC20]) studies a restricted model of computation with the primary aim of making algorithmic progress and further pushing down the complexity of successful planted clique algorithms.

Open Problem:
Is there an O(log n) space algorithm that recovers planted cliques of size k = Ω(√n) reliably, or is there a (tiny) detection-recovery gap in the space complexity of the planted clique problem?

(Footnotes to the recovery-to-detection reduction discussed earlier: Of course, such a reduction has a built-in O(n) factor time overhead for the recovery algorithm above the detection algorithm. Also, to count the number of edges induced in such a manner by a vertex v, we can simply iterate over all pairs u, w of vertices in the original graph. We increment the counter only if the edge (u, w) exists and neither of the vertices u, w has an edge to v.)

1.1 Related Work

Planted Clique Hardness:
It is widely believed that polynomial time algorithms can only detect or recover the planted clique for clique sizes above k = Ω(√n). One piece of evidence for this belief is the long line of algorithmic progress using a variety of techniques that has been unable to break this barrier [Kuč95, AKS98, FK00, FR10, AV11, DGGP14, CX14, DM15a, HWX15, MAC20]. The other piece of evidence comes from studying restricted but powerful classes of algorithms. [Jer92] showed that a natural Markov chain based technique requires more than polynomial time below this threshold. Similar hardness results (for the planted clique problem or its variants) have been shown for statistical query algorithms [FGR+18], for sum-of-squares and low-degree polynomial methods [BHK+17, Hop18, KWB19], and through concepts from statistical physics [GZ19].
Statistical-Computational Gaps:
Statistical-computational gaps are not unique to the planted clique problem, and are found in problems involving community detection / recovery [DKMZ11, Mas14, MNS15, AS16], sparse PCA [BR13, LKZ15], and tensor PCA [RM14, HKP+17], among others.

Detection-Recovery Gaps:
As we have mentioned, the statistical-computational gap in the planted clique problem appears at k = Θ(√n) for both the detection and recovery variants. This means there is no detection-recovery gap in time complexity, and our work is trying to show that no such gap exists for space complexity either. To understand that the non-existence of this gap is not a foregone conclusion, we note that for several other problems detection-recovery gaps do exist, for example for communities in the stochastic block model [Abb17], or for planted submatrix problems [HWX15, CX14]. Moreover, the (non-)existence of a detection-recovery gap is not an inconsequential detail. Since the planted clique problem does not display such a gap, it is not straightforward to use it as a starting point to show detection-recovery gaps for other problems. [BB20] overcomes this issue for semirandom community recovery by starting from a variant of the planted clique problem, and [SW20] develops a low-degree-likelihood-ratio technique tailored to recovery tasks to get around this problem.

Notation:
We will use standard big O notation (O, Θ, Ω). An edge between vertices u, v is denoted (u, v). We let Bin(n, 1/2) denote a Binomial random variable with parameters (n, 1/2). Similarly, Bern(p) denotes a Bernoulli random variable that is 1 with probability p and 0 otherwise. Unless stated otherwise, all logarithms are taken base 2. For a vertex v in a graph G = ([n], E), we will denote its degree by deg(v). Throughout this work we identify the vertex set of the graph with the set [n] := {1, 2, ..., n}. We will also crucially utilise the natural ordering this confers on the names of the vertices.

We also define the so-called binary iterated logarithm log* n:

log* n = 1 if n ≤ 2, and log* n = 1 + log*(log n) if n > 2.

Definition 1.1 (Erdős-Rényi graph distribution: G(n, 1/2)). Let G = ([n], E) be a graph with vertex set of size n. The edge set E is created by including each possible edge independently with probability 1/2. The distribution on graphs thus formed is denoted G(n, 1/2).

Definition 1.2 (Planted Clique graph distribution: G(n, 1/2, k)). Let G = ([n], E) be a graph with vertex set of size n. Moreover, let K ⊂ [n] be a set of size k chosen uniformly at random from all (n choose k) subsets of size k. For all distinct pairs of vertices u, v ∈ K, we add the edge (u, v) to E. For all remaining distinct pairs of vertices u, v, we add the edge (u, v) to E independently with probability 1/2. The distribution on graphs thus formed is denoted G(n, 1/2, k).

Definition 1.3 (Planted Clique Detection Problem: PC_D(n, k)). This is the following hypothesis testing problem. H₀: G ∼ G(n,
1/2) and H₁: G ∼ G(n, 1/2, k).

Definition 1.4 (Planted Clique Recovery Problem: PC_R(n, k)). Given an instance of G ∼ G(n, 1/2, k), recover the planted clique K.

Our space efficient recovery algorithm will depend on the ability to take a small subset of the planted clique and expand it to recover the entire clique. We first discuss such a subroutine, and then talk about our main result, the O((log* n − log*(k/√n)) · log n) space algorithm for planted clique recovery for large cliques of size k = Ω(√n). We do this by first studying polynomial time algorithms that work in this regime, discussing why they take polynomial amounts of space to implement, and then providing the high level ideas of our algorithm. After this, we end with some more details on the straightforward O(log² n) space implementations of the known quasi-polynomial time algorithms for clique detection and recovery above the information theoretic threshold.

Small space clique completion:
Several polynomial time recovery algorithms use clique completion / clean-up subroutines to find the entire planted clique after finding just a large enough (possibly noisy) subset of it [AKS98, FR10, DGGP14, DM15a, MAC20]. However, none of these seem amenable to space efficient implementation, so we create a simple completion algorithm of our own.

We assume that we have a space s(n) algorithm that can take the graph as input along with the identity of a vertex, and output 1 if and only if that vertex is part of some large enough subset S_C of the planted clique. This is what we mean by 'having access to' a subset of the clique that we can now complete. Consider the set of those vertices which are connected to every vertex in S_C. It is easy to show that this new set contains the entire planted clique and very few non-clique vertices (see Lemma 3.2). As a result, the number of edges to this set from a clique vertex is far larger than that of a non-clique vertex, and a simple logspace computable threshold can distinguish between the two cases. We show in Algorithm 1 (Small Space Clique Completion) and Lemma 2.1 that we can use this to decide if a given vertex is in the planted clique or not using O(log n + s(n)) bits of space. Then we simply loop over all vertices with a further O(log n) bits of space and thus have a planted clique recovery algorithm.

Recovery for k = Ω(√n):

We first take a look at existing polynomial time algorithms for k = Ω(√n) to see why they all require poly(n) bits of space, and to see if they have good ideas that we can build on to get small space algorithms.

1. Spectral algorithms: The spectral algorithm of [AKS98], which was the first polynomial time algorithm to recover planted cliques of size k = Ω(√n), requires access to an n-dimensional eigenvector. Even just storing this takes poly(n) bits of space, and it is unclear how to space-efficiently compute only bits and pieces of this eigenvector.
Perhaps the most promising avenue for a space efficient spectral algorithm would be to use the spectral detection test (based on the second eigenvalue of the adjacency matrix) with the reduction between recovery and detection from [AAK+07]. This test has failure probability o(1/n), so if we can implement it in small space, this approach would actually work. However, it is not at all clear how to compute the second eigenvalue of the adjacency matrix to the desired accuracy in O(log n) space. [DTS15, DSTS17] study the problem of approximating eigenvalues of an undirected graph in logarithmic space, and we might hope to use their algorithms to solve our problem. However, these algorithms, which are randomized and run in logarithmic space, can only approximate the normalized eigenvalues to within constant accuracy. We require inverse polynomial accuracy to use the spectral detection test.

2. Optimization / SDP algorithms: Several optimization theoretic algorithms involving semidefinite programs have been designed that solve the planted clique recovery problem for k = Ω(√n) [FK00, AV11, CX14, HWX15]. However, we do not expect to have a general-purpose logarithmic space algorithm for semidefinite programs. The works [DLR79, Ser91] show that even (approximately) solving linear programs, which are a special case of semidefinite programs, is logspace complete for P. This means that if we had a logspace algorithm for semidefinite programs, every problem with a polynomial time algorithm could also be solved in logarithmic space. Such a proposition is believed to be untrue [Wig19, Conjecture 14.8].

3. (Nearly) Linear time algorithms:

(a) The algorithm of [FR10] maintains a subset of 'plausible clique vertices' and reduces the size of this subset by 1 in every round. As a result, it needs to maintain a polynomially large subset for most of the time it runs.
There also does not seem to be a clever way to compress this set, since it depends crucially on the edge structure of the graph.

(b) The message passing algorithm of [DM15a] is iterative and produces a new dense n × n matrix at every iteration, which cannot be done in logarithmic space. It is plausible that a more space efficient recursive algorithm that recomputes messages as needed exists. But, since the algorithm requires Θ(log n) sequential iterations / recursive calls, and we will need Ω(log n) bits of space for each level of recursion, we do not expect this space usage to be o(log² n) bits. Since this does not improve the space usage over the simple algorithm that works above the information theoretic threshold, we do not pursue this idea further.

(c) The algorithm of [DGGP14], like [FR10], maintains a sequence of shrinking subsets of vertices where the ratio between the number of clique and non-clique vertices improves in every round. Further, these subsets are polynomial sized and random. Since the pruning of the set depends on randomness from the algorithm, any clever space efficient implementation that re-uses space would need to store the random coins it tosses, defeating the purpose of a space efficient implementation. However, the key idea behind this algorithm can be de-randomized, and this is the first observation that forms the basis of our O((log* n − log*(k/√n)) · log n) space algorithm.

[Figure 1: An example of our sets N_t for n = 17.]

We briefly explain the technique of [DGGP14] in more detail, but using the notation of this work rather than that of [DGGP14]. Their algorithm runs for T rounds and maintains a sequence of vertex subsets {N_t, V_t}_{1 ≤ t ≤ T}. N_1 = V_1 is essentially the entire vertex set [n], and then each vertex of V_{t−1} is included in N_t i.i.d. with some probability, and each vertex of N_t is added to V_t by cleverly using information from the edge structure of the input graph. This results in the ratio of clique vertices to non-clique vertices in V_t increasing by a constant factor in every round. T is then chosen large enough so that V_T is entirely a subset of the planted clique. The entire clique is now output using a clique completion subroutine.

Since the subsets N_t described above depend so heavily on the randomness of the algorithm as well as the edge structure of the input graph, this algorithm cannot be implemented in less than poly(n) space. On the other hand, we have already noted that creating a space efficient clique completion algorithm can be done, and we have done so in Lemma 2.1 with Algorithm 1.
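As an illustration, the completion step can be sketched as follows. This is a hedged sketch rather than the paper's actual Algorithm 1: the oracle names `adj` and `in_subset`, and the threshold 3k/4, are our own illustrative stand-ins for the quantities fixed in the formal analysis of Lemma 2.1.

```python
# Sketch of small-space clique completion (in the spirit of Lemma 2.1).
# Assumptions (illustrative, not from the paper): adj(u, v) reads one bit
# of the input adjacency matrix, in_subset(v) is the assumed s(n)-space
# test for membership in the known clique subset S_C, and 3*k//4 stands
# in for the threshold fixed in the formal analysis.

def completes_to_clique(v, n, k, adj, in_subset):
    """Decide whether vertex v belongs to the planted clique.

    Beyond the two oracles, only O(log n) working bits are needed:
    the loop indices and a single edge counter, re-used across vertices.
    """
    count = 0
    for u in range(n):
        if u == v:
            continue
        # u is a candidate iff it is adjacent to every vertex of S_C.
        candidate = all(adj(u, w) or not in_subset(w)
                        for w in range(n) if w != u)
        if candidate and adj(u, v):
            count += 1
    # Clique vertices have far more edges into the candidate set.
    return count >= 3 * k // 4


def recover_clique(n, k, adj, in_subset):
    """Loop over all vertices, writing hits to the (write-only) output."""
    return [v for v in range(n) if completes_to_clique(v, n, k, adj, in_subset)]
```

Note that the candidate set is recomputed from scratch for every pair (u, v); this trades time for space, which is exactly the trade-off a logspace implementation must make.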
So we now focus on trying to modify the first part of the algorithm into something that can be implemented space efficiently. Our challenge is to concisely represent the sets N_t (and by extension, V_t).

Our observation is that the clever filtering of [DGGP14] does not depend crucially on the set N_t being a subset of V_{t−1} (which is what makes it depend on the edge structure of the graph). Nor does it depend on the set being random. The only thing we really need is that the proportion of clique to non-clique vertices in N_t is not too small, and that we can easily iterate over all the vertices in any set N_t. This gives us the freedom we require to design the sets N_t to be concisely representable, and we use our computer's representation of the vertex set to our advantage. For our computer, the names of the n vertices of the graph are log n bit integers, and we can use the fact that integers have a natural ordering, as well as the fact that simple arithmetic can easily be done in O(log n) bits of space.

We first set up some notation. The quantities we define will be functions of n, k, and the graph G ∼ G(n, 1/2, k), although our notation will not explicitly denote this. The value of n, k and the graph will always be clear from context. Recall that [n] is the vertex set of a graph G ∼ G(n, 1/2, k) with n vertices, a planted clique called K of size k, and a set of edges called E.

• Define n_0 as the smallest integer that is a power of 2 and is at least n/2. This means n/2 ≤ n_0 < n. Define k_0 := k · n_0/n.
• For any integer 0 < t < log n_0, let n_t := n_0/2^t and k_t := k_0/2^t. Note that n_t is always an integer.
• We can now define the subsets N_t of the vertex set [n] that will be of particular interest in our filtering algorithm. Let N_t := [n_{t−1}] \ [n_t], and note that the N_t's are all disjoint sets. Clearly, |N_t| = n_t. See Figure 1 for an example.

It is easy to observe that given n, t we can iterate over the vertex set N_t in O(log n) space, which is exactly what we wanted. In the analysis of Lemma 2.2, we will also show that the ratio of clique vertices to non-clique vertices in any N_t is roughly the same as in the whole graph, which is not too small. Now we must implement the rest of the ideas in [DGGP14], the ones that actually use the input graph to find the clique.

After setting V_1 = N_1, the filtering step of [DGGP14] fixes a threshold and adds a vertex in N_t to the set V_t if and only if that vertex has more edges to V_{t−1} than the set threshold. Since clique vertices in N_t are likely to have a higher number of edges to V_{t−1} than non-clique vertices, the former are more likely to appear in V_t and the latter are more likely to be filtered out. This is how the ratio of clique to non-clique vertices in V_t gradually increases with t. If we had an algorithm to check membership in V_{t−1} that uses s_{t−1}(n) bits of space, we could design an algorithm to check for membership in V_t that uses O(log n + s_{t−1}(n)) bits of space. To see this, suppose we have a vertex v ∈ N_t and we want to decide if it is in V_t. We can simply iterate over the set N_{t−1}, and for each vertex u ∈ N_{t−1}, check if it is also in V_{t−1} using our assumed algorithm. We can also maintain an O(log n) bit counter to count the number of edges from v to all u that are in V_{t−1}.
Since we can re-use the s_{t−1}(n) bits of space to check membership in V_{t−1} across different u, the whole thing can be done in O(log n + s_{t−1}(n)) bits of space. By induction, this means we can check for membership in V_T using O(T · log n) bits of space. We provide a formal algorithm and proof of such a claim in Lemma 2.3 using Algorithm 2.

Overall, this promises to give an O(T · log n) space algorithm. What can we set T to be? Unfortunately, the algorithm of [DGGP14] uses T = Θ(log n) rounds, since it only gets a constant factor improvement in the ratio of clique to non-clique vertices in going from V_{t−1} to V_t. This gives an O(log² n) space algorithm, which is not an improvement over the simple algorithm that works above the information theoretic threshold.

Our key idea, inspired by [MAC20], is to implement a better filtering step that gets more than a constant factor of improvement in each round. The filtering / thresholding of [DGGP14] does not utilise the size of the planted clique k at all, other than the fact that it is Ω(√n). On the other hand, [MAC20] uses knowledge of k to design a single round filtering algorithm that recovers the planted clique in sublinear time for clique sizes k = ω(√(n log log n)) with k = o(√(n log n)). By appropriately implementing this idea in our context for multiple rounds, we can utilize knowledge of the number of clique vertices in V_{t−1}, |V_{t−1} ∩ K|, to make sure that in going from V_{t−1} to V_t the following happens. The number of clique vertices decreases by at most a constant factor, while the number of non-clique vertices decreases by at least a factor of exp(Θ(|V_{t−1} ∩ K|²/|V_{t−1}|)), which is exp(Θ(k²/n)) for t = 2. For k = Θ(√n), this is still a constant factor, but for larger k, this is much better than a constant factor improvement.

To use this idea, our algorithm needs to know |V_{t−1} ∩ K|, which it does not.
However, we do have high probability lower bounds on the size |V_{t−1} ∩ K|. We design our thresholds using these estimates, and our analysis in Lemma 2.2 shows that this suffices to get the benefits of this better filter. Let us now define the sets V_t for our algorithm, thus specifying the filtering threshold. We proceed inductively.

• V_1 := N_1.
• For any integer t > 1, V_t is the subset of N_t of vertices which have 'large' V_{t−1}-degree. Quantitatively, V_t := { v ∈ N_t : |{ u ∈ V_{t−1} : (u, v) ∈ E }| ≥ |V_{t−1}|/2 + k_{t+2} − √|V_{t−1}| }.

(Technically, [DGGP14] counts the number of edges to V_{t−1} \ N_t, but in our construction we will have V_{t−1} \ N_t = V_{t−1}.)
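The recursive membership test described in prose above can be written out as follows. This is a hedged illustration in the spirit of Lemma 2.3 / Algorithm 2, not a transcription of it: `adj` is an assumed O(1)-space adjacency probe, the helper names are ours, and the threshold mirrors the definition of V_t above.

```python
import math

def n0_of(n):
    # Smallest power of two that is at least n/2 (so n/2 <= n0 < n).
    p = 1
    while p < n / 2:
        p *= 2
    return p

def in_V(t, v, n, k, adj):
    """Check membership of vertex v in V_t (vertices are 1-indexed).

    Each recursive call re-uses the space of the call below it, so a
    depth-T recursion uses O(T * log n) bits in total.
    """
    n0 = n0_of(n)
    k0 = k * n0 / n
    n_t = n0 >> t                      # n_t = n0 / 2^t
    if not (n_t < v <= 2 * n_t):       # v must lie in N_t = [n_{t-1}] \ [n_t]
        return False
    if t == 1:
        return True                    # V_1 := N_1
    # One pass over N_{t-1}: count |V_{t-1}| and the edges from v into it.
    deg, size = 0, 0
    for u in range(2 * n_t + 1, 4 * n_t + 1):   # u ranges over N_{t-1}
        if in_V(t - 1, u, n, k, adj):           # space re-used across u
            size += 1
            if adj(u, v):
                deg += 1
    # Threshold as in the definition of V_t, with k_{t+2} = k0 / 2^{t+2}.
    return deg >= size / 2 + k0 / 2 ** (t + 2) - math.sqrt(size)
```

The recomputation of V_{t−1} membership for every u is again a deliberate time-for-space trade: only the two counters and the recursion stack persist.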
It is this carefully chosen threshold sequence, which (unlike in [DGGP14]) varies with t and uses the value of k, that allows us to improve on the O(log² n) space bound. In Lemma 2.2 we will show that V_T, as defined above, is with high probability a subset of the planted clique if T is large enough. We can implement an algorithm to check membership in V_T in O(T · log n) bits of space as discussed above (and formalized in Lemma 2.3). Moreover, we get the benefits of a very quickly accelerating improvement in the ratio of clique to non-clique vertices from V_{t−1} to V_t. From [MAC20] we know that one round of such a filter improves the ratio by a factor of exp(Θ(k²/n)), and the analysis of our filtering in Lemma 2.2 shows that after t rounds of such filtering, the ratio improves by what is essentially a tower of exponentials of height t, i.e. exp(exp(... exp(Θ(k/√n)) ...)). This is why we are able to take T = O(log* n − log*(k/√n)) (Lemma 2.2). This gives us our main result, an algorithm that can recover planted cliques of size k ≥ C√n in O((log* n − log*(k/√n)) · log n) bits of space. The formal statement and proof can be found in Theorem 1.

Detection:
1. It is well known (see [BE76] or Lemma 3.4) that for any positive constant ǫ > 0, the probability that an Erdős-Rényi G(n, 1/2) graph has a clique of size at least (2 + ǫ) log n goes to 0. Meanwhile, if k ≥ (2 + ǫ) log n, then by definition a planted clique graph G(n, 1/2, k) has a clique of size (2 + ǫ) log n. The existence of a clique of this size is a well-known and simple detection test for PC_D(n, k) (see, for example, Proposition 1.3 of [Lug17]). Moreover, such a test only needs to iterate over all vertex subsets of size (2 + ǫ) log n, which can be done by maintaining a log n bit name/number for each of the (2 + ǫ) log n vertices and looping over all possibilities. For a given possible clique, the algorithm needs to check if all ((2+ǫ) log n choose 2) edges exist. This can be done by looping over all these edges with 2 more O(log log n) bit counters. Overall, this implementation requires O(log² n) bits of space.

2. The simple 'sum test' or 'edge counting' algorithm that is well known to work for detection of large planted cliques, k = Ω(√n) (see for example Section 1.5 of [Lug17]), can easily be implemented in O(log n) space. The planted graph has significantly more edges than the graph without a clique, so simply counting the number of edges in the input graph and using a threshold test gives a successful detection algorithm. The algorithm only needs to maintain the edge count, which is a number between 0 and n² (which can be done with O(log n) bits), and it can also easily iterate over all distinct vertex pairs in O(log n) bits of space. Lastly, the algorithm also needs to compute the threshold (from [Lug17], we can use the threshold (1/2)·(n choose 2) + (1/4)·(k choose 2)), which can easily be computed from the input (which contains n, k) in logarithmic space.

This means that for planted clique detection, assuming we have a time complexity based statistical-computational gap, we also have a space complexity based statistical-computational gap.

Recovery above the information theoretic threshold:
For cliques of size (2 + ǫ) log n ≤ k = O(log n), with high probability the planted clique is the unique largest clique in G(n, 1/2, k) [Lug17, Theorem 1.7]. This means that an algorithm that loops over all possible vertex subsets of size k can find and output the entire planted clique. To do this it only needs to maintain k names of vertices (which takes O(k log n) bits of space) and 2 counters of O(log k) bits of space to check if a given set of k vertices forms a clique. Overall, this implementation needs O(log² n) bits of space.

A simple application of the reduction between detection and recovery from [AAK+07], combined with the small space detection algorithm for clique sizes above the information theoretic threshold k ≥ (2 + ǫ) log n, also gives an O(log² n) space recovery algorithm for k = ω(log n). We provide a formal statement and proof in Lemma 3.5.

We now prove our main results after formalizing our model of computation in Section 2.1. In Section 2.2 we give a space efficient algorithm for clique completion. In Section 2.3 we prove our O((log* n − log*(k/√n)) · log n) space recovery algorithm for clique sizes above the polynomial time threshold k = Ω(√n).

We use a standard notion of deterministic space bounded computation. See, for example, [Wig19, Section 14.1]. For an s(n)-space algorithm, the input is a read-only version of the n × n adjacency matrix of the graph as well as the clique size k. Every entry in the matrix as well as the value of k is stored in its own register. The algorithm has access to s(n) bits of working space, and the output is write-only (and possibly much larger than s(n)). The last fact allows us to solve problems whose outputs may be much larger than s(n), a property we will use to solve PC_R(n, k). To make our model convenient for algorithm design, we also allow random access to the input registers. In our model, we assume basic arithmetic (addition, multiplication, subtraction, division) on O(log n) bit numbers can be done in O(log n) bits of space. We also assume that the algorithm can compute or knows n by accessing the adjacency matrix using O(log n) bits of space.

The main idea behind this algorithm is discussed in Section 1.3. If we have access to a large enough subset of the clique, very few vertices that are adjacent to the entire subset (i.e. 'common neighbours') are not in the planted clique. Counting the edges from a vertex v to this set of 'common neighbours' of the known clique subset allows us to decide whether or not v is in the planted clique.
Lemma 2.1 (Deterministic + small space clique completion). Let k = ω(log n), and G ∼ G(n, 1/2, k) = ([n], E). Let O_{S_C} be a deterministic algorithm that uses s(n) bits of space and, except with probability at most p(n) ≤ 1 (over the randomness in G), has the following properties.
1. When given as input the graph G and clique size k, it implicitly defines a subset of the planted clique vertices S_C such that S_C ⊂ K and |S_C| ≥ 2 log n.
2. It does this by returning, for v ∈ [n], O_{S_C}(v) = 1 if and only if v ∈ S_C, and 0 otherwise.
Then for large enough n, Small Space Clique Completion (Algorithm 1), when run on G with access to the algorithm O_{S_C}, runs deterministically in space O(s(n) + log n) and writes to output the correct planted clique K except with probability at most p(n) + (1/n)^{log k} + n exp(−k/54) (which is over the randomness in G).

Proof. Space usage:
The algorithm needs to store vertices v, u, w to run the for loops, each of which takes log n bits of space since the size of the vertex set is n. The for loops can be run simply by incrementing the counter that stores the name/number of v, u, or w. The algorithm also needs to invoke the oracle O_{S_C}, which we know takes s(n) bits of space. The only other variables the algorithm needs to store are gdeg(v) and inṼ(u), which take log n and 1 bits respectively (because gdeg(v) ranges from 0 to n − 1). It also needs to compute the threshold 3k/4 + 3 log k, which can be done up to the few bits of precision required to make the comparison in O(log n) bits of space. Hence the entire algorithm has a space requirement of O(s(n) + log n) bits. Note that both gdeg and inṼ are only ever required for one u, v pair at a time, and so their space is re-used across the outer for loops. Similarly, space can be re-used for every call to the oracle.

Algorithm 1: Small Space Clique Completion (SSCC)
Input: Graph G = ([n], E) ∼ G(n, 1/2, k), clique size k, oracle O_{S_C} with access to a clique set S_C ⊂ K: O_{S_C}(v) = 1 if v ∈ S_C, O_{S_C}(v) = 0 if v ∉ S_C
Output: Clique K
for v ∈ [n] do
  Initialize gdeg(v) = 0
  for u ∈ [n] do
    Initialize inṼ(u) = TRUE
    for w ∈ [n] do
      if O_{S_C}(w) = 1 and (w, u) ∉ E then (decide if u is a common neighbour)
        Set inṼ(u) = FALSE
      end
    end
    if inṼ(u) = TRUE and (u, v) ∈ E then
      gdeg(v) = gdeg(v) + 1
    end
  end
  if gdeg(v) ≥ 3k/4 + 3 log k then (use the 'common neighbour'-degree of v)
    write-to-output v
  end
end

Correctness:
By assumption, we know that except with probability at most p(n) (over the randomness of the input graph) the oracle O_{S_C} outputs 1 only on a set S_C with the following properties: S_C ⊂ K and |S_C| ≥ 2 log n. We shall call this event A and condition on it happening for the rest of this proof. Let the event C denote the correctness of our algorithm, and note that we are trying to upper bound P(C^c) ≤ P(C^c, A) + P(A^c) ≤ P(C^c, A) + p(n).

We need to argue that the algorithm writes to output every vertex in K and no other vertices. Consider the vertex set Ṽ consisting of vertices that have edges to every vertex in the known clique set S_C. For every vertex v in [n], our algorithm computes the number of edges from v to Ṽ (we call this gdeg(v)). This is because an edge (u, v) is counted towards gdeg(v) only if inṼ(u) = TRUE, which happens only when u ∈ Ṽ. The algorithm then writes v to output if gdeg(v) ≥ 3k/4 + 3 log k and otherwise does nothing.

To complete our proof, we need to show two things. First, for every clique vertex v in K, gdeg(v) ≥ 3k/4 + 3 log k. This happens because the entire clique K is contained in Ṽ once we have conditioned on A, and k − 1 ≥ 3k/4 + 3 log k for k large enough.

Second, we need to show that for every non-clique vertex v ∈ [n] \ K, gdeg(v) < 3k/4 + 3 log k. To do this, we use some structural properties of the random input graph. Let A_1 be the event that the maximum number of clique vertices any non-clique vertex is connected to is less than 2k/3, and let A_2 be the event that the structural facts guaranteed by Lemma 3.2 are true. If A_1 and A_2 happen, then it is clear that our algorithm behaves as desired: a non-clique vertex v has fewer than 2k/3 neighbours in Ṽ ∩ K and at most 3 log k neighbours in Ṽ \ K, so gdeg(v) < 2k/3 + 3 log k < 3k/4 + 3 log k. Hence, P(C^c, A, A_1, A_2) = 0. Thus, P(C^c, A) ≤ P(C^c, A, A_1, A_2) + P(A_1^c, A) + P(A_2^c, A) ≤ P(A_1^c) + P(A_2^c). Lemma 3.2 shows P(A_2^c) ≤ (1/n)^{log k} and Lemma 3.3 shows P(A_1^c) ≤ n exp(−k/54), which completes the proof.
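To make the clique-completion loop concrete, here is a minimal Python sketch of Algorithm 1. The oracle is modeled as a plain function, and the particular asymptotic threshold used below (of the form 3k/4 + 3 log k) only separates clique from non-clique vertices for large k, so the optional `threshold` parameter — an addition of ours for small demos — lets a caller override it; all names here are our own.

```python
import math

def small_space_clique_completion(adj, k, oracle, threshold=None):
    """Sketch of Algorithm 1 (SSCC).

    adj: n x n 0/1 adjacency matrix (the read-only input).
    oracle(w): True iff w lies in the known clique subset S_C.
    Returns the recovered clique vertices (the write-only output).
    """
    n = len(adj)
    if threshold is None:
        threshold = 3 * k / 4 + 3 * math.log2(k)  # asymptotic choice; needs large k
    output = []
    for v in range(n):
        gdeg = 0  # edges from v to common neighbours of S_C
        for u in range(n):
            # u is a 'common neighbour' if it is adjacent to all of S_C
            # (as in the pseudocode, S_C vertices exclude themselves).
            in_v_tilde = all(adj[w][u] for w in range(n) if oracle(w))
            if in_v_tilde and adj[u][v]:
                gdeg += 1
        if gdeg >= threshold:
            output.append(v)
    return output
```

On a graph whose only edges form a 6-clique on vertices 0–5, with S_C = {0, 1, 2} and a hand-picked threshold of 2, the routine outputs exactly the clique [0, 1, 2, 3, 4, 5].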
We recall some notation defined in Section 1.3.

• Define n_1 as the smallest integer that is a power of 2 and is at least n/2. This means n/2 ≤ n_1 < n. Define k_1 := k · n_1/n.
• For any integer 0 < t < log n_1, let n_t := n_1/2^{t−1} and k_t := k_1/2^{t−1}. Note that n_t is always an integer.
• We also define some subsets of the vertex set [n] that will be of particular interest in our filtering algorithm. Let N_t := [n_{t−1}] \ [n_t], and note that the N_t's are all disjoint sets. Clearly, |N_t| = n_t.

So far, we have defined vertex subsets that do not depend at all on the edge structure of the graph. Now we define some subsets that do incorporate information about such edge structure (and hence will be useful in finding the planted clique). We proceed inductively.

• V_1 := N_1.
• For any integer t > 1, V_t is the subset of N_t of vertices which have 'large' V_{t−1}-degree (the V_{t−1}-degree of v is the number of edges from v to V_{t−1}). Quantitatively, V_t := { v ∈ N_t : Σ_{u ∈ V_{t−1}} 1[(u,v) ∈ E] ≥ |V_{t−1}|/2 + k_{t+2}/2 − √|V_{t−1}| }.

Our main structural lemma shows that for large enough T, V_T is a large enough subset of the planted clique.

Lemma 2.2 (Filtering lemma). Let
C > 0 be some large enough constant. Let G ∼ G(n, 1/2, k) with C√n ≤ k, and let T be an integer such that 2(log* n − log*(k/√n)) + 3 ≤ T = O(log* n). Then for large enough n, except with probability at most O(exp(−n^{0.1})), V_T ⊂ K and ω(log n) = k_{T+3} ≤ |V_T|.

Proof. Step 1:
First, we show that with high probability (over the choice of planted vertices) the number of planted vertices in each subset N_t is very close to what we would expect. Fix some 1 ≤ t ≤ T. By linearity of expectation, E[|N_t ∩ K|] = (k/n) × n_t = k_t. Since |N_t ∩ K| is a hypergeometric random variable, we can use concentration inequalities for hypergeometric random variables [HS05, Theorem 1] to conclude that k_{t+1} = k_t − k_{t+1} ≤ |N_t ∩ K| ≤ k_t + k_{t+1} = 3k_{t+1} except with probability at most 2 exp(−k_t/10). Union bounding over all values of t from 1 to T, we see that this concentration fact is simultaneously true (which we call event A) for all such t (recall that the N_t's, and hence the V_t's, are disjoint across t, and that the V_{t−1}-degree of a vertex v is the number of edges from v to V_{t−1}), except with small probability:
We now show that (conditioned on A ) with high probability, at least half these clique vertices in N t are also present in the filtered set V t ⊂ N t . Let A t denote the event that V t has at least k t +2 cliquevertices. That is, | V t ∩ K | ≥ k t +2 . For the base case P ( A c | A ) is trivially 0 since V = N .Consider P ( A ct | A t − , A ). Since A , A t − , we know that there are at least k t +1 clique vertices in N t aswell as V t − . For a given clique vertex in N t , what is the probability that it is also in V t ? If v ∈ N t ∩ K ,then P u ∈ V t − \ K ( u,v ) ∈ E is a Bin (cid:16) | V t − | − e k t − , (cid:17) random variable where e k t − = | V t − ∩ K | ≥ k t +1 .Using the Chernoff Bound (Lemma 3.1) − p t := P ( v / ∈ V t | A t − , A ) = P X u ∈ V t − ( u,v ) ∈ E ≤ | V t − | k t +2 − p | V t − | = P X u ∈ V t − \ K ( u,v ) ∈ E − | V t − | − e k t − ! ≤ − p | V t − | − k t +2 + e k t − ! ≤ P X u ∈ V t − \ K ( u,v ) ∈ E − | V t − | − e k t − ! ≤ − p | V t − | ≤ exp − | V t − | | V t − | − e k t ) ! ≤ exp ( − / ≤ . N t is added to V t independently, the total number of clique vertices in V t , e k t is the sum of at least k t +1 iid Bern ( p t ) random variables. Using the Chernoff Bound (Lemma 3.1), thismeans | V t ∩ K | = e k t ≥ k t +2 except with probability at most exp (cid:0) − c k t (cid:1) for some constant c. Hence, P ( A ct | A t − , A ) ≤ exp (cid:0) − c k t (cid:1) = O (cid:0) exp (cid:0) − n . (cid:1)(cid:1) . Again, we have used T = O (log ∗ n ).We are now in a position to understand the probability that all the events A t for 0 ≤ t ≤ T happensimultaneously (which we call A ). P ( A c ) ≤ P Tt =0 P ( A ct | A t − , A t − , ..., A ). But conditioned on theevents A t − , A , the event A t is indpendent of A , ..., A t − . This gives P ( A c ) ≤ P Tt =0 P ( A ct | A t − , A ) = O (cid:0) T exp (cid:0) − n . (cid:1)(cid:1) . 
Step 3: If A happens, then |V_T ∩ K| ≥ k_{T+2} ≥ k_{T+3}, which means we only need to additionally show that V_T ⊂ K to complete the proof. To this end, we will show that the number of non-clique vertices in V_t is small for all 1 ≤ t ≤ T simultaneously with high probability. Before doing so, we must set up some further notation. Define m_1 := k/√n and, for t ≥ 2, m_t := 2^{m_{t−1}/2^{t−1}}. (In Step 2 we could assume |V_{t−1}| − k̃_{t−1} > √|V_{t−1}|, because if not, then clearly we have p_t = 1.)

Let B_t, for t ≥ 1, be the event that
|V_t \ K| ≤ max{ m_1 n_t/m_t , k_{t+2} } if m_1 n_{t−1}/m_{t−1} ≥ k_{t+1}, and
|V_t \ K| = 0 if m_1 n_{t−1}/m_{t−1} < k_{t+1}.

Observe that P(B_1) = 1 because m_1 n_1/m_1 = n_1 and |V_1 \ K| ≤ |N_1| = n_1. We will now show that P(B_t^c | B_{t−1}, A) is small even for t ≥ 2. After conditioning on B_{t−1}, A, what is the probability that a given non-clique vertex in N_t is added to V_t? Let v ∈ N_t \ K, and consider P(v ∈ V_t | B_{t−1}, A). Since we have conditioned on B_{t−1}, |V_{t−1} \ K| is suitably small, as defined above. We can use this to upper bound |V_{t−1}|.

In particular, we make sure that with k ≥ C√n and C > 0 large enough, k_{t+3} ≤ k_{t+2}/2 − √|V_{t−1}|. This is equivalent to |V_{t−1}| ≤ k_{t+4}² for all t ≥ 2. Let us show that this is indeed true. Because of A, |V_{t−1} ∩ K| ≤ 3k_t. If k_{t+1} > m_1 n_{t−1}/m_{t−1}, then |V_{t−1}| ≤ 3k_t + k_{t+1} ≤ k_{t+4}² for large enough n. If, on the other hand, k_{t+1} ≤ m_1 n_{t−1}/m_{t−1}, we use the fact that m_{t−1} ≥ 4^{t−2} m_1, which we prove later in Step 4: |V_{t−1}| ≤ 3k_t + n_{t−1}/4^{t−2} ≤ k_{t+4}², because we have chosen C large enough.

Armed with this inequality k_{t+3} ≤ k_{t+2}/2 − √|V_{t−1}| and a Chernoff Bound (Lemma 3.1), we have

q_t := P(v ∈ V_t | B_{t−1}, A)
= P( Σ_{u ∈ V_{t−1}} 1[(u,v) ∈ E] ≥ |V_{t−1}|/2 + k_{t+2}/2 − √|V_{t−1}| )
≤ P( Σ_{u ∈ V_{t−1}} 1[(u,v) ∈ E] ≥ |V_{t−1}|/2 + k_{t+3} )
≤ exp( −k_{t+3}²/(2|V_{t−1}|) ).

• Case 1: First we tackle the easy case. Suppose m_1 n_{t−1}/m_{t−1} < k_{t+1}. Since we have conditioned on A and B_{t−1}, |V_{t−1} ∩ K| ≥ k_{t+1}, which means |V_{t−1} \ K| < |V_{t−1} ∩ K|. Thus |V_{t−1}| < 2|V_{t−1} ∩ K| ≤ 2|N_{t−1} ∩ K| ≤ 3k_{t−1}. This gives q_t ≤ exp(−c k_t) for some constant c > 0, and by a union bound over all v ∈ N_t \ K, we get that |V_t \ K| = 0 (which is a sufficient condition for B_t) except with probability at most |N_t \ K| exp(−c k_t) ≤ n_t exp(−c k_t) = O(exp(−n^{0.1})), because T = O(log* n).

• Case 2: Now we tackle the case m_1 n_{t−1}/m_{t−1} ≥ k_{t+1}. Since B_{t−1} and A have happened, we have |V_{t−1} \ K| ≤ m_1 n_{t−1}/m_{t−1} and |V_{t−1} ∩ K| ≤ |N_{t−1} ∩ K| ≤ 3k_t ≤ 6 m_1 n_{t−1}/m_{t−1}, which gives |V_{t−1}| ≤ 7 m_1 n_{t−1}/m_{t−1}. Using this with our upper bound on q_t, we get q_t ≤ exp( −c k_{t+3}² m_{t−1}/(m_1 n_{t−1}) ) for some constant c > 0. With k ≥ C√n and C large enough, m_1 = k/√n ≥ 32 and q_t ≤ 2^{−m_{t−1}/2^{t−1}} = 1/m_t. (It is this upper bound on q_t that leads us to define m_t the way we do, and thus dictates, eventually, our space complexity bound.) Hence E[|V_t \ K|] ≤ n_t/m_t. Since each vertex in N_t \ K gets added to V_t independently, we can use the Chernoff Bound (Lemma 3.1) to control P(B_t^c | B_{t−1}, A). We now have the following, using the fact that m_1 is at least a large constant greater than 2.

If k_{t+2} ≥ m_1 n_t/m_t, we have the upper bound P(B_t^c | B_{t−1}, A) ≤ exp( −(m_1 − 1)² k_{t+2}/(m_1(m_1 + 1)) ) = O(exp(−n^{0.1})).
If k_{t+2} ≤ m_1 n_t/m_t, we have the upper bound P(B_t^c | B_{t−1}, A) ≤ exp( −(m_1 − 1)² n_t/(m_t(m_1 + 1)) ) ≤ exp( −(m_1 − 1)² k_{t+2}/(m_1(m_1 + 1)) ) = O(exp(−n^{0.1})).

Our case analysis thus gives P(B_t^c | B_{t−1}, A) = O(exp(−n^{0.1})). Now we show that all B_t's happen simultaneously with high probability, that is, all filtered sets V_t have an appropriately small number of non-clique vertices. Let B := ∩_{t=1}^{T} B_t. Then P(B^c | A) ≤ Σ_{t=1}^{T} P(B_t^c | B_{t−1}, B_{t−2}, ..., B_1, A). But conditioned on the events B_{t−1}, A, the event B_t is independent of B_1, ..., B_{t−2}. This gives P(B^c | A) ≤ Σ_{t=1}^{T} P(B_t^c | B_{t−1}, A) = O(T exp(−n^{0.1})).

Step 4: If A and B both happen, and T is such that m_1 n_{T−1}/m_{T−1} < k_{T+1}, then we have |V_T \ K| = 0, which means V_T ⊂ K and |V_T| ≥ k_{T+3}, which is exactly the desired outcome. Note that P((A, B)^c) ≤ P(B^c | A) + P(A^c) = O(T exp(−n^{0.1})) = O(exp(−n^{0.1})).

So we now only need to show that m_1 n_{T−1}/m_{T−1} < k_{T+1}, which is equivalent to m_{T−1} > 4√n. To do this, we need m_t to grow very fast with t, and we will show this in steps. First we show that m_t grows with t. We then use this growth to show that it grows quite fast. We then use this fast growth to show that it grows very fast.

1.
We prove, by induction on t, that for all t ≥ 2, m_t ≥ 4 m_{t−1} and m_t ≥ 4^{t+1}.
The base case is t = 2. Recall from the analysis of Case 2 that we have assumed C is large enough so that m_1 ≥ 32. Then m_2 = 2^{m_1/2} ≥ 4 m_1 and m_2 ≥ 4^3. The first inequality holds because the function 2^{t/2} − 4t is positive for t = 32 as well as increasing for t ≥ 32.
Assuming the claim holds for all 2 ≤ t ≤ ℓ − 1, we show it holds for t = ℓ ≥ 3. Because m_{ℓ−1} ≥ 4^ℓ, we get m_ℓ = 2^{m_{ℓ−1}/2^{ℓ−1}} ≥ 2^{4^ℓ/2^{ℓ−1}} = 2^{2^{ℓ+1}} ≥ 4^{ℓ+1}. Since m_{ℓ−1} ≥ 4 m_{ℓ−2} and m_{ℓ−1} ≥ 4, we get m_ℓ = 2^{m_{ℓ−1}/2^{ℓ−1}} ≥ 2^{4 m_{ℓ−2}/2^{ℓ−1}} = (2^{m_{ℓ−2}/2^{ℓ−2}})² = m_{ℓ−1}² ≥ 4 m_{ℓ−1}.
Note that we have now also shown the fact m_{t−1} ≥ 4^{t−2} m_1 that we used in Step 3.

2. Now that we have m_t ≥ 4 m_{t−1} for t ≥ 2, we can show that m_t grows even faster. For t ≥ 2, m_{t+1} = 2^{m_t/2^t} ≥ 2^{4 m_{t−1}/2^t} = (2^{m_{t−1}/2^{t−1}})² = m_t².

3. Now that we have m_{t+1} ≥ m_t² for t ≥ 2 and m_t ≥ 4^{t+1} for t ≥ 1, we can show that m_t grows much faster. For t ≥ 3, m_{t+1} = 2^{m_t/2^t} ≥ 2^{m_{t−1}²/2^t} = (√m_t)^{m_{t−1}} ≥ 2^{m_{t−1}}. (For t = 3, we also use the additional fact m_2 ≥ 4^3.)

Since log* is a non-decreasing function, if t ≥ 3, log* m_{t+1} ≥ log*(2^{m_{t−1}}) = 1 + log* m_{t−1}. Unrolling this gives log* m_{t−1} ≥ (t − 3)/2 + log* m_1 as long as t is odd and t ≥ 3. Define ˆt := 2(log* n − log*(k/√n)) + 3. Because T ≥ ˆt, m_{T−1} ≥ m_{ˆt−1}. Combining this with m_1 = k/√n, the fact that log* is non-decreasing, and plugging into the inequality above, we get log* m_{T−1} ≥ log* n ⟹ m_{T−1} ≥ n > 4√n, which completes the proof.

Algorithm 2: V_t-Membership (t ≥ 2)
Input:
Graph G = ([n], E) ∼ G(n, 1/2, k), clique size k, round number t, vertex v ∈ N_t, access to V_{t−1}-Membership
Output:
Membership in V_t: output 1 if v ∈ V_t, and 0 otherwise
Initialize sizeV_{t−1} = 0, degV_{t−1} = 0
for u ∈ N_{t−1} do
  if V_{t−1}-Membership(G, k, t − 1, u) = 1 then (compute |V_{t−1}| and the V_{t−1}-degree of v)
    sizeV_{t−1} = sizeV_{t−1} + 1
    degV_{t−1} = degV_{t−1} + 1[(u, v) ∈ E]
  end
end
output 1{ degV_{t−1} ≥ sizeV_{t−1}/2 + k_{t+2}/2 − √(sizeV_{t−1}) }

The V_t-Membership algorithm simply computes the number of edges from a vertex v to the set V_{t−1} and uses this to determine whether or not v is in V_t.

Lemma 2.3 (Small space filter implementation). Let G = ([n], E) ∼ G(n, 1/2, k) with a clique of size k. Let V_1-Membership be an algorithm that returns 1 for every vertex in N_1, and let V_t-Membership be defined as in Algorithm 2 for t ≥ 2. Given a vertex v ∈ N_t, V_t-Membership(G, k, t, v) returns 1 if and only if v ∈ V_t. Otherwise it returns 0. Moreover, it runs in space O(t · log n).

Proof. We prove this via induction on t. For the base case t = 1, V_1 = N_1, so the algorithm behaves as advertised. Its space usage is clearly O(log n) since it outputs a constant.

For the inductive step, we assume the statement of the lemma is true for t = ℓ − 1, and prove it for t = ℓ. The correctness of V_ℓ-Membership follows immediately from the correctness of V_{ℓ−1}-Membership and the definition of the vertex set V_ℓ.

Let us now analyse the space usage. To iterate over N_{ℓ−1}, the algorithm needs to maintain u, and can iterate simply by increasing the name of u by 1. Additionally, the algorithm also needs to be able to compute n_{ℓ−1} and n_{ℓ−2} to decide the start and end points of the loop. It can do all of this in O(log n) space because it has access to n, ℓ from the input. The algorithm requires a further O(log n) bits to maintain 0 ≤ sizeV_{ℓ−1}, degV_{ℓ−1} ≤ n − 1. Lastly, it needs to run V_{ℓ−1}-Membership. It can compute ℓ − 1 from ℓ, and then by our inductive assumption it can run V_{ℓ−1}-Membership using another O((ℓ − 1) · log n) bits of space. Note that this space can be re-used for every call to V_{ℓ−1}-Membership. Finally, to implement the thresholding, the algorithm also needs access to k_{ℓ+2}, which it can easily compute in O(log n) space from the inputs n, k, ℓ. Square roots can also be computed in logarithmic space up to the desired few bits of precision required to make the comparison. The total space usage is thus O(ℓ · log n) bits, which completes the proof.

Theorem 1.
Let G = ([n], E) ∼ G(n, 1/2, k) with a planted clique of size k ≥ C√n, with the constant C > 0 chosen as in Lemma 2.2. Suppose T := 2(log* n − log*(k/√n)) + 3. Then for large enough n, there exists a deterministic algorithm that takes as input the adjacency matrix of the graph and the size of the planted clique, exactly outputs the clique K with probability at least 1 − O((1/n)^{log k}) over the randomness in the graph G, and runs using O(T · log n) bits of space.
1. If k = C√n, the space usage is O(log* n · log n) bits.
2. If k = ω(√n log^{(ℓ)} n) for some constant integer ℓ > 0, the space usage is O(log n) bits.

Proof. We first note that given n, k as inputs, T can be computed with O(log n) bits of space. This means we can easily implement an algorithm to check membership in V_T. Given a vertex v ∈ [n], in O(log n) space we can check if it is in N_T. If it is not, we declare it is not in V_T. If it is, we run V_T-Membership(G, k, T, v). Due to Lemma 2.3 this gives us an O(T · log n) space oracle that can answer if a vertex is in V_T or not. Moreover, by Lemma 2.2, except with probability at most O(exp(−n^{0.1})), V_T is a subset of the planted clique and has more than 2 log n vertices. Using this oracle with Algorithm 1 (Small Space Clique Completion) and invoking Lemma 2.1 gives us a deterministic algorithm that runs in space O(T · log n) and outputs the planted clique K with probability at least 1 − O( exp(−n^{0.1}) + (1/n)^{log k} + n exp(−k/54) ) ≥ 1 − O((1/n)^{log k}) over the randomness in the graph G.

We state the Chernoff bound we use here, for the convenience of the reader.
Lemma 3.1.
Let X = Σ_{i=1}^{n} X_i, where the X_i are independent Bern(p_i) random variables. Let μ = Σ_{i=1}^{n} p_i and 0 < δ. Then

P(X ≥ (1 + δ)μ) ≤ exp( −μδ²/(2 + δ) ) and P(X ≤ (1 − δ)μ) ≤ exp( −μδ²/2 ).

(If δ < 1, the first bound also means P(X ≥ (1 + δ)μ) ≤ exp(−μδ²/3), a fact we will use often.)

We state some structural lemmas about the planted clique graph that follow from simple probabilistic arguments. First we show that with high probability, any clique subset of size greater than 2 log n has at most 3 log k non-clique vertices connected to every vertex of the subset. The ideas of such an analysis are contained in the proof of [DGGP14, Lemma 2.9].

Lemma 3.2. Let G ∼ G(n, 1/2, k) for k ≥ 2 log n, and let S be any arbitrary subset of the planted clique K with |S| ≥ 2 log n. Let T be the set of all non-clique vertices that are connected to every vertex in S. Then, except with probability at most (1/n)^{log k}, |T| ≤ 3 log k.

Proof.
Fix S′ ⊂ S ⊂ K such that |S′| = 2 log n. Let T′ denote the set of all non-clique vertices that are connected to every vertex in S′. Clearly, |T| ≤ |T′|. So we will show that |T′| ≤ 3 log k except with probability at most (1/n)^{log k}.

Let W be any subset of K with |W| = 2 log n. The probability that there exists a subset of non-clique vertices of size ℓ connected to every element in W is at most (n choose ℓ) · 2^{−ℓ(2 log n)}. A union bound then implies that the probability there exists a subset of non-clique vertices of size at least ℓ = 1 + 3 log k connected to every element in W is at most Σ_{ℓ′=ℓ}^{n−k} (n choose ℓ′) 2^{−ℓ′(2 log n)} ≤ 2^{−3 log k log n}. Further union bounding over all subsets of K of size 2 log n implies |T′| ≤ 3 log k except with probability at most

(k choose 2 log n) · 2^{−3 log k log n} ≤ k^{2 log n} · 2^{−3 log k log n} = 2^{−log k log n} = (1/n)^{log k}.

We also control the number of clique vertices any non-clique vertex is connected to.
Lemma 3.3.
Let G ∼ G(n, 1/2, k), and let d be the maximum number of clique vertices connected to a non-clique vertex. Then P(d ≥ 2k/3) ≤ n exp(−k/54).

Proof. A Chernoff bound (Lemma 3.1) shows that any given non-clique vertex is connected to more than 2k/3 clique vertices with probability at most exp(−k/54), and a union bound over the at most n non-clique vertices then finishes the proof.

For the convenience of the reader, we present a proof of the well known fact that Erdős-Rényi graphs do not have large cliques. See, for example, [BE76].

Lemma 3.4.
Let G ∼ G(n, 1/2) and let ǫ > 0 be a positive constant. Except with probability at most O(2^{−ǫ log² n}), G contains no cliques of size (2 + ǫ) log n or larger.

Proof. If G has a clique of size larger than (2 + ǫ) log n, it also has a clique of size (2 + ǫ) log n. By a simple union bound over all vertex subsets of size (2 + ǫ) log n, the probability that G has a clique of this size is at most (n choose (2+ǫ) log n) · 2^{−((2+ǫ) log n choose 2)} = O(2^{−ǫ log² n}).

We now show the existence of a small space recovery algorithm above the information theoretic threshold.

Lemma 3.5 ([AAK+07] reduction + small space detection). Let ω(log n) = k = o(n) and G ∼ G(n, 1/2, k) = ([n], E). Then there is a deterministic O(log² n) space algorithm that outputs the planted clique except with probability at most O(n exp(−k/54) + n^{−Θ(log n)}).

Proof. For a vertex v ∈ [n], denote by G_v the graph induced on the vertex subset formed by removing v and all its neighbours from [n]. Assume that every non-clique vertex in G is connected to at most 2k/3 clique vertices. By Lemma 3.3, this happens except with probability at most n exp(−k/54). Assume also that every vertex in G has degree at most 2n/3. By a union and Chernoff bound, this happens except with probability at most n exp(−cn) for some constant c > 0. By a union bound, we can assume that both the structural assumptions we have made hold simultaneously except with probability at most O(n exp(−k/54)).

If v is a clique vertex, G_v is an Erdős-Rényi graph with no planted clique and at least n/3 vertices. By Lemma 3.4 and a union bound, except with probability at most n^{−Θ(log n)}, the largest clique in G_v for all clique vertices v is less than 3 log n. Overall, all our structural assumptions hold except with probability at most O(n exp(−k/54) + n^{−Θ(log n)}).

If v is not a clique vertex, G_v is a planted clique graph with a planted clique of size at least k/3. Hence it has a clique of size 3 log n. We can use this property to distinguish between clique and non-clique vertices.

Our algorithm can use an O(log n) bit counter to loop over all vertices in [n]. For a given vertex v, our algorithm says it is not in the planted clique if and only if it finds a clique of size 3 log n in G_v. To check this, the algorithm can store 3 log n names of vertices (taking O(log² n) bits of space) and loop over all possibilities. If it finds a clique formed by vertices that are all unconnected to v, it declares v to be not in the planted clique. To check the existence of a clique for a given set of 3 log n vertices, it only needs a further O(log log n) bits of space to loop over all possible edges between this set of vertices. The overall space usage is thus O(log² n) bits.

Acknowledgments
We would like to thank Dean Doron, Gábor Lugosi, and Kevin Tian for helpful discussions and pointers to relevant literature.
References [AAK +
07] Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and NingXie. Testing k-wise and almost k-wise independence. In
Proceedings of the thirty-ninthannual ACM symposium on Theory of computing , pages 496–505, 2007. 3, 4, 5, 7, 10, 19[AB09] Sanjeev Arora and Boaz Barak.
Computational complexity: a modern approach . Cam-bridge University Press, 2009. 2[Abb17] Emmanuel Abbe. Community detection and stochastic block models: recent develop-ments.
The Journal of Machine Learning Research , 18(1):6446–6531, 2017. 2, 5[ACO08] Dimitris Achlioptas and Amin Coja-Oghlan. Algorithmic barriers from phase transitions.In , pages 793–802. IEEE, 2008. 5[AKS98] Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden clique in arandom graph.
Random Structures & Algorithms , 13(3-4):457–466, 1998. 5, 6, 7[AS16] Emmanuel Abbe and Colin Sandon. Achieving the ks threshold in the general stochasticblock model with linearized acyclic belief propagation. In
Advances in Neural InformationProcessing Systems , pages 1334–1342, 2016. 5[AV11] Brendan PW Ames and Stephen A Vavasis. Nuclear norm minimization for the plantedclique and biclique problems.
Mathematical Programming, 129(1):69–89, 2011. 5, 7

[BB20] Matthew Brennan and Guy Bresler. Reducibility and statistical-computational gaps from secret leakage. arXiv preprint arXiv:2005.08099, 2020. 1, 5

[BBH18] Matthew Brennan, Guy Bresler, and Wasim Huleihel. Reducibility and computational lower bounds for problems with planted sparse structure. arXiv preprint arXiv:1806.07508, 2018. 1, 5

[BDLS17] Sivaraman Balakrishnan, Simon S Du, Jerry Li, and Aarti Singh. Computationally efficient robust sparse estimation in high dimensions. In
Conference on Learning Theory, pages 169–212, 2017. 5

[BE76] B. Bollobás and P. Erdős. Cliques in random graphs.
Mathematical Proceedings of the Cambridge Philosophical Society, 80(3):419, 1976. 10, 19

[BHK+
19] Boaz Barak, Samuel Hopkins, Jonathan Kelner, Pravesh K Kothari, Ankur Moitra, andAaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique prob-lem.
SIAM Journal on Computing , 48(2):687–735, 2019. 5[BR13] Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for sparseprincipal component detection. volume 30 of
Proceedings of Machine Learning Research ,pages 1046–1066, Princeton, NJ, USA, 12–14 Jun 2013. PMLR. 5[CX14] Yudong Chen and Jiaming Xu. Statistical-computational tradeoffs in planted problemsand submatrix localization with a growing number of clusters and submatrices. arXivpreprint arXiv:1402.1267 , 2014. 2, 5, 7[DGGP14] Yael Dekel, Ori Gurel-Gurevich, and Yuval Peres. Finding hidden cliques in linear timewith high probability.
Combinatorics, Probability and Computing , 23(1):29–49, 2014. 5,6, 7, 8, 9, 10, 18[DKMZ11] Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborov´a. Asymp-totic analysis of the stochastic block model for modular networks and its algorithmicapplications.
Physical Review E , 84(6):066106, 2011. 5[DLR79] David Dobkin, Richard J Lipton, and Steven Reiss. Linear programming is log-space hardfor p.
Information Processing Letters , 8(2):96–97, 1979. 7[DM15a] Yash Deshpande and Andrea Montanari. Finding hidden cliques of size p N/e in nearlylinear time.
Foundations of Computational Mathematics , 15(4):1069–1128, 2015. 5, 6, 7[DM15b] Yash Deshpande and Andrea Montanari. Improved sum-of-squares lower bounds for hid-den clique and hidden submatrix problems. In
Conference on Learning Theory , pages523–562, 2015. 5[DSTS17] Dean Doron, Amir Sarid, and Amnon Ta-Shma. On approximating the eigenvalues ofstochastic matrices in probabilistic logspace. computational complexity , 26(2):393–420,2017. 7[DTS15] Dean Doron and Amnon Ta-Shma. On the problem of approximating the eigenvalues ofundirected graphs in probabilistic logspace. In
International Colloquium on Automata,Languages, and Programming , pages 419–431. Springer, 2015. 7[FGR +
17] Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh S Vempala, and Ying Xiao.Statistical algorithms and a lower bound for detecting planted cliques.
Journal of theACM (JACM) , 64(2):1–37, 2017. 5 21FK00] Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden clique in asemirandom graph.
Random Structures & Algorithms , 16(2):195–208, 2000. 5, 7[FR10] Uriel Feige and Dorit Ron. Finding hidden cliques in linear time. 2010. 5, 6, 7[GZ19] David Gamarnik and Ilias Zadik. The landscape of the planted clique problem: Densesubgraphs and the overlap gap property. arXiv preprint arXiv:1904.07174 , 2019. 5[HKP +
17] Samuel B Hopkins, Pravesh K Kothari, Aaron Potechin, Prasad Raghavendra, TselilSchramm, and David Steurer. The power of sum-of-squares for detecting hidden struc-tures. In , pages 720–731. IEEE, 2017. 5[HKP +
18] Samuel B Hopkins, Pravesh Kothari, Aaron Henry Potechin, Prasad Raghavendra, andTselil Schramm. On the integrality gap of degree-4 sum of squares for planted clique.
ACM Transactions on Algorithms (TALG) , 14(3):1–31, 2018. 5[Hop18] Samuel Brink Klevit Hopkins. Statistical inference and the sum of squares method. 2018.5[HS05] Don Hush and Clint Scovel. Concentration of the hypergeometric distribution.
Statistics& probability letters , 75(2):127–132, 2005. 13[HS17] Samuel B Hopkins and David Steurer. Bayesian estimation from few samples: communitydetection and related problems. arXiv preprint arXiv:1710.00264 , 2017. 5[HWX15] Bruce Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for communitydetection on random graphs. In
Conference on Learning Theory , pages 899–928, 2015. 2,5, 7[Jer92] Mark Jerrum. Large cliques elude the metropolis process.
Random Structures & Algo-rithms , 3(4):347–359, 1992. 5[KMOW17] Pravesh K Kothari, Ryuhei Mori, Ryan O’Donnell, and David Witmer. Sum of squareslower bounds for refuting any csp. In
Proceedings of the 49th Annual ACM SIGACTSymposium on Theory of Computing , pages 132–145, 2017. 5[Kuˇc95] Ludˇek Kuˇcera. Expected complexity of graph partitioning problems.
Discrete AppliedMathematics , 57(2-3):193–212, 1995. 3, 5[KWB19] Dmitriy Kunisky, Alexander S Wein, and Afonso S Bandeira. Notes on computationalhardness of hypothesis testing: Predictions using the low-degree likelihood ratio. arXivpreprint arXiv:1907.11636 , 2019. 5[Li17] Jerry Li. Robust sparse estimation tasks in high dimensions. arXiv preprintarXiv:1702.05860 , 2017. 5[LKZ15] Thibault Lesieur, Florent Krzakala, and Lenka Zdeborov´a. Phase transitions in sparsepca. In , pages 1635–1639. IEEE, 2015. 5[Lug17] G´abor Lugosi. Lectures on combinatorial statistics. , pages 1–91, 2017. 2, 4, 10[MAC20] Jay Mardia, Hilal Asi, and Kabir Aladin Chandrasekher. Finding planted cliques insublinear time. arXiv preprint arXiv:2004.12002 , 2020. 2, 3, 4, 5, 6, 9, 1022Mas14] Laurent Massouli´e. Community detection thresholds and the weak ramanujan property.In
Proceedings of the forty-sixth annual ACM symposium on Theory of computing , pages694–703, 2014. 5[MNS15] Elchanan Mossel, Joe Neeman, and Allan Sly. Consistency thresholds for the plantedbisection model. In
Proceedings of the forty-seventh annual ACM symposium on Theoryof computing , pages 69–75, 2015. 5[MPW15] Raghu Meka, Aaron Potechin, and Avi Wigderson. Sum-of-squares lower bounds forplanted clique. In
Proceedings of the forty-seventh annual ACM symposium on Theory ofcomputing , pages 87–96, 2015. 5[RM14] Emile Richard and Andrea Montanari. A statistical model for tensor pca. In
Advancesin Neural Information Processing Systems , pages 2897–2905, 2014. 5[Ros08] Benjamin Rossman. On the constant-depth complexity of k-clique. In
Proceedings of thefortieth annual ACM symposium on Theory of computing , pages 721–730, 2008. 5[Ros10] Benjamin Rossman. The monotone complexity of k-clique on random graphs. In , pages 193–201.IEEE, 2010. 5[RS19] Mikl´os Z R´acz and Benjamin Schiffer. Finding a planted clique by adaptive probing. arXiv preprint arXiv:1903.12050 , 2019. 2, 3[Ser91] Maria Serna. Approximating linear programming is log-space complete for p.
InformationProcessing Letters , 37(4):233–236, 1991. 7[SW20] Tselil Schramm and Alexander S Wein. Computational barriers to estimation from low-degree polynomials. arXiv preprint arXiv:2008.02269 , 2020. 5[Wig19] Avi Wigderson.