Finding Planted Cliques in Sublinear Time

Abstract

We study the planted clique problem in which a clique of size $k$ is planted in an Erdős-Rényi graph of size $n$ and one wants to recover this planted clique. For $k = \Omega(\sqrt{n})$, polynomial time algorithms can find the planted clique. The fastest such algorithms run in time $O(n^2)$, which is linear (or nearly linear) in the size of the input [FR10, DGGP14, DM15a]. In this work, we initiate the development of sublinear time algorithms that find the planted clique when $k = \omega(\sqrt{n \log\log n})$. Our algorithms can recover the clique in time $\tilde{O}\left(n + \left(\frac{n}{k}\right)^3\right) = \tilde{O}\left(n^{3/2}\right)$ when $k = \Omega(\sqrt{n \log n})$, and in time $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$ for $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$. An $\Omega(n)$ running time lower bound for the planted clique recovery problem follows easily from the results of [RS19] and therefore our recovery algorithms are optimal whenever $k = \Omega(n^{2/3})$. As the lower bound of [RS19] builds on purely information theoretic arguments, it cannot provide a detection lower bound stronger than $\tilde{\Omega}\left(\frac{n^2}{k^2}\right)$. Since our algorithms for $k = \Omega(\sqrt{n \log n})$ run in time $\tilde{O}\left(\frac{n^3}{k^3} + n\right)$, we show stronger lower bounds based on computational hardness assumptions. With a slightly different notion of the planted clique problem we show that the Planted Clique Conjecture implies the following. A natural family of non-adaptive algorithms, which includes our algorithms for clique detection, cannot reliably solve the planted clique detection problem in time $O\left(\frac{n^{3-\delta}}{k^3}\right)$ for any constant $\delta > 0$. Thus we provide evidence that if detecting small cliques is hard, it is also likely that detecting large cliques is not too easy.



Jay Mardia ∗, Hilal Asi ∗, Kabir Aladin Chandrasekher ∗

Abstract

We study the planted clique problem in which a clique of size $k$ is planted in an Erdős-Rényi graph $G(n, \frac{1}{2})$ and one is interested in recovering this planted clique. It is well known that for $k = \Omega(\sqrt{n})$, polynomial time algorithms can find the planted clique. In fact, the fastest known algorithms in this regime run in time $O(n^2)$, which is linear (or nearly linear) in the size of the input [FR10, DGGP14, DM15a].

In this work, we develop sublinear time algorithms that find the planted clique in the regime $k = \omega(\sqrt{n \log\log n})$. Our algorithms can reliably recover the clique in time $\tilde{O}\left(n + \left(\frac{n}{k}\right)^3\right) = \tilde{O}\left(n^{3/2}\right)$ when $k = \Omega(\sqrt{n \log n})$, and in time $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$ for $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$. An $\Omega(n)$ running time lower bound for the planted clique recovery problem follows easily from the results of [RS19] and therefore our recovery algorithms are optimal whenever $k = \Omega(n^{2/3})$.

As the lower bound of [RS19] builds on purely information theoretic arguments, it cannot provide a detection lower bound stronger than $\tilde{\Omega}\left(\frac{n^2}{k^2}\right)$. Since our algorithms for $k = \Omega(\sqrt{n \log n})$ run in time $\tilde{O}\left(\frac{n^3}{k^3} + n\right)$, we show stronger lower bounds based on computational hardness assumptions. Using a slightly different formalization of the planted clique problem, in which every vertex is included in the clique independently with probability $k/n$, we show that the Planted Clique Conjecture implies the following. A natural family of non-adaptive algorithms, which includes our algorithms for clique detection, cannot reliably solve the planted clique detection problem in time $O\left(\frac{n^{3-\delta}}{k^3}\right)$ for any constant $\delta > 0$. Thus we provide evidence that if detecting small cliques is hard, it is also likely that detecting large cliques is not too easy.

∗ Department of Electrical Engineering, Stanford University. {jmardia, asi, kabirc}@stanford.edu. arXiv [cs.CC], Apr.

Contents

5 Algorithms
5.2 An $\tilde{O}(n^{3/2})$ algorithm for finding cliques of size $k = \Theta(\sqrt{n \log n})$
5.3 An $\tilde{O}\left((n/k)^3 + n\right)$ algorithm for finding cliques of size $k = \Omega(\sqrt{n \log n})$
5.4 An $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$ algorithm for finding cliques of size $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$
6.4 Sublinear time lower bounds for detecting cliques of size $k = \tilde{\Theta}(\sqrt{n})$ from the Planted Clique Conjecture
6.5 What do these lower bounds formally imply for $\mathsf{PC_D}(n, k)$?

Introduction

The planted clique problem, in which a clique of size $k$ is planted in an Erdős-Rényi graph $G(n, \frac{1}{2})$, has been well studied over the past two decades and has emerged as a fruitful playground for the study of average-case discrete optimization problems. The goal here is to develop algorithms that can efficiently find the planted clique, and previous work [FR10, DGGP14, DM15a] has resulted in (nearly) linear time algorithms that run in time $\tilde{O}(n^2)$ if the clique size $k$ is large enough. No lower bounds, however, were proved to establish the optimality of these algorithms, and so it is intriguing to ask if it is possible to recover the planted clique more efficiently by looking at only a small subset of the graph. As a result, in this work we investigate the following two questions:

1. Do there exist sublinear time algorithms for recovering the planted clique?
2. What is the smallest running time any algorithm can hope to have?

The main contribution of this work is to provide partial answers to the above questions, by developing sublinear time algorithms for the planted clique problem and establishing some evidence, based on the Planted Clique Conjecture, of their optimality. In the remainder of this section, we describe our contribution towards answering the above questions.

Algorithms:

We develop several sublinear time algorithms for the planted clique problem in Section 5. First, in Section 5.2 we develop an algorithm that runs in time $\tilde{O}(n^{3/2})$ and recovers the clique with high probability for $k = \Theta(\sqrt{n \log n})$. For even larger clique sizes, we show in Section 5.3 that there is an $\tilde{O}\left((n/k)^3 + n\right)$ algorithm for clique recovery. Finally, when $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$, we provide an algorithm which runs in time $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$ in Section 5.4.

Given the widespread belief (which goes by the name Planted Clique Conjecture) that no polynomial time algorithm can recover the planted clique if $k = O(n^{1/2-\delta})$ for any constant $\delta > 0$, we certainly do not expect sublinear time algorithms to work in that regime. Thus our work builds towards the idea that the planted clique problem can either be solved without even looking at the entire graph, or it needs more than a polynomial amount of time.
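
To get a feel for how the stated running times compare with the input size $n^2$, the following is a minimal Python sketch that evaluates the two bounds quoted above. The function name `recovery_time_bound` is hypothetical, constants and polylogarithmic factors are dropped, and the switch between regimes at exactly $\sqrt{n \log n}$ is a simplifying assumption, so the numbers indicate orders of magnitude only.

```python
import math


def recovery_time_bound(n: int, k: float) -> float:
    """Order-of-magnitude recovery time from the bounds stated in the text.

    Constants and polylogarithmic factors are dropped; logs are base 2.
    The second branch is only meaningful for k well above sqrt(n log log n).
    """
    if k >= math.sqrt(n * math.log2(n)):
        # Regime k = Omega(sqrt(n log n)): n + (n/k)^3.
        return n + (n / k) ** 3
    # Regime k = o(sqrt(n log n)): n^2 / exp(k^2 / (24 n)).
    return n ** 2 / math.exp(k ** 2 / (24 * n))


if __name__ == "__main__":
    n = 10 ** 6
    for k in (3_000, 5_000, 10_000, 100_000):
        bound = recovery_time_bound(n, k)
        print(f"k={k:>7}: ~{bound:.3g} (input size n^2 = {n ** 2:.3g})")
```

In this sketch the bound flattens out at roughly $n$ once $k$ reaches $n^{2/3}$, which is the regime where the text says the recovery algorithms are optimal.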

Impossibility Results:

We begin our investigation of the second question in Section 6 by observing that the results of [RS19] imply that any recovery algorithm requires time at least $\Omega\left(\frac{n^2}{k^2} + n\right)$. As a consequence, for $k = \Theta(n^{2/3})$ our (recovery) algorithm has an optimal running time of $\tilde{O}(n)$ and, somewhat surprisingly, increasing the size of the planted clique does not lead to faster recovery algorithms.

The lower bound techniques of [RS19] are purely information theoretic, and it can be seen that such techniques will not be able to prove stronger lower bounds, of the form $\Omega\left(\frac{n^3}{k^3}\right)$, that we might hope for given our algorithmic results. To circumvent this, we aim to show stronger lower bounds using widely accepted average case computational hardness assumptions. The most natural such assumption in this scenario is, evidently, the Planted Clique Conjecture. We aim to build on hardness of the planted clique problem for small cliques (as codified by the Planted Clique Conjecture) to show lower bounds for algorithms that recover large planted cliques. Our goal is to convey the (however rough) notion that the hardness of the planted clique problem in all regimes is due to the same reason. We note that the rest of our lower bounds work for the easier detection version of the planted clique problem; these bounds imply lower bounds for the recovery problem.

We make some progress towards this goal, albeit with a caveat. The caveat is that the reductions we show use a slightly different notion of a planted clique problem. In this model, each vertex is included in the clique independently with probability $\frac{k}{n}$. In the vanilla planted clique problem, the clique is a uniformly random subset of $k$ vertices. While we also show reductions between these two formalizations of the planted clique problem which demonstrate that they behave in essentially the same way, there is some subtlety to these reductions. We discuss these issues further in the subsequent sections. We remark that our algorithms work for both variants. For now, we state our results while pretending that they formally hold for the vanilla planted clique problem.

In Section 6.4 we show a connection for a restricted family of algorithms. We show that assuming the Planted Clique Conjecture, no non-adaptive rectangular algorithm can detect the existence of a planted clique of size $k = \tilde{\Omega}(\sqrt{n})$ in time $O\left(n^{3-\delta}/k^3\right)$ for any constant $\delta > 0$. We have thus transformed a computational hardness assumption that distinguishes between polynomial and superpolynomial time algorithms into a result that distinguishes between more fine-grained (in fact sublinear) running times.

Footnotes. On the algorithm of Section 5.4: as we argue in Remark 5.2, we believe the requirement $k = \omega(\sqrt{n \log\log n})$ is an artifact of our specific approach, and that a similar result should hold as long as $k = \omega(\sqrt{n})$, although we do not prove this. On the restricted family of algorithms: the algorithms for planted clique recovery we discuss in Section 5 are technically not in this class, but detection versions of these algorithms are. We elaborate more on this in later sections. Non-adaptive and rectangular algorithms are defined in Definition 6.10 and Definition 6.11.

In (worst case) Fine-Grained Complexity it is a big open question to prove any polynomial lower bounds under assumptions about polynomial vs super-polynomial time, and it is even known to be impossible with fine-grained reductions in certain settings [AB18]. We show that it is in fact possible to prove such fine-grained lower bounds in the sublinear (average case) regime and under some assumptions on the algorithms.

In the other direction, in Section 6.3 we show that for planted cliques of size $k = \Theta(\sqrt{n} \log n)$, any detection runtime lower bound of the form $\omega(n)$ gives a non-trivial $\omega(n^2)$ runtime lower bound for detecting planted cliques of size $k = 3 \log n$ (i.e. near the threshold below which detection is information theoretically impossible). While this is nowhere near as spectacular as claiming that no polynomial time algorithms can exist, it is a conditional super-linear lower bound.

We hope that these results are just first steps in showing that the non-existence of fast sublinear time algorithms for detecting large cliques is related to the hardness of detecting small cliques.

1. The running times of our algorithms for planted clique sizes just above and just below $\Theta(\sqrt{n \log n})$ are dramatically different. For $k = \Omega(\sqrt{n \log n})$, we can recover the clique in time $\tilde{O}\left(n + (n/k)^3\right) = \tilde{O}(n^{3/2})$. For $k = o(\sqrt{n \log n})$, our algorithms are not even 'truly sublinear', by which we mean that they do not run in time $O(n^{2-\delta})$ for any constant $\delta > 0$. Is there some threshold phenomenon at clique size $k = \Theta(\sqrt{n \log n})$ with such different behavior above and below it, or are there faster algorithms for smaller cliques? Both positive and negative answers to the following question would be very interesting.

Does there exist an algorithm which runs in time $O(n^{2-\delta})$ for some constant $\delta > 0$ and can recover planted cliques of size $k = o(\sqrt{n \log n})$?

2. Detection versions of our algorithms are non-adaptive and rectangular. To complement this, we have shown that the Planted Clique Conjecture implies non-existence of non-adaptive rectangular algorithms that reliably solve the detection problem and run much faster than our algorithms. This leads to wondering about the power of general algorithms with an adaptive sampling strategy.

Does there exist an adaptive and/or non-rectangular algorithm which runs in time $O(n^{3/2-\delta})$ for some constant $\delta > 0$ and reliably detects planted cliques of size $k = \Theta(\sqrt{n \log n})$?

3. To show strong lower bounds, we have relied on the most natural computational hardness assumption for this setting, namely the Planted Clique Conjecture. However, it is plausible that other assumptions might be relevant too.

Can we gain evidence for the non-existence of fast sublinear time algorithms that solve the planted clique problem using other computational hardness assumptions?

Related work

As far as the authors are aware, the planted clique problem was first studied in [Jer92], in which Jerrum studied Markov chain Monte Carlo methods and showed that the Metropolis process cannot find cliques of size $O(\sqrt{n})$. It is known that just above the information theoretic threshold, $k = 2 \log n$, there is a unique largest clique with high probability and the brute force algorithm will successfully find the clique. This lies in stark contrast to where polynomial-time algorithms begin to work. The first polynomial time algorithm was provided in [Kuč95], although it was shown to work only above the degree counting threshold $k = \Omega(\sqrt{n \log n})$. Several algorithms were later shown to work for $k = \Omega(\sqrt{n})$, starting with the spectral algorithm from [AKS98], and including an algorithm that is based on semidefinite programming from [FK00]. In fact, a line of work including more sophisticated degree counting algorithms [FR10, DGGP14] and approximate message passing [DM15a] has shown that cliques of size larger than $\Omega(\sqrt{n})$ can be found in nearly linear ($\tilde{O}(n^2)$) time. To the best of our knowledge, no sublinear time algorithm has been proposed so far.

On the flip side, it is widely believed that no polynomial time algorithm can solve the planted clique problem for clique size significantly smaller than $O(\sqrt{n})$. Evidence for this belief has mounted in recent years, and comes from showing that restricted classes of algorithms cannot beat this bound. $\Theta(\sqrt{n})$ was shown to be a barrier for the powerful sum-of-squares hierarchy [MPW15, DM15b, HKP+18, BHK+19] and for statistical query algorithms [FGR+]. The Planted Clique Conjecture (or close variants) has also been used to show average-case hardness results for various problems [AAK+07, ABBG, BR13, KZ14, MW+15, WBP16, BBH18, SBW19]. It has additionally been used as a cryptographic primitive [ABW10]. We follow in these footsteps by using the Planted Clique Conjecture as our main hardness assumption to prove lower bounds for sublinear time algorithms. A key difference here is that instead of using an assumption that talks about the gap between polynomial and superpolynomial time algorithms to obtain another such gap, we use it to show a fine-grained (in fact sublinear) hardness result that distinguishes between different polynomial running times.

In recent years, there have indeed been reductions of this form. Such connections have resulted in the burgeoning field of fine-grained complexity (see [Wil] for a nice survey), including the study of a fine-grained understanding of clique problems [ABW18]. [GR18] studied the relation between worst-case and average-case hardness of clique problems and recently [BRSV17] explored one of the first fine-grained average case complexity results by using the random self-reducibility of low-degree polynomials to turn worst case fine-grained hardness results into average case results. However, it should be noted here that our techniques are quite different from those of these works. We rely on the fact that when we look at only a small fraction of our input, the loss of information makes it look indistinguishable from a problem that should not have any polynomial time algorithm.

More recently, [FGN+20, RS19] have considered the problem of finding cliques in random graphs where the cost of the algorithm is the number of queries it makes to the adjacency matrix of the input graph. This is similar to our framework in that this quantity, the number of queries, plays a central role in both our algorithms and impossibility results. However, both of these works only bound the number of queries but allow unbounded computation time, while we only allow a sublinear amount of computation. In fact, our interest in the number of queries is simply a byproduct of this requirement.

Our techniques

All of our algorithms build on a simple idea: once an algorithm has found slightly more than $\log n$ (say $2 \log n$) clique vertices, it can efficiently (and with high probability of success) test whether any other vertex is in the clique by checking whether it is connected to all of the certified clique vertices it already has. Any non-clique vertex is unlikely to be connected to all $2 \log n$ clique vertices. The algorithm can then simply iterate over all vertices. Thus it will find all other clique vertices, as well as a few false positives which can be removed with some post-processing.

Subroutines of this form are not new, and are known in the planted clique literature [DGGP14, Lemma 2.9]. However, we need our subroutine to run in time $\tilde{O}(n)$ and without knowledge of the planted clique size. The clique completion lemma in [DGGP14] both needs $k$ to be specified and has a running time, lower bounded in terms of $k$, that can exceed this budget, and so is unsuitable for our purposes.

We circumvent this and create a clique completion subroutine with the desired properties by using a slightly different post-processing technique. In Section 5.1 we describe this subroutine, Clique-Completion (Algorithm 1), which takes a subset of the clique of size $2 \log n$ and, in running time $O(n \log n)$, returns the planted clique with high probability (as long as $k = \omega(\log n)$). Building on this subroutine, our algorithms first find a subset of the clique of size $2 \log n$ and then invoke the completion procedure to find the remaining clique vertices.
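
The completion idea described above can be illustrated with a small Python sketch. This is not the paper's Algorithm 1: the helper names `planted_clique_graph` and `clique_completion` are hypothetical, and the post-processing step (keep a candidate only if it is adjacent to at least three quarters of a small random sample of candidates) is an assumption made purely for illustration, since the actual post-processing is deferred to Section 5.1.

```python
import math
import random


def planted_clique_graph(n, k, rng):
    """Sample G(n, 1/2) and plant a clique on k uniformly random vertices."""
    clique = set(rng.sample(range(n), k))
    adj = [[False] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if (u in clique and v in clique) or rng.random() < 0.5:
                adj[u][v] = adj[v][u] = True
    return adj, clique


def clique_completion(adj, seed, rng):
    """Extend a seed of known clique vertices to a guess for the whole clique."""
    n = len(adj)
    seed = list(seed)
    # Pass 1: keep every vertex adjacent to all seed vertices.
    # This makes about n * |seed| = O(n log n) adjacency queries.
    candidates = [v for v in range(n)
                  if all(adj[v][s] for s in seed if s != v)]
    # Pass 2 (illustrative post-processing, an assumption of this sketch):
    # keep a candidate only if it is adjacent to at least 3/4 of a small
    # random sample of the candidates.
    sample = rng.sample(candidates, min(len(candidates), len(seed)))
    kept = set()
    for v in candidates:
        others = [u for u in sample if u != v]
        hits = sum(adj[v][u] for u in others)
        if not others or 4 * hits >= 3 * len(others):
            kept.add(v)
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    n, k = 300, 40
    adj, clique = planted_clique_graph(n, k, rng)
    seed = rng.sample(sorted(clique), math.ceil(2 * math.log2(n)))
    recovered = clique_completion(adj, seed, rng)
    print(len(recovered & clique), "of", k, "clique vertices recovered;",
          len(recovered - clique), "false positives")
```

The seed here is drawn independently of the random edges, which is a simplification: in the algorithms of the paper the seed is produced by an earlier stage that has already looked at the graph.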

An $\tilde{O}(n^{3/2})$ algorithm for finding cliques of size $k = \Theta(\sqrt{n \log n})$

Theorem 1 in Section 5.2 describes an algorithm Keep-High-Degree-And-Complete (Algorithm 2) which runs in time $\tilde{O}(n^{3/2})$ and finds the planted clique with high probability of success as long as the clique size $k \geq C\sqrt{n \log n}$ for a large enough constant $C$. The algorithm follows from the same simple observations that led [Kuč95] to give the first polynomial time algorithm for planted clique at the same threshold $k \geq C\sqrt{n \log n}$.

1. The degree of each non-clique vertex is distributed as $\mathrm{Bin}(n, \frac{1}{2})$. There are at most $n$ non-clique vertices, and with high probability the maximum of $n$ (possibly dependent) $\mathrm{Bin}(n, \frac{1}{2})$ random variables is at most $\frac{n}{2} + c\sqrt{n \log n}$ for some constant $c > 0$.
2. With high probability, the degree of every clique vertex is at least $\frac{n-k}{2} + k - c\sqrt{n \log n} = \frac{n}{2} + \frac{C}{2}\sqrt{n \log n} - c\sqrt{n \log n}$.

If we choose $C$ large enough, simply computing the degree of a vertex (which takes time $O(n)$) lets us decide if the vertex is in the clique or not. Since $k$ out of the $n$ vertices are in the clique, if we randomly sample slightly more than $\frac{n}{k}$ vertices from $V$, we will get at least $2 \log n$ clique vertices, and can identify them by computing the degree of all the vertices we have sampled, which takes time $\tilde{O}\left(\frac{n^2}{k}\right) = \tilde{O}\left(n^{3/2}\right)$. Then we simply use the Clique-Completion subroutine to find the entire clique in a further $O(n \log n)$ time.

An $\tilde{O}\left((n/k)^3 + n\right)$ algorithm for finding cliques of size $k = \Omega(\sqrt{n \log n})$

The results of Theorem 1 give a runtime of $\tilde{O}\left(\frac{n^2}{k} + n\right)$ for $k = \Omega(\sqrt{n \log n})$.
However, Keep-High-Degree-And-Complete was tailored for $k = \Theta(\sqrt{n \log n})$. Theorem 2 in Section 5.3 shows that for larger $k$, we can improve the runtime to $\tilde{O}\left(\left(\frac{n}{k}\right)^3 + n\right)$ using Subsample-And-KHDAC, Algorithm 3.

In view of the (unconditional) lower bound we note in Remark 6.7, which says that any algorithm that reliably recovers the clique requires running time $\Omega(n)$, this has the following slightly surprising consequence. Once the planted clique is of size at least $k = \Omega(n^{2/3})$, we get an optimal runtime of $\tilde{O}(n)$, and increasing the size of the planted clique further does not make the recovery problem easier.

The idea behind the faster algorithm is simple. Let $p < 1$. If we subsample roughly $pn$ vertices (which can be done in time $O(pn)$), we expect $pk$ of them to be planted clique vertices. If $pk = \Omega(\sqrt{pn \log(pn)})$, we can run Keep-High-Degree-And-Complete on this smaller problem instance in time $\tilde{O}\left((pn)^{3/2}\right)$ to recover $pk$ planted clique vertices. We can then just run Clique-Completion on a subset of them in time $O(n \log n)$. Observe that we need $p \approx \frac{n}{k^2}$ for this to work, resulting in a runtime of $\tilde{O}\left(\left(\frac{n}{k}\right)^3 + n\right)$.

An $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$ algorithm for finding cliques of size $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$

Theorem 3 analyses Subsample-And-Filter (Algorithm 4) and shows that even when $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$, degrees do help solve the planted clique recovery problem in sublinear time $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$. However, our algorithm is not 'truly sublinear'.
That is, it does not have running time $O(n^{2-\epsilon})$ for any constant $\epsilon > 0$. We leave the existence of such an algorithm for $k = o(\sqrt{n \log n})$ as a compelling open problem.

The reason degree counting (as in [Kuč95] and Theorem 1) works for finding planted cliques of size $\Omega(\sqrt{n \log n})$ is that there exists a clear separation between the degree of clique vertices and non-clique vertices. Stated differently, if we see a vertex that has degree close to $\frac{n}{2} + \frac{k}{2}$, we know it is in the clique, and if the degree is much less than $\frac{n}{2} + \frac{k}{2}$ (even if it is much larger than $\frac{n}{2}$), we know it is not in the clique. The situation changes when $\omega(\sqrt{n}) = k = o(\sqrt{n \log n})$. A vertex with degree close to (or even much larger than) $\frac{n}{2} + \frac{k}{2}$ may be a non-clique vertex.

However, all is not lost. Given a clique vertex, its degree is very likely to be close to its expectation of $\frac{n}{2} + \frac{k}{2}$. On the other hand, given a non-clique vertex, its degree is much less likely to be close to $\frac{n}{2} + \frac{k}{2}$, even though this likelihood is not as small as in the case of $k = \Omega(\sqrt{n \log n})$. This suggests that we filter out vertices based on this closeness criterion.

We subsample an i.i.d. $p$ fraction of the vertices (this can be done in time $O(n)$), and then compute the degree of each of these (approximately) $pn$ vertices. This takes time $O(pn^2)$. We then throw away all vertices whose degree is not within $O(\sqrt{n})$ of $\frac{n}{2} + \frac{k}{2}$. The hope is that this will boost the ratio of clique to non-clique vertices because of the discussion above. If we choose $p$ large enough so that at the end of this process we get $n'$ vertices in all, out of which $k'$ are planted clique vertices, and $k' = \Omega(\sqrt{n' \log n'})$, then we can use Keep-High-Degree-And-Complete for finding planted cliques on this smaller problem instance. Because $n'$ is at most about $pn$, this takes time no more than $O(pn^2)$.
We can then use Clique-Completion as a final step to find all the clique vertices in the original problem. As we see during the analysis, it suffices to take $p = \tilde{O}\left(\exp\left(-\frac{k^2}{24n}\right)\right)$.

The simple observation that underlies our impossibility results is that a sublinear time algorithm cannot see the entire input, and hence must work without a fair chunk of information about the input. When this information is not available to the algorithm, we will argue that what it does see is either statistically (in results that follow immediately from [RS19]) or computationally (because of the Planted Clique Conjecture) not solvable. In this sense, we convert a polynomial vs superpolynomial hardness gap to a fine-grained (in fact sublinear) hardness gap.

While we defer formal definitions of the problem statement and model of computation to Sections 4.1 and 4.2, essentially our model is that the input is presented to the algorithm via the adjacency matrix of the graph, and we assume that querying any entry of this matrix takes unit time. Since accessing an entry of the input takes unit time, if an algorithm runs in time $T(n)$, it can observe at most $O(T(n))$ entries of the input adjacency matrix (Remark 6.1).

The work [RS19] completely characterizes (up to log factors) the query complexity of the planted clique recovery problem as $\tilde{\Theta}\left(\frac{n^2}{k^2} + n\right)$ in the following model. The algorithm gets as input an instance of the planted clique problem (which is the adjacency matrix of the graph), and can only access the input by querying entries of this adjacency matrix. The cost of the algorithm is measured as the number of entries of the matrix it needs to query, and computation is not penalized.

Since any algorithm needs to make at least $\Omega\left(\frac{n^2}{k^2} + n\right)$ queries [RS19], it also requires at least $\Omega\left(\frac{n^2}{k^2} + n\right)$ running time. While this provides a tight lower bound for cliques of size $k = \Omega(n^{2/3})$, it is quite far from our algorithmic upper bound of $\tilde{O}(n^{3/2})$ for cliques of size $\Theta(\sqrt{n \log n})$, for which it provides only an $\Omega(n)$ lower bound. However, since we also know that $\tilde{O}\left(\frac{n^2}{k^2} + n\right)$ queries suffice to (inefficiently) solve the planted clique problem, we resort to using computational hardness assumptions to show stronger lower bounds. We focus on showing results for the detection version of the problem, since recovery is only harder than detection.

Since the most natural average case computational hardness assumption in this scenario is the Planted Clique Conjecture (Conjecture 6.3), our goal is to relate hardness of the planted clique problem for small cliques to the non-existence of fast sublinear time algorithms for large planted cliques. We also want to show that this connection goes both ways.

As we remarked earlier, we show this connection using a slightly different notion of a planted clique problem which we call $\mathsf{iidPC_D}$ (Definition 6.2). In this model, each vertex is included in the clique independently with probability $\frac{k}{n}$. The idea is that since the two models are quite similar to each other, we can use one as a proxy to study the other. Hence impossibility results for one give evidence for impossibility theorems in the other. We first state the results we obtain, and then discuss the relation between this model and the vanilla planted clique detection problem $\mathsf{PC_D}$ (Definition 4.3).

1. Lower bounds for detecting planted cliques of size close to information theoretic threshold from sublinear lower bounds for detection at clique size $k = \tilde{\Theta}(\sqrt{n})$

Consider the planted clique detection problem with planted clique size just larger than $\sqrt{n} \log n$. Create a subgraph by only retaining the first $\sqrt{n}$ vertices. Then we have a graph of size $\sqrt{n}$ with a planted clique of size slightly more than $2 \log(\sqrt{n})$, the information theoretic threshold. Hence if we could solve the detection problem on a graph of size $n$ with a planted clique of size near the information theoretic threshold in time $O(n^{2\delta})$ (for any constant $\delta > 0$), we could solve the original problem with clique size $\sqrt{n} \log n$ in time $\tilde{O}(n^{\delta})$. A lower bound on the original problem then translates into a lower bound on the problem at the information theoretic threshold. Moreover, a lower bound of the form $\omega(n)$ for the original problem would imply a non-trivial superlinear ($\omega(n^2)$) lower bound for detecting small cliques.
This indicates that a lower bound of the form $\omega(n)$ will require computational hardness assumptions to show. Formalizing this intuition is more convenient in the $\mathsf{iidPC_D}$ world than in the $\mathsf{PC_D}$ world, and we prove a slightly more general reduction in Section 6.3 in Lemma 6.9.

2. Sublinear time lower bounds for detecting cliques of size $k = \tilde{\Theta}(\sqrt{n})$ from the Planted Clique Conjecture

In the other direction, our results hold for a (reasonable) subclass of all algorithms, namely non-adaptive rectangular algorithms. In Theorem 4 we show that if the Planted Clique Conjecture is true, any non-adaptive rectangular algorithm that reliably solves the $\mathsf{iidPC_D}$ problem for clique sizes around $k = \tilde{\Theta}(\sqrt{n})$ must have runtime $\Omega(n^{3/2-\delta})$ for any positive constant $\delta$, which essentially matches our algorithmic upper bound.

What are these restrictions on the algorithm?

A non-adaptive algorithm is one in which the set of queries the algorithm makes is chosen (possibly randomly) ahead of time and does not depend on the input to the algorithm. A rectangular algorithm is one whose query set is 'structured' in some sense. It is one way of trying to impose the idea that a non-adaptive algorithm must treat all vertices as equally as possible since a priori they are all equally likely to be in the planted clique.

Restricting our lower bounds to non-adaptive rectangular algorithms is not too unreasonable. This is because our upper bound algorithms are only weakly adaptive or non-rectangular. In fact, the Clique-Completion subroutine is the only adaptive or non-rectangular part of Algorithms 2, 3, or 4. Moreover, Clique-Completion is only required for the planted clique recovery problem. If we only wanted to solve the detection problem, a simple tweak to Algorithm 2, so that it does not use Clique-Completion but only decides whether or not a planted clique exists based on the largest degree it observes, gives a non-adaptive rectangular detection algorithm that runs in time $O(n^{3/2})$. Similarly, removing the Clique-Completion subroutine from Algorithm 3 while using the modified version of Algorithm 2 inside it gives a non-adaptive rectangular detection algorithm that runs in time $\tilde{O}\left(\frac{n^3}{k^3}\right)$ and reliably detects cliques of size $k = \Omega(\sqrt{n \log n})$. We leave the details to the reader. Since we are showing lower bounds for the detection version of the problem, our upper bound algorithms do indeed belong to the class of algorithms against which we are showing lower bounds.

Footnote (to the runtime bound of Theorem 4): In fact we show an $\Omega\left(\frac{n^{3-\delta}}{k^3}\right)$ lower bound for any $k$, matching our algorithmic upper bound.

Intuition for impossibility result:

Let $\delta > 0$ be a constant and let $k = \tilde{\Theta}(\sqrt{n})$. We first use the non-adaptivity of the algorithm to argue that we need to only consider algorithms whose queries are deterministic and not randomized (Remark 6.4). If the algorithm runs in time $O(n^{3/2-\delta})$, it can (deterministically) query at most $O(n^{3/2-\delta})$ entries of the adjacency matrix. Under the randomness of the location of the planted vertices, each off-diagonal entry in the adjacency matrix corresponds to a 'planted' entry with probability roughly $k^2/n^2 = \tilde{O}(1/n)$. By linearity of expectation, the expected number of queries the algorithm makes which are 'planted' entries is $\tilde{O}(n^{1/2-\delta})$. This means that we expect the algorithm to obtain evidence of 'plantedness' from roughly only $\tilde{O}(n^{1/2-\delta})$ vertices. According to the Planted Clique Conjecture, if there are so few planted vertices, it is computationally hard to distinguish between the planted and null models. Thus we might believe that solving the original problem is also computationally hard if we query so few entries.

It turns out that we are only able to turn this intuition into a formal reduction for rectangular algorithms, and do so in Section 6.4.

We can now discuss in a little more detail the connection between $\mathsf{iidPC_D}$ and $\mathsf{PC_D}$. Intuitively, we expect these two problems to behave similarly since they denote morally similar ways of randomly sampling a clique. This similarity can be made formal, and we do so in Lemmas 6.12 and 6.13, where we show that hardness of one problem implies hardness of the other. However, these reductions involve a subtlety, and do not let us obtain theorems like Theorem 4 for the $\mathsf{PC_D}$ problem. Instead, in Section 6.2 we discuss how our algorithms all actually work for $\mathsf{iidPC_D}$ too, not just $\mathsf{PC_D}$. Hence the reader can view this entire paper as showing formal sublinear time algorithms as well as impossibility results for $\mathsf{iidPC_D}$. In Section 6.5 we further discuss what implications we can obtain for the $\mathsf{PC_D}$ problem.
These implications are of the flavour that any very fast sublinear time algorithm must crucially utilise a very precise estimate of the size of the planted clique if it is to succeed. Thus such an algorithm cannot be very robust to misspecification of the clique size.

In Section 4.1 we set up the notation we will use along with all the formal definitions of the various flavours of planted clique problem we consider: detection, recovery, and iid detection. In Section 4.2 we specify our model of computation.
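The counting step in the intuition for the impossibility result can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, the query set is fixed in advance to mimic a non-adaptive algorithm, and the clique is a uniformly random $k$-subset. It compares a Monte Carlo average of the number of 'planted' queried entries with the exact value given by linearity of expectation, $|Q| \cdot k(k-1)/(n(n-1))$.

```python
import random

def planted_probability(n, k):
    """Exact probability that a fixed off-diagonal entry has both endpoints
    in a uniformly random k-subset of n vertices."""
    return k * (k - 1) / (n * (n - 1))

def fixed_query_set(n, m, seed=1):
    """A query set of m off-diagonal entries chosen once, before seeing the graph
    (an illustrative stand-in for a non-adaptive algorithm's queries)."""
    rng = random.Random(seed)
    queries = []
    while len(queries) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            queries.append((u, v))
    return queries

def count_planted_queries(n, k, queries, rng):
    """Draw a random clique location and count queried entries that are planted."""
    clique = set(rng.sample(range(n), k))
    return sum(1 for u, v in queries if u in clique and v in clique)

def average_planted_queries(n, k, queries, trials, seed=0):
    """Monte Carlo estimate of the expected number of planted queries."""
    rng = random.Random(seed)
    total = sum(count_planted_queries(n, k, queries, rng) for _ in range(trials))
    return total / trials
```

With $k$ close to $\sqrt{n}$ the exact probability is about $1/n$, so a query budget far below $n^{3/2}$ touches correspondingly few planted entries on average.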

Notation:

We will use standard big-O notation ($O$, $\Theta$, $\Omega$) and will write $\tilde{O}(f(n))$ to denote $\mathrm{poly}(\log n) \cdot O(f(n))$, and define $\tilde{\Theta}$, $\tilde{\Omega}$ similarly. We will denote the set of graphs on $n$ vertices by $\mathcal{G}_n$. For a vertex $v$ in a graph $G = (V, E)$, we will denote its degree by $\deg(v)$. An edge between nodes $u, v \in V$ is denoted $(u, v)$. We let $\mathrm{Bin}(n, \frac{1}{2})$ denote a Binomial random variable with parameters $(n, \frac{1}{2})$. Similarly, $\mathrm{Bern}(p)$ denotes a Bernoulli random variable that is 1 with probability $p$ and 0 otherwise. Unless stated otherwise, all logarithms are taken base 2. By $[n]$ we denote the set $\{1, 2, ..., n\}$. We will sometimes drop the word planted from planted clique and simply use clique, since the planting will be implied from context. All graphs in this work are undirected.

In this section we provide formal definitions of the graph ensembles we use and the planted clique problem.

Definition 4.1 (Erdős-Rényi graph distribution: $G(n, \frac{1}{2})$). Let $G = (V, E)$ be a graph with vertex set $V$ of size $n$. The edge set $E$ is created by including each possible edge independently with probability $\frac{1}{2}$. The distribution on graphs thus formed is denoted $G(n, \frac{1}{2})$.

Definition 4.2 (Planted Clique graph distribution: $G(n, \frac{1}{2}, k)$). Let $G = (V, E)$ be a graph with vertex set $V$ of size $n$. Moreover, let $K \subset V$ be a set of size $k$ chosen uniformly at random from all $\binom{n}{k}$ subsets of size $k$. For all distinct pairs of vertices $u, v \in K$, we add the edge $(u, v)$ to $E$. For all remaining distinct pairs of vertices $u, v$, we add the edge $(u, v)$ to $E$ independently with probability $\frac{1}{2}$. The distribution on graphs thus formed is denoted $G(n, \frac{1}{2}, k)$.

Definition 4.3 (Planted Clique Detection Problem: $\mathrm{PC}_D(n, k)$). This is the following hypothesis testing problem.

$H_0 : G \sim G(n, \tfrac{1}{2})$ and $H_1 : G \sim G(n, \tfrac{1}{2}, k)$ (4.1)

Definition 4.4 (Planted Clique Recovery Problem: $\mathrm{PC}_R(n, k)$). Given an instance of $G \sim G(n, \frac{1}{2}, k)$, recover the planted clique $K$.

When we talk about sublinear algorithms, it is necessary to specify the model of computation within which we are working. Since we are working with dense graphs (both $G(n, \frac{1}{2})$ and $G(n, \frac{1}{2}, k)$ have $O(n^2)$ edges with high probability), it is reasonable to assume that the graph is provided via its adjacency matrix. Formally, the algorithm has access to the adjacency matrix $A_G$ of the graph $G$, which is a matrix whose rows and columns are indexed by the vertex set $V$ and whose entries are defined as follows: $A_G(u, v) = A_G(v, u) = 1$ if $(u, v) \in E$ and 0 otherwise. Also, $A_G(u, u) = 0$. This is essentially the same as the Dense Graph Model that has been widely studied in the graph property testing literature (see, e.g., [Gol10]). Computationally, we assume that the algorithm can query any entry of this matrix in unit time. We also assume that sampling a vertex uniformly at random takes unit time, and any other similar edge or vertex manipulation operations take unit time.

We begin this section by describing clique completion, a crucial subroutine upon which all of our algorithms build. Section 5.1 describes our clique completion subroutine. Section 5.2 provides an algorithm which reliably recovers planted cliques of size $k = \Theta(\sqrt{n \log n})$ in running time $\tilde{O}(n^{3/2})$. Section 5.3 builds upon the preceding algorithm to reliably recover planted cliques of size $k = \Omega(\sqrt{n \log n})$ in time $\tilde{O}((n/k)^3 + n)$. Lastly, Section 5.4 provides an algorithm which reliably recovers planted cliques of size $\omega(\sqrt{n \log\log n}) = k = o(\sqrt{n \log n})$ in time $\tilde{O}\left(n^2 / \exp\left(\frac{k^2}{24n}\right)\right)$. The intuition for all of these algorithms is provided in Section 3.1, and rather than repeat the same here, we simply provide the technical details and proofs in this section.
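The model of computation above (unit-time queries to the adjacency matrix of a planted clique instance) can be mocked up for experiments. The following is a minimal sketch, not taken from the paper: the class name `AdjacencyOracle` and its fields are hypothetical, and entries are sampled lazily on first access, which yields the same distribution as sampling the whole matrix up front because non-planted edges are independent. Setting `k = 0` gives an instance of the null model.

```python
import random

class AdjacencyOracle:
    """Lazily sampled adjacency matrix of a planted clique graph with edge
    probability 1/2. Each call to query() is charged one unit of time."""

    def __init__(self, n, k, seed=0):
        self.n = n
        self.rng = random.Random(seed)
        # The planted set K: a uniformly random subset of size k.
        self.clique = frozenset(self.rng.sample(range(n), k))
        self._edges = {}   # entries sampled so far, keyed by (min, max)
        self.queries = 0   # running query count (our proxy for runtime)

    def query(self, u, v):
        """Return the (u, v) entry of the adjacency matrix; the diagonal is 0."""
        self.queries += 1
        if u == v:
            return 0
        key = (min(u, v), max(u, v))
        if key not in self._edges:
            planted = u in self.clique and v in self.clique
            self._edges[key] = 1 if planted else self.rng.randint(0, 1)
        return self._edges[key]

    def degree(self, v):
        """Computing one degree exactly costs n queries in this model."""
        return sum(self.query(v, u) for u in range(self.n))
```

The query counter makes the cost accounting explicit: an algorithm that computes the degree of $L$ sampled vertices spends $nL$ queries, which is the dominant term in the running times stated for the algorithms in this section.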
We encourage the reader to read Section 3.1 before reading these proofs.

As we state in Section 3.1, the intuition for the Clique-Completion subroutine is that once we have, say, $2 \log n$ vertices that are in the planted clique, we can find the rest. We show that every other planted clique vertex is connected to all these $2 \log n$ vertices and with high probability very few non-clique vertices are connected to all these $2 \log n$ initial clique vertices. Thus we can restrict our attention to only those vertices which are connected to all $2 \log n$ initial clique vertices. Call this set $V'$. To remove the false positives, we do some post-processing. We want this post-processing to run in time $\tilde{O}(n)$ and to not require the size of the planted clique as an input, since we will use this subroutine in situations where these constraints need to be met. We simply select a random subset $S'_C$ of size $2 \log n$ from $V'$. With high probability, this subset $S'_C$ will contain only clique vertices, and then we run the same "common neighbour" procedure on this small subset. Note that now $S'_C$ is not just 'some' subset of the planted clique, but is in fact a uniformly random subset. We can then utilise this randomness to show that with high probability no non-clique vertex is connected to all $2 \log n$ elements in $S'_C$. We formalize this in Algorithm 1 and prove the following statements.

Algorithm 1:

Clique-Completion

Input:

Graph G = ( V, E ) ∼ G ( n, , k ), known clique set S C ⊂ V Output:

Clique K Initialize S = S C for v ∈ V \ S C doif ( v, u ) ∈ E for all u ∈ S C then Update S ← S ∪ { v } endend Let V (cid:48) ← S Pick (u.a.r) a subset of size (1 + c ) log n from V (cid:48) and call it S (cid:48) C Initialize S (cid:48) = S (cid:48) C for v ∈ V (cid:48) \ S (cid:48) C doif ( v, u ) ∈ E for all u ∈ S (cid:48) C then Update S (cid:48) ← S (cid:48) ∪ { v } endendreturn S (cid:48) Lemma 5.1 (Runtime) . For any constant c > , Clique-Completion runs in time O ( n log n ) . Lemma 5.2 (Correctness) . Draw a graph G according to G ∼ G ( n, , k ) and let the set S C ⊂ K (the planted clique in the instance G ) with | S C | = (1 + c ) log n for some constant c > . If k = ω (cid:0) log n (cid:1) then the output of Algorithm 1, Clique-Completion ( G, S C ) is K with probability t least − (3+ c ) c max (cid:18) (1+ c ) log nk , log nn c (cid:19) .Proof. Throughout the proof, we will follow the notation of Algorithm 1. The algorithm has threestages and our proof upper bounds the probability of failure of each stage (conditioned on the theprevious stages succeeding). The ﬁrst stage of the algorithm begins with our known clique set, andappends to it every vertex which is a common neighbor. We show that the number of non-cliquevertices added is not too big. Let A be the event that | S \ K | < (cid:96) (for some parameter (cid:96) tobe speciﬁed later). The algorithm then takes the output set of the ﬁrst stage, S , and keeps auniformly random subset S (cid:48) C of size (1 + c ) log n . Thus, let A be the event that S (cid:48) C ⊂ K . Finally,we analyze the last stage of the algorithm. We notice that S (cid:48) C ⊂ K implies K ⊆ S ; that is, if theinput to the last stage of the algorithm is entirely contained within the clique, then the output ofthe algorithm contains the clique. We then prove that the output of the algorithm is exactly theclique by showing that it has no intersection with the non-clique vertices. Let A be the event that S (cid:48) \ K = ∅ . 
With this notation, A ∩ A is the event that the clique K is contained in the output S (cid:48) and that no non-clique vertex is contained in S (cid:48) . Thus, A ∩ A is the success event. Notice that1 − P ( A ∩ A ) ≤ P ( A c ) + P ( A c | A ) + P ( A c | A , A ). We control each of the terms on the RHS. Step 1

We will show that for (cid:96) = c (1 + (2 + c ) log k ), P ( A c ) ≤ (cid:0) n (cid:1) log k . Recall that A denotes the event that the output of the ﬁrst stage of the algorithm does not containtoo many non-clique vertices. We note that the analysis of this step is contained in the proofof [DGGP14, Lemma 2.9]. First, ﬁx S ⊂ K such that | S | = (1 + c ) log n . The probability thereexists a subset of non-clique vertices of size (cid:96) connected to every element in S is (cid:0) n(cid:96) (cid:1) − (cid:96) (1+ c ) log n . Aunion bound then implies that the probability there exists a subset of non-clique vertices of size atleast (cid:96) connected to every element in S is at most (cid:80) n − k(cid:96) = (cid:96) (cid:0) n(cid:96) (cid:1) − (cid:96) (1+ c ) log n ≤ − ( c(cid:96) −

1) log n . Furtherunion bounding over all subsets of K of size (1 + c ) log n implies: P ( A c ) ≤ (cid:18) k (1 + c ) log n (cid:19) − ( c(cid:96) −

1) log n ≤ (1+ c ) log n log k − ( cl −

1) log n = 2 − log k log n = (cid:18) n (cid:19) log k . Step 2

We will show that P ( A c | A ) ≤ (cid:96) (1+ c ) log nk .Recall that A denotes the event that the uniformly at random subset S (cid:48) C of the output of theﬁrst stage S is contained in the clique: S (cid:48) C ⊂ K . We are interested in this event conditionedon the output of the ﬁrst stage not containing too many non-clique vertices. To this end, let b := | V (cid:48) \ K | = | S \ K | and notice that b < l . Now, P ( A | A ) = (cid:0) k (1+ c ) log n (cid:1)(cid:0) k + b (1+ c ) log n (cid:1) = ( k − (1 + c ) log n + b )( k − (1 + c ) log n + b − ... ( k − (1 + c ) log n + 1)( k + b )( k + b − ... ( k + 1) ≥ (cid:18) k − (1 + c ) log nk (cid:19) b ≥ − b (1 + c ) log nk ≥ − l (1 + c ) log nk . Step 3

We will show that P ( A c | A , A ) ≤ n exp (cid:18) − k (cid:19) + (cid:96) n − (1+ c )3 . Recall that A denotes the event that S (cid:48) \ K = ∅ , that is that the output of the algorithm containsno non-clique vertices. In order to analyze this, we need to control the number of clique vertices13ny non-clique vertex is connected to. In particular, we would like that each non-clique vertex isconnected to at most 2 k/ A to be the event that everynon-clique vertex is connected to at most 2 k/ P ( A c | A , A ) ≤ P ( A c | A , A , A ) + P ( A c | A , A ) ≤ P ( A c | A , A , A ) + 4 P ( A c ) , where the last inequality follows as long as n, k satisfy P ( A ) ≥ / P ( A | A ) ≥ /

2. To tackle P ( A c ), we notice that any non-clique vertex is connected to roughly k clique vertices. A Chernoﬀbound (Lemma 7.1) and union bound imply that except with probability at most n exp (cid:0) − k (cid:1) ,none of the non-clique vertices in the graph are connected to more than k clique vertices. Thatis, P ( A c ) ≤ n exp (cid:0) − k (cid:1) . We now condition on this event and calculate P ( A c | A , A , A ). Theupshot of choosing S (cid:48) C the way we do is that conditioned on A and A , it is a uniformly randomsubset of the planted clique K . Further conditioning on A , for a given non-clique vertex in V (cid:48) ,the probability that it is connected to all vertices in S (cid:48) C is at most (cid:0) k (1+ c ) log n (cid:1)(cid:0) k (1+ c ) log n (cid:1) ≤ (cid:18) − (1 + c ) log nk (cid:19) k ≤ exp (cid:18) − (1 + c ) log n (cid:19) = n − (1+ c )3 Union bounding over all the at most l non-clique vertices in S (cid:48) C , no non-clique vertex gets addedto S (cid:48) except with probability at most l n − (1+ c )3 .To conclude, recall that 1 − P ( A ∩ A ) ≤ P ( A c ) + P ( A c | A ) + P ( A c | A , A ). Thus, steps 1-3imply 1 − P ( A ∩ A ) ≤ (cid:18) n (cid:19) log k + l (cid:18) (1 + c ) log nk + n − (1+ c )3 (cid:19) + 4 n exp (cid:18) − k (cid:19) ≤ c ) c max (cid:18) (1 + c ) log nk , log nn c (cid:19) . (cid:101) O ( n / ) algorithm for ﬁnding cliques of size k = Θ( √ n log n ) Remark 5.1.

This algorithm works even if we only have an underestimate of the true planted clique size. This is because the algorithm only uses the clique size implicitly, when deciding how many vertices to sample. If we underestimate the clique size, we will only sample more vertices than necessary. This will increase the runtime, but will not affect the correctness of the output. This turns out to be useful when we later use this algorithm as a black box subroutine in other algorithms where we only have an estimate of the size of the planted clique.
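The sample-then-threshold-then-complete structure that Remark 5.1 refers to can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the graph is stored as an explicit matrix for simplicity, the seed set has size $\lceil 2 \log_2 n \rceil$, and the degree threshold $n/2 + 2\sqrt{n \log_2 n}$ is an assumed form of the threshold used in Algorithm 2. The only place the clique size enters is through the sample count `l_in`, so a larger `l_in` (as an underestimate of $k$ would produce) costs time but does not change the logic.

```python
import math
import random

def sample_planted_graph(n, k, rng):
    """Explicit adjacency matrix of a planted clique instance (edge prob. 1/2)."""
    clique = set(rng.sample(range(n), k))
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if (u in clique and v in clique) or rng.random() < 0.5:
                adj[u][v] = adj[v][u] = 1
    return adj, clique

def common_neighbours(adj, candidates, seeds):
    """Seeds plus every candidate adjacent to all seeds."""
    seeds = set(seeds)
    found = set(seeds)
    for v in candidates:
        if v not in seeds and all(adj[v][u] for u in seeds):
            found.add(v)
    return found

def clique_completion(adj, seeds, rng):
    """Two-stage completion: grow from the seeds, re-sample fresh seeds from
    the result, then keep only common neighbours of the fresh seeds."""
    v_prime = common_neighbours(adj, range(len(adj)), seeds)
    fresh = rng.sample(sorted(v_prime), min(len(seeds), len(v_prime)))
    return common_neighbours(adj, v_prime, fresh)

def keep_high_degree_and_complete(adj, l_in, rng):
    """Sample l_in vertices, keep the high-degree ones, then complete."""
    n = len(adj)
    num_seeds = math.ceil(2 * math.log2(n))
    threshold = n / 2 + 2 * math.sqrt(n * math.log2(n))  # assumed threshold
    high = set()
    for _ in range(l_in):
        v = rng.randrange(n)
        if sum(adj[v]) >= threshold:  # one degree computation costs n queries
            high.add(v)
    if len(high) < num_seeds:
        return None  # declare failure
    seeds = rng.sample(sorted(high), num_seeds)
    return clique_completion(adj, seeds, rng)
```

A small experiment would call `sample_planted_graph` with a clique well above the threshold regime and compare the returned set with the planted one; the sketch reads roughly `n * l_in` matrix entries for the degree step plus `O(n log n)` for completion.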

Theorem 1.

Let √ n log n ≤ k , and let L in be a user deﬁned parameter. If n · (log n ) k ≤ L in , whengiven an instance G of G ( n, , k ) , Keep-High-Degree-And-Complete ( G, L in ) (Algorithm 2)runs in time O ( nL in + n log n ) and outputs the hidden clique K with probability at least − O (cid:16) log n √ n (cid:17) .Proof. Runtime analysis:

Using Lemma 5.1, the algorithm clearly runs in time O ( nL + n log n ).14 orrectness analysis: We ﬁrst show that for this size of the clique k , degree counting is suﬃcientto separate clique vertices from non-clique vertices. We then show that randomly sampling L in vertices will yield a subset of the clique of size at least 2 log n . We conclude by invoking Lemma 5.2to show that Clique-Completion works correctly with high probability.To this end, let D min denote the event that the minimum degree of a clique vertex is at least d min = n + k − √ n log n and D max denote the event that the maximum degree of a non-cliquevertex is at most d max = n + √ n log n . Note that the degree of a non-clique vertex is Bin (cid:0) n, (cid:1) ,so by a Chernoﬀ bound 7.1 and a union bound, P ( D c max ) ≤ ( n − k ) n ≤ n . Likewise, the degree of anon-clique vertex is Bin (cid:0) n − k, (cid:1) + k , so a similar argument shows P ( D c min ) ≤ kn ≤ n . Because8 √ n log n ≤ k and setting T d = n + 2 √ n log n , we have that d max < T d < d min . Therefore exceptwith probability at most n + n ≤ n , the degree of all clique nodes is larger than T d and that of allnon-clique nodes is smaller than T d .Now, we show that randomly sampling L in vertices will yield at least 2 log n clique vertices. Letthis random sample of L in vertices be denoted S L . In fact, we show something slightly stronger.If we divide the clique vertices K into 2 log n disjoint sets K , K , ..., K n of equal size k n ,then with high probability we will get at least one vertex from each K i . This implies that we willhave at least 2 log n distinct clique vertices in S L . Let E i be the event that S L ∩ K i = ∅ . P ( E i ) = (cid:18) − k n log n (cid:19) L in ≤ exp (cid:18) − kL in n log n (cid:19) ≤ (cid:16) − kLin n log n (cid:17) . Let E = (cid:84) i E ci ; that is E is the event that each K i has non-empty intersection with S L . 
Then, aunion bound shows that P ( E c ) ≤ (2 log n )2 (cid:16) − kL n log n (cid:17) . Since L in ≥ n (log n ) k , this probability of failureis at most nn .We now note that the probability Algorithm 2 fails can be denoted by P ( C c ) where C is theevent that Clique-Completion outputs K , the planted clique. So we can upper bound P ( C c ) ≤ P ( C c , D max , D min , E ) + P ( D c max ) + P ( D c min ) + P ( E c ). Using our estimates from above, we can upperbound P ( D c max ) + P ( D c min ) + P ( E c ) = O ( n ) + O ( log nn ) = O ( log n √ n ). Hence it only remains to showthat P ( C c , D max , D min , E ) = O ( log n √ n ) to complete the proof.To upper bound this quantity, consider the following thought experiment. Consider a genie whogets the same input as Algorithm 2 and also knows the location of the planted clique. The genieobserves our algorithm, and if S C is not a subset of the planted clique K , the genie selects any otherset of 2 log n true clique vertices and runs Clique-Completion using this new genie-aided inputset instead. We can denote the event that the genie’s version of

Clique-Completion succeeds as C genie and by Lemma 5.2, we can conclude that P ( C c genie ) = O ( log n √ n ). This is because the genie-aided algorithm takes as input a graph G ∼ G ( n, , k ) and a true clique subset, which are preciselythe conditions on Lemma 5.2.To relate this to our quantity of interest, we note that when S good := D max ∩ D min ∩ E happens,the input S C used by Algorithm 2 is a subset of K and so the genie-aided algorithm and Algo-rithm 2 behave identically conditioned on S good . This means that P ( C c , S good ) = P ( C c genie , S good ) ≤ P ( C c genie ) = O ( log n √ n ) which completes the proof.15 lgorithm 2: Keep-High-Degree-And-Complete

Input:

Graph G = ( V, E ) = G ( n, , k ), number of vertices to sample L in Output:

Clique K Initialize S C = ∅ repeat L in times Sample a random vertex v ∈ G and compute deg( v ) if deg( v ) ≥ n + 2 √ n log n then Update S C ← S C ∪ { v } endendif | S C | < n thenreturn Declare Failure end

Initialize S C = ∅ Select 2 log n vertices from S C uniformly at random and add them to S C S ← Clique-Completion ( G, S C ) return S (cid:101) O (cid:0) ( n/k ) + n (cid:1) algorithm for ﬁnding cliques of size k = Ω( √ n log n ) Theorem 2.

Let √ n log n ≤ k ≤ n and set p = · n log nk . Given an instance G of G ( n, , k ) , Subsample-And-KHDAC ( G, k, p ) (Algorithm 3) runs in time O (cid:16) n k · log n + n log n (cid:17) and out-puts the hidden clique K with probability at least − O (cid:16) log ( pk ) √ pk (cid:17) .Proof. Runtime Analysis:

Since we can sample a random vertex in unit time in our modelof computation, sampling pn vertices takes time O ( pn ). Further, using the running times fromTheorem 1 and Lemma 5.1, it is easy to observe that the algorithm runs in time O (cid:0) pn + n (cid:48) L (cid:48) + n log n (cid:1) = O (cid:18) pn k · log n + n log n (cid:19) = O (cid:18) n k · log n + n log n (cid:19) Correctness Analysis:

We follow the notation of Algorithm 3. We begin by showing that thesubsampling step behaves as expected, in the sense that k p = | S P ∩ K | is roughly equal to pk . Let A be the event that pk ≤ k p ≤ pk . As k p is a hypergeometric random variable, we use boundson the concentration of a hypergeometric random variable around its mean (see, for eg, [HS05,Theorem 1]) to get that P ( A c ) ≤ (cid:16) − pk n (cid:17) ≤ n .It is easy to observe that G (cid:48) ∼ G ( n (cid:48) , , k p ). Now we show that S C , the output of the Keep-High-Degree-And-Complete subroutine is equal to S P ∩ K with high probability. Denote this eventby A . Since k (cid:48) = pk ≤ k p and (using that p ≤ )8 (cid:112) | S P | log | S P | ≤ √ (cid:112) pn log n = pk ≤ k p , the conditions of Theorem 1 are satisﬁed if A holds, therefore P ( A c | A ) = O (cid:16) log n (cid:48) √ n (cid:48) (cid:17) = O (cid:16) log ( pk ) √ pk (cid:17) . We have omitted certain ﬂoors and ceilings for the sake of readability

16o ﬁnish the proof, we need to prove that

Clique-Completion succeeds with high probabil-ity. Let A denote the probability that the output of clique completion (and of the algorithm) S = K . Overall, we can then upper bound the probability of failure of Algorithm 3 as P ( A c ) ≤ P ( A c , A , A )+ P ( A c | A )+ P ( A c ). We have shown that P ( A c | A )+ P ( A c ) = O ( n )+ O (cid:16) log ( pk ) √ pk (cid:17) = O (cid:16) log ( pk ) √ pk (cid:17) . Thus it only remains to show that P ( A c , A , A ) = O (cid:16) log ( pk ) √ pk (cid:17) . We can now deﬁne S good := A ∩ A and use the same genie-aided analysis as in the proof of Theorem 1 to concludethat P ( A c , A , A ) = O (cid:16) log ( n ) √ n (cid:17) = O (cid:16) log ( pk ) √ pk (cid:17) which completes the proof. Algorithm 3:

Subsample-And-KHDAC

Input:

Graph G = ( V, E ) = G ( n, , k ), clique size k , subsampling fraction p Output:

Clique K Set n (cid:48) = np and k (cid:48) = pk Initialize S P = ∅ Pick n (cid:48) vertices uniformly at random from V and add them to S P Let G (cid:48) be the subgraph of G induced by S P S C ← Keep-High-Degree-And-Complete (cid:16) G (cid:48) , L (cid:48) in = n (cid:48) · (log n (cid:48) ) k (cid:48) (cid:17) if | S C | < n thenreturn k vertices chosen uniformly at random from V end Initialize S C = ∅ Select 2 log n vertices from S C uniformly at random and add them to S C S ← Clique-Completion ( G, S C ) return S (cid:101) O (cid:16) n / exp (cid:16) k n (cid:17)(cid:17) algorithm for ﬁnding cliques of size ω ( √ n log log n ) = k = o ( √ n log n ) In Section 3.1 we described the idea to get sublinear algorithm to recover planted cliques of size k = o ( √ n log n ) as follows. First subsample the vertices, then ﬁlter them according to their degree,in the hope of boosting the number of clique versus non-clique vertices. Then we can use Keep-High-Degree-And-Complete on this smaller graph to get a sublinear runtime. The algorithmwe state and analyse here, Algorithm 4, is actually slightly diﬀerent from the sketch describedabove. We ﬁrst split the vertices of the input graph into two disjoint sets V and V of equal size n/

2. We then subsample vertices from V , and use their V -degree to ﬁlter them. By V -degree wemean that we estimate their degree by only counting the number of edges from a vertex in V toall the vertices in V . The advantage now is that when we take our ﬁltered vertices (which are asubset of V ) and consider the subgraph induced by them, we have not seen any of the edges inthis subgraph. Thus we can use the randomness of these edges to argue that this subgraph is aninstance of the planted clique problem and can invoke Theorem 1 to analyse the performance of Keep-High-Degree-And-Complete on this subgraph.17 lgorithm 4:

Subsample-And-Filter

Input:

Graph G = ( V, E ) = G ( n, , k ), clique size k , subsampling fraction p Output:

Clique K Let V , V be two disjoint subsets of V of size n each.Initialize S P = ∅ for v ∈ V do With probability p , update S P ← S P ∪ { v } endif | S P | > pn thenreturn k vertices chosen uniformly at random from V end Initialize S F = ∅ Set T l = n + k − √ n and T d = n + k + 2 √ n for v ∈ S P doif T l ≤ (cid:80) u ∈ V (( u,v ) ∈ E ) ≤ T d then S F ← S F ∪ { v } endend Set n (cid:48) = | S F | Let G (cid:48) be the subgraph of G induced by S F S C ← Keep-High-Degree-And-Complete ( G (cid:48) , L (cid:48) in = n (cid:48) ) if | S C | < n thenreturn k vertices chosen uniformly at random from V end Initialize S C = ∅ Select 2 log n vertices from S C uniformly at random and add them to S C S ← Clique-Completion ( G, S C ) return S Remark 5.2.

It is unlikely that the requirement $k = \omega(\sqrt{n \log\log n})$ in the statement of Theorem 3 is a fundamental barrier of our technique. It shows up because we require $k' = \Omega(\sqrt{n' \log n'})$ when we run the Keep-High-Degree-And-Complete subroutine. Instead, we could use any off-the-shelf (almost-)linear time algorithm that only requires $k' = \Omega(\sqrt{n'})$ as a subroutine in the sketch above. This would let us only require $k = \omega(\sqrt{n})$. However, this subroutine cannot use the precise value of $k'$, since we only have an estimate. The Low Degree Removal algorithm from [FR10] has this property, but only achieves constant probability of success. The algorithms in both [DGGP14, DM15a] succeed with high probability, but use $k'$ as an input. If there were a linear time algorithm that works with just an estimate of $k'$ and achieves high probability of success, we could use it as a subroutine and only require $k = \omega(\sqrt{n})$.

Remark 5.3.

In contrast to the behaviour of Algorithm 2 noted in Remark 5.1, it is unlikely that Algorithm 4, Subsample-And-Filter, is very robust to misspecification of the size of the planted clique. This is because we seem to be using this size crucially in our filtering step. The algorithm needs an estimate of $k$ that has additive error at most $o(\sqrt{n})$.

Theorem 3.

Let ω (cid:0) √ n log log n (cid:1) = k = o (cid:0) √ n log n (cid:1) , and let G be an instance of G (cid:0) n, , k (cid:1) . Set = n log nk exp (cid:16) − k n (cid:17) . Then Subsample-And-Filter ( G, k, p ) (Algorithm 4) runs in time O ( pn ) = O (cid:18) n log nk exp (cid:18) − k n (cid:19)(cid:19) = (cid:101) O  n exp (cid:16) k n (cid:17)  = o ( n ) and outputs the planted clique except with probability at most O (cid:16) exp (cid:16) − k n (cid:17)(cid:17) = O (cid:16) n (cid:17) .Proof. Note that since k = o ( √ n log n ), we have that p = ω ( n − (cid:15) ) for any constant (cid:15) >

0. So wehave pk = ω ( n . ) and pn = ω ( n . ). Runtime Analysis:

The subsampling step takes time $O(n)$. If $|S_P| > pn$, the algorithm terminates with a further $O(k)$ runtime. This would mean a runtime bounded by $O(n)$, which is also in $O(pn^2)$.

If, on the other hand, $|S_P| \le pn$, then we need to compute $\deg(v)$ for at most $pn$ vertices and each such computation takes time at most $O(n)$. This step thus takes time $O(pn^2)$. By using runtime bounds from Theorem 1 and Lemma 5.1, subsequent steps of the algorithm take time $O(p^2 n^2 + n \log n)$, which is $O(pn^2)$. Hence the complete algorithm has a runtime that is $O(pn^2)$.

Correctness Analysis:

We assume the notation set in the algorithm. We analyze each stage of the algorithm.

Step 1

First, we show that subsampling steps behave as expected. Let k and k be randomvariables denoting the number of planted vertices in V and V respectively. We must have k + k = k . Let P denote the event that k/ − √ n ≤ k ≤ k/ √ n . P implies that k/ − √ n ≤ k ≤ k/ √ n . We show that the probability P ( P c ) is small and so then assume for the rest of theproof that P holds. Since k is a hypergeometric random variable, using concentration boundsfrom [HS05, Theorem 1] we have P ( P c ) ≤ − n/k ).Now controlling the subsampling step that is used to obtain the set S P , deﬁne P to be the eventthat || S P | − . pn | ≤ . pn and P denote the event that || S P ∩ K | − pk | ≤ . pk . Using Chernoﬀbounds from Lemma 7.2, we have P ( P c | P ) ≤ − pn ) and P ( P c | P ) ≤ − pk ) ≤ − pk ).Deﬁning the event P := P ∩ P ∩ P , we can upper bound P ( P c ) ≤ P ( P c | P ) + P ( P c | P ) + P ( P c ) = O (exp( − n/k )). For brevity, we let ˆ n = | S P | and ˆ k = | S P ∩ K | . Step 2

We now assume the event P happens and aim to show that the ﬁltering step also behavesas expected and analyse the size of S F and S F ∩ K . The subgraph induced by S F is the input to the Keep-High-Degree-And-Complete subroutine so we want to show that | S F ∩ K | is relativelylarge and that | S F | is not too large, so that the subroutine works as expected. To this end, wedeﬁne the event F to denote the event that S F does not contain too many non-clique vertices.That is, we let F = (cid:110) | S F \ ( S F ∩ K ) | ≤ pn exp (cid:16) − k n (cid:17)(cid:111) . Similarly, we let F deﬁne the event that S F ∩ K is fairly large: F = (cid:110) || S F ∩ K | − p ˆ k | ≤ . p ˆ k (cid:111) (for some parameter p to be deﬁnedlater).If v ∈ S P \ ( S P ∩ K ) (that is, it not a clique vertex), we upper bound the probability that it will19e added to S F . Using a Chernoﬀ bound (Lemma 7.1) and ω ( √ n log log n ) = k P ( v ∈ S F | P, v ∈ S P \ ( S P ∩ K )) = P  | (cid:88) u ∈ V (( u,v ) ∈ E ) − n + k | ≤ √ n  ≤ P  (cid:88) u ∈ V (( u,v ) ∈ E ) ≥ n + k − √ n  ≤ exp (cid:32) − k n · (cid:18) − √ nk (cid:19) (cid:33) By linearity of expectation, E [ | S F \ ( S F ∩ K ) || P ] ≤ (ˆ n − ˆ k ) P ( v ∈ S F | P, v ∈ S P \ ( S P ∩ K )) = O (cid:32) pn exp (cid:32) − k n · (cid:18) − √ nk (cid:19) (cid:33)(cid:33) . Using Markov’s inequality, P ( F c | P ) = P (cid:18) | S F \ ( S F ∩ K ) | ≥ pn exp (cid:18) − k n (cid:19) | P (cid:19) = O (cid:18) exp (cid:18) − k n + 4 k √ n (cid:19)(cid:19) = O (cid:18) exp (cid:18) − k n (cid:19)(cid:19) . We now analyse S F ∩ K and show that it is relatively large (conditioned on P ). For any v ∈ S P ∩ K ,using Chernoﬀ (Lemma 7.1)1 − p := P ( v / ∈ S F | P ) = P  | (cid:88) u ∈ V (( u,v ) ∈ E ) − n + k | ≥ √ n  P  | (cid:88) u ∈ V (( u,v ) ∈ E ) − n + 2 k k − k | ≥ √ n  ≤ P  (cid:88) u ∈ V (( u,v ) ∈ E ) − n + 2 k ≥ | √ n − | k − k ||  ≤ P  (cid:88) u ∈ V (( u,v ) ∈ E ) − n + 2 k ≥ . √ n  ≤ (cid:18) − (cid:19) ≤ . 
This gives us p = P ( v ∈ S F | P ) ≥ . Since | S F ∩ K | is a sum of ˆ k independent Bern( p ) randomvariables, by Lemma 7.2 P (cid:16) || S F ∩ K | − p ˆ k | ≤ . p ˆ k | P (cid:17) ≥ − (cid:32) − p ˆ k (cid:33) ≥ − (cid:18) − pk (cid:19) ≥ − (cid:18) − pk (cid:19) . Denoting F := F ∩ F , we have thus upper bounded P ( F c | P ) ≤ P ( F c | P )+ P ( F c | P ) = O (cid:16) exp (cid:16) − k n (cid:17)(cid:17) . Step 3

We now analyze the event A that Keep-High-Degree-And-Complete succeeds condi-tioned on

P, F . Conditioned on P ∩ F , we can observe that the subgraph G (cid:48) induced by the vertexset S F is distributed as G ( | S F | , , | S F ∩ K | ). This is because we have not yet used the randomness20rom any of the edges in this subgraph. Moreover, we can see that | S F | ≤ pn exp (cid:16) − k n (cid:17) + p ˆ k = O (cid:16) pn exp (cid:16) − k n (cid:17)(cid:17) . Also, we have | S F ∩ K | = ω ( pk ). This gives8 (cid:112) | S F | · log | S F | = O (cid:32)(cid:115) pn exp (cid:18) − k n (cid:19) · log n (cid:33) = O ( pk ) = O ( | S F ∩ K | ) , which means that the conditions of Theorem 1 are satisiﬁed and we can upper bound the failureprobability of the Keep-High-Degree-And-Complete subroutine as P ( A c | P, F ) = O (cid:16) log ( pk ) √ pk (cid:17) . Step 4

Finally, we can analyze the event Algorithm 4 proceeds to the

Clique-Completion stepand succeeds. Let A denote the event that the output of Clique-Completion is the plantedclique K . Then the failure probability of the algorithm is P ( A c ) ≤ P ( A c , A , F, P ) + P ( A c | P, F ) + P ( F c | P )+ P ( P c ). We have already shown that P ( A c | P, F )+ P ( F c | P )+ P ( P c ) = O (cid:16) exp (cid:16) − k n (cid:17)(cid:17) andso it we only need to show that P ( A c , A , F, P ) = O (cid:16) exp (cid:16) − k n (cid:17)(cid:17) to complete the proof. Deﬁne S good := A ∩ F ∩ P and note that conditioned on S good , the input vertex set S C to Clique-Completion is a subset of the planted clique K because 2 log n = O ( pk ) = O ( | S F ∩ K | ). Hencewe can use the same genie-aided analysis technique as in the proof of Theorem 1 to show that P ( A c | A , F, P ) = O (cid:16) log nn . (cid:17) = O (cid:16) exp (cid:16) − k n (cid:17)(cid:17) . This completes the proof. Having developed our algorithmic ideas and upper bounds, in this section we prove several lowerbounds. The simple observation that underlies our impossibility results is that a sublinear timealgorithm can not see the entire input, and hence must work without a fair chunk of informationabout the input. When this information is not available to the algorithm, we will argue that whatit does see is either statistically (in results that follow immediately from [RS19]) or computationally(because of the

Planted Clique Conjecture) not solvable.

As stated earlier, some of our results use an 'iid' version of the planted clique problem. We first define this version formally. We also provide formal statements of the Planted Clique Conjecture and the iid Planted Clique Conjecture that we use. Remark 6.3 notes that the Planted Clique Conjecture implies the iid Planted Clique Conjecture.

Definition 6.1 (iid Planted Clique graph distribution: $\hat{G}(n, \frac{1}{2}, p)$). Let $G = (V, E)$ be a graph with vertex set $V$ of size $n$. Let $K \subset V$ be a set such that every vertex $v \in V$ is included in $K$ iid with probability $p$. For all distinct pairs of vertices $u, v \in K$, we add the edge $(u, v)$ to $E$. For all remaining distinct pairs of vertices $u, v$, we add the edge $(u, v)$ to $E$ independently with probability $\frac{1}{2}$. The distribution on graphs thus formed is denoted $\hat{G}(n, \frac{1}{2}, p)$.

Definition 6.2 (iid Planted Clique Detection Problem: $\mathrm{iidPC}_D(n, p)$). This is the following hypothesis testing problem.

$$H_0 : G \sim G(n, \tfrac{1}{2}) \quad \text{and} \quad H_1 : G \sim \hat{G}(n, \tfrac{1}{2}, p) \qquad (6.1)$$

We now provide the formal statement of the Planted Clique Conjecture, using the version from [BBH18, Conjecture 2.1].

Conjecture 6.3 (Planted Clique Conjecture). Suppose that $\{\mathcal{A}_n\}$ is a sequence of randomized polynomial time algorithms that take as input the adjacency matrix $A_G$ of a graph $G$ on $n$ vertices, $\mathcal{A}_n : A_G \to \{0, 1\}$. Let $k(n)$ be a sequence of positive integers such that $k(n) = O(n^{\frac{1}{2} - \delta})$ for any constant $\delta > 0$. Then if $G_n$ is a sequence of instances of $\mathrm{PC}_D(n, k(n))$, it holds that

$$\mathbb{P}_{H_0}\{\mathcal{A}_n(A_{G_n}) = 0\} + \mathbb{P}_{H_1}\{\mathcal{A}_n(A_{G_n}) = 1\} \le 1 + o(1).$$

Conjecture 6.4 (iid Planted Clique Conjecture). Suppose that $\{\mathcal{A}_n\}$ is a sequence of randomized polynomial time algorithms that take as input the adjacency matrix $A_G$ of a graph $G$ on $n$ vertices, $\mathcal{A}_n : A_G \to \{0, 1\}$. Let $k(n)$ be a sequence of positive integers such that $k(n) = O(n^{\frac{1}{2} - \delta})$ for any constant $\delta > 0$. Then if $G_n$ is a sequence of instances of $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$, it holds that

$$\mathbb{P}_{H_0}\{\mathcal{A}_n(A_{G_n}) = 0\} + \mathbb{P}_{H_1}\{\mathcal{A}_n(A_{G_n}) = 1\} \le 1 + o(1).$$

For the purpose of showing these impossibility results, we must also define what it means for an "algorithm" to "solve" the planted clique problem. Let $\mathcal{P}(n, k(n))$ denote any of the following computational problems: $\mathrm{PC}_D(n, k(n))$, $\mathrm{PC}_R(n, k(n))$, $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$.

Definition 6.5 ('Solving' a problem). Let $k(n)$, $T(n)$, and $p(n)$ be some functions of $n$ such that $k(n) \le n$. A parametrized family of algorithms $\{\mathcal{A}_n\}$ is said to run in time $T(n)$ and solve $\mathcal{P}(n, k(n))$ with failure probability at most $p(n)$ if the following happens. For all large enough $n$, when given an instance of $\mathcal{P}(n, k(n))$ as input, $\mathcal{A}_n$ terminates in time $T(n)$ and returns the correct answer with probability at least $1 - p(n)$.

Whenever clear from context, we simplify notation by writing $k$ instead of $k(n)$.
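To make the two hypotheses of the iid detection problem concrete, the following is a minimal sketch of a sampler for the null distribution $G(n, \frac{1}{2})$ and the iid planted distribution. The function name `sample_iid_planted_clique`, the `planted` flag, and the list-of-lists matrix representation are illustrative assumptions, not part of the paper.

```python
import random


def sample_iid_planted_clique(n, p, planted=True, seed=None):
    """Sample an adjacency matrix under H0 (planted=False) or H1 (planted=True).

    Returns (adj, clique): adj is a symmetric 0/1 list-of-lists with a zero
    diagonal, clique is the vertex set K (empty under the null hypothesis).
    The name and signature are hypothetical, chosen for this sketch only.
    """
    rng = random.Random(seed)
    # Under H1 each vertex joins K independently with probability p.
    clique = {v for v in range(n) if rng.random() < p} if planted else set()
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if u in clique and v in clique:
                bit = 1  # pairs inside K are always joined
            else:
                bit = 1 if rng.random() < 0.5 else 0  # fair coin otherwise
            adj[u][v] = adj[v][u] = bit
    return adj, clique
```

Calling the sampler with `p = k / n` gives an instance whose clique size is random with mean $k$, which is the only difference from the fixed-size model.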
The property of a sublinear algorithm that we use in showing lower bounds is that such an algorithm can only see a subset of the entire input to the problem. For the planted clique problem, this means that the algorithm can only look at a subset of the entries of the adjacency matrix $A_G$ of the graph. Let us set up some notation.

Definition 6.6 (Query set of an algorithm). Let $\mathcal{A}$ be any algorithm that takes as input $A_G$, the adjacency matrix of a graph $G = (V, E)$. Define $E_{\mathcal{A}} \subset V \times V$ as the set of entries of $A_G$ that $\mathcal{A}$ queries before it terminates. Since $A_G$ is symmetric, we assume for convenience that $E_{\mathcal{A}}$ is symmetric. That is, if $\mathcal{A}$ queries $(i, j)$, it also queries $(j, i)$.

(Since $A_G$ is symmetric, an algorithm that queries $(i, j)$ has no need to query $(j, i)$. However, doing so increases the number of queries, and hence the runtime, by at most a factor of 2. Since it is more convenient to assume that the set $E_{\mathcal{A}}$ is symmetric than to track which of the two queries the algorithm made, we simply assume the algorithm makes both.)

We note the following simple fact about sublinear time algorithms, which follows immediately from the fact that in our model of computation any query to an entry of $A_G$ takes unit time.

Remark 6.1. If $\{\mathcal{A}_n\}$ is an algorithmic family that runs in time $T_{\mathcal{A}}(n)$, then $|E_{\mathcal{A}_n}| = O(T_{\mathcal{A}}(n))$.

By [RS19, Theorem 2], if $\mathcal{A}_n$ has $|E_{\mathcal{A}_n}| = o\left(\frac{n^2}{k^2} + n\right)$, then $\mathcal{A}_n$ must fail to find the planted clique with probability tending to 1. Combining this with Remark 6.1, we get

Proposition 6.7. [RS19, Theorem 2] Let $k(n) \le n$ and let $\{G_n\}$ be a sequence of instances of the planted clique recovery problem $\mathrm{PC}_R(n, k(n))$. Any algorithmic family $\{\mathcal{A}_n\}$ that runs in time $T_{\mathcal{A}}(n) = o\left(\frac{n^2}{(k(n))^2} + n\right)$ must fail to output the correct planted clique with probability at least $1 - o(1)$.
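The query-set bookkeeping of Definition 6.6 can be sketched as a thin wrapper around the adjacency matrix that records every queried entry together with its mirror image. The class name `QueryOracle` and the toy routine `max_degree_into_sample` are hypothetical and serve only to illustrate how $|E_{\mathcal{A}}|$ relates to running time in the unit-cost query model; they are not the paper's algorithms.

```python
class QueryOracle:
    """Wraps an adjacency matrix and records the symmetric query set E_A."""

    def __init__(self, adj):
        self._adj = adj
        self.queried = set()  # E_A stored as a set of ordered pairs

    def query(self, i, j):
        # Record both orientations, mirroring the symmetry convention.
        self.queried.add((i, j))
        self.queried.add((j, i))
        return self._adj[i][j]

    def num_queries(self):
        return len(self.queried)


def max_degree_into_sample(oracle, vertices, sample):
    """Toy routine: the largest number of neighbours any vertex has in `sample`.

    The queried positions depend only on `vertices` and `sample`, never on
    the answers, so the access pattern is fixed in advance.
    """
    return max(
        sum(oracle.query(u, v) for v in sample if v != u) for u in vertices
    )
```

Because each call to `query` costs unit time, `num_queries()` is at most twice the number of calls, matching the factor of 2 discussed after Definition 6.6.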
However, Theorem 2 in [RS19] also shows that lower bound techniques relying purely on analysing the number of queries required will fail to give better lower bounds. This is because there exists an inefficient algorithm making as few as $\tilde{O}\left(\frac{n^2}{k^2} + n\right)$ queries that can find the planted clique with good probability. Hence we need to incorporate computational lower bound techniques, and not just information theoretic ones, if we want to show any stronger lower bounds. To drive this point home, we show that any stronger lower bounds for the sublinear time $\mathrm{PC}_D(n, k)$ problem at $k = \tilde{\Theta}(\sqrt{n})$ would actually imply non-trivial lower bounds for the $\mathrm{PC}_D(n, k)$ problem at the information theoretic threshold. Thus, the (non-)existence of really good sublinear time algorithms for planted clique detection in the 'easy' regime seems connected to the (non-)existence of very fast (not necessarily sublinear) algorithms for the detection problem in the 'hard' regime. Before we do that, however, we must make a slight detour.

An algorithm for recovery can be easily converted into an algorithm for detection simply by picking some large enough random subset of the output of the algorithm (say of size $3 \log n$) and checking (in time $O(\log^2 n)$) if all the vertices are connected to each other. In fact, since all our algorithms rely eventually on Keep-High-Degree-And-Complete, if we wanted to convert these algorithms to detection algorithms, we can simply test based on the highest degree and omit the Clique-Completion steps altogether. For this reason, we will focus on the detection problem when proving our lower bounds, since this immediately translates into a recovery lower bound.

It is also more natural to show the impossibility results we are about to prove in a model of the planted clique problem that is slightly different from $\mathrm{PC}_D(n, k)$. Namely, it is more natural to prove and state some results for $\mathrm{iidPC}_D(n, \frac{k}{n})$ (Definition 6.2), which is formally different from but morally similar to $\mathrm{PC}_D(n, k)$. In $\mathrm{iidPC}_D(n, \frac{k}{n})$, the planted clique is not a set of exactly $k$ vertices chosen uniformly at random. Instead, each vertex is included in the clique iid with probability $\frac{k}{n}$.

In light of this, since we will be stating some impossibility results with $\mathrm{iidPC}_D(n, \frac{k}{n})$, we note here that the algorithms developed in Section 5 (or minor tweaks thereof) actually work for this different model too. This lends some credence to showing impossibility in these models as proxies for showing impossibility results for $\mathrm{PC}_D(n, k)$ or $\mathrm{PC}_R(n, k)$.

Fact 6.8. Let $k(n) \le n$ and $\omega(1) = f(n) = o(\sqrt{n})$ be any sequence. With probability at least $1 - o(1)$, an instance of $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$ is an instance of $\mathrm{PC}_D(n, k'(n))$ for some sequence $k'(n)$ satisfying $|k'(n) - k(n)| \le f(k(n)) \sqrt{k(n)}$.

Remark 6.2. It can be verified that all the algorithms developed in Section 5 (even Subsample-And-Filter, as stated in Remark 5.3) work as long as the estimate of $k$ they take as input is within an additive $o(\sqrt{n})$ of the size of the true planted clique. Combining this with Fact 6.8, this means that our algorithms solve $\mathrm{iidPC}_D(n, \frac{k}{n})$ with the same runtime as $\mathrm{PC}_D(n, k)$ and a mildly worse success probability.

This means that even if we are hesitant to think of impossibility results about $\mathrm{iidPC}_D(n, \frac{k}{n})$ as being proxies for impossibility results about $\mathrm{PC}_D(n, k)$, we can think of $\mathrm{iidPC}_D(n, \frac{k}{n})$ as being the fundamental problem worth studying, for which this work describes sublinear time algorithms as well as hardness results.

However, there is in fact a formal relationship between $\mathrm{PC}_D(n, k)$ and $\mathrm{iidPC}_D(n, \frac{k}{n})$, although this relationship is slightly subtle. Later in the manuscript, we prove two lemmas. Lemma 6.12 says that if $\mathrm{iidPC}_D(n, \frac{k}{n})$ is hard, then $\mathrm{PC}_D(n, k)$ is hard. Lemma 6.13 says the reverse. We discuss the exact content of these lemmas in Section 6.5, after proving our lower bounds. For now the reader just needs to keep in mind the following fact.

Remark 6.3. It follows immediately from Lemma 6.13 that the Planted Clique Conjecture (Conjecture 6.3) implies the iid Planted Clique Conjecture (Conjecture 6.4).

This lets us use the iid Planted Clique Conjecture to show impossibility results for $\mathrm{iidPC}_D(n, \frac{k}{n})$ in subsequent sections.

$k = \tilde{\Theta}(\sqrt{n})$

The discussion preceding the lemma and its proof appears in Section 3.2 but we reproduce it here for the reader's convenience. Consider the $\mathrm{iidPC}_D(n, \frac{k}{n})$ problem with $k$ just larger than $\sqrt{n \log^2 n}$. Create a subgraph by only retaining the first $\sqrt{n}$ vertices. Then we have a graph of size $\sqrt{n}$ with a planted clique of size slightly more than $2 \log(\sqrt{n})$, the information theoretic threshold. Hence if we could solve the detection problem on a graph of size $n$ with a planted clique near the information theoretic threshold in time $O(n^{\delta})$ (for any constant $\delta > 0$), we could solve the original problem in time $\tilde{O}(n^{\delta/2})$. A lower bound on the original problem then translates into a lower bound on the problem at the information theoretic threshold. Moreover, a lower bound of the form $\omega(n)$ would imply a non-trivial superlinear lower bound for detecting small cliques. This indicates that a lower bound of the form $\omega(n)$ will require computational hardness assumptions to show.

We prove the following more general reduction, which yields the discussion above by setting $g(n) = \log n$.

Lemma 6.9. Let $\delta > 0$ be some constant. Let $\omega(1) = g(n) = o(\sqrt{n})$ be some sequence indexed by $n$ and define $k_1(n_1) = g(n_1) \sqrt{n_1}$. Suppose that any algorithmic family $\{\mathcal{A}_{n_1}\}$ that attempts to solve $\mathrm{iidPC}_D(n_1, \frac{k_1(n_1)}{n_1})$ in time $T_1(n_1) = O(n_1^{\delta/2})$ has probability of success at most $\frac{1}{2} + o(1)$. Let $k_2(n_2) = g(n_2^2)$. Then any algorithmic family $\{\mathcal{A}_{n_2}\}$ that attempts to solve $\mathrm{iidPC}_D(n_2, \frac{k_2(n_2)}{n_2})$ in time $T_2(n_2) = O(n_2^{\delta})$ has probability of success at most $\frac{1}{2} + o(1)$.

Proof. Assume towards a contradiction that there exists an algorithmic family $\mathcal{A}_{n_2}$ that runs in time $T_2(n_2) = O(n_2^{\delta})$ and achieves probability of success at least $p$ (for some constant $p > \frac{1}{2}$) when solving $\mathrm{iidPC}_D(n_2, \frac{k_2}{n_2})$. Suppose we are given an instance $G$ of $\mathrm{iidPC}_D(n_1, \frac{k_1}{n_1})$ to solve, and consider the following algorithm. Set $n_2 = \sqrt{n_1}$, pick out the first $n_2$ vertices of the graph $G$, and call the induced subgraph (which we do not have to compute, just provide access to) $G'$. Note that we have set $n_2$ so that $\frac{k_1}{n_1} = \frac{k_2}{n_2}$. The definition of $\mathrm{iidPC}_D$ implies that $G'$ is an instance of $\mathrm{iidPC}_D(n_2, \frac{k_1}{n_1})$, and hence also an instance of $\mathrm{iidPC}_D(n_2, \frac{k_2}{n_2})$. We can then use $\mathcal{A}_{n_2}$ to solve it in time $T_2(n_2) = O(n_2^{\delta}) = O(n_1^{\delta/2})$ with success probability at least $p$. This gives an algorithmic family that runs in time $T_1(n_1) = O(n_1^{\delta/2})$ and solves $\mathrm{iidPC}_D(n_1, \frac{k_1}{n_1})$ with success probability at least $p > \frac{1}{2}$. This provides the desired contradiction and completes the proof.

$k = \tilde{\Theta}(\sqrt{n})$ from the Planted Clique Conjecture

Having seen that information theoretic techniques will not give better lower bounds, we now turn to proving lower bounds based on computational hardness conjectures and against restricted classes of algorithms.

In the previous section we saw that strong lower bounds on the detection problem at clique sizes near $k = \tilde{\Theta}(\sqrt{n})$ imply non-trivial lower bounds for the detection problem at the information theoretic threshold.
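The reduction in the proof of Lemma 6.9 never materialises the induced subgraph; it only exposes query access to the first $\sqrt{n}$ vertices. A minimal sketch of that idea follows. The class `PrefixSubgraph`, the helper names, and the `small_detector` callback are assumptions made for illustration and do not appear in the paper.

```python
import math


class PrefixSubgraph:
    """Read-only view of the subgraph induced on the first m vertices."""

    def __init__(self, adj, m):
        if m > len(adj):
            raise ValueError("m exceeds the number of vertices")
        self._adj = adj
        self.n = m  # number of vertices visible through the view

    def edge(self, u, v):
        if not (0 <= u < self.n and 0 <= v < self.n):
            raise IndexError("vertex outside the retained prefix")
        return self._adj[u][v]  # no copying: entries are read on demand


def reduce_to_small_instance(adj):
    """Keep the first floor(sqrt(n)) vertices of the n-vertex input."""
    return PrefixSubgraph(adj, math.isqrt(len(adj)))


def solve_large_via_small(adj, small_detector):
    """Answer detection on the large graph by running a detector for the
    small instance on the prefix view (hypothetical callback)."""
    return small_detector(reduce_to_small_instance(adj))
```

Building the view takes constant time, so any running time of the small-instance detector carries over to the large instance unchanged, which is what the contradiction in the proof relies on.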
In this section we show that this connection between sublinear time algorithms for large cliques and polynomial time algorithms for small cliques goes both ways. If the latter is hard (as codified by the Planted Clique Conjecture), we provide some evidence that the former is hard too.

First we define non-adaptive and rectangular algorithms, a sub-class of algorithms against which we can show better lower bounds than we can against an arbitrary algorithm. Non-adaptive algorithms are essentially those for which the algorithm (possibly using randomness) fixes the entries of the adjacency matrix $A_G$ to query before it begins querying $A_G$. Thus the locations of the queries do not depend on the entries of $A_G$. Rectangular algorithms are those in which the set of queries forms a combinatorial rectangle (modulo symmetry). We formalize these notions as follows.

Definition 6.10 (Non-adaptive algorithm). Let $\mathcal{A}$ be an algorithm that takes as input the adjacency matrix $A_G$ of a graph $G = (V, E)$, and $E_{\mathcal{A}}$ be the (symmetric) set of queries it makes to $A_G$ as defined in Definition 6.6. We say $\mathcal{A}$ is non-adaptive if the random variable $E_{\mathcal{A}}$ is independent of the random variable $A_G$.

Definition 6.11 (Rectangular algorithm). Let $\mathcal{A}$ be an algorithm that takes as input the adjacency matrix $A_G$ of a graph $G = (V, E)$, and $E_{\mathcal{A}}$ be the (symmetric) set of queries it makes to $A_G$ as defined in Definition 6.6. We say $\mathcal{A}$ is rectangular if the random variable $E_{\mathcal{A}}$ is defined using two random disjoint subsets of vertices $I, J \subset V$ with $I \cap J = \emptyset$ as follows.

$$E_{\mathcal{A}} = \{(u, v) : u \in J, v \in J, u \ne v\} \cup \{(u, v) : u \in I, v \in J\} \cup \{(u, v) : u \in J, v \in I\}.$$

See Figure 1 for an illustrative example.

Figure 1. An example of the query set $E_{\mathcal{A}}$ for a rectangular algorithm as defined in Definition 6.11.

Remark 6.4. While the definition above allows $E_{\mathcal{A}}$ to be randomized, for the sake of proving lower bounds it actually suffices to consider only deterministic choices of $E_{\mathcal{A}}$. That is, the set of queries the algorithm makes is deterministically fixed before the input is provided. This is because the probability of success of the algorithm is an expectation of the success probabilities under each of the random choices of $E_{\mathcal{A}}$ (this is where we use the fact that the algorithm is non-adaptive). This means that there exists at least one choice of $E_{\mathcal{A}}$ which achieves the probability of success that the randomized algorithm is guaranteed to achieve. This choice can be pre-computed, and it provides a deterministic algorithm that does at least as well as the randomized non-adaptive algorithm we started with. Hence, for the rest of the paper, we will assume this choice of $E_{\mathcal{A}}$ is deterministic.

We will show that if the Planted Clique Conjecture (Conjecture 6.3) is true, then no non-adaptive rectangular algorithm can run in time $O(n^{\frac{3}{2} - \delta})$ for any constant $\delta > 0$ and solve $\mathrm{iidPC}_D(n, \frac{k}{n})$ reliably for cliques of size $k = \tilde{\Theta}(\sqrt{n})$.

The following remark justifies why it is reasonable to consider non-adaptive rectangular algorithms. It appears in Section 3.2 but we reproduce it here for the reader's convenience.

Remark 6.5. Restricting our lower bounds to non-adaptive rectangular algorithms is not too unreasonable. This is because our upper bound algorithms are only weakly adaptive or non-rectangular. In fact, the Clique-Completion subroutine is the only adaptive or non-rectangular part of Algorithms 2, 3, or 4. Moreover, Clique-Completion is only required for the planted clique recovery problem. If we only wanted to solve the detection problem, a simple tweak to Algorithm 2, so that it does not use Clique-Completion but only decides whether or not a planted clique exists based on the largest degree it observes, can give a non-adaptive rectangular detection algorithm that runs in time $\tilde{O}(n^{3/2})$. Similarly, removing the Clique-Completion subroutine from Algorithm 3 while using the modified version of Algorithm 2 inside it gives a non-adaptive rectangular detection algorithm that runs in time $\tilde{O}(n^3 / k^3)$ and reliably detects cliques of size $k = \Omega(\sqrt{n \log n})$. We leave the details to the reader. Since we are showing lower bounds for the detection version of the problem, our upper bound algorithms do indeed belong to the class of algorithms against which we are showing lower bounds.

First we note that the only evidence for a planted clique comes from querying entries for which both vertices are in the planted clique. By Remark 6.1, an algorithm that runs in time $O(n^{\frac{3}{2} - \delta})$ can only query $O(n^{\frac{3}{2} - \delta})$ entries of the adjacency matrix. Moreover, this implies that for the sets $I, J$ that define the queried rectangular subset $E_{\mathcal{A}}$, we must have $(|I| + |J|) |J| = O(n^{\frac{3}{2} - \delta})$. If $|I| + |J| = \Omega(n^{1 - \frac{\delta}{2}})$, we will have $|J| = O(n^{\frac{1}{2} - \frac{\delta}{2}})$. Since $J$ is chosen independent of the location of the (possibly) planted clique, we expect the number of clique vertices in $J$ to be roughly $|J| \frac{k}{n} = o(1)$. Hence we expect to get no evidence of the existence of a planted clique, independent of whether or not one existed. If, on the other hand, we had $|I| + |J| = O(n^{1 - \frac{\delta}{2}})$, then we expect to get evidence of at most $(|I| + |J|) \frac{k}{n} = O(n^{\frac{1}{2} - \frac{\delta}{2}})$ planted vertices. By the Planted Clique Conjecture (Conjecture 6.3), we believe that being able to detect whether or not such a small clique exists should be computationally hard.

This means that no rectangular non-adaptive algorithm can run in time $O(n^{\frac{3}{2} - \delta})$ for any constant $\delta > 0$ and solve $\mathrm{iidPC}_D(n, \frac{k}{n})$ reliably for cliques of size roughly $k = \tilde{\Theta}(\sqrt{n})$.
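The rectangular query set of Definition 6.11 and the counting used in the intuition above can be sketched as follows. The function names are hypothetical. The first function builds $E_{\mathcal{A}}$ from disjoint sets $I$ and $J$, whose size is $|J|(|J| - 1) + 2|I||J| \le 2(|I| + |J|)|J|$; the second computes the expected number of clique vertices landing in a fixed vertex set under the iid model.

```python
def rectangular_query_set(I, J):
    """Build the symmetric query set E_A from disjoint vertex sets I and J."""
    I, J = set(I), set(J)
    if I & J:
        raise ValueError("I and J must be disjoint")
    inside = {(u, v) for u in J for v in J if u != v}  # all pairs within J
    across = {(u, v) for u in I for v in J}            # the I-by-J rectangle
    mirrored = {(v, u) for (u, v) in across}           # its symmetric copy
    return inside | across | mirrored


def expected_clique_vertices(size, k, n):
    """Expected number of clique vertices in a fixed set of `size` vertices
    when every vertex joins the clique independently with probability k/n."""
    return size * k / n
```

For example, with $n = 10^4$, $k = 50$ and $|J| = 100$, the expected number of clique vertices in $J$ is $0.5$, so most runs of such an algorithm would see no clique pair inside $J$ at all.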
We prove the following general theorem by turning these ideas into a formal reduction.

Theorem 4. Assume the Planted Clique Conjecture (Conjecture 6.3) holds. Let $\delta > 0$ be any constant. Let $k(n)$ be any sequence such that $k(n) = \Omega(\sqrt{n})$. Then if any non-adaptive rectangular algorithmic family $\{\mathcal{A}_n\}$ tries to solve $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$ in time $O\left(\frac{n^{3 - \delta}}{(k(n))^3}\right)$, it has probability of success at most $\frac{1}{2} + o(1)$.

Proof. Fix $n$ and let $k$ denote $k(n)$. Let $\delta > 0$ be a constant and assume towards a contradiction that $\mathcal{A}_n$ is a non-adaptive rectangular algorithm that runs in time $O\left(\frac{n^{3 - \delta}}{k^3}\right)$ and solves $\mathrm{iidPC}_D(n, \frac{k}{n})$ with success probability $p > \frac{1}{2}$. By Remark 6.4 we can assume that $E_{\mathcal{A}_n}$, and hence $I$ and $J$, are deterministic. By Remark 6.1, $|E_{\mathcal{A}_n}| = O\left(\frac{n^{3 - \delta}}{k^3}\right)$. Hence, $(|I| + |J|) |J| = O(|E_{\mathcal{A}_n}|) = O\left(\frac{n^{3 - \delta}}{k^3}\right)$.

We will consider two cases. First, consider the simpler case, where $|I| + |J| = \Omega\left(\frac{n^{2 - \delta/2}}{k^2}\right)$. Then we must have $|J| = O\left(\frac{n^{1 - \delta/2}}{k}\right)$, so the expected number of clique vertices in $J$ is $|J| \frac{k}{n} = O(n^{-\delta/2})$. Hence with high probability (at least $1 - o(1)$) over the randomness of the clique vertices, $J \cap K = \emptyset$. In this scenario, the distribution of the entries in $E_{\mathcal{A}_n}$ will be identical under both $G(n, \frac{1}{2})$ and $\hat{G}(n, \frac{1}{2}, \frac{k}{n})$. Hence no algorithm will be able to distinguish between these two cases with success probability greater than $\frac{1}{2} + o(1)$. In this case, the conclusion of our theorem immediately follows.

Now consider the case where $|I| + |J| = O\left(\frac{n^{2 - \delta/2}}{k^2}\right)$. The idea here is to use $\mathcal{A}_n$ to solve $\mathrm{iidPC}_D(n', \frac{k'}{n'})$ for $k' = o(\sqrt{n'})$, an intractable problem according to the Planted Clique Conjecture and Remark 6.3, hence getting a contradiction.

Assume without loss of generality that $I \cup J = \{1, \ldots, n'\}$ and let $k'$ be such that $\frac{k'}{n'} = \frac{k}{n}$. As $n' = O\left(\frac{n^{2 - \delta/2}}{k^2}\right)$, we get that $k' = O(n'^{\frac{1}{2} - \frac{\delta}{4}})$. Indeed, $\frac{k'^2}{n'} = \frac{n' k^2}{n^2} = O(n^{-\delta/2})$ and $n' \le n$. Consider an instance $G'$ of $\mathrm{iidPC}_D(n', \frac{k'}{n'})$. We claim that running $\mathcal{A}_n$ on $G'$ succeeds with high probability in detecting cliques. To prove this claim, we notice that $\mathcal{A}_n$ can access its input graph $G'$ only through $E_{\mathcal{A}_n}(G')$. Moreover, if $G'$ is a null instance (no planted clique), then $E_{\mathcal{A}_n}(G') \overset{d}{=} E_{\mathcal{A}_n}(G)$ (identically distributed) where $G$ is a null instance of $\mathrm{iidPC}_D(n, \frac{k}{n})$. And if $G'$ has a planted clique, then our choice of $\frac{k'}{n'} = \frac{k}{n}$ implies that $E_{\mathcal{A}_n}(G') \overset{d}{=} E_{\mathcal{A}_n}(G)$ where $G$ is an instance of $\mathrm{iidPC}_D(n, \frac{k}{n})$ with a planted clique. Overall, our assumption then implies that $\mathcal{A}_n$ succeeds in solving $\mathrm{iidPC}_D(n', \frac{k'}{n'})$ with probability $p > \frac{1}{2}$, which contradicts the iid Planted Clique Conjecture.

(We remark that a similar sort of 'in expectation' intuition works even for non-rectangular algorithms. However, we have not been able to leverage this intuition into a formal reduction for a generic non-rectangular algorithm.)

6.5 What do these lower bounds formally imply for $\mathrm{PC}_D(n, k)$?

In Lemma 6.9 and Theorem 4 we have shown that when the planted clique problem is formalized as $\mathrm{iidPC}_D(n, \frac{k}{n})$, the non-existence of very fast sublinear time algorithms for detecting large planted cliques is related to the hardness of detecting small cliques. However, what does this imply when the problem is formalized using the more vanilla $\mathrm{PC}_D(n, k)$? To discuss this, we first prove the following two easy lemmas that relate $\mathrm{PC}_D(n, k)$ and $\mathrm{iidPC}_D(n, \frac{k}{n})$.

Lemma 6.12 ($\mathrm{iidPC}_D$ is hard $\to$ $\mathrm{PC}_D$ is hard). Let $k(n) \le n$ and $\omega(1) = f(n) = o(\sqrt{n})$ be some sequences. Suppose that an algorithmic family $\{\mathcal{A}_n\}$ that attempts to solve $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$ has probability of success at most $p_s(n)$. Then there exists a sequence $k'(n)$ (which may depend on $\{\mathcal{A}_n\}$) satisfying $|k'(n) - k(n)| \le f(k(n)) \sqrt{k(n)}$ with probability $1 - o(1)$ for any $n$, such that if $\{\mathcal{A}_n\}$ tries to solve $\mathrm{PC}_D(n, k'(n))$, it has probability of success at most $p_s(n) + o(1)$.

Proof. Fix $n$ and let $k, \hat{k}$ denote $k(n), \hat{k}(n)$. Let $G$ be an instance (with clique) of $\mathrm{iidPC}_D(n, \frac{k}{n})$ and let the random variable $\hat{k}$ denote the clique size. It is clear that the problem instance $G$ is an instance of $\mathrm{PC}_D(n, \hat{k})$ conditioned on the value of $\hat{k}$, the random size the clique takes. Let $S$ denote the event that an algorithm $\mathcal{A}_n$ succeeds on the instance $G$, and $E$ denote the event that $|\hat{k} - k| \le f(k) \sqrt{k}$. Note that $\mathbb{P}(E^c) = o(1)$ from the definition of $\mathrm{iidPC}_D(n, \frac{k}{n})$ and $\mathbb{P}(S) \le p_s(n)$ because of our assumptions. Then

$$\mathbb{P}(S) = \mathbb{P}(S \mid E) \mathbb{P}(E) + \mathbb{P}(S \mid E^c) \mathbb{P}(E^c),$$

which gives $\mathbb{P}(S \mid E) \le p_s(n) + o(1)$ after rearranging. Let $S_{k'}$ denote the event that $\mathcal{A}_n$ succeeds on an instance of $\mathrm{PC}_D(n, k')$, and $k_E$ be the distribution of $\hat{k}$ conditioned on the event $E$ occurring. However, $\mathbb{P}(S \mid E) = \mathbb{E}_{\hat{k} \sim k_E}\left[\mathbb{P}(S_{\hat{k}})\right]$. This implies that for some $k'$ such that $|k' - k| \le f(k) \sqrt{k}$, $\mathbb{P}(S_{k'}) \le p_s(n) + o(1)$, which completes the proof.

Lemma 6.13 ($\mathrm{PC}_D$ is hard $\to$ $\mathrm{iidPC}_D$ is hard). Let $k(n) \le n$ and $\omega(1) = f(n) = o(\sqrt{n})$ be some sequences. Let $k'(n)$ be any sequence satisfying $|k'(n) - k(n)| \le f(k(n)) \sqrt{k(n)}$ with probability $1 - o(1)$ for any $n$. Suppose that an algorithmic family $\{\mathcal{A}_n\}$ that attempts to solve $\mathrm{PC}_D(n, k'(n))$ has probability of success at most $p_s(n)$. Then if $\{\mathcal{A}_n\}$ tries to solve $\mathrm{iidPC}_D(n, \frac{k(n)}{n})$, it has probability of success at most $p_s(n) + o(1)$.

Proof. Fix $n$ and let $k, \hat{k}$ denote $k(n), \hat{k}(n)$. Let $G$ be an instance (with clique) of $\mathrm{iidPC}_D(n, \frac{k}{n})$ and let the random variable $\hat{k}$ denote the clique size. It is clear that the problem instance $G$ is an instance of $\mathrm{PC}_D(n, \hat{k})$ conditioned on the value of $\hat{k}$, the random size the clique takes. Let $S$ denote the event that an algorithm $\mathcal{A}_n$ succeeds on the instance $G$, and $E$ denote the event that $|\hat{k} - k| \le f(k) \sqrt{k}$. Note that $\mathbb{P}(E^c) = o(1)$ because of how the clique is chosen. Then

$$\mathbb{P}(S) = \mathbb{P}(S \mid E) \mathbb{P}(E) + \mathbb{P}(S \mid E^c) \mathbb{P}(E^c) \le \mathbb{P}(S \mid E) + o(1).$$

Let $S_{k'}$ denote the event that $\mathcal{A}_n$ succeeds on an instance of $\mathrm{PC}_D(n, k')$, and $k_E$ be the distribution of $\hat{k}$ conditioned on the event $E$ occurring. By assumption, $\mathbb{P}(S_{k'}) \le p_s(n)$ for all $k'$ such that $|k' - k| \le f(k) \sqrt{k}$. This means that $\mathbb{P}(S \mid E) = \mathbb{E}_{\hat{k} \sim k_E}\left[\mathbb{P}(S_{\hat{k}})\right] \le p_s(n)$, which completes the proof.

At first glance, this seems great. We can simply use these lemmas to get analogues of Lemma 6.9 and Theorem 4 for $\mathrm{PC}_D(n, k)$. However, this does not quite work. We will illustrate this by focusing on trying to show that the Planted Clique Conjecture implies the non-existence of rectangular non-adaptive algorithms that can solve $\mathrm{PC}_D(n, k)$ for clique sizes $k = \tilde{\Theta}(\sqrt{n})$ and run in time $O(n^{\frac{3}{2} - \delta})$ for some constant $\delta > 0$. We already have, from Theorem 4, that this fact holds for $\mathrm{iidPC}_D(n, \frac{k}{n})$. If we try to use this with Lemma 6.12, all we can say is that for any algorithmic family $\mathcal{A}_n$, there is some sequence $k'$, which is very close to $k$, which this algorithmic family cannot solve. However, this need not be the same $k'$ for every algorithm. In effect, it is possible that for every sequence $k'$, there is some algorithmic family that can solve it. Thus we cannot rule out a fast algorithm for even a single such sequence $k'$.

However, this does not mean there is nothing useful we can say. For example, if an algorithm designer (who believes in the Planted Clique Conjecture) wants to build a non-adaptive rectangular algorithm that can solve $\mathrm{PC}_D(n, k = \sqrt{n \log n})$, we can tell them that their algorithm must crucially utilize a very good estimate of the size of the planted clique. This is because their algorithm definitely must fail for some sequence of planted clique sizes that is very close to the true size in the problem instance. As we note in Remark 6.2, the algorithms developed in this work do not crucially utilize such a fine estimate of $k$.

We state the Chernoff bound we use here, for the convenience of the reader.

Lemma 7.1. Let $X = \sum_{i=1}^{n} X_i$ where the $X_i$ are independent $\mathrm{Bern}(p_i)$ random variables. Let $\mu = \sum_{i=1}^{n} p_i$, and $\delta \in (0, 1)$. Then

$$\mathbb{P}(X \ge (1 + \delta)\mu) \le \exp\left(-\frac{\mu \delta^2}{3}\right), \qquad \mathbb{P}(X \le (1 - \delta)\mu) \le \exp\left(-\frac{\mu \delta^2}{2}\right).$$

We also state the following subsampling concentration lemma that proves useful.

Lemma 7.2. Let $V$ be a set of size $n$, and let $K \subset V$ be of size $k$. Let $S_P$ be a subset of $V$ formed by including every element independently with probability $p$, and excluding it otherwise. Then

$$\mathbb{P}(0.5\,pn \le |S_P| \le 1.5\,pn) \ge 1 - 2\exp\left(-\frac{pn}{12}\right)$$

and

$$\mathbb{P}(0.5\,pk \le |S_P \cap K| \le 1.5\,pk) \ge 1 - 2\exp\left(-\frac{pk}{12}\right).$$

Proof. Follows immediately from the Chernoff bounds (Lemma 7.1).
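As an informal sanity check of the subsampling lemma, the following sketch estimates how often both $|S_P|$ and $|S_P \cap K|$ fall within a factor-half window around their means $pn$ and $pk$. The window constants $0.5$ and $1.5$ are an assumption corresponding to taking $\delta = \frac{1}{2}$ in the Chernoff bound, and the function names are illustrative only.

```python
import random


def subsample(vertices, p, rng):
    """Include each element independently with probability p."""
    return {v for v in vertices if rng.random() < p}


def fraction_within_half(n, k, p, trials=200, seed=0):
    """Fraction of trials in which |S_P| lies in [0.5pn, 1.5pn] and
    |S_P ∩ K| lies in [0.5pk, 1.5pk] simultaneously."""
    rng = random.Random(seed)
    K = set(range(k))  # any fixed k-subset works, by symmetry
    good = 0
    for _ in range(trials):
        S = subsample(range(n), p, rng)
        ok_size = 0.5 * p * n <= len(S) <= 1.5 * p * n
        ok_clique = 0.5 * p * k <= len(S & K) <= 1.5 * p * k
        good += ok_size and ok_clique
    return good / trials
```

When $pk$ is in the hundreds the returned fraction should be essentially 1, while for very small $pk$ the clique window fails noticeably often, in line with the $\exp(-pk/12)$-type dependence.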

Acknowledgments

The authors would like to thank Amir Abboud for helpful discussions about fine grained complexity that improved the presentation of our results. KAC would like to thank Kannan Ramchandran for asking a question that helped lead to this work.

References

[AAK+07] Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and Ning Xie. Testing k-wise and almost k-wise independence. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 496–505, 2007.

[AB18] Amir Abboud and Karl Bringmann. Tighter connections between formula-sat and shaving logs. arXiv preprint arXiv:1804.08978, 2018.

[ABBG] Sanjeev Arora, Boaz Barak, Markus Brunnermeier, and Rong Ge. Computational complexity and information asymmetry in financial products.

[ABW10] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptography from different assumptions. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 171–180, 2010.

[ABW18] Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. If the current clique algorithms are optimal, so is valiant's parser. SIAM Journal on Computing, 47(6):2527–2555, 2018.

[AKS98] Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden clique in a random graph. Random Structures & Algorithms, 13(3-4):457–466, 1998.

[BBH18] Matthew Brennan, Guy Bresler, and Wasim Huleihel. Reducibility and computational lower bounds for problems with planted sparse structure. arXiv preprint arXiv:1806.07508, 2018.

[BHK+19] Boaz Barak, Samuel Hopkins, Jonathan Kelner, Pravesh K Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687–735, 2019.

[BPW18] Afonso S Bandeira, Amelia Perry, and Alexander S Wein. Notes on computational-to-statistical gaps: predictions using statistical physics. arXiv preprint arXiv:1803.11132, 2018.

[BR13] Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for sparse principal component detection. In Conference on Learning Theory, pages 1046–1066, 2013.

[BRSV17] Marshall Ball, Alon Rosen, Manuel Sabin, and Prashant Nalini Vasudevan. Average-case fine-grained hardness. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 483–496, 2017.

[DGGP14] Yael Dekel, Ori Gurel-Gurevich, and Yuval Peres. Finding hidden cliques in linear time with high probability. Combinatorics, Probability and Computing, 23(1):29–49, 2014.

[DM15a] Yash Deshpande and Andrea Montanari. Finding hidden cliques of size $\sqrt{N/e}$ in nearly linear time. Foundations of Computational Mathematics, 15(4):1069–1128, 2015.

[DM15b] Yash Deshpande and Andrea Montanari. Improved sum-of-squares lower bounds for hidden clique and hidden submatrix problems. In Conference on Learning Theory, pages 523–562, 2015.

[FGN+20] Uriel Feige, David Gamarnik, Joe Neeman, Miklós Z Rácz, and Prasad Tetali. Finding cliques using few probes. Random Structures & Algorithms, 56(1):142–153, 2020.

[FGR+17] Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh S Vempala, and Ying Xiao. Statistical algorithms and a lower bound for detecting planted cliques. Journal of the ACM (JACM), 64(2):1–37, 2017.

[FK00] Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden clique in a semirandom graph. Random Structures & Algorithms, 16(2):195–208, 2000.

[FR10] Uriel Feige and Dorit Ron. Finding hidden cliques in linear time. 2010.

[Gol10] Oded Goldreich. Introduction to testing graph properties. In Property testing, pages 105–141. Springer, 2010.

[GR18] Oded Goldreich and Guy Rothblum. Counting t-cliques: Worst-case to average-case reductions and direct interactive proof systems. In , pages 77–88. IEEE, 2018.

[GZ19] David Gamarnik and Ilias Zadik. The landscape of the planted clique problem: Dense subgraphs and the overlap gap property. arXiv preprint arXiv:1904.07174, 2019.

[HKP+18] Samuel B Hopkins, Pravesh Kothari, Aaron Henry Potechin, Prasad Raghavendra, and Tselil Schramm. On the integrality gap of degree-4 sum of squares for planted clique. ACM Transactions on Algorithms (TALG), 14(3):1–31, 2018.

[HS05] Don Hush and Clint Scovel. Concentration of the hypergeometric distribution. Statistics & probability letters, 75(2):127–132, 2005.

[Jer92] Mark Jerrum. Large cliques elude the metropolis process. Random Structures & Algorithms, 3(4):347–359, 1992.

[Kuč95] Luděk Kučera. Expected complexity of graph partitioning problems. Discrete Applied Mathematics, 57(2-3):193–212, 1995.

[KZ14] Pascal Koiran and Anastasios Zouzias. Hidden cliques and the certification of the restricted isometry property. IEEE transactions on information theory, 60(8):4999–5006, 2014.

[MPW15] Raghu Meka, Aaron Potechin, and Avi Wigderson. Sum-of-squares lower bounds for planted clique. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 87–96, 2015.

[MW+15] Zongming Ma, Yihong Wu, et al. Computational barriers in minimax submatrix detection. The Annals of Statistics, 43(3):1089–1116, 2015.

[RS19] Miklós Z Rácz and Benjamin Schiffer. Finding a planted clique by adaptive probing. arXiv preprint arXiv:1903.12050, 2019.

[SBW19] Nihar B Shah, Sivaraman Balakrishnan, and Martin J Wainwright. Feeling the bern: Adaptive estimators for bernoulli probabilities of pairwise comparisons. IEEE Transactions on Information Theory, 65(8):4854–4874, 2019.

[WBP16] Tengyao Wang, Quentin Berthet, and Yaniv Plan. Average-case hardness of rip certification. In Advances in Neural Information Processing Systems, pages 3819–3827, 2016.

[Wil] Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity. World Scientific.

[WX18] Yihong Wu and Jiaming Xu. Statistical problems with planted structures: Information-theoretical and computational limits. arXiv preprint arXiv:1806.00118