Approximate Spielman-Teng theorems for the least singular value of random combinatorial matrices
Vishesh Jain∗

Abstract
An approximate Spielman-Teng theorem for the least singular value s_n(M_n) of a random n × n square matrix M_n is a statement of the following form: there exist constants C, c > 0 such that for all η ≥ 0, Pr(s_n(M_n) ≤ η) ≲ n^C η + exp(−n^c). The goal of this paper is to develop a simple and novel framework for proving such results for discrete random matrices. As an application, we prove an approximate Spielman-Teng theorem for {0, 1}-valued matrices, each of whose rows is an independent vector with exactly n/2 zero components. This improves on previous work of Nguyen and Vu, and is the first such result in a 'truly combinatorial' setting.

Let M_n be an n × n real matrix. Its singular values, denoted by s_k(M_n) for k ∈ [n], are the eigenvalues of √(M_n^T M_n) arranged in non-increasing order. Of particular interest are the largest and smallest singular values, which have the following variational characterizations:

s_1(M_n) := sup_{x ∈ S^{n−1}} ‖M_n x‖;  s_n(M_n) := inf_{x ∈ S^{n−1}} ‖M_n x‖,

where ‖·‖ denotes the Euclidean norm on R^n, and S^{n−1} denotes the (n−1)-dimensional Euclidean sphere in R^n.
The study of the non-limiting or non-asymptotic behavior of the largest and smallest singular values of random matrices plays a crucial role in diverse areas of mathematics – such as applied linear algebra, computer science, statistics, and asymptotic geometric analysis – in addition to often being a key ingredient in proving other results in random matrix theory, for instance the circular law (which is the non-Hermitian counterpart of the classical semicircle law of Wigner) and delocalization properties of eigenvectors. We refer the reader to the surveys [22, 29, 37] and the books [31, 33] for a detailed account of the development of the area.
The behavior of the largest singular value of random matrices with independent entries is relatively well-understood.
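As a purely illustrative companion to the definitions above, the variational characterization of s_1 and s_n can be checked numerically. The sketch below (all parameter choices are our own) computes the singular values of a random sign matrix and verifies that ‖M_n x‖ always lies between s_n and s_1 for unit vectors x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# A random Rademacher (sign) matrix, the model studied later in the paper.
M = rng.choice([-1.0, 1.0], size=(n, n))

# Singular values: eigenvalues of sqrt(M^T M), in non-increasing order.
s = np.linalg.svd(M, compute_uv=False)
s1, sn = s[0], s[-1]

# Variational characterization: s1 = sup ||Mx||, sn = inf ||Mx|| over the
# unit sphere, so every unit vector x satisfies sn <= ||Mx|| <= s1.
for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    assert sn - 1e-9 <= np.linalg.norm(M @ x) <= s1 + 1e-9

# For such matrices s1 = Theta(n^{1/2}) with high probability
# (in fact s1 is approximately 2*sqrt(n)).
print(s1 / np.sqrt(n))
```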
Latała [15] showed that if the entries of M_n have mean 0 and uniformly bounded fourth moment, then with high probability, s_1(M_n) = O(n^{1/2}); for i.i.d. entries with mean 0, variance 1, and uniformly bounded fourth moment, it was already known much earlier [1, 39] that with high probability, s_1(M_n) = Θ(n^{1/2}).

∗ Massachusetts Institute of Technology. Department of Mathematics. Email: [email protected].
On the other hand, the study of the behavior of the smallest singular value has proved to be much harder. For an overview of the history of this problem for matrices with i.i.d. entries, we refer the reader to [28]; here, we only briefly summarize a few developments. For random matrices with i.i.d. standard Gaussian entries, it was proved by Edelman [5] that

Pr(s_n(M_n) ≤ εn^{−1/2}) ∼ ε,   (1)

thereby confirming (in a very strong form) a conjecture of Smale, and a speculation of von Neumann and Goldstine. In connection with their work on smoothed analysis, Spielman and Teng [30] conjectured that Equation (1) should also hold for random Rademacher matrices (i.e., each entry is independently ±1 with equal probability), up to an additive error of c^n (for some c < 1) to account for the probability that such a matrix is singular; i.e., they conjectured that

Pr(s_n(M_n) ≤ εn^{−1/2}) ≤ ε + c^n.   (2)

Note that the ε = 0 version of this conjecture asserts that the probability that a random signed matrix is singular is exponentially small; even proving that this probability goes to 0 as n → ∞ is a non-trivial result due to Komlós [12], and the exponential bound was only obtained much later by Kahn, Komlós and Szemerédi [11].
It was shown in breakthrough works by Rudelson [26] that

Pr(s_n(M_n) ≤ εn^{−3/2}) ≲ ε + n^{−1/2}

for all random matrices M_n with i.i.d. centered subgaussian entries, and by Tao and Vu [35] that for random signed matrices M_n, for any A > 0, there exists B > 0 such that

Pr(s_n(M_n) ≤ n^{−B}) ≤ n^{−A}.   (3)

These results have been greatly refined in subsequent remarkable works: Rudelson and Vershynin [28] showed that Equation (2) holds (up to a multiplicative constant) for all random matrices with i.i.d. centered subgaussian entries, Rebrova and Tikhomirov [25] proved the same result assuming only that the i.i.d.
entries are centered and have variance 1, and in the special case of random signed matrices, Tikhomirov [36] proved the same result but with the correct 'constant' c = 1/2 + o_n(1).

Random matrices with dependent entries:
Despite the great progress in the study of random matrices with independent entries, much less is known about the behavior of the least singular value for models of random matrices with non-trivial dependence between entries. Some measure of the difficulty in the study of such models may be obtained by noting that the symmetric analog of Komlós's classical result (on the asymptotically almost sure invertibility of random Bernoulli matrices) was only proved almost 40 years later (in 2006) by Costello, Tao, and Vu [4]. Similarly, while the Spielman-Teng conjecture for random signed matrices has been settled up to an overall constant, the current best statement of the same form for random symmetric signed matrices M_n is due to Vershynin [38], who proved that

Pr(s_n(M_n) ≤ εn^{−1/2}) ≲ ε^{1/9} + e^{−n^c}   (4)

for some small constant c > 0. Motivated by this, we will henceforth refer to a result of the following form as an approximate Spielman-Teng theorem for a random matrix M_n; these will be the subject of the present work: there exist constants C, c > 0 such that

Pr(s_n(M_n) ≤ εn^{−1/2}) ≲ n^C ε + e^{−n^c}.   (5)

In recent years, motivated by combinatorial applications, the study of such questions for the adjacency matrices of random graphs has attracted a lot of attention, with particular emphasis on graphs or bipartite graphs satisfying various regularity constraints (which translate to constraints on the row/column sums of the matrix). In these settings, even the analogs of Komlós's theorem have only very recently been proved – for d-regular digraphs with n − 1 ≥ d ≥ 3, this is due to (complementary) work of Cook [3], Litvak, Lytova, Tikhomirov, Tomczak-Jaegermann and Youssef [16], and Huang [8], whereas for d-regular graphs with n − 1 ≥ d ≥ 3, this is due to Landon, Sosoe, and Yau [14], and Huang [8] (see also the parallel works of Mészáros [19] and Nguyen and Wood [23]).
Whereas some quantitative control on the least singular value in combinatorial settings has been obtained (see, e.g., [2] and [17], and also the discussion below regarding [21]), these bounds are still quite far from approximate Spielman-Teng type results. In fact, prior to the very recent work of Ferber, Jain, Luh, and Samotij [7], we are not even aware of any 'exponential-type' bound (by which we mean a bound of the form exp(−n^c) for some constant c > 0) on the singularity probability in combinatorial settings of such nature.
Our goal in this work is to establish a novel framework (utilizing the recent approach to the 'counting problem in inverse Littlewood–Offord theory' developed by the author, along with Ferber, Luh, and Samotij [7]) for proving approximate Spielman-Teng results in the discrete setting in a simple and unified manner. As an illustration of our main techniques (while keeping technicalities to a minimum), we begin by providing a proof of the following theorem which, in our opinion, is much simpler than existing proofs in the literature.
Theorem 1.1.
Let M_n denote an n × n random matrix, each of whose entries is an independent Rademacher random variable. Then, for any η ≥ 2^{−n^{0.001}},

Pr(s_n(M_n) ≤ η) ≲ ηn^{3/2}.

Remark 1.2.
We have not made any attempt to optimize the constant 0.001 or the factor n^{3/2} in the above theorem, choosing instead to keep the exposition simple. We also note that our proof goes through with very minor modifications to yield a similar result for the case when the entries of M_n are i.i.d., with each entry taking on the value 0 with probability 1 − µ and ±1 with probability µ/2 each, for some fixed constant µ ∈ (0, 1], thereby providing a simple new proof of (a quantitative improvement of) the main result of Tao and Vu in [35]. On the other hand, as mentioned in the introduction, better and nearly optimal quantitative bounds are already known in this case.
Next, we use our general framework, along with certain combinatorial ideas developed in [7], to prove (to the best of our knowledge) the first approximate Spielman-Teng theorem in a 'truly combinatorial' setting.

Theorem 1.3.
Let n ∈ N be even, and let Q_n denote an n × n random matrix, sampled uniformly from n × n {0, 1}-valued matrices, each of whose rows sums to n/2. Then, for any η ≥ 2^{−n^{0.001}},

Pr(s_n(Q_n) ≤ η) ≲ ηn^{3}.

Remark 1.4.
Once again, we have not tried to optimize the constant 0.001 or the factor n^{3} in the above theorem. The restriction to row sums being equal to n/2 is also made for simplicity; similar ideas may be used to prove a statement like the one above with n/2 replaced by some other row sum s satisfying εn ≤ s ≤ (1 − ε)n for some fixed ε > 0.
The problem of estimating the probability that Q_n is singular was first considered by Nguyen in [20] (as a step towards understanding the regular digraph/graph case), where it was shown that, for any constant C > 0, Pr(Q_n is singular) = O_C(n^{−C}). An exponential-type upper bound on this probability was recently provided in [7]. The question of obtaining quantitative lower tail bounds on the least singular value of Q_n was considered by Nguyen and Vu in [21], where a much weaker bound of the form Equation (3) was obtained. The goal of that work was to prove a circular law for such matrices; while we do not consider this matter here, we remark that obtaining quantitative lower tail estimates on the least singular value (of perturbed matrices) is a key step in proving circular laws, and we believe that our techniques should extend to that setting as well. We also believe that our techniques (combined with additional combinatorial arguments) should allow one to prove an approximate Spielman-Teng theorem for sufficiently dense random regular digraphs.
We will discuss the main ingredients of our method in detail in the next two sections; here, we make a few general remarks. Our general approach to proving lower tail estimates on the least singular value lies somewhere between the method of Tao and Vu (as developed in [34] and subsequent works), and the method of Rudelson and Vershynin (as developed in [28] and subsequent works).
Like Tao and Vu, we reduce to working with integer vectors (as opposed to working with nets on the unit sphere); however, we completely avoid the use of inverse Littlewood-Offord type theorems, choosing instead to work with the simple and quantitatively stronger counting variant developed in [7]. On the other hand, like Rudelson and Vershynin, we utilize the key notion of the
Least Common Denominator (LCD) of a vector. However, while their work requires dividing vectors on the unit sphere into approximate level sets of the LCD and carefully analyzing each piece, we only need to distinguish 'large' LCD from 'small' LCD. Interestingly, our method provides a view of the LCD as a bridge from the problem of controlling the least singular value to the problem of controlling the singularity probability on a subset of integer vectors.
In upcoming work, we will build upon the ideas introduced here in a couple of directions. In [9], we extend the techniques of [7] to prove a counting counterpart for the inverse Littlewood-Offord problem for very general distributions, and use this to provide a simple combinatorial proof of an approximate Spielman-Teng theorem for random matrices with i.i.d. heavy-tailed entries (a Spielman-Teng theorem for such matrices was recently proved by Rebrova and Tikhomirov [25]). In [10], we further develop the ideas here to prove approximate Spielman-Teng results in the important setting of smoothed analysis, i.e., when the random matrix is perturbed by a fixed, polynomially bounded matrix; here, weaker bounds of the form Equation (3) are known due to Tao and Vu [32].
Organization:
The remainder of this paper is organized as follows. In Section 2, we provide a high-level outline of the proof of Theorem 1.1 (the proof of Theorem 1.3 is conceptually quite similar, and we will discuss the necessary changes at the start of Section 5); in Section 3, we collect some tools and auxiliary results which will be used in the proofs of our main results. Finally, in Section 4 and Section 5, we prove Theorem 1.1 and Theorem 1.3 respectively.
Notation:
Throughout the paper, we will omit floors and ceilings when they make no essential difference. For convenience, we will also say 'let p = x be a prime', to mean that p is a prime between x and 2x; again, this makes no difference to our arguments. As is standard, we will use [n] to denote the discrete interval {1, . . . , n}. We will also use the asymptotic notation ≲, ≳, ≪, ≫ to denote O(·), Ω(·), o(·), ω(·) respectively. All logarithms are natural unless noted otherwise.

Acknowledgements:
I would like to thank Kyle Luh for comments on a preliminary version of this paper, and Jake Lee Wellens for helpful conversations.

2 Outline of the proof of Theorem 1.1
To motivate our proof, we begin by recalling the high-level approach of Tao and Vu from [35]. Let
B > 0 be a large number (depending on A) to be chosen later. Then, if s_n(M_n) < n^{−B}, there must exist a unit vector v ∈ S^{n−1} for which ‖M_n v‖ < n^{−B}. By rounding each coordinate of v to the nearest multiple of n^{−B−2}, we can find a vector ṽ ∈ n^{−B−2}·Z^n of magnitude 0.9 ≤ ‖ṽ‖ ≤ 1.1 such that ‖M_n ṽ‖ ≤ 2n^{−B}. Hence, writing w := n^{B+2} ṽ, we can find an integer vector w ∈ Z^n of magnitude 0.9 n^{B+2} ≤ ‖w‖ ≤ 1.1 n^{B+2} such that ‖M_n w‖ ≤ 2n^{2}.
Let Ω be the set of integer vectors w ∈ Z^n of magnitude 0.9 n^{B+2} ≤ ‖w‖ ≤ 1.1 n^{B+2}. By the above discussion, it suffices to show that

Pr(∃ w ∈ Ω such that ‖M_n w‖ ≤ 2n^{2}) = O_A(n^{−A}).

In order to show this, Tao and Vu partition the elements of Ω into three sets, which they analyze using separate arguments. This partition is based on whether or not the vector is 'close' to a sufficiently low-dimensional subspace, as well as the following key quantity.

Definition 2.1 (Largest atom probability). For an integer vector w ∈ Z^n, we define its largest atom probability to be

ρ(w) := sup_{x ∈ Z} Pr(ε_1 w_1 + · · · + ε_n w_n = x),

where ε_1, . . . , ε_n are i.i.d. Rademacher random variables.

The partitioning scheme of Tao and Vu is as follows:
• A vector w ∈ Ω is rich if ρ(w) ≥ n^{−A−1} and poor otherwise. Let Ω_1 be the set of poor w's.
• A rich w is singular if fewer than n^{0.9} of its coordinates have absolute value n^{B−1} or greater. Let Ω_2 be the set of rich and singular w's.
• A rich w is nonsingular if at least n^{0.9} of its coordinates have absolute value n^{B−1} or greater. Let Ω_3 be the set of rich and nonsingular w's.
The desired claim follows directly from the estimates below and the union bound.
• (Lemma 7.1 in [35]) Pr(∃ w ∈ Ω_1 : ‖M_n w‖ ≤ 2n^{2}) = o_A(n^{−A}).
• (Lemma 7.2 in [35]) Pr(∃ w ∈ Ω_2 : ‖M_n w‖ ≤ 2n^{2}) = o_A(n^{−A}).
• (Lemma 7.3 in [35]) Pr(∃ w ∈ Ω_3 : ‖M_n w‖ ≤ 2n^{2}) = o_A(n^{−A}).
The proofs of the first two bullet points above are relatively straightforward and standard, and based on similar proofs in [18, 26]. The main work in [35] is the proof of the third bullet point, which requires the inverse Littlewood-Offord theorems along with additional additive combinatorial arguments.
The starting point of our approach is the following simple observation. Let Γ ⊆ Z^n be a set of non-zero integer vectors. Then,

Pr(∃ w ∈ Γ : ‖M_n w‖ ≤ C(n)√n) ≤ Σ_{z ∈ Z^n ∩ B(0, C(n)√n)} Pr(∃ w ∈ Γ : M_n w = z)
≤ |Z^n ∩ B(0, C(n)√n)| · sup_{z ∈ Z^n ∩ B(0, C(n)√n)} Pr(∃ w ∈ Γ : M_n w = z)
≤ (100 C(n))^n · sup_{z ∈ Z^n} Pr(∃ w ∈ Γ : M_n w = z),   (6)

where the first inequality uses that M_n w is always an integer vector, and the last inequality uses a standard (loose) volumetric estimate on the number of integer points in an n-dimensional ball of radius R. The second quantity in the last equation, i.e.,

sup_{z ∈ Z^n} Pr(∃ w ∈ Γ : M_n w = z),   (7)

is reminiscent of the singularity problem for random Rademacher matrices, which corresponds to the case when Γ = Z^n \ {0} and the supremum is replaced simply by z = 0. The bounds on the singularity problem coming from either inverse Littlewood-Offord theory [35] or its counting variant [7] show that for a suitable set of vectors of 'intermediate' largest atom probability, one may bound the quantity in Equation (7) by O(n^{−cn}) for some (small) absolute constant c > 0 (see also Proposition 4.5).
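The largest atom probability of Definition 2.1, which also drives the bound on Equation (7), can be computed exactly for small n by convolving the two-point distributions of the ε_i w_i. The minimal sketch below (the function name is ours) illustrates the two extremes: degenerate vectors with ρ(w) = Θ(n^{−1/2}) and 'spread' vectors with ρ(w) = 2^{−n}.

```python
from collections import Counter

def atom_probability(w):
    """Exact largest atom probability rho(w) = sup_x Pr(eps . w = x),
    eps_i i.i.d. Rademacher, via convolution of two-point distributions."""
    dist = Counter({0: 1.0})
    for wi in w:
        new = Counter()
        for val, pr in dist.items():
            new[val + wi] += pr / 2
            new[val - wi] += pr / 2
        dist = new
    return max(dist.values())

# Degenerate vector: the signed sums of (1,...,1) pile up at 0,
# giving the central binomial probability C(10,5)/2^10.
print(atom_probability([1] * 10))
# 'Spread' vector: distinct powers of 2 give 2^10 distinct sums, rho = 2^{-10}.
print(atom_probability([2 ** i for i in range(10)]))
```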
Hence, for C(n) = o(n^{c}), the quantity on the right hand side of Equation (6) is (o(1))^n.
Since the set of vectors of 'intermediate' largest atom probability mentioned above corresponds, in a sense, to 'rich, nonsingular' vectors, one may hope to use a similar decomposition of integer vectors as Tao and Vu to complete the proof. However, one runs into the immediate obstacle that the discussion in the above paragraph only holds for C(n) = o(n^{c}), whereas the reduction to integer vectors in [35] requires one to be able to work with C(n) = Ω(n^{3/2}). Note that this reduction, as stated, is clearly wasteful; by using the fact (Proposition 3.4) that, except with exponentially small probability, ‖M_n‖ = O(√n), one is able to reduce the consideration to C(n) = O(√n), which turns out to be just out of reach.
However, this loss is because we are using the worst-case estimate that the closest vector w ∈ n^{−B−2}·Z^n to a given vector v ∈ S^{n−1} satisfies ‖w − v‖ ≤ √n · n^{−B−2}/2 = n^{−B−3/2}/2. To overcome this obstacle, we will use the connection between the largest atom probability and Diophantine approximation (as captured by the Least Common Denominator (LCD)) developed in [28]. In particular, we will use the fact (Proposition 3.3) that vectors v ∈ S^{n−1} for which this worst-case estimate is 'close' to being true have high LCD, and hence, are necessarily 'poor'; in other words, for 'rich' vectors, we gain sufficiently over the worst-case estimate (Proposition 4.3) for the above strategy to be effective.

3 Tools and auxiliary results
In this subsection, we record the definition of the LCD of a vector and its connection to the classical Lévy concentration function, as developed in [28].
Definition 3.1.
The Lévy concentration function of a random variable X at scale δ ≥ 0 is defined as

L(X, δ) := sup_{r ∈ R} Pr(|X − r| ≤ δ).

Definition 3.2 (Least Common Denominator (LCD)). For γ ∈ (0, 1) and α > 0, and for a non-zero vector a ∈ R^n, define

LCD_{γ,α}(a) := inf{ θ > 0 : dist(θa, Z^n) < min{γ‖θa‖, α} }.

Note that the requirement that the distance is smaller than γ‖θa‖ forces us to consider only non-trivial integer points as approximations of θa.
The following proposition, which appears in [28], shows that vectors with large LCD have small Lévy concentration function on scales which are larger than Ω(1/LCD). Here, for completeness, we reproduce a particularly simple proof for the Rademacher case from the lecture notes [27]; this is essentially the only case that will be needed in this paper.

Proposition 3.3 (Theorem 6.2 in [27]). Let ε_1, . . . , ε_n denote i.i.d. Rademacher random variables. Consider a unit vector a = (a_1, . . . , a_n) ∈ S^{n−1}. Let S := Σ_{i=1}^{n} ε_i a_i. Then, for every α > 0, and for δ ≥ (4/π)/LCD_{γ,α}(a), we have

L(S, δ) ≲ δ/γ + exp(−α²/2).

Proof.
We start by using Esséen's inequality ([6]), which estimates the Lévy concentration function of a random variable in terms of its characteristic function as follows:

L(X, 1) ≲ ∫_{−1}^{1} |E[exp(iθX)]| dθ.

Then, we have

L(S, δ) = L(S/δ, 1) ≲ ∫_{−1}^{1} |E[exp(iθS/δ)]| dθ = ∫_{−1}^{1} ∏_{j=1}^{n} |E[exp(i a_j ε_j θ/δ)]| dθ = ∫_{−1}^{1} ∏_{j=1}^{n} |cos(a_j θ/δ)| dθ
≤ ∫_{−1}^{1} ∏_{j=1}^{n} exp(−(1 − cos²(a_j θ/δ))/2) dθ = ∫_{−1}^{1} ∏_{j=1}^{n} exp(−sin²(a_j θ/δ)/2) dθ
≤ ∫_{−1}^{1} ∏_{j=1}^{n} exp(−2 min_{q ∈ Z} |a_j θ/(πδ) − q|²) dθ,

where in the fourth step, we have used the inequality |x| ≤ exp(−(1 − x²)/2), and in the last step, we have used the pointwise inequality |sin(x)| ≥ 2 min_{q ∈ Z} |x/π − q|. Thus, we see that

L(S, δ) ≲ ∫_{−1}^{1} exp(−2 h(θ)²) dθ, where h(θ) := min_{p ∈ Z^n} ‖(θ/(πδ)) a − p‖.

Since, by assumption, 1/(πδ) ≤ LCD_{γ,α}(a)/4, it follows that for any θ ∈ [−1, 1],

h(θ) ≥ min(γ (|θ|/(πδ)) ‖a‖, α) = min(γ |θ|/(πδ), α),

so that

L(S, δ) ≲ ∫_{−1}^{1} (exp(−2(γθ/(πδ))²) + exp(−2α²)) dθ ≲ δ/γ + exp(−α²/2),

as desired.

We will make use of the following two results, which may be proved in a straightforward manner using standard concentration and epsilon-net arguments. Later, in Lemma 5.1 and Proposition 5.2, we will provide proofs of analogous results for the random matrix model under consideration there. The first result is a bound on the standard ℓ_2 → ℓ_2 operator norm of a typical realization of M_n.

Proposition 3.4 (See, e.g., Proposition 4.4 in [27]). There exist absolute constants C_{3.4} > 0, c_{3.4} > 0 for which the following holds. For all t ≥ C_{3.4},

Pr(‖M_n‖ ≥ t√n) ≲ exp(−c_{3.4} t² n).

The second result shows that, with very high probability, the image of a fixed unit vector under M_n does not have norm o(√n).

Lemma 3.5 (See, e.g., Corollary 4.6 in [27]). There exists an absolute constant c_{3.5} > 0 for which the following holds. Fix v ∈ S^{n−1}. Then,

Pr(‖M_n v‖ ≤ c_{3.5}√n) ≲ exp(−c_{3.5} n).

3.3 The counting problem in inverse Littlewood-Offord theory

Our definition of the set Γ in Equation (6) and the bound on Equation (7) rely on the approach to the counting problem in inverse Littlewood-Offord theory developed in [7].
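Before turning to that approach, here is a quick numerical illustration of the LCD of Definition 3.2. The grid-search sketch below (function name, grid resolution, and all parameter choices are our own illustrative assumptions, not part of the paper's argument) exhibits the dichotomy used later: unit vectors close to scalar multiples of integer vectors have small LCD, while generic directions do not.

```python
import numpy as np

def lcd(a, gamma, alpha, theta_max=20.0, step=1e-3):
    """Grid-search approximation of LCD_{gamma,alpha}(a): the least theta > 0
    with dist(theta*a, Z^n) < min(gamma*||theta*a||, alpha).  Returns
    theta_max if no such theta is found below theta_max."""
    a = np.asarray(a, dtype=float)
    for theta in np.arange(step, theta_max, step):
        x = theta * a
        if np.linalg.norm(x - np.round(x)) < min(gamma * np.linalg.norm(x), alpha):
            return theta
    return theta_max

n = 16
# 'Structured' direction: a multiple of an integer vector; theta*a approaches
# Z^n as theta approaches sqrt(n) = 4, so the LCD is small (just below 4).
a_struct = np.ones(n) / np.sqrt(n)
# Generic direction: no small theta brings theta*a near the integer lattice.
rng = np.random.default_rng(1)
a_rand = rng.standard_normal(n)
a_rand /= np.linalg.norm(a_rand)

lcd_struct = lcd(a_struct, gamma=0.1, alpha=0.5)
lcd_rand = lcd(a_rand, gamma=0.1, alpha=0.5)
print(lcd_struct, lcd_rand)  # small value vs. (typically) theta_max
```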
The starting point of this approach is a classical anti-concentration inequality due to Halász, which bounds the largest atom probability of an integer vector in terms of its 'arithmetic structure'. In order to state this inequality, we need the following definition. Throughout this section, we will work over F_p (the reader should view p as a 'large' (depending on n) prime) instead of over Z.

Definition 3.6.
Suppose that a ∈ F_p^n for n ∈ N and an odd prime p, and let k ∈ N. We define R*_k(a) to be the number of solutions to

±a_{i_1} ± a_{i_2} ± · · · ± a_{i_{2k}} = 0 mod p,

where repetitions are allowed in the choice of i_1, . . . , i_{2k} ∈ [n], and such that |{i_1, . . . , i_{2k}}| > (1.01)k.

Remark 3.7.
Let F*_p denote the set of all finite-dimensional vectors with coefficients in F_p. Then, for every vector a ∈ F*_p and for every k ∈ N, we have the trivial bound

R*_k(a) ≤ 2^{2k} · |a|^{2k},

where |a| denotes the number of components of a. Indeed, there are at most |a|^{2k} ways of choosing indices i_1, . . . , i_{2k} ∈ [|a|], and at most 2^{2k} ways of choosing a sign pattern which will satisfy the required equation for a given choice of indices.

Theorem 3.8 (Halász's inequality over F_p, see Theorem 1.4 in [7]). There exists a constant C_{3.8} such that the following holds for every odd prime p, integer n, and vector a := (a_1, . . . , a_n) ∈ F_p^n \ {0}. Suppose that an integer 0 ≤ k ≤ n/2 and positive real M satisfy 2M ≤ |supp(a)| and 30kM ≤ n. Then,

ρ_{F_p}(a) ≤ 1/p + C_{3.8} · (R*_k(a) + (40√(kn))^{2k}) / (2^{2k} n^{2k} M^{1/2}) + e^{−M}.

Here, ρ_{F_p}(a) denotes the largest atom probability of a over F_p.

The next theorem bounds the number of vectors over F_p^n which have no 'large' subvector with 'small' R*_k, and is a straightforward consequence of Theorem 1.7 in [7]. Later, we will see that this readily translates to a good upper bound on the number of vectors in F_p^n with given largest atom probability.

Theorem 3.9 (See also Lemma 3.3 in [7]). Let p be an odd prime and let k ∈ N, s_0 ≥ s_1 ∈ [n], t ∈ [p]. Let

B^{s_0}_{k,s_1,≥t}(n) := { a ∈ F_p^n : |supp(a)| ≥ s_0, and ∀ b ⊆ a s.t. |supp(b)| ≥ s_1 we have R*_k(b) ≥ t · 2^{2k} · |b|^{2k} / p }.

Then,

|B^{s_0}_{k,s_1,≥t}(n)| ≤ (200)^n binom(s_0, s_1)^{k−1} p^n t^{−n+s_1}.

Proof.
Let us first fix an S ⊆ [n] with |S| ≥ s_0 and count only those vectors a with supp(a) = S. Writing s := |S|, define

B_{k,s_1,≥t}(s) := { a ∈ F_p^s : ∀ b ⊆ a s.t. |supp(b)| ≥ s_1 we have R*_k(b) ≥ t · 2^{2k} · |b|^{2k} / p }.

If a ∈ B^{s_0}_{k,s_1,≥t}(n) with supp(a) = S, it follows that a|_S ∈ B_{k,s_1,≥t}(s). Hence, Theorem 1.7 in [7] shows that the number of choices for a|_S is at most

binom(s, s_1)^{k−1} (0.01t)^{s_1} (p/t)^{s} ≤ (100)^n binom(s_0, s_1)^{k−1} p^n t^{−n+s_1}.

Finally, summing over all the at most 2^n possible choices for S gives the desired conclusion.

We conclude this subsection by noting that, by Remark 3.7, any vector a ∈ F_p^n with |supp(a)| ≥ s_0 must also lie in at least one of the sets B^{s_0}_{k,s_1,≥t}(n), where t ranges over integers from 1 to p.

4 Proof of Theorem 1.1

Throughout this section, we will take α := n^{1/4} and γ := c_{3.5}/(8C_{3.4}). Moreover, since Theorem 1.1 is trivially true for η ≥ n^{−1/2}, we will henceforth assume that 2^{−n^{0.001}} ≤ η < n^{−1/2}. We decompose the unit sphere S^{n−1} into Γ_1(η) ∪ Γ_2(η), where

Γ_1(η) := { a ∈ S^{n−1} : LCD_{γ,α}(a) ≥ n^{1/2} · η^{−1} }

and Γ_2(η) := S^{n−1} \ Γ_1(η). Accordingly, we have

Pr(s_n(M_n) ≤ η) ≤ Pr(∃ a ∈ Γ_1(η) : ‖M_n a‖ ≤ η) + Pr(∃ a ∈ Γ_2(η) : ‖M_n a‖ ≤ η).   (8)

Therefore, Theorem 1.1 follows from the following two propositions and the union bound.

Proposition 4.1. Pr(∃ a ∈ Γ_1(η) : ‖M_n a‖ ≤ η) ≲ ηn^{3/2} + n exp(−√n/2).

Proposition 4.2. There exists an absolute constant c_{4.2} > 0 such that Pr(∃ a ∈ Γ_2(η) : ‖M_n a‖ ≤ η) ≲ exp(−c_{4.2} n).

The proof of Proposition 4.1 is relatively simple, and follows from a conditioning argument developed in [18], once we observe the crucial fact (Proposition 3.3) that for any a ∈ Γ_1(η),

L(Σ_{i=1}^{n} ε_i a_i, δ) ≲ δ + exp(−√n/2) for all δ ≥ (4/π) η · n^{−1/2}.

Proof of Proposition 4.1 (following [18, 35]).
Since M_n^T and M_n have the same singular values, a necessary condition for a matrix M_n to satisfy the event in Proposition 4.1 is that there exists a unit vector a′ = (a′_1, . . . , a′_n) such that ‖a′^T M_n‖ ≤ η. To every matrix M_n, associate such a vector a′ arbitrarily (if one exists) and denote it by a′_{M_n}; partitioning according to the coordinate at which ‖a′_{M_n}‖_∞ is attained (by symmetry, we may deal with the case when it is the last one), and taking a union bound over the n coordinates, it suffices to show the following:

Pr(∃ a ∈ Γ_1(η) : ‖M_n a‖ ≤ η ∧ ‖a′_{M_n}‖_∞ = |a′_n|) ≲ η√n + exp(−√n/2).   (9)

To this end, we expose the first n − 1 rows X_1, . . . , X_{n−1} of M_n. Note that if there is some a ∈ Γ_1(η) satisfying ‖M_n a‖ ≤ η, then there must exist a vector y ∈ Γ_1(η), depending only on the first n − 1 rows X_1, . . . , X_{n−1}, such that

(Σ_{i=1}^{n−1} (X_i · y)²)^{1/2} ≤ η.

In other words, once we expose the first n − 1 rows of the matrix, either the matrix cannot be extended to one satisfying the event in Proposition 4.1, or there is some unit vector y ∈ Γ_1(η), which can be chosen after looking only at the first n − 1 rows, and which satisfies the equation above. For the rest of the proof, we condition on the first n − 1 rows X_1, . . . , X_{n−1} (and hence, a choice of y).
For any vector w′ ∈ S^{n−1} with w′_n ≠ 0, we can write

X_n = (1/w′_n)(u − Σ_{i=1}^{n−1} w′_i X_i),

where u := w′^T M_n. Thus, for the event {s_n(M_n) ≤ η} ∧ {‖a′_{M_n}‖_∞ = |a′_n|} to occur, we must necessarily have

|X_n · y| = inf_{w′ ∈ S^{n−1}, w′_n ≠ 0} (1/|w′_n|) |u · y − Σ_{i=1}^{n−1} w′_i (X_i · y)|
≤ (1/|a′_n|) (‖a′^T_{M_n} M_n‖ ‖y‖ + ‖a′_{M_n}‖ (Σ_{i=1}^{n−1} (X_i · y)²)^{1/2})
≤ η√n (‖y‖ + ‖a′_{M_n}‖) ≤ 2η√n,

where the second line is due to the Cauchy-Schwarz inequality and the particular choice w′ = a′_{M_n}, and the third line uses 1/|a′_n| = 1/‖a′_{M_n}‖_∞ ≤ √n. It follows, by definition, that the probability in Equation (9) is bounded by L(X_n · y, 2η√n), and hence, by

L(X_n · y, 2η√n) ≲ η√n + exp(−√n/2),

which completes the proof.

The proof of Proposition 4.2 is the content of the next three subsections. Here, we present the initial crucial step, which consists of efficiently passing from vectors on the unit sphere to integer vectors.
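The passage just described rests on a triangle-inequality estimate: if w is the integer point near θa, then ‖M_n w‖ ≤ ‖M_n‖·‖θa − w‖ + θ·‖M_n a‖, so a unit vector that is almost annihilated by M_n yields an integer vector whose image is also small. A small numerical sketch (all concrete values are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
M = rng.choice([-1.0, 1.0], size=(n, n))
a = rng.standard_normal(n)
a /= np.linalg.norm(a)

# Round theta*a to the nearest integer point w (theta is an arbitrary choice here).
theta = 50.0
w = np.round(theta * a)

# Triangle inequality behind the sphere-to-integer-vectors reduction:
#   ||M w|| <= ||M|| * ||theta*a - w|| + theta * ||M a||.
opnorm = np.linalg.norm(M, 2)
lhs = np.linalg.norm(M @ w)
rhs = opnorm * np.linalg.norm(theta * a - w) + theta * np.linalg.norm(M @ a)
print(lhs, rhs)
assert lhs <= rhs + 1e-8
```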
Proposition 4.3.
With notation as above, we have

Pr(∃ a ∈ Γ_2(η) : ‖M_n a‖ ≤ η) ≲ e^{−c_{3.4} n} + Pr(∃ w ∈ (Z^n \ {0}) ∩ [−2η^{−1}n^{1/2}, 2η^{−1}n^{1/2}]^n : ‖M_n w‖ ≤ min{8γC_{3.4}√n‖w‖, 2C_{3.4}α√n}).

Proof.
Since by Proposition 3.4,
Pr(‖M_n‖ ≥ C_{3.4}√n) ≲ exp(−c_{3.4} C_{3.4}² n), we may henceforth restrict to the complement of this event. Let a ∈ Γ_2(η). Then, by definition, there exist some 0 < θ ≤ n^{1/2} η^{−1} and some w ∈ Z^n \ {0} such that ‖θa − w‖ ≤ min{γθ, α}; in particular, ‖w‖ = θ(1 ± γ), so that w ∈ [−2η^{−1}n^{1/2}, 2η^{−1}n^{1/2}]^n. Thus, if ‖M_n a‖ ≤ η, it follows from the triangle inequality that

‖M_n w‖ = ‖M_n(w − θa) + M_n(θa)‖ ≤ ‖M_n‖ · ‖θa − w‖ + θ · ‖M_n a‖ ≤ C_{3.4}√n · min{γθ, α} + θη ≤ 2C_{3.4}√n · min{γθ, α},

where the last inequality follows since η ≤ n^{−1/2} ≤ γC_{3.4}√n and θη ≤ n^{1/2} ≤ C_{3.4}α√n. The desired conclusion now follows from the straightforward case analysis below.

Case I: γθ ≤ α. In this case, w is a non-zero integer vector satisfying

‖M_n w‖ ≤ 2γC_{3.4}√n θ ≤ min{8γC_{3.4}√n‖w‖, 2C_{3.4}α√n},

where the last inequality uses θ ≤ 2‖w‖ (which holds since ‖w‖ ≥ θ(1 − γ) and γ ≤ 1/2) and γθ ≤ α.

Case II: γθ > α. In this case, w is a non-zero integer vector of norm ‖w‖ ≥ θ(1 − γ) ≥ γ^{−1}α/2 satisfying

‖M_n w‖ ≤ 2C_{3.4}α√n ≤ 2C_{3.4} · 2γ‖w‖ · √n ≤ min{8γC_{3.4}√n‖w‖, 2C_{3.4}α√n},

where we have used α < γθ ≤ 2γ‖w‖.

The goal of this subsection is to prove the following lemma, which follows from Lemma 3.5 and a simple union bound. Throughout this subsection and the next one, p = 2^{n^{0.01}} is a prime (i.e., following our notational convention, a prime between 2^{n^{0.01}} and 2^{n^{0.01}+1}). Note, in particular, that p ≫ η^{−2}n^{3}.

Lemma 4.4. Pr(∃ w ∈ (Z^n \ {0}) ∩ [−p, p]^n with |supp(w)| ≤ n^{0.9} : ‖M_n w‖ ≤ 8γC_{3.4}√n‖w‖) ≲ exp(−c_{3.5} n/2).

Proof.
The number of vectors w ∈ (Z^n \ {0}) ∩ [−p, p]^n with support of size no more than n^{0.9} is at most

binom(n, n^{0.9}) (3p)^{n^{0.9}} ≪ 2^{n^{0.92}}.

By Lemma 3.5 (applied to w/‖w‖), for any such vector,

Pr(‖M_n w‖ ≤ 8γC_{3.4}√n‖w‖) = Pr(‖M_n w‖ ≤ c_{3.5}√n‖w‖) ≲ exp(−c_{3.5} n).

Therefore, the union bound gives the desired conclusion.
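For tiny parameters, the count used in the proof above can be checked by brute force: the number of nonzero vectors in [−p, p]^n with support of size at most m is exactly Σ_{j=1}^{m} binom(n, j)(2p)^j, which is dominated by the loose bound binom(n, m)(3p)^m used in the lemma. A minimal sketch (helper name ours):

```python
from itertools import product
from math import comb

def count_small_support(n, p, m):
    """Brute-force count of nonzero vectors in [-p, p]^n with at most m
    nonzero coordinates."""
    return sum(
        1
        for v in product(range(-p, p + 1), repeat=n)
        if any(v) and sum(x != 0 for x in v) <= m
    )

n, p, m = 5, 2, 2
exact = count_small_support(n, p, m)
# Choose the support (binom(n, j)), then a nonzero value in each position (2p choices).
closed = sum(comb(n, j) * (2 * p) ** j for j in range(1, m + 1))
bound = comb(n, m) * (3 * p) ** m
print(exact, closed, bound)  # 180 180 360
assert exact == closed <= bound
```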
Throughout this subsection, we fix k = n^{0.05} and s_0 = s_1 = n^{0.9}. It remains to deal with integer vectors with support of size at least n^{0.9}. Formally, let

W := { w ∈ (Z^n \ {0}) ∩ [−2η^{−1}n^{1/2}, 2η^{−1}n^{1/2}]^n : |supp(w)| ≥ n^{0.9} }.

In view of Proposition 4.3 and Lemma 4.4 (note that 2η^{−1}n^{1/2} ≤ p, since η ≥ 2^{−n^{0.001}}), the following proposition suffices to prove Proposition 4.2.

Proposition 4.5. Pr(∃ w ∈ W : ‖M_n w‖ ≤ 2C_{3.4} n^{3/4}) ≲ n^{−0.1n}.

This will be accomplished by a union bound, following the strategy outlined in Equation (6). Note that for our choice of parameters, the natural map

ι : W → F_p^n

is injective, and we will often abuse notation by using w to denote ι(w). This identification enables us to make the following definition.

Definition 4.6. For an integer t ∈ [p], let

W_t := { w ∈ W : ι(w) ∈ B^{s_0}_{k,s_1,≥t−1}(n) \ B^{s_0}_{k,s_1,≥t}(n) }.

We will need the following two lemmas.
Lemma 4.7.
There exists an absolute constant C_{4.7} > 0 such that, for our choice of parameters, if w ∈ W_t, then

ρ(w) ≤ (C_{4.7}/p)(t/n^{0.4} + 1).

Proof.
Since ρ(w) ≤ ρ_{F_p}(ι(w)) =: ρ_{F_p}(w), it suffices to prove the statement for the latter quantity. This, in turn, follows from a direct application of Halász's inequality (Theorem 3.8). Indeed, since w ∉ B^{s_0}_{k,s_1,≥t}(n), there exists some b ⊆ w such that |supp(b)| ≥ s_1 and

R*_k(b) ≤ t · 2^{2k} · |b|^{2k} / p.

Moreover, for our choice of parameters, we have

(40√(k|b|))^{2k} = 2^{2k}|b|^{2k} · (400k/|b|)^{k} ≤ 2^{2k}|b|^{2k} · (400k/s_1)^{k} ≪ 2^{2k}|b|^{2k}/p.

Hence, applying Halász's inequality to the |b|-dimensional vector b with M = n^{0.8} (note that this choice of M satisfies the conditions 2M ≤ s_1 ≤ |supp(b)| and 30kM ≤ s_1 ≤ |b| needed to apply Halász's inequality), and observing that (trivially) ρ_{F_p}(w) ≤ ρ_{F_p}(b), we get

ρ_{F_p}(w) ≲ 1/p + (t · 2^{2k}|b|^{2k}/p + 2^{2k}|b|^{2k}/p) / (2^{2k}|b|^{2k} n^{0.4}) + e^{−n^{0.8}} ≲ (1/p)(t/n^{0.4} + 1),

as desired.

Lemma 4.8.
For our choice of parameters, |W_t| ≤ (300)^n (p/t)^n .

Proof.
By definition, any w ∈ W_t satisfies ι(w) ∈ B^{s₁}_{k,s₂,≥t−1}(n). Hence, by Theorem 3.9, the number of possible such vectors ι(w) is at most

(200)^n ( p/(t−1) )^n p^{s} ≤ (300)^n (p/t)^n .

Using the injectivity of ι gives the desired conclusion.

Finally, we are in a position to prove Proposition 4.5. As discussed at the start of this subsection, this completes the proof of Proposition 4.2 and hence, the proof of Theorem 1.1.

Proof of Proposition 4.5. We begin by noting that every w ∈ W has ρ(w) ≥ η/(3n). Indeed, for any such vector w, the sum ∑_{i=1}^n ε_i w_i can take on at most 2nη^{−1} + 1 values, so that the claim follows from the pigeonhole principle. Since η/(3n) ≫ 1/√p, it follows from Lemma 4.7 that W_t = ∅ for all t ≤ √p. On the other hand, using Equation (6) with Γ = W_t and C(n) = 2 C n / , it follows from Lemma 4.7 and Lemma 4.8 that for all t ≥ √p, the probability that the image of some vector in W_t under M_n lies in the ball of radius C n / centered at the origin is at most

(200 C n / )^n |W_t| ( C t/(p n . ) )^n ≤ (200 C n / )^n (300)^n (p/t)^n ( C t/(p n . ) )^n ≪ n − . n .

Finally, taking the union bound over integers t ∈ [√p, p] completes the proof.

A major difference between the proofs of Theorem 1.1 and Theorem 1.3 is that Proposition 3.4 and Lemma 3.5 are no longer available to us; indeed, the operator norm of Q_n is n/2, whereas the standard proof of Lemma 3.5 does not immediately go through, since the random variables ⟨Q_n v, e_i⟩ might not have their largest atom probability bounded away from 1 (for instance, this is the case when v is the all-ones vector). A large part of the proof is devoted to circumventing these issues. To overcome the first problem, we exploit the presence of a 'spectral gap'. Namely, we show (Lemma 5.1) that, while the operator norm of Q_n is n/2, the operator norm of Q_n restricted to the hyperplane H := { v ∈ R^n : ∑_{i=1}^n v_i = 0 } is at most n .
with high probability. The utility of this is that one can slightly modify the best integer approximation to a vector (guaranteed by the definition of the LCD) in such a way that the difference/approximation error is contained almost entirely in H (Proposition 5.10); since the only place where we need the operator norm is to bound the norm of Q_n applied to this difference, it follows that the 'effective operator norm' for our purposes is at most n . . To overcome the second obstacle, we prove (Proposition 5.2) a concentration inequality for sums of low-degree polynomials on slices of the Boolean hypercube. Our proof combines the classical hypercontractive estimates for polynomials on the Boolean hypercube with more recent hypercontractive estimates for polynomials on slices of the Boolean hypercube, and may be of independent interest. Even given these additional tools, the remainder of the proof is not as straightforward as the proof of Theorem 1.1; after our reduction to integer vectors (Proposition 5.10), we will need to exploit the approach in [7] (used there to study the singularity probability of Q_n) in order to get to the setting of Equations (6) and (7) and complete the proof.

Lemma 5.1.
There exist absolute constants C_{5.1} > 0 and c_{5.1} > 0 for which the following holds. For all t ≥ C_{5.1},

Pr( sup_{v ∈ H ∩ S^{n−1}} ‖Q_n v‖ ≥ t n . ) ≲ exp( −c_{5.1} t² n . ).

Proof.
Let M_n denote a uniformly random n × n {±1}-valued matrix. We will use the easy observation that Q_n ∼ 2^{−1}( 1_{n×n} + M_n ) | { M_n 𝟙 = 0 }, where 1_{n×n} denotes the n × n all-ones matrix and 𝟙 denotes the all-ones vector. Since Pr( M_n 𝟙 = 0 ) ≥ ( 1/(2√n) )^n = exp(−Θ(n log n)), it suffices to show that

Pr( sup_{v ∈ H ∩ S^{n−1}} ‖2^{−1}( 1_{n×n} + M_n ) v‖ ≥ t n . ) = Pr( sup_{v ∈ H ∩ S^{n−1}} ‖M_n v‖ ≥ 2t n . ) ≲ exp( −Ω( t² n . ) ),

since 1_{n×n} v = 0 for any v ∈ H. But from Proposition 3.4, we have for all t ≥ C_{5.1} that

Pr( sup_{v ∈ H ∩ S^{n−1}} ‖M_n v‖ ≥ 2t n . ) ≤ Pr( ‖M_n‖ ≥ t n . √n ) ≲ exp( −c_{5.1} t² n . ),

which completes the proof.

Proposition 5.2.
For any ε > 0, there exists a constant C_{5.2}(ε) > 0 for which the following holds. Fix v ∈ S^{n−1}. Then,

Pr( ‖Q_n v‖ ≤ (√n/4) ‖v‖ ) ≤ C_{5.2}(ε) exp( −n^{1−ε}/4 ).

The proof of this proposition will require a few intermediate steps. We begin by computing the expectation of the random variable ‖Q_n v‖² for fixed v = (v_1, …, v_n) ∈ S^{n−1}. Consider the random variable X := v_1(1 + x_1) + ⋯ + v_n(1 + x_n), where x_1, …, x_n are {±1}-valued random variables sampled uniformly from the hyperplane x_1 + ⋯ + x_n = 0. Then, for all i ∈ [n], the random variables ⟨Q_n v, e_i⟩ are independent copies of X/2, so that ‖Q_n v‖² ∼ (X_1² + ⋯ + X_n²)/4, where X_1, …, X_n are i.i.d. copies of X. Since E[x_i] = 0 for all i, and E[x_i x_j] = −1/(n−1) for all i ≠ j, we have

E[X²] = E[ ( ∑_{i=1}^n v_i + ∑_{i=1}^n v_i x_i )² ]
= ( ∑_{i=1}^n v_i )² + E[ ( ∑_{i=1}^n v_i x_i )² ]
= ( ∑_{i=1}^n v_i )² + ∑_{i=1}^n v_i² − (1/(n−1)) ∑_{i≠j} v_i v_j
= ( ∑_{i=1}^n v_i )² + ∑_{i=1}^n v_i² − (1/(n−1)) [ ( ∑_{i=1}^n v_i )² − ∑_{i=1}^n v_i² ]
= ((n−2)/(n−1)) ( ∑_{i=1}^n v_i )² + (n/(n−1)) ∑_{i=1}^n v_i² .

It follows that

E[ ‖Q_n v‖² ] = ( n(n−2)/(4(n−1)) ) ( ∑_{i=1}^n v_i )² + ( n²/(4(n−1)) ) ∑_{i=1}^n v_i² .

The remainder of the proof consists of showing that the random variable ‖Q_n v‖² is sufficiently well concentrated around its expectation, using the standard exponential moment method (Bernstein's trick). For this, we need good control on the moments of X. The control for 'low' moments is provided by the following hypercontractivity inequality on slices of the Boolean hypercube, which is applicable in our setting since X is a linear polynomial on the central slice of the Boolean hypercube.

Lemma 5.3 (see, e.g., Proposition 2.5 and Corollary 2.6 in [13]). For any integer q ≥ 1, E[X^{2q}] ≤ O_q(1) ( E[X²] )^q .

For 'high' moments, the above estimate is possibly wasteful, since the factor O_q(1) could grow too quickly as a function of q. However, we can do better by combining the classical hypercontractive estimate for polynomials on the Boolean hypercube with a simple conditioning argument.

Lemma 5.4.
For any integer q ≥ 1,

E[X^{2q}] ≤ 2√n · (4q)^q ( E[X²] )^q .

Proof. Consider the random variable Y := v_1(1 + ε_1) + ⋯ + v_n(1 + ε_n), where ε_1, …, ε_n are i.i.d. Rademacher random variables, and observe as before that X ∼ Y | { ε_1 + ⋯ + ε_n = 0 }. Since Y is a linear form on the Boolean hypercube {±1}^n equipped with the uniform measure, it follows from the usual hypercontractive inequality (see Theorem 9.21 of [24]) that for all integers q ≥ 1,

E[Y^{2q}] ≤ (2q)^q · ( E[Y²] )^q .

Moreover, a short calculation similar to (but easier than) the one for X shows that

E[Y²] = ( ∑_{i=1}^n v_i )² + ∑_{i=1}^n v_i² ≤ 2 E[X²] .

Therefore, we have

E[X^{2q}] = E[ Y^{2q} | ε_1 + ⋯ + ε_n = 0 ] ≤ E[Y^{2q}] / Pr( ε_1 + ⋯ + ε_n = 0 ) ≤ 2√n · E[Y^{2q}] ≤ 2√n · (2q)^q · ( E[Y²] )^q ≤ 2√n · (2q)^q · ( 2 E[X²] )^q ,

which gives the desired conclusion.

Combining these two lemmas immediately gives the following.

Lemma 5.5.
For any integer q ≥ 1,

‖X² − E[X²]‖_q ≤ min{ O_q(1), (100√n)^{1/q} · 5q } · E[X²] .

Proof. By the triangle inequality for the L_q-norm, we get that

‖X² − E[X²]‖_q ≤ ‖X²‖_q + E[X²] ≤ min{ O_q(1), (100√n)^{1/q} · 4q } · E[X²] + E[X²] ≤ min{ O_q(1), (100√n)^{1/q} · 5q } · E[X²] ,

where the second inequality follows from the previous two lemmas.

The previous bound on moments can now be used to obtain a useful bound on the moment generating function.

Lemma 5.6.
Let Z := E[X²] − X². Then, for any integer t ≥ 2 and for any 0 < λ < 1/(40 E[X²]),

E[exp(λZ)] ≤ 1 + O_t(1) λ² E[X²]² + 200√n · ( 14 λ E[X²] )^t .

Proof. For the range of parameters in the statement of the lemma, we have

E[exp(λZ)] = E[ ∑_{q=0}^∞ λ^q Z^q / q! ] = 1 + ∑_{q=2}^∞ λ^q E[Z^q] / q!
≤ 1 + ∑_{q=2}^{t−1} λ^q ‖Z‖_q^q / q! + ∑_{q=t}^∞ λ^q ‖Z‖_q^q / q!
≤ 1 + O_t(1) ∑_{q=2}^{t−1} λ^q ( E[X²] )^q / q! + 100√n ∑_{q=t}^∞ λ^q (5q)^q ( E[X²] )^q / q!
≤ 1 + O_t(1) λ² E[X²]² + 100√n ∑_{q=t}^∞ ( 14 λ E[X²] )^q
≤ 1 + O_t(1) λ² E[X²]² + 200√n · ( 14 λ E[X²] )^t ,

where the first equality uses E[Z] = 0, the third line follows by Lemma 5.5, the fourth line uses (5q)^q / q! ≤ (5e)^q ≤ 14^q together with λ E[X²] < 1/40, and the last line sums the geometric series.

Finally, we are in a position to prove Proposition 5.2.

Proof of Proposition 5.2.
As above, let Z := E[X²] − X², and let Z_1, …, Z_n be i.i.d. copies of Z. For any integer t ≥ 2 and for any 0 < λ < 1/(40 E[X²]), we have

Pr( ‖Q_n v‖ ≤ (√n/4) ‖v‖ ) = Pr( ‖Q_n v‖² ≤ (n/16) ‖v‖² )
≤ Pr( ∑_{i=1}^n X_i² ≤ (n/4) ‖v‖² )
≤ Pr( ∑_{i=1}^n X_i² ≤ (n/4) E[X²] )
≤ Pr( ∑_{i=1}^n Z_i ≥ n E[X²] / 2 )
≤ Pr( exp( λ ∑_{i=1}^n Z_i ) ≥ exp( λ n E[X²] / 2 ) )
≤ exp( −λ n E[X²] / 2 ) ∏_{i=1}^n E[exp(λ Z_i)]
≤ exp( −λ n E[X²] / 2 ) ( 1 + O_t(1) λ² E[X²]² + 200√n · ( 14 λ E[X²] )^t )^n ,

where the third inequality uses ‖v‖² ≤ E[X²], and the last line follows from Lemma 5.6. Let ε > 0 be fixed as in the statement of the proposition, and take t ≥ 2 to be the smallest integer for which

√n · n^{−tε} ≤ n^{−2ε} .

Then, for λ = 1/( n^ε E[X²] ) (which satisfies our assumption on λ for all n sufficiently large), we see that the right-hand side is at most

exp( −n^{1−ε}/2 ) ( 1 + O_t(1) n^{−2ε} )^n ≤ exp( −n^{1−ε}/2 + O_t(1) n^{1−2ε} ) ≲ exp( −n^{1−ε}/4 ),

which completes the proof.

Throughout this section, we will take α := n / and γ = n − . Moreover, since Theorem 1.3 is trivially true for η ≥ n − , we will henceforth assume that 2^{−n . } ≤ η < n − . We decompose the unit sphere S^{n−1} into Γ_1(η) ∪ Γ_2(η), where

Γ_1(η) := { a ∈ S^{n−1} : LCD_{α,γ}(a) ≥ n / · η^{−1} } and Γ_2(η) := S^{n−1} \ Γ_1(η) .

Accordingly, we have

Pr( s_n(Q_n) ≤ η ) ≤ Pr( ∃ a ∈ Γ_1(η) : ‖Q_n a‖ ≤ η ) + Pr( ∃ a ∈ Γ_2(η) : ‖Q_n a‖ ≤ η ) .   (10)

Therefore, Theorem 1.3 follows from the following two propositions and the union bound.

Proposition 5.7. Pr( ∃ a ∈ Γ_1(η) : ‖Q_n a‖ ≤ η ) ≲ ηn + n / exp( −√n/4 ) .

Proposition 5.8. Pr( ∃ a ∈ Γ_2(η) : ‖Q_n a‖ ≤ η ) ≲ exp( −c √n ) .

The proof of Proposition 5.7 is almost exactly the same as that of Proposition 4.1. The only difference is that, at the very end, instead of using Proposition 3.3, we use the following variant.
Proposition 5.9.
Let n ≥ 2 be an even integer. Fix a unit vector a = (a_1, …, a_n) ∈ S^{n−1} and consider the random variable S := ∑_{i=1}^n y_i a_i , where the y_i are {0,1}-valued random variables sampled uniformly from the hyperplane y_1 + ⋯ + y_n = n/2. Then, for every α > 0 and for δ ≥ (4/π)/LCD_{α,γ}(a), we have

L(S, δ) ≲ δ√n/γ + √n exp( −α²/2 ) .

Proof. Note that 2S ∼ ∑_{i=1}^n (1 + x_i) a_i , where the x_i are {±1}-valued random variables sampled uniformly from the hyperplane x_1 + ⋯ + x_n = 0, and that L(S, δ) = L(2S, 2δ) = L( ∑_{i=1}^n x_i a_i , 2δ ). The desired conclusion follows since, for any r ∈ R,

Pr( | ∑_{i=1}^n x_i a_i − r | ≤ 2δ ) = Pr( | ∑_{i=1}^n ε_i a_i − r | ≤ 2δ | ε_1 + ⋯ + ε_n = 0 )
≲ √n Pr( | ∑_{i=1}^n ε_i a_i − r | ≤ 2δ )
≤ √n L( ∑_{i=1}^n ε_i a_i , 2δ )
≲ δ√n/γ + √n exp( −α²/2 ) ,

where the last inequality follows from Proposition 3.3.

The proof of Proposition 5.8 will be the content of the next two subsections. Here, we present the key initial step, which consists of efficiently passing from vectors on the unit sphere to integer vectors.

Proposition 5.10.
With notation as above, we have

Pr( ∃ a ∈ Γ_2(η) : ‖Q_n a‖ ≤ η ) ≲ e^{−c n . } + Pr( ∃ w ∈ (Z^n \ {0}) ∩ [−η^{−1} n / , η^{−1} n / ]^n : ‖Q_n w‖ ≤ C_{5.10} min{ n . ‖w‖, n . } ) .

Proof.
Since, by Lemma 5.1, Pr( sup_{v ∈ H ∩ S^{n−1}} ‖Q_n v‖ ≥ C_{5.1} n . ) ≲ exp( −c_{5.1} n . ), we may henceforth restrict to the complement of this event. Let a ∈ Γ_2(η). Then, by definition, there exist some 0 < θ ≤ LCD_{α,γ}(a) ≤ n / η^{−1} and some w ∈ Z^n \ {0} such that ‖θa − w‖ ≤ min{ γθ, α }.

Case I: γθ ≤ n − . . In particular, θ ≤ n . . In this case, if ‖Q_n a‖ ≤ η, then

‖Q_n w‖ = ‖Q_n(w − θa) + Q_n(θa)‖
≤ ‖Q_n‖ · ‖w − θa‖ + θ · ‖Q_n a‖
≤ (n/2) · γθ + θη
≤ n . + 2η‖w‖
≤ n . ‖w‖ + 2η‖w‖
≤ 2 n . ‖w‖
≤ 10 min{ n . ‖w‖, n . },

where the fourth line uses ‖w‖ ≥ θ(1 − γ) ≥ θ/2; the fifth line uses ‖w‖ ≥ 1 (since w ∈ Z^n \ {0}); the sixth line uses η ≤ n . ; and the last line uses ‖w‖ ≤ θ(1 + γ) ≤ 2θ ≤ 2 n . .

Case II: γθ > n − . . In particular, θ > n − . γ^{−1} > n . . Let ℓ := |⟨w − θa, 𝟙⟩| and let s := sgn( ⟨w − θa, 𝟙⟩ ). Let w′ ∈ {0,1}^n denote the vector whose first ⌊ℓ⌋ coordinates are 1 and whose remaining coordinates are 0; note that this makes sense since, by the Cauchy–Schwarz inequality, we have ℓ ≤ √n ‖w − θa‖ ≤ √n α ≪ n. We will need the following easily established claims.

1. |⟨w − s w′ − θa, 𝟙⟩| ≤ 1. Indeed, we have ⟨w − s w′ − θa, 𝟙⟩ = ⟨w − θa, 𝟙⟩ − s⟨w′, 𝟙⟩ = sℓ − s⌊ℓ⌋ = s( ℓ − ⌊ℓ⌋ ) ∈ [−1, 1].

2. ‖w − s w′‖ = θ(1 ± n − / ). This follows from ‖w − s w′‖ = ‖w‖ ± ‖w′‖ along with the estimate

‖w′‖² ≤ ℓ ≤ √n ‖w − θa‖ ≤ √n γθ,

from which we see that ‖w′‖ ≤ (√n γ)^{1/2} √θ ≤ n − / θ.

3. Restricted to the event sup_{v ∈ H ∩ S^{n−1}} ‖Q_n v‖ ≤ C_{5.1} n . , we have

‖Q_n( w − s w′ − θa )‖ ≤ 3 C_{5.1} n . min{ n − / θ, n / } .

Indeed, writing

w − s w′ − θa = ( ⟨w − s w′ − θa, 𝟙⟩ / n ) 𝟙 + Proj_H( w − s w′ − θa ),

we see that

‖Q_n( w − s w′ − θa )‖ ≤ ( |⟨w − s w′ − θa, 𝟙⟩| / n ) ‖Q_n 𝟙‖ + ‖Q_n Proj_H( w − s w′ − θa )‖
≤ (1/n) · (n/2) · √n + C_{5.1} n . ‖w − s w′ − θa‖
≤ √n + C_{5.1} n . ( ‖w′‖ + ‖w − θa‖ ),

where the second inequality uses the estimate from 1. Next, note that

‖w′‖ + ‖w − θa‖ ≤ min{ n − / θ, (√n α)^{1/2} } + min{ γθ, α } ≤ 2 min{ n − / θ, n / } .

It follows that

‖Q_n( w − s w′ − θa )‖ ≤ √n + 2 C_{5.1} n . min{ n − / θ, n / }
≤ min{ n . θ, √n } + 2 C_{5.1} n . min{ n − / θ, n / }
≤ 3 C_{5.1} n . min{ n − / θ, n / } ,

where the second inequality uses θ > n . .

From these facts, it follows that if ‖Q_n a‖ ≤ η, then

‖Q_n( w − s w′ )‖ = ‖Q_n( w − s w′ − θa ) + Q_n( θa )‖
≤ ‖Q_n( w − s w′ − θa )‖ + θ · ‖Q_n a‖
≤ 3 C_{5.1} n . min{ n − / θ, n / } + θη
≤ 3 C_{5.1} n . min{ n − / θ, n / } + n /
≤ 3 C_{5.1} n . min{ n − / θ, n / } + n . min{ n − / θ, n / }
≤ 4 C_{5.1} n . min{ n − / θ, n / }
≤ 4 C_{5.1} min{ n . θ, n . }
≤ C_{5.10} min{ n . ‖w − s w′‖, n . },

where the third line uses 3.; the fourth line uses θ ≤ n / η^{−1}; the fifth line uses θ ≥ n . ; and the last line uses 2.

Throughout this subsection and the next one, p = 2 n . is a prime.

Definition 5.11.
For an integer vector v ∈ Z^n, we define the size of its largest level set to be

L(v) := sup_{z ∈ Z} |{ i ∈ [n] : v_i = z }| .

The goal of this subsection is to prove the following lemma, which follows from Proposition 5.2 and a simple union bound.
Lemma 5.12. Pr( ∃ w ∈ (Z^n \ {0}) ∩ [−p, p]^n with L(w) ≥ n − n . : ‖Q_n w‖ ≤ C n . ‖w‖ ) ≲ exp( −c n . ).

Proof. The number of vectors w ∈ (Z^n \ {0}) ∩ [−p, p]^n with L(w) ≥ n − n . is at most

(n choose n . ) · (3p) · (3p)^{n . } = e^{o(n)} :

choose the at most n . coordinates lying outside the largest level set, the common value on the level set, and then the values of the remaining coordinates. For n ∈ N sufficiently large, by Proposition 5.2, for any such vector,

Pr( ‖Q_n w‖ ≤ C n . ‖w‖ ) ≤ Pr( ‖Q_n w‖ ≤ (√n/4) ‖w‖ ) ≲ exp( −n . ).

Therefore, the union bound gives the desired conclusion.
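The appeal to Proposition 5.2 in the proof above ultimately rests on the exact second-moment identity for X = ∑ v_i(1 + x_i) on the central slice, namely E[X²] = ((n−2)/(n−1))(∑ v_i)² + (n/(n−1)) ∑ v_i². A quick Monte Carlo sanity check of that identity (an illustration only; the test vector is arbitrary and not from the paper):

```python
import random

def exact_second_moment(v):
    """Closed form for E[X^2] with X = sum v_i (1 + x_i), where x is
    uniform on the central slice {x in {-1,1}^n : sum x_i = 0}."""
    n = len(v)
    s1 = sum(v)
    s2 = sum(x * x for x in v)
    return (n - 2) / (n - 1) * s1 * s1 + n / (n - 1) * s2

def empirical_second_moment(v, trials=100_000, seed=0):
    rng = random.Random(seed)
    n = len(v)
    base = [1] * (n // 2) + [-1] * (n // 2)  # a point on the central slice
    total = 0.0
    for _ in range(trials):
        rng.shuffle(base)  # uniform random arrangement = uniform slice point
        x_val = sum(vi * (1 + xi) for vi, xi in zip(v, base))
        total += x_val * x_val
    return total / trials

v = [0.6, -0.3, 0.5, 0.1, -0.4, 0.35]  # arbitrary test vector, n = 6 (even)
print(exact_second_moment(v), empirical_second_moment(v))
```

The two printed values agree to within Monte Carlo error; the negative pairwise correlation E[x_i x_j] = −1/(n−1) on the slice is exactly what produces the (n−2)/(n−1) deflation of the (∑ v_i)² term relative to the independent (Rademacher) case.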
It remains to deal with integer vectors which are not almost-constant. Formally, let

V := { v ∈ (Z^n \ {0}) ∩ [−η^{−1}, η^{−1}]^n : L(v) < n − n . } .

In view of Proposition 5.10 and Lemma 5.12, and since η ≤ n − , the following proposition suffices to prove Proposition 5.8.

Proposition 5.13. Pr( ∃ v ∈ V : ‖Q_n v‖ ≤ C n . ) ≲ 2^{−√n/2} .

Our proof uses the two-step model from which the rows of Q_n are obtained, which we now discuss. For n ∈ N, let Q_n denote the set of all n × n matrices with entries in {0,1}, each of whose rows sums to n/2. As will soon be clear, we will find it more convenient to work with a 'two-step' model for generating a uniformly random element of Q_n. Let Σ_n denote the set of all permutations of [n], and consider the map

f : (Σ_n)^n × ( {0,1}^{n/2} )^n → Q_n ,

which takes ( (σ_1, …, σ_n), ξ_1, …, ξ_n ) to the matrix in Q_n whose i-th row is (q_{i1}, …, q_{in}), where

q_{ij} := ξ_i(k) if σ_i(2k−1) = j, and q_{ij} := 1 − ξ_i(k) if σ_i(2k) = j.

In other words, for each k ∈ [n/2], exactly one among the σ_i(2k−1)-st and the σ_i(2k)-th entries in the i-th row is equal to 1 (the other is equal to 0), and the value of ξ_i(k) determines which one of the two entries it is. It is straightforward to see that the pushforward of the uniform measure on (Σ_n)^n × ( {0,1}^{n/2} )^n under the map f is the uniform measure on Q_n. Hence, we have the following process for generating a uniformly random element of Q_n. First, choose an n-tuple of permutations σ = (σ_1, …, σ_n), where each coordinate is chosen independently and uniformly at random from Σ_n. We shall refer to σ as the base of the matrix Q_n. Second, for each i ∈ [n] and each k ∈ [n/2], choose exactly one among the σ_i(2k−1)-st entry and the σ_i(2k)-th entry of the i-th row of the matrix to be 1 (and the other to be 0), uniformly at random, independently for all such values of i and k. Let us note here that for each i ∈ [n], the set comprising the n/2 unordered pairs { σ_i(2k−1), σ_i(2k) }, for k ∈ [n/2], is a uniformly random perfect matching in the complete graph K_n on n vertices; we shall refer to this matching as the matching induced by σ_i. As in [7], we will need the notion of an 'expanding base', which is formalized in the following definition.

Definition 5.14.
We say that σ := (σ_1, …, σ_n) ∈ (Σ_n)^n belongs to E_n if it satisfies the following two properties:

(Q1) The union of any two perfect matchings induced by σ_i and σ_j (i ≠ j) has at most n . connected components.

(Q2) For any two subsets A, B ⊆ [n] such that n . ≤ |A|, |B| ≤ n/2, there are at most √n/2 indices i ∈ [n] such that the perfect matching induced by σ_i has fewer than |A||B|/(8n) edges between A and B.

It turns out that, with high probability, a uniformly random base is expanding.

Proposition 5.15 (Proposition 5.4 in [7]). Let σ be a uniformly random element of (Σ_n)^n. Then, Pr( σ ∉ E_n ) ≤ 2^{−√n/2} .

Denote by Q_σ the random matrix chosen uniformly among all the matrices in Q_n with base σ, and by τ ∈ (Σ_n)^n a vector of i.i.d. uniformly random permutations. Then, by the law of total probability, we have

Pr_{Q_n}( ∃ v ∈ V : ‖Q_n v‖ ≤ C n . ) = Pr_{Q_τ}( ∃ v ∈ V : ‖Q_τ v‖ ≤ C n . )
≤ Pr( { ∃ v ∈ V : ‖Q_τ v‖ ≤ C n . } ∩ { τ ∈ E_n } ) + Pr( τ ∉ E_n )
≤ sup_{σ ∈ E_n} Pr( ∃ v ∈ V : ‖Q_σ v‖ ≤ C n . ) + 2^{−√n/2} ,

where the last inequality is due to Proposition 5.15. Thus, in order to prove Proposition 5.13, it suffices to prove the following.

Proposition 5.16.
For any σ ∈ E_n, Pr( ∃ v ∈ V : ‖Q_σ v‖ ≤ C n . ) ≲ n − . n .

For the remainder of this subsection, fix σ ∈ E_n. Moreover, fix k = n . and s₁ = s₂ = n . . For v ∈ V and i ∈ [n], we define v^{σ_i} to be the n/2-dimensional integer vector whose k-th coordinate is ( v_{σ_i(2k−1)} − v_{σ_i(2k)} ). This definition is motivated by the following.

Lemma 5.17. sup_{z ∈ Z} Pr( (Q_σ v)_i = z ) ≤ ρ( v^{σ_i} ) .

Proof.
By unwrapping the definitions, we see that

sup_{z ∈ Z} Pr( (Q_σ v)_i = z )
= sup_{z ∈ Z} Pr( ∑_{k=1}^{n/2} [ ξ_i(k) v_{σ_i(2k−1)} + (1 − ξ_i(k)) v_{σ_i(2k)} ] = z )
≤ sup_{z′ ∈ Z/2} Pr( ∑_{k=1}^{n/2} (2ξ_i(k) − 1) ( v_{σ_i(2k−1)} − v_{σ_i(2k)} ) = z′ )
= sup_{z ∈ Z} Pr( ∑_{k=1}^{n/2} ε_k ( v^{σ_i} )_k = z ) = ρ( v^{σ_i} ) ,

since ξ_i(k) v_{σ_i(2k−1)} + (1 − ξ_i(k)) v_{σ_i(2k)} = ( v_{σ_i(2k−1)} + v_{σ_i(2k)} )/2 + (2ξ_i(k) − 1)( v_{σ_i(2k−1)} − v_{σ_i(2k)} )/2, and the random variables ε_k := 2ξ_i(k) − 1 are i.i.d. Rademacher.

For the purposes of anti-concentration, we would like (as a start) for the vectors v^{σ_i} to have sufficiently large support. Accordingly, let

T_v := { i ∈ [n] : |supp( v^{σ_i} )| ≥ n . /16 } .

Lemma 5.18.
For every v ∈ V, |T_v| ≥ n − √n/2 .

Proof. Note first that for any v ∈ V, the assumption that L(v) < n − n . implies that there exist disjoint sets A_v, B_v ⊆ [n] such that |A_v| = n . , |B_v| = n/2, and v_i ≠ v_j for all i ∈ A_v, j ∈ B_v. Then, property (Q2) from the definition of E_n implies that for all but at most √n/2 indices i ∈ [n], the perfect matching induced by σ_i has at least |A_v||B_v|/(8n) = n . /16 edges with one endpoint in each of A_v and B_v. It is easy to see that each such index belongs to T_v.

As in the previous section, note that for our choice of parameters, the natural map ι : V → F_p^n is injective. We will abuse notation, and use v to denote ι(v). This identification allows us to make the next two key definitions, which will enable us to prove effective analogs of Lemma 4.7 and Lemma 4.8 in our setting.

Definition 5.19 (Witnessing pair). For any v ∈ V, we say that the pair (i_1, i_2) ∈ T_v × T_v , i_1 ≠ i_2 , witnesses v if

min_{b ⊆ v^{σ_{i_1}}, |supp(b)| ≥ s₂} R*_k(b)/|b|^{2k} ≥ min_{b ⊆ v^{σ_{i_2}}, |supp(b)| ≥ s₂} R*_k(b)/|b|^{2k} ≥ max_{i ∈ T_v \ {i_1, i_2}} min_{b ⊆ v^{σ_i}, |supp(b)| ≥ s₂} R*_k(b)/|b|^{2k} .

For a vector v ∈ V, we will denote its witnessing pair (taking the lexicographically first one, in case there are multiple) by (i_1(v), i_2(v)). This gives a partition of V into at most n² parts.

Definition 5.20.
For an integer t ∈ [p], let

V_t := { v ∈ V : ι( v^{σ_{i_1(v)}} ) ∈ B^{s₁}_{k,s₂,≥t−1}(n/2) \ B^{s₁}_{k,s₂,≥t}(n/2) } .
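For concreteness, the two-step sampling of Q_σ that underlies these definitions — a uniform 'base' permutation pairing the columns into a perfect matching, then an independent fair bit per pair deciding which endpoint of the pair receives the 1 — can be sketched as follows (a minimal illustration; the function and variable names are ours, not the paper's):

```python
import random

def sample_Qn_row(n: int, rng: random.Random):
    """One row of Q_n via the two-step model: a permutation pairs up the
    columns into a perfect matching, and a coin flip per pair decides
    which endpoint receives the 1 (the other gets 0)."""
    assert n % 2 == 0
    sigma = list(range(n))
    rng.shuffle(sigma)                      # the 'base' permutation sigma_i
    row = [0] * n
    for k in range(n // 2):
        a, b = sigma[2 * k], sigma[2 * k + 1]  # the pair {sigma_i(2k-1), sigma_i(2k)}
        if rng.random() < 0.5:              # the bit xi_i(k)
            row[a] = 1
        else:
            row[b] = 1
    return row

rng = random.Random(1)
Q = [sample_Qn_row(10, rng) for _ in range(10)]
print([sum(r) for r in Q])  # every row sums to n/2 = 5 by construction
```

Each pair contributes exactly one 1 to its row, so the row-sum constraint holds deterministically; the randomness of the base permutation is what produces the random matchings to which properties (Q1) and (Q2) refer.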
Lemma 5.21.
For our choice of parameters and for any integer t ∈ [p], |V_t| ≤ (500)^n (p/t)^n .

Proof. It is enough to show that the number of vectors v ∈ V_t that are witnessed by a given pair (i_1, i_2) of distinct indices in T_v is at most (400)^n (p/t)^n, and then take the union bound over all such pairs of witnessing indices. Let us now fix such a pair for the remainder of the proof. It follows from the definition of a witnessing pair that both ι( v^{σ_{i_1}} ) and ι( v^{σ_{i_2}} ) belong to B^{s₁}_{k,s₂,≥t−1}(n/2). Hence, Theorem 3.9 shows that each of the vectors ι( v^{σ_{i_1}} ) and ι( v^{σ_{i_2}} ) belongs to a set of size at most

(300)^{n/2} ( p/t )^{n/2} ,

and the injectivity of ι gives the same conclusion for v^{σ_{i_1}} and v^{σ_{i_2}}. Next, we bound the number of vectors v ∈ V with a given value of ( v^{σ_{i_1}}, v^{σ_{i_2}} ). Note that all such vectors v have the same differences between all those pairs of coordinates that are connected by an edge of the union of the matchings induced by σ_{i_1} and σ_{i_2}. In particular, each vector v is uniquely determined once we fix the value of a single coordinate in each connected component of this graph. Since property (Q1) from the definition of E_n implies that the number of connected components does not exceed n . , we may conclude that

|V_t| ≤ p^{n . } · ( (300)^{n/2} (p/t)^{n/2} )² ≤ (400)^n ( p/t )^n .

Lemma 5.22.
There exists an absolute constant C_{5.22} > 0 such that, for our choice of parameters and for any integer t ∈ [p], if v ∈ V_t , then:

• For any i ∈ T_v \ { i_1(v) }, sup_{z ∈ Z} Pr( (Q_σ v)_i = z ) ≤ (C_{5.22}/p) ( t/n . + 1 ) .

• For any z ∈ Z^n, Pr( Q_σ v = z ) ≤ ( (C_{5.22}/p) ( t/n . + 1 ) )^{n − √n} .

Proof. It follows from the definition of a witnessing pair that, for each v ∈ V_t and for every i ∈ T_v \ { i_1(v) }, we have

min_{b ⊆ v^{σ_i}, |supp(b)| ≥ s₂} R*_k(b)/|b|^{2k} ≤ min_{b ⊆ v^{σ_{i_1}(v)}, |supp(b)| ≥ s₂} R*_k(b)/|b|^{2k} < t · 40^k / p .

In particular, for all i ∈ T_v \ { i_1(v) }, v^{σ_i} ∉ B^{s₁}_{k,s₂,≥(t+1)}(n/2). Therefore, by essentially the same computation as in Lemma 4.7, we have for all i ∈ T_v \ { i_1(v) } that

ρ( v^{σ_i} ) ≤ (C_{5.22}/p) ( t/n . + 1 ) ,

so that the first bullet point follows from Lemma 5.17. The second bullet point follows immediately, since for any z ∈ Z^n,

Pr( Q_σ v = z ) ≤ Pr( (Q_σ v)_i = z_i ∀ i ∈ T_v \ { i_1(v) } ) ≤ ( (C_{5.22}/p) ( t/n . + 1 ) )^{|T_v| − 1} ≤ ( (C_{5.22}/p) ( t/n . + 1 ) )^{n − √n} ,

where the final inequality follows from Lemma 5.18.

Finally, we are in a position to prove Proposition 5.16. As discussed earlier, this completes the proof of Proposition 5.8, and hence, the proof of Theorem 1.3.

Proof of Proposition 5.16.
We begin by noting that every v ∈ V satisfies

sup_{z ∈ Z} Pr( (Q_σ v)_i = z ) ≥ η/(3n)

by the same pigeonhole argument as in the proof of Proposition 4.5. Since η/(3n) ≫ 1/√p, it follows from Lemma 5.22 that V_t = ∅ for all t ≤ √p. On the other hand, using Equation (6) with Γ = V_t and C(n) = 10 C n . , it follows from Lemma 5.21 and Lemma 5.22 that for all t ≥ √p, the probability that the image of some vector in V_t under Q_σ lies in the ball of radius C n . centered at the origin is at most

(1000 C n . )^n |V_t| ( C_{5.22} t/(p n . ) )^{n − √n}
≤ (1000 C n . )^n (500)^n ( p/t )^n ( C_{5.22} t/(p n . ) )^{n − √n}
≤ (500000 C n . )^n ( p n . /t )^{√n} ( C_{5.22}/n . )^n
≪ n − . n .

Finally, taking the union bound over integers t ∈ [√p, p] completes the proof.

References

[1] Z. D. Bai, J. W. Silverstein, and Y. Q. Yin. A note on the largest eigenvalue of a large dimensional sample covariance matrix. Journal of Multivariate Analysis, 26(2):166–168, 1988.

[2] N. A. Cook. The circular law for random regular digraphs. arXiv preprint arXiv:1703.05839, 2017.

[3] N. A. Cook. On the singularity of adjacency matrices for random regular digraphs.
Probability Theory and Related Fields, 167(1-2):143–200, 2017.

[4] K. P. Costello, T. Tao, and V. H. Vu. Random symmetric matrices are almost surely nonsingular. Duke Mathematical Journal, 135(2):395–413, 2006.

[5] A. Edelman. Eigenvalues and condition numbers of random matrices. SIAM Journal on Matrix Analysis and Applications, 9(4):543–560, 1988.

[6] C. Esseen. On the Kolmogorov–Rogozin inequality for the concentration function. Probability Theory and Related Fields, 5(3):210–216, 1966.

[7] A. Ferber, V. Jain, K. Luh, and W. Samotij. On the counting problem in inverse Littlewood–Offord theory. arXiv preprint arXiv:1904.10425, 2019.

[8] J. Huang. Invertibility of adjacency matrices for random d-regular graphs. arXiv preprint arXiv:1807.06465, 2018.

[9] V. Jain. Approximate Spielman–Teng theorems for random matrices with heavy tailed entries: a combinatorial view. In preparation, 2019.

[10] V. Jain. Smoothed analysis of the condition number without inverse Littlewood–Offord theory. In preparation, 2019.

[11] J. Kahn, J. Komlós, and E. Szemerédi. On the probability that a random ±1-matrix is singular. Journal of the American Mathematical Society, 8(1):223–240, 1995.

[12] J. Komlós. On the determinant of (0, 1) matrices. Studia Scientiarum Mathematicarum Hungarica, 2:7–21, 1967.

[13] M. Kwan, B. Sudakov, and T. Tran. Anticoncentration for subgraph statistics. Journal of the London Mathematical Society, 2018.

[14] B. Landon, P. Sosoe, and H.-T. Yau. Fixed energy universality of Dyson Brownian motion. Advances in Mathematics, 346:1137–1332, 2019.

[15] R. Latała. Some estimates of norms of random matrices. Proceedings of the American Mathematical Society, 133(5):1273–1282, 2005.

[16] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann, and P. Youssef. Adjacency matrices of random digraphs: singularity and anti-concentration. Journal of Mathematical Analysis and Applications, 445(2):1447–1491, 2017.

[17] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann, and P. Youssef. The smallest singular value of a shifted d-regular random square matrix. Probability Theory and Related Fields, pages 1–47, 2017.

[18] A. E. Litvak, A. Pajor, M. Rudelson, and N. Tomczak-Jaegermann. Smallest singular value of random matrices and geometry of random polytopes. Advances in Mathematics, 195(2):491–523, 2005.

[19] A. Mészáros. The distribution of sandpile groups of random regular graphs. arXiv preprint arXiv:1806.03736, 2018.

[20] H. H. Nguyen. On the singularity of random combinatorial matrices. SIAM Journal on Discrete Mathematics, 27(1):447–458, 2013.

[21] H. H. Nguyen and V. Vu. Circular law for random discrete matrices of given row sum. arXiv preprint arXiv:1203.5941, 2012.

[22] H. H. Nguyen and V. H. Vu. Small ball probability, inverse theorems, and applications. In Erdős Centennial, pages 409–463. Springer, 2013.

[23] H. H. Nguyen and M. M. Wood. Nonsingularity of adjacency matrices of random r-regular graphs. arXiv preprint arXiv:1806.10068, 2018.

[24] R. O'Donnell. Analysis of Boolean functions. Cambridge University Press, 2014.

[25] E. Rebrova and K. Tikhomirov. Coverings of random ellipsoids, and invertibility of matrices with iid heavy-tailed entries. Israel Journal of Mathematics, 227(2):507–544, 2018.

[26] M. Rudelson. Invertibility of random matrices: norm of the inverse. Annals of Mathematics, pages 575–600, 2008.

[27] M. Rudelson. Lecture notes on non-asymptotic theory of random matrices. 2013.

[28] M. Rudelson and R. Vershynin. The Littlewood–Offord problem and invertibility of random matrices. Advances in Mathematics, 218(2):600–633, 2008.

[29] M. Rudelson and R. Vershynin. Non-asymptotic theory of random matrices: extreme singular values. In Proceedings of the International Congress of Mathematicians 2010 (ICM 2010), pages 1576–1602. World Scientific, 2010.

[30] D. A. Spielman and S.-H. Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM (JACM), 51(3):385–463, 2004.

[31] T. Tao. Topics in random matrix theory, volume 132. American Mathematical Society, 2012.

[32] T. Tao and V. Vu. Random matrices: the circular law. Communications in Contemporary Mathematics, 10(02):261–307, 2008.

[33] T. Tao and V. H. Vu. Additive combinatorics, volume 105. Cambridge University Press, 2006.

[34] T. Tao and V. H. Vu. On the singularity probability of random Bernoulli matrices. Journal of the American Mathematical Society, 20(3):603–628, 2007.

[35] T. Tao and V. H. Vu. Inverse Littlewood–Offord theorems and the condition number of random discrete matrices. Annals of Mathematics, pages 595–632, 2009.

[36] K. Tikhomirov. Singularity of random Bernoulli matrices. arXiv preprint arXiv:1812.09016, 2018.

[37] R. Vershynin. Introduction to the non-asymptotic analysis of random matrices. arXiv preprint arXiv:1011.3027, 2010.

[38] R. Vershynin. Invertibility of symmetric random matrices. Random Structures & Algorithms, 44(2):135–182, 2014.

[39] Y.-Q. Yin, Z.-D. Bai, and P. R. Krishnaiah. On the limit of the largest eigenvalue of the large dimensional sample covariance matrix.