Codes over integers, and the singularity of random matrices with large entries
arXiv [cs.CC]
Sankeerth Rao Karingula∗   Shachar Lovett†
Department of Computer Science, University of California, San Diego
October 26, 2020
Abstract
The prototypical construction of error correcting codes is based on linear codes over finite fields. In this work, we make first steps in the study of codes defined over integers. We focus on Maximum Distance Separable (MDS) codes, and show that MDS codes with linear rate and distance can be realized over the integers with a constant alphabet size. This is in contrast to the situation over finite fields, where a linear size finite field is needed.

The core of this paper is a new result on the singularity probability of random matrices. We show that for a random n × n integer matrix with entries chosen uniformly from the range {−m, . . . , m}, the probability that it is singular is at most m^{−cn} for some absolute constant c > 0.

1 Introduction

Coding theory is the study of error correction codes. Codes are widely used in many applications, such as data compression, cryptography, error detection and correction, data transmission and data storage. Algorithms needed to implement codes perform arithmetic operations over an underlying alphabet, and hence their computational complexity is constrained by this alphabet size. Thus, understanding the alphabet size needed to support a given code structure is a central question in coding theory. By far, the most common approach to design codes is to use linear codes over finite fields. The main focus of this paper is to investigate the possibility of designing codes over integers. In particular, we study the alphabet size needed to support basic code structures, and focus on the most basic and well-studied family of codes: Maximum Distance Separable (MDS) codes.
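The singularity claim in the abstract can be probed empirically. The sketch below is our own illustration, not code from the paper: it tests singularity exactly over the rationals and estimates the singular fraction among random matrices with entries uniform in {−m, . . . , m}.

```python
import random
from fractions import Fraction

def is_singular(rows):
    """Exact singularity test via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    rank = 0
    for col in range(n):
        piv = next((r for r in range(rank, n) if a[r][col] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        a[rank], a[piv] = a[piv], a[rank]
        for r in range(rank + 1, n):
            f = a[r][col] / a[rank][col]
            for c in range(col, n):
                a[r][c] -= f * a[rank][c]
        rank += 1
    return rank < n

def singular_rate(n, m, trials, seed=0):
    """Empirical fraction of singular matrices among random n x n matrices
    with entries uniform in {-m, ..., m}."""
    rng = random.Random(seed)
    hits = sum(
        is_singular([[rng.randint(-m, m) for _ in range(n)] for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials
```

Even for small n, the empirical rate drops quickly as m grows, consistent with (though of course not proving) the m^{−cn} bound.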
An MDS code is a code with the best possible tradeoff between the message length, codeword length and minimal distance. Concretely, an (n, k, d)-code is a code with message length k, codeword length n and minimal distance d. The Singleton bound [28] gives that d ≤ n − k + 1. MDS codes are codes achieving this bound, namely (n, k, d)-codes with d = n − k + 1. If we consider linear codes, then it is well known that MDS codes are generated by the row span of MDS matrices.

∗ email: [email protected]. Research supported by NSF CCF award 1614023.
† email: [email protected]. Research supported by NSF CCF award 1614023.

Definition 1.1 (MDS matrix). Let n ≥ k. A k × n matrix is called an MDS matrix if any k columns in it are linearly independent. Equivalently, if any k × k minor of it is nonsingular.

Note that MDS matrices can be defined over finite fields or over the integers. If we define them over a finite field F_q, then it is well known that a linear field size is needed to support MDS matrices. Concretely, if we assume n ≥ k + 2, then it is known that q ≥ max(k, n − k + 1) (see for example the introduction of [2] for a proof). In particular, this implies that q ≥ n/2. Reed–Solomon codes can be constructed over fields of size q ≥ n − 1, which is tight up to a factor of two. The MDS conjecture of Segre [27] speculates that this is indeed the best possible (except for a few special cases), and Ball [2] proved this over prime finite fields. In summary, over finite fields a linear field size q = Θ(n) is both necessary and sufficient.

We show that over the integers, MDS matrices exist over much smaller alphabet sizes.

Theorem 1.2 (MDS matrices over integers). Let n ≥ k. There exist k × n MDS matrices over the integers whose entries are in {−m, . . . , m}, where m ≤ (cn/k)^c for some absolute constant c > 0.

The typical regime in coding theory is that of linear rate and linear distance; namely, where k = αn for some constant α ∈ (0, 1). In this regime, Theorem 1.2 gives MDS matrices with a constant alphabet size, which is in stark contrast with the case over finite fields. It is easy to see that Theorem 1.2 is best possible, up to the unspecified constant c.

Claim 1.3.
Let n ≥ k ≥ 2. Let M be a k × n MDS matrix whose entries are in an alphabet Σ. Then |Σ| ≥ √(n/k).

Proof. Let P_i = (M_{1,i}, M_{2,i}) ∈ Σ² denote the first two elements in the i-th column of M. If n > |Σ|² · k, then there must be k distinct columns i₁, . . . , i_k ∈ [n] such that P_{i₁} = . . . = P_{i_k}. But then M cannot be an MDS matrix, as the k × k minor formed by taking these columns has its first two rows being scalar multiples of each other, and hence cannot be nonsingular.

We prove Theorem 1.2 by choosing the matrix M randomly, and showing that with high probability it will be an MDS matrix. This is another aspect in which codes over integers seem to be different from codes over finite fields. Constructing MDS matrices over finite fields seems to require algebraic constructions (such as Reed–Solomon codes), unless the field size is exponential in n; whereas over the integers, random matrices work well even for very small entries.

Our main result is a bound on the singularity probability of random n × n matrices with uniform integer entries in {−m, . . . , m}. Note that the probability that such a matrix is singular is at least (2m + 1)^{−n}, which is the probability that its first two rows are the same. We show that this bound is tight, up to polynomial factors.

Theorem 1.4 (Singularity of random matrices). Let n, m ≥ 2. Let M be an n × n random integer matrix with entries chosen uniformly in {−m, . . . , m}. Then for some absolute constant c > 0,

Pr[M is singular] ≤ m^{−cn}.

Theorem 1.2 follows directly from Theorem 1.4.

Proof of Theorem 1.2.
Let M be a random k × n matrix with entries chosen uniformly from {−m, . . . , m}. The number of k × k minors of M is C(n, k), and the probability that each one is singular is at most m^{−ck} by Theorem 1.4. Thus

Pr[M is not MDS] ≤ C(n, k) · m^{−ck} ≤ (en/k)^k · m^{−ck} = (en/(k m^c))^k.

In particular, this probability is at most 2^{−k} (say) whenever m ≥ (2en/k)^{1/c}.

Random matrices over integers vs over finite fields.
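The union-bound argument above suggests a direct experiment. The following sketch (our own code, with our own naming) draws a random k × n integer matrix and verifies the MDS property of Definition 1.1 by checking every k × k minor exactly.

```python
from fractions import Fraction
from itertools import combinations
import random

def det(rows):
    """Exact determinant over the rationals, by Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    d = Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if a[r][col] != 0), None)
        if piv is None:
            return Fraction(0)  # no pivot: singular
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            d = -d
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d

def is_mds(matrix):
    """A k x n matrix is MDS iff every k x k minor is nonsingular."""
    k, n = len(matrix), len(matrix[0])
    return all(
        det([[row[c] for c in cols] for row in matrix]) != 0
        for cols in combinations(range(n), k)
    )

# Example draw: a random 3 x 8 matrix with entries in {-30, ..., 30};
# by the argument above this is MDS with high probability.
rng = random.Random(7)
M = [[rng.randint(-30, 30) for _ in range(8)] for _ in range(3)]
```

A 2 × n matrix with columns (1, a_i) for distinct a_i is always MDS, since each 2 × 2 minor has determinant a_j − a_i ≠ 0.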
Note that if instead we chose M to be a random n × n matrix over a finite field F_q, then the probability that M is singular would be about 1/q, independent of how large n is. This is the main point of difference between random matrices over integers and over finite fields: the singularity probability over integers decreases as the matrix becomes larger, whereas over finite fields it stabilizes.

We study in this paper the probability that a random n × n matrix with uniform entries in {−m, . . . , m} is singular. We discuss below previous results in random matrix theory, which mostly focused on the regime of large n and fixed entry distribution. We also discuss and compare the two main techniques used to prove these results.

Previous results.
Most previous works in random matrix theory focused on random matrices whose entries are sampled independently from distributions with bounded tail behaviour. The most studied case is that of random sign matrices, namely with uniform {−1, 1} entries. Komlós [15] proved that the probability that a random n × n sign matrix is singular is o(1) as n → ∞, which already is a nontrivial result. It took nearly 30 years until Kahn, Komlós and Szemerédi [12] improved the bound to c^n for some constant c ∈ (0, 1). The constant c was improved in a series of follow-up works, and recently Tikhomirov [31] proved that one may take c = 1/2 + o(1), which is best possible, as the probability that the first two rows of the matrix are equal is 2^{−n}. For more general distributions, Rudelson and Vershynin [22, 23] proved that if the entries of an n × n matrix are sampled from a sub-Gaussian distribution, then the probability that it is singular is at most c^n for some c ∈ (0, 1).

The complementary regime, of large m and constant n, was less explored. The only work we are aware of is by Katznelson [14], which gave a bound of the form c_n m^{−n} for some constant c_n depending on n. While having optimal dependence on m for constant n, it has a caveat: it only applies in the regime where m is much larger than n (more precisely, for every fixed n, in the limit of large m).

A recent work that did address the regime of both large n and large m, but for a different entry distribution, is that of Vempala, Wang and Woodruff [32]. Fix µ ∈ (0, 1) and let D_µ be a distribution over {−1, 0, 1} with D_µ(0) = 1 − µ and D_µ(1) = D_µ(−1) = µ/2. Let M be a random n × n matrix, whose entries are the sum of m independent copies of D_µ. They show that the probability that M is singular is at most m^{−cn} for some constant c = c(µ) > 0. We note that this result is sufficient to prove Theorem 1.2. However, we view the entry distribution in Theorem 1.4 (uniform in {−m, . . . , m}) as more natural for coding applications. In fact, we conjecture that any entry distribution that doesn't give too much probability to any specific element would do; see Conjecture 1.6 for the details.

Techniques. We prove Theorem 1.4 following the approach of Rudelson and Vershynin [22, 23], in particular following the lecture notes of Rudelson [21], modified appropriately to handle the case of large m. On the other hand, Vempala et al. [32] follow the approach of Kahn et al. [12] and Tao and Vu [30]. We briefly discuss the difference between the two approaches below.

At a high level, both approaches aim to study "approximate periodicity" of random vectors. However, they take different routes. The first approach is more direct, using the notion of Lowest Common Denominator (LCD) to define and study approximate periodicity. The second approach is indirect, using Fourier analysis. Fourier analytic techniques seem useful when the underlying entry distribution has well-behaved Fourier tails; for example, this is the case for the distribution considered in [32], where the entries are a sum of m independent copies of a distribution over {−1, 0, 1}. However, the distribution we consider in this paper, uniform in {−m, . . . , m}, has less well-behaved Fourier tails, and Fourier analytic techniques seem less suited to analyze it.

We view Theorem 1.2 as a first step towards the study of codes over integers. There are many intriguing questions that arise in coding theory, once we can show that random integer matrices are MDS with high probability. There are also interesting conjectures on the singularity probability of matrices with entries sampled from general distributions. We discuss both briefly below.
Explicit constructions.
A natural question is to give an explicit construction of MDS matrices over integers with small integer values. Concretely, when k = αn for some constant α ∈ (0, 1), construct an explicit k × n MDS matrix with a constant alphabet size (namely, independent of n).

Algorithms.
The next natural question, once there are explicit constructions, is to design efficient decoding algorithms for such codes. In particular, it would be intriguing to see if the smaller alphabet size can be utilized to obtain improved runtime (even by logarithmic factors).
General code designs.
In this paper, we focus on MDS codes and the alphabet size needed to realize them over integers. Many other code designs have been studied, many of which have the following common form. Let P be a pattern matrix whose entries are {0, ∗}. A matrix M (over a finite field, or over the integers) of the same dimensions as P is said to realize P if it satisfies the following two conditions:

(i) If P_{i,j} = 0 then M_{i,j} = 0.

(ii) For any maximal minor in P, if it can be realized by some nonsingular matrix, then the corresponding minor in M is nonsingular.

Questions of this form, for various patterns P, have been studied in coding theory. For example, MDS matrices correspond to patterns P which are all ∗. In some applications, condition (ii) is replaced with the following stronger condition (in which case we say that M strongly realizes P):

(ii)' For any (maximal or not) minor in P, if it can be realized by some nonsingular matrix, then the corresponding minor in M is nonsingular.

Some areas where these questions arise are: MDS codes with sparse generating matrices (also known as GM-MDS) [7, 10, 11, 17, 33]; tree codes, used in coding for interactive communication [4, 6, 19, 24, 25]; and maximally recoverable codes, used in coding for distributed storage (there are too many results to list here; we refer to [1] for a recent survey).

Given a pattern P, it is not known when it can be realized (or strongly realized) over small finite fields. Some works show that an exponential field size is needed in some cases [9, 13], whereas other works show that in other cases a polynomial field size is sufficient, using an algebraic construction [17, 33]. However, a general understanding is currently lacking. In contrast, we speculate that every pattern (except maybe some pathological cases) can be realized over integers with small entries.

To pose a concrete conjecture, let P_n be the n × n pattern with ∗ on and below the diagonal, and 0 above the diagonal. Such patterns underlie optimal tree codes.
For example, for n = 4:

P_4 =
∗ 0 0 0
∗ ∗ 0 0
∗ ∗ ∗ 0
∗ ∗ ∗ ∗

The best known construction (see [6]) of a matrix M realizing P_n is the binomial coefficients matrix, namely M_{i,j} = C(i, j), whose entries are integers of magnitude about 2^n. We conjecture that this cannot be improved much over finite fields, but can be reduced to poly(n) over the integers.

Conjecture 1.5.
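The binomial construction is easy to reproduce. The sketch below (our own code, with 0-based indices) builds M_{i,j} = C(i, j), checks that it matches the pattern P_n, and confirms the growth of its largest entry.

```python
from math import comb

def pascal_matrix(n):
    """M[i][j] = C(i, j) for 0 <= i, j < n: zero above the diagonal,
    positive on and below it, so it matches the pattern P_n."""
    return [[comb(i, j) for j in range(n)] for i in range(n)]

def matches_pattern(M):
    """Check the 0/* pattern of P_n: zeros exactly strictly above the diagonal."""
    n = len(M)
    return all((M[i][j] == 0) == (j > i) for i in range(n) for j in range(n))
```

Since the diagonal is all ones, the matrix is unitriangular and in particular nonsingular; its maximal entry C(n−1, ⌊(n−1)/2⌋) is of order 2^n/√n, matching the "about 2^n" magnitude mentioned above.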
The following holds for the pattern P_n:

1. Any matrix M strongly realizing P_n over a finite field F_q requires exponential field size, namely q ≥ exp(Ω(n)).

2. There exist matrices M strongly realizing P_n over the integers, with the nonzero entries in {−m, . . . , m} for m = poly(n). In fact, random matrices of this form should work with high probability.

Singularity of matrices over general distributions.
As we discussed above, most works on the singularity of random matrices give a bound on the singularity probability of c^n for some absolute constant c ∈ (0, 1). Theorem 1.4 shows that for entries uniform in {−m, . . . , m}, we can take c = 1/poly(m). We speculate that this is an instance of a much more general phenomenon: the singularity probability is determined by the anti-concentration of the underlying entries distribution. Given a distribution D over R, define its max-probability as ‖D‖_∞ = max_x D(x). For example, if D is the uniform distribution over {−m, . . . , m}, then ‖D‖_∞ = 1/(2m + 1).

Conjecture 1.6.
Let D be a distribution over R and set p = ‖D‖_∞. Let M be a random n × n matrix with independent entries from D. Then for some absolute constant c > 0,

Pr[M is singular] ≤ p^{cn}.

One can even speculate a more general conjecture, where each entry comes from a different underlying distribution, as long as they all have bounded max-probability.

Paper outline.
We prove Theorem 1.4 in the remainder of the paper. We start with a high-level overview of our framework in Section 2. We compute some preliminary estimates in Section 3, define and study incompressible vectors in Section 4, and define the LCD condition in Section 5, where we also prove some properties of it. We bound the LCD of random vectors in Section 6. We put all the ingredients together and complete the proof in Section 7.
2 Proof overview

We will follow the general approach of Rudelson [21], with several modifications needed to obtain effective bounds for large m.

Notation.
It will be convenient to scale the entries to be in [−1, 1]. Denote by D the uniform distribution over {a/m : a ∈ {−m, . . . , m}}. We denote by D^n the distribution over n-dimensional vectors with independent entries from D, and by D^{n×n} the distribution over n × n matrices with independent entries from D. We denote by S^{n−1} the unit sphere in R^n, namely S^{n−1} = {x ∈ R^n : ‖x‖ = 1}. We will use c, c′, c₁, etc., to denote unspecified positive constants. Note that the same letter (e.g. c) can mean different unspecified constants in different lemmas.

We may assume that n, m are large enough.
We will assume throughout the proof that n, m are large enough; concretely, for any absolute constants n₀, m₀, we may assume that n ≥ n₀ and m ≥ m₀, and this only affects the value of the constant c in Theorem 1.4.

To see why, consider first the regime of constant m and large n. The distribution D is symmetric and bounded in [−1, 1], and in particular sub-Gaussian. Thus the result of Rudelson and Vershynin [22, 23] gives Pr[M is singular] ≤ c^n for some absolute constant c ∈ (0, 1), which implies the bound of Theorem 1.4 for constant m.

The other regime is that of constant n and large m. While we may appeal to the result of Katznelson [14] in this regime, which gives a sharp bound of c_n m^{−n}, there is a much simpler argument that gives a bound of the order of 1/m, which is good enough to establish Theorem 1.4 in this regime. As the determinant of an n × n matrix is a polynomial of degree n, the Schwartz–Zippel lemma [26, 34] gives

Pr[M is singular] = Pr[det(M) = 0] ≤ n/(2m + 1).

In particular, for constant n and large m, this probability is of the order of 1/m, which is consistent with the claimed bound of Theorem 1.4 (taking c ≤ 1/n).

General approach.
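For n = 2, the Schwartz–Zippel bound above can be checked by exhaustive enumeration. This is a self-contained illustration of ours, not code from the paper:

```python
from itertools import product

def singular_count_2x2(m):
    """Count 2x2 matrices with entries in {-m, ..., m} and zero determinant,
    together with the total number of such matrices."""
    vals = range(-m, m + 1)
    singular = sum(
        1 for a, b, c, d in product(vals, repeat=4) if a * d == b * c
    )
    return singular, (2 * m + 1) ** 4
```

The determinant is a degree-2 polynomial, so the lemma gives Pr[det = 0] ≤ 2/(2m + 1); for m = 1 the exact count is 33/81 ≈ 0.41, below the bound 2/3.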
Let M ∼ D^{n×n}, and let X₁, . . . , X_n denote its rows. If M is singular, then one of the rows belongs to the span of the other rows. By symmetry we have

Pr[M is singular] ≤ n · Pr[X_n ∈ Span(X₁, . . . , X_{n−1})].

Let X∗ be any unit vector orthogonal to X₁, . . . , X_{n−1} (if there are multiple ones, choose one in some deterministic way). We call it a random normal vector. We will shorthand X = X_n. Observe that X and X∗ are independent. Thus we can bound

Pr[M is singular] ≤ n · Pr[⟨X∗, X⟩ = 0].
To do so, we will show that unless X∗ belongs to a set of "bad" vectors, the above probability is at most m^{−cn}, and that the probability that X∗ is bad is also at most m^{−cn}.
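The random-normal-vector setup is easy to simulate in floating point. In the sketch below (our own code), numpy's SVD stands in for the deterministic choice of X∗: the last right singular vector of the (n − 1) × n matrix spans its kernel.

```python
import numpy as np

def random_normal_vector(rows):
    """A unit vector orthogonal to the span of the given (n-1) x n rows:
    the right singular vector for the smallest singular value."""
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1]

rng = np.random.default_rng(0)
n, m = 8, 10
M_top = rng.integers(-m, m + 1, size=(n - 1, n))  # rows X_1, ..., X_{n-1}
x_star = random_normal_vector(M_top)              # the vector X*
x_last = rng.integers(-m, m + 1, size=n)          # the independent row X
inner = x_star @ x_last                           # <X*, X>
```

If the full n × n matrix were singular (with its first n − 1 rows independent), this inner product would be exactly zero; for a random draw it is typically far from zero.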
Maximal eigenvalues of random matrices.
The first ingredient is bounding the spectralnorm of M . In fact, we would need this bound for rectangular matrices. Given an n × k matrix R we denote its spectral norm as k R k = max {k Rx k : x ∈ S k − } . Note that k R k = k R T k since k R k = max x ∈ S k − ,y ∈ S n − y T Rx .The following claim is a special case of [21, proposition 4.4], who showed that it holds for anysymmetric distribution D supported in [ − , Claim 3.1.
Let R ∼ D^{n×k} for n ≥ k. Then for any λ ≥ 1,

Pr[‖R‖ ≥ λ√n] ≤ 2^{−cλ²n}.

Anti-concentration of projections.
Next, we need anti-concentration results for projections of D^n. To begin with, we consider projections of the uniform distribution over the solid cube [−1, 1]^n.

Claim 3.2.
Let U ∼ [−1, 1]^n be uniformly distributed. Then for every x ∈ S^{n−1} and ε > 0,

Pr_U[|⟨U, x⟩| ≤ ε] ≤ cε.

Proof.
The uniform distribution U ∼ [−1, 1]^n is a log-concave distribution. Let S = ⟨U, x⟩ and note that S is a projection of U along the direction x. The Prékopa–Leindler inequality [16, 20] states that projections of log-concave distributions are log-concave, and so S is a log-concave distribution. Carbery and Wright [5, Theorem 8] show that the required anti-concentration bound holds for any log-concave distribution.

We extend this anti-concentration to the discrete case using a coupling argument. Here and throughout, we denote by log(·) the logarithm in base 2.

Claim 3.3.
Let X ∼ D^n and set ε₀ = √(log m)/m. Then for every x ∈ S^{n−1} and ε ≥ ε₀,

Pr[|⟨X, x⟩| ≤ ε] ≤ cε.

Proof.
We apply a coupling argument between the uniform distribution in [−1, 1]^n and D^n. Sample X ∼ D^n and Y ∼ [−1, 1]^n, and set Z = X + Y/(2m). Observe that Z is uniform in the solid cube [−1 − 1/(2m), 1 + 1/(2m)]^n. Next, fix ε > 0 and note that ⟨X, x⟩ = ⟨Z, x⟩ − ⟨Y, x⟩/(2m). Thus we can bound

Pr[|⟨X, x⟩| ≤ ε] ≤ Pr[|⟨Z, x⟩| ≤ 2ε] + Pr[|⟨Y, x⟩| ≥ 2εm].

For the first term, Claim 3.2 bounds its probability by c₁ε. For the second term, the Chernoff bound bounds its probability for ε ≥ ε₀ by 1/m². As we have 1/m² ≤ ε, the claim follows.

Tensorization lemma. We also need the following "tensorization lemma" [21, Lemma 6.5].
Claim 3.4.
Let Y₁, . . . , Y_n be independent real-valued random variables. Assume for some K, ε₀ > 0 that Pr[|Y_i| ≤ ε] ≤ Kε for all ε ≥ ε₀. Then

Pr[Σ_{i=1}^n Y_i² ≤ ε²n] ≤ (cKε)^n for all ε ≥ ε₀.

Nets.
A set of unit vectors N ⊂ S^{n−1} is called an ε-net, for ε > 0, if it satisfies:

∀x ∈ S^{n−1} ∃y ∈ N : ‖x − y‖ ≤ ε.

The following claim bounds the size of such a net. For a proof see for example [18, Lemma 2.6].
Claim 3.5.
For any ε > 0, there exists an ε-net N ⊂ S^{n−1} of size |N| ≤ (3/ε)^n.

Integer points in a ball.
We need a bound on the number of integer vectors in a ball of a given radius. Let B_n(r) = {x ∈ R^n : ‖x‖ ≤ r} denote the ball of radius r in R^n. The following bound is well known.

Claim 3.6.
The number of integer vectors in B_n(r) is at most (1 + cr/√n)^n.

4 Incompressible vectors

The first set of "bad" vectors that we want to rule out are vectors which are close to sparse. A vector u ∈ R^n is k-sparse if it has at most k nonzero coordinates.

Definition 4.1 (Compressible vectors). Let α, β ∈ (0, 1). A unit vector x ∈ S^{n−1} is called (α, β)-compressible if it can be expressed as x = u + v, where u is (αn)-sparse and ‖v‖ ≤ β. Otherwise, we say that x is (α, β)-incompressible.

We will later choose α, β, but we note here that α will be a small enough absolute constant and β a small polynomial in 1/m; concrete values that work are a small constant α and β = 1/√m. We will implicitly assume that both n, m are large enough; concretely, at various places we assume that αn is at least a large constant.

Lemma 4.2.
Let α ∈ (0, 1/8) and β ∈ (ε₀, 1/2), where ε₀ = √(log m)/m. Then

Pr[X∗ is (α, β)-compressible] ≤ (cβ)^{n/16}.

We need a bound on the smallest singular value of a rectangular matrix.
Claim 4.3.
Let R ∼ D^{n×k} for n ≥ k. Then for every x ∈ S^{k−1} and ε ≥ ε₀,

Pr[‖Rx‖ ≤ ε√n] ≤ (cε)^{n/2}.

Proof. Assume that ‖Rx‖ ≤ ε√n. This implies that |(Rx)_i| ≤ 2ε for at least n/2 of the indices i ∈ [n]. Note that for each fixed i, the value (Rx)_i is distributed as ⟨X, x⟩ for some X ∼ D^k. Applying Claim 3.3 and the union bound over the choice of the n/2 indices gives

Pr[‖Rx‖ ≤ ε√n] ≤ 2^n (c₁ε)^{n/2} = (cε)^{n/2}.

Claim 4.4.
Let R ∼ D^{n×k} for n ≥ 4k. Then for every ε ≥ ε₀,

Pr[min_{x ∈ S^{k−1}} ‖Rx‖ ≤ ε√n] ≤ (cε)^{n/8}.

Proof. We may assume that ε ≤ c₀ for a small enough absolute constant c₀ ∈ (0, 1), as otherwise the bound is trivial once c ≥ 1/c₀. Set λ = √(log(1/ε)) and note that λ ≥ 1. Let N be an (ε/λ)-net in S^{k−1} of size |N| ≤ (3λ/ε)^k, as given by Claim 3.5. Let E₁ denote the event that there exists y ∈ N for which ‖Ry‖ ≤ 2ε√n. Applying Claim 4.3 and a union bound gives

Pr[E₁] ≤ (3λ/ε)^k · (c₁ε)^{n/2} ≤ (c₂ε)^{n/8},

where we used the assumption n ≥ 4k and the bound ελ ≤ √ε. Let E₂ denote the event that ‖R‖ ≥ λ√n. Claim 3.1 shows that Pr[E₂] ≤ ε^{c₃n}. We next show that if E₁ and E₂ don't hold, then the event of the claim also doesn't hold, namely ‖Rx‖ > ε√n for all x ∈ S^{k−1}.

Let x ∈ S^{k−1} be arbitrary and let y ∈ N be such that ‖x − y‖ ≤ ε/λ. Then

‖Rx‖ ≥ ‖Ry‖ − ‖R‖ · ‖x − y‖ > 2ε√n − λ√n · (ε/λ) = ε√n.

We will now use these two claims to prove Lemma 4.2.

Proof of Lemma 4.2.
Let M′ be the (n − 1) × n matrix with rows X₁, . . . , X_{n−1}. Assume that there exists an (α, β)-compressible vector x ∈ S^{n−1} in the kernel of M′. By definition, x = u + v where u is (αn)-sparse and ‖v‖ ≤ β. In particular, M′(u + v) = 0 and hence ‖M′u‖ = ‖M′v‖. In addition, ‖u‖ ≥ ‖x‖ − ‖v‖ ≥ 1/2, as x is a unit vector and ‖v‖ ≤ β ≤ 1/2.

Let E₁ denote the event that ‖M′‖ ≥ λ√n for λ = c₁√(log(1/β)), where we choose c₁ large enough so that Claim 3.1 gives Pr[E₁] ≤ β^n. Note that as we assume β ≤ 1/2, we have λ ≥ c₁ ≥ 1. If E₁ doesn't hold, we have

‖M′u‖ = ‖M′v‖ ≤ ‖M′‖ · ‖v‖ ≤ λβ√n.

In particular, y = u/‖u‖ is an (αn)-sparse unit vector that satisfies ‖M′y‖ ≤ 2λβ√n. We next bound the probability that such a vector exists.

Let ε = 2λβ, and note that ε ≥ ε₀ since β ≥ ε₀ and λ ≥ 1. There are C(n, αn) options for the support of y. Let I = {i : y_i ≠ 0} denote a possible support, set k = |I| and let R be the (n − 1) × k matrix formed by the columns of M′ indexed by I. As α < 1/8, we have n − 1 ≥ 4k. Thus we can apply Claim 4.4 and obtain that

Pr[¬E₁ ∧ ∃y ∈ S^{k−1}, ‖Ry‖ ≤ ε√n] ≤ (c₂ε)^{n/8} = (c₃β√(log(1/β)))^{n/8}.

Note that for β ≤ 1/2 we have β√(log(1/β)) ≤ √β, hence this is at most (c₃√β)^{n/8}. To conclude, we union bound over the choices for I, the number of which is C(n, αn) ≤ 2^n. Thus we can bound the total probability by 2^n (c₃√β)^{n/8} = (c₄β)^{n/16}.

5 The LCD condition
In this section we will introduce the notion of the Lowest Common Denominator (LCD) of a vector, which is a variant of the LCD definition in [21]. Informally, the LCD of a vector is a robust notion of "almost periodicity"; concretely, it is the least multiple where most of its entries are close to integers.

Given x ∈ R^n, let x = [x] + {x} be its decomposition into integer and fractional parts, where [x] ∈ Z^n and {x} ∈ [−1/2, 1/2)^n.

Definition 5.1 (Least common denominator (LCD)). Let α, β ∈ (0, 1). Given a unit vector x ∈ S^{n−1}, its least common denominator (LCD), denoted LCD_{α,β}(x), is the infimum of D > 0 such that we can decompose {Dx} = u + v, where u is (αn)-sparse and ‖v‖ ≤ β min(D, √n).

Claim 5.2.
Assume x ∈ S^{n−1} is (5α, β)-incompressible. Then LCD_{α,β}(x) > √(αn).

Proof. Let D = LCD_{α,β}(x) and assume towards a contradiction that D ≤ √(αn). Let y = Dx. As ‖y‖² ≤ αn, there are at most 4αn coordinates i ∈ [n] where |y_i| ≥ 1/2. In all other coordinates {y_i} = y_i, and hence y − {y} is (4αn)-sparse. By assumption we can decompose {y} = u + v where u is (αn)-sparse and ‖v‖ ≤ βD. This implies that we can decompose y = u′ + v where u′ is (5αn)-sparse. Thus, we can decompose x = y/D as x = u″ + v″, where u″ = u′/D is (5αn)-sparse and v″ = v/D satisfies ‖v″‖ ≤ β. This violates the assumption that x is (5α, β)-incompressible.

Our main goal in this section is to prove the following lemma, which extends Claim 3.3 assuming x has large LCD. To get intuition, we note that the lemma below is useful as long as β ≪ γ ≪ 1, and we will later take γ = √β to be such a choice. In particular, if we set β = m^{−1/2} then we have γ = m^{−1/4}.

Lemma 5.3.
Let X ∼ D^n. Let α, β, γ ∈ (0, 1/2), let x ∈ S^{n−1} be (α, γ)-incompressible, and set D = LCD_{α,β}(x). Then for every ε ≥ 1/(2πmD), it holds that

Pr[|⟨X, x⟩| ≤ ε] ≤ c (ε/γ + 1/(αβm)^{αn}).

The proof of Lemma 5.3 relies on Esseen's lemma [8].
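Before continuing, the LCD of Definition 5.1 can be approximated by brute force for small vectors. The sketch below is our own code with a crude grid in D; it measures, for each candidate D, how far {Dx} is from the nearest (αn)-sparse vector, which is exactly the distance-to-sparse quantity underlying compressibility in Definition 4.1.

```python
from math import sqrt

def frac(t):
    """Fractional part mapped to [-1/2, 1/2)."""
    return (t + 0.5) % 1.0 - 0.5

def dist_to_sparse(v, s):
    """L2 distance from v to the nearest s-sparse vector: the norm of v
    after dropping its s largest-magnitude coordinates."""
    tail = sorted(abs(t) for t in v)[: max(len(v) - s, 0)]
    return sqrt(sum(t * t for t in tail))

def approx_lcd(x, alpha, beta, d_max, step=1e-3):
    """Least D on a grid with {D x} = u + v, u (alpha*n)-sparse and
    ||v|| <= beta * min(D, sqrt(n)), per Definition 5.1."""
    n = len(x)
    s = int(alpha * n)
    for k in range(1, int(d_max / step) + 1):
        d = k * step
        resid = dist_to_sparse([frac(d * xi) for xi in x], s)
        if resid <= beta * min(d, sqrt(n)):
            return d
    return None
```

For example, for x = (3/5, 4/5) the vector Dx is integral exactly at multiples of 5, and with a small β the search returns D ≈ 5.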
Lemma 5.4 (Esseen's Lemma). Let Y be a real-valued random variable, and let φ_Y(t) = E[e^{itY}] denote the characteristic function of Y. Then for any ε > 0, it holds that

Pr[|Y| ≤ ε] ≤ cε ∫_{−1/ε}^{1/ε} |φ_Y(t)| dt.

Before proving Lemma 5.3, we need some auxiliary claims. Fix some x ∈ S^{n−1}, let X ∼ D^n and let Y = ⟨X, x⟩. In order to apply Lemma 5.4, we need to compute the characteristic function of Y.

Claim 5.5.
Let X ∼ D^n, x ∈ S^{n−1} and set Y = ⟨X, x⟩. For t ∈ R it holds that

|φ_Y(t)| = ∏_{k=1}^n F(x_k t/(2πm)),

where F : R → R is defined as follows:

F(y) = |sin((2m + 1)πy) / ((2m + 1) sin(πy))|.

Proof.
We have Y = Σ x_k ξ_k, where ξ₁, . . . , ξ_n ∼ D are independent. Hence

φ_Y(t) = ∏_{k=1}^n E[e^{i x_k ξ_k t}].

Next we compute

E[e^{i x_k ξ_k t}] = (1/(2m + 1)) Σ_{ℓ=−m}^m e^{i x_k (ℓ/m) t} = (1/(2m + 1)) · sin((2m + 1) x_k t/(2m)) / sin(x_k t/(2m)).

Hence |E[e^{i t x_k ξ_k}]| = F(x_k t/(2πm)).

The next claim proves some basic properties of the function F.

Claim 5.6.
The function F satisfies the following properties:

1. F is symmetric: F(y) = F(−y) for all y ∈ R.
2. F is invariant to shifts by integers: F(y) = F({y}) for all y ∈ R.
3. F is bounded: F(y) ∈ [0, 1] for all y ∈ R.
4. F(y) ≤ G(my) for y ∈ [0, 1/2], where G : R₊ → [0, 1] is defined as follows: G(y) = exp(−ηy²) if y ∈ [0, 1], and G(y) = exp(−η)/y if y ≥ 1. Here, η > 0 is an absolute constant. Note that G is decreasing.

Proof. The first three claims follow immediately from the definition of F in Claim 5.5. In order to prove the last claim, we will prove that F(y) ≤ c₁/(my) for y ∈ [1/m, 1/2] for some c₁ ∈ (0, 1), and that F(y) ≤ exp(−c₂(my)²) for y ∈ [0, 1/m] for some c₂ > 0. The claim then follows by taking η = min(ln(1/c₁), c₂).

First, note that F(y) ≤ 1/((2m + 1)|sin(πy)|). Using a Taylor expansion at 0, we get for y ∈ [0, 1/2] that sin(πy) ≥ πy − (πy)³/6 ≥ πy/2. In particular, F(y) ≤ 1/(πmy), which gives the desired bound for c₁ = 1/π.

Next, note that F(y) = (1/(2m + 1))|sin((2m + 1)πy) · csc(πy)|. The Laurent series of csc(x) at x = 0 is csc(x) = 1/x + x/6 + 7x³/360 + Θ(x⁵), and the Taylor series for sin(x) is sin(x) = x − x³/6 + Θ(x⁵). So for y ∈ [0, 1/m] we have F(y) ≤ 1 − c₂(my)² ≤ exp(−c₂(my)²).

We also need the following claim, which shows that incompressible vectors retain a large fraction of their norm when restricted to small coordinates. We use the following notation: given x ∈ R^n and a set of coordinates J ⊂ [n], we denote by x|_J ∈ R^J the restriction of x to the coordinates in J.

Claim 5.7.
Let x ∈ S^{n−1} be (α, γ)-incompressible, and let J = {i : |x_i| ≤ (αn − 1)^{−1/2}}. Then ‖x|_J‖² ≥ ‖x|_J‖²_∞ + γ².

Proof. Let J^c = [n] \ J. Since x is a unit vector, we have |J^c| ≤ αn − 1. Let j ∈ J be such that |x_j| is maximal and take K = J \ {j}. Then |K^c| ≤ αn, and since we assume that x is (α, γ)-incompressible, we have ‖x|_K‖ ≥ γ. This completes the proof, since ‖x|_J‖² − ‖x|_J‖²_∞ = ‖x|_J‖² − x_j² = ‖x|_K‖² ≥ γ².

We will need the following lemma in the computations later on.
Lemma 5.8.
Let γ, δ > 0, and let x ∈ R^n be a vector such that ‖x‖_∞ ≤ δ and ‖x‖² ≥ ‖x‖²_∞ + γ². Let T = πm/δ. Then

I = ∫₀^T ∏_{i=1}^n F(x_i t/(2πm)) dt ≤ c/γ.

Proof.
To simplify the proof, we may assume by Claim 5.6(1) that x_i ≥ 0 for all i. Reorder the coordinates of x so that x₁ ≥ x₂ ≥ . . . ≥ x_n ≥ 0. Observe that for t ∈ [0, T] we have x_i t/(2πm) ∈ [0, 1/2], and hence F(x_i t/(2πm)) ≤ G(x_i t/(2π)). Thus

I ≤ ∫₀^T ∏_{i=1}^n G(x_i t/(2π)) dt = 2π ∫₀^{T/(2π)} ∏_{i=1}^n G(x_i t) dt ≤ 2π ∫₀^∞ ∏_{i=1}^n G(x_i t) dt.

We bound this last integral. Let t_i = 1/x_i, so that t₁ ≤ t₂ ≤ . . . ≤ t_n. For simplicity of notation, set t₀ = 0 and t_{n+1} = ∞. We break the computation of the integral into the intervals [t_k, t_{k+1}) for k = 0, . . . , n, and denote by I_k the integral over each interval:

I_k = ∫_{t_k}^{t_{k+1}} ∏_{i=1}^n G(x_i t) dt = ∫_{t_k}^{t_{k+1}} ∏_{i=1}^k (e^{−η}/(x_i t)) · ∏_{i=k+1}^n e^{−η(x_i t)²} dt = e^{−ηk} ∫_{t_k}^{t_{k+1}} e^{−ηt² Σ_{i=k+1}^n x_i²} / (t^k ∏_{i=1}^k x_i) dt.

Fix k and consider first the case that Σ_{i=k+1}^n x_i² ≥ γ²/2. In this case, using the fact that x_i t ≥ 1 for i ∈ [k] and t ∈ [t_k, t_{k+1}], we can bound I_k by

I_k ≤ e^{−ηk} ∫_{t_k}^{t_{k+1}} e^{−ηγ²t²/2} dt ≤ e^{−ηk} ∫₀^∞ e^{−ηγ²t²/2} dt ≤ c₁ e^{−ηk}/γ.

Next, consider the case that Σ_{i=k+1}^n x_i² < γ²/2, which means that Σ_{i=1}^k x_i² > ‖x‖² − γ²/2 ≥ ‖x‖²_∞ + γ²/2. Observe that this is impossible for k = 0 or k = 1, and hence we may assume k ≥ 2. In this case we bound

I_k ≤ e^{−ηk} ∫_{t_k}^∞ dt / (t^k ∏_{i=1}^k x_i) = e^{−ηk} · x_k^{k−1} / ((k − 1) ∏_{i=1}^k x_i) ≤ e^{−ηk} / ((k − 1) x₁).

By our assumption Σ_{i=1}^k x_i² ≥ γ²/2, we get x₁ ≥ γ/√(2k). Thus we can bound

I_k ≤ e^{−ηk} √(2k) / ((k − 1) γ) ≤ c₂ e^{−ηk}/γ.

Overall, we bounded the integral by

I ≤ 2π Σ_{k=0}^n I_k ≤ 2π max(c₁, c₂) Σ_{k=0}^n e^{−ηk}/γ ≤ c/γ,

where we used the fact that c₁, c₂, η > 0 are absolute constants.

Proof of Lemma 5.3.
Let Y = ⟨X, x⟩. Lemma 5.4 and Claim 5.5 give the bound Pr[|Y| ≤ ε] ≤ c₁εI, where I is the following integral:

I = ∫₀^{1/ε} ∏_{i=1}^n F(x_i t/(2πm)) dt.

Let T = πm√(αn − 1). We will bound the integral in the regimes [0, T] and [T, 1/ε], and denote the corresponding integrals by I₁ and I₂.

Consider first the integral I₁ over [0, T]. Let δ = 1/√(αn − 1) and J = {i : |x_i| ≤ δ}. Observe that by Claim 5.6(3) we can bound F(x_i t/(2πm)) ≤ 1 for i ∉ J. Thus

I₁ = ∫₀^T ∏_{i=1}^n F(x_i t/(2πm)) dt ≤ ∫₀^T ∏_{i∈J} F(x_i t/(2πm)) dt.

Next, as we assume that x is (α, γ)-incompressible, Claim 5.7 gives that ‖x|_J‖² ≥ ‖x|_J‖²_∞ + γ². Applying Lemma 5.8 to x|_J, we bound the first integral by I₁ ≤ c₂/γ.

Next, consider the second integral I₂ over [T, 1/ε]. We will apply the LCD assumption to uniformly bound the integrand in this range. Given t ∈ [T, 1/ε], let y(t) = {xt/(2πm)} ∈ [−1/2, 1/2)^n, let β(t) = β min(t/(2πm√n), 1), and let J(t) = {i ∈ [n] : |y(t)_i| ≥ β(t)}. As t ≤ 1/ε ≤ 2πmD, we have t/(2πm) ≤ D = LCD_{α,β}(x), and hence |J(t)| ≥ αn. Applying Claim 5.6, we bound the integrand by

∏_{i=1}^n F(x_i t/(2πm)) = ∏_{i=1}^n F(y(t)_i) ≤ ∏_{i∈J(t)} F(y(t)_i) ≤ ∏_{i∈J(t)} G(m|y(t)_i|) ≤ G(mβ(t))^{αn}.

Following up on this, we have

β(t) ≥ β(T) = β√(αn − 1)/(2√n) ≥ β√(α/8) ≥ αβ,

where we used αn ≥ 2 and α ≤ 1/8. We may assume that αβm ≥ 1, as otherwise the conclusion of the lemma is trivial. In that case, we have by Claim 5.6(4) that G(mβ(t)) ≤ G(αβm) ≤ 1/(αβm). Thus we can bound the integral I₂ by

I₂ = ∫_T^{1/ε} ∏_{i=1}^n F(x_i t/(2πm)) dt ≤ (1/ε) · (αβm)^{−αn}.

Overall, we get

Pr[|Y| ≤ ε] ≤ c₁εI = c₁ε(I₁ + I₂) ≤ c₁c₂ ε/γ + c₁ (αβm)^{−αn}.

6 LCD of random vectors

Our main goal in this section is to prove that a random normal vector X∗ has large LCD with high probability. Let M′ denote the (n − 1) × n matrix with rows X₁, . . . , X_{n−1}. Throughout this section, set D₀ = √(αn) and D₁ = β^{−1}(αβm)^{αn}.

Lemma 6.1.
Let α ∈ (0, 1/40), β ∈ (0, 1/2) and D ∈ (1, D_1). Then

Pr[LCD_{α,β}(X*) ≤ D] ≤ D^2 (1/(cα))^n β^{cn}

for some absolute constant c ∈ (0, 1).

We set γ = √β throughout the section. We first condition on a number of bad events not holding. Define:

E_1 = [ ‖M′‖ ≥ √(n log(1/β)) ]
E_2 = [ X* is (5α, β)-compressible ]
E_3 = [ X* is (α, γ)-compressible ]

Applying Claim 3.1 for E_1, and Lemma 4.2 for E_2, E_3, we get that

Pr[E_1 or E_2 or E_3] ≤ β^{c_0 n}.

Thus, we will assume in this section that none of E_1, E_2, E_3 hold. Assuming ¬E_2, Claim 5.2 yields that LCD_{α,β}(X*) ≥ D_0. For D ≥ D_0 define

S_D = { x ∈ S^{n−1} : LCD_{α,β}(x) ∈ [D, 2D] and x is (α, γ)-incompressible }.

The following is an analog of Lemma 7.2 in [21].
Claim 6.2. Let D ≥ D_0 and set ν = 6β√n/D. There exists a ν-net N_D ⊂ S_D of size

|N_D| ≤ (D/β) (cD/√(αn))^n (1/β)^{αn}.

Namely, for each x ∈ S_D there exists y ∈ N_D that satisfies ‖x − y‖ ≤ ν.

Proof. Let x ∈ S_D and shorthand D(x) = LCD_{α,β}(x). By definition, we can decompose {D(x) x} = u + v, where u is (αn)-sparse and ‖v‖ ≤ β min(D, √n) ≤ β√n.

Let W denote the set of (αn)-sparse vectors w ∈ [−1/2, 1/2]^n such that each w_i is an integer multiple of β. Then |W| ≤ (n choose αn) (1/β)^{αn}, and there exists w ∈ W such that ‖u − w‖_∞ ≤ β, which implies ‖u − w‖ ≤ β√n. This implies that

‖{D(x) x} − w‖ ≤ 2β√n.

Next, consider [D(x) x] ∈ Z^n. As |[a]| ≤ |a| for all a, we have ‖[D(x) x]‖ ≤ D(x) ‖x‖ ≤ 2D. Let Z = { z ∈ Z^n : ‖z‖ ≤ 2D }. Then [D(x) x] ∈ Z, and Claim 3.6 bounds |Z| ≤ (1 + c_0 D/√n)^n. So there is z ∈ Z such that

‖D(x) x − z − w‖ ≤ 2β√n.

Next, let R be the set of integer multiples of β in the range [D, 2D], so that |R| ≤ D/β and there exists r ∈ R with |D(x) − r| ≤ β. As ‖x‖ = 1 we have

‖r x − z − w‖ ≤ 2β√n + β ≤ 3β√n.

Finally, define the set Y = { (z + w)/r : z ∈ Z, w ∈ W, r ∈ R }. Then there exists y ∈ Y such that ‖x − y‖ ≤ 3β√n/D = ν/2.

Take a maximal set N_D ⊂ S_D which is ν-separated. That is, for any distinct x′, x′′ ∈ N_D we have ‖x′ − x′′‖ > ν. Note that by maximality, N_D is a ν-net in S_D. Next, note that |N_D| ≤ |Y|, as any point x ∈ N_D must be (ν/2)-close to a distinct point of Y. To conclude, we need to bound |Y|. We have

|Y| ≤ |W| |Z| |R| ≤ (n choose αn) (1/β)^{αn} · (1 + c_0 D/√n)^n · (D/β).

As D ≥ D_0 = √(αn) we can simplify 1 + c_0 D/√n ≤ (c_0 + 1) D/√(αn). We can trivially bound (n choose αn) ≤ 2^n. Hence

|N_D| ≤ |Y| ≤ (D/β) (2(c_0 + 1) D/√(αn))^n (1/β)^{αn},

which gives the claim for c = 2(c_0 + 1).

Claim 6.3.
For any D ∈ [D_0, D_1] we have

Pr[X* ∈ S_D and ¬E_1] ≤ D^2 (c/α)^n β^{n/8}.

Proof. First, note that we may assume β ≤ β_0 for any absolute constant β_0 ∈ (0, 1), as otherwise the claim is trivial for a large enough constant c (say c ≥ 1/β_0). In particular, setting β_0 = 2^{−30} works.

If X* ∈ S_D then there exists y ∈ N_D such that ‖X* − y‖ ≤ ν for ν = 6β√n/D. By definition of X* we have M′ X* = 0, and as we assume that ¬E_1 holds, we have

‖M′ y‖ ≤ ‖M′‖ ‖X* − y‖ ≤ ν √(n log(1/β)) = β̃ n/D,   where β̃ = 6β √(log(1/β)).

The assumption β ≤ β_0 implies that β̃ ≤ β^{3/4}. Set δ = β^{3/4} √n/D. We will bound the probability that there exists y ∈ N_D such that ‖M′ y‖ ≤ δ√n.

Fix y ∈ N_D, let X be a random row (with i.i.d. entries uniform in {−m, …, m}), and define p(ε) = Pr[|⟨X, y⟩| ≤ ε]. As y ∈ N_D ⊂ S_D we have that y is (α, γ)-incompressible with LCD_{α,β}(y) ≥ D, and hence we can apply Lemma 5.3, which gives

p(ε) ≤ c_1 (ε/γ + 1/(αβm)^{αn})   for all ε ≥ 1/(πmD).

Next, we restrict attention to only ε ≥ δ, and note that in this regime the first term is dominant (since D ≤ D_1 we have δ = β^{3/4}√n/D ≥ β^{3/4}√n/D_1 = 1/(αβm)^{αn}). We can then simplify the bound as

p(ε) ≤ 2 c_1 ε/γ   for all ε ≥ δ.

Applying Claim 3.4, and recalling that we set γ = √β, gives

Pr[‖M′ y‖ ≤ δ√n] ≤ (c_2 δ/γ)^{n−1} = (c_2 β^{1/4} √n/D)^{n−1}.

Union bounding over all y ∈ N_D, using Claim 6.2 to bound its size, gives

Pr[∃ y ∈ N_D : ‖M′ y‖ ≤ δ√n] ≤ (D/β) (cD/√(αn))^n (1/β)^{αn} · (c_2 β^{1/4} √n/D)^{n−1} ≤ D^2 (c_3/√α)^n β^{n/4 − αn − 2}.

Our assumption α < 1/40 and the implicit assumption αn ≥ 2 give that αn + 2 ≤ n/8, which simplifies the above bound to the claimed bound.

We are now in place to prove Lemma 6.1.
Proof of Lemma 6.1. We may assume that none of E_1, E_2, E_3 hold, as the probability that any of them hold is at most β^{c_0 n} for some absolute constant c_0 ∈ (0, 1). In particular, assuming ¬E_2 we have LCD_{α,β}(X*) ≥ D_0. Fix D ∈ [D_0, D_1]. As D ≤ D_1, we can apply Claim 6.3 to D_i = 2^i D_0 for i = 0, 1, 2, … as long as D_i ≤ D; the sets S_{D_i} then cover all (α, γ)-incompressible x with LCD_{α,β}(x) ∈ [D_0, D]. Summing the results we get

Pr[LCD_{α,β}(X*) ≤ D and ¬E_1, ¬E_2, ¬E_3] ≤ (2D)^2 (c_1/α)^n β^{n/8}.

Thus overall we have

Pr[LCD_{α,β}(X*) ≤ D] ≤ β^{c_0 n} + (2D)^2 (c_1/α)^n β^{n/8}.

The lemma follows by taking c ∈ (0, 1) small enough.
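Before the proof of Theorem 1.4 below, here is a quick empirical sanity check of the phenomenon it captures: a random n × n matrix with i.i.d. entries uniform in {−m, …, m} is singular with probability decaying rapidly in m. The sketch is ours and plays no role in the argument; the exact-determinant routine is standard Bareiss fraction-free elimination, and the function names are ours.

```python
import random

def int_det(mat):
    # Exact integer determinant via fraction-free (Bareiss) Gaussian elimination.
    a = [row[:] for row in mat]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Swap in a row with a nonzero pivot; if none exists, det = 0.
            for i in range(k + 1, n):
                if a[i][k] != 0:
                    a[k], a[i] = a[i], a[k]
                    sign = -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The Bareiss update: this division is always exact over Z.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[-1][-1]

def singular_fraction(n, m, trials, seed=0):
    # Fraction of singular matrices among random n x n matrices with
    # i.i.d. entries uniform in {-m, ..., m}.
    rng = random.Random(seed)
    hits = sum(
        int_det([[rng.randint(-m, m) for _ in range(n)] for _ in range(n)]) == 0
        for _ in range(trials)
    )
    return hits / trials
```

For instance, singular_fraction(2, 1, 400) comes out noticeably larger than singular_fraction(2, 5, 400), consistent with decay in m; of course such experiments are feasible only for tiny n and prove nothing.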
We now prove Theorem 1.4.

Proof of Theorem 1.4. Fix β = 1/√m and α ∈ (0, 1/40) an absolute constant, and assume m ≥ m_0 for a large enough constant m_0 to be determined soon. Let D be a parameter to be determined soon. Lemma 6.1 gives

Pr[LCD_{α,β}(X*) ≤ D] ≤ D^2 (1/(c_1 α))^n β^{c_1 n}.

As α is constant, and using the choice β = 1/√m, we can simplify the bound as follows. For a small enough constant c ∈ (0, 1), taking D = m^{cn} and c_2 = 1/(c_1 α), we have

Pr[LCD_{α,β}(X*) ≤ m^{cn}] ≤ m^{2cn} c_2^n m^{−(c_1/2) n} ≤ c_2^n m^{−(c_1/2 − 2c) n} ≤ c_2^n m^{−cn}.

Assuming m ≥ m_0 for a large enough constant m_0, we can simplify this bound further as

Pr[LCD_{α,β}(X*) ≤ m^{cn}] ≤ m^{−(c/2) n}.

Next, assume D = LCD_{α,β}(X*) ≥ m^{cn}. In this case, Lemma 5.3 for ε = 1/(πmD) gives that

Pr[⟨X*, X_n⟩ = 0] ≤ Pr[|⟨X*, X_n⟩| ≤ ε] ≤ c_1 (ε/γ + 1/(αβm)^{αn}) ≤ m^{−c′ n}

for some c′ ∈ (0, 1). Combining the two cases completes the proof.

Acknowledgement
We would like to thank Roman Vershynin and Konstantin Tikhomirov for helpful discussions.
References

[1] S. Balaji, M. N. Krishnan, M. Vajha, V. Ramkumar, B. Sasidharan, and P. V. Kumar. Erasure coding for distributed storage: An overview. Science China Information Sciences, 61(10):100301, 2018.

[2] S. Ball. On sets of vectors of a finite vector space in which every subset of basis size is a basis. Journal of the European Mathematical Society, 14(3):733–748, 2012.

[3] J. Bourgain, V. H. Vu, and P. M. Wood. On the singularity probability of discrete random matrices. Journal of Functional Analysis, 258(2):559–603, 2010.

[4] M. Braverman. Towards deterministic tree code constructions. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 161–167, 2012.

[5] A. Carbery and J. Wright. Distributional and L^q norm inequalities for polynomials over convex bodies in R^n. Mathematical Research Letters, 8(3):233–248, 2001.

[6] G. Cohen, B. Haeupler, and L. J. Schulman. Explicit binary tree codes with polylogarithmic size alphabet. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 535–544, 2018.

[7] S. H. Dau, W. Song, and C. Yuen. On the existence of MDS codes over small fields with constrained generator matrices. In , pages 1787–1791. IEEE, 2014.

[8] C. Esseen. On the Kolmogorov-Rogozin inequality for the concentration function. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 5(3):210–216, 1966.

[9] S. Gopi, V. Guruswami, and S. Yekhanin. Maximally recoverable LRCs: A field size lower bound and constructions for few heavy parities. IEEE Transactions on Information Theory, 2020.

[10] A. Heidarzadeh and A. Sprintson. An algebraic-combinatorial proof technique for the GM-MDS conjecture. In , pages 11–15. IEEE, 2017.

[11] S. Hoang Dau, W. Song, and C. Yuen. On the existence of MDS codes over small fields with constrained generator matrices. arXiv preprint, 2014.

[12] J. Kahn, J. Komlós, and E. Szemerédi. On the probability that a random ±1-matrix is singular. Journal of the American Mathematical Society, 8(1):223–240, 1995.

[13] D. Kane, S. Lovett, and S. Rao. The independence number of the Birkhoff polytope graph, and applications to maximally recoverable codes. SIAM Journal on Computing, 48(4):1425–1435, 2019.

[14] Y. Katznelson. Singular matrices and a uniform bound for congruence groups of SL_n(Z). Duke Mathematical Journal, 69(1):121–136, 1993.

[15] J. Komlós. On determinant of (0,1) matrices. Studia Scientiarum Mathematicarum Hungarica, 2:7–21, 1967.

[16] L. Leindler. On a certain converse of Hölder's inequality. In Proceedings of the 1971 Oberwolfach Conference, Birkhäuser Verlag, Basel-Stuttgart, 1972.

[17] S. Lovett. MDS matrices over small fields: A proof of the GM-MDS conjecture. In , pages 194–199. IEEE, 2018.

[18] V. D. Milman and G. Schechtman. Asymptotic theory of finite dimensional normed spaces: Isoperimetric inequalities in Riemannian manifolds, volume 1200. Springer, 2009.

[19] C. Moore and L. J. Schulman. Tree codes and a conjecture on exponential sums. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, pages 145–154, 2014.

[20] A. Prékopa. On logarithmic concave measures and functions. Acta Scientiarum Mathematicarum, 34:335–343, 1973.

[21] M. Rudelson. Lecture notes on non-asymptotic theory of random matrices. 2013.

[22] M. Rudelson and R. Vershynin. The Littlewood–Offord problem and invertibility of random matrices. Advances in Mathematics, 218(2):600–633, 2008.

[23] M. Rudelson and R. Vershynin. Non-asymptotic theory of random matrices: extreme singular values. In Proceedings of the International Congress of Mathematicians 2010 (ICM 2010), Vol. I: Plenary Lectures and Ceremonies, Vols. II–IV: Invited Lectures, pages 1576–1602. World Scientific, 2010.

[24] L. J. Schulman. Deterministic coding for interactive communication. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pages 747–756, 1993.

[25] L. J. Schulman. Coding for interactive communication. IEEE Transactions on Information Theory, 42(6):1745–1756, 1996.

[26] J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. Journal of the ACM (JACM), 27(4):701–717, 1980.

[27] B. Segre. Curve razionali normali e k-archi negli spazi finiti. Annali di Matematica Pura ed Applicata, 39(1):357–379, 1955.

[28] R. Singleton. Maximum distance q-nary codes. IEEE Transactions on Information Theory, 10(2):116–118, 1964.

[29] T. Tao and V. Vu. On random ±1 matrices: singularity and determinant. Random Structures & Algorithms, 28(1):1–23, 2006.

[30] T. Tao and V. Vu. On the singularity probability of random Bernoulli matrices. Journal of the American Mathematical Society, 20(3):603–628, 2007.

[31] K. Tikhomirov. Singularity of random Bernoulli matrices. Annals of Mathematics, 191(2):593–634, 2020.

[32] S. S. Vempala, R. Wang, and D. P. Woodruff. The communication complexity of optimization. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1733–1752. SIAM, 2020.

[33] H. Yildiz and B. Hassibi. Further progress on the GM-MDS conjecture for Reed-Solomon codes. In , pages 16–20. IEEE, 2018.

[34] R. Zippel. Probabilistic algorithms for sparse polynomials. In