On the number of rich lines in truly high dimensional sets
Zeev Dvir ∗    Sivakanth Gopi †

Abstract
We prove a new upper bound on the number of $r$-rich lines (lines with at least $r$ points) in a 'truly' $d$-dimensional configuration of points $v_1, \ldots, v_n \in \mathbb{C}^d$. More formally, we show that, if the number of $r$-rich lines is significantly larger than $n^2/r^d$ then there must exist a large subset of the points contained in a hyperplane. We conjecture that the factor $r^d$ can be replaced with a tight $r^{d+1}$. If true, this would generalize the classic Szemerédi-Trotter theorem, which gives a bound of $n^2/r^3$ on the number of $r$-rich lines in a planar configuration. This conjecture was shown to hold in $\mathbb{R}^3$ in the seminal work of Guth and Katz [GK15] and was also recently proved over $\mathbb{R}^4$ (under some additional restrictions) [SS14]. For the special case of arithmetic progressions ($r$ collinear points that are evenly distanced) we give a bound that is tight up to low order terms, showing that a $d$-dimensional grid achieves the largest number of $r$-term progressions.

The main ingredient in the proof is a new method to find a low degree polynomial that vanishes on many of the rich lines. Unlike previous applications of the polynomial method, we do not find this polynomial by interpolation. The starting observation is that the degree $r-2$ Veronese embedding takes $r$-collinear points to $r$ linearly dependent images. Hence, each collinear $r$-tuple of points gives us a dependent $r$-tuple of images. We then use the design-matrix method of [BDWY12] to convert these 'local' linear dependencies into a global one, showing that all the images lie in a hyperplane. This then translates into a low degree polynomial vanishing on the original set.

Introduction

The Szemerédi-Trotter theorem gives a tight upper bound on the number of incidences between a collection of points and lines in the real plane. We write $A \lesssim B$ to denote $A \le C \cdot B$ for some absolute constant $C$, and $A \approx B$ if we have both $A \lesssim B$ and $B \lesssim A$.

Theorem 1.1 ([ST83]). Given a set of points $V$ and a set of lines $\mathcal{L}$ in $\mathbb{R}^2$, let $I(V, \mathcal{L})$ be the set of incidences between $V$ and $\mathcal{L}$.
Then,
\[ |I(V, \mathcal{L})| \lesssim |V|^{2/3} |\mathcal{L}|^{2/3} + |V| + |\mathcal{L}|. \]

This fundamental theorem has found many applications in various areas (see [Dvi12] for some examples) and is known to also hold in the complex plane $\mathbb{C}^2$ [Tot03, Zah12]. In recent years there has been a growing interest in high dimensional variants of line-point incidence bounds [SSZ13, Kol14, Rud14, SS14, ST12, BS14]. This is largely due to the breakthrough results of Guth and Katz [GK15], which include an incidence theorem for configurations of lines in $\mathbb{R}^3$ satisfying some 'truly 3-dimensional' condition (e.g., not too many lines in a plane). The intuition is that, in high dimensions, it is 'harder' to create many incidences between points and lines. This intuition is of course false if our configuration happens to lie in some low dimensional space. In this work we prove stronger line-point incidence bounds for sets of points that do not contain a large low dimensional subset.

∗ Department of Computer Science and Department of Mathematics, Princeton University.
† Department of Computer Science, Princeton University.

To state our main theorem we first restate the Szemerédi-Trotter theorem as a bound on the number of $r$-rich lines (lines containing at least $r$ points) in a given set of points. Since our results will hold over the complex numbers we will switch now from $\mathbb{R}$ to $\mathbb{C}$. The complex version of Szemerédi-Trotter was first proved by Tóth [Tot03] and then proved using different methods by Zahl [Zah12]. For a finite set of points $V$, we denote by $\mathcal{L}_r(V)$ the set of $r$-rich lines in $V$. The following is equivalent to Theorem 1.1 (but stated over $\mathbb{C}$).

Theorem 1.2 ([Tot03, Zah12]). Given a set $V$ of $n$ points in $\mathbb{C}^2$, for $r \ge 2$,
\[ |\mathcal{L}_r(V)| \lesssim \frac{n^2}{r^3} + \frac{n}{r}. \]

Theorem 1.2 is tight since a two dimensional square grid of $n$ points contains $\gtrsim n^2/r^3$ lines that are $r$-rich. We might then ask whether a $d$-dimensional grid $G_d = \{1, 2, \ldots, h\}^d$, with $h \approx n^{1/d}$, has asymptotically the maximal number of $r$-rich lines among all $n$-point configurations that do not have a large low dimensional subset. It can be shown that for $r \ll_d n^{1/d}$,
\[ |\mathcal{L}_r(G_d)| \approx_d \frac{n^2}{r^{d+1}}, \]
where the subscript $d$ denotes that the constants in the inequalities may depend on the dimension $d$ [SV04]. Clearly, we can obtain a larger number of rich lines in $\mathbb{C}^d$ if $V$ is a union of several low dimensional grids. For example, for some $\alpha \gg_d 1$ (we use $A \gg B$ to mean $A \ge C \cdot B$ for some sufficiently large constant $C$) and $d > \ell > 1$, we can take a disjoint union of $r^{d-\ell}/\alpha$ $\ell$-dimensional grids $G_\ell$ of size $\alpha n/r^{d-\ell}$ each. Each of these grids will have $\gtrsim_d \alpha^2 n^2/r^{2d-\ell+1}$ $r$-rich lines and so, together, we will get $\gtrsim_d \alpha n^2/r^{d+1}$ rich lines. We can also take a union of $n/r$ lines containing $r$ points each, to get more $r$-rich lines than in the $d$-dimensional grid $G_d$ when $r \gg_d n^{1/d}$. We thus arrive at the following conjecture which, if true, would mean that the best one can do is to paste together a number of grids as above.

Conjecture 1.3.
For $r \ge 2$, suppose $V \subset \mathbb{C}^d$ is a set of $n$ points with
\[ |\mathcal{L}_r(V)| \gg_d \frac{n^2}{r^{d+1}} + \frac{n}{r}. \]
Then there exists $1 < \ell < d$ and a subset $V' \subset V$ of size $\gtrsim_d n/r^{d-\ell}$ which is contained in an $\ell$-flat (i.e., an $\ell$-dimensional affine subspace).

This conjecture holds in $\mathbb{R}^3$ [GK15] and, in a slightly weaker form, in $\mathbb{R}^4$ [SS14]. We compare these two results with ours later in the introduction. Our main result makes a step in the direction of this conjecture. First of all, our bound is off by a factor of $r$ from the optimal bound (i.e., with $n^2/r^d$ instead of $n^2/r^{d+1}$). Secondly, we are only able to detect a $(d-1)$-dimensional subset (instead of dimension $\ell$, which may be smaller).

Theorem 1. For all $d \ge 1$ there exist constants $C_d, C'_d$ such that the following holds. Let $V \subset \mathbb{C}^d$ be a set of $n$ points and let $r \ge 2$ be an integer. Suppose that for some $\alpha \ge 1$,
\[ |\mathcal{L}_r(V)| \ge C_d \cdot \alpha \cdot \frac{n^2}{r^d}. \]
Then, there exists a subset $\tilde{V} \subset V$ of size at least $C'_d \cdot \alpha \cdot \frac{n}{r^{d-2}}$ contained in a $(d-1)$-flat. We can take the constants $C_d, C'_d$ to be $d^{cd}, d^{c'd}$ for absolute constants $c, c' > 0$.

Notice that the theorem is only meaningful when $r \gg d$ (otherwise the factor $r^d$ in the assumption will be swallowed by the constant $C_d$). On the other hand, if $r \gg n^{1/(d-1)}$ then the conclusion always holds. Hence, the theorem is meaningful when $r$ is in a 'middle' range. Notice also that for $d = 2$ and $r$ sufficiently small, the condition of the theorem also cannot hold, by the Szemerédi-Trotter theorem. However, when $d$ becomes larger, our theorem gives non-trivial results (and becomes closer to optimal for large $d$). The proof of Theorem 1 actually shows (Lemma 3.1) that, under the same hypothesis, most of the rich lines must be contained in a hypersurface of degree smaller than $r$. This in itself can be very useful, as we will see in the proof of Theorem 3, which uses this fact to prove certain sum-product estimates.
The existence of such a low-degree hypersurface containing most of the curves can also be obtained when there are many $r$-rich curves of bounded degree with 'two degrees of freedom', i.e., through every pair of points there are at most $O(1)$ curves (see Remark 3.4).

Counting arithmetic progressions

An $r$-term arithmetic progression in $\mathbb{C}^d$ is simply a set of $r$ points of the form $\{y, y + x, y + 2x, \ldots, y + (r-1)x\}$ with $x, y \in \mathbb{C}^d$. This is a special case of $r$ collinear points and, for this case, we can derive a tighter bound than for the general case. In a nutshell, we can show that a $d$-dimensional grid contains the largest number of $r$-term progressions among all sets that do not contain a large $(d-1)$-dimensional subset. The key to the tighter bound is that, when we take the Cartesian product of $V$ with itself, the number of progressions of length $r$ squares.

For a finite set $V \subset \mathbb{C}^d$, let us denote the number of $r$-term arithmetic progressions contained in $V$ by $AP_r(V)$. We first observe that, for all sufficiently small $r$, the grid $G_d$ (defined above) contains at least $\gtrsim_d n^2/r^d$ $r$-term progressions. To see where the extra factor of $r$ comes from, notice that the $2r$-rich lines in $G_d$ will contain $r$ arithmetic progressions of length $r$ each. Our main theorem shows that this is optimal, as long as there is no large low-dimensional set.

Theorem 2.
Let $0 < \epsilon < 1$ and $V \subset \mathbb{C}^d$ be a set of size $n$ and suppose that for some $r \ge 4$ we have
\[ AP_r(V) \gg_{d,\epsilon} \frac{n^2}{r^{d-\epsilon}}. \]
Then, there exists a subset $\tilde{V} \subset V$ of size $\gtrsim_{d,\epsilon} \frac{n}{r^{d/\epsilon - 1}}$ contained in a hyperplane.

Using the incidence bound between points and lines in $\mathbb{R}^3$ proved in [GK15], one can prove the following theorem, from which Conjecture 1.3 in $\mathbb{R}^3$ trivially follows (see Appendix A).

Theorem 1.4 ([GK15]). Given a set $V$ of $n$ points in $\mathbb{R}^3$, let $s_2$ denote the maximum number of points of $V$ contained in a 2-flat. Then for $r \ge 2$,
\[ |\mathcal{L}_r(V)| \lesssim \frac{n^2}{r^4} + \frac{n s_2}{r^3} + \frac{n}{r}. \]

Similarly, using the results of [SS14], we can prove the following theorem, from which a slightly weaker version of Conjecture 1.3 in $\mathbb{R}^4$ trivially follows (see Appendix A).

Theorem 1.5 ([SS14]). Given a set $V$ of $n$ points in $\mathbb{R}^4$, let $s_2$ denote the maximum number of points of $V$ contained in a 2-flat and $s_3$ denote the maximum number of points of $V$ contained in a quadric hypersurface or a hyperplane. Then there is an absolute constant $c > 0$ such that for $r \ge 2$,
\[ |\mathcal{L}_r(V)| \lesssim 2^{c\sqrt{\log n}} \cdot \left( \frac{n^2}{r^5} + \frac{n s_3}{r^4} + \frac{n s_2}{r^3} + \frac{n}{r} \right). \]

We are not aware of any examples where points arranged on a quadric hypersurface in $\mathbb{R}^4$ result in significantly more rich lines than in a four dimensional grid. It is, however, possible that one needs to weaken Conjecture 1.3 so that for some $1 < \ell < d$, an $\ell$-dimensional hypersurface of constant degree (possibly depending on $\ell$) contains $\gtrsim_d n/r^{d-\ell}$ points.

To make the comparison with the above theorems easier, Theorem 1 can be stated equivalently as follows:

Theorem 1.6 (Equiv. to Theorem 1). Given a set $V$ of $n$ points in $\mathbb{C}^d$, let $s_{d-1}$ denote the maximum number of points of $V$ contained in a hyperplane. Then for $r \ge 2$,
\[ |\mathcal{L}_r(V)| \lesssim_d \frac{n^2}{r^d} + \frac{n s_{d-1}}{r^2}. \]

In [SV04], it was shown that $|\mathcal{L}_r(V)| \lesssim_d n^2/r^{d+1}$ when $V \subset \mathbb{R}^d$ is a homogeneous set. This roughly means that the point set is a perturbation of the grid $G_d$.
In [LS07], the result was extended to pseudolines and homogeneous sets in $\mathbb{R}^n$, where pseudolines are a generalization of lines which includes constant degree irreducible algebraic curves. The homogeneity condition on a set is much stronger (for sufficiently small $r$) than requiring that no large subset belongs to a hyperplane (however, we cannot derive these results from ours since our dependence on $d$ is suboptimal).

The main tool used in the proof of Theorem 1 is a rank bound for design matrices. A design matrix is a matrix with entries in $\mathbb{C}$ whose support (set of non-zero entries) forms a specific pattern. Namely, the supports of different columns have small intersections, the columns have large support and the rows are sparse (see Definition 2.1). Design matrices were introduced in [BDWY12, DSW14] to study quantitative variants of the Sylvester-Gallai theorem. These works prove certain lower bounds on the rank of such matrices, depending only on the combinatorial properties of their support (see Section 2.1). Such rank bounds can be used to give upper bounds on the dimension of point configurations in which there are many 'local' linear dependencies. This is done by using the local dependencies to construct rows of a design matrix $M$, showing that its rank is high and then arguing that the dimension of the original set is small since it must lie in the kernel of $M$.

Suppose we have a configuration of points with many $r$-rich lines. Clearly, any three collinear points are affinely dependent, but this alone does not exploit the fact that $r$ may be larger than 3. To use this information, we observe that a certain map, called the Veronese embedding, takes $r$-collinear points to $r$ linearly dependent points in a larger dimensional space (see Section 2.2). Thus we can create a design matrix using these linear dependencies, similarly to the constructions of [BDWY12, DSW14], to get an upper bound on the dimension of the image of the original set under the Veronese embedding.
We use this upper bound to conclude that there is a polynomial of degree at most $r - 2$ that vanishes on the original point set.

Here, we show a simple application of our techniques to prove sum-product estimates over $\mathbb{C}$. The estimates we will get can also be derived from the Szemerédi-Trotter theorem in the complex plane (see Section 5.1) and we include them only as an example of how to use a higher dimensional theorem in this setting. We hope that future progress on proving Conjecture 1.3 will result in progress on sum-product problems.

We begin with some notation. For two sets $A, B \subset \mathbb{C}$ we denote by $A + B = \{a + b \mid a \in A, b \in B\}$ the sum set of $A$ and $B$. For a set $A \subset \mathbb{C}$ and a complex number $t \in \mathbb{C}$ we denote by $tA = \{ta \mid a \in A\}$ the dilate of $A$ by $t$. Hence we have that $A + tA = \{a + ta' \mid a, a' \in A\}$.

Theorem 3.
Let $A \subset \mathbb{C}$ be a set of $N$ complex numbers and let $1 \ll C \ll \sqrt{N}$. Define the set
\[ T_C = \left\{ t \in \mathbb{C} \;\middle|\; |A + tA| \le \frac{N^{1.5}}{C\sqrt{\log N}} \right\}. \]
Then, $|T_C| \lesssim \frac{N}{C^2}$.

By taking $C$ to be a large constant, an immediate corollary is:

Corollary 1.7.
Let $A \subset \mathbb{C}$ be a finite set. Then
\[ |A + A \cdot A| = |\{a + bc \mid a, b, c \in A\}| \gtrsim \frac{|A|^{1.5}}{\sqrt{\log |A|}}. \]

In Section 2 we give some preliminaries, including on design matrices and the Veronese embedding. In Section 3 we prove Theorem 1. In Section 4 we prove Theorem 2. In Section 5 we prove Theorem 3. In Appendix A we give a possible strengthening of Conjecture 1.3 along with the proofs of Theorem 1.4 and Theorem 1.5.
We thank Ben Green and Noam Solomon for helpful comments. Research supported by NSF grant CCF-1217416 and by the Sloan fellowship. Some of the work on the paper was carried out during the special semester on 'Algebraic Techniques for Combinatorial and Computational Geometry', held at the Institute for Pure and Applied Mathematics (IPAM) during Spring 2014.
Preliminaries
We begin with some notation. For a vector $v \in \mathbb{C}^n$ and a set $I \subset [n]$ we denote by $v_I \in \mathbb{C}^I$ the restriction of $v$ to indices in $I$. We denote the support of a vector $v \in \mathbb{C}^d$ by $\mathrm{supp}(v) = \{i \in [d] \mid v_i \ne 0\}$ (this notation is extended to matrices as well). For a set of $n$ points $V \subset \mathbb{C}^d$ and an integer $\ell$, we denote by $V^\ell \subset \mathbb{C}^{d\ell}$ its $\ell$-fold Cartesian product, i.e., $V^\ell = V \times V \times \cdots \times V$ ($\ell$ times), where we naturally identify $\mathbb{C}^d \times \mathbb{C}^d \times \cdots \times \mathbb{C}^d$ ($\ell$ times) with $\mathbb{C}^{d\ell}$.

Design matrices, defined in [BDWY12], are matrices that satisfy a certain condition on their support.
Definition 2.1 (Design matrix). Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$. Let $R_1, \ldots, R_m \in \mathbb{F}^n$ be the rows of $A$ and let $C_1, \ldots, C_n \in \mathbb{F}^m$ be the columns of $A$. We say that $A$ is a $(q, k, t)$-design matrix if
1. For all $i \in [m]$, $|\mathrm{supp}(R_i)| \le q$.
2. For all $j \in [n]$, $|\mathrm{supp}(C_j)| \ge k$.
3. For all $j_1 \ne j_2 \in [n]$, $|\mathrm{supp}(C_{j_1}) \cap \mathrm{supp}(C_{j_2})| \le t$.

Surprisingly, one can derive a general bound on the rank of complex design matrices, despite having no information on the values present at the non-zero positions of the matrix. The first bound of this form was given in [BDWY12], which was improved in [DSW14].
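Since Definition 2.1 depends only on the support of the matrix, its three conditions are easy to check mechanically. The following Python sketch is illustrative only: the function names `design_parameters` and `is_design_matrix`, and the Fano-plane example, are our own choices and are not taken from [BDWY12] or [DSW14]. It computes the tightest parameters $(q, k, t)$ for a given matrix.

```python
from itertools import combinations


def design_parameters(matrix):
    """Return the tightest (q, k, t) for which `matrix` is a (q, k, t)-design matrix.

    q = largest row support, k = smallest column support,
    t = largest intersection between the supports of two distinct columns.
    """
    num_rows, num_cols = len(matrix), len(matrix[0])
    row_supports = [{j for j in range(num_cols) if matrix[i][j] != 0}
                    for i in range(num_rows)]
    col_supports = [{i for i in range(num_rows) if matrix[i][j] != 0}
                    for j in range(num_cols)]
    q = max(len(s) for s in row_supports)
    k = min(len(s) for s in col_supports)
    t = max((len(a & b) for a, b in combinations(col_supports, 2)), default=0)
    return q, k, t


def is_design_matrix(matrix, q, k, t):
    """Check the three conditions of Definition 2.1 for the given parameters."""
    q0, k0, t0 = design_parameters(matrix)
    return q0 <= q and k0 >= k and t0 <= t


# Illustrative example (our assumption, not from the paper): the line-point
# incidence matrix of the Fano plane. Rows are lines, columns are points.
FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]
FANO = [[1 if j in line else 0 for j in range(7)] for line in FANO_LINES]
```

Each Fano line has three points, each point lies on three lines, and any two points share exactly one line, so `design_parameters(FANO)` evaluates to `(3, 3, 1)`. The check never inspects the actual non-zero values, which is exactly the feature of design matrices stressed in the paragraph above.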
Theorem 2.2 ([DSW14]). Let $A$ be an $m \times n$ matrix with entries in $\mathbb{C}$. If $A$ is a $(q, k, t)$-design matrix then the following two bounds hold:
\[ \mathrm{rank}(A) \ge n - \frac{n t q^2}{k}, \tag{1} \]
\[ \mathrm{rank}(A) \ge n - \frac{m t q^2}{k^2}. \tag{2} \]

We denote by
\[ m(d, r) = \binom{d + r}{d} \]
the number of monomials of degree at most $r$ in $d$ variables. We will often use the lower bound $m(d, r) \ge (r/d)^d$. The Veronese embedding $\phi_{d,r} : \mathbb{C}^d \to \mathbb{C}^{m(d,r)}$ sends a point $a = (a_1, \ldots, a_d) \in \mathbb{C}^d$ to the vector of evaluations of all monomials of degree at most $r$ at the point $a$. For example, the map $\phi_{2,2}$ sends $(a_1, a_2)$ to $(1, a_1, a_2, a_1^2, a_1 a_2, a_2^2)$. We can identify each point $w \in \mathbb{C}^{m(d,r)}$ with a polynomial $f_w \in \mathbb{C}[x_1, \ldots, x_d]$ of degree at most $r$ in an obvious manner, so that the value $f_w(a)$ at a point $a \in \mathbb{C}^d$ is given by the standard inner product $\langle w, \phi_{d,r}(a) \rangle$. We will use the following two easy claims.

Claim 2.3. Let $V \subset \mathbb{C}^d$ and let $U = \phi_{d,r}(V) \subset \mathbb{C}^{m(d,r)}$. Then $U$ is contained in a hyperplane iff there is a non-zero polynomial $f \in \mathbb{C}[x_1, \ldots, x_d]$ of degree at most $r$ that vanishes on all points of $V$.

Proof. Each hyperplane in $\mathbb{C}^{m(d,r)}$ is given as the set of points having inner product zero with some non-zero $w \in \mathbb{C}^{m(d,r)}$. If we take the corresponding polynomial $f_w \in \mathbb{C}[x_1, \ldots, x_d]$ we get that it vanishes on $V$ iff $\phi_{d,r}(V)$ is contained in the hyperplane defined by $w$.

Claim 2.4.
Suppose the $r + 2$ points $v_1, \ldots, v_{r+2} \in \mathbb{C}^d$ are collinear and let $\phi = \phi_{d,r} : \mathbb{C}^d \to \mathbb{C}^{m(d,r)}$. Then, the points $\phi(v_1), \ldots, \phi(v_{r+2})$ are linearly dependent. Moreover, every $r + 1$ of the points $\phi(v_1), \ldots, \phi(v_{r+2})$ are linearly independent.

Proof. Denote $u_i = \phi(v_i)$ for $i = 1, \ldots, r + 2$. To show that the $u_i$'s are linearly dependent it is enough to show that, for any $w \in \mathbb{C}^{m(d,r)}$, if all the $r + 1$ inner products $\langle w, u_1 \rangle, \ldots, \langle w, u_{r+1} \rangle$ are zero, then the inner product $\langle w, u_{r+2} \rangle$ must also be zero. Suppose this is the case, and let $f_w \in \mathbb{C}[x_1, \ldots, x_d]$ be the polynomial of degree at most $r$ associated with the point $w$, so that $\langle w, u_i \rangle = f_w(v_i)$ for all $1 \le i \le r + 2$. Since the points $v_1, \ldots, v_{r+2}$ are on a single line $L \subset \mathbb{C}^d$, and since the polynomial $f_w$ vanishes on $r + 1$ of them, we have that $f_w$ must vanish identically on the line $L$ and so $f_w(v_{r+2}) = \langle w, u_{r+2} \rangle = 0$ as well.

To show the 'moreover' part, suppose in contradiction that $u_1, \ldots, u_r$ span $u_{r+1}$. We can find, by interpolation, a non-zero polynomial $f \in \mathbb{C}[x_1, \ldots, x_d]$ of degree at most $r$ such that $f(v_1) = \cdots = f(v_r) = 0$ and $f(v_{r+1}) = 1$. More formally, we can translate the line containing the $r + 1$ points to the $x_1$-axis and then interpolate a degree $r$ polynomial in $x_1$ with the required properties using the invertibility of the Vandermonde matrix. Now, let $w \in \mathbb{C}^{m(d,r)}$ be the point such that $f = f_w$. We know that $\langle w, u_i \rangle = 0$ for $i = 1, \ldots, r$ and thus, since $u_{r+1}$ is in the span of $u_1, \ldots, u_r$, we get that $f(v_{r+1}) = \langle w, u_{r+1} \rangle = 0$, a contradiction. This completes the proof.

We recall the Schwartz-Zippel lemma.
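Before moving on, Claim 2.4 can be sanity-checked on a small instance. The sketch below is only an illustration under our own assumptions: it uses exact rational arithmetic instead of complex numbers, the helper names `veronese` and `rank` are hypothetical, and the monomial ordering is the one produced by `combinations_with_replacement`, which need not match the ordering in the example for $\phi_{2,2}$. It embeds $r + 2 = 4$ collinear points of the plane with the degree $r = 2$ embedding and computes ranks by Gaussian elimination.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement


def veronese(point, r):
    """Evaluate every monomial of degree at most r at `point` (exact arithmetic)."""
    d = len(point)
    values = []
    for degree in range(r + 1):
        # Each multiset of `degree` variable indices corresponds to one monomial.
        for indices in combinations_with_replacement(range(d), degree):
            value = Fraction(1)
            for i in indices:
                value *= point[i]
            values.append(value)
    return values


def rank(rows):
    """Rank of a matrix over the rationals via Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


# Four collinear points (1, 2) + s * (3, -1) for s = 0, 1, 2, 5, embedded with r = 2.
points = [(1 + 3 * s, 2 - s) for s in (0, 1, 2, 5)]
images = [veronese(p, 2) for p in points]
```

Here the four images live in dimension $m(2, 2) = 6$, yet `rank(images)` is 3, so they are linearly dependent, while every three of them have rank 3, matching the 'moreover' part of the claim.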
Lemma 2.5 ([Sch80, Zip79]). Let $S \subset \mathbb{F}$ be a finite subset of an arbitrary field $\mathbb{F}$ and let $f \in \mathbb{F}[x_1, \ldots, x_d]$ be a non-zero polynomial of degree at most $r$. Then
\[ |\{(a_1, \ldots, a_d) \in S^d \subset \mathbb{F}^d \mid f(a_1, \ldots, a_d) = 0\}| \le r \cdot |S|^{d-1}. \]

An easy corollary is the following claim about homogeneous polynomials.
Lemma 2.6.
Let $S \subset \mathbb{F}$ be a finite subset of an arbitrary field $\mathbb{F}$ and let $f \in \mathbb{F}[x_1, \ldots, x_d]$ be a non-zero homogeneous polynomial of degree at most $r$. Then
\[ |\{(1, a_2, \ldots, a_d) \in \{1\} \times S^{d-1} \mid f(1, a_2, \ldots, a_d) = 0\}| \le r \cdot |S|^{d-2}. \]

Proof. Let $g(x_2, \ldots, x_d) = f(1, x_2, \ldots, x_d)$ be the polynomial one obtains from fixing $x_1 = 1$ in $f$. Then $g$ is a polynomial of degree at most $r$ in $d - 1$ variables. If $g$ was the zero polynomial then $f$ would have been divisible by $1 - x_1$, which is impossible for a homogeneous polynomial. Hence, we can use Lemma 2.5 to bound the number of zeros of $g$ in the set $S^{d-1}$ by $r \cdot |S|^{d-2}$. This completes the proof.

Another useful claim says that if a degree one polynomial (i.e., the equation of a hyperplane) vanishes on a large subset of the product set $V^\ell$, then there is another degree one polynomial that vanishes on a large subset of $V$.

Lemma 2.7.
Let $V \subset \mathbb{C}^d$ be a set of $n$ points and let $V^\ell \subset \mathbb{C}^{d\ell}$ be its $\ell$-fold Cartesian product. Let $H \subset \mathbb{C}^{d\ell}$ be an affine hyperplane such that $|H \cap V^\ell| \ge \delta \cdot n^\ell$. Then, there exists an affine hyperplane $H' \subset \mathbb{C}^d$ such that $|H' \cap V| \ge \delta \cdot n$.

Proof. Let $h \in \mathbb{C}^{d\ell}$ be the vector perpendicular to $H$, so that $x \in H$ iff $\langle x, h \rangle = b$ for some fixed $b \in \mathbb{C}$. Observing the product structure of $\mathbb{C}^{d\ell} = (\mathbb{C}^d)^\ell$ we can write $h = (h_1, \ldots, h_\ell)$ with each $h_i \in \mathbb{C}^d$. W.l.o.g. suppose that $h_1 \ne 0$. For each $a = (a_2, \ldots, a_\ell) \in V^{\ell-1}$ let $V^\ell_a = V \times \{a_2\} \times \cdots \times \{a_\ell\}$. Since there are $n^{\ell-1}$ different choices for $a \in V^{\ell-1}$, and since
\[ |V^\ell \cap H| = \sum_{a \in V^{\ell-1}} |V^\ell_a \cap H|, \]
there must be some $a$ with $|V^\ell_a \cap H| \ge \delta \cdot n$. Let $H' \subset \mathbb{C}^d$ be the hyperplane defined by the equation
\[ x \in H' \text{ iff } \langle x, h_1 \rangle + \langle a_2, h_2 \rangle + \cdots + \langle a_\ell, h_\ell \rangle = b. \]
Then, $H' \cap V$ is in one-to-one correspondence with the set $V^\ell_a \cap H$ and so has the same size.

We will need the following simple lemma, showing that any bipartite graph can be refined so that both vertex sets have high minimum degree (relative to the original edge density).
Lemma 2.8.
Let $G = (A \sqcup B, E)$ be a bipartite graph with $E \subset A \times B$ and edge set $E \ne \emptyset$. Then there exist non-empty sets $A' \subset A$ and $B' \subset B$ such that if we consider the induced subgraph $G' = (A' \sqcup B', E')$ then
1. The minimum degree in $A'$ is at least $\frac{|E|}{4|A|}$.
2. The minimum degree in $B'$ is at least $\frac{|E|}{4|B|}$.
3. $|E'| \ge |E|/2$.

Proof. We will construct $A'$ and $B'$ using an iterative procedure. Initially let $A' = A$ and $B' = B$. Let $G' = (A' \sqcup B', E')$ be the induced subgraph of $G$. If there is a vertex in $A'$ with degree (in the induced subgraph $G'$) less than $\frac{|E|}{4|A|}$, remove it from $A'$. If there is a vertex in $B'$ with degree (in the induced subgraph $G'$) less than $\frac{|E|}{4|B|}$, remove it from $B'$. At the end of this procedure, we are left with sets $A', B'$ with the required minimum degrees. We can count the number of edges lost as we remove vertices in the procedure. Whenever a vertex in $A'$ is removed we lose at most $\frac{|E|}{4|A|}$ edges, and whenever a vertex from $B'$ is removed we lose at most $\frac{|E|}{4|B|}$ edges. So
\[ |E'| \ge |E| - |A| \cdot \frac{|E|}{4|A|} - |B| \cdot \frac{|E|}{4|B|} = |E|/2. \]
In particular $E'$ is non-empty, and hence so are $A'$ and $B'$.

Proof of Theorem 1
The main technical tool will be the following lemma, which shows that one can find a vanishing polynomial of low degree, assuming each point is in many rich lines.
Lemma 3.1.
For each $d \ge 1$ there is a constant $K_d \le 32(2d)^d$ such that the following holds. Let $V \subset \mathbb{C}^d$ be a set of $n$ points and let $r \ge 4$ be an integer. Suppose that, through each point $v \in V$, there are at least $k$ $r$-rich lines, where
\[ k \ge K_d \cdot \frac{n}{r^{d-2}}. \]
Then, there exists a non-zero polynomial $f \in \mathbb{C}[x_1, \ldots, x_d]$ of degree at most $r - 2$ such that $f(v) = 0$ for all $v \in V$.

If we have the stronger condition that the number of $r$-rich lines through each point of $V$ is between $k$ and $8k$, then we can get the same conclusion (vanishing $f$ of degree at most $r - 2$) under the weaker inequality
\[ k \ge K_d \cdot \frac{n}{r^{d-1}}. \]

Proof.
Let $V = \{v_1, \ldots, v_n\}$ and let $\phi = \phi_{d,r-2} : \mathbb{C}^d \to \mathbb{C}^{m(d,r-2)}$ be the Veronese embedding with degree bound $r - 2$. Let us denote $U = \{u_1, \ldots, u_n\} \subset \mathbb{C}^{m(d,r-2)}$ with $u_i = \phi(v_i)$ for all $i \in [n]$.

We will prove the lemma by showing that $U$ is contained in a hyperplane and then using Claim 2.3 to deduce the existence of the vanishing polynomial. Let $M$ be an $n \times m(d, r-2)$ matrix whose $i$'th row is $u_i = \phi(v_i)$. To show that $U$ is contained in a hyperplane, it is enough to show that $\mathrm{rank}(M) < m(d, r-2)$, since then the columns of $M$ are linearly dependent, which means that all the rows lie in some hyperplane.

We will now construct a design matrix $A$ such that $A \cdot M = 0$. Since $\mathrm{rank}(A) + \mathrm{rank}(M) \le n$, we will be able to translate a lower bound on the rank of $A$ (which will be given by Theorem 2.2) to the required upper bound on the rank of $M$. Each row in $A$ will correspond to some collinear $r$-tuple in $V$. We will construct $A$ in several stages. First, for each $r$-rich line $L \in \mathcal{L}_r(V)$ we will construct a set of $r$-tuples $R_L \subset \binom{V}{r}$ such that
1. Each $r$-tuple in $R_L$ is contained in $L \cap V$.
2. Each point $v \in L \cap V$ is in at least one $r$-tuple from $R_L$.
3. Every pair of distinct points $u, v \in L \cap V$ appear together in at most two $r$-tuples from $R_L$.

If $|L \cap V|$ is a multiple of $r$, we can construct such a set $R_L$ easily by taking a disjoint cover of $r$-tuples. If $|L \cap V|$ is not a multiple of $r$ (but is still of size at least $r$) we can take a maximal set of disjoint $r$-tuples inside it and then add to it one more $r$-tuple that will cover the remaining elements and will otherwise intersect only one other $r$-tuple. This will guarantee that the third condition holds. We define $R \subset \binom{V}{r}$ to be the union of all sets $R_L$ over all $r$-rich lines $L$. We can now prove:

Claim 3.2.
The set $R \subset \binom{V}{r}$ defined above has the following three properties.
1. Each point $v \in V$ is contained in at least $k$ $r$-tuples from $R$.
2. Every pair of distinct points $u, v \in V$ is contained together in at most two $r$-tuples from $R$.
3. Let $(v_{i_1}, \ldots, v_{i_r}) \in R$. Then there exist $r$ non-zero coefficients $\alpha_1, \ldots, \alpha_r \in \mathbb{C}$ so that $\alpha_1 \cdot u_{i_1} + \cdots + \alpha_r \cdot u_{i_r} = 0$.

If, in addition, we know that each point belongs to at most $8k$ rich lines (as in the second part of the lemma) then we also have that $|R| \le 16nk/r$.

Proof. The first property follows from the fact that each $v$ is in at least $k$ $r$-rich lines and that each $R_L$ with $v \in L$ has at least one $r$-tuple containing $v$. The second property follows from the fact that each pair $u, v$ can belong to at most one $r$-rich line $L$ and that each $R_L$ can contain at most two $r$-tuples with both $u$ and $v$. The fact that the $r$-tuple of points $u_{i_1}, \ldots, u_{i_r}$ is linearly dependent follows from Claim 2.4. The fact that all the coefficients $\alpha_j$ are non-zero holds since no proper subset of that $r$-tuple is linearly dependent (by the 'moreover' part of Claim 2.4). If each point is in at most $8k$ lines then each point is in at most $16k$ $r$-tuples (at most two on each line). This means that there could be at most $16nk/r$ tuples in $R$ since otherwise, some point would be in too many tuples.

We now construct the matrix $A$ of size $m \times n$ where $m = |R|$. For each $r$-tuple $(v_{i_1}, \ldots, v_{i_r}) \in R$ we add a row to $A$ (the order of the rows does not matter) that has zeros in all positions except $i_1, \ldots, i_r$ and has the values $\alpha_1, \ldots, \alpha_r$ given by Claim 3.2 in those positions. Since the rows of $M$ are the points $u_1, \ldots, u_n$, the third item of Claim 3.2 guarantees that $A \cdot M = 0$, as we wanted. The next claim asserts that $A$ is a design matrix.

Claim 3.3.
The matrix $A$ constructed above is an $(r, k, 2)$-design matrix.

Proof. Clearly, each row of $A$ contains at most $r$ non-zero coordinates. Since each point $v \in V$ is in at least $k$ $r$-tuples from $R$ we have that each column of $A$ contains at least $k$ non-zero coordinates. The size of the intersection of the supports of two distinct columns in $A$ is at most two by item (2) of Claim 3.2.

We now use Eq. (1) from Theorem 2.2 to get
\[ \mathrm{rank}(A) \ge n - \frac{2nr^2}{k}. \]
This implies (using $r \ge 4$) that
\[ \mathrm{rank}(M) \le \frac{2nr^2}{k} \le \left( \frac{r-2}{d} \right)^d < m(d, r-2), \]
if $k \ge 32(2d)^d \cdot \frac{n}{r^{d-2}}$.

If we have the additional assumption that each point is in at most $8k$ lines then, using the bound $m = |R| \le 16nk/r$ in Eq. (2) of Theorem 2.2, we get
\[ \mathrm{rank}(A) \ge n - \frac{2mr^2}{k^2} \ge n - \frac{32nr}{k}, \]
which gives
\[ \mathrm{rank}(M) \le \frac{32nr}{k} < m(d, r-2) \]
if $k \ge 32(2d)^d \cdot \frac{n}{r^{d-1}}$. Hence, the rows of $M$ lie in a hyperplane. This completes the proof of the lemma.

Remark 3.4.
Lemma 3.1 can be extended to the case where we have $r$-rich curves of bounded degree $D = O(1)$ with 'two degrees of freedom', i.e., through every pair of points there can be at most $C = O(1)$ distinct curves (e.g., unit circles). Under the Veronese embedding $\phi_{d, \lfloor (r-2)/D \rfloor}$, the images of $r$ points on a degree $D$ curve are linearly dependent. So we can still construct a design matrix as in the above proof, where the design parameters depend on $D, C$. Once we get a hypersurface of degree $\lfloor (r-2)/D \rfloor$ vanishing on all the points, the hypersurface should also contain all the degree $D$ $r$-rich curves.

We will now use Lemma 3.1 to prove Theorem 1. The reduction uses Lemma 2.8 to reduce to the case where each point has many rich lines through it. Once we find a vanishing low degree polynomial we analyze its singularities to find a point such that all lines through it are in some hyperplane.
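The refinement step from Lemma 2.8 that this reduction relies on can be sketched in a few lines of Python. This is a minimal illustration rather than part of the argument: the function name `refine_bipartite` is hypothetical, the thresholds $|E|/(4|A|)$ and $|E|/(4|B|)$ are computed once from the original graph as in the proof of the lemma, and (as a small deviation from that proof) all low-degree vertices found in a round are removed together instead of one at a time, which loses no more edges.

```python
def refine_bipartite(left, right, edges):
    """Prune low-degree vertices as in Lemma 2.8.

    `left` and `right` are the two vertex sets and `edges` is a non-empty
    collection of pairs (a, b) with a in `left` and b in `right`.
    Returns the surviving (left, right, edges).
    """
    edges = set(edges)
    # Thresholds are fixed from the ORIGINAL graph and never updated.
    threshold_left = len(edges) / (4 * len(left))
    threshold_right = len(edges) / (4 * len(right))
    left, right = set(left), set(right)
    while True:
        degree_left = {a: 0 for a in left}
        degree_right = {b: 0 for b in right}
        for a, b in edges:
            degree_left[a] += 1
            degree_right[b] += 1
        low_left = {a for a in left if degree_left[a] < threshold_left}
        low_right = {b for b in right if degree_right[b] < threshold_right}
        if not low_left and not low_right:
            return left, right, edges
        left -= low_left
        right -= low_right
        edges = {(a, b) for a, b in edges if a in left and b in right}


# Illustrative input: a complete bipartite graph on 6 + 6 vertices plus one
# extra left vertex (labelled 6) attached to a single right vertex.
example_edges = [(a, b) for a in range(6) for b in range(6)] + [(6, 0)]
kept_left, kept_right, kept_edges = refine_bipartite(range(7), range(6), example_edges)
```

On this input the pendant vertex has degree 1, below the threshold $37/28$, so it is removed, while the 36 edges of the complete bipartite part survive. In the proof of Theorem 1 the same procedure is applied to the incidence graph between points and rich lines.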
Proof of Theorem 1.
Since $|\mathcal{L}_r(V)| \le n^2$ for all $r \ge 2$, by choosing $C_d > R_d^d$ we can assume that $r \ge R_d$ for any large constant $R_d$ depending only on $d$.

Let $\mathcal{L} = \mathcal{L}_r(V)$ be the set of $r$-rich lines in $V$ and let $I = I(\mathcal{L}, V)$ be the set of incidences between $\mathcal{L}$ and $V$. By the conditions of the theorem we have
\[ |I| \ge r|\mathcal{L}| \ge C_d \cdot \frac{\alpha n^2}{r^{d-1}}. \tag{3} \]
Applying Lemma 2.8 to the incidence graph between $V$ and $\mathcal{L}$, we obtain non-empty subsets $V' \subset V$ and $\mathcal{L}' \subset \mathcal{L}$ such that each $v \in V'$ is in at least $k = \frac{|I|}{4n}$ lines from $\mathcal{L}'$, each line in $\mathcal{L}'$ is $(r/4)$-rich w.r.t. $V'$, and
\[ |I'| = |I(\mathcal{L}', V')| \ge |I|/2. \]

We would like to apply Lemma 3.1 with the stronger condition that each point is incident on approximately the same number of lines (which gives better dependence on $r$). To achieve this, we will further refine our set of points using dyadic pigeonholing.

Let $V' = V'_1 \sqcup V'_2 \sqcup \cdots$ be a partition of $V'$ into disjoint subsets where $V'_j$ is the set of points incident to at least $k_j = 2^{j-1}k$ and less than $2^j k$ lines from $\mathcal{L}'$. Let $I'_j = I(\mathcal{L}', V'_j)$, so that
\[ \sum_{j \ge 1} |I'_j| = |I'| \ge |I|/2. \]
Since $\sum_{j \ge 1} \frac{1}{2j^2} < 1$, there exists $j$ such that $|I'_j| \ge \frac{|I|}{4j^2}$. Let us fix $j$ to this value for the rest of the proof.

We will first upper bound $j$. Since $|I'_j| > 0$, $V'_j$ is non-empty; let $p \in V'_j$. There are at least $k_j$ $(r/4)$-rich lines through $p$ and, by choosing $R_d \ge 8$, there are at least $r/4 - 1 \ge r/8$ points other than $p$ on each of these lines, and they are all distinct. So,
\[ n = |V| \ge 2^{j-1}k \cdot \frac{r}{8} = 2^{j-6} \cdot \frac{r|I|}{n} \ge C_d \cdot 2^{j-6} \cdot \frac{\alpha n}{r^{d-2}} \ge 2^{j-6} \cdot \frac{n}{r^{d-2}}. \]
This implies that $j \lesssim d \log r$, where we assumed above that $C_d \ge 1$.

Since lines in $\mathcal{L}'$ need not be $(r/4)$-rich w.r.t. $V'_j$, we need further refinement. Apply Lemma 2.8 again on the incidence graph $I'_j = I(\mathcal{L}', V'_j)$ to get non-empty $V'' \subset V'_j$ and $\mathcal{L}'' \subset \mathcal{L}'$ with
\[ |I''| = |I(\mathcal{L}'', V'')| \ge \frac{|I'_j|}{2} \ge \frac{|I|}{8j^2} \ge \frac{r|\mathcal{L}|}{8j^2}. \]
Each line in $\mathcal{L}''$ is incident to at least $\frac{|I'_j|}{4|\mathcal{L}'|} \ge \frac{r}{16j^2} = r_0$ points from $V''$ and so $\mathcal{L}''$ is $r_0$-rich w.r.t. $V''$. And each point in $V''$ is incident to at least $\frac{|I'_j|}{4|V'_j|} \ge \frac{k_j}{4} = 2^{j-3}k = k_0$ and at most $2^j k = 8k_0$ lines from $\mathcal{L}''$. Since $j \lesssim d \log r$, we can assume that $r_0 = \frac{r}{16j^2} \gg d$ by taking $R_d$ sufficiently large.

The following claim shows that we can apply Lemma 3.1 to $V''$ and $\mathcal{L}''$.

Claim 3.5. $k_0 \ge K_d \cdot \frac{|V''|}{r_0^{d-1}}$, where $K_d$ is the constant in Lemma 3.1.

Proof. We have
\[ |V''| \le |V'_j| \le \frac{|I|}{2^{j-1}k} = \frac{n}{2^{j-3}}. \]
So it is enough to show that $k_0 \ge K_d \cdot \frac{n}{2^{j-3} r_0^{d-1}}$. Substituting $k_0 = 2^{j-5}|I|/n$ and $r_0 = r/(16j^2)$, this is equivalent to
\[ |I| \ge K_d \cdot 16^{d+1} \cdot \left( \frac{j^{2d-2}}{4^j} \right) \frac{n^2}{r^{d-1}}, \]
which follows from Eq. (3) by choosing $C_d > K_d \cdot 16^{d+1} \cdot \max_j \left( \frac{j^{2d-2}}{4^j} \right)$.

Hence, by Lemma 3.1, there exists a non-zero polynomial $f \in \mathbb{C}[x_1, \ldots, x_d]$ of degree at most $r_0 - 2$, vanishing at all points of $V''$. W.l.o.g. suppose $f$ has minimal total degree among all non-zero polynomials vanishing on $V''$. Since $f$ has degree at most $r_0 - 2$, it must vanish identically on all lines in $\mathcal{L}''$.

We say that a point $v \in V''$ is 'flat' if the set of lines from $\mathcal{L}''$ passing through $v$ is contained in some affine hyperplane through $v$. Otherwise, we call the point $v$ a 'joint'. We will show that there is at least one flat point in $V''$. Suppose towards a contradiction that all points in $V''$ are joints. Let $v \in V''$ be some point and let $\nabla f(v)$ be the gradient of $f$ at $v$. Since $f$ vanishes identically on all lines in $\mathcal{L}''$, the derivative of $f$ at $v$ in the direction of each line through $v$ is zero; as $v$ is a joint these directions span $\mathbb{C}^d$, and we get that $\nabla f(v) = 0$ ($v$ is a singular point of the hypersurface defined by $f$). We now get a contradiction, since one of the coordinates of $\nabla f$ is a non-zero polynomial of degree smaller than the degree of $f$ that vanishes on the entire set $V''$.

Hence, there exists a point $v \in V''$ and an affine hyperplane $H$ passing through $v$ such that all $r_0$-rich lines in $\mathcal{L}''$ passing through $v$ are contained in $H$. Since there are at least $k_0$ such lines, and each line contains at least $r_0 - 1$ points in addition to $v$, we get that $H$ contains at least
\[ (r_0 - 1)k_0 \ge \frac{r}{32j^2} \cdot \frac{2^{j-5}|I|}{n} \ge C_d \left( \frac{2^{j-10}}{j^2} \right) \frac{\alpha n}{r^{d-2}} \ge C'_d \cdot \frac{\alpha n}{r^{d-2}} \]
points from $V$, where $C'_d = C_d \cdot \min_j \left( \frac{2^{j-10}}{j^2} \right)$. Observing the proof, we can take the constants to be $C_d = d^{\Theta(d)}$ and $C'_d = \Omega(C_d)$.

Remark 3.6.
Observe that we can take $\mathcal{L}$ to be any subset of $\mathcal{L}_r(V)$ of size $\ge C_d \frac{\alpha n^2}{r^d}$ and obtain the same conclusion. Moreover, the hyperplane $H$ that we obtain at the end contains $k_0 \gtrsim \frac{\alpha n}{r^d}$ lines of $\mathcal{L}$.

We will reduce the problem of bounding $r$-term arithmetic progressions to that of bounding $r$-rich lines using the following claim:

Claim 4.1.
Let $V \subset \mathbb{C}^d$. Then $AP_r(V) \le |\mathcal{L}_r([r] \times V)|$, where $[r] = \{0, 1, \dots, r-1\}$.

Proof.
For $u, w \in \mathbb{C}^d$, $w \ne 0$, let $(u, u + w, \dots, u + (r-1)w)$ be an $r$-term arithmetic progression in $V$. Then the line $\{(0, u) + z(1, w)\}_{z \in \mathbb{C}}$ is $r$-rich w.r.t. the point set $[r] \times V \subset \mathbb{C}^{d+1}$, since for each $z \in [r]$ it contains the point $(z, u + zw)$; moreover this mapping is injective.

We need the following claim regarding arithmetic progressions in product sets.

Claim 4.2.
Let V ⊂ C d be a set of n points and let ℓ ≥ be an integer. Then, for all r ≥ , theproduct set V ℓ ⊂ C dℓ satisfies AP r ( V ℓ ) ≥ AP r ( V ) ℓ . Proof.
Let $P(V)$ be the set of $r$-term arithmetic progressions in $V$ and let $P(V^\ell)$ be the set of $r$-term progressions in $V^\ell$. We will describe an injective mapping from $P(V)^\ell$ into $P(V^\ell)$. For $u, w \in \mathbb{C}^d$ let $L_{u,w} = \{u, u + w, \dots, u + (r-1)w\}$ be the $r$-term progression starting at $u$ with difference $w$. Let $u_1, \dots, u_\ell, w_1, \dots, w_\ell \in \mathbb{C}^d$ be such that $L_{u_i, w_i} \in P(V)$ for each $i \in [\ell]$. We map them into the arithmetic progression $L_{u,w} \in P(V^\ell)$ with $u = (u_1, \dots, u_\ell)$ and $w = (w_1, \dots, w_\ell)$. Clearly, this map is injective (care should be taken to assign each progression a unique difference since these are determined up to a sign).

Proof of Theorem 2.
Let us assume $AP_r(V) \gg_{d,\epsilon} n^2 / r^{d-\epsilon}$. Let $\ell = \lceil 1/\epsilon \rceil$. By Claim 4.2, $AP_r(V^\ell) \ge AP_r(V)^\ell$. Let $\mathcal{L}$ be the collection of $r$-rich lines w.r.t. $[r] \times V^\ell \subset \mathbb{C}^{d\ell+1}$ corresponding to non-trivial $r$-term arithmetic progressions in $V^\ell$, as given by Claim 4.1. So
$$|\mathcal{L}_r([r] \times V^\ell)| \ge |\mathcal{L}| = AP_r(V^\ell) \ge AP_r(V)^\ell \gg_{d,\epsilon} \frac{n^{2\ell}}{r^{d\ell - \epsilon\ell}} \ge \frac{n^{2\ell}}{r^{d\ell-1}} = \frac{(n^\ell r)^2}{r^{d\ell+1}}.$$
By Theorem 1 (choosing the constants appropriately), there is a hyperplane $H$ in $\mathbb{C}^{d\ell+1}$ which contains $\gtrsim_{d,\epsilon} n^\ell r / r^{d\ell-1}$ points of $[r] \times V^\ell$. Moreover, by Remark 3.6, $H$ contains some of the lines of $\mathcal{L}$. So $H$ cannot be one of the hyperplanes $\{z = i\}_{i \in [r]}$ (here $z$ denotes the first coordinate), because they do not contain any lines of $\mathcal{L}$. So the intersection of $H$ with one of the $r$ hyperplanes $\{z = i\}_{i \in [r]}$ (say $z = j$) gives a $(d\ell - 1)$-flat which contains $\gtrsim_{d,\epsilon} n^\ell / r^{d\ell-1}$ points of $\{j\} \times V^\ell$. This gives a hyperplane $H'$ in $\mathbb{C}^{d\ell}$ which contains $\gtrsim_{d,\epsilon} n^\ell / r^{d\ell-1}$ points of $V^\ell$. Now by Lemma 2.7, we can conclude that there is a hyperplane in $\mathbb{C}^d$ which contains $\gtrsim_{d,\epsilon} n / r^{d\ell-1} = n / r^{d\lceil 1/\epsilon \rceil - 1}$ points of $V$.

Suppose in contradiction that $|T_C| > \lambda N / C^2$ for some large absolute constant $\lambda$ which we will choose later. Let $Q \subset T_C$ be a set of size
$$|Q| = \left\lceil \frac{\lambda N}{C^2} \right\rceil$$
containing the zero element $0 \in Q$ (we have $0 \in T_C$ since the sum-set $|A + 0A| = |A|$ is small). Let us denote $r = |Q|$ and let $m = \frac{N^{1.5}}{C\sqrt{\log N}}$.
Let $d = \lceil 100 \log N \rceil$. We will use our assumption on the size of $Q$ to construct a configuration of points $V \subset \mathbb{C}^d$ with many $r$-rich lines. Then we will use Lemma 3.1 to derive a contradiction. The set $V$ will be a union of the sets
$$V_t = \{t\} \times (A + tA)^{d-1} = \{(t, a_2 + t b_2, \dots, a_d + t b_d) \mid a_i, b_j \in A\}$$
over all $t \in Q$. That is, $V = \bigcup_{t \in Q} V_t$. Notice the special structure of the set $V_0 = \{0\} \times A^{d-1}$. We denote
$$n = |V| \le r \cdot m^{d-1}. \qquad (4)$$
Notice that, by construction, for every $a = (0, a_2, \dots, a_d)$ and every $b = (1, b_2, \dots, b_d)$ (with all the $a_i, b_j$ in $A$), the line through the point $a \in V_0$ in direction $b$ is $r$-rich w.r.t. $V$. Let us denote by $\mathcal{L} \subset \mathcal{L}_r(V)$ the set of all lines of this form. We thus have
$$|\mathcal{L}| = N^{2d-2}. \qquad (5)$$
Let $I = I(V, \mathcal{L})$; then $|I| \ge r|\mathcal{L}|$. We now use Lemma 2.8 to find subsets $V' \subset V$ and $\mathcal{L}' \subset \mathcal{L}$ such that each point in $V'$ is in at least $k_0 = \frac{r N^{2d-2}}{4n}$ lines from $\mathcal{L}'$, each line in $\mathcal{L}'$ is $r_0 = r/4$ rich w.r.t. $V'$, and $|I(V', \mathcal{L}')| \ge |I|/2$. Observe that, since each line in $\mathcal{L}'$ contains at most $r$ points from $V'$, we have $|\mathcal{L}'| \ge |I(V', \mathcal{L}')|/r \ge |\mathcal{L}|/2$. The following claim shows that we can apply Lemma 3.1 on the set $V'$.

Claim 5.1. $k_0 \ge K_d \cdot \frac{n}{r_0^{d-2}}$, where $K_d = 32(2d)^d$ is the constant in Lemma 3.1.

Proof. Plugging in the values of $k_0, r_0$ and rearranging, we need to show that
$$\frac{N^{2d-2}\, r^{d-1}}{8(8d)^d} \ge n^2.$$
Using Eq. (4) to bound $n$ we get that it is enough to show
$$\frac{N^{2d-2}\, r^{d-1}}{8(8d)^d} \ge \frac{r^2 N^{3d-3}}{C^{2d-2} (\log N)^{d-1}}.$$
Rearranging, we need to show that
$$r^{d-3} \ge 8(8d)^d \cdot \frac{N^{d-1}}{(C^2)^{d-1} (\log N)^{d-1}}.$$
We now raise both sides to the power $1/(d-3)$ and use the fact that, for $\ell > \log X$, we have $1 \le X^{1/\ell} \le 2$. Thus it is enough to show
$$r \ge K' \frac{dN}{C^2 \log N}$$
for some absolute constant $K'$. Plugging in the value of $d$ we get that the claim would follow if $r \ge 100 K' \frac{N}{C^2}$, which holds by choosing $\lambda = 100K'$.

Since $C \ll \sqrt{N}$, $r_0 \ge 4$. Applying Lemma 3.1, we get a non-zero polynomial $f \in \mathbb{C}[x_1, \dots, x_d]$ of degree at most $r_0 - 2$ that vanishes on all points of $V'$. This means that $f$ must also vanish identically on all lines in $\mathcal{L}'$ (since these are all $r_0$-rich w.r.t. $V'$). Since each line in $\mathcal{L}'$ intersects $V_0$ exactly once, and since $|V_0| = N^{d-1}$, we get that there must be at least one point $v \in V_0$ that is contained in at least $|\mathcal{L}'| / N^{d-1} \ge N^{d-1}/2$ lines (in different directions) from $\mathcal{L}'$. Let $\tilde f$ denote the homogeneous part of $f$ of highest degree. If $f$ vanishes identically on a line in direction $b \in \mathbb{C}^d$, this implies that $\tilde f(b) = 0$ (to see this notice that the leading coefficient of $g(t) = f(a + tb)$ is $\tilde f(b)$). Hence, since all the directions of lines in $\mathcal{L}'$ are from the set $\{1\} \times A^{d-1}$, we get that $\tilde f$ has at least $N^{d-1}/2$ zeros in the set $\{1\} \times A^{d-1}$. This contradicts Lemma 2.6 since the degree of $\tilde f$ is at most $r_0 - 2 = r/4 - 2 < N/2$ (since $r = \lceil \lambda N / C^2 \rceil$ and $C \gg 1$).

5.1 A proof of Theorem 3 using Szemerédi-Trotter in $\mathbb{C}^2$

The following is a slightly stronger version of Theorem 3 (without the logarithmic factor), which we prove using a simple application of the two-dimensional Szemerédi-Trotter theorem (to derive Theorem 3 replace $C$ with $C\sqrt{N}$).

Theorem 5.2.
Let $A \subset \mathbb{C}$ be a set of $N$ complex numbers and let $C \gg 1$. Define
$$T_C = \left\{ t \in \mathbb{C} : |A + tA| \le \frac{N^2}{C} \right\}.$$
Then $|T_C| \lesssim \frac{N^2}{C^2}$.

Proof.
Define the set of points
$$P = \bigcup_{t \in T_C} \{t\} \times (A + tA)$$
and the set of lines $\mathcal{L} = \{ (z, a + za')_{z \in \mathbb{C}} : a, a' \in A \}$ in $\mathbb{C}^2$. Each line in $\mathcal{L}$ contains $r = |T_C|$ points from $P$. So by using Theorem 1.2, we have
$$|\mathcal{L}| \le K \left( \frac{|P|^2}{r^3} + \frac{|P|}{r} \right)$$
where $K$ is some absolute constant. By construction, $|P| \le |T_C| N^2 / C = r N^2 / C$ and $|\mathcal{L}| = N^2$. So
$$\frac{N^2}{K} \le \frac{N^4}{r C^2} + \frac{N^2}{C} \;\Rightarrow\; r \le \frac{2KN^2}{C^2}$$
where we assumed that $C \ge 2K$.

References

[BDWY12] B. Barak, Z. Dvir, A. Wigderson, and A. Yehudayoff. Fractional Sylvester-Gallai theorems.
Proceedings of the National Academy of Sciences, 2012.
[BDYW11] B. Barak, Z. Dvir, A. Yehudayoff, and A. Wigderson. Rank bounds for design matrices with applications to combinatorial geometry and locally correctable codes. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, STOC '11, pages 519–528, New York, NY, USA, 2011. ACM.
[BS14] Saugata Basu and Martin Sombra. Polynomial partitioning on varieties and point-hypersurface incidences in four dimensions. arXiv preprint arXiv:1406.2144, 2014.
[DSW14] Z. Dvir, S. Saraf, and A. Wigderson. Improved rank bounds for design matrices and a new proof of Kelly's theorem. Forum of Mathematics, Sigma, 2, 10 2014.
[Dvi12] Z. Dvir. Incidence Theorems and Their Applications. Foundations and Trends in Theoretical Computer Science, 6(4):257–393, 2012.
[GK10] L. Guth and N. Katz. Algebraic methods in discrete analogs of the Kakeya problem. Advances in Mathematics, 225(5):2828–2839, 2010.
[GK15] Larry Guth and Nets Hawk Katz. On the Erdős distinct distances problem in the plane. Annals of Mathematics, 181(1):155–190, 2015.
[Kol14] J. Kollar. Szemeredi–Trotter-type theorems in dimension 3. 2014. arXiv:1405.2243.
[LS07] Izabella Laba and József Solymosi. Incidence theorems for pseudoflats. Discrete & Computational Geometry, 37(2):163–174, 2007.
[Rud14] M. Rudnev. On the number of incidences between planes and points in three dimensions. 2014. arXiv:1407.0426v2.
[Sch80] J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, 1980.
[SS14] Micha Sharir and Noam Solomon. Incidences between points and lines in $\mathbb{R}^4$. 2014. http://arxiv.org/abs/1411.0777.
[SSZ13] Micha Sharir, Adam Sheffer, and Joshua Zahl. Improved bounds for incidences between points and circles. In Proceedings of the Twenty-ninth Annual Symposium on Computational Geometry, SoCG '13, pages 97–106, New York, NY, USA, 2013. ACM.
[ST83] E. Szemerédi and W. T. Trotter. Extremal problems in discrete geometry. Combinatorica, 3(3):381–392, 1983.
[ST12] József Solymosi and Terence Tao. An incidence theorem in higher dimensions. Discrete and Computational Geometry, 48(2):255–280, 2012.
[SV04] József Solymosi and V. H. Vu. Distinct distances in high dimensional homogeneous sets. Contemporary Mathematics, 342:259–268, 2004.
[Tot03] C. Toth. The Szemeredi-Trotter theorem in the complex plane. arXiv:math/0305283v4, 2003.
[Zah12] Joshua Zahl. A Szemeredi-Trotter type theorem in $\mathbb{R}^4$. CoRR, abs/1203.4600, 2012.
[Zip79] R. Zippel. Probabilistic algorithms for sparse polynomials. In Proceedings of the International Symposium on Symbolic and Algebraic Computation, pages 216–226. Springer-Verlag, 1979.
A Towards an optimal incidence theorem for points and lines in $\mathbb{C}^d$

For $\alpha \ge 1$ and $1 < \ell < d$, by pasting together $r^{d-\ell}/\alpha$ $\ell$-dimensional grids of size $\alpha n / r^{d-\ell}$ each, we get $\gtrsim_d \alpha n^2 / r^{d+1}$ $r$-rich lines. This motivates a stronger version of Conjecture 1.3.

Conjecture A.1. Suppose $V \subset \mathbb{C}^d$ is a set of $n$ points and let $\alpha \ge 1$. For $r \ge 2$, if
$$|\mathcal{L}_r(V)| \gg_d \alpha \frac{n^2}{r^{d+1}} + \frac{n}{r},$$
then there is an integer $\ell$ such that $1 < \ell < d$ and a subset $V' \subset V$ of size $\gtrsim_d \alpha n / r^{d-\ell}$ which is contained in an $\ell$-flat.

The above conjecture is equivalent to the following conjecture.

Conjecture A.2 (Equiv. to Conjecture A.1). Given a set $V$ of $n$ points in $\mathbb{C}^d$, let $s_\ell$ denote the maximum number of points of $V$ contained in an $\ell$-flat. Then for $r \ge 2$,
$$|\mathcal{L}_r(V)| \lesssim_d \frac{n^2}{r^{d+1}} + n \sum_{\ell=2}^{d-1} \frac{s_\ell}{r^{\ell+1}} + \frac{n}{r}.$$

Proof of equivalence to Conjecture A.1. A.2 $\Rightarrow$ A.1 is trivial. To show the other direction, if $|\mathcal{L}_r(V)| \lesssim_d \frac{n^2}{r^{d+1}} + \frac{n}{r}$, we are done. Else, let
$$|\mathcal{L}_r(V)| = C_d \left( \alpha \frac{n^2}{r^{d+1}} + \frac{n}{r} \right)$$
for some $\alpha \ge 1$ and $C_d \gg_d 1$. By A.1, for some $1 < \ell < d$, we have $s_\ell \gtrsim_d \alpha n / r^{d-\ell}$. So we have $|\mathcal{L}_r(V)| \lesssim_d \left( \frac{n s_\ell}{r^{\ell+1}} + \frac{n}{r} \right)$. This implies A.2.

Note that Theorem 1.6, and thus Theorem 1, trivially follow from Conjecture A.2 by observing that $s_{d-1} \ge s_{d-2} \ge \cdots \ge s_1 \ge r$. Conjecture A.2, if true, can be used to give an optimal bound on incidences between points and lines in $\mathbb{C}^d$ in terms of the $s_\ell$'s by standard arguments.

For $d = 2$, Conjecture A.2 is exactly Theorem 1.2. Using the incidence bounds of [GK15] and [SS14] we can prove Conjecture A.2 for $\mathbb{R}^3$ and 'almost' prove it for $\mathbb{R}^4$ (these are stated as Theorem 1.4 and Theorem 1.5 respectively). As already discussed, it is possible that one needs to weaken Conjecture A.2 so that $s_\ell$ is the maximum number of points in an $\ell$-dimensional hypersurface of constant degree (possibly depending on $\ell$).

For completeness, we include in this section a short derivation of Theorems 1.4 and 1.5 from the incidence bounds of [GK15] and [SS14], which we now state.

Theorem A.3 ([GK15]). Let $V$ be a set of $n$ points and $\mathcal{L}$ be a set of $m$ lines in $\mathbb{R}^3$. Let $q$ be the maximum number of lines in a hyperplane (2-flat). Then,
$$|I(V, \mathcal{L})| \lesssim n^{1/2} m^{3/4} + n^{2/3} m^{1/3} q^{1/3} + n + m.$$

Theorem A.4 ([SS14]). Let $V$ be a set of $n$ points and $\mathcal{L}$ be a set of $m$ lines in $\mathbb{R}^4$. Let $q_3$ be the maximum number of lines of $\mathcal{L}$ contained in a quadric hypersurface or a hyperplane and $q_2$ be the maximum number of lines of $\mathcal{L}$ contained in a 2-flat. Then,
$$|I(V, \mathcal{L})| \lesssim 2^{c\sqrt{\log n}} \left( n^{2/5} m^{4/5} + n \right) + n^{1/2} m^{1/2} q_3^{1/4} + n^{2/3} m^{1/3} q_2^{1/3} + m$$
for some absolute constant $c$.

Proof of Theorem 1.4. Let $\mathcal{L} = \mathcal{L}_r(V)$ be the set of $r$-rich lines and let $|\mathcal{L}| = m$. Let $q$ be the maximum number of lines of $\mathcal{L}$ contained in a hyperplane. By Theorem A.3,
$$rm \le |I(V, \mathcal{L})| \lesssim n^{1/2} m^{3/4} + n^{2/3} m^{1/3} q^{1/3} + n + m.$$
From this it follows that
$$m \lesssim \frac{n^2}{r^4} + \frac{n q^{1/2}}{r^{3/2}} + \frac{n}{r}.$$
Now we will upper bound $q$. Let $\mathcal{L}' \subset \mathcal{L}$ be a set of $q$ lines contained in some hyperplane $H$ and let $V' = V \cap H$. We know that $|V'| \le s_2$.
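By applying Theorem 1.2 to $\mathcal{L}', V'$ in $H$, we get
$$q = |\mathcal{L}'| \lesssim \frac{|V'|^2}{r^3} + \frac{|V'|}{r} \le \frac{s_2^2}{r^3} + \frac{s_2}{r} \;\Rightarrow\; q^{1/2} \lesssim \frac{s_2}{r^{3/2}} + \frac{s_2^{1/2}}{r^{1/2}}.$$
Using this bound on $q$ we get
$$m \lesssim \frac{n^2}{r^4} + \frac{n s_2}{r^3} + \frac{n s_2^{1/2}}{r^2} + \frac{n}{r} \lesssim \frac{n^2}{r^4} + \frac{n s_2}{r^3} + \frac{n}{r},$$
where in the last step we used the AM-GM inequality: $\frac{n s_2^{1/2}}{r^2}$ is the geometric mean of $\frac{n s_2}{r^3}$ and $\frac{n}{r}$.

Proof of Theorem 1.5. In this proof $c$ represents some absolute constant which can vary from step to step. Let $\mathcal{L} = \mathcal{L}_r(V)$ be the set of $r$-rich lines and let $|\mathcal{L}| = m$. Let $q_3$ be the maximum number of lines of $\mathcal{L}$ contained in a quadric hypersurface or a hyperplane and $q_2$ be the maximum number of lines of $\mathcal{L}$ contained in a 2-flat. By Theorem A.4,
$$rm \le |I(V, \mathcal{L})| \lesssim 2^{c\sqrt{\log n}} \left( n^{2/5} m^{4/5} + n \right) + n^{1/2} m^{1/2} q_3^{1/4} + n^{2/3} m^{1/3} q_2^{1/3} + m.$$
From this it follows that
$$m \lesssim 2^{c\sqrt{\log n}} \left( \frac{n^2}{r^5} + \frac{n}{r} \right) + \frac{n q_3^{1/2}}{r^2} + \frac{n q_2^{1/2}}{r^{3/2}}. \qquad (6)$$
By applying Theorem 1.2 to the collection of $q_2$ lines contained in a 2-flat, we get
$$q_2 \lesssim \frac{s_2^2}{r^3} + \frac{s_2}{r}.$$
Now we will upper bound $q_3$. Let $\mathcal{L}' \subset \mathcal{L}$ be a set of $q_3$ lines contained in some quadric or hyperplane $Q$ and let $V' = V \cap Q$. We know that $|V'| \le s_3$. By applying Theorem A.4 again to $\mathcal{L}', V'$, we get
$$r q_3 \le |I(V', \mathcal{L}')| \lesssim 2^{c\sqrt{\log s_3}} \left( s_3^{2/5} q_3^{4/5} + s_3 \right) + s_3^{1/2} q_3^{3/4} + s_3^{2/3} q_3^{1/3} q_2^{1/3} + q_3$$
$$\Rightarrow\; q_3 \lesssim 2^{c\sqrt{\log s_3}} \left( \frac{s_3^2}{r^5} + \frac{s_3}{r} \right) + \frac{s_3^2}{r^4} + \frac{s_3 q_2^{1/2}}{r^{3/2}}.$$
Substituting these bounds on $q_3, q_2$ in Eq. (6) and using the AM-GM inequality a few times gives
$$m \lesssim 2^{c\sqrt{\log n}} \left( \frac{n^2}{r^5} + \frac{n}{r} \right) + \frac{n s_3}{r^4} \left( 1 + \frac{2^{c\sqrt{\log s_3}}}{r^{1/2}} \right) + \frac{n s_2}{r^3} \lesssim 2^{c\sqrt{\log n}} \left( \frac{n^2}{r^5} + \frac{n s_3}{r^4} + \frac{n s_2}{r^3} + \frac{n}{r} \right).$$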
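The exponent bookkeeping in the two proofs above is mechanical, and it can be checked with a short script. The following is only a sketch under our own conventions, not part of the original argument: the helper names `solve_for_m` and `geometric_mean` are hypothetical, all constants and the factor $2^{c\sqrt{\log n}}$ are ignored, and a term $n^a m^b q^c$ on the right-hand side of $rm \lesssim \cdots$ is represented by the triple $(a, b, c)$. Solving $rm \lesssim n^a m^b q^c$ for $m$ amounts to raising both sides to the power $1/(1-b)$.

```python
from fractions import Fraction as F


def solve_for_m(a, b, c=0):
    """Return exponents (x, y, z) such that m <= n^x * q^y * r^z follows from
    r * m <= n^a * m^b * q^c (constants ignored). Requires b < 1.
    Hypothetical helper for illustration only."""
    a, b, c = F(a), F(b), F(c)
    if b >= 1:
        raise ValueError("the term cannot be solved for m when b >= 1")
    scale = 1 / (1 - b)  # raise both sides to the power 1/(1-b)
    return (a * scale, c * scale, -scale)


def geometric_mean(e1, e2):
    """Exponent vector of sqrt(X * Y), given the exponent vectors of X and Y."""
    return tuple((u + v) / 2 for u, v in zip(e1, e2))


# Terms of Theorem A.4, each leading to one term of Eq. (6).
main_term = solve_for_m(F(2, 5), F(4, 5))          # n^{2/5} m^{4/5}
q3_term = solve_for_m(F(1, 2), F(1, 2), F(1, 4))   # n^{1/2} m^{1/2} q_3^{1/4}
q2_term = solve_for_m(F(2, 3), F(1, 3), F(1, 3))   # n^{2/3} m^{1/3} q_2^{1/3}
linear_term = solve_for_m(1, 0)                    # n

# AM-GM step in the proof of Theorem 1.4, with exponents listed as (n, s_2, r):
# the cross term is the geometric mean of n*s_2/r^3 and n/r.
cross_term = geometric_mean((F(1), F(1), F(-3)), (F(1), F(0), F(-1)))

if __name__ == "__main__":
    for name, exps in [("main", main_term), ("q3", q3_term),
                       ("q2", q2_term), ("linear", linear_term),
                       ("cross", cross_term)]:
        print(name, tuple(str(e) for e in exps))
```

The first four triples correspond to the terms $n^2/r^5$, $n q_3^{1/2}/r^2$, $n q_2^{1/2}/r^{3/2}$ and $n/r$ of Eq. (6), and the last one to the cross term $n s_2^{1/2}/r^2$ that AM-GM absorbs in the proof of Theorem 1.4. The additive $m$ term has $b = 1$ and is deliberately excluded, since it cannot be solved for $m$ in this way.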