The cylindrical width of transitive sets
ASHWIN SAH, MEHTAAB SAWHNEY, AND YUFEI ZHAO
Abstract.
We show that for every $1 \le k \le d/(\log d)^C$, every finite transitive set of unit vectors in $\mathbb{R}^d$ lies within distance $O(1/\sqrt{\log(d/k)})$ of some codimension $k$ subspace, and this distance bound is best possible. This extends a result of Ben Green, who proved it for $k = 1$.
1. Introduction
The following counterintuitive fact was conjectured by the third author and proved by Green [4]. It says that every finite transitive subset of a high dimensional sphere is close to some hyperplane. Here a subset $X$ of a sphere in $\mathbb{R}^d$ is transitive if for every $x, x' \in X$, there is some $g \in O(\mathbb{R}^d)$ so that $gX = X$ and $gx = x'$. We say that $X$ has width at most $r$ if it lies within distance $r$ of some hyperplane. The finiteness assumption is important since otherwise the whole sphere is a counterexample.
Theorem 1.1 (Green [4]). Let $X$ be a finite transitive subset of the unit sphere in $\mathbb{R}^d$. Then the width of $X$ is at most $O(1/\sqrt{\log d})$. Furthermore, this upper bound is best possible up to a constant factor.
The bound in the theorem is tight since the set $X$ obtained by taking all permutations and coordinate-wise $\pm$ signings of the unit vector $(1, 1/\sqrt{2}, \ldots, 1/\sqrt{d})/\sqrt{H_d}$, where $H_d = 1 + 1/2 + \cdots + 1/d \sim \log d$, has width on the order of $1/\sqrt{\log d}$.
Green’s proof uses a clever induction scheme along with sophisticated group theoretic arguments, including an application of the classification of finite simple groups.
We generalize Green’s result by showing that a finite transitive set lies not only near some hyperplane, but in fact near a subspace of codimension $k$, as long as $k$ is not too large. We say that $X \subset \mathbb{R}^d$ has $k$-cylindrical width at most $r$ if $X$ lies within distance $r$ of some affine codimension $k$ subspace. The case $k = 1$ corresponds to the usual notion of width. Our main result below implies that every finite transitive subset of the unit sphere in $\mathbb{R}^d$ has $k$-cylindrical width $O(1/\sqrt{\log(d/k)})$ as long as $k$ is not too large.
Theorem 1.2.
There is an absolute constant $C > 0$ so that the following holds. Let $1 \le k \le d/(\log(3d))^C$. Let $X$ be a finite transitive subset of the unit sphere in $\mathbb{R}^d$. Then there is a real $k$-dimensional subspace $W$ such that
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \lesssim 1/\sqrt{\log(d/k)}.$$
Here and throughout $a \lesssim b$ means that $a \le C' b$ for some absolute constant $C'$. We write $\|x\|$ for the usual Euclidean norm of a vector $x$. Also $\mathrm{proj}_W$ is the orthogonal projection onto $W$.
We deduce the above theorem from a complex version using a theorem on restricted invertibility (see Section 6). A transitive subset of the complex unit sphere is defined to be the orbit of a point under the action of some subgroup of the unitary group.
Sah and Sawhney were supported by NSF Graduate Research Fellowship Program DGE-1745302. Zhao was supported by NSF Award DMS-1764176, a Sloan Research Fellowship, and the MIT Solomon Buchsbaum Fund.
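To make the extremal example following Theorem 1.1 concrete, here is a minimal Python sketch; the function names are illustrative and do not come from the paper. It builds the generating vector $(1, 1/\sqrt{2}, \ldots, 1/\sqrt{d})/\sqrt{H_d}$ and reports its largest coordinate, $1/\sqrt{H_d}$. Since every point of $X$ is a signed permutation of this vector, that value is the largest distance from a point of $X$ to the coordinate hyperplane $x_1 = 0$. The sketch only looks at this one hyperplane and does not verify the matching lower bound over all hyperplanes.

```python
import math

def harmonic(d):
    """Return H_d = 1 + 1/2 + ... + 1/d."""
    return sum(1.0 / i for i in range(1, d + 1))

def extremal_vector(d):
    """Return the unit vector (1, 1/sqrt(2), ..., 1/sqrt(d)) / sqrt(H_d)."""
    h = harmonic(d)
    return [1.0 / math.sqrt(i * h) for i in range(1, d + 1)]

def coordinate_width(d):
    """Largest |x_1| over all signed permutations of the vector (its max entry)."""
    return max(abs(c) for c in extremal_vector(d))

for d in (10, 1000, 100000):
    v = extremal_vector(d)
    norm = math.sqrt(sum(c * c for c in v))
    # Columns: d, Euclidean norm, 1/sqrt(H_d), and 1/sqrt(log d) for comparison.
    print(d, round(norm, 6), coordinate_width(d), 1.0 / math.sqrt(math.log(d)))
```

The last two printed columns stay within a constant factor of each other, which is consistent with $H_d \sim \log d$.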
Theorem 1.3. There is an absolute constant $C > 0$ so that the following holds. Let $1 \le k \le d/(\log(3d))^C$. Let $X$ be a finite transitive subset of the unit sphere in $\mathbb{C}^d$. Then there is a complex $k$-dimensional subspace $W$ such that
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \lesssim 1/\sqrt{\log(d/k)}.$$
We suspect that the $1 \le k \le d/(\log(3d))^C$ hypothesis is unnecessary in both Theorems 1.2 and 1.3.
Conjecture 1.4.
Let $1 \le k \le d$. Let $X$ be a finite transitive subset of the unit sphere in $\mathbb{C}^d$. Then there is a complex $k$-dimensional subspace $W$ such that
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \lesssim 1/\sqrt{\log(2d/k)}.$$
One particularly intriguing special case of Conjecture 1.4 is that every finite transitive set of unit vectors in $\mathbb{R}^d$ has $k$-cylindrical width $o(1)$ for all $k = o(d)$.
We prove a matching lower bound on the cylindrical radius (see Section 7 for the proof).
Theorem 1.5.
Let $1 \le k \le d$. There exists a transitive set $X$ in $\mathbb{R}^d$ such that for any (real or complex) $k$-dimensional subspace $W$ we have
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \gtrsim 1/\sqrt{\log(2d/k)}.$$
We propose another closely related conjecture: every finite transitive set in $\mathbb{R}^d$ lies inside a small cube.
Conjecture 1.6.
Let X be a finite transitive subset of the unit sphere in R d (or C d ). Then thereis a unitary basis L such that sup x ∈ X, v ∈ L |h v , x i| . √ log d . (1.1)Establishing an upper bound that decays to zero as d → ∞ would already be interesting. Notethat Theorem 1.3 implies the existence of a set L of orthonormal vectors with | L | ≥ d . so that(1.1) holds (and likely extendable to | L | ≥ d/ (log d ) C via our techniques). Proving either conjecturein full appears to require additional ideas. Remark.
Green’s proof [4] of Theorems 1.2 and 1.3 in the case $k = 1$ contains two errors. The first error is due to a missing supremum inside the integral in the first and second lines of the last display equation in the proof of Proposition 2.1 on page 560. The second error occurs at the final equality step of the top display equation on page 569, right after (4.4); here an orthogonality relation was incorrectly applied, as it requires an unjustified exchange of the integral and supremum. Our proof here corrects these errors. Green has also updated the arXiv version of his paper [4], incorporating these corrections.
2. Proof strategy
The subspace $W$ in Theorem 1.3 must vary according to the transitive set $X$. On the other hand, the strategy is to construct a single probability distribution $\mu$ (depending only on the symmetry group $G \leqslant U(\mathbb{C}^d)$ but not on $X$) on the set $\mathrm{Gr}_{\mathbb{C}}(k, d)$ of $k$-dimensional subspaces of $\mathbb{C}^d$. This is an important idea introduced by Green (for $k = 1$).
Definition 2.1.
Let $1 \le k \le d$. Let $f_k(d)$ be the smallest value so that for every finite $G \leqslant U(\mathbb{C}^d)$, there is a probability measure $\mu$ on $\mathrm{Gr}_{\mathbb{C}}(k, d)$ such that for all $v \in S(\mathbb{C}^d)$,
$$\int \sup_{g \in G} \|\mathrm{proj}_W(gv)\|^2 \, d\mu(W) \le f_k(d)^2.$$
The values f k ( d ) are well defined since the space of probability measures µ in question is closedunder weak limits.Our main result about f k ( d ) is stated below. Theorem 2.2. If k ≤ d/ (log d ) , then f k ( d ) . p log( d/k ) . Proof of Theorem 1.3 given Theorem 2.2.
Let our transitive set $X$ be the orbit of $v \in S(\mathbb{C}^d)$ under the action of the finite subgroup $G \leqslant U(\mathbb{C}^d)$. By Theorem 2.2 and Definition 2.1, there is a measure $\mu$ on $\mathrm{Gr}_{\mathbb{C}}(k, d)$ such that
$$\int \sup_{g \in G} \|\mathrm{proj}_W(gv)\|^2 \, d\mu(W) \le f_k(d)^2.$$
Therefore there is some $k$-dimensional subspace $W$ with
$$\sup_{g \in G} \|\mathrm{proj}_W(gv)\| \le f_k(d) \lesssim 1/\sqrt{\log(d/k)}. \qquad \square$$
To prove Theorem 2.2, we will reduce to “smaller”, more restricted cases, namely irreducible and primitive representations. We will also need to consider permutation groups (for both the reduction step and the primitive case).
2.1. Preliminaries.
Definition 2.3.
We say that $G \leqslant U(\mathbb{C}^d)$ is imprimitive if there is a system of imprimitivity: a decomposition
$$\mathbb{C}^d = \bigoplus_{i=1}^{\ell} V_i$$
with $0 < \dim V_i < d$ such that for every $g \in G$ and $i \in [\ell]$ one has $gV_i = V_j$ for some $j \in [\ell]$. (The subspaces $V_i$ need not be orthogonal.) Otherwise we say that $G$ is primitive.
Remark.
Both primitivity and irreducibility are properties of a representation, rather than intrinsic to a group. We identify $G \leqslant U(\mathbb{C}^d)$ with its natural representation on $\mathbb{C}^d$.
It follows from Maschke’s theorem that primitive group representations are irreducible.
Definition 2.4.
Given $v = (v_1, \ldots, v_d) \in \mathbb{C}^d$, let
$$v^{\succ} = (|v_{\sigma(1)}|, \ldots, |v_{\sigma(d)}|) \in \mathbb{R}^d,$$
where $\sigma$ is a permutation of $[d]$ so that $|v_{\sigma(1)}| \ge \cdots \ge |v_{\sigma(d)}|$. We write $v^{\succ}_i$ for the $i$-th coordinate of $v^{\succ}$. Let
$$\mathrm{Dom}(v) = \{ w \in \mathbb{C}^d : w^{\succ}_i \le v^{\succ}_i \text{ for all } i \in [d] \}.$$
Let (here $S_d$ denotes the symmetric group)
$$\Gamma_d := S_d \ltimes (S^1)^d \leqslant U(\mathbb{C}^d)$$
be the group that acts on $\mathbb{C}^d$ by permuting its coordinates and multiplying individual coordinates by unit complex numbers. Then $\mathrm{Dom}(v)$ is contained in the convex hull of the $\Gamma_d$-orbit of $v$.
We define some variants of $f_k(d)$ when the group $G$ is restricted to special types.
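The sorted-magnitude vector $v^{\succ}$ and the set $\mathrm{Dom}(v)$ of Definition 2.4 can be sketched in a few lines of Python. The helper names `sorted_magnitudes` and `in_dom`, and the small floating-point tolerance, are assumptions made for illustration only. Membership depends only on coordinate magnitudes, so it is unchanged when the coordinates of $w$ are permuted or multiplied by unit complex numbers.

```python
def sorted_magnitudes(v):
    """Return v^>: the absolute values of the coordinates in non-increasing order."""
    return sorted((abs(z) for z in v), reverse=True)

def in_dom(w, v, tol=1e-12):
    """Check w in Dom(v): the i-th largest |w_j| is at most the i-th largest |v_j| for all i."""
    if len(w) != len(v):
        raise ValueError("vectors must have the same length")
    pairs = zip(sorted_magnitudes(w), sorted_magnitudes(v))
    return all(a <= b + tol for a, b in pairs)

v = [3 + 4j, -1, 0.5j]   # magnitudes 5, 1, 0.5
w = [0.2, 1j, -4]        # magnitudes 0.2, 1, 4, in a different order
print(sorted_magnitudes(v))
print(in_dom(w, v))          # each sorted magnitude of w is dominated
print(in_dom([1, 1, 1], v))  # third-largest magnitude 1 exceeds 0.5
```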
Definition 2.5. Given $k \in [d]$, let $f^{\mathrm{irred}}_k(d)$ (resp. $f^{\mathrm{prim}}_k(d)$) be the smallest value so that for every finite $G \leqslant U(\mathbb{C}^d)$ which is irreducible (resp. primitive), there is a probability measure $\mu$ on $\mathrm{Gr}_{\mathbb{C}}(k, d)$ such that for every $v \in S(\mathbb{C}^d)$,
$$\int \sup_{g \in G} \|\mathrm{proj}_W(gv)\|^2 \, d\mu(W) \le f^{\mathrm{irred}}_k(d)^2 \quad (\text{resp. } f^{\mathrm{prim}}_k(d)^2).$$
The permutation action on $\mathbb{C}^d$ deserves special attention.
Definition 2.6.
Let $f^{\mathrm{sym}}_k(d)$ be the smallest value so that there is a probability measure $\mu$ on $\mathrm{Gr}_{\mathbb{C}}(k, d)$ such that for every $v \in S(\mathbb{C}^d)$,
$$\int \sup_{u \in \mathrm{Dom}(v)} \|\mathrm{proj}_W(u)\|^2 \, d\mu(W) \le f^{\mathrm{sym}}_k(d)^2.$$
Define $f^{\mathrm{alt}}_k(d)$ to be the same with the additional constraint that $\mu$ is supported on the set of $k$-dimensional subspaces of the hyperplane $x_1 + \cdots + x_d = 0$.
We will often equivalently consider, instead of $\mu$ on $\mathrm{Gr}_{\mathbb{C}}(k, d)$, the corresponding measure $\mu^*$ on the complex Stiefel manifold $V_k(\mathbb{C}^d)$; that is, $\mu^*$ is derived from $\mu$ by first sampling a $\mu$-random $k$-dimensional subspace $W$ of $\mathbb{C}^d$, and then outputting a uniformly sampled unitary basis $(w_1, \ldots, w_k)$ of $W$. We have $\|\mathrm{proj}_W(u)\|^2 = \sum_{\ell=1}^{k} |\langle u, w_\ell \rangle|^2$.
2.2. Reductions.
We first reduce the general problem to the irreducible case.
Proposition 2.7. If ≤ k < ℓ ≤ d then f k ( d ) ≤ max (cid:8)p k/ℓ, sup d ′ ≥ d/ (2 ℓ ) f irred ⌈ kd ′ /d ⌉ ( d ′ ) (cid:9) . We then reduce the irreducible case to the primitive case and the alternating case.
Proposition 2.8. If k ≤ d/ , then f irred k ( d ) ≤ max d d = d ( min { f prim ⌈ k/d ⌉ ( d ) , f alt k ( d ) + k ≥ d } ) . The symmetric and alternating cases can be handled explicitly, yielding the following.
Proposition 2.9. If k ≤ d/ (log d ) , then f sym k ( d ) ≤ f alt k ( d ) . / p log( d/k ) . This leaves the primitive case, which we prove by invoking a group-theoretic result proved by Green [4, Proposition 4.2] that allows us to reduce to the alternating case once again.
Proposition 2.10.
There is an absolute constant c > such that for k ≤ cd/ (log d ) we have f prim k ( d ) . sup d ′ ≥ cd/ (log d ) f alt k ( d ′ ) . Putting everything together.
We are now in position to derive Theorem 2.2 using the preceding statements.
Proposition 2.11. If k ≤ d/ (log d ) then f prim k ( d ) . / p log( d/k ) .Proof. Combine Propositions 2.9 and 2.10. (cid:3)
Proposition 2.12. If k ≤ d/ (log d ) then f irred k ( d ) . / p log( d/k ) . HE CYLINDRICAL WIDTH OF TRANSITIVE SETS 5
Proof.
By Proposition 2.8, we have f irred k ( d ) ≤ max d d = d (min( f prim ⌈ k/d ⌉ ( d ) , f alt k ( d ) + k ≥ d )) . First consider the case d ≤ k . We have ⌈ k/d ⌉ ≤ dd (log d ) ≤ d (log d ) . By Proposition 2.11, we have f prim ⌈ k/d ⌉ ( d ) . p log( d / ⌈ k/d ⌉ ) ≤ p log( d/ (2 k )) . Now consider the case d > k . Since d ( d /k ) = d/k , we have max { d , d /k } ≥ p d/k . If d ≥ p d/k , then f prim ⌈ k/d ⌉ ( d ) = f prim1 ( d ) . √ log d . p log( d/k ) . On the other hand, if d /k ≥ p d/k , then d /k ≥ (log d ) so k ≤ d (log d ) ≤ d (log d ) . Hence Proposition 2.9 yields f alt k ( d ) . p log( d /k ) . p log( d/k ) . Thus it follows that, for all d d = d , min( f prim ⌈ k/d ⌉ ( d ) , f alt k ( d ) + k ≥ d ) . p log( d/k ) , and the result follows. (cid:3) Now we show the main result assuming the above statements.
Proof of Theorem 2.2.
Let ℓ = ⌈√ dk ⌉ ≥ k . We have k/ℓ . p k/d . p log( d/k ) . Also, if d ′ ≥ d/ (2 ℓ ) then (cid:24) kd ′ d (cid:25) ≤ d ′ d/ (2 ℓ ) ≤ d ′ (log d ) ≤ d ′ (log d ′ ) . By Proposition 2.12, we have f irred ⌈ kd ′ /d ⌉ ( d ′ ) . p log( d ′ / ⌈ kd ′ /d ⌉ ) . p log( d/ (2 ℓ )) . p log( d/k ) . Applying Proposition 2.7 to k and ℓ = ⌈√ dk ⌉ , we find f k ( d ) ≤ max( p k/ℓ, sup d ′ ≥ d/ (2 ℓ ) f irred ⌈ kd ′ /d ⌉ ( d ′ )) . p log( d/k ) . (cid:3) Paper outline.
In Section 3, we prove the two key reductions, Propositions 2.7 and 2.8. In Section 4, we prove the key estimate for the symmetric and alternating cases, Proposition 2.9. In Section 5, we prove the primitive case, Proposition 2.10. Finally, in Section 6 we deduce a real version from the complex version, proving Theorem 1.2. In Section 7 we demonstrate optimality of our results by exhibiting the matching lower bound, Theorem 1.5.
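As a sanity check on the parameter choice $\ell = \lceil \sqrt{dk} \rceil$ in the proof of Theorem 2.2, the following Python sketch evaluates the first term of the maximum in Proposition 2.7, read here as $\sqrt{k/\ell}$ (an assumption about the extraction-damaged formula). Because $\ell \ge \sqrt{dk}$, one has $\sqrt{k/\ell} \le (k/d)^{1/4}$, and elementary calculus gives $x^{-1/4}\sqrt{\log x} \le \sqrt{2/e}$ for all $x > 1$, so this term is $O(1/\sqrt{\log(d/k)})$. The function names are illustrative.

```python
import math

def first_term(d, k):
    """sqrt(k / l) for the choice l = ceil(sqrt(d * k)), with d, k positive integers."""
    l = math.isqrt(d * k - 1) + 1  # exact integer ceiling of sqrt(d * k)
    return math.sqrt(k / l)

def ratio(d, k):
    """first_term times sqrt(log(d / k)); bounded if the term is O(1/sqrt(log(d/k)))."""
    return first_term(d, k) * math.sqrt(math.log(d / k))

# Maximum of x^(-1/4) * sqrt(log x) over x > 1, attained at x = e^2.
BOUND = math.sqrt(2 / math.e)

worst = max(
    ratio(d, k)
    for d in (10, 100, 10**4, 10**6)
    for k in (1, 2, 5, 9)
    if k < d
)
print(worst, BOUND)
```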
3. Reduction to primitive representations
We first reduce the general case to the alternating and irreducible cases.
Proof of Proposition 2.7.
Consider G U ( C d ) . By Maschke’s theorem, we can decompose intoirreducible representations of G : C d = m M j =1 V j . Let d j = dim V j . Let J = { j ∈ [ m ] : d j ≥ d/ (2 ℓ ) } . First suppose P j ∈ J d j ≥ d/ . Then in each such V j , we consider the probability measure µ j that witnesses f irred ⌈ kd j /d ⌉ ( d j ) for the irreducible representation of G on V j . That is, µ j samples a ⌈ kd j /d ⌉ -dimensional subspace of V j and satisfies Z sup g ∈ G k proj W ( g v ) k dµ j ( W ) ≤ f irred ⌈ kd ′ /d ⌉ ( d j ) k v k for each v ∈ V j . We define µ to be a uniformly random k -dimensional subspace of L j ∈ J W j , whereeach W j is an independent µ j -random ⌈ kd j /d ⌉ -dimensional subspace of V j . (Note the W j ’s areorthogonal as the V j ’s are.) The total dimension of this direct sum is at least k , so µ is well-defined.Given v ∈ C d , write v = P mj =1 v j with v j ∈ V j . We have Z sup g ∈ G k proj W ( g v ) k dµ ( W ) ≤ Z sup g ∈ G k proj L j ∈ J W j ( g v ) k Y j ∈ J dµ j ( W j ) ≤ X j ∈ J Z sup g ∈ G k proj W j ( g v ) k dµ j ( W j ) ≤ X j ∈ J f irred ⌈ kd j /d ⌉ ( d j ) k v j k ≤ sup d ′ ≥ d/ (2 ℓ ) f irred ⌈ kd ′ /d ⌉ ( d ′ ) k v k by orthogonality of the V j .Next suppose P j ∈ J d j < d/ . Then | [ m ] \ J | ≥ ℓ . Let I be an ℓ -element subset of [ m ] \ J . Choosearbitrary w j ∈ S ( V j ) ⊆ C d for j ∈ I , which are clearly orthogonal. Let µ be the probability measureon k -dimensional subspaces of C d obtained by taking the span of k uniform random elements in { w , . . . , w ℓ } .For each g ∈ G , write u g = ( h g v , w i , . . . , h g v , w ℓ i ) and v ′ = ( k proj V v k , . . . , k proj V ℓ v k ) . HE CYLINDRICAL WIDTH OF TRANSITIVE SETS 7
Given S ⊆ [ ℓ ] , let proj S take the projection of an ℓ -dimensional vector down to that subset ofcoordinates. We have Z sup g ∈ G k proj W ( g v ) k dµ ( W ) = 1 (cid:0) ℓk (cid:1) X S ∈ ( [ ℓ ] k ) sup g ∈ G k proj S ( u g ) k ≤ (cid:0) ℓk (cid:1) X S ∈ ( [ ℓ ] k ) X j ∈ S ( v ′ j ) = kℓ ℓ X j =1 ( v ′ j ) ≤ kℓ k v k . The first equality follows by the definition of µ , the subsequent inequality follows by |h g v , w j i| ≤ v ′ j ,and the last line is by direct computation and orthogonality of the V j . (cid:3) We next reduce the irreducible case to the primitive case. We first collect a few facts proved in[4] regarding systems of imprimitivity.
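The "direct computation" at the end of the proof of Proposition 2.7 is a counting identity: each index $j \in [\ell]$ lies in $\binom{\ell-1}{k-1}$ of the $\binom{\ell}{k}$ subsets $S$ of size $k$, so averaging $\sum_{j \in S} a_j$ over all such $S$ gives $\frac{k}{\ell} \sum_{j=1}^{\ell} a_j$. A brute-force check in Python, where the list `a` stands in for the squared quantities $(v'_j)^2$ and the values are made up for illustration:

```python
from itertools import combinations
from math import comb

def average_over_subsets(a, k):
    """Average of sum_{j in S} a[j] over all k-element subsets S of the index set."""
    l = len(a)
    total = sum(sum(a[j] for j in S) for S in combinations(range(l), k))
    return total / comb(l, k)

a = [0.5, 0.25, 0.125, 0.0625, 0.0625]  # hypothetical nonnegative values summing to 1
k = 2
lhs = average_over_subsets(a, k)
rhs = k / len(a) * sum(a)
print(lhs, rhs)
```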
Lemma 3.1 ([4, Section 2]) . Let G U ( C d ) be irreducible but imprimitive. Consider a systemimprimitivity C d = d M j =1 V j with d maximal over all such systems of primitivity. Let H = { g ∈ G : gV = V } and choose γ , . . . , γ d such that γ j V = V j . Then the following hold:1. The V j are orthogonal and have the same dimension, and G acts transitively on them.2. H has primitive action on V (i.e. the representation of H on V is primitive).3. γ , . . . , γ d form a complete set of left coset representatives for H in G .4. For each g ∈ G there is σ g ∈ S d so that γ − σ g ( j ) gγ j ∈ H for all j ∈ [ d ] (i.e., σ g records how g permutes { V , . . . , V d } ). Now we are ready to prove Proposition 2.8, which recall says that for all k ≤ d/ , f irred k ( d ) ≤ max d d = d (cid:0) min (cid:8) f prim ⌈ k/d ⌉ ( d ) , f alt k ( d ) + k ≥ d (cid:9)(cid:1) . Proof of Proposition 2.8.
Let G U ( C d ) be irreducible but imprimitive. Consider a system ofimprimitivity C d = d M j =1 V j with d maximal among all systems of imprimitivity. By Lemma 3.1, the spaces V j are orthogonaland all the dim V j are equal. Let d = dim V , so that d d = d . Furthermore, H = { g ∈ G : gV = V } acts primitively on V , that G acts transitively on the V j , and that there are γ , . . . , γ d so that γ j V = V j which form a complete set of left coset representatives for H in G . For each g ∈ G wehave some σ g ∈ S d so that γ − σ g ( j ) gγ j ∈ H for all j ∈ [ d ] . Define h ( g, j ) = γ − σ g ( j ) gγ j .Let v ∈ C d . There is a unique orthogonal decomposition v = d X j =1 γ j v j SAH, SAWHNEY, AND ZHAO where v j ∈ V for all j ∈ [ d ] . We have g v = d X j =1 gγ j v j = d X j =1 γ j h ( g, σ − g ( j )) v σ − g ( j ) . Finally, if w = d X j =1 λ j γ j x for some λ = ( λ , . . . , λ d ) ∈ C d and x ∈ V then we see from the above and orthogonality that h g v , w i = d X j =1 λ j h h ( g, σ − g ( j )) v σ − g ( j ) , x i . Now we return to the situation at hand: we need to choose a k -dimensional space with a goodprojection for our transitive set. Consider the map ψ : V × C d → C d given by ψ ( x , λ ) = d X j =1 λ j γ j x . It clearly maps the pair of unit spheres into the unit sphere. Given probability measures µ on Gr C ( k , V ) and µ on Gr C ( k , C d ) , we define the pushforward measure µ on Gr C ( k k , d ) bytaking the image of these two subspaces under ψ . Equivalently, suppose µ ∗ samples a unitary basis x , . . . , x k of a subspace of V and µ ∗ samples a unitary basis λ , . . . , λ k of a subspace of C d ,then µ samples the subspace of C d with basis { ψ ( x i , λ j ) : i ∈ [ k ] , j ∈ [ k ] } . It is easy to check thisbasis is in fact unitary.Next, we choose µ and µ based on the sizes of d and d .First let k = ⌈ k/d ⌉ ≤ d (as k ≤ d/ ) and k = d . 
We let µ be the measure guaranteed byDefinition 2.1 so that Z sup h ∈ H k proj W ( h u ) k dµ ( W ) ≤ f prim k ( d ) k u k for all u ∈ V and let µ be the atom on the space C d in Gr C ( d , d ) . Let µ be the ψ -pushforwardof ( µ , µ ) as described earlier. We find Z sup g ∈ G k proj W ( g v ) k dµ ( W ) = Z sup g ∈ G k X ℓ =1 d X j =1 |h g v , ψ ( x ℓ , e j ) i| dµ ∗ ( x , . . . , x k )= Z sup g ∈ G k X ℓ =1 d X j =1 |h h ( g, σ − g ( j )) v σ − g ( j ) , x ℓ i| dµ ∗ ( x , . . . , x k )= Z sup g ∈ G k X ℓ =1 d X j =1 |h h ( g, j ) v j , x ℓ i| dµ ∗ ( x , . . . , x k ) ≤ d X j =1 Z sup g ∈ G k X ℓ =1 |h h ( g, j ) v j , x ℓ i| dµ ∗ ( x , . . . , x k ) ≤ d X j =1 Z sup h ∈ H k X ℓ =1 |h h v j , x ℓ i| dµ ∗ ( x , . . . , x k ) ≤ d X j =1 f prim k ( d ) k v j k = f prim k ( d ) k v k . HE CYLINDRICAL WIDTH OF TRANSITIVE SETS 9
The last equality is by orthogonality of V , . . . , V d and unitarity of γ j for j ∈ [ d ] .Now suppose that k < d . Let k = 1 and k = k . Choose an arbitrary unit vector x ∈ V and µ be an atom on Gr C (1 , V ) supported on the line C x . Let µ be guaranteed by Definition 2.6 sothat Z sup u ∈ Dom( w ) k X ℓ =1 |h u , λ ℓ i| dµ ∗ ( λ , . . . , λ k ) ≤ f alt k ( d ) k w k for all w ∈ V . Let µ be the ψ -pushforward of ( µ , µ ) as described earlier. We find Z sup g ∈ G k proj W ( g v ) k dµ ( W ) = Z sup g ∈ G k X ℓ =1 |h g v , w ℓ i| dµ ∗ ( w , . . . , w k )= Z sup g ∈ G k X ℓ =1 |h g v , ψ ( x , λ ℓ ) i| dµ ∗ ( λ , . . . , λ k )= Z sup g ∈ G k X ℓ =1 (cid:12)(cid:12)(cid:12)(cid:12) d X j =1 λ ℓ,j h h ( g, σ − g ( j )) v σ − g ( j ) , x i (cid:12)(cid:12)(cid:12)(cid:12) dµ ∗ ( λ , . . . , λ k ) ≤ Z sup u ∈ Dom( y ) k X ℓ =1 |h u , λ ℓ i| dµ ∗ ( λ , . . . , λ k ) , where y has coordinates y j = sup h ∈ H |h h v j , x i| for j ∈ [ d ] . We immediately deduce Z sup g ∈ G k proj W ( g v ) k dµ ( W ) ≤ Z sup u ∈ Dom( y ) k X ℓ =1 |h u , λ ℓ i| dµ ∗ ( λ , . . . , λ k ) ≤ f alt k ( d ) k y k ≤ f alt k ( d ) d X j =1 k v j k = f alt k ( d ) k v k . Note that the above constructed measures in both cases are independent of v . The second con-struction is only valid when k < d . Therefore since the f values are clearly bounded by , we havean upper bound of f prim k ( d ) ≤ max d d = d (min( f prim ⌈ k/d ⌉ ( d ) , f alt k ( d ) + k ≥ d )) , as claimed. (cid:3) Permutation groups
In this section, we establish upper bounds for $f^{\mathrm{sym}}_k(d)$ and $f^{\mathrm{alt}}_k(d)$, extending the previous construction [4, Section 3] for $k = 1$.
A useful high dimensional intuition is that, for small $k$, a random $k$-dimensional subspace of $\mathbb{R}^d$ has the property that all its unit vectors have a distribution of coordinate magnitudes similar to that of a random Gaussian vector.
We first need the existence of a large dimension subspace of $\mathbb{R}^d$ with certain delocalization properties. We encode this through the following norm.
Definition 4.1.
Given v ∈ R d , let k v k T = sup ∅ ( S ⊆ [ d ] log (2 d/ | S | ) X j ∈ S v j and let T ∗ = { t ∈ R d : |h t , w i| ≤ whenever k w k T ≤ } . Remark.
Note that $\|\cdot\|_T$ is a norm as it can be represented as a supremum of seminorms. Hence
$$\|w\|_T = \sup_{t \in T^*} |\langle t, w \rangle|.$$
We next recall a classical lemma regarding the concentration of norms on Gaussian space (see e.g. [5]); we provide a short proof for convenience.
Lemma 4.2.
There is an absolute constant
C > so that for all p ≥ , a Gaussian random vector w ∼ N (0 , I d ) satisfies ( E w + ··· + w d =0 k w k pT ) /p ≤ ( E k w k pT ) /p ≤ E k w k T + C √ p sup t ∈ T ∗ k t k . Proof.
For first inequality note that w ∼ N (0 , I d ) can be written as w ′ + G where w ′ is drawnfrom N (0 , I d ) conditioned on having coordinate sum zero and G ∈ N (0 , is independent of w ′ .Then by convexity note that ( E k w k pT ) /p = ( E k w ′ + G k pT ) /p ≥ ( E w ′ k E [ w ′ + v ′ | w ′ ] k pT ) /p = ( E w + ··· + w d =0 k w k pT ) /p . To prove the second inequality first note that k w k T − k v k T ≤ k w − v k T = sup t ∈ T ∗ |h t , w − v i| ≤ k w − v k sup t ∈ T ∗ k t k . Therefore if L = sup t ∈ T ∗ k t k then w
7→ k w k T is an L -Lipschitz function with respect to Euclideandistance. Therefore by Gaussian concentration for Lipschitz functions (see e.g. [1, p. 125]) we havethat P [ |k w k T − E [ k w k T ] | ≥ t ] ≤ − ct /L ) where c is an absolute constant. Using standard moment bounds for sub-Gaussian random variables(see e.g. [8, Proposition 2.5.2]), we find that ( E |k w k T − E k w k T | p ) /p ≤ C √ p sup t ∈ T ∗ k t k for an absolute constant C > . Finally, Minkowski’s inequality implies that ( E k w k pT ) /p ≤ E k w k T + ( E |k w k T − E k w k T | p ) /p and therefore the result follows. (cid:3) We now prove an upper bound for E [ k w k T ] . Lemma 4.3.
A Gaussian random vector w ∼ N (0 , I d ) satisfies E k w k T . √ d .Proof. Recall w ≻ i from Definition 2.4. We have E ( w ≻ i ) = Z ∞ P [ w ≻ i ≥ √ t ] dt ≤ Z ∞ min (cid:18) , (cid:18) di (cid:19) (2 e − t/ ) i (cid:19) dt ≤ Z ∞ min(1 , (2 de − t/ /i ) i ) dt . log(2 d/i ) . Therefore ( E k w k T ) ≤ E k w k T ≤ d X i =1 log (2 d/i )( w ≻ i ) . d X i =1 log (2 d/i ) ≤ d Z log(2 /x ) dx = d Z ∞ ( y + log 2) e − y dy . d. (cid:3) We are in position to derive a high-probability version.
HE CYLINDRICAL WIDTH OF TRANSITIVE SETS 11
Lemma 4.4.
With probability at least − exp( − d/ (log d ) ) , a standard Gaussian vector w ∼N (0 , I d ) satisfies k w k T . √ d . In fact, the same is true after conditioning w to have coordinatesum .Proof. Note that if t ∈ T ∗ , then k t k = k t k T (cid:12)(cid:12)(cid:12)(cid:12)(cid:28) t , t k t k T (cid:29)(cid:12)(cid:12)(cid:12)(cid:12) ≤ k t k T ≤ log (2 d ) k t k . Hence sup t ∈ T ∗ k t k ≤ log (2 d ) . To deduce the claimed bound, note that P [ k w k T ≥ K √ d ] ≤ ( K √ d ) − p E [ k w k pT ] ≤ ( K √ d ) − p ( E k w k T + C √ p sup t ∈ T ∗ k t k ) p ≤ ( K √ d ) − p ( C ′ √ d + C ′ √ p log (2 d )) p for appropriate absolute constants C, C ′ > , using Lemmas 4.2 and 4.3 and the above inequality.Letting p = d/ (log d ) and K > be a sufficiently large absolute constant yields P [ k w k T ≥ K √ d ] ≤ exp( − p ) , as desired. The same holds is we condition on sum , using the moment bound for the conditionalvariable derived in Lemma 4.2 instead. (cid:3) Lemma 4.5.
There is a ⌈ d/ (log d ) ⌉ -dimensional subspace of the hyperplane ⊥ in R d such thateach of its unit vectors v satisfies k v k T . . Proof.
We can assume d is sufficiently large. Let k = ⌈ d/ (log d ) ⌉ , and consider a uniformly random k -dimensional subspace W of ⊥ . Let U be a d × k matrix whose columns form an orthonormalbasis of W , chosen uniformly at random.By a standard volume packing argument (e.g., see [7, Lemma 4.3]), there exists N ⊂ S ( R k ) with |N | ≤ k such that for every v ∈ S ( R k ) there is v ′ ∈ N so that k v − v ′ k ≤ / . Thus if u is a unitvector in the direction of v − v ′ , we have k U v k T ≤ k U v ′ k T + k U ( v − v ′ ) k T ≤ k U v ′ k T + 12 k U u k T . We deduce sup v ∈ S ( R k ) k U v k T ≤ sup v ′ ∈N k U v ′ k T + 12 sup u ∈ S ( R k ) k U u k T and thus sup v ∈ S ( R k ) k U v k T ≤ v ′ ∈N k U v ′ k T . Now fix some v ∈ N . Note the distribution of U v is uniform among unit vectors in ⊥ since W was chosen uniformly. Now note that for any constant C we have that P [ k U v k T ≥ C ] = P [ k G / k G k k T ≥ C ] where G ∼ N (0 , I d − ( T ) /d ) . Now since G / k G k and k G k are independent we have that P [ k G / k G k k T ≥ C ] = P [ k G k ≤ √ d ] − P [ k G / k G k k T ≥ C and k G k ≤ √ d ] ≤ P [ k G / k G k k T ≥ C and k G k ≤ √ d ] ≤ P [ k G / k G k k T ≥ C √ d ] . By Lemma 4.4, the last expression is at most − d/ (log d ) ) . The result follows upon takingthe union bound over at most k vectors in N , since < e . (cid:3) Finally, we will need a form of Selberg’s inequality (see [3, Chapter 27, Theorem 1]).
Lemma 4.6.
For v , . . . , v m ∈ C d we have that sup w ∈ S ( C d ) m X i =1 |h w , v i i| ≤ sup i ∈ [ m ] m X j =1 |h v i , v j i| . Now we prove Proposition 2.9, which recall says that for k ≤ d/ (log d ) , one has f sym k ( d ) ≤ f alt k ( d ) . / p log( d/k ) . The first inequality is immediate as the set of allowable µ ’s in the definition of f alt k is a subset ofthose of f sym k . So we just need to prove the second inequality. Proof of Proposition 2.9.
Let e i be the i -th coordinate vector. For each j with k ≤ j / (log 2 j ) ≤ d ,we apply Lemma 4.5 to the space V j = span R { e , . . . , e j } . Here the T -norm is defined with respectto this j -dimensional space. In particular, there exists a k -dimensional (real) subspace of theorthogonal complement of e + · · · + e j within V j , call it W j , so that every unit vector u ∈ W j satisfies X i ∈ S u i . (2 j +1 / | S | ) for every nonempty S ⊆ [2 j ] . Let V ′ j = span C V j and W ′ j = span C W j . We immediately deduce thatevery unit vector u ∈ W ′ j satisfies X i ∈ S | u i | . (2 j +1 / | S | ) (4.1)since we can write it as u = α u r + β √− u c where u r , u c ∈ W j are real unit vectors and α, β ∈ R satisfy α + β = 1 .Now we construct our random subspace as follows: let W = W j where j is a random integeruniformly chosen from J = {⌈ log (2 k log d ) ⌉ , . . . , ⌊ log d ⌋} . Let µ be the probability measure on Gr C ( k, n ) that gives W .For every v ∈ S ( C d ) , we have sup γ ∈ Γ d k proj W ( γ v ) k = sup γ ∈ Γ d sup w ∈ S ( W ) |h γ v , w i| = sup w ∈ S ( W ) h v ≻ , w ≻ i . Therefore Z sup γ ∈ Γ d k proj W ( γ v ) k dµ ( W ) = 1 | J | X j ∈ J sup γ ∈ Γ d k proj W j ( γ v ) k = 1 | J | X j ∈ J sup w ∈ S ( W j ) h v ≻ , w ≻ i . Let w ′ j ∈ S ( W j ) be such that sup w ∈ S ( W j ) h v ≻ , w ≻ i = h v ≻ , ( w ′ j ) ≻ i , which exists by compactness. For i, j ∈ J with i ≥ j , we have |h ( w ′ i ) ≻ , ( w ′ j ) ≻ i| ≤ k proj V i (( w ′ j ) ≻ ) k . (2 i +1 / j ) . The first inequality follows from w ′ j ∈ V j , which implies ( w ′ j ) ≻ ∈ V j . The second follows from (4.1)applied to w ′ j and S a subset of [2 i ] composed of the j largest magnitude coordinates of w ′ j . HE CYLINDRICAL WIDTH OF TRANSITIVE SETS 13
Applying Lemma 4.6, we deduce Z sup γ ∈ Γ d k proj W ( γ v ) k dµ ( W ) = 1 | J | X j ∈ J h v ≻ , ( w ′ j ) ≻ i ≤ sup i ∈ J | J | X j ∈ J |h ( w ′ i ) ≻ , ( w ′ j ) ≻ i| . | J | (cid:18) X j ∈ J,j ≥ i (2 j +1 / i ) + X j ∈ J,j
We now turn to the case of bounding $f^{\mathrm{prim}}_k(d)$. First, we show that if the group $G \leqslant U(\mathbb{C}^d)$ is sufficiently small, then a random basis achieves the necessary bound for $f^{\mathrm{prim}}_k(d)$. This is a minor modification of [4, Proposition 4.1].
Proposition 5.1.
Let G U ( C d ) . Suppose that [ G : Z d ∩ G ] ≤ e d/ log d , where Z d := { λI d : | λ | = 1 } .Then for k ∈ [ d ] there exists a probability measure µ on Gr C ( k, d ) such that Z sup g ∈ G k proj W ( g v ) k dµ ( W ) . d/k ) k v k for all v ∈ C d .Proof. We let µ be the uniform measure on Gr C ( k, d ) . By scaling, we may assume that v is a unitvector. Furthermore let W ′ be the subspace generated by the first k coordinate vectors e , . . . , e k .Note that P W (cid:20) sup g ∈ G k proj W ( g v ) k ≥ t (cid:21) ≤ e d/ log d P W [ k proj W ( v ) k ≥ t ] ≤ e d/ log d P v ′ ∈ S ( C d ) [ k proj W ′ ( v ′ ) k ≥ t ] using a union bound and then orthogonal invariance. Now note that E [ k proj W ′ ( v ′ ) k ] ≤ E [ k proj W ′ ( v ′ ) k ] = k/d and that k proj W ′ ( v ′ ) k is a -Lipschitz function of v ′ . Therefore by Lévy concentration on the spherewe have that P v ′ ∈ S ( C d ) [ k proj W ′ ( v ′ ) k ≥ p k/d + C/ p log d ] ≤ e − d/ log d for a suitably large absolute constant C . Finally, using p k/d . / p log(2 d/k ) and using the bound k proj W ′ ( v ′ ) k ≤ , the desired result follows immediately. (cid:3) We need the following key group theoretic result from Green [4], which in turn builds on ideasfrom Collins’ work on optimal bounds for Jordan’s theorem [2]. Roughly, it says that if [ G : Z d ∩ G ] is large then G has a large normal alternating subgroup. The first part of the following theorem is[4, Proposition 4.2], while the rest is implicit in the proof of [4, Proposition 1.11]. Theorem 5.2 ([4, Section 4]) . 
Let G U ( C d ) be primitive and suppose that [ G : Z d ∩ G ] ≥ e d/ log d .If d is sufficiently large then all of the following hold.(1) G has a normal subgroup isomorphic to the alternating group A n for some n & d/ (log d ) .(2) G has a subgroup of index at most of the form A n × H , with the same n .(3) The resulting representation ρ : A n × H ֒ → G ֒ → U ( C d ) decomposes into irreducible rep-resentations, at least one of which (call it ρ ) is of the form ρ ≃ ψ ⊗ ψ ′ , where ψ ′ is anirreducible representation of H and ψ is the representation of A n acting via permutation ofcoordinates on { z ∈ C n : z + · · · + z n = 0 } . We are now in position to prove Proposition 2.10, which recall says that there is an absoluteconstant c > such that for every k ≤ cd/ (log d ) we have f prim k ( d ) . sup d ′ ≥ cd/ (log d ) f alt k ( d ′ ) . The proof mirrors that of [4, Proposition 1.11], but we correct an error of Green ([4, p. 20]) involvingan incorrect orthogonality identity. This erroneous deduction is replaced by an argument which stillallows one to reduce the primitive case to the alternating case.
Proof of Proposition 2.10.
We may assume d is sufficiently large. If [ G : Z d ∩ G ] ≤ e d/ log d , then theresult follows by Proposition 5.1. So we can assume [ G : Z d ∩ G ] ≥ e d/ log d , and thus by Theorem 5.2, G has a normal subgroup isomorphic to A n for some n & d/ (log d ) and that G has a subgroup ofindex at most which is of the form A n × H . If the index is , let τ be the nontrivial right cosetrepresentative of A n × H in G (otherwise just let τ be the identity). Note that sup g ∈ G k proj W ( g v ) k ≤ sup g ∈ A n × H k proj W ( g v ) k + sup g ∈ A n × H k proj W ( gτ v ) k , so it is easy to see that, up to losing a constant factor, we may reduce to studying groups of the form G = A n × H where n & d/ (log d ) (but note that the representation may no longer be primitive, oreven irreducible).Now Theorem 5.2 shows that the representation ρ : A n × H → U ( C d ) coming from this setup hasan irreducible component of the form ρ ≃ ψ ⊗ ψ ′ , where ψ ′ is an irreducible representation of H and ψ is the representation of A n acting via permutation of coordinates on { z ∈ C n : z + · · · + z n = 0 } .Note that dim ρ ≥ dim ψ = n − & d/ (log d ) , so dim ρ ≥ k provided that c > is sufficientlysmall. We will choose a k -dimensional subspace of the irreducible component ρ .We explicitly present this situation as follows. Let V ′ be the space acted on by ψ ′ (unitarily).Consider V = ⊥ ⊆ C n , and consider the spaces V ⊗ V ′ ⊆ C n ⊗ V ′ , which has a natural unitarystructure given by the tensor product. Note ψ acts on V by permutation of coordinates whenrepresented in C n . Every vector in V ⊗ V ′ is spanned by pure tensors v ⊗ v ′ where v has zerocoordinate sum, and ρ (( a, h )) acts by ψ ( a ) ⊗ ψ ′ ( h ) on pure tensors. In fact, we can extend thisaction to all of C n ⊗ V ′ in the natural way (and the resulting representation is isomorphic to adirect sum of ρ and triv A n ⊗ ψ ′ ). 
At this point, the analysis will be similar to that in the proof of Proposition 2.8.

Let $\nu$ be the measure on $\mathrm{Gr}_{\mathbb{C}}(k, n)$ which is guaranteed by Definition 2.6 (so it is supported on subspaces of $V \subseteq \mathbb{C}^n$), and consider the measure which is supported on a single atom in $\mathrm{Gr}_{\mathbb{C}}(1, V')$ in the direction of a fixed unit vector $x$. Let $\mu$ be the tensor of these two measures, i.e., if $\nu^*$ samples $k$ orthonormal (sum zero) vectors $u_1, \dots, u_k$ then we choose the subspace with basis $u_1 \otimes x, \dots, u_k \otimes x$.

Now consider some $v$ in the space $V \otimes V' \subseteq \mathbb{C}^n \otimes V'$, and write it as
$$v = \sum_{j=1}^{n} e_j \otimes v'_j,$$
where $e_j$ is the $j$-th coordinate vector of $\mathbb{C}^n$. In fact, the $v'_j$ must add up to $0 \in V'$. We see that
$$\|v\|^2 = \sum_{j=1}^{n} \|v'_j\|^2.$$
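To make the bookkeeping concrete, here is a small sketch (an illustration only, not taken from the paper) that stores a tensor $v = \sum_j e_j \otimes v'_j$ as a list of rows $v'_j$ and checks the norm identity, together with the factorization of inner products against pure tensors $u \otimes x$ that the computation which follows relies on. The function names and the choice $n = 3$, $V' = \mathbb{C}^2$ are assumptions made for the example.

```python
import math


def inner(a, b):
    """Hermitian inner product <a, b> = sum_i a_i * conj(b_i)."""
    return sum(ai * bi.conjugate() for ai, bi in zip(a, b))


def tensor_inner(rows_a, rows_b):
    """Inner product on C^n (x) V' with tensors stored as n rows in V'."""
    return sum(inner(ra, rb) for ra, rb in zip(rows_a, rows_b))


def pure_tensor(u, x):
    """Rows of u (x) x: the j-th row is u_j * x."""
    return [[uj * xc for xc in x] for uj in u]


def rows_sum_to_zero(rows, tol=1e-12):
    """True when v'_1 + ... + v'_n = 0 in V'."""
    return all(abs(sum(col)) <= tol for col in zip(*rows))


# v = sum_j e_j (x) v'_j with n = 3 and V' = C^2; the rows add up to 0.
v = [[1 + 1j, 2 + 0j], [-1 + 0j, 1j], [-1j, -2 - 1j]]
u = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]  # unit vector, sum zero
x = [0.6 + 0j, 0.8j]                             # unit vector in V'

norm_sq = tensor_inner(v, v).real
row_norms_sq = [inner(r, r).real for r in v]

# <v, u (x) x> versus sum_j <e_j, u> <v'_j, x>, where <e_j, u> = conj(u_j).
lhs = tensor_inner(v, pure_tensor(u, x))
rhs = sum(complex(uj).conjugate() * inner(r, x) for uj, r in zip(u, v))
```

Here `norm_sq` agrees with `sum(row_norms_sq)` and `lhs` agrees with `rhs` up to rounding.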
We have
$$\begin{aligned}
\int \sup_{g \in A_n \times H} \|\mathrm{proj}_W(gv)\|^2 \, d\mu(W)
&= \int \sup_{a \in A_n,\, h \in H} \sum_{\ell=1}^{k} \Big| \Big\langle \sum_{j=1}^{n} \psi(a) e_j \otimes \psi'(h) v'_j,\; w_\ell \Big\rangle \Big|^2 \, d\mu^*(w_1, \dots, w_k) \\
&= \int \sup_{a \in A_n,\, h \in H} \sum_{\ell=1}^{k} \Big| \sum_{j=1}^{n} \langle \psi(a) e_j, u_\ell \rangle \langle \psi'(h) v'_j, x \rangle \Big|^2 \, d\nu^*(u_1, \dots, u_k) \\
&\le \int \sup_{w \in \mathrm{Dom}(y)} \sum_{\ell=1}^{k} |\langle w, u_\ell \rangle|^2 \, d\nu^*(u_1, \dots, u_k) \\
&\le f^{\mathrm{alt}}_k(n) \, \|y\|^2,
\end{aligned}$$
where $y \in \mathbb{C}^n$ satisfies $y_j = \sup_{h \in H} |\langle \psi'(h) v'_j, x \rangle|$. The first inequality follows by noting that $\langle \psi(a) e_j, u_\ell \rangle$, as $j$ varies, simply records the coordinates of $u_\ell$ in some permutation, and by considering $w = (w_1, \dots, w_n)$ defined via $w_j = \langle \psi'(h) v'_j, x \rangle$, which is clearly in $\mathrm{Dom}(y)$. Now we see
$$\int \sup_{g \in A_n \times H} \|\mathrm{proj}_W(gv)\|^2 \, d\mu(W) \le f^{\mathrm{alt}}_k(n) \, \|y\|^2 \le f^{\mathrm{alt}}_k(n) \sum_{j=1}^{n} \|v'_j\|^2 = f^{\mathrm{alt}}_k(n) \, \|v\|^2. \qquad \square$$

This completes all the components of the proof of Theorem 1.3.

6. Real subspaces
We already proved Theorem 1.3, which finds a complex subspace. Now we use it to deduce Theorem 1.2, which gives a real subspace. We will apply the following version of the restricted invertibility theorem, which is a special case of [6, Theorem 6]. We write $s_1(M) \ge s_2(M) \ge \cdots$ for the singular values of a matrix $M$.

Theorem 6.1 ([6, Theorem 6]). Let M be a real k × k matrix of rank k . There exists S ⊆ [4 k ] with | S | = k such that M S , the restriction of M to the columns S , satisfies s k ( M S ) & s P kj =3 k/ s j ( M ) k .

Proof of Theorem 1.2.
Let $k \le d/(\log d)^C$, where $C$ is as in Theorem 1.3. By embedding $X$ in $S(\mathbb{C}^d)$ and using Theorem 1.3 we can find a $2k$-dimensional complex subspace $W$ of $\mathbb{C}^d$ such that
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \lesssim 1/\sqrt{\log(d/k)}.$$
Let $v_1, \dots, v_{2k}$ be a unitary basis for the subspace $W$ and let the matrix with these columns be denoted by $B$. Now consider the matrix $M$ which has $4k$ columns, namely $\mathrm{Re}\, v_1, \dots, \mathrm{Re}\, v_{2k}$ and $\mathrm{Im}\, v_1, \dots, \mathrm{Im}\, v_{2k}$. Note that $M$ has $s_{2k}(M) \ge 1/\sqrt{2}$, as any vector $v \in \mathbb{C}^{4k}$ satisfying $i v_j = v_{j+2k}$ has $\|Mv\| = \|v\|/\sqrt{2}$. Therefore by Theorem 6.1 one can select $k$ columns such that the matrix $N$ with those $k$ columns satisfies $s_k(N) \gtrsim 1$. Now consider any unit vector $v$ in the image of $N$. Such a vector can be represented as $v = Nw$ where $\|w\| \lesssim 1$. It therefore suffices to prove that
$$\sup_{x \in X,\; w \in S(\mathbb{R}^k)} |\langle Nw, x \rangle| \lesssim 1/\sqrt{\log(d/k)}.$$
To see this, separate $N$ into $N_1$ and $N_2$, where $N_1$ consists of the columns chosen from the real parts of the vectors $v_i$ and $N_2$ consists of the columns chosen from the imaginary parts of the $v_i$. Let these have $\ell$ and $k - \ell$ columns respectively. Then
$$\sup_{x \in X,\; w \in S(\mathbb{R}^k)} |\langle Nw, x \rangle| \le \sup_{x \in X,\; w \in S(\mathbb{R}^{\ell})} |\langle N_1 w, x \rangle| + \sup_{x \in X,\; w \in S(\mathbb{R}^{k-\ell})} |\langle N_2 w, x \rangle| \le 2 \sup_{x \in X,\; w \in S(\mathbb{C}^{2k})} |\langle Bw, x \rangle| \lesssim 1/\sqrt{\log(d/k)}. \qquad \square$$

7. Lower Bound
Finally, we show a lower bound of $\Omega(1/\sqrt{\log(2d/k)})$, which demonstrates optimality of our results.

Proof of Theorem 1.5.
We prove the real case; an analogous proof works over $\mathbb{C}$ by considering a suitably fine discretization of $\Gamma_d$, or we can repeat the proof in Section 6 to transfer a lower bound from real to complex.

The claim for $k = 1$ was already proved in [4, Sharpness after Theorem 1.3] (see the construction at the beginning of this article right after Theorem 1.1). The case $k = 1$ implies the result also for $k \le d^{1-c}$ for any constant $c$, since we can project from $W$ onto an arbitrary 1-dimensional subspace of $W$.

So from now on assume $k \ge d^{1/2}$. Consider the action of $G = S_d \ltimes (\mathbb{Z}/2\mathbb{Z})^d$ on $\mathbb{R}^d$ by permutation and signing. Let
$$a = \left( \frac{1}{\sqrt{\lfloor k/2 \rfloor + 1}}, \dots, \frac{1}{\sqrt{d}}, 0, \dots, 0 \right).$$
Let $X$ be the $G$-orbit of $a/\|a\|$.

Let $W$ be a $k$-dimensional subspace of $\mathbb{R}^d$. We wish to show $\sup_{x \in X} \|\mathrm{proj}_W x\| \gtrsim 1/\sqrt{\log(2d/k)}$. Let $y = (y_1, \dots, y_d)$ be a uniform random vector in $S(W)$. Let $\sigma_i = (\mathbb{E} y_i^2)^{1/2}$. We have
$$\sigma_1^2 + \cdots + \sigma_d^2 = \mathbb{E}[y_1^2 + \cdots + y_d^2] = 1 \tag{7.1}$$
and
$$\sigma_i^2 = \frac{1}{k} \|\mathrm{proj}_W(e_i)\|^2 \le \frac{1}{k}. \tag{7.2}$$
Without loss of generality, assume that $1/\sqrt{k} \ge \sigma_1 \ge \cdots \ge \sigma_d \ge 0$, so that $\sigma_i \le 1/\sqrt{i}$ for each $i$.

We claim that $a_i \ge \sqrt{2/3}\, \sigma_i$ for all $1 \le i \le d - k/2$. Indeed, for $i \le k$, we have $a_i \ge 1/\sqrt{3k/2} \ge \sqrt{2/3}\, \sigma_i$. For $k < i \le d - \lfloor k/2 \rfloor$, we have $a_i = 1/\sqrt{\lfloor k/2 \rfloor + i} \ge \sigma_i \sqrt{i/(\lfloor k/2 \rfloor + i)} \ge \sqrt{2/3}\, \sigma_i$.

We have $\mathbb{E}|y_i| \gtrsim (\mathbb{E} y_i^2)^{1/2} = \sigma_i$ since $y_i$ is distributed as the first coordinate of a random point on $\sigma_i \sqrt{k} \cdot S(\mathbb{R}^k)$.

Putting everything together, we have
$$\|a\| \sup_{x \in X} \|\mathrm{proj}_W x\| \ge \sup_{g \in G} \|\mathrm{proj}_W ga\| \ge \mathbb{E} \sup_{g \in G} \langle a, gy \rangle \ge \mathbb{E} \sum_{1 \le i \le d} a_i |y_i| \gtrsim \sum_{i=1}^{d} a_i \sigma_i \gtrsim \sum_{i=1}^{d - k/2} \sigma_i^2 \ge \frac{1}{2},$$
where the final step uses (7.1) and (7.2): the last $k/2$ terms each contribute at most $1/k$. Thus
$$\sup_{x \in X} \|\mathrm{proj}_W x\| \gtrsim \frac{1}{\|a\|} \gtrsim \frac{1}{\sqrt{\log(2d/k)}}. \qquad \square$$
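The two numerical facts about the vector $a$ used in this proof can be sanity-checked directly. The sketch below is an illustration under stated assumptions rather than part of the argument: it assumes the nonzero entries of $a$ are $1/\sqrt{\lfloor k/2 \rfloor + i}$ for $1 \le i \le d - \lfloor k/2 \rfloor$ (the reading consistent with the bound $1/\sqrt{\log(2d/k)}$), and the function names are hypothetical. It compares $\|a\|^2$ with $\log(2d/k)$ and tests $a_i \ge \sqrt{2/3}\,\min(1/\sqrt{k}, 1/\sqrt{i})$, which is the largest value $\sigma_i$ can take under the stated ordering.

```python
import math


def lower_bound_vector(d, k):
    """Assumed form of a: entries 1/sqrt(floor(k/2) + i), then floor(k/2) zeros."""
    m = k // 2
    nonzero = [1.0 / math.sqrt(m + i) for i in range(1, d - m + 1)]
    return nonzero + [0.0] * m


def squared_norm(a):
    """Sum of squares, accumulated with fsum to limit rounding error."""
    return math.fsum(x * x for x in a)


def claim_holds(d, k, tol=1e-12):
    """Check a_i >= sqrt(2/3) * min(1/sqrt(k), 1/sqrt(i)) for i <= d - floor(k/2)."""
    a = lower_bound_vector(d, k)
    m = k // 2
    c = math.sqrt(2.0 / 3.0)
    for i in range(1, d - m + 1):
        sigma_max = min(1.0 / math.sqrt(k), 1.0 / math.sqrt(i))
        if a[i - 1] < c * sigma_max - tol:
            return False
    return True


d, k = 10_000, 100
a = lower_bound_vector(d, k)
norm_sq = squared_norm(a)         # equals H_d - H_{floor(k/2)}
target = math.log(2 * d / k)      # the quantity appearing in the bound
```

For `k = 1` the offset is zero and `lower_bound_vector` returns $(1, 1/\sqrt{2}, \dots, 1/\sqrt{d})$, the unnormalized vector from the construction after Theorem 1.1.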
References

[1] Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, Concentration inequalities, Oxford University Press, Oxford, 2013, A nonasymptotic theory of independence, With a foreword by Michel Ledoux.
[2] Michael J. Collins, On Jordan's theorem for complex linear groups, J. Group Theory (2007), 411–423.
[3] Harold Davenport, Multiplicative number theory, third ed., Graduate Texts in Mathematics, vol. 74, Springer-Verlag, New York, 2000, Revised and with a preface by Hugh L. Montgomery.
[4] Ben Green, On the width of transitive sets: Bounds on matrix coefficients of finite groups, Duke Math. J. (2020), 551–578.
[5] Michel Ledoux and Michel Talagrand, Probability in Banach spaces, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 23, Springer-Verlag, Berlin, 1991, Isoperimetry and processes.
[6] Assaf Naor and Pierre Youssef, Restricted invertibility revisited, A journey through discrete mathematics, Springer, Cham, 2017, pp. 657–691.
[7] Mark Rudelson, Recent developments in non-asymptotic theory of random matrices, Modern aspects of random matrix theory, Proc. Sympos. Appl. Math., vol. 72, Amer. Math. Soc., Providence, RI, 2014, pp. 83–120.
[8] Roman Vershynin, High-dimensional probability, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 47, Cambridge University Press, Cambridge, 2018, An introduction with applications in data science, With a foreword by Sara van de Geer.
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA