Statistical Enumeration of Groups by Double Cosets
arXiv preprint (math.PR), February 2021
In memory of Jan Saxl
Persi Diaconis^{a,b}, Mackenzie Simper^{a}

^{a} Department of Mathematics, Stanford University
^{b} Department of Statistics, Stanford University
Abstract
Let H and K be subgroups of a finite group G. Pick g ∈ G uniformly at random. We study the distribution induced on double cosets. Three examples are treated in detail: 1) H = K = the Borel subgroup in GL_n(F_q); this leads to new theorems for the Mallows measure on permutations and new insights into the LU matrix factorization. 2) The double cosets of the hyperoctahedral group inside S_{2n}, which lead to new applications of the Ewens sampling formula of mathematical genetics. 3) Finally, if H and K are parabolic subgroups of S_n, the double cosets are 'contingency tables', studied by statisticians for the past 100 years.

Keywords: double cosets, Mallows measure, Ewens measure, contingency tables
1. Introduction
Let G be a finite group. Pick g ∈ G uniformly at random. What does g 'look like'? This ill-posed question can be sharpened in a variety of ways; this is the subject of 'probabilistic group theory', initiated by Erdős and Turán [39], [40], [41], [42]. Specializing to the symmetric group, one can ask about features of cycles: fixed points, number of cycles, longest (or shortest) cycles, and the order of g [84]. The descent pattern has also been well studied [18]. Specializing to finite groups of Lie type gives 'random matrix theory over finite fields' [47]. The enumerative theory of p-groups is developed in [15]. The questions also make sense for compact groups and lead to the rich world of random matrix theory [4], [28], [44]. 'Probabilistic group theory' is used in many ways; see [37] and [83] for alternative perspectives.

(Preprint submitted to Journal of Algebra, February 10, 2021.)

This paper specializes in a different direction. Let H and K be subgroups of G. Then G splits into double cosets and one can ask about the distribution that a uniform distribution on G induces on the double cosets. Three examples are treated in detail:

• If G = GL_n(F_q) and H = K = B, the lower triangular matrices (a Borel subgroup), then the Bruhat decomposition

G = \bigcup_{\omega \in S_n} B\omega B

shows that the double cosets are indexed by permutations. The induced measure on S_n is the actively studied Mallows measure

p_q(\omega) = \frac{q^{I(\omega)}}{[n]_q!},   (1)

where I(ω) is the number of inversions in the permutation ω and [n]_q! = (1+q)(1+q+q^2) \cdots (1+q+\cdots+q^{n-1}). The double cosets vary in size, from 1/[n]_q! to q^{\binom{n}{2}}/[n]_q!. This might lead one to think that 'most g lie in the big double coset'. While this is true for q large, when q is fixed and n is large, the double coset containing a typical g corresponds to an I(ω) with normal distribution centered at \binom{n}{2} - \frac{n-1}{q-1}, with standard deviation of order \sqrt{n}. See Theorem 3.5.
The descent pattern of a typical ω is a one-dependent determinantal point process with interesting properties [17]. There has been intensive work on the Mallows measure in the past ten years, reviewed in Section 3.3. This past work focuses on q as a parameter with 0 < q ≤ 1. The group theory applications have q > 1.

• If G is the symmetric group S_{2n} and H = K is the hyperoctahedral group of centrally symmetric permutations (isomorphic to C_2^n ⋊ S_n), then the double cosets are indexed by partitions of n and the induced measure is the celebrated Ewens sampling formula

p_q(\lambda) = \frac{n!\, q^{\ell(\lambda)}}{z \cdot z_\lambda},   (2)

where ℓ(λ) is the number of parts of λ, z_\lambda = \prod_{i=1}^{n} i^{a_i} a_i! if λ has a_i parts of size i, and z = q(q+1) \cdots (q+n-1). The measure p_q is well known in genetics. In statistical applications, q is a parameter taken with 0 < q ≤ 1. The group theory application calls for new representations and theorems, developed here using symmetric function theory.

• If G is the symmetric group S_n and H = S_λ, K = S_μ are Young subgroups corresponding to fixed partitions λ and μ of n, then the double cosets are indexed by contingency tables: I × J arrays of non-negative integer entries with row sums λ and column sums μ. If T = {T_{ij}} is such a table, the induced measure on double cosets is the Fisher–Yates distribution

p(T) = \frac{\prod_{i=1}^{I} \lambda_i! \prod_{j=1}^{J} \mu_j!}{n! \prod_{i,j} T_{ij}!},   (3)

where λ_1, ..., λ_I are the row sums of T and μ_1, ..., μ_J are the column sums. This measure has been well studied in statistics because of its appearance in 'Fisher's Exact Test'. This is explained in Section 5. Its appearance in group theory problems suggests new questions developed here: what is the distribution of the number of zeros, or of the largest entry? Conversely, available tools of mathematical statistics (chi-squared approximations) answer natural group theory questions: which double coset is largest, and how large is it?

The topics above have close connections to a lifetime of work by Jan Saxl. When the parabolic subgroups are S_k × S_{n−k}, the double cosets give Gelfand pairs. The same holds for the hyperoctahedral group B_n ⊂ S_{2n} and, roughly, Jan proved that these are the only subgroups of S_n giving Gelfand pairs for n sufficiently large. He solved similar classification problems for finite groups of Lie type. These provide open research areas for the present project.

Section 2 provides background and references for double cosets, Hecke algebras, and Gelfand pairs. Section 3 treats the Bruhat decomposition B\GL_n(F_q)/B. Section 4 treats B_n\S_{2n}/B_n and Section 5 treats parabolic subgroups of S_n and contingency tables. In each of these sections, related open problems are discussed.
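As a quick illustration of the third example, the Fisher–Yates probabilities (3) can be checked by brute force. The sketch below (plain Python; the helper names are ours) enumerates all tables with row sums (3, 2) and column sums (2, 2, 1) and confirms that the probabilities sum to 1; multiplying by 5! recovers the double coset sizes listed in Example 5.1.

```python
from itertools import product
from math import factorial, prod
from fractions import Fraction

def tables(rows, cols):
    """Enumerate non-negative integer arrays with the given row/column sums."""
    if len(rows) == 1:
        yield (tuple(cols),)
        return
    for first in product(range(rows[0] + 1), repeat=len(cols)):
        if sum(first) == rows[0] and all(f <= c for f, c in zip(first, cols)):
            for rest in tables(rows[1:], [c - f for c, f in zip(cols, first)]):
                yield (first,) + rest

def fisher_yates(T, rows, cols):
    """p(T) as in (3): prod(lambda_i!) prod(mu_j!) / (n! prod T_ij!)."""
    n = sum(rows)
    num = prod(factorial(r) for r in rows) * prod(factorial(c) for c in cols)
    den = factorial(n) * prod(factorial(x) for row in T for x in row)
    return Fraction(num, den)

lam, mu = (3, 2), (2, 2, 1)
ts = list(tables(lam, mu))
ps = [fisher_yates(T, lam, mu) for T in ts]
sizes = sorted(p * factorial(5) for p in ps)
```

The naive recursive enumerator is instant for margins this small; exact `Fraction` arithmetic avoids any floating-point doubt in the normalization check.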
2. Background
This section gives definitions, properties, and literature for double cosets, Hecke algebras, and Gelfand pairs.
Let H and K be subgroups of the finite group G. Define an equivalence relation on G by

s \sim t \iff s = h^{-1} t k for some h ∈ H, k ∈ K.

The equivalence classes are called double cosets, written HsK for the double coset containing s and H\G/K for the set of double cosets. This is a standard topic in undergraduate group theory [90], [38]. Simple properties are:

|HsK| = \frac{|H||K|}{|H \cap sKs^{-1}|} = \frac{|H||K|}{|K \cap s^{-1}Hs|}   (4)

[G : H] = \sum_{HsK \in H\backslash G/K} \frac{|HsK|}{|H|}   (5)

|H\backslash G/K| = \frac{1}{|H||K|} \sum_{h \in H, k \in K} |G_{hk}|, where G_{hk} = \{g : h^{-1} g k = g\}.   (6)

Despite these nice formulas, enumerating the number of double cosets can be an intractable problem. For example, when H and K are Young subgroups, double cosets are contingency tables with fixed row and column sums. Enumerating these is a #P-complete problem.

It is natural to ask which double cosets are smallest and largest. When H = K, the smallest is the double coset containing id (with size |H · H| = |H|). When K ≠ H, it is not clear. Is it the double coset containing id? Not always. Indeed, for H a proper subgroup of G, let g be any element of G not in H. Let K = H^g = g^{-1}Hg. Then HgK = Hgg^{-1}Hg = HHg = Hg, so the double coset HgK is just a single right coset of H. This has minimal possible size among H–K double cosets. Since g is not in H, HgK = Hg does not contain the identity. It can even happen that the double coset containing the identity has maximal size. This occurs, from (4) above, whenever H ∩ K = {id}.

The seemingly simple problem of deciding when there is only one double coset becomes the question of factoring G = HK. This has a literature surveyed in [10].

All professional group theorists use double cosets: one of the standard proofs of the Sylow theorems is based on (5), and Mackey's theorems about induction and restriction are in this language. In addition, double cosets have recently been effective in computational group theory. Laue [73] uses them to enumerate all isomorphism classes of semi-direct products. Slattery [85] uses them in developing a version of coset enumeration.

With H and K subgroups of a finite group G, consider the group algebra over a field F: L_F(G) = \{f : G \to F\}.
This is an algebra with (f_1 + f_2)(s) = f_1(s) + f_2(s) and (f_1 * f_2)(s) = \sum_t f_1(t) f_2(st^{-1}). The group H × K acts on L_F(G) by f^{h,k}(s) = f(h^{-1} s k). The bi-invariant functions (satisfying f(h^{-1} s k) = f(s) for all h ∈ H, k ∈ K, s ∈ G) form a sub-algebra of L_F(G) which is here called the Hecke algebra. Many other names are used; see [86] for history.

Hecke algebras are a mainstay of modern number theory (usually with infinite groups). They are also used by probabilists (e.g. [32]) and many stripes of algebraists. Curtis and Reiner [26] is a standard reference for the finite theory. We denote them by L_F(H\G/K). Clearly the indicator functions of the double cosets form a basis for L_F(H\G/K).

For some choices of G and H, with K = H, the space L_F(H\G/K) forms a commutative algebra (even though G and H are non-commutative). Examples are H = S_k × S_{n−k} in G = S_n, or H = B_n in G = S_{2n}. Of course, G acts on L_F(G/H) (say with F = C), and commutativity of L_C(H\G/H) is equivalent to the representation of G on L_C(G/H) being multiplicity free: since L_C(G/H) = Ind_H^G(1) (the trivial representation of H induced up to G), Frobenius reciprocity implies that each irreducible ρ_λ occurring in L_C(G/H) has a 1-dimensional subspace of left H-invariant functions. Let s_λ be such a function, normalized by s_λ(1) = 1. These are the spherical functions of the Gelfand pair (G, H). Standard theory shows that the spherical functions {s_λ} form a second basis for L_C(H\G/H).

We will not develop this further and refer to [27] (Chapter 3F), [20], [75] for applications of Gelfand pairs in probability.

We also note that Gelfand pairs occur more generally for compact and non-compact groups. For example, O_n/O_{n−1} is Gelfand and the spherical functions become the spherical harmonics of classical physics. For O_n ⊂ GL_n(R), the spherical functions are the zonal polynomials beloved of older mathematical statisticians.
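The commutativity phenomenon can be seen concretely in a small case. The sketch below (plain Python; the choice G = S_4 with H = K = S_2 × S_2 is ours, not from the paper) builds the double coset indicator functions and checks that they commute under the convolution defined above.

```python
from itertools import permutations

G = list(permutations(range(4)))
H = [g for g in G if {g[0], g[1]} == {0, 1}]     # S_2 x S_2 inside S_4

def mul(a, b):                                   # composition: (a*b)(i) = a(b(i))
    return tuple(a[i] for i in b)

def inv(a):
    out = [0] * len(a)
    for i, x in enumerate(a):
        out[x] = i
    return tuple(out)

# Partition G into H-double cosets HsH.
cosets, seen = [], set()
for s in G:
    if s not in seen:
        dc = frozenset(mul(mul(h, s), k) for h in H for k in H)
        seen |= dc
        cosets.append(dc)

def conv(f1, f2):                                # (f1 * f2)(s) = sum_t f1(t) f2(s t^{-1})
    return {s: sum(f1[t] * f2[mul(s, inv(t))] for t in G) for s in G}

inds = [{g: int(g in dc) for g in G} for dc in cosets]
commutes = all(conv(a, b) == conv(b, a) for a in inds for b in inds)
```

The three double cosets found here match the three 2 × 2 contingency tables with margins (2, 2); the commutativity reflects the fact that (S_4, S_2 × S_2) is a Gelfand pair.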
Gelfand pairs are even useful for large groups such as S_∞ and U_∞, which are not locally compact. See [19].

Clearly, finding subgroups H giving Gelfand pairs is a worthwhile project. Jan Saxl worked on classifying subgroups giving Gelfand pairs over much of his career [82], [63]. He gave definitive results for the symmetric and alternating groups and for most of the almost simple groups of Lie type. Alas, it turns out that if n is sufficiently large, then S_k × S_{n−k} and the hyperoctahedral group give the only Gelfand pairs in S_n (at least up to subgroups of index 2; e.g. A_k × S_{n−k} is Gelfand in S_n).
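Before moving on, the counting formulas (4), (5), (6) of this section are easy to verify by machine on a small example. The sketch below (plain Python; taking G = S_4 with H ≅ S_3 and K = S_2 × S_2 is our choice) checks all three.

```python
from itertools import permutations

G = list(permutations(range(4)))
H = [g for g in G if g[3] == 3]                  # a copy of S_3
K = [g for g in G if {g[0], g[1]} == {0, 1}]     # S_2 x S_2

def mul(a, b):
    return tuple(a[i] for i in b)

def inv(a):
    out = [0] * len(a)
    for i, x in enumerate(a):
        out[x] = i
    return tuple(out)

def dcoset(s):
    return frozenset(mul(mul(h, s), k) for h in H for k in K)

cosets = {dcoset(s) for s in G}

# (4): |HsK| * |H ∩ sKs^{-1}| = |H||K|
ok4 = all(
    len(dcoset(s)) * len([h for h in H if mul(mul(inv(s), h), s) in K])
    == len(H) * len(K)
    for s in G
)

# (5): [G : H] = sum over double cosets of |HsK| / |H|
ok5 = sum(len(dc) // len(H) for dc in cosets) == len(G) // len(H)

# (6): |H\G/K| * |H||K| = sum over (h, k) of |{g : h^{-1} g k = g}|  (Burnside)
fixed = sum(
    1 for h in H for k in K for g in G if mul(mul(inv(h), g), k) == g
)
ok6 = fixed == len(cosets) * len(H) * len(K)
```

With these margins the double cosets correspond to 2 × 2 tables with row sums (3, 1) and column sums (2, 2), so there are exactly two, each of size 12.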
3. Bruhat decomposition and Mallows measure
Let G = GL_n(F_q), the general linear group over a field with q elements. Let B be the lower triangular matrices in G. Let W denote the permutation group embedded in G as permutation matrices. The decomposition of G into B–B double cosets is called the Bruhat decomposition [86] and has the following properties:

G = \bigcup_{\omega \in W} B\omega B, \quad |B| = (q-1)^n q^{\binom{n}{2}}, \quad |G| = |B| \prod_{i=1}^{n-1} (1 + q + \cdots + q^i).   (7)

Thus, permutations index the double cosets. The size of BωB is

|B\omega B| = |B|\, q^{I(\omega)},

where I(ω) is the number of inversions of ω (that is, I(\omega) = |\{i < j : \omega_i > \omega_j\}|). Dividing by |G|, we get the induced measure

p_q(\omega) = \frac{q^{I(\omega)}}{[n]_q!}, \quad [n]_q! = \prod_{i=1}^{n-1} (1 + q + \cdots + q^i).   (8)

Example 3.1. In S_3, the inversions are

ω     123  132  213  231  312  321
I(ω)   0    1    1    2    2    3

and (1 + q)(1 + q + q^2) = 1 + 2q + 2q^2 + q^3.

The measure p_q(ω), ω ∈ W, is studied as the Mallows measure on W = S_n in the statistical and combinatorial probability literature. A review is in Section 3.3. Much of this development is for the statistically natural case of 0 < q < 1 or q close to 1. The group theory application has q = p^a for p a prime and a ∈ {1, 2, 3, ...}. This calls for new theorems and insights. The question of interest is:

Pick g ∈ G from the uniform distribution. What double coset is g likely to be in?   (9)

An initial inspection of (8) reveals the minimum and maximum values: p_q(id) = 1/[n]_q! and p_q(\omega_0) = q^{\binom{n}{2}}/[n]_q! for ω_0 = n(n−1)⋯21, the reversal permutation. Thus, Bω_0B is the largest double coset. It is natural to guess that 'maybe most elements are in Bω_0B'. This turns out to not be the case.

Lemma 3.2.

\frac{q^{\binom{n}{2}}}{[n]_q!} = c(q) \left(1 - \frac{1}{q}\right)^{n-1}, \quad c(q) = \prod_{i=2}^{n} \left(1 - \frac{1}{q^i}\right)^{-1}.   (10)

Proof.
Using that \binom{n}{2} = \sum_{i=1}^{n-1} i, simple algebra gives

\frac{q^{\binom{n}{2}}}{[n]_q!} = \frac{q^{\binom{n}{2}}}{(1+q)(1+q+q^2) \cdots (1+q+\cdots+q^{n-1})} = \frac{1}{\prod_{i=1}^{n-1}\left(\frac{1}{q^i} + \cdots + \frac{1}{q} + 1\right)} = \frac{\left(1 - \frac{1}{q}\right)^{n-1}}{\prod_{i=2}^{n}\left(1 - \frac{1}{q^i}\right)}. □

The infinite product \prod_{i=1}^{\infty}(1 - 1/q^i) converges. This shows that for fixed q, when n is large p_q(ω_0) is exponentially small. Of course, for n fixed and q large, p_q(ω_0) tends to 1 (only q ≫ n is needed). In Section 3.2, it is shown that a uniform g is contained in BωB for

I(\omega) = \binom{n}{2} - \frac{n-1}{q-1} + Z\, \frac{\sqrt{(n-1)q}}{q-1},

with Z a standard normal random variable.

Let us conclude this introductory section with two applied motivations for studying this double coset decomposition.

Example 3.3 (LU decomposition of a matrix). Consider solving Ax = b with A fixed in GL_n(F_q) and b fixed in F_q^n. The standard 'Gaussian elimination' solution subtracts an appropriate multiple of the first row from lower rows to make the first column (1, 0, ..., 0)^T, then subtracts multiples of the second row to make the second column (∗, 1, 0, ..., 0)^T, and so on, resulting in the system Ux^* = b^* with U upper triangular. This can be solved inductively for x^* and then x. This description assumes that at stage j, the (j, j) entry of the current triangularization is non-zero. If it is zero, a permutation (pivoting step) is made to work with the first non-zero element in column j. A marvelous article by Roger Howe [62] shows in detail how this is equivalent to expressing A = B_1ωB_2, with the number of pivoting steps determined by I(ω). Thus, matrices in the largest Bruhat cell require no pivots, and p_q(ω) gives the chance of the various pivoting permutations.

Example 3.4 (Random generation for GL_n(F_q)). Suppose one wants to generate N independent picks from the uniform distribution on GL_n(F_q).
We have had to do this in cryptography applications when q = 2, n = 256, N = 10. Testing conjectures for G also uses random samples. One easy method is to fill in an n × n array with independent picks from the uniform distribution on F_q and then check if the resulting matrix A is invertible (using Gaussian elimination). If A is not invertible, this is simply repeated. The chance of success is approximately \prod_{i=1}^{\infty}\left(1 - \frac{1}{q^i}\right) (≈ 0.29 when q = 2). Alas, this calls for a variable number of steps and made a mess in programming our crypto chip. Igor Pak suggested a simple algorithm that works in one pass:

1. Pick ω ∈ W from p_q(ω).
2. Pick B_1, B_2 ∈ B uniformly.
3. Form B_1ωB_2.

Since picking the B_i uniformly is simple, this is a fast algorithm. But how to pick ω from p_q? The following algorithm is standard:

1. Place symbols 1, 2, ..., n down in a row sequentially, beginning with 1.
2. If symbols 1, 2, ..., i−1 have been placed, put symbol i leftmost with probability q^{i-1}(q-1)/(q^i-1), in the second position with probability q^{i-2}(q-1)/(q^i-1), ..., and in the i-th (rightmost) position with probability (q-1)/(q^i-1).
3. Continue until all n symbols are placed.

The following sections develop some theorems for the Mallows distribution (8) for q > 1 and n large. In Section 3.2, the normality of I(ω) is established. Section 3.3 develops other properties along with a literature review of what is known for q < 1. The descent pattern is developed in Section 3.4; generalizations to other finite groups and parallel orbit decompositions (e.g. G × G acting on Mat(n, q)) are in Section 5.3. These sections are also filled with open research problems.

3.2. The distribution of I(ω)

This section proves the limiting normality of the number of inversions I(ω) under the Mallows measure p_q(ω) defined in (8), when q > 1 is fixed and n is large. Thus, most g ∈ GL_n(F_q) are not in the largest double coset.

Theorem 3.5.
With notation as above, for any x ∈ R,

p_q\left( \frac{I(\omega) - \binom{n}{2} + \frac{n-1}{q-1}}{\sqrt{(n-1)q}/(q-1)} \le x \right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt + o(1).

The error term is uniform in x.

Proof. The argument uses the classical fact that under p_q on S_n, I(ω) is exactly distributed as a sum of independent random variables. Let

P_j(i) = \frac{q^i(q-1)}{q^{j+1}-1} for 0 ≤ i ≤ j < ∞.

Write X_j for a random variable with distribution P_j, 1 ≤ j ≤ n−1, taking the X_j independent. Then

p_q(I(\omega) = a) = P\{X_1 + X_2 + \cdots + X_{n-1} = a\} for all n and 0 ≤ a ≤ \binom{n}{2}.   (11)

To see (11), use generating functions. Rodrigues [81] proved for any θ that

\sum_{\omega \in S_n} \theta^{I(\omega)} = (1+\theta)(1+\theta+\theta^2) \cdots (1+\theta+\cdots+\theta^{n-1}).

Take θ = qx and divide both sides by [n]_q! to see E_q[x^{I(\omega)}] = E[x^{X_1}] \cdots E[x^{X_{n-1}}]. Under P_j,

P_j(X_j = j - a) = \frac{q^{j-a}(q-1)}{q^{j+1}-1} = \left(1 - \frac{1}{q}\right) q^{-a} \left(\frac{1}{1 - q^{-(j+1)}}\right).

Thus, when j is large (and using that q > 1), j − X_j is exponentially close to a geometric random variable X with

P(X = a) = \left(1 - \frac{1}{q}\right)\left(\frac{1}{q}\right)^a, \quad 0 ≤ a < ∞.

This X has E[X] = 1/(q-1) and Var(X) = q/(q-1)^2. Now, the classical central limit theorem implies the result. □

3.3. Other properties of p_q(ω)

The discussion above points to the question: what properties of ω are 'typical' under p_q(ω)? We now see that ω with I(\omega) = \binom{n}{2} - \frac{n-1}{q-1} \pm \frac{\sqrt{(n-1)q}}{q-1} are typical, but are all such ω equally likely?

The distribution p_q(ω) is studied (for general Coxeter groups) in [32]. They show (for all q, n)

p_q(\omega) = p_q(\omega^{-1})   (12)
p_q(\omega_1 = j) = q^{j-1}(q-1)/(q^n-1)   (13)
p_q(\omega_n = j) = q^{n-j}(q-1)/(q^n-1)   (14)
p_q(\omega) = p_{q^{-1}}(R(\omega)),   (15)

where in the last expression R(ω) is the reversal of ω (e.g. R(31542) = 24513). However, there do not appear to be simple expressions for p_q(ω_i = j), 1 < i < n, nor for the distribution of the number of fixed points, cycles, or other features standard in enumerative combinatorics.

There has been remarkable study of features when q is close to 1 (often q = 1 − β/n). These include:

• The limiting distribution of the empirical measure \frac{1}{n}\sum_i \delta_{(i/n,\, \omega_i/n)}, studied by Shannon Starr [88]. He shows, for q = 1 − β/n,

\lim_{\epsilon \downarrow 0,\, n \to \infty} p_q\left\{ \left| \frac{1}{n} \sum_i f\left(\frac{i}{n}, \frac{\omega_i}{n}\right) - \int_{[0,1] \times [0,1]} f(x,y)\, u(x,y)\, dx\, dy \right| > \epsilon \right\} = 0

for any continuous function f : [0,1] × [0,1] → R, where

u(x,y) = \frac{(\beta/2) \sinh(\beta/2)}{\left( e^{\beta/4} \cosh(\beta(x-y)/2) - e^{-\beta/4} \cosh(\beta(x+y-1)/2) \right)^2}.

Starr derives these results rigorously by considering a Gibbs measure on permutations Z^{-1}(\beta) e^{-\beta H(\omega)} with H(\omega) = \frac{1}{n} \sum_{1 \le i < j \le n} \mathbb{1}\{\omega_i > \omega_j\}.

3.4. The descent process

The descent pattern of ω under p_q is also well understood. To describe this, let

X_i(\omega) = 1 if ω_{i+1} < ω_i, 0 if ω_{i+1} > ω_i, for 1 ≤ i ≤ n−1.

If ω is random, then X_1, X_2, ..., X_{n−1} is a point process. The following result of Borodin, Diaconis, Fulman [18] describes many properties of this process. For further definitions and background, see [17].

Theorem 3.6.
Let p_q, q > 1, be the Mallows measure (8).

(a) The chance that a random permutation chosen from p_q has descent set containing s_1 < s_2 < \cdots < s_k is

\det\left[ \frac{1}{[s_{j+1} - s_i]_q!} \right]_{i,j=0}^{k}, with s_0 = 0, s_{k+1} = n

(with the convention 1/[m]_q! = 0 for m < 0).

(b) The point process X_i(ω) is stationary, one-dependent, and determinantal with kernel K(x,y) = k(x−y), where

\sum_{m \in \mathbb{Z}} k(m) z^m = \frac{1}{1 - \left( \sum_{m=0}^{\infty} z^m/[m]_q! \right)^{-1}}.

(c) The chance of finding k descents in a row is q^{\binom{k+1}{2}}/[k+1]_q!. In particular, the number d(ω) of descents has mean

\mu(n,q) = \frac{q}{q+1}(n-1)

and variance

\sigma^2 = \frac{q\left( (q^2-q+1)n - q^2 + 3q - 1 \right)}{(1+q)^2(1+q+q^2)}.

Normalized by its mean and variance, the number of descents has a limiting standard normal distribution.
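The mean and variance in part (c) can be checked exactly for small n: enumerate S_n, weight by (8) in exact rational arithmetic, and compare. A sketch (plain Python; the values n = 5, q = 2 are arbitrary choices in the group-theoretic regime q > 1):

```python
from itertools import permutations
from fractions import Fraction

def inversions(w):
    n = len(w)
    return sum(w[i] > w[j] for i in range(n) for j in range(i + 1, n))

def descents(w):
    return sum(w[i] > w[i + 1] for i in range(len(w) - 1))

n, q = 5, 2
weight = {w: Fraction(q) ** inversions(w) for w in permutations(range(n))}
Z = sum(weight.values())                         # normalizer; equals [n]_q!
mean = sum(wt * descents(w) for w, wt in weight.items()) / Z
second = sum(wt * descents(w) ** 2 for w, wt in weight.items()) / Z
var = second - mean ** 2
```

Exact `Fraction` arithmetic makes the comparison an identity rather than a numerical approximation; for n = 5 the 120-term sums are instant.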
Remarks
1. Consider the distribution of d(ω) in part (c) of Theorem 3.6. Under the uniform distribution on S_n, d(ω) has mean (n−1)/2 and variance (n+1)/12 (obtained by setting q = 1 in the formulas in part (c)). The distribution p_q pushes ω toward ω_0. How much? The mean increases to \frac{q}{q+1}(n-1) and, as makes sense, the variance decreases. For large q, the mean goes to the maximum value (n−1) and the variance goes to zero.

2. The paper [18] gives simple formulas for the k-point correlation functions — the chance that the descent set of ω contains a general set S.

3. There is an interesting alternative way to compute various moments for d(σ) under the measure p_q. Let

A_n(y) = \sum_{\sigma \in S_n} p_q(\sigma)\, y^{d(\sigma)}.

Then

\sum_{n=0}^{\infty} A_n(y) z^n = \frac{1-y}{E(z(y-1)) - y}, \quad E(w) = \sum_{n=0}^{\infty} \frac{q^{\binom{n}{2}}}{[n]_q!} w^n.

Differentiating in y and setting y = 1 gives the generating function of the falling factorial moments for d under p_q. Using Maple, Stanley (personal communication) computes

\sum_n A'_n(1) z^n = \frac{q z^2}{(1-z)^2 (q+1)}

\sum_n A''_n(1) z^n = \frac{2 q^2 z^3 (q^2+q+z)}{(1-z)^3 (q+1)^2 (q^2+q+1)}

\sum_n A'''_n(1) z^n = \frac{6 q^3 z^4 (q^4 + q^3 + 2q^2 z + 2qz - qz^2 + z^2)}{(1-z)^4 (q+1)^3 (q^2+q+1)(q^2+1)}.

These give independent checks on the mean and variance reported before and an expression for the third moment. It would be a challenge to prove the central limit theorem by this route.

Sections 3.1–3.4 underscore our main point: enumeration by double cosets can lead to interesting mathematics.

The Bruhat decomposition (7) is a special case of more general results. Bruhat showed that a classical semi-simple Lie group has a double coset decomposition of this form, where B is a maximal solvable subgroup of G and W is the Weyl group. Then Chevalley showed the construction makes sense for any field, in particular finite fields. This gives

|G| = |B| \sum_{\omega \in W} q^{\ell(\omega)},

with ℓ(ω) the length of the word ω in the Coxeter generators. The length generating function factors:

\sum_{\omega \in W} q^{\ell(\omega)} = \prod_{i=1}^{n} (1 + q + \cdots + q^{e_i}),

where the e_i, the exponents of W, are known. From here, one can prove the analog of Theorem 3.5. The Weyl groups have a well-developed descent theory (number of positive roots sent to negative roots) and one may ask about the analog of Theorem 3.6, along with the other distributional questions above.

We want to mention two parallel developments.
Louis Solomon [86] has built a beautiful parallel theory for describing the orbits of GL_n(F_q) × GL_n(F_q) on Mat(n, q), the set of n × n matrices. This has been wonderfully developed by Tom Halverson and Arun Ram [53]. None of the probabilistic consequences have been worked out. There is clearly something worthwhile to do.

Second, Bob Guralnick [52] has classified the orbits of GL_n(F_q) × GL_m(F_q) acting on the set Mat(n, m; q) of n × m matrices over F_q. Estimating the sizes and other natural questions about the orbits in the spirit of this section seems like an interesting project.

Finally, it is worth pointing out that finding 'nice descriptions' of double cosets is usually not possible. For example, let U_n(q) be the group of n × n uni-upper triangular matrices with entries in F_q. Let G = U_n(q) × U_n(q), with H = K = U_n(q) embedded diagonally. Describing the H–H double cosets is a well-studied wild problem in the language of quivers [50]. In [3], this was replaced by the easier problem of studying the 'supercharacters' of U_n. This leads to nice probabilistic limit theorems. See [23], [24].
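A tiny instance of such orbit decompositions can be computed directly. The sketch below is our own toy example, not from the paper: it decomposes Mat_2(F_2) into orbits of the Borel subgroup of lower triangular invertible matrices acting on both sides. Renner's extension of the Bruhat decomposition to the matrix monoid says the orbits are indexed by partial permutation ('rook') matrices, of which there are 7 when n = 2.

```python
from itertools import product

q = 2                                             # work over F_2

def mat_mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % q for j in range(2))
        for i in range(2)
    )

all_mats = [((a, b), (c, d)) for a, b, c, d in product(range(q), repeat=4)]
lower = [((1, 0), (x, 1)) for x in range(q)]      # invertible lower triangular B

orbits, seen = [], set()
for m in all_mats:
    if m not in seen:
        orb = {mat_mul(mat_mul(b1, m), b2) for b1 in lower for b2 in lower}
        seen |= orb
        orbits.append(orb)
```

The 7 orbits here correspond to the empty rook matrix (the zero matrix), the four matrix units, the identity, and the anti-diagonal permutation.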
4. Hyperoctahedral double cosets and the Ewens sampling formula
Let B_n be the group of symmetries of an n-dimensional hypercube. This is one of the classical groups generated by reflections. It can be represented as

B_n \cong C_2^n \rtimes S_n   (16)

with S_n acting on the binary n-tuples by permuting coordinates. Thus, |B_n| = 2^n n!. For present purposes it is useful to see B_n ⊂ S_{2n} as the subgroup of centrally symmetric permutations, that is, permutations σ ∈ S_{2n} with σ(i) + σ(2n + 1 − i) = 2n + 1 for all i. For example, when n = 2 we have |B_2| = 8 and, as elements of S_4, can write

B_2 = {1234, 1324, 2143, 2413, 3142, 3412, 4231, 4321}.

The first and last values in each permutation sum to five, as do the middle two values. This representation is useful in studying perfect shuffles of a deck of 2n cards [31].

The double coset space B_n\S_{2n}/B_n is a basic object of study in the statisticians' world of zonal polynomials. Macdonald ([76], Section 7.1) develops this clearly, along with citations to the statistical literature, and this section follows his notation.

We begin by noting two basic facts: 1) B_n ⊂ S_{2n} forms a Gelfand pair. 2) The double cosets are indexed by partitions of n. To see how this goes, to each permutation σ ∈ S_{2n} associate a graph T(σ) with vertices 1, 2, ..., 2n and edges \{\epsilon_i, \epsilon_i^\sigma\}_{i=1}^{n}, where ε_i joins vertices 2i−1, 2i and ε_i^σ joins vertices σ(2i−1), σ(2i). Color the ε_i edges red and the ε_i^σ edges blue. Then each vertex lies on exactly one red and one blue edge. This implies the components of T(σ) are cycles with alternating red and blue edges, so each cycle has even length. Dividing these cycle lengths by 2 gives a partition of n, call it λ_σ.

Example 4.1.
Take n = 3 and σ = 612543. The graph T(σ) has vertices 1, ..., 6, red edges ε_1 = {1,2}, ε_2 = {3,4}, ε_3 = {5,6}, and blue edges ε_1^σ = {6,1}, ε_2^σ = {2,5}, ε_3^σ = {4,3}. [Figure omitted.] Here there is a cycle of length 4 and a cycle of length 2, so this corresponds to the partition λ_σ = (2,1).

One checks that λ_σ = λ_{σ'} if and only if σ ∈ B_n σ' B_n. Thus, the partitions of n serve as double coset representatives for B_n\S_{2n}/B_n. If we denote B_λ = B_n σ B_n for λ = λ_σ, then

|B_\lambda| = \frac{|B_n|^2}{2^{\ell(\lambda)} z_\lambda},   (17)

with z_\lambda = \prod_{i=1}^{n} i^{a_i} a_i! and \ell(\lambda) = \sum_{i=1}^{n} a_i(\lambda) the number of parts in λ, where λ has a_i parts of size i. For example, for σ = id we have λ_σ = 1^n, z_λ = n!, and |B_{1^n}| = (2^n n!)^2/(2^n n!) = |B_n|. The largest double coset corresponds to the 2n-cycle (1 2 ⋯ 2n) in S_{2n} (not all 2n-cycles are in the same double coset).

To see this, let f(λ) = 2^{\ell(\lambda)} z_\lambda for a partition λ of n, so that |B_λ| = |B_n|^2/f(λ). Note that for any λ, if a box from the lower right corner is moved to the right end of the top row, the result λ' is still a partition. For example, λ = (3,2,2) gives λ' = (4,2,1).

Lemma 4.2.
With notation above, f(λ) > f(λ').

Proof. Assume the first row of λ has a boxes and the last row has b boxes. Consider f(λ')/f(λ). If b > 1, then ℓ(λ') = ℓ(λ). With z_\lambda = \prod_{i=1}^{n} i^{m_i} m_i!, where m_i is the number of parts of λ of length i, then from λ to λ' only the factors for i = a, b, a+1, b−1 change:

\frac{f(\lambda')}{f(\lambda)} = \frac{a^{m_a-1}(m_a-1)! \cdot (a+1)^{m_{a+1}+1}(m_{a+1}+1)! \cdot b^{m_b-1}(m_b-1)! \cdot (b-1)^{m_{b-1}+1}(m_{b-1}+1)!}{a^{m_a} m_a! \cdot (a+1)^{m_{a+1}} m_{a+1}! \cdot b^{m_b} m_b! \cdot (b-1)^{m_{b-1}} m_{b-1}!}
= \frac{(a+1)(m_{a+1}+1)(b-1)(m_{b-1}+1)}{a\, m_a\, b\, m_b} = \frac{(a+1)(b-1)}{a\, m_a\, b\, m_b} < 1,

since a is the length of the top row and b is the length of the bottom row, so that m_{a+1} = m_{b-1} = 0, and the inequality holds because m_a, m_b ≥ 1 and b ≤ a.

If b = 1, then ℓ(λ') = ℓ(λ) − 1 and

\frac{f(\lambda')}{f(\lambda)} = \frac{a+1}{2\, a\, m_a\, m_1} < 1. □

Corollary 4.3. For λ ⊢ n, f(λ) ≥ 2n with equality if and only if λ = (n).

Remark.
It is natural to guess that f(λ) is monotone in the usual partial order on partitions. This fails; for example, with λ = (3,2,1) and λ' = (4,1,1) (partitions of 6) we have λ < λ' in the dominance order, but f(λ) = 2^3 · 3! = 48 < 64 = 2^3 · 8 = f(λ'). Still, inspection of special cases suggests that the partial order in the lemma can be refined.

The lemma shows that the smallest double coset in B_n\S_{2n}/B_n corresponds to id, λ = 1^n, and the largest corresponds to the 2n-cycle (1 2 ⋯ 2n), λ = (n). Dividing (17) by |S_{2n}| gives the probability measure

P_n(\lambda) = Z^{-1} \cdot \frac{n!}{2^{\ell(\lambda)} z_\lambda}, \quad Z = \frac{1}{2}\left(\frac{1}{2}+1\right) \cdots \left(\frac{1}{2}+n-1\right).   (18)

This shows that P_n(λ) is the Ewens measure P_θ for θ = 1/2. The Ewens measure with parameter θ is usually described as a measure on the symmetric group S_n with

P_\theta(\eta) = \frac{\theta^{c(\eta)}}{\theta(\theta+1) \cdots (\theta+n-1)}, \quad c(\eta) = number of cycles in η.   (19)

If η is in the conjugacy class corresponding to λ ⊢ n, then c(η) = ℓ(λ) and the size of the conjugacy class is n!/z_λ. Using this, simple calculations show (18) is (19) with θ = 1/2.

The Ewens measure is well studied on S_n because of its appearance in genetics. The survey by Harry Crane [25] gives a detailed overview of its many appearances and properties. Limit theorems for P_θ are well developed. Arratia–Barbour–Tavaré ([5], Chapter 4) studies the distribution of cycles (number of cycles, longest and shortest cycles, etc.) under P_θ. The papers of Féray [43] study exceedances, inversions, and subword patterns. A host of features display a curious property: the limiting distribution does not change with θ (!). For example, the structure of the descent set of an Ewens permutation matches that of a uniform permutation. An elegant, unified theory is developed in the papers of Kammoun [68], [67], [69]. More or less any natural feature of λ has been covered. These papers work for all θ, so the results hold for P_θ in (19).

The following section gives more details. The final section suggests related problems.

4.2. A cycle index

For a partition λ of n let a_i(λ) be the number of parts of λ equal to i. Thus \sum_{i=1}^{n} i\, a_i(\lambda) = n. For σ ∈ S_{2n}, write a_i(σ) for a_i(λ_σ) and introduce the generating functions:

f_n(x_1, \ldots, x_n) = \frac{1}{(2n)!} \sum_{\sigma \in S_{2n}} \prod_{i=1}^{n} x_i^{a_i(\sigma)}, \quad n ≥ 1, \quad f_0 = 1,

and

f(t) = \sum_{n=0}^{\infty} t^n \binom{2n}{n} 2^{-2n} f_n.

The following analog of Polya's cycle index theorem holds.
Theorem 4.4.
With notation as above,

f(t) = \exp\left( \sum_{n=1}^{\infty} \frac{t^n}{2n} x_n \right).

Proof.
The proof uses symmetric function theory as in [76]. In particular, the power sum symmetric functions in variables y = (y_1, y_2, ...) are p_j(y) = \sum_i y_i^j and, for λ = 1^{a_1} 2^{a_2} \cdots, p_\lambda(y) = \prod_i p_i^{a_i}. A formula at the bottom of pg. 307 in [76] specializes to

\sum_\lambda z_\lambda^{-1} 2^{-\ell(\lambda)} p_\lambda(y) p_\lambda(y') = \exp\left( \sum_{n=1}^{\infty} \frac{1}{2n} p_n(y) p_n(y') \right).   (20)

In (20), y and y' are distinct sets of variables. We have set v_n ≡ 2 in Macdonald's formula (see the discussion following the proof). Set further y = y' and replace y by \sqrt{t}\, y to get

\sum_\lambda z_\lambda^{-1} 2^{-\ell(\lambda)} t^{|\lambda|} p_\lambda(y)^2 = \exp\left( \sum_{n=1}^{\infty} \frac{t^n}{2n} p_n(y)^2 \right), \quad |\lambda| = \sum_i \lambda_i.

Since the p_n are free generators of the ring of symmetric functions, they may be specialized to p_n \to \sqrt{x_n} (that is, setting p_n(y)^2 = x_n). Then the formula becomes

\sum_{n=0}^{\infty} t^n \sum_{\lambda \vdash n} z_\lambda^{-1} 2^{-\ell(\lambda)} \prod_i x_i^{a_i(\lambda)} = \exp\left( \sum_{n=1}^{\infty} \frac{t^n}{2n} x_n \right).   (21)

As above, the inner sum is

\sum_{\lambda \vdash n} z_\lambda^{-1} 2^{-\ell(\lambda)} \prod_i x_i^{a_i(\lambda)} = \frac{1}{(2^n n!)^2} \sum_{\sigma \in S_{2n}} \prod_i x_i^{a_i(\lambda_\sigma)} = \binom{2n}{n} 2^{-2n} f_n. □

To bring out the probabilistic content of Theorem 4.4, recall the negative binomial density with parameters 1/2 and t assigns mass

p_{1/2,t}(n) = Z^{-1} \binom{2n}{n} \frac{t^n}{2^{2n}}, \quad Z^{-1} = \sqrt{1-t}, \quad n = 0, 1, 2, \ldots

Multiply both sides of (21) by \sqrt{1-t} to see

\sum_{n=0}^{\infty} p_{1/2,t}(n) \cdot f_n = \prod_{n=1}^{\infty} \exp\left( \frac{t^n}{2n} x_n - \frac{t^n}{2n} \right),   (22)

using the expansion \frac{1}{\sqrt{1-t}} = \prod_n e^{t^n/2n}. Recall the Poisson(λ) distribution on {0, 1, 2, ...} has density e^{-\lambda} \lambda^j / j! and moment generating function e^{-\lambda + \lambda x}. This and (22) gives

Corollary 4.5.
Pick n ∈ {0, 1, 2, . . .} from p_{1/2,t}(n) and then σ ∈ S_{2n} from the uniform distribution. If σ has coset type λ_σ with a_i parts equal to i, then the {a_i}_{i≥1} are independent, with a_i having a Poisson distribution with parameter t^i/(2i).

From this corollary one may prove theorems about the joint distribution of cycles exactly as in [84]. This gives analytic proofs of previously proved results. For example, for large n:

• The {a_i}_{i=1}^n are asymptotically independent with Poisson(1/(2i)) distributions.

• ℓ(λ) has mean asymptotic to log(n)/2 and variance asymptotic to log(n)/2, and ℓ(λ) has a limiting normal distribution.

The distributions of the smallest and largest parts are similarly determined. The calculations in this section closely match the development in [91], which gives a very clear description of the results above from the genetics perspective.

(a) The formula of Macdonald used in Section 4.2 involved a sequence of numbers v_i, 1 ≤ i < ∞. For a partition λ, define v_λ = v_{λ_1} v_{λ_2} · · · v_{λ_l} multiplicatively. Macdonald proves

Σ_λ v_λ^{-1} z_λ^{-1} p_λ(y) p_λ(y′) = e^{Σ_n p_n(y) p_n(y′)/(n v_n)}.

At the right, the product form of the exponential means 'something is independent' and it is up to us to see what it is.

As a first example, take v_i = 1 for all i. Then, proceeding as in (22), the formula becomes

Σ_{n=0}^∞ t^n C_n = e^{Σ_i x_i t^i / i}, with C_n(x_1, . . . , x_n) = (1/n!) Σ_{σ∈S_n} Π_i x_i^{a_i(σ)},

the cycle indicator of S_n. This is exactly Polya's cycle formula; see [84].

Taking v_n to be 1, (1 − t^n)^{-1}, 2, α, or (1 − q^n)/(1 − t^n), Macdonald shows that each choice gives celebrated special functions: Schur, Hall-Littlewood, Zonal, Jack, and Macdonald, respectively. We are sure that each will give rise to an interesting enumerative story, if only we could find out what is being counted. Indeed, in [48] Jason Fulman has shown that the case v_i = (1 − q^i)^{-1} enumerates F-stable maximal tori in GL_n(F_q).

(b) For the cycles of the symmetric group, Polya's formula shows that the limiting Poisson approximation is remarkably accurate. In particular, under the uniform distribution on S_n:

• The first n moments of the number of fixed points of σ, a_1(σ), are equal to the first n moments of the Poisson(1) distribution.

• More generally, the mixed moments E_{S_n}[a_1^{k_1} a_2^{k_2} · · · a_l^{k_l}] equal the same moments of independent Poisson variables with parameters 1, 1/2, . . . , 1/l, as long as k_1 + 2k_2 + · · · + l·k_l ≤ n.

Theorem 4.4 allows exact computation of the joint mixed moments of a_1, a_2, . . . for λ chosen from the Ewens(1/2) distribution. They are not equal to the limiting moments. The moments were first computed by Watterson in [91].

(c) We mention a q-analog of the results of this section which is parallel and 'nice'; it remains to be developed. The symplectic group Sp_{2n}(F_q) is a subgroup of GL_{2n}(F_q), and (GL_{2n}, Sp_{2n}) is a Gelfand pair. The double cosets are nicely labeled and the enumerative facts are explicit enough that analogs of the results above should be available. For details, see [6]. Jimmy He ([54], [56]) worked out the convergence rates for the natural random walk on GL_{2n}/Sp_{2n} using the spherical functions. This problem was suggested to the first author by Jan Saxl as a way of tricking himself into learning some probability. The walk becomes a walk on quadratic forms, and He proves that a cutoff occurs.
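The exact-moment phenomenon in (b) can be checked by brute force for small n. The sketch below (plain Python; the helper names are ours, not the paper's) enumerates S_6 and compares the first six moments of the fixed-point count with those of a Poisson(1) variable, which are the Bell numbers 1, 2, 5, 15, 52, 203.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def fixed_points(sigma):
    """Number of fixed points of a permutation given in one-line notation."""
    return sum(1 for i, s in enumerate(sigma) if s == i)

def moment_of_fixed_points(n, k):
    """Exact k-th moment of a_1(sigma) under the uniform distribution on S_n,
    computed by enumerating all n! permutations."""
    total = sum(fixed_points(sigma) ** k for sigma in permutations(range(n)))
    return Fraction(total, factorial(n))

# The moments of a Poisson(1) variable are the Bell numbers (Dobinski's formula).
bell = [1, 2, 5, 15, 52, 203]

for k in range(1, 7):
    assert moment_of_fixed_points(6, k) == bell[k - 1]
print("first 6 moments of a_1 on S_6 match Poisson(1) exactly")
```

The agreement is exact (hence the use of `Fraction`), and it breaks as soon as k exceeds n, in line with the condition k_1 + 2k_2 + · · · + l·k_l ≤ n.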
5. Parabolic subgroups of S_n

Let λ be a partition of n (denoted λ ⊢ n). That is, λ = (λ_1, λ_2, . . . , λ_I) with λ_1 ≥ λ_2 ≥ . . . ≥ λ_I > 0 and λ_1 + λ_2 + . . . + λ_I = n. The parabolic subgroup S_λ is the set of all permutations in S_n which permute only {1, 2, . . . , λ_1} among themselves, only {λ_1 + 1, . . . , λ_1 + λ_2} among themselves, and so on. Thus, S_λ ≅ S_{λ_1} × S_{λ_2} × . . . × S_{λ_I}. If s_i = (i, i + 1), 1 ≤ i ≤ n − 1, are the adjacent transpositions generating S_n, then S_λ is generated by {s_i}_{i=1}^{n−1} \ {s_{λ_1}, s_{λ_1+λ_2}, . . . , s_{n−λ_I}}. The group S_λ is often called a Young subgroup.

Let μ = (μ_1, . . . , μ_J) be a second partition of n. This section studies the double cosets S_λ\S_n/S_μ. These cosets are a classical object of study; they can be indexed by contingency tables: I × J arrays of non-negative integers with row sums given by the parts of λ and column sums given by the parts of μ.

Example 5.1.
When n = 5, λ = (3, 2), μ = (2, 2, 1), there are five possible tables:

2 1 0    2 0 1    1 2 0    1 1 1    0 2 1
0 1 1    0 2 0    1 0 1    1 1 0    2 0 0

 24       12       24       48       12

The number under each table is the size of the corresponding double coset.

The mapping from S_n to tables is easy to describe: Fix σ ∈ S_n. Inspect the first λ_1 positions of σ. Let T_11 be the number of elements from {1, 2, . . . , μ_1} occurring in these positions, T_12 the number of elements from {μ_1 + 1, . . . , μ_1 + μ_2}, . . . , and T_1J the number of elements from {n − μ_J + 1, . . . , n}. In general, T_ij is the number of elements from {μ_1 + . . . + μ_{j−1} + 1, . . . , μ_1 + . . . + μ_j} which occur in the positions λ_1 + . . . + λ_{i−1} + 1 up to λ_1 + . . . + λ_i.

Example 5.2.
When n = 5, λ = (3, 2), μ = (2, 2, 1), the permutation σ = 31542 is mapped to the table

1 1 1
1 1 0 .

The mapping σ → T(σ) is S_λ × S_μ bi-invariant and gives a coding of the double cosets. See [64] for further details and a proof of this correspondence. Jones [66] gives a different coding.

Any double coset has a unique minimal length representative. This is easy to identify: Given T, build σ sequentially, left to right, by putting down 1, 2, . . . , T_11, then μ_1 + 1, μ_1 + 2, . . . , μ_1 + T_12, and so on, each time putting down the smallest available numbers in the μ_j block, in order. Thus, in Example 5.2 the shortest double coset representative is 13524. For more details, see [13].

The measure induced on contingency tables by the uniform distribution on S_n is

P_{λ,μ}(T) = (1/n!) Π_i λ_i! Π_j μ_j! / Π_{i,j} T_ij!.   (23)

This is the Fisher-Yates distribution on contingency tables, a mainstay of applied statistical work in chi-squared tests of independence. The distribution can be described by a sampling-without-replacement problem: Suppose that an urn contains n total balls of I different colors, r_i of color i. To empty the urn, make J sets of draws of unequal sizes: first draw c_1 balls, next c_2, and so on until there are c_J = n − Σ_{j=1}^{J−1} c_j balls left. Create a contingency table by setting T_ij to be the number of balls of color i in the jth draw.

This perspective, along with the previously defined mapping from permutations to cosets, proves that the distribution on contingency tables induced by the uniform distribution on S_n is indeed Fisher-Yates: Suppose a permutation σ ∈ S_n represents a deck of cards labeled 1, . . . , n. Given partitions λ, μ, color cards 1, . . . , μ_1 with color 1, cards μ_1 + 1, . . . , μ_1 + μ_2 with color 2, and so on. From a randomly shuffled deck, draw the first λ_1 cards and count the number of each color, then draw the next λ_2, and so on.

More statistical background and available distribution theory are given in the following section. These results give some answers to the question: Pick σ ∈ S_n uniformly.
What S_λ\S_n/S_μ double coset is it likely to be in?   (24)

From (23),

|S_λ σ S_μ| = Π_i λ_i! Π_j μ_j! / Π_{i,j} T_ij!,  for T = T(σ).   (25)

However, enumerating the number of double cosets is a hard problem. When λ = μ = (k, n − k), the double cosets give a Gelfand pair with spherical functions the Hahn polynomials. The associated random walk is the Bernoulli-Laplace urn, which is perhaps the first Markov chain! (See [33].) More general partitions give interesting urn models but do not seem to admit orthogonal polynomial eigenvectors.

One final note: there has been a lot of study of the uniform distribution on the space of tables with fixed row and column sums. This was introduced with statistical motivation in [29]. The central problem has been efficient generation of such tables; enumerative theory is also natural but remains to be developed. See [30], [22], [35], [36], [8] and their references. The Fisher-Yates distribution (23) is quite different from the uniform and is central both to the statistical applications and to the main pursuits of the present paper.

Section 5.1 develops statistical background and uses this to understand the size of various double cosets. Section 5.2 proves a new limit theorem for the number of zeros in T(σ). The final section discusses natural open problems.

Contingency tables arise whenever a population of size n is cross-classified by two discrete categories. For example, Table 1 shows 592 subjects classified by 4 levels of eye color and 4 levels of hair color.

        Black  Brunette  Red  Blond | Total
Brown     68     119     26     7   |  220
Blue      20      84     17    94   |  215
Hazel     15      54     14    10   |   93
Green      5      29     14    16   |   64
Total    108     286     71   127   |  592

Table 1: 592 subjects classified by eye color (rows) and hair color (columns), with row sums r_1, r_2, r_3, r_4 = 220, 215, 93, 64 and column sums c_1, c_2, c_3, c_4 = 108, 286, 71, 127. There are approximately 1.2 × 10^15 tables with these row and column sums.

A classic task is the chi-squared test for independence. This is based on the chi-squared statistic

χ²(T) = Σ_{i,j} (T_ij − λ_i μ_j / n)² / (λ_i μ_j / n).   (26)

This measures how close the table is to a natural product measure on tables. In the example of Table 1, χ² = 138.29.

The classical probability model is a sample of size n, with each individual independently assigned to one of the I × J cells with probability p_ij (p_ij ≥ 0, Σ_{i,j} p_ij = 1). The independence model postulates p_ij = α_i · β_j for α_i, β_j ≥ 0 with Σ_i α_i = Σ_j β_j = 1. A basic theorem in the subject [70] says that if n is large and all α_i, β_j > 0, the χ² statistic has a limiting distribution f_k(x), i.e.

P(χ² ≤ x) → ∫_0^x f_k(t) dt,

where f_k(x) is the chi-squared density with k = (I − 1)(J − 1) degrees of freedom:

f_k(x) = x^{k/2 − 1} e^{−x/2} / (2^{k/2} Γ(k/2)),  x ≥ 0.   (27)

The density f_k has mean k and variance 2k, and it is customary to compare the observed χ² statistic with the limits k ± 2√(2k) and reject the null hypothesis if the statistic falls outside this interval. In the example, k = 9 and the hypothesis of independence is rejected.

The above simple rendition omits many points which are carefully developed in [72], [1], [2].

The great statistician R. A. Fisher suggested a different calibration: fix the row sums, fix the column sums, and look at the conditional distribution of the table given the row and column sums (under the independence model). It is an elementary calculation to show that P(T | λ, μ) is the Fisher-Yates distribution (23). Notice that the Fisher-Yates distribution does not depend on the 'nuisance parameters' α_i, β_j. This is called Fisher's exact test. There is a different line of development leading to the same distribution: the conditional testing approach (also due to Fisher). David Freedman and David Lane [45], [46] give details, philosophy, and history. We only add that conditional testing is a rich, difficult subject (starting with the question: what to condition on?). For discussion and extensive pointers to the literature, see [74] (Chapter 2) and [34] (Section 4).

All of this said, mathematical statisticians have long considered the distribution of tables with given row and column sums under the Fisher-Yates distribution. The following central limit theorem determines the joint limiting distribution of the table entries T_ij under the Fisher-Yates distribution: they are approximately multivariate normal. As a corollary, the χ² statistic has the appropriate chi-squared distribution. This can be translated into estimates of the size of various double cosets, as discussed after the statement.

In the following, fix I and J. Let λ_n = (λ_{n1}, . . . , λ_{nI}), μ_n = (μ_{n1}, . . . , μ_{nJ}) be two sequences of partitions of n.
Suppose there are constants α_i, β_j with 0 < α_i, β_j < 1 such that

lim_{n→∞} λ_{ni}/n = α_i,  lim_{n→∞} μ_{nj}/n = β_j  for 1 ≤ i ≤ I, 1 ≤ j ≤ J.   (28)

Let T be drawn from the Fisher-Yates distribution (23) and let

Z_{nij} = √n (T_ij/n − λ_{ni} μ_{nj}/n²).

Theorem 5.3. With notation as above, assuming (28), the random vector

Z_n = (Z_{n11}, Z_{n12}, . . . , Z_{n1J}, . . . , Z_{nI1}, . . . , Z_{nIJ})

converges in distribution to a normal distribution with mean zero and covariance matrix

Σ = (Diag(α) − α α^T) ⊗ (Diag(β) − β β^T)

for α = (α_1, . . . , α_I), β = (β_1, . . . , β_J).

Note that since the final entry in each row (or column) is determined by the other entries, the IJ × IJ covariance matrix is singular, with rank (I − 1)(J − 1).

Corollary 5.4. Under the conditions of Theorem 5.3, the chi-squared statistic (26) has a limiting chi-squared distribution (27) with k = (I − 1)(J − 1) degrees of freedom.

A very clear proof of Theorem 5.3 and the corollary is given by Kang and Klotz [70]. They review the history, as well as survey several approaches to the proof. Their argument is a classical, skillful use of Stirling's formula, and their paper is a model of exposition.

The usual way of using these results, for a single entry T_ij of the table, gives

P((T_ij − ν_ij)/(√n σ_ij) ≤ x) ∼ (1/√(2π)) ∫_{−∞}^x e^{−t²/2} dt,

with ν_ij = λ_i μ_j / n and σ²_ij = (λ_i μ_j / n²)(1 − λ_i/n)(1 − μ_j/n). Thus any single entry of the table has a limiting normal approximation. This can also be seen through the normal approximation to the hypergeometric distribution, which is available with a Berry-Esseen error; see [58].

The limiting χ² approximation shows that, under the Fisher-Yates distribution, most tables are concentrated around the 'independence table'

T*_ij = λ_i μ_j / n.

This T* is rank one. While it does not have integer entries, it gives a good picture of the approximate size of a typical double coset.

To be quantitative, let us define a distance between tables T, T′ with the same row and column sums:

‖T − T′‖ = Σ_{i,j} |T_ij − T′_ij|.

This is the L¹ distance, familiar as total variation from probability. Since Σ_{i,j} T_ij = n, for many pairs of tables T, T′, ‖T − T′‖ is of order n. The Cauchy-Schwarz inequality shows

‖T − T*‖ ≤ √(n · χ²(T)).   (29)

Corollary 5.4 shows that, under the Fisher-Yates distribution, χ²(T) is typically (I − 1)(J − 1) ± √(2(I − 1)(J − 1)), so ‖T − T*‖ is of order √n ≪ n.
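The chi-squared calibration is easy to exercise numerically. The sketch below (plain Python; the function names are ours, not the paper's) computes χ² for Table 1, and then samples from the Fisher-Yates distribution using the shuffled-deck construction described after (23), checking that the statistic concentrates near k = (I − 1)(J − 1) as Corollary 5.4 predicts.

```python
import math
import random

def chi_squared(table):
    """Chi-squared statistic (26): distance of a table from independence."""
    r = [sum(row) for row in table]
    c = [sum(col) for col in zip(*table)]
    n = sum(r)
    return sum((table[i][j] - r[i] * c[j] / n) ** 2 / (r[i] * c[j] / n)
               for i in range(len(r)) for j in range(len(c)))

# Table 1: eye color (rows) by hair color (columns), 592 subjects.
table1 = [[68, 119, 26, 7],
          [20, 84, 17, 94],
          [15, 54, 14, 10],
          [5, 29, 14, 16]]
k = (4 - 1) * (4 - 1)                       # degrees of freedom
print(round(chi_squared(table1), 2))        # approximately 138.29
print(round(k + 2 * math.sqrt(2 * k), 2))   # approximately 17.49; independence rejected

def fisher_yates_sample(lam, mu, rng):
    """Draw a table from the Fisher-Yates distribution (23): shuffle a deck of
    n cards colored according to mu and deal hands of sizes lam as the rows."""
    deck = [j for j, m in enumerate(mu) for _ in range(m)]
    rng.shuffle(deck)
    table, start = [], 0
    for size in lam:
        hand = deck[start:start + size]
        table.append([hand.count(j) for j in range(len(mu))])
        start += size
    return table

rng = random.Random(0)
lam = mu = [50, 50]   # I = J = 2 and n = 100, so k = 1 degree of freedom
samples = [chi_squared(fisher_yates_sample(lam, mu, rng)) for _ in range(2000)]
print(round(sum(samples) / len(samples), 2))  # close to 1, as Corollary 5.4 predicts
```

Note that dealing a shuffled deck automatically fixes both margins: each hand has size λ_i and each color appears μ_j times in the whole deck.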
A different way to say this is to divide the tables T and T* by n to get probability distributions T̄, T̄* on IJ points. Then, for most T,

‖T̄ − T̄*‖ = O_P(1/√n).

Barvinok [8] studies the question in the paragraph above under the uniform distribution on tables. In this setting, he shows that most tables are close (in a somewhat strange distance) to quite a different table T**.

Theorem 5.3 also gives an asymptotic approximation to the size of the double coset corresponding to the table T. Call this S_λ T S_μ. It is easy to see that

|S_λ T S_μ| = n! · P(T | λ, μ) ∼ n! · φ(T)/n^{(I−1)(J−1)/2},

with

φ(T) = e^{−Z₋ Σ₋^{−1} Z₋^T / 2} / ((2π)^{(I−1)(J−1)/2} det(Σ₋)^{1/2}),

for Z₋ the vector corresponding to the upper left (I − 1) × (J − 1) sub-matrix of T (with notation as in Theorem 5.3) and Σ₋ the associated (I − 1)(J − 1) × (I − 1)(J − 1) covariance matrix. This uses the local limit version of Theorem 5.3, which follows from the argument of Kang and Klotz [70]. See [21] for further details.

The asymptotics above show that the large double cosets are the ones closest to the independence table. This may be supplemented by the following non-asymptotic development.

Let T and T′ be tables with the same row and column sums. Say that T ≺ T′ ('T′ majorizes T') if the largest element of T′ is at least the largest element of T, the sum of the two largest elements of T′ is at least the sum of the two largest elements of T, and so on. Of course, the sum of all elements of T′ equals the sum of all elements of T.

Example 5.5.
For tables with n = 8, λ_1 = λ_2 = μ_1 = μ_2 = 4, there is the following ordering:

2 2    3 1    4 0
2 2 ≺  1 3 ≺  0 4 .

Majorization is a standard partial order on vectors [77], and Harry Joe [65] has shown it is useful for contingency tables.

Proposition 5.6.
Let T and T′ be tables with the same row and column sums and P the Fisher-Yates distribution. If T ≺ T′, then P(T) > P(T′).

Proof.
From the definition (23), we have log(P(T)) = C − Σ_{i,j} log(T_ij!) for a constant C. This form makes it clear that the right-hand side is a symmetric function of the IJ numbers {T_ij}. The log convexity of the Gamma function shows that it is concave. A symmetric concave function is Schur concave: that is, order-reversing for the majorization order [77].

Remark.
Joe [65] shows that, among the real-valued tables with given row and column sums, the independence table T* is the unique smallest table in the majorization order. He further shows that if an integer-valued table T is, entry-wise, within 1 of the real independence table, then T is the unique smallest table with integer entries. In this case, the corresponding double coset has the largest probability P(T).

Example 5.7.
Fix a positive integer a and consider an I × J table T with all entries equal to a. This has constant row sums J · a and column sums I · a. It is the unique smallest table with these row and column sums, and so corresponds to the largest double coset. For a = 2, I = 2, J = 3, this table is

T = 2 2 2
    2 2 2 .

Contingency tables with fixed row and column sums form a graph with edges between tables that can be obtained by one move of the following form: pick two rows i, i′ and two columns j, j′; add +1 to the (i, j) entry, −1 to the (i′, j) entry, +1 to the (i′, j′) entry, and −1 to the (i, j′) entry. This graph is connected, and a move goes up or down in the majorization order as the 2 × 2 sub-table in rows i, i′ and columns j, j′ moves up or down. See Example 5.5 above.

In this section we will use r_1, . . . , r_I for the row sums of a table and c_1, . . . , c_J for the column sums. One natural feature of a contingency table is its zero entries. As shown in Section 5.1, most tables will be close to the table T* with entries T*_ij = r_i c_j / n, which has no zero entries. Therefore, zeros are a pointer to the breakdown of the independence model. In statistical applications there is also the issue of 'structural zeros' – categories, such as 'pregnant males', which would give zero entries in cross-classified data due to their impossibility. See [14] for discussion. The bottom line is that professional statisticians are always on the look-out for zeros in contingency tables. This section gives a limit theorem for the number of zeros under natural hypotheses.

A simple observation which leads to the theorem is that a Fisher-Yates table is equivalent to rows of independent multinomial vectors conditioned on the column sums: let X_1, . . . , X_I be independent random vectors of length J, with X_i ∼ Multinomial(r_i, {q_j}_{j=1}^J) for some probabilities q_j > 0, Σ_j q_j = 1. That is, X_i gives the occupancy counts generated by assigning r_i balls to J boxes, with each ball going to the jth box with probability q_j.
The joint distribution of the vectors is then

P(X = (x_ij)) = Π_{i=1}^I (r_i! / (x_{i1}! · · · x_{iJ}!)) q_1^{x_{i1}} · · · q_J^{x_{iJ}}.   (30)

Let Y_1, . . . , Y_I be distributed as X_1, . . . , X_I conditioned on the sums Σ_{i=1}^I X_ij = c_j. From (30) it is clear that Y_1, . . . , Y_I has the Fisher-Yates distribution (23), regardless of the choice of the q_j.

This perspective allows us to use known limit results for multinomial distributions, translated to contingency tables using conditioned limit theory. For the remainder, assume that the row sums r_i = r are constant, so that the X_i are i.i.d. vectors. Let f(X_i) = Σ_{j=1}^J 1{X_ij = 0} count the number of zero entries in the vector. [59] contains limit theorems for f(X_i) as r → ∞, with either Poisson or normal limit behavior depending on the asymptotics of r, J, and the q_j.

Example 5.8.
Consider an I × J table with constant column sums c = I(log(I · J) + θ). The row sums are then determined: r = n/I. If the table were created from the counts of dropping n balls into I · J boxes, with each box equally likely, then the expected number of zero entries would be

λ* = IJ · (1 − 1/(I · J))^n = IJ · (1 − c/(nI))^n ∼ IJ e^{−c/I} ∼ e^{−θ}.

If n, I, J → ∞ then λ* → e^{−θ}, and the following theorem shows that the number of zeros has a limiting Poisson(λ*) distribution under these assumptions. Indeed, it shows this for varying column sums.

Theorem 5.9.
Suppose that n → ∞ and fix sequences I_n, J_n, c_{nj} such that

I_n · Σ_{j=1}^{J_n} (1 − c_{nj}/n)^{n/I_n} → β.

Let Z_n be the number of zeros in a Fisher-Yates contingency table of size I_n × J_n with constant row sums r_n = n/I_n and column sums c_{n1}, . . . , c_{nJ_n}. Then L(Z_n) → Poisson(β).

Proof.
Let X_{ni} ∼ Multinomial(r_n, {q_{nj}}_{j=1}^{J_n}), with the probabilities q_{nj} = c_{nj}/n chosen so that E[Σ_{i=1}^{I_n} X_{nij}] = I_n · r_n · q_{nj} = c_{nj}. The conditioned limit theorem (Corollary 3.5 in [60]) says that if

L(Σ_{i=1}^{I_n} f(X_{ni})) → L(U),

where U has no normal component, then

L(Σ_{i=1}^{I_n} f(Y_{ni})) = L(Σ_{i=1}^{I_n} f(X_{ni}) | Σ_{i=1}^{I_n} X_{nij} = c_{nj}, 1 ≤ j ≤ J_n) → L(U).

If X is a multinomial generated by dropping r balls into J boxes with probabilities q_j, and if Σ_{j=1}^J (1 − q_j)^r → α, then the number of empty boxes is asymptotically Poisson(α) (e.g. Theorem 6D in [7]). Thus the condition

I_n Σ_{j=1}^{J_n} (1 − c_{nj}/(r_n I_n))^{r_n} → β

means that each f(X_{ni}) is asymptotically Poisson(β/I_n), and so Σ_{i=1}^{I_n} f(X_{ni}) is asymptotically Poisson(β). □

Preliminary computations indicate that Theorem 5.9 will hold with row sums that do not vary too much. Figure 1 shows the results of simulations of the number of zeros in a 50 × 20 table with row and column sums fixed.
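A simulation in the spirit of Figure 1 can be sketched as follows (plain Python; `fisher_yates_sample` is our name for the shuffled-deck construction of Section 5.1, and only 200 samples are drawn here, versus 50,000 in the figure).

```python
import random

def fisher_yates_sample(lam, mu, rng):
    """Shuffle n cards colored by the parts of mu; deal rows of sizes lam."""
    deck = [j for j, m in enumerate(mu) for _ in range(m)]
    rng.shuffle(deck)
    table, start = [], 0
    for size in lam:
        hand = deck[start:start + size]
        table.append([hand.count(j) for j in range(len(mu))])
        start += size
    return table

I, J, r, c = 50, 20, 110, 275            # n = I*r = J*c = 5500, as in Figure 1
n = I * r
beta = I * J * (1 - c / n) ** (n // I)   # the Poisson parameter of Theorem 5.9
rng = random.Random(1)
zero_counts = [sum(row.count(0) for row in fisher_yates_sample([r] * I, [c] * J, rng))
               for _ in range(200)]
print(round(beta, 2))                          # approximately 3.5
print(sum(zero_counts) / len(zero_counts))     # sample mean of the zero count, near beta
```

With these margins β = 1000 · 0.95^110 ≈ 3.5, and the empirical mean and histogram of the zero counts track the Poisson(β) prediction.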
Figure 1: Results for the number of zeros from 50,000 samples of a contingency table with I = 50, J = 20, c = 275, r = 110. The blue curve is the frequency polygon of a Poisson distribution with λ ≈ 3.5.

It is natural to ask further questions about the distribution of natural features of the tables representing double cosets. Several stand out:

1. The positions of the zeros under the hypotheses of Theorem 5.9.

2. The size and distribution of the maximum entry in the table.

3. The RSK shape: Knuth's extension of the Robinson-Schensted correspondence assigns to a table T a pair (P, Q) of semi-standard Young tableaux of the same shape. We have not seen these statistics used in statistical work. So much is known about RSK asymptotics that this may fall out easily.

4. Going back to Section 1: one nice development in probabilistic group theory on the symmetric group has been to look at the distribution of natural statistics within a fixed conjugacy class [49], [71]. In parallel, one could fix a double coset and look at the distribution of standard statistics.

5. Going further, this section has focused on enumerative probabilistic theorems for parabolic subgroups of the symmetric group. The questions make sense for parabolic subgroups of any finite Coxeter group. An enormous amount of combinatorial description is available (how does one describe double cosets?). This is wonderfully summarized in the very accessible paper [13]. In any Coxeter group, each double coset contains a unique minimal length representative, and these minimal length representatives can be used as identifiers for the double cosets; see [57] for more on this. The focus of [13] is understanding W_S · ω · W_T with ω fixed as S and T vary over subsets of the generating reflections.

Acknowledgements.
We thank Jason Fulman, Bob Guralnick, Marty Isaacs, Slim Kammoun, Sumit Mukherjee, Arun Ram, Mehrdad Shahshahani, Richard Stanley, and Nat Thiem for their help with this paper. M.S. is supported by a National Defense Science & Engineering Graduate Fellowship. Research supported in part by National Science Foundation grant DMS-1954042.
References

[1] Agresti, A., 1992. A survey of exact inference for contingency tables. Statist. Sci. 7, 131–177. With comments and a rejoinder by the author.

[2] Agresti, A., 2013. Categorical Data Analysis. Third ed., Wiley Series in Probability and Statistics, Wiley-Interscience, Hoboken, NJ.

[3] Aguiar, M., André, C., Benedetti, C., Bergeron, N., Chen, Z., Diaconis, P., Hendrickson, A., Hsiao, S., Isaacs, I.M., Jedwab, A., Johnson, K., Karaali, G., Lauve, A., Le, T., Lewis, S., Li, H., Magaard, K., Marberg, E., Novelli, J.C., Pang, A., Saliola, F., Tevlin, L., Thibon, J.Y., Thiem, N., Venkateswaran, V., Vinroot, C.R., Yan, N., Zabrocki, M., 2012. Supercharacters, symmetric functions in noncommuting variables, and related Hopf algebras. Adv. Math. 229, 2310–2337.

[4] Anderson, G.W., Guionnet, A., Zeitouni, O., 2010. An Introduction to Random Matrices. Volume 118 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.

[5] Arratia, R., Barbour, A.D., Tavaré, S., 2003. Logarithmic Combinatorial Structures: A Probabilistic Approach. EMS Monographs in Mathematics, European Mathematical Society (EMS), Zürich.

[6] Bannai, E., Kawanaka, N., Song, S.Y., 1990. The character table of the Hecke algebra H(GL_2n(F_q), Sp_2n(F_q)). J. Algebra 129, 320–366.

[7] Barbour, A.D., Holst, L., Janson, S., 1992. Poisson Approximation. Volume 2 of Oxford Studies in Probability, The Clarendon Press, Oxford University Press, New York.

[8] Barvinok, A., 2010. What does a random contingency table look like? Combin. Probab. Comput. 19, 517–539.

[9] Basu, R., Bhatnagar, N., et al., 2017. Limit theorems for longest monotone subsequences in random Mallows permutations, in: Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, pp. 1934–1951.

[10] Baumeister, B., 1997. Factorizations of primitive permutation groups. J. Algebra 194, 631–653.

[11] Bhatnagar, N., Peled, R., 2015. Lengths of monotone subsequences in a Mallows permutation. Probab. Theory Related Fields 161, 719–780.

[12] Bhattacharya, B.B., Mukherjee, S., 2017. Degree sequence of random permutation graphs. Ann. Appl. Probab. 27, 439–484.

[13] Billey, S.C., Konvalinka, M., Petersen, T.K., Slofstra, W., Tenner, B.E., 2018. Parabolic double cosets in Coxeter groups. Electron. J. Combin. 25, Paper No. 1.23, 66 pp.

[14] Bishop, Y.M.M., Fienberg, S.E., Holland, P.W., 2007. Discrete Multivariate Analysis: Theory and Practice. Springer, New York. With the collaboration of Richard J. Light and Frederick Mosteller; reprint of the 1975 original.

[15] Blackburn, S.R., Neumann, P.M., Venkataraman, G., 2007. Enumeration of Finite Groups. Volume 173 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge.

[16] Borga, J., 2020. Local convergence for permutations and local limits for uniform ρ-avoiding permutations with |ρ| = 3. Probab. Theory Related Fields 176, 449–531.

[17] Borodin, A., 2011. Determinantal point processes, in: The Oxford Handbook of Random Matrix Theory. Oxford University Press, Oxford, pp. 231–249.

[18] Borodin, A., Diaconis, P., Fulman, J., 2010. On adding a list of numbers (and other one-dependent determinantal processes). Bull. Amer. Math. Soc. (N.S.) 47, 639–670.

[19] Borodin, A., Olshanski, G., 2017. Representations of the Infinite Symmetric Group. Volume 160 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.

[20] Ceccherini-Silberstein, T., Scarabotti, F., Tolli, F., 2008. Harmonic Analysis on Finite Groups: Representation Theory, Gelfand Pairs and Markov Chains. Volume 108 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.

[21] Chaganty, N.R., Sethuraman, J., 1985. Large deviation local limit theorems for arbitrary sequences of random variables. Ann. Probab., 97–114.

[22] Chen, Y., Diaconis, P., Holmes, S.P., Liu, J.S., 2005. Sequential Monte Carlo methods for statistical analysis of tables. J. Amer. Statist. Assoc. 100, 109–120.

[23] Chern, B., Diaconis, P., Kane, D.M., Rhoades, R.C., 2014. Closed expressions for averages of set partition statistics. Res. Math. Sci. 1, Art. 2, 32 pp.

[24] Chern, B., Diaconis, P., Kane, D.M., Rhoades, R.C., 2015. Central limit theorems for some set partition statistics. Adv. in Appl. Math. 70, 92–105.

[25] Crane, H., 2016. The ubiquitous Ewens sampling formula. Statist. Sci. 31, 1–19.

[26] Curtis, C.W., Reiner, I., 2006. Representation Theory of Finite Groups and Associative Algebras. AMS Chelsea Publishing, Providence, RI. Reprint of the 1962 original.

[27] Diaconis, P., 1988. Group Representations in Probability and Statistics. Volume 11 of Institute of Mathematical Statistics Lecture Notes—Monograph Series, Institute of Mathematical Statistics, Hayward, CA.

[28] Diaconis, P., 2003. Patterns in eigenvalues: the 70th Josiah Willard Gibbs lecture. Bull. Amer. Math. Soc. (N.S.) 40, 155–178.

[29] Diaconis, P., Efron, B., 1985. Testing for independence in a two-way table: new interpretations of the chi-square statistic. Ann. Statist. 13, 845–913. With discussion and a reply by the authors.

[30] Diaconis, P., Gangolli, A., 1995. Rectangular arrays with fixed margins, in: Discrete Probability and Algorithms (Minneapolis, MN, 1993). Volume 72 of IMA Vol. Math. Appl., Springer, New York, pp. 15–41.

[31] Diaconis, P., Graham, R.L., Kantor, W.M., 1983. The mathematics of perfect shuffles. Adv. in Appl. Math. 4, 175–196.

[32] Diaconis, P., Ram, A., 2000. Analysis of systematic scan Metropolis algorithms using Iwahori-Hecke algebra techniques. Michigan Math. J. 48, 157–190. Dedicated to William Fulton on the occasion of his 60th birthday.

[33] Diaconis, P., Shahshahani, M., 1987. Time to reach stationarity in the Bernoulli-Laplace diffusion model. SIAM J. Math. Anal. 18, 208–218.

[34] Diaconis, P., Sturmfels, B., 1998. Algebraic algorithms for sampling from conditional distributions. Ann. Statist. 26, 363–397.

[35] Dittmer, S., 2019. Counting Linear Extensions and Contingency Tables. Ph.D. thesis, UCLA.

[36] Dittmer, S., Lyu, H., Pak, I., 2020. Phase transition in random contingency tables with non-uniform margins. Trans. Amer. Math. Soc. 373, 8313–8338.

[37] Dixon, J.D., 2002. Probabilistic group theory. Mathematical Reports of the Academy of Sciences 24, 1–15.

[38] Dummit, D.S., Foote, R.M., 2004. Abstract Algebra. Third ed., John Wiley & Sons, Hoboken, NJ.

[39] Erdős, P., Turán, P., 1965. On some problems of a statistical group-theory. I. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 175–186.

[40] Erdős, P., Turán, P., 1967a. On some problems of a statistical group-theory. II. Acta Math. Acad. Sci. Hungar. 18, 151–163.

[41] Erdős, P., Turán, P., 1967b. On some problems of a statistical group-theory. III. Acta Math. Acad. Sci. Hungar. 18, 309–320.

[42] Erdős, P., Turán, P., 1968. On some problems of a statistical group-theory. IV. Acta Math. Acad. Sci. Hungar. 19, 413–435.

[43] Féray, V., 2013. Asymptotic behavior of some statistics in Ewens random permutations. Electron. J. Probab. 18, no. 76, 32 pp.

[44] Forrester, P.J., 2010. Log-Gases and Random Matrices. Volume 34 of London Mathematical Society Monographs Series, Princeton University Press, Princeton, NJ.

[45] Freedman, D., Lane, D., 1983a. A nonstochastic interpretation of reported significance levels. J. Bus. Econom. Statist. 1, 292–298.

[46] Freedman, D.A., Lane, D., 1983b. Significance testing in a nonstochastic setting, in: A Festschrift for Erich L. Lehmann, pp. 185–208.

[47] Fulman, J., 2002. Random matrix theory over finite fields. Bull. Amer. Math. Soc. (N.S.) 39, 51–85.

[48] Fulman, J., 2016. A generating function approach to counting theorems for square-free polynomials and maximal tori. Ann. Comb. 20, 587–599.

[49] Fulman, J., Kim, G.B., Lee, S., 2019. Central limit theorem for peaks of a random permutation in a fixed conjugacy class of S_n. arXiv:1902.00978.

[50] Gabriel, P., Roiter, A.V., 1992. Representations of finite-dimensional algebras, in: Algebra, VIII. Volume 73 of Encyclopaedia Math. Sci., Springer, Berlin, pp. 1–177. With a chapter by B. Keller.

[51] Gladkich, A., Peled, R., 2018. On the cycle structure of Mallows permutations. Ann. Probab. 46, 1114–1169.

[52] Guralnick, R.M., 2020. On the singular value decomposition over finite fields and orbits of GU × GU. arXiv:1805.06999.

[53] Halverson, T., Ram, A., 2001. q-rook monoid algebras, Hecke algebras, and Schur-Weyl duality. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 283, 224–250, 262–263.

[54] He, J., 2019. A characteristic map for the symmetric space of symplectic forms over a finite field. arXiv:1906.05966.

[55] He, J., 2020a. A central limit theorem for descents of a Mallows permutation and its inverse. arXiv:2005.09802.

[56] He, J., 2020b. Random walk on the symplectic forms over a finite field. Algebraic Combinatorics 3, 1165–1181.

[57] He, X., 2007. Minimal length elements in some double cosets of Coxeter groups. Adv. Math. 215, 469–503.

[58] Höglund, T., 1978. Sampling from a finite population. A remainder term estimate. Scand. J. Statist., 69–71.

[59] Holst, L., 1979. A unified approach to limit theorems for urn models. J. Appl. Probab. 16, 154–162.

[60] Holst, L., 1981. Some conditional limit theorems in exponential families. Ann. Probab. 9, 818–830.

[61] Hoppen, C., Kohayakawa, Y., Moreira, C.G., Ráth, B., Menezes Sampaio, R., 2013. Limits of permutation sequences. J. Combin. Theory Ser. B 103, 93–113.

[62] Howe, R., 1992. A century of Lie theory, in: American Mathematical Society Centennial Publications, Vol. II (Providence, RI, 1988). Amer. Math. Soc., Providence, RI, pp. 101–320.

[63] Inglis, N.F.J., Liebeck, M.W., Saxl, J., 1986. Multiplicity-free permutation representations of finite linear groups. Math. Z. 192, 329–337.

[64] James, G., Kerber, A., 2009. The Representation Theory of the Symmetric Group. Cambridge University Press, Cambridge.

[65] Joe, H., 1985. An ordering of dependence for contingency tables. Linear Algebra Appl. 70, 89–103.

[66] Jones, A.R., 1996. A combinatorial approach to the double cosets of the symmetric group with respect to Young subgroups. European J. Combin. 17, 647–655.

[67] Kammoun, M.S., 2020a. On the longest common subsequence of conjugation invariant random permutations. Electron. J. Combin., Paper 4.10.

[68] Kammoun, M.S., 2020b. Universality for random permutations and some other groups. arXiv:2012.05845.

[69] Kammoun, M.S., et al., 2018. Monotonous subsequences and the descent process of invariant random permutations. Electron. J. Probab. 23.

[70] Kang, S.H., Klotz, J., 1998. Limiting conditional distribution for tests of independence in the two way table. Comm. Statist. Theory Methods 27, 2075–2082.

[71] Kim, G.B., Lee, S., 2020. Central limit theorem for descents in conjugacy classes of S_n. J. Combin. Theory Ser. A 169, 105123.

[72] Lancaster, H.O., 1969. The Chi-Squared Distribution. John Wiley & Sons, New York-London-Sydney.

[73] Laue, R., 1982. Computing double coset representatives for the generation of solvable groups, in: Computer Algebra (Marseille, 1982). Volume 144 of Lecture Notes in Comput. Sci., Springer, Berlin-New York, pp. 65–70.

[74] Lehmann, E.L., Romano, J.P., 2006. Testing Statistical Hypotheses. Springer Science & Business Media.

[75] Letac, G., 1981. Problèmes classiques de probabilité sur un couple de Gelfand, in: Analytical Methods in Probability Theory (Oberwolfach, 1980). Springer, Berlin-New York, volume 861 of
Lecture Notes in Math. , pp. 93–120.[76] Macdonald, I.G., 1998. Symmetric functions and Hall polynomials. Oxforduniversity press.[77] Marshall, A.W., Olkin, I., Arnold, B.C., 1979. Inequalities: theory ofmajorization and its applications. volume 143. Springer.[78] Mueller, C., Starr, S., 2013. The length of thelongest increasing subsequence of a random Mallows per-mutation. J. Theoret. Probab. 26, 514–540. URL:45 ttps://doi-org.stanford.idm.oclc.org/10.1007/s10959-011-0364-5 ,doi: .[79] Mukherjee, S., 2016a. Estimation in exponential fami-lies on permutations. Ann. Statist. 44, 853–875. URL: https://doi-org.stanford.idm.oclc.org/10.1214/15-AOS1389 ,doi: .[80] Mukherjee, S., 2016b. Fixed points and cycle structure of ran-dom permutations. Electron. J. Probab. 21, 1—-18. URL: https://doi-org.stanford.idm.oclc.org/10.1214/16-EJP4622 ,doi: .[81] Rodrigues, O., 1839. Note sur les inversions, ou d´erangements produitsdans les permutations. J. de Math 4, 236–240.[82] Saxl, J., 1981. On multiplicity-free permutation representations, in: Finitegeometries and designs (Proc. Conf., Chelwood Gate, 1980), CambridgeUniv. Press, Cambridge-New York. pp. 337–353.[83] Shalev, A., 1999. Probabilistic group theory. London Mathematical SocietyLecture Note Series , 648–678.[84] Shepp, L.A., Lloyd, S.P., 1966. Ordered cycle lengths in a ran-dom permutation. Trans. Amer. Math. Soc. 121, 340–357. URL: https://doi-org.stanford.idm.oclc.org/10.2307/1994483 ,doi: .[85] Slattery, M.C., 2001. Computing double cosets insoluble groups, volume 31, pp. 179–192. URL: https://doi-org.stanford.idm.oclc.org/10.1006/jsco.1999.1005 ,doi: . computational algebra and number theory(Milwaukee, WI, 1996).[86] Solomon, L., 1990. The Bruhat decomposition, Tits system and Iwahoriring for the monoid of matrices over a finite field. Geom. Dedicata 36, 15–49.46RL: https://doi-org.stanford.idm.oclc.org/10.1007/BF00181463 ,doi: .[87] Stanley, R.P., 1986. 
What is enumerative combinatorics?, in: Enumerativecombinatorics. Springer, pp. 1–63.[88] Starr, S., 2009. Thermodynamic limit for the Mallowsmodel on S n . J. Math. Phys. 50, 095208, 15. URL: https://doi-org.stanford.idm.oclc.org/10.1063/1.3156746 ,doi: .[89] Starr, S., Walters, M., 2018. Phase uniqueness for the Mal-lows measure on permutations. J. Math. Phys. 59, 063301, 28.URL: https://doi-org.stanford.idm.oclc.org/10.1063/1.5017924 ,doi: .[90] Suzuki, M., 1982. Group theory. I. volume 247 of Grundlehren der Math-ematischen Wissenschaften [Fundamental Principles of Mathematical Sci-ences] . Springer-Verlag, Berlin-New York. Translated from the Japaneseby the author.[91] Watterson, G.A., 1976. The stationary distribution of the infinitely-manyneutral alleles diffusion model. J. Appl. Probability 13, 639–651. URL: https://doi-org.stanford.idm.oclc.org/10.1017/s0021900200104309 ,doi:10.1017/s0021900200104309