Incorporating Weisfeiler-Leman into algorithms for group isomorphism
Peter A. Brooksbank, Joshua A. Grochow, Yinan Li, Youming Qiao, James B. Wilson
May 8, 2019
Abstract
In this paper we combine many of the standard and more recent algebraic techniques for testing isomorphism of finite groups (GpI) with combinatorial techniques that have typically been applied to Graph Isomorphism. We show how to combine several state-of-the-art GpI algorithms for specific group classes into an algorithm for general GpI, namely: composition series isomorphism (Rosenbaum–Wagner, Theoret. Comp. Sci., 2015; Luks, 2015), recursively-refineable filters (Wilson, J. Group Theory, 2013), and low-genus GpI (Brooksbank–Maglione–Wilson, J. Algebra, 2017). Recursively-refineable filters, a generalization of subgroup series, form the skeleton of this framework, and we refine our filter by building a hypergraph encoding low-genus quotients, to which we then apply a hypergraph variant of the k-dimensional Weisfeiler–Leman technique. Our technique is flexible enough to readily incorporate additional hypergraph invariants or additional characteristic subgroups as they emerge. After introducing this general technique, we prove three main results about its complexity:

• Let the width of a filter be the dimension of the largest quotient of two adjacent subgroups of the filter; the color-ratio of our hypergraph captures how much smaller a color class is compared to the layer of the filter it is coloring. When we use genus-g quotients and hypergraph k-WL, we can solve isomorphism for solvable groups of order n in time $\left(\frac{n}{\text{color-ratio}}\right)^{\text{width}} \mathrm{poly}(n) + n^{O(gk)}$. In the "base case", where the solvable radical is itself low-genus and the semisimple part acts trivially, we can get a better guaranteed running time of $n^{O(\log\log n)}$, by combining cohomological techniques (Grochow–Qiao, CCC '14, SIAM J. Comput., 2017), code equivalence (Babai–Codenotti–Grochow–Qiao, SODA '11), and low-genus isomorphism ([BMW], ibid.).

• We introduce a new random model of finite groups. Unlike previous models, we prove that our model has good coverage, in that it produces a wide variety of groups, and in particular a number of distinct isomorphism types that is logarithmically equivalent to the number of all isomorphism types. In this random model, we show that our filter-and-1-WL refinement method results in constant average width (the above result uses max width).

• For p-groups of class 2 and exponent p, widely believed to be the hardest cases of GpI, and where we also expect the above techniques to get stuck, we improve on the average-case algorithm of Li–Qiao (FOCS '17).
Our new algorithm is simpler and applies to a larger fraction of random p-groups of class 2 and exponent p. The previous algorithm was based on a linear-algebraic analogue of the individualize-and-refine technique; our new algorithm combines that technique with concepts from isomorphism of low-genus groups. We also implement this algorithm in MAGMA and show that in experiments it improves over the default (brute force) algorithm for this problem.

∗ Department of Mathematics, Bucknell University, Lewisburg, PA 17837, United States. [email protected]
† Departments of Computer Science and Mathematics, University of Colorado Boulder, Boulder, CO 80309-0430, United States. [email protected]
‡ CWI and QuSoft, Science Park 123, 1098XG Amsterdam, Netherlands. [email protected]
§ Center for Quantum Software and Information, University of Technology Sydney, Ultimo NSW 2007, Australia. [email protected]
¶ Department of Mathematics, Colorado State University, Fort Collins, CO 80523, United States.
Introduction
The problem of deciding whether two finite groups are isomorphic (GpI) has a century-old history that straddles several fields, including topology, computational algebra, and computer science. It also has several unusual variations in complexity. For example, the dense input model, where groups are specified by their multiplication ("Cayley") tables, has quasi-polynomial time complexity and reduces to the better known Graph Isomorphism problem (GphI); cf. [L3, Section 10]. Meanwhile, GpI for a sparse input model for groups, such as by permutations, matrices, or black-box groups, reduces from GphI; cf. [HL, LV]. At present sparse GpI is in $\Sigma_2^{\mathrm{P}}$ and it is not known to lie in either NP or coNP; see [BS, Proposition 4.8, Corollary 4.9]. In fact, in the model of groups input by generators and relations, Adian and Rabin famously showed GpI is undecidable [A, R1].

Following L. Babai's breakthrough proof that GphI is in quasi-polynomial time [B1], dense (Cayley table) GpI is now an essential bottleneck to improving graph isomorphism. So while the methods we explore here can be applied in both the dense and the sparse models of GpI, we concentrate our complexity claims on the dense case. In particular, when we say polynomial time, we mean polynomial time in the group order unless specified otherwise. Our contribution here is to expose, from within the structure of groups, graph-theoretic properties which relate to the difficulty of solving GpI. We expect this to facilitate the systematic use of combinatorial isomorphism techniques within GpI that interplay with existing algebraic strategies.

We introduce a colored (hyper-)graph based on an algebraic data structure known as a recursively-refineable filter, which identifies abelian groups and vector spaces layered together to form the structure of a finite group. Filters have been useful in several isomorphism tests [W3, M3, M1, BOW]. As the name suggests, filters can be refined, and with each refinement the cost T(n) of isomorphism testing decreases to a function in O(T(n)/c), for some c > 2. The more rounds of refinement we can carry out, the lower the cost of isomorphism testing. Existing uses of filter refinement find characteristic structure algebraically; our principal innovation is to add a combinatorial perspective to refinement. We color (co)dimension-g subspaces of the layers of the filter using local isomorphism invariants. This parameter g will be referred to as the genus parameter. The layers are in turn connected to each other according to their position within the group, and this presents further opportunities for coloring. With so much nuanced local information, the graph we associate to a group is well suited to individualization and refinement techniques like the dimension-k Weisfeiler–Leman procedure [WL, B2, IL, CFI]. The critical work is to refine these graphs compatibly with the refinement of the filter (Theorem A). Thus, one maintains the relationship between the group and graph isomorphism properties as we recursively refine. While our methods do not apply to structures as general as semigroups and quasigroups, they can be adapted to other problems, such as ring isomorphism [KS].

To explore the implications of this technique we introduce a model for random finite groups. In doing so we consider pitfalls identified in previously suggested models for random finite groups. We are especially concerned with coverage: the idea that we are able to easily sample from groups within natural classes such as non-solvable, solvable, nilpotent, and abelian, and that within each subclass the number of isomorphism types is dense on a log scale (Theorem C). Log scale is for now the best granularity we know for the enumeration of groups, cf. [BNV]. We then prove (Theorem D) that in our random model, genus-1 1-WL refinement on average refines to a series of length Θ(log n), which thereby achieves the expected refinement length posed in [W3, p. 876]. Following the refinement, the average width of the filter is thus constant, though the cost of refinement increases.
(If the maximum width were constant it would result in a polynomial-time average-case isomorphism test in our model.) Finally, within our random model there are several "base cases" where the recursive refinements become less likely, or where our analysis is inadequate. We demonstrate that in two of these cases, isomorphism can be solved either in polynomial time in the average-case sense (Theorem E) or in nearly polynomial time ($n^{O(\log\log n)}$) in the worst-case sense (Theorem F). The former also solves a related problem of average complexity of tensor equivalence.

Our strategy harnesses critical features of a great variety of existing approaches to isomorphism (code equivalence, filter refinements, adjoint-tensor methods, bidirectional collision) and uses Weisfeiler–Leman refinement as the top-level strategy to combine the various implications. That diversity was not so much a plan but the result of hitting barriers and looking to the literature for solutions. The result, however, is a framework that is rather flexible and is well suited to accommodate future ideas, both algebraic and combinatorial, as featured here already. That strength of course comes at a cost: the mechanics and analysis are rather involved. We expect that in time better analysis and simplified models will improve our understanding.

Much recent progress in GpI has been made by considering special classes of groups; the recent papers [BMW1, GQ, LGR] survey and supplement these results. That has created powerful but highly varied strategies with no obvious means of synthesis. Within our refinement model of computing GpI we have the opportunity to begin merging some of the many options that have been developed to date. To help explain our approach we consider examples of groups of invertible matrices over finite fields of prime order, as graphically communicated in Figure 1.1. In fact, these examples will later evolve into the aforementioned random model for finite groups.

Figure 1.1: Diagrams of matrix groups can capture many of the well-studied examples of finite groups: (a) depicts a large variety of nilpotent groups; (b) depicts products of quasi- and almost-simple groups together with possible permutations of isomorphic blocks; and (c) depicts a wide range of general finite groups decomposed into smaller classes of groups.

First thread: connection with linear and multilinear algebras.
Algorithms and data structures for linear and multilinear structures are on the whole far more evolved than counterparts for groups. This explains why progress for groups can be made by mapping problems into the realm of linear and multilinear algebra. Such a correspondence has been known for close to a century, originating in work of Brahana [B3] and Baer [B3]. Consider groups U of the following form.

$$U \leq H(d_1, \dots, d_\ell; F) := \left\{ \begin{pmatrix} I_{d_1} & a_{12} & \cdots & a_{1\ell} \\ & I_{d_2} & \ddots & \vdots \\ & & \ddots & a_{\ell-1,\ell} \\ & & & I_{d_\ell} \end{pmatrix} \;\middle|\; a_{ij} \in M_{d_i \times d_j}(F) \right\}.$$

In the creation of our random model we shall sample groups U by selecting random matrices in $H(d_1, \dots, d_\ell; F)$. A surprising necessity is that we sample only sparse matrices. Although this might seem counter to the goals of seeding a group with lots of entropy, we will demonstrate that groups with too much random seeding become virtually identical (Theorem 5.2).

As a general remark, it will be helpful throughout this work to regard all groups $U = \langle U, \cdot, {}^{-1}, 1 \rangle$ as having been enriched by the addition of a second binary operation $[a, b] = a^{-1} b^{-1} a b$, known as commutation. In this way groups behave much more like rings than they do like semigroups or quasigroups. In particular, [ , ] very nearly distributes over the usual binary operation · in U, in that $[ab, c] = b^{-1}[a, c] b \, [b, c]$. That explains the link to multilinear algebra. In the case of U, for example with three diagonal blocks,

$$\left[ \begin{pmatrix} I & a_1 & a_2 \\ & I & a_3 \\ & & I \end{pmatrix}, \begin{pmatrix} I & b_1 & b_2 \\ & I & b_3 \\ & & I \end{pmatrix} \right] = \begin{pmatrix} I & 0 & a_1 b_3 - b_1 a_3 \\ & I & 0 \\ & & I \end{pmatrix}.$$

Stripping away the addition leaves us to compare bilinear (and later multilinear) products such as $(a_{ij}, b_{jk}) \mapsto a_{ij} b_{jk}$, under base changes. We treat these as bilinear functions $F^a \times F^b \to F^c$. Equivalently, we must study the orbits of the group $\mathrm{GL}(a, F) \times \mathrm{GL}(b, F) \times \mathrm{GL}(c, F)$ acting on elements of the tensor space $F^a \otimes F^b \otimes F^c$.
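As a quick sanity check, the near-distributivity $[ab,c] = b^{-1}[a,c]b\,[b,c]$ described above can be verified directly on random unitriangular matrices. The sketch below is our own illustration (all helper names are ours), working over Z/p with p = 7:

```python
# Verify the commutator identity [ab, c] = b^-1 [a, c] b [b, c]
# on random unitriangular matrices over Z/p. Helper names are ours.
import random

P = 7  # a small prime modulus

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % P for j in range(n)]
            for i in range(n)]

def mat_inv_unitriangular(A):
    # Inverse of I + N (N strictly upper triangular) is I - N + N^2 - ...,
    # a finite sum since N is nilpotent.
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    N = [[(A[i][j] - I[i][j]) % P for j in range(n)] for i in range(n)]
    inv, term, sign = [row[:] for row in I], [row[:] for row in I], 1
    for _ in range(n - 1):
        term = mat_mul(term, N)
        sign = -sign
        inv = [[(inv[i][j] + sign * term[i][j]) % P for j in range(n)]
               for i in range(n)]
    return inv

def comm(a, b):  # [a, b] = a^-1 b^-1 a b
    return mat_mul(mat_mul(mat_inv_unitriangular(a), mat_inv_unitriangular(b)),
                   mat_mul(a, b))

def rand_unitriangular(n):
    return [[1 if i == j else (random.randrange(P) if j > i else 0)
             for j in range(n)] for i in range(n)]

random.seed(0)
a, b, c = (rand_unitriangular(4) for _ in range(3))
lhs = comm(mat_mul(a, b), c)
bi = mat_inv_unitriangular(b)
rhs = mat_mul(mat_mul(bi, mat_mul(comm(a, c), b)), comm(b, c))
assert lhs == rhs
```

The identity in fact holds in every group, so the assertion succeeds regardless of the random choices; the unitriangular setting simply mirrors the matrix groups used throughout this section.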
Such reductions of group isomorphism to multilinear equivalence, and more general tensor equivalence problems, have been the key to the recent progress on some of the largest and most difficult instances of GpI [BMW1, BW, LQ, IQ, LW2, BOW, W2]. The strategies buried within those methods are nevertheless quite distinct. For example, several focus on ∗-algebras and properties of rings and modules acting on tensors. Others focus on tensors as high-dimensional arrays, and perform individualization and refinement techniques on slices of this data structure. Our model of refinement allows for both strategies.

Second thread: relationship to code equivalence.
Now consider the types of groups we could place on the block diagonal of the matrix group examples in Figure 1.1. These could include groups like $\mathrm{GL}(d_i, F)$. We could also use subgroups such as $\mathrm{GL}(1, \mathbb{F}_{p^{d_i}}) \cong \mathbb{F}_{p^{d_i}}^{\times}$, as well as natural families of geometrically interesting groups such as orthogonal, unitary, and symplectic groups. We may even embed the same group several times into multiple blocks on the diagonal, e.g. $\left\langle \begin{pmatrix} A & \\ & A \end{pmatrix} : A \in \mathrm{GL}(e, F) \right\rangle$. Those blocks could further be permuted, producing groups of block monomial matrices such as $\left\langle \begin{pmatrix} & A \\ A & \end{pmatrix} : A \in \mathrm{GL}(e, F) \right\rangle$. We can capture the spirit of such a group graphically in Figure 1.1(b). Indeed, our random group model builds random semisimple and quasi-semisimple groups in just this way.

Isomorphism in the context of groups of this kind has been approached mostly through the use of code equivalence. For example, for semisimple groups (those with no non-trivial abelian normal subgroups) there is an algorithm that runs in time polynomial in the group order [BCGQ, BCQ], as well as an algorithm that is efficient in practice [CH]. The key algorithmic idea is dynamic programming, and its use follows the one by Luks in the context of hypergraph isomorphism and code equivalence [L2]. Later, [GQ] considers the further implications when the groups centrally extend abelian groups similar to the general family we have described in this thread.

Third thread: composition series and filters.
In recent years there has been some progress on improving general isomorphism using subgroup chains. Rosenbaum and Wagner [RW] demonstrated that one can fix a composition series C(G) for a group G and then, given a composition series C(H) for another group H, efficiently decide if there is an isomorphism G → H sending C(G) to C(H). Luks gave an improvement of that test [L1]. In this way, the putative cost of $n^{\log n + O(1)}$ steps to decide isomorphism by brute force is reduced to the number of possible composition series, which is at most $n^{(1+o(1))\log n}$.

Filters can use characteristic subgroups to recursively find more characteristic subgroups, ultimately producing a large enough collection of fixed subgroups that an isomorphism test along the lines of Rosenbaum–Wagner becomes efficient. For several families of groups such refinements have been discovered [W3, M2]. Our approach here extends the filtration process by taking the known methods and combining them into a colored hypergraph where individualization-refinement techniques can be applied. The goal is to make it even more likely to reach a situation in which the Rosenbaum–Wagner and Luks algorithms can be applied efficiently.
GpI uses recursively-refineable filters to build and refine a colored hypergraphwithin, and between, abelian layers of a given group. A filter φ on a group G assigns to a c -tuple s = ( s , . . . , s c ) of natural numbers (including 0) a normal subgroup φ s of G subject tonatural compatibility requirements. Let Norm( G ) denote the set of normal subgroups of G , andfor A, B ⊆ G let [ A, B ] = h [ a, b ] | a ∈ A, b ∈ b i . Definition 1.1 (Filter [W3]) . A filter on a group G is a map φ : N d → Norm( G ), where( ∀ s, t ∈ N d ) s lex t = ⇒ φ s > φ t and [ φ s , φ t ] φ s + t . (1.2)Note that the first condition implies that the subgroups φ s form a descending chain of subgroups,though in general it is not a proper chain. Computationally we only store the lexicographicallyleast label s for each distinct subgroup φ s in the image of φ . Thus, a filter’s image is bounded bythe length of the longest subgroup chain. For a group of order n this is at most log n .We begin with a filter φ : N c → Norm( G ) known from the structure of general finite groups,and then refine by increasing the value of c . That refinements exist is proved in [W3] and thatthey can be computed efficiently is shown in [M3]. Our initial value for c will be the number ofdistinct primes p , . . . , p c dividing n = | G | . For each prime p i , we let O p i ( G ) denote the intersectionof all Sylow p i -subgroups of G , the maximum normal subgroup having order a power of p i . Let e i = ( . . . , , i , , . . . ) ∈ N c , sorted lexicographically (so that e i < e i +1 ), and define φ : N c → Norm( G )as follows: φ s = G s = 0 , Q cj = i O p j ( G ) s = e i , [ φ s i e i , G ] φ p i s i e i s = ( s i + 1) e i , Q ci =1 φ s i e i s = P ci =1 s i e i . Here the product Q i φ s i e i means the normal subgroup generated by the terms φ s i e i . For examplethe group S of permutations on 4 letters would have φ (0 , = S > φ (1 , = O ( S ) O ( S ) = h (12)(34) , (13)(24) i > φ (2 , = φ (0 , = O ( S ) = 1 . 
The boundary filter $\partial\varphi : \mathbb{N}^d \to \mathrm{Norm}(G)$ is defined by $\partial\varphi_s = \langle \varphi_{s+t} : t \in \mathbb{N}^d \setminus \{0\} \rangle$ (if d = 1, then $\partial\varphi_s = \varphi_{s+1}$), and the quotients $L_s := \varphi_s / \partial\varphi_s$ are the layers of φ. Note that for each $s \neq 0$, $L_s$ is abelian, and in fact a $\mathbb{Z}[\varphi_0/\partial\varphi_0]$-module. In the selected filter above these are in fact $\mathbb{F}_{p_i}$-vector spaces for some i. The set $L(\varphi) = \bigoplus_{s \neq 0} L_s$, with homogeneous bilinear products

$$[\,,\,]_{st} : L_s \times L_t \to L_{s+t} : (x\,\partial\varphi_s, y\,\partial\varphi_t) \mapsto [x, y]\,\partial\varphi_{s+t},$$

is a graded Lie algebra whose graded components are invariant under Aut(G) [W3, Theorem 3.1]. A bilinear map (bimap) $L_s \times L_t \to L_{s+t}$ is said to have genus g if it is defined over a field F such that $\dim_F L_{s+t} \leq g$, or (see [BMW1] for details) if it is built from such maps by certain elementary products (such as direct products, but even "central" products are allowed). We will primarily be concerned with the case where $F = \mathbb{Z}_p$ and we consider bimaps whose codomain has dimension at most g, but our results extend without difficulty to the more general notion of genus.

In our setting the layers of φ are elementary abelian, and our approach is to build a hypergraph whose vertices are the union of the points (1-spaces) in the projective geometries of the layers. For $s \in \mathbb{N}^d$, let $\mathrm{PG}_k(L_s)$ denote the set of (k+1)-dimensional subspaces of $L_s$. Define a family of hypergraphs $H^{(g)}(\varphi)$, where $1 < g \in \mathbb{Z}$ is a parameter, with vertices and hyperedges defined as follows:

The vertex set of $H^{(g)}(\varphi)$ is $V = \bigcup_{s \in \mathbb{N}^d} \mathrm{PG}_0(L_s)$.

The hyperedge set of $H^{(g)}(\varphi)$ is $E = \bigcup_{s \in \mathbb{N}^d} \big( \mathrm{PG}_g(L_s) \cup \mathrm{PG}_{\dim L_s - g}(L_s) \big) \cup \bigcup_{s \neq t} K_{st}$, where $K_{st}$ is a hypergraph with 2-edges and 3-edges on $\mathrm{PG}_0(L_s) \cup \mathrm{PG}_0(L_t) \cup \mathrm{PG}_0(L_{s+t})$.

Having defined the hypergraph $H^{(g)}(\varphi)$, we shall apply the k-dimensional Weisfeiler–Leman (WL) procedure to it in an appropriate way. Briefly, it is a hypergraph version of the WL procedure [WL] on graphs [B2, IL]. When k = 1, such a WL procedure on hypergraphs was recently studied by Böker [B1].

To this end we obtain an algorithm that, given a finite group G and integers g, k >
1, computes a suitable characteristic filter $\varphi : \mathbb{N}^d \to \mathrm{Norm}(G)$, where $N = O_\infty(G) = \prod_p O_p(G)$ is the Fitting subgroup, and an associated hypergraph $H^{(g,k)}(\varphi)$. Further, it colors the hyperedges E of $H^{(g,k)}(\varphi)$ in a certain desirable way. If $\chi : E \to \mathbb{N}$ is a coloring of hyperedges, denote the corresponding colored hypergraph by $H^{(g,k)}_\chi(\varphi)$.

Theorem A.
There is a deterministic algorithm that, given a finite group G and integers g, k > 1, constructs the Fitting subgroup $N = O_\infty(G)$, a characteristic filter $\varphi : \mathbb{N}^d \to \mathrm{Norm}(G)$ whose non-zero layers are elementary abelian Aut(G)-modules, the hypergraph $H = H^{(g,k)}(\varphi)$, and a coloring $\chi : V(H) \cup E(H) \to \mathbb{N}$ satisfying:

(i) $H^{(g,k)}_\chi(\varphi)$ is hereditary in the following sense: for each $s \in \mathbb{N}^d \setminus \{0\}$, the vertex-and-edge-colored hypergraph obtained by restricting $H^{(g,k)}_\chi(\varphi)$ to $G/\varphi_s$ is a refinement of the colored hypergraph for $G/\varphi_s$ based on the filter φ truncated at $\varphi_s$.

(ii) $H^{(g,k)}_\chi(\varphi)$ is also hereditary in k in the following sense: the underlying hypergraphs of $H^{(g,k)}(\varphi)$ and $H^{(g,k+1)}(\varphi)$ are identical, and the coloring of the latter refines the coloring of the former.

(iii) If $G \cong G'$, there is a colored hypergraph isomorphism $f : H^{(g,k)}_\chi(\varphi) \to H^{(g,k)}_{\chi'}(\varphi')$ such that $\forall e \in E(H),\ \chi(e) = \chi'(f(e))$, and $\forall v \in V(H),\ \chi(v) = \chi'(f(v))$.

The time complexity is $|G|^{O(gk)}$.

The algorithm to construct the colored hypergraph $H^{(g,k)}_\chi(\varphi)$ is an iterative procedure that we describe in detail in Section 3. Within a fixed iteration, we apply a Weisfeiler–Leman type individualization procedure to obtain a stable coloring (a hypergraph analogue of k-dimensional WL). We then use that stable coloring to search for characteristic structure in G not already captured by the filter φ. If we succeed, we use this structure to refine φ and iterate.

Given the result of our WL-algorithm and applying Luks's extension [L1] of the Rosenbaum–Wagner composition series comparison [RW], whenever we refine we improve our isomorphism test, resulting in:

Theorem B. Let $\varphi = \varphi_{g,k}$ and $H^{(g,k)} = H^{(g,k)}_\chi(\varphi)$ denote the filter and colored hypergraph from Theorem A. Let $\mathrm{width}(\varphi)$ denote the maximum dimension of any layer of φ, and let $\text{color-ratio}(H)$ be the product over all layers s of $|L_s| / |C_s|$, where $C_s$ is the smallest color class in layer $L_s$. Then given a nilpotent group N of order n, isomorphism can be tested in time

$$\left( \frac{n}{\text{color-ratio}(H^{(g,k)})} \right)^{\mathrm{width}(\varphi_{g,k})} \mathrm{poly}(n) + n^{O(gk)}.$$

We extend this with an individualize-and-refine technique in Section 4.3, though for that we do not have as cleanly stated an upper bound.
Remark. The initial filter described above can be extended to solvable groups, and in particular the solvable radical Rad(G) of any group, by doing something similar to the above within each layer of the Fitting series. This would let us extend all our results from using the Fitting subgroup $O_\infty(G)$ to using the solvable radical Rad(G) instead, and would extend Theorem B from nilpotent to solvable groups.

Unlike sampling a random graph, where edges can freely be added or omitted, sampling groups of a fixed order requires some delicacy. For example, there are 14 isomorphism types of groups of order 16 but only 1 each of orders 15 and 17. Sampling random groups has hitherto been approached in one of the following two ways.
Quotient Sampling.
Fix a free group F[X] of all strings on an alphabet $X \cup X^{-1}$, and consider quotients by normal subgroups $N = \langle S \rangle$ sampled by choosing $S \subset F$ by some aleatory process.

Subgroup Sampling.
Fix an automorphism group of a structure, such as the group Sym(Ω) of permutations of a set Ω, or the group GL(V) of invertible linear transformations of a vector space V. Consider subgroups $H = \langle S \rangle$ where S is sampled by some aleatory process.

Evidently, both methods yield groups, but neither offers sufficient variability when restricted to finite groups. For instance, Gromov studied quotient sampling as a function of the word lengths of elements in S, finding most quotients are 1, $\mathbb{Z}/2$, or infinite [G2]. Also, subgroup sampling in G = Sym(Ω) (respectively GL(V)) has been shown by Dixon, Kantor–Lubotzky [KL], and others to essentially sample $A_n$, $S_n$ (respectively, subgroups $\mathrm{SL}(V) \leq H \leq \mathrm{GL}(V)$).

To escape these conditions we adopt a method of sampling that appears antithetical to random models: we strongly bias our random selections. We settle on a model related to subgroup sampling in GL(d, p), since this affords us easy-to-use group operations. (Note, Novikov and Boone demonstrated that the word problem in finitely presented quotients of the free group is undecidable, and thus working with quotients F[X]/N is not in general feasible [N, B2].)

First, we sample random upper (d × d)-unitriangular matrices $u_1, \dots, u_\ell \in U(d, p)$, but we insist that they are ε-sparse, for some constant ε. Then $U = \langle u_1, \dots, u_\ell \rangle$ samples a subgroup whose order is a power of the prime p, the characteristic of our fixed field F.

Figure 1.2: Plots of the orders of 100 subgroups sampled as $\langle u_1, \dots, u_\ell \rangle \leq U(10, 3)$ with three different densities ε (markers +, ×, and ∗, in increasing order of density, with ∗ marking ε = 1). Greater density makes group order, and structure, less varied. The X-axis is labelled by the group order, while the Y-axis is labelled by the percentage of the sampled groups.

As we shall demonstrate in Theorem 5.2, without limiting our randomness to sparse matrices the groups U will almost always contain the following subgroup:

$$\gamma_2(U(d, p)) = \left\{ \begin{pmatrix} 1 & 0 & * & \cdots & * \\ & 1 & 0 & \ddots & \vdots \\ & & \ddots & \ddots & * \\ & & & 1 & 0 \\ & & & & 1 \end{pmatrix} \right\}.$$

In essence, this is a p-group analogue of the observations we made about sampling in $S_n$ and GL(d, p). However, sampling with sparsity gives substantial variation, as illustrated simply by comparing orders in Figure 1.2. An interesting recent study by R. Gilman describes a similar situation for permutations analyzed by Kolmogorov complexity [G1].

Secondly, once we have selected a suitably random upper unitriangular group U, an extension to this group is selected by adding to its block diagonal. That process consists of choosing a partition of the series of common generalized eigen-1-spaces (the fixed point flag) of the group U. In each block we select a random (almost) quasisimple group with a representation of dimension at most the size of the block. We further allow for multiplicity and for permutations of isomorphic modules. This extends U first by a block-diagonal abelian group, then a product of simple groups, followed by a layer of abelian groups, and a final layer of permutations. It is well known that every finite group has such a decomposition, often referred to as the Babai–Beals filtration [BB]. We note our own filtration descends to the Fitting subgroup instead of to the solvable radical as in the Babai–Beals treatment; revisit Figure 1.1 for an illustration.

Along with the proposal of such a model inevitably come questions as to its efficacy. We address two of the more critical issues here.
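The sampling step itself is easy to sketch. The toy version below is our own illustration, feasible only for tiny d and p where the closure fits in memory: it draws ε-sparse upper unitriangular generators over Z/p and computes the order of the subgroup they generate:

```python
# Sample epsilon-sparse unitriangular matrices over Z/p and measure the
# order of the subgroup they generate by closure. All names are ours.
import random

def mat_mul(A, B, p):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) % p
                       for j in range(n)) for i in range(n))

def sparse_unitriangular(d, p, eps, rng):
    # each strictly-upper entry is nonzero with probability eps
    return tuple(tuple(1 if i == j else
                       (rng.randrange(1, p) if j > i and rng.random() < eps
                        else 0)
                       for j in range(d)) for i in range(d))

def group_order(gens, p):
    d = len(gens[0])
    identity = tuple(tuple(int(i == j) for j in range(d)) for i in range(d))
    elems, frontier = {identity}, {identity}
    while frontier:          # monoid closure = group closure, G being finite
        new = {mat_mul(x, g, p) for x in frontier for g in gens} - elems
        elems |= new
        frontier = new
    return len(elems)

rng = random.Random(1)
gens = [sparse_unitriangular(4, 2, 0.5, rng) for _ in range(3)]
order = group_order(gens, 2)
assert order >= 1 and 64 % order == 0   # order divides |U(4, 2)| = 2^6
```

Rerunning with different seeds and densities ε reproduces, in miniature, the variation in group orders illustrated in Figure 1.2.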
First, our model samples a large number of groups:

Theorem C.
A random d × d group over $\mathbb{Z}/b$, as above, samples from each of the following classes of groups.

(i) Finite abelian groups of exponent dividing b and order at most $O(b^{d^2/4})$.

(ii) For each $\mathbb{Z}/b$-bilinear map $* : U \times V \to W$, with $\mathrm{rank}\,U + \mathrm{rank}\,V + \mathrm{rank}\,W \leq d$, the Brahana groups [B3] $\mathrm{Bh}(*) = U \times V \times W$ with product (also denoted by ∗) given by $(u, v, w) * (u', v', w') = (u + u', v + v', w + w' + u * v')$.

(iii) For each alternating $\mathbb{Z}/b$-bilinear map $* : U \times U \to W$, with $\mathrm{rank}\,U + \mathrm{rank}\,W \leq d$, the Baer groups [B3] $\mathrm{Br}(*) = U \times W$ with product (also denoted by ∗) given by $(u, w) * (u', w') = (u + u', w + w' + u * u')$.

(iv) All classical groups T(r, q) of rank r over $\mathbb{F}_q$ where $r \log q \leq d$.

(v) All permutation groups of degree at most d.

In particular this class of groups samples from $p^{\Theta(d^3)}$ pairwise non-isomorphic groups of order $p^d$, which is a logarithmically dense subset of all isomorphism types of groups of order $p^d$. Furthermore, this class of groups is closed under direct products and subdirect products.

Secondly, for groups selected from our model, even a genus-1, 1-WL refinement results in a filter with constant average width. (Note, constant max width would result in a polynomial-time isomorphism test.)
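The Brahana product in Theorem C(ii) is simple to realize in code. The sketch below is our own illustration (the bilinear map ∗ is given by random structure constants, and all names are ours); it checks that bilinearity of ∗ is exactly what makes the product associative:

```python
# The Brahana group Bh(*) for a random Z/b-bilinear map * : U x V -> W,
# represented by structure constants. All names are ours.
import random

b, rU, rV, rW = 6, 2, 2, 2          # modulus and small ranks
rng = random.Random(0)
# structure constants: c[i][j] in W represents e_i * f_j
c = [[tuple(rng.randrange(b) for _ in range(rW)) for _ in range(rV)]
     for _ in range(rU)]

def bil(u, v):                       # the bilinear map * in coordinates
    return tuple(sum(u[i] * v[j] * c[i][j][k]
                     for i in range(rU) for j in range(rV)) % b
                 for k in range(rW))

def mul(x, y):     # (u,v,w) * (u',v',w') = (u+u', v+v', w+w'+u*v')
    (u, v, w), (u2, v2, w2) = x, y
    return (tuple((s + t) % b for s, t in zip(u, u2)),
            tuple((s + t) % b for s, t in zip(v, v2)),
            tuple((s + t + d) % b for s, t, d in zip(w, w2, bil(u, v2))))

def rand_elt():
    return (tuple(rng.randrange(b) for _ in range(rU)),
            tuple(rng.randrange(b) for _ in range(rV)),
            tuple(rng.randrange(b) for _ in range(rW)))

e = (tuple([0] * rU), tuple([0] * rV), tuple([0] * rW))
x, y, z = rand_elt(), rand_elt(), rand_elt()
assert mul(mul(x, y), z) == mul(x, mul(y, z))  # associativity <=> bilinearity
assert mul(x, e) == x == mul(e, x)             # identity element
```

Expanding both sides of the associativity check reduces it to $(u + u') * v'' = u * v'' + u' * v''$ together with $u * (v' + v'') = u * v' + u * v''$, i.e. precisely the bilinearity of ∗.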
Theorem D.
For a random group $G \leq U(d, b)$ sampled by our model, one of the following cases occurs on average when d and b are large enough:

(a) $O_b(G)$ is abelian; or

(b) G has a characteristic WL-filter refinement of length $\Theta(\log |G|)$.

It was predicted in [W3] that most p-groups P have characteristic filters of length $O(\log |P|)$, owing in part to a result of Helleloid–Martin [HM]. However, outside of examples in [W3, M3] there were no large classes of groups for which it could be demonstrated that such a filter could be efficiently computed. In a survey of 500,000,000 groups of 2-power order conducted by J. Maglione and the fifth author, it was discovered that 96% of groups admitted a filter refinement by algebraic methods; for the p-groups surveyed, most filters refined to a length of about 10 times the original. Theorem D offers a theoretical explanation for those experimental results.

One base case for which the application of Weisfeiler–Leman is unlikely to go much further is p-groups of class 2 and exponent p. (This special case has long been considered as difficult as the general group isomorphism problem.) As we have seen in Baer's correspondence [B3] (cf. Theorem C(iii)), when p is odd, testing isomorphism of such groups is equivalent to the following problem: given two alternating bilinear maps $\alpha, \beta : U \times U \to V$, decide whether they are pseudo-isometric, that is, whether they are the same under the natural action of $\mathrm{GL}(U) \times \mathrm{GL}(V)$.

Let $\Lambda(n, q)$ denote the linear space of all n × n alternating matrices over $\mathbb{F}_q$, namely the n × n matrices G such that $v^t G v = 0$ for all $v \in \mathbb{F}_q^n$. Note, $v \mapsto v^t$ and $G \mapsto G^t$ denote transposition on vectors and matrices, respectively. An alternating bilinear map $\alpha : U \times U \to V$ with $U \cong \mathbb{F}_q^n$ and $V \cong \mathbb{F}_q^m$ will be represented by an m-tuple of n × n such matrices.
Testing pseudo-isometry of alternating bilinear maps translates to the following: given two m-tuples of n × n alternating matrices over $\mathbb{F}_q$, $\mathbf{G} = (G_1, \dots, G_m)$ and $\mathbf{H} = (H_1, \dots, H_m)$, decide whether there exists $T \in \mathrm{GL}(n, q)$ such that the linear spans of $T^t \mathbf{G} T := (T^t G_1 T, \dots, T^t G_m T)$ and $\mathbf{H}$ are the same. For an odd prime p, testing the pseudo-isometry of alternating bilinear maps over $\mathbb{F}_p$ in time $p^{O(n+m)}$ is equivalent to testing isomorphism of p-groups of class 2 and exponent p in time polynomial in the group order. Also note that the naïve brute-force algorithm, enumerating all possible $T \in \mathrm{GL}(n, q)$, takes time $q^{n^2} \cdot \mathrm{poly}(n, \log q)$.

In [LQ] it was shown that when n and m are linearly related, for all but at most a $1/q^{\Omega(n)}$ fraction of $\mathbf{G} \in \Lambda(n, q)^m$, there is an algorithm running in time $q^{O(n)}$ to test pseudo-isometry of $\mathbf{G}$ with an arbitrary $\mathbf{H} \in \Lambda(n, q)^m$. The technique used to derive this result merits further comment. It was inspired by, and can be viewed as a linear-algebraic analogue of, a classical combinatorial idea from graph isomorphism testing, namely the individualization and refinement technique. More specifically, it follows the use and analysis of this technique by Babai, Erdős, and Selkow, in the first efficient average-case algorithm for graph isomorphism [BES]. By incorporating the genus concept [BMW1] into the individualization and refinement scheme as used in [LQ, BES] we can both extend and improve this result and at the same time greatly simplify the algorithm. Indeed, we have implemented an effective version of this new algorithm in
Magma [BJP]. We prove:
Theorem E.
Suppose m is larger than some constant. There is an algorithm that, for all but at most a $1/q^{\Omega(nm)}$ fraction of $\mathbf{G} \in \Lambda(n, q)^m$, tests the pseudo-isometry of $\mathbf{G}$ to an arbitrary m-tuple of alternating matrices $\mathbf{H}$, in time $q^{O(n+m)}$.

We briefly outline a simplified version of the algorithm, which is easy to describe and straightforward to implement. A more detailed description can be found in Section 6.1. The simplified version already captures the essence of the strategy, but it comes with two small drawbacks. First, it does not work over fields of characteristic 2. Secondly, the average-case analysis does not achieve the level stated in Theorem E. Both issues will be remedied in the algorithm presented in Section 6.2, followed by a rigorous average-case analysis.

Assume we are given two m-tuples $\mathbf{G} = (G_1, \dots, G_m)$ and $\mathbf{H} = (H_1, \dots, H_m)$ from $\Lambda(n, q)^m$, for sufficiently large m and odd q. Let $\mathcal{H}$ be the subspace of $\Lambda(n, q)$ spanned by $\mathbf{H}$. Take the first c matrices of $\mathbf{G}$ to form a tuple $\mathbf{A} = (G_1, \dots, G_c)$ for some constant c < m. Note, every pseudo-isometry from $\mathbf{G}$ to $\mathbf{H}$ maps $\mathbf{A}$ to a c-tuple $\mathbf{B}$ of matrices in $\mathcal{H}$.

This simple observation leads to the following algorithm. (We say two c-tuples of alternating matrices $\mathbf{A}$ and $\mathbf{B}$ are isometric if there exists an invertible matrix $T \in \mathrm{GL}(n, q)$ such that $T^t \mathbf{A} T = \mathbf{B}$, and the autometry group of $\mathbf{A}$ is $\{T \in \mathrm{GL}(n, q) : T^t \mathbf{A} T = \mathbf{A}\}$.) First, check if the autometry group of $\mathbf{A}$ is too large (larger than $q^{\Omega(n)}$). If so, $\mathbf{G}$ does not satisfy our generic condition. Thus, suppose the autometry group is not too large, and enumerate all possible c-tuples $\mathbf{B}$ in $\mathcal{H}$. Exhaustively check if any of them is isometric to $\mathbf{A}$, and, in the case of isometry, check if any isometry between $\mathbf{A}$ and $\mathbf{B}$ extends to a pseudo-isometry between $\mathbf{G}$ and $\mathbf{H}$.
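At toy scale, the core span condition behind pseudo-isometry, whether the linear spans of $T^t \mathbf{G} T$ and $\mathbf{H}$ coincide, can be sketched directly. The following is our own illustration (names and parameters are ours): it acts on a tuple of alternating matrices by congruence and compares spans via canonical reduced row-echelon forms of the vectorized matrices:

```python
# Act on an m-tuple of alternating matrices over F_p by congruence
# T^t G_i T and compare linear spans via a canonical RREF. Names are ours.
import random

p, n = 5, 4

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % p
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def rref(rows):        # reduced row-echelon form over F_p (canonical)
    rows = [[x % p for x in r] for r in rows]
    r = 0
    for lead in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][lead]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][lead], p - 2, p)       # p prime
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][lead]:
                f = rows[i][lead]
                rows[i] = [(a - f * s) % p for a, s in zip(rows[i], rows[r])]
        r += 1
    return [row for row in rows if any(row)]

def span(tup):         # canonical form of the span of vectorized matrices
    return rref([[x for row in M for x in row] for M in tup])

rng = random.Random(3)

def rand_alt():        # random alternating matrix: zero diagonal, M^t = -M
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            M[i][j] = rng.randrange(p)
            M[j][i] = (-M[i][j]) % p
    return M

G = [rand_alt(), rand_alt()]
T = [[rng.randrange(p) for _ in range(n)] for _ in range(n)]
GT = [mat_mul(transpose(T), mat_mul(Gi, T)) for Gi in G]
# H spans the same subspace as GT (an invertible recombination of the tuple)
H = [GT[0], [[(x + y) % p for x, y in zip(r0, r1)]
             for r0, r1 in zip(GT[1], GT[0])]]
assert span(GT) == span(H)
```

Since RREF is a canonical form for a row space, span equality is decided by a single list comparison; the full algorithm above wraps this check inside the enumeration of candidate c-tuples $\mathbf{B}$.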
The number of isometries between $\mathbf{A}$ and $\mathbf{B}$ is also not too large, because it is equal to the order of the autometry group of $\mathbf{A}$. Note that the coset of isometries between $\mathbf{A}$ and $\mathbf{B}$ can be computed in time $\mathrm{poly}(n, c, \log q)$ over fields of characteristic not 2 [BW, IQ]. Enumerating all possible $c$-tuples in $\mathcal{H}$ incurs a multiplicative factor of $q^{cm}$. Given an isometry between $\mathbf{A}$ and $\mathbf{B}$, we can check whether $\mathbf{G}$ and $\mathbf{H}$ are pseudo-isometric in time $\mathrm{poly}(n, m, \log q)$. Thus, the overall time complexity is bounded above by $q^{cm} \cdot s \cdot \mathrm{poly}(n, m, \log q)$, where $s$ is the order of the autometry group of $\mathbf{A}$. As we shall prove in Section 6, there is an absolute constant $c$ such that for almost all $m$-tuples of $n \times n$ alternating matrices $\mathbf{G}$, the first $c$ matrices have autometry group of order at most $q^{O(n)}$. Thus, the overall time complexity of the aforementioned pseudo-isometry test is $q^{O(n+m)}$ for almost all $\mathbf{G}$ and arbitrary $\mathbf{H}$. (The main result in [LQ] is stated in a so-called linear-algebraic Erdős–Rényi model. This model is not essentially different from sampling random alternating matrix tuples; see also Remark 6.20 for some details.)

Performance.
We implemented the above algorithm in
Magma with some key adjustments (see Section 6.1 for details). The implementation is publicly available on GitHub as part of a comprehensive collection of tools—developed and maintained by the first and last authors and their collaborators—to compute with groups, algebras, and multilinear functions [BMW2].

Absent additional characteristic structure that can be exploited, the traditional approach to deciding pseudo-isometry between alternating bilinear maps $\alpha, \beta : V \times V \to W$ is as follows. Let $\hat{\alpha}, \hat{\beta} : V \wedge V \to W$ denote the linear maps induced by $\alpha, \beta$. Compute the natural (diagonal) action of $\mathrm{GL}(V)$ on $V \wedge V$, and decide whether $\ker\hat{\alpha}$ and $\ker\hat{\beta}$—each of codimension $\dim W$ in $V \wedge V$—belong to the same orbit. An alternative version of brute force is to enumerate $\mathrm{GL}(W)$ and check whether one of these transformations lifts to a pseudo-isometry from $\alpha$ to $\beta$. Which of these two brute-force options represents the best choice depends on the dimensions of $V$ and $W$.

Our implementation is typically an improvement over both options. For example, in a preliminary experiment, our implementation readily decided pseudo-isometry between randomly selected alternating bilinear maps, while both brute-force options failed to complete. Note that the worst case for all methods should be when $\alpha, \beta$ are not isometric, since in that case one must exhaust the entire enumerated list (or orbit) to confirm non-equivalence. However, the modifications we made tend to detect non-equivalence rather easily, since other (easily computed) invariants typically do not align in this case. We were therefore careful to also run tests with equivalent inputs, so as to ensure a fair comparison with default methods.

Groups with genus-2 radicals

There are examples, due to the fifth author, of non-isomorphic $p$-groups having all proper nontrivial subgroups of a common order isomorphic, and likewise for quotients [W2].
No amount of local invariants will distinguish such groups, so when a WL-refinement-style algorithm such as ours encounters such a group, it can go no further. Even so, those examples have low genus, and thus their isomorphism can be decided efficiently by unrelated methods [BMW1]. However, should these groups arise as $O_p(G)$ for a non-nilpotent group $G$, it remains to contend with them as a base case. Combining the code equivalence technique of [BCGQ], the cohomological techniques of [GQ], and results on the automorphism groups of low-genus groups [BMW1], we are able to get a nearly-polynomial running time for testing isomorphism in an important subclass of such groups.

Theorem F.
Let $\mathcal{G}$ be the class of groups $G$ such that $\mathrm{Rad}(G)$—the largest solvable normal subgroup of $G$—is a $p$-group of class 2 and exponent $p \neq 2$, such that $G$ acts on $\mathrm{Rad}(G)$ by inner automorphisms. Given groups $G_1, G_2$ of order $n$, it can be decided in $\mathrm{poly}(n)$ time whether they lie in $\mathcal{G}$. If so, isomorphism can be decided, and a generating set for $\mathrm{Aut}(G_i)$ found, in time $n^{O(g + \log\log n)}$, where $g$ is the genus of $\mathrm{Rad}(G_1)$.

Structure of the paper.
After presenting some preliminaries in Section 2, we detail the construction of the colored hypergraphs and prove Theorem A in Section 3. We then explain the combination of filters and composition series isomorphism in
GpI, proving Theorem B in Section 4. The model of random groups, and the effect of the refinement procedure in this model, are the subject of Section 5, where Theorems C and D are proved. Finally, we provide the average-case algorithm for $p$-groups of class 2 and exponent $p$ (Theorem E) in Section 6, and the worst-case algorithm for groups with genus-2 radical (Theorem F) in Section 7.

Notation.
Let $[m] = \{1, \ldots, m\}$ for $m \in \mathbb{N}$. We use $\binom{n}{d}_q$ to denote the Gaussian binomial coefficient with parameters $n$, $d$ and base $q$. Let $\mathrm{M}(n \times n', \mathbb{F})$ (resp. $\mathrm{M}(n, \mathbb{F})$) be the linear space of all $n \times n'$ (resp. $n \times n$) matrices over $\mathbb{F}$. The general linear group of degree $n$ over $\mathbb{F}$ is denoted by $\mathrm{GL}(n, \mathbb{F})$. When $\mathbb{F} = \mathbb{F}_q$ for some prime power $q$, we write simply $\mathrm{M}(n, q)$ and $\mathrm{GL}(n, q)$ in place of $\mathrm{M}(n, \mathbb{F}_q)$ and $\mathrm{GL}(n, \mathbb{F}_q)$.

Definitions of bilinear maps.
Let $U, V, W$ be vector spaces over a field $\mathbb{F}$. An ($\mathbb{F}$-)bilinear map is a function $\alpha : U \times V \to W$ such that
$$(\forall u \in U,\ \forall v, v' \in V,\ \forall a, b \in \mathbb{F})\qquad \alpha(u, av + bv') = a\alpha(u, v) + b\alpha(u, v'),$$
$$(\forall u, u' \in U,\ \forall v \in V,\ \forall a, b \in \mathbb{F})\qquad \alpha(au + bu', v) = a\alpha(u, v) + b\alpha(u', v).$$
If $\beta : U' \times V' \to W'$ is another $\mathbb{F}$-bilinear map, we regard $\beta$ as a function on the same domain and codomain as $\alpha$ by selecting arbitrary linear isomorphisms $U \to U'$, $V \to V'$, and $W \to W'$. We say $\alpha, \beta : U \times V \to W$ are isotopic if there exists $(f, g, h) \in \mathrm{GL}(U) \times \mathrm{GL}(V) \times \mathrm{GL}(W)$ such that $\beta(f(u), g(v)) = h(\alpha(u, v))$ for all $u \in U$, $v \in V$, and principally isotopic if there is an isotopism of the form $(f, g, \mathrm{id}_W)$. If $U = V$, we often require that $f = g$. We say $\alpha, \beta : V \times V \to W$ are pseudo-isometric if there is an isotopism of the form $(g, g, h)$, and that they are isometric if there is a pseudo-isometry of the form $(g, g, \mathrm{id}_W)$. A bilinear map $\alpha : V \times V \to W$ is alternating if $\alpha(v, v) = 0$ for every $v \in V$.

Computational models.
Suppose, after fixing bases, that $U = \mathbb{F}^\ell$, $V = \mathbb{F}^n$, and $W = \mathbb{F}^m$, which we regard as column spaces. A bilinear map $\alpha : U \times V \to W$ can be represented as a tuple of matrices $\mathbf{A} = (A_1, \ldots, A_m) \in \mathrm{M}(\ell \times n, \mathbb{F})^m$, where
$$(\forall u \in U, v \in V)\qquad \alpha(u, v) = (u^t A_1 v, \ldots, u^t A_m v)^t.$$
Suppose $\beta : U \times V \to W$ is represented by $\mathbf{B} = (B_1, \ldots, B_m) \in \mathrm{M}(\ell \times n, \mathbb{F})^m$. The concepts of isotopism and principal isotopism then have natural and straightforward interpretations in terms of these matrices. Namely, we say $\mathbf{A}, \mathbf{B} \in \mathrm{M}(\ell \times n, \mathbb{F})^m$ are isotopic if there exist invertible matrices $T \in \mathrm{GL}(\ell, \mathbb{F})$, $S \in \mathrm{GL}(n, \mathbb{F})$, and $R \in \mathrm{GL}(m, \mathbb{F})$ such that
$$T^t\mathbf{A}S = (T^t A_1 S, \ldots, T^t A_m S) = \left( \sum_{i=1}^m r_{1,i} B_i, \ \ldots, \ \sum_{i=1}^m r_{m,i} B_i \right) = \mathbf{B}^R,$$
where $r_{i,j}$ denotes the $(i,j)$-th entry of $R$ for $i, j \in [m]$. We say $\mathbf{A}$ and $\mathbf{B}$ are principally isotopic if they are isotopic with $R = I_m$.

Similarly, an alternating bilinear map $\alpha : V \times V \to W$ can be represented by a tuple of alternating matrices. Recall that an $n \times n$ matrix $G$ over $\mathbb{F}$ is alternating if $v^t G v = 0$ for every $v \in \mathbb{F}^n$. When $\mathbb{F}$ is not of characteristic 2, this is equivalent to the skew-symmetry condition. Let $\Lambda(n, \mathbb{F})$ be the linear space of all $n \times n$ alternating matrices over $\mathbb{F}$ (and $\Lambda(n, q)$ when $\mathbb{F} = \mathbb{F}_q$). Then pseudo-isometry and isometry have analogous formulations in terms of alternating matrix tuples. Given two tuples of alternating matrices $\mathbf{G}, \mathbf{H} \in \Lambda(n, q)^m$, the set of isometries between $\mathbf{G}$ and $\mathbf{H}$ is denoted
$$\mathrm{Isom}(\mathbf{G}, \mathbf{H}) = \{ T \in \mathrm{GL}(n, q) : T^t\mathbf{G}T = \mathbf{H} \};$$
the group of autometries (or self-isometries) of $\mathbf{G}$ is denoted $\mathrm{Aut}(\mathbf{G}) = \mathrm{Isom}(\mathbf{G}, \mathbf{G})$.
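For intuition, these sets can be computed by brute force when $n$ and $q$ are tiny. The sketch below is our own illustration: it takes $n = 2$, $q = 3$, and single-matrix tuples $\mathbf{G} = (J)$, $\mathbf{H} = (2J)$. Since $T^t J T = \det(T) \cdot J$ for $2 \times 2$ matrices, $\mathrm{Aut}(\mathbf{G})$ here is $\mathrm{SL}(2, 3)$ and $\mathrm{Isom}(\mathbf{G}, \mathbf{H})$ is the set of determinant-2 matrices, a coset of $\mathrm{Aut}(\mathbf{G})$.

```python
from itertools import product

P = 3
J  = ((0, 1), (2, 0))   # alternating over F_3: x^t J x = 0 for all x
J2 = ((0, 2), (1, 0))   # 2 * J

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))

def cong(T, X):  # T^t X T
    Tt = tuple(tuple(T[j][i] for j in range(2)) for i in range(2))
    return mul(mul(Tt, X), T)

# brute-force list of GL(2, 3); |GL(2,3)| = 8 * 6 = 48
GL = [M for e in product(range(P), repeat=4)
      for M in ((tuple(e[:2]), tuple(e[2:])),)
      if (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % P != 0]

Isom = [T for T in GL if cong(T, J) == J2]   # Isom((J), (2J))
Aut  = [T for T in GL if cong(T, J) == J]    # Aut((J)) = SL(2, 3)
T0 = Isom[0]
# Isom(G, H) is the (right) coset Aut(G) . T0
assert {mul(S, T0) for S in Aut} == set(Isom)
```

The final assertion is exactly the coset statement in the text: composing any autometry with one fixed isometry sweeps out all isometries.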
The set of pseudo-isometries between $\mathbf{G}$ and $\mathbf{H}$ is defined as
$$\Psi\mathrm{Isom}(\mathbf{G}, \mathbf{H}) = \{ T \in \mathrm{GL}(n, q) : \exists T' \in \mathrm{GL}(m, q),\ T^t\mathbf{G}T = \mathbf{H}^{T'} \};$$
the group of pseudo-autometries (or self-pseudo-isometries) of $\mathbf{G}$ is denoted $\Psi\mathrm{Aut}(\mathbf{G}) = \Psi\mathrm{Isom}(\mathbf{G}, \mathbf{G})$. It is straightforward to see that $\mathrm{Isom}(\mathbf{G}, \mathbf{H})$ is a (possibly empty) coset of $\mathrm{Aut}(\mathbf{G})$, and $\Psi\mathrm{Isom}(\mathbf{G}, \mathbf{H})$ is a (possibly empty) coset of $\Psi\mathrm{Aut}(\mathbf{G})$.

Some algorithms for bilinear maps.
We note that several of the algorithms we cite are described as Las Vegas randomized algorithms because they depend on factoring polynomials over finite fields. That task can be performed deterministically in polynomial time when the characteristic of the field is bounded. In our input model we are given a list of the group elements, so all primes involved are bounded, and we therefore cite these as deterministic algorithms.
Theorem 2.1.
Let $\alpha, \beta : U \times V \to W$ be bilinear maps of vector spaces over a finite field $\mathbb{F}$.
1. In time $\mathrm{poly}(\dim U, \dim V, |\mathbb{F}|)$ one can decide if $\alpha, \beta$ are principally isotopic [BOW, Theorem 3.7].
2. If $U = V$ and the characteristic of $\mathbb{F}$ is not 2, in time $\mathrm{poly}(\dim U, |\mathbb{F}|)$ one can decide if $\alpha, \beta$ are isometric [IQ].
In each case an affirmative answer is accompanied by a principal isotopism (or isometry).

We also require the following, which follows directly from Theorem 2.1 by enumerating $\mathrm{GL}(W)$.
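That reduction is simple enough to sketch. The toy code below is our own illustration, over $\mathbb{F}_2$ with $\ell = n = m = 2$: it enumerates $R \in \mathrm{GL}(W) \cong \mathrm{GL}(2, 2)$ and, for each choice, tests principal isotopism of $\mathbf{A}$ and $\mathbf{B}^R$. The principal-isotopism test here is brute force over pairs $(T, S)$, standing in for the polynomial-time algorithm of [BOW]; all names and parameter choices are ours.

```python
from itertools import product

P = 2  # F_2; isotopism, unlike isometry, needs no restriction on the characteristic

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))

def tr(X):
    return tuple(tuple(X[j][i] for j in range(2)) for i in range(2))

# GL(2, 2): 6 invertible matrices (det != 0, computed mod 2)
GL = [M for e in product(range(P), repeat=4)
      for M in ((tuple(e[:2]), tuple(e[2:])),)
      if (M[0][0] * M[1][1] + M[0][1] * M[1][0]) % P != 0]

def act(R, B):  # B^R: tuple of R-linear combinations, as in the displayed equation
    m = len(B)
    return tuple(
        tuple(tuple(sum(R[k][i] * B[i][r][c] for i in range(m)) % P
                    for c in range(2)) for r in range(2))
        for k in range(m))

def principally_isotopic(A, B):  # brute-force stand-in for [BOW, Theorem 3.7]
    return any(all(mul(mul(tr(T), Ai), S) == Bi for Ai, Bi in zip(A, B))
               for T in GL for S in GL)

def isotopic(A, B):  # Theorem 2.2: enumerate GL(W), then test principal isotopism
    return any(principally_isotopic(A, act(R, B)) for R in GL)

# a worked pair: B is an isotope of A via some (T, S, R); B0 is not isotopic to A
madd = lambda X, Y: tuple(tuple((x + y) % P for x, y in zip(rx, ry))
                          for rx, ry in zip(X, Y))
A = (((0, 1), (0, 0)), ((1, 0), (0, 1)))
T, S = ((1, 1), (0, 1)), ((1, 0), (1, 1))
C = tuple(mul(mul(tr(T), Ai), S) for Ai in A)
B = (madd(C[0], C[1]), C[1])
B0 = (((0, 0), (0, 0)), ((0, 0), (0, 0)))
```

The $|W|^{\dim W}$ cost in Theorem 2.2 is exactly the size of the outer enumeration of $R$.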
Let $\alpha, \beta : U \times V \to W$ be bilinear maps of vector spaces over a finite field $\mathbb{F}$.
1. In time $\mathrm{poly}(\dim U, \dim V, |W|^{\dim W})$ one can decide if $\alpha, \beta$ are isotopic [BOW].
2. If $U = V$ and the characteristic of $\mathbb{F}$ is not 2, in time $\mathrm{poly}(\dim U, |W|^{\dim W})$ one can decide if $\alpha, \beta$ are pseudo-isometric [IQ].

The following theorem is the automorphism version of Theorem 2.1. Note that, unlike the case of graph isomorphism, for the problems here there are no known reductions from the isomorphism version to the automorphism version.
Theorem 2.3.
Let $\alpha : U \times V \to W$ be a bilinear map of vector spaces over a finite field $\mathbb{F}$.
1. In time $\mathrm{poly}(\dim U, \dim V, |\mathbb{F}|)$, one can compute a generating set for the group of principal autotopisms of $\alpha$ [BOW].
2. If $U = V$ and the characteristic of $\mathbb{F}$ is not 2, in time $\mathrm{poly}(\dim U, |\mathbb{F}|)$, one can compute a generating set for the group of autometries of $\alpha$ [BW].

Remark 2.4. A bilinear map $* : U \times V \to W$ can be encoded as a 3-dimensional array. Transposing that array allows us to swap the roles of $U, V, W$, for example creating a bilinear map $* : V \times U \to W$ or $* : W^\dagger \times V \to U^\dagger$, etc. (Here $U^\dagger$ is the dual space of $U$.) This swapping is functorial, and therefore isotopisms are permuted accordingly; cf. [BOW]. So while we highlight the situation for principal isotopisms, we could indeed specialize any one of the three spaces. We shall assume throughout that when necessary a bilinear map is shuffled.

The colored hypergraph algorithm
A high-level description of our algorithm to construct a colored hypergraph associated to a finite group was given in the introduction. We now provide the details; for convenient reference, an outline is given in Algorithm 1 below.
Algorithm 1
Colored Hypergraph
Input: a finite group $G$, and integers $g, k > 0$.
Output: a characteristic filter $\phi : \mathbb{N}^d \to \mathrm{Norm}(G)$ and a colored hypergraph $H^{(g,k)}_\chi(\phi)$ upon which $\mathrm{Aut}(G)$ acts as color-preserving automorphisms.
1: $\phi \leftarrow$ initial characteristic filter for $G$. (Section 1.2)
2: Repeat the following steps until $\phi$ stops changing (stabilizes):
 a: Build $H^{(g)}_\chi(\phi)$ on each layer of $\phi$. (Section 3.1)
 b: Extend $H^{(g)}_\chi(\phi)$ between layers of $\phi$. (Section 3.2)
 c: Apply $k$-dimensional Weisfeiler–Leman to $H^{(g)}_\chi(\phi)$. (Section 3.3)
 d: $S \leftarrow \{\mathrm{Aut}(G)$-invariant subgroups extracted from $\mathrm{WL}(k, H^{(g)}_\chi(\phi))\}$. (Section 3.4)
 e: Refine $\phi$ using $S$. (Section 3.5)
3: Return $\phi$ and $\mathrm{WL}(k, H^{(g)}_\chi(\phi))$.

For $s \in \mathbb{N}^d$, $L_s$ is a $\mathbb{Z}_p$-vector space, for some prime $p = p_s$, of dimension $d_s$. Recall that for any vector space $L$, $\mathrm{PG}(L)$ denotes the projective geometry of $L$, which we may think of as a poset whose elements are the vector subspaces of $L$, (partially) ordered by inclusion, and we use $\mathrm{PG}_k(L)$ to denote the set of $(k+1)$-dimensional subspaces. Let $L_s^* = \mathrm{Hom}(L_s, \mathbb{Z}_p)$ denote the set of linear maps from $L_s$ to $\mathbb{Z}_p$, i.e., the dual vector space of $L_s$. Then the map $X \mapsto X^* = \{\nu \in L_s^* : \nu(X) = 0\}$ is an order-reversing bijection $\mathrm{PG}(L_s) \to \mathrm{PG}(L_s^*)$. By the Fundamental Theorem of Projective Geometry, there is a bijective linear transformation $f_s : L_s \to L_s^*$ such that $X^* = f_s(X)$. Let $b_s : L_s \times L_s \to \mathbb{Z}_p$ be the bilinear form defined by $b_s(x, y) = f_s(y)(x)$. For $X \le L_s$, let $X^\perp = \{x \in L_s : b_s(x, X) = 0\}$.

The vertices and hyperedges of $H^{(g)}(\phi)$ are, respectively,
$$\mathcal{V} = \bigcup_{s \in \mathbb{N}^d} \mathrm{PG}_0(L_s), \qquad \mathcal{E} = \bigcup_{\substack{s \in \mathbb{N}^d:\\ \dim L_s > g}} \bigl( \mathrm{PG}_{g-1}(L_s) \cup \mathrm{PG}_{d_s - g - 1}(L_s) \bigr) \;\cup\; \bigcup_{\substack{s \in \mathbb{N}^d:\\ \dim L_s \le g}} \mathrm{PG}_{d_s - 1}(L_s). \tag{3.1}$$
(Recall that $L_s \cong \mathbb{Z}_{p_s}^{d_s}$.) To regard $X \in \mathrm{PG}_{d-1}(L_s)$ as a hyperedge, when convenient we identify the $d$-subspace $X$ with the set of points (1-spaces) it contains. The initial coloring is as follows.

• Vertices.
The initial color $\chi(v)$ of a vertex $v \in \mathcal{V}$ is simply the index $s$ of the layer $L_s$ such that $v \in \mathrm{PG}_0(L_s)$. (We note that in some cases it makes sense to consider a layer $L_s$ as being defined over a larger field $\mathbb{F}_{p^k}$, thus effectively reducing its dimension, and reducing the size of the hypergraph. In such cases, the map $f_s$ is only guaranteed to be semi-linear; that is, $f_s(a + b) = f_s(a) + f_s(b)$ but $f_s(\lambda a) = \alpha(\lambda) f_s(a)$, where $\alpha \in \mathrm{Gal}(\mathbb{F}_{p^k})$ is an automorphism of the field $\mathbb{F}_{p^k}$. This doesn't present any essential difficulties, but needs to be kept track of.)

• Hyperedges corresponding to subspaces of codimension $g$ (dimension $d_s - g$), when $\dim L_s > g$. The initial color $\chi(X)$ of these hyperedges $X \in \mathrm{PG}_{d_s - g - 1}(L_s)$ is determined by $s$ together with a set of labels indexed by pairs $t, u \in \mathbb{N}^d$ such that $t + u = s$, as follows: if $t \neq u$, the label of $X$ corresponding to the pair $(t, u)$ is the isotopism type of the projection $L_t \times L_u \to L_s \to L_s / X^\perp$; when $t = u$ it is the pseudo-isometry type of this projection.

• Hyperedges corresponding to subspaces of dimension $g$ (elements of $\mathrm{PG}_{g-1}(L_s)$), when $\dim L_s > g$. The initial color $\chi(X)$ of these hyperedges is determined by $s$ together with a set of labels indexed by $t \in \mathbb{N}^d$, $t \neq s$, as follows: the label of $X$ corresponding to $t$ is the isotopism type of the restriction of the bimap $L_s \times L_t \to L_{s+t}$ to $X \times L_t \to L_{s+t}$. (When the dimension is such that dimension-$g$ and codimension-$g$ subspaces coincide, this set of labels is appended to the set of labels for codimension-$g$ subspaces; the two sets of labels are kept separate by their indexing.)

• Hyperedges when $\dim L_s \le g$. In this case, there is only a single hyperedge $X$, corresponding to the entire layer $L_s$.
It is given a color that is similar to the previous two: namely, for each $t, u \in \mathbb{N}^d$ such that $t + u = s$, $\chi(X)$ gets a set of labels indexed by the pairs $(t, u)$, labeled by the isotopism type of $L_t \times L_u \to L_s$ (resp., the pseudo-isometry type if $t = u$), together with, for each $t \in \mathbb{N}^d$ (now including $t = s$), the isotopism (resp., pseudo-isometry) type of the bimap $L_s \times L_t \to L_{s+t}$.

Observe that one need not pre-compute all isotopism (resp. pseudo-isometry) types. Instead, one can generate labels on the fly by pairwise comparison. Namely, given a new hyperedge $X$ to label, test for isotopism (or pseudo-isometry) between $L_t \times L_u \to L_s / X^\perp$ and all distinctly labelled $L_t \times L_u \to L_s / Y^\perp$, introducing a new label for $X$ if necessary.

By Theorem 2.2, isotopism and pseudo-isometry of bilinear maps $U \times V \to W$ can be decided in time $\mathrm{poly}(\dim U, \dim V, |W|^{\dim W})$, and also (by Remark 2.4) in time $\mathrm{poly}(|U|^{\dim U}, \dim V, \dim W)$. (When $g = 2$, this can be decided very efficiently using the algorithm in [BMW1].) It follows that we can label all hyperedges in time $|G|^{O(g)}$. Note that if the characteristic is 2, then even for maps of the form $L_s \times L_s \to L_{s+s}$, we only use the isotopism label instead of the pseudo-isometry label, because the results of [IQ] are not yet known to extend to characteristic 2. While this is less refined information, it is still useful.

The colored hypergraph $H^{(g)}_\chi(\phi)$ described in the previous section already contains much local information from which global characteristic structure may be inferred, extracted, and used. However, we can often elucidate further characteristic structure by examining individual commutator relations between the layers. Of the various possible strategies one could try, we propose one that is both elementary and effective.

For each distinct pair $s, t \in \mathbb{N}^d$, add to $\mathcal{E}$ the following edges. For each $x \in L_s$, $y \in L_t$ such that $[x, y] = 0$ in $L_{s+t}$ (that is, $[x, y] \in \partial\phi_{s+t}$), we add an edge from $x$ to $y$.
For each $x, y$ which do not commute modulo $\partial\phi_{s+t}$, we add a hyperedge of size 3, connecting $x \in L_s$, $y \in L_t$, and $[x, y] \in L_{s+t}$. Upon refinement, this allows the vertex colors within each layer to affect the colors in the other layers.

Given a vertex-and-hyperedge-colored (hereafter just “colored”) hypergraph $H = (\mathcal{V}, \mathcal{E}, \chi)$, where $\chi : \mathcal{V} \cup \mathcal{E} \to C$ ($C$ a finite set of colors), we show here how to apply the $k$-dimensional Weisfeiler–Leman procedure $k$-WL, originally developed in the context of graphs independently by Babai–Mathon [B2] and Immerman–Lander [IL] (see [CFI] and [B1] for more detailed history). For the case of $k = 1$ (color refinement) applied to hypergraphs, the same procedure was proposed and studied in the very recent preprint by Böker [B1]. In particular, Böker shows that when we consider a graph as a (2-uniform) hypergraph, this procedure coincides with the usual color refinement procedure on graphs.

Let $\mathrm{WL}(k, H)$ denote the colored hypergraph resulting from applying $k$-WL to $H$. The two key properties we will need in our application of this procedure are: (1) $\mathrm{WL}(k, H)$ can be computed from $H$ in $|H|^{O(k)}$ time; and (2) if $H'$ is another colored hypergraph, then $H$ and $H'$ are isomorphic (as colored hypergraphs) iff $\mathrm{WL}(k, H)$ and $\mathrm{WL}(k, H')$ are isomorphic as colored hypergraphs. (In fact, the set of isomorphisms will be the same: $\mathrm{Iso}(H, H') = \mathrm{Iso}(\mathrm{WL}(k, H), \mathrm{WL}(k, H'))$.)

We find it simplest to describe the application of WL to hypergraphs by using instead their “incidence (bipartite) graphs.” We believe this bijection between vertex-and-edge-colored hypergraphs and vertex-colored bipartite graphs is essentially folklore; we include it here for completeness. Given a hypergraph $H = (\mathcal{V}, \mathcal{E})$, its incidence graph is the bipartite graph $I(H) = (V_L, V_R, E)$, where $V_L = \mathcal{V}$, $V_R = \mathcal{E}$, and $E = \{(v, e) \in \mathcal{V} \times \mathcal{E} : v \in e\}$.
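Passing between a hypergraph and its incidence graph is mechanical. A minimal sketch (our own illustration; hyperedges are stored as an indexed list so that the correspondence, including repeated hyperedges, is exact):

```python
def incidence_graph(V, E):
    """I(H): V_L = vertices, V_R = hyperedge indices, edges = incidences (v, e)."""
    VL = sorted(V)
    VR = list(range(len(E)))          # one right-hand vertex per hyperedge
    edges = {(v, j) for j, e in enumerate(E) for v in e}
    return VL, VR, edges

def hypergraph_of(VL, VR, edges):
    """I^{-1}: recover the hypergraph from its incidence graph."""
    E = [frozenset(v for (v, j) in edges if j == jj) for jj in VR]
    return set(VL), E

# round trip on a small hypergraph
V = {1, 2, 3, 4, 5}
E = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({5})]
VL, VR, edges = incidence_graph(V, E)
assert len(edges) == sum(len(e) for e in E)   # one bipartite edge per incidence
assert hypergraph_of(VL, VR, edges) == (V, E)
```

Both directions touch each incidence once, which is the $O(V + E)$ bound of Proposition 3.2.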
It is not hard to see that every bipartite graph arises from a unique hypergraph in this manner, so $I$ is a bijection and $I^{-1}$ is well-defined.

An isomorphism between two vertex-and-edge-colored hypergraphs $H_i = (\mathcal{V}_i, \mathcal{E}_i, \chi_i)$ ($i = 1, 2$) is a bijection $f : \mathcal{V}_1 \to \mathcal{V}_2$ such that (1) $f(\mathcal{E}_1) = \{f(e) : e \in \mathcal{E}_1\} = \{\{f(v) : v \in e\} : e \in \mathcal{E}_1\} = \mathcal{E}_2$, (2) $\chi_1(v) = \chi_2(f(v))$ for all $v \in \mathcal{V}_1$, and (3) $\chi_1(e) = \chi_2(f(e))$ for all $e \in \mathcal{E}_1$. We say that two vertex-colored bipartite graphs $G_i = (V_{L,i}, V_{R,i}, E_i, \chi_i : V_{L,i} \cup V_{R,i} \to C)$ ($i = 1, 2$) are isomorphic if there are bijections $f_L : V_{L,1} \to V_{L,2}$ and $f_R : V_{R,1} \to V_{R,2}$ such that (1) $f(E_1) = \{(f_L(u), f_R(v)) : (u, v) \in E_1\} = E_2$ and (2) $\chi_1(u) = \chi_2(f_L(u))$ for all $u \in V_{L,1}$ and $\chi_2(f_R(v)) = \chi_1(v)$ for all $v \in V_{R,1}$.

Proposition 3.2 (Folklore). Given two vertex-and-edge-colored hypergraphs $H_1, H_2$, there is a natural bijection between $\mathrm{Iso}(H_1, H_2)$ and $\mathrm{Iso}(I(H_1), I(H_2))$; in particular, $H_1$ is isomorphic to $H_2$ iff their vertex-colored bipartite incidence graphs are isomorphic. Furthermore, both $I$ and $I^{-1}$ can be computed in $O(V + E)$ time.

Proof sketch.
Notation as above. Given $\chi : \mathcal{V} \cup \mathcal{E} \to C$, a vertex-and-edge coloring on a hypergraph $H = (\mathcal{V}, \mathcal{E})$, we get a coloring on the vertices of $I(H)$, which we also denote by $\chi$ by abuse of notation. The coloring on $V(I(H)) = V_L \cup V_R$ is the same as before, since $V_L = \mathcal{V}$ and $V_R = \mathcal{E}$. The inverse is similar. The running time results from the fact that $H$ and $I(H)$ can essentially be described by identical underlying data structures.

We show the natural bijection between $\mathrm{Iso}(H_1, H_2)$ and $\mathrm{Iso}(I(H_1), I(H_2))$. Given an isomorphism $f : \mathcal{V}_1 \to \mathcal{V}_2$ from $H_1$ to $H_2$, we define an isomorphism $\hat{f}$ from $I(H_1)$ to $I(H_2)$ in the natural way: $\hat{f}(v) = f(v)$ for $v \in V_{L,1} = \mathcal{V}_1$, and for $e \in V_{R,1} = \mathcal{E}_1$ we define $\hat{f}(e) = f(e)$; that is, $\hat{f}(e)$ is the vertex in $V_{R,2} = \mathcal{E}_2$ which corresponds to the hyperedge $\{f(u) : u \in e\}$. To see that $\hat{f}$ is an isomorphism we must check that it preserves incidences and colors. For incidences, we have $(v, e) \in E(I(H_1))$ iff $v \in e$ (thinking of $v \in V_{L,1} = \mathcal{V}_1$ and $e \in V_{R,1} = \mathcal{E}_1$) iff $f(v) \in f(e)$ (since $f$ is an isomorphism of hypergraphs) iff $\hat{f}(v) = f(v) \in \hat{f}(e) = f(e)$, by the definition of $\hat{f}$. To see that the colors are preserved, for $v \in V_{L,1} = \mathcal{V}_1$ we have, by definition (and abuse of notation), $\chi_1(v) = \chi_2(f(v)) = \chi_2(\hat{f}(v))$, and for $u \in V_{R,1} = \mathcal{E}_1$ we have $\chi_1(u) = \chi_2(f(u)) = \chi_2(\hat{f}(u))$. The inverse construction of an isomorphism $H_1 \to H_2$ from an isomorphism $I(H_1) \to I(H_2)$ is essentially gotten by reading all the preceding equations in reverse. (Here $V = |\mathcal{V}|$ for hypergraphs and $|V_L| + |V_R|$ for bipartite graphs; $E = |\mathcal{E}|$ for hypergraphs and $|E|$ for bipartite graphs.)

Our $k$-WL procedure for hypergraphs, then, is to apply standard (graph) $k$-WL to $I(H)$, and then apply $I^{-1}$ to get back a refined colored hypergraph. Finally, we recall the $k$-WL procedure as applied to a vertex-colored graph.
If the graph is bipartite and we want to preserve the bipartition $(V_L, V_R)$—as in our setting—we assume that the vertices in $V_L$ have distinct colors from those in $V_R$. Given a vertex-colored graph $G = (V, E, \chi : V \to C)$, $k$-WL refinement is the following procedure. Each $k$-tuple of vertices $(v_1, \ldots, v_k)$ is initially assigned a color according to its colored, ordered isomorphism type; that is, two such $k$-tuples $(v_1, \ldots, v_k)$ and $(u_1, \ldots, u_k)$ are given the same initial color iff (1) $\chi(v_i) = \chi(u_i)$ for all $i = 1, \ldots, k$, (2) $v_i = v_j$ iff $u_i = u_j$ for all $i, j \in [k]$, and (3) $(v_i, v_j) \in E(G)$ iff $(u_i, u_j) \in E(G)$ for all $i, j \in [k]$. Two $k$-tuples $\vec{v} = (v_1, \ldots, v_k)$ and $\vec{u}$ are said to be $i$-neighbors if they are equal except that $v_i \neq u_i$. In each step of the refinement procedure, the coloring is refined as follows: the new color of a tuple $\vec{v}$ is a $k$-tuple of multisets, where the $i$-th multiset is the multiset of colors of all the $i$-neighbors of $\vec{v}$. At each stage, the coloring partitions $V^k$; the procedure terminates when this partition doesn't change upon further refinement. Once the coloring on $V^k$ has stabilized, we get a new coloring on $V = V(G)$ by defining $\chi'(v)$ for $v \in V$ to be the color of the diagonal $k$-tuple $(v, v, \ldots, v) \in V^k$. We denote the resulting colored graph by $\mathrm{WL}(k, G)$. From $G$, $\mathrm{WL}(k, G)$ can be computed trivially in time $n^{O(k)}$; the current best-known running time is still $O(k^2 n^{k+1} \log n)$ [IL, Section 4.9]. For more details on running time, implementation, and the properties of $k$-WL on graphs, see, e.g., [W1, WL, IL, AFKV, DGR].

Each color class of vertices of $\mathrm{WL}(k, H^{(g)}_\chi(\phi))$ provides (by lifting from $\phi_s / \partial\phi_s$ to $\phi_s$ along the natural projection) characteristic subsets of $G$, but not necessarily characteristic subgroups; it is only the latter which can be used to refine the filter $\phi$.
To get characteristic subgroups instead, we consider the subgroup generated by all the vertices in a given color class. We now write out this procedure more formally.

Let $\chi'$ denote the refined coloring function of $\mathrm{WL}(k, H^{(g)}_\chi(\phi))$. For each $s \in \mathbb{N}^d$, let $\chi'_s$ denote the restriction of $\chi'$ to the vertices in $\mathrm{PG}_0(L_s)$. For each color $c$ in the image of $\chi'_s$, let $X_{s,c} = \sum_{x \in \mathrm{PG}_0(L_s) :\, \chi'(x) = c} \langle x \rangle$ be the subgroup of $L_s$ generated by the elements that are colored $c$. Finally, let $\pi_s : \phi_s \to \phi_s / \partial\phi_s = L_s$ be the natural projection; we lift $X_{s,c}$ to a characteristic subgroup of $\phi_s$ (and hence of $G$) as $\pi_s^{-1}(X_{s,c})$.

Finally, the set of new characteristic subgroups we consider is
$$S = \{ \pi_s^{-1}(X_{s,c}) : s \in \mathbb{N}^d,\ c \in \mathbb{N} \} - \{ \phi_s : s \in \mathbb{N}^d \}. \tag{3.3}$$
If $S \neq \emptyset$, its members may be supplied to Theorem 3.4 to refine $\phi$, in which case step 3 of Algorithm 1 is repeated. If not, then our colored hypergraph is now stable and Algorithm 1 terminates.

One filter $\phi$ refines another filter $\psi$ on the same group if the image of $\phi$ contains that of $\psi$ (the image is the collection of all subgroups in the filter). If $\phi$ is a characteristic filter and $H$ is a characteristic subgroup such that $\partial\phi_s \le H \le \phi_s$ for some $s$, then $\phi$ can be refined to a characteristic filter that includes $H$. This was first introduced in [W3], and shown to be computable in polynomial time by Maglione [M3]:

Theorem 3.4 ([M3, Theorem 1]). Let $\phi$ be a filter on $G$, and $H \trianglelefteq G$ such that there exists $s \in \mathbb{N}^d$ with $\partial\phi_s < H < \phi_s$. Then a filter refining $\phi$ and including $H$ can be computed in polynomial time. Furthermore, if $\phi$ and $H$ are characteristic, then so is the refined filter.
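Over an elementary abelian layer $L_s \cong \mathbb{Z}_p^{d_s}$, the subgroup generated by a color class is just its $\mathbb{F}_p$-span, so the extraction of $X_{s,c}$ reduces to Gaussian elimination over $\mathbb{Z}_p$. A minimal sketch, with hypothetical names and sample data of our own:

```python
P = 3  # the prime p of the layer L_s (illustrative choice)

def span_basis(vectors, p=P):
    """Row-reduce over F_p; returns an echelon basis of the span."""
    rows = [list(v) for v in vectors]
    basis = []
    n = len(rows[0]) if rows else 0
    for col in range(n):
        piv = next((r for r in rows if r[col] % p != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        rows.remove(piv)
        inv = pow(piv[col], p - 2, p)     # inverse mod p (p prime)
        piv = [x * inv % p for x in piv]
        rows = [[(x - r[col] * y) % p for x, y in zip(r, piv)] for r in rows]
        basis.append(piv)
    return basis

def subgroup_of_color_class(colored_points, c, p=P):
    """X_{s,c}: a basis of the subgroup of L_s generated by the points colored c."""
    return span_basis([x for x, col in colored_points.items() if col == c], p)

# toy layer Z_3^3 with a coloring of three points
colored = {(1, 0, 0): 'red', (0, 1, 0): 'red', (2, 1, 0): 'red', (1, 1, 1): 'blue'}
```

Pulling the resulting subspace back along $\pi_s$ (not modeled here, since it needs the group itself) gives the characteristic subgroup $\pi_s^{-1}(X_{s,c})$ fed into Theorem 3.4.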
We proceed sequentially through the characteristic subgroups of $S$, refining $\phi$ as we go.

For part (i), let $s \in \mathbb{N}^d - \{0\}$. Observe that if Step 3(c) were omitted from Algorithm 1, then colors would only be assigned to hyperedges on points in fixed layers. In that case, moreover, the color of a hyperedge in layer $L_s$ is determined completely by pairs $t, u \in \mathbb{N}^d$ with $t + u = s$; the coloring function $\chi$ does not depend at all on the layers contained in $\partial\phi_s$. That is to say, if Step 3(c) is omitted, then $H^{(g)}_\chi(\phi)$ restricted to $N/\phi_s$ would be identical to the colored hypergraph based on $\phi$ truncated at $\phi_s$. Step 3(c) colors edges between layers using information from layers ‘lower’ in the filter; this means the restricted hypergraph is a refinement of the hypergraph on the truncated filter.

For part (ii), let $G$ and $G'$ be two finite groups. Suppose we first construct $H^{(g)}_\chi(\phi)$. Next, we construct $H^{(g)}_{\chi'}(\phi')$, introducing a new color for $\chi'$ only when it is new to both colored hypergraphs. Evidently, if $G \cong G'$, then $H^{(g)}_\chi(\phi)$ and $H^{(g)}_{\chi'}(\phi')$ are isomorphic with identical color sets.

Finally, we analyze the running time. Computing the Fitting subgroup $O_\infty(G)$ and the initial characteristic filter can be done in $\mathrm{poly}(|G|)$ time, even by naive algorithms (which can be improved significantly when $G$ is given by generating permutations, generating matrices, or black-box generators). Building the hypergraph $H^{(g)}_\chi(\phi)$ can be done in time linear in the number of hyperedges, which is the number of codimension-$g$ subspaces of each layer $L_s$, which is $\sim |L_s|^{O(g)}$, and thus in total is at most $|G|^{O(g)}$. The hyperedges can then be colored in $\mathrm{poly}(|G|) \times |G|^{O(g)} = |G|^{O(g)}$ time using the isotopism and isometry algorithms (Theorem 2.2). As with $k$-WL for graphs, $k$-WL for hypergraphs can be computed in $(|\mathcal{V}| + |\mathcal{E}|)^{O(k)}$ time, which in our case is $|G|^{O(gk)}$.
Extracting the characteristic subgroups from $\mathrm{WL}(k, H^{(g)}_\chi(\phi))$ can easily be done in $\mathrm{poly}(|G|)$ time, and refining the filter $\phi$ can then be done in $\mathrm{poly}(|G|)$ time as well [M3] (reproduced as Theorem 3.4 above). The only remaining question is how many times the main refinement loop can run. Because we only refine when a characteristic subgroup $K$ is found which lies strictly in between some $\phi_s$ and $\partial\phi_s$, and the indices $|\phi_s : K|$ and $|K : \partial\phi_s|$ are both at least 2, refinement can happen at most $\log |G|$ times. Thus the total running time is $|G|^{O(gk)} \log |G| = |G|^{O(gk)}$.

Our algorithm is not particular to the initial characteristic filter we choose. In any given group class, further characteristic subgroups (or subsets, or collections of subgroups) may be available which could be used to refine the filter, either at the beginning or in each iteration of the main loop of Algorithm 1. We give two examples here without much discussion, just to illustrate the concept, without detracting from the main foci of the paper.

First, it may be the case that some of the bimaps $L_s \times L_t \to L_{s+t}$ are defined over a field larger than $\mathbb{Z}_p$, i.e., $\mathbb{F}_{p^k}$ for some $k > 1$. If this is true for sufficiently many of the bimaps, we may be able to treat some layers $L_s$ entirely over $\mathbb{F}_{p^k}$, thus reducing their dimension by a factor of $k$, and reducing the number of vertices in the corresponding factor of the hypergraph by a factor of $k$ in the exponent (from $p^{k\ell}$ to $p^\ell$).

Second, as $G$ acts on $N$ by conjugation, and the layers of $\phi$ are $\mathrm{Aut}(G)$-invariant, for each $s \in \mathbb{N}^d$ we can compute a linear representation of $G/N$ on the elementary abelian layer $L_s := \phi_s / \partial\phi_s$. Using standard module machinery—for example, the version of the Meataxe algorithm described in [HR]—in time polynomial in $\log |G|$ each $G/N$-module may be decomposed first into indecomposable summands, and then into isotypic components. The collection of isotypic components is a characteristic subset of subgroups—namely, they can be permuted amongst themselves by the action of $\mathrm{Aut}(G)$, but that's it. We can either group these into $\mathrm{Aut}(G)$-orbits of isotypic components to get characteristic subgroups to refine the filters, or keep the characteristic subset of subgroups and incorporate it into Rosenbaum's composition series isomorphism technique, discussed in Section 4.

We examine the procedure with a toy example as follows. Consider the following alternating matrix tuple in $\Lambda(4, 3)^3$, which was also considered in [BOW]: $\mathbf{A} = (A_1, A_2, A_3)$, three explicit $4 \times 4$ alternating matrices over $\mathbb{F}_3$ whose entries are given in [BOW]. We construct a bipartite graph $G_{\mathbf{A}} = (L \cup R, E)$, where $L = \mathrm{PG}_0(\mathbb{F}_3^3)$ and $R = \mathrm{PG}_1(\mathbb{F}_3^3)$, so that for $v \in L$ and $U \in R$, $(v, U) \in E$ if and only if $v \in U$. In particular, note that $|L| = |R| = 13$. For each $v \in L = \mathrm{PG}_0(\mathbb{F}_3^3)$, we choose a non-zero vector on $v$ as its representative. So
$$L = \{(0,0,1), (0,1,0), (0,1,1), (0,1,2), (1,0,0), (1,0,1), (1,0,2), (1,1,0), (1,1,1), (1,1,2), (1,2,0), (1,2,1), (1,2,2)\}.$$
For each $U \in R = \mathrm{PG}_1(\mathbb{F}_3^3)$, since $U$ is a 2-dimensional subspace of $\mathbb{F}_3^3$, we choose one defining linear equation $u^*$, $u \in \mathbb{F}_3^3$, as its representative.
So the set of representatives for $R$ is likewise
$$R = \{(0,0,1), (0,1,0), (0,1,1), (0,1,2), (1,0,0), (1,0,1), (1,0,2), (1,1,0), (1,1,1), (1,1,2), (1,2,0), (1,2,1), (1,2,2)\}.$$
In this notation, $v = (v_1, v_2, v_3) \in L$ connects to $u = (u_1, u_2, u_3) \in R$ if and only if $v_1 u_1 + v_2 u_2 + v_3 u_3 = 0$.

For $v = (v_1, v_2, v_3)^t \in L$, we define $A_v = v_1 A_1 + v_2 A_2 + v_3 A_3 \in \Lambda(4, 3)$, and use the rank $\mathrm{rk}(A_v)$ to give $v$ its vertex color: red for rank 2, blue for rank 4. (The rank of an alternating matrix is always even.)

The first step of refinement uses the colors on the $L$ side to color the vertices on the $R$ side: each $u \in R$ receives the multiset of colors of the four points of $L$ incident with it. Three classes arise, which we call red, green, and blue. Note that these colors, which come from genus-1 information, already give the genus-2 isomorphism types.

The second step of refinement uses the colors of the $R$ side to recolor the vertices on the $L$ side. For example, $(1,1,1)$ on the $L$ side is adjacent to two blue and two red vertices on the $R$ side, so $(1,1,1)$ obtains the color “2 blues and 2 reds”, or 2B2R for short. We then let red stand for 3R1G, blue for 1R2G1B, and green for 2R2B, so that, e.g., $(1,1,1)$ is colored green. It can be checked that we reach a stable coloring after this step.

Note that these colors suggest the green points form a characteristic set. This characteristic set would generate the whole group, so it does not yield a non-trivial characteristic subgroup. However, this characteristic set is already interesting, because it does suggest that the Weisfeiler–Leman procedure, or even naive refinement, gives non-trivial information regarding group elements under the action of automorphisms. We discuss how to take advantage of such characteristic subsets in isomorphism testing in the next section.
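The toy refinement is easy to reproduce in code. In the sketch below, the three alternating matrices are our own stand-ins (the text uses the explicit tuple from [BOW]), so the particular color counts may differ; but the point-line structure of $\mathrm{PG}(2, 3)$, the rank-based initial coloring, and the alternating two-sided refinement are as described.

```python
from itertools import product

P = 3

def normalize(v):
    """Scale so the first nonzero coordinate is 1: canonical projective representative."""
    inv = pow(next(x for x in v if x), P - 2, P)
    return tuple(x * inv % P for x in v)

points = sorted({normalize(v) for v in product(range(P), repeat=3) if any(v)})
lines = points  # a line is stored by the coefficient vector u of its equation u . v = 0

def incident(v, u):
    return sum(a * b for a, b in zip(v, u)) % P == 0

# stand-in alternating 4x4 matrices over F_3 (hypothetical; chosen only to illustrate)
A1 = ((0, 1, 0, 0), (2, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0))
A2 = ((0, 0, 1, 0), (0, 0, 0, 1), (2, 0, 0, 0), (0, 2, 0, 0))
A3 = ((0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 2, 0))

def rank_mod_p(M):
    rows, rank = [list(r) for r in M], 0
    for col in range(4):
        piv = next((r for r in rows if r[col] % P != 0), None)
        if piv is None:
            continue
        rows.remove(piv)
        inv = pow(piv[col], P - 2, P)
        piv = [x * inv % P for x in piv]
        rows = [[(x - r[col] * y) % P for x, y in zip(r, piv)] for r in rows]
        rank += 1
    return rank

def A_v(v):
    return tuple(tuple((v[0] * A1[i][j] + v[1] * A2[i][j] + v[2] * A3[i][j]) % P
                       for j in range(4)) for i in range(4))

def partition(coloring):
    classes = {}
    for k, c in coloring.items():
        classes.setdefault(c, set()).add(k)
    return frozenset(frozenset(s) for s in classes.values())

def refine(colorL, colorR):
    """Alternate L -> R and R -> L multiset refinement until both partitions stabilize."""
    while True:
        newR = {u: (colorR[u], tuple(sorted(colorL[v] for v in points if incident(v, u))))
                for u in lines}
        newL = {v: (colorL[v], tuple(sorted(newR[u] for u in lines if incident(v, u))))
                for v in points}
        # canonical renaming keeps color names small and comparable across rounds
        rename = lambda d: {k: sorted(set(d.values())).index(c) for k, c in d.items()}
        newL, newR = rename(newL), rename(newR)
        if partition(newL) == partition(colorL) and partition(newR) == partition(colorR):
            return colorL, colorR
        colorL, colorR = newL, newR

colorL = {v: rank_mod_p(A_v(v)) for v in points}   # red/blue in the text: rank 2 or 4
colorR = {u: 0 for u in lines}
stableL, stableR = refine(colorL, colorR)
```

Since each new color contains the old one as a component, the partitions can only refine, so the loop terminates after at most $|L| + |R|$ rounds.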
Our colored hypergraph and filter constructions can be used to refine the composition-series isomorphism method of Rosenbaum and Wagner [RW], thereby speeding up the resulting isomorphism test. Here, we present an isomorphism algorithm which runs in poly(|G|) time if the filter output by Algorithm 1 with k, g ∈ O(1) also has "width" at most O(1) (defined below). Though we claim no asymptotic improvements in the worst case, we expect our test to perform well for many specific group classes, as well as for groups chosen randomly (including groups selected from the random model we discuss in detail in the following section). In practice, one should also apply Rosenbaum's bidirectional collision technique [R2] to get a square-root speed-up, but this causes no new technical difficulties.

In fact, the running time we get is n^{(1/2) width(φ_{g,k}) + O(1)} + n^{O(gk)}. We note that the largest that both g and the width can be is log n; if we allow g to be near-maximal (take kg = log n / log log n), and this results in a filter whose width is just slightly less than maximal, say O(log n / log log n), then the entire algorithm runs in time n^{O(log n / log log n)}, asymptotically beating the trivial algorithm by a log log n factor in the exponent. Because this is such a generous bound on kg and a weak desired outcome for the width, we expect this runtime to hold for many classes of groups. We begin with a simple version, building up to Theorem B in steps.

We start by recalling the composition-series isomorphism technique of Rosenbaum and Wagner [RW], and show the simplest way to incorporate our characteristic filter into that technique. (Recall that, although we are not using the colored hypergraph here directly, it contributed to the construction of the filter.) Composition Series Isomorphism is the following problem: given two groups
G, H, and a composition series of each, 1 ⊴ G_1 ⊴ ⋯ ⊴ G_m = G and 1 ⊴ H_1 ⊴ ⋯ ⊴ H_m = H, decide whether there is an isomorphism φ : G → H such that φ(G_i) = H_i for all i = 1, …, m. Rosenbaum and Wagner [RW] show how to reduce p-group isomorphism to Composition Series Isomorphism, and then how to reduce the resulting Composition Series Isomorphism problem to Graph Isomorphism on graphs of degree at most p + O(1); Rosenbaum more generally showed how to reduce GpI to Composition Series Isomorphism in n^{(1/2) log n + o(log n)} time. Luks [L1] showed how to solve Composition Series Isomorphism in poly(n) time.

Recall that the socle series of a group G is defined as follows: the socle Soc(G) is the subgroup generated by all minimal normal subgroups, and Soc(G) is always a direct product of simple groups. We then recursively define Soc_{i+1}(G) to be the preimage of Soc(G/Soc_i(G)) in G; that is, if π_i : G → G/Soc_i(G) is the natural projection, then Soc_{i+1}(G) = π_i^{−1}(Soc(G/Soc_i(G))). The reduction is to pick a composition series for G that is compatible with its socle series, and then to try all possible composition series for H compatible with its socle series. One of the keys to their running time is to show that the number of composition series compatible with the socle series is bounded by n^{(1/2) log n}.

Within O_∞(G), we refine the socle series with our characteristic filter. Without loss of generality, we may assume that the restriction of our characteristic filter φ to the solvable radical O_∞(G) refines the socle series of O_∞(G); if it does not originally, we may further refine it using the socle series, then iterate the main loop of Algorithm 1 until it stabilizes again. Our algorithm here is to reduce to Composition Series Isomorphism, but to consider only composition series that are compatible both with our filter φ and with the socle series. If the filter has many small layers, this will cut down the number of composition series that need to be considered, thus reducing, for such groups, the dominant factor in the running time of [RW, R2, R3].

To illustrate the potential savings, we define the width of a filter φ with elementary abelian layers to be the maximum dimension of any layer:

width(φ) := max_s dim_{p_s}(φ_s/∂φ_s).

Then we have:
Theorem 4.1.
Let N be a solvable group of order n, and let φ_N be a characteristic filter on N computable in time t(n). Then isomorphism of N with any group can be tested, and an isomorphism found, in time n^{(1/2) max_P width(φ_P) + O(1)} + t(n).

In particular, using the characteristic filter φ_{g,k} output by Algorithm 1 with parameters g and k, isomorphism of solvable groups can be solved in time n^{(1/2) width(φ_{g,k}) + O(1)} + n^{O(kg)}.

Proof.
The outline of the algorithm follows Rosenbaum–Wagner [RW], also using Luks's polynomial-time algorithm for Composition Series Isomorphism [L1]; the key difference here is that we consider only composition series which refine our characteristic filter φ, rather than the more general composition series of Rosenbaum and Wagner. The runtime of their algorithm is the product of the time to enumerate the desired composition series and the time to solve Composition Series Isomorphism. Our improvement is in the first factor, so we need only calculate the number of composition series of N compatible with φ_{g,k}. In our case, we must first compute the filter φ, which takes time t(n).

Let M be a second solvable group. Enumerating the composition series of M compatible with φ_M can be achieved as follows. Go through s ∈ N^d in lexicographic order, starting with the lexicographically largest s such that φ_s ≠ 1. Within each layer L_s = φ_s/∂φ_s we choose all possible composition series. By [RW, Lemma 3.1], this can be done in time |L_s|^{(1/2) log_{p_s} |L_s|} ≤ |L_s|^{(1/2) width(φ)}. Taking the product over all layers, we get a bound of |M|^{(1/2) width(φ)}. For each such composition series, we then use Luks's poly(|M|)-time algorithm for Composition Series Isomorphism, yielding the stated result.

For the "in particular," we compute φ_{g,k} using Algorithm 1, which takes n^{O(gk)} time.

The vertex coloring of the hypergraph H^{(g,k)}_χ(φ) may inform us of characteristic subsets that are not subgroups. Although the filter has been refined as much as possible (in particular, any one of the color classes of the hypergraph in a given layer L_s must generate the whole layer), we can nonetheless take advantage of these characteristic subsets in the preceding algorithm, by further restricting the composition series that we need to consider.

Towards this end, for each layer L_s let C_s denote the smallest color class in L_s, and define the color ratio of a layer L_s as |L_s|/|C_s|. Finally, define the color ratio of a solvable group N as

color-ratio(N) := ∏_{s ∈ N^d} color-ratio(L_s) = ∏_s |L_s|/|C_s| = |N| / ∏_s |C_s|.

We now restate (a slightly refined version of) Theorem B:
Theorem B (Refined). Let N be a solvable group of order n. Let φ = φ_{g,k} and H^{(g,k)}_χ(φ) be the filter and colored hypergraph for N output by Algorithm 1 with parameters g, k. In each layer L_s, let C_s denote the smallest color class. Then isomorphism of N with any group can be tested, and an isomorphism found, in time

∏_{s ∈ N^d} min{|L_s|^{1/2}, |C_s|}^{width(φ_{g,k})} · poly(n) + n^{O(gk)} ≤ (n / color-ratio(N))^{width(φ_{g,k})} · poly(n) + n^{O(gk)}.

Proof of Theorem B.
The outline of the algorithm is the same as in Theorem 4.1; the key difference is how we enumerate composition series within each layer L_s (and how many we enumerate). To see how to take advantage of the size of the smallest color class C_s ⊆ L_s, we must recall the details of Rosenbaum and Wagner's Lemma 3.1 [RW] on enumerating composition series. In the algorithm of Theorem 4.1 above we have already taken care of the ordering of the layers, so the only difference here is in how we enumerate the part of the composition series within each layer L_s. That is, we may assume that we have already built a composition series of ∂φ_s, which we now want to extend to a composition series of φ_s. Since the subgroup generated by C_s would be a characteristic subgroup of L_s, and φ has already been refined according to the coloring χ on H, it must be the case that C_s generates all of L_s. Thus we may select only those composition series in which the generator of each step comes from C_s. Since any generating set (and hence any composition series) for L_s has size log_{p_s} |L_s| ≤ width(φ), the number of composition series in which all the generators are chosen from C_s is bounded by

|C_s| (|C_s| − 1) ⋯ (|C_s| − log_{p_s} |L_s| + 1) ≤ |C_s|^{width(φ)}.

This analysis already gives the second bound in the statement of the theorem. To get the more refined bound, within each layer L_s, if |C_s| < |L_s|^{1/2}, then we employ the above strategy, and otherwise we use the |L_s|^{(1/2) log_{p_s} |L_s|} ≤ |L_s|^{(1/2) width(φ)} strategy from Rosenbaum–Wagner [RW].

Finally, we give a version of the individualize-and-refine paradigm from Graph Isomorphism, as applied to composition series that are compatible with our filter and colored hypergraph. The algorithm is similar to that of the previous section, except that now, each time we pick a subgroup in our composition series, we give a new color to the corresponding vertex in our hypergraph, and then run more iterations of the main loop of Algorithm 1 until the filter and hypergraph again stabilize, before we pick the next subgroup in our composition series. This can potentially have the effect of reducing the width of the layers and/or the size of the smallest color class in each layer as we go.

In somewhat more detail: compute the filter φ and colored hypergraph H^{(g,k)}_χ(φ) as before. We build up a composition series in G and simultaneously keep a list of partial composition series in H that we want to test for isomorphism at the end. Suppose we are at the point where we have already built a composition series up to ∂φ_s in G, and we have a list L of composition series up to ∂φ_s in H. Then we extend the partial composition series of G by picking an element of C_s (the smallest color class in L_s). We then color the corresponding vertex in H a new color, and refine both H and φ until stabilization (as in the main loop of Algorithm 1). Within H, we try each element of C_s in turn, refining the filter and hypergraph for H. If for any x ∈ C_s(H) the refinement does not agree with the refinement we got in G, we throw it away; otherwise, we extend our composition series for H by the subgroup generated by x and ∂φ_s, and add this new partial composition series to our list L. This comes at a multiplicative cost of |C_s|. We then continue this process within the (potentially new, smaller) C_s until we get a composition series that includes all of φ_s.
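The loop just described is the classical individualize-and-refine pattern, transplanted to composition series. The generic sketch below shows the pattern on an ordinary colored graph (illustrative code with our own naming; in the algorithm above the naive refinement is replaced by the main loop of Algorithm 1 on the hypergraph):

```python
def refine(adj, colors):
    """Naive color refinement (1-WL): refine the coloring until stable."""
    while True:
        sig = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
               for v in adj}
        names = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: names[sig[v]] for v in adj}
        if len(set(new.values())) == len(set(colors.values())):
            return new  # no class split this round: stable
        colors = new

# A 6-cycle is vertex-transitive, so refinement alone cannot split it.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
stable = refine(adj, {v: 0 for v in adj})

# Individualize: give vertex 0 a fresh color, then refine again.
indiv = dict(stable)
indiv[0] = max(stable.values()) + 1
split = refine(adj, indiv)
print(sorted(split.values()))  # four classes, grouped by distance from 0
```

On the 6-cycle, plain refinement cannot split the vertices, but individualizing one vertex and re-refining splits them into four classes by distance from it; this is exactly the extra leverage the algorithm above seeks each time it individualizes a subgroup choice.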
The total multiplicative cost within the layer L_s is thus at most |C_s| (|C_s| − 1) ⋯ (|C_s| − log_{p_s} |L_s| + 1) ≤ |C_s|^{width(φ)}, so this at most squares the total running time from Theorem B. Thus, asymptotically, we get a similar worst-case upper bound. We could state a more refined upper bound along the lines of Theorem B, but the definitions involved are somewhat delicate and recursive (because they depend on how the width and the color-ratio change as the algorithm progresses). Nonetheless, in practice, we expect this individualize-and-refine technique to perform much better, as the layers and color classes should decrease in size as the algorithm progresses.

Inspired by a suggestion of A. Mann [M4, Question 8] (answered in [KTW]), we describe here a model for random finite groups. We first give a simplified model that samples only random finite nilpotent groups. Later we extend this to sample solvable, semisimple, and general finite groups.
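As a preview of the simplified nilpotent model described next, the following sketch samples ℓ uniformly random upper unitriangular matrices over Z/b (uniform entries being one particular choice of the distribution µ) and computes the group they generate by closure. All names here are ours, and the parameters are kept tiny so that the closure is feasible.

```python
import random

def rand_unitriangular(d, b, rng):
    """Uniformly random upper unitriangular d x d matrix over Z/b."""
    return tuple(tuple(1 if i == j else (rng.randrange(b) if j > i else 0)
                       for j in range(d)) for i in range(d))

def mat_mul(A, B, b):
    """Multiply two square matrices with entries reduced mod b."""
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) % b
                       for j in range(n)) for i in range(n))

def generated_group(gens, b):
    """Closure of the generating set under multiplication (tiny cases only)."""
    n = len(gens[0])
    identity = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    seen, frontier = {identity}, [identity]
    while frontier:
        new = []
        for g in frontier:
            for h in gens:
                x = mat_mul(g, h, b)
                if x not in seen:
                    seen.add(x)
                    new.append(x)
        frontier = new
    return seen

rng = random.Random(2019)
U = generated_group([rand_unitriangular(3, 2, rng) for _ in range(2)], b=2)
# |U| divides |U_3(Z/2)| = 2^(3*2/2) = 8, by Lagrange's theorem.
print("sampled |U| =", len(U))
```

For d = 3 and b = 2 the ambient group U_3(Z/2) has order 2^3 = 8, so the sampled subgroup's order divides 8; the structural constraints discussed next (nilpotence, Burnside's Basis Theorem) likewise hold for every sample.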
As a first approximation we choose ℓ random upper unitriangular (d × d)-matrices u_1, …, u_ℓ over the integers modulo a fixed positive integer b. The u_i are drawn according to a fixed distribution µ(d, b). Later we shall discuss the effect of µ on the group theory, but first we survey the possible outcomes.

An immediate observation is that U = ⟨u_1, …, u_ℓ⟩ is a subgroup of the full group of upper unitriangular matrices. Therefore, U is nilpotent of order at most b^{d(d−1)/2}. In particular, if U_p denotes the Sylow p-subgroup of U, then U = ∏_{p | b} U_p. The choice of ℓ generators also leaves a fingerprint within the structure of our groups U: by Burnside's Basis Theorem, for each p | b, |U_p : [U_p, U_p]U_p^p| ≤ p^ℓ. Thus, a certain amount of structure is fixed by the choice of parameters (d, b, ℓ). Nevertheless, the coverage asserted in Theorem C shows the diversity of these groups.

5.2 General model
To sample a more general class of groups, we add terms to the block diagonal. Sampling random invertible square matrices would almost always generate the entire general linear group; as noted in Section 1, a more nuanced approach is called for. Our strategy is as follows:

(a) Add solvable groups by selecting any matrix that is diagonalizable over the algebraic closure. We call this a random toral subgroup.

(b) From the classification of finite simple groups we can select at random, according to a fixed distribution, a non-abelian finite simple group T, and let S be a (possibly trivial) central extension of an almost simple group, with T ≤ S/Z(S) ≤ Aut(T). Then we form the group algebra A = (Z/b)⟨S⟩. We then sample from the minimal left ideals I of A. This defines a linear representation ρ : T → End(I), where I is a Z/b-module. It is a straightforward exercise to see that endomorphisms of finite modules are representable as chequered matrices. We copy the image of a generating set for S into chequered matrices, and then place this on the block diagonal. We repeat until we exceed a bound on d.

(c) Add permutations to the block diagonal on any two terms with isomorphic representations.

(d) As a final step we sample block upper unitriangular matrices.

It is important to proceed in this order to avoid redundant choices. The variability of the simple modules represented on the block diagonal is again controlled by the distribution, and that can have substantial impact on the resulting group.

Proposition 5.1.
The class of groups sampled includes the groups P ≀ (S_1 × ⋯ × S_s) ⋉ U, where P is a permutation group, each S_a is a central extension of an almost simple group, U is sampled as above, and L(U) is a ∏_a T_a-module.

For (i), consider matrices of the form u_ij = I + a_ij E_ij. If 1 ≤ i < d/2 ≤ j ≤ d, then all such u_ij commute and are independent. So fix a divisor chain e_1 | ⋯ | e_s | b and coefficients a_ij (in some index order) having additive order e_{im+j}; it follows that these u_ij generate an abelian group with the specified invariants.

For (ii)–(iii), let R_v : U → W, where R_v(u) = u ∗ v, represented as a matrix. Let Ū be a representation of U as in (i), and likewise W̄ for W. Then Bh(∗) and Br(∗) are realized as block upper unitriangular matrix groups with identity diagonal blocks I_r and I_s, whose off-diagonal blocks range over u ∈ Ū, v ∈ V, w ∈ W̄ together with the maps R_v (respectively R_u).

For the count, note that it suffices to count the number of distinct bilinear maps ∗ : U × V → W. As Higman demonstrates [H2], there are p^{dim U · dim V · dim W} / |GL(U) × GL(V) × GL(W)| ∈ p^{Θ(n)} such maps.

Finally, for a given list of groups sampled in smaller dimensions, form block-diagonal representations. This affords the direct product of the list. For subdirect products, take a subgroup of the block-diagonal group. This completes the proof of Theorem C.

If we sample dense matrices, we call the result the dense random subgroup model for the general linear group GL(d, F_p). While this is an easy model to reason about, it is also fairly rigid, as the following result illustrates.

Theorem 5.2. If u_1, …, u_ℓ are chosen uniformly at random from the group of upper unitriangular matrices U_d(Z/b) and ℓ ∈ Ω(√d), then Pr(|⟨u_1, …, u_ℓ⟩| = b^{ℓ + (d−1)(d−2)/2}) → 1.

In fact we shall prove the following stronger claim: with high probability, such groups ⟨u_1, …, u_ℓ⟩ contain the commutator subgroup of U_d(Z/b). Although the groups in this model range over many isomorphism types, one does not see much variability in coarse isomorphism invariants such as group order, numbers of subgroups or quotients, conjugacy classes, and so forth.

Our proof of Theorem 5.2 relies on some details of Sims' proof of the asymptotic upper bound on the number of isomorphism types of p-groups [S]. It begins as follows. For a group G, let γ_i(G) be the i-th term of the lower central series. Every p-group G has a subgroup H ≤ G such that γ_2(H)γ_3(G) = γ_2(G) and where d(H), the least number of group elements needed to generate H, is minimal with that property [BNV, Proposition 3.8]. We call d(H) the Sims rank of G.

Definition 5.3. A Sims subgroup of a nilpotent group G is a subgroup H ≤ G minimal with respect to γ_2(H)γ_3(G) = γ_2(G). The Sims rank of G is the minimum number of generators needed to generate a Sims subgroup.

Fix G = U(d, k), V = G/γ_2(G) ≅ F^{d−1}, and W = γ_2(G)/γ_3(G) ≅ F^{d−2}. Then there is a bimap ∗ : V × V → W given by commutation:

[γ_2(G)x, γ_2(G)y] ≡ [x, y] (mod γ_3(G)). (5.4)

Fix a subgroup H of G, and put U = Hγ_2(G)/γ_2(G). Observe that H is a Sims subgroup if, and only if, [U, U] = [V, V]. Also observe that, after taking natural bases for F^{d−1} and F^{d−2}, the bimap ∗ can be represented as follows. Let [ , ] : F^{d−1} × F^{d−1} → F^{d−2} be defined, in parametrized form, by [u, v] = uBv^t, where B is the tridiagonal (d−1) × (d−1) matrix with B_{i,i+1} = f_i, B_{i+1,i} = −f_i, and all other entries zero. (5.5) That is, B can be understood as a 3-tensor of size (d−1) × (d−1) × (d−2) whose i-th frontal slice is given according to f_i.

Proof of Theorem 5.2. Our approach is to show that a subgroup generated by enough elements is a Sims subgroup. To do this it suffices to show that, for most sufficiently large dimensions, the bilinear map of (5.5) has the property that most X ≤ F^{d−1} satisfy [X, X] = F^{d−2}. For notation we let V = F^{d−1} with basis {e_1, …, e_{d−1}} and W = F^{d−2} with basis {f_1 = [e_1, e_2], …, f_{d−2} = [e_{d−2}, e_{d−1}]}. If X ≤ V is the row span of the full-rank (s × (d−1))-matrix M, then

(MBM^†)_{ij} = Σ_{k=1}^{d−2} (M_{ik} M_{j(k+1)} − M_{i(k+1)} M_{jk}) f_k.

This defines a natural 3-tensor of size s × s × (d−2) by m_{i,j,k} = M_{ik} M_{j(k+1)} − M_{i(k+1)} M_{jk}. Notice [X, X] = W if, and only if, ⟨Σ_k m_{ijk} f_k : 1 ≤ i, j ≤ s⟩ = W. That is, if we flatten the tensor into an (s² × (d−2))-matrix ˜m, as follows,

˜m_{(s(i−1)+j), k} = M_{ik} M_{j(k+1)} − M_{i(k+1)} M_{jk} = det [[M_{ik}, M_{i(k+1)}], [M_{jk}, M_{j(k+1)}]];

then we are asking that ˜m be of full rank. Now we argue that for s > √d this is the expected behavior.

By our model, each entry in M is drawn independently at random. However, the entries of m (and therefore ˜m) are dependent. Nevertheless, we can observe that the values of m_{ijk} are almost independent of k. Certainly m_{ijk} is independent of m_{ijk′} if |k − k′| > 1. Also, if k′ = k + 1 and m_{ijk} = 0, then nothing can be said about m_{ij(k+1)}. Even if m_{ijk} ≠ 0, it may be impossible to predict m_{ij(k+1)}; the exception is when M_{i(k+1)} = 0 = M_{j(k+1)}. So, apart from such pairs of zeros, there is a 1/q chance of dependence. Each such dependency can be compensated for by adding a row j′ such that M_{j′(k+1)} ≠ 0, so that m_{ij′k} is independent of m_{ij′(k+1)}. Since a given entry is non-zero with probability 1 − 1/q > 1/2, with O(√d) rows we obtain with high probability a matrix M whose associated matrix ˜m is of full rank.

For added variation a different distribution is required, one which favors sparse matrices. Fix positive integers b and d. Let wt(u) be the number of non-zero values in the upper unitriangular matrix u. Fix a distribution µ on Z/b − {0} and a distribution ν on {1, …, d(d−1)/2}. Define a (b, d, µ, ν)-random triangular matrix as a function α from the pairs {i, j} ⊆ {1, …, d} to Z/b, sampled so that |supp α| = k with probability ν(k), and, for each {i, j} ∈ supp α, α_{i,j} is sampled according to µ. Notice that α uniquely determines an upper unitriangular matrix:

u(α) = I_d + Σ_{i=1}^{d} Σ_{j=i+1}^{d} α_{ij} E_{ij}. (5.6)

The distribution ν describes how large the support of α is expected to be, and µ describes which non-zero values in Z/b will be used as entries. Finally, define a (µ, ν)-random unitriangular group as the group generated by ℓ independently sampled upper unitriangular (d × d)-matrices over Z/b with the (µ, ν)-distribution.

The precise outcomes of this distribution appear intricate. Through some empirical testing (e.g. Figure 1.2) we have arrived at the following question:

If ν(|A|) → 0 for |A| > C, does log |⟨u_1, …, u_ℓ⟩| approach a discrete Gaussian distribution on {1, …, d(d−1)/2}?

Our model makes several constraining choices in order to avoid the redundancies that would otherwise create rather similar groups.
The cost of this is that we can so far only offer heuristic explanations for the behavior. Even so, we explain what we understand and encourage a thorough exploration in the future.

The first question is what to expect the length of the block diagonal of U to be. Suppose we assume that the block diagonal is chosen uniformly as a partition of d. By Vershik's theorem [V], the shape of the tableau of a random partition of d with at least √d terms tends to O(e^{−t}). That implies that there are relatively few large blocks, as those lie in the tail of the distribution; thus there would be many blocks of small size. This, however, requires one to justify that sampling U at random samples partitions of d uniformly at random, which need not be the case. So we ask:

Is the typical sparsely sampled group ⟨u_1, …, u_ℓ⟩ convex (tending toward the middle) or concave (tending away from the middle)?

The answer to this speaks to the expected nilpotence class of the groups U. The length of the block diagonal is a bound on the nilpotence class. For example, if there are just two blocks, then U lying in the group of block matrices [[I_a, ∗], [0, I_b]] implies that U is abelian. In general, if F denotes the subspace flag determining the block structure of U, then the nilpotence class of U is at most |F| − 1. We next consider how subgroups may fail to be Sims subgroups.

Lemma 5.7.
Fix an alternating bimap [ , ] : V × V → W with W = [V, V]. Let π_1, …, π_{d−2} be a basis of W^*, and define (u, v)_i = π_i[u, v]. For X ≤ V, if there exists an i such that (X | X)_i = 0, then [X, X] ≠ W.

Proof. If (X | X)_i = 0, then for u ∈ W with π_i(u) = 1, u ∉ [X, X].

Now here is the situation. The maps ( | )_i : V × V → K are alternating bilinear forms, possibly degenerate. The subspaces X ≤ V with (X | X)_i = 0 are what are known as totally isotropic. The number of maximal totally isotropic subspaces of V is q^{O(m)}, where m = dim V − dim{v : (v | V) = 0}. Therefore, the smaller the radical, the larger the number of totally isotropic subspaces, and hence the less likely it is that a subspace X generates W. So as we move towards bimaps for unipotent hulls of flags of fixed length at least 3, the commutator involved will have quotients onto alternating forms with large numbers of totally isotropic subspaces. Thus more subspaces will fail to generate W, and as a result fewer subgroups will be Sims subgroups. This, however, is only a crude guide to the number of Sims subgroups, and we encourage an actual analysis with better insights.

So now let us consider the effects of refinement in our random model. Our proof is in two parts: either our unipotent group U has a long block diagonal series, or it has bounded class. In the former case we reduce the refinement analysis to a result of Maglione [M3]; in the latter case we appeal to classical results on nonsingular products. In either case we discover refinements. We aim to prove Theorem D.

Refinements for many blocks. First let us consider groups with many blocks.
Theorem 5.8.
The refinement length of a random subgroup U ≤ U(d, p) is on average Ω(ℓ²), where ℓ is the length of its generalized eigenspace flag.

Primarily we want to appeal to the following. Note that in this case we do not apply the Weisfeiler–Leman procedure developed in this paper; instead, it will be used in the next setting.
Theorem 5.9 (Maglione [M2]). The group U(d, p) has an (adjoint) characteristic filter refinement of length Θ(d²).

However, we do not have the group U(d, p); instead, we have a subgroup sampled at random, with either dense or sparse matrices. First we dispense with the dense case.

Corollary 5.10.
A subgroup H ≤ U(d, p) generated by dense matrices u_1, …, u_ℓ with ℓ > √d has on average a characteristic filter refinement of length Θ(d²).

Proof. By Theorem 5.2, H is almost certainly a Sims subgroup of U(d, p), and therefore [H, H] = [U(d, p), U(d, p)]. As a scholium to Maglione's theorem we observe that the adjoint filter refinement of U(d, p) can be defined by refining through the terms L_s with s < (2, 0, …, 0) in the filter. As a result these same terms appear in the filter of H, and so H refines to a length of Ω(d² − d) = Ω(d²). Since log_p |H| ∈ O(d²), the result follows.

Next we need to consider the sparse case, as this is where our model resides. What we do is demonstrate a form of Morita condensation theory that transports our sparse problem into a dense problem [W4]. What we observe is that each right-hand edge j of a block on the block diagonal of U is defined by the presence of an element u ∈ U with a non-zero value u_ij; otherwise the block would be narrower. We select one such row i_s and one such column j_s for each block s. Thus, out of the (ℓ × ℓ)-block matrix u ∈ U, we create an (ℓ × ℓ)-matrix by copying the entry u_{i_s j_s} from each block.

This may seem a bit unnatural, but in fact it applies a functorial property not on the level of groups but on the level of the enveloping algebra of the matrices and, more importantly, on the level of bilinear maps. While this function has no direct meaning in the context of groups, by considering the associated ring context we see that we have simply performed a condensation of modules; that is, we have changed to an equivalent category. So for each of the blocks B_1, …, B_ℓ we let e_s be the (d_s × d_s)-matrix with zero in every position except (j_s, j_s). Set e = e_1 ⊕ ⋯ ⊕ e_ℓ. Then eue is a matrix with at most ℓ × ℓ non-zero entries, and removing the all-zero rows and columns produces an (ℓ × ℓ)-matrix.
In particular this induces a functorial Morita condensation of each bilinear map L_s × L_t → L_{s+t}; see [W4]. We therefore denote this group by eUe, to remind us of the natural process used to create this smaller matrix group. Having applied this transform, notice that eUe is now a dense subgroup of U(ℓ, p). Therefore we arrive at the following.

Proof of Theorem 5.8.
Suppose U ≤ U(d, p) is generated by random matrices u_1, …, u_t. If the u_i are dense, then by Corollary 5.10 there is a computable filter refinement of length Ω(ℓ²), where ℓ is the number of blocks of U. If on the other hand the u_i are sparse, then eUe has a refinement of length Ω(ℓ²). As the map U → eUe is functorial in the bilinear maps used to select refinements, it follows that U also has a refinement of length Ω(ℓ²).

Refinements for few blocks.
The last case to concern us is when U has a bounded number of blocks on the diagonal, but the number of blocks is at least 3. (Otherwise the group U is abelian, which is the first case of Theorem D.) Because the number of blocks is bounded, at least one block has dimension proportional to d as d → ∞.

Let us consider coloring with g = 1. This means that, within a selected layer L_s × L_t → L_{s+t}, we consider labels on 1-dimensional subspaces ⟨x⟩ ≤ L_s by labeling the restriction ⟨x⟩ × L_t → L_{s+t}. One observes that this structure is nothing more than a linear transformation L_t → L_{s+t}, and is thus determined, up to change of basis, solely by the rank of the transformation. Therefore to each element of PG(L_s) we assign the rank of the associated matrix. We do likewise with PG(L_t). Finally, we label the edges between PG(L_s) and PG(L_t) by whether or not the pair of points commutes.

In order to model this behavior in colors we make the following observation. Treating x = (x_1 : ⋯ : x_d) as a homogeneous point in d variables, the evaluation [x, −] : L_t → L_{s+t} produces a matrix M(x) with entries in F[x_1, …, x_d]. The rank of this matrix changes as we evaluate x, but there are two natural states: either M(x) has rank less than r, or it does not. If M(x) has rank less than r, then all (r × r)-minors must vanish, and this produces polynomially many degree-r polynomials that must all vanish at x. That is to say, the rank condition on M(x) defines a variety (or, more generally, a scheme). It is in fact a determinantal variety, the subject of considerable study in the algebraic geometry community as well as the computer science community [H1, FSEDS]. It is important to observe that many results in this area are known only over algebraically closed fields. However, it is known that these varieties are reduced and irreducible [H1]. Therefore, to count points we can use the Lang–Weil theorem [LW1], but that requires that we allow for a large field. So this
So thisportion of our estimate assume b → ∞ and d → ∞ .Let us assume for now that M ( x ) has points, i.e. that for some x ∈ PG ( L s ), [ x, − ] does nothave full rank, and for other points it does. Thus our vertex set has (at least) 2 colors, say white if[ x, − ] has full rank and black otherwise. We do the same for PG( L t ). Recall that we are includinga hyperedge ( x, y, [ x, y ]) only if [ x, y ] = 0.Now consider the situation. The number of black points is in general a solution to a system ofrandom nonlinear homogeneous polynomials of degree r . That this is nonlinear means we can expectthat the number of black points is not a subspace. Now the points in PG( L t ) not connected to blackpoints x are the points y ∈ x ⊥ := ker[ x, − ]. In particular we have a nonlinear set parameterizing asubspace arrangement within PG( L t ). If we write the generator matrix of each subspace ker[ x, − ]it will be the dual of a linear combination of the matrices used to define [ , ], which we sampled atrandom. Therefore we have a random subspace arrangement. In general this incidence relation isnot equitable, so proper refinements will be discovered in the WL-refinement process.With that we have proved the following. 28 heorem 5.11. If U U ( d, p ) has a bounded number of blocks and d, p are large, then there existsa proper refinement of the standard filter. To remove the assumption that d, p are large here, the following interesting question needs tobe addressed.Let A , . . . , A m be random n × n matrices over F q . What is the typical number ofnon-full-rank matrices in the linear span of A i ’s? Let us suppose G is sampled according to our model. Let U be the intersection of G with U ( d, p ) andbegin with the initial filter of our introduction. Then if U is abelian we are in case (i). Otherwise U has at least 3 blocks so we can use either Theorem 5.8 for the case of large blocks, or Theorem 5.11in the case of small blocks. In either case we obtain a proper refinement. 
Note that after refining the bounded number of blocks several times, we cross over to the case of a large number of blocks, and so the result follows.

In this subsection we formally describe the simplified main algorithm presented in Section 1.4, that is, Algorithm 2. We also discuss some important adjustments used in the implementation. We need the following observation, which follows easily by computing the closure of the given generating set.
Observation 6.1.
Let C_1, …, C_t ∈ GL(n, q), and let G be the group generated by the C_i's. Let s ∈ N. Then there exists an algorithm that either reports that |G| > s, or lists all elements of G, in time poly(s, n, log q).

Let us first examine the running time of Algorithm 2.
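The closure computation behind Observation 6.1 can be sketched as follows. This is a minimal illustration assuming a prime modulus p (the observation itself works over any F_q), and the function names are ours:

```python
# Breadth-first closure of the group generated by a list of matrices over Z_p.
# Matrices are tuples-of-tuples so they can be stored in a hash set.

def matmul(A, B, p):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) % p
                       for j in range(n)) for i in range(n))

def closure(gens, p, s):
    """Return the element list of <gens> if |G| <= s, else None
    (reporting |G| > s, as in Observation 6.1)."""
    n = len(gens[0])
    identity = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for g in frontier:
            for h in gens:
                gh = matmul(g, h, p)
                if gh not in seen:
                    seen.add(gh)
                    if len(seen) > s:
                        return None  # |G| > s: stop early
                    nxt.append(gh)
        frontier = nxt
    return sorted(seen)
```

Since the search visits at most s + 1 elements and multiplies each by every generator once, the running time is poly(s, n, log q), matching the bound claimed in the observation.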
Proposition 6.2.
Algorithm 2 runs in time poly(q^{cm}, s, n).

Proof. If Algorithm 2 outputs |Aut(A)| > s, then its running time is determined by Theorem 2.3 (2) and Observation 6.1, which together require poly(s, n, log q).

If |Aut(A)| ≤ s, we analyze the two For-loops at Step 4 and Step 4.c, respectively. The first loop adds a multiplicative factor of q^{cm}, since B ranges over ⟨H⟩^c and |⟨H⟩| ≤ q^m. The second loop adds a multiplicative factor of s, due to the fact that |Isom(A, B)| = |Aut(A)| ≤ s, as Isom(A, B) is a coset of Aut(A). Other steps can be carried out in time poly(n, log q). Therefore the overall running time is upper bounded by poly(q^{cm}, s, n).

We then prove the correctness of Algorithm 2, in the case that it does not report |Aut(A)| > s.

Proposition 6.3.
If Algorithm 2 does not report |Aut(A)| > s, then it lists the set of pseudo-isometries (possibly empty) between G and H. In particular, |ΨIsom(G, H)| ≤ q^{cm} · s.

Proof. By Step 4.c, every T added to L is a pseudo-isometry. We are left to show that L contains all the pseudo-isometries. For this, take any pseudo-isometry T. Since the linear spans of T^t G T and H are the same, we know T^t A T is equal to some B ∈ ⟨H⟩^c. So when enumerating this B in Step 4, T will pass all the subsequent tests, and then be added to L. This concludes the proof.

Algorithm 2: The first average-case algorithm for alternating space isometry.
Input: G = (G_1, …, G_m) ∈ Λ(n, q)^m, H = (H_1, …, H_m) ∈ Λ(n, q)^m, and c, s ∈ N, where q is odd.

Output:
Either (1) a report that |Aut(A)| > s, where A = (G_1, …, G_c); or (2) ΨIsom(G, H).

Algorithm procedure:
1. Set L ← {}. Set A = (G_1, …, G_c), the first c matrices from G.
2. Use Theorem 2.3 (2) to compute a generating set for Aut(A).
3. Use Observation 6.1 with input s and the generating set of Aut(A). (If |Aut(A)| > s, terminate the algorithm and report that "|Aut(A)| > s.")
4. Put ⟨H⟩, the linear span of H; for every B = (B_1, …, B_c) ∈ ⟨H⟩^c, do the following.
   a. Use Theorem 2.1 (2) to decide whether A and B are isometric.
   b. If not, go to the next B. Otherwise, we get the non-empty coset Isom(A, B).
   c. For every T ∈ Isom(A, B), do the following. Test whether the linear spans of T^t G T and H are the same. If not, go to the next T. If so, add T to L.
5. Output L.

It remains to specify the choices of c and s in Algorithm 2 for the average-case analysis. This is stated in the following proposition, whose proof is deferred to Section 6.3.

Proposition 6.4.
Let c := 20. For all but at most a 1/q^{Ω(n)} fraction of A = (G_1, …, G_c) ∈ Λ(n, q)^c, we have |Aut(A)| ≤ s := q^n.

Combining Propositions 6.2, 6.3 and 6.4, we have the following theorem.
Theorem 6.5.
Let m > 20, and let F_q be a finite field of odd size. For all but at most a 1/q^{Ω(n)} fraction of G = (G_1, …, G_m) ∈ Λ(n, q)^m, Algorithm 2 tests the isometry of G with an arbitrary H ∈ Λ(n, q)^m in time q^{O(n+m)}.

Implementation details.
We now explain some issues in the implementation of Algorithm 2. To make this algorithm suitable for practical purposes, recall that the algorithm's running time is dominated by the two For-loops, which contribute multiplicative factors of q^{cm} and s, respectively. For the average-case analysis we used c = 20, but having this constant in the exponent is too expensive. In practice, already using c = 3 imposes a severe restriction on s, the order of Aut(A). So we use c = 3 in the implementation, which gives reasonable performance.

But having q^{cm} in the For-loop is still too demanding. Indeed, in practice the tolerable enumeration is around 5^{10}, namely q = 5 with 10 in the exponent. So with c = 3, the range of m is still severely limited. (Interestingly, the algorithm seems to have a better dependence on n.) It would be most desirable if we could take c = 1, leaving simply q^m.

To achieve that we use the following heuristic. Note that if G_1, …, G_c are low-rank matrices, then we only need to match them with the low-rank matrices from H. Our experiments show that, for a random G over F_q with q a small constant, the number of low-rank (i.e. non-full-rank) matrices in ⟨G⟩ is expected to be small (much smaller than q^m) and non-zero (no fewer than 3) at the same time. So we can spend q^m · poly(n, log q) time to choose 3 low-rank matrices from ⟨G⟩, and then use q^m · poly(n, log q) time to compute the set of low-rank matrices in ⟨H⟩. We can then replace the enumeration over ⟨H⟩^c with an enumeration over c-tuples of these low-rank matrices, which in general form a much smaller set.

To state our algorithm, we need the concept of the adjoint algebra. For two tuples of alternating matrices G, H ∈ Λ(n, F)^m, the adjoint algebra of G is defined as

Adj(G) = {(A, D) ∈ M(n, F) ⊕ M(n, F) : A G = G D},

and the adjoint space from G to H is

Adj(G, H) = {(A, D) ∈ M(n, F) ⊕ M(n, F) : A G = H D}.

Clearly, if T ∈ Aut(G), then (T^t, T^{−1}) ∈ Adj(G).
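Computing Adj(G) is plain linear algebra: the conditions A·G_i = G_i·D are linear in the entries of A and D, so Adj(G) is the null space of one big linear system. A minimal sketch over a prime field Z_p (a simplifying assumption; the paper works over arbitrary fields), with helper names of our own:

```python
# Dimension of the adjoint algebra Adj(G) = {(A, D) : A G_i = G_i D for all i}
# over Z_p, computed as the nullity of a linear system; |Adj(G)| = p**dim.

def adjoint_dim(Gs, p):
    n = len(Gs[0])
    ncols = 2 * n * n  # unknowns: entries of A, then entries of D
    rows = []
    for G in Gs:
        for i in range(n):
            for j in range(n):
                # Equation (A G)_{ij} - (G D)_{ij} = 0.
                row = [0] * ncols
                for k in range(n):
                    row[i * n + k] = (row[i * n + k] + G[k][j]) % p          # A_{ik} G_{kj}
                    col = n * n + k * n + j
                    row[col] = (row[col] - G[i][k]) % p                      # -G_{ik} D_{kj}
                rows.append(row)
    # Gaussian elimination mod p to find the rank (p prime, so pow gives inverses).
    rank = 0
    for col in range(ncols):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(x - c * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return ncols - rank  # nullity = dim Adj(G)
```

For a single invertible G the system forces D = G^{-1} A G with A free, so the dimension is n²; this is a handy sanity check for the sketch.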
Furthermore, if G and H are isometric, then |Adj(G, H)| = |Adj(G)|.

We now introduce the algorithm (see Algorithm 3) that supports Theorem E. We point out that Algorithm 3 differs from the algorithm presented in Section 1.4 in two places.

1. The first and major difference is to replace the uses of Aut(G) and Isom(G, H) with Adj(G) and Adj(G, H), thereby avoiding Theorem 2.1 (2) and Theorem 2.3 (2). Since Adj(G) and Adj(G, H) are easy to compute over any field, this resolves the characteristic-2 field issue. Furthermore, Adj(G) and Adj(G, H) are also easier to analyze. But Adj(G) and Adj(G, H) can be larger than Aut(G) and Isom(G, H), so they are less useful from the practical viewpoint.

2. The second difference is Step 2 of Algorithm 3: instead of just using the first c matrices as in the algorithm presented in Section 1.4, Algorithm 3 slices the m matrices of G into ⌊m/c⌋ segments of c-tuples of matrices, and tries each segment until it finds one with a small adjoint algebra. This step improves the average-case analysis, and can be applied to the algorithm presented in Section 1.4 as well.

Let us first examine the running time of Algorithm 3.

Proposition 6.6.
Algorithm 3 runs in time poly(q^{cm}, s, n).

Proof. If Algorithm 3 outputs "G does not satisfy the generic condition," then it only executes the For-loop in Step 2, which in total runs in time poly(m, n, log q).

Otherwise, there are two For-loops at Step 4 and Step 4.c, which add multiplicative factors q^{cm} and s, respectively. Other steps can be carried out in time poly(n, log q). Therefore the whole algorithm runs in time poly(q^{cm}, s, n).

We then prove the correctness of Algorithm 3 in the case that it does not report "G does not satisfy the generic condition."
Suppose that Algorithm 3 does not report "G does not satisfy the generic condition." Then the algorithm lists the set of pseudo-isometries (possibly empty). In particular, |ΨIsom(G, H)| ≤ q^{cm} · s.

Proof. By Step 4.c, every T added to L is a pseudo-isometry. So we are left to show that L contains all the pseudo-isometries. For this, take an arbitrary pseudo-isometry T. Then T sends A to some B ∈ ⟨H⟩^c, i.e., T^t A T = B. In particular, (T^t, T^{−1}) ∈ Adj(A, B). So when enumerating this B ∈ ⟨H⟩^c, (T^t, T^{−1}) will pass all the subsequent tests, and T will be added to L. This concludes the proof.

Algorithm 3: The second average-case algorithm for alternating space isometry.
Input: G = (G_1, …, G_m) ∈ Λ(n, q)^m, H = (H_1, …, H_m) ∈ Λ(n, q)^m, and c, s ∈ N.

Output:
Either (1) a report that "G does not satisfy the generic condition"; or (2) ΨIsom(G, H) as a set, which may be empty.

Algorithm procedure:
1. Set L ← {}. Set F ← false.
2. For i = 1, …, ⌊m/c⌋, do the following.
   a. Set A = (G_{c(i−1)+1}, …, G_{ci}).
   b. Compute a linear basis of Adj(A) ⊆ M(n, q) ⊕ M(n, q).
   c. If |Adj(A)| ≤ s, set F to true, and break the For-loop.
3. If F = false, return "G does not satisfy the generic condition" and terminate. Otherwise,
4. Put ⟨H⟩, the linear span of H; for every B = (B_1, …, B_c) ∈ ⟨H⟩^c, do the following.
   a. Compute a linear basis for Adj(A, B) ⊆ M(n, q) ⊕ M(n, q).
   b. If |Adj(A, B)| > s, go to the next B.
   c. For every (T, S) ∈ Adj(A, B), do the following. If S and T are invertible and S = T^{−t}, test whether the linear spans of T G T^t and H are the same. If not, go to the next (T, S). If so, add T^t to L.
5. Output L.

Therefore, to prove Theorem E, the key is to analyze when a random G satisfies the generic condition of Algorithm 3.

Proposition 6.8.
Let m > c = 20, and let ℓ = ⌊m/c⌋ ∈ N. For all but at most a 1/q^{Ω(n·ℓ)} = 1/q^{Ω(nm)} fraction of G = (G_1, …, G_m) ∈ Λ(n, q)^m, there exists some i ∈ [ℓ] such that, letting A = (G_{c(i−1)+1}, …, G_{ci}), we have |Adj(A)| ≤ q^n.

Clearly, Theorem E follows from Propositions 6.6, 6.7, and 6.8.
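The span-equality test in Step 4.c of Algorithms 2 and 3 is itself routine linear algebra: flatten each matrix to a vector and compare canonical forms of the two spans. A sketch, again assuming a prime modulus p and using helper names of our own:

```python
# Test whether two tuples of matrices span the same subspace of M(n, Z_p),
# by comparing reduced row echelon forms of the flattened matrices.

def rref(vectors, p):
    """Reduced row echelon form over Z_p (p prime), as a canonical tuple."""
    rows = [list(v) for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(x - c * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return tuple(tuple(r) for r in rows[:rank])

def same_span(Gs, Hs, p):
    flat = lambda M: [e % p for row in M for e in row]
    return rref([flat(G) for G in Gs], p) == rref([flat(H) for H in Hs], p)
```

Since the RREF of a spanning set is a canonical form of the subspace it spans, two tuples span the same space exactly when the two RREFs coincide.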
We now formulate the key proposition that supports the proof of Proposition 6.8.
Proposition 6.9.
Let c = 20. For all but at most a 1/q^{Ω(n)} fraction of A = (G_1, …, G_c) ∈ Λ(n, q)^c, we have |Adj(A)| ≤ q^n.

Given Proposition 6.9, we easily obtain the following.
Proof of Proposition 6.4.
This is because, if T ∈ Aut(A), then (T^t, T^{−1}) ∈ Adj(A). So |Aut(A)| ≤ |Adj(A)|.

Proof of Proposition 6.8.
We slice G into ℓ = ⌊m/c⌋ segments, where each segment consists of c random alternating matrices. Each segment is some A ∈ Λ(n, q)^c with Pr[|Adj(A)| > q^n] ≤ 1/q^{Ω(n)}. Since each G_i is chosen independently and uniformly at random, the probability that every (G_{c(i−1)+1}, …, G_{ci}), i ∈ [ℓ], has |Adj((G_{c(i−1)+1}, …, G_{ci}))| > q^n is upper bounded by (1/q^{Ω(n)})^ℓ = 1/q^{Ω(nm)}.

The rest of this subsection is devoted to the proof of Proposition 6.9. For this we need the following from [LQ]. Given a tuple A = (A_1, …, A_r) ∈ M(n, q)^r, define the image of U ≤ F_q^n under A as A(U) := ⟨∪_{i=1}^r A_i(U)⟩.

Definition 6.10.
We say A = (A_1, …, A_r) ∈ M(n, q)^r is stable if for any nonzero, proper U ≤ F_q^n, we have dim(A(U)) > dim(U).

Proposition 6.11 ([LQ, Proposition 10 in arXiv version]). If A ∈ M(n, q)^r is stable, then |Adj(A)| ≤ q^n.

A key technical result in [LQ] is that a random A ∈ M(n, q)^r is stable with probability 1 − 1/q^{Ω(n)} [LQ, Proposition 20 in arXiv version]. However, we cannot directly apply that result to prove Proposition 6.9, because here we have alternating matrices instead of general matrices. So we have to run the arguments for the proof of [LQ, Proposition 20 in arXiv version] again, and carefully adjust some of the details there to accommodate the structure of alternating matrices.

To start with, we need the following easy linear-algebraic result, which suggests the connection between random alternating matrices and random general matrices.

Lemma 6.12.
Let d ∈ Z^+ with d > 1. Given two random alternating matrices X, Y ∈ Λ(d, q), we can construct a matrix P ∈ M(d × (d−1), q), whose columns are linear combinations of the columns of X and Y, such that P is a uniformly random matrix from M(d × (d−1), q).

Proof. Write X = (x_{i,j}) and Y = (y_{i,j}), where x_{j,i} = −x_{i,j}, y_{j,i} = −y_{i,j}, the diagonal entries are zero, and the entries x_{i,j} and y_{i,j} for i < j are independent uniform random variables from F_q. Let M = (z_{i,j}) be the matrix obtained by adding the (i+1)-th column of Y to the i-th column of X for i ∈ [d−1], and adding the first column of Y to the d-th column of X. Let P be the d × (d−1) matrix consisting of the first d−1 columns of M. We need to show that P is uniformly distributed over M(d × (d−1), q) when X and Y are uniformly distributed over Λ(d, q).

To see this, first note that for two random variables x and y chosen independently and uniformly at random from F_q, x ± y is again uniformly distributed, and is independent of each of x and y alone. Thus each z_{i,j} is a uniform random variable over F_q. We then examine the linear relations among the z_{i,j}'s. In fact, we only need to account for the anti-diagonal directions, as

z_{1,i} + z_{2,i−1} + ⋯ + z_{i,1} + z_{i+1,d} + z_{i+2,d−1} + ⋯ + z_{d,i+1} = 0

for every i ∈ [d]. Thus, for each i ∈ [d] we may view z_{1,i}, z_{2,i−1}, …, z_{i,1}, z_{i+2,d−1}, …, z_{d,i+1} (note the missing z_{i+1,d}) as mutually independent; since z_{i+1,d} lies in the d-th column, which is excluded from P, every entry of P can be viewed as chosen independently and uniformly at random. This can be verified in a straightforward way, which concludes the proof.

Remark 6.13. Following a similar argument, if we would like to get a d × d random matrix over F_q, we can do the following: take two d × d random alternating matrices X and Y and construct M as in Lemma 6.12. We then take another two random alternating matrices Z and W, and add up the first columns of Z and W, each coordinate of which can be viewed as chosen independently and uniformly at random. Replacing the last column of M by this new random column gives a d × d random matrix.

We are now ready to prove Proposition 6.9.

Proof of Proposition 6.9.
Given Proposition 6.11, it suffices to upper-bound the probability that a random A ∈ Λ(n, q)^c is not stable by 1/q^{Ω(n)}. By the union bound,

Pr[A ∈ Λ(n, q)^c is not stable] ≤ Σ_{U ≤ F_q^n, 1 ≤ dim(U) ≤ n−1} Pr[dim(A(U)) ≤ dim(U)].   (6.14)

We first simplify the right-hand side. For a nonzero, proper U ≤ F_q^n, let A_U := {A ∈ Λ(n, q)^c : dim(A(U)) ≤ dim(U)}. Clearly,

Pr[dim(A(U)) ≤ dim(U)] = |A_U| / |Λ(n, q)^c|.

We show that for any two dimension-d subspaces U and V, |A_U| = |A_V|. To see this, let T ∈ GL(n, q) be any invertible matrix that sends V to U. Note that T induces a linear map from Λ(n, q)^c to itself by sending A to T^t A T. Since T is invertible, this map is a bijection. Moreover, for any A ∈ A_U, we claim that T^t A T ∈ A_V. This is because

dim((T^t A T)(V)) = dim((T^t A)(U)) = dim(A(U)) ≤ dim(U) = dim(V),

where the second equality holds since multiplying by invertible matrices on the left and right does not change the rank of a matrix. To summarize, if dim(U) = dim(V), then Pr[dim(A(U)) ≤ dim(U)] = Pr[dim(A(V)) ≤ dim(V)]. The right-hand side of (6.14) can then be simplified as

Pr[A ∈ Λ(n, q)^c is not stable] ≤ Σ_{d=1}^{n−1} [n choose d]_q · Pr[dim(A(U_d)) ≤ d],   (6.15)

where U_d is the d-dimensional subspace of F_q^n spanned by the first d standard basis vectors e_1, …, e_d, and [n choose d]_q denotes the Gaussian binomial coefficient.

The next goal is to upper-bound [n choose d]_q · Pr[dim(A(U_d)) ≤ d] for d = 1, …, n−1. Let A_i^d be the n × d matrix consisting of the first d columns of A_i for i ∈ [c]. (Note that the superscript here does not denote exponentiation.) Let A^d = [A_1^d, …, A_c^d] ∈ M(n × cd, q). Then dim(A(U_d)) is just the rank of A^d.
Note that for i ∈ [ c ], the first d row of A di can be viewed as arandom alternating matrix from Λ( d, q ), and the last n − d rows of A di can be viewed as a ( n − d ) × d random matrix. Moreover, these two matrices can be viewed as being chosen independently.By Lemma 6.12 together with Remark 6.13, there exist a series of column operations representedby an invertible matrix R ∈ GL( cd × cd, q ), such that the following holds. Let V d ∈ M( n × d, q ) bethe matrix consists of the first 5 d columns of A d R . Then V d can be viewed as chosen independentlyand uniformly at random from M ( n × d, q ), as A is chosen uniformly at random from Λ( n, q ) c .Note that when d = 1, the first row of A di is 0 for all i ∈ [ c ]. This degenerate case suggest us toconsider V as randomly choosing from M (( n − × , q ). Note thatPr[ A ∈ Λ( n, q ) c , dim( A ( U )) Pr[ V ∈ M (( n − × , q ) , rk ( V ) A ∈ Λ( n, q ) c , dim( A ( U d )) d ] Pr[ V d ∈ M ( n × d, q ) , rk ( V d ) d ]for 2 d n − n − × V ∈ M (( n − × , q ) , rk ( V ) (cid:0) (cid:1) · q n − · q − q n − = 5 q n − . So we have (cid:20) n (cid:21) q · Pr[ A ∈ Λ( n, q ) c , dim( A ( U )) q n − . (6.16)Using the same idea, we deal with 2 d n −
1. All possible V d such that rk ( V d ) d can beconstructed by first choosing d columns in V d and fixing their entries, and then choosing the othercolumns from their linear span. This gives the boundPr[ V d ∈ M ( n × d, q ) , rk ( V d ) d ] (cid:0) dd (cid:1) × q nd × q d q nd q nd − d − d , where the last inequality uses (cid:0) dd (cid:1) d q d . For d n , we upper bound (cid:2) nd (cid:3) q by q nd . This givesthat (cid:20) nd (cid:21) q Pr[ A ∈ Λ( n, q ) c , dim( A ( U d )) d ] q nd − d − d q n − . (6.17)For n < d n −
2, we upper bound (cid:2) nd (cid:3) q by q n ( n − d ) . This gives that (cid:20) nd (cid:21) q Pr[ A ∈ Λ( n, q ) c , dim( A ( U d )) d ] q nd − n − d − d q n − . (6.18)For d = n −
1, we note that Pr[ V d ∈ M ( n × n − , q ) , rk ( V d ) n −
1] is the probability that V_d is not of rank n (for n large enough). A direct calculation gives

[n choose n−1]_q · Pr[A ∈ Λ(n, q)^c, dim(A(U_{n−1})) ≤ n−1] ≤ 1/q^{Ω(n)}.   (6.19)

Combining equations (6.14) to (6.19), we have

Pr[A ∈ Λ(n, q)^c is not stable] ≤ Σ_{U ≤ F_q^n, 1 ≤ dim(U) ≤ n−1} Pr[dim(A(U)) ≤ dim(U)] ≤ Σ_{d=1}^{n−1} [n choose d]_q · Pr[dim(A(U_d)) ≤ d] ≤ 1/q^{Ω(n)},

which concludes the proof.

Remark. In [LQ], the linear-algebraic Erdős–Rényi model, LinER(n, m, q), was introduced as the uniform distribution over all m-dimensional subspaces of Λ(n, q). Randomly sampling m-tuples of n × n alternating matrices was termed the naive model in [LQ]. It was also shown in [LQ] that an analysis in the naive model can be upgraded, with a mild loss in the parameters, to an analysis in LinER(n, m, q). Such an upgrade can also be done for the analysis here, though with a little more work than in [LQ]. We omit the details.

In this section we show how to combine the methods of [GQ] for groups with abelian radicals and the methods of [BMW1] to study subclasses of groups whose solvable radicals are p-groups of class 2. Recall that p-groups of class 2 are considered as difficult as the general case of group isomorphism, so we did not expect to beat the n^{log n} bound for this entire class. However, as a corollary of the results in this section, we give an n^{O(log log n)}-time isomorphism test for a class of groups whose radicals have genus 2. We shall work throughout with the following class of groups:

Let G be the class of groups G whose solvable radical, Rad(G), is a p-group of exponent p ≠ 2 and class 2, upon which G acts as inner automorphisms of Rad(G).

In [GQ] the classical strategy of using actions and cohomology was formally analyzed, showing that GpI "splits" into two problems: Action Compatibility (
ActComp ), and Cohomology ClassIsomorphism (
CohoIso ); we state their definitions in the relevant sections below. When G has anormal subgroup N we may consider G as an extension of N by Q = G/N ; both
ActComp and
CohoIso have as their witnesses certain elements of Aut(N) × Aut(Q) × (Q → N), and two groups are isomorphic if, and only if, there is a witness that works simultaneously for ActComp and
CohoIso (see [GQ] for a leisurely exposition). Furthermore,
ActComp and
CohoIso each reduce to
GpI .The two key cases to handle first are the extreme situations with regards to this natural splitting:semi-direct products, where the isomorphism problems reduce to just
ActComp ; and “central”products (or rather, where G/ Rad( G ) acts trivially on the radical Rad( G )), where the problemreduces to (nonabelian) CohoIso . The class G that we consider here is of the second type of extremesituation. We expect the first yield to techniques in [GQ, Section 3], perhaps using methods tosolve isometry [IQ], but we are not yet able to see a clear path to this case. We briefly recall definitions and results on the automorphism group of groups of genus 2; see[BMW1] for details. For any group G , let Z = Z ( G ) and G ′ = [ G, G ]; then we define the commutatormap of G as ◦ G : G/Z × G/Z → G ′ . Two groups G, H are isoclinic if there are isomorphisms36 : G/Z ( G ) → H/Z ( H ) and ˆ ϕ : G ′ → H ′ such that g ϕ ◦ H g ϕ = ( g ◦ G g ) ˆ ϕ . When G, H arenilpotent of class 2, their commutator maps are in fact Z -bilinear (note that in this case G/Z ( G ) isabelian), and the groups are isoclinic iff ◦ G and ◦ H are pseudo-isometric, by definition (recall § ◦ : U × V → W ( U, V, W abelian groups), its centroid is C ( ◦ ) := { ( ϕ, ψ, ρ ) ∈ End( U ) × End( V ) × End( W ) : ( ∀ u ∈ U, v ∈ V )[ u ϕ ◦ v = ( u ◦ v ) ρ = u ◦ ( v ψ )] } ;the centroid is the largest ring of scalars over which ◦ is bilinear. A nilpotent group G of class2 is isoclinic to a direct product H × · · · × H s of directly indecomposable groups; the genus of G is the maximum rank of [ H i , H i ] as a C ( ◦ H i )-module. Although the concept of genus is fullygeneral, we focus on p -groups of exponent p and class 2; in this case isoclinism and isomorphismcoincide, and centrally indecomposable p -groups of class 2 and exponent p have their centroids afinite field of characteristic p . 
For a biadditive map ◦ : U × U → V , let ΨIsom( ◦ ) denote its group ofpseudo-isometries; if ◦ is bilinear over a field F , let ΨIsom F ( ◦ ) = ΨIsom( ◦ ) ∩ (GL F ( U ) × GL F ( V )).Given a finite field F of characteristic p , its Galois group denoted Gal( F ), consists of those fieldautomorphisms of F that act trivially on the prime subfield Z p F ; Gal( F ) is cyclic of order[ F : Z p ] = log p | F | , generated by the Frobenius automorphism a a p . Proposition 7.1 (See, e. g., [BMW1, Prop. 2.4]) . Let P be a p -group of class 2 and exponent p satisfying Z ( P ) = [ P, P ] . Then Aut( P ) = ΨIsom( ◦ P ) ⋉ Hom(
P/Z ( P ) , Z ( P )) . If ◦ P is F -bilinear,then ΨIsom F ( ◦ P ) (cid:2) ΨIsom( ◦ P ) , with quotient ΨIsom( ◦ P ) / ΨIsom F ( ◦ P ) ∼ = Gal( F ) .Note that elements of ΨIsom F ( ◦ P ) ⋉ Hom(
P/Z ( P ) , Z ( P )) are faithfully represented by matrices (cid:18) α V dα α Z (cid:19) , where α V ∈ Aut(
P/Z ( P )) , α Z ∈ Aut( Z ( P )) , and dα : P/Z ( P ) → Z ( P ) is linear. Recall that a map α : V → W of F -vector spaces is F -semilinear if it is additive ( α ( v + v ′ ) = α ( v ) + α ( v ′ )) and it is “twisted” linear, that is, α ( λv ) = λ γ α ( v ), where γ ∈ Gal( F ). From thepreceding, it follows immediately that: Observation 7.2.
Let P be a p -group of class 2 and exponent p such that ◦ P is F -bilinear. Forany α ∈ Aut( P ) , the induced automorphisms on [ P, P ] and P/ [ P, P ] are both F -semilinear. Observation 7.3. If P is a p -group of class 2 and exponent p such that Z ( P ) = [ P, P ] , then P ∼ = Q × A , where Q is characteristic subgroup of P and satisfies Z ( Q ) = [ Q, Q ] , and A is anelementary abelian p -group. Moreover, Q and A and the isomorphism P ∼ = Q × A can be constructedin polynomial time in the number of generators, even when the groups are given as a black box.Standard proof sketch. Z ( P ) > [ P, P ] since P is of class 2. Since P is of exponent p , Z ( P ) iselementary abelian, and thus is a vector space Z ep . Let { g , . . . , g s } be a generating set of P . Let Q = h g i : g i / ∈ Z ( P ) i . Then Q ∩ Z ( P ) = [ P, P ]. Let A be a Z p -linear complement to [ P, P ] in Z ( P ). Theorem 7.4 ([BMW1, IQ]) . Let P be a p -group of class 2, exponent p = 2 , and genus g . Given α ∈ Aut( Z ( P )) , one can test whether α extends to an automorphism ˆ α ∈ Aut( P ) in poly-logarithmictime when g , and in polynomial time otherwise.Proof. When g = 2, the result is immediate from [BMW1, Thm. 3.22], and their comments about itsconstructive nature (see [BMW1, § Theorem 7.5.
Isomorphism of p-groups of class 2, exponent p ≠ 2, and genus ≤ 2 can be decided in poly-logarithmic time [BMW1, Thm. 1.1], and of genus p log|G| can be decided in polynomial time [IQ, Thm. 3].

7.2 Testing isomorphism in the class G

Our goal in this final section is to prove Theorem F, which for convenience we now recall:
Theorem F.
Let G be the class of groups defined at the start of Section 7. Given groups G_1, G_2 of order n, it can be decided in poly(n) time whether they lie in G. If so, isomorphism can be decided, and a generating set for Aut(G_i) found, in time n^{O(g+log log n)}, where g is the genus of Rad(G_1).

We will need the following two results from Grochow–Qiao [GQ], which first require a few concepts we haven't yet discussed. Recall that a pair of subgroups H_1, H_2 ≤ G is a central decomposition of G if ⟨H_1, H_2⟩ = G and [H_1, H_2] = 1. Given two groups M_1, M_2 and an isomorphism ϕ: Y_1 → Y_2 between two subgroups Y_i ≤ Z(M_i), the quotient of M_1 × M_2 by {(y^{−1}, ϕ(y)) : y ∈ Y_1} is the central product of M_1 and M_2 along ϕ, denoted M_1 ×_ϕ M_2, and ϕ is called the amalgamating map. In this case, {M_1, M_2} is a central decomposition of M_1 ×_ϕ M_2; conversely, if {H_1, H_2} is a central decomposition of a group G, then there exist Y_i ≤ Z(H_i) and an isomorphism ϕ: Y_1 → Y_2 such that G ≅ H_1 ×_ϕ H_2.

Lemma 7.6 ([GQ, Lem. 3.10]). Let N ⊴ G, and suppose G acts on N as inner automorphisms of N. Then there is a subgroup H ≤ G, constructible in time poly(|G|), such that H ∩ N = Z(N), H/N = Q, and {N, H} is a central decomposition of G. We denote this subgroup H by G|_{Z(N)}.

Proposition 7.7 (Special case of [GQ, Prop. 3.13]). Let G_i (i = 1, 2) be a group such that Rad(G_i) = P is a p-group of class 2, exponent p, and genus 2, and such that Q = G_i/Rad(G_i) acts on Rad(G_i) by inner automorphisms of Rad(G_i). Suppose that G_1|_{Z(P)} ≅ G_2|_{Z(P)} (as in Lem. 7.6), which we denote by Q̂, and let ϕ_i: Z(P) → Z(Q̂) be the corresponding amalgamating maps. Then G_1 ≅ G_2 iff there exist (α, β) ∈ Aut(P) × Aut(Q̂) such that ϕ_1 = β^{−1}|_{Z(Q̂)} ∘ ϕ_2 ∘ α|_{Z(P)}.

Proposition 7.8 (See [GQ, §]). Let G be a group with Rad(G) = Z(G), and let Q = G/Z(G) be an elementary abelian group.
Given β ∈ Aut( Q ) , one can compute in poly dim Z ( G ) timea single α ∈ Aut( Z ( G )) and a basis of a linear subspace L ⊆ End( Z ( G )) such that ( β, γ ) ∈ Aut( G ) iff γ ∈ α + L .Proof of Thm. F. Let G , G be groups satisfying the hypotheses. In poly( | G | ) time, find Rad( G i )and denote this by P ′ i . By Lem. 7.6, construct ˆ Q i = G i | Z ( P ′ i ) and the amalgamating maps ϕ ′ i : Z ( P ′ i ) → Z ( ˆ Q i ). Using Thm. 7.5 [BMW1, IQ], decide whether P ′ ∼ = P ′ ; if not, then G = G and we can stop, and if so, then let ρ ′ : P ′ → P ′ be such an isomorphism.Note (Observation 7.3) that it may be the case that P ′ i ∼ = P i × A i for some abelian groups A i ;if this is the case, we can find P i and A i such that Z ( P i ) = [ P i , P i ] in polynomial time. Replace P ′ i by P i and replace ρ ′ by ρ := ρ ′ | P ; this will not hurt us later because P i is characteristic in P ′ i , andtherefore also in G i . Intuitively, the only place that A i interacts with P ′ i is as a direct product, andthe only way A i interacts with ˆ Q i is as a subgroup of its center, where A i still appears.Next, since ˆ Q i is a group with Rad( ˆ Q i ) Z ( ˆ Q i ), by [GQ] we can decide whether ˆ Q ∼ = ˆ Q in time n O (log log n ) ; if not, then G = G and we can stop, and if so, let τ : ˆ Q → ˆ Q be such anisomorphism. Let ϕ = ϕ ′ and ϕ = τ − ◦ ϕ ′ ◦ ρ − . These are both isomorphisms Z ( P ) → Z ( ˆ Q ),so from now on we let P = P and ˆ Q = ˆ Q , and we have G i ∼ = P × ϕ ′ i ˆ Q for i = 1 , G ∼ = G iff there exists ( α, β ) ∈ Aut( P ) × Aut( ˆ Q ) such that ϕ ′ = β − | Z ( ˆ Q ) ◦ ϕ ′ ◦ α | Z ( P ) . (7.9)By Observation 7.2, α | Z ( P ) is F -semilinear, and since P has genus g , Z ( P ) ∼ = F g . EnumerateΓL( F g ); for each α ∈ ΓL( F g ), check whether α extends to an automorphism of P (Theorem 7.438BMW1, IQ]). Let Q = ˆ Q/Z ( ˆ Q ) = ˆ Q/ Rad( ˆ Q ). Enumerate γ ∈ Aut( Q ). 
For each α ∈ ΓL( F g ) thatextends to an automorphism of P , and each γ ∈ Aut( Q ), we seek β ∈ Aut( Z ( ˆ Q )) such that ( γ, β )induces an automorphism of ˆ Q and ( α, β ) satisfies (7.9). By Proposition 7.8, the set of such γ suchthat ( γ, β ) is an automorphism of ˆ Q is an affine linear space β + B , where B is a linear subspace ofEnd( Z ( ˆ Q )), and we can compute γ and a basis for B in polynomial time. Once α is fixed, (7.9) islinear in β . Intersecting the linear space which solves (7.9) with the affine space β + B is standardlinear algebra, and can thus be computed in polynomial time.To summarize, for each α ∈ Aut F ( Z ( P )) ∼ = ΓL g ( F ) and each γ ∈ Aut( Q ), we can compute asingle element and generating set for those β such that α extends to an automorphism P , ( β, γ ) ∈ Aut( ˆ Q ), and ( α, β ) satisfy (7.9). Taking the union over all choices in ΓL g ( F ) and Aut( Q ) gives usthe coset of isomorphisms G → G . Analysis of running time.
When g ≤ O(log log |G|), we have |ΓL_g(F)| = |Gal(F)| · |GL_g(F)| ≤ k(p^k)^{g²} = k(p^{kg})^g = k|Z(P)|^g ≤ |G|^{g+o(1)}, where |G| ≥ |F| = p^k; so their number is not too large, and ΓL_g(F) is easily enumerated in n^{O(g)} time. By [BCGQ], Aut(Q) can be listed in time n^{O(log log n)}. Since we are enumerating over both of these, we take their product, n^{O(g+log log n)}, which ends up dominating the runtime. By [GQ], isomorphism of Q̂_1 and Q̂_2 can be tested in n^{O(log log n)} time. The rest is polynomial time or poly-logarithmic time by previous results, or linear algebra (poly-logarithmic time in |G|).

Remark. There is some hope when g
The authors would like to acknowledge V. Arvind and M. Grohe for useful comments on hypergraph k-WL, Avinoam Mann for discussions on random generation of p-groups, and László Babai and Xiaorui Sun for discussions on average-case algorithms for testing isomorphism of p-groups of class 2 and exponent p. P. A. B. was partially supported by NSF grant DMS-1620362. J. A. G. was partially supported by NSF grant DMS-1750319. Y. L. was partially supported by ERC Consolidator Grant 615307-QPROGRESS. Y. Q. was partially supported by the Australian Research Council DECRA DE150100720. J. B. W. was partially supported by NSF grant DMS-1620454. P. A. B. and J. B. W. also acknowledge the Hausdorff Institute for Mathematics, and the University of Auckland, where some of this research was conducted. P. A. B., J. A. G., J. B. W., and Y. Q. also acknowledge the Santa Fe Institute, where some of this research was conducted.

References

[A] Sergei I. Adian,
Unsolvability of some algorithmic problems in the theory of groups, Trudy Moskovskogo Matematicheskogo Obshchestva (1957), 231–298.
[AFKV] V. Arvind, Frank Fuhlbrück, Johannes Köbler, and Oleg Verbitsky, On Weisfeiler–Leman invariance: Subgraph counts and related graph properties. arXiv:1811.04801, 2018.
[B1] László Babai, Graph isomorphism in quasipolynomial time [extended abstract], Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, pp. 684–697. arXiv:1512.03547, version 2.
[B2] ———, Lecture on graph isomorphism, 1979.
[B3] Reinhold Baer, Groups with abelian central quotient group, Transactions of the American Mathematical Society (1938), no. 3, 357–386.
[BB] László Babai and Robert Beals, A polynomial-time theory of black box groups I, London Mathematical Society Lecture Note Series (1999), 30–64.
[B4] Jan Böker, Color refinement, homomorphisms, and hypergraphs. arXiv:1903.12432, 2019.
[B5] William W. Boone, The word problem, Annals of Mathematics (1959), 207–265.
[B6] H. R. Brahana, Metabelian groups and trilinear forms, Duke Mathematical Journal (1935), no. 2, 185–197.
[BCGQ] László Babai, Paolo Codenotti, Joshua A. Grochow, and Youming Qiao, Code equivalence and group isomorphism, Proceedings of the Twenty-Second Annual ACM–SIAM Symposium on Discrete Algorithms, SODA 2011, pp. 1395–1408.
[BCQ] László Babai, Paolo Codenotti, and Youming Qiao, Polynomial-time isomorphism test for groups with no abelian normal subgroups (extended abstract), Automata, Languages, and Programming, 39th International Colloquium, ICALP 2012, pp. 51–62.
[BES] László Babai, Paul Erdős, and Stanley M. Selkow, Random graph isomorphism, SIAM J. Comput. (1980), no. 3, 628–635.
[BJP] W. Bosma, J. J. Cannon, and C. Playoust, The Magma algebra system I: the user language, J. Symbolic Comput. (1997), 235–265.
[BMW1] Peter A. Brooksbank, Joshua Maglione, and James B. Wilson, A fast isomorphism test for groups whose Lie algebra has genus 2, J. Algebra (2017), 545–590.
[BMW2] ———, Thetensor.space, GitHub, 2019.
[BNV] Simon R. Blackburn, Peter M. Neumann, and Geetha Venkataraman, Enumeration of finite groups, Cambridge Tracts in Mathematics, vol. 173, Cambridge University Press, Cambridge, 2007.
[BOW] Peter A. Brooksbank, Eamonn A. O'Brien, and James B. Wilson, Isomorphism testing of graded algebras. arXiv:1708.08873, 2017.
[BS] L. Babai and E. Szemerédi, On the complexity of matrix group problems I, Proceedings of the 25th Annual Symposium on Foundations of Computer Science, SFCS 1984, pp. 229–240.
[BW] Peter A. Brooksbank and James B. Wilson, Computing isometry groups of Hermitian maps, Trans. Amer. Math. Soc. (2012), no. 4, 1975–1996.
[CFI] Jin-Yi Cai, Martin Fürer, and Neil Immerman, An optimal lower bound on the number of variables for graph identifications, Combinatorica (1992), no. 4, 389–410.
[CH] John Cannon and Derek F. Holt, Automorphism group computation and isomorphism testing in finite groups, J. Symbolic Comput. (2003), no. 3, 241–267.
[DGR] Holger Dell, Martin Grohe, and Gaurav Rattan, Lovász meets Weisfeiler and Leman, 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, pp. 40:1–40:14.
[FSEDS] Jean-Charles Faugère, Mohab Safey El Din, and Pierre-Jean Spaenlehauer, On the complexity of the generalized MinRank problem, J. Symbolic Comput. (2013), 30–58.
[G1] Robert Gilman, Algorithmic search in group theory. arXiv:1812.08116, 2018.
[G2] M. Gromov, Random walk in random groups, Geom. Funct. Anal. (2003), no. 1, 73–146.
[GQ] Joshua A. Grochow and Youming Qiao, Algorithms for group isomorphism via group extensions and cohomology, SIAM J. Comput. (2017), no. 4, 1153–1216.
[H1] J. Harris, Algebraic geometry, Graduate Texts in Mathematics, vol. 133, 1992.
[H2] Graham Higman, Enumerating p-groups. I: Inequalities, Proceedings of the London Mathematical Society (1960), no. 1, 24–30.
[HL] Hermann Heineken and Hans Liebeck, The occurrence of finite groups in the automorphism group of nilpotent groups of class 2, Arch. Math. (Basel) (1974), 8–16.
[HM] Geir T. Helleloid and Ursula Martin, The automorphism group of a finite p-group is almost always a p-group, J. Algebra (2007), no. 1, 294–329.
[HR] Derek F. Holt and Sarah Rees, Testing modules for irreducibility, J. Austral. Math. Soc. (1994), 1–16.
[IL] Neil Immerman and Eric S. Lander, Describing graphs: a first-order approach to graph canonization, Complexity theory retrospective, in honor of Juris Hartmanis on the occasion of his 60th birthday, 1990, pp. 59–81.
[IQ] Gábor Ivanyos and Youming Qiao, Algorithms based on ∗-algebras, and their applications to isomorphism of polynomials with one secret, group isomorphism, and polynomial identity testing, Proceedings of the Twenty-Ninth Annual ACM–SIAM Symposium on Discrete Algorithms, SODA 2018, pp. 2357–2376.
[KL] William M. Kantor and Alexander Lubotzky, The probability of generating a finite classical group, Geom. Dedicata (1990), no. 1, 67–87.
[KS] Neeraj Kayal and Nitin Saxena, Complexity of ring morphism problems, Computational Complexity (2006), no. 4, 342–390.
[KTW] Martin Kassabov, Brady Tyburski, and James B. Wilson, The number of isomorphism types of subgroups of simple groups is maximum possible. (in preparation).
[LGR] François Le Gall and David J. Rosenbaum, On the group and color isomorphism problems. arXiv:1609.08253, 2016.
[L1] Eugene M. Luks, Group isomorphism with fixed subnormal chains. arXiv:1511.00151, 2015.
[L2] ———, Hypergraph isomorphism and structural equivalence of Boolean functions, Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, STOC 1999, pp. 652–658.
[L3] ———, Permutation groups and polynomial-time computation, Groups and Computation, 1993.
[LQ] Yinan Li and Youming Qiao, Linear algebraic analogues of the graph isomorphism problem and the Erdős–Rényi model, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, pp. 463–474.
[LV] Ruvim Lipyanski and Natalia Vanetik, On Borel complexity of the isomorphism problems for graph related classes of Lie algebras and finite p-groups, J. Algebra Appl. (2015), no. 5, 1550078, 15.
[LW1] Serge Lang and André Weil, Number of points of varieties in finite fields, American Journal of Mathematics (1954), no. 4, 819–827.
[LW2] Mark L. Lewis and James B. Wilson, Isomorphism in expanding families of indistinguishable groups, Groups Complex. Cryptol. (2012), no. 1, 73–110.
[M1] Joshua Maglione, Compatible filters for isomorphism testing. arXiv:1805.03732, 2018.
[M2] ———, Longer nilpotent series for classical unipotent subgroups, J. Group Theory (2015), no. 4, 569–585. MR3365818
[M3] ———, Efficient characteristic refinements for finite groups, J. Symbolic Comput. (2017), 511–520.
[M4] Avinoam Mann, Some questions about p-groups, J. Austral. Math. Soc. Ser. A (1999), no. 3, 356–379.
[N] P. S. Novikov, On algorithmic undecidability of the word problem in the theory of groups, Trudy Mat. Inst. Steklov (1955), 1–144.
[R1] Michael O. Rabin, Recursive unsolvability of group theoretic problems, Annals of Mathematics (1958), 172–194.
[R2] David J. Rosenbaum, Bidirectional collision detection and faster deterministic isomorphism testing. arXiv:1304.3935, 2013.
[R3] ———, Breaking the n^{log n} barrier for solvable-group isomorphism, Proceedings of the Twenty-Fourth Annual ACM–SIAM Symposium on Discrete Algorithms, SODA 2013, pp. 1054–1073.
[RW] David J. Rosenbaum and Fabian Wagner, Beating the generator-enumeration bound for p-group isomorphism, Theoret. Comput. Sci. (2015), 16–25.
[S] Charles C. Sims, Enumerating p-groups, Proceedings of the London Mathematical Society (1965), no. 1, 151–166.
[V] A. M. Vershik, Statistical mechanics of combinatorial partitions, and their limit shapes, V.A. Steklov Institute of Mathematics, Russian Academy of Sciences.
[WL] Boris Weisfeiler and Andrei A. Lehman, A reduction of a graph to a canonical form and an algebra arising during this reduction, Nauchno-Technicheskaya Informatsia (1968).
[W1] On construction and identification of graphs, Lecture Notes in Mathematics, vol. 558, Springer-Verlag, 1976. With contributions by A. Lehman, G. M. Adelson-Velsky, V. Arlazarov, I. Faragev, A. Uskov, I. Zuev, M. Rosenfeld, and B. Weisfeiler.
[W2] James B. Wilson, The threshold for subgroup profiles to agree is Ω(log n). arXiv:1612.01444.
[W3] ———, More characteristic subgroups, Lie rings, and isomorphism tests for p-groups, J. Group Theory (2013), no. 6, 875–897. MR3198722
[W4] ———, Skolem–Noether for nilpotent products. arXiv:1507.04406, 2015.