Algorithms for matrix groups

(1)

Eamonn O’Brien

University of Auckland

December 2010

(2)

Overview

G = ⟨X⟩ ≤ GL(d, R) where R is a ring; usually the finite field GF(q).

Goal: efficient algorithms for the study of such groups, effective in both theory and practice.

(4)

Why do we care?

Modular representation theory: Dickson (1910s), applications to number theory, algebraic groups etc.

Sporadic simple groups: constructed as irreducible representations over small fields.

Benson et al. (1982): J₄ ≤ GL(112, 2), order approximately 10²⁰.

Invariant theory: irreducible representations, Kronecker products, tensor-induced representations.

Energy levels of systems of identical particles: irreducible representations of classical groups

(10)

Cost of matrix multiplication

Two d × d matrices A and B

Cost of A × B using conventional algorithm is O(d³).

Strassen: O(d^{log₂ 7})

Coppersmith & Winograd (1990): O(d^2.37)

Where do we notice improvements? Perhaps for d ≥ 100.
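To make the O(d^{log₂ 7}) bound concrete, here is a minimal sketch of Strassen's recursion in pure Python. It assumes square matrices whose size is a power of two, stored as lists of rows; the function names and the `cutoff` parameter are illustrative choices, not taken from the slides. Each level replaces 8 half-size products by 7, which gives the exponent log₂ 7 ≈ 2.81.

```python
def naive_mul(A, B):
    """Conventional O(d^3) product of two square matrices (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def quadrants(M):
    """Split M into its four half-size blocks M11, M12, M21, M22."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def strassen(A, B, cutoff=1):
    """Strassen product; size must be a power of two above the cutoff."""
    n = len(A)
    if n <= cutoff:
        return naive_mul(A, B)
    a11, a12, a21, a22 = quadrants(A)
    b11, b12, b21, b22 = quadrants(B)
    # Seven recursive products instead of eight.
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

In practice a larger `cutoff` is used, so that small blocks fall back to the conventional algorithm; this is one reason gains only show up for larger d.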

(15)

Membership

Given G ≤ GL(d , Z), and x ∈ GL(d, Z): is x ∈ G ?

Mihailova (1958): membership problem is undecidable for d ≥ 4.

GF(q): |GL(d, q)| = O(q^{d²})

Membership decidable from exhaustive search.

Even for 1 × 1 matrices over GF(q), membership is related to the discrete log problem:

F = GF(q), ω ∈ F primitive.

Given nonzero α ∈ F, determine k so that α = ω^k.

No polynomial-time algorithm known.
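For illustration, here is a sketch of the baby-step giant-step method for the discrete log in GF(p) with p prime (a simplifying assumption; the slides allow any GF(q)). The function name is hypothetical. It uses about √p multiplications and table entries, which is still exponential in log p, consistent with no polynomial-time algorithm being known.

```python
from math import isqrt

def discrete_log(alpha, omega, p):
    """Return k in [0, p-2] with omega**k == alpha (mod p), or None.

    Assumes p is prime and omega is a primitive element of GF(p).
    """
    n = p - 1                      # order of the multiplicative group
    m = isqrt(n) + 1               # m*m >= n, so k = i*m + j with i, j < m
    baby = {}
    cur = 1
    for j in range(m):             # baby steps: omega^j
        baby.setdefault(cur, j)
        cur = cur * omega % p
    factor = pow(omega, -m, p)     # omega^(-m); needs Python 3.8+
    gamma = alpha % p
    for i in range(m):             # giant steps: alpha * omega^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None
```

Deciding whether the 1 × 1 matrix (α) lies in the group generated by (ω) comes down to whether such a k exists.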

(22)

Challenge Problem I: Order of a matrix

Let g ∈ GL(d , q).

Find the least n ≥ 1 such that gⁿ = 1.

GL(d, q) has elements of order q^d − 1 (Singer cycles).

To find |g|: probably requires factorisation of numbers of the form qⁱ − 1, a hard problem.

Babai & Beals (1999): Theorem

If the set of primes dividing a multiplicative upper-bound B for |g | is known, then the precise value of |g | can be determined in polynomial time.

(26)

Celler & Leedham-Green (1995): compute order in time O(d³log q) subject to factorisation of qⁱ − 1 for 1 ≤ i ≤ d .

• Compute a “good” multiplicative upper bound E for |g|.

Determine and factorise the minimal polynomial for g as

m(x) = ∏_{i=1}^{t} f_i(x)^{m_i}

where deg(f_i) = d_i, p is the characteristic of GF(q), and β = ⌈log_p max m_i⌉.

E = lcm(q^{d_i} − 1) × p^β

|g| divides E.

(31)

How can we use E ?

If E = ∏_{i=1}^{t} p_i^{α_i} then we can determine |g| in O(log t log n) multiplications.

If t = 1, then compute g^{p_1^j} for j = 1, 2, …, α_1.

Otherwise write E = uv where u, v are coprime and have approximately the same number of distinct prime factors.

Now g^u has order k say, dividing v; and g^k has order ℓ say, dividing u.

The order of g is kℓ.
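The divide-and-conquer step can be sketched as follows. This is an illustration only: it works over GF(p) with p prime, takes the factorisation of a multiplicative upper bound E as a dictionary `{prime: exponent}`, and does not compute E from the minimal polynomial. All function names are hypothetical.

```python
from math import prod

def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def mat_pow(A, e, p):
    """Compute A**e mod p by repeated squaring."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    base = A
    while e:
        if e & 1:
            result = mat_mul(result, base, p)
        base = mat_mul(base, base, p)
        e >>= 1
    return result

def is_identity(A, p):
    n = len(A)
    return all(A[i][j] % p == (i == j) for i in range(n) for j in range(n))

def order_from_bound(g, factors, p):
    """Order of g, given factors = {prime: exponent} of some E with g**E == 1."""
    primes = sorted(factors)
    if not primes:
        return 1
    if len(primes) == 1:           # case t = 1: the order is a power of r
        r = primes[0]
        h, order = g, 1
        for _ in range(factors[r]):
            if is_identity(h, p):
                break
            h = mat_pow(h, r, p)
            order *= r
        if not is_identity(h, p):
            raise ValueError("g**E is not the identity")
        return order
    half = len(primes) // 2        # split E = u * v with u, v coprime
    fu = {r: factors[r] for r in primes[:half]}
    fv = {r: factors[r] for r in primes[half:]}
    u = prod(r ** a for r, a in fu.items())
    k = order_from_bound(mat_pow(g, u, p), fv, p)   # |g^u| divides v
    l = order_from_bound(mat_pow(g, k, p), fu, p)   # |g^k| divides u
    return k * l
```

For example, with E = |GL(2, 7)| = 2016 = 2⁵ · 3² · 7, calling `order_from_bound([[3, 0], [0, 1]], {2: 5, 3: 2, 7: 1}, 7)` returns 6.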

(37)

So cost is O(d³ log q log t) field operations if we can factorise E.

If we don’t complete the factorisation, then we obtain a pseudo-order [order × some large primes] of g, which suffices for most theoretical and practical purposes.

Implementations in both GAP and Magma use databases of factorisations of numbers of the form qⁱ− 1, prepared as part of the Cunningham Project.

(41)

Variation on this theme

Task: Determine if g has even order.

If we just know E , then we can learn in polynomial time the exact power of 2 (or of any specified prime) which divides |g |.

By repeated division by 2, we write E = 2^m b where b is odd.

Now we compute h = g^b, and determine (by powering) its order, which divides 2^m.
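A small sketch of this test, again over GF(p) with p prime and with hypothetical function names. It needs only the integer E, not its factorisation: strip the powers of 2 from E, raise g to the odd part, then square until the identity appears.

```python
def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def mat_pow(A, e, p):
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    base = A
    while e:
        if e & 1:
            result = mat_mul(result, base, p)
        base = mat_mul(base, base, p)
        e >>= 1
    return result

def is_identity(A, p):
    n = len(A)
    return all(A[i][j] % p == (i == j) for i in range(n) for j in range(n))

def two_part_exponent(g, E, p):
    """Return s such that 2**s is the exact power of 2 dividing |g|.

    Assumes g**E == 1, i.e. E is a multiplicative upper bound for |g|.
    """
    m, b = 0, E
    while b % 2 == 0:              # write E = 2^m * b with b odd
        b //= 2
        m += 1
    h = mat_pow(g, b, p)           # h has order 2^s for some s <= m
    s = 0
    while not is_identity(h, p):
        if s == m:
            raise ValueError("g**E is not the identity")
        h = mat_mul(h, h, p)
        s += 1
    return s

def has_even_order(g, E, p):
    return two_part_exponent(g, E, p) > 0
```

The same idea works for any specified prime r in place of 2, replacing squaring by r-th powers.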

(45)

Randomness

|GL(d, q)| = O(q^{d²})

Many algorithms are randomised: use random search in G to find elements having prescribed property P.

Example

Characteristic polynomial having factor of degree > d /2.

Order divisible by prescribed prime.

Common feature: algorithms depend on detailed analysis of proportion of elements of finite simple groups satisfying P.

(48)

Assume we determine a lower bound, say 1/k, for proportion of elements in G satisfying Property P.

To find an element satisfying P by random search with a probability of failure less than a given ε ∈ (0, 1): choose a sample of uniformly distributed random elements in G of size at least ⌈−log_e(ε)⌉ k.
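The bound follows from (1 − 1/k)^N < e^{−N/k} ≤ ε once N ≥ k · log_e(1/ε). A minimal sketch of the resulting search loop is below; `draw` and `has_property` are hypothetical callables standing in for a random element generator for G and a test for property P.

```python
import math

def sample_size(k, eps):
    """Number of samples so that failure probability is below eps,
    given that at least a proportion 1/k of elements satisfy P."""
    if not (0 < eps < 1) or k < 1:
        raise ValueError("need 0 < eps < 1 and k >= 1")
    return math.ceil(-math.log(eps)) * k

def random_search(draw, has_property, k, eps):
    """Return an element with property P, or None if the search fails."""
    for _ in range(sample_size(k, eps)):
        x = draw()
        if has_property(x):
            return x
    return None
```

For instance, with k = 10 and ε = 0.01 the sample size is 5 × 10 = 50, and (9/10)⁵⁰ is roughly 0.005.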

(50)

Challenge Problem II: Generate random elements

Babai (1991): Vertex-transitive graph approach

Independent, nearly uniformly distributed random elements of a finite group G = ⟨X⟩ can be found after a preprocessing stage consisting of O(log⁵ |G|) group operations.

Preprocessing proceeds in O(log |G|) phases.

In each phase, a random walk of random length between 1 and O((log |G|)⁴) is performed on the Cayley graph of G.

The element found when the walk finishes is added to the generators of G.

The walk is repeated O(log |G|) times.

(56)

Final list S of O(log |G |) elements input to construction phase.

A random element is a random subproduct of S: g_1^{ε_1} ⋯ g_m^{ε_m}

where S = {g₁, …, g_m} and ε_i ∈ {0, 1} (chosen independently).

For G ≤ GL(d, q), log |G| < d² log q.

Initialisation phase O(d¹⁰ log⁵ q).

Cost per random element is O(log |G|).
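The construction phase alone is easy to sketch. The code below forms a random subproduct of a given list S of matrices over GF(p), p prime; it does not implement the preprocessing that produces a suitable S, and the function names and the `rng` parameter are illustrative assumptions.

```python
import random

def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def random_subproduct(S, p, rng=random):
    """Return g_1^e_1 * ... * g_m^e_m with each e_i an independent random bit."""
    d = len(S[0])
    result = [[int(i == j) for j in range(d)] for i in range(d)]
    for g in S:
        if rng.getrandbits(1):     # include g_i with probability 1/2
            result = mat_mul(result, g, p)
    return result
```

Each call costs at most |S| multiplications, matching the O(log |G|) cost per element when S has O(log |G|) entries.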

(61)

CLMNO (1995): Product replacement algorithm

Input: ordered list of generators [g₁, …, g_m] for G.

Accumulator: r initialised to be the identity of G.

Basic step:

Select at random i, j where 1 ≤ i, j ≤ m and i ≠ j.

Replace g_i by either g_i g_j or g_j g_i.

Multiply r by g_i.

The basic step is repeated a number, say t, of times.

Now to obtain a random element: execute the basic step once, and return r as the random element.
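The steps above can be sketched as a small class for matrices over GF(p), p prime. The class name, the default of 50 initial steps, and the `rng` parameter are assumptions made for illustration; they are not values given in the slides, and real implementations tune them.

```python
import random

def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

class ProductReplacement:
    def __init__(self, gens, p, initial_steps=50, rng=None):
        if len(gens) < 2:
            raise ValueError("need at least two list entries")
        self.p = p
        self.rng = rng or random.Random()
        self.gens = [[row[:] for row in g] for g in gens]
        d = len(gens[0])
        self.r = [[int(i == j) for j in range(d)] for i in range(d)]
        for _ in range(initial_steps):     # the t initial basic steps
            self.step()

    def step(self):
        """One basic step: two matrix multiplications."""
        i, j = self.rng.sample(range(len(self.gens)), 2)   # i != j
        if self.rng.random() < 0.5:
            self.gens[i] = mat_mul(self.gens[i], self.gens[j], self.p)
        else:
            self.gens[i] = mat_mul(self.gens[j], self.gens[i], self.p)
        self.r = mat_mul(self.r, self.gens[i], self.p)

    def random_element(self):
        self.step()
        return self.r
```

Because g_i is only ever multiplied by a different list entry g_j, the list remains a generating sequence for G after every step.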

(69)

Cost: after initialisation, two matrix multiplications.

Markov chain: a discrete random process with a finite number of states in which the next state depends only on the current state.

Aperiodic: returns to a given state are not restricted to multiples of some period greater than 1.

Theorem

Let T be the set of all m-tuples of generators of G. Then the algorithm constructs a Markov chain over state space T, and if m is at least twice the size of a minimal generating set for G, this Markov chain is connected and aperiodic.

The random walk approaches a limiting distribution at exponential rate O((1 − δ)^t) where t is number of steps taken.

(74)

Mixing time

What can we say about the “mixing time”, t?

A variety of statistical tests has been applied to the outcome of the algorithm.

Practical performance: excellent.

Diaconis & Saloff-Coste (1997, 1998):

t = O(δ²(G , S ) · m), where δ(G , S ) is the maximal diameter for the Cayley graph of G wrt generating set S .

Comparison of two Markov chains on different but related state spaces and combinatorics of random paths.

Pak (2001): Mixing time is polynomial. Multi-commodity flow technique.

Lubotzky & Pak (2002):

Does the group of automorphisms of a free group of rank > 3 have Kazhdan’s property (T)? If so, then “graph of states” is well-behaved, giving excellent mixing time.

(79)

Permutation groups

Sims (1970, 1971): base and strong generating set (BSGS).

G acts faithfully on Ω = {1, …, n}

G_α = {g ∈ G | α^g = α}.

Base: sequence of points B = [β_1, β_2, …, β_k] where G_{β_1,β_2,…,β_k} = 1.

This determines a chain of stabilisers

G = G^{(0)} ≥ G^{(1)} ≥ ⋯ ≥ G^{(k−1)} ≥ G^{(k)} = 1, where G^{(i)} = G_{β_1,β_2,…,β_i}.

S strong generating set: G^{(i)} = ⟨S ∩ G^{(i)}⟩

Example

G = ⟨(1, 5, 2, 6), (1, 2)(3, 4)(5, 6)⟩

B = [1, 3]

G > G_1 > G_{1,3} = 1

S = {(1, 5, 2, 6), (1, 2)(3, 4)(5, 6), (3, 4)}
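The example can be checked by brute force. The sketch below enumerates the whole group, so it is only suitable for tiny examples and is not the Schreier-Sims algorithm. Permutations are stored as tuples with an unused entry at index 0, and all function names are hypothetical.

```python
def perm(cycles, n=6):
    """Build a permutation of {1..n} from disjoint cycles; index 0 is unused."""
    img = list(range(n + 1))
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            img[x] = y
    return tuple(img)

def compose(g, h):
    """Right action: the point i is sent to (i^g)^h."""
    return tuple(h[g[i]] for i in range(len(g)))

def generate(gens, n=6):
    """All elements of the group generated by gens (closure under products)."""
    ident = tuple(range(n + 1))
    seen, frontier = {ident}, [ident]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = compose(x, s)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

def pointwise_stabiliser(elements, points):
    return {g for g in elements if all(g[x] == x for x in points)}

def is_bsgs(S, base, n=6):
    """Check that base is a base and S a strong generating set relative to it."""
    G = generate(S, n)
    for i in range(len(base) + 1):
        fixed = base[:i]
        level = pointwise_stabiliser(G, fixed)          # G^(i)
        if generate(list(pointwise_stabiliser(S, fixed)), n) != level:
            return False
    return len(pointwise_stabiliser(G, base)) == 1

a = perm([(1, 5, 2, 6)])
b = perm([(1, 2), (3, 4), (5, 6)])
c = perm([(3, 4)])
G = generate([a, b])
chain_orders = [len(pointwise_stabiliser(G, [1, 3][:i])) for i in range(3)]
```

Here `chain_orders` is `[8, 2, 1]`, and `is_bsgs([a, b, c], [1, 3])` holds while `is_bsgs([a, b], [1, 3])` does not: without (3, 4) no generator lies in G_1.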

(82)

Central task: construct basic orbits: the orbit B_i of the base point β_{i+1} under G^{(i)}.

|G^{(i)} : G^{(i+1)}| = #B_i

Schreier’s Lemma gives a generating set for each G^{(i)}.

Base image B^g = [β_1^g, …, β_k^g] uniquely determines g:

if B^g = B^h then B^{gh⁻¹} = B, so gh⁻¹ = 1.

Hence g can be represented as a |B|-tuple.

Variations underpin both theoretical and practical approaches to permutation group algorithms.
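As a sketch of these ingredients on the group G = ⟨(1, 5, 2, 6), (1, 2)(3, 4)(5, 6)⟩ with base [1, 3]: the code below builds a basic orbit with a transversal, forms the Schreier generators u_x · s · u_{x^s}⁻¹ of the stabiliser, and checks that base images are distinct. It keeps every Schreier generator, which real implementations avoid, and all names are hypothetical.

```python
def perm(cycles, n=6):
    img = list(range(n + 1))
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            img[x] = y
    return tuple(img)

def compose(g, h):
    """Right action: the point i is sent to (i^g)^h."""
    return tuple(h[g[i]] for i in range(len(g)))

def inverse(g):
    inv = [0] * len(g)
    for i, x in enumerate(g):
        inv[x] = i
    return tuple(inv)

def orbit_transversal(gens, point, n=6):
    """Map each orbit point x to an element u_x with point^(u_x) = x."""
    trans = {point: tuple(range(n + 1))}
    queue = [point]
    for x in queue:                       # queue grows while we iterate
        for s in gens:
            y = s[x]
            if y not in trans:
                trans[y] = compose(trans[x], s)
                queue.append(y)
    return trans

def schreier_generators(gens, trans):
    """Non-trivial Schreier generators of the stabiliser of the orbit's root."""
    ident = tuple(range(len(gens[0])))
    found = set()
    for x, u in trans.items():
        for s in gens:
            found.add(compose(compose(u, s), inverse(trans[s[x]])))
    found.discard(ident)
    return found

def generate(gens, n=6):
    ident = tuple(range(n + 1))
    seen, frontier = {ident}, [ident]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = compose(x, s)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

a = perm([(1, 5, 2, 6)])
b = perm([(1, 2), (3, 4), (5, 6)])
orbit1 = orbit_transversal([a, b], 1)                # basic orbit of 1 under G
stab1_gens = sorted(schreier_generators([a, b], orbit1))
orbit2 = orbit_transversal(stab1_gens, 3)            # basic orbit of 3 under G_1
order = len(orbit1) * len(orbit2)                    # product of the indices
base_images = {(g[1], g[3]) for g in generate([a, b])}
```

The basic orbits have sizes 4 and 2, so `order` is 8, and the 8 elements of G give 8 distinct base images.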

(88)

Schreier-Sims for matrix groups

G acts faithfully on V = F^d: v · g , for v ∈ V

Compute BSGS for G, viewed as a permutation group on the vectors.

Base points: standard basis vectors for V.

Central problem: basic orbits B_i large. Usually |B₁| is |G|.

Butler (1979): action of G on one-dimensional subspaces of V.

Murray & O’Brien (1995): heuristic algorithm to select base points.

Neunhöffer et al. (2000s): use “helper subgroups” to construct large orbits.

(94)

Critical for success: index of one stabiliser in its predecessor.

|S_n : S_{n−1}| = n

“Optimal” subgroup chain for GL(d, q)?

GL(d, q) ≥ q^{d−1}.GL(d − 1, q) ≥ GL(d − 1, q) ≥ …

Leading index: q^d − 1.

Example

Largest maximal subgroup 2¹¹ : M₂₄ ≤ J₄ has index 173 067 389.

(98)

Geometry following Aschbacher

Aschbacher (1984)

G a maximal subgroup of GL(d, q); let V be the underlying vector space.

G preserves some natural linear structure associated with the action of G on V, and has a normal subgroup related to this structure,

or G is almost simple modulo scalars: T ≤ G /Z ≤ Aut(T ) where T is simple.

(101)

Basic strategy

1 Determine (at least one of) its Aschbacher categories.

2 If N ◁ G exists, recognise N and G/N recursively, ultimately obtaining a composition series for the group.

7 categories giving normal subgroup

Example

G acts imprimitively on V, preserving r blocks, so V = ⊕_{i=1}^{r} V_i.

Then φ : G → S_r where r | d and N = ker φ.

CompositionTree: exploits geometry to produce composition series for G , factors are leaves of tree.
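To illustrate the imprimitive example, here is a sketch of the map φ in the simplest situation: it assumes the blocks V_i are spanned by consecutive standard basis vectors (a strong simplification; finding a block system is the hard part and is not shown), that matrices act on row vectors over GF(p) with p prime, and that blocks are numbered from 0. The function names are hypothetical.

```python
def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def block_permutation(g, r, p):
    """Image of g under phi: G -> S_r, as a tuple (block i goes to image[i]).

    Assumes block i is spanned by basis vectors i*size .. (i+1)*size - 1.
    """
    d = len(g)
    size = d // r
    image = []
    for i in range(r):
        targets = set()
        for row in range(i * size, (i + 1) * size):   # e_row * g is row `row`
            for col in range(d):
                if g[row][col] % p != 0:
                    targets.add(col // size)
        if len(targets) != 1:
            raise ValueError("g does not permute the coordinate blocks")
        image.append(targets.pop())
    return tuple(image)

def in_kernel(g, r, p):
    """True if g lies in N = ker(phi), i.e. fixes every block."""
    return block_permutation(g, r, p) == tuple(range(r))

# d = 4, r = 2 over GF(3): g swaps the two blocks, h fixes both.
g = [[0, 0, 1, 0], [0, 0, 1, 1], [1, 2, 0, 0], [0, 1, 0, 0]]
h = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]
```

Here `block_permutation(g, 2, 3)` is `(1, 0)`, while h and g·g both map to the identity permutation and so lie in N.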
