GL( N,GF(p) ).random_element() is slow

70 views
Skip to first unread message

charles Bouillaguet

unread,
Jul 2, 2012, 3:12:25 PM7/2/12
to sage-...@googlegroups.com
Hi,

I was wondering why some of my code was Dawn Slow (tm), and I ended up
being surprised to notice that it was spending all its time trying to
generate a random invertible matrix.... In particular, over finite
fields, GL(N, GF(q)).random_element() is MUCH MUCH MUCH slower than
the naive method that just generates a random matrix, checks if it is
invertible, and tries again if it is not the case


sage: %time GL(64, GF(2)).random_element()
CPU times: user 20.47 s, sys: 2.64 s, total: 23.11 s
Wall time: 28.93 s

--> 30s is not a reasonable performance to generate a small random
invertible matrix....

By the way, this fails with large primes.

%time GL(64, GF(2^127-1)).random_element()
....
TypeError: Unable to convert Gap element 'ZmodpZObj(
152551219330529388046437174479921365258,
170141183460469231731687303715884105727 )'
error coercing to finite field

I would suggest overloading GL(N, K).random_element() with the naive
procedure when K is a finite field.

def faster_random_invertible_matrix(n,K):
S = matrix(K,n)
while not S.is_invertible():
S = MatrixSpace(K,n,n).random_element()
return S


sage: %timeit faster_random_invertible_matrix(64, GF(2))
125 loops, best of 3: 1.8 ms per loop

---> this is about 15000 times faster....

How do you feel about this ? It's not a bug stricto sensu, but
math-oriented people might stumble across GL(N,K).random_element() and
try to use it even is a much faster solution is available.
--
Charles Bouillaguet

Martin Albrecht

unread,
Jul 2, 2012, 6:32:53 PM7/2/12
to sage-...@googlegroups.com
We should be able to do even better, right?

Generate upper triangular + lower triangular matrix and do a product?
Cheers,
Martin

--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_otr: 47F43D1A 5D68C36F 468BAEBA 640E8856 D7951CCF
_www: http://martinralbrecht.wordpress.com/
_jab: martinr...@jabber.ccc.de

Robert Bradshaw

unread,
Jul 2, 2012, 6:44:09 PM7/2/12
to sage-...@googlegroups.com
The question is what distribution one is aiming for. Is there
something special about GL(n, F) that we're trying to achieve?
Otherwise, I think the generate-and-check, perhaps re-defining a
single random entry on failure, is an evener distribution.

- Robert
> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org

Martin Albrecht

unread,
Jul 2, 2012, 7:03:06 PM7/2/12
to sage-...@googlegroups.com
Shouldn't both give the same distribution mod p? Since every non-singular
matrix A has a LU decomposition we should be able to just sample L and U
separately to produce A?

Simon King

unread,
Jul 2, 2012, 7:34:03 PM7/2/12
to sage-...@googlegroups.com
Hi!

On 2012-07-02, Martin Albrecht <martinr...@googlemail.com> wrote:
> Shouldn't both give the same distribution mod p? Since every non-singular
> matrix A has a LU decomposition we should be able to just sample L and U
> separately to produce A?

Sorry for my ignorance, but is it really the case that an LU
decomposition exists for all invertible matrices? I thought there may
only be an LUP decomposition.

If I am not mistaken, the LU decomposition is unique if one requires
that L (or U) has only 1 on the diagonal. Because of the uniqueness, I'd
expect that putting 1 on the diagonal of L and choosing the entries of U
and the remaining of L randomly equally distributed yields a reasonable
distribution of invertible matrices.

However, if it is really the case that we must consider LUP
decompositions, then I am not totally convinced that a nicely distributed
random choice of a permutation matrix P on top of the choice of L and U as
above yields a nice distribution of invertible matrices.

Cheers,
Simon


Martin Albrecht

unread,
Jul 2, 2012, 7:46:48 PM7/2/12
to sage-...@googlegroups.com
Hi,

On Tuesday 03 Jul 2012, Simon King wrote:
> Hi!
>
> On 2012-07-02, Martin Albrecht <martinr...@googlemail.com> wrote:
> > Shouldn't both give the same distribution mod p? Since every non-singular
> > matrix A has a LU decomposition we should be able to just sample L and U
> > separately to produce A?
>
> Sorry for my ignorance, but is it really the case that an LU
> decomposition exists for all invertible matrices? I thought there may
> only be an LUP decomposition.

Argh, yes, you're right: should be LUP.

> If I am not mistaken, the LU decomposition is unique if one requires
> that L (or U) has only 1 on the diagonal. Because of the uniqueness, I'd
> expect that putting 1 on the diagonal of L and choosing the entries of U
> and the remaining of L randomly equally distributed yields a reasonable
> distribution of invertible matrices.
>
> However, if it is really the case that we must consider LUP
> decompositions, then I am not totally convinced that a nicely distributed
> random choice of a permutation matrix P on top of the choice of L and U as
> above yields a nice distribution of invertible matrices.

Mhh, why not? If A = LUP we just write AP^-1 = LU, hence for each LU we
construct there are as many As as there are permutation matrices, or am I
missing something (again :))?

Dima Pasechnik

unread,
Jul 2, 2012, 9:29:28 PM7/2/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 07:46:48 UTC+8, Martin Albrecht wrote:
Hi,

On Tuesday 03 Jul 2012, Simon King wrote:
> Hi!
>
> On 2012-07-02, Martin Albrecht  wrote:
> > Shouldn't both give the same distribution mod p? Since every non-singular
> > matrix A has a LU decomposition we should be able to just sample L and U
> > separately to produce A?
>
> Sorry for my ignorance, but is it really the case that an LU
> decomposition exists for all invertible matrices? I thought there may
> only be an LUP decomposition.

Argh, yes, you're right: should be LUP. 

> If I am not mistaken, the LU decomposition is unique if one requires
> that L (or U) has only 1 on the diagonal. Because of the uniqueness, I'd
> expect that putting 1 on the diagonal of L and choosing the entries of U
> and the remaining of L randomly equally distributed yields a reasonable
> distribution of invertible matrices.
>
> However, if it is really the case that we must consider LUP
> decompositions, then I am not totally convinced that a nicely distributed
> random choice of a permutation matrix P on top of the choice of L and U as
> above yields a nice distribution of invertible matrices.

Mhh, why not? If A = LUP we just write AP^-1  = LU, hence for each LU we
construct there are as many As as there are permutation matrices, or am I
missing something (again :))?

P can be taken to be identity iff the leading principal minors of A are non-0.
Thus LU (say, with diag(L)=(1,...,1)) gives you a random matrix from this class.
So if you are content with sampling from such subclass, LU suffices.


There is, in fact, a big industry in the theory of finite groups that handles these
kinds of questions.

Dima
 

Cheers,
Martin

charles Bouillaguet

unread,
Jul 2, 2012, 9:56:04 PM7/2/12
to sage-...@googlegroups.com
> Mhh, why not? If A = LUP we just write AP^-1 = LU, hence for each LU we
> construct there are as many As as there are permutation matrices, or am I
> missing something (again :))?

I am not sure that the LUP decomposition is unique (I understand that
the LU is). If A has more distinct LUP factorizations than B, then A
is more likely to be produced by this process than B....

Charles

Dima Pasechnik

unread,
Jul 2, 2012, 10:36:37 PM7/2/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 09:56:04 UTC+8, Charles Bouillaguet wrote:
> Mhh, why not? If A = LUP we just write AP^-1  = LU, hence for each LU we
> construct there are as many As as there are permutation matrices, or am I
> missing something (again :))?

I am not sure that the LUP decomposition is unique (I understand that
the LU is).

An invertible matrix need not have an LU decomposition. E.g. A=[[0,1],[1,1]] does not have it.

It's evident over F_2: L can only take 2 values, and U can only take 2 values, so you can't have
more than 4 different matrices of the form LU :–) 

Dima Pasechnik

unread,
Jul 2, 2012, 11:26:54 PM7/2/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 10:36:37 UTC+8, Dima Pasechnik wrote:


On Tuesday, 3 July 2012 09:56:04 UTC+8, Charles Bouillaguet wrote:
> Mhh, why not? If A = LUP we just write AP^-1  = LU, hence for each LU we
> construct there are as many As as there are permutation matrices, or am I
> missing something (again :))?

I am not sure that the LUP decomposition is unique (I understand that
the LU is).

An invertible matrix need not have an LU decomposition. E.g. A=[[0,1],[1,1]] does not have it.

It's evident over F_2: L can only take 2 values, and U can only take 2 values, so you can't have
more than 4 different matrices of the form LU :–) 

by the way, for generating random elements it might be better to use the Bruhat decomposition, which is unique. See

Dima Pasechnik

unread,
Jul 3, 2012, 12:31:39 AM7/3/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 11:26:54 UTC+8, Dima Pasechnik wrote:


On Tuesday, 3 July 2012 10:36:37 UTC+8, Dima Pasechnik wrote:


On Tuesday, 3 July 2012 09:56:04 UTC+8, Charles Bouillaguet wrote:
> Mhh, why not? If A = LUP we just write AP^-1  = LU, hence for each LU we
> construct there are as many As as there are permutation matrices, or am I
> missing something (again :))?

I am not sure that the LUP decomposition is unique (I understand that
the LU is).

An invertible matrix need not have an LU decomposition. E.g. A=[[0,1],[1,1]] does not have it.

It's evident over F_2: L can only take 2 values, and U can only take 2 values, so you can't have
more than 4 different matrices of the form LU :–) 

by the way, for generating random elements it might be better to use the Bruhat decomposition, which is unique. See


I take the last part back, sorry, it makes little sense. 
What certainly is doable is the following:
1) uniformly select a random maximal flag
2) uniformly select a random element U in the subgroup stabilizing the "canonical" maximal flag, i.e. the one stabilized by the 
upper triangular matrices.
Then an element MU, for M being a matrix mapping the "canonical" flag to the flag selected at step 1, will be uniformly
distributed over the whole group G (as step 1 selects a coset of U in G).

Javier López Peña

unread,
Jul 3, 2012, 5:45:52 AM7/3/12
to sage-...@googlegroups.com
I know little of random methods, but do we really need to make things so complicated? As the OP suggests, we might as well just generate matrices uniformly at random and discard if not invertible. The set of invertible matrices is Zariski open, so the probability of hitting a non-invertible matrix should be very small (0 for real or complex matrices, I don't know about finite fields).

I understand this naive method might not be the quickest possible algorithm but it is much faster than what we have now and just works, so why not implement just that, with the obvious fix so that the MatrixSpace doesn't get called over and over?

Cheers,
J

Dima Pasechnik

unread,
Jul 3, 2012, 5:53:23 AM7/3/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 17:45:52 UTC+8, Javier López Peña wrote:
I know little of random methods, but do we really need to make things so complicated? As the OP suggests, we might as well just generate matrices uniformly at random and discard if not invertible. The set of invertible matrices is Zariski open, so the probability of hitting a non-invertible matrix should be very small (0 for real or complex matrices, I don't know about finite fields).

well, it's not that small, especially for finite fields. E.g. for F_2 and n=3, one only gets 168 invertible matrices out of 512=2^9 in total...
(I can't resist saying that the order of GL(n,q) is (q^n-1)(q^{n-1}-1)...(q^2-1)(q-1))
So it's not gonna be very fast, also note that computing the determinant comes at a nonzero cost when matrices are big...

Dima

Javier López Peña

unread,
Jul 3, 2012, 7:55:54 AM7/3/12
to sage-...@googlegroups.com


On Tuesday, July 3, 2012 10:53:23 AM UTC+1, Dima Pasechnik wrote:
well, it's not that small, especially for finite fields. E.g. for F_2 and n=3, one only gets 168 invertible matrices out of 512=2^9 in total...
(I can't resist saying that the order of GL(n,q) is (q^n-1)(q^{n-1}-1)...(q^2-1)(q-1))
So it's not gonna be very fast, also note that computing the determinant comes at a nonzero cost when matrices are big...

I stand corrected. Even in the worst case scenario (with F_2) it still seems that about one in three matrices is invertible, so the situation is not too bad.  Also, there is no need to compute the determinants, knowing if the rank is full or not is enough. So my point is: even if not *very* fast, still seems faster than what we have now, and it is very easy to implement while we think of a better solution.

Javier

Dima Pasechnik

unread,
Jul 3, 2012, 8:19:31 AM7/3/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 19:55:54 UTC+8, Javier López Peña wrote:


On Tuesday, July 3, 2012 10:53:23 AM UTC+1, Dima Pasechnik wrote:
well, it's not that small, especially for finite fields. E.g. for F_2 and n=3, one only gets 168 invertible matrices out of 512=2^9 in total...
(I can't resist saying that the order of GL(n,q) is (q^n-1)(q^{n-1}-1)...(q^2-1)(q-1))
So it's not gonna be very fast, also note that computing the determinant comes at a nonzero cost when matrices are big...

I stand corrected. Even in the worst case scenario (with F_2) it still seems that about one in three matrices is invertible,

no. As n grows, the ratio gets much, much worse: e.g. for n=2, q=2:

sage: float((2^10-1)*(2^9-1)*(2^8-1)*(2^7-1)*(2^6-1)*(2^5-1)*(2^4-1)*7*3/2^100)
8.215872026646278e-15

It's asymptotically 0, as is not hard to see, just look at the behaviour of the dominating sequence
q^n q^{n-1}... q^2 q/q^{n^2}

 
so the situation is not too bad.  Also, there is no need to compute the determinants, knowing if the rank is full or not is enough. 

sure, but for F_2 rank and determinant are essentially equal problems :–)
 
So my point is: even if not *very* fast, still seems faster than what we have now, and it is very easy to implement while we think of a better solution.

well, pseudorandom field element generation is not very cheap, and generating "good" pseudorandom elements got be be expensive, otherwise
some (generally believed to be true) conjectures of complexity theory would fail.

No, it's not the way to go, unless n is very small: see above.

Dima


Javier

Martin Albrecht

unread,
Jul 3, 2012, 8:22:37 AM7/3/12
to sage-...@googlegroups.com
As far as I can tell all methods need matrix multiplication, so whatever
choice me make the difference will be constant, i.e., how many matrix-matrix
products we need (assuming asymptotically fast techniques were implemented)

Dima Pasechnik

unread,
Jul 3, 2012, 8:56:24 AM7/3/12
to sage-...@googlegroups.com


On Tuesday, 3 July 2012 20:22:37 UTC+8, Martin Albrecht wrote:
As far as I can tell all methods need matrix multiplication, so whatever
choice me make the difference will be constant, i.e., how many matrix-matrix
products we need (assuming asymptotically fast techniques were implemented)

well, as far as I see, my latest proposal needs one product, of a general matrix by an upper-triangular
matrix.

Martin Albrecht

unread,
Jul 3, 2012, 9:20:16 AM7/3/12
to sage-...@googlegroups.com

On Tuesday 03 Jul 2012, Dima Pasechnik wrote:
> On Tuesday, 3 July 2012 19:55:54 UTC+8, Javier López Peña wrote:
> > On Tuesday, July 3, 2012 10:53:23 AM UTC+1, Dima Pasechnik wrote:
> >> well, it's not that small, especially for finite fields. E.g. for F_2
> >> and n=3, one only gets 168 invertible matrices out of 512=2^9 in
> >> total... (I can't resist saying that the order of GL(n,q) is
> >> (q^n-1)(q^{n-1}-1)...(q^2-1)(q-1))
> >> So it's not gonna be very fast, also note that computing the determinant
> >> comes at a nonzero cost when matrices are big...
> >
> > I stand corrected. Even in the worst case scenario (with F_2) it still
> > seems that about one in three matrices is invertible,
>
> no. As n grows, the ratio gets much, much worse: e.g. for n=2, q=2:
>
> sage:
> float((2^10-1)*(2^9-1)*(2^8-1)*(2^7-1)*(2^6-1)*(2^5-1)*(2^4-1)*7*3/2^100)
> 8.215872026646278e-15
>
> It's asymptotically 0, as is not hard to see, just look at the behaviour of
> the dominating sequence
> q^n q^{n-1}... q^2 q/q^{n^2}

This analysis doesn't seem right:

sage: j = 0
sage: for i in range(100):
A = random_matrix(GF(2),10000,10000)
if A.rank() == 10000:
j+=1
....:
sage: j
25

Indeed, the *probability* that a *random* binary matrix has full rank is
prod(1-2^i for i in range(1,infinity)) ~= 0.288.

> > so the situation is not too bad. Also, there is no need to compute the
> > determinants, knowing if the rank is full or not is enough.
>
> sure, but for F_2 rank and determinant are essentially equal problems :–)
>
> > So my point is: even if not *very* fast, still seems faster than what we
> > have now, and it is very easy to implement while we think of a better
> > solution.
>
> well, pseudorandom field element generation is not very cheap, and
> generating "good" pseudorandom elements got be be expensive, otherwise
> some (generally believed to be true) conjectures of complexity theory would
> fail.
>
> No, it's not the way to go, unless n is very small: see above.
>
> Dima
>
> > Javier

Martin Albrecht

unread,
Jul 3, 2012, 9:23:00 AM7/3/12
to sage-...@googlegroups.com
For comparison: PLE decomposition needs 2.8 multiplications if asymptotically
fast multiplication is used and 0.66 if cubic multiplication is used, cf.

http://arxiv.org/pdf/1112.5717.pdf

Dima Pasechnik

unread,
Jul 3, 2012, 11:21:27 AM7/3/12
to sage-...@googlegroups.com
oops, wrong formula. Jetlag got me :–(
It should have been |GL(n,q)}=(q^n-1)(q^n-q)(q^n-q^2)...(q^n-q^{n-2})(q^n-q^{n-1})

Indeed, you're right, it's a finite limit, and taking 
a random matrix and checking its rank will do just fine.

Dima

charles Bouillaguet

unread,
Jul 3, 2012, 11:25:35 AM7/3/12
to sage-...@googlegroups.com
Opening a ticket with a patch, then...

Charles

2012/7/3 Dima Pasechnik <dim...@gmail.com>:

Robert Bradshaw

unread,
Jul 3, 2012, 12:21:53 PM7/3/12
to sage-...@googlegroups.com
You don't have to re-generate the entire matrix, you could likely get
away with changing one random element at a time. (And if you did so,
perhaps computing the rank would be cheaper). Still, +1 to it's much
faster than what we have now and still easy to implement.

- Robert

Javier López Peña

unread,
Jul 3, 2012, 1:12:43 PM7/3/12
to sage-...@googlegroups.com
Hi Robert,


On Tuesday, July 3, 2012 5:21:53 PM UTC+1, Robert Bradshaw wrote:
You don't have to re-generate the entire matrix, you could likely get
away with changing one random element at a time. (And if you did so,
perhaps computing the rank would be cheaper). Still, +1 to it's much
faster than what we have now and still easy to implement.

A naive implementation of this idea doesn't quite work, setting:

def faster_random_invertible_matrix(n,K): 
    M = MatrixSpace(K,n,n)
    S = M.random_element()
    while not S.is_invertible(): 
        S = M.random_element() 
    return S 

def faster_random_invertible_matrix2(n,K): 
    M = MatrixSpace(K,n,n)
    S = M.random_element()
    while not S.is_invertible(): 
        i = randint(0,n-1)
        j = randint(0,n-1)
        S[i,j] = K.random_element()
    return S 

yields the following test results:

sage: %timeit faster_random_invertible_matrix(200, GF(2)) 
625 loops, best of 3: 792 µs per loop
sage: %timeit faster_random_invertible_matrix2(200, GF(2)) 
125 loops, best of 3: 1.59 ms per loop

I think changing the matrix elements one by one might be a bad idea in cases where there is a single row that is a multiple of another one, as the chance of hitting the offending row is 1/n and the rank gets recomputed every time.

Unless there is a faster way of computing the rank taking advantage of previously used computations, that is.

Javier

Robert Bradshaw

unread,
Jul 3, 2012, 4:19:50 PM7/3/12
to sage-...@googlegroups.com
You're right, one does want to ensure that there's a high probability
of changing its rank. One could also change O(n) entries (one on each
row/column) rather than all O(n^2). This will have a larger impact for
large fields (where the per-element entropy is high, and GF(2) is a
worst-case as individual-element-setting is expensive). But I suppose
in those cases you're unlikely to hit a non-invertible matrix in the
first place.

- Robert

Maarten Derickx

unread,
Jul 3, 2012, 4:59:17 PM7/3/12
to sage-...@googlegroups.com
And it is also not entirely clear to me why implementation 2 would not affect the random distribution.

Javier López Peña

unread,
Jul 4, 2012, 4:32:06 AM7/4/12
to sage-...@googlegroups.com

On Tuesday, July 3, 2012 9:19:50 PM UTC+1, Robert Bradshaw wrote:
You're right, one does want to ensure that there's a high probability
of changing its rank. One could also change O(n) entries (one on each
row/column) rather than all O(n^2). This will have a larger impact for
large fields (where the per-element entropy is high, and GF(2) is a
worst-case as individual-element-setting is expensive). But I suppose
in those cases you're unlikely to hit a non-invertible matrix in the
first place.

There is always the issue on whether this would yield matrices uniformly 
random matrices, but anyway, I tried a toy implementation of this idea
by replacing one element on each row and column, chosen via a random
permutation. Over GF(2) this has yet again a serious 
impact on performance.

def faster_random_invertible_matrix(n,K): 
    M = MatrixSpace(K,n,n)
    S = M.random_element()
    while S.rank() != n: 
        S = M.random_element() 
    return S 

def faster_random_invertible_matrix2(n,K): 
    M = MatrixSpace(K,n,n)
    P = Permutations(n)
    S = M.random_element()
    while S.rank() != n: 
        s = P.random_element()
        for i in range(n):
            S[i,s(i+1)-1] = K.random_element()
    return S 

sage: %timeit faster_random_invertible_matrix(200, GF(2))
625 loops, best of 3: 1.11 ms per loop
sage: %timeit faster_random_invertible_matrix2(200, GF(2))
25 loops, best of 3: 8.37 ms per loop

sage: %timeit faster_random_invertible_matrix(300, GF(2)) 
125 loops, best of 3: 1.84 ms per loop
sage: %timeit faster_random_invertible_matrix2(300, GF(2)) 
25 loops, best of 3: 9.7 ms per loop

25 loops, best of 3: 13.5 ms per loop
sage: %timeit faster_random_invertible_matrix2(1000, GF(2))
5 loops, best of 3: 24.9 ms per loop

As the matrix sizes get bigger, the random permutation method appears 
to be about 2x slower than the whole random matrix method.

Cheers,
Javier

Martin Albrecht

unread,
Jul 4, 2012, 5:06:14 AM7/4/12
to sage-...@googlegroups.com
You are setting elements one by one, which is really really slow over GF(2).
I'd rather set the 64-bit words directly.
Reply all
Reply to author
Forward
0 new messages