Rank of random dense matrices over GF(2)

111 views
Skip to first unread message

Kiran Kedlaya

unread,
Apr 23, 2016, 5:16:33 PM4/23/16
to sage-devel
I just tried the following experiment on several different machines; several different recent versions of Sage; algorithm = ple, m4ri; and n = 20000, 25000, 30000. In all cases, the result is identical:

sage: M = MatrixSpace(GF(2), n)
sage: for i in range(10):
...:     print M.random_element().rank()
19937
19937
19937
19937
19937
19937
19937
19937
19937
19937

Using the Internet, I also retroactively tested this in Sage 4.0.1. (Search in the page for "19937".)


By contrast, I just tried a whole bunch of sparse examples coming from Cremona modular symbols that gave more plausible answers, e.g.:
sage: S = CremonaModularSymbols(300001, sign=-1)
sage: T = hecke_matrix(2).sage_matrix_over_ZZ().change_ring(GF(2)).dense_matrix()
sage: T.dimensions()
(27549, 27549)
sage: T.rank()
27461

I'm not sure if this rank is correct, but at least it's not 19937.

Kiran

Volker Braun

unread,
Apr 23, 2016, 6:20:32 PM4/23/16
to sage-devel
I'm guessing that its not a coincidence that Mersenne twister has a period of 2^19937 ;-)

The relevant code is in sage.matrix.matrix_mod2_dense.Matrix_mod2_dense.randomize

Not sure exactly whats wrong but taking rstate.c_random() % (number of columns) at the very least leads to modulo bias

Dima Pasechnik

unread,
Apr 23, 2016, 6:36:28 PM4/23/16
to sage-devel
well, this is hardly a surprise:

sage: random_matrix?
...
   Warning: Matrices generated are not uniformly distributed. For
     unimodular matrices over finite field this function does not even
     generate all of them: for example "Matrix.random(GF(3), 2)" never
     generates "[[2,0],[0,2]]". This function is made for teaching
     purposes.

---------
Improvements to this code are most welcome. 
I looked at it many years ago, and it was a horrible broken mess.
After that ticket it still was quite a mess, at least it did not hang our teaching notebook server any more :-)


 

Kiran

Kiran Kedlaya

unread,
Apr 23, 2016, 7:01:39 PM4/23/16
to sage-devel
On Saturday, April 23, 2016 at 3:20:32 PM UTC-7, Volker Braun wrote:
I'm guessing that its not a coincidence that Mersenne twister has a period of 2^19937 ;-)

Probably not!
 
The relevant code is in sage.matrix.matrix_mod2_dense.Matrix_mod2_dense.randomize

Not sure exactly whats wrong but taking rstate.c_random() % (number of columns) at the very least leads to modulo bias


I've created ticket #20493 for this issue. One observation I made there is that when density != 1 this problem seems to go away, probably because randomize is running different code in that case. The key snippet seems to be:
{{{
                mask = __M4RI_LEFT_BITMASK(self._entries.ncols % m4ri_radix)
                for i from 0 <= i < self._nrows:
                    for j from 0 <= j < self._entries.width:
                        # for portability we get 32-bit twice rather than 64-bit once
                        low = gmp_urandomb_ui(rstate.gmp_state, 32)
                        high = gmp_urandomb_ui(rstate.gmp_state, 32)
                        self._entries.rows[i][j] = m4ri_swap_bits( ((<unsigned long long>high)<<32) | (<unsigned long long>low) )
                    self._entries.rows[i][self._entries.width - 1] &= mask
}}}
This seems to be filling out each row using successive random 32-bit words from gmp, lopping off the extra bits at the end. Is that a bad idea for some reason?

Kiran

Kiran Kedlaya

unread,
Apr 23, 2016, 7:04:27 PM4/23/16
to sage-devel
Really?

sage: l = [Matrix.random(GF(3),2) for i in range(10000)]
sage: len([i for i in l if i == 2])
114

The issue I raised appears to be distinct (see my response to Volker), but maybe there should also be a ticket to correct this docstring.
 
---------
Improvements to this code are most welcome. 
I looked at it many years ago, and it was a horrible broken mess.
After that ticket it still was quite a mess, at least it did not hang our teaching notebook server any more :-)

 I haven't looked at that ticket, so I can't comment.

Kiran

 

Dima Pasechnik

unread,
Apr 23, 2016, 7:58:45 PM4/23/16
to sage-devel
indeed, you have the codepath pointed out by Volker, not the one I kept in mind. Sorry for noise.

An entertaining bug!

Dima

Volker Braun

unread,
Apr 24, 2016, 4:45:37 AM4/24/16
to sage-devel
On Sunday, April 24, 2016 at 12:36:28 AM UTC+2, Dima Pasechnik wrote:
sage: random_matrix?
...
   Warning: Matrices generated are not uniformly distributed. For
     unimodular matrices over finite field this function does not even
     generate all of them: for example "Matrix.random(GF(3), 2)" never
     generates "[[2,0],[0,2]]". This function is made for teaching
     purposes.


At least its not THAT predictable ;-)

... after some attempts ...
sage: MatrixSpace(GF(3), 2, 2).random_element()
[2 0]
[0 2]

Volker Braun

unread,
Apr 24, 2016, 5:04:29 AM4/24/16
to sage-devel
On Sunday, April 24, 2016 at 1:01:39 AM UTC+2, Kiran Kedlaya wrote:
This seems to be filling out each row using successive random 32-bit words from gmp, lopping off the extra bits at the end. Is that a bad idea for some reason?

Its only lopping off the excess bits at the last column of each row to avoid writing out of bounds. The snippet is exactly how the prng should be used to fill the matrix.

What is bad is to use lower bits of a prng for entries like  "m[i,j] = random() % n". But the code snippet does not do that. 

Jori Mäntysalo

unread,
Apr 24, 2016, 7:13:44 AM4/24/16
to sage-devel
On Sun, 24 Apr 2016, Volker Braun wrote:

>    Warning: Matrices generated are not uniformly distributed. For
>      unimodular matrices over finite field this function does not even
>
> At least its not THAT predictable ;-)
>
> ... after some attempts ...
> sage: MatrixSpace(GF(3), 2, 2).random_element()
> [2 0]
> [0 2]

Unimodular! This happens with algorithm='unimodular'.

--
Jori Mäntysalo

Kiran Kedlaya

unread,
Apr 26, 2016, 6:07:53 PM4/26/16
to sage-devel
Yes, but the docstring itself still needs to be corrected: it says 'for example "Matrix.random(GF(3), 2)" never generates "[[2,0],[0,2]]".' without the flag.

Kiran

Jori Mäntysalo

unread,
Apr 27, 2016, 1:57:25 AM4/27/16
to sage-devel
On Tue, 26 Apr 2016, Kiran Kedlaya wrote:

> Yes, but the docstring itself still needs to be corrected: it says 'for
> example "Matrix.random(GF(3), 2)" never generates "[[2,0],[0,2]]".' without
> the flag.

OK, I added that to http://trac.sagemath.org/ticket/20465 .

--
Jori Mäntysalo
Reply all
Reply to author
Forward
0 new messages