sampleNullspace examples?


Laurent Bartholdi

Aug 31, 2023, 4:48:24 PM
to linbox-use
I have some absolutely huge, extremely sparse matrices (think 500M x 100M) with extremely low density (13 nonzeros per column). I want to compute their right nullspace, which I suspect has quite small dimension (say 1000), and would thus be given by a 100M x 1000 basis.
I'm happy doing everything over a small field (say GF(251)).

I understand that there are two good methods, Wiedemann and Lanczos, to sample the nullspace and hopefully obtain a full basis after enough samples.

Unfortunately, I have not been able to extract from the tests/ or examples/ subdirectories enough information to compile and test these algorithms.

Can someone give me some help? I would be fine with a standalone program reading the matrices in triplet format and printing vectors sampled from the nullspace as sparse vectors.

B Saunders

Sep 1, 2023, 1:11:58 PM
to linbo...@googlegroups.com
Hello Laurent. I would be happy to help with this if I can. A couple of years ago I worked with a matrix of this general size (several billion non-zeroes). The problem was rank modulo 3, so it benefited from a more compact representation and did not require a (very large) null space basis object as output. Still, I think LinBox may be able to handle your matrices. I would propose you start by trying rank.C in the examples directory. This would reveal whether you are running into memory limitations or other issues on your hardware. If the rank is successfully computed, you would then have a more precise estimate of the memory needed for the nullspace. By the way, my a priori expectation is that the null space basis vectors will be dense. The blackbox methods do nothing to search for a sparse basis.
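For concreteness, a minimal standalone driver in the spirit of examples/rank.C might look like the sketch below. The MatrixStream / SparseMatrix / rank calls are written from memory, so treat examples/rank.C in your LinBox tree as the authoritative reference, and adjust the field and the input format to your data.

```cpp
// Sketch of a standalone rank computation over GF(251), modeled on examples/rank.C.
// Assumes the matrix file is in a format LinBox::MatrixStream understands (e.g. SMS).
#include <linbox/linbox-config.h>
#include <iostream>
#include <fstream>
#include <givaro/modular.h>
#include <linbox/util/matrix-stream.h>
#include <linbox/matrix/sparse-matrix.h>
#include <linbox/solutions/rank.h>

int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: rank <matrix-file>" << std::endl; return 1; }
    typedef Givaro::Modular<double> Field;
    Field F(251);
    std::ifstream input(argv[1]);
    LinBox::MatrixStream<Field> ms(F, input);
    LinBox::SparseMatrix<Field> B(ms);      // build the sparse matrix from the stream
    std::cout << "matrix is " << B.rowdim() << " x " << B.coldim() << std::endl;
    size_t r;
    LinBox::rank(r, B);                     // let LinBox choose the method
    std::cout << "rank = " << r << std::endl;
    return 0;
}
```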

It is entirely possible the rank computation (via Wiedemann) will take days, so you might try some smaller matrices (chop off some columns, say) to get a feel for the runtime. If the number of rows and of non-zeroes per column are held fixed, I expect the runtime to be roughly quadratic in the number of columns: Wiedemann needs on the order of 2n matrix-vector products for n columns, and each product costs on the order of the 13n non-zeroes.

Good luck! 

-dave Saunders


Jean-Guillaume Dumas

Sep 7, 2023, 8:33:44 AM
to linbo...@googlegroups.com, B Saunders

Dear Laurent,

A few small things to complement Dave's message.

[+] Rank computation with Wiedemann is likely to be slower than elimination except on very large matrices, so the LinBox automatic choices will most likely not select that method on small examples.

You can however force this method, for instance in examples/rank.C, by adding a parameter to the rank call as follows:

LinBox::rank(r, B, Method::Wiedemann());

(Note that it is possible to get some intermediate information about the ETA of the Wiedemann methods; probably the simplest (sorry!) way to see this in action is to disable OpenMP and remove "-DDISABLE_COMMENTATOR" in the examples/Makefile ...)

[+] We also distribute nullspace example programs, see examples/nullspacebasis*.C, but those examples only use elimination methods, and elimination will most probably fill in your very large sparse matrices.

[+] To get a feel for the runtime, examples/solve.C could also help; see also the kernel-sampling sketch below ...
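For the nullspace sampling itself, one standard blackbox trick (a sketch of my own below, assuming the DenseVector / solve / Method::Wiedemann interfaces used in examples/solve.C) is to solve a consistent random system: pick a random y, form b = A*y, solve A*x = b with a blackbox method, and keep k = y - x, which lies in the right nullspace. When the kernel is nontrivial the returned x generally differs from y; if k comes out zero, draw a fresh y, and repeat with new randomness to collect further kernel vectors.

```cpp
// Sketch: sample one right-nullspace vector of a sparse matrix A over GF(251)
// by solving a consistent random system with a blackbox method.  Interfaces are
// assumed to match examples/solve.C; treat this as a starting point, not shipped code.
#include <linbox/linbox-config.h>
#include <iostream>
#include <fstream>
#include <givaro/modular.h>
#include <linbox/util/matrix-stream.h>
#include <linbox/matrix/sparse-matrix.h>
#include <linbox/solutions/solve.h>

int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: kernel-sample <matrix-file>" << std::endl; return 1; }
    typedef Givaro::Modular<double> Field;
    Field F(251);
    std::ifstream input(argv[1]);
    LinBox::MatrixStream<Field> ms(F, input);
    LinBox::SparseMatrix<Field> A(ms);

    LinBox::DenseVector<Field> y(F, A.coldim()), b(F, A.rowdim()), x(F, A.coldim());
    Field::RandIter gen(F);
    for (size_t i = 0; i < y.size(); ++i) gen.random(y[i]);   // random y
    A.apply(b, y);                                            // b = A*y, so A*x = b is consistent
    LinBox::solve(x, A, b, LinBox::Method::Wiedemann());      // blackbox solve

    // k = y - x lies in the right nullspace of A; it is nonzero whenever the
    // solver returns a solution different from y.
    LinBox::DenseVector<Field> k(F, A.coldim());
    size_t nnz = 0;
    for (size_t i = 0; i < k.size(); ++i) {
        F.sub(k[i], y[i], x[i]);
        if (!F.isZero(k[i])) ++nnz;
    }
    std::cout << "kernel vector with " << nnz << " nonzero entries" << std::endl;
    return 0;
}
```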

Also, do not hesitate to ask if you need further info.

Regards,

PS: We do have some previous prototype code (not public), e.g. for solving over largely heterogeneous distributed systems via block Wiedemann methods, that might be adapted to your problem, but this would represent a more complicated project ...



-- 
					Jean-Guillaume Dumas.
____________________________________________________________________
Jean-Guill...@univ-grenoble-alpes.fr    Tél.: +33 457 421 732
Directeur, Laboratoire Jean Kuntzmann.         Bât. IMAG, bureau 115
Mathématiques Appliquées et Informatique, Université Grenoble Alpes.
150 place du Torrent. IMAG, CS 40700, 38058 GRENOBLE cedex 9, FRANCE
http://membres-ljk.imag.fr/Jean-Guillaume.Dumas
____________________________________________________________________

B Saunders

Sep 7, 2023, 11:16:45 AM
to Jean-Guillaume Dumas, linbo...@googlegroups.com
Laurent has shared with me a matrix that has about 16M columns of 13 non-zeroes each and about 80M rows of 1, 2, or 3 entries each. It is a homology boundary matrix. My first-cut timing for a matrix-vector product is just under 10 seconds. At that rate, Wiedemann or block Wiedemann will spend about 2 x 16M x 10 s = 320 million seconds in matrix-vector products; I guess that is about 10 CPU years. I propose to see whether your sparse elimination code, applied until fill-in starts (choosing pivots on rows with 1 or 2 entries until there are no more of them), reduces the dimension enough. If it does, continuation along the lines you mentioned might have a chance. Even then, extensive parallelism may be wanted...
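To illustrate the purely structural part of that strategy, here is a simplified sketch of my own (neither LinBox nor spasm code): rows with a single surviving non-zero can be peeled off combinatorially, since each such pivot removes one column with no fill-in at all; rows with two entries need genuine elimination over GF(p) (column merging) and are deliberately left out.

```cpp
// Peel off singleton rows of a sparse pattern read as "row col value" triplets
// on stdin (0-based indices, no duplicate (row, col) pairs assumed).
// Only the pattern is used; values are ignored.
#include <cstdio>
#include <vector>
#include <queue>

int main() {
    long i, j; double v;
    std::vector<std::vector<long>> cols_of_row, rows_of_col;
    long nrows = 0, ncols = 0;
    while (std::scanf("%ld %ld %lf", &i, &j, &v) == 3) {
        if (i >= nrows) { nrows = i + 1; cols_of_row.resize(nrows); }
        if (j >= ncols) { ncols = j + 1; rows_of_col.resize(ncols); }
        cols_of_row[i].push_back(j);
        rows_of_col[j].push_back(i);
    }
    std::vector<long> deg(nrows);                   // surviving non-zeroes per row
    std::vector<char> row_done(nrows, 0), col_done(ncols, 0);
    std::queue<long> work;
    for (long r = 0; r < nrows; ++r) {
        deg[r] = (long)cols_of_row[r].size();
        if (deg[r] == 1) work.push(r);
    }
    long pivots = 0;
    while (!work.empty()) {
        long r = work.front(); work.pop();
        if (row_done[r] || deg[r] != 1) continue;
        long piv = -1;                              // the unique surviving column of row r
        for (long c : cols_of_row[r]) if (!col_done[c]) { piv = c; break; }
        if (piv < 0) continue;
        row_done[r] = 1; col_done[piv] = 1; ++pivots;
        // removing column piv lowers the degree of every other row touching it
        for (long s : rows_of_col[piv])
            if (!row_done[s] && --deg[s] == 1) work.push(s);
    }
    std::printf("structural pivots: %ld (out of %ld columns)\n", pivots, ncols);
    return 0;
}
```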

-bds

Charles Bouillaguet

Sep 7, 2023, 12:46:16 PM
to linbo...@googlegroups.com
On 9/7/23 17:16, B Saunders wrote:
> Laurent has shared with me a matrix that has about 16M columns of 13
> non-zeroes each and about 80M rows of 1,2, or 3 entries each. [...] I
> guess that is about 10 cpu years.

On Laurent's matrix, my sparse elimination code
(https://gitlab.lip6.fr/almasty/spasm) reduces the problem to that of
echelonizing a dense matrix of dimension about 160K in a few hours.
Then, using FFLAS-FFPACK, a full echelonized basis can be obtained in a
few extra hours. I am debugging my code and running the computation
right now, but in the end it should only require a few hundred CPU hours.

Sparse elimination works quite well on Laurent's matrix, and I assume
that this is because it has a somewhat triangular structure.
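For the dense phase, the kind of call involved is roughly the following: a PLUQ factorization with FFLAS-FFPACK reveals the rank of the dense Schur complement (and provides the data for an echelonized basis). This is a hedged sketch: the fflas_new / PLUQ signatures are written from memory and should be checked against the installed fflas-ffpack headers, and the dimensions below are a small stand-in for the ~160K block.

```cpp
// Sketch of the dense phase: rank of a dense block over GF(251) via PLUQ.
#include <cstdio>
#include <vector>
#include <givaro/modular.h>
#include <fflas-ffpack/fflas-ffpack.h>

int main() {
    typedef Givaro::Modular<double> Field;
    Field F(251);
    const size_t m = 2000, n = 2000;                 // stand-in for the ~160K block
    Field::Element* S = FFLAS::fflas_new(F, m, n);
    // Fill S with placeholder data; in the real computation this is the dense
    // block produced by the sparse elimination phase.
    for (size_t k = 0; k < m * n; ++k) F.init(S[k], (double)(k % 97));
    std::vector<size_t> P(m), Q(n);
    size_t r = FFPACK::PLUQ(F, FFLAS::FflasNonUnit, m, n, S, n, P.data(), Q.data());
    // The nullspace dimension of the whole matrix follows from this rank plus
    // the bookkeeping of the sparse phase.
    std::printf("dense rank: %zu (nullity of the block: %zu)\n", r, n - r);
    FFLAS::fflas_delete(S);
    return 0;
}
```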

Best regards

--
Charles Bouillaguet

B Saunders

Sep 12, 2023, 3:45:03 PM
to linbo...@googlegroups.com
This is good news.  When it works, sparse elimination rocks!

Out of curiosity, what size field are you using?

Best, -bds


Charles Bouillaguet

Sep 13, 2023, 9:10:47 AM
to linbo...@googlegroups.com
Hi,
On 9/12/23 21:44, B Saunders wrote:
> This is good news.  When it works, sparse elimination rocks!

This compensates for the fact that, more often than not, it fails miserably!

> Out of curiosity, what size field are you using?

My current code works with small-ish prime fields, i.e. GF(p) with a 15-bit p, such that (p-1)^2 + (p-1) fits in an "int". This could probably be improved a little without too much effort, at least to any prime that can be handled by FFLAS-FFPACK, or eventually to any 32-bit prime.
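As a quick standalone sanity check of that bound (my illustration, not spasm code): with the largest 15-bit prime, a product of two reduced values plus one more reduced value still fits in a signed 32-bit int, so one multiply-accumulate can be done before reduction.

```cpp
// Verify that (p-1)^2 + (p-1) fits in a signed 32-bit int for the largest 15-bit prime.
#include <cstdint>
#include <cstdio>

int main() {
    const std::int64_t p = 32749;                          // largest prime below 2^15
    const std::int64_t bound = (p - 1) * (p - 1) + (p - 1);
    std::printf("(p-1)^2 + (p-1) = %lld, INT32_MAX = %d\n", (long long)bound, INT32_MAX);
    std::printf("%s\n", bound <= INT32_MAX ? "fits in a 32-bit int" : "overflow");
    return 0;
}
```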

Cheers,

--
Charles Bouillaguet
