Hi Charles,
On 28/08/2023 at 10:39,
charles.b...@gmail.com wrote:
> Dear all,
>
> I am trying to use FFPACK to compute the row echelon form of largish matrices (dimension 150K)
> modulo a 16-bit prime, in parallel. I have a lot of threads to use (say 128).
>
> I use the ModularBalanced<double> field (as suggested by JGD in 2016...) but I could use anything else.
This is fine. Modular<double> could be another good choice, but likely with similar timings.
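As a rough illustration of why a double-based field is comfortable for a 16-bit prime (this is a back-of-the-envelope sketch, not code taken from fflas-ffpack): a double represents every integer up to 2^53 exactly, so products of residues can be accumulated without reduction as long as the running sum stays below that bound. The helper names below are hypothetical, and p = 65521 (the largest 16-bit prime) is only an assumed example value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest absolute value of a stored residue:
// p-1 for the positive representation [0, p-1],
// (p-1)/2 for the balanced representation [-(p-1)/2, (p-1)/2].
double max_residue(double p, bool balanced) {
    assert(p > 2.0);
    return balanced ? (p - 1.0) / 2.0 : p - 1.0;
}

// Number of products a*b that can be summed in a double before the
// worst-case magnitude of the sum exceeds 2^53 (the exact-integer limit).
std::uint64_t max_exact_terms(double p, bool balanced) {
    const double limit = 9007199254740992.0;  // 2^53
    const double m = max_residue(p, balanced);
    return static_cast<std::uint64_t>(std::floor(limit / (m * m)));
}
```

With p = 65521, max_exact_terms(65521.0, false) is a little over two million and the balanced variant allows about four times as many terms. Both are well above a dimension of 150K, so under this worst-case bound a full-length dot product fits in a double with either representation.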
>
> My questions are:
> 0) I have to call pRowEchelonForm() and not rowEchelonForm, right?
Yes.
> 1) should I use FfpackSlabRecursive (LUdivine) or FfpackTileRecursive?
The SlabRecursive variant is sequential only, whereas the TileRecursive variant allows for parallelization.
> 2) I understand that I can provide a num_threads argument. Should I give 0 (I assume it means "do
> whatever is best")?
Quite right. numthreads=0 sets the maximum number of FFPACK threads to the current OpenMP NUM_THREADS.
> 3) The underlying BLAS is OpenBLAS. It can be multi-threaded. Should I try to have a multi-threaded
> BLAS or a sequential one?
By default, fflas-ffpack forces the BLAS to run on a single thread, even if the BLAS was compiled with
multithreading enabled (in order to manage the parallelism at a higher level).
Yet if you want to run fflas-ffpack using the multithreaded OpenBLAS, you can set the macro
__FFLASFFPACK_OPENBLAS_NUM_THREADS
to the number of threads you want for OpenBLAS.
This can be automated by running configure with option
--with-openblas-num-threads=<num-threads>
You should then add in your main:
#ifdef __FFLASFFPACK_OPENBLAS_NUM_THREADS
openblas_set_num_threads(__FFLASFFPACK_OPENBLAS_NUM_THREADS);
#endif
and
#define __FFLASFFPACK_OPENBLAS_NT_ALREADY_SET 1
before all other includes.
(see benchmarks/benchmark-pluq.C for instance)
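Put together, the ordering described above could look like the sketch below. It is only a layout illustration: setup_blas_threads() is a hypothetical helper, the header path is assumed to be the usual fflas-ffpack umbrella header, and the __has_include guard (C++17) is there only so that the fragment still compiles on a machine without fflas-ffpack. It also assumes, as in the benchmark, that the fflas-ffpack headers make openblas_set_num_threads visible when the configure option was used.

```cpp
// Must precede every fflas-ffpack include. Judging by its name, this macro
// signals that the program sets the OpenBLAS thread count itself.
#define __FFLASFFPACK_OPENBLAS_NT_ALREADY_SET 1

// Guarded so the sketch still compiles where fflas-ffpack is not installed.
#if __has_include(<fflas-ffpack/fflas-ffpack.h>)
#include <fflas-ffpack/fflas-ffpack.h>
#endif

// Hypothetical helper: call once at the top of main().
// Returns the thread count handed to OpenBLAS, or 1 when the configure
// option was not used (the single-threaded default).
int setup_blas_threads() {
#ifdef __FFLASFFPACK_OPENBLAS_NUM_THREADS
    openblas_set_num_threads(__FFLASFFPACK_OPENBLAS_NUM_THREADS);
    return __FFLASFFPACK_OPENBLAS_NUM_THREADS;
#else
    return 1;
#endif
}
```

Calling setup_blas_threads() as the first statement of main() has the same effect as pasting the #ifdef block there directly.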
>
> (I am doing this because I am trying to obtain a "usable" version of
> [SpaSM](https://github.com/cbouilla/spasm), in particular code that can echelonize a sparse matrix
> while keeping the result sparse; this leads to a "switch-to-dense" strategy at the end of the
> factorization, hence the use of FFLAS-FFPACK; it will become a hard dependency for SpaSM; I keep
> being contacted by various people who want sparse kernel basis of largish sparse matrices, of size,
> say, 1M, and it is sometimes feasible).
>
Great! Let us know how this works and feel free to ask for further assistance if needed.
Btw: I presume that you saw the boolean option "transform", which allows you to avoid computing the
transformation matrix if you don't need it.
Best.
Clément