Thread safety of dpotrf (and others)


Tiziano Müller

Jun 16, 2021, 12:09:15 PM
to libflame-discuss
Hi everyone,

While testing AMD's libflame fork with our software CP2K, I encountered the following errors when running with multiple OpenMP threads (setting OMP_NUM_THREADS=1 avoids the issue):

libflame: /scratch/e1000/timuel/spack-stage/spack-stage-amdlibflame-3.0-ffgwl56r4x36ie7wjxbdlvlrce63tmti/spack-src/src/base/flamec/main/FLA_Blocksize.c (line 125): libflame: Encountered NULL pointer. libflame: Aborting.

and

libflame: /scratch/e1000/timuel/spack-stage/spack-stage-amdlibflame-3.0-ffgwl56r4x36ie7wjxbdlvlrce63tmti/spack-src/src/lapack/dec/chol/front/flamec/FLA_Chol_u.c (line 69): libflame: Function or conditional branch/case not yet implemented. libflame: Aborting.

Both originate from calls to dpotrf made inside an OpenMP parallel region. The same problem has also been reported for LU factorization: https://github.com/flame/libflame/issues/34

Are there any known limitations regarding the thread safety of some routines in libflame itself? Or might this be an issue introduced in the AMD libflame fork?

Thank you very much in advance,
best regards,
Tiziano
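
For readers who want to reproduce the report, the call pattern described above (independent dpotrf calls issued from inside an OpenMP parallel region) can be sketched roughly as follows. This is not code from CP2K. The names `potrf_fn`, `stub_potrf`, `fill_spd` and `factor_batch` are hypothetical, and the factorization routine is passed as a function pointer so that the sketch compiles without LAPACK. It also assumes the usual Fortran-style interface, in which the C symbol is typically `dpotrf_`.

```c
#include <assert.h>
#include <stddef.h>

/* Fortran-style LAPACK interface of dpotrf. From C the symbol is usually
   exported as dpotrf_ (assumed naming convention; check your build). */
typedef void (*potrf_fn)(const char *uplo, const int *n, double *a,
                         const int *lda, int *info);

/* Stand-in with the same signature so the sketch runs without LAPACK.
   It does NOT factor anything; it only mimics the info convention by
   reporting info = 1 when the first diagonal entry is not positive. */
void stub_potrf(const char *uplo, const int *n, double *a,
                const int *lda, int *info) {
    (void)uplo; (void)n; (void)lda;
    *info = (a[0] > 0.0) ? 0 : 1;
}

/* Hypothetical helper: fill an n-by-n column-major matrix with n + 1 on
   the diagonal and 1 elsewhere, which is symmetric positive definite. */
void fill_spd(int n, double *a) {
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            a[i + (size_t)j * n] = (i == j) ? (double)n + 1.0 : 1.0;
}

/* Factor nmat independent n-by-n matrices stored back to back in mats,
   one potrf call per loop iteration. Returns the number of calls that
   reported info != 0. */
int factor_batch(potrf_fn potrf, int nmat, int n, double *mats) {
    assert(nmat >= 0 && n > 0);
    int failures = 0;
    /* The pragma is ignored when compiled without OpenMP support. */
    #pragma omp parallel for reduction(+:failures)
    for (int m = 0; m < nmat; ++m) {
        int info = 0;
        potrf("U", &n, mats + (size_t)m * n * n, &n, &info);
        if (info != 0)
            failures += 1;
    }
    return failures;
}
```

To exercise a real library, one would declare `dpotrf_` with the signature above, pass it in place of `stub_potrf`, build with OpenMP enabled and link against the LAPACK implementation under test. Judging from the messages quoted above, a failing libflame call aborts the process rather than returning a nonzero info, so the return value is only meaningful when no abort occurs.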

David Haufe

Oct 20, 2021, 3:34:26 PM
to libflame-discuss
Tiziano,
We have been experiencing the same issue with dgels using AMD libflame. Because of it, we ended up rewriting our software to spawn multiple processes instead of using multiple threads in a single process. We tried all sorts of libflame build options, and it seems to us that it is not thread safe in any configuration. Building BLIS with multithreading support does work, in case you are using only BLAS/BLIS calls elsewhere.

-Dave
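
The process-based workaround mentioned above could look roughly like the following on a POSIX system. This is only a sketch under stated assumptions, not the code from the post: `task_fn`, `demo_task`, `shared_alloc` and `run_in_processes` are hypothetical names, and `demo_task` is a placeholder for the real per-problem LAPACK call (for example a dgels solve). Each work item runs in its own forked child, so no two library calls share a process, and results travel back through an anonymous shared mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-task callback: works in place on n doubles and
   returns 0 on success. Real code would call the LAPACK routine here. */
typedef int (*task_fn)(double *data, int n);

/* Placeholder task used only to make the sketch runnable: doubles
   every entry of its block. */
int demo_task(double *data, int n) {
    for (int i = 0; i < n; ++i)
        data[i] *= 2.0;
    return 0;
}

/* Allocate memory that stays shared between the parent and the
   children it forks later. Returns NULL on failure. */
double *shared_alloc(size_t count) {
    void *p = mmap(NULL, count * sizeof(double), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : (double *)p;
}

/* Run task on each of ntasks consecutive blocks of n doubles, using
   one child process per block. Returns the number of failed tasks
   (fork failure, abnormal termination, or nonzero exit status). */
int run_in_processes(task_fn task, double *shared, int ntasks, int n) {
    int failures = 0;
    int started = 0;
    for (int t = 0; t < ntasks; ++t) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: do the work and leave without running atexit
               handlers or flushing stdio buffers twice. */
            int rc = task(shared + (size_t)t * n, n);
            _exit(rc == 0 ? 0 : 1);
        } else if (pid > 0) {
            ++started;
        } else {
            ++failures; /* fork failed */
        }
    }
    for (int i = 0; i < started; ++i) {
        int status = 0;
        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            ++failures;
    }
    return failures;
}
```

A library abort inside one child then shows up as a failed task in the parent instead of taking down the whole run. Forking one process per work item is the simplest form. A fixed pool of worker processes would reduce the startup cost for many small problems.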