Speedup of the direct solver in parallel computing


Giang Huynh

Jul 18, 2024, 9:10:43 AM
to deal.II User Group
Hi all,

I am writing a parallel code for the Newton-Raphson scheme and testing the speedup of its different parts. The performance is shown in the attached figure: the speedup of the direct solver is 1, which degrades the performance of the whole scheme, while the other parts (assembling the global system and updating internal variables) exhibit linear speedup.

Has anyone encountered the same issue with the linear solver? Are there any ways to get better performance from it?

Thank you,
Giang

(Attachment: speedup.png)

James Shi

Jul 18, 2024, 10:22:16 AM
to deal.II User Group
Are you using multi-threading or multi-processing? The bundled UMFPACK solver is single-threaded, so its theoretical speedup is about 1.

I see two choices: (a) use an external multi-threaded direct solver such as PARDISO, to which you can pass the reordered CSR data; or (b) use MPI multi-processing with a distributed parallel direct solver such as MUMPS or SuperLU_DIST through PETSc or Trilinos.

Giang Huynh

Jul 18, 2024, 3:07:44 PM
to deal.II User Group
I am using UMFPACK, and the code is parallelized with MPI. Do you have a code example or tutorial for using MUMPS or SuperLU_DIST in deal.II, as you mentioned?
Thank you so much for your help.

Best,
Giang

James Shi

Jul 18, 2024, 10:50:43 PM
to deal.II User Group
If you are using PETScWrappers for MPI parallelization, the PETScWrappers::SparseDirectMUMPS class works with PETSc matrices and vectors out of the box.
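A rough sketch of what that might look like, assuming the linear system is already assembled into PETScWrappers::MPI objects (the names system_matrix, solution, and system_rhs are placeholders, and the exact constructor signature of SparseDirectMUMPS varies slightly between deal.II releases, so check the documentation for your version):

```cpp
#include <deal.II/lac/petsc_solver.h>
#include <deal.II/lac/petsc_sparse_matrix.h>
#include <deal.II/lac/petsc_vector.h>
#include <deal.II/lac/solver_control.h>

// Solve the system with the MPI-parallel MUMPS interface instead of
// the single-threaded SparseDirectUMFPACK.
void solve_with_mumps(
  const dealii::PETScWrappers::MPI::SparseMatrix &system_matrix,
  dealii::PETScWrappers::MPI::Vector             &solution,
  const dealii::PETScWrappers::MPI::Vector       &system_rhs)
{
  // Tolerances are irrelevant for a direct solver; the control object
  // is only needed to satisfy the solver interface.
  dealii::SolverControl solver_control;

  dealii::PETScWrappers::SparseDirectMUMPS solver(solver_control);

  // If the tangent matrix is symmetric, MUMPS can exploit that:
  // solver.set_symmetric_mode(true);

  solver.solve(system_matrix, solution, system_rhs);
}
```

Inside a Newton loop you would call this once per iteration in place of the current UMFPACK solve; the matrix and vectors must be distributed across the same MPI communicator.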

If you are familiar with PETSc or Trilinos, you can also select MUMPS or SuperLU_DIST manually, by setting the KSP/PC options in PETSc or by choosing the solver through the Amesos/Amesos2 packages in Trilinos.
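With PETSc, the solver can even be chosen at runtime from the command line, provided the code goes through a KSP that honors command-line options (e.g. via KSPSetFromOptions). The executable name below is a placeholder; the options themselves are standard PETSc flags (PETSc >= 3.9 uses -pc_factor_mat_solver_type, older versions used -pc_factor_mat_solver_package):

```shell
# Direct solve via MUMPS: "preonly" KSP applies the LU factorization once.
mpirun -np 8 ./my_solver \
  -ksp_type preonly \
  -pc_type lu \
  -pc_factor_mat_solver_type mumps

# For SuperLU_DIST instead, swap the last option:
#   -pc_factor_mat_solver_type superlu_dist
```

This requires that your PETSc installation was configured with the corresponding package (e.g. --download-mumps or --download-superlu_dist).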