Problem with MUMPS_4.9.2 (and possibly ParMETIS)

Martin Meyer

Mar 23, 2011, 11:44:58 AM
to matrixprogramming
Hi there,

Right now I'm trying to parallelize a program that includes solving
a big linear system of equations. I chose MUMPS (on 4 processors)
to do the job. Now I have two main problems:

(1) The solution process is much slower than the sequential version
the program used previously, which was of course a combination of
do-loops.
(2) The solution process fails due to a DivideByZeroException, I
believe at the stage where ParMETIS calculates the symbolic
factorization.

I'm new to MUMPS, so it would be nice if someone with more experience
could give a hint about what goes wrong.
I'm using FORTRAN, and my non-default parameters for MUMPS are:


mumps_par%NRHS=1
mumps_par%ICNTL(9)=1 !solve Ax=b
mumps_par%ICNTL(10)=50 !iterative refinement steps
mumps_par%ICNTL(13)=0 !use scalapack for factorization
!Setting ICNTL(13) to a non-zero value will help with the
!correct detection of null pivots but degrade performance.
mumps_par%ICNTL(23)=5 !supposed to be bigger than infog(26) in the
!parallel version, which had a value of 1
mumps_par%ICNTL(28)=2 !use parallel analysis phase
mumps_par%ICNTL(29)=2 !use parmetis
mumps_par%ICNTL(4)=4 !print more information


As the comment says, ICNTL(13) can degrade performance, but because of
that DivideByZeroException, I set it to 0, thinking it could maybe
help.
My output log is:


DMUMPS 4.9.2
L D L^T Solver for general symmetric matrices
Type of parallelism: Host not working

****** ANALYSIS STEP ********

Using ParMETIS for parallel ordering.
Structual symmetry is:100%
WARNING: Largest root node of size 19
not selected for parallel execution

Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 0
-- (3) Storage of factors (REAL, estimated) = 21498
-- (4) Storage of factors (INT , estimated) = 14284
-- (5) Maximum frontal size (estimated) = 20
-- (6) Number of nodes in the tree = 375
-- (32) Type of analysis effectively used = 2
-- (7) Ordering option effectively used = 2
ICNTL(6) Maximum transversal option = 0
ICNTL(7) Pivot order option = 7
Percentage of memory relaxation (effective) = 20
Number of level 2 nodes = 0
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 2.606D+05
** Rank of proc needing largest memory in IC facto : 0
** Estimated corresponding MBYTES for IC facto : 1
** Estimated avg. MBYTES per work. proc at facto (IC) : 1
** TOTAL space in MBYTES for IC factorization : 4
** Rank of proc needing largest memory for OOC facto : 0
** Estimated corresponding MBYTES for OOC facto : 1
** Estimated avg. MBYTES per work. proc at facto (OOC) : 1
** TOTAL space in MBYTES for OOC factorization : 4

****** FACTORIZATION STEP ********


GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
NUMBER OF WORKING PROCESSES = 3
OUT-OF-CORE OPTION (ICNTL(22)) = 0
REAL SPACE FOR FACTORS = 21498
INTEGER SPACE FOR FACTORS = 14284
MAXIMUM FRONTAL SIZE (ESTIMATED) = 20
NUMBER OF NODES IN THE TREE = 375
Convergence error after scaling for ONE-NORM (option 7/8) =
0.25D-01
Maximum effective relaxed size of S = 575014
Average effective relaxed size of S = 572911
GLOBAL TIME FOR MATRIX DISTRIBUTION = 0.0006
** Memory relaxation parameter ( ICNTL(14) ) : 20
** Rank of processor needing largest memory in facto : 0
** Space in MBYTES used by this processor for facto : 1
** Avg. Space in MBYTES per working proc during facto : 1

ELAPSED TIME FOR FACTORIZATION = 0.0015
Maximum effective space used in S (KEEP8(67) = 7646
Average effective space used in S (KEEP8(67) = 7338
** EFF Min: Rank of processor needing largest memory : 0
** EFF Min: Space in MBYTES used by this processor : 1
** EFF Min: Avg. Space in MBYTES per working proc : 1

GLOBAL STATISTICS
RINFOG(2) OPERATIONS DURING NODE ASSEMBLY = 1.937D+04
------(3) OPERATIONS DURING NODE ELIMINATION = 2.606D+05
INFOG (9) REAL SPACE FOR FACTORS = 21498
INFOG(10) INTEGER SPACE FOR FACTORS = 14284
INFOG(11) MAXIMUM FRONT SIZE = 20
INFOG(29) NUMBER OF ENTRIES IN FACTORS = 18727
INFOG(12) NB OF NEGATIVE PIVOTS = 0
INFOG(12) NUMBER OF DELAYED PIVOTS = 0
NUMBER OF 2x2 PIVOTS in type 1 nodes = 0
NUMBER OF 2x2 PIVOTS in type 2 nodes = 0
INFOG(14) NUMBER OF MEMORY COMPRESS = 0


****** SOLVE & CHECK STEP ********


STATISTICS PRIOR SOLVE PHASE ...........
NUMBER OF RIGHT-HAND-SIDES = 1
BLOCKING FACTOR FOR MULTIPLE RHS = 1
ICNTL (9) = 1
--- (10) = 50
--- (11) = 0
--- (20) = 0
--- (21) = 0


BEGIN ITERATIVE REFINEMENT
MAXIMUM NUMBER OF STEPS = 50

STATISTICS AFTER ITERATIVE REFINEMENT
NUMBER OF STEPS OF ITERATIVE REFINEMENTS 0

** Rank of processor needing largest memory in solve : 1
** Space in MBYTES used by this processor for solve : 4
** Avg. Space in MBYTES per working proc during solve : 4

LEAVING SOLVER WITH: INFOG(1) ............ = 0
INFOG(2) ............ = 0


The DivideByZeroException is thrown in the analysis phase; in that
case the output log ends after "Using ParMETIS for parallel ordering.
Structual symmetry is:100%". I don't know if this is a ParMETIS error,
an error in the input data or parameters, or something with MUMPS
itself.
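
One way to separate these possibilities, offered as a diagnostic sketch and not a confirmed fix, would be to rerun the same system with the sequential analysis so that ParMETIS is not called at all. If the exception disappears, the parallel ordering is the likely culprit; if it persists, the input data or the remaining parameters become more suspect. The fragment assumes the mumps_par structure from above and is meant to go after the JOB=-1 initialization, in place of the current analysis call:

```fortran
! Diagnostic sketch: bypass ParMETIS by forcing the sequential analysis.
mumps_par%ICNTL(28)=1 !sequential analysis instead of the parallel one
mumps_par%ICNTL(7)=7  !let MUMPS choose the sequential ordering itself
mumps_par%JOB=1       !run the analysis phase only
CALL DMUMPS(mumps_par)
! Print on the host what the analysis reports back.
IF (mumps_par%MYID == 0) THEN
   WRITE(*,*) 'INFOG(1)  =', mumps_par%INFOG(1)  !0 means no error
   WRITE(*,*) 'INFOG(32) =', mumps_par%INFOG(32) !analysis type used
   WRITE(*,*) 'INFOG(7)  =', mumps_par%INFOG(7)  !ordering used
END IF
```

A hardware divide-by-zero trap would still abort the run before INFOG is printed, so the absence of this output is itself informative.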

Thanks in advance for taking a look at this!

Martin

Evgenii Rudnyi

Mar 23, 2011, 2:38:31 PM
to matrixpr...@googlegroups.com
on 23.03.2011 16:44 Martin Meyer said the following:

> Hi there,
>
> right now I'm trying to parallelize a software which includes
> solving a big linear system of equations. My choice fell on MUMPS (on
> 4 processors) to do the job. Now I have two main problems:

I would advise you to post this question to the MUMPS Discussion List.

https://listes.ens-lyon.fr/sympa/arc/mumps-users

I have not tried ParMetis yet. Please note that there is also Scotch
reordering. I believe that there should be something on ParMetis/Scotch
in the archives of the MUMPS Discussion List.
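
For reference, a minimal sketch of how the ordering could be switched to Scotch through the MUMPS control parameters. It assumes that the MUMPS library in use was built with SCOTCH/PT-SCOTCH support, which is not confirmed anywhere in this thread, and that mumps_par is the structure from the original post. The values would be set after the JOB=-1 initialization and before the analysis phase:

```fortran
! Option A: keep the parallel analysis, but use PT-SCOTCH, not ParMETIS.
mumps_par%ICNTL(28)=2 !parallel analysis phase
mumps_par%ICNTL(29)=1 !1 = PT-SCOTCH, 2 = ParMETIS

! Option B: sequential analysis with the SCOTCH ordering
! (use instead of option A, not together with it).
!mumps_par%ICNTL(28)=1 !sequential analysis phase
!mumps_par%ICNTL(7)=3  !3 = SCOTCH
```

In the analysis output, the lines for INFOG(32) and INFOG(7) show which analysis type and ordering were effectively used, so a fallback to a different ordering would be visible there.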
