Hello All.
This is more of an observation with some questions. I have been looking closely at CPU usage on a single quad-core processor, with and without PETSc/MPI.
I was surprised to see all 4 cores (8 threads) being used in step-8, which does not have PETSc/MPI support. In the solver, the main thread was using 100% of its resources, while the others were very consistently using 24-25% each, independent of the #DOF. Is the normal SolverCG<> of step-8 multi-threaded? Would the overall performance on a single processor improve if one were able to have multiple controlling threads instead of one? Is that even possible? This might be even more important for the 8- or 16-core CPUs I see available now. If I am correct, the existing code was probably optimal for single- or dual-core CPUs, but that does not appear to be true now. It may not be feasible, but I thought I'd mention it.
Also, it appears that the efficiency with one CPU decreases as the #DOF increases with PETSc/MPI. Does anyone know why?
Thanks
Pete Griffin
=========================================================================================
 step-8 NO PETSc/MPI
=========================================================================================
Cycle 7 of step-8
Threads: 546 total,   2 running, 543 sleeping,   1 stopped,   0 zombie
%Cpu(s): 34.6 us,  0.1 sy,  0.0 ni, 65.2 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total, 11629396 free,  1940364 used,  2756080 buff/cache
KiB Swap: 16668668 total, 15488860 free,  1179808 used. 13757012 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 6752 pgriffin  20   0 2330772 1.260g  48340 R 99.9  8.1   0:41.49 step-8            
 6765 pgriffin  20   0 2330772 1.260g  48340 S 25.6  8.1   0:04.43 step-8            
 6760 pgriffin  20   0 2330772 1.260g  48340 S 25.2  8.1   0:04.43 step-8            
 6761 pgriffin  20   0 2330772 1.260g  48340 S 25.2  8.1   0:04.37 step-8            
 6763 pgriffin  20   0 2330772 1.260g  48340 S 25.2  8.1   0:04.34 step-8            
 6766 pgriffin  20   0 2330772 1.260g  48340 S 25.2  8.1   0:04.36 step-8            
 6762 pgriffin  20   0 2330772 1.260g  48340 S 24.9  8.1   0:04.33 step-8            
 6764 pgriffin  20   0 2330772 1.260g  48340 S 24.9  8.1   0:04.40 step-8            
Cycle 8 of step-8
Threads: 547 total,   9 running, 537 sleeping,   1 stopped,   0 zombie
%Cpu(s): 36.2 us,  0.0 sy,  0.0 ni, 63.6 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 16325840 total,  9097668 free,  4475188 used,  2752984 buff/cache
KiB Swap: 16668668 total, 15488876 free,  1179792 used. 11225328 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 6752 pgriffin  20   0 4807840 3.672g  48340 R 99.7 23.6   1:52.88 step-8            
 6761 pgriffin  20   0 4807840 3.672g  48340 R 27.2 23.6   0:12.15 step-8            
 6764 pgriffin  20   0 4807840 3.672g  48340 R 27.2 23.6   0:12.15 step-8            
 6760 pgriffin  20   0 4807840 3.672g  48340 R 26.9 23.6   0:12.26 step-8            
 6765 pgriffin  20   0 4807840 3.672g  48340 R 26.9 23.6   0:12.24 step-8            
 6766 pgriffin  20   0 4807840 3.672g  48340 R 26.9 23.6   0:11.93 step-8            
 6762 pgriffin  20   0 4807840 3.672g  48340 R 26.6 23.6   0:12.10 step-8            
 6763 pgriffin  20   0 4807840 3.672g  48340 R 25.6 23.6   0:12.17 step-8            
Cycle 9 of step-8
Threads: 543 total,   2 running, 540 sleeping,   1 stopped,   0 zombie
%Cpu(s): 34.0 us,  0.3 sy,  0.0 ni, 65.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total,  4041224 free,  9150464 used,  3134152 buff/cache
KiB Swap: 16668668 total, 15489424 free,  1179244 used.  6552112 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 6982 pgriffin  20   0 9311500 8.120g  48408 R 99.9 52.2   6:16.43 step-8            
 6991 pgriffin  20   0 9311500 8.120g  48408 S 24.6 52.2   0:48.27 step-8            
 6992 pgriffin  20   0 9311500 8.120g  48408 S 24.6 52.2   0:48.34 step-8            
 6993 pgriffin  20   0 9311500 8.120g  48408 S 24.3 52.2   0:48.15 step-8            
 6994 pgriffin  20   0 9311500 8.120g  48408 S 24.3 52.2   0:48.31 step-8            
 6995 pgriffin  20   0 9311500 8.120g  48408 S 24.3 52.2   0:48.30 step-8            
 6996 pgriffin  20   0 9311500 8.120g  48408 S 24.3 52.2   0:48.15 step-8            
 6990 pgriffin  20   0 9311500 8.120g  48408 S 23.9 52.2   0:48.20 step-8            
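
As a rough cross-check of the step-8 listings above, the per-thread %CPU column can be summed and divided by the number of logical CPUs to get the whole-machine utilization. This is only a sketch: it assumes 8 logical CPUs (4 cores with 2 hardware threads each, as described in the post), and the function and variable names are made up for illustration. For Cycle 7 it gives about 34.5%, close to the 34.6 us + 0.1 sy that top reports, which suggests the seven helper threads together add less than two cores' worth of work to the main thread.

```python
# Per-thread %CPU values copied from the "Cycle 7 of step-8" listing above.
cycle7_thread_cpu = [99.9, 25.6, 25.2, 25.2, 25.2, 25.2, 24.9, 24.9]

# Assumption: 8 logical CPUs (4 cores x 2 hardware threads), as stated in the post.
LOGICAL_CPUS = 8


def aggregate_utilization(thread_cpu, logical_cpus):
    """Return whole-machine utilization in percent from per-thread %CPU values."""
    return sum(thread_cpu) / logical_cpus


utilization = aggregate_utilization(cycle7_thread_cpu, LOGICAL_CPUS)

# Share of the total CPU time contributed by the helper threads (all but the first).
helper_share = sum(cycle7_thread_cpu[1:]) / sum(cycle7_thread_cpu)

print(f"aggregate utilization: {utilization:.1f}% of {LOGICAL_CPUS} logical CPUs")
print(f"helper threads' share of CPU time: {helper_share:.0%}")
```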
=========================================================================================
 step-17 OLD with PETSc/MPI uses max_couplings_between_dofs(), high memory usage/DOF
=========================================================================================
Cycle 6 of step-17 OLD
Threads: 540 total,   9 running, 530 sleeping,   1 stopped,   0 zombie
%Cpu(s): 24.3 us, 75.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total,  8039968 free,  3926700 used,  4359172 buff/cache
KiB Swap: 16668668 total, 15491740 free,  1176928 used. 11769984 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 7332 pgriffin  20   0 4043180 3.149g  54160 R 99.9 20.2   0:04.97 step-17           
 7338 pgriffin  20   0 4043180 3.149g  54160 R 99.9 20.2   0:04.90 step-17           
 7331 pgriffin  20   0 4043180 3.149g  54160 R 99.9 20.2   0:12.75 step-17           
 7334 pgriffin  20   0 4043180 3.149g  54160 R 99.9 20.2   0:04.86 step-17           
 7335 pgriffin  20   0 4043180 3.149g  54160 R 99.7 20.2   0:04.88 step-17           
 7337 pgriffin  20   0 4043180 3.149g  54160 R 99.7 20.2   0:04.91 step-17           
 7333 pgriffin  20   0 4043180 3.149g  54160 R 99.3 20.2   0:04.94 step-17           
 7336 pgriffin  20   0 4043180 3.149g  54160 R 99.3 20.2   0:04.84 step-17           
Cycle 7 of step-17 OLD
Threads: 542 total,   9 running, 532 sleeping,   1 stopped,   0 zombie
%Cpu(s): 21.0 us, 53.8 sy,  0.0 ni, 25.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total,  1494532 free, 10477460 used,  4353848 buff/cache
KiB Swap: 16668668 total, 15491740 free,  1176928 used.  5224572 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 7331 pgriffin  20   0 10.090g 9.382g  54224 R 99.9 60.3   0:55.10 step-17           
 7334 pgriffin  20   0 10.090g 9.382g  54224 R 71.8 60.3   0:21.78 step-17           
 7335 pgriffin  20   0 10.090g 9.382g  54224 R 71.4 60.3   0:21.67 step-17           
 7336 pgriffin  20   0 10.090g 9.382g  54224 R 71.4 60.3   0:21.55 step-17           
 7332 pgriffin  20   0 10.090g 9.382g  54224 R 71.1 60.3   0:21.78 step-17           
 7333 pgriffin  20   0 10.090g 9.382g  54224 R 71.1 60.3   0:21.79 step-17           
 7337 pgriffin  20   0 10.090g 9.382g  54224 R 70.8 60.3   0:21.89 step-17           
 7338 pgriffin  20   0 10.090g 9.382g  54224 R 69.8 60.3   0:21.50 step-17           
  
 
=========================================================================================
 step-17 NEW with PETSc/MPI uses DynamicSparsityPattern
=========================================================================================
Cycle 6 of step-17 NEW
top - 05:53:39 up 7 days,  9:00,  1 user,  load average: 1.84, 1.66, 1.62
Threads: 539 total,   9 running, 529 sleeping,   1 stopped,   0 zombie
%Cpu(s): 24.0 us, 76.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total, 13719720 free,  1130428 used,  1475692 buff/cache
KiB Swap: 16668668 total, 15388984 free,  1279684 used. 14687740 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 1710 pgriffin  20   0 1342740 598964  55076 R 99.9  3.7   0:14.40 step-17           
 1712 pgriffin  20   0 1342740 598964  55076 R 99.9  3.7   0:05.05 step-17           
 1711 pgriffin  20   0 1342740 598964  55076 R 99.7  3.7   0:05.11 step-17           
 1714 pgriffin  20   0 1342740 598964  55076 R 99.7  3.7   0:05.07 step-17           
 1715 pgriffin  20   0 1342740 598964  55076 R 99.7  3.7   0:05.07 step-17           
 1716 pgriffin  20   0 1342740 598964  55076 R 99.7  3.7   0:05.07 step-17           
 1717 pgriffin  20   0 1342740 598964  55076 R 99.0  3.7   0:04.97 step-17           
 1713 pgriffin  20   0 1342740 598964  55076 R 98.7  3.7   0:05.02 step-17           
Cycle 7 of step-17 NEW
Threads: 541 total,   9 running, 531 sleeping,   1 stopped,   0 zombie
%Cpu(s): 21.0 us, 54.0 sy,  0.0 ni, 25.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total, 12567900 free,  2284752 used,  1473188 buff/cache
KiB Swap: 16668668 total, 15389040 free,  1279628 used. 13536156 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 1773 pgriffin  20   0 2482828 1.657g  55080 R 99.9 10.6   1:03.73 step-17           
 1781 pgriffin  20   0 2482828 1.657g  55080 R 71.8 10.6   0:21.70 step-17           
 1776 pgriffin  20   0 2482828 1.657g  55080 R 71.4 10.6   0:21.80 step-17           
 1777 pgriffin  20   0 2482828 1.657g  55080 R 71.4 10.6   0:21.84 step-17           
 1775 pgriffin  20   0 2482828 1.657g  55080 R 71.1 10.6   0:21.71 step-17           
 1779 pgriffin  20   0 2482828 1.657g  55080 R 71.1 10.6   0:21.76 step-17           
 1780 pgriffin  20   0 2482828 1.657g  55080 R 71.1 10.6   0:21.66 step-17           
 1778 pgriffin  20   0 2482828 1.657g  55080 R 69.4 10.6   0:21.44 step-17           
Cycle 8 of step-17 NEW
Threads: 539 total,   9 running, 529 sleeping,   1 stopped,   0 zombie
%Cpu(s): 16.3 us, 20.5 sy,  0.0 ni, 63.1 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16325840 total,  9603772 free,  5828536 used,   893532 buff/cache
KiB Swap: 16668668 total, 15388100 free,  1280568 used.  9999864 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND           
 1404 pgriffin  20   0 6046792 5.051g  55048 R 99.9 32.4   3:10.39 step-17           
 1407 pgriffin  20   0 6046792 5.051g  55048 R 28.2 32.4   0:40.49 step-17           
 1410 pgriffin  20   0 6046792 5.051g  55048 R 28.2 32.4   0:40.27 step-17           
 1406 pgriffin  20   0 6046792 5.051g  55048 R 27.9 32.4   0:40.51 step-17           
 1412 pgriffin  20   0 6046792 5.051g  55048 R 27.9 32.4   0:40.45 step-17           
 1408 pgriffin  20   0 6046792 5.051g  55048 R 27.6 32.4   0:40.60 step-17           
 1411 pgriffin  20   0 6046792 5.051g  55048 R 27.6 32.4   0:40.22 step-17           
 1409 pgriffin  20   0 6046792 5.051g  55048 R 26.9 32.4   0:39.94 step-17
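
To put a number on the drop in efficiency with increasing #DOF, the same kind of rough calculation can be applied to the three "step-17 NEW" listings above. This is a sketch under the assumption of 8 logical CPUs; the names are hypothetical, and the values are simply the %CPU column of each listing. It does not explain the drop; it only shows that the average utilization falls from cycle 6 to cycle 8 while one thread stays near 100%.

```python
# Per-thread %CPU values copied from the "step-17 NEW" listings above.
step17_new_thread_cpu = {
    6: [99.9, 99.9, 99.7, 99.7, 99.7, 99.7, 99.0, 98.7],
    7: [99.9, 71.8, 71.4, 71.4, 71.1, 71.1, 71.1, 69.4],
    8: [99.9, 28.2, 28.2, 27.9, 27.9, 27.6, 27.6, 26.9],
}

# Assumption: 8 logical CPUs (4 cores x 2 hardware threads), as stated in the post.
LOGICAL_CPUS = 8


def utilization_by_cycle(samples, logical_cpus):
    """Map each cycle number to whole-machine utilization in percent."""
    return {
        cycle: sum(thread_cpu) / logical_cpus
        for cycle, thread_cpu in samples.items()
    }


step17_utilization = utilization_by_cycle(step17_new_thread_cpu, LOGICAL_CPUS)

for cycle in sorted(step17_utilization):
    busiest = max(step17_new_thread_cpu[cycle])
    print(
        f"cycle {cycle}: average utilization {step17_utilization[cycle]:.1f}%, "
        f"busiest thread {busiest:.1f}%"
    )
```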