Hello All.
This is more of an observation with questions. I have been looking closely at CPU usage on a single quad-core processor with and without PETSc/MPI.
I was surprised to see all 4 cores (8 threads) being used in step-8, which does not have PETSc/MPI support. In the solver, the main thread was using 100% of its resources. The others were very consistently using roughly 24-27%, independent of #DOF. Is the normal SolverCG<> of step-8 multi-threaded? Would the overall performance on a single processor improve if one were able to have multiple controlling threads instead of one? Is that even possible? This might be even more important for the 8- or 16-core CPUs I see available now. If I am correct, the existing code was probably optimal for single- or dual-core CPUs, but that does not appear to be true now. It may not be feasible, but I thought I'd mention it.
Also, it appears that the efficiency with one CPU decreases as the #DOF increases with PETSc/MPI. Does anyone know why?
Thanks
Pete Griffin
=========================================================================================
step-8 NO PETSc/MPI
=========================================================================================
Cycle 7 of step-8
Threads: 546 total, 2 running, 543 sleeping, 1 stopped, 0 zombie
%Cpu(s): 34.6 us, 0.1 sy, 0.0 ni, 65.2 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 11629396 free, 1940364 used, 2756080 buff/cache
KiB Swap: 16668668 total, 15488860 free, 1179808 used. 13757012 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6752 pgriffin 20 0 2330772 1.260g 48340 R 99.9 8.1 0:41.49 step-8
6765 pgriffin 20 0 2330772 1.260g 48340 S 25.6 8.1 0:04.43 step-8
6760 pgriffin 20 0 2330772 1.260g 48340 S 25.2 8.1 0:04.43 step-8
6761 pgriffin 20 0 2330772 1.260g 48340 S 25.2 8.1 0:04.37 step-8
6763 pgriffin 20 0 2330772 1.260g 48340 S 25.2 8.1 0:04.34 step-8
6766 pgriffin 20 0 2330772 1.260g 48340 S 25.2 8.1 0:04.36 step-8
6762 pgriffin 20 0 2330772 1.260g 48340 S 24.9 8.1 0:04.33 step-8
6764 pgriffin 20 0 2330772 1.260g 48340 S 24.9 8.1 0:04.40 step-8
Cycle 8 of step-8
Threads: 547 total, 9 running, 537 sleeping, 1 stopped, 0 zombie
%Cpu(s): 36.2 us, 0.0 sy, 0.0 ni, 63.6 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 16325840 total, 9097668 free, 4475188 used, 2752984 buff/cache
KiB Swap: 16668668 total, 15488876 free, 1179792 used. 11225328 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6752 pgriffin 20 0 4807840 3.672g 48340 R 99.7 23.6 1:52.88 step-8
6761 pgriffin 20 0 4807840 3.672g 48340 R 27.2 23.6 0:12.15 step-8
6764 pgriffin 20 0 4807840 3.672g 48340 R 27.2 23.6 0:12.15 step-8
6760 pgriffin 20 0 4807840 3.672g 48340 R 26.9 23.6 0:12.26 step-8
6765 pgriffin 20 0 4807840 3.672g 48340 R 26.9 23.6 0:12.24 step-8
6766 pgriffin 20 0 4807840 3.672g 48340 R 26.9 23.6 0:11.93 step-8
6762 pgriffin 20 0 4807840 3.672g 48340 R 26.6 23.6 0:12.10 step-8
6763 pgriffin 20 0 4807840 3.672g 48340 R 25.6 23.6 0:12.17 step-8
Cycle 9 of step-8
Threads: 543 total, 2 running, 540 sleeping, 1 stopped, 0 zombie
%Cpu(s): 34.0 us, 0.3 sy, 0.0 ni, 65.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 4041224 free, 9150464 used, 3134152 buff/cache
KiB Swap: 16668668 total, 15489424 free, 1179244 used. 6552112 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6982 pgriffin 20 0 9311500 8.120g 48408 R 99.9 52.2 6:16.43 step-8
6991 pgriffin 20 0 9311500 8.120g 48408 S 24.6 52.2 0:48.27 step-8
6992 pgriffin 20 0 9311500 8.120g 48408 S 24.6 52.2 0:48.34 step-8
6993 pgriffin 20 0 9311500 8.120g 48408 S 24.3 52.2 0:48.15 step-8
6994 pgriffin 20 0 9311500 8.120g 48408 S 24.3 52.2 0:48.31 step-8
6995 pgriffin 20 0 9311500 8.120g 48408 S 24.3 52.2 0:48.30 step-8
6996 pgriffin 20 0 9311500 8.120g 48408 S 24.3 52.2 0:48.15 step-8
6990 pgriffin 20 0 9311500 8.120g 48408 S 23.9 52.2 0:48.20 step-8
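As a rough cross-check of the step-8 numbers above: if the %Cpu(s) summary line is an average over the 8 hardware threads (an assumption about how top reports it here), then the per-thread values for Cycle 7 should add up to about the 34.6% user time shown. A minimal Python sketch, with the values copied from the Cycle 7 table and a hypothetical helper name `aggregate_cpu_percent`:

```python
def aggregate_cpu_percent(per_thread_percent, n_logical_cpus):
    """Average the per-thread %CPU values over all logical CPUs."""
    return sum(per_thread_percent) / n_logical_cpus

# %CPU column from "Cycle 7 of step-8": main thread first, then the 7 others.
cycle7 = [99.9, 25.6, 25.2, 25.2, 25.2, 25.2, 24.9, 24.9]

# Assumption: 4 cores with 2 hardware threads each = 8 logical CPUs.
total = aggregate_cpu_percent(cycle7, n_logical_cpus=8)
print(f"{total:.1f}% of the machine")
```

This gives about 34.5%, close to the 34.6 us that top reports, so during the solve the machine as a whole is only about one-third busy.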
=========================================================================================
step-17 OLD with PETSc/MPI uses max_couplings_between_dofs(), high memory usage/DOF
=========================================================================================
Cycle 6 of step-17 OLD
Threads: 540 total, 9 running, 530 sleeping, 1 stopped, 0 zombie
%Cpu(s): 24.3 us, 75.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 8039968 free, 3926700 used, 4359172 buff/cache
KiB Swap: 16668668 total, 15491740 free, 1176928 used. 11769984 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7332 pgriffin 20 0 4043180 3.149g 54160 R 99.9 20.2 0:04.97 step-17
7338 pgriffin 20 0 4043180 3.149g 54160 R 99.9 20.2 0:04.90 step-17
7331 pgriffin 20 0 4043180 3.149g 54160 R 99.9 20.2 0:12.75 step-17
7334 pgriffin 20 0 4043180 3.149g 54160 R 99.9 20.2 0:04.86 step-17
7335 pgriffin 20 0 4043180 3.149g 54160 R 99.7 20.2 0:04.88 step-17
7337 pgriffin 20 0 4043180 3.149g 54160 R 99.7 20.2 0:04.91 step-17
7333 pgriffin 20 0 4043180 3.149g 54160 R 99.3 20.2 0:04.94 step-17
7336 pgriffin 20 0 4043180 3.149g 54160 R 99.3 20.2 0:04.84 step-17
Cycle 7 of step-17 OLD
Threads: 542 total, 9 running, 532 sleeping, 1 stopped, 0 zombie
%Cpu(s): 21.0 us, 53.8 sy, 0.0 ni, 25.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 1494532 free, 10477460 used, 4353848 buff/cache
KiB Swap: 16668668 total, 15491740 free, 1176928 used. 5224572 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7331 pgriffin 20 0 10.090g 9.382g 54224 R 99.9 60.3 0:55.10 step-17
7334 pgriffin 20 0 10.090g 9.382g 54224 R 71.8 60.3 0:21.78 step-17
7335 pgriffin 20 0 10.090g 9.382g 54224 R 71.4 60.3 0:21.67 step-17
7336 pgriffin 20 0 10.090g 9.382g 54224 R 71.4 60.3 0:21.55 step-17
7332 pgriffin 20 0 10.090g 9.382g 54224 R 71.1 60.3 0:21.78 step-17
7333 pgriffin 20 0 10.090g 9.382g 54224 R 71.1 60.3 0:21.79 step-17
7337 pgriffin 20 0 10.090g 9.382g 54224 R 70.8 60.3 0:21.89 step-17
7338 pgriffin 20 0 10.090g 9.382g 54224 R 69.8 60.3 0:21.50 step-17
=========================================================================================
step-17 NEW with PETSc/MPI uses DynamicSparsityPattern
=========================================================================================
Cycle 6 of step-17 NEW
top - 05:53:39 up 7 days, 9:00, 1 user, load average: 1.84, 1.66, 1.62
Threads: 539 total, 9 running, 529 sleeping, 1 stopped, 0 zombie
%Cpu(s): 24.0 us, 76.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 13719720 free, 1130428 used, 1475692 buff/cache
KiB Swap: 16668668 total, 15388984 free, 1279684 used. 14687740 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1710 pgriffin 20 0 1342740 598964 55076 R 99.9 3.7 0:14.40 step-17
1712 pgriffin 20 0 1342740 598964 55076 R 99.9 3.7 0:05.05 step-17
1711 pgriffin 20 0 1342740 598964 55076 R 99.7 3.7 0:05.11 step-17
1714 pgriffin 20 0 1342740 598964 55076 R 99.7 3.7 0:05.07 step-17
1715 pgriffin 20 0 1342740 598964 55076 R 99.7 3.7 0:05.07 step-17
1716 pgriffin 20 0 1342740 598964 55076 R 99.7 3.7 0:05.07 step-17
1717 pgriffin 20 0 1342740 598964 55076 R 99.0 3.7 0:04.97 step-17
1713 pgriffin 20 0 1342740 598964 55076 R 98.7 3.7 0:05.02 step-17
Cycle 7 of step-17 NEW
Threads: 541 total, 9 running, 531 sleeping, 1 stopped, 0 zombie
%Cpu(s): 21.0 us, 54.0 sy, 0.0 ni, 25.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 12567900 free, 2284752 used, 1473188 buff/cache
KiB Swap: 16668668 total, 15389040 free, 1279628 used. 13536156 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1773 pgriffin 20 0 2482828 1.657g 55080 R 99.9 10.6 1:03.73 step-17
1781 pgriffin 20 0 2482828 1.657g 55080 R 71.8 10.6 0:21.70 step-17
1776 pgriffin 20 0 2482828 1.657g 55080 R 71.4 10.6 0:21.80 step-17
1777 pgriffin 20 0 2482828 1.657g 55080 R 71.4 10.6 0:21.84 step-17
1775 pgriffin 20 0 2482828 1.657g 55080 R 71.1 10.6 0:21.71 step-17
1779 pgriffin 20 0 2482828 1.657g 55080 R 71.1 10.6 0:21.76 step-17
1780 pgriffin 20 0 2482828 1.657g 55080 R 71.1 10.6 0:21.66 step-17
1778 pgriffin 20 0 2482828 1.657g 55080 R 69.4 10.6 0:21.44 step-17
Cycle 8 of step-17 NEW
Threads: 539 total, 9 running, 529 sleeping, 1 stopped, 0 zombie
%Cpu(s): 16.3 us, 20.5 sy, 0.0 ni, 63.1 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16325840 total, 9603772 free, 5828536 used, 893532 buff/cache
KiB Swap: 16668668 total, 15388100 free, 1280568 used. 9999864 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1404 pgriffin 20 0 6046792 5.051g 55048 R 99.9 32.4 3:10.39 step-17
1407 pgriffin 20 0 6046792 5.051g 55048 R 28.2 32.4 0:40.49 step-17
1410 pgriffin 20 0 6046792 5.051g 55048 R 28.2 32.4 0:40.27 step-17
1406 pgriffin 20 0 6046792 5.051g 55048 R 27.9 32.4 0:40.51 step-17
1412 pgriffin 20 0 6046792 5.051g 55048 R 27.9 32.4 0:40.45 step-17
1408 pgriffin 20 0 6046792 5.051g 55048 R 27.6 32.4 0:40.60 step-17
1411 pgriffin 20 0 6046792 5.051g 55048 R 27.6 32.4 0:40.22 step-17
1409 pgriffin 20 0 6046792 5.051g 55048 R 26.9 32.4 0:39.94 step-17
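To put a number on the drop in efficiency with increasing #DOF in the step-17 NEW runs, one option is to average the %CPU of the seven non-main threads in each cycle. This is only a sketch: the values are copied from the tables above, and `helper_mean` is a hypothetical helper, not part of any tool.

```python
from statistics import mean

# %CPU of the 7 non-main threads, copied from the step-17 NEW tables.
helpers = {
    6: [99.9, 99.7, 99.7, 99.7, 99.7, 99.0, 98.7],
    7: [71.8, 71.4, 71.4, 71.1, 71.1, 71.1, 69.4],
    8: [28.2, 28.2, 27.9, 27.9, 27.6, 27.6, 26.9],
}

def helper_mean(cycle):
    """Mean %CPU of the non-main threads in the given cycle."""
    return mean(helpers[cycle])

for cycle in sorted(helpers):
    print(f"Cycle {cycle}: helpers average {helper_mean(cycle):.1f}% CPU")
```

The averages fall from roughly 99% to 71% to 28% between cycles 6 and 8, while the main thread stays near 100% throughout.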