./testing_zgesv -c -l -N 16000 --niter 5
% MAGMA 2.8.0 svn 32-bit magma_int_t, 64-bit pointer.
% Compiled with CUDA support for 7.0
% CUDA runtime 12010, driver 12030. OpenMP threads 72. MKL 2023.0.2, MKL threads 36.
% device 0: NVIDIA A100-PCIE-40GB, 1410.0 MHz clock, 40338.3 MiB memory, capability 8.0
% Mon Jul 8 18:37:03 2024
% Usage: ./testing_zgesv [options] [-h|--help]
% ngpu 1
%     N  NRHS   CPU Gflop/s (sec)   GPU Gflop/s (sec)   ||B - AX|| / N*||A||*||X||   ||B - AX|| / N*||A||*||X||_CPU
%================================================================================================================
  16000     1   1245.34 (  8.77)    3180.01 (  3.44)    5.87e-24   ok                1.21e-23   ok
  16000     1   1553.29 (  7.03)    3281.26 (  3.33)    5.11e-19   ok                4.37e-19   ok
  16000     1   1415.47 (  7.72)    3470.82 (  3.15)    5.13e-19   ok                4.51e-19   ok
  16000     1   1411.46 (  7.74)    3489.04 (  3.13)    4.38e-19   ok                3.90e-19   ok
  16000     1   1660.47 (  6.58)    3484.29 (  3.14)    4.99e-19   ok                4.35e-19   ok
./testing_zgesv_gpu -c -l -N 16000 --niter 5
% MAGMA 2.8.0 svn 32-bit magma_int_t, 64-bit pointer.
% Compiled with CUDA support for 7.0
% CUDA runtime 12010, driver 12030. OpenMP threads 72. MKL 2023.0.2, MKL threads 36.
% device 0: NVIDIA A100-PCIE-40GB, 1410.0 MHz clock, 40338.3 MiB memory, capability 8.0
% Mon Jul 8 18:39:00 2024
% Usage: ./testing_zgesv_gpu [options] [-h|--help]
%     N  NRHS   CPU Gflop/s (sec)   GPU Gflop/s (sec)   ||B - AX|| / N*||A||*||X||
%===============================================================================
  16000     1   1305.78 (  8.37)    6431.12 (  1.70)    5.87e-24   ok
  16000     1   1602.54 (  6.82)    6482.08 (  1.69)    5.11e-19   ok
  16000     1   1602.09 (  6.82)    6490.74 (  1.68)    5.13e-19   ok
  16000     1   1599.98 (  6.83)    6495.13 (  1.68)    4.38e-19   ok
  16000     1   1600.33 (  6.83)    6492.85 (  1.68)    4.99e-19   ok
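
One quick way to compare the two runs above is to recompute throughput from the reported times. The sketch below assumes the leading-order count of (8/3)*N^3 real floating-point operations for a complex LU solve. That count is an assumption here, not something the tester prints, and the timing lists are copied by hand from the "GPU Gflop/s (sec)" columns.

```python
# Times in seconds, copied from the "GPU Gflop/s (sec)" columns above.
N = 16000
zgesv_sec = [3.44, 3.33, 3.15, 3.13, 3.14]      # testing_zgesv (CPU interface)
zgesv_gpu_sec = [1.70, 1.69, 1.68, 1.68, 1.68]  # testing_zgesv_gpu (GPU interface)


def approx_gflops(n, seconds):
    """Approximate Gflop/s, assuming (8/3) * n^3 real flops for a complex LU."""
    return (8.0 / 3.0) * n**3 / seconds / 1e9


# Use the fastest iteration of each run to reduce warm-up noise.
best_zgesv = min(zgesv_sec)
best_zgesv_gpu = min(zgesv_gpu_sec)
speedup = best_zgesv / best_zgesv_gpu

print(f"zgesv     best: {best_zgesv:.2f} s, ~{approx_gflops(N, best_zgesv):.0f} Gflop/s")
print(f"zgesv_gpu best: {best_zgesv_gpu:.2f} s, ~{approx_gflops(N, best_zgesv_gpu):.0f} Gflop/s")
print(f"GPU interface is ~{speedup:.2f}x faster on this problem")
```

The recomputed rates land within about one percent of the printed Gflop/s values, so the gap between the two testers is a difference in wall-clock time rather than in how the work is counted.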
--
You received this message because you are subscribed to the Google Groups "MAGMA User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to magma-user+...@icl.utk.edu.
To view this discussion on the web visit https://groups.google.com/a/icl.utk.edu/d/msgid/magma-user/1d019efd-08b5-4b8f-b224-3d73aeb0029en%40icl.utk.edu.
On Jul 8, 2024, at 3:02 PM, Danesh Daroui <danesh...@gmail.com> wrote:

Hi Ahmad,

Thanks for your response. I will also test my code on a machine equipped with a Tesla GPU to see if I get better results. I can use magma_zgesv_gpu, but the problem is that in many cases the matrices are so large that they probably won't fit into the GPU's memory. If I use magma_zmalloc to allocate more memory than the GPU has, would the routine allocate the rest in RAM instead, or would the allocation simply fail? May I also ask how MAGMA's memory management works when the data in RAM is larger than the GPU's internal memory? I thought MAGMA would internally transfer the parts that fit into GPU memory and then perform the factorization and Gaussian elimination in both the GPU's memory and RAM in parallel to achieve the best performance.

Also, do you know why MAGMA's CPU utilization is so low? If MAGMA uses MKL for its internal operations, then it should automatically use all available cores. MKL is very good at that; I see it using all cores when I use only MKL in my code to solve equations. Did I build MAGMA correctly to use MKL with the make.inc file that I pasted?

Regards,
Dan
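
To put the "will it fit" question in numbers, here is a minimal sketch that estimates the device memory needed to hold A and B for a double-complex solve. It assumes 16 bytes per element and uses the 40338.3 MiB reported for the A100 in the tester output. It ignores any workspace MAGMA allocates internally and the memory taken by the CUDA context, so the real limit will be somewhat lower. The function names are made up for illustration and are not MAGMA routines.

```python
import math

BYTES_PER_ZCOMPLEX = 16  # double-precision complex: two 8-byte doubles


def zgesv_device_bytes(n, nrhs=1):
    """Bytes needed to hold the n x n matrix A and the n x nrhs matrix B."""
    return BYTES_PER_ZCOMPLEX * (n * n + n * nrhs)


def largest_n_that_fits(device_mib, nrhs=1):
    """Largest n whose A and B fit in device_mib MiB (upper bound, no workspace)."""
    budget = device_mib * 2**20
    n = math.isqrt(int(budget // BYTES_PER_ZCOMPLEX))
    while n > 0 and zgesv_device_bytes(n, nrhs) > budget:
        n -= 1
    return n


A100_MIB = 40338.3  # from the "device 0" line of the tester output

print(f"N = 16000 needs about {zgesv_device_bytes(16000) / 2**20:.1f} MiB")
print(f"Largest N that fits in {A100_MIB} MiB: about {largest_n_that_fits(A100_MIB)}")
```

On these assumptions, the N = 16000 case uses roughly a tenth of the card, and a single A100-40GB tops out at N of a little over 51,000 for the GPU-interface routine.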