Info about magmaf_zgeev_gpu

Gianluca Frazzei

Mar 10, 2025, 9:49:38 AM

to MAGMA User

As the title says, I was wondering whether such a function is under construction, whether there is any information about problems encountered in creating it, why it doesn't exist, and so on.

Thanks a lot to you all.

Mark Gates

Mar 10, 2025, 10:03:08 AM

to Gianluca Frazzei, MAGMA User

Hi Gianluca,

There's magmaf_zgeev (in Fortran) and magma_zgeev (in C); both are hybrid CPU–GPU routines that take the matrix in CPU memory. There is no version that takes the matrix in GPU memory. A significant portion of the work is still done on the CPU, so there did not seem to be a large benefit to having a GPU interface.

Mark

Gianluca Frazzei

Mar 10, 2025, 12:42:08 PM

to MAGMA User, mga...@icl.utk.edu, MAGMA User, Gianluca Frazzei

I am still a newbie in all of this, so I'll try asking this instead: I am interested in diagonalizing non-Hermitian Hamiltonians, but on my machine I get stuck at system sizes on the order of 10^3. I found that the magmaf version of zgeev actually takes more computing time to obtain the eigenstates and eigenvalues than the standard LAPACK one, which really confused me. I was wondering if you knew how this could be the case, and whether you knew of any way to leverage the MAGMA library to speed up my computations.

Thanks a lot

Mark Gates

Mar 10, 2025, 1:03:42 PM

to Gianluca Frazzei, MAGMA User

Hi Gianluca,

Can you provide some more specifics?

What kind of machine are you running on (what CPU, GPU, # cores)?
What CPU BLAS/LAPACK library are you using?
What is your target matrix size?
Do you need eigenvalues only, left eigvecs, right eigvecs, or both?
The output of MAGMA's testing_zgeev would be helpful (see below).

Matrix size n = 1000 is pretty small for MAGMA. I wouldn't expect it to be able to accelerate; seeing a slow-down compared to LAPACK is not surprising, depending on the CPUs and GPUs.

Note that the non-symmetric eigenvalue problem is memory bound, so it won't get close to the peak flop/s performance of GPUs. In this case, MAGMA is taking advantage of the higher bandwidth of GPU memory vs. CPU memory.

Example with 1 Volta V100-SXM2-32GB GPU and 10 CPU cores, Intel E5-2698 v4 @ 2.20GHz.

# Eigenvalues only, no vectors case (-LN -RN).

magma/testing> export OMP_NUM_THREADS=10
magma/testing> ./testing_zgeev -n 100:900:100 -n 1000:20000:1000 --lapack -LN -RN
% MAGMA 2.9.0 svn 32-bit magma_int_t, 64-bit pointer.
% Compiled for CUDA architectures 70
% CUDA runtime 11080, driver 12070. OpenMP threads 10. MKL 2024.0.2, MKL threads 10.
% device 0: Tesla V100-SXM2-32GB, 1530.0 MHz clock, 32494.1 MiB memory, capability 7.0
% Mon Mar 10 16:24:27 2025
% Usage: ./testing_zgeev [options] [-h|--help]

% jobvl = No vectors, jobvr = No vectors, ngpu = 1
%     N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
%==========================================================================
    100             0.02             0.02   1.68e-15   ok
    200             0.05             0.05   2.52e-15   ok
    300             0.09             0.09   2.84e-15   ok
    400             0.15             0.15   2.96e-15   ok
    500             0.24             0.23   2.98e-15   ok
    600             0.47             0.41   3.49e-15   ok
    700             0.59             0.56   3.83e-15   ok
    800             0.72             0.66   3.77e-15   ok
    900             0.82             0.75   3.52e-15   ok
   1000             0.99             0.85   3.54e-15   ok
   2000             3.40             2.81   4.38e-15   ok
   3000            10.05             7.98   5.21e-15   ok
   4000            19.73            13.12   5.41e-15   ok
   5000            35.53            19.15   5.53e-15   ok
   6000            63.72            33.08   6.75e-15   ok
   7000            91.87            43.48   6.70e-15   ok
   8000           135.70            58.77   6.76e-15   ok
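
To read the crossover off the table above, it may help to compute the CPU/GPU time ratio for each size. Here is a minimal Python sketch, assuming the "CPU Time" column is the LAPACK run and the "GPU Time" column is the MAGMA run; the timings are copied from the output above, and the names `timings` and `speedups` are just illustrative, not part of MAGMA.

```python
# Timings in seconds, copied from the eigenvalues-only run (-LN -RN) above.
# Each row is (n, LAPACK time on the CPU, MAGMA hybrid CPU-GPU time).
timings = [
    (100, 0.02, 0.02), (200, 0.05, 0.05), (300, 0.09, 0.09),
    (400, 0.15, 0.15), (500, 0.24, 0.23), (600, 0.47, 0.41),
    (700, 0.59, 0.56), (800, 0.72, 0.66), (900, 0.82, 0.75),
    (1000, 0.99, 0.85), (2000, 3.40, 2.81), (3000, 10.05, 7.98),
    (4000, 19.73, 13.12), (5000, 35.53, 19.15), (6000, 63.72, 33.08),
    (7000, 91.87, 43.48), (8000, 135.70, 58.77),
]


def speedups(rows):
    """Return (n, cpu_time / gpu_time) per row; a ratio above 1 means MAGMA was faster."""
    return [(n, cpu / gpu) for n, cpu, gpu in rows]


for n, ratio in speedups(timings):
    print(f"n = {n:5d}   speedup = {ratio:4.2f}x")
```

On this machine the ratio stays between about 1.0 and 1.2 up to n = 1000 and reaches roughly 2.3 at n = 8000, so the benefit only becomes clear for matrices well beyond n = 1000.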

# Left eigenvectors, no right eigvec (-LV -RN).

magma/testing> ./testing_zgeev -n 100:900:100 -n 1000:20000:1000 --lapack -LV -RN

% MAGMA 2.9.0 svn 32-bit magma_int_t, 64-bit pointer.
% Compiled for CUDA architectures 70
% CUDA runtime 11080, driver 12070. OpenMP threads 10. MKL 2024.0.2, MKL threads 10.
% device 0: Tesla V100-SXM2-32GB, 1530.0 MHz clock, 32494.1 MiB memory, capability 7.0
% Mon Mar 10 16:34:41 2025
% Usage: ./testing_zgeev [options] [-h|--help]

% jobvl = Vectors needed, jobvr = No vectors, ngpu = 1
%     N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
%==========================================================================
    100             0.02             0.03   1.82e-15   ok
    200             0.07             0.07   2.60e-15   ok
    300             0.11             0.12   2.84e-15   ok
    400             0.19             0.18   2.97e-15   ok
    500             0.30             0.27   3.13e-15   ok
    600             0.52             0.48   3.70e-15   ok
    700             0.66             0.61   3.68e-15   ok
    800             0.87             0.76   3.77e-15   ok
    900             1.02             0.91   3.63e-15   ok
   1000             1.15             1.08   3.47e-15   ok
   2000             4.69             4.15   4.24e-15   ok
   3000            14.57            11.40   5.28e-15   ok
   4000            28.43            20.98   5.54e-15   ok
   5000            50.63            33.22   5.48e-15   ok
   6000            85.10            54.21   6.79e-15   ok
   7000           125.27            72.77   6.72e-15   ok
   8000           182.26           102.08   6.77e-15   ok
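
Since the cost depends on whether eigenvectors are requested, a quick way to see how much they add is to divide the -LV timings by the -LN timings at the same size. This is a minimal Python sketch using a few rows copied from the two runs above, again assuming "CPU Time" is LAPACK and "GPU Time" is MAGMA; the dictionary and function names are illustrative only.

```python
# Timings in seconds, copied from the two runs above for three sizes.
# Each entry maps n -> (LAPACK time, MAGMA time).
values_only = {1000: (0.99, 0.85), 4000: (19.73, 13.12), 8000: (135.70, 58.77)}
left_vectors = {1000: (1.15, 1.08), 4000: (28.43, 20.98), 8000: (182.26, 102.08)}


def vector_overhead(no_vec, with_vec):
    """Return n -> (LAPACK ratio, MAGMA ratio) of the -LV time over the -LN time."""
    return {
        n: (with_vec[n][0] / no_vec[n][0], with_vec[n][1] / no_vec[n][1])
        for n in no_vec
    }


for n, (cpu_ratio, gpu_ratio) in vector_overhead(values_only, left_vectors).items():
    print(f"n = {n:5d}   LAPACK x{cpu_ratio:.2f}   MAGMA x{gpu_ratio:.2f}")
```

With these numbers, asking for left eigenvectors adds roughly 15 to 45 percent to the LAPACK time and roughly 25 to 75 percent to the MAGMA time, so MAGMA's advantage at n = 8000 shrinks from about 2.3x to about 1.8x.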

Mark
