Hi,
I am new to MAGMA.
How does MAGMA schedule linear algebra routines across CPU and GPU?
For example, if I call dgemm_ with a very large matrix size, does it get executed on the GPU? And if I call dgemm_ with a tiny matrix size, does it get scheduled on the CPU?
Or is it the responsibility of the end user to invoke the appropriate dgemm routine based on the problem size?
Can somebody shed more light on this?
My use case:
The application calls, say, dgemm_(). If the problem size is small, I invoke a CPU BLAS library such as AOCL or MKL; otherwise I invoke the rocBLAS API (ROCm) on the GPU.
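To make the use case concrete, here is a rough sketch of the size-based dispatch I have in mind. All names in it (backend_t, choose_backend, gemm_flops, GPU_FLOP_THRESHOLD) are hypothetical and not part of MAGMA, rocBLAS, AOCL, or MKL, and the threshold value is only a placeholder that would have to be found by benchmarking on the target machine.

```c
#include <assert.h>

/* Hypothetical backend tags, used only for this illustration. */
typedef enum { BACKEND_CPU, BACKEND_GPU } backend_t;

/* Assumed crossover point in floating-point operations.
 * This is a placeholder; the real value depends on the hardware and on
 * host-to-device transfer cost, and must be measured. */
#define GPU_FLOP_THRESHOLD 1.0e8

/* Approximate work of C = alpha*A*B + beta*C, where A is m x k and
 * B is k x n: about 2*m*n*k floating-point operations. */
double gemm_flops(int m, int n, int k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Pick a backend from the problem size alone. A real dispatcher would
 * probably also consider whether the matrices already live in GPU memory. */
backend_t choose_backend(int m, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    return gemm_flops(m, n, k) >= GPU_FLOP_THRESHOLD ? BACKEND_GPU
                                                     : BACKEND_CPU;
}
```

The application-level wrapper would then call the CPU dgemm_ when choose_backend returns BACKEND_CPU, and otherwise copy the operands to the device and call the GPU dgemm. My question is whether MAGMA already makes this kind of decision internally, or whether I have to write it myself.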
Thanks,
Kiran V