Hi,
I have several questions about magma_dgetf2_native_blocked and would appreciate your help. Thank you.
1) Is it possible to launch magma_dscal_dger_native for panels wider than the default BATF2_NB=8? I changed BATF2_NB to 16 and added several statically defined dscal_dger_1d_kernel_native template instantiations, up to n=16, in dgetf2_kernels.cu, but it fails during execution.
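To make the change in 1) concrete, here is a minimal host-side sketch of the dispatch pattern I mean. It assumes the launcher selects a template instantiation through a switch on n; `panel_kernel_stub` and `dispatch_panel` are hypothetical names for illustration, not MAGMA identifiers, and the stub only returns its compile-time width instead of launching a CUDA kernel.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel templated on the panel width N.
// A real CUDA kernel would be launched with <<<grid, threads>>>; this stub
// only reports the compile-time width it was instantiated with.
template <int N>
int panel_kernel_stub() { return N; }

// Runtime-to-compile-time dispatch: every supported n needs its own case,
// because a template argument must be known at compile time.
// Returns the instantiated width, or -1 when n has no matching case.
int dispatch_panel(int n) {
    assert(n >= 0);  // n is a column count, so it cannot be negative
    switch (n) {
        case 1:  return panel_kernel_stub<1>();
        case 2:  return panel_kernel_stub<2>();
        case 3:  return panel_kernel_stub<3>();
        case 4:  return panel_kernel_stub<4>();
        case 5:  return panel_kernel_stub<5>();
        case 6:  return panel_kernel_stub<6>();
        case 7:  return panel_kernel_stub<7>();
        case 8:  return panel_kernel_stub<8>();
        // Cases 9..16 are the ones added when raising the limit from 8 to 16.
        case 9:  return panel_kernel_stub<9>();
        case 10: return panel_kernel_stub<10>();
        case 11: return panel_kernel_stub<11>();
        case 12: return panel_kernel_stub<12>();
        case 13: return panel_kernel_stub<13>();
        case 14: return panel_kernel_stub<14>();
        case 15: return panel_kernel_stub<15>();
        case 16: return panel_kernel_stub<16>();
        default: return -1;  // unsupported width: no instantiation exists
    }
}
```

In this sketch, any n without a case falls through to the error path, so a missing case between 9 and 16 would show up as a failure at run time rather than at compile time.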
Regarding the number of threads and the grid size:
const int tbx = MAX_NTHREADS/2;
dim3 grid(magma_ceildiv(m,tbx),1,1);
2) Why is the maximum number of threads divided by 2 (1024/2)?
Let's consider n=4 and m=1024: if I read the formula correctly, this launches magma_ceildiv(1024,512) = 2 blocks of 512 threads each.
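The launch geometry from the two quoted lines can be worked through on the host. This is a minimal sketch under stated assumptions: `MAX_NTHREADS` is taken to be 1024 (from the "1024/2" in the question), `ceildiv` is a hypothetical stand-in for magma_ceildiv (integer division rounded up), and `launch_shape` is an illustrative helper, not a MAGMA function.

```cpp
#include <cassert>

// Assumed value, taken from the "1024/2" in the question; the real macro
// lives in the MAGMA sources and may differ.
constexpr int MAX_NTHREADS = 1024;

// Hypothetical stand-in for magma_ceildiv: integer division rounded up.
int ceildiv(int a, int b) {
    assert(b > 0);
    return (a + b - 1) / b;
}

struct LaunchShape {
    int threads_per_block;  // corresponds to tbx
    int blocks;             // corresponds to grid.x
};

// Mirrors the two quoted lines:
//   tbx    = MAX_NTHREADS / 2
//   grid.x = ceildiv(m, tbx)
LaunchShape launch_shape(int m) {
    const int tbx = MAX_NTHREADS / 2;
    return LaunchShape{tbx, ceildiv(m, tbx)};
}
```

With these assumptions, `launch_shape(1024)` gives 512 threads per block and 2 blocks, and `launch_shape(512)` gives a single block. Note that n does not enter the computation at all: in this formula the grid depends only on the number of rows m.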
3) Is it possible to launch the kernel dscal_dger_1d_kernel_native with a grid size of n? Again, if I increase the grid size, it fails. Why? Which other parameters should be changed so that dscal_dger_1d_kernel_native utilises more resources?
Best regards,
A. N.