GETRF Pivot

aran nokan

Dec 24, 2020, 11:27:11 AM
to MAGMA User
Hi,

Which function selects the pivot in the GPU version of DGETRF? Does it run on the CPU or on the GPU? I would like to follow the pivot selection process, but it is not easy to locate.

Best Regards,
AN

Stanimire Tomov

Dec 24, 2020, 2:54:00 PM
to aran nokan, MAGMA User
Hi,

The GPU version of DGETRF comes in two variants: hybrid and native.

The hybrid variant factors the panels on the CPU, so pivot selection is done by the usual LAPACK CPU code
(IDAMAX plus the proper offset).
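
For readers unfamiliar with the "IDAMAX plus offset" idiom, here is a minimal C sketch of how a CPU panel factorization typically picks the pivot for one column. The helper names idamax0 and select_pivot_row are made up for illustration and use 0-based indices, whereas the BLAS routine returns a 1-based index. This is not code from LAPACK or MAGMA.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for BLAS IDAMAX with 0-based indexing: returns the
   position of the first entry of x[0..n-1] with the largest absolute value.
   Assumes n >= 1. */
static size_t idamax0(size_t n, const double *x)
{
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (fabs(x[i]) > fabs(x[best])) {
            best = i;
        }
    }
    return best;
}

/* Pivot row for column j of a column-major matrix A with m rows and leading
   dimension lda. The search covers rows j..m-1 only, so the local result is
   shifted by the offset j to get a row index in the full matrix. */
static size_t select_pivot_row(const double *A, size_t lda, size_t m, size_t j)
{
    const double *col = A + j + j * lda;   /* points at A(j, j) */
    return j + idamax0(m - j, col);
}
```

The offset matters because the search starts at the diagonal entry: without adding j, the returned index would be relative to the sub-column rather than to the matrix.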

The native variant performs the selection inside dgetf2_native_kernel (file dgetf2_native_kernel.cu).
Note that the entire panel factorization is a single kernel, so pivot selection is not a separate kernel or function.
The pivot for column i is selected by thread block i while the other thread blocks wait
(see lines 92 to 159 in dgetf2_native_kernel.cu).
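
The kernel itself is not quoted in this thread. As a rough illustration only, the sketch below emulates in sequential C one common way a thread block can find the largest absolute value cooperatively, namely a shared-memory-style tree reduction. The block size SKETCH_THREADS, the function name block_argmax_abs, and the choice of a tree reduction are all assumptions, not a description of what dgetf2_native_kernel.cu actually does.

```c
#include <math.h>
#include <stddef.h>

#define SKETCH_THREADS 8   /* hypothetical block size; must be a power of two */

/* Sequential emulation of a block-level argmax reduction over |x[i]|.
   Each emulated "thread" t first scans a strided slice of x, then partial
   results are merged pairwise in log2(SKETCH_THREADS) rounds. Ties go to the
   smaller index, which matches the first-maximum convention of IDAMAX.
   Assumes n >= 1. */
static size_t block_argmax_abs(const double *x, size_t n)
{
    double val[SKETCH_THREADS];   /* stands in for shared memory */
    size_t idx[SKETCH_THREADS];

    for (size_t t = 0; t < SKETCH_THREADS; ++t) {
        val[t] = -1.0;            /* smaller than any absolute value */
        idx[t] = n;               /* sentinel, larger than any valid index */
        for (size_t i = t; i < n; i += SKETCH_THREADS) {
            if (fabs(x[i]) > val[t]) {
                val[t] = fabs(x[i]);
                idx[t] = i;
            }
        }
    }

    /* Tree reduction: on a GPU each round would end with a block barrier. */
    for (size_t stride = SKETCH_THREADS / 2; stride > 0; stride /= 2) {
        for (size_t t = 0; t < stride; ++t) {
            size_t u = t + stride;
            if (val[u] > val[t] || (val[u] == val[t] && idx[u] < idx[t])) {
                val[t] = val[u];
                idx[t] = idx[u];
            }
        }
    }
    return idx[0];
}
```

As in the CPU case, the index returned for column i would still need the offset i added if the search starts at the diagonal entry.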

Hope this clarification helps.

Best regards,
Stan
> --
> You received this message because you are subscribed to the Google Groups "MAGMA User" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to magma-user+...@icl.utk.edu.
> To view this discussion on the web visit https://groups.google.com/a/icl.utk.edu/d/msgid/magma-user/CAKHt_YZrsEsUir-KTrkntu9B4KP9LOTtVbH_VmnFjyxjnEUDWw%40mail.gmail.com.

aran nokan

Dec 27, 2020, 12:54:23 PM
to Stanimire Tomov, MAGMA User
Thank you for your explanation.
I have some further questions. I could not find dgetf2_native_kernel.cu. Let me describe what I have understood about the native LU so far.

1) dgetrf_gpu.cpp is the main file for the LU factorization, and the blocked native version is at lines 234-248 (version 2.5.4).

2) magma_dgetrf_recpanel_native is the function that performs the LU there. To see inside it, we look at the file:

3) dgetrf_panel_native.cpp, which looks like Toledo's recursive LU algorithm in LAPACK (splitting into two parts). Finally, if "nb<=recpnb" (I do not know what recpnb means here), we reach:

4) the magma_dgetf2_native function.



Ahmad Abdelfattah

Dec 28, 2020, 10:38:03 AM
to aran nokan, Stanimire Tomov, MAGMA User
Hi, 

Please find some answers below.

On Dec 27, 2020, at 12:54 PM, aran nokan <noka...@gmail.com> wrote:

> Thank you for your explanation.
> I have some further questions. I could not find dgetf2_native_kernel.cu. Let me describe what I have understood about the native LU so far.


The dgetf2_native_kernel.cu file is under the magmablas/ directory, not src/. 

> 1) dgetrf_gpu.cpp is the main file for the LU factorization, and the blocked native version is at lines 234-248 (version 2.5.4).

Not true. The segment you point to performs the panel factorization on the GPU, not the entire factorization. In fact, the routine magma_dgetrf_gpu_expert is a combined implementation of both the hybrid and the native factorization. It contains multiple if statements like this:

if (mode == MagmaHybrid) {
    /* hybrid portion of the code */
}
else {
    /* native portion of the code */
}

Look for these statements to see the differences between the two modes.

> 2) magma_dgetrf_recpanel_native is the function that performs the LU there. To see inside it, we look at the file:

The code in this file is a recursive panel factorization, not the entire factorization. In theory it could be used to perform the whole factorization, but it is not meant for that purpose.


> 3) dgetrf_panel_native.cpp, which looks like Toledo's recursive LU algorithm in LAPACK (splitting into two parts). Finally, if "nb<=recpnb" (I do not know what recpnb means here), we reach:

This is another level of recursion or blocking. There are two routines in this file: magma_zgetf2_native_blocked and magma_zgetf2_native_recursive. The recursive version is faster but does not support arbitrarily large panels. The blocked version is used instead when the panel height is very large (e.g. larger than ~23k for double precision).
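
To make the recursive idea concrete, here is a small CPU-only C sketch of a recursive panel LU with partial pivoting that falls back to an unblocked loop once the column count drops to a cutoff. The function panel_lu and the parameter cutoff are hypothetical; cutoff is meant to play the role that recpnb seems to play in the question, which is an assumption. The MAGMA routines run on the GPU and are organized differently.

```c
#include <math.h>
#include <stddef.h>

#define AT(A, lda, i, j) ((A)[(i) + (size_t)(j) * (lda)])   /* column-major */

/* Swap rows r1 and r2 in columns [c0, c1). */
static void swap_rows(double *A, size_t lda, size_t r1, size_t r2,
                      size_t c0, size_t c1)
{
    if (r1 == r2) return;
    for (size_t j = c0; j < c1; ++j) {
        double t = AT(A, lda, r1, j);
        AT(A, lda, r1, j) = AT(A, lda, r2, j);
        AT(A, lda, r2, j) = t;
    }
}

/* In-place LU of columns [j0, j0+nb) of an m-row panel (requires j0+nb <= m).
   ipiv[j] receives the 0-based pivot row of column j. On return, all row
   swaps of this column range have been applied to every column in the range. */
static void panel_lu(double *A, size_t lda, size_t m, size_t j0, size_t nb,
                     size_t *ipiv, size_t cutoff)
{
    if (nb <= cutoff || nb == 1) {
        /* Base case: unblocked, right-looking, one column at a time. */
        for (size_t j = j0; j < j0 + nb; ++j) {
            size_t p = j;
            for (size_t i = j + 1; i < m; ++i)
                if (fabs(AT(A, lda, i, j)) > fabs(AT(A, lda, p, j))) p = i;
            ipiv[j] = p;
            swap_rows(A, lda, j, p, j0, j0 + nb);
            if (AT(A, lda, j, j) == 0.0) continue;   /* zero column: nothing to eliminate */
            for (size_t i = j + 1; i < m; ++i) {
                AT(A, lda, i, j) /= AT(A, lda, j, j);
                for (size_t k = j + 1; k < j0 + nb; ++k)
                    AT(A, lda, i, k) -= AT(A, lda, i, j) * AT(A, lda, j, k);
            }
        }
        return;
    }

    size_t n1 = nb / 2, n2 = nb - n1, j1 = j0 + n1;

    panel_lu(A, lda, m, j0, n1, ipiv, cutoff);             /* factor left half */
    for (size_t j = j0; j < j1; ++j)                       /* its swaps -> right half */
        swap_rows(A, lda, j, ipiv[j], j1, j1 + n2);

    for (size_t k = j1; k < j1 + n2; ++k) {
        /* Triangular solve with the unit lower factor of the left half. */
        for (size_t j = j0; j < j1; ++j)
            for (size_t i = j + 1; i < j1; ++i)
                AT(A, lda, i, k) -= AT(A, lda, i, j) * AT(A, lda, j, k);
        /* Schur complement update of the rows below the left block. */
        for (size_t j = j0; j < j1; ++j)
            for (size_t i = j1; i < m; ++i)
                AT(A, lda, i, k) -= AT(A, lda, i, j) * AT(A, lda, j, k);
    }

    panel_lu(A, lda, m, j1, n2, ipiv, cutoff);             /* factor right half */
    for (size_t j = j1; j < j1 + n2; ++j)                  /* its swaps -> left half */
        swap_rows(A, lda, j, ipiv[j], j0, j1);
}
```

Calling panel_lu(A, lda, m, 0, n, ipiv, 1) recurses all the way down to single columns, while a cutoff of n or more uses only the unblocked loop. Both should produce the same factorization up to rounding.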

> 4) the magma_dgetf2_native function.

This is a fused kernel that performs the entire panel factorization step. You can find it under the magmablas/ directory in dgetf2_native_kernel.cu.


Ahmad





