sgetrf_panel_native & Second Level Panel Size


aran nokan

Sep 25, 2021, 7:10:53 PM
to MAGMA User
Hi,

I see that the V100 and A100 have 80 or more SMs, so I tried increasing the second-level panel size (the recpnb variable in sgetrf_panel_native.cpp), hoping for better performance. It seems I have misunderstood something, though, because it actually runs slower.

Why is the value of recpnb 32? (I remember that in a paper Ahmad mentioned it is because most GPUs have this number of SMs.)

Why does a larger value not improve performance? I think there are free SMs: the GPU does nothing else during the panel factorization, so the other 48 SMs should be idle.

Best regards,
Aran

Ahmad Abdelfattah

Sep 25, 2021, 11:50:28 PM
to aran nokan, MAGMA User
The value of ‘recnb’ is a tuning parameter, and setting it to 32 happened to give the best overall performance. Generally speaking, it could change depending on the GPU. 

If the panel is “too wide”, it is usually better to utilize level-3 BLAS by means of recursive panel factorization. That is, you split the panel recursively so that you use TRSM and GEMM in the trailing part of the panel. The term “too wide” does not correspond to a fixed value, but in the LU case, it looks like it is better to use the recursive factorization for panels wider than 32.
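
To make the recursive splitting concrete, here is a minimal pure-Python sketch of a recursive LU panel factorization with partial pivoting. It is not MAGMA's implementation: the function name recursive_panel_lu and its arguments are hypothetical, the parameter nb only plays the role of the width threshold discussed above, and the TRSM and GEMM steps are written as plain loops where a GPU library would call level-3 BLAS. It assumes the panel has full column rank, so no pivot is zero.

```python
def recursive_panel_lu(A, piv, j0=0, n=None, nb=32):
    """In-place LU of the panel columns j0..j0+n-1 of A (a list of rows).

    Hypothetical sketch: on return, A holds the unit-lower factor L below
    the diagonal and U on and above it; piv[j] is the row swapped with row j.
    """
    m = len(A)
    if n is None:
        n = len(A[0]) - j0

    if n <= nb:
        # Narrow panel: plain column-by-column (level-2 style) factorization.
        for j in range(j0, j0 + n):
            p = max(range(j, m), key=lambda i: abs(A[i][j]))
            piv[j] = p
            A[j], A[p] = A[p], A[j]          # swap whole rows of the panel
            for i in range(j + 1, m):
                A[i][j] /= A[j][j]           # assumes a nonzero pivot
                for k in range(j + 1, j0 + n):
                    A[i][k] -= A[i][j] * A[j][k]
        return

    # Wide panel: split the columns into a left and a right half.
    n1 = n // 2
    n2 = n - n1
    right = range(j0 + n1, j0 + n)

    # 1) Factor the left half recursively (row swaps already hit the right half).
    recursive_panel_lu(A, piv, j0, n1, nb)

    # 2) TRSM: U12 = L11^{-1} * A12, forward substitution with unit-lower L11.
    for i in range(j0, j0 + n1):
        for k in range(j0, i):
            for c in right:
                A[i][c] -= A[i][k] * A[k][c]

    # 3) GEMM: A22 -= L21 * U12, the update of the trailing part of the panel.
    for i in range(j0 + n1, m):
        for k in range(j0, j0 + n1):
            for c in right:
                A[i][c] -= A[i][k] * A[k][c]

    # 4) Factor the updated right half recursively.
    recursive_panel_lu(A, piv, j0 + n1, n2, nb)
```

With nb at least as large as the panel width, the routine never splits and runs only the column-by-column branch. With a smaller nb, most of the arithmetic moves into steps 2 and 3, which is the part that maps to TRSM and GEMM.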

Ahmad


