sgetrf_panel_native & Second Level Panel Size


aran nokan

Sep 25, 2021, 7:10:53 PM
to MAGMA User

I see that the V100 and A100 have at least 80 SMs, so I tried changing the second-level panel size (the recpnb variable in sgetrf_panel_native.cpp) in the hope of getting better performance. It seems I have misunderstood something, though, because it actually runs slower.

Why is the value of recpnb 32? (I remember that in a paper Ahmad mentioned it is because most GPUs have this number of SMs.)

Why does the performance not change when I use a larger number? I think there are free SMs: the GPU does nothing else while the panel is being factorized, so the other 48 SMs should be idle.

Best regards,

Ahmad Abdelfattah

Sep 25, 2021, 11:50:28 PM
to aran nokan, MAGMA User
The value of ‘recnb’ is a tuning parameter, and setting it to 32 happened to give the best overall performance. Generally speaking, it could change depending on the GPU. 

If the panel is “too wide”, it is usually better to utilize level-3 BLAS by means of recursive panel factorization. That is, you split the panel recursively so that you can use TRSM and GEMM in the trailing part of the panel. The term “too wide” does not correspond to a fixed value, but in the LU case, it looks like it is better to use the recursive factorization for panels wider than 32.
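
To make the recursive splitting concrete, here is a minimal CPU-only sketch in plain Python. It is not MAGMA's code: the names panel_lu and max_residual are made up for illustration, the threshold argument nb only plays a role analogous to the recursion cutoff discussed above, and the input panel is assumed to have full column rank. A panel wider than nb is split in half; the left half is factorized first, then a TRSM-like triangular solve and a GEMM-like update are applied to the right half before recursing into it.

```python
import random

def panel_lu(A, j0, n, nb, piv):
    """Factor columns j0..j0+n-1 (rows j0..M-1) of A in place, LU with partial pivoting.

    Hypothetical helper for illustration; nb is the width at which recursion stops.
    """
    M = len(A)
    if n <= nb:
        # Base case: unblocked, column-by-column (level-2 style) factorization.
        for j in range(j0, j0 + n):
            p = max(range(j, M), key=lambda i: abs(A[i][j]))
            piv[j] = p
            A[j], A[p] = A[p], A[j]          # swap whole rows
            for i in range(j + 1, M):
                A[i][j] /= A[j][j]
                for k in range(j + 1, j0 + n):
                    A[i][k] -= A[i][j] * A[j][k]
        return
    n1 = n // 2
    panel_lu(A, j0, n1, nb, piv)             # factor the left half
    right = range(j0 + n1, j0 + n)
    # TRSM-like step: A12 <- inv(L11) * A12, with L11 unit lower triangular.
    for k in right:
        for i in range(j0, j0 + n1):
            for t in range(j0, i):
                A[i][k] -= A[i][t] * A[t][k]
    # GEMM-like step: A22 <- A22 - A21 * A12.
    for i in range(j0 + n1, M):
        for k in right:
            s = 0.0
            for t in range(j0, j0 + n1):
                s += A[i][t] * A[t][k]
            A[i][k] -= s
    panel_lu(A, j0 + n1, n - n1, nb, piv)    # factor the updated right half

def max_residual(A0, LU, piv):
    """Largest entry of |P*A0 - L*U| for an M x N panel with M >= N."""
    M, N = len(A0), len(A0[0])
    PA = [row[:] for row in A0]
    for j in range(N):                        # replay the row swaps in order
        PA[j], PA[piv[j]] = PA[piv[j]], PA[j]
    err = 0.0
    for i in range(M):
        for k in range(N):
            s = 0.0
            for t in range(min(i, k) + 1):
                l = 1.0 if t == i else LU[i][t]   # unit diagonal of L
                s += l * LU[t][k]
            err = max(err, abs(PA[i][k] - s))
    return err

if __name__ == "__main__":
    random.seed(0)
    M, N = 12, 8
    A0 = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
    for nb in (1, 2, 8):                      # nb = 8 means no recursion at all
        LU, piv = [row[:] for row in A0], [0] * N
        panel_lu(LU, 0, N, nb, piv)
        print(nb, max_residual(A0, LU, piv))
```

In this sketch every choice of nb produces a valid factorization; only the mix of work changes. A smaller nb moves more of the arithmetic into the TRSM/GEMM-like steps, while a larger nb leaves more of it in the column-by-column base case. Which balance is fastest on a given GPU is what the tuning parameter above decides empirically.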

