On Sep 22, 2022, at 6:23 PM, Nima Sahraneshin <unix...@gmail.com> wrote:

I am testing the peak performance of an RTX 3090, but some of the results look strange to me.

First, DGEMM: NVIDIA states that the double-precision peak for GA102 is 556.0 GFLOPS, but with the MAGMA test I am seeing larger numbers:

% M N K MAGMA Gflop/s (ms) cuBLAS Gflop/s (ms) CPU Gflop/s (ms) MAGMA error cuBLAS error
%========================================================================================================
1024 1024 1024 420.58 ( 5.11) 406.02 ( 5.29) --- ( --- ) 0.00e+00 --- ok
2048 2048 2048 527.70 ( 32.56) 473.61 ( 36.27) --- ( --- ) 1.96e-17 --- ok
3072 3072 3072 599.68 ( 96.69) 529.70 ( 109.46) --- ( --- ) 0.00e+00 --- ok
4096 4096 4096 623.33 ( 220.49) 539.50 ( 254.75) --- ( --- ) 0.00e+00 --- ok
5120 5120 5120 617.11 ( 434.99) 537.57 ( 499.35) --- ( --- ) 0.00e+00 --- ok
6144 6144 6144 621.39 ( 746.49) 539.89 ( 859.17) --- ( --- ) 0.00e+00 --- ok
7168 7168 7168 621.32 (1185.53) 536.08 (1374.04) --- ( --- ) 0.00e+00 --- ok
8192 8192 8192 619.22 (1775.63) 535.94 (2051.54) --- ( --- ) 0.00e+00 --- ok
9216 9216 9216 619.59 (2526.68) 536.21 (2919.61) --- ( --- ) 0.00e+00 --- ok
10240 10240 10240 624.20 (3440.38) 541.41 (3966.48) --- ( --- ) 0.00e+00 --- ok
11264 11264 11264 625.40 (4570.39) 539.11 (5301.91) --- ( --- ) 0.00e+00 --- ok
12288 12288 12288 540.56 (6864.81) 539.22 (6881.84) --- ( --- ) 0.00e+00 --- ok
13312 13312 13312 538.82 (8756.28) 538.86 (8755.55) --- ( --- ) 0.00e+00 --- ok
14336 14336 14336 539.05 (10931.55) 536.36 (10986.52) --- ( --- ) 0.00e+00 --- ok
15360 15360 15360 536.20 (13517.01) 536.16 (13517.81) --- ( --- ) 0.00e+00 --- ok
16384 16384 16384 536.12 (16406.98) 536.05 (16409.07) --- ( --- ) 0.00e+00 --- ok

How can we explain this? Does MAGMA perform GEMM without cuBLAS?
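
For reference, the Gflop/s column in the table can be cross-checked against the reported times. The sketch below assumes the conventional count of 2*M*N*K floating-point operations for a GEMM; the function name gemm_gflops and the constant name are made up for illustration, and the 556.0 figure is the one quoted in the post, not a measured value.

```python
# Sketch: recompute the Gflop/s column from M, N, K and the time in ms.
# Assumption: a GEMM is counted as 2*M*N*K floating-point operations
# (one multiply and one add per inner-product term).

QUOTED_PEAK_GFLOPS = 556.0  # FP64 peak for GA102 as quoted in the post


def gemm_gflops(m, n, k, time_ms):
    """Return the achieved Gflop/s for an m x n x k GEMM that took time_ms."""
    flops = 2.0 * m * n * k
    seconds = time_ms * 1e-3
    return flops / seconds / 1e9


# (N, MAGMA time in ms, cuBLAS time in ms), copied from two rows of the table
rows = [
    (4096, 220.49, 254.75),
    (16384, 16406.98, 16409.07),
]

for n, magma_ms, cublas_ms in rows:
    magma = gemm_gflops(n, n, n, magma_ms)
    cublas = gemm_gflops(n, n, n, cublas_ms)
    print(
        f"N={n}: MAGMA {magma:.2f} Gflop/s "
        f"({magma / QUOTED_PEAK_GFLOPS:.0%} of quoted peak), "
        f"cuBLAS {cublas:.2f} Gflop/s "
        f"({cublas / QUOTED_PEAK_GFLOPS:.0%} of quoted peak)"
    )
```

For the N=4096 row this gives roughly 623.33 Gflop/s for MAGMA, matching the table, so the reported rates are consistent with the timings and the excess over 556.0 is not a reporting artifact.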