cblas_ssbmv miscomputes


Selin Yıldırım

May 4, 2022, 10:41:52 AM
to OpenBLAS-users
I need to make a minor change to the implementation of the OpenBLAS cblas_ssbmv routine so that I can use it in my own program. To do that, I tracked down sbmv_k.c, the kernel invoked by cblas_ssbmv. It calls two other routines, axpyu_k and dotu_k. I can trace their parameters through the loop, but I cannot figure out exactly what they do. According to the documentation they are simply a dot product and y = a*x + y, but sbmv_k.c passes additional parameters that I do not recognize.

For symmetric matrices with bandwidth k=2 (sizes 5x5 and 10x10, X all ones and Y initially zero), and without any modification to the library code, the OpenBLAS cblas_ssbmv routine miscomputes some elements of the resulting Y.

I would like to learn what I am doing wrong. Let's start with which arguments need to be passed to cblas_ssbmv: whose order is the row/column order, only that of the array A? Then I need to understand the execution steps of axpyu_k and dotu_k.

Thanks in advance.

Selin Yıldırım

May 4, 2022, 10:54:57 AM
to OpenBLAS-users
My platform is Linux 18.04 with AMD Opteron processors. To clarify, I am well aware of the signature of cblas_ssbmv, as it is available in the header, and I pass the arguments accordingly. It is only the naming and behaviour of the order and upper/lower triangular parameters that I find confusing. When I pass lower and row-major order, the lower section of the kernel code is executed, but the result looks as if the upper triangular (top-right) section of the matrix had been computed. I see this when I modify the kernel for skew-symmetric banded matrices (by changing only one addition to a subtraction).
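A small sketch of why row-major plus lower can look like "upper" may help here. The netlib CBLAS wrapper, as far as I know, handles row-major input by treating the buffer as the column-major storage of the transpose and swapping upper and lower; whether OpenBLAS's interface does exactly the same is an assumption worth checking in its source. The helper names below are hypothetical, and A(i,j) is read from a dense array as A[i*n + j]. For a symmetric matrix the two band buffers are identical, so the swap is harmless. For a skew-symmetric matrix the off-diagonal entries differ in sign, so a kernel modified for the skew case would see the opposite triangle.

```c
/* Hypothetical helper: row-major band storage of the LOWER triangle.
   Row i of the band buffer holds A(i, i-k..i), with the diagonal in
   the last used slot (offset k). Requires lda >= k + 1. */
void pack_rowmajor_lower(int n, int k, const float *A, float *band, int lda)
{
    for (int i = 0; i < n; i++) {
        int j0 = (i - k > 0) ? i - k : 0;
        for (int j = j0; j <= i; j++)
            band[i * lda + (k + j - i)] = A[i * n + j];
    }
}

/* Hypothetical helper: column-major band storage of the UPPER triangle.
   Column j of the band buffer holds A(j-k..j, j), with the diagonal in
   the last used slot (offset k). Requires lda >= k + 1. */
void pack_colmajor_upper(int n, int k, const float *A, float *band, int lda)
{
    for (int j = 0; j < n; j++) {
        int i0 = (j - k > 0) ? j - k : 0;
        for (int i = i0; i <= j; i++)
            band[j * lda + (k + i - j)] = A[i * n + j];
    }
}
```

Packing the same matrix with both helpers and comparing the buffers element by element shows the effect: slot c*lda + (k + r - c) receives A(c,r) from the first helper and A(r,c) from the second, which agree only when A is symmetric.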

On Wednesday, May 4, 2022 at 17:41:52 UTC+3, Selin Yıldırım wrote:

martin-frbg

May 7, 2022, 9:09:21 AM
to OpenBLAS-users
Some OpenBLAS kernels, including AXPY, have dummy parameters that are always set to zero on invocation, probably for historic reasons (parts of the original GotoBLAS may be rooted in even earlier projects, there may have been plans for non-standard options at some time, or it may have been useful on some platforms to have pre-zeroed registers available). It is probably best to look at the plain C implementations in kernel/generic (and, for similarly historic reasons, in kernel/arm as well). You could also change the definitions of the AXPY and DOT kernels for your platform in kernel/x86_64/KERNEL.OPTERON so that they point to these generic C functions, which are probably easier to debug. With an old platform like Opteron, it is also possible that you have hit a bug in the kernel code, e.g. a register not saved or not declared as clobbered on exit, which may only cause problems with modern, more aggressively optimizing compilers.
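To make the roles of the two kernels concrete, here is a minimal plain-C sketch of the lower-triangle case. It is modelled on my reading of the generic kernels and is not a copy of the OpenBLAS source: the function names are hypothetical, the ignored parameters stand in for the always-zero dummies mentioned above, and the scaling of Y by beta is assumed to have been done before this loop runs.

```c
#include <stddef.h>

/* Stand-in for an AXPY kernel: y += alpha * x. The dummy parameters
   are placeholders for the unused arguments and are ignored. */
int axpy_sketch(long n, long dummy0, long dummy1, float alpha,
                const float *x, long incx, float *y, long incy,
                float *dummy2, long dummy3)
{
    (void)dummy0; (void)dummy1; (void)dummy2; (void)dummy3;
    for (long i = 0; i < n; i++)
        y[i * incy] += alpha * x[i * incx];
    return 0;
}

/* Stand-in for a DOT kernel: returns sum of x[i] * y[i]. */
float dot_sketch(long n, const float *x, long incx,
                 const float *y, long incy)
{
    float sum = 0.0f;
    for (long i = 0; i < n; i++)
        sum += x[i * incx] * y[i * incy];
    return sum;
}

/* y += alpha * A * x for a symmetric band matrix whose lower triangle
   is stored column by column: a[0] is the diagonal entry of the current
   column and a[1..length] are the entries below it. */
void sbmv_lower_sketch(long n, long k, float alpha,
                       const float *a, long lda,
                       const float *x, float *y)
{
    for (long i = 0; i < n; i++) {
        long length = k;
        if (n - i - 1 < k) length = n - i - 1;

        /* Stored column i (diagonal plus sub-diagonals), scaled by
           alpha * x[i], is added to y[i .. i+length]. */
        axpy_sketch(length + 1, 0, 0, alpha * x[i],
                    a, 1, y + i, 1, NULL, 0);

        /* The same sub-diagonal entries, read as row i of the mirrored
           upper triangle, contribute to y[i] through a dot product. */
        y[i] += alpha * dot_sketch(length, a + 1, 1, x + i + 1, 1);

        a += lda;   /* advance to the next stored column */
    }
}
```

In this form the AXPY call covers the stored triangle and the DOT call covers its mirror image, so each stored off-diagonal entry is used twice. Negating the DOT term would negate only the mirrored half, which is the change a skew-symmetric variant needs.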

For (C)BLAS functions in general, OpenBLAS is supposed to conform to the behaviour of the reference implementation, whose documentation is at https://www.netlib.org/blas. I believe that in the case of SSBMV, the output comprises the same triangular section as was chosen for the input matrix A.
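For checking results against the reference behaviour, a small element-wise sketch can serve as a baseline. In the reference routine, as I understand it, the upper/lower argument selects which triangle of A is read from band storage, while y is an ordinary full-length vector computed as alpha*A*x + beta*y. The function names below are hypothetical, only column-major storage with unit strides is covered, and the dense input is assumed to be column-major with leading dimension n.

```c
/* Hypothetical helper: copy one triangle of a dense column-major
   symmetric matrix into BLAS band storage (lda >= k + 1). */
void pack_band_colmajor(int upper, int n, int k,
                        const float *dense, float *band, int lda)
{
    for (int j = 0; j < n; j++) {
        if (upper) {
            int i0 = (j - k > 0) ? j - k : 0;
            for (int i = i0; i <= j; i++)
                band[(k + i - j) + j * lda] = dense[i + j * n];
        } else {
            int i1 = (j + k < n - 1) ? j + k : n - 1;
            for (int i = j; i <= i1; i++)
                band[(i - j) + j * lda] = dense[i + j * n];
        }
    }
}

/* Element-wise y := alpha*A*x + beta*y, reading only the chosen
   triangle of the band buffer and mirroring it explicitly. */
void ref_ssbmv_colmajor(int upper, int n, int k, float alpha,
                        const float *band, int lda,
                        const float *x, float beta, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = (beta == 0.0f) ? 0.0f : beta * y[i];

    for (int j = 0; j < n; j++) {
        int diag = upper ? k : 0;            /* row of the diagonal */
        int lo = upper ? ((j - k > 0) ? j - k : 0) : j + 1;
        int hi = upper ? j - 1 : ((j + k < n - 1) ? j + k : n - 1);

        y[j] += alpha * band[diag + j * lda] * x[j];
        for (int i = lo; i <= hi; i++) {
            float aij = band[(diag + i - j) + j * lda];
            y[i] += alpha * aij * x[j];      /* stored entry A(i,j) */
            y[j] += alpha * aij * x[i];      /* mirrored entry A(j,i) */
        }
    }
}
```

Packing the same symmetric matrix once as upper and once as lower should give the same y from this baseline, so a mismatch against the library for only one of the two points to the storage layout or the kernel rather than to the input data.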