half transpose

Aran Nokan

Jun 25, 2022, 1:25:11 PM

to MAGMA User

Hi,

Does MAGMA have a half-precision transpose_inplace, or at least a half-precision transpose?

Regards,

A. N.

Aran Nokan

Jun 25, 2022, 1:50:24 PM

to MAGMA User

I modified the double version by changing the data type, but it seems that it is not working properly.

HA

[
-1.2725 -1.6289 0.7954 -0.5454 1.9704 0.1819 -1.4677 -1.2020 -1.6541 0.4963
-2.3516 0.2910 -0.9487 -1.0889 -2.2561 0.0498 -0.1365 -0.0642 0.4177 1.6740
0.6235 1.1201 -0.4075 0.4719 0.5611 0.7630 -1.1637 -0.7216 1.6673 0.1064
-0.0669 0.5063 0.5088 -2.3516 0.1781 -0.5828 1.0681 0.7148 -0.0632 -0.8739
0.4158 -0.9302 -0.8374 -0.1564 -0.7804 -1.0615 -2.6556 1.6532 -0.3983 -0.5425
-1.5586 0.3037 -1.9443 -0.8463 1.0265 -1.0838 0.4788 1.5802 1.0619 1.3954
0.2228 0.5791 0.2450 0.6272 1.1028 -0.4920 0.0285 -0.4303 -0.7356 -1.0794
0.7769 -1.4453 -1.1787 0.8233 -2.5626 0.1606 1.0928 -0.5802 2.1829 -0.5191
-1.5283 2.2754 0.8354 0.2167 -0.4939 -0.3893 0.0904 0.7436 -0.3631 -0.3429
-0.8076 -0.6445 -2.3047 -0.5370 0.6561 -1.4656 -0.7422 0.4093 0.8884 -0.3030
];

HAT

[
-1.2725 -2.3516 0.6235 -0.0669 1.9704 0.1819 -1.4677 -1.2020 -1.6541 0.4963
-1.6289 0.2910 1.1201 0.5063 -2.2561 0.0498 -0.1365 -0.0642 0.4177 1.6740
0.7954 -0.9487 -0.4075 0.5088 0.5611 0.7630 -1.1637 -0.7216 1.6673 0.1064
-0.5454 -1.0889 0.4719 -2.3516 0.1781 -0.5828 1.0681 0.7148 -0.0632 -0.8739
0. 0. 0. -0.1564 -0.7804 -1.0615 -2.6556 1.6532 -0.3983 -0.5425
0. 0. 0. -0.8463 1.0265 -1.0838 0.4788 1.5802 1.0619 1.3954
0. 0. 0. 0.6272 1.1028 -0.4920 0.0285 -0.4303 -0.7356 -1.0794
0. 0. 0. 0.8233 -2.5626 0.1606 1.0928 -0.5802 2.1829 -0.5191
0. 0. 0. 0.2167 -0.4939 -0.3893 0.0904 0.7436 -0.3631 -0.3429
0. 0. 0. -0.5370 0.6561 -1.4656 -0.7422 0.4093 0.8884 -0.3030
];

Attachment: htranspose_inplace.cu

Stanimire Tomov

Jun 25, 2022, 3:12:17 PM

to Aran Nokan, MAGMA User

Hi Aran,

We don’t have it yet. It will be great if you can contribute it!

I don’t see any problem with the approach. I suppose you modified transpose.cu (rather than the in-place versions).

I would double check the types, leading dimensions, and the printing.

If you try small enough matrices, e.g., smaller than 32x8, there will be no blocking in the code: everything will be done by a single thread block, and each thread will transpose just one element.

You can, for example, make thread (i,j) print HA(i,j) and HA(j,i) from the kernel and, after the transposition, HAT(i,j) and HAT(j,i).

It is interesting that a 4x4 leading block got transposed correctly, and the rest stayed the same (except those zeroes).
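
One way to apply the checks above (types, leading dimensions, element-by-element comparison) is a small host-side reference. The following is only a minimal sketch and not MAGMA code: it assumes column-major storage with a leading dimension lda, uses std::uint16_t as a stand-in for the 16-bit half storage, and the names half_bits, transpose_inplace_ref and count_transpose_mismatches are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <utility>
#include <vector>

// Assumption: raw 16-bit storage stands in for the GPU half type on the host.
using half_bits = std::uint16_t;

// Host reference: in-place transpose of an n x n column-major matrix,
// where element (i,j) is stored at A[i + j*lda].
void transpose_inplace_ref(std::vector<half_bits>& A, std::size_t n, std::size_t lda) {
    assert(lda >= n && A.size() >= lda * n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = j + 1; i < n; ++i)
            std::swap(A[i + j * lda], A[j + i * lda]);
}

// Count positions where got(i,j) != orig(j,i), printing the first few.
std::size_t count_transpose_mismatches(const std::vector<half_bits>& orig,
                                       const std::vector<half_bits>& got,
                                       std::size_t n, std::size_t lda,
                                       std::size_t max_print = 5) {
    assert(lda >= n && orig.size() >= lda * n && got.size() >= lda * n);
    std::size_t bad = 0;
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            const half_bits expected = orig[j + i * lda];  // orig(j,i)
            const half_bits actual   = got[i + j * lda];   // got(i,j)
            if (expected != actual) {
                if (bad < max_print)
                    std::printf("mismatch at (%zu,%zu): expected 0x%04x, got 0x%04x\n",
                                i, j, static_cast<unsigned>(expected),
                                static_cast<unsigned>(actual));
                ++bad;
            }
        }
    }
    return bad;
}
```

After copying HA and HAT back to the host, a call such as count_transpose_mismatches(HA, HAT, n, lda) would list the first (i,j) positions that were not swapped, which can help separate a kernel problem from a leading-dimension or printing problem.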

Thanks,

Stan
