Hi Denis,
Thank you so much for this information. I am so fortunate to get your help all the time :).
I looked into other libraries to find out whether anything out there can compete with AMGCL,
and I must say that from what I have seen, for general sparse systems on regular workstations, AMGCL is way better than anything else that I have tried.
So kudos to you. I am baffled that people continue to use other libraries that have much lower performance than AMGCL.
With that said, and because I am interested in just raw speed with the simplest solver to reduce the residual just by 0.1 or 0.05,
I tested ViennaCL and I got these values:
With my limited testing I could not get ViennaCL to do any AMG, etc. My first impression is that it is not robust, but perhaps that is because I did not spend enough time on it.
But when it comes to the simplest case that I want to solve,
I see that on my old laptop the performance is comparable,
but on the newer GPU the wall time reduction is much more pronounced with ViennaCL,
and I think that I finally understood what you told me before about the 1st/2nd run,
because I see that for the second solve within the same process, with the same number of iterations and the same residual,
the measured time decreases considerably. I guess that is because the compilation is already done, the library is initialized, etc.
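
To keep the one-time cost apart from the steady-state solve time, the measurement could be sketched roughly like this. It is only a minimal sketch: `time_runs` and `FakeSolver` are hypothetical names that do not come from AMGCL or ViennaCL, and the 50 ms sleep merely stands in for whatever kernel compilation and library initialization happen on the first call.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Time `runs` consecutive calls of `solve` within the same process and
// return the wall time of each call in seconds.
// Assumption: `solve` blocks until the device has finished its work; with an
// asynchronous GPU backend the callable would have to synchronize itself.
template <class Solve>
std::vector<double> time_runs(Solve&& solve, int runs) {
    std::vector<double> seconds;
    seconds.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        solve();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}

// Hypothetical stand-in for a GPU solve: the first call pays a one-time
// setup cost, later calls skip it.
struct FakeSolver {
    bool initialized = false;

    void operator()() {
        if (!initialized) {
            // Placeholder for one-time work (kernel compilation, library init).
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            initialized = true;
        }
        // The actual solver iterations would run here.
    }
};
```

Calling `time_runs(FakeSolver{}, 3)` gives three timings in which only the first includes the setup cost, so comparing the libraries on the second and later entries would be the fairer test.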
I was wondering if you know what is going on, and why ViennaCL gets so much better performance on this AMD card.
Of course this affects me because of my particular use case. I bet this is not an issue when AMG is used, but I just wanted to know your take on this.
Regards,