AMGCL's solvers tests - Preconditioner options mapped to coding options

C B

Jul 8, 2021, 11:12:48 AM

to amgcl

Hi Denis,

I am trying to mimic these two options:

1) amgcl::relaxation::as_preconditioner<PrecBackend, amgcl::relaxation::spai0>
2) amgcl::preconditioner::dummy<PrecBackend>

I want to reproduce them with the solver example. However, when I try these options:

-1 -p precond.type=spai0

-1 -p precond=spai0

-1 -p precond=dummy

They all produce the same residual and number of iterations, so I suppose they are all interpreted the same. I was thinking that if the matrix does not have a unit diagonal, spai0 should give different results than dummy, but I am not sure...

Thanks!

Denis Demidov

Jul 8, 2021, 12:25:48 PM

to amgcl

Hi Carl,

1) should be -1 -p precond.type=spai0 (or -p precond.class=relaxation precond.type=spai0)

2) should be -p precond.class=dummy

Denis Demidov

Jul 9, 2021, 1:13:06 AM

to amgcl

---------- Forwarded message ---------
From: C B <cebau...@gmail.com>
Date: Fri, Jul 9, 2021 at 2:03 AM
Subject: Re: AMGCL's solvers tests - Preconditioner options mapped to coding options
To: Denis Demidov <dennis....@gmail.com>

Hi Denis,

Thank you so much for this information, I am so fortunate to get your help all the time :).

I look into other libraries to find out if there is anything out there that can compete with AMGCL,

and I must say that from what I have seen, for general sparse systems on regular workstations, AMGCL is way better than anything else that I have tried.

So kudos to you. I am baffled that people continue to use other libraries that have much lower performance than AMGCL.

With that said, and because I am interested in just raw speed with the simplest solver to reduce the residual just by 0.1 or 0.05,

I tested ViennaCL and I got these values:

With my limited testing I could not get ViennaCL to do any AMG, etc. My first impression is that it is not robust, but perhaps that is because I did not spend enough time on it.

But when it comes to the simplest case that I want to solve,

I see that on my old laptop the performances are comparable,

but then on the newer GPU the wall time reduction is much more pronounced with ViennaCL,

and I think that I finally understood what you told me before about the 1st/2nd run,

because I see that for the second solve within the same running process, with the same number of iterations and the same residual,

the measured time decreases considerably. I guess it is because the compilation is already done, the library is initialized, etc.
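The first-run versus second-run effect described here can be measured by timing the same call several times inside one process. The following is a minimal sketch using only the C++ standard library; `time_repeated` is a hypothetical helper name, and `work` stands for whatever solve call is being benchmarked.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Calls `work` `repeats` times and returns the wall time of each call
// in seconds, so one-time costs show up as a slower first entry.
template <class F>
std::vector<double> time_repeated(F &&work, std::size_t repeats) {
    std::vector<double> seconds;
    seconds.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

When timing GPU code this way, `work` should not return before the device has finished (for example, it can end by reading the result back to the host); otherwise the measurement may cover only the time needed to queue the kernels.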

I was wondering if you know what is going on, and why ViennaCL gets so much better performance on this AMD card.

Of course this affects me because of my particular use case, I bet this is not an issue when AMG is used, but I just wanted to know your take on this.

Regards,

--
You received this message because you are subscribed to the Google Groups "amgcl" group.
To unsubscribe from this group and stop receiving emails from it, send an email to amgcl+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/amgcl/413c2913-f604-4d19-9073-ade965ad6cf5n%40googlegroups.com.

--

Cheers,
Denis

Denis Demidov

Jul 9, 2021, 1:25:11 AM

to amgcl

Carl,

Since you are not mentioning the preconditioner, I am assuming you are testing unpreconditioned solvers here. The ViennaCL manual says they use pipelined versions of the solvers when no preconditioner is used:

http://viennacl.sourceforge.net/doc/manual-algorithms.html

Also, the dummy preconditioner in amgcl is not completely cost-free: it models the identity matrix, and its cost is equivalent to a single vector copy per application. It is mostly there to test the convergence of the unpreconditioned solver. Maybe the unpreconditioned version of pipelined BiCGStab from ViennaCL simply works best for your problem.
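As an illustration of the point about the dummy preconditioner, the sketch below shows an identity preconditioner whose `apply` is a plain vector copy, next to a diagonal one, both plugged into a simple preconditioned Richardson iteration for a diagonal system. This is not amgcl code: `IdentityPrecond`, `DiagonalPrecond`, and `richardson_diag` are hypothetical names, and Richardson iteration is used only because it is short, not because it is what the solvers in this thread do.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Identity ("dummy") preconditioner: x = rhs, one vector copy per application.
struct IdentityPrecond {
    void apply(const std::vector<double> &rhs, std::vector<double> &x) const {
        x = rhs;
    }
};

// Diagonal preconditioner: x_i = m_i * rhs_i.
struct DiagonalPrecond {
    std::vector<double> m;
    void apply(const std::vector<double> &rhs, std::vector<double> &x) const {
        x.resize(rhs.size());
        for (std::size_t i = 0; i < rhs.size(); ++i) x[i] = m[i] * rhs[i];
    }
};

// Preconditioned Richardson iteration x += P(b - A x) for A = diag(d).
// Returns the number of iterations needed to get max|residual| below tol.
template <class Precond>
int richardson_diag(const std::vector<double> &d, const std::vector<double> &b,
                    const Precond &P, double tol, int max_iter) {
    std::vector<double> x(b.size(), 0.0), r(b.size()), z;
    for (int it = 0; it < max_iter; ++it) {
        double rmax = 0.0;
        for (std::size_t i = 0; i < b.size(); ++i) {
            r[i] = b[i] - d[i] * x[i];
            rmax = std::max(rmax, std::abs(r[i]));
        }
        if (rmax < tol) return it;
        P.apply(r, z); // the only place the preconditioner is used
        for (std::size_t i = 0; i < b.size(); ++i) x[i] += z[i];
    }
    return max_iter;
}
```

With `d = {0.5, 1.5}` and `b = {1, 1}`, the identity variant needs many more iterations than `DiagonalPrecond{{2.0, 2.0 / 3.0}}`, which is the exact inverse here and converges after a single update. The identity variant still pays for the copy inside `apply` on every iteration.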

Finally, there is the ViennaCL backend implemented in amgcl:

https://amgcl.readthedocs.io/en/latest/components/backends.html#viennacl-backend

Below are some comparisons using the CUDA, VexCL, and ViennaCL backends.

$ solver_cuda -n 128
NVIDIA GeForce GTX 1050 Ti

Solver
======
Type:             BiCGStab
Unknowns:         2097152
Memory footprint: 112.00 M

Preconditioner
==============
Number of levels:    4
Operator complexity: 1.62
Grid complexity:     1.13
Memory footprint:    567.74 M

level     unknowns       nonzeros      memory
---------------------------------------------
    0      2097152       14581760    422.42 M (61.61%)
    1       263552        7918340    127.63 M (33.46%)
    2        16128        1114704     15.07 M ( 4.71%)
    3          789          53055      2.62 M ( 0.22%)

Iterations: 10
Error:      2.50965e-09

[Profile:       2.141 s] (100.00%)
[ self:         0.255 s] ( 11.90%)
[  assembling:  0.158 s] (  7.36%)
[  setup:       1.168 s] ( 54.56%)
[  solve:       0.561 s] ( 26.19%)

$ solver_vexcl_cl -n 128
1. NVIDIA GeForce GTX 1050 Ti (NVIDIA CUDA)

Solver
======
Type:             BiCGStab
Unknowns:         2097152
Memory footprint: 112.00 M

Preconditioner
==============
Number of levels:    4
Operator complexity: 1.62
Grid complexity:     1.13
Memory footprint:    744.74 M

level     unknowns       nonzeros      memory
---------------------------------------------
    0      2097152       14581760    553.22 M (61.61%)
    1       263552        7918340    168.88 M (33.46%)
    2        16128        1114704     20.01 M ( 4.71%)
    3          789          53055      2.62 M ( 0.22%)

Iterations: 10
Error:      2.50965e-09

[Profile:       2.198 s] (100.00%)
[ self:         0.097 s] (  4.42%)
[  assembling:  0.119 s] (  5.41%)
[  setup:       1.421 s] ( 64.62%)
[  solve:       0.562 s] ( 25.54%)

$ solver_viennacl -n 128
NVIDIA GeForce GTX 1050 Ti (NVIDIA Corporation)

Solver
======
Type:             BiCGStab
Unknowns:         2097152
Memory footprint: 0.00 B

Preconditioner
==============
Number of levels:    4
Operator complexity: 1.62
Grid complexity:     1.13
Memory footprint:    2.60 M

level     unknowns       nonzeros      memory
---------------------------------------------
    0      2097152       14581760      0.00 B (61.61%)
    1       263552        7918340      0.00 B (33.46%)
    2        16128        1114704      0.00 B ( 4.71%)
    3          789          53055      2.60 M ( 0.22%)

Iterations: 10
Error:      2.50965e-09

[Profile:       2.420 s] (100.00%)
[ self:         0.102 s] (  4.21%)
[  assembling:  0.122 s] (  5.04%)
[  setup:       1.321 s] ( 54.60%)
[  solve:       0.875 s] ( 36.15%)

C B

Jul 9, 2021, 5:40:18 AM

to Denis Demidov, amgcl

Denis,

Thank you so much for your insights.

Yes, as you guessed, I am comparing the unpreconditioned solvers, and now I understand what you meant when you mentioned pipelined solvers a few days ago :).

The pipelined approach seems to make a big difference for the atypical requirements that I am dealing with.

Thank you again for your help, it is very much appreciated.

Cheers,
