Porting deal.II to GPUs

Martin Steigemann

May 15, 2013, 1:48:57 AM

to dea...@googlegroups.com

Dear deal.II users,

are there any ongoing efforts to port deal.II to GPUs via CUDA, or is anyone interested in or already working on this?

A group of students in Canada is working on a small project that ports matrices and vectors from deal.II to CUDA. As communicated by Wolfgang, they sent us their code, so this may be a starting point for integrating some new GPU functionality into deal.II.

I have also just noticed that PETSc has been able to use GPUs since release 3.2. So far, however, it does not work on my machine with the latest CUDA 5.0 and gcc 4.6.2; I get many errors during compilation. Has anyone tried to use GPUs in PETSc?

Best,

Martin

Guido Kanschat

May 16, 2013, 1:21:03 PM

to deal.II user group

Dear Martin,

I am currently teaching a class on GPU computing, which will hopefully result in such functionality for deal.II. Furthermore, Stephan Kramer in Göttingen has done quite a bit of research in this direction.

From my current perspective, I'd say that some more research has to go into this. Writing sparse matrix functions for CUDA will not be the solution. But it is definitely on my list!

Guido

Wolfgang Bangerth

May 16, 2013, 2:19:52 PM

to dea...@googlegroups.com

> From my current perspective, I'd say that some more research has to go
> into this. Writing sparse matrix functions for CUDA will not be the
> solution. But it is definitely on my list!

Just for everyone's information: the Canadian students built upon
ChunkSparseMatrix, which stores a sparse matrix as a collection of dense
tiles. These tiles offer better opportunities for GPUs than completely
sparse matrices. I tend to think that this is a direction worth pursuing
further, but the big question to me is not how to use GPUs for sparse
matrix-vector products, but how to come up with efficient
preconditioners on such systems.
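
To make the tiled-storage idea concrete, here is a minimal CPU-side sketch of a matrix-vector product over dense tiles. It is an illustration only: the `TiledMatrix` struct, its field names, and the free function `vmult` are assumptions made up for this example and are not the actual deal.II ChunkSparseMatrix interface or the students' code. The point it tries to show is that the inner two loops run over a small, contiguous dense block, which is the kind of regular work that maps well to GPU threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration type (NOT the deal.II ChunkSparseMatrix API).
// Only tiles containing at least one nonzero are stored, in a CSR-like
// layout over tile rows. Each tile is a dense T x T block, row-major.
struct TiledMatrix
{
  std::size_t              tile_size;   // T: rows and columns per tile
  std::size_t              n_tile_rows; // number of tile rows
  std::vector<std::size_t> row_start;   // size n_tile_rows + 1
  std::vector<std::size_t> tile_col;    // tile-column index of each tile
  std::vector<double>      values;      // T*T entries per stored tile
};

// Compute y = A * x by looping over stored tiles.
std::vector<double> vmult(const TiledMatrix &A, const std::vector<double> &x)
{
  const std::size_t T = A.tile_size;
  assert(A.row_start.size() == A.n_tile_rows + 1);
  assert(A.values.size() == A.tile_col.size() * T * T);

  std::vector<double> y(A.n_tile_rows * T, 0.0);
  for (std::size_t I = 0; I < A.n_tile_rows; ++I)
    for (std::size_t k = A.row_start[I]; k < A.row_start[I + 1]; ++k)
      {
        const double     *tile = &A.values[k * T * T];
        const std::size_t col0 = A.tile_col[k] * T;
        assert(col0 + T <= x.size());

        // Dense T x T block: regular, contiguous memory access.
        for (std::size_t i = 0; i < T; ++i)
          for (std::size_t j = 0; j < T; ++j)
            y[I * T + i] += tile[i * T + j] * x[col0 + j];
      }
  return y;
}
```

The trade-off in this layout is that tiles may store explicit zeros, so memory use goes up in exchange for regular access patterns. On a GPU one would presumably assign one tile, or one tile row, to a group of threads, but that mapping is not shown here.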

Our most efficient preconditioners build on algebraic and geometric
multigrid methods. For AMG, both PETSc and Trilinos are working on GPU
implementations that we would get for free. For us, using tiled matrices
may be of interest in the context of geometric multigrid.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/

Martin Steigemann

May 20, 2013, 5:15:28 AM

to dea...@googlegroups.com

Dear Guido and Wolfgang,

thanks a lot for your answers.

Writing some wrapper functions around the cuSPARSE and cuBLAS libraries from
CUDA should not be a problem, and the same goes for using the CUDA
functionality in PETSc and Trilinos. The real question, of course, is
performance. As a side note, PETSc does not seem to work with CUDA 5.0 and
gcc 4.6.2; maybe it does with CUDA 4.1. A lot is going on there as well.

Besides an "efficient" implementation, the problem I see is the size of the
matrices we are dealing with. As long as the GPU can work from local memory,
it will be very fast, but if the objects are too large and global memory has
to be used as well, I am not sure there is much of a performance gain over a
conventional implementation. That's why my first idea was to use GPUs in a
matrix-free solver, but the more I think about the details, the more
difficult it gets, especially on a distributed triangulation ...
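
As a rough illustration of what "matrix-free" means here, the sketch below solves a small system with conjugate gradients, where the operator is applied directly from a stencil and no matrix entries are ever stored. Everything in it is an assumption chosen for brevity: the operator is the 1D finite-difference Laplacian tridiag(-1, 2, -1), and the names `laplace_vmult`, `dot`, and `cg_solve` are hypothetical. It runs on the CPU and says nothing about distributed triangulations, which is where the real difficulty mentioned above lies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// y = A * x for A = tridiag(-1, 2, -1), evaluated from the stencil.
// No matrix is stored, so memory use is just the vectors themselves.
std::vector<double> laplace_vmult(const std::vector<double> &x)
{
  const std::size_t   n = x.size();
  std::vector<double> y(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    {
      const double left  = (i > 0) ? x[i - 1] : 0.0;
      const double right = (i + 1 < n) ? x[i + 1] : 0.0;
      y[i] = 2.0 * x[i] - left - right;
    }
  return y;
}

double dot(const std::vector<double> &a, const std::vector<double> &b)
{
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    s += a[i] * b[i];
  return s;
}

// Unpreconditioned conjugate gradients for A x = b, starting from x = 0.
// The only access to A is through laplace_vmult().
std::vector<double> cg_solve(const std::vector<double> &b,
                             const double               tol,
                             const std::size_t          max_iter)
{
  std::vector<double> x(b.size(), 0.0);
  std::vector<double> r = b; // residual b - A*0
  std::vector<double> p = b; // search direction
  double              rr = dot(r, r);

  for (std::size_t it = 0; it < max_iter && std::sqrt(rr) > tol; ++it)
    {
      const std::vector<double> Ap    = laplace_vmult(p);
      const double              alpha = rr / dot(p, Ap);
      for (std::size_t i = 0; i < x.size(); ++i)
        {
          x[i] += alpha * p[i];
          r[i] -= alpha * Ap[i];
        }
      const double rr_new = dot(r, r);
      for (std::size_t i = 0; i < p.size(); ++i)
        p[i] = r[i] + (rr_new / rr) * p[i];
      rr = rr_new;
    }
  return x;
}
```

For example, calling `cg_solve(laplace_vmult(u), 1e-12, 100)` for a known vector `u` should recover `u` up to the tolerance. The solver only needs vectors and the operator action, so device memory would be spent on vectors rather than on matrix entries.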

Guido, if I can help with something, implementing some wrapper functions or
whatever, let me know.

Best,
Martin

-----Original Message-----
From: Wolfgang Bangerth
Sent: Thursday, May 16, 2013 8:19 PM
To: dea...@googlegroups.com
Subject: Re: [deal.II] Porting deal.II to GPUs

