Issue with CUDA code


giuseppe orlando

Jul 17, 2023, 9:04:42 AM
to deal.II User Group
Hello everyone,
I'm writing this post to ask for some help with a CUDA code. I have implemented a matrix-based CUDA code for the so-called solid body rotation, namely a time-dependent advection problem. In my first implementation, I encapsulated the computation of the matrix and of the right-hand side of the resulting linear system in the same routine ('assemble_system'). Then I tried to separate the assembly of the matrix ('assemble_matrix') from that of the rhs ('assemble_rhs'). Since the matrix does not change (I am using an explicit time integration scheme and, therefore, the resulting matrix is a mass matrix), I compute it once at the beginning and then only update the right-hand side. Everything works until I save the results, at which point I obtain this error: "On entry to cusparseSpMV() parameter number 1 (handle) had an illegal value: invalid pointer mode".
I also tried to create an auxiliary vector just for saving, but nothing changes.
Everything runs on 1 GPU. Has anyone ever run into this kind of issue? If it helps, I can share the two versions of the code.

Thanks in advance

Best

Giuseppe 

Bruno Turcksin

Jul 17, 2023, 10:11:18 AM
to deal.II User Group
Giuseppe,

It's hard to tell what's wrong without seeing the code.

Best,

Bruno

giuseppe orlando

Jul 17, 2023, 11:06:11 AM
to deal.II User Group
Here you can find the two codes. 'SBR_GPU' is the one that runs to the end, whereas 'SBR_GPU_WRONG' is the one that stops with the error described above once I call 'output_results'. If the code is too long, I could try to reduce it by removing the ParameterHandler and other more 'advanced' features and collapsing everything into a single file, if that would be helpful.

Best,

Giuseppe

SBR_GPU_WRONG.zip
SBR_GPU.zip

Bruno Turcksin

Jul 17, 2023, 11:50:27 AM
to dea...@googlegroups.com
Giuseppe,

The CUDA::Handle needs to live longer than the sparse matrix, but in your code it goes out of scope at the end of assemble_system(). I know the same code works with assemble_matrix, but that only works "by chance" because nothing has written over the memory yet. I don't know if that's the only problem, but it's definitely wrong.
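
Something along these lines should fix it (just a sketch; 'AdvectionProblem' and 'host_system_matrix' are placeholder names for whatever you have in your code). The point is to make the handle a class member so it outlives the device matrix:

    #include <deal.II/base/cuda.h>
    #include <deal.II/lac/cuda_sparse_matrix.h>

    using namespace dealii;

    // Sketch with placeholder names: store the handle next to the matrix so it
    // stays alive for the whole lifetime of the device sparse matrix.
    class AdvectionProblem
    {
      // ...
    private:
      Utilities::CUDA::Handle            cuda_handle;
      CUDAWrappers::SparseMatrix<double> system_matrix;
    };

    void AdvectionProblem::assemble_matrix()
    {
      // ... assemble 'host_system_matrix' (a dealii::SparseMatrix<double>) ...
      // then copy it to the device through the member handle, instead of a
      // local Utilities::CUDA::Handle that dies at the end of this function:
      system_matrix.reinit(cuda_handle, host_system_matrix);
    }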

Best,

Bruno

giuseppe orlando

Jul 17, 2023, 2:12:54 PM
to deal.II User Group
Dear Bruno,
thank you very much. That was the issue with the matrix-based version. 
I'll take the opportunity to ask another question (sorry in advance for bothering you). I also tried to implement a matrix-free version of the same problem, therefore reusing routines such as 'assemble_rhs' and delegating the action of the matrix to suitable operators. The results match those of the matrix-based version for a few time steps, and then the solution blows up. I guess it is again something related to memory access, but I couldn't figure out the issue. In this case it should not be the CUDA::Handle, because I am not using it in the matrix-free framework.

Thanks again for the previous suggestion.

Best,

Giuseppe

SBR_GPU_MF.zip

Bruno Turcksin

Jul 17, 2023, 2:34:18 PM
to dea...@googlegroups.com
Giuseppe,

We completely rewrote the GPU matrix-free framework for the latest release of deal.II (version 9.5). One of the advantages is that you can now write/debug the code on the CPU and then switch to running it on the GPU. The new code is not 100% backward compatible, but the changes are pretty small. I advise you to switch to the latest version of deal.II, debug the code on the CPU, and then run it on the GPU.
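
To give an idea of the size of the changes, the main one (if I remember correctly) is the memory space template argument on the vectors, roughly:

    #include <deal.II/lac/la_parallel_vector.h>

    using namespace dealii;

    // Before (deal.II <= 9.4):
    //   LinearAlgebra::distributed::Vector<double, MemorySpace::CUDA>    solution_dev;
    // After (deal.II 9.5, Kokkos-based): MemorySpace::Default is the device when
    // Kokkos is built with a GPU backend and the host otherwise, which is what
    // lets you debug the very same code on the CPU first.
    LinearAlgebra::distributed::Vector<double, MemorySpace::Default> solution_dev;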

Best,

Bruno

giuseppe orlando

Jul 18, 2023, 11:31:17 AM
to deal.II User Group
Dear Bruno,
thank you very much. I am having some trouble installing the new version with CUDA and MPI using 'nvcc_wrapper', but once these issues are solved (hopefully), I will definitely try the latest version of deal.II.

Best,

Giuseppe

giuseppe orlando

Aug 22, 2023, 1:27:20 PM
to deal.II User Group
Hello everyone,
I'm sorry to revive this thread after more than a month. I installed the latest version of deal.II using the bundled Kokkos package. I wrote a code to simulate an unsteady advection problem with explicit time stepping. Here are my questions/curiosities:

  1. For 'additional_data.mapping_update_flags', I need to specify 'update_gradients' even though the matrix of the resulting linear system is nothing but a mass matrix. If I do not specify 'update_gradients', I get a segmentation fault during 'reinit' of the MatrixFree structure (a minimal sketch of my setup follows after this list). Has anyone observed the same behaviour?
  2. I need periodic boundary conditions for a configuration of interest, but with them the simulation blows up. Does this mean that the matrix-free framework with CUDA can handle only Dirichlet boundary conditions? With the previous version of deal.II I was already able to write a matrix-based GPU version with periodic bcs, and the matrix-based and matrix-free GPU codes with Dirichlet bcs give the same results, so I am wondering whether there is an underlying issue in handling periodic bcs in the matrix-free framework for CUDA.
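
To make point 1 concrete, here is roughly the setup I am using (just a sketch; 'matrix_free', 'dof_handler', etc. are the names in my code):

    // Sketch of my MatrixFree setup for point 1:
    typename CUDAWrappers::MatrixFree<dim, double>::AdditionalData additional_data;
    additional_data.mapping_update_flags = update_values | update_JxW_values |
                                           update_quadrature_points |
                                           update_gradients; // should not be needed for a
                                                             // mass matrix, but 'reinit'
                                                             // segfaults without it
    matrix_free.reinit(mapping, dof_handler, constraints, quadrature, additional_data);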
Thanks in advance

Best regards,
Giuseppe

Bruno Turcksin

Aug 22, 2023, 1:48:48 PM
to dea...@googlegroups.com
Giuseppe,

On Tue, Aug 22, 2023 at 1:27 PM giuseppe orlando <gius...@gmail.com> wrote:
  1. For 'additional_data.mapping_update_flags', I need to specify 'update_gradients' even though the matrix of the resulting linear system is nothing but a mass matrix. If I do not specify 'update_gradients', I get a segmentation fault during 'reinit' of the MatrixFree structure. Has anyone observed the same behaviour?
This should be fixed in master. If it's not, let us know.
  2. I need periodic boundary conditions for a configuration of interest, but with them the simulation blows up. Does this mean that the matrix-free framework with CUDA can handle only Dirichlet boundary conditions? With the previous version of deal.II I was already able to write a matrix-based GPU version with periodic bcs, and the matrix-based and matrix-free GPU codes with Dirichlet bcs give the same results, so I am wondering whether there is an underlying issue in handling periodic bcs in the matrix-free framework for CUDA.
Periodic boundary conditions are not currently supported. I don't know much about periodic boundaries, so I don't know how much work would be required to support them. Maybe someone else knows?

Best,

Bruno