Hi all,
I have a question regarding the GPU implementation. I am not following a matrix-free implementation, but looking at step-64 the documentation gives this example:
unsigned int size = 10;
LinearAlgebra::ReadWriteVector<double> rw_vector(size);
...do something with the rw_vector...
// Move the data to the device:
LinearAlgebra::CUDAWrappers::Vector<double> vector_dev(size);
vector_dev.import(rw_vector, VectorOperation::insert);
...do some computations on the device...
// Move the data back to the host:
rw_vector.import(vector_dev, VectorOperation::insert);
It is not clear to me when I need to use __device__. Does it belong in the function that moves data to and from the device, or in the functions that do the computations on the device after the data has been moved?
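To make my confusion concrete, here is a minimal standalone CUDA sketch of my current understanding (plain CUDA, not deal.II; all names here are my own invention). My guess is that __device__ marks functions that run on the GPU and are called from other GPU code, while the host-side code that moves the data is ordinary C++:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Runs on the GPU and is called from the kernel below:
// this is where I assume __device__ is needed.
__device__ double square(const double x) { return x * x; }

// Kernel launched from the host: marked __global__, not __device__.
__global__ void square_kernel(double *data, const unsigned int size)
{
  const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < size)
    data[i] = square(data[i]);
}

int main()
{
  const unsigned int size = 10;
  double host[size];
  for (unsigned int i = 0; i < size; ++i)
    host[i] = i;

  // Host code that moves data to the device: plain C++, no __device__.
  double *dev = nullptr;
  cudaMalloc(&dev, size * sizeof(double));
  cudaMemcpy(dev, host, size * sizeof(double), cudaMemcpyHostToDevice);

  square_kernel<<<1, size>>>(dev, size);

  // Move the data back to the host, again from plain host code.
  cudaMemcpy(host, dev, size * sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(dev);

  std::printf("host[3] = %g\n", host[3]);
  return 0;
}
```

Is this split between __device__, __global__, and plain host code the right mental model for the deal.II wrappers as well?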
Is there any example of a GPU implementation without matrix-free that shows how the data is moved and managed?
Thank you