Role of CUDAWrappers::MatrixFree::Data and CUDAWrappers::SharedData in CUDAWrappers::MatrixFree::cell_loop

Stephen DeWitt

Feb 6, 2019, 2:49:26 PM

to deal.II User Group

Hello all,

Following up from a previous post about where the GPU capabilities are, I've been going through the CUDA matrix-free matrix-vector product tests as Bruno suggested. For the most part I've been able to make sense of it and have found parallels between what is going on there versus step-48.

The one area that I don't really understand in matrix_free_matrix_vector_01.cu is how information flows into the HelmholtzOperator::operator() method. Inside the vmult method in MatrixFreeTest, a HelmholtzOperator object is created and then passed into the CUDAWrappers::MatrixFree::cell_loop method along with two CUDA vectors. However, when I look at the HelmholtzOperator::operator() definition, it takes CUDAWrappers::MatrixFree::Data * and CUDAWrappers::SharedData * types as inputs. Do the CUDAWrappers::MatrixFree internals take care of building objects of those types in a way that I can just ignore? I thought from the previous discussion that the user had to package what they needed into CUDAWrappers::MatrixFree::Data * and CUDAWrappers::SharedData * objects, but maybe I misunderstood it. Obviously it would be great if this happened in the background.

On a related topic, I can see that the signature of HelmholtzOperator::operator() is notably different from that of SineGordonOperation::local_apply in step-48. Where in the documentation should I be looking for the requirements on the functor passed into CUDAWrappers::MatrixFree::cell_loop? I'm reaching a dead end at the documentation entry for CUDAWrappers::MatrixFree::cell_loop. I can just copy the form from HelmholtzOperator::operator(), but I'm worried that I could get myself into trouble if I don't quite understand it.

Thanks in advance,

Steve DeWitt

Bruno Turcksin

Feb 6, 2019, 4:40:54 PM

to deal.II User Group

Steve,

On Wednesday, February 6, 2019 at 2:49:26 PM UTC-5, Stephen DeWitt wrote:

> The one area that I don't really understand in matrix_free_matrix_vector_01.cu is how information flows into the HelmholtzOperator::operator() method. Inside the vmult method in MatrixFreeTest, a HelmholtzOperator object is created and then passed into the CUDAWrappers::MatrixFree::cell_loop method along with two CUDA vectors. However, when I look at the HelmholtzOperator::operator() definition, it takes CUDAWrappers::MatrixFree::Data * and CUDAWrappers::SharedData * types as inputs. Do the CUDAWrappers::MatrixFree internals take care of building objects of those types in a way that I can just ignore? I thought from the previous discussion that the user had to package what they needed into CUDAWrappers::MatrixFree::Data * and CUDAWrappers::SharedData * objects, but maybe I misunderstood it. Obviously it would be great if this happened in the background.

Basically, the unsigned int cell, CUDAWrappers::MatrixFree::Data *, and CUDAWrappers::SharedData * arguments are produced by MatrixFree and are needed by FEEvaluation. The user passes them along to initialize the FEEvaluation object; this is very similar to what is done on the CPU, where you need to pass a MatrixFree object to initialize the FEEvaluation. The user is not supposed to touch CUDAWrappers::MatrixFree::Data *; everything that you want to pass needs to be stored inside the functor. If you look at the latest version of matrix_vector_mf.h, we store the coefficients inside the functor (re-reading the post I wrote in the other thread, it was definitely not clear that this is what you are supposed to do).
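To make the division of labour concrete, here is a minimal host-only sketch of the pattern, not deal.II code. MockData, MockSharedData, mock_cell_loop, ScaleOperator and run_scale_example are hypothetical names invented for this illustration, and the extra src/dst pointer arguments are an assumption of the mock rather than the real functor signature. The point is only that the "library" builds the internal objects and hands them to the functor, while the user's own data (here a coefficient) lives inside the functor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the library-owned types. These are NOT the
// definitions of CUDAWrappers::MatrixFree::Data or CUDAWrappers::SharedData.
struct MockData
{
  unsigned int n_cells;
  unsigned int dofs_per_cell;
};

struct MockSharedData
{
  double *values; // per-cell scratch space owned by the "library"
};

// User functor: anything the user wants to pass (here a coefficient) is a
// member of the functor. The library-produced arguments are only read or
// forwarded, never constructed by the user.
struct ScaleOperator
{
  double coefficient;

  void operator()(const unsigned int cell,
                  const MockData    *data,
                  MockSharedData    *shared,
                  const double      *src,
                  double            *dst) const
  {
    const std::size_t offset =
      static_cast<std::size_t>(cell) * data->dofs_per_cell;
    // Copy this cell's values into the scratch space ...
    for (unsigned int i = 0; i < data->dofs_per_cell; ++i)
      shared->values[i] = src[offset + i];
    // ... and write the scaled values back to the destination vector.
    for (unsigned int i = 0; i < data->dofs_per_cell; ++i)
      dst[offset + i] = coefficient * shared->values[i];
  }
};

// Mock of a cell loop: it creates the shared scratch object itself and
// calls the functor once per cell (serially here, so it runs on the host).
template <typename Functor>
void mock_cell_loop(const Functor             &func,
                    const MockData            &data,
                    const std::vector<double> &src,
                    std::vector<double>       &dst)
{
  assert(src.size() == dst.size());
  assert(src.size() ==
         static_cast<std::size_t>(data.n_cells) * data.dofs_per_cell);

  std::vector<double> scratch(data.dofs_per_cell);
  MockSharedData      shared{scratch.data()};
  for (unsigned int cell = 0; cell < data.n_cells; ++cell)
    func(cell, &data, &shared, src.data(), dst.data());
}

// Small driver: two cells with two values each, scaled by 3.
std::vector<double> run_scale_example()
{
  const MockData            data{2, 2};
  const std::vector<double> src{1.0, 2.0, 3.0, 4.0};
  std::vector<double>       dst(src.size(), 0.0);
  mock_cell_loop(ScaleOperator{3.0}, data, src, dst);
  return dst;
}
```

Note that the caller of mock_cell_loop never creates a MockSharedData object; it only supplies the functor and the vectors, which mirrors the idea that these arguments are built in the background.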

> On a related topic, I can see that the signature of HelmholtzOperator::operator() is notably different from that of SineGordonOperation::local_apply in step-48. Where in the documentation should I be looking for the requirements on the functor passed into CUDAWrappers::MatrixFree::cell_loop? I'm reaching a dead end at the documentation entry for CUDAWrappers::MatrixFree::cell_loop. I can just copy the form from HelmholtzOperator::operator(), but I'm worried that I could get myself into trouble if I don't quite understand it.

You should copy the signature of HelmholtzOperator::operator() and ask questions when you have a problem. Also, I would advise you to use master not only for your code but also for the documentation. We added the signature of the functors to the documentation a couple of months ago; see here.

In more detail, the differences between the CPU and the GPU operators are:

- (const unsigned int cell, const typename CUDAWrappers::MatrixFree<dim, Number>::Data *gpu_data, CUDAWrappers::SharedData<dim, Number> *shared_data) is equivalent to (const MatrixFree<dim, typename VectorType::value_type> &data).

- Instead of having access to cell_range as in the CPU case, we perform the loop ourselves in apply_quad_point_operations(), so what would have been inside the for loop needs to be in a functor.
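The second difference can be sketched with a host-only mock, again not deal.II code. MockFEEvaluation, MockQuadOperation, cpu_style_local_apply, gpu_style_cell_operation and the two run_* drivers are hypothetical names, and the (evaluator pointer, quadrature index) functor signature used here is an assumption made for the illustration; check the documentation on master for the real requirements. In the CPU-style function the user writes the loops; in the GPU-style version only the loop body is written, as a functor that the evaluator calls.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an evaluator object; NOT deal.II's FEEvaluation.
struct MockFEEvaluation
{
  double      *quad_values; // values at the quadrature points of one cell
  unsigned int n_q_points;

  // The "library" owns the loop and calls the user functor once per point.
  // On a GPU this work would be spread over threads; a serial loop keeps
  // the sketch runnable on the host.
  template <typename Functor>
  void apply_quad_point_operations(const Functor &func)
  {
    for (unsigned int q = 0; q < n_q_points; ++q)
      func(this, q);
  }
};

// GPU style: only what used to be the loop body, packaged as a functor.
struct MockQuadOperation
{
  double coefficient;

  void operator()(MockFEEvaluation *fe_eval, const unsigned int q) const
  {
    fe_eval->quad_values[q] *= coefficient;
  }
};

// CPU style: the user receives a cell range and writes the loops directly.
void cpu_style_local_apply(
  std::vector<double>                         &values,
  const unsigned int                           n_q_points,
  const std::pair<unsigned int, unsigned int> &cell_range,
  const double                                 coefficient)
{
  assert(static_cast<std::size_t>(cell_range.second) * n_q_points <=
         values.size());
  for (unsigned int cell = cell_range.first; cell < cell_range.second; ++cell)
    for (unsigned int q = 0; q < n_q_points; ++q)
      values[static_cast<std::size_t>(cell) * n_q_points + q] *= coefficient;
}

// GPU style: the per-cell operation contains no explicit loop; it hands the
// functor to the evaluator, which does the looping.
void gpu_style_cell_operation(const unsigned int   cell,
                              std::vector<double> &values,
                              const unsigned int   n_q_points,
                              const double         coefficient)
{
  MockFEEvaluation fe_eval{
    values.data() + static_cast<std::size_t>(cell) * n_q_points, n_q_points};
  fe_eval.apply_quad_point_operations(MockQuadOperation{coefficient});
}

// Drivers: two cells with three quadrature points each, scaled by 2.
std::vector<double> run_cpu_style()
{
  std::vector<double> values{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
  cpu_style_local_apply(values, 3, std::make_pair(0u, 2u), 2.0);
  return values;
}

std::vector<double> run_gpu_style()
{
  std::vector<double> values{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
  for (unsigned int cell = 0; cell < 2; ++cell) // stands in for the cell loop
    gpu_style_cell_operation(cell, values, 3, 2.0);
  return values;
}
```

Both drivers produce the same vector; the only change is who owns the loop.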

Best,

Bruno
