about deal.II with CUDA C programming acceleration


Chih-Che Chueh

Aug 14, 2017, 7:56:45 AM
to dea...@googlegroups.com
Dear deal.II developers and users,

Over the last few months I have spent some spare time learning CUDA C programming. I now know how to use CUDA streams and events to overlap data transfers with kernel execution so that the CPU and GPU work asynchronously, how to use shared memory to achieve coalesced global memory access, how to map threads to matrix elements using either a Cartesian x/y/z or a row/column mapping on the GPU, and how to use shared memory to enhance data reuse. Most importantly, for hands-on practice we have a GPU accelerator (an NVIDIA Tesla K40) that was bought last year. I plan to use CUDA C to work on big data and image identification with artificial intelligence (deep learning) for atmospheric data.

I am writing to ask whether I could get involved in a deal.II project with people working on asynchronous adaptive mesh refinement or other performance improvements in deal.II using CUDA C programming.

Thanks!

Sincerely,

Chih-Che

Bruno Turcksin

Aug 15, 2017, 7:19:51 AM
to deal.II User Group
Chih-Che,

The CUDA support in deal.II is very new; it is only in the development version (https://github.com/dealii/dealii). You can see our current development plan here: https://github.com/dealii/dealii/projects/2. Right now we have support for vectors and partial support for matrix-free operators (a good place to see the capabilities is the test suite: https://github.com/dealii/dealii/tree/master/tests/cuda). I would advise you to wait for this PR, https://github.com/dealii/dealii/pull/4846, to be merged before you try to install deal.II with CUDA; it makes the installation a lot easier.

If you want to work on CUDA, you should work on something that you like / that is of interest to you. If you want to help our existing effort, you can pick something from this list: https://github.com/dealii/dealii/issues/4399. I am working on the first item, but let me know if you find anything else interesting and I can help you implement it.

If you have any questions, please ask. We are looking for people to help us with CUDA.

Best,

Bruno

Chih-Che Chueh

Aug 16, 2017, 4:54:02 AM
to dea...@googlegroups.com
Bruno,

It is nice to hear from you again. I still remember that we met and chatted a little at the 2013 deal.II workshop, which was held at Texas A&M University.

Thank you for letting me get involved in this development. Once I get started, after installing the development version of deal.II on the server I am using, I am sure I will have many questions for you about the CUDA side of deal.II, and I will need your help.

Having taken a quick glance at the code to try to understand the CUDA style you implemented, I have a quick question for you:

When you test the CUDA code, you use cudaMalloc to create an object on the device (GPU) and then copy the results from the device to the host using cudaMemcpy. As far as I know, CUDA now offers a simpler approach using Unified Memory, which provides a single memory space accessible by all GPUs and CPUs in the system, backed by the efficient page migration engine of the recently released NVIDIA Tesla P100. The implementation looks like the following CUDA code (the add kernel definition is included here for completeness).

=================Unified Memory==========================
// Kernel that adds the elements of two arrays
__global__ void add(int n, float *x, float *y)
{
  for (int i = 0; i < n; i++)
    y[i] = x[i] + y[i];
}

  int N = 10000;
  float *x, *y;

  // Allocate Unified Memory – accessible from CPU or GPU
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // Initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Launch the kernel on the GPU
  add<<<1, 1>>>(N, x, y);

  // Wait for the GPU to finish before accessing the data on the host
  cudaDeviceSynchronize();

  // Free memory
  cudaFree(x);
  cudaFree(y);
=======================================================

You might also want to take a look at this post about Unified Memory:

https://devblogs.nvidia.com/parallelforall/cuda-8-features-revealed/

Is there a particular reason why you still use the explicit approach? Is it because your GPU accelerator was launched a few years ago, so you cannot use the new, simpler way?

Thanks!

Sincerely,

Chih-Che



Bruno Turcksin

Aug 16, 2017, 8:53:27 AM
to dea...@googlegroups.com
Chih-Che

2017-08-16 4:53 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> It is nice to meet you again. I still remember we met and chatted a little
> bit in 2013 deal.II workshop, which was held in Texas A&M University.
It has been a while! Nice to meet you again.

> Is there any reason why you still used the previous way? Is this because
> your GPU accelerator was launched some years ago so that you are unable to
> use the new simple way?
The problem with UVM is that it makes it easy to write bad code. I am
working on a code where we use UVM, and it happens quite often that we
run code that was intended for the GPU on the CPU. Because of UVM it
works, but at the cost of performance, and so it is harder to detect
that you made a mistake. Another advantage of copying the data
manually is that you don't hide the costly copy operation. This is
also the philosophy used in Kokkos. So yes, UVM makes writing the code
easier, but you need to be a lot more careful when you write it if
you don't want to perform a bunch of useless copies. Of course, if
you have NVLink it doesn't matter, but that only works on POWER chips,
which most people don't have access to.
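
For reference, the explicit allocate/copy pattern described above looks roughly like the sketch below; it reuses the add kernel from the Unified Memory example earlier in the thread, and the sizes and (omitted) error checks are illustrative rather than taken from the deal.II tests.

  int N = 10000;
  float *x_host = new float[N];
  float *y_host = new float[N];
  for (int i = 0; i < N; i++) {
    x_host[i] = 1.0f;
    y_host[i] = 2.0f;
  }

  // Allocate device memory explicitly
  float *x_dev, *y_dev;
  cudaMalloc(&x_dev, N*sizeof(float));
  cudaMalloc(&y_dev, N*sizeof(float));

  // Copy the input from the host to the device; the cost of the transfer stays visible here
  cudaMemcpy(x_dev, x_host, N*sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpy(y_dev, y_host, N*sizeof(float), cudaMemcpyHostToDevice);

  add<<<1, 1>>>(N, x_dev, y_dev);

  // Copy the result back (this call also waits for the kernel to finish)
  cudaMemcpy(y_host, y_dev, N*sizeof(float), cudaMemcpyDeviceToHost);

  // Free device and host memory
  cudaFree(x_dev);
  cudaFree(y_dev);
  delete[] x_host;
  delete[] y_host;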

Best,

Bruno

Chih-Che Chueh

Aug 17, 2017, 3:27:53 AM
to dea...@googlegroups.com
Bruno,

> Is there any reason why you still used the previous way? Is this because
> your GPU accelerator was launched some years ago so that you are unable to
> use the new simple way?
> The problem with UVM is that it makes it easy to write bad code. I am
> working on a code where we use UVM, and it happens quite often that we
> run code that was intended for the GPU on the CPU. Because of UVM it
> works, but at the cost of performance, and so it is harder to detect
> that you made a mistake. Another advantage of copying the data
> manually is that you don't hide the costly copy operation. This is
> also the philosophy used in Kokkos. So yes, UVM makes writing the code
> easier, but you need to be a lot more careful when you write it if
> you don't want to perform a bunch of useless copies. Of course, if
> you have NVLink it doesn't matter, but that only works on POWER chips,
> which most people don't have access to.

That is a convincing reason to do it that way; I understand now. Thanks!

Let me know once the merge is done so that I can start installing deal.II with CUDA on the server I use.

The CUDA compiler currently installed on the server is the following:
[chueh@icycn06 cuda]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17

Also, I would appreciate detailed instructions on how to install deal.II together with CUDA, because I don't know how to do that.

Thanks!

Sincerely,

Chih-Che
 

Bruno Turcksin

Aug 17, 2017, 8:21:15 PM
to dea...@googlegroups.com
Chih-Che,

2017-08-17 3:27 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> Let me know once your merger is done so that I can start to install the
> deal.II with CUDA in the server I use now.
The merge is done, but in the meantime we broke three CUDA tests
:-(. I think it is still worth trying to install deal.II with CUDA
and just updating later.

> The CUDA compiler I use now in the server is shown in the following:
> [chueh@icycn06 cuda]$ nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2015 NVIDIA Corporation
> Built on Tue_Aug_11_14:27:32_CDT_2015
> Cuda compilation tools, release 7.5, V7.5.17
I don't know if that will work. I think we require CUDA 8.0. Because
the feature is brand new and we are all using CUDA 8.0, I am not sure
that it will work with CUDA 7.5 (nobody has ever tried).

> Also, I would appreciate to see the detailed instruction on how to install
> deal.II together with CUDA, because I don't know how to do that.
Yes, I still need to write the documentation. However, it should be
pretty simple: you need CMake 3.9, and then you just configure with
-DDEAL_II_WITH_CUDA=ON and it should work. Let me know if you have any
problems.
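
For concreteness, a minimal configure-and-build sketch along those lines could look as follows; all paths are placeholders, and CMake >= 3.9 plus the development sources of deal.II are assumed.

  mkdir build && cd build
  cmake -DDEAL_II_WITH_CUDA=ON \
        -DCMAKE_INSTALL_PREFIX=/path/to/dealii-install \
        /path/to/dealii-source
  make -j8 install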

Best,

Bruno

Chih-Che Chueh

Aug 17, 2017, 8:39:43 PM
to dea...@googlegroups.com
Bruno,



> Let me know once your merger is done so that I can start to install the
> deal.II with CUDA in the server I use now.
> The merge is done, but in the meantime we broke three CUDA tests
> :-(. I think it is still worth trying to install deal.II with CUDA
> and just updating later.


I can wait until the tests are fixed. No worries.


 
> The CUDA compiler I use now in the server is shown in the following:
> [chueh@icycn06 cuda]$ nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2015 NVIDIA Corporation
> Built on Tue_Aug_11_14:27:32_CDT_2015
> Cuda compilation tools, release 7.5, V7.5.17
> I don't know if that will work. I think we require CUDA 8.0. Because
> the feature is brand new and we are all using CUDA 8.0, I am not sure
> that it will work with CUDA 7.5 (nobody has ever tried).


OK! I am going to ask our system administrator to install the CUDA 8 compiler on the GPU server before I try to install deal.II with CUDA.


 
> Also, I would appreciate to see the detailed instruction on how to install
> deal.II together with CUDA, because I don't know how to do that.
> Yes, I still need to write the documentation. However, it should be
> pretty simple: you need CMake 3.9, and then you just configure with
> -DDEAL_II_WITH_CUDA=ON and it should work. Let me know if you have any
> problems.


OK! I see.

Sincerely,

Chih-Che 

Bruno Turcksin

Aug 21, 2017, 8:16:27 AM
to dea...@googlegroups.com
Chih-Che

2017-08-17 20:39 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> I can wait for the tests that are done. No worry.
Everything is working now.

Best,

Bruno

Chih-Che Chueh

Aug 21, 2017, 8:21:39 AM
to dea...@googlegroups.com
Hi Bruno,

Thanks for the notification. I am going to install p4est, Trilinos, and deal.II with CUDA, as I also want to run some tutorial programs I wrote before. I will let you know if I run into any problems installing them.

By the way, the CUDA compiler on our server has been updated to CUDA 8.0, as you suggested last week:

[chueh@icycn06 ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61


Sincerely,

Chih-Che



Chih-Che Chueh

Aug 21, 2017, 9:08:20 PM
to dea...@googlegroups.com
Bruno,

I just started installing p4est on our server, which consists of many machines. After following the basic instructions shown here, I get the following error. Can you help me out?

===============Instruction=====================================
If you want to use the script, copy the tarball to a fresh directory together with the p4est-setup.sh script (the one from this link, not one you may have gotten from the p4est webpage). Then call the script as follows:
sh ./p4est-setup.sh p4est-x-y-z.tar.gz /path/to/installation
      
where p4est-x-y-z.tar.gz is the name of the p4est distribution file, and /path/to/installation is a directory into which you want to install p4est.

In both cases, you can configure deal.II using a line like

cmake -DP4EST_DIR=/path/to/installation -DDEAL_II_WITH_P4EST=ON -DDEAL_II_WITH_MPI=ON <...>
      
if the p4est library isn't picked up automatically.
==============================================================




[chueh@icycn06 p4est]$ sh ./p4est-setup.sh p4est-1.1.tar.gz /work2/chueh/deal.II_project/p4est
CFLAGS_FAST: -O2
CFLAGS_DEBUG: -O0 -g

This script tries to unpack, configure and build the p4est library.
Build FAST: /work2/chueh/deal.II_project/p4est/p4est-build/FAST
Build DEBUG: /work2/chueh/deal.II_project/p4est/p4est-build/DEBUG
Install FAST: /work2/chueh/deal.II_project/p4est/FAST
Install DEBUG: /work2/chueh/deal.II_project/p4est/DEBUG
Checking environment: CFLAGS P4EST_CFLAGS_FAST P4EST_CFLAGS_DEBUG
Unpack directory: /work2/chueh/deal.II_project/p4est/p4est-1.1
Unpacking... done
See output in files .../config.output and .../make.output

Build FAST version in /work2/chueh/deal.II_project/p4est/p4est-build/FAST
configure: WARNING: Static source code checker splint not found
pgc++-Error-Unknown switch: -print-prog-name=ld
configure: WARNING: Static source code checker splint not found
pgc++-Error-Unknown switch: -print-prog-name=ld
/tmp/pgccUZWckQrbzfNI.s: Assembler messages:
/tmp/pgccUZWckQrbzfNI.s:1605: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm3'
/tmp/pgccUZWckQrbzfNI.s:1623: Error: no such instruction: `vinserti128 $1,%xmm1,%ymm1,%ymm1'
/tmp/pgccUZWckQrbzfNI.s:1626: Error: suffix or operands invalid for `vpsubd'
/tmp/pgccUZWckQrbzfNI.s:1631: Error: suffix or operands invalid for `vpaddd'
/tmp/pgccUZWckQrbzfNI.s:1717: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm3'
/tmp/pgccUZWckQrbzfNI.s:1742: Error: no such instruction: `vinserti128 $1,%xmm1,%ymm1,%ymm1'
/tmp/pgccUZWckQrbzfNI.s:1745: Error: suffix or operands invalid for `vpsubd'
/tmp/pgccUZWckQrbzfNI.s:1749: Error: suffix or operands invalid for `vpaddd'
/tmp/pgccUZWckQrbzfNI.s:1790: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm3'
/tmp/pgccUZWckQrbzfNI.s:1806: Error: no such instruction: `vinserti128 $1,%xmm1,%ymm1,%ymm1'
/tmp/pgccUZWckQrbzfNI.s:1810: Error: suffix or operands invalid for `vpsubd'
/tmp/pgccUZWckQrbzfNI.s:1815: Error: suffix or operands invalid for `vpaddd'
/tmp/pgccUZWckQrbzfNI.s:1878: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm3'
/tmp/pgccUZWckQrbzfNI.s:1903: Error: no such instruction: `vinserti128 $1,%xmm1,%ymm1,%ymm1'
/tmp/pgccUZWckQrbzfNI.s:1906: Error: suffix or operands invalid for `vpsubd'
/tmp/pgccUZWckQrbzfNI.s:1910: Error: suffix or operands invalid for `vpaddd'
make: *** [src/src_libsc_la-sc_bspline.lo] Error 1
make: *** Waiting for unfinished jobs....
Error: Error in make sc


Chih-Che



Bruno Turcksin

Aug 21, 2017, 9:20:07 PM
to dea...@googlegroups.com
Chih-Che,

2017-08-21 21:08 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> I just start to install p4est first in our server consisting of many
> machines. Then, after following the basic in instruction shown here, I get
> the following error. Can you help me out?
Sure.

> pgc++-Error-Unknown switch: -print-prog-name=ld
Don't use PGI :-) We aren't done with the PGI port. You will also get
into trouble when you try to compile Boost; look at this thread:
https://github.com/dealii/dealii/issues/4363. You will need to patch
Boost to get PGI to work, or install it with Spack (that works too, I
have tried). If you want to use PGI, you cannot use the script to
install p4est; you will need to do it by hand (or maybe use Spack, I
don't know whether that works). I am using PGI on one of my codes and
it's incredibly slow at compiling C++. PGI is great for Fortran, not so
much for C++, but it's definitely improving!

Best,

Bruno

Chih-Che Chueh

Aug 23, 2017, 3:05:15 AM
to dea...@googlegroups.com
Bruno,



> pgc++-Error-Unknown switch: -print-prog-name=ld
> Don't use PGI :-) We aren't done with the PGI port.


I unloaded PGI from my Linux environment, loaded the necessary Intel C compiler modules instead, and tried the online script again. It seems to work well, apart from some warnings.



[chueh@icy p4est]$ module load intel/comp-15.0.0
[chueh@icy p4est]$ module load intel/mkl-11.1.2
[chueh@icy p4est]$ module load openmpi-1.6.5/intel-15.0
[chueh@icy p4est]$ sh ./p4est-setup.sh p4est-1.1.tar.gz /work2/chueh/deal.II_project/p4est
CFLAGS_FAST: -O2
CFLAGS_DEBUG: -O0 -g

This script tries to unpack, configure and build the p4est library.
Build FAST: /work2/chueh/deal.II_project/p4est/p4est-build/FAST
Build DEBUG: /work2/chueh/deal.II_project/p4est/p4est-build/DEBUG
Install FAST: /work2/chueh/deal.II_project/p4est/FAST
Install DEBUG: /work2/chueh/deal.II_project/p4est/DEBUG
Checking environment: CFLAGS P4EST_CFLAGS_FAST P4EST_CFLAGS_DEBUG
Unpack directory: /work2/chueh/deal.II_project/p4est/p4est-1.1
Source directory found (remove it to unpack anew)
See output in files .../config.output and .../make.output

Build FAST version in /work2/chueh/deal.II_project/p4est/p4est-build/FAST
configure: WARNING: Static source code checker splint not found
configure: WARNING: Static source code checker splint not found
FAST version installed in /work2/chueh/deal.II_project/p4est/FAST

Build DEBUG version in /work2/chueh/deal.II_project/p4est/p4est-build/DEBUG
configure: WARNING: Static source code checker splint not found
configure: WARNING: Static source code checker splint not found
DEBUG version installed in /work2/chueh/deal.II_project/p4est/DEBUG

Can you tell me whether p4est has been installed successfully?

Sincerely,

Chih-Che

Daniel Arndt

Aug 23, 2017, 5:21:45 AM
to deal.II User Group
Chih-Che,

[...]
Build FAST version in /work2/chueh/deal.II_project/p4est/p4est-build/FAST
configure: WARNING: Static source code checker splint not found
configure: WARNING: Static source code checker splint not found
FAST version installed in /work2/chueh/deal.II_project/p4est/FAST

Build DEBUG version in /work2/chueh/deal.II_project/p4est/p4est-build/DEBUG
configure: WARNING: Static source code checker splint not found
configure: WARNING: Static source code checker splint not found
DEBUG version installed in /work2/chueh/deal.II_project/p4est/DEBUG

Can you tell me whether p4est has been installed successfully?
Yes, this looks good. The next step is to tell deal.II via
  cmake -DP4EST_DIR=/work2/chueh/deal.II_project/p4est/ .
to use P4EST and check if it works.
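
Putting the pieces from this thread together, a full configure invocation for a deal.II build with p4est, MPI, and CUDA might look roughly like the line below; the p4est path is the one used above, the remaining paths are placeholders, and additional flags may be needed on a particular system.

  cmake -DP4EST_DIR=/work2/chueh/deal.II_project/p4est \
        -DDEAL_II_WITH_P4EST=ON \
        -DDEAL_II_WITH_MPI=ON \
        -DDEAL_II_WITH_CUDA=ON \
        -DCMAKE_INSTALL_PREFIX=/path/to/dealii-install \
        /path/to/dealii-source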

Best,
Daniel

Chih-Che Chueh

Aug 23, 2017, 5:45:04 AM
to dea...@googlegroups.com
Hi Daniel,

Thanks for confirming the successful p4est installation. I have not seen you since the 2013 deal.II workshop; it was nice to meet you and share a dinner table with you there.

Hi Daniel, Bruno, or anyone else,

I have already moved on to the Trilinos installation, but after building for a while I get the error you can see below. Is this because I am using an old version of the Intel Math Kernel Library (MKL)? If not, does anyone know how to fix it?

    cmake                                            \
    -DTrilinos_ENABLE_Amesos=ON                      \
    -DTrilinos_ENABLE_Epetra=ON                      \
    -DTrilinos_ENABLE_Ifpack=ON                      \
    -DTrilinos_ENABLE_AztecOO=ON                     \
    -DTrilinos_ENABLE_Sacado=ON                      \
    -DTrilinos_ENABLE_Teuchos=ON                     \
    -DTrilinos_ENABLE_MueLu=ON                       \
    -DTrilinos_ENABLE_ML=ON                          \
    -DTrilinos_ENABLE_EXPLICIT_INSTANTIATION=ON      \
    -DTrilinos_VERBOSE_CONFIGURE=OFF                 \
    -DBLAS_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
    -DBLAS_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64 \
    -DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
    -DLAPACK_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64 \
    -DTPL_ENABLE_MPI=ON                              \
    -DBUILD_SHARED_LIBS=ON                           \
    -DCMAKE_VERBOSE_MAKEFILE=OFF                     \
    -DCMAKE_BUILD_TYPE=RELEASE                       \
    -DCMAKE_INSTALL_PREFIX:PATH=/work2/chueh/deal.II_project/trilinos \
    ../

    make install



==============ERROR========
[...]
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_lapack_dlanv2'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_lapack_sla_gercond'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_trs_dtrnlsp_init'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_spblas_lp64_mkl_sbsrsv'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_blas_dsyr2k'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_lapack_zlantp'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_spblas_lp64_mkl_zdiamm'
/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.so: undefined reference to `mkl_lapack_dgelsd'
make[2]: *** [packages/amesos2/src/SuiteSparse/KLU2/Source/Amesos2_klu2_simple.exe] Error 1
make[1]: *** [packages/amesos2/src/SuiteSparse/KLU2/Source/CMakeFiles/Amesos2_klu2_simple.dir/all] Error 2
make: *** [all] Error 2

Sincerely,

Chih-Che


Bruno Turcksin

Aug 23, 2017, 9:17:54 AM
to dea...@googlegroups.com
Chih-Che,

2017-08-23 5:45 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> I have already moved to the trilinos installation. But after installing for
> a while, I get the error that you can see below. Is this because I am using
> an old version of Intel MKL Math Kernel library? If not, does any one know
> how to fix it?
First you used PGI, and now you want to use MKL; I see that you like challenges ;-)

> -DBLAS_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
>
> -DBLAS_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64
> \
> -DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
>
> -DLAPACK_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64
> \
Try using this for the library names
-DBLAS_LIBRARY_NAMES:STRING="mkl_core;mkl_sequential"
-DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64"

Best,

Bruno

Chih-Che Chueh

Aug 23, 2017, 8:09:59 PM
to dea...@googlegroups.com
Hi Bruno


> I have already moved to the trilinos installation. But after installing for
> a while, I get the error that you can see below. Is this because I am using
> an old version of Intel MKL Math Kernel library? If not, does any one know
> how to fix it?
> First you used PGI, and now you want to use MKL; I see that you like challenges ;-)


Believe me! I don't really like this kind of challenge. I really just want to fix the problem and get to work on CUDA programming. ;-)


 
>     -DBLAS_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
>
> -DBLAS_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64
> \
>     -DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64" \
>
> -DLAPACK_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64
> \
> Try using this for the library names
> -DBLAS_LIBRARY_NAMES:STRING="mkl_core;mkl_sequential"
> -DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64"


That still does not work; I get the same error. Is this just because it can't find the BLAS and LAPACK libraries?

By the way, before installing Trilinos, I load several modules:

[chueh@icy]$ module load intel/comp-15.0.0
[chueh@icy]$ module load intel/mkl-11.1.2
[chueh@icy]$ module load openmpi-1.6.5/intel-15.0
[chueh@icy]$ module load cmake/3.9.1

Can you help me see whether there are any other modules I have to load before the Trilinos installation for it to work successfully? Thanks!

[chueh@icy ~]$ module av

----------------------------------------------------------------------------------------------------------------------- /work2/opt/Modules/modulefiles -----------------------------------------------------------------------------------------------------------------------
cmake/3.9.1                    hdf5/gcc-4.4.6_1.8.14          intel/comp-15.0                module-git                     ncl_ncarg/6.3.0                netcdf/hdf5_pgi-15.1_4.3.3.1   pgi/15.10/x86_64
dot                            hdf5/intel-15.0_1.10.0         intel/comp-15.0.0              module-info                    netcdf/hdf5_gnu_4.3.3.1        null                           pgi_mpi/15.10
grads/2.0.2                    hdf5/intel-15.0_1.8.14         intel/mkl-11.1                 modules                        netcdf/hdf5_intel-15.0_4.3.3.1 opengrads/2.0.2.oga.2          python/3.5.3
grads/2.1.a3                   hdf5/pgi-15.1_1.8.14           intel/mkl-11.1.2               ncl_ncarg/6.1.2                netcdf/hdf5_intel-15.0_4.4.1.1 openmpi-1.6.5/intel-15.0       use.own

Best,

Chih-Che
 

Bruno Turcksin

Aug 23, 2017, 8:49:38 PM
to dea...@googlegroups.com
Chih-Che,

2017-08-23 20:09 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:
> That's still not working with getting the same error. Is this just because
> it can't find the BLAS and LAPACK libraies?
Did you reconfigure Trilinos in an empty directory? If not, do it; you
never know what has been cached by CMake...

The problem with MKL is that instead of having one or two libraries,
they have split BLAS and LAPACK into a bunch of small libraries and
you need to find which libraries you need to include. You can use this
website to help you:
https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
Look at the libraries that are needed at linking time and add them to
BLAS_LIBRARY_NAMES or LAPACK_LIBRARY_NAMES. It shouldn't matter which
one.
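
As an illustration only (the exact list depends on the MKL version and on what the advisor reports), a sequential LP64 link line typically involves mkl_intel_lp64, mkl_sequential, and mkl_core, so the corresponding configure fragment might look like:

    -DBLAS_LIBRARY_NAMES:STRING="mkl_intel_lp64;mkl_sequential;mkl_core" \
    -DBLAS_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64 \
    -DLAPACK_LIBRARY_NAMES:STRING="mkl_intel_lp64;mkl_sequential;mkl_core" \
    -DLAPACK_LIBRARY_DIRS:STRING=/work2/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64 \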

> By the way, before I installed the trilinos, I module load many things:
>
> [chueh@icy]$ module load intel/comp-15.0.0
> [chueh@icy]$ module load intel/mkl-11.1.2
> [chueh@icy]$ module load openmpi-1.6.5/intel-15.0
Maybe PGI was a better idea :-D There are two problems with Intel 15:
1) We don't test deal.II with Intel 15, so I am not sure whether it
will work, but I guess it should.
2) The big problem is how you are going to use CUDA. I didn't know
that nvcc supported icc, but apparently it kind of does, although it
looks pretty bad: https://devtalk.nvidia.com/default/topic/947888/intel-icc-compiler-and-c-11/?offset=4
For your information, deal.II now requires C++11, so that won't work.
Don't you have gcc 5 on your machine?

> [chueh@icy ~]$ module av
>
> -----------------------------------------------------------------------------------------------------------------------
> /work2/opt/Modules/modulefiles
> -----------------------------------------------------------------------------------------------------------------------
> cmake/3.9.1 hdf5/gcc-4.4.6_1.8.14
> intel/comp-15.0 module-git
> ncl_ncarg/6.3.0 netcdf/hdf5_pgi-15.1_4.3.3.1
> pgi/15.10/x86_64
> dot hdf5/intel-15.0_1.10.0
> intel/comp-15.0.0 module-info
> netcdf/hdf5_gnu_4.3.3.1 null pgi_mpi/15.10
> grads/2.0.2 hdf5/intel-15.0_1.8.14 intel/mkl-11.1
> modules netcdf/hdf5_intel-15.0_4.3.3.1
> opengrads/2.0.2.oga.2 python/3.5.3
> grads/2.1.a3 hdf5/pgi-15.1_1.8.14
> intel/mkl-11.1.2 ncl_ncarg/6.1.2
> netcdf/hdf5_intel-15.0_4.4.1.1 openmpi-1.6.5/intel-15.0 use.own
Are these all the modules that are available to you? :-( Ideally, you
would have something like openmpi/gcc5 (gcc 4.8 or 4.9 would also work,
but not 6 or later, because those are not compatible with CUDA 8). Now
your problem is that pgi-15 is probably too old to be useful for C++
code and intel-15 does not work well with CUDA. Personally, I would
compile my own gcc 5.4. It's actually pretty easy because you only need
C and C++; you should _not_ compile support for the other languages.
Then you load openmpi-1.6.5/intel-15.0 and use this
https://www.open-mpi.org/faq/?category=mpi-apps#override-wrappers-after-v1.0
to change the underlying compiler from Intel to gcc 5. What I am
wondering is: where is your CUDA module? How do you call nvcc? If you
want to use gcc 5 as the underlying compiler, you will also need to do this:
https://stackoverflow.com/questions/8317510/default-host-compiler-used-by-nvcc-for-linux.
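
A rough sketch of those steps, with placeholder paths and versions (an outline under assumptions, not something tested on this system), could be:

  # Build gcc 5.4 with only C and C++ support (see the GCC install documentation for prerequisites)
  ../gcc-5.4.0/configure --prefix=$HOME/opt/gcc-5.4 --enable-languages=c,c++ --disable-multilib
  make -j8 && make install

  # Point the Open MPI compiler wrappers at the new compiler (see the Open MPI FAQ linked above)
  export OMPI_CC=$HOME/opt/gcc-5.4/bin/gcc
  export OMPI_CXX=$HOME/opt/gcc-5.4/bin/g++

  # Tell nvcc to use the new g++ as its host compiler when compiling CUDA sources
  nvcc -ccbin $HOME/opt/gcc-5.4/bin/g++ source.cu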

Best,

Bruno

Bruno Turcksin

Sep 15, 2017, 2:15:36 PM
to dea...@googlegroups.com
Chih-Che,

Did you get everything to work, or did you give up? You are the first non-developer trying to install deal.II with CUDA, so I am really interested to know which problems you are encountering.

Best,

Bruno

Chih-Che Chueh

Sep 15, 2017, 7:43:03 PM
to dea...@googlegroups.com
Hi Bruno,

Thanks for the reminder and the help. Indeed, our system administrator is having trouble installing a newer version of the GNU C compiler. Could you provide detailed instructions on how to install it? Thanks!

The system administrator told me that he tried to install a newer version of the GNU C compiler on its own, but failed. He said that people usually reinstall a newer version of the Ubuntu Linux system, which comes with a newer GNU C compiler, rather than installing the compiler alone. However, because we have other important software installed and in use on our system, I don't know when he will update the entire Linux system. So I guess installing a newer GNU C compiler on its own is the only feasible route right now, but we don't know how to get it working.

Sincerely,

Chih-Che

luksh...@gmail.com

Sep 16, 2017, 5:15:37 AM
to dea...@googlegroups.com
On Saturday, September 16, 2017 07:42 AM, Chih-Che Chueh wrote:
> HI Bruno,
>
> Thanks for the reminder and helps. Indeed, our system administrator has
> trouble with installing the newer version of GNU C compiler. Would you
> provide detailed instructions on how to installing the GNU C compiler?
> Thanks!
>
> The system administrator told me that he was trying to install the newer
> version of GNU C compiler alone, but failed. He told me that people
> usually reinstall the newer version of linux UBUNTU system with the
> newer version of GNU C compiler instead of installing the newer version
> of GNU C compiler alone. However, because we have other important
> softwares installed/used in our system, I don't know when he is going to
> update the entire linux system. So, I guess that the only way to get the
> newer version of GNU C compiler is feasible right now. But we don't know
> how to get it working.

FWIW, there's a PPA where you can get the newer gcc that can co-exist
with the default ones.

https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/test
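
On Ubuntu, that PPA is typically used along the following lines (the exact package names, here gcc-5/g++-5, are an assumption and depend on the release):

  sudo add-apt-repository ppa:ubuntu-toolchain-r/test
  sudo apt-get update
  sudo apt-get install gcc-5 g++-5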

See if it works for you.
-- st

Bruno Turcksin

Sep 16, 2017, 3:24:02 PM
to dea...@googlegroups.com
Chih-Che,

2017-09-15 19:42 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:

> Thanks for the reminder and the help. Indeed, our system administrator is having trouble installing a newer version of the GNU C compiler. Could you provide detailed instructions on how to install it? Thanks!
>
> The system administrator told me that he tried to install a newer version of the GNU C compiler on its own, but failed. He said that people usually reinstall a newer version of the Ubuntu Linux system, which comes with a newer GNU C compiler, rather than installing the compiler alone. However, because we have other important software installed and in use on our system, I don't know when he will update the entire Linux system. So I guess installing a newer GNU C compiler on its own is the only feasible route right now, but we don't know how to get it working.

You definitely don't want to update your OS. I am also using Ubuntu and I have several versions of gcc and clang installed. Lately, I have been using Spack to install everything for me (https://github.com/LLNL/spack); this is becoming the standard way to install new programs on clusters at the DOE labs. The last time I installed gcc myself was about a year ago, and I just followed the instructions here: https://gcc.gnu.org/install/configure.html. If your admin wants to shoot me an email with the errors he is getting, I will be happy to help.

Best,

Bruno

Denis Davydov

Sep 16, 2017, 6:30:51 PM
to deal.II User Group
On Saturday, September 16, 2017 at 9:24:02 PM UTC+2, Bruno Turcksin wrote:
Chih-Che,

2017-09-15 19:42 GMT-04:00 Chih-Che Chueh <chu...@gmail.com>:

You definitely don't want to update your OS. I am also using ubuntu and I have several versions of gcc and clang installed. Lately, I have been using spack to install everything for me (https://github.com/LLNL/spack). This is becoming the standard way to install new programs on clusters at the DOE labs. 

@Chih-Che: we have a quick set-up guide for deal.II in Spack at https://github.com/dealii/dealii/wiki/deal.II-in-Spack,
including some examples of how to use it on a cluster with OpenMPI + GCC provided externally (to Spack).
But I have never tried a CUDA build, as I don't have access to such a machine.

@Bruno: I don't have any experience with CUDA, but do we need anything special besides
$ spack install cuda
to use it?

I guess one way is to use the CUDA compiler wrappers directly. It also looks like OpenMPI can be compiled with CUDA support (then maybe the MPI wrappers can be used directly?).
If you can roughly tell me how this should work, I can try to blindly extend the deal.II package in Spack for CUDA and wait for you to test it.

P.S. I guess a whole lot of dependencies like Trilinos, PETSc, and SLEPc would need to be turned off, though...

Regards,
Denis.


Bruno Turcksin

Sep 16, 2017, 7:00:45 PM
to dea...@googlegroups.com
Denis,

2017-09-16 18:30 GMT-04:00 Denis Davydov <davy...@gmail.com>:
> @Bruno: I don't have any experience with CUDA, but do we need anything special besides
> $ spack install cuda
> to use it?

I wouldn't use Spack to install CUDA, because there is nothing to compile. You need binaries, and you need drivers that depend on your kernel. If you try to do something a little bit too exotic, you get into trouble when you upgrade your kernel or your drivers (I have been there...).

> I guess one way is to use the CUDA compiler wrappers directly. It also looks like OpenMPI can be compiled with CUDA support (then maybe the MPI wrappers can be used directly?).

CUDA support in OpenMPI allows you to bypass the CPU when you want to send an MPI message from one GPU to another GPU. I don't think it changes the wrapper.

> If you can roughly tell me how this should work, I can try to blindly extend the deal.II package in Spack for CUDA and wait for you to test it.

I'll try to do it myself. I'll let you know if I hit a road block.

> P.S. I guess a whole lot of dependencies like Trilinos, PETSc, and SLEPc would need to be turned off, though...

I don't think so. Basically, nvcc is a pre-processor that sends all the code to gcc. If there is no CUDA code, you basically have a regular gcc compiler.

Best,

Bruno

Denis Davydov

Sep 17, 2017, 1:22:51 AM
to dea...@googlegroups.com
On 17 Sep 2017, at 01:00, Bruno Turcksin <bruno.t...@gmail.com> wrote:

Denis,

2017-09-16 18:30 GMT-04:00 Denis Davydov <davy...@gmail.com>:
> @Bruno: I don't have any experience with CUDA, but do we need anything special besides
> $ spack install cuda
> to use it?
>
> I wouldn't use Spack to install CUDA, because there is nothing to compile. You need binaries, and you need drivers that depend on your kernel. If you try to do something a little bit too exotic, you get into trouble when you upgrade your kernel or your drivers (I have been there...).

I thought you could also get the binaries (nvcc) and the rest from the CUDA installer, i.e.
https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
but of course one can also set it as an externally provided package.

> I guess one way is to use the CUDA compiler wrappers directly. It also looks like OpenMPI can be compiled with CUDA support (then maybe the MPI wrappers can be used directly?).
>
> CUDA support in OpenMPI allows you to bypass the CPU when you want to send an MPI message from one GPU to another GPU. I don't think it changes the wrapper.

OK, thanks for the explanation.

 
> If you can roughly tell me how this should work, I can try to blindly extend the deal.II package in Spack for CUDA and wait for you to test it.
>
> I'll try to do it myself. I'll let you know if I hit a road block.

Great!

If you hit some issues using/building with CUDA, maybe create an issue/question on either;
I am sure one of the sysadmins who use Spack to manage an HPC cluster must be playing with CUDA as well.

Cheers,
Denis.

Chih-Che Chueh

Sep 18, 2017, 7:54:17 AM
to dea...@googlegroups.com
Hi Bruno,


 
> You definitely don't want to update your OS. I am also using Ubuntu and I have several versions of gcc and clang installed. Lately, I have been using Spack to install everything for me (https://github.com/LLNL/spack); this is becoming the standard way to install new programs on clusters at the DOE labs. The last time I installed gcc myself was about a year ago, and I just followed the instructions here: https://gcc.gnu.org/install/configure.html. If your admin wants to shoot me an email with the errors he is getting, I will be happy to help.


Thanks for your kind help.

I have conveyed what you said above to our system administrator, who will surely follow the instructions you provided. If he has any trouble with this and needs your help, he will tell me (his English is not as good as mine) and then I will let you know.

Sincerely,

Chih-Che

 

Bruno Turcksin

Sep 18, 2017, 8:31:41 AM
to dea...@googlegroups.com
2017-09-17 1:22 GMT-04:00 Denis Davydov <davy...@gmail.com>:

> I thought you could also get the binaries (nvcc) and the rest from the CUDA installer, i.e. https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
> but of course one can also set it as an externally provided package.

Sure, you can, but the big advantage of using a package is that when you update your drivers or your kernel, you need everything to stay synchronized; when you use their package, they take care of that.

Bruno

Denis Davydov

Sep 18, 2017, 9:03:18 AM
to dea...@googlegroups.com




Sent from iPhone, please excuse my brevity.
On 18 Sep 2017, at 15:31, Bruno Turcksin <bruno.t...@gmail.com> wrote:

So it looks like all you need is:

1. Tell Spack where to find CUDA (externally provided):

packages:
  cuda:
    version: [8.0]
    paths:
      cuda@8.0: /cuda_prefix/
    buildable: False

2. Add to the dealii package a variant(cuda, ...) and

depends_on('cuda@8:', when='+cuda')

Regards,
Denis

Denis Davydov

Sep 20, 2017, 5:45:53 AM
to deal.II User Group
deal.II in Spack is being extended for CUDA: https://github.com/LLNL/spack/pull/5402
It builds OK on an Ubuntu PC with gcc 5.4.0.

Chih-Che, if you still have issues, you might want to try this out.

Regards,
Denis.

Chih-Che Chueh

Sep 20, 2017, 10:22:34 PM
to dea...@googlegroups.com
Hi Bruno,

Our system administrator is busy working on something else, so he is unable to update the GNU compiler on our system following what you advised. But he is able to give you an account so that you can access our system. I am therefore wondering whether you would be willing to install the newer version of the GNU compiler on our system for us and then help us write down simple, clear instructions on how to install it. Afterwards, we would know exactly how to update the GNU compiler ourselves.

P.S. If you are willing to do that for us, I will send you the account details through your personal email.

Thank you.

Hi Denis,

Of course, if we have any issue, we will try what you suggest. Thanks!

Sincerely,

Chih-Che


Wolfgang Bangerth

Sep 20, 2017, 11:21:10 PM
to dea...@googlegroups.com
On 09/20/2017 08:22 PM, Chih-Che Chueh wrote:
>
> Our system administrator is busy working on something else so that he is
> unable to update GNU C in our system through what you advised. But he is able
> to give you an account so that you can have access to our system. As a result,
> I am wondering if you are willing to install the newer version of GNU C in our
> system for us and then help us write down a simple, clear instruction on how
> to install it in our system. Afterward, we can know exactly how to update the
> newer version of GNU C.

Chih-Che,
I think you are expecting too much of Bruno. He has a day job as well, for
which he is paid. I think you will need to figure out the installation on your
system yourself -- otherwise, Bruno (or any other among us) will be asked to
do so for everyone else on the mailing list who wants to install CUDA.

I'm going to add that installing a compiler will really only be the first
step. After that, of course, comes the task of understanding how to program
these graphics cards, and all of the other things that require interacting
with the appropriate software levels. You will need to master all of these
things at one point or other. You might as well learn how to install operating
systems and compilers.

Best
WB


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Chih-Che Chueh

Sep 21, 2017, 12:33:21 AM
to dea...@googlegroups.com
Hi Wolfgang,


> I think you are expecting too much of Bruno. He has a day job as well, for which he is paid. I think you will need to figure out the installation on your system yourself -- otherwise, Bruno (or any other among us) will be asked to do so for everyone else on the mailing list who wants to install CUDA.


OK! I understand this now. 

Hi Bruno, Please ignore this request. Sorry about that.


 
> I'm going to add that installing a compiler will really only be the first step. After that, of course, comes the task of understanding how to program these graphics cards, and all of the other things that require interacting with the appropriate software levels. You will need to master all of these things at one point or other. You might as well learn how to install operating systems and compilers.


Frankly speaking, all we need is a simple, clear set of instructions on installing a newer version of the GNU compiler, like the one deal.II provides for p4est (http://www.dealii.org/developer/external-libs/p4est.html), so that deal.II users can follow it easily and quickly. Is it possible for anyone in deal.II to provide this kind of information?


Sincerely,

Chih-Che 


 





 

Denis Davydov

Sep 21, 2017, 1:02:42 AM
to dea...@googlegroups.com
Hi Chih-Che

On 21 Sep 2017, at 6:33, Chih-Che Chueh <chu...@gmail.com> wrote:

Compiling a compiler is usually much more complicated than compiling p4est. You can have a look at what Spack does in this Python class: https://github.com/LLNL/spack/blob/develop/var/spack/repos/builtin/packages/gcc/package.py
But frankly, I don't think we need those instructions. In the majority of cases, users already have a working compiler. If not, you can use Spack to compile GCC and then the whole of deal.II; high-level instructions are here: https://github.com/dealii/dealii/wiki/deal.II-in-Spack (a rough command sequence is sketched below).
Note that you do NOT need admin rights to use it; it's perfectly fine to do everything in your home folder.
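
A possible command sequence with Spack, sketched under the assumption that the +cuda variant from the Spack pull request mentioned earlier in this thread is available (exact spec names may differ), would be:

  spack install gcc@5.4.0
  spack compiler find $(spack location -i gcc@5.4.0)
  spack install dealii+cuda %gcc@5.4.0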

It was also mentioned above that there are newer versions of gcc available as Linux packages (for that you would need admin rights).

Regards,
Denis

Wolfgang Bangerth

Sep 21, 2017, 10:39:31 AM
to dea...@googlegroups.com

> I'm going to add that installing a compiler will really only be the first
> step. After that, of course, comes the task of understanding how to
> program these graphics cards, and all of the other things that require
> interacting with the appropriate software levels. You will need to master
> all of these things at one point or other. You might as well learn how to
> install operating systems and compilers.
>
>
>
> Frankly speaking, all we need is write a simple, clear instruction on the
> updating the newer version of GNU C like you in deal.II write this for p4est
> (http://www.dealii.org/developer/external-libs/p4est.html) so that deal.II
> users would follow easily and quickly. Is it possible for any one in deal.II
> to have this kind of information?
That, too, is asking for too much. It's not something the average deal.II user
ever needs to do, and it's something that we will never be able to document as
well as the GCC folks. Please read their instructions -- they have very good
installation instructions.

Best
W.

Chih-Che Chueh

Sep 21, 2017, 7:02:39 PM
to dea...@googlegroups.com


 
    I'm going to add that installing a compiler will really only be the first
    step. After that, of course, comes the task of understanding how to
    program these graphics cards, and all of the other things that require
    interacting with the appropriate software levels. You will need to master
    all of these things at one point or other. You might as well learn how to
    install operating systems and compilers.



> Frankly speaking, all we need is a simple, clear set of instructions on installing a newer version of the GNU compiler, like the one deal.II provides for p4est (http://www.dealii.org/developer/external-libs/p4est.html), so that deal.II users can follow it easily and quickly. Is it possible for anyone in deal.II to provide this kind of information?
> That, too, is asking for too much. It's not something the average deal.II user ever needs to do, and it's something that we will never be able to document as well as the GCC folks. Please read their instructions -- they have very good installation instructions.


OK! I see. I will solve the problem I face now.

Thanks for the answer!

Chih-Che

 

Denis Davydov

Sep 25, 2017, 5:09:07 AM
to deal.II User Group
Hi Chih-Che,

FYI, deal.II in Spack was extended to optionally build with CUDA and the tests pass, see

Regards,
Denis.