Fixes #767
https://github.com/SyneRBI/SIRF-SuperBuild/pull/769
—
@KrisThielemans pushed 1 commit.
—
The GPU image fails, see https://github.com/SyneRBI/SIRF-SuperBuild/runs/8164262792?check_suite_focus=true#step:6:3890:
[ 56%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_back_cuda.cu.o
nvcc fatal : Unsupported gpu architecture 'compute_86+PTX'
Online advice generally suggests that this is a conflict between the toolkit and the driver, or recommends upgrading the toolkit. However, I am using the nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 docker image, which is the one recommended on https://hub.docker.com/r/nvidia/cuda/.
@gschramm, @paskino any ideas?
—
In the CUDA CMake, I have included "magic" that tries to auto-detect the CUDA architecture of the system - see here. Could that be the issue? Is there a way to figure out which CUDA architecture is present on the system where you are building?
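For context, a minimal sketch of what this kind of pre-3.18 detection magic typically looks like, here using CMake's bundled select_compute_arch helper (parallelproj's actual code may well differ):

include(FindCUDA/select_compute_arch)
# tries to compile and run a small detection program; on a machine without a
# GPU this falls back to a "common" architecture list, printing the
# "Automatic GPU detection failed. Building for common architectures." message
cuda_select_nvcc_arch_flags(ARCH_FLAGS Auto)
string(REPLACE ";" " " ARCH_FLAGS_STR "${ARCH_FLAGS}")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} ${ARCH_FLAGS_STR}")

If an entry from that fallback list (such as 8.6+PTX) reaches nvcc unparsed, or names an architecture that the installed toolkit does not know, you get exactly the "Unsupported gpu architecture" failure above.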
—
latest and devel docker builds worked!
Four actions of the normal workflow failed because of TomographicImaging/CIL#1351. We'll have to ignore that here...
—
> In the CUDA CMake, I have included "magic" that tries to auto-detect the CUDA architecture of the system - see here. Could that be the issue?
I have no idea whether this is what causes the problem. Higher up in the log I see
Automatic GPU detection failed. Building for common architectures.
but that is probably irrelevant, as it also occurred in a previous job with Ubuntu 18.04 that worked.
> Is there a way to figure out which CUDA architecture is present on the system where you are building?
This is on GitHub Actions. I have no idea, especially as it all sits inside a docker image. (I don't think there is a GPU behind any of this on GitHub Actions, hence probably the 86?)
Note of course that parallelproj has no problem building in our Ubuntu 18.04 docker image. This problem only occurs in this PR.
After a bit of googling, there seems to be better magic in CMake 3.18. Does somebody want to try this on their system?
I can't, as I have no GPU here. However, the question is whether this is a problem only in that docker image. I suppose you could try to build only parallelproj inside that particular docker image.
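For reference, the "better magic" in CMake 3.18 essentially means relying on the first-class CMAKE_CUDA_ARCHITECTURES variable instead of injecting hand-crafted -gencode flags. A minimal sketch (project name and fallback value are illustrative only):

cmake_minimum_required(VERSION 3.18)
# honour -DCMAKE_CUDA_ARCHITECTURES=... from the command line; otherwise pick
# a conservative fallback before CUDA is enabled (70 is an arbitrary choice)
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 70)
endif()
project(parallelproj_sketch LANGUAGES C CXX CUDA)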
—
OK, I'll try building in the nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 docker image later.
—
@KrisThielemans @paskino: stupid questions from a docker newbie: I pulled nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 and opened a terminal in the container using the docker GUI. In this image there is no cmake, and apt search cmake says it can't find the package. How can I install missing packages inside the image?
—
@KrisThielemans @paskino
For me, building in nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 works.
After pulling the image and starting the container, I had to first "unminimize" and install cmake, though.
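Concretely, that boils down to something like the following inside the container (reconstructed, as the exact invocations were not logged; the focal repositories ship cmake 3.16):

unminimize
apt-get update && apt-get install -y cmake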
root@d99bdcbbfb34:/tmp/parallelproj-1.0.1/build# cmake ..
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- The CUDA compiler identification is NVIDIA 11.7.99
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Automatic GPU detection failed. Building for common architectures.
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/parallelproj-1.0.1/build
root@d99bdcbbfb34:/tmp/parallelproj-1.0.1/build# cmake --build .
Scanning dependencies of target parallelproj_c
[ 5%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_back.c.o
[ 10%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_back_tof_lm.c.o
[ 15%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_back_tof_sino.c.o
[ 21%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_fwd.c.o
[ 26%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_fwd_tof_lm.c.o
[ 31%] Building C object c/CMakeFiles/parallelproj_c.dir/src/joseph3d_fwd_tof_sino.c.o
[ 36%] Building C object c/CMakeFiles/parallelproj_c.dir/src/ray_cube_intersection.c.o
[ 42%] Building C object c/CMakeFiles/parallelproj_c.dir/src/tof_utils.c.o
[ 47%] Linking C shared library libparallelproj_c.so
[ 47%] Built target parallelproj_c
Scanning dependencies of target parallelproj_cuda
[ 52%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_back_cuda.cu.o
[ 57%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_back_tof_lm_cuda.cu.o
[ 63%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_back_tof_sino_cuda.cu.o
[ 68%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_fwd_cuda.cu.o
[ 73%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_fwd_tof_lm_cuda.cu.o
[ 78%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/joseph3d_fwd_tof_sino_cuda.cu.o
[ 84%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/projector_kernels.cu.o
[ 89%] Building CUDA object cuda/CMakeFiles/parallelproj_cuda.dir/src/utils_cuda.cu.o
[ 94%] Linking CUDA device code CMakeFiles/parallelproj_cuda.dir/cmake_device_link.o
[100%] Linking CUDA shared library libparallelproj_cuda.so
[100%] Built target parallelproj_cuda
—
Yes, the docker image they distribute is bare bones. We need to install a lot on top of it.
Did you build this on a system with a GPU? What architecture did it choose?
This goes well beyond what I know. I had a look at the nvcc doc and see that compute_86 is Ampere. Note though that it said compute_86+PTX. I wonder if this is related to the note in the Stack Overflow answer that you used, on PTX on systems without a GPU.
Also, the tricks mentioned there seem to rely on CMake internals, which could break with other versions. I optimistically changed from 3.17 to 3.24 in this PR, which might be the cause (as opposed to the move to Ubuntu 20.04).
Final comment from someone who doesn't know what he's talking about. The nvcc doc says:
> the virtual architecture should always be chosen as low as possible, thereby maximizing the actual GPUs to run on. The real architecture should be chosen as high as possible (assuming that this always generates better code), but this is only possible with knowledge of the actual GPUs on which the application is expected to run. As we will see later, in the situation of just in time compilation, where the driver has this exact knowledge: the runtime GPU is the one on which the program is about to be launched/executed.
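Illustratively, that virtual/real distinction maps onto nvcc flags like the following (arbitrary example values, not the flags used in this build):

nvcc -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 src.cu
# compute_70 is the virtual architecture (PTX), sm_70 the real one;
# code=compute_70 additionally embeds PTX so the driver can JIT for newer GPUs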
To me, this implies that --gpu-architecture should be set as low as possible, and you (might) know which architecture features parallelproj needs. For building a generic binary (to be distributed via docker or pip or whatever) it shouldn't auto-detect that. Another post says that for building locally, from CMake 3.24 you could use
set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES native)
but choosing this should be an option as far as I can see (just like for CPU compilation).
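A sketch of what exposing that as an option could look like (PARALLELPROJ_CUDA_ARCH is a hypothetical cache variable, not parallelproj's actual API):

# user-overridable architecture choice, defaulting to native for local builds
set(PARALLELPROJ_CUDA_ARCH "native" CACHE STRING
    "CUDA architectures to compile for (e.g. native, all, 75)")
set_property(TARGET parallelproj_cuda PROPERTY CUDA_ARCHITECTURES ${PARALLELPROJ_CUDA_ARCH})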
Anyway, I'll downgrade CMake again and see what happens.
—
@KrisThielemans pushed 2 commits.
—
This worked OK (aside from the upload, for whatever reason). So I think this is indeed a CMake issue in parallelproj. I'll therefore merge this PR and leave the CMake upgrade for later.
—
@KrisThielemans pushed 1 commit.
—
Merged #769 into master.
—
(1) I am building locally on a Mac M1 without a CUDA GPU.
(2) I get the same error when using CMake 3.24 with the current version of parallelproj.
(3) Setting CUDA_ARCHITECTURES to "all" solves the problem.
I will commit / push the change later today.
@kris: CUDA_ARCHITECTURES "all/all-major" was introduced in CMake 3.23. Is it OK for you to require cmake >= 3.23?
—
Great. 3.23 is rather recent, so many people won't have it. What about doing a check on the CMake version and using the old code for older CMake (and recommending in your installation instructions that people use 3.23 or later)?
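A sketch of such a version check (illustrative only, placed after the parallelproj_cuda target is defined; the actual parallelproj solution may differ):

if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.23)
  # all/all-major are only understood from CMake 3.23 onwards
  set_property(TARGET parallelproj_cuda PROPERTY CUDA_ARCHITECTURES all)
else()
  # fall back to the old architecture auto-detection code here
endif()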
—
Hm. I guess I would prefer to keep the recipe "clean and simple" and to require >= 3.23, especially since installing >= 3.23 is "easy" (e.g. from conda-forge, the cmake apt repo, or even brew).
Moreover, the old way of auto-detecting the CUDA architecture seems very "hacky".
Is there any other reason not to force people to upgrade cmake to >= 3.23?
—
> Is there any other reason not to force people to upgrade cmake to >= 3.23?
Many people do apt install cmake and then aren't happy when that version isn't recent enough (and I don't blame them). Requiring them to use conda or brew to install cmake isn't going to work either. Our user experience is already bad enough; I'd hate to make it worse.
—
OK. Then I'll add the if/else depending on the cmake version, and add a bit of explanation.
I hope that Ubuntu updates the cmake versions in their repos as soon as possible.
—
Thanks!
Of course, Ubuntu might update, but people won't update their Ubuntu :-/
—
Hi Kris,
I just pushed two commits such that compiling in the docker image (with cmake 3.16) works.
However, I think it would be best if we required at least 3.18 (which I do require in the latest commit, but I can revert that commit).
The reason is that "CMAKE_CUDA_ARCHITECTURES" was only introduced in 3.18. So if we build with cmake 3.16, the architecture information is not passed to the compiler at all.
The new default is that, depending on the cmake version, CMAKE_CUDA_ARCHITECTURES is set to the "native" architecture, as before. On a system without any CUDA devices, we could e.g. pass "-DCMAKE_CUDA_ARCHITECTURE=70" to build for just one particular architecture (but that only works as expected from cmake 3.18).
For cmake 3.16 (the default in Ubuntu 20.04), this build would also work, but no architecture information is passed.
That should be fine for testing the build, but I guess you don't want it when actually using the libraries.
What do you think?
Georg
—
I don't know enough about this stuff, sorry. Maybe @gfardell or @paskino have some opinion, or @casperdcl has some time.
Some comments:
- native makes sense for a local build, but seems wrong for a docker instance or a PyPI package, where all seems more appropriate. Of course, we can set CMAKE_CUDA_ARCHITECTURES when building our docker image. That probably makes sense for other packages (with recent CMake support) anyway, so I think you can choose whatever default you like, as long as we can modify it.
- If you require CMake 3.18, I'd appreciate it if you introduced a CMake variable DISABLE_CUDA in parallelproj to make that requirement conditional, so that we don't have to upgrade our requirement from 3.16.2.
—
Hi all,
parallelproj 1.0.2 is released. On my MacBook without a GPU, it builds fine in a docker container derived from nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 with the "native" cmake 3.16, even without specifying CMAKE_CUDA_ARCHITCTURE.
@KrisThielemans can you try to build again? If it fails, it is probably best to set CMAKE_CUDA_ARCHITCTURE to an arbitrary value (e.g. 75).
—
Missing an S - CMAKE_CUDA_ARCHITECTURES. The default is that no flags are passed to nvcc (which on my machine behaves the same as passing native).
For use on your own machine, I think an explicit native (cmake >= 3.24) is best (it only compiles for the GPU in your machine).
For shipping public docker images/binaries, all is best (maximum compatibility, ~10% larger binaries).
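In target-property form, the two recommendations look like this (pick one; parallelproj_cuda is the target name from the build log above):

# local build, CMake >= 3.24: compile only for the GPU actually present
set_property(TARGET parallelproj_cuda PROPERTY CUDA_ARCHITECTURES native)
# shipped docker images/binaries, CMake >= 3.23: compile for all supported GPUs
set_property(TARGET parallelproj_cuda PROPERTY CUDA_ARCHITECTURES all)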
—
@KrisThielemans @casperdcl @paskino
I have cleaned up the way CMAKE_CUDA_ARCHITECTURES is set in the new v1.1 release. The default value depends on the cmake version.
In principle that means that, if you are using cmake < 3.23 and there is no physical GPU present, you should manually specify -DCMAKE_CUDA_ARCHITECTURES (e.g. 75). However, I just tried myself on my M1 Mac in the Ubuntu 20.04 CUDA 11.7 container, and it also works without specifying -DCMAKE_CUDA_ARCHITECTURES explicitly.
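For example (75 is an arbitrary illustrative value):

cmake -DCMAKE_CUDA_ARCHITECTURES=75 ..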
Can somebody verify that?
—