Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

Giacomo Mulas

Oct 12, 2022, 11:30:04 AM

Package: gcc-12-offload-nvptx
Version: 12.2.0-5
Severity: grave
Justification: renders package unusable for nvidia

Dear Maintainer,

the nvptx plugin for gcc-12 currently available for sid mandates a
cuda level sm_30, which is no longer available in cuda 11 (the one
now in sid). This means that even a trivial example code like

#include <omp.h>
#include <stdio.h>
int main(int argc, char **argv){
#pragma omp target parallel
  {
    int i, j;
    i = omp_get_thread_num();
    j = omp_get_num_threads();
    printf("Hello world! I am thread %d out of %d\n", i, j);
  }
}

fails to compile with

capitanata:~/test$ gcc-12 -fopenmp test_openmp_2.c
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-12 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/12//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Even trying to set a specific target GPU arch does not seem to work, e.g.

gmulas@capitanata:~/test$ gcc-12 -fopenmp -foffload-options="-misa=sm_35" test_openmp_2.c
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-12 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/12//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

On the other hand, gcc-11 appears to have sm_35 as default, meaning it works,
both with and without the -misa option:

capitanata:~/test$ gcc-11 -fopenmp -foffload="-misa=sm_35" test_openmp_2.c
/usr/bin/ld: /tmp/user/1000/ccY5a4YE.crtoffloadtable.o: warning: relocation against `__offload_vars_end' in read-only section `.rodata'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE

capitanata:~/test$ gcc-11 -fopenmp test_openmp_2.c
/usr/bin/ld: /tmp/user/1000/ccHibGBc.crtoffloadtable.o: warning: relocation against `__offload_vars_end' in read-only section `.rodata'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE

and the resulting code runs:

capitanata:~/test$ ./a.out
Hello world! I am thread 4 out of 8
Hello world! I am thread 1 out of 8
Hello world! I am thread 6 out of 8
Hello world! I am thread 7 out of 8
Hello world! I am thread 0 out of 8
Hello world! I am thread 5 out of 8
Hello world! I am thread 2 out of 8
Hello world! I am thread 3 out of 8
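
The output above does not by itself show whether the target region ran on the GPU or fell back to the host. A minimal sketch of one way to check, assuming an OpenMP 4.0 or later runtime: omp_is_initial_device() and omp_get_num_devices() are standard OpenMP routines, while the helper names target_region_ran_on_host and report_offload are made up for this illustration. The stubs under #else are only there so that the file still builds without -fopenmp.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (assumption for this sketch only) so the file still
   builds without -fopenmp: no devices, everything runs on the host. */
static int omp_get_num_devices(void) { return 0; }
static int omp_is_initial_device(void) { return 1; }
#endif

/* Hypothetical helper: returns 1 if the target region executed on the
   host (fallback), 0 if it executed on an offload device. */
int target_region_ran_on_host(void)
{
    int on_host = 1;
    /* map(from:) copies the value written inside the region back to the host. */
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host;
}

/* Hypothetical helper: prints a one-line summary and returns the same flag. */
int report_offload(FILE *out)
{
    int on_host = target_region_ran_on_host();
    fprintf(out, "devices visible: %d, target region ran on: %s\n",
            omp_get_num_devices(), on_host ? "host" : "device");
    return on_host;
}
```

Calling report_offload(stdout) from main() and building with the same gcc -fopenmp command line should report "device" when offloading works, and "host" when the runtime falls back.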

Would it be possible to change the default -misa of gcc 12 to sm_35,
to enable gpu offloading to nvidia to work with gcc-12? And/or, is there
some undocumented, or poorly documented, way to actually specify on the
command line the requested cuda level architecture so that it works with
cuda 11 libraries?

Thanks in advance

Best regards
Giacomo Mulas

-- System Information:
Debian Release: bookworm/sid
APT prefers unstable
APT policy: (401, 'unstable'), (10, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.19.0-2-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=it_IT.UTF-8, LC_CTYPE=it_IT.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages gcc-12-offload-nvptx depends on:
ii gcc-12 12.2.0-5
ii gcc-12-base 12.2.0-5
ii libc6 2.35-3
ii libc6-dev 2.35-3
ii libgmp10 2:6.2.1+dfsg1-1.1
ii libgomp-plugin-nvptx1 12.2.0-5
ii libmpc3 1.2.1-2
ii libmpfr6 4.1.0-3
ii libzstd1 1.5.2+dfsg-1
ii nvptx-tools 0.20180301-1
ii zlib1g 1:1.2.11.dfsg-4.1

gcc-12-offload-nvptx recommends no packages.

gcc-12-offload-nvptx suggests no packages.

-- no debconf information

Thomas Schwinge

Oct 13, 2022, 5:50:03 AM

reassign 1021660 nvptx-tools
stop

Hi!

On 2022-10-12T17:19:02+0200, Giacomo Mulas <giacom...@inaf.it> wrote:
> the nvptx plugin for gcc-12 currently available for sid mandates a
> cuda level sm_30, which is no longer available in cuda 11 (the one
> now in sid). This means that even a trivial example code like [...]
> fails to compile with
>
> capitanata:~/test$ gcc-12 -fopenmp test_openmp_2.c
> ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
> nvptx-as: ptxas returned 255 exit status

Debian needs to update nvptx-tools to a version that includes
<https://github.com/MentorEmbedded/nvptx-tools/pull/33>
"as: Deal with CUDA 11.0, "Support for Kepler 'sm_30' and 'sm_32'
architecture based products is dropped"".

Grüße
Thomas

Thomas Schwinge

Oct 13, 2022, 7:10:03 AM

Hi!

(Tom added in CC; for your information, this is about
<https://bugs.debian.org/1021660>.)

On 2022-10-13T12:38:17+0200, Giacomo Mulas <giacom...@inaf.it> wrote:
> On Thu, 13 Oct 2022, Thomas Schwinge wrote:
>
>> Aha, that's a legit question to wonder about: the reason is that GCC 11
>> defaulted to sm_35 code generation (which CUDA 11 still does support):
>> <https://gcc.gnu.org/gcc-11/changes.html#nvptx>, just then GCC 12 again
>> reverted to sm_30: <https://gcc.gnu.org/gcc-12/changes.html#nvptx>.
>
> but then, wouldn't the most straightforward fix be to change GCC 12 again to
> generate sm_35 code by default?

GCC upstream would like to continue supporting old GPU hardware, some of
which doesn't support sm_35. That's why GCC 12 fixed this "regression"
introduced in GCC 11, and reverted back to sm_30, like GCC 10 and earlier
had.

> And also, is there some obscure command line
> option to explicitly request GCC 12 to generate code of some specific sm
> level (possibly even higher than sm_35)? I did try using
> gcc-12 -fopenmp -foffload=nvptx-none -foffload-options="-misa=sm_35"
> but it still does not work, while in principle it should. Why doesn't it?

It does work, but only for the code that GCC/nvptx generates in that
'gcc-12' invocation, not for the support libraries that it links in,
which are built for sm_30.
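
To make that reasoning concrete, here is a toy model in C. It is not a description of how mkoffload, nvptx-as or ptxas are actually implemented. It assumes only what is stated in this thread: each piece of PTX carries the sm level it was built for, and the ptxas shipped with CUDA 11 rejects anything below sm_35. The struct ptx_unit type and the first_rejected function are hypothetical names introduced for the illustration.

```c
#include <stddef.h>

/* Hypothetical type: one piece of PTX code handed to the assembler,
   tagged with the sm level it was built for (30 for sm_30, 35 for sm_35). */
struct ptx_unit {
    const char *name;
    int sm_level;
};

/* Returns the first unit whose level is below what the installed ptxas
   accepts, or NULL if every unit is acceptable. */
const struct ptx_unit *first_rejected(const struct ptx_unit *units,
                                      size_t n, int min_supported_sm)
{
    for (size_t i = 0; i < n; i++) {
        if (units[i].sm_level < min_supported_sm)
            return &units[i];
    }
    return NULL;
}
```

With the user code at level 35 (via -misa=sm_35), a support library at level 30, and a minimum of 35, first_rejected returns the library entry. In this model the offending level is therefore still sm_30, whatever -misa value is given for the user code, which matches the error message reported above.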

> On the other hand, if I use
> gcc-11 -fopenmp -foffload=nvptx-none="-misa=sm_30"
> then I get the same error message I get with gcc-12.

ACK.

> Is there something wrong in how GCC 12 handles nvptx code generation
> options?

Works as expected, per my understanding.

That said, with the upcoming GCC 13 you'll be able to build (!) GCC/nvptx
with a '--with-arch=[...]' 'configure' option, see
<https://gcc.gnu.org/gcc-13/changes.html#nvptx>.

'gcc/doc/install.texi', "nvptx-*-none":

The @option{--with-arch} option may be specified to override the
default value for the @option{-march} option, and to also build
corresponding target libraries.
The default is @option{--with-arch=sm_30}.

For example, if @option{--with-arch=sm_70} is specified,
@option{-march=sm_30} and @option{-march=sm_70} target libraries are
built, and code generation defaults to @option{-march=sm_70}.

However, that doesn't really help you as a user of GCC, as long as the
distributions don't (have an easy way to) build more variants for several
sm_[...]. More work is necessary in GCC/nvptx upstream to make that
feasible.

Giacomo Mulas

Oct 13, 2022, 9:30:03 AM

On Thu, 13 Oct 2022, Thomas Schwinge wrote:

> It does work, but only for the code that GCC/nvptx generates in that
> 'gcc-12' invocation, but not for the support libraries that it links in,
> which are built for sm_30.

Does this mean then that the support libraries of gcc-11-offload-nvptx
include both support for sm_30 and sm_35? Is it possible to compile such
support libraries so that they do support more than one cuda arch level,
instead of having, as in the case of GCC 12 support libraries, _only_ sm_30
as available option (if I understood you correctly)?

> However, that doesn't really help you as a user of GCC, as long as the
> distributions don't (have an easy way to) build more variants for several
> sm_[...]. More work is necessary in GCC/nvptx upstream to make that
> feasible.

Well, Debian itself does support this kind of setup, doesn't it? With
alternatives, or Provides in dpkg... Of course, I gather that putting together
the machinery to build a number of versions of the same package would be
somewhat of a pain to set up and maintain.

But anyway, given all you said, can this issue be solved at all, even acting
on nvptx-tools? If the issue lies in the support libraries, that problem
would still remain regardless of what you do on nvptx-tools, wouldn't it?

Thanks, bye
Giacomo

--
_________________________________________________________________

Giacomo Mulas <giacom...@inaf.it>
_________________________________________________________________

INAF - Osservatorio Astronomico di Cagliari
via della scienza 5 - 09047 Selargius (CA)

tel. +39 070 71180255
mob. : +39 329 6603810
_________________________________________________________________

"It's just a shadow of the man you should be
Like a garden in the forest that the world will never see
You have no thought of answers only questions to be filled"
(Big Country)
_________________________________________________________________