riscv-gcc / ld.so names / multiarch / UTS_MACHINE

Karsten Merker

Dec 23, 2016, 2:51:02 PM
to Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, sw-...@groups.riscv.org
Hello everybody,

first I have to state that I am not a toolchain guy and I am not
familiar with the gcc codebase, so please correct me if I'm
getting things wrong.

I have looked through the recent riscv-gcc commits and I'm happy
that there has been work on replacing the old conflicting dynamic
linker names (/lib/ld.so.1 respectively /lib32/ld.so.1) with
names that include an ABI specifier so that multiarch support
becomes possible for RISC-V.

https://github.com/riscv/riscv-gcc/commit/3e7b179f8b60f74b2f6eb63ff464305b8f685a95
contains the following definition

#define GLIBC_DYNAMIC_LINKER "/lib" XLEN_SPEC "/" ABI_SPEC "/ld.so.1"

with XLEN_SPEC being either 32 or 64 and ABI_SPEC being one of
ilp32, ilp32f, ilp32d, lp64, lp64f, lp64d.

This would give us a dynamic linker path of /lib64/lp64d/ld.so.1
for the RV64G "default" case and e.g. /lib32/ilp32/ld.so.1 for an
RV32I system.
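
As a rough illustration of the composition above, the following Python
sketch assembles the path from the two fragments. The function name
`dynamic_linker_path` is hypothetical, and the fragments are treated as
plain strings, which is a simplification: the thread only states which
values XLEN_SPEC and ABI_SPEC can take, not how GCC resolves them.

```python
# Values taken from the message above.
XLENS = (32, 64)
ABIS = ("ilp32", "ilp32f", "ilp32d", "lp64", "lp64f", "lp64d")


def dynamic_linker_path(xlen, abi):
    """Mirror "/lib" XLEN_SPEC "/" ABI_SPEC "/ld.so.1" (hypothetical helper)."""
    if xlen not in XLENS:
        raise ValueError(f"unsupported XLEN: {xlen!r}")
    if abi not in ABIS:
        raise ValueError(f"unsupported ABI: {abi!r}")
    return f"/lib{xlen}/{abi}/ld.so.1"


print(dynamic_linker_path(64, "lp64d"))  # /lib64/lp64d/ld.so.1
print(dynamic_linker_path(32, "ilp32"))  # /lib32/ilp32/ld.so.1
```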

While there is AFAIK no other architecture that currently uses
this linker path so there technically isn't a naming conflict, I
think that we should follow the convention of the other "modern"
architectures, which all include the architecture name in their
linker path:

* arm64: /lib/ld-linux-aarch64.so.1
* armhf: /lib/ld-linux-armhf.so.3
* ia64: /lib/ld-linux-ia64.so.2
* mips n64: /lib64/ld-linux-mipsn8.so.1
* nios2: /lib/ld-linux-nios2.so.1
* x86_64: /lib64/ld-linux-x86-64.so.2

So something like

#define GLIBC_DYNAMIC_LINKER "/lib" XLEN_SPEC "/" ABI_SPEC "/ld-linux-rv.so.1"

would be more in line with what the other architectures do.

One thing that is unclear to me is how this naming scheme could
handle multiarch with different base ISAs which use the same ABI,
but a different set of instruction set extensions. As an example
think of RV32I vs. RV32IM vs. RV32IMA which would all use the
same ABI (ilp32) and therefore the same linker path
(/lib32/ilp32/ld.so.1) while not being binary compatible (in one
direction).
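
The one-directional compatibility described above can be sketched as a
subset test on extension letters. This is only an illustration: the
helper names `parse_isa` and `can_run` are made up, only single-letter
extensions are handled, shorthands such as "G" are not expanded, and a
different XLEN is simply treated as incompatible.

```python
import re

_ISA_RE = re.compile(r"rv(32|64)([a-z]+)")


def parse_isa(isa):
    """Split e.g. 'rv32ima' into (32, frozenset({'i', 'm', 'a'}))."""
    m = _ISA_RE.fullmatch(isa.lower())
    if m is None:
        raise ValueError(f"cannot parse ISA string: {isa!r}")
    return int(m.group(1)), frozenset(m.group(2))


def can_run(binary_isa, hardware_isa):
    """True if code built for binary_isa needs nothing hardware_isa lacks."""
    b_xlen, b_ext = parse_isa(binary_isa)
    h_xlen, h_ext = parse_isa(hardware_isa)
    return b_xlen == h_xlen and b_ext <= h_ext


print(can_run("rv32i", "rv32ima"))   # True: RV32I code runs on RV32IMA
print(can_run("rv32ima", "rv32i"))   # False: M and A are missing
```

All three variants would still share /lib32/ilp32/ld.so.1 under the
scheme above, which is the conflict in question.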

This is an issue that other platforms usually don't have as they
don't have different base ISAs in the way RISC-V has them. The
difference between RV32IM and RV32IMA for example is important in
practice as modern C++ code often makes use of C++11 atomics,
which are fast on platforms that offer the relevant native atomic
instructions but dog slow on platforms that have to resort to
using Linux kernel atomic helpers instead, so if available, one
really wants to use the A extension.

IMHO the linker path should - besides the ABI specifier - also
include the ISA specifier of the minimum base ISA on which the
code can run, so that we would end up with something like
/lib32/ilp32/ld-linux-rv32i.so.1 or
/lib32/ilp32/ld-linux-rv32ima.so.1.
This would allow proper multiarch installations for different base
ISAs.

A related issue comes up with defining the "machine name"
(UTS_MACHINE, which is returned by "uname -m") in the kernel.
Currently we only return "riscv32" or "riscv64" depending on
XLEN, but we don't provide proper ISA information. As the
output of "uname -m" is used by configure scripts to determine on
which platform they are running, to call the compiler with the
proper prefix and set the proper options, wouldn't it make sense
to return a full ISA specifier (e.g. riscv32ima or
riscv64imafd)?

Opinions and ideas welcome :-).

Regards,
Karsten
--
Pursuant to Section 28(4) of the German Federal Data Protection Act
(Bundesdatenschutzgesetz), I object to the use and disclosure of my
personal data for purposes of advertising and of market or opinion
research.

Stefan O'Rear

Dec 23, 2016, 3:50:07 PM
to Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
Recent Linux on AArch32 (without hw atomics) supports cmpxchg in the
VDSO, which does _not_ trap to the kernel; instead, the kernel's trap
exit path has a PC-adjust check that causes cmpxchg to fail if it was
interrupted. Linux on RISC-V has nothing similar right now, but it
could be added.

> really wants to use the A extension.
>
> IMHO the linker path should - besides the ABI specifier - also
> include the ISA specifier of the minimum base ISA on which the
> code can run, so that we would end up with something like
> /lib32/ilp32/ld-linux-rv32i.so.1 or
> /lib32/ilp32/ld-linux-rv32ima.so.1.
> This would allow proper multiarch installations for different base
> ISAs.
>
> A related issue comes up with defining the "machine name"
> (UTS_MACHINE, which is returned by "uname -m") in the kernel.
> Currently we only return "riscv32" or "riscv64" depending on
> XLEN, but we don't provide proper ISA information. As the
> output of "uname -m" is used by configure scripts to determine on
> which platform they are running, to call the compiler with the
> proper prefix and set the proper options, wouldn't it make sense
> to return a full ISA specifier (e.g. riscv32ima or
> riscv64imafd)?
>
> Opinions and ideas welcome :-).

This area is rather confused. It's not very clear what problem we're
trying to solve. Here are my thoughts on the matter, but few answers:


The first question is what kind of multiarch are we trying to support.
I see three possibilities here:

1. ABI-compatible libraries, such as RV32IMA and RV32IA.

2. ABI-incompatible but related architectures, such as having RV32G
and RV64G simultaneously present on a machine that supports both
modes.

3. Arbitrary unrelated architectures, for instance a single rootfs
with both x86_64 and riscv64 binaries. Such a design could be used
for a network environment a la Plan 9, or as a transitional stage in
bootstrapping with qemu-user.

It seems to me that embedding the architecture/ABI in the basename of
ld.so is not useful, because _every_ system library needs to be scoped
to an ABI, and the sensible place to do that is the dirname. I'm not
sure about the "ld-linux" case; are there reasonable cases where two
kernels could share a /lib64 but need separate dynamic linkers?


Next question is: who is the target audience of gcc's builtin multiarch?

Fedora has a hard policy of /lib/libc-2.24.so and /lib64/libc-2.24.so
and is already patching gcc to use those paths (I _think_ Fedora is
less picky about the name of ld.so).

Debian appears to want full type-3 multiarch and scopes every library
to a triple.

People who get gcc's multiarch rules are people who are either
building gcc from source to use as a cross-compiler or distributions
that don't override it. How many of the latter exist? If people are
doing cross-compilation, that could well be from riscvSOMETHING to
riscvSOMETHINGELSE?


Triples seem in general to be poorly specified. We've been using
riscv32 and riscv64 to keep them separate, but nothing for POSIX has
really been done outside of RV64G. RISC-V is not the only modern
architecture that is specified piecemeal with optionally installed
facilities; "llc -mattr=help" gives me several pages of x86 variants,
and x86 has only been extended by two parties in recent years. RISC-V
will be far worse due to the larger number of parties, and I don't
know whether encoding all variations into the triple is possible or
desirable.

Fedora intends to compile all libraries for RV64G (possibly RV64GC
after support is added to qemu?) and rely on runtime discovery for
additional extensions. Of course this requires a way to do the
runtime discovery (the obvious way is to pass a config-string fragment
in the ELF auxiliary vector; I have been made aware that some people
want a very small number of coarse-grained extensions that could be
handled as a bitfield, but that seems unlikely given current trends).

The upstream version of config.sub can only handle the exact strings
"riscv32" and "riscv64". Changing that will be annoying.
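
To see why exact-string matching makes richer machine names awkward,
consider this small Python sketch. It is not config.sub's actual logic;
`guess_triple`, the "-unknown-linux-gnu" suffix and the accepted set are
assumptions chosen purely for illustration.

```python
# Only the two exact strings mentioned above are accepted.
KNOWN_MACHINES = {"riscv32", "riscv64"}


def guess_triple(machine, libc="gnu"):
    """Map a `uname -m` style machine name to a triple (hypothetical)."""
    if machine not in KNOWN_MACHINES:
        raise ValueError(f"machine {machine!r} not recognized")
    return f"{machine}-unknown-linux-{libc}"


for name in ("riscv64", "riscv32ima"):
    try:
        print(name, "->", guess_triple(name))
    except ValueError as err:
        # A full ISA specifier falls through the exact match.
        print(name, "-> rejected:", err)
```

Any scheme that puts a full ISA specifier into the machine name would
need every such matcher to be updated first.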

Android does on-device compilation; iOS requires app submission as
bitcode which is compiled for each phone model on Apple's
infrastructure; I think it's clear that the future of medium- and
large-systems software is personalized binaries, although neither
Fedora nor Debian is in a position to do that *right now*. How does
this interact with the triples system? When a future large-systems
RISC-V chip requires 75 bits of information to describe the
architecture and tuning parameters, will we encode that in a triple,
or what will we use as a post-triple?

-s

Manuel A. Fernandez Montecelo

Dec 23, 2016, 3:53:25 PM
to Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
Hi,

2016-12-23 20:45 GMT+01:00 Karsten Merker <mer...@debian.org>:
> IMHO the linker path should - besides the ABI specifier - also
> include the ISA specifier of the minimum base ISA on which the
> code can run, so that we would end up with something like
> /lib32/ilp32/ld-linux-rv32i.so.1 or
> /lib32/ilp32/ld-linux-rv32ima.so.1.
> This would allow proper multiarch installations for different base
> ISAs.

There's this pull request following our previous discussions, and some
of the toolchain devels liked the idea (see comments to the pull
request), but I didn't have time to follow up for half a year now :(

https://github.com/riscv/riscv-gnu-toolchain/pull/136

I would be glad if anybody wants to push things forward in that direction.


Cheers.
--
Manuel A. Fernandez Montecelo <manuel.m...@gmail.com>

Karsten Merker

Dec 23, 2016, 5:19:57 PM
to Stefan O'Rear, Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
On Fri, Dec 23, 2016 at 12:50:05PM -0800, Stefan O'Rear wrote:
> On Fri, Dec 23, 2016 at 11:45 AM, Karsten Merker <mer...@debian.org> wrote:

> > One thing that is unclear to me is how this naming scheme could
> > handle multiarch with different base ISAs which use the same ABI,
> > but a different set of instruction set extensions. As an example
> > think of RV32I vs. RV32IM vs. RV32IMA which would all use the
> > same ABI (ilp32) and therefore the same linker path
> > (/lib32/ilp32/ld.so.1) while not being binary compatible (in one
> > direction).
[snip]
> > IMHO the linker path should - besides the ABI specifier - also
> > include the ISA specifier of the minimum base ISA on which the
> > code can run, so that we would end up with something like
> > /lib32/ilp32/ld-linux-rv32i.so.1 or
> > /lib32/ilp32/ld-linux-rv32ima.so.1.
> > This would allow proper multiarch installations for different base
> > ISAs.
[snip]
> The first question is what kind of multiarch are we trying to support.
> I see three possibilities here:
>
> 1. ABI-compatible libraries, such as RV32IMA and RV32IA.
>
> 2. ABI-incompatible but related architectures, such as having RV32G
> and RV64G simultaneously present on a machine that supports both
> modes.
>
> 3. Arbitrary unrelated architectures, for instance a single rootfs
> with both x86_64 and riscv64 binaries. Such a design could be used
> for a network environment a la Plan 9, or as a transitional stage in
> bootstrapping with qemu-user.

Hello,

"type 3" is what Debian targets with multiarch.

> It seems to me that embedding the architecture/ABI in the basename of
> ld.so is not useful, because _every_ system library needs to be scoped
> to an ABI, and the sensible place to do that is the dirname.

I'm not sure I understand you correctly - are you proposing that
the ld.so for all architectures/ABIs should have the same
path/name, i.e. the old-style /lib/ld.so? How could that work
in a multiarch environment even if the libraries for the
different architectures/ABIs are in different directories, as
each of the different architectures/ABIs might require a
different dynamic linker (that is for "type-3-multiarch" which is
what Debian does)?

> I'm not sure about the "ld-linux" case; are there reasonable
> cases where two kernels could share a /lib64 but need separate
> dynamic linkers?

You mean the naming convention of ld-linux vs. ld? I don't know
for sure as this definitely isn't my field of expertise, but one
case that I could at least imagine might be a syscall compat
layer for executing "foreign" binaries (such as the FreeBSD
support for executing Linux binaries), where the "foreign"
binaries also use the corresponding "foreign" dynamic linker.

Regardless of such an indeed very exotic case: AFAIK all current
Linux platforms call their ELF linker ld-linux*.so, so IMHO we
should adhere to this convention on RISC-V as well. IIRC ld.so
(in contrast to ld-linux*.so) had on Linux only been used for the
old a.out binary format.

Stefan O'Rear

Dec 23, 2016, 5:28:50 PM
to Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
On Fri, Dec 23, 2016 at 2:14 PM, Karsten Merker <mer...@debian.org> wrote:
> On Fri, Dec 23, 2016 at 12:50:05PM -0800, Stefan O'Rear wrote:
>> It seems to me that embedding the architecture/ABI in the basename of
>> ld.so is not useful, because _every_ system library needs to be scoped
>> to an ABI, and the sensible place to do that is the dirname.
>
> I'm not sure I understand you correctly - are you proposing that
> the ld.so for all architectures/ABIs should have the same
> path/name, i.e. the old-style /lib/ld.so? How could that work
> in a multiarch environment even if the libraries for the
> different architectures/ABIs are in different directories, as
> each of the different architectures/ABIs might require a
> different dynamic linker (that is for "type-3-multiarch" which is
> what Debian does)?

Actually I'm proposing, for Debian, /lib/riscv32ima-linux-gnu/ld.so.1.
Adding "linux" and "rv32ima" in the basename would just be redundant.

What gcc does when you compile it from unpatched upstream sources is a
different question with no good answer right now.

>> I'm not sure about the "ld-linux" case; are there reasonable
>> cases where two kernels could share a /lib64 but need separate
>> dynamic linkers?
>
> You mean the naming convention of ld-linux vs. ld? I don't know
> for sure as this definitely isn't my field of expertise, but one
> case that I could at least imagine might be a syscall compat
> layer for executing "foreign" binaries (such as the FreeBSD
> support for executing Linux binaries), where the "foreign"
> binaries also use the corresponding "foreign" dynamic linker.

Foreign binaries would use a foreign libc.so which implies they need a
private libdir. If you have a private libdir, renaming ld.so's
basename is questionable.

> Regardless of such an indeed very exotic case: AFAIK all current
> Linux platforms call their ELF linker ld-linux*.so, so IMHO we
> should adhere to this convention on RISC-V as well. IIRC ld.so
> (in contrast to ld-linux*.so) had on Linux only been used for the
> old a.out binary format.

Maybe. I'm not sold but not opposed either.

-s

Michael Clark

Dec 23, 2016, 5:48:54 PM
to Karsten Merker, Stefan O'Rear, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
Type 3 seems ideal, however it’s really up to the vendor.

With multiarch, all of the required information is present in the directory /lib/arch-os-abi/, where arch-os-abi is the host triple and /lib/ itself is empty apart from the architecture subdirectories. It’s a nice system.

Just looking at my Debian Stretch system. ld is ld-2.24.so and the arch subdirectory is x86_64-linux-gnu and the binaries contain a fully qualified RTLD name.

$ ls -l /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
lrwxrwxrwx 1 root root 10 Oct 19 10:10 /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 -> ld-2.24.so


This would be somewhat equivalent to:

/lib/riscv32i-linux-gnu/ld-linux-riscv32i.so.1 -> ld-1.0.so (RVABI dispatch to either hard or soft floating point, hard or soft multiply and hard atomics or kernel assist)
/lib/riscv32g-linux-gnu/ld-linux-riscv32g.so.1 -> ld-1.0.so (monitor emulation for fill in)
/lib/riscv64g-linux-gnu/ld-linux-riscv64g.so.1 -> ld-1.0.so (monitor emulation for fill in)

The question I have, which is vendor agnostic, is whether the hard/soft float and 64-bit integer support on 32-bit is defined in a compiler-agnostic RVABI (header interfaces), or whether it is the generic set provided by libgcc or compiler-rt. I like the idea of defining an RVABI interface that is OS and compiler agnostic, so that the implementation in LLVM with compiler-rt can be the same as in libgcc without either of them having to reinvent the wheel. It also increases the chances that, with containerisation, RISC-V containers are portable between operating systems. I don’t know if this is a goal of the RISC-V Foundation, but it seems to be where the industry is heading. In the future there may also be vendors with their own compilers (e.g. ICC, MSC), which would make having an RVABI for the fill-in functions a goal. LLVM's compiler-rt is a good starting point for an RVABI that could be used across multiple operating systems, as it has a liberal license that is GPL compatible (i.e. dual licensed and DFSG compatible).

I notice that FreeBSD 11 (which now uses compiler-rt instead of libgcc) has /lib/ and the interpreter is /libexec/ld-elf.so.1

In terms of host triples, it seems that the RISC-V Foundation should really be specifying the arch part in arch-os-abi, however an RVABI for the fill in support (soft float, atomics, 64-on-32) may also be under the umbrella of the foundation.

--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/20161223221438.GA18551%40excalibur.cnev.de.

Michael Clark

Dec 23, 2016, 5:58:54 PM
to Karsten Merker, Stefan O'Rear, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
My point of view is really that of a developer not tied to any particular platform, as I regularly try to compile C/C++ code for Windows, Linux, macOS, FreeBSD, iOS and Android. Of course there are going to be various embedded RTOSes, L4, Plan 9 and many other operating systems that might support RISC-V, so we should take an OS-agnostic standpoint, i.e. just specify one part of the host triple and potentially a compiler-agnostic RVABI for fill-in functions, to standardise the runtimes and make containerisation possible.

Stefan O'Rear

Dec 23, 2016, 5:59:17 PM
to Michael Clark, Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
On Fri, Dec 23, 2016 at 2:48 PM, Michael Clark <michae...@mac.com> wrote:
> Just looking at my Debian Stretch system. ld is ld-2.24.so and the arch
> subdirectory is x86_64-linux-gnu and the binaries contain a fully qualified
> RTLD name.
>
> $ ls -l /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
> lrwxrwxrwx 1 root root 10 Oct 19 10:10
> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 -> ld-2.24.so

The symlinks are a red herring. The compatibility question is about
PT_INTERP, e.g.

$ file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux
2.6.32, BuildID[sha1]=9c8f94edddf5d4eea037be7a35e805e334921898,
stripped

PT_INTERP is an inter-distribution compatibility issue, so it needs to
include all relevant information.
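
For readers who want to inspect PT_INTERP without `file`, here is a
minimal Python sketch that walks the ELF program headers directly. The
field offsets follow the standard ELF32/ELF64 layouts; the function name
`read_pt_interp` is made up, and error handling for truncated files is
deliberately omitted.

```python
import struct

PT_INTERP = 3  # program header type holding the interpreter path


def read_pt_interp(data):
    """Return the PT_INTERP string of an ELF image, or None if absent."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is64:
        (phoff,) = struct.unpack_from(end + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        (phoff,) = struct.unpack_from(end + "I", data, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 42)
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is64:  # ELF64: p_offset at +8, p_filesz at +32
            (offset,) = struct.unpack_from(end + "Q", data, base + 8)
            (filesz,) = struct.unpack_from(end + "Q", data, base + 32)
        else:     # ELF32: p_offset at +4, p_filesz at +16
            (offset,) = struct.unpack_from(end + "I", data, base + 4)
            (filesz,) = struct.unpack_from(end + "I", data, base + 16)
        return data[offset:offset + filesz].rstrip(b"\0").decode()
    return None
```

Typical use would be `read_pt_interp(open("/bin/ls", "rb").read())`,
which should report the same interpreter string that `file` prints.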

> The question I have, which is vendor agnostic, is whether the hard/soft
> float and 64-bit integer support on 32-bit is defined in a compiler agnostic
> RVABI (header interfaces) or is the generic set provided by libgcc or

That's an important question, but I don't think it's entirely RISC-V
specific and it might be a subject for another thread. Usually libgcc
is statically linked which avoids some of these issues.

> compiler-rt. I like the idea of defining an RVABI interface that is OS and
> compiler agnostic so that the implementation in LLVM with compiler-rt can be
> the same as libgcc without either of them having to reinvent the wheel. It
> also increases the chances that with containerisation, that RISC-V
> containers are portable between operating systems. I don’t know if this is a

libgcc_s.so is part of the filesystem so it travels with containers.
There is no container impact here, I call red herring.

> goal of the RISC-V Foundation, but it seems to be where the industry is
> heading. There also may in the future be vendors that have their own
> compilers. e.g. ICC, MSC, etc, which would make having an RVABI for the fill
> in functions a goal. LLVM's compiler-rt is a good starting point for an
> RVABI that could be used across multiple OS as it has a liberal license that
> is GPL compatible (e.g. Dual Licensed and DFSG compatible).
>
> I notice that FreeBSD 11 (which now uses compiler-rt instead of libgcc) has
> /lib/ and the interpreter is /libexec/ld-elf.so.1

What does PT_INTERP look like on FreeBSD? (file _should_ tell you)

> In terms of host triples, it seems that the RISC-V Foundation should really
> be specifying the arch part in arch-os-abi, however an RVABI for the fill in
> support (soft float, atomics, 64-on-32) may also be under the umbrella of
> the foundation.

RVABI is part of the C/C++ psABI, like DWARF unwind tables that are
used for exceptions. Not relevant to "arch", might be relevant to
"abi".

> I'm not sure I understand you correctly - are you proposing that
> the ld.so for all architectures/ABIs should have the same
> path/name, i.e. the old-style /lib/ld.so? How could that work

I responded to this separately.

-s

Michael Clark

Dec 23, 2016, 6:30:41 PM
to Stefan O'Rear, Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
> What does PT_INTERP look like on FreeBSD? (file _should_ tell you)

$ file /bin/ls
/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.0 (1100122), FreeBSD-style, stripped

>> In terms of host triples, it seems that the RISC-V Foundation should really
>> be specifying the arch part in arch-os-abi, however an RVABI for the fill in
>> support (soft float, atomics, 64-on-32) may also be under the umbrella of
>> the foundation.
>
> RVABI is part of the C/C++ psABI, like DWARF unwind tables that are
> used for exceptions. Not relevant to "arch", might be relevant to
> "abi".
>
>> I'm not sure I understand you correctly - are you proposing that
>> the ld.so for all architectures/ABIs should have the same
>> path/name, i.e. the old-style /lib/ld.so? How could that work
>
> I responded to this separately.
>
> -s

Michael Clark

Dec 23, 2016, 6:44:10 PM
to Stefan O'Rear, Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
On 24 Dec 2016, at 11:59 AM, Stefan O'Rear <sor...@gmail.com> wrote:
> On Fri, Dec 23, 2016 at 2:48 PM, Michael Clark <michae...@mac.com> wrote:
>> Just looking at my Debian Stretch system. ld is ld-2.24.so and the arch
>> subdirectory is x86_64-linux-gnu and the binaries contain a fully qualified
>> RTLD name.
>>
>> $ ls -l /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>> lrwxrwxrwx 1 root root 10 Oct 19 10:10
>> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 -> ld-2.24.so
>
> The symlinks are a red herring. The compatibility question is about
> PT_INTERP, e.g.
>
> $ file /bin/ls
> /bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
> dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux
> 2.6.32, BuildID[sha1]=9c8f94edddf5d4eea037be7a35e805e334921898,
> stripped
>
> PT_INTERP is an inter-distribution compatibility issue, so it needs to
> include all relevant information.

Yes. I remember when this was being standardised. I joined a few of the ISO/IEC JTC1/SC22 Linux Rapporteur Group meetings when the LSB was being standardised. Interesting that AArch64 has not been standardised by the Linux Foundation. No surprise really, given the many ABI variants.


Many people think of the LSB as init script requirements, and vendors have stated things like having dropped the LSB, but this just shows a misunderstanding of what the LSB is: it is mainly about binary compatibility at the C ABI layer, i.e. C ABI, libc symbol versions, exception headers, C++ ABI compatibility and base system symbols for core libraries (libssl). In the late 90’s and early 2000’s there was a lot of instability in G++ with respect to the C++ ABI, and binaries were not portable between different versions of the same distro, let alone different distros. Today we can run Skype or Vivado binaries on multiple x86 Linux distributions. I don’t think ARM has that level of compatibility even today, except perhaps as shared objects in APKs on Android. I spend some time building NDK binaries for ARM/x86 in Android apps, so I know the pain…

Stefan O'Rear

Dec 23, 2016, 11:02:39 PM
to Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
On Fri, Dec 23, 2016 at 11:45 AM, Karsten Merker <mer...@debian.org> wrote:
> While there is AFAIK no other architecture that currently uses
> this linker path so there technically isn't a naming conflict, I
> think that we should follow the convention of the other "modern"
> architectures, which all include the architecture name in their
> linker path:
>
> * arm64: /lib/ld-linux-aarch64.so.1
> * armhf: /lib/ld-linux-armhf.so.3
> * ia64: /lib/ld-linux-ia64.so.2
> * mips n64: /lib64/ld-linux-mipsn8.so.1
> * nios2: /lib/ld-linux-nios2.so.1
> * x86_64: /lib64/ld-linux-x86-64.so.2

Note: this is asking about the PT_INTERP name, which is relevant for
LSB compatibility. Increasing deployment of namespace-based systems
like docker and flatpak may mean that standardizing on a dynamic
linker name is less important than it once was, but that doesn't mean
it's not important.

https://sourceware.org/glibc/wiki/ABIList
https://wiki.linaro.org/RikuVoipio/LdSoTable

** The current interp is problematic because it does not mention
riscv, although it is sufficiently weird that it won't collide with
anything extant **

Linux multilib and multiarch systems, as far as I've been able to
determine, use one set of libraries per base ISA and ABI, not per
hwcap permutation.

All PT_INTERP names from the list above are /lib /lib32 /lib64 with
the architecture, if present at all, in the final basename. In the
interest of consistency I propose following the same pattern.

/lib32 is fairly weird and aarch64 doesn't use /lib64 either, so I'm
fine with /lib. Debian will symlink the PT_INTERP name to the install
location of ld.so anyway.

Proposal:

"/lib/ld-linux-riscv" XLEN_SPEC ABI_SPEC ".so.1"
/lib/ld-linux-riscv32ilp32.so.1
/lib/ld-linux-riscv32ilp32d.so.1 # Most likely for RV32G
/lib/ld-linux-riscv64ilp32.so.1
/lib/ld-linux-riscv64ilp32d.so.1
/lib/ld-linux-riscv64lp64.so.1
/lib/ld-linux-riscv64lp64d.so.1 # Most likely for RV64G
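
One property of the names listed above is that XLEN and ABI remain
recoverable even though they are concatenated without a separator. The
Python sketch below checks this with a regular expression. The helper
`parse_proposed_interp` and the accepted ABI set (the six ABIs named
earlier in the thread) are illustrative assumptions, not part of the
proposal.

```python
import re

# "/lib/ld-linux-riscv" XLEN_SPEC ABI_SPEC ".so.1"
_INTERP_RE = re.compile(
    r"/lib/ld-linux-riscv(32|64)(ilp32[fd]?|lp64[fd]?)\.so\.1"
)


def parse_proposed_interp(path):
    """Return (xlen, abi) for a name in the proposed scheme, else None."""
    m = _INTERP_RE.fullmatch(path)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


for name in ("/lib/ld-linux-riscv64ilp32d.so.1",
             "/lib/ld-linux-riscv64lp64d.so.1",
             "/lib/ld.so.1"):
    print(name, "->", parse_proposed_interp(name))
```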

> One thing that is unclear to me is how this naming scheme could
> handle multiarch with different base ISAs which use the same ABI,
> but a different set of instruction set extensions. As an example
> think of RV32I vs. RV32IM vs. RV32IMA which would all use the
> same ABI (ilp32) and therefore the same linker path
> (/lib32/ilp32/ld.so.1) while not being binary compatible (in one
> direction).

That kind of one-directional compatibility is not something that
either multilib or multiarch normally handles. What other gcc targets
do is pick a baseline and then use it unless you pass a -march=native
argument.

Baseline for Linux will likely be RV(32|64)GC for all hardfloat ABIs,
and GC or IMAC for softfloat (could be convinced either way).

Common extensions beyond the baseline, like V and specific
cryptographic accelerators, can be picked up at runtime using hwcaps.
Targeting less than the baseline will require a cross-compiler.
Given that freedom-u-sdk won't even boot with less than 20 MB of RAM,
removing MAC is probably a false economy / unbalanced design, but we
do need to support small profiles for riscv32-unknown-elf.
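
The runtime pick-up of extensions beyond the baseline, mentioned above,
could look roughly like the following Python sketch. Nothing in this
thread fixes the hwcap encoding; the one-bit-per-letter layout (bit 0 =
"a", bit 1 = "b", and so on), the helper names and the choice of
"imafdc" as baseline are all assumptions for illustration.

```python
def encode_hwcap(letters):
    """Build a bitmask with one bit per extension letter (assumed layout)."""
    mask = 0
    for ch in letters.lower():
        mask |= 1 << (ord(ch) - ord("a"))
    return mask


def decode_hwcap(mask):
    """Return the extension letters set in mask, in alphabetical order."""
    return "".join(
        chr(ord("a") + bit) for bit in range(26) if mask & (1 << bit)
    )


BASELINE = encode_hwcap("imafdc")  # "GC", spelled out letter by letter


def extras_beyond_baseline(mask):
    """Extensions the hardware reports on top of the assumed baseline."""
    return decode_hwcap(mask & ~BASELINE)


print(extras_beyond_baseline(encode_hwcap("imafdcv")))  # v
```

A library compiled for the baseline could branch on such a check to
select an optimized code path at load time or at run time.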

> This is an issue that other platforms usually don't have as they
> don't have different base ISAs in the way RISC-V has them. The
> difference between RV32IM and RV32IMA for example is important in
> practice as modern C++ code often makes use of C++11 atomics,
> which are fast on platforms that offer the relevant native atomic
> instructions but dog slow on platforms that have to resort to
> using Linux kernel atomic helpers instead, so if available, one
> really wants to use the A extension.

To support LSB we want as few normal Linux targets as possible.
Anything beyond RV64GC LP64D and RV32GC ILP32D is a tough sell, and
I'd rather focus on the first as much as possible.

I don't see a good argument against making A mandatory for Linux.

> IMHO the linker path should - besides the ABI specifier - also
> include the ISA specifier of the minimum base ISA on which the
> code can run, so that we would end up with something like
> /lib32/ilp32/ld-linux-rv32i.so.1 or
> /lib32/ilp32/ld-linux-rv32ima.so.1.
> This would allow proper multiarch installations for different base
> ISAs.

Other ISAs have the same problem (32-bit x86 has had this problem
_several times_ due to things like 64-bit floating point) and
generally address it at the OS level or by using cross compilers, not
by multilib proliferation.

> A related issue comes up with defining the "machine name"
> (UTS_MACHINE, which is returned by "uname -m") in the kernel.
> Currently we only return "riscv32" or "riscv64" depending on
> XLEN, but we don't provide proper ISA information. As the
> output of "uname -m" is used by configure scripts to determine on
> which platform they are running, to call the compiler with the
> proper prefix and set the proper options, wouldn't it make sense
> to return a full ISA specifier (e.g. riscv32ima or
> riscv64imafd)?

I think it might make sense to support "riscv32i-unknown-elf" or
"riscv32ima-unknown-elf", although you can accomplish the same thing
with toolchain configuration. I don't see a compelling case for
supporting full Linux with anything less than RV32GC ILP32D.

-s

Manuel A. Fernandez Montecelo

Dec 27, 2016, 12:56:49 PM
to Stefan O'Rear, Karsten Merker, Andrew Waterman, Kito Cheng, Palmer Dabbelt, Manuel A. Fernandez Montecelo, RISC-V SW Dev
2016-12-24 5:02 GMT+01:00 Stefan O'Rear <sor...@gmail.com>:
>
> /lib32 is fairly weird and aarch64 doesn't use /lib64 either, so I'm
> fine with /lib. Debian will symlink the PT_INTERP name to the install
> location of ld.so anyway.
>
> Proposal:
>
> "/lib/ld-linux-riscv" XLEN_SPEC ABI_SPEC ".so.1"
> /lib/ld-linux-riscv32ilp32.so.1
> /lib/ld-linux-riscv32ilp32d.so.1 # Most likely for RV32G
> /lib/ld-linux-riscv64ilp32.so.1
> /lib/ld-linux-riscv64ilp32d.so.1
> /lib/ld-linux-riscv64lp64.so.1
> /lib/ld-linux-riscv64lp64d.so.1 # Most likely for RV64G

This is similar to the previous work (only that 'd'/'ifd' is different):

https://github.com/riscv/riscv-gnu-toolchain/pull/136/files

This came from a proposal by Karsten that was agreed to by people from
Berkeley, and details like "ifd_lp64" were their suggestion, so
either the previous proposal or the one above is OK with me, as long
as the separation is clear and allows multiarch between 32 and 64 bits,
or weird combinations within the same XLEN if the need arises.