gnu toolchain problems with medany and explicit-relocs

278 views
Skip to first unread message

Jim Wilson

unread,
Aug 27, 2018, 2:17:21 PM8/27/18
to RISC-V SW Dev
Consider this testcase.

int array[995] = { [10] 10, [99] 99 };
long long ll = 100;

long long
sub (void)
{
return ll;
}

int
main (void)
{
return sub ();
}

If I use riscv-gnu-toolchain, configured for rv32i newlib, and compile
it with -O -mcmodel=medany -mexplicit-relocs, in the assembly output I
see

sub:
.LA0: auipc a5,%pcrel_hi(ll)
lw a0,%pcrel_lo(.LA0)(a5)
lw a1,%pcrel_lo(.LA0+4)(a5)
ret

which looks reasonable. Though maybe that should be %pcrel_lo(.LA0)+4
instead, because the +4 is added to the address of ll not the address
of .LA0. However, when I disassemble the a.out file, I see

000101ac <sub>:
101ac: 00002797 auipc a5,0x2
101b0: 7fc7a503 lw a0,2044(a5) # 129a8 <ll>
101b4: 8007a583 lw a1,-2048(a5)
101b8: 00008067 ret

and note that the +4 offset overflowed giving silent bad code.

I carefully choose the array size to force the error. if you have a
slightly different version or configuration of the tools, you might
need a different array size to see the error.

The problem here is that while the variable ll is 8-byte aligned, the
auipc is not aligned, and medany is using the offset between the auipc
and ll, so this offset is not a multiple of 8. The auipc is only
guaranteed to have 4-byte alignment without the C extension, and 2
byte alignment with the C extension. GCC is assuming that any offset
smaller than the alignment of the variable is safe, which is not true
in this case.

The same problem can happen for both rv32 and rv64 when using long
doubles and int128_t, which requires 16-byte alignment. We don't have
anything that requires more than 16-byte alignment though, so the
problem ends here.

Unfortunately, I don't see an obvious, easy, and good solution for this.

We could disallow offsets with pcrel_lo, but that means medany code
won't be as efficient as medlow because it will need extra address
generation instructions.

We could force alignment of auipc, but that means potentially emitting
multiple nops before auipc, which again hurts medany code size and
performance.

We could maybe change the code sequence to something like
aupic %pcrel_hi
addi %pcrel_addi
lw %pcrel_lo_with_addi
lw %pcrel_lo_with_addi+4
and then the new pcrel_addi reloc adds a value if necessary to avoid
overflow, and the pcrel_lo_with_addi subtracts the same value. The
addi can then be deleted via relaxation if it is unnecessary.
However, cleanly specifying and implementing these relocs could be a
problem because of the complex interactions between them.

Other solutions might involve defining a new code model, a new ABI, or
adding new instructions to the ISA, all of which I'm hoping to avoid.

While testing this support, I've also managed to find two binutils
bugs that can result in link time errors when pcrel_lo is used with an
offset. Though exactly how those should be fixed depends on how
exactly we decide to fix the gcc problem. There is also the (third)
linker problem of silently creating bad code when pcrel_lo+offset
overflows. I can add an error for that, but if someone hits it, there
isn't anything they can do to fix it, other than to recompile without
-mexplicit-relocs.

Meanwhile, with the gnu toolchain, use of -mcmodel=medany is safe, but
use of both -mcmodel=medany and -mexplicit-relocs together is not
safe.

Jim

Bruce Hoult

unread,
Aug 27, 2018, 8:26:09 PM8/27/18
to Jim Wilson, RISC-V SW Dev
I needed a different size to make it happen. I used 1085 and got:

00010152 <sub>:
   10152:       00002797                auipc   a5,0x2
   10156:       7fe7a503                lw      a0,2046(a5) # 12950 <ll>
   1015a:       8027a583                lw      a1,-2046(a5)
   1015e:       8082                    ret

Near enough I think.

I tried using a struct containing two longs instead of a long long. I had to fiddle with the size and also add a dummy function to adjust the alignment of the auipc in sub() because gcc decided to add a gratuitous stack frame to sub():

struct s {long l; long h;};

int array[1082] = { [10] 10, [99] 99 };
struct s ll = {2, 4};

int
foo(){
  return 1;
}

struct s
sub (void)
{
    return ll;
}

int
main (void)
{
    struct s r = sub();
    return r.h*10 + r.l; // exit value is 42
}

Now, for sub() I get:

00010156 <sub>:
   10156:       1141                    addi    sp,sp,-16
   10158:       00002797                auipc   a5,0x2
   1015c:       7fc78713                addi    a4,a5,2044 # 12954 <ll>
   10160:       7fc7a503                lw      a0,2044(a5)
   10164:       434c                    lw      a1,4(a4)
   10166:       0141                    addi    sp,sp,16
   10168:       8082                    ret

So gcc is already capable of generating basically the same code as you suggested. I don't know why it generates "lw  a0,2044(a5)" instead of "lw  a0,0(a4)" (or even just doing the addi to a5) but other than a couple of bytes of code size it doesn't matter too much.


Jim

--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/CAFyWVaZ49CsB4f%2B8skzJt4TRGRjoG8VO1Fkv5H9wqrRXKMc3eg%40mail.gmail.com.

Jim Wilson

unread,
Aug 27, 2018, 9:11:26 PM8/27/18
to Bruce Hoult, RISC-V SW Dev
On Mon, Aug 27, 2018 at 5:26 PM, Bruce Hoult <bruce...@sifive.com> wrote:
> So gcc is already capable of generating basically the same code as you
> suggested. I don't know why it generates "lw a0,2044(a5)" instead of "lw
> a0,0(a4)" (or even just doing the addi to a5) but other than a couple of
> bytes of code size it doesn't matter too much.

The gcc optimization support for medany isn't very good, and sometimes
it does stupid things like computing addresses multiple times, or in
multiple different redundant ways. This is the problem I was supposed
to look at, then I started finding problems, of which the silent
overflow of offsets is the worst one.

Jim

Michael Clark

unread,
Aug 28, 2018, 2:03:55 AM8/28/18
to Jim Wilson, RISC-V SW Dev
Isn’t the only necessary change to get the compiler to emit %pcrel_lo(.LA0)+4 instead of %pcrel_lo(.LA0+4)?

Does this require some changes to expression parsing to make sure in the case that there is only an addition and that it is outside of the expression? It doesn’t make sense to add to the label in this case as %pcrel_lo uses the label to find the reloc it is relative to.

Can %pcrel_lo be changed to only contain a symbol/label and for the addend to be derived from the whole expression? or does this differ to the way GCC handles addends for explicit relocs? It seems clear to me that in the case of RISC-V that the addend needs to be applied last after the indirect reloc address calculation.

> While testing this support, I've also managed to find two binutils
> bugs that can result in link time errors when pcrel_lo is used with an
> offset. Though exactly how those should be fixed depends on how
> exactly we decide to fix the gcc problem. There is also the (third)
> linker problem of silently creating bad code when pcrel_lo+offset
> overflows. I can add an error for that, but if someone hits it, there
> isn't anything they can do to fix it, other than to recompile without
> -mexplicit-relocs.
>
> Meanwhile, with the gnu toolchain, use of -mcmodel=medany is safe, but
> use of both -mcmodel=medany and -mexplicit-relocs together is not
> safe.

Should -mexplicit-relocs be renamed to -mexperimental-explicit-relocs or the option should be disabled until it is safe?

> Jim
>
> --
> You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.

Michael Clark

unread,
Aug 28, 2018, 2:30:39 AM8/28/18
to Michael Clark, Jim Wilson, RISC-V SW Dev
Oh I see what you mean. Interesting problem. Can’t the same thing happen with two %lo references sharing an LUI?
> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/18CC3D2A-163C-427B-899F-1D3F0DEDB08A%40mac.com.

Jim Wilson

unread,
Aug 28, 2018, 11:37:57 AM8/28/18
to Michael Clark, RISC-V SW Dev
On Mon, Aug 27, 2018 at 11:03 PM, Michael Clark <michae...@mac.com> wrote:
> Isn’t the only necessary change to get the compiler to emit %pcrel_lo(.LA0)+4 instead of %pcrel_lo(.LA0+4)?
>
> Does this require some changes to expression parsing to make sure in the case that there is only an addition and that it is outside of the expression? It doesn’t make sense to add to the label in this case as %pcrel_lo uses the label to find the reloc it is relative to.

This requires both a gcc and a gas change, and maybe some
clarification in the psABI. This is easy enough to do, but it doesn't
fix the underlying problem though, and the solution for that problem
affects how this problem needs to be solved, so the underlying problem
should be dealt with first.

Jim

Jim Wilson

unread,
Aug 28, 2018, 11:47:46 AM8/28/18
to Michael Clark, RISC-V SW Dev
On Mon, Aug 27, 2018 at 11:30 PM, Michael Clark <michae...@mac.com> wrote:
>> Should -mexplicit-relocs be renamed to -mexperimental-explicit-relocs or the option should be disabled until it is safe?

Changing gcc option names causes other problems, as you break
makefiles, and then have to answer lots of questions from lots of
different people about why the name changed. Emitting a warning or
error by default when the option is used is probably better. Or maybe
accepting the option and ignoring it. Gcc has a number of options
that do absolutely nothing at all. This is because if you drop an
option you get lots of complaints from people about build failures.
But if you leave an option in place and modify it to do nothing, few
people will notice and complain. So the latter choice causes less
trouble in the long run.

Jim
Reply all
Reply to author
Forward
0 new messages