gcc reorder instructions question


Liviu Ionescu

Dec 4, 2017, 6:53:19 PM
to RISC-V SW Dev, Palmer Dabbelt
(This is a question for Palmer and/or the other compiler gurus)

With a sequence like:

```c
volatile uint32_t array[2];

volatile uint32_t *p = array;

uint32_t v1 = *p;
uint32_t v2 = *(p+1);
```

Given the current ISO/ANSI specs and the current behaviour of RISC-V GCC, are the last two statements always executed in this order, or may the compiler, for whatever reason, reorder them so that the accesses are done in reverse order?

If the instructions may be reordered, is there any trick available (memory barriers, atomics, or anything else) to prevent this?

Same question for

```c
volatile uint64_t long_variable;

volatile uint32_t *p = (volatile uint32_t*)&long_variable; /* keep the volatile qualifier in the cast */

uint32_t v1 = *p;
uint32_t v2 = *(p+1);
```

In other words, is it possible, on a 32-bit platform, with the current RISC-V GCC, to find a way to always access the two halves of a 64-bit variable (or memory-mapped register) in the desired order?


Thank you,

Liviu


Jatin Bhateja

Dec 4, 2017, 7:47:48 PM
to Liviu Ionescu, RISC-V SW Dev, Palmer Dabbelt
The ability of GCC/LLVM's address sanitizer to catch initialization-order issues might come to your rescue here; it can be enabled with ASAN_OPTIONS=check_initialization_order

Thanks,
Jatin
--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/D185C88E-AB3A-463A-9A50-7912B0E99B0C%40livius.net.

evan...@maximintegrated.com

Dec 4, 2017, 8:00:49 PM
to RISC-V SW Dev, pal...@sifive.com
Not sure this has to do with initialization issues.

Outside of a bug in GCC, the answer is "No", volatile accesses can't be reordered with respect to each other across sequence points (different statements, basically). However, non-volatile access could be reordered across volatile accesses. To prevent this, you'd need a platform specific barrier instruction, or the following portable inline "assembly":

__asm__ __volatile__("":::"memory");

This may be GCC specific and not part of ISO/ANSI C.
-Evan
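
As a rough illustration of the barrier mentioned above, the sketch below keeps a non-volatile access from being moved across a volatile one. The names `COMPILER_BARRIER`, `payload`, `ready_flag`, `publish` and `consume` are made up for this example. The empty asm with a "memory" clobber is a GCC/Clang extension; it emits no instruction and only constrains the compiler, not the CPU.

```c
#include <stdint.h>

/* GCC/Clang extension: empty asm with a "memory" clobber.
 * The compiler may not move memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static volatile uint32_t ready_flag; /* hypothetical: stands in for a device register */
static uint32_t payload;             /* ordinary, non-volatile data */

static void publish(uint32_t value)
{
    payload = value;     /* non-volatile store */
    COMPILER_BARRIER();  /* keep the store above ahead of the volatile store below */
    ready_flag = 1;      /* volatile store */
}

static uint32_t consume(void)
{
    while (ready_flag == 0) {
        /* each iteration performs a real volatile load */
    }
    COMPILER_BARRIER();  /* keep the non-volatile load below after the volatile loads */
    return payload;
}
```

Without the barrier, the compiler would still keep the volatile accesses in order relative to each other, but it would be free to move the accesses to `payload` around them.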

Liviu Ionescu

Dec 4, 2017, 8:02:06 PM
to Jatin Bhateja, RISC-V SW Dev, Palmer Dabbelt


> On 5 Dec 2017, at 02:47, Jatin Bhateja <jatin....@gmail.com> wrote:
>
> The ability of GCC/LLVM's address sanitizer to catch initialization-order issues might come to your rescue here; it can be enabled with ASAN_OPTIONS=check_initialization_order

I'm not sure I follow you.

the question is not how to build a custom toolchain, the question is what is the behaviour with the official toolchain, and if I can get the desired order.


regards,

Liviu

evan...@maximintegrated.com

Dec 4, 2017, 8:02:26 PM
to RISC-V SW Dev, pal...@sifive.com
As you can imagine, reordering of volatile accesses would wreak havoc on any IO/peripheral driver code through memory-mapped registers. This code would basically become impossible to write; you'd have to include a barrier instruction between every single line.

Liviu Ionescu

Dec 4, 2017, 8:07:28 PM
to evan...@maximintegrated.com, RISC-V SW Dev, pal...@sifive.com


> On 5 Dec 2017, at 03:00, evan...@maximintegrated.com wrote:
>
> Outside of a bug in GCC, the answer is "No", volatile accesses can't be reordered with respect to each other across sequence points (different statements, basically).

aha. so separate statements with volatile accesses should be safe.

how about a direct access to a 64-bit variable on a 32-bit platform? is the order of the two word accesses always the same, for a given endianness?


regards,

Liviu


Jatin Bhateja

Dec 4, 2017, 8:28:02 PM
to evan...@maximintegrated.com, RISC-V SW Dev, pal...@sifive.com


On Tuesday, December 5, 2017, <evan...@maximintegrated.com> wrote:
> Not sure this has to do with initialization issues.
>
> Outside of a bug in GCC, the answer is "No", volatile accesses can't be reordered with respect to each other across sequence points (different statements, basically). However, non-volatile access could be reordered across volatile accesses. To prevent this, you'd need a platform specific barrier instruction, or the following portable inline "assembly":
>
> __asm__ __volatile__("":::"memory");

Fences ensure that memory operations get committed to main memory from caches / store buffers. Globals get their initial values either at load time, if constant, or during static initialization, which happens before entry to main. So if we could inject a barrier after each global during static initialization, then of course we could prevent such ordering issues, but I am not aware of any mechanism to access static initialization from a high-level language. Compilers, however, can do so when emitting code under a special mode.

> This may be GCC specific and not part of ISO/ANSI C.
> -Evan


--
Jatin Bhateja

Jatin Bhateja

Dec 4, 2017, 8:31:54 PM
to Liviu Ionescu, RISC-V SW Dev, Palmer Dabbelt


On Tuesday, December 5, 2017, Liviu Ionescu <i...@livius.net> wrote:
> > On 5 Dec 2017, at 02:47, Jatin Bhateja <jatin....@gmail.com> wrote:
> >
> > The ability of GCC/LLVM's address sanitizer to catch initialization-order issues might come to your rescue here; it can be enabled with ASAN_OPTIONS=check_initialization_order
>
> I'm not sure I follow you.

Yes, ASAN can help by reporting such issues, but it does not inhibit reordering.

> the question is not how to build a custom toolchain, the question is what is the behaviour with the official toolchain, and if I can get the desired order.
>
> regards,
>
> Liviu



--
Jatin Bhateja

Jatin Bhateja

Dec 4, 2017, 8:34:42 PM
to evan...@maximintegrated.com, RISC-V SW Dev, pal...@sifive.com


On Tuesday, December 5, 2017, Jatin Bhateja <jatin....@gmail.com> wrote:
> Fences ensure that memory operations get committed to main memory from caches / store buffers. Globals get their initial values either at load time, if constant, or during static initialization, which happens before entry to main. So if we could inject a barrier after each global during static initialization, then of course we could prevent such ordering issues, but I am not aware of any mechanism to access static initialization from a high-level language. Compilers, however, can do so when emitting code under a special mode.

I meant dynamic initialization of globals in the above context.


--
Jatin Bhateja

Liviu Ionescu

Dec 4, 2017, 8:35:16 PM
to Jatin Bhateja, evan...@maximintegrated.com, RISC-V SW Dev, pal...@sifive.com


> On 5 Dec 2017, at 03:27, Jatin Bhateja <jatin....@gmail.com> wrote:
>
> ... static initialization

ah, sorry for the misunderstanding, the code looks like static initialisations, but the question is actually related to normal use in a function.

> Yes, ASAN can help by reporting such issues, but it does not inhibit reordering.

is the RISC-V toolchain built with this option?


regards,

Liviu




Jatin Bhateja

Dec 4, 2017, 8:55:16 PM
to Liviu Ionescu, evan...@maximintegrated.com, RISC-V SW Dev, pal...@sifive.com
LLVM supports address sanitization with the following option:

 -fsanitize=address 

AddressSanitizerFlags · google/sanitizers Wiki · ...
https://github.com › google › AddressSa...

GCC's sanitizer implementation is a blatant ripoff of LLVM's.

Thanks

Jim Wilson

Dec 4, 2017, 9:40:13 PM
to Liviu Ionescu, evan...@maximintegrated.com, RISC-V SW Dev, Palmer Dabbelt
On Mon, Dec 4, 2017 at 5:07 PM, Liviu Ionescu <i...@livius.net> wrote:
> how about a direct access to a 64-bit variable on a 32-bit platform? is the order of the two word accesses always the same, for a given endianness?

The compiler will split the 64-bit load into two 32-bit loads. There
is no guarantee that two different compilers will order the same, or
that two different versions of the same compiler will order them the
same. However, for one version of one compiler, for a volatile 64-bit
load, you will get two volatile 32-bit loads after the split, and they
should not be reordered, so the order of the accesses should stay the
same if the compiler doesn't change.

Jim

Jatin Bhateja

Dec 5, 2017, 2:29:04 AM
to Jim Wilson, Liviu Ionescu, evan...@maximintegrated.com, RISC-V SW Dev, Palmer Dabbelt
Yeah, so the initialization ordering of the split halves of a volatile 64-bit global must be preserved by the compiler as per the architecture's endianness. But for volatile array elements of legal architecture types, the compiler does not give any ordering guarantee.

Thanks


Liviu Ionescu

Dec 5, 2017, 5:05:21 AM
to Jim Wilson, evan...@maximintegrated.com, RISC-V SW Dev, Palmer Dabbelt


> On 5 Dec 2017, at 04:40, Jim Wilson <ji...@sifive.com> wrote:
>
> However, for one version of one compiler, for a volatile 64-bit
> load, you will get two volatile 32-bit loads after the split, and they
> should not be reordered, so the order of the accesses should stay the
> same if the compiler doesn't change.

ok. for the current compilers (GCC 7.2 and probably LLVM too) I guess this is true, but... no guarantees for future versions.

> On 5 Dec 2017, at 03:02, evan...@maximintegrated.com wrote:
>
> As you can imagine, reordering of volatile accesses would wreak havoc on any IO/peripheral driver code through memory-mapped registers. This code would basically become impossible to write; you'd have to include a barrier instruction between every single line.

so the portable solution is to manually issue the two 32-bit reads as volatile accesses, in the desired order, and reassemble the 64-bit result; this should not be reordered by the compiler.
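
A minimal sketch of that approach, assuming a little-endian layout (low word at the lower address, as on RISC-V) and a 4-byte-aligned register pair. The helper names `read64_lo_hi` and `read64_hi_lo` and the `fake_reg64` array are hypothetical; the array only stands in for a memory-mapped register so the code can be tried on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit memory-mapped register (two 32-bit words). */
static volatile uint32_t fake_reg64[2] = { 0x89abcdefu, 0x01234567u };

/* Read the low word first, then the high word. Each load is a separate
 * volatile access in its own statement, so the compiler keeps this order. */
static inline uint64_t read64_lo_hi(const volatile void *addr)
{
    const volatile uint32_t *p = (const volatile uint32_t *)addr;
    assert(((uintptr_t)addr & 3u) == 0); /* expect natural 32-bit alignment */
    uint32_t lo = p[0]; /* first access */
    uint32_t hi = p[1]; /* second access */
    return ((uint64_t)hi << 32) | lo;
}

/* Same value, but the high word is read first. */
static inline uint64_t read64_hi_lo(const volatile void *addr)
{
    const volatile uint32_t *p = (const volatile uint32_t *)addr;
    assert(((uintptr_t)addr & 3u) == 0);
    uint32_t hi = p[1]; /* first access */
    uint32_t lo = p[0]; /* second access */
    return ((uint64_t)hi << 32) | lo;
}
```

This only pins down the order in which the compiler emits the two loads; whether the hardware can still reorder or merge them depends on the platform and on how the region is mapped.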


thank you,

Liviu


Arnd Bergmann

Dec 5, 2017, 8:56:38 AM
to Liviu Ionescu, RISC-V SW Dev, Palmer Dabbelt
What we do in the Linux kernel is to always do MMIO accesses through inline assembly in a macro or inline function. Having the compiler reorder the accesses is not the only problem this solves; others include:

- If the compiler determines that 'long_variable' may not be naturally aligned, it could split up the access further and do byte accesses, which you don't want.
- Depending on the device, the order of the two halves for a 64-bit access may be different. In particular for writes, some devices require accessing the upper half first, others need the lower half first.
- For portable code, you want to handle endianness conversion here. Many architectures eventually grow both a little-endian and a big-endian mode, and the CPU architecture can be switched, but MMIO registers are usually fixed. This might not happen on RISC-V, but it makes sense to handle it anyway.
- Depending on how the registers are mapped, the CPU might reorder the two accesses after all, or combine them into a larger access, so you might need a fence in between. This should not be needed on RISC-V.
- If you surround your code with locking to protect access to the device, you need extra fences to ensure that the MMIO access doesn't leak outside the lock.
- If there is DMA triggered by an MMIO access, or the MMIO access tells you about DMA completion, you also need barriers.

None of the above are handled by the 'volatile' keyword.
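
A rough sketch of what such an accessor might look like, covering only the first two points (fixed access width and a device-specific order of the two halves). The names `mmio_write32`, `mmio_write64_lo_hi`, `mmio_write64_hi_lo` and `fake_reg` are hypothetical and this is not the kernel's actual code; the non-RISC-V branch is a plain volatile fallback added so the sketch can be compiled on a host. Endianness conversion, fences, locking and DMA barriers are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit device register (two 32-bit words). */
static volatile uint32_t fake_reg[2];

/* Write exactly one 32-bit word. */
static inline void mmio_write32(volatile void *addr, uint32_t value)
{
    assert(((uintptr_t)addr & 3u) == 0); /* expect natural 32-bit alignment */
#if defined(__riscv)
    /* A single store-word instruction: the compiler cannot split or merge it. */
    __asm__ __volatile__("sw %0, 0(%1)" : : "r"(value), "r"(addr) : "memory");
#else
    /* Host fallback (assumption): an ordinary volatile store. */
    *(volatile uint32_t *)addr = value;
#endif
}

/* For devices that want the lower half written first. */
static inline void mmio_write64_lo_hi(volatile void *addr, uint64_t value)
{
    volatile uint32_t *p = (volatile uint32_t *)addr;
    mmio_write32(&p[0], (uint32_t)value);
    mmio_write32(&p[1], (uint32_t)(value >> 32));
}

/* For devices that want the upper half written first. */
static inline void mmio_write64_hi_lo(volatile void *addr, uint64_t value)
{
    volatile uint32_t *p = (volatile uint32_t *)addr;
    mmio_write32(&p[1], (uint32_t)(value >> 32));
    mmio_write32(&p[0], (uint32_t)value);
}
```

Keeping the order in the function name makes the choice visible at each call site, so a driver picks the variant its device requires.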

Arnd

Liviu Ionescu

Dec 5, 2017, 9:17:29 AM
to Arnd Bergmann, RISC-V SW Dev, Palmer Dabbelt


> On 5 Dec 2017, at 15:56, Arnd Bergmann <ar...@arndb.de> wrote:
>
> ... always do MMIO accesses through inline assembly

thank you, Arnd, I know the assembly way, but that's exactly what I'm trying to avoid...

for the linux kernel it is probably a natural solution, you write it once, then it remains hidden in the kernel or module, and millions of users happily use it.

in the bare-metal embedded world, each developer deals with the hardware directly, and most of the time they try to outsmart the library provided code, so the less assembly code required, the better.


regards,

Liviu

Anton Krug

Dec 6, 2017, 7:09:11 AM
to Liviu Ionescu, Arnd Bergmann, RISC-V SW Dev, Palmer Dabbelt

I know this is not a real solution, only a hack, but what about wrapping the read in a function and using __attribute__ to force no optimizations, so that the compiler doesn't try to improve it? As the compiler evolves it could probably still break, so it won't be safe. The assembly approach mentioned earlier would be safer.


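__attribute__((optimize("O0"))) long read(int addr) {
    /* assumed body, not given in the original post: a single volatile load */
    return *(volatile long *)(long)addr;
}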

