Hi all,
apologies if my question looks a bit naive, I'm definitely not an expert
in memory models.
I was looking at the code that riscv64-unknown-linux-gnu GCC 8.1
generates for the following C11 program:
-- atomic.c
#include <stdatomic.h>
atomic_int x;
void foo(void) { atomic_fetch_add_explicit(&x, 5, memory_order_acq_rel); }
-- end of atomic.c
For the atomic fetch-and-add, GCC emits this sequence:
fence iorw,ow;
amoadd.w.aq zero,a4,0(a5)
Reading the GCC source, the first fence implements the release semantics.
My question is: why didn't GCC just emit the following instead?
amoadd.w.aq.rl zero,a4,0(a5)
(no fence intended :)
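To put the acq_rel case in context, it may help to compile the same operation under every C11 memory order and compare the assembly side by side. The file below is only a sketch: the file name orderings.c and the add_* function names are hypothetical, and which fences or aq/rl bits come out depends on the compiler version.

```c
/* orderings.c: hypothetical companion to atomic.c, one function per memory order. */
#include <stdatomic.h>

atomic_int x;

/* Each function adds 5 to x and returns the value x held before the addition. */
int add_relaxed(void) { return atomic_fetch_add_explicit(&x, 5, memory_order_relaxed); }
int add_acquire(void) { return atomic_fetch_add_explicit(&x, 5, memory_order_acquire); }
int add_release(void) { return atomic_fetch_add_explicit(&x, 5, memory_order_release); }
int add_acq_rel(void) { return atomic_fetch_add_explicit(&x, 5, memory_order_acq_rel); }
int add_seq_cst(void) { return atomic_fetch_add_explicit(&x, 5, memory_order_seq_cst); }
```

Assuming the cross compiler is installed as riscv64-unknown-linux-gnu-gcc, something like "riscv64-unknown-linux-gnu-gcc -O2 -S orderings.c" should leave the generated code for each function in orderings.s.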
Is this because we still have to order both device I/O and memory
accesses before the AMO itself? I infer this from 7.1p2 of the current
RISC-V ISA draft [1] (emphasis mine):
> To provide more efficient support for release consistency [10], each
atomic instruction has two bits, aq and rl, used to specify additional
memory ordering constraints as viewed by other RISC-V harts. **The bits
order accesses to one of the two address domains, memory or I/O,
depending on which address domain the atomic instruction is accessing.**
My understanding is that FENCE is the only instruction that can order
the two domains. I infer that from 7.1p1 (emphasis mine):
> The base RISC-V ISA has a relaxed memory model, with the FENCE
instruction used to impose additional ordering constraints. The address
space is divided by the execution environment into memory and I/O
domains, and the FENCE instruction provides options to order accesses to
one or **both of these two address domains**.
But then I'm mildly confused by a later note in 7.3:
> "The AMOs were designed to implement the C11 and C++11 memory models
efficiently. Although the FENCE R, RW instruction suffices to implement
the acquire operation and FENCE RW, W suffices to implement release,
both imply additional unnecessary ordering as compared to AMOs with the
corresponding aq or rl bit set."
The overall message makes sense to me, but I'm confused as to why the
spec suggests "fence rw, w" and not "fence iorw, ow" (it looks like GCC
extends 'r' to include 'i' and 'w' to include 'o').
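The fence-versus-AMO-bit distinction in the 7.3 note has a rough C11 counterpart: a standalone atomic_thread_fence followed by a relaxed operation, versus a memory order attached to the operation itself. The sketch below is only meant for comparing the two in the compiler output. The counter y and both function names are hypothetical, and no particular instruction sequence is implied for any GCC version.

```c
/* release_variants.c: hypothetical file contrasting two ways to request release ordering. */
#include <stdatomic.h>

atomic_int y; /* hypothetical counter, separate from x in atomic.c */

/* Release expressed as a standalone fence, followed by a relaxed fetch-and-add. */
int add_with_release_fence(void) {
    atomic_thread_fence(memory_order_release);
    return atomic_fetch_add_explicit(&y, 5, memory_order_relaxed);
}

/* Release attached to the fetch-and-add itself. */
int add_with_release_op(void) {
    return atomic_fetch_add_explicit(&y, 5, memory_order_release);
}
```

In C11 terms the fence form orders all earlier memory accesses in the thread against later atomic stores, while the operation form only attaches release semantics to that one read-modify-write, which parallels the note's point that fences imply more ordering than a tagged AMO.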
Thank you very much,
[1]
https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20180808-ce5e74a
--
Roger Ferrer Ibáñez -
roger....@bsc.es
Barcelona Supercomputing Center - Centro Nacional de Supercomputación
http://bsc.es/disclaimer