Data-Cache misaligned access


vithurson subasharan

Aug 23, 2017, 7:00:00 AM
to RISC-V SW Dev
Hi all,
I am developing a 32-bit RISC-V processor implementation for my undergraduate bachelor's FYP.
1)  Should the architecture support misaligned data memory accesses? Is there any possibility that the RISC-V toolchain compiler will generate code that requires misaligned access?
2)  For the SB and SH instructions, should we take the 8-bit or 16-bit data, zero-extend it to 32 bits, and write it to memory as a 32-bit value, or write only the specified 8 or 16 bits?

Tommy Murphy

Aug 23, 2017, 7:13:16 AM
to RISC-V SW Dev
From the spec:



2.6 Load and Store Instructions 
 
The base ISA supports misaligned accesses,

and

2.2 Base Instruction Formats 
 
Except for the 5-bit immediates used in CSR instructions (Section 2.8), immediates are always sign-extended, 

Hope this helps.

Tommy Murphy

Aug 23, 2017, 7:17:29 AM
to RISC-V SW Dev
Sorry - the second snippet above is not relevant.
The spec seems a bit vague on SB/SH, but I believe that, unless explicitly noted otherwise, immediate and register values are sign-extended by default.
So I would expect SH and SB to sign-extend the (partial) register value before writing it to memory.

Richard Herveille

Aug 23, 2017, 7:27:39 AM
to vithurson subasharan, RISC-V SW Dev, Richard Herveille

 

#1 The ISA does not not allow misaligned memory accesses. However, it seems unlikely that GCC would generate code that relies on misaligned accesses, as that would break implementations that do not support them.

#2 Only write the relevant bits. So an SB writes only 8 bits of data.

 

Richard

 

 

 

 


Tommy Murphy

Aug 23, 2017, 7:44:45 AM
to RISC-V SW Dev, vith...@gmail.com, richard....@roalogic.com
OK - apologies if what I posted was inaccurate.
The spec does say the following, so admittedly it says nothing about sign extension:

The SW, SH, and SB instructions store 32-bit, 16-bit, and 8-bit values from the low bits of register rs2 to memory.

> The ISA does not not allow misaligned memory accesses.

:-)

> However, it seems unlikely that GCC would generate code that relies on misaligned accesses, as that would break implementations that do not support them.

But could other tools - e.g. clang/llvm?

Note that even picorv32 has an option to disable misaligned memory access trapping:

Bruce Hoult

Aug 23, 2017, 8:18:00 AM
to vithurson subasharan, RISC-V SW Dev
On Wed, Aug 23, 2017 at 2:00 PM, vithurson subasharan <vith...@gmail.com> wrote:
> Hi all,
> I am developing a 32-bit RISC-V processor implementation for my undergraduate bachelor's FYP.
> 1)  Should the architecture support misaligned data memory accesses? Is there any possibility that the RISC-V toolchain compiler will generate code that requires misaligned access?

Misaligned access support is not required in M mode; for example, the SiFive FE310-G000 SoC on the HiFive1 board does not support misaligned accesses.

gcc will never create code that generates a misaligned access unless the user explicitly casts integers to pointers.

I believe U-mode code must work with misaligned accesses, but it's OK to implement that in software in a trap handler.

> 2)  For the SB and SH instructions, should we take the 8-bit or 16-bit data, zero-extend it to 32 bits, and write it to memory as a 32-bit value, or write only the specified 8 or 16 bits?

Certainly a store must not touch data the user doesn't want changed!!!

If your memory is only writable 32 bits at a time, then you must make SB and SH read the old 32-bit value, modify only the appropriate bytes, and then write everything back. But I think it would be more common for 32-bit-wide memory to have 4 "write-enable" lines, one for each byte?
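
As a rough illustration of the byte write-enable idea, the following C model computes a 4-bit byte-enable mask and the lane-shifted data for SB, SH, and SW on a 32-bit-wide memory. It is only a sketch under assumptions that are not in the post above: little-endian byte lanes, naturally aligned stores only, and hypothetical names (`store_strobe_t`, `store_strobe`, `apply_store`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit memory word with four byte write-enables.
 * Assumes little-endian byte lanes and a naturally aligned store. */
typedef struct {
    uint32_t wdata;  /* store data shifted into the addressed byte lanes */
    uint8_t  wstrb;  /* bit i set => byte lane i is written */
} store_strobe_t;

/* size_bytes is 1 (SB), 2 (SH) or 4 (SW); addr is the byte address. */
static store_strobe_t store_strobe(uint32_t addr, uint32_t rs2, unsigned size_bytes)
{
    unsigned lane = addr & 3u;
    store_strobe_t s;

    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);
    assert(lane % size_bytes == 0);  /* misaligned stores are not modeled */

    s.wstrb = (uint8_t)(((1u << size_bytes) - 1u) << lane);
    s.wdata = rs2 << (8u * lane);
    return s;
}

/* What the memory does with the strobes: only enabled lanes change. */
static uint32_t apply_store(uint32_t old_word, store_strobe_t s)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (s.wstrb & (1u << i))
            mask |= 0xFFu << (8u * i);
    }
    return (old_word & ~mask) | (s.wdata & mask);
}
```

For example, an SB of 0xAABBCCDD to byte lane 1 of a word holding 0x11223344 gives wstrb = 0b0010 and a resulting word of 0x1122DD44; the other three bytes are untouched.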

David Chisnall

Aug 23, 2017, 8:39:15 AM
to Bruce Hoult, vithurson subasharan, RISC-V SW Dev
On 23 Aug 2017, at 13:17, Bruce Hoult <br...@hoult.org> wrote:
>
> gcc will never create code that generates a misaligned access unless the user explicitly casts integers to pointers.

Unless GCC’s behaviour is *very* different from LLVM’s here, I do not believe that this is true.

There are basically two possibilities for a compiler:

- Assume all accesses are correctly aligned[1] and emit instructions assuming that the processor can handle the case where they’re not.

- Assume all accesses are misaligned unless they can be proven to be aligned and emit a sequence of smaller loads / stores to guarantee correct alignment[2].

For most input code, the former assumption will generate noticeably faster binaries, even if you have to trap-and-emulate for all unaligned accesses, because the penalty for unaligned loads (even if you have a MIPS-style lwl / lwr, which RISC-V lacks) is very high, even just in terms of increased i-cache usage.

For CHERI, we made the processor handle all unaligned loads and stores within a cache line and only trap for ones that span a cache line boundary. The traps are very rare and we get a big win in both performance and binary size compared to our earlier approach of not generating unaligned accesses in the compiler.

David

[1] Technically, if they are not and the source language is C, then it’s undefined behaviour.

[2] Assuming 100% correct source code with no undefined behaviour - even with this strategy, things such as casting an unaligned buffer to void* then casting it to int* and loading it may generate misaligned accesses.
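
To make the second strategy concrete, here is roughly what has to be emitted for a 32-bit load that cannot be proven aligned: byte loads combined with shifts and ORs. This is a sketch assuming little-endian byte order; `load32_bytewise` is a hypothetical name for illustration, not actual compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Load a 32-bit little-endian value from a possibly misaligned address using
 * only byte accesses, which can never be misaligned. */
static uint32_t load32_bytewise(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Without lwl / lwr-style instructions this is on the order of ten instructions in place of a single lw, which is where the code-size and i-cache cost comes from.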

Bruce Hoult

Aug 23, 2017, 9:04:41 AM
to David Chisnall, vithurson subasharan, RISC-V SW Dev
On Wed, Aug 23, 2017 at 3:39 PM, David Chisnall <David.C...@cl.cam.ac.uk> wrote:
> On 23 Aug 2017, at 13:17, Bruce Hoult <br...@hoult.org> wrote:
> >
> > gcc will never create code that generates a misaligned access unless the user explicitly casts integers to pointers.
>
> Unless GCC’s behaviour is *very* different from LLVM’s here, I do not believe that this is true.
>
> There are basically two possibilities for a compiler:
>
>  - Assume all accesses are correctly aligned[1] and emit instructions assuming that the processor can handle the case where they’re not.

And all accesses to simple variables, array elements, struct fields *will* be correctly aligned if the user doesn't cast a pointer to int or char*, do some arithmetic, and cast back. Which is what I said.

Ok, I forgot the case of specifying field widths explicitly in structs. I should think the compiler would do those accesses the slow way, using aligned accesses.

David Chisnall

Aug 23, 2017, 9:08:41 AM
to Bruce Hoult, vithurson subasharan, RISC-V SW Dev

> On 23 Aug 2017, at 14:04, Bruce Hoult <br...@hoult.org> wrote:
>
> On Wed, Aug 23, 2017 at 3:39 PM, David Chisnall <David.C...@cl.cam.ac.uk> wrote:
>> On 23 Aug 2017, at 13:17, Bruce Hoult <br...@hoult.org> wrote:
>> >
>> > gcc will never create code that generates a misaligned access unless the user explicitly casts integers to pointers.
>>
>> Unless GCC’s behaviour is *very* different from LLVM’s here, I do not believe that this is true.
>>
>> There are basically two possibilities for a compiler:
>>
>> - Assume all accesses are correctly aligned[1] and emit instructions assuming that the processor can handle the case where they’re not.
>>
> And all accesses to simple variables, array elements, struct fields *will* be correctly aligned if the user doesn't cast a pointer to int or char*, do some arithmetic, and cast back. Which is what I said.

No, you said unless they cast a pointer to an integer. There are a lot of cases in common C idioms that don’t involve casting via an integer, but do involve casting via void* or char* and even more that appear as a result of optimisation (particularly optimisations involving string manipulations).

David


Cesar Eduardo Barros

Aug 23, 2017, 9:11:25 AM
to Bruce Hoult, vithurson subasharan, RISC-V SW Dev
Em 23-08-2017 09:17, Bruce Hoult escreveu:
> On Wed, Aug 23, 2017 at 2:00 PM, vithurson subasharan
> <vith...@gmail.com <mailto:vith...@gmail.com>> wrote:
>
> Hi all,
> I am developing a 32-bit RISC-V processor implementation for my
> undergraduate bachelor's FYP.
> 1) Should the architecture support misaligned data memory accesses?
> Is there any possibility that the RISC-V toolchain compiler will
> generate code that requires misaligned access?
>
>
> Misaligned access support is not required in M mode and for example the
> SiFive FE310-G000 SoC in the HiFive1 board does not support misaligned
> accesses.
>
> gcc will never create code that generates a misaligned access unless the
> user explicitly casts integers to pointers.

It's more like "gcc will always align variables to their natural size,
so access to them will always be aligned". But it's easy to see gcc
generating misaligned access with a packed struct (as could be found for
instance in networking code):

#include <stdint.h>

static void sb(uint8_t *p, uint8_t v) {
    *p = v;
}

static void sh(uint16_t *p, uint16_t v) {
    *p = v;
}

static void sw(uint32_t *p, uint32_t v) {
    *p = v;
}

struct __attribute__((packed)) mystruct {
    uint8_t byte;
    uint16_t half;
    uint32_t word;
};

void test(struct mystruct *s) {
    // If s is aligned, the sh and sw are misaligned.
    sb(&s->byte, 12);
    sh(&s->half, 34);
    sw(&s->word, 56);
}

This generates the following:

0000000000000001 <test>:
   1:   47b1        li   a5,12
   3:   00f50023    sb   a5,0(a0)
   7:   02200793    li   a5,34
   b:   00f510a3    sh   a5,1(a0)
   f:   03800793    li   a5,56
  13:   00f521a3    sw   a5,3(a0)
  17:   8082        ret

As one can see, if the argument happened to be aligned (which will be
the case, for instance, if the struct was allocated on the heap by
malloc), the sh and sw will always be misaligned.
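
The offsets behind that claim can be checked at compile time. The sketch below repeats the packed struct so that it compiles on its own, and assumes a GCC- or Clang-compatible compiler (for `__attribute__((packed))`) in C11 mode; `offset_is_misaligned` is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct __attribute__((packed)) mystruct {
    uint8_t  byte;
    uint16_t half;
    uint32_t word;
};

/* With packing there is no padding, so the fields sit at offsets 0, 1 and 3. */
_Static_assert(offsetof(struct mystruct, byte) == 0, "byte at offset 0");
_Static_assert(offsetof(struct mystruct, half) == 1, "half at offset 1");
_Static_assert(offsetof(struct mystruct, word) == 3, "word at offset 3");
_Static_assert(sizeof(struct mystruct) == 7, "no padding");

/* Nonzero if an access of `size` bytes at `offset` from a suitably aligned
 * base address is not naturally aligned. */
static int offset_is_misaligned(size_t offset, size_t size)
{
    return (offset % size) != 0;
}
```

Offset 1 is not a multiple of 2 and offset 3 is not a multiple of 4, so with an aligned base both the sh and the sw land on misaligned addresses.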

Also, given that most implementations will handle unaligned access in
hardware, it's probable that future gcc or llvm versions will optimize
some access patterns to unaligned loads/stores. For instance, the
following "read unaligned int using memcpy" idiom could be optimized to
a single lw instruction:

#include <stdint.h>
#include <string.h>

uint32_t read_unaligned(uint8_t *p) {
    uint32_t ret;
    memcpy(&ret, p, sizeof(ret));
    return ret;
}

> I believe U mode code must work with misaligned accesses, but it's ok to
> implement that with software in a trap handler.

I believe that, even in M mode, misaligned accesses could be emulated in
software by a trap handler.

> 2) For the SB and SH instructions, should we take the 8-bit or 16-bit
> data, zero-extend it to 32 bits, and write it to memory as a 32-bit
> value, or write only the specified 8 or 16 bits?
>
> Certainly a store must not touch data the user doesn't want changed!!!
>
> If your memory is only writable 32 bits at a time then you must make SB
> and SH read the old 32 bit value, modify only the appropriate bytes, and
> then write everything back. But I think it would be more common for
> 32-bit wide memory to have 4 "write-enable" lines, one for each byte?

I don't think doing a read-modify-write of a full 32 bits for a 16 bit
or 8 bit store is a good idea. One can have for instance a pair of
adjacent 16-bit values being modified at the same time by different
harts, or by a hart and another piece of hardware. If one side updates
its value after the other side reads but before it writes, the update
can be lost.

It's even worse with some memory-mapped hardware, where even writing
what was just read could change the hardware state. When dealing with
hardware registers, the size of the write is as important as the value
being written.
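
The lost-update scenario can be replayed deterministically in plain C. This is only a single-threaded simulation of one unlucky interleaving, with hypothetical names (`merge_half`, `rmw_interleaved`, `halfword_stores`); it does not model real harts or bus timing.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low (high == 0) or high (high != 0) 16-bit half of a word. */
static uint32_t merge_half(uint32_t word, unsigned high, uint16_t v)
{
    return high ? ((word & 0x0000FFFFu) | ((uint32_t)v << 16))
                : ((word & 0xFFFF0000u) | v);
}

/* Agent A stores to the low half and agent B to the high half, both via a
 * 32-bit read-modify-write, interleaved as: A reads, B reads, B writes,
 * A writes. */
static uint32_t rmw_interleaved(uint32_t mem, uint16_t a_val, uint16_t b_val)
{
    uint32_t a_old = mem;                /* A reads  */
    uint32_t b_old = mem;                /* B reads  */
    mem = merge_half(b_old, 1, b_val);   /* B writes */
    mem = merge_half(a_old, 0, a_val);   /* A writes back a stale high half */
    return mem;
}

/* The same two stores performed as genuine 16-bit writes: each one touches
 * only its own half, so neither can undo the other. */
static uint32_t halfword_stores(uint32_t mem, uint16_t a_val, uint16_t b_val)
{
    mem = merge_half(mem, 1, b_val);
    mem = merge_half(mem, 0, a_val);
    return mem;
}
```

Starting from 0x11112222 with A storing 0xAAAA and B storing 0xBBBB, the interleaved read-modify-write ends at 0x1111AAAA, so B's update is lost, whereas the true halfword stores end at 0xBBBBAAAA.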

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

vithurson subasharan

Aug 23, 2017, 11:37:48 AM
to RISC-V SW Dev
What does handling a misaligned access with a software trap mean?
Does it mean that an interrupt occurs when a misaligned read/write is encountered, and the ISR contains the code to perform it in an aligned manner?

Bruce Hoult

Aug 23, 2017, 11:54:14 AM
to vithurson subasharan, RISC-V SW Dev
Exactly.
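
A minimal sketch of what such a handler might do for loads, written as a plain C function so it can be tried on a host machine. Everything here is an assumption for illustration: the name `emulate_misaligned_load`, the saved-register array, and the flat byte array standing in for memory (little-endian, indexed from address 0). Only the RV32I load encoding (opcode 0x03; funct3 1 = LH, 2 = LW, 5 = LHU) comes from the ISA. A real handler would also have to handle stores and advance the saved PC past the emulated instruction, which is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate a misaligned LH/LHU/LW using byte accesses only.
 * insn: the faulting instruction word
 * regs: saved integer registers x0..x31
 * mem:  flat byte array standing in for memory (an assumption)
 * Returns 0 on success, -1 if the instruction is not LH, LHU or LW. */
static int emulate_misaligned_load(uint32_t insn, uint32_t regs[32],
                                   const uint8_t *mem)
{
    uint32_t opcode = insn & 0x7Fu;
    uint32_t rd     = (insn >> 7) & 0x1Fu;
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t rs1    = (insn >> 15) & 0x1Fu;
    /* Sign-extend the 12-bit I-type immediate held in bits 31:20. */
    int32_t  imm    = (int32_t)((insn >> 20) ^ 0x800u) - 0x800;
    uint32_t addr, value;

    if (opcode != 0x03u)               /* not a LOAD */
        return -1;

    addr = regs[rs1] + (uint32_t)imm;

    switch (funct3) {
    case 1:  /* LH: two byte loads, then sign-extend from 16 bits */
        value = (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
        value = (value ^ 0x8000u) - 0x8000u;
        break;
    case 5:  /* LHU: two byte loads, zero-extended */
        value = (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
        break;
    case 2:  /* LW: four byte loads */
        value = (uint32_t)mem[addr]
              | ((uint32_t)mem[addr + 1] << 8)
              | ((uint32_t)mem[addr + 2] << 16)
              | ((uint32_t)mem[addr + 3] << 24);
        break;
    default: /* LB/LBU can never be misaligned */
        return -1;
    }

    if (rd != 0)                       /* x0 stays hard-wired to zero */
        regs[rd] = value;
    return 0;
}
```

Note that in RISC-V terms this is a synchronous exception rather than an external interrupt, although the handler is entered through the same trap mechanism.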



Andrew Waterman

Aug 23, 2017, 12:09:57 PM
to Bruce Hoult, David Chisnall, RISC-V SW Dev, vithurson subasharan
On Wed, Aug 23, 2017 at 5:39 AM David Chisnall <David.C...@cl.cam.ac.uk> wrote:
> On 23 Aug 2017, at 13:17, Bruce Hoult <br...@hoult.org> wrote:
> >
> > gcc will never create code that generates a misaligned access unless the user explicitly casts integers to pointers.
>
> Unless GCC’s behaviour is *very* different from LLVM’s here, I do not believe that this is true.
>
> There are basically two possibilities for a compiler:
>
>  - Assume all accesses are correctly aligned[1] and emit instructions assuming that the processor can handle the case where they’re not.

This is the approach RISC-V GCC takes by default (though the tuning parameters can tell the compiler to assume that misaligned accesses are expensive, so it won't opportunistically generate them, like it would on, say, x86).

>  - Assume all accesses are misaligned unless they can be proven to be aligned and emit a sequence of smaller loads / stores to guarantee correct alignment[2].

...and this is the approach it takes when the -mstrict-align flag is passed.



