On Sun, 12 May 2019 at 21:02, Joan Lluch via llvm-dev
<llvm...@lists.llvm.org> wrote:
> How do I change that behaviour to get structs always (at least) 2-byte aligned?
I don't think there's a feature you can toggle for this (except,
maybe, making the alignment of every basic type 2 bytes; but that
would obviously affect arrays and even normal variables too). So you
probably have to modify lib/AST/RecordLayoutBuilder.cpp directly.
Might be worth asking this on the cfe-dev mailing list though. That's
where most of the Clang experts live.
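
For comparison, a single type can be given a 2-byte alignment at the source level without touching Clang. The sketch below uses standard C++11 alignas; it does not change the compiler's default for every struct, which is what the original question asks for. The struct names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example types, not taken from the thread.
// Three chars with default layout: on mainstream ABIs this is
// typically 3 bytes in size with 1-byte alignment.
struct PlainTriple {
    char a, b, c;
};

// Same members, but the type asks for 2-byte alignment. The size is
// rounded up to a multiple of the alignment, so one padding byte is added.
struct alignas(2) PaddedTriple {
    char a, b, c;
};

// These hold wherever char has a natural alignment of 1.
static_assert(alignof(PaddedTriple) == 2, "alignas(2) sets the alignment");
static_assert(sizeof(PaddedTriple) % 2 == 0, "size is a multiple of alignment");
```

This only helps for types you control, and it has to be repeated for each struct.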
Cheers.
Tim.
On Mon, 13 May 2019 at 07:53, Joan Lluch <joan....@icloud.com> wrote:
> The reason I want structs to be aligned/padded to 2 bytes is because my architecture only has 16 bit operations. I can read (sign and zero extended) and write (truncated) 8 bit data from/to memory, but all intermediate operations in registers are performed in 16 bit registers.
This is very normal. Mostly it's at 32 bits rather than 16, but it
applies to basically every RISC architecture so LLVM should handle it
well without adjusting the alignment requirements of types.
> This causes LLVM to generate odd tricks such as shifts and byte-swaps, when trying to replace struct ‘memcpy’s by word sized load/store instructions.
That sounds odd, as if you've not taught the backend to use those
8-bit loads and stores so it's trying to emulate them with word-sized
ones (early Alpha chips genuinely didn't have byte access so had to do
that kind of thing). You can (and should) probably fix that.
Also, there are a few customization points where you can control how
memcpy is implemented. The function "findOptimalMemOpLowering" lets
you control the type used for the loads and stores, and
MaxStoresPerMemcpy controls when LLVM will call the real memcpy. If
you want even more control you can implement EmitTargetCodeForMemcpy
to do the whole thing.
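
As a rough illustration of how those knobs interact, here is a toy model in plain C++. It is not LLVM code: chooseStrategy, opBytes and maxStores are invented names that stand in for the access type picked by the lowering hook and for the MaxStoresPerMemcpy threshold. It assumes opBytes is a power of two and at least 1.

```cpp
#include <cassert>

// Toy model only; none of these names exist in LLVM.
enum class CopyStrategy { InlineLoadsAndStores, CallLibraryMemcpy };

// size:      number of bytes to copy (a compile-time constant in practice)
// opBytes:   widest access the target prefers (power of two, >= 1)
// maxStores: plays the role of MaxStoresPerMemcpy
CopyStrategy chooseStrategy(unsigned size, unsigned opBytes, unsigned maxStores) {
    unsigned stores = 0;
    unsigned width = opBytes;
    unsigned remaining = size;
    while (remaining > 0) {
        // Narrow the access for the tail, e.g. the last byte of a 3-byte copy.
        while (width > remaining)
            width /= 2;
        remaining -= width;
        ++stores;
    }
    // Past the threshold, expanding inline is assumed to cost more than a call.
    return stores <= maxStores ? CopyStrategy::InlineLoadsAndStores
                               : CopyStrategy::CallLibraryMemcpy;
}
```

With a 2-byte access width and a threshold of 4, a 3-byte copy needs two stores and stays inline, while a 64-byte copy would need 32 stores and becomes a call.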
On Mon, 13 May 2019 at 18:01, Joan Lluch <joan....@icloud.com> wrote:
> After looking at it a bit further, I think this is a Clang thing. Clang issues “align 2” if the struct has at least one int (2 bytes), but also if the entire struct size is a multiple of 2, for example a struct with 4 char members. In these cases the LLVM backend correctly creates word-sized load/stores (2 bytes).
I'm slightly surprised that it happens based purely on size, but
either way LLVM should be able to cope.
> The LLVM backend just follows what’s dictated by Clang regarding alignment and thus it creates 2-byte or 1-byte load/store instructions accordingly. I have not found a way to override this in LLVM. Any suggestions are appreciated.
That sounds right, but I don't think it explains the shifts you
described before. It should work out a lot better than what you're
seeing. Specifically, a 3 byte struct (for example) ought to either
lower to:
    load i16, load i8 + stores
if your target can do misaligned i16 operations, or
    load i8, load i8, load i8 + stores
if not.
Neither of those involves shifting operations. I'd suggest breaking
just after getMemcpyLoadsAndStores and using SelectionDAG::dump to see
exactly what it's created. Then try to work out where that gets
pessimized to shifts, because it's not normal.
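
The two expected lowerings for a 3-byte struct can be sketched with a small stand-alone model. This is not LLVM's implementation: ToyTarget and planCopy are hypothetical names, the model assumes wordBytes is a power of two, and it assumes knownAlign is at least 1.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of a target; not an LLVM type.
struct ToyTarget {
    unsigned wordBytes;          // widest load/store, e.g. 2 on a 16-bit machine
    bool allowsMisalignedWords;  // may a word access start at an odd address?
};

// Returns the access widths, in bytes, used to copy `size` bytes between
// buffers that are only known to be aligned to `knownAlign` bytes.
std::vector<unsigned> planCopy(const ToyTarget &target, unsigned size,
                               unsigned knownAlign) {
    std::vector<unsigned> widths;
    unsigned width = target.wordBytes;
    if (!target.allowsMisalignedWords) {
        // Without misaligned access, the width is capped by the alignment.
        while (width > 1 && knownAlign % width != 0)
            width /= 2;
    }
    while (size > 0) {
        // Narrow the access for the tail of the copy.
        while (width > size)
            width /= 2;
        widths.push_back(width);
        size -= width;
    }
    return widths;
}
```

For a 1-byte-aligned, 3-byte struct on a 16-bit target this yields widths {2, 1} when misaligned word accesses are allowed and {1, 1, 1} when they are not. Neither plan needs a shift.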
Also: I'm still pretty certain that "solving" this in Clang is the
wrong approach.
On Tue, 14 May 2019 at 17:51, Joan Lluch <joan....@icloud.com> wrote:
> This problem is also shared by the MSP430 target, and it’s very easy to reproduce by just compiling the code that I posted before.
That's some good detective work; it definitely explains what you're
seeing. Since MSP430 is affected it would probably be pretty easy to
upstream an alignment-aware version of the function and test it, if
you're keen. I'd promise to do a review promptly!
<FindOptimalMemOpLowering.txt>
In terms of increasing the alignment of a variable, there’s an API, getOrEnforceKnownAlignment, which LLVM optimizations use to increase the alignment of global variables and stack allocations when it appears profitable. There currently isn’t any transform that calls it for memcpy operations, but it may be worth implementing one for cases where the size of the memcpy is a small constant.
-Eli