----- Original Message -----
> From: "Simon Taylor" <simont...@ntlworld.com>
> To: llv...@cs.uiuc.edu
> Sent: Sunday, December 7, 2014 1:15:51 PM
> Subject: [LLVMdev] NEON intrinsics preventing redundant load optimization?
>
> Hi all,
>
> I’m not sure if this is the right list, so apologies if not.
This is not a bad place ;)
I recommend filing a bug report so that someone can look at this in detail. You can do this at llvm.org/bugs -- select "libraries" as the product, and then "Scalar Optimizations" as the component (that's probably right, and we can always change it if it turns out the problem lies elsewhere).
In the meantime, I recommend trying to pass by value, instead of by reference, in your multiplication operator. It is hard to say without looking at the code in detail, but it is easier for the compiler to analyze:
vec4 operator* (vec4 a, vec4 b)
than to analyze:
vec4 operator* (vec4& a, vec4& b)
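To make the two signatures concrete, here is a minimal sketch. The `vec4` below is a hypothetical plain struct of four floats, standing in for the NEON-backed type in the original code, and `mul_by_ref` is a name made up for illustration (it takes const references, which is the usual way to write the by-reference form). Whether the by-value form actually removes the redundant loads in the real code is not something this sketch demonstrates.

```cpp
// Hypothetical 4-float vector type; a stand-in for the poster's vec4.
struct vec4 {
    float v[4];
};

// By value: the operands are local copies, so no store made elsewhere
// can modify them while the function body runs.
inline vec4 operator*(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] * b.v[i];
    }
    return r;
}

// By reference: the operands live in the caller's memory, so the
// optimizer may need alias analysis before it can reuse values that
// were already loaded.
inline vec4 mul_by_ref(const vec4& a, const vec4& b) {
    vec4 r;
    for (int i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] * b.v[i];
    }
    return r;
}
```

Both functions compute the same element-wise product; only the way the operands reach the function differs.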
-Hal
> Is there
> any hope for this improving in the future, or anything I can do now
> to improve the generated code?
>
> Thanks,
>
> Simon
> _______________________________________________
> LLVM Developers mailing list
> LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
My view is that you should only use intrinsics where the language has
no semantics for the operation. Since that is not the case here, using
pointers is probably the best way, anyway.
There is still the "bug" where the load/store intrinsics don't map to
simple pointer references, but since you found a better work-around,
that has lower priority now.
I changed the bug to reflect that.
cheers,
--renato
I think dereferencing pointers is explicitly discouraged in the
documentation for portability reasons. It may well have issues on
wrong-endian targets.
Tim.
Tim is right, this can be a potential danger, but not more than other
endian or type size issues. If you're writing portable code, I assume
you'll already be mindful of those issues.
This is why I said it's still a problem, but not a critical one. Maybe
adding a comment to your code explaining the issue will help you in
the future to move it back to NEON loads/stores once this is fixed.
cheers,
--renato
After a bit more testing applying the pointer dereferencing workaround in my real code (which has some layers of templates and inline functions) I’ve decided against using it in practice.
GCC produced incorrect code unless -fno-strict-aliasing was specified. It’s probably entirely entitled to do that, so it just seems too flaky to recommend in any case.
Simon
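For readers hitting the same strict-aliasing problem, one commonly used alternative to dereferencing a cast pointer is to copy the bytes with `std::memcpy`, which is well defined regardless of `-fno-strict-aliasing`. The sketch below is only an illustration under assumptions: `float4`, `load4` and `store4` are hypothetical names, `float4` is a plain struct rather than a NEON type, and nobody in this thread tested whether this produces good code on ARM. It is not a substitute for the official load/store intrinsics.

```cpp
#include <cstring>

// Hypothetical 4-float vector type; a stand-in for a NEON vector.
struct float4 {
    float v[4];
};

static_assert(sizeof(float4) == 4 * sizeof(float),
              "float4 is assumed to have no padding");

// The workaround discussed above read memory through a pointer cast to a
// vector type and dereferenced it. Copying the bytes instead avoids
// relying on how the compiler treats that cast.
inline float4 load4(const float* p) {
    float4 out;
    std::memcpy(&out, p, sizeof out);  // copies 4 floats starting at p
    return out;
}

inline void store4(float* p, const float4& x) {
    std::memcpy(p, &x, sizeof x);      // writes 4 floats starting at p
}
```

The caller must guarantee that `p` points to at least four readable (or writable) floats.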
Ok, I raised the priority back to Normal in the bug, since the workaround wasn't good enough.
Cheers,
Renato
To correct my previous correction: with the Android NDK’s GCC 4.8, this does indeed cause all the important instructions to be stripped when optimisation is set to -O2. For whatever reason, this doesn’t happen with -O3. So I’d definitely agree with the general consensus to only create vectors with the official intrinsics and steer well clear of my nasty hacky attempt at a workaround.
Simon