In the process I found out that clang doesn't support NEON (as per http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html), but there has been at least some effort in adding it (https://www.codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).
I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times and I gave up. After some discussions with colleagues (notably Alberto Magni, who added OpenCL support to clang some time ago http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-November/012293.html) my current plan is to implement the ARM NEON intrinsics as a shared library, using attributes as in:
typedef float float4 __attribute__((ext_vector_type(4)));
or, if that doesn't work, I will try to implement the intrinsics in clang itself (not sure this is the best way of doing it).
Ideally, I want to be able to compile C code that includes ARM NEON intrinsics to other targets (TI processors, e.g.).
> I spent the last three days trying to compile a version of LLVM that would
> allow me to compile sources that contain these intrinsics, but with no success.
Ok. This we can probably help with. Did you manage to build a version
of Clang (preferably from git/subversion)?
If so, you're probably having problems cross-compiling. Renato's
recently worked on some documentation in this area:
http://clang.llvm.org/docs/CrossCompilation.html.
But for a quick hack, you could try:
$ cat > neon.c
#include <arm_neon.h>
float32x4_t my_func(float32x4_t lhs, float32x4_t rhs) {
  return vaddq_f32(lhs, rhs);
}
$ clang --target=arm-linux-gnueabihf -mcpu=cortex-a15 -ffreestanding \
    -O3 -S -o - neon.c
("-ffreestanding" will dodge any issues with your supporting toolchain,
but won't work for larger tests. You'll have to actually solve those
issues before you start running code.)
> In the process I found out that clang doesn't support NEON (as per
> http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html),
That's rather out of date, I'm afraid. 32-bit ARM does support both
NEON intrinsics and a reasonable amount of LLVM's own
auto-vectorisation (which is in its early stages, but we have some
kind of loop and SLP vectorisation going on).
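
As a rough illustration of the kind of code the loop vectoriser is aimed at, here is a minimal sketch of an element-wise add written as a plain C loop, with no intrinsics at all. The function name add_arrays is made up for this example. Whether NEON instructions are actually emitted for it depends on the Clang version, the target flags and the optimisation level, so treat it as something to experiment with rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical example function: element-wise add of two float arrays.
   The restrict qualifiers tell the compiler the arrays do not overlap,
   which makes a loop like this easier to vectorise. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n) {
  for (size_t i = 0; i < n; ++i)
    dst[i] = a[i] + b[i];
}
```

Compiling this with the same command line as the quick hack above and reading the generated assembly is one way to check whether the loop was vectorised.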
> but there has been at least some effort in adding it
> (https://www.codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).
That patch is part of the effort to implement NEON (instructions and
intrinsics) on the 64-bit ARM architecture (AArch64).
> I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times
> and I gave up.
Yep. llvm-gcc is long dead, and LLVM 2.9 isn't much healthier.
> current plan is to implement the ARM NEON intrinsics as a shared library,
> using attributes as in:
That would probably be possible, but very bad from a performance
perspective. The whole point of NEON intrinsics is to speed up vector
code; if you've got the overhead of a call/return for each intrinsic,
plus the fixed argument and return registers that each call imposes,
you'll be in for a world of pain.
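
To make the call-overhead point concrete, here is a minimal sketch of the vector-attribute approach written as a header-style static inline function rather than a shared-library entry point, so the compiler is free to inline it. The names emu_float32x4_t and emu_vaddq_f32 are hypothetical stand-ins chosen for this example, not part of arm_neon.h. The fallback to vector_size for compilers other than Clang is also an assumption of the sketch.

```c
/* Hypothetical portable stand-in for float32x4_t.
   Clang provides ext_vector_type; for other compilers this sketch
   assumes the GCC-style vector_size attribute is available. */
#if defined(__clang__)
typedef float emu_float32x4_t __attribute__((ext_vector_type(4)));
#else
typedef float emu_float32x4_t __attribute__((vector_size(16)));
#endif

/* Hypothetical emulation of vaddq_f32: lane-wise addition.
   Declared static inline so it can be inlined at the call site
   instead of paying for a call/return per intrinsic. */
static inline emu_float32x4_t emu_vaddq_f32(emu_float32x4_t lhs,
                                            emu_float32x4_t rhs) {
  return lhs + rhs;
}
```

Calling emu_vaddq_f32 on {1, 2, 3, 4} and {10, 20, 30, 40} yields a vector whose lanes can be read back with ordinary subscripts. This only covers the simple intrinsics; those with more complex semantics would each need their own emulation.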
> Ideally, I want to be able to compile C code that includes ARM NEON
> intrinsics to other targets (TI processors, e.g.).
Now that's going to be harder. LLVM itself doesn't support any TI
processors, for a start. And many of the NEON intrinsics (those with
more complex semantics) compile to LLVM IR with LLVM-level intrinsics,
which are only supported in the ARM backend.
Your shared library idea would work semantically, of course. But I'm
not sure what useful information could be extracted from it.
To answer your question I am testing on a pandaboard currently, which has an arm cortex-a9 processor, which I think is 64-bit.
however it doesn't if I remove the -ffreestanding flag. I need to figure this out next.
which I suspect has something to do with the fact that in /usr/include I have a folder called x86_64-linux-gnu but not one called arm-linux-gnueabihf. Am I even remotely right?
On 26 September 2013 12:22, Stanislav Manilov <S.Z.M...@sms.ed.ac.uk> wrote:
> In the process I found out that clang doesn't support NEON (as per http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html), but there has been at least some effort in adding it (https://www.codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).
Hi Stanislav,
LLVM has supported NEON on ARM32 for a very long time. The commit you're referring to is about AArch64, and yes, support for ARM64 NEON is patchy at the moment, but it's progressing quite quickly. Which back-end are you trying to use, 32-bit or 64-bit?
> I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times and I gave up. After some discussions with colleagues (notably Alberto Magni, who added OpenCL support to clang some time ago http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-November/012293.html) my current plan is to implement the ARM NEON intrinsics as a shared library, using attributes as in:
LLVM 2.9 is really old, and llvm-gcc is discontinued, so I wouldn't even try that. If you don't want to use trunk, I recommend using LLVM with Clang 3.3 and seeing what you get.
> typedef float float4 __attribute__((ext_vector_type(4)));
> or if that doesn't work, I will try to implement the intrinsics in clang itself (not sure this is the best way of doing it).
> Ideally, I want to be able to compile C code that includes ARM NEON intrinsics to other targets (TI processors, e.g.).
> To answer your question I am testing on a pandaboard currently, which has an arm cortex-a9 processor, which I think is 64-bit.
Cortex-A9 is still 32-bit, so you'll have all the support you need. ;)
> however it doesn't if I remove the -ffreestanding flag. I need to figure this out next.
Can you at least compile the file to assembly (.s)? You won't be able to build Tim's example into an executable because it has no main function.