OpenBLAS support in ARM9+NEON?


crw...@gmail.com

Sep 5, 2014, 1:45:17 PM
to openbla...@googlegroups.com
Hello,

I have been using OpenBLAS under Linux on a Sandy Bridge processor and have been very impressed with the results. We are now considering an ARM9+NEON platform. Am I right in assuming that tuning has not been done for this architecture yet? Are there any plans?

Thanks,

Charles

Werner Saar

Sep 6, 2014, 1:18:07 AM
to openbla...@googlegroups.com
Hi,

Tuning has been done for the Cortex-A9, which I assume is what you
mean by ARM9. It implements the ARMv7 instruction set, and ARMV7 is
the target name in OpenBLAS. You need Linux to run OpenBLAS, not Android.

Best regards
Werner

crw...@gmail.com

Sep 6, 2014, 10:18:33 PM
to openbla...@googlegroups.com
Thanks, Werner.  That is good news for me, especially since we will be using Linux, with the latest gcc.

Not to belabor the point, though, I guess I need a bit more clarification.  Is it actually tuned to use NEON instructions, too?

Thanks,

Charles

Werner Saar

Sep 7, 2014, 3:39:03 AM
to openbla...@googlegroups.com
Hi,

It is tuned to use VFPv3 instructions, not NEON instructions, because
NEON instructions are not IEEE 754 compliant. But this does not cause a
performance penalty on this platform.

Best regards
Werner
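
A note on what "not IEEE 754 compliant" means in practice: as far as I know, the main issue is that ARMv7 NEON floating-point arithmetic flushes subnormal (denormal) values to zero, while VFPv3 can keep them. Below is a minimal C sketch of that one property. The function names are hypothetical, and flush_to_zero() is only a software model of the behavior, not a NEON instruction and not OpenBLAS code.

```c
#include <float.h>
#include <math.h>

/* Halve the smallest normal float at run time. The volatile qualifiers
 * keep the compiler from folding the multiplication at compile time.
 * With IEEE 754 gradual underflow the result is a nonzero subnormal. */
float half_of_smallest_normal(void) {
    volatile float smallest_normal = FLT_MIN;
    volatile float half = smallest_normal * 0.5f;
    return half;
}

/* Software model (an assumption for illustration) of a flush-to-zero
 * unit: subnormal values become a zero of the same sign, everything
 * else passes through unchanged. */
float flush_to_zero(float v) {
    if (fpclassify(v) == FP_SUBNORMAL) {
        return (v < 0.0f) ? -0.0f : 0.0f;
    }
    return v;
}
```

On a unit with gradual underflow, fpclassify(half_of_smallest_normal()) should report FP_SUBNORMAL, while passing the same value through flush_to_zero() gives 0. Results that depend on such tiny values can therefore differ between the two units.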

johnny b

Mar 5, 2015, 4:59:52 AM
to openbla...@googlegroups.com
Hi,
Sorry for bringing this old thread back. I only want to make sure that my understanding is correct. In my opinion, NEON optimizations would be a useful feature for OpenBLAS and could speed up single-precision floating-point computation quite a lot.


On Sunday, September 7, 2014 at 09:39:03 UTC+2, Werner Saar wrote:
> Hi,
>
> It is tuned to use VFPv3 instructions, not NEON instructions, because
> NEON instructions are not IEEE 754 compliant.

That is true.

> But this is not a performance
> penalty on this platform.

But are you sure about this?
VFP is not a parallel SIMD architecture like NEON:
http://stackoverflow.com/questions/4097034/arm-cortex-a8-whats-the-difference-between-vfp-and-neon
On the other hand, NEON SIMD supports only single-precision floating-point computation:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0409g/DDI0409G_cortex_a9_neon_mpe_r3p0_trm.pdf (section 1.1)

> Best regards
> Werner


Best regards,
Johannes
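
To make the VFP versus NEON point concrete, here is a rough C sketch of a single-precision y += alpha * x loop written two ways: one element at a time, and four elements at a time with NEON intrinsics from arm_neon.h. The names saxpy_scalar and saxpy_simd are hypothetical and this is not taken from OpenBLAS. On compilers without NEON support the second function simply falls back to the scalar loop, and on 32-bit ARM with gcc you would likely need something like -mfpu=neon for the intrinsics path.

```c
#include <stddef.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Scalar version: one multiply-add per loop iteration. */
void saxpy_scalar(size_t n, float alpha, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}

/* SIMD version: four single-precision lanes per iteration when NEON
 * is available, followed by a scalar loop for the remaining elements. */
void saxpy_simd(size_t n, float alpha, const float *x, float *y) {
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);   /* load 4 floats from x */
        float32x4_t vy = vld1q_f32(y + i);   /* load 4 floats from y */
        vy = vmlaq_n_f32(vy, vx, alpha);     /* vy = vy + vx * alpha */
        vst1q_f32(y + i, vy);                /* store 4 results to y */
    }
#endif
    for (; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
```

Whether the four-wide version is actually faster on a given core depends on the hardware, which is exactly the question raised above. The sketch also only covers single precision, since that is all NEON offers for floating point on this platform.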
 

Zhang Xianyi

Mar 5, 2015, 12:30:39 PM
to johnny b, openbla...@googlegroups.com
Hi Johannes,

We are working on NEON optimizations for ARM Cortex-A15 processors.

According to preliminary results, we can get almost 80% of peak performance for sgemm. We expect to release the work in June or July.

Thank you

Xianyi
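
For anyone wondering how a "percent of peak" figure for sgemm is usually worked out, here is a small C sketch. It assumes the common convention of counting 2*m*n*k floating-point operations for a matrix multiply, and a theoretical peak of cores * clock * flops per cycle. The function names are hypothetical, and the core count, clock rate and flops-per-cycle values are inputs you have to supply for your own hardware. Nothing here is a measured Cortex-A15 number.

```c
/* Achieved GFLOP/s for an m x n x k matrix multiply that took
 * `seconds`, counting one multiply and one add per inner step. */
double gemm_gflops(double m, double n, double k, double seconds) {
    return 2.0 * m * n * k / seconds / 1e9;
}

/* Theoretical peak in GFLOP/s. All three inputs are assumptions
 * about the target hardware and must come from its documentation. */
double peak_gflops(int cores, double clock_ghz, int flops_per_cycle) {
    return cores * clock_ghz * flops_per_cycle;
}

/* Fraction of peak, e.g. 0.8 for "80% of peak". */
double fraction_of_peak(double achieved_gflops, double peak) {
    return achieved_gflops / peak;
}
```

As a made-up example, a 1000 x 1000 x 1000 multiply that takes 0.5 s runs at 4 GFLOP/s. Against a hypothetical 8 GFLOP/s peak, that would be 50% of peak.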




johnny b

Mar 6, 2015, 2:04:27 AM
to openbla...@googlegroups.com, rehm.j...@gmail.com
Hi Xianyi,

Very cool, sounds great!

Best regards,
Johannes

Amit kumar

Jul 18, 2015, 4:18:46 AM
to openbla...@googlegroups.com, rehm.j...@gmail.com
Hi Xianyi,
I am also very interested in seeing a performance boost from NEON instructions. Personally, I feel that NEON-like instructions can help a lot.
Do you have a preliminary version of the NEON implementation in OpenBLAS for the A15 that I could experiment with?

Waiting for your reply.

Thanks
Amit