NEON FPU


Andrés Calderón

Jan 28, 2009, 11:20:22 PM
to beagl...@googlegroups.com
Hi,

Are there any benchmark results for the NEON FPU?

Has anyone tested FFTW[1] on the BeagleBoard?

[1] www.fftw.org

thanks,

Andrés Calderón
Cel: +57 (300) 275 3666
Email: andres....@emqbit.com
Web: www.emqbit.com

Ian R

Jan 29, 2009, 2:38:26 PM
to Beagle Board
There is a free NEON FFT implementation in the OpenMAX DL library,
available for download from ARM. Note that it is written for the ARM
toolchain, so the asm will need some rework into gas format;
alternatively, you could use the ARM RealView eval tools, which can
interwork with gcc to build Linux apps.

http://www.arm.com/products/multimedia/openmax/
http://www.arm.com/products/multimedia/openmax/v7libraries.html

The FFT code is in:
OX002-BU-00010-r1p0-00alp0\OX002-BU-00010-r1p0-00alp0\sp\src

Would be great to find someone who was keen to put this NEON code (or
similar) into FFTW.

As an example of NEON FFT performance: a 256-point FFT on 16-bit signed
complex numbers takes 4.7us (on a 500MHz Beagle)
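
To put that figure in terms that are easier to compare across clock speeds, the time can be converted to cycles: 4.7us at 500MHz is about 2350 cycles, or roughly 9.2 cycles per point. The sketch below does that arithmetic. The function name is made up for illustration, and the butterfly count assumes a plain radix-2 decomposition, which may not match what the ARM code actually does.

```python
import math


def fft_cycle_budget(n_points, time_us, clock_mhz):
    """Convert a measured FFT time into approximate cycle counts.

    Hypothetical helper for illustration only.
    """
    # microseconds * MHz = cycles
    cycles = time_us * clock_mhz
    # Assumption: radix-2 FFT, i.e. (N/2) * log2(N) butterflies.
    butterflies = (n_points // 2) * int(math.log2(n_points))
    return {
        "cycles": cycles,
        "cycles_per_point": cycles / n_points,
        "cycles_per_butterfly": cycles / butterflies,
    }


# Figures quoted in the post: 256 points, 4.7us, 500MHz.
budget = fft_cycle_budget(256, 4.7, 500)
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

Under the radix-2 assumption this comes to a little over 2 cycles per butterfly, which gives a rough target for any NEON codelet work.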

Philip Balister

Jan 29, 2009, 2:43:59 PM
to beagl...@googlegroups.com

Off hand, do you know what sort of license is involved?

In other words, is this usable with a GPL project such as gnuradio?

Philip

Koen Kooi

Jan 29, 2009, 2:50:11 PM
to beagl...@googlegroups.com

• Subject to the provisions of this Agreement, ARM hereby grants to
YOU (either an individual or single entity), under ARM's copyright in
the Software, a perpetual, non-exclusive, non-transferable, royalty
free, worldwide licence to ; (i) use, copy, modify, the Software for
the purposes of developing or having developed software applications
and; (ii) distribute and sublicense the right to use, copy and modify
the software applications to third parties.
• THE SOFTWARE IS LICENSED “AS IS”. ARM EXPRESSLY DISCLAIMS ALL
REPRESENTATIONS, AND WARRANTIES EXPRESS, IMPLIED OR STATUTORY,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF SATISFACTORY QUALITY,
MERCHANTABILITY, NON-INFRINGMENT OR FITNESS FOR A PARTICULAR PURPOSE.
• Your use of this Software and the right to redistribute any
software applications developed by or for YOU and which are derived
from the Software may require you to obtain patent licences from third
parties (“Third Party Patents”). ARM therefore requires and YOU hereby
agree that prior to exercise of any of the rights to distribute any
software applications in accordance with the licences granted under
this Agreement, YOU shall have obtained all necessary rights and
licences to Third Party Patents, of which YOU are aware of or become
aware during the term of this Agreement, to enable YOU to distribute
the ARM Software in accordance with the licences granted hereunder
without infringing the Third Party Patents whether as a primary,
secondary, indirect or contributory infringer, or otherwise, and the
copyright licences contained herein are conditional on you agreeing to
obtain such licences. For the purpose of interpretation of this Clause
3, any allegation by a third party that any action by YOU infringes
any Third Party Patents shall be presumed as valid until properly
rebutted by YOU and ARM may suspend the licences granted in Clause 1
until any such allegation is resolved in favour of YOU or YOU reach a
settlement with the party making the allegation. If any breach by YOU
of the provisions of this Clause 3 results in ARM being subject to a
claim for infringement of any Third Party Patents, YOU shall indemnify
against and hold ARM harmless from any claims, demands, damages, costs
and expenses made against or suffered by ARM as a result of any such
claim or action.
• No licence, express, implied or otherwise, is granted to YOU under
the provisions of Clause 1, to use the ARM tradename in connection
with the Software or any products based thereon. Nothing in Clause 1
shall be construed as authority for YOU to make any representations on
behalf of ARM in respect of the Software.
• If you are downloading the Software on behalf of a company,
partnership or other legal entity, you represent and warrant that you
have authority to bind that entity to these terms and Conditions. If
you do not have this authority you should not proceed to download the
Software.
• Any breach by YOU of the terms of this Agreement shall entitle ARM
to terminate this Agreement with immediate effect. Upon termination of
this Agreement, all licences granted to YOU shall cease immediately
and YOU shall at ARM's option either return to ARM or destroy all
copies of the Software including any modifications or derivatives
thereof.
• This Agreement shall be governed by and construed in accordance
with the laws of England and Wales.


Bob McGwier

Jan 29, 2009, 3:21:22 PM
to beagl...@googlegroups.com

FFTW is built upon the premise that it will run tests on the target
machine to find the fastest code. For most machines of interest
(desktops and servers) there are optimized SIMD codelets to do the heavy
lifting in FFTW. For Intel there is use of SSE, SSE2. For PPC there is
use of AltiVec. For Cell, there is use of SPE, etc.

Without the NEON SIMD optimizations, FFTW would just be plain C compiled
natively with ARM gcc. The performance will be poor. NEON codelets are
needed to attain good throughput.
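
The "run tests on the machine" idea can be sketched in a few lines: time each candidate implementation on the target and keep the fastest. This is only a toy illustration in Python, not FFTW's planner or API (in real FFTW the measuring is requested with the FFTW_MEASURE planner flag). The names dft_naive, fft_radix2 and plan are made up for this example, and fft_radix2 assumes a power-of-two length.

```python
import cmath
import time


def dft_naive(x):
    """O(N^2) DFT computed straight from the definition."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def plan(n, candidates, repeats=3):
    """Time every candidate on this machine and return the fastest one."""
    probe = [complex(i % 7, 0) for i in range(n)]
    best_name, best_time = None, float("inf")
    for name, fn in candidates.items():
        for _ in range(repeats):
            start = time.perf_counter()
            fn(probe)
            elapsed = time.perf_counter() - start
            if elapsed < best_time:
                best_name, best_time = name, elapsed
    return best_name, candidates[best_name]


candidates = {"naive": dft_naive, "radix2": fft_radix2}
chosen_name, chosen_fn = plan(64, candidates)
print("fastest on this machine:", chosen_name)
```

A planner like this can only pick among the candidates it has, so without NEON codelets there is nothing fast for it to choose on the OMAP3.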

Bob McGwier

--
(Co)Author: DttSP, Quiktrak, PowerSDR, GnuRadio
Member: ARRL, AMSAT, AMSAT-DL, TAPR, Packrats,
NJQRP, QRP ARCI, QCWA, FRC.
"It is human nature to think wisely and act in
an absurd fashion.", Anatole France.

Bob McGwier

Jan 29, 2009, 3:25:02 PM
to beagl...@googlegroups.com
When you run GnuRadio on machines with OpenGL enabled to speed up its
wxPython and/or Qt widgets, do you have a license to distribute the
restricted Nvidia driver that makes OpenGL go fast?

No, of course not. So, given the NEON FFT license, the code may not be
checked into the GnuRadio repository and distributed with the GPL v3.0
code there. But under the license granted (as quoted in another email
message), individual users can download and install it and run GnuRadio
all day, every day, which will be quite useful.

You simply make it a requirement for GnuRadio/OMAP3530 under OE.

The GPL is a distribution license, not a usage license.

Bob

Philip Balister

Jan 29, 2009, 3:26:41 PM
to beagl...@googlegroups.com
On Thu, Jan 29, 2009 at 3:21 PM, Bob McGwier <rwmc...@gmail.com> wrote:
>
> fftw is built upon the premise that it will be used with tests run on
> the machine to find optimized code. For most machines of interest
> (desktops and servers) there are optimized SIMD codelets to do the heavy
> lifting in fftw. For Intel there is use of SSE, SSE2. For PPC there is
> use of altivec. For Cell, there is use of SPE, etc.
>
> Without the neon SIMD optimizations, it would be native compile under
> ARM-gcc for fftw. The performance will be poor. NEON codelets are
> needed to attain good throughput.

The numbers I just posted suggest fftw will be horrible on the OMAP3
until we add NEON optimizations.

Philip

Ian R

Jan 30, 2009, 4:21:19 PM
to Beagle Board
If anyone wants to do these FFTW optimizations, please let me know and
I can help (know a bit about NEON here ;) )

Ian

Philip Balister

Feb 1, 2009, 9:11:45 AM
to beagl...@googlegroups.com
On Fri, Jan 30, 2009 at 4:21 PM, Ian R <ian.ri...@btinternet.com> wrote:
>
> If anyone wants to do these FFTW optimizations, please let me know and
> I can help (know a bit about NEON here ;) )

The fftw sources have a directory called "simd" with files for altivec
and sse. Adding NEON support may be as straightforward as inserting
code here. I haven't looked at the rest of the code. It certainly
would be a start.

Philip

kbkrisha...@gmail.com

Mar 25, 2014, 7:23:21 AM
to beagl...@googlegroups.com
Hi Ian,
I want to do a performance comparison between a DSP and an ARM Cortex-A15 (using the NEON core) for floating point and digital signal processing (FFT).
Could you suggest a fair (open source) benchmark for this?
For the DSP, I used an FFT function (optimized asm code) from dsplib and obtained the CPU cycles it required.
Now I want to implement a similar function for the Cortex-A15. What should be done?
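
One way to get comparable numbers on the ARM side is to time the routine repeatedly after a warm-up, take the minimum or median, and convert the time to cycles using the core clock. The sketch below shows only that methodology. The helper names and the workload function are hypothetical stand-ins, and the clock value passed to ns_to_cycles is an assumption you would replace with the real A15 frequency. On the actual target the same structure would wrap the C or NEON FFT routine being measured.

```python
import statistics
import time


def benchmark(fn, arg, warmup=3, runs=20):
    """Time fn(arg) several times and summarise the samples in nanoseconds."""
    # Warm-up calls let caches settle before any measurement is taken.
    for _ in range(warmup):
        fn(arg)
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn(arg)
        samples.append(time.perf_counter_ns() - start)
    return {"min_ns": min(samples), "median_ns": statistics.median(samples)}


def ns_to_cycles(ns, clock_mhz):
    """Approximate cycle count for a duration at an assumed clock speed."""
    return ns * clock_mhz / 1000.0


def workload(values):
    """Hypothetical stand-in for the FFT routine under test."""
    return sum(v * v for v in values)


result = benchmark(workload, list(range(1024)))
print("min ns:", result["min_ns"], "median ns:", result["median_ns"])
```

Comparing cycles rather than wall-clock time keeps the comparison with the DSP figures fair when the two cores run at different clock rates.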