How to enable Neon optimization


Rajeev Verma

Nov 4, 2014, 8:10:47 PM
to relic-...@googlegroups.com
Hi, 


If I want to compile the Relic toolkit with Neon optimization enabled in the compiler, what changes would be needed?


I am passing these Neon-related flags to the compiler:

 -mcpu=cortex-a15 -mtune=cortex-a15 -mfpu=neon-vfpv4 -mfloat-abi=hard -funroll-loops -fomit-frame-pointer -DCORTEX=15 -marm -finline-small-functions


But the timings in the results do not change. Is there any specific change needed in the toolkit?

Please help me out. 


Thanks,
Rajeev

Diego Aranha

Nov 10, 2014, 9:49:47 AM
to relic-...@googlegroups.com
Dear Rajeev,

Can you please provide the CMake command line you are using? You probably need to turn on GCC auto-vectorization with -ftree-vectorize.

Best,

--
Diego de Freitas Aranha
Institute of Computing - University of Campinas
http://www.ic.unicamp.br/~dfaranha
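
A quick way to check whether the flags take effect is to compile a small test file with the same options. The sketch below is not part of RELIC, and the function names are made up for illustration. It reports whether the compiler sees Neon as enabled (through the `__ARM_NEON` or older `__ARM_NEON__` predefined macro) and contrasts two loop shapes: an element-wise addition, which the auto-vectorizer can usually handle, and a multi-precision addition with carry, where each iteration depends on the previous one and vectorization is generally not possible.

```c
#include <stdint.h>

/* Hypothetical helper, not a RELIC function: returns 1 when the compiler
 * was invoked with Neon enabled, 0 otherwise. */
int neon_enabled(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Independent element-wise work: every iteration can run in parallel,
 * so this is a typical candidate for auto-vectorization. */
void add_words(uint32_t *c, const uint32_t *a, const uint32_t *b, int n) {
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Multi-precision addition: the carry out of digit i feeds digit i + 1.
 * This loop-carried dependency is what usually blocks vectorization.
 * Returns the final carry (0 or 1). */
uint32_t add_with_carry(uint32_t *c, const uint32_t *a, const uint32_t *b, int n) {
    uint32_t carry = 0;
    for (int i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        c[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}
```

With GCC, adding `-fopt-info-vec` to the flags prints which loops were actually vectorized, which is more direct than comparing timings. Most of the work in a prime-field library looks like the second loop, so a large speedup from auto-vectorization alone may not be realistic.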


Rajeev Verma

Nov 21, 2014, 9:46:29 PM
to relic-...@googlegroups.com
Hi Diego,


Sorry, I was busy with courses last week. This is the command I am using to compile with Neon optimization:

CC=arm-linux-gnueabihf-gcc cmake -DARCH=ARM -DWORD=32 -DSEED=LIBC -DOVERH=OFF -DSHLIB=OFF -DSTBIN=ON -DTIMER=POSIX -DWITH="ALL" -DBENCH=10 -DTESTS=10000 -DCHECK=off -DVERBS=off -DSTRIP=off -DQUIET=off -DCORTEX=15 -DARITH=gmp -DFP_PRIME=256 -DFP_METHD="COMBA;COMBA;MONTY" -DCOMP="-O3 -g -flto -Ofast -mcpu=cortex-a15 -fprefetch-loop-arrays -mfpu=neon -mfloat-abi=hard -funroll-all-loops -mtune=cortex-a15 -ftree-vectorize -fomit-frame-pointer -mvectorize-with-neon-quad -mthumb-interwork -finline-small-functions  -ffast-math -marm -ffunction-sections -fdata-sections -fomit-frame-pointer -finline-small-functions" -DFP_PMERS=ON -DFP_QNRES=on -DPP_METHD="LOWER;O_ATE" ../relic-src-code

But it has no effect on the timings when I compare with the C-only implementation.

I have included arm_neon.h in relic.h, and I am using an Arndale board with an Exynos ARM Cortex-A15 processor.

Please let me know if you can point out anything wrong here.



Thanks,
Rajeev
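
On the arm_neon.h point: including the header only declares the Neon vector types and intrinsics. It does not change the generated code unless the intrinsics are actually called somewhere. A minimal sketch of what calling them looks like is below; `vec_add_u32` is a hypothetical helper, not a RELIC function, and the scalar fallback is there so the same file still builds when Neon is not enabled.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not a RELIC function:
 * c[i] = a[i] + b[i] (mod 2^32) for i in [0, n). */
void vec_add_u32(uint32_t *c, const uint32_t *a, const uint32_t *b, size_t n) {
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four 32-bit lanes per 128-bit Neon register. */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t va = vld1q_u32(a + i);
        uint32x4_t vb = vld1q_u32(b + i);
        vst1q_u32(c + i, vaddq_u32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when Neon is unavailable. */
    for (; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

One more thing that may be worth checking, stated here as an assumption rather than a confirmed diagnosis: with -DARITH=gmp the low-level multi-precision routines presumably come from the prebuilt GMP library, which the flags passed through -DCOMP would not recompile.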