Seeking data points


Matthew Du Puy

Jan 29, 2014, 8:03:15 PM
to yeppp-...@googlegroups.com

Out of curiosity, what functions are you comparing here?

I would appreciate any analysis pointers as we strive to optimize ProjectNe10 and move towards a more C-intrinsics-based solution.
Thanks!

--
Matthew Du Puy | Software Engineer | ARM Strategic Alliances
community.arm.com/people/mattdupuy
community.arm.com/groups/android-community

Marat Dukhan

Feb 4, 2014, 6:10:13 AM
to Matthew Du Puy, yeppp-...@googlegroups.com
Copying to yeppp-discuss as well

Dear Matthew,

Details on the benchmarks from the title page will show up if you hover the mouse over a picture:
Inline image 1

The plot you refer to demonstrates performance of in-cache single-precision vector addition (function yepCore_Add_V32fV32f_V32f) on Cortex-A8.

In my opinion, the main performance issue with Ne10 is that it doesn't hide the latency of instructions (BTW, Yeppp! is also not perfect here: this kernel could perform at 0.75 CPE if it were better optimized).
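
To make "hiding the latency of instructions" concrete, here is a minimal sketch in plain C, not code taken from Yeppp! or Ne10. The function name add_v32f_unrolled is hypothetical. The idea is to process several independent elements per iteration and issue all the loads before the adds that consume them, so that an in-order core always has independent work available instead of stalling on the result of the previous instruction. A compiler is free to reschedule this code, and a real NEON kernel would use intrinsics or assembly, so treat it only as an illustration of the scheduling pattern.

```c
#include <stddef.h>

/* Hypothetical helper (not a Yeppp! or Ne10 function):
 * computes z[i] = x[i] + y[i] for i in [0, n). */
void add_v32f_unrolled(const float *x, const float *y, float *z, size_t n)
{
    size_t i = 0;

    /* Main loop: four independent elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        /* Issue all loads first, so no add has to wait on the load
         * written immediately before it. */
        const float x0 = x[i],     y0 = y[i];
        const float x1 = x[i + 1], y1 = y[i + 1];
        const float x2 = x[i + 2], y2 = y[i + 2];
        const float x3 = x[i + 3], y3 = y[i + 3];

        /* The four adds do not depend on each other. */
        const float s0 = x0 + y0;
        const float s1 = x1 + y1;
        const float s2 = x2 + y2;
        const float s3 = x3 + y3;

        /* Stores come last, after every sum has been started. */
        z[i]     = s0;
        z[i + 1] = s1;
        z[i + 2] = s2;
        z[i + 3] = s3;
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i) {
        z[i] = x[i] + y[i];
    }
}
```

The unroll factor of four is an arbitrary choice for the sketch. In practice it would be tuned so that enough independent operations are in flight to cover the load and add latencies of the target core.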

Regards,
Marat



yeppp-tooltip.png