Martin Guy
Jun 27, 2013, 11:55:13 PM
to linux-...@freelists.org, si...@googlegroups.com, Crossgcc list
[Bcc to 15 developers]
Hi!
I've been looking at the old EP93xx MaverickCrunch FPU patches for
GCC again and almost have a working set for gcc-4.4. It "seems to
work" for most programs but one FP-math-intensive testsuite is
failing.(*)
The good news is that FFTW's speed test reports an increase from
6.13 to 7.65 MFlops and I think I now know what was wrong with the
64-bit integer arithmetic(**): fixing that should bring a speed
increase to OpenSSL and others. The gcc-4.4 patch set also integrates better
with the new Debian release "wheezy", whose minimum GCC version is now 4.4.
I'm writing as a fundraiser, to prod me into finding and applying the
bug fix, then finishing, packaging and publishing the compiler, and finally
compiling the Debian repository of crunch-accelerated packages for wheezy.
So if anyone happens to work for a company that gains from my work,
do feel free to campaign with the accountants on my behalf :)
In any case, I'll write again when the new compiler and repositories
are available.
Cheers
M
*) In fftw. It looks like another variant of a known silicon bug: when
one instruction modifies an ARM register in a way that induces a wait
state, and the next instruction is a crunch double load/store whose
address indirects through that register, the result is garbage or RAM
is corrupted at random.
**) By default, the FPU's 64-bit arithmetic operations do saturating
arithmetic instead of "going round the clock", maxing out at 0x7F* and
-0x80*. The only place that notices is openssl's 64-bit-optimized
bignum div/rem function.
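
For anyone unsure what the difference in (**) looks like in practice,
here is a minimal sketch in plain C. It is only an illustration of the
two behaviours, not code from the patches or from OpenSSL, and the
function names add_wrap and add_sat are made up for this example.

```c
#include <stdint.h>

/* "Going round the clock": the wrapping result that C code such as a
 * bignum routine expects from 64-bit integer arithmetic.
 * The add is done in unsigned arithmetic, where overflow is well
 * defined; converting back to int64_t is implementation-defined in
 * ISO C, but GCC documents it as a modulo-2^64 wrap. */
static int64_t add_wrap(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a + (uint64_t)b);
}

/* Saturating add: the result is clamped at the largest and smallest
 * representable values (0x7F... and -0x80...) instead of wrapping.
 * This models the behaviour described in the footnote; it is not the
 * hardware's actual implementation. */
static int64_t add_sat(int64_t a, int64_t b)
{
    if (b > 0 && a > INT64_MAX - b)
        return INT64_MAX;   /* would overflow upwards: clamp */
    if (b < 0 && a < INT64_MIN - b)
        return INT64_MIN;   /* would overflow downwards: clamp */
    return a + b;           /* in range: ordinary add */
}
```

With these, add_wrap(INT64_MAX, 1) gives INT64_MIN while
add_sat(INT64_MAX, 1) stays at INT64_MAX. For operands that do not
overflow the two agree, which would be consistent with most programs
never noticing the difference.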