Speed regression with fat binary build


fwjmath

Feb 27, 2014, 8:19:37 AM
to mpir-...@googlegroups.com
Hi all,

I am working with a volunteer computing project (yoyo@home, if anyone knows it), and I need MPIR to build a performance-critical application on Windows that will run on heterogeneous CPUs. The last version I built also uses MPIR, and I am really grateful for all your work.

This time, since the performance of the arbitrary-precision arithmetic is extremely important, a fat binary is needed. I downloaded the master branch from Git and used mingw64 on msys as the build environment, with the following configure options:

./configure --disable-fft --disable-shared --enable-fat

The "make" and "make check" steps were successful, but when compiling the application, mingw64-g++ reported duplicate symbols, namely g?mpn_k8_redc_1 in mpn/bobcat/redc_1.asm and ?mpn_core2_redc_1 in mpn/sandybridge/redc_1.asm. I renamed them to get rid of the problem, and the application now compiles.

But then, when I tested it against another build configured with the following options

./configure --disable-fft --disable-shared

the fat binary is slower by about 10%. Furthermore, compared to GMP on cygwin64, there is also a large slowdown. Here are some rough timings:

msys-mpir-fat ~60min
msys-mpir-nofat ~55min
cygwin64-gmp-nofat ~40min

I was not able to build MPIR on cygwin; it more or less got stuck when building libmpn.la with libtool. I used the --with-system-yasm option (otherwise it complains about cygwin-style symbolic links).

The build environment:
CPU: i7 3720QM (Ivy Bridge)
Memory: 8GB
MPIR version: 2014-02-26
Mingw64 version: gcc-4.8.2
Msys version: 2011-11-23

I would deeply appreciate any useful suggestion. Thank you in advance!

Cheers,
fwjmath.

Jean-Pierre Flori

Feb 27, 2014, 8:26:40 AM
to mpir-...@googlegroups.com


On Thursday, February 27, 2014 2:19:37 PM UTC+1, fwjmath wrote:
I was not able to build MPIR on cygwin; it more or less got stuck when building libmpn.la with libtool. I used the --with-system-yasm option (otherwise it complains about cygwin-style symbolic links).

Strange. I have been building MPIR successfully on Cygwin64 for a long time, using either the system yasm or the one shipped with MPIR.
I'll double-check this weekend that nothing has broken since the last time I tried.

Jean-Pierre Flori

Feb 27, 2014, 8:27:48 AM
to mpir-...@googlegroups.com


On Thursday, February 27, 2014 2:26:40 PM UTC+1, Jean-Pierre Flori wrote:

I was not able to build MPIR on cygwin; it more or less got stuck when building libmpn.la with libtool. I used the --with-system-yasm option (otherwise it complains about cygwin-style symbolic links).

Could you provide more details, such as the output from configure and make, and the config.log file?

Bill Hart

Feb 27, 2014, 8:31:44 AM
to mpir-devel
Hi fwjmath,




On 27 February 2014 14:19, fwjmath <fwj...@gmail.com> wrote:
The "make" and "make check" steps were successful, but when compiling the application, mingw64-g++ reported duplicate symbols, namely g?mpn_k8_redc_1 in mpn/bobcat/redc_1.asm and ?mpn_core2_redc_1 in mpn/sandybridge/redc_1.asm. I renamed them to get rid of the problem, and the application now compiles.

Thanks for the report. We will look into this one before the release.
 

But then, when I tested it against another build configured with the following options

./configure --disable-fft --disable-shared

the fat binary is slower by about 10%.

The fat binary should be slower. It calls through a runtime lookup table of functions covering a variety of processors, which slows many things down.
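
To make that cost concrete, here is a minimal Python sketch of the dispatch idea. It is an illustration only and not MPIR's actual mechanism: the CPU names, the `detect_cpu` stub and the `mul_*` functions are all hypothetical. A non-fat build binds each call to one implementation at build time, whereas a fat build picks an implementation from a table at run time and pays an extra indirection on every call.

```python
# Hypothetical sketch of fat-binary style dispatch.
# None of these names are taken from MPIR itself.

def mul_generic(a, b):
    """Portable fallback implementation."""
    return a * b

def mul_k8(a, b):
    """Stand-in for a routine tuned for one CPU family."""
    return a * b

def mul_core2(a, b):
    """Stand-in for a routine tuned for another CPU family."""
    return a * b

# One table of implementations per supported CPU family.
DISPATCH = {
    "k8": {"mul": mul_k8},
    "core2": {"mul": mul_core2},
}
GENERIC = {"mul": mul_generic}

def detect_cpu():
    """Stub: a real fat binary would query the CPU at start-up."""
    return "core2"

def init_table(cpu):
    """Pick the table for this CPU, falling back to the generic one."""
    return DISPATCH.get(cpu, GENERIC)

TABLE = init_table(detect_cpu())

def fat_mul(a, b):
    # Every call goes through the table chosen at start-up,
    # which is the per-call overhead a non-fat build avoids.
    return TABLE["mul"](a, b)

print(fat_mul(6, 7))
```

For operands as small as a few limbs, the work done per call is tiny, so a fixed per-call overhead of this kind is likely to be more visible than it would be for very large operands.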
 
Furthermore, compared to GMP on cygwin64, there is also a large slowdown. Here are some rough timings:

msys-mpir-fat ~60min
msys-mpir-nofat ~55min
cygwin64-gmp-nofat ~40min

Cygwin uses a different ABI to msys. So I expect the times to be different.

I'm not saying that for your particular application GMP won't be faster. That's possible, as we haven't finished with performance improvements for this release yet (and sometimes they are faster than us and vice versa even when we do make a release). 
 

I was not able to build MPIR on cygwin; it more or less got stuck when building libmpn.la with libtool. I used the --with-system-yasm option (otherwise it complains about cygwin-style symbolic links).

Thanks again for the report. Obviously the master branch is not ready for a release yet!
 


Volker Braun

Feb 27, 2014, 8:32:53 AM
to mpir-...@googlegroups.com
There is no fat binary; it's just a misnomer for generic cflags.


On Thursday, February 27, 2014 2:19:37 PM UTC+1, fwjmath wrote:

Bill Hart

Feb 27, 2014, 8:37:40 AM
to mpir-devel
Volker,

MPIR does have the capability of building a fat binary.

Bill.



Jean-Pierre Flori

Feb 27, 2014, 9:13:47 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com


On Thursday, February 27, 2014 2:31:44 PM UTC+1, Bill Hart wrote:
Hi fwjmath,


Cygwin uses a different ABI to msys. So I expect the times to be different.

I think they both use the Win64 function call ABI (oh yeah, Microsoft was nice enough not to use the AMD64 ABI...).
There is no real documentation about that, but you can find some clues on the Cygwin mailing list.
In particular, fewer arguments are passed in registers.

The main difference is the data model: the Win64 ABI is LLP64 (so longs are only 4 bytes) and mingw64 follows it, whereas cygwin64 is LP64 (as in the AMD64 ABI).
Note that MPIR and GMP should be smart enough to define the limb type, which is what is used most of the time, to be 8 bytes long on both mingw64 and cygwin64.
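
A quick way to check which data model a given environment uses is to look at the sizes of the C integer types. The Python sketch below does this through `ctypes`, which reports the sizes for the C compiler that built the running interpreter. The `data_model` helper is a hypothetical name, and it only distinguishes the two 64-bit models discussed here.

```python
import ctypes

def data_model(long_size, pointer_size):
    """Classify a 64-bit data model from sizeof(long) and sizeof(void *)."""
    if pointer_size == 8 and long_size == 8:
        return "LP64"    # e.g. the AMD64 ABI, cygwin64
    if pointer_size == 8 and long_size == 4:
        return "LLP64"   # e.g. the Win64 ABI, mingw64
    return "other"

# Sizes as seen by the C toolchain behind this Python build.
sizes = {
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}

print(sizes, data_model(sizes["long"], sizes["void *"]))
```

Under LLP64 a limb declared as `long` would be only 4 bytes wide, so an 8-byte limb there has to be based on a wider type such as `long long`.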

Bill Hart

Feb 27, 2014, 9:43:03 AM
to Jean-Pierre Flori, mpir-devel
Yeah, the Win64 function call ABI is used, but the longs are 64 bits, not 32.

This probably shouldn't affect things too much for the reasons you state.

I don't have any special insight except that maybe it is not detecting the processor correctly on Cygwin64, or there's some bottleneck which has been sped up in GMP and not MPIR.

The configuration information you requested should help us get to the bottom of that one (as will the performance improvements I have been making...).

Bill.

fwjmath

Feb 27, 2014, 10:29:53 AM
to mpir-...@googlegroups.com
Hi Jean-Pierre,

Thanks for your reply. I should clarify that MPIR builds on cygwin64 with the cygwin toolchain, but not with the mingw64 toolchain.

But I just gave it another try without --enable-fat, and it seems to work.

I will now try it with --enable-fat.

Cheers,
fwjmath.

On Thursday, February 27, 2014 2:27:48 PM UTC+1, Jean-Pierre Flori wrote:

fwjmath

Feb 27, 2014, 10:47:38 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Hi Bill,

Thank you very much for your reply. It is really helpful. If the fat binary is slower, then I will simply switch to a normal build. The ABI issue has already been explained, so I will not go into it.

In case it helps, here is my application scenario. The whole computation relies heavily on MPIR, but it only uses integers of at most a bit more than 128 bits, and mostly basic operations.

Since the generic x86_64 target is picked for CPU tuning when compiling with mingw64, I will try later with an appropriate option to see whether it gives better timings on my machine.

Cheers,
fwjmath.

On Thursday, February 27, 2014 2:31:44 PM UTC+1, Bill Hart wrote:

Bill Hart

Feb 27, 2014, 10:54:35 AM
to mpir-devel
--enable-fat is not guaranteed to work on all machines. Basically the idea is to build a fat binary on one machine and then distribute the binaries to all the other machines.

Which means.... finding a machine on which --enable-fat builds.

To be honest, I'm surprised it works at all on Windows.

Bill.



fwjmath

Feb 27, 2014, 11:49:02 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Strangely, it now also compiles with --enable-fat. The only explanation is that I confused the --build and --host options... The non-fat MPIR compiled under cygwin seems to work great, with speed comparable to GMP, though more testing is needed. Since on x86_64 it is tuned as k8, it should be general enough.

Sorry for troubling all of you and thanks a lot for all your comments!

Cheers,
fwjmath.

On Thursday, February 27, 2014 4:54:35 PM UTC+1, Bill Hart wrote:

Bill Hart

Feb 27, 2014, 1:11:20 PM
to fwjmath, mpir-devel
Some things that can cause a 10% speed regression:

* not using the latest gcc
* a statistical timing fluke
* machine load
* dynamically linking against MPIR instead of statically linking (the latter is faster, where permitted)

Bill.
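
One way to separate a real 10% difference from a timing fluke or machine load is to repeat the measurement and look at the best and median times instead of a single run. A minimal Python sketch follows; `workload` is a hypothetical stand-in for the real computation, and its 128-bit operands merely echo the sizes mentioned earlier in the thread.

```python
import statistics
import time

def workload():
    """Hypothetical stand-in: multiplications of roughly 128-bit integers."""
    a = (1 << 127) + 12345
    b = (1 << 126) + 67890
    modulus = (1 << 128) - 159
    total = 0
    for _ in range(10_000):
        total += (a * b) % modulus
    return total

def time_runs(func, repeats=7):
    """Return the wall-clock duration of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations

runs = time_runs(workload)
best = min(runs)
median = statistics.median(runs)
# Relative gap between the slowest and fastest run.
spread = (max(runs) - best) / best

print(f"best {best:.6f}s  median {median:.6f}s  spread {spread:.1%}")
```

If the spread between identical runs is itself close to 10%, a single fat versus non-fat comparison cannot settle the question.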

Bill Hart

Mar 28, 2014, 11:09:41 AM
to mpir-devel, Wenjie Fang
Hi fwjmath,

I have now fixed the duplicate symbol issue you were seeing in our master branch. The assembly files themselves contained the wrong symbol names, doubtless due to a copy-and-paste error.

If you build using MinGW or Cygwin at present you need to do

autoreconf -fiv

before running configure. This extra step won't be necessary in the final release, due out within days.

I hope this fixes the issue for you. 

Bill.

On 27 February 2014 14:19, fwjmath <fwj...@gmail.com> wrote: