MPIR tuning -- help needed

Bill Hart

Oct 23, 2012, 9:57:35 PM
to mpir-devel
Hi all,

I have run make check and make tune on the following arches:

x86_64/k102 - fermat
x86_64/nehalem - jeff gilchrist
x86_64/k8 - flavius
x86/pentium4/sse2 - cicero
x86_64/netburst - sextus (tune crashed)
x86_64/core2 - eno
sparc64 - mark (tune crashed) (ultrasparc3)
sparc32 - mark (ultrasparc3)
x86_64/penryn - sage.math
x86_64/k10 - gcc16
x86_64/atom - gcc46
mips64 - gcc49 (tuning failed)
ppc64 - (tuning failed) (IBM power7)

However, I still have no tuning values for alpha, ARM, AMD Bobcat,
Intel Sandy Bridge, mips32, ppc32. If anyone has access to such
machines, please let me know.

We have generic tuning values for the fft, but it is better to have
properly tuned values.

Brian, you should be able to pull tuning values for Windows from the
*nix values now. I'm afraid the only x86 amongst them is the
x86/pentium4/sse2 machine. But there are plenty of x86_64s.

The two crashed tuning runs are due to the fft tuning crashing. I
don't know what caused this, but it isn't urgent to fix it. We
expected tuning to fail on lots of platforms. I constructed the best
set of values I could from the tuning values that we were able to get.
The other tuning failures are known failures which have never been
fixed. The default values will have to do on these machines.

Bill.

Jean-Pierre Flori

Oct 24, 2012, 4:58:05 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
[jp@jp-x220]% uname -a
Linux jp-x220 3.5-trunk-amd64 #1 SMP Debian 3.5.5-1~experimental.1 x86_64 GNU/Linux

[jp@jp-x220]% cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
...

[jp@jp-x220]% gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-4' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-4)

[jp@jp-x220]% ./configure --prefix=$LOCAL --enable-gmpcompat --enable-cxx

[jp@jp-x220]% ./config.guess
sandybridge-unknown-linux-gnu

[jp@jp-x220]% make tune
...
Parameters for ./mpn/x86_64/sandybridge/gmp-mparam.h
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 1.25e-09 secs, CPU freq 800.00 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 4.7 */

#define MUL_KARATSUBA_THRESHOLD          16
#define MUL_TOOM3_THRESHOLD             105
#define MUL_TOOM4_THRESHOLD             244
#define MUL_TOOM8H_THRESHOLD            327

#define SQR_BASECASE_THRESHOLD            0  /* always (native) */
#define SQR_KARATSUBA_THRESHOLD          31
#define SQR_TOOM3_THRESHOLD             101
#define SQR_TOOM4_THRESHOLD             256
#define SQR_TOOM8_THRESHOLD             333

#define POWM_THRESHOLD                  138

#define HGCD_THRESHOLD                   75
#define GCD_DC_THRESHOLD               2797
#define GCDEXT_DC_THRESHOLD            1788
#define JACOBI_BASE_METHOD                1

#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_1_UNNORM_THRESHOLD     MP_SIZE_T_MAX  /* never */
#define MOD_1_NORM_THRESHOLD              0  /* always */
#define MOD_1_UNNORM_THRESHOLD            0  /* always */
#define USE_PREINV_DIVREM_1               1  /* native */
#define USE_PREINV_MOD_1                  1
#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
#define MOD_1_1_THRESHOLD                 7
#define MOD_1_2_THRESHOLD                 7
#define MOD_1_3_THRESHOLD                23
#define DIVREM_HENSEL_QR_1_THRESHOLD     29
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD      5
#define DIVREM_EUCLID_HENSEL_THRESHOLD    146

#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             17
#define GET_STR_PRECOMPUTE_THRESHOLD     23
#define SET_STR_DC_THRESHOLD           6915
#define SET_STR_PRECOMPUTE_THRESHOLD   7939

#define MUL_FFT_FULL_THRESHOLD         3008

#define SQR_FFT_FULL_THRESHOLD         3520

#define MULLOW_BASECASE_THRESHOLD         7
#define MULLOW_DC_THRESHOLD              30
#define MULLOW_MUL_THRESHOLD           4525

#define MULHIGH_BASECASE_THRESHOLD       10
#define MULHIGH_DC_THRESHOLD             27
#define MULHIGH_MUL_THRESHOLD          2966

#define MULMOD_2EXPM1_THRESHOLD          20

#define FAC_UI_THRESHOLD               1590
#define DC_DIV_QR_THRESHOLD             100
#define DC_DIVAPPR_Q_N_THRESHOLD         90
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD        90
#define DC_DIV_Q_THRESHOLD              136
#define INV_DIV_Q_THRESHOLD            5581
#define DC_DIVAPPR_Q_THRESHOLD          100
#define INV_DIVAPPR_Q_THRESHOLD       12502
#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              44

/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 2 }, { 3, 2 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { 4, 3, 4, 4, 4, 3, 3, 3, 3, 2, 2, 3, 2, 2, 2, 2, 2, 1, 1 }

#define FFT_N_NUM 19

#define FFT_MULMOD_2EXPP1_CUTOFF 128


/* Tuneup completed successfully, took 125 seconds */


Jean-Pierre Flori

Oct 24, 2012, 5:06:32 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
There might have been some problems with CPU throttling above (look at the 800 MHz in the make tune output).
Here is what I get when setting the cpufreq governor to performance (i.e. 2.7 GHz).


[jp@jp-x220]% make tune
...
./tuneup

Parameters for ./mpn/x86_64/sandybridge/gmp-mparam.h
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 3.70e-10 secs, CPU freq 2701.00 MHz

DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 4.7 */

#define MUL_KARATSUBA_THRESHOLD          16
#define MUL_TOOM3_THRESHOLD             105
#define MUL_TOOM4_THRESHOLD             246

#define MUL_TOOM8H_THRESHOLD            327

#define SQR_BASECASE_THRESHOLD            0  /* always (native) */
#define SQR_KARATSUBA_THRESHOLD          31
#define SQR_TOOM3_THRESHOLD              61
#define SQR_TOOM4_THRESHOLD             178
#define SQR_TOOM8_THRESHOLD             240

#define POWM_THRESHOLD                  138

#define HGCD_THRESHOLD                   42
#define GCD_DC_THRESHOLD               2770

#define GCDEXT_DC_THRESHOLD            1788
#define JACOBI_BASE_METHOD                1

#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_1_UNNORM_THRESHOLD     MP_SIZE_T_MAX  /* never */
#define MOD_1_NORM_THRESHOLD              0  /* always */
#define MOD_1_UNNORM_THRESHOLD            0  /* always */
#define USE_PREINV_DIVREM_1               1  /* native */
#define USE_PREINV_MOD_1                  1
#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
#define MOD_1_1_THRESHOLD                 7
#define MOD_1_2_THRESHOLD                 7
#define MOD_1_3_THRESHOLD                23
#define DIVREM_HENSEL_QR_1_THRESHOLD     31
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD      5
#define DIVREM_EUCLID_HENSEL_THRESHOLD     15

#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             16

#define GET_STR_PRECOMPUTE_THRESHOLD     23
#define SET_STR_DC_THRESHOLD           6915
#define SET_STR_PRECOMPUTE_THRESHOLD   6915


#define MUL_FFT_FULL_THRESHOLD         3008

#define SQR_FFT_FULL_THRESHOLD         3520

#define MULLOW_BASECASE_THRESHOLD         7
#define MULLOW_DC_THRESHOLD              30
#define MULLOW_MUL_THRESHOLD           4570


#define MULHIGH_BASECASE_THRESHOLD       10
#define MULHIGH_DC_THRESHOLD             27
#define MULHIGH_MUL_THRESHOLD          2966

#define MULMOD_2EXPM1_THRESHOLD          20

#define FAC_UI_THRESHOLD               1605
#define DC_DIV_QR_THRESHOLD             100
#define DC_DIVAPPR_Q_N_THRESHOLD         91
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD        91
#define DC_DIV_Q_THRESHOLD              130
#define INV_DIV_Q_THRESHOLD            5581
#define DC_DIVAPPR_Q_THRESHOLD          102
#define INV_DIVAPPR_Q_THRESHOLD       12637
#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              42


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 3 }, { 3, 2 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { 4, 3, 4, 4, 4, 3, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 2, 1, 2, 2, 1, 1 }

#define FFT_N_NUM 23

#define FFT_MULMOD_2EXPP1_CUTOFF 128


/* Tuneup completed successfully, took 137 seconds */
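
For anyone repeating this, the governor can be checked before starting make tune. The following is only a minimal sketch, assuming a Linux machine that exposes the cpufreq sysfs files under /sys/devices/system/cpu; the function name is made up for illustration and nothing here is part of MPIR.

```python
from pathlib import Path


def non_performance_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for CPUs whose governor is not 'performance'.

    Assumes the usual Linux layout cpuN/cpufreq/scaling_governor below
    sysfs_root. CPUs without a cpufreq directory are simply not listed.
    """
    offenders = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        governor = gov_file.read_text().strip()
        if governor != "performance":
            # gov_file is .../cpuN/cpufreq/scaling_governor, so go up twice.
            offenders[gov_file.parent.parent.name] = governor
    return offenders


if __name__ == "__main__":
    offenders = non_performance_cpus()
    if offenders:
        print("CPUs that may throttle during make tune:", offenders)
    else:
        print("No CPU reports a governor other than performance.")
```

An empty result only says that no CPU reports a different governor; it does not prove that the clock stays fixed, so the "CPU freq" line printed by tuneup is still worth a look.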

These are slightly different, but no more so than when I rerun make tune with the CPU still fixed at its maximum frequency; see below:


[jp@jp-x220]% make tune
...
./tuneup

Parameters for ./mpn/x86_64/sandybridge/gmp-mparam.h
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 3.70e-10 secs, CPU freq 2701.00 MHz

DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 4.7 */

#define MUL_KARATSUBA_THRESHOLD          16
#define MUL_TOOM3_THRESHOLD             105
#define MUL_TOOM4_THRESHOLD             244
#define MUL_TOOM8H_THRESHOLD            303


#define SQR_BASECASE_THRESHOLD            0  /* always (native) */
#define SQR_KARATSUBA_THRESHOLD          31
#define SQR_TOOM3_THRESHOLD              95
#define SQR_TOOM4_THRESHOLD             250
#define SQR_TOOM8_THRESHOLD             351

#define POWM_THRESHOLD                  138

#define HGCD_THRESHOLD                   37
#define GCD_DC_THRESHOLD               2587

#define GCDEXT_DC_THRESHOLD            1788
#define JACOBI_BASE_METHOD                1

#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_1_UNNORM_THRESHOLD     MP_SIZE_T_MAX  /* never */
#define MOD_1_NORM_THRESHOLD              0  /* always */
#define MOD_1_UNNORM_THRESHOLD            0  /* always */
#define USE_PREINV_DIVREM_1               1  /* native */
#define USE_PREINV_MOD_1                  1
#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
#define MOD_1_1_THRESHOLD                 7
#define MOD_1_2_THRESHOLD                 7
#define MOD_1_3_THRESHOLD                23
#define DIVREM_HENSEL_QR_1_THRESHOLD     31
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD      5
#define DIVREM_EUCLID_HENSEL_THRESHOLD    121


#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             17
#define GET_STR_PRECOMPUTE_THRESHOLD     23
#define SET_STR_DC_THRESHOLD           6915
#define SET_STR_PRECOMPUTE_THRESHOLD   8097


#define MUL_FFT_FULL_THRESHOLD         3008

#define SQR_FFT_FULL_THRESHOLD         3520

#define MULLOW_BASECASE_THRESHOLD         7
#define MULLOW_DC_THRESHOLD              30
#define MULLOW_MUL_THRESHOLD           4525

#define MULHIGH_BASECASE_THRESHOLD       10
#define MULHIGH_DC_THRESHOLD             30

#define MULHIGH_MUL_THRESHOLD          2966

#define MULMOD_2EXPM1_THRESHOLD          20

#define FAC_UI_THRESHOLD               1590
#define DC_DIV_QR_THRESHOLD             100
#define DC_DIVAPPR_Q_N_THRESHOLD         90
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD        90
#define DC_DIV_Q_THRESHOLD               39
#define INV_DIV_Q_THRESHOLD            5581
#define DC_DIVAPPR_Q_THRESHOLD          104
#define INV_DIVAPPR_Q_THRESHOLD       14091

#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              44

/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 3 }, { 3, 2 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { 4, 3, 3, 4, 4, 3, 3, 3, 3, 2, 2, 3, 2, 2, 2, 2, 2, 1, 1 }


#define FFT_N_NUM 19

#define FFT_MULMOD_2EXPP1_CUTOFF 128


/* Tuneup completed successfully, took 124 seconds */
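
To put a number on the run-to-run variation, the #define lines of two tuneup outputs can be compared directly. This is a rough sketch rather than an existing MPIR tool: the function names and the 25% cutoff are arbitrary choices for illustration, and the excerpts are a handful of lines from the two 2701 MHz runs in this message.

```python
import re

# Matches '#define NAME <integer>'; symbolic values such as MP_SIZE_T_MAX
# and the multi-line FFT tables are deliberately skipped.
DEFINE_RE = re.compile(r"^#define[ \t]+(\w+)[ \t]+(\d+)\b", re.MULTILINE)

RUN_2 = """
#define MUL_TOOM4_THRESHOLD             246
#define SQR_TOOM3_THRESHOLD              61
#define HGCD_THRESHOLD                   42
#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_EUCLID_HENSEL_THRESHOLD     15
"""

RUN_3 = """
#define MUL_TOOM4_THRESHOLD             244
#define SQR_TOOM3_THRESHOLD              95
#define HGCD_THRESHOLD                   37
#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_EUCLID_HENSEL_THRESHOLD    121
"""


def parse_thresholds(text):
    """Return {name: int} for every integer-valued #define in text."""
    return {name: int(value) for name, value in DEFINE_RE.findall(text)}


def relative_changes(run_a, run_b, min_ratio=1.25):
    """Return {name: (a, b)} where the larger value is >= min_ratio times the smaller."""
    changed = {}
    for name in sorted(run_a.keys() & run_b.keys()):
        a, b = run_a[name], run_b[name]
        if a == b:
            continue
        low, high = sorted((a, b))
        if low == 0 or high / low >= min_ratio:
            changed[name] = (a, b)
    return changed


if __name__ == "__main__":
    diff = relative_changes(parse_thresholds(RUN_2), parse_thresholds(RUN_3))
    for name, (a, b) in diff.items():
        print(f"{name}: {a} -> {b}")
```

On these excerpts only SQR_TOOM3_THRESHOLD and DIVREM_EUCLID_HENSEL_THRESHOLD cross the 25% cutoff; MUL_TOOM4_THRESHOLD and HGCD_THRESHOLD move by less.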

Jean-Pierre Flori

Oct 24, 2012, 5:48:32 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Another Intel/Linux machine, a subfamily of Nehalem I guess:

uname -a
Linux lame8.enst.fr 3.5.4-2.fc17.x86_64 #1 SMP Wed Sep 26 21:58:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

cat /proc/cpuinfo
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz

./config.guess
westmere-unknown-linux-gnu

./tuneup
Parameters for ./mpn/x86_64/nehalem/gmp-mparam.h

Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 3.41e-10 secs, CPU freq 2934.00 MHz

DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 4.7 */

#define MUL_KARATSUBA_THRESHOLD          16
#define MUL_TOOM3_THRESHOLD              90
#define MUL_TOOM4_THRESHOLD             166
#define MUL_TOOM8H_THRESHOLD            294


#define SQR_BASECASE_THRESHOLD            0  /* always (native) */
#define SQR_KARATSUBA_THRESHOLD          31
#define SQR_TOOM3_THRESHOLD              95
#define SQR_TOOM4_THRESHOLD             250
#define SQR_TOOM8_THRESHOLD             324

#define POWM_THRESHOLD                  101

#define HGCD_THRESHOLD                  103
#define GCD_DC_THRESHOLD               1488
#define GCDEXT_DC_THRESHOLD            1095

#define JACOBI_BASE_METHOD                1

#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_1_UNNORM_THRESHOLD     MP_SIZE_T_MAX  /* never */
#define MOD_1_NORM_THRESHOLD              0  /* always */
#define MOD_1_UNNORM_THRESHOLD            0  /* always */
#define USE_PREINV_DIVREM_1               1  /* native */
#define USE_PREINV_MOD_1                  1
#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
#define MOD_1_1_THRESHOLD                 5
#define MOD_1_2_THRESHOLD                 8
#define MOD_1_3_THRESHOLD                19
#define DIVREM_HENSEL_QR_1_THRESHOLD     10
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD      7
#define DIVREM_EUCLID_HENSEL_THRESHOLD     55

#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             13
#define GET_STR_PRECOMPUTE_THRESHOLD     20
#define SET_STR_DC_THRESHOLD           6203
#define SET_STR_PRECOMPUTE_THRESHOLD   9208

#define MUL_FFT_FULL_THRESHOLD         3008

#define SQR_FFT_FULL_THRESHOLD         2880

#define MULLOW_BASECASE_THRESHOLD         6
#define MULLOW_DC_THRESHOLD              11
#define MULLOW_MUL_THRESHOLD           2966

#define MULHIGH_BASECASE_THRESHOLD       16
#define MULHIGH_DC_THRESHOLD             16
#define MULHIGH_MUL_THRESHOLD          2966

#define MULMOD_2EXPM1_THRESHOLD          14


#define FAC_UI_THRESHOLD               1590
#define DC_DIV_QR_THRESHOLD             100
#define DC_DIVAPPR_Q_N_THRESHOLD         51
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD        51
#define DC_DIV_Q_THRESHOLD              108
#define INV_DIV_Q_THRESHOLD            5365
#define DC_DIVAPPR_Q_THRESHOLD           76

#define INV_DIVAPPR_Q_THRESHOLD       12502
#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              11


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 3 }, { 3, 2 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { 4, 3, 3, 3, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1 }


#define FFT_N_NUM 19

#define FFT_MULMOD_2EXPP1_CUTOFF 128


/* Tuneup completed successfully, took 141 seconds */


Jean-Pierre Flori

Oct 24, 2012, 6:20:43 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
On a SPARC machine (segfault at the end of make tune):

uname -a
SunOS esmeralda 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Blade-1500

gcc -v   
Reading specs from /usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/specs
Configured with: /sfw10/builds/build/sfw10-patch/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/ccs/bin/as --without-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)

./config.guess
ultrasparc3-sun-solaris2.10

./tuneup
Parameters for ./mpn/sparc64/gmp-mparam.h

Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 9.42e-10 secs, CPU freq 1062.00 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 3.4 */

#define MUL_KARATSUBA_THRESHOLD          34
#define MUL_TOOM3_THRESHOLD             102
#define MUL_TOOM4_THRESHOLD             450
#define MUL_TOOM8H_THRESHOLD            450

#define SQR_BASECASE_THRESHOLD            9
#define SQR_KARATSUBA_THRESHOLD          71
#define SQR_TOOM3_THRESHOLD             117
#define SQR_TOOM4_THRESHOLD             547
#define SQR_TOOM8_THRESHOLD             547

#define POWM_THRESHOLD                  984

#define HGCD_THRESHOLD                  113
#define GCD_DC_THRESHOLD                753
#define GCDEXT_DC_THRESHOLD             577
#define JACOBI_BASE_METHOD                3

#define DIVREM_1_NORM_THRESHOLD           3
#define DIVREM_1_UNNORM_THRESHOLD         3
#define MOD_1_NORM_THRESHOLD              3
#define MOD_1_UNNORM_THRESHOLD            3
#define USE_PREINV_DIVREM_1               1
#define USE_PREINV_MOD_1                  1
#define DIVREM_2_THRESHOLD                0  /* always */

#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always */
#define MOD_1_1_THRESHOLD                 9
#define MOD_1_2_THRESHOLD                12
#define MOD_1_3_THRESHOLD                17
#define DIVREM_HENSEL_QR_1_THRESHOLD    996
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD    996
#define DIVREM_EUCLID_HENSEL_THRESHOLD      8


#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             13
#define GET_STR_PRECOMPUTE_THRESHOLD     19
#define SET_STR_DC_THRESHOLD            286
#define SET_STR_PRECOMPUTE_THRESHOLD    362

#define MUL_FFT_FULL_THRESHOLD         2016

#define SQR_FFT_FULL_THRESHOLD         2016

#define MULLOW_BASECASE_THRESHOLD        32
#define MULLOW_DC_THRESHOLD              32
#define MULLOW_MUL_THRESHOLD           2257

#define MULHIGH_BASECASE_THRESHOLD       44
#define MULHIGH_DC_THRESHOLD             44
#define MULHIGH_MUL_THRESHOLD          3609

#define MULMOD_2EXPM1_THRESHOLD          16

#define FAC_UI_THRESHOLD               1074
#define DC_DIV_QR_THRESHOLD              19
#define DC_DIVAPPR_Q_N_THRESHOLD        106
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD       106
#define DC_DIV_Q_THRESHOLD              132
#define INV_DIV_Q_THRESHOLD            1442
#define DC_DIVAPPR_Q_THRESHOLD          114
#define INV_DIVAPPR_Q_THRESHOLD        2538
#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              48


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 2 }, { 2, 1 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { gmake: *** [tune] Segmentation Fault

Jean-Pierre Flori

Oct 24, 2012, 8:01:49 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Another sparc machine (same segfault):

uname -a
Linux lame5 3.2.0-3-sparc64-smp #1 SMP Mon Jul 23 05:25:29 UTC 2012 sparc64 GNU/Linux

cat /proc/cpuinfo
cpu             : UltraSparc T1 (Niagara)

./config.guess
cpu             : UltraSparc T1 (Niagara)


./tuneup
Parameters for ./mpn/sparc64/gmp-mparam.h
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 1.33e-10 secs, CPU freq 7499.55 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 4.6 */

#define MUL_KARATSUBA_THRESHOLD           6
#define MUL_TOOM3_THRESHOLD              53
#define MUL_TOOM4_THRESHOLD             242
#define MUL_TOOM8H_THRESHOLD            242

#define SQR_BASECASE_THRESHOLD            0  /* always */
#define SQR_KARATSUBA_THRESHOLD          12
#define SQR_TOOM3_THRESHOLD              41
#define SQR_TOOM4_THRESHOLD             222
#define SQR_TOOM8_THRESHOLD             222

#define POWM_THRESHOLD                  984

#define HGCD_THRESHOLD                   30
#define GCD_DC_THRESHOLD                212
#define GCDEXT_DC_THRESHOLD             194
#define JACOBI_BASE_METHOD                2

#define DIVREM_1_NORM_THRESHOLD           3
#define DIVREM_1_UNNORM_THRESHOLD         4

#define MOD_1_NORM_THRESHOLD              3
#define MOD_1_UNNORM_THRESHOLD            3
#define USE_PREINV_DIVREM_1               1
#define USE_PREINV_MOD_1                  1
#define DIVREM_2_THRESHOLD                0  /* always */
#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always */
#define MOD_1_1_THRESHOLD                25
#define MOD_1_2_THRESHOLD                25
#define MOD_1_3_THRESHOLD                25

#define DIVREM_HENSEL_QR_1_THRESHOLD    996
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD    996
#define DIVREM_EUCLID_HENSEL_THRESHOLD    108

#define ROOTREM_THRESHOLD                 3

#define GET_STR_DC_THRESHOLD            149
#define GET_STR_PRECOMPUTE_THRESHOLD    149
#define SET_STR_DC_THRESHOLD            152
#define SET_STR_PRECOMPUTE_THRESHOLD    161

#define MUL_FFT_FULL_THRESHOLD          248

#define SQR_FFT_FULL_THRESHOLD         1008

#define MULLOW_BASECASE_THRESHOLD         0  /* always */
#define MULLOW_DC_THRESHOLD              10
#define MULLOW_MUL_THRESHOLD            987

#define MULHIGH_BASECASE_THRESHOLD        0  /* always */
#define MULHIGH_DC_THRESHOLD              6
#define MULHIGH_MUL_THRESHOLD          1932

#define MULMOD_2EXPM1_THRESHOLD           4

#define FAC_UI_THRESHOLD               1024
#define DC_DIV_QR_THRESHOLD              34
#define DC_DIVAPPR_Q_N_THRESHOLD         55
#define INV_DIV_QR_THRESHOLD            241
#define INV_DIVAPPR_Q_N_THRESHOLD        55
#define DC_DIV_Q_THRESHOLD               75
#define INV_DIV_Q_THRESHOLD             924
#define DC_DIVAPPR_Q_THRESHOLD           73
#define INV_DIVAPPR_Q_THRESHOLD        2132
#define DC_BDIV_QR_THRESHOLD             58
#define DC_BDIV_Q_THRESHOLD              10


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 2, 1 }, { 1, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 } }

#define MULMOD_TAB \
   { make: *** [tune] Segmentation fault

Jean-Pierre Flori

Oct 24, 2012, 8:34:22 AM
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Another sparc machine (same segfault):

uname -a
SunOS infres1 5.10 Generic_139555-08 sun4v sparc SUNW,T5140

prtdiag
System Configuration:  Sun Microsystems  sun4v T5140
Memory size: 16160 Megabytes

================================ Virtual CPUs ================================


CPU ID Frequency Implementation         Status
------ --------- ---------------------- -------
0      1165 MHz  SUNW,UltraSPARC-T2+    on-line
...

gcc -v
Reading specs from /local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/specs

Configured with: /sfw10/builds/build/sfw10-patch/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/ccs/bin/as --without-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)

./config.guess
ultrasparc-sun-solaris2.10


./tuneup
Parameters for ./mpn/sparc64/gmp-mparam.h
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 8.58e-10 secs, CPU freq 1165.00 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-24, gcc 3.4 */

#define MUL_KARATSUBA_THRESHOLD          22
#define MUL_TOOM3_THRESHOLD              78
#define MUL_TOOM4_THRESHOLD             390
#define MUL_TOOM8H_THRESHOLD            390

#define SQR_BASECASE_THRESHOLD            8
#define SQR_KARATSUBA_THRESHOLD          54
#define SQR_TOOM3_THRESHOLD              87
#define SQR_TOOM4_THRESHOLD             426
#define SQR_TOOM8_THRESHOLD             426

#define POWM_THRESHOLD                  984

#define HGCD_THRESHOLD                   45
#define GCD_DC_THRESHOLD                 57
#define GCDEXT_DC_THRESHOLD             354
#define JACOBI_BASE_METHOD                2

#define DIVREM_1_NORM_THRESHOLD       MP_SIZE_T_MAX  /* never */
#define DIVREM_1_UNNORM_THRESHOLD     MP_SIZE_T_MAX  /* never */
#define MOD_1_NORM_THRESHOLD          MP_SIZE_T_MAX  /* never */
#define MOD_1_UNNORM_THRESHOLD        MP_SIZE_T_MAX  /* never */

#define USE_PREINV_DIVREM_1               1
#define USE_PREINV_MOD_1                  1
#define DIVREM_2_THRESHOLD            MP_SIZE_T_MAX  /* never */

#define DIVEXACT_1_THRESHOLD              0  /* always */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always */
#define MOD_1_1_THRESHOLD                23
#define MOD_1_2_THRESHOLD                23
#define MOD_1_3_THRESHOLD                23

#define DIVREM_HENSEL_QR_1_THRESHOLD    996
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD    996
#define DIVREM_EUCLID_HENSEL_THRESHOLD     45

#define ROOTREM_THRESHOLD                 3

#define GET_STR_DC_THRESHOLD             13
#define GET_STR_PRECOMPUTE_THRESHOLD     30
#define SET_STR_DC_THRESHOLD            210
#define SET_STR_PRECOMPUTE_THRESHOLD    238

#define MUL_FFT_FULL_THRESHOLD         1248

#define SQR_FFT_FULL_THRESHOLD         2016


#define MULLOW_BASECASE_THRESHOLD         0  /* always */
#define MULLOW_DC_THRESHOLD              14
#define MULLOW_MUL_THRESHOLD           1989

#define MULHIGH_BASECASE_THRESHOLD        4
#define MULHIGH_DC_THRESHOLD             11
#define MULHIGH_MUL_THRESHOLD          3114

#define MULMOD_2EXPM1_THRESHOLD          12

#define FAC_UI_THRESHOLD               1024
#define DC_DIV_QR_THRESHOLD              10
#define DC_DIVAPPR_Q_N_THRESHOLD         45
#define INV_DIV_QR_THRESHOLD            309
#define INV_DIVAPPR_Q_N_THRESHOLD        45
#define DC_DIV_Q_THRESHOLD               75
#define INV_DIV_Q_THRESHOLD            1787
#define DC_DIVAPPR_Q_THRESHOLD           55
#define INV_DIVAPPR_Q_THRESHOLD        5922
#define DC_BDIV_QR_THRESHOLD            100
#define DC_BDIV_Q_THRESHOLD              15


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 3, 2 }, { 3, 2 }, { 2, 1 }, { 1, 0 }, { 0, 0 } }

leif

Oct 24, 2012, 2:18:03 PM
to mpir-...@googlegroups.com
Bill Hart wrote:
> Hi all,
>
> I have run make check and make tune on the following arches:
>
> x86_64/k102 - fermat
> x86_64/nehalem - jeff gilchrist
> x86_64/k8 - flavius
> x86/pentium4/sse2 - cicero
> x86_64/netburst - sextus (tune crashed)
> x86_64/core2 - eno
> sparc64 - mark (tune crashed) (ultrasparc3)
> sparc32 - mark (ultrasparc3)
> x86_64/penryn - sage.math
> x86_64/k10 - gcc16
> x86_64/atom - gcc46
> mips64 - gcc49 (tuning failed)
> ppc64 - (tuning failed) (IBM power7)
>
> However, I still have no tuning values for alpha, ARM, AMD Bobcat,
> Intel Sandy Bridge, mips32, ppc32. If anyone has access to such
> machines, please let me know.

I can submit tuning parameters for an AMD E-450 (btver1, 512 KB L2 per
core; Linux, GCC 4.6.3 and/or GCC 4.7.0) tomorrow.


-leif


> We have generic tuning values for the fft, but it is better to have
> properly tuned values.
>
> Brian, you should be able to pull tuning values for Windows from the
> *nix values now. I'm afraid the only x86 amongst them is the
> x86/pentium4/sse2 machine. But there are plenty of x86_64s.
>
> The two crashed tuning runs are due to the fft tuning crashing. I
> don't know what caused this, but it isn't urgent to fix it. We
> expected tuning to fail on lots of platforms. I constructed the best
> set of values I could from the tuning values that we were able to get.
> The other tuning failures are known failures which have never been
> fixed. The default values will have to do on these machines.
>
> Bill.
>


--
() The ASCII Ribbon Campaign
/\ Help Cure HTML E-Mail

Bill Hart

Oct 24, 2012, 2:59:13 PM
to Jean-Pierre Flori, mpir-...@googlegroups.com
Hi,

I'm aware of the segfaults on sparc64. I've derived some adequate
tuning values for this machine, so won't worry about it too much.
Tuning has always been dodgy and always crashed on numerous machines.
We need a big drive to sort this out in the future, but it's a huge
job. The crash in the fft tuning code is almost certainly the tuning
code itself, and it was expected that it would fail on a few machines.
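
A log that stops inside the fft tuning section still contains every threshold printed before the crash, so it helps to know exactly what is missing. The sketch below is hypothetical tooling and not part of MPIR: the function names are invented, and the sample text is the tail of the Solaris sparc64 log posted earlier in this thread.

```python
import re

FFT_ITEMS = ("FFT_TAB", "MULMOD_TAB", "FFT_N_NUM", "FFT_MULMOD_2EXPP1_CUTOFF")

# Tail of the crashed sparc64 run, copied from the thread.
CRASHED_TAIL = r"""
#define DC_BDIV_Q_THRESHOLD              48

/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 2 }, { 2, 1 }, { 2, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { gmake: *** [tune] Segmentation Fault
"""


def fft_item_present(log_text, name):
    """True if name is fully defined, either as a brace table or as an integer."""
    # A table is '#define NAME \' followed by one line that opens and closes braces.
    table = re.search(
        r"^#define[ \t]+%s[ \t]*\\\n[ \t]*\{.*\}[ \t]*$" % name,
        log_text, re.MULTILINE)
    # A scalar is '#define NAME <integer>' on a single line.
    scalar = re.search(
        r"^#define[ \t]+%s[ \t]+\d+[ \t]*$" % name,
        log_text, re.MULTILINE)
    return bool(table or scalar)


def tuneup_report(log_text):
    """Summarise whether tuneup finished and which fft items never got printed."""
    return {
        "completed": "Tuneup completed successfully" in log_text,
        "missing_fft_items": [n for n in FFT_ITEMS
                              if not fft_item_present(log_text, n)],
    }


if __name__ == "__main__":
    print(tuneup_report(CRASHED_TAIL))
```

For the sample, FFT_TAB is reported as present while MULMOD_TAB, FFT_N_NUM and FFT_MULMOD_2EXPP1_CUTOFF are listed as missing, which matches where the segfault interrupted the output.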

We simply do what we can for now. Thanks very much for the figures.
I'll add the new values in tonight and tomorrow (when Leif supplies
his). This is probably realistically about all we are going to get.
There are default parameters for old machines which should be
sufficient given that you practically have to go to a museum to view
some of them.

Bill.

Bill Hart

Oct 25, 2012, 9:12:11 AM
to mpir-...@googlegroups.com
OK, I've added the parameters from JP. I'll wait until Leif supplies
us with the AMD Bobcat timings and that will probably have to do.

The one major thing we don't have is ARM tuning values. But we just
don't seem to have any ARMs online anywhere at the moment (except in
our pockets).

I'll move the *nix tuning values over to Windows tonight if Brian
doesn't beat me to it. Then we'll issue a beta.

Bill.

Brian Gladman

Oct 25, 2012, 11:44:30 AM
to mpir-...@googlegroups.com
-----Original Message-----
From: Bill Hart
Sent: Thursday, October 25, 2012 2:12 PM
To: mpir-...@googlegroups.com
Subject: [mpir-devel] Re: MPIR tuning -- help needed

OK, I've added the parameters from JP. I'll wait until Leif supplies
us with the AMD Bobcat timings and that will probably have to do.

The one major thing we don't have is ARM tuning values. But we just
don't seem to have any ARMs online anywhere at the moment (except in
our pockets).

I'll move the *nix tuning values over to Windows tonight if Brian
doesn't beat me to it. Then we'll issue a beta.

=====================
I've been doing these as you add them and I have just done the latest one a
few minutes ago.

Brian

leif

Oct 25, 2012, 4:34:42 PM
to mpir-...@googlegroups.com
Bill Hart wrote:
> OK, I've added the parameters from JP. I'll wait until Leif supplies
> us with the AMD Bobcat timings and that will probably have to do.

On the way... Some figures vary quite a lot (despite the machine being
otherwise idle), so I'm running tuneup a couple more times.
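
One possible way to combine several runs is to take the per-parameter median, which ignores a single outlying run. This is only a sketch under that assumption; the thread does not say how the final values are chosen, the function names are invented, and the numbers are four parameters from the three Sandy Bridge runs posted earlier (the first of which ran at 800 MHz).

```python
from statistics import median_low

# Four parameters from the three Sandy Bridge tuneup runs in this thread.
RUNS = [
    {"MUL_TOOM8H_THRESHOLD": 327, "SQR_TOOM3_THRESHOLD": 101,
     "HGCD_THRESHOLD": 75, "DC_DIV_Q_THRESHOLD": 136},
    {"MUL_TOOM8H_THRESHOLD": 327, "SQR_TOOM3_THRESHOLD": 61,
     "HGCD_THRESHOLD": 42, "DC_DIV_Q_THRESHOLD": 130},
    {"MUL_TOOM8H_THRESHOLD": 303, "SQR_TOOM3_THRESHOLD": 95,
     "HGCD_THRESHOLD": 37, "DC_DIV_Q_THRESHOLD": 39},
]


def combine_runs(runs):
    """Return the per-parameter low median over the runs that report it.

    median_low always returns a value that was actually measured, so the
    result stays an integer even for an even number of runs.
    """
    names = set().union(*(run.keys() for run in runs))
    return {name: median_low([run[name] for run in runs if name in run])
            for name in sorted(names)}


def spread(runs, name):
    """Return (smallest, largest) observed value for one parameter."""
    values = [run[name] for run in runs if name in run]
    return min(values), max(values)


if __name__ == "__main__":
    for name, value in combine_runs(RUNS).items():
        print(name, value, "observed range", spread(RUNS, name))
```

The spread is printed next to each median so that a parameter like DC_DIV_Q_THRESHOLD, which ranges from 39 to 136 here, stands out as one that may deserve another run.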


> The one major thing we don't have is ARM tuning values. But we just
> don't seem to have any ARMs online anywhere at the moment (except in
> our pockets).

I asked Julien Puydt, and he graciously contributed the attached ones
for ARM/Linux, GCC 4.6.? (ARM v7l, cf. attached cpuinfo).


-leif
tuneup.arm-linux.log
cpuinfo-arm.txt

Bill Hart

Oct 25, 2012, 4:45:21 PM
to mpir-...@googlegroups.com
On 25 October 2012 21:34, leif <not.r...@online.de> wrote:
> Bill Hart wrote:
>>
>> OK, I've added the parameters from JP. I'll wait until Leif supplies
>> us with the AMD Bobcat timings and that will probably have to do.
>
>
> On the way... Some figures vary quite a lot (despite the machine being
> otherwise idle), so I'm running tuneup a couple more times.

That's ok, some of the crossovers are pretty wide.

>
>
>
>> The one major thing we don't have is ARM tuning values. But we just
>> don't seem to have any ARMs online anywhere at the moment (except in
>> our pockets).
>
>
> I asked Julien Puydt, and he graciously contributed the attached ones for
> ARM/Linux, GCC 4.6.? (ARM v7l, cf. attached cpuinfo).
>

Fantastic! Thanks Julien!

Bill.

>
> -leif
>
>
> --
> () The ASCII Ribbon Campaign
> /\ Help Cure HTML E-Mail
>

Jean-Pierre Flori

Oct 25, 2012, 4:49:19 PM10/25/12
to mpir-...@googlegroups.com, goodwi...@googlemail.com
Here is another x86:

uname -a
Linux pichou 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:04:05 UTC 2012 i686 i686 i386 GNU/Linux

cat /proc/cpuinfo
model name    : Intel(R) Atom(TM) CPU N450   @ 1.66GHz

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.6/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

./config.guess
atom-pc-linux-gnu

./tuneup
Parameters for ./mpn/x86/k7/gmp-mparam.h

Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1000000, speed_unittime 6.00e-10 secs, CPU freq 1667.00 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 50000

/* Generated by tuneup.c, 2012-10-25, gcc 4.6 */

#define MUL_KARATSUBA_THRESHOLD          20
#define MUL_TOOM3_THRESHOLD             131
#define MUL_TOOM4_THRESHOLD             200
#define MUL_TOOM8H_THRESHOLD            327

#define SQR_BASECASE_THRESHOLD            0  /* always (native) */
#define SQR_KARATSUBA_THRESHOLD          39
#define SQR_TOOM3_THRESHOLD             132
#define SQR_TOOM4_THRESHOLD             315
#define SQR_TOOM8_THRESHOLD             372

#define POWM_THRESHOLD                  110

#define HGCD_THRESHOLD                   37
#define GCD_DC_THRESHOLD                 77
#define GCDEXT_DC_THRESHOLD             951
#define JACOBI_BASE_METHOD                2


#define USE_PREINV_DIVREM_1               1  /* native */
#define USE_PREINV_MOD_1                  1  /* native */
#define DIVREM_2_THRESHOLD                5
#define DIVEXACT_1_THRESHOLD              0  /* always (native) */
#define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
#define MOD_1_1_THRESHOLD                37
#define MOD_1_2_THRESHOLD                38
#define MOD_1_3_THRESHOLD                40

#define DIVREM_HENSEL_QR_1_THRESHOLD    996
#define RSH_DIVREM_HENSEL_QR_1_THRESHOLD    996
#define DIVREM_EUCLID_HENSEL_THRESHOLD     52

#define ROOTREM_THRESHOLD                 6

#define GET_STR_DC_THRESHOLD             13
#define GET_STR_PRECOMPUTE_THRESHOLD     24
#define SET_STR_DC_THRESHOLD            254
#define SET_STR_PRECOMPUTE_THRESHOLD    254

#define MUL_FFT_FULL_THRESHOLD         2240

#define SQR_FFT_FULL_THRESHOLD         2752

#define MULLOW_BASECASE_THRESHOLD         4
#define MULLOW_DC_THRESHOLD              50
#define MULLOW_MUL_THRESHOLD            458

#define MULHIGH_BASECASE_THRESHOLD        6
#define MULHIGH_DC_THRESHOLD             36
#define MULHIGH_MUL_THRESHOLD          2937

#define MULMOD_2EXPM1_THRESHOLD          20

#define FAC_UI_THRESHOLD               1024
#define DC_DIV_QR_THRESHOLD             100
#define DC_DIVAPPR_Q_N_THRESHOLD        233
#define INV_DIV_QR_THRESHOLD            465
#define INV_DIVAPPR_Q_N_THRESHOLD       233
#define DC_DIV_Q_THRESHOLD              233
#define INV_DIV_Q_THRESHOLD            2914
#define DC_DIVAPPR_Q_THRESHOLD          225
#define INV_DIVAPPR_Q_THRESHOLD        5624
#define DC_BDIV_QR_THRESHOLD            278
#define DC_BDIV_Q_THRESHOLD             162


/* fft_tuning -- autogenerated by tune-fft */

#define FFT_TAB \
   { { 4, 3 }, { 3, 2 }, { 2, 1 }, { 1, 1 }, { 1, 0 } }

#define MULMOD_TAB \
   { 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1 }

#define FFT_N_NUM 15

#define FFT_MULMOD_2EXPP1_CUTOFF 128


/* Tuneup completed successfully, took 788 seconds */

leif

Oct 25, 2012, 6:04:49 PM10/25/12
to mpir-...@googlegroups.com
Bill Hart wrote:
> On 25 October 2012 21:34, leif <not.r...@online.de> wrote:
>> Bill Hart wrote:
>>>
>>> OK, I've added the parameters from JP. I'll wait until Leif supplies
>>> us with the AMD Bobcat timings and that will probably have to do.
>>
>>
>> On the way... Some figures vary quite a lot (despite the machine being
>> otherwise idle), so I'm running tuneup a couple more times.
>
> That's ok, some of the crossovers are pretty wide.

Ok, I took the dominant or approx. average ones; see below for those
that vary broadly.


Have fun,

-leif


#define DIVREM_EUCLID_HENSEL_THRESHOLD 8
#define DIVREM_EUCLID_HENSEL_THRESHOLD 8
#define DIVREM_EUCLID_HENSEL_THRESHOLD 15
#define DIVREM_EUCLID_HENSEL_THRESHOLD 17
#define DIVREM_EUCLID_HENSEL_THRESHOLD 18
#define DIVREM_EUCLID_HENSEL_THRESHOLD 18
#define DIVREM_EUCLID_HENSEL_THRESHOLD 21
#define DIVREM_EUCLID_HENSEL_THRESHOLD 23
#define DIVREM_EUCLID_HENSEL_THRESHOLD 23
#define DIVREM_EUCLID_HENSEL_THRESHOLD 75
#define DIVREM_EUCLID_HENSEL_THRESHOLD 89
#define DIVREM_EUCLID_HENSEL_THRESHOLD 91
#define DIVREM_EUCLID_HENSEL_THRESHOLD 141
#define DIVREM_EUCLID_HENSEL_THRESHOLD 170
#define DIVREM_EUCLID_HENSEL_THRESHOLD 208


#define HGCD_THRESHOLD 30
#define HGCD_THRESHOLD 30
#define HGCD_THRESHOLD 30
#define HGCD_THRESHOLD 30
#define HGCD_THRESHOLD 30
#define HGCD_THRESHOLD 31
#define HGCD_THRESHOLD 35
#define HGCD_THRESHOLD 37
#define HGCD_THRESHOLD 52
#define HGCD_THRESHOLD 54
#define HGCD_THRESHOLD 92
#define HGCD_THRESHOLD 109
#define HGCD_THRESHOLD 110
#define HGCD_THRESHOLD 318
#define HGCD_THRESHOLD 422


#define SET_STR_PRECOMPUTE_THRESHOLD 214
#define SET_STR_PRECOMPUTE_THRESHOLD 214
#define SET_STR_PRECOMPUTE_THRESHOLD 222
#define SET_STR_PRECOMPUTE_THRESHOLD 230
#define SET_STR_PRECOMPUTE_THRESHOLD 240
#define SET_STR_PRECOMPUTE_THRESHOLD 382
#define SET_STR_PRECOMPUTE_THRESHOLD 399
#define SET_STR_PRECOMPUTE_THRESHOLD 411
#define SET_STR_PRECOMPUTE_THRESHOLD 427
#define SET_STR_PRECOMPUTE_THRESHOLD 499
#define SET_STR_PRECOMPUTE_THRESHOLD 671
#define SET_STR_PRECOMPUTE_THRESHOLD 716
#define SET_STR_PRECOMPUTE_THRESHOLD 752
#define SET_STR_PRECOMPUTE_THRESHOLD 828
#define SET_STR_PRECOMPUTE_THRESHOLD 984



bobcat-unknown-linux-gnu (btver1)
mpn__x86_64__bobcat__gmp-mparam.h
cpuinfo-bobcat.txt

Bill Hart

Oct 25, 2012, 6:11:59 PM10/25/12
to mpir-...@googlegroups.com
Hi Leif,

Some of the tuning code is absolute rubbish, so it produces almost
worthless values. Some of it is relatively stable though.

We need to do a major overhaul of the entire tuning system some day.

Thanks for these values. I think we've done pretty well, covering all
of x86_64 and most of the other major platforms still in use.

I'll commit them and once Brian indicates he's finished on the Windows
side I'll upload a beta.

Bill.

Brian Gladman

Oct 25, 2012, 6:17:58 PM10/25/12
to mpir-...@googlegroups.com
-----Original Message-----
From: Bill Hart
Sent: Thursday, October 25, 2012 11:11 PM
To: mpir-...@googlegroups.com
Subject: Re: [mpir-devel] Re: MPIR tuning -- help needed

Hi Leif,

Some of the tuning code is absolute rubbish, so it produces almost
worthless values. Some of it is relatively stable though.

We need to do a major overhaul of the entire tuning system some day.

Thanks for these values. I think we've done pretty well, covering all
of x86_64 and most of the other major platforms still in use.

I'll commit them and once Brian indicates he's finished on the Windows
side I'll upload a beta.

======================

My bit won't happen until tomorrow now as I'm almost asleep already!

Brian

Bill Hart

Oct 25, 2012, 6:19:02 PM10/25/12
to mpir-...@googlegroups.com
Brian,

that's ok. It's only the bobcat timings I think. Do you want me to do
it, or is there other stuff you need to take care of on the Windows
side?

Bill.

Brian Gladman

Oct 25, 2012, 6:22:37 PM10/25/12
to mpir-...@googlegroups.com
-----Original Message-----
From: Bill Hart
Sent: Thursday, October 25, 2012 11:19 PM
To: mpir-...@googlegroups.com
Subject: Re: [mpir-devel] Re: MPIR tuning -- help needed

Brian,

that's ok. It's only the bobcat timings I think. Do you want me to do
it, or is there other stuff you need to take care of on the Windows
side?

=================

Be my guest Bill - putting any tuning not yet committed into Windows is all
that needs to be done.

Brian

Bill Hart

Oct 25, 2012, 6:23:33 PM10/25/12
to mpir-...@googlegroups.com
OK, I think nehalem/westmere is also not there. I'll put that in too.
Not sure if it gets used or not, but it can't hurt anyway.

Bill.