First: It would be quite helpful to know which system (hardware,
operating system, 32bit or 64bit mode) you used for the benchmark.
I actually wonder why you didn't see much improvement with -O3.
Usually, -O2/-O3 is significantly faster than -O0. Especially if you
are on a 32bit system, you could try "-march=native"; otherwise,
GCC/gfortran generates code which also runs on rather old, much less
capable systems. (Intel's equivalent is -xHost, but that compiler
defaults to more capable computers, so it matters less.)
For older GCCs, I usually use "-O3 -march=native -ffast-math
-funroll-loops" as benchmark setting.
With GCC 4.6/4.7, I additionally use -finline-limit=600 -fwhole-program
-flto.
With GCC 4.7, I add -fstack-arrays on top of the 4.6 settings.
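
To make the per-version settings above easier to reproduce, here is a
minimal Python sketch that assembles the flag list for a given GCC
version. The names benchmark_flags and compile_command, the file name
bench.f90 and the output name are made up for illustration; applying the
4.6 and 4.7 additions to all later versions as well is an assumption of
this sketch, not something stated above.

```python
def benchmark_flags(gcc_version):
    """Return the benchmark flags for a (major, minor) GCC version tuple.

    Assumption: the 4.6 and 4.7 additions also apply to later versions.
    """
    # Base setting used for older GCC releases.
    flags = ["-O3", "-march=native", "-ffast-math", "-funroll-loops"]
    if gcc_version >= (4, 6):
        # Added for GCC 4.6/4.7.
        flags += ["-finline-limit=600", "-fwhole-program", "-flto"]
    if gcc_version >= (4, 7):
        # Added on top of the 4.6 settings for GCC 4.7.
        flags += ["-fstack-arrays"]
    return flags


def compile_command(source, gcc_version, output="bench"):
    """Build a gfortran command line (hypothetical file names)."""
    return ["gfortran", *benchmark_flags(gcc_version), source, "-o", output]


if __name__ == "__main__":
    # Print the command rather than running it, so no compiler is needed.
    print(" ".join(compile_command("bench.f90", (4, 7))))
```

The script only prints the command line; pass the list to
subprocess.run if you actually want to invoke the compiler.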
With those settings, I can lower the geometric mean execution time of
the Polyhedron benchmark with GCC 4.7 from 10.56s (=100%) to 9.52s
(90%); using Intel's floating-point math library (libimf), the time
drops further to 9.06s (85%), while with ifort 11.1 I get 9.86s (93%)
[with -fast, which implies -ipo -O3 -no-prec-div -static -xHost]. Note:
The performance of the individual benchmarks varies hugely.
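
For reference, a geometric mean and a relative percentage like the ones
quoted above can be computed as in the following Python sketch. The
function names and the three sample timings are made up for
illustration; only 10.56s and 9.52s are taken from the numbers above.

```python
import math


def geometric_mean(times):
    """Geometric mean of positive run times, computed via logarithms."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


def relative_percent(time, baseline):
    """Express a run time as a percentage of the baseline run time."""
    return 100.0 * time / baseline


if __name__ == "__main__":
    # Hypothetical per-benchmark times in seconds, not measured data.
    sample_times = [2.0, 8.0, 4.0]
    print("geometric mean: %.2fs" % geometric_mean(sample_times))

    # Geometric means quoted above: 10.56s baseline, 9.52s with the flags.
    print("relative time: %.0f%%" % relative_percent(9.52, 10.56))
```

The geometric mean is the usual choice here because a single
long-running benchmark does not dominate it the way it would dominate
an arithmetic mean.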
Cf.
https://userpage.physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/iff/
(Intel Core(TM)2 Duo CPU E8400 @ 3.00GHz and using CentOS Linux 5.5
(x86-64)).
Tobias
PS: I plan to have a look at your benchmark.