On 29/08/15 19:17, Prroffessorr Fir Kenobi wrote:
> On Saturday, 29 August 2015 18:57:11 UTC+2, user Vir
> Campestris wrote:
>> On 27/08/2015 22:02, Prroffessorr Fir Kenobi wrote:
>>> On Thursday, 27 August 2015 22:21:23 UTC+2, user Vir
>>> Campestris wrote:
>>>> On 26/08/2015 20:40, Prroffessorr Fir Kenobi wrote:
>>>>
>>>> What optimisation settings are you using?
>>>>
>>>
>>> c:\mingw\bin\g++ -std=c++98 -O2 -c -w test7.c -fno-rtti
>>> -fno-exceptions -fno-threadsafe-statics -march=core2
>>> -mtune=generic -mfpmath=both -msse2
>>>
Why are you using "-mtune=generic"? With "-march=core2" you are telling
gcc that the code will always run on a "core2" (or compatible) machine,
yet with "-mtune=generic" you are asking it to tune the code to run
reasonably on any x86.
Instead, you want "-march=native" for best performance - this tells the
compiler that it can use anything supported by the current processor,
and should optimise code to run on exactly that system (it implies
"-mtune=native" as well). You only need other flags if you are trying
to make binaries that run fast on a range of different x86 systems - but
here your main aim is speed on your own system.

And you don't want to use flags like "-mfpmath=both" or "-msse2" - use
"-march=native" and let the compiler do the best job, as it already
knows what extensions your cpu supports.
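
If you want to check what a given set of flags actually turns on, one
rough way is to look at gcc's predefined macros. This is only a sketch:
the names kHasSse2, kHasAvx, kHasAvx2 and print_isa are made up for the
example, while __SSE2__, __AVX__ and __AVX2__ are the macros gcc defines
when those extensions are enabled.

```cpp
#include <cstdio>

// Each constant records whether the compiler was allowed to use that
// extension for this translation unit (via -march=..., -msse2, -mavx, ...).
#ifdef __SSE2__
constexpr bool kHasSse2 = true;
#else
constexpr bool kHasSse2 = false;
#endif

#ifdef __AVX__
constexpr bool kHasAvx = true;
#else
constexpr bool kHasAvx = false;
#endif

#ifdef __AVX2__
constexpr bool kHasAvx2 = true;
#else
constexpr bool kHasAvx2 = false;
#endif

// Hypothetical helper: print the compile-time settings as 0 or 1.
inline void print_isa() {
    std::printf("SSE2: %d  AVX: %d  AVX2: %d\n",
                kHasSse2, kHasAvx, kHasAvx2);
}
```

Call print_isa() from main and build once with "-O2 -march=core2" and
once with "-O2 -march=native". On a cpu newer than a Core 2, the second
build will typically report more extensions.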

If your code has any floating point, and you don't need strict
conformance to the IEEE 754 rules (few programs need that), then use
"-ffast-math" to allow the compiler greater freedom in maths optimisations.
>>> (-std=c++98 was used to check whether it might speed things up,
>>> but it made no difference)
>>>
>>
>> OK, I'm not familiar with that compiler. But is O2 really the
>> highest level? GCC and Clang both have O3.
mingw is gcc on Windows.
>>
>> No point in complaining the compiler is too slow when you haven't
>> told it to do its best.
>>
>> Andy
>
> -O3 may produce unstable code with risky optimisations that sometimes
> even crashes. I tested it here, and -O3 makes slower code.
>
-O3 should not ever produce "risky" optimisations or code that crashes
because of optimisations. If your code crashes with -O3 but not with
-O2, then either your code has bugs or the compiler has bugs. Almost
certainly it is the former, especially when you are mixing in inline
assembly. But if you think it is a real gcc bug, then the gcc folk
would love to know about it.
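
As an illustration of the "your code has bugs" case, here is one common
pattern - it is not taken from your code, and the function names are
invented. Signed integer overflow is undefined behaviour, so the
optimiser is allowed to assume it never happens. Whether a particular
gcc version actually removes the buggy check depends on the version and
flags.

```cpp
#include <climits>

// BUG: x + 1 overflows when x == INT_MAX, which is undefined behaviour.
// The compiler may assume that never happens and reduce this test to
// "false" once optimisation is enabled.
inline bool next_would_overflow_buggy(int x) {
    return x + 1 < x;
}

// Well-defined version: compare against the limit before doing any
// arithmetic, so no overflow can occur.
constexpr bool next_would_overflow(int x) {
    return x == INT_MAX;
}
```

Code like the first function can appear to work at -O0 or -O2 and then
"break" at -O3, but the fault is in the source, not in the optimiser.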

There are sometimes optimisations that are experimental or known to
cause problems - these are clearly marked in the documentation for that
release of gcc, and always require specific flags (they are never part
of any -O flag).

On the other hand, it is well-known that -O3 does not necessarily lead
to faster code than -O2. It depends very much on the circumstances and
the code in question, as well as the processor in use. For example, -O3
is more aggressive about loop unrolling - cache effects and a good
branch predictor may mean that an unrolled loop is actually slower than
the rolled loop. Some trial and error here is worth the effort for best
code, and you might have fun playing with other options in:
<https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html>
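
For the trial and error, a minimal timing sketch along these lines is
usually enough. The names sum_squares and time_ms are hypothetical, and
the workload is just a stand-in for your own function.

```cpp
#include <chrono>
#include <cstdint>

// Stand-in workload: sum of i*i for i in [0, n).
constexpr std::uint64_t sum_squares(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i * i;
    }
    return total;
}

// Run f() once and return the elapsed wall-clock time in milliseconds,
// measured with a monotonic clock.
template <typename F>
double time_ms(F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Build the same source with -O2 and with -O3 and compare the numbers from
several runs. Take the loop count from the command line and store the
result in a volatile variable, otherwise the compiler may compute the
answer at compile time or drop the work entirely.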

You can even change the optimisation level for individual functions,
using the "optimize" function attribute or "#pragma GCC optimize"
(available since gcc 4.4).
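
A rough sketch of the per-function approach, using gcc's "optimize"
function attribute. The OPT_LEVEL macro and the two function names are
invented for this example; only the attribute itself comes from gcc.

```cpp
#include <cstddef>

// The "optimize" attribute is gcc-specific, so hide it from other compilers.
#if defined(__GNUC__) && !defined(__clang__)
#define OPT_LEVEL(level) __attribute__((optimize(level)))
#else
#define OPT_LEVEL(level)
#endif

// Hypothetical hot function: built at -O3 whatever the command line says.
OPT_LEVEL("O3")
constexpr long sum_hot(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// The same loop built at -O2, for comparing speed or generated assembly.
OPT_LEVEL("O2")
constexpr long sum_plain(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Note that the gcc manual describes this attribute as intended for
debugging rather than production code, so treat it as a tool for
experiments. For a final build, putting the hot functions in their own
source file with its own flags is the more conventional route.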