I did not mean to imply that you used those words - I meant it as a
general sort of remark that I have seen made many times. Some people
are convinced that compiler makers are concerned only about the speed of
benchmarks, not how their compiler works for other code.
>> 1. Those working for chip manufacturers - Intel, AMD, ARM, etc. These
>> don't care if you use gcc or anything else, as long as you buy their
>> chips, so their concern is that you (the programmer) get the best from
>> the compiler and their chip. Benchmarks for the compiler don't matter -
>> support for the chip matters.
>
> The market statements are not mostly about performance of those chips
> and the likes of "Meltdown and Spectre" are not result of overeager
> optimizing of those and even when it seems to be so then nothing of it
> is measured with benchmarks compiled on those compilers?
Meltdown and Spectre are both /hardware/ problems. Can we at least
agree on that? Compilers have been able to add extra code to
reduce the effect of the problems - but this is /not/ because the
compilers were previously generating incorrect or suspect code.
(And while it's easy to see in hindsight that Intel should have been
more careful with speculation across privilege levels, it was harder to
see that in advance. There is no way to build a system in which code
from multiple sources runs and shares resources without at least some
theoretical information leakage.)
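To make that concrete, here is a rough sketch of my own (not taken
from any real project) of the sort of code the mitigations apply to.
The flags in the comments are the ones I am aware of - check your
compiler's documentation rather than taking my word for it.

  // An ordinary indirect call - the kind of branch Spectre v2 abuses.
  int dispatch(int (*handler)(int), int arg) {
      return handler(arg);   // emitted as a "retpoline" under the
                             // flags below
  }

  // As far as I know the relevant mitigation flags are roughly:
  //   gcc (x86-64):  -mindirect-branch=thunk -mfunction-return=thunk
  //   clang:         -mretpoline, or -mspeculative-load-hardening (v1)
  //   MSVC:          /Qspectre
  // None of these change what the code computes - they only change
  // how the branches are emitted, which is the point above.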
>
>> 2. Those working for software companies like Red Hat, IBM, etc., that
>> provide tools and services to developers. They want programmers to be
>> happy with the tools - they don't care if you use a different compiler
>> instead.
>
> The programmers they target are not overly concerned with performance?
Of course performance is important to many people - compiler writers and
compiler users included. Performance on /benchmarks/ is of less
concern, especially artificial benchmarks. Programmers want /their/
programs to run fast.
> My impression is that programmers start to post about performance before
> figuring out how to turn the optimizations on.
>
It is often the case that people who understand less, post more. (That
is in no way a comment about anyone posting here, in case you feel I was
trying to be insulting.)
>> 3. Those working for big users, like Google and Facebook. They don't
>> care about benchmarks - they care about performance on their own software.
>
> They do not use compilers to build their software and so do not care about
> compiler optimizations?
I really don't understand your point. I said they /do/ care about
performance - and therefore also about compiler optimisations.
>
>> 4. The independent and volunteer developers. They care about the
>> quality of their code, and making something worthwhile - they don't care
>> about benchmark performances.
>
> Again there are developers who don't care about performance?
I think you have been misunderstanding my point all along here. I
apologise if I have been unclear.
Yes, compiler writers care about performance and optimisations. I have
said this all along.
No, compiler writers are /not/ solely concerned about performance and
optimisations. It is merely one of many factors. The same applies to
people choosing compilers.
And it is definitely not the case that compiler writers care /primarily/
about the speed on artificial benchmark programs (like SPEC benchmarks).
They do not write their compilers with an aim of getting top marks on
these benchmarks. That does not mean they don't run them sometimes, and
compare figures (especially if there are regressions). It just means
that they do not use them as driving forces.
It is not the case that compiler writers are always looking for new ways
to take undefined behaviour in source code and turn it into a mess in
the object code. Rather, it /is/ the case that compiler writers have no
responsibility to try to generate "correct" results from incorrect
source code. They have /never/ had that responsibility. From the days
of the first C compilers 40+ years ago, signed integer overflow has been
undefined behaviour and compilers could assume it never happened if that
helped optimisation. Code that assumes wrapping behaviour is either
tied to a specific compiler (if the compiler documents that behaviour),
or it is /wrong/. As compilers have become more sophisticated, there
have been more occasions where incorrect code that happened to behave
predictably in the past no longer does. That is why compiler writers
add extra "this code has
mistakes, but worked with the behaviour of the old compiler" flags like
"-fwrapv" or "-fno-strict-aliasing".
>
>> I'm sure there are other categories that you can think of. I can't see
>> any significant number being benchmark oriented. People don't choose
>> compilers because of their benchmarks - they choose for features, target
>> support, static checking, language support, compatibility with existing
>> source code, etc. They expect a gradual progress towards faster code
>> with newer versions, but not more than that. And those that pick a
>> compiler for its speed, do so based on the speed for their own source
>> code, not for some benchmark.
>
> However it seems that there are only few weirdos like me who think that
> it does not matter how fast the wrong answers are calculated and consider
> it better when those wrong answers are not calculated at all.
>
/I/ don't care how fast wrong answers are calculated. I just don't
expect compilers to calculate right answers from wrong code. And I do
care how fast right answers are calculated from correct code.
>> Like all conspiracy theories, the best way to test it is to follow the
>> money. Who would profit from making compilers focused on benchmark
>> performance as the main goal, with a disregard for support for existing
>> C or C++ sources?
>
> What conspiracy theory? Where did I say that they disregard support for
> existing source code?
You wrote "Since compiler writers are people with extremely
benchmark-oriented head shape" and then suggested it is because that's
what their employers tell them to do - /that/ is the "conspiracy theory".
> If to follow money then Google, Apple, Microsoft
> and Oracle have tons of own C and C++ source code that they want to
> be performant and don't want to break but they want developers to use
> Go, Swift, C# or Java. So indeed they might want to "extend" crappy
> "features" and "optimizations" into C++ that they won't ever use in
> and with their own code.
You think Microsoft would intentionally introduce broken features or
mistakes into their C++ compilers to push people towards using C#?
>
>>> Both hardware makers and compiler writers kept optimizing until it
>>> was "overly optimal".
>>>
>>
>> That again is simply incorrect.
>>
>> Developers - hardware or software - can make mistakes, and release a
>> design which later turns out to have unforeseen flaws. With software,
>> you can often find these early and fix them, but sometimes the flaws are
>> discovered quite late. Hardware flaws are harder to fix - but very easy
>> for amateurs to condemn once they are found.
>
> The cache, branch prediction and speculative execution are performance
> optimizations piled together. That can be tricky to get such a pile
> correct and if to prioritize correctness below performance then defects
> slip through. Same things do happen with compiler optimizations.
My understanding of the hardware problems here is that the operation was
perfectly correct - but there was information leakage. A certain degree
of information leakage is inevitable when there are shared resources,
but this situation could be exploited more conveniently than many others.
Compiler optimisations are a different game altogether. The main
similarity seems to be "they are both hard, so someone might make a
mistake".
>
>>>> Accusations that compiler writers are only interested in maximum
>>>> benchmark speeds are absurd. gcc, for example, tests compilation on the
>>>> entire Debian repository - vast quantities of code, much of which is
>>>> ancient.
>>>
>>> Where I wrote "only"? Sure, they have to care of other things like to
>>> keep backwards compatibility. Also if code of popular benchmark
>>> contains UB then they have to avoid "optimizing" it, to keep the result
>>> "correct".
>>
>> They sometimes have to make such adjustments, yes. Often that is
>> because they realise that not only do the benchmark writers make such
>> mistakes, but others do too - and that it can be helpful to treat such
>> code in the manner the programmer appeared to expect. But for most
>> undefined behaviour, it is hard or impossible to guess what the
>> programmer expected - that is the nature of undefined behaviour.
>
> What is so controversial what is the behavior that programmer expects
> on case of -ftrapv? Fortunately has been obvious that I want
> division-by-zero to trap (even on MIPS and ARM, without special
> compiler options) but that might also change out of blue when
> a way to "optimize" it will be discovered, and then we need to
> add some -fplease-dont-remove-divide-by-zero-trap I suspect.
"-ftrapv" was somewhat poorly conceived and specified in the first place
(IMHO), and has only got worse. Originally compilers were relatively
straightforward - if your code said "x = y + z;", you'd get an "add"
assembler instruction. "-ftrapv" just meant adding a "trap if overflow
flag set" instruction after it. Then things got complicated - what
about "x = y * 5", that can be done with a single "lea" instruction (on
x86)? What about "x = y + z - y;" - do you simplify without trapping,
or do you detect overflows on the intermediary values? The original
idea was to have a cheap and simple way to spot overflow errors, but the
practice is a good deal more complicated, and subject to a lot of
opinions as to what kinds of checks should be done.
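To make the ambiguity concrete, consider a tiny example of my own:

  // Mathematically this just returns z, but evaluated left to right
  // the intermediate sum (y + z) can overflow. Whether a trapping
  // build should flag that, or simplify the expression first and trap
  // on nothing, is exactly the judgement call "-ftrapv" leaves open.
  int keep_z(int y, int z) {
      return y + z - y;
  }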
So if I wanted code to check for overflows, I'd do it specifically in
the situations where it was appropriate - using __builtin_add_overflow
and friends, or more manual checking (using C++ classes to keep it all
neat). And for debugging or testing, -fsanitize=undefined is a good choice.
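As a rough sketch of what I mean by explicit checking with the
builtins (my own illustration, nothing more):

  #include <cstdio>

  int main() {
      int sum;
      // __builtin_add_overflow returns true if the mathematically
      // correct result does not fit in "sum" (gcc and clang both
      // provide this builtin).
      if (__builtin_add_overflow(2000000000, 2000000000, &sum))
          std::puts("overflow detected - handle it explicitly");
      else
          std::printf("sum = %d\n", sum);
      return 0;
  }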
>
>>>> What compiler writers do not do, however, is limit their handling of
>>>> well-written code because some people write poor code. Instead, they
>>>> provide flags to support those that have code that relies on particular
>>>> handling of undefined behaviours, or code that "worked fine on my old
>>>> compiler" - flags like "-fwrapv" and "-fno-strict-aliasing". Then
>>>> people who know how integer arithmetic works in C and C++ can get
>>>> faster code, and people who can't get it right (or who have to use
>>>> broken code from others) have an escape route.
>>>>
>>>>>
>>>>> Without -ftrapv or -fwrapv all bets are off and one can get full set of
>>>>> nasal demons on signed integer overflow on those compilers.
>>>>>
>>>>
>>>> Of course all bets are off if you have signed integer overflow - this
>>>> should be well known to anyone who has learned C or C++ programming. It is
>>>> not the compiler that launches nasal daemons - it is the programmer,
>>>> when they write code that does not make sense in the language.
>>>>
>>>> "-ftrapv", by the way, has been considered a poor and unreliable
>>>> solution for a good many years - "-fsanitize=signed-integer-overflow" is
>>>> what you want.
>>>
>>> Why did it become "poor" and "unreliable"? What was the reasoning?
>>> Is it OK to release software with "poor" and "unreliable" features?
>>
>> Many bugs have been found in the "-ftrapv" implementation - and in
>> particular, it does not trap in all cases. Personally, I think the flag
>> should be dropped in favour of the sanitizer, which is a more modern and
>> flexible alternative and which is actively maintained.
>
> Sanitizers sound like debugging options.
Yes.
> Why two almost equal features
> are developed into same tool?
-ftrapv is a debugging tool. It is not really appropriate for final
binaries - it is too unpredictable, and plays havoc with code
optimisation. In particular, expressions involving several parts are
either significantly slower, as each part needs to be checked separately,
or you don't get full overflow checking. (I have no idea where the
lines are drawn in practice in gcc.)
> With such logic one day the
> -fsanitize=signed-integer-overflow will also become "poor"
> and "unreliable" and then some third feature will be the "correct"
> way to make programs to crash on signed integer overflow. With
> feature creep after a while nothing is reliable.
>
There is /no/ "correct way to make programs crash on undefined
behaviour". There never has been, and never will be. The best you can
get are tools to help you find the bugs in your program when testing and
debugging. Maybe one day other tools will replace the sanitize options.
Options like "-ftrapv" have never been about changing the semantics of C
to give a defined behaviour to signed integer overflow. (This is
different from -fwrapv.) If you want to complain about poor or
misleading documentation in the gcc manual page, then I can agree with that.
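A small sketch of my own to illustrate the distinction (the exact
diagnostic text will vary between tool versions):

  // overflow.cpp - deliberately overflows.
  int main(int argc, char **) {
      int x = 2147483647;
      return x + argc;       // undefined behaviour when argc >= 1
  }

  // Built with "-fwrapv", the addition gains *defined* wrapping
  // semantics - the language dialect itself is changed, and the
  // program quietly returns a wrapped value.
  //
  // Built with "-fsanitize=signed-integer-overflow", the semantics
  // are not changed; instead the sanitizer's run-time reports
  // something along the lines of "signed integer overflow:
  // 2147483647 + 1 cannot be represented in type 'int'", so the bug
  // can be found and fixed.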
C and C++ are languages which trust the programmer - they expect you to
write code that is correct and will not misbehave. They have little or
no checking unless you add it manually, which is how you can get
efficient code. But there is nothing hindering you from manual checks -
especially in C++ where you can make classes with operators to simplify
the process. (Yes, I know that means you can get overflow checking in
your own code, but can't easily add it to existing code.)
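A bare-bones sketch of such a class might look like this (my own toy,
not a polished library):

  #include <cstdlib>

  struct Checked {
      int v;

      friend Checked operator+(Checked a, Checked b) {
          Checked r{};
          // Abort on overflow - throwing or saturating are equally
          // valid policies; the point is that /you/ choose, in your
          // own code.
          if (__builtin_add_overflow(a.v, b.v, &r.v))
              std::abort();
          return r;
      }

      friend Checked operator*(Checked a, Checked b) {
          Checked r{};
          if (__builtin_mul_overflow(a.v, b.v, &r.v))
              std::abort();
          return r;
      }
  };

Dropping such a type into existing code still means touching the
declarations, which is the limitation noted above.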
>>> When incorrect answer is worse than no answer then normal people
>>> want defects to crash in release too.
>>
>> "-ftrapv" could has always been slower than non-trapping code. People
>> usually aim to right correct code, and have that correct code run as
>> fast as reasonably possible. If you want software that is full of
>> run-time checks, you don't program in C or C++.
>
> See? We have modern, branch-predicting and eagerly executing hardware
> with megabytes of cache but reasonability of -ftrapv usage is questionable.
> Sure, it is not for omnipotent programmers, without even measuring what
> it costs but I am fallible and have never met anyone almighty.
> How is it so self-evident that -ftrapv is unreasonable option?
> Correctness of behavior is more important than performance of behavior
> and incorrect behavior is often worse than no behavior whatsoever.
>
"-ftrapv" does not turn incorrect behaviour into correct behaviour. At
best, it helps you spot the incorrect behaviour so that you can fix the
program. (And that's a useful thing for testing and debugging.)
>> In C and C++, you can always manually add any checks you want. With
>> C++, you can make your own types that do checking in the manner that
>> suits your needs.
>
> Why? I agree that signed integer overflow is programming error and
> so if the integer calculations happen to be important then the -ftrapv
> handles it correctly (when it happens to work).
> Also how can I be sure that they don't "optimize" my manual checks
> away? After a while it can be like with null pointers. One optimizes
> dereference away, other optimizes null pointer check away (since it
> was "after" dereference) and result is that neither dereference does
> crash nor check work.
>
You can be sure it works by writing the code correctly. The "null
pointer" incident you are alluding to was a serious bug in the Linux
source code - changes to the compiler optimisation affected the
consequences of the bug, but did not introduce the bug. Compilers
assume that the programmer obeys the rules of the language - including
the "thou shalt not dereference a null pointer" rule.