David Brown <david...@hesbynett.no> writes:
>On 31/03/18 19:57, Anton Ertl wrote:
>> David Brown <david...@hesbynett.no> writes:
>>> On 31/03/18 17:23, Terje Mathisen wrote:
>>>
>>>>
>>>> I think we both know that the real problem with modern C(++) compilers
>>>> is with the way some of them insist on taking any existence of
>>>> "undefined behavior" and use it to taint everything it touches.
>>>
>>> I think a bigger problem is when some programmers think that particular
>>> code has defined behaviour, and some compilers treat it in a consistent
>>> manner, but the standards don't give a relevant definition.
>>
>> I don't see a problem there.
>>
>>> These
>>> programmers are then surprised when a newer compiler, or one with
>>> different settings, does not follow this same behaviour.
>>
>> A responsible software maintainer does not change behaviour that users
>> make use of.
>
>That's nice in theory. But what if you have a hundred thousand users,
>of which a large proportion know what they are doing, and want to get
>the highest quality results from their source code - while a certain
>proportion blatantly refuses to learn and obey the rules for the
>software?
Yes, C programmers generally know what they are doing; they write
programs that work acceptably with the C compilers that they have
during development. The problem is compiler maintainers who blatantly
refuse to accept this, and claim that most of these programs (even gcc
and llvm themselves, see <https://blog.regehr.org/archives/761>) are
buggy, and that they are therefore free to miscompile them in any way
they can conceive (aka they want to produce code that summons nasal
demons).
>There is no way a compiler can progress and make improvements
>while retaining the expected behaviour for all its users - it can't
>happen.
There are many ways, but they tend to be ignored by the current
generation of C compiler maintainers, because they are so in love with
their nasal demons that they prefer to introduce new ways to summon
them, and work around the bugs introduced by the ensuing complexity
(e.g., see
<https://pldi17.sigplan.org/event/pldi-2017-papers-taming-undefined-behavior-in-llvm>).
As an example of a possible improvement, consider the following test
for x being at the minimal value of its type:
x-1>=x
On AMD64, gcc-5.2.0 with -fwrapv compiles this (with long x) to
48 8d 47 ff lea -0x1(%rdi),%rax
48 39 c7 cmp %rax,%rdi
7f 06 jg ...
Nasal demon lovers would suggest that I change this to
x==LONG_MIN
This is not only type-specific, it also generates longer code:
48 b8 00 00 00 00 00 00 00 80 movabs $0x8000000000000000,%rax
48 39 c7 cmp %rax,%rdi
75 05 jne ...
A compiler could improve on that while retaining the intended
behaviour by compiling both variants to
48 ff cf dec %rdi
71 05 jno ...
But unfortunately, the gcc maintainers ignored this optimization
opportunity, preferring to spend time on introducing nasal demons.
>The best you can do, and the way gcc does it, is to make sure
>that new optimisations are configurable. They provide extra flags, and
>disabled optimisation modes, precisely to please people or programs that
>get things wrong in their C coding. Disabled optimisation is even the
>default mode! It is hard to comprehend why a few (but loud) people
>complain about this - if you don't like the gcc optimisations because
>they don't follow your personal assumptions and concepts of what C
>should be, then don't use them. Problem solved, I would have thought.
The problem is not solved, for several reasons:
1) Some versions of gcc introduce nasal demons even at default
optimization. More generally, gcc does not give any better
guarantees about the language it supports for default optimization
than for any other optimization level.
2) Even if the behaviour was guaranteed not to change with disabled
optimization, gcc with disabled optimization produces code that is
several times slower than what an earlier gcc version produces at a
higher optimization level (and where the code behaves as intended).
Is the goal of all this work on nasal demon "optimizations" really
to make all programs several times slower? And yes, it would
affect all programs: Which real-world application is guaranteed not
to contain any undefined behaviours? So they would all have to be
compiled with optimization disabled in order to be safe against
future gcc versions. But actually, even with optimizations
disabled, they would not be safe, see 1).
What gcc does provide, and what is moderately workable, is flags for
disabling individual nasal demon assumptions. E.g., in my work with
gcc-5.2.0 I used
-fno-aggressive-loop-optimizations -fno-unsafe-loop-optimizations
-fno-delete-null-pointer-checks -fno-strict-aliasing
-fno-strict-overflow -fno-isolate-erroneous-paths-dereference -fwrapv
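Combined into a single (hypothetical) compile line for gcc-5.2.0,
this looks as follows; note that the set of available flags varies
between gcc versions (some of these flags were later removed or
renamed), so this invocation is tied to that version:

```shell
# Hypothetical invocation using the flags listed above;
# flag availability varies between gcc versions.
gcc-5.2.0 -O2 \
    -fno-aggressive-loop-optimizations -fno-unsafe-loop-optimizations \
    -fno-delete-null-pointer-checks -fno-strict-aliasing \
    -fno-strict-overflow -fno-isolate-erroneous-paths-dereference \
    -fwrapv \
    -c program.c
```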
I am sure that the number of flags has increased in the meantime.
The problem with this is that the developers of a given application do
not know which such flags the gcc maintainers will introduce in the
future, so they cannot make their program proof against new nasal
demons in future versions of gcc. It is also not easy to see which of
the large number of -f flags that gcc provides protect against nasal
demons.
So it would be nice to have a flag, say,
-fnasal-demons/-fno-nasal-demons that enables or disables all such
assumptions (ideally -fno-nasal-demons would be the default, making
programs that were written before the introduction of this flag
compile as intended).
Alternatively, define language levels: Say, if the program works as
intended on gcc-4.9, -std=gcc-4.9 would give me the behaviour
supported by gcc-4.9, while -std=gcc-6.3 would give me the smaller set
of behaviour that the gcc-6.3 language level supports.
>> See, e.g.,
>> <https://felipec.wordpress.com/2013/10/07/the-linux-way/>.
>
>This is a /totally/ different situation - and you would make a better
>argument if you understood that. The "Linux way" is that you don't
>change the public interface, as that is the contract between the Linux
>user (or programmer) and the Linux kernel (and kernel programmers). In
>the C world, the equivalent is /not/ a compiler, but the C standards.
In the Unix world, the equivalent of the C standard is the POSIX
standard: a specification of the intersection of various
implementations. And if the Linux maintainers were as irresponsible
as the gcc maintainers, they would declare all user-level programs
that do something beyond what POSIX guarantees as buggy, and would
feel free to break them with every new release.
But the Linux maintainers actually are responsible and do not take
this position. Instead, they take the position that a (real, not
theoretical) binary that worked on a previous version must continue to
work in future versions. In consequence, if a binary worked in an
earlier version of Linux, and is broken by a new version, and that is
reported as a bug, they don't try to talk the reporter into submission
with verbiage like "while a certain proportion blatantly refuses to
learn and obey the rules for the software" or "programs that get
things wrong". They just fix the bug!
And likewise, for a given C compiler, the standard that its
maintainers should apply is the behaviour of earlier versions of this
C compiler.
>What you are asking for is that the C compiler makers (gcc in
>particular, I believe) should understand what different programmers mean
>by code that does not have a defined behaviour specified in either the C
>standards, the compiler documentation, or other standards (like POSIX).
>This is /not/ an easy task.
On the contrary, it's easy: The compiler chose a particular behaviour
in its previous versions. That's the meaning.
>But do compiler makers add such optimisations because they are evil?
>Because they care more about benchmark results than real code?
In your words: Absolutely.
>Because
>they don't test on real code, or talk to real programmers?
The image I get after reading a lot of stuff by nasal demon advocates,
in particular
<http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>,
but also Usenet postings by gcc maintainers and others, and seeing
some of the bug reports against gcc (and how the gcc maintainers
react), is this:
The gcc and LLVM compiler maintainers believe (often with little or no
evidence) in vast speedups possible by giving as few guarantees as
possible, and undefined behaviour in the C standard is therefore much
loved by them (and so important to them that it even gets its own
acronym: UB).
Benchmark code (even with undefined behaviour, as present in pretty
much every program) is necessary to be able to show off what one has
done. Real code from real programmers is, at best, seen as a
necessary evil. When the compiler maintainers break programs that
worked in a previous version, and this is reported, they find
some case of "undefined behaviour", and reject the bug as invalid.
There seem to be some factors that moderate their zest for summoning
nasal demons. I guess that the people who sign their paychecks get
annoyed when they break code that is important to those people, so,
from what I have read, the gcc maintainers compile many packages of a
Linux distribution before releasing a new version these days. And
maybe not all of the gcc and LLVM compiler maintainers believe in the
large power of nasal demons, but those that do have enough influence
that the overall outcome is as described above, and when communicating
with programmers, or discussing these issues, the belief system comes
through, e.g., in phrases like "code written by people who should
never have been allowed to touch a keyboard"
<https://www.sourceware.org/bugzilla/show_bug.cgi?id=12518#c4> (in
this case wrt a standard C library function by a library maintainer,
so it is not limited to compiler maintainers).
>No, that is
>conspiracy theory nonsense, and that kind of drivel is
>counter-productive in the grand aim of all this, which must surely be to
>help programmers get their source code correct and generate efficient
>object code.
Given that the nasal demon stuff helps with neither objective (except
for the notion of "correct" that nasal demon advocates often use), it
is obvious that these are not the primary objectives of the gcc and
LLVM maintainers.
Concerning "conspiracy theory": The belief system of the LLVM and gcc
maintainers has been publicly outlined in
<http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>,
and by many nasal demon advocacy posts, e.g., yours. There is no
conspiracy and no theory about it: It's publicly documented.
>Two is to work with compiler developers towards
>documenting the assumptions you want and making them a feature of the
>tools - compiler writers are open to that.
I have yet to see evidence of that.
>(If it really /were/ a miscompilation, you'd file a bug with gcc and
>they would work towards fixing it.)
My experience is that they mark it as invalid for some reason or other
(but containing "undefined behaviour" is a favourite), and
consequently don't fix it.
>> That would mean to
>> distribute the program in binary form only; that's not a big burden
>> for proprietary software, but it's not an appropriate suggestion for
>> free software. So, through this irresponsibility of its maintainers,
>> gcc is becoming a compiler that's more appropriate for proprietary
>> software than for free software. Way to go!
>>
>
>Have you ever thought that constructive dialogue with the gcc developers
>might be more helpful than accusations?
Yes, once upon a time I thought that. Experience has proved
otherwise.
Actually, it's more complicated: In the beginning I had constructive
dialogue with the gcc maintainer. Maintainership changed, and the
dialogue aspect became less productive. At some point, it became
obvious that reporting bugs is pointless. I still had dialogue with a
gcc maintainer, but he pretended not to understand what the intended
behaviour of a program is. And various sources (including
<http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>)
claimed significant performance improvements from nasal demons. So I
wrote
http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
to discuss all that. There were complaints that this did not explain
the intended behaviour clearly enough, so I wrote
http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf
in order to clarify that. If, in your filter bubble, that is all just
"accusations", "endless complaining", "wild exaggerations" and
"unjustified claims", then I see no way to engage in a constructive
dialogue with you.