On 18/04/18 20:17, Rick C. Hodgin wrote:
> On 4/18/2018 1:56 PM, Chris Vine wrote:
>> On Wed, 18 Apr 2018 12:21:35 -0400
>> "Rick C. Hodgin" <rick.c...@gmail.com> wrote:
>>> On 4/18/2018 12:03 PM, Chris Vine wrote:
<snip>
>> It is not about philosophy, whether about data or anything else. That
>> your code breaches the C++ language specification is the end of it.
>
> I agree it is the end of it with regards to the C++ language standard.
> My position is it's not the end of it with regards to the compiler. The
> compiler is free to do what it wants in the cases of UB, including do
> the correct operation.
There is no "correct operation" as far as C++ is concerned. The code
has undefined behaviour - that means it does not make sense in C++ or to
the compiler, even if you feel the intention of the code is clear to a
human reader. (Which it is, in this particular case.)

Imagine it as though someone had written the sentence "I went for a
drive in my bar". This is grammatically correct, and has correct
spelling - a computer spell-checker cannot spot the problem. Most
people would realise "bar" was a typo for "car" - your "unit test" on
the sentence would pass. But some people - perhaps someone with a
different native language - would get confused. The sentence has
"undefined behaviour".
>
>> That there is no strict aliasing rule in CAlive or some other language
>> (vapourware or otherwise) is irrelevant to the issue, as is whether you
>> think there should be a strict aliasing rule in C and C++. This is a
>> C++ newsgroup, there is a strict aliasing rule in C++ and if he is
>> writing C++ code with a C++ compiler he needs to know it.
>
> Agreed. If the C++ compiler adheres explicitly to the standard it
> may produce unusable code. If, however, it goes ahead and performs
> the operation, as we just saw MSVC++ does, then it is working in that
> compiler.
Code like this certainly /can/ have defined behaviour for particular
compilers and/or flags. But you can only rely on it if it is
documented, and if you are sure the code will only be used on such a
compiler. Otherwise it is a very subtle error waiting to creep up on
people.
(It's fine to write code that is specific to a particular compiler -
but you should do so only if you have good reason for it. And you
should document it, and ideally cause compile-time failures if the
assumptions about the tools are broken.)
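
As a rough sketch of what such compile-time failures could look like (the compiler whitelist and the messages here are my own assumptions for illustration, not something taken from the code under discussion):

```cpp
#include <climits>
#include <cstdint>

// Hypothetical whitelist: the compilers this file is assumed to have been
// checked with. Anything else stops the build instead of silently compiling.
#if !defined(__GNUC__) && !defined(__clang__) && !defined(_MSC_VER)
#error "Unverified compiler: check the aliasing assumptions before building"
#endif

// Assumptions about the target that an 8-byte check would depend on.
static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
static_assert(sizeof(std::uint64_t) == 8,
              "code assumes a 64-bit integer occupying exactly 8 bytes");
```

The point is only that a broken assumption is reported by the build rather than discovered at run time.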
>
>> [snip]
>>> The iterative loop method you propose is slower and very likely
>>> completely unnecessary given the nature of data computing in
>>> assembly / machine code.
>>
>> I wasn't proposing an iterative loop.
> I apologize. I mistook this post from Paavo Helde for being from you:
>
> bool is_valid(const char* data) {
>     return std::find_if(data, data+8,
>         [](char c) {return c&'\x80';})==data+8;
> }
>
> My mistake.
>
> > You seem to be clueless. What
> > makes your attitude even more ridiculous is that there is a
> > zero-overhead way of doing it right.
>
> It's not zero-overhead. Your proposed memcpy() is iterative, and
> operates on the data on a byte-by-byte basis.
Logically, yes, memcpy() is byte for byte. In practice, good compilers
will optimise memcpy() very nicely when the operands are appropriate.
In a case like this, a good compiler (with optimisation enabled,
obviously) will do a single 64-bit load on an architecture that supports
unaligned loads. It will do its best in other cases - using
byte-for-byte loads if needed, or bigger loads if the compiler has some
information about the alignment.

So the memcpy() solution will be as fast as your version on any target
that allows unaligned loads, and /correct/ on all targets regardless of
optimisations, flags, compiler variations, etc.
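
For concreteness, a minimal sketch of a memcpy()-based check might look like this. The name is_valid_memcpy is hypothetical, and I am assuming the test is the same one as in the find_if version quoted above: none of the 8 bytes has its top bit set.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical name. Returns true when none of the 8 bytes starting at
// `data` has its top bit set, i.e. the same test as the find_if version.
bool is_valid_memcpy(const char* data) {
    std::uint64_t word;
    // Copying the bytes into a real uint64_t object is defined behaviour:
    // no pointer cast, no aliasing violation, no alignment requirement on data.
    std::memcpy(&word, data, sizeof word);
    // The mask has the same pattern in every byte, so the result does not
    // depend on the endianness of the target.
    return (word & 0x8080808080808080ULL) == 0;
}
```

Whether the copy really becomes a single load depends on the compiler, flags and target, so it is worth looking at the generated code for the tools you actually use.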
> It is slower than
> the proposal by the OP, and while yours may be conforming ... who
> cares if his faster method works? If his goal is to be expressly
> conforming, then it matters. But if he's targeting a range of
> tools where it will work using is_valid2() ... then honestly, who
> cares? Every C++ compiler is different and these things can be
> wrangled into tests and validated at startup with the simple load
> of a test case library that calls some functions included in the
> main executable.
>
There are certainly cases where implementation-specific code is fine.
If the code is full of calls to WinAPI functions, then relying on x86
features is perfectly reasonable. If it is full of MSVC extensions,
then relying on MSVC behaviour is also fine. (I don't know if MSVC
documents that it allows such pointer casts in this way.)
It is also fine to do:
#if __COMPILER_XXX
// Fast implementation known to work on XXX
bool is_valid(...
#else
// Possibly slow, but definitely correct fall-back version
bool is_valid(...
#endif
But in this particular case, memcpy() is your friend. In general,
memcpy with a small fixed size should be well optimised.