
A counterintuitive breaking change related to new comparisons


Andrey Tarasevich

Jan 3, 2022, 12:49:09 PM
A colleague discovered that switching from `-std=c++17` to `-std=c++20`
in their project resulted in different behavior from some associative
containers. A bit of research allowed us to narrow the culprit down to
what can be demonstrated by the following minimal example:

#include <iostream>
#include <utility>

struct S
{
    char c;

    operator const char *() const { return &c; }

    friend bool operator <(const S& lhs, const S& rhs)
    { return lhs.c < rhs.c; }
};

int main()
{
    std::pair<int, S> p1{}, p2{};
    std::cout << (p1 < p2) << (p2 < p1) << std::endl;
}

C++17 compilers output `00`. C++20 compilers output `01` (or, perhaps,
`10`).

The obvious guess is that comparisons for `std::pair` work differently
in C++20 after the introduction of `<=>`. And indeed, the `<=>` operator
for `std::pair` in this case ignores the user-defined `<` and instead
opts for raw pointer comparison through conversion to `const char *`.
This is a rather surprising and counterintuitive breaking change, to put
it mildly...

If we remove the user-defined conversion to `const char *`, C++20 will
use the user-defined `<` and also output `00`.

The above results were obtained with GCC. Clang 10 outputs `00` even in
`-std=c++20` mode, but Clang 11 and later output `01`.

--
Best regards,
Andrey Tarasevich

Chris Vine

Jan 3, 2022, 4:06:36 PM
As p1 and p2 are not within the same array or object, I believe
operator< has undefined behaviour so the compiler is within its rights
to behave this way (that was certainly the case in C++11). If you use
std::less it should work, as it provides an implementation defined
total order over pointers even where operator< does not.
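
For what it is worth, the `std::less` remark can be sketched like this. It is only an illustration of the pointer-ordering guarantee and does not touch `std::pair`; the helper name `strictly_ordered` is made up for the sketch.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: true when std::less orders the two pointers in
// exactly one direction, which must hold for any two distinct pointers
// because std::less provides a strict total order over pointers.
bool strictly_ordered(const char* a, const char* b)
{
    assert(a != nullptr && b != nullptr);   // sketch only handles real objects
    std::less<const char*> lt;
    return lt(a, b) != lt(b, a);
}
```

Passing the addresses of two unrelated `char` objects is well defined here, whereas comparing the same two pointers with the built-in `<` gives no such guarantee.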

Andrey Tarasevich

Jan 3, 2022, 5:13:32 PM
Um... The top-level `<` operation in `p1 < p2` is a user-defined
operator. It is not subject to any restrictions, like "within the same
array or object" or somesuch.

On the other hand, comparison between two `const char *` pointers
obtained from `p1` and `p2` probably does have undefined behavior.
However, that's completely beside the point. The point is that in C++20
mode the compiler even opted to perform implicit conversion to `const
char *`. That is already surprising.

Before C++20 the conversion to `const char *` is not used and everything
is perfectly defined here. There's no "undefined behavior" of any kind
in C++17 semantics of this code.

Andrey Tarasevich

Jan 3, 2022, 5:24:29 PM
To take out of the picture the irrelevant matter of undefined pointer
comparison, here's another example

#include <iostream>
#include <utility>

struct S
{
    int a;

    S(int a) : a(a) {}

    operator int() const { return a; }

    friend bool operator <(const S& lhs, const S& rhs)
    { return lhs.a > rhs.a; }
};

int main()
{
    std::pair<int, S> p1{ 0, 1 }, p2{ 0, 2 };
    std::cout << (p1 < p2) << (p2 < p1) << std::endl;
}

This program outputs `01` in C++17 mode (and earlier)
http://coliru.stacked-crooked.com/a/98f0b6b57e9caf45

but outputs `10` in C++20 mode
http://coliru.stacked-crooked.com/a/3f8c3e96cd39f596

In C++17 the user-defined comparison operator is used. In C++20
conversion to `int` and subsequent comparison of `int`s is used.

Alf P. Steinbach

Jan 3, 2022, 5:56:15 PM
I remember arguing for the superiority of memcmp/strcmp style comparison
functions, in old clc++m.

Boy was I stupid.

When I failed to think about how it could be fouled up by Microsoft-ish
thinking in the committee (or even that it could once be standardized).

- Alf

Manfred

Jan 3, 2022, 6:56:55 PM
That's really odd indeed.
Are you sure that it is the C++20 standard that mandates this behavior?
I wouldn't be surprised if this were a plain bug in the implementation -
possibly related to the changes that you mention above - but still a bug.

In fact, gcc 11.2 with plain -std=c++20 gives '01':

https://www.godbolt.org/z/rEjfehdnY

Manfred

Jan 3, 2022, 6:59:55 PM
Scratch that last bit (for some reason godbolt uses different compiler
selections for compilation and program output)

https://www.godbolt.org/z/be1q7PErE

Chris Vine

Jan 3, 2022, 7:23:53 PM
> obtained from `p1` and `p2` probably does have undefined behavior.
> However, that's completely beside the point. The point is that in C++20
> mode the compiler even opted to perform implicit conversion to `const
> char *`. That is already surprising.
>
> Before C++20 the conversion to `const char *` is not used and everything
> is perfectly defined here. There's no "undefined behavior" of any kind
> in C++17 semantics of this code.

Ah yes, I misread your code. You are not comparing pointers, you are
comparing indeterminate values. I would need to look it up but my
recollection is that reading (and so comparing) an uninitialized value
also comprises undefined behaviour, unless it is of unsigned char or
std::byte type. Here you seem to be comparing an uninitialized int and
char (via S:: operator <) type.

Andrey Tarasevich

Jan 3, 2022, 7:34:03 PM
On 1/3/2022 4:23 PM, Chris Vine wrote:
> Ah yes, I misread your code. You are not comparing pointers, you are
> comparing indeterminate values. I would need to look it up but my
> recollection is that reading (and so comparing) an uninitialized value
> also comprises undefined behaviour, unless it is of unsigned char or
> std::byte type. Here you seem to be comparing an uninitialized int and
> char (via S:: operator <) type.

No. That is incorrect. All values involved in these comparisons are
perfectly initialized.

Both `std::pair` objects are initialized with their default
constructors, which in turn perform value-initialization of their
members. In my example all `std::pair`s fields are guaranteed to be
initialized to zero. Nothing indeterminate here.

--
Best regards,
Andrey Tarasevich

Chris Vine

Jan 3, 2022, 7:39:28 PM
Actually I think they should be initialized because of the brace
initializer. I don't use C++ very much any more so I may be wrong
about that; but if so there is surely something odd about this. It
looks like a bug.

Andrey Tarasevich

Jan 3, 2022, 7:59:43 PM
I'm not yet that good with the intricacies of the implicitly generated
comparators, but I see the dilemma the compiler is facing here. It is either

1. user-provided measly `<`-only comparison directly on `S`, or

2. implicit conversion that leads from `S` to `int` with strong
full-blown `<=>` comparison.

The compiler apparently opts for the latter.

I can see the attraction of the latter path, but the fact that it
quietly defeats a user-defined comparator looks scary. Granted, it
doesn't defeat it directly, since `std::pair`'s comparator serves as an
intermediary, but still...

Once we declare the conversion operator as `explicit`, the phenomenon
expectedly disappears and C++20 falls back to the "traditional" behavior.

Chris Vine

Jan 3, 2022, 8:27:50 PM
On Mon, 3 Jan 2022 16:59:25 -0800
Andrey Tarasevich <andreyta...@hotmail.com> wrote:
[snip]
> I'm not yet that good with the intricacies of the implicitly generated
> comparators, but I see the dilemma the compiler is facing here. It is either
>
> 1. user-provided measly `<`-only comparison directly on `S`, or
>
> 2. implicit conversion that leads from `S` to `int` with strong
> full-blown `<=>` comparison.
>
> The compiler apparently opts for the latter.
>
> I can see the attraction of the latter path, but the fact that it
> quietly defeats a user-defined comparator looks scary. Granted, it
> doesn't defeat it directly, since `std::pair`'s comparator serves as an
> intermediary, but still...
>
> Once we declare the conversion operator as `explicit`, the phenomenon
> expectedly disappears and C++20 falls back to the "traditional" behavior.

Frankly, this is unacceptable for an industry-standard language such as
C++. I was disappointed with some of the breaking changes in C++17
(one bizarre outcome of which was that std::uninitialized_copy, and the
other std::uninitialized_* functions, gave rise to undefined behaviour
according to other parts of the same standard), but this one from C++20
to which you refer is, I think, worse because it causes breaking compiler
behaviour in real life.

C++ is not my first choice for new projects nowadays, although it still
has some uses.

Chris M. Thomasson

Jan 3, 2022, 9:51:15 PM
On 1/3/2022 9:48 AM, Andrey Tarasevich wrote:
Bugger!

Juha Nieminen

Jan 4, 2022, 3:27:43 AM
Andrey Tarasevich <andreyta...@hotmail.com> wrote:
> Both `std::pair` objects are initialized with their default
> constructors, which in turn perform value-initialization of their
> members. In my example all `std::pair`s fields are guaranteed to be
> initialized to zero. Nothing indeterminate here.

Are you sure "default-initialization = zero-initialization" is guaranteed
for POD objects?

The default initialization of 'char' (as all the other primitive types) is
"don't initialize". Is this different if the 'char' is a member of a POD
object that has no explicit constructor?

wij

Jan 4, 2022, 6:54:10 AM
// file t.cpp
#include <iostream>
#include <utility>

struct S
{
    char c;

    operator const char *() const { return &c; }

    friend bool operator <(const S& lhs, const S& rhs)
    {
        std::cout << 'c';   // trace: shows when the user-defined < is called
        return lhs.c < rhs.c;
    }
};

int main()
{
    std::pair<int, S> p1{}, p2{};
    std::cout << (p1 < p2) << (p2 < p1) << std::endl;
}
----

[]$ g++ t.cpp -std=c++20
[]$ ./a.out
01
[]$ g++ t.cpp -std=c++17
[]$ ./a.out
c0c0

---
Problem verified.
It looks to me like an overloading issue with std::pair.

Alf P. Steinbach

Jan 4, 2022, 11:25:02 AM
Value initialization was about the only new feature in C++03, which was
otherwise just Technical Corrigendum 1. It was invented by Andrew Koenig.

The point of value initialization was to guarantee initialization of
everything, if anything in an object was initialized (e.g., an object
with a `std::string` and a `double` should not end up with the `string`
initialized and the `double` indeterminate, as it could in C++98).

For that purpose value initialization, used by the `std::pair` default
constructor, ends up zero-initializing those things not otherwise
initialized.


- Alf

Andrey Tarasevich

Jan 4, 2022, 1:26:31 PM
`std::pair` is not a POD type (if you insist on using outdated
terminology). It has a user-provided constructor.

In my example the top-level initialization is performed by the default
constructor of `std::pair`, which is guaranteed (by specification of
`std::pair`) to invoke *value-initialization* of both members.

Value-initialization of `char` (and other scalar types) is
zero-initialization.

Even if `std::pair` were a POD, the explicitly specified `{}`
initializer in my example would still trigger value-initialization of
the entire `std::pair`, which would in turn lead to value-initialization
of both members.
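
A minimal sketch of both initialization paths described above. The helper name `members_are_zero` is made up for this illustration, and `S` is reduced to its `char` member.

```cpp
#include <cassert>
#include <utility>

struct S { char c; };

// Hypothetical helper: reads members that were never assigned explicitly.
bool members_are_zero()
{
    std::pair<int, S> p{};   // default constructor value-initializes both members
    S s{};                   // empty braces on the aggregate value-initialize c
    return p.first == 0 && p.second.c == '\0' && s.c == '\0';
}
```

Both reads are well defined, and the function returns `true`.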

Öö Tiib

Jan 4, 2022, 5:49:55 PM
I so far thought that <=> only breaks the philosophy of not
paying for what we don't use (implementation of < is often
cheaper than <=>, operator != is almost always cheaper).
Thanks for showing that it is far worse. :(

That it somehow manages to compose the <=> when you
make operator int() explicit is also odd ... I think all the
comparison operators of std::pair were removed
by C++20 and replaced with alien flying vessel operator.

Öö Tiib

Jan 4, 2022, 6:16:45 PM
And oh ... not only std::pair ... all comparison operators of all
containers ... basically anyone comparing whatever
standard containers gets that pessimization together
with the risk of that breaking change manifesting.
Perhaps it is time to switch to boost from standard
containers while the madness hasn't yet spread there:
<https://www.boost.org/doc/libs/1_78_0/doc/html/boost/container/vector.html>

wij

Jan 5, 2022, 12:59:40 AM
Leaving aside how C++ interprets the code's intention,
I think that whoever adds the conversion function to S should first
consider which overloads it will make viable. Otherwise, the ambiguity
is difficult to resolve. This indicates a class design problem.

Andrey Tarasevich

Jan 5, 2022, 12:06:51 PM
Potential performance pessimization for aggregate/container comparators
is indeed there, which can be seen from the following example

#include <utility>
#include <iostream>

struct S
{
    int a;

    S(int a) : a(a) {}

    friend bool operator <(const S& lhs, const S& rhs)
    {
        bool b = lhs.a < rhs.a;
        std::cout << "cmp " << lhs.a << " < " << rhs.a <<
            " = " << b << std::endl;
        return b;
    }
};

int main()
{
    std::pair<int, S> p1{ 0, 1 }, p2{ 0, 2 };
    bool b = p1 < p2;
    std::cout << "result = " << b << std::endl;
    b = p2 < p1;
    std::cout << "result = " << b << std::endl;
}

This case does not suffer from the original problem (no implicit
conversion present), meaning that in this case the comparison results
are consistent between C++17 and C++20. However, the paths that lead to
those results are still different.

When compiled as C++17 it outputs

cmp 1 < 2 = 1
result = 1
cmp 2 < 1 = 0
result = 0

When compiled as C++20 it outputs

cmp 1 < 2 = 1
result = 1
cmp 2 < 1 = 0
cmp 1 < 2 = 1
result = 0

C++20 still has to route the `std::pair` comparison through the `<=>`
operator. And ultimately it has to implement that operator based on the
available user-provided `<` comparison for `S`. In the general case it
takes more calls to `<` to properly calculate the three-state result for
`<=>`, as evidenced by the above output. E.g. when the "less" comparator
returns `false`, an extra call is required to tell "equivalent" from
"greater".

If comparisons for `S` are "heavy", this might easily result in
significant pessimization.

So, even though the ultimate goal of three-state comparison support in
C++20 is to improve the performance of heavy comparisons, it does not
come for free: it still requires the user to take active steps to
provide efficient `<=>` comparators for their classes. Without it the
code might work correctly, but sub-optimally.