
C++ may sometimes be *too* simple (to use)


Juha Nieminen

Apr 27, 2021, 8:07:40 AM
Recently I posted a controversial thread where I recounted my recent
experience trying to tutor a beginner programmer who wanted to create
a small program that would read a file and allow the user to edit it,
like a simple text editor, and how frustrating it was trying to do
this in C due to the complexities of its memory management.

Because such accusations about C are heresy, I thought I would redeem
myself and cleanse my sins by commenting a bit on the flipside, based
on actual examples out there that I have had to deal with.

While trying to tutor that beginner, I was constantly thinking how
much easier the task would have been in C++, using std::string
(and std::vector, etc).

Indeed, std::string makes such things so much easier and simpler.
On the flipside, perhaps it makes things a bit *too* easy, as it
induces people to create inefficient code with it, perhaps due to
ignorance or laziness.

For example, I am currently dealing with C++ code that does a lot of
things like:

if(str1.substr(n, 5) == str2)

Mind you, this is in commercial production code done by a relatively
experienced C++ programmer, not some noob.

I'm constantly thinking like "sheesh, there's already a function that
does that exact comparison in-place, without creating any temporary
copies, avoiding a useless allocation, copying of data and deallocation."
Namely:

if(str1.compare(n, 5, str2) == 0)

In this particular case it's not like the more efficient version is more
complicated to write or use, or would require several lines of code,
or something.
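
(To make the two variants concrete, here is a small self-contained
example; the strings and the offset are of course just placeholders
for this post, not the actual production code:)

  #include <iostream>
  #include <string>

  int main()
  {
      std::string str1 = "Hello, world";
      std::string str2 = "world";
      std::size_t n = 7;

      // Constructs a temporary std::string (a real heap allocation once
      // the piece no longer fits the small-string buffer):
      if(str1.substr(n, 5) == str2)
          std::cout << "substr version: match\n";

      // Compares the same 5-character range in place, no temporary:
      if(str1.compare(n, 5, str2) == 0)
          std::cout << "compare version: match\n";
  }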

There are lots and lots of other examples, related to std::string
(and std::wstring), such as using it for fixed-sized strings in situations
where a small inbuilt char array would suffice, always taking a std::string
as parameter when a const char* would suffice (or could well be
provided as a more efficient alternative) and so on and so forth.

std::string is an awesome tool that makes one's life a thousand times
easier, but any competent C++ programmer should learn how to use it
*efficiently*, not just lazily. Know your tools. Use them efficiently.

Paavo Helde

Apr 27, 2021, 10:57:17 AM
I have wondered about this exact thing myself. So I made a little
test to check whether, and by how much, substr() is actually slower.

It looks like with small strings (short enough for SSO, so no allocation),
comparing via substr() is ca 2x slower than compare(), and with large
strings (needing a dynamic allocation) it is ca 10x slower than
compare().

OTOH, a single compare() is only something like 4 ns, so if a program
makes e.g. a million such comparisons, then with large substrings it
would slow down from 0.004 s to 0.04 s. I bet many people won't be
overly concerned about this.
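
(Roughly the kind of test I mean; a minimal sketch, not the exact code
I ran, and the numbers will of course vary with compiler, standard
library and machine:)

  #include <chrono>
  #include <iostream>
  #include <string>

  int main()
  {
      // Long enough to defeat SSO, so substr() really allocates.
      std::string str1(100, 'x');
      std::string str2(50, 'x');
      const std::size_t n = 25;
      const int iterations = 1000000;

      int hits = 0; // printed at the end so the loops are not optimized away

      auto t0 = std::chrono::steady_clock::now();
      for (int i = 0; i < iterations; ++i)
          if (str1.substr(n, 50) == str2) ++hits;
      auto t1 = std::chrono::steady_clock::now();
      for (int i = 0; i < iterations; ++i)
          if (str1.compare(n, 50, str2) == 0) ++hits;
      auto t2 = std::chrono::steady_clock::now();

      using ms = std::chrono::duration<double, std::milli>;
      std::cout << "substr():  " << ms(t1 - t0).count() << " ms\n"
                << "compare(): " << ms(t2 - t1).count() << " ms\n"
                << "hits: " << hits << "\n";
  }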



Öö Tiib

Apr 27, 2021, 9:14:39 PM
The focus has long since gone away from considering, each time we solve some
problem, whether the result is the most efficient. In most cases what matters
is whether it works correctly for all input and whether it is easy to understand
what is going on. Efficiency matters for only about 5% of the code base.
For example, if we need to sort a very small array of fixed size, then in 95%
of cases it is not worth digging out some code that implements one of the
mathematically proven optimal methods from Donald Knuth's "The Art of
Computer Programming" volume 3 (written in the sixties); just use std::sort.

Andrey Tarasevich

Apr 27, 2021, 10:57:39 PM
On 4/27/2021 5:07 AM, Juha Nieminen wrote:
>
> For example, I am currently dealing with C++ code that does a lot of
> things like:
>
> if(str1.substr(n, 5) == str2)
>
> Mind you, this is in commercial production code done by a relatively
> experienced C++ programmer, not some noob.
>
> I'm constantly thinking like "sheesh, there's already a function that
> does that exact comparison in-place, without creating any temporary
> copies, avoiding a useless allocation, copying of data and deallocation."
> Namely:
>
> if(str1.compare(n, 5, str2) == 0)
>
> In this particular case it's not like the more efficient version is more
> complicated to write or use, or would require several lines of code,
> or something.
> ...

In modern C++ the original comparison is better done through a
`std::string_view`:

if (std::string_view(str1).substr(n, 5) == str2)

This is also an in-place comparison with zero actual run-time overhead,
even though conceptually it looks like a creation of an extra
[lightweight] temporary.
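
A complete snippet, for reference (I haven't compiled this exact
fragment, and the helper name is made up for the example, but it should
build as-is under C++17):

  #include <string>
  #include <string_view>

  bool matches_at(const std::string& str1, std::size_t n, const std::string& str2)
  {
      // The temporary string_view only wraps the existing characters;
      // nothing is copied and nothing is allocated.
      return std::string_view(str1).substr(n, 5) == str2;
  }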

Tri-state comparisons like `std::string::compare` have their own
important role (which is why we now have built-in tri-state comparisons
through `<=>`), but the version you are proposing doesn't exactly look
too elegant when invoked for the purpose of a plain equality comparison.
Might be one of the reasons people would consciously avoid it in favor
of the original `.substr` version, despite the considerable extra
run-time cost.

Before the `std::string_view` era I would also opt to use
`std::string::compare` in this context for the same reasons you
mentioned. But, again, some people might stick to an expensive `.substr`
version in non-critical code just for its stylistic elegance.

--
Best regards,
Andrey Tarasevich


wij

Apr 28, 2021, 12:16:08 AM
On Tuesday, 27 April 2021 at 20:07:40 UTC+8, Juha Nieminen wrote:
> Recently I posted a controversial thread where I recounted my recent
> experience trying to tutor a beginner programmer who wanted to create
> a small program that would read a file and allow the user to edit it,
> like a simple text editor, and how frustrating it was trying to do
> this in C due to the complexities of its memory management.
>
> Because such accusations about C is heresy, I thought I would redeem
> myself and cleanse my sins by commenting a bit on the flipside, based
> on actual examples out there that I have had to deal with.
>
> While trying to tutor that beginner, I was constantly thinking how
> much easier the task would have been in C++, using std::string
> (and std::vector, etc).
>
> Indeed, std::string makes such things so much easier and simpler.
> On the flipside, perhaps it makes things a bit *too* easy, as it
> induces people to create inefficient code with it, perhaps due to
> ignorance or laziness.
>
> For example, I am currently dealing with C++ code that does a lot of
> things like:
>
> if(str1.substr(n, 5) == str2)
>

With my library, it would be: if(str1.cseg(n,5)==str2.cseg())
With QString, there may be more examples.
In quite a few cases, using some kind of XX::vect<char> is efficient.

> Mind you, this is in commercial production code done by a relatively
> experienced C++ programmer, not some noob.
>
> I'm constantly thinking like "sheesh, there's already a function that
> does that exact comparison in-place, without creating any temporary
> copies, avoiding a useless allocation, copying of data and deallocation."
> Namely:
>
> if(str1.compare(n, 5, str2) == 0)
>
> In this particular case it's not like the more efficient version is more
> complicated to write or use, or would require several lines of code,
> or something.
>
> There are lots and lots of other examples, related to std::string
> (and std::wstring), such as using it for fixed-sized strings in situations
> where a small inbuilt char array would suffice, always taking a std::string
> as parameter when a const char* would suffice (or could well be
> provided as a more efficient alternative) and so on and so forth.
>
> std::string is an awesome tool that makes one's life a thousand times
> easier, but any competent C++ programmer should learn how to use it
> *efficiently*, not just lazily. Know your tools. Use them efficiently.

There are lots of examples where using a C string is more efficient and
convenient than wrapping it in std::string, which is sort of OO-twisted.
The basic tool is still C. Yes, know your tools. Use them efficiently.

Juha Nieminen

Apr 28, 2021, 1:21:47 AM
Öö Tiib <oot...@hot.ee> wrote:
> The focus has long gone away from considering each time we solve some
> problem if the result is most efficient. On most cases it matters
> if it works correctly for all input and if it is easy to understand what is
> going on. Efficiency matters only for about 5% of code base.

I don't see a problem in choosing the more efficient solution, especially
if both solutions are approximately of the same complexity.

The problem with thinking like "this is just like a millisecond slower,
it doesn't really matter" is that when an entire huge program is full of
such compromises, they stack up, and may add up to a significant slowdown.
Multiply that millisecond by a thousand instances of doing the same thing,
and suddenly your program takes a second to do something that it could
be doing in a hundredth of a second. It will start feeling sluggish instead
of responding immediately. It may be slow to react to things. If the
program is being constantly run, doing that same thing over and over,
suddenly it might take a minute to do something that it could be doing
in one second.

Then you wonder why programs seem to be getting slower and slower,
even though hardware is getting faster and faster. It's exactly because of
this kind of attitude by programmers. It's precisely because of the
"efficiency only matters only for about 5% of code base". Because of the
"it's just like a millisecond slower, it doesn't matter."

Paavo Helde

Apr 28, 2021, 4:14:15 AM
There are lots of examples where they are equally efficient and equally
(in)convenient, like here with strcmp() and std::string::compare().

And there are lots of examples where using C strings is less efficient
(constantly recalculating the string length as it is not stored
anywhere) and/or less convenient. So indeed, know your tools and use
them efficiently.

Öö Tiib

Apr 28, 2021, 12:09:47 PM
On Wednesday, 28 April 2021 at 08:21:47 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > The focus has long gone away from considering each time we solve some
> > problem if the result is most efficient. On most cases it matters
> > if it works correctly for all input and if it is easy to understand what is
> > going on. Efficiency matters only for about 5% of code base.
> I don't see a problem in choosing the more efficient solution, especially
> if both solutions are approximately of the same complexity.

I don't see any problem either way as both seemed as error prone and used
same magic number 5 ... I only thought wtf is that 5.

> The problem with thinking like "this is just like a millisecond slower,
> it doesn't really matter" is that when an entire huge program is full of
> such compromises, they stack up, and may add up to a significant slowdown.
> Multiply that millisecond by a thousand instances of doing the same thing,
> and suddenly your program takes a second to do something that it could
> be doing in a hundreth of a second.

Nope. Paavo said that the difference was 4 nanoseconds.
Multiplying 4 nanoseconds by a thousand gives 4 microseconds.
You need to multiply by 250 000 to even reach that one millisecond.
There are only rare entity types in programs whose counts ever reach
such numbers, and even with those you have won only that millisecond.

> It will start feeling sluggish instead
> of responding immediately. It may be slow to react to things. If the
> program is being constantly run, doing that same thing over and over,
> suddenly it might take a minute to do something that it could be doing
> in one second.

Have you ever used a profiler on some large product? It never runs all of
its code base over and over. Most code runs next to never and so is best
left exactly as inefficient as it was written. Changing it can just
cause more defects in rarely verified places and so waste time on
doing something meaningless and money on funding something
counterproductive. Meanwhile a lot of (perhaps the majority of) programs can
be made orders of magnitude faster overall by concentrating the effort
of choosing optimal algorithms on those few places where it matters,
because those places do run over and over.

> Then you wonder why programs seem to be getting slower and slower,
> even though hardware is getting faster and faster. It's exactly because of
> this kind of attitude by programmers. It's precisely because of the
> "efficiency only matters only for about 5% of code base". Because of the
> "it's just like a millisecond slower, it doesn't matter."

Nonsense, I don't wonder; I have decades of evidence that the attitude
is correct, and that chasing optimality when the focus should be on robustness
and correctness is worthless and counterproductive. Projects avoid lagging
behind, getting nowhere or failing precisely because they don't tinker with
things that will never matter. And so there will be time to profile and
to optimize.

Juha Nieminen

Apr 29, 2021, 12:45:17 AM
Öö Tiib <oot...@hot.ee> wrote:
> I don't see any problem either way as both seemed as error prone and used
> same magic number 5 ... I only thought wtf is that 5.

Are you always this nitpicky about examples (that have been simplified from
the real-life counterparts that they are based on)?

> Have you ever used profiler on some large product?

If you have slight unnecessary inefficiencies at a thousand places in a
huge program, the profiler isn't going to tell you where and which.
The only thing it's going to tell you is that "this action is taking
1 second" but it won't be able to pinpoint the exact bottleneck...
because the bottleneck will be about evenly distributed among a
thousand instances of "this doesn't need to be efficient".

It's, arguably, the worst possible bottleneck that could exist,
because you can't pinpoint it. And fixing it is extraordinarily
laborious, it will probably never be done, and the program will be
doomed to being inefficient because nobody will fix it. All because
of thousands of instances of the "this doesn't need to be efficient"
attitude.

> Nonsense, I don't wonder, I have decades of evidence that the attitude
> is correct, and considering optimal when focus should be to robustness
> and correctness is worthless and counterproductive.

Yeah, because string::substr() is so much more robust and correct than
string::compare().

I think that *deliberately* writing inefficient code is highly
counterproductive.

Öö Tiib

Apr 29, 2021, 6:16:31 PM
On Thursday, 29 April 2021 at 07:45:17 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > I don't see any problem either way as both seemed as error prone and used
> > same magic number 5 ... I only thought wtf is that 5.
> Are you always this nitpicky about examples (that have been simplified from
> the real-life counterparts that they are based on)?

Sorry, but those removed parts can actually show why it was done like that.

> > Have you ever used profiler on some large product?
> If you have slight unnecessary inefficiencies at a thousand places in a
> huge program, the profiler isn't going to tell you where and which.
> The only thing it's going to tell you is that "this action is taking
> 1 second" but it won't be able to pinpoint the exact bottleneck...
> because the bottleneck will be about evenly distributed among a
> thousand instances of "this doesn't need to be efficient".

No, so you haven't used a profiler? It always shows that only a tiny subpart
of the whole program code runs over 90% of the time. The rest of the code runs
more and more rarely. Half of most C++ code could be rewritten in Python and
the overall perceived performance of the product would not change at all.

> It's, arguably, the worst possible bottleneck that could exist,
> because you can't pinpoint it. And fixing it is extraordinarily
> laborious, it will probably never be done, and the program will be
> doomed to being inefficient because nobody will fix it. All because
> of thousands of instances of the "this doesn't need to be efficient"
> attitude.

Because that bottleneck does not exist. All that time wasted, and 1%
of performance gained. Meanwhile, opportunities to improve
performance severalfold are lost.

> > Nonsense, I don't wonder, I have decades of evidence that the attitude
> > is correct, and considering optimal when focus should be to robustness
> > and correctness is worthless and counterproductive.
> Yeah, because string::substr() is so much more robust and correct than
> string::compare().

For example, size() of all containers wasn't required to be O(1) in C++98, but
programmers used it to check whether a container was empty. It had to be
made O(1) because that was somehow easier for programmers to reason about.
The same may hold for your code: whatever you simplified out may have
made that substr() easier to reason about than compare().

> I think that *deliberately* writing inefficient code is highly
> counterproductive.

I don't think someone deliberately wrote c.size()==0 when it was less
efficient than c.empty() ... or it++ when ++it was more efficient. Their
focus was simply on things that mattered.

Juha Nieminen

Apr 30, 2021, 6:10:39 AM
Öö Tiib <oot...@hot.ee> wrote:
> Because that bottleneck does not exist. All that time wasted and 1%
> of performance gained. Meanwhile possible opportunities to improve
> performance several times lost.

If an action that could take a hundredth of a second is instead
taking a second, there certainly is a bottleneck.

The problem is that if the bottleneck is distributed among
thousands of sloppily written lines, each with an attitude of
"this doesn't need to be efficient", you aren't going to fix
that bottleneck any time soon.

>> I think that *deliberately* writing inefficient code is highly
>> counterproductive.
>
> I don't think someone deliberately wrote c.size()==0 when it was less
> efficient than c.empty() ... or it++ when ++it was more efficient. Their
> focus was simply on things that mattered.

When the attitude is "this doesn't need to be efficient", and
"it doesn't matter if this takes 10 times longer because the
time is still in the microseconds", then you *are* deliberately
and willingly writing inefficient code.

Bo Persson

Apr 30, 2021, 9:34:11 AM
On 2021-04-30 at 12:10, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
>> Because that bottleneck does not exist. All that time wasted and 1%
>> of performance gained. Meanwhile possible opportunities to improve
>> performance several times lost.
>
> If an action that could take a hundreth of a second is instead
> taking a second, there certainly is a bottleneck.

It's not a bottleneck if that program runs 3 times a day, at non-peak hours.

>
> The problem is that if the bottleneck is distributed among
> thousands of sloppily written lines, each with an attitude of
> "this doesn't need to be efficient", you aren't going to fix
> that bottleneck any time soon.
>
>>> I think that *deliberately* writing inefficient code is highly
>>> counterproductive.
>>
>> I don't think someone deliberately wrote c.size()==0 when it was less
>> efficient than c.empty() ... or it++ when ++it was more efficient. Their
>> focus was simply on things that mattered.
>
> When the attitude is "this doesn't need to be efficient", and
> "it doesn't matter if this takes 10 times longer because the
> time is still in the microseconds", then you *are* deliberately
> and willingly writing inefficient code.
>

It is often a more efficient use of the developer's time to focus on the
applications that have the most runtime on a specific system - perhaps
using several CPU-hours per day - not on trying to get everything else down
from 5 to 4 microseconds.

Öö Tiib

May 1, 2021, 9:51:49 AM
On Friday, 30 April 2021 at 13:10:39 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > I don't think someone deliberately wrote c.size()==0 when it was less
> > efficient than c.empty() ... or it++ when ++it was more efficient. Their
> > focus was simply on things that mattered.
>
> When the attitude is "this doesn't need to be efficient", and
> "it doesn't matter if this takes 10 times longer because the
> time is still in the microseconds", then you *are* deliberately
> and willingly writing inefficient code.

But performance does not *actually* matter for most of the software
development life cycle. Donald Knuth's fat volumes on the science of
optimizing algorithms are outright admirable. Yet he wrote in
"Computer Programming as an Art" (1974) the golden words that
every programmer I know can cite: "premature optimization is
the root of all evil". So come back from the dark and evil side, Juha.
Performance starts to matter only after correctness has been
reached from end to end, and then only for a minor subpart of the
whole program.

Alf P. Steinbach

May 3, 2021, 4:29:09 AM
On 27.04.2021 14:07, Juha Nieminen wrote:
> [snip]
>
> Indeed, std::string makes such things so much easier and simpler.
> On the flipside, perhaps it makes things a bit *too* easy, as it
> induces people to create inefficient code with it, perhaps due to
> ignorance or laziness.
>
> For example, I am currently dealing with C++ code that does a lot of
> things like:
>
> if(str1.substr(n, 5) == str2)
>
> Mind you, this is in commercial production code done by a relatively
> experienced C++ programmer, not some noob.
>
> I'm constantly thinking like "sheesh, there's already a function that
> does that exact comparison in-place, without creating any temporary
> copies, avoiding a useless allocation, copying of data and deallocation."
> Namely:
>
> if(str1.compare(n, 5, str2) == 0)
>
> In this particular case it's not like the more efficient version is more
> complicated to write or use, or would require several lines of code,
> or something.

The rewrite has many more degrees of freedom. I couldn't write that
without checking the documentation of `compare`. Of course, that
checking might be nothing more than inspecting an editor's tooltip, but.

Also, the first version expresses nearly directly the desired operation,
while the rewrite is just that: a rewrite to fit more arbitrary
requirements, in the direction of expressing C++ code in assembly.

As I see it the second version is therefore decidedly less clear to a
good C++ programmer (even a C++20 programmer accustomed to `<=>`), so
that it can waste people's time when they have to deal with the code.
But since it uses the three-value idiom it may be slightly more clear to
a C programmer accustomed to `strcmp`; at least, to an associative C
programmer (and which programmer doesn't delegate the small stuff to the
brain's automatic associative machinery?). With "== 0" there's something
familiar, while "== str2" could look like a bug of comparing pointers.

The efficiency or not of a single instance of possible dynamic
allocation is of little concern unless /measurements/ say that this, or
the wholesale adoption of this as a practice, is a problem.

However, by coding for efficiency /by default/, when that's possible
without sacrificing clarity, one is likely to avoid Microsoft-like
sluggishness and extreme inefficiency. Like half a minute to show a
right-click popup menu, or directory operations so sluggish that high
quality programs give up (ordinary programs just slow down with
inexplicable pauses). In modern (C++17 and later) code one can do

if( string_view( str1 ).substr(n, 5) == str2 )

Disclaimer: I didn't compile that.

And of course, if that's needed three times or more one could/should
define a function `substr_view`.
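
E.g. something like this (also not compiled, just a sketch):

  #include <string>
  #include <string_view>

  // Read-only view of a piece of `s`; no copying, no allocation.
  // The view is only valid for as long as `s` is.
  inline auto substr_view( const std::string& s, std::size_t pos, std::size_t count = std::string::npos )
      -> std::string_view
  { return std::string_view( s ).substr( pos, count ); }

  // Usage:    if( substr_view( str1, n, 5 ) == str2 ) ...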


> There are lots and lots of other examples, related to std::string
> (and std::wstring), such as using it for fixed-sized strings in situations
> where a small inbuilt char array would suffice, always taking a std::string
> as parameter when a const char* would suffice (or could well be
> provided as a more efficient alternative) and so on and so forth.

I agree that the standard library support is currently very much less
than complete for supporting basic programming.

In the positive direction, for `std::string` one can use
`std::string_view` in a great many cases, in particular for parameters,
but also for e.g. your string comparison example.

Using `std::string_view` as parameter type when that's practically
possible, should be one's default.

But when the parameter is to be used as a null-terminated C string then
a `std::string_view` parameter introduces inefficient copying in all cases.
That gross inefficiency can be avoided with /a parameter type that represents
a guaranteed null-terminated string/, i.e., by not just throwing away that
information about the actual argument. Then the copying is only needed
when the actual argument is a `std::string_view`, and furthermore it can
then be centralized and hidden away in the parameter type.

That's one glaring omission in the standard library.
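
A rough sketch of what I mean (not compiled, and the name `C_str_param`
is just invented for the example):

  #include <string>
  #include <string_view>

  class C_str_param
  {
      std::string     m_storage;      // used only when a copy is unavoidable
      const char*     m_ptr;

  public:
      C_str_param( const char* s ): m_ptr( s ) {}
      C_str_param( const std::string& s ): m_ptr( s.c_str() ) {}
      C_str_param( std::string_view s ): m_storage( s ), m_ptr( m_storage.c_str() ) {}

      // A parameter object is not meant to be copied around.
      C_str_param( const C_str_param& ) = delete;

      auto c_str() const -> const char* { return m_ptr; }
  };

  // A function that ultimately hands the string to e.g. `std::fopen` can
  // take a `C_str_param`, and then only a plain view argument pays for a copy.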

A guaranteed non-throwing string carrier is sometimes needed, e.g. as
part of an exception value. The standard library /uses/ such a type
internally; it's not just an implementation detail but absolutely
required, since standard exceptions must be copyable without throwing.
But that internal type is not exposed. One can implement such a type
trivially, however, by leveraging e.g. a `std::runtime_error` as string
carrier.

So, that's another glaring omission in the standard library.
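
For example, roughly like this (again not compiled, class name invented
for the example):

  #include <stdexcept>
  #include <string>

  // Copyable string carrier whose copy constructor cannot throw, by
  // piggybacking on the message buffer of `std::runtime_error`, whose
  // copying is required to be non-throwing (typically it's reference
  // counted internally).
  class No_throw_string_carrier
  {
      std::runtime_error      m_carrier;

  public:
      explicit No_throw_string_carrier( const std::string& s ): m_carrier( s ) {}
      No_throw_string_carrier( const No_throw_string_carrier& ) noexcept = default;

      auto c_str() const noexcept -> const char* { return m_carrier.what(); }
  };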


> std::string is an awesome tool that makes one's life a thousand times
> easier, but any competent C++ programmer should learn how to use it
> *efficiently*, not just lazily. Know your tools. Use them efficiently.

Yep.


- Alf

Paavo Helde

May 3, 2021, 2:38:53 PM
That's true, but sometimes the subpart where performance matters is not
known beforehand. A lot of C++ code goes into reusable libraries, and
when writing a library function it is often not known how many times it
will get called by different client programs. So if there is an easy way
to write some code line to run 10x faster, in library code it should be
done IMO.

Juha Nieminen

May 4, 2021, 6:33:10 AM
Bo Persson <b...@bo-persson.se> wrote:
> It is often more efficient use of the developer's time to focus on the
> applications that has the most runtime on a specific system - perhaps
> using several CPU-hours per day. Not trying to get everything else down
> from 5 to 4 microseconds.

Or, conversely, it may be worth it for the developer to spend an extra 10 hours
of development time if the end result is a program that runs twice as fast,
saving potentially thousands of CPU hours in the long run.

Those microseconds add up when they are run many, many times.

Juha Nieminen

May 4, 2021, 6:44:10 AM
Öö Tiib <oot...@hot.ee> wrote:
> But performance does not *actually* matter most of software
> development life cycle. Donald Knuth's fat volumes of science of
> optimizing algorithms are outright admirable. Yet he wrote in
> "Computer Programming as an Art" (1974) golden words that
> every programmer I know can cite "premature optimization is
> the root of all evil". So come back from dark and evil side, Juha.
> Performance starts to matter only after correctness has been
> reached from end to end and then only to minor subpart of
> whole program.

When Knuth wrote "premature optimization is the root of all evil"
in 1974, I doubt he was thinking about std::string::substr()
vs. std::string::compare() in a language standardized in 1998.

In fact, I personally consider the (in)famous adage to have caused
more harm than good overall because way, way, way too many programmers
have interpreted it to mean "deliberately and purposefully make your
program as inefficient as you possibly can, because optimization,
any kind of optimization, is one of the worst things you could
possibly do".

I'm not even kidding. I have encountered that sort of attitude in the
real world out there many times. People deliberately choosing a less
efficient algorithm or data container even when a significantly
more efficient one is available, because "premature optimization
is the root of all evil", and there's literally zero reason not to
use the more efficient version.

I don't think many people even understand what he meant by it.
Does anybody understand what he meant by it? By "premature optimization"
didn't he merely mean ugly hacks that supposedly make the program
a few clock cycles faster at the cost of making it almost unreadable
(something that was especially common in the 70's and 80's, when
compilers were extraordinarily basic and bad at optimization).
Such as explicitly using bitshifting instead of multiplication,
manually inlining code, and other such hacks. The more such hacks
are used, the less readable and maintainable the code belongs, and it
becomes.

"Premature optimization" does not mean using quicksort instead of
bubble sort. Or using a balanced binary tree instead of an array (and
linear searches). Or using std::string::compare() instead of
std::string::substr().

Juha Nieminen

May 4, 2021, 6:53:43 AM
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
>> if(str1.substr(n, 5) == str2)
>>
>> if(str1.compare(n, 5, str2) == 0)
>
> The rewrite has many more degrees of freedom. I couldn't write that
> without checking the documentation of `compare`. Of course, that
> checking might be nothing more than inspecting an editor's tooltip, but.
>
> Also, the first version expresses nearly directly the desired operation,
> while the rewrite is just that: a rewrite to fit more arbitrary
> requirements, in the direction of expressing C++ code in assembly.

The two lines are about equal in length and complexity, are arguably
equally clear to read and understand, and as someone measured, the
second version is about 10 times faster than the first version.

Just because you personally are not acquainted too well with
std::string::compare() doesn't make it somehow the inferior option.

There really is no good reason to deliberately make the comparison
10 times slower just so that it can have "substr" instead of "compare".

It would be different if the faster version would be like 10 lines of
complicated code while the slower version is just a one-liner. Then
there may be an argument to be made. In this case, however, just no.

David Brown

May 4, 2021, 9:20:27 AM
I have certainly seen code written without any thought given to
efficiency (like "x * 0.5" instead of "x / 2" on an 8-bit embedded
system). But I don't believe I have seen anyone deliberately choosing a
less efficient algorithm when the efficiency was the only difference.

> I don't think many people even understand what he meant by it.
> Does anybody understand what he meant by it? By "premature optimization"
> didn't he merely mean ugly hacks that supposedly make the program
> a few clock cycles faster at the cost of making it almost unreadable
> (something that was especially common in the 70's and 80's, when
> compilers were extraordinarily basic and bad at optimization).
> Such as explicitly using bitshifting instead of multiplication,
> manually inlining code, and other such hacks. The more such hacks
> are used, the less readable and maintainable the code belongs, and it
> becomes.
>
> "Premature optimization" does not mean using quicksort instead of
> bubble sort. Or using a balanced binary tree instead of an array (and
> linear searches). Or using std::string::compare() instead of
> std::string::substr().
>

For most code, efficiency does not particularly matter (pick your own
favourite statistic - 95.32% of run time is spent inside 2.12% of the
code). It might begin to matter if the bulk of the code uses really bad
algorithms, but you don't need to put special effort into it. In
particular, you should not optimise for efficiency at the expense of
making code less clear, or increasing the risk of making it incorrect.
I can't remember if it was Knuth or someone else who said that
optimising for speed is easy if you don't worry about correctness.

There are good reasons why most code these days is not written in
compiled languages - developer time, ease of maintenance, flexibility,
etc., are often more important than absolute speed.

But premature pessimisation follows close behind premature optimisation
as the root of all evil. So it might be better to write a piece of code
in Python rather than C++, but you still want to make sure you use lists
or dicts as appropriate and avoid needlessly inefficient algorithms.

Manfred

May 4, 2021, 10:41:50 AM
You are bringing up an important aspect here.
I have also experienced that performance really matters in only a
small fraction of code; however, I am seeing the opposite attitude
grow to the point where the performance hit becomes actually relevant
even for ordinary applications.
The most relevant issue is not really about writing one piece of C or
C++ code in one way or the other - what is becoming (has become)
relevant IMO is pervasive use of managed languages because of their
appeal to make programming "easy" at the cost of some performance hit
that is supposed to be not that relevant after all with nowadays' hardware.
This marketing narrative may be appealing to some management that has
a financially biased perspective on sw development - be that as it may,
the fact is that it is quite common to run into some trivial application
that needs to load dozens of DLLs just to keep itself running.

The issue even stops being a matter of preventing "premature
optimization", i.e. favoring simplicity over sophistication for the purpose
of ensuring correctness. When you build something that needs a ton of
subsystems for the sake of its own language infrastructure, you are
still introducing extra and brittle sophistication even if it is hidden
by the curtain of apparently compact source code.

James Kuyper

May 4, 2021, 11:42:50 PM
On 5/4/21 6:43 AM, Juha Nieminen wrote:
...> When Knuth wrote "premature optimization is the root of all evil"
> in 1974, I doubt he was thinking about std::string::substr()
> vs. std::string::compare() in a language standardized in 1998.
The fact that the specifics of a particular optimization were unknown to
him doesn't render his advice about optimization irrelevant to that
decision. It was very general advice, not specific to any particular
programming language or development environment.

> In fact, I personally consider the (in)famous adage to have caused
> more harm than good overall because way, way, way too many programmers
> have interpreted it to mean "deliberately and purposefully make your
> program as inefficient as you possibly can, because optimization,
> any kind of optimization, is one of the worst things you could
> possibly do".
>
> I'm not even kidding. I have encountered that sort of attitude in the
> real world out there many times. People deliberately choosing a less
> efficient algorithm or data container even when a significantly
> more efficient one is available, because "premature optimization
> is the root of all evil", and there's literally zero reason not to
> use the more efficient version.
If you say you've seen such behavior, I can't prove you wrong, but I
would consider any developer who did such a thing insane. Such behavior
is pretty much completely unrelated to what Knuth actually meant.

> I don't think many people even understand what he meant by it.
> Does anybody understand what he meant by it? By "premature optimization"
> didn't he merely mean ugly hacks that supposedly make the program
> a few clock cycles faster at the cost of making it almost unreadable
> (something that was especially common in the 70's and 80's, when
> compilers were extraordinarily basic and bad at optimization).
> Such as explicitly using bitshifting instead of multiplication,
> manually inlining code, and other such hacks. The more such hacks
> are used, the less readable and maintainable the code belongs, and it
> becomes.
While those are the worst examples of premature optimization, any time
you have a choice between a clear, easily understood algorithm, and a
more efficient algorithm that's less easy to understand, Knuth's adage
quite rightly favors using the clearer one rather than the more
efficient one, unless and until you've determined through such methods
as profiling that it's sufficiently less efficient for the difference to
matter. I could easily imagine someone who was not sympathetic to such
advice incorrectly describing such a decision the way you did above. The
key thing that would be incorrect about doing so would be the claim that
"there's literally zero reason not to use the more efficient version".
Greater clarity can be a very legitimate reason for such a decision.

Juha Nieminen

May 5, 2021, 1:01:44 AM
James Kuyper <james...@alumni.caltech.edu> wrote:
> While those are the worst examples of premature optimization, any time
> you have a choice between a clear, easily understood algorithm, and a
> more efficient algorithm that's less easy to understand, Knuth's adage
> quite rightly favors using the clearer one rather than the more
> efficient one, unless and until you've determined through such methods
> as profiling that it's sufficiently less efficient for the difference to
> matter. I could easily imagine someone who was not sympathetic to such
> advice incorrectly describing such a decision the way you did above. The
> key thing that would be incorrect about doing so would the claim that
> "there's literally zero reason not to use the more efficient version".
> Greater clarity can be a very legitimate reason for such a decision.

The argument has been made that getting the program working and stable in
a timely manner is more important than whether it happens to be a few
seconds faster or not. After all, in this busy world we don't have time
to waste on such trivialities as code optimization. We need a working
product yesterday! We don't have time nor money to make it actually
efficient!

The thing is, if that's the approach, then do you really think that
such developers will later come back to the product and go through the
time and effort of profiling it and refactoring it to make it more
efficient?

After all, writing the inefficient version and then the more efficient
version is more time spent than merely writing the inefficient version
and doing nothing more.

So, the best way to avoid this is to write the more efficient version
from the get-go! Then there's no need to go back and profile the program
and refactor it (which sometimes may mean extensive changes to the
program).

Sure, maybe if the "KISS" version of the routine takes 1 minute to write
and the more efficient version takes 3 hours to write, there might be an
argument there. However, if the more efficient version takes 3 minutes to
write, I would argue it's worth it.

David Brown

May 5, 2021, 5:26:26 AM
On 05/05/2021 07:01, Juha Nieminen wrote:
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> While those are the worst examples of premature optimization, any time
>> you have a choice between a clear, easily understood algorithm, and a
>> more efficient algorithm that's less easy to understand, Knuth's adage
>> quite rightly favors using the clearer one rather than the more
>> efficient one, unless and until you've determined through such methods
>> as profiling that it's sufficiently less efficient for the difference to
>> matter. I could easily imagine someone who was not sympathetic to such
>> advice incorrectly describing such a decision the way you did above. The
>> key thing that would be incorrect about doing so would the claim that
>> "there's literally zero reason not to use the more efficient version".
>> Greater clarity can be a very legitimate reason for such a decision.
>
> The argument has been made that getting the program working and stable in
> a timely manner is more important than whether it happens to be a few
> seconds faster or not. After all, in this busy world we don't have time
> to waste on such trivialities as code optimization. We need a working
> product yesterday! We don't have time nor money to make it actually
> efficient!

That is certainly correct for some systems.

>
> The thing is, if that's the approach, then do you really think that
> such developers will later come back to the product and go through the
> time and effort to profiling it and refactoring it to make it more
> efficient?

For a great deal of code written, there is no "later". Many products -
and the software in them - have short lifetimes. If you spend a lot of
time developing efficient solutions, the product will never exist -
while you have been ensuring your system reacts in 10 milliseconds
instead of 100 milliseconds, competing companies have taken the market
and you (or your client) is out of business. The competing company
won't bother optimising their software efficiency because they will only
be making the product for 6 months before moving on to something new.

Software development means getting the balance right between the
different costs in time, money, resources, etc. If time to market is
the key issue, then optimising the software development might mean using
off-the-shelf embedded Linux boards running code in Python even though
you are just controlling a few buttons and a small motor that could be
controlled by a $1 microcontroller. If developer costs are the key
issue, you use whatever language that one developer knows best. If
production costs are the key issue, you use the cheapest
microcontroller. If battery lifetimes are key, you optimise there.

There are only two things you can be sure about. One is that if you
concentrate /solely/ on run-time speed efficiency of the code, you will
be wrong more often than you are right. The other is that no matter
what the time or money pressure, getting the software /right/ trumps all
other considerations.

Knuth's observation is not about optimisation of code speed - it is
about optimisation of the development process.

>
> After all, writing the inefficient version and then the more efficient
> version is more time spent than merely writing the inefficient version
> and doing nothing more.
>

And if writing the inefficient version takes less time than writing the
efficient version, then often it is the right answer. Computer time is
vastly cheaper than developer time - you don't spend even an hour of
developer time to save a millisecond of run time unless that code is
going to be run at least a million million times. /Some/ code is run
that often, but most code is not.

James Kuyper

May 5, 2021, 9:54:13 AM

On 5/5/21 1:01 AM, Juha Nieminen wrote:
...
> The argument has been made that getting the program working and
> stable in a timely manner is more important than whether it happens
> to be a few seconds faster or not. After all, in this busy world we
> don't have time to waste on such trivialities as code optimization.
> We need a working product yesterday! We don't have time nor money to
> make it actually efficient!
Code optimization can be very important - but it's also difficult and
costly, which is why you should be concentrating your time optimizing
those locations where most of the processing time is actually being
spent. Diffusing your efforts by trying to optimize everything is a waste
of expensive developer time.

> The thing is, if that's the approach, then do you really think that
> such developers will later come back to the product and go through
> the time and effort to profiling it and refactoring it to make it
> more efficient?
> After all, writing the inefficient version and then the more
> efficient version is more time spent than merely writing the
> inefficient version and doing nothing more.
I've never worked on a project where the program wasn't undergoing
constant revision. As soon as we deliver one version, we start working
on the next. In my experience adding new features in one version, and
improving them in a later version is the norm, not the exception.

> So, the best way to avoid this is to write the more efficient
> version from the get-go! Then there's no need to go back and profile
> the program and refactor it (which sometimes may mean extensive
> changes to the program).
If you spend time making a part of the program more complicated and
harder to understand in the name of making it more efficient, without
having first confirmed that it's a significant source of delays, that
effort is far more likely than not to have been wasted. That time could
have been better spent delivering other desired features, or improving
the performance of a section that really is a major time sink. Also,
having made the code harder to understand will pay off with bugs and
maintenance nightmares in the future.

Öö Tiib

May 6, 2021, 9:21:15 PM
Yes, I agree. You are saying the same thing. The subpart where performance
matters often becomes known only after correctness has been reached
from end to end and the code is put into use by different clients. Usually
that is because it does everything it was intended to do and nothing that it
wasn't intended to do. Often the context and data we considered "realistic"
when we profiled were far from the needs that arise a few years later. But
fortunately software can be profiled and improved in a few years too.

> So if there is an easy way
> to write some code line to run 10x faster, in library code it should be
> done IMO.

But you somehow conclude from the above that we should ponder, for each
line of code, whether there is some easier or harder way to make it faster. Why?
The whole file containing that line may well be erased entirely by the time
we get an actual performance issue and data to profile with.

Öö Tiib

May 7, 2021, 12:40:24 AM
On Tuesday, 4 May 2021 at 13:44:10 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > But performance does not *actually* matter most of software
> > development life cycle. Donald Knuth's fat volumes of science of
> > optimizing algorithms are outright admirable. Yet he wrote in
> > "Computer Programming as an Art" (1974) golden words that
> > every programmer I know can cite "premature optimization is
> > the root of all evil". So come back from dark and evil side, Juha.
> > Performance starts to matter only after correctness has been
> > reached from end to end and then only to minor subpart of
> > whole program.
> When Knuth wrote "premature optimization is the root of all evil"
> in 1974, I doubt he was thinking about std::string::substr()
> vs. std::string::compare() in a language standardized in 1998.

Demagogy ... perhaps read the book.

> In fact, I personally consider the (in)famous adage to have caused
> more harm than good overall because way, way, way too many programmers
> have interpreted it to mean "deliberately and purposefully make your
> program as inefficient as you possibly can, because optimization,
> any kind of optimization, is one of the worst things you could
> possibly do".

Most problems with C++ come from that wrong focus on unneeded
micro-optimizations (which are always possible) instead of using it
as a language in which software can be written rapidly, safely and
correctly.

> I'm not even kidding. I have encountered that sort of attitude in the
> real world out there many times. People deliberately choosing a less
> efficient algorithm or data container even when a significantly
> more efficient one is available, because "premature optimization
> is the root of all evil", and there's literally zero reason not to
> use the more efficient version.

We have all met insane things, but someone deliberately and specifically
writing inefficient code is hard to believe ... I have never seen nor heard of
it. Maybe the evil of premature optimization made you misrepresent
someone or make that story up? Then it is truly the root of *all* evil. :D

> I don't think many people even understand what he meant by it.
> Does anybody understand what he meant by it? By "premature optimization"
> didn't he merely mean ugly hacks that supposedly make the program
> a few clock cycles faster at the cost of making it almost unreadable
> (something that was especially common in the 70's and 80's, when
> compilers were extraordinarily basic and bad at optimization).
> Such as explicitly using bitshifting instead of multiplication,
> manually inlining code, and other such hacks. The more such hacks
> are used, the less readable and maintainable the code belongs, and it
> becomes.

There are lots of faces of evil and only precious few of them are visibly
ugly. Let's look at that "time-wasting face" in the context of passing a string:

1) Should we move the string here instead of copying? No idea, just copy
for now; later it may turn out that we need it down the line.
2) Should we make it constexpr? Maybe, as it is currently just a string constant,
but such things tend to be wanted configurable later, so let it be just a string.
3) Should we use string_view? Who knows, maybe we want to pass it to a
different thread. Leave it as a string for now.

See? All of these questions were about C++ micro-optimizations,
none of which is an ugly hack, but time was wasted considering them.
If applied prematurely, they also add constraints that potentially
waste time in the future.

> "Premature optimization" does not mean using quicksort instead of
> bubble sort.

Usage of std::sort is not premature optimization. Usage of something
else without clear reason is premature optimization or pessimization
or just foolishness.

> Or using a balanced binary tree instead of an array (and
> linear searches).

Usage of std::multiset or a sorted std::vector with std::lower_bound or
std::unordered_multiset is just usual usage of usual containers, so
not premature optimization.
Usage of things that few are familiar with, like boost::intrusive::sgtree
or your own implementation of such, is premature optimization or
pessimization or just foolishness.

> Or using std::string::compare() instead of
> std::string::substr().

Depends on how it is done. If someone used it, changed it as part of other
work or suggested it in code review, then it is fine. But if someone
raised an issue management ticket about it, then ... premature optimization
has done outright sabotage there IMO.

Juha Nieminen

May 7, 2021, 3:38:34 AM
Öö Tiib <oot...@hot.ee> wrote:
>> The more such hacks
>> are used, the less readable and maintainable the code belongs, and it
>> becomes.

I don't know what kind of brainfart happened to me here. Obviously I meant
to write "the less readable and maintainable the code becomes".
I think I wrote something longer first, but then re-edited it to be shorter
but somehow I botched the editing and didn't remove something I should have.

But anyway.

> There are lot of faces of evil and only precious few of those are visibly
> ugly. Lets look at that "time-wasting face" in context of passing string?
>
> 1) Should we move the string here instead of copying? No idea, just copy
> right now, later may appear that we need it down the line.
> 2) Should we make it constexpr? Maybe, as it is currently just string constant
> but such things are wanted configurable later, so let it be just string.
> 3) Should we use string_view? Who knows, maybe we want to pass it to
> different thread. Leave it as string now.
>
> See? All questions were about C++ micro-optimizations
> none of what is ugly hack but time was wasted to considering.
> If to apply those prematurely these add constraints that potentially
> waste time in the future too.

I think that an experienced C++ programmer can gain enough knowledge of
the language and its standard library that using the more efficient
version of things becomes second nature and happens fluently and without
much disruption.

In this particular case, for example, it shouldn't be a question of
thinking that taking a substr() and comparing it to another string is
the way to do the comparison but then starting to think if there might
be a better way and spending ten minutes researching cppreference.com
to see if there might be something more efficient.

Instead, optimally it should be the case that you already know of
std::string::compare() (or at the very minimum remember that something
like it exists) and directly implement the comparison using it, because
you know it's the more efficient way. No time is wasted on this kind of
"micro-optimization". Your code will be ready equally fast, but it will
not be so wasteful.

To me it makes no sense to think "I'll just use substr() and maybe
consider changing it to compare() later". Why? I just use compare()
from the get-go. I see no problem with it.

Paavo Helde

May 7, 2021, 5:06:01 AM
For me it seems very easy: the programmer needs to choose, for each line,
how to write it anyway. For each line or task there are many ways
to code it incorrectly, there are some ways to code it more or less
correctly, and then there are optimizations which might not be obvious,
so cannot be taken into account when first writing that line.

This correctness for me also includes reasonable performance. For
example, choosing a solution with O(log N) complexity over O(N). Even
though the latter would also work and produce the needed result, it
would not be a "correct" solution in my book, unless it is known and
documented that the function will only be used with very small N. The
time and effort for documenting and checking this assumption may easily
turn out to be more time consuming than using the correct O(log N)
algorithm in the first place.
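
(A trivial illustration of what I mean, assuming the data can be kept
sorted:)

  #include <algorithm>
  #include <vector>

  // O(N): scans the whole vector in the worst case.
  bool contains_linear(const std::vector<int>& v, int x)
  {
      return std::find(v.begin(), v.end(), x) != v.end();
  }

  // O(log N): binary search; requires that v is kept sorted.
  bool contains_sorted(const std::vector<int>& v, int x)
  {
      return std::binary_search(v.begin(), v.end(), x);
  }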

For the OP's example, there are two ways to compare the substring in
C++, either with string::substr()== or with string::compare(). In my
book, as the former is 2-10x slower than the latter, only the latter is
a "correct" way to code this line IMO. When the compilers get smarter
and become able to optimize them to the same speed, one can reconsider.

That's the same as with passing by value or by reference. In C++ one
should prefer passing non-trivial objects by reference if possible, as
pass by value might often waste both speed and memory for no reason.
Yet, passing by reference involves writing some more keystrokes, so
according to some opinions which I have seen in this thread it should
not be done before a profiler shows there is a bottleneck.
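
(For example, both of these give the caller the same result, but the
first one copies the whole string on every call:)

  #include <algorithm>
  #include <string>

  std::size_t count_spaces_by_value(std::string s)        // copies the argument
  {
      return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' '));
  }

  std::size_t count_spaces(const std::string& s)          // no copy
  {
      return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' '));
  }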

Paavo Helde

May 7, 2021, 5:10:45 AM
07.05.2021 07:40 Öö Tiib kirjutas:
> On Tuesday, 4 May 2021 at 13:44:10 UTC+3, Juha Nieminen wrote:
>> "Premature optimization" does not mean using quicksort instead of
>> bubble sort.
>
> Usage of std::sort is not premature optimization. Usage of something
> else without clear reason is premature optimization or pessimization
> or just foolishness.

Correct. And similarly, using std::string::compare() for comparing
substrings is not a premature optimization either, because that's
exactly what it does and what it is meant for. Usage of something else
without clear reason is not justified.

Hans Bos

May 7, 2021, 10:20:16 AM
Op 1-5-2021 om 15:51 schreef Öö Tiib:
This quote is from his paper "Structured Programming with go to
Statements" (1974).

He also wrote in there:
The conventional wisdom shared by many of today's software engineers
calls for ignoring efficiency in the small; but I believe this is simply
an overreaction to the abuses they see being practiced by
penny-wise-and-pound-foolish programmers, who can't debug or maintain their
"optimized" programs. In established engineering disciplines a 12 %
improvement, easily obtained, is never considered marginal;


So I think he means: don't get carried away with unnecessary code
optimization, but don't use inefficient constructs if efficient
constructs can be "easily obtained".

Hans.

Öö Tiib

May 7, 2021, 12:41:07 PM
For me it is that basic_string::compare is a set of overloads; the longest has
perhaps 5 arguments that few remember, and it isn't guaranteed
to be more efficient by anything other than profiling. Also its spaceship-like
tri-state nature is a good bear trap for novices, if you want to be an asshole:

if (!a.compare(b)) {
    // get here when a and b are equal
    // but majority of novices read opposite
}

Paavo Helde

May 7, 2021, 3:33:14 PM
07.05.2021 19:40 Öö Tiib kirjutas:
> On Friday, 7 May 2021 at 12:10:45 UTC+3, Paavo Helde wrote:
>> 07.05.2021 07:40 Öö Tiib kirjutas:
>>> On Tuesday, 4 May 2021 at 13:44:10 UTC+3, Juha Nieminen wrote:
>>>> "Premature optimization" does not mean using quicksort instead of
>>>> bubble sort.
>>>
>>> Usage of std::sort is not premature optimization. Usage of something
>>> else without clear reason is premature optimization or pessimization
>>> or just foolishness.
>> Correct. And similarly, using std::string::compare() for comparing
>> substrings is not a premature optimization either, because that's
>> exactly what it does and what it is meant for. Usage of something else
>> without clear reason is not justified.
>
> For me it is as basic_string::compare is set of overloads, longest has
> perhaps 5 arguments that few do remember

You mean, you do not remember the arguments, because you are not using
that function. I'm using it all the time and have no problem with
arguments, they are pretty logical.

BTW, std::sort also has 4 overloads and up to 4 arguments plus up to 3
template arguments. Just saying...

> and it isn't guaranteed
> to be more efficient by anything else but by profiling.

So what's the reason for its existence? There are certain reasons why
something is included in the C++ standard, and avoiding creation of
totally unneeded temporaries might well be one of them.

> Also its spaceship
> nature is good bear trap for novices if you want to be asshole:
>
> if (!a.compare(b)) {
> // get here when a and b are equal
> // but majority of novices read opposite
> }

There is bad news for those novices, as they now have to cope with even
more ternary result values, with the brand new spaceship operator in C++20.
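
A minimal sketch (names mine) of what they will be reading:

#include <compare>
#include <string>

bool same(const std::string& a, const std::string& b)
{
    std::strong_ordering r = a <=> b;   // less, equal, or greater
    return r == 0;                      // true only in the "equal" case
}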


red floyd

unread,
May 7, 2021, 8:59:43 PM5/7/21
to
But then again, strcmp/strncmp has the exact same bear trap.
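
E.g. (a tiny sketch of my own):

#include <cstring>

bool same(const char* a, const char* b)
{
    return !std::strcmp(a, b);   // 0 means equal, so this is true when a and b match
}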

Öö Tiib

unread,
May 8, 2021, 6:27:57 AM5/8/21
to
On Friday, 7 May 2021 at 22:33:14 UTC+3, Paavo Helde wrote:
> 07.05.2021 19:40 Öö Tiib wrote:
> > On Friday, 7 May 2021 at 12:10:45 UTC+3, Paavo Helde wrote:
> >> 07.05.2021 07:40 Öö Tiib wrote:
> >>> On Tuesday, 4 May 2021 at 13:44:10 UTC+3, Juha Nieminen wrote:
> >>>> "Premature optimization" does not mean using quicksort instead of
> >>>> bubble sort.
> >>>
> >>> Usage of std::sort is not premature optimization. Usage of something
> >>> else without clear reason is premature optimization or pessimization
> >>> or just foolishness.
> >> Correct. And similarly, using std::string::compare() for comparing
> >> substrings is not a premature optimization either, because that's
> >> exactly what it does and what it is meant for. Usage of something else
> >> without clear reason is not justified.
> >
> > For me it is as basic_string::compare is set of overloads, longest has
> > perhaps 5 arguments that few do remember
> You mean, you do not remember the arguments, because you are not using
> that function. I'm using it all the time and have no problem with
> arguments, they are pretty logical.

No. In all performance-demanding processing I have, since constexpr, used
something else, somewhat resembling the current string_view. It is powerful,
as it can represent string literals better and do compile-time processing.
Unfortunately C++17 broke some of it, but not all. Where performance does
not matter there are std::strings, but then the efficiency of comparing
them doesn't matter either, just that operator== is better to read.
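
For a rough idea (plain C++17 std::string_view here, not that class of mine),
substring checks can be done without temporaries and even at compile time:

#include <string_view>

// No allocation, and usable in constant expressions.
constexpr bool starts_with_http(std::string_view s)
{
    return s.substr(0, 7) == "http://";
}

static_assert(starts_with_http("http://example.com"));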

> BTW, std::sort also has 4 overloads and up to 4 arguments plus up to 3
> template arguments. Just saying...

For all practical purposes there are 2. It actually did not surprise me that
the sabotaged C++17 added execution policies (which clearly should be
compile-time settings) as dynamic function parameters. Perhaps to ensure
that std::sort can't be made efficiently concurrent.

> > and it isn't guaranteed
> > to be more efficient by anything else but by profiling.
> So what's the reason for its existence? There are certain reasons why
> something is included in the C++ standard, and avoiding creation of
> totally unneeded temporaries might well be one of them.

Stop reading my sentences as if they were conveying bool values.
std::string is useful and its compare is useful. a==b compiles to the
same binary as !a.compare(b) right now, so I suggest the former but
won't consider the latter a defect. The reasons were all told ad nauseam
during the long-lasting shitstorm about std::string and every framework
having its own string class.
For me, compare is merely a brighter shade of gray when we talk about
efficiency and a darker shade when we talk about readability and
robustness. So I consider it a mild and unimportant performance
improvement.

> > Also its spaceship
> > nature is good bear trap for novices if you want to be asshole:
> >
> > if (!a.compare(b)) {
> > // get here when a and b are equal
> > // but majority of novices read opposite
> > }
> There are bad news for those novices as they now have to cope with even
> more ternary result values, with this brand new spaceship operator in C++20.

That indeed hurts and is bad news to me. It is already tricky to find people
who are willing and able to write C++; my recruiters search literally
globally. The damage done after C++17 just made it worse. My current rhetoric
to novices is that C++ has been made perverse specifically as a test: a person
who can't handle such mild complications isn't fit to be an engineer. But
there is a quite easy rapid-development language hidden inside that mess.

Öö Tiib

unread,
May 8, 2021, 6:42:52 AM5/8/21
to
I have always thought that C++ borrowed the design of its worst bear traps
from C because a) they work and b) Bjarne had other things to do than to
be inventive there. Just add some places with unexpected undefined
behavior, some others with unexpected defined behavior, plus some
surprising silent type insecurity like
false -> 0 -> nullptr -> (char*)nullptr; make sure that every feature
contains at least one, and that the version you least likely want is the
briefest to type, and people start to pay attention.

Juha Nieminen

unread,
May 8, 2021, 4:27:38 PM5/8/21
to
Öö Tiib <oot...@hot.ee> wrote:
> Also its spaceship
> nature is good bear trap for novices if you want to be asshole:
>
> if (!a.compare(b)) {
> // get here when a and b are equal
> // but majority of novices read opposite
> }

Good programming practices, and common sense in programming, would dictate
to avoid deliberate obfuscation and "fancy syntactic tricks" (when much
more readable and understandable 100% equivalent alternatives exist).

Many programmers, especially those who never grew out of their beginner
fascination phase, engage in fancy syntactical "tricks" just because they
find them fancy and cool, but a good programmer is distinguished by being
able to write clear, readable code that's still reasonably simple and
reasonably efficient.

Anyway, rather obviously there are benefits to the trinary comparison
result in many situations. Consider, for example, that std::lower_bound
does not tell you whether the searched element exists in the range or not,
it only tells you where the element should be inserted to keep the range
sorted. While there exists a std::binary_search, that only tells you whether
it exists or not, but not where to insert the element if you wanted.
Thus, if you wanted to check (using a binary search) whether the element
exists and insert it only if it doesn't, std::lower_bound would tell
you the point of insertion, but you would still need to make an additional
equality check to see if it already exists. Meaning that the last element compared
in the search would have to be compared twice. If the comparison is
relatively expensive (eg. the elements are large strings), this is a waste
of time which the trinary comparison operator solves: It directly tells you
if that last element that was compared was larger or equal to the searched
element, so you just need to check the integer value to know.

(Unfortunately std::lower_bound doesn't give you this value, so you would
need to perform the binary search yourself. But the trinary comparison
result will help you in this case to avoid the double comparison of the
same element at the end of the search.)
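
A minimal hand-rolled sketch of what I mean (the names are mine, nothing
standard), using std::string::compare() as the trinary comparison:

#include <cstddef>
#include <string>
#include <vector>

struct SearchResult {
    std::size_t pos;   // where the element is, or where it should be inserted
    bool found;        // whether an equal element is already there
};

// Binary search over a sorted vector that reuses the last trinary comparison
// result instead of doing a separate equality check at the end.
SearchResult find_insert_pos(const std::vector<std::string>& v,
                             const std::string& key)
{
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int c = v[mid].compare(key);   // one comparison, three possible outcomes
        if (c == 0)
            return { mid, true };      // found, no second comparison needed
        if (c < 0)
            lo = mid + 1;              // v[mid] < key
        else
            hi = mid;                  // v[mid] > key
    }
    return { lo, false };              // lo is the insertion point
}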

Öö Tiib

unread,
May 9, 2021, 12:56:16 PM5/9/21
to
On Saturday, 8 May 2021 at 23:27:38 UTC+3, Juha Nieminen wrote:
> ?? Tiib <oot...@hot.ee> wrote:
> > Also its spaceship
> > nature is good bear trap for novices if you want to be asshole:
> >
> > if (!a.compare(b)) {
> > // get here when a and b are equal
> > // but majority of novices read opposite
> > }
> Good programming practices, and common sense in programming, would dictate
> to avoid deliberate obfuscation and "fancy syntactic tricks" (when much
> more readable and understandable 100% equivalent alternatives exist).

Yes. Correct. My whole point was that it is often not worth even considering
how 100% equivalent it is.

> Many a programmer, especially those who never grew out of their beginner
> programmer fascination phase, may engage in fancy syntactical "tricks"
> just because they like them, find them fancy and cool, but a good programmer
> is distinguished by being able to write clear readable code that's still
> reasonably simple and reasonably efficient.

Indeed reasonably efficient.

> Anyway, rather obviously there are benefits to the trinary comparison
> result in many situations.

Yes, and in many situations, like with IEEE 754 types, we may benefit
from quaternary comparison results. But whether it brings any performance
benefit is unclear until shown by a profiler. The discussions about the
spaceship operator I've had so far have clearly suggested that it is most
likely less performant than binary operators.

> Consider, for example, that std::lower_bound
> does not tell you whether the searched element exists in the range or not,
> it only tells you where the element should be inserted to keep the range
> sorted. While there exists a std::binary_search, that only tells you whether
> it exists or not, but not where to insert the element if you wanted.
> Thus, if you wanted to check (using a binary search) whether the element
> exists and insert it only if it didn't exist, std::binary_search would tell
> you the point of insertion, but you still need to make an additional equality
> check to see if it already exists. Meaning that the last element compared
> in the search would have to be compared twice. If the comparison is
> relatively expensive (eg. the elements are large strings), this is a waste
> of time which the trinary comparison operator solves: It directly tells you
> if that last element that was compared was larger or equal to the searched
> element, so you just need to check the integer value to know.

It solves that only when the trinary comparison costs the same as a binary
comparison.

> (Unfortunately std::lower_bound doesn't give you this value, so you would
> need to perform the binary search yourself. But the trinary comparison
> result will help you in this case to avoid the double comparison of the
> same element at the end of the search.)

And whether we gain or lose with it, only a profiler can tell.

Juha Nieminen

unread,
May 10, 2021, 1:07:32 AM5/10/21
to
Öö Tiib <oot...@hot.ee> wrote:
> Yes and in many situations like with IEEE 754 types we may benefit
> from quaternary comparison results. But if it brings any performance
> benefits is unclear until shown by profiler.

And I still maintain that only an astronomically minuscule fraction of
projects out there, especially commercial projects, are going to go back
to an existing project, profile it, find out the bottlenecks and fix them.

The theory "only optimize after a profiler has shown that there's a
bottleneck" is utterly naive and flawed. In practice, it's in essence
completely equal to "never optimize".

My point is that if you optimize *now* you won't have to optimize later.
You'll get the benefits from the get-go, and the program will be more
optimized. The "optimize later" phase is not going to happen 99.9999% of
the time, especially when we are talking about commercial products.

(There are many other things for which the same is true. Such as cleaning
up and commenting the code. "I'll comment the code later" is in essence
the equivalent of "I'll never comment this code". If you don't comment the
code now, you are not going to comment it ever. Not in a commercial product,
most probably not even in a hobby project.)

Richard Damon

unread,
May 10, 2021, 7:29:25 AM5/10/21
to
The key to the adage about premature optimization is that if you spend
time adding 'tricks' to your code to make it faster, almost always at the
expense of code clarity, then this will end up slowing down your
development process, because you WILL be tracking down bugs that you
introduced by trying to be tricky to make code faster that never needed
to be faster.

This does NOT mean you shouldn't spend a bit of time choosing the right
algorithms, ones with a reasonable speed/complexity ratio for your
application, or that you should use bad techniques along the way.

You don't normally 'go back' to a released product to try to speed it
up. Rather, as you get along in the design, testing shows the system is
running slower than you want, so as PART of the development you see a
NEED for better speed; then you profile, find the bottlenecks, and
apply optimization there.

There is also the fact that probably at least half of the techniques
used to micro-optimize can be done better by a good compiler, and by
trying to do it yourself you often hurt the compiler's ability to do a
better job.

Note, as I said, this does NOT mean you should use bad algorithms. Say
you have a big table of data you need to find something in. Thinking for
a minute about whether a linear search to find it would be bad is NOT
'premature optimization', it is good design. At minimum you encapsulate
the decision of what technique will be used for the data lookup, so that
when you figure out the exact pattern of your data access, you can choose
what method to use.
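
Something like this (made-up names), where the lookup strategy is hidden
behind one function and can change later without touching any caller:

#include <algorithm>
#include <string>
#include <vector>

struct Record {
    std::string key;
    int value;
};

// The only place that knows how the lookup is done. Today it is a linear
// scan; if profiling later shows it matters, it can become a binary search
// over a sorted vector or a hash map without changing any caller.
const Record* find_record(const std::vector<Record>& table,
                          const std::string& key)
{
    auto it = std::find_if(table.begin(), table.end(),
                           [&](const Record& r) { return r.key == key; });
    return it != table.end() ? &*it : nullptr;
}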

Öö Tiib

unread,
May 10, 2021, 11:18:11 PM5/10/21
to
On Monday, 10 May 2021 at 08:07:32 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > Yes and in many situations like with IEEE 754 types we may benefit
> > from quaternary comparison results. But if it brings any performance
> > benefits is unclear until shown by profiler.
> And I still maintain that only an astronomically minuscule fraction of
> projects out there, especially commercial projects, are going to go back
> to an existing project, profile it, find out the bottlenecks and fix them.

My experience is the opposite. Perhaps it depends on the problem domain.
Only products that failed midway were never refactored later. For the
rest, work to fine-tune and improve speed and/or resource efficiency was
always planned and carried out as planned.