Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Anyone ever used vector<bool> ?

252 views
Skip to first unread message

Mut...@dastardlyhq.com

unread,
May 3, 2022, 11:16:42 AM5/3/22
to
I need to store ~70K images in memory as B&W bitmaps so any chance of reducing
memory usage is a bonus. I've never used vector<bool> because of all the
warnings that it doesn't play nice with other parts of the STL. Anyone know
any different?

Unfortunately I need something that be dynamically sized or I'd use std::bitset
but that has to be sized at compile time which is no use.

Ben

unread,
May 3, 2022, 11:39:59 AM5/3/22
to
Mut...@dastardlyhq.com writes:

> I need to store ~70K images in memory as B&W bitmaps so any chance of reducing
> memory usage is a bonus.

I'd wrap a std::vector<unsigned int> in a class so you can implement
whatever access you want to the bits cleanly.

--
Ben.

Mut...@dastardlyhq.com

unread,
May 3, 2022, 12:12:57 PM5/3/22
to
To be fair, if I was going to roll my own I'd just use a C u_char array
wrapped in a class but I can't be bothered to write all the boilerplate
construct,set,unset,copy etc code, hence I was hoping to use vector<bool>
directly if its not going to cause any nasty gotchas.

Mut...@dastardlyhq.com

unread,
May 3, 2022, 12:14:28 PM5/3/22
to
On 3 May 2022 15:43:45 GMT
r...@zedat.fu-berlin.de (Stefan Ram) wrote:
>Mut...@dastardlyhq.com writes:
>>I need to store ~70K images in memory as B&W bitmaps so any chance of reducing
>
>>memory usage is a bonus. I've never used vector<bool> because of all the
>>warnings that it doesn't play nice with other parts of the STL. Anyone know
>>any different?
>
> Don't hesitate to use ::std::vector< bool >! You might like
> to hide it behind an interface or a custom class; that way
> you can switch easily to another implementation if need be.
> (If you don't use an interface, you can also switch the
> implementation later, albeit maybe a bit more difficult.)
>
> Bjarne wrote in 2014, "std::vector<bool> - when we need
> more than 8*sizeof(long) bits". He did /not/ write "std::
> vector<bool> - Don't ever use it!". The core guidelines
> also do not seem to advise against it.
>
> "vector<bool> - neither a vector, nor does it contain bools."

Advising against it is not the same as promoting it :)

Jack Lemmon

unread,
May 3, 2022, 12:34:52 PM5/3/22
to
On 03/05/2022 16:16, Mut...@dastardlyhq.com wrote:
>
> Unfortunately I need something that be dynamically sized or I'd use std::bitset
> but that has to be sized at compile time which is no use.
>
How about using List? <https://www.cplusplus.com/reference/list/list/>
Not fast but dynamic in nature and vector modifiers and iterators can be
used.



Manfred

unread,
May 3, 2022, 1:13:52 PM5/3/22
to
Well, not.

The OP wants compact & fast. List is not it.

Bo Persson

unread,
May 3, 2022, 2:25:53 PM5/3/22
to
The only gotcha is that vector<bool> doesn't *fully* conform to the
standard. A container, and its iterators, should be able to return a
reference to the elements. But, of course, you cannot return a (real)
reference to a single bit.

If that doesn't look like a problem in your program, feel free to use it.

And I have never used it myself, but mainly because I have never had a
use for that many bools in a program.

Keith Thompson

unread,
May 3, 2022, 3:00:52 PM5/3/22
to
Bo Persson <b...@bo-persson.se> writes:
[...]
> The only gotcha is that vector<bool> doesn't *fully* conform to the
> standard. A container, and its iterators, should be able to return a
> reference to the elements. But, of course, you cannot return a (real)
> reference to a single bit.

Strictly speaking, it does fully conform to the standard, since the
standard defines it. It doesn't fully conform to the requirements for
vector<foo>.

(IMHO it might have been better to define a vector_bool type that's not
a specialization of vector.)

> If that doesn't look like a problem in your program, feel free to use it.
>
> And I have never used it myself, but mainly because I have never had a
> use for that many bools in a program.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Malcolm McLean

unread,
May 3, 2022, 5:41:41 PM5/3/22
to
I used one today. I have two lists of open paths, and I want to join paths which abut
each other. With special rules when you have two or more choices about which path
to join to which.
So I used a matrix of bools set to true for joinable ends.
There was no need to use a reference or an iterator. The point of a matrix is that the
elements are addressed by index. Since the number of paths is always going to be
quite small, packing into bits is overkill, but use of "bool" documents that the value
must be true / false.

Andrea Venturoli

unread,
May 4, 2022, 2:21:02 AM5/4/22
to

On 5/3/22 17:16, Mut...@dastardlyhq.com wrote:

Yes I did and it served me well.

Just don't expect it can play nicely wherever a "real" vector class
would, but if it's enough for your use case, go ahead.

Mut...@dastardlyhq.com

unread,
May 4, 2022, 5:14:17 AM5/4/22
to
Indeed. Aside from using a list to store individual bits seems silly, I also
need something that does direct indexing which list obviously doesn't do.

wij wij

unread,
May 4, 2022, 6:21:14 AM5/4/22
to
Nearly all in C++ standard library are not necessary for average programmers
(the first C++ years may partly need), they are small utilities (you learn real
'programming' skill or those deep engineering/artificial concepts/rules?).
Anyone chooses C++ largely eventually has to understand what is inside and
code the utility himself.

70k(bytes) is quite small of image data.
I would probably use VLInt (Variable Length Integer) class of my library,
because VLInt contains a member set_bit(size_t,bool) suitable for simple B&W
bitmap's basic r/w.
VLInt is 'dynamic-sized', memory usage is approximately MSB/8 (less than 70k/8)

--------- Usage example of VLInt for printing prime numbers
#include <Wy.stdio.h>
#include <CSCall/VLInt.h>

using namespace Wy;

typedef VLInt<unsigned int> VLType;

// [Syn] Print prime numbers less or equal to n
//
// Algo: sieve
//
void print_prime(size_t n)
{
Errno r;
VLType barr; // bit=0 =prime

for(size_t i=2; i<=n; ++i) {
if(barr.bit(i)) {
continue;
}
// Cond: i is a prime number
for(size_t j=i+i; j<=n; j+=i) {
if((r=barr.set_bit(j,true))!=Ok) {
WY_THROW(r);
}
}
}

// print prime number (from 2)
size_t pcnt=0;
for(size_t i=2; i<=n; ++i) {
if(barr.bit(i)==false) {
cout << wrd(i) << ' ';
++pcnt;
}
}
cout << WY_ENDL;
cout << "pi(" << n << ")= " << pcnt << WY_ENDL;
};

int main()
try {
print_prime(1000000);
cout << "Ok" WY_ENDL;
return 0;
}
catch(const Errno& e) {
cerr << wrd(e) << WY_ENDL;
return -1;
}
catch(const Errno& e) {
cerr << wrd(e) << WY_ENDL;
}
catch(const Errno& e) {
cerr << wrd(e) << WY_ENDL;
return -1;
}
catch(...) {
cerr << "unknown stack unwind" << WY_ENDL;
throw;
};

Juha Nieminen

unread,
May 4, 2022, 6:45:40 AM5/4/22
to
I have hard time thinking of a less efficient way to store a bitmap
(at least using standard library containers).

Juha Nieminen

unread,
May 4, 2022, 6:47:43 AM5/4/22
to
Mut...@dastardlyhq.com wrote:
> I need to store ~70K images in memory as B&W bitmaps so any chance of reducing
> memory usage is a bonus.

Using std::vector<bool> is fine.

If you can use Boost, you may consider using boost::dynamic_bitset.
It has many utilities that std::vector<bool> doesn't have (and is ostensibly
more efficient).

Juha Nieminen

unread,
May 4, 2022, 6:50:45 AM5/4/22
to
Bo Persson <b...@bo-persson.se> wrote:
> The only gotcha is that vector<bool> doesn't *fully* conform to the
> standard.

What "standard"?

The C++ standard defines how std::vector<bool> works, so it's pretty much
"standard" by definition.

What's special about it is that in a few aspects it doesn't work in the
same way as other standard library data containers. However, the others
don't all work exactly in the same as each other either.

Maybe std::vector<bool> should have a category of its own (like the
other containers are categorized by how they behave and what they provide).

Mut...@dastardlyhq.com

unread,
May 4, 2022, 6:52:41 AM5/4/22
to
On Wed, 4 May 2022 03:21:03 -0700 (PDT)
wij wij <wyni...@gmail.com> wrote:
>On Tuesday, 3 May 2022 at 23:16:42 UTC+8, Mut...@dastardlyhq.com wrote:
>> I need to store ~70K images in memory as B&W bitmaps so any chance of
>reducing
>> memory usage is a bonus. I've never used vector<bool> because of all the
>> warnings that it doesn't play nice with other parts of the STL. Anyone know
>> any different?
>>
>> Unfortunately I need something that be dynamically sized or I'd use
>std::bitset
>> but that has to be sized at compile time which is no use.
>
>Nearly all in C++ standard library are not necessary for average programmers
>(the first C++ years may partly need), they are small utilities (you learn
>real
>'programming' skill or those deep engineering/artificial concepts/rules?).
>Anyone chooses C++ largely eventually has to understand what is inside and
>code the utility himself.

I'm quite capable of writing a bitmap class but given I'm writing an image
recognition algo I have enough on my plate without having to write basic
data storage classes.

>70k(bytes) is quite small of image data.
>I would probably use VLInt (Variable Length Integer) class of my library,
>because VLInt contains a member set_bit(size_t,bool) suitable for simple B&W
>bitmap's basic r/w.

Very nice.

Mut...@dastardlyhq.com

unread,
May 4, 2022, 6:53:50 AM5/4/22
to
I rather gave up on boost. C++ since 2011 does pretty much does everything I
need but it perhaps not in this case.

Paavo Helde

unread,
May 4, 2022, 7:09:41 AM5/4/22
to
I have hard time finding any use case at all for std::list. I know there
is a use case for holding dynamically created global objects alive and
at the same memory address, but globals are evil anyway, regardless of
how they are held.

Frederick Virchanza Gotham

unread,
May 4, 2022, 5:39:50 PM5/4/22
to
On Wednesday, May 4, 2022 at 12:09:41 PM UTC+1, Paavo Helde wrote:

> > I have hard time thinking of a less efficient way to store a bitmap
> > (at least using standard library containers).
> I have hard time finding any use case at all for std::list. I know there
> is a use case for holding dynamically created global objects alive and
> at the same memory address, but globals are evil anyway, regardless of
> how they are held.


I'm currently writing a "pre-compiler" that processes a translation unit and give output that can go into a normal C++20 compiler.

In order to make it as fast as possible, I overloaded global new and delete so that memory is allocated in 10-megabyte blocks at a time, and nothing ever gets de-allocated. (It uses about 400 megs of RAM for an average source file). But anyway I use 'list' extensively to make sure every object stays where it is, instead of a vector de-allocating and moving stuff around.

red floyd

unread,
May 4, 2022, 6:18:01 PM5/4/22
to
On 5/4/2022 3:07 PM, Stefan Ram wrote:
> Frederick Virchanza Gotham <cauldwel...@gmail.com> writes:
>> I use 'list' extensively to make sure every object stays
>> where it is, instead of a vector de-allocating and moving
>> stuff around.
>
> When one uses a vector of pointers to objects, the objects
> are not moved when the size of the vector changes. Such a
> vector then can be used like a list, but tests have often
> shown the vector to be faster.
>

Items can, in fact, be moved. If you push_back() on a vector
whose size() is the same as its capacity(), the vector needs
to be reallocated, and the data may therefore move and be
copied/moved.


James Kuyper

unread,
May 5, 2022, 1:57:32 AM5/5/22
to
On 5/4/22 18:17, red floyd wrote:
> On 5/4/2022 3:07 PM, Stefan Ram wrote:
...
>> When one uses a vector of pointers to objects, the objects
>> are not moved when the size of the vector changes. Such a
>> vector then can be used like a list, but tests have often
>> shown the vector to be faster.
>>
>
> Items can, in fact, be moved. If you push_back() on a vector
> whose size() is the same as its capacity(), the vector needs
> to be reallocated, and the data may therefore move and be
> copied/moved.

True, the pointer objects stored in the vector may be moved. The objects
that they point at are not.

Paavo Helde

unread,
May 5, 2022, 2:13:07 AM5/5/22
to
05.05.2022 00:39 Frederick Virchanza Gotham kirjutas:
> On Wednesday, May 4, 2022 at 12:09:41 PM UTC+1, Paavo Helde wrote:
>
>>> I have hard time thinking of a less efficient way to store a bitmap
>>> (at least using standard library containers).
>> I have hard time finding any use case at all for std::list. I know there
>> is a use case for holding dynamically created global objects alive and
>> at the same memory address, but globals are evil anyway, regardless of
>> how they are held.
>
>
> I'm currently writing a "pre-compiler" that processes a translation unit and give output that can go into a normal C++20 compiler.
>
> In order to make it as fast as possible, I overloaded global new and delete so that memory is allocated in 10-megabyte blocks at a time,

This looks like what a normal memory allocator does anyway.

> and nothing ever gets de-allocated.

This looks like a typical pool allocator. No need to reinvent the wheel.

(It uses about 400 megs of RAM for an average source file). But anyway I
use 'list' extensively to make sure every object stays where it is,
instead of a vector de-allocating and moving stuff around.

If you never delete or insert any elements, then std::deque also
guarantees that the stuff is not moved around, and provides better
memory proximity guarantees than std::list.

Even if your allocator manages to place the elements next to each other,
they would be still interspersed with unneeded forward and backward
links inside a std::list.

If you are concerned about performance, you should be concerned about
memory compactness and proximity.

So still I do not see a use case for std::list here, std::deque looks
better in all senses. YMMV.




Juha Nieminen

unread,
May 5, 2022, 2:18:08 AM5/5/22
to
Paavo Helde <ees...@osa.pri.ee> wrote:
> I have hard time finding any use case at all for std::list.

When I was working at the university here I once had to implement
(a rather complicated and advanced) algorithm (which had something to
do with graph manipulation) which required a doubly-linked list. No other
data container would have done the job because the speed of the algorithm
relied on the ability to splice linked lists (and sections of them).

That being said, std::list couldn't be used for this, especially now
that std::list::splice() is not an O(1) operation (which would pretty
much defeat the purpose).

The one strength of std::list that no other container had (nor can have),
and they removed it...

Ben

unread,
May 5, 2022, 5:50:26 AM5/5/22
to
Juha Nieminen <nos...@thanks.invalid> writes:

> Paavo Helde <ees...@osa.pri.ee> wrote:
>> I have hard time finding any use case at all for std::list.
>
> When I was working at the university here I once had to implement
> (a rather complicated and advanced) algorithm (which had something to
> do with graph manipulation) which required a doubly-linked list. No other
> data container would have done the job because the speed of the algorithm
> relied on the ability to splice linked lists (and sections of them).
>
> That being said, std::list couldn't be used for this, especially now
> that std::list::splice() is not an O(1) operation (which would pretty
> much defeat the purpose).

It's O(1) in C++11, 14 and 20 (at least in the public drafts).

> The one strength of std::list that no other container had (nor can have),
> and they removed it...

--
Ben.

Öö Tiib

unread,
May 5, 2022, 7:29:50 AM5/5/22
to
On Thursday, 5 May 2022 at 12:50:26 UTC+3, Ben wrote:
> Juha Nieminen <nos...@thanks.invalid> writes:
>
> > Paavo Helde <ees...@osa.pri.ee> wrote:
> >> I have hard time finding any use case at all for std::list.
> >
> > When I was working at the university here I once had to implement
> > (a rather complicated and advanced) algorithm (which had something to
> > do with graph manipulation) which required a doubly-linked list. No other
> > data container would have done the job because the speed of the algorithm
> > relied on the ability to splice linked lists (and sections of them).
> >
> > That being said, std::list couldn't be used for this, especially now
> > that std::list::splice() is not an O(1) operation (which would pretty
> > much defeat the purpose).
>
> It's O(1) in C++11, 14 and 20 (at least in the public drafts).

It has 6 overloads. Most are O(1) but one case where we transfer iterator
range from other list to this list is linear in length of said range.

Ben

unread,
May 5, 2022, 8:00:19 AM5/5/22
to
Obviously, but that's surely not what the OP was talking about. That
one could never have been, and will never be O(1).

--
Ben.

Alf P. Steinbach

unread,
May 5, 2022, 3:38:13 PM5/5/22
to
On 5 May 2022 13:56, Ben wrote:
With the choice of potentially O(n) `.size()`, instead of the silly
C++11 decision, all splicing overloads could be O(1) and `std::list`
could have been useful for something.

Since the word "now" was used, that was surely what was being mentioned.

Obviously. ;-)


- Alf

Ben

unread,
May 5, 2022, 4:13:52 PM5/5/22
to
"Alf P. Steinbach" <alf.p.s...@gmail.com> writes:

> On 5 May 2022 13:56, Ben wrote:
How much of your post does the smiley apply to? I don't want to waste
everyone's time...

--
Ben.

Juha Nieminen

unread,
May 6, 2022, 2:35:53 AM5/6/22
to
Ben <ben.u...@bsb.me.uk> wrote:
>>> > That being said, std::list couldn't be used for this, especially now
>>> > that std::list::splice() is not an O(1) operation (which would pretty
>>> > much defeat the purpose).
>>>
>>> It's O(1) in C++11, 14 and 20 (at least in the public drafts).
>>
>> It has 6 overloads. Most are O(1) but one case where we transfer iterator
>> range from other list to this list is linear in length of said range.
>
> Obviously, but that's surely not what the OP was talking about. That
> one could never have been, and will never be O(1).

That's exactly the operation I was talking about: Splicing a segment of
a list into another list. Which is (quite trivially) an O(1) operation
when you have the necessary iterators (which in the algorithm I mentioned
you have, at that point).

It used to be O(1) in C++98 std::list, but now they made it an O(n)
operation, destroying one of the few (if not the only) strengths of
std::list over any other data container.

Alf P. Steinbach

unread,
May 6, 2022, 4:40:09 AM5/6/22
to
On 5 May 2022 22:13, Ben wrote:
> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>
>> On 5 May 2022 13:56, Ben wrote:
I guess you're unfamiliar with the history. The fight was about the
guaranteed complexity of `.size()`. With O(1) complexity a splice has to
count the number of spliced nice, which turns a simple O(1) link
manipulation into O(n) counting.

The academic idealists won, the practically oriented people lost, and
`std::list` became useless.

As I see it that was the turning point in the C++ standard's evolution.


- Alf

Ben

unread,
May 6, 2022, 6:29:21 AM5/6/22
to
Juha Nieminen <nos...@thanks.invalid> writes:

> Ben <ben.u...@bsb.me.uk> wrote:
>>>> > That being said, std::list couldn't be used for this, especially now
>>>> > that std::list::splice() is not an O(1) operation (which would pretty
>>>> > much defeat the purpose).
>>>>
>>>> It's O(1) in C++11, 14 and 20 (at least in the public drafts).
>>>
>>> It has 6 overloads. Most are O(1) but one case where we transfer iterator
>>> range from other list to this list is linear in length of said range.
>>
>> Obviously, but that's surely not what the OP was talking about. That
>> one could never have been, and will never be O(1).
>
> That's exactly the operation I was talking about: Splicing a segment of
> a list into another list. Which is (quite trivially) an O(1) operation
> when you have the necessary iterators (which in the algorithm I mentioned
> you have, at that point).

It's not /trivially/ constant time since, as has been pointed out, the
spliced sub-list list needs to be scanned for length. I don't see how
it could be constant time unless std::list::size's constraint were to be
relaxed.

> It used to be O(1) in C++98 std::list,

Not in the draft I have. And it's been the same in C++ 03, 11, 14 and
20.

> but now they made it an O(n)
> operation, destroying one of the few (if not the only) strengths of
> std::list over any other data container.

I don't think they changed it. And I don't see how it could be anything
but linear if the length has to be maintained.

I agree that there is a trade-off here, and that C++'s choice did not
suit your algorithm, but they didn't change anything, and the common
cases of splicing into a list always have been O(1).

--
Ben.

Ben

unread,
May 6, 2022, 6:54:27 AM5/6/22
to
"Alf P. Steinbach" <alf.p.s...@gmail.com> writes:

> On 5 May 2022 22:13, Ben wrote:
>> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>>
>>> On 5 May 2022 13:56, Ben wrote:
I'm familiar with the trade-off in abstract (since it crops up every
time one implements a list as was so common in days gone by), but I did
not know there had been a fight about it in C++.

> The academic idealists won, the practically oriented people lost, and
> `std::list` became useless.

Curious that you put it like that. There doesn't seem to be anything
obviously idealistic in opting for O(1) size and linear splicing for
"foreign" sub-lists. I don't recall ever wanting that form of splice,
so I would consider C++'s choice quite practical (for me). What makes
this the "idealists" preferred option?

The engineer's perspective might be to have two list variants, one with
a maintained length and one without. Did that option crop up in the
fight?

--
Ben.

Alf P. Steinbach

unread,
May 6, 2022, 8:30:20 AM5/6/22
to
On 6 May 2022 12:54, Ben wrote:
> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>
>> On 5 May 2022 22:13, Ben wrote:
>>> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>>>
>>>> On 5 May 2022 13:56, Ben wrote:
E.g. as Howard Hinnant put it some time before C++11, in an article at
<url: https://howardhinnant.github.io/On_list_size.html>:

❝As the standard is written today, none of the containers are required
to have an O(1) size(). I believe that this should be corrected in
C++0X. [Expressing an academic ideal:] If a container has a size()
member, it should be required to have O(1) complexity. [Again expressing
an academic ideal:] If a container can not manage this, it should not
have a size() member at all. Clients can always compute
distance(begin(),end()).❞

The inline square brackets notes are my annotations.

In the article Howard makes a number of to me dubious claims and
presents a misleading feature sheet, where the reader is led to compare
number of alleged advantages for O(1) and O(n) `.size()`. I'd say that
kind of misleading argument indicates that in this matter he represented
an academic view. But possibly that reflects my personal impression of
academics (I've held an academic position but I didn't really fit in).

Likelihood argument of academic POV: Howard was the designer of
`std::unique_ptr`, and in a slightly heated discussion between us over
on SO, where I see now that I presented a needlessly complicated example
(not proud of that, but at least it was an example), he turned out to be
unaware that `unique_ptr` could exhibit undefined behavior due to
implicit conversion of `unique_ptr<Derived>` to `unique_ptr<Base>`, i.e.
that it's not as type safe in this respect as `std::shared_ptr`.

I guess the most upsetting thing about that was not his ignorance of the
issue, but that most anybody would /assume/ that the following would
either be safe, or else would not compile at all, while reality is that
it compiles and exhibits manifest UB with both MSVC 2022 and g++ 9.2:


#include <memory>
#include <utility>
using std::unique_ptr, std::make_unique,
std::move;

struct Base{ int m_answer = 42; };
struct Derived: Base{ virtual ~Derived(){}; };

auto main() -> int
{
auto p_derived = unique_ptr<Derived>( new Derived() );

#if PLEASE_FAIL
auto p_base = unique_ptr<Base>( move( p_derived ) );
(void) p_base;
#endif
(void) p_derived;
}


`std::shared_ptr` is safe for code like above because it retains the
original object pointer.


> The engineer's perspective might be to have two list variants, one with
> a maintained length and one without. Did that option crop up in the
> fight?

Sorry I don't remember, and I was not involved other than e.g. in
discussions here in clc++ and clc++m (I've never been involved, really).
But I can imagine a lot of ways to let client code decide instead of
forcing a decision. Two list variants, as you mention, is one way.


Cheers,

- Alf

Scott Lurndal

unread,
May 6, 2022, 9:55:46 AM5/6/22
to
The practical engineer would simply code a linked list and be done with it.

Ben

unread,
May 6, 2022, 12:06:02 PM5/6/22
to
Seems odd. The paper appears to be motivated entirely by practical
concerns and Howard Hinnart is not an academic. He does not even have a
CS degree (that's not criticism, just a point against your idea that it
espouses an academic idealist's view).

But the main point is that he is proposing the exact opposite of what
you cited as the academic idealist position that won the day. Have I
misinterpreted something somewhere?

--
Ben.

Andrey Tarasevich

unread,
May 6, 2022, 1:23:02 PM5/6/22
to
On 5/6/2022 5:29 AM, Alf P. Steinbach wrote:
> On 6 May 2022 12:54, Ben wrote:
>>
>>> The academic idealists won, the practically oriented people lost, and
>>> `std::list` became useless.
>>
>> Curious that you put it like that.  There doesn't seem to be anything
>> obviously idealistic in opting for O(1) size and linear splicing for
>> "foreign" sub-lists.  I don't recall ever wanting that form of splice,
>> so I would consider C++'s choice quite practical (for me).  What makes
>> this the "idealists" preferred option?
>
> E.g. as Howard Hinnant put it some time before C++11, in an article at
> <url: https://howardhinnant.github.io/On_list_size.html>:
>
> ❝As the standard is written today, none of the containers are required
> to have an O(1) size(). I believe that this should be corrected in
> C++0X. [Expressing an academic ideal:] If a container has a size()
> member, it should be required to have O(1) complexity. [Again expressing
> an academic ideal:] If a container can not manage this, it should not
> have a size() member at all. Clients can always compute
> distance(begin(),end()).❞
>
> The inline square brackets notes are my annotations.


The quoted text actually makes an extremely good and logical point. This
is exactly why we have such "red flag" functions as `std::distance` and
`std::advance` in standard library: as deliberately conspicuous
indicators of potentially inefficient temporary/stub/niche code.

If the objective were to keep `splice()` efficient, then removing
inefficient `size()` from `std::list<>` entirely and forcing the user to
use `std::distance` would have been an ideal solution. Beautiful idea!

But alas, the committee decided that having `std::list<>` to conform to
common interface is of greater value... What it really is is conformance
to legacy code.

--
Best regards,
Andrey

Alf P. Steinbach

unread,
May 6, 2022, 4:19:12 PM5/6/22
to
No no no. A `std::list` with O(1) other list splice can in most cases of
ordinary usage know its size, and it can keep track of whether it knows
or this time needs to count. This is a near cost free optimization
(relative to overhead of e.g. creating a node, with internal dynamic
allocation) with great dividends, that I believe client code can't do.

I.e. requiring client code to always get O(n) size checking, instead of
mainly O(1) and O(n) just in rare cases that may not even appear in
one's code, is IMO not a beautiful idea.

If that pessimization of mostly O(1) to always O(n) size checking is a
kind of beauty, then it's an entirely academic, impractical, almost
sabotage-like sense of beauty -- as I see it.


> But alas, the committee decided that having `std::list<>` to conform to
> common interface is of greater value... What it really is is conformance
> to legacy code.

Yeah I think you're right.

Cheers,

- Alf

Malcolm McLean

unread,
May 6, 2022, 4:51:36 PM5/6/22
to
Often when we splice a list, we will know the size of the portion to be spliced,
because we have previously traversed it. You could provide an O(1) splice
that takes the size as a parameter.
But if you go down that route, the library becomes steadily more complicated,
difficult to learn, and bug prone (because you can't enforce that the size
passed is the actual size of the splice, leading to bugs in later processing).

Mut...@dastardlyhq.com

unread,
May 7, 2022, 4:41:54 AM5/7/22
to
On Fri, 6 May 2022 22:18:52 +0200
"Alf P. Steinbach" <alf.p.s...@gmail.com> wrote:
>On 6 May 2022 19:22, Andrey Tarasevich wrote:
>> If the objective were to keep `splice()` efficient, then removing
>> inefficient `size()` from `std::list<>` entirely and forcing the user to
>> use `std::distance` would have been an ideal solution. Beautiful idea!
>
>No no no. A `std::list` with O(1) other list splice can in most cases of
>ordinary usage know its size, and it can keep track of whether it knows

Why wouldn't it always know its size? Or are you suggesting it can occasionally
insert and delete without updating its element count?


Alf P. Steinbach

unread,
May 7, 2022, 6:34:06 AM5/7/22
to
With O(1) splice it doesn't know how many nodes are inserted unless it's
told, and two of the splice overload don't tell it.

These are the two overloads numbered (3) in the cppreference list at
<url: https://en.cppreference.com/w/cpp/container/list/splice>.

One practical design for that case is, as I see it, to just mark the
count as unknown, e.g. set an `optional<Size>` to empty. Then when
`.size()` is called, do an O(n) count and cache that. Next call of
`.size()` then has efficient O(1), and this is simple to relate to for
programmers, but not so for academic who wants a clear single O(what?).

One way to instead force inefficiency on the client code is ¹the C++11
design where `.size()` is required to be O(1), so that those two splice
overloads have to count the nodes, every time. This is inefficient but
affords a simple description in the standard. O(1) here, O(n) there.

And since efficient O(1) splice was about the only advantage of
`std::list` over other containers such as `std::vector`, that silly C11
decision removed the ~only advantage and made `std::list` dead meat,
much like `std::valarray` which was a not ungood idea once.


Cheers,

- Alf

Notes:
¹ In C++98 and C++03 the standard expressed an intent of O(1) `.size()`
via a note, "Note A" in the table of general container requirements, but
notes in ISO standard are non-normative text.

Mut...@dastardlyhq.com

unread,
May 7, 2022, 12:13:12 PM5/7/22
to
On Sat, 7 May 2022 12:33:41 +0200
"Alf P. Steinbach" <alf.p.s...@gmail.com> wrote:
>On 7 May 2022 10:41, Mut...@dastardlyhq.com wrote:
>> On Fri, 6 May 2022 22:18:52 +0200
>> "Alf P. Steinbach" <alf.p.s...@gmail.com> wrote:
>>> On 6 May 2022 19:22, Andrey Tarasevich wrote:
>>>> If the objective were to keep `splice()` efficient, then removing
>>>> inefficient `size()` from `std::list<>` entirely and forcing the user to
>>>> use `std::distance` would have been an ideal solution. Beautiful idea!
>>>
>>> No no no. A `std::list` with O(1) other list splice can in most cases of
>>> ordinary usage know its size, and it can keep track of whether it knows
>>
>> Why wouldn't it always know its size? Or are you suggesting it can
>occasionally
>> insert and delete without updating its element count?
>
>With O(1) splice it doesn't know how many nodes are inserted unless it's
>told, and two of the splice overload don't tell it.

splice() is a list method, not a standalone function. Its given the list to
insert into itself. It can just take its size and add it to its own size.
I don't see the problem.


Andrey Tarasevich

unread,
May 7, 2022, 12:37:31 PM5/7/22
to
`splice` has multiple overloaded versions. And the version being
discussed here is the one that really matters: `splice` that takes a
pair of iterators into another list.

So, no, it is not given a list, it is given a range of iterators. It
doesn't know how many list elements are linked inside that range. It
doesn't know how much to "add to its own size".

That's what's being discussed here.

--
Best regards,
Andrey

Alf P. Steinbach

unread,
May 7, 2022, 12:38:22 PM5/7/22
to
Dear, you're trolling.


Andrey Tarasevich

unread,
May 7, 2022, 12:46:00 PM5/7/22
to
Lazy evaluation of `size()` is a viable technique that nevertheless has
its own set of issues. It is an operation that is "logically const" but
not "physically const", since it has to updated its "size is known"
state. In a single-threaded context it is viable, but cue in
multithreading considerations and it becomes problematic.

--
Best regards,
Andrey

Alf P. Steinbach

unread,
May 7, 2022, 1:13:08 PM5/7/22
to
Right, a reasonable C++11 implementation is a `mutable optional<Size>`,
where `Size` is `ptrdiff_t`.


> In a single-threaded context it is viable, but cue in
> multithreading considerations and it becomes problematic.

That as I see it erroneous conclusion appears to build on three invalid
assumptions:

Assumption 1. Any `const` object of a standard library type is or should
be safe for multithreading.
Assumption 2. It's impossible or impractical to let client code know if
object is in mt-safe stable state.
Assumption 3: It's impossible or impractical for client code to force an
mt-safe stable state.

In particular, a single call to `.size()` would suffice to bring the
object to stable state so it could be shared for reading. Client code
responsibility to do MT right, not `std::list` responsibility.

I failed to mention similar technique earlier for forcing a counting of
the transferred nodes in a splice. With lazy size count one can just
transfer them to a temporary empty list first; transferring all from
that intermediate list then effects a count. That trick could be made
unnecessary in the rare cases where needed, by adding an option to the
splice function to count -- which is about empowering client code,
giving client code the choice, instead of forcing inefficiency on it.


Cheers,

- Alf

James Kuyper

unread,
May 7, 2022, 6:28:50 PM5/7/22
to
On 5/7/22 12:12, Mut...@dastardlyhq.com wrote:
...
> splice() is a list method, not a standalone function. Its given the
> list to insert into itself. It can just take its size and add it to
> its own size.
> I don't see the problem.

"void splice(const_iterator position, list & x, const_iterator first,
const_iterator last);
void splice(const_iterator position, list && x, const_iterator first,
const_iterator last);

11 Preconditions: [first, last) is a valid range in x. position is not
an iterator in the range [first, last).
12 Effects: Inserts elements in the range [first, last) before position
and removes the elements from x. Pointers and references to the moved
elements of x now refer to those same elements but as members of *this.
Iterators referring to the moved elements will continue to refer to
their elements, but they now behave as iterators into *this, not into x.
13 Throws: Nothing.
14 Complexity: Constant time if addressof(x) == this; otherwise, linear
time."

Would you explain how to implement these particular overloads of
std::list::splice() as constant-time operations in the case where
addressof(x) != this, if [first,last) identifies an arbitrary sub-range
of x?

Andrey Tarasevich

unread,
May 7, 2022, 11:36:55 PM5/7/22
to
Nope. I did not make such a broad assumption as the one you presented
under #1. I simply implied that `size()` should not belong to the family
of container methods that can trigger a data race.

--
Best regards,
Andrey

Alf P. Steinbach

unread,
May 8, 2022, 2:26:58 AM5/8/22
to
Well, as I wrote (but snipped here) /a single call to `.size()`/ brings
the object to a stable state where it can be shared for reading.

Plus that correct multi-threading is the responsibility of client code,
not of each class that the client code uses.

The "should" therefore sounds entirely idealistic.

One gains a nice and tidy view of something that's only academically
relevant, in the usual meaning of the phrase "that's academic", at the
cost of inefficiency that to boot removes the ~only advantage of the class.

That's why I called that kind of idealistic view "academic".

But considering that COW strings have roughly the same little MT-safety
gotcha in any access of pointer or reference to (even `const`) char
item, perhaps the standard should have introduced a general convention
of an idempotent ".ensure_safe_for_sharing()" method and an
`.is_safe_for_sharing()` checker for asserts. For `std::list` the former
would just call `.size()`, paying the outstanding counting debt if any.
For a putative COW string type (C++11 removed the C++03 COW support for
`std::string`, but perhaps an additional string type, not necessarily
`std::string`) it would ensure single ownership of the buffer e.g. by
calling `operator[]`, paying the outstanding copying debt if necessary.

Cheers,

- Alf

Manfred

unread,
May 8, 2022, 12:37:22 PM5/8/22
to
This has already been answered in thread, but I'll recall it here next
to your point:
As long as you need a O(1) .size() method, you can't.
If you allow for a O(n) .size() method (or, probably even better, you
remove the .size() method entirely), then these overloads of splice, in
addition to the other ones, become constant-time operations.

In other words, the requirements you list here do not make for a
necessary O(n) .splice() overload by themselves, but combined with the
O(1) requirement for .size(), they do.

After following this thread, I have to say that I share the sentiment
that .size() should just not be a method of std::list.

A doubly-linked list is inherently a series of elements that has no
locality, i.e. it may be seen as a sparse set (or a scattered set).
While it may make sense to know the /count/ of its elements (which I
wouldn't consider a key feature of std::list, however), the concept of
size (i.e how "big" it is) of a sparse set looks somewhat forced.
It's no surprise that it ends up defeating the main purpose of
std::list, which is O(1) insertions and deletions.

Malcolm McLean

unread,
May 8, 2022, 2:51:01 PM5/8/22
to
In C you use linked lists mainly because resizing arrays is a nuisance.
You have to maintain the size separately, then the reallocation can fail
and that has to be handled, and all the internal pointers become invalid.

To create a linked list, however, you just add a "next" field to a structure.

In C++, this is handled automatically by an std::vector. And setting up an
std::list is as much overhead as setting up an std::vector .

James Kuyper

unread,
May 8, 2022, 6:18:33 PM5/8/22
to
If I'm bothering to cite the words of the standard, you can safely
assume that I'm talking about the case where all of the other
requirements of the standard are also met.

While O(n) size() has been mentioned elsewhere in this thread, as far as
I could tell Muttley's comment was not based upon that idea, but on a
misconception that the only splice() overload that could be used with a
'foreign' list is one that splices the entire other list into this one.

Juha Nieminen

unread,
May 9, 2022, 12:58:26 AM5/9/22
to
Ben <ben.u...@bsb.me.uk> wrote:
>> That's exactly the operation I was talking about: Splicing a segment of
>> a list into another list. Which is (quite trivially) an O(1) operation
>> when you have the necessary iterators (which in the algorithm I mentioned
>> you have, at that point).
>
> It's not /trivially/ constant time since, as has been pointed out, the
> spliced sub-list list needs to be scanned for length. I don't see how
> it could be constant time unless std::list::size's constraint were to be
> relaxed.

You don't need to know the length of the lists, or the list segments,
for splicing (nor does the algorithm which I mentioned).

std::list::size() was an O(n) operation in C++98, which allowed all
splicing operations to be O(1).

Then they went and destroyed the most useful feature unique to doubly
linked lists for no reason.

>> It used to be O(1) in C++98 std::list,
>
> Not in the draft I have. And it's been the same in C++ 03, 11, 14 and
> 20.

No, I'm pretty certain that all splicing operations were O(1), and
std::list::size() was O(n) in C++98. They demanded an O(1) std::list::size()
only in later standards.

Why are we even arguing about this? Splicing (any variant) can *trivially*
be done in constant-time for doubly-linked lists (when you have the
iterators pointing to the desired positions). This is an actually useful
feature of doubly-linked lists in some algorithms. Nobody is talking about
the speed of std::list::size(). Why are you arguing this?

>> but now they made it an O(n)
>> operation, destroying one of the few (if not the only) strengths of
>> std::list over any other data container.
>
> I don't think they changed it. And I don't see how it could be anything
> but linear if the length has to be maintained.

Yes, they did thcange it. And you don't need to know the length of a
linked list (in constant time) in order to handle one.

> I agree that there is a trade-off here, and that C++'s choice did not
> suit your algorithm, but they didn't change anything

They absolutely did. The requirement for an O(1) std::list::size() was
not in C++98.

Andrey Tarasevich

unread,
May 9, 2022, 1:30:35 AM5/9/22
to
On 5/8/2022 9:58 PM, Juha Nieminen wrote:
> Ben <ben.u...@bsb.me.uk> wrote:
>>> That's exactly the operation I was talking about: Splicing a segment of
>>> a list into another list. Which is (quite trivially) an O(1) operation
>>> when you have the necessary iterators (which in the algorithm I mentioned
>>> you have, at that point).
>>
>> It's not /trivially/ constant time since, as has been pointed out, the
>> spliced sub-list list needs to be scanned for length. I don't see how
>> it could be constant time unless std::list::size's constraint were to be
>> relaxed.
>
> You don't need to know the length of the lists, or the list segments,
> for splicing (nor does the algorithm which I mentioned).
>
> std::list::size() was an O(n) operation in C++98, which allowed all
> splicing operations to be O(1).

No. Neither.

In C++98 it was _suggested_ that `std::list<>::size()` should be an O(1)
operation, but not required ("should" , not "shall" in "Note A" in
23.1/5). The choice was ultimately left to the implementation. The
implementation was free to choose between O(1) `size()` with O(n)
`splice()` or the other way around.

C++11 explicitly demanded O(1) `size()`.

> Then they went and destroyed the most useful feature unique to doubly
> linked lists for no reason.
>
>>> It used to be O(1) in C++98 std::list,
>>
>> Not in the draft I have. And it's been the same in C++ 03, 11, 14 and
>> 20.
>
> No, I'm pretty certain that all splicing operations were O(1), and
> std::list::size() was O(n) in C++98. They demanded an O(1) std::list::size()
> only in later standards.

See above.

It was GCC's implementation of C++ standard library that not only
insisted on O(1) `std::list<>::splice()` in C++98, but actively refused
to adopt the C++11 requirements after its release due to "problems with
binary compatibility".

BTW, what is the current state of affairs in GCC?

--
Best regards,
Andrey

Mut...@dastardlyhq.com

unread,
May 9, 2022, 5:03:10 AM5/9/22
to
Last time I looked you could subtract one iterator from another and get
a difference. I presume that applies to std::list though I almost never use
this container.

Bo Persson

unread,
May 9, 2022, 6:14:58 AM5/9/22
to
No, it does not work for std::list, as each node is allocated
separately. For splice to work the elements are not required to be
stored sequentially, like in a std::vector.

Do you begin to see the problem now?

Andrey Tarasevich

unread,
May 9, 2022, 11:33:55 AM5/9/22
to
Well, the very crux of the issue is that you can't do that [efficiently]
with `std::list<>`s iterators. Subtraction is available for
random-access iterators only. And `std::list<>`s iterators are
bidirectional, not random-access.

Even if subtraction were formally available for `std::list<>`s
iterators, it'd still be a `std::distance` kind of operation. Not useful
within the context of the issue in question.

--
Best regards,
Andrey

Mut...@dastardlyhq.com

unread,
May 9, 2022, 11:37:03 AM5/9/22
to
Not really because there's obviously some way of counting them even if its
inefficient. Perhaps iterate through them. Clues in the name.

James Kuyper

unread,
May 9, 2022, 11:56:53 AM5/9/22
to
Yes, std::distance<> does precisely that for iterators which are not
random-access iterators (23.4.2p5). For such iterators, it is an O(n)
algorithm.


James Kuyper

unread,
May 9, 2022, 12:02:04 PM5/9/22
to
On 5/9/22 05:02, Mut...@dastardlyhq.com wrote:
...
> Last time I looked you could subtract one iterator from another and get
> a difference. ...

That's true only for random-access iterators (23.3.5.6p1). The only
standard container which is explicitly required to support random-access
iterators is std::deque<>.
A contiguous container is required to have a random access iterator
(22.2.1p13).
std::array<> is required to be a contiguous container. (22.3.7.1p1).
std::vector<> used to be required to support a random access iterator,
but that appears to have been dropped in C++ 2014.
std::list<> has never been required to support random access iterators.

Manfred

unread,
May 9, 2022, 12:13:44 PM5/9/22
to
On 5/9/2022 5:56 PM, James Kuyper wrote:
> On 5/9/22 05:02, Mut...@dastardlyhq.com wrote:
> ...
>> Last time I looked you could subtract one iterator from another and get
>> a difference. ...
>
> That's true only for random-access iterators (23.3.5.6p1). The only
> standard container which is explicitly required to support random-access
> iterators is std::deque<>.
> A contiguous container is required to have a random access iterator
> (22.2.1p13).
> std::array<> is required to be a contiguous container. (22.3.7.1p1).
> std::vector<> used to be required to support a random access iterator,
> but that appears to have been dropped in C++ 2014.

22.3.11.1p2:
"A vector meets all of the requirements of a container and ..., and, for
an element type other than bool, of a contiguous container (22.2.1)."

That makes for the requirement of random access, except for
vector<bool>, quite apropos for this thread..

Mut...@dastardlyhq.com

unread,
May 9, 2022, 12:18:27 PM5/9/22
to
So why are certain people claiming a std::list won't always know the size
of whats being inserted into it?

Andrey Tarasevich

unread,
May 9, 2022, 12:38:28 PM5/9/22
to
I think it's been explained rather clearly already: spending O(n) to
"know" it is prohibitively expensive.

That's, again, the crux of the issue: if we don't know "the size of
whats being inserted", we don't want to spend O(n) to calculate that size.

--
Best regards,
Andrey

Scott Lurndal

unread,
May 9, 2022, 1:20:39 PM5/9/22
to
When I create linked lists (in C or C++) and I need to keep a
count of the number of elements in the list, I keep a size_t
in the listhead that is incremented by the insert/remove functions.

I'm not sure why std::list (or perhaps a derived std::counted_list)
couldn't do similar.

Malcolm McLean

unread,
May 9, 2022, 3:05:42 PM5/9/22
to
Of course it can. In fact, since size() is constrained to be O(1) it must do.
However if you insert a long section of another list, you pass the start and
one past the end pointers. Because it is a linked list, there's no way of knowing
how many elements are between them, other than by starting at the start
and iterating through the list until you reach the one past the end.

So you can had O(1) size, or O(1) splicing, but not both.

Bo Persson

unread,
May 9, 2022, 3:58:39 PM5/9/22
to
But counting them makes the operation O(n), which is not what we want.

std::list has splice with (first, last)-patameters to move nodes from
one list to another. Very fast if you only have to adjust a few pointers.

Very slow if you then have to traverse the (first, last) sequence just
to see how long it is. Could be *very* long.

And most of the advantage of splice(first, last) over insert(first,
last) is suddenly gone.


wij

unread,
May 9, 2022, 4:57:05 PM5/9/22
to
Because many questions in the thread looked odd to me, my guess is that
they may be from people did not have actually written a List class.
Is this the goal of C++ standard library? I think yes. But C++ standard library
cannot be responsible for user's lack of experience of implementation.
There are more from hiding the level stuff to the user. What is the point, then?
C++ standard's explanation cannot be from 'implement' and expect user
not to understand the 'implement'.

James Kuyper

unread,
May 10, 2022, 10:14:35 AM5/10/22
to
On 5/9/22 12:13, Manfred wrote:
> On 5/9/2022 5:56 PM, James Kuyper wrote:
>> On 5/9/22 05:02, Mut...@dastardlyhq.com wrote:
>> ...
>>> Last time I looked you could subtract one iterator from another and get
>>> a difference. ...
>>
>> That's true only for random-access iterators (23.3.5.6p1). The only
>> standard container which is explicitly required to support random-access
>> iterators is std::deque<>.
>> A contiguous container is required to have a random access iterator
>> (22.2.1p13).
>> std::array<> is required to be a contiguous container. (22.3.7.1p1).
>> std::vector<> used to be required to support a random access iterator,
>> but that appears to have been dropped in C++ 2014.
>
> 22.3.11.1p2:
> "A vector meets all of the requirements of a container and ..., and, for
> an element type other than bool, of a contiguous container (22.2.1)."

Annoying. The Evince Document Viewer that I'm using does not find a
match to "contiguous container" when there's a line break separating the
two words. I thought the exclusion of std::vector seemed odd.

James Kuyper

unread,
May 10, 2022, 10:16:41 AM5/10/22
to
Nobody has claimed that it can't know. They are saying that it can't
find out without taking O(n) operations, somewhere. There's three main
options being discussed:

Current standard:
l.splice(position, x, first, last) takes O(N) time because it traverses
the portion of the list spliced in order to update both this->size and
x.size().
l.size() is O(1)

Alternative 1:
l.size() is not supported, since it cannot always be O(1). If you need
the equivalent, do std::distance(l.begin(), l.end()), which is O(N).
l.splice(position, x, first, last) is O(1)

Alternative 2:
l.size() is supported, but takes O(N) time.
l.splice(position, x, first, last) is O(1)

Note that every option involves one thing that takes O(N) operations.

Scott Lurndal

unread,
May 10, 2022, 10:32:06 AM5/10/22
to
I've always preferred xpdf over evince. I really don't like evince.

Mut...@dastardlyhq.com

unread,
May 10, 2022, 11:55:54 AM5/10/22
to
And then you end up with a container where size() returns an incorrect value.
How is that a good thing?

Alf P. Steinbach

unread,
May 10, 2022, 12:40:29 PM5/10/22
to
The always-O(n) size alternatives are not really practical, as I see it.

The IMO only reasonable alternative, let's call it alternative 0:

* A number-of-values operation (could just be .size) is supported, with
cached but possibly unknown size so that it's generally O(1) but O(n)
first time after a don't-know-size-of splice.
* Splice of unknown size sequence is O(1).
* A `const` such list is safe for MT when it knows its size, which can
be forced by just calling .size(); it's MT unsafe when size is unknown,
i.e. in the period between an unknown size splice and a call of .size().

Without this practically useful fastish list alternative there wouldn't
be much basis for a discussion.

The constraint on sharing of `const` lists, that they must have stable
caches (one can call .size() to force that), is unique wrt. the current
standard library. However, as I mentioned else-thread it would apply
also to any COW string type, which the committee also opted out of.
Their choices simplified the complexity specifications and simplified
the guarantees for MT safety, but I think those advantages academic.

Cheers,

- Alf

James Kuyper

unread,
May 10, 2022, 1:28:28 PM5/10/22
to
On 5/10/22 11:55, Mut...@dastardlyhq.com wrote:
> On Mon, 9 May 2022 09:38:11 -0700
> Andrey Tarasevich <andreyta...@hotmail.com> wrote:
>> On 5/9/2022 9:18 AM, Mut...@dastardlyhq.com wrote:
...
>>> So why are certain people claiming a std::list won't always know the
>>> size
>>> of whats being inserted into it?
>>
>> I think it's been explained rather clearly already: spending O(n) to
>> "know" it is prohibitively expensive.
>>
>> That's, again, the crux of the issue: if we don't know "the size of
>> whats being inserted", we don't want to spend O(n) to calculate that
>> size.
>
> And then you end up with a container where size() returns an incorrect
> value.
> How is that a good thing?


Two options are under discussion:
size() never returns an incorrect value, it just may require O(n)
operations to give you the correct value. The advantage of this over
constant-time size() is that you only need to pay the O(n) cost if you
actually need the size. If size() never gets called, you never pay the
price. If it only gets called once for each list after a long series of
splices between lists, you only pay that cost once per list rather than
once per inter-list splice(). Whether or not this is an advantage
depends upon the number of inter-list splices vs. the number of
different lists.

Some people have suggested that inter-list splices are so rare that it
isn't worth worrying about - most splices are between different parts of
the same list, which never changes the list size.
Other people have suggested that essentially all splices are inter-list,
and that the O(N) time for inter-list splicing therefore removes the
only advantage that list has over the other container types.
I mistrust extreme statements, and suspect that the truth lies somewhere
between those two extremes. But I have no evidence to present in favor
of that suspicion.

The second option is that size() never returns an incorrect value,
because size() is not supported at all. If you need the number of
elements in the list, you call std::distance(l.begin(), l.end()). The
advantage of this approach over the previous one is that people won't
accidentally call size() in the incorrect expectation that it will be an
O(1) operation, as it is for most other containers.


Manfred

unread,
May 10, 2022, 1:50:56 PM5/10/22
to
This last bit is true, but there is more to the story, which I think is
important: we are not talking about some list implementation for a
specific application - something that anyone could do for the task at hand.
This discussion is about the standard library, which has different
requirements than some specific program.

As a standard library feature, I think that alternative 1 or 2 would be
preferred to the current status, possibly alternative 1 over alternative 2.
I see two reasons for this:

a) Standard library code is not supposed to be modified by application
programmers, so any feature that comes with it should be well
consolidated, so that there is widespread agreement that alternative
solutions are to be considered suboptimal. If some feature is
controversial, better leave the implementation to the programmer, who is
in fact the person responsible for the actual product.

b) Standard library implementations should keep complexity (including
design complexity) to a minimum - this is related to having well
established solutions in it.
Now, in the case of a list, complete ownership of the container involves
two data items: the 'begin' and 'end' iterators.
Adding a 'size' data member increases complexity, and, even worse,
happens to duplicate information, which brings with itself obvious
management hassles. I believe that the choice to duplicate information
should be justified only by analysis of the specific problem domain,
which is something a standard library cannot be aware of.

Putting this in other words, the distinctive feature of list is O(1)
deletions and insertions (including splicing), more than counting its
elements. In fact I think it's fair to say that in algorithms in which a
linked list is an appropriate tool the number of elements is often not
even used. So, a compromise, if needed, should favor splice() over size().

And, if for some reason one needs a linked list with an O(1) size()
operation, one can easily wrap a std::list in a class that caches and
manages the additional size member.
The other way around, i.e. getting an O(1) splice operation from a list
implementation that comes with an O(1) size() operation, is just not
possible.

Öö Tiib

unread,
May 12, 2022, 8:44:32 AM5/12/22
to
Why the third, complex, alternative is missing?
It is like that inter-list sequence splice causes next call to size() to be O(N)
but after that one call to size() (or clear() or empty() that returned true) it
turns subsequent calls to size() back to O(1). Vast majority of operations
can upkeep size being O(1).


> > Note that every option involves one thing that takes O(N) operations.
> >
> This last bit is true, but there is more to the story, which I think is
> important: we are not talking about some list implementation for a
> specific application - something that anyone could do for the task at hand.
> This discussion is about the standard library, which has different
> requirements than some specific program.
>
> As a standard library feature, I think that alternative 1 or 2 would be
> preferred to the current status, possibly alternative 1 over alternative 2.
> I see two reasons for this:
>
> a) Standard library code is not supposed to be modified by application
> programmers, so any feature that comes with it should be well
> consolidated, so that there is widespread agreement that alternative
> solutions are to be considered suboptimal. If some feature is
> controversial, better leave the implementation to the programmer, who is
> in fact the person responsible for the actual product.

That unfortunately is not the case with standard library. It just has couple
containers that are well implemented and tested but can't be exactly
optimal for every case. That raises plenty of controversy. For example the
std::deque chunk size, std::vector growth factor, underlying tree type of
associative containers etc. Taking better fitting or configurable
implementation can sometimes win noticeable gains in performance.

> b) Standard library implementations should keep complexity (including
> design complexity) to a minimum - this is related to having well
> established solutions in it.

The main problem with O(N) size() of list I have observed was that
programmers called it when they were really interested in O(1) empty().
I was bored of typing to code review that they should use c.empty() not
!c.size() or c.size() < 1 etc. Their brains just are wired to want to evaluate
size.

> Now, in the case of a list, complete ownership of the container involves
> two data items: the 'begin' and 'end' iterators.
> Adding a 'size' data member increases complexity, and, even worse,
> happens to duplicate information, which brings with itself obvious
> management hassles. I believe that the choice to duplicate information
> should be justified only by analysis of the specific problem domain,
> which is something a standard library cannot be aware of.
>
> Putting this in other words, the distinctive feature of list is O(1)
> deletions and insertions (including splicing), more than counting its
> elements. In fact I think it's fair to say that in algorithms in which a
> linked list is an appropriate tool the number of elements is often not
> even used. So, a compromise, if needed, should favor splice() over size().
>
> And, if for some reason one needs a linked list with an O(1) size()
> operation, one can easily wrap a std::list in a class that caches and
> manages the additional size member.
> The other way around, i.e. getting an O(1) splice operation from a list
> implementation that comes with an O(1) size() operation, is just not
> possible.

There is configurable implementation of list in boost::container. I would
take it ... as there are usually better things to do than to implement
mundane containers.

wij

unread,
May 12, 2022, 11:05:43 AM5/12/22
to
What kind of library is supposed to be modified by application programmers?

> b) Standard library implementations should keep complexity (including
> design complexity) to a minimum - this is related to having well
> established solutions in it.
> Now, in the case of a list, complete ownership of the container involves
> two data items: the 'begin' and 'end' iterators.
> Adding a 'size' data member increases complexity, and, even worse,
> happens to duplicate information, which brings with itself obvious
> management hassles. I believe that the choice to duplicate information
> should be justified only by analysis of the specific problem domain,
> which is something a standard library cannot be aware of.

What library implementations should not keep complexity (including
design complexity) to a minimum?

> Putting this in other words, the distinctive feature of list is O(1)
> deletions and insertions (including splicing), more than counting its
> elements. In fact I think it's fair to say that in algorithms in which a
> linked list is an appropriate tool the number of elements is often not
> even used. So, a compromise, if needed, should favor splice() over size().
>
> And, if for some reason one needs a linked list with an O(1) size()
> operation, one can easily wrap a std::list in a class that caches and
> manages the additional size member.
> The other way around, i.e. getting an O(1) splice operation from a list
> implementation that comes with an O(1) size() operation, is just not
> possible.

Informationless.

My opinion: The standard library is what it is, difficult to change (unless the goal is to change/evaluate it).
The member size() should be in std::list for the standard's theory to work.
Otherwise the missing of size() would cause more serious/fundamental problems.
(In all practical cases I know, the cost of size() is negligibly minor)

Manfred

unread,
May 12, 2022, 11:24:26 AM5/12/22
to
That would be Alf's proposal. Not my favorite, but sure part of the picture.

>
>>> Note that every option involves one thing that takes O(N) operations.
>>>
>> This last bit is true, but there is more to the story, which I think is
>> important: we are not talking about some list implementation for a
>> specific application - something that anyone could do for the task at hand.
>> This discussion is about the standard library, which has different
>> requirements than some specific program.
>>
>> As a standard library feature, I think that alternative 1 or 2 would be
>> preferred to the current status, possibly alternative 1 over alternative 2.
>> I see two reasons for this:
>>
>> a) Standard library code is not supposed to be modified by application
>> programmers, so any feature that comes with it should be well
>> consolidated, so that there is widespread agreement that alternative
>> solutions are to be considered suboptimal. If some feature is
>> controversial, better leave the implementation to the programmer, who is
>> in fact the person responsible for the actual product.
>
> That unfortunately is not the case with standard library.

This does not mean it shouldn't be its goal, right?

It just has couple
> containers that are well implemented and tested but can't be exactly
> optimal for every case. That raises plenty of controversy. For example the
> std::deque chunk size, std::vector growth factor, underlying tree type of
> associative containers etc. Taking better fitting or configurable
> implementation can sometimes win noticeable gains in performance.
>

These two examples are a different matter: you can't have a std::deque
without some chunk size, and you can't have a std::vector without a
growth factor - i.e. these are design parameters that must be part of
the implementation, one way or the other.
A 'size' cached member is /not/ required for a std::list to do its job.
Moreover, at least for std::vector, the standard library gives you a
valid solution if the default growth factor does not suit your needs:
/If/ (and I stress _if_) the default growth factor is proven to be not
good for you, then this means that you have some very specific and hard
requirements - you want to use .reserve() in your program.

>> b) Standard library implementations should keep complexity (including
>> design complexity) to a minimum - this is related to having well
>> established solutions in it.
>
> The main problem with O(N) size() of list I have observed was that
> programmers called it when they were really interested in O(1) empty().
> I was bored of typing to code review that they should use c.empty() not
> !c.size() or c.size() < 1 etc. Their brains just are wired to want to evaluate
> size.
>

As one good project manager I once worked with used to say, this falls
into the category 'laziness'. The appropriate teamleader would happily
'educate' them (citing another former coworker of mine)

More seriously, if the O(N) .size() function compromises the program
performance, then this means that the project requires proper attention
to container usage, including questioning if 'list' is the right
container choice.

>> Now, in the case of a list, complete ownership of the container involves
>> two data items: the 'begin' and 'end' iterators.
>> Adding a 'size' data member increases complexity, and, even worse,
>> happens to duplicate information, which brings with itself obvious
>> management hassles. I believe that the choice to duplicate information
>> should be justified only by analysis of the specific problem domain,
>> which is something a standard library cannot be aware of.
>>
>> Putting this in other words, the distinctive feature of list is O(1)
>> deletions and insertions (including splicing), more than counting its
>> elements. In fact I think it's fair to say that in algorithms in which a
>> linked list is an appropriate tool the number of elements is often not
>> even used. So, a compromise, if needed, should favor splice() over size().
>>
>> And, if for some reason one needs a linked list with an O(1) size()
>> operation, one can easily wrap a std::list in a class that caches and
>> manages the additional size member.
>> The other way around, i.e. getting an O(1) splice operation from a list
>> implementation that comes with an O(1) size() operation, is just not
>> possible.
>
> There is configurable implementation of list in boost::container. I would
> take it ... as there are usually better things to do than to implement
> mundane containers.

I'm not a big fan of boost, but that's an option, yes.
Note, however, that when I wrote 'wrapping' I didn't suggest rewriting a
linked list from scratch. On the contrary, with the current state of
things you need to rewrite the whole thing if you need a O(1) .splice().
That is what is really bad.

Again, in the rare case that the .size() call is critical (which really
needs to be proven by detailed profiling), then the project deserves
extra care with handling the container.
Message has been deleted

James Kuyper

unread,
May 12, 2022, 10:53:51 PM5/12/22
to
On 5/12/2022 2:44 PM, Öö Tiib wrote:
> On Tuesday, 10 May 2022 at 20:50:56 UTC+3, Manfred wrote:
>> On 5/10/2022 4:16 PM, James Kuyper wrote:
>>> On 5/9/22 12:18, Mut...@dastardlyhq.com wrote:
...
[Correction to my previous post, which I've cancelled - but cancellation
requests are usually ignored]
That's covered by the second option. Just because the specification allows
size() to be O(N) doesn't meant that it's required to always involve
O(N) operations.

Öö Tiib

unread,
May 12, 2022, 11:12:06 PM5/12/22
to
It is impossible goal to have generics that are optimal.
Point of generics is to perform good enough, and reliably, not optimally.

> It just has couple
> > containers that are well implemented and tested but can't be exactly
> > optimal for every case. That raises plenty of controversy. For example the
> > std::deque chunk size, std::vector growth factor, underlying tree type of
> > associative containers etc. Taking better fitting or configurable
> > implementation can sometimes win noticeable gains in performance.
> >
> These two examples are a different matter: you can't have a std::deque
> without some chunk size, and you can't have a std::vector without a
> growth factor - i.e. these are design parameters that must be part of
> the implementation, one way or the other.
>
> A 'size' cached member is /not/ required for a std::list to do its job.
> Moreover, at least for std::vector, the standard library gives you a
> valid solution if the default growth factor does not suit your needs:
> /If/ (and I stress _if_) the default growth factor is proven to be not
> good for you, then this means that you have some very specific and hard
> requirements - you want to use .reserve() in your program.

These were three random examples that can matter to performance followed
with "etc."

> >> b) Standard library implementations should keep complexity (including
> >> design complexity) to a minimum - this is related to having well
> >> established solutions in it.
> >
> > The main problem with O(N) size() of list I have observed was that
> > programmers called it when they were really interested in O(1) empty().
> > I was bored of typing to code review that they should use c.empty() not
> > !c.size() or c.size() < 1 etc. Their brains just are wired to want to evaluate
> > size.
> >
> As one good project manager I once worked with used to say, this falls
> into the category 'laziness'. The appropriate teamleader would happily
> 'educate' them (citing another former coworker of mine)
>
> More seriously, if the O(N) .size() function compromises the program
> performance, then this means that the project requires proper attention
> to container usage, including questioning if 'list' is the right
> container choice.

I did not profile if it mattered. It was just code review opinion that usage of
size() was pessimal.

> >> Now, in the case of a list, complete ownership of the container involves
> >> two data items: the 'begin' and 'end' iterators.
> >> Adding a 'size' data member increases complexity, and, even worse,
> >> happens to duplicate information, which brings with itself obvious
> >> management hassles. I believe that the choice to duplicate information
> >> should be justified only by analysis of the specific problem domain,
> >> which is something a standard library cannot be aware of.
> >>
> >> Putting this in other words, the distinctive feature of list is O(1)
> >> deletions and insertions (including splicing), more than counting its
> >> elements. In fact I think it's fair to say that in algorithms in which a
> >> linked list is an appropriate tool the numiber of elements is often not
> >> even used. So, a compromise, if needed, should favor splice() over size().
> >>
> >> And, if for some reason one needs a linked list with an O(1) size()
> >> operation, one can easily wrap a std::list in a class that caches and
> >> manages the additional size member.
> >> The other way around, i.e. getting an O(1) splice operation from a list
> >> implementation that comes with an O(1) size() operation, is just not
> >> possible.
> >
> > There is configurable implementation of list in boost::container. I would
> > take it ... as there are usually better things to do than to implement
> > mundane containers.
>
> I'm not a big fan of boost, but that's an option, yes.
> Note, however, that when I wrote 'wrapping' I didn't suggest rewriting a
> linked list from scratch. On the contrary, with the current state of
> things you need to rewrite the whole thing if you need a O(1) .splice().
> That is what is really bad.
>
> Again, in the rare case that the .size() call is critical (which really
> needs to be proven by detailed profiling), then the project deserves
> extra care with handling the container.

In my experience when it is worth to consider alternatives then
boost has more useful things than folly or abseil.

Alf P. Steinbach

unread,
May 13, 2022, 12:34:29 PM5/13/22
to
By that logic it can be described as O(n^2 ).

And it's within the formal definition of big O notation.

However, using it that way would be very misleading so it's not done;
there is an understanding that the big O communicates something useful
to know, whereas the way you used it it communicated a falsehood.


- Alf

James Kuyper

unread,
May 13, 2022, 11:46:56 PM5/13/22
to
On 5/13/22 12:34, Alf P. Steinbach wrote:
> On 13 May 2022 04:53, James Kuyper wrote:
...
>> [Correction to my previous post, which I've cancelled - but cancellation
>> requests are usually ignored]
>> That's covered by the second option. Just because the specification
>> allows
>> size() to be O(N) doesn't meant that it's required to always involve
>> O(N) operations.
>>
>
> By that logic it can be described as O(n^2 ).

You misunderstand. This isn't a description.

> And it's within the formal definition of big O notation.
>
> However, using it that way would be very misleading so it's not done;
> there is an understanding that the big O communicates something useful
> to know, whereas the way you used it it communicated a falsehood.

Since it's not a description, it can't be a false description, either.
It's a specification. Standard specifications can neither be false, nor
true. A given implementation either meets the specifications, or it
doesn't. The specifications might be impossible to meet, but that's a
different matter entirely from truth or falsity.

Whenever the standard specifies the complexity of an operation, an
implementation that has a lower complexity is always allowed. In
particular, if the standard specifies O(N), an implementation that
sometimes takes O(1) time, and sometimes takes O(N) meets that
specification. An implementation that took O(N^2) time would not.

Alf P. Steinbach

unread,
May 14, 2022, 5:44:09 AM5/14/22
to
That's a meaningless response.

I guess by design.

- Alf

Tim Rentsch

unread,
May 15, 2022, 10:48:26 AM5/15/22
to
James's response is not meaningless. It is a bit sloppy in
places in how it uses big-O notation, but it is not meaningless.
If you have trouble understanding what he means I suggest asking
a question.

> I guess by design.

It's hard to see this comment as anything but a gratuitous
insult. Heaven know I don't always agree with what James has to
say, but he is not someone given to making statements that are
deliberately meaningless, and I certainly believe he did not
do so in this instance.

Alf P. Steinbach

unread,
May 15, 2022, 9:30:39 PM5/15/22
to
You know that I have no trouble understanding the literal meaning.

It's meaningless drivel, though.

If you don't understand that, then any contribution you make here is
just more noise.



>> I guess by design.
>
> It's hard to see this comment as anything but a gratuitous
> insult. Heaven know I don't always agree with what James has to
> say, but he is not someone given to making statements that are
> deliberately meaningless, and I certainly believe he did not
> do so in this instance.

Your opinion of his preference in argumentation does very little to add
meaning to what he wrote; rather the opposite.

The natural interpretation of James' remark is an attempt to obscure and
hide that he earlier misled or failed to consider a possibility, namely
that of a cached list size, by posting a stream of literally true
meaningless nonsense, and for your remark, that you failed to understand
both the earlier exchange and the beyond-the-end obfuscation attempt.

Have fun.


Cheers,

- Alf

James Kuyper

unread,
May 15, 2022, 10:57:55 PM5/15/22
to
On 5/15/22 21:30, Alf P. Steinbach wrote:
> On 15 May 2022 16:48, Tim Rentsch wrote:
...
>> It's hard to see this comment as anything but a gratuitous
>> insult. ...

It was, of course, an insult, and clearly intended as such.

> ... Heaven know I don't always agree with what James has to
>> say, but he is not someone given to making statements that are
>> deliberately meaningless, and I certainly believe he did not
>> do so in this instance.
>
> Your opinion of his preference in argumentation does very little to add
> meaning to what he wrote; rather the opposite.
>
> The natural interpretation of James' remark is an attempt to obscure and
> hide that he earlier misled or failed to consider a possibility, namely
> that of a cached list size, ...

No, that is not at all the case. I was well aware that a cached list
size was one possibility being discussed. I very carefully worded my
summary of the options being discussed so that it would include that
possibility. I very deliberately chose not to separate it out as an
option distinct from the others, because doing so would serve no purpose
that was relevant to the point I was making.

Possibly you failed to recognize the point that I was making, and
thought that some other point should have been made. My point was that,
no matter how the class was defined, in order to determine the number of
elements after a splice from a foreign list, O(N) operations must occur
somewhere; the discussion among the other participants in this
discussion has only been about where and when that cost should be
incurred, and not, as Muttley claimed, about whether on not it was
possible to determine the size.
Resolving that confusion on Muttley's part was the primary purpose of my
message. If you thought it had some other purpose, that might have been
the reason you found it so upsetting.

Juha Nieminen

unread,
May 16, 2022, 1:46:51 AM5/16/22
to
James Kuyper <james...@alumni.caltech.edu> wrote:
> Whenever the standard specifies the complexity of an operation, an
> implementation that has a lower complexity is always allowed. In
> particular, if the standard specifies O(N), an implementation that
> sometimes takes O(1) time, and sometimes takes O(N) meets that
> specification. An implementation that took O(N^2) time would not.

One has to be quite careful with the wording used, though.

"at most O(n)", "on average O(n)" and "amortized O(n)" can mean
quite different things.

Also things like "O(n) element comparisons" can insiduously hide the
allowance of doing more than O(n) other operations.

Tim Rentsch

unread,
May 17, 2022, 8:20:10 AM5/17/22
to
"Alf P. Steinbach" <alf.p.s...@gmail.com> writes:

> On 15 May 2022 16:48, Tim Rentsch wrote:
>
>> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>>
>>> On 14 May 2022 05:46, James Kuyper wrote:
>>>
>>>> On 5/13/22 12:34, Alf P. Steinbach wrote:
>>>>
>>>>> On 13 May 2022 04:53, James Kuyper wrote:
>>>>
>>>> ...
>>>>
I don't know what kinds of statements you might have trouble
understanding. My comment was only about James's statements,
not about whether you understood them.

> It's meaningless drivel, though.
>
> If you don't understand that, then any contribution you make here
> is just more noise.

Since I don't think his statements are meaningless I wouldn't
call them meaningless drivel. Apparently you mean something
different by the phrase than how I would normally read it.

>>> I guess by design.
>>
>> It's hard to see this comment as anything but a gratuitous
>> insult. Heaven know I don't always agree with what James has to
>> say, but he is not someone given to making statements that are
>> deliberately meaningless, and I certainly believe he did not
>> do so in this instance.
>
> Your opinion of his preference in argumentation does very little
> to add meaning to what he wrote; rather the opposite.
>
> The natural interpretation of James' remark is an attempt to
> obscure and hide that he earlier misled or failed to consider a
> possibility, namely that of a cached list size, by posting a
> stream of literally true meaningless nonsense, and for your
> remark, that you failed to understand both the earlier exchange
> and the beyond-the-end obfuscation attempt.

I'm sorry you didn't find my comments more helpful.

Tim Rentsch

unread,
May 17, 2022, 8:31:00 AM5/17/22
to
James Kuyper <james...@alumni.caltech.edu> writes:

> On 5/15/22 21:30, Alf P. Steinbach wrote:
>
>> On 15 May 2022 16:48, Tim Rentsch wrote:
>
> ...
>
>>> It's hard to see this comment as anything but a gratuitous
>>> insult. ...
>
> It was, of course, an insult, and clearly intended as such.

There's an important difference between what I said and saying
Alf's (now missing) statement is an insult. I don't doubt that
you feel insulted, but I still wouldn't say Alf's statement is
an insult.

Tim Rentsch

unread,
May 17, 2022, 8:38:04 AM5/17/22
to
Juha Nieminen <nos...@thanks.invalid> writes:

> James Kuyper <james...@alumni.caltech.edu> wrote:
>
>> Whenever the standard specifies the complexity of an operation, an
>> implementation that has a lower complexity is always allowed. In
>> particular, if the standard specifies O(N), an implementation that
>> sometimes takes O(1) time, and sometimes takes O(N) meets that
>> specification. An implementation that took O(N^2) time would not.
>
> One has to be quite careful with the wording used, though.
>
> "at most O(n)", "on average O(n)" and "amortized O(n)" can mean
> quite different things.

Because James is talking about what is said in the C++ standard,
what "O(n)" means is the same as what is meant in the standard.

Bonita Montero

unread,
May 31, 2022, 11:46:42 PM5/31/22
to
I use my own class

template<typename RandIt>
concept bool_cursor_iterator = bit_cursor_iterator<RandIt>;

template<bool_cursor_iterator BcIt>
struct bool_cursor
{
using value_type = bool;
using difference_type = std::ptrdiff_t;
struct reference;
bool_cursor() = delete;
constexpr bool_cursor( BcIt begin, std::ptrdiff_t offset = 0 ) noexcept;
constexpr bool_cursor( bool_cursor const &other ) noexcept;
constexpr bool_cursor &operator =( bool_cursor const &other ) noexcept;
constexpr bool operator ==( bool_cursor const &other ) const noexcept =
default;
constexpr bool operator !=( bool_cursor const &other ) const noexcept =
default;
constexpr bool_cursor operator +( difference_type offset ) const noexcept;
constexpr bool_cursor operator -( difference_type offset ) const noexcept;
constexpr bool_cursor &operator ++() noexcept;
constexpr bool_cursor &operator --() noexcept;
constexpr bool_cursor operator ++( int ) noexcept;
constexpr bool_cursor operator --( int ) noexcept;
constexpr difference_type operator -( bool_cursor const &other ) const
noexcept;
constexpr bool operator <( bool_cursor const &other ) const noexcept;
constexpr bool operator <=( bool_cursor const &other ) const noexcept;
constexpr bool operator >( bool_cursor const &other ) const noexcept;
constexpr bool operator >=( bool_cursor const &other ) const noexcept;
constexpr bool_cursor &operator +=( difference_type offset ) noexcept;
constexpr bool_cursor &operator -=( difference_type offset ) noexcept;
constexpr reference operator *() const noexcept;
constexpr reference operator []( difference_type offset ) const noexcept;
struct reference
{
constexpr operator bool() const noexcept;
constexpr bool operator =( bool f ) const noexcept;
private:
friend struct bool_cursor;
constexpr reference( BcIt it, unsigned char shift ) noexcept;
BcIt m_it;
unsigned char m_shift;
};
constexpr void swap( bool_cursor &other ) noexcept;
private:
BcIt m_it;
unsigned char m_shift;
};

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt>::bool_cursor( BcIt begin, std::ptrdiff_t
offset ) noexcept :
m_it( begin + (offset >> 3) ),
m_shift( (unsigned char)offset & 7 )
{
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt>::bool_cursor( bool_cursor const &other )
noexcept :
m_it( other.m_it ),
m_shift( other.m_shift )
{
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> &bool_cursor<BcIt>::operator =( bool_cursor
const &other ) noexcept
{
m_it = other.m_it;
m_shift = other.m_shift;
return *this;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> bool_cursor<BcIt>::operator +(
difference_type offset ) const noexcept
{
using namespace std;
assert(in_range_add( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift + offset;
return bool_cursor( m_it + (iBit >> 3), (unsigned char)iBit & 7 );
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> bool_cursor<BcIt>::operator -(
difference_type offset ) const noexcept
{
using namespace std;
assert(in_range_sub( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift - offset;
return bool_cursor( m_it + (iBit >> 3), (unsigned char)iBit & 7 );
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> &bool_cursor<BcIt>::operator ++() noexcept
{
unsigned char newShift = m_shift + 1 & 7;
m_it += newShift < m_shift;
m_shift = newShift;
return *this;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> &bool_cursor<BcIt>::operator --() noexcept
{
unsigned char newShift = m_shift - 1 & 7;
m_it -= newShift > m_shift;
m_shift = newShift;
return *this;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> bool_cursor<BcIt>::operator ++( int ) noexcept
{
bool_cursor ret( *this );
++*this;
return ret;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> bool_cursor<BcIt>::operator --( int ) noexcept
{
bool_cursor ret( *this );
--*this;
return ret;
}

template<bool_cursor_iterator BcIt>
constexpr typename bool_cursor<BcIt>::difference_type
bool_cursor<BcIt>::operator -( bool_cursor const &other ) const noexcept
{
using namespace std;
assert((ptrdiff_t)(m_it - other.m_it) * 8 / 8 == (m_it - other.m_it) &&
in_range_add( (ptrdiff_t)(m_it - other.m_it), (ptrdiff_t)(signed
char)(m_shift - other.m_shift) ));
return (ptrdiff_t)(m_it - other.m_it) + (ptrdiff_t)(signed
char)(m_shift - other.m_shift);
}

template<bool_cursor_iterator BcIt>
constexpr bool bool_cursor<BcIt>::operator <( bool_cursor const &other )
const noexcept
{
return m_it < other.m_it || m_it == other.m_it && m_shift < other.m_shift;
}

template<bool_cursor_iterator BcIt>
constexpr bool bool_cursor<BcIt>::operator <=( bool_cursor const &other
) const noexcept
{
return m_it < other.m_it || m_it == other.m_it && m_shift <= other.m_shift;
}

template<bool_cursor_iterator BcIt>
constexpr bool bool_cursor<BcIt>::operator >( bool_cursor const &other )
const noexcept
{
return m_it > other.m_it || m_it == other.m_it && m_shift > other.m_shift;
}

template<bool_cursor_iterator BcIt>
constexpr bool bool_cursor<BcIt>::operator >=( bool_cursor const &other
) const noexcept
{
return m_it > other.m_it || m_it == other.m_it && m_shift >= other.m_shift;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> &bool_cursor<BcIt>::operator +=(
difference_type offset ) noexcept
{
assert(in_range_add( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift + offset;
m_it += iBit >> 3;
m_shift = (unsigned char)iBit & 7;
return *this;
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt> &bool_cursor<BcIt>::operator -=(
difference_type offset ) noexcept
{
assert(in_range_sub( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift - offset;
m_it -= iBit >> 3;
m_shift = (unsigned char)iBit & 7;
return *this;
}

template<bool_cursor_iterator BcIt>
constexpr typename bool_cursor<BcIt>::reference
bool_cursor<BcIt>::operator *() const noexcept
{
return reference( m_it, m_shift );
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt>::reference::operator bool() const noexcept
{
unsigned char value;
std::memcpy( &value, to_address( m_it ), 1 );
return (bool)(value >> m_shift & 1);
}

template<bool_cursor_iterator BcIt>
constexpr bool bool_cursor<BcIt>::reference::operator =( bool f ) const
noexcept
{
using namespace std;
unsigned char value;
memcpy( &value, to_address( m_it ), 1 );
value |= (unsigned char)f << m_shift;
memcpy( to_address( m_it ), &value, 1 );
return f;
}

template<bool_cursor_iterator BcIt>
constexpr typename bool_cursor<BcIt>::reference
bool_cursor<BcIt>::operator []( difference_type offset ) const noexcept
{
using namespace std;
assert(in_range_add( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift + offset;
return reference( m_it + (iBit >> 3), (unsigned char)iBit & 7 );
}

template<bool_cursor_iterator BcIt>
constexpr bool_cursor<BcIt>::reference::reference( BcIt it, unsigned
char shift ) noexcept :
m_it( it ),
m_shift( shift )
{
}

namespace std
{
template<typename BcIt>
constexpr void swap( bool_cursor<BcIt> &left, bool_cursor<BcIt> &right
) noexcept
{
left.swap( right );
}
}

Bonita Montero

unread,
Jun 1, 2022, 1:31:02 AM6/1/22
to
And this even more fancy. A class to handle a series of bit-fields
nearly a convenient like handling a series of dissimilar byte sequences
with a normal pointer. Conforming to C++ aliasing-constraints that
aliasing is only absolutely safe if you alias via memcpy().

#pragma once
#include <cstdint>
#include <cassert>
#include <iterator>
#include <cassert>
#include <type_traits>
#include <iterator>
#include <cstring>
#include <utility>
#include "bswap.h"
#include "range_check.h"

#if defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable: 4554) // operator predence
#pragma warning(disable: 26495) // always initialize members
#endif
#if defined(__llvm__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wshift-op-parentheses"
#pragma clang diagnostic ignored "-Wlogical-op-parentheses"
#pragma clang diagnostic ignored "-Wbitwise-op-parentheses"
#endif

static_assert(std::bit_floor( sizeof(max_align_t) ) ==
sizeof(max_align_t), "sizeof(max_align_t) must be a power of two");
static_assert(sizeof(std::max_align_t) >= sizeof(std::uint64_t),
"sizeof(max_align_t) must be the same or a a larger power of two than
sizeof(uint64_t)");

template<typename RandIt>
concept bit_cursor_iterator = std::random_access_iterator<RandIt> &&
(sizeof(std::iter_value_t<RandIt>) == 1);

template<bit_cursor_iterator BitIt>
struct bit_cursor
{
using value_type = std::uint64_t;
using difference_type = std::ptrdiff_t;
bit_cursor() = delete;
constexpr bit_cursor( BitIt begin, std::ptrdiff_t offset = 0 ) noexcept;
constexpr bit_cursor( bit_cursor &other ) noexcept;
constexpr bit_cursor &operator =( bit_cursor const &other ) noexcept;
constexpr bool operator ==( bit_cursor const &other ) const noexcept;
constexpr bool operator !=( bit_cursor const &other ) const noexcept;
constexpr bit_cursor operator +( difference_type offset ) const noexcept;
constexpr bit_cursor operator -( difference_type offset ) const noexcept;
constexpr difference_type operator -( bit_cursor const &other ) const
noexcept;
constexpr bool operator <( bit_cursor const &other ) const noexcept;
constexpr bool operator <=( bit_cursor const &other ) const noexcept;
constexpr bool operator >( bit_cursor const &other ) const noexcept;
constexpr bool operator >=( bit_cursor const &other ) const noexcept;
constexpr bit_cursor &operator +=( difference_type offset ) noexcept;
constexpr bit_cursor &operator -=( difference_type offset ) noexcept;
constexpr std::uint64_t read_na( unsigned char bits ) noexcept;
constexpr void write_na( std::uint64_t value, unsigned char bits )
noexcept;
constexpr std::uint64_t read( unsigned char bits ) noexcept;
constexpr void write( std::uint64_t value, unsigned char bits ) noexcept;
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr std::uint64_t read_na() noexcept;
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr void write_na( std::uint64_t value ) noexcept;
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr std::uint64_t read() noexcept;
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr void write( std::uint64_t value ) noexcept;
constexpr void swap( bit_cursor &other ) noexcept;
private:
static std::uint64_t dRead( std::uint64_t const &data ) noexcept;
static void dWrite( std::uint64_t &data, std::uint64_t value ) noexcept;
constexpr bit_cursor( BitIt begin, std::uint64_t *data, unsigned char
shift ) noexcept;
static constexpr std::uint64_t bswap( std::uint64_t value ) noexcept;
#if !defined(NDEBUG)
BitIt m_safeData;
#endif
std::uint64_t *m_data;
unsigned char m_shift;
};

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt>::bit_cursor( BitIt begin, std::ptrdiff_t
offset ) noexcept :
#if !defined(NDEBUG)
m_safeData( begin + (offset >> 3) ),
#endif
m_data( (std::uint64_t *)((std::size_t)std::to_address( begin ) +
(offset >> 3) & -(std::ptrdiff_t)sizeof(std::uint64_t)) ),
m_shift( (unsigned char)((char *)std::to_address( begin ) - (char
*)m_data) * 8 + (unsigned char)(offset & 7) )
{
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt>::bit_cursor( bit_cursor &other ) noexcept :
#if !defined(NDEBUG)
m_safeData( other.m_safeData ),
#endif
m_data( other.m_data ),
m_shift( other.m_shift )
{
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt> &bit_cursor<BitIt>::operator =( bit_cursor
const &other ) noexcept
{
#if !defined(NDEBUG)
m_safeData = other.m_safeData;
#endif
m_data = other.m_data;
m_shift = other.m_shift;
return *this;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator ==( bit_cursor const &other )
const noexcept
{
return m_data == other.m_data && m_shift == other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator !=( bit_cursor const &other )
const noexcept
{
return m_data != other.m_data || m_shift != other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt> bit_cursor<BitIt>::operator +(
difference_type offset ) const noexcept
{
using namespace std;
#if !defined(NDEBUG)
BitIt newSafeIt( m_safeData + ((ptrdiff_t)(m_shift & 7) + (offset >> 3)) );
#else
BitIt newSafeIt;
#endif
assert(in_range_add( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift + offset;
return bit_cursor( newSafeIt, m_data + (iBit >> 6), (unsigned char)iBit
% 64 );
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt> bit_cursor<BitIt>::operator -(
difference_type offset ) const noexcept
{
using namespace std;
#if !defined(NDEBUG)
BitIt newSafeIt( m_safeData + ((ptrdiff_t)(m_shift & 7) - (offset >> 3)) );
#else
BitIt newSafeIt;
#endif
assert(in_range_sub( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift - offset;
return bit_cursor( newSafeIt, m_data - (iBit >> 6), (unsigned char)iBit
% 64 );
}

template<bit_cursor_iterator BitIt>
constexpr typename bit_cursor<BitIt>::difference_type
bit_cursor<BitIt>::operator -( bit_cursor const &other ) const noexcept
{
using namespace std;
signed char shiftDiff = (signed char)(m_shift - other.m_shift);
assert((m_data - other.m_data) * 64 / 64 == (m_data - other.m_data) &&
in_range_add( (m_data - other.m_data) * 64, (ptrdiff_t)shiftDiff ));
return (m_data - other.m_data) * 64 + (ptrdiff_t)shiftDiff;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator <( bit_cursor const &other )
const noexcept
{
return m_data < other.m_data || m_data == other.m_data && m_shift <
other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator <=( bit_cursor const &other )
const noexcept
{
return m_data < other.m_data || m_data == other.m_data && m_shift <=
other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator >( bit_cursor const &other )
const noexcept
{
return m_data > other.m_data || m_data == other.m_data && m_shift >
other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bool bit_cursor<BitIt>::operator >=( bit_cursor const &other )
const noexcept
{
return m_data > other.m_data || m_data == other.m_data && m_shift >=
other.m_shift;
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt> &bit_cursor<BitIt>::operator +=(
difference_type offset ) noexcept
{
using namespace std;
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) + (offset >> 3)));
#endif
assert(in_range_add( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift + offset;
m_data += iBit >> 6;
m_shift = (unsigned char)iBit % 64;
return *this;
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt> &bit_cursor<BitIt>::operator -=(
difference_type offset ) noexcept
{
using namespace std;
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) - (offset >> 3)));
#endif
assert(in_range_sub( (ptrdiff_t)m_shift, offset ));
ptrdiff_t iBit = (ptrdiff_t)m_shift - offset;
m_data -= iBit >> 6;
m_shift = (unsigned char)iBit % 64;
return *this;
}

template<bit_cursor_iterator BitIt>
constexpr std::uint64_t bit_cursor<BitIt>::read_na( unsigned char bits )
noexcept
{
using namespace std;
assert(bits <= 64);
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) + bits + 7) / 8);
#endif
unsigned char
shift = m_shift,
rShift = 64 - shift;
uint64_t
mask = bits != 64 ? ((uint64_t)1 << bits) - 1 : (uint64_t)-1,
value = bswap( dRead( m_data[0] ) ) >> shift & mask;
if( bits > rShift )
value |= bswap( dRead( m_data[1] ) ) << rShift & mask;
return value;
}

template<bit_cursor_iterator BitIt>
constexpr void bit_cursor<BitIt>::write_na( std::uint64_t value,
unsigned char bits ) noexcept
{
using namespace std;
assert(bits <= 64);
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) + bits + 7) / 8);
#endif
unsigned char
shift = m_shift,
rShift = 64 - shift;
uint64_t mask = bits != 64 ? ((uint64_t)1 << bits) - 1 : (uint64_t)-1;
assert(!(value & ~mask));
dWrite( m_data[0], bswap( bswap( dRead( m_data[0] ) ) & ~(mask <<
shift) | value << shift ) );
if( bits > rShift )
dWrite( m_data[1], bswap( bswap( dRead( m_data[1] ) ) & ~(mask >>
rShift) | value >> rShift ) );
}

template<bit_cursor_iterator BitIt>
constexpr std::uint64_t bit_cursor<BitIt>::read( unsigned char bits )
noexcept
{
std::uint64_t value = read_na( bits );
*this += (difference_type)bits;
return value;
}

template<bit_cursor_iterator BitIt>
constexpr void bit_cursor<BitIt>::write( std::uint64_t value, unsigned
char bits ) noexcept
{
write_na( value, bits );
*this += (difference_type)bits;
}

template<bit_cursor_iterator BitIt>
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr std::uint64_t bit_cursor<BitIt>::read_na() noexcept
{
using namespace std;
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) + Bits + 7) / 8);
#endif
unsigned char
shift = m_shift,
rShift = 64 - shift;
constexpr uint64_t MASK = Bits != 64 ? ((uint64_t)1 << Bits) - 1 :
(uint64_t)-1;
uint64_t value = bswap( dRead( m_data[0] ) ) >> shift & MASK;
if( Bits > rShift )
value |= bswap( dRead( m_data[1] ) ) << rShift & MASK;
return value;
}

template<bit_cursor_iterator BitIt>
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr void bit_cursor<BitIt>::write_na( std::uint64_t value ) noexcept
{
using namespace std;
#if !defined(NDEBUG)
(void)(m_safeData + ((ptrdiff_t)(m_shift & 7) + Bits + 7) / 8);
#endif
unsigned char
shift = m_shift,
rShift = 64 - shift;
constexpr uint64_t MASK = Bits != 64 ? ((uint64_t)1 << Bits) - 1 :
(uint64_t)-1;
assert(!(value & ~MASK));
dWrite( m_data[0], bswap( bswap( dRead( m_data[0] ) ) & ~(MASK <<
shift) | value << shift ) );
if( Bits > rShift )
dWrite( m_data[1], bswap( bswap( dRead( m_data[1] ) ) & ~(MASK >>
rShift) | value >> rShift ) );
}

template<bit_cursor_iterator BitIt>
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr std::uint64_t bit_cursor<BitIt>::read() noexcept
{
std::uint64_t value = this->template read_na<Bits>();
*this += (difference_type)Bits;
return value;
}

template<bit_cursor_iterator BitIt>
template<unsigned char Bits>
requires (Bits > 0 && Bits <= 64)
constexpr void bit_cursor<BitIt>::write( std::uint64_t value ) noexcept
{
this->template write_na<Bits>( value);
*this += (difference_type)Bits;
}

template<bit_cursor_iterator BitIt>
constexpr void bit_cursor<BitIt>::swap( bit_cursor &other ) noexcept
{
#if !defined(NDEBUG)
std::swap( m_safeData, other.m_safeData );
#endif
std::swap( m_data, other.m_data );
std::swap( m_shift, other.m_shift );
}

template<bit_cursor_iterator BitIt>
std::uint64_t bit_cursor<BitIt>::dRead( std::uint64_t const &data ) noexcept
{
uint64_t value;
std::memcpy( &value, &data, sizeof(std::uint64_t) );
return value;
}

template<bit_cursor_iterator BitIt>
void bit_cursor<BitIt>::dWrite( std::uint64_t &data, std::uint64_t value
) noexcept
{
std::memcpy( &data, &value, sizeof(std::uint64_t) );
}

template<bit_cursor_iterator BitIt>
constexpr bit_cursor<BitIt>::bit_cursor( BitIt begin, std::uint64_t
*data, unsigned char shift ) noexcept :
#if !defined(NDEBUG)
m_safeData( begin ),
#endif
m_data( data ),
m_shift( shift )
{
}

template<bit_cursor_iterator BitIt>
constexpr std::uint64_t bit_cursor<BitIt>::bswap( std::uint64_t value )
noexcept
{
return from_little_endian( value );
}

namespace std
{
template<typename BitIt>
constexpr void swap( bit_cursor<BitIt> &left, bit_cursor<BitIt> &right
) noexcept
{
left.swap( right );
}
}

#if defined(_MSC_VER)
#pragma warning(pop)
#endif
#if defined(__llvm__) || defined(__INTEL_LLVM_COMPILER)
#pragma clang diagnostic pop
#endif

Alf P. Steinbach

unread,
Jun 1, 2022, 8:20:31 AM6/1/22
to
On 1 Jun 2022 05:46, Bonita Montero wrote:
> I use my own class
> [snipalot]

Two alternatives:

* Use a `vector<Truth>` where `Truth` is your own class type `bool`
replacement.
* Use a `deque<bool>`.

The first possibility is preferable when you have a `Truth` type that
limits implicit conversions to conversion to and from `bool`.


Cheers,

- Alf

Bonita Montero

unread,
Jun 1, 2022, 8:23:15 AM6/1/22
to
Am 01.06.2022 um 14:20 schrieb Alf P. Steinbach:
> On 1 Jun 2022 05:46, Bonita Montero wrote:
>> I use my own class
>> [snipalot]
>
> Two alternatives:
>
> * Use a `vector<Truth>` where `Truth` is your own class type `bool`
> replacement.

That's not possible because the minimal data type a vector<> can
address is the size of a native type or structure - which in turn
is one.

> * Use a `deque<bool>`.

LOL.

Alf P. Steinbach

unread,
Jun 1, 2022, 8:26:51 AM6/1/22
to
Well, you know, as Niels Bohr said about the good luck horseshoe he had
on the wall in his office, it may work even if you don't believe in it.

<url: https://quoteinvestigator.com/2013/10/09/horseshoe-luck/>


>> * Use a `deque<bool>`.
>
> LOL.

It also works.


Cheers,

- Alf

Bonita Montero

unread,
Jun 1, 2022, 8:41:00 AM6/1/22
to
Show me the data-type in source feeded to deque for this purpose
to make that working.
0 new messages