An improvement to bitset


Evan Teran

Apr 23, 2014, 4:02:13 PM
to std-pr...@isocpp.org

I noticed the other day that regardless of the number of bits you specify for std::bitset (at least in GNU's libstdc++ implementation), the size of the type is always the size of a pointer. For example, the following code:

//----------------

#include <bitset>
#include <iostream>

int main() {
    std::bitset<8> b8;
    std::bitset<16> b16;
    std::bitset<32> b32;
    std::bitset<64> b64;
    std::cout << sizeof(b8) << "\n";
    std::cout << sizeof(b16) << "\n";
    std::cout << sizeof(b32) << "\n";
    std::cout << sizeof(b64) << "\n";
}

//----------------

prints:

8
8
8
8

It seems obvious to me why this is the case. The implementation of std::bitset clearly uses the heap for storage regardless of how many bits you are trying to store. Certainly, this is the simplest implementation. But I think we can do better.

Now that <cstdint> is part of C++11, I would love to see the standard require the following:

If a given uintN_t type is present, an object of type std::bitset<N> shall have a size of precisely N bits (a sizeof equal to that of uintN_t) and use no heap allocations for storage.

If any of those types are not present on the system, the implementation is free to use the heap as in current implementations.
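To make the idea concrete, here is a rough sketch (the names and structure are mine, not proposed wording) of how such a specialization could store its bits directly in a fixed-width integer:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: map N to the matching fixed-width type,
// falling back to unsigned long otherwise.
template <std::size_t N> struct storage_for { using type = unsigned long; };
template <> struct storage_for<8>  { using type = std::uint8_t;  };
template <> struct storage_for<16> { using type = std::uint16_t; };
template <> struct storage_for<32> { using type = std::uint32_t; };
template <> struct storage_for<64> { using type = std::uint64_t; };

template <std::size_t N>
class tight_bitset {
    using word = typename storage_for<N>::type;
    word bits_ = 0;                        // inline storage, no heap
public:
    void set(std::size_t pos)        { bits_ = word(bits_ | (word(1) << pos)); }
    bool test(std::size_t pos) const { return (bits_ >> pos) & 1; }
};

// For the special sizes, the object is exactly as big as the integer:
static_assert(sizeof(tight_bitset<8>)  == sizeof(std::uint8_t),  "");
static_assert(sizeof(tight_bitset<64>) == sizeof(std::uint64_t), "");
```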

To me this has multiple benefits:

1. Meets the "efficiency while maintaining abstraction" mentality of C++. Why should I use std::bitset<64> when a uint64_t is more efficient? I would *expect* them to be equally efficient.

2. Smaller memory footprint when using many std::bitset<> instances. If I have an array of 100 std::bitset<8> objects, my proposal would occupy no more than 100 bytes. The current version occupies at least 900 bytes (100 pointers and 100 bytes pointed to; possibly more if they simplify the implementation by always storing an array of unsigned longs and a bitmask, but 900 is the minimum).

3. Makes the type more useful for systems programming.

4. It's more efficient! No memory dereferences, and generally less work to do the bit twiddling in std::bitset's implementation for the special cases where the underlying type is a native size.

I know that library writers have been free to do this basically forever and have obviously chosen not to. But it seems to be an obvious optimization. Perhaps they just overlooked it? Or perhaps they just don't find it worth it. Either way, I think it would be nice to have the standard insist on an efficient implementation when it's possible. It's not a particularly hard specialization to write; I've made an experimental one myself in the past which worked beautifully, if an example is desired.

As a secondary improvement, I'd love to see std::bitset gain first_set_bit() and last_set_bit() members, which a particularly clever implementation could implement with O(1) compiler intrinsics when dealing with these "native sized" types.
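For what it's worth, on GCC/Clang those members could be sketched with the ctz/clz builtins for a 64-bit word (C++20 later added std::countr_zero / std::countl_zero in <bit> for this). The -1 return for an empty word is my own convention, not anything standard:

```cpp
#include <cstdint>

// Sketch: first/last set bit of a word-sized bitset via single-instruction
// builtins. Returns -1 for an empty word (my convention).
inline int first_set_bit(std::uint64_t w) {
    return w ? __builtin_ctzll(w) : -1;       // index of lowest set bit
}
inline int last_set_bit(std::uint64_t w) {
    return w ? 63 - __builtin_clzll(w) : -1;  // index of highest set bit
}
```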

Thoughts?

Howard Hinnant

Apr 23, 2014, 4:15:08 PM
to std-pr...@isocpp.org
On Apr 23, 2014, at 4:02 PM, Evan Teran <evan....@gmail.com> wrote:

> Thoughts?

std::bitset<65> b65;
std::cout << sizeof(b65) << "\n";

Howard

Nicola Gigante

Apr 23, 2014, 4:21:21 PM
to std-pr...@isocpp.org, Nicola Gigante

On Apr 23, 2014, at 10:02 PM, Evan Teran <evan....@gmail.com> wrote:

>
> I noticed the other day that regardless of the number of bits you specify std::bitset (at least GNU's libstdc++ implementation) the size of the type is always the size of a pointer. For example the following code:
>
> //----------------
>
> #include <bitset>
> #include <iostream>
>
> int main() {
> std::bitset<8> b8;
> std::bitset<16> b16;
> std::bitset<32> b32;
> std::bitset<64> b64;
> std::cout << sizeof(b8) << "\n";
> std::cout << sizeof(b16) << "\n";
> std::cout << sizeof(b32) << "\n";
> std::cout << sizeof(b64) << "\n";
> }
>
> //----------------
>
> prints:
>
> 8
> 8
> 8
> 8
>
> It seems obvious to me why this is the case. the implementation of std::bitset clearly uses the heap for storage regardless of how many bits you are trying to store. Certainly, this is the simplest implementation. But I think we can do better.
> <cut>
> Thoughts?

If you try sizes greater than 64 you'll see that sizeof() increases accordingly.
Actually, for N up to 64, sizeof(bitset<N>) is always 8 because the GNU implementation
(and the LLVM one, too) uses an array of size_t as the storage, so at least 8 bytes are used
if size_t is 64 bits (as it commonly is).

No heap involved here, and yes, using a bitset<64> is as efficient as a size_t.
I don't know if it's mandated by the standard or not, though...

P.S. Just saw Howard's reply; I couldn't be more succinct :)

Bye,
Nicola

Evan Teran

Apr 23, 2014, 4:22:43 PM
to std-pr...@isocpp.org
Howard, 

I think you may have misunderstood (or I failed to clarify).

For that example, the library would be allowed to fall back on the current heap-based behavior. What I am suggesting is simply a specialization for sizes that happen to match up with native integer types.

So for your b65, sizeof(b65) would be 8 on my system, because it would have a pointer to (at least) 9 bytes of memory, occupying a minimum of 17 bytes of RAM in total. But my proposed b64 specialization would occupy *precisely* 8 bytes of RAM.

Matthew Woehlke

Apr 23, 2014, 4:23:43 PM
to std-pr...@isocpp.org
On 2014-04-23 16:02, Evan Teran wrote:
> I noticed the other day that regardless of the number of bits you specify
> std::bitset (at least GNU's libstdc++ implementation) the size of the type
> is always the size of a pointer. [...]
>
> It seems obvious to me why this is the case. the implementation of
> std::bitset clearly uses the heap for storage regardless of how many bits
> you are trying to store.

Are you sure? Did you inspect the code? (*My* knee-jerk guess would be
that an inline array of long of suitable size is used. It's
statically-sized after all.)

And in fact, on my machine (using gcc), sizeof(bitset<96>()) is indeed 16...

> Now that <cstdint> is part of c++11. I would love to see the standard
> require the following:
>
> If a given uintN_t type is present, an object of type std::bitset<N> shall
> have a sizeof of precisely N bits and use no heap allocations for storage.

...or uint_fastN_t. I wonder if the heap is really always used, or if
the compiler is just using an inline long because there is little reason
to use a smaller type (and possibly even reason *not* to).

That said, I don't object to the standard specifying inline storage
where possible. I'm much more leery of requiring *minimum adequate*
inline storage, however, as opposed to allowing the implementation to
use a larger-than-strictly-necessary type. (Even if that means you are
still out of luck with the memory usage of large quantities of bitsets,
especially e.g. when packed.)

--
Matthew

Evan Teran

Apr 23, 2014, 4:29:31 PM
to std-pr...@isocpp.org, mw_t...@users.sourceforge.net
Good point, I have not inspected the code, but I will in just a moment to be sure :-).

Nevin Liber

Apr 23, 2014, 4:31:54 PM
to std-pr...@isocpp.org
On 23 April 2014 15:02, Evan Teran <evan....@gmail.com> wrote:
It seems obvious to me why this is the case. the implementation of std::bitset clearly uses the heap for storage regardless of how many bits you are trying to store.

Really?  I'm 100% sure that *isn't* the reason.  It would violate the noexcept specifications on constructors, for a start.

There is usually an embedded array of "words" that holds the bits, and usually no reason to make it smaller than a word due to alignment considerations.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com(847) 691-1404

Evan Teran

Apr 23, 2014, 4:37:11 PM
to std-pr...@isocpp.org, mw_t...@users.sourceforge.net
My implementation appears to do the following for all cases:


      typedef unsigned long _WordT;
      _WordT _M_w[_Nw];

So I was wrong in my assumption that it was heap allocated. But it definitely allocates in multiples of sizeof(unsigned long). I would still argue that for native types it could be more efficient to special-case them. Additionally, I would love to see bitset<8> be just as space efficient as uint8_t :-).

rhalb...@gmail.com

Apr 24, 2014, 2:38:48 PM
to std-pr...@isocpp.org
What would be nice, though, are separate specializations of std::bitset<N> for when N is a multiple of the number of bits in the word type. This would avoid the needless bit-shaving that is done in operator<< and operator>>. IIRC, both libstdc++ and libc++ only specialize the case of a single word, but not multiple words.
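To illustrate the point: a generic multi-word left shift stitches each destination word together from two source words, and when N is not a word multiple it must also mask off the unused bits of the top word afterwards. A rough sketch (my own illustration, two 64-bit words, i.e. something like bitset<128>):

```cpp
#include <cstddef>
#include <cstdint>

using Word = std::uint64_t;
constexpr std::size_t kWords = 2;  // e.g. bitset<128> with a 64-bit word

// Illustrative multi-word left shift by s bits (0 <= s < 128).
void shift_left(Word (&w)[kWords], unsigned s) {
    const unsigned wshift = s / 64;   // whole-word part of the shift
    const unsigned off    = s % 64;   // sub-word part
    for (std::size_t i = kWords; i-- > 0; ) {
        Word hi = (i >= wshift) ? Word(w[i - wshift] << off) : 0;
        Word lo = (off != 0 && i > wshift) ? Word(w[i - wshift - 1] >> (64 - off)) : 0;
        w[i] = hi | lo;
    }
    // When N is an exact multiple of the word size, no trailing
    // "mask off the unused top bits" step is needed here.
}
```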

Geoffrey Romer

Apr 24, 2014, 5:52:05 PM
to std-pr...@isocpp.org
This seems like a quality-of-implementation issue; I don't see any need for the standard to address it. 


Thiago Macieira

Apr 24, 2014, 6:16:32 PM
to std-pr...@isocpp.org
On Wed, Apr 23, 2014, at 13:37:11, Evan Teran wrote:
> So I was wrong in my assumption of it being heap allocated. But it
> definitely allocates in multiples of sizeof(unsigned long). I would still
> argue that for native types, it would be possibly be more efficient to
> special case them. Additionally, I would love to see that bitset<8> is just
> as space efficient as uint8_t :-).

Who says it's more efficient?

If you qualify it as "space efficient", as you did in the last sentence, then yes,
it is. But generic "efficiency" isn't a given. There could be a platform on
which the processor provides bit manipulation instructions but they operate on
quantities larger than one byte.

Like, for example, x86. From the Intel Software Developers Manual about
instruction "BT - Bit Test"

When accessing a bit in memory, the processor may access 4 bytes starting
from the memory address for a 32-bit operand size, using the following
relationship:
Effective Address + (4 ∗ (BitOffset DIV 32))
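In C terms, that relationship just says the addressed dword and the bit within it are both derived from the bit offset (my paraphrase of the quoted formula):

```cpp
#include <cstddef>
#include <cstdint>

// BitOffset selects dword BitOffset/32 (at 4-byte stride) and
// bit BitOffset%32 within that dword.
bool test_bit(const std::uint32_t* base, std::size_t bit_offset) {
    return (base[bit_offset / 32] >> (bit_offset % 32)) & 1u;
}
```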

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Evan Teran

Apr 25, 2014, 9:54:19 AM
to std-pr...@isocpp.org

You make a fair point. So I can certainly agree that this is a quality of implementation issue and not something that should be mandated by the standard.

As a modification to the original concept, perhaps there could be a new template parameter which would allow the user to specify whether to favor space efficiency?

something like:

bitset<8, compact> bs8;

It would of course be optional and default to allowing the implementation to decide.
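As a sketch of what that could look like (hypothetical only; no such parameter exists in std::bitset, and all names here are mine): the `compact_t` tag asks for the smallest type that holds N bits, while the default keeps word-sized storage, as implementations do today.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct compact_t {};   // request tightest storage
struct default_t {};   // leave the choice to the implementation

template <std::size_t N, typename Policy = default_t>
struct sized_bitset {
    static constexpr bool c = std::is_same_v<Policy, compact_t>;
    using word = std::conditional_t<c && N <= 8,  std::uint8_t,
                 std::conditional_t<c && N <= 16, std::uint16_t,
                 std::conditional_t<c && N <= 32, std::uint32_t,
                 unsigned long>>>;
    word bits_{};
};
```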

Nevin Liber

Apr 25, 2014, 10:20:25 AM
to std-pr...@isocpp.org
On 25 April 2014 08:54, Evan Teran <evan....@gmail.com> wrote:

As a modification to the original concept, perhaps there could be a new template parameter which would allow the user to specify favoring space efficiency or not? 

Adding a template parameter to an existing template is a breaking change and extremely unlikely to happen for such a minor use case.

David Krauss

Apr 25, 2014, 11:22:59 PM
to std-pr...@isocpp.org

On 2014-04-25, at 9:54 PM, Evan Teran <evan....@gmail.com> wrote:


bitset<8, compact> bs8;

It would of course be optional and default to allowing the implementation to decide.

Is there ever any reason not to “compact”? I think it’s just implementation effort.

Some libraries may resist or at least postpone such an improvement because any change to class layout is a change to the application ABI, causing DLLs to be mutually incompatible. But, the best way forward is to contribute a patch to your library of choice.

Matthew Woehlke

Apr 28, 2014, 11:15:23 AM
to std-pr...@isocpp.org
On 2014-04-25 23:22, David Krauss wrote:
> On 2014-04-25, at 9:54 PM, Evan Teran wrote:
>> bitset<8, compact> bs8;
>>
>> It would of course be optional and default to allowing the implementation to decide.
>
> Is there ever any reason not to "compact"? I think it's just implementation effort.

If compacting does not reduce the alignment requirement, there is no
reason to compact. If it does, you're likely to see a performance impact
from your reads/writes being unaligned and smaller than word size.
(Especially on platforms that SIGBUS, where your implementation must now
explicitly use different instructions, or at worst implement alignment
adjusting itself.)

IOW, compacting (assuming it reduces alignment) is a space efficiency
versus performance trade-off.

--
Matthew

Thiago Macieira

Apr 28, 2014, 12:38:20 PM
to std-pr...@isocpp.org
On Sat, Apr 26, 2014, at 11:22:59, David Krauss wrote:
> > It would of course be optional and default to allowing the implementation
> > to decide.
> Is there ever any reason not to "compact"? I think it's just implementation
> effort.

See my email Evan replied to. The reason not to compact is performance
efficiency: alignment, and the use of bit manipulation instructions that
operate on quantities larger than a byte.