I noticed the other day that, no matter how few bits you ask for, std::bitset (at least GNU's libstdc++ implementation) always occupies at least 8 bytes on my 64-bit machine. For example, the following code:
//----------------
#include <bitset>
#include <iostream>
int main() {
    std::bitset<8>  b8;
    std::bitset<16> b16;
    std::bitset<32> b32;
    std::bitset<64> b64;
    std::cout << sizeof(b8)  << "\n";
    std::cout << sizeof(b16) << "\n";
    std::cout << sizeof(b32) << "\n";
    std::cout << sizeof(b64) << "\n";
}
//----------------
prints:
8
8
8
8
The reason seems clear enough: the implementation stores the bits inline in an array of word-sized chunks (unsigned long in libstdc++), so the object's size is always rounded up to a whole number of words; std::bitset<65> occupies 16 bytes on the same machine, for example. Using one word-granular representation for every N is certainly the simplest implementation. But I think we can do better.
Now that <cstdint> is part of C++11, I would love to see the standard require the following:
If a given uintN_t type is present, an object of type std::bitset<N> shall occupy exactly N bits of storage, i.e. sizeof(std::bitset<N>) == sizeof(uintN_t), with no dynamic allocation.
If a matching type is not present on the system, the implementation is free to fall back to its current word-array representation.
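For illustration, here is a rough sketch of the kind of storage selection I have in mind, in C++11. The names (small_bitset, word_for) are hypothetical, it only handles N up to 64, and it is nowhere near a full std::bitset; it just shows that the size guarantee is easy to get:
//----------------
#include <cstdint>
#include <cstddef>

// Pick the smallest exact-width type that holds N bits (the primary
// template is a catch-all for this sketch; a real one would cover
// every N and reject N > 64 here).
template <std::size_t N> struct word_for { typedef std::uint64_t type; };
template <> struct word_for<8>  { typedef std::uint8_t  type; };
template <> struct word_for<16> { typedef std::uint16_t type; };
template <> struct word_for<32> { typedef std::uint32_t type; };

template <std::size_t N>
class small_bitset {
    typedef typename word_for<N>::type word;
    word bits_; // the only data member, so sizeof(small_bitset<N>) == sizeof(word)
public:
    small_bitset() : bits_(0) {}
    small_bitset& set(std::size_t pos)   { bits_ |= word(word(1) << pos); return *this; }
    small_bitset& reset(std::size_t pos) { bits_ &= word(~(word(1) << pos)); return *this; }
    bool test(std::size_t pos) const     { return (bits_ >> pos) & 1u; }
};

static_assert(sizeof(small_bitset<8>)  == 1, "one byte, as proposed");
static_assert(sizeof(small_bitset<64>) == 8, "one 64-bit word");
//----------------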
To me this has multiple benefits:
1. Meets the "efficiency while maintaining abstraction" mentality of C++. Why should I avoid std::bitset<64> just because a raw uint64_t is more efficient? I would *expect* them to be equally efficient.
2. Smaller memory footprint when using many std::bitset<> instances. An array of 100 std::bitset<8> objects occupies 800 bytes today (each element is rounded up to one 8-byte unsigned long), while under my proposal it would occupy exactly 100 bytes; see the snippet after this list.
3. Makes the type more useful for systems programming.
4. It's more efficient! When the underlying storage is a single native-width integer, the bit twiddling inside std::bitset's implementation becomes plain register operations, with generally less work than looping over an array of words.
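To make point 2 concrete (the figure of 800 assumes a typical 64-bit platform where unsigned long is 8 bytes; the exact number is implementation-specific):
//----------------
#include <array>
#include <bitset>
#include <iostream>

int main() {
    std::array<std::bitset<8>, 100> flags;
    // Prints 800 with today's libstdc++ on a 64-bit machine: each
    // element is padded out to one unsigned long. Under the proposal,
    // std::bitset<8> would be 1 byte and this would print 100.
    std::cout << sizeof(flags) << "\n";
}
//----------------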
I know that library writers have been free to do this basically forever and have evidently chosen not to. It seems like an obvious optimization, so perhaps they overlooked it, or perhaps they simply don't find it worth it. Either way, I think it would be nice to have the standard insist on an efficient implementation where one is possible. It's not a particularly hard specialization to write; I've made an experimental one myself in the past, which worked beautifully, and I can share it if an example is desired.
As a secondary improvement, I'd love to see std::bitset gain first_set_bit() and last_set_bit() members, which a particularly clever implementation could implement in O(1) with compiler intrinsics when dealing with these "native sized" types.
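A sketch of how that might look for a single 64-bit word, using the GCC/Clang builtins (the free-function names and the -1 "no bit set" convention here are just my suggestion, and a portable implementation would need a fallback loop):
//----------------
#include <cstdint>

// Index of the lowest/highest set bit, or -1 if no bit is set.
// __builtin_ctzll/__builtin_clzll are undefined for 0, hence the guard.
inline int first_set_bit(std::uint64_t w) {
    return w ? __builtin_ctzll(w) : -1;
}
inline int last_set_bit(std::uint64_t w) {
    return w ? 63 - __builtin_clzll(w) : -1;
}
//----------------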
Thoughts?