bitset search function

821 views
Skip to first unread message

jono...@googlemail.com

unread,
Jan 24, 2015, 2:31:05 PM1/24/15
to std-pr...@isocpp.org
Hi there,

I'd love to see a quick and efficient std::bitset search function: Example: Search for bitstate "true", starting from bit-index 0.

It could have an interface like this, maby:

template<size_t N>
size_t std::bitset<N>::find(size_t idx, bool bitstate=true);


The effect of      bs.find(idx, bitstate)
we search from idx (usually from 0, to start from the rightmost lsb bit), until we find the first index from which:
bs[idx] == bitstate
If no such index is found, return idx = std::bitset::npos     (e.g. equal to numeric_limits<size_t>::max())

Why?
Well some datastructures can profit from using bitset and providing and efficient search for a particular bitstate.
I'm particularly thinking of a memory pool here: Mark the mem-objects that are in use, in a bitset.
Then use something like this, to iterate and find the appropriate bit (e.g. in the destructor of the pool):

for (size_t idx = 0; (idx = bs.find(idx)) != bitset::npos; ++idx) {
   
    /* ... example: release resource idx */

    // unmark that bit
    bs.reset(idx);
}



The most important point is this: we can search incredibly fast in a bitset, while having a nice compressed representation.
In the current std::bitset, all we have is a compressed representation (if the library-designers decided to use 8 bits in a byte); but NO good search possibility.
And that's a pity.

Here's my example of an efficient search . This is just an example for a speedy search, for library designers who are interested. But yes: this example (just an example!) abuses a bitset, by assuming that the internal bit-representation is in the first member, and therefore only plays with libraries such as libstdc++, which do indeed put the bit-representation in the first member (link). But such a search really needs to be part of the bitset class, to hide complexities...

Regards,
John

Howard Hinnant

unread,
Jan 24, 2015, 4:10:19 PM1/24/15
to std-pr...@isocpp.org
> Here's my example of an efficient search . This is just an example for a speedy search, for library designers who are interested. But yes: this example (just an example!) abuses a bitset, by assuming that the internal bit-representation is in the first member, and therefore only plays with libraries such as libstdc++, which do indeed put the bit-representation in the first member (link). But such a search really needs to be part of the bitset class, to hide complexities…

I very much like the functionality you describe here. But what if we gave bitset iterators, and then made the interface look like:

template <size_t N>
bitset<N>::const_iterator
find(bitset<N>::const_iterator first, bitset<N>::const_iterator last, bool value);

This would mimic in syntax the generic function:

template <class InputIterator, class T>
InputIterator
find(InputIterator first, InputIterator last, const T& value);

Then generic code could search for things, be they bits or whatever, with the same syntax and maximal efficiency for each type.

A detail here is that bitset<N>::const_iterator would have to be an alias for an iterator defined at namespace scope, as the N is not in a deducible context the way I’ve written it. I /believe/ that such an iterator could be designed to be independent of N. I also believe that the algorithm with the iterator interface would could still look similar to your example efficient search.

Here is an example implementation of it with the iterator interface:

http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_reference

The __bit_iterator at this site is not necessarily the best design of bitset<N>::iterator. But it illustrates the concept.

Here is some performance data on these algorithms:

http://howardhinnant.github.io/onvectorbool.html

Howard

jono...@googlemail.com

unread,
Jan 24, 2015, 7:10:42 PM1/24/15
to std-pr...@isocpp.org

Hi Howard,

thanks for your reply.

You're definitely right. Using find and iterators is the way to do this. The C++ way!
I've given it a go...

As a very basic proof of concept. ;)

(Please realize: I'm probably way out of league here... This is how one codes after just having almost finished a dedicated reading of Stroustup's rather 'basic'  "Principles and Practice" book... and having done most of the book exercises.
In particular: I'm wondering about const_iterator (it's not really mentioned in "Principles and Practice"), so any recommendations or tips will be appreciated.)

C++ wizards will certainly be able to improve this!

/////////////////////////////////////////////////////////////////
#include <iostream>
#include <bitset>
#include <climits>
#include <cassert>

using namespace std;

int find_lowest_idx_set_in_ulong(unsigned long val);




struct Bititerator {
  Bititerator(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1}, quot{quot1}, rem{rem1} {}

  constexpr static int num_bits_ulong = sizeof(unsigned long) * CHAR_BIT;
 
  Bititerator& operator++()
  {
    ++rem;
    if (rem == num_bits_ulong) {
      ++quot;
      rem = 0;
    }
    return *this;
  }

  Bititerator& operator--()
  {
    if (rem == 0) {
      --quot;
      rem = num_bits_ulong-1;
    } else {
      --rem;
    }
    return *this;
  }

 
  bool operator==(const Bititerator& it) const {
    return ((mem  == it.mem)  &&
            (quot == it.quot) &&
            (rem  == it.rem));
  }

  bool operator!=(const Bititerator& it) const {
    return !(*this == it);
  }

  bool operator*() { return mem[quot] & (1UL << rem); }
 
  friend Bititerator find(Bititerator first, Bititerator last, const bool bitstate);

  friend int operator-(const Bititerator& a, const Bititerator& b);
   
private:
  // the current bit is (mem[quot] & (1UL << rem)), which has index (quot * (num_bits_ulong) + rem)
 
  unsigned long *mem;
  size_t quot;
  size_t rem;
};


Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
/* bitstate not yet taken into account.
   This algorithm (currently) always takes bitstate == true, i.e only search for set true bits

   Just a demonstration...
   Todo: handle bitstate...
 */
{
  if (first == last)
    return last;
 
  using ulong = unsigned long;

  size_t lim_quot;
  size_t lim_rem;
  if (last.rem > 0) {
    lim_quot = last.quot;
    lim_rem  = last.rem-1;
  } else {
    lim_quot = last.quot-1;
    lim_rem  = Bititerator::num_bits_ulong-1;
  }

  ulong val = first.mem[first.quot] & (-1UL << first.rem);

  if (first.quot < lim_quot) {
    if (val) {
      first.rem = find_lowest_idx_set_in_ulong(val);
      return first;
    }
 
    while (++first.quot < lim_quot) {
      if (first.mem[first.quot]) {
        first.rem = find_lowest_idx_set_in_ulong(val);
        return first;
      }
    }

    val = first.mem[first.quot];
  }

  val &= (-1UL >> (Bititerator::num_bits_ulong - lim_rem - 1));
  if (val) {
    first.rem = find_lowest_idx_set_in_ulong(val);
    return first;
  }

  return last; // not found
}


int operator-(const Bititerator& a, const Bititerator& b)
{
  return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
          (b.quot * (Bititerator::num_bits_ulong) + b.rem));
}




#if (ULONG_MAX == 4294967295UL)

// unsigned long has 32 bits
static_assert(sizeof(unsigned long) == 4, "unsigned long must have 4 bytes");

static const int MultiplyDeBruijnBitPosition[32] =
  {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };

#else
#if (ULONG_MAX == 18446744073709551615UL)

// unsigned long has 64 bits
static_assert(sizeof(unsigned long) == 8, "unsigned long must have 8 bytes");

static const int MultiplyDeBruijnBitPosition[64] =
  {
    0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
    54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
    46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
    25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
  };

#else

static_assert(0, "strange machine architecture");

#endif
#endif

int find_lowest_idx_set_in_ulong(unsigned long val)
// only call this function if val does definately have a bit that is set
{
#if (ULONG_MAX == 4294967295UL)

  // unsigned long has 32 bits
 
  // Reference: http://stackoverflow.com/a/757266 ( http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup )
  return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >> 27];

 
#else
#if (ULONG_MAX == 18446744073709551615UL)

  // unsigned long has 64 bits

  // return __builtin_ctzl(val);

  // Reference: http://stackoverflow.com/a/24146182 ( http://chessprogramming.wikispaces.com/BitScan#KimWalisch )
  return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) * 0x03F79D71B4CB0A89UL)) >> 58];

#endif
#endif 
}


template<size_t N> class Mybitset : public bitset<N> {
public:
  using bitset<N>::bitset;

  constexpr static int num_bits_ulong = sizeof(unsigned long) * CHAR_BIT;

  using iterator = Bititerator;
 
  Bititerator begin()
  {
    assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned *>(this)) % sizeof(unsigned long)) == 0); /* check alignment */
    return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
  }
  Bititerator end()
  {
    assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned *>(this)) % sizeof(unsigned long)) == 0); /* check alignment */
    size_t quot = N / num_bits_ulong;
    return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N - quot*num_bits_ulong};
  }
};


int main()
{
  Mybitset<10> x;
  x.set(0);
  x.set(3);

  // print bitpattern of x using iterators
  if (x.begin() != x.end()) {
    Mybitset<10>::iterator it = x.end();
    do {
      --it;
      cout << *it;
    } while (it != x.begin());
  }
  cout << "\n\n";

  // search for bit
  for (; true; x <<= 1) {
    auto it = find(x.begin(), x.end(), true);
    cout << x << '\t';
    if (it != x.end())
      cout << (it - x.begin()) << "\n";
    else
      cout << "not found\n";

    if (x.none())
      break;
  }
 
  return 0;
}
/////////////////////////////////////////////////////////////////


Regards,
John

Howard Hinnant

unread,
Jan 24, 2015, 7:22:37 PM1/24/15
to std-pr...@isocpp.org
Looks like an excellent start to me John!

Howard

David Krauss

unread,
Jan 25, 2015, 12:27:34 AM1/25/15
to std-pr...@isocpp.org

On 2015–01–25, at 8:10 AM, jono...@googlemail.com wrote:

But what if we gave bitset iterators,

Implementations should be able to reuse std::vector<bool>::iterator. If array and vector<bool> both have iterators, then bitset should too.

Also, referring to GCC’s standard library, SGI already defined bitset::find_first and bitset::find_next functions; these were just never standardized. (They are available in GCC as _Find_first and _Find_next.) Such functions could be adapted into optimized std::find overloads.

This is really a job for the standard library, because intrinsic count-leading-zeroes functionality is going to be faster than any bitwise arithmetic tricks.

Finding a bit-string in a bit-sequence with std::search or comparing bit-strings with std::equal or std::mismatch are also interesting operations with applications in data compression. I wonder if any library already optimizes these. GCC does not, so I guess it’s all rather niche.

jono...@googlemail.com

unread,
Jan 25, 2015, 10:34:30 AM1/25/15
to std-pr...@isocpp.org, jono...@googlemail.com


Here's an improvement of the above code:

In find, I got rid of lim_quot and lim_rem (which was used above)


What else has changed?

Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
now calls either
inline Bititerator find_1(Bititerator first, Bititerator last)
(this one is superfast!!! On my machine MultiplyDeBruijnBitPosition actually seemed slightly faster than __builtin_ctzl(val))
or
inline Bititerator find_0(Bititerator first, Bititerator last)
(this one basically inverts words and then uses the same search as find_1 uses on words. this can probably be improved: any fast "count trailing ones" available??)



/////////////////////////////////////////////////////////////////
#include <iostream>
#include <bitset>
#include <climits> /* CHAR_BIT */
#include <cassert>
#include <cstddef> /* ptrdiff_t */
  friend inline Bititerator find_1(Bititerator first, Bititerator last);
  friend inline Bititerator find_0(Bititerator first, Bititerator last);

  friend ptrdiff_t operator-(const Bititerator& a, const Bititerator& b);

   
private:
  // the current bit is (mem[quot] & (1UL << rem)), which has index (quot * (num_bits_ulong) + rem)
 
  unsigned long *mem;
  size_t quot;
  size_t rem;
};


inline Bititerator find_1(Bititerator first, Bititerator last)
{
  /*
  // optional, so gain efficiency by leaving this

  if (first == last)
    return last;
  */

  using ulong = unsigned long;

  ulong val = first.mem[first.quot] & (-1UL << first.rem);

  if (first.quot < last.quot) {

    if (val) {
      first.rem = find_lowest_idx_set_in_ulong(val);
      return first;
    }
 
    while (++first.quot < last.quot) {

      if (first.mem[first.quot]) {
        first.rem = find_lowest_idx_set_in_ulong(val);
        return first;
      }
    }

    val = first.mem[first.quot];
  }

  if (last.rem) { /* avoid last.rem == 0,
                     otherwise (Bititerator::num_bits_ulong - last.rem) >= width of type
                     and the right-shift will then misbehave */
    val &= (-1UL >> (Bititerator::num_bits_ulong - last.rem));

    if (val) {
      first.rem = find_lowest_idx_set_in_ulong(val);
      return first;
    }
  }
 
  return last; // not found
}

inline Bititerator find_0(Bititerator first, Bititerator last)
{
  /*
  // optional, so gain efficiency by leaving this

  if (first == last)
    return last;
  */

  using ulong = unsigned long;

  ulong val = first.mem[first.quot] & (-1UL << first.rem);
  constexpr ulong fail = -1UL; // fail with all 1's -- we're looking for 0's here
 
  if (first.quot < last.quot) {
    val = ~val;

    if (val) {
      first.rem = find_lowest_idx_set_in_ulong(val);
      return first;
    }
 
    while (++first.quot < last.quot) {
      val = ~first.mem[first.quot];

      if (val) {
        first.rem = find_lowest_idx_set_in_ulong(val);
        return first;
      }
    }

    val = first.mem[first.quot];
  }

  if (last.rem) { /* avoid last.rem == 0,
                     otherwise (Bititerator::num_bits_ulong - last.rem) >= width of type
                     and the right-shift will then misbehave */
    val &= (-1UL >> (Bititerator::num_bits_ulong - last.rem));
    val = ~val;

    if (val) {
      first.rem = find_lowest_idx_set_in_ulong(val);
      return first;
    }
  }
 
  return last; // not found
}


Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
{
  if (bitstate)
    return find_1(first, last);
  return find_0(first, last);
}

ptrdiff_t operator-(const Bititerator& a, const Bititerator& b)

{
  return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
          (b.quot * (Bititerator::num_bits_ulong) + b.rem));
}




#if (ULONG_MAX == 4294967295UL)

// unsigned long has 32 bits
static_assert(sizeof(unsigned long) == 4, "unsigned long must have 4 bytes");

static const int MultiplyDeBruijnBitPosition[32] =
  {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };

#else
#if (ULONG_MAX == 18446744073709551615UL)

// unsigned long has 64 bits
static_assert(sizeof(unsigned long) == 8, "unsigned long must have 8 bytes");

static const int MultiplyDeBruijnBitPosition[64] =
  {
    0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
    54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
    46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
    25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
  };

#else

static_assert(0, "strange machine architecture");

#endif
#endif



inline int find_lowest_idx_set_in_ulong(unsigned long val)

// only call this function if val does definately have a bit that is set
{
#if (ULONG_MAX == 4294967295UL)

  // unsigned long has 32 bits

#ifdef __GNUC__
  return __builtin_ctz(val);
#else

  // Reference: http://stackoverflow.com/a/757266 ( http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup )
  return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >> 27];
#endif

 
#else
#if (ULONG_MAX == 18446744073709551615UL)

  // unsigned long has 64 bits

#ifdef __GNUC__
  return __builtin_ctzl(val);
#else

  // Reference: http://stackoverflow.com/a/24146182 ( http://chessprogramming.wikispaces.com/BitScan#KimWalisch )
  return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) * 0x03F79D71B4CB0A89UL)) >> 58];
#endif
 
#endif
#endif 
}


template<size_t N> class Mybitset : public bitset<N> {
public:
  using bitset<N>::bitset;

  constexpr static int num_bits_ulong = sizeof(unsigned long) * CHAR_BIT;

  using iterator = Bititerator;
 
  Bititerator begin()
  {
    assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned *>(this)) % sizeof(unsigned long)) == 0); /* check alignment */
    return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
  }
  Bititerator end()
  {
    assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned *>(this)) % sizeof(unsigned long)) == 0); /* check alignment */
    size_t quot = N / num_bits_ulong;
    return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N - quot*num_bits_ulong};
  }
};


int main()
{
  Mybitset<50> x;

  x.set(0);
  x.set(3);

  // print bitpattern of x using iterators
  if (x.begin() != x.end()) {
    Mybitset<10>::iterator it = x.end();
    do {
      --it;
      cout << *it;
    } while (it != x.begin());
  }
  cout << "\n\n";

  // search for 1-bit
  cout << "search for 1\n"
       << "============\n";
  auto it_end = x.end();        //  --(--(x.end()));

  for (; true; x <<= 1) {
    auto it = find(x.begin(), it_end, true);
    cout << x << '\t';
    if (it != it_end)

      cout << (it - x.begin()) << "\n";
    else
      cout << "not found\n";

    if (x.none())
      break;
  }

  // search for 0-bit
  cout << "\n"
       << "search for 0\n"
       << "============\n";
  x.reset();
  x.flip();
  it_end = x.end();        //  --(--(x.end()));

  for (; true; x >>= 1) {
    auto it = find(x.begin(), it_end, false);
    cout << x << '\t';
    if (it != it_end)

      cout << (it - x.begin()) << "\n";
    else
      cout << "not found\n";

    if (x.none())
      break;
  }


  constexpr unsigned num = 100000;
  cout << "\n"
       << "searching " << num+1 << " times through a Mybitset<" << num << "> with shifting bits\n"
       << "(please wait for a few minor seconds...)\n";
  Mybitset<100000> y;
  y.set(0);
  for (; true ; y <<=1 ) {
    auto it = find(y.begin(), y.end(), true);
    /*
    if (it != y.end()) {
      cout << (it - y.begin()) << " ";

    }
    else {
      cout << "not found\n";
    }
    */
    if (y.none())
      break;
  }
  return 0;
}

/////////////////////////////////////////////////////////////////



 

Howard Hinnant

unread,
Jan 25, 2015, 4:13:18 PM1/25/15
to std-pr...@isocpp.org
On Jan 25, 2015, at 12:27 AM, David Krauss <pot...@gmail.com> wrote:

>
>> On 2015–01–25, at 8:10 AM, jono...@googlemail.com wrote:
>>
>> But what if we gave bitset iterators,
>
> Implementations should be able to reuse std::vector<bool>::iterator. If array and vector<bool> both have iterators, then bitset should too.

Agreed. And reuse the vector<bool>::reference as well.

> Also, referring to GCC’s standard library, SGI already defined bitset::find_first and bitset::find_next functions; these were just never standardized. (They are available in GCC as _Find_first and _Find_next.) Such functions could be adapted into optimized std::find overloads.
>
> This is really a job for the standard library, because intrinsic count-leading-zeroes functionality is going to be faster than any bitwise arithmetic tricks.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3864.html is a proposal to /finally/ make these intrinsics available. It is long overdue.

>
> Finding a bit-string in a bit-sequence with std::search or comparing bit-strings with std::equal or std::mismatch are also interesting operations with applications in data compression. I wonder if any library already optimizes these. GCC does not, so I guess it’s all rather niche.

libc++ optimizes:

swap, find, count, fill, fill_n, copy, copy_backward, move, move_backward, swap_ranges, rotate and equal. No doubt there is more work to be done in this area.

Howard

Reply all
Reply to author
Forward
0 new messages