Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Counting the number of bits in a 32-bit word.

37 views
Skip to first unread message

George Marsaglia

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
Aside from a massive-table lookup, what is the fastest, or some of the
fastest,
ways to count the number of bits in a 32-bit unsigned long?
I recall we had various folding strategies with small-table lookups in
the 1960's ,
but have forgotten details. Some of the machines had an instruction
that did it.
But what are fast ways using C?
George Marsaglia

Alex Krol

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
In article <3A2F8DF2...@stat.fsu.edu>,


Lookup table for a 8-bit byte contains only 256 entries.
So you can easily count number of bits in every byte of your
unsigned long separately (combining shifts and masking).

--
Regards,
Alex Krol
Disclaimer: I'm not speaking for CreoScitex, Creo and/or Scitex Corp.
C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html


Sent via Deja.com http://www.deja.com/
Before you buy.

Christian Bau

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
In article <3A2F8DF2...@stat.fsu.edu>, George Marsaglia
<g...@stat.fsu.edu> wrote:

> Aside from a massive-table lookup, what is the fastest, or some of the
> fastest,
> ways to count the number of bits in a 32-bit unsigned long?

Processor and implementation dependent. Using a massive lookup table is
probably among the slower approaches. I'll assume that you want to do this
operation for many values, otherwise who cares what is the fastest method.


On many processors, using a lookuptable with 2048 entries will be the
fastest; use this to get the number of bits in x & 0x7ff, (x >> 11) &
0x7ff and (x >> 22) & 0x3ff and add the results. On a PowerPC that is 8
operations total; quite hard to beat.

The other fast method: A 32 bit unsigned long can hold 16 2-bit numbers.
Take (x & 0x55555555) and ((x >> 1) & 0x55555555), interpret each as 16
2-bit numbers with a value of 0 or 1. The number of bits in x is equal to
the sum of these 32 numbers. Adding both gives you a result of 16 2-bit
numbers with values of 0, 1 or 2.

Call the result y. Split y into 2 32 bit numbers, each holding 8 4-bit
numbers: (y & 0x33333333) and ((y >> 2) & 0x33333333), add these and you
have 8 4-bit numbers in the range 0 to 4. Call the result z. Add (z >> 4)
+ z, then you have 4 4bit numbers in the range 0 to 8 plus 4 4bit numbers
that contain garbage. Go on like this.

This approach obviously gets better if you need the total bitcount of more
than one 32 bit number. For example, if you have 32 such numbers: 32 shift
+ 64 and produce 64 times 16 2-bit numbers with values 0 and 1; with 42
adds you are down to 22 times 16 2-bit numbers from 0 to 3. 22 shift + 44
and gives 44 times 8 4 bit numbers, 35 adds give 9 times 8 4bit numbers
from 0 to 15. 9 shifts and 18 ands give 18 times 4 8-bit numbers, 16 adds
and you are down to 2 times 4 8 bit numbers from 0 to 135. Another 2
shift, 4 and, 3 add to get a 32 bit integer containing 2 16bit numbers,
another shift/and/add to get the complete bitcount. That is 66 shifts, 131
and, 97 adds or roughly 9 operations per 32 bit unsigned long. A
lookuptable looks quite competitive; things change if you have 64 bit or
128 bit operations.

> I recall we had various folding strategies with small-table lookups in
> the 1960's ,
> but have forgotten details. Some of the machines had an instruction
> that did it.
> But what are fast ways using C?

> George Marsaglia

David Rubin

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
George Marsaglia wrote:
>
> Aside from a massive-table lookup, what is the fastest, or some of the
> fastest, ways to count the number of bits in a 32-bit unsigned long?

There are 32...Oh, maybe you want to know the number of 1 bits...

How about splitting the 32-bit value into 4 8-bit values and using a 256-element table lookup, adding the results?

> I recall we had various folding strategies with small-table lookups in
> the 1960's , but have forgotten details.

This sounds like it could be what I've suggested. I really don't know.

> Some of the machines had an instruction that did it.

Not standard...

> But what are fast ways using C?

"Fastest" will have to measured and evaluated by you...

david


--
FORTRAN was the language of choice for the same reason that three-legged races are popular.
-- Ken Thompson, "Reflections on Trusting Trust"

Dann Corbit

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
"cja" <cj_a...@my-deja.com> wrote in message
news:90o97n$f98$1...@nnrp1.deja.com...

> In article <3A2F8DF2...@stat.fsu.edu>,
> George Marsaglia <g...@stat.fsu.edu> wrote:
>
> > ... ways to count the number of bits ... ?
> >
> Check http://www.snippets.org/#section1group14 for various techniques.
>


There is a far more complete list here:
ftp://cap.connx.com/pub/bitcount/BITC.ZIP

BITC.ZIP also includes some of the brilliant techniques displayed by Chris
Torek (which are faster than any of the snippets ones on all of my machines).

For a very fast Assembly version, you could examine the popcnt() function in
Crafty's source code at the University of Alabama at Birmingham.
--
C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
"The C-FAQ Book" ISBN 0-201-84519-9
C.A.P. FAQ: ftp://cap.connx.com/pub/Chess%20Analysis%20Project%20FAQ.htm

Francois Grieu

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
"Dann Corbit" <dco...@solutionsiq.com> wrote:

> There is a far more complete list here:
> ftp://cap.connx.com/pub/bitcount/BITC.ZIP
>
> BITC.ZIP also includes some of the brilliant techniques displayed by Chris
> Torek (which are faster than any of the snippets ones on all of my machines).


This is a gem ! I'm amazed ! My favorite pick:

/*
* This function counts the number of bits in a long.
* It is limited to 63 bit longs, but a minor mod can cope with 511 bits.
*
* It is so magic, an explanation is required:
* Consider a 3 bit number as being
* 4a+2b+c
* if we shift it right 1 bit, we have
* 2a+b
* subtracting this from the original gives
* 2a+b+c
* if we shift the original 2 bits right we get
* a
* and so with another subtraction we have
* a+b+c
* which is the number of bits in the original number.
* Suitable masking allows the sums of the octal digits in a
* 32 bit number to appear in each octal digit. This isn't much help
* unless we can get all of them summed together.
* This can be done by modulo arithmetic (sum the digits in a number by molulo
* the base of the number minus one) the old "casting out nines" trick they
* taught in school before calculators were invented.
* Now, using mod 7 wont help us, because our number will very likely have
* more than 7 bits set. So add the octal digits together to get base64
* digits, and use modulo 63.
* (Those of you with 64 bit machines need to add 3 octal digits together to
* get base512 digits, and use mod 511.)
*
* This is HACKMEM 169, as used in X11 sources.
*/
int
t0_hackmemmod (unsigned long n)
{
unsigned long tmp;

tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
return ((tmp + (tmp >> 3)) & 030707070707) % 63;
}

Francois Grieu

Martin Peach

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to

Francois Grieu <fgr...@micronet.fr> wrote in message
news:fgrieu-1289A2....@news.cybercable.fr...

> George Marsaglia <g...@stat.fsu.edu> wrote:
>
> > Aside from a massive-table lookup, what is the fastest, or some
> > of the fastest, ways to count the number of bits in a 32-bit
> > unsigned long?
>
>
> /* a simple, compact and relatively fast classic */
> /* compressed from <http://www.snippets.org/BITCNT_1.C> */
> int popcount1(unsigned long x)
> {int n=0;if(x)do++n;while(x&=x-1);return n;}
>
>
> /* a classic with some twists */
> /* improved from <http://www.snippets.org/BITCNT_2.C> */
> /* then with additional contribution from Christian Bau */
> int popcount3(unsigned long x)
> {
> /* count on fields 2,4, then 8 bit wide */
> /* using the same mask twice may help on some architectures */
> x = ((x>>1)&0x55555555)+(x&0x55555555);
> x = ((x>>2)&0x33333333)+(x&0x33333333);
> /* in the next step only one mask is needed */
> x = ((x>>4)+x)&0x0f0f0f0f;
> /* from now on each field is wide enough for the end result */
> /* quickly reduce the width we work on to 16, then 8 bits */
> x += x>>16;
> /* finishing with the 8 bit shift gives a better chance of */
> /* optimization to smart 8 bit compilers */
> return ((x>>8)+x)&255;
> }

It is not clear that these are any faster than the intuitively obvious:
int count(unsigned long x)
{
int count = x & 1;
while (x>>1) if (x & 1) ++count;
return count;
}
(Just typing the code on a single line doesn't make it run any faster :))

/\/\/\/*= Martin

Dan Pop

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
In <3A2FBC40...@gecm.com> des walker <des.w...@gecm.com> writes:

>Dan Pop wrote:
>
><snip>
>
>>
>> while (foo &= foo - 1) count++;
>>
>
></snip>
>
>I thought this looked worth remembering, I hope you don't mind me pointing out the
>count is out by 1 if foo is non-zero :( Possibly replace with,
>
>while(foo) {
> foo &= foo - 1;
> count++;
>}

Indeed!

Dan
--
Dan Pop
CERN, IT Division
Email: Dan...@cern.ch
Mail: CERN - IT, Bat. 31 1-014, CH-1211 Geneve 23, Switzerland

Morris Dovey

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
Francois Grieu wrote:

> This is a gem ! I'm amazed ! My favorite pick:

Francois...

I agree! It's elegant.

What would happen if, instead of working with 3-bit numbers (4a + 2b + c), we used
two-bit numbers (2a + b)?

If we shifted right one bit we'd have
a
Which, if subtracted from the original would give
a + b
Which, again, would be the number of bits in the original. Each 2-bit bit count
would now be in the place there the counted bits used to be -- and so no masking is
required to isolate the counts.

Now I'm stumped because I didn't learn the "casting out nines" trick in school.
Could you guide me through the logic needed to sum these counts?
--
Morris Dovey
West Des Moines, Iowa USA
mrd...@iedu.com

Chris Torek

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
In article <29RX5.397$RS4.499@client>

Dann Corbit <dco...@solutionsiq.com> writes:
>BITC.ZIP also includes some of the brilliant techniques displayed by Chris
>Torek (which are faster than any of the snippets ones on all of my machines).

Credit where due: most of the functions in there (like the Amazing
HAKMEM Trick) are someone else's. The table lookup versions are
largely mine, and happened to be the fastest on the various machines
on which I tested this stuff, whenever that was.
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA, USA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.

Tom Dailey

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
George Marsaglia wrote:
Aside from a massive-table lookup, what is the fastest, or some of the
fastest,
ways to count the number of bits in a 32-bit unsigned long?
I recall we had various folding strategies with small-table lookups in
the 1960's ,
but have forgotten details.  Some of the machines had an instruction
that did it.
But what are fast ways using C?
George Marsaglia
The table lookup is the good way to do it. A 64K table of ints is really
pretty small, memory being as cheap as it is. Split the 32-bit int
into two 16-bit ints, look them both up in the 64K table, and add
the two results. (If you're really obsessive, use unsigned chars instead
of ints for the table.) Something like the following should work.

************************************************************************

#include    <limits.h>

typedef    unsigned short int    unsigned_16_bit_int_t;
typedef    unsigned int          unsigned_32_bit_int_t;

int    Nr_of_bits_in_16_bit_int  [ USHRT_MAX ];

int    nr_of_bits_in_16_bit_int   /* ...the slow way, for initializing */
(
     unsigned_16_bit_int_t    n
) {
    int                      nr_of_bits;
    unsigned_16_bit_int_t    shift_register;

    shift_register  =  n;
    nr_of_bits      =  0;
    while ( shift_register != 0 ) {
        if ( ( shift_register & 0x8000 ) != 0 ) {
            nr_of_bits  +=  1;
        }
        shift_register  <<=  1;
    }
    return  nr_of_bits;
}

void    initialize_bit_counting_machinery
(
     void
) {
    unsigned_16_bit_int_t    n;

    for ( n = 0;  n < USHRT_MAX;  n += 1 ) {
        Nr_of_bits_in_16_bit_int [ n ]  =  nr_of_bits_in_16_bit_int ( n );
    }
}

int    nr_of_bits_in_32_bit_int
(
    unsigned_32_bit_int_t    n
) {
    unsigned_16_bit_int_t    upper_half_of_n  =  n >> 16;
    unsigned_16_bit_int_t    lower_half_of_n  =  n & 0x0000ffff;

    return  Nr_of_bits_in_16_bit_int [ upper_half_of_n ] +
            Nr_of_bits_in_16_bit_int [ lower_half_of_n ];
}

*************************************************************************
 
 
 
 
 

Keith Thompson

unread,
Dec 7, 2000, 3:00:00 AM12/7/00
to
George Marsaglia <g...@stat.fsu.edu> writes:
> Aside from a massive-table lookup, what is the fastest, or some of
> the fastest, ways to count the number of bits in a 32-bit unsigned
> long?

printf("32\n");

(Unless you meant the number of one bits.)

--
Keith Thompson (The_Other_Keith) k...@cts.com <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Welcome to the last year of the 20th century.

cja

unread,
Dec 7, 2000, 10:09:48 AM12/7/00
to
In article <3A2F8DF2...@stat.fsu.edu>,
George Marsaglia <g...@stat.fsu.edu> wrote:

> ... ways to count the number of bits ... ?
>
Check http://www.snippets.org/#section1group14 for various techniques.

\\//
Chris Adamo
ad...@computer.org

Dan Pop

unread,
Dec 7, 2000, 10:38:38 AM12/7/00
to
In <3A2F8DF2...@stat.fsu.edu> George Marsaglia <g...@stat.fsu.edu> writes:

>Aside from a massive-table lookup, what is the fastest, or some of the
>fastest,
>ways to count the number of bits in a 32-bit unsigned long?

>I recall we had various folding strategies with small-table lookups in
>the 1960's ,
>but have forgotten details. Some of the machines had an instruction
>that did it.

Some processors still have a "population count" instruction that does
exactly what you need. Unfortunately, I can't figure out any way to
write a piece of portable C code that would get translated into one such
instruction, if available on the target processor.

>But what are fast ways using C?

It depends. If you don't need constant execution time and if there aren't
many bits set in the variable, the good old trick may be what you want:

unsigned long foo = ...;
int count = 0;

while (foo &= foo - 1) count++;

Otherwise, you can use the lookup table approach:

int table[16] = {0, 1, ... 4};

while (foo) {
count += table[foo & 0xF];
foo >>= 4;
}

Of course, you can increase the size of the lookup table, according to
your needs. Both methods make no assumptions about the size of the
variable, they only rely on it being an unsigned type. You can use any
size which is a multiple of 2 for the lookup table.

Ben Pfaff

unread,
Dec 7, 2000, 10:47:09 AM12/7/00
to
Dan...@cern.ch (Dan Pop) writes:

[counting 1-bits in a byte]

> Otherwise, you can use the lookup table approach:
>
> int table[16] = {0, 1, ... 4};
>
> while (foo) {
> count += table[foo & 0xF];
> foo >>= 4;
> }
>
> Of course, you can increase the size of the lookup table, according to
> your needs. Both methods make no assumptions about the size of the
> variable, they only rely on it being an unsigned type. You can use any
> size which is a multiple of 2 for the lookup table.

^^^^^^^^^^^^^ power of 2?

Mathew Hendry

unread,
Dec 7, 2000, 11:11:27 AM12/7/00
to
George Marsaglia <g...@stat.fsu.edu> wrote:

: [population count]
:
: what are fast ways using C?

You'll find a few different approaches at

http://www.snippets.org

--

Mathew Hendry, Programmer, Visual Sciences Ltd.
Work <ma...@vissci.com>, Home <sca...@dial.pipex.com>

des walker

unread,
Dec 7, 2000, 11:35:13 AM12/7/00
to
Dan Pop wrote:

<snip>

>
> while (foo &= foo - 1) count++;
>

</snip>

I thought this looked worth remembering, I hope you don't mind me pointing out the
count is out by 1 if foo is non-zero :( Possibly replace with,

while(foo) {
foo &= foo - 1;
count++;
}

Des Walker
Alenia-Marconi Systems

Francois Grieu

unread,
Dec 7, 2000, 11:40:17 AM12/7/00
to
George Marsaglia <g...@stat.fsu.edu> wrote:

> Aside from a massive-table lookup, what is the fastest, or some
> of the fastest, ways to count the number of bits in a 32-bit
> unsigned long?

/* a simple, compact and relatively fast classic */
/* compressed from <http://www.snippets.org/BITCNT_1.C> */
int popcount1(unsigned long x)
{int n=0;if(x)do++n;while(x&=x-1);return n;}


/* a classic with some twists */
/* improved from <http://www.snippets.org/BITCNT_2.C> */

int popcount2(unsigned long x)


{
/* count on fields 2,4, then 8 bit wide */
/* using the same mask twice may help on some architectures */

x = (x>>1&0x55555555)+(x&0x55555555);
x = (x>>2&0x33333333)+(x&0x33333333);
x = (x>>4&0x0f0f0f0f)+(x&0x0f0f0f0f);


/* from now on each field is wide enough for the end result */

x += x>>16;
/* only the 16 low bits matter now; fishing with the 8 bit shift */
/* gives a better chance of optimization to smart 8 bit compilers */
return ((x>>8)+x)&255;
}


Another option is adding the result of 3 lookup in a 2 kByte
(11 bit wide) lookup table, which will nicely fit the L1 cache
of modern CPUs.


Francois Grieu

Dan Pop

unread,
Dec 7, 2000, 11:12:55 AM12/7/00
to

This is what I had in mind when I wrote "multiple" :-)

Jeff Jacoby

unread,
Dec 7, 2000, 11:51:59 AM12/7/00
to
On 7 Dec 2000 15:38:38 GMT, Dan Pop wrote:

[snip]

> It depends. If you don't need constant execution time and if there aren't
> many bits set in the variable, the good old trick may be what you want:
>
> unsigned long foo = ...;
> int count = 0;
>
> while (foo &= foo - 1) count++;

Are you sure about this? I've tried it by hand
and a sample program, and I get one less than
the actual number of set bits in foo, eg. if
foo is 7, count would be 2.

Or have I done something wrong (see below)?


Jeff

The program is:

#include <stdio.h>
int main(void) {
unsigned int f, f1, c;

printf( "Enter value: \n" );
scanf( "%u", &f );

f1 = f;
c = 0;

while ( f &= f - 1 ) c++;

printf( "Set bits in %u = %u\n", f1, c );

return 0;
}

Francois Grieu

unread,
Dec 7, 2000, 12:10:28 PM12/7/00
to
George Marsaglia <g...@stat.fsu.edu> wrote:

> Aside from a massive-table lookup, what is the fastest, or some
> of the fastest, ways to count the number of bits in a 32-bit
> unsigned long?


/* a simple, compact and relatively fast classic */
/* compressed from <http://www.snippets.org/BITCNT_1.C> */
int popcount1(unsigned long x)
{int n=0;if(x)do++n;while(x&=x-1);return n;}


/* a classic with some twists */
/* improved from <http://www.snippets.org/BITCNT_2.C> */

/* then with additional contribution from Christian Bau */

int popcount3(unsigned long x)


{
/* count on fields 2,4, then 8 bit wide */
/* using the same mask twice may help on some architectures */

x = ((x>>1)&0x55555555)+(x&0x55555555);
x = ((x>>2)&0x33333333)+(x&0x33333333);
/* in the next step only one mask is needed */
x = ((x>>4)+x)&0x0f0f0f0f;

/* from now on each field is wide enough for the end result */

/* quickly reduce the width we work on to 16, then 8 bits */
x += x>>16;
/* finishing with the 8 bit shift gives a better chance of */
/* optimization to smart 8 bit compilers */
return ((x>>8)+x)&255;
}


Another option is adding the result of 3 lookup in a 2 kByte
(11 bit wide) lookup table, which will nicely fit the L1 cache
of modern CPUs.


Francois Grieu
[revised post]

-hs-

unread,
Dec 7, 2000, 6:51:12 PM12/7/00
to
George Marsaglia a écrit dans le message <3A2F8DF2...@stat.fsu.edu>...

>Aside from a massive-table lookup, what is the fastest, or some of the
>fastest,
>ways to count the number of bits in a 32-bit unsigned long?
>I recall we had various folding strategies with small-table lookups in
>the 1960's ,
>but have forgotten details. Some of the machines had an instruction
>that did it.
>But what are fast ways using C?

What's wrong with shift and mask?

#include <stddef.h>
#include <limits.h>

size_t nbits (unsigned long value)
{
size_t n = 0;
unsigned long m;

for (m = value; m; m >>= 1)
{
if (m & 1)
{
n++;
}
}
return n;
}

#ifdef TEST

#include <stdio.h>
int main (void)
{
typedef struct
{
unsigned long v;
size_t n;
} test_t;

static test_t atest[]=
{
{ 0x00000000LU, 1} /* error tester */
,{ 0x00000000LU, 0}
,{ 0x00000001LU, 1}
,{ 0x80000000LU, 1}
,{~0x00000000LU, CHAR_BIT * sizeof(unsigned long)}
,{~0x00000001LU, CHAR_BIT * sizeof(unsigned long)- 1}
,{~0x80000000LU, CHAR_BIT * sizeof(unsigned long)- 1}
};
size_t i;

for (i = 0; i < sizeof atest / sizeof *atest; i++)
{
test_t *p = &atest[i];
size_t n = nbits(p->v);

printf ("v = %08lX n = %lu (lu=%lu): %s\n"
,(unsigned long) p->v
,(unsigned long) p->n
,(unsigned long) n
,(char *) ((n == p->n) ? "ok" : "KO!")
);
}
return 0;
}
#endif


D:\CLC\DEV_N\NBITS32>bc proj.prj
v = 00000000 n = 1 (lu=0): KO!
v = 00000000 n = 0 (lu=0): ok
v = 00000001 n = 1 (lu=1): ok
v = 80000000 n = 1 (lu=1): ok
v = FFFFFFFF n = 32 (lu=32): ok
v = FFFFFFFE n = 31 (lu=31): ok
v = 7FFFFFFF n = 31 (lu=31): ok

--
-hs- Tabs out, spaces in.
CLC-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
ISO-C Library: http://www.dinkum.com/htm_cl
FAQ de FCLC : http://www.isty-info.uvsq.fr/~rumeau/fclc

Dan Pop

unread,
Dec 7, 2000, 7:23:19 PM12/7/00
to
In <90p7pk$3rq$1...@news2.isdnet.net> "-hs-" <email....@server.invalid> writes:

>What's wrong with shift and mask?

You mean, other than being the slowest method a reasonable person can
imagine?

>size_t nbits (unsigned long value)
>{
> size_t n = 0;
> unsigned long m;
>
> for (m = value; m; m >>= 1)
> {
> if (m & 1)
> {
> n++;
> }
> }
> return n;
>}

If value has only the most significant bit set, you have iterate
(at least) 32 times. That's fine if performance is a non-issue.

Dan Pop

unread,
Dec 8, 2000, 7:59:08 AM12/8/00
to
In <fgrieu-8B8637....@news.wanadoo.fr> Francois Grieu <fgr...@micronet.fr> writes:

>I massaged things to the (tested working !) one-liner:
>
>/* adapted from source found at ftp://cap.connx.com/pub/bitcount/BITC.ZIP */
>/* which reportedly was "HACKMEM 169, as used in X11 sources" */
>int popcount4(unsigned long x)
>{x-=(x>>1&0xdb6db6db)+(x>>2&0x49249249);return((x>>3)+x&0xc71c71c7)%63;}
>
>Using one less temporary than the original might help some compilers.
>I think this would often beat table-driven implementations,
>especialy if the code is inlined (if not, the many sequence points
>are a handicap).

The biggest handicap on the RISC platforms lacking an integer division
instruction is likely to be the modulo operator.

Gary E. Ansok

unread,
Dec 8, 2000, 12:42:42 PM12/8/00
to
In article <fgrieu-8B8637....@news.wanadoo.fr>,

Francois Grieu <fgr...@micronet.fr> wrote:
>I massaged things to the (tested working !) one-liner:
>
>/* adapted from source found at ftp://cap.connx.com/pub/bitcount/BITC.ZIP */
>/* which reportedly was "HACKMEM 169, as used in X11 sources" */
>int popcount4(unsigned long x)
>{x-=(x>>1&0xdb6db6db)+(x>>2&0x49249249);return((x>>3)+x&0xc71c71c7)%63;}

That's HAKMEM, not HACKMEM.

-- Gary

HAKMEM 41: There are exactly 23,000 primes less than 2**18.

Francois Grieu

unread,
Dec 8, 2000, 1:42:54 PM12/8/00
to
On my supposedly fast [now deobfuscated]

/* adapted from source found at ftp://cap.connx.com/pub/bitcount/BITC.ZIP */
/* which reportedly was "HACKMEM 169, as used in X11 sources" */

int popcount4a(unsigned long x) {
/* reduce to 2 + 10 * 3 bits in a slightly parallelizable way */
x -= ((x>>1)&033333333333)+((x>>2)&011111111111);
/* reduce to 2 + 5 * 6 bits, then to 6 bits */
return (((x>>3)+x)&030707070707)%63;
}

Dan...@cern.ch (Dan Pop) wrote:

> The biggest handicap on the RISC platforms lacking an integer
> division instruction is likely to be the modulo operator.

This is very true on a PowerPC 750.

On machines without hardware modulo, what about one of the
(tested, believed original and improved)


/* inspired from ftp://cap.connx.com/pub/bitcount/BITC.ZIP */
int popcount5(unsigned long x) {
/* reduce to 2 + 10 * 3 bits in a slightly parallelizable way */
x -= ((x>>1)&033333333333)+((x>>2)&011111111111);
/* reduce to 2 + 5 * 6 bits */
x = ((x>>3)+x)&030707070707;
/* reduce to 3 * 6 bits in the low 24 bits */
x += x>>18;
/* reduce to 6 bits in a slightly parallelizable way */
return ((x>>12)+(x>>6)+x)&63;
}


/* inspired from ftp://cap.connx.com/pub/bitcount/BITC.ZIP */
int popcount6(unsigned long x) {
/* reduce to 16*2 bits */
x -= ((x>>1)&0x55555555);
/* reduce to 8*4 bits using the same mask twice */
x = ((x>>2)&0x33333333)+(x&0x33333333);
/* reduce to 4*8 bits */
x = ((x>>4)+x)&0x0F0F0F0F;
/* reduce to 2*8 bits in the low 16 bits */
x += x>>16;
/* reduce to 8 bits */
return ((x>>8)+x)&255;
}


My PPC compiler takes 24 words for popcount4a (among which two
multiplications), 23 words for popcount5, 22 words for popcount6.


Francois Grieu

Morris Dovey

unread,
Dec 8, 2000, 5:01:49 PM12/8/00
to
Francois Grieu wrote:

> It says x, and the sum of the digits of x in base 10 (optionaly
> skipping any 9) are equal mod 9; when applied recursively, this
> let one compute x%9 quickly.
> Then using that (x+y)%9==((x%9)+(y%9))%9 [assuming x+y does not
> overflow !] we have a quick technique to check an addition.
>
> Example: 8754301 -> 8+7+5+4+3+0+1 = 28 -> 2+8 = 10 -> 1+0 = 1
> + 741526 -> 7+4+1+5+2+6 = 25 -> 2+5 = + 7
> = 9495827 -> 4+5+8+2+7 = 26 -> 2+6 = = 8
>
> This "casting out nines" can be modified easily for substraction,
> multiplication, or division. It can be generalized to any base.
> There is also a mod 11 variant, alternatively adding and
> substracting digits in base 10 starting from the right; using
> both checks gives you a high degree of confidence.
>
> In the code snippet, the idea if to compute x%9, which is the
> sum of the digits in x, if it does not excced 8 ! All that in
> base 64 instead of 10.
>
> Note there is a minor error in the original snippet's comments:
> the trick only works to 62 bits.

Francois...

Thank you! I wish I'd been known of this when I was in school. Ah
well, better late than never!

I played with the snippet last night (experimenting with different
numbers of bits than 3), displaying intermediate results in binary;
but by the time I was comfortable with what I was doing, I still
hadn't worked it out for ordinary usages in base ten, and was tired
enough to call it a night.

It's a neat algorithm indeed. Thanks for posting!

0 new messages