2770 views

Skip to first unread message

Aug 23, 2002, 4:34:42 AM8/23/02

to

I was wondering if anyone had a quicker way of rounding an unsigned int

value up to the nearest power of 2.

value up to the nearest power of 2.

obviously I can do:

unsigned int n = 1;

while (n < val)

n <<= 1;

but I figured there has to be a faster way...?

Aug 23, 2002, 6:52:36 AM8/23/02

to

Your loop never terminates if "unsigned int" is 32 bits and val is

greater than 2^31.

Here's my effort (note: it's got better O() complexity, but it may not

be *faster* for the ranges of numbers you're interested in). I make the

assumption that only numbers in the range 0..2^31 are passed in

(anything higher returns 0). It's basically an unrolled binary search:

unsigned long rop2(unsigned long val)

{

unsigned long n = 0;

unsigned long q = 1<<16;

if (val < 2)

return 1;

if (val <= q) {

n = q;

q >>= 8;

} else {

q <<= 8;

}

if (val <= q) {

n = q;

q >>= 4;

} else {

q <<= 4;

}

if (val <= q) {

n = q;

q >>= 2;

} else {

q <<= 2;

}

if (val <= q) {

n = q;

q >>= 1;

} else {

q <<= 1;

}

if (val <= q) {

n = q;

}

return n;

}

- Kevin.

Aug 23, 2002, 11:08:23 AM8/23/02

to

"Faster" is always guesswork until you actually implement

something and measure it, because the C language itself says

nothing about how much time any particular operation takes.

Even after you've implemented and measured, the results may

change when you recompile with different switches, or upgrade

your compiler, or move to a different machine, or allow the

moon to change its phase.

And, of course, you've got to need this computation a

*lot* of times to make any tweaking worth while. If it only

runs a couple thousand times, shaving a microsecond off the

execution time for each computation gives no observable net

improvement.

All that said, here's one approach:

unsigned int n1, n2;

for (n1 = val; (n2 = n1 & (n1 - 1)) != 0; n1 = n2)

;

return (n1 << (n1 < val)) + (val == 0);

The loop executes once for each 1-bit in the value of `val',

and there's an adjustment at the end just in case `val' itself

happens to be an exact power of two or equal to zero.

Another possibility is to do a binary search, either in an

explicit table or by writing a nested `if' chain with constants;

this will have an operation count proportional to the total number

of bits in an `unsigned int' (or, more generally, in the largest

`val' value you'll ever see). This has the disadvantage that you

need to know the maximum bit count when you write the code, and

you'll need to rewrite if you move to a machine with a

different value of UINT_MAX.

Aug 23, 2002, 11:57:59 AM8/23/02

to

Not true - the binary search has an operation count proportional to log

base 2 of the total number of bits in an unsigned int.

- Kevin.

Aug 23, 2002, 12:22:55 PM8/23/02

to

Kevin Easton wrote:

>

> Eric Sosman <Eric....@sun.com> wrote:

> >

> > Another possibility is to do a binary search, [...]

> > this will have an operation count proportional to the total number

> > of bits in an `unsigned int' (or, more generally, in the largest

> > `val' value you'll ever see).

>

>

> Eric Sosman <Eric....@sun.com> wrote:

> >

> > Another possibility is to do a binary search, [...]

> > this will have an operation count proportional to the total number

> > of bits in an `unsigned int' (or, more generally, in the largest

> > `val' value you'll ever see).

>

> Not true - the binary search has an operation count proportional to log

> base 2 of the total number of bits in an unsigned int.

> base 2 of the total number of bits in an unsigned int.

Yes, of course (gotta stop typing so fast; that's two blunders

in two days). Sorry for the error, and thanks for the correction.

Aug 23, 2002, 2:02:57 PM8/23/02

to

* Kevin Easton wrote:

> Jeffrey Stedfast <fe...@ximian.com> wrote:

> > I was wondering if anyone had a quicker way of rounding an unsigned int

> > value up to the nearest power of 2.

*snip*> Jeffrey Stedfast <fe...@ximian.com> wrote:

> > I was wondering if anyone had a quicker way of rounding an unsigned int

> > value up to the nearest power of 2.

> Here's my effort (note: it's got better O() complexity, but it may not

> be *faster* for the ranges of numbers you're interested in). I make the

> assumption that only numbers in the range 0..2^31 are passed in

> (anything higher returns 0). It's basically an unrolled binary search:

>

> unsigned long rop2(unsigned long val)

> {

> unsigned long n = 0;

> unsigned long q = 1<<16;

nitpick: you might want to make that

unsigned long q = 1lu<<16;

otherwise it won't work portably for unsigned longs.

> if (val < 2)

> return 1;

>

> if (val <= q) {

> n = q;

> q >>= 8;

> } else {

> q <<= 8;

> }

>

*snip unrolling*

>

> return n;

> }

Thanks for a good hint towards a more portable solution rolling up your

code:

/*premise: no padding in unsigned long, but see below */

#include <limits.h>

#define BITS_IN_ULONG (CHAR_BIT * sizeof(unsigned long))

unsigned long rop2(unsigned long val)

{

unsigned b = BITS_IN_ULONG / 2;

unsigned long q = 1lu<<b;

unsigned long n = 0;

/* one could do

assert(q<<(b-1) != 0);

here to abort if above premise does not hold

*/

if (val < 2)

return 1;

while (b != 0) {

b >>= 1;

if (val <= q) {

n = q;

q >>= b;

} else {

q <<= b;

}

}

return n;

}

One could use Jeffreys solution on ULONG_MAX/2 to calculate

BITS_IN_ULONG-1 in initialization code, in order to make this really

portable.

offtopic part: on my system this sometimes beats Jeffreys idea when

using 64-bit wide variables, and is faster than the unrolled version.

Regards,

Ralf

Aug 23, 2002, 5:33:41 PM8/23/02

to

[I am not sure who wrote which part here]

>* Kevin Easton wrote:

>> Jeffrey Stedfast <fe...@ximian.com> wrote:

>> > I was wondering if anyone had a quicker way of rounding an unsigned int

>> > value up to the nearest power of 2.

>*snip*

>> Here's my effort ... [that is] basically an unrolled binary search:

[more snippage]

>* Kevin Easton wrote:

>> Jeffrey Stedfast <fe...@ximian.com> wrote:

>> > I was wondering if anyone had a quicker way of rounding an unsigned int

>> > value up to the nearest power of 2.

>*snip*

[more snippage]

In article <hv9i3x...@ID-34913.user.dfncis.de>

Ralf Wildenhues <ralfimn...@gmx.de> writes:

>Thanks for a good hint towards a more portable solution rolling up your

>code:

[more code snipped]

>One could use Jeffreys solution on ULONG_MAX/2 to calculate

>BITS_IN_ULONG-1 in initialization code, in order to make this really

>portable.

>

>offtopic part: on my system this sometimes beats Jeffreys idea when

>using 64-bit wide variables, and is faster than the unrolled version.

Yet another trick is to write a portable C program that writes

a nonportable, "unrolled-loop" version of the desired function.

Here is a method that relies on binary representation, an unrolled

loop for 32-bit "int"s, and an attempt (untested) at a portable C

program to write the unrolled loop as a non-portable C function.

Observe that if n is already a power of 2, it has only one bit

set in it:

000001000...000 (for some number of 0's)

and the answer is then just "n". Otherwise n has additional

bits set, e.g.:

00010001010...

and the answer is then whichever bit pattern has only a single

bit set one bit higher than the currently-topmost set bit:

00010001010... =>

00100000000...

i.e., we need to round up to the next power of two (as in the

original problem statement).

Let us begin by assuming that we have the latter case -- multiple

bits set. How can we find "the next power of 2"? Well, consider

the lowly bitwise-OR operator, and what happens if it is repeated.

Once a 1 is ORed in, it remains a 1 forever. Suppose we take the

initial value of n and shift it right one bit, introducing a zero

at the top and discarding the bottom-most bit, then mash them

together with bitwise-OR:

0001000101000001

| 0000100010100000x

----------------

0001100111100001

Now do this again but shift two bits:

0001100111100001

| 0000011001111000xx

----------------

0001111111111001

Repeat with a four-bit shift:

0001111111111001

| 0000000111111111xxxx

----------------

0001111111111111

Because I had 16 bits in the illustration, we really should do it

again with an 8-bit shift, but obviously this time we have reached

a stable point; any further shift-and-OR-ing will keep repeating

the same bit pattern. In fact, you always reach this point within

ceil(log2(nbits)) steps (proof left as an exercise :-) ).

Having done this shift-and-OR sequence log2(nbits) times, we have

effectively smeared *any* "1"-bits, from whatever is the topmost

set bit, on down through all the lower-order bits. Now we come to

the trick: we add 1.

By definition, 1+1 is 0 with a carry. By this time, the carries

propagate all the way up to the most-significant "1" bit. Thus

all the low order bits turn to 0 and we end up with a "1" in the

bit above the currently-MSbit:

0001111111111111

+ 0000000000000001

----------------

0010000000000000

which is just the value we wanted: the next higher power of two.

Of course, the next higher power of two may be unrepresentable;

for instance, if the bit pattern before adding 1 is all-ones,

adding 1 will cause it to wrap to 0 (unsigned arithmetic being

modular "clock" arithmetic). For now let us simply assume that

this is the desired result.

This suggests the following method:

/* given unsigned integral value "n" */

if (n has only one "1" bit in its representation)

return n;

else {

n |= n >> 1;

n |= n >> 2;

... as many n |= n >> k operations as needed ...

n++;

return n;

}

Of course, this requires a pesky (if easy-to-implement after all)

test: "does n have only one `1' bit in its representation?" But

look what happens if we "cheat" instead. Suppose we begin by

subtracting 1 from n. There are only two possible cases: either

n has *exactly* one 1 bit, or it has *more* than one 1 bit:

000100010000000000000000

- 000000000000000000000001

------------------------

000100001111111111111111

If n has *more* than one "1" bit set, the subtraction never touches

the topmost "1" bit, because at worst, it causes a borrow from all

the low-order "0"s up until we reach the first "1" bit, and then

it stops borrowing. And if n has *exactly* one "1" bit set, then

either the subtraction causes a borrow from that single "1" bit,

giving us all-1s in the lower-order bits:

000100000000000000000000

- 000000000000000000000001

------------------------

000011111111111111111111

or else n is initially 1 and subtracting 1 just gives 0. In all

three cases, the resulting bit pattern is just what we need to run

through our bit-smearing algorithm. Bit-smearing all-zeros has no

effect, and bit-smearing any "1", followed by some (or even no)

zeros, followed by some (or no) all-ones, does just what we want.

Thus, instead of the initial test "is there only one 1-bit", we

just subtract one, smear, then add one and return that:

/* assumes 16-bit "int"s */

unsigned int round_by_smearing(unsigned int n) {

n--; /* subtract one for power-of-2 input */

n |= n >> 1; /* "smear" steps, repeat log2(nbits) times */

n |= n >> 2;

n |= n >> 4;

n |= n >> 8;

n++; /* add 1 to reach next power of 2 (or 0) */

return n;

}

(Note, incidentally, that if the input is 0, the output is also 0

-- subtracting 1 gives all-1-bits, which are unchanged under

smearing, then adding 1 gives 0 again. So input-zero is

indistinguishable from overflow-out, which occurs whenever the

input exceeds the mathematical value of (UINT_MAX+1)/2. In this

case UINT_MAX is presumably 65535, and 32768 in => 32768 out => no

overflow; this is why we are OK with (UINT_MAX+1)/2 instead of

UINT_MAX/2.)

Of course, "round_by_smearing" above makes brash and unsupportable

assumptions about UINT_MAX, so now we get to the other tool in our

bag of tricks: C programs that write C translation units.

For some unsigned type T, we need to produce log2(nbits_in_T)

"smear" steps. We could examine Uwhatever_MAX, but on the other

hand, we can just do the same kind of smearing until we reach

all-one-bits and adding one overflows. So, here is a program to

do that:

/* define OUR_TYPE as desired, but note that it must be an

unsigned integral type! */

#ifndef OUR_TYPE

#define OUR_TYPE unsigned int

#endif

#define STR0(x) #x

#define STR1(x) STR0(x)

int main(void) {

OUR_TYPE v, vplus1;

int shift_factor;

printf("%s round_by_smearing(%s n) {\n n--;\n",

STR1(OUR_TYPE), STR1(OUR_TYPE));

for (shift_factor = 1, v = 1;; shift_factor *= 2) {

printf(" n |= n >> %d;\n", shift_factor);

v |= v << shift_factor;

vplus1 = v + 1;

if (vplus1 == 0)

break;

}

printf(" n++;\n return n;\n}\n");

return 0;

}

Although the computer I tested this on has only 8, 16, 32, and

64-bit integer types, I believe it would work correctly on an

18-and-36-bit Univac too.

--

In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)

Salt Lake City, UT, USA Domain: to...@bsdi.com

http://67.40.109.61/torek/ (for the moment) I report spam to abuse@.

"nos...@elf.eng.bsdi.com" *is* my address (one of many actually).

Aug 25, 2002, 6:27:48 PM8/25/02

to

In article <ak69nl$fu$1...@elf.eng.bsdi.com> I wrote:

> /* assumes 16-bit "int"s */

> unsigned int round_by_smearing(unsigned int n) {

> n--; /* subtract one for power-of-2 input */

> n |= n >> 1; /* "smear" steps, repeat log2(nbits) times */

> n |= n >> 2;

> n |= n >> 4;

> n |= n >> 8;

> n++; /* add 1 to reach next power of 2 (or 0) */

> return n;

> }

> /* assumes 16-bit "int"s */

> unsigned int round_by_smearing(unsigned int n) {

> n--; /* subtract one for power-of-2 input */

> n |= n >> 1; /* "smear" steps, repeat log2(nbits) times */

> n |= n >> 2;

> n |= n >> 4;

> n |= n >> 8;

> n++; /* add 1 to reach next power of 2 (or 0) */

> return n;

> }

(followed by a program to generate the proper function for however

many bits your "int"s are)

It just occurred to me that one can use a loop of the form

"smear until value does not change":

typedef unsigned int UT;

UT another_round(UT n) {

UT n1 = n - 1;

int shift_count = 1;

do {

n = n1 | (n1 >> shift_count);

shift_count *= 2;

} while (n != n1);

return n + 1;

}

Of course, "speed" is not something addressed by Standard C, and

one would have to test each version to see which one(s) are fastest

on your particular implementation.

Aug 25, 2002, 6:33:51 PM8/25/02

to

In article <akbll4$3bv$1...@elf.eng.bsdi.com> I posted untested code:

> typedef unsigned int UT;

> UT another_round(UT n) {

> UT n1 = n - 1;

> int shift_count = 1;

>

> do {

> n = n1 | (n1 >> shift_count);

> shift_count *= 2;

> } while (n != n1);

> return n + 1;

> }

> typedef unsigned int UT;

> UT another_round(UT n) {

> UT n1 = n - 1;

> int shift_count = 1;

>

> do {

> n = n1 | (n1 >> shift_count);

> shift_count *= 2;

> } while (n != n1);

> return n + 1;

> }

Exercises:

- spot the obvious bug.

- fix it.

:-)

Aug 26, 2002, 2:02:33 AM8/26/02

to

Chris Torek <nos...@elf.eng.bsdi.com> wrote:

> In article <akbll4$3bv$1...@elf.eng.bsdi.com> I posted untested code:

>> typedef unsigned int UT;

>> UT another_round(UT n) {

>> UT n1 = n - 1;

>> int shift_count = 1;

>>

>> do {

>> n = n1 | (n1 >> shift_count);

>> shift_count *= 2;

>> } while (n != n1);

>> return n + 1;

>> }

>

> Exercises:

>

> - spot the obvious bug.

> - fix it.

>

> :-)

> In article <akbll4$3bv$1...@elf.eng.bsdi.com> I posted untested code:

>> typedef unsigned int UT;

>> UT another_round(UT n) {

>> UT n1 = n - 1;

>> int shift_count = 1;

>>

>> do {

>> n = n1 | (n1 >> shift_count);

>> shift_count *= 2;

>> } while (n != n1);

>> return n + 1;

>> }

>

> Exercises:

>

> - spot the obvious bug.

> - fix it.

>

> :-)

typedef unsigned int UT;

UT another_round(UT n) {

UT n1;

int shift_count = 1;

n--;

do {

n1 = n;

n = n1 | (n1 >> shift_count);

shift_count *= 2;

} while (n != n1);

return n + 1;

}

(in the spirit of the thread, also untested :)

- Kevin.

Aug 26, 2002, 2:09:37 PM8/26/02

to

Here's a routine that returns the highest bit number:

const char clz_table[32] =

{

0, 31, 9, 30, 3, 8, 18, 29, 2, 5, 7, 14, 12, 17,

22, 28, 1, 10, 4, 19, 6, 15, 13, 23, 11, 20, 16,

24, 21, 25, 26, 27

};

unsigned long clz(unsigned long n)

{

unsigned long c = 0x7dcd629; /* magic constant... */

n |= (n >> 1);

n |= (n >> 2);

n |= (n >> 4);

n |= (n >> 8);

n |= (n >> 16);

if (n == 0) return 32;

n = c + (c * n);

return 31 - clz_table[n >> 27]; /* For little endian */

}

--

#include <standard.disclaimer>

_

Kevin D Quitt USA 91351-4454 96.37% of all statistics are made up

Per the FCA, this email address may not be added to any commercial mail list

Aug 27, 2002, 7:21:07 AM8/27/02

to

#include <stdlib.h>

#include <limits.h>

int main(int argc, char *argv[]) {

unsigned int i, n = 250;

if (argc > 1)

n = atoi(argv[1]);

n &= INT_MAX;

for (i = 1; n > i; i <<= 1) ;

printf("the nearest power of 2 for %u is %u\n", n, i);

return 0;

}

--

Joe Wright mailto:joeww...@earthlink.net

"Everything should be made as simple as possible, but not simpler."

--- Albert Einstein ---

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu