Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

bitboard 2^i mod 67 is unique

15 views
Skip to first unread message

Joel Rivat

unread,
Sep 2, 1996, 3:00:00 AM9/2/96
to

bue...@asterix.urc.tue.nl (Marcel van Kervinck) wrote:

> Btw, I did some investigation to code the %67 function efficiently,
> but gave up. Even /67 isn't trivial, but doable. Perhaps we should
> be looking for another %n with unique 2^i%n.

I think no investigation is necessary for doing the %67 function
efficiently. Look the following c function:
unsigned long mod67(unsigned long n)
{
return n%67;
}

You can compile this on alpha using gcc and you obtain:
.rdata
.quad 0
.align 3
$C33:
.quad 0xf4898d5f85bb3951
.text
.align 3
.globl mod67
.ent mod67
mod67:
ldgp $29,0($27)
mod67..ng:
.frame $30,0,$26,0
.prologue 1
lda $1,$C33
ldq $1,0($1)
umulh $16,$1,$1
srl $1,6,$1
sll $1,4,$0
addq $0,$1,$0
s4subq $0,$1,$0
subq $16,$0,$0
ret $31,($26),1
.end mod67

If you have look you will notice that the mod67 function is achieved
with one umulh (multiply "high") instruction and a few trivial instructions
so I doubt you can do better...

>
> Perhaps we should return to programming chess.

This is chess programming: a 64 bits integer is a set of squares
and you constently need:
1/ Loop on the squares in a set...
This is the purpose of the a&-a trick
2/ Find the index of a square i, given n=2^i (obtained by a&-a)

For 2/ I have tested the %67 method about 2 years ago but I do not
like it for the moment as it is very slow on 32 bits computers...
I still use dichotomy:
char __firstbit[256] =
{
0,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
7,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0
};
typedef unsigned long long ULONG; /* 64 bits ints */
long firstbit(ULONG n)
{
long s;
if (n&0xFFFFFFFF)
if (n&0xFFFF)
if (n&0xFF) s=0;
else s=8;
else
if (n&0xFF0000) s=16;
else s=24;
else
if (n&(ULONG(0xFFFF)<<32))
if (n&(ULONG(0xFF)<<32)) s=32;
else s=40;
else
if (n&(ULONG(0xFF0000)<<32)) s=48;
else s=56;
return s+__firstbit[(n>>s)&0xFF];
}

If you want to compare (roughtly) the cost:
mod67 : umulh + table access
dichotomy: 3 tests + table access

So you would need a really fast multiplication
in order to beat the dichotomy...
Furthermore 32 bits compilers are very clever with the dichotomy method,
and very poor with the mod67 method...

Now for fun let me give another method (not better):
inline long bitsum(ULONG a)
{
const ULONG mask1 = (ULONG(0x55555555)<<32)|ULONG(0x55555555);
a = (a&mask1) + ((a>>1)&mask1);
const ULONG mask2 = (ULONG(0x33333333)<<32)|ULONG(0x33333333);
a = (a&mask2) + ((a>>2)&mask2);
unsigned long b = (a&0xFFFFFFFF)+(a>>32);
b = (b&0x0F0F0F0F)+((b>>4)&0x0F0F0F0F);
b = (b&0xFFFF)+(b>>16);
b = (b&0xFF)+(b>>8);
return long(b);
}

long firstbit(ULONG n)
{
return bitsum((n&-n)-1);
}

Notice that this method avoid completely the table lookup...
IMHO it's the most elegant, but unfortunately it does not seem
to be faster than the dichotomy...

Comments welcome...

Joel Rivat

0 new messages