Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Bitboards: speeding up FirstOne

150 views
Skip to first unread message

Laurent.DESNOGUES

unread,
Apr 10, 1996, 3:00:00 AM4/10/96
to
After a discussion with Joel Rivat and some programming,
I've written a function that seems to be faster than
Crafty's ones (both on a Sparc20 and on an Alpha 21064).
The code output by gcc 2.7.0 for SuperSPARC can be a
little be optimized by hand. For the Alpha I don't know
well enough the instruction set and the pipelines
constraints to tell if the generated could be optimized.

I'd like to share the program, and get some tests and
comments in exchange.

Note the basic idea is by Joel; I just sped it up.


typedef unsigned long long ULONG;

unsigned char __firstbit[256] =
{
8,
0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
7,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0
};


inline long firstbit5__(ULONG n)
{
union {
unsigned long truc[2];
ULONG machin;
} v;

#ifdef LITTLE_ENDIAN
# define Z 0
# define U 1
#else
# define Z 1
# define U 0
#endif

v.machin = n;
if (v.truc[Z])
if (v.truc[Z]&0xFFFF)
if (v.truc[Z]&0xFF) return __firstbit[v.truc[Z]&0xFF];
else return 8+__firstbit[(v.truc[Z]>>8)&0xFF];
else
if (v.truc[Z]&0xFF0000) return 16+__firstbit[(v.truc[Z]>>16)&0xFF];
else return 24+__firstbit[v.truc[Z]>>24];
else
if (v.truc[U]&0xFFFF)
if (v.truc[U]&0xFF) return 32+__firstbit[v.truc[U]&0xFF];
else return 40+__firstbit[(v.truc[U]>>8)&0xFF];
else
if (v.truc[U]&0xFF0000) return 48+__firstbit[(v.truc[U]>>16)&0xFF];
else return 56+__firstbit[v.truc[U]>>24];

#undef Z
#undef U
}


Laurent

Laurent.DESNOGUES

unread,
Apr 11, 1996, 3:00:00 AM4/11/96
to
I obviously made a mistake: my function firstbit5__
has to be compared to Crafty's LastOne. I've always
counted bits from the least to the most significant
because of my assembly language programming experience...


Laurent

Anders Thulin

unread,
Apr 11, 1996, 3:00:00 AM4/11/96
to
In article <4kfvhu$4...@malibu.unice.fr>,
Laurent.DESNOGUES <desn...@aiguemarine.unice.fr> wrote:

>unsigned char __firstbit[256] =

As this isn't comp.lang.c, perhaps this is off topic...

Be careful about names starting with underlines: in ANSI C, they're
almost always reserved for use by the system. Collisions may not
necessarily be detected.

But just to stay reasonably close to the topic ...

In cases where the sole purpose is to extract bits from the BITBOARD
and the exact order this is done is unimportant, it seems it would be
better to call LastOne instead, as that function can be computed
fairly quickly as:

#define LastOne(BB) ((BB) & -(BB))

at least for systems with 64-bit support. (If Knuth only had published
vol. 4. ...)

--
Anders Thulin Anders...@lejonet.se 013 - 23 55 32
Telia Research AB, Teknikringen 2B, S-583 30 Linkoping, Sweden

Robert Hyatt

unread,
Apr 11, 1996, 3:00:00 AM4/11/96
to
In article <4kiphl$n...@nala.devnet.lejonet.se>,
Anders Thulin <a...@nala.devnet.lejonet.se> wrote:
-->In article <4kfvhu$4...@malibu.unice.fr>,
-->Laurent.DESNOGUES <desn...@aiguemarine.unice.fr> wrote:
-->
-->>unsigned char __firstbit[256] =
-->
--> As this isn't comp.lang.c, perhaps this is off topic...
-->
--> Be careful about names starting with underlines: in ANSI C, they're
-->almost always reserved for use by the system. Collisions may not
-->necessarily be detected.
-->
--> But just to stay reasonably close to the topic ...
-->
--> In cases where the sole purpose is to extract bits from the BITBOARD
-->and the exact order this is done is unimportant, it seems it would be
-->better to call LastOne instead, as that function can be computed
-->fairly quickly as:
-->
--> #define LastOne(BB) ((BB) & -(BB))
-->
-->at least for systems with 64-bit support. (If Knuth only had published
-->vol. 4. ...)
-->
-->--
-->Anders Thulin Anders...@lejonet.se 013 - 23 55 32
-->Telia Research AB, Teknikringen 2B, S-583 30 Linkoping, Sweden


Except on the Cray which has "leadz" (FirstOne) in hardware. :)

Bob
--
Robert Hyatt Computer and Information Sciences
hy...@cis.uab.edu University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170

Laurent.DESNOGUES

unread,
Apr 11, 1996, 3:00:00 AM4/11/96
to
In article <4kiphl$n...@nala.devnet.lejonet.se>, a...@nala.devnet.lejonet.se (Anders Thulin) writes:
[...]

> In cases where the sole purpose is to extract bits from the BITBOARD
> and the exact order this is done is unimportant, it seems it would be
> better to call LastOne instead, as that function can be computed
> fairly quickly as:

>
> #define LastOne(BB) ((BB) & -(BB))
>
> at least for systems with 64-bit support. (If Knuth only had published
> vol. 4. ...)
[...]

That code will also be good on 32 bit architectures that
have condition codes (at least carry; e.g. SPARC) and anyway
probably better on any computer.

My problem is that I need the *position* of the bit set
and not just the mask, since I use the position as an index
in an array. On top of that, most of the time I clear the
bit found with a masking operation that does several other
maskings as well, so that having the value returned by
your function should really be useless.

BTW, I'm writing that for an Othello end-game solver and
have a question: can anyone recommend a good hashing
function? The function I plan to use would be quite easy to
compute:

unsigned long hash = 0;
for i = 0 to 7 do
hash = (hash << 1) + hashvalues[byte(black, i)];
hash = (hash << 1) + hashvalues[byte(white, i)];

where hashvalues would be an array containing 256 32-bit
random values. There's a little problem with it: doing
so I make the assumption each square of the board can have
four values instead of three; would that increase the
number of collisions? An efficient solution would be
to index the concatenation of two bytes in an array that
would give me a number in [0, 3^8-1]; is it worth the
128 KB? I guess I would need a bitboard on a ternary
computer (titboard ;).

Another problem is that that hashing function can't be
easily computed incrementally.

Then I'd and the result with the size of the hash table
which would be a power of two (Knuth would laugh at that ;).

Any comment?


Laurent

Robert Hyatt

unread,
Apr 11, 1996, 3:00:00 AM4/11/96
to
In article <4kj5i3$a...@malibu.unice.fr>,
Laurent.DESNOGUES <desn...@aiguemarine.unice.fr> wrote:
-->In article <4kiphl$n...@nala.devnet.lejonet.se>, a...@nala.devnet.lejonet.se (Anders Thulin) writes:
-->[...]

-->> In cases where the sole purpose is to extract bits from the BITBOARD
-->> and the exact order this is done is unimportant, it seems it would be
-->> better to call LastOne instead, as that function can be computed
-->> fairly quickly as:
-->>
-->> #define LastOne(BB) ((BB) & -(BB))
-->>
-->> at least for systems with 64-bit support. (If Knuth only had published
-->> vol. 4. ...)
-->[...]
-->
--> That code will also be good on 32 bit architectures that
-->have condition codes (at least carry; e.g. SPARC) and anyway
-->probably better on any computer.
-->
--> My problem is that I need the *position* of the bit set
-->and not just the mask, since I use the position as an index
-->in an array. On top of that, most of the time I clear the
-->bit found with a masking operation that does several other
-->maskings as well, so that having the value returned by
-->your function should really be useless.
-->
--> BTW, I'm writing that for an Othello end-game solver and
-->have a question: can anyone recommend a good hashing
-->function? The function I plan to use would be quite easy to
-->compute:
-->
--> unsigned long hash = 0;
--> for i = 0 to 7 do
--> hash = (hash << 1) + hashvalues[byte(black, i)];
--> hash = (hash << 1) + hashvalues[byte(white, i)];
-->
-->where hashvalues would be an array containing 256 32-bit
-->random values. There's a little problem with it: doing
-->so I make the assumption each square of the board can have
-->four values instead of three; would that increase the
-->number of collisions? An efficient solution would be
-->to index the concatenation of two bytes in an array that
-->would give me a number in [0, 3^8-1]; is it worth the
-->128 KB? I guess I would need a bitboard on a ternary
-->computer (titboard ;).
-->
--> Another problem is that that hashing function can't be
-->easily computed incrementally.
-->
--> Then I'd and the result with the size of the hash table
-->which would be a power of two (Knuth would laugh at that ;).
-->
--> Any comment?
-->
-->
--> Laurent


I'm not an Othello person at all, but a student, many years ago, converted
Cray Blitz to play othello by replacing the move generator, make/unmake,
and the evaluation code.

He kept the same hashing scheme I used in CB, which is exactly what's being
used in probably all of the current computer chess programs. That is, the
famous:

hash=0;
for (sq=0;sq<max;sq++)
hash=Xor(hash,random[board[sq]][sq]);

where board[sq] contains the value of the piece (0,1,-1 for othello,
maybe) and sq is the board square. in this case (you have to fudge the
subscripts, obviously because -1 is not going to fly) random[0,1,2][0..63]
is the array of random numbers. each time you add a piece, you xor in
random[piece_added][square_it_is_on]; Of course you also flip at least
one other piece, so you have to take the old out, and xor in the new.

Just thinking, I'd bet bitmaps would be the thing to do, because I might
be able to use two 64bit bitmaps, one for black, one for white, and with
an exclusive or, flip a bunch of pieces at one time, and with some clever
usage of rotated bitmaps, figure out where the distant friendly piece is
on each ray in one quick step.

Nah... not going to think about it... chess is enough fun.. :)

BTW, when the othello conversion was done, I was skeptical that the hash
table as used in chess would be any good. remarkably, it was just as
useful in othello as in chess/checkers. It's an *odd* game, however. :)

0 new messages