Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Chess move binary encoding

136 views
Skip to first unread message

Steven J Edwards

unread,
Jul 3, 1994, 8:27:20 PM7/3/94
to
The following is from Jeff Mallett, the author of the Innovation II
chessplaying program. He sent it to me and asked me to post it to the
newsgroup. It contains a proposal for a storage technique for binary
coded chess moves that may in some cases be a bit more compact than
the one byte-one move method used by PGC and some other programs.

If you have any comments on this, please direct your e-mail to Jeff
at:

j.ma...@genie.geis.com

-- Steven (s...@world.std.com)
---------------------------------------------------------------------------

Re: The PGC Standard

I read "Standard" for game notations and I see that the section 20 (PGC)
is still under development. The method described for PGC, i.e. storing the
index of a move's position in a move list, really should be attributed to the
Spracklens, as I seem to recall they published this years and years ago. In
any case I don't see any reason to stop at one-byte per move; since "storage
efficiency" is the primary goal here, I propose an alternate method that
is more space-efficient than PGC, which I'll refer to as MGC -- Mallett
Game Coding :) :)

------------
MGC

In PGC the average case and worst case take up exactly 8 bits for a move.
My suggestion is simple: reduce the average case storage space at the cost
of the worst case. There are many ways this can be done. To keep everything
very simple I suggest using 4-bit boundaries. (I have a feeling 5-bit
boundaries would be more efficient.) Thus a move would take up a
minimum of 4 bits rather than 8 bits. If a move can't fit in 4 bits then
the first 4 bits will be 0xF (1111) to denote an extension and the next 4
bits will denote the move. This is similar to how assembly language opcodes
might be encoded.

Move Ordinal MGC representation
0..14 0x0..0xE ( 4 bits: 0000 ..1110)
15..29 0xF0..0xFE ( 8 bits: 11110000 ..11111110)
30..44 0xFF0..0xFFE (12 bits: 111111110000..111111111110)
etc.

Let's try an example. Suppose a game is stored where each side has
exactly 30 moves available at each position, so the zero-based move
"ordinal" is 0 to 29. Half of the time the ordinal will be 0-14
and the other half of the time it will be 15-29. Therefore the average
case storage space for this example is .5x4 + .5x8 = 6 bits. In this
simple example the game moves can be stored using MGC in 75% of the space
PGC would use. And rather than restricting positions to those with 2**n
possible moves, as a 6, 7, or 8 bit Spracklen-type representation would,
this system handles positions with an unlimited number of moves,
guaranteeing compatibility with even the most ridiculous composed
positions.

Of course there are many positions which have more than 30 moves
available and so a move could potentially take up more than a byte.
Still the chances are that the played move will be one of the first 30.
Where is the break-even point, i.e. where MGC and PGC have the same
average performance? The answer is when the side to move has 45
moves available. While this is not an uncommon situation in a complex
middlegame, more usual is 30-40 possible moves. In many other positions
such as king and pawn endgames, cramped queenless middlegames, or when
a side is in check, the number of available moves will be far less
than 30.

------------
CHESS KNOWLEDGE

Like PGC the MGC output is dependent on the ordering of moves, but
whereas move ordering can't help PGC store a move in less than a byte,
move ordering in MGC can increase the savings even more! This is
because we can order the move list so that the moves appearing in the
first 15 positions is more likely to be the played move than other moves.
For instance a simple ordering as follows:

1) pawn promotions
2) castling
3) captures of Queens
4) captures of Rooks
5) captures of Bishops and Knights
6) captures of Pawns
7) moves towards the opponent's side
8) all other moves

can be calculated quickly from the move and board without any search or
particularly obscure chess knowledge. It is bound to increase the
odds that the move played can be encoded in only a half-byte. A
more elaborate scheme that improves on this, such as one where checks
are ordered early, could easily be developed. It should be, as this
one is, quick to calculate.

------------
IMPLEMENTATION

MGC move representation can be integrated with the PGC defined definitions
and opcodes already defined. Headers for mvseq-1, mvseq-2, and mvset-1
would have to be defined to represent either the number of moves or bytes,
but not both. Also, if the move sequence/set ends in the middle of a
byte, the last 4 bits should be skipped as "undefined", so that the next
piece of information begins on a byte boundary.

MGC is a bit more complicated than PGC. Fortunately, this isn't as big an
issue as in PGN, since the binary form will be written/read by computers
rather than humans. Be that as it may, I'm providing some code to
convert a move ordinal to MGC format. This code is UNTESTED so use at
your own risk. It is independent of whatever move sorting scheme may be
eventually chosen.

// MGC Encoder/Decoder by Jeff Mallett
// This code is freeware.
// Note: "Boolean" may be typedef'd to an int
// Note: "Byte" may be typedef'd to an unsigned char

typedef struct {
Byte *bptr; // Pointer to a byte in games database
Boolean firstHalf; // Looking at first 4 bits of the byte?
} movesDBPositionT;

#define TRUE 1
#define FALSE 0

#define GET_4_BITS(b, p) \
b = (p->firstHalf) ? *p->bptr & 0x0F : (*(p->bptr++) & 0xF0) >> 4; \
p->firstHalf = !p->firstHalf

short GetNextMove(movesDBPositionT *p);
void PutNextMove(movesDBPositionT *p, short moveIndex);


// Converts compacted (MGC) move in database to move ordinal
// Also updates database position to point to next move
short GetNextMove(movesDBPositionT *p)
{
register short moveIndex, i;

moveIndex = 0;
do {
GET_4_BITS(i, p);
moveIndex += i;
} while (i != 0x0F);

return moveIndex;
}

// Converts move ordinal to compacted (MGC) move in database
// Also updates database position to point to next move
void PutNextMove(movesDBPositionT *p, short moveIndex)
{
if (p->firstHalf) {
if (moveIndex <= 14) {
*p->bptr = moveIndex << 4;
p->firstHalf = FALSE;
} else { // moveIndex > 14
if (moveIndex > 29) {
*(p->bptr++) = 0xFF;
PutNextMove(p, moveIndex - 30);
} else // 14 < moveIndex <= 29
*(p->bptr++) = 0xF0 + moveIndex - 15;
}

} else { // !p->firstHalf
p->firstHalf = TRUE;
if (moveIndex <= 14) {
*p->bptr |= moveIndex;
p->bptr++;
} else { // moveIndex > 14
*(p->bptr++) |= 0x0F;
PutNextMove(p, moveIndex - 15);
}
}
}

------------
Jeff Mallett (Innovation II)
j.ma...@genie.geis.com


0 new messages