
Help on PGN/EPD format and its implementation wanted


Rudolf Posch

Nov 5, 1996

My private PC chess program uses a self-designed, proprietary file format for storing and loading (single) chess
games to and from disk.
To gain access to the huge existing chess game databases, I want to extend my program with an I/O
module for chess games in PGN format.

In this context I have the following questions and would appreciate answers, tips, and hints:

1) What is (at the moment and for the near future) the most widely used, openly documented, and well-defined
format in the computer chess community?
I have heard of the PGN format, commercial formats like Chess Base, Chess Assistant, ..., and some older
(obsolete) formats.
I suppose PGN is the best choice for my purpose.

2) At the Pit Chess Archive I found the "Standard: PGN specification and Implementation guide", revised
1994.03.12.
2a) Is this the latest version available?
2b) Do current implementations (chess programs, PGN readers, etc.) adhere fully to the described standard, or
are there features in the standard that "didn't get accepted" by most programmers (and which I may therefore
also ignore when writing my PGN parser/generator)?

3) Three PGN variants are described: Import Format (more lenient), Export Format (strict), and Reduced
Export Format (Seven Tag Roster; without comments, numeric annotation glyphs (NAGs), or recursive annotations).
I plan not to support the Import Format (not even in my PGN import routine). This presupposes that most of the
available PGN files are in strict Export Format.
I think it is best to use the Export Format (for both the load and save procedures). But if, for instance, NAGs
are not very common in practice, I would like to avoid programming those features.
So is it a good choice to use the PGN Export Format variant exclusively, and with what omissions?

4) Extended Position Description (EPD)
I have to load and store the starting position of a chess game if it doesn't start with a full board.
I found the 2 tags SETUP and FEN in the above mentioned PGN standard.
But there is also EPD as an extension to FEN. I could not find a tag pair "Setup, EPD"; is there none?

5) I want to store various data with some or every move, such as the evaluation, nodes searched, nodes/sec, hash
statistics, internal flags, ...
How do I best do this in conformance with widespread use, for instance by incorporating the data in {comments}?
In the EPD standard there are "EPD operations" like "acs" (analysis count: seconds), but they seem to be
meant for "position annotation" and not "move annotation".


--
Regards, Rudolf Posch / E-mail: Rudolf...@banyan.siemens.at

Steven J. Edwards

Nov 8, 1996

[I was unable to e-mail to the original author.]

Rudolf Posch <Rudolf...@banyan.siemens.at> writes:

> My private PC chess program uses a self-designed, proprietary file
> format for storing and loading (single) chess games to and from disk.

Most programs start out that way; it's much more fun to write a chess
playing algorithm than to worry about I/O formatting.

> To gain access to the huge existing chess game databases, I want to
> extend my program with an I/O module for chess games in PGN format.

A good idea, as PGN is the only well-defined, non-proprietary format.

> In this context I have the following questions and would appreciate
> answers, tips, and hints:

> 1) What is (at the moment and for the near future) the most widely
> used, openly documented, and well-defined format in the computer chess
> community?

PGN, as above.

> I have heard of the PGN format, commercial formats like Chess Base,
> Chess Assistant, ..., and some older (obsolete) formats.
> I suppose PGN is the best choice for my purpose.

There are a number of difficulties with the proprietary formats you
mention. First, none of them have official publicly available
specifications. Second, they are subject to change without notice and
without upward compatibility considerations. Third, they are all tied
to commercial interests and may involve license and royalty issues.
Finally, a format that has been optimized for multiple games may not
be appropriate for single games and vice versa.

> 2) At the Pit Chess Archive I found the "Standard: PGN specification
> and Implementation guide", revised 1994.03.12.

It's also at chess.onenet.net as the pub/chess/PGN/Standard text file.

> 2a) Is this the latest version available?

Yes, but only for a while. A revision is underway and should be
available before the end of the year. However, there will be no major
changes and nearly all current data meeting the PGN specification will
continue to do so; the only compatibility changes foreseen are:

1: Removal of the "?", "!", "??", "?!", "!?", "!!" move annotations;
NAGs are to be used instead. A simple pass with a text editor can fix
these in extant data.

2: Removal of the percent sign line escape mechanism; this has been
rarely, if ever, used. This will free up the percent sign glyph for
possible future use.
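
As a rough illustration of the first change above, the "simple pass"
that turns suffix annotations into NAGs could be scripted along the
following lines. This is only a sketch: the NAG numbers follow the NAG
table in the PGN standard ($1 good move through $6 questionable move),
while the function name and the move pattern are illustrative
assumptions. It does not skip brace comments, so move-like text inside
a comment would also be rewritten.

```python
import re

# Suffix annotation -> NAG, per the PGN standard's NAG table.
SUFFIX_TO_NAG = {"!": "$1", "?": "$2", "!!": "$3", "??": "$4", "!?": "$5", "?!": "$6"}

# Assumed, simplified SAN pattern: a piece/pawn move or castling, an optional
# check/mate sign, then one or two annotation characters.
_ANNOTATED_MOVE = re.compile(
    r"((?:[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?|O-O(?:-O)?)[+#]?)"
    r"([!?]{1,2})(?![!?])"
)


def suffixes_to_nags(movetext: str) -> str:
    """Replace suffix annotations such as 'e4!' with 'e4 $1'."""
    return _ANNOTATED_MOVE.sub(
        lambda m: f"{m.group(1)} {SUFFIX_TO_NAG[m.group(2)]}", movetext
    )


print(suffixes_to_nags("1. e4! e5 2. Nf3?! Nc6 3. O-O!!"))
```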

I am also considering dropping the section on the binary version of
PGN (i.e., PGC) to further simplify the standard.

Candidate material to be added:

1: BIF (Broket Information Form) formalization.

2: PPD (Portable Player Directory) formalization; this will facilitate
a global player database on the Internet.

3: Tablebase references formalization.

4: Hypertext link formalization.

5: Digital signatures for tag pairs lists, movetexts, games, and
files.

6: Well supported suggestions from the Internet community.

> 2b) Do current implementations (chess programs, PGN readers, etc.)
> adhere fully to the described standard, or are there features in the
> standard that "didn't get accepted" by most programmers (and which I
> may therefore also ignore when writing my PGN parser/generator)?

To my knowledge, no one has done a comprehensive survey on PGN
compatible software. The best way to find out about a particular
product is to ask here on the rec.games.chess.computer newsgroup.
Authors with comments and questions should post to the newsgroup so
that the reply comments and answers can be shared with others.
Communication by e-mail is okay, too. There is never any charge for
consultation, nor is there any license or royalty needed for using PGN
(or EPD, FEN, and SAN). Note to publishers: it is much better to get
everything resolved prior to product release as it helps prevent a
product from being pilloried in a public forum.

Clearly, some implementations are better than others. However, even
the best implementation can be limited by inaccuracies in data entry.
For example, many games are entered with incomplete or empty tag fields
for the Event, Site, and Date tag pairs.

> 3) Three PGN variants are described: Import Format (more lenient),
> Export Format (strict), and Reduced Export Format (Seven Tag Roster;
> without comments, numeric annotation glyphs (NAGs), or recursive
> annotations).

In the upcoming revision of the PGN Standard, references to the Import
and Export format will be removed. Instead of the Import format
description, a more formal language specification (Backus-Naur Form)
will appear. The Export format will have its name changed to
something like "Base Presentation" format. A section will be included
to help make a clear distinction between data interchange formats and
presentation formats.

The Seven Tag Roster is provided as a minimum requirement for each PGN
capable program and for each game represented using PGN.

> I plan not to support the Import Format (not even in my PGN import
> routine). This presupposes that most of the available PGN files are in
> strict Export Format.

This may not be the best assumption, and it may cause problems as more
programs use the more advanced features of PGN. At the very least,
a program that does not support the full PGN feature set should be
able to intelligently ignore unsupported features in input data.

> I think it is best to use the Export Format (for both the load and
> save procedures). But if, for instance, NAGs are not very common in
> practice, I would like to avoid programming those features.
> So is it a good choice to use the PGN Export Format variant
> exclusively, and with what omissions?

I think that NAGs are fairly easy to program. The more difficult
features are the RAVs. The Broket Information Forms (BIFs) are of
intermediate difficulty.



> 4) Extended Position Description (EPD)
> I have to load and store the starting position of a chess game if it
> doesn't start with a full board.

Most programs work this way. Mine always stores the starting
position, even if it is the standard one. It also stores a "setup"
flag.

> I found the 2 tags SETUP and FEN in the above mentioned PGN standard.

The tag "Setup" is needed only if the starting position is not the
standard one. It would then be:

[Setup "1"]

The alternative for a standard setup is never needed, but should be
accepted on input if it appears:

[Setup "0"]

If a [Setup "1"] tag pair appears, a FEN tag pair must also appear.
Remember that the data portion of a FEN tag must have all six fields.
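
The Setup/FEN rule above could be checked on input roughly as follows.
This is a minimal sketch, not a full tag-pair parser: the function
names are illustrative, and tag names are compared case-insensitively
so that spelling variants are accepted, which is a leniency assumed
here rather than something the rule itself requires.

```python
import re

# One tag pair per line: [Name "value"], with backslash escapes inside the value.
_TAG_PAIR = re.compile(r'^\[(\w+)\s+"((?:[^"\\]|\\.)*)"\]$')


def read_tags(lines):
    """Collect tag pairs into a dict keyed by lower-cased tag name."""
    tags = {}
    for line in lines:
        match = _TAG_PAIR.match(line.strip())
        if match:
            tags[match.group(1).lower()] = match.group(2)
    return tags


def check_setup(tags):
    """Return a list of problems with the Setup/FEN tag pairs (empty if fine)."""
    problems = []
    setup = tags.get("setup", "0")  # an absent Setup tag means the standard start
    if setup not in ("0", "1"):
        problems.append(f"unexpected Setup value {setup!r}")
    if setup == "1":
        fen = tags.get("fen")
        if fen is None:
            problems.append("Setup is 1 but no FEN tag pair is present")
        elif len(fen.split()) != 6:
            problems.append(f"FEN has {len(fen.split())} fields, expected 6")
    return problems


header = ['[Event "Study"]', '[Setup "1"]', '[FEN "4k3/8/8/8/8/8/8/4K3 w - - 0 1"]']
print(check_setup(read_tags(header)))
```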

> But there is also EPD as an extension to FEN. I could not find a tag
> pair "Setup, EPD"; is there none?

No, EPD is not an extension of FEN. They do share the first four
fields, but are otherwise different. There is no EPD tag in PGN. A
refresher:

1: FEN (Forsyth-Edwards Notation) represents one position on a single
text line and always uses EXACTLY SIX fields (in order: piece
placement, active color, castling availability, en passant target
square, half move clock, and full move number). FEN is handy for cut
and paste operations and is commonly used in newsgroup postings to
allow readers to try out posted positions in their own programs. It
is also useful for graphical diagram construction software. Here is
the FEN for the initial position:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

And after 1. e4 e5 2. Nf3 Nc6 3. Bb5:

r1bqkbnr/pppp1ppp/2n5/1B2p3/4P3/5N2/PPPP1PPP/RNBQK2R b KQkq - 3 3

2: EPD (Extended Position Description) represents one position with
extended description and always uses FOUR fields (in order: piece
placement, active color, castling availability, en passant target
square) which are the same as the first four fields of FEN. After the
four fields, each EPD record has zero or more operations. Each
operation consists of an operator mnemonic, an operand list, and a
semicolon. An operator mnemonic is an identifier taken from the EPD
standard operator set. An operand list is a sequence of zero or more
operands. An operand is a symbol that makes sense for the particular
operator and for its position in the operand list.
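
To make the difference between the two formats concrete, here is a
small sketch of readers for both, sharing one routine for the four
common fields. All names are illustrative assumptions. The validation
is deliberately shallow (rank widths and active color only), and
string operands are kept with their quotes.

```python
import re
from typing import NamedTuple


class Position(NamedTuple):
    placement: str
    active_color: str
    castling: str
    en_passant: str


def _parse_position(fields):
    """Parse the four fields shared by FEN and EPD."""
    placement, color, castling, en_passant = fields
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("piece placement needs eight ranks")
    for rank in ranks:
        # A digit counts that many empty squares; any other character is one piece.
        if sum(int(c) if c.isdigit() else 1 for c in rank) != 8:
            raise ValueError(f"rank {rank!r} does not describe eight squares")
    if color not in ("w", "b"):
        raise ValueError("active color must be 'w' or 'b'")
    return Position(placement, color, castling, en_passant)


def parse_fen(text):
    """Return (position, halfmove_clock, fullmove_number); exactly six fields."""
    fields = text.split()
    if len(fields) != 6:
        raise ValueError(f"FEN needs six fields, got {len(fields)}")
    return _parse_position(fields[:4]), int(fields[4]), int(fields[5])


# An operation runs up to a semicolon that is not inside a quoted string.
_OPERATION = re.compile(r'((?:[^;"]|"[^"]*")+);')
_TOKEN = re.compile(r'"[^"]*"|\S+')


def parse_epd(record):
    """Return (position, [(mnemonic, [operands...]), ...])."""
    parts = record.split(None, 4)
    if len(parts) < 4:
        raise ValueError("EPD needs four position fields")
    rest = parts[4] if len(parts) == 5 else ""
    operations = []
    for op in _OPERATION.findall(rest):
        tokens = _TOKEN.findall(op)
        if tokens:
            operations.append((tokens[0], tokens[1:]))
    return _parse_position(parts[:4]), operations


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
print(parse_fen(start + " 0 1"))
print(parse_epd(start + " pv e4 e5 Nf3 Nc6; acs 30;"))
```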

Sample EPD files can be had from the chess.onenet.net ftp site in the
pub/chess/Tests directory.

The big new thing in the upcoming PGN revision is the formalization of
the equivalence of EPD operations and BIF operations in PGN movetext.
(They form an isomorphism.) For example, the EPD operator mnemonic
"pv" is used to represent a predicted variation analysis:

pv e4 e5 Nf3 Nc6;

The equivalent BIF appearing in a PGN movetext would be:

<pv e4 e5 Nf3 Nc6>

There may be some mnemonics that apply either only to EPD or only to
BIFs, but most are dual usage. The idea is to simplify implementation
and use by keeping the two sets pretty much the same with regard to
syntax and semantics.
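
Going only by the examples in this post (the BIF syntax is not yet
formalized), the mapping between the two notations might be sketched
like this. The function names are illustrative, and the sketch assumes
operands contain no semicolons or angle brackets.

```python
import re


def epd_ops_to_bifs(ops_text: str) -> str:
    """'pv e4 e5; acs 30;' -> '<pv e4 e5> <acs 30>'."""
    ops = [op.strip() for op in ops_text.split(";") if op.strip()]
    return " ".join(f"<{op}>" for op in ops)


def bifs_to_epd_ops(bif_text: str) -> str:
    """'<pv e4 e5> <acs 30>' -> 'pv e4 e5; acs 30;'."""
    bodies = re.findall(r"<([^<>]*)>", bif_text)
    return " ".join(f"{body.strip()};" for body in bodies)


print(epd_ops_to_bifs("pv e4 e5 Nf3 Nc6;"))
print(bifs_to_epd_ops("<pv e4 e5 Nf3 Nc6>"))
```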

> 5) I want to store various data with some or every move, such as the
> evaluation, nodes searched, nodes/sec, hash statistics, internal
> flags, ...

A good idea.

> How do I best do this in conformance with widespread use, for instance
> by incorporating the data in {comments}?

PGN comments, either brace comments or semicolon comments, are not
intended for automated parsing, but are rather for human consumption.
Therefore, programs should not make too many assumptions about their
contents or format. The better way is to use BIFs as they are
standard and can be intelligently read by both humans and programs.

> In the EPD standard there are "EPD operations" like "acs" (analysis
> count: seconds), but they seem to be meant for "position annotation"
> and not "move annotation".

EPD is for describing a position, and this may be interpreted as
including comments on moves associated with a position.

For example, if an EPD record had the following analysis data:

acn 314159; acs 30; ttm 4567; ttp 54321;

Then the corresponding movetext BIFs would be:

<acn 314159> <acs 30> <ttm 4567> <ttp 54321>

(ttm: transposition table matches; ttp: transposition table probes)

There is an important consideration for BIFs that may not be
immediately obvious: they always describe the "current position",
that is, the position that occurs between moves (or before the first
move or after the last move). So, most BIFs for a particular move
should appear immediately prior to the move in the PGN movetext. This
is of interest to authors of presentation format programs, as most
presentation formats will follow traditional human conventions and
have analysis appearing after the played move.
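
A writer following this ordering might look roughly like the sketch
below. The function name and the per-move dictionaries are illustrative
assumptions. Placing the BIFs between the move number and the move is
one possible reading of "immediately prior to the move", not a settled
rule.

```python
def movetext_with_bifs(moves, analysis):
    """Build movetext with each move's BIFs placed directly before the move.

    moves:    SAN strings in game order, White's first move first.
    analysis: one dict per move mapping a mnemonic (e.g. 'acn') to its value.
    """
    tokens = []
    for ply, (san, stats) in enumerate(zip(moves, analysis)):
        if ply % 2 == 0:
            tokens.append(f"{ply // 2 + 1}.")  # move number before White's move
        tokens.extend(f"<{name} {value}>" for name, value in stats.items())
        tokens.append(san)
    return " ".join(tokens)


print(movetext_with_bifs(["e4", "e5"], [{"acn": 100, "acs": 2}, {"acn": 250}]))
```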

The presence of BIFs in movetext, like RAVs and NAGs, has no effect on
the actual move history of the game. Therefore, they may be ignored
by importing programs whose only purpose is recovering simple game
data.
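
An importer that only wants the bare move list could discard these
elements along the following lines. This is a minimal sketch with an
illustrative function name. It assumes comments are well formed and
that BIFs use the angle-bracket form shown earlier in this post.

```python
import re

_RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def plain_moves(movetext: str):
    """Return only the played moves, dropping comments, BIFs, RAVs, and NAGs."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)   # brace comments
    text = re.sub(r";[^\n]*", " ", text)         # semicolon comments (to end of line)
    text = re.sub(r"<[^>]*>", " ", text)         # BIFs
    previous = None
    while previous != text:                      # RAVs may nest: strip innermost first
        previous = text
        text = re.sub(r"\([^()]*\)", " ", text)
    text = re.sub(r"\$\d+", " ", text)           # NAGs
    moves = []
    for token in text.split():
        token = re.sub(r"^\d+\.+", "", token)    # "3.", "3...", or the "3." in "3.e4"
        token = token.rstrip("!?")               # suffix annotations
        if token and token not in _RESULTS:
            moves.append(token)
    return moves


sample = (
    "1. e4 {best by test} e5 <acn 314159> 2. Nf3 $1 (2. f4 exf4 (2... d5)) "
    "Nc6 ; rest of line\n3. Bb5 1/2-1/2"
)
print(plain_moves(sample))
```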

Anyway, the BIF/EPD connection will be more fully explained in the
revision.

I will have the revised standard available for Internet discussion by
January 1997 with a target date for finalization of July 1997. The
next revision (if any) will probably be from three to five years
later.

-- Steven (s...@mv.mv.com)
