
Supporting PGN, EPD and FEN


Rob McDonell

Jun 14, 1998, 03:00:00

I'm releasing a new version of K-Chess Elite which will support reading and
writing PGN, EPD and FEN file formats. It's been an interesting exercise,
not because the formats are hard but because there seem to be so many
generally accepted deviations from them. For example...

* Some PGN files have "ep" appended to en passant captures, even though
the PGN spec prohibits it.

* PGN tags are usually recorded in some sort of logically grouped (at
least to someone) order rather than according to the spec which says ASCII
order.

* Some PGN files include WhiteCountry and BlackCountry tags which are not
in the spec.

* Some EPD files don't have semicolons between operations which the spec
says should follow every operation.

* Some EPD files have miscellaneous information after the final semicolon,
but this practice is never mentioned in the spec.

* Some FEN files seem to have comments at the end, surrounded by square
brackets.
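
For the bracketed FEN comments, here is a minimal Python sketch of one way to tolerate them on input. It assumes the comment is a single [...] group at the very end of the line. The name split_fen_comment is made up for illustration and does not come from K-Chess Elite or from any spec.

```python
import re

# Hypothetical helper: a single "[...]" group at the very end of the line
# is treated as a comment. This is an assumption, not part of the FEN spec.
_TRAILING_COMMENT = re.compile(r"\s*\[([^\]]*)\]\s*$")

def split_fen_comment(line):
    """Split a FEN line into (fen, comment); comment is None when absent."""
    match = _TRAILING_COMMENT.search(line)
    if match is None:
        return line.strip(), None
    return line[:match.start()].strip(), match.group(1)
```

A caller would still want to check that the remaining text has the six FEN fields before trusting it.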

Here's my question. Is there somewhere these "deviations" are discussed and
either condoned or condemned? I've tried to cater for all the above, but
I'm wondering whether there are any more out there I'm not aware of.

By the way, if you'd care to try out the last beta version of K-Chess Elite
you can get it from www.pnc.com.au/~arkangle/kchess.htm

Regards,
Rob McDonell
ARK ANGLES
Web: www.pnc.com.au/~arkangle

Richard A. Fowell

Jun 14, 1998, 03:00:00

Basically, it's kind of a double standard - make sure your program
doesn't output files with any of these things, but try to
handle them gracefully yourself.


In article <01bd9759$eb8530e0$327339cb@arkangle> "Rob McDonell" <arka...@compuserve.com> writes:
>I'm releasing a new version of K-Chess Elite which will support reading and
>writing PGN, EPD and FEN file formats. It's been an interesting exercise,
>not because the formats are hard but because there seem to be so many
>generally accepted deviations from them. For example...
>
>* Some PGN files have "ep" appended to en passant captures, even though
>the PGN spec prohibits it.
>

Sad but true, though thankfully rare.

>* PGN tags are usually recorded in some sort of logically grouped (at
>least to someone) order rather than according to the spec which says ASCII
>order.

The spec says - the seven required tags first, in order, then
any tags you like, in ASCII order. But there are a lot of files
out there that do as you say.
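
The export order described above can be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, and the placeholder values for missing roster tags ("?", "????.??.??" for Date, "*" for Result) follow one reading of the spec rather than anything stated in this thread.

```python
# The seven required tags, in the order the spec asks for on export.
SEVEN_TAG_ROSTER = ["Event", "Site", "Date", "Round", "White", "Black", "Result"]

# Assumed placeholders for roster tags that the input did not supply.
UNKNOWN_VALUE = {"Date": "????.??.??", "Result": "*"}

def order_tags_for_export(tags):
    """Return (name, value) pairs: roster tags first, the rest in ASCII order."""
    ordered = [(name, tags.get(name, UNKNOWN_VALUE.get(name, "?")))
               for name in SEVEN_TAG_ROSTER]
    # sorted() compares by code point, which is ASCII order for ASCII names.
    extras = sorted(name for name in tags if name not in SEVEN_TAG_ROSTER)
    ordered += [(name, tags[name]) for name in extras]
    return ordered

def format_tag_section(tags):
    """Render the tag section, one tag pair per line (no quote escaping)."""
    return "\n".join('[%s "%s"]' % pair for pair in order_tags_for_export(tags))
```

On the reading side, the tags can go into a plain dict in whatever order the file uses, so out-of-order input costs nothing.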


>
>* Some PGN files include WhiteCountry and BlackCountry tags which are not
>in the spec.

The spec says (section 8.1.1) that
" ... the definition and use of additional tag names and semantics
is permitted and encouraged when needed ..."
Typical practice is to simply ignore unrecognized tags.


>
>* Some EPD files don't have semicolons between operations which the spec
>says should follow every operation.
>
>* Some EPD files have miscellaneous information after the final semicolon,
>but this practice is never mentioned in the spec.
>
>* Some FEN files seem to have comments at the end, surrounded by square
>brackets.
>
>Here's my question. Is there somewhere these "deviations" are discussed and
>either condoned or condemned? I've tried to cater for all the above, but
>I'm wondering whether there are any more out there I'm not aware of.

Almost certainly. I'm most familiar with PGN sins.
Here are a few in more or less order of frequency:

- ending a PGN movetext line short
(the spec says maximal number of tokens in less than 80 characters)
- using a space as the last character of a movetext line
- omitting the space between a period and White's move when
adjacent on the same line
- inserting optional tags before the last of the standard 7 tags
- omitting the empty line after the movetext
- using "ep" for en passant
- using "" rather than "?" for an unknown tag entry
- omitting the Game Termination Marker
- using "+" or "++" for checkmate rather than "#"
- using lower case "oh" or zeros for castling
- omitting the empty line after the tag section
- omitting a "+" for a check
- putting in extra spaces in the movetext
- using tabs rather than spaces in the movetext
- omitting one of the standard 7 tags (typically the "Round" tag)
- having the standard 7 tags out of order
- having the "Date" tag data in the wrong format
- Result (and/or termination marker) of "1/2" rather than "1/2-1/2"
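
Several items in this list can be repaired token by token without knowing the position. The following Python sketch does that under stated assumptions. The function names are hypothetical, and "++" is taken to mean checkmate, although some older notations use it for double check. Comments and variations in the movetext are not handled. A lone "+" written for checkmate, or a missing "+" for check, needs a board to fix and is not attempted here.

```python
import re

def normalize_move_token(token):
    """Map a few common non-standard spellings to standard PGN tokens."""
    # Result or termination marker written as "1/2".
    if token == "1/2":
        return "1/2-1/2"
    # "ep" appended to an en passant capture, keeping any check suffix.
    token = re.sub(r"ep(?=[+#]*$)", "", token)
    # "++" used for checkmate (assumption: it is not meant as double check).
    if token.endswith("++"):
        token = token[:-2] + "#"
    # Castling written with zeros or lower-case "o".
    token = re.sub(r"^[0oO]-[0oO]-[0oO]", "O-O-O", token)
    token = re.sub(r"^[0oO]-[0oO](?!-)", "O-O", token)
    return token

def tokenize_movetext(text):
    """Split movetext into normalized tokens, tolerating sloppy spacing."""
    # Insert the missing space in "1.e4"; "1..." followed by a space is untouched.
    text = re.sub(r"(\d+\.+)(?=[^\s.])", r"\1 ", text)
    # split() with no argument absorbs tabs, extra spaces and trailing spaces.
    return [normalize_move_token(tok) for tok in text.split()]
```

For example, tokenize_movetext("1.e4\te5  2. 0-0 ") gives the tokens "1.", "e4", "e5", "2.", "O-O".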

I tried 9 PGN programs yesterday.
Of the above errors:
- 7 made the first
- 5 made the second
- 2 made the third
- each of the next nine was made by 1 program (and no more than one).
Unfortunately, most programs made at least one kind of error
in addition to the "top three".

One point here - as the statistics above show, any given PGN error
(with the possible exception of the top two) is not "generally accepted".
No more than one or two programs out of the nine made any other
particular error.

What is "generally accepted" (by me, based on my little test above)
is that you can expect program generated PGN output to have errors/deviations
from the standard. There was one program whose output appeared to be correct,
but clearly, one out of nine is an exception <grin>.

Basically, there are so many types of errors to be made
that it is rare for a program not to exhibit some.

The good news is, if your PGN input parser can handle all the errors I list
above, you'll do pretty well (and better than most).

However, based on my above data, when you throw programs "ten"
and "eleven" into the mix, you can probably expect some new quirks.

Here's a nice test file, by the way. It exercises a lot of the PGN features
present in "reduced PGN export format" in a single game:

- Checkmate symbol used
- Promotion to Queen (with capture and check!)
- Underpromotion to Knight
- En Passant capture
- En Passant capture with check
- Both kingside and queenside castling
- Five file disambiguations (two knight, three rook)
- Two rank disambiguations (one knight, one rook)

================= snip here ====================
[Event "KO-50.2"]
[Site "IECC"]
[Date "1997.04.15"]
[Round "?"]
[White "Brown, Mary"]
[Black "Green, John"]
[Result "1-0"]

1. e4 e5 2. Nf3 Nc6 3. Bc4 b6 4. O-O Bb7 5. d4 Qf6 6. c3 O-O-O 7. Nbd2 exd4 8.
cxd4 Nge7 9. d5 Ne5 10. Qe2 N7g6 11. Ba6 Bd6 12. Nxe5 Qxe5 13. Bxb7+ Kxb7 14.
Nf3 Qh5 15. b3 c5 16. dxc6+ dxc6 17. Bb2 Rhe8 18. Rfc1 Bf4 19. Rc4 Rd2 20. Qxd2
Bxd2 21. Nxd2 Nf4 22. e5 f5 23. exf6 Qg5 24. g3 Ne2+ 25. Kf1 Qb5 26. f7 Kc8 27.
fxe8=Q+ Kc7 28. Rd4 Nc1+ 29. Kg1 c5 30. Qe3 Nxb3 31. axb3 g5 32. Rda4 c4 33.
Ra6 Qa5 34. R6xa5 c3 35. Nc4 cxb2 36. Rd1 b1=N 37. Qe4 Nc3 38. Rxa7+ Kb8 39.
Qb7# 1-0

================= snip here ====================
<snip>

fow...@netcom.com (Richard A. Fowell)

Andy Duplain

Jun 14, 1998, 03:00:00

What is needed is a public body to define the PGN standard and allow it to
grow. Authors would then lobby this group in order to get modifications
included that support the features they want. If a new PGN spec was
released each year, you could then label products (commercial or otherwise)
as PGN/98 compliant, meaning it supports all features up to and including
those in the "98" standard. Oh yeah, I'd better not forget about the
millennium bugs this will cause; it should of course be PGN/1998.

I wouldn't personally write code to support the deviations, as I feel it
just encourages them... the smaller the amount of software that supports
non-standard PGN, the smaller the number of non-standard PGN files.

--
Andy Duplain, Trojan Consulting Ltd., Brighton, UK.

Richard A. Fowell <fow...@netcom.com> wrote in message
fowellEu...@netcom.com...

Komputer Korner

Jun 15, 1998, 03:00:00

On the CCC, right now there is an ongoing debate about the 1998 PGN
standard. Steven Edwards seems to be leading the standard revision
himself but I wish that there would be a committee set up so that
everyone would be sure to be heard.

--
--
Komputer Korner
The inkompetent komputer

To send email take the 1 out of my address. My email address is
kor...@netcom.ca but take the 1 out before sending the email.
Andy Duplain wrote in message
<897855231.25950.0...@news.demon.co.uk>...

Paul Onstad

Jun 15, 1998, 03:00:00

> - ending a PGN movetext line short
> (the spec says maximal number of tokens in less than 80 characters)

> Richard A. Fowell

Now what if 80 is too long? Most editors and Emailers only handle about 76
safely. Right now, for example, MS wraps your sample game.
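
Either limit is easy to try out. Here is a minimal Python sketch of filling movetext lines with as many tokens as fit under a configurable width. The name wrap_movetext and the default of 79 (that is, fewer than 80 characters) are assumptions for illustration. Passing width=76 gives the more conservative limit mentioned above.

```python
import textwrap

def wrap_movetext(tokens, width=79):
    """Join tokens with single spaces and fill lines up to `width` characters."""
    return textwrap.fill(
        " ".join(tokens),
        width=width,
        break_long_words=False,   # never cut a token in half
        break_on_hyphens=False,   # keep "O-O-O" and "1/2-1/2" in one piece
    )
```

textwrap.fill drops the whitespace at each line break, so the output also avoids a trailing space at the end of a movetext line.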

That particular point, and space after period (1. e4) have always been the
most objectionable parts of the PGN spec to me. It's good to know that a
review is going on right now at CCC......But what/where is CCC?

-Paul

Richard A. Fowell

Jun 15, 1998, 03:00:00

In article <01bd9838$382548c0$99952299@dell1> "Paul Onstad" <paulo...@msn.com> writes:
>> - ending a PGN movetext line short
>> (the spec says maximal number of tokens in less than 80 characters)
>
>> Richard A. Fowell
>
>Now what if 80 is too long? Most editors and Emailers only handle about 76
>safely. Right now, for example, MS wraps your sample game.
>
>That particular point, and space after period (1. e4) have always been the
>most objectionable parts of the PGN spec to me. It's good to know that a
>review is going on right now at CCC......But what/where is CCC?
>
> -Paul


I appreciate the issues about the line length.

What don't you like about the space, however?
It seems pretty standard in books/magazines,
and helps make things more readable.

fow...@netcom.com (Richard A. Fowell)

Komputer Korner

Jun 15, 1998, 03:00:00

Go to ICD and get a free password to get into CCC. A lot of
programmers hang out at CCC.

--
--
Komputer Korner
The inkompetent komputer

To send email take the 1 out of my address. My email address is
kor...@netcom.ca but take the 1 out before sending the email.

Paul Onstad wrote in message <01bd9838$382548c0$99952299@dell1>...


>> - ending a PGN movetext line short
>> (the spec says maximal number of tokens in less than 80 characters)
>

Paul Onstad

Jun 15, 1998, 03:00:00

> I appreciate the issues about the line length.
>
> What don't you like about the space, however?
> It seems pretty standard in books/magazines,
> and helps make things more readable.

> Richard A. Fowell

But is it standard or more readable? USCF uses it; Inside Chess does not.
Many books, when the examples are not in tabular format, do not. It's
really not a thing people notice, and would likely be a non-issue except
for the extra 10% of hard disk space it requires. That, and a little more
info on the screen without it.

Now for a more radical suggestion: we should next do away with [Event ""].
That single tag has corrupted more games since 1994 than perhaps even the
hyphen separator (White - Black) of the old Nunn's Text Reader used by
ChessBase.

-Paul
