Opening book cleanup

Dariusz Orzechowski

unread,
Nov 3, 2013, 9:11:20 AM11/3/13
to fishc...@googlegroups.com
I have applied a slightly modified version of the procedure described by Martin here: http://www.tcec-chess.net/viewtopic.php?f=14&t=145 to the book 8moves_GM.pgn and produced a cleaned-up version of it. Changes: I used Stockfish from 31 October for the analysis, checked two plies out of the book to depth 16, and used a higher threshold of +- 50 cp. Rationale: in our tests we use Stockfish in self-play, so it seemed natural to use it for the analysis as well (I don't own H3 anyway), and at the TC used in fishtest the depth of the first moves out of the book is around 16, so there is no point in enormously time-consuming analysis at higher depths; 2 plies at d16 over more than 46,000 games already took long enough. A threshold of 50 cp seems to be a good compromise between variety and balance, and it keeps enough positions for the book to remain usable.

Some stats:
1. The original book had 49217 positions; the final version shrank to 32925, which is still big enough for all practical purposes.
2. Around 3000 positions were duplicates - either exact duplicates (rare) or the same position out of the book reached by transposed moves (a lot).
3. Number of positions found with average eval greater than:
   +-400:      8
   +-300:     13
   +-200:     76
   +-150:    379 
   +-100:   2893 
   +- 50:  13200


These positions were filtered out. I have also created an epd version, which is much smaller, should load quicker and is easier to use. Both the pgn and epd versions are attached in case anyone is interested.
8moves_v2_epd.zip
8moves_v2_pgn.zip

Marco Costalba

unread,
Nov 3, 2013, 11:59:02 AM11/3/13
to Dariusz Orzechowski, fishc...@googlegroups.com
Great work !

I think we can push them to fishtest and make them the default ones. Gary, what do you think?

Gary Linscott

unread,
Nov 3, 2013, 1:41:55 PM11/3/13
to fishc...@googlegroups.com, Dariusz Orzechowski
Awesome!  Thanks Dariusz, sure, let's make it the new default.

Marco Costalba

unread,
Nov 3, 2013, 4:30:55 PM11/3/13
to Gary Linscott, fishc...@googlegroups.com, Dariusz Orzechowski
Ok, uploaded now!

Just write 8moves_v2.epd or 8moves_v2.pgn instead of 8moves_GM.pgn in the 'book' field when creating a new test, and it should force the workers to download and use the new book.

Adam

unread,
Nov 3, 2013, 6:45:09 PM11/3/13
to fishc...@googlegroups.com
I am curious about the work you did on the pgn. I created the pgn using a similar method to yours, except that I used H3 and a somewhat deeper search. I too used a 50 centipawn threshold. I guess the number of positions you found to be outside the threshold could be due to the larger scale of Stockfish's scoring. I think that using a 50 cp bound with Stockfish (as opposed to an engine with a smaller score scale) might remove useful positions (positions that are less drawish and help detect strength gains) and leave the pgn with an even higher percentage of dead drawn openings. Maybe +- 75 cp would be better when using Stockfish.

What really perplexes me is your count of 49,217 openings and your discovery of duplicate openings of any kind. First, the pgn should contain 48,491 openings, unless it has been modified since I shared 8moves_GM.pgn with the Stockfish testing framework. Also, the final position of every opening in the pgn is unique (no duplicates). I used two different methods to ensure that none of the positions coming out of the book were duplicates. I used the 'enddups' command for Gaviota to compare and remove any duplicate final positions in the pgn. I also used pgn-extract and two epd utilities from Norm Pollock to create a FEN of the final position of each opening and check for duplicate FENs. I just checked the pgn again, and I found no duplicates.
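
For anyone who wants to cross-check this independently, a small script along these lines would extract the final position of every opening and report duplicates (just a sketch using the python-chess library, not the tools mentioned above):

import sys
from collections import Counter

import chess.pgn

def final_epds(pgn_path):
    # Yield the EPD of the final position of each game in the file.
    with open(pgn_path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            yield game.end().board().epd()

counts = Counter(final_epds(sys.argv[1]))
print(sum(counts.values()), "openings,", len(counts), "unique final positions")
for epd, n in counts.items():
    if n > 1:
        print("duplicate x%d: %s" % (n, epd))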

There is no doubt in my mind that this pgn is not perfect. For example, I do not know whether any of the positions tend to transpose into each other in the first few moves out of the book. But the problems you found are unexpected and, IMO, should already have been accounted for. Here is the link to the pgn: http://www.mediafire.com/?qb8bt8k9acdq1ko . Could you compare it to the pgn that you worked on?

Mindbreaker

unread,
Nov 3, 2013, 7:14:24 PM11/3/13
to fishc...@googlegroups.com
You might also want to filter out any positions where either side has the option of a forcing or near forcing 3-fold repetition.

That is not as easy to do with this large number of positions, but suppose you had an engine configured to jump at any draw, playing both sides against any unbiased opponent, with an adjudication rule that automatically ended the games at 8 moves out of book (16 plies). Then anything drawn in fewer than 6 moves would probably be a repetition. You could even use a reasonable depth, as the games would only be a few moves long. You probably would not get any that are decisive; if any were, they would probably also be poor choices. Getting rid of these drawn positions can also speed up tests, as fewer games would be required.

Mindbreaker

unread,
Nov 3, 2013, 7:16:14 PM11/3/13
to fishc...@googlegroups.com
Correction "anything shorter than 6 moves" should read "anything shorter than 8 moves".

Dariusz Orzechowski

unread,
Nov 3, 2013, 7:18:41 PM11/3/13
to fishc...@googlegroups.com
First: I am far from saying that the book was bad; it was pretty good, but I tried to improve it.

Yes, it was exactly the same file. The count of 49,217 was displayed by pgnscanner; it may have been somewhat misleading, as I didn't bother to double-check the number.

As for duplicates: I gave a link to the whole procedure that I used, so you may verify it for yourself. I am not imposing anything, I only described in detail what was done and shared the results. Some time ago I spotted some positions that were lost immediately out of the book, so I decided to check the whole book in a more systematic way. The 50 cp threshold is of course arbitrary, but the book still contains almost 33,000 positions, so I don't think it is a problem.

Dariusz Orzechowski

unread,
Nov 3, 2013, 7:25:43 PM11/3/13
to fishc...@googlegroups.com
On Monday, 4 November 2013 01:14:24 UTC+1, Mindbreaker wrote:
> You might also want to filter out any positions where either side has the option of a forcing or near forcing 3-fold repetition.

I think it would be too much effort for negligible gain, but using epd positions instead of pgn is a good enough remedy, because the move history is lost in epd.

Dariusz Orzechowski

unread,
Nov 3, 2013, 7:59:04 PM11/3/13
to fishc...@googlegroups.com
There is of course also the possibility that pgnscanner has some bugs and finds false duplicates where there are none, but I obviously haven't verified this manually.

lucas....@gmail.com

unread,
Nov 3, 2013, 10:35:07 PM11/3/13
to fishc...@googlegroups.com, Gary Linscott, Dariusz Orzechowski
=> WAIT!

I don't agree with Dariusz's results. For a start, I did not find a single duplicate:

* I transformed the PGN file from Adam Hair into EPD (one FEN per line). This is a much superior format, for many reasons that will become clear below.

* There are NO duplicates:
$ wc -l ./book.epd
48491 ./book.epd
$ sort -u ./book.epd|wc -l
48491

As for the analysis, I have not done any, but we cannot blindly trust a tool that we know nothing about, especially when we know that it's buggy, as it detected duplicates erroneously.

I suggest we write a simple Python script to do this. I do not feel comfortable enough in Python to write it, but for a seasoned Python developer, it should be easy to do:
* open the engine process and send the introduction stuff:
uci
setoption name Hash value ...
setoption name Threads value ... (use all cores)
isready
* read the EPD file, line by line (each line is a FEN). For each FEN send:
ucinewgame
position fen ...
go movetime ...
* parse the resulting lines, and every time we get the "score cp %d" pattern, put the value of %d in a variable X (overwriting, so as to keep the last score). If X is outside the bounds, output the FEN to a file "bad.epd"; if X is within the bounds, output it to "good.epd"
* Then look at bad.epd, see how many we have, and perhaps do another pass with the same script, this time increasing the "movetime".

If anyone is willing to write such a script, even just the backbone of it, so I can modify it, I can send you the EPD file.

As a benefit, we would get an engine-agnostic tool to validate EPD books that we can run on other files or with other engines: e.g. maybe Stockfish misevaluates a position as bad, but Houdini and Komodo think the opening is perfectly playable (as you know, opening scores are often inflated and volatile, as there are strong development terms in the eval).
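
To make the request concrete, a bare-bones sketch could look like the following (it leans on the python-chess engine wrapper instead of driving the UCI protocol by hand; the engine path, options and movetime are placeholders, and it is untested):

import sys

import chess
import chess.engine

ENGINE_PATH = "./stockfish"   # placeholder: path to the engine binary
MOVETIME = 1.0                # placeholder: seconds per position
BOUND = 50                    # centipawns

engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
engine.configure({"Hash": 256, "Threads": 4})   # placeholders

good = open("good.epd", "w")
bad = open("bad.epd", "w")

with open(sys.argv[1]) as f:
    for line in f:
        fields = line.split()
        if not fields:
            continue
        # Accept either a full FEN or a 4-field EPD by padding the move counters.
        fen = " ".join(fields[:6]) if len(fields) >= 6 else " ".join(fields[:4] + ["0", "1"])
        board = chess.Board(fen)
        info = engine.analyse(board, chess.engine.Limit(time=MOVETIME))
        cp = info["score"].white().score(mate_score=100000)
        (good if abs(cp) <= BOUND else bad).write(fen + "\n")

engine.quit()
good.close()
bad.close()

The bad.epd output can then be re-run with a longer movetime, as described above.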

Dariusz Orzechowski

unread,
Nov 4, 2013, 6:01:59 AM11/4/13
to fishc...@googlegroups.com, Gary Linscott, Dariusz Orzechowski, lucas....@gmail.com
The analysis was done in cutechess-cli; you would know that if you had read the procedure I linked in the first post. A different tool was used to filter out duplicates, and if it was buggy and gave false positives, that does not render the whole procedure useless. The only side effect is that some good positions may not be included, and that is easy to fix if needed.

I have spent enough time on this and am not willing to argue over ideas about how it should be done. If you have a better idea, just do it. I trusted Martin's method. If duplicates were incorrectly marked and removed, there is a very simple solution: I can grab these "duplicates", run the SF eval on them and add those landing under the +-50 cp threshold. That would add another 1800 positions or so at most.

Adam

unread,
Nov 4, 2013, 7:23:03 AM11/4/13
to fishc...@googlegroups.com
I was not taking exception to you improving the pgn, for I do not think it is perfect. It probably is best to do as you did and evaluate the positions with Stockfish, but it might be best to tweak the bounds. Maybe throw out positions scored > |75 cp| and those in the interval +- 10 cp (as opposed to using Houdini 3 and +- 50 cp like I did).

I know from experience that pgnscanner is not reliable for finding duplicates. It misses some duplicates and falsely tags other positions as duplicates. That is why I used different methods to find duplicates.

Adam

unread,
Nov 4, 2013, 7:31:29 AM11/4/13
to fishc...@googlegroups.com, Gary Linscott, Dariusz Orzechowski, lucas....@gmail.com
I use the sim tool, a ruby script, and LibreOffice to do what the Python script you described does, Lucas. I do something similar with very large sets of positions with Gaviota. If you tell me what you think the movetime should be, I can process 8moves_GM through the sim tool with the latest Stockfish and find which positions fall outside certain bounds. I would be able to do this 8 to 12 hours from now.

mehdi....@gmail.com

unread,
Nov 4, 2013, 7:40:37 AM11/4/13
to fishc...@googlegroups.com
nice !!!! thx a lot Dariusz (I hope Marco won't be angry but couldn't resist to say thx)

Dariusz Orzechowski

unread,
Nov 4, 2013, 9:40:46 AM11/4/13
to fishc...@googlegroups.com, mehdi....@gmail.com
To conclude:
  1. It seems that pgnscanner raised false positives and there were no duplicates in the original book - thanks for spotting and raising this issue!
  2. I can easily fix this by taking all these false positives, which I have in a separate file (exactly 2406 positions in total), running the analysis over them and adding those below the +-50 cp threshold. This can be done in a few hours and I will do it. I think this step is not strictly necessary, as the book is still big enough, but it can be done for philosophical reasons (to not throw out good things too easily).
  3. It is of course possible to play with the threshold, for example raising it to +-75 cp, but I think that is not necessary, because a) the book with +-50 cp is large enough, and b) 75 cp is quite a big advantage, especially at the fast TC used in fishtest, so it would introduce too much bias for no reason. Filtering out positions closer to 0 than +-10 cp makes no sense in my opinion. Book lines are short enough (8 moves) to allow for well-fought games even if the eval out of the book is 0.00.
  4. Adam, I think there is no need to duplicate my work and process the book one more time for a very similar end result, unless for some reason it is important to you (e.g. you don't want anyone else to touch your book, or whatever).

Adam

unread,
Nov 4, 2013, 11:30:04 AM11/4/13
to fishc...@googlegroups.com
No, I do not feel possessive about the pgn. Anyone is free to do what they want with it. My interest is in improving the efficiency of any testing that makes use of the pgn, not in protecting my "work". It did catch my attention, though, when the pgn was reduced by a third using virtually the same process I used to create it. That makes me suspicious of the process.

I hope that you (Dariusz) do not have any objections to me duplicating your work with a different method. This would verify pgnscanner's output. 8moves_GM is not the first large pgn of unique starting positions I have created, but it is the first one where I used engine evaluations to remove positions. I would like to improve the process so that I can create a better set of starting positions.

To be honest, I have no interest in fighting over which positions are used by the Stockfish testing framework. If it were decided that the 8moves_GM pgn was not good enough for the framework, it would not bother me. My true interest is in creating the best possible set of openings so that I can use them in the Gaviota framework, and then happily sharing them with anybody else who thinks the positions would be useful.

Anyway, I hope that you understand that I am not upset or annoyed or anything else about the pgn being modified. I do not necessarily agree about the bounds being used to filter the positions, but I do hope the end result of your work does improve the efficiency of the testing.

Dariusz Orzechowski

unread,
Nov 4, 2013, 12:08:32 PM11/4/13
to fishc...@googlegroups.com
Thanks Adam, all is clear now. I of course don't mind if you want to process the book again.

As an example, I attach my output containing positions with SF evals greater than +-400 cp at depth 16; you may verify them. Maybe for Houdini these positions are below 50 cp, but that is extremely unlikely; I rather suspect that you had some issues with your setup and such positions slipped through.
out_400.pgn

Dariusz Orzechowski

unread,
Nov 4, 2013, 12:34:52 PM11/4/13
to fishc...@googlegroups.com
I have updated the book with the positions that were previously removed, incorrectly, as duplicates. This added 1775 positions, for a total of 34700. The updated pgn and corresponding epd are attached.


8moves_v3_epd.zip
8moves_v3_pgn.zip

Adam

unread,
Nov 4, 2013, 12:53:35 PM11/4/13
to fishc...@googlegroups.com
Thanks for uploading those +400 cp positions, Dariusz. I will check them out.

Adam

unread,
Nov 5, 2013, 6:25:54 AM11/5/13
to fishc...@googlegroups.com
I have only checked one position so far, but it is obvious that it is unbalanced and slipped through the filter. I am glad you found these unbalanced positions.

Mindbreaker

unread,
Nov 5, 2013, 1:24:46 PM11/5/13
to fishc...@googlegroups.com
It should be possible to remove all 0.00 positions. It is true that some of them are not drawish, just balanced, and those would be fine, but it should also catch most of the perpetuals and the positions devoid of life. I think the trade-off is a good one.



Dariusz Orzechowski

unread,
Nov 5, 2013, 6:21:06 PM11/5/13
to fishc...@googlegroups.com
I have checked how many games stay at exactly 0.00 at depth 16 for 6 consecutive plies right out of the book. I found only 23 such openings, so I won't even bother removing them, as their impact is practically zero.
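
For illustration, such a check can be sketched roughly like this (using the python-chess engine wrapper; the actual run here used different tooling, and the FEN is just an example position):

import chess
import chess.engine

def stays_at_zero(engine, fen, plies=6, depth=16):
    # Follow the engine's best line out of the position and report whether
    # the score is exactly 0 cp on each of the first `plies` plies.
    board = chess.Board(fen)
    for _ in range(plies):
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        if info["score"].pov(board.turn).score(mate_score=100000) != 0:
            return False
        board.push(info["pv"][0])
    return True

engine = chess.engine.SimpleEngine.popen_uci("./stockfish")   # placeholder path
# Example book-exit position (1.d4 d5 2.c4 e6 3.Nc3 Nf6 4.Nf3 c5):
fen = "rnbqkb1r/pp3ppp/4pn2/2pp4/2PP4/2N2N2/PP2PPPP/R1BQKB1R w KQkq - 0 5"
print(stays_at_zero(engine, fen))
engine.quit()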

Out of curiosity I have played two very short matches with these positions (TC 5+0.05, 46 games each).
Results (W-L-D):
SF 021113 - H1.5a: 18-16-12 (4 draws by repetition),
SF 021113 - C1.6a:  8-13-25 (22 draws by repetition).

Even these positions are not as dead drawn as they may seem at first glance.
I will therefore leave book 8moves_v3 as my final version for now.

lucas....@gmail.com

unread,
Nov 5, 2013, 6:29:12 PM11/5/13
to fishc...@googlegroups.com
On Tuesday, November 5, 2013 7:25:54 PM UTC+8, Adam wrote:
> I have only checked one position so far, but it is obvious that it is unbalanced and slipped through the filter. I am glad you found these unbalanced positions.
>
> On Monday, November 4, 2013 12:53:35 PM UTC-5, Adam wrote:
> > Thanks for uploading those +400 cp positions, Dariusz. I will check them out.

I find it hard to believe that there can be so much crap in the opening book we've been using so far. Yet it is true: these GMs really don't know how to play chess ;-)

I am redoing Dariusz's analysis at the moment (50 cp threshold), and so far, my results agree:
- 1040 FEN tested out of 48491
- already 319 bad ones: that's just over 21%

Some of it can be explained by Houdini showing lower scores than Stockfish, but certainly not all of it.

Once I've finished, I'll get a second opinion from Critter. That way, I'll only exclude the positions where both Stockfish and Critter show a score outside [-50,+50], which will hopefully remove many of the inflated SF scores.

Dariusz Orzechowski

unread,
Nov 5, 2013, 7:10:20 PM11/5/13
to fishc...@googlegroups.com, lucas....@gmail.com
Not that it matters a lot, but 319/1040 is a bit over 30%, not 21%. My result: (48491-34700)/48491, which is ~28.44% of positions filtered out. I think that even after a second opinion from Critter, 20-25% of positions will still be out.

I think 8moves_GM.pgn should be abandoned immediately and replaced with 8moves_v3, at least until we have something better. But I see that all newly scheduled tests are still using the old book, which is now confirmed to have a lot of severely unbalanced positions.

Gary Linscott

unread,
Nov 5, 2013, 7:58:53 PM11/5/13
to fishc...@googlegroups.com, lucas....@gmail.com
I'll switch the default over tonight.

Mindbreaker

unread,
Nov 5, 2013, 11:52:41 PM11/5/13
to fishc...@googlegroups.com
When I made the file I use, I used Houdini 3 to check the positions at 24 ply, perhaps more, I can't remember, and I did that for 3 or 4 plies after the position. It was a far smaller set of positions. But the main difference is that I rejected anything over +/-0.40.

I think 0.50 is a sizable advantage that could easily grow on its own into a win, even against equal opposition. 0.70 is often enough to win when it happens early in a game. Being only 0.20 away from that is fairly dangerous.

Even if reducing the bounds to +/-0.40 cuts the number of positions in half, there would still be plenty of positions left.

I also duplicated every position so that it appears twice in a row. In a match, this would likely mean the engines trade sides of the same position.

I also wonder whether any of the positions past 1,000 ever get used at all. Do we know whether the interface chooses at random or in sequence? Since every computer is given a task of 1,000 games, if they run the positions in sequence, it is always the same 1,000 positions and only those 1,000 positions.

If that is the case, more attention to the first 1,000 would make sense, as would shrinking the book to 1,000 positions to reduce unnecessary data movement.

Lucas Braesch

unread,
Nov 6, 2013, 8:03:40 AM11/6/13
to Gary Linscott, fishc...@googlegroups.com
Results of my analysis (see attachment):

  48491 original.epd
  13045 sf_bad.epd
  35446 sf_good.epd

26.9% (13045/48491) were considered bad by Stockfish (threshold 50cp). But SF scores are notoriously inflated, so they cannot be trusted -- especially in the early opening. So I decided to use SF only as a pre-filter, and let Critter 1.6a do a second pass on the SF rejects:

   3867 sf_bad_critter_bad.epd
   9178 sf_bad_critter_good.epd

In the end, only 7.97% (3867/48491) of openings are bad according to both SF and Critter.

The openings to keep are therefore:

  35446 sf_good.epd
   9178 sf_bad_critter_good.epd

44624 in total.

PS: I don't know what kind of filter Adam used, but it looks like there was no filter at all. I found several positions with scores larger than 400cp, and even one above 500cp. It demonstrates that GM games CANNOT BE TRUSTED.
book.tar.gz

Dariusz Orzechowski

unread,
Nov 6, 2013, 2:16:01 PM11/6/13
to fishc...@googlegroups.com
> I also wonder whether any of the positions past 1,000 ever get used at all. Do we know whether the interface chooses at random or in sequence? Since every computer is given a task of 1,000 games, if they run the positions in sequence, it is always the same 1,000 positions and only those 1,000 positions.
>
> If that is the case, more attention to the first 1,000 would make sense, as would shrinking the book to 1,000 positions to reduce unnecessary data movement.

No problem, the positions are chosen randomly by cutechess-cli: -openings file=8moves_GM.pgn format=pgn order=random. AFAIK, LittleBlitzer has a limit of 32,000 positions in a book; anything above that is never chosen.

Dariusz Orzechowski

unread,
Nov 6, 2013, 2:25:34 PM11/6/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com

> The openings to keep are therefore:
>
>   35446 sf_good.epd
>    9178 sf_bad_critter_good.epd

The first number is very similar to mine (34700), so it is good to have confirmation that the process was sound. I have some doubts about whether it is good to directly add those other 9178 positions. An advantage of 50 cp in Critter's eval may be winning, especially at the TC used in fishtest. It would be more cautious to lower the threshold for Critter somewhat, for example to 30 cp, and include only those positions.

Gary Linscott

unread,
Nov 6, 2013, 3:27:17 PM11/6/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com
Cool, thanks for doing all this great analysis!  Will there be another pass on the book before it's final?

Dariusz Orzechowski

unread,
Nov 6, 2013, 4:59:49 PM11/6/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com
Well, I personally would prefer a book with only one filtering pass (by SF), because SF-specific eval inflation is already accounted for by accepting the 50 cp threshold. In many other engines, 50 cp is too much of an advantage and the threshold would have to be lower. Moreover, such a procedure is simpler and more natural for making a book for SF vs SF matches, as in fishtest. There is little harm in throwing away some positions that _might be_ ok when we have some 35,000 positions that _are_ ok for our purpose. Of course YMMV.

At any rate, I'm glad that my findings were confirmed and have been taken seriously. As a result we will have a better book, so everyone wins.

Gary Linscott

unread,
Nov 6, 2013, 5:20:03 PM11/6/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com
New test default updated!  Thanks again :).

Adam

unread,
Nov 6, 2013, 8:48:52 PM11/6/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com
I have found the same as you and Dariusz. Stockfish scored one opening position at greater than 900 cp 8-( . I can only guess that pgnscanner choked to some degree on the large pgn that I fed it. It filtered out thousands of openings, but obviously many were missed (and most likely some good openings were thrown out).

It looks like there are still enough good opening positions for the framework. I am sorry that the original pgn had crappy positions in it. I am glad that it has apparently not hindered Stockfish's development much.

Gary Linscott

unread,
Nov 7, 2013, 12:17:17 AM11/7/13
to fishc...@googlegroups.com, Gary Linscott, lucas....@gmail.com
Don't be sorry!  cutechess is configured to play both sides of each opening, so a bad opening will usually result in a win and a loss. The original 8moves was a vast, vast improvement over the previous opening book :).

goo...@transversale.fr

unread,
Dec 3, 2013, 6:04:57 PM12/3/13
to fishc...@googlegroups.com
Hi,
I'm Gabriel, the author of pgnscanner and the Arion chess engine. I want to let you know that I've recently been working on pgnscanner (extensive tests and bug fixing) to make it more reliable at duplicate detection (results checked against a Linux script, as suggested here) and more robust when reading very big files. I will probably publish the new release and my test protocol in a few days on www.zeitnot.fr.
See you.

Marco Costalba

unread,
Dec 4, 2013, 1:56:06 AM12/4/13
to goo...@transversale.fr, fishc...@googlegroups.com
Could you please give it a try on this one ?

https://github.com/mcostalba/FishCooking/blob/setup/8moves_v3.pgn

goo...@transversale.fr

unread,
Dec 4, 2013, 2:33:58 AM12/4/13
to fishc...@googlegroups.com, goo...@transversale.fr
> https://github.com/mcostalba/FishCooking/blob/setup/8moves_v3.pgn

Hi,
On 8moves_v3.pgn, here are the current results I got:

Total games: 34700

Duplicates:
- Pgnscanner 0.88: 0 games
- Linux script: 0 games

Marco Costalba

unread,
Dec 4, 2013, 2:44:44 AM12/4/13
to google, fishc...@googlegroups.com
Thanks !

Gabriel Guillory

unread,
Dec 8, 2013, 4:56:33 PM12/8/13
to fishc...@googlegroups.com, google
Hi everyone,

Just to let you know that pgnscanner 0.90 is online at http://www.zeitnot.fr/index.php/pgnscanner/5-pgnscanner-eng.
I mainly focused on improvements to duplicate detection (dbl -abs): speed, reliability and stability.

see you :)

Gabriel

Mindbreaker

unread,
Dec 9, 2013, 2:43:18 PM12/9/13
to fishc...@googlegroups.com, google
Thanks. The program looks very interesting.

I have some very large game files, but they don't have any evals because LittleBlitzer does not store them. I still want to find bad start positions, though. I would just like a list of the start positions and the percentage they score. I suppose if it looked at the Elo of the players and gave an Elo performance, that would be nice too... I can add the Elos with Fritz.

The list could get long with thousands of start positions; still, it would be great to see it. I could search for each position in a database, but that would take eons.

Oh, it would also be very nice to see the draw percentage. I like the idea of removing positions that draw frequently from my start position set.

Actual statistics for the positions seem better than evals of those positions, and they are faster to generate.

goo...@transversale.fr

unread,
Dec 9, 2013, 4:37:02 PM12/9/13
to fishc...@googlegroups.com, google
Hi Mindbreaker,

The functionality you describe is not a piece of cake. However, pgnscanner already has its own book generator that gives statistical information in the manner of a Fritz book (see the "newbk /?", "openbk /?" and "showbk /?" commands). It is therefore possible to calculate and assign an Elo to your engine (givelo command), then create a pgnscanner book and navigate through it to get interesting statistics. But OK, searching for thousands of given positions could take a very long time this way.

Perhaps a possible method could be:

1. get only the games where Little Blitzer is white
2. calculate and assign an Elo to Little Blitzer (and its opponents)
3. build a pgnscanner book
4. with a pgnscanner command that doesn't exist yet, extract the positions from the book that are not good enough according to the statistical data
5. repeat the process with Little Blitzer as black

If people are interested in this kind of functionality, I think I could consider it in a few weeks without too much overload.

See you :)

Gabriel

Mindbreaker

unread,
Dec 10, 2013, 2:00:21 AM12/10/13
to fishc...@googlegroups.com, google
When I mentioned Little Blitzer, I was referring to the GUI: http://kimiensoftware.com/software/chess/littleblitzer

It makes PGN files of the tournaments.  It does not use a book but instead starts the game from positions on a list.

It looks like this (just a random game):

[White "stockfish_13120423_x64"]
[Black "Houdini_3_Pro_x64"]
[Result "1/2-1/2"]
[SetUp "1"]
[FEN "r1bn1rk1/ppp1qppp/3pp3/3P4/2P1n3/2B2NP1/PP2PPBP/2RQK2R w K - 0 0"]

 1. O-O f5 2. Nd2 Nxc3 3. Rxc3 c6 4. e4 f4 5. Nb3 fxg3 6. hxg3 Nf7 7. c5 dxc5 8. Nxc5 Rb8 9. Qb3 exd5 10. exd5 Bf5 11. Rd1 Kh8 12. Ne6 Rfc8 13. Qa3 Qxa3 14. Rxa3 cxd5 15. Bxd5 a6 16. Nd4 Bg6 17. Re3 Nd6 18. Re7 Re8 19. Rd7 Bh5 20. f3 Rbd8 21. Rxd8 Rxd8 22. Ne6 Rc8 23. Bb3 Rc6 24. Nd8 Rb6 25. g4 Be8 26. Re1 g6 27. Re2 Kg7 28. Kh2 h5 29. Kg3 hxg4 30. fxg4 Bc6 31. Nxc6 bxc6 32. Re7+ Kh6 33. Re6 c5 34. Bd5 Nc8 35. Kf4 Rxb2 36. Rxa6 Rb4+ 37. Be4 Nb6 38. a4 Kg7 39. g5 Nxa4 40. Rxg6+ Kf7 41. Rf6+ Kg7 42. Kf5 Nc3 43. Bf3 Nb5 44. Rg6+ Kh7 45. Be4 Nc3 46. Rh6+ Kg7 47. Bf3 Nb5 48. Bd5 Nd4+ 49. Ke5 Rb8 50. Rd6 Ne2 51. Re6 Nc3 52. Bc6 Rb1 53. Kf4 Rf1+ 54. Kg4 c4 55. Re8 Rf8 56. Re1 Rc8 57. Re7+ Kf8 58. Re6 Nd1 59. Bd7 Rd8 60. Ba4 Nb2 61. Bc2 Nd3 62. Kf5 Nb4 63. Ba4 Nd5 64. Re5 Ne7+ 65. Ke4 Ra8 66. Bc2 Rc8 67. Rb5 Rc7 68. Kd4 c3 69. Ke3 Kg7 70. Ra5 Ng6 71. Rf5 Nh4 72. Re5 Ng2+ 73. Kf3 Nh4+ 74. Kg4 Ng6 75. Ra5 Ne7 76. Rb5 Kf7 77. Kf4 Ng6+ 78. Ke4 Kg7 79. Kf3 Ne7 80. Ke3 Ng6 81. Ra5 Re7+ 82. Kd4 Nf4 83. Kxc3 Ne2+ 84. Kd3 Nf4+ 85. Kd2 Re2+ 86. Kc3 Re7 87. Rf5 Ne2+ 88. Kd3 Ng3 89. Rb5 Ne2 90. Bb1 Ng1 91. Rc5 Re1 92. Bc2 Re7 93. Kc4 Nf3 94. Kc3 Re5 95. Rc7+ Kf8 96. g6 Re7 97. Rc6 Kg7 98. Rb6 Rc7+ 99. Kb2 Ne5 100. Re6 Ng4 101. Bf5 Nf6 102. Re3 Nd5 103. Rh3 Rb7+ 104. Kc1 Nf6 105. Ra3 Nd5 106. Be4 Rd7 107. Rf3 Nf6 108. Bf5 Ra7 109. Re3 Nd5 110. Re6 Nf4 111. Rb6 Rc7+ 112. Kd2 Re7 113. Bb1 Rd7+ 114. Ke1 Nd5 115. Ra6 Rb7 116. Bf5 Nc7 117. Rd6 Ne8 118. Rc6 Ra7 119. Re6 Nf6 120. Ke2 Nd5 121. Kf3 Rc7 122. Kg3 Re7 123. Rd6 Nf6 124. Rd1 Re2 125. Kf3 Re5 126. Kf4 Re2 127. Kg3 Re5 128. Rf1 Ra5 129. Bb1 Rb5 130. Kf3 Rb3+ 131. Kf4 Rb4+ 132. Kf5 Rb5+ 133. Ke6 Rb6+ 134. Ke5 Ng4+ 135. Kd4 Rb4+ 136. Kc5 Rb7 137. Rg1 Nf6 138. Bf5 Rc7+ 139. Kd4 Ra7 140. Kc4 Ra5 141. Bd3 Re5 142. Rb1 Re7 143. Bf5 Ne8 144. Rb6 Rc7+ 145. Kb3 Ra7 146. Bb1 Nc7 1/2-1/2

If the program could list each start FEN and keep track of the statistics for each, that would make my day. I only have 250 or so positions, but the regular testing has 34,700. If it could accommodate up to 64,000, that should work in almost all cases. I'd like to use the 34,700-position set so I can find any dubious positions. In the regular testing no games are saved, but I could do some independent testing and get enough games for some usable statistics if a program could compile them from the PGN. No rush though.
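
Roughly what I have in mind, as a sketch (this one uses the python-chess library and only tallies the [FEN] and [Result] headers of each game; the output format is just an example):

import sys
from collections import defaultdict

import chess.pgn

stats = defaultdict(lambda: [0, 0, 0])   # FEN -> [white wins, black wins, draws]

with open(sys.argv[1]) as pgn:
    while True:
        headers = chess.pgn.read_headers(pgn)
        if headers is None:
            break
        fen = headers.get("FEN", "startpos")
        result = headers.get("Result", "*")
        if result == "1-0":
            stats[fen][0] += 1
        elif result == "0-1":
            stats[fen][1] += 1
        elif result == "1/2-1/2":
            stats[fen][2] += 1

for fen, (w, l, d) in stats.items():
    games = w + l + d
    print("%s | games=%d +%d -%d =%d draw%%=%.1f" % (fen, games, w, l, d, 100.0 * d / games))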

goo...@transversale.fr

unread,
Dec 10, 2013, 6:05:52 AM12/10/13
to fishc...@googlegroups.com, google
On Tuesday 10 December 2013 08:00:21 UTC+1, Mindbreaker wrote:
> When I mentioned Little Blitzer, I was referring to the GUI:

Hi,
Sorry, I didn't know Little Blitzer, but I find it quite amazing. Now I better understand what you have in mind, and I could personally even be interested in this idea for Arion. I see a nice potential use for engine tuning. For example, pgnscanner could extract certain types of positions (closed positions, minor pieces, rook endings...). Then Little Blitzer could run hundreds of games and pgnscanner would see whether my chess engine gets the expected statistical results from these positions or not. If not, then the engine shows weaknesses in this kind of position and should be corrected appropriately.
Anyway, I think I will add this to my "to do" list :)

Gabriel

Gabriel Guillory

unread,
Dec 14, 2013, 7:21:29 AM12/14/13
to fishc...@googlegroups.com, google
hi everyone,

I'm currently implementing a new command (vfilt) for pgnscanner which automatically filters games according to the evaluation given by an external engine. In my current testing, I'm running stockfish-DD for 1 second on the last move of each game in the 8moves_v3.pgn file. I set an acceptance interval of [-80 / 80]. So far it has run on 18000 of the 34700 games and found 87 games outside the acceptance interval. Most of them are just near the bounds (eval ~80-90), but a few seem significantly worse and should be verified for more than 1 second.

For example :

[Event "?"]
[Site "?"]
[Date "2013.11.03"]
[Round "1"]
[White "Stockfish"]
[Black "Stockfish"]
[Result "1/2-1/2"]
[ECO "C24"]
[Annotator "game 10181, depth=16 val=307 filter=[-80/80] status=rejected"]

1.e4 e5 2.Bc4 Nf6 3.d3 c6 4.Nf3 Be7 5.O-O d6 6.Re1 O-O 7.Nbd2 Re8 8.Bb3 Nbd7 1/2-1/2

[Event "?"]
[Site "?"]
[Date "2013.11.03"]
[Round "1"]
[White "Stockfish"]
[Black "Stockfish"]
[Result "1/2-1/2"]
[ECO "A46"]
[Annotator "game 12225, depth=13 val=147 filter=[-80/80] status=rejected"]

1.d4 Nf6 2.Nf3 d5 3.c4 c6 4.e3 Bf5 5.cxd5 cxd5 6.Nc3 e6 7.Ne5 Nc6 8.g4 Bg6 1/2-1/2

[Event "?"]
[Site "?"]
[Date "2013.11.03"]
[Round "1"]
[White "Stockfish"]
[Black "Stockfish"]
[Result "1/2-1/2"]
[ECO "B40"]
[Annotator "game 11725, depth=17 val=141 filter=[-80/80] status=rejected"]

1.e4 c5 2.Nf3 e6 3.d4 cxd4 4.Nxd4 Nf6 5.Nc3 Bb4 6.e5 Ne4 7.Qg4 Nxc3 8.Qxg7 Rf8 1/2-1/2

Let me know if this function is of interest for the fishcooking project.

See you.
--
Gabriel

Mindbreaker

unread,
Dec 15, 2013, 11:57:27 PM12/15/13
to fishc...@googlegroups.com, google
Yep.  All those positions look bad.

Gabriel

unread,
Jan 29, 2014, 6:40:30 PM1/29/14
to fishc...@googlegroups.com, google
hi everyone

Just to inform you that pgnscanner 0.91 is out at http://www.zeitnot.fr/index.php/pgnscanner/5-pgnscanner-eng

A few new commands have been implemented.
The following 2 commands might be of interest for building SF books:
- the fenstat command, to calculate statistical results for games starting from a specified position.
- the vfilt command, to automatically evaluate and filter openings (e.g. 8moves_vX.pgn) with an engine such as Stockfish.

Thanks to Mindbreaker for his help.

See you
--
Gabriel.

pvol...@gmail.com

unread,
Jan 29, 2014, 11:57:35 PM1/29/14
to fishc...@googlegroups.com, google
On Thursday, 30 January 2014, 3:40:30 UTC+4, Gabriel wrote:
Gabriel, thanks for this program! Is it possible to filter games with it using the following criteria?
We have a pgn database with 50000 games. Some games are annotated (they store engine evals) and some are not.
First filter.
1. If the eval of the last move in the game is equal to or greater than 2.00, check the game result. If it is 1-0, put the game in the file good_win_white.pgn. If it is not 1-0, edit the game result to 1-0 and then put the game in good_win_white.pgn.
2. If the eval of the last move in the game is equal to or less than -2.00, check the game result. If it is 0-1, put the game in the file good_win_black.pgn. If it is not 0-1, edit the game result to 0-1 and then put the game in good_win_black.pgn.
3. If the eval of the last move in the game is equal to 0.00, check the game result. If it is 0.5-0.5, put the game in the file good_draw.pgn. If it is not 0.5-0.5, edit the game result to 0.5-0.5 and then put it in good_draw.pgn.
4. If there is no eval for the last move in the game, run the engine for 5 seconds to get an eval and then put the game in one of the 3 files above according to rules 1-3.

Second filter is more complicated.
1. If the game result is 1-0, check all evals during the game. If the game has no evals below -0.10, put the game in the file pure_win_white.pgn. If it has evals of -0.30 or below after move 15 for more than 2 moves, or after move 30 for more than 0 moves, put the game in the file dubious_win_white.pgn.
All other games with result 1-0 go to the file almost_pure_win_white.pgn.

2. If the game result is 0-1, check all evals during the game. If the game has no evals above 0.10, put the game in the file pure_win_black.pgn. If it has evals of 0.30 or above after move 15 for more than 2 moves, or after move 30 for more than 0 moves, put the game in the file dubious_win_black.pgn.
All other games with result 0-1 go to the file almost_pure_win_black.pgn.

3. If the game result is 0.5-0.5, check all evals during the game. If the game has no evals above 0.10 or below -0.10, put the game in the file pure_draw.pgn. If it has evals of -0.50 or below or 0.50 or above after move 15 for more than 2 moves, or after move 30 for more than 1 move, put the game in the file dubious_draw.pgn. All other games with result 0.5-0.5 go to the file almost_pure_draw.pgn.

Third batch is to annotate unannotated games.
1. Check whether the game is annotated. If yes, put the game in the file annotated.pgn. If not, annotate it with engine evals using 5 seconds per move, then put the annotated game in annotated.pgn.

If all this can be done with pgnscanner, please help me with the correct command lines. If not, please consider implementing these features in your great program. Thanks in advance!
Pavel
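
P.S. To illustrate the first filter, a rough sketch in Python could look like this (it assumes the python-chess library, assumes the eval is embedded in the last move's comment as a decimal number, and leaves out the 5-second engine pass of step 4, setting those games aside instead):

import re
import sys

import chess.pgn

EVAL_RE = re.compile(r"[-+]?\d+\.\d+")   # ASSUMED eval format in the last move's comment, e.g. "{+2.35/20}"

out = {name: open(name, "w") for name in
       ("good_win_white.pgn", "good_win_black.pgn", "good_draw.pgn", "needs_eval.pgn")}

with open(sys.argv[1]) as pgn:
    while True:
        game = chess.pgn.read_game(pgn)
        if game is None:
            break
        m = EVAL_RE.search(game.end().comment)
        if m is None:
            dest = "needs_eval.pgn"              # step 4 (the 5 s engine pass) is left out here
        else:
            ev = float(m.group())
            if ev >= 2.00:
                game.headers["Result"] = "1-0"   # force the result to agree with the eval
                dest = "good_win_white.pgn"
            elif ev <= -2.00:
                game.headers["Result"] = "0-1"
                dest = "good_win_black.pgn"
            elif ev == 0.00:
                game.headers["Result"] = "1/2-1/2"
                dest = "good_draw.pgn"
            else:
                continue                          # evals strictly between the bounds are not covered by filter 1
        print(game, file=out[dest], end="\n\n")

for f in out.values():
    f.close()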

goo...@transversale.fr

unread,
Jan 30, 2014, 1:47:46 PM1/30/14
to fishc...@googlegroups.com, google, pvol...@gmail.com
Hi Pavel,

I suppose your request is not related to the Stockfish project, so to keep this thread on topic I will answer you directly by mail :-)

See you
--
Gabriel
