Knight fork threats


Stefan Geschwentner

Nov 17, 2014, 9:50:00 AM11/17/14
to fishc...@googlegroups.com
I have scheduled a test which gives a bonus if a safe knight fork is possible. For each forked piece which is more valuable than the knight, or undefended, a bonus depending on the piece type is added.
That makes 10 parameters to select. For SPSA an initial guess is needed, but I had no clue which! So I tried another approach.
Inspired by the following site, I have spent the last month trying population-based incremental learning (PBIL) for tuning these values.

http://macechess.blogspot.de/2013/03/population-based-incremental-learning.html

I have experimented with different settings and in the end used these extreme settings to get short generations:
- Population size: 2
- Learning rate: 0.02 (lowered because of the small population and matches)
- Each version plays a two-game match against master (same opening with reversed colors); the better one is the target for learning

So after around 11,000 generations I got the values used (at the end the normalized entropy, scaled between zero and one, was 0.04).

To see how this goes, I am doing a quick fixed-number-of-games test.
Afterwards an SPSA run for retuning can be done.
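As a rough sketch, a PBIL generation with the extreme settings described above might look like this. This is illustrative only, not Stefan's actual code: `play_match`, `N_BITS`, and `N_PARAMS` are hypothetical placeholders.

```python
import random

# Minimal PBIL sketch following the settings above.
# `play_match` is a hypothetical hook: it plays a two-game match
# (same opening, colors reversed) between the two candidates and
# master, and returns the index (0 or 1) of the better scorer.

N_BITS = 8           # bits per parameter (assumption)
N_PARAMS = 10        # number of tuned parameters
LEARNING_RATE = 0.02

def sample(probs):
    """Draw one candidate bit vector from the probability vector."""
    return [1 if random.random() < p else 0 for p in probs]

def pbil_generation(probs, play_match):
    # Population size 2: only two candidates per generation.
    a, b = sample(probs), sample(probs)
    winner = a if play_match(a, b) == 0 else b
    # Shift every bit probability toward the winner's bits.
    return [p + LEARNING_RATE * (bit - p) for p, bit in zip(probs, winner)]

# Start from maximum uncertainty: every bit probability at 0.5.
probs = [0.5] * (N_BITS * N_PARAMS)
```

With settings this extreme, each generation costs only two games, at the price of a very noisy learning signal; the low learning rate compensates for that noise.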

algot...@gmail.com

Nov 17, 2014, 2:38:06 PM11/17/14
to fishc...@googlegroups.com
Now this seems quite innovative. Very interesting article. The 0/1 gene encoding of the numbers ensures some progressive convergence.

Your adaptation of those ideas looks quite interesting. I am excited to see what the result will be. If the extra computation time is not too bad, this will probably work.

I must agree that, according to my few experiments, SPSA is amazing for tuning, but without good starting values it will not help much. I beg to be corrected or enlightened on this, but here is my understanding:

For example, in a few private experiments, when I introduced a new parameter (which could be worthless or, worse, detrimental), SPSA (with default R and A values) would not converge to 0 to cancel the parameter (as I was expecting).

Once you have the best SPSA value, it is still no guarantee that you will improve the master code with your new parameter...
The problem can be that the extra computation time is the culprit. Or it could be that the code is simply wrong or random.

SPSA will find the best average value for this parameter, but that average value might simply not do any good in some extreme cases, which overall can be more harmful than the "good" average cases are beneficial. The different cases are not necessarily distributed "normally".

To conquer this, my thought was that by breaking the parameter into more parameters I would be able to isolate the cases that pay off. But still, it does not seem to work that way.

Also, one can look at the tests done by hw (the endless "optimise best move" series) to become convinced that SPSA will eventually converge to "something", but that will never tell you whether the original concept was good or not, however hard you try to make it work.
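For reference, the core SPSA step under discussion is roughly the following. This is a textbook-style sketch, not fishtest's implementation: `eval_diff` and the gain values `a` and `c` are illustrative assumptions.

```python
import random

# Minimal SPSA step: perturb all parameters simultaneously by a random
# +/-c, measure which side scores better, and step along the resulting
# gradient estimate. `eval_diff` is a hypothetical hook returning
# score(theta_plus) - score(theta_minus), e.g. from a short match.

def spsa_step(theta, eval_diff, a=0.1, c=1.0):
    delta = [random.choice((-1, 1)) for _ in theta]
    theta_plus = [t + c * d for t, d in zip(theta, delta)]
    theta_minus = [t - c * d for t, d in zip(theta, delta)]
    y = eval_diff(theta_plus, theta_minus)
    # Gradient estimate g_i = y / (2 * c * delta_i); ascend the score.
    return [t + a * y / (2 * c * d) for t, d in zip(theta, delta)]
```

This makes the limitation discussed above concrete: the update only chases the average score gradient around the current point, so a bad starting point or a misconceived parameter still yields convergence to "something".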

fenchel,

Nov 17, 2014, 3:01:49 PM11/17/14
to fishc...@googlegroups.com, algot...@gmail.com
SPSA is approximating gradient descent, and as such:

- under some conditions (e.g., small step sizes and convexity of wins/losses as a function of the parameters), it is guaranteed to converge in wins/losses;

- convergence in wins/losses does not imply convergence in parameters (e.g., when they are linearly dependent);

- averaging the parameters is only guaranteed to work with convexity, and in general averaging should be done for some suffix of the iteration sequence that resides in the same basin of attraction.

Stefan's method is designed for nonconvex problems and so using it as initialization for something like SPSA could work very nicely.  However, both need many many games to do a good job, and reducing relationships between variables always helps....

(To be clear, I'm just stating theoretical facts about these methods which agree with your empirical observations.) 

fenchel,

Nov 17, 2014, 6:04:11 PM11/17/14
to fishc...@googlegroups.com
Stefan,

Did you leave the "PBIL" running?  Even if SPSA ends up finding nice values, it could be interesting to see if further PBIL finds some further initializations which are quite different.

Of course, the universe is going to die of heat death, so I understand if you shut the procedure off....

zamar

Nov 18, 2014, 3:01:35 AM11/18/14
to fishc...@googlegroups.com
CLOP?

Stefan Geschwentner

Nov 18, 2014, 7:18:00 AM11/18/14
to fishc...@googlegroups.com
In the past I have experimented with CLOP, but the results were not compelling. Additionally, for so many parameters convergence is mostly very slow (> 100,000 games).

Further, in my experiments it seems that the needed number of games increases only slowly with more parameters. Also, CLOP works best on a convex solution space; for PBIL this doesn't really matter.
One point I found useful is that you can clearly determine whether the parameters have converged: then all bit probabilities are very near zero or one.
Here is a list of the parameters after my last generation:

KnightFork1_mg=[mode:55,mean:52.2304,std:7.16983](0.0987387/0.168766)
KnightFork1_eg=[mode:53,mean:53,std:1.78531e-035](2.74547e-059/9.52366e-072)
KnightFork2_mg=[mode:76,mean:76,std:1.6183e-025](6.1622e-049/3.39649e-050)
KnightFork2_eg=[mode:69,mean:69,std:2.62426e-019](4.05123e-039/5.3591e-039)
KnightFork3_mg=[mode:42,mean:42,std:0.00229311](4.31851e-009/3.02296e-008)
KnightFork3_eg=[mode:35,mean:44.9251,std:27.8161](0.224515/0.67522)
KnightFork4_mg=[mode:76,mean:75.8035,std:1.82081](0.102118/0.0393998)
KnightFork4_eg=[mode:116,mean:116,std:5.66295e-011](9.10001e-018/1.71908e-021)
KnightFork5_mg=[mode:98,mean:98.0046,std:0.178829](0.00235277/0.000907762)
KnightFork5_eg=[mode:123,mean:123,std:0](1.04211e-040/0)
KnightFork6_mg=[mode:100,mean:100,std:0.0219565](4.21593e-006/7.20361e-006)
KnightFork6_eg=[mode:14,mean:21.7021,std:15.3062](0.108611/0.377144)

Here:
- mean = average value
- mode = most probable value
- std = standard deviation of the parameter
- in round parentheses are the normalized entropy (bits equally weighted) and the weighted entropy (bit i => weight s^i)

Most parameters have an entropy near zero; these parameters do not change anymore.
Only a few parameters have high entropy, especially KnightFork3_eg.

But those oscillated further during my run, so I stopped at a total entropy of 0.044695.

Despite this, the PBIL approach is interesting!

The fixed-games test seems to indicate around 0 Elo for my parameters, so I will try to improve on this with SPSA.
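The two entropy measures quoted in parentheses can be sketched as follows. This is my reading of the description above, not Stefan's actual code; in particular, the weighting order for s^i (which end of the bit vector gets weight 1) is an assumption.

```python
import math

# Each parameter is a vector of bit probabilities. A bit's entropy is
# H(p) = -p*log2(p) - (1-p)*log2(1-p): 1 when p = 0.5 (undecided),
# 0 when p is 0 or 1 (converged). The normalized entropy averages this
# over all bits; the weighted variant weights bit i by s**i (s < 1).

def bit_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def normalized_entropy(probs):
    return sum(bit_entropy(p) for p in probs) / len(probs)

def weighted_entropy(probs, s=0.5):
    weights = [s ** i for i in range(len(probs))]
    total = sum(w * bit_entropy(p) for w, p in zip(weights, probs))
    return total / sum(weights)
```

Under this reading, a value near zero (as for most parameters in the list) means the bit probabilities have pinned to 0 or 1 and the parameter is effectively frozen.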



Stefan Geschwentner

Nov 18, 2014, 7:26:53 AM11/18/14
to fishc...@googlegroups.com
Because I couldn't run this continuously, I saved the current state after each generation. So I could later restart from any past generation (perhaps with changed settings).
If you look at my answer to Joona, there is not much left to converge. I have only limited computing resources, and this run took about 3 weeks (with interruptions). Perhaps I will give it a new try later. But let's first see how SPSA is doing.

algot...@gmail.com

Nov 18, 2014, 8:48:27 AM11/18/14
to fishc...@googlegroups.com
Have you also implemented the Gray code trick? It is essential to the success of this algorithm.

zamar

Nov 18, 2014, 9:07:37 AM11/18/14
to fishc...@googlegroups.com
I'm not familiar with this method, but I'll definitely take a look at it at some point...

joster

Nov 18, 2014, 9:18:44 AM11/18/14
to fishc...@googlegroups.com
Hi Stefan, 100,000 games is a very optimistic guess.
My experience with CLOP is that for 8 parameters you need to play at least 2,000,000 games.
Unfortunately, the number of games needed for good convergence seems to grow exponentially with the number of parameters.

Stefan Geschwentner

Nov 18, 2014, 9:19:48 AM11/18/14
to fishc...@googlegroups.com, algot...@gmail.com
Yep, I use Gray coding; it helps all bitwise optimization algorithms that encode numbers.
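For readers unfamiliar with the trick: Gray coding makes consecutive integers differ in exactly one bit, so a single flipped bit in the genome moves the decoded value only slightly, instead of jumping (plain binary 7 = 0111 to 8 = 1000 flips four bits). A minimal sketch of the standard conversions:

```python
# Convert between plain binary and reflected Gray code.

def binary_to_gray(n):
    # Adjacent integers map to codes differing in exactly one bit.
    return n ^ (n >> 1)

def gray_to_binary(g):
    # Fold the Gray bits back down with cumulative XOR.
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n
```

In a bitwise optimizer like PBIL, this smooths the fitness landscape: small genotype mutations stay small in parameter space.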

Stefan Geschwentner

Nov 21, 2014, 6:30:42 PM11/21/14
to fishc...@googlegroups.com
From my test runs we can draw two conclusions:
- SPSA couldn't improve on my PBIL initial values, so it seems they are very close to at least a local maximum
- the current implementation gives around zero Elo

One drawback is that currently each piece independently generates a bonus. But it is important which two pieces are attacked.
For example, a queen+king fork is more valuable than a queen+rook fork: in the first the queen is lost, in the second only the rook, or nothing if checks are possible.
But my best values deliver, at least for the endgame, the opposite:
queen+king=S(198,141)
queen+rook=S(174,239)

So I will try a two-dimensional bonus table indexed by the two piece types.
We can use the lsb and msb of the found targets. In this approach, for a triple fork, the third piece is ignored.
I have scheduled a new SPSA run on this modification, with initial values calculated from the PBIL values.
Now we are probably no longer at a maximum, so SPSA should have a chance to improve.
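The lsb/msb indexing idea can be sketched as follows, using Python integers as bitboards. The names, the `piece_on` mapping, and the bonus table are illustrative assumptions, not Stockfish's actual code.

```python
# Two-dimensional fork bonus lookup: `targets` is a bitboard of forked
# pieces; the bonus is indexed by the piece types on the least and most
# significant set squares. For a triple fork, the middle piece is
# ignored. Note the index order depends on square numbering, not on
# piece value.

PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING = range(6)

def lsb(bb):
    """Index of the least significant set bit."""
    return (bb & -bb).bit_length() - 1

def msb(bb):
    """Index of the most significant set bit."""
    return bb.bit_length() - 1

def fork_bonus(targets, piece_on, bonus_table):
    # piece_on: square -> piece type; bonus_table: 6x6 of (mg, eg) scores
    return bonus_table[piece_on[lsb(targets)]][piece_on[msb(targets)]]
```

Replacing the per-piece loop with two bit scans also explains the small speedup mentioned later in the thread.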

fenchel,

Nov 21, 2014, 6:44:19 PM11/21/14
to fishc...@googlegroups.com
Dear Stefan,

Just some miscellaneous ideas; apologies in advance if they are trivial compared to things you've considered/tried:

* Rather than modifying the score for every knight fork, how about a single modification based on the existence of any knight fork?  (The hope is that training is simpler (one parameter, or two if you want to split out one other fork case), and the implementation is marginally faster (no need for two loops).)

* Is there any way to speed things up?  (I.e., is it possible that you're seeing an Elo loss (despite tuning a more powerful eval) due to the time spent?)

* You might be right about being at a decent local maximum, but maybe some more PBIL is useful.

* Did you get this safe-knight-fork idea based on some games you observed?  Maybe they have some other hints for how to make this work?

Good luck.

Stefan Geschwentner

Nov 21, 2014, 7:07:53 PM11/21/14
to fishc...@googlegroups.com
For the simple version you propose, I have done a PBIL run too. But my local results don't seem good (though with my hardware I can't play the 20,000-40,000 games needed to really confirm it). But perhaps I will give it a try later.

One advantage of the new version I am now testing on fishtest is that the inner loop is removed, because we now use the lsb and msb of the targets directly (ignoring a third forked piece). So this should be a little bit faster than my last version.

In PBIL, if bit probabilities are near zero or one, they are essentially fixed. And in my run only very few bits have not converged. But it would be interesting to rerun from the start and see whether it delivers a significantly different parameter set. Perhaps I will try this if I have resources available, but it needs some weeks.

The idea for the fork patch does not come from particular games. But I am a club player myself (around 1700 Elo) and know how important it is to create pressure on the opponent's position. Humans make errors, the machine seldom does, but every threat limits the moves the opponent can make, and perhaps not all threats can be defended. And knight forks are really ugly, at least for humans.

fenchel,

Nov 21, 2014, 7:47:47 PM11/21/14
to fishc...@googlegroups.com
Sounds like you're a few steps ahead of me =)

About watching games, I asked for another reason.  Specifically, I was asking whether you'd observed the need for this check in games lost by SF.  I ask because it's possible that SF, while currently lacking your explicit check, is already pretty good in such positions (for example, maybe most such positions give a strong advantage after a few plies of search, due to something already in the eval).