What hardware? Is this the P5/166?
I checked this out. Here's a table:
Time (s)  Solved
--------  ------
<=10      564
<=20      598
<=30      622
<=40      634
<=50      648
<=60      660
<=70      669
<=80      674
<=90      679
<=100     685
<=110     686
<=120     689
<=180     697
<=240     708
< 300     711
That seems pretty impressive to me; it is a hard suite.
Did you really run this for 879*5min = 73 hours, or does it quit after
the ply in which it finds the solution, or something like that?
My own results are 595 in <= 20.00 seconds on a P6/200. I always do
"time until found and held until end of test".
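
One way to read "time until found and held until end of test" is
sketched below. The log format (a list of time/move pairs recorded
whenever the preferred move changes) and the function name are
assumptions made for illustration, not anything taken from Ferret, and
the sample log is made-up data.

```python
from typing import List, Optional, Tuple

# Hypothetical log format: (seconds_elapsed, best_move), appended each
# time the engine's preferred move changes, in increasing time order.
MoveLog = List[Tuple[float, str]]


def solve_time(log: MoveLog, key_move: str) -> Optional[float]:
    """Return the time at which key_move was found and then held until
    the end of the log, or None if the final move is not key_move."""
    if not log or log[-1][1] != key_move:
        return None
    found_at = log[-1][0]
    # Walk backwards over the trailing run of key_move entries.
    for seconds, move in reversed(log):
        if move != key_move:
            break
        found_at = seconds
    return found_at


# Made-up log: the key move is found early, dropped, then found again.
log = [(0.5, "Nf3"), (3.2, "Qe3"), (7.9, "Nf3"), (28.3, "Qe3")]
print(solve_time(log, "Qe3"))
```

Under this reading the early find at 3.2 seconds earns no credit,
because the move was not held; only the final uninterrupted run counts.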
bruce
> This might be one thing I could do as well. 595 doesn't sound bad,
> either. Did you ever try it at longer time controls? I found the
> positions solved in <300 seconds but not in <20 seconds to be more
> interesting. Maybe you could run the suite for Ferret with all the
> positions it doesn't get in <= 20 seconds? 595 for Ferret (albeit on a
> P6 200) vs. 598 for Fritz (P166) sounds pretty close, but what about
> number of positions solved at longer time controls? E.g. how about
> #877?
I'm going to respond to this to let you know I got it, and so it won't
disappear from my server.
I'll run some more ECM stuff. #877 had a fail-high fail-low at 8
seconds; maybe it would have found it in the next ply.
bruce
> This might be one thing I could do as well. 595 doesn't sound bad,
> either. Did you ever try it at longer time controls? I found the
> positions solved in <300 seconds but not in <20 seconds to be more
> interesting. Maybe you could run the suite for Ferret with all the
> positions it doesn't get in <= 20 seconds? 595 for Ferret (albeit on a
> P6 200) vs. 598 for Fritz (P166) sounds pretty close, but what about
> number of positions solved at longer time controls? E.g. how about
> #877?
I would like to go into some detail with my answer.
I just finished running my program for approximately one day. I ran
each position for 2 minutes (120 seconds).
I would search less than 2 minutes in exactly one case: if a position
was determined a mate in N, and I had searched to depth (N*2)-1, I would
quit. This isn't strictly correct, since a null-move program doesn't
necessarily find the fastest mate this way: it can take more than
(N*2)-1 plies to find a mate in N, so I could have missed some shorter
mates. I didn't see any examples of this here.
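
The early-exit rule above can be written down as a small check. This is
a sketch only: the function names and the idea of passing in a "mate in
N" value (or None when no mate has been found) are assumptions for
illustration.

```python
from typing import Optional


def mate_depth(mate_in: int) -> int:
    """Plies in a mate in N: N moves by the mating side plus N-1
    replies by the defender."""
    return mate_in * 2 - 1


def can_stop_early(mate_in: Optional[int], completed_depth: int) -> bool:
    """True if a mate in N has been reported and the search has already
    completed depth (N*2)-1 or more."""
    if mate_in is None:
        return False  # no mate found, keep searching until the limit
    return completed_depth >= mate_depth(mate_in)


print(can_stop_early(3, 5))     # mate in 3, depth 5 completed
print(can_stop_early(3, 4))     # mate in 3, only depth 4 completed
print(can_stop_early(None, 9))  # no mate reported
```

As noted above, this check does not prove the mate is the shortest one
when the search uses null-move pruning.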
I would search more than 2 minutes in some special cases, which I will
not detail. If you cut the search time off at exactly 2 minutes, my
program finds 667 correct answers. If you instead tell it to search for
2 minutes, throw out any cases where it finds the correct answer only
after more than 2 minutes, yet hold the program responsible for cases
where it has the correct answer at 2 minutes but switches away from it
after the 2 minutes are over, it finds 663.
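
The two bookkeeping rules just described can be compared side by side.
The sketch below assumes a hypothetical log of (seconds, best move)
pairs in increasing time order, where the log may run past the nominal
limit because the search is sometimes extended; the names and the
sample data are made up for illustration.

```python
from typing import List, Optional, Tuple

MoveLog = List[Tuple[float, str]]  # hypothetical (seconds, best_move) log


def move_at(log: MoveLog, t: float) -> Optional[str]:
    """Best move in force at time t, or None if nothing was logged yet."""
    current = None
    for seconds, move in log:
        if seconds > t:
            break
        current = move
    return current


def solved_hard_cutoff(log: MoveLog, key_move: str, limit: float) -> bool:
    """Rule 1: only the move held at exactly `limit` seconds counts."""
    return move_at(log, limit) == key_move


def solved_held_to_end(log: MoveLog, key_move: str, limit: float) -> bool:
    """Rule 2: the key move must be held at `limit` and must still be
    the move when the (possibly extended) search actually stops."""
    final_move = log[-1][1] if log else None
    return move_at(log, limit) == key_move and final_move == key_move


# Made-up case: correct at 120 seconds, then abandoned at 131 seconds.
log = [(10.0, "Qe3"), (131.0, "Nf3")]
print(solved_hard_cutoff(log, "Qe3", 120.0))
print(solved_held_to_end(log, "Qe3", 120.0))
```

Rule 2 can never score higher than rule 1, which matches the direction
of the 667 versus 663 difference.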
Fritz finds 689 in 2 minutes, on wimpier hardware.
But I got to wondering what the effect of this, "go for one ply beyond
where you get a solution", mechanism would be.
I searched for cases where the following occurred:
1) It had had a correct answer after *finishing* depth D (for some D).
2) It had also had a correct answer after *finishing* depth D+1.
3) At some later depth it switched to a wrong move, and would still have
played a wrong move at the end of 2 minutes.
I was expecting to find only a couple of these cases, but I found 37 of
them.
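
The three conditions can be expressed as a filter over per-position
search records. This is a sketch under assumptions: the record layout
(best move after each finished depth, listed in consecutive depth
order, plus the move that would be played at the time limit) and the
function name are hypothetical.

```python
from typing import List


def held_two_depths_then_lost(finished: List[str], final_move: str,
                              key_move: str) -> bool:
    """finished[i] is the best move after finishing the i-th completed
    depth (consecutive depths); final_move is what would be played when
    the time limit expires."""
    # Condition 3: the move played at the end is wrong. Together with
    # conditions 1 and 2 this implies a later switch away from the key.
    if final_move == key_move:
        return False
    # Conditions 1 and 2: correct after finishing depth D and depth D+1.
    for at_d, at_d_plus_1 in zip(finished, finished[1:]):
        if at_d == key_move and at_d_plus_1 == key_move:
            return True
    return False


# Made-up records for illustration.
print(held_two_depths_then_lost(["Nf3", "Qe3", "Qe3", "Nf3"], "Nf3", "Qe3"))
print(held_two_depths_then_lost(["Qe3", "Nf3", "Qe3"], "Nf3", "Qe3"))
```

A stop rule of "quit one ply after the solution first appears" would
credit every position this filter flags, even though the program ends
up on a wrong move when allowed to keep searching.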
I did not attempt to break down these cases, but I noticed some stuff in
passing. In some, it was obviously not understanding the position, but
had the correct answer for a while, before switching to some other move,
which either won some material or didn't. In other cases, it had +3, +5,
or whatever, and switched away to something else where the score was
higher.
But anyway, I think that this shows that it is important to be clear
about how you are scoring correct answers, since how you do this
seemingly trivial piece of bookkeeping had, in my case, a big effect upon
the score. 667 is a lot different than 704.
By the way, #877 Qe3 is played after 28.3 seconds, ply 9, evaluated after
29.4 seconds, also ply 9, as a draw, so obviously this was a case where
it's all happy because it thinks it's found a perpetual check. It still
thinks it's a perpetual check after 2 minutes.
I may try your idea of making a sub-suite containing all the positions
mine missed, but tonight I'm going to run that Corvax suite.
bruce
Fritz 5 (P166) / Covax results:
No. 10 (Nxf7): 155s (11 ply), doesn't change in 10'
No. 22 (..f4): 358s (11 ply), doesn't change in 10'
These are the only 2 positions Fritz gets right within 10 minutes.
Moritz
On Sat, 30 Aug 1997 19:36:35 -0700, brucemo <bru...@seanet.com>
wrote:
>Moritz Berger wrote:
Mine found only 624 when searching exactly 2 minutes, on a P6-200.
>But I got to wondering what the effect of this, "go for one ply beyond
>where you get a solution", mechanism would be.
>
>I searched for cases where the following occurred:
>
>1) It had had a correct answer after *finishing* depth D (for some D).
>2) It had also had a correct answer after *finishing* depth D+1.
>3) At some later depth it switched to a wrong move, and would still have
>played a wrong move at the end of 2 minutes.
>
>I was expecting to find only a couple of these cases, but I found 37 of
>them.
>
>I did not attempt to break down these cases, but I noticed some stuff in
>passing. In some, it was obviously not understanding the position, but
>had the correct answer for a while, before switching to some other move,
>which either won some material or didn't. In other cases, it had +3, +5,
>or whatever, and switched away to something else where the score was
>higher.
I also noticed this, and it made me wonder whether the key move is
really the best move. There are many positions where the key move is
the best move with a score of +3 or something like that, but then the
program finds a move with an even higher score.
I wonder if we should compare our results on this suite, not just the
best moves but also the scores etc. I posted this a while ago, but I
guess you didn't see it, or maybe you didn't like the idea. I would
really like to have a test suite which is big (300+), hard (> n
seconds, or > d ply) and correct.
>But anyway, I think that this shows that it is important to be clear
>about how you are scoring correct answers, since how you do this
>seemingly trivial piece of bookkeeping had, in my case, a big effect upon
>the score. 667 is a lot different than 704.
>
>By the way, #877 Qe3 is played after 28.3 seconds, ply 9, evaluated after
>29.4 seconds, also ply 9, as a draw, so obviously this was a case where
>it's all happy because it thinks it's found a perpetual check. It still
>thinks it's a perpetual check after 2 minutes.
Mine needs 130 seconds to realize it can draw by playing the key move.
Too bad!
On Thu, 04 Sep 1997 22:53:40 -0700, brucemo <bru...@seanet.com>
wrote:
>Ren Wu wrote:
>
>> I wonder if we should compare our results on this suite, not just the
>> best moves but also the scores etc. I posted this a while ago, but I
>> guess you didn't see it, or maybe you didn't like the idea. I would
>> really like to have a test suite which is big (300+), hard (> n seconds, or >
>> d ply) and correct.
>
>I don't remember this. I miss things now and again, they come in days
>late and look like they are very old, and I don't read them before they
>are recycled, etc.
>
>It's hard to compare something so monstrous as output from ECM, even
>if it is summarized.
That's true. I may try to come up with a way to deal with this. Or I
may post just a few interesting positions so that everyone can try them.
>It might be very useful to have a hard tactical suite made up of very
>hard problems from all the suites we have, but I haven't tried to
>construct such a thing.
I am not sure. I would like to have a big and hard tactical suite so
that I can try to minimize the solution time with confidence. I mainly
use LCT2, Bt2630, and now BS2830, etc. as benchmarks. These are too
small and I don't really want to optimize my program against them.
However, every time I change my program, I always run them to see if I
did anything wrong.
>Is your program going to be in Paris? Will you be there with it?
>
>Sorry if I already asked, I have been distracted.
Unfortunately, I will not. I am waiting for my PR here and it is not a
good time for me to leave the US. Also, I currently spend 90% of my
time on my thesis, so there is very little time left for my chess
program. This is also the reason I missed the Jakarta event. I hope to
participate in the next WMCCC, or even WMCC if possible.
I really wish I could go there, though. It must be a great event.
>bruce
Ren