
Deep Fritz's Bahrain machine's ELO?


Sterten
Aug 12, 2001, 11:56:13 AM

hi computer chess experts,


I read some of the press announcements about the Kramnik-Fritz match.

Now I was wondering about the chances, and found nothing
objective about them!

I don't consider Kramnik's and Keene's published opinions
very reliable.
They give no plausible reasons for their estimates, and probably
also have reasons not to say what they really think about the odds.

What are the estimated actual ELO numbers of Kramnik and Deep Fritz
under normal tournament conditions?

Kramnik = 2802 (FIDE,Jan2001)
Deep Fritz , 128MB,K6-2-450 = 2653 (SSDF,Jun2001)

In the SSDF FAQ I also find that doubling computer speed gains
70 ELO points, and doubling RAM gains 7 ELO points.

The prize fund is $1 million; for $1 million I can get an estimated
10000 K6-2-450s. Let's also take 1 GB of RAM.
That gives an ELO of about 3600 for Deep Fritz.
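
In Python, the arithmetic looks like this (a rough sketch - the 10000x
speedup from 10000 machines is of course idealized, and the +70 and +7
per doubling are just the SSDF rules of thumb):

  from math import log2
  base  = 2653                       # DF, K6-2-450, 128MB (SSDF, Jun 2001)
  speed = 70 * log2(10000)           # 10000 machines ~ 10000x speed
  ram   = 7 * log2(1024 / 128)       # 128MB -> 1GB
  print(round(base + speed + ram))   # 3604, i.e. about 3600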

Bad chances for Kramnik.

Is the calculation correct? Probably not, but are there
better calculations?

What are your estimates here? Please give your numbers!
I find it disappointing and strange that none of Kramnik, Keene,
the Deep Fritz team, the press, newsgroup posters etc. have yet
published their ELO estimates - only some vaguely formulated opinions.

I'll start with
Kramnik=2800
DeepFritz=2850

but this is only based on what I read, also comparing with Deep Blue.
Hopefully I'll change my estimates after reading some valuable
other opinions/estimates here!


Guenter Stertenbrink

Mike S.
Aug 12, 2001, 11:48:11 PM
"Sterten" <ste...@aol.com> schrieb:

> (...)


> Kramnik = 2802 (FIDE,Jan2001)
> Deep Fritz , 128MB,K6-2-450 = 2653 (SSDF,Jun2001)

It's difficult to compare, for various reasons. First, SSDF ratings are
not FIDE ratings. It is a different rating pool, consisting of
results among computers (OTOH, Junior's comp-human performances for
example seem to indicate that the SSDF ratings are good estimates).
Second, everybody uses different hardware; the various comp-human
results come from a wide range of hardware speeds.

> in the SSDF-FAQ I also find that redoubling computer speed
> gains 70 ELO-points , redoubling RAM gains 7 ELO-points.

Concerning RAM, it depends on the program. Fritz seems to gain a lot
from a bigger hash size, more than other programs.

> Price fund is 1mio$ , for 1 mio $ I can get an estimated
> 10000 K6-2-450s . Let's also take 1GB of RAM.
> That gives an ELO of about 3600 for DeepFritz.

No, Deep Fritz will run on 8 CPUs. I don't know the clock rate, but I
think it will be around 1 GHz. But that doesn't give 8 times the speed
of 1 CPU; I think approx. 6.5-7 times "only".

450*2*2*2*2 = 7,200, which means the +70 elo estimate per speed
doubling says the elo difference could be somewhat less than +280
elo if Deep Fritz (6) were playing. I think we can add some points for
the RAM and the new program version, but don't forget: the +70 elo
experience comes from comp-comp, not from comp-human.
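
(A quick Python sketch of that, with the guessed 6.5-7x scaling:

  from math import log2
  for scaling in (6.5, 7.0):
      effective_mhz = scaling * 1000               # 8 CPUs at ~1 GHz
      print(round(70 * log2(effective_mhz / 450))) # 270 and 277 elo

so a bit under +280 against the SSDF's 450 MHz machine.)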

> Bad chances for Kramnik.
>
> Is the calculation correct ? Probably not, but are there
> better calculations ?

I don't think so. This question can't be answered by calculation, nor by
elo estimates IMO. We have to wait and see.

I think it won't be decisive that DF will see many things much faster
than on one or two CPUs; it will be decisive whether he can see *more*,
and how much more - things he doesn't see at 2h/40 on small machines.
I'm sure there will be such things, but not necessarily ones that
change the final result of a game.

> What are your estimates here ? Please give your numbers !

I expect Kramnik to win 6-2, or 5½-2½.

But if Deep Fritz can win the first 2 games, everything can happen...

Regards,
M.Scheidl

Sterten
Aug 13, 2001, 3:32:02 AM
>"Sterten" <ste...@aol.com> schrieb:
>
>> (...)
>> Kramnik = 2802 (FIDE,Jan2001)
>> Deep Fritz , 128MB,K6-2-450 = 2653 (SSDF,Jun2001)

M.Scheidl answered:

>It's difficult to compare, for various reasons.

but we should try nevertheless, IMO

>First, SSDF ratings are
>not FIDE ratings. This is a different rating pool and consists of
>results among computers (OTOH, Junior's comp-human performances for
>example seem to indicate that the SSDF ratings are good estimations).

Is there a better source for computer ratings?
Do you think they are too low or too high, and by how much?

>Second, everybody uses different hardware. Various comp-human results
>come from a large hardware speed bandwidth.
>
>> in the SSDF-FAQ I also find that redoubling computer speed
>> gains 70 ELO-points , redoubling RAM gains 7 ELO-points.
>
>Concerning RAM, it depends on the program. Fritz seems to gain very much
>from bigger hash size, more than other programs.

so I make it 14 points per doubling of RAM...

>> Price fund is 1mio$ , for 1 mio $ I can get an estimated
>> 10000 K6-2-450s . Let's also take 1GB of RAM.
>> That gives an ELO of about 3600 for DeepFritz.
>
>No, Deep Fritz will run on 8 CPUs. I don't know the clock rate, but I
>think it will be around 1 GHz.

Weird. The event is still announced in the press as a big
man-machine contest. And if Kramnik wins, I can already hear them
saying that man is still superior to computers in chess.
But they are only playing with a $5000 (estimated) machine for a
prize fund of $1 million!

>But that doesn't give 8 times the speed
>of 1 CPU, I think approx. 6,5...7 times "only".
>
>450*2*2*2*2 = 7.200, which means the +70 elo estimation for redoubling
>the speed says that the elo difference could be somewhat less than +280
>elo, if Deep Fritz (6) would play. I think we can add some points for
>the RAM and the new program version, but don't forget: The +70 elo
>experience comes from comp-comp, not from comp-human.

What value would you regard as appropriate?

Anyway, now I get 2653 + 70*log_2(6.75*1000/450*1.3) + 14*3 + 50 = 3045.

The 1.3 is a speed factor for a CPU core better than the K6-2,
the 14*3 assumes 1 GB of RAM, and
the 50 extra points are because you are suggesting they have
better software than DF6.
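
Spelled out in Python (with exactly the assumptions stated above):

  from math import log2
  elo = 2653 + 70 * log2(6.75 * 1000 / 450 * 1.3) + 14 * 3 + 50
  print(round(elo))   # 3045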

>> Bad chances for Kramnik.
>>
>> Is the calculation correct ? Probably not, but are there
>> better calculations ?
>
>I don't think so. This question can't be answered by calculation, nor by
>elo estimates IMO. We have to wait and see.

Isn't it allowed to estimate? Everyone has his/her estimate;
some are better calculated than others.
But the strange thing is that very few people express their
estimates in ELO numbers.

>I think it won't be decisive that DF will see many things much faster
>than on one or two CPUs, it will be decisive if he can see *more* and
>how much more. Things he doesn't see at 2h/40 on small machines. I'm
>sure there will be such things, but not necessarily one's that change
>the final result of a game.

??? Are you saying that DF won't be stronger on more CPUs?
Only the final result counts, of course;
that is, the estimated probability of achieving it.

>> What are your estimates here ? Please give your numbers !
>
>I expect Kramnik to win 6-2, or 5½-2½.

Expectation value?
How much is that in ELO? I don't know the formula for calculating it.
Can someone give the formula for computing the win/draw probabilities
from ELO differences?
But I do assume your estimate means that you think that
DF with 8 CPUs has _less_ than 2653 ELO.

Under normal conditions.
Or do you think the conditions are quite abnormal and in Kramnik's favour?

>But if Deep Fritz can win the first 2 games, everything can happen...

Of course, and we'll have to include that possibility in our calculation
and estimate its probability.
Are there bookmakers where we can bet on the winner?
What are the odds?

>Regards,
>M.Scheidl


Regards, Guenter

Andy Platt
Aug 13, 2001, 8:35:59 AM
The important thing about ratings is that they are only meaningful within
the specific pool of players the ratings are applied to. If I get 100 people
together and we decide to rate ourselves, and I always win against everyone
else in that group (assuming the rest of the group have wins/draws/losses
against each other roughly as you would expect), I would eventually end up
with a rating higher than Kasparov's. It doesn't mean much, though, because
he would still beat me even if he gave me queen odds.

Same with the SSDF ratings. It's not that they are bad (as implied by your
statement "is there a better source for computer ratings"), it's just that
they are not within the same pool as the FIDE ratings for players.

The same arguments crop up when people ask, "What would Morphy's rating be
today?".

Andy.

--
I'm not really here - it's just your warped imagination.
"Sterten" <ste...@aol.com> wrote in message
news:20010813033202...@mb-fj.aol.com...

Mike S.
Aug 13, 2001, 9:06:57 AM
"Sterten" <ste...@aol.com> schrieb:

> (...)


> Weird. The event is still announced in the press as a big
> man-machine contest. And if Kramnik wins, I can already hear them
> saying that man is still superior to computers in chess.
> But they are only playing with a $5000 (estimated) machine for a
> prize fund of $1 million!

I don't think there is another choice, because Fritz is PC software...
I don't think a program can be rewritten and optimized for another
computer system within the available time.

> >(...) but don't forget: The +70 elo


> >experience comes from comp-comp, not from comp-human.
>
> what value would you regard appropriate ?

I have no idea. If a computer makes 50% and its speed is doubled, the
+70 elo estimate means that it should now make approx. 60% against the
same opposition. What makes me doubt that this value applies to
comp-human games is that speed mostly helps the tactical part IMO, but
not the positional part so much. Current chess programs are already far
superior at calculating combinations - if there is one - even on a single
CPU, so I think they do not gain much more strength this way against
grandmasters. But this is pure speculation.
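
(That 60% follows directly from the Elo expectancy curve:
1 / (1 + 10^(-70/400)) = 0.599, i.e. approx. 60%.)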

> (...) Isn't it allowed to estimate? Everyone has his/her estimate.

Sure it's allowed, I even collect match predictions. Most expect a more
or less clear win by Kramnik.

> >I think it won't be decisive that DF will see many things much faster
> >than on one or two CPUs; it will be decisive whether he can see *more*,
> >and how much more - things he doesn't see at 2h/40 on small machines.
> >I'm sure there will be such things, but not necessarily ones that
> >change the final result of a game.
>
> ??? Are you saying, that DF won't be stronger on more CPUs ?

Of course it will be stronger. But against a player of world champion
level, it may be that the strength gain from that hardware is still not
sufficient. Imagine Kramnik plays against Fritz 7 on a single CPU and
draws. This could certainly happen now and then, if a larger number of
games were played. But what strength increase would be necessary
for the program to win the game from the same opening book lines? It
may even be virtually impossible...

> >I expect Kramnik to win 6-2, or 5½-2½.
>
> Expectation value?
> How much is that in ELO? I don't know the formula for calculating it.

This prediction would mean 25...31% for Deep Fritz, or -141 to -193 elo
(I take these values from a table).
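
(For the formula question: the standard Elo expectancy curve is
E = 1 / (1 + 10^(-D/400)), where D is the rating difference and E the
expected score. Inverting it reproduces the table values up to
rounding; a small Python check:

  from math import log10
  def elo_diff(score):
      return -400 * log10(1 / score - 1)
  print(round(elo_diff(0.25)))   # -191 (table: -193)
  print(round(elo_diff(0.31)))   # -139 (table: -141)

Note this gives only the expected score, not separate win/draw/loss
probabilities - for those you need an additional draw model.)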

> Can someone give the formula for computing the win/draw probabilities
> from ELO differences?
> But I do assume your estimate would mean that you think that
> DF with 8 CPUs has _less_ than 2653 ELO.

Not if DF makes 2½ points, but 2660 also seems a bit too low for such a
monster compared to the SSDF list... I wouldn't be surprised if the SSDF
considers reducing their rating level (again) after the match. They
already did that some time ago, by -100 points. But OTOH, 8 games are
probably not reason enough for that, no matter what the result will be.

Your elo estimate of 3045 for Deep Fritz on 8 CPUs means we would have
to expect DF to achieve 80%, or 6-6½ points. Well, never say never...
:o)

Regards,
M.Scheidl

Supriyo Chatterjee
Aug 13, 2001, 6:33:17 PM
Since you introduced $$$ into the equation: the prize money is subject to
TAXATION. That's even more true for Kramnik, who is in a much higher tax
bracket in his native country.
SBC

"Sterten" <ste...@aol.com> wrote in message

news:20010812115613...@mb-mg.aol.com...

Kevin Heider
Aug 13, 2001, 6:21:42 PM
Super fast hardware does NOT make software (written for machines 100x
slower) smarter. DF is a great program for PCs running at less than
4,000MHz. But once you get to, say, five 1000MHz P3s (effectively running
at 3800MHz), extra CPUs should be meaningless to DF's current programming,
because DF's core was designed and tested mostly on 500 to 1500MHz
machines. Sure, they have slightly tweaked it for the (8) P3 SMP
system, but I doubt the software has gotten that much smarter.

I still have Psion Chess 2.12. Psion Chess 2.12 was written in 1986
for XTs (8088s) and 286s running from 6 to 12MHz. (Yes, a few lucky
power users even had super fast 386-16s.) I have tested Psion Chess
and other chess programs on all of my newer systems. And ever since I
purchased a 386-40 with a 64K cache (back in 1991), Psion isn't any
stronger even at the infinite level. Since Psion was written,
designed, and tested on the original XT class PCs, Psion just
completely exhausts itself on the *super computer* 386-40. Psion just
was not written well enough to take advantage of all the processing
power.

Some quick PC math:
A 286-10 is 5x faster than a 8088-8.
A 386-40 is 5x faster than a 286-10.
A P5-90 is 7x faster than a 386-40.
A K6-233 is 3x faster than a P5-90.

(These numbers are based on my memory and may be ever so slightly
off. But I have owned each and every machine stated above.)

This makes a K6-233 roughly 525x faster than an XT running at 8MHz.
Needless to say, Psion Chess is just as good on the 386-40 (25x faster
than the XT) as it is on the K6-233 (525x faster than the XT).
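
(Multiplying the chain out in Python, relative to an 8MHz XT:

  steps = [("286-10", 5), ("386-40", 5), ("P5-90", 7), ("K6-233", 3)]
  total = 1
  for cpu, factor in steps:
      total *= factor
      print(cpu, total)   # 5x, 25x, 175x, 525x

which is where the 25x and 525x figures above come from.)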

I expect DF to run into the same problem. IBM's DEEP Blue had DEEPER
pockets, and better hardware/software development resources.

I figure Kramnik will basically destroy DF.

In game 1 DF may get a draw because Kramnik will be feeling out the
machine.

Since DF is fairly good, DF will earn one more draw against Kramnik
later in the match.

DF will get a 3rd draw because by the end of the match Kramnik will
grow tired of DF's boring, poorly played chess.

My expected result: 6.5 / 1.5 in favor of Kramnik...

-- Kevin Heider

Sterten
Aug 13, 2001, 11:53:50 PM
M.Scheidl wrote:

>I don't think a program can be re-written and optimized for another
>computer system within the available time.

they can run it on 8 processors, so why not 64?

>>>(...) but don't forget: The +70 elo
>>>experience comes from comp-comp, not from comp-human.
>> what value would you regard appropriate ?
>I have no idea.

But from what follows, I suppose you think it's less than 70
and decreasing. So it could go +58, +49, +41, +34, +28 or so for
successive speed doublings.
But in the SSDF FAQ they write that the 70-point difference has been
valid for more than 10 years, so it would be a bit surprising if it
suddenly changed now at the 2650 level.


Similar differences should apply to humans if their thinking time
is doubled or halved.
It would be interesting if someone had already measured this!
Has it been done?

Also, they should make two ELO lists:
one for playing white and one for playing black.

>Of course it will be stronger. But against a player of world champion
>level, it may be that the strength gain by that hardware is still not
>sufficient. Imagine Kramnik plays against Fritz 7 on single CPU and
>draws. This could certainly happen then and when, if a higher number of
>games would be played. But which strength increase would be necessary
>for the program, to win the game from the same opening book lines? It

It gets more points in the long run.
In theory a position is either won, drawn, or lost,
but in practice there are continuous levels of position evaluation
and winning/drawing probabilities.


>> >I expect Kramnik to win 6-2, or 5½-2½.
>> Expectation value ?
>> How much is that in ELO ? I don't know the formula ,
>> how to calculate it.
>
>This prediction would mean 25...31% for Deep Fritz, or -141 to -193 elo
>(I take these values from a table).

Can I download that table?

>... I wouldn't wonder, if SSDF will
>consider after the match, to reduce their rating level (again). They
>already did it some time ago by -100 points.

I didn't know that.
It appears to me that you and Andy (and probably most of the other people
who expect Kramnik to win) regard these SSDF values as too high.
Maybe another 100 points too high?
Or by how much?

I still calculate that DF is the favorite.


Guenter

Sterten
Aug 13, 2001, 11:56:47 PM
Kevin Heider wrote:

That must be a special problem with Psion.
I can see no reason why a chess program should not play better
on faster hardware; the SSDF has observed this since 1988,
and they scored 70 ELO points per speed doubling.
So 525x speed should give 633 ELO points!
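
(That's just the SSDF rule applied to a 525x speedup:

  from math import log2
  print(round(70 * log2(525)))   # 633

assuming the +70 per doubling really holds across all those doublings.)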

But when a program runs in parallel on several CPUs, you need a
good "chief" which splits the tasks. This is possible in chess.
And it is much easier for computers than for humans, in case you are
thinking of 8 people analysing a "Haengepartie" (an adjourned game) ;-)


>I expect DF to run into the same problem. IBM's DEEP Blue had DEEPER
>pockets, and better hardware/software development resources.
>
>I figure Kramnik will basically destroy DF.
>
>In game 1 DF may get a draw because Kramnik will be feeling out the
>machine.
>
>Since DF is fairly good, DF will earn one more draw against Kramnik
>later in the match.
>
>DF will get a 3rd draw because by the end of the match Kramnik will
>grow tired of DFs boring, poorly played chess.
>
>My expected result: 6.5 / 1.5 in favor of Kramnik...

There is little reason to expect this, given
Kramnik's and DF's ELOs.
But many humans can't admit that computers are already the better
chess players.


Guenter

Bruce Moreland
Aug 14, 2001, 1:18:56 AM

"Kevin Heider" <khe...@davis.com> wrote in message
news:lligntoluclmsrso5...@4ax.com...

With parallel programs it's a matter of how efficiently the program can use
extra CPUs. My program with 4 CPUs runs at perhaps 2.5 to 3.5x the speed
it runs on a single CPU. With 8 it'd probably get 5-6x.

That's a lot of fun.

I expect that Fritz has a similar amount of fun.

A program that was written for Win32 back in 1996 or so would scale pretty
well on modern single-processor hardware. Stuff older than that is 16-bit
and who knows how well it will do.

bruce

Bruce Moreland
Aug 14, 2001, 1:21:09 AM

"Sterten" <ste...@aol.com> wrote in message
news:20010813235350...@mb-ma.aol.com...

> but from what follows , I suppose that you think it's less than 70
> and reducing. So , it could go +58,+49,+41,+34,+28 or such for
> redoubled CPU-speeds.
> But in the SSDF-FAQ they write, that the 70 difference was valid
> for more than 10 years , so it would be a bit surprising if it
> would suddenly change now at level 2650.

Have there been any tests to see what effect doubling CPU speed has *against
humans*? The only research I've seen involves computer vs computer tests.

bruce

Eric Hallsworth
Aug 14, 2001, 3:37:58 AM
In article <20010813235350...@mb-ma.aol.com>, Sterten
<ste...@aol.com> writes

>M.Scheidl wrote:
>
> >I don't think a program can be re-written and optimized for another
> >computer system within the available time.
>
>they can run it on 8 processors, so why not 64 ?
>
>
>
> >>>(...) but don't forget: The +70 elo
> >>>experience comes from comp-comp, not from comp-human.
> >> what value would you regard appropriate ?
> >I have no idea.
>
>but from what follows , I suppose that you think it's less than 70
>and reducing. So , it could go +58,+49,+41,+34,+28 or such for
>redoubled CPU-speeds.
>But in the SSDF-FAQ they write, that the 70 difference was valid
>for more than 10 years , so it would be a bit surprising if it
>would suddenly change now at level 2650.
>
The Swedish SSDF difference of 70 Elo for a doubling of speed applies to
computer v computer games only, which is what they test. We used to
believe the figure was 80 Elo, but that was with old dedicated machines
with 6502 processors at 3 and 5MHz!

The faster programs run, the deeper they search. The deeper they search,
the further away from the root position they get AND the less critical
their new choices are likely to be. It is my own view, from the Selective
Search rating list run since 1985, that the doubling value has dropped
from 80 to no more than 60... again, that's for computer v computer. Put
Fritz on a P/800 against the same version of Fritz on a P/400, and over
say 50 games a gap will definitely appear in the score favouring the
P/800 PC. This is because the program on the P/800 will constantly see
deeper than its colleague on the P/400, and occasionally the deeper
search will find something of value that will be advantageous.

One thing we probably all agree on is that computers are great at
(most!) tactics, but not so good at long term planning and certain
strategic concepts. They struggle to understand play, for example, in
blocked pawn centres, and they don't recognise the loss of value of a
trapped piece ('my rook is worth 5 whether it's on an open file or
trapped on h8 with my king on g8 after I've lost castling rights. If
it's not worth 5, it's certainly worth 4.5!'). Other long term concepts
which computers get wrong are things like opposite coloured bishops...
okay, they know the magnetism of the draw these produce when they get to
the endgame, but they can take little account of this in the middle game,
whereas a human KNOWS the long term effect and can easily opt for or
against different material set-ups whilst in the middle game, knowing it
will yield benefits in due course!

NO AMOUNT of deeper searching solves this yet; it's a programmer's
problem. And these are the sort of situations which the best players
(and Kramnik is certainly one of those) know how to take advantage of.
You could put Fritz (or, currently, any other program) on 64 processors,
and it still won't evaluate these types of position correctly (until,
often, it's too late).

This doesn't matter computer v computer - neither of them understands it,
so the faster processor still makes its gains. But it DOES matter
against humans, and in that respect extra speed doesn't solve the
problem.

Of course extra speed will enable Fritz to find improved tactics here
and there, and occasionally make a better choice between, say, 2 almost
equally valid moves. Therefore I believe that something like the +70
(starting from the Swedish base), +58, +49, +41, +34, +28 series shown
above DOES apply to computer speed doubling v human, and I also believe
that at the P/700 level the +41 is already operating against humans. If
I am right, then assuming Fritz rates at around 2650 on a P/700 (and I
*know* opinions will differ on this, too! but that's another subject!),
the next doubling gains are +41, +34, +28, +22, and Fritz is going to
get to about 2750 or 2775 *against humans* even on its 8x processors.

This won't be enough to beat Kramnik, who has had plenty of time to
prepare against Fritz... he's had and used Fritz6 already for AGES! And,
as I understand the rules, he will have an exact copy of his actual
opponent for a period of time before the match, so will doubtless have
familiarised himself with many of the changes in style, evaluation etc.

--
With best wishes from:
Eric Hallsworth, The Red House, 46 High St. Wilburton, Cambs CB6 3RA, England.
Editor of Selective Search, the UK's only Computer Chess Magazine, est. 1985.
http://www.elhchess.demon.co.uk

Robert Hyatt
Aug 14, 2001, 10:17:47 AM
Sterten <ste...@aol.com> wrote:
> M.Scheidl wrote:

> >I don't think a program can be re-written and optimized for another
> >computer system within the available time.

> they can run it on 8 processors, so why not 64 ?

The problem is that there are no PC-compatible machines with shared memory
_and_ 64 CPUs. The memory system becomes a very limiting bottleneck. No
one has yet tried to build large numbers of processors with uniform-access
shared memory, with the exception of companies like Cray. Their machines
cost a tad more. :)


> >>>(...) but don't forget: The +70 elo
> >>>experience comes from comp-comp, not from comp-human.
> >> what value would you regard appropriate ?
> >I have no idea.

> but from what follows , I suppose that you think it's less than 70
> and reducing. So , it could go +58,+49,+41,+34,+28 or such for
> redoubled CPU-speeds.
> But in the SSDF-FAQ they write, that the 70 difference was valid
> for more than 10 years , so it would be a bit surprising if it
> would suddenly change now at level 2650.

There has always been a superstition that deeper searches produce a smaller
and smaller improvement in program chess skill. So far, there is no evidence
to support this. There is some evidence to suggest that it doesn't happen,
however.


> It appears to me, that you and Andy (and probably most of all the other people
> who expect that Kramnik wins) regard these SSDF values as too high.
> Maybe another 100 points too high ?
> Or how many ?

Hard to say. Elo ratings only mean something within the context of the
"pool" of players used to produce the ratings. SSDF ratings are based on
a pool of computer opponents, with little human rating included. As a result,
the ratings are pretty accurate, but only when used to predict the game
outcome between two players _in_ the rating pool. Comparing their ratings
to FIDE is statistically invalid.

> I still calculate that DF is favorite.

--
Robert Hyatt Computer and Information Sciences
hy...@cis.uab.edu University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170

Jose Lopez Jr.
Aug 14, 2001, 10:38:57 AM
What I would like to find out is whether they have all or most of
Kramnik's games, and are checking to see if the computer can figure out
most of the difficult moves Kramnik made in key positions of each game.
This is just to see if the computer can actually find these moves.

I am really eager to see this match... actually seeing Kramnik live with
commentators, free through the internet, if possible. I just hope the first
game is played really impressively by the computer and is not a huge
disappointment. Basically, they had better have a plan B (program
modifications) if the first game is a disaster. I'm sure they will... at
least I hope so.

Robert Hyatt
Aug 14, 2001, 11:35:34 AM


Contrary to popular opinion, playing a GM with a computer _is_ a big
challenge. Playing a GM with a program he has had a copy of for several
months is an enormous challenge. Not being allowed to make any changes
makes it even harder.

I hope Fritz does well here. But I suspect great problems are going to
flare up due to the specific rules of this match, which I would consider to
be grossly unfair to the computer.

Sterten
Aug 14, 2001, 11:45:20 AM
bruce wrote:

>Have there been any tests to see what effect doubling CPU speed has *against
>humans*? The only research I've seen involves computer vs computer tests.

Why should it be different?
Well, computers are better at tactics, so a human opponent (if he knows
that he is playing against a computer) might try to avoid tactical
positions or openings.
The same is also true in human-human games: when one player
wants a draw, it's much harder for the other player (in fact for both)
to win.
Suppose the normal probabilities are 25%/50%/25% for win/draw/loss;
then you can make it 10/75/15 or so if you want a draw.
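
(In expected points, that trade-off looks like this:

  for w, d, l in ((0.25, 0.50, 0.25), (0.10, 0.75, 0.15)):
      print(w + d / 2)   # 0.500 vs. 0.475

so the drawish strategy costs a little expectation but cuts the
variance. The 10/75/15 split is of course just my made-up example.)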

Ideally , computers should also play differently against humans
than against other computers.


redoubling speed:
computer-computer = 70 ELO-points
computer-human =
human-computer =
human-human =

It should always be about 70 points, except maybe that it's not linear,
and assuming that all the easy tricks to fight computers are already
known and appropriately answered.


and the variations :
white/black ,
player/computer can't see/identify the opponent ,
player wants a draw ,


And how about a human with computer aid?
A 2500-ELO human allowed to use a 2500 computer is probably better
than 2570, maybe 2650?


Guenter

Sterten
Aug 14, 2001, 11:46:22 AM
Robert Hyatt wrote:
>Sterten <ste...@aol.com> wrote:

>> M.Scheidl wrote:
>
>> >I don't think a program can be re-written and optimized for another
>> >computer system within the available time.
>
>> they can run it on 8 processors, so why not 64 ?
>
>The problem is that there are no pc-compatible machines with shared memory
>_and_ 64 cpus. The memory system becomes a very limiting bottleneck. No

Doubling memory is only worth 7-15 ELO points. There is no big loss
if every processor has only, say, 128MB available.
Transferring a chess sub-position for analysis and receiving the result
doesn't require much memory or bandwidth.

>one has yet tried to build large numbers of processors with uniform-access
>shared memory, with the exception of companies like Cray. Their machines
>cost a tad more. :)
>
>

>There has always been a superstition that deeper searches produce a smaller
>and smaller improvement in program chess skill. So far, there is no evidence
>to support this. There is some evidence to suggest that it doesn't happen
>however.

So you are saying that the 70-point rule applies quite well even in
high ELO ranges, and that the previous 58-49-41-... estimates are
not correct?

>> >> It appears to me that you and Andy (and probably most of the
>> >> other people who expect Kramnik to win) regard these SSDF values
>> >> as too high.
>> Maybe another 100 points too high ?
>> Or how many ?
>
>Hard to say. Elo ratings only mean something within the context of the
>"pool" of players used to produce the ratings. SSDF ratings are based
>on a pool of computer opponents, with little human rating included.
>As a result, the ratings are pretty accurate, but only when used to
>predict the game outcome between two players _in_ the rating pool.
>Comparing their ratings to FIDE is statistically invalid.

Well, not invalid, as there _are_ some games. But probably less accurate.
So I do understand that your estimate would be difficult... but
can you give a number, please?

>> I still calculate that DF is favorite.

... under average circumstances. I didn't study the exact rules of
the match.
I want to figure out who is better before the match starts,
so I don't need the match itself to decide that ;-)


Guenter

Sterten
Aug 14, 2001, 11:45:56 AM
Eric Hallsworth wrote:

>The Swedish SSDF difference of 70 Elo for a doubling of speed applies to
>computer v computer games only, which is what they test.

Suppose it's less for human v computer games above some level L.
Then strong computers should refuse to play against humans, while
humans could improve their ELO by playing against strong computers.
I would also assume that doubling a human's thinking time gains 70 ELO
below some level (~2500), with decreasing gains after that.

>We used to
>believe the figure was 80 Elo, but that was with old dedicated machines
>with 6502 processors at 3 and 5MHz!
>
>The faster programs run, the deeper they search. The deeper they search
>the further away from the root position they get AND the less critical
>their new choices are likely to be. It is my own view from the Selective
>Search rating list run since 1985 that the doubling value has dropped
>from 80 to no more than 60... again that's for computer v computer. Put
>Fritz on a P/800 against the same version of Fritz on a P/400, and over
>say 50 games, a gap will definitely appear in the score favouring the
>P/800 PC. This because the program on the P/800 will constantly see
>deeper than it's colleague on the P/400, and occasionally the deeper
>search will find something of value that will be advantageous.
>
>One thing we probably all agree on is that computers are great at
>(most!) tactics, but not so good at long term planning and certain
>strategic concepts. They struggle to understand play, for example, in
>blocked pawn centres, they don't recognise the loss of value of a
>trapped piece ('my rook is worth 5 whether it's on an open file or
>trapped on h8 with my king on g8 after I've lost castling rights. If
>it's not worth 5, it's certainly worth 4.5!').

It should be possible to improve this. In general, all human concepts
can also be implemented in software; it's just very tedious
and difficult in some cases.

>Other long term concepts
>which computers get wrong are things like opposite coloured bishops...
>okay, they know the magnetism of the draw these produce when they get to
>the endgame, but they can take little account of this in the middle game
>whereas a human KNOWS the long terms effect and can easily either opt
>for or opt against different material set-ups whilst in the middle game,
>knowing he will yield benefits in due course!
>
>NO AMOUNT of deeper searching solves this yet, it's a programmer's
>problem. And these are the sort of situations which the best players
>(and Kramnik is certainly one of those) knows how to take advantage of.
>You could put Fritz (or, currently, any other program) on 64 processors,
>and it still wont evaluate these types of position correctly (until,
>often, it's too late).

OK, you are saying tactics get exhausted at some depth level.
Once you have calculated all tactics up to a certain depth, there is not
much gain from increasing the depth further.
Positions so difficult that the valuation changes dramatically at
level L, where L is greater than current computer depth,
are really rare. This is maybe an intrinsic property of chess.
You could easily modify the chess rules to increase L and thus
keep the +70 valid up to ELO 3000 or 4000. But obviously that is
not how chess is.

The other positional flaws in computer evaluation which you mention,
I regard only as programming errors. Even once these are eliminated,
the situation would probably not change much, I guess.

>This doesn't matter computer v computer - neither of them understand it,
>so the faster processor still makes its gains. But it DOES matter
>against humans, and in that respect extra speed doesn't solve the
>problem.
>
>Of course extra speed will enable Fritz to find improved tactics here
>and there, and occasionally make a better choice between, say, 2 almost
>equally valid moves. Therefore I believe that something like the +70
>(starting from the Swedish base), +58, +49, +41, +34, +28 series shown
>above DOES apply to computer speed doubling v human, and I also believe

All your arguments should also apply at lower ELO levels, yet you think
we are now at a point where this curve starts to decrease more rapidly
than before (80 to 60 from ~2000 to ~2500, but 60 to 28 from 2500 to
2750, assuming the 80 and 60 values in the past were also valid for
computer-human games).

>that at the P/700 level the +41 is already operating against humans. If
>I am right, then assuming Fritz rates at around 2650 on a P/700 (and I
>*know* opinions will differ on this, too! but that's another subject!),
>but *IF* it's 2650 on a P/700, then the next doubling gains are +41,
>+34, +28, +22, and Fritz is going to get to about 2750 or 2775 *against
>humans* even on its 8x processors.

... while I calculate 2815-2860 against computers,
and we're not adding the RAM-increase numbers here.

>This wont be enough to beat Kramnik, who has had plenty of time to
>prepare against Fritz... he's had and used Fritz6 already for AGES! and,
>as I understand the rules, will have an exact copy of his actual
>opponent for a period of time before the match, so will doubtless have
>familiarised himself with many of the changes in style, evaluation etc.
>


One other problem is that chess journalists are usually biased, since
their readers are humans; human players are often biased since they
don't want computers to surpass them.

How does the Fritz team estimate the odds?

Tord Kallqvist Romstad
Aug 14, 2001, 11:57:49 AM
"Mike S." <Michael...@lion.cc> writes:

> I expect Kramnik to win 6-2, or 5½-2½.

I think the margin will be considerably smaller. Kramnik will
probably win one of the first two games, and then effortlessly draw
the rest.

--
Tord Romstad

Atlan
Aug 14, 2001, 11:53:05 AM

> Playing a GM with a program he has had a copy of for several
> months is an enormous challenge. Not being allowed to make any changes
> makes it even harder.

How do they ensure that Kramnik can't play pre-prepared lines against
DF?

Atlan.


Aaron
Aug 14, 2001, 12:05:37 PM


ste...@aol.com (Sterten) wrote in
news:20010814114556...@mb-fk.aol.com:

>
> One other problem is that chess journalists are usually biased, since
> their readers are humans; human players are often biased since they
> don't want computers to surpass them.

Hmm... One problem I see is that...

*Computer chess* journalists are usually biased, since their readers are
often not merely chess players but what some call "computer chess freaks",
who are often biased since they want to believe that computers are
stronger than they really are...


> How does the Fritz team estimate the odds?

Funny you should ask that. Seems to me that asking Kramnik or Fritz is
not the best idea, given your fear of people being biased... :)



Sterten
Aug 14, 2001, 12:35:04 PM
>Hmm..One problem i see is that....
>
>*Computer Chess* Journalists are usually biased, since their readers are
>often not merely chess players but what some call "computer chess freaks"
>who are often biased since they want to believe that Computers are
>stronger than they really are ..

Yes, but they are also humans; some are probably chess players
who are just interested in chess, which includes computer chess.
Some are computer freaks and programmers.

Hmm, maybe their audience is quite balanced and they are less biased.

>> How does the Fritz team estimate the odds?
>
>Funny you should ask that. Seems to me that asking Kramnik or Fritz is
>not the best idea, given your fear of people being biased... :)

Not so biased that I wouldn't appreciate hearing their
opinions :)
Maybe they can give good reasoning to support their estimates.

However, I read that Kramnik said he regards the DF machine as
better than Deep Blue was.
Still, he thinks he is stronger.
Both statements would increase his "value" as a chess player,
so I'm a bit careful.
R. Keene also said that he favours Kramnik.
Well, I'm not sure whether he says what he really thinks,
or what he thinks is good for his business.

Robert Hyatt
Aug 14, 2001, 5:29:45 PM
Sterten <ste...@aol.com> wrote:
> Robert Hyatt wrote:
> >Sterten <ste...@aol.com> wrote:
> >> M.Scheidl wrote:
> >
> >> >I don't think a program can be re-written and optimized for another
> >> >computer system within the available time.
> >
> >> they can run it on 8 processors, so why not 64 ?
> >
> >The problem is that there are no pc-compatible machines with shared memory
> >_and_ 64 cpus. The memory system becomes a very limiting bottleneck. No

> redoubling memory is only worth 7-15 ELO points. There is no big loss,
> if every processor has only -say- 128MB available.
> Transferring a chess-sub-position for analysis and receiving the result
> doesn't require much memory nor bandwith/speed.

It is daunting when you do a hash probe for every node, on every processor,
plus shared killer moves, history move data, and other things that are
shared through memory...

This isn't an issue of _size_ it is an issue of _bandwidth_. Chess engines
require a lot of bandwidth.

> >one has yet tried to build large numbers of processors with uniform-access
> >shared memory, with the exception of companies like Cray. Their machines
> >cost a tad more. :)
> >
> >
> >There has always been a superstition that deeper searches produce a smaller
> >and smaller improvement in program chess skill. So far, there is no evidence
> >to support this. There is some evidence to suggest that it doesn't happen
> >however.

> so, you are saying that our 70-points rule applies quite well even in
> high ELO-ranges and that the previous 58-49-41-... estimates are
> not correct ?

I am not saying that at all. I am saying there is no proof that the
diminishing returns is real. There is some evidence that it is not
real, but the jury will be out for a long time on this question.


> >> It appears to me, that you and Andy (and probably most of all the other
> people
> >> who expect that Kramnik wins) regard these SSDF values as too high.
> >> Maybe another 100 points too high ?
> >> Or how many ?
> >
> >Hard to say. Elo ratings only mean something within the context of the
> >"pool" of players used to produce the ratings. SSDF ratings are based on
> >a pool of computer opponents, with little human rating included. As a
> result,
> >the ratings are pretty accurate, but only when used to predict the game
> >outcome between two players _in_ the rating pool. Comparing their ratings
> >to FIDE is statistically invalid.

> well, not invalid as there _are_ some games. But probably less accurate.
> So I do understand that your estimate would be difficult , ...but
> can you give a number, please ?

N

:)

There is no way to do this.... You could take all the SSDF programs and
enter them into FIDE or USCF events to get real ratings there, then you
could compare. But with the Elo formula, it is the _spread_ between two
players' ratings that is important. IE a 1400 player will win 3 of 4
games vs a 1200 player. A 2600 player will win 3 of every 4 games vs a
2400 player.

Thinking of ratings as "absolute" is wrong...


> >> I still calculate that DF is favorite.

> ... under average circumstances. I didn't study the exact rules of
> the match.
> I want to figure out, who is better before the match starts,
> so I needn't the match itself to decide that ;-)

Not an easy task. Unless you found a crystal ball that is still functional. :)

> Guenter

Robert Hyatt
Aug 14, 2001, 5:32:08 PM
Atlan <at...@neonzion.fi> wrote:

> How do they ensure that Kramnik can't play pre-prepared lines against
> DF?

This is not easy to do. A program's move selection depends on how long it
has to search, and the speed of the hardware. IE how many nodes can it search
before deciding on the move. Since the program thinks on the opponent's
time, reproducing both the same moves from the human _and_ the same timing
for those moves is necessary. That will be very difficult, if not impossible.

It is more reasonable to try to find strategic weaknesses in the program and
play to exploit those rather than trying to walk down a won game..

Mike S.
Aug 15, 2001, 8:50:47 PM
"Sterten" <ste...@aol.com> schrieb:

> can I download that table ?

After an intense search with Google, I found only two larger elo
difference tables on the net (I think the experts will call them
"percentage expectancy tables"), one of which is, surprisingly,
even connected to volleyball:
http://www.lexicon.net/ianandjan/Ian/Chess/RatingAdjustmentTable.htm

http://www.msu.edu/~ballicor/vb/tablepro.htm

Regards,
M.Scheidl


Akorps
Aug 16, 2001, 7:18:18 AM
>> How do they ensure that Kramnik can't play pre-prepared lines against
>> DF?

Can't the computer introduce some randomness into its move selection?
If 2 moves are evaluated equally, choose randomly between them; and if
it is desired *not* to have any 2 moves rated equal, have a small
random chance of playing the second best move if its evaluation is
close to the best move's? That way it would be much harder for the
human to play prepared lines.

Sterten
Aug 16, 2001, 8:54:16 AM
Robert Hyatt wrote:
>Sterten <ste...@aol.com> wrote:
>> Robert Hyatt wrote:
>> >Sterten <ste...@aol.com> wrote:
>> >> M.Scheidl wrote:

>> >> >I don't think a program can be re-written and optimized for another
>> >> >computer system within the available time.
>> >
>> >> they can run it on 8 processors, so why not 64 ?
>> >
>> >The problem is that there are no pc-compatible machines with shared
>> >memory _and_ 64 cpus. The memory system becomes a very limiting
>> >bottleneck. No
>
>> redoubling memory is only worth 7-15 ELO points. There is no big loss,
>> if every processor has only -say- 128MB available.
>> Transferring a chess-sub-position for analysis and receiving the result
>> doesn't require much memory nor bandwith/speed.
>
>It is daunting when you do a hash probe for every node, on every processor,
>plus shared kiiller moves, history move data, and other things that are
>shared thru memory...
>
>This isn't an issue of _size_ it is an issue of _bandwidth_. Chess engines
>require a lot of bandwidth.

Sorry if I was unclear. I meant: sharing no data at all. Only the
sub-position is transferred; the sub-processor builds its own data
in its own memory. I don't think that there is a big disadvantage
in not reusing already-created data.

>> >one has yet tried to build large numbers of processors with
>> >uniform-access shared memory, with the exception of companies like
>> >Cray. Their machines cost a tad more. :)
>> >
>> >
>> >There has always been a superstition that deeper searches produce a
>> >smaller and smaller improvement in program chess skill. So far,
>> >there is no evidence to support this. There is some evidence to
>> >suggest that it doesn't happen, however.
>
>> so, you are saying that our 70-points rule applies quite well even in
>> high ELO-ranges and that the previous 58-49-41-... estimates are
>> not correct ?
>
>I am not saying that at all. I am saying there is no proof that the
>diminishing returns is real. There is some evidence that it is not
>real, but the jury will be out for a long time on this question.

You're formulating very carefully. But I gather from your wording
that you see more evidence for a linear relationship than otherwise.

> >> >> It appears to me that you and Andy (and probably most of the
> >> >> other people who expect Kramnik to win) regard these SSDF
> >> >> values as too high.
> >> >> Maybe another 100 points too high ?
> >> >> Or how many ?
> >> >
> >> >Hard to say. Elo ratings only mean something within the context of
> >> >the "pool" of players used to produce the ratings. SSDF ratings are
> >> >based on a pool of computer opponents, with little human rating
> >> >included. As a result, the ratings are pretty accurate, but only
> >> >when used to predict the game outcome between two players _in_ the
> >> >rating pool. Comparing their ratings to FIDE is statistically
> >> >invalid.
>
>> well, not invalid as there _are_ some games. But probably less accurate.
>> So I do understand that your estimate would be difficult , ...but
>> can you give a number, please ?
>
>N
>
>:)
>
>There is no way to do this....

I'm sure you do have an estimate (which you don't want to tell),
reflecting your current state of information.

>You could take all the SSDF programs and
>enter them into FIDE or USCF events to get real ratings there, then you could
>compare.

I'd be content with less

>But with the Elo formula, it is the _spread_ between two player's
>ratings that is important. IE a 1400 player will win 3 of 4 games vs a 1200
>player. A 2600 player will win 3 of every 4 games vs a 2400 player.
>
>Thinking of ratings as "absolute" is wrong...

Once you fix a reference point, they are absolute.
If there's a problem with two badly comparable lists, then I'm
trying to do the best possible with reasonable effort.

>> >> I still calculate that DF is favorite.
>
>> ... under average circumstances. I didn't study the exact rules of
>> the match.
>> I want to figure out, who is better before the match starts,
>> so I needn't the match itself to decide that ;-)
>
>Not an easy task.

depends on the degree of reliability that is required

>Unless you found a crystal ball that is still functional. :)

I was hoping for a bit more reliability than this


Actually, I'm now estimating 2835 for DF's ELO in FIDE terms under
average tournament conditions, after what I've read here over the last
days:


  2653  DF, K6-2-450, 128MB RAM
   -50  SSDF list too high compared with FIDE
  +55*log_2(6.5*1000/450*1.2)  (1.2 for a better processor)
  +10*log_2(512/128)           (512MB estimated memory per processor)
   +20  for better software
 -----
  2869

The 55 is interpolated: 80 (at ELO 2000), 60 (at ELO 2600), 50 (at ELO
2900). I also found Eric's 60 (vs. 70 from SSDF) on a webpage.


2869 vs. 2802 would mean an estimated 4.5:3.5 for DF.
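
The whole estimate as a Python sketch (every adjustment term is a
guess, as stated line by line above; the expectancy comes out slightly
above the 4.5:3.5 I quoted):

  from math import log2
  elo = (2653                                  # SSDF, K6-2-450, 128MB
         - 50                                  # SSDF list vs. FIDE
         + 55 * log2(6.5 * 1000 / 450 * 1.2)   # 8 CPUs, 6.5x, 1.2x core
         + 10 * log2(512 / 128)                # 512MB per processor
         + 20)                                 # better software
  print(round(elo))                                      # 2869
  print(round(8 / (1 + 10 ** ((2802 - elo) / 400)), 1))  # ~4.8 of 8 for DF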


Guenter

Robert Hyatt
Aug 16, 2001, 10:12:49 AM

Alpha/beta doesn't work like that. If you could randomly affect the move
order, this might work. But alpha/beta always takes the _first_ of a set
of N equal moves...

Robert Hyatt
Aug 16, 2001, 10:18:27 AM

This is a huge penalty. Just disable the transposition table and try a
search to see what I mean. You can find some test results in some of the
old JICCA issues where different people experimented with a program that
did (and did not) share memory between processors (by memory, I mean
trans table memory).

Even on distributed programs, it has been proven to be better to implement
a shared trans table even though it means lots of message traffic.

> >> >one has yet tried to build large numbers of processors with
> >> >uniform-access shared memory, with the exception of companies like
> >> >Cray. Their machines cost a tad more. :)
> >> >
> >> >
> >> >There has always been a superstition that deeper searches produce
> >> >a smaller and smaller improvement in program chess skill. So far,
> >> >there is no evidence to support this. There is some evidence to
> >> >suggest that it doesn't happen, however.
> >
> >> so, you are saying that our 70-points rule applies quite well even in
> >> high ELO-ranges and that the previous 58-49-41-... estimates are
> >> not correct ?
> >
> >I am not saying that at all. I am saying there is no proof that the
> >diminishing returns is real. There is some evidence that it is not
> >real, but the jury will be out for a long time on this question.

> you're very carefully formulating. But I extract from your wording
> that you see more evidence for linear distribution than otherwise.

Correct. But it is easier to "wait and see" since we are going faster
every year... :)


> >> >> It appears to me that you and Andy (and probably most of the
> >> >> other people who expect Kramnik to win) regard these SSDF
> >> >> values as too high.
> >> >> Maybe another 100 points too high ?
> >> >> Or how many ?
> >> >
> >> >Hard to say. Elo ratings only mean something within the context of
> >> >the "pool" of players used to produce the ratings. SSDF ratings are
> >> >based on a pool of computer opponents, with little human rating
> >> >included. As a result, the ratings are pretty accurate, but only
> >> >when used to predict the game outcome between two players _in_ the
> >> >rating pool. Comparing their ratings to FIDE is statistically
> >> >invalid.
> >
> >> well, not invalid as there _are_ some games. But probably less accurate.
> >> So I do understand that your estimate would be difficult , ...but
> >> can you give a number, please ?
> >
> >N
> >
> >:)
> >
> >There is no way to do this....

> I'm sure, you do have an estimate (which you don't want to tell)
> reflecting your current state of information

I really don't. I believe today's best programs on good hardware are
in the upper 2400 to lower 2500 range of performance. But that leaves
a lot of room for error...

> >You could take all the SSDF programs and
> >enter them into FIDE or USCF events to get real ratings there, then you could
> >compare.

> I'd be content with less

> >But with the Elo formula, it is the _spread_ between two player's
> >ratings that is important. IE a 1400 player will win 3 of 4 games vs a 1200
> >player. A 2600 player will win 3 of every 4 games vs a 2400 player.
> >
> >Thinking of ratings as "absolute" is wrong...

> once you fix a reference point, then they are absolute.
> If there's a problem with two badly comparable lists, then I'm
> trying to do the best possible with reasonable effort


NO, they aren't absolute. You can take the current FIDE ratings and add N
to every one of them (where N is a constant) without changing a single thing
other than the value of each player's rating. That is the problem with the
Elo rating number. You can use two ratings from the same "pool" of players
to predict the outcome if those two players meet in a game. You can't take
the ratings of two different pools (i.e. USCF vs BCF) and predict anything at
all.


> >> >> I still calculate that DF is favorite.
> >
> >> ... under average circumstances. I didn't study the exact rules of
> >> the match.
> >> I want to figure out, who is better before the match starts,
> >> so I needn't the match itself to decide that ;-)
> >
> >Not an easy task.

> depends on the degree of reliability that is required

> >Unless you found a crystal ball that is still functional. :)

> I was hoping for a bit more reliability than this


> Actually, I'm now estimating 2835 for DF's ELO in FIDE terms under
> average tournament conditions, after what I've read here over the last
> days

That is 300 too high at least. A micro program is most definitely not a
Super-GM.


> 2653 DF,K6-2-450,128RAM
> -50 SSDF list too high compared with FIDE
> +log_2(6.5*1000/450*1.2)*55 1.2 for better processor
>
> +log_2(512/128) * 10 for 512MB estimated memory per processor
> +20 for better software
> -----
> 2869

> 55 is interpolated: 80(ELO=2000),60(ELO=2600),50(ELO=2900)
> I also found Eric's 60 (vs.70 from SSDF) on a webpage


> 2869 vs. 2802 would mean an estimated 4.5:3.5 for DF .


> Guenter

--

Sterten
Aug 16, 2001, 11:10:39 AM
Robert Hyatt wrote:

>This is a huge penalty. Just disable the transposition table and try a
>search to see what I mean. You can find some test results in some of the
>old JICCA issues where different people experimented with a program that
>did (and did not) share memory between processors (by memory, I mean
>trans table memory).

I don't know about transposition tables, the JICCA, or trans table
memory. SSDF says doubling memory is worth 7 ELO points.
Now, on every move Fritz could partition its possibilities
(or all level-2 possibilities etc.) into 8 different sets
and analyse them separately, without shared memory, one on each CPU.
I think this logically follows from the 7-point difference.
Am I wrong?
>> you're very carefully formulating. But I extract from your wording
>> that you see more evidence for linear distribution than otherwise.
>
>Correct. But it is easier to "wait and see" since we are going faster
>every year... :)

Feel free to correct your estimate every year, month or day :)


>> I'm sure, you do have an estimate (which you don't want to tell)
>> reflecting your current state of information
>
>I really don't. I believe today's best programs on good hardware are
>in the upper 2400 to lower 2500 range of performance. But that leaves
>a lot of room for error...

Upper 2400 to lower 2500 is better than no estimate at all, and
surprisingly low. That's 150-200 points lower than the SSDF list!
(2653 for DF6)


>> >Thinking of ratings as "absolute" is wrong...
>
>> once you fix a reference point, then they are absolute.
>> If there's a problem with two badly comparable lists, then I'm
>> trying to do the best possible with reasonable effort
>
>
>NO they aren't absolute. You can take the current FIDE ratings, and add N
>to every one of them (where N is a constant) without changing a single thing
>other than the value of each player's rating. That is the problem with the
>Elo rating number. You can use two ratings from the same "pool" of players
>to predict the outcome if those two players meet in a game.

OK. So, no problem with ELO here.

>You can't take
>the ratings of two different pools (ie USCF vs BCF) and predict anything at
>all.

I don't know USCF or BCF. But presumably there are some games between
players from different lists.

>> actually estimating 2835 for DF's ELO in FIDE terms on
>> average tournament conditions , after what I read here the last days
>
>That is 300 too high at least. A micro program is most definitely not a
>Super-GM.

300? Are we talking about the same system?
I meant DF7, on the Bahrain hardware.

>> 2653 DF,K6-2-450,128RAM
>> -50 SSDF list too high compared with FIDE
>> +log_2(6.5*1000/450*1.2)*55 1.2 for better processor
>>
>> +log_2(512/128) * 10 for 512MB estimated memory per processor
>> +20 for better software
>> -----
>> 2869

I'm already down 50 points or so after your estimate.
Would need some more opinions.

Guenter

Sterten

Aug 16, 2001, 11:11:29 AM
>Akorps <akorps> wrote:
>>>> How is it made possible that Kramnik doesn't play pre-played lines against DF?
>
>> Can't the computer introduce some
>> randomness into its move selection?
>> If 2 moves are evaluated equally choose
>> randomly between them, and if it is
>> desired *not* to have any 2 moves rated
>> equal, have a small random chance of playing
>> the second best move, if its evaluation is
>> close to the best move? That way it
>> would be much harder for the human to
>> play prepared lines.
>
>Alpha/beta doesn't work like that. If you could randomly affect the move
>order, this might work. But alpha/beta always takes the _first_ of a set
>of N equal moves...

but it's very easy to change this a little bit, isn't it?

Tord Kallqvist Romstad

Aug 16, 2001, 12:02:51 PM
ste...@aol.com (Sterten) writes:

> >Alpha/beta doesn't work like that. If you could randomly affect the move
> >order, this might work. But alpha/beta always takes the _first_ of a set
> >of N equal moves...
>
> but it's very easy to change this a little bit , isn't it ?

It is, but it would make the program much slower and weaker. The
whole point of alpha beta is that you only have to prove that the move
you are currently searching is worse than the best move so far, not
how much worse.
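To illustrate, here is a toy C sketch of plain fail-hard alpha-beta (the
Position/Move types, evaluate() and the move helpers are assumed names for
illustration, not any real engine's code):

    /* Returns a score; on a cutoff it only proves "at least beta",
       never by how much the refuted move actually wins or loses. */
    int alphabeta(Position *pos, int depth, int alpha, int beta) {
        if (depth == 0)
            return evaluate(pos);   /* static eval, from side to move */
        Move moves[MAX_MOVES];
        int n = generate_moves(pos, moves);
        for (int i = 0; i < n; i++) {
            make_move(pos, moves[i]);
            int score = -alphabeta(pos, depth - 1, -beta, -alpha);
            unmake_move(pos, moves[i]);
            if (score >= beta)
                return beta;        /* refuted: stop measuring here */
            if (score > alpha)
                alpha = score;      /* new best move so far */
        }
        return alpha;
    }

Note how the cutoff returns immediately: the search never learns whether
the refutation was by a pawn or by a queen.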

--
Tord Romstad

Robert Hyatt

Aug 16, 2001, 12:16:03 PM
Sterten <ste...@aol.com> wrote:
> Robert Hyatt wrote:

> >This is a huge penalty. Just disable the transposition table and try a
> >search to see what I mean. You can find some test results in some of the
> >old JICCA issues where different people experimented with a program that
> >did (and did not) share memory between processors (by memory, I mean
> >trans table memory).

> I don't know about transposition table,JICCA, trans table memory.
> SSDF says redoubling of memory is worth 7 ELO-points.
> Now, on every move Fritz can partition his possibilities
> (or all level-2 possibilities etc.) into 8 different sets
> and analyse them separately without shared memory on each CPU.
> I think , this logically follows from the 7-difference.
> Am I wrong ?

The SSDF numbers are simply _wrong_. There is no linear relationship
between rating and memory size, except for very small hash table sizes.

I.e., if you look at dejanews, you will find a test I ran several years ago
when Alan (Komputer Korner) asked the question. I found that from a table that
is way too small, to one that is the right size, the speed of the program
just about doubled. Beyond the "right size" nothing changed. Because once
the table is big enough to hold the entire tree you are searching, larger
doesn't do a thing for you.


> >> you're very carefully formulating. But I extract from your wording
> >> that you see more evidence for linear distribution than otherwise.
> >
> >Correct. But it is easier to "wait and see" since we are going faster
> >every year... :)

> feel free to correct your estimate every year - month or day :)

I am slowly adjusting it upward as speed definitely helps, contrary to what
some would have you think. But the problem is this: Given two computers
playing each other, if you double the speed of one, it will have a lop-sided
effect on the match result. Typically 70 rating points. But if you play
a human GM with a machine, and then you double its speed, you won't see
anywhere near that big an improvement. I have tried this on ICC on
_several_ occasions... taking my original pentium pro 200 X 4 machine (four
processors) and playing games vs the same GM (Roman) using 2 and 4 processors.
He could not tell the difference, basically, which lends credibility to my
claim that playing a program against computers and playing it against humans
is two different things, totally.

> >> I'm sure, you do have an estimate (which you don't want to tell)
> >> reflecting your current state of information
> >
> >I really don't. I believe today's best programs on good hardware are
> >in the upper 2400 to lower 2500 range of performance. But that leaves
> >a lot of room for error...

> upper2400-lower2500 is better than no estimate at all and surprisingly
> low. That's 150-200 points lower than the SSDF list ! (2653 for DF6)

I know. But remember, the SSDF rating pool has _no_ humans in it. So
that is not a surprise. The number that is important is

rating(player1) - rating(player2) which gives a statistical prediction on
who will win a game and how frequently. Doesn't matter whether they are
rated 1400 and 1450 or 5690 and 5740. The statistical prediction is the
same for either set of ratings.


> >> >Thinking of ratings as "absolute" is wrong...
> >
> >> once you fix a reference point, then they are absolute.
> >> If there's a problem with two badly comparable lists, then I'm
> >> trying to do the best possible with reasonable effort
> >
> >
> >NO they aren't absolute. You can take the current FIDE ratings, and add N
> >to every one of them (where N is a constant) without changing a single thing
> >other than the value of each player's rating. That is the problem with the
> >Elo rating number. You can use two ratings from the same "pool" of players
> >to predict the outcome if those two players meet in a game.

> OK. So, no problem with ELO here.

> >You can't take
> >the ratings of two different pools (ie USCF vs BCF) and predict anything at
> >all.

> I don't know USCF,BCF. But presumably there are some games between
> players from different list.

Yes. USCF is the United States Chess Federation, which is a different pool
of players from the British Chess Federation, or the Canadian Federation,
or FIDE, or the correspondence ratings, or the Israeli league, etc...

> >> actually estimating 2835 for DF's ELO in FIDE terms on
> >> average tournament conditions , after what I read here the last days
> >
> >That is 300 too high at least. A micro program is most definitely not a
> >Super-GM.

> 300 ? Are we talking about the same System ?
> I meant DF7,on the Bahrain-hardware.

The hardware DF is using is only 2x faster than the hardware I normally use
with Crafty. It is not a super-computer as ChessBase would have you believe.
In fact, it probably is not 2x faster than mine, as the 8-way boxes have a
significant memory bandwidth bottleneck that isn't as serious on 4-way boxes.


> >> 2653 DF,K6-2-450,128RAM
> >> -50 SSDF list too high compared with FIDE
> >> +log_2(6.5*1000/450*1.2)*55 1.2 for better processor
> >>
> >> +log_2(512/128) * 10 for 512MB estimated memory per processor


That is wrong. The DF machine will probably have 512mb _total_. It is a
shared-memory architecture like my quad xeons here at UAB. My office machine
has 4 700mhz xeon processors, each with 1024K of L2 cache, sharing one large
512mb memory system.


> >> +20 for better software
> >> -----
> >> 2869

> I'm already down 50 points or such after your estimate.
> Would need some more opinions.

> Guenter

--

Robert Hyatt

Aug 16, 2001, 12:16:49 PM

Nope. The way alpha/beta works is to search a move, establish a score,
then search the rest of the moves and toss out any that have a score <=
that of the first move. To remove that = would absolutely kill search
performance.

Sterten

Aug 16, 2001, 12:20:50 PM
Tord Romstad wrote:

But computers do this by assigning values to each move or subposition.
There are sometimes situations where the best and second-best
moves are only a tiny fraction of a point apart in value.
It wouldn't matter if you chose the second-best move
in such a situation.


Guenter

Dr Stupid

Aug 16, 2001, 1:12:01 PM
On 16 Aug 2001 16:16:49 GMT, Robert Hyatt <hy...@crafty.cis.uab.edu>
wrote:

>> >
>> >> Can't the computer introduce some
>> >> randomness into its move selection?

>> >Alpha/beta doesn't work like that. If you could randomly affect the move
>> >order, this might work. But alpha/beta always takes the _first_ of a set
>> >of N equal moves...
>
>> but it's very easy to change this a little bit , isn't it ?
>
>Nope. The way alpha/beta works is to search a move, establish a score,
>then search the rest of the moves and toss out any that have a score <=
>that of the first move. To remove that = would absolutely kill search
>performance.

Is not the way some engines do this to make their evaluation function
'dust' the evaluation with a little random jitter? That way the
alpha/beta algorithm is unchanged, only the static position evaluation
varies. I used to have a Saitek computer that had this option.


Andy Platt

Aug 16, 2001, 2:10:29 PM
> Is not the way some engines do this to make their evaluation function
> 'dust' the evaluation with a little random jitter? That way the
> alpha/beta algorithm is unchanged, only the static position evaluation
> varies. I used to have a Saitek computer that had this option.

But when you are playing Kramnik, you really want the best evaluation of a
position you can get!!!

Andy.

--
I'm not really here - it's just your warped imagination.

Robert Hyatt

Aug 16, 2001, 2:17:04 PM


> Guenter

The problem is that you don't know that the "second best" move is just a
tiny fraction worse. Alpha/Beta doesn't make that distinction. When it
proves a move is worse, it could be .001 pawns worse, or 3 queens worse,
and you can't tell due to the way alpha/beta prunes the moves...

Robert Hyatt

Aug 16, 2001, 2:17:50 PM


What about hash tables? And the fact that now the same position can
have multiple different scores? It could lead to significant performance
penalties as well as inconsistent search results.

Dr Stupid

Aug 16, 2001, 3:17:58 PM
On 16 Aug 2001 18:17:50 GMT, Robert Hyatt <hy...@crafty.cis.uab.edu>
wrote:

>> Is not the way some engines do this to make their evaluation function
>> 'dust' the evaluation with a little random jitter? That way the
>> alpha/beta algorithm is unchanged, only the static position evaluation
>> varies. I used to have a Saitek computer that had this option.
>
>
>What about hash tables? And the fact that now the same position can
>have multiple different scores? It could lead to significant performance
>penalties as well as inconsistent search results.

As far as hash tables are concerned, the evaluation (with its random
jitter) is stored in the hash. Since most computer engines do not
persist the hash table from one game to the next (I know crafty is an
exception) one would still get slight variations from one game to the
next.

Of course adding a small random factor to evaluations is probably
going to weaken the engine. Every program which offers this feature
I've seen points out that it results in suboptimal playing strength.
But if the jitter is very small, it presumably will not affect tree
searches. The idea would be to make it just large enough to add a
little variety.

I don't have enough expertise to estimate how much such a small jitter
would weaken an engine. I merely know that such techniques are used by
some programs because they allow alpha-beta to be used as normal,
whereas playing "the second-best move" does not. As another poster
said, against Kramnik the computer would be advised to play as strongly
as it can! Kramnik has said that he is not interested in playing
prepared lines against Deep Fritz, he wants to outplay it
strategically (presumably using typical 'anti-computer' ideas, which
against the traditionally pawn-grabbing Fritz should be quite
effective.)


Tord Kallqvist Romstad

Aug 17, 2001, 3:29:40 AM
ste...@aol.com (Sterten) writes:

This is all true, but alpha-beta does not give you any kind of
information about how big the difference between the best and the
second best move is. In fact, it does not even provide a second best
move at all. The search returns a score and a best move, nothing
else. You are right when you claim that computers make their decision
by "assigning values to each move or subposition", but the values for
most nodes in the game tree are not exact values, but upper or lower
bounds. For the root moves, you will usually get an exact value for
the best move and just upper bounds for the values of the rest of the
moves.

--
Tord Romstad

Sterten

Aug 17, 2001, 4:08:54 AM
>This is all true, but alpha-beta does not give you any kind of
>information about how big the difference between the best and the
>second best move is. In fact, it does not even provide a second best
>move at all. The search returns a score and a best move, nothing
>else. You are right when you claim that computers make their decision
>by "assigning values to each move or subposition", but the values for
>most nodes in the game tree are not exact values, but upper or lower
>bounds. For the root moves, you will usually get an exact value for
>the best move and just upper bounds for the values of the rest of the
>moves.


A good computer program should assign values to each move on
each level (maybe better two values, one for the expectation
value and one for the deviation),
and then decide, based on the moves' values, how much time to spend
on each move at the next level.
OK, if a value seems to change dramatically, then the time assignments
could be redone.

If you only assign a value to the currently best move, then
you are disregarding useful information. That can't be good, IMO.

Sterten

Aug 17, 2001, 4:09:46 AM
>As far as hash tables are concerned, the evaluation (with its random
>jitter) is stored in the hash. Since most computer engines do not
>persist the hash table from one game to the next (I know crafty is an
>exception) one would still get slight variations from one game to the
>next.
>
>Of course adding a small random factor to evaluations is probably
>going to weaken the engine. Every program which offers this feature
>I've seen points out that it results in suboptimal playing strength.
>But if the jitter is very small, it presumably will not affect tree
>searches. The idea would be to make it just large enough to add a
>little variety.

In positions without tactical complications, the best moves are
often quite similar in value; you can watch this with your
computer. Take e.g. the starting position/opening choice.
When you realize that your opponent is better than you in
some particular opening, you would choose another one.
DF should be able to do this too.

>I don't have enough expertise to estimate how much such a small jitter
>would weaken an engine. I merely know that such techniques are used by
>some programs because they allow alpha-beta to be used as normal,
>whereas playing "the second-best move" does not.

This is a matter of implementation. The overall effect is the same.

>As another poster
>said, against Kramnik the computer would be advised to play as strong
>as it can!

It could specialize in Kramnik's style and fine-tune some of its
parameters accordingly.

>Kramnik has said that he is not interested in playing
>prepared lines against Deep Fritz, he wants to outplay it
>strategically (presumably using typical 'anti-computer' ideas, which
>against the traditionally pawn-grabbing Fritz should be quite
>effective.)

You could tell it:
"Hey, Fritz, take care, this Kramnik is going to offer poisoned pawns"
but you would have to translate that into a Fritz-readable form:

increase parameters for development advantage, king safety,
or piece dynamics, or spend some additional time considering
possible pawn sacrifices in advance.

Sterten

Aug 17, 2001, 4:11:44 AM
>What about hash tables?

Don't know. Are they important?

>And the fact that now the same position can
>have multiple different scores?

Take the average.
But if this happens and there are significant
differences, then an alarm bell should ring and you should
redo the whole calculation with more time at a higher level.

>It could lead to significant performance
>penalties as well as inconsistent search results.

Not if implemented correctly.

Sterten

Aug 17, 2001, 4:12:40 AM
>match to a somewhat weaker machine than the one the Kramnik will be playing.

I think it was Kramnik who said this.
That increases the value of his result
if he increases the apparent strength of his opponent.

>As for this being a match of machine vs. man, my view is somewhat different.
>Fritz, like Blue, is a human creation. The match is really between Fritz's
>creators and Kramnik.

Hehe, and who created Kramnik?
And what about all the companies that made the hardware? ;-)

Sterten

Aug 17, 2001, 4:13:51 AM
>> >
>> >Alpha/beta doesn't work like that. If you could randomly affect the move
>> >order, this might work. But alpha/beta always takes the _first_ of a set
>> >of N equal moves...
>
>> but it's very easy to change this a little bit , isn't it ?
>
>Nope. The way alpha/beta works is to search a move, establish a score,
>then search the rest of the moves and toss out any that have a score <=
>that of the first move. To remove that = would absolutely kill search
>performance.

You mean "<" instead of "<=" ?!
We usually have 1/100-pawn precision, I think.
So "=" will occur rarely. And I can't understand why it should
kill search performance, unless this is some special, strange
implementation or there is a bug.

We could also use "jittering"; that should also work.
But I would just allow some randomized tolerance.
You could also decrease the precision or occasionally permute
the moves, or the like.

I don't know much about alpha/beta, but if it has a problem here,
then I wonder why (or whether) it can be good.
IMO, a good chess algorithm has to try to somehow evaluate all possible
moves, spending more time on the high-valued moves of course.

Once the two best moves are very close together in value,
there is no problem with switching.


Guenter

Sterten

Aug 17, 2001, 4:15:13 AM
>The SSDF numbers are simply _wrong_. there is no linear relationship
>between rating and memory size, except for very small hash table sizes.

But they did quite a lot of games to test this, at least with
computer vs computer. This sort of thing they should be able
to measure quite well.

>IE if you look at dejanews, you will find a test I ran several years ago when
>Alan (Komputer Korner) asked the question. I found that from a table that
>is way too small, to one that is the right size, the speed of the program
>just about doubled. Beyond the "right size" nothing changed. Because once
>the table is big enough to hold the entire tree you are searching, larger
>doesn't do a thing for you.

I couldn't find it (keywords: alan, komputer, korner, hyatt, memory or ram).
How many games, how many different computers?

>> >> you're very carefully formulating. But I extract from your wording
>> >> that you see more evidence for linear distribution than otherwise.
>> >
>> >Correct. But it is easier to "wait and see" since we are going faster
>> >every year... :)
>
>> feel free to correct your estimate every year - month or day :)
>
>I am slowly adjusting it upward as speed definitely helps, contrary to what
>some would have you think. But the problem is this: Given two computers
>playing each other, if you double the speed of one, it will have a lop-sided
>effect on the match result. Typically 70 rating points. But if you play
>a human GM with a machine, and then you double its speed, you won't see
>anywhere near that big an improvement. I have tried this on ICC on
>_several_ occasions... taking my original pentium pro 200 X 4 machine (four
>processors) and playing games vs the same GM (Roman) using 2 and 4 processors.
>He could not tell the difference, basically, which lends credibility to my
>claim that playing a program against computers and playing it against humans
>is two different things, totally.

How many games? It doesn't make too much sense to me. Increasing speed
must help, also against humans. Maybe less than against computers,
e.g. if they play an anti-computer strategy.
I could accept if it drops to, say, 40 from 70. More would violate my
"feeling" for how chess works. You can watch the computer thinking,
and sometimes it definitely finds better moves with more time given.
Better moves mean more points in the long run, no doubt.

>> upper2400-lower2500 is better than no estimate at all and surprisingly
>> low. That's 150-200 points lower than the SSDF list ! (2653 for DF6)
>
>I know. But remember, the SSDF rating pool has _no_ humans in it. So

I think I read somewhere that some (few) human-computer results
confirmed the ratings. And they already adjusted the list by 100 points.
So they must have had some reason (= human-computer games) to do this.

>that is not a surprise. The number that is important is
>rating(player1) - rating(player2) which gives a statistical prediction on
>who will win a game and how frequently. Doesn't matter whether they are
>rated 1400 and 1450 or 5690 and 5740. The statistical prediction is the
>same for either set of ratings.

OK

>> >> actually estimating 2835 ...

>> >That is 300 too high at least. A micro program is most definitely not a
>> >Super-GM.
>
>> 300 ? Are we talking about the same System ?
>> I meant DF7,on the Bahrain-hardware.
>
>The hardware DF is using is only 2x faster than the hardware I normally use
>with Crafty. It is not a super-computer as ChessBase would have you believe.
>In fact, it probably is not 2x faster than mine, as the 8-way boxes have a
>significant memory bandwidth bottleneck that isn't as serious on 4-way boxes.

With 2535 you are much, much lower than even Kramnik and Keene,
and lower than the other posters here.
2535 vs. 2806 would mean: >99% that Kramnik wins.
You should really find a bookmaker and place a bet on Kramnik!

>>>> 2653 DF,K6-2-450,128RAM
>>>> -50 SSDF list too high compared with FIDE
>>>> +log_2(6.5*1000/450*1.2)*55 1.2 for better processor
>>>>
>>>> +log_2(512/128) * 10 for 512MB estimated memory per processor
>
>
>That is wrong. The DF machine will probably have 512mb _total_.

Uhh, can't they afford more?

The estimate then goes down 30 points, to about Kramnik's ELO.
Under average conditions, disregarding the special rules of the match.

Robert Hyatt

Aug 17, 2001, 10:00:33 AM

Perhaps it can't be "good", but it works well. First, if you search
to depth D, with a branching factor of W, then you have to search W^D
nodes. With alpha/beta, you search 2*sqrt(W^D) nodes for the same depth.
Which means that effectively, alpha/beta searches _twice_ as deep as pure
minimax tree searching.

Second, if you look up "An analysis of alpha/beta pruning" by Knuth/Moore
(circa 1975) you will find a proof that alpha/beta will find the _exact_
same best move/score as minimax while doing a much reduced level of work.
Notice the emphasis on _exact_.

Who cares about how much worse any one move is than another. The only point
that is important is that you select the _best_ move. And alpha/beta does
this more efficiently than any other search algorithm.
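To put rough numbers on that (a back-of-the-envelope illustration, not a
measurement): with a branching factor of W=38 and depth D=8, pure minimax
visits 38^8, about 4.3*10^12 nodes, while perfectly ordered alpha/beta
visits about 2*sqrt(38^8) = 2*38^4, roughly 4.2*10^6 nodes, about a million
times fewer for the same result.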

Robert Hyatt

Aug 17, 2001, 10:04:18 AM
Sterten <ste...@aol.com> wrote:
> >What about hash tables?

> don't know. Are they important ?

> >And the fact that now the same position can
> >have multiple different scores?

> take the average.


That doesn't make any sense in the context of the tree search. If you
do a static evaluation of position P, you get "X" which is a normal score
plus some random value factored in. If you probe for position P, you get
a score of "Y" because the same position will produce different scores with
a random value added in.

What do you do if one time you get value X, the next time you get value Y,
the next time you get value Z, all for the same position? It will screw up
move ordering and greatly reduce the efficiency of alpha/beta.


> But if this happens and there are significant
> differences , then an alarm bell should ring and you should
> redo the whole calculation with more time on a higher level.

> >It could lead to significant performance
> >penalties as well as inconsistent search results.

> not if implemented correctly

Sorry, but until you understand how alpha/beta works, I don't see how
you can make a direct statement like that. Alpha/beta is all about getting
the "move ordering" as correct as possible. As it learns that moves are
good, your random scoring will then make them look a little worse. It will
kill performance. And performance == tactics.
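To illustrate what "move ordering" means here (a hypothetical sketch; the
Move type and its == comparison are assumed): a common trick is to search
the transposition table's remembered best move first, so the cutoff comes
at once.

    /* Swap the hash table's remembered best move to the front. */
    void order_moves(Move *moves, int n, Move hash_move) {
        for (int i = 0; i < n; i++) {
            if (moves[i] == hash_move) {
                Move tmp = moves[0];
                moves[0] = moves[i];
                moves[i] = tmp;
                break;
            }
        }
    }

If random jitter keeps changing which move looks "best", this remembered
guess becomes unstable, and the cutoffs come later or not at all.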

Robert Hyatt

Aug 17, 2001, 10:09:43 AM
Sterten <ste...@aol.com> wrote:
> >> >
> >> >Alpha/beta doesn't work like that. If you could randomly affect the move
> >> >order, this might work. But alpha/beta always takes the _first_ of a set
> >> >of N equal moves...
> >
> >> but it's very easy to change this a little bit , isn't it ?
> >
> >Nope. The way alpha/beta works is to search a move, establish a score,
> >then search the rest of the moves and toss out any that have a score <=
> >that of the first move. To remove that = would absolutely kill search
> >performance.

> you mean "<" instead of "<=" ?!


Nope. I mean <=. If a move is no better than the best move so far,
I don't search it any further than necessary. What would be the point
to waste time thoroughly examining a move that has the _same_ score as
the best move so far? If I can prove it is no better, there is no need
to search it even if it is exactly equal also.


> We usually have 1/100 pawn-units precision , I think .
> So "=" will occur rarely. And I can't understand why it should
> kill search performance, unless this is some special,strange
> implementation or there is a bug.

= occurs in 99.99999999% of the searches using alpha/beta with the
"PVS" enhancement, which makes it even more efficient.

> We could also use "jittering", that should also work.
> But I would just allow some randomized tolerance.
> You could also decrease the precision or occasionally permute
> the moves or such.


Still that kills alpha/beta, and alpha/beta with PVS. And I do mean
"kills".


> I don't know much about alpha/beta , but if it has a problem here,
> then I wonder, why/whether it can be good.

If you make a move and I completely analyze all responses to it and prove
that you win no material, and I win no material, we know something about
that move, correct? Now you try a different move and I quickly prove that
based on the first reply I try, you lose a pawn. Should I continue to
look at replies to that move to see if I can win even more, or would you
give up on it after proving it loses at least a pawn, when your best move
so far doesn't lose a thing? A pawn is enough. So what if it actually loses a
queen, or even leads to checkmate? Losing a pawn is enough to convince me I
won't play the move.

That is alpha/beta, in a nutshell.


> IMO , a good chess-algo has to try to somehow evaluate all possible
> moves , spending more time on the high-valued moves of course.

That is also alpha/beta in a nutshell. :) Searching only the moves that
could actually not be rejected as above.

> Once the two best moves are very close together in value,
> there is no problem with switching.

There is a _big_ problem...


> Guenter

Robert Hyatt

Aug 17, 2001, 10:21:00 AM
Sterten <ste...@aol.com> wrote:
> >The SSDF numbers are simply _wrong_. there is no linear relationship
> >between rating and memory size, except for very small hash table sizes.

> but they did quite a lot games to test this. At least with
> computer vs computer.This sort things, they should be able
> to measure quite well

They compared tiny to small. But it is provable by induction that when the
table becomes large enough to hold the entire tree you search, making it
even larger has absolutely no effect on making the search go still faster.

Dozens. Played over two or three weeks. I just randomly changed it, and
let it play for a day or two, then I would change it again. But only when
it was playing Roman.

The result didn't surprise me because when I went from a 4 X 400mhz machine to
a 4 x 550 or 4 x 700, I didn't see any great change in how he did. I saw
almost no change in fact, although against other computers Crafty did better
as the hardware got faster.

> I could accept if it drops to -say- 40 from 70. More would violate my
> "feeling" for how chess works. You can watch the computer thinking
> and sometimes it definately finds better moves with more time given.
> Better moves means more points in the long run , no doubt.

More time == better moves, yes. But mainly in tactics. If a program doesn't
understand a pawn majority, or a pawn lever, or king safety, going 2x faster
isn't going to let it suddenly understand those ideas.


> >> upper2400-lower2500 is better than no estimate at all and surprisingly
> >> low. That's 150-200 points lower than the SSDF list ! (2653 for DF6)
> >
> >I know. But remember, the SSDF rating pool has _no_ humans in it. So

> I think, I read somewhere that some (few) human-computer results
> confirmed the ratings. And they already adjusted the list 100 points.
> So, they must have had some reason ( = human-computer games) to do this.

What happens if you adjust the ratings, then play more games, but _only_
computer vs computer? It is pretty well known that if you make a small
change to a program and play it against the unchanged version, the rating
difference will be highly exaggerated compared with how the same program
does against strong humans. So if you calibrate the list, then allow the
programs to continue playing without any humans in the pool, the ratings
still drift far from reality...


> >that is not a surprise. The number that is important is
> >rating(player1) - rating(player2) which gives a statistical prediction on
> >who will win a game and how frequently. Doesn't matter whether they are
> >rated 1400 and 1450 or 5690 and 5740. The statistical prediction is the
> >same for either set of ratings.

> OK

> >> >> actually estimating 2835 ...

> >> >That is 300 too high at least. A micro program is most definitely not a
> >> >Super-GM.
> >
> >> 300 ? Are we talking about the same System ?
> >> I meant DF7,on the Bahrain-hardware.
> >
> >The hardware DF is using is only 2x faster than the hardware I normally use
> >with Crafty. It is not a super-computer as ChessBase would have you believe.
> >In fact, it probably is not 2x faster than mine, as the 8-way boxes have a
> >significant memory bandwidth bottleneck that isn't as serious on 4-way boxes.

> with 2535 you are much much lower than even Kramnik , Keene.
> And lower than the other posters here.
> 2535 vs. 2806 would mean : > 99% that Kramnik wins .
> You should really find a bookmaker and place a bet on Kramnik !

I don't think it is that high. 200 rating points says the higher-rated
player should win 3 of every 4. But that assumes both want to win. A good
match strategy against a computer is to play cautiously and go for draws,
while watching for a positional mistake that the computer will eventually make.
Then, in that game, go for the safe win.

You can't use match strategy to predict ratings. That is why you can't really
produce a valid Elo if just two players play over and over; if you then use
those ratings to estimate how they will do against other players, you will
see how far off they are. When I was 20, there was a 2000 player in the local club that
I could beat like a drum for some reason. Yet when I played other 1700-1800
players I was playing even with them. If we established my rating by only
looking at the 2000 player games, my rating would have been over 2200. But
that would have been high by a significant amount in a bigger pool of players.


> >That is wrong. The DF machine will probably have 512mb _total_.

> uhh, can't they afford more ?

That is a PC. A PC can address 4 gigabytes max, due to the 32 bit
address space. Some have more than 4 gigs of memory due to a trick
Intel introduced in the pentium xeon MMU, but a single program/process
can only see 4 gigs total. There are lots of architectural issues about
using such large memory on a machine with a very small L1 and L2 cache,
and they know that going larger is not going to help the program anyway.
Cost isn't the issue with today's memory prices.

> Estimate goes 30 points down then to about Kramnik's ELO.
> Under average conditions, disregarding the special rules of the match.

If you really think Kramnik and DF are similar in ratings, I have some
ocean-front property for sale in Kansas you might be interested in. :)


> >It is a
> >shared-memory architecture like my quad xeons here at UAB. My office machine
> >has 4 700mhz xeon processors, each with 1024K of L2 cache, sharing one large
> >512mb memory system.

--

Hyattian Flu

Aug 17, 2001, 11:56:05 AM

"Robert Hyatt" <hy...@crafty.cis.uab.edu> wrote in message
news:9lj98c$11j$4...@juniper.cis.uab.edu...

> A good
> match strategy against a computer is to play cautiously and go for draws,
> but watching for a positional mistake that a computer will eventually
make.
> Then, in that game, go for the safe win.

I agree, many programs are too aggressive: they would rather win than draw, and
will make a strange move to avoid a draw. Then you can beat them. This is a
well-known weakness.

Thx,

Hyattian Flu

Sterten

Aug 17, 2001, 12:17:55 PM
>What do you do if one time you get value X, the next time you get value Y,
>the next time you get value Z, all for the same position?

Usually the same position isn't calculated twice within the same level.
If it is, take either value.
At different levels, take the value from the deeper level.

>It will screw up
>move ordering and greatly reduce the efficiency of alpha/beta.

>> not if implemented correctly


>Sorry, but until you understand how alpha/beta works,

I should look that up..

>I don't see how
>you can make a direct statement like that.

I meant: I can see no reason in principle why it can't be implemented
so as to avoid the problem (different values for the same position).

Sterten

Aug 17, 2001, 12:18:27 PM
>Sterten <ste...@aol.com> wrote:
>> >The SSDF numbers are simply _wrong_. there is no linear relationship
>> >between rating and memory size, except for very small hash table sizes.
>
>> but they did quite a lot games to test this. At least with
>> computer vs computer.This sort things, they should be able
>> to measure quite well
>
>They compared tiny to small. But it is provable by induction that when the
>table becomes large enough to hold the entire tree you search, making it
>even larger has absolutely no effect on making the search go still faster.

A table of ELO numbers?
Or the search tree, assuming a complete brute-force search by a computer?
Of course, if you can completely solve chess in time T with memory M,
then increasing T or M can't do more,
provided the opponent plays perfectly.
Else you can still build traps.
But we're far from this, and probably chess will never be completely solved.

>Dozens. Played over two or three weeks. I just randomly changed it, and
>let it play for a day or two, then I would change it again. But only when
>it was playing Roman.

Roman opening? Or is Roman a (human) chess player?

>The result didn't surprise me because when I went from a 4 X 400mhz machine to
>a 4 x 550 or 4 x 700, I didn't see any great change in how he did. I saw
>almost no change in fact, although against other computers Crafty did better
>as the hardware got faster.
>
>> I could accept if it drops to -say- 40 from 70. More would violate my
>> "feeling" for how chess works. You can watch the computer thinking
>> and sometimes it definately finds better moves with more time given.
>> Better moves means more points in the long run , no doubt.
>
>More time == better moves, yes. But mainly in tactics. If a program doesn't
>understand a pawn majority, or a pawn lever, or king safety, going 2x faster
>isn't going to let it suddenly understand those ideas.

They should tell it. If a human can learn it, a computer can too,
in principle. Could be difficult to program, though.

>> >> upper2400-lower2500 is better than no estimate at all and surprisingly
>> >> low. That's 150-200 points lower than the SSDF list ! (2653 for DF6)
>> >
>> >I know. But remember, the SSDF rating pool has _no_ humans in it. So
>
>> I think, I read somewhere that some (few) human-computer results
>> confirmed the ratings. And they already adjusted the list 100 points.
>> So, they must have had some reason ( = human-computer games) to do this.
>
>What happens if you adjust the ratings, then play more games, but _only_
>computer vs computer. It is pretty well-known that if you make a small
>change to a program, and play it against the unchanged version, the rating
>difference will be highly exaggerated, when comparing how the same program
>does against strong humans. So if you calibrate the list, then allow the
>programs to continue playing without any humans in the pool, the ratings
>still drift far from reality...

Probably it's only because humans can play anti-computer strategies.
If they played as they always do, do you agree that the lists should
overlap perfectly?
We should have experts estimate the computer-human ratings and then take the
average. Has that been done?

>>>>> [2835] is 300 too high at least.


>> 2535 vs. 2806 would mean : > 99% that Kramnik wins .
>> You should really find a bookmaker and place a bet on Kramnik !
>
>I don't think it is that high.

taken from the recently posted list

>200 rating points says the higher-rated player should win 3 of every 4.

... which implies that he wins a match of 8 games with probability >97%.

>But that assumes both want to win. A good
>match strategy against a computer is to play cautiously and go for draws,
>but watching for a positional mistake that a computer will eventually make.
>Then, in that game, go for the safe win.

You're arguing for >99% here?

>You can't use match strategy to predict ratings. That is why you can't really
>produce a valid Elo if just two players play over and over. Then use those
>ratings to estimate how they will do against other players. And see how far
>off they are. When I was 20, there was a 2000 player in the local club that
>I could beat like a drum for some reason. Yet when I played other 1700-1800
>players I was playing even with them. If we established my rating by only
>looking at the 2000 player games, my rating would have been over 2200. But
>that would have been high by a significant amount in a bigger pool of players.

computers and grandmasters usually play (balanced) against many opponents

>> >That is wrong. The DF machine will probably have 512mb _total_.
>
>> uhh, can't they afford more ?
>
>That is a PC. A PC can address 4 gigabytes max, due to the 32 bit
>address space.

...for every processor.

>Some have more than 4 gigs of memory due to a trick
>Intel introduced in the pentium xeon MMU, but a single program/process
>can only see 4 gigs total. There are lots of architectural issues about
>using such large memory on a machine with a very small L1 and L2 cache,
>and they know that going larger is not going to help the program anyway.
>Cost isn't the issue with today's memory prices.
>
>> Estimate goes 30 points down then to about Kramnik's ELO.
>> Under average conditions, disregarding the special rules of the match.
>
>If you really think Kramnik and DF are similar in ratings,

I'm still learning how to make a good estimate.

>I have some
>ocean-front property for sale in Kansas you might be interested in. :)

Not sure what you want to point out; English is difficult for me
sometimes, i.e. ":)"-English.

Dr Stupid

Aug 17, 2001, 1:19:01 PM
On 17 Aug 2001 14:09:43 GMT, Robert Hyatt <hy...@crafty.cis.uab.edu>
wrote:

>> We usually have 1/100 pawn-units precision , I think .
>> So "=" will occur rarely. And I can't understand why it should
>> kill search performance, unless this is some special,strange
>> implementation or there is a bug.
>
>= occurs in 99.99999999% of the searches using alpha/beta with the
>"PVS" enhancement, which makes it even more efficient.

Right, I understand now. :) I was not previously aware how often equal
evaluations arose in practice.

What I now wonder is how programs implement this "don't always play
the best move" feature without killing performance. (Or maybe, they do
kill performance?)

One method which it seems is fairly harmless (except that it doesn't
allow hash tables to persist between moves) is to randomly jitter the
weightings used in the evaluation function. That is to say, on a given
move the program as a whole might be a little more or less concerned
about king safety in its evaluation function. A given position always
gets the same eval while computing the move, so alpha-beta works
properly. It would be like running nimzo and tweaking its engine
parameters every few moves. The adjustment of weightings might make
two previously equally rated moves unequal, and change the final
'best' move.
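A sketch of that per-game jitter, with made-up names and percentages (the
point is only that the weights are perturbed once, then held fixed, so every
position still gets a stable score and alpha-beta is undisturbed):

    #include <stdlib.h>

    int king_safety_weight = 100;   /* engine's nominal weights */
    int mobility_weight    = 50;

    static int jittered(int w, int percent) {
        int span = w * percent / 100;
        return w - span + rand() % (2 * span + 1);   /* w +/- percent% */
    }

    void randomize_weights(unsigned seed) {
        srand(seed);                /* e.g. a fresh seed per game */
        king_safety_weight = jittered(king_safety_weight, 5);
        mobility_weight    = jittered(mobility_weight, 5);
    }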


Daniel Lidström

Aug 18, 2001, 1:12:43 AM
"Akorps" wrote...

> Can't the computer introduce some
> randomness into its move selection?
> If 2 moves are evaluated equally choose
> randomly between them, and if it is
> desired *not* to have any 2 moves rated
> equal, have a small random chance of playing
> the second best move, if its evaluation is
> close to the best move? That way it
> would be much harder for the human to
> play prepared lines.

Playing the second-best move by random chance could, if done carelessly, drop
the evaluation of each move (by the amount of randomness introduced). I think
it is the opening book that needs to be extended to some humongous number of
positions. This works very well in Othello, so I see no reason why it
shouldn't work well in chess. As someone pointed out, memory is no problem
at current prices, so maybe a larger hash table won't improve DF, but a ~10
GB opening book should probably improve it a lot (or bigger, I have no idea
what size theirs is).
/Daniel


Robert Hyatt

Aug 17, 2001, 6:03:58 PM
Sterten <ste...@aol.com> wrote:
> >Sterten <ste...@aol.com> wrote:
> >> >The SSDF numbers are simply _wrong_. there is no linear relationship
> >> >between rating and memory size, except for very small hash table sizes.
> >
> >> but they did quite a lot games to test this. At least with
> >> computer vs computer.This sort things, they should be able
> >> to measure quite well
> >
> >They compared tiny to small. But it is provable by induction that when the
> >table becomes large enough to hold the entire tree you search, making it
> >even larger has absolutely no effect on making the search go still faster.

> table of ELO-numbers ?
> Or searchtree , assuming complete brute-force search by a computer ?

Assuming you search 1M nodes per second, and you have 3 minutes to choose
a move, then you will search 180M nodes. If your hash table will hold
180M nodes, making it even bigger will have no influence on the search
speed. Once it is big enough, making it bigger doesn't help or hurt a
thing.
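The sizing arithmetic as a sketch (16 bytes per entry is a typical figure,
not DF's actual entry size):

    #include <stdio.h>

    int main(void) {
        unsigned long long bytes = 512ULL << 20;   /* 512 MB of RAM   */
        unsigned long long entry = 16;             /* bytes per entry */
        unsigned long long n = 1;                  /* power-of-two    */
        while (n * 2 * entry <= bytes)
            n *= 2;
        printf("%llu entries\n", n);               /* 33554432 = 32M  */
        return 0;
    }

So 512 MB holds about 32M such entries; once the entry count reaches the
number of nodes you actually search, growing the table further buys nothing.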

> Of course, if you can completely solve chess in time T with memory M ,
> then increasing T or M can't do more.
> Provided the opponent plays perfectly.
> Else you can still build traps.
> But we're far from this and probably chess will never be completely solved.

> >Dozens. Played over two or three weeks. I just randomly changed it, and
> >let it play for a day or two, then I would change it again. But only when
> >it was playing Roman.

> Roman opening ? Or Roman is a (human) chessplayer ?

A well-known GrandMaster. :)

> >> >> upper2400-lower2500

> ..for every processor

--

Jose Lopez Jr.

Aug 18, 2001, 10:24:30 AM
What would be really interesting is if Kramnik gets to play the Berlin
Defense against Deep Fritz. I would be so curious to see how Deep Fritz
handles it.

"Robert Hyatt" <hy...@crafty.cis.uab.edu> wrote in message

news:9lk4ce$ec3$1...@juniper.cis.uab.edu...

Mike Garcia

Aug 18, 2001, 1:22:00 PM
In article <p1kqntgqln2k605ab...@4ax.com>, Dr Stupid <drst...@bulbous.freeserve.co.uk> wrote:
<snip>

>One method which it seems is fairly harmless (except that it doesn't
>allow hash tables to persist between moves) is to randomly jitter the
>weightings used in the evaluation function.

The way I do it is by saying that a move's value is equal to the resultant
board's value * 100 + jitter, where 0 < jitter < 100. Then, to compensate, a
board that is not a leaf has a value of the best available move / 100. All
values are integers.
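In code that scheme looks roughly like this (a sketch, not the actual
implementation; note a real program would need care with negative scores,
since C division truncates toward zero):

    #include <stdlib.h>

    /* move value = child board value * 100 + jitter, jitter in 1..99 */
    int move_value(int child_board_value) {
        return child_board_value * 100 + 1 + rand() % 99;
    }

    /* interior boards strip the jitter again by dividing it back out */
    int board_value(int best_move_value) {
        return best_move_value / 100;
    }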

Mike G

Manuel J. Petit de G.

Aug 19, 2001, 6:59:58 PM

"Robert Hyatt" <hy...@crafty.cis.uab.edu> wrote in message
news:9lj98c$11j$4...@juniper.cis.uab.edu...

> That is a PC. A PC can address 4 gigabytes max, due to the 32 bit
> address space. Some have more than 4 gigs of memory due to a trick
> Intel introduced in the pentium xeon MMU, but a single program/process
> can only see 4 gigs total.

This is not the first time you claim that only 4 gigs can be addressed...
and that's not true at all. Or better said, it's *only* true if you assume
the Unix flat memory model.


manuel,


Robert Hyatt

Aug 19, 2001, 8:36:11 PM

It is true for any program that runs on a PC. You tell me how you are going
to address more than 4 gigs with a 32 bit address space. The answer is,
you are not. If you want to write something that shuffles something in and
out of the 32 bit address space, that is doable, but it is _not_ addressing
more than 4 gigs of memory. It means somebody is going to be fiddling with the
mmu tables between instructions, which is not exactly the same as addressing
more than 4 gigs.

What other memory model are you going to use on the PC?


> manuel,

Dr A. N. Walker

Aug 21, 2001, 9:08:30 AM
In article <9lh2oe$7i9$2...@juniper.cis.uab.edu>,
Robert Hyatt <hy...@crafty.cis.uab.edu> wrote:
>Dr Stupid <drst...@bulbous.freeserve.co.uk> wrote:
>>Robert Hyatt <hy...@crafty.cis.uab.edu> wrote:
[...]

>>>Nope. The way alpha/beta works is to search a move, establish a score,
>>>then search the rest of the moves and toss out any that have a score <=
>>>that of the first move.

Hmm! I don't think I'd give that answer many marks out of
ten in an exam script. But let that pass.

>>> To remove that = would absolutely kill search
>>>performance.
>> Is not the way some engines do this to make their evaluation function
>> 'dust' the evaluation with a little random jitter? [...]


>What about hash tables? And the fact that now the same position can
>have multiple different scores? It could lead to significant performance
>penalties as well as inconsistent search results.

The answer is to make the jitter not entirely random but be a
function of the current hash. Thus, the jitter is always the same when
the same position is evaluated, and is also entirely consistent with
what is happening to the transposition table. The performance hit is
negligible [much the same as adding any new simple term to the
evaluation function].

The above suffices to break equalities, if that matters.
If you want the next step, for the program to play differently in
different games, then this will happen automatically if the hash
function is different between games [eg, based on new random
numbers], otherwise you need to make the jitter depend also on a
global random, set when the game starts [and then a persistent
or pre-loaded hash table would be slightly inconsistent between
games -- for small jitters, I can't see that mattering].
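A sketch of that idea (hypothetical code; the mixing constant is just a
standard 64-bit hash finalizer, and the jitter range is arbitrary):

    typedef unsigned long long u64;

    /* Deterministic jitter: same position (same key), same jitter,
       so the transposition table stays consistent within a game. */
    int jitter(u64 hash_key, u64 game_seed) {
        u64 x = hash_key ^ game_seed;  /* new seed per game = new "style" */
        x ^= x >> 33;
        x *= 0xff51afd7ed558ccdULL;
        x ^= x >> 33;
        return (int)(x % 7) - 3;       /* small jitter in [-3, +3] */
    }

    /* in the evaluation: score = real_eval(pos) + jitter(key, seed); */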

--
Andy Walker, School of MathSci., Univ. of Nott'm, UK.
a...@maths.nott.ac.uk
