[Computer-go] Zen lost to Mi Yu Ting


Paweł Morawiecki

Mar 21, 2017, 6:31:09 AM
to compu...@computer-go.org
Hi,

After an interesting game, DeepZen lost to Mi Yu Ting.
Here you can replay the complete game: http://duiyi.sina.com.cn/gibo_new/live/viewer.asp?sno=13

According to pro experts, Zen fought really well, but there still seems to be some issue with how Zen (mis)evaluates its chances. At one point it reported an 84% chance of winning (in the endgame), when it was already fairly clear that Zen was slightly behind (2-3 points).

Regards,
Paweł 

Hideki Kato

Mar 21, 2017, 8:03:21 AM
to compu...@computer-go.org
The value network has been trained with Chinese rules and a 7.5-point
komi. Using it for Japanese rules and a 6.5 komi, there will be some
error in close games. We knew about this issue but thought the chance
of it mattering would be so small that we postponed correcting it (not
so easy).

Best,
Hideki

--
Hideki Kato

Aja Huang via Computer-go

Mar 21, 2017, 9:47:41 AM
to compu...@computer-go.org
On Tue, Mar 21, 2017 at 10:48 AM, Hideki Kato <hideki...@ybb.ne.jp> wrote:
The value network has been trained with Chinese rules and a 7.5-point
komi. Using it for Japanese rules and a 6.5 komi, there will be some
error in close games. We knew about this issue but thought the chance
of it mattering would be so small that we postponed correcting it (not
so easy).

Oh, so that's why! Good luck with Zen's next two games. 

Aja
 

Paweł Morawiecki

Mar 21, 2017, 10:59:34 AM
to Aja Huang, compu...@computer-go.org
Hideki,

Using it for Japanese rules and a 6.5 komi, there will be some error
in close games. We knew about this issue but thought the chance of it
mattering would be so small that we postponed correcting it (not so
easy).

But how would you fix it? Isn't it the case that you'd need to retrain your value network from scratch?

Regards,
Paweł
 


"Ingo Althöfer"

Mar 21, 2017, 1:18:39 PM
to compu...@computer-go.org
Hi,

now we see how clever the DeepMind team was (and likely still is).
In both matches (against Fan Hui and Lee Sedol) Chinese rules
were applied.

************************************************

Some years ago I performed experiments with Monte Carlo search
in special non-zero-sum games (with two players). The rules were made
in such a way that outcomes were possible in which both sides were
winning according to their respective rules.
(An example from the Go framework: Black might think that komi is
5.5 points, whereas White might think that komi is 7.5 points.)
RATHER OFTEN the outcome was a score where both sides thought
they had won. In the 5.5/7.5 komi example from Go this means that
outcomes with +6 or +7 points for Black on the board would occur
often.

Of course, this is not welcome in zero-sum games. But it is a hint
that in real-life scenarios (with non-zero-sum payoffs) Monte Carlo
heuristics (with their tendency to produce narrow wins) might be
helpful in finding good compromises.
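
As a toy check of the arithmetic in this example (my own sketch, not
the original experiment), one can enumerate which integer board margins
count as a win for both sides under the two believed komi values:

BLACK_KOMI = 5.5   # komi Black believes in
WHITE_KOMI = 7.5   # komi White believes in

# Which integer board margins for Black (Black's points on the board
# minus White's) let BOTH sides think they have won?
for margin in range(0, 15):
    black_won = margin - BLACK_KOMI > 0
    white_won = margin - WHITE_KOMI < 0
    if black_won and white_won:
        print(margin)   # prints 6 and 7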

Ingo.

David Ongaro

Mar 21, 2017, 4:49:45 PM
to compu...@computer-go.org
On Mar 21, 2017, at 7:00 AM, Paweł Morawiecki <pawel.mo...@gmail.com> wrote:

Hideki,

Using it for Japanese rules and a 6.5 komi, there will be some error
in close games. We knew about this issue but thought the chance of it
mattering would be so small that we postponed correcting it (not so
easy).

But how would you fix it? Isn't it the case that you'd need to retrain your value network from scratch?

I would think so as well. But some months ago I already made a proposal on this list to mitigate that problem: instead of training a different value network for each komi, add a “Komi adjustment” value as input during the training phase. That should be much more effective, since the “win/lost” evaluation shouldn’t change for many (most?) positions under small adjustments, but the resulting value network (when trained with different komi adjustments) has a much greater range of applicability.
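
For concreteness, here is a minimal sketch of what such a komi input
could look like (my own illustration with assumed plane counts and
layer sizes, written in PyTorch; it is not Zen's or any engine's actual
architecture):

import torch
import torch.nn as nn

class KomiValueNet(nn.Module):
    # Value network with one extra input plane holding the komi adjustment.
    def __init__(self, board_planes=18, channels=64, board_size=19):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(board_planes + 1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * board_size * board_size, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Tanh(),   # win/loss value in [-1, 1]
        )

    def forward(self, board, komi):
        # board: (N, board_planes, 19, 19); komi: (N,) in points
        plane = (komi / 7.5).view(-1, 1, 1, 1).expand(
            -1, 1, board.size(2), board.size(3))
        return self.head(self.conv(torch.cat([board, plane], dim=1)))

The same trained network could then be queried at, say, komi 5.5 and
7.5 without retraining, provided positions scored under both settings
were seen during training.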

Regards

David O.

Gian-Carlo Pascutto

Mar 21, 2017, 7:04:14 PM
to compu...@computer-go.org
On 21/03/2017 21:08, David Ongaro wrote:
>> But how would you fix it? Isn't that you'd need to retrain your value
>> network from the scratch?
>
> I would think so as well. But I some months ago I already made a
> proposal in this list to mitigate that problem: instead of training a
> different value network for each Komi, add a “Komi adjustment” value as
> input during the training phase. That should be much more effective,
> since the “win/lost” evaluation shouldn’t change for many (most?)
> positions for small adjustments but the resulting value network (when
> trained for different Komi adjustments) has a much greater range of
> applicability.

The problem is not the training of the network itself (~2-4 weeks of
letting a program someone else wrote run in the background, the easiest
thing ever in computer go), nor whether you use a komi input or a
separate network; the problem is getting data for the different komi values.

Note that if getting data is not a problem, then a separate network
would perform better than your proposal.

--
GCP

caze...@ai.univ-paris8.fr

Mar 21, 2017, 8:50:26 PM
to compu...@computer-go.org

Why can't you reuse the same self-played games but score them with a
different komi value? The policy network does not use the komi to choose
its moves, so it should make no difference.
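
To illustrate the relabeling (my sketch; it assumes each stored game
keeps Black's raw board margin under area scoring, which is an
assumption about the data format, not a claim about any engine):

def value_label(black_board_margin, komi):
    # +1 if Black wins under the given komi, else -1
    return 1.0 if black_board_margin - komi > 0 else -1.0

# The same stored games yield training targets for several komi values:
margins = [7, 6, -3, 12]   # hypothetical stored board margins
for komi in (5.5, 6.5, 7.5):
    print(komi, [value_label(m, komi) for m in margins])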



Chun Sun

Mar 21, 2017, 9:08:51 PM
to compu...@computer-go.org
How does Zen know it's playing a Japanese-rules game? Can it be set to play a Chinese-rules game and hope the result converges at the end?

On Mar 21, 2017 8:03 AM, "Hideki Kato" <hideki...@ybb.ne.jp> wrote:
The value network has been trained with Chinese rules and a 7.5-point
komi. Using it for Japanese rules and a 6.5 komi, there will be some
error in close games. We knew about this issue but thought the chance
of it mattering would be so small that we postponed correcting it (not
so easy).

Best,
Hideki

uurtamo .

Mar 21, 2017, 10:04:09 PM
to compu...@computer-go.org
I guess that one point in such a game matters to the evaluation function. Pretty fascinating. Can you not train for the two different rulesets and just pick one at the beginning? Ignoring Chinese versus Japanese, just training on komi? Or are the Japanese rules themselves the whole issue (i.e., not komi)?

Álvaro Begué

Mar 21, 2017, 11:01:31 PM
to computer-go
I was thinking the same thing. You can easily equip the value network with several outputs, corresponding to several settings of komi, then train as usual.
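
For concreteness, a minimal sketch of such a head (my illustration; the
feature size and komi set are made up, and it is written in PyTorch):

import torch.nn as nn

KOMIS = (5.5, 6.5, 7.5)   # one output per komi setting

class MultiKomiValueHead(nn.Module):
    def __init__(self, features=256, n_komis=len(KOMIS)):
        super().__init__()
        self.fc = nn.Linear(features, n_komis)   # one win logit per komi

    def forward(self, trunk_features):
        # trunk_features: (N, features) from the shared network body;
        # apply a sigmoid per column in the loss to get win probabilities
        return self.fc(trunk_features)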

The issue with Japanese rules is easily solved by refusing to play under ridiculous rules. Yes, I do have strong opinions. :)

Álvaro.


Detlef Schmicker

Mar 22, 2017, 3:17:39 AM
to compu...@computer-go.org
The oakfoam value network does exactly this: we have 6 komi layers
(-7.5, -5.5, -0.5, 0.5, 5.5, 7.5; + and - due to the color played),
trained on 4d+ KGS games with this:
if c_played == 1:
    if "0.5" in komi:
        komiplane = 1
    if "6.5" in komi or "2.75" in komi or "5.5" in komi:
        # komi 6.5 and 5.5 not very different in Chinese scoring
        komiplane = 2
    if "7.5" in komi or "3.75" in komi:
        komiplane = 3
if c_played == 2:
    if "0.5" in komi:
        komiplane = 4
    if "6.5" in komi or "2.75" in komi or "5.5" in komi:
        # komi 6.5 and 5.5 not very different in Chinese scoring
        komiplane = 5
    if "7.5" in komi or "3.75" in komi:
        komiplane = 6


But I was unable to get an SGF file from the Japanese-language site :)

I did not really test whether these layers help, but they are there and
trained, so you might check for yourself :)

Detlef

Am 21.03.2017 um 21:08 schrieb David Ongaro:
> I would think so as well. But some months ago I already made a
> proposal on this list to mitigate that problem: instead of training a
> different value network for each komi, add a “Komi adjustment” value
> as input during the training phase.

Gian-Carlo Pascutto

Mar 22, 2017, 5:03:57 AM
to compu...@computer-go.org
On 22-03-17 00:36, caze...@ai.univ-paris8.fr wrote:
>
> Why can't you reuse the same self-played games but score them

If you have self-play games that are played out to the final position,
so that scoring is fool-proof, then it could work. But I think things
get really interesting when the timing of a pass matters (which is the
kind of situation we're trying to resolve) and you're using pure policy
players.

Does your DCNN-only player know *precisely* when to pass *first* under
Japanese rules?

> The policy network does not use the komi to choose its moves so it
> should make no difference.

Do you not play different moves when you are behind by 0.5 points
compared to when you're ahead by 0.5 points?

(Or, if you're ignoring komi completely, when behind by multiple stones
vs. ahead by multiple stones?)

Darren Cook

Mar 22, 2017, 6:01:05 AM
to compu...@computer-go.org
> The issue with Japanese rules is easily solved by refusing to play under
> ridiculous rules. Yes, I do have strong opinions. :)

And the problem with driver-less cars is easily "solved" by banning all
road users that are not also driver-less cars (including all
pedestrians, bikes and wild animals).

Or how about this angle: humans are still better than the programs at
Japanese rules. Therefore this is an interesting area of study.

Darren


--
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
http://shop.oreilly.com/product/0636920053170.do

Paweł Morawiecki

Mar 22, 2017, 6:19:19 AM
to compu...@computer-go.org

RATHER OFTEN the outcome was a score where both sides thought
they had won. In the 5.5/7.5 komi example from Go this means that
outcomes with +6 or +7 points for Black on the board would occur
often.


It looks like this issue was again a serious factor in today's game
against Park 9p. Zen was winning, but in the endgame it started giving
away points and the game was reversed. Hideki, was that the case?

Too bad it's a 6.5 komi, as it seems Zen had the potential to win both
games :-(

Regards,
Paweł

Gian-Carlo Pascutto

Mar 22, 2017, 7:54:38 AM
to compu...@computer-go.org
On 22-03-17 09:41, Darren Cook wrote:
>> The issue with Japanese rules is easily solved by refusing to play
>> under ridiculous rules. Yes, I do have strong opinions. :)
>
> And the problem with driver-less cars is easily "solved" by banning
> all road users that are not also driver-less cars (including all
> pedestrians, bikes and wild animals).

I think you misunderstand the sentiment completely. It is not: Japanese
rules are difficult for computers, so we don't like them.

It is: Japanese rules are problematic on many levels, so we prefer to
work with Chinese ones and as a consequence that's what the programs are
trained for and tested on. It is telling that Zen is having these
troubles despite being made by Japanese programmers. I believe the
saying for this is "voting with your feet".

> Or how about this angle: humans are still better than the programs
> at Japanese rules. Therefore this is an interesting area of study.

Maybe some people are interested in studying Japanese rules, like
finding out what they actually are
(http://home.snafu.de/jasiek/j1989c.html). That's fine, but not all that
interesting for AI or, actually, computer go.

Of course, commercial programs that need to cater to a Japanese (or
Korean) audience are stuck. As are people who want to play in the UEC
Cup etc.

--
GCP

Hideki Kato

Mar 22, 2017, 8:53:17 AM
to compu...@computer-go.org
We set komi to 5.5 today. This looks to have worked fine.

The strange yose moves were caused by an as-yet-unknown reason. We are
seeking the cause(s). Observed fact: the three black stones in the
upper-left center cannot be captured, but Zen appears to have evaluated
them as dead. When Zen noticed the truth, the horizon effect forced
several miserable moves in White's territory on the upper side. Then
the upper-left white stones, together with many short-liberty stones,
made the value network misrecognize them as living in seki, because the
shape looked like seki (to the VN) and many moves were required to
capture them in the rollouts.
Hideki

--
Hideki Kato

Álvaro Begué

Mar 22, 2017, 9:56:30 AM
to computer-go
Thank you, Gian-Carlo. I couldn't have said it better.

Álvaro.


Darren Cook

Mar 22, 2017, 12:32:24 PM
to compu...@computer-go.org
>>> The issue with Japanese rules is easily solved by refusing to play
>>> under ridiculous rules. Yes, I do have strong opinions. :)
>>
>> And the problem with driver-less cars is easily "solved" by banning
>> all road users that are not also driver-less cars (including all
>> pedestrians, bikes and wild animals).
>
> I think you misunderstand the sentiment completely. It is not: Japanese
> rules are difficult for computers, so we don't like them.
>
> It is: Japanese rules are problematic on many levels, ...

Yes, that was the sentiment I understood. Chinese rules (Tromp-Taylor,
etc.) are nice and clean, so easy to implement. They were useful props
for making the progress up until now. The real world is messy and
illogical, as are the corner cases in Japanese rules. Assuming you are
in this for the AI learnings, not just to make a strong Chinese-rules go
program, why not embrace the messiness!

(Japanese rules are not *that* hard. IIRC, Many Faces, and all other
programs, including my own, scored with them before MCTS took hold and
being able to shave milliseconds off scoring became the main determinant
of a program's strength.)

Darren


--
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
http://shop.oreilly.com/product/0636920053170.do

Gian-Carlo Pascutto

Mar 22, 2017, 2:18:31 PM
to compu...@computer-go.org
On 22-03-17 16:27, Darren Cook wrote:
> (Japanese rules are not *that* hard. IIRC, Many Faces, and all other
> programs, including my own, scored in them

There is a huge difference between doing some variation of territory
scoring and implementing the Japanese rules. Understanding this
difference will get you some way toward understanding why some people
do not like them, and that has nothing to do with computer go.

--
GCP

John Tromp

Mar 22, 2017, 4:06:32 PM
to computer-go
>> (Japanese rules are not *that* hard. IIRC, Many Faces, and all other
>> programs, including my own, scored in them
>
> There is a huge difference between doing some variation of territory
> scoring and implementing Japanese rules. Understanding this difference
> will get you some way to understanding why some people do not like them,
> and that has got nothing to do with computer go.

I do not like them because, as far as I can tell, they cannot answer
questions like: what is the fair komi for 2x2 Go (i.e., what is the
outcome with perfect play)?

regards,
-John

Hideki Kato

Mar 22, 2017, 7:19:50 PM
to compu...@computer-go.org
The strange moves (starting with the 234th move) could have been caused
by a deep search together with the misrecognition of the seki (described
in my previous post).

In one-shot testing, Zen always chose H14 instead of R18 (the actual
234th move), which looks normal. (The time setting was 2 minutes per
move.) An important difference from the actual game is the search tree,
which is very big in a real game with long time settings. One possible
interpretation is that Zen read deeply, found the (wrong) seki, which
would lead to a sure win for White, and so played R18 toward this
(again, wrong!) winning position.
Hideki


Paweł Morawiecki

Mar 23, 2017, 6:20:49 AM
to compu...@computer-go.org
Hideki,
 
An important difference from the actual game is the search tree, which
is very big in a real game with long time settings. One possible
interpretation is that Zen read deeply, found the (wrong) seki, which
would lead to a sure win for White, and so played R18 toward this
(again, wrong!) winning position.

Looks like the DeepZenGo team was just missing a couple of months
(weeks?) of training a stronger value network in order to win the
tournament. Michael Redmond 9p said that DeepZen already plays at the
top professional level, particularly in the opening and middle game.
Congratulations on today's well-deserved win against Iyama!

When would it be possible to buy a new DeepZen?

Regards,
Paweł


 

Hideki Kato

Mar 23, 2017, 10:51:08 AM
to compu...@computer-go.org
Paweł Morawiecki wrote:
>Looks like the DeepZenGo team was just missing a couple of months
>(weeks?) of training a stronger value network in order to win the
>tournament. Michael Redmond 9p said that DeepZen already plays at the
>top professional level, particularly in the opening and middle game.
>Congratulations on today's well-deserved win against Iyama!

Thanks.

>When would it be possible to buy a new DeepZen?

That fully depends on the publisher of Tencho-no-Igo, Mynavi.

This version will be about one stone weaker on a gaming PC (an
eight-core Intel with a GTX 1080, for example) and two or three stones
weaker on a laptop.

Best,
Hideki


"Ingo Althöfer"

Mar 23, 2017, 6:01:48 PM
to compu...@computer-go.org
Dear Hideki,

thanks for all your open comments here on the mailing list over the
last few days.

I know that these days (with the losses) are a really hard time for
the Zen team. But "in the end" you will emerge from the lessons
stronger than ever before.


> >When would it be possible to buy a new DeepZen?
>
> That fully depends on the publisher of Tencho-no-Igo, Mynavi.
> This version will be about one stone weaker on a gaming PC (an
> eight-core Intel with a GTX 1080, for example) and two or three
> stones weaker on a laptop.

Thanks for the info. You know that dozens of Go friends in
Jena/Germany/Europe are eagerly waiting for a new Zen analysis tool.
