[Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

Aja Huang

Jan 27, 2016, 1:46:29 PM
to compu...@computer-go.org
Hi all,

We are very excited to announce that our Go program, AlphaGo, has beaten a professional player for the first time. AlphaGo beat the European champion Fan Hui by 5 games to 0. We hope you enjoy our paper, published in Nature today. The paper and all the games can be found here:

http://www.deepmind.com/alpha-go.html

AlphaGo will be competing in a match against Lee Sedol in Seoul, this March, to see whether we finally have a Go program that is stronger than any human! 

Aja

PS I am very busy preparing AlphaGo for the match, so apologies in advance if I cannot respond to all questions about AlphaGo.

Julian Schrittwieser

Jan 27, 2016, 2:10:12 PM
to compu...@computer-go.org
Congratulations Aja, well done :)

"Ingo Althöfer"

Jan 27, 2016, 2:44:49 PM
to compu...@computer-go.org
Hello Aja,
 
congratulations on the success of you and the other team member!
 
To the others: Should we call the game "Goo" in the future,
to honour Goo-gle's progress?

Cheers, Ingo.


Sent: Wednesday, 27 January 2016 at 19:46
From: "Aja Huang" <ajah...@google.com>
To: compu...@computer-go.org
Subject: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

PS I am very busy preparing AlphaGo for the match, so apologies in advance if I cannot respond to all questions about AlphaGo.

Aja Huang

Jan 27, 2016, 3:08:11 PM
to compu...@computer-go.org
2016-01-27 18:46 GMT+00:00 Aja Huang <ajah...@google.com>:
Hi all,

We are very excited to announce that our Go program, AlphaGo, has beaten a professional player for the first time. AlphaGo beat the European champion Fan Hui by 5 games to 0. We hope you enjoy our paper, published in Nature today. The paper and all the games can be found here:

http://www.deepmind.com/alpha-go.html

The paper is freely available to download at the bottom of the page.
https://storage.googleapis.com/deepmind-data/assets/papers/deepmind-mastering-go.pdf

Aja
 
AlphaGo will be competing in a match against Lee Sedol in Seoul, this March, to see whether we finally have a Go program that is stronger than any human! 

Aja

PS I am very busy preparing AlphaGo for the match, so apologies in advance if I cannot respond to all questions about AlphaGo.

"Ingo Althöfer"

Jan 27, 2016, 3:10:30 PM
to compu...@computer-go.org
Sorry for a typo. I meant

> Hello Aja,
>
> congratulations on the success of you and the other team memberS!

So, not singular, but plural.

Ingo.

Michael Markefka

Jan 27, 2016, 3:10:49 PM
to compu...@computer-go.org
I really do hope that this also turns into a good analysis and
teaching tool for human players. That would be a fantastic benefit from
this advancement in computer Go.

Yamato

Jan 27, 2016, 3:19:00 PM
to compu...@computer-go.org
Congratulations Aja.

Do you have a plan to run AlphaGo on KGS?

It must be a 9d!

Yamato

Olivier Teytaud

Jan 27, 2016, 3:27:14 PM
to computer-go
Congratulations people at DeepMind :-)

I like the fact that AlphaGo uses many forms of learning (as humans do!):
- imitation learning (on expert games, learning an actor policy);
- learning by playing (self play, policy gradient), incidentally generating games;
- use of those games for teaching a second deep network (supervised learning);
- real-time learning with Monte Carlo simulations (including RAVE?).
==> just beautiful :-)



--
=========================================================
Olivier Teytaud, olivier...@inria.fr, TAO, LRI, UMR 8623(CNRS - Univ. Paris-Sud),
bat 490 Univ. Paris-Sud F-91405 Orsay Cedex France http://www.slideshare.net/teytaud

Hideki Kato

Jan 27, 2016, 4:55:09 PM
to compu...@computer-go.org
Congratulations Aja and David!

What an interesting idea to train the value network, and what surprising
power of the cloud!

So, when will you get +400 Elo? :)

Hideki

--

Hideki Kato <mailto:hideki...@ybb.ne.jp>

Erik van der Werf

Jan 27, 2016, 5:26:59 PM
to computer-go
Wow, excellent results, congratulations Aja & team!

I'm surprised to see nothing explicitly on decomposing into subgames (e.g. for semeai). I always thought some kind of adaptive decomposition would be needed to reach pro-strength... I guess you must have looked into this; does this mean that the networks have learnt to do it by themselves? Or perhaps they play in a way that simply avoids their weaknesses? 

Would be interesting to see a demonstration that the networks have learned the semeai rules through reinforcement learning / self-play :-) 

Best,
Erik


Jason Li

Jan 27, 2016, 6:13:53 PM
to compu...@computer-go.org
Congratulations to Aja!

A question to the community: is anyone going to replicate the experimental results?

https://www.quora.com/Is-anyone-replicating-the-experimental-results-of-the-human-level-Go-player-published-by-Google-Deepmind-in-Nature-in-January-2016?

Jason

René van de Veerdonk

Jan 27, 2016, 8:07:23 PM
to computer-go
Really nice result! Congratulations to the team.

Now off to study the paper instead of the blogs ...

René

Robert Jasiek

Jan 28, 2016, 1:41:58 AM
to compu...@computer-go.org
Congratulations to the researchers!

On 27.01.2016 21:10, Michael Markefka wrote:
> I really do hope that this also turns into a good analysis and
> teaching tool for human player. That would be a fantastic benefit from
> this advancement in computer Go.

The programs that succeed as computer players mostly rely on computational
power for learning and decision-making. This can be used for teaching
tools that do not need to provide text explanations or other reasoning
to the human pupils: a computer game opponent, a life-and-death playing
opponent, empirical winning percentages of patterns, etc.

Currently such programs do not provide sophisticated explanations and
reasoning about tactical decision-making, strategy and positional
judgement fitting human players' / pupils' conceptual thinking.

If always correct teaching is not the aim (but if a computer teacher may
err as much as a human teacher errs), in principle it should be possible
to combine the successful means of using computation power with the
reasonably accurate human descriptions of sophisticated explanations and
reasoning. This requires implementation of expert system knowledge
adapted from the best (the least ambiguous, the most often correct /
applicable) descriptions of human-understandable go theory and further
research in the latter.

--
robert jasiek

David Fotland

Jan 28, 2016, 2:15:59 AM
to compu...@computer-go.org

Google’s breakthrough is just as impactful as the invention of MCTS.  Congratulations to the team.  It’s a huge leap for computer go, but more importantly it shows that DNNs can be applied to many other difficult problems.

 

I just added an answer.  I don’t think anyone will try to exactly replicate it, but a year from now there should be several strong programs using very similar techniques, with similar strength.

 

An interesting question is, who has integrated or is integrating a DNN into their go program?  I’m working on it.  I know there are several others.

 

David

valk...@phmp.se

Jan 28, 2016, 3:39:52 AM
to compu...@computer-go.org
Congratulations!

What I find most impressive is the engineering effort, combining so many
different parts, each of which would be a strong program even standalone.

I think the design philosophy of using 3 different sources of "go
playing" strength is great in itself (and if you read the paper, a lot
of old-school computer go programming expertise is used as well).
I think we often get stuck trying to perfect one module when perhaps what
we need is a new module that improves search effectively on a different
scale. I have no time or resources to do neural network learning, but
for my new program I would like to experiment with using patterns on many
levels, and this is inspiring.

Magnus Persson

On 2016-01-27 19:46, Aja Huang wrote:
> Hi all,
>
> We are very excited to announce that our Go program, AlphaGo, has
> beaten a professional player for the first time. AlphaGo beat the
> European champion Fan Hui by 5 games to 0. We hope you enjoy our
> paper, published in Nature today. The paper and all the games can be
> found here:
>

> http://www.deepmind.com/alpha-go.html [1]


>
> AlphaGo will be competing in a match against Lee Sedol in Seoul, this
> March, to see whether we finally have a Go program that is stronger
> than any human!
>
> Aja
>
> PS I am very busy preparing AlphaGo for the match, so apologies in
> advance if I cannot respond to all questions about AlphaGo.
>

> Links:
> ------
> [1] http://www.deepmind.com/alpha-go.html

Michael Markefka

Jan 28, 2016, 6:27:20 AM
to compu...@computer-go.org
I think many amateurs would already benefit from a simple blunder
check and a short list of viable alternatives and short continuations
for every move.

If I could leave my PC running over night for a 30s/move analysis at
9d level and then walk through my game with that quality of analysis,
I'd be more than satisfied.

Petri Pitkanen

Jan 28, 2016, 7:16:07 AM
to computer-go
I think such analysis might not be too useful. At least chess players think it is not very useful. Usually, for learning you need to "wake up" your brain, so computer analysis without reasons is probably only marginally useful. But very entertaining.

Aja Huang

Jan 28, 2016, 7:54:52 AM
to compu...@computer-go.org
Thanks all. I'm glad you enjoy our work.

On Wed, Jan 27, 2016 at 10:26 PM, Erik van der Werf <erikvan...@gmail.com> wrote:
I'm surprised to see nothing explicitly on decomposing into subgames (e.g. for semeai). I always thought some kind of adaptive decomposition would be needed to reach pro-strength... I guess you must have looked into this; does this mean that the networks have learnt to do it by themselves? Or perhaps they play in a way that simply avoids their weaknesses? 

The value function does surprisingly well in positions with several local games, without any special-case code.

Aja

Aja Huang

Jan 28, 2016, 7:56:04 AM
to compu...@computer-go.org
2016-01-27 20:18 GMT+00:00 Yamato <yama...@yahoo.co.jp>:
Congratulations Aja.

Do you have a plan to run AlphaGo on KGS?

It must be a 9d!

Thanks Yamato. 

We are currently very busy preparing for the match against Lee Sedol in March and not planning to play AlphaGo on KGS in the near future.

Aja

Stefan Kaitschick

Jan 28, 2016, 9:14:56 AM
to compu...@computer-go.org
I always thought the same. But I don't think they tackled the decomposition problem directly.
Achieving good (non-terminal) board evaluations must have reduced the problem.
If you don't do full playouts, you get much less thrashing between independent problems.
It also implies a useful static L&D evaluation.
That "value network" is just amazing to me.
It does what computer go failed at for over 20 years, and what MCTS was designed to sidestep.

Michael Markefka

Jan 28, 2016, 10:11:04 AM
to compu...@computer-go.org
On Thu, Jan 28, 2016 at 3:14 PM, Stefan Kaitschick
<stefan.k...@hamburg.de> wrote:

> That "value network" is just amazing to me.
> It does what computer go failed at for over 20 years, and what MCTS was
> designed to sidestep.

Thought it worth a mention: Detlef posted about trying to train a CNN
on win rate as well in February. So it seems he was onto something
there.

Lucas, Simon M

Jan 28, 2016, 10:41:05 AM
to compu...@computer-go.org

Indeed – Congratulations to Google DeepMind!

It’s truly an immense achievement. I’m struggling
to think of other examples of reasonably mature
and strongly contested AI challenges where a new
system has made such a huge improvement over
existing systems – and I’m still struggling …

Simon Lucas

From: Computer-go [mailto:computer-...@computer-go.org] On Behalf Of Olivier Teytaud
Sent: 27 January 2016 20:27
To: computer-go <compu...@computer-go.org>
Subject: Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

 

Congratulations people at DeepMind :-)

Petr Baudis

Jan 28, 2016, 11:07:54 AM
to compu...@computer-go.org
Hi!

Since I didn't say that yet, congratulations to DeepMind!

(I guess I'm a bit disappointed that no really new ML models had to be
invented for this though, I was wondering e.g. about capsule networks or
training simple iterative evaluation subroutines (for semeai etc.) by
NTM-based approaches. Just like everyone else, color me very awed by
such an astonishing result with just what was presented.)

On Wed, Jan 27, 2016 at 11:15:59PM -0800, David Fotland wrote:
> Google’s breakthrough is just as impactful as the invention of MCTS. Congratulations to the team. It’s a huge leap for computer go, but more importantly it shows that DNN can be applied to many other difficult problems.
>
> I just added an answer. I don’t think anyone will try to exactly replicate it, but a year from now there should be several strong programs using very similar techniques, with similar strength.
>
> An interesting question is, who has integrated or is integrating a DNN into their go program? I’m working on it. I know there are several others.
>
> David
>
> From: Computer-go [mailto:computer-...@computer-go.org] On Behalf Of Jason Li
> Sent: Wednesday, January 27, 2016 3:14 PM
> To: compu...@computer-go.org
> Subject: Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search
>
> Congratulations to Aja!
>
> A question to the community. Is anyone going to replicate the experimental results?
>
> https://www.quora.com/Is-anyone-replicating-the-experimental-results-of-the-human-level-Go-player-published-by-Google-Deepmind-in-Nature-in-January-2016?

A perfect question, I think - what can we do to replicate this,
without Google's computational power?

I probably couldn't have resisted giving it a try myself (especially
given that a lot of what I do nowadays are deep NNs, though on NLP),
but thankfully I have two deadlines coming... ;-)

I'd propose these as the major technical points to consider when
bringing a Go program (or a new one) to an Alpha-Go analog:

* Asynchronous integration of DNN evaluation with fast MCTS. I'm
curious about this, as I thought this would be a much bigger problem
than it apparently is, based on old results with batch parallelization.
I guess virtual loss makes a lot of difference? Is 1 lost playout enough?
I wonder if Detlef has already solved this sufficiently well in oakfoam?

What's the typical lag of getting the GPU evaluation (in, I guess,
#playouts) in oakfoam and is the throughput sufficient to score all
expanded leaf nodes (what's the #visits?)? Sorry if this has been
answered before.
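
(To make the virtual-loss idea concrete, here is a minimal Python sketch
of the trick as I understand it; the node fields, the UCB-style formula
and the one-playout penalty are my assumptions, not anything from the
paper:)

import math

class Node:
    def __init__(self, prior=1.0):
        self.visits = 0       # completed playouts through this node
        self.value_sum = 0.0  # sum of their results
        self.virtual = 0      # evaluations currently in flight
        self.prior = prior
        self.children = {}    # move -> Node

    def score(self, parent_visits, c=1.4):
        # A virtual loss counts as a visit that returned 0, steering
        # concurrent searchers away from a leaf already being evaluated.
        n = self.visits + self.virtual
        q = self.value_sum / n if n else 0.0
        return q + c * self.prior * math.sqrt(parent_visits + 1) / (1 + n)

def select(root):
    # Descend to a leaf, adding one virtual loss along the path.
    path, node = [root], root
    node.virtual += 1
    while node.children:
        pv = node.visits
        node = max(node.children.values(), key=lambda ch: ch.score(pv))
        node.virtual += 1
        path.append(node)
    return path

def backup(path, value):
    # Swap each virtual loss for the real, possibly delayed, result.
    for node in path:
        node.virtual -= 1
        node.visits += 1
        node.value_sum += value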

* Are RL Policy Networks essential? AIUI from a quick reading, they are
actually used only for RL of the value networks, and based on Fig. 4
the value network didn't use the policy network for training but still
got quite a bit stronger than Zen/CrazyStone? Aside from the extra work,
this'd save us 50 GPU-days.

(My intuition is that RL policy networks are the part that allows
embedding knowledge about common tsumego/semeai situations in the
value networks, because they probably have enough capacity to learn
them. Does that make sense?)
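
(For concreteness, the RL step is basically REINFORCE; a toy update for
a linear softmax policy, with plain feature vectors standing in for the
real network - my own simplification, of course:)

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reinforce_update(theta, steps, lr=0.01):
    # One REINFORCE step for a linear softmax policy.
    # theta: (n_features, n_moves) weight matrix.
    # steps: iterable of (x, a, z) with x the feature vector of a
    # position, a the index of the move played there, and z = +1 if
    # the player who moved went on to win that self-play game, else -1.
    # Moves from won games get their log-probability pushed up,
    # moves from lost games pushed down.
    grad, n = np.zeros_like(theta), 0
    for x, a, z in steps:
        p = softmax(x @ theta)
        one_hot = np.zeros_like(p)
        one_hot[a] = 1.0
        grad += z * np.outer(x, one_hot - p)  # grad of z * log p(a|x)
        n += 1
    return theta + lr * grad / max(n, 1)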

* Seems like the push for SL Policy Network prediction accuracy from
50% to 60% is really important for real-world strength (Fig. 2).
I think right now the top open source solution has prediction
accuracy around 50%? IDK if there's any other factor (features, dataset
size, training procedure) involved beyond "Updates were
applied asynchronously on 50 GPUs using DistBelief; gradients older
than 100 steps were discarded. Training took around 3 weeks for 340
million training steps."
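
(The staleness rule, at least, is easy to picture; a toy parameter-server
loop where everything except the 100-step cutoff is my own invention:)

def apply_async_gradients(updates, params, lr=0.001, staleness_limit=100):
    # Toy DistBelief-style asynchronous SGD.
    # updates: iterable of (computed_at_step, grads) pairs produced by
    # asynchronous workers; grads maps parameter name -> numpy array.
    # params: dict of numpy arrays, updated in place. Gradients computed
    # more than staleness_limit steps ago are dropped, mirroring the
    # "gradients older than 100 steps were discarded" rule quoted above.
    step = 0
    for computed_at, grads in updates:
        if step - computed_at > staleness_limit:
            continue  # too stale: discard
        for name, g in grads.items():
            params[name] -= lr * g
        step += 1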

* Value Networks require (i) 30 million self-play games (!); (ii) 50
GPU-weeks to train the weights. This seems rather troublesome; even
1/10 of that is a bit problematic for individual programmers. It'd
be interesting to see how much of that is diminishing returns and
whether a much smaller network on smaller data (+ some compromises like
sampling the same game a few times, or adding the 8 million tygem
corpus to the mix) could do something interesting too.
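
(To illustrate the compromise I mean, a sketch of building the value-net
training set; samples_per_game=1 matches the paper's one-position-per-game
sampling, while larger values are the data-stretching shortcut, with the
obvious risk of overfitting to individual games:)

import random

def value_net_examples(games, samples_per_game=1, rng=random):
    # Build (position, outcome) training pairs for a value network.
    # games: list of (positions, outcome) self-play games, with outcome
    # the final result from the evaluated side's perspective (+1/-1).
    examples = []
    for positions, outcome in games:
        k = min(samples_per_game, len(positions))
        # one (or a few) positions per game keeps examples decorrelated
        for t in rng.sample(range(len(positions)), k):
            examples.append((positions[t], outcome))
    return examples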

In summary, it seems to me that a big part of why this approach was so
successful is the huge computational resources applied to it, which
is of course an obstacle (except for the big IT companies).

I think the next main avenue of research is exploring solutions that
are much less resource-hungry. The main hunger here is at
training time, not play time. Well, the strength of this NN running on
a normal single-GPU machine is another big question mark, of course.

Petr Baudis

Jim O'Flaherty

unread,
Jan 28, 2016, 11:29:33 AM1/28/16
to compu...@computer-go.org
I think the first goal was and is to find a pathway that clearly works to reach into the upper echelons of human strength, even if the first version used a huge amount of resources. Once found, then the approach can be explored for efficiencies from both directions, top down (take this away and see what we lose, if anything) and bottom up (efficiently reoriginate a reflection of a larger pattern in a much more constrained environment). From what I can see in the chess community, this is essentially what happened following Deep Blue's win against Kasparov. And now there are solutions on single desktops that can best what Deep Blue did with far more computational resources.

Petr Baudis

Jan 28, 2016, 11:38:26 AM
to compu...@computer-go.org
On Thu, Jan 28, 2016 at 10:29:29AM -0600, Jim O'Flaherty wrote:
> I think the first goal was and is to find a pathway that clearly works to
> reach into the upper echelons of human strength, even if the first version
> used a huge amount of resources. Once found, then the approach can be
> explored for efficiencies from both directions, top down (take this away
> and see what we lose, if anything) and bottom up (efficiently reoriginate a
> reflection of a larger pattern in a much more constrained environment).
> From what I can see in the chess community, this is essentially what
> happened following Deep Blue's win against Kasparov. And now there are
> solutions on single desktops that can best what Deep Blue did with far more
> computational resources.

Certainly!

Also, reflecting on what I just wrote,

> On Thu, Jan 28, 2016 at 10:07 AM, Petr Baudis <pa...@ucw.cz> wrote:
> >
> > (I guess I'm a bit disappointed that no really new ML models had to be
> > invented for this though, I was wondering e.g. about capsule networks or
> > training simple iterative evaluation subroutines (for semeai etc.) by
> > NTM-based approaches. Just like everyone else, color me very awed by
> > such an astonishing result with just what was presented.)
> >

> > In summary, seems to me that the big part of why this approach was so
> > successful are the huge computational resources applied to this, which
> > is of course an obstacle (except the big IT companies).

this is not meant at all as a criticism of AlphaGo, purely just
a discussion point! Even if you have a lot of hardware, it's *hard* to
make it add value, as anyone who tried to run MCTS on a cluster could
testify - it's not just a matter of throwing it at the problem, and the
challenges aren't just engineering-related either.

So maybe I'd actually say that this was even understated in the paper
- that AlphaGo uses an approach which scales so well with available
computational power (at training time) compared to previous approaches.

--
Petr Baudis
If you have good ideas, good data and fast computers,
you can do almost anything. -- Geoffrey Hinton

Michael Alford

Jan 28, 2016, 4:02:33 PM
to compu...@computer-go.org
On 1/27/16 12:08 PM, Aja Huang wrote:

> 2016-01-27 18:46 GMT+00:00 Aja Huang <ajah...@google.com>:
>
> Hi all,
>
> We are very excited to announce that our Go program, AlphaGo, has
> beaten a professional player for the first time. AlphaGo beat the
> European champion Fan Hui by 5 games to 0. We hope you enjoy our
> paper, published in Nature today. The paper and all the games can be
> found here:
>
> http://www.deepmind.com/alpha-go.html
>
>
> The paper is freely available to download at the bottom of the page.
> https://storage.googleapis.com/deepmind-data/assets/papers/deepmind-mastering-go.pdf
>
> Aja
>
> AlphaGo will be competing in a match against Lee Sedol in Seoul,
> this March, to see whether we finally have a Go program that is
> stronger than any human!
>
> Aja
>
> PS I am very busy preparing AlphaGo for the match, so apologies in
> advance if I cannot respond to all questions about AlphaGo.


Congratulations on your achievement. While scanning the web articles
yesterday, I came across this one:

http://www.bloomberg.com/news/articles/2016-01-27/google-computers-defeat-human-players-at-2-500-year-old-board-game

It states that the winner of the March match gets $1mil. This is the
only reference to any prize I have found. Is it correct?

Thank you,
Michael



--

http://en.wikipedia.org/wiki/Pale_Blue_Dot

Darren Cook

Jan 28, 2016, 4:18:45 PM
to compu...@computer-go.org
> I'd propose these as the major technical points to consider when
> bringing a Go program (or a new one) to an Alpha-Go analog:
> ...
> * Are RL Policy Networks essential? ...

Figure 4b was really interesting (see also Extended Tables 7 and 9): any
2 of their 3 components, on a single machine, are stronger than Crazy
Stone and Zen. And the value of the missing component:

Policy Network: +813 elo
Rollouts: +713 elo
Value Network: +474 elo

Darren

"Ingo Althöfer"

Jan 28, 2016, 8:14:56 PM
to compu...@computer-go.org
Hi Simon,

do you remember my silly remarks in an email discussion almost a year ago?

You had written:
>> So, yes, with all the exciting work in DCNN, it is very tempting
>> to also do DCNN. But I am not sure if we should do so.

And my silly reply had been:
> I think that DCNN is somehow in a dreamdancing apartment.
> My opinion: We might mention it in our proposal, but not as a central topic.
 

In my mathematical life I have been wrong with my intuition only a few times.
This DCNN topic was the worst case so far...

Greetings from the bottom,
Ingo.

 

Gesendet: Donnerstag, 28. Januar 2016 um 16:41 Uhr
Von: "Lucas, Simon M" <s...@essex.ac.uk>
An: "compu...@computer-go.org" <compu...@computer-go.org>
Betreff: Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

Indeed – Congratulations to Google DeepMind!
 
It’s truly an immense achievement.  I’m struggling
to think of other examples of reasonably mature
and strongly contested AI challenges where a new
system has made such a huge improvement over
existing systems – and I’m still struggling …
 
Simon Lucas

Detlef Schmicker

Jan 29, 2016, 1:06:40 AM
to compu...@computer-go.org

Hi Ingo,

I think you are not alone: when I started computer go 4 years ago I
asked a good friend of mine, who did his PhD in neural networks back in
the 90s, whether I had any chance to use them instead of pattern matching,
and he said they would probably not generalize in a good way :)

I think the big size of the nets makes a qualitative difference,
so our intuition is misleading...


Congrats to the AlphaGo team,

Detlef


Erik van der Werf

Jan 29, 2016, 5:40:55 AM
to computer-go
This fluctuating sentiment on artificial neural networks is a bit weird; popularity comes and goes in waves, and many academics appear to be just following the hype. Most of the stuff I learned about ANNs in the 90s and early 2000s just works, and now we can see that if one throws huge computational power at it, it even works extremely well! Sure, a few new tricks have been added, but for the most part the deep learning hype just feels like one big 90s connectionist revival party.

Over the years (when they were out of favor) I've had plenty of good results with ANNs in my Go programs (Steenvreter & Magog); I just never went for the depths that are now feasible with modern hardware (but then again, at least my app runs well on a phone :-)).

I am amazed by how far the AlphaGo team was able to push it, but the general approach doesn't come as a surprise. If anything is a surprise to me, it's the things they could leave out, and still have it work so well...

Erik

Brian Cloutier

Jan 29, 2016, 2:08:37 PM
to compu...@computer-go.org
> Even if you have a lot of hardware, it's *hard* to
> make it add value, as anyone who tried to run MCTS on a cluster could
> testify - it's not just a matter of throwing it at the problem, and the
> challenges aren't just engineering-related either.

For those of us who don't know, could you talk a little about those challenges?

Peter Drake

Jan 31, 2016, 2:28:57 PM
to compu...@computer-go.org
Let me add my congratulations to the chorus. Well done!

I'm due for a sabbatical next year. I had been joking, "It sure would be good timing if someone cracked Go right before that started. Then I'd have plenty of time to pick a new research topic." It looks like AlphaGo has provided.

On Wed, Jan 27, 2016 at 10:46 AM, Aja Huang <ajah...@google.com> wrote:
Hi all,

We are very excited to announce that our Go program, AlphaGo, has beaten a professional player for the first time. AlphaGo beat the European champion Fan Hui by 5 games to 0. We hope you enjoy our paper, published in Nature today. The paper and all the games can be found here:

http://www.deepmind.com/alpha-go.html

AlphaGo will be competing in a match against Lee Sedol in Seoul, this March, to see whether we finally have a Go program that is stronger than any human! 

Aja

PS I am very busy preparing AlphaGo for the match, so apologies in advance if I cannot respond to all questions about AlphaGo.

"Ingo Althöfer"

Jan 31, 2016, 4:15:09 PM
to compu...@computer-go.org
Hi Peter,

 
 
> I'm due for a sabbatical next year. I had been joking, "It sure would be good
> timing if someone cracked Go right before that started. Then I'd have plenty
> of time to pick a new research topic." It looks like AlphaGo has provided.
 
you are not the only one in such a situation or a similar one...

Probably you know that the ICGA has its next Computer Olympiad at the end of
June 2016 in Leiden (NL), together with a 3-day conference "Computers and Games".
It is an exciting question how the whole event will go: in which mood
the programmers will be, to which other games CNNs will be applied, which
new directions of research will be started, which bots will enter the
computer go competition ...

I think these questions may become more urgent when certain
things happen in March.

Cheers, Ingo.

Hideki Kato

Jan 31, 2016, 4:28:19 PM
to compu...@computer-go.org
Ingo and all,

Why do you care about AlphaGo and DCNNs so much? Surely the DeepMind
team made a big leap, but the big problems, such as detecting double ko
and solving complex positions, are left unchanged. Also, it's well
known that to attack these weak points of MCTS bots, the
opponents have to be strong enough. On 9x9, this was shown in
fall 2012. Now this applies to 19x19 as well.

Hideki



--
Hideki Kato <mailto:hideki...@ybb.ne.jp>

Robert Jasiek

Jan 31, 2016, 7:36:07 PM
to compu...@computer-go.org
On 31.01.2016 20:28, Peter Drake wrote:
> pick a new research topic.

- have the program explain to human players why MC / DNN play is good,
in terms of human understanding of the game
- incorporate the difficult parts, such as long-term aji
- solve the game: prove the correct score, prove a weak solution, prove
a strong solution [These mathematics keep us busy for at least 400 years
unless bot research occurs earlier.]
- create computers that act as mathematicians incl. creativity,
invention of propositions and their proving [so that the bot researchers
can solve the game faster]
- teach the computer expert knowledge so that a) MC / DNN bots become
even stronger and b) programs can teach with explanation and reasoning
understood by human pupils
- apply computer go research to other fields while ensuring that
humans cannot be the victims of bugs and of ambiguous responsibility
towards law and ethics [medicine or cars: who goes to jail if AI kills
people, how to prevent AI from ruling the world]
- Play "Conway / Jasiek": modify the rules, invent new games, apply
computers.

Enough research for centuries if not millennia, I'd say.

"Game over / intelligence solved" - never heard greater nonsense before.

--
robert jasiek

Jay Scott

Jan 31, 2016, 10:13:38 PM
to compu...@computer-go.org
Robert Jasiek jas...@snafu.de:

>On 31.01.2016 20:28, Peter Drake wrote:
>> pick a new research topic.
>
>[a bunch of topics]

I have another topic suggestion. Deep learning needs tons of data. Humans reach top performance after seeing far, far fewer examples than AlphaGo sees. Whatever method humans learn by apparently has different scaling characteristics than deep learning.

So: Match human learning performance, in terms of data volume needed to reach a given level of skill.

Jay

Petri Pitkanen

Feb 1, 2016, 1:30:23 AM
to computer-go
Explaining why a move is good in human terms is a useless goal. Good chess programs cannot do it, nor is it meaningful. As humans and computers have vastly different approaches to selecting a move, they by definition have different reasons for their moves. Take your second item, 'long-term aji', as an example: for a human it is an important shortcut, but for a computer it is a mere result of seeing far enough into the future, or of combining several features of the position in a linear/non-linear computation.

Petri

Robert Jasiek

Feb 1, 2016, 1:49:59 AM
to compu...@computer-go.org
On 01.02.2016 07:30, Petri Pitkanen wrote:
> Explaining why a move is good in human terms is a useless goal. Good chess
> programs cannot do it, nor is it meaningful. As humans and computers
> have vastly different approaches to selecting a move, they by definition
> have different reasons for their moves. Take your second item, 'long-term aji',
> as an example: for a human it is an important shortcut, but for a computer it is
> a mere result of seeing far enough into the future, or of combining several
> features of the position in a linear/non-linear computation.

Such explanation is not "useless"; it merely requires additional research and implementation.

Darren Cook

Feb 1, 2016, 4:20:10 AM
to compu...@computer-go.org
> someone cracked Go right before that started. Then I'd have plenty of
> time to pick a new research topic." It looks like AlphaGo has
> provided.

It seems [1] the smart money might be on Lee Sedol:

1. Ke Jie (world champ) – limited strength…but still amazing… Less than
5% chance against Lee Sedol now. But as it can go stronger, who knows
its future…
2. Mi Yuting (world champ) – appears to be a ‘chong-duan-shao-nian (kids
on the path to pros)’, ~high-level amateur.
3. Li Jie (former national team player) – appears to be pro-level. One
of the games is almost perfect (for AlphaGo).


On the other hand, AlphaGo got its jump in level very quickly (see [2]), so it
is hard to know if they just got lucky (i.e. with ideas working the
first time) or if there is still some significant tweaking possible in
these 5 months of extra development (October 2015 to March 2016).

Have the informal game SGFs been uploaded anywhere? I noticed (Extended
Data Table 1) they were played *after* the official game each day, so
the poor pro should have been tired, but instead he won 2 of the 5 (day
1 and day 5). Was this just due to the short time limits, or did Fan Hui
play a different style (e.g. more aggressively)?


Darren

[1]: Comment by xli199 at
http://gooften.net/2016/01/28/the-future-is-here-a-professional-level-go-ai/

[2]: When did DeepMind start working on go? I suspect it might only
have been after the video games project started to wind down,
which would've been Feb 2015? If so, that is only 6-8 months (albeit with a
fairly large team).

Michael Markefka

Feb 1, 2016, 4:48:54 AM
to compu...@computer-go.org
On Mon, Feb 1, 2016 at 10:19 AM, Darren Cook <dar...@dcook.org> wrote:
> It seems [1] the smart money might be on Lee Sedol:

In the DeepMind press conferences (
https://www.youtube.com/watch?v=yR017hmUSC4 -
https://www.youtube.com/watch?v=_r3yF4lV0wk ) Demis Hassabis stated
that he was quietly confident.

I assume that means they've got a version up and running that at least
matches Lee Sedol's Elo rating, perhaps even slightly exceeding it.
They might be wary of the engine displaying some idiosyncrasy they
haven't picked up on yet, which Sedol might notice and then exploit.

Petr Baudis

Feb 1, 2016, 6:12:46 AM
to compu...@computer-go.org
Hi!

On Mon, Feb 01, 2016 at 09:19:56AM +0000, Darren Cook wrote:
> > someone cracked Go right before that started. Then I'd have plenty of
> > time to pick a new research topic." It looks like AlphaGo has
> > provided.
>
> It seems [1] the smart money might be on Lee Sedol:
>
> 1. Ke Jie (world champ) – limited strength…but still amazing… Less than
> 5% chance against Lee Sedol now. But as it can go stronger, who knows
> its future…
> 2. Mi Yuting (world champ) – appears to be a ‘chong-duan-shao-nian (kids
> on the path to pros)’, ~high-level amateur.
> 3, Li Jie (former national team player) – appears to be pro-level. one
> of the games is almost perfect (for AlphaGo)
>
>
> On the other hand, AlphaGo got its jump in level very quickly (*), so it
> is hard to know if they just got lucky (i.e. with ideas things working
> first time) or if there is still some significant tweaking possible in
> these 5 months of extra development (October 2015 to March 2016).

AlphaGo's achievement is impressive, but I'll bet on Lee Sedol
any time if he gets some people to explain the weaknesses of computers
and does some serious research.

AlphaGo didn't seem to solve the fundamental reading problems of
MCTS, just compensated with great intuition that can also remember
things like corner life&death shapes. But if Lee Sedol gets the game to
a confusing fight with a long semeai or multiple unusual life&death
shapes, I'd say based on what I know on AlphaGo that it'll collapse just
as current programs would. And, well, Lee Sedol is rather famous for
his fighting style. :)

Unless of course AlphaGo did achieve yet another fundamental
breakthrough since October, but I suspect it'll be a long process yet.
For the same reason, I think strong players that'd play against AlphaGo
would "learn to beat it" just as you see with weaker players+bots on
KGS.

I wonder how AlphaGo would react to an unexpected deviation from a
joseki that involves a corner semeai.

> [1]: Comment by xli199 at
> http://gooften.net/2016/01/28/the-future-is-here-a-professional-level-go-ai/
>
> [2]: When did DeepMind start working on go? I suspect it might only
> after have been after the video games project started to wound down,
> which would've Feb 2015? If so, that is only 6-8 months (albeit with a
> fairly large team).

Remember the two first authors of the paper:

* David Silver - his most cited paper is "Combining online and offline
knowledge in UCT", the 2007 paper that introduced RAVE

* Aja Huang - the author of Erica, among many other things

So this isn't blue-sky research at all, and I think they had Go in
their crosshairs for most of the company's existence. I don't know the
details of how DeepMind operates, but I'd imagine the company works
on multiple things at once. :-)

--
Petr Baudis
If you have good ideas, good data and fast computers,
you can do almost anything. -- Geoffrey Hinton

Olivier Teytaud

Feb 1, 2016, 6:25:04 AM
to computer-go
If AlphaGo had lost at least one game, I'd understand how people can have an upper bound on its level, but with 5-0 (except for blitz) it's hard to have an upper bound on its level. After all, AlphaGo might just have played well enough to crush Fan Hui, and a weak move while the position is still in favor of AlphaGo is not really a weak move (at least from a game-theoretic point of view...).


Petr Baudis

Feb 1, 2016, 6:38:20 AM
to compu...@computer-go.org
On Mon, Feb 01, 2016 at 12:24:21PM +0100, Olivier Teytaud wrote:
> If AlphaGo had lost at least one game, I'd understand how people can have
> an upper bound on its level, but with 5-0 (except for Blitz) it's hard to
> have an upper bound on his level. After all, AlphaGo might just have played
> well enough for crushing Fan Hui, and a weak move while the position is
> still in favor of AlphaGo is not really a weak move (at least in a
> game-theoretic point of view...).

That's right, but unless I've overlooked something, I didn't see Fan Hui
create any complicated fight, there wasn't any semeai or complex
life&death (besides the by-the-book oonadare). This, coupled with the
fact that there is no new mechanism to deal with these (unless the value
network has truly astonishing generalization capacity, but it just
remembering common tsumego and joseki shapes is imho a simpler
explanation), leads me to believe that it remains a weakness.

Of course there are other possibilities, like AlphaGo always steering
the game in a calmer direction due to some emergent property. But
sometimes, you just have to go for the fight, don't you?

Petr Baudis

Hideki Kato

Feb 1, 2016, 7:07:30 AM
to compu...@computer-go.org

Olivier Teytaud: <CAMpyiGN-3nWxivfMv2uJiAyzVj7zLxt8T=eci+_r0U...@mail.gmail.com>:
>If AlphaGo had lost at least one game, I'd understand how people can have
>an upper bound on its level, but with 5-0 (except for Blitz) it's hard to

No, the other five are not blitz games. Quoting from the
paper (p. 28):
Time controls for formal games were 1 hour main time plus 3
periods of 30 seconds byoyomi. Time controls for informal games
were 3 periods of 30 seconds byoyomi.

Hideki

--
Hideki Kato <mailto:hideki...@ybb.ne.jp>

Olivier Teytaud

Feb 1, 2016, 7:15:38 AM
to computer-go
OK, it's not blitz according to http://senseis.xmp.net/?BlitzGames
(limit at 10s/move for blitz). But really short time settings.

I've seen (as you all) many posts guessing that AlphaGo will lose, but I find
that hard to know. If Fan Hui had won one game, I would say that AlphaGo is not ready for Lee Sedol, but with 5-0...

(Incidentally, there is one great piece of news for machine learning people: people in industry are much more interested than before in letting us try our deep learning algorithms on their data, and that's good for the world :-) )


Hideki Kato

Feb 1, 2016, 7:45:03 AM
to compu...@computer-go.org
Olivier Teytaud: <CAMpyiGPsNRFtNtk_m1jTvDm4s-_XLcyKGV=SxVC8TiKG9f=F...@mail.gmail.com>:
>Ok, it's not blitz according to http://senseis.xmp.net/?BlitzGames
>(limit at 10s/move for Blitz). But really shorter time settings.
>
>I've seen (as you all) many posts guessing that AlphaGo will lose, but I
>find
>that hard to know. If Fan Hui had won one game, I would say that AlphaGo is
>not ready for Lee Sedol, but with 5-0...

Main time is not so important if we evaluate the AI's playing
strength, because the AI's thinking time is almost 30 seconds under
either time setting, due to not-so-smart time control algorithms.
#If the team has developed a smarter one, this is wrong.
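
For illustration, a naive controller of the kind I mean might look like
this Python sketch (the interface and constants are mine, not AlphaGo's):

def think_time(main_time_left, byoyomi=30.0, moves_expected=60):
    # Spread remaining main time over an assumed number of remaining
    # moves; once main time is gone, just use the byoyomi period.
    # With short settings this collapses to ~byoyomi per move.
    if main_time_left <= 0:
        return byoyomi
    return max(main_time_left / moves_expected, byoyomi)

A smarter controller would spend more time on hard positions, which a
fixed rule like this cannot do.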

Anyway, I wonder why Google doesn't publish the records of the
informal games. Certainly losing games are much more
informative.

Hideki

>(incidentally, there is one great piece of news for machine learning
>people: people in industry are much more interested than before for letting
>us try our deep learning algorithms on their data and that's good for the
>world :-) )

Aja Huang

Feb 1, 2016, 8:38:32 AM
to compu...@computer-go.org
Hi Petr,

On Mon, Feb 1, 2016 at 11:38 AM, Petr Baudis <pa...@ucw.cz> wrote:
That's right, but unless I've overlooked something, I didn't see Fan Hui
create any complicated fight, there wasn't any semeai or complex
life&death (besides the by-the-book oonadare).  This, coupled with the
fact that there is no new mechanism to deal with these (unless the value
network has truly astonishing generalization capacity, but it just
remembering common tsumego and joseki shapes is imho a simpler
explanation), leads me to believe that it remains a weakness.

If you check Myungwan Kim 9p's comments in the video, in the 4th game there was a semeai that AlphaGo read out at the top side. See the game at

http://britgo.org/deepmind2016/summary

Unfortunately before the Lee match I'm not allowed to answer some of the interesting questions raised in this thread, or mention how strong AlphaGo is now. But for now what I can say is that in the Nature paper (about 5 months ago) AlphaGo reached a nearly 100% win rate against the latest commercial versions of Crazy Stone and Zen, and AlphaGo still did well even at 4 handicap stones, suggesting AlphaGo may do much better in tactical situations than Crazy Stone and Zen.

I understand you bet on Lee but I hope you will enjoy watching the match. :)

Aja

Jim O'Flaherty

Feb 1, 2016, 9:15:16 AM
to compu...@computer-go.org
Robert,

I'm not seeing the ROI in attempting to map human idiosyncratic linguistic systems to/into a Go engine. Which language would be the one to use: English, Chinese, Japanese, etc.? As abstraction goes deeper, the nuance of each human language diverges from the others (due to the way the human brain is just a fractal-based analogy-making engine). The scarce resource is human mind power producing advances on the main goal: making a superior AI to what already exists. As the linguistic pathway hasn't emerged in chess in the last decade, I find it considerably less likely it will end up emerging for Go... unless you are, of course, suggesting that is something you are taking up. :)

The AI world is changing to make explaining computational cognition to humans less necessary, or even desirable. Why bound the solution space to only what cognitively and linguistically limited humans can imagine and/or consider? And given even one AI team is thinking this way, the nature of competition will drive other competing teams to similar motivation(s). Welcome to "memetic evolution in action". It makes those of us in the nearby human cognitive domains just a wee bit more nervous about what is rapidly approaching as human cognition becomes automatable. For example, books about joseki could be rendered far less valuable if/when AlphaGo, or some other AI competitor more strongly influenced by joseki, pushes into new spaces which involve much longer resolution horizons than the joseki humans use now.

No matter what, the future sure does sound very exciting now that Alpha Go has broken the Go AI ceiling. I cannot WAIT to see the results of the event against Lee Sedol.

Congratulations, Alpha Go team and Aja!


Jim

"Ingo Althöfer"

Feb 1, 2016, 9:59:37 AM
to compu...@computer-go.org
Hi Aja,

congratulations again on the fantastic achievement of your team!

A bunch of management questions:

* How many games will be played in March between AlphaGo and Lee Sedol?
* Will it be just "X games" or some "best of X" format?

* What will be the thinking times?
* Will there be rest days between the rounds?

* Would it be OK for DeepMind if Lee Sedol takes one or two coaches
from the computer-go scene?

Cheers, Ingo.

Sent: Monday, 01 February 2016 at 14:38
From: "Aja Huang" <ajah...@google.com>
To: compu...@computer-go.org


Subject: Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

Hi Petr,


 
On Mon, Feb 1, 2016 at 11:38 AM, Petr Baudis <pa...@ucw.cz> wrote:That's right, but unless I've overlooked something, I didn't see Fan Hui
create any complicated fight, there wasn't any semeai or complex
life&death (besides the by-the-book oonadare).  This, coupled with the
fact that there is no new mechanism to deal with these (unless the value
network has truly astonishing generalization capacity, but it just
remembering common tsumego and joseki shapes is imho a simpler
explanation), leads me to believe that it remains a weakness.
 
If you check Myungwan Kim 9p's comments in the video, in the 4th game there was a semeai that AlphaGo read out at top side. See the game at
 

http://britgo.org/deepmind2016/summary


 
Unfortunately before the Lee match I'm not allowed to answer some of the interesting questions raised in this thread, or mention how strong is AlphaGo now. But for now what I can say is that in the nature paper (about 5 months ago) AlphaGo reached nearly 100% win rate against the latest commercial versions of Crazy Stone and Zen, and AlphaGo still did well even on 4 handicap stones, suggesting AlphaGo may do much better in tactical situations than Crazy Stone and Zen. 
 
I understand you bet on Lee but I hope you will enjoy watching the match. :)
 
Aja
 Of course there are other possibilities, like AlphaGo always steering
the game in a calmer direction due to some emergent property.  But
sometimes, you just have to go for the fight, don't you?

                                Petr Baudis


"Ingo Althöfer"

Feb 1, 2016, 10:17:46 AM
to compu...@computer-go.org
Hi Hideki,

first of all, congrats on the nice performance of Zen over the weekend!

> Ingo and all,
> Why you care AlphaGo and DCNN so much?

I can speak only for myself. DCNNs may be applied not only to
achieve better playing strength; one may use them to create
playing styles, or bots for go variants.

One of my favorites is robot frisbee go.
http://www.althofer.de/robot-play/frisbee-robot-go.jpg
Perhaps one can teach robots with DCNN to throw the disks better.

And my expectation is: During 2016 we will see many more fantastic
applications of DCNN, not only in Go. (Olivier had made a similar
remark already.)

Ingo.

PS. Dietmar Wolz, my partner in space trajectory design, just told me
that in his company they have started with deep learning...

Robert Jasiek

Feb 1, 2016, 10:19:31 AM
to compu...@computer-go.org
On 01.02.2016 14:38, Aja Huang wrote:
> AlphaGo may do much better in tactical
> situations than Crazy Stone and Zen.

Judging very quickly from the Fan Hui games, AlphaGo's group-local
"reading" is very deep and accurate, but I'd need to read for myself
equally deeply and carefully before I would want to confirm Myungwan
Kim's related opinion.

--
robert jasiek

Robert Jasiek

Feb 1, 2016, 10:36:31 AM
to compu...@computer-go.org
On 01.02.2016 15:15, Jim O'Flaherty wrote:
> I'm not seeing the ROI in attempting to map human idiosyncratic linguistic
> systems to/into a Go engine. Which language would be the one to use;
> English, Chinese, Japanese, etc? As abstraction goes deeper, the nuance of
> each human language diverges from the others (due to the way the human
> brain is just a fractal based analogy making engine). [...]

> unless you are, of course, suggesting that is something
> you are taking up. :)

The human language for interaction with / translation to programming
language includes

- well-defined terms / concepts
- rules / principles with stated presuppositions
- methods / procedures / informal algorithms
- proofs / strong evidence for the aforementioned being correct /
successful (always or to some extent)

Of course, I am an example of a person who has been doing this for many
years. In fact, I might be the leading generalist for go theory expert
knowledge stated in writing.

> The AI world is changing to make explaining computational cognition to humans
> less necessary, or even desirable.

I disagree strongly.

Almost all the AI world has done is create strong programs. Explaining
human thinking and explaining program thinking in terms of human
thinking is as important as it has always been.

> Why bound the solution space to only
> what cognitively linguistically limited humans can imagine and/or consider?

Indeed. I prefer to exceed limitations by creating new terms,
definitions for undefined terms, principles, methods etc. Human beings
can better learn if they know what to learn because the contents is
described clearly.

> about what is rapidly
> approaching as human cognition automateable.

Eh? Besides GoTools, there has been very little, AFAIK.

Petr Baudis

Feb 1, 2016, 11:14:45 AM
to compu...@computer-go.org
Hi!

On Mon, Feb 01, 2016 at 01:38:28PM +0000, Aja Huang wrote:
> On Mon, Feb 1, 2016 at 11:38 AM, Petr Baudis <pa...@ucw.cz> wrote:
> >
> > That's right, but unless I've overlooked something, I didn't see Fan Hui
> > create any complicated fight, there wasn't any semeai or complex
> > life&death (besides the by-the-book oonadare). This, coupled with the
> > fact that there is no new mechanism to deal with these (unless the value
> > network has truly astonishing generalization capacity, but it just
> > remembering common tsumego and joseki shapes is imho a simpler
> > explanation), leads me to believe that it remains a weakness.
> >
>
> If you check Myungwan Kim 9p's comments in the video, in the 4th game there
> was a semeai that AlphaGo read out at top side. See the game at
>
> http://britgo.org/deepmind2016/summary

(It's at ~1:33:00+ https://www.youtube.com/watch?v=NHRHUHW6HQE)

Well, there was a potential semeai, but did AlphaGo read it out?
I don't know, you probably do. :-)

> Unfortunately before the Lee match I'm not allowed to answer some of the
> interesting questions raised in this thread, or mention how strong is
> AlphaGo now. But for now what I can say is that in the nature paper (about
> 5 months ago) AlphaGo reached nearly 100% win rate against the latest
> commercial versions of Crazy Stone and Zen, and AlphaGo still did well even
> on 4 handicap stones, suggesting AlphaGo may do much better in tactical
> situations than Crazy Stone and Zen.

But CrazyStone and Zen are also pretty bad at semeai and tsumego, it's
a bit of a self-play problem; when playing against MCTS programs, some
mistakes aren't revealed.

(I guess that you probably played tens of games against AlphaGo
yourself, so you'll have a pretty good idea about its capabilities.
I just can't imagine how the value network will count and pick liberties
or tsumego sequence combinations; it might just have more memory
capacity than we'd imagine.)

> I understand you bet on Lee but I hope you will enjoy watching the match. :)

I certainly will! And in my heart, maybe I root for AlphaGo too :)

--
Petr Baudis
If you have good ideas, good data and fast computers,
you can do almost anything. -- Geoffrey Hinton

John Tromp

Feb 1, 2016, 12:01:46 PM
to computer-go
For those of you who missed it, chess grandmaster Hikaru Nakamura,
rated 2787, recently played a match against the world's top chess program
Komodo, rated 3368. Each of the 4 games used a different kind of handicap:

Pawn and Move Odds
Pawn Odds
Exchange Odds
4-Move Odds

As you can see, handicaps in chess are no easy matter :-(
When AlphaGo surpasses the top human professionals we may see such
handicap challenges in the future. One may wonder if we'll ever see a
computer giving 4 handicap to a professional...

So how did Nakamura fare? See for yourself at

https://www.chess.com/news/komodo-beats-nakamura-in-final-battle-1331

regards,
-John

Thomas Wolf

Feb 1, 2016, 12:15:22 PM
to computer-go
The next type of event could be a new 'Pair Go',
where a human and a program make up a pair, like Mark Zuckerberg and his Facebook
program against a Google VP and AlphaGo. :-)

Thomas

Hideki Kato

Feb 1, 2016, 3:00:22 PM
to compu...@computer-go.org
Ingo Althofer: <trinity-a297d40e-3cf2-45f1-8d38-13a5912b636c-1454339862588@3capp-gmx-bs72>:
>Hi Hideki,
>
>first of all congrats to the nice performance of Zen over the weekend!
>
>> Ingo and all,
>> Why you care AlphaGo and DCNN so much?
>
>I can speak only for myself. DCNNs may be not only applied to
>achieve better playing strength. One may use them to create
>playing styles, or bots for go variants.
>
>One of my favorites is robot frisbee go.
>http://www.althofer.de/robot-play/frisbee-robot-go.jpg
>Perhaps one can teach robots with DCNN to throw the disks better.
>
>And my expectation is: During 2016 we will see many more fantastic
>applications of DCNN, not only in Go. (Olivier had made a similar
>remark already.)

Agree, but one criticism: if such great DCNN applications all
need huge machine power like AlphaGo (upon execution, not
training), then the technology is hard to apply in many areas,
autos and robots for example. Are DCNN chips the only way to
reduce computational cost? I can't foresee other possibilities.
Much more economical methods should be developed anyway.
#Our brain consumes less than 100 watts.

Hideki

>Ingo.
>
>PS. Dietmar Wolz, my partner in space trajectory design, just told me
>that in his company they started woth deep learning...

--
Hideki Kato <mailto:hideki...@ybb.ne.jp>

Rainer Rosenthal

Feb 1, 2016, 4:03:48 PM
to compu...@computer-go.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Robert: "Hey, AI, you should provide explanations!"
AI: "Why?"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cheers,
Rainer
> Date: Mon, 1 Feb 2016 08:15:12 -0600
> From: "Jim O'Flaherty" <jim.ofla...@gmail.com>
> To: compu...@computer-go.org
> Subject: Re: [Computer-go] Mastering the Game of Go with Deep Neural
> Networks and Tree Search
>
> Robert,


>
> I'm not seeing the ROI in attempting to map human idiosyncratic linguistic
> systems to/into a Go engine.


"Ingo Althöfer"

Feb 1, 2016, 4:08:20 PM
to compu...@computer-go.org
Hi Hideki,

you put it wonderfully into two lines:

******************************************************************
******************************************************************
*** ***
*** Much more economical methods should be developed anyway. ***
*** #Our brain consumes less than 100 watts. ***
*** ***
******************************************************************
******************************************************************


Hopefully the box remains formatted nicely ;-)

Ingo.

"Ingo Althöfer"

unread,
Feb 1, 2016, 4:16:00 PM2/1/16
to compu...@computer-go.org
Hello everybody,

some weeks ago I already gave a hint about the conference
CG2016 (CG standing for "Computers and Games"), to take place in
Leiden (NL) on June 29 - July 01.

https://cg2016leiden.wordpress.com/

The deadline for papers has already been extended to February 11.

In view of the current DNN explosion there are plans for a further
extension for paper submissions in the field of neural nets - and
also a special workshop on "Neural Nets in Games".

So, start your research and contribute a paper...
More information will be given when available.

Ingo (Vice President of the ICGA).

Álvaro Begué

unread,
Feb 1, 2016, 4:45:56 PM2/1/16
to computer-go

Aja,

I read the paper with great interest. [Insert appropriate praises here.]

I am trying to understand the part where you use reinforcement learning to improve upon the CNN trained by imitating humans. One thing that is not explained is how to determine that a game is over, particularly when a player is simply a CNN that has a probability distribution as its output. Do you play until every point is either a suicide or looks like an eye? Do you do anything to make sure you don't play in a seki?

I am sure you are a busy man these days, so please answer only when you have time.

Thanks!
Álvaro.




Brian Cloutier

unread,
Feb 1, 2016, 5:01:57 PM2/1/16
to computer-go
> One thing that is not explained is how to determine that a game is over

You'll find that very little of the literature explicitly covers this. When I asked this question I had to search a lot of papers on MCTS which mentioned "terminal states" before finding one which defined them.

Let me see if I can find the actual paper, but they defined it as a position where there are no more legal moves. You're right though, that ignores sekis, which makes me think I'm remembering wrong.

Brian Sheppard

unread,
Feb 1, 2016, 5:15:46 PM2/1/16
to compu...@computer-go.org

You play until neither player wishes to make a move. The players are willing to move on any point that is not self-atari, and they are willing to make self-atari plays if the capture would result in a nakade (http://senseis.xmp.net/?Nakade).

 

This correctly plays seki.
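
In rough Python, that rule reads as follows (a minimal sketch of the
heuristic as described above; the board methods legal_moves,
is_self_atari and capture_is_nakade are hypothetical names, not taken
from any real engine):

def wants_to_move(board, color):
    # True while the player still has a move it is willing to make.
    for move in board.legal_moves(color):
        if not board.is_self_atari(move, color):
            return True   # willing to move on any non-self-atari point
        if board.capture_is_nakade(move, color):
            return True   # self-atari is acceptable if the capture is a nakade
    return False

def playout_is_over(board):
    # Terminal state: neither player wishes to make a move.
    return not wants_to_move(board, 'b') and not wants_to_move(board, 'w')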

 

George Dahl

unread,
Feb 1, 2016, 11:14:26 PM2/1/16
to computer-go
If anything, the other great DCNN applications predate the application of these methods to Go. Deep neural nets (convnets and other types) have been successfully applied in computer vision, robotics, speech recognition, machine translation, natural language processing, and hosts of other areas. The first paragraph of the TensorFlow whitepaper (http://download.tensorflow.org/paper/whitepaper2015.pdf) even mentions dozens of applications at Alphabet specifically.

Of course the future will hold even more exciting applications, but these techniques have been proven in many important problems long before they had success in Go and they are used by many different companies and research groups. Many example applications from the literature or at various companies used models trained on a single machine with GPUs.

"Ingo Althöfer"

unread,
Feb 2, 2016, 3:31:11 AM2/2/16
to compu...@computer-go.org
Hi George,

welcome, and thanks for your valuable hint on the Google-whitepaper.

Do/did you have/see any cross-relations between your research and
computer Go?
 
Cheers, Ingo.
 


Robert Jasiek

unread,
Feb 2, 2016, 3:34:41 AM2/2/16
to compu...@computer-go.org
On 01.02.2016 23:01, Brian Cloutier wrote:
> I had to search a lot of papers on MCTS which mentioned "terminal
> states" before finding one which defined them. [...] they defined it
> as a position where there are no more legal moves.

On 01.02.2016 23:15, Brian Sheppard wrote:
> You play until neither player wishes to make a move. The players
> are willing to move on any point that is not self-atari, and they
> are willing to make self-atari plays if capture would result in a
> Nakade (http://senseis.xmp.net/?Nakade)

Defining "terminal state" as no more legal moves is probably
inappropriate. The phrase "willing to move" is undefined, unless they
exactly define it as "to make self-atari plays iff capture would result
in a Nakade". This requires a proof that this is the only exception.
Where is that proof? It also requires a definition of nakade. Where is
that definition?

In my book Capturing Races 1, I have outlined a definition of
"[semeai-]eye" and, in Life and Death Problems 1, of "nakade". These are
far more complicated than naive online descriptions suggest. In
particular, such outlined definitions depend on the still undefined
"essential [string]", "seki" [sic, undefined as a strategic object
because the Japanese 2003 Rules' definition does not distinguish good
from bad strategy!] and "lake" [connected part of the potential
eyespace..., which in turn is still undefined as a strategic object].
They also depend on "ko", but at least this I have defined:
http://home.snafu.de/jasiek/ko.pdf Needless to say, determining the
objects that are essential, seki, lake, ko is a hard task in itself.

So where is the mathematically strict "definition" of nakade? Has
anybody proceeded beyond my definition attempts? I suspect the standard
problem of research again: definition by reference to a different paper
with an ambiguous description. If ambiguous terms are presumed for
pragmatic reasons, this must be stated! My mentioned terms are ambiguous
but less so than every other attempt - or where are the better attempts?

--
robert jasiek

Petr Baudis

unread,
Feb 2, 2016, 5:49:31 AM2/2/16
to compu...@computer-go.org
Hi Robert,

maybe it's just me, but you seem to come off as perhaps a little too
aggressive in your recent few emails...

On Tue, Feb 02, 2016 at 09:35:14AM +0100, Robert Jasiek wrote:
> On 01.02.2016 23:01, Brian Cloutier wrote:> I had to search a lot of papers
> on MCTS which
> > mentioned "terminal states" before finding one which defined them.
> > [...] they defined it as a position where there are no more legal
> > moves.
>
> On 01.02.2016 23:15, Brian Sheppard wrote:
> >You play until neither player wishes to make a move. The players
> > are willing to move on any point that is not self-atari, and they
> >are willing to make self-atari plays if capture would result in a
> >Nakade (http://senseis.xmp.net/?Nakade)
>
> Defining "terminal state" as no more legal moves is probably inappropriate.
> The phrase "willing to move" is undefined, unless they exactly define it as
> "to make self-atari plays iff capture would result in a Nakade". This
> requires a proof that this is the only exception. Where is that proof? It
> also requires a definition of nakade. Where is that definition?

The question was about the practical implementation of an MC
simulation, which does *not* require formal definitions of all concepts
used in the description, or any proofs. It's just a heuristic, and it
can be arbitrarily complicated, making a tradeoff between speed and
accuracy.

My definition of this state is in the (quite literal) code

https://github.com/pasky/pachi/blob/master/tactics/selfatari.c#L638

It's the most complicated part of Pachi. :-) (And doesn't really work
that well either.) But it covers a lot of cases.


To the subject at hand, I'd suggest a much simpler approach if the DCNN
is itself capable of avoiding the bad self-ataris: just do not restrict
the DCNN in any way, and have a separate check that stops the playout if
all the remaining moves are self-atari in the most trivial sense of the
word.

This means you stop the game too early if the whole board is
filled (including dame and territory) but some nakade or throw-in remains
that the DCNN would like to play out - but I suspect that would pretty
much never happen in practice?
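
A sketch of that check (again with a made-up board API; liberties_after
is assumed to count the liberties of the played stone's group after the
move):

def stop_playout_early(board, color):
    # Stop if every remaining legal move is a self-atari in the most
    # trivial sense: the played stone's group ends with one liberty.
    return all(board.liberties_after(move, color) == 1
               for move in board.legal_moves(color))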

--
Petr Baudis
If you have good ideas, good data and fast computers,
you can do almost anything. -- Geoffrey Hinton

Robert Jasiek

unread,
Feb 2, 2016, 6:10:34 AM2/2/16
to compu...@computer-go.org
On 02.02.2016 11:49, Petr Baudis wrote:
> you seem to come off as perhaps a little too
> aggressive in your recent few emails...

If I were not aggressively critical about inappropriate ambiguity, it
would continue for further decades. Papers containing mathematical
contents must clarify when something whose use or notation looks
mathematical is not a definition / well-defined term but intentionally
ambiguous. This clarity is fundamental to mathematical, informatics and
scientific research. Without clarity, progress is delayed. Every
professor at a university will confirm this to you.

> The question was about the practical implementation of an MC
> simulation, which does *not* require formal definitions of all concepts
> used in the description, or any proofs. It's just a heuristic, and it
> can be arbitrarily complicated, making a tradeoff between speed and
> accuracy.

Fine, provided it is clearly stated that it is an ambiguous heuristic
and not an [unambiguous] definition / term. References / links (possibly
iterative) hiding ambiguity without declaring it are inappropriate.

--
robert jasiek

Olivier Teytaud

unread,
Feb 2, 2016, 6:32:21 AM2/2/16
to compu...@computer-go.org
 
> Without clarity, progress is delayed. Every professor at a university
> will confirm this to you.

IMHO, Petr has contributed enough to academic research not to need a
discussion with a university professor to learn how to do/clarify
research :-)






--
=========================================================
"I will never sign a document with logos in black & white." A. Einstein
Olivier Teytaud, olivier...@inria.fr, http://www.slideshare.net/teytaud



"Ingo Althöfer"

unread,
Feb 2, 2016, 7:05:31 AM2/2/16
to compu...@computer-go.org
Hi Robert,

we met for the first time at the EGC 2000 in Berlin-Strausberg.
I know your special ways of arguing - and think that you are an
enrichment both for the go world and for the computer go scene.

But ...


> Without clarity, progress is delayed. Every professor at a university
> will confirm this to you.

as a professor of Mathematics (for 22 years now) I seriously
question this point of view. Of course, when a student starts
studying Mathematics (s)he learns in the first two semesters that
everything has to be defined watertight. Later, in particular
when (s)he comes near to doing own research, one has to make
compromises - otherwise one will never make much progress.

For research in general it is good to have waves:
moving forward in informal thoughts and handwaving proofs (and maybe
even with a glass of beer in the hand) - then having another phase where
precision and clarity are on the agenda. And back to informal mode ...
Also, a team of mathematicians will be most successful when it has
handwavers, dreamdancers, bean-counters, and formalists.

Accept that the world is multi-faceted, even our small
computer go community.

Cheers (without a beer at hand right now),
Ingo.

Robert Jasiek

unread,
Feb 2, 2016, 7:14:51 AM2/2/16
to compu...@computer-go.org
On 02.02.2016 13:05, "Ingo Althöfer" wrote:
> For research in general it is good to have waves:

Research is faster if informalism and formalism progress simultaneously
(by different people or in different papers).

--
robert jasiek

Jim O'Flaherty

unread,
Feb 2, 2016, 11:29:19 AM2/2/16
to compu...@computer-go.org

And to meta this awesome short story...

AI Software Engineers: Robert, please stop asking our AI for explanations. We don't want to distract it with limited human understanding. And we don't want the Herculean task of coding up that extremely frail and error-prone bridge.

Robert Jasiek

unread,
Feb 2, 2016, 1:02:56 PM2/2/16
to compu...@computer-go.org
On 02.02.2016 17:29, Jim O'Flaherty wrote:
> AI Software Engineers: Robert, please stop asking our AI for explanations.
> We don't want to distract it with limited human understanding. And we don't
> want the Herculean task of coding up that extremely frail and error prone
> bridge.

Currently I do not ask a specific AI engine for explanations. If an AI
program only has the goal of playing strong, then - while it is playing
or preparing play - it should not be disturbed with extra tasks.

Explanations can come from AI programs, their programmers, researchers
providing the theory applied in those programs, researchers analysing
the program codes, data structures or outputs.

I do not expect everybody to be interested in explanations, but I ask
those interested. It must be possible to study theory for playing
programs, their data structures or outputs and find connections to
explanatory theory - as much as it must be possible to use explanatory
theory to improve "brute force" programs.

Herculean task? Likely. The research in explanatory theory is, too.

Error-prone? I disagree. Errors are not created by the volume of a task
but by carelessness or a neglected study of semantic conflicts.

--
robert jasiek

David Fotland

unread,
Feb 2, 2016, 1:07:46 PM2/2/16
to compu...@computer-go.org
Robert, please consider some of this as the difference between math and engineering. Math desires rigor. Engineering desires working solutions. When an engineering solution is being described, you shouldn't expect the same level of rigor as in a mathematical proof. Often all we can say is something like, "I tried a bunch of things, and this one worked best". Both have value.

-David


David Fotland

unread,
Feb 2, 2016, 1:11:33 PM2/2/16
to compu...@computer-go.org
Since I sell go software, providing explanations is an interesting topic for me. Weaker players want something that can help them learn, and this requires more than just "This is the best move". Many Faces gives crude explanations, and it’s something I will continue to work on. On the other hand, when presenting to people learning go, simple explanations are often good enough.

David


Robert Jasiek

unread,
Feb 2, 2016, 1:23:16 PM2/2/16
to compu...@computer-go.org
On 02.02.2016 19:07, David Fotland wrote:
> consider some of this as the difference between math and engineering. Math desires rigor.
> Engineering desires working solutions. When an engineering solution is being described,
> you shouldn't expect the same level of rigor as in a mathematical proof. Often all we can
> say is something like, "I tried a bunch of things, and this one worked best". Both have value.

Of course. This is perfectly fine. - I have criticised something else:
the hiding of ambiguity of things portrayed as maths when statements of
the kind "this is a heuristic / engineering / first guess" are easily
possible. Research papers should be honest. (They may hide secret
details, but this is another topic.)

David Fotland

unread,
Feb 2, 2016, 1:32:24 PM2/2/16
to compu...@computer-go.org
Amazon uses deep neural nets in many, many areas. There is some overlap with the kind of nets used in AlphaGo. I passed a link to the paper on to one of our researchers and he found it very interesting. DNN works very well when there is a lot of labelled data to learn from. It can be useful to examine a problem area from the point of view: where can I get the most labelled data?

David


Robert Jasiek

unread,
Feb 2, 2016, 1:54:20 PM2/2/16
to compu...@computer-go.org
On 02.02.2016 13:05, "Ingo Althöfer" wrote:
> when a student starts
> studying Mathematics (s)he learns in the first two semesters that
> everything has to be defined waterproof. Later, in particular
> when (s)he comes near to doing own research, you have to make
> compromises - otherwise you will never make much progress.

When I studied maths and theoretical informatics at FU Berlin (and a bit
at TU Berlin) (until quitting because of studying too much go, of
course), during all semesters with every paper, lecture, homework or
professor, everything had to be well-defined, assumptions complete and
mandatory proofs accurate.

As a hobby go theory / go rules theory researcher, I can afford the
luxury of choosing formality (see Cycle Law), semi-formality (see Ko) or
informality (in informal texts) because I need not pass university
degrees with the work. My luxury of laziness / convenience when I use
semi-formal style (as typical in the theory parts of my go theory
papers) indeed has the advantages of being understood more easily from
the go player's (also my own) perspective and allowing my faster
research progress. If I had had to use formal style for every text, I
might have finished only half of the papers.

If we can believe Penrose (The Road to Reality) and Smolin (The Trouble
with Physics), the world of mathematical physics is split into guesswork
(string theory without valid mathematical foundation) and accurate
maths. Progress might not be made because too many have lost themselves
in the black hole of ambiguous string theory. Computer go theory seems
to be similar to physics.

--
robert jasiek

Xavier Combelle

unread,
Feb 2, 2016, 2:07:56 PM2/2/16
to compu...@computer-go.org


2016-02-01 12:24 GMT+01:00 Olivier Teytaud <tey...@lri.fr>:
> If AlphaGo had lost at least one game, I'd understand how people can have an upper bound on its level, but with 5-0 (except for Blitz) it's hard to have an upper bound on its level. After all, AlphaGo might just have played well enough for crushing Fan Hui, and a weak move while the position is still in favor of AlphaGo is not really a weak move (at least from a game-theoretic point of view...).

I just want to point out that according to Myungwan Kim 9p (video referenced in this thread), in the first game AlphaGo made some mistakes early on and was behind during nearly the whole game, so some of its moves must be weak from a game-theoretic point of view.

Olivier Teytaud

unread,
Feb 2, 2016, 2:21:54 PM2/2/16
to compu...@computer-go.org
>> If AlphaGo had lost at least one game, I'd understand how people can have an upper bound on its level, but with 5-0 (except for Blitz) it's hard to have an upper bound on its level. After all, AlphaGo might just have played well enough for crushing Fan Hui, and a weak move while the position is still in favor of AlphaGo is not really a weak move (at least from a game-theoretic point of view...).
>
> I just want to point out that according to Myungwan Kim 9p (video referenced in this thread), in the first game AlphaGo made some mistakes early on and was behind during nearly the whole game, so some of its moves must be weak from a game-theoretic point of view.

Thanks, this point is interesting - that's really an argument limiting the strength of AlphaGo.

On the other hand, they have super strong people in the team (at the pro level, maybe? if Aja has pro level...),
and one of the guys said he is "quietly confident", which suggests they have strong reasons to believe they have a big chance :-)

Good luck AlphaGo :-) I'm grateful because, since this happened, many more doors have opened for people
working with these tools, even if they don't touch games, and this is really useful for the world :-)

Marc Landgraf

unread,
Feb 2, 2016, 3:10:59 PM2/2/16
to compu...@computer-go.org

Robert Jasiek

unread,
Feb 2, 2016, 5:20:51 PM2/2/16
to compu...@computer-go.org
On 02.02.2016 20:21, Olivier Teytaud wrote:
> On the other hand, they have super strong people in the team (at the pro
> level, maybe ? if Aja has pro level...)

Ca. 5d amateur in the team is enough, regardless of whether Myungwan Kim
thinks that only a 9p can understand. Not so. Kim's above-5d-amateur
comments were related to reading or by-heart knowledge of the latest
nadare variations (before the post-joseki aji mistakes, which can be
detected by a 5d, or even below), but reading / joseki is not AlphaGo's
weakness.

--
robert jasiek

Igor Polyakov

unread,
Feb 2, 2016, 8:06:25 PM2/2/16
to compu...@computer-go.org
I think it would be an awesome commercial product for strong Go players. Even if the AI only shows the continuations and the score estimates for different lines, that may give the player enough reasoning to understand why one move is better than the other.

Oliver Lewis

unread,
Feb 3, 2016, 4:37:56 AM2/3/16
to computer-go
Is the paper still available for download? The direct link appears to be broken.

Thanks

Oliver

Álvaro Begué

unread,
Feb 3, 2016, 9:21:54 AM2/3/16
to computer-go
I searched for the file name on the web and found this copy: http://airesearch.com/wp-content/uploads/2016/01/deepmind-mastering-go.pdf

Álvaro.


Detlef Schmicker

unread,
Feb 4, 2016, 12:09:08 PM2/4/16
to compu...@computer-go.org

Hi,

I am trying to reproduce numbers from section 3: training the value network.

On the test set of KGS games the MSE is 0.37. Is it correct that the
results are represented as +1 and -1?

This means that in a typical board position you get a value of
1 - sqrt(0.37) ≈ 0.39 --> this would correspond to a win rate of 70%?!
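
Spelled out, the arithmetic (assuming outcomes are coded as z in {-1, +1}):

\[ \sqrt{\mathrm{MSE}} = \sqrt{0.37} \approx 0.61, \qquad v \approx 1 - 0.61 \approx 0.39, \qquad \tfrac{1}{2}(v + 1) \approx 0.70 . \]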

Is it really true that a typical KGS 6d+ position is judged with such a
high win rate (even though it is overfitted, so the test-set number is
too pessimistic!), or do I misinterpret the MSE calculation?!

Any help would be great,

Detlef

Am 27.01.2016 um 19:46 schrieb Aja Huang:
> We are very excited to announce that our Go program, AlphaGo, has
> beaten a professional player for the first time. [...]
> http://www.deepmind.com/alpha-go.html

Álvaro Begué

unread,
Feb 4, 2016, 12:11:56 PM2/4/16
to compu...@computer-go.org
The positions they used are not from high-quality games. They actually include one last move that is completely random.

Álvaro. 

Detlef Schmicker

unread,
Feb 4, 2016, 12:21:07 PM2/4/16
to compu...@computer-go.org

Thanks for the response. I am not referring to the final data set:
in the chapter in question they state that they used their KGS dataset
in a first try (which another part of the paper describes as
a 6d+ data set).


Álvaro Begué

unread,
Feb 4, 2016, 1:34:27 PM2/4/16
to computer-go
I re-read the relevant section and I agree with you. Sorry for adding noise to the conversation.

Álvaro.






Hideki Kato

unread,
Feb 4, 2016, 2:10:20 PM2/4/16
to compu...@computer-go.org
Detlef Schmicker wrote:
>Hi,
>
>I try to reproduce numbers from section 3: training the value network
>
>On the test set of kgs games the MSE is 0.37. Is it correct, that the
>results are represented as +1 and -1?

Looks correct.

>This means, that in a typical board position you get a value of
>1-sqrt(0.37) = 0.4 --> this would correspond to a win rate of 70% ?!

Since all positions of all games in the dataset are used, the winrate
should be distributed from 0% to 100%, i.e. from -1 to 1, not
concentrated at 1. So the number 70% could be wrong. An MSE of 0.37
just means the average error is about 0.6, I think.

Hideki

--
Hideki Kato <mailto:hideki...@ybb.ne.jp>

Detlef Schmicker

unread,
Feb 4, 2016, 2:24:27 PM2/4/16
to compu...@computer-go.org

>> Since all positions of all games in the dataset are used, the winrate
>> should be distributed from 0% to 100%, i.e. from -1 to 1, not
>> concentrated at 1. So the number 70% could be wrong. An MSE of 0.37
>> just means the average error is about 0.6, I think.

0.6 on the scale of -1 to 1,

which means games at -1 (e.g. lost by B) get a typical value-network
output of -0.4, and games at +1 a typical output of +0.4.

If I rescale -1 to +1 onto 0 - 100% (e.g. winrate for B), then I get
about 30% for games lost by B and 70% for games won by B?

Detlef



Álvaro Begué

unread,
Feb 4, 2016, 2:35:22 PM2/4/16
to computer-go
I am not sure how exactly they define MSE. If you look at the plot in figure 2b, the MSE at the very beginning of the game (where you can't possibly know anything about the result) is 0.50. That suggests it's something other than your [very sensible] interpretation.

Álvaro.


Michael Markefka

unread,
Feb 4, 2016, 2:55:11 PM2/4/16
to compu...@computer-go.org
That sounds like it'd be the MSE as classification error of the eventual result.

I'm currently not able to look at the paper, but couldn't you use a
softmax output layer with two nodes and take the probability
distribution as winrate?
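
For what it's worth, a toy version of that idea in NumPy (the two
logits here are made-up numbers; in a real network they would come from
the two-node value head):

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))   # subtract max for numerical stability
    return e / e.sum()

logits = np.array([-0.2, 0.8])  # hypothetical [loss, win] head outputs
p_loss, p_win = softmax(logits)
print(f"win rate ~ {p_win:.2f}")  # prints: win rate ~ 0.73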

Hideki Kato

unread,
Feb 4, 2016, 3:40:58 PM2/4/16
to compu...@computer-go.org
I think the error is defined as the difference between the
output of the value network and the average output of the
simulations done by the policy network (RL) at each position.

Hideki


--
Hideki Kato <mailto:hideki...@ybb.ne.jp>

Álvaro Begué

unread,
Feb 4, 2016, 3:43:49 PM2/4/16
to computer-go
I just want to see a definition that yields 0.5 for the initial board position.

One possibility is that 0=loss, 1=win, and the number they are quoting is sqrt(average((prediction-outcome)^2)).
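
That definition would indeed give exactly 0.5 at move zero: a constant
prediction of 0.5 against outcomes z_i in {0, 1} yields

\[ \sqrt{\tfrac{1}{N}\sum_{i=1}^{N}(0.5 - z_i)^2} = \sqrt{0.25} = 0.5 , \]

whatever the mix of wins and losses in the dataset.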

Detlef Schmicker

unread,
Feb 4, 2016, 4:44:12 PM2/4/16
to compu...@computer-go.org

> One possibility is that 0=loss, 1=win, and the number they are quoting is
> sqrt(average((prediction-outcome)^2)).


This makes perfect sense for figure 2; even the playout numbers seem
reasonable.

But figure 2 is not consistent with the numbers in section 3, which
would be 0.234 (test set of the self-play database). The figure looks
more like 0.3 - 0.35 or even higher...


uurtamo .

unread,
Feb 4, 2016, 4:58:21 PM2/4/16
to computer-go
Robert,

Just as an aside, I really respect your attention to detail and your insistence that generalizing statements about aspects of go be backed by proof.

I think that the counting problems recently were pretty interesting (number of positions versus number of games).

The engineering problem of winning against humans is of course much simpler than the math problem of understanding the game deeply (just think about all of the work Berlekamp did on this). I like that they move forward hand-in-hand, and right now it seems like the urge for most people is to make computers strong enough to beat a top pro once; then the next goal will be to beat them regularly, as has been done in chess.

If you think about chess endgames, and how they've been categorized, we're nowhere near that in Go except for fairly quiescent positions. The opening is a nightmare, the midgame is a nightmare, and multiple fights with multiple kos are a nightmare. Solving this mathematically is of course hugely far in the future. Faking our way forward with engineering (similar to how people play) seems to be our best guess at the moment.

Thanks for your insight and rigor, and I'm glad that you're continuing down your rigorous path when so many of us have forgotten that extremely minor rule differences can be: inexplicable (Japanese rules, if I understand correctly), difficult to deal with (Chinese rules, under a very liberal understanding) or useless (mathematical descriptions of a game which is totally different from how people actually play).

Thanks again,

steve

John Tromp

unread,
Feb 12, 2016, 10:21:32 PM2/12/16
to computer-go
On Wed, Jan 27, 2016 at 1:46 PM, Aja Huang <ajah...@google.com> wrote:
> We are very excited to announce that our Go program, AlphaGo, has beaten a
> professional player for the first time. AlphaGo beat the European champion
> Fan Hui by 5 games to 0.

It's interesting to go back nearly a decade and read this 2007 article:

http://spectrum.ieee.org/computing/software/cracking-go

where Feng-Hsiung Hsu, Deep Blue's lead developer, made this prediction:

"Nevertheless, I believe that a world-champion-level Go machine can be
built within 10 years"

Which now appears to be spot on. March 9 cannot come soon enough...
The remainder of his prediction rings less true though:

", based on the same method of intensive analysis—brute force,
basically—that Deep Blue employed for chess".

regards,
-John

muupan

unread,
Feb 23, 2016, 11:52:40 PM2/23/16
to compu...@computer-go.org
Congratulations, people at DeepMind! Your paper is very interesting to read.

I have a question about the paper. On policy network training it says

> On the first pass through the training pipeline, the baseline was set to zero; on the second pass we used the value network vθ(s) as a baseline;

but I cannot find any other description of the "second pass". What is it? It uses vθ(s), so at least it is done after training vθ(s). Is it that, after completing the whole training pipeline depicted in Fig. 1, only the RL policy network training part is repeated? Or is training vθ(s) also repeated? Is the second pass the last pass, or are there more passes? Sorry if I just missed the relevant part of the paper.
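
For reference, the REINFORCE-style update with a baseline that the
quoted sentence refers to has the form (my rendering, not a quote from
the paper):

\[ \Delta\rho \propto \bigl(z_t - b(s_t)\bigr)\,\nabla_\rho \log p_\rho(a_t \mid s_t), \qquad b(s_t) = 0 \ \text{(first pass)}, \quad b(s_t) = v_\theta(s_t) \ \text{(second pass)}. \]

A state-dependent baseline only reduces the variance of the gradient
estimate; the expected update direction is unchanged, so the second
pass should learn the same thing, just more efficiently.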

Petri Pitkanen

unread,
Feb 24, 2016, 1:51:01 AM2/24/16
to computer-go
Open to interpretation whether this method is brute force. I think it is: it uses huge amounts of CPU power to run simulations and evaluate NNs. Even in chess it was not just about tree search; an evaluation function is needed to make sense of the search.

Stefan Kaitschick

unread,
Mar 13, 2016, 9:04:52 AM3/13/16
to compu...@computer-go.org
The evaluation is always at least as deep as the leaves of the tree.
Still, you're right that the earlier in the game, the bigger the inherent uncertainty.
One thing I don't understand: if the network gives a thumbs up or down, instead of answering with a probability,
what is the use of MSE? Why not just the prediction rate?
