
connectionism in go


k p c

May 16, 1992, 6:59:56 PM
has anybody tried applying connectionism to the game of go?

i assume from the tremendous popularity of each subject that a lot of
people have probably tried it and found it wanting, or had good
reasons not to try it.

what kinds of negative results appeared? what good reasons were there
for not trying it?

thanks.

(my experience is in writing a very slow standard minimax go player in
prolog! since i'm interested in ann, i'm curious what an ann in this
domain would look like.)
--
no signature

Herbert Enderton

May 16, 1992, 10:39:50 PM
In article <KPC.92Ma...@zog.arc.nasa.gov>, k...@pluto.arc.nasa.gov (k p c) writes:
|> has anybody tried applying connectionism to the game of go?
|>
|> i assume from the tremendous popularity of each subject that a lot of
|> people have probably tried it and found it wanting, or had good
|> reasons not to try it.
|>
|> what kinds of negative results appeared? what good reasons were there
|> for not trying it?

Yes, I have a neural network trained to play go. The focus of my effort
was to get a heuristic measure of good and bad shape. So I set up a
neural net with 116 input units corresponding to local features of a
candidate move, 20 hidden units (completely connected), and 1 output
unit. The input features are things like "is there an enemy stone
two spaces up from this point?" (similarly for 35 other nearby points),
"how many liberties does the stone (if any) directly below this point have?",
"if my opponent played here could I catch him in a ladder?", and a
few others. The training data came from Howard Landman's collection
of professional go games; the pros' moves were taken to be positive
examples, and nearby random moves from the same positions were used
as negative training examples.
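
Schematically, the rating net is a single 116-20-1 feed-forward pass over a candidate move's local feature vector. A rough sketch in Python (the initialization and feature names are illustrative, not Golem's actual code):

import numpy as np

rng = np.random.default_rng(0)

# 116 local-feature inputs -> 20 fully connected hidden units -> 1 output.
W1 = rng.normal(scale=0.1, size=(20, 116))   # input-to-hidden weights
b1 = np.zeros(20)
W2 = rng.normal(scale=0.1, size=20)          # hidden-to-output weights
b2 = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rate_move(features):
    """Score one candidate move from its vector of 116 local features
    (nearby-stone indicators, liberty counts, ladder flags, ...)."""
    hidden = sigmoid(W1 @ features + b1)
    return float(sigmoid(W2 @ hidden + b2))  # near 0 = bad shape, near 1 = pro-like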

So what I got was a neural net that would rate each candidate move
based mainly on its shape (i.e. the local stone pattern), and this
was suitable for a heuristic move generator, but not sufficient to
play decent go just by itself. Below is an example of its outputs
for each of the legal moves in a position in a test game between
Many Faces of Go (Black) and my program Golem (White). Note that
the neural network is just one element in Golem; for the most part
it relies on a territory-counting evaluation function. It's White's
move below, and the numbers indicate the neural net's estimate of
how much each move locally resembles a professional go player's move,
on a scale of 1 (terrible) to 999 (splendid). The X's and O's are
black and white stones.

335 725 700 750 811 735 634 588 559 586 570 623 576 590 565 551 558 659 344
645 806 658 X O 868 745 692 704 706 689 644 701 714 695 680 700 688 643
534 752 745 X X O 755 798 794 769 722 O 740 808 822 791 762 684 534
553 796 747 X O 778 809 741 753 766 755 671 756 773 812 708 O 658 627
559 705 831 900 O 799 726 O 650 756 752 784 734 723 748 761 748 699 575
563 678 841 802 814 X 641 682 724 709 723 722 726 714 702 780 799 700 564
593 706 825 681 O X 694 858 705 666 686 691 700 709 753 772 795 678 561
596 760 880 774 O X 778 829 745 724 687 687 736 760 764 657 764 652 601
640 746 X 798 O O 863 X 842 757 759 734 744 795 598 O 721 709 600
579 741 586 X X O 754 O 859 X 780 800 850 O 602 729 775 669 546
605 768 721 X O 820 842 692 760 723 799 749 X O 800 812 827 672 559
593 684 821 812 867 828 738 748 728 717 736 778 802 X O 764 815 669 560
561 683 777 743 806 764 688 709 692 651 703 844 676 X 657 818 790 708 561
563 661 803 775 755 719 719 700 715 711 717 750 789 204 X 764 871 722 564
559 678 763 735 743 727 759 707 741 779 733 O 825 X X 718 768 768 575
559 715 782 X 728 775 757 764 764 807 830 872 O O X 720 O 729 622
549 643 796 773 778 791 X 787 804 X 810 O X X X O 830 842 578
629 719 643 699 696 701 679 718 804 582 X X O O O X O 792 733
344 629 549 559 597 599 614 609 618 694 582 774 824 675 719 910 712 749 366

Some of the moves the network rates high are plausible or even good, and some
are awful. Golem uses it to select twenty or so moves throughout the
board, and then runs its full evaluation function to make a selection
among those. In this case the moves it considers are marked `+' below:

19 - - - - - + - - - - - - - - - - - - -
18 - + - X O + - - - - - - - - - - - - -
17 - - - X X O - - + - - O - - + - - - -
16 - - - X O - - - - - - - - - - - O - -
15 - - + + O - - O - - - - - - - - - - -
14 - - - - - X - - - - - - - - - - - - -
13 - - - - O X - + - - - - - - - - - - -
12 - - + - O X - - - - - - - - - - - - -
11 - - X + O O - X - - - - - - - O - - -
10 - - - X X O - O + X - - + O - - - - -
9 - - - X O - - - - - - - X O - - - - -
8 - - - - f - - - - - - - + X O - - - -
7 - - - - - - - - - - - - - X - + - - -
6 - - + - - - - - - - - - - - X - + - -
5 - - - - - - - - - - - O - X X - - - -
4 - - - X - - - - - - - + O O X - O - -
3 - - + - - - X - - X - O X X X O - - -
2 - - - - - - - - - - X X O O O X O - -
1 - - - - - - - - - - - - - - - + - - -
A B C D E F G H J K L M N O P Q R S T White (O) to play
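
In outline, that two-stage selection (the net proposes, the territory evaluator decides) could be coded like this; `rate_move` is the shape net sketched above, and `local_features` and `territory_eval` are assumed helpers standing in for Golem's feature extraction and full evaluation function:

def choose_move(position, legal_moves, k=20):
    # Stage 1: rate every legal move on local shape; keep the top k.
    rated = sorted(legal_moves,
                   key=lambda m: rate_move(local_features(position, m)),
                   reverse=True)
    candidates = rated[:k]
    # Stage 2: run the expensive territory evaluation only on those.
    return max(candidates, key=lambda m: territory_eval(position, m))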

A description of this and the other aspects of my go program Golem is
available in a short technical report, CMU-CS-92-101. If you want a copy,
please write to:

Computer Science Documentation
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213-3890
rep...@cs.cmu.edu
412/268-2596
Each technical report: $2.00 (Domestic, Canada)
                       $5.00 (Overseas)
This charge includes first class/airmail postage.

parkhill

May 18, 1992, 10:30:55 AM
What was your neural net learning algorithm? I am reading through a paper
by Sa & Ballard called "Top-Down Teaching Enables Non-trivial Clustering
via Competitive Learning" which describes a way of improving "Hebbian-style"
learning with a top-down teaching input. The paper suggests that their
method works better and faster than "back propagation" for pattern
recognition. The paper states "A teaching signal allows relevant
information to guide the development of synaptic connections." Not that
that statement says much.

Did your work seem to show that the neural net was only suitable for
"quickly" finding potential moves that are then passed on to a more
"intelligent" part of the program? It seems to me there was a chess
program developed in this manner. It didn't use a NN, but used one computer
for classification and the other for choosing one of the provided options.

Given your experience using NN algorithms, what are they useful for
in terms of solving game-related problems?

Do they seem applicable to more complex games such as military conflict
simulation games or fantasy games?

I am just starting to read on the topic of neural nets and would
appreciate your comments.

Robert Parkhill

Herbert Enderton

May 20, 1992, 8:11:11 PM
In article <59...@email.sp.unisys.com>, park...@email.sp.unisys.com (parkhill) writes:
|> What was your neural net learning algorithm? I am reading through a paper
|> ...

I used "vanilla" back-propagation. It took a long time for the weights
to converge, and it never was able (nor did I expect it) to correctly
learn all the training data. The only new twist I put on the learning
process was the way I assigned desired-output values: I had the training
data in pairs, each comprising the pro's move and some other move
(presumably worse) in the same position. When it rated the pro's move
higher than the other move by a sufficient margin, I left it alone, and
only did the backpropagation of errors for cases it got wrong.
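
In pseudocode, one training step on a pair looks roughly like this (the margin constant and helper names are invented for illustration; `net.forward` and `net.backprop` stand in for the usual back-propagation machinery):

MARGIN = 0.1  # illustrative; the actual threshold isn't stated

def train_on_pair(net, pro_features, other_features, lr):
    """Only backpropagate when the net fails to prefer the pro's move
    by a sufficient margin; otherwise leave the weights alone."""
    pro_score = net.forward(pro_features)
    other_score = net.forward(other_features)
    if pro_score - other_score >= MARGIN:
        return  # already ordered correctly, with room to spare
    # Push the pro's move toward 1 and the alternative toward 0.
    net.backprop(pro_features, target=1.0, lr=lr)
    net.backprop(other_features, target=0.0, lr=lr)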

|>
|> Did you work seem to show that the neural net was only suitable for
|> "quickly" finding potential moves that are then passed onto a more
|> "intelligent" part of the program? It seems to me there was a chess
|> program developed in this manner. Didn't use NN but used one computer
|> for classification and the other for choosing one of the provided options.
|>

With rich enough inputs, I think a neural net could play the whole game,
i.e. make the final move selection, but that wasn't what I was after.
There are a lot of important tactical calculations to make in go (and chess),
and I think it's unreasonable to expect a neural net to figure out how
to do these calculations. I wrote code to read out loose ladders, to
count eyes, to classify moves as tactically viable or suicidal, and so
forth, so I give my top-level neural net a lot of this calculated
data as inputs. I even give it the territory evaluation function's estimate
of the ownership (on a scale of 0-100) of the nearby points. But even
with all of that, the neural net doesn't play very well by itself. It
plays a lot of small endgame moves when there are big open areas it
could invade. My evaluation function makes better decisions with just
a one-ply search. Actually, a combination of the neural net's rating and
the evaluation function's territorial value for each move works pretty
well; somewhat better than either by itself.
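
Such a combination can be as simple as a weighted sum of the two scores per move, normalized to comparable ranges (the weights and scaling below are placeholders, not Golem's actual numbers):

def combined_score(net_rating, territorial_value,
                   w_net=0.5, w_terr=0.5, terr_scale=30.0):
    """Blend the net's 1-999 shape rating with the territory
    evaluator's point estimate for the same move."""
    return (w_net * (net_rating / 999.0)
            + w_terr * min(territorial_value / terr_scale, 1.0))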

|> Given your experience using NN algorithms what are they useful for
|> in terms of solving game related problems?
|>
|> Do they seem applicable to more complex games such as military conflict
|> simulation games or fantasy games?

Hard to say. If you've got a problem where you want to give the program
a lot of vague knowledge, then a neural net might be a relatively easy
way to do it. Many go programs use a large set of hand-coded rules,
things that say "when you see this stone pattern, make (or at least
consider) this move". Golem doesn't have such rules; its neural net
subsumes them. I guess the two places in a game-playing program where
one wants lots of vague heuristic knowledge are in the evaluation function
and in the search control (if it's doing something like forward pruning).
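
For contrast, a hand-coded rule of that kind might literally be a stored local pattern with a suggested move, matched everywhere on the board. A made-up example (the pattern is the classic "hane at the head of two stones"; the representation is illustrative, not taken from any particular program):

# '.' = empty, 'X' = own stone, 'O' = enemy stone, '*' = suggested move.
HANE_PATTERN = (".*.",
                "XO.",
                "XO.")

def match_pattern(board, row, col, pattern=HANE_PATTERN):
    """Return the suggested move if the window at (row, col) matches,
    else None. `board` maps (r, c) to '.', 'X', or 'O'."""
    suggestion = None
    for dr, line in enumerate(pattern):
        for dc, want in enumerate(line):
            have = board.get((row + dr, col + dc))
            if want == '*':
                if have != '.':
                    return None
                suggestion = (row + dr, col + dc)
            elif have != want:
                return None
    return suggestion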

You should read Gerald Tesauro's papers on NeuroGammon. That was the
first neural net game program, and it's also the best backgammon-playing
program in the world. My work is in a comparatively primitive state.

-- Bert Enderton

Henrik Klagges

May 21, 1992, 4:37:55 AM
hd...@CS.CMU.EDU (Herbert Enderton) writes:
>learn all the training data. The only new twist I put on the learning
>process was the way I assigned desired-output values: I had the training
>data in pairs, each comprising the pro's move and some other move
>(presumably worse) in the same position. When it rated the pro's move
>higher than the other move by a sufficient margin, I left it alone, and
>only did the backpropagation of errors for cases it got wrong.

That is essentially Tesauro's comparison training with a speedup.

Cheers, Henrik
IBM Research
MPCI at LLNL

parkhill

May 21, 1992, 12:04:36 PM

>You should read Gerald Tesauro's papers on NeuroGammon. That was the
>first neural net game program, and it's also the best backgammon-playing
>program in the world. My work is in a comparatively primitive state.

> -- Bert Enderton


Thank you for the information. I have ordered your paper. By the way, do
you have the name of the book/periodical the NeuroGammon papers appeared in?

Thanks

Robert Parkhill
Paramax, A Unisys Company
