Reinforcement learning


Detlef Schmicker

Oct 22, 2017, 4:45:22 AM
to Oakfoam
Just in case somebody is thinking about using oakfoam for this (AlphaGo Zero): I am trying to set up a reinforcement learning pipeline from our tools. It uses gomill to produce self-play games and our scripts to train on them.

This will be the readme:

1) run gomill to produce the games

2) find gomill/gomill.games/* > games.txt

3) scripts/CNN$ head -n 100 ../../games.txt | ./CNN-data-collection.sh initial.gamma 13 > test_positions.txt

4) scripts/CNN$ tail -n -100 ../../games.txt | ./CNN-data-collection.sh initial.gamma 13 > train_positions.txt

5) /train-net$ cat ../scripts/CNN/test_positions.txt | python ./generate_sample_data_leveldb.py

6) mv lenet_train_new lenet_test_new

7) /train-net$ cat ../scripts/CNN/train_positions.txt | python ./generate_sample_data_leveldb.py

8) /train-net$ ~/caffe-master/build/tools/caffe train -solver lenet_solver_value_small.prototxt --weights ../gomill/selfplay.trained

9) cp train-net/snapshots_selfplay/_iter_100000.caffemodel gomill/selfplay.trained
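
A note on steps 3 and 4: with GNU tail, "tail -n -100" prints the last 100 lines, so if games.txt holds fewer than 200 games the test and training lists overlap, and with more than 200 the games in the middle go unused. If a disjoint split over all games is wanted, something like the following might work. This is only a sketch and not part of the pipeline; the function name split_games and the output file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of oakfoam): split a list of game files
# into a test list and a training list that never overlap.
# Usage: split_games GAMES_FILE N_TEST TEST_OUT TRAIN_OUT
split_games() {
    games_file=$1
    n_test=$2
    test_out=$3
    train_out=$4

    # The first N_TEST lines become the test list (same as step 3).
    head -n "$n_test" "$games_file" > "$test_out"

    # "tail -n +K" starts output at line K, so the training list
    # begins right after the last test game.
    tail -n "+$((n_test + 1))" "$games_file" > "$train_out"
}
```

Calling "split_games games.txt 100 test_games.txt train_games.txt" would then give two lists that can be piped into ./CNN-data-collection.sh in place of the head and tail commands in steps 3 and 4.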


Ask me to push it to the repository if you want to start. I just want to test a little before pushing.


Detlef

Detlef Schmicker

Oct 29, 2017, 1:14:05 PM
to Oakfoam
Initial push of ReinforcementLearning :)

Steve Kroon

Oct 30, 2017, 2:33:02 AM
to oak...@googlegroups.com
Cool - thanks for this, Detlef!

Steve

