Reinforcement learning

Detlef Schmicker

Oct 22, 2017, 4:45:22 AM

to Oakfoam

Just in case somebody is thinking about using Oakfoam for this (AlphaGo Zero): I am trying to set up a reinforcement learning pipeline from our tools. It uses gomill to produce self-play games and our scripts to train on them.

This will be the readme:

1) run gomill to produce the games

2) find gomill/gomill.games/* >games.txt

3) scripts/CNN$ head -n 100 ../../games.txt | ./CNN-data-collection.sh initial.gamma 13 > test_positions.txt

4) scripts/CNN$ tail -n -100 ../../games.txt | ./CNN-data-collection.sh initial.gamma 13 > train_positions.txt

5) /train-net$ cat ../scripts/CNN/test_positions.txt | python ./generate_sample_data_leveldb.py

6) mv lenet_train_new lenet_test_new

7) /train-net$ cat ../scripts/CNN/train_positions.txt | python ./generate_sample_data_leveldb.py

8) /train-net$ ~/caffe-master/build/tools/caffe train -solver lenet_solver_value_small.prototxt --weights ../gomill/selfplay.trained

9) cp train-net/snapshots_selfplay/_iter_100000.caffemodel gomill/selfplay.trained
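A note on the split in steps 3 and 4: with GNU tail, `tail -n -100` prints the last 100 lines, the same as `tail -n 100`. With fewer than 200 games the test and training lists would overlap, and with more than 200 the games in the middle would go unused. Below is a minimal sketch of a disjoint split, assuming the intent is "first N games for testing, all remaining games for training". The function name `split_games` and the output file names are hypothetical and not part of the Oakfoam scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Oakfoam repository): split a list of
# game files into a held-out test list and a training list with no overlap.
split_games() {
    games_list=$1   # list of game files, e.g. games.txt from step 2
    n_test=$2       # number of games to hold out for testing, e.g. 100
    test_out=$3     # output file for the test list (name is an assumption)
    train_out=$4    # output file for the training list (name is an assumption)

    # The first n_test entries become the test list.
    head -n "$n_test" "$games_list" > "$test_out"

    # tail -n +K starts printing at line K, so this skips the first n_test
    # entries and keeps everything after them.
    tail -n "+$((n_test + 1))" "$games_list" > "$train_out"
}
```

The two resulting lists could then be piped into `./CNN-data-collection.sh initial.gamma 13` in place of the `head` and `tail` output in steps 3 and 4.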

Ask me to push it to the repository if you want to start. I just want to test a little before pushing.

Detlef

Detlef Schmicker

Oct 29, 2017, 1:14:05 PM

to Oakfoam

initial push of ReinforcementLearning :)

Steve Kroon

Oct 30, 2017, 2:33:02 AM

to oak...@googlegroups.com

Cool - thanks for this, Detlef!

Steve
