DQN: Meaning of "Average Q-Learning Loss"

Marco Pleines

Jun 18, 2017, 5:06:45 PM
to convnetjs
Hey there,

I've got some questions concerning the DQN demo.

What does the average Q-learning loss value actually mean? Is it the error term resulting from training the neural network in the DQN demo? I'm applying that approach to one of my games, and the value is usually greater than two. If it is related to the training error of the neural network, then I'd assume I have conflicting training data.
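For context, here is a minimal sketch of the kind of TD update that produces such a loss value (the helper name tdStep is made up, and this may not match ConvNetJS's exact internals): the network regresses Q(s, a) for the taken action toward the target r + gamma * max_a' Q(s', a'), and the reported number is that regression cost averaged over recent batches.

    // Hypothetical single DQN training step; assumes the convnetjs
    // global from the library script is available.
    // e = {state0, action0, reward0, state1} is one experience tuple.
    function tdStep(net, trainer, gamma, e) {
      // Evaluate Q(s', .) to get the best next-state action value.
      var x1 = new convnetjs.Vol(1, 1, e.state1.length, 0.0);
      x1.w = e.state1;
      var q1 = net.forward(x1).w;
      var maxq = Math.max.apply(null, q1);

      // TD target for the action that was actually taken.
      var target = e.reward0 + gamma * maxq;

      // Regress only that action's output toward the target;
      // {dim, val} is ConvNetJS's single-dimension regression format.
      var x0 = new convnetjs.Vol(1, 1, e.state0.length, 0.0);
      x0.w = e.state0;
      var stats = trainer.train(x0, {dim: e.action0, val: target});

      // stats.loss is the cost for this one sample; averaging it over
      // a window of batches gives an "average q-learning loss" number.
      return stats.loss;
    }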

And another question, concerning the call to the train function:

var loss = this.tdtrainer.train(x, ystruct);

Is this completely tied to the neural network implementation? That is, could I swap in another neural net implementation at that point and carry on from there?
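As far as I can tell, tdtrainer is just a ConvNetJS trainer wrapped around the value network, so the contract that one call relies on is small. A hypothetical adapter sketch (none of these method names exist in ConvNetJS; it only illustrates the interface another net implementation would need to provide):

    // Hypothetical adapter illustrating the train(x, ystruct) contract;
    // MyTrainer, backwardSingle and applyGradients are invented names.
    function MyTrainer(net, learningRate) {
      this.net = net;
      this.lr = learningRate;
    }

    MyTrainer.prototype.train = function(x, ystruct) {
      // x.w holds the flat input state, as in a convnetjs.Vol.
      var out = this.net.forward(x.w);

      // Squared error on the single output dimension that was acted on.
      var err = out[ystruct.dim] - ystruct.val;

      // Backpropagate a gradient only through that output dimension,
      // then apply an SGD-style update.
      this.net.backwardSingle(ystruct.dim, err);
      this.net.applyGradients(this.lr);

      // Same shape as the stats object ConvNetJS's trainer returns.
      return { loss: 0.5 * err * err };
    };

In other words, anything that can do a forward pass and a one-dimensional regression update should be able to stand in at that point.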


Thanks in advance!

(I'm trying to make use of DQN for my game BRO)

Marco Pleines

Jun 27, 2017, 6:25:10 AM
to convnetjs
It looks like the rewards were the cause of the high loss values. Originally I used these reward signals:

continuous alive reward = 1 (gathered each tick while being alive)
final alive reward = 20 (if the player survived)
death reward = -1000 (if the player died)
kill reward = 30 (if the player managed to kill someone else)

The divergence usually started with the death of the player. Now the values are like:

continuous alive reward = 0.1 (gathered each tick while being alive)
final alive reward = 10 (if the player survived)
death reward = -20 (if the player died)
kill reward = 10 (if the player managed to kill someone else)
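For comparison, the Nature DQN paper sidesteps this scale problem by clipping every reward to [-1, 1], which bounds the TD error regardless of which game event produced the reward; a minimal sketch (clipReward is a made-up helper):

    // Clip rewards to [-1, 1] so no single event, e.g. a death,
    // dominates the TD error and destabilizes training.
    function clipReward(r) {
      return Math.max(-1, Math.min(1, r));
    }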

Does this Q-learning loss actually affect the training?