I'm not an expert, but I suspect it's because the two gym environments have different action space requirements:
vs
You can see on
this line that Box's `contains` function checks the shape of the given action against the shape of the space. Chances are you need to adjust the shape of your prediction to match that of the action space.
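To illustrate the idea, here's a minimal numpy sketch of that shape-plus-bounds check (`box_contains` is my own stand-in, not gym's actual implementation, and the `(1,)` shape and `[-2000, 2000]` bounds are assumptions about your space):

```python
import numpy as np

def box_contains(x, low, high, shape):
    # Mimics the spirit of gym's Box.contains: reject anything whose
    # shape differs from the space's shape, then check the bounds.
    x = np.asarray(x)
    return x.shape == shape and bool(np.all(x >= low) and np.all(x <= high))

# A (1,)-shaped space accepts a (1,)-shaped action...
print(box_contains(np.array([0.5]), -2000, 2000, (1,)))       # True
# ...but rejects a prediction with the wrong shape, even if in bounds.
print(box_contains(np.array([0.5, 0.3]), -2000, 2000, (1,)))  # False
```

So even a perfectly in-range prediction fails `contains` if its shape is off.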
But even then, your model's output is a single 0 to 1 float, which your BoltzmannQPolicy is going to
sample from. But it's not a real sample: it will always choose index 0, since len(q_values) == 1. You could either have your model output a softmax over all 4001 integers in [-2000, 2000], or you could modify env.process_action() or something like it to convert your (0-1) float into a [-2000, 2000] integer.
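A quick sketch of why that happens (this is a simplified stand-in for the policy, not keras-rl's exact code; the temperature default is an assumption):

```python
import numpy as np

def boltzmann_select(q_values, tau=1.0):
    # Softmax over the q-values at temperature tau, then sample an index.
    q = np.asarray(q_values, dtype=np.float64)
    probs = np.exp(q / tau)
    probs /= probs.sum()
    return np.random.choice(len(q), p=probs)

# With a single q-value the softmax is [1.0], so the "sample" is always 0:
print(boltzmann_select([0.37]))  # always 0
```

With one q-value the probability vector collapses to `[1.0]`, so there's nothing to explore, which is why a single-float output defeats the policy entirely.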
In general I think the documentation is so minimal because much of the functionality is serving as a bridge between gym and keras, so most troubleshooting requires digging into one of those two packages.
Hope that's helpful! If not, feel free to send the actual code rather than a screenshot for further debugging.