Training with competitive self-play using RLLib


shehrumbk

Aug 5, 2020, 5:02:49 PM
to Google Research Football
Hi everyone. I'm trying to train my RL agent with competitive self-play (controlling the agents on both sides), using RLlib's multi-agent framework. However, I'm getting .nan in the episode rewards. Has anyone experienced a similar case, or trained agents in a similar setting? Any help would be greatly appreciated.
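For context, a competitive self-play setup in RLlib's classic multi-agent API revolves around a "policies" dict plus a policy_mapping_fn that assigns each environment agent id to a policy id. The sketch below is illustrative only: the policy names ("policy_01", "policy_opponent"), the "agent_N" id scheme, and the choice to train one side against a frozen copy are assumptions, not details from this post.

```python
# Hypothetical two-sided self-play mapping for RLlib's multi-agent API.
# Assumption: the env emits agent ids like "agent_0" (left team) and
# "agent_1" (right team); even-indexed agents go to the learning policy,
# odd-indexed agents to a periodically synced opponent policy.

def policy_mapping_fn(agent_id):
    """Map an agent id such as 'agent_0' to a policy id."""
    index = int(agent_id.split("_")[-1])
    return "policy_01" if index % 2 == 0 else "policy_opponent"

# Rough shape of the "multiagent" section of an RLlib trainer config.
# Passing None for the policy class and spaces tells RLlib to infer them
# from the env; per-policy config overrides are left empty here.
multiagent_config = {
    "policies": {
        "policy_01": (None, None, None, {}),        # the trained policy
        "policy_opponent": (None, None, None, {}),  # frozen/synced copy
    },
    "policy_mapping_fn": policy_mapping_fn,
    "policies_to_train": ["policy_01"],
}
```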

+++++++++++++++Training iteration 9+++++++++++++++++++

custom_metrics: {}
date: 2020-08-05_20-58-57
done: false
episode_len_mean: .nan
episode_reward_max: .nan
episode_reward_mean: .nan
episode_reward_min: .nan
episodes_this_iter: 0
episodes_total: 0
experiment_id: c8ec3950c1704719af5ffdfc54947987
hostname: ip-172-31-29-189
info:
  learner:
    policy_01:
      allreduce_latency: 0.0
      cur_kl_coeff: 0.5
      cur_lr: 0.0002500000000000001
      entropy: 2.8751889978136336
      entropy_coeff: 0.009999999999999998
      kl: 0.013963970422212566
      policy_loss: -0.10596811185990061
      total_loss: -0.12725649561200822
      vf_explained_var: 0.9584423899650574
      vf_loss: 0.0004815203574253246
  num_steps_sampled: 36000
  num_steps_trained: 36000
iterations_since_restore: 9
node_ip: 172.31.29.189
num_healthy_workers: 4
off_policy_estimator: {}
perf:
  cpu_util_percent: 62.568965517241374
  ram_util_percent: 30.829310344827576
pid: 1923
policy_reward_max: {}
policy_reward_mean: {}
policy_reward_min: {}
sampler_perf: {}
time_since_restore: 741.328654050827
time_this_iter_s: 80.76175713539124
time_total_s: 741.328654050827
timers:
  learn_throughput: 102.519
  learn_time_ms: 39016.975
  sample_throughput: 92.416
  sample_time_ms: 43282.315
  update_time_ms: 37.503
timestamp: 1596661137
timesteps_since_restore: 0
timesteps_total: 36000
training_iteration: 9

+++++++++++++++++++++++++++++++++++++++++++++++++++
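One detail worth noting in the log above: episodes_this_iter and episodes_total are both 0. RLlib reports episode_reward_mean as the mean over episodes that completed during the reporting window, so with no completed episodes the average is taken over an empty list and comes out as NaN. (A full GFootball match is long, on the order of thousands of steps, so early iterations can easily finish without completing any episode.) A minimal sketch of that behavior, using plain NumPy rather than RLlib internals:

```python
import math

import numpy as np

# With episodes_this_iter: 0, the list of completed-episode rewards is
# empty, and the mean of an empty array is NaN -- which is exactly the
# ".nan" printed in the result dict above.
completed_episode_rewards = []  # episodes_this_iter: 0
mean_reward = np.mean(completed_episode_rewards)
assert math.isnan(mean_reward)

# Once at least one episode finishes, the mean becomes a real number
# and the NaN entries disappear from the reported metrics.
completed_episode_rewards = [1.0, -1.0, 0.0]
assert np.mean(completed_episode_rewards) == 0.0
```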

Yutai Zhou

Aug 28, 2020, 2:16:01 PM
to Google Research Football
I found that after some number of iterations, the NaN issue goes away.

shehrumbk

Aug 28, 2020, 2:26:35 PM
to Google Research Football
Yes, it went away for me too.