Hi, what algorithms and libraries are you guys using?
I started from DQN -> DDQN -> dueling DDQN with prioritised replay in PyTorch.
At the moment I'm trying to create some form of distributed learning setup (3 actors and 1 learner across my 4 processors), based on the Ape-X paper.
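For anyone curious how the actors could differ from each other in that kind of setup, here is a rough sketch. It assumes the per-actor exploration schedule described in the Ape-X paper, eps_i = eps^(1 + alpha * i / (N - 1)) with eps = 0.4 and alpha = 7; the helper name `actor_epsilons` is made up for illustration and is not from any library.

```python
def actor_epsilons(num_actors, base_eps=0.4, alpha=7.0):
    """Return one fixed epsilon-greedy rate per actor (hypothetical helper).

    Assumes the Ape-X style schedule: eps_i = base_eps ** (1 + alpha * i / (N - 1)).
    """
    if num_actors < 1:
        raise ValueError("need at least one actor")
    if num_actors == 1:
        # With a single actor the schedule is undefined, so fall back to base_eps.
        return [base_eps]
    return [
        base_eps ** (1 + alpha * i / (num_actors - 1))
        for i in range(num_actors)
    ]


# 3 actors, matching the 3-actor / 1-learner split across 4 processors.
epsilons = actor_epsilons(3)
for actor_id, eps in enumerate(epsilons):
    print(f"actor {actor_id}: epsilon = {eps:.5f}")
```

The first actor explores at the base rate of 0.4 and each later actor is progressively greedier, so the shared replay buffer gets a mix of exploratory and near-greedy experience.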
Haven't put much time into policy gradient methods yet, but will get there eventually.
Oh, and I'm also using the 'simple115' observation with just linear layers, as I don't have the computational muscle on my laptop to use conv layers.
So yeah, what is everyone else's approach?
Tom