Hello, I've built an A3C implementation in Keras using this as a reference: https://jaromiru.com/2017/03/26/lets-make-an-a3c-implementation/
I'm using a custom environment, which I previously tested with DQN, where it successfully converged and showed really good results. But when I use the same environment with A3C, the model just chooses the same action over and over. I tried changing some hyper-parameters, but with no result. I also tried using a target model and updating it every n episodes, which improved convergence on the gym CartPole environment, but it still had no effect on my model's performance in my custom environment. Any ideas are welcome, thank you.
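For reference, the loss I'm minimizing follows the structure from that tutorial: policy gradient term weighted by the advantage, a value loss, and an entropy bonus. Here is a rough numpy sketch of just the loss math (function name, coefficients, and shapes are illustrative, not my actual Keras code):

```python
import numpy as np

def a3c_loss(probs, actions_onehot, returns, values,
             value_coef=0.5, entropy_coef=0.01, eps=1e-10):
    # Advantage = n-step return minus the critic's value estimate.
    advantage = returns - values
    # Log-probability of the action actually taken.
    log_prob = np.log(np.sum(probs * actions_onehot, axis=1) + eps)
    # Policy loss: advantage is treated as a constant w.r.t. the policy.
    policy_loss = -np.mean(log_prob * advantage)
    # Value loss: squared advantage (i.e. squared TD-style error).
    value_loss = value_coef * np.mean(advantage ** 2)
    # Entropy bonus: subtracted from the loss to discourage the policy
    # from collapsing onto a single action.
    entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return policy_loss + value_loss - entropy_coef * entropy
```

The "same action over and over" symptom looks like exactly the kind of policy collapse the entropy term is supposed to prevent, so one of the things I've checked is the sign and coefficient of that term.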