Hi,
The policy gradient toolbox that Jan Peters provides contains three types of policies: the decision border policy, the ε-soft Gibbs policy, and the Gaussian policy. He considers two kinds of problems: discrete problems and linear-quadratic regulation problems. The Gaussian policy is applied to the linear-quadratic regulation problems. My question is: can the Gaussian policy also be used for discrete or nonlinear problems?
In my simulation, I need a policy whose parameter is one-dimensional (a single scalar). Which of these policies fits that requirement?
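To make the one-dimensional requirement concrete, here is a minimal sketch of what I have in mind, assuming a scalar state and a Gaussian policy whose mean is linear in that state. This is my own illustration and not code from the toolbox; the function names, the fixed standard deviation `sigma`, and the linear mean are all assumptions.

```python
import random


def sample_action(theta, state, sigma=0.5, rng=random):
    """Draw an action a ~ N(theta * state, sigma^2).

    theta is the single learned policy parameter (assumption: scalar state,
    linear mean, fixed exploration noise sigma).
    """
    return rng.gauss(theta * state, sigma)


def log_policy_gradient(theta, state, action, sigma=0.5):
    """Derivative of log pi(action | state) with respect to theta.

    For a Gaussian with mean theta * state and fixed sigma:
        d/dtheta log pi = (action - theta * state) * state / sigma^2
    """
    return (action - theta * state) * state / sigma ** 2


if __name__ == "__main__":
    theta = 0.1
    s = 2.0
    a = sample_action(theta, s)
    print("action:", a, "score:", log_policy_gradient(theta, s, a))
```

The score function above is the only quantity a likelihood-ratio policy gradient estimate would need from such a policy, so the parameter stays a single number.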
My other question is: how should the queueing problem be understood in reinforcement learning?
Could someone here give me a hint? Many thanks!
Best regards,
MRS FENG