Issues in the box2d pointmass example: how can I train the moving aircraft to reach any given point?


cuik...@163.com

Sep 5, 2016, 11:27:03 AM
to gps-help
I noticed that in the box2d pointmass example the moving aircraft can reach a fixed target point from any initial state. But when I alter the target point, training has to be carried out all over again. So, how can I train the moving aircraft to reach any given point? I tried rewriting the code to add many target points and to make the target point an input of the policy network, but unfortunately the result was bad. In the paper End-to-End Training of Deep Visuomotor Policies, the agent can adapt to different target object positions. How can I do this? Thanks.

cuik...@163.com

Sep 5, 2016, 11:36:58 PM
to gps-help
PS: I rewrote the box2d pointmass example using BADMM.

Chelsea Finn

Sep 6, 2016, 11:32:53 AM
to cuik...@163.com, gps-help
Hi,

I don't currently have the bandwidth to help figure out your problem, beyond a few recommendations:
- If the cost is decreasing appropriately but the policy isn't improving, then something is likely wrong with the objective, e.g. the cost isn't defined correctly.
- If the cost *isn't* decreasing, then there is probably an issue either with the analytic derivatives of the cost or with how the state is represented.

Note that, for the pointmass example, changing the position of the target relative to the initial position is the same as moving the initial position relative to the target (except in the reverse direction). So, algorithmically, changing the target is the same problem as changing the initial position.
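For concreteness, here is a tiny sketch of that equivalence (plain NumPy, my own illustration rather than anything from the GPS code):

import numpy as np

# Problem A: fixed start, moved target.  Problem B: moved start, fixed target.
start_a, target_a = np.array([0.0, 0.0]), np.array([3.0, 4.0])
start_b, target_b = np.array([-3.0, -4.0]), np.array([0.0, 0.0])

# Expressed relative to the target, both problems present the same state,
# so a controller that solves one solves the other.
rel_a = start_a - target_a   # [-3., -4.]
rel_b = start_b - target_b   # [-3., -4.]
assert np.allclose(rel_a, rel_b)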

Chelsea



Chelsea Finn

Sep 8, 2016, 9:41:42 PM
to cuik...@163.com, gps-help
Yes, the target needs to be part of the state representation in this case.
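A rough sketch of what that could look like (generic NumPy, not the actual GPS observation code; the helper name and layout are just for illustration):

import numpy as np

def build_observation(position, velocity, target):
    """Goal-conditioned observation: the current state plus the offset to the target.

    Feeding the target (or the offset to it) alongside the state lets one policy
    network generalize across targets instead of memorizing a single goal.
    """
    return np.concatenate([position, velocity, target - position])

obs = build_observation(
    position=np.array([1.0, 2.0]),
    velocity=np.array([0.0, 0.0]),
    target=np.array([4.0, 6.0]),
)
# The same network can then be trained on trajectories collected with many
# different targets, each training condition supplying its own target.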

Chelsea

On Tue, Sep 6, 2016 at 10:46 PM, <cuik...@163.com> wrote:
Thank you for your reply!
You wrote that "changing the target is the same problem as changing the initial position." But if I change the target point, I have to tell the policy network where the target is, so I made both the target point and the current state inputs of the policy network. For the changing-initial-position problem, making the current state the input of the policy network is enough. Am I right? Is it a good idea to make the target point an input of the policy network and let the aircraft follow the target point input?


Burak Çetinkaya

Sep 13, 2016, 10:46:37 AM
to gps-help
Hello, I also tried to tackle this question but failed to feed the target point information to the neural network successfully. When I tried to add the target state information to the initial states, I ran into some problems with the GMM and have unfortunately given up for now. I would be glad if you could share your experience on the matter if you make any progress.

Tao Chen

Jan 17, 2017, 5:06:21 AM
to gps-help
I have modified the code to train the multi-target case and got good results. The pointmass agent can run from any start point to any target state within a square region. My code is available here.
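Roughly, one way to set this up is to sample many start/target pairs inside the square region as training conditions and include the target in the policy input. A simplified sketch (illustrative bounds and names, not the exact code):

import numpy as np

def sample_condition(rng, low=-5.0, high=5.0):
    """One training condition: a random start point and a random target,
    both drawn from a square region (the bounds here are illustrative)."""
    start = rng.uniform(low, high, size=2)
    target = rng.uniform(low, high, size=2)
    return start, target

rng = np.random.default_rng(0)
conditions = [sample_condition(rng) for _ in range(8)]
# Each condition gets its own local trajectory optimizer, while the global policy
# is trained on samples from all conditions with the target included in its input,
# so it can generalize to new start/target pairs inside the region.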