Why does u equal K * x + k? I think it should be delta_u = K * delta_x + k

Tao Chen

Jan 10, 2017, 8:42:55 AM
to gps-help
The iLQR algorithm produces the mean of delta_u, which equals K * delta_x + k. Page 30 of End-to-End Training of Deep Visuomotor Policies also says that when the Taylor expansions are not centered around zero, the increments of x and u should be used. But https://github.com/cbfinn/gps/blob/master/python/gps/algorithm/policy/lin_gauss_policy.py#L41 computes the new u value from the x value alone. I think it should be new_u = old_u + K * (new_x - old_x) + k. Why is the code not implemented this way? Thanks.

Bin

Feb 19, 2017, 9:44:08 PM
to gps-help
Hi,
Have you figured it out? I am puzzled too.
                                   Bin
On Tuesday, January 10, 2017 at 9:42:55 PM UTC+8, Tao Chen wrote:

Bart Keulen

Aug 11, 2017, 5:20:40 PM
to gps-help
Hi,

Does somebody have the answer to this question?

Thanks.

On Tuesday, January 10, 2017 at 07:42:55 UTC-6, Tao Chen wrote:

Marvin Zhang

Aug 13, 2017, 8:10:15 PM
to gps-help
The quadratic cost approximation is shifted from being centered on the reference trajectory to being centered around zero in these lines of code: https://github.com/cbfinn/gps/blob/master/python/gps/algorithm/algorithm.py#L158-167.

The linear dynamics approximation is computed directly on x and u, rather than delta_x and delta_u.

Thus, because we have cost and dynamics approximations directly on x and u, we can work with linear-Gaussian policies and LQR updates that operate directly on x and u.
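
To make the point concrete, here is a minimal sketch of the underlying algebra, not code from the gps repository. It shows that a controller written in increment form, u = u_ref + K * (x - x_ref) + k_delta, gives the same action as one written directly on the state, u = K * x + k_abs, once the reference point is folded into the offset as k_abs = u_ref + k_delta - K * x_ref. All names (`delta_form`, `absolute_form`, `fold_offset`, `x_ref`, `u_ref`) and all numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical toy example; names and values are assumptions, not gps code.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def delta_form(K, k_delta, x_ref, u_ref, x):
    """Increment form: u = u_ref + K (x - x_ref) + k_delta."""
    dx = [xi - xr for xi, xr in zip(x, x_ref)]
    du = [a + b for a, b in zip(matvec(K, dx), k_delta)]
    return [ur + d for ur, d in zip(u_ref, du)]

def absolute_form(K, k_abs, x):
    """Absolute form: u = K x + k_abs, acting directly on the state."""
    return [a + b for a, b in zip(matvec(K, x), k_abs)]

def fold_offset(K, k_delta, x_ref, u_ref):
    """Absorb the reference point: k_abs = u_ref + k_delta - K x_ref."""
    Kx_ref = matvec(K, x_ref)
    return [ur + kd - kx for ur, kd, kx in zip(u_ref, k_delta, Kx_ref)]

K = [[0.5, -1.0], [2.0, 0.25]]   # feedback gain (2 actions, 2 states)
k_delta = [0.1, -0.2]            # offset in increment form
x_ref = [1.0, 2.0]               # reference state
u_ref = [0.3, 0.7]               # reference action
x = [1.5, 1.0]                   # state at which to evaluate the controller

k_abs = fold_offset(K, k_delta, x_ref, u_ref)
u_delta = delta_form(K, k_delta, x_ref, u_ref, x)
u_abs = absolute_form(K, k_abs, x)

# Both forms agree up to floating-point rounding.
print(u_delta, u_abs)
```

The two forms differ only in where the reference point is stored: in the increment form it is kept alongside the controller, while in the absolute form it is absorbed into the offset term, so only the current state is needed at execution time.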

Bart Keulen

Aug 14, 2017, 11:36:06 AM
to gps-help
So if I understand it correctly, the cost is approximated quadratically around zero like this:

[Inline image 1]
(End-to-End Training of Deep Visuomotor Policies, page 30)

Is this done just to simplify things? Otherwise the original [x0, u0] would have to be added everywhere.

It is nice that this results in a controller that needs only the state, not the previous trajectory.
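
As a sketch of why the reference point can be dropped, consider a generic second-order expansion of the cost; this is an illustration, not a transcription of the paper's equation. The symbols here are assumptions: z = [x; u] is the stacked state and action, \hat{z} = [\hat{x}; \hat{u}] is the reference point, c is the gradient, and C is the symmetric Hessian at \hat{z}. Expanding the increment form and collecting terms in z gives a quadratic directly in z:

```latex
\begin{aligned}
\ell(z) &\approx \hat{\ell} + c^\top (z - \hat{z})
          + \tfrac{1}{2} (z - \hat{z})^\top C \, (z - \hat{z}) \\
        &= \tfrac{1}{2} z^\top C z
          + z^\top \left( c - C \hat{z} \right)
          + \left( \hat{\ell} - c^\top \hat{z}
          + \tfrac{1}{2} \hat{z}^\top C \hat{z} \right)
\end{aligned}
```

Under these assumptions the quadratic term is unchanged, the linear term becomes c - C \hat{z}, and everything that depends only on \hat{z} collects into the constant, so the reference point no longer has to be carried through the rest of the computation.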


On Sunday, August 13, 2017 at 19:10:15 UTC-5, Marvin Zhang wrote: