Another problem about the code


alba cheng

Sep 24, 2016, 12:26:15 AM
to gps-help
I'm reading traj_opt_utils.py, and I'm confused by the computation of the KL divergence between the new and previous trajectory. It's this code:
kl_div[t] = max(
    0,
    -0.5 * mu_t.T.dot(M_new - M_prev).dot(mu_t) -
    mu_t.T.dot(v_new - v_prev) - c_new + c_prev -
    0.5 * np.sum(sigma_t * (M_new - M_prev)) - 0.5 * logdet_new +
    0.5 * logdet_prev
)

I can't figure out why the KL divergence can be computed like this. Any help would be appreciated!

biao sun

Oct 31, 2016, 7:56:18 AM
to gps-help
Hi,

Have you found the reason? I am puzzled too.

biao

On Saturday, September 24, 2016 at 12:26:15 PM UTC+8, alba cheng wrote:

Victor Barbarosh

Nov 10, 2016, 9:06:07 PM
to gps-help
I assume they are making sure the KL divergence at each [t] is non-negative, so that the final sum over the list of values isn't thrown off. But it's interesting to ask when we might get negative values...

Marvin Zhang

Nov 15, 2016, 12:00:50 AM
to gps-help
This is computing the KL divergence between Gaussians, i.e.:

D_KL(N(mu_0, Sigma_0) || N(mu_1, Sigma_1)) = 1/2 [ tr(Sigma_1^{-1} Sigma_0) + (mu_1 - mu_0)^T Sigma_1^{-1} (mu_1 - mu_0) - d + log det Sigma_1 - log det Sigma_0 ]

However, it looks a bit strange here because it is actually computing an expectation of the KL divergence over x, so it is an integral of D_KL(p(u|x) || \bar{p}(u|x)) p(x) dx. The derivation is relatively complicated.
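To sketch the key step (this is my own gloss, not from the repo: I'm assuming M, v, c are the quadratic, linear, and constant coefficients of each log-density, with the log-determinant kept as a separate term): write z = (x, u) and use the Gaussian identity E[z^T A z] = mu^T A mu + tr(A Sigma). In LaTeX:

```latex
% Each log-density in quadratic form (M, v, c collect its coefficients):
\log p(u \mid x) = -\tfrac{1}{2} z^\top M z - z^\top v - c
                   - \tfrac{1}{2}\log\det\Sigma + \mathrm{const},
\qquad z = (x, u).

% Taking the expectation under the new trajectory distribution,
% with \mu_t, \Sigma_t the moments of z, and using
% \mathbb{E}[z^\top A z] = \mu_t^\top A \mu_t + \operatorname{tr}(A \Sigma_t):
\mathbb{E}_{p_{\mathrm{new}}}\!\big[\log p_{\mathrm{new}} - \log p_{\mathrm{prev}}\big]
  = -\tfrac{1}{2}\,\mu_t^\top (M_{\mathrm{new}} - M_{\mathrm{prev}})\,\mu_t
    - \mu_t^\top (v_{\mathrm{new}} - v_{\mathrm{prev}})
    - c_{\mathrm{new}} + c_{\mathrm{prev}}
    - \tfrac{1}{2}\operatorname{tr}\!\big(\Sigma_t (M_{\mathrm{new}} - M_{\mathrm{prev}})\big)
    - \tfrac{1}{2}\log\det\Sigma_{\mathrm{new}}
    + \tfrac{1}{2}\log\det\Sigma_{\mathrm{prev}}.
```

This is term-for-term the expression in the code, with np.sum(sigma_t * (M_new - M_prev)) computing the trace (valid because both matrices are symmetric).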

The max with 0 is just for numerical stability, as Victor suggests.
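Here is a quick numerical sanity check (my own sketch, not code from the GPS repo: I'm assuming M = Sigma^{-1}, v = -Sigma^{-1} mu, c = 0.5 mu^T Sigma^{-1} mu with the log-determinants kept separate, which is my reading of the snippet, and I use a single unconditional Gaussian rather than the conditional p(u|x)):

```python
import numpy as np

rng = np.random.RandomState(0)
d = 3

def random_gaussian(rng, d):
    # Random mean and a well-conditioned SPD covariance.
    mu = rng.randn(d)
    a = rng.randn(d, d)
    sigma = a.dot(a.T) + d * np.eye(d)
    return mu, sigma

mu_new, sig_new = random_gaussian(rng, d)
mu_prev, sig_prev = random_gaussian(rng, d)

def quad_params(mu, sigma):
    # Coefficients of log N(u; mu, sigma) written as
    #   -0.5 u^T M u - u^T v - c - 0.5 log|Sigma| + const
    M = np.linalg.inv(sigma)
    v = -M.dot(mu)
    c = 0.5 * mu.dot(M).dot(mu)
    logdet = np.log(np.linalg.det(sigma))
    return M, v, c, logdet

M_new, v_new, c_new, logdet_new = quad_params(mu_new, sig_new)
M_prev, v_prev, c_prev, logdet_prev = quad_params(mu_prev, sig_prev)

# Moments under the NEW distribution play the role of mu_t / sigma_t.
mu_t, sigma_t = mu_new, sig_new

# The expression from traj_opt_utils.py:
kl_code = max(
    0,
    -0.5 * mu_t.T.dot(M_new - M_prev).dot(mu_t) -
    mu_t.T.dot(v_new - v_prev) - c_new + c_prev -
    0.5 * np.sum(sigma_t * (M_new - M_prev)) - 0.5 * logdet_new +
    0.5 * logdet_prev
)

# Textbook closed form for KL(N_new || N_prev):
diff = mu_prev - mu_new
P = np.linalg.inv(sig_prev)
kl_ref = 0.5 * (np.trace(P.dot(sig_new)) + diff.dot(P).dot(diff)
                - d + logdet_prev - logdet_new)

print(kl_code, kl_ref)  # the two should agree up to floating-point error
```

The two values match because E[z^T A z] = mu^T A mu + tr(A Sigma) turns the expected log-density difference into exactly the quadratic-plus-trace terms in the code.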

Marvin

alba cheng

Dec 17, 2016, 9:08:53 PM
to gps-help
Thanks for your reply. To be honest, I still do not understand... M_prev, v_prev, c_prev look like the second derivative of p(tau), the first derivative of p(tau), and a constant. I think they form a second-order approximation... Could you give me more details? It would be much appreciated!