Another problem about the code


alba cheng

Sep 24, 2016, 12:26:15 AM
to gps-help
I'm reading traj_opt_utils.py, and I'm confused by the computation of the KL divergence between the new and previous trajectory. It's this code:
kl_div[t] = max(
    0,
    -0.5 * mu_t.T.dot(M_new - M_prev).dot(mu_t) -
    mu_t.T.dot(v_new - v_prev) - c_new + c_prev -
    0.5 * np.sum(sigma_t * (M_new - M_prev)) - 0.5 * logdet_new +
    0.5 * logdet_prev
)

I can't figure out why the KL divergence can be computed like this. Any help would be appreciated!

biao sun

Oct 31, 2016, 7:56:18 AM
to gps-help
Hi,

Have you found the reason? I am puzzled too.

biao

On Saturday, September 24, 2016 at 12:26:15 PM UTC+8, alba cheng wrote:

Victor Barbarosh

Nov 10, 2016, 9:06:07 PM
to gps-help
I assume they are making sure the KL divergence at each [t] is non-negative, so that the final sum over the list of values isn't thrown off. But it's interesting to ask when we might get negative values...

Marvin Zhang

Nov 15, 2016, 12:00:50 AM
to gps-help
This is computing the KL divergence between Gaussians, i.e.:

D_KL(N(mu_0, Sigma_0) || N(mu_1, Sigma_1)) = 1/2 [ tr(Sigma_1^{-1} Sigma_0) + (mu_1 - mu_0)^T Sigma_1^{-1} (mu_1 - mu_0) - d + log det Sigma_1 - log det Sigma_0 ]

However, it looks a bit strange here because it is actually computing an expectation of the KL divergence over x, so it is an integral of D_KL(p(u|x) || \bar{p}(u|x)) p(x) dx. The derivation is relatively complicated.
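To sketch the key step (this is my own gloss, not from the repo: I'm assuming M, v, c are the quadratic, linear, and constant coefficients of each log-density, with the log-determinant kept as a separate term): write z = (x, u) and use the Gaussian identity E[z^T A z] = mu^T A mu + tr(A Sigma). In LaTeX:

```latex
% Each log-density in quadratic form (M, v, c collect its coefficients):
\log p(u \mid x) = -\tfrac{1}{2} z^\top M z - z^\top v - c
                   - \tfrac{1}{2}\log\det\Sigma + \mathrm{const},
\qquad z = (x, u).

% Taking the expectation under the new trajectory distribution,
% with \mu_t, \Sigma_t the moments of z, and using
% \mathbb{E}[z^\top A z] = \mu_t^\top A \mu_t + \operatorname{tr}(A \Sigma_t):
\mathbb{E}_{p_{\mathrm{new}}}\!\big[\log p_{\mathrm{new}} - \log p_{\mathrm{prev}}\big]
  = -\tfrac{1}{2}\,\mu_t^\top (M_{\mathrm{new}} - M_{\mathrm{prev}})\,\mu_t
    - \mu_t^\top (v_{\mathrm{new}} - v_{\mathrm{prev}})
    - c_{\mathrm{new}} + c_{\mathrm{prev}}
    - \tfrac{1}{2}\operatorname{tr}\!\big(\Sigma_t (M_{\mathrm{new}} - M_{\mathrm{prev}})\big)
    - \tfrac{1}{2}\log\det\Sigma_{\mathrm{new}}
    + \tfrac{1}{2}\log\det\Sigma_{\mathrm{prev}}.
```

This is term-for-term the expression in the code, with np.sum(sigma_t * (M_new - M_prev)) computing the trace (valid because both matrices are symmetric).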

The max with 0 is just for numerical stability, as Victor suggests.
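Here is a quick numerical sanity check (my own sketch, not code from the GPS repo: I'm assuming M = Sigma^{-1}, v = -Sigma^{-1} mu, c = 0.5 mu^T Sigma^{-1} mu with the log-determinants kept separate, which is my reading of the snippet, and I use a single unconditional Gaussian rather than the conditional p(u|x)):

```python
import numpy as np

rng = np.random.RandomState(0)
d = 3

def random_gaussian(rng, d):
    # Random mean and a well-conditioned SPD covariance.
    mu = rng.randn(d)
    a = rng.randn(d, d)
    sigma = a.dot(a.T) + d * np.eye(d)
    return mu, sigma

mu_new, sig_new = random_gaussian(rng, d)
mu_prev, sig_prev = random_gaussian(rng, d)

def quad_params(mu, sigma):
    # Coefficients of log N(u; mu, sigma) written as
    #   -0.5 u^T M u - u^T v - c - 0.5 log|Sigma| + const
    M = np.linalg.inv(sigma)
    v = -M.dot(mu)
    c = 0.5 * mu.dot(M).dot(mu)
    logdet = np.log(np.linalg.det(sigma))
    return M, v, c, logdet

M_new, v_new, c_new, logdet_new = quad_params(mu_new, sig_new)
M_prev, v_prev, c_prev, logdet_prev = quad_params(mu_prev, sig_prev)

# Moments under the NEW distribution play the role of mu_t / sigma_t.
mu_t, sigma_t = mu_new, sig_new

# The expression from traj_opt_utils.py:
kl_code = max(
    0,
    -0.5 * mu_t.T.dot(M_new - M_prev).dot(mu_t) -
    mu_t.T.dot(v_new - v_prev) - c_new + c_prev -
    0.5 * np.sum(sigma_t * (M_new - M_prev)) - 0.5 * logdet_new +
    0.5 * logdet_prev
)

# Textbook closed form for KL(N_new || N_prev):
diff = mu_prev - mu_new
P = np.linalg.inv(sig_prev)
kl_ref = 0.5 * (np.trace(P.dot(sig_new)) + diff.dot(P).dot(diff)
                - d + logdet_prev - logdet_new)

print(kl_code, kl_ref)  # the two should agree up to floating-point error
```

The two values match because E[z^T A z] = mu^T A mu + tr(A Sigma) turns the expected log-density difference into exactly the quadratic-plus-trace terms in the code.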

Marvin

alba cheng

Dec 17, 2016, 9:08:53 PM
to gps-help
Thanks for your reply. To be honest, I still do not understand... M_prev, v_prev, c_prev look like the second derivative of p(tau), the first derivative of p(tau), and a constant. I think they form a second-order approximation... Could you give me more details? It would be much appreciated!