Problem 5: isotropic covariance matrices


Joseph

Nov 5, 2007, 1:40:30 AM
to CS281A: Statistical Learning Theory (Fall 2007)
Hi, I wasn't sure what was meant by "Restrict the covariance matrices
to be isotropic: SIGMA = sigma^2 * I".

Do we use the same covariance matrix SIGMA (for the distribution over
y) for every state, or a different covariance matrix (say SIGMA_i) for
each state, e.g. SIGMA_1 = sigma_1^2 * I, SIGMA_2 = sigma_2^2 * I, ...?

gregory...@gmail.com

Nov 5, 2007, 6:19:55 AM
to CS281A: Statistical Learning Theory (Fall 2007)
You might as well use a different covariance matrix for each state,
so that state q has its own mu_q and sigma_q. Then again, even if your
sigmas are pretty far off, your estimates of the mu's and the
transition matrix will not be hurt much (for example, I'll bet anyone
a cookie that if you just set the sigmas to 1 and never update them,
your mu_q's and transition matrix will still converge to roughly what
you want).
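
To make the per-state setup concrete, here is a minimal sketch of an isotropic Gaussian log-density, assuming each state q carries its own mean mu_q and variance sigma_q^2. The function name, the two-state values, and the test point are hypothetical and chosen only for illustration; they are not from the problem set.

```python
import math

def isotropic_log_density(y, mu, sigma2):
    """Log of N(y; mu, sigma2 * I) for a d-dimensional point y."""
    d = len(y)
    sq_dist = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
    return -0.5 * d * math.log(2.0 * math.pi * sigma2) - sq_dist / (2.0 * sigma2)

# Hypothetical two-state example: state q has its own mu_q and sigma_q^2.
mus = [[0.0, 0.0], [3.0, 3.0]]
sigma2s = [1.0, 0.25]
y = [2.5, 3.0]

# One log-emission value per state for the observation y.
log_emissions = [isotropic_log_density(y, m, s2) for m, s2 in zip(mus, sigma2s)]
```

Because SIGMA_q = sigma_q^2 * I, the density depends on y only through the squared distance to mu_q, so no matrix inverse or determinant is needed.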

-g

Dai Bui

Nov 5, 2007, 4:45:50 PM
to CS281A: Statistical Learning Theory (Fall 2007)
I am not sure what the component density is.

For part b, should I calculate the expected log-likelihood or just the
log-likelihood?


Percy Liang

Nov 5, 2007, 11:31:31 PM
to cs281a...@googlegroups.com
> I am not sure about what is the component density?

A component density is the distribution p(data point | latent state).

> For part b, should I calculate the expected log-likelihood or just the
> log-likelihood?

You don't need to compute it; just optimize it. The optimum is
obtained by moment matching.
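
As one possible reading of "moment matching" here, the sketch below shows the standard weighted update for isotropic Gaussians, assuming the E-step has already produced posterior state probabilities gamma_t(q) for each data point. The function name and argument layout are hypothetical, and this is an illustration rather than the intended solution to part b.

```python
def m_step_isotropic(ys, gammas):
    """Weighted moment matching for isotropic Gaussian components.

    ys:     list of T data points, each a list of d floats
    gammas: list of T lists; gammas[t][q] is the posterior weight of state q at t
    Returns (mus, sigma2s), one mean vector and one variance per state.
    """
    T, d, K = len(ys), len(ys[0]), len(gammas[0])
    mus, sigma2s = [], []
    for q in range(K):
        weight = sum(gammas[t][q] for t in range(T))
        # Weighted mean of the data under state q.
        mu = [sum(gammas[t][q] * ys[t][j] for t in range(T)) / weight
              for j in range(d)]
        # Weighted squared distance to the mean, averaged over the d dimensions.
        sq = sum(gammas[t][q] * sum((ys[t][j] - mu[j]) ** 2 for j in range(d))
                 for t in range(T))
        mus.append(mu)
        sigma2s.append(sq / (d * weight))
    return mus, sigma2s
```

The division by d is what the isotropic restriction changes: a single variance is shared across all coordinates, so the squared error is pooled over dimensions as well as over data points.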

-Percy

billstron

Nov 6, 2007, 3:47:35 AM
to CS281A: Statistical Learning Theory (Fall 2007)
> You don't need to compute it - just optimize it, which is gotten by
> moment matching.

Do you mean you want the optimized hidden variables or are the optimal
model parameters found through EM enough?
