Maximum likelihood for t-distribution with gradient descent


Mathieu LE PROVOST

Apr 26, 2023, 11:47:16 AM
to Manopt
Hello, 

I am interested in the estimation of parameters of a multivariate t-distribution with a gradient descent optimization.

A multivariate t-distributed random variable $X \in \mathbb{R}^n$ with density $\pi_X$ is characterized by a mean vector $\mu \in \mathbb{R}^n$, a positive definite matrix $C \in \mathbb{R}^{n \times n}$ and a number of degrees of freedom $\nu > 1$; see https://en.wikipedia.org/wiki/Multivariate_t-distribution for more details. Its pdf is fairly close to that of a Gaussian distribution.

I am trying to estimate these three parameters given M samples $\{ x^1, \ldots, x^M\}$ by maximizing the log likelihood. Do you think that this problem can be tackled using matrix optimization tools?
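
For concreteness, here is the objective written out, following the density given on the Wikipedia page linked above, with $n$ the dimension, $C$ the scale matrix and $x^i$ the $i$-th sample. This is a sketch for reference and assumes the samples are independent:

$$\pi_X(x) = \frac{\Gamma\left(\frac{\nu + n}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) (\nu \pi)^{n/2} \det(C)^{1/2}} \left[ 1 + \frac{1}{\nu} (x - \mu)^\top C^{-1} (x - \mu) \right]^{-\frac{\nu + n}{2}}$$

$$\ell(\mu, C, \nu) = M \left[ \log \Gamma\left(\frac{\nu + n}{2}\right) - \log \Gamma\left(\frac{\nu}{2}\right) - \frac{n}{2} \log(\nu \pi) - \frac{1}{2} \log \det C \right] - \frac{\nu + n}{2} \sum_{i=1}^{M} \log\left( 1 + \frac{1}{\nu} (x^i - \mu)^\top C^{-1} (x^i - \mu) \right)$$

Maximizing $\ell$ is the same as minimizing $-\ell$, which is the form a minimization solver expects.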

As a warm-up problem, I will try to estimate the sample mean and covariance matrix of a Gaussian distribution from samples by a gradient descent approach.
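
One reason the Gaussian case makes a convenient warm-up is that its maximizer is known in closed form, so the output of a gradient descent can be checked against it. Writing $\Sigma$ for the covariance matrix and dropping the additive constant, the average negative log-likelihood is

$$f(\mu, \Sigma) = \frac{1}{2} \log \det \Sigma + \frac{1}{2M} \sum_{i=1}^{M} (x^i - \mu)^\top \Sigma^{-1} (x^i - \mu),$$

and, provided the matrix $\hat{\Sigma}$ below is positive definite, it is minimized at

$$\hat{\mu} = \frac{1}{M} \sum_{i=1}^{M} x^i, \qquad \hat{\Sigma} = \frac{1}{M} \sum_{i=1}^{M} (x^i - \hat{\mu})(x^i - \hat{\mu})^\top.$$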

Thank you for your help,

Mathieu 


Ronny Bergmann

Apr 27, 2023, 8:00:23 AM
to Manopt
Hi Mathieu,
just for completeness, this is a similar thread to what we discussed at https://discourse.julialang.org/t/maximum-likelihood-for-t-distribution-by-gradient-descent/97954

I think the Matlab toolbox can do this, much as I think Manopt.jl should be able to. I have not yet had the time to check, and if I did, I would first check in Julia, since I am faster in that than in Matlab.

Best
Ronny

Nicolas Boumal

May 1, 2023, 4:04:07 AM
to Manopt
Hello Mathieu,

Is $\nu$ an integer, or just a real number bigger than 1?

Ignoring $\nu$ for now, in Manopt you would write this as an optimization problem over a product manifold, as follows:

elements = struct();
elements.mu = euclideanfactory(n, 1);
elements.Sigma = sympositivedefinitefactory(n);
manifold = productmanifold(elements);

% You can get a sense for how this works by calling x = manifold.rand(); -- this creates a random point, as a structure with fields x.mu and x.Sigma.

problem.M = manifold;
problem.cost = @(x) ....... cost function to be minimized, with inputs x.mu and x.Sigma .... ;

Then you'd ideally define the gradient too. If you're lucky, automatic differentiation will work "out of the box":

problem = manoptAD(problem);

If not, it may just be a matter of rewriting the cost function somewhat differently, or you may need to implement the gradient by hand in problem.grad or problem.egrad (which is faster than AD anyway) -- we can discuss more if questions come up then.
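
As an illustration of the by-hand route, here is a minimal sketch for the Gaussian warm-up mentioned earlier in the thread, not a tested implementation. It assumes the samples are stored as the columns of an n-by-M matrix X (random placeholder data below), and the helper names gaussian_nll and gaussian_nll_egrad are made up for this example. The cost is the average negative log-likelihood with the additive constant dropped, and the Euclidean gradient is returned as a structure with the same fields as x. It also assumes Matlab R2016b or later, for implicit expansion and local functions in scripts.

```matlab
% Gaussian maximum likelihood over (mu, Sigma) with Manopt: a sketch.
n = 3; M = 500;
X = randn(n, M);                      % placeholder data, for illustration only

elements = struct();
elements.mu = euclideanfactory(n, 1);
elements.Sigma = sympositivedefinitefactory(n);

problem = struct();
problem.M = productmanifold(elements);
problem.cost  = @(x) gaussian_nll(x, X);
problem.egrad = @(x) gaussian_nll_egrad(x, X);

checkgradient(problem);               % numerical check of the gradient
xhat = steepestdescent(problem);      % estimates in xhat.mu and xhat.Sigma

% Closed-form maximum likelihood estimates, to compare with xhat.
mu_ml = mean(X, 2);
Sigma_ml = (X - mu_ml) * (X - mu_ml)' / M;

function f = gaussian_nll(x, X)
    % 0.5*log(det(Sigma)) + (0.5/M) * sum_i (x_i - mu)' * inv(Sigma) * (x_i - mu)
    R = X - x.mu;                     % residuals, one column per sample
    L = chol(x.Sigma, 'lower');       % Sigma = L*L'
    Z = L \ R;                        % whitened residuals
    f = sum(log(diag(L))) + 0.5 * sum(Z(:).^2) / size(X, 2);
end

function g = gaussian_nll_egrad(x, X)
    % Euclidean gradient of gaussian_nll with respect to mu and Sigma.
    R = X - x.mu;
    S = (R * R') / size(X, 2);        % second moment of the residuals about mu
    g.mu = -(x.Sigma \ mean(R, 2));
    A = x.Sigma \ S;
    G = 0.5 * (eye(size(A)) - A) / x.Sigma;
    g.Sigma = (G + G') / 2;           % symmetrize against rounding errors
end
```

If the gradient check looks right and xhat matches mu_ml and Sigma_ml, the same pattern (a cost function plus an egrad structure with fields mu and Sigma) should carry over to the t-distribution once its cost is written down.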

Best,
Nicolas