Estimation of a transition rate matrix

49 views

Skip to first unread message

Karim Douch

unread,

Apr 13, 2022, 7:40:20 AM4/13/22

to Manopt

Hello,

For my research on the hydrology of the Amazon basin I am developing a linear but relatively high-dimensional model. Eventually, I would like to estimate from data a transition matrix by minimizing the prediction error. Basically I want to solve:

Â = argmin || Y - AX ||_F where Y and X are matrices consisting of my data.

To enforce first principle such as mass conservation I need A to be a transition rate matrix, also know as intensity matrix that is:

# A is a real matrix

# A has zero column-sum

# off diagonal elements of A are larger or equal to 0. Maybe >0 could be OK for a start.

I have already used Manopt with the centered matrix factory (2 first #) and the results are quite good but still I would like to enforce the third constraint.

Since I have the basic background in math, I started studying your book on optimization on smooth manifolds (nicely written btw) but it might take a while before I am able to do something by myself, therefore I would like to have some first feedback on the problem: is the set even a smooth manifold (I guess it is) etc.

Note that this kind of matrix are used to describe continuous-time Markov chain and are therefore widely used. From the litterature I found only a few methods to estimate one by one the matrix entries so it is only relevant for low-dimensional matrices. Finally and sorry if I state the obvious, the exponential mapping of A gives a left-stochastic matrix, which manifold is already implemented in Manopt (multinomial manifold).

I would be very grateful for any hint! Thanks in advance.

Karim

Nicolas Boumal

unread,

Jun 7, 2022, 7:23:05 AM6/7/22

to Manopt

Hello Karim,

I believe your matrix A is a so-called "discrete Laplacian" for a graph, no? (It would be so if we also require $A$ to be symmetric; I'm not sure about the general case.)

At any rate, one way to handle this with Manopt is: optimize over a matrix B with all entries > 0 (the multinomial factory will do, as you pointed out), and use the conversation A = B - D, where D = diag(sum(B, 2)) (to vbe checked, but I think this fulfils your conditions). This is a linear change of variable, so nothing bad. You can just replace A with B-D in your cost function and optimize for B.

That said, your problem is convex. Even more precisely, you can write it as a convex quadratic objective (just square the Frobenius norm) to be minimized under linear constraints. To solve this type of structured problem, you may get better performance from Matlab's built-in quadratic programming solver, called quadprog (part of the official Matlab optimization toolbox).

I hope this helps,

Nicolas

Reply all

Reply to author

Forward

0 new messages