log(sum(exp(x)))

912 views
Skip to first unread message

Mikhail Katliar

unread,
Apr 13, 2016, 1:30:54 PM4/13/16
to CasADi
PROBLEM DESCRIPTION
------------------------------------
In maximum-likelihood parameter estimation, one typically minimizes negative log-likelihood of the data. In some important cases (for example, estimating parameters of a Gaussian mixture), the objective has the following form:
L = -log(sum(exp(x))) = -log(exp(x[1]) + exp(x[2]) + ... + exp(x[N])), where x[1], x[2], ..., x[N] << 0. Evaluating this expression directly is very likely to cause underflow for big negative x[_]. The common numerical trick for dealing with this is to scale the exponents so that the maximum exponent equals to 1. This leads the 'logsumexp' function:

logsumexp(x) = log(sum(exp(x - max(x)))) + max(x). 

There are two problems with the max() function:
1. It does not exist in CasADi
2. It is not-smooth, therefore not suitable for using in an objective function.

It is possible to avoid using max() by introducing slack variables. There will be one one slack variable and corresponding constraint for every logsumexp() in the objective function. If the objective is a sum of many logsumexps (= log of a product many sums of exponents), which is sometimes the case, the number of slack variables and constraints grows dramatically and kills the performance.

PROPOSED SOLUTION
---------------------------------
I propose to introduce logsumexp() and a new special function in CasADi. It is a smooth function, but the implementation should employ the numerical trick to avoid underflow. The derivative of logsumexp() can be nicely expresses via logsumexp() itself:

d logsumexp(x) / dx = exp(x - logsumexp(x))

Mikhail Katliar

unread,
Apr 13, 2016, 1:33:42 PM4/13/16
to CasADi
errata: "logsumexp() AND a new special function" -> "logsumexp() AS a new special function"

Joel Andersson

unread,
Apr 18, 2016, 11:26:50 AM4/18/16
to CasADi
Hi Mikhail and sorry for slow answer!

log-sum-exp is a "classic" convex function and it could indeed make sense to include it as a symbolic primitive in CasADi. We want to add more support for convex optimization in CasADi and this could be part of that. The max-function also makes sense and the fact that it's only piecewise smooth is not necessarily a huge problem.

We have a lot of development tasks in CasADi right now so it can take some time before we get to work on this.

Joel

Mikhail Katliar

unread,
Apr 24, 2016, 4:05:36 PM4/24/16
to CasADi
Hi Joel,

I am glad that you think adding this function to CasADi is useful. I understand you have a lot of development task. I can start working on implementing this function on my own, but I will probably need help from CasADi developers.

Regards,
Misha

понедельник, 18 апреля 2016 г., 17:26:50 UTC+2 пользователь Joel Andersson написал:
Reply all
Reply to author
Forward
0 new messages