Re: Can YALMIP solve the following optimization problem?

Message has been deleted

Johan Löfberg

Sep 22, 2015, 2:15:17 AM
to YALMIP
It is a convex problem, but you have to put it in a linear form to make it solvable by a linear semidefinite solver.

First replace max logdet(R'*(D(a)-a*a')*R+gamma*I) s.t. g(a)>=0 with max logdet(X) s.t. R'*(D(a)-a*a')*R+gamma*I >= X, g(a)>=0. Now perform a Schur complement to linearize in a:

n = 15;
R = randn(n);
gamma = 1;
a = sdpvar(n,1);
X = sdpvar(n);
D = diag(a);
Model = [a >= 0, sum(a) == 1, [R'*D*R+gamma*eye(n)-X R'*a;a'*R 1]>=0];
optimize(Model,-logdet(X))
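For reference, the LMI above encodes the Schur-complement step: with a positive corner block 1, [A b; b' 1] >= 0 holds iff A - b*b' >= 0, so the constraint is exactly R'*(D-a*a')*R + gamma*I >= X, which is linear in a and X. A quick numerical sanity check of that equivalence, sketched in Python/NumPy for illustration only (the thread's own code is MATLAB/YALMIP):

```python
import numpy as np

# Check the Schur-complement equivalence used above: for symmetric A and a
# vector b, [A b; b' 1] >= 0  <=>  A - b*b' >= 0 (the corner block 1 > 0).
rng = np.random.default_rng(0)
n = 5
ok = True
for _ in range(200):
    A = rng.standard_normal((n, n))
    A = A @ A.T + 0.1 * np.eye(n)             # random symmetric positive definite A
    b = rng.standard_normal(n) * rng.random() # random vector, varying scale
    M = np.block([[A, b[:, None]], [b[None, :], np.ones((1, 1))]])
    big_psd = np.linalg.eigvalsh(M).min() >= -1e-8
    small_psd = np.linalg.eigvalsh(A - np.outer(b, b)).min() >= -1e-8
    ok = ok and (big_psd == small_psd)        # PSD status must always agree
print(ok)
```

Both sides agree on every random draw, confirming the block matrix is a faithful linearization.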


Message has been deleted

Johan Löfberg

Sep 22, 2015, 1:01:21 PM
to YALMIP
I don't see any way of solving that

Johan Löfberg

Sep 23, 2015, 3:03:40 AM
to yal...@googlegroups.com
Actually, when plotting a slice of this function in 2d, I cannot find a single example where it is nonconvex. The structure of the model appears to ensure that the matrix has real eigenvalues and that the logdet is concave in alpha.

You can throw this model at fmincon using YALMIP. However, it will be extremely inefficient, as the derivatives of the eigenvalues are computed using finite-differences, and the general framework for optimizing over eigenvalues via nonlinear solvers in YALMIP is a terrible hack.

I first solve a slightly shifted model to avoid singularity problems in the logarithm when fmincon starts from scratch, and then use that solution as the initial point while removing the shift:
n = 10;
K = randn(n);K = K*K';   % random symmetric positive semidefinite data matrix
gamma = 1;
a = sdpvar(n,1);
Model = [a >= 0, sum(a) == 1];
% heavily shifted logarithm first (log(1+.) is finite at zero eigenvalues)
optimize(Model,-sum(log(1+eig((diag(a)-a*a')*K + gamma*eye(n)))))
% warm-start from the previous solution with a smaller shift
optimize(Model,-sum(log(.1+eig((diag(a)-a*a')*K + gamma*eye(n)))),sdpsettings('usex0',1))
% final warm-started run without any shift
optimize(Model,-sum(log(eig((diag(a)-a*a')*K + gamma*eye(n)))),sdpsettings('usex0',1))



zhe

Sep 23, 2015, 3:06:42 AM
to YALMIP
Do you refer to my first post or the second?

Johan Löfberg

Sep 23, 2015, 3:08:03 AM
to YALMIP
The second. The first model is trivially solved using SDPT3 etc. since it can be written as an LMI with a maxdet objective, by introducing a new variable and performing a Schur complement as I showed.

zhe

Sep 23, 2015, 3:15:39 AM
to YALMIP
It seems the two problems have approximately the same eigenvalues?

Johan Löfberg

Sep 23, 2015, 3:22:41 AM
to YALMIP
No, I don't see where you get that claim from (unless you think a 50% difference is approximately the same, as in the sample below):

>> n=10; a = rand(n,1);a = a/sum(a);
>> K = randn(n).^3;K = K*K';
>> R = chol(K);
>> [sort(eig((diag(a)-a*a')*K + .1*eye(n))) sort(eig(R'*(diag(a)-a*a')*R + .1*eye(n)))]

ans =

    0.1000    0.1000
    0.1024    0.1415
    0.2506    0.3885
    0.4412    0.4676
    0.8921    0.8035
    2.9605    1.6703
    7.9620    6.9650
   28.1475   23.8492
   47.4381   42.1951
   89.3010   65.3326


Johan Löfberg

Sep 23, 2015, 3:27:23 AM
to YALMIP
The two objectives have a strong statistical correlation though

J = [];
n = 10;
for i = 1:100000
    a = rand(n,1);a = a/sum(a);
    K = randn(n).^3;K = K*K';
    R = chol(K);
    J = [J sum(log((eig((diag(a)-a*a')*K + .1*eye(n)))))/sum(log(eig(R'*(diag(a)-a*a')*R + .1*eye(n))))];
end
hist(J,10000);
median(J)


Johan Löfberg

Sep 23, 2015, 3:56:48 AM
to yal...@googlegroups.com
Silly me: let K=R'*R and then multiply the matrix from the left with R and from the right with R^-1 (a similarity transform, so the eigenvalues are preserved), and it boils down to R*(D-a*a')*R'+gamma*I, which is symmetric, and you can solve it using a standard LMI solver with the method I previously gave.
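A numerical sanity check of that similarity argument, sketched in NumPy for illustration (the thread's own code is MATLAB/YALMIP): with K = R'*R, the nonsymmetric matrix (D-a*a')*K + gamma*I and the symmetric matrix R*(D-a*a')*R' + gamma*I have identical, real eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
G = rng.standard_normal((n, n))
K = G @ G.T + np.eye(n)          # symmetric positive definite data matrix
a = rng.random(n)
a /= a.sum()                     # a >= 0, sum(a) == 1
gamma = 0.1

R = np.linalg.cholesky(K).T      # K = R.T @ R, matching MATLAB's R = chol(K)
M = np.diag(a) - np.outer(a, a)

# Conjugating M @ K + gamma*I with R from the left and R^-1 from the right
# gives R @ M @ R.T + gamma*I, so both share the same (real) spectrum.
e_nonsym = np.sort(np.linalg.eigvals(M @ K + gamma * np.eye(n)).real)
e_sym = np.sort(np.linalg.eigvalsh(R @ M @ R.T + gamma * np.eye(n)))
same = np.allclose(e_nonsym, e_sym)
print(same)
```

This is why the symmetrized model optimizes exactly the same objective as the original nonsymmetric one.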

zhe

Sep 23, 2015, 3:58:11 AM
to YALMIP
When I run the code for the second problem, it seems something goes wrong:

Your initial point x0 is not between bounds lb and ub; FMINCON
shifted x0 to strictly satisfy the bounds.

                                            First-order      Norm of
 Iter F-count            f(x)  Feasibility   optimality         step
    0      43   -8.940697e-08    1.700e+00    5.428e-03
    1      86   -1.469335e+00    4.701e-01    1.266e-01    2.907e+00
    2      94   -1.482727e-01    5.284e-07    2.408e-01    2.803e+00

Local minimum possible. Constraints satisfied.

fmincon stopped because the gradient calculation is undefined. Constraints
are satisfied to within the selected value of the constraint tolerance.

<stopping criteria details>


ans =

    yalmiptime: 1.6602
    solvertime: 0.7308
          info: 'Successfully solved (FMINCON)'
       problem: 0


ans =

    yalmiptime: NaN
    solvertime: NaN
          info: 'Unknown problem in solver (Turn on 'debug' in sdpsettings) (Error using barrier (line...'
       problem: 9

Your initial point x0 is not between bounds lb and ub; FMINCON
shifted x0 to strictly satisfy the bounds.


ans =

    yalmiptime: NaN
    solvertime: NaN
          info: 'Unknown problem in solver (Turn on 'debug' in sdpsettings) (Error using barrier (line...'
       problem: 9

Johan Löfberg

Sep 23, 2015, 4:02:40 AM
to YALMIP
Yes, it is extremely unstable to solve it using this approach. You would have to tweak various parameters in the solver to ensure it returns a sufficiently feasible solution, which is then used in the second run, etc. Considering I just showed it can be convexified, simply forget this whole line of attack.

zhe

Sep 23, 2015, 4:38:22 AM
to YALMIP
K is computed from the data in my problem. You mean let R=K' and then my problem can be solved like this?

R = K';
gamma = 0.02;

Johan Löfberg

Sep 23, 2015, 4:43:59 AM
to YALMIP
No. You are given the symmetric positive definite matrix K and the model logdet((D-a*a')*K+...). Introduce K=R'*R, e.g., R=chol(K), and use logdet(R*(D-a*a')*R'+...), and then introduce the matrix X with R*(D-a*a')*R'+gamma*I >= X etc. as previously explained.

zhe

Sep 23, 2015, 5:16:32 AM
to YALMIP
Sorry to bother you again. I don't quite understand how to perform the Schur complement; is it right to implement your explanation like this:

R = chol(K);
gamma = 0.02;
a = sdpvar(n,1);
X = sdpvar(n);
D = diag(a);
Model = [a >= 0, sum(a) == 1, [R'*D*R+gamma*eye(n)-X R'*a;a'*R 1]>=0];
optimize(Model,-logdet(X))

Thanks again!

Johan Löfberg

Sep 23, 2015, 5:19:50 AM
to YALMIP
No, you are messing up the transposes. You have R*(D-a*a')*R'+gamma*I >= X, hence the Schur complement is [R*D*R'+gamma*I-X R*a;a'*R' 1]

Mark L. Stone

Sep 23, 2015, 8:11:37 AM
to YALMIP
Leaving aside complications due to repeated eigenvalues, would it be very difficult to calculate the gradient of an eigenvalue analytically in a future build of YALMIP, rather than by finite differences? With [V,D,W] = eig(X), the gradient of the ith eigenvalue = W(:,i) * V(:,i)' / (W(:,i)' * V(:,i)), modulo any typos, with the necessary workarounds for older versions of MATLAB which don't directly support left-eigenvector calculation in eig?

And similar calculations could be used for singular values, which among other things would allow differentiation of the two-norm. With [U,S,V] = svd(X), the gradient of the ith singular value = U(:,i) * V(:,i)'.
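Both formulas can be sanity-checked against finite differences. A NumPy sketch, illustrative only and assuming simple (non-repeated) eigenvalues and singular values: for a simple eigenvalue with right eigenvector v and left eigenvector w, the gradient is w*v'/(w'*v); for a simple singular value it is u*v'.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
i = 2          # check the gradient of the third eigenvalue/singular value
h = 1e-6       # finite-difference step

# --- gradient of a simple eigenvalue: d(lam_i)/dX = w*v'/(w'*v) ---
P = rng.standard_normal((n, n)) + 3 * np.eye(n)   # well-conditioned eigenbasis
lam = np.array([1.0, 2.0, 3.5, 5.0, 7.0])         # distinct real eigenvalues
X = P @ np.diag(lam) @ np.linalg.inv(P)

v = P[:, i]                          # right eigenvector for lam[i]
w = np.linalg.inv(P)[i, :]           # left (row) eigenvector for lam[i]
grad_eig = np.outer(w, v) / (w @ v)  # analytic gradient, entry (j,k) = w_j*v_k/(w.v)

fd_eig = np.zeros((n, n))
for j in range(n):
    for k in range(n):
        Xp = X.copy()
        Xp[j, k] += h
        fd_eig[j, k] = (np.sort(np.linalg.eigvals(Xp).real)[i] - lam[i]) / h
eig_ok = np.allclose(fd_eig, grad_eig, atol=1e-4)

# --- gradient of a simple singular value: d(sigma_i)/dX = u_i*v_i' ---
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
sig = np.array([9.0, 6.0, 4.0, 2.5, 1.0])         # distinct singular values
Y = Q1 @ np.diag(sig) @ Q2.T

U, s, Vh = np.linalg.svd(Y)
grad_sv = np.outer(U[:, i], Vh[i, :])             # analytic gradient u_i*v_i'

fd_sv = np.zeros((n, n))
for j in range(n):
    for k in range(n):
        Yp = Y.copy()
        Yp[j, k] += h
        fd_sv[j, k] = (np.linalg.svd(Yp, compute_uv=False)[i] - s[i]) / h
sv_ok = np.allclose(fd_sv, grad_sv, atol=1e-4)

print(eig_ok, sv_ok)
```

Both analytic gradients match the finite-difference ones, which is what would let a nonlinear solver avoid the n^2 extra eigendecompositions per iteration.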

Johan Löfberg

Sep 23, 2015, 8:14:05 AM
to YALMIP
Yes, YALMIP doesn't really have a good internal framework for R^m->R^n functions with both n,m>1.

Mark L. Stone

Sep 23, 2015, 8:24:25 AM
to YALMIP
You can't just vec your way out of that?

Johan Löfberg

Sep 23, 2015, 8:56:11 AM
to YALMIP
Nope, no simple fix. As it is now, if I recall correctly, I simply define n separate R^m->R functions, which is a mess and adds a lot of extra computation when computing derivatives etc. For that reason, no R^m->R^n functions are supported, except eig, which was only something I needed for a quick test.

zhe

Oct 19, 2015, 3:46:29 AM
to YALMIP
Dear Johan,

In my execution of the above problem, different values of gamma in the regularization term produce different solutions. I wonder if there is a way of choosing gamma optimally (e.g., by grid search):

gamma = 1e-4;
R = chol(K);

a = sdpvar(n,1);
X = sdpvar(n);
D = diag(a);
Model = [a >= 0, C >= a, sum(a) == 1, [R*D*R'+gamma*eye(n)-X R*a;a'*R' 1]>=0];
optimize(Model,-logdet(X))

Johan Löfberg

Oct 20, 2015, 2:08:21 AM
to YALMIP
Sure, you can do grid-search. You just have to specify what is "best"

zhe

Oct 20, 2015, 3:23:17 AM
to YALMIP
So when I identify the "best" gamma through grid-search, should I add the regularization term in the evaluation measure?

Johan Löfberg

Oct 20, 2015, 3:58:53 AM
to YALMIP
That's what I am saying you have to decide first. How do you judge if the solution obtained when you used gamma1 is better or worse than a solution computed using gamma2?

zhe

Oct 20, 2015, 4:16:10 AM
to YALMIP
I need to use the eigenvalues of the primal matrix (D-a*a')*K for further computation, and I wonder if I should use the eigenvalues of the regularized matrix ( (D-a*a')*K + gamma*I ) instead since the optimization procedure used the regularization term.
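One observation that may resolve this choice (an editorial note, illustrated with a NumPy sketch rather than the thread's MATLAB): adding gamma*I shifts every eigenvalue by exactly gamma, i.e. eig(M + gamma*I) = eig(M) + gamma, so the two sets of eigenvalues differ only by that constant shift.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
a = rng.random(n)
a /= a.sum()                     # a >= 0, sum(a) == 1
G = rng.standard_normal((n, n))
K = G @ G.T                      # symmetric positive semidefinite data matrix
gamma = 0.02

# eig(M + gamma*I) = eig(M) + gamma for any square M, so the eigenvalues of
# the unregularized matrix are the regularized ones minus gamma.
M = (np.diag(a) - np.outer(a, a)) @ K
e_plain = np.sort(np.linalg.eigvals(M).real)
e_reg = np.sort(np.linalg.eigvals(M + gamma * np.eye(n)).real)
shifted = np.allclose(e_reg, e_plain + gamma)
print(shifted)
```

So whichever spectrum the downstream computation needs, it can be recovered from the other by adding or subtracting gamma.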

zhe

Oct 20, 2015, 4:58:25 AM
to YALMIP
When gamma is too small, the optimization reports warnings.

Either 'Lack of progress (SDPT3-4)', with the solver output showing:

  linsysolve: Schur complement matrix not positive definite
  switch to LU factor. lu 23 ^25
  ...
  stop: progress is too slow
  stop: steps too short consecutively

or 'Numerical problems (SDPT3-4)':

  linsysolve: Schur complement matrix not positive definite
  switch to LU factor. lu 30 ^ 3
  stop: primal infeas has deteriorated too much, 7.1e-03

But the solutions are still reasonable for my problem. Can I just ignore those warnings? Or is there some other way to solve the ill-posed problem (det((diag(a)-a*a')*K) = 0) apart from the regularization technique?

Johan Löfberg

Oct 20, 2015, 4:58:48 AM
to YALMIP
You are the only one who can answer that