Suggestions for how to solve QCQP

Lukas Schroth

Mar 23, 2023, 8:43:33 AM
to YALMIP
6E03E176-AE45-4E6C-BCAD-DFD58A7DCBDC.jpeg

Dear Prof. Löfberg,

For my research, I am currently trying to solve the attached polynomial optimization problem (or CMDP after transformation), exactly or approximately, as efficiently as possible.

Small disclaimer: I am new to optimization. Currently, I am experimenting with SDP relaxations, but I do not know if SDP relaxations make sense in this setting or if something else would be more appropriate.

Do you have any suggestions on how to approach the problem (such that we obtain a tight upper bound within reasonable computation time, even for larger instances)?

Many thanks in advance!!

Lukas Schroth

Mar 23, 2023, 8:56:33 AM
to YALMIP
Sorry, I meant QCQP, not CMDP...

Johan Löfberg

Mar 23, 2023, 10:06:35 AM
to YALMIP
You basically write it as it is written: https://yalmip.github.io/tutorial/basics/

Or do you mean both s and Z are decision variables, so that you have a cubic objective? You talk about SDP relaxations, but this looks doomed as you have that cubic term.

Lukas Schroth

Mar 23, 2023, 11:09:37 AM
to YALMIP
Sorry for the confusion, I made it clearer in the attached picture what the decision variables are. I started out with the optimization problem after 'compact'. Then I reformulated the nuclear norm condition as an LMI. The objective is a polynomial in the optimization variables. I experimented with the moment relaxations that are preimplemented in YALMIP, but I asked myself if there is a better way to approximately solve the problem.

825CB6D8-445F-47D6-83BE-69EF888A115E.jpeg

Johan Löfberg

Mar 23, 2023, 11:53:27 AM
to YALMIP
For nuclear norm, you can simply write norm(Z,'nuclear')

It feels like the problem you have now, is the result from some earlier relaxation where you have introduced Z and the nuclear norm to try to circumvent something, so perhaps you should simply attack the original problem using a better method? If you apply a moment relaxation to this nonlinear SDP, it will be absolutely humongous in size as you have to lift the whole vector (s,Z(:),A1(:),A2(:)) up to quartic or something, don't you?
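To make the growth concrete: an order-d moment matrix over n variables is indexed by all monomials of degree at most d, so its side length is C(n+d, d). A small Python sketch (the dimensions below are made-up illustrations, not the actual problem sizes from this thread):

```python
from math import comb

def moment_matrix_size(n_vars: int, order: int) -> int:
    """Number of monomials of degree <= order in n_vars variables,
    which is the side length of the order-`order` moment matrix."""
    return comb(n_vars + order, order)

# Lifting a stacked vector (s, Z(:), A1(:), A2(:)) with, say,
# 2 + 4 + 4 + 4 = 14 entries already gives a 120 x 120 moment
# matrix at order 2, and it keeps growing combinatorially.
n = 14
print(moment_matrix_size(n, 1))  # 15
print(moment_matrix_size(n, 2))  # 120
```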

Lukas Schroth

Mar 23, 2023, 12:20:19 PM
to YALMIP
D13D72AB-7062-49F0-855F-FC72D4A7E958.jpeg

Yes, the problem size gets humongous. You're right about this; the original problem is in the attachments. I want to find the maximum spectral norm of a matrix that contains variables that can take one of two values.
I used the duality of the spectral norm and the nuclear norm to get a polynomial optimization problem, as I was not sure how to approach it otherwise. Do you have any suggestions for the original problem (or for getting an efficient upper bound for it)? Thanks again for your help.

Johan Löfberg

Mar 23, 2023, 12:28:03 PM
to YALMIP
Your own solver (or simply throwing it at a nonlinear solver and letting it use finite differences or what not) is probably the best approach (although for small problems you could actually try the develop branch of YALMIP, as it tries to solve nonlinear SDPs natively by some tricks via standard nonlinear solvers).

Or maybe just iterate over s and Z on the dualized form.

Johan Löfberg

Mar 23, 2023, 12:35:10 PM
to YALMIP
So constraints are always active, i.e. either at lower or upper bound

What dimensions are we talking?

Any properties of W, alpha beta?


Lukas Schroth

Mar 23, 2023, 12:44:59 PM
to YALMIP
To give a bit of context: the solution of the problem is an upper bound on the Lipschitz constant of a neural network, where the W matrices are the weight matrices. Alpha and beta bound the derivative of the activation function of the network: alpha <= phi'(x) <= beta for all x. Currently I am working with very small networks and random weights, but the end goal is to scale to larger networks. Typically beta = 1 and -1 <= alpha <= 1; often alpha = 1 holds (the ReLU activation function, for example). There is a trivial upper bound consisting of the product of the spectral norms of the individual weight matrices, but I need something tighter.
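As a side note, the trivial bound mentioned here (the product of the spectral norms of the weight matrices) is easy to compute; a minimal numpy sketch, assuming random toy weights and activation slopes bounded in [0, 1] (the beta = 1 case):

```python
import numpy as np

rng = np.random.default_rng(0)
# Three random "weight matrices" of a toy two-hidden-layer network
# (random stand-ins, not trained weights).
W0, W1, W2 = (rng.standard_normal((2, 2)) for _ in range(3))

# Trivial Lipschitz upper bound: product of spectral norms. This
# assumes the activation slopes are bounded by 1 (e.g. ReLU).
trivial_bound = (np.linalg.norm(W2, 2)
                 * np.linalg.norm(W1, 2)
                 * np.linalg.norm(W0, 2))

# It upper-bounds the norm of the end-to-end linear map for any
# diagonal slope matrices with entries in [0, 1]:
s1 = rng.uniform(0, 1, 2)
s2 = rng.uniform(0, 1, 2)
J = W2 @ np.diag(s2) @ W1 @ np.diag(s1) @ W0
assert np.linalg.norm(J, 2) <= trivial_bound + 1e-12
```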

Lukas Schroth

Mar 23, 2023, 12:50:38 PM
to YALMIP
When training networks, one could train the weight matrices to be sparse, which could potentially be exploited by a solver.

Johan Löfberg

Mar 23, 2023, 12:59:25 PM
to YALMIP
OK, that sounds like dimensions far beyond anything YALMIP and associated solvers could ever be applicable to.

Stick to 2x2 and a MISDP-representation is easily solved :-) 

W2 = randn(2);
W1 = randn(2);
W0 = randn(2);
alpha1 = [1;2];
alpha2 = [1;2];
beta1 = [3;4];
beta2 = [3;4];
% Binary choice per slope: each s_i sits at its lower or upper bound
S1 = diag(binvar(2,1));
S2 = diag(binvar(2,1));
s1 = S1*alpha1 + (eye(2)-S1)*beta1;
s2 = S2*alpha2 + (eye(2)-S2)*beta2;
J = W2*diag(s2)*W1*diag(s1)*W0;
% Spectral norm via duality: maximize trace(J'*Z) over the nuclear-norm ball
Z = sdpvar(2,2,'full');
[obj_linear,cuts] = binmodel(trace(J'*Z),[-1 <= Z(:) <= 1]);
Model = [norm(Z,'*') <= 1, cuts];
optimize(Model,-obj_linear,sdpsettings('debug',1))
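For what it's worth, at these dimensions the quantity the MISDP above targets can also be sanity-checked by brute force: since each slope sits at either its lower or upper bound, one can enumerate all corner combinations and take the largest spectral norm. A numpy sketch under the same toy data layout as the script above (random weights and the same alpha/beta vectors; "Jacobian" corner enumeration is exact here because the norm is convex in each slope entry):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
W0, W1, W2 = (rng.standard_normal((2, 2)) for _ in range(3))
alpha = np.array([1.0, 2.0])
beta = np.array([3.0, 4.0])

# Enumerate all 2^2 * 2^2 = 16 corner slope patterns; the spectral
# norm is convex in each slope entry, so the maximum over the box
# is attained at one of these corners.
best = 0.0
for bits1 in itertools.product([0, 1], repeat=2):
    for bits2 in itertools.product([0, 1], repeat=2):
        s1 = np.where(bits1, alpha, beta)
        s2 = np.where(bits2, alpha, beta)
        J = W2 @ np.diag(s2) @ W1 @ np.diag(s1) @ W0
        best = max(best, np.linalg.norm(J, 2))
print(best)
```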

Lukas Schroth

Mar 23, 2023, 1:25:48 PM
to YALMIP
That already helps a lot, thank you very much.

Lukas Schroth

Jun 21, 2023, 5:22:07 AM
to YALMIP
Good morning Prof. Löfberg,

I am currently continuing my research on Lipschitz constant estimation of NNs and have the presumption that an SDP bound on the Lipschitz constant might actually be Shor's SDP relaxation of the problem described above. Unfortunately, optimization is still not my main field of expertise, and I will need some time to learn about it before trying to prove this. Before spending hours, I want to verify it numerically first, to see if my guess is correct.

I am aware that the moment relaxation implemented in YALMIP for order 1 is actually the dual of Shor's relaxation and should lead to the same value. For order k >= 2, the moment relaxation works great for me and actually yields results that I can confirm to be correct. When using k = 1, depending on the specific problem, I either get an error that I need to choose a higher order (why is that? I thought all orders are possible for every problem) or the solver output is unknown and the obtained solution does not make any sense. In the second case I guess the problem is badly conditioned, or not even feasible but too borderline for the solver to detect it. If that's the case, it would mean that my presumption is wrong, as the SDP bound yields a finite result.

Is the code for the moment relaxations implemented in a way that only handles higher-order relaxations (excluding the dual of Shor's relaxation for k = 1)?

If it helps, I can of course provide a short example or log output.

I appreciate any help and many thanks in advance.

Johan Löfberg

Jun 21, 2023, 5:48:53 AM
to YALMIP
If it complains when the order is manually set to 1, it's because the thing you are analyzing has degree higher than 2*1.

It does not make sense to try to use a moment matrix from [1;x^1]*[1 x^1] when the polynomials contain x^3 and x^4. It must be using [1;x;x^2]*[1 x x^2] at least

Lukas Schroth

Jun 21, 2023, 6:03:43 AM
to YALMIP
OK, thanks. Then maybe I should implement Shor's relaxation in its primal form, as it should work for any polynomial optimization problem (or QCQP after transformation). Implementing Shor's relaxation for a QCQP is relatively easy. Do you have any tips on which of the existing structures in YALMIP I could use to transform my polynomial optimization problem into a QCQP?

Johan Löfberg

Jun 21, 2023, 6:19:03 AM
to YALMIP
Not sure why you would necessarily want to complicate matters by first converting the polynomial problem to a set of quadratic things.

There is no direct support for quadratizing a model, but maybe you can do it with some replace etc.

>> sdpvar x y z t
>> obj = x*y+z+z^3+x^4;
>> sdisplay(replace(obj,x,t^2))
z+z^3+y*t^2+t^8

In the develop branch, you can also replace monomials with other monomials

>> sdisplay(replace(obj,x^4,t^2))
z+x*y+z^3+t^2

or by going low-level
>> [A,B] = getexponentbase(x^2+5*y^3+6*y^7,[x y z])

A =

   (1,1)        2
   (2,2)        3
   (3,2)        7


B =

   (1,1)        1
   (1,2)        5
   (1,3)        6

Lukas Schroth

Jun 21, 2023, 8:09:39 AM
to YALMIP
OK, thank you, I'll have a closer look at the develop branch.

I want to quadratize the model to be able to apply Shor's SDP relaxation; I think it can only be applied to QCQPs (or I am not aware of how to use it for polynomial optimization problems).
I just want to verify my presumption that applying Shor's relaxation to the problem yields the same value as another SDP bound proposed in the literature. If I can verify this numerically, I will then work on proving it, relating two bounds that were independently proposed by different people in the literature.

Johan Löfberg

Jun 21, 2023, 8:39:18 AM
to YALMIP
The vanilla SDP relaxation of quadratics is just a very special case of general moments

For a quadratic polynomial x+x^2, you write it as minimize x+x^2 s.t. X = [1;x]*[1 x], equivalent to minimizing X(1,2)+X(2,2) s.t. X = [1;x]*[1 x], relaxed to X(1,2)+X(2,2) s.t. X >= 0, X(1,1) = 1.

For a quartic polynomial x+x^4, you write it as minimize x+x^4 s.t. X = [1;x;x^2]*[1 x x^2], equivalent to minimizing X(1,2)+X(3,3) s.t. X = [1;x;x^2]*[1 x x^2], relaxed to X(1,2)+X(3,3) s.t. X >= 0, X(1,1) = 1.
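Spelled out in the monomial basis v = (1, x, x^2), the quartic case reads as follows (with the normalization X_{11} = 1 and the consistency condition that the two entries representing x^2 agree; this is the standard moment-matrix construction, written out here for reference):

```latex
\min_{x}\; x + x^{4}
\;=\;
\min\; X_{12} + X_{33}
\quad\text{s.t.}\quad
X = v v^{\top},\quad
v = \begin{pmatrix} 1 \\ x \\ x^{2} \end{pmatrix},
```

which is relaxed (dropping the rank-one condition) to

```latex
\min\; X_{12} + X_{33}
\quad\text{s.t.}\quad
X \succeq 0,\quad
X_{11} = 1,\quad
X_{13} = X_{22}.
```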

Lukas Schroth

Jun 21, 2023, 2:11:12 PM
to YALMIP
OK, so I started some initial tests with an NN with a single hidden layer; there all terms are quadratic anyway, and I don't need to reformulate the problem. Using the moment relaxation with order = 1 should then yield the standard SDP relaxation of the problem. Unfortunately, the solution status is Unknown. Is there any way to tell if the problem is just unbounded/infeasible, or if something can be changed to make it work better numerically?

I attached a simplified version of the code I currently use, for reproduction.
Minimum_example_ShorsRelax.m

Johan Löfberg

Jun 21, 2023, 3:10:06 PM
to YALMIP
It is unbounded (i.e. a useless relaxation), which can be seen by removing the objective.

It is rather common that simple models have weak relaxations, and a standard trick is to add a redundant compactifying quadratic constraint, such as s'*s <= beta^2*length(s), which works here. Still a weak relaxation, though.
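The redundancy is elementary: if |s_i| <= beta for every component, then s'*s <= beta^2*length(s) automatically, so adding the ball constraint cuts off nothing feasible. A quick numeric spot check of that implication in Python (beta and the dimension here are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(2)
beta, n = 1.0, 5

# Sample points satisfying the elementwise bound |s_i| <= beta and
# verify they all satisfy the compactifying constraint s'*s <= beta^2 * n.
for _ in range(1000):
    s = rng.uniform(-beta, beta, n)
    assert s @ s <= beta**2 * n
```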

Lukas Schroth

Jun 21, 2023, 3:25:05 PM
to YALMIP
Thanks for the reply. Adding the additional redundant constraint yields unreasonably low values for me; despite the solution status being optimal, the solution isn't plausible. A bit confused now.

Johan Löfberg

Jun 21, 2023, 3:27:57 PM
to YALMIP
the *relaxation* is solved optimally

the relaxation is far from tight, though (clearly, as order 2 yields a larger objective), so the solution in the original space can be completely arbitrary


Lukas Schroth

Jun 21, 2023, 3:31:02 PM
to YALMIP
I also solved the original problem and observed that order 2 actually is tight and yields the exact solution for single layer networks. But maybe I have a misconception.

Lukas Schroth

Jun 21, 2023, 3:36:51 PM
to YALMIP
Updated the file to additionally include the true solution, computed by the function "Combettes", as a reference.
Minimum_example_ShorsRelax.m

Johan Löfberg

Jun 21, 2023, 3:43:47 PM
to YALMIP
About what? You've simply concluded that a standard SDP relaxation is not sufficient for a tight bound and solution extraction

Note that your code uses "value", which you cannot do when computing a relaxation. The computed relaxation objective is given by relaxvalue. value will simply return the objective computed with the linear part of the relaxation used to evaluate all the monomial terms; it will not use the relaxed monomials. value only makes sense when you know that the relaxation actually is tight. Otherwise, it will just return an objective based on an x which you don't know is feasible or infeasible (unless you check, of course), nor will you know if it is a lower bound or an upper bound, i.e. it is a completely useless computation.

Johan Löfberg

Jun 21, 2023, 3:49:40 PM
to YALMIP
You are just lucky here in that only one solution is extracted, meaning it will be assigned internally. What you generically must do is assign an extracted solution before using value etc.

[sol,xsol,moments] = solvemoment(F,obj,options,mom);
if ~isempty(xsol)
    assign(moments.x,xsol{1});
end

Lukas Schroth

Jun 21, 2023, 3:56:30 PM
to YALMIP
Understood now. That means my hypothesis is wrong. Thanks for your help.

Lukas Schroth

Jun 21, 2023, 4:48:49 PM
to YALMIP
Actually, I have to ask another question. If you don't have time for it, don't feel like you have to answer; you have already helped me a lot.
In the literature, somebody already found a relaxation of the original problem that leads to the SDP bound for a single-layer network.
I already tried generalizing this relaxation to multilayer networks.
I created a function for my approach; however, when I run it, the model generation fails:

"Error using sdpvar/model
Failed when trying to create a model for the "milpsubsref" operator

Error in expandrecursive (line 19)
    [properties,F_graph,arguments,fcn] = model(variable,method,options,allExtStruct(ext_index),w);

Error in expandmodel>expand (line 415)
            [F_expand,failure,cause] = expandrecursive(recover(variables(i)),F_expand,extendedvariables,monomtable,variabletype,where,level+1,options,method,[],'convex',allExtStruct,w);

Error in expandmodel (line 281)
                [F_expand,failure,cause] = expand(index_in_extended,variables,-sdpvar(Fconstraint),F_expand,extendedvariables,monomtable,variabletype,['constraint #' num2str(constraint)],0,options,method,[],allExtStructs,w);

Error in compileinterfacedata (line 116)
    [F,failure,cause,operators] = expandmodel(F,h,options);

Error in solvesdp (line 249)
[interfacedata,recoverdata,solver,diagnostic,F,Fremoved,ForiginalQuadratics] = compileinterfacedata(F,[],logdetStruct,h,options,0,solving_parametric);

Error in optimize (line 31)
[varargout{1:nargout}] = solvesdp(varargin{:});

Error in GeoLip (line 108)
optimize(F, obj);"

What could be causing this? I can of course provide more details, but my generalization is rather experimental and complicated, so it could also be that I just have an error in the code (but I couldn't resolve it myself until now).


Johan Löfberg

Jun 22, 2023, 1:42:15 AM
to YALMIP
You are doing something much much more complicated now, either by intent or by mistake.

I'm not sure exactly what causes the error (for that reproducible code is required) but you are using indexing with decision variables, i.e. you are doing something like this

sdpvar index x
y = x(index)


but in a more complicated fashion which causes a model so complex that it crashes internally
