
Hardness of non-differentiable optimization


Karlis Freivalds

Jul 6, 2001, 10:26:22 AM
Hi,

What is the complexity of obtaining a descent direction of a
nondifferentiable optimization problem? Is there a proven way to do
this in polynomial time?

In particular, the objective is a sum of absolute values of differences
of the variables shifted by constants:

F(X) = sum |Xi-Xj+Cij|


There are several methods to minimize such functions that work well in
practice, but I am interested in their theoretical behavior. So the
first obvious question is whether a descent direction can be calculated
in polynomial time.

Are there any results in this field?


Thank you very much,

Karlis.

Robin Becker

Jul 6, 2001, 11:20:52 AM
In article <e6221938.01070...@posting.google.com>, Karlis
Freivalds <kar...@my-deja.com> writes

This kind of problem is treatable using ordinary linear programming and
is therefore of polynomial complexity.

eg

if you need |x|

write constraints

x = x+ + x-
x+ >= 0
x- <= 0

and use x+ - x- in place of |x| (it can only overestimate |x|, which is
harmless when |x| is being minimized or bounded above).

note that some constraints on |x| will result in regions which are non-
convex.

so

|x| <= C

is usually ok, but

|x| >= C

is usually not
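
To make the split above concrete, here is a toy sketch (my own, not
Robin's code) that minimizes |x - 3| with scipy's LP solver; the box
-10 <= x <= 10 is just for illustration.

# Minimize |x - 3| as an LP via the split described above:
# write u = x - 3 as u = u_plus + u_minus with u_plus >= 0, u_minus <= 0,
# and minimize u_plus - u_minus, which plays the role of |u|.
from scipy.optimize import linprog

c = [1.0, -1.0]                      # objective: u_plus - u_minus
bounds = [(0, None), (None, 0)]      # u_plus >= 0, u_minus <= 0
# keep x = u_plus + u_minus + 3 inside the box [-10, 10]
A_ub = [[ 1.0,  1.0],                #  u_plus + u_minus  <=  7   (x <= 10)
        [-1.0, -1.0]]                # -(u_plus + u_minus) <= 13   (x >= -10)
b_ub = [7.0, 13.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
u_plus, u_minus = res.x
print("x* =", u_plus + u_minus + 3, " value =", res.fun)   # x* = 3, value 0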
--
Robin Becker

Henry Wolkowicz

Jul 7, 2001, 2:26:25 PM
In article <e6221938.01070...@posting.google.com>,

There are some 'negative results' in Shor's book on Nondifferentiable
Optimization (title???).
These results concern the inability to obtain a subgradient and also
negative results about superlinear convergence.

--
Prof. Henry Wolkowicz |Email hwolk...@uwaterloo.ca
Univ. of Waterloo |URL http://orion.math.uwaterloo.ca/~hwolkowi
Dept of Comb and Opt |Tel (519) 888-4567, x5589
Waterloo, Ont. CANADA N2L 3G1 |Fax (519) 725-5441

Karlis Freivalds

Jul 9, 2001, 5:26:36 AM
Of course I can transform |x|<C to -C<x<C easily. But what do I do if
there are several summands, e.g. |x|+|y|<C?
Moreover, in my case the absolute values appear in the objective. For
example, I need to minimize the following function
F(x) = |x|+|x-5|
How do I transform it to a linear programming problem?

Thanks,

Karlis.


Robin Becker <ro...@jessikat.fsnet.co.uk> wrote in message

Dmitri Chubarov

Jul 9, 2001, 6:12:45 AM
to Karlis Freivalds
In reply to the message by Karlis Freivalds

> Of course I can transform |x|<C to -C<x<C easily. But what do I do if
> there are several summands, e.g. |x|+|y|<C?
> Moreover, in my case the absolute values appear in the objective. For
> example, I need to minimize the following function
> F(x) = |x|+|x-5|
> How do I transform it to a linear programming problem?
>

|x| + |y| < C defines a convex polyhedron (a square), thus
it is represented by the following (as usual you have to
consider all four cases when dealing with absolute values)
-C < x + y < C,
-C < y - x < C

The second problem can be divided into a set of problems,
comparing the results afterwards:

x < 0:        F(x) = 5 - 2x  -> minimize
0 <= x <= 5:  F(x) = 5
x > 5:        F(x) = 2x - 5  -> minimize
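
A quick numerical check of this case analysis (my own snippet): since F
is piecewise linear and convex, a minimum is attained at a breakpoint,
so it suffices to compare F at x = 0 and x = 5.

F = lambda x: abs(x) + abs(x - 5)
print(min((F(x), x) for x in (0, 5)))   # (5, 0): F equals 5 on all of [0, 5]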

Regards,
Dmitri Chubarov

--
Dept. Computer Science,
University of Manchester
student

Robin Becker

Jul 9, 2001, 6:18:51 AM
In article <e6221938.0107...@posting.google.com>, Karlis
Freivalds <kar...@my-deja.com> writes

>Of course I can transform |x|<C to -C<x<C easily. But what do I do if
>there are several summands, e.g. |x|+|y|<C?
>Moreover, in my case the absolute values appear in the objective. For
>example, I need to minimize the following function
>F(x) = |x|+|x-5|
>How do I transform it to a linear programming problem?
>
>Thanks,
>
>Karlis.
...

introduce y=x-5, substitute x+ + x- for x and y+ + y- for y

so then you have

min x+ - x- + y+ - y-

y+, x+ >= 0
y-, x- <= 0

y+ + y- = x+ + x- - 5


when I run this I get the solution at x=5; in fact any x in the range
[0,5] is optimal.
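
For comparison, a sketch (my own, not the formulation above) of the
standard epigraph reformulation of min |x| + |x-5|, which any LP solver
handles and which likewise returns a point of the optimal interval [0, 5]:

from scipy.optimize import linprog

# variables: [x, t1, t2]; minimize t1 + t2 with t1 >= |x|, t2 >= |x - 5|
c = [0.0, 1.0, 1.0]
A_ub = [[ 1.0, -1.0,  0.0],   #  x - t1 <= 0
        [-1.0, -1.0,  0.0],   # -x - t1 <= 0
        [ 1.0,  0.0, -1.0],   #  x - t2 <= 5
        [-1.0,  0.0, -1.0]]   # -x - t2 <= -5
b_ub = [0.0, 0.0, 5.0, -5.0]
bounds = [(None, None), (0, None), (0, None)]   # x is free
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("x* =", res.x[0], " value =", res.fun)    # value 5, x* in [0, 5]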
--
Robin Becker

Alexander Tsyplakov

Jul 9, 2001, 1:28:12 PM
Consider using a smoothed version of the function and differentiable
optimization (this is akin to an interior-point method). It could be
much faster than solving the LP problem directly with the simplex
algorithm.

|x| ~= x * (2*F(x/h)-1)

F(x) is a smooth function growing from 0 to 1 with F(0)=1/2, e.g. F(x) =
1/(1+exp(-x));
h is the "smoothness" parameter.

It is not hard to take the first and second derivatives and use Newton's
method.
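
A quick numerical sketch of this idea (my own illustration; for brevity
the smoothed objective is handed to BFGS instead of a hand-coded Newton
iteration, and h = 0.1 is an arbitrary choice):

import numpy as np
from scipy.optimize import minimize

h = 0.1                                            # "smoothness" parameter
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
sabs = lambda t: t * (2.0 * sigmoid(t / h) - 1.0)  # smooth approximation of |t|

def g(x):                                          # smoothed |x| + |x - 5|
    return sabs(x[0]) + sabs(x[0] - 5.0)

res = minimize(g, x0=np.array([8.0]), method="BFGS")
print(res.x, res.fun)   # lands in (or very near) the flat optimal region [0, 5]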

-----------------------------
Alexander Tsyplakov
Novosibirsk State University
http://matrixer.narod.ru/

Karlis Freivalds <kar...@my-deja.com> wrote in
message:e6221938.01070...@posting.google.com...

Peter Spellucci

Jul 9, 2001, 12:53:37 PM
In article <e6221938.01070...@posting.google.com>,

kar...@my-deja.com (Karlis Freivalds) writes:
|> Hi,
|>
|> What is the complexity of obtaining a descent direction of a
|> nondifferentiable optimization problem? Is there a proven way to do
|> this in polynomial time?
|>
|> In particular the optimization problem is a sum of absolute values of
|> the variable differences and constants.
|>
|> F(X) = sum |Xi-Xj+Cij|
|>
in general your question cannot be answered, since computing the full
subdifferential (the set of all subgradients) is itself an optimization
problem. but for functions of the type you mention above computation of
a subgradient is easy and there are lots of papers dealing with it.
there is even public domain software (NDA.) see http://plato.la.asu.edu/guide.html
moreover, problems of linear l1-minimization and l_infty-minimization
can be turned into LP problems and hence be solved completely in
polynomial time.
write for example

- e_{i,j} <= Xi-Xj+Cij <= e_{i,j}
minimize sum e_{i,j} subject to these constraints.
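
A sketch of this reformulation with an off-the-shelf LP solver (my own
code on a small random instance; summing over pairs i < j is an
assumption):

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 4
C = rng.standard_normal((n, n))
pairs = [(i, j) for i in range(n) for j in range(n) if i < j]
m = len(pairs)

# variable vector: [X_0 .. X_{n-1}, e_0 .. e_{m-1}]; minimize the sum of the e's
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub, b_ub = [], []
for k, (i, j) in enumerate(pairs):
    row = np.zeros(n + m); row[i], row[j], row[n + k] = 1, -1, -1
    A_ub.append(row); b_ub.append(-C[i, j])          #  (Xi - Xj + Cij) <= e
    row = np.zeros(n + m); row[i], row[j], row[n + k] = -1, 1, -1
    A_ub.append(row); b_ub.append(C[i, j])           # -(Xi - Xj + Cij) <= e

bounds = [(None, None)] * n + [(0, None)] * m         # X free, e >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("optimal value:", res.fun, " X:", res.x[:n])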

hope that helps
peter

Karlis Freivalds

Jul 10, 2001, 4:38:25 AM
Thank you all for your responses. It is clear now that the convex case
is quite "easy". However I have a more general case to solve. I have a
constrained problem

min: F(X) = sum Cij|Xi -Xj|
subject to: sum |Xi-Xj| = 1
with Cij<=1

With an exact penalty function I can convert it to an unconstrained problem

min: sum Cij|Xi -Xj| + |1 - sum|Xi-Xj||

This is a non-convex problem, but I only need to find a local minimum.
I can prove that the problem can be solved by iteratively moving in a
direction of descent of this function, and that no more than a
quadratic number of steps is needed. That is why I asked how to obtain
a direction of descent. Note that a single subgradient is easy to
obtain, but it does not give a direction of descent.
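
To illustrate the point that a single (generalized) subgradient is
cheap, here is a sketch of reading one off for the penalized objective
(my own code; the convention sign(0) = 0 and summing over all ordered
pairs are assumptions):

import numpy as np

def one_subgradient(X, C):
    """One generalized-gradient element of
    sum_ij C_ij|X_i - X_j| + |1 - sum_ij |X_i - X_j||  at the point X."""
    n = len(X)
    g = np.zeros(n)
    total = 0.0
    for i in range(n):                     # term  sum_ij C_ij |X_i - X_j|
        for j in range(n):
            s = np.sign(X[i] - X[j])
            g[i] += C[i, j] * s
            g[j] -= C[i, j] * s
            total += abs(X[i] - X[j])
    outer = -np.sign(1.0 - total)          # term  |1 - sum_ij |X_i - X_j||
    for i in range(n):
        for j in range(n):
            s = np.sign(X[i] - X[j])
            g[i] += outer * s
            g[j] -= outer * s
    return g

Wherever the objective happens to be differentiable this is simply the
gradient; at the kinks it is only one element of the subdifferential,
which is exactly why it need not be a descent direction.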

As Alexander Tsyplakov mentioned, the problem can be attacked using
smoothing techniques, but I am interested in provably polynomial-time
solutions, and finding a local optimum of a smooth, non-convex function
does not seem to be doable in polynomial time.

Have you any ideas in this situation?

Thanks,

Karlis.

spel...@mathematik.tu-darmstadt.de (Peter Spellucci) wrote in message news:<9icnih$t51$2...@sun27.hrz.tu-darmstadt.de>...

Robin Becker

Jul 10, 2001, 5:43:05 AM
In article <e6221938.01071...@posting.google.com>, Karlis
Freivalds <kar...@my-deja.com> writes

>Thank you all for your responses. It is clear now that the convex case
>is quite "easy". However I have a more general case to solve. I have a
>constrained problem
>
>min: F(X) = sum Cij|Xi -Xj|
>subject to: sum |Xi-Xj| = 1
>with Cij<=1
>
>With exact penalty function I can convert it to the unconstrained case
>
>min: sum Cij|Xi -Xj| + |1 - sum|Xi-Xj||
>
...
replacing |Xi-Xj| by Yij>=0 we seem to have

min sum Cij Yij

Yij>=0

sum Yij = 1

which is a classical portfolio problem
--
Robin Becker

Karlis Freivalds

Jul 10, 2001, 9:23:15 AM
The solution to the reduced problem seems trivial. But the problem is
that Yij are not independent, so the reduction does not work.

Thanks,

Karlis.


Robin Becker <ro...@jessikat.fsnet.co.uk> wrote in message news:<qIVrFJAp...@jessikat.fsnet.co.uk>...

Robin Becker

Jul 10, 2001, 10:46:37 AM
In article <e6221938.01071...@posting.google.com>, Karlis
Freivalds <kar...@my-deja.com> writes
>The solution to the reduced problem seems trivial. But the problem is
>that Yij are not independent, so the reduction does not work.
>
>Thanks,
>
>Karlis.
>

my mistake, I didn't intend this to be taken as a solution; the boxoid
constraint sum |Xi-Xj| = 1 means that a classical solution is unlikely :(

clearly there can be at most n*(n-1)/2 Yij and we have only n free
variables.

However, I was thinking that if one of the |Xi-Xj| pairs is assumed to
be non-zero then as in the portfolio case the cash constraint can be
substituted out leaving an unconstrained piecewise-linear problem.

>Robin Becker <ro...@jessikat.fsnet.co.uk> wrote in message
>news:<qIVrFJAp4sS7EwBj...@jessikat.fsnet.co.uk>...


>> In article <e6221938.01071...@posting.google.com>, Karlis
>> Freivalds <kar...@my-deja.com> writes
>> >Thank you all for your responses. It is clear now that the convex case
>> >is quite "easy". However I have a more general case to solve. I have a
>> >constrained problem
>> >
>> >min: F(X) = sum Cij|Xi -Xj|
>> >subject to: sum |Xi-Xj| = 1
>> >with Cij<=1
>> >
>> >With exact penalty function I can convert it to the unconstrained case
>> >
>> >min: sum Cij|Xi -Xj| + |1 - sum|Xi-Xj||
>> >
>> ...
>> replacing |Xi-Xj| by Yij>=0 we seem to have
>>
>> min sum Cij Yij
>>
>> Yij>=0
>>
>> sum Yij = 1
>>
>> which is a classical portfolio problem

--
Robin Becker

Lutz Tautenhahn

Jul 10, 2001, 6:22:19 PM

"Karlis Freivalds" <kar...@my-deja.com> schrieb im Newsbeitrag
news:e6221938.01071...@posting.google.com...

> Thank you all for your responses. It is clear now that the convex case
> is quite "easy". However I have a more general case to solve. I have a
> constrained problem
>
> min: F(X) = sum Cij|Xi -Xj|
> subject to: sum |Xi-Xj| = 1
> with Cij<=1
>
> With exact penalty function I can convert it to the unconstrained case
>
> min: sum Cij|Xi -Xj| + |1 - sum|Xi-Xj||
>
> This is a non-convex problem, but I need to find a local minimum only.
> I can prove that the problem can be solved by iteratively going in the
> direction of descent of this function. Then no more than quadratic
> number of steps are needed. Therefore I asked how to obtain a
> direction of descent. Note that a single subgradient is easy to
> obtain, but it does not give a direction of descent.
>
> As Alexander Tsyplakov mentioned, the problem can be solved using
> smoothing techniques, but I am interested in provably polynomial time
> solutions. But finding a local optimum of a smooth, non-convex
> function does not seem to be polynomial.
>
> Have you any ideas in this situation?
>
> Thanks,
>
> Karlis.
>
Hi,
first a little problem transformation: the problem is equivalent to one with
Cij_new = Cij_old + Cji_old (for all i > j)
Cij_new = 0 (for all i <= j)
Assume you have found an optimal solution; then the translation
Xi_new = Xi_old + delta_X (for all i)
is also an optimal solution, so it makes sense to add a further
constraint in order to pin the solution down, for instance
sum Xi = 0
My conjecture is that there is always an optimal solution where all the
Xi's are located at only 2 points (X_min, X_max).
If you assign every Xi to either X_min or X_max, then the X_min and X_max
values and the value of the objective function can easily be calculated.
This yields 2^(n-1)-1 potential optimal configurations; a brute-force
version of this enumeration is sketched just below.
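
One way to write the enumeration down (just a sketch; the assumption
that both the objective and the constraint sum over pairs i > j, and the
use of Python rather than the JavaScript mentioned below, are not from
the original post):

import itertools
import numpy as np

def best_two_point(C):
    # C holds the transformed weights in its strict lower triangle (C[i][j], i > j)
    n = C.shape[0]
    best = (np.inf, None)
    for bits in itertools.product([0, 1], repeat=n):   # bits[i] = 1 means Xi = X_max
        k = sum(bits)
        if k == 0 or k == n:
            continue                       # all Xi equal: sum |Xi - Xj| = 1 infeasible
        d = 1.0 / (k * (n - k))            # gap X_max - X_min forced by the constraint
        cost = d * sum(C[i, j] for i in range(n) for j in range(i)
                       if bits[i] != bits[j])
        best = min(best, (cost, bits))
    return best

C = np.tril(np.random.rand(6, 6), k=-1)    # a random instance with Cij <= 1
print(best_two_point(C))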
I have tested this with the javascript at
www.tu-chemnitz.de/~luta/plt/hondo.html
so with a simple iterative heuristic you can get close to the optimum in
at most n(n+1)/2-1 iterations (it's only a first attempt); I think
something better can still be found (maybe I will try this in the next
few days).
Regards, Lutz Tautenhahn.


Karlis Freivalds

Jul 11, 2001, 4:36:55 AM
to Lutz Tautenhahn
Hi,


You are absolutely right about the transformations and the position of
the optimal solution. However, your iteration algorithm does not always
give the optimal solution. Try this input data generator
C[ii][jj]=Math.random()<0.5 ? 0: 1;
In some cases you will see differences between the enumeration algorithm
and the iterative one.

Your proposed algorithm is in general not bad. Given some solution (that
is, a division of the variables into two classes), you generate a better
solution by going in a direction of descent of the objective function.
However, your descent direction is found heuristically, and in some cases
the algorithm fails to find a descent direction even though one exists. I
wonder whether a better algorithm exists that always finds a descent
direction.

Thanks,

Karlis.

>
> min: F(X) = sum Cij|Xi -Xj|
> subject to: sum |Xi-Xj| = 1
> with Cij<=1
>
> With exact penalty function I can convert it to the unconstrained case
>
> min: sum Cij|Xi -Xj| + |1 - sum|Xi-Xj||

> Hi,

Lutz Tautenhahn

Jul 12, 2001, 6:33:50 PM
Eureka or heuristic?
This was a nice exercise, Karlis, thanks for that!
I spent some more time on the problem; here are the results:
As already posted last time, the optimum of

min: F(X) = sum Cij|Xi -Xj|
subject to: sum |Xi-Xj| = 1

will consist of Xi's which are from {X_min, X_max}.
Assume in the optimum there are M of the N Xi values
equal to X_max, then the following model would
yield the same optimal configuration:

min: F(Y) = sum Cij(Yi -Yj)^2
subject to: sum(Yi) = M
Yi = {0, 1}

So by calculating the optimum for all M={1, ..., N-1}, you
could find the optimum of the original problem.
In order to get rid of the Yi = {0, 1} constraints, we can
further transform the problem into this:

min: F(Y) = sum Cij(Yi -Yj)^2 - alpha * sum (Yi^2)
subject to: sum(Yi) = M

where alpha should be a big number, so that it forces the Yi to be
0 or 1. For alpha = 0 we have a convex problem, but unfortunately for
a big alpha this is not the case. For alpha = 0 the solution (infeasible
for the original 0/1 constraints) would be Yi = M/N.
The idea I used is to calculate the solution for a small alpha
(where F(Y) is still convex), determine the direction dYi, and then
set Yi = 1 for the M largest dYi values and the remaining Yi values to 0,
or set the M smallest Yi values to 0 and the remaining ones to 1
(as the problem is symmetric).
To get the (real) Yi values for alpha>0, we must solve a system of
linear equations which results from dF/dYk=0 and the
additional equation sum(Yi) = M.
dF/dYk = 0 = sum_i Cik(-2Yk)(Yi-Yk) + sum_j Ckj(2Yk)(Yk-Yj) - 2 alpha Yk,
which (after dividing by 2Yk) yields sum_i (Cik+Cki)(Yk-Yi) - alpha = 0.
We have now N+1 linear equations but only N variables. First
it seemed to me that one equation is redundant, so I removed one
of the equations arbitrarily (not the sum(Yi) = M one), but (surprisingly
for me) the results differed depending on which equation was removed
from the system. So maybe there is no redundant equation after all?
Another thing I tried was solving A^T A * Y = A^T AlphaM (A^T denotes
the transposed matrix A and AlphaM the vector of N rows of alpha
and one row M), as one does with overdetermined linear systems, but this
didn't work. Another possibility is that there is a bug in my code.
You can test the javascript code at
www.tu-chemnitz.de/~luta/plt/hondo1.html
So in the end I do N calculations, removing the i-th equation from the
system in turn (i = 1, ..., N). Without this step the method found the
optimum only in about 90% of the cases; with it, it has always found
the optimum (I have tested 100 instances with N=10 and Cij in {0, 1}).
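
A sketch of that "remove the i-th equation" step (this is only my
reading of the description above, written in Python rather than the
JavaScript on the page; the row assembly follows the equation
sum_i (Cik+Cki)(Yk-Yi) = alpha given earlier):

import numpy as np

def candidate_Ys(C, alpha, M):
    """Yield one solution Y per choice of which stationarity row to drop."""
    n = C.shape[0]
    S = C + C.T                                  # the weights C_ik + C_ki
    A = np.zeros((n + 1, n))
    b = np.zeros(n + 1)
    for k in range(n):                           # row k:  sum_i S_ki (Y_k - Y_i) = alpha
        A[k] = -S[k]
        A[k, k] += S[k].sum()
        b[k] = alpha
    A[n] = 1.0                                   # last row:  sum(Y_i) = M
    b[n] = M
    for drop in range(n):                        # drop one equation and solve the rest
        rows = [r for r in range(n + 1) if r != drop]
        try:
            yield np.linalg.solve(A[rows], b[rows])
        except np.linalg.LinAlgError:
            pass                                 # singular subsystem: skip it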
The method is polynomial: (N-1)*N times solving a system of N equations
in N variables, which is about N^3, so altogether about N^5 (which is
a lot). But at this point it is only a heuristic; it is not certain that
the method always finds the optimum. I probably will not spend more
time on this problem, but maybe one of the experts monitoring the group
could provide some more insights; I would be interested in that, too.
If someone is, or wants to become, a doctor of mathematics, then this
problem is the right mental warmup! Please let me know, Karlis, if you
finally get a poly-time algorithm that definitely always finds the
optimum.


"Karlis Freivalds" <Karlis.F...@mii.lu.lv> schrieb im Newsbeitrag
news:3B4C1027...@mii.lu.lv...

Lutz Tautenhahn

Jul 14, 2001, 3:55:57 PM
Hello again,

I must add some corrections to my last post:

> In order to get rid of the Yi = {0, 1}constraints, we can
> further transform the problem into this:
>
> min: F(Y) = sum Cij(Yi -Yj)^2 - alpha * sum (Yi^2)
> subject to: sum(Yi) = M

0 <= Yi <= 1 (I forgot to mention this constraint)

> The idea I used is, to calculate the solution for a small alpha
> (where F(Y) is still convex) and determine the direction of

> dYi ...
This was not correct: F(Y) is only convex for alpha=0.
If Yi=Yj for all i, j, then along this direction
F(Y) = - alpha * sum (Yi^2), which is not convex for any
alpha>0.

> We have now N+1 linear equations but only N variables. First

> it seemed to me that one equation is redundant ...
The first N equations are dF/dYi=0, which would find a
local minimum (if F were convex); at the least they should find
a saddle point. I still don't know why it doesn't work to solve
this NxN system. Maybe it is because there is not one single
saddle point but a whole curve of them. The last equation is the
hyperplane sum(Yi) = M, on which in general the saddle point is not
located. This is all still a bit unclear to me, but anyway we have a
method that seems to find the optimum, so why care about the
mathematical details? (*LOL*)

> The method is polynomial: (N-1)*N times solving a system of N equations
> with N variables, which is about N^3, so alltogether about N^5 (which is
> a lot).

It can be done in N^4 by taking advantage of the fact that in the loop
M=1..N-1 only one row of the NxN linear system changes
(the row for sum(Yi) = M).
This makes the code a bit more complicated, see
www.tu-chemnitz.de/~luta/plt/hondo2.html
but it's worth the effort, as you can solve larger instances in the same
time.
Regards, Lutz.

