Given a pair of (real) vectors, a and b,
each of length n.
Find X (a real scalar) that minimizes the
expression
Y = max(a+X*b) - min(a+X*b)
The length of these vectors will always be
at least 4, often on the order of 10-20, and
rarely in the hundreds.
Does anybody see a solution that will solve
this without the obvious optimization? It's
already inside an optimization that I cannot
avoid, so I'd like to avoid the need for a
nested optimization.
Thanks,
John
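To fix notation, here is the objective as a plain function; an illustrative Python sketch (the thread itself is MATLAB):

```python
# Y(X) = max(a + X*b) - min(a + X*b) for a scalar X.

def spread(X, a, b):
    vals = [ai + X * bi for ai, bi in zip(a, b)]
    return max(vals) - min(vals)

# Toy example: the two lines y = X and y = 1 - X give spread |2X - 1|,
# which is minimized (value 0) at X = 0.5.
a, b = [0.0, 1.0], [1.0, -1.0]
```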
Good question. What about
fsolve(@(x) max(a+b*x)-min(a+b*x),0)
Or do you consider fsolve to be optimization as well?
you can see your problem as:
1. having a set of K lines (equations y = a_k + b_k X)
2. choose an X so that the range between the upper and the lower lines
at X is as large as possible
MATLAB code to see that could be:
% generation of a and b
a=randn(10,1);b=randn(10,1);
% plot lines
figure
x=[-1, 2];
plot(x, [a+b*x(1), a+b*x(2)], 'linewidth',2)
% choose an X
X=rand(1)
% and plot the associated range
hold on; plot([X,X],[min(a+b*X),max(a+b*X)],':k')
As you can see, if the lines are not all parallel (i.e. the b's are
not all equal), the range will increase with abs(X)...
so if I understood your problem, you need a constraint on X, or take
it as large (positively or negatively) as possible....
It can be proved.
I thought you wanted to maximize your function; to minimize it, my
point of view is the same as dmitrey's.
As a first result you have that the minimum is necessarily at a
crossing of two lines.
My conjecture is that your minimum should be the crossing of the two
lines which have the largest difference of slopes... I have no time to
explore your problem more deeply, but it's the idea
John
Yes. This actually does arise from a line crossing
problem. I can explain the original problem, but
that might only confuse things.
It's also not as simple as the largest difference
in slope, since then a would not be relevant.
But a is relevant. Here is how I might solve it
as an optimization:
n = 10;
a = rand(n,1);
b = rand(n,1);
fun = @(X) max(a+X*b) - min(a+X*b);
ezplot(fun)
[x,fval] = fminbnd(fun,-20,20)
x =
1.1205
I need both x and the corresponding difference.
fval =
0.59384
Note that if you change a to ones(n,1), the
solution occurs at x = 0, as you would expect.
John
--
The best material model of a cat is another, or preferably the same, cat.
A. Rosenblueth, Philosophy of Science, 1945
Those who can't laugh at themselves leave the job to others.
Anonymous
then look at
N=5;
a=randn(N,1);
b=randn(N,1);
[mib,imib]=min(b);
[mab,imab]=max(b);
xx=(a(imab)-a(imib))/(mab-mib)
x=fsolve(@(x) max(a+b*x)-min(a+b*x),0)
(x-xx)
> then look at
> N=5;
> a=randn(N,1);
> b=randn(N,1);
> [mib,imib]=min(b);
> [mab,imab]=max(b);
> xx=(a(imab)-a(imib))/(mab-mib)
> x=fsolve(@(x) max(a+b*x)-min(a+b*x),0)
> (x-xx)
Nope. Fsolve is a root finder anyway.
But if I compare your solution to that
from fminbnd or ezplot, it's not the
correct one.
a = rand(10,1);
b = rand(10,1);
[mib,imib]=min(b);
[mab,imab]=max(b);
xx=(a(imab)-a(imib))/(mab-mib)
xx =
-0.89375
fun = @(X) max(a+X*b) - min(a+X*b);
ezplot(fun)
ezplot convinces me that the solution
for this random problem occurs a little
ways above 0.
[x,fval] = fminbnd(fun,-20,20)
x =
0.35121
fval =
0.78577
This is confirmed by fminbnd.
My gut tells me there is something
simple here that I've missed. If I look
at either piece of the "objective" (the max
or the min), each is a piecewise linear
function, so their difference is also
piecewise linear.
I think I can probably even convince
myself the solution is unique and
finite.
John
You are right, it is not a solution; one would have to be found in
several steps, as my gut tells me. Sorry for giving emotions instead of
solutions. The idea is: start from an arbitrary x, find the component
which gives the fastest decrease of max-min, and take the next x as
the position where the expected max-min = 0.
These are iterations, but I think it is an inherently iterative
problem; the number of iterations should be finite, though.
%% Crossing problem
%% Data
n = 5;
a = rand(n,1);
b = rand(n,1);
f_crit = @(X) max(a+X*b) - min(a+X*b);
F_crit = @(X) max(repmat(a,1,length(X)) + repmat(X',length(a),1).*repmat(b,1,length(X))) ...
    - min(repmat(a,1,length(X)) + repmat(X',length(a),1).*repmat(b,1,length(X)));
%% Crossings
% Crossing between (a,b) and (A,B) is
%
% $$x=-\,\frac{a-A}{b-B}$$
%
crossings = -(repmat(a,1,n)-repmat(a',n,1)) ./ ...
    (repmat(b,1,n)-repmat(b',n,1));
triu_id = cumsum(eye(n),2)-eye(n);
crossings = sort(crossings(logical(triu_id)));
crossingy = F_crit(crossings)';
[Y,idX] = min(crossingy);
X = crossings(idX);
%% Plot
domainx = [min(crossings), max(crossings)];
domainy = [a+b*domainx(1), a+b*domainx(2)];
miax = [min(domainy(:)), max(domainy(:))]';
allx = [repmat(crossings,1,2), repmat(nan,length(crossings),1)]';
ally = repmat([miax; NaN],1,length(crossings));
figure('color', [1 1 1]);
a1=subplot(2,1,1);
plot(domainx, domainy, 'linewidth',2);
hold on
plot(allx(:),ally(:),':k');
hold off
title('All lines');
a2=subplot(2,1,2);
plot(crossings,crossingy,'-', 'linewidth',2);
hold on
plot(X, Y, 'or', 'Markerfacecolor', [1 0 0]);
plot(crossings,crossingy,'.k');
hold off
text(X,Y,sprintf(' X=%5.2f, Y=%5.2f', X, Y), 'rotation', 90)
title('Minimum');
linkaxes([a1,a2],'x');
I don't think the value of x that produces a minimum has to be unique,
although that may be the case for your data.
Consider,
a = [1 5 2 -1 3 4];
b = [4 3 3 3 3 2];
Define,
fun = @(x) max(a+x*b) - min(a+x*b);
For a large interval,
[mx,fval] = fminbnd(fun,-20,20)
mx =
2.0607
fval =
6.0000
For a smaller interval,
[mx,fval] = fminbnd(fun,-5,12)
mx =
-0.9868
fval =
6
Both answers are correct since fun==6 for all values of x between -1
and 4.
Ken
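Ken's example is easy to verify numerically; an illustrative Python translation (not from the thread) that samples the spread across [-1, 4]:

```python
# Ken's data: fun(x) should equal 6 for every x between -1 and 4.
a = [1, 5, 2, -1, 3, 4]
b = [4, 3, 3, 3, 3, 2]

def fun(x):
    vals = [ai + x * bi for ai, bi in zip(a, b)]
    return max(vals) - min(vals)

# sample 21 equally spaced points on [-1, 4]
flat = [fun(-1 + 5 * k / 20) for k in range(21)]
```

Every sample comes out 6, while points just outside [-1, 4] give larger values, so any x in the interval is a valid minimizer.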
Agreed. I believe I came up with an
example where the minimizer was not
unique. I'll accept any minimizer then.
John
If the data are not specially prepared, the probability of a nonunique
solution is zero. Therefore the question remains which algorithm
needs the fewest iterations. For instance, fsolve takes about 40
iterations, even if a and b are as long as 100:
a=randn(100,1);
b=randn(100,1);
[X,FVAL,EXITFLAG,OUTPUT]=fsolve(@(x) max(a+b*x)-min(a+b*x),0)
OUTPUT.iterations
Anyway, assuming the above is true, find all the possible points of
intersection. Afterwards, evaluate your function at those points
and pick a minimum.
Seems inefficient, but it doesn't use optimization.
The solution to the problem will coincide with the point of
intersection of some two (or more) lines from the given set. This
reduces your set of possible solutions from infinite to finite and
hopefully manageable.
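This enumerate-all-crossings idea can be sketched in Python (an illustrative translation, not code from the thread; parallel pairs are skipped and the helper name is ours):

```python
# Evaluate the spread at every pairwise crossing and keep the smallest.

def min_spread_by_crossings(a, b):
    f = lambda x: (max(ai + x * bi for ai, bi in zip(a, b))
                   - min(ai + x * bi for ai, bi in zip(a, b)))
    best = None
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            if b[i] == b[j]:          # parallel lines never cross
                continue
            x = (a[j] - a[i]) / (b[i] - b[j])  # a_i + b_i x = a_j + b_j x
            cand = (f(x), x)
            if best is None or cand < best:
                best = cand
    return best   # (Y, X) at the best crossing, or None if all parallel
```

For the three lines y = 2x, y = 1, and y = 2 - x this returns a spread of 1/3 at x = 2/3. It costs O(n^2) crossings times O(n) per evaluation, which is harmless at the n of 10-20 mentioned above.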
You are right, but if we consider finding a crossing as an iteration,
you need N^2 iterations, and the question is whether the number of
iterations (operations) can be substantially decreased. This may be
important with vectors 1000 or 10000 long. Otherwise, finding
crossings is easily vectorized and your solution may be the fastest
one. It has the advantage of generality: if you find two equal minimal
values at different crossings, then the solution is given by all
points between the crossings; otherwise you have a unique solution.
One last remark: not all crossings can be found, because some of the
straight lines can be parallel. Therefore, before you apply your
algorithm, the data vectors should be correspondingly preprocessed.
------------------------
John, finding all the crossing points of n different lines is,
unfortunately, an order n^2 operation. It has occurred to me that an order
n operation is possible if the b vector is first sorted: "[b,p] =
sort(b); a = a(p);". (Well, actually this step is order n*log(n) - all
the rest can be order n.)
To simplify matters I will just talk about Y1 = max(a+b*X). (Handling
min(a+b*X) will be analogous.) Y1 will be a piece-wise linear function of
X which is concave upwards. It cannot have more than n linear pieces and
can easily have less than that. Starting with the lines 1 and 2 (with
a(1),b(1) and a(2),b(2)), we find x1 at their crossing point. Next find
x2 at the crossing point of lines 2 and 3. There are two possibilities,
x1 <= x2 or x2 < x1. In the latter case, line 2 will definitely not be
present in Y1 because of the sorting done on b. (Draw a picture and you
will see.) So we discard both x1 and x2 and find the crossing point of
lines 1 and 3, which now becomes the new x1. This procedure continues
through the ordered (by slope) lines. If at any point an x value is
obtained that lies prior to a set of previous x-values, xp,...,xq, then
all of these, the new x and xp,...,xq, are discarded and replaced by the
single crossing point of this latest line and the line that formed the
first of the pair for xp. When we finish up, we have a sequence, x1, x2,
..., xN of ascending crossing points which form the dividing points for
the linear pieces present in Y1. This process is order n because crossing
points corresponding to each new added line will only be computed at most
twice, and only discarded at most one time.
When these x values are combined with similar results for Y2 =
min(a+b*X), what you have is a set of points that can be run through 'min'
to find the true minimum value of Y, again an order n process.
You will notice I have run roughshod over the problem of parallel
lines. Probably parallel but different lines are okay because they only
create infinities, but identical lines would lead to NaNs and should be
coalesced into single lines.
Roger Stafford
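Roger's sorted-sweep idea can be sketched in Python (illustrative only, not his code; it assumes at least two lines and all slopes distinct, and the function names are ours): sort by slope, sweep once for the breakpoints of the upper envelope, reuse the same sweep on the negated lines for the lower envelope, then take the smallest spread over all breakpoints.

```python
# Illustrative sketch of the sorted-sweep envelope idea.

def envelope_breaks(a, b):
    """x-values where the upper envelope max(a + b*x) switches lines."""
    order = sorted(range(len(a)), key=lambda i: b[i])  # ascending slope
    a = [a[i] for i in order]
    b = [b[i] for i in order]
    p = [0]   # indices of lines currently on the envelope
    x = []    # crossings between consecutive envelope lines (ascending)
    for k in range(1, len(a)):
        while True:
            # crossing of line p[-1] and line k
            t = (a[p[-1]] - a[k]) / (b[k] - b[p[-1]])
            if x and t < x[-1]:
                p.pop()   # line p[-1] never attains the maximum
                x.pop()
            else:
                p.append(k)
                x.append(t)
                break
    return x

def min_range(a, b):
    """Minimize max(a+b*x) - min(a+b*x); the minimum lies on a breakpoint."""
    f = lambda x: (max(ai + bi * x for ai, bi in zip(a, b))
                   - min(ai + bi * x for ai, bi in zip(a, b)))
    breaks = envelope_breaks(a, b) + \
             envelope_breaks([-ai for ai in a], [-bi for bi in b])
    return min((f(x), x) for x in breaks)
```

For the three lines y = 2x, y = 1, and y = 2 - x this gives a minimum spread of 1/3 at x = 2/3. Each line's breakpoint is computed at most twice and discarded at most once, so the sweep itself is order n after the sort, matching the claim above.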
See the code I posted at 5:02pm: the N^2 iterations are computed by
LAPACK and others, not by MATLAB; consequently it's far
faster than MATLAB loops...
The proof that the minimum is at a crossing is straightforward:
1- the criterion is a piecewise affine function
2- the slope of this function changes only at crossings
3- the degenerate cases (constant slopes) can be set aside
4- between two crossings, the criterion is minimized by going to the
lower of the two crossings
It seems more interesting to find whether there is a local criterion to
find immediately the two lines whose crossing will generate the
minimum.
To read more about those questions:
My PhD thesis has a part about piecewise affine functions (but in
french), you can find one of my papers here: http://www.lehalle.free.fr/clog/down/ICANN98.pdf
A more complete book is Arrangements of Hyperplanes (by Peter Orlik
and Hiroaki Terao)
http://www.amazon.com/Arrangements-Hyperplanes-Grundlehren-mathematischen-Wissenschaften/dp/3540552596/
I agree with that, so I posted it (and my solution) at
literateprograms.org:
http://en.literateprograms.org/Combinatorial_Geometry_in_MATLAB#Minimal_range_of_an_arrangement_of_lines
feel free to comment (use the discussion page or directly the
article).
regards
Nice analysis, Roger. Indeed, one can work only with max if you write
x=fsolve(@(x) max(a+b*x)+max(-a-b*x),0)
or hopefully faster
x=fsolve(@(x) mimafun(x,a,b),0)
function y = mimafun(x,a,b)
% a, b = columns
V = a + b*x;
V(:,2) = -V(:,1);
y = sum(max(V));
The final referee will be the solution time. On my computer fsolve needs
about 1 sec for vectors 100000 long. It would be interesting to see how
much faster your algorithm is, or that of cal.
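The max-only rewrite above rests on the identity max(v) + max(-v) = max(v) - min(v); a quick illustrative check in Python (not from the thread):

```python
# max(v) + max(-v) equals max(v) - min(v) for any vector v = a + b*x.

def spread_two_max(x, a, b):
    v = [ai + bi * x for ai, bi in zip(a, b)]
    return max(v) + max(-vi for vi in v)

def spread_max_min(x, a, b):
    v = [ai + bi * x for ai, bi in zip(a, b)]
    return max(v) - min(v)

a = [1.0, 5.0, 2.0, -1.0]
b = [4.0, 3.0, 3.0, 2.0]
checks = [abs(spread_two_max(x, a, b) - spread_max_min(x, a, b))
          for x in (-2.0, 0.0, 1.5, 3.0)]
```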
I made tests:
- my method needs a lot of memory because I build NxN matrices!
- vassoli's method is the fastest
Another victory of numerical maths against the proof-oriented approach ;{)}
It's one of the ML methods, but Roger is right: piecewise linearity
gives a chance of a fast and exact solution. Let's wait and see what
speed his algo shows.
----------------------
John, in that previous article, at one point I said, "If at any point,
an x value is obtained that lies prior to a set of previous x-values,
xp,...,xq, then all of these, the new x and xp,...,xq, are discarded and
replaced by the single crossing point of this latest line and the line
that formed the first of the pair for xp." That was wrong and I should
have said, "if at any point an x value is obtained that lies prior to the
previous value xq, then x and xq are discarded and replaced by a new
crossing point x between the new line and the line that formed the first
of the pair for xq. This backing up should continue until x is not less
than a prior xq, at which point the process can move forward again to the
next line." I know that is pretty vague wordage, so I include a code
segment to illustrate the idea. This only applies to the Y1 = max(a+b*X)
part of your function. The code for the min part would look almost
identical except for a reversal of the inequality at "t<x(j)".
[b,q] = sort(b); a = a(q); % Sort ascending b's
p = zeros(n,1); % p will have line indices
x = zeros(n,1); % x will store x values
j = 1; % Set up the first point
p(1) = 1;
for k = 2:n % Work through the remaining lines one at a time
while 1 % Only exit while loop with a break
t = (a(k)-a(p(j)))/(b(p(j))-b(k)); % Calculate a crossing point
if (j>1) && (t<x(j)) % If in the wrong order, back
j = j-1; % up and discard a line
else % Otherwise add
j = j+1; % another point and line
x(j) = t;
p(j) = k;
break % Break out of the while-loop for the next line
end
end
end
x(1) = []; % Delete unused element
Vector x will have all the x-coordinates of points dividing up the
"curve", Y1 = max(a+b*X), into linear segments.
In spite of the correction, this remains an order n algorithm except for
the sort, because each move backward constitutes the removal of a line, so
there can't be more than n-2 steps backwards and n-1 steps forward.
I have totally ignored the problem of possibly parallel lines.
Roger Stafford
It is definitely a point for loops against repmat...
--------------------
Oops! I forgot to chop off the other end of x. The line x(1) = [];
should be replaced by:
x = x(2:j); % Remove unused portions of x
instead.
Roger Stafford
I have an impression that max-min is a kind of a piecewise linear
"parabola", i.e. if for scalar x f(x,a,b)=max(a+bx)-min(a+bx), then
for any equidistant vector x, all(diff(f(x,a,b),2)>=-eps) is true.
If so, the solution is adjacent to a line with minimal slope, defined
by b(i)-b(j), i~=j. If we sort these slopes, smallest absolute value
and its two neighbors from left and right should always contain the
exact answer. Just like an extremum of a smooth function.
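The "parabola" half of this impression is in fact a theorem: a max of affine functions is convex, the negative of a min of affine functions is convex, so their sum f is convex. An illustrative Python spot-check (data chosen arbitrarily by us) of the second differences on an equidistant grid:

```python
# Spot-check of convexity: second differences of f on an equidistant
# grid should be nonnegative.
a = [0.3, -1.2, 0.7, 2.0, -0.5]
b = [1.1, -0.4, 0.9, -2.2, 0.6]

def f(x):
    v = [ai + bi * x for ai, bi in zip(a, b)]
    return max(v) - min(v)

xs = [-3 + 6 * k / 200 for k in range(201)]
ys = [f(x) for x in xs]
d2 = [ys[k + 1] - 2 * ys[k] + ys[k - 1] for k in range(1, len(ys) - 1)]
```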
> I have an impression that max-min is a kind of a piecewise linear
> "parabola", i.e. if for scalar x f(x,a,b)=max(a+bx)-min(a+bx), then
> for any equidistant vector x, all(diff(f(x,a,b),2)>=-eps) is true.
> If so, the solution is adjacent to a line with minimal slope, defined
> by b(i)-b(j), i~=j. If we sort these slopes, smallest absolute value
> and its two neighbors from left and right should always contain the
> exact answer. Just like an extremum of a smooth function.
--------------
It is true that your f(x,a,b)=max(a+bx)-min(a+bx) defines a concave
upwards function, which you could describe loosely as a "parabola", and
its minimum should therefore lie somewhere in the vicinity of a zero
"slope". The difficulty with that concept is that its slope at a value x
is determined by the particular i for which a(i)+b(i)*x is the maximum for
a+b*x, and the particular j for which a(j)+b(j)*x is minimum for a+b*x.
The slope would indeed be b(i)-b(j) at that point, but it's not a simple
matter of sorting the 'b' quantities. The choice of what i and j are as x
varies is something that can fluctuate wildly throughout the various lines
in a manner that is dependent on the 'a' quantities also, so this kind of
"slope" is not easy to determine.
In the algorithm I suggested, there is a way in theory to determine the
desired minimum which corresponds in a sense to what you are describing.
If the x result in the algorithm for max(a+b*x) is called x1 and the other
x result for min(a+b*x) is called x2, the x2 could be flipped and the
(x1,x2) values interleaved. The corresponding 'p' quantities could serve
to identify the lines which cross at these x values, and thereby determine
an effective slope for f(x,a,b). From this one could indeed determine the
precise point where a minimum occurs in the max-min, along the lines you
suggest - that is, looking for where this effective slope changes sign.
Unfortunately, the effort required to do the interleaving properly seems
more than would be required to do a simple min operation on all the values
of f(x,a,b) for the combined values of x1 and x2.
Roger Stafford
> John, in that previous article, at one point I said, "If at any point,
> an x value is obtained that lies prior to a set of previous x-values,
> xp,...,xq, then all of these, the new x and xp,...,xq, are discarded and
> replaced by the single crossing point of this latest line and the line
> that formed the first of the pair for xp." That was wrong and I should
> have said, "if at any point an x value is obtained that lies prior to the
> previous value xq, then x and xq are discarded and replaced by a new
> crossing point x between the new line and the line that formed the first
> of the pair for xq. This backing up should continue until x is not less
> than a prior xq, at which point the process can move forward again to the
> next line." I know that is pretty vague wordage, so I include a code
> segment to illustrate the idea. This only applies to the Y1 = max(a+b*X)
> part of your function. The code for the min part would look almost
> identical except for a reversal of the inequality at "t<x(j)".
>
> [b,q] = sort(b); a = a(q); % Sort ascending b's
> p = zeros(n,1); % p will have line indices
> x = zeros(n,1); % x will store x values
> j = 1; % Set up the first point
> p(1) = 1;
> for k = 2:n % Work through the remaining lines one at a time
> while 1 % Only exit while loop with a break
> t = (a(k)-a(p(j)))/(b(p(j))-b(k)); % Calculate a crossing point
To handle parallel lines properly, this ought to be:
t = (a(p(j))-a(k))/(b(k)-b(p(j)));
> if (j>1) && (t<x(j)) % If in the wrong order, back
> j = j-1; % up and discard a line
> else % Otherwise add
> j = j+1; % another point and line
> x(j) = t;
> p(j) = k;
> break % Break out of the while-loop for the next line
> end
> end
> end
> x(1) = []; % Delete unused element
Wrong!!! It should be:
x = x(2:j);
> Vector x will have all the x-coordinates of points dividing up the
> "curve", Y1 = max(a+b*X), into linear segments.
>
> In spite of the correction, this remains an order n algorithm except for
> the sort, because each move backward constitutes the removal of a line, so
> there can't be more than n-2 steps backwards and n-1 steps forward.
>
> I have totally ignored the problem of possibly parallel lines.
>
> Roger Stafford
----------------------
Hello John,
On the algorithm I suggested, here's a follow-up with regard to parallel
lines. If two lines are identical, one of them should be removed in
advance to avoid NaNs. However, if they are parallel but distinct, I am
convinced the algorithm still works, given one key modification. It was
just my luck to write the expression for a crossing point the wrong way.
I wrote:
t = (a(k)-a(p(j)))/(b(p(j))-b(k));
whereas it should be:
t = (a(p(j))-a(k))/(b(k)-b(p(j)));
For non-parallel lines these are of course equal, but in MATLAB if
b(k)-b(p(j)) is zero for parallel lines, the sign of the resulting
infinity would be computed as if the denominator were plus zero, which is in
sign of a(p(j))-a(k) to be in agreement with the sign of the infinity. So
the lower line would be discarded for the max(a+b*x) processing and the
upper line discarded for the min(a+b*x) stuff. Of course you will also
want to have disabled the system warning about division by zero. I don't
know how to do that with my 4a version but no doubt there is a way on
advanced versions.
Roger Stafford
Roger, the search for i and j is much simplified by the fact that the
x-values are not important. The corresponding algorithm is just a toy
for you: a double loop in i and j<i, within which you keep 3
slopes (or (i,j) index pairs): the slope with minimal absolute value and
its two neighbors from below and above. After you have found the 3 lines,
their crossings give you a single x-minimum or an interval of x-values.
It would be a great surprise if this search could be vectorized. But
it is not much different from sorting; just keep the two
neighbors additionally.
If n >> 1 there are simple means to avoid sorting all slopes,
but John did not wish for that; he warned that n is not very big, up to
a hundred. In this way we do not need to look for all crossings and
corresponding function values. Except for the special sorting, the
solution is found in 3 steps.
Roger, maybe I am wrong in thinking that EVERY possible slope really
participates in f(x,a,b). Then you are right, one has to look for
minimum iterations, i.e. there is no simple solution. I shall check it
tomorrow. Sorry for the trouble; it was just an unusual and therefore
interesting question, thanks to John.
Yes Roger, you are right. I have checked: only a quickly decreasing
proportion of all possible slopes participates in the final curve max
- min. Were it possible to quickly find these real slopes, then my
"simple idea" would work. If John has found a simple solution
(without iterations or loops), I would look at it with great interest.