Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Finding the closest number from a list

6 views
Skip to first unread message

Jacob Rome

unread,
Feb 20, 2003, 5:23:29 AM2/20/03
to
Hi,

I have a seemingly simple problem. I want to find the element in a
list that is closest to a number I specify. In my case, I have a list
of about 30,000 numbers (y[i]). I want to find the closest match for
each of about 10,000 other numbers (call them x[i]). Obviously, speed
is important. I've sorted the large list, and right now I'm going
through each y[i] from lowest to highest and testing it to see if x[i]
is less than that value. This takes about .1 seconds for each x[i].

I'm wondering if anyone has had a similar problem, and if there is a
better function built-in to Mathematica. Alternatetively, I could
build my own. I've just recently realized that I could also reduce the
search time considerably if I sort the x[i] list as well, and only
start my search from where I last left off. Any ideas on which
approach would be more efficient? Thanks.

Robert Knapp

unread,
Feb 21, 2003, 4:04:24 AM2/21/03
to

Here is an approach suggested by Stephen Wolfram which is very efficient:
The idea is to use a first order interpolation to find an estimate to the
closest index, then use actual comparison to determine which element is actually
closest. The reason this approach is fast is because InterpolatingFunctions in
Mathematica use a fast method to determine where an input lies in a data list.

This function finds which element is actually closest given an upper and lower
bound on the index into the sorted list:

In[1]:=
GetClosest[lower_, upper_, item_, slist_] := Module[{dl, du, pos},
pos = If[upper <= lower,
upper,
{dl, du} = Abs[item - slist[[{lower, upper}]]];
If[du > dl,
upper,
lower]
];
slist[[pos]]
]

This function returns the list of closest matches in the List list given a List
things of items you want to find the closest matches to. I have not included it
in the argument checking, but this will only work if both lists are real or
integer -- for complex numbers a more complicated scheme would have to be concocted.

In[2]:=
ClosestMatches[list_?VectorQ, things_?VectorQ] :=
Module[{sorted, lint, first, last, n = Length[list], lower, upper},
sorted = Sort[list];
first = Max[Min[things], First[sorted]] - 1;
last = Min[Max[things], Last[sorted]] + 1;
lint =
ListInterpolation[
N[Join[{1},Range[n],{n}]],{Join[{first},sorted,{last}]},
InterpolationOrder\[Rule]1];
estimates = lint[things];
lower = Floor[estimates];
upper = Ceiling[estimates];
MapThread[GetClosest[##, sorted]&, {lower, upper, things}]
]

This sets up a random list and a random set of things.

In[3]:=
n = 30000;
list = Table[Random[],{n}];
things = Table[Random[],{10000}];

This finds the matches

In[6]:=
Timing[ClosestMatches[list, things];]

Out[6]=
{2.51 Second,Null}

The timing was done on a 750MhZ PentiumIII system.

Sseziwa Mukasa

unread,
Feb 21, 2003, 4:14:36 AM2/21/03
to

On Thursday, February 20, 2003, at 05:14 AM, Jacob Rome wrote:

> Hi,
>
> I have a seemingly simple problem. I want to find the element in a
> list that is closest to a number I specify. In my case, I have a list
> of about 30,000 numbers (y[i]). I want to find the closest match for
> each of about 10,000 other numbers (call them x[i]). Obviously, speed
> is important. I've sorted the large list, and right now I'm going
> through each y[i] from lowest to highest and testing it to see if x[i]
> is less than that value. This takes about .1 seconds for each x[i].
>

Are you doing a binary search? I tested searching for items from a
list using a binary search routine as follows:

binarysearch[item_, lst_] := Block[{left = 1, middle =
Quotient[Length[lst], \
2], right =
Length[lst]}, If[
middle == 0, Return[1]]; While[True, If[lst[[middle]] ==
item, Return[middle], If[right - left ÷ 1,
If[lst[[right]] - item < item - lst[[left]], Return[right], Return[
left]], If[
item > lst[[middle]], left = middle,
right = middle]]]; middle = Quotient[left + right, 2]]]

and using map to search for 10000 values (ie
binarysearch[#,lst]&/@items) and it takes about 6 seconds on a 1GHz G4
with a list of 30000 items.

> I'm wondering if anyone has had a similar problem, and if there is a
> better function built-in to Mathematica. Alternatetively, I could
> build my own. I've just recently realized that I could also reduce the
> search time considerably if I sort the x[i] list as well, and only
> start my search from where I last left off. Any ideas on which
> approach would be more efficient? Thanks.
>
>

Since a binary search scales as log n, dropping items from the list to
be searched may actually be slower due to overhead in manipulating the
list. To get a factor of 2 improvement in the search you need to drop
the list size to sqrt(n). This code takes about 11 seconds on my
machine:

Block[{mylist = lst, pos, dropped = 0}, Table[pos =
binarysearch[items[[i]], mylist]; mylist = Drop[mylist, pos -
1]; dropped +=
pos - 1, {i, Length[items]}]]

I tried a version using Mathematica's Select function which is far too
slow to consider for your problem and a version in which I precomputed
intervals in order to short circuit the search. That approach is
negligibly faster than the straightforward search and 3 times more
memory intensive.

Regards

Ssezi

Wolf, Hartmut

unread,
Feb 21, 2003, 4:20:49 AM2/21/03
to

>-----Original Message-----
>From: jr...@mail.com [mailto:jr...@mail.com]
>Sent: Thursday, February 20, 2003 11:14 AM
>To: math...@smc.vnet.net
>Subject: Finding the closest number from a list
>
[...]

>I have a seemingly simple problem. I want to find the element in a
>list that is closest to a number I specify. In my case, I have a list
>of about 30,000 numbers (y[i]). I want to find the closest match for
>each of about 10,000 other numbers (call them x[i]). Obviously, speed
>is important. I've sorted the large list, and right now I'm going
>through each y[i] from lowest to highest and testing it to see if x[i]
>is less than that value. This takes about .1 seconds for each x[i].
>

>I'm wondering if anyone has had a similar problem, and if there is a
>better function built-in to Mathematica. Alternatetively, I could
>build my own. I've just recently realized that I could also reduce the
>search time considerably if I sort the x[i] list as well, and only
>start my search from where I last left off. Any ideas on which
>approach would be more efficient? Thanks.
>

Effectively this is a binning problem, delt with quite often in this list.
(So search the archive.)

What you didn't specify is the form of your output you desire. I assume it
should be a list of closest elements of list y, corresponding in order to
sorted list x. (You then easily may generate a list of replacement rules, or
just eat up the result for further processing, by Scan, Map or the like.)

The idea you communicated to simultaneously scan sorted y and sorted x lists
is just the sort-merge approach. That should be among the best algorithms if
compiled.

Another idea would be to sort y and then make a binary search for each
element of x to get at the two closest element of y, and the check.

Here I'd like to communicate another idea relies on list processing, here
step by step:

In[1]:= y = Table[Random[], {30}]
Out[1]=
{0.267278, 0.99553, 0.201683, 0.0341569, 0.914983, 0.981118, 0.421376, \
0.0633918, 0.426651, 0.433505, 0.963199, 0.616847, 0.865283, 0.926424, \
0.151319, 0.734706, 0.278981, 0.247546, 0.586438, 0.0635677, 0.314219, \
0.492426, 0.800959, 0.288232, 0.0469408, 0.496896, 0.599276, 0.254075, \
0.131958, 0.515779}

In[2]:= x = Table[Random[], {10}]
Out[2]=
{0.177901, 0.190683, 0.705306, 0.0822738, 0.214701, 0.573836, 0.840023, \
0.15585, 0.0633828, 0.83913}

The test data. (I'm sorry to include the outputs, but such you may exactly
reproduce the steps.)

In[51]:= xx = Sort[x]; (* sorted x, to be used later *)

In[52]:=
zz = Join[{-Infinity}, Sort[Join[y, x + I]], {Infinity}]

Both lists y and x sorted together, the x are marked by complex I, such as
to lokate there positions in this combined sorted list. -Infinity and
Infinity are attached to the margins to avoid problems when any element of x
is smaller (or greater) than each element of y.

We search for the positions of marked x elements:

In[53]:= xX = Flatten[Position[jj, HoldPattern[Complex[_, _]]]]
Out[53]=
{4, 7, 10, 11, 12, 14, 27, 31, 34, 35}

In[54]:= lowX = (c = 0;
FoldList[If[#2 - #1 === ++c, #1, c = 0; #2] &, -Infinity, xX - 1] //
Rest)
Out[54]=
{3, 6, 9, 9, 9, 13, 26, 30, 33, 33}

These are the corresponding positions of the next smaller element of y. You
see double occurances when x elements came out adjacent in the combined
list.

In[55]:= highX = (c = 0;
FoldList[If[#2 - #1 === --c, #1, c = 0; #2] &, Infinity,
Reverse[xX] + 1] // Rest // Reverse)
Out[55]=
{5, 8, 13, 13, 13, 15, 28, 32, 36, 36}

Same for the next greater elements.

In[56]:= low = zz[[lowX]];
In[57]:= high = zz[[highX]];


Now we take the next nearest of both candidates:

In[58]:=
MapThread[If[#3 - #2 > #2 - #1, #1, #3] &, {low, xx, high}]
Out[58]=
{0.0633918, 0.0635677, 0.151319, 0.201683, 0.201683, 0.201683, 0.586438, \
0.734706, 0.865283, 0.865283}


put together:

closestMatchP[x_List, y_List] :=
Module[{xx = Sort[x],
zz = Join[{-Infinity}, Sort[Join[y, x + I]], {Infinity}], xX, lowX,
highX, c},
xX = Flatten[Position[zz, HoldPattern[Complex[_, _]]]];
lowX = (c = 0;
FoldList[If[#2 - #1 === ++c, #1, c = 0; #2] &, -Infinity, xX - 1] //

Rest);
highX = (c = 0;
FoldList[If[#2 - #1 === --c, #1, c = 0; #2] &, Infinity,
Reverse[xX] + 1] // Rest // Reverse);
MapThread[If[#3 - #2 > #2 - #1, #1, #3] &, {zz[[lowX]], xx,
zz[[highX]]}]
]


In[73]:= closestMatchP[x, y]
Out[73]=
{0.0633918, 0.0635677, 0.151319, 0.201683, 0.201683, 0.201683, 0.586438, \
0.734706, 0.865283, 0.865283}

In[74]:= closestMatchP[x, y + .5]
Out[74]=
{0.534157, 0.534157, 0.534157, 0.534157, 0.534157, 0.534157, 0.563568, \
0.701683, 0.814219, 0.814219}

In[75]:= closestMatchP[x, y - .5]
Out[75]=
{0.086438, 0.086438, 0.116847, 0.234706, 0.234706, 0.234706, 0.49553, \
0.49553, 0.49553, 0.49553}

Performance test:

In[45]:= x2 = Table[Random[], {10000}];
In[46]:= y2 = Table[Random[], {30000}];
In[50]:= closestMatchP[x2, y2]; // Timing
Out[50]= {2.564 Second, Null}

Not too bad, possibly not best; but I haven't got the time to check the
alternatives sketched above.

--
Hartmut Wolf


Carl K. Woll

unread,
Feb 22, 2003, 4:13:09 AM2/22/03
to
Hi Jacob,

The first thought that I had was to use Interpolation, similar to what Rob
Knapp posted. However, Rob's function found the closest y whether it was
greater than or less than your x[i]. If you only care to find the first y[i]
that is greater (or less than) a particular x[i] (and this seems to be what
you wanted), it is possible to speed up Rob's solution considerably. Again,
we create an interpolating function, but this time we use the Interpolation
function, as in:

ulint=Interpolation[Transpose[{y,y}],InterpolationOrder->0];

The ulint function is a ladder function which gives the value of the first y
greater than its argument. Since interpolating functions are listable, we
just apply ulint to your set of x values:

closestabove=ulint[x];

where closestabove is the list of closest values of y to each x (actually,
the first y greater than each x).

If instead, you want the first y lower than a particular value of x, we need
to modify as follows:

llint=Interpolation[Transpose[-{y,y}],InterpolationOrder->0];

closestbelow=-llint[-x];

In my testing, this approach was an order of magnitude faster than both
Hartmut Wolf's and Rob Knapp's solutions.

Carl Woll
Physics Dept
U of Washington

"Jacob Rome" <jr...@mail.com> wrote in message
news:b32ab1$b3o$1...@smc.vnet.net...

Daniel Lichtblau

unread,
Feb 22, 2003, 4:14:10 AM2/22/03
to
Jacob Rome wrote:
>
> Hi,
>
> I have a seemingly simple problem. I want to find the element in a
> list that is closest to a number I specify. In my case, I have a list
> of about 30,000 numbers (y[i]). I want to find the closest match for
> each of about 10,000 other numbers (call them x[i]). Obviously, speed
> is important. I've sorted the large list, and right now I'm going
> through each y[i] from lowest to highest and testing it to see if x[i]
> is less than that value. This takes about .1 seconds for each x[i].
>
> I'm wondering if anyone has had a similar problem, and if there is a
> better function built-in to Mathematica. Alternatetively, I could
> build my own. I've just recently realized that I could also reduce the
> search time considerably if I sort the x[i] list as well, and only
> start my search from where I last left off. Any ideas on which
> approach would be more efficient? Thanks.


If all elements are machine real numbers this can be done quite fast by
sorting both lists together, augmenting elements in each list with an
index of e.g. 1 or 2 to denote from which list they originate. To allow
for fast code throughout we use machine reals for these as well (so
packed array technology will be used behind the scenes).

We use fake "ends" for the second list in order to be certain the entire
merged list is bracketed by elements from that list. We will choose them
in such a way that they cannot be taken as closest elements to elements
in the first list.

Once sorted, we go through the list and augment each element by the
closest element from the second list that is to its left. We repeat that
with closest elements to the right.

list. They now have the nearest second list elements from each side
attached to them. We determine which is closer, and return pairs of the
form

{list 1 element, closest list 2 element}.

If instead we want to retain original positions in the first list and/or
give positions of closest elements in second list, minor modifications
of the code below (to retain list index positions) could be used. Again
one would want to store them as machine reals in order to get optimal
speed.

closestPoints = Compile[{{list1,_Real,1}, {list2,_Real,1}},
Module[{lo, hi, n1, n2, len1=Length[list1], len2=Length[list2]+2,
full, bound=1., t1},
lo = -2.*(Abs[Min[list1]] + Abs[Min[list2]]);
hi = 2.*(Abs[Max[list1]] + Abs[Max[list2]]);
n1 = Transpose[{list1,Table[1.,{len1}]}];
n2 = Transpose[{Join[{lo},list2,{hi}], Table[2.,{len2}]}];
full = Sort[Join[n1,n2]];
t1 = Table[
If [full[[j,2]]==2., bound = full[[j,1]]];
Append[full[[j]],bound]
, {j, len1+len2}
];
t1 = Table[
If [full[[j,2]]==2., bound = full[[j,1]]];
Append[t1[[j]],bound]
, {j, len1+len2, 1, -1}
];
t1 = Select[t1, (#[[2]]==1.)&];
Map[
If[#[[1]]-#[[3]] < #[[4]]-#[[1]],
{#[[1]],#[[3]]}, {#[[1]],#[[4]]}]&, t1]
]];

Example:

l1 = Table[Random[], {30000}]
l2 = Table[Random[], {10000}]

In[66]:= Timing[closestPoints[l1,l2];]
Out[66]= {0.58 Second, Null}

This on a 1.5 GHz machine.


Daniel Lichtblau
Wolfram Research

Jacob Rome

unread,
Feb 22, 2003, 4:16:13 AM2/22/03
to
Wow, I got so many helpful responses from so many people that I'm
blown away! I ended using an approach similar in spirit to the merged
list approach. Here's what I did.

I have two list of nodes on the x-y plane, nXY and oXY. What I want to
do is to determine the point from oXY closest to a point nXY[[i]]. I
will use that guess to help determine which Delaunay triangle (based
upon the oXY set of points) the new point (nXY[[i]]) is located in.

I have created two matrices of sorted values, nSort and oSort. I know
the length of the lists are nNodes (~10,000) and oNodes (~15,000),
respectively. The matrices have two columns, the 1st column has the
magnitude of the node (point on the x-y plane), the 2nd column is the
node number (the row number from the initial matrix).

My problem really has three steps, the way I've constructed it:

1) Find the nodes from oXY which have a magnitude closest to the node
nXY[[i]];
2) From these surrounding nodes, determine which one, oXY[[j]], is
closest to nXY[[i]]; (the point of step 1 is to greatly reduce the
number of nodes to be searched)
3) Find which Delaunay triangle (of oXY) the point nXY[[i]] belongs
to, searching only those triangles which have vertices at oXY[[j]];

(This takes care of 95%+ of my nodes. For the remaining nodes, I've
expanded the search to include more surrounding triangles.)

Therefore, for step 1, I don't need to know the node with the closest
magnitude, I just need to find one of the closest. For step 2, I do
not need to find the absolute nearest node, since that will neither
guarentee that step 3 is a success nor will not finding it ensure that
step 3 fails.

With all that in mind, here's the code I wrote (depth specifies how
many of the nearest nodes to search). The actual sorting takes about
0.36 seconds on a PIII/1000, although Length[oXY]=15000, not 30000 as
previously thought.

nearestNode[depth_, nNodes_, oNodes_, nSort_List, oSort_List,
nXY_List,
oXY_List] :=
Module[{nearest, offset, nearRow, mag, limLow, limUp, cMag,
minMag, cNode,
sortClose, k, testNode},
nearest = Table[0, {jj, nNodes}]; (*
this array will store the nearest node candidate *)
offset = Table[0, {jj, nNodes}];
nearRow = 1;
Do[{
mag = nSort[[k, 1]];
While[mag > oSort[[nearRow, 1]], nearRow++];
limLow = Max[1, nearRow - depth]; (*
limits of the nearest point search *)
limUp = Min[oNodes, nearRow + depth];
minMag = 10000;
Do[{ (*Closest point search *)
cMag = vectMag[
nXY[[nSort[[k, 2]]]] - oXY[[oSort[[testNode, 2]]]]];
minMag = Min[minMag, cMag];
If[cMag == minMag, {cNode = oSort[[testNode, 2]],
sortClose = testNode;}];
}, {testNode, limLow, limUp}];
offset[[k]] = nearRow - sortClose;
nearest[[nSort[[k, 2]]]] = cNode;
}, {k, nNodes}]; Return[{nearest, offset}];];

Again, thanks for everyone's help. I got some really good ideas from
your responses!

Uri Zwick

unread,
Feb 22, 2003, 4:20:19 AM2/22/03
to
Hi,

Here is my solution the problem.
The trick is to sort the two lists together!

fnd[x_,y_] :=
Module[ {inv,ind,one,pos,ly,x1} ,

inv = ind = Ordering[ Join[ y ,
x1 = Sort[ Join[x,{-$MaxMachineNumber,$MaxMachineNumber}] ] ] ] ;

inv[[ind]] = Range[1,Length[ind]] ;
ly = Length[y] ; one = Map[ (If[#<=ly,0,1])& , ind ] ;

pos = FoldList[ Plus , one[[1]] , Drop[one,1] ] [[
Take[inv,Length[y]] ]] ;
MapThread[ If[#1<(#2+#3)/2,#2,#3]& , { y , x1[[pos]] , x1[[pos+1]] }
]

]

On my 900Mhz Pentium III I get:

In[1]:= x = Table[Random[], {30000}];
In[2]:= y = Table[Random[], {10000}];
In[3]:= Timing[fnd[x, y];]

{0.401 Second, Null}

About two thirds of the time is spent on the last MapThread.
The other costly operation is probably Map. Is there a simple
way to speed these operations up?

Also, it would have been cleaner to use -Infinity and Infinity
instead of -$MaxMachineNumber and $MaxMachineNumber. But, according
to the standard order -Infinity is larger than finite numbers !!!

In[4] := Sort[ {0,-Infinity} ]

{0 , -Infinity }

Am I missing something here?

Giving Sort and explicit ordering function slows it down
considerably.

Uri

Dr Bob

unread,
Feb 22, 2003, 4:22:22 AM2/22/03
to
I think this works just as well, and a little faster:

closest[list_?VectorQ, things_?VectorQ] := Module[{sorted, lint, first,

last, n = Length[list], lower, upper}, \
sorted = Sort[list];
first = Max[Min[things], First[sorted]] - 1;
last = Min[Max[things], Last[sorted]] + 1;

lint = ListInterpolation[Join[{1}, Range[n], {n}], {
Join[{first}, sorted, {last}]}, InterpolationOrder -> 1];
list[[Round@lint@things]]
]

Bobby

On Fri, 21 Feb 2003 04:07:41 -0500 (EST), Robert Knapp <rkn...@wolfram.com>
wrote:

--
maj...@cox-internet.com
Bobby R. Treat


Dr Bob

unread,
Feb 22, 2003, 4:23:24 AM2/22/03
to
Robert's solution:

GetClosest[lower_, upper_, item_, slist_] := Module[{dl, du, pos}, pos =

If[upper ≤ lower,


upper, {dl, du} = Abs[item - slist[[{lower, upper}]]];
If[du > dl, upper, lower]];
slist[[pos]]]

ClosestMatches[
list_?VectorQ, things_?VectorQ] :=
Module[{sorted, lint, first, last, n = Length[list],
lower, upper}, sorted = Sort[list];
first = Max[Min[things], First[sorted]] - 1;
last = Min[Max[things], Last[sorted]] + 1;

lint = ListInterpolation[N[Join[{1}, Range[n], {n}]], {Join[{


first}, sorted, {last}]}, InterpolationOrder -> 1];

estimates = lint[things];
lower = Floor[estimates];
upper = Ceiling[estimates];
MapThread[GetClosest[##, sorted] &, {lower, upper, things}]]

My solution (inspired by Robert's):

closest[list_?VectorQ, things_?VectorQ] := Module[{sorted = Sort[list],

lint, first, last, n = Length[list]},

first = Max[Min[things], First[sorted]] - 1;
last = Min[Max[things], Last[sorted]] + 1;

lint = Interpolation[Transpose@{
Join[{first}, sorted, {last}], Join[{1}, Range[
n], {n}]}, InterpolationOrder -> 1];
list[[Round@lint@things]]
]

Test data:

n = 30000;
list = Table[Random[], {n}];
things = Table[Random[], {10000}];

Timings:

Timing[ClosestMatches[list, things]; ]
{0.9679999999999893*Second, Null}
Timing[closest[list, things];]
{0.5319999999999965*Second, Null}

The answers aren't the same:

Take[things, 5]
Take[ClosestMatches[list, things], 5]
Take[closest[list, things], 5]

{0.6384367070141534, 0.6250093104126428, 0.12184236611421798,
0.09511197717954557, 0.679769352921642}
{0.6383160624966754, 0.624879790611537, 0.12186365194457346,
0.09509389371548846, 0.679694223307205}
{0.6384510214302745, 0.6250442930656402, 0.12184191844340701,
0.09512514224893226, 0.6797705952963301}

Rounding the Interpolation output gets the closer result, and faster.

Bobby

On Fri, 21 Feb 2003 04:07:41 -0500 (EST), Robert Knapp <rkn...@wolfram.com>
wrote:

Uri Zwick

unread,
Feb 23, 2003, 4:58:25 AM2/23/03
to
Hi,

Here is an improvement over my previous suggestion.
It runs about three times faster:

fnd3[x_,y_] :=
Module[ {inv,ind,one,pos,ly,x1,z1,z2,avr,sgn} ,

inv = ind = Ordering[ Join[ y ,
x1 = Sort[ Join[x,{-$MaxMachineNumber,$MaxMachineNumber}] ] ] ] ;

inv[[ind]] = Range[1,Length[ind]] ;

ly = Length[y]+0.5 ; one = Sign[ Sign[ind-ly] + 1 ] ;

pos = FoldList[Plus,one[[1]],Drop[one,1]] [[Take[inv,Length[y]]]] ;

z1 = x1[[pos]] ; z2 = x1[[pos+1]] ; avr = (z1+z2)/2 ;
sgn = Sign[ Sign[ Sign[y-avr]-0.5 ] + 1 ] ;
(1.-sgn)z1 + (sgn+0.) z2

]

On my 900Mhz Pentium III I get:

In[1]:= x = Table[Random[], {30000}];

In[2]:= y = Table[Random[], {10000}];
In[3]:= Timing[fnd3[x, y];]

{0.15 Second, Null}

It is interesting to note that "(1.-sgn)z1 + (sgn+0.) z2"
is much faster than "(1-sgn)z1 + sgn z2". Also, using 0.5
instead of 1/2 inside the Sign speeds things up.

The posting of Daniel Lichtblau suggests a similar
approach, though the implementation details are
quite different.

Does anyone know why 0 comes before -Infinity in the
standard ordering? (See my previous posting in this thread.)

Also, did anyone encounter an

"assertion failed at "\[InvisibleSpace]"
after perturbation steplength > 0"

while calling ConstrainedMin? (See a separate posting.)

Uri

Carl K. Woll

unread,
Feb 23, 2003, 5:03:45 AM2/23/03
to
Hi all,

Upon further consideration, I thought of a way to modify my previous attempt
so that the closest matches were found. My new function follows:

makeinterp[y_]:=Module[{sy},
sy=Sort[y];

Interpolation[Transpose[{ListCorrelate[{1/2,1/2},Join[sy,{sy[[-1]]}]],sy}],I
nterpolationOrder->0]

This function creates an interpolating function which returns the y value
closest to its argument. As an example,

y=Table[Random[],{30000}];
x=Table[Random[],{10000}];

closest=makeinterp[y];//Timing
closest[x];//Timing
{0.078 Second, Null}
{0.047 Second, Null}

One could instead make a single function which takes x and y values as
arguments (as most others have done), but I think splitting up the work as I
have done makes more sense. One could apply the same interpolating function
to different sets of x values this way. By way of comparison, Rob Knapp's
solution (closestr), Daniel Lichtblau's solution (closestl), Uri Zwick's
solution (closestu), and Hartmut Wolf's solution (closestw) had timings of

closestr[y,x];//Timing
closestl[y,x];//Timing
closestu[y,x];//Timing
closestw[x,y];//Timing
{1.063 Second, Null}
{0.796 Second, Null}
{0.204 Second, Null}
{0.75 Second, Null}

I didn't include DrBob's solutions, as they didn't seem to give correct
results. Also, Rob Knapp's solution didn't always give the closest y value.
Note that not all solutions returned the same output of a list of the y
values corresponding to the list x, as Hartmut Wolf's solution returned a
sorted list and Daniel Lichtblau's solution returned a list of pairs of x
values and corresponding y values.

If one wanted to find the closest matches to a single set of x values, then
Uri's and my solutions are comparable in timing. However, if one wanted to
find closest matches to multiple sets of x values, then clearly my approach
would be the speediest, as one would need to create the interpolating
function only once.

Carl Woll
Physics Dept
U of Washington

"Carl K. Woll" <ca...@u.washington.edu> wrote in message
news:b37ev5$piv$1...@smc.vnet.net...

Dr Bob

unread,
Feb 23, 2003, 5:04:51 AM2/23/03
to
I had a typo in my solution (list in place of sorted in the final return
expression, but list had been sorted outside the routine). In case that
affected timings, here it is again, with an unsorted 'list':

closest[list_?VectorQ, things_?VectorQ] := Module[{sorted = Sort[list],
lint, first, last, n = Length[list]},
first = Max[Min[things], First[sorted]] - 1;
last = Min[Max[things], Last[sorted]] + 1;
lint = Interpolation[Transpose@{
Join[{first}, sorted, {last}], Join[{1}, Range[
n], {n}]}, InterpolationOrder -> 1];

sorted[[Round@lint@things]]
]

n = 30000;
list = Table[Random[], {n}];

things = Table[Random[], {10000}];

Timing[ClosestMatches[list, things]; ]
{0.8119999999999998*Second, Null}
Timing[closest[list, things]; ]
{0.266*Second, Null}

Take[things, 5]
Take[ClosestMatches[list, things], 5]
Take[closest[list, things], 5]

{0.5807002127146147, 0.7231743740277918, 0.3065812161844857,
0.31065747252742193, 0.4325752310216415}
{0.5806328433277591, 0.723217016039161, 0.3065416642883134,
0.31064073272181786, 0.4326087043388147}
{0.5807654279226884, 0.7231714516295773, 0.30659880784207205,
0.3106581666143228, 0.43257232669984563}

Bobby

Bill Rowe

unread,
Feb 23, 2003, 5:07:56 AM2/23/03
to
On 2/22/03 at 3:38 AM, dr...@bigfoot.com (Dr Bob) wrote:

>I think this works just as well, and a little faster:
>

>closest[list_?VectorQ, things_?VectorQ] := Module[{sorted, lint,


>first, last, n = Length[list], lower, upper}, \ sorted = Sort[list];
>first = Max[Min[things], First[sorted]] - 1; last = Min[Max[things],

>Last[sorted]] + 1; lint = ListInterpolation[Join[{1}, Range[n], {n}],


>{ Join[{first}, sorted, {last}]}, InterpolationOrder -> 1];

>list[[Round@lint@things]]
>]

I think that last line of code should be sorted[[Round@lint@things]]

0 new messages