Finding Position in an ordered list

2 views
Skip to first unread message

janostot...@gmail.com

unread,
Jun 6, 2005, 4:31:00 AM6/6/05
to
I wonder if it is possible to use the knowledge
that a list in which I am looking for the position
of an element is ordered. I want a quicker solution then e.g.
lis={ac,dmk,rfg,sty,zxxer}
Position[lis,sty]

I am certainly interested in longer lists...

Thank you,

Janos


dh

unread,
Jun 7, 2005, 2:15:15 AM6/7/05
to
Hi Janos,
I think "DiscreteMath`Combinatorica`BinarySearch" is what you are
searching for.

?BinarySearch

BinarySearch[l, k] searches sorted list l for key k and gives the
position of l containing k, if k is present in l. Otherwise, if k is
absent in l, the function returns (p + 1/2) where k falls between the
elements of l in positions p and p+1. BinarySearch[l, k, f] gives the
position of k in the list obtained from l by applying f to each element
in l.

sincerely, Daniel

Caffa Vittorio Dr.

unread,
Jun 7, 2005, 2:16:01 AM6/7/05
to
Hi Janos,

Here is a little experiment.

First I create a list of strings.

$Version
5.1 for Microsoft Windows (January 27, 2005)

Needs["Miscellaneous`Dictionary`"]

words = FindWords[__];

Length[words]
92836

Now I search the position of four words with Position:

Timing[
{Position[words,"egg"],Position[words,"book
"],Position[words,"you"],Position[words,"paper"]}]
{0.05 Second, {{{25730}}, {{8885}}, {{92516}}, {{57800}}}}

Now I create a little database:

db = Dispatch[Thread[words -> Range[Length[words]]]];

Timing[{"egg", "book", "you", "paper"} /. db]
{0. Second,{25730,8885,92516,57800}}

It is faster, as wanted.

Janos, maybe you could now determine the search time as a function of
the list size.

Cheers, Vittorio

Carl K. Woll

unread,
Jun 7, 2005, 2:19:49 AM6/7/05
to
Janos,

If your data is numerical, and you want to find the positions of many
numbers at once, then Interpolation is a much quicker method. Again, some
numeric data:

In[1]:=
x = Sort[Table[Random[], {10^6}]];

Load your package:

In[2]:=
Needs["DiscreteMath`"]

Define my function to find the position:

In[3]:=
f = Interpolation[Transpose[{x, N[Range[Length[x]]]}],
InterpolationOrder -> 1]; // Timing
Out[3]=
{0.844 Second, Null}

Compare finding the positions of 5000 elements using BinarySearch and using
my Interpolation method:

In[7]:=
r1 = BinarySearch[x, #] & /@ Take[x, 5000]; // Timing
r2 = Round[f[#]] & /@ Take[x, 5000]; // Timing
r1 === r2
Out[7]=
{2.453 Second, Null}
Out[8]=
{0.062 Second, Null}
Out[9]=
True

As you can see, the Interpolation method is 40 times faster.

Carl Woll

----- Original Message -----
From: "János TÓTH" <janostot...@gmail.com>
Subject: Re: Finding Position in an ordered list


Dear Carl,

Many thanks, your solution also seems to be good, but, as usual :),
there exist a built in function, as Daniel told me: BinarySearch in
DiscreteMath.

Thank you,

Janos


On 6/6/05, Carl K. Woll <ca...@u.washington.edu> wrote:
> <janostot...@gmail.com> wrote in message
> news:d811k4$cch$1...@smc.vnet.net...


> >I wonder if it is possible to use the knowledge
> > that a list in which I am looking for the position
> > of an element is ordered. I want a quicker solution then e.g.
> > lis={ac,dmk,rfg,sty,zxxer}
> > Position[lis,sty]
> >
> > I am certainly interested in longer lists...
> >
> > Thank you,
> >
> > Janos
> >
>

> Janos,
>
> I would think a binary search would be the quickest. I won't bother giving
> an implementation, as you can do a search in the archives. For example:
>
> http://forums.wolfram.com/mathgroup/archive/1998/Jul/msg00022.html
>
> Could you provide a few more details on your problem? For example, is your
> ordered list numeric, or does it contain names as you indicated in your
> post. Also, are you interested in finding positions of more than one
> element? If the answer to both of the above is yes, then you might find
> using the built in binary search in Interpolation useful. For example,
> suppose your list contained a million reals:
>
> x = Sort[Table[Random[], {10^6}]];
>
> Create the interpolating function:
>
> In[2]:=
> Timing[f = Interpolation[Transpose[{x, N[Range[Length[x]]]}],
> InterpolationOrder -> 1]; ]
> Out[2]=
> {0.844 Second, Null}
>
> Find the location of an element:
>
> In[3]:=
> Round[f[x[[100000]]]] // Timing
> Out[3]=
> {0. Second, 100000}
>
> Contrast this with using Position:
>
> In[4]:=
> Position[x, x[[100000]]] // Timing
> Out[4]=
> {2.093 Second, {{100000}}}
>
> Finally, let's make sure that f returns the correct answers for every
> element in your list:
>
> In[5]:=
> (Round[f[x]]===Range[10^6])//Timing
>
> Out[5]=
> {5.344 Second, True}
>
> Carl Woll
>
>
>
>
>


yehuda ben-shimol

unread,
Jun 7, 2005, 2:27:24 AM6/7/05
to
The immediate answer is to perform a binary search. This will result
with a complexity of O(Log[2,n]) (n being the length of the list) and
not O(n).
There is an implementation of a binary search algorithm in the
Combinatorica package
<< DiscreteMath`Combinatorica`
? BinarySearch

good luck
yehuda

János TÓTH

unread,
Jun 8, 2005, 3:58:33 AM6/8/05
to
Dear All,

0. Now I disclose that I am looking for a word in a dictionary.
1. Thank for the BinarySearch idea. It is much faster than Position.
2. The idea of interpolation in the case of numbers is fantastic.
3. However, I tried BinarySearch[{a,b,c},b] without being evaluated.

Thus, I need further help.

Thank for all of you.

Janos

Andrzej Kozlowski

unread,
Jun 9, 2005, 5:23:31 AM6/9/05
to

On 8 Jun 2005, at 16:21, János TÓTH wrote:
>
> Dear All,
>
> 0. Now I disclose that I am looking for a word in a dictionary.
> 1. Thank for the BinarySearch idea. It is much faster than Position.
> 2. The idea of interpolation in the case of numbers is fantastic.
> 3. However, I tried BinarySearch[{a,b,c},b] without being evaluated.
>
> Thus, I need further help.
>
> Thank for all of you.
>
> Janos

BinarySearch in the Combinatorica package is written so that it
accepts only numeric input. It is very easy to change this:

BinarySearch[list_, elem_] := Module[{
n0 = 1, n1 = Length[list], m}, While[n0 ≤ n1, m = Floor[(n0 +
n1)/2];
If[list[[m]] == elem,
Return[m]]; If[OrderedQ[{list[[m]], elem}], n0 = m + 1, n1 = m -
1] ];
0 ]


BinarySearch[{a,b,c,},b]


2


BinarySearch[{a,b,c,},e]


0

0 means "not found".

Andrzej Kozlowski

Chiba, Japan

Carl K. Woll

unread,
Jun 9, 2005, 5:31:08 AM6/9/05
to
"János TÓTH" <janostot...@gmail.com> wrote in message
news:d868f9$bus$1...@smc.vnet.net...

> Dear All,
>
> 0. Now I disclose that I am looking for a word in a dictionary.
> 1. Thank for the BinarySearch idea. It is much faster than Position.
> 2. The idea of interpolation in the case of numbers is fantastic.
> 3. However, I tried BinarySearch[{a,b,c},b] without being evaluated.
>
> Thus, I need further help.
>
> Thank for all of you.
>
[snip]

Janos,

Apparently the Combinatorica package BinarySearch function only accepts
numeric keys. We can modify this function to work with any key as follows:

binarysearch[d_, k_, ordf_:Less] :=
Module[{lo = 1, mid, hi = Length[d], c},
While[lo <= hi,
If[(c = d[[mid = Quotient[lo + hi, 2]]]) === k, Return[mid]];
If[ordf[k, c], hi = mid - 1, lo = mid + 1]];
Return[lo - 1/2]
]

Then, for your example, we would use the ordering function OrderedQ[{##}]&:

binarysearch[{a, b, c}, b, OrderedQ[{##}] &]
2

Carl Woll


Reply all
Reply to author
Forward
0 new messages