Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

position of sequence of numbers in list

132 views
Skip to first unread message

JB

unread,
Jan 30, 2010, 7:11:36 AM1/30/10
to
Hi,

What is the most efficient way to find the position of the beginning
of a sequence of numbers from a list?

I found a couple of ways:

find 3,4 in list={1,2,3,4,5};

1. pos=Intersection[Position[list,3],(Position[list,4])+1]

2. pos=Position[Partition[list,2,1],{3,4}]

Are there other ways to do this?
What is the best way when dealing with large lists?

Thanks,
JB

Patrick Scheibe

unread,
Jan 31, 2010, 5:57:23 AM1/31/10
to
Hi,

I assume the only thing you miss is how to express a "sequence of
numbers"? Just use Sequence[] :-)

list=Range[20];
Position[list, Sequence[11, 12, 13]]

Cheers
Patrick

Leonid Shifrin

unread,
Jan 31, 2010, 5:57:34 AM1/31/10
to
Hi again,

In my first post one of the solutions (the compiled function) contained a
bug:

In[1]:= posf[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]

Out[1]= -1

which was because it ignores all but the first candidate sequences in the
case when they overlap. Here is a modified one which is (hopefully) correct,
if not as fast:

posf2=
Compile[{{lst,_Integer,1},{target,_Integer,1}},
Module[{i=1,len =Length[target],lstlen = Length[lst]},
While[i<=lstlen-len+1&&Take[lst,{i,len+i-1}]!=target,i++];
If[i>lstlen-len+1,-1,i]]]


In[3]:= posf2[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]

Out[3]= 5

This one will still be much superior to Partition-based implementation for
cases when you can expect the sequence of interest to appear rather early
in the list. Anyway, sorry for the confusion with the buggy version.

By the way, should you wish to stick to Partition-based implementation, I
think it is fair to mention that for small sequence sizes and partitioning
with a shift 1, *and* when you have a list already in the packed array
representation (which is possible when your numbers are say all integers or
all reals, but not a mix), one can implement a more efficient version than
the built-in Partition:

Clear[myPartition];
myPartition[x_List, size_Integer] :=
With[{len = Length[x]},
Transpose@Table[x[[i ;; len - size + i]], {i, 1, size}]];

In[4]:= largeTestList = RandomInteger[{1, 15}, 3000000];

In[5]:= (pt = Partition[largeTestList, 2, 1]); // Timing

Out[5]= {0.521, Null}

In[6]:= (mpt = myPartition[largeTestList , 2]); // Timing

Out[6]= {0.17, Null}

In[7]:= pt == mpt

Out[7]= True

The built-in Partition will start winning when the partitioning size will be
around 30-50, so for long sequences using a built-in is better. If your list
is not packed, built-in Partition will be a few times faster even for small
partitioning sizes, so this will then be a de-optimization. You can attempt
to convert your list to packed array with Developer`ToPackedArray (note that
it does not issue any messages in case it is unable to do this), and check
that your list is packed with Developer`PackedArrayQ.

Likewise, you can implement your own Position-like function aimed at exactly
this problem, which, under similar requirements of your list being a packed
array of numbers, will be better than the built-in Position in most cases:

In[8]:=
Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1,
1] // Timing

Out[8]= {0.27, {{41940}}}

In[9]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6},
1] // Timing

Out[9]= {0.09, {41940}}

In[10]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1,
2] // Timing

Out[10]= {0.301, {{41940}, {228293}}}

In[11]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6},
2] // Timing

Out[11]= {0.311, {41940, 228293}}

In[12]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}] // Timing

Out[12]= {0.55, {{41940}, {228293}, {269300}}}

In[13]:= myPosition[
myPartition[largeTest , 4], {3, 4, 5, 6}, -1] // Timing

Out[13]= {0.411, {41940, 228293, 269300}}

where the last argument -1 is a convention to return all results.


Regards,
Leonid

Raffy

unread,
Jan 31, 2010, 5:57:44 AM1/31/10
to

Well, the {3,4} pattern must start with the first element of the
sublist. You can ignore the last position since the following 4 would
beyond the end of the list.

pos = Flatten@Position[Most[list], 3, {1}];

>From these positions, find the ones with a 4 in the next position:

pos = vPos[[Flatten@Position[list[[vPos + 1]], 4, {1}]]];

You could generalize this to repeating the above statement at
increasing offsets.

list = {1, 2, 3, 4, 5};
sub = {3, 4};

pos = Flatten@Position[Drop[list, 1 - Length[sub]], First[sub], {1}];
Do[pos = pos[[Flatten@Position[list[[pos + i - 1]], sub[[i]], {1}]]],
{i, 2, Length[sub]}];

Or, you could generalize it by finding the start position of the
sublist and then just extracting the following list to see if they
match the remainder of the sublist.

pos = Flatten@Position[Drop[list, 1 - Length[sub]], First[sub], {1}];
pos = Pick[pos, list[[# + 1 ;; # + Length[sub] - 1]] & /@ pos, Rest
[sub]]

Additionally, you'll want to verify that the Length[sublist] > 0.

Leonid Shifrin

unread,
Jan 31, 2010, 5:57:56 AM1/31/10
to
Well,

Sorry for so many posts - apparently I did not include the code for
<myPosition> function mentioned in my second post, while inluded the timings
for it. Here it is:


myPosition =
Compile[{{lst,_Integer,2},{target,_Integer,1},{nresults,_Integer}},
Module[{i=1,len = Length[lst],positions =
Table[0,{If[nresults!=-1,nresults,Length[lst]]}],pctr=0,
lcnresults = If[nresults!=-1,nresults,Length[lst]]},
While[i<=len&&pctr<lcnresults,
If[lst[[i++]]==target,positions[[++pctr]]=i-1]];
Take[positions,pctr]]]


Regards,
Leonid

Ray Koopman

unread,
Jan 31, 2010, 5:58:08 AM1/31/10
to
On Jan 30, 4:11 am, JB <jke...@gmail.com> wrote:

pne[vec,val] returns a list of the positions in vec that are NOT EQUAL
to val.

peq[vec,val] returns a list of the positions in vec that are EQUAL to
val. It's slower than pne but faster than the built-in Position.

Both pne and peq return simple lists, not the list-of-lists that
Position returns.

pne[vec_,val_] := SparseArray[vec,Automatic,val] /.
SparseArray[_, _, _, p_] :> Flatten@p[[2,2]]

peq[vec_,val_] := Block[{r = Range@Length@vec}, r[[pne[vec,val]]] = 0;
SparseArray[r] /. SparseArray[_, _, _, p_] :> p[[3]]]

list = Table[Random[Integer,{1,5}],{1*^6}];

Timing@Length[a = Intersection[Position[list,3],Position[list,4]-1]]
{1.09 Second, 39674}

Timing@Length[b = Position[Partition[list,2,1],{3,4}]]
{1.4 Second, 39674}

Timing@Length[c = Intersection[peq[list,3],peq[list,4]-1]]
{0.44 Second, 39674}

SameQ[a,b,Transpose@{c}]
True

Leonid Shifrin

unread,
Jan 31, 2010, 5:59:18 AM1/31/10
to
Hi,

Partition is in fact pretty good. In case you need only the first entry of
your sublist (rather than all), you can speed it up by indicating that only
one result is needed:

Position[Partition[list, 2, 1], {3, 4}, 1, 1]

{{3}}

Exactly how much faster this will run will depend on how early in the first
list the first instance of the second list is located.

If you will be dealing with numbers (integers or reals), you can get quite
a substantial speedup (for a single first entry search) by going back to
good old procedural approach (Compiled however), since this avoids the
overhead of Partition (however small it may be, it is proportional to the
length of the list, while procedural approach's complexity is proportional
to the value of the first position of your sequence):

posf = Compile[{{lst, _Integer, 1}, {target, _Integer, 1}},
Module[{i = 1, j = 1},
For[i = 1, i <= Length[lst], i++,
If[lst[[i]] == target[[j]], j++, j = 1];
If[j == Length[target] + 1, Return[i - j + 2]]];
Return[-1]]]


largeTest = RandomInteger[{1, 200}, 300000];

Position[Partition[largeTest , 2, 1], {3, 4}, 1, 1] // Timing

{0.231, {{46488}}}

Position[Partition[largeTest , 2, 1], {3, 4}] // Timing

{0.42, {{46488}, {94012}, {110753}, {125158}, {208861}, \
{243203}, {263641}, {273780}}}

posf[largeTest , {3, 4}] // Timing

{0.04, 46488}

Again, the speedup rate will depend on how early in the list the result is
first located.

Hope this helps.

Regards,
Leonid

Bob Hanlon

unread,
Jan 31, 2010, 5:59:40 AM1/31/10
to

Position[list, Sequence[3, 4]]


Bob Hanlon

---- JB <jke...@gmail.com> wrote:

=============

Steve Luttrell

unread,
Jan 31, 2010, 6:00:03 AM1/31/10
to
I don't know about efficiency, but here is another way of doing what you
want:

{1, 2, 3, 4, 5} /. {u___, 3, 4, ___} :> Length[{u}] + 1

--
Stephen Luttrell
West Malvern, UK

"JB" <jke...@gmail.com> wrote in message news:hk17lo$ogf$1...@smc.vnet.net...

Bill Rowe

unread,
Jan 31, 2010, 6:00:14 AM1/31/10
to
On 1/30/10 at 7:12 AM, jke...@gmail.com (JB) wrote:

>What is the most efficient way to find the position of the beginning
>of a sequence of numbers from a list?

>I found a couple of ways:

>find 3,4 in list={1,2,3,4,5};

>1. pos=Intersection[Position[list,3],(Position[list,4])+1]

>2. pos=Position[Partition[list,2,1],{3,4}]

note this will fail if the things searched for are not
sequential. That is:

In[1]:= list = {1, 2, 3, 4, 5};

In[2]:= pos = Position[Partition[list, 2, 1], {1, 4}]

Out[2]= {}

>Are there other ways to do this? What is the best way when dealing
>with large lists?

Here is another way to the task

In[3]:= theList = RandomInteger[100, 5]

Out[3]= {94,40,22,10,40}

In[4]:= Pick[Range@Length@theList, theList, Alternatives @@ {10, 22}]

Out[4]= {3,4}

and for non-sequential items

In[5]:= Pick[Range@Length@theList, theList, Alternatives @@ {94, 22}]

Out[5]= {1,3}


DrMajorBob

unread,
Jan 31, 2010, 6:00:25 AM1/31/10
to
I had no idea this would work... but it does.

list = {1, 2, 3, 4, 5};

Position[list, Sequence[3, 4]]

{{3}}

My first thought was

list /. {before___, 3, 4, ___} :> 1 + Length@{before}

3

But the other is far simpler.

Bobby

On Sat, 30 Jan 2010 06:12:04 -0600, JB <jke...@gmail.com> wrote:

> Hi,


>
> What is the most efficient way to find the position of the beginning
> of a sequence of numbers from a list?
>
> I found a couple of ways:
>
> find 3,4 in list={1,2,3,4,5};
>
> 1. pos=Intersection[Position[list,3],(Position[list,4])+1]
>
> 2. pos=Position[Partition[list,2,1],{3,4}]
>

> Are there other ways to do this?
> What is the best way when dealing with large lists?
>

> Thanks,
> JB
>


--
DrMaj...@yahoo.com

Leonid Shifrin

unread,
Jan 31, 2010, 7:53:32 AM1/31/10
to
Patrick, Bob,

Please educate me if I am misunderstanding, but Position is not SequenceHold
and therefore, your constructs don't seem to work for me. They seem to work
only because you provided length - 2 or length- 3 sequences (try length-4
and I am getting en error for v7), which Position interprets as 2 or 3
additional arguments. Thus,

Position[list, Sequence[11, 12, 13]] is equivalent to

Position[list, 11,12,13], which means:

find position of element <11> in the first 12 levels of expression (<list>),
stopping with at most 13 results. By accident (or luck), this gives correct
result on the toy example that you have. The same situation with Bob's
example Position[list, Sequence[3, 4]].
Or is this an extension of v.7.1, that I am not aware of?

Regards,
Leonid

On Sun, Jan 31, 2010 at 2:55 AM, Patrick Scheibe <
psch...@trm.uni-leipzig.de> wrote:

> Hi,
>
> I assume the only thing you miss is how to express a "sequence of
> numbers"? Just use Sequence[] :-)
>
> list=Range[20];
> Position[list, Sequence[11, 12, 13]]
>
> Cheers
> Patrick
>

Leonid Shifrin

unread,
Jan 31, 2010, 7:53:56 AM1/31/10
to
Hi Steve,

I greatly enjoyed your solution. It is so much more elegant and more
efficient than what I tried to do with Compile. I did some timings:

In[1]:=
largeTest = RandomInteger[{1, 12}, 300000];

In[2]:= Replace[
largeTest , {u___, 3, 4, 5, 6, ___} :> Length[{u}] + 1] // Timing

Out[2]= {0.05, 48265}

In[3]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1,
1] // Timing

Out[3]= {0.461, {{48265}}}


It can be easily generalized to find all occurrences of the sequence, by
using ReplaceList.

In[4]:=


Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}] // Timing

Out[4]= {0.951, {{48265}, {53605}, {53846}, {61161}, {108323}, \
{108993}, {112560}, {115757}, {127582}, {142593}, {197069}, {210130}, \
{254725}}}

In[5]:= ReplaceList[
largeTest , {u___, 3, 4, 5, 6, ___} :> Length[{u}] + 1] // Timing

Out[5]= {0.48, {48265, 53605, 53846, 61161, 108323, 108993, 112560,
115757, 127582, 142593, 197069, 210130, 254725}}

And it is still twice more efficient than Position[Partition ... solution
(it is interesting that Position can catch up by just being told that we
search only level 1).

By the way, it is a little off-topic, but this seems a more general
technique, which has a great potential for efficient solutions (perhaps
this is obvious to many people but I learned this relatively recently). I
used the same technique to implement a fast tic-tac-toe solution finder in
just a few lines of code, which checks for a winner on the infinite board
and beats my Java implementation:

ClearAll[winner];
winner[positions : {{{_Integer, _Integer} ..}, {{_Integer, _Integer} \
..}}, wLen_] :=
Module[{pGen, diagF, analyze},
pGen = {___, Sequence @@ Table[#, {wLen - 1}], ___} &;
diagF[x_, sign_] := x[[Ordering[x[[All, 1]] + sign*x[[All, 2]]]]];
analyze[points_] := Catch@With[{srt = Sort[points],
pt = pGen /@ {{0, 1}, {0, 1}, {1, -1}, {1, 1}}},
MapThread[If[MatchQ[Differences@#1, #2], Throw["Winner"]] &,
{{srt, Sort[points[[All, {2, 1}]]], diagF[srt, 1],
diagF[srt, -1]}, pt}];];
Map[analyze, positions] /.
{{"Winner", "Winner"} :> "A tie", {"Winner", Null} :>
"Crosses win",
{Null, "Winner"} :> "Nulls win", {Null, Null} :>
"Nobody won yet"}];

It accepts a board represented as two sub-lists of 2-d coordinates for nulls
and crosses, and the length of the winning combination. It runs 1000000
points (no-win situation) under 10 seconds on my not very fast machine, and
beats my Java implementation which is 10 times more code and not the slowest
possible either.

Anyways, nice solution, thanks for sharing!

Regards,
Leonid


On Sun, Jan 31, 2010 at 2:58 AM, Steve Luttrell <steve@_
removemefirst_luttrell.org.uk> wrote:

> I don't know about efficiency, but here is another way of doing what you
> want:
>
> {1, 2, 3, 4, 5} /. {u___, 3, 4, ___} :> Length[{u}] + 1
>
> --
> Stephen Luttrell
> West Malvern, UK
>
> "JB" <jke...@gmail.com> wrote in message news:hk17lo$ogf$1...@smc.vnet.net...

Patrick Scheibe

unread,
Feb 1, 2010, 6:09:12 AM2/1/10
to
Hi,

> Please educate me if I am misunderstanding, but Position is not SequenceHold

yes, you're right. It's a pity that "my written down idea" + "ocam's
razor" + "it worked on the sample" failed it this case.
I don't like "iterations" since I have enough of them in other languages
and therefore, my next guess would have been something like

pos[l_List, what_List] :=
ReplaceList[
l, {head___, Sequence @@ what, ___} :> Length[{head}] + 1]

pos[RandomInteger[{0, 10}, 10000], {4, 5, 6}]

Sorry, for the confusion.

Cheers
Patrick


> and therefore, your constructs don't seem to work for me. They seem to work
> only because you provided length - 2 or length- 3 sequences (try length-4
> and I am getting en error for v7), which Position interprets as 2 or 3
> additional arguments. Thus,
>
> Position[list, Sequence[11, 12, 13]] is equivalent to
>
> Position[list, 11,12,13], which means:
>
> find position of element <11> in the first 12 levels of expression (<list>),
> stopping with at most 13 results. By accident (or luck), this gives correct
> result on the toy example that you have. The same situation with Bob's
> example Position[list, Sequence[3, 4]].
> Or is this an extension of v.7.1, that I am not aware of?
>

> Regards,
> Leonid
>

> On Sun, Jan 31, 2010 at 2:55 AM, Patrick Scheibe <
> psch...@trm.uni-leipzig.de> wrote:
>
> > Hi,
> >
> > I assume the only thing you miss is how to express a "sequence of
> > numbers"? Just use Sequence[] :-)
> >
> > list=Range[20];
> > Position[list, Sequence[11, 12, 13]]
> >
> > Cheers
> > Patrick
> >
> > On Sat, 2010-01-30 at 07:12 -0500, JB wrote:

Leonid Shifrin

unread,
Feb 1, 2010, 6:09:23 AM2/1/10
to
Hi Patrick,

and therefore, my next guess would have been something like
>
> pos[l_List, what_List] :=
> ReplaceList[
> l, {head___, Sequence @@ what, ___} :> Length[{head}] + 1]
>
>

Well, I wish my next guess would be this. I used this idiom before but it is
not in my active vocabulary, so instead I made a mess with Compile and
wasted lots of time - mine and all those who read my posts.

Regards,
Leonid

--002354530710f4c28a047e794078
Content-Type: text/html; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Sun-Content-Length: 3159

Hi Patrick, <br><br><div class="gmail_quote"><blockquote class="gmail_q=
uote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0=
pt 0.8ex; padding-left: 1ex;">
and therefore, my next guess would have been something like<br>
<br>
pos[l_List, what_List] :=<br>
ReplaceList[<br>
l, {head___, Sequence @@ what, ___} :&gt; Length[{head}] + 1]<br>
<br></blockquote><div><br>Well, I wish my next guess would be this. I used =
this idiom before but it is not in my active vocabulary, so instead I made =
a mess with Compile and wasted lots of time - mine and all those who read m=
y posts.<br>
<br>Regards,<br>Leonid <br><br> </div><blockquote class="gmail_quote" s=
tyle="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8e=
x; padding-left: 1ex;">
<font color="#888888"><br>
</font><div><div class="h5"><br>
<br>
&gt; and therefore, your constructs don&#39;t seem to work for me. They see=
m to work<br>
&gt; only because you provided length - 2 or length- 3 sequences (try lengt=
h-4<br>
&gt; and I am getting en error for v7), which Position interprets as 2 or 3=
<br>
&gt; additional arguments. Thus,<br>
&gt;<br>
&gt; Position[list, Sequence[11, 12, 13]] is equivalent to<br>
&gt;<br>
&gt; Position[list, 11,12,13], which means:<br>
&gt;<br>
&gt; find position of element &lt;11&gt; in the first 12 levels of expressi=
on (&lt;list&gt;),<br>
&gt; stopping with at most 13 results. By accident (or luck), this gives co=
rrect<br>
&gt; result on the toy example that you have. The same situation with Bob&#=
39;s<br>
&gt; example Position[list, Sequence[3, 4]].<br>
&gt; Or is this an extension of v.7.1, that I am not aware of?<br>
&gt;<br>
&gt; Regards,<br>
&gt; Leonid<br>
&gt;<br>
&gt; On Sun, Jan 31, 2010 at 2:55 AM, Patrick Scheibe &lt;<br>
&gt; <a href="mailto:psch...@trm.uni-leipzig.de">psch...@trm.uni-leipzi=
g.de</a>&gt; wrote:<br>
&gt;<br>
&gt; &gt; Hi,<br>
&gt; &gt;<br>
&gt; &gt; I assume the only thing you miss is how to express a &quot;sequen=
ce of<br>
&gt; &gt; numbers&quot;? Just use Sequence[] :-)<br>
&gt; &gt;<br>
&gt; &gt; list=Range[20];<br>
&gt; &gt; Position[list, Sequence[11, 12, 13]]<br>
&gt; &gt;<br>
&gt; &gt; Cheers<br>
&gt; &gt; Patrick<br>
&gt; &gt;<br>
&gt; &gt; On Sat, 2010-01-30 at 07:12 -0500, JB wrote:<br>
&gt; &gt; &gt; Hi,<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; What is the most efficient way to find the position of the b=
eginning<br>
&gt; &gt; &gt; of a sequence of numbers from a list?<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; I found a couple of ways:<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; find 3,4 in list={1,2,3,4,5};<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; 1. pos=Intersection[Position[list,3],(Position[list=
,4])+1]<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; 2. pos=Position[Partition[list,2,1],{3,4}]<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; Are there other ways to do this?<br>
&gt; &gt; &gt; What is the best way when dealing with large lists?<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; Thanks,<br>
&gt; &gt; &gt; JB<br>
&gt; &gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt;<br>
<br>
</div></div></blockquote><br></div><br>

--002354530710f4c28a047e794078--

Ray Koopman

unread,
Feb 1, 2010, 6:10:05 AM2/1/10
to

Once we know where the 3's are, we need to look only there for 4's. Or
we could look for 3's only where there are 4's. The general rule would
be to look for the less-frequent value first.

Here are some times, incorporating Leonid's observation that Position
speeds up when it is told to search only level 1.

(Timing the list-generation, with all the other Timing's in the same
cell, keeps the other times from being overstated. See
http://groups.google.ca/group/comp.soft-sys.math.mathematica/browse_frm/thread/64cf56831988258a
)

Timing@Length[list = Table[Random[Integer,{1,5}],{1*^6}]]


Timing@Length[a = Intersection[Position[list,3],Position[list,4]-1]]

Timing@Length[b = Position[Partition[list,2,1],{3,4},{1}]]


Timing@Length[c = Intersection[peq[list,3],peq[list,4]-1]]

Timing@Length[d = Block[{p3 = peq[Most@list,3]},
p3[[peq[Rest[list][[p3]],4]]]]]
SameQ[Flatten@a,Flatten@b,c,d]

{0.84 Second,1000000}

{0.9 Second,39994}

{0.77 Second,39994}

{0.38 Second,39994}

{0.19 Second,39994}

True

Ray Koopman

unread,
Feb 2, 2010, 3:29:16 AM2/2/10
to
On Feb 1, 3:10 am, Ray Koopman <koop...@sfu.ca> wrote:
> On Jan 31, 2:58 am, Ray Koopman <koop...@sfu.ca> wrote:
> Once we know where the 3's are, we need to look only there for 4's.
> Or we could look for 3's only where there are 4's. The general rule
> would be to look for the less-frequent value first.
>
> Here are some times, incorporating Leonid's observation that
> Position speeds up when it is told to search only level 1.
>
> (Timing the list-generation, with all the other Timing's in the
> same cell, keeps the other times from being overstated. See
> http://groups.google.ca/group/comp.soft-sys.math.mathematica/browse_frm/thread/64cf56831988258a
> )
>
> Timing@Length[list = Table[Random[Integer,{1,5}],{1*^6}]]

> Timing@Length[a = Intersection[Position[list,3],Position[list,4]-1]]
> Timing@Length[b = Position[Partition[list,2,1],{3,4},{1}]]

> Timing@Length[c = Intersection[peq[list,3],peq[list,4]-1]]
> Timing@Length[d = Block[{p3 = peq[Most@list,3]},
> p3[[peq[Rest[list][[p3]],4]]]]]
> SameQ[Flatten@a,Flatten@b,c,d]
>
> {0.84 Second,1000000}
>
> {0.9 Second,39994}
>
> {0.77 Second,39994}
>
> {0.38 Second,39994}
>
> {0.19 Second,39994}
>
> True

One more kick at the can. Method 'e' below is slightly but reliably
faster than 'd', and the code suggests how to generalize to patterns
that are not restricted to adjacent positions in the original list.

Timing@Length[list = Table[Random[Integer,{1,5}],{2*^7}]]


Timing@Length[c = Intersection[peq[list,3],peq[list,4]-1]]

Timing@Length[d = Block[{p3 = peq[Most@list,3]},
p3[[peq[Rest[list][[p3]],4]]]]]

Timing@Length[e = Block[{p3 = peq[Most@list,3]},
p3[[peq[list[[p3+1]],4]]]]]
SameQ[c,d,e]

{13.59 Second, 20000000}

{7.91 Second, 800382}

{4.13 Second, 800382}

{3.8 Second, 800382}

True

Norbert P.

unread,
Feb 3, 2010, 6:12:43 AM2/3/10
to
Hi Leonid,

I guess JB doesn't care about speed improvement anymore, but this is
an idea that I've been using for a week (since getting Mathematica 7)
that makes finding position in a packed array much faster. This works
only in the case when one wants to find positions of all subsequences
(see my code in In[6] and notice that my old computer is much slower
than yours):

In[1]:= list=RandomInteger[{1,15},3000000];
seq={3,4,5,6};
In[3]:= r1=Flatten@Position[Partition[list,4,1],{3,4,5,6}];//Timing
Out[3]= {4.485,Null}
In[4]:= r2=ReplaceList[list,{u___,3,4,5,6,___}:>Length[{u}]+1];//
Timing
Out[4]= {5.453,Null}
In[5]:= r3=myPosition[myPartition[list,Length[seq]],seq,-1];//Timing
Out[5]= {2.719,Null}

In[6]:= fdz[v_]:=Rest@DeleteDuplicates@Prepend[v,0]
r4=Fold[fdz[#1 (1-Unitize[list[[#1]]-#2])]+1&,fdz[Range[Length[list]-
Length[seq]+1](1-Unitize[list[[;;-Length[seq]]]-seq[[1]]])]
+1,Rest@seq]-Length[seq];//Timing
Out[7]= {0.422,Null}

In[8]:= r1==r2==r3==r4
Out[8]= True

myPosition and myPartition are the functions from your post.
I'm essentially using DeleteDuplicates together with Unitize to find
positions of all occurrences of a specific number in an array. No
unpacking occurs so it's quite fast. You can use this to possibly
improve myPosition.

Best,
Norbert

On Jan 31, 2:57 am, Leonid Shifrin <lsh...@gmail.com> wrote:
> Hi again,
>
> In my first post one of the solutions (the compiled function) contained a
> bug:
>
> In[1]:= posf[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]
>
> Out[1]= -1
>
> which was because it ignores all but the first candidate sequences in the

> case when they overlap. Here is a modified one which is (hopefully) corre=


ct,
> if not as fast:
>
> posf2=
> Compile[{{lst,_Integer,1},{target,_Integer,1}},
> Module[{i=1,len =Length[target],lstlen = Length[lst]},

> While[i<=lstlen-len+1&&Take[lst,{i,len+i-1}]!=target,i++]=


;
> If[i>lstlen-len+1,-1,i]]]
>
> In[3]:= posf2[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]
>
> Out[3]= 5
>

> This one will still be much superior to Partition-based implementation fo=
r
> cases when you can expect the sequence of interest to appear rather=


early
> in the list. Anyway, sorry for the confusion with the buggy version.
>
> By the way, should you wish to stick to Partition-based implementation, I

> think it is fair to mention that for small sequence sizes and partitionin=


g
> with a shift 1, *and* when you have a list already in the packed array

> representation (which is possible when your numbers are say all intege=
rs or
> all reals, but not a mix), one can implement a more efficient version tha=


n
> the built-in Partition:
>
> Clear[myPartition];
> myPartition[x_List, size_Integer] :=
> With[{len = Length[x]},
> Transpose@Table[x[[i ;; len - size + i]], {i, 1, size}]];
>
> In[4]:= largeTestList = RandomInteger[{1, 15}, 3000000];
>
> In[5]:= (pt = Partition[largeTestList, 2, 1]); // Timing
>
> Out[5]= {0.521, Null}
>
> In[6]:= (mpt = myPartition[largeTestList , 2]); // Timing
>
> Out[6]= {0.17, Null}
>
> In[7]:= pt == mpt
>
> Out[7]= True
>

> The built-in Partition will start winning when the partitioning size will=
be
> around 30-50, so for long sequences using a built-in is better. If your l=
ist
> is not packed, built-in Partition will be a few times faster even for =
small
> partitioning sizes, so this will then be a de-optimization. You can attem=
pt
> to convert your list to packed array with Developer`ToPackedArray (note t=
hat
> it does not issue any messages in case it is unable to do this), and chec=


k
> that your list is packed with Developer`PackedArrayQ.
>

> Likewise, you can implement your own Position-like function aimed at exac=
tly
> this problem, which, under similar requirements of your list being a pack=
ed
> array of numbers, will be better than the built-in Position in most cases=

Leonid Shifrin

unread,
Feb 4, 2010, 6:24:06 AM2/4/10
to
Hi Norbert,

Thanks a lot - this is indeed pretty fast. And the way you use this in Fold
is quite amazing,
as well as the observation that there is no unpacking - very cool. As far
as speeding up of
the myPosition function is concerned, I toyed with precisely the same idea
before
in version 5. I used Clip instead of Unitze (essentially implementing
Unitize), but have completely forgotten about it until I saw your solution.
I now used Unitize to improve it a little (about 5-10 %). My benchmarks show
that both my old and new versions are about twice faster than yours:

In[1]:= Clear[myPositionOld, positionNP, myPositionNew];
myPositionOld[x_List, n_Integer] :=
#[[All, 1]] &@
Most@ArrayRules@SparseArray[1 - Clip[Abs[x - n], {0, 1}]];

positionNP[x_List, n_Integer] :=
Rest@DeleteDuplicates@Prepend[#, 0] &[Times[Range[Length[x]], (1 -
Unitize[x - n])]];

myPositionNew[x_List, n_Integer] := #[[All, 1]] &@
Most@ArrayRules@SparseArray[1 - Unitize[x - n]];


In[4]:= Timing[Do[myPositionOld[tst, 10], {50}]][[1]]/50

Out[4]= 0.10562

In[5]:= Timing[Do[positionNP[tst, 10], {50}]][[1]]/50

Out[5]= 0.1928

In[6]:= Timing[Do[myPositionNew[tst, 10], {50}]][[1]]/50

Out[6]= 0.09564

In[7]:=
Flatten@myPositionOld[tst, 10] == positionNP[tst, 10] ==
Flatten[myPositionNew[tst, 10]]

Out[7]= True

Anyways, I often find it amazing how far one can go in speeding up things
in
Mathematica - sometimes it can be really fast. Thanks for the new info - I
had no idea
that DeleteDuplicates is so fast on packed arrays, and I neither was I aware
of Unitize.

Regards,
Leonid

Norbert Pozar

unread,
Feb 4, 2010, 6:24:38 AM2/4/10
to
Hi Leonid,

that's a nice observation. I was exploring ArrayRules too, but I found
out that it is too slow when the array is quite dense. I was testing
it always on RandomInteger[1,...]. That has density 1/2. On the other
hand, DeleteDuplicates is quite independent of the density. When the
density drops below ~1/50, ArrayRules start performing better than
DeleteDuplicates. So I propose the following method, since Total is
really fast:

positionComb[x_List, n_Integer] :=
If[50 Total[#] < Length[x], Flatten@ArrayRules[#][[;; -2, 1]],
Rest@DeleteDuplicates@Prepend[# Range[Length[x]], 0]] &[


1 - Unitize[x - n]]

Some timings (by the way, the timing varies a lot, so accuracy no
better than +-50%):

In[1]:= tst=RandomInteger[1,40000];
Timing[Do[positionNP[tst,1],{50}]][[1]]/50
Timing[Do[myPositionNew[tst,1],{50}]][[1]]/50
Timing[Do[positionComb[tst,1],{50}]][[1]]/50
Out[2]= 0.005
Out[3]= 0.04064
Out[4]= 0.00468

In[5]:= tst=RandomInteger[500,40000];
Timing[Do[positionNP[tst,1],{50}]][[1]]/50
Timing[Do[myPositionNew[tst,1],{50}]][[1]]/50
Timing[Do[positionComb[tst,1],{50}]][[1]]/50
Out[6]= 0.00218
Out[7]= 0.00218
Out[8]= 0.00188


Two points:
1) you don't need ArrayRules@SparseArray@, ArrayRules@ is enough, even
though there is no performance benefit =)
2) Unitize is better than Clip since it also works with Reals,
Unitize[x]==0 iff x==0

> Anyways, I often find it amazing how far one can go in speeding up things
> in
> Mathematica - sometimes it can be really fast.

Well, this is true, but I think it'd be much better if Position was
implemented to work fast with packed arrays of integers, or Pick or
something ;)


Best,
Norbert

Leonid Shifrin

unread,
Feb 4, 2010, 6:24:49 AM2/4/10
to
Hi Norbert,

Thanks again, nice bits of information! I admit to not having done proper
tests and therefore
missed the slow-down of SparseArray that you mention. I think that your
proposed hybrid solution
is indeed the best option from what we know so far. As for Clip vs Unitize -
I have no doubt that Unitize must be better for what it does, being a more
specialized command in a sense. My old solution (which I attempted to
reconstruct in my previous post) used additional UnitStep or something to
make Clip work for reals, with some extra overhead of course.

Regards,
Leonid

Norbert Pozar

unread,
Feb 4, 2010, 6:25:44 AM2/4/10
to
Hi Leonid,

I just found another way that outperforms both of our solutions on a
wide range of integer arrays I tested. It turns out that one can
extract the position directly from SparseArray, without using
ArrayRules. It took me some time to figure it out:

extractPositionFromSparseArray[
HoldPattern[SparseArray[u___]]] := {u}[[4, 2, 2]]

positionExtr[x_List, n_] :=
Flatten@extractPositionFromSparseArray[
SparseArray[1 - Unitize[x - n]]]

Again, no unpacking.

Best,
Norbert

Leonid Shifrin

unread,
Feb 4, 2010, 6:26:27 AM2/4/10
to
Norbert,

what can I say - this is truly impressive. This is order of magnitude faster
than the built-in position, for this problem. This implies that the main
slow-down for the previous SparseArray-based solution was due to ArrayRules
being slow for dense arrays. In retrospect, this seems natural since
somewhere along the way something similar to unpacking must happen, since
the output is a list of rules. Probably internally SparseArray uses a packed
form for positions, and the way you extract them (directly with Part)
probably avoids the unpacking stage - this is the only plausible explanation
which comes to my mind.

I checked with Developer`PackedArrayQ on the resulting position list for
your solution, and it gives True, which supports my conjecture. Your method
seems to be applicable to many other problems where SparseArray-based tricks
are employed. This is something truly useful, thanks a lot for sharing!

Best,
Leonid

P.S.

By the way, this is irrelevant for performance, but the position-extraction
function can be rewritten without HoldPattern:

extractPositionFromSparseArrayAlt[(h : SparseArray)[u___]] := {u}[[4, 2, 2]]

Of course, in any case the key is your observation that Part can not be
directly used (since it has been internally overloaded for SparseArray
objects).


On Thu, Feb 4, 2010 at 4:59 AM, Norbert Pozar <berta...@gmail.com> wrote:

> I'm sorry, I just realized that one doesn't need 1-Unitize..., this will
> do:


>
> positionExtr[x_List, n_] :=
> Flatten@extractPositionFromSparseArray[

> SparseArray[Unitize[x - n], Automatic, 1]]
>
> Norbert
>
> On Wed, Feb 3, 2010 at 5:52 PM, Norbert Pozar <berta...@gmail.com>

Norbert Pozar

unread,
Feb 5, 2010, 7:12:53 AM2/5/10
to
Your conjecture can be easily verified:

In[1]:= SetSystemOptions[ "PackedArrayOptions"->{"UnpackMessage"->True}];
In[2]:= SparseArray[Range[0,2]]
Out[2]= SparseArray[<2>,{3}]
In[3]:= ArrayRules@%
During evaluation of In[3]:= Developer`FromPackedArray::unpack1:
Unpacking array with dimensions {2}. >>
Out[3]= {{2}->1,{3}->2,{_}->0}
In[4]:= ArrayRules@Range[0,2]
During evaluation of In[4]:= Developer`FromPackedArray::unpack1:
Unpacking array with dimensions {2}. >>
Out[4]= {{2}->1,{3}->2,{_}->0}

As you can see, the unpacked array has dimension {2}, i.e. it is the
list of positions stored in the SparseArray object. ArrayRules most
likely first uses SparseArray and then extracts its data, converting
it to the list of rules.

Another easy application that comes to mind is DeleteCases, using the
other piece of useful data inside SparseArray:

sparseArrayElements[HoldPattern[SparseArray[u___]]] := {u}[[4, 3]]
fastDeleteCases[vector_List, n_] :=
sparseArrayElements[SparseArray[vector, Automatic, n]]

In[1]:= test=RandomInteger[10,1000000];
First@Timing@(r1=DeleteCases[test,5])
During evaluation of In[1]:= Developer`FromPackedArray::punpack1:
Unpacking array with dimensions {1000000} to level 1. >>
Out[2]= 0.234
In[3]:= First@Timing@(r2=fastDeleteCases[test,5])
Out[3]= 0.062
In[4]:= r1==r2
During evaluation of In[4]:= Developer`FromPackedArray::unpack:
Unpacking array in call to Equal. >>
Out[4]= True
In[5]:= Developer`PackedArrayQ[r1]
Developer`PackedArrayQ[r2]
Out[5]= False
Out[6]= True

The difference is not as impressive, but still interesting.

Anyway, I don't know how well the structure of SparseArray is known, I
made an attempt to describe it here
http://www.math.ucla.edu/~npozar/doc/sparseArrays.pdf

Best,
Norbert


On Thu, Feb 4, 2010 at 1:06 AM, Leonid Shifrin <lsh...@gmail.com> wrote:
> Norbert,
>

> what can I say - this is truly impressive. This is order of magnitude fas=
ter
> than the built-in position, for this problem. This implies that the ma=
in
> slow-down for the previous SparseArray-based solution was due to ArrayRul=


es
> being slow for dense arrays. In retrospect, this seems natural since
> somewhere along the way something similar to unpacking must happen, since

> the output is a list of rules. Probably internally SparseArray uses a pac=


ked
> form for positions, and the way you extract them (directly with Part)

> probably avoids the unpacking stage - this is the only plausible explanat=


ion
> which comes to my mind.
>
> I checked with Developer`PackedArrayQ on the resulting position list for

> your solution, and it gives True, which supports my conjecture. Your me=
thod
> seems to be applicable to many other problems where SparseArray-based tri=


cks
> are employed. This is something truly useful, thanks a lot for sharing!
>
> Best,
> Leonid
>
> P.S.
>

> By the way, this is irrelevant for performance, but the position-extracti=


on
> function can be rewritten without HoldPattern:
>

> extractPositionFromSparseArrayAlt[(h : SparseArray)[u___]] := {u}[[4, 2=


, 2]]
>
> Of course, in any case the key is your observation that Part can not be
> directly used (since it has been internally overloaded for SparseArray
> objects).
>
>
>
>

> On Thu, Feb 4, 2010 at 4:59 AM, Norbert Pozar <berta...@gmail.com> wrot=

>> >> specialized command in a sense. My old solution (which I attempted=
to
>> >> reconstruct in my previous post) used additional UnitStep or somethin=


g
>> >> to
>> >> make Clip work for reals, with some extra overhead of course.
>> >>
>> >> Regards,
>> >> Leonid
>> >>
>> >>
>> >> On Wed, Feb 3, 2010 at 11:06 PM, Norbert Pozar <berta...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi Leonid,
>> >>>

>> >>> that's a nice observation. I was exploring ArrayRules too, but I fou=


nd
>> >>> out that it is too slow when the array is quite dense. I was testing

>> >>> it always on RandomInteger[1,...]. That has density 1/2. On the othe=

>> >>> 1) you don't need ArrayRules@SparseArray@, ArrayRules@ is enough, ev=


en
>> >>> though there is no performance benefit =)
>> >>> 2) Unitize is better than Clip since it also works with Reals,
>> >>> Unitize[x]==0 iff x==0
>> >>>

>> >>> > Anyways, I often find it amazing how far one can go in speeding=


up
>> >>> > things
>> >>> > in
>> >>> > Mathematica - sometimes it can be really fast.
>> >>> Well, this is true, but I think it'd be much better if Position was
>> >>> implemented to work fast with packed arrays of integers, or Pick or
>> >>> something ;)
>> >>>
>> >>>
>> >>> Best,
>> >>> Norbert
>> >>>
>> >>> On Wed, Feb 3, 2010 at 10:39 AM, Leonid Shifrin <lsh...@gmail.com>
>> >>> wrote:
>> >>> > Hi Norbert,
>> >>> >

>> >>> > Thanks a lot - this is indeed pretty fast. And the way you use thi=


s
>> >>> > in
>> >>> > Fold
>> >>> > is quite amazing,
>> >>> > as well as the observation that there is no unpacking - very cool.
>> >>> > As
>> >>> > far
>> >>> > as speeding up of

>> >>> > the myPosition function is concerned, I toyed with precisely th=


e
>> >>> > same
>> >>> > idea
>> >>> > before
>> >>> > in version 5. I used Clip instead of Unitze (essentially
>> >>> > implementing
>> >>> > Unitize), but have completely forgotten about it until I saw your
>> >>> > solution.
>> >>> > I now used Unitize to improve it a little (about 5-10 %). My
>> >>> > benchmarks
>> >>> > show

>> >>> > that both my old and new versions are about twice faster than your=


s:
>> >>> >
>> >>> > In[1]:= Clear[myPositionOld, positionNP, myPositionNew];
>> >>> > myPositionOld[x_List, n_Integer] :=
>> >>> > #[[All, 1]] &@
>> >>> > Most@ArrayRules@SparseArray[1 - Clip[Abs[x - n], {0, 1}]];
>> >>> >
>> >>> > positionNP[x_List, n_Integer] :=

>> >>> > Rest@DeleteDuplicates@Prepend[#, 0] &[Times[Range[Length[x]], =


(1 -
>> >>> > Unitize[x - n])]];
>> >>> >
>> >>> > myPositionNew[x_List, n_Integer] := #[[All, 1]] &@
>> >>> > Most@ArrayRules@SparseArray[1 - Unitize[x - n]];
>> >>> >
>> >>> >
>> >>> > In[4]:= Timing[Do[myPositionOld[tst, 10], {50}]][[1]]/50
>> >>> >
>> >>> > Out[4]= 0.10562
>> >>> >
>> >>> > In[5]:= Timing[Do[positionNP[tst, 10], {50}]][[1]]/50
>> >>> >
>> >>> > Out[5]= 0.1928
>> >>> >
>> >>> > In[6]:= Timing[Do[myPositionNew[tst, 10], {50}]][[1]]/50
>> >>> >
>> >>> > Out[6]= 0.09564
>> >>> >
>> >>> > In[7]:=
>> >>> > Flatten@myPositionOld[tst, 10] == positionNP[tst, 10] ==
>> >>> > Flatten[myPositionNew[tst, 10]]
>> >>> >
>> >>> > Out[7]= True
>> >>> >

>> >>> > Anyways, I often find it amazing how far one can go in speeding=


up
>> >>> > things
>> >>> > in
>> >>> > Mathematica - sometimes it can be really fast. Thanks for the new
>> >>> > info -
>> >>> > I
>> >>> > had no idea

>> >>> > that DeleteDuplicates is so fast on packed arrays, and I neither w=


as
>> >>> > I
>> >>> > aware
>> >>> > of Unitize.
>> >>> >
>> >>> > Regards,
>> >>> > Leonid
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Wed, Feb 3, 2010 at 12:43 PM, Norbert P. <berta...@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi Leonid,
>> >>> >>
>> >>> >> I guess JB doesn't care about speed improvement anymore, but this
>> >>> >> is

>> >>> >> an idea that I've been using for a week (since getting Mathematic=


a
>> >>> >> 7)
>> >>> >> that makes finding position in a packed array much faster. This
>> >>> >> works
>> >>> >> only in the case when one wants to find positions of all
>> >>> >> subsequences
>> >>> >> (see my code in In[6] and notice that my old computer is much
>> >>> >> slower
>> >>> >> than yours):
>> >>> >>
>> >>> >> In[1]:= list=RandomInteger[{1,15},3000000];
>> >>> >> seq={3,4,5,6};

>> >>> >> In[3]:= r1=Flatten@Position[Partition[list,4,1],{3,4,5,6}];//=


Timing
>> >>> >> Out[3]= {4.485,Null}

>> >>> >> In[4]:= r2=ReplaceList[list,{u___,3,4,5,6,___}:>Length[{u}]+1=


];//
>> >>> >> Timing
>> >>> >> Out[4]= {5.453,Null}
>> >>> >> In[5]:=
>> >>> >> r3=myPosition[myPartition[list,Length[seq]],seq,-1];//Timing
>> >>> >> Out[5]= {2.719,Null}
>> >>> >>
>> >>> >> In[6]:= fdz[v_]:=Rest@DeleteDuplicates@Prepend[v,0]
>> >>> >> r4=Fold[fdz[#1
>> >>> >> (1-Unitize[list[[#1]]-#2])]+1&,fdz[Range[Length[list]-
>> >>> >> Length[seq]+1](1-Unitize[list[[;;-Length[seq]]]-seq[[1]]])]
>> >>> >> +1,Rest@seq]-Length[seq];//Timing
>> >>> >> Out[7]= {0.422,Null}
>> >>> >>
>> >>> >> In[8]:= r1==r2==r3==r4
>> >>> >> Out[8]= True
>> >>> >>
>> >>> >> myPosition and myPartition are the functions from your post.
>> >>> >> I'm essentially using DeleteDuplicates together with Unitize to
>> >>> >> find
>> >>> >> positions of all occurrences of a specific number in an array. No
>> >>> >> unpacking occurs so it's quite fast. You can use this to possibly
>> >>> >> improve myPosition.
>> >>> >>
>> >>> >> Best,
>> >>> >> Norbert
>> >>> >>
>> >>> >> On Jan 31, 2:57 am, Leonid Shifrin <lsh...@gmail.com> wrote:

>> >>> >> > Hi again,
>> >>> >> >
>> >>> >> > In my first post one of the solutions (the compiled function)
>> >>> >> > contained
>> >>> >> > a
>> >>> >> > bug:
>> >>> >> >
>> >>> >> > In[1]:= posf[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]
>> >>> >> >
>> >>> >> > Out[1]= -1
>> >>> >> >
>> >>> >> > which was because it ignores all but the first candidate
>> >>> >> > sequences in
>> >>> >> > the
>> >>> >> > case when they overlap. Here is a modified one which is
>> >>> >> > (hopefully)
>> >>> >> > correct,
>> >>> >> > if not as fast:
>> >>> >> >
>> >>> >> > posf2=
>> >>> >> > Compile[{{lst,_Integer,1},{target,_Integer,1}},

>> >>> >> > Module[{i=1,len =Length[target],lstlen = Length[l=
st]},
>> >>> >> > While[i<=lstlen-len+1&&Take[lst,{i,len+i-1}]!=t=


arget,i++];
>> >>> >> > If[i>lstlen-len+1,-1,i]]]
>> >>> >> >
>> >>> >> > In[3]:= posf2[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}]
>> >>> >> >
>> >>> >> > Out[3]= 5
>> >>> >> >
>> >>> >> > This one will still be much superior to Partition-based
>> >>> >> > implementation
>> >>> >> > for

>> >>> >> > cases when you can expect the sequence of interest to app=
ear


>> >>> >> > rather
>> >>> >> > early
>> >>> >> > in the list. Anyway, sorry for the confusion with the buggy
>> >>> >> > version.
>> >>> >> >
>> >>> >> > By the way, should you wish to stick to Partition-based
>> >>> >> > implementation,
>> >>> >> > I
>> >>> >> > think it is fair to mention that for small sequence sizes and
>> >>> >> > partitioning

>> >>> >> > with a shift 1, *and* when you have a list already in the pa=
cked
>> >>> >> > array
>> >>> >> > representation (which is possible when your numbers are say =


all
>> >>> >> > integers or
>> >>> >> > all reals, but not a mix), one can implement a more efficient
>> >>> >> > version
>> >>> >> > than
>> >>> >> > the built-in Partition:
>> >>> >> >
>> >>> >> > Clear[myPartition];
>> >>> >> > myPartition[x_List, size_Integer] :=
>> >>> >> > With[{len = Length[x]},

>> >>> >> > Transpose@Table[x[[i ;; len - size + i]], {i, 1, size}]]=


;
>> >>> >> >
>> >>> >> > In[4]:= largeTestList = RandomInteger[{1, 15}, 3000000];
>> >>> >> >
>> >>> >> > In[5]:= (pt = Partition[largeTestList, 2, 1]); // Timing
>> >>> >> >
>> >>> >> > Out[5]= {0.521, Null}
>> >>> >> >
>> >>> >> > In[6]:= (mpt = myPartition[largeTestList , 2]); // Timing
>> >>> >> >
>> >>> >> > Out[6]= {0.17, Null}
>> >>> >> >
>> >>> >> > In[7]:= pt == mpt
>> >>> >> >
>> >>> >> > Out[7]= True
>> >>> >> >
>> >>> >> > The built-in Partition will start winning when the partitioning
>> >>> >> > size
>> >>> >> > will be
>> >>> >> > around 30-50, so for long sequences using a built-in is better.
>> >>> >> > If
>> >>> >> > your
>> >>> >> > list
>> >>> >> > is not packed, built-in Partition will be a few times faster
>> >>> >> > even
>> >>> >> > for
>> >>> >> > small
>> >>> >> > partitioning sizes, so this will then be a de-optimization. You
>> >>> >> > can
>> >>> >> > attempt

>> >>> >> > to convert your list to packed array with Developer`ToPackedArr=
ay
>> >>> >> > (note
>> >>> >> > that
>> >>> >> > it does not issue any messages in case it is unable to do this)=
,


>> >>> >> > and
>> >>> >> > check
>> >>> >> > that your list is packed with Developer`PackedArrayQ.
>> >>> >> >

>> >>> >> > Likewise, you can implement your own Position-like function aim=
ed


>> >>> >> > at
>> >>> >> > exactly
>> >>> >> > this problem, which, under similar requirements of your list
>> >>> >> > being a
>> >>> >> > packed
>> >>> >> > array of numbers, will be better than the built-in Position in
>> >>> >> > most
>> >>> >> > cases:
>> >>> >> >
>> >>> >> > In[8]:=
>> >>> >> > Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1,
>> >>> >> > 1] // Timing
>> >>> >> >
>> >>> >> > Out[8]= {0.27, {{41940}}}
>> >>> >> >
>> >>> >> > In[9]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6},
>> >>> >> > 1] // Timing
>> >>> >> >
>> >>> >> > Out[9]= {0.09, {41940}}
>> >>> >> >

>> >>> >> > In[10]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, =


1,
>> >>> >> > 2] // Timing
>> >>> >> >
>> >>> >> > Out[10]= {0.301, {{41940}, {228293}}}
>> >>> >> >
>> >>> >> > In[11]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6},
>> >>> >> > 2] // Timing
>> >>> >> >
>> >>> >> > Out[11]= {0.311, {41940, 228293}}
>> >>> >> >

>> >>> >> > In[12]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}] =


//
>> >>> >> > Timing
>> >>> >> >
>> >>> >> > Out[12]= {0.55, {{41940}, {228293}, {269300}}}
>> >>> >> >
>> >>> >> > In[13]:= myPosition[
>> >>> >> > myPartition[largeTest , 4], {3, 4, 5, 6}, -1] // Timing
>> >>> >> >
>> >>> >> > Out[13]= {0.411, {41940, 228293, 269300}}
>> >>> >> >

>> >>> >> > where the last argument -1 is a convention to return all result=


s.
>> >>> >> >
>> >>> >> > Regards,
>> >>> >> > Leonid
>> >>> >> >
>> >>> >> > On Sat, Jan 30, 2010 at 4:12 AM, JB <jke...@gmail.com> wrote:
>> >>> >> > > Hi,
>> >>> >> >
>> >>> >> > > What is the most efficient way to find the position of the
>> >>> >> > > beginning
>> >>> >> > > of a sequence of numbers from a list?
>> >>> >> >
>> >>> >> > > I found a couple of ways:
>> >>> >> >
>> >>> >> > > find 3,4 in list={1,2,3,4,5};
>> >>> >> >

>> >>> >> > > 1. pos=Intersection[Position[list,3],(Position[list,=

0 new messages