list.pop(0) vs. collections.deque

Steve Howell

Jan 22, 2010, 2:14:32 PM
The v2.6.4 version of the tutorial says this:

'''
It is also possible to use a list as a queue, where the first element
added is the first element retrieved (“first-in, first-out”); however,
lists are not efficient for this purpose. While appends and pops from
the end of list are fast, doing inserts or pops from the beginning of
a list is slow (because all of the other elements have to be shifted
by one).
'''

Is that really true in CPython? It seems like you could advance the
pointer instead of shifting all the elements. It would create some
nuances with respect to reclaiming the memory, but it seems like an
easy way to make lists perform better under a pretty reasonable use
case.

Does anybody know off the top of their head if the "have-to-be-shifted-
by-one" warning is actually valid?


http://docs.python.org/tutorial/datastructures.html

Chris Rebert

Jan 22, 2010, 3:14:25 PM
to Steve Howell, pytho...@python.org

Judging by the "Sorted dictionary" thread responses: Yes.

Cheers,
Chris
--
http://blog.rebertia.com

Steve Howell

Jan 22, 2010, 3:22:21 PM
On Jan 22, 12:14 pm, Chris Rebert <c...@rebertia.com> wrote:

I think you are referring to this comment:

'''
Insertion and deletion are still O(n) as all items to the right of the
inserted/deleted one have to be shifted by one place.
'''

http://groups.google.com/group/comp.lang.python/browse_thread/thread/d3699724d94d5b5a

I can certainly see why most reasonable implementations of a list
would have insertion/deletion in the middle of the list be O(N), but I
don't think that limitation has to apply for the special cases of the
beginning and end of the list.


Christian Heimes

Jan 22, 2010, 3:40:08 PM
to Steve Howell, pytho...@python.org
Steve Howell wrote:
> Is that really true in CPython? It seems like you could advance the
> pointer instead of shifting all the elements. It would create some
> nuances with respect to reclaiming the memory, but it seems like an
> easy way to make lists perform better under a pretty reasonable use
> case.
>
> Does anybody know off the top of their head if the "have-to-be-shifted-
> by-one" warning is actually valid?

Why do you think the documentation has such obvious errors?

CPython's list type uses an array of pointers to store its members. The
type is optimized for the most common list operations in Python:
iteration and appending. Python code rarely changes the head or middle
of a list. The deque type is an optimized data structure for popping
and inserting data at both ends.

When you call list.pop(0) or list.insert(0, ...), the pointers in the
internal array must be shifted. Appending is much faster because the
internal array is overallocated, meaning it contains free slots at the
tail of the data structure. A linked list of pointers requires more
memory, and iteration is slower since it can't utilize the CPU's cache
as well as an array.
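
For example, a quick timeit comparison makes the asymmetry visible (an
illustration only; absolute numbers vary by machine):

import timeit

# tail appends are amortized O(1) thanks to the overallocation
print timeit.Timer('lst.append(None)', 'lst = []').timeit(10000)

# each head insert shifts every existing pointer, so it is O(len(lst))
print timeit.Timer('lst.insert(0, None)', 'lst = []').timeit(10000)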

Christian

Steve Howell

Jan 22, 2010, 3:57:46 PM
On Jan 22, 12:40 pm, Christian Heimes <li...@cheimes.de> wrote:
> Steve Howell wrote:
> > Is that really true in CPython?  It seems like you could advance the
> > pointer instead of shifting all the elements.  It would create some
> > nuances with respect to reclaiming the memory, but it seems like an
> > easy way to make lists perform better under a pretty reasonable use
> > case.
>
> > Does anybody know off the top of their head if the "have-to-be-shifted-
> > by-one" warning is actually valid?
>
> Why do you think the documentation has such obvious errors?

I wasn't making any assumptions, hence the question mark. The Python
docs are very good, but even the best of projects make advances that
aren't reflected in the docs.

> CPython's list type uses an array of pointers to store its members. The
> type is optimized for the most common list operations in Python:
> iteration and appending. Python code rarely changes the head or middle
> of a list. The deque type is an optimized data structure for popping
> and inserting data at both ends.
>

I disagree that Python code rarely pops elements off the top of a
list. There are perfectly valid use cases for wanting a list over a
deque without having to pay O(N) for pop(0). Maybe we are just
quibbling over the meaning of "rarely."

> When you call list.pop(0) or list.insert(0, ...), the pointers in the
> internal array must be shifted. Appending is much faster because the
> internal array is overallocated, meaning it contains free slots at the
> tail of the data structure. A linked list of pointers requires more
> memory, and iteration is slower since it can't utilize the CPU's cache
> as well as an array.
>

I am not proposing a linked list of pointers. I am wondering about
something like this:

p = &p[1];
(and then reclaim p[0] as free memory, I already said I understood
that was the tricky bit)

The pointer arithmetic for accessing each element would still work in O
(1), right?
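
To illustrate, here is a pure-Python sketch of the idea (a hypothetical
wrapper class; the real change would live inside CPython's listobject.c):

class HeadOffsetList(object):
    """Sketch only: advance an offset on pop(0) instead of shifting."""
    def __init__(self, items):
        self._items = list(items)
        self._offset = 0                   # the advanced "pointer"

    def pop0(self):
        x = self._items[self._offset]
        self._items[self._offset] = None   # let the object be reclaimed
        self._offset += 1                  # O(1): no shifting
        return x

    def __getitem__(self, i):
        # indexing stays O(1), at the cost of one extra addition
        return self._items[self._offset + i]

    def __len__(self):
        return len(self._items) - self._offset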

Arnaud Delobelle

Jan 22, 2010, 4:08:48 PM
Steve Howell <show...@yahoo.com> writes:

> On Jan 22, 12:14 pm, Chris Rebert <c...@rebertia.com> wrote:
>> On Fri, Jan 22, 2010 at 11:14 AM, Steve Howell <showel...@yahoo.com> wrote:
>> > The v2.6.4 version of the tutorial says this:
>>
>> > '''
>> > It is also possible to use a list as a queue, where the first element
>> > added is the first element retrieved (“first-in, first-out”); however,
>> > lists are not efficient for this purpose. While appends and pops from
>> > the end of list are fast, doing inserts or pops from the beginning of
>> > a list is slow (because all of the other elements have to be shifted
>> > by one).
>> > '''
>>
>> > Is that really true in CPython?  It seems like you could advance the
>> > pointer instead of shifting all the elements.  It would create some
>> > nuances with respect to reclaiming the memory, but it seems like an
>> > easy way to make lists perform better under a pretty reasonable use
>> > case.
>>
>> > Does anybody know off the top of their head if the "have-to-be-shifted-
>> > by-one" warning is actually valid?
>>
>> Judging by the "Sorted dictionary" thread responses: Yes.
>>
>
> I think you are referring to this comment:
>
> '''
> Insertion and deletion are still O(n) as all items to the right of the
> inserted/deleted one have to be shifted by one place.
> '''
>
>

> I can certainly see why most reasonable implementations of a list
> would have insertion/deletion in the middle of the list be O(N), but I
> don't think that limitation has to apply for the special cases of the
> beginning and end of the list.

I made the comment you quoted. In CPython, it is O(n) to delete/insert
an element at the start of the list - I know it because I looked at the
implementation a while ago. This is why collections.deque exists I
guess. I don't know how they are implemented but insertion/deletion at
either end are O(1) and so is random access. So they are the data
structure that you want.

If you want evidence for lists, rather than my word, try:

>>> import timeit
>>> timeit.Timer('while t:del t[0]', 't=[0]*100000').timeit(1)
1.8452401161193848
>>> timeit.Timer('while t:del t[-1]', 't=[0]*100000').timeit(1)
0.017747163772583008

>>> timeit.Timer(
...     'while t:del t[0]',
...     'from collections import deque; t=deque([0]*100000)'
... ).timeit(1)
0.022077083587646484
>>> timeit.Timer(
...     'while t:del t[-1]',
...     'from collections import deque; t=deque([0]*100000)'
... ).timeit(1)
0.027813911437988281

--
Arnaud

Steve Howell

Jan 22, 2010, 4:26:03 PM
On Jan 22, 1:08 pm, Arnaud Delobelle <arno...@googlemail.com> wrote:

Ok, thanks, good to know. I assume it's the colorfully named
list_ass_slice that would have to special-case ilow == 0, and you'd
need a memory manager that would let you realloc from ilow:ihigh to
ilow+n:ihigh. Am I reading that much of the code correctly?

Christian Heimes

Jan 22, 2010, 4:29:23 PM
to Steve Howell, pytho...@python.org
Steve Howell wrote:
> I disagree that Python code rarely pops elements off the top of a
> list. There are perfectly valid use cases for wanting a list over a
> deque without having to pay O(N) for pop(0). Maybe we are just
> quibbling over the meaning of "rarely."

I was speaking from my own point of view. I've written several tens of
thousands of lines of Python code in the last seven years, mostly
related to data manipulation, web applications and operating system
interaction but also GUI stuff and scientific code. I can't recall any
performance critical or prominent code that modifies the head of a list
a lot.

Of course there are use cases where you may want to use list.pop(0). But
unless you truly need FIFO semantics, you can always work from the tail
instead of the head -- instead of list.insert(0, value) and list.pop(0),
use list.append(value) and list.pop(). It's not possible to optimize a
data structure for all use cases.

> I am not proposing a linked list of pointers. I am wondering about
> something like this:
>
> p = &p[1];
> (and then reclaim p[0] as free memory, I already said I understood
> that was the tricky bit)
>
> The pointer arithmetic for accessing each element would still work in O
> (1), right?

You mean it's an impossible trick unless you come up with your own
memory management system. Realloc(), the standard function to change the
size of a chunk of allocated memory in C, doesn't support your desired
operation.

Christian

Terry Reedy

Jan 22, 2010, 4:32:47 PM
to pytho...@python.org
On 1/22/2010 2:14 PM, Steve Howell wrote:
> The v2.6.4 version of the tutorial says this:

> Is that really true in CPython? It seems like you could advance the
> pointer instead of shifting all the elements. It would create some
> nuances with respect to reclaiming the memory, but it seems like an
> easy way to make lists perform better under a pretty reasonable use
> case.

Something like this was once proposed (i.e., leave space at both ends of a
list) but was rejected as over-complicating the list implementation for
*relatively* rare use cases. I believe deque was written subsequently to
address such other use cases.

Christian Heimes

Jan 22, 2010, 4:35:37 PM
to Arnaud Delobelle, pytho...@python.org
Arnaud Delobelle wrote:
> I made the comment you quoted. In CPython, it is O(n) to delete/insert
> an element at the start of the list - I know it because I looked at the
> implementation a while ago. This is why collections.deque exists I
> guess. I don't know how they are implemented but insertion/deletion at
> either end are O(1) and so is random access. So they are the data
> structure that you want.

Your assumption is correct. The collections.deque type uses a doubly
linked list of blocks. Each block contains a fixed number of pointers
to Python objects. This design makes the deque less memory hungry than
a standard doubly linked list with one node per element.
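
For instance, the intended usage (these are the standard deque methods,
available since Python 2.4):

from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)    # O(1) at the head
d.append(4)        # O(1) at the tail
print d.popleft()  # O(1) -> prints 0
print d.pop()      # O(1) -> prints 4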

Christian

Steve Howell

Jan 22, 2010, 5:38:18 PM
On Jan 22, 1:29 pm, Christian Heimes <li...@cheimes.de> wrote:
> Steve Howell wrote:
> > I disagree that Python code rarely pops elements off the top of a
> > list.  There are perfectly valid use cases for wanting a list over a
> > deque without having to pay O(N) for pop(0).  Maybe we are just
> > quibbling over the meaning of "rarely."
>
> I was speaking from my own point of view. I've written several tens of
> thousands of lines of Python code in the last seven years, mostly
> related to data manipulation, web applications and operating system
> interaction but also GUI stuff and scientific code. I can't recall any
> performance critical or prominent code that modifies the head of a list
> a lot.
>

That maybe would be an argument for just striking the paragraph from
the tutorial. If it's rare that people pop the head off the list in
code that is performance critical or prominent, why bother to even
mention it in the tutorial?

> Of course there are use cases where you may want to use list.pop(0). But
> unless you truly need FIFO semantics, you can always work from the tail
> instead of the head -- instead of list.insert(0, value) and list.pop(0),
> use list.append(value) and list.pop(). It's not possible to optimize a
> data structure for all use cases.
>
> > I am not proposing a linked list of pointers.  I am wondering about
> > something like this:
>
> > p = &p[1];
> > (and then reclaim p[0] as free memory, I already said I understood
> > that was the tricky bit)
>
> > The pointer arithmetic for accessing each element would still work in O
> > (1), right?
>
> You mean it's an impossible trick unless you come up with your own
> memory management system. Realloc(), the standard function to change the
> size of a chunk of allocated memory in C, doesn't support your desired
> operation.
>

Impossible is a strong word. You could be lazy about giving the
memory back, and just wait till the whole list is garbage collected.
I don't think there's anything in Python's contract that says memory
has to be freed the exact moment you stop using it, especially since
we're talking about doing an O(N) memmove just to free up one
pointer's worth of memory.

I know the Python programmer could simply iterate through the list
rather than popping off unused elements, but that just means that you
not only tie up the memory for the pointers just as long, but you also
prevent the objects themselves from being garbage collected.

In my case I'm not really concerned about giving the memory back right
away, it's more about keeping my code simple. Once I'm done with an
element, I do just want to pop it off and keep all the simplicity of
the lists, but I am just concerned enough about speed that for a 1000
element list, I don't want to be doing 1000 memmoves for an average of
500 pointers, which effectively moves about a million bytes around for
no reason.

I suppose the solution is to either give up the sugar of lists, or try
to wrap something like deque or list-with-offset into a sequence.

Steve Howell

Jan 22, 2010, 5:52:41 PM

Bummer. I guess I get to do my own over-complicating of code, being
in that unfortunate minority.


Dave Angel

Jan 22, 2010, 5:54:15 PM
to Steve Howell, pytho...@python.org
> p = &p[1];

> (and then reclaim p[0] as free memory, I already said I understood
> that was the tricky bit)
>
> The pointer arithmetic for accessing each element would still work in O
> (1), right?
>
>
I think it was Dijkstra (sp?) who said you can accomplish anything with
just one more level of indirection. Clearly each attribute (variable)
that has a binding to a given list must point to a fixed piece of
memory, that cannot safely be moved, because there's no efficient way to
find all the attributes. That fixed piece is the list object, and I
expect it's 16 or 20 bytes on a 32-bit implementation. It must in turn
point to the actual malloc'ed block that contains pointers to all the
elements of the list. So that block will be 4*n where n is the number
of reserved cells, at least as large as len(). This is the block where
copying takes place when you insert or delete from the beginning.

The list object must contain a pointer to the beginning of this block,
or it wouldn't be able to free() it later. So you'd be suggesting a
second pointer that actually points to the current 0th pointer. And a
pop would simply increment this second pointer.

Such an approach might be worthwhile if you expect lots of pops and
pushes, with a minimal net change. But of course each time you did a
pop, you'd have to decide whether it was time to normalize/compact the
block.

As you say, reclaiming the 0th element of this block to the memory pool
would be tricky. Doubly so, because 1) the C memory allocator has no
such notion as resizing the beginning of a block, and 2) it would have
nothing useful to do with the 4 bytes freed. The minimum allocated
block in Python is probably 16 bytes of actual address space. I'd guess
that's 4 bytes for malloc's overhead, and 8 bytes for the minimum object
header, and 4 bytes for data. To check me, try:

>>> a = 5.3
>>> b = 49.6
>>> id(a), id(b)
(11074136, 11074152)

Anyway, this could be done, where once the two pointers get some
distance apart, you do a realloc, and copy everything. But of course
you'd want to build some hysteresis into it, to avoid thrashing.

There wouldn't be much of a performance hit, but it would increase every
list size by 4 bytes minimum. So I doubt it would be a list replacement.

This'd be an interesting project to do as an add-on module.

DaveA


Dave Angel

Jan 22, 2010, 5:57:36 PM
to Arnaud Delobelle, pytho...@python.org
Arnaud Delobelle wrote:
> Steve Howell <show...@yahoo.com> writes:
>
>
>> On Jan 22, 12:14 pm, Chris Rebert <c...@rebertia.com> wrote:
>>
>> <snip>

> I made the comment you quoted. In CPython, it is O(n) to delete/insert
> an element at the start of the list - I know it because I looked at the
> implementation a while ago. This is why collections.deque exists I
> guess. I don't know how they are implemented but insertion/deletion at
> either end are O(1) and so is random access. So they are the data
> structure that you want.
>
>
Not according to the 2.6 docs:

'''
Indexed access is O(1) at both ends but slows to O(n) in the middle. For
fast random access, use lists instead.
'''

That sounds to me like a doubly-linked list implementation.

<snip>

Steve Howell

Jan 22, 2010, 6:17:21 PM

The indirection is already in Python: it appears (at least to me)
that everything is dereferenced off of an ob_item pointer:

http://svn.python.org/view/python/trunk/Objects/listobject.c?view=markup

> The list object must contain a pointer to the beginning of this block,
> or it wouldn't be able to free() it later.  So you'd be suggesting a
> second pointer that actually points to the current 0th pointer.  And a
> pop would simply increment this second pointer.
>

Yes, ob_item would point to the 0th element pointer and pop would
simply increment it.

The additional bookkeeping would be the original pointer.

> Such an approach might be worthwhile if you expect lots of pops and
> pushes, with a minimal net change.  But of course each time you did a
> pop, you'd have to decide whether it was time to normalize/compact  the
> block.
>

Yes, and that of course is the tricky bit.

> As you say, reclaiming the 0th element of this block to the memory pool
> would be tricky.  

I should clarify that a bit. Reclaiming the 0th element *cheaply* is
tricky, unless you want to rewrite the memory manager. But I also
think you can, of course, defer reclaiming the element.

> Doubly so, because  1) the C memory allocator has no
> such notion as resizing the beginning of a block, and 2) it would have
> nothing useful to do with the 4 bytes freed.  The minimum allocated
> block in Python is probably 16 bytes of actual address space.  I'd guess
> that's 4 bytes for malloc's overhead, and 8 bytes for the minimum object
> header, and 4 bytes for data.  To check me, try:
>
>  >>> a = 5.3
>  >>> b = 49.6
>  >>> id(a), id(b)
> (11074136, 11074152)
>
> Anyway, this could be done, where once the two pointers get some
> distance apart, you do a realloc, and copy everything.  But of course
> you'd want to build some hysteresis into it, to avoid thrashing.

There wouldn't be any additional thrashing beyond what happens now.
You'd simply avoid the first N-1 memmoves of up to kN bytes at the
cost of not reclaiming those k(N-1) bytes right away. I suppose it's
a more "bursty" penalty you'd be paying, but the peak of the "bursty"
curve is no wider than the "constant" curve, it's just N times
narrower!

> There wouldn't be much of a performance hit, but it would increase every
> list size by 4 bytes minimum.  So I doubt it would be a list replacement.
>

I don't think that's the reason people would oppose this, although you
are right about the penalty. Memory's cheap, you'd need about a
quarter million lists just to fill up a meg.

CPU cycles, on the other hand, are expensive, as users' demand for
responsive programs seems to do a better job of keeping up with
Moore's Law.

I'd also argue that the memory you keep around to avoid unnecessary
memmoves() will almost always be dwarfed by the memory used by the
list elements themselves.


Christian Heimes

Jan 22, 2010, 6:17:17 PM
to Steve Howell, pytho...@python.org
Steve Howell wrote:
> That maybe would be an argument for just striking the paragraph from
> the tutorial. If it's rare that people pop the head off the list in
> code that is performance critical or prominent, why bother to even
> mention it in the tutorial?

How else are you going to teach new Python developers that they should
favor append() and pop() in place of insert() and pop(0)? :)

> Impossible is a strong word. You could be lazy about giving the
> memory back, and just wait till the whole list is garbage collected.
> I don't think there's anything in Python's contract that says memory
> has to be freed the exact moment you stop using it, especially since
> we're talking about doing an O(N) memmove just to free up one
> pointer's worth of memory.

Your proposal would increase the size of every list object by
sizeof(ptr) or ssize_t (32 or 64 bits) in order to store where the
first element is. It would also unnecessarily complicate the code and
possibly slow down a lot of list operations.

> I know the Python programmer could simply iterate through the list
> rather than popping off unused elements, but that just means that you
> not only tie up the memory for the pointers just as long, but you also
> prevent the objects themselves from being garbage collected.
>
> In my case I'm not really concerned about giving the memory back right
> away, it's more about keeping my code simple. Once I'm done with an
> element, I do just want to pop it off and keep all the simplicity of
> the lists, but I am just concerned enough about speed that for a 1000
> element list, I don't want to be doing 1000 memmoves for an average of
> 500 pointers, which effectively moves about a million bytes around for
> no reason.


The simplicity of Python is gained with some performance drawbacks. You
have to learn to use Python algorithms. You can't simply reimplement a
fast C algorithm and expect it to be fast in Python, too.

Christian

Steve Howell

Jan 22, 2010, 6:27:57 PM
On Jan 22, 3:17 pm, Christian Heimes <li...@cheimes.de> wrote:
> Steve Howell wrote:
> > That maybe would be an argument for just striking the paragraph from
> > the tutorial.  If it's rare that people pop the head off the list in
> > code that is performance critical or prominent, why bother to even
> > mention it in the tutorial?
>
> How else are you going to teach new Python developers that they should
> favor append() and pop() in place of insert() and pop(0)? :)
>
> > Impossible is a strong word.  You could be lazy about giving the
> > memory back, and just wait till the whole list is garbage collected.
> > I don't think there's anything in Python's contract that says memory
> > has to be freed the exact moment you stop using it, especially since
> > we're talking about doing an O(N) memmove just to free up one
> > pointer's worth of memory.
>
> Your proposal would increase the size of every list object by
> sizeof(ptr) or ssize_t (32 or 64 bits) in order to store where the
> first element is. It would also unnecessarily complicate the code and
> possibly slow down a lot of list operations.
>

64 bits per list is negligible.

Adding a check for (ilow == 0) is even more negligible.

You are not "unnecessarily" complicating code for something as widely
used as Python lists if it achieves the desired benefit at minimal
cost. You are complicating the code, but not "unnecessarily."

> > I know the Python programmer could simply iterate through the list
> > rather than popping off unused elements, but that just means that you
> > not only tie up the memory for the pointers just as long, but you also
> > prevent the objects themselves from being garbage collected.
>
> > In my case I'm not really concerned about giving the memory back right
> > away, it's more about keeping my code simple.  Once I'm done with an
> > element, I do just want to pop it off and keep all the simplicity of
> > the lists, but I am just concerned enough about speed that for a 1000
> > element list, I don't want to be doing 1000 memmoves for an average of
> > 500 pointers, which effectively moves about a million bytes around for
> > no reason.
>
> The simplicity of Python is gained with some performance drawbacks. You
> have to learn to use Python algorithms. You can't simply reimplement a
> fast C algorithm and expect it to be fast in Python, too.
>

I actually do expect Python to solve performance problems for me that
are more easily solved in core than in Python itself. So I guess
that's where we differ.

Steven D'Aprano

Jan 22, 2010, 9:20:14 PM
On Fri, 22 Jan 2010 14:38:18 -0800, Steve Howell wrote:

> I know the Python programmer could simply iterate through the list
> rather than popping off unused elements, but that just means that you
> not only tie up the memory for the pointers just as long, but you also
> prevent the objects themselves from being garbage collected.
>
> In my case I'm not really concerned about giving the memory back right
> away, it's more about keeping my code simple. Once I'm done with an
> element, I do just want to pop it off and keep all the simplicity of the
> lists, but I am just concerned enough about speed that for a 1000 element
> list, I don't want to be doing 1000 memmoves for an average of 500
> pointers, which effectively moves about a million bytes around for no
> reason.
>
> I suppose the solution is to either give up the sugar of lists, or try
> to wrap something like deque or list-with-offset into a sequence.


I don't understand what the actual problem you're trying to solve is.
Despite your disclaimer about not being concerned about reclaiming the
memory, it sounds like you're trying to micro-optimize memory usage. That
is, you give me the impression that you prefer this:

while alist:
    x = alist.pop(0)
    process(x)

over this:

for x in alist:
    process(x)
# allow alist to be garbage collected when it goes out of scope


That strikes me as a pointlessly trivial optimization, even if deleting
at the start of the list was efficient.

But whatever your reason, if you want to insert and delete efficiently
from both ends of the sequence, use a deque. If you are only doing a
small number of insertions/deletions at the beginning, and so don't care
about inefficiency, use a list.

If you only want to insert/delete from one end, use a list. Instead of:

alist.insert(0, x)
alist.pop(0)

use:

alist.append(x)
alist.pop()

That's fast and efficient. In some cases it doesn't matter which order
the list is, but if it does matter, the worst case is that you need to
call alist.reverse() occasionally, which is quite fast. Or iterate over
the list in reverse order, which is even faster.
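
For instance (illustrative only):

# consume the elements in reverse order without moving anything
for x in reversed(alist):
    process(x)
alist = []  # then drop all the references in one step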

So what am I missing?

--
Steven

Roy Smith

Jan 22, 2010, 10:04:10 PM
In article <mailman.1283.1264192...@python.org>,
Christian Heimes <li...@cheimes.de> wrote:

> Steve Howell wrote:
> > Is that really true in CPython? It seems like you could advance the
> > pointer instead of shifting all the elements. It would create some
> > nuances with respect to reclaiming the memory, but it seems like an
> > easy way to make lists perform better under a pretty reasonable use
> > case.
> >
> > Does anybody know off the top of their head if the "have-to-be-shifted-
> > by-one" warning is actually valid?
>
> Why do you think the documentation has such obvious errors?
>
> CPython's list type uses an array of pointers to store its members. The
> type is optimized for the most common list operations in Python:
> iteration and appending. Python code rarely changes the head or middle
> of a list. The deque type is an optimized data structure for popping
> and inserting data at both ends.
>
> When you call list.pop(0) or list.insert(0, ...), the pointers in the
> internal array must be shifted.

Well, at least in theory you could make pop(0) run in O(1). All you need
to do is have each list store an offset. Initially it's 0, and pop(0)
would just increment the offset. Then, all references to alist[i] would
turn into array[i+offset].

Of course, that's a lot of complexity to optimize a relatively rare use
case. You're probably better off just using a deque :-)

Roy Smith

Jan 22, 2010, 10:09:06 PM
In article
<3ac173bd-4124-434d...@36g2000yqu.googlegroups.com>,
Steve Howell <show...@yahoo.com> wrote:

> In my case I'm not really concerned about giving the memory back right
> away, it's more about keeping my code simple. Once I'm done with an
> element, I do just want to pop it off and keep all the simplicity of
> the lists, but I am just concerned enough about speed that for a 1000
> element list, I don't want to be doing 1000 memmoves for an average of
> 500 pointers, which effectively moves about a million bytes around for
> no reason.

If you really want to pop all the elements from a long list, reverse the
list and pop them off the end. Popping every element off the beginning of
the list is O(n^2), as you pointed out. Reversing the list is O(n), and
each pop after that is O(1), so the overall complexity is O(n).
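
A sketch of that pattern:

lst = range(5)       # items in arrival order
lst.reverse()        # one O(n) pass
while lst:
    print lst.pop()  # each pop() from the end is O(1); O(n) overall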

Steve Howell

Jan 23, 2010, 12:42:43 AM
On Jan 22, 6:20 pm, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> On Fri, 22 Jan 2010 14:38:18 -0800, Steve Howell wrote:
> > I know the Python programmer could simply iterate through the list
> > rather than popping off unused elements, but that just means that you
> > not only tie up the memory for the pointers just as long, but you also
> > prevent the objects themselves from being garbage collected.
>
> > In my case I'm not really concerned about giving the memory back right
> > away, it's more about keeping my code simple.  Once I'm done with an
> > element, I do just want to pop it off and keep all the simplicity of the
> > lists, but I am just concerned enough about speed that for a 1000 element
> > list, I don't want to be doing 1000 memmoves for an average of 500
> > pointers, which effectively moves about a million bytes around for no
> > reason.
>
> > I suppose the solution is to either give up the sugar of lists, or try
> > to wrap something like deque or list-with-offset into a sequence.
>
> I don't understand what the actual problem you're trying to solve is.
> Despite your disclaimer about not being concerned about reclaiming the
> memory, it sounds like you're trying to micro-optimize memory usage.

My discussion of memory probably distracted you from the fact that I'm
not proposing a micro-optimization of memory; I am proposing a mega-
optimization of CPU.

This innocent program here literally moves about a million bytes of
memory around for no good reason:

lst = []
for i in range(2000):
    lst.append(i)
while lst:
    print lst.pop(0)

Why? Because list.pop(0) is implemented in O(N) instead of O(1).

Why wouldn't you get a competent C programmer simply make
list_ass_slice smart enough to make list.pop(0) O(1)?

The brilliant computer scientist, Christian Heimes, provides the
answers, and I am paraphrasing here, of course:

1) You can save 64 bits for every list by not storing an extra
pointer!
2) You can simplify the CPython implementation!
3) You can avoid the oh-so-expensive check "if ilow == 0" for all
operations that don't need this optimization!

Sounds like two micro-optimizations to me (and a copout to boot).

> That
> is, you give me the impression that you prefer this:
>
> while alist:
>     x = alist.pop(0)
>     process(x)
>
> over this:
>
> for x in alist:
>     process(x)
> # allow alist to be garbage collected when it goes out of scope
>

No, to be more precise, I prefer this implementation of a recursive
parser (using lists) to one that would have to use deque's because of
the lameness of Python's list implementation:

https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py


Steve Howell

Jan 23, 2010, 12:58:28 AM
On Jan 22, 7:09 pm, Roy Smith <r...@panix.com> wrote:
> In article
> <3ac173bd-4124-434d-b726-0b9baaeec...@36g2000yqu.googlegroups.com>,

I really want to use list *normally* with all its perfectly good
semantics and reasonable implementation, except for its blind spot
with respect to popping the first element off the list. The whole
reason I use CPython vs. C in the first place is that CPython
programmers can generally program basic data structures better than I
can. But list.pop(0) is the exception. And, with the possible
exception of dicts, lists are the most fundamental data structures
that Python has.

I know Python's number one concern will never be speed, but if Python
makes an O(1) operation into an unnecessarily O(N) operation for no
good reasons other than "it's too complicated, " or it "adds another
pointer to the structure," or "it adds another conditional check to
list_ass_slice for operations that aren't popping off the top," I
think it's reasonable to challenge the design philosophy.

Aahz

Jan 23, 2010, 2:10:23 AM
In article <83082e19-9130-45a8...@22g2000yqr.googlegroups.com>,

Steve Howell <show...@yahoo.com> wrote:
>
>I really want to use list *normally* with all its perfectly good
>semantics and reasonable implementation, except for its blind spot
>with respect to popping the first element off the list. The whole
>reason I use CPython vs. C in the first place is that CPython
>programmers can generally program basic data structures better than I
>can. But list.pop(0) is the exception. And, with the possible
>exception of dicts, lists are the most fundamental data structures
>that Python has.
>
>I know Python's number one concern will never be speed, but if Python
>makes an O(1) operation into an unnecessarily O(N) operation for no
>good reasons other than "it's too complicated, " or it "adds another
>pointer to the structure," or "it adds another conditional check to
>list_ass_slice for operations that aren't popping off the top," I
>think it's reasonable to challenge the design philosophy.

"Rough consensus and running code."

You have a good point, but nobody will ever give your idea serious
attention until there's a patch and benchmarks.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

import antigravity

Alf P. Steinbach

Jan 23, 2010, 2:43:28 AM
* Steve Howell:

> On Jan 22, 7:09 pm, Roy Smith <r...@panix.com> wrote:
>> In article
>> <3ac173bd-4124-434d-b726-0b9baaeec...@36g2000yqu.googlegroups.com>,
>> Steve Howell <showel...@yahoo.com> wrote:
>>
>>> In my case I'm not really concerned about giving the memory back right
>>> away, it's more about keeping my code simple. Once I'm done with an
>>> element, I do just want to pop it off and keep all the simplicity of
>>> the lists, but I am just concerned enough about speed that for a 1000
>>> element list, I don't want to be doing 1000 memmoves for an average of
>>> 500 pointers, which effectively moves about a million bytes around for
>>> no reason.
>> If you really want to pop all the elements from a long list, reverse the
>> list and pop them off the end. Popping every element off the beginning of
>> the list is O(n^2), as you pointed out. Reversing the list is O(n), and
>> each pop after that is O(1), so the overall complexity is O(n).
>
> I really want to use list *normally* with all its perfectly good
> semantics and reasonable implementation, except for its blind spot
> with respect to popping the first element off the list. The whole
> reason I use CPython vs. C in the first place is that CPython
> programmers can generally program basic data structures better than I
> can. But list.pop(0) is the exception. And, with the possible
> exception of dicts, lists are the most fundamental data structures
> that Python has.

Having optimized list.pop() for first element, then you would have a blind spot
with respect to insertion at the front... Which could then be optimized for the
cases where there is some buffer space at the front, left over from a pop. I
think it would just be harder to understand and harder to explain. And I think
that due to that, as usual people would build their own elaborate
"explanations", with erroneous conclusions, and would then use it in inefficient
ways (like, popping from the middle or inserting at the front).

I think the "harder to use correctly" is the strongest argument against the
optimization, but there is also a non-obvious *memory overhead*...

Having popped just one element at the front, in order for the implementation to
not /accumulate/ unused memory, that is, in order to avoid an ongoing /memory
leak/, extending the buffer to accommodate e.g. an append() can no longer be done
as a (on relevant systems) constant time reallocation (letting the OS do its
virtual memory paging magic). Instead, a new buffer would have to be allocated
and all data copied over. One could still have amortized constant time for
appends by using a doubling buffer (which is the strategy used in C++
'std::vector'), but at the cost of wasting some memory, a percentage...

The implementation as a pure array is easy to understand and use correctly, and
doesn't waste memory.

But it would IMHO have been better if it wasn't called "list", which brings in
the wrong associations for someone used to other languages. The name does
matter. E.g. I don't think this discussion about a pop optimization would have
been there except for the name, which makes that optimization sound reasonable.
Perhaps some standard alternative name could be devised. Like, "array" would
have been nice, except that that's already grabbed by the array module.


> I know Python's number one concern will never be speed, but if Python
> makes an O(1) operation into an unnecessarily O(N) operation for no
> good reasons other than "it's too complicated, " or it "adds another
> pointer to the structure," or "it adds another conditional check to
> list_ass_slice for operations that aren't popping off the top," I
> think it's reasonable to challenge the design philosophy.

See above. In addition to being more difficult /to use/ correctly, that is,
being much easier to misunderstand, it incurs a memory overhead -- not the one
directly from the pop optimization, but by the requirements of buffer extension.
Essentially, as discussed above, it would then have to use a doubling buffer.


Cheers & hth.,

- Alf

Steve Howell

Jan 23, 2010, 2:43:03 AM
On Jan 22, 11:10 pm, a...@pythoncraft.com (Aahz) wrote:
>
> >I know Python's number one concern will never be speed, but if Python
> >makes an O(1) operation into an unnecessarily O(N) operation for no
> >good reasons other than "it's too complicated, " or it "adds another
> >pointer to the structure," or "it adds another conditional check to
> >list_ass_slice for operations that aren't popping off the top," I
> >think it's reasonable to challenge the design philosophy.
>
> "Rough consensus and running code."
>
> You have a good point, but nobody will ever give your idea serious
> attention until there's a patch and benchmarks.

Here is a benchmark of O(N*N) vs. O(N) for two C programs. One does
memmove; the other simply advances the pointer.

showell@showell-laptop:~$ time ./slow

real 0m1.446s
user 0m1.444s
sys 0m0.004s
showell@showell-laptop:~$ time ./fast

real 0m0.003s
user 0m0.004s
sys 0m0.000s
showell@showell-laptop:~$ diff slow.c fast.c
23,24c23
<     lst->size -= 1;
<     memmove(lst->p, lst->p + 1, lst->size * sizeof *lst->p);
---
>     lst->p += 1;
showell@showell-laptop:~$ cat slow.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

struct ob_item {
    int whatever;
};

struct list {
    struct ob_item *p;
    int size;
};

struct list make_list(int n)
{
    struct list lst;
    lst.p = malloc(n * sizeof *lst.p);
    lst.size = n;
    return lst;
}

void list_pop_left(struct list *lst) {
    lst->size -= 1;
    memmove(lst->p, lst->p + 1, lst->size * sizeof *lst->p);
}

int main() {
    struct list lst;
    int i;
    int n = 40000;

    lst = make_list(n);
    for (i = 0; i < n; ++i) {
        list_pop_left(&lst);
    }
    return 0;
}

Steve Howell

Jan 23, 2010, 2:58:29 AM
On Jan 22, 11:10 pm, a...@pythoncraft.com (Aahz) wrote:
> In article <83082e19-9130-45a8-91f2-8601c1fda...@22g2000yqr.googlegroups.com>,

> Steve Howell  <showel...@yahoo.com> wrote:
>
> >I really want to use list *normally* with all its perfectly good
> >semantics and reasonable implementation, except for its blind spot
> >with respect to popping the first element off the list.  The whole
> >reason I use CPython vs. C in the first place is that CPython
> >programmers can generally program basic data structures better than I
> >can.  But list.pop(0) is the exception.  And, with the possible
> >exception of dicts, lists are the most fundamental data structures
> >that Python has.
>
> >I know Python's number one concern will never be speed, but if Python
> >makes an O(1) operation into an unnecessarily O(N) operation for no
> >good reasons other than "it's too complicated, " or it "adds another
> >pointer to the structure," or "it adds another conditional check to
> >list_ass_slice for operations that aren't popping off the top," I
> >think it's reasonable to challenge the design philosophy.
>
> "Rough consensus and running code."
>
> You have a good point, but nobody will ever give your idea serious
> attention until there's a patch and benchmarks.

Another benchmark is that deques are slower than lists for accessing
elements.

showell@showell-laptop:~$ python foo.py
0.0215361118317 <- list
0.0429010391235 <- deque


import time
from collections import deque

n = 40000
lst = []
for i in range(n):
    lst.append(i)

t = time.time()
for i in range(n):
    lst[i]
print time.time() - t

lst = deque(lst)
t = time.time()
for i in range(n):
    lst[i]
print time.time() - t

So substituting deque for list suffers not just in convenience, but
also in performance.

Terry Reedy

Jan 23, 2010, 3:13:49 AM
to pytho...@python.org
On 1/23/2010 12:58 AM, Steve Howell wrote:

> I really want to use list *normally* with all its perfectly good
> semantics and reasonable implementation, except for its blind spot
> with respect to popping the first element off the list.

It was not designed for that. .pop() was added to lists about 10 years
ago because I asked for it (with no parameter, pop off end only) and
wrote what would now be a PEP -- and because Tim Peters later supported
the idea. Adding the optional parameter was something of an afterthought
(never publicly discussed as far as I know) for occasional use for
'short' lists where O(n) is tolerable. You have half persuaded me that
the parameter addition was a mistake. Perhaps it is too attractive a
nuisance for some people ;=).

> The whole
> reason I use CPython vs. C in the first place is that CPython
> programmers can generally program basic data structures better than I
> can.

They have given us three options other than .pop(0):

1. listiterator
2. queue.Queue
3. collections.deque

Why are you so stubborn about not using a data structure intended for
your use case instead of one that was not?

There is also
4. a two-list design for queues: iterate through one (or pop() from a
reversed version thereof) while appending to another; switch when the
first is empty. I plan to write it up with tests some day this year.
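
A minimal sketch of that two-list design (my reading of it, untested):

class TwoListQueue(object):
    # FIFO built from two lists; all the work happens at the fast tail ends.
    def __init__(self):
        self._front = []  # popped from the end, holds reversed items
        self._back = []   # appended to

    def put(self, x):
        self._back.append(x)      # amortized O(1)

    def get(self):
        if not self._front:       # switch when the first is empty
            self._back.reverse()  # one O(n) pass, amortized O(1) per item
            self._front, self._back = self._back, []
        return self._front.pop()  # O(1)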

> I know Python's number one concern will never be speed, but if Python
> makes an O(1) operation into an unnecessarily O(N) operation for no
> good reasons other than "it's too complicated, " or it "adds another
> pointer to the structure," or "it adds another conditional check to
> list_ass_slice for operations that aren't popping off the top," I
> think it's reasonable to challenge the design philosophy.

Challenge yes, mock no.

Part of writing good basic data structures is not adding needless
complication from featuritis and not penalizing 99.99% of access to
satisfy a .01% need better satisfied another way.

Terry Jan Reedy


Steve Howell

Jan 23, 2010, 3:23:28 AM
On Jan 23, 12:13 am, Terry Reedy <tjre...@udel.edu> wrote:
>
> Challenge yes, mock no.
>
> Part of writing good basic data structures is not adding needless
> complication from featuritis and not penalizing 99.99% of access to
> satisfy a .01% need better satisfied another way.
>

I would like to challenge your assertion that advancing ob_item
instead of doing memmove during list_ass_slice would impact the
performance of list accesses in any way. It would only slow down
operations that add/insert items into the list, and then only by a
single conditional statement, and those add/insert operations are
already O(N) to begin with.

Alf P. Steinbach

Jan 23, 2010, 3:32:26 AM
* Steve Howell:

I'm sorry, no, the last part is incorrect.

Appending to a 'list' can currently be constant time, if OS reallocation is
constant time (as the string '+' optimization relies on).

With the pop optimization it can no longer be constant time without risking an
accumulation of unused memory, a memory leak, although it can be amortized
constant time, at the cost of wasting some percentage of memory.

Steve Howell

Jan 23, 2010, 3:54:19 AM
On Jan 23, 12:32 am, "Alf P. Steinbach" <al...@start.no> wrote:
> * Steve Howell:
>
> > On Jan 23, 12:13 am, Terry Reedy <tjre...@udel.edu> wrote:
> >> Challenge yes, mock no.
>
> >> Part of writing good basic data structures is not adding needless
> >> complication from featuritis and not penalizing 99.99% of access to
> >> satisfy a .01% need better satisfied another way.
>
> > I would like to challenge your assertion that advancing ob_item
> > instead of doing memmove during list_ass_slice would impact the
> > performance of list accesses in any way.  It would only slow down
> > operations that add/insert items into the list, and then only by a
> > single conditional statement, and those add/insert operations are
> > already O(N) to begin with.
>
> I'm sorry, no, the last part is incorrect.
>
> Appending to a 'list' can currently be constant time, if OS reallocation is
> constant time (as the string '+' optimization relies on).
>

That's true, but it's also irrelevant, as the pop optimization would
happen in a branch of code that never gets executed during list
appending:

if (d < 0) { /* Delete -d items */
    /* ADD CODE HERE TO AVOID memmove
       when ilow == 0 (i.e. ihigh + d == 0)
    */
    memmove(&item[ihigh+d], &item[ihigh],
        (Py_SIZE(a) - ihigh)*sizeof(PyObject *));
    list_resize(a, Py_SIZE(a) + d);
    item = a->ob_item;
}


> With the pop optimization it can no longer be constant time without risking an
> accumulation of unused memory, a memory leak, although it can be amortized
> constant time, at the cost of wasting some percentage of memory.

list_resize already overallocates memory to allow room for growth.
Whenever you did an append to the list that would force a realloc, you
could first check to see if there is unused stuff at the front and do
the memmove then and reclaim the unfreed memory. So instead of
paying for a memmove on every pop, you only pay for it when the list
goes to size 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, etc.
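
You can watch that growth pattern from Python itself (requires 2.6+ for
sys.getsizeof; exact sizes and break points vary by build):

import sys

lst = []
last = sys.getsizeof(lst)
for i in range(100):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:
        # reallocations happen roughly at lengths 1, 5, 9, 17, 26, ...
        print len(lst), size
        last = size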



Terry Reedy

Jan 23, 2010, 4:24:40 AM
to pytho...@python.org

If you try writing a full patch, as I believe someone did, or at least a
prototype thereof, when the idea was discussed, you might have a better
idea of what the tradeoffs are and why it was rejected.

For instance, when you append to a full list, it is resized. I believe
it is now doubled, but the policy has varied over the years. If you turn
list from essentially a stack to a deque, you complicate the resizing
policy and have to consider the addition of a shift policy. Do you split
the over-allocated fore and aft or increase the overallocation from
double to, say, triple? If the former, then for normal usage that never
uses the fore part, the over-allocation has been effectively reduced
from 2x to 1.5x (which is about what it once was, I believe). This means
more frequent resizings and copyings, which means slower operation for
most use cases. Adding extra usually wasted space is also a nuisance.

Terry Jan Reedy

Steve Howell

Jan 23, 2010, 4:27:12 AM
On Jan 23, 12:13 am, Terry Reedy <tjre...@udel.edu> wrote:
> On 1/23/2010 12:58 AM, Steve Howell wrote:
>
> > I really want to use list *normally* with all its perfectly good
> > semantics and reasonable implementation, except for its blind spot
> > with respect to popping the first element off the list.
>
> It was not designed for that. .pop() was added to lists about 10 years
> ago because I asked for it (with no parameter, pop off end only) and
> wrote what would now be a PEP -- and because Tim Peters later supported
> the idea. Adding the optional parameter was something of an afterthought
> (never publicly discussed as far as I know) for occasional use for
> 'short' lists where O(n) is tolerable. You have half persuaded me that
> the parameter addition was a mistake. Perhaps it is too attractive a
> nuisance for some people ;=).
>

pop(0) is a useful idiom in parsers. You can see examples in
ElementTree and lib2to3.

Even without pop(0), people would still write code like this, found in
pstats.py:

arg = args[0]
args = args[1:]

It is sometimes overkill (and even inappropriate) to use a queue when
really you just want a list. Iterators are great, but they also have
slightly different semantics than the list itself.

There is nothing wrong with a language specification that allows users
to do insert, delete, and pop on a list. Once you freeze the language
specification, then you can turn your attention to improving the
implementation.

Alf P. Steinbach

Jan 23, 2010, 4:38:55 AM
> paying for a memmove on every pop [at front], you only pay for it when

> the list goes to size 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, etc.

Oh. If 'list' already uses a doubling buffer then the only overhead from the
optimization would, AFAICS, be a single add in every indexing. Which might be
acceptable (at least it sounds very reasonable in the context of Python).

Re terminology: I write "doubling buffer" to mean increase of buffer size by a
factor. It's often 2, but might be e.g. 1.5, whatever. The point of using a
constant factor is to avoid quadratic time for loops doing appending, i.e. the
constant factor size increase yields amortized constant time per append.

The sizes that you quote above, on the other hand, look like rather arbitrary
buffer size increases where the size to increase by is limited to a certain
small range. With copying or moving of everything at each buffer size increase
that would not yield amortized constant time. It yields linear time per
append, and quadratic time for a loop doing appends.

But, anyway, if 'list' already uses a doubling buffer then the only overhead
from the pop optimization would, AFAICS, be a single add in every indexing.

On the third & gripping hand, however, a proof-of-concept actual implementation
(patch) would be needed to ensure that one doesn't overlook any showstopper or
serious problem, and to provide timings. And the special case would have to be
documented as a special case. Hm, it would be nice if the Python docs offered
complexity (time) guarantees in general...


Cheers,

- Alf

Steve Howell

Jan 23, 2010, 5:37:56 AM

It looks like most of the complication would be in list_resize.

I'm gonna oversimplify a bit, but tell me if this is the gist of it.

I would have ob_item continue to always refer to first element of the
list, and then I'd have to introduce another variable to refer to the
start of our allocated memory, ob_start_memory, whenever you do a
realloc/free/malloc. I'd have a notion of fore_wastage, which would
either be a variable I maintain or something that I just calculate as
needed from the difference of ob_item and ob_start_memory.

In deciding whether I want to give memory back to the memory manager,
I just need to adjust my calculations to account for fore and aft
wastage to see if it's time to do a shrink, and before shrinking, I do
the memmove.

On growth, I would just always do a memmove right away if there is
fore_wastage, and then do the normal procedure for aft wastage.

For the most common scenario of append, append, append, the only
penalty is having to skip over fore_wastage logic by checking for
fore_wastage == 0 or ob_item == ob_start_memory.

For the scenario of several appends followed by several pops, I get
the big win of only doing log 2 N memmoves instead of N as I shrink
the list down to zero.

If I start alternating between pops and appends, it's basically a
wash...instead of doing the memmove on the pop, I do it on the next
append.

If I were to pop the top element and then prepend a new element, to be
truly efficient, I'd want to reuse the freed slot right away, but for
simplicity, I would probably not complicate list_ass_slice and just do
the normal resize() dance, which would give me memmove in one
direction followed immediately by a memmove in the other direction
when I come back to list_ass_slice. (But it would all still be a
wash, since I would have to do the same number of memmoves in the
current implementation.)

A lot of the essential complexity here seems to come from the fact
that realloc() isn't a very robust abstraction. It seems to be
expensive to tell it you want to shrink, and it also does not have an
interface to tell it to give you a little growing room. On the other
hand, the code within list_resize() actually provides a nice framework
for amortizing memmoves exponentially.
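
A toy pure-Python model of that bookkeeping (hypothetical names like
fore_wastage; this sketches the policy, it is not a patch):

class ToyList(object):
    # Models the ob_start_memory/ob_item split; not real CPython.
    def __init__(self):
        self.mem = [None] * 4  # stands in for the malloc'ed block
        self.fore = 0          # fore_wastage: dead slots before ob_item
        self.size = 0          # live elements start at mem[fore]

    def append(self, x):
        if self.fore + self.size == len(self.mem):  # block is full
            if self.fore:
                # reclaim the fore wastage: this is the deferred memmove
                self.mem[:self.size] = self.mem[self.fore:self.fore + self.size]
                self.fore = 0
            else:
                # no wastage to reclaim: grow the block (doubling here)
                self.mem.extend([None] * len(self.mem))
        self.mem[self.fore + self.size] = x
        self.size += 1

    def popleft(self):
        x = self.mem[self.fore]
        self.mem[self.fore] = None  # drop the reference for GC
        self.fore += 1              # advance ob_item: O(1), no memmove
        self.size -= 1
        return x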

Steven D'Aprano

Jan 23, 2010, 7:12:29 AM
On Fri, 22 Jan 2010 21:42:43 -0800, Steve Howell wrote:

> This innocent program here literally moves about a million bytes of
> memory around for no good reason:
>
> lst = []
> for i in range(2000):
>     lst.append(i)
> while lst:
>     print lst.pop(0)
>
> Why? Because list.pop(0) is implemented in O(N) instead of O(1).
>
> Why wouldn't you get a competent C programmer simply make list_ass_slice
> smart enough to make list.pop(0) O(1)?

Because there are always trade-offs, and the competent C programmers who
coded the implementation for lists choose different tradeoffs to the ones
you would prefer.

Seems to me that the simple solution to your problem is for you to
implement your own data structure that makes whatever trade-offs you
like. If it is good enough and popular enough, it might even replace the
existing list implementation.

>> That is, you give me the impression that you prefer this:
>>
>> while alist:
>>     x = alist.pop(0)
>>     process(x)
>>
>> over this:
>>
>> for x in alist:
>>     process(x)
>> # allow alist to be garbage collected when it goes out of scope
>>
>>
> No, to be more precise, I prefer this implementation of a recursive
> parser (using lists) to one that would have to use deque's because of
> the lameness of Python's list implementation:
>
> https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py


That's a lot of code. Am I supposed to study the whole module, or can you
give me a hint as to what you're referring to? The lack of docstrings and
comments doesn't fill me with enthusiasm for reading it.

Nevertheless, on the basis of a quick scan, I suppose that you're
probably talking about the nested function called recurse:

def recurse(prefix_lines):
    while prefix_lines:
        prefix, line = prefix_lines[0]
        if line == '':
            prefix_lines.pop(0)
            append('')
        else:
            block_size = get_block(prefix_lines)
            if block_size == 1:
                prefix_lines.pop(0)
                if line == pass_syntax:
                    pass
                elif line.startswith(flush_left_syntax):
                    append(line[len(flush_left_syntax):])
                elif line.startswith(flush_left_empty_line):
                    append('')
                else:
                    append(prefix + leaf_method(line))
            else:
                block = prefix_lines[:block_size]
                prefix_lines = prefix_lines[block_size:]
                branch_method(output, block, recurse)
    return


Since you're not even looking at the results of the pop, why don't you
just call del prefix_lines[0]? It probably won't perform any better, but
it is more idiomatic.

An alternative would be to do exactly what you want lists to do: track
the start of the list. Untested:


def recurse(prefix_lines):
    start = 0
    end = len(prefix_lines)
    while start < end:
        prefix, line = prefix_lines[start]
        if line == '':
            start += 1
            append('')
        else:
            block_size = get_block(prefix_lines)
            if block_size == 1:
                start += 1
                if line == pass_syntax:
                    pass
                elif line.startswith(flush_left_syntax):
                    append(line[len(flush_left_syntax):])
                elif line.startswith(flush_left_empty_line):
                    append('')
                else:
                    append(prefix + leaf_method(line))
            else:
                # slices taken relative to start, which replaces the pops
                block = prefix_lines[start:start + block_size]
                start += block_size
                branch_method(output, block, recurse)
    return


No more O(N) deletions. Problem solved.


--
Steven

Arnaud Delobelle

unread,
Jan 23, 2010, 7:24:32 AM1/23/10
to
Dave Angel <da...@ieee.org> writes:

Yes, you are correct. This will teach me (again!) to check my facts.

>
> That sounds to me like a doubly-linked list implementation.

I've just looked and it is a doubly-linked list of 'blocks' of size
BLOCKLEN, which is 62 in the source I have (I guess it's 62 so that the
whole block structure comes to exactly 64 pointers: 62 slots plus 1 for
each of the two links). So a small list will have near-constant random
access, in a way.

--
Arnaud

Christian Heimes

unread,
Jan 23, 2010, 8:46:37 AM1/23/10
to Steve Howell, pytho...@python.org
Steve Howell wrote:
> Another benchmark is that deques are slower than lists for accessing
> elements.

deques are optimized for accessing, inserting and removing data at
both ends. For anything else they are slower than the list type. This
was explained in this very thread yesterday.
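
A rough way to see this from Python (my sketch, not from the thread):
indexing into the middle of a deque has to walk block links, while a
list indexes its array directly.

from timeit import timeit

setup = '''
from collections import deque
n = 100000
lst = list(range(n))
dq = deque(lst)
'''
print(timeit('lst[n // 2]', setup=setup, number=100000))
print(timeit('dq[n // 2]', setup=setup, number=100000))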

Christian

Roy Smith

unread,
Jan 23, 2010, 9:40:56 AM1/23/10
to
In article
<a6531cf3-949d-4db9...@l11g2000yqb.googlegroups.com>,
Steve Howell <show...@yahoo.com> wrote:

> This innocent program here literally moves about a million bytes of
> memory around for no good reason:
>
> lst = []
> for i in range(2000):
> lst.append(i)
> while lst:
> print lst.pop(0)
>
> Why? Because list.pop(0) is implemented in O(N) instead of O(1).

I think you're being a little pedantic here. Yes, it is true that pop(0)
is O(n), and that if you put an O(n) operation in a loop, you get O(n^2)
run time.

The problem is that while it is well-known that putting something that's
O(n) in a loop gets you O(n^2), it's not well known that pop(0) for a
Python list is O(n). This is where you and I apparently start to differ in
what we think about this.

You are arguing that this is a bug in the implementation of list. While I
suppose there's some validity to that argument, I disagree. What I would
argue (and have done so several times over the years, with little success)
is that this is a bug in the documentation!

I'm looking at http://tinyurl.com/cdbwog. It shows all the operations of a
list. What it does not show is their cost. For pop(), it has a note:

"The pop() method is only supported by the list and array types. The
optional argument i defaults to -1, so that by default the last item is
removed and returned".

There's nothing there that gives any hint that pop(0) is any more expensive
than pop(-1). That is "secret knowledge", which you only get by following
the newsgroup discussions or looking at the implementation. You shouldn't
have to do either. There's lots of possible ways list could be
implemented. Without knowing the details, I'm left to guess about
important stuff like the cost of operations.

Every one of these operations should list the cost. Even if it's something
as vague as, "While not guaranteed by the language spec, in the current
implementation of CPython ...".

Roy Smith

unread,
Jan 23, 2010, 9:57:04 AM1/23/10
to
In article <hje979$kc9$1...@news.eternal-september.org>,

"Alf P. Steinbach" <al...@start.no> wrote:

> But it would IMHO have been better if it wasn't called "list", which brings
> in the wrong associations for someone used to other languages.

+1.

When I first started using Python (back in the 1.4 days), I assumed a list
was a singly-linked list. Which, of course, leads to the assumption that
pop(0) is O(1), and lots of other wrong thinking(*).

Oddly enough, I was going to write in the above paragraph, "like a C++ STL
list", until I happened to glance at the STL docs and refreshed my memory
that an STL list is doubly-linked. Which just goes to show that making
assumptions based on names is a bad idea.

So, we're right back to my statement earlier in this thread that the docs
are deficient in that they describe behavior with no hint about cost.
Given that, it should be no surprise that users make incorrect assumptions
about cost.

(*) I suspect somebody is going to point out that list.pop was added in
some version later than 1.4, but that's a detail.

Steven D'Aprano

unread,
Jan 23, 2010, 10:54:26 AM1/23/10
to
On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:

> In article <hje979$kc9$1...@news.eternal-september.org>,
> "Alf P. Steinbach" <al...@start.no> wrote:
>
>> But it would IMHO have been better if it wasn't called "list", which
>> brings in the wrong associations for someone used to other languages.
>
> +1.
>
> When I first started using Python (back in the 1.4 days), I assumed a
> list was a singly-linked list.

Why would you do that? I can think of at least eight different
implementations of the abstract list data structure:

constant-size array
variable-size array
variable-size array with amortised O(1) appends
singly-linked list
singly-linked list with CDR coding
doubly-linked list
skip list
indexable skip list

One can reasonably disregard constant-sized arrays as a possibility,
given that Python lists aren't fixed size (pity the poor Pascal and
Fortran coders who are stuck with static arrays!), but the rest are all
reasonable possibilities. Why assume one specific implementation in the
absence of documentation promising certain performance characteristics?


> Oddly enough, I was going to write in the above paragraph, "like a C++
> STL list", until I happened to glance at the STL docs and refreshed my
> memory that an STL list is doubly-linked. Which just goes to show that
> making assumptions based on names is a bad idea.

Exactly :)

> So, we're right back to my statement earlier in this thread that the
> docs are deficient in that they describe behavior with no hint about
> cost. Given that, it should be no surprise that users make incorrect
> assumptions about cost.

There are quite a few problems with having the documentation specify cost:

(1) Who is going to do it? Any volunteers?

(2) Big-oh notation can be misleading, especially for naive users, or
those whose intuition for what's fast has been shaped by other languages.
Big-oh doesn't tell you whether something is fast or slow, only how it
scales -- and sometimes not even then.

(3) Having documented a particular performance, that discourages
implementation changes. Any would-be patch or new implementation not only
has to consider that the functional behaviour doesn't change, but that
the performance doesn't either.

In practice the Python developers are unlikely to make an implementation
change which leads to radically worse performance, particularly for
critical types like list and dict. But in other cases, they might choose
to change big-oh behaviour, and not wish to be tied down by documentation
of the cost of operations.

(4) How much detail is necessary? What about degenerate cases? E.g. dict
lookup in CPython is typically O(1) amortised, but if all the keys hash
to the same value, it falls to O(N). (A quick sketch of that degenerate
case follows this list.)

(5) Should the language guarantee such degenerate behaviour? Who decides
which costs are guaranteed and which are not?

(6) Such performance guarantees should be implementation specific, not
language specific. CPython is only one implementation of the language out
of many.
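
Here is the quick sketch promised in point (4) -- my illustration, not
from the original post. Force every key into the same hash bucket and
dict lookup degrades from O(1) to O(N):

class BadHash(object):
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return 42                # every instance lands in one bucket
    def __eq__(self, other):
        return self.value == other.value

d = dict((BadHash(i), i) for i in range(1000))  # quadratic build time
print(d[BadHash(999)])  # must compare its way along the chain: O(N)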

--
Steven

Duncan Booth

unread,
Jan 23, 2010, 11:29:23 AM1/23/10
to
Roy Smith <r...@panix.com> wrote:

> I'm looking at http://tinyurl.com/cdbwog. It shows all the operations
> of a list. What it does not show is their cost. For pop(), it has a
> note:
>
> "The pop() method is only supported by the list and array types. The
> optional argument i defaults to -1, so that by default the last item
> is removed and returned".

The page you should probably be looking at is
http://wiki.python.org/moin/TimeComplexity

Alf P. Steinbach

unread,
Jan 23, 2010, 11:37:22 AM1/23/10
to
* Steven D'Aprano:

> On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:
>
>> In article <hje979$kc9$1...@news.eternal-september.org>,
>> "Alf P. Steinbach" <al...@start.no> wrote:
>>
>>> But it would IMHO have been better if it wasn't called "list", which
>>> brings in the wrong associations for someone used to other languages.
>> +1.
>>
>> When I first started using Python (back in the 1.4 days), I assumed a
>> list was a singly-linked list.
>
> Why would you do that? I can think of at least eight different
> implementations of the abstract list data structure:
>
> constant-size array
> variable-size array
> variable-size array with amortised O(1) appends
> singly-linked list
> singly-linked list with CDR coding
> doubly-linked list
> skip list
> indexable skip list
>
> One can reasonably disregard constant-sized arrays as a possibility,
> given that Python lists aren't fixed size (pity the poor Pascal and
> Fortran coders who are stuck with static arrays!), but the rest are all
> reasonable possibilities.

A linked list implementation would yield O(n) indexing. A great many
loops in, e.g., Python library code that now run in linear time would
then become quadratic, O(n^2). Those libraries would be effectively
unusable without extensive rewriting: one version for ordinary Python
and one for 'list-as-list' Pythons...

Thus, the linked list implementations are IMO *not* reasonable.

And the reason is precisely the implied complexity guarantees, especially on
indexing -- which could reasonably be O(log n), but not worse than that.
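
A tiny illustration (mine, not from the post) of how O(n) indexing
would poison ordinary loops:

def total(seq):
    # Looks linear, but if seq[i] itself cost O(n) -- as it would on
    # a plain linked list -- this loop would silently become O(n^2).
    t = 0
    for i in range(len(seq)):
        t += seq[i]
    return t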


> Why assume one specific implementation in the
> absence of documentation promising certain performance characteristics?
>
>
>> Oddly enough, I was going to write in the above paragraph, "like a C++
>> STL list", until I happened to glance at the STL docs and refreshed my
>> memory that an STL list is doubly-linked. Which just goes to show that
>> making assumptions based on names is a bad idea.
>
> Exactly :)
>
>
>
>> So, we're right back to my statement earlier in this thread that the
>> docs are deficient in that they describe behavior with no hint about
>> cost. Given that, it should be no surprise that users make incorrect
>> assumptions about cost.
>
> There are quite a few problems with having the documentation specify cost:
>
> (1) Who is going to do it? Any volunteers?

This problem must have been addressed at each time the documentation for some
version of Python was written or updated.


> (2) Big-oh notation can be misleading, especially for naive users, or
> those whose intuition for what's fast has been shaped by other languages.
> Big-oh doesn't tell you whether something is fast or slow, only how it
> scales -- and sometimes not even then.

It's how things scale that is of interest. :-)

Big-oh tells you an upper asymptotic limit.

That's sufficient for e.g. the C++ standard -- which, by the way, constitutes
a concrete example of the practicality of specifying complexity.


> (3) Having documented a particular performance, that discourages
> implementation changes. Any would-be patch or new implementation not only
> has to consider that the functional behaviour doesn't change, but that
> the performance doesn't either.
>
> In practice the Python developers are unlikely to make an implementation
> change which leads to radically worse performance, particularly for
> critical types like list and dict. But in other cases, they might choose
> to change big-oh behaviour, and not wish to be tied down by documentation
> of the cost of operations.

Say that there were a documented O(log n) worst-case complexity for
'list' indexing. Above you have described it as "reasonable" to break
that by having O(n) complexity... But in light of my comments on that,
and especially thinking a bit about the maintenance of two (or more!)
versions of various libraries, don't you agree that it would be Just
Bad(TM)?


> (4) How much detail is necessary? What about degenerate cases? E.g. dict
> lookup in CPython is typically O(1) amortised, but if all the keys hash
> to the same value, it falls to O(N).

From N1745, the Technical Report 1 on C++ library extensions (which
will be part of the C++0x standard), table 21, specifying general
requirements of unordered associative containers:

expression:  b.find(k)
return type: iterator
assertion:   Returns an iterator pointing to an element with key
             equivalent to k, or b.end() if no such element exists.
complexity:  Average case O(1), worst case O(b.size()).


> (5) Should the language guarantee such degenerate behaviour? Who decides
> which costs are guaranteed and which are not?

I think the C++ standard (the latest draft of C++0x is freely available as PDF
from the commitee pages) provides good guidance in this regard. :-)


> (6) Such performance guarantees should be implementation specific, not
> language specific. CPython is only one implementation of the language out
> of many.

Disagree Very Strongly. An implementation may offer stricter guarantees. But
what matters regarding e.g. avoiding having to maintain two or three or umpteen
versions of a library, is the set of language level complexity guarantees.


Cheers,

- Alf

Steve Howell

unread,
Jan 23, 2010, 11:41:15 AM1/23/10
to

And the benchmark confirmed it. The slowness is fairly negligible,
though.

Steve Howell

unread,
Jan 23, 2010, 11:46:18 AM1/23/10
to
On Jan 23, 6:40 am, Roy Smith <r...@panix.com> wrote:
> In article
> <a6531cf3-949d-4db9-9800-590302fd7...@l11g2000yqb.googlegroups.com>,

>  Steve Howell <showel...@yahoo.com> wrote:
>
> > This innocent program here literally moves about a million bytes of
> > memory around for no good reason:
>
> >     lst = []
> >     for i in range(2000):
> >         lst.append(i)
> >     while lst:
> >         print lst.pop(0)
>
> > Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
>
> I think you're being a little pedantic here.  Yes, it is true that pop(0)
> is O(n), and that if you put an O(n) operation in a loop, you get O(n^2)
> run time.
>
> The problem is that while it is well-known that putting something that's
> O(n) in a loop gets you O(n^2), it's not well known that pop(0) for a
> Python list is O(n).  This is where you and I apparently start to differ in
> what we think about this.
>

The source for Python is open. It pretty clearly shows that you move
N pointers when you pop from the front of the list.

Less clear is how linear the performance of memmove is. My benchmarks
on the C program show that, at least on my computer, the results do
not seem to contradict the "roughly linear" assumption.
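
For anyone who wants to check the scaling from Python rather than C,
here is a rough sketch (mine): time a fixed number of front-pops
against lists of increasing size.

from timeit import timeit

for n in (10000, 100000, 1000000):
    # setup runs once per measurement, so each run pops 1000 items
    # from a fresh list of n elements.
    t = timeit('lst.pop(0)',
               setup='lst = list(range(%d))' % n,
               number=1000)
    print("%d %f" % (n, t))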

> You are arguing that this is a bug in the implementation of list.  While I
> suppose there's some validity to that argument, I disagree.  What I would
> argue (and have done so several times over the years, with little success)
> is that this is a bug in the documentation!
>

> I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations of a
> list.  What it does not show is their cost.  For pop(), it has a note:
>
> "The pop() method is only supported by the list and array types. The
> optional argument i defaults to -1, so that by default the last item is
> removed and returned".
>
> There's nothing there that gives any hint that pop(0) is any more expensive
> than pop(-1).  That is "secret knowledge", which you only get by following
> the newsgroup discussions or looking at the implementation.  You shouldn't
> have to do either.  There's lots of possible ways list could be
> implemented.  Without knowing the details, I'm left to guess about
> important stuff like the cost of operations.
>
> Every one of these operations should list the cost.  Even if it's something
> as vague as, "While not guaranteed by the language spec, in the current
> implemenation of CPython ...".

I agree with that.

Steve Howell

unread,
Jan 23, 2010, 12:03:46 PM1/23/10
to
On Jan 23, 4:12 am, Steven D'Aprano <st...@REMOVE-THIS-

A minor modification of your solution does work, but it also slightly
complicates the implementation. Keeping track of the start variable
requires bookkeeping not just in recurse(), but also in the methods
that it calls. This is a pretty small program, so it's acceptable to
pass around an offset variable to anybody else who might want to be
consuming the list.

############ Generic indentation stuff follows

-def get_indented_block(prefix_lines):
-    prefix, line = prefix_lines[0]
+def get_indented_block(prefix_lines, start):
+    prefix, line = prefix_lines[start]
     len_prefix = len(prefix)
     i = 1
-    while i < len(prefix_lines):
-        new_prefix, line = prefix_lines[i]
+    while i + start < len(prefix_lines):
+        new_prefix, line = prefix_lines[start+i]
         if line and len(new_prefix) <= len_prefix:
             break
         i += 1
-    while i-1 > 0 and prefix_lines[i-1][1] == '':
+    while i-1 > 0 and prefix_lines[start+i-1][1] == '':
         i -= 1
     return i

@@ -190,15 +190,16 @@
         ):
     append = output.append
     def recurse(prefix_lines):
-        while prefix_lines:
-            prefix, line = prefix_lines[0]
+        start = 0
+        while start < len(prefix_lines):
+            prefix, line = prefix_lines[start]
             if line == '':
-                prefix_lines.pop(0)
+                start += 1
                 append('')
             else:
-                block_size = get_block(prefix_lines)
+                block_size = get_block(prefix_lines, start)
                 if block_size == 1:
-                    prefix_lines.pop(0)
+                    start += 1
                     if line == pass_syntax:
                         pass
                     elif line.startswith(flush_left_syntax):
@@ -208,8 +209,8 @@
                         append(line[len(flush_left_syntax):])
                     elif line.startswith(flush_left_empty_line):
                         append('')
                     else:
                         append(prefix + leaf_method(line))
                 else:
-                    block = prefix_lines[:block_size]
-                    prefix_lines = prefix_lines[block_size:]
+                    block = prefix_lines[start:start+block_size]
+                    start += block_size
                     branch_method(output, block, recurse)
         return
     prefix_lines = map(indentation_method, lines)

Steve Howell

unread,
Jan 23, 2010, 12:17:31 PM1/23/10
to
On Jan 23, 4:12 am, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> On Fri, 22 Jan 2010 21:42:43 -0800, Steve Howell wrote:
> > This innocent program here literally moves about a million bytes of
> > memory around for no good reason:
>
> >     lst = []
> >     for i in range(2000):
> >         lst.append(i)
> >     while lst:
> >         print lst.pop(0)
>
> > Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
>
> > Why wouldn't you get a competent C programmer simply make list_ass_slice
> > smart enough to make list.pop(0) O(1)?
>
> Because there are always trade-offs, and the competent C programmers who
> coded the implementation for lists chose different trade-offs to the ones
> you would prefer.
>
> Seems to me that the simple solution to your problem is for you to
> implement your own data structure that makes whatever trade-offs you
> like. If it is good enough and popular enough, it might even replace the
> existing list implementation.
>

The data structure that would make the tradeoffs I want would be
implemented within CPython itself. I give a sketch of the changes
elsewhere in this thread.

Terry Reedy said:

'''
If you try writing a full patch, as I believe someone did, or at least a
prototype thereof, when the idea was discussed, you might have a better
idea of what the tradeoffs are and why it was rejected.
'''

I have to run, but tomorrow I will try to dig through python-dev
archives and find the patch. If anybody has hints on where to look
for it (anybody remember the author, for example?), it would be much
appreciated.

If the patch looks simple, I will try to pitch the idea that its time
has come. Now that the specification of the language itself is
frozen, I think there might be more room for improving
implementations. Also, I might be able to make the argument that
tradeoffs of memory vs. CPU vs. code complexity have different forces
in the 2010s.

Thanks for your reply.

Steve Howell

unread,
Jan 23, 2010, 12:38:29 PM1/23/10
to
On Jan 23, 7:54 am, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:
> > In article <hje979$kc...@news.eternal-september.org>,

Bringing this thread full circle, does it make sense to strike this
passage from the tutorial?:

'''
It is also possible to use a list as a queue, where the first element
added is the first element retrieved (“first-in, first-out”); however,
lists are not efficient for this purpose. While appends and pops from
the end of list are fast, doing inserts or pops from the beginning of
a list is slow (because all of the other elements have to be shifted
by one).
'''

I think points #3 and #6 possibly apply. Regarding points #2 and #4,
the tutorial is at least not overly technical or specific; it just
explains the requirement to shift other elements one by one in simple
layman's terms.

Aahz

unread,
Jan 23, 2010, 1:50:42 PM1/23/10
to
In article <8e4d3fe2-c4bd-4a73...@o28g2000yqh.googlegroups.com>,
Steve Howell <show...@yahoo.com> wrote:

>On Jan 22, 11:10 pm, a...@pythoncraft.com (Aahz) wrote:
>>
>>>I know Python's number one concern will never be speed, but if Python
>>>makes an O(1) operation into an unnecessarily O(N) operation for no
>>>good reasons other than "it's too complicated, " or it "adds another
>>>pointer to the structure," or "it adds another conditional check to
>>>list_ass_slice for operations that aren't popping off the top," I
>>>think it's reasonable to challenge the design philosophy.
>>
>> "Rough consensus and running code."
>>
>> You have a good point, but nobody will ever give your idea serious
>> attention until there's a patch and benchmarks.
>
>Here is a benchmark of O(N*N) vs. O(N) for two C programs. One does
>memmove; the other simply advances the pointer.

You should provide pybench numbers and probably also use the benchmarks
produced by the Unladen Swallow project. The next step is to file a
patch on bugs.python.org and request review.

Roy Smith

unread,
Jan 23, 2010, 2:02:53 PM1/23/10
to
In article <Xns9D09A7BCC6...@127.0.0.1>,
Duncan Booth <duncan...@invalid.invalid> wrote:

I was not aware of this page; thanks for pointing it out.

Terry Reedy

unread,
Jan 23, 2010, 6:04:01 PM1/23/10
to pytho...@python.org
On 1/23/2010 12:17 PM, Steve Howell wrote:

> Terry Reedy said:
>
> '''
> If you try writing a full patch, as I believe someone did, or at least
> a
> prototype thereof, when the idea was discussed, you might have a
> better
> idea of what the tradeoffs are and why it was rejected.
> '''
>
> I have to run, but tomorrow I will try to dig through python-dev
> archives and find the patch. If anybody has hints on where to look
> for it (anybody remember the author, for example?), it would be much
> appreciated.

The approach you outlined in your other response to me is, I believe,
what was considered, investigated, and then rejected (by Guido, with
agreement). The discussion may have been on the now-closed and
(misspelled) pyk3 (?) list, or maybe on python-ideas, but my memory
favors the former. I am sure that Raymond H. was involved also.

> If the patch looks simple, I will try to pitch the idea that its time
> has come. Now that the specification of the language itself is
> frozen, I think there might be more room for improving
> implementations. Also, I might be able to make the argument that
> tradeoffs of memory vs. CPU vs. code complexity have different forces
> in the 2010s.

I am not opposed to a possible change, just hasty, ill-informed
criticism. If there is not a PEP on this issue, it would be good to have
one that recorded the proposal and the pros and cons, regardless of the
outcome, so there would be something to refer people to. If that had
been already done, it would have shortened this thread considerably.

Terry Jan Reedy

Terry Reedy

unread,
Jan 23, 2010, 6:08:37 PM1/23/10
to pytho...@python.org

Perhaps you could suggest on the tracker a place or places in the doc
where this relatively new wiki page could be referred to. Perhaps in the
introductory paragraphs of the Built-in Types section of the lib ref.
Where would you have liked to have found it?

The page was added in response to threads like this one, but it
obviously is more obscure than it should be.

Terry Jan Reedy


Roy Smith

unread,
Jan 23, 2010, 10:02:22 PM1/23/10
to
In article <mailman.1340.1264288...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:

> >> The page you should probably be looking at is
> >> http://wiki.python.org/moin/TimeComplexity
> >
> > I was not aware of this page; thanks for pointing it out.
>
> Perhaps you could suggest on the tracker a place or places in the doc
> where this relatively new wiki page could be referred to. Perhaps in the
> introductory paragraphs of the Built-in Type section of the lib ref.
> Where would you have like to have found it?

I think the most logical place would have been right at the table of
operations (http://tinyurl.com/cdbwog).

Raymond Hettinger

unread,
Jan 23, 2010, 11:00:37 PM1/23/10
to
[Steve Howell]

> Why wouldn't you get a competent C programmer simply make
> list_ass_slice smart enough to make list.pop(0) O(1)?

When this suggestion was discussed on python-dev years ago,
it was rejected. One reason is that it was somewhat
common for C code to access the list data structure directly
(bypassing API accessor functions). Changing the list to have
a starting offset would break existing C extensions.

Another reason is that Guido is non-tolerant of space or time
trade-offs for lists and tuples because they pervade the language
and are heavily used internally. Any additional space or time
requirement however small would impact the language performance
as a whole. FWIW, that is also the reason that lists are not
weak-referenceable (it would cost one extra pointer field per
instance and that wasn't deemed to be worth it).


> The brilliant computer scientist, Christian Heimes, provides the
> answers, and I am paraphrasing here, of course:

IMHO, Christian IS a brilliant computer scientist, so I'll ignore
the rude intention and take the sentence literally.


>   1) You can save 64 bits for every list by not storing an extra
> pointer!
>   2) You can simplify the CPython implementation!
>   3) You can avoid the oh-so-expensive check "if ilow == 1" for all
> operations that don't need this optimization!
>
> Sounds like two micro-optimizations to me (and a copout to boot).

Micro or not, these reasons were part of Guido's rationale.
Tim Peters and I also participated in the conversation and did not
disagree.

So, collections.deque() was born and the problem stopped being an
issue. Also, Daniel Stutzbach has published a blist implementation
that offers high-performance insertions and deletions and fast
handling of sparse lists.
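
A usage sketch, assuming the third-party blist package (not part of
the standard library) is installed; it is advertised as a drop-in
replacement for list:

from blist import blist

lst = blist(range(2000))
while lst:
    lst.pop(0)    # O(log n) per pop rather than list's O(n)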


Raymond

Steve Howell

unread,
Jan 24, 2010, 5:33:36 AM1/24/10
to
On Jan 23, 8:00 pm, Raymond Hettinger <pyt...@rcn.com> wrote:
> [Steve Howell]
>
> > Why wouldn't you get a competent C programmer simply make
> > list_ass_slice smart enough to make list.pop(0) O(1)?
>
> When this suggestion was discussed on python-dev years ago,
> it was rejected.  One reason is that it was somewhat
> common for C code to access the list data structure directly
> (bypassing API accessor functions).  Changing the list to have
> a starting offset would break existing C extensions.
>
> Another reason is that Guido is non-tolerant of space or time
> trade-offs for lists and tuples because they pervade the language
> and are heavily used internally.  Any additional space or time
> requirement however small would impact the language performance
> as a whole.  FWIW, that is also the reason that lists are not
> weak-referenceable (it would cost one extra pointer field per
> instance and that wasn't deemed to be worth it).
>
> > The brilliant computer scientist, Christian Heimes, provides the
> > answers, and I am paraphrasing here, of course:
>
> IMHO, Christian IS a brilliant computer scientist, so I'll ignore
> the rude intention and take the sentence literally.
>

You are also a brilliant computer scientist, despite the fact that you
are defending a list implementation that can't pop the first element
off the list in O(1) time.

Steven D'Aprano

unread,
Jan 24, 2010, 6:20:09 AM1/24/10
to
On Sun, 24 Jan 2010 02:33:36 -0800, Steve Howell wrote:

> You are also a brilliant computer scientist, despite the fact that you
> are defending a list implementation that can't pop the first element off
> the list in O(1) time.

You say that like it's a bad thing.

It's very simple: the trade-offs that the Python development team have
deliberately chosen aren't the same trade-offs that you prefer. That
doesn't make your trade-offs right and Python's wrong. They're just
different, and if Python lists had your preferred implementation, I
guarantee that somebody would be complaining about it right now.

If you're serious about wanting O(1) pops from the start of the list,
write your own list implementation and use it. You might even like to
make it public, so others can use it as well. But please stop with the
snide remarks and poorly disguised insults and back-handed compliments,
it's getting tedious.

Or just change your algorithm slightly -- it's not hard to turn an
algorithm that pops from the start of a list to one that pops from the
end of the list.
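
The transformation is mechanical. A minimal sketch (mine):

items = list(range(2000))
items.reverse()             # one O(N) pass up front
while items:
    item = items.pop()      # O(1): pops what was originally the front
    print(item)             # stand-in for real per-item processing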

--
Steven

Steve Howell

unread,
Jan 24, 2010, 1:59:32 PM1/24/10
to
On Jan 24, 3:20 am, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> On Sun, 24 Jan 2010 02:33:36 -0800, Steve Howell wrote:
> > You are also a brilliant computer scientist, despite the fact that you
> > are defending a list implementation that can't pop the first element off
> > the list in O(1) time.
>
> You say that like it's a bad thing.

It is.

> It's very simple: the trade-offs that the Python development team have
> deliberately chosen aren't the same trade-offs that you prefer. That
> doesn't make your trade-offs right and Python's wrong. They're just
> different, and if Python lists had your preferred implementation, I
> guarantee that somebody would be complaining about it right now.
>
> If you're serious about wanting O(1) pops from the start of the list,
> write your own list implementation and use it. You might even like to
> make it public, so others can use it as well. But please stop with the
> snide remarks and poorly disguised insults and back-handed compliments,
> it's getting tedious.

I will stop being snide, but I will be blunt, and if anybody
interprets my criticism as an insult, so be it.

The current algorithm is broken. It's a 20th century implementation
of lists built on top of a 20th century memory manager. It's at least
ten years behind the times.

> Or just change your algorithm slightly -- it's not hard to turn an
> algorithm that pops from the start of a list to one that pops from the
> end of the list.
>

The fact that you are proposing to reverse a list to work around its
performance deficiencies just confirms to me that the algorithm is
broken. I will concede that most of CPython's tradeoffs are driven by
the limitations of the underlying memory manager. If realloc() allowed
you to easily release memory from the front of a previously allocated
block, we'd be talking about maybe a 10-line patch here, and it
wouldn't impact even list_resize() in a meaningful way.

Even with realloc()'s brokenness, you could improve pop(0) in a way
that does not impact list access at all, and the patch would not
change the time complexity of any operation; it would just add
negligible extra bookkeeping within list_resize() and a few other
places.

The objection that the extra pointer would increase the size of list
objects is totally 20th century thinking. It would be negligible for
any real-world program.


Steve Howell

unread,
Jan 24, 2010, 2:26:32 PM1/24/10
to

I think it's a good idea to write a PEP on this issue, and I will
attempt a first draft. I think I should submit the first draft to
python-ideas, correct?

I expect the PEP to be at least initially, if not permanently,
rejected, but it would not be an exercise in futility, as I agree it's
good to record pros and cons of the proposal in one place. The PEP
probably would not include a proposed patch until there was a little
bit of consensus behind it, but it would not take me a lot of time to
present the basic argument.

Here is my sketch of what the PEP would look like.

Proposal: Improve list's implementation so that deleting elements from
the front of the list does not require an O(N) memmove operation.

Rationale: Some Python programs that process lists have multiple
methods that consume the first element of the list and pop it off.
The pattern comes up with parsers in particular, but there are other
examples. It is possible now, of course, to use a data structure in
Python that offers O(1) deletion from the front, but none of the
alternatives fully replicate the benefits of list itself.

Specification: Improving CPython's performance does not affect the
language itself, so there are no bikeshed arguments to be had with
respect to syntax, etc. Any patch would, of course, affect the
performance of nearly every Python program in existence, so any patch
would have to, at a bare minimum:

1) Not increase the time or memory complexity of any other list
operation.
2) Not affect list access at all.
3) Minimally affect list operations that mutate the list.
4) Be reasonably simple within CPython itself.
5) Not be grossly wasteful of memory.

Backwards Compatibility:

See above. An implementation of this PEP would not change the
definition of the language in any way, but it would have to minimally
impact the performance of lists for the normal use cases.

Implementation:

There are two ways to make deleting the first item of the list run
more efficiently.

The most ambitious proposal is to fix the memory manager itself to
allow the release of memory from the start of the chunk. The
advantage of this proposal is that it would simplify the changes to
list itself, and possibly have collateral benefits for other CPython
internal data structures. The disadvantage of the proposal is that
there is a strong tradition in CPython of using native memory
management, particularly because it runs on many platforms.

The less ambitious proposal is to change the memory management scheme
within list itself. There is already precedent in list_resize() to
optimistically allocate memory, so it is not a great break from
tradition to optimistically defer the release of memory. But it would
complicate things.

References:

I would refer to this thread on comp.lang.python for discussion, and I
would also try to dig up older threads on python-dev or elsewhere.

Aahz

unread,
Jan 24, 2010, 2:28:53 PM1/24/10
to
In article <b4440231-f33f-49e1...@b2g2000yqi.googlegroups.com>,

Steve Howell <show...@yahoo.com> wrote:
>
>Even with realloc()'s brokenness, you could improve pop(0) in a way
>that does not impact list access at all, and the patch would not change
>the time complexity of any operation; it would just add negligible
>extra bookkeeping within list_resize() and a few other places.

Again, your responsibility is to provide a patch and a spectrum of
benchmarking tests to prove it. Then you would still have to deal with
the objection that extensions use the list internals -- that might be an
okay sell given the effort otherwise required to port extensions to
Python 3, but that's not the way to bet.

Have you actually read the discussions you were pointed at?

Steve Howell

unread,
Jan 24, 2010, 2:53:15 PM1/24/10
to
On Jan 24, 11:28 am, a...@pythoncraft.com (Aahz) wrote:
> In article <b4440231-f33f-49e1-9d6f-5fbce0a63...@b2g2000yqi.googlegroups.com>,

> Steve Howell  <showel...@yahoo.com> wrote:
>
>
>
> >Even with realloc()'s brokenness, you could improve pop(0) in a way
> >that does not impact list access at all, and the patch would not change
> >the time complexity of any operation; it would just add negligible
> >extract bookkeeping within list_resize() and a few other places.
>
> Again, your responsibility is to provide a patch and a spectrum of
> benchmarking tests to prove it.  Then you would still have to deal with
> the objection that extensions use the list internals -- that might be an
> okay sell given the effort otherwise required to port extensions to
> Python 3, but that's not the way to bet.

Ok.

> Have you actually read the discussions you were pointed at?

I don't think anybody provided an actual link, but please correct me
if I overlooked it.

Terry Reedy

unread,
Jan 24, 2010, 3:13:16 PM1/24/10
to pytho...@python.org
On 1/24/2010 2:26 PM, Steve Howell wrote:

> I think it's a good idea to write a PEP on this issue, and I will
> attempt a first draft. I think I should submit the first draft to
> python-ideas, correct?

That is not a *requirement* for drafts in general, but it is a good idea
for a community or community-person generated proposal, such as this one.

> I expect the PEP to be at least initially, if not permanently,
> rejected,

Guido sometimes rejects 'no-chance' proposals without waiting to be
asked, but he usually waits until the PEP author feels the issue is ripe
and asks for a pronouncement.

tjr

Paul Rubin

unread,
Jan 24, 2010, 3:44:25 PM1/24/10