My main issue is that for correct usage:
- user should provide an external list, that shouldn't use for
anything else to don't break the invariant
- to alter the heap queue, a bunch of functions must be used, passing
always the external list
I think that changing "external list" for "internal attribute", and
"bunch of functions " for "methods", it will leave the module easier
to use, safer, and more object oriented.
So, I basically coded it. What do you think about including this class
in the heap module?
"""
class Heap(object):
'''Maintains a heapified list, always conserving the invariant.
Heaps are arrays for which heap[k] <= heap[2*k+1] and
heap[k] <= heap[2*k+2] for all k, counting elements from zero.
'''
def __init__(self, iterable=None):
'''Initializes the heap from any iterable.
>>> Heap([1, 2])
Heap([1, 2])
>>> Heap([])
Heap([])
>>> Heap()
Heap([])
>>> Heap((1, 2, 3))
Heap([1, 2, 3])
>>> Heap(x**2 for x in range(3))
Heap([0, 1, 4])
'''
if iterable is None:
self._queue = []
else:
self._queue = list(iterable)
heapq.heapify(self._queue)
def __repr__(self):
return "Heap(%s)" % self._queue
def push(self, item):
'''Push the item to the heap queue.
>>> h = Heap()
>>> h.push(5)
>>> h.push(4)
>>> h.push(3)
>>> h.push(2)
>>> h.push(1)
>>> h.push(0)
>>> h
Heap([0, 2, 1, 5, 3, 4])
'''
heapq.heappush(self._queue, item)
def pop(self):
'''Pop one item from the heap queue.
>>> h = Heap([0, 2, 1, 5, 3, 4])
>>> h.pop()
0
>>> h.pop()
1
>>> h.pop()
2
>>> h.pop()
3
>>> h.pop()
4
>>> h.pop()
5
>>> h
Heap([])
>>> h.pop()
Traceback (most recent call last):
...
IndexError: index out of range
'''
return heapq.heappop(self._queue)
def pushpop(self, item):
'''Push the item and pop one from the heap queue.
Note that this method is more efficient than calling both
push() and pop() separately.
>>> h = Heap()
>>> h.pushpop(7)
7
>>> h.push(5)
>>> h.push(4)
>>> h.push(3)
>>> h.pushpop(7)
3
>>> h.pushpop(7)
4
>>> h
Heap([5, 7, 7])
'''
return heapq.heappushpop(self._queue, item)
def replace(self, item):
'''Pop one item and push the received one into the heap queue
Note that this method is more efficient than calling both
pop() and push() separately.
>>> h = Heap()
>>> h.replace(3)
Traceback (most recent call last):
...
IndexError: index out of range
>>> h.push(7)
>>> h.replace(2)
7
>>> h
Heap([2])
>>> h.push(3)
>>> h
Heap([2, 3])
>>> h.replace(1)
2
>>> h.replace(9)
1
>>> h
Heap([3, 9])
'''
return heapq.heapreplace(self._queue, item)
"""
Regards,
--
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas
+1 -- I recently did a short presentation on Big O notation and Python
containers, and heapq looked remarkably ugly.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur." --Red Adair
Your "Heap" class implements (part of) an abstract data type usually
known as a priority queue.
The fact that it uses the heapq module internally is an implementation
detail that is not (and should not be IMHO) exposed to the users of the
class. So if this is added, I think it should rather go in the
collections module.
Much better (faster for some operations) implementations of this ADT are
possible, so by not tying the implementation to heapq we would be free
to switch the implementation later if we want.
+1 to adding a collections.priority_queue, initially based on your class.
+0 to adding your class as heapq.Heap
In either case I think you should add at least the following additional
method.
def peek(self):
return self._queue[0]
- Jacob
I did something similar a couple of years ago (and I doubt that I'm the
only one). You should be able to find it in the bug tracker (no longer open).
Stefan
+1 -- I recently did a short presentation on Big O notation and Python
containers, and heapq looked remarkably ugly.
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
> Facundo Batista wrote:
>> Using the module heapq, I found that it's not easy to use. Or, at
>> least, that it could be straightforward to use.
>> [...]
>> So, I basically coded it. What do you think about including this class
>> in the heap module?
>
> I did something similar a couple of years ago (and I doubt that I'm the
> only one).
Indeed, you're not: http://code.activestate.com/recipes/440673/
George
Raymond
>> I think that changing "external list" for "internal attribute", and
>> "bunch of functions " for "methods", it will leave the module easier
>> to use, safer, and more object oriented.
>
> +1 -- I recently did a short presentation on Big O notation and Python
> containers, and heapq looked remarkably ugly.
_______________________________________________
May I suggest that you implement an interface similar to the sort()
method for list objects:
=========================
sort([cmp[, key[, reverse]]])
"cmp specifies a custom comparison function of two arguments (list
items) which should return a negative, zero or positive number
depending on whether the first argument is considered smaller than,
equal to, or larger than the second argument: cmp=lambda x,y:
cmp(x.lower(), y.lower()). The default value is None.
"key specifies a function of one argument that is used to extract a
comparison key from each list element: key=str.lower. The default
value is None.
"reverse is a boolean value. If set to True, then the list elements
are sorted as if each comparison were reversed."
========================
You could add these option arguments to the Heap class constructor,
which would then be used to maintain the heap. The _reverse_ argument
would cover the min/max heap question. The other arguments would
allow the user to use the class for a priority queue of more complex
objects. For example, the Heap could consist of object references
that would be compared using the function supplied in _cmp_ or _key_
arguments.
--
Gerald Britton
Cmp(), .__cmp__, cmp=xxx, etc, are gone in Py3.
+something for the other two.
--
Gerald Britton
1) I think the usual approach is to implement it for Py3, then backport it.
2) Since we already know that code that uses cmp=xyz is not Py3 compatible,
there is little use in providing this parameter for a new data type, so
that new code ends up being broken in the future.
Stefan
--
Gerald Britton
I've also got a pair heap implementation:
http://mail.python.org/pipermail/python-dev/2006-November/069845.html
There's also a proposed update of sched.py in the bugtracker (as part
of some updates to asyncore to add a scheduler) that offers a version
of decreasekey, as well as "remove" functionality. I describe some of
the trade-offs in implementing a heap with such features (memory use,
O-notation, etc.) in a blog post last summer:
http://chouyu-31.livejournal.com/316112.html . One of the ways we get
around the "key must be unique" limitation is that we don't use a
dictionary to reference the keys; when you put an item into the heap,
you get an opaque reference, which you can then cancel, "reschedule",
etc.
The sched module is heavily biased towards event scheduling, so adding
a collections.priority_queue or whatever is just fine with me.
- Josiah
>
> Ok thanks. Any idea what the uptake is on Py3? I know that the
> projects I work on are not even talking about moving in that direction
> due to the significant migration effort. Also, I hear that
> performance is an issue, though that might be improve over the longer
> term.
3.0, of course, doesn't have yet a huge following.
Most of the performance problems will be fixed in 3.1.
> Facundo, I would like to work with you on this.
> I've been the primary maintainer for heapq for a while
> and had already started working on something like this
> in response to repeated requested to support a key= function
> (like that for sorted/min/max).
After a not much complicated, but different year (I had a kid!), I'm
bringing this thread back to live.
There were different proposals of different people about what to do
after my initial mail, we can separate them in two sections:
- Move the Heap class to the collections module: I'm just changing the
heapq module to have an OO interface, instead a bunch of functions.
I'm +0 to moving it to "collections", but note that even after the
reordering of the stdlib for Py3, the heapq module remained there.
- Add functionality to the Heap class: I'm +0 to this, but I don't
want to stop this change in function of further functionality... I
propose to have the same functionality, in an OO and less error prone
way. We can add more functionality afterwards.
What do you think?
Raymond, let's work together... but don't know where. The Heap class
is already coded in my first mail, if you want to start from there and
add functionality, I'm +0. If you want me to add tests and push the
inclusion of that class into the module, just tell me. Something else,
I'm all ears, :)
Regards,
--
. Facundo