heapq._siftdown / decrease-key support?

Skip to first unread message

Joshua Bronson

Jan 13, 2010, 8:16:49 PM1/13/10
to dan...@stutzbachenterprises.com, pyt...@rcn.com, ele...@elemel.se
I recently implemented A* search in Python using the heapq module and
in my first pass, to accomplish the decrease-key operation I resorted
to doing a linear scan of the list to find the position of the key in
the heap, and then calling the private undocumented method
heapq._siftdown at this position to enforce the heap property after
updating the key's priority[1].

Then I found Daniel Stutzbach's HeapDict[2], which looks like a whole
new package written just to support this operation (it also provides
some nice encapsulation to boot). So I switched[3]. The cost of this
improved implementation is slightly worse performance; I'm guessing
this is because more code is being run in Python now rather than in C.

Now I've just stumbled upon Mikael Lind's python-astar package[4],
which implements A* using heapq without needing to use _siftdown;
instead, if a better path to a neighbor is found, it marks the already-
found node on the worse path as invalid, pushes a new node onto the
heap with the improved priority, and at the end of each iteration,
pops off any invalid nodes from the top of the heap[5].

Before I rewrite my code another time to see what kind of performance
this results in, I figured I'd ask some interested parties first: Does
heapq not provide decrease-key (and increase-key) because it expects
you to work around it as Mikael has done, or for some other reason, or
is it planned for the future?


[1] http://bitbucket.org/jab/toys/src/7fe2965b275c/wordchain.py#cl-212
[2] http://github.com/DanielStutzbach/heapdict
[3] http://bitbucket.org/jab/toys/src/2ae894c91d08/wordchain.py#cl-218
[4] http://github.com/elemel/python-astar
[5] http://github.com/elemel/python-astar/blob/170c3d8c32c0c7ee24968c6d90fc6b6957461dec/src/astar.py#L112

Joshua Bronson

Jan 14, 2010, 5:03:38 AM1/14/10
to Daniel Stutzbach, Raymond Hettinger, Mikael Lind
Thanks for the responses!

On Thu, Jan 14, 2010 at 12:23 AM, Daniel Stutzbach
<dan...@stutzbachenterprises.com> wrote:
> Your guess is correct. Someday I'd like to rewrite HeapDict in C for speed, but I haven't been able to find the time (and no one has offered to pay me to make the time ;) ).

Daniel, did you realize you can accomplish logarithmic-time priority
change (albeit indirectly) with the built-in heapq module using this
mark-as-invalid trick? Are there other algorithms requiring decrease-
key besides A* and Dijkstra where this technique doesn't help?

On Thu, Jan 14, 2010 at 2:41 AM, Mikael Lind <ele...@elemel.se> wrote:
> If I recall correctly, avoiding the linear search did wonders for
> algorithm complexity.

It's still possible to perform the decrease-key operation without
having to do a linear search; heapdict does it in logarithmic time. If
my cursory benchmarks were accurate (which upon reconsideration I'm
not actually sure they are), I think the only explanation for it being
possibly slower is that it's running in pure Python.

I just switched back to using heapq but now with the mark-as-invalid
trick instead of the linear scan, and the performance I'm observing is
very close to heapdict's. Which is curious, because now that both
operations are running in logarithmic time, I'd expect the heapq
implementation to be much faster since it's written in C. Maybe I have
some confounding variables in how I'm running my tests.

By the way, I just realized that the obvious reason the heapq module
doesn't support priority change is that its functions are designed to
operate on external lists rather than providing an encapsulating data
structure like heapdict. With heapdict, logarithmic decrease_key is
possible because it's also storing the positions of the elements in an
underlying dictionary, so instead of a linear scan it does a constant-
time lookup. My guess is that the simplicity of having heapq operate
on external lists is preferable to supporting decrease-key directly,
so it isn't likely to change any time soon, especially since you can
accomplish the same thing with the mark-as-invalid technique.

Raymond, do you think this technique is worth documenting in the heapq
module? It'd be too bad if any future users incorrectly think that it
won't meet their needs the way I did.

Terry Reedy

Jan 14, 2010, 6:50:55 AM1/14/10
to pytho...@python.org

If the values in the queue are associated with 0 to n-1, as can be used
for graph node identifiers , the positions can also be kept in an list,
again with constant time lookup.

Joshua Bronson

Jan 18, 2010, 5:12:20 PM1/18/10
to Raymond Hettinger, Daniel Stutzbach, Mikael Lind
On Sun, Jan 17, 2010 at 11:30 PM, Raymond Hettinger <pyt...@rcn.com>

> > Raymond, do you think this technique is worth documenting in the heapq
> > module? It'd be too bad if any future users incorrectly think that it
> > won't meet their needs the way I did.
> Yes.  Please assign a tracker issue to me to expand the heapq documention
> to discuss the mark-as-invalid trick for priority queues.
> Raymond

I've created http://bugs.python.org/issue7734. I don't enough have
privileges to assign it to you, but I added you to the nosy list.


Reply all
Reply to author
0 new messages