One of the correspondents in that thread claims that it is
provably impossible to have an efficient priority queue implementation
without mutability. I think he's cuckoo. But I'd like to have some
numbers to back me up.
Can anyone point me to some actual benchmark results comparing
priority queue performance *with* mutation and priority queue
performance *without* mutation, in the same functional or
mostly-functional language?
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
> There's a current thread in the Erlang mailing list about
> priority queues. I'm aware of, for example, the Brodal/Okasaki
> paper and the David King paper. I'm also aware of James Cook's
> priority queue package in Hackage, have my own copy of Okasaki's
> book, and have just spent an hour searching the web.
>
> One of the correspondents in that thread claims that it is
> provably impossible to have an efficient priority queue implementation
> without mutability.
If he so claims, maybe you can challenge him by asking for a proof?
Such a proof would probably only involve asymptotics, since it's very hard
to *prove* anything when actual raw speed is involved. If that's the case,
you can use Okasaki to back yourself up (or back him up; I am not familiar
with the results in this area).
I've seen in a few programming circles now that "provably" is used as a
weasel word. Provably, eh? Where's the proof?
Luke
> There's a current thread in the Erlang mailing list about
> priority queues. I'm aware of, for example, the Brodal/Okasaki
> paper and the David King paper. I'm also aware of James Cook's
> priority queue package in Hackage, have my own copy of Okasaki's
> book, and have just spent an hour searching the web.
>
> One of the correspondents in that thread claims that it is
> provably impossible to have an efficient priority queue implementation
A priority queue based on skewed binomial heaps is asymptotically optimal
(O(1) for everything except deleteMin which is O(log n)), so if that's what
he means by "efficient" then he's most definitely wrong. If he's talking
about "small constant factors" then it's harder to understand what he's
referring to more precisely, and therefore what he means by "provably".
--
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
On Mon, Jun 15, 2009 at 2:12 PM, Lennart Augustsson
<len...@augustsson.net>wrote:
What about decreaseKey in a purely functional setting? I suppose it's
O(log n), based on the intuition of trees with limited branching factor.
Fibonacci heaps can do it in O(1), which makes a difference for
Dijkstra's algorithm, for example.
regards,
Bertram
He claims that the burden is on my end.
>
>
> Such a proof would probably only involve asymptotics, since it's
> very hard to prove anything when actual raw speed is involved.
> If that's the case, you can use Okasaki to back yourself up (or
> back him up; I am not familiar with the results in this area).
He is aware of Okasaki's book, about which he was somewhat offensive.
One thing that's clear is that he _isn't_ talking about asymptotics.
>
I've now done some benchmarks myself in C, Java, and Smalltalk,
comparing "imperative" versions of leftist heaps with "functional" ones.
For what it's worth, on a 2.16GHz Intel Core 2 Duo Mac, the
coefficient in front of the log(n) part was
C Java ST(A) ST(B)
"Imperative" 40 70 150 1123
"Functional" 240 126 290 1895
where ST(A) was a native-code Smalltalk and ST(B) a byte-code one.
The C/Functional case used the Boehm collector, untuned.
Times are in nanoseconds. Values of n ranged from 2 to 100; the
correspondent was saying that small sizes were important.
It seems that a factor of 2 for *this* problem is credible;
a factor of 10 is not.
Certainly not.
He claims X is provable, so he has to either provide a proof himself (a sketch may
suffice), or tell you where you can find a proof.
The burden of proof is always on the claimant.
The original poster in the Erlang thread on the subject
didn't ask for decreaseKey.
The problem with delete and decreaseKey is that they require a
way of identifying en entry _within_ a priority queue. This is
easy enough to do in C or Java: just hand out pointers to the
internal nodes. It's less easy in a language where nodes don't
_have_ identities, such as Haskell or Erlang. The Brodal and
Okasaki paper suggests using an extra dictionary data structure
for this purpose, roughly doubling the size of the whole thing.
just a wiiild guess on this, but anyway.
Maybe Oleg has something to say on this, in particular when it comes to
his domain, ie. "delimited continuations".
As I said, just a wild guess.
Günther
Richard O'Keefe schrieb:
Sounds cuckoo to me until I see a proof otherwise. I've seen a few proof
sketches indicating that immutable approaches can always be no worse
than a O(log n) multiple of imperative ones (by simulating main memory
with your O(log n) map of choice). Between this and the provable
asymptotic optimality of skewed binomial heap prioqueues, the argument
sounds untenable.
Though it really comes down to what they mean by "efficient". Asymptotic
complexity is the name of the game for most folks in algorithms and
datastructures, but that seems not to be what they're after. Shrinking
the constant factors is frequently a game that can go on forever, or
rather can seldom be proved not to, so the claim seems unlikely to be
meaningful in this arena either (you'd have to prove you've found the
smallest possible constant factor for any immutable approach, and then
find a smaller one for some mutable approach). Also proofs about
constant factors beyond basic thresholds aren't useful in practice due
to hardware barriers like cache size and disk access times.
There's also the difference between compilers that are designed to
optimize immutable patterns, vs ones which aren't. For example, in
certain circumstances and with suitable annotations Clean will take the
immutable approach and convert it into a mutable variant for you, thus
saving on allocation/collection/cache-miss overheads while maintaining
the spirit of immutability. Is compiled code like this considered
"mutable" or "immutable"? It all depends on the spirit of the question.
> Can anyone point me to some actual benchmark results comparing
> priority queue performance *with* mutation and priority queue
> performance *without* mutation, in the same functional or
> mostly-functional language?
I'm always curious to see datastructure benchmarks though :)
--
Live well,
~wren
And your results above indicate that the fastest imperative heap is over 3x
faster than the fastest functional heap?
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
It's saying that
(1) Using an imprecise an inefficient-relative-to-a-accurate-GC-that-doesn't-
have-to-assume-the-entire-memory-space-is-live-then-find-what's-dead
is a recipe for inefficiency.
(2) And (1) even more so when you're comparing it to the same language with
manual memory management and zero GC overhead.
(from here on out I disregard the C numbers (i like C, too))
(3) So now, it's saying that (given this sample, yada yada) among languages
where this comparison is possible (i.e. the mutable version
still has the GC running)
the functional version is on average 1.8 times slower.
ghci> ((126/70) + (290/150) + (1895/1123)) / 3
1.8069258929454834
Matt
> On Tuesday 16 June 2009 23:50:45 Richard O'Keefe wrote:
>> I've now done some benchmarks myself in C, Java, and Smalltalk,
>> comparing "imperative" versions of leftist heaps with "functional" ones.
>> For what it's worth, on a 2.16GHz Intel Core 2 Duo Mac, the
>> coefficient in front of the log(n) part was
>>
>> C Java ST(A) ST(B)
>> "Imperative" 40 70 150 1123
>> "Functional" 240 126 290 1895
>>
>> where ST(A) was a native-code Smalltalk and ST(B) a byte-code one.
>> The C/Functional case used the Boehm collector, untuned.
>> Times are in nanoseconds. Values of n ranged from 2 to 100; the
>> correspondent was saying that small sizes were important.
>>
>> It seems that a factor of 2 for *this* problem is credible;
>> a factor of 10 is not.
Also, I've now added you to (1) my list of people never to hire to
perform statistical computations for me where i'm looking for what's
actually going on, (2) my list of people to hire when i need to
mislead *via* statistics. However, (2) is pending verification that
you didn't actually think that. I can't imagine you did though. :)
Cheers,
Matt
How is that relevant to what I wrote?
> (2) And (1) even more so when you're comparing it to the same language
> with manual memory management and zero GC overhead.
There should be no GC overhead in the imperative case anyway.
> (from here on out I disregard the C numbers (i like C, too))
>
> (3) So now, it's saying that (given this sample, yada yada) among
> languages where this comparison is possible (i.e. the mutable version still
> has the GC running)
> the functional version is on average 1.8 times slower.
>
> ghci> ((126/70) + (290/150) + (1895/1123)) / 3
> 1.8069258929454834
You're assuming that other languages will not be a lot faster than Java even
though C already is.
> > On Tuesday 16 June 2009 23:50:45 Richard O'Keefe wrote:
> >> I've now done some benchmarks myself in C, Java, and Smalltalk,
> >> comparing "imperative" versions of leftist heaps with "functional" ones.
> >> For what it's worth, on a 2.16GHz Intel Core 2 Duo Mac, the
> >> coefficient in front of the log(n) part was
> >>
> >> C Java ST(A) ST(B)
> >> "Imperative" 40 70 150 1123
> >> "Functional" 240 126 290 1895
> >>
> >> where ST(A) was a native-code Smalltalk and ST(B) a byte-code one.
> >> The C/Functional case used the Boehm collector, untuned.
> >> Times are in nanoseconds. Values of n ranged from 2 to 100; the
> >> correspondent was saying that small sizes were important.
> >>
> >> It seems that a factor of 2 for *this* problem is credible;
> >> a factor of 10 is not.
Post the code and I'll port it to F#.
Merry Christmas,
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Lazyness can be considered to be a controlled form of mutation, which
Okasaki takes advantage of in a number of his data structures. Most
(all I've seen) arguments stating that purely functional languages
have asymptotic performance worse than mutating languages simply don't
apply to lazy ones.
--
Svein Ove Aas
To be fair, I am not aware of any asymptotically efficient (as
efficient as their imperative counterparts) purely functional (t.i.
not using mutation) implementations of, say, the "union-find"
datastructure.
However, that does not pose any constraints on the efficiency of
Haskell because of the existence of the ST monad.
>
> --
> Svein Ove Aas
> _______________________________________________
> Haskell-Cafe mailing list
> Haskel...@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
--
Eugene Kirpichov
Web IR developer, market.yandex.ru
He is cuckoo, and here's proof:
http://www.brics.dk/RS/96/37/BRICS-RS-96-37.pdf
<http://www.brics.dk/RS/96/37/BRICS-RS-96-37.pdf>This demonstrates a
functional priority queue that performs every desired operation in
asymptotically optimal time. Granting that its constant factor could be
significantly improved, realize that there are more complicated functional
data structures with similar asymptotic guarantees. (Observation: I'm
pretty sure that the Fibonacci heap is actually surprisingly functional.)
Louis Wasserman
wasserm...@gmail.com
http://profiles.google.com/wasserman.louis
You're assuming he meant asymptotic time complexity.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
>
> Lazyness can be considered to be a controlled form of mutation
Can someone explain why this is true (or link me to an explanation)?
--
Gautam
Forcing the evaluating of a thunk replaces the unevaluated expression with the
value it evaluates to. That is effectively in-place mutation.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
>
> Forcing the evaluating of a thunk replaces the unevaluated expression with
> the
> value it evaluates to. That is effectively in-place mutation.
>
How can one use that to gain on efficiency? I understand that laziness
allows a modified data structure to share nodes with the original data
structure preventing unnecessary copying, but I do not see how forcing an
evaluation can be used to gain on efficiency (or alternatively prevent
inefficiency). Is there any simple example to illustrate this (or should I
read Okasaki)?
--
Thanks,
Gautam
In a purely functional setting?
> I understand that laziness
> allows a modified data structure to share nodes with the original data
> structure preventing unnecessary copying,
I think you mean that idiomatic purely functional data structures evade
copying the entire structure by breaking it down recursively and referring
back to older versions whenever possible (i.e. sharing).
That has nothing to do with laziness though and, compared to the mutable case,
it incurs a lot of extra copying. In fact, that is half the reason why purely
functional data structures are so slow. The other half is that recursive
decomposition leads to many levels of indirection which is hugely inefficient
on modern computers due to cache misses.
The main use case where purely functional data structures pay dividends in
terms of performance is persistence. For example, when you backtrack in an
n-queens solver you create a new, slightly-different list and forget about
the old one (which gets recycled). This style of programming is very common
in metaprogramming such as compilers, interpreters and theorem provers.
> but I do not see how forcing an
> evaluation can be used to gain on efficiency (or alternatively prevent
> inefficiency).
An example of preventing an inefficiency in the purely functional case is
easy: consider two threads about to perform the same computation. Making them
compete to force a thunk can prevent them from redundantly performing the
same computation.
The downside is that the implementation of laziness must deal with race
conditions and this is neither easy nor efficient.
> Is there any simple example to illustrate this (or should I
> read Okasaki)?
You should read Okasaki regardless. :-)
A good example from Okasaki might be the purely functional queue. A simple
eager solution maintains front and back lists, pushes on the back and pops
from the front except when it is empty, whereupon it reverses the back list
to create a new front list. Eager evaluation of that list reversal is
problematic in the presence of persistent use: the programmer might keep the
same old version of a queue with an empty front list around and pop from it
several times whereupon the same list reversal will be repeated needlessly
every time. So non-persistent use of these "batched queues" has good
amortized complexity but persistent use has awful complexity.
Okasaki presents a "real-time" queue that uses laziness to avoid this
inefficiency. In essence, the reversal is stored element-wise as thunks that
are forced only when necessary and their result reused if it is ever required
again. So it makes persistent use asymptotically more efficient.
You may want to have a look at
R. Bird, G. Jones, O. de Moor.
More haste, less speed: lazy versus eager evaluation.
http://www.comlab.ox.ac.uk/people/richard.bird/online/
BirdJonesDeMoor1997More.pdf
Regards,
Heinrich Apfelmus
--
Gautam