apply: increase performance by 60% for fixed length

154 views
Skip to first unread message

Marc Dzaebel

unread,
Oct 7, 2012, 5:15:28 PM10/7/12
to clo...@googlegroups.com
apply is slow. However you can increase performance by 60% with the following macro, if you have a fixed length in S.

(defmacro applyn [n f & s]
    (loop [curr `(list* ~@s), n n, vars[] vals[]]
        (if(pos? n)
           (let[v(gensym)]
                (recur v (dec n) (conj(conj vars v) (if (seq vars) (list 'next curr) curr))
                           (conj vals(list 'first v))))
          `(let[~@vars] ~(cons f (seq vals))))))

(let[t(fn[](apply       + '(1 2 3 4 5 6 7 8 9 10)))] (time(dotimes [_ 1000000] (t)))) ; ~680 msec
(let[t(fn[](applyn 10 + '(1 2 3 4 5 6 7 8 9 10)))] (time(dotimes [_ 1000000] (t)))) ; ~220 msec

So, if you have inner loops, that must be optimized for performance, you might remember this possibility. Even other functions could be optimized this way. However, "premature optimization is the root of all evil". Beside, the generated code is more space consuming.

Marc

Alan Malloy

unread,
Oct 7, 2012, 6:53:52 PM10/7/12
to clo...@googlegroups.com
This is nonsense. If s is fixed-size at compile-time, you would never use apply to begin with. Why bother with (applyn 10 + [1 2 3 4 5 6 7 8 9 10]) when you could just write (+ 1 2 3 4 5 6 7 8 9 10)?

Ben Wolfson

unread,
Oct 7, 2012, 10:56:45 PM10/7/12
to clo...@googlegroups.com
On Sun, Oct 7, 2012 at 3:53 PM, Alan Malloy <al...@malloys.org> wrote:
> This is nonsense. If s is fixed-size at compile-time, you would never use
> apply to begin with. Why bother with (applyn 10 + [1 2 3 4 5 6 7 8 9 10])
> when you could just write (+ 1 2 3 4 5 6 7 8 9 10)?

Why bother to write (+ 1 2 3 4 5 6 7 8 9 10) when you could just write 55?

In order to write (+ 1 2 3 4 5 6 7 ...) you need more than just the
length of the list, you need to know its contents as well.

(let [s (take 10 (infinite-stream-of-random-integers))] (applyn 10 + s))

will work right---you know the length of s---but you're not going to
be able to just directly apply +.

I'm skeptical for a different reason:

user=> (let[t(fn[](apply + '(1 2 3 4 5 6 7 8 9 10)))]
(time(dotimes [_ 1000000] (t))))
"Elapsed time: 1736.91518 msecs"
nil
user=> (let[t(fn[](applyn 10 + '(1 2 3 4 5 6 7 8 9 10)))]
(time(dotimes [_ 1000000] (t))))
"Elapsed time: 2375.503756 msecs"
nil

--
Ben Wolfson
"Human kind has used its intelligence to vary the flavour of drinks,
which may be sweet, aromatic, fermented or spirit-based. ... Family
and social life also offer numerous other occasions to consume drinks
for pleasure." [Larousse, "Drink" entry]

Wes Freeman

unread,
Oct 7, 2012, 11:01:24 PM10/7/12
to clo...@googlegroups.com
What version of clojure are you guys using, just to understand this a little better? I think apply was given a boost in 1.3 or 1.4--I'm a relative newbie, myself.

Wes


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Alan Malloy

unread,
Oct 8, 2012, 1:02:19 AM10/8/12
to clo...@googlegroups.com
On Sunday, October 7, 2012 7:56:53 PM UTC-7, Ben wrote:
On Sun, Oct 7, 2012 at 3:53 PM, Alan Malloy <al...@malloys.org> wrote:
> This is nonsense. If s is fixed-size at compile-time, you would never use
> apply to begin with. Why bother with (applyn 10 + [1 2 3 4 5 6 7 8 9 10])
> when you could just write (+ 1 2 3 4 5 6 7 8 9 10)?

Why bother to write (+ 1 2 3 4 5 6 7 8 9 10) when you could just write 55?

In order to write (+ 1 2 3 4 5 6 7 ...) you need more than just the
length of the list, you need to know its contents as well.

(let [s (take 10 (infinite-stream-of-random-integers))] (applyn 10 + s))

will work right---you know the length of s---but you're not going to
be able to just directly apply +.

Of course apply will work fine.
 
user> (apply + (take 10 (range)))
45

All applyn does is make the argument-unpacking code happen in your clojure source file instead of in RestFn.java, which is optimized for it, in exchange for the extremely small cost of dispatching on length. And since + only has different definitions for 0, 1, 2, or more arguments, that just means checking whether the argument sequence is 0, 1, 2, or more items long.

Ben Wolfson

unread,
Oct 8, 2012, 1:14:37 AM10/8/12
to clo...@googlegroups.com
On Sun, Oct 7, 2012 at 10:02 PM, Alan Malloy <al...@malloys.org> wrote:
> On Sunday, October 7, 2012 7:56:53 PM UTC-7, Ben wrote:
>>
>> On Sun, Oct 7, 2012 at 3:53 PM, Alan Malloy <al...@malloys.org> wrote:
>> > This is nonsense. If s is fixed-size at compile-time, you would never
>> > use
>> > apply to begin with. Why bother with (applyn 10 + [1 2 3 4 5 6 7 8 9
>> > 10])
>> > when you could just write (+ 1 2 3 4 5 6 7 8 9 10)?
>>
>> Why bother to write (+ 1 2 3 4 5 6 7 8 9 10) when you could just write 55?
>>
>> In order to write (+ 1 2 3 4 5 6 7 ...) you need more than just the
>> length of the list, you need to know its contents as well.
>>
>> (let [s (take 10 (infinite-stream-of-random-integers))] (applyn 10 + s))
>>
>> will work right---you know the length of s---but you're not going to
>> be able to just directly apply +.
>
>
> Of course apply will work fine.

What I meant by "directly apply +" wasn't "call apply with first
argument +", but "call (+ 1 2 ...)". I was responding to your
suggestion that one just write (+ 1 2 3 4 5 6 7 8 9 10).

> All applyn does is make the argument-unpacking code happen in your clojure
> source file instead of in RestFn.java, which is optimized for it, in
> exchange for the extremely small cost of dispatching on length.

Sure, but that's a different reason for finding applyn silly than the
one you initially gave, which was that one could bypass calling apply
altogether and just write (+ 1 2 ...).

Ben Wolfson

unread,
Oct 8, 2012, 1:17:07 AM10/8/12
to clo...@googlegroups.com
On Sun, Oct 7, 2012 at 8:01 PM, Wes Freeman <freem...@gmail.com> wrote:
> What version of clojure are you guys using, just to understand this a little
> better? I think apply was given a boost in 1.3 or 1.4--I'm a relative
> newbie, myself.

Ah using 1.4 I do see a speedup:

clojure-test.core> (let[t(fn[](apply + '(1 2 3 4 5 6 7 8 9
10)))] (time(dotimes [_ 1000000] (t))))
"Elapsed time: 832.403941 msecs"
nil
clojure-test.core> (let[t(fn[](applyn 10 + '(1 2 3 4 5 6 7 8 9
10)))] (time(dotimes [_ 1000000] (t))))
"Elapsed time: 368.181313 msecs"

Jean Niklas L'orange

unread,
Oct 8, 2012, 1:46:31 AM10/8/12
to clo...@googlegroups.com


On Sunday, October 7, 2012 11:15:28 PM UTC+2, Marc Dzaebel wrote:
apply is slow. However you can increase performance by 60% with the following macro, if you have a fixed length in S. 

[...]


(let[t(fn[](apply       + '(1 2 3 4 5 6 7 8 9 10)))] (time(dotimes [_ 1000000] (t)))) ; ~680 msec
(let[t(fn[](applyn 10 + '(1 2 3 4 5 6 7 8 9 10)))] (time(dotimes [_ 1000000] (t)))) ; ~220 msec

Interesting, even though it's impractical to use it on an already defined list. Usually, I want to work with list or sequences I've generated with other functions, like so:

(let[t(fn[](applyn 10 + (range 1 11)))] (time(dotimes [_ 1000000] (t)))) ; ~ 950 msec
(let[t(fn[](applyn + (range 1 11)))] (time(dotimes [_ 1000000] (t)))) ; ~ 940 msec

(In addition, apply is in this case is not really what you want to do with this list - you should use reduce instead.)
(let[t(fn[](reduce + (range 1 11)))] (time(dotimes [_ 1000000] (t)))) ; ~ 890 msec

Curiously, if I vec the result, I get a completely different answer:

(let[t(fn[](applyn 10 + (vec (range 1 11))))] (time(dotimes [_ 1000000] (t)))) ; ~ 1330 msec
(let[t(fn[](apply + (vec (range 1 11))))] (time(dotimes [_ 1000000] (t)))) ; ~ 1410 msec
(let[t(fn[](reduce + (vec (range 1 11))))] (time(dotimes [_ 1000000] (t)))) ; ~ 1340 msec

This is hardly any scientific test though, so I'll leave the conclusion to someone with a more rigorous testing scheme.

Marc Dzaebel

unread,
Oct 8, 2012, 3:15:50 AM10/8/12
to clo...@googlegroups.com
So applyn is only 3% faster for generated sequences with 10 elements (on my machine with 1.4 too).

Reducing the number of elements has a better effect:

(let[t(fn[](applyn 5 + (range 1 6)))] (time(dotimes [_ 1000000] (t)))) ;438 msec
(let[t(fn[](apply + (range 1 6)))] (time(dotimes [_ 1000000] (t)))) ;550 msec

So for 5 elements, we have a performance boost of about 20% but I got a degradation for longer sequences than 12 elements.
Reply all
Reply to author
Forward
0 new messages