function call speed...

Jules

Oct 5, 2024, 11:59:32 AM
to Clojure
I was just checking to see how much overhead I might pay using apply rather than destructuring a collection of args so I could call a function directly on them, when I found something I thought interesting:

TLDR:

- apply is really slow: about 2 orders of magnitude slower than a direct call (roughly 5.6 to 8.1 s vs. 0.05 s per 100M calls)
- apply on an array, which I thought would be a direct route to the Java varargs API, is just as slow (the slowest of the three, in fact)
- destructuring a list is more than an order of magnitude (about 33x) slower than destructuring a vector
- it is nearly an order of magnitude (about 8x) faster to destructure a vector and make a direct call than to use apply
- etc
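
The ratios behind these bullet points can be worked out from the timings in the REPL session further down. The following is a small sketch of that arithmetic: the elapsed times are copied from one run of each benchmark in the session, while the names `elapsed-ms`, `ns-per-call`, `slowdown` and `orders-of-magnitude` are made up for this illustration.

```clojure
;; Every benchmark in the session runs the same number of iterations.
(def iterations 100000000)

;; Elapsed times in msecs, copied from one run of each benchmark.
(def elapsed-ms
  {:direct-call        48.02132
   :apply-vector       7166.218804
   :apply-list         5657.564262
   :apply-array        7954.757485
   :destructure-vector 904.68628
   :destructure-list   30043.489179
   :destructure-array  14231.86232})

(defn ns-per-call
  "Average cost of one iteration, in nanoseconds."
  [k]
  (/ (* (elapsed-ms k) 1e6) iterations))

(defn slowdown
  "How many times slower `slow` is than `fast`."
  [slow fast]
  (/ (elapsed-ms slow) (elapsed-ms fast)))

(defn orders-of-magnitude
  "Base-10 logarithm of the slowdown factor."
  [slow fast]
  (Math/log10 (slowdown slow fast)))

(slowdown :apply-vector :direct-call)           ; apply vs. direct call
(slowdown :destructure-list :destructure-vector) ; list vs. vector destructuring
(slowdown :apply-vector :destructure-vector)     ; apply vs. destructure + call
```

With these figures the three ratios come out at roughly 149x, 33x and 8x respectively, i.e. about 2.2, 1.5 and 0.9 orders of magnitude.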

openjdk full version "21.0.4+7"
clojure 1.12.0
AMD Ryzen 9 5950X 16-Core Processor

If I've made any silly mistakes, please point them out and I'll run these benchmarks again.

If I am correct, then there should be a lot of room to speed up apply and the destructuring of lists?

Interested in people's thoughts...


Jules


m3.repl> ;; let's have a function that takes 4 args
m3.repl> (defn foo [a b c d])
#'m3.repl/foo
m3.repl> ;; let's try some direct calls
m3.repl> (time (dotimes [_ 100000000] (foo 1 2 3 4)))
"Elapsed time: 65.149019 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (foo 1 2 3 4)))
"Elapsed time: 49.920188 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (foo 1 2 3 4)))
"Elapsed time: 48.02132 msecs"
nil
m3.repl> ;; now let's try with a variety of collections via apply
m3.repl> (def vargs [1 2 3 4])
#'m3.repl/vargs
m3.repl> (def largs (list 1 2 3 4))
#'m3.repl/largs
m3.repl> (def aargs (into-array Object [1 2 3 4]))
#'m3.repl/aargs
m3.repl> ;; a vector
m3.repl> (time (dotimes [_ 100000000] (apply foo vargs)))
"Elapsed time: 7166.218804 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo vargs)))
"Elapsed time: 7167.766347 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo vargs)))
"Elapsed time: 7154.58141 msecs"
nil
m3.repl> ;; a list
m3.repl> (time (dotimes [_ 100000000] (apply foo largs)))
"Elapsed time: 5657.564262 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo largs)))
"Elapsed time: 5597.083287 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo largs)))
"Elapsed time: 5638.361902 msecs"
nil
m3.repl> ;; and an array
m3.repl> (time (dotimes [_ 100000000] (apply foo aargs)))
"Elapsed time: 7954.757485 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo aargs)))
"Elapsed time: 8125.750746 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (apply foo aargs)))
"Elapsed time: 8151.5701 msecs"
nil
m3.repl> ;; is it faster to destructure first ?
m3.repl> ;; a vector
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] vargs] (foo a b c d))))
"Elapsed time: 904.68628 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] vargs] (foo a b c d))))
"Elapsed time: 902.818246 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] vargs] (foo a b c d))))
"Elapsed time: 905.498641 msecs"
nil
m3.repl> ;; a list
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] largs] (foo a b c d))))
"Elapsed time: 30043.489179 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] largs] (foo a b c d))))
"Elapsed time: 30159.705918 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] largs] (foo a b c d))))
"Elapsed time: 30289.946966 msecs"
nil
m3.repl> ;; an array
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] aargs] (foo a b c d))))
"Elapsed time: 14231.86232 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] aargs] (foo a b c d))))
"Elapsed time: 14081.057031 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d] aargs] (foo a b c d))))
"Elapsed time: 14924.808985 msecs"
nil
m3.repl> ;; hmm... maybe destructuring the vector is taking a shortcut and not building a new collection whereas the others are
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d e] vargs] (foo a b c d))))
"Elapsed time: 1158.761312 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d e] vargs] (foo a b c d))))
"Elapsed time: 570.542383 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d e] vargs] (foo a b c d))))
"Elapsed time: 588.179669 msecs"
nil
m3.repl> (time (dotimes [_ 100000000] (let [[a b c d e] vargs] (foo a b c d))))
"Elapsed time: 589.489251 msecs"
nil
m3.repl> 
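
On the "shortcut" guess in the session above: one way to check it is to look at what the binding form expands to. As far as I know, a sequential binding without `& rest` expands into one `nth` call per name, with no new collection built for any input type. The sketch below inspects the expansion with `clojure.core/destructure` and compares it against a hand-written equivalent. `destructure-by-hand` and `destructure-builtin` are hypothetical names used only for this illustration.

```clojure
;; Same test data as in the REPL session.
(def vargs [1 2 3 4])
(def largs (list 1 2 3 4))
(def aargs (into-array Object [1 2 3 4]))

;; The real expansion of the binding form. The generated local is a gensym
;; (vec__NNNN), so the printed form differs from run to run.
(def expansion (destructure '[[a b c d] coll]))

;; Hand-written equivalent: one positional lookup per bound name,
;; with nil as the not-found value.
(defn destructure-by-hand [coll]
  (let [a (nth coll 0 nil)
        b (nth coll 1 nil)
        c (nth coll 2 nil)
        d (nth coll 3 nil)]
    [a b c d]))

;; The built-in form, for comparison.
(defn destructure-builtin [coll]
  (let [[a b c d] coll]
    [a b c d]))

(= (destructure-by-hand vargs) (destructure-builtin vargs))
(= (destructure-by-hand largs) (destructure-builtin largs))
(= (destructure-by-hand aargs) (destructure-builtin aargs))
```

If that reading is right, the difference would come from `nth` itself rather than from building collections: on a vector it is an indexed lookup, whereas on a list each call has to walk from the head. That is an assumption about the cause, not something the timings above prove.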

jum...@gmail.com

Dec 5, 2024, 5:48:02 AM
to Clojure
A discussion about this on Clojurians slack: https://clojurians.slack.com/archives/C03L9H1FBM4/p1728361006109229
A couple of highlights:
- apply is very slow and allocates. Destructuring is suboptimal, especially for lists.
- If you look at the implementation of apply, you can see why it is much slower than a regular fn call: it walks the seq to pull out each of the fn's args, then calls invoke. That is a lot of overhead.
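
The "walk the seq, then invoke" shape can be sketched in Clojure itself. This is a simplified illustration, not the actual implementation (which, as far as I know, is written in Java in clojure.lang.AFn). `apply-sketch` is a made-up name, and it only handles up to 4 args.

```clojure
(defn apply-sketch
  "Illustration only: turn `args` into a seq, count it, pull each argument
  out with first/next, then make an ordinary positional call."
  [f args]
  (let [s0 (seq args)
        n  (bounded-count 5 s0)] ; counts at most 5 elements of an uncounted seq
    (case n
      0 (f)
      1 (f (first s0))
      2 (let [s1 (next s0)]
          (f (first s0) (first s1)))
      3 (let [s1 (next s0)
              s2 (next s1)]
          (f (first s0) (first s1) (first s2)))
      4 (let [s1 (next s0)
              s2 (next s1)
              s3 (next s2)]
          (f (first s0) (first s1) (first s2) (first s3)))
      (throw (ex-info "apply-sketch only handles up to 4 args" {:count n})))))

;; Same results as apply for the three collection types from the benchmark.
(= (apply vector [1 2 3 4])
   (apply-sketch vector [1 2 3 4])
   (apply-sketch vector (list 1 2 3 4))
   (apply-sketch vector (into-array Object [1 2 3 4])))
```

Even in this reduced form, every call pays for creating a seq over the collection, counting it, and stepping through it before the function is finally invoked, none of which a direct `(foo 1 2 3 4)` has to do.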
