Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Some timing experiments
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Richard Newman  
View profile  
 More options Sep 13 2009, 7:09 pm
From: Richard Newman <holyg...@gmail.com>
Date: Sun, 13 Sep 2009 16:09:40 -0700
Local: Sun, Sep 13 2009 7:09 pm
Subject: Some timing experiments
Thought I'd share with the group. Clojure's sets are fast!

http://www.holygoat.co.uk/blog/entry/2009-09-13-1


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chouser  
View profile  
 More options Sep 14 2009, 9:34 am
From: Chouser <chou...@gmail.com>
Date: Mon, 14 Sep 2009 09:34:28 -0400
Local: Mon, Sep 14 2009 9:34 am
Subject: Re: Some timing experiments

On Sun, Sep 13, 2009 at 7:09 PM, Richard Newman <holyg...@gmail.com> wrote:

> Thought I'd share with the group. Clojure's sets are fast!

> http://www.holygoat.co.uk/blog/entry/2009-09-13-1

Nice writeup!

...and I hate to be picky, but you can probably compare
strings sizes using a faster mechanism than the examples you
gave.

While 'count' can give the size of a string, it first checks
it's type against a couple other possibilities before
calling the string's .length method.

        (defn t [] (dotimes [i 2e7] (= 10 (count "123"))))
        (time (t)) ; run this a few times to let HotSpot work its magic
        --> "Elapsed time: 1000.960992 msecs"

If we know it's a string, we can call the .length method
directly.  If you also type-hint the string, this can make
the code rather uglier, and since this is hardly ever the
bottleneck in any application, it may not be worth it.  But
when you're benchmarking tiny opertions repeated a huge
number of times, this sort of detail starts to matter:

        (defn t [] (dotimes [i 2e7] (= 10 (.length "123"))))
        (time (t))
        --> "Elapsed time: 100.096957 msecs"

There's no need to type hint the string there because it's
a literal and the Clojure compiler already knows its type.

It's time to see if == will help us, since we are after all
dealing with numbers here:

        (defn t [] (dotimes [i 2e7] (== 10 (.length "123"))))
        (time (t))
        "Elapsed time: 96.791805 msecs"

So that's a bit better, but it's also worth looking at
where our numbers may be getting boxed or unboxed.

        (use '[clojure.contrib.repl-utils :only (expression-info)])
        (expression-info '10)
        --> {:class java.lang.Integer, :primitive? false}

        (expression-info '(.length "123"))
        --> {:class int, :primitive? true}

So now we're comparing a boxed Integer 10 with an unboxed
int returned by 'length'.  We can do better:

        (defn t [] (dotimes [i 2e7] (== (int 10) (.length "123"))))
        (time (t)) ; Watch HotSpot remove orders of magnitude, until:
        --> "Elapsed time: 0.044282 msecs"

That's really really fast.  That elapsed time is too small:
we'd have to increase the test size to get any useful
conclusion besides "really fast".  In fact, I almost wonder
if HotSpot is somehow memoizing the expression all by
itself.

Anyway, Clojure's sets by themselves appear to be fast
enough for you, or I assume you wouldn't have described them
as "crazy fast".  However, *if* you discover later that
they're are not fast enough you may be able to do length
tests faster than you were thinking.  This may yet be
a route to explore for improved overall speed.

--Chouser


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Richard Newman  
View profile  
 More options Sep 14 2009, 1:13 pm
From: Richard Newman <holyg...@gmail.com>
Date: Mon, 14 Sep 2009 10:13:47 -0700
Local: Mon, Sep 14 2009 1:13 pm
Subject: Re: Some timing experiments

> Nice writeup!

Thanks!

> ...and I hate to be picky, but you can probably compare
> strings sizes using a faster mechanism than the examples you
> gave.

Yeah, I figured I must have been doing something wrong, but I thought  
I'd leave it out there as a bit of ethnographic research :)

> So now we're comparing a boxed Integer 10 with an unboxed
> int returned by 'length'.  We can do better:

That's interesting... this is a case in which I thought HotSpot (or  
even Clojure's compiler) would automatically unbox: by the time it  
emits code for == it knows that it's comparing a primitive int to a  
constant non-primitive Integer (produced by the reader). Put another  
way: I would expect (== 10 10) to do no work at runtime, or at worst  
only primitive math, not boxed math.

Incidentally, this whole thing is a good example of a user expecting a  
sufficiently smart compiler: coming from Common Lisp one would expect  
the count implementation to be specialized for strings and other  
common types (essentially turning into a null check and a call  
to .length), and a similar optimization of ==. After all, the  
compiler(s) knows the input and output types of all the functions, and  
one of the arguments is constant. From that perspective, there's no  
reason why (== 10 (count some-string)) should not produce much the  
same bytecode as (== (int 10) (.length some-string)) -- both forms are  
equivalent, assuming no rebinding of count.

(While I'm very happy with Clojure, I can see the point of view of the  
people who come here and mouth off about Clojure not being as fast as  
Java. They're not strictly correct, but getting fast Clojure code does  
require some familiarity with which things its compiler will take care  
of, and which it leaves up to you.)

>    (defn t [] (dotimes [i 2e7] (== (int 10) (.length "123"))))
>    (time (t)) ; Watch HotSpot remove orders of magnitude, until:
>    --> "Elapsed time: 0.044282 msecs"

> That's really really fast.  That elapsed time is too small:
> we'd have to increase the test size to get any useful
> conclusion besides "really fast".  In fact, I almost wonder
> if HotSpot is somehow memoizing the expression all by
> itself.

It might be! In this case I suspect that something is inlining the  
result of (.length "123"), which is after all a compile-time constant.  
It might even be optimizing the entire contents of the loop away,  
replacing them with `false`.

> Anyway, Clojure's sets by themselves appear to be fast
> enough for you, or I assume you wouldn't have described them
> as "crazy fast".  However, *if* you discover later that
> they're are not fast enough you may be able to do length
> tests faster than you were thinking.  This may yet be
> a route to explore for improved overall speed.

You're quite right, and thanks for taking the time to continue the  
exploration!

My main conclusion from all this is that the JVM is really quick, and  
HotSpot is (by and large) a thing of wonder. There's not too much  
point in optimizing away a few hundred microseconds!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Richard Newman  
View profile  
 More options Sep 14 2009, 4:57 pm
From: Richard Newman <holyg...@gmail.com>
Date: Mon, 14 Sep 2009 13:57:22 -0700 (PDT)
Local: Mon, Sep 14 2009 4:57 pm
Subject: Re: Some timing experiments
I put an update in my original post, and I also (like you) timed
without the side effects I'd inserted (empty-string print statements).
Unlike your code, I'm using non-constant strings, which might explain
the difference.

I note:

The 1,000 string length comparisons take 0.184ms; the negative set
test takes 0.281ms; and the positive takes 0.77ms. The string length
check, then, is slightly faster, but if the length matches you still
have the set test to perform. Probably not worth the complexity.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »