avoiding casts with aget & friends


Brian Craft

Jan 30, 2019, 5:03:43 PM
to Clojure
Profiling is showing a lot of time spent in RT.longCast, in places like this:

(aget flat-dict (bit-and 0xff (aget arr j)))

arr is hinted as ^bytes, and flat-dict as ^objects.

which compiles to this:

Object code2 = RT.aget((Object[])flat_dict, RT.intCast(0xFFL & RT.longCast((Object)RT.aget((byte[])arr2, RT.intCast(k)))))

Is there any way to avoid that RT.longCast? There is an aget method in RT.java that returns a byte, and a longCast for byte, but I suspect the cast to Object is causing it to hit the longCast for Object, which does a bunch of reflection.

Chris Nuernberger

Jan 30, 2019, 6:03:38 PM
to clo...@googlegroups.com
Does doing an unchecked cast of the return value of the aget on the byte array change things?

(defn test-fn
  []
  (let [obj-ary (object-array 10)
        byte-data (byte-array (range 10))]
    (doseq [idx (range 10)]
      (let [idx (int idx)]
        ;; index obj-ary by the byte value, read as unsigned
        (aget obj-ary
              (bit-and 0xFF (unchecked-long (aget byte-data idx))))))))

This disassembled more cleanly, and I think it dropped the long cast.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alex Miller

Jan 30, 2019, 6:39:41 PM
to Clojure
What have you tried? And how are you getting that Java? I would prefer to look at bytecode (via javap) to verify what you're saying. 

Have you tried an explicit long cast?

(aget flat-dict (bit-and 0xff (long (aget arr j))))

Are you running this hot enough for the JIT to kick in? Usually this is the kind of thing it's good at, but it might take 10k invocations before it does.
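
For context, here is a minimal sketch of the explicit-cast variant inside a complete function. The name `lookup-all` and the output array are assumptions for illustration; only the inner `aget` expression comes from the thread. Whether the cast actually changes the emitted bytecode should still be checked with javap or a decompiler.

```clojure
(defn lookup-all
  "Hypothetical helper: maps each byte of arr, read as unsigned,
  to the corresponding entry of flat-dict."
  [^objects flat-dict ^bytes arr]
  (let [n   (alength arr)
        out (object-array n)]
    (loop [j 0]
      (when (< j n)
        ;; (long ...) makes the widening from byte explicit, so bit-and
        ;; sees two primitive longs rather than a boxed Object.
        (aset out j (aget flat-dict (bit-and 0xff (long (aget arr j)))))
        (recur (inc j))))
    out))
```

A byte of -1 is looked up at index 255, since `bit-and` with 0xff discards the sign extension.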

Brian Craft

Jan 30, 2019, 7:55:00 PM
to Clojure
I haven't tried much. I'm getting the Java via the clj-java-decompiler.core 'decompile' macro.

A long cast does drop the longCast call (which is really counter-intuitive: explicitly invoke 'long', which calls longCast, in order to avoid calling longCast).

Amusingly this doesn't reduce the total run-time, though longCast drops out of the hotspot list. :-p There must be some other limiting step that I'm missing in the profiler.

I'm calling it around 1.2M times, so hopefully that engages the jit.
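
One crude way to compare variants outside the profiler is to warm the code up first and then time it. The sketch below assumes a hypothetical zero-argument function `f` wrapping the code under test; it ignores GC pauses and dead-code elimination, so treat the numbers as a rough indication only.

```clojure
(defn time-after-warmup
  "Hypothetical helper: calls f `warmup` times so the JIT has a chance
  to compile it, then returns the mean nanoseconds per call over `runs` calls."
  [f warmup runs]
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (f))
    (/ (double (- (System/nanoTime) start)) runs)))

;; Usage: compare two candidate implementations under the same conditions.
;; (time-after-warmup #(variant-a data) 20000 100000)
;; (time-after-warmup #(variant-b data) 20000 100000)
```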

Alex Miller

Jan 30, 2019, 9:06:37 PM
to clo...@googlegroups.com
Sometimes the insertion of profiling instrumentation magnifies the cost of things in a hot loop or provides misleading hot spot info. If you're using a tracing profiler, you might try sampling instead as it's less prone to this.

Or, this sounds silly, but you can manually sample by just doing a few thread dumps while it's running (either ctrl-\ or externally with kill -3) and see what's at the top. If there really is a hot spot, this is a surprisingly effective way to see it.

For this kind of code, there is no substitute for actually reading the bytecode and thinking carefully about where there is unnecessary casting or boxing.
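
The manual thread-dump idea can also be approximated from inside the process. This is only a sketch: `sample-top-frames` is a hypothetical helper, not part of any profiler, and it looks at a single thread's top frame rather than producing a full dump.

```clojure
(defn sample-top-frames
  "Hypothetical helper: takes n snapshots of thread t, interval-ms apart,
  and returns a map from top stack frame (as a string) to how often it was seen."
  [^Thread t n interval-ms]
  (frequencies
   (keep (fn [_]
           (Thread/sleep (long interval-ms))
           ;; The trace is empty if the thread has not started or has finished.
           (some-> (first (.getStackTrace t)) str))
         (range n))))

;; Usage: run the workload on another thread, then sample that thread, e.g.
;; (sample-top-frames worker-thread 50 20)
```

If one frame dominates the counts, that is a reasonable place to start reading bytecode.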




Chris Nuernberger

Jan 30, 2019, 9:40:17 PM
to clo...@googlegroups.com
That is why I used 'unchecked-long' instead of 'long'.

(unchecked-long (unchecked-byte 5))

Not

(long (byte 5))
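
To make the difference concrete, here is a small sketch of how the checked and unchecked casts behave on out-of-range values. Both helper names are made up for illustration. Widening a byte to a long yields the same value either way; the visible difference is on narrowing, where the checked cast validates the range.

```clojure
;; Checked casts validate the range and throw when the value does not fit.
(defn checked-byte-fits?
  "Hypothetical helper: true if (byte n) succeeds, false if it throws."
  [n]
  (try
    (byte n)
    true
    (catch IllegalArgumentException _
      false)))

;; Unchecked casts truncate or widen, as a Java primitive cast would.
(defn unsigned-byte-value
  "Hypothetical helper: truncates n to a byte, then reads it back as 0..255."
  ^long [^long n]
  (bit-and 0xff (unchecked-long (unchecked-byte n))))
```

For example, `(unchecked-byte 200)` truncates to -56, while `(byte 200)` throws because 200 is outside the byte range.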



Brian Craft

Jan 30, 2019, 9:49:56 PM
to Clojure
If there is unnecessary casting or boxing, how do you avoid it? Hinting and casting affect it, but not in ways I understand or can predict.

Alex Miller

Jan 30, 2019, 9:58:07 PM
to clo...@googlegroups.com
It would really help to see a full function of code context. From that I could probably talk a little more about what I would expect the compiler to understand and how you might be able to influence it.

Brian Craft

Jan 30, 2019, 11:41:49 PM
to Clojure
The full context is large. But, for example, in this code:

  (let [a 1
        b (:foo {:foo 3})
        c (if (< a b) a b)])

b and c are Object (if the disassembly is to be believed) which leads to casts where c is used later. Also, the compare is calling Numbers.lt, which is going to do a bunch of casting & dispatch.

Adding a ^long hint on b, b is still Object, and the compare becomes

        if (a < RT.longCast(b)) {
            num = Numbers.num(a);
        }

with a long cast that doesn't seem necessary. Also, c is still Object.

When I cast the lookup to long, as in (long (:foo {:foo 3})), b and c become long, but there's now a cast on the return value of the lookup

        Object x;
        if (_thunk__0__ == (x = _thunk__0__.get(const__4))) {
            x = (__thunk__0__ = __site__0__.fault(const__4)).get(const__4);
        }
        final long b = RT.longCast(x);
        final long c = (a < b) ? a : b;

which is going to hit the RT.longCast(Object) method, and start probing for the class so it can pick a method.

Brian Craft

Jan 31, 2019, 12:06:56 AM
to Clojure
With much experimentation, I ended up with this:

  (let [a 1
        b (.longValue ^Long (:foo {:foo 3}))
        c (if (< a b) a b)])

which seems to avoid the longCast call:

        Object o;
        if (_thunk__0__ == (o = _thunk__0__.get(const__3))) {
            o = (__thunk__0__ = __site__0__.fault(const__3)).get(const__3);
        }
        final long b = (long)o;
        final long c = (a < b) ? a : b;

I don't know if this is advisable. Does anyone do this?

Also just noted this is another case where explicitly calling something seems to make it disappear. :-p

Alex Miller

Jan 31, 2019, 1:07:38 AM
to clo...@googlegroups.com
On Wed, Jan 30, 2019 at 11:07 PM Brian Craft <craft...@gmail.com> wrote:
> With much experimentation, I ended up with this:
>
>   (let [a 1
>         b (.longValue ^Long (:foo {:foo 3}))
>         c (if (< a b) a b)])
>
> which seems to avoid the longCast call:
>
>         Object o;
>         if (_thunk__0__ == (o = _thunk__0__.get(const__3))) {
>             o = (__thunk__0__ = __site__0__.fault(const__3)).get(const__3);
>         }
>         final long b = (long)o;
>         final long c = (a < b) ? a : b;
>
> I don't know if this is advisable. Does anyone do this?

No, I wouldn't do this. `long` can inline so it's going to be better (it's also more likely to jit well as it's used other places and is likely hotter in the jit).

Going back to the original...

  (let [a 1
        b (:foo {:foo 3})
        c (if (< a b) a b)])

let will track primitives if possible.
- a is going to be a primitive long. 
- (:foo {:foo 3}) is going to (always) return an Object and it's best to recognize that and make an explicit cast to a primitive long.
- if a and b are both primitives, then < is able to inline a long-primitive comparison (via Numbers.lt())
- the if is going to return an Object though, so again you'll probably want to type hint or cast c, but it's hard to know for sure without seeing more code

Without other info, I would probably start with 

  (let [a 1
        b (long (:foo {:foo 3}))
        c (if (< a b) a b)])

Or alternately, it might be better to just type hint b in the comparison to avoid reboxing b, but hard to know without more context:

  (let [a 1
        b (:foo {:foo 3})
        c (if (< a ^long b) a b)])

Comparing the bytecode for these (skipping everything up through the keyword lookup, which is same for all):

Original:                 Option 1:                 Option 2:
45: astore_2              45: invokestatic  #41     45: astore_2
46: lload_0               48: lstore_2              46: lload_0
47: aload_2               49: lload_0               47: aload_2
48: invokestatic  #41     50: lload_2               48: checkcast     #37
51: ifeq          62      51: lcmp                  51: invokestatic  #43
54: lload_0               52: ifge          60      54: lcmp
55: invokestatic  #45     55: lload_0               55: ifge          66
58: goto          65      56: goto          61      58: lload_0
61: pop                   59: pop                   59: invokestatic  #49
62: aload_2               60: lload_2               62: goto          69
63: aconst_null           61: lstore        4       65: pop
64: astore_2              63: lload         4       66: aload_2
65: astore_3              65: invokestatic  #47     67: aconst_null
66: aload_3                                         68: astore_2
67: aconst_null                                     69: astore_3
68: astore_3                                        70: aload_3
                                                    71: aconst_null
                                                    72: astore_3

Option 1 does an lstore/lload (long) instead of an astore/aload (object). Both options use lcmp, which is likely the fastest path to compare longs. I've omitted some info here to make these fit, but the invokestatic at offset 48 in Original calls Numbers.lt:(JLjava/lang/Object;)Z, which is Numbers.lt(long, Object); lcmp is definitely going to be preferable to this. Also of importance is that in Option 1, both a and b are stored as longs and loaded as longs, so if there is subsequent work to do on them, they can avoid boxing (this is also betrayed by the shorter length from fewer casts).

Your longValue one is like Option 1, but starts:

      45: checkcast     #37                 // class java/lang/Long
      48: invokevirtual #41                 // Method java/lang/Long.longValue:()J

I guess I don't know whether that's faster than an invokestatic call to clojure/lang/RT.longCast:(Ljava/lang/Object;)J, hard to tell without testing in a real context. I'd certainly use the (long ...) one though unless I proved it mattered.
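
As a minimal sketch, Option 1 could be wrapped in a complete function like the one below. The name `min-with-foo`, the `^long` argument and return hints, and the map parameter `m` are assumptions added for illustration; only the `(long (:foo ...))` cast and the comparison come from the discussion above.

```clojure
(defn min-with-foo
  "Hypothetical helper: returns the smaller of a and the :foo entry of m.
  Throws if :foo is missing, because (long nil) fails."
  ^long [^long a m]
  (let [b (long (:foo m))]   ; the keyword lookup returns Object; cast once, here
    (if (< a b) a b)))       ; both branches are primitive longs
```

Casting at the point where the value leaves the map keeps a, b, and the result primitive for anything that follows in the same function.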
 

Brian Craft

Jan 31, 2019, 6:10:09 PM
to Clojure
Is there any way to inspect what the jit does to it?

Alex Miller

Jan 31, 2019, 7:10:53 PM
to Clojure
Yes, there are a bunch of JVM options to show you when it's JIT-ing, and even the rewritten code.

You can search for options like LogCompilation. Occasionally I have found that useful.
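
For example, assuming a deps.edn-based project, the flags could be collected under an alias. The alias name `:jit-log` and the log file name are made up here; the flags themselves are standard HotSpot options, and LogCompilation is a diagnostic option that has to be unlocked first.

```clojure
;; deps.edn (fragment); alias and file names are hypothetical
{:aliases
 {:jit-log
  {:jvm-opts ["-XX:+UnlockDiagnosticVMOptions"   ; required for LogCompilation
              "-XX:+LogCompilation"              ; write a detailed XML compilation log
              "-XX:LogFile=jit-compilation.log"  ; where that log goes
              "-XX:+PrintCompilation"]}}}        ; one line per compiled method on stdout
```

Starting the process with this alias (for instance `clj -M:jit-log`) should then show which methods get compiled and when.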

Alexander Yakushev

Feb 1, 2019, 11:40:08 AM
to Clojure
The "easiest" way to obtain JIT-produced native code without having to sift through mountains of it is to use JMH[1] in -perfasm mode[2]. Here's an article on how to use JMH from Clojure[3].

However, if you start to spend so much time optimizing primitive operations in Clojure (and this really, really matters in your program), I'd suggest writing that part in Java if possible and calling it from Clojure. This library[4] greatly improves the experience of writing Java from inside Clojure projects.
