Thanks Zach.
I've spent some time in the past looking at disassembler output, and unless anything has changed I think Clojure always takes this lazy approach to type casting for non-primitives and relies on the JIT to optimize out the unnecessary casts. I haven't looked in the compiler, but I suspect this simplifies things by not having to worry about the type except when emitting method calls.
Typed sub-Object fields in types (and records) would allow us to work around the issue with the deftype solution, which is still a bit awkward since we need a type for every use case (less awkward if it also worked with reify, but that seems like more of a stretch). But I've also wished for typed fields outside of this context, so that we could do things like:
(defrecord Foo [^Map m])
(.put (.m (Foo. (HashMap.))) 1 2) ;; currently reflects, would not if the field m were typed as Map
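In the meantime, here is a minimal sketch of the usual workaround, assuming the caller is willing to re-hint the field value at the use site. The helper name put-hinted is hypothetical and only for illustration:

```clojure
(import '(java.util Map HashMap))

;; The ^Map hint on the field is accepted, but callers still see an Object.
(defrecord Foo [^Map m])

;; Hypothetical helper: re-hint the field value in a local binding so that
;; the .put call resolves at compile time instead of reflecting.
(defn put-hinted [^Foo foo k v]
  (let [^Map m (.m foo)]
    (.put m k v)))

(def r (Foo. (HashMap.)))
(put-hinted r 1 2)
```

With *warn-on-reflection* enabled, the call inside put-hinted should not produce a warning, whereas the unhinted (.put (.m foo) ...) form does. The record field itself is still stored and read as Object, so the hint in the field vector does not reach callers.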
This is also true of Clojure functions. Several members of our team have been confused by the fact that
(defn foo [^Map m] (seq m))
emits no typecast, so (foo "test") returns '(\t \e \s \t) rather than throwing, and the flipside of this is that
(defn foo [^Map m]
  (.put m 1 2)
  (.put m 3 4))
emits two typecasts. I think the JIT usually takes care of this, but for whatever reason that doesn't seem to be happening in our examples, e.g.:
(defn asum-identity [^doubles a]
  (let [len (long (alength a))]
    (loop [sum 0.0
           idx 0]
      (if (< idx len)
        (let [ai (aget a idx)]
          (recur (+ sum ai) (unchecked-inc idx)))
        sum))))
compiles to bytecode:
public java.lang.Object invoke(java.lang.Object);
Code:
0: aload_1
1: checkcast #78; //class "[D"
4: arraylength
5: i2l
6: lstore_2
7: dconst_0
8: dstore 4
10: lconst_0
11: lstore 6
13: lload 6
15: lload_2
16: lcmp
17: ifge 50
20: aload_1
21: checkcast #78; //class "[D"
24: lload 6
26: l2i
27: daload
28: dstore 8
30: dload 4
32: dload 8
34: dadd
35: lload 6
37: lconst_1
38: ladd
39: lstore 6
41: dstore 4
43: goto 13
46: goto 55
49: pop
50: dload 4
52: invokestatic #42; //Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
55: areturn
I don't know enough about bytecode or the JIT to understand why the second cast isn't elided in this case.
In general I think we'd prefer the initial cast upon binding and no further casts for safety, clarity, and predictable performance, but I don't expect this is on the table.
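To make the difference between cast-at-use and check-at-binding concrete, here is a small sketch. The function names size-lazy and size-checked are hypothetical, and the :pre condition is only an approximation of an eager cast: it rejects bad arguments on entry, but it does not remove the later casts at each interop call.

```clojure
(import 'java.util.Map)

;; The hint alone: the cast is emitted only where m is used in an interop
;; call, so a wrong argument goes unnoticed on the branch that never touches it.
(defn size-lazy [^Map m use?]
  (if use? (.size m) :untouched))

;; Approximating "check on binding" with a precondition (assumes *assert*
;; is left at its default of true). This fails fast with an AssertionError.
(defn size-checked [^Map m]
  {:pre [(instance? Map m)]}
  (.size m))

(size-lazy "test" false)   ; no cast is reached, so nothing is thrown
(size-checked {:a 1})      ; Clojure maps implement java.util.Map
```

Calling (size-lazy "test" true) throws a ClassCastException at the .size call, while (size-checked "test") fails on entry before any interop happens.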
-Jason