Type hinting inconsistencies in 1.3.0


Chas Emerick

May 11, 2011, 11:09:18 AM
to Clojure
(sorry for the prior hiccup, I accidentally sent a draft of my message)

I've been using 1.3.0 alphas for real work for a little while now, and would like to raise a couple of issues around type hinting so as to calibrate my expectations and understanding and/or provoke some discussion about some inconsistencies I'm seeing.

Some observations from a REPL interaction first:

=> *clojure-version*
{:major 1, :minor 3, :incremental 0, :qualifier "alpha6"}
=> (set! *warn-on-reflection* true)
true
=> (defn ^String foo [])
#'user/foo
=> (fn [] (String. (foo)))
#<user$eval398$fn__399 user$eval398$fn__399@215b011c>
=> (-> #'foo meta :tag)
java.lang.String

Right, so hinting a var's name works just fine, as in 1.2.x.  Things get wonky when hinting with primitives, though:

=> (defn ^long foo [])
#'user/foo
=> (fn [] (Long. (foo)))
#<CompilerException java.lang.IllegalArgumentException: Unable to resolve classname: clojure.core$long@41f6321, compiling:(NO_SOURCE_PATH:1)>
=> (-> #'foo meta :tag)
#<core$long clojure.core$long@41f6321>

The compilation fails because the var's :tag metadata is a function, clojure.core/long!  Bizarre.  I would assume that that's a bug, except that it appears the intended way to hint _primitive_ returns (but no other returns?) is on the arg vector, and not the var name:

=> (defn foo ^long [])
#'user/foo
=> (fn [] (Long. (foo)))
#<user$eval410$fn__411 user$eval410$fn__411@18287811>
=> (-> #'foo meta :tag)
nil
=> (meta @#'foo)
nil

So our compilation succeeds here, but the var's metadata is left :tag-less (understandably, since the hint is on the arg vector and not the var name), and the function has no metadata at all (this is surely a bug, but not my primary focus here).

My understanding is that the objective in hinting the arg vector is to allow for variable return hints on functions with multiple arities (which was not possible in 1.2.x).  This is an excellent aim, but it would seem that the current implementation presents some usage difficulties:

=> (defn foo
     (^String [])
     (^long [a])
     (^double [a b]))
#'user/foo
=> #(String. (foo))
#<user$eval2241$fn__2242 user$eval2241$fn__2242@25c2cbee>
Reflection warning, NO_SOURCE_PATH:1 - call to java.lang.String ctor can't be resolved.
=> #(Long. (foo))
#<user$eval2245$fn__2246 user$eval2245$fn__2246@508a8b07>
Reflection warning, NO_SOURCE_PATH:1 - call to java.lang.Long ctor can't be resolved.
=> #(Long. (foo 0))
#<user$eval2249$fn__2250 user$eval2249$fn__2250@c23c5ff>
=> #(Double. (foo 0))
#<user$eval2260$fn__2261 user$eval2260$fn__2261@35cfee57>
Reflection warning, NO_SOURCE_PATH:1 - call to java.lang.Double ctor can't be resolved.
=> #(Double. (foo 0 0))
#<user$eval2264$fn__2265 user$eval2264$fn__2265@5d504a84>

So, we can have variable hinting of fn returns of different arities, but only for primitive types.  This hurts, and should either be "fixed" or produce a compiler warning.  Also, there is no metadata on the var indicating the types of the different function arities:

=> (meta #'foo)
{:arglists ([] [a] [a b]), :ns #<Namespace user>, :name foo, :line 1, :file "NO_SOURCE_PATH"}

Somewhat worse from the standpoint of semantic consistency, hinting the var with ^String yields good — yet confusing — results:

=> (defn ^String foo
     ([])
     (^long [a])
     (^double [a b]))
#'user/foo
=> #(Double. (foo 0 0))
#<user$eval2289$fn__2290 user$eval2289$fn__2290@69996e15>
=> #(String. (foo))
#<user$eval2293$fn__2294 user$eval2293$fn__2294@220860ba>

And now the var metadata has a :tag, but only for the ^String hint:

=> (meta #'foo)
{:arglists ([] [a] [a b]), :ns #<Namespace user>, :name foo, :line 1, :file "NO_SOURCE_PATH", :tag java.lang.String}

I understand that the generated function implements the necessary interfaces when primitive hints are involved:

=> (supers (class foo))
#{clojure.lang.IFn$OL java.lang.Object java.lang.Runnable clojure.lang.Fn clojure.lang.AFn clojure.lang.IObj clojure.lang.IMeta clojure.lang.AFunction java.util.concurrent.Callable clojure.lang.IFn$OOD java.io.Serializable java.util.Comparator clojure.lang.IFn}

Having to touch (:tag (meta var)) as well as dig away at the implemented interfaces of generated function classes (which I have to assume are strictly implementation details) seems unnecessarily inconsistent.

Finally, I'd like to question the current path of supporting hinting of primitive returns — and primitive returns only — via function arg vectors.  Again, to me, it's a simple point of consistency, this time from a mostly syntactic perspective; most of the examples below have already been seen above, but I repeat them so that they're all co-located for easy comparison:

We hint object returns via the var name:

(defn ^String foo [])

But we hint primitive returns via arg vectors:

(defn foo ^long [])

Different arities of functions can be hinted, but _only_ with primitive types, and on arg vectors:

(defn foo
  (^long [])
  (^double [a]))

While a hint added to a function's var "cascades" down to any unhinted arities of a function:

(defn ^String foo
  ([] "bar")
  (^double [a])
  ([a b] :surprise!))

I wasn't able to find a canonical discussion of the introduction of this semantic of hinting argument vectors, so a pointer to one (or a new one here!) would be great.  It doesn't make a lot of sense to me — return hints are related to the function, not the arguments.  I thought that perhaps hinting the arg vector was provided so as to support hinting returns of anonymous functions, but that doesn't seem to be the case:

=> (def a (fn ^long []))
#'user/a
=> #(Long. (a))
#<user$eval2390$fn__2391 user$eval2390$fn__2391@3865db85>
Reflection warning, NO_SOURCE_PATH:1 - call to java.lang.Long ctor can't be resolved.

From a naive perspective, understanding that perhaps implementation details and historical considerations are driving the current state of things, I would expect this to work, and be the most semantically-consistent usage:

(defn foo
  ^String ([] "bar")
  ^double ([a] 5.6)
  ([a b] :not-hinted-at-all))

…with appropriate metadata on #'foo and the function itself indicating the corresponding arity<=>return types.

Of course, this isn't how 1.3.0 alphas work now, but unless things are really locked down, I hope the above is a reasonable starting point for discussing how to smooth out the inconsistencies that currently exist.

Cheers,

- Chas

Armando Blancas

May 11, 2011, 1:20:46 PM
to Clojure
Have you seen this?
http://www.assembla.com/wiki/show/clojure/Enhanced_Primitive_Support

•hint for return goes on arg vector
◦e.g. (defn ^:static foo ^long [x] …)
◦this so it supports multi-arity with varied returns

I couldn't find a practical use of doing (defn ^String foo [] ...)
what's hinting the fn name good for?

The code in 1.3 is actually better than I had anticipated since you
get the speed benefits whether or not your arguments are primitives
(see Static linking); I thought you'd need two implementations of
every call that takes primitives.

You don't mention ^:static and that seems to be required for best
performance.
:static
•defn supports {:static true} metadata
•:static fns can take/return longs and doubles in addition to Objects
•compiler will compile static methods in addition to IFn virtual
methods

The metadata isn't there, so I guess some symbol table is involved in
compiling static functions, not just the var.

user=> (defn fib [n]
  (if (<= n 1)
    1
    (+ (fib (dec n)) (fib (- n 2)))))
#'user/fib
user=> (time (fib 38))
"Elapsed time: 25383.208481 msecs"
63245986
user=> (defn ^:static fib ^long [^long n]
  (if (<= n 1)
    1
    (+ (fib (dec n)) (fib (- n 2)))))
#'user/fib
user=> (meta (var fib))
{:arglists ([n]), :ns #<Namespace user>, :name fib, :static true, :line 149, :file "NO_SOURCE_PATH"}
user=> (time (fib 38))
"Elapsed time: 2792.755504 msecs"
63245986
user=> (time (fib 38N))
"Elapsed time: 2777.905013 msecs"
63245986

Chas Emerick

May 11, 2011, 1:40:03 PM
to clo...@googlegroups.com

On May 11, 2011, at 1:20 PM, Armando Blancas wrote:

Indeed, though that's old. See http://dev.clojure.org/display/doc/Enhanced+Primitive+Support instead.

> I couldn't find a practical use of doing (defn ^String foo [] ...)
> what's hinting the fn name good for?

That's how one hints a non-primitive return type; that's been the case for a long time now.


> You don't mention ^:static and that seems to be required for best
> performance.
> :static
> •defn supports {:static true} metadata
> •:static fns can take/return longs and doubles in addition to Objects
> •compiler will compile static methods in addition to IFn virtual
> methods

^:static has been a no-op for a while AFAIK, made unnecessary after changes to vars a while back.

> The metadata isn't there, so I guess some symbol table is involved in
> compiling static functions, not just the var.

The compiler relies upon a stack of interfaces that define primitive-hinted function invocation:

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/IFn.java#L91

I haven't investigated the details sufficiently to comment beyond that.

- Chas

Chas Emerick

May 11, 2011, 1:48:15 PM
to clo...@googlegroups.com

On May 11, 2011, at 11:09 AM, Chas Emerick wrote:

> Of course, this isn't how 1.3.0 alphas work now, but unless things are really locked down, I hope the above is a reasonable starting point for discussing how to smooth out the inconsistencies that currently exist.

After some clarifying discussion in #clojure, I wanted to explicitly state my concrete suggestions, as I should have done in my original message.  I think that:

  • as much as possible, return type information, whether obtained by a primitive or non-primitive hint, should be in the var and function metadata in a form amenable to tooling and other runtime introspection
  • hinting arg vectors in order to indicate function return types is a syntactically poor choice
    • In any case, the syntactic position of function return type hints should be consistent regardless of the "category" of return type

- Chas

P.S. BTW, my expectation that :tag metadata should trickle down to functions themselves was driven by what looks like a bug in 1.2.0 where it *appears* that that happens, but only after one has redefined a var: https://gist.github.com/248246a3ac545a20a7b3

Armando Blancas

May 11, 2011, 3:10:50 PM
to Clojure
Thanks for the link to the new documentation.

> That's how one hints a non-primitive return type; that's been the case for a long time now.

I know, but haven't needed it; I guess it's for Java interop.

> hinting arg vectors in order to indicate function return types is a syntactically poor choice
> In any case, the syntactic position of function return type hints should be consistent regardless of the "category" of return type

I can see the inconsistency but for multiple arity/body functions
that's not a bad place to put it. My suggestion would be to extend
(declare) à la maclisp and keep the actual fn clean of hints (at least
the signature).
http://bitsavers.trailing-edge.com/pdf/mit/ai/aim/AIM-421.pdf

Ken Wesson

May 11, 2011, 4:45:34 PM
to clo...@googlegroups.com
On Wed, May 11, 2011 at 11:04 AM, Chas Emerick <ceme...@snowtide.com> wrote:
> Somewhat worse from the standpoint of semantic consistency, hinting the var
> with ^String yields good — yet confusing — results:
> => (defn ^String foo
>      ([])
>      (^long [a])
>      (^double [a b]))
> #'user/foo
> => #(Double. (foo 0 0))
> #<user$eval2289$fn__2290 user$eval2289$fn__2290@69996e15>
> => #(String. (foo))
> #<user$eval2293$fn__2294 user$eval2293$fn__2294@220860ba>
> And now the var metadata has a :tag, but only for the ^String hint:
> => (meta #'foo)
> {:arglists ([] [a] [a b]), :ns #<Namespace user>, :name foo, :line 1, :file
> "NO_SOURCE_PATH", :tag java.lang.String}

What do you get from (map meta (:arglists (meta #'foo)))?

David Nolen

May 11, 2011, 4:47:13 PM
to clo...@googlegroups.com
On Wed, May 11, 2011 at 11:09 AM, Chas Emerick <ceme...@snowtide.com> wrote:
> (defn foo
>   ^String ([] "bar")
>   ^double ([a] 5.6)
>   ([a b] :not-hinted-at-all))

Given the current implementation I really don't see how this could work. Keeping the old fn type-hint in a separate place from the new primitive type hints is wise until the implementation can actually deliver unified semantics.

David

David Nolen

May 11, 2011, 4:59:04 PM
to clo...@googlegroups.com
On Wed, May 11, 2011 at 3:10 PM, Armando Blancas <armando...@yahoo.com> wrote:
> I can see the inconsistency but for multiple arity/body functions
> that's not a bad place to put it. My suggestion would be to extend
> (declare) à la maclisp and keep the actual fn clean of hints (at least
> the signature).
> http://bitsavers.trailing-edge.com/pdf/mit/ai/aim/AIM-421.pdf

This is less generic than what already exists in Clojure 1.3.0: fns that consume primitives can efficiently dispatch on arities and types.

David 

Chas Emerick

May 11, 2011, 5:26:00 PM
to clo...@googlegroups.com

=> (->> #'foo
        meta
        :arglists
        (map (juxt identity meta)))
([[] nil] [[a] {:tag long}] [[a b] {:tag double}])

Beautiful. I clearly wasn't clever enough to spelunk deeply enough into the metadata. Thank you, Ken.

I'd hope that the [] arity could get a :tag of java.lang.String, but that may be running into semantic difficulties as David is potentially pointing out separately.
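To make the lookup above a little more concrete, here is a minimal sketch of a helper that gathers the return hint recorded for each arity. The name `arity-return-tags` is hypothetical (it is not part of clojure.core), and the fallback to the var's own :tag for unhinted arities is an assumption that mirrors the "cascading" behaviour described earlier in the thread, not something the compiler exposes as an API.

```clojure
;; Hypothetical helper, not part of clojure.core: map each arglist of the
;; fn held by var v to the return hint recorded for that arity.
(defn arity-return-tags
  [v]
  (let [m (meta v)]
    (into {}
          (for [arglist (:arglists m)]
            ;; A :tag on the arg vector itself takes precedence; otherwise
            ;; fall back (by assumption) to the :tag on the var.
            [arglist (or (:tag (meta arglist)) (:tag m))]))))

;; Same shape as the example discussed in this thread.
(defn ^String foo
  ([] "bar")
  (^long [a] 1)
  (^double [a b] 2.0))

(arity-return-tags #'foo)
```

Tooling would still have to cope with the two kinds of value that come back: whatever is stored under the var's :tag on one hand, and the symbols `long` and `double` taken from the arg vectors on the other.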

Thanks again,

- Chas

Chas Emerick

May 11, 2011, 7:43:02 PM
to clo...@googlegroups.com
Assuming the JVM as it is today (i.e. no fixnums, etc.), is there any potential that primitive and non-primitive semantics can be unified?  Is that even desirable considering the host?

Even if the answer is 'yes', if one wanted different syntax to correspond with different semantics, this is not the most obvious distinction:

^double ([a] 5.6)
(^double [a] 5.6)

My only point with the strawman you quoted is that the hint is for the return type, and has nothing to do with that arity's arguments – so, why put it there?

Syntactically (and semantically before they started participating in IFn class interface selection), type hints are metadata on the expression following the hint, so their current position implies that they relate to arguments.  This is more clear with a single-arity declaration:

(defn ^String foo [])
(defn foo ^long [])
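The claim that a hint is simply reader metadata on the form that follows it can be checked without defining anything. The sketch below only uses `read-string` and `meta` from clojure.core; the strings being read are illustrative.

```clojure
;; ^Foo form is reader shorthand for attaching {:tag Foo} to form.
(def name-hint (meta (read-string "^String foo")))  ; hint lands on the symbol foo
(def args-hint (meta (read-string "^long []")))     ; hint lands on the arg vector

;; At read time both tags are plain symbols; nothing has been resolved yet.
(symbol? (:tag name-hint))
(symbol? (:tag args-hint))
```

Seen this way, the two placements differ only in which form carries the :tag; what each one means is decided later by `defn` and the compiler.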

Those that understand the nuanced semantic distinction will understand what's going on regardless of where the hint is placed; those that don't will think the syntactic difference is arbitrary, a just-so difference.

- Chas

David Nolen

May 11, 2011, 9:04:55 PM
to clo...@googlegroups.com
On Wed, May 11, 2011 at 7:43 PM, Chas Emerick <ceme...@snowtide.com> wrote:
> (defn ^String foo [])
> (defn foo ^long [])

The real problem is that first case was never actually specifying a return type - it is a type hint on the var that happened to store an fn. The fn itself can only return Object (<=1.2.0).

Now fns can return Object, long, double.

David

Chas Emerick

May 11, 2011, 9:25:46 PM
to clo...@googlegroups.com
Yes, I understand that, but is the differentiated placement of the hint intended to communicate that difference?  I assume there's only one or two people that might know…

- Chas

Ken Wesson

May 11, 2011, 11:58:39 PM
to clo...@googlegroups.com
On Wed, May 11, 2011 at 9:04 PM, David Nolen <dnolen...@gmail.com> wrote:
> On Wed, May 11, 2011 at 7:43 PM, Chas Emerick <ceme...@snowtide.com> wrote:
>>
>> (defn ^String foo [])
>> (defn foo ^long [])
>
> The real problem is that first case was never actually specifying a return
> type - it is a type hint on the var that happened to store an fn. The fn
> itself can only return Object (<=1.2.0).

??

Using 1.2:

sandbox=> (defn bar [] "foo")
#'sandbox/bar
sandbox=> (defn foo [] (.length (bar)))
#'sandbox/foo
Reflection warning, NO_SOURCE_PATH:1 - reference to field length can't be resolved.
sandbox=> (defn ^String baz [] "foo")
#'sandbox/baz
sandbox=> (defn foo [] (.length (baz)))
#'sandbox/foo

Looks like it recognizes it as hinting the return type to me.

David Nolen

May 12, 2011, 1:04:46 AM
to clo...@googlegroups.com

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L3236

Ken Wesson

May 12, 2011, 2:09:40 AM
to clo...@googlegroups.com

What? Please actually respond in English, rather than by pointing
mutely at something whose salience is not even apparent.

Chas Emerick

May 12, 2011, 6:01:12 AM
to clo...@googlegroups.com

On May 12, 2011, at 2:09 AM, Ken Wesson wrote:

>>>> The real problem is that first case was never actually specifying a
>>>> return
>>>> type - it is a type hint on the var that happened to store an fn. The fn
>>>> itself can only return Object (<=1.2.0).
>>>
>>> ??
>>>
>>> Using 1.2:
>>>
>>> sandbox=> (defn bar [] "foo")
>>> #'sandbox/bar
>>> sandbox=> (defn foo [] (.length (bar)))
>>> #'sandbox/foo
>>> Reflection warning, NO_SOURCE_PATH:1 - reference to field length can't
>>> be resolved.
>>> sandbox=> (defn ^String baz [] "foo")
>>> #'sandbox/baz
>>> sandbox=> (defn foo [] (.length (baz)))
>>> #'sandbox/foo
>>>
>>> Looks like it recognizes it as hinting the return type to me.
>>
>> https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L3236
>
> What? Please actually respond in English, rather than by pointing
> mutely at something whose salience is not even apparent.

What David is getting at is that Object-based hints (as we've had all along) don't impact the signature of the generated function's invoke method -- their return types are always Object, and the hint is "just" used to determine which cast to use when the function's return is used in an interop context (and only in interop contexts, since "regular" Clojure functions never have statically-typed arguments). In contrast, primitive type hints _do_ change the signature of the corresponding function's invoke* method via the selection of the appropriate IFn$NNNNN interface that has statically-typed arguments and return types.
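A small sketch of that distinction, using the IFn$OL interface that already shows up in the `supers` output earlier in this thread. The function names `obj-hinted` and `prim-hinted` are made up for illustration.

```clojure
;; Object hint on the var name: stored as var metadata, the generated
;; class keeps the ordinary Object-returning invoke.
(defn ^String obj-hinted [a] "bar")

;; Primitive hint on the arg vector: the generated class additionally
;; implements a primitive interface (Object arg, long return).
(defn prim-hinted ^long [a] 1)

(instance? clojure.lang.IFn$OL prim-hinted) ; primitive signature is present
(instance? clojure.lang.IFn$OL obj-hinted)  ; no primitive signature here
```

Checking interfaces like this leans on what are presumably implementation details, so it is offered only as a way to observe the difference, not as something tooling should depend on.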

What started this is I don't happen to think that's relevant re: syntax and the placement of the type hint — or, more to the point, that essentially no one does/will/should understand the difference in the implementation details and choices that are represented in the differentiated placement of the hint in:

(defn ^String foo [])
vs.
(defn foo ^long [])

- Chas

Ken Wesson

May 12, 2011, 6:18:17 AM
to clo...@googlegroups.com
On Thu, May 12, 2011 at 6:01 AM, Chas Emerick <ceme...@snowtide.com> wrote:
> What David is getting at is that Object-based hints (as we've had all along) don't impact the signature of the generated function's invoke method -- their return types are always Object, and the hint is "just" used to determine which cast to use when the function's return is used in an interop context (and only in interop contexts, since "regular" Clojure functions never have statically-typed arguments).

That's* different from his original claim that you "can't hint return
types", or words to that effect**; you can't enforce return types (as
in, returning the wrong type invariably throws CCE, or else it won't
compile), but you can hint them (and then wrong type used in interop
expression may cause CCE).
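A minimal sketch of the "hinted but not enforced" case, with made-up names `not-a-string` and `len`: the definition compiles even though the body contradicts the hint, and the mismatch only surfaces where the hinted return is used in an interop call.

```clojure
;; The hint says String, the body returns a Long; nothing complains here.
(defn ^String not-a-string [] 42)

;; The hint lets the compiler emit a direct String.length call (no
;; reflection), which involves a cast of the returned value to String.
(defn len [] (.length (not-a-string)))

;; Only at call time does the wrong type show up, as a ClassCastException.
(def result
  (try
    (len)
    (catch ClassCastException e :cce)))
```

Calling `(not-a-string)` on its own, outside any interop expression, simply returns 42.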

With 1.3, it looks like maybe you can enforce primitive return types
in some cases.

* If it even IS what he was getting at. All he posted was a URL, and I
don't think it's at all clear that that was what he meant by doing so.
And yes, that's after clicking through it and having a cursory look
around near the landing site in the code.

** To be exact, he said you can't "specify" return types, "it is a
type hint on the var that happened to store an fn". This reads to me
as implying that hinting (defn ^String foo [] ...) is hinting that foo
references a String rather than an IFn that returns a String, but the
compiler sure seems to interpret foo as referencing an IFn that
returns a String. :)

Juha Arpiainen

May 12, 2011, 6:44:14 AM
to Clojure
On May 12, 1:18 pm, Ken Wesson <kwess...@gmail.com> wrote:
> This reads to me
> as implying that hinting (defn ^String foo [] ...) is hinting that foo
> references a String rather than an IFn that returns a String

It is if you use 'foo in argument position. That is, if Foo/bar is
overloaded for String and IFn, then (Foo/bar foo) resolves to the
wrong version.

--
Juha Arpiainen

Ken Wesson

May 12, 2011, 3:53:24 PM
to clo...@googlegroups.com

MORE type hinting inconsistencies, then.

There does seem to be something of a mess in this area, in need of
cleaning up before 1.3 goes stable.

Chas Emerick

May 12, 2011, 11:32:00 PM
to clo...@googlegroups.com

Object-based hints are not new, and haven't changed in 1.3.0 as far as I can tell. Such hints are _hints_, not type declarations (in the case of a var's value) or return type declarations (in the case of a fn held by a var). This is distinct from primitive "hints", which — in the context we've been discussing them — _are_ return type declarations (i.e. they change the type of the return from the function's generated class).

- Chas

Ken Wesson

May 13, 2011, 12:13:43 AM
to clo...@googlegroups.com
On Thu, May 12, 2011 at 11:32 PM, Chas Emerick <ceme...@snowtide.com> wrote:
> Object-based hints are not new, and haven't changed in 1.3.0 as far as I can tell.  Such hints are _hints_, not type declarations (in the case of a var's value) or return type declarations (in the case of a fn held by a var).

I never said they were not hints.
