Feature proposal: function overloading on type?

809 views
Skip to first unread message

Mikera

unread,
Jan 7, 2013, 1:53:22 AM1/7/13
to cloju...@googlegroups.com
Hi all,

I've been writing a lot of Java library interop code recently, and keep coming across patterns where you end up having to branch on the type of some parameter. Contrived but representative example follows:

(defn process [obj]
  (cond
    (instance? IFn obj) (obj)
    (instance? List obj) (process-list obj)
    (instance? Number obj) (compute-primitive-double (double obj))
    :else (throw (IllegalArgumentException.))))

This kind of code would feel much nicer if Clojure allowed overloading on argument type as well as arity. e.g.

(defn process
  ([^IFn obj] (obj))
  ([^List obj] (process-list obj))
  ([^Number obj] (compute-primitive-double (double obj)))) 

Assumptions on the logic of how this would work:
 - You can overload a method based on Java class/interface, Clojure defrecord/deftype, primitives and perhaps Clojure protocols
 - Bytecode would get generated for overloaded methods of the correct type
 - Extra metadata could be added to let the compiler and/or user know about the iverloaded versions
 - Overload precedence would presumably be the same as Java, in the case of multiple matching types
 - If you don't provide an untyped version, one gets generated automatically that satisfies the regular IFn interface and performs conditional branching (which then delegates to the appropriate typed version)
 - No changes needed to existing code (since existing code won't have overloads, and a regular IFn implementation still gets generated)

This would offer a few advantages / opportunities:
 - It avoids the need to write ugly cond statements in many cases
 - It opens the door to many optimisations, e.g. in cases where the compiler knows the exact type at a call site, it can call the overloaded method directly and avoid casting / boxing / instanceof checks. I know this is a tiny gain, but it would apply to a lot of circumstances (including parts of clojure.core) so could easy add up to a decent performance gain overall.
 - It would possibly help with implementing Clojure-in-Clojure - I observe that a lot of the clojure jvm source e.g. clojure.lang.RT uses similar branch-on-instanceof patterns
 - I believe it is consistent with the "gradual typing" philosophy, which might be a good long-term direction to be heading (i.e. continue to be a dynamic language at heart, but offer optional typing features to people who want / need them)

Downsides:
 - It might get abused?
 - It's extra complexity in the language. Although we already have type hints and arity overloads, so it's pretty much  consistent as an extension of these.
 - There are probably some subtle complexities in terms of how it might interact with type hints / primitive functions. 
 - Obviously it would be a reasonable amount of work to implement at the compiler level.

Any interest in a feature like this? Maybe this is something for Clojure 2.0?

Sean Corfield

unread,
Jan 7, 2013, 2:08:22 AM1/7/13
to cloju...@googlegroups.com
My initial reaction: can't you do this with multi-methods and dispatch on type?
> --
> You received this message because you are subscribed to the Google Groups
> "Clojure Dev" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/clojure-dev/-/VUVq6_DIgIUJ.
> To post to this group, send email to cloju...@googlegroups.com.
> To unsubscribe from this group, send email to
> clojure-dev...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/clojure-dev?hl=en.



--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

kovas boguta

unread,
Jan 7, 2013, 2:19:50 AM1/7/13
to cloju...@googlegroups.com
Its this substantially different than a single-function protocol?

Mikera

unread,
Jan 7, 2013, 5:31:15 AM1/7/13
to cloju...@googlegroups.com
You can certainly achieve the same functionality with multimethods.

However I think the syntax is much more elegant if it can be done as an extension to the normal arity overriding of functions that Clojure already provides.

Also, multimethods are slow compared with normal JVM method dispatch, which typed overloads would be able to exploit directly. It's an optimisation analogous to how Clojure provides the ability to define primitive functions to avoid the overhead of boxing.

Mikera

unread,
Jan 7, 2013, 5:41:35 AM1/7/13
to cloju...@googlegroups.com
It's different from single-function protocols for various reasons. Off the top of my head:
- It isn't designed to be open to extension to new types like protocols are. It's much more like defining a multi-arity function - you have to redefine the whole function if you want to add a new arity.
- It could allow specialisation on the type of any argument, not just the type of the first one
- It could work on primitive types
- It could theoretically be faster than protocols in some circumstances (JIT dependent, of course...)

kovas boguta

unread,
Jan 7, 2013, 6:01:01 AM1/7/13
to cloju...@googlegroups.com
Got it. Just asked because the examples happend to be all about the
first argument.

dnolen had looked into related issues in some depth, but I can't find
the threads now.

I like the idea. I presume this also applies for anonymous functions?

I would also be in favor of dispatching on keywords (or arbitrary
values for that matter), and on structural features like
(defn foo
([a [b]] "case 1")
([a [b c]] "case 2"))

The spirit of the idea is to dispatch on "what kind of thing it is",
and it would be a shame if the only measure of our things is what type
they are.

Its also worth looking at what Julia has done wrt multiple dispatch:
http://www.infoq.com/presentations/Julia
> https://groups.google.com/d/msg/clojure-dev/-/h60vOjpcq-QJ.

Paul Stadig

unread,
Jan 7, 2013, 7:21:25 AM1/7/13
to cloju...@googlegroups.com
On Mon, Jan 7, 2013 at 6:01 AM, kovas boguta <kovas....@gmail.com> wrote:
> Got it. Just asked because the examples happend to be all about the
> first argument.
>
> dnolen had looked into related issues in some depth, but I can't find
> the threads now.
>
> I like the idea. I presume this also applies for anonymous functions?
>
> I would also be in favor of dispatching on keywords (or arbitrary
> values for that matter), and on structural features like
> (defn foo
> ([a [b]] "case 1")
> ([a [b c]] "case 2"))
>
> The spirit of the idea is to dispatch on "what kind of thing it is",
> and it would be a shame if the only measure of our things is what type
> they are.
>
> Its also worth looking at what Julia has done wrt multiple dispatch:
> http://www.infoq.com/presentations/Julia

If I understand correctly, what Mike is describing is static dispatch,
and that is the major difference between what he is describing and
multimethods and protocols. What would happen is Clojure would
generate a class that implements IFn (which has only Object typed
arguments) *and* some 'invoke' methods that have specialized types. If
the compiler knows the type signatures of the arities of the function
being called and the types of the arguments, then it can generate
bytecode that would directly invoke the specialized invoke methods. If
it does not know the types of the arguments, then it would generate
code to call the Object invoke method. The body of the Object invoke
method would be a cond that either invokes the appropriate specialized
invoke method, or throws an exception.

I don't know how well this idea could be extended to other features.
Any feature would have to be knowable at compile time, and also
encodable as a Java method either with types (which is directly
supported by Java) or with name munging or something.

One drawback I see with this is that it is very concrete. Protocols
work against interfaces which allow multiple implementations. The IFn
interface allows only Object typed methods. Since they are not part of
any interface, these specialized methods are specific not only to the
static types of the arguments, but also the function that you are
compiling. Any high order functions would be unable to take advantage
of these specialized methods, because the compiler would only see them
as IFn instances. You could possiblly pass some additional type
information to a high order function, but it would tie itself directly
to the concrete class of the function with specialized methods, since
interfaces are the way to abstract away from concrete instances, and
again these specialized methods are not part of any interface.

The other way to abstract away from concrete classes is to take
advantage of invokedynamic. With invokedynamic you can compile an
invocation of a method with a specialized type signature that is not
tied to a concrete class, and at run time you link that type signature
to a specific class. invokedynamic may be faster than the current way
to do this (Mike's original example of a cond dispatching on type),
but it probably won't be faster than invoking through an interface, at
least not yet. It also would require Java 7.

I've had similar thoughts to Mike, but it was in the context of "how
would I implement Clojure differently knowing I can use
invokedynamic?" and I would probably do something like this where you
get rid of the IFn interface and just dispatch to a method named
'invoke'. You could encode as much type information as you know at
compile time into the invocation, and link it at runtime, either
calling a specialized method or an Object typed method.


Paul

Mike Anderson

unread,
Jan 10, 2013, 9:20:38 PM1/10/13
to cloju...@googlegroups.com
Yes - that is exactly what I was thinking.
 

I don't know how well this idea could be extended to other features.
Any feature would have to be knowable at compile time, and also
encodable as a Java method either with types (which is directly
supported by Java) or with name munging or something.

I think it could be extended relatively easily to other features / arbitrary type predicates. It would involve generating pattern matching style code based on whatever is known at compile time. The hardest thing would probably be defining which method gets used when multiple conditions match.

The matching on type is definitely the common case / most useful version to have though. Determining which method gets called is pretty well defined in this case thanks to Java overloading rules.
 

One drawback I see with this is that it is very concrete. Protocols
work against interfaces which allow multiple implementations. The IFn
interface allows only Object typed methods. Since they are not part of
any interface, these specialized methods are specific not only to the
static types of the arguments, but also the function that you are
compiling. Any high order functions would be unable to take advantage
of these specialized methods, because the compiler would only see them
as IFn instances. You could possiblly pass some additional type
information to a high order function, but it would tie itself directly
to the concrete class of the function with specialized methods, since
interfaces are the way to abstract away from concrete instances, and
again these specialized methods are not part of any interface.

I don't see this as a drawback particularly: you still have the IFn interface when you want to work with higher order functions in a generalised way. 

And theoretically a smart enough compiler could also specialise higher order functions at compile time using the concrete types it can prove it is working with. Something fun to play with if anyone needs a PhD research project :-)

Reply all
Reply to author
Forward
0 new messages