I created a library for Clojure to do open, single dispatch
polymorphism. What does this mean?
* A polyfn dispatches on the type of its first argument.
* You can add an implementation for a new type to an existing polyfn.
* You can define a new polyfn on an existing type.
Polyfns are exactly as fast as protocol functions (I am using the same
caching and dispatch code), but they do not generate a Java interface,
and they are slightly simpler to define.
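To make the three bullet points concrete, here is a minimal sketch of open, type-based single dispatch. It is not the polyfn library's code or API; `describe-impls`, `find-impl`, `describe`, and `extend-describe!` are hypothetical names, and the sketch skips the caching that real protocol dispatch relies on.

```clojure
;; Hypothetical sketch; none of these names come from the polyfn library.
(def describe-impls
  "Dispatch table: class -> implementation fn."
  (atom {}))

(defn find-impl
  "Looks up an implementation for class c: exact class first, then any
  superclass or interface. The order among supers is unspecified here."
  [table c]
  (or (get table c)
      (some table (supers c))))

(defn describe
  "Dispatches on the type of its first (and only) argument."
  [x]
  (if-let [f (find-impl @describe-impls (class x))]
    (f x)
    (throw (IllegalArgumentException.
            (str "No describe implementation for " (class x))))))

(defn extend-describe!
  "Adds (or replaces) the implementation for class c."
  [c f]
  (swap! describe-impls assoc c f))

;; Existing Java types can be given implementations after the fact.
(extend-describe! String (fn [s] (str "string of length " (count s))))
(extend-describe! Number (fn [n] (str "number " n)))
```

With this in place, `(describe "abc")` goes through the `String` entry and `(describe 42)` reaches the `Number` entry via the superclass lookup.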
I like the idea, but it seems to go against the pattern set by multi-fns:
(defmulti foo...)
(defmethod foo ...)
Could we do some pattern like that?
(defpolyfn foo...)
(extendpolyfn foo ...)
I'm thinking about situations like testing, REPLs, etc., where sometimes I
actually do want to completely redefine the function. I'd like a way to
re-create the polyfn. From what I see above, it looks like the creation of
the polyfn is implicit rather than explicit.
> I like the idea, but it seems to go against the pattern set by multi-fns:
> (defmulti foo...)
> (defmethod foo ...)
> Could we do some pattern like that?
> (defpolyfn foo...)
> (extendpolyfn foo ...)
> I'm thinking about situations like testing, repls, etc. When sometimes I
> actually do want to completely redefine the function. I'd like a way to
> re-create the polyfn. From what I see above it looks like the creation of
> the polyfn is implicit rather than explicit.
At first I had that kind of syntax, but, just as with multifns, if the
defining form were allowed to zap the dispatch table, it would probably
make it more difficult to write code interactively at the REPL and to
recompile a namespace. The only purpose of defmulti is to define
the dispatch function, and in the case of polyfns the dispatch is
assumed to be on the type of the first argument, so it didn't seem
there was a need for (defpolyfn foo...) (extendpolyfn foo ...).
I've thought about whether there might be some use for a way to add
and remove implementations from the dispatch table, or perhaps reset
it. I'm open to ideas around this.
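For comparison, this is how multimethods in clojure.core handle the same tension. The example is only a sketch: the multimethod name `area` and its dispatch keys are made up for illustration.

```clojure
(defmulti area :shape)
(defmethod area :square [{:keys [side]}] (* side side))

;; Re-evaluating defmulti does nothing while the var already holds a
;; multifn, so both the method table and the original dispatch fn
;; (:shape, not :kind) survive a namespace reload.
(defmulti area :kind)

(def area-after-reload
  (area {:shape :square :side 2}))

;; Removing implementations is an explicit, separate step.
(remove-method area :square)   ; drop a single entry
(remove-all-methods area)      ; or reset the whole table
```

To change the dispatch function itself, the var has to be unmapped first, for example with `(ns-unmap *ns* 'area)`.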
> I created a library for Clojure to do open, single dispatch
> polymorphism. What does this mean?
> * A polyfn dispatches on the type of its first argument.
> * You can add an implementation for a new type to an existing polyfn.
> * You can define a new polyfn on an existing type.
> Polyfns are exactly as fast as protocol functions (I am using the same
> caching and dispatch code), but they do not generate a Java interface,
> and they are slightly simpler to define.
Sounds cool.
I have a bunch of mostly one-method-protocols that I extend upon
existing (95% java) types. I don't rely on the existence of the
protocol interfaces, and neither do I use extends?, satisfies?, or
extenders.
Would it make sense to switch to polyfns? Are there more advantages
except from the definitions being slightly more concise?
One minor problem I have with the protocol approach is that if you
recompile a protocol during interactive development, then calling the
protocol methods on already existing instances in your repl session of
types on which the protocol has been extended won't work anymore. Do
polyfns help there?
On Monday, October 8, 2012 1:55:50 PM UTC-4, Tassilo Horn wrote:
> Paul Stadig <pa...@stadig.name> writes:
> Hi Paul,
> > I created a library for Clojure to do open, single dispatch
> > polymorphism. What does this mean?
> > * A polyfn dispatches on the type of its first argument.
> > * You can add an implementation for a new type to an existing polyfn.
> > * You can define a new polyfn on an existing type.
> > Polyfns are exactly as fast as protocol functions (I am using the same
> > caching and dispatch code), but they do not generate a Java interface,
> > and they are slightly simpler to define.
> Sounds cool.
> I have a bunch of mostly one-method-protocols that I extend upon
> existing (95% java) types. I don't rely on the existence of the
> protocol interfaces, and neither do I use extends?, satisfies?, or
> extenders.
> Would it make sense to switch to polyfns? Are there more advantages
> except from the definitions being slightly more concise?
I can't say that you should necessarily switch to polyfns, but this was the kind of situation I was imagining. polyfns are the fast and open type based dispatch decomplected from protocols, and I think that's the main advantage, simplicity.
A drawback with polyfns is there's no Java interface that can be extended in Java code to participate in the dispatch. The Java interface is nice, but as you mention below, the generation of interfaces can cause staleness issues especially when paired with defrecord. defprotocol and defrecord both generate a new class each time you compile them, because of the way classes, class loaders, and class identity work.
Another difference between protocols and polyfns is that every time you compile a defprotocol form it regenerates the protocol functions, whereas a polyfn is generated once and never changes (only the dispatch table changes). This has implications, though I'm not sure how much they matter. YMMV.
> One minor problem I have with the protocol approach is that if you
> recompile a protocol during interactive development, then calling the
> protocol methods on already existing instances in your repl session of
> types on which the protocol has been extended won't work anymore. Do
> polyfns help there?
defrecord will behave differently depending on how you extend the protocol. If you extend the protocol inline in the defrecord form, then the class that defrecord generates will implement the protocol interface. In this case nothing gets added to the protocol dispatch table, and instead dispatch happens through the Java interface. When you recompile the defprotocol form, it regenerates the Java interface and the protocol functions. The new functions it generates know only about the new interface, so they complain when you give it an instance of your defrecord that you had stashed away before you recompiled.
If you define a defrecord and extend a protocol to it using extend-protocol or extend-type it adds an entry to the dispatch table for the protocol. Since dispatch happens through the dispatch table instead of the interface, then when you recompile your defprotocol and defrecord your old instances continue to work, but because of the way class identity works they continue to dispatch to the implementation of the protocol that was defined when you created your instance. In order to invoke the new implementation of the protocol, you need to create a new instance of your defrecord. Every time you recompile your defrecord it adds a new entry to the protocol's dispatch table, so the table will contain a number of entries on the order of the number of times you have recompiled your defrecord.
Since polyfns do not generate a Java interface through which they dispatch, they will behave like this second case. Your old instances will continue to work, but with the polyfn implementation that was associated with the instance's class, and your dispatch table size will be on the order of the number of times you have recompiled the polyfn forms.
I'm not sure if this is better or worse than what happens in the first case, but I do know that in 100% of the cases where I've used protocols I have not really needed the Java interface. Though it also doesn't really hurt anything to have the interface around if you're not using it.
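The two extension styles described above can be told apart at the REPL. This is a small sketch using only clojure.core; the protocol `Shown` and the records `InlineRec` and `ExtendedRec` are made-up names.

```clojure
(defprotocol Shown
  (show [x]))

;; Case 1: inline implementation. The generated class implements the
;; protocol's Java interface, and nothing is added to the dispatch table.
(defrecord InlineRec []
  Shown
  (show [_] "inline"))

;; Case 2: extension after the fact. This adds an entry to the
;; protocol's dispatch table instead.
(defrecord ExtendedRec [])
(extend-type ExtendedRec
  Shown
  (show [_] "extended"))

;; Only the extend-type case shows up in the dispatch table.
(def table-entries (vec (extenders Shown)))

;; Only the inline case is an instance of the generated interface.
(def inline-via-interface?
  (instance? (:on-interface Shown) (->InlineRec)))
(def extended-via-interface?
  (instance? (:on-interface Shown) (->ExtendedRec)))
```

Both records still answer `show`; the difference is only in which path the call takes.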
> A drawback with polyfns is there's no Java interface that can be
> extended in Java code to participate in the dispatch.
At least currently I don't see a use-case where I'd need to do that.
>> One minor problem I have with the protocol approach is that if you
>> recompile a protocol during interactive development, then calling the
>> protocol methods on already existing instances in your repl session
>> of types on which the protocol has been extended won't work anymore.
>> Do polyfns help there?
> defrecord will behave differently depending on how you extend the
> protocol. If you extend the protocol inline in the defrecord form,
> then the class that defrecord generates will implement the protocol
> interface. In this case nothing gets added to the protocol dispatch
> table, and instead dispatch happens through the Java interface. When
> you recompile the defprotocol form, it regenerates the Java interface
> and the protocol functions. The new functions it generates know only
> about the new interface, so they complain when you give it an instance
> of your defrecord that you had stashed away before you recompiled.
I never dug very deep into this problem, but now that you mention it,
I can confirm that this problem always occurred with deftype instances
that were extended with several protocols inline. As a consequence,
I've moved the protocols and deftypes into a separate namespace which I
usually don't need to recompile since those definitions are rather
stable. The downside is that now I have namespaces with a meaning plus
several purely technical sub-namespaces.
> Since polyfns do not generate a Java interface through which they
> dispatch, they will behave like this second case. Your old instances
> will continue to work, but with the polyfn implementation that was
> associated with the instance's class, and your dispatch table size
> will be on the order of the number of times you have recompiled the
> polyfn forms.
Ah, so if I change a polyfn for a type X, the new behavior will only be
available to new X instances, right?
Well, as said, those are pretty stable, so it's no issue for me. How
about the growth of the dispatch table? Say, I recompile a namespace
containing polyfn definitions 50 times during a session, will that slow
down dispatch noticeably? Maybe you could store the current active
definition form per type, so that identical definitions for the same
type don't get added to the dispatch table?
Oh, and here's a wishlist item: It should be possible to add a
docstring to polyfns, and maybe also other metadata (:pre, :post, :tag,
...). tools.macro/name-with-attributes makes that pretty easy.
And another question: Since defpolyfn expands into a defonce form, does
that mean that all (defpolyfn foo ...) forms have to be in the same
namespace? It looks to me that currently defining the same polyfn foo
in different namespaces bar and baz will create bar/foo and baz/foo
which have nothing to do with each other (you cannot `use` both bar and
baz). IMO, that would be an argument for splitting polyfns into a
single declaration form and many definition forms providing
implementations for several types (like defmulti and defmethod, or
defprotocol and extend).
<frank.siebenl...@gmail.com> wrote:
> Interesting project, although I'm still a little unclear about the "convincing" use cases where you would choose polyfn over protocols...
> Also, how does the polyfn implementation compare to the clojurescript protocol implementation?
Not sure I could comment on that, since I've never used ClojureScript.
The point of polyfns was to separate out the dispatch from the class
generation. If you don't care about the class generation, then they
would both work just as well. If you need the class generation, then
you need protocols. If you do not need class generation, then you need
polyfns. This is about simpler composable parts. Protocols could
conceivably have been built on something like polyfns, adding class
generation.
On Tue, Oct 9, 2012 at 2:40 AM, Tassilo Horn <t...@gnu.org> wrote:
> Ah, so if I change a polyfn for a type X, the new behavior will only be
> available to new X instances, right?
If you change a polyfn implementation for a type X, then the new
behavior will be available to all instances of X, even those created
before you changed the polyfn. The new versus old instances issues
only manifest when you are generating new types with the same "name"
as defrecord and defprotocol do. In that case, instances created with
the old version of the defrecord and defprotocol are actually
instances of a different type than instances created with the newly
compiled defrecord and defprotocol types.
> Well, as said, those are pretty stable, so it's no issue for me. How
> about the growth of the dispatch table? Say, I recompile a namespace
> containing polyfn definitions 50 times during a session, will that slow
> down dispatch noticeably? Maybe you could store the current active
> definition form per type, so that identical definitions for the same
> type don't get added to the dispatch table?
Recompiling a polyfn definition 50 times will only end up with a
dispatch table of one entry. The multiple entries get added when you
recompile defrecord and defprotocol. When you recompile a defrecord
form 50 times you get 50 different types which is why the dispatch
table grows both for polyfns and for protocols.
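The same effect can be observed with an ordinary protocol extended through `extend`, which, per this thread, shares its table-based dispatch with polyfns. This is a sketch with a made-up protocol `Labelled`, not polyfn code.

```clojure
(defprotocol Labelled
  (label [x]))

;; Re-extending the same class 50 times replaces the single entry for
;; String rather than adding 50 entries.
(dotimes [i 50]
  (extend String
    Labelled
    {:label (fn [s] (str "v" i ": " s))}))

(def entry-count (count (extenders Labelled)))

;; Any String, including ones created before the last re-extension,
;; dispatches to the most recent implementation.
(def latest (label "x"))
```

The table only grows when the class itself is regenerated, as happens each time a defrecord form is recompiled.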
> Oh, and here's a wishlist item: It should be possible to add a
> docstring to polyfns, and maybe also other metadata (:pre, :post, :tag,
> ...). tools.macro/name-with-attributes makes that pretty easy.
Good point. I'll look into this.
> And another question: Since defpolyfn expands into a defonce form, does
> that mean that all (defpolyfn foo ...) forms have to be in the same
> namespace? It looks to me that currently defining the same polyfn foo
> in different namespaces bar and baz will create bar/foo and baz/foo
> which have nothing to do with each other (you cannot `use` both bar and
> baz). IMO, that would be an argument for splitting polyfns into a
> single declaration form and many definition forms providing
> implementations for several types (like defmulti and defmethod, or
> defprotocol and extend).
Haha a good point. I'll look into this issue. Looks like I will have
to split out the definition from the extension. Should be simple
enough.
Paul Stadig <p...@stadig.name> writes:
> Recompiling a polyfn definition 50 times will only end up with a
> dispatch table of one entry. The multiple entries get added when you
> recompile defrecord and defprotocol. When you recompile a defrecord
> form 50 times you get 50 different types which is why the dispatch
> table grows both for polyfns and for protocols.
Thanks for clarifying.
>> Oh, and here's a wishlist item: It should be possible to add a
>> docstring to polyfns, and maybe also other metadata (:pre, :post, :tag,
>> ...). tools.macro/name-with-attributes makes that pretty easy.
> Good point. I'll look into this.
Great.
>> And another question: Since defpolyfn expands into a defonce form,
>> does that mean that all (defpolyfn foo ...) forms have to be in the
>> same namespace? It looks to me that currently defining the same
>> polyfn foo in different namespaces bar and baz will create bar/foo
>> and baz/foo which have nothing to do with each other (you cannot
>> `use` both bar and baz). IMO, that would be an argument for
>> splitting polyfns into a single declaration form and many definition
>> forms providing implementations for several types (like defmulti and
>> defmethod, or defprotocol and extend).
> Haha a good point. I'll look into this issue. Looks like I will have
> to split out the definition from the extension. Should be simple
> enough.
I'm happy to have complemented my questions with at least a bit useful
feedback and pointers. :-) At least this latter point is a blocker for
trying to replace my protocols with polyfns right now.
On Tuesday, October 9, 2012 7:24:57 AM UTC-4, Tassilo Horn wrote:
> I'm happy to have complemented my questions with at least a bit useful
> feedback and pointers. :-) At least this latter point is a blocker for
> trying to replace my protocols with polyfns right now.
Yeah, thanks for the feedback. I've pushed a new version that separates the definition and extension mechanisms into defpolyfn and extend-polyfn forms. I also added add-impl, remove-impl, and reset-polyfn methods for modifying the dispatch table directly.
>> Polyfns are exactly as fast as protocol functions (I am using the same
>> caching and dispatch code), but they do not generate a Java interface,
>> and they are slightly simpler to define.
I think it might be a good idea to discuss why the Java interfaces are
created for protocols.
Before we had protocols, there were only multimethods. Multimethods are
more flexible than protocols, but are naturally somewhat slower due to the
fact that every single call requires a lookup in a hashmap. The fastest way
to do single-dispatch on the JVM is via interfaces. Protocols are a mixture
of the two. So the invoke of a protocol fn looks something like this (for
the protocol IFoo):
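A rough sketch of that shape follows. It is not the code Clojure actually generates: `IFooSketch`, `BarSketch`, `impl-table`, and `bar-sketch` are made-up names, and the real implementation also walks superclasses and caches lookups.

```clojure
;; Stand-in for the Java interface that defprotocol would generate.
(definterface IFooSketch
  (bar [arg1 arg2]))

;; Stand-in for the protocol's dispatch table: class -> implementation fn.
(def impl-table (atom {}))

(defn bar-sketch
  [obj arg1 arg2]
  (if (instance? IFooSketch obj)
    ;; Fast path: a plain JVM interface call.
    (.bar ^IFooSketch obj arg1 arg2)
    ;; Slow path: look the implementation up by class.
    (if-let [f (get @impl-table (class obj))]
      (f obj arg1 arg2)
      (throw (IllegalArgumentException.
              (str "No implementation of bar for " (class obj)))))))

;; A type that implements the interface directly takes the fast path.
(deftype BarSketch []
  IFooSketch
  (bar [this arg1 arg2] "Bar"))

;; A type registered in the table takes the slow path.
(swap! impl-table assoc String (fn [s arg1 arg2] "string"))
```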
So if the object implements IFoo then we get the "fast path" dispatch that
the JVM offers. Otherwise we're about as fast as multimethods. This means:
(defprotocol IFoo
  (bar [obj arg1 arg2]))

(extend-type String
  IFoo
  (bar [obj arg1 arg2] "string")) ; <--- uses slow path via the hashmap

(deftype Bar []
  IFoo
  (bar [obj arg1 arg2] "Bar")) ; <--- uses fast path via the interface
So I guess I have to ask the question again...what is the true use case of
polyfns? Are they faster than multimethods? If I can get a dramatic speed
up by using interfaces, why would I throw that away?
On Tue, Oct 9, 2012 at 9:36 AM, Timothy Baldridge <tbaldri...@gmail.com> wrote:
>>> Polyfns are exactly as fast as protocol functions (I am using the same
>>> caching and dispatch code), but they do not generate a Java interface,
>>> and they are slightly simpler to define.
> I think it might be a good idea to discuss why the Java interfaces are
> created for protocols.
I don't see how polyfns can be as fast as protocol functions that are
defined inline and thus backed by a Java interface. It has not been
my experience that this is true at all. Extending to a type later is
expressive but takes a measurable performance hit as far as I've seen.
>> I think it might be a good idea to discuss why the Java interfaces
>> are created for protocols.
> I don't see how polyfns can be as fast as protocol functions that are
> defined inline and thus backed by a Java interface. It has not been
> my experience that this is true at all. extending to a type later is
> expressive but takes a measurable performance hit as far as I've seen.
So indeed type dispatch with inline implementations seems to be nearly
an order of magnitude faster than the cache-lookup based dispatch you
have when extending protocols using extend.
Well, but I still see a use-case for polyfns namely in the situation
where you'd usually use a protocol and extend it to existing (java)
types you have no control over, and you don't need the generated
interface. That applies to many of my protocols.
But on the other hand, many of my protocols aren't open but merely an
implementation detail, and they are extended upon only a few java
classes (many only 2). For those it seems to be more efficient to use a
plain function explicitly dispatching on type using `instance?'...
Tassilo Horn <t...@gnu.org> writes:
> But on the other hand, many of my protocols aren't open but merely an
> implementation detail, and they are extended upon only a few java
> classes (many only 2). For those it seems to be more efficient to use
> a plain function explicitly dispatching on type using `instance?'...
I just did that now with 3 of my most frequently used protocols all
being extended to only 2 java types each, and it brought me a
performance boost of ~15% and a bit less code.
15% doesn't sound like much, but considering that a large portion of
the overall time is spent in the actual function logic and not in the
type dispatch, it's astonishingly good. Another good thing here is the
fact that I know that in ~95% of all cases the function is called on an
object of type A rather than the other possibility B, so writing
(condp instance? x
  A (do-A-stuff-with x)
  B (do-B-stuff-with x))
rather than first testing for B and then A makes a difference.
On Tuesday, October 9, 2012 9:37:20 AM UTC-4, tbc++ wrote:
> >> Polyfns are exactly as fast as protocol functions (I am using the same
> >> caching and dispatch code), but they do not generate a Java interface,
> >> and they are slightly simpler to define.
> So I guess I have to ask the question again...what is the true use case of
> polyfns? Are they faster than multimethods?
Dramatically faster than multimethods, just like non-interface based protocol dispatch. It is the same code.
> If I can get a dramatic speed up by using interfaces, why would I throw
> that away?
Sorry, I should have said that polyfns are exactly as fast as non-interface protocol dispatch. As with everything, it's all about tradeoffs. I don't recall that I said everyone should be replacing their uses of protocols with polyfns, so I don't know why I'm supposed to convince you to do so. I think it is useful to have fast polymorphic dispatch separate from class generation, and useful as an orthogonal building block with granularity at the function level. If you're interested in pure performance, then feel free to use protocols. There are downsides to that approach, too. I think maybe polyfns might be fast enough for most use cases, but I have some ideas for improving performance, too.
I've been experimenting with some Java 7 features. I have a stashed version of polyfns that uses java.lang.ClassValue as a cache for implementation fns instead of a clojure.lang.MethodImplCache. The performance is pretty good, and the implementation is simpler.
I'm wondering what kind of performance I could eke out by having a ClassValue cache that caches MethodHandles to the implementations instead of the IFns themselves. Obviously Clojure isn't emitting invokedynamic call sites, but I wonder whether the JVM could inline through that kind of a call chain.
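For readers who have not used `java.lang.ClassValue`, here is a minimal sketch of a per-class cache of implementation fns. It is not the stashed polyfn code; `impls`, `impl-cache`, and `cached-call` are hypothetical names, and it assumes Java 7 or later.

```clojure
;; Hypothetical dispatch table: class -> implementation fn.
(def impls
  (atom {String (fn [s] (str "string: " s))}))

;; ClassValue computes a value once per class and then caches it.
;; computeValue only runs on a cache miss for that class.
(def impl-cache
  (proxy [java.lang.ClassValue] []
    (computeValue [c]
      (get @impls c))))

(defn cached-call
  "Looks up the implementation for x's class through the cache and calls it."
  [x]
  (let [f (.get ^java.lang.ClassValue impl-cache (class x))]
    (f x)))
```

In a scheme like this, a cached entry would have to be dropped, for example with `(.remove impl-cache String)`, whenever the table entry for that class changes.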