Unpredictable behavior of Protocol Extension

32 views
Skip to first unread message

Chris Perkins

unread,
Feb 28, 2011, 6:28:19 AM2/28/11
to Clojure
I am wondering whether this is a known issue. I made the mistake of
extending a protocol to two interfaces, and then calling it with an
object that implements both.

(defprotocol Foo
(foo [x]))

(extend-protocol Foo
clojure.lang.Sequential
(foo [x] "sequential")
clojure.lang.Associative
(foo [x] "associative")
)

(dotimes [_ 10]
(println (foo [])))

You might expect this to print "sequential", or you might expect it to
print "associative", or you might expect an exception. Any of the
these behaviors would seem reasonable to me.

However, what it actually does is behind door number 4 :) It randomly
chooses one of "sequential" or "associative" for each run of the JVM.
Within a JVM, once it has chosen, it continues to choose the same
value forever, but in a new JVM process it chooses randomly again.

This caused me a long debugging / hair-pulling-out session because
within my slime repl, my code consistently behaved one way (which, by
luck, happened to match with my mistaken expectations at the time),
but at the same time my unit tests seemed to be passing or failing
without rhyme or reason as I frantically tweaked my code over and
over.

I don't know, or particularly care, which behavior clojure chooses in
the case of ambiguity like this, but it sure would be nice if it was
predictable.

- Chris

Chas Emerick

unread,
Feb 28, 2011, 8:32:07 AM2/28/11
to clo...@googlegroups.com
I agree with your sentiment. This has been discussed before here:

http://groups.google.com/group/clojure-dev/browse_frm/thread/fb3a0b03bf3ef8ca

That discussion pretty quickly wandered into the weeds of whether this sort of usage of protocols was intended or not, fundamentally programmer error or not. There is as yet no input from clojure/core on either question. Maybe someone would like to weigh in on the issue here?

- Chas

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Ken Wesson

unread,
Feb 28, 2011, 9:19:56 AM2/28/11
to clo...@googlegroups.com
On Mon, Feb 28, 2011 at 8:32 AM, Chas Emerick <ceme...@snowtide.com> wrote:
> I agree with your sentiment.  This has been discussed before here:
>
> http://groups.google.com/group/clojure-dev/browse_frm/thread/fb3a0b03bf3ef8ca
>
> That discussion pretty quickly wandered into the weeds of whether this sort of usage of protocols was intended or not, fundamentally programmer error or not.  There is as yet no input from clojure/core on either question.  Maybe someone would like to weigh in on the issue here?

I thought it was a goal of Java and JVM-hosted languages to avoid the
dreaded "undefined behavior" bugbear of C, C++, and ilk. So I vote the
behavior be made consistent -- say, the first applicable interface in
the sequence applies, so in the example you posted earlier you'd get
"sequential" consistently for vectors, and if you wanted "associative"
instead you'd just swap the two pieces of the protocol so the case for
Associative came first followed by the one for Sequential.

Incidentally, a quick fix is as simple as changing a set to a
sorted-set in clojure.core/supers:

user=> (defprotocol Foo
(foo [x]))

(extend-protocol Foo
clojure.lang.Sequential
(foo [x] "sequential")
clojure.lang.Associative
(foo [x] "associative")
)

(dotimes [_ 10]
(println (foo [])))

associative
associative
associative
associative
associative
associative
associative
associative
associative
associative
nil
user=> (in-ns 'clojure.core)
#<Namespace clojure.core>
clojure.core=> (def old-supers supers)
#'clojure.core/old-supers
clojure.core=> (defn supers [x] (into (sorted-set-by #(.compareTo
(.toString %2) (.toString %1))) (old-supers x)))
#'clojure.core/supers
clojure.core=> (in-ns 'user)
#<Namespace user>
user=> (supers (.getClass []))
#{java.util.concurrent.Callable
java.util.RandomAccess
java.util.List
java.util.Collection
java.lang.Runnable
java.lang.Iterable
java.lang.Comparable
java.io.Serializable
clojure.lang.Sequential
clojure.lang.Seqable
clojure.lang.Reversible
clojure.lang.Indexed
clojure.lang.IPersistentVector
clojure.lang.IPersistentStack
clojure.lang.IPersistentCollection
clojure.lang.IObj
clojure.lang.IMeta
clojure.lang.ILookup
clojure.lang.IFn
clojure.lang.IEditableCollection
...}
user=> (defprotocol Foo
(foo [x]))

(extend-protocol Foo
clojure.lang.Sequential
(foo [x] "sequential")
clojure.lang.Associative
(foo [x] "associative")
)

(dotimes [_ 10]
(println (foo [])))

sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
nil
user=>

I made it flip from associative to sequential in a single session by
simply redefining supers to return a reverse-alphabetically-sorted set
instead of an unsorted one. So if the clojure devs simply changed
supers to return a sorted set with a comparator like that (I'd swap
the %1 and the %2 to get ascending alphabetical order instead of
descending) instead of an unsorted set the behavior would become
consistent across runs; though it wouldn't generally use the first
applicable protocol extension, it would use the same one every time.
In fact it would use the alphabetically first by classname, so always
Associative in this case whether you put Associative first or
Sequential in defprotocol.

Another enhancement to consider would be to implement a
(prefer-protocol) or similar to override it. This would require not
just going over the return from (seq (supers ...)) looking for the
first match, as it apparently currently does, but doing some kind of
(let [protos (reduce disj protocol-extensions (seq (supers ...)))]
(choose-one-from-protos)) type setup.

Both alternative enhancements make protocols a bit slower to invoke.
The supers modification however needn't: just memoize supers and the
cost of generating the set is incurred only once per session for each
class of object that *any* protocol is invoked on. The further cost of
making it a sorted set is likewise incurred only the once.

Rich Hickey

unread,
Feb 28, 2011, 10:49:40 AM2/28/11
to clo...@googlegroups.com

On Feb 28, 2011, at 8:32 AM, Chas Emerick wrote:

> I agree with your sentiment. This has been discussed before here:
>
> http://groups.google.com/group/clojure-dev/browse_frm/thread/fb3a0b03bf3ef8ca
>
> That discussion pretty quickly wandered into the weeds of whether
> this sort of usage of protocols was intended or not, fundamentally
> programmer error or not. There is as yet no input from clojure/core
> on either question. Maybe someone would like to weigh in on the
> issue here?
>


Yes, sorry I wasn't available to chime in then. Reply is here:

http://groups.google.com/group/clojure-dev/msg/2dbd690c7b509b63

Rich

Chris Perkins

unread,
Feb 28, 2011, 11:30:22 AM2/28/11
to Clojure
On Feb 28, 10:49 am, Rich Hickey <richhic...@gmail.com> wrote:
> On Feb 28, 2011, at 8:32 AM, Chas Emerick wrote:
>
> > I agree with your sentiment.  This has been discussed before here:
>
> >http://groups.google.com/group/clojure-dev/browse_frm/thread/fb3a0b03...
>
> > That discussion pretty quickly wandered into the weeds of whether  
> > this sort of usage of protocols was intended or not, fundamentally  
> > programmer error or not.  There is as yet no input from clojure/core  
> > on either question.  Maybe someone would like to weigh in on the  
> > issue here?
>
> Yes, sorry I wasn't available to chime in then. Reply is here:
>
> http://groups.google.com/group/clojure-dev/msg/2dbd690c7b509b63
>
> Rich

I would vote for a stopgap measure in 1.3 to make the selection of an
implementation repeatable across JVM instances. It's fine if it's
still arbitrary, as long as I can determine experimentally what the
behavior is, and rely on that behavior to happen again next time I run
my program. This would solve the most painful part of the problem,
while allowing someone to come up with a brilliant idea for a proper
solution for 1.4 or beyond.

- Chris

Ken Wesson

unread,
Feb 28, 2011, 12:33:04 PM2/28/11
to clo...@googlegroups.com

As noted earlier in this thread, I have determined that simply making
supers return a sorted set with an appropriate comparator (e.g.
alphabetic) suffices in this regard. (For performance reasons, supers,
or the instance of supers used by protocols, should probably be
memoized.)

Chas Emerick

unread,
Feb 28, 2011, 2:41:31 PM2/28/11
to clo...@googlegroups.com

On Feb 28, 2011, at 12:33 PM, Ken Wesson wrote:

> As noted earlier in this thread, I have determined that simply making
> supers return a sorted set with an appropriate comparator (e.g.
> alphabetic) suffices in this regard. (For performance reasons, supers,
> or the instance of supers used by protocols, should probably be
> memoized.)

FYI, protocol fns already cache dispatch results in instances of clojure.lang.MethodImplCache, so memoizing supers for that reason is unnecessary. See core_deftype.clj for the details.

Whether supers and/or ancestors should be memoized or not to help out isa? is another story…

- Chas

Reply all
Reply to author
Forward
0 new messages