reacting to an article on clojure protocols, ruby monkey patching, and safety


Laurent PETIT

Jun 3, 2010, 7:56:36 AM
to clo...@googlegroups.com
Hello,

The following article:

http://kirindave.tumblr.com/post/658770511/monkey-patching-gorilla-engineering-protocols-in

claims that, as far as code safety is concerned, Clojure's solution is far better than, e.g., Ruby monkey patching.

If I understand things correctly, one problem with Ruby monkey patching is that if a library I use opens a class C and adds a method whose signature is M to it, and if in my own code I also open the same class C and add a method whose signature is M to it, then the library may break because it will call my implementation of M, not the library's original one.

Now to Clojure. I can see the same problem occurring, while the article's author claims that in Clojure there's (almost) no problem anymore.
If several libraries, including my program, blindly redefine a protocol implementation for the same type, then the "last to speak" wins.
So isn't the problem basically the same?
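
To make the scenario concrete, here is a minimal sketch of "last to speak wins". The protocol Describable, its method describe, and the two "libraries" are hypothetical names invented for illustration; only defprotocol and extend-type are real Clojure.

```clojure
;; Hypothetical protocol, standing in for one that neither library owns.
(defprotocol Describable
  (describe [x] "Return a human-readable description of x."))

;; "Library A" extends the protocol to a type it does not own.
(extend-type String
  Describable
  (describe [s] (str "A says: " s)))

(def first-result (describe "foo"))

;; "Library B", loaded later, extends the same protocol to the same type.
;; This silently replaces library A's implementation.
(extend-type String
  Describable
  (describe [s] (str "B says: " s)))

(def second-result (describe "foo"))

(println first-result)   ; prints "A says: foo"
(println second-result)  ; prints "B says: foo"
```

Which implementation a caller gets depends only on load order, which is the concern raised above.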

Meikel Brandmeyer

Jun 3, 2010, 8:34:57 AM
to clo...@googlegroups.com
Hi,

On Thu, Jun 03, 2010 at 01:56:36PM +0200, Laurent PETIT wrote:

> Now to Clojure. I can see the same problem occurring, while the article's author
> claims that in Clojure there's (almost) no problem anymore.
> If several libraries, including my program, blindly redefine a protocol
> implementation for the same type, then the "last to speak" wins.
> So isn't the problem basically the same?

No. A protocol lives in a namespace, so its methods only have to be
unique within that namespace. Types and reify implement
protocols (and their Java interfaces) at definition time. Later
on, only extension via a map (extend) works, so the object itself
is not modified. Only the protocol functions know about these
extensions. And because they belong to a namespace, it is clear
what will be called by the protocol method you call.

And redefining things in foreign namespaces is not really a
technique we should support...

Eh? Does that make sense?


Sincerely
Meikel

Christophe Grand

Jun 3, 2010, 9:01:02 AM
to clo...@googlegroups.com
On Thu, Jun 3, 2010 at 1:56 PM, Laurent PETIT <lauren...@gmail.com> wrote:
> If I understand things correctly, one problem with Ruby monkey patching is that
> if a library I use opens a class C and adds a method whose signature is M to
> it, and if in my own code I also open the same class C and add a method
> whose signature is M to it, then the library may break because it will call
> my implementation of M, not the library's original one.
>
> Now to Clojure. I can see the same problem occurring, while the article's author
> claims that in Clojure there's (almost) no problem anymore.
> If several libraries, including my program, blindly redefine a protocol
> implementation for the same type, then the "last to speak" wins.
> So isn't the problem basically the same?

my 2 cents:

The surface of the problem is reduced because of namespacing: two
different "fold" methods (with different semantics) from two protocols
won't clash. Plus, if there are two extensions of the same protocol to
the same type, they should be roughly equivalent since they satisfy
the same semantics.

I think one must only extend a protocol to a type if he owns either
the type or the protocol.


Christophe


--
European Clojure Training Session: Brussels, 23-25/6 http://conj-labs.eu/
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.cgrand.net/ (en)

Konrad Hinsen

Jun 3, 2010, 9:13:02 AM
to clo...@googlegroups.com
On 3 Jun 2010, at 13:56, Laurent PETIT wrote:

> If I understand things correctly, one problem with Ruby monkey patching
> is that if a library I use opens a class C and adds a method whose
> signature is M to it, and if in my own code I also open the same
> class C and add a method whose signature is M to it, then the
> library may break because it will call my implementation of M, not
> the library's original one.
>
> Now to Clojure. I can see the same problem occurring, while the
> article's author claims that in Clojure there's (almost) no problem
> anymore.
> If several libraries, including my program, blindly redefine a
> protocol implementation for the same type, then the "last to speak"
> wins.
> So isn't the problem basically the same?

The difference is that with the Clojure approach, the type definition
and the protocol definition are separate and can well live in
different namespaces. Extending a protocol to a type is safe if your
code "owns" at least one of the two. With the Ruby or Python approach
of dynamically modifiable classes, the class definition defines both
the data and the protocol, all in one, so there is only one "owner"
and everyone else messing around with it will be bothered by sleepless
nights ;-)

So it is fair to say that Clojure has a safe solution for a much wider
range of situations, even though, as you noted, there is still the
possibility of interfering protocol implementations.
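
The two "safe" cases described above can be sketched as follows. This is only an illustration under assumed names: vendor.geometry, Shape, my.app, Square and Labelled are all hypothetical, and the foreign protocol is defined in the same file just to keep the example self-contained.

```clojure
;; A protocol owned by someone else (hypothetical vendor namespace).
(ns vendor.geometry)

(defprotocol Shape
  (area [shape] "Return the area of shape."))

;; Our own code.
(ns my.app)

;; Safe case 1: we own the type, so we implement the foreign protocol
;; at definition time.
(defrecord Square [side]
  vendor.geometry/Shape
  (area [_] (* side side)))

;; Safe case 2: we own the protocol, so we extend it to a foreign type.
(defprotocol Labelled
  (label [x] "Return a short label for x."))

(extend-type String
  Labelled
  (label [s] (str "label:" s)))

(def square-area (vendor.geometry/area (->Square 3)))
(def string-label (label "foo"))

(println square-area)   ; prints 9
(println string-label)  ; prints "label:foo"
```

The unsafe combination is the remaining one: extending vendor.geometry/Shape to String from my.app, where we own neither side.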

Konrad.

Richard Newman

Jun 3, 2010, 9:19:09 AM
to clo...@googlegroups.com
> The surface of the problem is reduced because of namespacing: two
> different "fold" methods (with different semantics) from two protocols
> won't clash. Plus, if there are two extensions of the same protocol to
> the same type, they should be roughly equivalent since they satisfy
> the same semantics.

Incidentally, this is the difference between pre-web and post-web
tools. Before the web, databases and identifiers would be "age", or
"name"; after the web we'd have "http://foo.com/age", or in Clojure
com.foo/age.

The former behaves as if it owns the entire world, and it's very
likely to result in collisions; the latter is very unlikely. This
issue of ambiguous identification is one of the problems that semantic
web tools try to solve.

The former, of course, is how Ruby works -- if you open the string
class and add a 'frob' method, then it will clash with someone else's
'frob' method. The latter is how Clojure works, because protocols are
namespaced, and good namespaces are reverse domains or otherwise
controlled.

If you use com.foo.bar/frob, you're claiming to obey the contract of
that particular frob operation, so even in the situation where
collision occurs, both implementations should at least share a
conception of what 'frobbing' means.
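
The same contrast shows up with plain map keys. A small sketch, assuming two hypothetical data sources that both record an "age"; the key :com.foo/age follows the example above, while :org.example.cellar/age and the var names are made up for illustration.

```clojure
;; Unqualified keys: both sources use :age, so merging loses one value.
(def person {:age 42})
(def wine   {:age 7})
(def merged-unqualified (merge person wine))

;; Namespace-qualified keys: each source names its own notion of "age",
;; so both values survive the merge.
(def person-q {:com.foo/age 42})
(def wine-q   {:org.example.cellar/age 7})
(def merged-qualified (merge person-q wine-q))

(println merged-unqualified)        ; prints {:age 7}
(println (count merged-qualified))  ; prints 2
```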

> I think one must only extend a protocol to a type if he owns either
> the type or the protocol.

That sounds like a reasonable rule of thumb, though I'd change it to
"publicly extend", or add "if you want to be safe" :)

Laurent PETIT

Jun 3, 2010, 9:28:34 AM
to clo...@googlegroups.com
Hi,

2010/6/3 Christophe Grand <chris...@cgrand.net>

> On Thu, Jun 3, 2010 at 1:56 PM, Laurent PETIT <lauren...@gmail.com> wrote:
> > If I understand things correctly, one problem with Ruby monkey patching is that
> > if a library I use opens a class C and adds a method whose signature is M to
> > it, and if in my own code I also open the same class C and add a method
> > whose signature is M to it, then the library may break because it will call
> > my implementation of M, not the library's original one.
> >
> > Now to Clojure. I can see the same problem occurring, while the article's author
> > claims that in Clojure there's (almost) no problem anymore.
> > If several libraries, including my program, blindly redefine a protocol
> > implementation for the same type, then the "last to speak" wins.
> > So isn't the problem basically the same?
>
> my 2 cents:
>
> The surface of the problem is reduced because of namespacing: two
> different "fold" methods (with different semantics) from two protocols
> won't clash.

Of course. I was not thinking about that. I was talking about working on the same protocol.

Here is the scenario I have in mind: Clojure 1.3 ships in December with lots of protocols for the whole set of abstractions currently implemented as interfaces.

A bunch of developers all over the place start adding implementations of these protocols for types not covered by Clojure itself...

> Plus if there are two extensions of the same protocol to
> the same type, they should be rather equivalent since they satisfy
> the same semantics.

... granted, the more specific the semantics, the less the implementations should differ in behavior.

So back to my scenario: lots of people adding protocol implementations for lots of "common Java types", sometimes newly written, sometimes copied. Of course some implementations will be buggy when first released. Depending on N libraries that each provide their own implementation in a particular version could mean, depending on the order in which the libraries are loaded, sometimes getting a buggy version and sometimes an "almost correct" one.
 

> I think one must only extend a protocol to a type if he owns either
> the type or the protocol.

The general example here is java.lang.String. Nobody owns it but Oracle/Sun (and arguably Clojure itself). People will want to implement some Clojure protocols on it, interpreting the String content in a bunch of exotic ways :).

But I understand the rule : "extend with great care if you neither own the protocol nor the type".

Laurent PETIT

Jun 3, 2010, 9:34:27 AM
to clo...@googlegroups.com
Hi,

2010/6/3 Meikel Brandmeyer <m...@kotka.de>

> Hi,
>
> On Thu, Jun 03, 2010 at 01:56:36PM +0200, Laurent PETIT wrote:
>
> > Now to Clojure. I can see the same problem occurring, while the article's author
> > claims that in Clojure there's (almost) no problem anymore.
> > If several libraries, including my program, blindly redefine a protocol
> > implementation for the same type, then the "last to speak" wins.
> > So isn't the problem basically the same?
>
> No. A protocol lives in a namespace. So its methods have to be
> unique only in the same namespace. Types and reify implement
> protocols (and their Java interface) at definition time. Later
> on only the extend version via a map works. So the object itself
> is not modified. Only the protocol functions know about these
> extensions. And because they belong to a namespace it is clear
> what has to be called by the protocol method you call.

I addressed the above point in my reply to Christophe. I was not talking about the way namespacing reduces the problem by allowing different function names (with probably different semantics) to live in different namespaces. From the caller's point of view, this is no different from distinguishing two ordinary functions with the same name living in different namespaces.
 

> And redefining things in foreign namespaces is rather not a
> technique we should support...
>
> Eh? Does that make sense?

No, I don't understand your last point.

How can the expression problem be claimed to be (almost) solved, if only the owner of a protocol and of a namespace can extend the protocol to types?

Stuart Halloway

Jun 3, 2010, 9:54:14 AM
to clo...@googlegroups.com
> I addressed the above point in my reply to Christophe. I was not talking about the way namespacing reduces the problem by allowing different function names (with probably different semantics) to live in different namespaces. From the caller's point of view, this is no different from distinguishing two ordinary functions with the same name living in different namespaces.
>
> > And redefining things in foreign namespaces is rather not a
> > technique we should support...
> >
> > Eh? Does that make sense?
>
> No, I don't understand your last point.
>
> How can the expression problem be claimed to be (almost) solved, if only the owner of a protocol and of a namespace can extend the protocol to types?

In languages that conflate classes and namespaces, there can be only one place to hang a definition. So, e.g., you and I can both define a method "correct-spelling" for strings. Mine returns an array of possible American English spellings, and yours returns the best match Spanish spelling. 

Consumers are screwed: they can have either my method, or yours, but not both. And the "last in wins" mechanism is utterly confusing as a tiebreaker.

Contrast that with com.relevance.american-english/correct-spelling vs. org.example.spanish/correct-spelling. Consumers have a clear way of choosing the one they want, or using both!

That is a solution to the expression problem. You seem to be asking for a solution to something else: How do I prevent two people from saying that X means two different things? That isn't the expression problem, and it isn't solvable unless you allow only one person.
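
A rough sketch of the correct-spelling example, for illustration only. The two namespaces and the function name come from the paragraph above; the protocol name Spelling and the toy lookup tables are assumptions made up here, not anyone's real library.

```clojure
(ns com.relevance.american-english)

;; Hypothetical protocol; returns a vector of candidate spellings.
(defprotocol Spelling
  (correct-spelling [word] "Return possible American English spellings."))

(extend-type String
  Spelling
  (correct-spelling [word]
    (get {"colour" ["color"]} word [word])))

(ns org.example.spanish)

;; A different protocol that happens to use the same names.
(defprotocol Spelling
  (correct-spelling [word] "Return the best-match Spanish spelling."))

(extend-type String
  Spelling
  (correct-spelling [word]
    (get {"colour" "color"} word word)))

(ns consumer.app)

;; The consumer picks either one, or uses both, by qualified name.
(def american (com.relevance.american-english/correct-spelling "colour"))
(def spanish  (org.example.spanish/correct-spelling "colour"))

(println american)  ; prints [color]
(println spanish)   ; prints color
```

Both extensions target java.lang.String, yet neither overwrites the other, because each belongs to its own protocol in its own namespace.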

Stu

Laurent PETIT

Jun 3, 2010, 10:16:56 AM
to clo...@googlegroups.com


2010/6/3 Stuart Halloway <stuart....@gmail.com>

I think I clearly understand the benefits of namespaces in this case. I was reacting to Meikel's sentence:


"And redefining things in foreign namespaces is rather not a technique we should support..."

Stu, I don't see how your answer explains the above sentence.

Rich Hickey

Jun 3, 2010, 11:33:47 AM
to Clojure


On Jun 3, 9:28 am, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> Hi,
>
> 2010/6/3 Christophe Grand <christo...@cgrand.net>
>
> > On Thu, Jun 3, 2010 at 1:56 PM, Laurent PETIT <laurent.pe...@gmail.com>
Yes, and be prepared to withdraw should the implementor of either
provide a definition.

Actually, I think the biggest issue will be people extending protocols to
types for which they don't make sense, e.g. for which the protocol
author considered but rejected an implementation due to a semantic
mismatch. No extension will be there (by design), and people without
sufficient understanding/skills might fill the void with broken ideas.

I don't have a means to prevent this at present, but I'd like to
suggest this policy moving forward:

If a protocol comes with Clojure itself, avoid extending it to types
you don't own, especially e.g. java.lang.String and other core Java
interfaces. Rest assured if a protocol should extend to it, it will,
else lobby for it.

Rich

Meikel Brandmeyer

Jun 3, 2010, 1:23:54 PM
to clo...@googlegroups.com
Hi,

Am 03.06.2010 um 16:16 schrieb Laurent PETIT:

> I think I clearly understand the benefits of namespaces in this case. I was reacting to Meikel's sentence:
>
> "And redefining things in foreign namespaces is rather not a technique we should support..."

I think what I meant was also mentioned by the others: don't extend a protocol to a type if you don't own either one. Doing so when you own neither is basically equivalent to

(in-ns 'some.other.namespace)

;; Capture the original function, then replace it with a wrapper.
(let [orig-x some-x]
  (defn some-x
    [foo]
    (if (my-foo? foo)
      (do-stuff foo)
      (orig-x foo))))

At least IMHO.

Sincerely
Meikel


Laurent PETIT

Jun 3, 2010, 3:51:17 PM
to clo...@googlegroups.com
2010/6/3 Meikel Brandmeyer <m...@kotka.de>
 
Sorry Meikel, but I'm having trouble following you today.
Is the above example pseudo-code explaining what happens when one reimplements a protocol (in which case I'm pretty sure you're wrong: redefining a protocol extension on a type redefines it for all subsequent calls, from any thread)? Or is it pseudo-code for how one would "try" to "break the rule" (mentioned by Christophe, Rich, and others) in a (hopefully) non-intrusive way?

Laurent PETIT

Jun 3, 2010, 3:58:11 PM
to clo...@googlegroups.com
2010/6/3 Rich Hickey <richh...@gmail.com>

Something like being able to declare a "black list" of symbol patterns (e.g. #{#"java\..*" #"clojure\.core\..*"}) for which no implementation may be provided without, e.g., raising a warning?

> I don't have a means to prevent this at present, but I'd like to
> suggest this policy moving forward:
>
> If a protocol comes with Clojure itself, avoid extending it to types
> you don't own, especially e.g. java.lang.String and other core Java
> interfaces. Rest assured if a protocol should extend to it, it will,
> else lobby for it.


May I add this policy concerning Clojure protocols (as well as the rule "one must only extend a protocol to a type if he owns either the type or the protocol. If one breaks the rule, one should be prepared to withdraw should the implementor of either provide a definition") to the Assembla wiki page on best practices / conventions?

Stuart Halloway

Jun 3, 2010, 4:01:49 PM
to clo...@googlegroups.com
> May I add this policy concerning Clojure protocols (as well as the rule "one must only extend a protocol to a type if he owns either the type or the protocol. If one breaks the rule, one should be prepared to withdraw should the implementor of either provide a definition") to the assembla Wiki page on best practices / conventions ?

Yes please.

Laurent PETIT

Jun 3, 2010, 4:02:54 PM
to clo...@googlegroups.com
2010/6/3 Rich Hickey <richh...@gmail.com>
[...]

> I don't have a means to prevent this at present, but I'd like to
> suggest this policy moving forward:
>
> If a protocol comes with Clojure itself, avoid extending it to types
> you don't own, especially e.g. java.lang.String and other core Java
> interfaces. Rest assured if a protocol should extend to it, it will,
> else lobby for it.


Another concern I have is using multiple libraries from separate "vendors", with potentially long chains of transitive dependencies. I can force myself to respect the above-mentioned rules. It is much harder to be sure that every library I use does so too. I would like a way to be "warned" that something wrong may happen. It doesn't bother me if this happens at runtime (the worst case being that all the libraries I use are already compiled and I don't have the source code): when the libraries are "referred", I would like to be able to spot warning messages if a protocol is extended to the same type in more than one namespace ("in more than one namespace", so that working interactively at the REPL would not trigger the warnings).
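
As a rough idea of what such a runtime warning could look like, here is a sketch built on the existing clojure.core functions extend, extends? and extenders. The helper extend-with-warning and the protocol Summarizable are hypothetical; nothing like this ships with Clojure, it only catches extensions made through the helper itself, and it does not track which namespace performed the extension.

```clojure
;; Hypothetical protocol used only for this demonstration.
(defprotocol Summarizable
  (summarize [x] "Return a short summary of x."))

(defn extend-with-warning
  "Hypothetical helper: extend protocol to atype using method-map, printing
  a warning to *err* if atype already extends protocol. Returns true if an
  existing extension was replaced, false otherwise."
  [atype protocol method-map]
  (let [already? (extends? protocol atype)]
    (when already?
      (binding [*out* *err*]
        (println "WARNING:" atype "already extends" (:var protocol))))
    (extend atype protocol method-map)
    already?))

;; The first extension is silent; the second one triggers the warning.
(def first-call
  (extend-with-warning String Summarizable
                       {:summarize (fn [s] (str "A: " s))}))
(def second-call
  (extend-with-warning String Summarizable
                       {:summarize (fn [s] (str "B: " s))}))

;; extenders lists the types explicitly extended so far, which is a
;; starting point for an after-the-fact audit.
(def extended-types (vec (extenders Summarizable)))
```

Auditing third-party libraries that call extend directly would need something more invasive, which this sketch does not attempt.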

Meikel Brandmeyer

Jun 3, 2010, 4:10:44 PM
to clo...@googlegroups.com
Hi,

Am 03.06.2010 um 21:51 schrieb Laurent PETIT:

>> (in-ns 'some.other.namespace)
>>
>> (let [orig-x some-x]
>>   (defn some-x
>>     [foo]
>>     (if (my-foo? foo)
>>       (do-stuff foo)
>>       (orig-x foo))))
>
> Sorry Meikel, but I'm having trouble following you today.
> Does the above example stand for "pseudo-code" for explaining what happens when one reimplements a protocol (in which case I'm pretty sure you're wrong - redefining a protocol extension on a type redefines it for all following calls, from any thread), or does the above example stand for pseudo-code for how one would "try" to "break the rule" (mentioned by Christophe, Rich, and others) in a (hopefully) non-intrusive way ?

The above is working code (modulo the foo-related implementations). You can switch namespaces at any moment, change some function there, and go back to your original namespace. This is effectively monkey-patching. This works now.

So now suppose seq does not support strings. Someone goes in to "fix" this in his library. Say his fix returns (\f \o \o). Someone else also "fixes" seq, returning ("f" "o" "o"). You load both libraries and are in trouble. The same happens if both extend the Seqable protocol to j.l.String.

So extending a protocol you don't own to a type you don't own is like going into someone else's namespace and "fixing" things there. Not doing so is a question of good manners, which are in general lacking nowadays… So let's hope that we are all well-behaved and don't adopt such techniques.

Sincerely
Meikel

Laurent PETIT

Jun 3, 2010, 4:15:28 PM
to clo...@googlegroups.com
2010/6/3 Stuart Halloway <stuart....@gmail.com>

> May I add this policy concerning Clojure protocols (as well as the rule "one must only extend a protocol to a type if he owns either the type or the protocol. If one breaks the rule, one should be prepared to withdraw should the implementor of either provide a definition") to the assembla Wiki page on best practices / conventions ?

Yes please.



Laurent PETIT

Jun 3, 2010, 4:21:32 PM
to clo...@googlegroups.com


2010/6/3 Meikel Brandmeyer <m...@kotka.de>

I fear that too, and that's the whole point of my starting this thread. The spirit of Clojure will probably not be to prevent this (that would take power away from the user, though there is a precedent in the decision not to introduce reader macros), but I would at least like to know how I could "audit" my project (once all its dependencies have been set up) to spot where the problems (conflicting protocol redefinitions) are.
