Datatypes and Protocols - early experience program

Rich Hickey

Nov 12, 2009, 7:10:37 AM
to Clojure
An early version of the code for a few important new language
features, datatypes[1] and protocols[2] is now available in the 'new'
branch[3]. Note also that the build system[4] has builds of the new
branch, and that the new branch works with current contrib.

If you have the time and inclination, please try them out. Feedback is
particularly welcome as they are being refined.

Thanks,

Rich

[1] http://www.assembla.com/wiki/show/clojure/Datatypes
[2] http://www.assembla.com/wiki/show/clojure/Protocols
[3] http://github.com/richhickey/clojure/tree/new
[4] http://build.clojure.org/

Sean Devlin

Nov 12, 2009, 8:29:34 AM
to Clojure
Rich,
Just read the section on reify. I'm not quite sure what this new
mechanism lets me do. Could you provide an example of the problem it
solves? I personally would benefit from seeing the "Old, painful way"
contrasted to the "New, awesome way". This would probably help with
the other features too.

Thanks,
Sean

Rich Hickey

Nov 12, 2009, 9:39:05 AM
to Clojure


On Nov 12, 8:29 am, Sean Devlin <francoisdev...@gmail.com> wrote:
> Rich,
> Just read the section on reify.  I'm not quite sure what this new
> mechanism lets me do.  Could you provide an example of the problem it
> solves?  I personally would benefit from seeing the "Old, painful way"
> contrasted to the "New, awesome way".  This would probably help with
> the other features too.
>

reify is the most subtle, as it is a subset of proxy, limited to
implementing interfaces only, and less dynamic (no equivalent to
update-proxy). What you get in return is a construct with fewer host
implications, and much better performance, as stated in the wiki doc:

"The result is better performance than proxy, both in construction
(proxy creates the instance and a fn instance for each method), and
invocation. reify is preferable to proxy in all cases where its
limitations are not prohibitive."

Rich
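
A minimal sketch of the contrast Sean asked about. It assumes the syntax
of released Clojure (1.2 and later), where reify method names carry no
leading dot, rather than the early 'new' branch discussed in this thread.
The names counter-proxy, counter-reify and call-it are hypothetical.

```clojure
;; Both helpers return a java.util.concurrent.Callable that increments
;; a counter held in a closed-over atom.

;; proxy: methods take no explicit 'this' parameter ('this' is implicit).
(defn counter-proxy [start]
  (let [n (atom start)]
    (proxy [java.util.concurrent.Callable] []
      (call [] (swap! n inc)))))

;; reify: interfaces only; the first parameter of each method is 'this'.
(defn counter-reify [start]
  (let [n (atom start)]
    (reify java.util.concurrent.Callable
      (call [this] (swap! n inc)))))

;; Type-hinted helper so the interop call is resolved at compile time.
(defn call-it [^java.util.concurrent.Callable c]
  (.call c))

(call-it (counter-proxy 10)) ; => 11
(call-it (counter-reify 10)) ; => 11
```

Client code is the same either way; the difference is in how the
instance is built and dispatched, which is where the performance gain
quoted from the wiki comes from.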

Michael Jaaka

Nov 12, 2009, 5:17:51 PM
to clo...@googlegroups.com
Oh, it looks like Google Go (http://golang.org) and Nice Interfaces (http://nice.sourceforge.net/).
Good! It sounds better than overrated polymorphism and class hierarchy.

On 2009-11-12 at 15:39, Rich Hickey wrote:

Meikel Brandmeyer

Nov 12, 2009, 6:54:09 PM
to clo...@googlegroups.com
Hi,

On 12.11.2009 at 13:10, Rich Hickey wrote:

> An early version of the code for a few important new language
> features, datatypes[1] and protocols[2] is now available in the 'new'
> branch[3]. Note also that the build system[4] has builds of the new
> branch, and that the new branch works with current contrib.
>
> If you have the time and inclination, please try them out. Feedback is
> particularly welcome as they are being refined.

I implemented my lazymap library in terms of reify where gen-class was
required before. Seems to work smoothly, but it is also only a simple
lib. Should ISeqs still extend ASeq? Is this a case where we still
need gen-class?

Otherwise I haven't had much chance to test... http://bitbucket.org/kotarak/lazymap/

Sincerely
Meikel

Chouser

Nov 12, 2009, 7:59:08 PM
to clo...@googlegroups.com
On Thu, Nov 12, 2009 at 7:10 AM, Rich Hickey <richh...@gmail.com> wrote:
>
> If you have the time and inclination, please try them out. Feedback is
> particularly welcome as they are being refined.

For what it's worth, here are 2-3 finger trees implemented using
defprotocol and deftype.

http://tinyurl.com/yeh5fgg/finger_tree.clj

Here's an earlier version that's almost identical except it's
implemented using def-interface and reify instead:

http://tinyurl.com/y9jned5/finger_tree.clj

--Chouser

Chouser

Nov 12, 2009, 8:22:35 PM
to clo...@googlegroups.com
On Thu, Nov 12, 2009 at 7:59 PM, Chouser <cho...@gmail.com> wrote:
> On Thu, Nov 12, 2009 at 7:10 AM, Rich Hickey <richh...@gmail.com> wrote:
>>
>> If you have the time and inclination, please try them out. Feedback is
>> particularly welcome as they are being refined.
>
> For what it's worth, here are 2-3 finger trees implemented using
> defprotocol and deftype.
>
> http://tinyurl.com/yeh5fgg/finger_tree.clj

I should have noted that this is a very early version and doesn't
yet take advantage of some features now available like reusing
the same protocol function in multiple protocols, or using 'case'
instead of 'cond' or 'condp'.

--Chouser

Krukow

Nov 13, 2009, 2:13:15 AM
to Clojure


On Nov 12, 1:10 pm, Rich Hickey <richhic...@gmail.com> wrote:
> An early version of the code for a few important new language
> features, datatypes[1] and protocols[2] is now available in the 'new'
> branch[3]. Note also that the build system[4] has builds of the new
> branch, and that the new branch works with current contrib.
>
> If you have the time and inclination, please try them out. Feedback is
> particularly welcome as they are being refined.

I really like the semantics of your constructs. I have a comment about
regularity of syntax:

Method names in reify and deftype are specified differently from
function names in defprotocol and extend. It looks like when dealing
with interface-method implementations one uses .methodName (i.e., with
the dot), but when dealing with protocol functions one uses no dot.
Further, extend uses maps (the docs say why this is the case).

I was thinking this may make syntax irregular. I suspect this is a
deliberate design choice to distinguish clojure protocols from java
interfaces? Is this the case?

A stupid example:

;;uses dot
(deftype Sometype [x]
[java.lang.Comparable]
(.compareTo [o] ...))

;;uses no dot
(defprotocol RSeqable :on clojure.lang.Seqable
"Seqable and reverse seqable"
(rseq [s] "reverse seq"))

;;do I mix dot and not?
(extend ::Sometype
:RSeqable
{:rseq (fn [a] ...)
:.seq (fn [a] ...)}) ;; do I write :.seq here or :seq?

I guess one can reintroduce the regularity using the :on feature of
protocol functions.

Any thoughts?
/Karl

Mark Engelberg

Nov 13, 2009, 2:13:57 AM
to clo...@googlegroups.com
I'm still trying to get my head around the new features. Seeing more
code examples will definitely help. In the meantime, here are some
stream-of-consciousness thoughts and questions.

Datatypes:

I'm a little worried about the strong overlap between reify/proxy,
deftype/defstruct, and defclass/gen-class. I can just imagine the
questions a year from now when people join the Clojure community and
want to understand how they differ. So I think that eventually, there
needs to be a very clear "story" as to why you'd choose one over the
other. Or better yet, maybe some of the older constructs can be
phased out completely.

Is there a way to customize the way that types defined by deftype
print in the REPL?

While these datatype and protocol constructs are taking shape, maybe
now is the time to discuss what kind of "privacy" settings are
worthwhile in a language like Clojure. I think Java's system of
private/public/protected is probably overkill for Clojure. But do
people feel that some degree of data hiding is worthwhile? For
example, might you want to hide some deftype fields from keyword
lookup?

Protocols:

I don't understand whether there's any way to provide a partial
implementation or default implementation of a given
protocol/interface, and I believe this to be an important issue.

For example, a protocol for < and > that provides a default
implementation of > in terms of < and a default implementation of < in
terms of >, so that you only need to implement one and you get the
other for free.

I'm also thinking about the relationship in Clojure's source between
ISeq and ASeq. ASeq provides the partial, default implementation of
more in terms of next, for example. How does this kind of thing look
with the new protocol system?

Konrad Hinsen

Nov 13, 2009, 3:58:19 AM
to clo...@googlegroups.com
On 13 Nov 2009, at 08:13, Mark Engelberg wrote:

> Is there a way to customize the way that types defined by deftype
> print in the REPL?

Implement the multimethod clojure.core/print-method for the associated
type tag:

(deftype Foo ...)

(defmethod clojure.core/print-method ::Foo [x writer] ...)
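
A minimal sketch of the same idea, assuming released Clojure (1.2 and
later), where a deftype instance dispatches print-method on its class
rather than on a ::Foo keyword tag as in the early branch. The Point
type and the output format are hypothetical.

```clojure
;; A plain two-field type.
(deftype Point [x y])

;; print-method receives the value and a java.io.Writer.
(defmethod print-method Point [p ^java.io.Writer w]
  (.write w (str "#<Point " (.x p) "," (.y p) ">")))

(pr-str (Point. 1 2)) ; => "#<Point 1,2>"
```

The REPL prints results through the same multimethod, so the custom
form shows up there as well.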

> While these datatype and protocol constructs are taking shape, maybe
> now is the time to discuss what kind of "privacy" settings are
> worthwhile in a language like Clojure. I think Java's system of
> private/public/protected is probably overkill for Clojure. But do
> people feel that some degree of data hiding is worthwhile? For
> example, might you want to hide some deftype fields from keyword
> lookup?

Coming from a Python background, I don't think access restrictions are
necessary. However, flagging fields as "not meant for use by
outsiders" could be of interest for documentation tools, to make it
clear what client code can safely rely on.

Note also that you can always make the deftype private (it's a var
like any other) and restrict all use of it to public functions defined
in the same namespace. That doesn't exclude defining protocols and
multimethods on this type elsewhere, because its type tag is a
namespace-qualified symbol that can be used anywhere.

> Protocols:
>
> I don't understand whether there's any way to provide a partial
> implementation or default implementation of a given
> protocol/interface, and I believe this to be an important issue.

I don't think that partial implementations are possible at the moment,
but I agree that it would be useful. A default implementation can be
provided as an implementation of Object. It's not quite the same as a
default implementation for a multimethod, as it doesn't apply to types
identified by a metadata type tag, but in practice it can be good
enough or even better.

> For example, a protocol for < and > that provides a default
> implementation of > in terms of < and a default implementation of < in
> terms of >, so that you only need to implement one and you get the
> other for free.

Right. I have such a case in my test implementation for multiarrays
(soon to be put on Google Code...), where I'd want to define a default
implementation for "rank" as "length of the shape vector".

Konrad.

Alex Osborne

Nov 13, 2009, 4:55:12 AM
to clo...@googlegroups.com
Mark Engelberg wrote:
> Protocols:
>
> I don't understand whether there's any way to provide a partial
> implementation or default implementation of a given
> protocol/interface, and I believe this to be an important issue.
>
> For example, a protocol for < and > that provides a default
> implementation of > in terms of < and a default implementation of < in
> terms of >, so that you only need to implement one and you get the
> other for free.
>

How about this?

(defprotocol MyComparable
  (comparinate [x y] "Returns a negative integer if x < y, zero if they
are equal and a positive integer if x > y.")
  (equal-to [x y] "True if x is equal to y.")
  (less-than [x y] "True if x is smaller than y.")
  (greater-than [x y] "True if x is greater than y."))

;; Derive the three predicates from a single compare function.
(defn mixin-comparable [type compare-fn]
  (extend type
    MyComparable
    {:comparinate compare-fn
     :equal-to (fn [x y] (zero? (comparinate x y)))
     :less-than (fn [x y] (neg? (comparinate x y)))
     :greater-than (fn [x y] (pos? (comparinate x y)))}))

(mixin-comparable Integer -)
(mixin-comparable String #(- (count %1) (count %2)))

(less-than 8 2) ; => false
(less-than "x" "xxxx") ; => true

> I'm also thinking about the relationship in Clojure's source between
> ISeq and ASeq. ASeq provides the partial, default implementation of
> more in terms of next, for example. How does this kind of thing look
> with the new protocol system?

See above. But another way would just be to define ASeq as a map and
then merge with it:

(def aseq-impl {:more (fn [obj] (if-let [s (next obj)] s '() ))
:count (fn [obj] ...)})

(extend MyList
ISeq
(merge
aseq-impl
{:next (fn [lst] ...)
:count (fn [lst] ...) ; overrides the count from aseq-impl
:cons (fn [lst x] ...)}))

Code is data. :-)

I don't know whether doing things this way is a good idea or not, but
protocols are new: let's experiment and find out what works and what doesn't.

Jarkko Oranen

Nov 13, 2009, 5:26:38 AM
to Clojure
On Nov 13, 9:13 am, Krukow <karl.kru...@gmail.com> wrote:
> I was thinking this may make syntax irregular. I suspect this is a
> deliberate design choice to distinguish clojure protocols from java
> interfaces? Is this the case?
>

As far as I understand it, in defprotocol's case, I suspect there is
no dot because the specified operations will be available as normal
Clojure functions, whereas in deftype's case you'll need to use Java
interop or keywords. For example, after

(defprotocol Someproto
(foo [x] "do stuff"))

you will be able to call (foo something-implementing-someproto), but
with deftype, you need to use (.field instance) or, if the type uses
the default ILookup implementation, (:field instance).

The extend example should just use :seq, as defprotocol will create a
function "seq" matching the .seq method in the Seqable interface.
(Because no explicit mapping is provided)

I hope I got my details right here. I haven't actually tried these
things yet. :)

Chris Kent

Nov 13, 2009, 5:27:22 AM
to clo...@googlegroups.com
Mark Engelberg <mark.engelberg <at> gmail.com> writes:

> I'm a little worried about the strong overlap between reify/proxy,
> deftype/defstruct, and defclass/gen-class. I can just imagine the
> questions a year from now when people join the Clojure community and
> want to understand how they differ. So I think that eventually, there
> needs to be a very clear "story" as to why you'd choose one over the
> other. Or better yet, maybe some of the older constructs can be
> phased out completely.

What are the plans for the future of proxy? I assume it won't go away because
reify's inability to extend an existing class is a show-stopper for some Java
interop scenarios. Will the syntax be brought in line with reify so dots will
be needed in front of method names? As things stand it's a potential source of
confusion to have two such similar features with subtly different syntax.

Chris


Meikel Brandmeyer

Nov 13, 2009, 3:17:58 AM
to Clojure
Hi,

On Nov 13, 8:13 am, Mark Engelberg <mark.engelb...@gmail.com> wrote:

> Is there a way to customize the way that types defined by deftype
> print in the REPL?

One can add a method to print-method for the type.

> While these datatype and protocol constructs are taking shape, maybe
> now is the time to discuss what kind of "privacy" settings are
> worthwhile in a language like Clojure.  I think Java's system of
> private/public/protected is probably overkill for Clojure.  But do
> people feel that some degree of data hiding is worthwhile?  For
> example, might you want to hide some deftype fields from keyword
> lookup?

I for now don't care for privacy settings. Everything is public. The
docstrings explain the contract. Period.

> I'm also thinking about the relationship in Clojure's source between
> ISeq and ASeq.  ASeq provides the partial, default implementation of
> more in terms of next, for example.  How does this kind of thing look
> with the new protocol system?

A pretty simple solution is (.more [] (lazy-seq (next this))), no? But
I stumbled over this, too. APersistentMap does a lot, like implementing
the IFn invokes for key lookup or the IPersistentCollection equiv.
Giving up on this will create a lot more work.

Sincerely
Meikel

Rich Hickey

Nov 13, 2009, 7:56:56 AM
to clo...@googlegroups.com
On Fri, Nov 13, 2009 at 2:13 AM, Krukow <karl....@gmail.com> wrote:
>
>
> On Nov 12, 1:10 pm, Rich Hickey <richhic...@gmail.com> wrote:
>> An early version of the code for a few important new language
>> features, datatypes[1] and protocols[2] is now available in the 'new'
>> branch[3]. Note also that the build system[4] has builds of the new
>> branch, and that the new branch works with current contrib.
>>
>> If you have the time and inclination, please try them out. Feedback is
>> particularly welcome as they are being refined.
>
> I really like the semantics of your constructs. I have a comment about
> regularity of syntax:
>
> The way to specify method names in reify and deftype vs. function
> names defprotocol and extend are different.

That's because methods and functions *are* different. I think making
them look the same only makes it more confusing since:

- methods can only be defined by the definer of a type, protocol
extension fns can be defined by anyone anywhere

- methods have class scope - direct use of fields as locals,
functions must use (:field self)

- functions are first class values and can be put in maps etc, methods can't

- methods have implicit this, functions don't

- functions can be closures, methods, other than in reify, can't

> It looks like when dealing
> with interface-method implementations one uses .methodName (i.e., with
> the dot), but when dealing with protocol functions one uses no dot.
> Further, extend uses maps (the docs says why this is the case).
>
> I was thinking this may make syntax irregular. I suspect this is a
> deliberate design choice to distinguish clojure protocols from java
> interfaces? Is this the case?

Yes.

>
> A stupid example:
>
> ;;uses dot
> (deftype Sometype [x]
>   [java.lang.Comparable]
>   (.compareTo [o] ...))
>
> ;;uses no dot
> (defprotocol RSeqable :on clojure.lang.Seqable
>  "Seqable and reverse seqable"
>  (rseq [s] "reverse seq"))
>
> ;;do I mix dot and not?
> (extend ::Sometype
>  :RSeqable
>   {:rseq (fn [a] ...)
>    :.seq (fn [a] ...)} ;; do I write :.seq here or :seq?
>

You don't mix methods and protocol functions, so once that is clear I
don't think this will be a question.

In a sense, deftypes and protocols are bridging two polymorphism
systems. As long as one doesn't conflate the two, it becomes clearer.

For instance, you could use deftype and protocols in complete
ignorance/avoidance of Java and interfaces:

(deftype Foo [a b c])

(defprotocol P (bar [x] "bar docs"))

(extend ::Foo P {:bar (fn [afoo] :foo-thing)})

(bar (Foo 1 2 3))
:foo-thing

This is a simple, powerful, flexible and dynamic system, leveraging
one's understanding of Clojure functions.

If and only if there is some requirement that instances of Foo
implement some Java interfaces, then you will need to understand Java
interfaces and methods. And there will be a clear mechanism and place
to put them - in your deftype, just like methods have to be put inside
class definitions in Java. You have similar class scope for fields and
access to this, etc. One thing you do not have is implicit scope for
methods, the leading dot helps remind you that in order to call
someMethod, even in the body of another method in the same deftype,
you will have to use (.someMethod this ...). People have argued
against implicit this for similar reasons, and I am starting to come
around :)

The documentation is comprehensive in mentioning everything you can
do. But one doesn't need to use everything.

Rich
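
The split between methods and protocol functions described above can be
sketched as follows. This assumes released Clojure (1.2 and later),
which differs from the early branch: method names in deftype have no
leading dot, each method takes an explicit this parameter, extend takes
the class instead of ::Foo, and instances are built with (Foo. ...).
The Totals interface and its methods are hypothetical.

```clojure
;; An interface with two methods, standing in for "some Java interface".
(definterface Totals
  (total [])
  (doubled []))

(deftype Foo [a b c]
  Totals
  ;; Methods have class scope: fields a, b, c are plain locals here.
  (total [this] (+ a b c))
  ;; No implicit scope for methods: another method is reached via this.
  (doubled [this] (* 2 (.total this))))

;; A protocol function is an ordinary Clojure fn, defined outside the type.
(defprotocol P
  (bar [x] "bar docs"))

;; The implementation is a value in a map and has no access to field
;; locals, so it reads the field from the instance it is given.
(extend Foo
  P
  {:bar (fn [afoo] [:foo-thing (.a afoo)])})

(.total (Foo. 1 2 3))   ; => 6
(.doubled (Foo. 1 2 3)) ; => 12
(bar (Foo. 1 2 3))      ; => [:foo-thing 1]
```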

MikeM

Nov 13, 2009, 9:03:11 AM
to Clojure

> (deftype Foo [a b c])
>
> (defprotocol P (bar [x] "bar docs"))
>
> (extend ::Foo P {:bar (fn [afoo] :foo-thing)})
>

A common error may be to:

(extend Foo P {:bar (fn [afoo] :foo-thing)})

when (extend ::Foo ... is intended. I notice that (extend Foo...
doesn't throw - should extend check that it is supplied a class,
intfc, or keyword and throw if something else is supplied?
Alternately, could extend be changed to allow (extend Foo ... and do
the right thing ? ie determine that Foo is the constructor function
for a type and do the extension for the type Foo.

James Reeves

Nov 13, 2009, 9:24:12 AM
to Clojure
Are there any plans to use protocols to define polymorphic functions
like conj and get? Perhaps with an "untype" function to remove type
metadata so one could always get at the datastructures hidden by the
protocol. e.g.

(defn sql-get [table key]
  (sql-query
    (str "select * from " table " where "
         (get (untype table) :primary-key) " = ?")
    key))

(extend ::sql-table Gettable
{:get sql-get})

Then 'get' could be used in a more generic fashion:

user=> (def accounts (sql-table accounts))

user=> (get accounts 10)
{:id 10, :login "jsmith", :password "1234"}

- James

Rich Hickey

Nov 13, 2009, 9:30:20 AM
to Clojure


On Nov 13, 2:13 am, Mark Engelberg <mark.engelb...@gmail.com> wrote:
> I'm still trying to get my head around the new features. Seeing more
> code examples will definitely help. In the meantime, here are some
> stream-of-consciousness thoughts and questions.
>
> Datatypes:
>
> I'm a little worried about the strong overlap between reify/proxy,
> deftype/defstruct, and defclass/gen-class. I can just imagine the
> questions a year from now when people join the Clojure community and
> want to understand how they differ. So I think that eventually, there
> needs to be a very clear "story" as to why you'd choose one over the
> other. Or better yet, maybe some of the older constructs can be
> phased out completely.
>

Yes, but there will be a transition period. I certainly tried to
explain the decision points on the wiki.

A big part of the design thinking behind these features went like
this:

Clojure is built on a set of abstractions, and leverages/requires that
the host platform provide some sort of high-performance polymorphism
construct in order to make that viable. That said, Clojure was
bootstrapped on the host language and didn't really provide similar
constructs itself (multimethods are more powerful but slower), leaving
people that wanted to do things similar to what I did, in order to
write Clojure and its data structures, to either write Java or use
Clojure interop to, effectively, write Java in Clojure clothing.

So I took a step back and said, what part of Java did I *need* in
order to implement Clojure and its data structures, what could I do
without, and what semantics was I willing to support - for Clojure -
i.e. not in terms of interop. What I ended up with was - a
high-performance way to define and implement interfaces. What I explicitly
left out was - concrete derivation and implementation inheritance.

reify is Clojure semantics and proxy is Java/host semantics. Why
doesn't it replace proxy? Because proxy can derive from concrete
classes with constructors that take arguments. Supporting that
actually brings in a ton of semantics from Java, things I don't want
in Clojure's semantics. reify should be possible and portable in any
port of Clojure, proxy may not. Will the performance improvements of
reify make it into proxy? Probably at some point, not a priority now.

*** Prefer reify to proxy unless some interop API forces you to use
proxy. You shouldn't be creating things in Clojure that would require
you to use proxy. ***

defstruct is likely to be completely replaced by deftype, and at some
point could be deprecated/removed.

*** Prefer deftype to defstruct, unconditionally. ***

AOT deftype vs gen-class touches on the same Clojure semantics vs
Java/host semantics, with the objectives from before - support implementing
interfaces but not concrete derivation. So, no concrete base classes,
no super calls, self-ctor calls, statics, methods not implementing
interface methods etc. Will the performance improvements of deftype
make it into gen-class? Probably at some point, not a priority now.

Like proxy, gen-class will remain as an interop feature.

*** Prefer deftype to gen-class unless some interop API forces you to
use gen-class. ***

There will be a definterface similar to and probably replacing
gen-interface, with an API to match deftype.

So, with definterface, deftype, and reify you have a very clean way to
specify and implement a subset of the Java/C# polymorphism model, that
subset which I find clean and reasonable, with an expectation of
portability, and performance exactly equivalent to the same features
on the host.

I could have stopped there, and almost did. But there are three
aspects of that polymorphism model that aren't sufficient for Clojure:

- It is insufficiently dynamic. There is a static component - named
interfaces, that must be AOT compiled.

- Client code must use the interop style (.method x), and type hints,
in order to tap into the performance

- It is 'closed' polymorphism, i.e. the set of things a type can do
is fixed at the definition time of the type. This results in the
'expression problem', in this case the inability to extend types with
new capabilities/functions.

We've all experienced the expression problem - sometimes you simply
can't request/require that some type implement YourInterface in order
to play nicely with your design. You can see this in Clojure's
implementation as well - RT.count/seq/get etc all try to use Clojure's
abstraction interface first, but then have hand-written clauses for
types (e.g. String) that couldn't be retrofitted with the interface.

Multimethods, OTOH, don't suffer from this problem. But it is
difficult to get something as generic as Clojure's multimethods to
compete with interface dispatch in Java. Also, multimethods are kind
of atomic, often you need a set of them to completely specify an
abstraction. Finally, multimethods are a good story for the Clojure
side of an abstraction, but should you define a valuable abstraction
and useful code in Clojure and want to enable extension or
interoperation from Java or other JVM langs, what's the recipe?

Protocols take a subset of multimethod power, open extension, combine
it with a fixed, but extremely common, dispatch mechanism (single
dispatch on 'type' of first arg), allow a set of functions
constituting an abstraction to be named, specified, and implemented as
a group, and provide a clear way to extend the protocol using ordinary
capabilities of the host (:on interface).

*** Prefer using protocols to specify your abstractions, vs
interfaces. ***

This will give you open extension and a dynamic system. You can always
make your protocol reach any type, and, you can always make your
protocol extensible through an interface using :on interface. In
particular, note that calls to a protocol fn on an instance of the :on
interface go straight through, and are as fast as calls using (.method
#^AnInterface x), so there is no up-front performance compromise in
choosing protocols.
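
The open extension described above can be sketched like this, assuming
released Clojure (1.2 and later). The Sized protocol and its size
function are hypothetical; the point is that String, nil and the Java
collection interface gain the capability without being modified.

```clojure
(defprotocol Sized
  (size [x] "Number of elements in x."))

;; Retrofit types we do not own, one extend call per type.
(extend String
  Sized
  {:size (fn [s] (.length ^String s))})

(extend nil
  Sized
  {:size (fn [_] 0)})

(extend java.util.Collection
  Sized
  {:size (fn [c] (.size ^java.util.Collection c))})

(size "abc")   ; => 3
(size nil)     ; => 0
(size [1 2])   ; => 2  (Clojure vectors implement java.util.Collection)
```

This is the kind of case that RT.count handles today with hand-written
clauses for String and friends.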


> While these datatype and protocol constructs are taking shape, maybe
> now is the time to discuss what kind of "privacy" settings are
> worthwhile in a language like Clojure. I think Java's system of
> private/public/protected is probably overkill for Clojure. But do
> people feel that some degree of data hiding is worthwhile?

I don't.

> Protocols:
>
> I don't understand whether there's any way to provide a partial
> implementation or default implementation of a given
> protocol/interface, and I believe this to be an important issue.
>
> For example, a protocol for < and > that provides a default
> implementation of > in terms of < and a default implementation of < in
> terms of >, so that you only need to implement one and you get the
> other for free.
>
> I'm also thinking about the relationship in Clojure's source between
> ISeq and ASeq. ASeq provides the partial, default implementation of
> more in terms of next, for example. How does this kind of thing look
> with the new protocol system?

This was an important consideration in the deftype/protocol design.
One reasonable argument for concrete implementation is abstract
superclasses, especially when used correctly. And Clojure's
implementation does use them, as you note. Some of the problems with
abstract classes are:

- they create a hierarchical type relationship (for no good reason).

- unless you are going to open that huge can of worms that is
multiple concrete inheritance, you only get a single inheritable
implementation.

- they, too, are closed. If you are going to allow open extension,
but implementation reuse requires derivation, there is an open/closed
mismatch.

Protocols are designed to support hierarchy-free, open, multiple,
mechanical mixins. This is enabled by the fact that extend is an
ordinary function, and the mappings of names to implementation
functions are ordinary maps. One can create mixins by simply making
maps of names to functions. And one can use mixins in an ad hoc
manner, merging and replacing functions using ordinary map
manipulation:

(extend ::MyType AProtocol
  (assoc a-mixin-map :a-fn-to-replace a-replacement-fn))

I think people will find this quite powerful and programmable.

Rich
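
A minimal, runnable sketch of the mixin-map idea, assuming released
Clojure (1.2 and later), where extend takes the class rather than a
::MyType keyword. The Shape protocol, the Rect and Square types, and
the describe-by-area map are all hypothetical.

```clojure
(defprotocol Shape
  (area [s] "Area of the shape.")
  (describe [s] "Short description of the shape."))

;; A mixin is an ordinary map from keywordized fn names to fns.
;; This one supplies describe in terms of area.
(def describe-by-area
  {:describe (fn [s] (str "shape with area " (area s)))})

(deftype Rect [w h])
(deftype Square [side])

;; Reuse the mixin and add the one missing function.
(extend Rect
  Shape
  (assoc describe-by-area :area (fn [r] (* (.w r) (.h r)))))

;; Reuse the mixin but replace describe; later keys win in merge.
(extend Square
  Shape
  (merge describe-by-area
         {:area (fn [s] (* (.side s) (.side s)))
          :describe (fn [s] "a square")}))

(describe (Rect. 2 3)) ; => "shape with area 6"
(describe (Square. 3)) ; => "a square"
(area (Square. 3))     ; => 9
```

No type hierarchy is involved: Rect and Square share an implementation
only because both maps were built from the same starting map.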

AlexK

Nov 13, 2009, 9:26:54 AM
to Clojure
Hi everybody,

after playing around with protocols & datatypes, I found them very fun
to use.
Some questions:
Performance
I don't see (with my limited benchmarking) any significant difference
between multifns and protocolfns:

user=> (defprotocol Test (protocol-fn [it] "Protocol-fn"))
Test
user=> (extend Object Test {:protocol-fn (fn [it] nil)})
nil
user=> (defmulti multi-fn type)
#'user/multi-fn
user=> (defmethod multi-fn Object [it] nil)
#<MultiFn clojure.lang.MultiFn@c8769b>
user=> (defn simple-fn [it] nil)
#'user/simple-fn

user=> (dotimes [_ 10] (time (dotimes [_ 100000] (protocol-fn :it))))
"Elapsed time: 105.532562 msecs"
"Elapsed time: 57.0031 msecs"
"Elapsed time: 33.210602 msecs"
"Elapsed time: 30.47827 msecs"
"Elapsed time: 26.326202 msecs"
"Elapsed time: 27.764654 msecs"
"Elapsed time: 28.381284 msecs"
"Elapsed time: 28.741735 msecs"
"Elapsed time: 28.697525 msecs"
"Elapsed time: 25.894514 msecs"

user=> (dotimes [_ 10] (time (dotimes [_ 100000] (multi-fn :it))))
"Elapsed time: 372.338313 msecs"
"Elapsed time: 73.104641 msecs"
"Elapsed time: 58.832009 msecs"
"Elapsed time: 60.312924 msecs"
"Elapsed time: 58.626328 msecs"
"Elapsed time: 57.005242 msecs"
"Elapsed time: 54.493328 msecs"
"Elapsed time: 56.283221 msecs"
"Elapsed time: 54.575182 msecs"
"Elapsed time: 54.939474 msecs"

user=> (dotimes [_ 10] (time (dotimes [_ 100000] (simple-fn :it))))
"Elapsed time: 28.504607 msecs"
"Elapsed time: 17.564177 msecs"
"Elapsed time: 1.877194 msecs"
"Elapsed time: 2.340661 msecs"
"Elapsed time: 1.581906 msecs"
"Elapsed time: 1.792407 msecs"
"Elapsed time: 1.878591 msecs"
"Elapsed time: 1.919937 msecs"
"Elapsed time: 2.367759 msecs"
"Elapsed time: 1.90555 msecs"

This is what I have been expecting of course (Fn < Protocolfn <
Multifn), but I was thinking that with Protocols dispatching would be
significantly faster than with Multimethods. Am I missing something?
Because they don't seem to provide the speed for implementing the core
abstractions (like (seq <coll>)).

Syntax
With protocols you define the protocols using symbols and extend types
by using the keywordized name
eg.

(defprotocol Test (protocol-fn [it] "Protocol-fn"))
(extend Object Test {:protocol-fn (fn [it] nil)})

I understand that extend is a function and evaluates its arguments,
and that {:protocol-fn (fn [it] nil)} is a real map, but wouldn't it
be possible just to use a {<generic-fn> <new-method-fn>} map instead?
The protocol should know its generic functions, so that wouldn't be
ambiguous.
{protocol-fn (fn [it] nil)} ; seems a lot clearer to me


Extensibility
I've noticed that extending a protocol-fn redefines it:

(def old-fn protocol-fn) ; from above
(extend Number Test {:protocol-fn (fn [it] :a-number)})
(def new-fn protocol-fn)

(= old-fn new-fn)
false

This worries me, because the semantics differ from MultiFns and
are less dynamic. Especially when coupled with dynamic development,
this could lead to some gotchas.

Clojure 1.1.0-alpha-SNAPSHOT
user=> (defprotocol Test (prtcfn [it]))
Test
user=> (extend Object Test {:prtcfn (fn [it] :object)})
nil
user=> (prtcfn (Object.))
:object
user=> (def old-fn prtcfn)
#'user/old-fn
user=> (old-fn (Object.))
:object
user=> (extend Number Test {:prtcfn (fn [it] :number)})
nil
user=> (prtcfn (Object.))
:object
user=> (prtcfn 1)
:number
user=> (def new-fn prtcfn)
#'user/new-fn
user=> (new-fn 1)
:number
user=> (old-fn 1)
:object

When some protocol-fns get bound in a closure this could hurt a lot.


Sorry if this seems like nitpicking, but this is just what I noticed
while experimenting.

Rich Hickey

Nov 13, 2009, 9:43:13 AM11/13/09
to Clojure


On Nov 13, 3:58 am, Konrad Hinsen <konrad.hin...@fastmail.net> wrote:
> On 13 Nov 2009, at 08:13, Mark Engelberg wrote:
>

> > Protocols:
>
> > I don't understand whether there's any way to provide a partial
> > implementation or default implementation of a given
> > protocol/interface, and I believe this to be an important issue.
>
> I don't think that partial implementations are possible at the moment,
> but I agree that it would be useful.

Yes, just create mixin maps and use them in your extends.

> A default implementation can be
> provided as an implementation of Object. It's not quite the same as a
> default implementation for a multimethod, as it doesn't apply to types
> identified by a metadata type tag, but in practice it can be good
> enough or even better.
>

That's not true for protocols. Make sure to leave any preconceptions
from multimethods and type tags behind. In particular, protocols do
not, and will not, utilize the isa/hierarchy system. Right now, the
dispatch code routes through (type x), but the intention is to support
through type only classes and deftype types. The use of type metadata
should be deprecated once this is in place. At that point, Object does
serve as the default for everything other than nil.

***But*** one should generally avoid using hierarchy for
implementation inheritance! You may encounter it in interop
situations, but otherwise use mixins.
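
A minimal sketch of the mixin-map approach to partial and default implementations. It is written against the extend signature found in released Clojure (type, protocol, method map), so details may differ from the 'new' branch discussed in this thread. The Shape protocol, the default-shape map, and the Square and Blob types are hypothetical names used only for illustration.

```clojure
;; Hypothetical protocol with two methods.
(defprotocol Shape
  (area [s] "Area of the shape")
  (describe [s] "Human-readable description"))

;; A mixin is an ordinary map from method keyword to fn.
;; It doubles as a default implementation.
(def default-shape
  {:area     (fn [_] 0)
   :describe (fn [s] (str "a shape with area " (area s)))})

(deftype Square [side])
(deftype Blob [])

;; Square overrides :area and picks up :describe from the mixin.
(extend Square Shape
  (merge default-shape
         {:area (fn [sq] (* (.-side sq) (.-side sq)))}))

;; Blob takes the mixin unchanged.
(extend Blob Shape default-shape)
```

Because the defaults live in a plain map rather than in a supertype, each type chooses explicitly which pieces it reuses, and no class hierarchy is involved.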

Rich

Rich Hickey

Nov 13, 2009, 9:50:42 AM11/13/09
to Clojure


On Nov 13, 9:03 am, MikeM <michael.messini...@invista.com> wrote:
> > (deftype Foo [a b c])
>
> > (defprotocol P (bar [x] "bar docs"))
>
> > (extend ::Foo P {:bar (fn [afoo] :foo-thing)})
>
> A common error may be to:
>
> (extend Foo P {:bar (fn [afoo] :foo-thing)})
>
> when (extend ::Foo ... is intended. I notice that (extend Foo...
> doesn't throw - should extend check that it is supplied a class,
> intfc, or keyword and throw if something else is supplied?

Yes it could.

> Alternately, could extend be changed to allow (extend Foo ... and do
> the right thing ? ie determine that Foo is the constructor function
> for a type and do the extension for the type Foo.

Dunno yet - there isn't a path from the factory fn value to its name
or deftype.

Rich

Rich Hickey

Nov 13, 2009, 9:47:35 AM11/13/09
to Clojure
Yes. Right now the priority is to get these new features out, without
breaking anyone's code.

Rich

Rich Hickey

Nov 13, 2009, 9:46:18 AM11/13/09
to Clojure
Yes, the latter (mixin maps) is preferred.

Rich

Sean Devlin

Nov 13, 2009, 10:42:58 AM11/13/09
to Clojure
Rich,
I was wondering something about defprotocol.

Here's your example:

(defprotocol AProtocol :on AnInterface
  "A doc string for AProtocol abstraction"
  (bar [a b] "bar docs" :on barMethod)
  (baz ([a] [a b] [a b & c]) "baz docs"))

In this case, you provide the docs for each method after parameters.
Would the following be possible:

(defprotocol AProtocol :on AnInterface
  "A doc string for AProtocol abstraction"
  (bar "bar docs" [a b] :on barMethod)
  (baz "baz docs" ([a] [a b] [a b & c])))

This matches the rhythm of the rest of the language.

Sean

On Nov 12, 7:10 am, Rich Hickey <richhic...@gmail.com> wrote:

Stuart Halloway

Nov 13, 2009, 10:48:37 AM11/13/09
to clo...@googlegroups.com
>> But do
>> people feel that some degree of data hiding is worthwhile?
>
> I don't.


Hooray for benevolent dictators!

Rich Hickey

Nov 13, 2009, 11:11:25 AM11/13/09
to Clojure


On Nov 13, 10:42 am, Sean Devlin <francoisdev...@gmail.com> wrote:
> Rich,
> I was wondering something about defprotocol.
>
> Here's your example:
>
> (defprotocol AProtocol :on AnInterface
>   "A doc string for AProtocol abstraction"
>   (bar [a b] "bar docs" :on barMethod)
>   (baz ([a] [a b] [a b & c]) "baz docs"))
>
> In this case, you provide the docs for each method after parameters.
> Would the following be possible:
>
> (defprotocol AProtocol :on AnInterface
>   "A doc string for AProtocol abstraction"
>   (bar "bar docs" [a b] :on barMethod)
>   (baz "baz docs" ([a] [a b] [a b & c])))
>
> This matches the rhythm of the rest of the language.
>

It does. I'm still on the fence about it. What do others think?

Rich

Rich Hickey

Nov 13, 2009, 11:07:08 AM11/13/09
to Clojure


On Nov 13, 9:26 am, AlexK <alexander.konstanti...@informatik.haw-hamburg.de> wrote:
> Hi everybody,
>
> after playing around with protocols & datatypes, I found them very fun
> to use.
> Some questions:
> Performance
> I don't see (with my limited benchmarking) any significant difference
> between multifns and protocolfns:
>
...
> This is what I have been expecting of course (Fn < Protocolfn <
> Multifn), but I was thinking that with Protocols dispatching would be
> significantly faster than with Multimethods. Am I missing something?
> Because they don't seem to provide the speed for implementing the core
> abstractions (like (seq <coll>)).
>

This kind of do-nothing microbenchmarking demonstrates nothing.
Protocol dispatching is significantly faster than multimethod
dispatching, and I haven't even looked into call-site optimization.
Protocol dispatch is not as fast as interface dispatch, and may or may
not become so with call-site caching. But, protocols :on interfaces
are precisely as fast when called with instances of the :on interface
as calls to the :on interface.

> Syntax
> With protocols you define the protocols using symbols and extend types
> by using the keywordized name
> eg.
>
> (defprotocol Test (protocol-fn [it] "Protocol-fn"))
> (extend Object Test {:protocol-fn (fn [it] nil)})
>
> I understand that extend is a function and evaluates its arguments,
> and that {:protocol-fn (fn [it] nil)} is a real map, but wouldn't it
> be possible just to use a {<generic-fn> <new-method-fn>} map instead?
> The protocol should know its generic functions, so that wouldn't be
> ambiguous.
> (extend Object Test {protocol-fn (fn [it] nil)}) ; seems a lot clearer to me

I wouldn't want to key anything off of fn identity. It would make
mixins quite difficult.

>
> Extensibility
> I've noticed that extending a protocol-fn redefines it:
>
> (def old-fn protocol-fn) ; from above
> (extend Number Test {:protocol-fn (fn [it] :a-number)})
> (def new-fn protocol-fn)
>
> (= old-fn new-fn)
> false
>
> This worries me, because the semantics differ from MultiFns and
> are less dynamic.

They are no less dynamic, they just work differently.
This is no different from ordinary functions:

user=> (defn foo [] :old-foo)
#'user/foo
user=> (def old-foo foo)
#'user/old-foo
user=> (def foov #'foo)
#'user/foov
user=> (old-foo)
:old-foo
user=> (foov)
:old-foo
user=> (defn foo [] :new-foo)
#'user/foo
user=> (foo)
:new-foo
user=> (old-foo)
:old-foo
user=> (foov)
:new-foo

> When some protocol-fns get bound in a closure this could hurt a lot.
>

That claim is not supported by your example. In general, when making a
long-term relationship with a function, you do so with its var, as
above. A closure referencing a var closes over the var itself, not its
current value:

user=> (def f (future (dotimes [_ 10] (prn (foo)) (Thread/sleep 1000))))
#'user/f
user=> :new-foo
:new-foo
:new-foo
:new-foo
:new-foo
:new-foo
:new-foo
;;;;;;;;;;;;;;;while that is going
(defn foo [] :newer-foo)
#'user/foo
:newer-foo
:newer-foo
:newer-foo

Admittedly, it is a difference from multimethods. With protocols, both
protocols and their functions/methods are immutable. Redefining or
extending a protocol modifies only the protocol and fn vars. I prefer
that, and don't consider the above behavior a problem. What do others
think?
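
The var-versus-value distinction above can be sketched with two closures. This reuses foo from the transcript; via-var and via-value are hypothetical names, and the sketch assumes default compiler settings (no direct linking).

```clojure
(defn foo [] :old-foo)

;; The fn body mentions foo, so every call goes through the var #'foo.
(def via-var (fn [] (foo)))

;; The let binds the fn value current at this moment;
;; later redefinitions of foo are not seen through f.
(def via-value (let [f foo] (fn [] (f))))

;; Redefine foo, as would happen during interactive development.
(defn foo [] :new-foo)

(via-var)    ; picks up the redefinition
(via-value)  ; still calls the original fn
```

Only code that copies the function value out of its var, as via-value does, is left holding the stale definition.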

Rich

Rich Hickey

Nov 13, 2009, 11:10:08 AM11/13/09
to Clojure


On Nov 13, 10:48 am, Stuart Halloway <stuart.hallo...@gmail.com> wrote:
I was just putting in my vote :)

Rich

David Nolen

Nov 13, 2009, 11:22:57 AM11/13/09
to clo...@googlegroups.com
I place my vote for no data hiding.

David Nolen

Nov 13, 2009, 11:24:28 AM11/13/09
to clo...@googlegroups.com
Is there an argument for putting it after the argument list? :)

Rich Hickey

Nov 13, 2009, 11:40:14 AM11/13/09
to Clojure


On Nov 13, 9:24 am, James Reeves <weavejes...@googlemail.com> wrote:
> Are there any plans to use protocols to define polymorphic functions
> like conj and get? Perhaps with an "untype" function to remove type
> metadata so one could always get at the datastructures hidden by the
> protocol. e.g.
>
> (defn sql-get [table key]
>   (sql-query
>     (str "select * from " table " where " (get (untype table) :primary-key) " = ?")
>     key))
>
> (extend ::sql-table Gettable
>   {:get sql-get})
>
> Then 'get' could be used in a more generic fashion:
>
> user=> (def accounts (sql-table accounts))
>
> user=> (get accounts 10)
> {:id 10, :login "jsmith", :password "1234"}
>

I'll have to think about that. Your 'untype' above is really just a
specialization of type aliasing:

(as-type nil table)

where (as-type ::Foo x) would create a view of x with dynamic type
Foo.

I'm still thinking about such aliasing and its implications.

Rich

Konrad Hinsen

Nov 13, 2009, 12:17:49 PM11/13/09
to clo...@googlegroups.com
On 13.11.2009, at 17:07, Rich Hickey wrote:

> Admittedly, it is a difference from multimethods. With protocols, both
> protocols and their functions/methods are immutable. Redefining or
> extending a protocol modifies only the protocol and fn vars. I prefer
> that, and don't consider the above behavior a problem. What do others
> think?

For most applications the difference doesn't matter. Having protocols
as immutable values bound to vars that change with every extend could
lead to both interesting use cases and undesirable surprises when used
with threads. Threads can have thread-local implementations of
protocols, intentionally or by mistake.

What makes this behaviour a bit disturbing is the fact that a var in a
namespace that refers to a protocol or a method is changed from code
in another namespace. This is of course possible otherwise as well,
but highly unusual. Consider this example:

(defprotocol Foo
(bar [x]))

(def bar nil)

Now in some other namespace, at some later time:

(extend Object
Foo
{:bar (fn [x] x)})

This will redefine bar in the first namespace to a method of Foo.
While I don't expect such a situation to be frequent, it is certainly
highly unexpected. All the more so since the code in the second
namespace doesn't even mention the symbol bar that it changes.

Konrad.

Constantine Vetoshev

Nov 13, 2009, 1:01:32 PM11/13/09
to Clojure
On Nov 12, 7:10 am, Rich Hickey <richhic...@gmail.com> wrote:
> [1]http://www.assembla.com/wiki/show/clojure/Datatypes

Could you please elaborate on why you chose to make IPersistentMap an
optional interface for deftype'd types, rather than making it
automatic?

I'm asking because I found the automatic defstruct-map equivalence
convenient in writing the Cupboard database library
(http://github.com/gcv/cupboard). It guaranteed that any reading
Clojure code could read back any map or any struct written by any
other Clojure program
(unless the map contains closures, of course), without any other
knowledge of the writing program. It allows for data-centric designs
when thinking about storing objects to a database, i.e., the data can
be used without the type definitions which originally produced it.

I can, of course, require that any deftype'd types saved in Cupboard
databases implement IPersistentMap, but I'm curious about the
reasoning for not making maps part of the default nature of deftype.
Making IPersistentMap the default could also make deftype a nearly
drop-in replacement for defstruct.

Thanks,
Constantine Vetoshev

Konrad Hinsen

Nov 13, 2009, 1:09:25 PM11/13/09
to clo...@googlegroups.com
I'd prefer the doc string right after the function name, as in other
situations, but it's not important enough that I'd argue for it at any
length.

Konrad.

Sean Devlin

Nov 13, 2009, 1:10:23 PM11/13/09
to Clojure
I agree w/ Constantine. This would be very, very useful.

Sean

AlexK

Nov 13, 2009, 12:58:20 PM11/13/09
to Clojure


On 13 Nov., 17:07, Rich Hickey <richhic...@gmail.com> wrote:
> This kind of do-nothing microbenchmarking demonstrates nothing.
> Protocol dispatching is significantly faster than multimethod
> dispatching, and I haven't even looked into call-site optimization.
> Protocol dispatch is not as fast as interface dispatch, and may or may
> not become so with call-site caching. But, protocols :on interfaces
> are precisely as fast when called with instances of the :on interface
> as calls to the :on interface.

Forget what I said about Performance, I didn't realize that the
functions could be mapped to interface-methods. When mapped to an
interface they get JITed to awesome speeds :-)

> I wouldn't want to key anything off of fn identity. It would make
> mixins quite difficult.

This was just a minor thought of mine; if it interferes with this awesome
literal mixin concept then I gladly take those mixins.

(def composed-mixin (merge mixin-a mixin-b mixin-c))
seems absolutely logical and natural
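
A small sketch of that composition, assuming the extend signature from released Clojure. The names mixin-a, mixin-b, mixin-c and composed-mixin come from the line above; the Pet protocol, its methods, and the Dog type are hypothetical.

```clojure
(defprotocol Pet
  (speak [p])
  (legs [p])
  (name-of [p]))

;; Each mixin is a plain map keyed by method keyword.
(def mixin-a {:speak (fn [_] "...") :legs (fn [_] 4)})
(def mixin-b {:name-of (fn [_] "unnamed")})
(def mixin-c {:speak (fn [_] "woof")})

;; merge works left to right, so mixin-c's :speak replaces mixin-a's.
(def composed-mixin (merge mixin-a mixin-b mixin-c))

(deftype Dog [])
(extend Dog Pet composed-mixin)
```

Since the keys are keywords rather than function objects, the maps can be built, merged, and inspected as ordinary data before any protocol is extended.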

>
> .......
>
> Admittedly, it is a difference from multimethods. With protocols, both
> protocols and their functions/methods are immutable. Redefining or
> extending a protocol modifies only the protocol and fn vars. I prefer
> that, and don't consider the above behavior a problem. What do others
> think?
>
> Rich

That was probably a bad example.
Here is a hypothetical example of implementing the type function as a
Protocolfn:


(defprotocol Typed (my-type [obj] "Doc.."))
Typed
(extend Object Typed {:my-type (fn [obj] (.getClass obj))})
nil
(defmulti foo my-type)
#'user/foo
(extend clojure.lang.IObj Typed
  {:my-type (fn [obj] (if (:type ^obj) (:type ^obj) (.getClass obj)))})

Now our foo multifn wouldn't work as expected (by me at least).
I just didn't expect that (extend <sth> <Type> <fn-map>) would
redefine some Vars.
extend has to mutate something, but I think that redefining some Vars
is weirder (and thus should be explicit via a visible (def ...))
than mutating the "generic function" (I may be biased by CLOS, of
course).

Alex


Stuart Sierra

Nov 13, 2009, 3:19:09 PM11/13/09