(deftype tree
empty-tree
(leaf value)
(node left-tree right-tree))
(def a-tree (node (leaf :a)
(node (leaf :b)
(leaf :c))))
(defn depth
[#^tree t]
(match t
empty-tree 0
(leaf n) 1
(node l r) (inc (max (depth l) (depth r)))))
(depth empty-tree)
(depth a-tree)
For more examples, see clojure.contrib.types.examples.
The reason why I call this a proof-of-concept implementation is that
it has a major drawback that seriously limits its applicability:
equality tests on objects created from these types are done by
identity, not by value equality, which makes them practically
useless. This is due to the fact that the objects are actually
implemented as closures. Another drawback, following as well from the
implementation as closures, is that the objects cannot have any meta-
data attached.
Closures are the simplest way to create a new Java class in Clojure,
but they have the disadvantage of equality-by-identity. I have an
idea for a better implementation, based on gen-class and proxy, but
before embarking on this I'd like to see some feedback on the current
implementation. Is this something you would like to use?
Konrad.
> Is there any reason they cannot be implemented as structs with some
> sort of type tag?
I suspect that could be done, but some nice features would be lost:
1) User-friendly printing.
I implement the multimethods print-method and print-dup for my
classes, which allows me to specify how objects are printed. print-
method and print-dup dispatch on class, so my types need to be
classes or at least derive from a class that is distinct from the
standard Clojure data structures.
Consider how the simple tree structure a-tree would print if structs
with type tags were used for implementation! With my class-based
implementation, I could even define a specific printing method for
each individual type, and I expect to do that sooner or later.
2) Usability in multimethods
The above argument illustrates a more general problem: multimethods
meant for use with a wide range of data structures can in practice
only use class as a selector, as that is the only one that will work
for the built-in data structures. Look at fmap in
clojure.contrib.types.examples, for instance. It would make sense to
add an implementation for maps that applys f to the value of each map
entry. But I can't have that plus a type system based on maps with
special tags - it's one or the other.
3) Usability in type hints
With types implemented as classes, they can be used in type hints on
function arguments. While I don't expect any performance gains in
this case, it is quite useful as a development aid.
4) Modularity
There are bound to be several type or class systems for Clojure, as
for other Lisps. We already have Spinoza, for example. If all these
systems use maps with type tags, there are bound to be tag conflicts
sooner or later. For example, someone will create a Spinoza class
with a slot whose name interferes with my special type tag. This
can't happen with my closure-based implementation. It creates objects
that are distinct by the very nature of their classes, so it doesn't
interfere with whatever someone else may come up with.
Konrad.
1) User-friendly printing.
I implement the multimethods print-method and print-dup for my
classes, which allows me to specify how objects are printed. print-
method and print-dup dispatch on class, so my types need to be
classes or at least derive from a class that is distinct from the
standard Clojure data structures.
2) Usability in multimethods
The above argument illustrates a more general problem: multimethods
meant for use with a wide range of data structures can in practice
only use class as a selector, as that is the only one that will work
for the built-in data structures. Look at fmap in
clojure.contrib.types.examples, for instance. It would make sense to
add an implementation for maps that applys f to the value of each map
entry. But I can't have that plus a type system based on maps with
special tags - it's one or the other.
4) Modularity
There are bound to be several type or class systems for Clojure, as
for other Lisps. We already have Spinoza, for example. If all these
systems use maps with type tags, there are bound to be tag conflicts
sooner or later. For example, someone will create a Spinoza class
with a slot whose name interferes with my special type tag. This
can't happen with my closure-based implementation. It creates objects
that are distinct by the very nature of their classes, so it doesn't
interfere with whatever someone else may come up with.
> Should print and println be multimethods? It seems the ability to
> custom print a data structure would be generally useful.
It's already there, just not documented as far as I know. All the
printing routines end up calling clojure.core/print-method and
clojure.core/print-dup, which are multimethods dispatching on class.
> Not totally following here, looking at your code... Pardon my
> slowness but why not dispatch on :tag first, and class if no :tag?
That's possible, of course, but the result is a multimethod that
works for (a) class-based types and (b) one specific type hierarchy
hard-wired into the selector, such as your :tag. If I write some
other type system, it won't fit in. That makes it impossible to
define truly general interfaces that are open for anyone to implement.
Take print-method in clojure.core as an example: How would you
implement its selector, such that anyone could plug in their type
system based on some tagging scheme? Perhaps one could come up with a
scheme in which the selector function is itself a multimethod, but
that would probably be quite messy.
> Tags of course should be namespaced so is this really that much of
> an issue in practice? Since tags are keywords they don't have the
> interning conflict problem that symbols of the same name generally
> do, right?
Right. Namespaced tags go a long way to avoid name clashes. However,
they are still not the private property of any one library module.
Your namespaced tag is an object that can end up in any data
structure, intentionally or by error. Only time will tell if this is
a problem in practice.
Anyway, my two main arguments in this issue are printing and open-to-
anyone interfaces with multimethods. These are the problems that I
had in my own code and that I wanted to solve. Printing is important
for complex nested data structures and interactive use. Generic
interfaces are a very nice feature to have. I came across this
problem twice:
1) In clojure.contrib.accumulators. Ideally, it should be possible to
implement additional accumulator types elsewhere, using the same
interface. With the current implementation based on meta-data, that
is possible, but only for data types that allow meta-data. It is thus
not possible to build accumulators based on Java classes, at least
not without wrapping them in a map or vector. Moreover, the selector
function, based on meta-data with a fallback to classes, is neither
elegant nor efficient, and much of the code in the library deals with
the meta-data tagging scheme rather than with its real job.
2) In clojure.contrib.streams-utils. Again, I wanted to define a
generic interface to data streams. At the moment, it uses a
multimethod dispatching on class, but that doesn't leave much room to
define different data stream sources. That was in fact what prompted
me to write the types library.
Konrad.
Plus custom printing.
(defmethod print-method ::Foo [o w]
(doto w
(.write "#<Foo: ")
(.write (:x o))
(.write ">")))
(defn make-foo [x]
#^{:type ::Foo} {:x x})
user=> (make-foo "whee")
#<Foo: whee>
--Chouser
(defstruct stuff :a :b)
(defmacro make
"Creates a new instance of struct t and appends 'type' t as metadata"
[t & vs]
`(with-meta
(struct ~t ~@vs)
{:type (keyword (str *ns*) (name '~t))}))
user> (make stuff 1 2)
{:a 1, :b 2}
user> (meta (make stuff 1 2))
{:type :user/stuff}
-Jeff
Hmmm... On svn 1307, I get the same stack overflow. No issue on svn 1280. It's the keyword :type as metadata...but only when stringifying the value to output it.
(def x (with-meta [1 2] {:type "hi"})) ; define but don't print: works
(meta x) ; get metadata: works
(get x 1) ; get a value: works
(println x) ; stack overflow
(str x) ; stack overflow
(format "%s" x) ; stack overflow
If the keyword in the macro is changed to anything other than "type", there's no issue.
(with-meta [] {:typeX "hi"}) ; no worries
(with-meta [] {:type "hi"}) ; stack overflow
String representation obviously uses :type now in a very particular way. I'm not sure where this happens though. Can anyone shed some light on the details?
-Jeff
> String representation obviously uses :type now in a very particular
> way. I'm not sure where this happens though. Can anyone shed some
> light on the details?
print-method now dispatches on type, rather than class as it did
before. There is no default implementation for print-method, so if
you put something as :type that has no implementation of print-
method, weird things will happen.
The fix is to provide a default implementation for print-method. Try
executing this:
(defmethod print-method :default [o w] (print-method "failed" w))
before printing your object, and it should print "failed" without any
exception. There should be a default in clojure.core, but I am not
sure what it should best be. Should it print the object stripped of
its metadata? Or a string indicating the problem? Or should it throw
an exception?
Konrad.
> You raise interesting issues and I'd like to explore them further. I'm
> not sure the issues you have with type-tag-or-class dispatch are all
> that prohibitive. In any case, I've added a type function that returns
> the :type metadata or the class if none:
Thanks, that helps a lot! With a built-in universal dispatching
function, most of my problems should be solved. Another useful
function to have would be
(defn type-instance?
"Evaluates x and tests if it is an instance of the type or class t.
Returns true or false"
[t x]
(identical? t (type x)))
for type-based dispatching inside a function.
> I'm not sure the mechanism you are using (actual Class types) allows
> for any more overloading - Class is a single slot too, after all.
True, but the number of classes is not limited. Java programmers can
live with a single class hierarchy, so it can't be too bad. But
Clojure's hierarchies are definitely more flexible.
> A bigger problem with the mechanism you've chosen is its dependence on
> some implementation details. I haven't promised, e.g. that every fn
> generates a unique class type.
I know, but as I said, my current implementation is just a proof of
concept. It is not viable for production use for a variety of
reasons. I was planning to replace it by something based on gen-class
and proxy, but I will first try to get away with the new type function.
Konrad.
Thanks, that helps a lot! With a built-in universal dispatching
On 26.02.2009, at 01:51, Rich Hickey wrote:
> You raise interesting issues and I'd like to explore them further. I'm
> not sure the issues you have with type-tag-or-class dispatch are all
> that prohibitive. In any case, I've added a type function that returns
> the :type metadata or the class if none:
function, most of my problems should be solved. Another useful
function to have would be
(defn type-instance?
"Evaluates x and tests if it is an instance of the type or class t.
Returns true or false"
[t x]
(identical? t (type x)))
for type-based dispatching inside a function.
> You raise interesting issues and I'd like to explore them further. I'm
> not sure the issues you have with type-tag-or-class dispatch are all
> that prohibitive. In any case, I've added a type function that returns
> the :type metadata or the class if none:
>
> user=> (type #^{:type ::Fred} [1 2 3])
> :user/Fred
>
> user=> (type "foo")
> java.lang.String
>
> which should help people standardize.
One inconvenience I noticed with types based on meta-data tags is
that the type information does not participate in equality tests.
This means I have to include the type information once again in the
value itself if I want to make sure that no other type's value
accidentally compare as identical to mine.
> If you want to multiplex multimethods for your ADT type, you can just
> define a single :type ::ADT, and then sub-dispatch on e.g. :adt-type.
That's not even necessary: I add a derive clause from ::adt to each
of my data types and implement multimethods such as print-method
for ::adt.
Konrad.
Rich
Rich
On 26 February 2009 at 02:44, Konrad Hinsen wrote:
> The fix is to provide a default implementation for print-method. Try
> executing this:
>
> (defmethod print-method :default [o w] (print-method "failed" w))
This prevents the stack overflow. But I'd prefer a more forgiving default print behavior.
> There should be a default in clojure.core, but I am not
> sure what it should best be. Should it print the object stripped of
> its metadata?
I vote for this. A default that re-dispatches on class (as you suggested, stripping the :type metadata), falls back to the previous print behavior.
(defmethod print-method :default [o, #^java.io.Writer w]
(if (:type (meta o))
(print-method (with-meta o (dissoc (meta o) :type)) w)
(do
(.write w "#<")
(.write w (.getSimpleName (class o)))
(.write w " ")
(.write w (str o))
(.write w ">"))))
> I know, but as I said, my current implementation is just a proof of
> concept. It is not viable for production use for a variety of
> reasons. I was planning to replace it by something based on gen-class
> and proxy, but I will first try to get away with the new type
> function.
I just committed a completely new implementation to clojure.contrib.
It uses a vector with type metadata for representing algebraic data
types. Overall I am rather happy with this version. It is used almost
exactly like the previous one.
There is probably still room for improvement, but I don't expect the
interface to change significantly any more, so I'd say it's safe for
adoption by adventurous Clojurians :-)
Konrad.
> I was wondering if you considered using maps or struct-maps for this.
> One of my pet peeves with algebraic data types is the unnamed,
> positional nature of the components/fields. It always bothers me you
> have to rename the fields in every pattern match, skip fields with _
> etc.
There is something similar on my list of ideas for future extensions
of the algebraic data type module: optional keywords to name the
parameters of the constructors, which could then also be used for
matching in templates. Each component could then be referred to
either by its position or by its keyword. The model I have in mind is
function parameter passing in Python. The concrete representation
would still be vectors, probably, but no one should rely on the
concrete representation anyway. At the moment I use vectors because
template matching expands to code that uses nth for element access,
which is fastest for vectors.
I wouldn't like to have only named elements, because for constructors
with few arguments (one, at the extreme), this would be needlessly
verbose. Consider a very simple algebraic data type like Haskell's
maybe:
(deftype maybe
nothing
(just x))
Now that I re-think about this, it would make sense not to have
additional keywords as names, but use the constructor argument names
directly. All that would be required in addition to the current
mechanisms is an additional name-based template syntax.
BTW, another idea I have for future extensions is types with lazy
fields, with the constructors defined as macros and the component
values stored in delay objects. That would allow the construction of
lots of lazy data structures.
Konrad.