functions with metadata, 2 problems: performance hit and equality not preserved.


John Alan McDonald

Sep 19, 2017, 9:01:07 PM
to Clojure
I'd like to be able to do something like:

(defn square ^double [^double x] (* x x))
(def meta-square (with-meta square {:domain Double/TYPE :codomain Double/TYPE :range {:from 0.0 :to Double/POSITIVE_INFINITY :also Double/NaN}}))

https://clojure.org/reference/metadata says "Symbols and collections support metadata...". Nothing about whether any other types do or do not support metadata.

The code above works, at least in the sense that it doesn't throw exceptions, and meta-square is a function that returns the right values, and has the right metadata.
That's because square is an instance of a class that extends AFunction, which implements IObj (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L18).

It doesn't work, in the sense that it violates "Two objects that differ only in metadata are equal." from https://clojure.org/reference/metadata. That is,
(= square meta-square)
returns false. 

For my purposes, what really matters is that calling meta-square has roughly 30 times the cost of square itself (and about 3 times the cost of a version without type hints).
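The reason is that meta-square is an instance of a class that extends RestFn (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L26), whose invoke() methods are expensive: each call packs its arguments into a seq and forwards them to the original function's applyTo, so primitive arguments are boxed.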
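
The behaviour described above can be checked at a REPL. This is only a minimal sketch, assuming Clojure 1.8 or later; the metadata map is shortened for illustration.

```clojure
;; A primitive-hinted function, as in the post above.
(defn square ^double [^double x] (* x x))

;; Metadata map shortened for illustration.
(def meta-square (with-meta square {:domain Double/TYPE :codomain Double/TYPE}))

;; The wrapper still computes the right value and carries the metadata.
(meta-square 3.0)                              ; 9.0
(:domain (meta meta-square))                   ; the class double

;; with-meta returned a different object, and functions compare by identity.
(= square meta-square)                         ; false

;; The original implements the primitive interface IFn$DD. The wrapper is a
;; RestFn that does not, so calls go through applyTo with boxed arguments.
(instance? clojure.lang.IFn$DD square)         ; true
(instance? clojure.lang.IFn$DD meta-square)    ; false
(instance? clojure.lang.RestFn meta-square)    ; true
```

The last three checks are the ones that matter for performance: the primitive invokePrim path is simply not available on the object that with-meta returns.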

Also, for my purposes, it would actually be better if  "Two objects that differ only in metadata are NOT equal." So perhaps I shouldn't be using metadata at all. It just seems

Options:

(1) Add a meta field to clojure.lang.AFunction (and fix equals and hashCode). I presume the reason there isn't already a meta field is to keep functions as lightweight as possible. Are there good benchmarks that I could use to measure the cost of adding an almost always empty field?

(2) Experiments with a mechanical wrapper class (https://github.com/palisades-lakes/dynamic-functions/blob/dynesty/src/main/java/palisades/lakes/dynafun/java/MetaFn.java) show almost no overhead, but extending that to cover every possible combination of clojure.lang.IFn$DD, clojure.lang.IFn$DLD, ..., is impractical.

(3) Use asm to create a new class that extends the original function's class and implements IObj in the obvious way.

My short term plan is (2), ignoring the equals violation, and implementing primitive interface wrappers as needed.

Are there problems with (3) asm, as a long term solution?

Alex Miller

Sep 20, 2017, 2:34:48 AM
to Clojure


On Tuesday, September 19, 2017 at 8:01:07 PM UTC-5, John Alan McDonald wrote:
I'd like to be able to do something like:

(defn square ^double [^double x] (* x x))
(def meta-square (with-meta square {:domain Double/TYPE :codomain Double/TYPE :range {:from 0.0 :to Double/POSITIVE_INFINITY :also Double/NaN}}))

https://clojure.org/reference/metadata says "Symbols and collections support metadata...". Nothing about whether any other types do or do not support metadata.

Functions are probably the big missing thing in that list of types that have metadata that you can modify.

A few other things also have metadata that can be set at construction and read but not modified after construction: namespaces and the reference types (vars, atoms, refs, agents). 
 
The code above works, at least in the sense that it doesn't throw exceptions, and meta-square is a function that returns the right values, and has the right metadata.
That's because square is an instance of a class that extends AFunction, which implements IObj (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L18).

It doesn't work, in the sense that it violates "Two objects that differ only in metadata are equal." from https://clojure.org/reference/metadata. That is,
(= square meta-square)
returns false. 

I think the tricky thing here is that functions are only equal if they are identical. Perhaps it would make sense to implement equals in the function hierarchy to somehow "unwrap" the meta wrappers. I'm not sure if that even makes sense.

Another option here is this, which I think would be more typical:

(def ^{:domain Double/TYPE :codomain Double/TYPE :range {:from 0.0 :to Double/POSITIVE_INFINITY :also Double/NaN}} meta-square square)

that puts the meta on the var (meta-square), not on the function instance itself. In this case equality works and the invocation timing should be about the same since in both cases you're going through the var to invoke the identical function. You can retrieve the meta with (meta #'meta-square) since it's on the var.
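
A short sketch of the var-based variant, to make the difference concrete. It assumes the same square function as above; nothing else is introduced.

```clojure
(defn square ^double [^double x] (* x x))

;; Metadata goes on the var meta-square; the function object is untouched.
(def ^{:domain Double/TYPE
       :codomain Double/TYPE
       :range {:from 0.0 :to Double/POSITIVE_INFINITY :also Double/NaN}}
  meta-square square)

(identical? square meta-square)        ; true, same function object
(= square meta-square)                 ; true
(:domain (meta #'meta-square))         ; the class double
(meta meta-square)                     ; nil, nothing on the function itself
```

Because both vars hold the identical function object, equality and call cost are unchanged; the trade-off is that the facts are only reachable through the var, not through the function value.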

For my purposes, what really matters is that calling meta-square has roughly 30 times the cost of square itself (and about 3 times the cost of a version without type hints).

I see the overhead of your meta-square as more like 2 times the cost in a quick test, not sure how you're testing. I'm using 1.9.0-beta1 and Java 8 and timing 100,000 invocations over a series of runs. There's a lot of variability - using something like Criterium would yield better data.
 
The reason is that meta-square is an instance of a class that extends RestFn (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L26), whose invoke() methods are  expensive.

Also, for my purposes, it would actually be better if  "Two objects that differ only in metadata are NOT equal." So perhaps I shouldn't be using metadata at all. It just seems

Options:

(1) Add a meta field to clojure.lang.AFunction (and fix equals and hashcode). I presume the reason there isn't already a meta field is to keep functions as light weight as possible. Are there good benchmarks that I could use to measure the cost of adding an almost always empty field? 

I think it would add the cost of an object ref (probably 32 or 64 bits) to every function object, and I don't think it matters if it's empty or not. I don't really know the reason for the current design; that would require some research. There are a LOT of potential considerations here with respect to backwards compatibility, etc. Any change like this would be treated very carefully. I do not think the need is necessarily worth such a change, but it's hard to weigh that.
 
(2) Experiments with a mechanical wrapper class (https://github.com/palisades-lakes/dynamic-functions/blob/dynesty/src/main/java/palisades/lakes/dynafun/java/MetaFn.java) show almost no overhead, but extending that to cover every possible combination of clojure.lang.IFn$DD, clojure.lang.IFn$DLD, ..., is impractical.

That's what code gen is for.
 
(3) Use asm to create a new class that extends the original function's class and implements IObj in the obvious way.

The asm included inside Clojure should be considered an internal implementation detail, subject to version and API changes without warning.
 
My short term plan is (2), ignoring the equals violation, and implementing primitive interface wrappers as needed.

Will the var version above satisfy?
 
Are there problems with (3) asm, as a long term solution?

As mentioned above, you should not rely on this being available or free from breakage. 

John McDonald

Sep 20, 2017, 2:18:16 PM
to clo...@googlegroups.com
Thanks for the quick response.

One issue at a time:

(A) Putting metadata on Vars instead of on the functions themselves:

I need to be able to associate facts with the function instances. I can't rely on every function being bound to a Var.
For example, I'm constructing cost functions for machine learning, and other applications, by summing, composing, etc. other functions.  



--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to a topic in the Google Groups "Clojure" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/D8mksieuUPI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to clojure+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

John McDonald

Sep 20, 2017, 2:57:47 PM
to clo...@googlegroups.com
2nd issue: Benchmarks

I use both criterium and simple 'run repeatedly and divide the clock time'.

I've had trouble getting consistent results from run to run with either.

Most recently (yesterday) I've added many more warmup runs, giving HotSpot lots of time to do its stuff, which seems to be stabilizing the results.
This might be unfair, because in practice a lot of important code might never run that many times. 
At least, it gives the limiting performance of code that runs a lot.

Unfortunately, this benchmark is split over 3 github projects, so it might be a little hard to follow.

The task I'm timing is computing the sum of squares of the elements in a moderately large (4m elements) double[] array (aka L2Norm).

I've compared 8 implementations, all of which accumulate the sum in Java.

  6.37ms inline --- naive sum of squares in Java.
  6.37ms invokestatic --- call a static method in Java to square the elements.
  6.38ms primitive --- calls square.invokePrim(x[i]) to square the elements
  6.37ms boxprimitive --- calls square.invoke(x[i]) to square the elements
  6.37ms funxprimitive --- calls funxSquare.invokePrim(), where funxSquare is square wrapped with MetaFn, an experimental metadata wrapper.
  6.38ms funxboxed --- calls funxSquare.invoke()
 43.55ms boxed --- calls boxedSquare.invoke(), where boxedSquare is a version of square without type hints.
151.61ms cljmeta --- calls metaSquare.invoke(), where metaSquare is the result of (with-meta square {...})

Clojure 1.8.0, Oracle JDK 1.8, Win10, Lenovo X1 i5-7300U.
Sum of squares for each of 2 double[4194304], in 2 threads, concurrently.

boxed and cljmeta create a lot of garbage, causing their clock times to be more variable, depending on exactly what GC does.
It's possible they appear relatively faster with a small number of total calls, if they don't trigger a GC, and HotSpot doesn't fully optimize the others.
I think this is a valid usage pattern, but many many calls to small functions is the case that I'm interested in at present.
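
For readers who want to try the comparison without the three projects, here is a rough 'warm up, run repeatedly, divide the clock time' sketch in plain Clojure. It is not the benchmark used above (that one accumulates the sum in Java): mean-ms and sum-of-squares are hypothetical helper names, the array is far smaller than 4m elements, and the warm-up and run counts are arbitrary, so absolute numbers will differ.

```clojure
(defn square ^double [^double x] (* x x))
(def meta-square (with-meta square {:domain Double/TYPE}))

(defn sum-of-squares
  "Sums (f x) over a double array; f is called through the boxed invoke."
  ^double [f ^doubles xs]
  (areduce xs i acc 0.0 (+ acc (double (f (aget xs i))))))

(defn mean-ms
  "Runs thunk warmup times untimed, then returns the mean wall-clock
  milliseconds per call over runs timed calls."
  [thunk warmup runs]
  (dotimes [_ warmup] (thunk))
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (thunk))
    (/ (- (System/nanoTime) start) 1e6 runs)))

;; Small array and small counts, chosen only so the sketch runs quickly.
(def xs (double-array (repeatedly 100000 rand)))

(mean-ms #(sum-of-squares square xs) 50 50)
(mean-ms #(sum-of-squares meta-square xs) 50 50)
```

Raising the warm-up count gives HotSpot more time to optimize, which is the effect described above; a tool like Criterium handles warm-up and GC noise more carefully than this loop does.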


Main scripts:

Using criterium:

Running the benchmark 4k times and dividing the clock time:

These both use general benchmarking code from 

The experimental metadata function wrapper is in:

Justin Smith

Sep 20, 2017, 3:05:04 PM
to clo...@googlegroups.com
I've had good luck with an approach suggested by Kevin Downey, defining a defrecord that implements IFn so that it works when called and applied, and transparently supporting attached data as if it were a hash-map. It's not too hard to implement if you know the precise arg count you need to support, and I'd be interested to see how the performance compares.
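
A minimal sketch of that idea, assuming a single-argument function. AnnotatedFn and its field names are hypothetical, not taken from any library or from the code referred to above.

```clojure
(defn square ^double [^double x] (* x x))

;; AnnotatedFn is a hypothetical name. f is the wrapped function; the other
;; fields hold the facts attached to it. Only arity 1 is supported here.
(defrecord AnnotatedFn [f domain codomain]
  clojure.lang.IFn
  (invoke [_ x] (f x))
  (applyTo [_ args] (apply f args)))

(def annotated-square (->AnnotatedFn square Double/TYPE Double/TYPE))

(annotated-square 3.0)                 ; 9.0, via invoke
(apply annotated-square [3.0])         ; 9.0, via applyTo
(:domain annotated-square)             ; the class double, read like a map entry

;; Record equality compares fields, so the attached data counts.
(= annotated-square (assoc annotated-square :domain Long/TYPE))   ; false
```

Note that this version still boxes its argument, since invoke takes an Object; supporting the primitive interfaces would need extra work per signature.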



John McDonald

Sep 20, 2017, 3:05:57 PM
to clo...@googlegroups.com
3rd issue: metadata and function equality:

I've never really understood the motivation for "Two objects that differ only in metadata are equal." 
Is there a good reference for that?

For my purposes, it would probably be better if 'metadata' did affect equality.

Perhaps I shouldn't be overloading the metadata mechanism after all?

John McDonald

Sep 20, 2017, 3:21:32 PM
to clo...@googlegroups.com
Code generation:

It seems to me this has to be done on demand, to be practical.

To support every possible combination of 'implements clojure.lang.IFn$DD, clojure.lang.IFn$DLD, ...' 
for differing arities, I get 13,108,878 classes.

Or is there a better way?

John McDonald

Sep 20, 2017, 3:47:42 PM
to clo...@googlegroups.com
ASM:

I haven't done anything with ASM before. Any advice would be greatly appreciated.

What I have in mind is using the org.ow2.asm, not the internal clojure.asm.

I am imagining I can take a function's class and add 'implements IObj', a 'meta' field, and the necessary methods, and pass through everything else unchanged.

I then ought to be reasonably insensitive to changes in how functions are implemented, but not completely. I would definitely be depending on IObj, which is not part of the public Clojure API.

So there's no guarantee anything implemented this way would continue to work in a new release.
That might be a reasonable risk to trade off against higher performance, however.

There's also a chicken and egg thing here: The Clojure API shouldn't commit to anything without a good reason. But you can't demonstrate that there's a good reason without realistic code that shows what's possible with that commitment.


John McDonald

Sep 20, 2017, 3:58:39 PM
to clo...@googlegroups.com
I've done something like this in the past. 

I'd expect the performance to be similar to MetaFn. In either case, I think you have a wrapper class that carries the additional data, and one level of indirection for invoke or invokePrim. My benchmarks results so far seem to show the indirection gets optimized away eventually.

I think the main difference is that the function now implements something like IPersistentMap rather than IObj. The problem I had was if the function was some Map-like object to begin with, I'd get confused about what was 'data' and what was 'metadata'. 



Justin Smith

Sep 20, 2017, 4:30:37 PM
to clo...@googlegroups.com
You don't need to indirect - the invoke, call, and apply methods can have your code in them directly (a macro can expand to put your code body into them directly as appropriate). If you need to ensure it isn't mistaken for data, you could add a marker interface and check for it (an interface with no methods, just used for checking membership).
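
To illustrate, this is roughly what one hand-written expansion of such a macro might look like; the macro itself is omitted. IAnnotatedFn and Square are hypothetical names, and only arity 1 is sketched.

```clojure
;; Marker interface with no methods; the name is hypothetical.
(definterface IAnnotatedFn)

;; The function body lives directly in invoke, so there is no wrapped
;; function to call through.
(defrecord Square [domain codomain]
  IAnnotatedFn
  clojure.lang.IFn
  (invoke [_ x] (let [x (double x)] (* x x)))
  ;; applyToHelper dispatches to the invoke of the matching arity.
  (applyTo [this args] (clojure.lang.AFn/applyToHelper this args)))

(def sq (->Square Double/TYPE Double/TYPE))

(sq 3.0)                                    ; 9.0
(apply sq [3.0])                            ; 9.0
(instance? IAnnotatedFn sq)                 ; true
(instance? IAnnotatedFn {:domain :number})  ; false: a plain map
```

The instance? check is what separates a map-like function from ordinary map data.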

Why would you implement IPersistentMap instead of IObj, wouldn't it be both, if anything? A normal hash-map implements both for example.

To implement your own bytecode compiler just so you can have something that carries data that counts in equality checks seems a bit overboard to me, since a record implementing IFn does this for very little trouble.



John McDonald

Sep 20, 2017, 6:43:20 PM
to clo...@googlegroups.com
How do you handle closures?
EG:

(import 'clojure.lang.IFn)

(defn squared ^IFn [^IFn f]
  (with-meta
    (fn squared0 [x]
      (let [fx (double (f x))]
        (* fx fx)))
    {:domain :number
     :range :positive-number}))

I think you can do it, but it would require the macro taking apart the value of &env, which means depending on clojure.lang.Compiler.LocalBinding.

My guess is that using ASM to add fields and methods to a class would be less dependent on implementation details, but having never done that, it's only a guess.
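
As a rough illustration of what taking apart &env can look like, here is a sketch of a macro that reads only the keys of &env (the symbols of the locals in scope) and never touches the LocalBinding values. local-bindings and the :closed-over key are hypothetical names, and the macro captures every local in scope, not just the ones the inner function actually closes over.

```clojure
;; Hypothetical helper macro: expands to a map from the symbol of each local
;; in scope to its value. It reads only the keys of &env.
(defmacro local-bindings []
  (into {} (for [sym (keys &env)] [(list 'quote sym) sym])))

(defn squared [f]
  (let [captured (local-bindings)]
    (with-meta
      (fn squared0 [x]
        (let [fx (double (f x))]
          (* fx fx)))
      {:domain :number
       :range :positive-number
       :closed-over captured})))

(def g (squared inc))
(g 2)                                        ; 9.0
(contains? (:closed-over (meta g)) 'f)       ; true
```

Whether the keys of &env are a stable enough contract to build on is a separate question; this only shows that the names of the locals are reachable without inspecting the compiler's own classes.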




Didier

Sep 21, 2017, 2:07:51 AM
to Clojure
I'm not fully following what you're doing or trying to do, but don't expect meta to compose. A lot of macros and functions can strip it away. It's best kept for annotating global, static things.

Didier

Sep 21, 2017, 2:15:59 AM
to Clojure
You should also check out https://github.com/jgpc42/insn/blob/master/README.md

It was announced a few weeks back and looks like a nice interface to ASM.
