using gen-class to extend clojure.lang.Numbers and possibly expose constructor to library devs?


Sophia Gold

Sep 28, 2016, 8:34:25 PM
to Clojure Dev
Hi,

I've been interested in contributing to Clojure/ClojureScript for a while, although this is far from what I imagined my first proposed patch would look like.

I'm not a Java dev by any means. This came up in the context of a very small numerical library I've been working on (https://github.com/Sophia-Gold/Symbolic-Algebra.clj), which required me to dig into both core.clj and clojure.lang.Numbers. That gave me the idea that a variation of what I implemented could be useful for further dev work by others. So please, feedback is welcome, as is taking this with a grain of salt.

Essentially, my problem was that building a library of numeric types (so far rationals, complex numbers, polynomials, and combinations and subtypes thereof) using protocols eventually came down to dispatching on a base type that could be cast across all of Clojure's numeric types. You can see how I solved that problem below: I used gen-class to extend the java.lang.Number abstract class and then provided a constructor that uses Clojure's (num) function to cast the value based on reader input:

(ns symbolic-algebra.core
  (gen-class
   :name Number
   :extends java.lang.Number
   :constructors {[] []}))

(defn -Number [this]
  (num this))

(defprotocol Algebra
  (add [a b])
  (sub [a b])
  (mul [a b])
  (div [a b])
  (equal? [a b]))

(extend-type Number
  Algebra
  (add [a b]
    (if (= (class a) (class b))
      (+ a b)
      (raise-types a b add)))
  (sub [a b]
    (if (= (class a) (class b))
      (- a b)
      (raise-types a b sub)))
  (mul [a b]
    (if (= (class a) (class b))
      (* a b)
      (raise-types a b mul)))
  (div [a b]
    (if (= (class a) (class b))
      (/ a b)
      (raise-types a b div)))
  (equal? [a b]
    (if (= (class a) (class b))
      (= a b)
      (raise-types a b equal?))))
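
For comparison, a protocol can also be extended to java.lang.Number directly, with no gen-class step. This is only a minimal sketch under stated assumptions: the protocol name Arithmetic is made up for illustration, and this raise-types (which simply coerces both arguments to double) is a hypothetical stand-in, not the helper the library above actually uses.

```clojure
;; Hypothetical protocol: a trimmed-down analogue of Algebra above.
(defprotocol Arithmetic
  (add [a b])
  (mul [a b]))

;; Assumed stand-in for raise-types: coerce both arguments to double
;; and retry the operation. A real numeric tower would be more careful.
(defn raise-types [a b op]
  (op (double a) (double b)))

;; Long, Double, Ratio, BigInt, etc. all inherit from java.lang.Number,
;; so one extension covers every built-in numeric type.
(extend-type java.lang.Number
  Arithmetic
  (add [a b]
    (if (= (class a) (class b))
      (+ a b)
      (raise-types a b add)))
  (mul [a b]
    (if (= (class a) (class b))
      (* a b)
      (raise-types a b mul))))

(add 1 2)      ; both Long, handled directly by +
(add 1 2.5)    ; Long and Double, so both are raised to double first
(mul 1/2 1/3)  ; both Ratio, handled directly by *
```

Because dispatch here is on the existing base class, custom types (records or deftypes) would each need their own extension of the protocol.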


Now, having combed over the math section of clojure/core.clj, I obviously saw a lot of repetitive calls to clojure.lang.Numbers, which (aside from implementing some functions directly) essentially does the same thing I was doing, except in Java code. To avoid making a fool of myself in my first post on the dev list, I've spent a couple of days weighing the upsides and downsides of this seven line patch, which simply adds a constructor for clojure.lang.Numbers as I did in my project for java.lang.Number.
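
To illustrate the delegation described here, the static methods on clojure.lang.Numbers can be called directly from Clojure. This is just a small sketch of three of those methods (add, multiply, divide), not a listing of how core.clj is written.

```clojure
;; clojure.core's arithmetic functions delegate to static methods on
;; clojure.lang.Numbers, which can also be invoked directly.
(clojure.lang.Numbers/add 1 2)         ; same result as (+ 1 2)
(clojure.lang.Numbers/multiply 2 3.5)  ; same result as (* 2 3.5)
(clojure.lang.Numbers/divide 1 2)      ; same result as (/ 1 2), a Ratio
```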

As I see it the considerations are as follows:
  1. Shouldn't break anything (again not a Java coder...so please correct me if I'm wrong).
  2. Likely to be confusing unless someone rewrites all the core math functions to use it (I would volunteer to do this, but I doubt anyone would want me to considering it would contribute no real value and require a rather non-trivial code review).
  3. Eases implementation of new core math functions...although it seems highly unlikely to me any are slated to be added any time soon. 
  4. To the extent anyone is thinking about Clojure-in-Clojure for 2.0 or any further releases, this could be seen as an incremental step towards that or even a way to try it out in an incremental release by switching just the math functions over from the Java implementation to protocols. But obviously that's a much greater design decision than just a seven line patch...
  5. More along my interests: the constructor is now exposed for devs writing numerical libraries. As I've shown above, the (num) function is only for casting and can't be used as the basis of an actual type system. I'm not sure if this is something that people have already solved in other ways, maybe with core.typed, but for me one of the benefits of using a dynamic language is being able to write expressive systems on top of one numerical type that handles casting for me.
Based on that, here's my proposed patch:

(ns ^{:doc "The core Clojure language."
      :author "Rich Hickey"}
  clojure.core
  (gen-class
   :name "Number"
   :extends clojure.lang.Numbers
   :constructors {[] []}
   :prefix "clojure-"))

...

(defn clojure-Number
  [this] (cast this))


As mentioned, all feedback welcome. I'm certain I can't be correct on all of the assumptions I've made here, so please, have at it. I've found this community incredibly welcoming, which is the only reason I feel comfortable coming out of nowhere and proposing a patch on the core language like this :)

Thanks,
Sophia
 
(As an aside: I think a lot of people write off Clojure's prospects for numerical computing since it's a hosted language with dynamic typing, but I've recently been blown away by how powerful it is in many applications. That's what has motivated me to do work like this and to hope to see more of it from others, which is what I believe a constructor for clojure.lang.Numbers would help enable. I can't say this particular library was fun to work on due to the OOP structure, but I'm viewing it more as scaffolding on which to play with some of the transducers I've been writing. In particular, I'm beginning to port the lazy power series code from Haskell (Doug McIlroy), Scheme (SICP), Go (Rob Pike), etc., all of whom I believe got the idea from research by Gilles Kahn. Based on speed tests on similar code, I have strong reason to believe the Clojure version will be by far the fastest.)
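
As a rough idea of what lazy power series can look like with Clojure's lazy seqs, here is a minimal sketch. The function names (ps-add, ps-scale, ps-mul) are hypothetical, the code is not taken from the library mentioned above, and it assumes every series is an infinite seq of coefficients.

```clojure
;; A power series is modelled as an infinite lazy seq of coefficients:
;; (a0 a1 a2 ...) stands for a0 + a1*x + a2*x^2 + ...

(defn ps-add
  "Coefficient-wise sum of two infinite series."
  [f g]
  (lazy-seq
    (cons (+ (first f) (first g))
          (ps-add (rest f) (rest g)))))

(defn ps-scale
  "Multiply every coefficient of series f by the constant c."
  [c f]
  (map #(* c %) f))

(defn ps-mul
  "Product of two infinite series, using
  (f0 + x*F1) * G = f0*g0 + x*(f0*G1 + F1*G)."
  [f g]
  (lazy-seq
    (cons (* (first f) (first g))
          (ps-add (ps-scale (first f) (rest g))
                  (ps-mul (rest f) g)))))

(def ones (repeat 1))        ; the series 1/(1-x)
(take 5 (ps-mul ones ones))  ; coefficients of 1/(1-x)^2: (1 2 3 4 5)
```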

Alex Miller

Sep 29, 2016, 9:49:29 AM
to Clojure Dev
Hi Sophia, 

Thanks for the proposal. This reminds me of a post with similar goals that was made a couple years ago:

I asked the poster there to do a bit of work in describing the problem and alternative solutions, but he did not have the time and no one else seemed to have the interest to do so. While there is occasional interest in this, I don't see custom number types as something that has huge demand, and it is not considered a priority by the core team right now. Certainly the strongest area of interest is the numerical computing segment of the community.

I think it would be interesting to see this developed further on a design page, esp one that compared the prior proposal, yours, and perhaps other approaches, but I'm not sure whether Rich would ok moving further on it towards inclusion.

Alex

adrian...@mail.yu.edu

Sep 29, 2016, 10:27:50 AM
to Clojure Dev
I've had to roll my own version of this a few times so I personally would throw my full support behind Sophia's proposal as it solves a major problem for me and I think others interested in numeric computing in Clojure. 

Sophia Gold

Sep 29, 2016, 11:06:01 PM
to Clojure Dev
Hi Alex,

Thanks for the quick reply. 

I agree with the general rationale of using platform specific math libraries except for cases where they would either diverge from Clojure's linguistic consistency or where they could leverage Clojure's core features. I haven't used core.matrix personally, but it seems like a great example of the latter case. I would argue anything involving typing should fall under the former case. As demonstrated by my own implementation, the problems with JVM specific libraries for custom numeric types are twofold:
  1. They always must defer to a base type and if this cannot be done dynamically then it diverges from the rest of Clojure. Currently, we have dynamic typing available to users yet not to developers of external numeric libraries that require custom numeric types. (It's hard to tell from how he presented the issue, but this seems to be the opposite goal of the person you linked to from two years ago as he seemed to want to implement this through excessively patching clojure.lang.Numbers to allow for a unique implementation rather than maintaining consistency with core math?)
  2. Without operator overloading I've had to rewrite a number of core math functions to use protocol methods like (add) and (sub) instead of '+ and '- and this has led to bugs in subtyping that I'm still ironing out. I didn't mention this specifically in my previous post because I don't believe it's a good idea to patch at this moment, but speaking generally it's certainly a key design consideration and something Clojure has a unique ability to offer over other languages on the JVM that would give it a huge edge when it comes to numeric computing (I'm thinking about comparisons to Fortress here...). 
It's my understanding that Clojure-in-Clojure is still the goal for JVM, just not being worked on currently (or at least not listed under future releases on the design wiki). I imagine if it's implemented through protocols, then regardless of what decisions are made, they should solve both issues above. 
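
The second problem (no operator overloading) can be illustrated with a minimal sketch of how a library namespace might shadow + with a protocol-backed version. Everything named here (the namespace example.generic-math, the Addable protocol, the Complex record) is hypothetical and is not code from the library discussed in this thread.

```clojure
(ns example.generic-math
  (:refer-clojure :exclude [+]))   ; stop referring clojure.core/+ here

;; Hypothetical protocol behind the overloaded operator.
(defprotocol Addable
  (add [a b]))

;; Built-in numbers fall through to clojure.core/+.
(extend-type java.lang.Number
  Addable
  (add [a b] (clojure.core/+ a b)))

;; A custom numeric type implements the same protocol.
;; Mixed Number/Complex addition is deliberately not handled here.
(defrecord Complex [re im]
  Addable
  (add [a b]
    (->Complex (clojure.core/+ re (:re b))
               (clojure.core/+ im (:im b)))))

;; The namespace-local + dispatches through the protocol.
(defn +
  ([] 0)
  ([a] a)
  ([a b] (add a b))
  ([a b & more] (reduce add (add a b) more)))

(+ 1 2 3)
(+ (->Complex 1 2) (->Complex 3 4))
```

The cost of this approach is the one described above: every core math function a library relies on has to be re-exported this way, and mixed-type cases need their own coercion rules.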

So the argument I'm making right now is that at some point I think you're going to want a core library devoted to numeric computing that implements custom typing (I've already had someone clone mine after asking on IRC about complex number libraries), although I'm not at all saying I'm the person to lead that effort. Some type of "forward compatibility" until Clojure gets its major overhaul and core math is implemented with protocols would greatly aid this development, especially at this moment when I'd like to see Clojure comparing benchmarks with languages like Julia.

That's the vein in which I proposed this patch: a concrete and low profile solution to one out of the two problems I see with implementing platform independent numeric libraries involving typing, where I believe there's a clear rationale for reinventing these wheels rather than using the platform dependent ones. The downsides, as noted, are that it's confusing in mixing syntax with how the core math functions are currently implemented and also that it could be seen as infringing on major design decisions much further down the road that are of course well above my pay grade.

All that said, if Rich would sign off on it as essentially guaranteeing forward compatibility, then it would have an advantage over the implementation extending java.lang.Number: whenever the switch to protocols is made, it wouldn't break any numeric libraries implemented using this constructor. The second issue of operator overloading is trivial in that regard, as custom implementations of core math functions should keep working regardless.

I do agree with your suggestion of creating a page on the design wiki about this and posting to the Numerical Clojure group to both survey solutions to this problem in other libraries and hopefully create momentum to show that there's enough interest in this type of development to merit extending the core to facilitate it, either through this patch or some other variation. Would you suggest I go about that by adding a child page here: http://dev.clojure.org/display/design/Math? I'm thinking then I could post it to the numerical group and circle back if/when I've gathered enough interest to support bringing this patch to Rich.

Thanks,
Sophia

Mikera

Oct 3, 2016, 10:25:27 PM
to Clojure Dev
Hi Sophia,

Interesting ideas! 

You might want to check out expresso: https://github.com/clojure-numerics/expresso - This was a very interesting Google Summer of Code project that expressed symbolic mathematical types (polynomials, arbitrary expressions etc.) in Clojure and allowed symbolic manipulation, e.g. doing differentiation. A lot of the underlying code is implemented using core.logic, so it works as a general-purpose expression solver.

Clojure is actually a pretty great language for numerics: it is important to remember that for numerical performance in any language the biggest gains come from combining numerical values into larger arrays. The cost of dynamic protocol dispatch (around 15ns last time I checked) becomes irrelevant if you amortise that overhead over a million doubles. And the underlying operations themselves can be implemented using optimised Java / native code / GPU as desired for your target platform. That's basically the whole point of the core.matrix design: nice dynamic functions as a user level API, but the ability to plug in an optimised implementation underneath.
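
The amortisation argument can be sketched in a few lines. This is an illustration only, not core.matrix code: the Summable protocol and its total function are hypothetical names. The point is that the protocol dispatch happens once per call, and the work over the array then runs as a primitive loop.

```clojure
;; Hypothetical protocol: one dynamic dispatch per call to total.
(defprotocol Summable
  (total [xs] "Sum of all elements in xs."))

;; Extend the protocol to the JVM class of double[].
;; After the single dispatch, areduce runs a primitive loop over the array.
(extend (Class/forName "[D")
  Summable
  {:total (fn [^doubles xs]
            (areduce xs i acc 0.0 (+ acc (aget xs i))))})

(total (double-array [1.0 2.0 3.0]))
```

With a million elements the dispatch is still paid once, whereas a per-element protocol call such as a scalar add would pay it for every element.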

I would be hesitant about trying to change the core language with respect to numeric types. Unless you are going to be able to match primitive math performance for single operations (which certainly rules out protocols), it would be a pretty big performance regression, which many (myself included) would probably be opposed to. If we were to revamp the Clojure numeric tower, I think that "more dynamic" is a no-go; we would need to find ways to make the compiler smarter so that it can infer the right typed numeric operations. As a library though, I think a more dynamic approach is fine.

One other thing you may want to look at is the set of core.matrix protocols, which include all the numeric operations you have defined above (and many more). The core.matrix protocols are designed to work on N-dimensional arrays at once, but there is no reason that you couldn't create a 0D-specific implementation that handles arbitrary scalar numeric types in a dynamic way. This would be optionally extensible to higher dimension arrays. https://github.com/mikera/core.matrix.complex does this for complex numbers, for example. 

Happy to collaborate in any of this, I am mostly interested in high performance numeric (double) arrays and general purpose data processing, but always keen to help develop the Clojure numerical computing ecosystem.

Sophia Gold

Oct 4, 2016, 7:19:11 PM
to Clojure Dev
Hi Mikera,

Thanks so much for responding. I hadn't had time for any due diligence on this yet, so looks like you did much of it for me! I'm definitely going to look into how both expresso and core.matrix solve both problems I noted: dynamic base types and operator overloading. 

I also want to clarify it was never my intention in proposing this patch to either position the library I was working on when the idea came to me as anything like a contrib library for symbolic algebra (it's really more of an exercise that's at toy level at this point), nor to propose any major overhauls on Clojure's numeric tower. The latter perhaps was confusing as I referenced future overhauls I had heard were coming at some point unknown at which the entire frontend would be rewritten using protocols similar to how ClojureScript is implemented. However, it seems you're saying that from a numeric perspective protocols are much too slow unless you use some method of vectorizing them for SIMD computation? In that case, you're quite a ways ahead of what I was thinking :)

Regardless, I thought the seven line patch would speak for itself, but the idea was primarily to make Clojure's existing dynamic numeric typing available to library developers in the same manner as it's available to users via the casting function (num), just by providing a constructor. It seemed like a small change to me as it wouldn't affect existing core math functionality at all, but perhaps if core.matrix is being introduced as a core library we should instead explore whether using it for custom scalar types as well is a better option and should be listed as a best practice. I'll see if it's worth forking Clojure to benchmark my patch vs. core.matrix, although if there's any comparison whatsoever then existing library functionality vs. modifying the core language would surely decide it on an organizational level.

I have to say, I'm so glad to hear you sing the praises of Clojure for numerical computing, and using concrete examples. There's this odd perception that it's a great language for big data but too slow for high performance math, which I think is crucial to correct ASAP by instituting best practices for this sort of work language-wide. My opinion tends to be that, as opposed to systems programming, numerical computing is oftentimes more fungible in terms of the parameters for problem solving. The flexibility of transducers is a great example of this that hasn't been fully explored, and it's the one that really won me over when running speed tests on certain problems. I think we're all smart enough to know the limited reliability of benchmarks, so if Clojure were to demonstrate just being in the realm of raw speed of languages like Python and Julia, then I think it has many advantageous features over them when it comes to cost-cutting on what may have previously been seen as rigid problems. And then any issues with Lisp syntax should be offset by the unparalleled advantage when it comes to writing DSLs.

Anyway, that's where I'd like to come in: I think the more high quality examples we can put out there to demonstrate Clojure's viability for numeric computing the better. I'm working on a couple different small projects at the moment and would love to share them as well as seeing what I can come up with working with core.matrix. I'll update this post if I do end up seeing any viability in that patch compared to the implementations you pointed me to, but otherwise probably shift further discussion to the Numerical Clojure group.

Sophia