New Matrix Multiplication benchmarks - Neanderthal up to 60 times faster than core.matrix with Vectorz


Dragan Djuric

unread,
Mar 13, 2016, 11:34:23 AM3/13/16
to Clojure

I am soon going to release a new version of Neanderthal.

I reinstalled ATLAS, so I decided to also update benchmarks with threaded ATLAS bindings.

The results are still the same for doubles on one core: 10x faster than Vectorz and 2x faster than JBlas.

The page now covers more cases: multicore ATLAS (on my 4-core i7-4790K) and floats. With multi-threaded ATLAS and floats, Neanderthal is 60 times faster than Vectorz (core.matrix).

For the rest of the results, please follow the link. This will work with older versions of Neanderthal.


https://www.reddit.com/r/Clojure/comments/4a8o9n/new_matrix_multiplication_benchmarks_neanderthal/

Mikera

unread,
Mar 13, 2016, 7:28:24 PM3/13/16
to Clojure
It would be great if Neanderthal simply implemented the core.matrix protocols; then people could use it as a core.matrix implementation in situations where it makes sense. I really think it is an architectural dead end for Neanderthal to develop a separate API. You'll simply get fewer users for Neanderthal and fragment the Clojure library ecosystem, which doesn't help anyone.

In the absence of that, we'll just need to develop separate BLAS implementations for core.matrix. 

It would be great if you could implement the core.matrix protocols and solve this issue. It really isn't much work; I'd even be happy to do it myself if Neanderthal worked on Windows (last time I tried, it didn't).

Dragan Djuric

unread,
Mar 13, 2016, 8:19:25 PM3/13/16
to Clojure
On Monday, March 14, 2016 at 12:28:24 AM UTC+1, Mikera wrote:
It would be great if Neanderthal simply implemented the core.matrix protocols; then people could use it as a core.matrix implementation in situations where it makes sense. I really think it is an architectural dead end for Neanderthal to develop a separate API. You'll simply get fewer users for Neanderthal and fragment the Clojure library ecosystem, which doesn't help anyone.

Mike, I have explained many times in detail what's wrong with core.matrix, and I find it a bit funny that you jump in every time Neanderthal is mentioned with the same dreams about core.matrix, without even trying Neanderthal or discussing the issues I raised. Every time, your answer is that core.matrix is fine for *YOUR* use cases. That's fine with me and I support your choice, but core.matrix fell short for *MY* use cases, and after detailed inspection I decided it was unsalvageable. If I thought I could improve it, that would have been easier for me than spending my time fiddling with JNI and GPU minutes.

I understand your emotions about core.matrix, and I empathize with you. I support your contributions to the Clojure open-source space, and am glad that core.matrix is a fine solution for a number of people. Please also understand that it is not a solution to every problem, and that it can also be an obstacle when it falls short of a challenge.
 
In the absence of that, we'll just need to develop separate BLAS implementations for core.matrix. 

I support you. If you do a good job, I might even learn something now and improve Neanderthal.
 
It would be great if you could implement the core.matrix protocols and solve this issue. It really isn't much work; I'd even be happy to do it myself if Neanderthal worked on Windows (last time I tried, it didn't).

I am happy that it is not much work, since it will be easy for you or someone else to implement it ;) Contrary to what you said on Slack, I am *not against it*. I have said that many times. Go for it. The only thing I said is that *I* do not have time for it, nor do I have any use for core.matrix.

Regarding Windows: Neanderthal works on Windows. I know this because a student of mine compiled it (he's experimenting with an alternative GPU backend for Neanderthal and prefers to work on Windows). As I explained in the issue you raised on GitHub last year, you have to install ATLAS on your machine, and Neanderthal has nothing un-Windowsy in its code. There is nothing Neanderthal-specific there; it is all about compiling ATLAS. Follow any ATLAS, NumPy + ATLAS, or R + ATLAS guide for instructions. Many people have done that installation, so I doubt it would be a real obstacle for you.

Mikera

unread,
Mar 13, 2016, 11:18:19 PM3/13/16
to Clojure
On Monday, 14 March 2016 08:19:25 UTC+8, Dragan Djuric wrote:
On Monday, March 14, 2016 at 12:28:24 AM UTC+1, Mikera wrote:
It would be great if Neanderthal simply implemented the core.matrix protocols; then people could use it as a core.matrix implementation in situations where it makes sense. I really think it is an architectural dead end for Neanderthal to develop a separate API. You'll simply get fewer users for Neanderthal and fragment the Clojure library ecosystem, which doesn't help anyone.

Mike, I have explained many times in detail what's wrong with core.matrix, and I find it a bit funny that you jump in every time Neanderthal is mentioned with the same dreams about core.matrix, without even trying Neanderthal or discussing the issues I raised. Every time, your answer is that core.matrix is fine for *YOUR* use cases. That's fine with me and I support your choice, but core.matrix fell short for *MY* use cases, and after detailed inspection I decided it was unsalvageable. If I thought I could improve it, that would have been easier for me than spending my time fiddling with JNI and GPU minutes.

It would be great if you could explain this statement. What *precisely* are your technical objections?

Please remember that core.matrix is primarily intended as an API, not a matrix implementation itself. The point is that different matrix implementations can implement the standard protocols, and users and library writers can then code to a standard API while maintaining flexibility to use the implementation that best suits their use cases (of which Neanderthal could certainly be one).
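This separation of API and implementation can be sketched in a few lines. This is an illustrative sketch using the public core.matrix API (the `:vectorz` keyword assumes the vectorz-clj dependency is on the classpath; `gram-matrix` is a hypothetical user function, not part of any library):

```clojure
;; Library code is written purely against the core.matrix API and never
;; names a concrete implementation:
(require '[clojure.core.matrix :as m])

(defn gram-matrix
  "Returns A^T * A, using whatever implementation backs the arguments."
  [a]
  (m/mmul (m/transpose a) a))

;; The user selects the backing implementation once, at the edges:
(m/set-current-implementation :vectorz)

(gram-matrix (m/array [[1 2] [3 4]]))
```

The same `gram-matrix` would then run unchanged on any other implementation that registers itself with core.matrix.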
 

I understand your emotions about core.matrix, and I empathize with you. I support your contributions to the Clojure open-source space, and am glad that core.matrix is a fine solution for a number of people. Please also understand that it is not a solution to every problem, and that it can also be an obstacle when it falls short of a challenge.

Interested to understand that statement. Please let me know what use cases you think don't work for core.matrix. A lot of people have worked on the API to make it suitable for a large class of problems, so I'm interested to know if there are any we have missed. 

For any point you have here, I'm happy to either:
a) Explain how it *does* work
b) Take it as an issue to address in the near future.
 
 
In the absence of that, we'll just need to develop separate BLAS implementations for core.matrix. 

I support you. If you do a good job, I might even learn something now and improve Neanderthal.
 
It would be great if you could implement the core.matrix protocols and solve this issue. It really isn't much work; I'd even be happy to do it myself if Neanderthal worked on Windows (last time I tried, it didn't).

I am happy that it is not much work, since it will be easy for you or someone else to implement it ;) Contrary to what you said on Slack, I am *not against it*. I have said that many times. Go for it. The only thing I said is that *I* do not have time for it, nor do I have any use for core.matrix.

Regarding Windows: Neanderthal works on Windows. I know this because a student of mine compiled it (he's experimenting with an alternative GPU backend for Neanderthal and prefers to work on Windows). As I explained in the issue you raised on GitHub last year, you have to install ATLAS on your machine, and Neanderthal has nothing un-Windowsy in its code. There is nothing Neanderthal-specific there; it is all about compiling ATLAS. Follow any ATLAS, NumPy + ATLAS, or R + ATLAS guide for instructions. Many people have done that installation, so I doubt it would be a real obstacle for you.

Every time I have tried, it has failed on my machine. I'm probably doing something wrong, but it certainly isn't obvious how to fix it. Can you point me to a canonical guide and a binary distribution that works "out of the box"?

Leif

unread,
Mar 14, 2016, 1:49:36 AM3/14/16
to Clojure

I also think the core.matrix crusade is unnecessary.  Here are my two cents:

1. No one jumps in every time someone writes a new web routing library saying "No!  You'll fragment the clojure
    web routing community!  Use compojure!"  I think we should pick and choose based on our needs.
2. We should build on well-tested, stable APIs, true.  Dragan has built on top of BLAS (basically stable for over
    35 years) and LAPACK (25-35 years depending on how you count).
3. Dragan has no need or desire to implement the core.matrix API, but I'm sure someone that wants native speed
    and compatibility with core.matrix will do so when the need arises.
4. If you want accessibility to the widest possible community of numerical programmers, bear in mind that most
    of them aren't clojure or even java programmers anyway.  BLAS and LAPACK are the way to make them feel
    at home.  Pure-java numerical programming is a rather strange cul-de-sac community already.
5. Numerical programming is a field where you, unfortunately, have to drop layers of abstraction more
    frequently than other fields.  I'd rather drop down into ATLAS than <<insert random java backend with its own
    matrix class here>>.

In short, having two libraries that do the same thing is not a problem, and if it becomes a problem, I think we as a community can deal with it fairly quickly.

--Leif

Mikera

unread,
Mar 14, 2016, 3:34:56 AM3/14/16
to Clojure
On Monday, 14 March 2016 13:49:36 UTC+8, Leif wrote:

I also think the core.matrix crusade is unnecessary.  Here are my two cents:

1. No one jumps in every time someone writes a new web routing library saying "No!  You'll fragment the clojure
    web routing community!  Use compojure!"  I think we should pick and choose based on our needs.

Hi Leif,

Absolutely agree that we should choose libraries based on our needs.

However there is huge value to having standard abstractions that everyone can code to if you want to create an ecosystem of compatible tools. Consider these examples:
a) The Clojure "sequence" abstraction
b) The Ring handler / middleware abstraction
c) Transducers

All of these present a common abstraction that can support many different implementations and use cases. core.matrix does exactly the same for array programming.

If someone reinvented transducers with a completely different API / syntax but a clever new implementation (e.g. for distributed parallel processing or something like that) then people would quite rightly say things like "why don't you just make that a transducer? It would be much better if we can use that with all our existing transducer code". 

All I'm really asking is that people use the core.matrix abstractions when doing numerical work, so we can all interoperate much more smoothly. For individual end users it isn't so much of an issue; however, it is a *big* problem if tools / higher-level libraries start adopting different APIs and representations. Then it isn't "choose the right library for your needs" but "choose between different competing stacks of libraries that don't interoperate". That's when fragmentation becomes a real problem. If you are interested in this topic, it is well worth reading about "The Lisp Curse".

2. We should build on well-tested, stable APIs, true.  Dragan has built on top of BLAS (basically stable for over
    35 years) and LAPACK (25-35 years depending on how you count).

There is a set of BLAS-like API functions in core.matrix already. See: https://github.com/mikera/core.matrix/blob/develop/src/main/clojure/clojure/core/matrix/blas.cljc

Having said that, I don't personally think the BLAS API is a particularly good fit for Clojure (it depends on mutability, and I think it is a pretty clumsy design by modern API standards). But if you simply want to copy the syntax, it's certainly trivial to do in core.matrix.

Note that none of these (core.matrix blas namespace, Neanderthal) are actually real BLAS (which is a native API that you can't call directly from Clojure). They are all just wrappers that adopt a similar syntax and delegate to the underlying implementation after a bit of parameter manipulation / data marshalling.
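As a rough illustration of that delegation pattern, a BLAS-flavoured wrapper over core.matrix might look like the sketch below (the function names here are illustrative, not necessarily the actual contents of blas.cljc):

```clojure
(require '[clojure.core.matrix :as m])

;; BLAS-style destructive update: y := a*x + y.
;; Delegates to core.matrix, which in turn dispatches to whatever
;; implementation backs y (which must be mutable).
(defn axpy! [a x y]
  (m/add-scaled! y x a))

;; BLAS gemm-flavoured multiply, delegating to the current implementation.
(defn gemm [a b]
  (m/mmul a b))
```

The "wrapper" nature is visible here: all the numeric work happens in whichever implementation the arguments belong to; the BLAS-like layer only adapts names and argument order.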
 
3. Dragan has no need or desire to implement the core.matrix API, but I'm sure someone that wants native speed
    and compatibility with core.matrix will do so when the need arises.
4. If you want accessibility to the widest possible community of numerical programmers, bear in mind that most
    of them aren't clojure or even java programmers anyway.  BLAS and LAPACK are the way to make them feel
    at home.  Pure-java numerical programming is a rather strange cul-de-sac community already.
5. Numerical programming is a field where you, unfortunately, have to drop layers of abstraction more
    frequently than other fields.  I'd rather drop down into ATLAS than <<insert random java backend with its own
    matrix class here>>.

In short, having two libraries that do the same thing is not a problem, and if it becomes a problem, I think we as a community can deal with it fairly quickly.

--Leif

An important point to note is that they don't do the same thing at all: core.matrix is an API providing a general-purpose array programming abstraction with pluggable implementation support. Neanderthal is a specific implementation tied to native BLAS/ATLAS. They should ideally work in harmony, not be seen as alternatives.

Neanderthal is more closely comparable to Vectorz, which *is* a matrix implementation (and I think it matches or beats Neanderthal in performance for virtually every operation *apart* from large matrix multiplication, for which ATLAS is obviously fantastic).

Ultimately, I'm trying to encourage people to work towards an ecosystem for Clojure(Script) that rivals the Python or R ecosystems. It's still going to take a fair bit of work to get there, but is perfectly feasible. I think we need:
a) core.matrix as a common abstraction that people can use to develop user level code as well as higher level libraries (environments like Incanter, data processing tools, deep learning etc.)
b) A really good pure JVM implementation (currently Vectorz / vectorz-clj is great for this)
c) A really good native implementation backed by BLAS/ATLAS (Clatrix currently works, but this could be Neanderthal)
d) A really good ClojureScript implementation (I know some folks have got core.matrix working in ClojureScript with thi-ng/ndarray which could be a good option)
e) Some other specialised implementations for specific use cases (e.g. for interfacing with Spark)

If anyone has other ideas / a better strategy I'd love to hear, and I welcome a good constructive debate. I'm not precious about any of my own contributions. But I do genuinely think this is the best way forward for Clojure data science overall, based on where we are right now.

Dragan Djuric

unread,
Mar 14, 2016, 5:20:01 AM3/14/16
to Clojure

Please remember that core.matrix is primarily intended as an API, not a matrix implementation itself. The point is that different matrix implementations can implement the standard protocols, and users and library writers can then code to a standard API while maintaining flexibility to use the implementation that best suits their use cases (of which Neanderthal could certainly be one). 

Exactly the same could be said about Neanderthal. It also has an abstract API that could be implemented quite flexibly, and is even better than core.matrix (IMO, of course, so I won't try to argue the point, since it is a matter of personal preferences and needs, and arguing about that leads nowhere).
 

I understand your emotions about core.matrix, and I empathize with you. I support your contributions to the Clojure open-source space, and am glad that core.matrix is a fine solution for a number of people. Please also understand that it is not a solution to every problem, and that it can also be an obstacle when it falls short of a challenge.

Interested to understand that statement. Please let me know what use cases you think don't work for core.matrix. A lot of people have worked on the API to make it suitable for a large class of problems, so I'm interested to know if there are any we have missed. 

For any point you have here, I'm happy to either:
a) Explain how it *does* work
b) Take it as an issue to address in the near future.

I am not saying that core.matrix is a bad API. I just think BLAS is even more mature and battle-tested.

 
 
 
In the absence of that, we'll just need to develop separate BLAS implementations for core.matrix. 

I support you. If you do a good job, I might even learn something now and improve Neanderthal.
 
It would be great if you could implement the core.matrix protocols and solve this issue. It really isn't much work; I'd even be happy to do it myself if Neanderthal worked on Windows (last time I tried, it didn't).

I am happy that it is not much work, since it will be easy for you or someone else to implement it ;) Contrary to what you said on Slack, I am *not against it*. I have said that many times. Go for it. The only thing I said is that *I* do not have time for it, nor do I have any use for core.matrix.

Regarding Windows: Neanderthal works on Windows. I know this because a student of mine compiled it (he's experimenting with an alternative GPU backend for Neanderthal and prefers to work on Windows). As I explained in the issue you raised on GitHub last year, you have to install ATLAS on your machine, and Neanderthal has nothing un-Windowsy in its code. There is nothing Neanderthal-specific there; it is all about compiling ATLAS. Follow any ATLAS, NumPy + ATLAS, or R + ATLAS guide for instructions. Many people have done that installation, so I doubt it would be a real obstacle for you.

Every time I have tried, it has failed on my machine. I'm probably doing something wrong, but it certainly isn't obvious how to fix it. Can you point me to a canonical guide and a binary distribution that works "out of the box"?

Googling "numpy atlas install windows" gave me thousands of results, here is the first: http://www.scipy.org/scipylib/building/windows.html#atlas-and-lapack 

Dragan Djuric

unread,
Mar 14, 2016, 6:21:15 AM3/14/16
to Clojure


There is a set of BLAS-like API functions in core.matrix already. See: https://github.com/mikera/core.matrix/blob/develop/src/main/clojure/clojure/core/matrix/blas.cljc

GitHub history says they were added 7 days ago. Never mind that they just delegate, so the only BLAS-y thing is the 4 method names taken out of Neanderthal (BLAS has a bit more stuff than that), but why did you reinvent the wheel instead of just creating a core.matrix (or Vectorz) implementation of Neanderthal's API?
 
Having said that, I don't personally think the BLAS API is a particularly good fit for Clojure (it depends on mutability, and I think it is a pretty clumsy design by modern API standards). But if you simply want to copy the syntax, it's certainly trivial to do in core.matrix.

If you look at Neanderthal's API, you'll see that I took great care to make it fit into Clojure, at which I think I succeeded.
Regarding mutability:
1) Neanderthal provides both mutable and pure functions
2) Trying to do numeric computing without mutability (and primitives) for anything other than toy problems is... well, sometimes it is better to plant a Sequoia seed, wait for the tree to grow, cut it down, make an abacus, and compute with that...
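For readers unfamiliar with Neanderthal, the mutable/pure pairing mentioned in point 1) follows the usual Clojure bang convention. This is a sketch based on Neanderthal's documented API (it assumes the native ATLAS backend is installed; `dv` constructs a native double vector):

```clojure
(require '[uncomplicate.neanderthal.core :refer [axpy axpy!]]
         '[uncomplicate.neanderthal.native :refer [dv]])

(let [x (dv 1 2 3)
      y (dv 10 20 30)]
  ;; Pure variant: allocates and returns a new vector; y is untouched.
  (axpy 2.0 x y)
  ;; Destructive variant (note the !): y := 2.0*x + y, no allocation,
  ;; which is what tight numeric loops need.
  (axpy! 2.0 x y))
```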


An important point to note is that they don't do the same thing at all: core.matrix is an API providing a general-purpose array programming abstraction with pluggable implementation support. Neanderthal is a specific implementation tied to native BLAS/ATLAS. They should ideally work in harmony, not be seen as alternatives.

*Neanderthal has an agnostic API and is not in any way tied to BLAS/ATLAS.*
Neanderthal also has pluggable implementation support, and it already provides two high-performance implementations that elegantly unify two very different *hardware* platforms: CPU and GPU. And it does it quite transparently (more about that can be read here: http://neanderthal.uncomplicate.org/articles/tutorial_opencl.html)

Neanderthal is more closely comparable to Vectorz, which *is* a matrix implementation (and I think it matches or beats Neanderthal in performance for virtually every operation *apart* from large matrix multiplication, for which ATLAS is obviously fantastic).
 
You say that without having tried it. I have tried it, and *Neanderthal is faster for virtually ALL operations, even 1D*. Yesterday I did a quick measurement of asum (a 1D vector operation), for example, and Neanderthal was, if I remember correctly, *9x faster than Vectorz at that simple summing*.

I even pointed out to you that Neanderthal was faster in all those cases when you raised that argument last time, but you seem to have ignored it.
 
If anyone has other ideas / a better strategy I'd love to hear, and I welcome a good constructive debate. I'm not precious about any of my own contributions. But I do genuinely think this is the best way forward for Clojure data science overall, based on where we are right now.

I would like to propose a strategy where more love is given to the actual libraries that solve actual problems (Incanter is rather ailing and stagnant, IMO), instead of trying to unify what does not exist (yet!). Then people will use what works best, and what does not work will not matter. That's how things go in open source...
 

Mikera

unread,
Mar 14, 2016, 7:09:23 AM3/14/16
to Clojure
On Monday, 14 March 2016 18:21:15 UTC+8, Dragan Djuric wrote:


There is a set of BLAS-like API functions in core.matrix already. See: https://github.com/mikera/core.matrix/blob/develop/src/main/clojure/clojure/core/matrix/blas.cljc

GitHub history says they were added 7 days ago. Never mind that they just delegate, so the only BLAS-y thing is the 4 method names taken out of Neanderthal (BLAS has a bit more stuff than that), but why did you reinvent the wheel instead of just creating a core.matrix (or Vectorz) implementation of Neanderthal's API?

The core.matrix API has been around a lot longer than Neanderthal, and presents an abstract API rather than being tied to any specific implementations. If there is any wheel-reinventing going on, it is Neanderthal :-)

The correct way to build these things is IMHO:
- User / library code depends on an implementation-agnostic abstraction (i.e. the core.matrix API)
- The user optionally specifies an implementation (or makes do with the default)
- The underlying implementation handles how that call gets executed (Neanderthal, vectorz-clj, Clatrix etc.)

It wouldn't make any sense to have core.matrix (the abstraction) depend on Neanderthal (the implementation) because that defeats the whole purpose of an implementation-agnostic API. And a hard dependency would rule the whole library out for any users who can't even run that particular implementation (including myself, due to lack of Windows support).
 
 
Having said that, I don't personally think the BLAS API is a particularly good fit for Clojure (it depends on mutability, and I think it is a pretty clumsy design by modern API standards). But if you simply want to copy the syntax, it's certainly trivial to do in core.matrix.

If you look at Neanderthal's API, you'll see that I took great care to make it fit into Clojure, at which I think I succeeded.
Regarding mutability:
1) Neanderthal provides both mutable and pure functions
2) Trying to do numeric computing without mutability (and primitives) for anything other than toy problems is... well, sometimes it is better to plant a Sequoia seed, wait for the tree to grow, cut it down, make an abacus, and compute with that...

Right, I agree you need both mutable and immutable options. That's why core.matrix provides both. But why are you pushing a new API rather than adopting the core.matrix one (or sending me PRs if you think it could be improved)?

Point 2) I disagree with. Most real-world data science is about data manipulation and transformation, not raw computation. 1% of people need to optimise the hell out of a specific algorithm for a specific use case; 99% just want convenient tools and the ability to get an answer "fast enough".
 


An important point to note is that they don't do the same thing at all: core.matrix is an API providing a general-purpose array programming abstraction with pluggable implementation support. Neanderthal is a specific implementation tied to native BLAS/ATLAS. They should ideally work in harmony, not be seen as alternatives.

*Neanderthal has an agnostic API and is not in any way tied to BLAS/ATLAS.*
Neanderthal also has pluggable implementation support, and it already provides two high-performance implementations that elegantly unify two very different *hardware* platforms: CPU and GPU. And it does it quite transparently (more about that can be read here: http://neanderthal.uncomplicate.org/articles/tutorial_opencl.html)

OK, so you agree you want an abstract API with pluggable implementations... but that is exactly what the core.matrix API does!

But if you only offer native and GPU dependencies, then you aren't really offering an agnostic API. It won't even work on my machine... Could you add Windows support? Maybe you want to add pure JVM implementations as well? And for Clojure data structures? What about ClojureScript support? Sparse arrays? Oh, could you maybe support n-dimensional arrays? Datasets?

If you go down that path you will simply be reinventing core.matrix, and you will eventually come to realise why a lot of the core.matrix design decisions actually make sense. Believe me, I've been down that road :-)
 

Neanderthal is more closely comparable to Vectorz, which *is* a matrix implementation (and I think it matches or beats Neanderthal in performance for virtually every operation *apart* from large matrix multiplication, for which ATLAS is obviously fantastic).
 
You say that without having tried it. I have tried it, and *Neanderthal is faster for virtually ALL operations, even 1D*. Yesterday I did a quick measurement of asum (a 1D vector operation), for example, and Neanderthal was, if I remember correctly, *9x faster than Vectorz at that simple summing*.

I even pointed out to you that Neanderthal was faster in all those cases when you raised that argument last time, but you seem to have ignored it.

Interesting. I don't recall you posting such benchmarks, apologies if I missed these.

I'd be happy to benchmark it myself if I could get Neanderthal to build. I don't believe 9x, though... this operation is usually bound by memory bandwidth, IIRC.
 
 
If anyone has other ideas / a better strategy I'd love to hear, and I welcome a good constructive debate. I'm not precious about any of my own contributions. But I do genuinely think this is the best way forward for Clojure data science overall, based on where we are right now.

I would like to propose a strategy where more love is given to the actual libraries that solve actual problems (Incanter is rather ailing and stagnant, IMO), instead of trying to unify what does not exist (yet!). Then people will use what works best, and what does not work will not matter. That's how things go in open source...
 

core.matrix already exists, is widely used and already unifies several different implementations that cover a wide variety of use cases. It provides an extensible toolkit that can be used either directly or by library / tool implementers. It's really very powerful, and it's solving real problems for a lot of people right now. It has the potential to make Clojure one of the best languages for data science.

Don't get me wrong, I think Neanderthal is a great implementation with a lot of good ideas. I'd just like to see it work well *with* the core.matrix API, not be presented as an alternative. The Clojure data science ecosystem as a whole will benefit if we can make that work.

Dragan Djuric

unread,
Mar 14, 2016, 11:19:18 AM3/14/16
to Clojure

Point 2) I disagree with. Most real-world data science is about data manipulation and transformation, not raw computation. 1% of people need to optimise the hell out of a specific algorithm for a specific use case; 99% just want convenient tools and the ability to get an answer "fast enough".

And most of those people are quite happy with R and Python and don't care about Clojure. I, on the other hand, am perhaps in the other 1%, but my needs are also valid (IMHO), especially when I am prepared to do the required work to satisfy them myself.
 
But if you only offer native and GPU dependencies, then you aren't really offering an agnostic API. It won't even work on my machine... Could you add Windows support? Maybe you want to add pure JVM implementations as well? And for Clojure data structures? What about ClojureScript support? Sparse arrays? Oh, could you maybe support n-dimensional arrays? Datasets?

Native and GPU are already available *now* and work well. Windows support is there; you (or someone else) need to compile the Windows binaries and contribute them. A pure JVM implementation is quite easy to add, since the majority of Neanderthal is not only pure Java, but Clojure; the only thing I haven't implemented (yet) is a pure-Java BLAS engine. I already wrote a (much harder) GPU BLAS implementation from scratch, so compared to that, a pure-Clojure BLAS is like a walk in the park. If/when I need it, the infrastructure is already there to support it quite easily, without API changes. ClojureScript support is more complex, but mostly a matter of putting work-weeks into it rather than a technical challenge, since by now I know what causes most of the problems and how to solve them. Of course, I won't do that just as an exercise, but only when the need arises (if ever). Of course, I'll welcome quality PRs.
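As an illustration of the kind of routine such a pure-JVM BLAS engine would need, here is a naive sketch of asum (the sum of absolute values discussed elsewhere in this thread) written with primitive type hints over a double array. This is not Neanderthal's actual code, just a demonstration that the JVM side of such an engine is plain loop-and-accumulate Clojure:

```clojure
;; Naive pure-Clojure asum over a primitive double array.
;; Type hints (^doubles, ^double) keep the loop unboxed.
(defn asum ^double [^doubles xs]
  (let [n (alength xs)]
    (loop [i 0, acc 0.0]
      (if (< i n)
        (recur (unchecked-inc i) (+ acc (Math/abs (aget xs i))))
        acc))))

(asum (double-array [1.0 -2.0 3.0]))  ;; => 6.0
```

A real engine would of course add striding, blocking, and the rest of the BLAS Level 1/2/3 surface, but the point stands: nothing here requires native code, only care with primitives.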
 
Interesting. I don't recall you posting such benchmarks, apologies if I missed these.

I'd be happy to benchmark it myself if I could get Neanderthal to build. I don't believe 9x, though... this operation is usually bound by memory bandwidth, IIRC.

OK, to be more precise, it is around 8.5x with floats. With double precision (which I do not need, and probably neither do you in your work on deep learning), it is "just" 4x :)
 
 
 
If anyone has other ideas / a better strategy I'd love to hear, and I welcome a good constructive debate. I'm not precious about any of my own contributions. But I do genuinely think this is the best way forward for Clojure data science overall, based on where we are right now.

I would like to propose a strategy where more love is given to the actual libraries that solve actual problems (Incanter is rather ailing and stagnant, IMO), instead of trying to unify what does not exist (yet!). Then people will use what works best, and what does not work will not matter. That's how things go in open source...
 

core.matrix already exists, is widely used and already unifies several different implementations that cover a wide variety of use cases. It provides an extensible toolkit that can be used either directly or by library / tool implementers. It's really very powerful, and it's solving real problems for a lot of people right now. It has the potential to make Clojure one of the best languages for data science.

I agree that this sounds great, but come on... it sounds like a marketing pitch. Clojure currently doesn't offer a single data science library that would be even a distant match for the state of the art. Nice toys, sure. I hope you didn't mean Incanter?...
 
Don't get me wrong, I think Neanderthal is a great implementation with a lot of good ideas. I'd just like to see it work well *with* the core.matrix API, not be presented as an alternative. The Clojure data science ecosystem as a whole will benefit if we can make that work.

That is up to the people who would find that useful. I support them.

BTW, each Neanderthal structure is a sequence, so, technically, it already supports core.matrix.
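To make the claim above concrete, here is a minimal sketch (my illustration, not code from the thread). It assumes Neanderthal's `uncomplicate.neanderthal.native` namespace with its `dv` constructor, and that, as stated, the resulting structures are seqable:

```clojure
;; Hedged sketch: requires Neanderthal (and its native ATLAS binaries)
;; on the classpath; `dv` builds a native double vector.
(require '[uncomplicate.neanderthal.native :refer [dv]])

(def v (dv 1 2 3))

;; If Neanderthal vectors are sequences, plain Clojure seq functions
;; should work on them directly:
(reduce + v)        ; sums the entries as doubles
(map inc (seq v))   ; lazily increments each entry
```

That seq view is enough for generic Clojure code to consume Neanderthal data, though a full core.matrix implementation would still need the protocols implemented for efficiency.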

BTW, I wish only the best to core.matrix, and I think you do great work with it. Also, I suppose we are leading a technical discussion here, so I am not taking you the wrong way, and I appreciate your appreciation of Neanderthal's technical aspects.

Timothy Baldridge

unread,
Mar 14, 2016, 11:56:19 AM
to clo...@googlegroups.com
Just a side comment, Dragan, if you don't want to be compared against some other tech, it might be wise to not make the subtitle of a release "X times faster than Y". Make the defining feature of a release that it's better than some other tech, and the proponents of that tech will probably start to get a bit irritated. 

And I think the distinction made here about floats vs doubles is very important. We can all construct benchmarks that show tech X being faster than tech Y. But if I am evaluating those things I really want to know why X is faster than Y. I want to know what tradeoffs were made, what restrictions I have in using the faster tech, etc. Free performance gains are very rare, so the moment I see "we are X times faster" I immediately start looking for a caveats section or a rationale on why something is faster. Not seeing that, or benchmark documentation, I almost immediately consider the perf numbers to be hyperbole. 

I know practically nothing about either one of these projects, so I'll stop commenting here, but comparisons of tech as tag-lines is rarely a good idea. 

Timothy

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Dragan Djuric

unread,
Mar 14, 2016, 12:23:21 PM
to Clojure


On Monday, March 14, 2016 at 4:56:19 PM UTC+1, tbc++ wrote:
Just a side comment, Dragan, if you don't want to be compared against some other tech, it might be wise to not make the subtitle of a release "X times faster than Y". Make the defining feature of a release that it's better than some other tech, and the proponents of that tech will probably start to get a bit irritated. 

I agree with you, and I am not against Neanderthal being compared to Vectorz or core.matrix, or Clatrix, or Y. I only commented that it is funny that every time Neanderthal is mentioned anywhere, Mike jumps in with the comment that I should step away from the heretic work immediately and convert Neanderthal to the only true core.matrix API. I am sorry if it came out as if I was offended by that, *which I am not*.
 
And I think the distinction made here about floats vs doubles is very important. We can all construct benchmarks that show tech X being faster than tech Y. But if I am evaluating those things I really want to know why X is faster than Y. I want to know what tradeoffs were made, what restrictions I have in using the faster tech, etc. Free performance gains are very rare, so the moment I see "we are X times faster" I immediately start looking for a caveats section or a rationale on why something is faster. Not seeing that, or benchmark documentation, I almost immediately consider the perf numbers to be hyperbole. 


Sure.
1) The benchmark page links to the benchmark source code on GitHub, and all 3 projects are open source, so I expect any potential user would be wise enough to evaluate the libraries himself/herself. I took care to mention drawbacks more often than is usual. I do not want to give people false hope.
2) The reason I am posting the benchmarks is mainly that in Java (and Clojure) land there is much superstition about what can be done and what performance to expect. People read some comment somewhere and tend either to expect superspeed for free (and be disappointed when their naive approach doesn't work well) or to dismiss an approach because "it can't be done" or "calling the native library is slow" (it isn't, if done right).
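For readers who want to check such numbers themselves rather than trust gossip, a micro-benchmark sketch (my example, not the thread's actual benchmark code) using the criterium library might look roughly like this; it assumes Neanderthal's `dge` matrix constructor and `mm` multiply:

```clojure
;; Hedged sketch: requires Neanderthal (with native ATLAS) and
;; criterium on the classpath; not the code behind the published numbers.
(require '[uncomplicate.neanderthal.native :refer [dge]]
         '[uncomplicate.neanderthal.core :refer [mm]]
         '[criterium.core :refer [quick-bench]])

(let [n 512
      a (dge n n (repeatedly (* n n) rand))   ; n-by-n random double matrix
      b (dge n n (repeatedly (* n n) rand))]
  ;; criterium warms up the JIT and reports statistically robust timings,
  ;; avoiding the usual JVM micro-benchmarking pitfalls.
  (quick-bench (mm a b)))
```

Swapping the body of `quick-bench` for the equivalent Vectorz or JBlas call is how an apples-to-apples comparison would be made.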

kovas boguta

unread,
Mar 14, 2016, 12:44:28 PM
to clo...@googlegroups.com
On Mon, Mar 14, 2016 at 11:19 AM, Dragan Djuric <drag...@gmail.com> wrote:
core.matrix already exists, is widely used and already unifies several different implementations that cover a wide variety of use cases. It provides an extensible toolkit that can be used either directly or by library / tool implementers. It's really very powerful, and it's solving real problems for a lot of people right now. It has the potential to make Clojure one of the best languages for data science.

I agree that this sounds great, but, come on... It sounds like a marketing pitch. Clojure currently doesn't offer a single data science library that would be even a distant match to the state of the art. Nice toys - sure. I hope you didn't mean Incanter?...

Interesting thread. 

It's a fact that the JVM is not the state of the art for numerical computing, including big swaths of data science/machine learning. There is zero chance of this changing until at least Panama and Valhalla come to fruition (a 5-year timeline).

Every interesting JVM library for numerical computing uses JNI to interact with native code. If that is the question, is core.matrix the answer? I'm not convinced (and Dragan clearly isn't), but it's certainly worth thinking about in great detail. It's not just mutable vs immutable, but memory management and GC. Big questions.

Now, it's also the case that systems optimal for numerical computing are pretty much a disaster for almost everything else. So there is plenty of room to add value via Clojure. I am skeptical of an "all things to all people" approach, though. Figuring out how to harness the best of both platforms (and acknowledging the tradeoffs and compromises necessary) seems to me the way to go.


Sergey Didenko

unread,
Mar 14, 2016, 12:46:40 PM
to clo...@googlegroups.com
Dragan, thank you for your library and detailed explanations!

Being close to state-of-the-art FORTRAN libraries and the GPU is important for long calculations.

You give me hope to use Clojure more for data science. The last time I benchmarked Incanter against Octave, I decided to pause using Clojure for data science and move to other languages.

Dragan Djuric

unread,
Mar 14, 2016, 12:57:43 PM
to Clojure
Thank you for the encouragement, Sergey.

As I mentioned in one of the articles, decent vectorized/GPU support is not a solution on its own. It is a foundation for writing your own custom GPU or SSE algorithms. For that, you'll have to drop to the native level for some parts of the code, which is fortunately approachable in a not-so-un-Clojure way through ClojureCL: http://clojurecl.uncomplicate.org

Dragan Djuric

unread,
Mar 14, 2016, 1:07:16 PM
to Clojure

Its a fact that the JVM is not the state of the art for numerical computing, including big swaths of data science/machine learning. There is 0 chance of this changing until at least Panama and Valhalla come to fruition (5 year timeline). 

I agree, but I would not dismiss even today's JVM (+ JNI). Python and R are even worse in that regard, yet they hold a huge share of the data science pie, in my opinion, because of their pragmatism. In Java land, almost all people avoid native code like the plague, while Python and R people almost exclusively use native code for number crunching while building a pleasant (to them :) user interface around it.

I too shunned JNI as dirty, clunky, ugly slime, and believed that pure Java is fast enough, until I came across problems that are slow even in native land and take an eternity in Java. I started measuring more diligently and, instead of following gossip and avoiding JNI, I took time to learn that stuff and saw that it is not that ugly, and that most of the ugliness could be hidden from the user, so I am not that pessimistic about the JVM. It is a good mule :)

kovas boguta

unread,
Mar 14, 2016, 1:45:12 PM
to clo...@googlegroups.com
On Mon, Mar 14, 2016 at 1:07 PM, Dragan Djuric <drag...@gmail.com> wrote:
I too shunned JNI as dirty, clunky, ugly slime, and believed that pure Java is fast enough, until I came across problems that are slow even in native land and take an eternity in Java. I started measuring more diligently and, instead of following gossip and avoiding JNI, I took time to learn that stuff and saw that it is not that ugly, and that most of the ugliness could be hidden from the user, so I am not that pessimistic about the JVM. It is a good mule :)

To clarify, I meant the JVM as the computation engine. Through JNI it is very serviceable when done right (though that is a high bar).

I agree with your analysis of other language ecosystems. It is a shame that the JVM has shunned native code for so long, and is only now changing direction.

One particular driver is the GPU. Forgoing native speed may seem like a reasonable tradeoff in a CPU world, but it becomes untenable versus GPUs in the domains where they can be applied.

Something I'd like to see is Clojure bindings for libraries that have taken this route, including
https://github.com/dmlc/mxnet (has Scala bindings)

Another interesting development is 
Brandon Bloom has been working on JVM bindings:


Raoul Duke

unread,
Mar 14, 2016, 1:46:53 PM
to clo...@googlegroups.com
Awesome would be a way for Clojure to generate C (perhaps with e.g. the Boehm–Demers–Weiser GC to get it kicked off) and JNI bindings, all automagically.

Dragan Djuric

unread,
Mar 14, 2016, 2:01:46 PM
to Clojure
At least for the JNI part - there are many Java libraries that generate JNI, but my experience is that it is easier to just write JNI by hand (it is simple if you know what you are doing) than to learn to use one of those usually poorly documented tools.

As for code generation - the OpenCL standard is moving towards the SPIR intermediate format in recent releases (I think they already provide an implementation for C++ translation), which looks to me like a kind of native bytecode whose purpose is just that: to make it possible for higher-level languages to generate native kernels for CPUs, GPUs, and various hardware accelerators. Unfortunately, language compilers are not an area that interests me, so I am unable to provide that for Clojure.

Mars0i

unread,
Mar 14, 2016, 11:54:03 PM
to Clojure
Dragan, I still support you doing whatever you want with your time, and I'm grateful that you've produced what I gather is a great library.  Wonderful.  I don't feel that you have to implement the core.matrix API--but I really, really wish someone would produce a core.matrix interface to Neanderthal.  (I don't know how difficult that would be; I just know that I'm not the right person to do it.)

Some of the following has been said before, but some of the points below seem to have been lost in the current discussion.

I want to try Neanderthal, but I may never do so until someone produces a core.matrix mapping.  That needn't be a problem for you, Dragan, but my time is too precious to reimplement my existing code for a new api.  Even with new code, I'm always going to prefer core.matrix in the foreseeable future unless I have very special needs, because being able to swap out implementations is just too convenient for me.  core.matrix allows me to easily figure out which library is most efficient in my application.  I also want there to be a standard api for matrix operations, and at this point, core.matrix is the standard imo. 

I came to Clojure from Common Lisp, where there are 5-10 matrix libraries with different APIs and different efficiency profiles, several seemingly best for different purposes (the Curse of Lisp mentioned by Mikera, applied to our case).  I am just very grateful that, instead of CL's mess of libraries and APIs, in the Clojure world Mikera and others decided to construct a standard API for matrix implementations.  What he's promoting is not marketing, nor religion--it's just convenience and coherence.  It's part of what Clojure is all about.

Again, Dragan, I certainly don't think that you have to support core.matrix, and I am still grateful for Neanderthal even if no one ever gives it a core.matrix front end.  From my point of view, everyone in the Clojure community can be grateful for Neanderthal--even those of us who might never end up using it.  It's a good resource for those who know in advance that it will be best for their application, or for those with time to benchmark their application with variant code.  I personally don't want to negotiate different matrix APIs, however.  That's what's important to me.

I don't have any opinion about the relative benefits of the core.matrix and Neanderthal APIs as such.  I just want to write to a standard API.  If the Neanderthal API became the standard, with multiple back ends, and was overall better than core.matrix in that respect, I'd use it.  At this point, going that route seems unlikely to be a good option, because core.matrix already exists, with a lot of work having gone into it by a reasonably large community.  For me the API seems just fine, and it's still being enhanced.  (If it can be made better through conversation with you or others familiar with Neanderthal, that's even better.)

So ... I understand why Mikera keeps mentioning core.matrix whenever Neanderthal is promoted here.  I support him doing that, because I want the matrix libraries for Clojure to stay away from the curse of Common Lisp.  I also understand why core.matrix isn't important for you, personally.  But core.matrix is important for others besides Mikera.  It's not just a pet project; it's a Good Thing, because well thought out common APIs for libraries with similar functionality is a Good Thing.  I think you probably agree, at least in the abstract, even if in this particular context you don't think that core.matrix is the best API for all matrix applications.  (Maybe it's not.  Still ... a common API, that's gotta be a good thing.  And if someone ever does produce a core.matrix interface to Neanderthal, the regular Neanderthal interface will still be available, presumably.  Great!  Best of both worlds.)

I don't think this discussion has to get as heated as it does.  Perhaps I'll get some of the heat, now, but it's not necessary, imo.

Bobby Bobble

unread,
Mar 15, 2016, 9:42:45 AM
to Clojure
I'm in the same boat as Mars0i in that I've written a load of code with core.matrix, which I'm happy with and which I will never have time to rewrite for Neanderthal. If I had a new project that was performance-critical I might bother to learn Neanderthal but for my existing project, speed is secondary to maintainability now. If my boss asked me to speed it up by 60x I'd love to be able to just switch to implementation :Neanderthal with core.matrix! One keyword and boom.
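The "one keyword" switch described above is how core.matrix implementation selection already works today; here is a hedged sketch using the real `:vectorz` implementation keyword (the `:neanderthal` keyword is, of course, the hypothetical one this post wishes for):

```clojure
;; Sketch: requires core.matrix and vectorz-clj on the classpath.
(require '[clojure.core.matrix :as m])

;; Set the global default implementation; :vectorz exists today.
(m/set-current-implementation :vectorz)

(def a (m/matrix [[1 2] [3 4]]))
(m/mmul a a)   ; dispatches to the Vectorz back end

;; Swapping the back end would then be the one-keyword change:
;; (m/set-current-implementation :neanderthal)  ; hypothetical
```

All code written against the `clojure.core.matrix` API stays unchanged; only the implementation keyword moves.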

Christopher Small

unread,
Mar 15, 2016, 7:55:40 PM
to Clojure
Ditto; same boat as Mars0i.

I think the work Dragan has done is wonderful, and that he bears no responsibility to implement the core.matrix api himself. I'm also personally not interested in touching it without core.matrix support, because similarly for me, the performance/flexibility tradeoff isn't worth it. I wish I had time to work on this myself, since it would be an awesome contribution to the community, but I just don't have the time right now with everything else on my plate (including a 1 year old (figurative plate, obviously)). So I would highly encourage anyone interested in numerical computing and clojure looking for a project to sharpen their teeth on to put some time here. I think it would be a real asset to the community.

The one thing I would like to ask of you, Dragan, is that you help us (well, Mike perhaps more than anyone) understand what you really don't like about the core.matrix architecture. In previous threads where we've discussed this, I felt like nothing concrete ever came up. You seem not to like the fact that the API doesn't mirror BLAS, and I get your thinking here (even if I don't necessarily agree). Is that really the extent of it, or did I miss something?

It was mentioned that core.matrix has a BLAS-style API ns, but you still seem uninterested in this, since those functions delegate to a non-BLAS API. If I'm right, your concern is that this would lead to leaky abstractions and affect performance, yes? I haven't looked at this API or its implementation, but I have an idea that might satisfy everyone. What if it were possible to specify a core.matrix implementation directly through the BLAS API, with the rest of the "standard" core.matrix API based off these functions (by default; custom implementations could still be provided, I'm sure)? (Does this seem doable, Mike?) Could this get us the best of both worlds and make core.matrix look more attractive to you, Dragan (even if you don't feel personally interested in working on the Neanderthal/core.matrix implementation yourself)? Those who want and know the BLAS API could use it, without any overhead or leakiness, and everything would remain interoperable with other libraries. If this still wouldn't do it, what would? You clearly have very deep knowledge of this subject, and I think I speak for the Clojure numerical computing community (and the core.matrix community in particular) in expressing that we'd greatly appreciate any insights you can give that would help make the ecosystem as a whole better.

With gratitude

Chris

Dragan Djuric

unread,
Mar 16, 2016, 6:20:14 AM
to Clojure
Christopher, Bobby, Mars0i,

I agree with you! core.matrix works fine for what you need and it is quite sane and wise to continue happily using it.

Whether you want to touch Neanderthal or not is up to you, and it is all the same to me one way or the other - I earn exactly $0 if you do and lose the same amount if you don't. I haven't ever said anywhere that you (or anyone else) should switch from core.matrix to Neanderthal, so I am quite puzzled by your insistence on telling me why you wouldn't. I created Neanderthal because I needed state-of-the-art capabilities for my other projects, and open-sourced it because I know that there are other people (no matter whether 1%, 10%, or 100%) who need exactly that. My main goal is not to convert everyone (or even anyone) to it, but to attract like-minded people *who need the same stuff* to use it, and hopefully contribute to it, or build other useful open-source libraries on top of it that *I* can use too.

It seems to me that your main argument is that, whether good or not, core.matrix is a unification point that somehow makes the ecosystem and community thrive, while you'd like to avoid the "Lisp curse" of fragmented libraries. That's a plausible hypothesis (and quite logical), but if we look around, we can see it is simply not true. Often it is the opposite, even in Clojure land.

How many competing web libraries, SQL bindings, network libs, and various other infrastructure projects are there for Clojure? Tons! There are even several competing wrappers for React! And all of them thrive! They build useful stuff on top of those foundations that often matches what is available elsewhere. And no one is going around scolding them for their heresy.

We can see the similar situation even in the state of the art tools elsewhere. BLAS itself has at least 2 competitive, mutually incompatible, state of the art open source implementations, and tons of proprietary ones. In Java, there are tons of competitive libraries, ditto for Python, R, etc. And all of them have several tools in each area that have arisen of thousands experiments and thrived.

Clojure's "matrix" (or NDArray, or whatever we should call it) space seems to be unified, or so I've heard. The reasoning is that it will produce a thriving ecosystem of data analysis, machine learning, numerical computing, and other tools. But what are the results so far? Here and there, there is some upstart project or library that is a good learning exercise for its author(s), but otherwise completely unusable for anything beyond that. Even the tools that have received some love from their authors are far from anything available in other languages. I have asked a few times to be pointed to any competitive data science/ML library in Clojure, and no one has stepped in with examples.

The most telling example of this state is maybe the most serious of those libraries - Incanter. 6 years ago, it looked like an interesting project that was actively maintained and somewhat usable. If not comparable to the alternatives in other languages, it was at least actively developed, with potential and bits of "Clojure's awesomeness will solve everything" promise. But the author left for greener pastures long ago, and Incanter started stalling. A few maintainers helped it stay afloat for some time, until the main focus of the project became the port to the one true lord and savior, core.matrix. That took a lot of time and resources, and did it help? Judging by the commit history (master and develop), the main thing now happening for Incanter is the occasional bump of the core.matrix dependency to a new version.

I am glad that Incanter, however dead it is, is still usable for (many?) people, and that core.matrix is usable and used in closed-source projects. But that is obviously not what I need, and I know better than to follow the peer pressure to accept the dogma. I do not mind being reminded every time about our true lord and savior core.matrix, but please understand when I reserve my right not to believe in it.

PS. I really think that this discussion, however heated it may be, is still a technical discussion, so whatever I say about various technologies is not meant to say anything bad about their authors and users. I think these are amazing people who give their work for free so we can all freely use it and learn from it.

Mars0i

unread,
Mar 17, 2016, 1:07:03 AM
to Clojure
Well, I'll disagree about some things, but agree very much about others--probably all practical matters.  Some of your remarks seem valuable.  Some don't seem relevant to me.  Ah well, we're just talking here.  Not everything needs to matter.

Your point about thriving, competing libraries is important.   What's the difference between this situation and the Curse of Lisp?  I'm not sure.

There is one of each of the Clojure contrib libraries.  There's a reason that's a good thing, imo.  It's the same reason that there's usually only one of each core function in Clojure, and that similar functions usually have similar syntax (you can't assume this in Common Lisp), although other people can define alternative functions, or build Scheme syntax into Clojure, or whatever one wants.  That there's a standard set of contrib libraries doesn't mean that there can't be competing libraries or competition for core functions, and there are in some cases.  (I would never use clojure.core/rand except for little experiments.)  If it works for people to have multiple React wrappers, etc., yeah, why not?  Those competing tools have different costs and benefits--not just in efficiency in different contexts, but in how one goes about structuring one's code.  Given that, there would be no sense in having a single-line switch between competing libraries.

Matrix operations seem different, though.  Every matrix library will share a lot of functionality with any other, because they're intended to implement the same standard mathematical abstractions.  They will have different functionality in some places, of course, but on the shared functionality, the difference will only be in which library is more efficient for which kinds of operations with what kind of data, what systems they'll run on, etc.  There are not that many different fundamental operations that one can want with matrices.  I don't care whether adding a constant to a matrix is written as `(add m c)` or `(add c m)` or `(sum m c)`, etc.  I want an API that's sensible and not foolish, and I prefer an API that's more sensible and more flexible without being more difficult to understand.  Beyond that I don't care much about what API I use, except that I don't want to have to learn multiple APIs and port my code from one to the other.
My guess is that in an alternate universe in which core.matrix had the API that Neanderthal has in this universe, I would be equally happy with core.matrix.

(Too bad that the Google Summer of Code deadline is past.  Providing a core.matrix interface to Neanderthal would be a great project for someone, I bet.)