b) It looks like you are consistently about 2x faster than JBlas for large matrices - wondering what is causing the difference, is that because of copying?
c) Would be interesting to see a few other operations: I do a lot of work with stochastic gradient descent for example so addition and multiply-and-add can be even more important than matrix multiply.
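To make the multiply-and-add point concrete, here is a minimal sketch of why it matters for stochastic gradient descent: for a linear model with squared error, the whole weight update is a single BLAS-style axpy (y := alpha * x + y). It is plain Python over lists purely for illustration; the names `axpy` and `sgd_step` and the toy data are assumptions of this sketch, not the API of any library discussed in this thread.

```python
def axpy(alpha, x, y):
    """In-place y := alpha * x + y, the BLAS level 1 multiply-and-add."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i, xi in enumerate(x):
        y[i] += alpha * xi
    return y


def sgd_step(weights, features, target, learning_rate):
    """One SGD step for squared error on a single linear-model sample."""
    prediction = sum(w * f for w, f in zip(weights, features))
    error = prediction - target
    # The gradient of 0.5 * error**2 w.r.t. the weights is error * features,
    # so the update w := w - lr * error * features is a single axpy call.
    return axpy(-learning_rate * error, features, weights)


# Toy example (hypothetical data): fit one sample repeatedly.
weights = [0.0, 0.0]
for _ in range(200):
    sgd_step(weights, [1.0, 2.0], 5.0, 0.1)
```

Since an update like this runs once per sample, the speed of addition and multiply-and-add can dominate the run time even when no matrix multiply is involved.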
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to a topic in the Google Groups "Clojure" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/dFPOOw8pSGI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Just another +1 to include a core.matrix implementation
I do not even claim that a unified API is impossible. I think that to some extent it is possible. I just doubt core.matrix's eligibility to be THE API in numerical computing, for it makes easy things easy and hard things impossible.
I see now Dragan; you're concerned not about whether easily implementing and swapping in/out implementations of core.matrix is possible, but whether it can be done while maintaining the performance characteristics of Neanderthal, yes? That did not come through in your earlier comments in this thread.
Certainly, performance is one of those things that can leak in an abstraction. But I'd like to echo Matt's enquiry: If you think a unified API might be possible but that core.matrix isn't it, I think we'd all love to hear what you think it's missing and/or what would need to be rearchitected in order for it to fit the bill.
Certainly, a third party library implementing core.matrix with Neanderthal is a possibility, but I'm a bit worried that a) it would add extra burden keeping things in sync and feel a little second class; and more importantly b) it might be easier to maintain more of the performance benefits if it's directly integrating (I could imagine less indirection this way, but could be totally wrong). So let me ask you this: Assuming a) someone forks Neanderthal and makes a core.matrix implementation with close performance parity to the direct Neanderthal API and/or b) folks working on core.matrix are able to address some of your issues with the core.matrix architecture, would you consider a merge?
As for any sort of "responsibility" to implement core.matrix, I don't think anyone is arguing you have such a responsibility, and I hope our _pleading_ hasn't come across as such. We are simply impressed with your work, and would like to take advantage of it, but also see a "drawback" you don't: at present Neanderthal is less interoperable with many existing tools, and "trying it out" on an existing project would require a rewrite (as would migrating away from it if we weren't happy).
Dragan, this just occurred to me--a small comment about the slow speed that I reported from clatrix, which you mentioned earlier. I'm not sure whether the slow speed I experienced on 500x500 matrices itself provides evidence for general conclusions about using the core.matrix api as an interface to BLAS. There was still a lot of work to be done on clatrix at that point--maybe there still is. My understanding is that clatrix supported the core.matrix api at that stage, but it was known that it didn't do so in an optimized way, in many respects. Optimizing remaining areas was left for future work.
I think your general point doesn't depend on my experience with clatrix a year ago, however. I understand you to be saying that there are some coding strategies that provide efficient code with BLAS and LAPACK, and that are easy to use in Neanderthal, but that are difficult or impossible using the core.matrix api.
On Friday, June 19, 2015 at 11:17:02 PM UTC+2, Christopher Small wrote:
I see now Dragan; you're concerned not about whether easily implementing and swapping in/out implementations of core.matrix is possible, but whether it can be done while maintaining the performance characteristics of Neanderthal, yes? That did not come through in your earlier comments in this thread.
This, with the addition that for *any* library, not only Neanderthal, there would be many leaking abstractions. It is easy to define common function/method names and parameters, but there are many things that just flow through the API regardless, and taming this is the hardest part of any API.
Certainly, performance is one of those things that can leak in an abstraction. But I'd like to echo Matt's enquiry: If you think a unified API might be possible but that core.matrix isn't it, I think we'd all love to hear what you think it's missing and/or what would need to be rearchitected in order for it to fit the bill.
For a unified API, if it is at all feasible, I think there is one place that should be looked at first: BLAS 1, 2, 3 and LAPACK. This is THE de facto standard for matrix computations for dense and banded matrices. Sparse APIs are not that uniform, but in that space, also, there is a lot of previous work. So, what's wrong with BLAS/LAPACK that core.matrix chose not to follow it and arbitrarily invent an (in my opinion) unintuitive and complicated API? I am genuinely interested, maybe I don't see something that other people do.
In my opinion, the best way to create a standard API is to grow it from successful implementations, instead of writing it first, and then shoehorning the implementations to fit it.
Certainly, a third party library implementing core.matrix with Neanderthal is a possibility, but I'm a bit worried that a) it would add extra burden keeping things in sync and feel a little second class; and more importantly b) it might be easier to maintain more of the performance benefits if it's directly integrating (I could imagine less indirection this way, but could be totally wrong). So let me ask you this: Assuming a) someone forks Neanderthal and makes a core.matrix implementation with close performance parity to the direct Neanderthal API and/or b) folks working on core.matrix are able to address some of your issues with the core.matrix architecture, would you consider a merge?
As for any sort of "responsibility" to implement core.matrix, I don't think anyone is arguing you have such a responsibility, and I hope our _pleading_ hasn't come across as such. We are simply impressed with your work, and would like to take advantage of it, but also see a "drawback" you don't: at present Neanderthal is less interoperable with many existing tools, and "trying it out" on an existing project would require a rewrite (as would migrating away from it if we weren't happy).
a) I would rather see the core.matrix interoperability as an additional separate project first, and when/if it shows its value, and there is a person willing to maintain that part of the code, consider adding it to Neanderthal. I wouldn't see it as second rate, and no fork is needed because of Clojure's extend-type/extend-protocol mechanism.
b) I am not sure about what's exactly "wrong" with core.matrix. Maybe nothing is wrong. The first thing that I am interested in is what the core.matrix team thinks is wrong with BLAS/LAPACK in the first place, so that I can form an opinion in that regard.
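Point a) relies on being able to attach an API to an existing type from outside the library that defines it, which is what Clojure's extend-type/extend-protocol provide. The sketch below shows the same idea in Python with `functools.singledispatch`, strictly as an analogy. `NativeMatrix` and `shape` are hypothetical names for illustration and are not part of Neanderthal or core.matrix.

```python
from functools import singledispatch


class NativeMatrix:
    """Stand-in for a matrix type owned by some other library (hypothetical)."""

    def __init__(self, rows, cols, data):
        self.rows, self.cols, self.data = rows, cols, data


@singledispatch
def shape(array):
    """Generic API function; implementations are registered per type."""
    raise TypeError(f"no shape implementation for {type(array).__name__}")


@shape.register
def _(array: list):
    # Nested lists: the outer length is the row count, the first row gives columns.
    return (len(array), len(array[0]) if array else 0)


# A separate "bridge" project can register the foreign type here without
# forking or editing NativeMatrix itself.
@shape.register
def _(array: NativeMatrix):
    return (array.rows, array.cols)
```

Calling `shape` on a nested list or on a `NativeMatrix` then goes through the same entry point, and the registration for `NativeMatrix` can live in its own project.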
There is nothing fundamentally wrong with BLAS/LAPACK, it just isn't suitable as a general purpose array programming API. See my comments further below.
If you think the core.matrix API is "unintuitive and complicated" then I'd love to hear specific examples. We're still open to changing things before we hit 1.0
But it certainly isn't "arbitrarily invented". Please note that we have collectively considered a *lot* of previous work in the development of core.matrix. People involved in the design have had experience with BLAS, Fortran, NumPy, R, APL, numerous Java libraries, GPU acceleration, low level assembly coding etc. We'd welcome your contributions too.... but I hope you will first take the time to read the mailing list history etc. and gain an appreciation for the design decisions.
In my opinion, the best way to create a standard API is to grow it from successful implementations, instead of writing it first, and then shoehorning the implementations to fit it.
It is (comparatively) easy to write an API for a specific implementation that supports a few specific operations and/or meets a specific use case. The original Clatrix is an example of one such library.
But that soon falls apart when you realise that the API+implementation doesn't meet broader requirements, so you quickly get fragmentation, e.g.
- someone else creates a pure-JVM API for those who can't use native code (e.g. vectorz-clj)
- someone else produces a similar library with a new API that wins on some benchmarks (e.g. Neanderthal)
- someone else needs arrays that support non-numerical scalar types (e.g. core.matrix NDArray)
- a library becomes unmaintained and someone forks a replacement
- someone wants to integrate a Java matrix library for legacy reasons
- someone else has a bad case of NIH syndrome and creates a whole new library
Before long you have a fragmented ecosystem with many libraries, many different APIs and many annoyed / confused users who can't easily get their tools to work together. Many of us have seen this happen before in other contexts, and we don't want to see the same thing happen to Clojure.
core.matrix solves the problem of library fragmentation by providing a common abstract API, while allowing users choice over which underlying implementation suits their particular needs best. To my knowledge Clojure is the *only* language ecosystem that has developed such a capability, and it has already proved extremely useful for many users.
So if you see people asking for Neanderthal to join the core.matrix ecosystem, hopefully this helps to explain why.
a) I would rather see the core.matrix interoperability as an additional separate project first, and when/if it shows its value, and there is a person willing to maintain that part of the code, consider adding it to Neanderthal. I wouldn't see it as second rate, and no fork is needed because of Clojure's extend-type/extend-protocol mechanism.
While this could work from a technical perspective, I would encourage you to integrate core.matrix support directly into Neanderthal, for at least three reasons:
a) It will allow you to save the effort of creating and maintaining a whole duplicate API, when you can simply adopt the core.matrix API (for many operations)
b) It will reduce maintenance, testing and deployment effort (for you and for others)
c) You are much more likely to get outside contributors if the library forms a coherent whole and plays nicely with the rest of the ecosystem
This really isn't hard - in the first instance it is just a matter of implementing a few core protocols. To get full performance, you would need to implement more of the protocols, but that could be added over time.
b) I am not sure about what's exactly "wrong" with core.matrix. Maybe nothing is wrong. The first thing that I am interested in is what the core.matrix team thinks is wrong with BLAS/LAPACK in the first place, so that I can form an opinion in that regard.
BLAS/LAPACK is a low level implementation. core.matrix is a higher level abstraction of array programming. They simply aren't comparable in a meaningful way. It's like comparing the HTTP protocol with the Apache web server.
You could certainly use BLAS/LAPACK to create a core.matrix implementation (which is roughly what Clatrix does, and what Neanderthal could do if it became a core.matrix implementation). Performance of this implementation should roughly match raw BLAS/LAPACK (all that core.matrix requires is the protocol dispatch overhead, which is pretty minimal and only O(1) per operation so it quickly becomes irrelevant for operations on large arrays).
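A back-of-the-envelope way to see why a constant per-call dispatch cost washes out on large arrays: a naive dense multiply of two n x n matrices performs n^3 multiply-adds, while the API layer adds one lookup per call. The sketch below counts both. It is written in Python purely for illustration, and the registry and function names are assumptions of this sketch, not a description of how core.matrix dispatches internally.

```python
# Counters kept in a dict so both functions below can update them.
counts = {"dispatch": 0, "multiply_add": 0}


def naive_mmul(a, b):
    """Triple-loop matrix multiply over nested lists, counting multiply-adds."""
    n, k, m = len(a), len(b), len(b[0])
    result = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                result[i][j] += a[i][p] * b[p][j]
                counts["multiply_add"] += 1
    return result


# Hypothetical registry mapping an array type to its backend implementation.
IMPLEMENTATIONS = {list: naive_mmul}


def mmul(a, b):
    """API-level entry point: one table lookup, then hand off to the backend."""
    counts["dispatch"] += 1
    return IMPLEMENTATIONS[type(a)](a, b)


n = 20
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
product = mmul(identity, identity)
# One dispatch was recorded against n**3 = 8000 multiply-adds.
```

The ratio only grows with n, which is the sense in which the overhead "quickly becomes irrelevant". It says nothing about costs that are not O(1), such as copying data between representations.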
In terms of API, core.matrix is *far* more powerful than BLAS/LAPACK.
Some examples:
- Support for arbitrary N-dimensional arrays (slicing, reshaping, multi-dimensional transposes etc.)
- General purpose array programming operations (analogous to NumPy and APL)
- Independence from underlying implementation. You can support pure-JVM implementations (like vectorz-clj for example), native implementations, GPU implementations.
- Support for arbitrary scalar types (complex numbers? strings? dates? quaternions anyone?)
- Transparent support for both dense and sparse matrices with the same API
- Support for both mutable and immutable arrays
- Transparent support for the in-built Clojure data structures (Clojure persistent vectors etc.)
- Support for mixing different array types
- Support for convenience operations such as broadcasting, coercion
If you build an API that supports all of that with a reasonably coherent design... then you'll probably end up with something very similar to core.matrix
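As a rough illustration of what broadcasting over built-in data structures means, here is a minimal sketch in Python using nested lists as arrays. It assumes NumPy-style behaviour where the lower-dimensional argument is repeated across the leading dimension of the other; core.matrix's actual rules may differ in detail, and `ndim` and `add` here are illustrative helpers rather than its API.

```python
def ndim(x):
    """Number of nesting levels: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    depth = 0
    while isinstance(x, list):
        depth += 1
        if not x:
            break
        x = x[0]
    return depth


def add(a, b):
    """Element-wise addition that broadcasts the lower-dimensional argument."""
    da, db = ndim(a), ndim(b)
    if da == 0 and db == 0:
        return a + b
    if da > db:
        # b is repeated across the leading dimension of a.
        return [add(slice_, b) for slice_ in a]
    if db > da:
        # a is repeated across the leading dimension of b.
        return [add(a, slice_) for slice_ in b]
    if len(a) != len(b):
        raise ValueError("shapes are not compatible for broadcasting")
    return [add(x, y) for x, y in zip(a, b)]
```

With this, `add(matrix, 10)` adds a scalar to every element and `add(matrix, row)` adds the same row vector to every row, without the caller converting anything first.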
Hi Dragan,
The situation as I see it:
- You've created a matrix library that performs well on one benchmark (dense matrix multiplication).
- Neanderthal meets your own personal use cases. Great job!
- Neanderthal *doesn't* fit the use cases of many others (e.g. some need a portable pure JVM implementation, so Neanderthal is immediately out)
- Fortunately, in the Clojure world we have a unique way for such libraries to interoperate smoothly with a common API (core.matrix)
- Neanderthal could fit nicely in this ecosystem (possibly it could even replace Clatrix, which as you note hasn't really been maintained for a while...)
- For some strange reason, it *appears to me* that you don't want to collaborate. If I perceive wrongly, then I apologise.
If you want to work together with the rest of the community, that's great. I'm personally happy to help you make Neanderthal into a great matrix implementation that works well with core.matrix. I'm 100% sure that is a relatively simple and achievable goal, having done it already with vectorz-clj.
If on the other hand your intention is to go your own way and build something that is totally independent and incompatible, that is of course your right but I think that's a really bad idea and would be detrimental to the community as a whole.
Fragmentation is a likely result. At worst, you'll be stuck maintaining a library with virtually no users (the Clojure community is fairly small anyway... and it is pretty lonely to be a minority within a minority)
I can see from your comments below that you still don't understand core.matrix. I'd be happy to help clarify if you are seriously interested in being part of the ecosystem.
Ultimately I think you have some talent, you have obviously put in a decent amount of work and Neanderthal could be a great library *if and only if* it works well with the rest of the ecosystem and you are personally willing to collaborate.
Your call.
It may be that there's agreement on everything that matters. Dragan, you've said that you wouldn't mind others integrating Neanderthal into core.matrix, but that you don't want to do that yourself. That you are willing to work with others on this is probably all that's needed. People may have questions, want you to consider pull requests, etc. I think that integrating Neanderthal into core.matrix will be an attractive project for someone.
(For the record I don't think it's fair to criticize core.matrix as not being an API because the documentation is limited. The API is in the protocols, etc. It's all there in the source. Of course anyone using core.matrix occasionally encounters frustration at the lack of documentation, but core.matrix is a work in progress. That's the nature of the situation. I have been able to figure out how to do everything I needed to do using only the existing documentation, source code, docstrings, and occasional questions that got helpful answers from others. If I had more time, maybe I would work on the documentation more than the tiny bit I've been able to do. Similar remarks apply to what clatrix doesn't do well yet. Being unfinished is not a design flaw.)
As for performance benchmarks, I have to echo Mike that it seemed strange to me that you were claiming you were faster on ALL benchmarks when I'd only seen data on one. Would you mind sharing your full benchmarking analyses?
With all that out of the way... I'm glad that you're willing to play ball here with the core.matrix community, and thank you for what I think has been a very productive discussion. I think we all went from talking _past_ each other, to understanding what the issues are and can now hopefully start moving forward and making things happen. While I think we'd all love to have you (Dragan) personally working on the core.matrix implementations, I agree with Mars0i that just having you agree to work-with/advise others who would do the actual work is great. I'd personally love to take that on myself, but I already have about a half dozen side projects I'm working on which I barely have time for. Oh, and a four month old baby :scream:! So if there's anyone else who's willing, I may leave it to them :-)
As it is a *sparse matrix* C++ library that is unavailable on the JVM, I don't consider it relevant for comparison, as these are really apples and pineapples. For now, at least.
I'm glad to see that you and Mike are making a productive dialog out of what could have gone the other way. It's a credit to you both.
Simple question wrt documentation, the Great White Whale of open source software. Suppose core.matrix had insanely great documentation. What difference would that make regarding your decision to support it or not? (IOW, if the community were to manage to make good docs, would you change your mind?)
I will support whoever wants to take on that task, but I have neither the time nor the need to do it myself.