sparse matrix implementation

280 views
Skip to first unread message

Ian Fiske

unread,
Oct 23, 2009, 1:33:17 PM10/23/09
to Incanter
Hi all,

I've started work on the sparse matrix implementation for Incanter,
but am trying to decide how to proceed. I'm creating another java
class: SparseMatrix which parallels Matrix, except that it's sparse
(it extends SparseCCDoubleMatrix2D). So I think the java part is
pretty clear.

But the Clojure part is not so clear to me. I think that functions
like mult and plus should transparantly decide what to do depending on
whether both argument are SparseMatrix or Matrix objects. But the
current implementation uses plain functions everywhere. Is the best
way then to convert everything to multimethods? If so, will this have
a performance penalty? Any other ideas?

Thanks,
Ian

David Edgar Liebke

unread,
Oct 23, 2009, 1:39:42 PM10/23/09
to inca...@googlegroups.com
Ian,

You'll want to modify incanter.internal/transform-with and
incanter.internal/combine-with to recognize the sparse matrix class.
You'll also want to implement the equivalent of the incanter.Matrix
java class that implements the Clojure sequence interfaces, so that
the sparse matrix class will behave like a Clojure sequence.

Good luck and let me know how it goes, or if you have additional questions,
David

Ian Fiske

unread,
Oct 23, 2009, 4:32:29 PM10/23/09
to inca...@googlegroups.com
Thanks, David. All of these modifications would be cleaner if I
create a new superclass Matrix that has your Matrix (now call it
DenseMatrix) and my new one (SparseMatrix) as children.

I'm not an experienced Java programmer, so tell me if this seems
reasonable: I'll create an abstract superclass Matrix with the sole
purpose of containing DenseMatrix and SparseMatrix. So all methods of
Matrix will now be abstract methods, but it will still implement iseq
and counted (via abstract methods -- is this legal?). Then each of
the DenseMatrix and SparseMatrix classes will extend their respective
Parallel Colt classes and Matrix. In this way, no clojure would need
to be modified and all type hints could stay with Matrix. We would
only need a new constructor function for SparseMatrix objects.

Uh-oh... I just checked and Java does not have multiple inheritance,
so is there another way that DenseMatrix can somehow inherit from
Matrix and from DenseColDoubleMatrix2D simultaneously (and similarly
for sparse)?

Thanks,
Ian

--
Ian Fiske
PhD Candidate
Department of Statistics
North Carolina State University

David Edgar Liebke

unread,
Oct 23, 2009, 4:59:16 PM10/23/09
to inca...@googlegroups.com
> Uh-oh... I just checked and Java does not have multiple inheritance,
> so is there another way that DenseMatrix can somehow inherit from
> Matrix and from DenseColDoubleMatrix2D simultaneously (and similarly
> for sparse)?

Yep, this is the weakness of Java's single inheritance, but your
approach can still work if you make incanter.Matrix an interface that
both the dense and sparse versions of the Matrix class implement.
There will just be redundancy in the implementations of Clojure's
sequence interfaces in the two classes.

David

ogrisel

unread,
Nov 21, 2009, 5:19:20 PM11/21/09
to Incanter
First thanks Ian for wrapping the sparse matrix implementation in
incanter,
this work will be very much appreciated.

Could this refactoring be the opportunity to allow the use of single
precision float as an optional dtype argument to the matrix function
as this
is the case in numpy? Sometimes double precision is not needed and
using float reduce the memory usage and can make the processing faster
(provided that parallelcolt or the JVM can leverage SSE instructions
which
I don't know whether or not this is the case).

--
Olivier

David Edgar Liebke

unread,
Nov 23, 2009, 9:47:03 AM11/23/09
to inca...@googlegroups.com
My understanding is that the Java JVM (specifically the HotSpot
compiler) doesn't support SSE instructions, so single-precision
wouldn't improve performance much. But it could reduce memory usage,
and that could be useful.

I'm not sure how far Ian has come with his sparse matrix
implementation, but Bradford Cross and I have discussed
re-implementing the incanter.Matrix class in Clojure (it's currently a
Java class). At which point, I would like to refactor it so that
Matrix is an interface, instead of a class, that can support sparse
and float implementations in addition to the currently supported
double.

If anybody else is interested in doing work in this area, let me know :)

David

Olivier Grisel

unread,
Nov 23, 2009, 9:08:39 PM11/23/09
to inca...@googlegroups.com
2009/11/23 David Edgar Liebke <lie...@gmail.com>:
> My understanding is that the Java JVM (specifically the HotSpot
> compiler) doesn't support SSE instructions, so single-precision
> wouldn't improve performance much. But it could reduce memory usage,
> and that could be useful.

I googled a bit further: quoting a response to this forum post:
http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=135402

"""
by abinstein on Wed Aug 06, 2008 4:16 pm
I don't think you can manipulate SIMD in Java in the source level. I
know SSE has been "supported" by Sun JVM since 1.4.2, and there are
some request for enhancement on this feature. See
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6604786

You can try to compile with and without SSE. Turn SSE support off by
adding JVM flag -XX:UseSSE=0 to Application Server flags. Or see here:
http://forums.java.net/jive/message.jspa?messageID=84436
"""

But even though using floats might not trigger the expected SSE
speedup, the reduce memory pressure is still a nice to have as you
said :)

> I'm not sure how far Ian has come with his sparse matrix
> implementation, but Bradford Cross and I have discussed
> re-implementing the incanter.Matrix class in Clojure (it's currently a
> Java class). At which point, I would like to refactor it so that
> Matrix is an interface, instead of a class, that can support sparse
> and float implementations in addition to the currently supported
> double.

Sounds good to me.

> If anybody else is interested in doing work in this area, let me know :)

I can offer beta-testing and running benchmarks similar to what Mark
Reid is doing with http://github.com/mreid/injuce/ . However I won't
probably have the time (nor the clojure experience) to help you on the
implementation side.

--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name

Ian Fiske

unread,
Nov 23, 2009, 9:21:34 PM11/23/09
to inca...@googlegroups.com
Hi all --

I should probably update you guys on my sparse implementation status.
I started on the project after our last communications, hoping it
would be a quick task. I got basic stuff implemented, like
constructors and Matrix as an interface. But I haven't had time to
finish the implementation of all of the methods ... it turns out that
a lot of careful thought probably needs to go into how to best handle
all of the methods for things like multiplying a sparse matrix with a
dense vector, etc. Should these things be coerced to dense? etc...

Also, is CsparseJ the best way to go? The methods in libsparsesuite
from Tim Davis at UF is probably the best sparse stuff in existence.
But it's in C and needs conversion to Java.

So I've put this off until I have more time to consider things more
carefully. I'd be happy to provide help with design and testing for
anyone else who has more time to head this up... sorry to disappoint,
but I've gotta work on my dissertation!

Best,
Ian

Bradford Cross

unread,
Feb 1, 2010, 6:12:59 PM2/1/10
to inca...@googlegroups.com
Is anyone picking things up with sparse matrices?  I am going to convert my nested-maps-as-sparse-matrices pattern to use real sparse matrices.  So if nobody has done it, then I am going to do it now.

David Edgar Liebke

unread,
Feb 1, 2010, 7:51:23 PM2/1/10
to inca...@googlegroups.com
Excellent!

jolby

unread,
Feb 4, 2010, 2:29:09 PM2/4/10
to Incanter
I just found this project for N-dim arrays (tuples) in Clojure:

http://code.google.com/p/clj-multiarray/

EPL licenced.

--Joel

On Feb 1, 4:51 pm, David Edgar Liebke <lie...@gmail.com> wrote:
> Excellent!
>
> On Mon, Feb 1, 2010 at 6:12 PM, Bradford Cross
>

> <bradford.n.cr...@gmail.com> wrote:
> > Is anyone picking things up with sparse matrices?  I am going to convert my
> > nested-maps-as-sparse-matrices pattern to use real sparse matrices.  So if
> > nobody has done it, then I am going to do it now.
>

Kevin

unread,
Dec 3, 2012, 5:51:15 AM12/3/12
to inca...@googlegroups.com
Hello,

sorry to bring this up again but seeing that there is still no wrapper for sparse matrices in incanter.core, are there any issues that people have been running into implementing this or is it just that no one got around doing it? Otherwise I'd be quite keen to do it, with a new Matrix interface like outlined above. Would it make any sense to do that right now with the talk about BLAS in the other thread?

Best!
Kevin

Matthew Rocklin

unread,
Dec 3, 2012, 10:54:15 AM12/3/12
to inca...@googlegroups.com
The BLAS conversation involves using the Java Native Interface.  Are you considering linking to a sparse matrix library (or generic external interface)?  Or are you implementing sparse matrices in Java/Clojure?

If you're willing to make code sufficiently generic to support various low-level sparse matrix implementations then you enable your code to hook into some numeric libraries.  PETSc for example is a parallel library for sparse matrix (and other) manipulations on distributed hardware (often used on super-computers).  It would be awesome if I could hook in various numeric libraries (sparseBLAS, PETSc) to a generic clojure front-end.

Kevin

unread,
Dec 3, 2012, 1:25:18 PM12/3/12
to inca...@googlegroups.com
I wouldn't want to tackle linking to a new library, no, I was just thinking about a simple extension of PColt's SparseCCDoubleMatrix2D as envisaged above. It doesn't look like more than a day's work which is why I was wondering why the two earlier efforts documented here didn't make it into the repo?

The BLAS question was more about whether it's still worth adding PColt sparse matrix support to the 1.4 branch when other people are already busy working on a major overhaul of the matrix-backend.

Best!
Kevin

meekerdb

unread,
Dec 3, 2012, 3:08:38 PM12/3/12
to inca...@googlegroups.com
I was looking at Incanter for a a project, but I didn't see any support for complex matrices.  Is that in the offing?

Brent Meeker

No virus found in this message.
Checked by AVG - www.avg.com
Version: 2012.0.2221 / Virus Database: 2634/5433 - Release Date: 12/02/12


Mike Anderson

unread,
Jan 27, 2013, 11:43:49 AM1/27/13
to inca...@googlegroups.com
On Tuesday, 4 December 2012 04:08:38 UTC+8, Brent wrote:
I was looking at Incanter for a a project, but I didn't see any support for complex matrices.  Is that in the offing?

Brent Meeker


Hi Brent,

We're making allowances for Complex matrices as part of the design for core.matrix (https://github.com/mikera/matrix-api) - which will hopefully become usable from Incanter in the near future.

Still plenty of work to do but the API should definitely support it (we've designed it to be able to handle any type of array element)

   Mike.


Alex Ott

unread,
Jan 27, 2013, 1:33:59 PM1/27/13
to inca...@googlegroups.com
I plan to try to port Incanter to matrix-api (when I'll find some free time...)
> --
>
>
>



--
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott

Mike Anderson

unread,
Jan 27, 2013, 9:35:19 PM1/27/13
to inca...@googlegroups.com

On Monday, 28 January 2013 02:33:59 UTC+8, Alex Ott wrote:
I plan to try to port Incanter to matrix-api (when I'll find some free time...)

Awesome - happy to collaborate with you to help make that happen!

In particular, if you see anything missing in the core.matrix API that Incanter needs then let me know, I'd be happy to add this in. I had a quick look through incanter.core and I think nearly all of the basics are covered. We are missing some of the matrix decompositions but hopefully these can be added soon.

It also looks feasible to make the "Dataset" record work as a core.matrix implementation alongside the regular matrix stuff. We don't have explicit metadata support yet (of the type needed to make the column names work nicely) but in theory this should be possible to add.
Reply all
Reply to author
Forward
0 new messages