Scientific computing


Rock

Oct 23, 2009, 2:01:54 PM
to Clojure
What if I wanted to use Clojure for scientific computing, and in
particular for doing linear algebra and matrix computations a la
MATLAB?

What would my options be for representing matrices and, especially,
MULTIDIMENSIONAL ARRAYS?

Would java arrays be the way to go, or nested vectors?

I don't like the nested vector solution too much, but to be honest,
there really doesn't seem to be an adequate solution to this problem
at the moment. For instance, once you have a multidimensional array
(whatever kind it is) you can't, I believe, access any property of
that object at runtime to know its rank and dimensions, like you can
do with MATLAB or Scientific Python (NumPy), or other similar
languages. And slicing, such as a[2:20:4, 1:9:3], which means every
element in a with indices that go from start to finish with a certain
step, seems problematic.

Actually the whole business of having to deal with immutable (and
lazy) data structures appears to make computing with matrices and
multidimensional arrays a lot more difficult. Maybe this is just an
impression. Opinions?

Does Rich intend to address this issue of having multidimensional
arrays (for scientific computing purposes) sooner or later?

Thanks.

Rock

Konrad Hinsen

Oct 23, 2009, 4:00:29 PM
to clo...@googlegroups.com
Rock wrote:

> What if I wanted to use Clojure for scientific computing, and in
> particular for doing linear algebra and matrix computations a la
> MATLAB?
>
> What would my options be for representing matrices and, especially,
> MULTIDIMENSIONAL ARRAYS?
>
> Would java arrays be the way to go, or nested vectors?

There are a couple of Java libraries that you can use. Unfortunately the
efforts to make everyone agree on a single library a few years ago
have failed, so now you have to choose between a couple of incompatible
libraries.

For matrix computations and linear algebra, your best choice is probably
the Colt library developed at CERN, or the somewhat parallelized
version called Parallel Colt. Colt has arrays up to three dimensions for
a couple of data types, enough for matrix stuff but not much more.

If you need higher-dimensional arrays, e.g. to store complex data sets,
your best bet would be the netCDF implementation for Java, which
contains a very complete array library that handles any number of
dimensions. However, it provides no linear algebra at all.

Another library in this category is JAMA. It looks rather similar to
Colt, but I never looked at it more closely.

There's also an array package developed by IBM in conjunction with an
optimizing Java compiler to go with it, but apparently it is no longer
maintained.


> at the moment. For instance, once you have a multidimensional array
> (whatever kind it is) you can't, I believe, access any property of
> that object at runtime to know its rank and dimensions, like you can
> do with MATLAB or Scientific Python (NumPy), or other similar
> languages. And slicing, such as a[2:20:4, 1:9:3], which means every
> element in a with indices that go from start to finish with a certain
> step, seems problematic.

All the libraries listed above provide most of this functionality. Like
NumPy, they represent an array as a 1D Java array containing the data
elements plus bookkeeping information about the shape and strides.
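To make that concrete, here is a small sketch in Clojure of how such a flat, strided representation addresses its elements. The names (`strides`, `flat-index`, `aget-nd`) are made up for illustration and not taken from any of these libraries.

```clojure
;; Sketch of NumPy/Colt-style strided indexing: a flat data vector
;; plus shape and strides bookkeeping. Names here are illustrative.
(defn strides
  "Row-major strides for a given shape, e.g. (strides [2 3]) => [3 1]."
  [shape]
  (vec (rest (reverse (reductions * 1 (reverse shape))))))

(defn flat-index
  "Map a multi-index to a position in the flat storage vector."
  [strides idx]
  (reduce + (map * strides idx)))

(def a {:data  [1 2 3 4 5 6]   ; a 2x3 array stored flat, row-major
        :shape [2 3]})

(defn aget-nd [{:keys [data shape]} & idx]
  (nth data (flat-index (strides shape) idx)))

;; (aget-nd a 1 2) => 6
```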

> Actually the whole business of having to deal with immutable (and
> lazy) data structures appears to make computing with matrices and
> multidimensional arrays a lot more difficult. Maybe this is just an
> impression. Opinions?

The array packages listed above all permit modification of the array
elements. Fully immutable arrays are probably of little practical
interest, but something similar to Clojure's transients (data types that
can be initialized once by the function that creates them and then
become immutable) would probably be sufficient to change this. There is
little practical experience with this at the moment.


Konrad.




Garth Sheldon-Coulson

Oct 23, 2009, 4:06:17 PM
to clo...@googlegroups.com
Hi Rock,

Clojure isn't designed as an array processing language, so it doesn't have multidimensional array-slicing or matrix algebra tools built in. That's just not what Clojure's trying to be, and you're right, immutable data structures might get in the way. There's probably nothing stopping you from writing a library to treat nested vectors as multidimensional arrays (lookup is effectively constant time, after all), but I don't know if it's worth the trouble considering the matrix algebra routines aren't there.
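For what it's worth, plain core functions already give you multidimensional access on nested vectors. A minimal sketch:

```clojure
;; Nested vectors already support multidimensional lookup and
;; (persistent) update via core functions.
(def m [[1 2 3]
        [4 5 6]])          ; a 2x3 "matrix" as nested vectors

(get-in m [1 2])           ;=> 6, element at row 1, column 2
(assoc-in m [0 0] 99)      ;=> [[99 2 3] [4 5 6]], returns a new matrix

;; A crude slice a la MATLAB's m(1, 1:2) using subvec:
(subvec (m 1) 0 2)         ;=> [4 5]
```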

Incanter http://incanter.org/ is a project that links Clojure to Parallel Colt for matrix algebra, adding nice statistical features a la R.

Clojuratica http://clojuratica.weebly.com/ is a project that links Clojure to Mathematica, which (as you probably know) is similar to MATLAB in its array-processing capabilities and has a whole host of other capabilities as well. I develop Clojuratica and am working on a new release with better syntactic integration. I anticipate the announcement of that release within two weeks, but take a look at what's on the site now anyway.

If all you need is a statistical or array-processing language like MATLAB, my frank view is you're best off staying in R, or Mathematica, or MATLAB, or Octave, or whatever... they're mature and great at what they do (Mathematica most of all ;-) ). The reason you might want to use Clojuratica or Incanter is if you're building applications that need Clojure's awesome features in, say, concurrency, *plus* array processing.

Garth

Konrad Hinsen

Oct 24, 2009, 7:36:20 AM
to clo...@googlegroups.com
On 23 Oct 2009, at 22:00, Konrad Hinsen wrote:

> For matrix computations and linear algebra, your best choice is
> probably
> the Colt library developed at CERN, or the somewhat parallelized
> version called Parallel Colt. Colt has arrays up to three dimensions
> for
> a couple of data types, enough for matrix stuff but not much more.
>
> If you need higher-dimensional arrays, e.g. to store complex data
> sets,
> your best bet would be the netCDF implementation for Java, which
> contains a very complete array library that handles any number of
> dimensions. However, it provides no linear algebra at all.

I should have added that it is possible to convert between netCDF and
Colt arrays without a costly element-by-element copying procedure. For
example, the following code inverts a matrix read from a netCDF file
using Colt:

(ns netcdf-read
  (:import (ucar.nc2 NetcdfFile)
           (cern.colt.matrix.tdouble.impl DenseDoubleMatrix2D)
           (cern.colt.matrix.tdouble.algo DenseDoubleAlgebra)))

(defn unidata-to-colt
  [#^ucar.ma2.ArrayDouble$D2 array]
  (let [java-array-1d (.get1DJavaArray array Double)
        [rows cols] (.getShape array)]
    (new DenseDoubleMatrix2D rows cols java-array-1d 0 0 cols 1 false)))

(let [matrix (with-open [ncfile (NetcdfFile/open "/Users/hinsen/Temp/matrix.nc")]
               (.read (.findVariable ncfile "matrix")))
      colt-matrix (unidata-to-colt matrix)
      la (new DenseDoubleAlgebra)
      inverse (.inverse la colt-matrix)]
  (prn inverse))

Konrad.

Rock

Oct 24, 2009, 7:01:42 PM
to Clojure
Thanks for your replies. You've been very helpful.

Ahmed Fasih

Oct 25, 2009, 12:33:43 AM
to Clojure
> If all you need is a statistical or array-processing language like MATLAB,
> my frank view is you're best off staying in R, or Mathematica, or MATLAB, or
> Octave, or whatever... they're mature and great at what they do (Mathematica
> most of all ;-) ). The reason you might want to use Clojuratica or Incanter
> is if you're building applications that need Clojure's awesome features in,
> say, concurrency, *plus* array processing.

Lisp/Clojure offers such a compelling advantage for software design of
complex applied math algorithms over some of the systems you have
named that I think it bowls over some long-suffering Matlab/R/Python
users, who are inspired to work in Clojure in spite of incomplete (but
still surprisingly good) support for their specialized needs.
Hopefully Clojure's flexibility will soon allow it to become a
matrix-oriented language as well, if only to express linear-algebra-type
ideas (which are stored under the hood in whatever format or system
needed---Colt, CDF, Mathematica arrays even!).

I wanted to let you know that probably for every person who asks about
Clojure scientific computing, there's a number of others who are just
trying to make it work for them!

(By the way, Garth, great work on Clojuratica!)

Rock

Oct 26, 2009, 12:14:57 PM
to Clojure
Just one more thing. It's still not really clear to me if I am better
off using Java arrays (make-array ...) or Clojure vectors, especially
when dealing with multidimensional arrays. I know that if I use Java
libraries such as Colt, I have no choice. But in general? What do you
think?

Garth Sheldon-Coulson

Oct 26, 2009, 12:51:27 PM
to clo...@googlegroups.com
I imagine you'll be better off with the flexibility and ease of nested vectors unless you have very specific performance needs (e.g. you need to pass the arrays frequently to third-party libraries that accept Java arrays, or for performance reasons you want to mutate them in place using amap etc.).

Even the latter need can now be satisfied with Clojure vectors and transients.
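A small sketch of what that looks like: a single-threaded batch update through a transient, then a freeze back to a persistent vector (the function name is made up).

```clojure
;; Batch-update a vector "in place" with transients, then freeze it
;; back to a persistent vector with persistent!.
(defn scale-row
  "Multiply every element of row v by k, using a transient for speed."
  [v k]
  (persistent!
    (reduce (fn [tv i] (assoc! tv i (* k (nth tv i))))
            (transient v)
            (range (count v)))))

;; (scale-row [1 2 3] 10) => [10 20 30]
```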

Once Rich's par branch is merged into the master branch you'll have access to very efficient forkjoin-enabled parallel operations (pvmap) on Clojure vectors. That's reason enough to use vectors, imo. If you want this now you can compile the current par branch yourself. It's on github.

Garth

Konrad Hinsen

Oct 26, 2009, 2:23:05 PM
to clo...@googlegroups.com
On 26 Oct 2009, at 17:14, Rock wrote:

> Just one more thing. It's still not really clear to me if I am better
> off using Java arrays (make-array ...) or Clojure vectors, especially
> when dealing with multidimensional arrays. I know that if I use Java
> libraries such as Colt, I have no choice. But in general? What do you
> think?

Using Clojure's built-in data structures is almost always the best
choice until you have a good reason to use something else. In the case
of arrays, that good reason is likely to be
1) memory usage, once your data sets become big,
2) performance, either because of the size of the data sets or because
some algorithm needs very frequent access, or
3) interoperability with Java libraries such as Colt.

As long as you work with small enough datasets, stick to Clojure
vectors.

Konrad.

liebke

Oct 26, 2009, 2:32:16 PM
to Clojure
I agree, use Clojure vectors whenever it's feasible. Even within
Incanter, which uses Parallel Colt extensively, I try never to convert
Clojure vectors into Colt vectors/matrices unless it's absolutely
necessary.

David

Rock

Oct 27, 2009, 12:46:08 PM
to Clojure
But then again ...

Thinking this over, one idea that comes to mind:

What if we created a structmap (a sort of class), where we have a
flattened out one-dimensional Clojure vector containing all the data
of our potentially multidimensional (rank n) array, together with its
dimensions and possibly other info? I believe the C scientific
library, GSL, does something similar (with structs of course), even
though C (especially C99) has pretty good support for multidimensional
arrays per se.

I think such an approach could be more flexible. You can have the
vector as a data pool, and the rest would be sort of like a view on
that data (which can be changed at runtime if one chooses to). For
instance, a 5 x 3 x 2 array could also be a simple 5 x 6 matrix or
just a 30 element vector. You can access that info at runtime, like
the current size, rank, and dimensions of the array, and build a whole
set of multimethods to manage this data structure. This way, we can
have the power of Clojure vectors and the flexibility all wrapped up
in one coherent package.
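Something like this, to sketch the idea (the names ndarray, make-ndarray, and reshape are just placeholders):

```clojure
;; Flat data vector plus a dims field; the dims are a "view" that can
;; be swapped at runtime without copying the data pool.
(defstruct ndarray :data :dims)

(defn make-ndarray [data dims]
  (assert (= (count data) (reduce * dims)))
  (struct ndarray (vec data) (vec dims)))

(defn reshape
  "Reinterpret the same flat data under new dimensions: a 5x3x2 array
  can be viewed as 5x6 or as a flat 30-vector without copying."
  [arr new-dims]
  (assert (= (reduce * (:dims arr)) (reduce * new-dims)))
  (assoc arr :dims (vec new-dims)))

(defn rank [arr] (count (:dims arr)))

(def a (make-ndarray (range 30) [5 3 2]))
;; (rank a)                  => 3
;; (:dims (reshape a [5 6])) => [5 6]
```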

What do you think?

Rock

Konrad Hinsen

Oct 27, 2009, 12:52:31 PM
to clo...@googlegroups.com
On 27.10.2009, at 17:46, Rock wrote:

> What if we created a structmap (a sort of class), where we have a
> flattened out one-dimensional Clojure vector containing all the data
> of our potentially multidimensional (rank n) array, together with its
> dimensions and possibly other info? I believe the C scientific

That's certainly possible. In fact, this is very much how Colt and
netCDF represent multidimensional arrays, except that they use a Java
array as the 1D element storage space.

The real question is what you gain from such an implementation.
Compared to
1) nested Clojure vectors
2) Colt or netCDF arrays
what is the practical advantage of your proposed new data structure?

One answer could be "the best of those two worlds", but do you have a
concrete application in mind where this would be a real advantage?

> I think such an approach could be more flexible. You can have the
> vector as a data pool, and the rest would be sort of like a view on
> that data (which can be changed at runtime if one chooses to). For

That's how nested Clojure vectors behave as well. With Colt and netCDF
arrays, creating such views is simple as well (but they are not
immutable).

Konrad.

Rock

Oct 27, 2009, 1:07:30 PM
to Clojure
Good question.

Let's see. Now, nested vectors ... I don't like them too much for
these things. Why? Because they're just that: nested vectors. They're
not truly multidimensional vectors, and the more I think about them,
the more they really suck from that point of view. For instance, first
of all they're not that safe to use (for these purposes): you could
potentially have a RAGGED nested vector, which you might end up
passing to a function that's expecting a true multidimensional array!
What is one to do? Implement checks all over the place just to make
sure that it's ok? Don't like that. But it gets worse. I imagine, when
dealing with nested vectors, that there's no guarantee that the data
will be contiguous as when you're dealing with a linear vector. So, as
far as efficiency is concerned, maybe they're not that good, but I'm
not sure about that (don't know the implementation details.)

Most of all they're not FLEXIBLE. This point I would like to stress.
You can't easily know the size, rank, and dimensions at runtime, or in
general, with this data-structure. It's just all more complicated. You
have to basically always make sure you're dealing with a true
multidimensional array, and you have to devise a set of functions to
discover the characteristics of that array (and I'm not sure it's an
easy task). Not to mention changing views! That would really be even
more challenging. Maybe I haven't thought about this long enough, I
don't know. But my instincts tell me that nested vectors are no good
when it comes to this kind of stuff.

Rock

Konrad Hinsen

Oct 28, 2009, 2:54:08 PM
to clo...@googlegroups.com
On 27.10.2009, at 18:07, Rock wrote:

> these things. Why? Because they're just that: nested vectors. They're
> not truly multidimensional vectors, and the more I think about them,
> the more they really suck from that point of view. For instance, first
> of all they're not that safe to use (for these purposes): you could
> potentially have a RAGGED nested vector, which you might end up
> passing to a function that's expecting a true multidimensional array!

That's indeed one of the arguments against the use of multidimensional
Java arrays (which are nested arrays) for this purpose.

> What is one to do? Implement checks all over the place just to make
> sure that it's ok? Don't like that.

"All over the place" would be a call to a single test function. But
this need is not specific to a nested vector representation.
Basically, there are two approaches to dealing with Clojure data
structures representing multidimensional arrays:

1) Make them an abstract data structure. Client programs are not
supposed to know the internal representation, they create arrays
exclusively by calling factory functions provided for this purpose.
Functions that expect array arguments can then assume that the data
structure is consistent and needn't do any checks. The tightness of
the abstraction can be enforced to various degrees, but that's a
different issue.

2) Expose the internal representation to client code. Since any
representation in terms of standard Clojure data structures could be
invalid, all array functions need to do some check on their arguments.

In fact, the second approach would be the best argument for the use of
nested vectors, as it is a rather simple and intuitive representation
from the user's perspective. With the first approach, the internal
representation would be chosen exclusively for the convenience of the
implementation.

> But it gets worse. I imagine, when
> dealing with nested vectors, that there's no guarantee that the data
> will be contiguous as when you're dealing with a linear vector. So, as
> far as efficiency is concerned, maybe they're not that good, but I'm
> not sure about that (don't know the implementation details.)

I don't think there is a difference. A Clojure vector is never a
contiguous block of storage.

Konrad.

Rock

Oct 28, 2009, 5:21:56 PM
to Clojure
Your analysis is crystal clear and very helpful Konrad. But you
haven't addressed the issue of dealing with useful information
regarding the data structure itself. What if, for example, a function
wanted to know the rank and dimensions of a multidimensional array it
was being passed, and that array were represented by means of a nested
vector? Suppose we're dealing with rank n objects. Do you think it
would be an easy task to figure all that out dealing with nested
vectors? And by the way, how would you go about implementing in detail
a check to see if a nested vector is actually an authentic
multidimensional array or not? I personally can't come up with a
simple straightforward solution. I imagine you'd have to scan each
dimension one by one, and make sure the various lengths are
consistent. Not a tremendously efficient perspective especially when
you're dealing with huge data sets, and you're forced to perhaps
redundantly check the same objects over and over during operations.

I honestly prefer your first case scenario. Seems much more efficient,
less resource-consuming, and just straightforward. But I really would
like to know what your preference is. If you had to choose, which way
would you go?

Thanks so much for your help.

Rock

harrison clarke

Oct 28, 2009, 6:07:06 PM
to Clojure
maps could also be an option.
you can use vectors of ints as keys. (and you can stick dimensions and
such in there with keywords)

i'm not sure how that compares to nested vectors for performance. you
have the overhead of the hash function, but you don't have any
nesting.
it's also pretty handy if you want to represent a sparse matrix.
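something like this, as a rough sketch (the :dims key and the values are just illustrative):

```clojure
;; A sparse matrix as a map from [row col] index vectors to values,
;; with a keyword entry carrying the dimensions alongside the data.
(def sparse {:dims [1000 1000]
             [0 0]   1.0
             [17 42] 3.5})

(get sparse [17 42])    ;=> 3.5
(get sparse [5 5] 0.0)  ;=> 0.0, absent entries default to zero

;; updating is just assoc:
(assoc sparse [2 3] 7.0)
```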

Konrad Hinsen

Oct 29, 2009, 4:49:43 AM
to clo...@googlegroups.com
On 28 Oct 2009, at 22:21, Rock wrote:

> Your analysis is crystal clear and very helpful Konrad. But you
> haven't addressed the issue of dealing with useful information
> regarding the data structure itself. What if, for example, a function
> wanted to know the rank and dimensions of a multidimensional array it
> was being passed, and that array were represented by means of a nested
> vector?

I didn't address such issues because there is no point in discussing
them before making the fundamental decision whether to use an abstract
or an exposed data type for array data. That choice dictates the
priorities for everything that follows.

> Suppose we're dealing with rank n objects. Do you think it
> would be an easy task to figure all that out dealing with nested
> vectors?

If you can assume the array is well-formed, it is rather easy.
Otherwise it isn't.

> And by the way, How would you go about implementing in detail
> a check to see if a nested vector is actually an authentic
> multidimensional array or not?

That's a rather simple recursive function. The only problem is its run-
time cost.

> I honestly prefer your first case scenario. Seems much more efficient,
> less resource-consuming, and just straightforward. But I really would
> like to know what your preference is. If you had to choose, which way
> would you go?

If I were to design an array interface for Clojure, it would consist
of multimethods implemented for both an efficient array data structure
for internal use and for nested vectors. The implementation for the
latter would convert the nested vectors to the efficient structure and
do all the necessary checks. It would be there for convenience and
clarity in user code. Ideally I would then implement the same
interface for Colt arrays and netCDF arrays as well, both for having
an efficient structure for large data sets and for interoperability.
And once we are at it, why not have another implementation for sparse
arrays, using a suitable data structure?
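To sketch the kind of interface I have in mind, here is one generic operation dispatching over two representations (the map-based "efficient" representation is purely illustrative):

```clojure
;; One generic `shape` operation, implemented both for nested vectors
;; and for a sketched flat internal representation (:data + :dims map).
(defmulti shape
  (fn [a] (if (vector? a) ::nested ::flat)))

;; Nested-vector implementation, assuming a well-formed array:
(defmethod shape ::nested
  [v]
  (if (vector? (first v))
    (cons (count v) (shape (first v)))
    (list (count v))))

;; Flat representation just reads its bookkeeping:
(defmethod shape ::flat
  [a]
  (seq (:dims a)))

(shape [[1 2 3] [4 5 6]])                    ;=> (2 3)
(shape {:data (vec (range 6)) :dims [2 3]})  ;=> (2 3)
```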

Unfortunately, a good design and implementation represents a lot of
work. At the moment I am not sure if we have the critical mass of
people interested in working on this to get the job done in a
reasonable amount of time.

Maps could be a good choice for sparse arrays. For dense ones, they
would represent an enormous waste of memory, and probably time as well.

Konrad.

John Harrop

Oct 29, 2009, 8:48:45 AM
to clo...@googlegroups.com
On Thu, Oct 29, 2009 at 4:49 AM, Konrad Hinsen <konrad...@fastmail.net> wrote:
On 28 Oct 2009, at 22:21, Rock wrote:
> I honestly prefer your first case scenario. Seems much more efficient,
> less resource-consuming, and just straightforward. But I really would
> like to know what your preference is. If you had to choose, which way
> would you go?

If I were to design an array interface for Clojure, it would consist
of multimethods implemented for both an efficient array data structure
for internal use and for nested vectors. The implementation for the
latter would convert the nested vectors to the efficient structure and
do all the necessary checks. It would be there for convenience and
clarity in user code. Ideally I would then implement the same
interface for Colt arrays and netCDF arrays as well, both for having
an efficient structure for large data sets and for interoperability.
And once we are at it, why not have another implementation for sparse
arrays, using a suitable data structure?

One issue with using multimethods is the resolution overhead. I don't think the JIT can optimize this to the extent it can optimize a normal polymorphic Java call.

So you might want to use a Java interface for this instead, and Clojure's Java interop with macros to wrap a nice Clojury API around it.

A second problem is boxing of primitives, and a third is noncontiguous storage. The Java for the nonsparse stuff should probably use and expose Java arrays of primitives. If you can live without runtime polymorphism, having only compile-time polymorphism, the polymorphism can move back to the Clojure side but live in a macro; then a sugar-coated API call directly translates at macroexpand time into manipulation of Java arrays, either in Java calls or in a Clojure function that uses loop/recur with primitives and is loaded with type hints for the array type used (likely double[]).
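A rough sketch of the type-hinted style I mean (the dget macro is hypothetical sugar, not an existing API):

```clojure
;; A hot loop over a primitive double[] with type hints, so the
;; compiler emits direct array ops instead of reflective/boxed calls.
(defn sum-darray [^doubles a]
  (let [n (alength a)]
    (loop [i 0, s 0.0]
      (if (< i n)
        (recur (inc i) (+ s (aget a i)))
        s))))

;; Hypothetical macro sugar: a 2D accessor expanding at compile time
;; into a raw flat-array access with the doubles hint baked in.
(defmacro dget [a i j ncols]
  `(aget ~(with-meta a {:tag 'doubles}) (+ (* ~i ~ncols) ~j)))

;; (sum-darray (double-array [1.0 2.0 3.0])) => 6.0
```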

Rock

Oct 29, 2009, 9:01:18 AM
to Clojure

> I didn't address such issues because there is no point in discussing  
> them before making the fundamental decision whether to use an abstract  
> or an exposed data type for array data. That choice dictates the  
> priorities for everything that follows.

Yes, I agree.

> > Suppose we're dealing with rank n objects. Do you think it
> > would be an easy task to figure all that out dealing with nested
> > vectors?
>
> If you can assume the array is well-formed, it is rather easy.  
> Otherwise it isn't.

Can you give an example in code? I really would like to see it.

> > And by the way, How would you go about implementing in detail
> > a check to see if a nested vector is actually an authentic
> > multidimensional array or not?
>
> That's a rather simple recursive function. The only problem is its run-
> time cost.

Please provide an example. In the meantime, I'll give it a shot
myself.

> > I honestly prefer your first case scenario. Seems much more efficient,
> > less resource-consuming, and just straightforward. But I really would
> > like to know what your preference is. If you had to choose, which way
> > would you go?
>
> If I were to design an array interface for Clojure, it would consist  
> of multimethods implemented for both an efficient array data structure  
> for internal use and for nested vectors. The implementation for the  
> latter would convert the nested vectors to the efficient structure and  
> do all the necessary checks. It would be there for convenience and  
> clarity in user code. Ideally I would then implement the same  
> interface for Colt arrays and netCDF arrays as well, both for having  
> an efficient structure for large data sets and for interoperability.
> And once we are at it, why not have another implementation for sparse  
> arrays, using a suitable data structure?

Yes, indeed I like your design. Makes sense.

> Unfortunately, a good design and implementation represents a lot of  
> work. At the moment I am not sure if we have the critical mass of  
> people interested in working on this to get the job done in a  
> reasonable amount of time.

Well. I'm willing to contribute. We can get this started. It won't
hurt to give it a try. You never know ...
I feel Clojure can and actually should become a language well-suited
for scientific computation. It may not have all the prerequisites now,
but, hey, this is Lisp, and it can be done, always! :)

> On 28 Oct 2009, at 23:07, harrison clarke wrote:
>
> > maps could also be an option. you can use vectors of ints as keys.  
> > (and you can stick dimensions and such in there with keywords)
>
> > i'm not sure how that compares to nested vectors for performance. you
> > have the overhead of the hash function, but you don't have any  
> > nesting.
> > it's also pretty handy if you want to represent a sparse matrix.
>
> Maps could be a good choice for sparse arrays. For dense ones, they  
> would represent an enormous waste of memory, and probably time as well.

Well, sparse arrays should always be treated more specifically I
suppose (no use wasting all those resources!). One thing at a time :)

Rock

Konrad Hinsen

Oct 29, 2009, 9:29:09 AM
to clo...@googlegroups.com
On 29.10.2009, at 14:01, Rock wrote:

>>> Suppose we're dealing with rank n objects. Do you think it
>>> would be an easy task to figure all that out dealing with nested
>>> vectors?
>>
>> If you can assume the array is well-formed, it is rather easy.
>> Otherwise it isn't.
>
> Can you give an example in code? I really would like to see it.

(defn rank
  [item]
  (if (vector? item)
    (inc (rank (nth item 0)))
    0))

(rank 0)             ; 0
(rank [1 2 3])       ; 1
(rank [[1 2] [3 4]]) ; 2

(defn shape
  [item]
  (if (vector? item)
    (let [element (nth item 0)
          n (count item)]
      (cons n (shape element)))
    (list)))

(shape 0)             ; ()
(shape [1 2 3])       ; (3)
(shape [[1 2] [3 4]]) ; (2 2)

>>> And by the way, How would you go about implementing in detail
>>> a check to see if a nested vector is actually an authentic
>>> multidimensional array or not?
>>
>> That's a rather simple recursive function. The only problem is its
>> run-
>> time cost.
>
> Please provide an example. In the meantime, I'll give it a shot
> myself.

(defn check-multiarray
  [item]
  (and (vector? item)
       (apply = (map shape item))
       ;; recurse so that raggedness nested deeper down is caught too
       (every? #(or (not (vector? %)) (check-multiarray %)) item)))

; true
(check-multiarray [1 2 3])
(check-multiarray [[1 2] [3 4]])
; false
(check-multiarray 1)
(check-multiarray [[1 2] 3])
(check-multiarray [[1 2] [3]])

> I feel Clojure can and actually should become a language well-suited
> for scientific computation. It may not have all the prerequisites now,
> but, hey, this is Lisp, and it can be done, always! :)

I completely agree. I don't think Clojure will become a mainstream
language for science any time soon, since it is too "exotic" for a
community that still sticks to Fortran. But it can find its niche, and
perhaps not even a small one.

Konrad.

Konrad Hinsen

Oct 29, 2009, 9:37:03 AM
to clo...@googlegroups.com
On 29.10.2009, at 13:48, John Harrop wrote:

> One issue with using multimethods is the resolution overhead. I
> don't think the JIT can optimize this to the extent it can optimize
> a normal polymorphic Java call.

That's worth exploring - while keeping in mind that the JIT improves
all the time.

> So you might want to use a Java interface for this instead, and
> Clojure's Java interop with macros to wrap a nice Clojury API around
> it.

Maybe. For now I would consider this premature optimization. The big
negative impact of using a Java interface is that it makes it
impossible to work with existing Java classes, such as Colt arrays.
They would have to be wrapped in another object layer just for
implementing the interface.

> A second problem is boxing of primitives, and a third is
> noncontiguous storage. The Java for the nonsparse stuff should
> probably use and expose Java arrays of primitives.

I can see a role both for an implementation based on Clojure vectors
and for one using Java arrays. With a multimethod interface, both can
coexist.

> If you can live without runtime polymorphism, having only compile-
> time polymorphism, the polymorphism can move back to the Clojure
> side but live in a macro;

Indeed. One could implement a polymorphic interface in one namespace
and a compile-time polymorphic interface in another namespace, making
it possible to switch easily between the two.

What is nice about Clojure is that many of these implementation
decisions can be changed later on without modifying the client-side API.

Konrad.
