First post: common-lisp -> numerical-lisp->CL-statistics

Mirko Vukovic

Oct 11, 2012, 8:02:06 PM
to lisp...@googlegroups.com, Liam Healy, Tamas Papp
This is a bit long-winded, but it may have some relevance to the architecture of CL-Statistics

The other day, I was writing this as part of a sequence generating library documentation that I hope to release soon:

The Introduction to R describes vectors. Also of interest are the sections on factors, arrays and matrices, and lists and data frames. Finally, reading data is of interest. At this point, this is starting to sound like Rossini's project common-lisp-stat. Note, however, that this project uses Tamas Papp's xarray.

At that point, I went to google what Rossini was up to, and came upon this list.  It was up for only a day or so when I found it.

The reason for this post is that I am using Liam Healy's gsll and antik libraries.  These use his grid library for representing vectors, matrices, etc.  It is unfortunate that at the time Liam was releasing grid, Tamas released his xarray.

Personally, I dislike fragmentation (grid vs xarray), but now nothing can be done about it.  And some competition is good.

The interesting part of Liam's antik is that it shadows several of CL's symbols: many math functions, and also aref and aref*.  He reimplements these as generic functions, allowing extension to other data types (such as grids).  In a way, Liam is on his way to creating what I would call Numerical Lisp: an extension of CL for numerical computing that may be incompatible with CL.  But the extension is a relatively thin layer on top of CL, leaving all of CL accessible.
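To make the shadowing idea concrete, here is a minimal sketch. The package name nl-sketch and the choice of sin and aref are assumptions for illustration only; this is not Antik's actual code or API.

```lisp
;; Hypothetical package; it only illustrates the mechanism.
(defpackage :nl-sketch
  (:use :common-lisp)
  (:shadow #:sin #:aref)      ; SIN and AREF become fresh symbols here
  (:export #:sin #:aref))

(in-package :nl-sketch)

;; Because SIN is shadowed, it can be defined as a generic function.
(defgeneric sin (x)
  (:documentation "Sine, extensible to new data types."))

(defmethod sin ((x number))
  (cl:sin x))                 ; plain numbers fall through to CL

(defmethod sin ((x vector))
  (map 'vector #'sin x))      ; element-wise on any CL vector

;; AREF gets the same treatment; a grid-like type would add its own method.
(defgeneric aref (object &rest subscripts))

(defmethod aref ((object array) &rest subscripts)
  (apply #'cl:aref object subscripts))
```

Code inside nl-sketch sees the generic versions, while cl:sin and cl:aref stay reachable with an explicit package prefix, which is what keeps the layer thin. A (setf aref) counterpart is left out of this sketch.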

Furthermore, Numerical-Lisp (NL) can unify grid and xarray using the following three-layered structure:
 - Application layer, such as CL-Statistics
 - Numerical-Lisp layer
 - Raw libraries (grid/xarray)

The Numerical-Lisp layer would consist of Liam's redefinitions of CL's mathematics-related functions as generic functions.  The raw libraries would hook into this via methods.  In addition to the CL mathematics functions, NL can define an interface to higher mathematics functions, also using generic functions.  Then the raw libraries can plug in using packages such as GSL, LAPACK, etc.
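The three layers could look roughly like this. Every name below (nl-size, nl-ref, wrapped-vector, mean) is hypothetical, and wrapped-vector merely stands in for a grid or xarray object; none of it is taken from the real libraries.

```lisp
;; Numerical-Lisp layer: the interface is just generic functions.
(defgeneric nl-size (container)
  (:documentation "Number of elements in a one-dimensional CONTAINER."))
(defgeneric nl-ref (container index)
  (:documentation "Element of CONTAINER at INDEX."))

;; Raw-library layer, backend 1: plain CL vectors.
(defmethod nl-size ((v vector)) (length v))
(defmethod nl-ref ((v vector) index) (aref v index))

;; Raw-library layer, backend 2: a stand-in for a grid or xarray object.
(defstruct wrapped-vector
  (data #() :type vector)
  (offset 0 :type fixnum))

(defmethod nl-size ((w wrapped-vector))
  (- (length (wrapped-vector-data w)) (wrapped-vector-offset w)))
(defmethod nl-ref ((w wrapped-vector) index)
  (aref (wrapped-vector-data w) (+ index (wrapped-vector-offset w))))

;; Application layer: written only against the NL interface.
(defun mean (container)
  (let ((n (nl-size container)))
    (/ (loop for i below n sum (nl-ref container i)) n)))

(mean #(1 2 3 4))                                          ; => 5/2
(mean (make-wrapped-vector :data #(10 1 2 3) :offset 1))   ; => 2
```

The application function never mentions either backend, so supporting another array library would only mean adding two more methods.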

What needs to be done is:
- creating the Numerical-Lisp library
- adding hooks in grid and xarray
- writing the application layer to use the Numerical-Lisp interface.

It seems to me that much of this should be relatively straightforward.  I would be glad to help in testing (sbcl, clisp) and documenting.

Finally, what does this have to do with CL-statistics?  Well, I am suggesting that the code be organized in a way that makes it easy to transition from interfacing with CL & xarray to interfacing with NL.

Best,

Mirko

PS - and in the end, when all is said and done, someone writes a great book: Practical Numerical Lisp

Tamas Papp

Oct 12, 2012, 4:36:23 AM
to lisp...@googlegroups.com, Liam Healy, Mirko Vukovic
Hi Mirko,

Thanks for cc'ing me, I hadn't known about this list but now I have
subscribed.

On Fri, Oct 12 2012, Mirko Vukovic <mirko....@gmail.com> wrote:

> The reason for this post is that I am using Liam Healy's gsll and antik
> libraries. These use his grid library for representing vectors, matrices,
> etc. It is unfortunate that at the time Liam was releasing grid, Tamas
> released his xarray.
>
> Personally, I dislike fragmentation (grid vs xarray), but now nothing can
> be done about it. And some competition is good.

xarray is no longer actively maintained, it is listed as DEPRECATED on
github.

FWIW, I found that arrays and displaced arrays are enough for pretty
much everything that I am doing, and fancy array views are not necessary
for me. I am working on a library that introduces some "array theory"
into CL, but I will not release it to the public until it stabilizes as
the API is still in flux.
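As an illustration of how far displaced arrays go, here is a small sketch that uses only standard CL. The helper name row-view is made up for this example and does not come from any library mentioned in this thread.

```lisp
(defun row-view (matrix row)
  "Return a vector displaced onto ROW of the two-dimensional MATRIX."
  (let ((ncols (array-dimension matrix 1)))
    (make-array ncols
                :element-type (array-element-type matrix)
                :displaced-to matrix
                ;; displacement indexes the target in row-major order
                :displaced-index-offset (* row ncols))))

(defparameter *m*
  (make-array '(2 3) :initial-contents '((1 2 3) (4 5 6))))

(defparameter *row1* (row-view *m* 1))

(setf (aref *row1* 0) 40)   ; the view shares storage with *M*
(aref *m* 1 0)              ; => 40
```

The view is an ordinary CL vector, so it works with any code that accepts vectors.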

> What needs to be done is
> - creation of the Numerical-Lisp library
> - hooks in grid and xarray
> - write the application layer to use Numerical-Lisp interface.
>
> It seems to me that much of this should be relatively straightforward. I
> would be glad to help in testing (sbcl, clisp) and documenting.
>
> Finally, what does this have to do with CL-statistics? Well, I am
> suggesting that the code be organized in a way to make it easy to
> transition from interfacing to CL & xarray into NL.

I prefer to use CL arrays or thin wrappers containing arrays (eg
hermitian/lower/upper matrices in LLA) exactly because of this reason.
Arrays are available in CL, so libraries which use arrays can just be
used in plain vanilla CL, without a need for a transition.

> PS - and in the end, when all is said and done, someone writes a great
> book: Practical Numerical Lisp

:-)

Best,

Tamas

A.J. Rossini

Oct 12, 2012, 8:39:11 AM
to lisp...@googlegroups.com, Liam Healy, Mirko Vukovic
Hi both -

Since I've been gratuitously stealing from all of the above folks (Liam, Tamas, etc), and the general philosophy is more to provide core glue than to have "the one true set of packages", I definitely agree with what Mirko is suggesting, and in fact, one quick way to move forward would be to use GSLL as the replacement for liblispstat, at least in the short term (and maybe forever, who knows).

I would definitely prefer at some point in the distant future to have everything in lisp code, but short of running things through the F2CL tool, it's just a pipe dream.

Mirko's suggestion is completely in line with what I'd like to do. However, I need to get a package that looks and acts like R's dataframes (i.e. sort of like a CL array, but column typed with an optional key/id column).  And with the many possible sources of matrix tools around, I'd like a common set of functions/macros for accessing, manipulating, etc.  People can always drop down to the lower level (more direct access, faster), but I definitely want the upper level to remain for portability of access.
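A rough sketch of what "column typed with an optional key/id column" could mean in plain CL follows. The structure and the functions data-frame, make-column, column, cell and row-by-key are all hypothetical, invented here for illustration; they are not the API of common-lisp-stat or any other package.

```lisp
(defstruct data-frame
  (names   '() :type list)   ; column names, as keywords
  (columns '() :type list)   ; one (possibly specialized) vector per column
  (key     nil))             ; optional name of the key/id column

(defun make-column (element-type values)
  "Build a vector of ELEMENT-TYPE holding VALUES."
  (make-array (length values)
              :element-type element-type
              :initial-contents values))

(defun column (df name)
  (let ((pos (position name (data-frame-names df))))
    (unless pos (error "No column named ~S" name))
    (nth pos (data-frame-columns df))))

(defun cell (df row name)
  (aref (column df name) row))

(defun row-by-key (df key-value)
  "Index of the row whose key column equals KEY-VALUE, or NIL."
  (position key-value (column df (data-frame-key df)) :test #'equal))

(defparameter *df*
  (make-data-frame
   :names '(:id :height :smoker)
   :columns (list (make-column 't '("a" "b" "c"))
                  (make-column 'double-float '(1.62d0 1.80d0 1.75d0))
                  (make-column 'bit '(0 1 0)))
   :key :id))

(cell *df* (row-by-key *df* "b") :height)   ; => 1.8d0
```

Each column keeps its own element type, which a single CL array cannot do, and the key column gives lookup by id rather than by position.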

Tamas, I've continued with xarray, so no, it's not deprecated, but I'm now responsible for it (sort of like my "borrowing" of lisp-matrix and Rif's and your related code for BLAS/LAPACK and FNV/FFA access).  

I still need to keep general C-data array access, not just lisp-level, and unlike you, I'd like to support a range of implementations, and not just SBCL (or CCL, or ....).  So I am not focused on optimization, just on functionality and correctness.

I also have the luxury of this being a hobby, and not required to do any real data analysis for a while.

I also see no problem with creating an alternative matrix back end using GSLL and Antik, and I think it is more good than bad.   Much more.   We've got the luxury of being able to experiment, and the package system allows best of breed to be integrated into a core (or replace "core" packages). 

And git allows us all to have different belief systems, while tracking heathens.  (I've always believed that DVCs are better than centralized version control -- basically, acknowledging the reality of subjective belief systems for "the real version" versus the fiction of "the one true version of the source" which centralized systems require, a frequentist philosophy system.)

best,
-tony

A.J. Rossini

Oct 12, 2012, 8:45:52 AM
to lisp...@googlegroups.com, Liam Healy, Tamas Papp
Hi Mirko -

The other possibility is that, depending on the complexity of Antik, we just replace xarray with antik.  (It sounds like Antik does a good deal; whether it should all be in the same place is another matter, and that is a question I shouldn't ask until I read the code later this weekend. Sorry for posing it prematurely!)  Some of this might be pretty simple.

David Hodge

Oct 18, 2012, 9:06:51 AM
to lisp...@googlegroups.com, Liam Healy, Tamas Papp
I'd like to thank Mirko for his well-considered post.

Some extra thoughts:

1. The layering should be

Visualisation
Application
Numerics 
"Raw or low level" - gsll, blas, lapack etc

I think that Visualisation is at least as important as the application layer and should be properly represented as such in the architecture

2. Component choice

Lots of discussion about xarray vs grid vs .....

and from a visualisation perspective, lots of good suggestions about grammar of graphics, D3.js etc etc

While I don't wish to appear to stifle such discussions, they are much more fruitful when we have some sort of baseline that gives at least basic functionality. Endless conversations about "what about this or that component" just actually prevent progress IMHO.

So I hope that we can start a conversation that will allow us to settle quickly on a core set of components on which we are in agreement. We can then start experimenting with other approaches, but without a solid, well understood and agreed foundation, evolution of CLS will be tricky - to say the least.

Let's decide this quickly so we can actually spend our time effectively as a team. We might individually have to compromise on our long term desires for the short term gain, but if we can build a solid foundation quickly, we can then go off and add our favourite things later.

As a start, I vote for gsll as a "low level" library, mainly as a replacement for the now defunct liblispstat.
I would also propose a vote for gnuplot as the simplest visualisation alternative. It's not the best, most dynamic or whatever, but it is easy to drive and quite portable across platforms.
In terms of arrays/dataframes etc, I have an open mind, but the need for heterogeneous datatypes in dataframes needs some thought.

Tony, Steve, your thoughts?

Steven Núñez

Oct 18, 2012, 8:57:04 PM
to lisp...@googlegroups.com, Liam Healy, Tamas Papp
Personally I like to take a practical approach to getting these things started, and I think you've done that in your post.

As for graphics, I don't have enough knowledge of low-level graphics programming to contribute usefully to the discussion. I think it would be difficult to overestimate the importance of data visualization these days, but let's not agonise over the color of the bikeshed (the FreeBSD guys, who do some great engineering, have something to say about this) but simply aim, as far as practical, not to make the climb too steep for those that want to add other bits in the future.

- SteveN

--
You received this message because you are subscribed to the Google Groups "Common Lisp Statistics" group.
To post to this group, send email to lisp...@googlegroups.com.
To unsubscribe from this group, send email to lisp-stat+...@googlegroups.com.
Visit this group at http://groups.google.com/group/lisp-stat?hl=en.