Tips for TIPs? Vector/linalg math extension

Christian Gollwitzer

Mar 7, 2014, 9:33:20 AM

Hi all,

I'm working on a numeric array extension for Tcl which can do basic
vector math and linear algebra on N-dimensional arrays with the goal to
get it (or something similar) included into the Tcl core.

At the present stage, the math functionality is a superset of what TIP
#363 suggests, without the need to alter substitution semantics. It is
designed to blend well with the rest of Tcl. See below for some
examples of how it looks and what already works.

My question is, how do I best discuss it with the Tcl community and the
TCT in order to get a realistic chance of inclusion into the core? I'm
based in Germany and I plan to go to the EuroTcl meeting to show it
there. Shall I write a TIP document?

Below you find some examples of use.

Best wishes,

Christian

Design goals:
===============

* Compatible with the nested list representation.
Therefore compatible with list functions (lindex $A 1 2 3 is equal to
vexpr {A[1,2,3]}) and tcllib math::linearalgebra

* Compact core with zero dependencies (currently it is dependent on
tcllib, but just for the parser generator pt)

* Array language compatible with expr and Matlab syntax (textbook math)
as closely as possible

Arguments why this should be in the core
========================================

* Currently a patch to the core is required to make shimmering to lists
efficient. It works without, but then goes via the string
representation. Still it does nasty things even with the patch (touching
the list object type)

* Larger numerical systems could be built upon this infrastructure, if
it is available (cf. Python NumPy and SciPy)

* The current implementation compiles the vector expressions into Tcl
procs with some upvar trickery to get the variable scoping right and
caches the compiled expressions in a dict. If done in the core, it could
do proper bytecode compilation and cache the expression in the object


Simple Demonstration
=====================


package require vectcl

# create a vector and multiply by 3
set x {1.0 2.0 3.0}
vexpr {3*x}
# 3.0 6.0 9.0

# create a matrix
set A {{2.0 3.0} {5.0 6.0} {7.0 8.0}}

# solve a linear system of equations
# in the least squares sense
vexpr {p=A\x}
# 0.23684210526315772 0.15789473684210545

# compare the fitted solution with the data
vexpr {A*p}
# 0.9473684210526319 2.1315789473684212 2.921052631578948

# compute the residuals
vexpr {A*p-x}
# -0.05263157894736814 0.13157894736842124 -0.07894736842105221

# compute the residual norm
vexpr {sum((A*p-x).^2)}
# 0.026315789473684164

# alter the matrix by assigning a slice
# and recompute the parameters
vexpr {
A[2,:] = {4.0 7.0}
p = A \ x
}
# -0.3582089552238807 0.626865671641791
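
For readers who want to check the numbers in the demonstration by hand, here is a minimal C sketch. It assumes that the backslash operator returns the ordinary least-squares solution, and it computes that solution through the 2x2 normal equations (A^T A) p = A^T x. That is a shortcut chosen for brevity, not necessarily what VecTcl does internally, and the name lstsq2 is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Least-squares fit for an m-by-2 matrix A (row-major) and data x:
 * minimise ||A p - x||_2 by solving the normal equations
 * (A^T A) p = A^T x with Cramer's rule.
 * Returns 0 on success, -1 if A^T A is singular. */
int lstsq2(const double *A, const double *x, size_t m, double p[2])
{
    double s00 = 0.0, s01 = 0.0, s11 = 0.0; /* entries of A^T A */
    double b0 = 0.0, b1 = 0.0;              /* entries of A^T x */
    double det;

    assert(A != NULL && x != NULL && m >= 2);
    for (size_t i = 0; i < m; i++) {
        double a0 = A[2 * i], a1 = A[2 * i + 1];
        s00 += a0 * a0;
        s01 += a0 * a1;
        s11 += a1 * a1;
        b0 += a0 * x[i];
        b1 += a1 * x[i];
    }
    det = s00 * s11 - s01 * s01;
    if (det == 0.0)
        return -1;
    p[0] = (s11 * b0 - s01 * b1) / det;
    p[1] = (s00 * b1 - s01 * b0) / det;
    return 0;
}
```

With A = {{2 3} {5 6} {7 8}} and x = {1 2 3} this gives p = (9/38, 6/38), which agrees with the values printed above; replacing the last row by {4 7} gives (-48/134, 84/134). For ill-conditioned problems an orthogonal decomposition such as QR is the more robust route.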

Arjen Markus

Mar 7, 2014, 9:50:51 AM

For your information: I am working on a loadable extension for LAPACK and BLAS.

But what you are working on looks charming :)

Regards,

Arjen

rene

Mar 7, 2014, 10:28:52 AM

On Friday, March 7, 2014 at 15:33:20 UTC+1, Christian Gollwitzer wrote:
> Hi all,
>
> I'm working on a numeric array extension for Tcl which can do basic
> vector math and linear algebra on N-dimensional arrays with the goal to
> get it (or something similar) included into the Tcl core.

Currently we have the vector command in blt and rbc.
How does your work relate to these commands?
Can it be used as a replacement with additional functionality?


Thank you
rene

Christian Gollwitzer

Mar 7, 2014, 5:16:24 PM

Hi rene,

On 07.03.14 16:28, rene wrote:
The main difference to blt::vector (and the similar packages NAP or
tcl-tna) is that my extension uses a proper Tcl_Obj type (NumArray) to
extend the Tcl type system in a seamless way. All the other packages I
know of use object (handle) semantics. Value semantics blends better
with the rest of Tcl; it has the following advantages:

1) There is no difference between Tcl variables and NumArrays; you can do

set x {1 2 3}
set y [vexpr {3*x}]

as well as

vexpr { x=ones(5) }
set y [lindex $x 2]
puts $x

The variables in the above examples shimmer from string or list to
NumArray and from NumArray to string/list as required.
For most expressions, vexpr gives (should give) the same answer as expr
(no command substitution, no strings and drop the $ in front of variables)

2) Because it uses standard variables, the Tcl interpreter takes care of
memory management and variable scoping. If a NumArray goes out of scope
of a proc, it is deleted just like a simple list. In contrast, you
need to manually destroy a blt::vector, and it is a global command.

3) There is also no difference between vexpr functions and Tcl procs:

set x {1 2 3 4}
vexpr { n = llength(x) }

"llength" here calls the standard llength function, it is nothing
special. Therefore, you can easily extend vexpr by writing a proc. This
is difficult to achieve with the other packages

4) Returning more than a single value from a function is also easy; just
return a list. This plays nicely with the Matlab-style list assignment:

proc multi {x} {
return [list $x [expr {$x*2}]]
}

vexpr { x, y = multi(3) }
# now x is 3 and y is 6

or, for the built-in QR decomposition

vexpr { Q, R = qr(A) }

is equivalent to

lassign [numarray qr $A] Q R

and in fact, the former is transformed into the latter by the compiler.

5)
The vector language is more powerful than blt::vector expr; it handles
the N-dimensional case with slicing and reductions in a concise way. You
can express "Take the 2nd column of a matrix A and sum up every 2nd
element starting from the 8th" with a single

vexpr { sum(A[7::2, 1]) }

invocation. Or, if x and y contain column vectors (a simple list), then
x'*y is the dot product, whereas x*y' gives the outer (Kronecker) product ('
denotes matrix transposition).

This is inspired by Matlab/NumPy and should already compile most of the
expressions found in the wild. Also, multiple statements are supported
such that small math "kernels" can run under vexpr. These slices also
use a shared buffer and copy-on-write to take up only very little space.
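
The slicing in point 5 can be pictured as pointer arithmetic on a shared buffer. The following C sketch assumes a row-major layout and zero-based indices. View1D, column_slice and view_sum are hypothetical names for this illustration, not VecTcl's C interface, and the copy-on-write part is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 1-D view into a buffer that the view does not own. */
typedef struct {
    const double *base;  /* first element of the view */
    size_t count;        /* number of elements in the view */
    size_t stride;       /* distance between elements, in doubles */
} View1D;

/* Describe A[start::step, col] for a row-major rows-by-cols matrix.
 * No data is copied; the view only records where the elements live. */
View1D column_slice(const double *A, size_t rows, size_t cols,
                    size_t start, size_t step, size_t col)
{
    View1D v = { A, 0, step * cols };

    assert(A != NULL && step > 0 && col < cols);
    if (start < rows) {
        v.base = A + start * cols + col;
        v.count = (rows - start + step - 1) / step;
    }
    return v;
}

/* Reduction over a view: walk the buffer with the recorded stride. */
double view_sum(View1D v)
{
    double s = 0.0;

    for (size_t i = 0; i < v.count; i++)
        s += v.base[i * v.stride];
    return s;
}
```

Under these assumptions, view_sum(column_slice(A, rows, cols, 7, 2, 1)) plays the role of sum(A[7::2, 1]): the slice costs three words of bookkeeping regardless of how many elements it covers.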


On the other hand, there are also drawbacks of the value approach:

1) You can't specify a storage class. In the end, vexpr will support
integers, doubles and complex doubles (needed for linalg stuff like
eigenvalues). In contrast, objects can have a real memory type; tcl-tna
supports a wide range of signed and unsigned integers and differently
sized floats.

2) Tcl-tna compiles the expression to bytecode which executes the
elementwise operations within the implicit loop. That will run faster.
In addition it is threaded. Not impossible for vectcl, but rather hard
and unlikely to change in the foreseeable future.

Currently, all of the above does work; but integers and complex numbers
do not yet work. Lots of work must go into cleaning the code,
stabilizing the C interface, writing docs... all the boring stuff:) I'm
a bit busy now, but soon I might make the extension available through
github or similar.

Christian

PS: A week ago I sent you an email with binaries of kbskit0.4.5 for OSX
(tested on 10.6,10.8 and 10.9). Have you seen it?

Christian Gollwitzer

Mar 7, 2014, 5:47:57 PM

Dear Arjen,

On 07.03.14 15:50, Arjen Markus wrote:
> For your information: I am working on a loadable extension for LAPACK and BLAS.
> But what you are working on looks charming :)

I ran across your name a few times during my research on Tcl and math on
the wiki. What is the status of that work? Are you wrapping the Fortran
version of LAPACK, or are you using the C translated version?

In my code, the linalg backends are rather primitive so far. It only
does dense matrix decompositions, and these are adapted from JAMA
http://math.nist.gov/javanumerics/jama/
which is public domain. For sure a real LAPACK thing would run much faster.

Do you think it is worth to try backing NumArrays with LAPACK? There are
a few misfits, though; VecTcl uses row-major ordering, an array could be
sliced (slices share the buffer with the parent variable for
copy-on-write), and I'm a bit reluctant to pull in dependencies, since
the goal is to go into the core. But maybe LAPACK could be optional,
with a simple C fallback.
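
To make the ordering misfit concrete: LAPACK's Fortran routines expect column-major storage, so a row-major buffer either has to be copied into the other order or be passed as the transpose. Below is a minimal C sketch of the copy; row_to_col_major is a hypothetical helper name, and a sliced (strided) source would need additional stride arguments that are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Copy an m-by-n matrix from row-major order (element (i,j) at
 * src[i*n + j]) to column-major order (element (i,j) at dst[j*m + i]).
 * src and dst must not overlap. */
void row_to_col_major(const double *src, double *dst, size_t m, size_t n)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * m + i] = src[i * n + j];
}
```

The same buffer read without copying is simply the n-by-m transpose in column-major terms, which is why some wrappers reformulate the call in terms of the transposed problem instead of copying.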

Since you are based in the Netherlands, are you planning to come to
Munich to the EuroTcl?

Christian

Donal K. Fellows

Mar 9, 2014, 2:15:53 PM

On 07/03/2014 14:33, Christian Gollwitzer wrote:
> My question is, how do I best discuss it with the Tcl community and the
> TCT in order to get a realistic chance of inclusion into the core? I'm
> based in Germany and I plan to go to the EuroTcl meeting to show it
> there. Shall I write a TIP document?

My advice (and yes, I plan to be at EuroTcl but I can't be sure until
closer to the time if I'll have work conflicting too close) is to NOT
seek to TIP this, and to instead focus on making code that is good,
fast, and easy to use. Integrating into Tcl's built-in expression engine
is probably not a good idea (it's very closely entangled with the
bytecode engine) and you'll end up having to make some awkward
compromises with regard to some of the operations anyway (there are
several types of multiplication over higher-order numeric types like
vectors and matrices). It'll be so much easier to go with your own
commands in your own package so that you don't need to compromise nearly
as far in other ways.

Once it's road tested and if other people think it's a good idea as
well, we can add it to the contrib package tree (like TDBC, Itcl, Thread
and SQLite). The only TIPping involved (if any) will be an agreement
that we'll be doing this. :-)

Aside from that, are you trying to learn the lessons from what other
Tclers have done in this area? That's a great way to leverage existing
work and encode best practice.

Donal.
--
Donal Fellows — Tcl user, Tcl maintainer, TIP editor.

Christian Gollwitzer

Mar 9, 2014, 4:09:16 PM

Hi Donal,

thanks for your feedback.

On 09.03.14 19:15, Donal K. Fellows wrote:
> My advice (and yes, I plan to be at EuroTcl but I can't be sure until
> closer to the time if I'll have work conflicting too close) is to NOT
> seek to TIP this, and to instead focus on making code that is good,
> fast, and easy to use.

There is one thing which is hard to do for an extension at the present
stage: Shimmering *TO* a list is not supported by Tcl's type system. This
should be fast as well, so that you can do a foreach/lindex etc. over a
NumArray.

Currently, I've implemented it by wrapping the setFromAny proc of the
built-in list type, but this isn't enough: The core calls SetListFromAny
directly. After a trivial patch (replacing ~10 occurrences of
SetListFromAny with Tcl_ConvertToType in tclListObj.c) the extension can
inject the wrapper. If the patch is not applied, then it still works,
but the NumArray updates the string rep and the list code parses this,
which is inherently inefficient. Maybe there is a chance to include that
small patch into the core? I would guess that the slowdown is not so
dramatic for other uses, but the improvement for the vector extension is
dramatic.

> Integrating into Tcl's built-in expression engine
> is probably not a good idea (it's very closely entangled with the
> bytecode engine) and you'll end up having to make some awkward
> compromises with regard to some of the operations anyway (there are
> several types of multiplication over higher-order numeric types like
> vectors and matrices). It'll be so much easier to go with your own
> commands in your own package so that you don't need to compromise nearly
> as far in other ways.

I'm aware of that and I agree that it should be a different command.

> Aside from that, are you trying to learn the lessons from what other
> Tclers have done in this area? That's a great way to leverage existing
> work and encode best practice.

Well, I know many similar things. The closest is maybe Python's NumPy,
though Python has "real" type information and references at the script
level. The other existing Tcl packages all implement object commands and
do not use Tcl_Obj, which is a very different design decision. Yes I'm
reading the source of the competitors to learn how they do it, but it's
often hard to transfer because of different designs. (BLT has only 2
dimensions, tcl-tna puts an arbitrary compile-time limit, no other Tcl
extension does matrix decompositions/inversion etc.)

I would love to meet you at EuroTcl to discuss these issues and
the future of Tcl!

Christian

Alexandre Ferrieux

Mar 10, 2014, 2:32:26 PM

Hi Christian,

On Sunday, March 9, 2014 9:09:16 PM UTC+1, Christian Gollwitzer wrote:
>
> There is one thing which is hard to do for an extension at the present
> stage: Shimmering *TO* a list is not supported by Tcl's type system. This
> should be fast as well, so that you can do a foreach/lindex etc. over a
> NumArray.

You are quite right about the fact that integrating seamlessly with the List type is unduly complex, and needs a patch. A truly modular type system is one of my old dreams, but that's anything but a weekend project :(

Now in your specific case, why not stick to lists (and lists of lists) themselves for the vector type? Is there some metainformation that you want to carry along, and that wouldn't fit in a vanilla list / list of lists?

-Alex

Christian Gollwitzer

Mar 10, 2014, 5:46:22 PM

Hi Alex,

On 10.03.14 19:32, Alexandre Ferrieux wrote:
> You are quite right about the fact that integrating seamlessly with
> the List type is unduly complex, and needs a patch. A truly
> modular type system is one of my old dreams, but that's anything
> but a weekend project :(

This would be great, but I understand this is lots of work.

> Now in your specific case, why not stick to lists (and lists of
> lists) themselves for the vector type ? Is there some metainformation
> that you want to carry along, and that wouldn't fit in a vanilla list
> / list of lists ?

Well, a list stores pointers to Tcl_Objs, each of which is itself a list
of pointers... This is very wasteful in contrast to a sane vector
extension, which stores e.g. a contiguous memory buffer of doubles. And
the list representation does not enforce that the length in all
dimensions is equal. Both of these things make fast and memory-efficient
operations hard.

Still the idea was to have acceptable interoperability with lists on the
script level. If you lindex into a 2D array, then a list with slices
into the original array is generated. Therefore this full thing does not
unfold to the list-of-list-Tcl_Obj-doubles monster until you lindex into
an element of every row.

Christian

Uwe Klein

Mar 11, 2014, 3:39:22 AM

On 09.03.2014 21:09, Christian Gollwitzer wrote:
> Hi Donal,
>
> thanks for your feedback.
>
> On 09.03.14 19:15, Donal K. Fellows wrote:

>> Aside from that, are you trying to learn the lessons from what other
>> Tclers have done in this area? That's a great way to leverage existing
>> work and encode best practice.
>
> Well, I know many similar things. The closest is maybe Python's NumPy,
> though Python has "real" type information and references at the script
> level. The other existing Tcl packages all implement object commands and
> do not use Tcl_Obj, which is a very different design decision. Yes I'm
> reading the source of the competitors to learn how they do it, but it's
> often hard to transfer because of different designs. (BLT has only 2
> dimensions, tcl-tna puts an arbitrary compile-time limit, no other Tcl
> extension does matrix decompositions/inversion etc.)
>

Are you aware of Neil D. McKay's work?

see:
http://wiki.tcl.tk/5723
ftp://ftp.oreilly.com/pub/conference/os2001/tcl_papers/mckay.ps

uwe

oc_forums

Mar 11, 2014, 5:18:07 AM

On Tuesday, March 11, 2014 at 08:39:22 UTC+1, Uwe Klein wrote:
> see:
> http://wiki.tcl.tk/5723

This is a link to Tcl3D; is there a vector/linalg math module inside Tcl3D?

Olivier.

Arjen Markus

Mar 11, 2014, 5:27:41 AM

On Tuesday, March 11, 2014 10:18:07 AM UTC+1, oc_forums wrote:
> This is a link to Tcl3D; is there a vector/linalg math module inside Tcl3D?

No, it is a link to TK3D. Neil describes the contents of the package on that page. And there are several other Wiki pages that are dedicated to the Tensor package.

Regards,

Arjen

Arjen Markus

Mar 11, 2014, 5:35:15 AM

On Friday, March 7, 2014 11:47:57 PM UTC+1, Christian Gollwitzer wrote:
> Dear Arjen,
>
> On 07.03.14 15:50, Arjen Markus wrote:
> > For your information: I am working on a loadable extension for LAPACK and BLAS.
> > But what you are working on looks charming :)
>
> I ran across your name a few times during my research on Tcl and math on
> the wiki. What is the status of that work? Are you wrapping the Fortran
> version of LAPACK, or are you using the C translated version?
>

As Fortran is one of the other programming languages I use a lot, I do use the original Fortran code. I use the wrapping facilities I developed to create interfaces that are easier to use than a blunt translation. For instance: the Tcl data are lists and lists of lists to represent vectors and matrices. Since these "know" their sizes, there is no need to pass this information via an argument - at least not on the Tcl side. So, for instance, [dnrm2] takes only one argument, the vector, instead of the vector and the number of elements in the vector.

I was able to automate a lot of the interfacing, but now I am fine-tuning it to take advantage of simplifications like the above.
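
The simplification described here, dropping the length argument because the Tcl value already knows its size, might look roughly like this in C. Everything in the sketch is hypothetical: sumsq_raw only mimics a BLAS-style calling convention (explicit n and stride), Vec stands in for a Tcl list of doubles, and the sum of squares is used instead of dnrm2's Euclidean norm so that the example does not need the math library.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in with a BLAS-like signature: explicit length and stride.
 * It returns the sum of squares; the real dnrm2 returns the Euclidean
 * norm and is more careful about overflow. */
static double sumsq_raw(int n, const double *x, int incx)
{
    double s = 0.0;

    for (int i = 0; i < n; i++)
        s += x[i * incx] * x[i * incx];
    return s;
}

/* Hypothetical vector value that carries its own length, like a Tcl list. */
typedef struct {
    size_t n;
    const double *data;
} Vec;

/* Wrapper: the caller passes only the vector; length and stride are
 * filled in from the value itself. */
double vec_sumsq(Vec v)
{
    assert(v.n == 0 || v.data != NULL);
    return sumsq_raw((int)v.n, v.data, 1);
}
```

The wrapper is also the natural place for consistency checks, for example that two vectors passed to a dot product have the same length.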

>
> In my code, the linalg backends are rather primitive so far. It only
> does dense matrix decompositions, and these are adapted from JAMA
> http://math.nist.gov/javanumerics/jama/
> which is public domain. For sure a real LAPACK thing would run much faster.
>
> Do you think it is worth to try backing NumArrays with LAPACK? There are
> a few misfits, though; VecTcl uses row-major ordering, an array could be
> sliced (slices share the buffer with the parent variable for
> copy-on-write), and I'm a bit reluctant to pull in dependencies, since
> the goal is to go into the core. But maybe LAPACK could be optional,
> with a simple C fallback.
>

My efforts are to create a loadable package, not something that would go into the core.

>
> Since you are based in the Netherlands, are you planning to come to
> Munich to the EuroTcl?
>

I will definitely give it serious thought. I have enjoyed all of the Tcl meetings I have been to, but some practical issues may get in the way.

Regards,

Arjen


oc_forums

Mar 11, 2014, 6:02:51 AM

On Sunday, March 9, 2014 at 21:09:16 UTC+1, Christian Gollwitzer wrote:

> Well, I know many similar things. The closest is maybe Python's NumPy,
> though Python has "real" type information and references at the script
> level. The other existing Tcl packages all implement object commands and
> do not use Tcl_Obj, which is a very different design decision. Yes I'm
> reading the source of the competitors to learn how they do it, but it's
> often hard to transfer because of different designs. (BLT has only 2
> dimensions, tcl-tna puts an arbitrary compile-time limit, no other Tcl
> extension does matrix decompositions/inversion etc.)

But BLT 2.5 has a Matrix improvement (multiply and transpose). Wouldn't it be easier to implement vectors in the math::linearalgebra package of Tcllib, to preserve what has been done so far and concentrate the maintenance effort?

oc_forums

Mar 11, 2014, 6:10:17 AM

On Tuesday, March 11, 2014 at 10:27:41 UTC+1, Arjen Markus wrote:

> No, it is a link to TK3D. Neil describes the contents of the package on that page. And there are several other Wiki pages that are dedicated to the Tensor package.
>

Oops, thank you... I definitely prefer pure Tcl; at least the work that has gone into such nice packages or extensions doesn't die as easily as it does with these binary packages.

Olivier.

jima

Mar 11, 2014, 4:38:49 PM

Hi,

I've worked with NAP before and I think the goal of having something like NumPy and SciPy is really worthy.

I don't know if you were aware of napcore which is a simplified version of NAP (http://chiselapp.com/user/jima/repository/napcore/index) I bundled so people can begin using it without having to deal with all NAP dependencies. This should build/work in 8.6. I've even corrected a couple of bugs.

Just in case you may find it useful.

My intended next step was to make Nap_NAO objects work inside the Tcl [expr] command, but at the moment I am really not finding the bandwidth to progress with this.

jima

Alexandre Ferrieux

Mar 11, 2014, 6:03:52 PM

On Monday, March 10, 2014 10:46:22 PM UTC+1, Christian Gollwitzer wrote:
>
> > Now in your specific case, why not stick to lists (and lists of
> > lists) themselves for the vector type ? Is there some metainformation
> > that you want to carry along, and that wouldn't fit in a vanilla list
> > / list of lists ?
>
> Well, a list stores pointers to Tcl_Objs, which itself is then a list
> with pointers... this is very wasteful in contrast to a sane vector
> extension which stores a contiguous memory buffer of doubles, e.g.[...]

Well, I agree there is a slight overhead with the per-row allocation in the matrix case (but none in vectors). But the aggressive reuse of backing arrays in Lists makes this potentially insignificant, especially if you spend the effort of favouring in-place operations. This is perfectly doable in an extension (hint, hint).

> the list representation does not enforce that the length in all
> dimensions is equal. Both of these things make fast and memory-efficient
> operations hard.

If you insist on *not* using nested lists for matrices, and (understandably) cling to the requirement of dimension checks, you can encode your metadata in 1-D lists as you see fit. For example:

1-D vectors:
{vector coordinates ... dim 1}

2-D matrices:
{matrix elements ... dimX dimY 2}

N-th order tensors:
{tensor elements ... dim1 dim2 ... dimN N}

This way you keep:
- a compact backbone (the list's backing array of Tcl_Obj*)
- a fast random access
- fast order and dimension checks (by indexing from the end)

And of course:
- no need for a patch
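
As an illustration of the "indexing from the end" check, here is a small C sketch that decodes the trailing metadata of such a flat list. It assumes the layout shown above (elements, then dim1 ... dimN, then N as the last entry) and models the list as an array of doubles; Shape, MAX_ORDER and decode_shape are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ORDER 8  /* arbitrary limit for this sketch */

typedef struct {
    size_t order;            /* N, the number of dimensions */
    size_t dims[MAX_ORDER];  /* dim1 .. dimN */
    size_t nelems;           /* product of all dimensions */
} Shape;

/* Decode "elements ... dim1 ... dimN N" from a flat list of len values.
 * Returns 0 on success, -1 if the metadata is inconsistent. */
int decode_shape(const double *list, size_t len, Shape *out)
{
    size_t order, n = 1;

    assert(list != NULL && out != NULL);
    if (len < 2 || list[len - 1] < 1.0 || list[len - 1] > MAX_ORDER)
        return -1;
    order = (size_t)list[len - 1];
    if (len < order + 1)
        return -1;
    for (size_t k = 0; k < order; k++) {
        double d = list[len - 1 - order + k];
        if (d < 0.0 || d > (double)len)
            return -1;
        out->dims[k] = (size_t)d;
        n *= out->dims[k];
    }
    /* The element count must match the list length exactly. */
    if (n + order + 1 != len)
        return -1;
    out->order = order;
    out->nelems = n;
    return 0;
}
```

For a 2-by-3 matrix the flat list would be {1 2 3 4 5 6 2 3 2}, and element (i,j) then sits at index i*3+j of the same list.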

-Alex

Christian Gollwitzer

Mar 11, 2014, 6:20:45 PM

On 11.03.14 23:03, Alexandre Ferrieux wrote:
> On Monday, March 10, 2014 10:46:22 PM UTC+1, Christian Gollwitzer wrote:
>>
>>> Now in your specific case, why not stick to lists (and lists of
>>> lists) themselves for the vector type ? Is there some metainformation
>>> that you want to carry along, and that wouldn't fit in a vanilla list
>>> / list of lists ?
>>
>> Well, a list stores pointers to Tcl_Objs, which itself is then a list
>> with pointers... this is very wasteful in contrast to a sane vector
>> extension which stores a contiguous memory buffer of doubles, e.g.[...]
>
> Well, I agree there is a slight overhead with the per-row allocation in the matrix case (but none in vectors).

I don't understand this. Below is how I usually create a list of doubles
from a double *numbers, maybe this is wrong?

Tcl_Obj *list = Tcl_NewObj();
int i;

for (i = 0; i < N; i++) {
    /* one freshly allocated Tcl_Obj per number */
    Tcl_ListObjAppendElement(interp, list, Tcl_NewDoubleObj(numbers[i]));
}

(not copy-pasted, please ignore syntax errors)

My expectation is that for N=1000, this creates an array of >=1000
pointers pointing to Tcl_Objs with inlined doubles, where the Tcl_Obj
holds a refcount, an invalid string rep, a type ptr, and the double
value. On a 64-bit machine even the array of pointers itself would amount
to the same size as the original double array, not to mention the
Tcl_Obj for every element.
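
A rough way to put numbers on this estimate is sketched below in C. MockObj is a simplified stand-in that only mirrors the fields named above (refcount, string rep pointer and length, type pointer, internal rep); the authoritative definition of Tcl_Obj lives in tcl.h. List headers and allocator overhead are ignored, and both function names are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified mock of a boxed value; not the real Tcl_Obj. */
typedef struct MockObj {
    int refCount;         /* reference count */
    char *bytes;          /* string representation, may be invalid */
    int length;           /* length of the string representation */
    const void *typePtr;  /* pointer to the type descriptor */
    union {
        long longValue;
        double doubleValue;
        struct { void *ptr1; void *ptr2; } twoPtrValue;
    } internalRep;
} MockObj;

/* Bytes for n doubles stored as a list: one pointer plus one boxed
 * object per element. */
size_t boxed_list_bytes(size_t n)
{
    assert(n > 0);
    return n * (sizeof(MockObj *) + sizeof(MockObj));
}

/* Bytes for the same n doubles in one contiguous buffer. */
size_t contiguous_bytes(size_t n)
{
    return n * sizeof(double);
}
```

On a typical 64-bit platform the mock comes to 48 bytes, so each boxed element costs about 56 bytes against 8 for the raw double, roughly a factor of seven before any per-allocation overhead.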

Is the above code plain wrong, or is my idea of what it does
incorrect?

The internal rep of the whole double array on the other hand, is just a
memory buffer + some meta information about datatype, the list of
dimensions etc.

Christian

Alexandre Ferrieux

Mar 12, 2014, 4:48:26 PM

On Tuesday, March 11, 2014 11:20:45 PM UTC+1, Christian Gollwitzer wrote:
>
> The internal rep of the whole double array on the other hand, is just a
> memory buffer + some meta information about datatype, the list of
> dimensions etc.

OK, this aspect of your implementation had escaped me before. I thought you were still keeping an array of Tcl_Obj*. Now I see you are selecting one numeric type (doubles) and storing them directly as scalars ('expanded' as we would say in Eiffel). This is indeed *much* more efficient than what I thought, no comparison.

But in this case, I don't see a compelling reason for optimized bridges to the List type, since any mapping would incur the re-boxing (or conversion and un-boxing) of the individual values, with the overhead you've just highlighted.

Also, note that selecting one scalar type biases the package towards one kind of computation (floating point based), precluding its use with e.g. bignums. But maybe this does not really matter given the target fields.

-Alex

Christian Gollwitzer

Mar 12, 2014, 6:58:09 PM

On 12.03.14 21:48, Alexandre Ferrieux wrote:
> OK, this aspect of your implementation had escaped me before. I
> thought you were still keeping an array of Tcl_Obj*. Now I see you
> are selecting one numeric type (doubles) and storing them directly as
> scalars ('expanded' as we would say in Eiffel). This is indeed *much*
> more efficient than what I thought, no comparison
>
> But in this case, I don't see a compelling reason for optimized
> bridges to the List type, since any mapping would incur the re-boxing
> (or conversion and un-boxing) of the individual values, with the
> overhead you've just highlighted.

Yes, this is true. Shimmering will occur sooner or later, as the user
wants to pull out the results from the computation, but all the
intermediate computations will be done in the most efficient
representation (contiguous array).

Of course, going via the string representation is still much more
expensive than via the list converter; the below code shimmers a vector
of 10000 elements between NumArray and list:

(vectcl) 49 % set x {}; for {set i 0} {$i<10000} {incr i} {
lappend x [expr rand()]
}
(vectcl) 50 % time {numarray .*= x 1.0; llength $x } 1000
1663.129394 microseconds per iteration


without the patch:

(vectcl) 50 % time {numarray .*= x 1.0; llength $x } 1000
24022.946292 microseconds per iteration

Therefore it's roughly 14 times faster with the list code. For matrices,
the NumArray->list converter creates slices for the rows of a matrix,
i.e. it shimmers to a list of NumArrays, which point to the original
buffer. The use case would be a

foreach row $matrix

invocation. Going via the string rep necessarily converts every value to
a string.
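
The row-wise conversion can be sketched in C as follows. This is only an illustration of the idea that each row becomes a small view onto the shared buffer; Matrix, RowView and rows_as_views are hypothetical names and do not reflect the actual NumArray structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted matrix with a row-major buffer. */
typedef struct {
    int refcount;
    size_t rows, cols;
    double *data;
} Matrix;

/* Hypothetical row slice: points into the owner's buffer, copies nothing. */
typedef struct {
    Matrix *owner;
    double *first;
    size_t count;
} RowView;

/* Build one view per row. Each view takes a reference on the owner so
 * the buffer stays alive while any row is still in use.
 * Returns a malloc'ed array of m->rows views, or NULL on failure. */
RowView *rows_as_views(Matrix *m)
{
    RowView *views;

    assert(m != NULL && m->data != NULL && m->rows > 0);
    views = malloc(m->rows * sizeof *views);
    if (views == NULL)
        return NULL;
    for (size_t r = 0; r < m->rows; r++) {
        views[r].owner = m;
        views[r].first = m->data + r * m->cols;
        views[r].count = m->cols;
        m->refcount++;
    }
    return views;
}
```

The cost of the conversion then grows with the number of rows, not with the number of elements, which is the situation a foreach over the rows benefits from.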

The question remains: is it possible to incorporate the small patch,
which replaces in tclListObj.c the direct invocations of SetListFromAny
with Tcl_ConvertToType? The extension still needs to do some "illegal"
stuff to inject the list converter, but at least it works.

> Also, note that selecting one scalar type biases the package towards
> one kind of computation (floating point based), precluding its use
> with e.g. bignums. But maybe this does not really matter given the
> target fields.

Yes and no. The current status is that it is working with doubles, but
I'm now adding support for integers and complex numbers. Integers are
needed to perform index arithmetic, and complex numbers will be used
for some standard linalg stuff that should be provided by the basic
package (eigenvalues, FFT). There will be type metainformation similar
to Tcl_Obj->typePtr, but only once for the whole array of numbers. That
is the consensus among all the numerical packages I've seen so far.

The overall idea is to have the EIAS (everything is a string) data types
expanded to vector/matrix/tensor arithmetic. This can't distinguish between
float and double, but it can between an integer 1, a double 1.0 and a complex
1.0+0.0i, similar to expr, and therefore an integer vector {1 2}, a real
vector {1.0 2.0} and a complex vector {1.0+0.0i 2.0+0.0i}. From the
script perspective there should be no difference between expr and loops
on the list-of-lists representation and vexpr, except that the latter is
much faster and allows slicing, reductions etc. with a Matlab-like syntax.
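
How a single type tag for the whole array could be derived from the textual form is sketched below in C. The classifier is a rough stand-in built on strtol and strtod, not Tcl's number parser, and the rule that a mixed vector takes the widest element type is an assumption of this sketch. ElemType, classify and array_type are invented names.

```c
#include <assert.h>
#include <stdlib.h>

/* Ordered from narrowest to widest; T_INVALID marks a parse failure. */
typedef enum { T_INT, T_DOUBLE, T_COMPLEX, T_INVALID } ElemType;

/* Classify one textual element such as "1", "1.0" or "1.0+0.0i". */
ElemType classify(const char *s)
{
    char *end;

    assert(s != NULL);
    (void)strtol(s, &end, 10);
    if (end != s && *end == '\0')
        return T_INT;               /* whole string is a decimal integer */
    (void)strtod(s, &end);
    if (end == s)
        return T_INVALID;           /* not a number at all */
    if (*end == '\0')
        return T_DOUBLE;            /* whole string is a real number */
    if (*end == '+' || *end == '-') {
        /* expect a signed imaginary part followed by a final 'i' */
        const char *im = end;
        (void)strtod(im, &end);
        if (end != im && end[0] == 'i' && end[1] == '\0')
            return T_COMPLEX;
    }
    return T_INVALID;
}

/* One tag for the whole array: the widest element type wins. */
ElemType array_type(const char *const *elems, size_t n)
{
    ElemType t = T_INT;

    for (size_t i = 0; i < n; i++) {
        ElemType e = classify(elems[i]);
        if (e == T_INVALID)
            return T_INVALID;
        if (e > t)
            t = e;
    }
    return t;
}
```

Under these assumptions {1 2} is tagged as integer, {1.0 2.0} as double and {1.0+0.0i 2.0+0.0i} as complex, matching the three cases listed above.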

Christian


=== here comes the trivial patch ===
--- tclListObj.c.orig 2014-03-04 20:14:03.000000000 +0100
+++ tclListObj.c 2014-03-04 20:14:06.000000000 +0100
@@ -470,7 +470,7 @@
*objvPtr = NULL;
return TCL_OK;
}
- result = SetListFromAny(interp, listPtr);
+ result = Tcl_ConvertToType(interp, listPtr, &tclListType);
if (result != TCL_OK) {
return result;
}
@@ -579,7 +579,7 @@
Tcl_SetListObj(listPtr, 1, &objPtr);
return TCL_OK;
}
- result = SetListFromAny(interp, listPtr);
+ result = Tcl_ConvertToType(interp, listPtr, &tclListType);
if (result != TCL_OK) {
return result;
}
@@ -743,7 +743,7 @@
*objPtrPtr = NULL;
return TCL_OK;
}
- result = SetListFromAny(interp, listPtr);
+ result = Tcl_ConvertToType(interp, listPtr, &tclListType);
if (result != TCL_OK) {
return result;
}
@@ -796,7 +796,7 @@
*intPtr = 0;
return TCL_OK;
}
- result = SetListFromAny(interp, listPtr);
+ result = Tcl_ConvertToType(interp, listPtr, &tclListType);
if (result != TCL_OK) {
return result;
}
@@ -869,7 +869,7 @@
}
Tcl_SetListObj(listPtr, objc, NULL);
} else {
- int result = SetListFromAny(interp, listPtr);
+ int result = Tcl_ConvertToType(interp, listPtr, &tclListType);

if (result != TCL_OK) {
return result;
@@ -1627,7 +1627,7 @@
}
return TCL_ERROR;
}
- result = SetListFromAny(interp, listPtr);
+ result = Tcl_ConvertToType(interp, listPtr, &tclListType);
if (result != TCL_OK) {
return result;
}
=============================================================

pal...@yahoo.com

Mar 12, 2014, 9:38:30 PM

Christian, Arjen,

I don't know if you've seen my tarray extension at http://tarray.sourceforge.net/manual/toc.html.

It implements arrays and tables as Tcl_Obj's using arrays of native types (integer, doubles etc.). Original use case was for an in-memory column store with parallelized sorting, search etc. but since then I've been thinking of adding numeric operations as well. Since I've no real experience in numeric computation I was wondering if someone else might be interested in working on those aspects.

Care to take a look and see if perhaps efforts might be combined ?

/Ashok

Christian Gollwitzer

Mar 14, 2014, 2:55:09 PM

Hi Ashok,

On 13.03.14 02:38, pal...@yahoo.com wrote:
> Christian, Arjen,
>
> I don't know if you've seen my tarray extension at
> http://tarray.sourceforge.net/manual/toc.html.
>
> It implements arrays and tables as Tcl_Obj's using arrays of native
> types (integer, doubles etc.). Original use case was for an in-memory
> column store with parallelized sorting, search etc. but since then
> I've been thinking of adding numeric operations as well.

This is interesting work in its own right. Which numerical
operations would you like to see in this extension?

I think the design is rather different from other numerical packages.
tarray tables look to me more similar to tables in a relational database
- e.g. in a numeric array extension, typically every column must have
the same datatype and there can be tensors of higher rank, i.e. 3, 4, 5
etc. -dimensional arrays. I think you are competing more with SQLite
in-memory databases than with numerical packages.

Christian

Christian Gollwitzer

Mar 14, 2014, 3:23:26 PM

Dear group,

On 07.03.14 15:33, Christian Gollwitzer wrote:
> I'm working on a numeric array extension for Tcl which can do basic
> vector math and linear algebra on N-dimensional arrays with the goal to
> get it (or something similar) included into the Tcl core.

thanks for all the pointers and ideas I've gotten. I'm browsing through
the source code of the different packages now.

Concerning the state of vectcl (my package), I'm adhering to Donal's
advice - I fill in missing pieces in my spare time to make a
"feature-complete" draft, which I will show at the EuroTcl meeting.

The math backend is probably inferior to some of the packages shown
here; maybe it is worth rewriting the whole thing based on source code
from the other packages.

Have fun coding,

Christian

pal...@yahoo.com

Mar 16, 2014, 11:47:00 AM

On Saturday, March 15, 2014 12:25:09 AM UTC+5:30, Christian Gollwitzer wrote:
> Hi Ashok,
> I think the design is rather different from other numerical packages.
> tarray tables look to me more similar to tables in a relational database

They're really simple arrays, that's all :-) Really the goal of the extension (still work in progress) is to provide memory-efficient and fast (as in parallelized on multi-core) low-level array operations and then base higher-level structures implemented in script on top. So far most of the latter is of the data store variety and that is still the main focus, but I keep thinking there is scope there to go in the direction of numeric processing as well, perhaps not as comprehensive as the specialized numeric extensions but a useful subset.

> - e.g. in a numeric array extension, typically every column must have
> the same datatype and there can be tensors of higher rank, i.e. 3, 4, 5
> etc. -dimensional arrays.

Columns in tarrays are generally expected to be the same data type. Higher dimensions are composed via nesting (you could have columns of columns of columns... which is how tables are implemented). No question, syntactic structures are needed; otherwise explicitly programming the nesting becomes cumbersome.

>
> Christian

I hope to keep an eye on what you and Arjen are up to and perhaps be able to lift code from there. I did take a look at some of the other numeric extensions but they were a bit too heavyweight in terms of my needs and not easy to extract code from.

/Ashok

Arjen Markus

Mar 17, 2014, 6:17:25 AM

On Sunday, March 16, 2014 4:47:00 PM UTC+1, pal...@yahoo.com wrote:

>
> I hope to keep an eye on what you and Arjen are up to and perhaps be able to lift code from there. I did take a look at some of the other numeric extensions but they were a bit too heavyweight in terms of my needs and not easy to extract code from.
>

I had a look at the documentation and concluded that the tarray table data type is a collection of columns, rather than a contiguous array. If that is indeed the case, then tarray tables are not a very suitable construct for storing numerical matrices. However, a column can - with a bit of syntactic sugar - function as a two-dimensional contiguous array.

A strategy to use this data type in my wrapper library would be to use a different implementation of the routines that convert between Tcl_Objs and plain arrays. It will require some additional coding, but it does seem feasible.

Regards,

Arjen