[matrix/mat64] Grow function on Dense

77 views
Skip to first unread message

Brendan Tracey

unread,
Jan 12, 2015, 5:59:32 PM1/12/15
to gonu...@googlegroups.com
For a lot of my stuff, I'm often sequentially increasing the size of a matrix as new information comes in. I'd propose the addition of a Grow function that increases the size of a matrix.

// Grow increases the size of a matrix by m rows and n columns. If there is sufficient data
// available in the underlying slice, that data will be reused, otherwise new data will be
// allocated.
func (d *mat64.Dense) Grow(m, n int){
}

The combination of Grow and View can allow for efficient growth scheduling. So, for example, let's say I have a 10x10 matrix, and I want to make it an 11x11 matrix, but I also know I'll be adding data later. I want to preallocate this all at once. I can do

data.Grow(100, 100) // add space for the next 90 rows and columns
data.View(0,0, 11, 11)
// extra code to add my data

Adding rows and columns can both be made efficient. The total number of available columns is tracked by the Stride field, and the total number of available rows is len(data)/Stride.

It's true that this doesn't play well with View, but this behavior parallels that of append, so I think it's okay.

Dan Kortschak

unread,
Jan 12, 2015, 6:10:31 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
Timely. This helps me thinking about the issue I just filed.

I neglected to think that cap(col) == stride, but cap(row) is not
len(data)/Stride.

Dan Kortschak

unread,
Jan 12, 2015, 6:20:36 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Tue, 2015-01-13 at 09:40 +1030, Dan Kortschak wrote:
> I neglected to think that cap(col) == stride, but cap(row) is not
> len(data)/Stride.
>
Actually, cap(col) != stride either since a small view inset from the
left and right would over estimate its cap(col).

This requires two new fields. But fixes this and at least two other
problems.

Brendan Tracey

unread,
Jan 12, 2015, 6:27:27 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
If you don’t care about overwriting data, why isn’t it len(data)/Stride?

If you have the matrix
[ 1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16]

and you View it, to say get the middle 2 x 2 region

The data you have is (the dots and extra numbers are the data actually in the slice)

[ 6 7 . 8 9
10 11] . 11 12
. .
13 14 . 15 16

Stride is 4 which matches the column limit, and len(data) / Stride = 12 / 3 which matches the row limit.

Different example where the numbers don’t work as well:

[ 1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20]

Again, slice the same relative portion
[ 7 8 . 9 10 11
12 13] . 14 15 16
. .
17 18 19 20

Again, the column limit is correctly 5, and the Row limit is 14 / 5 = 2, which is correct.
> --
> You received this message because you are subscribed to the Google Groups "gonum-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to gonum-dev+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Brendan Tracey

unread,
Jan 12, 2015, 6:30:48 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
I guess it needs to be cap(data)/Stride rather than length, but same idea.

Dan Kortschak

unread,
Jan 12, 2015, 6:42:36 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
That's part of it since we slice down to the minimum to represent the
matrix, so using len would not allow the grow operation.

[ 1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16 ]

viewed as you show give this (the leading elements are forever lost).

[ 6 7 8
9 10 11|12
13 14 15 16 ]

cap(data) = 11
stride = 4

cap(rows) = 2 (integer truncated)

The issue is that the matrix doesn't know how far right into the matrix
backing it it is, and can't know unless there is a field to describe it.
If we add that field, we may as well just keep capRows and capCols in
*Dense.

Brendan Tracey

unread,
Jan 12, 2015, 7:00:18 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
Oops, didn’t see that I had duplicated the 11 in my first example. Yes, in this case the capacity for rows is correctly 2

The leading elements are ‘forever lost’ and we don’t care about that fact.

The matrix doesn’t know how far to the right it is, but the data does remain available to use. If you were to grow
[ 6 7 8
9 10 11|12
13 14 15 16 ]

by two columns, you could end up with the matrix
6 7 8 9
10 11 12 13
.
14 15 16

Which, as far as data size is concerned, is perfectly fine. Contents, however, are a different matter. If we want different combinations of View and Grow to be of the same data, then you’re right. Alternately, we could have Grow zero the elements, so you’d have
6 7 0 0
10 11 0 0
.
14 15 16

I suppose consistency with slicing and how we would want tables implies that we should add the capacities. I think at that point they should go into blas.General so that RawMatrix() and its modifications work appropriately.

Dan Kortschak

unread,
Jan 12, 2015, 7:04:36 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 16:00 -0800, Brendan Tracey wrote:
> I suppose consistency with slicing and how we would want tables
> implies that we should add the capacities.

That's what I ended up concluding in the issue I filed.

> I think at that point they should go into blas.General so that
> RawMatrix() and its modifications work appropriately.

Why? blas packages don't do any resizing, so they don't need to know
anything about capacity. Only mat64 knows about this, so it seems more
reasonable to include it in *mat64.Dense.

Brendan Tracey

unread,
Jan 12, 2015, 7:10:59 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
Agreed, but then the solution is to have RawMatrix back in mat64?

type RawMatrix struct {
blas64.General
CapRows int
CapCols int
}


Dan Kortschak

unread,
Jan 12, 2015, 7:37:03 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
No, just set m.capRows, m.capCols = b.Rows, b.Cols in
*Dense.SetRawMatrix. I don't think the caps should be exposed. We would
then create:

type interface Grower {
Caps() (r, c int)
Grow(m, n int)
}

where:

func (m *Dense) Caps() (r, c int) { return m.capRows, m.capCols }


This means that untrusted data (from a non-mat64 source is capped at the
size it is). I don't think that is a problem, in fact I think that is
entirely appropriate. If people want a growable matrix, they should use
NewDense to create it and they can refer to the data slice by getting
the results of RawMatrix(). SetRawMatrix is a dangerous tool

Dan Kortschak

unread,
Jan 12, 2015, 7:38:37 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
Incomplete.

This means that untrusted data (from a non-mat64 source is capped at the
size it is) is made as safe as possible. I don't think that is a
problem, in fact I think that is entirely appropriate. If people want a
growable matrix, they should use NewDense to create it and they can
refer to the data slice by getting the results of RawMatrix().
SetRawMatrix is a dangerous tool.


Brendan Tracey

unread,
Jan 12, 2015, 7:47:40 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
I’m trying to think through my needs for RawMatrix, which fortunately are getting less and less as time goes on. I think all I’ve ever needed size of the matrix and access the data slice directly. The last issue is the growing issue, which we are fixing here.

SGTM.

Dan Kortschak

unread,
Jan 12, 2015, 7:52:59 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 16:47 -0800, Brendan Tracey wrote:
> I’m trying to think through my needs for RawMatrix, which fortunately
> are getting less and less as time goes on. I think all I’ve ever
> needed size of the matrix and access the data slice directly. The last
> issue is the growing issue, which we are fixing here.
>
> SGTM.
>
Can you file an issue and link to to gonum/matrix#73 and
gonum/matrix#65?

Dan Kortschak

unread,
Jan 12, 2015, 9:56:30 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 16:47 -0800, Brendan Tracey wrote:
> I’m trying to think through my needs for RawMatrix, which fortunately
> are getting less and less as time goes on. I think all I’ve ever
> needed size of the matrix and access the data slice directly. The last
> issue is the growing issue, which we are fixing here.
>
> SGTM.

How to deal with *Dense that isZero() due to a Reset call? This now has
a non-zero stride, while a true zero Dense does not.

Brendan Tracey

unread,
Jan 12, 2015, 10:14:45 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
It kind of seems like Reset has to play not-nice. Reset() basically means “ignore what had been there”. It would be pretty confusing if Reset kept the stride and capacity, and would form to any size matrix as long as it was sufficiently small. It just seems like View and Reset have to be used with caution. You shouldn’t Reset a matrix that comes in as an input parameter, so Reset will usually be done in the same place where the initial matrix was allocated, so the errors should be reasonably isolated.

Dan Kortschak

unread,
Jan 12, 2015, 10:35:58 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 19:14 -0800, Brendan Tracey wrote:
> It kind of seems like Reset has to play not-nice. Reset() basically
> means “ignore what had been there”. It would be pretty confusing if
> Reset kept the stride and capacity, and would form to any size matrix
> as long as it was sufficiently small. It just seems like View and
> Reset have to be used with caution. You shouldn’t Reset a matrix that
> comes in as an input parameter, so Reset will usually be done in the
> same place where the initial matrix was allocated, so the errors
> should be reasonably isolated.
>
So, we should special-case m.View(i, j, 0, 0) to allow the equivalent of
Reset() that doesn't reset the stride?

Brendan Tracey

unread,
Jan 12, 2015, 10:37:18 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
Can’t we change Reset to set the stride to 0, and change isZero to also check if the stride is zero?

Dan Kortschak

unread,
Jan 12, 2015, 10:42:11 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 19:37 -0800, Brendan Tracey wrote:
> Can’t we change Reset to set the stride to 0, and change isZero to
> also check if the stride is zero?

Yes, in addition to what I suggested.

Brendan Tracey

unread,
Jan 12, 2015, 10:47:49 PM1/12/15
to Dan Kortschak, gonu...@googlegroups.com
But that doesn’t special-case View, it just changes what the isZero check means. Right?

Dan Kortschak

unread,
Jan 12, 2015, 10:54:27 PM1/12/15
to Brendan Tracey, gonu...@googlegroups.com
On Mon, 2015-01-12 at 19:47 -0800, Brendan Tracey wrote:
> But that doesn’t special-case View, it just changes what the isZero
> check means. Right?
>
Both. View panics if r || c == 0.

Dan Kortschak

unread,
Jan 13, 2015, 8:45:02 PM1/13/15
to Brendan Tracey, gonu...@googlegroups.com
I'm working on this now. Do you want "grow to" or "grow by"?

On Mon, 2015-01-12 at 14:59 -0800, Brendan Tracey wrote:

Brendan Tracey

unread,
Jan 13, 2015, 8:48:57 PM1/13/15
to Dan Kortschak, gonu...@googlegroups.com

I think grow by will be the more common operation (add one row,  add n columns)

Reply all
Reply to author
Forward
0 new messages