Sparse matrix-by-vector multiplication using available types


Krzysztof Sakrejda

Mar 6, 2015, 7:20:06 PM
to stan...@googlegroups.com
I made a Stan function for sparse-matrix-by-dense-vector multiplication (appended below). To use it you have to generate your matrix in compressed column format (one-indexed, since this is Stan-land) and pass the matrix as its dimensions (m, n), the values vector (w) and the three indexing arrays (v,u,z). Memory-wise it's a win over using a big model matrix but speed-wise it's not much of a win unless I do gradients. As a step towards that I'd like to implement it as a C++ function (and do the whole pull request thing if there's interest and I can figure out how to do the gradient).

I couldn't sort out what the signature should be yet, so I put the Stan function into a placeholder model, got Stan to parse it to C++, and fixed things up from there. I think I'm going on the right path here but I thought I'd ask for feedback to make sure I'm not completely off the rails.

Thanks for any pointers,

Krzysztof


---------------------- Stan sparse_multiply function
// This function implements the multiplication of a sparse m by n
// matrix, X, by the dense vector b. The sparse matrix X is represented by a vector
// of values and three indexing vectors which allow for fast sparse
// matrix by dense vector multiplication.
//
// The matrix X is represented by the vector w (of values), the integer array v
// (containing one-based row index of each value), the integer array u
// (containing one-based indexes of where each column starts in w), and
// the integer array z (containing the number of non-zero entries in
// each column of w). The matrix dimensions, m and n, are also passed.
//
// The outer loop runs over the columns of the matrix X. With the
// column index in hand, the inner loop runs over the section of w and
// v for that column. The inner loop first looks up the row index 'i'
// for each non-zero value in w, then scales that value by the
// appropriate entry of b and adds it to the appropriate entry of the
// output column vector.
//
// The function works by incrementing the entries of the output
// vector, y, because the outer loop iterates column-wise over X.


vector sparse_multiply(int m, int n,
                       vector w, int[] v, int[] u, int[] z,
                       vector b) {
  vector[m] y;
  y <- rep_vector(0.0, m);
  for (j in 1:n) {
    int i;
    for (q in u[j]:(u[j] + z[j] - 1)) {
      i <- v[q];
      y[i] <- y[i] + w[q] * b[j];
    }
  }
  return y;
}


----------------------- Edited from Stan-generated C++
#ifndef STAN__MATH__MATRIX_FAKE_SPARSE_HPP
#define STAN__MATH__MATRIX_FAKE_SPARSE_HPP

#include <stan/math/matrix/Eigen.hpp>
#include <vector>

namespace stan {
namespace math {

template <typename T2__, typename T6__>
inline
Eigen::Matrix<typename boost::math::tools::promote_args<T2__, T6__>::type, Eigen::Dynamic, 1>
sparse_multiply(const int& m,
                const int& n,
                const Eigen::Matrix<T2__, Eigen::Dynamic, 1>& w,
                const std::vector<int>& v,
                const std::vector<int>& u,
                const std::vector<int>& z,
                const Eigen::Matrix<T6__, Eigen::Dynamic, 1>& b) {
  typedef typename boost::math::tools::promote_args<T2__, T6__>::type fun_scalar_t__;
  {
    Eigen::Matrix<fun_scalar_t__, Eigen::Dynamic, 1> y(m);
    y = rep_vector(0.0, m);
    for (int j = 1; j <= n; ++j) {
      {
        int i(0);
        (void) i;  // dummy to suppress unused var warning
        for (int q = u[j - 1]; q <= (u[j - 1] + z[j - 1] - 1); ++q) {
          i = v[q - 1];
          y[i - 1] += w[q - 1] * b[j - 1];
        }
      }
    }
    return stan::math::promote_scalar<fun_scalar_t__>(y);
  }
}

}
}

#endif



Bob Carpenter

Mar 6, 2015, 8:17:14 PM
to stan...@googlegroups.com

> On Mar 7, 2015, at 11:20 AM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> I made a Stan function for sparse-matrix-by-dense-vector multiplication (appended below). To use it you have to generate your matrix in compressed column format (one-indexed, since this is Stan-land) and pass the matrix as its dimensions (m, n), the values vector (w) and the three indexing arrays (v,u,z). Memory-wise it's a win over using a big model matrix but speed-wise it's not much of a win unless I do gradients.

It'll be a big win if the matrix is very sparse. It'll mean
a much smaller expression graph and hence much faster gradients
even without analytic gradients coded in.

> As a step towards that I'd like to implement it as a C++ function (and do the whole pull request thing if there's interest and I can figure out how to do the gradient).
>
> I couldn't sort out what the signature should be yet, so I put the Stan function into a placeholder model, got Stan to parse it to C++, and fixed things up from there.

Nice idea --- I hadn't even thought of doing something like that.
Andrew (and to some extent the rest of us) thought people would
be doing that with models. Luckily, it hasn't been an issue.

> I think I'm going on the right path here but I thought I'd ask for feedback to make sure I'm not completely off the rails.

The thing to do is to create a branch and check this in.
We need to give you permission to do that on stan-dev.
Can you send us your GitHub name?

Definitely on the right track, but I can rewrite it for you
a bit. And you need to add doc in the usual Doxygen (Javadoc)
format, but it's great you wrote what it did already.

Also, never ever ever use tabs in programs if they're not Python
or make.

All those double underscores need to go away --- they're just
there to prevent internal name conflicts with other variables
in Stan.

The thing to do for efficiency is provide three var-based
implementations that calculate partials as they go and then
return a single var result. This would be a huge speedup.

- Bob


#ifndef STAN__MATH__MATRIX__SPARSE_MULTIPLY_HPP
#define STAN__MATH__MATRIX__SPARSE_MULTIPLY_HPP

#include <vector>

#include <stan/math/matrix/Eigen.hpp>

namespace stan {

namespace math {


/** Return the multiplication of the sparse matrix specified
 * by values and indexing by the specified dense vector.
 *
 * The sparse matrix X of dimension m by n is represented by the
 * vector w (of values), the integer array v (containing the one-based
 * row index of each value), the integer array u (containing
 * one-based indexes of where each column starts in w), and the
 * integer array z (containing the number of non-zero entries in
 * each column of w).
 *
 * @tparam T1 Type of sparse matrix entries.
 * @tparam T2 Type of dense vector entries.
 * @param m Rows in matrix.
 * @param n Columns in matrix.
 * @param w Non-zero values of the matrix.
 * @param v One-based row index of each value.
 * @param u One-based index of where each column starts in w.
 * @param z Number of non-zero entries in each column of w.
 * @param b Dense vector.
 * @return Dense vector for the product.
 * @throw ... explain error conditions briefly ...
 */
template <typename T1, typename T2>
inline
Eigen::Matrix<typename boost::math::tools::promote_args<T1, T2>::type, Eigen::Dynamic, 1>
sparse_multiply(const int& m,
                const int& n,
                const Eigen::Matrix<T1, Eigen::Dynamic, 1>& w,
                const std::vector<int>& v,
                const std::vector<int>& u,
                const std::vector<int>& z,
                const Eigen::Matrix<T2, Eigen::Dynamic, 1>& b) {

  // a whole bunch of error checking goes here, e.g., m, n > 0,
  // etc. etc. we can't throw index out of bounds exceptions at
  // run time

  typedef typename boost::math::tools::promote_args<T1, T2>::type fun_scalar_t;
  Eigen::Matrix<fun_scalar_t, Eigen::Dynamic, 1> y(m);
  y.setZero();  // entries are accumulated with +=, so start from zero
  for (int j = 0; j < n; ++j) {
    int end = u[j] + z[j] - 1;
    for (int q = u[j] - 1; q < end; ++q)
      y(v[q] - 1) += w[q] * b[j];  // v holds one-based row indexes
  }
  return y;
}


}

}

#endif

Krzysztof Sakrejda

Mar 6, 2015, 8:45:19 PM
to stan...@googlegroups.com
On Fri, Mar 6, 2015 at 8:16 PM, Bob Carpenter <ca...@alias-i.com> wrote:
>
>> On Mar 7, 2015, at 11:20 AM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>>
>> I made a Stan function for sparse-matrix-by-dense-vector multiplication (appended below). To use it you have to generate your matrix in compressed column format (one-indexed, since this is Stan-land) and pass the matrix as its dimensions (m, n), the values vector (w) and the three indexing arrays (v,u,z). Memory-wise it's a win over using a big model matrix but speed-wise it's not much of a win unless I do gradients.
>
> It'll be a big win if the matrix is very sparse. It'll mean
> a much smaller expression graph and hence much faster gradients
> even without analytic gradients coded in.
>
>> As a step towards that I'd like to implement it as a C++ function (and do the whole pull request thing if there's interest and I can figure out how to do the gradient).
>>
>> I couldn't sort out what the signature should be yet, so I put the Stan function into a placeholder model, got Stan to parse it to C++, and fixed things up from there.
>
> Nice idea --- I hadn't even thought of doing something like that.
> Andrew (and to some extent the rest of us) thought people would
> be doing that with models. Luckily, it hasn't been an issue.

Yeah, I don't know if it would be as helpful for people who haven't
been around as much Stan talk as me but the only things I knew going
in were conceptual. I could see spending some time (in a month or
two) writing up the how-to for adding functions in detail.

>> I think I'm going on the right path here but I thought I'd ask for feedback to make sure I'm not completely off the rails.
>
> The thing to do is to create a branch and check this in.
> We need to give you permission to do that on stan-dev.
> Can you send us your GitHub name?

Ok, I'd like to figure that process out. The github name is 'sakrejda'.

>
> Definitely on the right track, but I can rewrite it for you
> a bit. And you need to add doc in the usual Doxygen (Javadoc)
> format, but it's great you wrote what it did already.

The re-write is awesome; I'll look at a diff to see exactly what you changed.

[snip/will do]
> The thing to do for efficiency is provide three var-based
> implementations that calculate partials as they go and then
> return a single var result. This would be a huge speedup.

I realize you're juggling a lot of stuff but this is confusing, I need
you to throw me a bone here. I get that this part is about the
gradients... My possibly-overlapping questions are:

1) why three implementations? what does Stan have three of here?
2) where do I look up what a var-based implementation is versus what
I've done so far (in the code is fine)
3) is there another .hpp file you can point to that has one of these
functions that calculates partials and returns one var?
4) is the gradient calculation a separate function, or is this just
modifying the present code?

Finally, there might be a different way of doing this that helps even
more. I think Eigen probably has an efficient constructor for a CSC
format sparse matrix based on these vectors I'm passing into the
function (since that's how they store the matrix internally). If I go
that route maybe it's possible to just use their sparse matrix
multiplication directly and define gradients for it that would get
re-used when you guys include sparse structures in Stan. I guess that
wouldn't work if the efficient way to do it is to do the gradient at
the same time as the multiplication.

Krzysztof
> --
> You received this message because you are subscribed to a topic in the Google Groups "stan development mailing list" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-dev/X45dGQkaags/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to stan-dev+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Bob Carpenter

Mar 6, 2015, 10:27:15 PM
to stan...@googlegroups.com

> On Mar 7, 2015, at 12:45 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> On Fri, Mar 6, 2015 at 8:16 PM, Bob Carpenter <ca...@alias-i.com> wrote:
>>
>> The thing to do for efficiency is provide three var-based
>> implementations that calculate partials as they go and then
>> return a single var result. This would be a huge speedup.
>
> I realize you're juggling a lot of stuff

Encouraging new contributors is very high on my priority list!

> but this is confusing, I need
> you to throw me a bone here.

src/stan/agrad/rev/functions/hypot.hpp

is the simplest non-trivial example of a function with two
potentially var inputs. You can also look at

src/stan/agrad/rev/operators/operator_division.hpp

which is better because it's not symmetric so has all three
separate implementations.

Both of these have built-in C++ definitions, so you'd also need
the fourth signature which has everything instantiated with double
scalar values.

> I get that this part is about the
> gradients... My possibly-overlapping questions are:
>
> 1) why three implementations? what does Stan have three of here?

signatures (var,double), (double,var) and (var,var), as well
as (double,double) here, but that doesn't need gradients.

> 2) where do I look up what a var-based implementation is versus what
> I've done so far (in the code is fine)

see above

> 3) is there another .hpp file you can point to that has one of these
> functions that calculates partials and returns one var?

see above

You're going to be returning multiple vars, though. And if
you look at our matrix functions, they're complicated by
memory management issues.

A good example to follow would be multiply() of a matrix by
a vector.

> 4) is the gradient calculation a separate function, or is this just
> modifying the present code?

see the examples.

> Finally, there might be a different way of doing this that helps even
> more. I think Eigen probably has an efficient constructor for a CSC
> format sparse matrix based on these vectors I'm passing into the
> function (since that's how they store the matrix internally). If I go
> that route maybe it's possible to just use their sparse matrix
> multiplication directly and define gradients for it that would get
> re-used when you guys include sparse structures in Stan. I guess that
> wouldn't work if the efficient way to do it is to do the gradient at
> the same time as the multiplication.

Exactly. You might want to use it for the stan/math version, and
even to produce the result value for the var version, but you need
to specify gradients to make it efficient.

It's OK to implement something with just templates without gradients
and then later make it faster. We do that all the time.

- Bob

Marcus Brubaker

Mar 7, 2015, 5:03:46 PM
to stan...@googlegroups.com
You could speed up the Stan version of the function by copying entries into two vectors and then calling dot_product().  This will get you almost as fast as a C++ implementation would be, and without the trouble.

Cheers,
Marcus



Krzysztof Sakrejda

Mar 7, 2015, 9:31:03 PM
to stan...@googlegroups.com
Ok, I'll try to wrap my head around what the gradient calculations are
doing, but first just figure out how to do a proper pull request for
the multiplication (with dot_product hopefully). Krzysztof

Krzysztof Sakrejda

Department of Biostatistics and Epidemiology
University of Massachusetts, Amherst
319 Morrill Science Center South
611 N. Pleasant Street
Amherst, MA 01003

work #: 413-325-6555
email: sakr...@umass.edu

Bob Carpenter

Mar 8, 2015, 7:40:16 AM
to stan...@googlegroups.com
The dot product's going to be a tradeoff. If there
are lots of zero entries, it may not be as efficient.
This is always the kind of tradeoff you're looking at
with sparsity. You can actually switch on this behavior
if you can figure out the turning point and it's worth
the extra calculation. So many tradeoffs!

- Bob

Krzysztof Sakrejda

Mar 8, 2015, 9:23:05 AM
to stan...@googlegroups.com
I guess I was assuming the loop could gather the entries for the
sparse multiplication, do the multiplication with a dense dot
product, and then shuffle that back to the output (which I guess has
gradients already), but it does seem like there'd be a lot of
shuffling around.
Krzysztof



Krzysztof Sakrejda

Mar 8, 2015, 12:17:18 PM
to stan...@googlegroups.com
Carpenter <ca...@alias-i.com> wrote:
>> The dot product's going to be a tradeoff. If there
>> are lots of zero entries, it may not be as efficient.
>> This is always the kind of tradeoff you're looking at
>> with sparsity. You can actually switch on this behavior
>> if you can figure out the turning point and it's worth
>> the extra calculation. So many tradeoffs!

No kidding on the tradeoffs. I think I'll stick to one step at a time.
I've done all the steps up to the first/last step (unit tests).

https://github.com/stan-dev/stan/tree/feature/sparse_multiply/

Questions:
1) I'm not sure if this function should be added where it is right
now, is it in the right place?
2) I need to come up with tests, are you trying to follow some rules for these?
3) Do you have a favorite doc for the testing framework? It's new to me.

Krzysztof

Marcus Brubaker

Mar 8, 2015, 1:27:39 PM
to stan...@googlegroups.com
Just to be clear, for using dot_product, what I was suggesting was that for each row of the output, copy the sparse entries into one vector (whose length is the number of non-zero entries in that row) and copy the vector entries corresponding to those non-zero sparse entries into another vector of the same length and take the dot_product between those vectors.  This really should be almost as efficient as a specialized implementation in C++.



Ben Goodrich

Mar 8, 2015, 1:40:31 PM
to stan...@googlegroups.com
i.e. roughly what Marcus is suggesting, in Stan code, would amount to this in the case where the "sparse" matrix was lower-triangular

/**
 * Return a vector that is the product of a lower-triangular matrix and a vector
 * @param L lower-triangular matrix but not checked for lower-triangularness
 * @param x column vector
 * @return column vector equal to L * x
 */

vector lt_multiply_mv(matrix L, vector x) {
  vector[rows(L)] out;
  int K;
  K <- cols(L);
  if (K != rows(x)) reject("L and x are not conformable for multiplication");
  for (k in 1:rows(L)) {
    if (k < K) out[k] <- sub_row(L, 1, 1, k) * head(x, k);
    else       out[k] <- row(L, k) * x;
  }
  return out;
}

In your case, you just have to deal with the arbitrary pattern of non-zeros.

Ben

Krzysztof Sakrejda

Mar 8, 2015, 1:52:15 PM
to stan...@googlegroups.com
On Sunday, March 8, 2015 at 1:40:31 PM UTC-4, Ben Goodrich wrote:
> i.e. roughly what Marcus is suggesting, in Stan code, would amount to this in the case where the "sparse" matrix was lower-triangular
>
>
>
> /**
>  * Return a vector that is the product of a lower-triangular matrix and a vector
>  * @param L lower-triangular matrix but not checked for lower-triangularness
>  * @param x column vector
>  * @return column vector equal to L * x
>  */
> vector lt_multiply_mv(matrix L, vector x) {
>   vector[rows(L)] out;
>   int K;
>   K <- cols(L);
>   if (K != rows(x)) reject("L and x are not conformable for multiplication");
>   for (k in 1:rows(L)) {
>     if(k < K) out[k] <- sub_row(L, 1, 1, k) * head(x, k);
>     else      out[k] <- row(L,k) * x;
>   }
>   return out;
> }
>
> In your case, you just have to deal with the arbitrary pattern of non-zeros.

That makes sense. I'll stick with getting the current code into the form of a proper pull request, but this might be a good next step. I still need to figure out tests. If the original matrix were in compressed sparse row format this would be super-easy (the first two vectors you keep in that format are the values and the column indices, which you need to match up with the b vector).

Krzysztof



Bob Carpenter

Mar 8, 2015, 6:38:16 PM
to stan...@googlegroups.com
Nice! That would be much more efficient on the autodiff.
It's totally worth the shuffling, and it could be done
type-agnostically so you'd only need one instance; as
Marcus pointed out, there'd then be no point in doing custom
autodiff.

- Bob

Bob Carpenter

Mar 8, 2015, 6:54:16 PM
to stan...@googlegroups.com
We haven't really discussed whether we want this function
in Stan! We have to be careful not to just keep adding
functions because they introduce maintenance overhead downstream,
complicate the manual and thus make existing functionality a bit harder
to use, etc.

The major thing that worries me about this one is backward
compatibility --- we're going to eventually want to do sparse
matrices the "right" way with a built-in data type. At that
point, we'd want to deprecate this (and related) functions.

Is that OK with everybody?

More below.

> On Mar 9, 2015, at 3:17 AM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> Carpenter <ca...@alias-i.com> wrote:
>>> The dot product's going to be a tradeoff. If there
>>> are lots of zero entries, it may not be as efficient.
>>> This is always the kind of tradeoff you're looking at
>>> with sparsity. You can actually switch on this behavior
>>> if you can figure out the turning point and it's worth
>>> the extra calculation. So many tradeoffs!
>
> No kidding on the tradeoffs. I think I'll stick to one step at a time.
> I've done all the steps up to the first/last step (unit tests).
>
> https://github.com/stan-dev/stan/tree/feature/sparse_multiply/
>
> Questions:
> 1) I'm not sure if this function should be added where it is right
> now, is it in the right place?

It's branched from develop, which is the right start.

You should only need to put it in one place, and that should be:

stan/math/prim/mat

This is going to take some getting used to for me because Daniel
and Rob re-orged the entire code base around since I last worked
on it (with my blessing --- I'm not complaining).

> 2) I need to come up with tests, are you trying to follow some rules for these?



> 3) Do you have a favorite doc for the testing framework? It's new to me.


I created a Wiki page with an answer (and linked it from our top-level Wiki)

https://github.com/stan-dev/stan/wiki/How-to-Write-Unit-Tests-with-GoogleTest

- Bob

Marcus Brubaker

Mar 8, 2015, 11:35:45 PM
to stan...@googlegroups.com
I'd support this getting into Stan until we get real sparse matrix support.  Realistically, it's going to take a while until sparse matrices are implemented.  This is likely to be highly useful and to expand the kinds of models Stan can practically be used on in the interim.



Krzysztof Sakrejda

Mar 9, 2015, 1:47:33 PM
to stan...@googlegroups.com
On Sunday, March 8, 2015 at 6:54:16 PM UTC-4, Bob Carpenter wrote:
> We haven't really discussed whether we want this function
> in Stan! We have to be careful not to just keep adding
> functions because they introduce maintenance overhead downstream,
> complicate the manual and thus make existing functionality a bit harder
> to use, etc.
>
> The major thing that worries me about this one is backward
> compatiblity --- we're going to eventually want to do sparse
> matrices the "right" way with a built-in data type. At that
> point, we'd want to deprecate this (and related) functions.
>
> Is that OK with everybody?
>
> More below.

Ok, thanks for the comments and Wiki. I'll wait and see if there are any more responses. If there were any less messy way of doing this I'd be on board with that too but either way it's worth it to me to put this stuff together at least on a feature branch.

Krzysztof

Bob Carpenter

Mar 9, 2015, 7:00:17 PM
to stan...@googlegroups.com
If nobody comments in the negative, I think we're good to go.

I think I'm the one most conservative about adding things
that may go away later, but I'm OK with this one if Marcus
thinks it's a good idea.

- Bob

Krzysztof Sakrejda

Mar 13, 2015, 12:10:13 PM
to stan...@googlegroups.com
On Saturday, March 7, 2015 at 5:03:46 PM UTC-5, Marcus Brubaker wrote:
> You could speed up the Stan version of the function by copying entries into two vectors and then calling dot_product().  This will get you almost as fast as would be possible if you implemented it in C++ and without the trouble.

I took your suggestion (Marcus) but instead of trying to do something complicated with the compressed-column version, I took the compressed row version which simplifies nicely into a dot product. I'm going to try and figure out the testing framework (and work out any bugs) for this next but I would appreciate any comments. I think this is the dot product version everyone was suggesting.

/** Return the multiplication of the sparse matrix specified
 * by values and indexing by the specified dense vector.
 *
 * The sparse matrix X of dimension m by n is represented by the
 * vector w (of values), the integer array v (containing the one-based
 * column index of each value), the integer array u (containing
 * one-based indexes of where each row starts in w), and the
 * integer array z (containing the number of non-zero entries in
 * each row of w).
 *
 * @tparam T1 Type of sparse matrix entries.
 * @tparam T2 Type of dense vector entries.
 * @param m Rows in matrix.
 * @param n Columns in matrix.
 * @param w Non-zero values of the matrix.
 * @param v One-based column index of each value.
 * @param u One-based index of where each row starts in w.
 * @param z Number of non-zero entries in each row of w.
 * @param b Dense vector.
 * @return Dense vector for the product.
 * @throw ... explain error conditions briefly ...
 */
template <typename T1, typename T2>
inline
Eigen::Matrix<typename boost::math::tools::promote_args<T1, T2>::type, Eigen::Dynamic, 1>
sparse_multiply_csr(const int& m,
                    const int& n,
                    const Eigen::Matrix<T1, Eigen::Dynamic, 1>& w,
                    const std::vector<int>& v,
                    const std::vector<int>& u,
                    const std::vector<int>& z,
                    const Eigen::Matrix<T2, Eigen::Dynamic, 1>& b) {

  // a whole bunch of error checking goes here, e.g., m, n > 0,
  // etc. etc. we can't throw index out of bounds exceptions at
  // run time

  typedef typename boost::math::tools::promote_args<T1, T2>::type fun_scalar_t;
  Eigen::Matrix<fun_scalar_t, Eigen::Dynamic, 1> y(m);
  Eigen::Matrix<T2, Eigen::Dynamic, 1> b_sub(n);
  for (int i = 0; i < m; ++i) {
    int end = u[i] + z[i] - 1;
    int p = 0;
    for (int q = u[i] - 1; q < end; ++q) {
      b_sub(p) = b(v[q] - 1);  // v holds one-based column indexes
      ++p;
    }
    // both segments have z[i] entries: the row's values and the
    // gathered entries of b
    y(i) = dot_product(w.segment(u[i] - 1, z[i]), b_sub.head(p));
  }
  return y;
}

Marcus Brubaker

Mar 13, 2015, 12:13:09 PM
to stan...@googlegroups.com
Yes, that's pretty much exactly what I had in mind.  Have you had a chance to test it yet?  It should be relatively fast, particularly compared to the previous version.

Cheers,
Marcus



Krzysztof Sakrejda

Mar 13, 2015, 12:41:35 PM
to stan...@googlegroups.com
On Friday, March 13, 2015 at 12:13:09 PM UTC-4, Marcus Brubaker wrote:
> Yes, that's pretty much exactly what I had in mind.  Have you had a chance to test it yet?  It should be relatively fast, particularly compared to the previous version.

Good to hear. I have not tried it on a model yet. I think our expectations that it will be faster are well-founded and I won't trust it on my own models until I write tests so... I'm just plowing ahead with figuring out the testing framework.

For readability in testing I think the right way of doing it is to make dense matrices, extract the sparse part, and test from there but I haven't gotten to it yet.

Krzysztof

Bob Carpenter

Mar 13, 2015, 8:09:20 PM
to stan...@googlegroups.com

> On Mar 14, 2015, at 3:41 AM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> On Friday, March 13, 2015 at 12:13:09 PM UTC-4, Marcus Brubaker wrote:
>> Yes, that's pretty much exactly what I had in mind. Have you had a chance to test it yet? It should be relatively fast, particularly compared to the previous version.
>
> Good to hear. I have not tried it on a model yet. I think our expectations that it will be faster are well-founded and I won't trust it on my own models until I write tests so... I'm just plowing ahead with figuring out the testing framework.
>
> For readability in testing I think the right way of doing it is to make dense matrices, extract the sparse part, and test from there but I haven't gotten to it yet.

That sounds good to me. The tests will be clearer that
way, even if we don't want to convert dense matrices to sparse
inside Stan programs, except maybe in modest size cases in
transformed data.

- Bob

Krzysztof Sakrejda

Mar 13, 2015, 9:02:36 PM
to stan...@googlegroups.com
On Fri, Mar 13, 2015 at 8:08 PM, Bob Carpenter wrote:
>
>> On Mar 14, 2015, at 3:41 AM, Krzysztof Sakrejda wrote:
>>
>> On Friday, March 13, 2015 at 12:13:09 PM UTC-4, Marcus Brubaker wrote:
[snip]
>> For readability in testing I think the right way of doing it is to make dense matrices, extract the sparse part, and test from there but I haven't gotten to it yet.
>
> That sounds good to me. The tests will be clearer that
> way, even if we don't want to convert dense matrices to sparse
> inside Stan programs, except maybe in modest size cases in
> transformed data.

Yeah, I wasn't sure what the best way to go was here. I made
extractors from the Eigen sparse types (appended) to generate the
w/v/u/z vectors for use in my functions for testing. I'm not sure if
I should make these functions part of Stan in my branch or if they
should just be part of the tests (if they're just part of the tests,
where do you even put helper functions for tests? GoogleTest doc
doesn't seem to discuss this...)

This little exercise did let me figure out the low-level Eigen API for
sparse matrices so it wouldn't be a big leap for me (at some point) to
reimplement this stuff for actual Eigen::SparseMatrix types. If we
could generate SparseMatrix types in transformed data for modestly
big sparse matrices, it would let me write these functions in a way
that would get built on in the future... for now I think (?) it's just
mission creep.

Krzysztof

#include <Eigen/Sparse>
#include <vector>
#include <numeric>

template <typename _Scalar>
const Eigen::Matrix<_Scalar, Eigen::Dynamic, 1>
extract_w(Eigen::SparseMatrix<_Scalar> A) {
  Eigen::Matrix<_Scalar, Eigen::Dynamic, 1> w(A.nonZeros());
  for ( int j = 0; j < A.nonZeros(); ++j)
    w[j] = *(A.valuePtr()+j);
  return w;
}

template <typename _Scalar>
const std::vector<int> extract_v(Eigen::SparseMatrix<_Scalar> A) {
  std::vector<int> v(A.nonZeros());
  for ( int j = 0; j < A.nonZeros(); ++j)
    v[j] = *(A.innerIndexPtr()+j);
  return v;
}

template <typename _Scalar>
const std::vector<int> extract_u(Eigen::SparseMatrix<_Scalar> A) {
  std::vector<int> u(A.outerSize()+1);
  for ( int j = 0; j <= A.outerSize(); ++j)
    u[j] = *(A.outerIndexPtr()+j);
  return u;
}

template <typename _Scalar>
const std::vector<int> extract_z(Eigen::SparseMatrix<_Scalar> A) {
  std::vector<int> u = extract_u(A);
  std::vector<int> z(A.outerSize()+1);
  std::adjacent_difference(u.begin(), u.end(), z.begin());
  z.erase(z.begin());
  return z;
}


template <typename _Scalar>
const Eigen::Matrix<_Scalar, Eigen::Dynamic, 1>
extract_w(Eigen::SparseMatrix<_Scalar, Eigen::RowMajor> A) {
  Eigen::Matrix<_Scalar, Eigen::Dynamic, 1> w(A.nonZeros());
  for ( int j = 0; j < A.nonZeros(); ++j)
    w[j] = *(A.valuePtr()+j);
  return w;
}

template <typename _Scalar>
const std::vector<int> extract_v(Eigen::SparseMatrix<_Scalar,
                                 Eigen::RowMajor> A) {
  std::vector<int> v(A.nonZeros());
  for ( int j = 0; j < A.nonZeros(); ++j)
    v[j] = *(A.innerIndexPtr()+j);
  return v;
}

template <typename _Scalar>
const std::vector<int> extract_u(Eigen::SparseMatrix<_Scalar,
                                 Eigen::RowMajor> A) {
  std::vector<int> u(A.outerSize()+1);
  for ( int j = 0; j <= A.outerSize(); ++j)
    u[j] = *(A.outerIndexPtr()+j);
  return u;
}

template <typename _Scalar>
const std::vector<int> extract_z(Eigen::SparseMatrix<_Scalar,
                                 Eigen::RowMajor> A) {
  std::vector<int> u = extract_u(A);
  std::vector<int> z(A.outerSize()+1);
  std::adjacent_difference(u.begin(), u.end(), z.begin());
  z.erase(z.begin());
  return z;
}



>
> - Bob
>
> --
> You received this message because you are subscribed to a topic in the Google Groups "stan development mailing list" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-dev/X45dGQkaags/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to stan-dev+u...@googlegroups.com.

Bob Carpenter

unread,
Mar 14, 2015, 12:23:21 AM3/14/15
to stan...@googlegroups.com
I don't see an alternative to writing your own functions.

And you're right --- we don't want feature creep, or in
some sense we do, but it should be in a different issue that
involves properly updating the language to handle sparse matrices.

To properly add a sparse matrix data type is a huge issue.
We need to change the language to allow it to be declared,
then work on the I/O in CmdStan, RStan and PyStan, then add
enough operations operating on sparse matrices that it's actually
worthwhile. It's part of our current NSF proposal, but I'm guessing
it'd take six person months for a first version, minimum.

- Bob

Krzysztof Sakrejda

unread,
Mar 15, 2015, 6:03:27 PM3/15/15
to stan...@googlegroups.com
On Saturday, March 14, 2015 at 12:23:21 AM UTC-4, Bob Carpenter wrote:
> I don't see an alternative to writing your own functions.

I added 1) the extractors required to test sparse multiplication, 2) some simple tests for those extractors, 3) some simple tests for the sparse multiplication, and cleaned it up. The whole thing is at:

https://github.com/stan-dev/stan/compare/feature/sparse_multiply

In the testing doc when it says "we want to make sure all tests get tickled and the appropriate exception types are thrown and documented", it's not clear to me what Stan-appropriate behavior is. I'm throwing in some context below and asking for clarification.

In multiply_test I see:

TEST(MathMatrix, multiply_m_v_exception) {
  stan::math::matrix_d m;
  stan::math::vector_d v;

  m.resize(3, 5);
  v.resize(5);
  EXPECT_NO_THROW(stan::math::multiply(m, v));

  m.resize(3, 0);
  v.resize(0);
  EXPECT_THROW(stan::math::multiply(m, v), std::invalid_argument);

  m.resize(2, 3);
  v.resize(2);
  EXPECT_THROW(stan::math::multiply(m, v), std::invalid_argument);
}

So with wrong-size arguments, I should be throwing std::invalid_argument directly or doing something with the code in stan::math::check_*? Not sure which way to go.

It looks like the rest of the testing is helping people not shoot themselves in the foot (and devs figure out future breakage)... I'm ok completing it if I can get some direction on what to do. I guess I can also generate some matrices and compare sparse_multiply results to the dense multiply(X,b).

In the mean time I think this works as intended and I'll see what some test models do with it. Thanks to everyone for getting this sorted out so far!

Krzysztof

Bob Carpenter

unread,
Mar 15, 2015, 7:29:21 PM3/15/15
to stan...@googlegroups.com

> On Mar 16, 2015, at 9:03 AM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> On Saturday, March 14, 2015 at 12:23:21 AM UTC-4, Bob Carpenter wrote:
>> I don't see an alternative to writing your own functions.
>
> I added 1) the extractors required to test sparse multiplication, 2) some simple tests for those extractors, 3) some simple tests for the sparse multiplication, and cleaned it up. The whole thing is at:
>
> https://github.com/stan-dev/stan/compare/feature/sparse_multiply
>
> In the testing doc when it says "we want to make sure all tests get tickled and the appropriate exception types are thrown and documented", it's not clear to me what Stan-appropriate behavior is. I'm throwing in some context below and asking for clarification.
>
> In multiply_test I see:
>
> TEST(MathMatrix,multiply_m_v_exception) {
> stan::math::matrix_d m;
> stan::math::vector_d v;
>
> m.resize(3, 5);
> v.resize(5);
> EXPECT_NO_THROW(stan::math::multiply(m, v));
>
> m.resize(3, 0);
> v.resize(0);
> EXPECT_THROW(stan::math::multiply(m, v),
> std::invalid_argument);
>
> m.resize(2, 3);
> v.resize(2);
> EXPECT_THROW(stan::math::multiply(m, v), std::invalid_argument);
> }
>
> So with wrong-size arguments, I should be throwing std::invalid_argument directly or doing something with the code in stan::math::check_*? Not sure which way to go.

Use the built-in check functions and if the ones you need
aren't available, write more.

> It looks like the rest of the testing is helping people not shoot themselves in the foot (and devs figure out future breakage)... I'm ok completing it if I can get some direction on what to do. I guess I can also generate some matrices and compare sparse_multiply results to the dense multiply(X,b).

You can look at some of our other functions for guidance.

The general point of doc and testing is complementary. The
doc should define all the constraints (e.g., what sizes things have to
be relative to each other), and the tests should test they are
handled appropriately.

The tests should also check boundary conditions, such as if a size
zero matrix or vector is allowed, then it should behave properly.

> In the mean time I think this works as intended and I'll see what some test models do with it. Thanks to everyone for getting this sorted out so far!

Great.

- Bob

Krzysztof Sakrejda

unread,
Mar 17, 2015, 6:34:59 PM3/17/15
to stan...@googlegroups.com
On Sunday, March 15, 2015 at 7:29:21 PM UTC-4, Bob Carpenter wrote:
> The general point of doc and testing is complementary. The
> doc should define all the constraints (e.g., what sizes things have to
> be relative to each other), and the tests should test they are
> handled appropriately.
>
> The tests should also check boundary conditions, such as if a size
> zero matrix or vector is allowed, then it should behave properly.

I added tests using check_, but I seem to have missed testing for whatever the gradient-enabled types are... which leads to a question below.

>
> > In the mean time I think this works as intended and I'll see what some test models do with it. Thanks to everyone for getting this sorted out so far!
>
> Great.

I had some trouble getting this working all the way. When linking a real model I get the mess pasted below. It looks like I missed implementing something in sparse_multiply_csr for the autodiff variables. At this point I haven't read up on the autodiff variables so I'd appreciate any suggestions about what's going on here. Probably the easiest way to look at my changes (against develop) is here:

https://github.com/stan-dev/stan/compare/feature/sparse_multiply

The failed model compilation is below; it looks like a test model would fail the same way. This is for a full-scale problem with a 90% sparse matrix that I've already run using dense multiplication, so I think it'll make a decent test case.

krzysiek@fawkes ~/packages/cmdstan-2.6.2.1 $ make ~/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O3 -o bin/cmdstan/stanc.o src/cmdstan/stanc.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/expression_grammar_inst.o stan/src/stan/lang/grammars/expression_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/program_grammar_inst.o stan/src/stan/lang/grammars/program_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/term_grammar_inst.o stan/src/stan/lang/grammars/term_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/statement_2_grammar_inst.o stan/src/stan/lang/grammars/statement_2_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/var_decls_grammar_inst.o stan/src/stan/lang/grammars/var_decls_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/bare_type_grammar_inst.o stan/src/stan/lang/grammars/bare_type_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/statement_grammar_inst.o stan/src/stan/lang/grammars/statement_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/whitespace_grammar_inst.o stan/src/stan/lang/grammars/whitespace_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/functions_grammar_inst.o stan/src/stan/lang/grammars/functions_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/grammars/expression07_grammar_inst.o stan/src/stan/lang/grammars/expression07_grammar_inst.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O0 -o bin/stan/lang/ast_def.o stan/src/stan/lang/ast_def.cpp
ar -rs bin/libstanc.a bin/stan/lang/grammars/expression_grammar_inst.o bin/stan/lang/grammars/program_grammar_inst.o bin/stan/lang/grammars/term_grammar_inst.o bin/stan/lang/grammars/statement_2_grammar_inst.o bin/stan/lang/grammars/var_decls_grammar_inst.o bin/stan/lang/grammars/bare_type_grammar_inst.o bin/stan/lang/grammars/statement_grammar_inst.o bin/stan/lang/grammars/whitespace_grammar_inst.o bin/stan/lang/grammars/functions_grammar_inst.o bin/stan/lang/grammars/expression07_grammar_inst.o bin/stan/lang/ast_def.o
ar: creating bin/libstanc.a
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -lpthread -O0 -o bin/stanc bin/cmdstan/stanc.o -Lbin -lstanc


--- Translating Stan model to C++ code ---
bin/stanc /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.stan --o=/home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp
Model name=cjs_seasonality_4a_model
Input file=/home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.stan
Output file=/home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -c -O3 -o bin/cmdstan/print.o src/cmdstan/print.cpp
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -lpthread -O0 -o bin/print bin/cmdstan/print.o

--- Linking C++ model ---
g++ -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -I src -I stan/src -isystem stan/lib/eigen_3.2.4 -isystem stan/lib/boost_1.55.0 -Wall -pipe -DEIGEN_NO_DEBUG -Wno-unused-local-typedefs -lpthread -O3 -o /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a src/cmdstan/main.cpp -include /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp
In file included from stan/src/stan/math/prim/mat.hpp:152:0,
from stan/src/stan/model/model_header.hpp:12,
from /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp:3,
from <command-line>:0:
stan/src/stan/math/prim/mat/fun/sparse_multiply.hpp: In instantiation of ‘Eigen::Matrix<typename boost::math::tools::promote_args<RT1, RT2>::type, -1, 1> stan::math::sparse_multiply_csr(const int&, const int&, const Eigen::Matrix<Scalar, -1, 1>&, const std::vector<int>&, const std::vector<int>&, const std::vector<int>&, const Eigen::Matrix<RhsScalar, -1, 1>&) [with T1 = double; T2 = stan::agrad::var; typename boost::math::tools::promote_args<RT1, RT2>::type = stan::agrad::var]’:
/home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp:643:114: required from ‘T__ cjs_seasonality_4a_model_namespace::cjs_seasonality_4a_model::log_prob(std::vector<T2>&, std::vector<int>&, std::ostream*) const [with bool propto__ = true; bool jacobian__ = true; T__ = stan::agrad::var; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/util.hpp:110:70: required from ‘double stan::model::log_prob_grad(const M&, std::vector<double>&, std::vector<int>&, std::vector<double>&, std::ostream*) [with bool propto = true; bool jacobian_adjust_transform = true; M = cjs_seasonality_4a_model_namespace::cjs_seasonality_4a_model; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/util.hpp:290:74: required from ‘int stan::model::test_gradients(const M&, std::vector<double>&, std::vector<int>&, double, double, std::ostream&, std::ostream*) [with bool propto = true; bool jacobian_adjust_transform = true; M = cjs_seasonality_4a_model_namespace::cjs_seasonality_4a_model; std::ostream = std::basic_ostream<char>]’
stan/src/stan/services/command.hpp:200:39: required from ‘int stan::services::command(int, const char**) [with Model = cjs_seasonality_4a_model_namespace::cjs_seasonality_4a_model]’
src/cmdstan/main.cpp:7:57: required from here
stan/src/stan/math/prim/mat/fun/sparse_multiply.hpp:135:51: error: no matching function for call to ‘dot_product(Eigen::Matrix<stan::agrad::var, -1, 1, 0, -1, 1>&, Eigen::Matrix<stan::agrad::var, -1, 1, 0, -1, 1>&)’
y(i) = stan::math::dot_product(w_sub,b_sub);
^
stan/src/stan/math/prim/mat/fun/sparse_multiply.hpp:135:51: note: candidates are:
In file included from stan/src/stan/math/prim/mat.hpp:73:0,
from stan/src/stan/model/model_header.hpp:12,
from /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp:3,
from <command-line>:0:
stan/src/stan/math/prim/mat/fun/dot_product.hpp:22:19: note: template<int R1, int C1, int R2, int C2> double stan::math::dot_product(const Eigen::Matrix<double, R1, C1>&, const Eigen::Matrix<double, R2, C2>&)
inline double dot_product(const Eigen::Matrix<double, R1, C1>& v1,
^
stan/src/stan/math/prim/mat/fun/dot_product.hpp:22:19: note: template argument deduction/substitution failed:
In file included from stan/src/stan/math/prim/mat.hpp:152:0,
from stan/src/stan/model/model_header.hpp:12,
from /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp:3,
from <command-line>:0:
stan/src/stan/math/prim/mat/fun/sparse_multiply.hpp:135:51: note: mismatched types ‘double’ and ‘stan::agrad::var’
y(i) = stan::math::dot_product(w_sub,b_sub);
^
stan/src/stan/math/prim/mat/fun/sparse_multiply.hpp:135:51: note: ‘Eigen::Matrix<stan::agrad::var, -1, 1, 0, -1, 1>’ is not derived from ‘const Eigen::Matrix<double, R1, C1>’
In file included from stan/src/stan/math/prim/mat.hpp:73:0,
from stan/src/stan/model/model_header.hpp:12,
from /home/krzysiek/projekty/westbrook_ats_survival/analysis/cjs-seasonality-4a/cjs-seasonality-4a.hpp:3,






>
> - Bob

Bob Carpenter

unread,
Mar 17, 2015, 8:10:23 PM3/17/15
to stan...@googlegroups.com
The problem is that stan::math only applies to primitives (double
and int, for Stan). Rather than calling functions with explicit
namespaces, you need to have a using statement so that
argument-dependent lookup can do its magic:

using stan::math::dot_product;
...
y(i) = dot_product(w_sub,b_sub);

Just make sure the autodiff versions of functions you need
are included --- they should be for a real model.

- Bob

P.S. See http://en.wikipedia.org/wiki/Argument-dependent_name_lookup

P.P.S. If you want to see a real error novel, try mucking about with
the lazy evaluation expression templates in the parser :-)

Krzysztof Sakrejda

unread,
Mar 17, 2015, 11:05:43 PM3/17/15
to stan...@googlegroups.com
On Tuesday, March 17, 2015 at 8:10:23 PM UTC-4, Bob Carpenter wrote:
> The problem is that stan::math only applies to primitives (double
> and int, for Stan). Rather than calling functions with explicit
> namespaces, you need to have a using statement so that
> argument-dependent lookup can do its magic:
>
> using stan::math::dot_product;
> ...
> y(i) = dot_product(w_sub,b_sub);
>
> Just make sure the autodiff versions of functions you need
> are included --- they should be for a real model.

Done! It gets through that.
Ooh weird. I knew I'd have to figure out more C++ stuff sooner or later. If you've got any book suggestions on the template metaprogramming stuff I'm all ears.

> P.P.S. If you want to see a real error novel, try mucking about with
> the lazy evaluation expression templates in the parser :-)

Hah, I'm sure that'll come up. I'll skip it for now but in a month or three I'd like to try and contribute to the sparse stuff in a more long-term way. In the meantime I had some mixed-types problems and I'm not sure if I fixed them correctly but the model compiles for a medium size problem:

Here's the X matrix:

> str(shared_data$p_X_sparse)
List of 5
$ w : num [1:1351868] 0.00556 0.54533 0.59435 0.0072 0.02086 ...
$ v : int [1:1351868] 1 2 3 4 1 2 3 4 1 2 ...
$ u : int [1:220036] 1 5 9 13 17 21 25 29 33 38 ...
$ z : int [1:220035] 4 4 4 4 4 4 4 4 5 6 ...
$ type: chr "CSR"
> ncol(phi_X)
[1] 52
> nrow(phi_X)
[1] 220035

The gradient evaluation (according to Stan, average of five chains), takes 67% of the time it did when using the dense matrix multiplication. It's a bit of a bummer to be using the CSR format since the dot product is then applied per row and in a model matrix that still means a lot of evaluations. I'll make sure the results match up with the run that used a dense matrix, test out the CSC version, and see if there's a way of applying the dot product column-wise. For some of the other models I have using radial basis functions that go to zero quickly in a large space this performance difference is only going to get better so I think this is exciting. I'll take two days over four days... any day...

The memory usage has to be better but I don't have numbers recorded for previous models so that'll have to wait too.

If anybody has a test model (or small set) they'd like run I'd be happy to give it a shot. I haven't really thought about how to test performance in this context (much).

Krzysztof



Bob Carpenter

unread,
Mar 17, 2015, 11:34:28 PM3/17/15
to stan...@googlegroups.com

> On Mar 18, 2015, at 2:05 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> On Tuesday, March 17, 2015 at 8:10:23 PM UTC-4, Bob Carpenter wrote:
>> The problem is that stan::math only applies to primitives (double
>> and int, for Stan). Rather than calling functions with explicit
>> namespaces, you need to have a using statement so that
>> argument-dependent lookup can do its magic:
>>
>> using stan::math::dot_product;
>> ...
>> y(i) = dot_product(w_sub,b_sub);
>>
>> Just make sure the autodiff versions of functions you need
>> are included --- they should be for a real model.
>
> Done! It gets through that.
>
>> - Bob
>>
>> P.S. See http://en.wikipedia.org/wiki/Argument-dependent_name_lookup
>
> Ooh weird. I knew I'd have to figure out more C++ stuff sooner or later. If you've got any book suggestions on the template metaprogramming stuff I'm all ears.

There are two that everyone recommends:

Alexandrescu

Vandevoorde and Josuttis.


>
>> P.P.S. If you want to see a real error novel, try mucking about with
>> the lazy evaluation expression templates in the parser :-)
>
> Hah, I'm sure that'll come up. I'll skip it for now but in a month or three I'd like to try and contribute to the sparse stuff in a more long-term way. In the meantime I had some mixed-types problems and I'm not sure if I fixed them correctly but the model compiles for a medium size problem:
>
> Here's the X matrix:
>
>> str(shared_data$p_X_sparse)
> List of 5
> $ w : num [1:1351868] 0.00556 0.54533 0.59435 0.0072 0.02086 ...
> $ v : int [1:1351868] 1 2 3 4 1 2 3 4 1 2 ...
> $ u : int [1:220036] 1 5 9 13 17 21 25 29 33 38 ...
> $ z : int [1:220035] 4 4 4 4 4 4 4 4 5 6 ...
> $ type: chr "CSR"
>> ncol(phi_X)
> [1] 52
>> nrow(phi_X)
> [1] 220035
>
> The gradient evaluation (according to Stan, average of five chains), takes 67% of the time it did when using the dense matrix multiplication.

It'll vary by the level of sparsity. You're fighting against
a very tight implementation in Eigen. We can probably make
the basic matrix stuff quite a bit faster, too.

I don't see any way to make what you wrote any faster, but
I'd be happy to take a look at some point. Or better yet,
run it by Marcus.

> It's a bit of a bummer to be using the CSR format since the dot product is then applied per row and in a model matrix that still means a lot of evaluations. I'll make sure the results match up with the run that used a dense matrix, test out the CSC version, and see if there's a way of applying the dot product column-wise. For some of the other models I have using radial basis functions that go to zero quickly in a large space this performance difference is only going to get better so I think this is exciting. I'll take two days over four days... any day...

Factors of two add up (multiply up?) quickly.

> The memory usage has to be better but I don't have numbers recorded for previous models so that'll have to wait too.

This may also be tight because we do a lot of memory reuse in
the dense matrix multiplication, so memory's not too bad.

- Bob

Bob Carpenter

unread,
Mar 17, 2015, 11:36:23 PM3/17/15
to stan...@googlegroups.com
Sorry --- forgot the names of the books:

Alexandrescu: Modern C++ Design: Generic Programming and Design Patterns Applied

Vandevoorde and Josuttis: C++ Templates: The Complete Guide

They're both very good. The former is more of a hands-on style
whereas the latter's a bit more computer-sciency, and they have
complementary strengths in what they cover. So you really want to
read them both, get your hands dirty, then read them again. Sort of
like any other technical material.

- Bob

Krzysztof Sakrejda

unread,
Mar 18, 2015, 1:08:04 AM3/18/15
to stan...@googlegroups.com
On Tue, Mar 17, 2015 at 11:32 PM, Bob Carpenter <ca...@alias-i.com> wrote:
>
>> On Mar 18, 2015, at 2:05 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>>
>> On Tuesday, March 17, 2015 at 8:10:23 PM UTC-4, Bob Carpenter wrote:
[snip]
> It'll vary by the level of sparsity. You're fighting against
> a very tight implementation in Eigen. We can probably make
> the basic matrix stuff quite a bit faster, too.
>
> I don't see any way to make what you wrote any faster, but
> I'd be happy to take a look at some point. Or better yet,
> run it by Marcus.

You're right, I'm not sure what I was thinking. The dot product must
(ultimately) go by row, so the CSR format has everything set up right,
and the only way to make it faster would be to drop some of the
intermediaries. I think this is me catching up with Marcus' original
comment about using dot product.

>> It's a bit of a bummer to be using the CSR format since the dot product is then applied per row and in a model matrix that still means a lot of evaluations. I'll make sure the results match up with the run that used a dense matrix, test out the CSC version, and see if there's a way of applying the dot product column-wise. For some of the other models I have using radial basis functions that go to zero quickly in a large space this performance difference is only going to get better so I think this is exciting. I'll take two days over four days... any day...
>
> Factors of two add up (multiply up?) quickly.
>
>> The memory usage has to be better but I don't have numbers recorded for previous models so that'll have to wait too.
>
> This may also be tight because we do a lot memory reuse in
> the dense matrix multiplication, so memory's not too bad.

I was thinking storage for the original X matrix. When using the full
matrix I already had to modify rstan to pipe it through rdump for one
of my models, so just avoiding that mess will be nice. I have no idea
about the multiplication.

Krzysztof

p.s.-thanks for the book recommendations

>
> - Bob
>

Bob Carpenter

unread,
Mar 18, 2015, 1:20:23 AM3/18/15
to stan...@googlegroups.com

> On Mar 18, 2015, at 4:08 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> On Tue, Mar 17, 2015 at 11:32 PM, Bob Carpenter <ca...@alias-i.com> wrote

...

>> I don't see any way to make what you wrote any faster, but
>> I'd be happy to take a look at some point. Or better yet,
>> run it by Marcus.
>
> You're right, I'm not sure what I was thinking. The dot product must
> (ultimately) go by row, so the CSR format has everything set up right
> and the only way to make it faster would be drop some of the
> intermediaries. I think this is me catching up with Marcus' original
> comment about using dot product.

I was assuming you did put everything together into
dot product, of course --- that's a huge savings in
speed and space over individual multiplications.

- Bob

Krzysztof Sakrejda

unread,
Mar 18, 2015, 1:25:29 AM3/18/15
to stan...@googlegroups.com
On Wed, Mar 18, 2015 at 1:19 AM, Bob Carpenter <ca...@alias-i.com> wrote:
>
>> On Mar 18, 2015, at 4:08 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>>
>> On Tue, Mar 17, 2015 at 11:32 PM, Bob Carpenter <ca...@alias-i.com> wrote
>
[snip]
>
> I was assuming you did put everything together into
> dot product, of course --- that's a huge savings in
> speed and space over individual multiplications.
>

Yeah, but if your model matrix is m x n and you're using row-storage
format you have to do m dot products, which is better than m x n
individual multiplications, but a model matrix typically has many more
rows than columns so it's not as awesome as it could be. For example,
the 33% speed-up is for a 200k x 52 matrix, and it's going to get
better for wider matrices. I was trying to scheme about some way of
doing things column-wise with dot_product but I don't think that's
possible. If I want to get that last bit of speed for the gradients
I'd need to implement the gradient directly... unless there's
something I'm missing.

Krzysztof

Bob Carpenter

unread,
Mar 18, 2015, 3:07:23 AM3/18/15
to stan...@googlegroups.com
With N observations and K predictors, the data matrix
will be (N x K) and the parameter vector (K x 1) in
a simple non-multilevel regression, then you get N
dot products, each involving K multiplications and K
additions (assuming you start with zero).

The naive approach introduces (2 * K * N) nodes into the
expression graph, each taking a node pointer (8 bytes),
vtable pointer (8 bytes), value (8 bytes), adjoint (8 bytes),
and two operands (16 bytes), for a total of (96 * K * N)
bytes of storage.

The dot-product approach introduces N nodes, each taking
a node pointer (8 bytes), vtable pointer (8 points),
value (8 bytes), and 2*K operands, for a total of
((24 + 2K) * N) bytes of storage. The overall fraction
of memory used is:

(24 + 2 * K) / (96 * K)

K=1: 26 / 96
K=10: 44 / 960
K=100: 224 / 9600

The naive approach requires (2 * K * N) virtual function
lookups and the dot-product approach only N. The overall
arithmetic operations are the same each way.

I think I did that arithmetic right.

- Bob

Bob Carpenter

unread,
Mar 18, 2015, 3:41:23 AM3/18/15
to stan...@googlegroups.com
I looked at your branch:

https://github.com/stan-dev/stan/compare/feature/sparse_multiply

and the first problem is that you didn't follow our convention
of naming branches after the issues. No big deal, but it
makes them hard to put together later. I know it's verbose.
We should probably remove the "issue-" convention.

I didn't understand the difference between _csr and _csc.
The final "r" vs. "c" seems to hint at some row vs. column
difference, but the doc seemed to be identical and I couldn't
easily work through the code.

It would be tricky to have the _csr version use dot product.
You'd need to accumulate two vectors of unknown size to start,
so it'd require two passes or using something like a vector
to accumulate.

Some other comments.

You don't want to do this:

Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub;
w_sub = w.segment(u[i]-1,z[i]);

but rather just do the compound declare/define:

Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub
= w.segment(u[i]-1,z[i]);

or even just apply the copy constructor directly as in:

Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub(w.segment(u[i]-1,z[i]));

For this function, and others like it:

template <typename _Scalar>
const std::vector<int>
extract_u(Eigen::SparseMatrix<_Scalar, Eigen::RowMajor> A) {
  std::vector<int> u(A.outerSize()+1);
  for ( int j = 0; j <= A.outerSize(); ++j)
    u[j] = *(A.outerIndexPtr()+j) + 1; // make 1-indexed
  return u;
}

1. no reason to use _Scalar for type, just use Scalar, or
even just T.

2. No space between "(" and "int"

3. Spaces around "+".

4. Stay away from pointer manipulation. I don't know how
the sparse matrices work in Eigen, but presumably there's some
way to get at the values you want other than pointer ops.

http://eigen.tuxfamily.org/dox/group__TutorialSparse.html

I didn't even see outerIndexPtr() in their tutorial and the
API doc says it's just there for compatibility with other
libs. That means you might be able to just construct the
vector you want with iterators.

5. I didn't understand the make 1-indexed comment, but then
these functions are tricky and I didn't spend much time trying
to figure it out.

- Bob

Krzysztof Sakrejda

unread,
Mar 18, 2015, 4:10:03 AM3/18/15
to stan...@googlegroups.com
On Wed, Mar 18, 2015 at 3:40 AM, Bob Carpenter <ca...@alias-i.com> wrote:
> I looked at your branch:
>
> https://github.com/stan-dev/stan/compare/feature/sparse_multiply
>
> and the first problem is that you didn't follow our convention
> of naming branches after the issues. No big deal, but it
> makes them hard to put together later. I know it's verbose.
> We should probably remove the "issue-" convention.
>
> I didn't understand the difference between _csr and _csc.
> The final "r" vs. "c" seems to hint at some row vs. column
> difference, but the doc seemed to be identical and I couldn't
> easily work through the code.

Thanks for taking a peek.

The extractors _are_ the same; for some reason I couldn't get the
compiler to do the right thing without having a second set for
Eigen::RowMajor... I'll have to take another shot at it sometime now
that everything works.

The multiplication is different because csc (that's what they call
it...) is by column and csr is by row. The doc looks the same until
you realize that column/row switch places between the two... sorry,
there seems to be no good way of documenting this. Eigen documents it
in one place and pretends people have sed in their brain.

>
> It would be tricky to have the _csr version use dot product.
> You'd need to accumulate two vectors of unknown size to start,
> so it'd require two passes or using something like a vector
> to accumulate.

Agreed (well, for the csc version, csr does use dot product). Can't
see how that would be efficient.

>
> Some other comments.
>
> You don't want to do this:
>
> Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub;
> w_sub = w.segment(u[i]-1,z[i]);
>
> but rather just do the compound declare/define:
>
> Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub
> = w.segment(u[i]-1,z[i]);
>
> or even just apply the copy constructor directly as in:
>
> Eigen::Matrix<T1,Eigen::Dynamic,1> w_sub(w.segment(u[i]-1,z[i]));

That's what I did, but compilation failed so I split it out. I'll go
back and figure out if your suggestions do work; maybe I was doing
something weird the first time around.

>
> For this function, and others like it:
>
> template <typename _Scalar>
> const std::vector<int>
> extract_u(Eigen::SparseMatrix<_Scalar, Eigen::RowMajor> A) {
>   std::vector<int> u(A.outerSize()+1);
>   for ( int j = 0; j <= A.outerSize(); ++j)
>     u[j] = *(A.outerIndexPtr()+j) + 1; // make 1-indexed
>   return u;
> }
>
> 1. no reason to use _Scalar for type, just use Scalar, or
> even just T.
>
> 2. No space between "(" and "int"
>
> 3. Spaces around "+".

ok.

>
> 4. Stay away from pointer manipulation. I don't know how
> the sparse matrices work in Eigen, but presumably there's some
> way to get at the values you want other than pointer ops.
>
> http://eigen.tuxfamily.org/dox/group__TutorialSparse.html
>
> I didn't even see outerIndexPtr() in their tutorial and the
> API doc says it's just there for compatibility with other
> libs. That means you might be able to just construct the
> vector you want with iterators.

I looked for that first, but there seriously seems to be no good way
of getting at it. There's a high-level API, and then there's
pointers...
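
One way to copy the 0-based outer-index array (as Eigen's outerIndexPtr() exposes it) into the 1-based vector Stan expects without explicit pointer arithmetic is a plain std::transform over the array. An editorial sketch (the function name is hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Copy n entries of a 0-based outer-index array into a 1-based
// std::vector<int>; std::transform handles the iteration instead
// of manual pointer offsets.
std::vector<int> to_one_indexed(const int* outer, int n) {
  std::vector<int> u(n);
  std::transform(outer, outer + n, u.begin(),
                 [](int idx) { return idx + 1; });
  return u;
}
```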

>
> 5. I didn't understand the make 1-indexed comment, but then
> these functions are tricky and I didn't spend much time trying
> to figure it out.

Yeah, it's really screwy, since the function is exposed in Stan so
indexing is supposed to start at one, but it's doing pointer
arithmetic so it obviously starts at zero... The upside is that the
test I have written uses these functions to do multiplication with a
dense X in sparse format, so all the indexing is exercised. That part
I'm confident about. The check_ functions I call catch most of the
ways you can screw things up (they mirror the doc). If you generate a
screwy 'z' vector you can still get an out-of-bounds index (whatever
that error's called...), but that's why I exposed the extractors
through rstan... I suppose I could check that one during
multiplication.

There's a lot not to like about the whole thing....

Krzysztof

Bob Carpenter

unread,
Mar 18, 2015, 4:38:23 AM3/18/15
to stan...@googlegroups.com

> On Mar 18, 2015, at 7:10 PM, Krzysztof Sakrejda
...

> There's a lot not to like about the whole thing....

Yeah, it's C++ code for a start :-)

Our whole API could use going over with a fine-toothed
comb. We're doing a lot wrong now with Eigen and Joshua Pritikin's
helping us make it right. The first version was terrible until
Marcus swooped in and did everything much more efficiently with
custom derivatives.

The key thing's making sure the first version works and is
well documented --- if it can be made efficient and people care
about it, that can come later.

- Bob


Krzysztof Sakrejda

unread,
Mar 18, 2015, 9:01:16 AM3/18/15
to stan...@googlegroups.com
On Wednesday, March 18, 2015 at 4:38:23 AM UTC-4, Bob Carpenter wrote:
> > On Mar 18, 2015, at 7:10 PM, Krzysztof Sakrejda
> ...
>
> > There's a lot not to like about the whole thing....
>
> Yeah, it's C++ code for a start :-)

Hey, I like C++, but maybe that's just because I've been surrounded by R for so long that I like code that takes long enough to write that it's worth documenting/testing.

Krzysztof Sakrejda

unread,
Mar 20, 2015, 7:12:31 PM3/20/15
to stan...@googlegroups.com
Looks right to me... and on a practical test (.stan files attached for both non-sparse and sparse) I got roughly a 2x speed-up (6.5 hours instead of 13, taking the middle of five chains). Roughly the same inits, same data, results are similar, etc...

Krzysztof

p.s.- the only difference between the two files is in the data declaration (since there's no sparse type) and the multiplication call.

krzysiek@fawkes ~/projekty/westbrook_ats_survival/analysis $ diff ./cjs-seasonality-4/cjs-seasonality-4.stan ./cjs-seasonality-4a/cjs-seasonality-4a.stan
37c37,45
< matrix[n_rows,n_phi_effects] phi_X;
---
>
> // matrix[n_rows,n_phi_effects] phi_X;
> int phi_X_NNZ;
> int phi_X_m;
> int phi_X_n;
> vector[phi_X_NNZ] phi_X_w;
> int phi_X_v[phi_X_NNZ];
> int phi_X_u[phi_X_m+1];
> int phi_X_z[phi_X_m];
125c133,134
< phi <- phi_X * phi_betas;
---
> phi <- sparse_multiply_csr(phi_X_m, phi_X_n, phi_X_w, phi_X_v, phi_X_u, phi_X_z, phi_betas);
>
cjs-seasonality-4.stan
cjs-seasonality-4a.stan