Matrix Multiplication / Neural Nets in Rstan

Sam Weiss

unread,

May 12, 2015, 5:56:56 PM5/12/15

to stan-...@googlegroups.com

I'm trying to made a neural network in Rstan (just to try it) and having trouble doing some matrix /vector multiplication. I'm getting the error:

TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.

Error in stanc(file = file, model_code = model_code, model_name = model_name, :

failed to parse Stan model 'model' with error message:

EXPECTATION FAILURE - DIAGNOSTIC(S) FROM PARSER:

no matches for function name="multiply"

arg 0 type=vector

arg 1 type=vector

available function signatures for multiply:

0. multiply(real, real) : real

1. multiply(vector, real) : vector

2. multiply(row vector, real) : row vector

3. multiply(matrix, real) : matrix

4. multiply(row vector, vector) : real

5. multiply(vector, row vector) : matrix

6. multiply(matrix, vector) : vector

7. multiply(row vector, matrix) : row vector

8. multiply(matrix, matrix) : matrix

9. multiply(real, vector) : vector

10. multiply(real, row vector) : row vector

11. multiply(real, matrix) : matrix

expression is ill formed

LOCATION OF PARSING ERROR (line = 26, position = 90):

PARSED:

model{

matrix[N,num_nodes] fitted_middle[num_hidden];

for(n in 1:N)

for(j in 1:num_nodes) fitted_middle[1,n, j] ~ bernoulli_logit(beta_initial[j]*x_data[n]

EXPECTED: <eps> BUT FOUND:

);

How can I set up the multiplication so it is in matrix form?

I'm mainly trying to get this to compile to see if it works but, more generally, could rstan be used to estimate parameters for neural networks instead of gradient descent?

Thanks in advance,

Sam

nnet_rstan.R

Ben Goodrich

unread,

May 12, 2015, 6:23:39 PM5/12/15

to stan-...@googlegroups.com, samc...@gmail.com

Two vectors are not conformable for multiplication. You could transpose the first one, or use the dot_product() function (which is more tolerant of input that is not strictly correct).

Ben

Sam Weiss

unread,

May 14, 2015, 10:46:56 AM5/14/15

to stan-...@googlegroups.com, samc...@gmail.com

Ben

Thanks! Got it to work. Here's github if anyone is interested:

https://github.com/samcarlos/Rstan-Neural-Net/blob/master/nnet_rstan.R

Sam

Bob Carpenter

unread,

May 14, 2015, 11:39:57 AM5/14/15

to stan-...@googlegroups.com

Thanks for sharing. We've been meaning to do something like
this ourselves for ages so would like to include a variant of
what you're doing in our doc or example models.

Were you reproducing a textbook example here (I saw the Iris
data)?

Did you fit it with optimization or MCMC?

How well did it work? We'd like to include an example with
some of the mods suggested below.

Where'd iris come from? This is the first line of the R:

iris1=iris[1:100,]

I'd suggest replacing

fitted_middle[q,n, j] <- 1/ (1+exp(-beta_middle[q, j] * fitted_middle[q-1,n]'));

with:

fitted_middle[q,n,j] <- inv_logit(beta_middle[q,j] * fitted_middle[q-1,n]');

dot_product's even a bit faster. It would be even faster to do a matrix
product for beta_middle and fitted_middle and assign to a temporary and then
assign to fitted_middel using inverse logit.

If you define beta_final and fitted_middle appropriately in terms
of shape, you should be able to replace this:

y_data[n] ~ bernoulli_logit(beta_final' * fitted_middle[num_hidden,n]');

with

y_data ~ bernoulli_logit(beta_final * fitted_middle);

without having to transform anything (though transforms aren't that
expensive because there are no derivatives).

Also, if you don't want to save all of fitted_middle, you could
declare and define it in the model block as a local variable.

- Bob

> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

FHRB Toledo

unread,

May 14, 2015, 11:57:17 AM5/14/15

to stan-...@googlegroups.com

Dear All,

Thanks indeed to Sam for sharing it.

Regarding the Bob questions:

The iris data is from Fisher (http://en.wikipedia.org/wiki/Iris_flower_data_set). It was also used as example on the nnet R package and is a native's R data.

Regards,
FH

Sam Weiss

unread,

May 14, 2015, 11:58:31 AM5/14/15

to stan-...@googlegroups.com

Hey Bob,

The Iris dataset is in my R sessions. I just type in "iris" and the data set is there : http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/iris.html. I used this dataset since it seems to be a standard among testing new applications (its always in ML vignettes). The iris dataset includes 3 classes of 50 observations each, so I just took first 100 observations which corresponds to first two classes.

It was fit with MCMC.

As for performance, it does predict data out of sample well but I haven't compared it with neuralnet packages found in R.

I didn't take this example from a textbook but just wanted to get a better understanding of neural networks and rstan so I decided to try it..

Thanks for suggestions. I'll include them.

Of course, feel free to use this as a starting point for your documentation.

Sam

You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/3QBBpo11Lus/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

Bob Carpenter

unread,

May 14, 2015, 12:02:56 PM5/14/15

to stan-...@googlegroups.com

Wow --- a top-level global named iris for the iris data.
R never ceases to amaze me!

- Bob

Herra Huu

unread,

May 14, 2015, 3:08:39 PM5/14/15

to stan-...@googlegroups.com

Nice. I have been wanting to test neural networks with Stan for some time as well. So this seemed like a good opportunity to finally test it out. Made some improvements (hopefully) to the provided code. I guess the most important changes were:

- Bob's ideas (at least most of them) + some additional linear algebra

- added priors for betas (probably the most important part!)

- fixed some bugs (for example, first middle layers wasn't used at all in the original code, I think)

I'm not 100% sure that there isn't any bugs any more, but at least the predictions seems to be pretty decent. And it's quite fast + n_eff&Rhat values are good.

Edit: updated file nn.stan, fixed one small bug.

testing_nn.R

nn.stan

Aki Vehtari

unread,

May 15, 2015, 8:41:49 AM5/15/15

to stan-...@googlegroups.com

In nnet literature it's quite common to center sigmoid transfer functions to 0.

It might be useful to scale also inv_logits as

inv_logit(.)*2-1
to make the prior mean of the hidden unit to be 0, which will make the zero mean priors used for the weights to work better.

Aki

Herra Huu

unread,

May 15, 2015, 10:26:01 AM5/15/15

to stan-...@googlegroups.com

Thanks, that makes a lot of sense. Updated model code as an attachment. It did improve results quite a bit, at least for the iris dataset (which is not that complex). I think I will do some simulations next.

nn.stan

Sam Weiss

unread,

May 15, 2015, 10:39:09 AM5/15/15

to stan-...@googlegroups.com

I'm getting an error when I try and run the code:

"Error in (function (classes, fdef, mtable) :

unable to find an inherited method for function ‘sampling’ for signature ‘"missing"’ "

Do you know what that could be?

Thanks,
Sam

--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/3QBBpo11Lus/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

Herra Huu

unread,

May 15, 2015, 10:49:14 AM5/15/15

to stan-...@googlegroups.com

Oh sorry, I didn't remember to post the updated R file (I changed some of the parameter/data names in the model, so that's probably the problem). Does this (attached file) work?

testing_nn.R

Sam Weiss

unread,

May 15, 2015, 10:49:28 AM5/15/15

to stan-...@googlegroups.com

Loaded .rstan file incorrectly. Works fine.

Bob Carpenter

unread,

May 15, 2015, 3:27:56 PM5/15/15

to stan-...@googlegroups.com

You can use tanh(u) instead of inv_logit(u) * 2 - 1.

At least that's what I've usually seen used.

It scales by a factor of 2, in that:

2 * inv_logit(u) -1 = tanh(u / 2)

- Bob

Reply all

Reply to author

Forward