Scope of "model" block variables and access to them subsequently

950 views

Skip to first unread message

Rudolf Cardinal

unread,

Mar 9, 2013, 10:11:55 PM3/9/13

to stan-...@googlegroups.com

Dear all,

I've just come across Stan and am very impressed; thanks to all those who have developed it!

I'm trying to use it for reinforcement learning models in which some information is stored locally (in the "model" block) and used to calculate things on a trial-by-trial, cumulative basis. Stan seems potentially ideal for this, given that you can do imperative coding within the "model" block.

The Stan manual (v1.2.0) says (under "Variable scope") that "The variables declared in each block have scope over all subsequent statements. Thus a variable declared in the transformed data block may be used in the model block. But a variable declared in the generated quantities block may not be used in any earlier block, including the model block."

[There's one qualification I've noticed: variables in an earlier block are read-only for later blocks (with the exception of parameter sampling statements, which I'm aware are not the same sort of thing as conventional assignments).]

However, variables declared in the "model" block (even at its top level) are not in scope in the "generated quantities" block. Accordingly, if I want to retrieve values of my calculated probabilities of choosing the better stimulus on each trial ("p_best" in the skeleton code below), I can't retrieve it from the "model" block, and if I declare it in an earlier block, I can't write to it from the "model" block -- so my only option is to recalculate it within the "generated quantities" block -- which is somewhat inefficient since I had to calculate it anyway as part of the model. (Not as inefficient as it could be, as I understand that the "generated quantities" block is executed less often than the "model" block -- but it's still noticeable.)

Could "model" variables be put in read-only scope for the "generated quantities" block (clearly, with the caveat that only variables declared in the "generated quantities" should be reported back to e.g. RStan or the final output, or there'd be too much junk coming back in many cases -- but a simple copy would suffice, e.g.: generated quantities { real my_copy; my_copy <- my_model_var; })?

Or is there some other obvious way of retrieving posterior distributions of things calculated within the model block that I've not seen?

Simplest code to reproduce is:

model {

real modelthing;

modelthing <- 1.0;

}

generated quantities {

real thing;

thing <- modelthing;

}

... generates error:

thing <- modelthing;

^-- here

DIAGNOSTIC(S) FROM PARSER:

variable "modelthing" does not exist.

More realistic skeleton code below.

Many thanks!

all the best,

Rudolf.

data {

int<lower=0> T; // number of trials in total

int<lower=0> S; // number of subjects

int<lower=0> G; // number of groups

int<lower=1,upper=G> group_membership[S]; // which group is each subject in?

int<lower=1,upper=S> subject[T]; // which subject?

int<lower=1> discrimination[T]; // which discrimination?

int<lower=0,upper=1> chose_best[T]; // did the subject choose the best stimulus (0 no, 1 yes)?

int<lower=0,upper=1> reward[T]; // was the trial rewarded?

// ...

}

parameters {

// Group level, assuming separate means and shared SD

real<lower=0,upper=1> group_mean_reinf_rate[G];

real<lower=0> group_sd_reinf_rate;

// ...

// Subject level

real<lower=0,upper=1> subject_reinf_rate[S];

// ...

}

transformed parameters {

// Constants

real softmax_b;

softmax_b <- 1.0;

}

model {

// What we will optimize

real p_best[T]; // calculated probability of choosing the best stimulus

// Keeping track of the current subject/discrimination

int s; // current subject

real reinf_rate;

// ...

// Value tracking

vector[2] stim_value; // worst, best

// ...

// Specify priors

group_mean_reinf_rate ~ beta(1.1, 1.1);

group_sd_reinf_rate ~ cauchy(0, 5);

// ...

// Determine subject parameters

for (subj in 1:S) {

int g;

g <- group_membership[subj];

subject_reinf_rate[subj] ~ normal(group_mean_reinf_rate[g], group_sd_reinf_rate);

// ...

}

// Iterate through trials

s <- 0; // invalid value

for (t in 1:T) {

// Starting a new subject? Read subject parameters and reset discrimination.

if (subject[t] != s) {

s <- subject[t];

reinf_rate <- subject_reinf_rate[s];

// ...

}

// ...

// Calculate p

overall_value <- stim_value; // + ...

// ... calculate softmax

s_exp_products <- exp( (overall_value * softmax_b) - mean(overall_value) );

p_best[t] <- s_exp_products[2] / sum(s_exp_products);

// Update

stim_value[chosen_index[t]] <- ((1 - reinf_rate) * stim_value[chosen_index[t]]) + (reinf_rate * reward[t]);

}

// Optimize fit

chose_best ~ bernoulli(p_best);

}

generated quantities {

// Group mean differences (in all cases: group 1 - group 2 = sham - lesion)

real gmd_reinf_rate;

gmd_reinf_rate <- group_mean_reinf_rate[1] - group_mean_reinf_rate[2];

}

Bob Carpenter

unread,

Mar 9, 2013, 10:22:21 PM3/9/13

to stan-...@googlegroups.com

Answers inline.

On 3/9/13 5:11 PM, Rudolf Cardinal wrote:
> Dear all,
>
> I've just come across Stan and am very impressed; thanks to all those who have developed it!
>
> I'm trying to use it for reinforcement learning models in which some information is stored locally (in the "model"
> block) and used to calculate things on a trial-by-trial, cumulative basis. Stan seems potentially ideal for this, given
> that you can do imperative coding within the "model" block.
>

> The Stan manual (v1.2.0) says (under "Variable scope") that "*The variables declared in each block have scope over all
> subsequent statements.* Thus a variable declared in the transformed data block may be used in the model block. But a

> variable declared in the generated quantities block may not be used in any earlier block, including the model block."
>
> [There's one qualification I've noticed: variables in an earlier block are read-only for later blocks (with the
> exception of parameter sampling statements, which I'm aware are not the same sort of thing as conventional assignments).]

Right -- variables can only be assigned to (show up on the
left side of an "<-") in the block in which they are declared.

> However, *variables declared in the "model" block (even at its top level) are not in scope in the "generated quantities"
> block.* Accordingly, if I want to retrieve values of my calculated probabilities of choosing the better stimulus on each

> trial ("p_best" in the skeleton code below), I can't retrieve it from the "model" block, and if I declare it in an
> earlier block, I can't write to it from the "model" block -- so my only option is to recalculate it within the
> "generated quantities" block -- which is somewhat inefficient since I had to calculate it anyway as part of the model.
> (Not as inefficient as it could be, as I understand that the "generated quantities" block is executed less often than
> the "model" block -- but it's still noticeable.)

I'll try to clarify in the manual.

The variables in the model block are all local variables, which only persist
in the block in which they are defined.

To define a variable you need in the model block, use the
transformed parameter block.

> *Could "model" variables be put in read-only scope for the "generated quantities" block* (clearly, with the caveat that

> only variables declared in the "generated quantities" should be reported back to e.g. RStan or the final output, or

> there'd be too much junk coming back in many cases -- but a simple copy would suffice, e.g.: *generated quantities {
> real my_copy; my_copy <- my_model_var; }*)?

The could be, but they're not going to be :-)

> Or is there some other obvious way of retrieving posterior distributions of things calculated within the model block
> that I've not seen?

Yes, put them in the transformed parameter block.

> Simplest code to reproduce is:
>
> model {
> real modelthing;
> modelthing <- 1.0;
> }
> generated quantities {
> real thing;
> thing <- modelthing;
> }
>
> ... generates error:
> thing <- modelthing;
> ^-- here
> DIAGNOSTIC(S) FROM PARSER:
> variable "modelthing" does not exist.
>
> More realistic skeleton code below.

In this case, you'd do this:

transformed parameters {

real modelthing;
modelthing <- 1.0;
}

Then modelthing is available in the model and will
also be printed.

If the variable being defined does not depend on any parameters (implicitly,
this includes other transformed parameters or local variables), as
in this example, then it's more efficient to define it in the transformed data block.

- Bob

Rudolf Cardinal

unread,

Mar 10, 2013, 5:21:46 PM3/10/13

to stan-...@googlegroups.com

Many thanks! With local blocks to be able to define integer locals within the "transformed parameters" block, that works perfectly. Thank you!

all the best,

Rudolf.

Reply all

Reply to author

Forward

0 new messages