How to constraint a variance in a probit with a non-centered parameterization?

Jeremy Koster

unread,

May 26, 2017, 7:27:07 AM5/26/17

to Stan users mailing list

Hi everyone,

I work with dyadic data, and Richard McElreath recently helped me set up an RStan version of the Social Relations Model for count data: http://www.sciencedirect.com/science/article/pii/S0378873314000148

I am now trying to specify a related probit model for binary data. One difference is that the variance of the dyadic effects should be constrained to 1 -- see the attached notation. In other words, the parameter "sigma_dyad_ID" should be 1, not freely estimated from the data.

I have tried a few experiments with the model code to impose this constraint, but the solution has eluded me. The model code is below. Where in the code should I impose this constraint?

Thanks,
Jeremy

model_code_0 <- "

data{

int N;

int N_i__j_ID;

int N_dyads;

int N_i__l_ID;

int y[N];

int i_ID[N];

int j_ID[N];

int ij_dyad_ID[N];

int i_in_dyad[N];

}

parameters{

real a;

// Node level effects cov matrix

matrix[2,N_i__j_ID] z_i__j_ID;

vector<lower=0>[2] sigma_i__j_ID;

cholesky_factor_corr[2] L_Rho_i__j_ID;

// DYAD level covariance matrix

matrix [2,N_dyads] z_dyad_ID;

real<lower=0> sigma_dyad_ID;

cholesky_factor_corr[2] L_Rho_dyad_ID;

}

transformed parameters{

// Define all matrices

matrix [N_i__j_ID,2] v_i__j_ID;

matrix [N_dyads,2] v_dyad_ID;

{

vector[2] sigma_temp;

v_i__j_ID = (diag_pre_multiply ( sigma_i__j_ID, L_Rho_i__j_ID ) * z_i__j_ID)';

sigma_temp[1] = sigma_dyad_ID;

sigma_temp[2] = sigma_dyad_ID;

v_dyad_ID = (diag_pre_multiply ( sigma_temp, L_Rho_dyad_ID) * z_dyad_ID)';

}

model{

vector[N] p;

//priors

a ~ normal( 0 , 5 );

to_vector (z_i__j_ID) ~ normal (0,1);

to_vector (z_dyad_ID) ~ normal (0,1);

sigma_i__j_ID ~ exponential (1);

sigma_dyad_ID ~ exponential (1);

L_Rho_i__j_ID ~ lkj_corr_cholesky(4);

L_Rho_dyad_ID ~ lkj_corr_cholesky(4);

// LIKELIHOOD

for ( i in 1:N ) {

p[i] =

a

+ v_i__j_ID [i_ID[i],1]

+ v_i__j_ID [j_ID[i],2]

+ v_dyad_ID [ ij_dyad_ID[i], i_in_dyad[i] ]

;

}

// update target

y ~ bernoulli (Phi(p));

}

generated quantities{

matrix[2,2] Rho_i__j_ID;

matrix[2,2] Rho_dyad_ID;

Rho_i__j_ID = L_Rho_i__j_ID * L_Rho_i__j_ID';

Rho_dyad_ID = L_Rho_dyad_ID * L_Rho_dyad_ID';

}

"

notation.pdf

Jeremy Koster

unread,

May 26, 2017, 10:24:02 AM5/26/17

to Stan users mailing list

Actually, I'm catching myself and realizing that the conversion to a probit model might not be as straightforward as I first thought. The Social Relations Model is usually set up as a bivariate response model and then the dyadic correlation (Rho_dyad_ID) is a correlation of the residuals from the two equations. With the Poisson version, we use random effects to model the dyadic correlation, but that doesn't translate easily to the probit model.

Bob Carpenter

unread,

May 26, 2017, 11:08:03 AM5/26/17

to stan-...@googlegroups.com

Just declare it as transformed data

transformed data {
int<lower = 0> sigma_dyad_ID = 1;
}

Then it's a constant rather than estimated. You don't impose it as
a constraint.

But you may be asking how you constrain a vector to have unit variance.
That we can't do easily (you'd need to define your own constraint

- Bob

> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
> <notation.pdf>

Jeremy Koster

unread,

May 26, 2017, 11:26:47 AM5/26/17

to Stan users mailing list

That seems straightforward enough, Bob, but unfortunately the parser issues the following message:

parameters or transformed parameters cannot be integer or integer array; found declared type int, parameter name=sigma_dyad_ID

Meanwhile, I'm realizing further that switching from Poisson to probit necessitates additional changes to the model structure. In a few minutes, I'll post a BUGS version in case there is someone on the board who can help to translate from one to the other.

Thanks!

Ben B

unread,

May 26, 2017, 12:30:57 PM5/26/17

to Stan users mailing list

@ Jeremy, the transformed thing is is you really want the variable sigma_dyad_ID to be the number 1

If it's the other thing you're asking about, I'm curious about what the options are too. This came up in this discourse thread as well: http://discourse.mc-stan.org/t/dealing-with-uncertain-parameters-that-arent-purely-fitting-parameters/514/45?u=bbbales2

If we had the model:

data {

int N;

vector[N] y;

}

parameters {

real a;

real b;

}

model {

a ~ normal(0, 1);

y ~ normal(a, b);

}

And somehow only wanted to infer 'b' but not 'a' (we want to guarantee the posterior of 'a' is normal(0, 1) distributed), then if I was doing Gibbs sampling stuff this'd be straightforward I think. My updates would be:

p(a | b, y) ~ normal(0, 1)

p(b | a, y) ~ whatever_sampling_mechanism_I_choose

I think it makes sense that this is hard to do in Stan, because it violates the idea that we're writing down our likelihoods and letting Stan do the rest.

But I was trying to think about what I'd need to add to the target log probability to get this sampling effect, and I came up blank.

So for the model above:

target = normal_lpdf(y | a, b) + normal_lpdf(a | 0, 1)

It seems like if I wanted the sampler to somehow ignore y ~ normal(a, b) when inferring 'a', then I'd want the partial derivative of that expression with respect to 'a' to be fixed at zero, which seems really impolite.

I'm not sure if there's anything I can add there to make this change? I think you'd need a way to trick the autodiff and making the autodiff easily trickable probably wasn't a design goal...

Also, is there a simple example of why this is a bad idea? I'm guessing it is pathological and non-generative and all those not so good things, but I'm not sure why. I keep wanting to do this.

Ben

Ben B

unread,

May 26, 2017, 12:33:45 PM5/26/17

to stan-...@googlegroups.com

@ Jeremey, Eeek, I left out the detail that 'transformed data' and 'transformed parameters' are different things. Transformed data is for things you want to be constants.

Ben

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.

To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.

Jeremy Koster

unread,

May 26, 2017, 2:18:35 PM5/26/17

to Stan users mailing list

Thanks, Ben. Unfortunately, I'm not attempting anything so interesting.

Instead, I'm trying to fit a Social Relations Model (SRM) for binary data using a probit link. The SRM is useful for analyzing social network data, where individuals rate all other group members. For example, suppose that we asked everyone on the development team, "Who else on the team do you enjoy hanging out with?" Whereas the SRM is used primarily for continuous or interval data (such as the scaled data common to social psychology), we are trying to adapt it to binary yes/no responses.

The response is whether individual i gave a positive rating to individual j. The SRM decomposes the response variable into three parts -- the tendency of individual i to give affirmative ratings above or below the baseline, the tendency of j to receive affirmative ratings, and a residual relationship-specific i->j residual. We estimate the covariance of person i's giver and receiver effect, and then we calculate the correlation between the i->j and the j->i residuals. Setting up the model as a bivariate equation model facilitates the calculation of this latter correlation.

We developed a similar model for count data and I thought that I could use a clever indexing trick to get the analogous residuals from a probit model. However, whereas this is relatively straightforward with our Poisson model because we use case-level random effects (analogous to effects for overdispersion), what we are after with the probit model is the correlated residuals, not correlated random effects. Retrieving and modeling those quantities is one aspect of this model that is currently confounding.

A colleague of mine has coded up a similar model in Stat-JR, which borrows from the WinBUGS language. I paste the latter code below, as this is the model that I am attempting to replicate in Rstan. Advice on how to estimate correlated residuals from probit models would be much appreciated.

Stat-JR model

model {

for (d in 1:length(y_1)/2) {

dummy[d] ~ dnormal2a(misby_1[d], misby_2[d], mu1[d], mu2[d], diag, offdiag, diag)

mu1[d] <- cons_1[d] * beta_0 + a1[i_ID_1[d]] + b1[j_ID_1[d]]

mu2[d] <- cons_2[d] * beta_0 + a1[i_ID_2[d]] + b1[j_ID_2[d]]

misby_1[d] ~ dflat()

misby_2[d] ~ dflat()

}

for (i in 1:length(a1)) {

dummy_u[i] ~ dnormal2a(a1[i], b1[i], 0, 0, prec_a1b1[0], prec_a1b1[1], prec_a1b1[2])

dummy2_u[i] ~ ddummy(dummy_u[i])

a1[i] ~ dflat()

b1[i] ~ dflat()

}

# Priors

beta_0 ~ dflat()

rhoe1e1 ~ dunif(-1, 1)

div <- 1 - rhoe1e1*rhoe1e1

diag <- 1/div

offdiag <- -rhoe1e1/div

}

To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Bob Carpenter

unread,

May 26, 2017, 4:01:01 PM5/26/17

to stan-...@googlegroups.com

I have no idea what ddummy or dnormal2a are, but
if you can code up their log density functions (up to an
additive constant), then this should be easy to translate line
for line into Stan. We'd just need to know what the shapes of
the variables are and which ones are data and which ones are
parameters.

- Bob

P.S. I hadn't heard of Stat-JR, but see that it is "free to
UK academics, and otherwise available for purchase." First
instance of Brexitware I've seen.

Ben B

unread,

May 27, 2017, 2:55:42 PM5/27/17

to stan-...@googlegroups.com

Lemme just @ myself here, but I dug around and found answers questions (if anyone stumbles across this in a Googling):

1. I think the thing I was talking about was BUGS cut http://www.openbugs.net/Manuals/Tricks.html#UseOfTheCutFunction

2. The way I talked about doing it in HMC makes no sense. Messing with the partials means not conserving the Hamiltonian means it's not doing HMC anymore (coded this up and tested it before I realized... hooray for leaping before thinking).

3. A couple links to stuff relevant for this http://andrewgelman.com/2016/02/20/dont-get-me-started-on-cut/ and https://groups.google.com/forum/#!topic/stan-users/LIPi6yKTykc (this has a link to a fun presentation on this)

And now I'll just sneak back out of this thread!

Ben

> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.

> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.

To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.

Reply all

Reply to author

Forward