Multilevel serial dilution assay


Jonathan Audet

Feb 10, 2016, 5:45:35 PM
to Stan users mailing list
I think someone inquired about the code for the serial dilution assay in the paper here. I have also been trying to implement it. I seem to have been able to code the model overall, but some issues remain. I did modify it a bit: I included 5 parameters instead of four (but have faced the same issues with a 4-parameter model) and I adjusted the priors for my assay. Also, the model does not take days into account (the paper had 3 levels: individual plate -> plates on same day -> days).

Two issues are beyond my math skills:
First, the model always runs into large numbers of divergent transitions even with adapt_delta = 0.95.
Second, (probably related) some of the R_hat's are terrible, especially for the correlation matrix Omega.

Third, should I be worried about all the warnings regarding positive-definite matrices and other related issues?

I also don't know if I am missing opportunities to vectorize (or otherwise speed up) the code...

I included 2 models: an initial one (which has the above issues) which simply calculates the curves, and a second one (ending in "unkn") which is meant to include multilevel interpolation of samples based on the curve. I was very interested in using the multilevel prediction across dilutions of unknown samples, but the paper is vague on it. As I understand Stan, making the prediction multilevel requires putting it in the "model" block, since it is a sampling statement (rather than just in "generated quantities" for normal interpolation, which has worked fine in other models without any multilevel aspects). However, including unknowns completely messes up the curve parameters for that plate; I have included unknown data for one of the plates. The inclusion of unknowns also slows sampling substantially: the 3000 samples in the script generally take far longer than the 6000 for the model without unknowns.

I tried to comment the models as much as I could. I also included data and unknown data along with a script.

I am currently using Microsoft R Open 3.2.3 on Windows 7 (and 10) x64 along with the MKL libraries it provides. The script calls on the Checkpoint package to ensure that all package versions are stable.

logistic_OD_multi_Gelman_5p.stan.txt
logistic_OD_multi_Gelman_5p_unkn.stan.txt
RunModel.R.txt
demoLog5p.csv
UnknData.csv

Andrew Gelman

Feb 13, 2016, 1:30:56 AM
to stan-...@googlegroups.com
Hi…it’s been a few years since I worked on this. . . . I actually fit the model in BUGS, believe it or not!  One thing I remember for real data was that if you have some unknowns with zero concentrations, the model can have problems because the lognormal model on the unknown concentrations won’t make sense.  In that case we needed to go with a mixture model.  Also, the model I fit did not have any correlation matrix.  I think that your problems are coming from letting all 5 betas vary like this. I’d suggest just keeping the betas constant, or maybe just varying one of them, not all 5.
A
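A minimal sketch of that kind of mixture, in Python rather than Stan (pi_zero and the lognormal parameters here are made up for illustration): a zero observation contributes only the point-mass probability, a positive one contributes the lognormal density.

```python
import math

def zi_lognormal_loglik(y, pi_zero, mu, sigma):
    """Log-likelihood of one concentration under a zero-inflated lognormal:
    with probability pi_zero the true concentration is exactly zero,
    otherwise it is lognormal(mu, sigma)."""
    if y == 0.0:
        return math.log(pi_zero)
    # lognormal log density: -(log y - mu)^2 / (2 sigma^2) - log(y sigma sqrt(2 pi))
    lp = -((math.log(y) - mu) ** 2) / (2.0 * sigma ** 2)
    lp -= math.log(y * sigma * math.sqrt(2.0 * math.pi))
    return math.log(1.0 - pi_zero) + lp

# a zero observation contributes only the point mass
assert abs(zi_lognormal_loglik(0.0, 0.2, 0.0, 1.0) - math.log(0.2)) < 1e-12
```

In Stan the positive branch would be a log_mix-style statement; the point is just that zeros need their own component instead of going through the lognormal.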

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Krzysztof Sakrejda

Feb 13, 2016, 9:40:31 AM
to Stan users mailing list
I looked through the model and here are the issues I noticed:

- Use lkj_corr_cholesky instead of lkj_corr (section 51.2 in the 2.9 manual).
- It would help to figure out the appropriate scale for tau, and perhaps at least while you are figuring out the model make it half-normal.  
- This:

//Use the current parameters to calculate the interpolated concentration of each unknown value.
//The OD (Y value) must be higher than the bottom asymptote (beta[1]) and lower than the top asymptote (beta[1] + beta[2]) or RStan throws an error at the end (for RStan > 2.6).
//So any value outside the asymptotes gets the "marker value" of -50. Additional R-side commands deal with this afterwards.
for(i in 1:N_unkn){
  vector[5] beta;
  beta <- exp(logBeta[pID_unkn[i]]);
  if(Unknown[i] > beta[1] && Unknown[i] < (beta[1] + beta[2])){
    indiv_theta[i] <- log10((beta[3] / pow(pow((beta[1] - (beta[1] + beta[2])) / (beta[1] - Unknown[i]), 1 / beta[5]) - 1, -1 / beta[4])) * Dilution[i]);
  } else {
    indiv_theta[i] <- -50;
  }
}

This should be (?) generating a warning about using a non-linear transform on the LHS of a sampling statement, ultimately associated with the line:

   indiv_theta[i] ~ normal(theta[uID[i]], sigma_unkn);

So that's not quite the model you think it is.  I don't quite see how this version matches up with the article, so I won't try to make more detailed comments on it, but it looks like you could simplify this a fair bit---I don't think the '-50' flag value should be necessary in this model.
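The underlying issue is the change-of-variables rule: put a nonlinear function of parameters on the left of a sampling statement without a Jacobian adjustment and the implied density is wrong. A quick Python check, using exp() as a stand-in transform (nothing here comes from the attached models): only the Jacobian-corrected density integrates to 1.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def y_density_wrong(y):
    # substitute x = log(y) into the normal density and forget the Jacobian
    return normal_pdf(math.log(y))

def y_density_right(y):
    # change of variables: x = log(y), so |dx/dy| = 1/y
    return normal_pdf(math.log(y)) / y

def integrate(f, a, b, n=100000):
    # midpoint rule; good enough for a sanity check
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

total_right = integrate(y_density_right, 1e-9, 60.0)  # ~= 1: a proper density
total_wrong = integrate(y_density_wrong, 1e-9, 60.0)  # ~= 1.65: not a density
```

The same logic applies to indiv_theta: it is a deterministic nonlinear function of the betas, so sampling it directly distorts the posterior unless the log Jacobian is added to the target.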

Krzysztof

Krzysztof Sakrejda

Feb 13, 2016, 10:54:43 AM
to Stan users mailing list
Come to think of it, this '-50' flag is arbitrarily throwing out data based on the state of the sampler; that definitely will not work! There is a right way to do this in Stan, this just isn't it.

Krzysztof

Bob Carpenter

Feb 13, 2016, 12:42:12 PM
to stan-...@googlegroups.com
What's the "this"? I didn't understand what that if statement's doing.

All that pow(pow()) stuff can get unstable. You might be better
off distributing the logs through and reducing.

- Bob

Jonathan Audet

Feb 13, 2016, 1:31:35 PM
to Stan users mailing list, gel...@stat.columbia.edu
OK, I now realize I may have misunderstood section 2.6 of the paper. That multivariate normal is for each parameter across plates, not for each plate across parameters, i.e. the matrix sigma describes correlation of a single parameter across plates, correct?



Jonathan Audet

Feb 13, 2016, 2:04:08 PM
to Stan users mailing list
I will try out your suggestions today.

Regarding the unknown calculations, I have to admit I don't quite follow the reasoning of section 3.2 of the paper. In the context of least-squares fit, I would just isolate x instead of y and work from there. This is what I did here. But that method is only valid if "y" is between the bottom and top, each of which moves with the state of the sampler.

Krzysztof Sakrejda

Feb 13, 2016, 4:28:59 PM
to Stan users mailing list
On Saturday, February 13, 2016 at 2:04:08 PM UTC-5, Jonathan Audet wrote:
I will try out your suggestions today.

Regarding the unknown calculations, I have to admit I don't quite follow the reasoning of section 3.2 of the paper. In the context of least-squares fit, I would just isolate x instead of y and work from there. This is what I did here. But that method is only valid if "y" is between the bottom and top, each of which moves with the state of the sampler.


You need to place constraints on the parameters in their declarations to deal with this; that's the <lower=min(OD), upper=...> syntax. Since you have three parameters and two constraints it's not entirely straightforward, but it should be doable.

When I pull out x I get:

log(x) = log(beta_3) + 1/beta_4 * (log(y - beta_1) - log(beta_1 + beta_2 - y))

Based on your stated constraints: beta_1 < OD < beta_1 + beta_2, I think you need to change to something similar to:

real<upper = min(OD)> beta_1;
real<lower = max(OD) - beta_1> beta_2;

Not checked for algebra/syntax but you get the idea.  It would be cool to see this model working well---I RA'd in an analytical chemistry lab for much of my undergrad and I kept thinking model-based calibration would be a great idea.  We did a lot of other quantitative stuff but never this.  Feel free to post more questions if there are other issues with it!
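As a numeric sanity check of that inverse (a Python sketch of the 4-parameter case; the beta values are arbitrary, not from the attached data): plugging a concentration through the curve and back through the inverse should recover it.

```python
import math

def curve_4pl(x, b1, b2, b3, b4):
    """4-parameter logistic: bottom b1, span b2, inflection b3, slope b4."""
    return b1 + b2 / (1.0 + (b3 / x) ** b4)

def inverse_4pl(y, b1, b2, b3, b4):
    """log(x) = log(b3) + (1/b4) * (log(y - b1) - log(b1 + b2 - y))"""
    log_x = math.log(b3) + (1.0 / b4) * (math.log(y - b1) - math.log(b1 + b2 - y))
    return math.exp(log_x)

b1, b2, b3, b4 = 0.05, 2.0, 10.0, 1.3
for x in (0.5, 5.0, 50.0):
    y = curve_4pl(x, b1, b2, b3, b4)
    # round trip recovers the original concentration
    assert abs(inverse_4pl(y, b1, b2, b3, b4) - x) < 1e-6 * x
```

Note the inverse blows up exactly when y <= b1 or y >= b1 + b2, which is where those declaration constraints come from.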

Krzysztof

Andrew Gelman

Feb 13, 2016, 4:30:03 PM
to Jonathan Audet, Stan users mailing list
Hi, actually I ended up just fitting models to one plate at a time.
A

Krzysztof Sakrejda

Feb 13, 2016, 4:32:16 PM
to Stan users mailing list
Also real<lower=0> beta_4

Jonathan Audet

Feb 13, 2016, 5:45:37 PM
to Krzysztof Sakrejda, Stan users mailing list

The model without the unknowns, while still having sampling issues, actually retrodicts the curves quite well already. I am including a corrected script; I noticed some missing functions in the original one. Also a version of the curve-only model with the Cholesky decomposition. Regarding tau, I did make it half-normal. Its values in the posterior are all 0.05-0.5, so right now I have a normal(0.25, 0.1) prior on it. Would it be better to model log_tau instead?

 

 

Sent from Mail for Windows 10

RunModel.R.txt
logistic_OD_multi_Gelman_5p.stan.txt

Krzysztof Sakrejda

Feb 13, 2016, 7:38:51 PM
to Stan users mailing list, krzysztof...@gmail.com
On Saturday, February 13, 2016 at 5:45:37 PM UTC-5, Jonathan Audet wrote:

The model without the unknown, while still having sampling issues, actually retrodicts the curves quite well already. 

I am including a corrected script, I noticed some missing functions in the original one. Also, a version of the curve-only model with the cholesky decomposition.


That's great! Thanks for sharing the improved code.
 

Regarding tau, I did make it half-normal, its values in the posterior are all 0.05-0.5 so right now I think I put a normal(0.25, 0.1) prior on it. Would it be better to model log_tau instead?


I wouldn't worry about modeling log_tau instead.  When you declare it as:

 vector<lower = 0>[5] tau;

Stan (hoping I get this right) internally uses log(tau) as a parameter (on the real line). 

If the posterior really looks a lot like the prior, I would wonder whether there's much information about it in the data, or whether it's not well identified. Since you can fit these models plate-by-plate, there ought to be information. Making the prior tighter is not going to solve this issue, so I would avoid normal(0.25, 0.1) and try to understand the model better. If you can currently fit the model (even without unknowns), I would run that version, look at the pairs plots of other parameters with tau, and see if some are highly or non-linearly correlated with it.

Also looks like the OD line (OD ~ normal(cOD, sigma_indiv);) and the parent cOD calculation make a lot more sense now with the paper.

Krzysztof  
 

Bob Carpenter

Feb 14, 2016, 1:36:13 PM
to stan-...@googlegroups.com, krzysztof...@gmail.com

> On Saturday, February 13, 2016 at 5:45:37 PM UTC-5, Jonathan Audet wrote:
> ...
> Regarding tau, I did make it half-normal, its values in the posterior are all 0.05-0.5 so right now I think I put a normal(0.25, 0.1) prior on it. Would it be better to model log_tau instead?

> On Feb 13, 2016, at 7:38 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
> I wouldn't worry about modeling log_tau instead. When you declare it as:
>
> vector<lower = 0>[5] tau;
>
> Stan (hoping I get this right) internally uses log(tau) as a parameter (on the real line).

Right. Each element of tau gets log transformed to be unconstrained
and Stan operates on the unconstrained space. The log transform is only
used for user-specified inits; otherwise, Stan operates on the unconstrained
space directly and uses exp() to map back to the positive-only values (with
the Jacobian for sampling and VB but not optimization).
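In other words (a Python sketch of the idea, not Stan's actual implementation): the sampler works on an unconstrained u, maps it to tau = exp(u), and adds the log Jacobian, which for this transform is just u.

```python
import math

# Stan samples an unconstrained u and maps it to tau = exp(u) > 0.
# The log density on u is log p(tau) + log |d tau / d u|, and since
# d tau / d u = exp(u), the log Jacobian term is simply u.
def lp_unconstrained(u, log_p_tau):
    tau = math.exp(u)
    return log_p_tau(tau) + u

# e.g. a half-normal(0, 1) prior on tau, up to its normalizing constant:
half_normal = lambda tau: -0.5 * tau * tau

# at u = 0 (tau = 1): -0.5 from the prior, +0 from the Jacobian
assert abs(lp_unconstrained(0.0, half_normal) - (-0.5)) < 1e-12
```

So declaring `vector<lower = 0>[5] tau;` already gives you the log-scale geometry; there's no need to hand-roll a log_tau parameter.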

There's a chapter near the end of the manual that lists all the transforms.

- Bob

Jonathan Audet

Feb 15, 2016, 5:23:20 PM
to Stan users mailing list, krzysztof...@gmail.com
OK, I did what Andrew did and made an estimator for each plate. I did try to make some parameters multilevel, but that makes for atrocious sampling and does not help with prediction. I think I figured out how unknown estimation is done in the paper; I was going to write "see above" but it appears my email failed me (which explains why no one reacted to it...). The advantage is that there is no need to deal with the beta1 < OD < beta1 + beta2 constraint. I added some generated quantities to keep track of the variation in coefficients between the plates, and named the coefficients for their function.

I only have one question left:

Is there some vectorisation or optimization that I am missing, or something that can be reparameterized better? Every now and then (on the exact same data) I get a complete chain of divergent transitions (while the other 3 chains are quite fine), and the model takes 25 minutes (on my computer) to generate 4000 draws (including 1500 warmup). Using shinystan, I could not relate the divergent transitions to any specific parameter. The chains also look a bit correlated.
logistic_OD_5p_multiUnkn.stan.txt
RunModel.R.txt

Jonathan Audet

Feb 16, 2016, 6:52:42 PM
to Stan users mailing list, krzysztof...@gmail.com
I think I am starting to figure out what the main issue is with this model. Normally, the curve parameters are dependent on the standard, i.e. P(Bottom, Span, Inflec, Asym, Slope | x_std, OD_std), and the unknown concentrations are dependent on the curve parameters, i.e. P(theta | Bottom, Span, Inflec, Asym, Slope, OD_unkn). In the current model, since everything is calculated at once, we have P(Bottom, Span, Inflec, Asym, Slope, theta | x_std, OD_std, OD_unkn), which implies that the curve parameters are also optimized as a function of the unknowns; this makes no logical or physical sense. Also, since there are 72 wells of unknowns (18 samples, 2 dilutions each, in duplicate) and 24 wells of standard per plate, the unknowns overwhelm the standards.

As I said earlier, one solution is to use inverse calculation in the generated quantities block. However, that has two problems: 1) Artificial limits have to be placed on some parameters (so that Bottom < OD < Bottom + Span), even though OD readings above the top of the curve are possible due to measurement and enzymatic noise (this is an ELISA assay); 2) In biological assays, in my experience, interpolated values do not quite scale across dilutions, i.e. simply multiplying to reverse the dilution can yield surprisingly different values. That is the strength (in theory) of the approach used in the paper: it can weight different dilutions depending on the precision they yield.

Is there a way to make the curve parameters independent of the unknowns' OD while keeping the multilevel unknown estimation? That issue did not seem to be a problem in the paper.

Bob Carpenter

Feb 16, 2016, 10:30:29 PM
to stan-...@googlegroups.com

> On Feb 15, 2016, at 5:23 PM, Jonathan Audet <jona...@hotmail.com> wrote:
>
> [...]
>
> I only have one question left:
>
> Is there some vectorisation or optimization that I am missing, or something that can be reparameterized better?

Yes to efficiency, but it'll be some work because we don't have ^ vectorized, so you have to work on pieces with temp variables. The goal is to reduce the number of repeated operations and do vector operations all at once.

for (i in 1:N) {
  cOD[i] <- Bottoms[pID[i]]
            + Spans[pID[i]]
              / (1 + (Inflecs[pID[i]] / x_i[i]) ^ Slopes[pID[i]]) ^ Asyms[pID[i]];
  sigma_indiv[i] <- ((cOD[i] / 2) ^ (2 * alpha)) * sigma;
}

you can vectorize and remove redundant operations using:

{
  real two_alpha;
  two_alpha <- 2 * alpha;
  sigma_indiv <- 0.5 * cOD;
  for (n in 1:N)
    sigma_indiv[n] <- (sigma_indiv[n] ^ two_alpha) * sigma;
}

Multiplication is faster than division, so 0.5 * x is better than x / 2.
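A quick numpy check that hoisting the repeated subexpressions computes the same thing as the original loop (the values here are simulated stand-ins, not from the attached data):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
cOD = rng.uniform(0.1, 3.0, size=N)  # stand-in for the computed curve values
alpha, sigma = 0.8, 0.3

# original loop: sigma_indiv[i] <- ((cOD[i] / 2) ^ (2 * alpha)) * sigma
loop = np.array([((cOD[i] / 2.0) ** (2.0 * alpha)) * sigma for i in range(N)])

# hoisted / vectorized form: precompute 2 * alpha, use 0.5 * cOD
two_alpha = 2.0 * alpha
vec = (0.5 * cOD) ** two_alpha * sigma

assert np.allclose(loop, vec)
```

The algebra is identical either way; the win in Stan comes from doing fewer repeated operations in the autodiff graph.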

As to parameterization and arithmetic, you want to stay away from exp()
if you can, and work on the log scale until it's absolutely necessary (hopefully
never) to go back to the linear scale. So you may want to work
in terms of log_sigma_indiv, for example, which you can define in a single line:

log_sigma_indiv <- log(sigma) + (2 * alpha) * (log_half + log(cOD));

Then if you really absolutely need to work on the exponentiated scale, use

sigma_indiv <- exp(log_sigma_indiv);

You may still overflow, but it's the best you can do for stability.

Also, you'd have to define log_half as log(0.5) in the transformed data block.
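The payoff of the log-scale version is easy to see numerically; a Python sketch (parameter values made up): both forms agree where the direct power is representable, and the log-space intermediate stays finite even where the direct power would overflow a double.

```python
import math

def sigma_indiv_direct(cOD, alpha, sigma):
    # direct form: ((cOD / 2) ^ (2 * alpha)) * sigma
    return ((cOD / 2.0) ** (2.0 * alpha)) * sigma

def sigma_indiv_logspace(cOD, alpha, sigma):
    # same quantity assembled on the log scale, then exponentiated once
    log_half = math.log(0.5)  # Stan: define in transformed data
    return math.exp(math.log(sigma) + 2.0 * alpha * (log_half + math.log(cOD)))

# the two agree in the normal range
assert abs(sigma_indiv_direct(1.7, 0.9, 0.2)
           - sigma_indiv_logspace(1.7, 0.9, 0.2)) < 1e-12

# and the log-space intermediate stays finite even where the direct
# power would overflow (e.g. an absurdly large exponent)
log_s = math.log(0.2) + 2.0 * 400.0 * (math.log(0.5) + math.log(50.0))
assert math.isfinite(log_s)
```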

Are OD and cOD constrained to be positive? If so, normal may not be
the most appropriate density. I see a lot of other lognormals there
(e.g., log_Inflecs ~ normal(...)).

I don't know what scale all the variables are on, but it helps immensely
to get everything as close to unit-scale as possible.

Rather than calculating means and std devs for variables like Bottoms,
it can make sense to create a hierarchical model --- it sometimes fits
much better if your hard-wired prior isn't great. So you should check
that your posterior for Bottoms really is normal(0.05, 0.1).

It can sometimes help to compress repeated calculations. For instance,
if there aren't too many pID[i] values, you could precalculate most
of the cOD term.

Also, hard constraints like upper=4 can be problematic if there's
not some physical constraint --- so you should check that the posterior
estimates for Spans don't get near 4 (if they do, you truncate
the part above 4 and force the posterior means to be lower compared
to if the variable weren't truncated).

There are all kinds of little things you can do here, too:

x_init <- exp(log_x_init);
...
for (i in 1:N_unkn) {
  real x_i_unkn;
  x_i_unkn <- ser_dilutions[i] * x_init[uID[i]];

If you take log_ser_dilutions as a transformed data variable, then you
can define

x_i_unkn <- exp(log_x_init[uID[i]] + log_ser_dilutions[i]);

and you've replaced a multiply with an addition, which is faster
during evaluation and during differentiation.

You don't need to save all the variables like x_init, either. It
can save time during sampling to calculate those after the fact,
but it can be error prone to do that.

- Bob

Krzysztof Sakrejda

Feb 17, 2016, 9:48:42 AM
to Stan users mailing list, krzysztof...@gmail.com


On Tuesday, February 16, 2016 at 6:52:42 PM UTC-5, Jonathan Audet wrote:
I think I am starting to figure out what the main issue is with this model. Normally, the curve parameters are dependent on the standard, i.e. P(Bottom, Span, Inflec, Asym, Slope | x_std, OD_std), and the unknown concentrations are dependent on the curve parameters, i.e. P(theta | Bottom, Span, Inflec, Asym, Slope, OD_unkn). In the current model, since everything is calculated at once we have P(Bottom, Span, Inflec, Asym, Slope, theta | x_std, OD_std, OD_unkn) which implies that the curve parameters are also optimized as a function of the unknowns, this makes no logical or physical sense.

I disagree that it makes no logical or physical sense.  The "unknowns" are run repeatedly, so they have a known dilution and a known reading.  The fact that the dilution level is known _should_ contribute, because the reading should go down with the dilution in a sensible way, following the estimated curve.  I get that in practice this model is not working out as written, but I wouldn't blame it on logical/physical inconsistency.  In BUGS people avoid this kind of problem using a function called (I think...) "cut", but Stan does not do that.  It's possible that dealing with this issue takes this project from "applying a model described in a paper" to "extending the paper", and if you don't want to do that, using generated quantities is the right way to go.
 
Also, since there are 72 wells of unknowns (18 samples, 2 dilutions each, in duplicate) and 24 wells of standard per plate, the unknown overwhelm the standards.

Yes, I think the most likely issue is that a level of noise is not being taken into account, e.g. the variation inherent in the enzyme that leads wells to develop at different speeds.
 
As I said earlier, one solution is to use inverse calculation in the generated quantities block. However that has two problems: 1) Artificial limits have to be placed on some parameters (so that Bottom < OD < Bottom + Span), even though OD readings above the top of the curve are possible due to measurement and enzymatic noise (this is an ELISA assay);

These limits are not so much artificial as physical if you include the appropriate noise.  The true OD for a very low concentration sample can't be lower than the true instrument output with a negative control, but those things are all measured and recorded as averages.  The best data to go into a model like this would be a series of measurements (not averaged) or the sufficient statistics (mean measurement, measurement noise), and the equations for the curve would only be written for the "true" values, not the measured values.  I think this is all possible in Stan, just maybe more work than you want to put in at the moment.
 
2) In biological assays, in my experience, interpolated values do not quite scale across dilutions, i.e. simply multiplying to reverse the dilution can yield surprisingly different values, that is the strength (in theory) of the approach used in the paper, it can weigh different dilutions depending on the precision they yield.

That's the aspect of the model I find most interesting as well, but it might take more work on the model itself to get the benefits, especially since you mentioned divergent iterations in the previous post.  That's a good signal that part of the parameter space is difficult for the sampler to deal with.

Is there a way to make the curve parameters independent of the unknowns' OD while keeping the multilevel unknown estimation? That issue did not seem to be a problem in the paper.

This is what the "cut" function does in BUGS; there is no parallel in Stan.

Sorry I can't be more helpful, I don't have time to dig in at the moment. 

Krzysztof

Jonathan Audet

Mar 14, 2016, 4:56:45 PM
to Stan users mailing list, krzysztof...@gmail.com
Sorry for the long delay. Other important things came up. Also, I had time to think about the model and go over that paper again. When I finally tried to model it based on what is on page 411, it (mostly) worked.

I had not realized that the data I was using to test this version had one standard point far out of range, which threw off the curve (well it was making sense given that data, but not given the rest of the data, e.g. the highest standard concentration was estimated at 10^4 instead of ~20). Making the model multilevel has solved the issue (although obviously not for a single plate).

For example, the data (std in blue, unknowns in red, lines link serial dilutions of the same sample): (plate 12 was the one that threw me for a loop, everything but the top 2 dilutions is lower)


The regressions with 95% HDI on the calculated concentrations (std in red):



Overall, the model estimates the individual points (each replicate of each dilution of each sample) with, imho, remarkable accuracy. The output is consistent with the previous results and makes sense given the background.

I am still having issues with divergent transitions, hitting the max_treedepth, and major autocorrelation. From the little I understand about HMC, there are probably things in there that can be reparameterised to improve efficiency.

Newest model and code (with the same data I was using) are attached. This model took about 1.75 hours to sample on an i7-6700K overclocked at 4.6 GHz with 32 GB of RAM.
Data.csv
logistic_OD_5p_multiUnkn.stan.txt
RunModel.R.txt

Bob Carpenter

Mar 15, 2016, 7:09:48 PM
to stan-...@googlegroups.com, krzysztof...@gmail.com
There were some neat adjustments in Wing Wong's dChip software
(for genomic microarrays) for noisy chips (people putting fingers
on them, getting contaminated, etc.).

If you've increased adapt_delta and upped the max depth
and are still having issues, you'll need to reparameterize.
It's probably a good idea anyway if possible.

Looking at the model, you definitely want to make the hierarchical
parts of it non-centered (aka, use the "Matt trick"); there's
a writeup in the language manual.
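For reference, the non-centered ("Matt trick") reparameterization just shifts and scales a standard normal; a numpy sketch (mu and tau values are arbitrary) showing the two forms are the same distribution:

```python
import numpy as np

rng = np.random.default_rng(42)
mu, tau = 1.5, 0.4
n = 200000

# centered: theta ~ normal(mu, tau), sampled directly
centered = rng.normal(mu, tau, size=n)

# non-centered: theta_raw ~ normal(0, 1); theta <- mu + tau * theta_raw
theta_raw = rng.normal(0.0, 1.0, size=n)
non_centered = mu + tau * theta_raw

# same distribution; but HMC explores theta_raw, whose geometry does
# not depend on tau, which is what helps with divergences in hierarchies
assert abs(centered.mean() - non_centered.mean()) < 0.01
assert abs(centered.std() - non_centered.std()) < 0.01
```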

The other thing you can do is separate the uID[i] == std_loc
cases from the others so that you don't need a loop over N_unkn.

You'll save some time by vectorizing those normals in the loop over
N_unkn. For instance, if log_x is just the std_loc versions, then you
can convert this

log_x[i] ~ normal(log_ser_dilutions[i] + log(pred_std), sigma_x[uID[i]] * abs_log_ser_dilutions[i]);

to

log_x ~ normal(log_ser_dilutions + log(pred_std),
               sigma_x[uID] .* abs_log_ser_dilutions);

Or, you can just build one big vector foo that for each i contains
either log(pred_std) or log_theta[uID[i]] depending on the
std_loc condition. Then get rid of the original loop and use:

log_x ~ normal(log_ser_dilutions + foo,
               sigma_x[uID] .* abs_log_ser_dilutions);

It may seem like extra work to populate foo, but the vectorization of
normal() more than makes up for it.

- Bob

P.S. Overclocked at 4.6 GHz? Isn't that thing melting its case?



Jonathan Audet

Mar 15, 2016, 9:51:10 PM
to Bob Carpenter, stan-...@googlegroups.com, krzysztof...@gmail.com

Thanks for the tips!

I have used non-centered parameterization in other models, but did not realize it would help here.

 

I will try those changes out tomorrow.

 

Re 4.6 GHz: It does worry me whenever I look at the temperature but that is from Asus’ OC utility… and it cools down pretty quick.

 

Sent from Outlook Mail for Windows 10 phone

Jonathan Audet

Mar 30, 2016, 12:16:29 PM
to Bob Carpenter, stan-...@googlegroups.com, krzysztof...@gmail.com

I think I figured out where most of the issues were arising from. The 5-parameter form of the logistic curve adds an asymmetry factor to the equation. This asymmetry changes the location of the inflection point. The plot of the hyper-means of those two parameters (mu_log_Asym, mu_log_Inflec) looked much like the middle panel of figure 20.1 in the Stan manual (when you have 2 intercepts with proper priors). Removing the asymmetry reduced the divergent transitions to 46 (all in one chain, adapt_delta = 0.99) and the max_treedepth problems to 324 (shinystan says ~900; max_treedepth = 15) over 4 x 2000 post-warmup draws.

If these two parameters are so correlated, would it help if the correlation was modeled explicitly? If so, how does the multivariate reparameterization work with multilevel parameters? (Also, instead of doing cov_matrix[K] Sigma; L <- cholesky_decompose(Sigma); , is it possible to declare cholesky_factor_cov[K] L; directly?)
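For what it's worth, the multivariate non-centered parameterization that question is pointing at looks like this in outline (a numpy sketch; mu and Sigma are arbitrary stand-ins, not from the model):

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.5]])       # strongly correlated pair of hyper-means
L = np.linalg.cholesky(Sigma)        # Stan: L <- cholesky_decompose(Sigma)

n = 200000
z = rng.standard_normal((n, 2))      # "raw" parameters, iid standard normal
theta = mu + z @ L.T                 # theta ~ multi_normal(mu, Sigma)

assert np.allclose(theta.mean(axis=0), mu, atol=0.02)
assert np.allclose(np.cov(theta.T), Sigma, atol=0.02)
```

The sampler then works with z, which is uncorrelated by construction; the correlation lives entirely in the deterministic mu + L * z transform.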


Jonathan Audet

Mar 31, 2016, 4:50:12 PM
to Bob Carpenter, stan-...@googlegroups.com, krzysztof...@gmail.com
After trying some simplifications of the model (i.e. having only 1 error on the concentrations; going from "vector<lower = 0>[N_unkn] sigma_x" to "real<lower = 0> sigma_x") and looking at the joint distributions in shinystan, I think I can spot the issue. The model runs into problems when sigma_x is large and sigma_unkn (essentially sigma_y) is small.

[image: SigmaX_SigmaUnkn.png]

It looks even worse when the Log_Posterior is taken into account:

[image: 3D_SigmaX_SigmaUnkn.png]

I kind of get the inverse relationship, increasing the error in X allows an estimation of Y closer to the measured value. But is there a way to deal with this issue?

SigmaX_SigmaUnkn.png
3D_SigmaX_SigmaUnkn.png

Bob Carpenter

unread,
Mar 31, 2016, 5:15:20 PM3/31/16
to stan-...@googlegroups.com
It helps if you can keep variables scaled near 1 as much as
possible. Values close to zero can be problematic as you see.

I don't know a way to reduce the correlation in the posterior
other than reparameterizing.
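The funnel section of the manual has the template; roughly (this is the manual's toy example, not your model):

```stan
parameters {
  real y_raw;
  vector[9] x_raw;
}
transformed parameters {
  real y;
  vector[9] x;
  y <- 3.0 * y_raw;          // y ~ normal(0, 3) after the transform
  x <- exp(y / 2) * x_raw;   // x ~ normal(0, exp(y/2))
}
model {
  y_raw ~ normal(0, 1);
  x_raw ~ normal(0, 1);
}
```

The sampler then works on (y_raw, x_raw), which are standard normal and uncorrelated, instead of on the funnel itself.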

- Bob


Andrew Gelman

unread,
Mar 31, 2016, 9:48:43 PM3/31/16
to stan-...@googlegroups.com, Bob Carpenter, krzysztof...@gmail.com
I seem to recall that this model has about 1 more parameter than can typically be estimated from data.  So you could well have some very weak identification.

Jonathan Audet

unread,
Apr 15, 2016, 2:27:03 PM4/15/16
to Stan users mailing list
The latest version of the model uses vectorized likelihoods and the non-centered parameterization, which made it a bit faster; modelling the correlation between the inflection point and the asymmetry also appears to have reduced the number of divergent iterations. More troubleshooting revealed that part of the original data set I had posted introduced the weird correlation and curvature between sigma_x and sigma_unkn (Plate 16; I still can't figure out why, it looks exactly like the others). After reading other posts on the mailing list, I looked at the plots of mu_log_* vs sigma_log_* (e.g. mu_log_Bottom vs sigma_log_Bottom) and quickly noticed the funnel shape. I don't mind reparameterizing the model, but I have no idea what to reparameterize. As I understand the trick in section 21.2 (Neal's funnel), it is essentially the non-centered parameterization; since the variables there are centered at 0, there is no mean to add. I am just not sure how that applies in this case: should the distribution of mu_log_Bottom (for example) depend on sigma_log_Bottom, or do I need to transform the data and use a different equation entirely?
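To make the question concrete, here is what I think the non-centered version would look like for one of the hierarchical parameters (using my variable names; a sketch, not tested code):

```stan
// centered (what I have now, roughly):
//   log_Bottom ~ normal(mu_log_Bottom, sigma_log_Bottom);
parameters {
  real mu_log_Bottom;
  real<lower=0> sigma_log_Bottom;
  vector[N_plate] log_Bottom_raw;  // standardized plate-level deviations
}
transformed parameters {
  vector[N_plate] log_Bottom;
  log_Bottom <- mu_log_Bottom + sigma_log_Bottom * log_Bottom_raw;
}
model {
  log_Bottom_raw ~ normal(0, 1);  // implies log_Bottom ~ normal(mu, sigma)
}
```

Is that the right reading, or does something else need to change?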
Data2.csv
logistic_OD_5p_multiUnkn.stan.txt
Analysis.R.txt

Bob Carpenter

unread,
Apr 15, 2016, 3:05:47 PM4/15/16
to stan-...@googlegroups.com
The ones with parameters on LHS and RHS are the ones amenable to
non-centered parameterizations. So that'd be here:

log_x ~ normal(log_ser_dilutions + log_Undil, sigma_x * abs_log_ser_dilutions);

where you'd declare

parameters {
  vector[N_unkn] log_x_raw;

give it a standard-normal prior in the model block:

log_x_raw ~ normal(0, 1);

and then define, as a transformed parameter or local variable in
the model block (before it's used):

log_x <- (log_ser_dilutions + log_Undil)
         + (sigma_x * abs_log_ser_dilutions) .* log_x_raw;

(with .* for the elementwise product if those are vectors).


Same for pred_std:

pred_std ~ normal(mu_Std, sigma_std);


but not this one, because the LHS is just data:

Unknown ~ normal(unkn_cOD, sigma_unkn);

I'd also recommend more descriptive names or conventional Greek letters
instead of "Unknown" as a variable name.

- Bob

Reply all
Reply to author
Forward
0 new messages