Averaging aster predictions over (fixed effect) block term


John Benning

Mar 1, 2019, 1:34:19 PM
to Aster Analysis User Group
Hello all!

(I think this can be addressed sans code but if code would help I'm happy to provide an example.)

I am generating lifetime fitness (seed set) predictions from an aster model with block as a fixed effect. There are six blocks. For simplicity, let's say that the only other effect is population (also fixed). 

One method would be to create a new "hypothetical" data frame (as in the aster tutorial), including one record for each population, and randomly choose a block number to assign to each record. But I would rather average over the six blocks, since variation among blocks was high. 

I think averaging over the six blocks to obtain a prediction for each population is OK, but how should I calculate standard errors for those averages? I don't think simply averaging the SEs is the correct method. From some Googling, I'm wondering if the Satterthwaite approximation is the way to go. Example for averaging over just two blocks (sorry for the image; can't use an equation editor here):

Capture.PNG

Does this seem correct? 

John

geyer

Mar 1, 2019, 5:15:01 PM
to Aster Analysis User Group
Getting predicted values with standard errors is handled by the aster and aster.formula methods of R function predict, that is, R functions predict.aster and predict.aster.formula (I guess you used a formula to specify the aster model so will use the latter).  This is covered (using as an example the analysis in the first ever published paper about aster) on slides 98 ff. of the course slides for the aster models course.

There are two things you could reasonably do.

  • Use an average block effect on the canonical parameter scale.  This is a bit tricky because some blocks do not have estimated effects but rather have effects that are constrained to be zero to get an identifiable model.  So you sum the betas (coefficients) for blocks and divide by the number of blocks (not the number of betas for blocks) and this is the block effect for every hypothetical individual you predict for (one individual for each population).
  • Use an average block effect on the mean value parameter scale.  This is a bit tricky because you have to use the amat argument of R function predict.aster, as illustrated on the slides linked above starting at slide 104.  Here you would have nblocks * npop hypothetical individuals.  Predict mean fitness for all of them, then use amat to, in effect, average over the predictions for the blocks for each pop, again getting one prediction for each pop.
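
A minimal sketch of the first option (averaging on the canonical parameter scale), using made-up coefficient names and values rather than anything from a fitted aster model. It only illustrates the arithmetic: divide the summed block coefficients by the number of blocks, which is the same as putting 1 / nblock in every block dummy column of each hypothetical individual's model-matrix row. A real aster model matrix has an extra dimension for the nodes of the graph, which this toy ignores.

```r
# Hypothetical coefficients; names and values are invented for illustration.
# Block "B1" is the baseline whose effect is constrained to zero, so it has
# no coefficient of its own.
beta <- c(popA = 0.50, popB = 0.20, popC = -0.10,
          blockB2 = 0.30, blockB3 = -0.20, blockB4 = 0.10,
          blockB5 = 0.40, blockB6 = 0.60)
nblock <- 6
is_block <- grepl("^block", names(beta))

# Average block effect: divide by the number of blocks (6),
# not by the number of block coefficients (5).
avg_block <- sum(beta[is_block]) / nblock

# Equivalent formulation: one row per population, with 1 / nblock in
# every block dummy column.
pops <- c("popA", "popB", "popC")
M <- matrix(0, nrow = length(pops), ncol = length(beta),
            dimnames = list(pops, names(beta)))
M[cbind(pops, pops)] <- 1      # each row switches on its own population
M[, is_block] <- 1 / nblock    # same averaged block effect for every row

# Linear predictor for each hypothetical individual:
# population effect plus the average block effect.
eta <- drop(M %*% beta)
```

Rows built this way would play the role of the model matrix for the hypothetical individuals; how they are passed to the prediction function depends on how the model was fitted.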

If these are too brief, they are just special cases of the delta method, the whole theory of which is explained at the beginning of deck 4 of the course slides for the aster models course.  So this deck of slides discusses another way to get exactly the same predictions and standard errors.  That discussion is not doing a problem that is very close to your problem, so that may not be much help.  I think one of the suggestions in the bulleted list is better.
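
For reference, the special case of the delta method that matters here can be written out as follows. The notation is introduced only for this illustration: $\hat\mu = (\hat\mu_1, \dots, \hat\mu_B)$ are the predicted mean fitnesses of one population in the $B$ blocks, $\hat V$ is their estimated covariance matrix, and $a = (1/B, \dots, 1/B)^T$.

$$
\bar\mu = a^T \hat\mu = \frac{1}{B} \sum_{b=1}^{B} \hat\mu_b,
\qquad
\widehat{\operatorname{se}}(\bar\mu) = \sqrt{a^T \hat V a}
= \frac{1}{B} \sqrt{\sum_{b=1}^{B} \sum_{b'=1}^{B} \hat V_{b b'}} .
$$

The off-diagonal terms $\hat V_{b b'}$ are generally not zero, because all $B$ predictions come from the same fitted coefficients. This is why averaging the individual standard errors $\sqrt{\hat V_{bb}}$, or combining them as if the predictions were independent, does not in general give the right answer.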


tl;dr: your guess about how to do standard errors is wrong, and R function predict.aster.formula knows how to do it right.

John Benning

Mar 4, 2019, 7:03:30 PM
to Aster Analysis User Group
Thanks, Charlie. I'm a bit stumped by how to set up the amat 3D array for my application. I've attached R code, and the necessary data files, to build and predict from the aster model I'm actually working with. (The large aster data file is too big to attach here, so it's here.)

A brief summary of the experiment:
These are data from a transplant experiment with Clarkia xantiana in southern California, similar in design to the experiment Ryan Briscoe Runquist was analyzing with Aster previously. I had six sites (with six blocks per site), three seed sources (populations), and a caging treatment to test for effects of biotic interactions across the species' geographic range. The aster LHS graph is: germination, early survival, late survival, survival to flowering, any fruits produced, total seeds produced.

The actual model is more complicated than the example I gave before: the predictors are planting year, site (with block nested within), Seed source, Caging treatment, and the site X Seed and site X Caging interactions. All in all, there are 360 "hypothetical" individuals for the predictions. I'm interested in generating predictions for all the combinations of predictors except block (i.e., averaging over block), meaning I'm interested in obtaining 60 estimates with SE's.

I'm stumped on how I can use amat to average over the 6 blocks at each site. I know the dimensions of the 3D amat array are: 1) the number of individuals, 2) number of nodes in the graph, 3) number of parameters we want point estimates for. I think I know these dimensions: 
  • 1 = 360 hypothetical individuals
  • 2 = 6 nodes in my graphical model
  • 3 = 60 mean fitness estimates
However, when I'm "setting" the components of the amat array, I don't know at what locations I need to change values (except I think for the 2nd dimension, where it would always be filling in the [x, 6, x] position, because the 6th and final node is total_seeds, which is our lifetime fitness estimate). 
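
One way the positions could be filled in is sketched below on a deliberately small toy layout (2 sites, 3 blocks, 2 seed sources), not the 360-row design described above. The column names, the sizes, and the stand-in matrix `mu` are all assumptions made for illustration. The idea is that individual i contributes weight 1 / nblock at the fitness node to the one estimate that shares all of its predictors except block, and zero everywhere else.

```r
# Toy hypothetical individuals: every block x site x seed combination.
newdata <- expand.grid(block = 1:3, site = c("S1", "S2"),
                       seed = c("A", "B"), stringsAsFactors = FALSE)
nind     <- nrow(newdata)   # 12 hypothetical individuals
nnode    <- 6               # nodes in the graph
fit_node <- 6               # assumed position of the fitness node (total seeds)
nblock   <- 3

# One group per combination of all predictors except block.
group  <- interaction(newdata$site, newdata$seed, drop = TRUE)
ngroup <- nlevels(group)    # number of averaged estimates wanted

# amat[i, j, k] is the weight of individual i, node j, in estimate k.
amat <- array(0, dim = c(nind, nnode, ngroup),
              dimnames = list(NULL, NULL, levels(group)))
for (i in seq_len(nind)) {
  amat[i, fit_node, as.integer(group)[i]] <- 1 / nblock
}

# Check on stand-in numbers (NOT aster output): flattening amat to a matrix
# and applying it to the flattened predictions reproduces the per-group
# means of the fitness node.
set.seed(1)
mu  <- matrix(runif(nind * nnode), nrow = nind, ncol = nnode)
est <- drop(crossprod(matrix(amat, ncol = ngroup), as.vector(mu)))
expected <- tapply(mu[, fit_node], group, mean)
```

With a fitted model, an array built this way would be passed as the amat argument of predict together with se.fit = TRUE, so that the standard errors of the averages are computed from the full covariance of the predictions. This assumes the first dimension of amat follows the row order of the hypothetical individuals in the data frame before it is reshaped to long format.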

Of course, I may be far off base here, or missing something very obvious -- sorry if so. Any help is appreciated!

John
t2.newdata.csv
geyer20190304.R