Testing for interaction between continuous and categorical predictors


José Waterton

Nov 29, 2018, 3:47:06 PM
to Aster Analysis User Group
Hi everyone,

I am new to aster and struggling a bit so would greatly appreciate any assistance.

I'm analysing data from an experiment where I am testing whether selection on emergence time depends on the identity of neighbouring plants. I planted seeds of an annual plant in plots containing different competitors (CompTreatment) and recorded when they emerged (GermDateValue). I am also using seed mass, which was weighed before planting, as a covariate. I measured whether individuals survived to flower, whether they survived to produce viable seeds, and the total mass of seeds produced.

I used the following graphical model:

1 -> Survival to flower (Bernoulli) -> survival to produce seeds (Bernoulli) -> log(mass of seeds produced) (normal distribution)

Seed mass was log transformed for normality

The code and model I fitted are below:

brdi.famlist<-list(fam.bernoulli(),fam.normal.location(n))

brdi.pred <- c(0,1,2)
brdi.fam<-c(1,1,2)

layer <- gsub("[0-9]", "", as.character(brdi.redata$varb))
unique(layer)

fit <- as.numeric(layer == "LogSeedProd")
unique(fit)

brdi.redata <- data.frame(brdi.redata, fit = fit)

brdi.aster1 <- aster(resp ~ varb + fit:(CompTreatment * GermDateValue + SeedMassMG) ,
                     pred=brdi.pred, fam=brdi.fam, varb, id, root, data = brdi.redata, famlist = brdi.famlist)

summary(brdi.aster1, show.graph = TRUE)

However, I get the following error message:

Error in summary.aster(brdi.aster1, show.graph = TRUE) : cannot compute standard errors

Fitting the model without the interaction doesn't give this error message, so I'm not sure if I'm specifying the model correctly in order to test the interaction between germination date and competition treatment. Any help would be greatly appreciated.

Thank you,
José Waterton

UC San Diego


geyer

Dec 3, 2018, 4:23:35 PM
to Aster Analysis User Group
You did not look at the whole message; the part about directions of recession is crucial.  First read this old post; then, if you still have questions after applying that, we can help.
If you want more reading than that post, deck 9 of the slides for the aster models special topics course is all about this stuff. The course is being taught right now; in fact, we are on this topic today.
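
A minimal sketch of how a direction of recession or constancy can be read off an information matrix, assuming the matrix is available as an ordinary symmetric R matrix (for an aster fit this is, as far as I know, the `fisher` component of the fitted object). The toy matrix `info`, its column names, and the tolerance are hypothetical; the matrix is made singular on purpose so that the mechanics are visible.

```r
# Toy design (hypothetical, not from the thread): the third column duplicates
# the second, so the resulting information-like matrix is singular.
X <- cbind(Intercept = 1,
           TrtB      = c(0, 0, 1, 1, 0, 1),
           TrtB.dup  = c(0, 0, 1, 1, 0, 1))
info <- crossprod(X)

# Eigenvalues that are negligible relative to the largest one flag
# apparent null eigenvectors of the matrix.
eig <- eigen(info, symmetric = TRUE)
tol <- sqrt(.Machine$double.eps)   # assumed tolerance
is.null.dir <- eig$values < max(eig$values) * tol

# Label the null eigenvectors by coefficient name to see which terms
# are involved.
null.dirs <- eig$vectors[, is.null.dir, drop = FALSE]
rownames(null.dirs) <- colnames(info)
print(zapsmall(null.dirs))
```

Nonzero entries in the printed vector point to the coefficients that take part in the problem; entries that are zero belong to coefficients that are not involved.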

José Waterton

Dec 3, 2018, 11:05:43 PM
to Aster Analysis User Group
Thanks for your reply! I think I understand the issue with my data. Also apologies in advance for any misuse of the terminology below...

In one of my neighbour identity treatments (CompTreatment), every single individual that flowered then went on to survive to produce viable seeds. In the other treatments, the vast majority of individuals that flowered went on to produce seeds (the lowest of any treatment is 91%), which makes me wonder if a 'Survival to Produce Seeds' node is generally problematic or redundant in this case. I included it as I thought that I couldn't have values of 0 in the terminal node (mass of seeds produced) following values of 1 in the previous node (i.e. survival to flowering). I don't know if that assumption is correct, as I struggled to find examples using a normal distribution for the terminal node.
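
A situation like this can be checked with a cross-tabulation of the outcome by treatment among the individuals that flowered. The sketch below uses hypothetical toy data; only the column name `CompTreatment` comes from the thread, and `Flowered` and `ProducedSeeds` are illustrative names.

```r
# Hypothetical toy data in wide format (one row per plant).
toy <- data.frame(
  CompTreatment = rep(c("A", "B", "C"), each = 4),
  Flowered      = c(1, 1, 1, 0,  1, 1, 1, 1,  1, 1, 0, 1),
  ProducedSeeds = c(1, 1, 1, 0,  1, 0, 1, 1,  1, 1, 0, 0)
)

# Only plants that flowered can go on to produce seeds.
flowered <- toy[toy$Flowered == 1, ]

# Counts of each outcome (0 or 1) within each treatment.
tab <- table(flowered$CompTreatment, flowered$ProducedSeeds)
print(tab)

# Treatments in which the flowering plants show only one outcome.
degenerate <- rownames(tab)[rowSums(tab > 0) == 1]
print(degenerate)
```

A treatment listed in `degenerate` has no variation in that node, which is the kind of pattern a treatment factor can predict perfectly.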

I think my main question is therefore: can I remove the middle 'survived to produce seeds' node, and simply have survival to flower (Bernoulli) followed by log seed mass (normal distribution, which would then include 0 values)?

I hope this makes sense!

Many thanks,
José

José Waterton

Dec 11, 2018, 9:50:30 PM
to Aster Analysis User Group
Hi again,

I have tried again, but have still run into some problems so would really appreciate your help as I can't figure out where I'm going wrong.  I removed the 'Produced Seeds' variable so now I am left with the following simple graphical model:

1 -> Survival to flower (Bernoulli) -> log(mass of seeds produced + 1) (normal distribution)

There are some individuals that didn't produce seeds after flowering, but these are a minority of individuals.

brdi.vars<-c("Flowered2017","LogSeedProd2017")


brdi.redata <- reshape(brdi.aster.data, varying = list(brdi.vars),
                       direction = "long", timevar = "varb",
                       times = as.factor(brdi.vars),v.names = "resp")

> levels(brdi.redata$varb)
[1] "Flowered2017"    "LogSeedProd2017"
> class(brdi.redata$varb)
[1] "factor"


brdi.famlist<-list(fam.bernoulli(),fam.normal.location(1.84194732))

brdi.pred <- c(0,1)
brdi.fam<-c(1,2)

As seed production is my surrogate for fitness, I used the following code from the tutorial PDF to create an indicator variable for that node:

LogSeedProd2017 <- grep("LogSeedProd2017", as.character(brdi.redata$varb))
LogSeedProd2017 <- is.element(seq(along = brdi.redata$varb), LogSeedProd2017)
brdi.redata <- data.frame(brdi.redata, LogSeedProd2017 = as.numeric(LogSeedProd2017))
names(brdi.redata)

I can fit the following model without germination date, which works fine:

brdi.aster2 <- aster(resp ~ varb + SeedMassMG + CompTreatment * LogSeedProd2017,
                     pred=brdi.pred, fam=brdi.fam, varb, id, root, data = brdi.redata, famlist = brdi.famlist)


However, I am interested in seeing whether the effect of germination date on the surrogate for fitness, seed production (LogSeedProd2017), differs between competition treatments (CompTreatment). I used the following model:

brdi.aster3 <- aster(resp ~ varb + SeedMassMG + CompTreatment * GermDateValue * LogSeedProd2017,
                     pred=brdi.pred, fam=brdi.fam, varb, id, root, data = brdi.redata, famlist = brdi.famlist)

I got the same error message:
apparent null eigenvectors of information matrix
directions of recession or constancy of log likelihood

It seems that including the interaction with germination date is causing the issue. This is a continuous predictor, and each competition treatment has different ranges of germination dates due to variation in the timing of emergence in the field. Could this itself be an issue?

Thanks again for your help,

Best wishes,
José

geyer

Dec 21, 2018, 9:16:12 PM
to Aster Analysis User Group
Since this thread started, the course slides for the unit on solutions at infinity have been revised: http://www.stat.umn.edu/geyer/8931aster/slides/s9.pdf.  It may help to read them again.

It would really help if you could find out exactly what the direction of recession is.  Also, log(mass of seeds produced + 1) doesn't make any sense.  I suppose the +1 is to avoid taking the log of zero, but that isn't the way aster models work.  You should be able to use log(mass of seeds produced) when the predecessor is nonzero and 0 when the predecessor is zero.
But you shouldn't be changing your graph before you know exactly what the issue is.
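
A minimal sketch of that response coding, assuming a wide-format data frame in which the predecessor of the normal node is a 0/1 indicator. The data and the column names `ProducedSeeds`, `SeedMass`, and `LogSeedProd` are hypothetical.

```r
# Hypothetical toy data: the predecessor indicator and the raw seed mass.
toy <- data.frame(
  ProducedSeeds = c(1, 1, 0, 0),
  SeedMass      = c(2.5, 0.8, 0, 0)   # raw mass; 0 where no seeds were produced
)

# Start from 0 everywhere, then take the log only where the predecessor is 1.
# No "+ 1" shift is needed because log() is never applied to a zero mass.
toy$LogSeedProd <- 0
ok <- toy$ProducedSeeds == 1
toy$LogSeedProd[ok] <- log(toy$SeedMass[ok])

print(toy)
```

With this coding the successor is exactly 0 whenever its predecessor is 0, and a negative log mass (here for 0.8) is an ordinary value of the normal node.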

If you cannot figure this out, I will give it a try.

geyer

Dec 21, 2018, 9:19:51 PM
to Aster Analysis User Group
I should have also commented on "these are a minority".  It doesn't matter how many they are.  All that matters is whether these individuals can be perfectly predicted by the regression formula being used.  If these individuals are all in category A and the factor having this category is in the model, then that is solutions at infinity.  I really cannot say more (that is what the 86-slide deck was for).
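
The same phenomenon can be seen in a much simpler setting. The sketch below swaps in an ordinary logistic regression fitted with `glm` (not an aster model) on hypothetical toy data in which every individual in category A has the same outcome; the data and variable names are assumptions made for illustration.

```r
# Hypothetical toy data: all 10 individuals in category A have y = 1,
# while category B has a mix of outcomes (6 ones, 4 zeros).
toy <- data.frame(
  trt = factor(rep(c("A", "B"), each = 10)),
  y   = c(rep(1, 10), rep(c(1, 0), times = c(6, 4)))
)

# One coefficient per category (no intercept) so each can be read directly.
# glm() warns about fitted probabilities of 0 or 1; the warning is
# suppressed here because the perfect prediction is intentional.
fit <- suppressWarnings(glm(y ~ 0 + trt, family = binomial, data = toy))
print(coef(fit))
```

The coefficient for B is finite (the log odds of 6 to 4), while the coefficient for A is a large number determined by the stopping rule of the fitting algorithm rather than a finite maximum likelihood estimate. How many individuals sit in A does not change this.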