Trouble with testing significances of fixed effects using log likelihood ratio tests

87 views

Skip to first unread message

Ryan Runquist

unread,

Oct 30, 2015, 5:28:55 PM10/30/15

to Aster Analysis User Group

Hi,

I have been working with aster on data from a large reciprocal transplant experiment to test the significance transplant site and population of origin on fitness. Seeds were taken from three populations (Population: Central, Center-Edge, and Edge) and transplanted into three transplant sites (TransplantSite: Central, Center-Edge, Edge, and Far).

I am using the following graphical model:

#1 --> Germ --> Early survival --> Surivival to flower --> Any Fruits --> #Fruits --> #Seeds

#Germination is Bernoulli

#Early survival (survival to March) is Bernoulli

#Survival to flowering (survival to May with the possibility of flowering) is Bernoulli

#Any fruits (did the plant produce fruits) is Bernoulli

#Number of Fruits is zero-truncated poisson

#Number of seeds is zero-truncated poisson

I was able to run the fixed effects aster model:

seeds.out8 <- aster(resp ~ varb + seeds:(TransplantSite*Population), pred1, fam1, varb, id, root, data = reClarkia)

summary(seeds.out8, show.graph = T)

And the summary looked good:

Call:

aster.formula(formula = resp ~ varb + seeds:(TransplantSite *

Population), pred = pred1, fam = fam1, varvar = varb, idvar = id,

root = root, data = reClarkia)

Graphical Model:

variable predecessor family

Germ root bernoulli

March_Alive Germ bernoulli

Flowering_Alive March_Alive bernoulli

Fruiting_Alive Flowering_Alive bernoulli

Total_fruit Fruiting_Alive truncated.poisson(truncation = 0)

Total_seeds Total_fruit truncated.poisson(truncation = 0)

Estimate Std. Error z value Pr(>|z|)

(Intercept) -0.881062 0.026523 -33.219 < 2e-16 ***

varbFruiting_Alive 0.063563 0.070097 0.907 0.3645

varbGerm 1.102525 0.037838 29.138 < 2e-16 ***

varbMarch_Alive 2.886413 0.038911 74.179 < 2e-16 ***

varbTotal_fruit -23.430536 0.084645 -276.810 < 2e-16 ***

varbTotal_seeds 4.116154 0.026729 153.997 < 2e-16 ***

seeds:TransplantSiteC 0.018888 0.001451 13.018 < 2e-16 ***

seeds:TransplantSiteC-E 0.006653 0.001671 3.982 6.84e-05 ***

seeds:TransplantSiteE -0.030105 0.003546 -8.491 < 2e-16 ***

seeds:PopulationC-E -0.021735 0.001631 -13.327 < 2e-16 ***

seeds:PopulationE -0.028939 0.002059 -14.054 < 2e-16 ***

seeds:TransplantSiteC-E:PopulationC-E 0.017995 0.002300 7.824 5.12e-15 ***

seeds:TransplantSiteE:PopulationC-E 0.010533 0.005720 1.842 0.0655 .

seeds:TransplantSiteFar:PopulationC-E 0.017191 0.002570 6.689 2.24e-11 ***

seeds:TransplantSiteC-E:PopulationE 0.021116 0.002718 7.770 7.87e-15 ***

seeds:TransplantSiteE:PopulationE 0.033202 0.004847 6.850 7.36e-12 ***

seeds:TransplantSiteFar:PopulationE 0.027945 0.002771 10.085 < 2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Original predictor variables dropped (aliased)

seeds:TransplantSiteFar

I also ran a nested model to test the interaction:

seeds.out8.1 <- aster(resp ~ varb + seeds:(TransplantSite + Population), pred1, fam1, varb, id, root, data = reClarkia)

summary(seeds.out8.1, show.graph = T)

and the summary also looks fine:

Call:

aster.formula(formula = resp ~ varb + seeds:(TransplantSite +

Population), pred = pred1, fam = fam1, varvar = varb, idvar = id,

root = root, data = reClarkia)

Graphical Model:

variable predecessor family

Germ root bernoulli

March_Alive Germ bernoulli

Flowering_Alive March_Alive bernoulli

Fruiting_Alive Flowering_Alive bernoulli

Total_fruit Fruiting_Alive truncated.poisson(truncation = 0)

Total_seeds Total_fruit truncated.poisson(truncation = 0)

Estimate Std. Error z value Pr(>|z|)

(Intercept) -8.811e-01 2.652e-02 -33.219 < 2e-16 ***

varbFruiting_Alive 7.925e-03 6.994e-02 0.113 0.91

varbGerm 1.103e+00 3.784e-02 29.138 < 2e-16 ***

varbMarch_Alive 2.886e+00 3.891e-02 74.179 < 2e-16 ***

varbTotal_fruit -2.343e+01 8.463e-02 -276.885 < 2e-16 ***

varbTotal_seeds 4.122e+00 2.671e-02 154.338 < 2e-16 ***

seeds:TransplantSiteC 9.644e-03 9.638e-04 10.007 < 2e-16 ***

seeds:TransplantSiteC-E 4.744e-03 1.048e-03 4.528 5.94e-06 ***

seeds:TransplantSiteE -2.882e-02 2.119e-03 -13.600 < 2e-16 ***

seeds:PopulationC-E -1.203e-02 9.390e-04 -12.814 < 2e-16 ***

seeds:PopulationE -1.322e-02 9.729e-04 -13.589 < 2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Original predictor variables dropped (aliased)

seeds:TransplantSiteFar

But when I try use these models to test for significance:

anova(seeds.out8.1, seeds.out8)

I get this error:

Error in anovaAsterOrReasterList(allargs, tolerance) :

model matrices for fixed effects not nested

model 1 and model 2

I have used this code on a slightly different graphical model before (the same data but with Germination specified differently) and it worked. I have also tried other nested models and models without the fitness (seeds) interaction.

I am not sure why this is not working.

I have attached my code for this model (Clarkia_Fixed_Effect_Aster_Script.R), my original code with the slightly different graphical model (Original_Clarkia_Aster_Model.R) and the data (Clarkia_North_Aster_Ryan.csv).

Any thoughts would be greatly appreciated.

Thanks!

Ryan

Clarkia_North_Aster_Ryan.csv

Clarkia_Fixed_Effect_Aster_Script.R

Original_Clarkia_Aster_Model.R

geyer

unread,

Oct 31, 2015, 5:46:21 AM10/31/15

to Aster Analysis User Group

The problem is inexact computer arithmetic. There is no way to code the test for nested models (which is important because users often get this wrong, although
you didn't not get it wrong, your models are correctly nested). That's what the tolerance argument is for. Crazily, the code you sent worked OK on an old 32-bit computer but not on 64-bit (which almost everything is these days). When I added the argument tolerance = .Machine$double.eps ^ 0.7 to your anova function calls, it worked on 64-bit too (see attached Rout).

On another subject, the family for number of seeds should be Poisson rather than zero-truncated Poisson IMHO. Zero-truncated Poisson is for when zero values are not possible by design, not for when there just do not happen to be zeros in the actual data. That is, number of fruits cannot be zero conditional on any fruits = 1, because any fruits = 1 means number fruits > 0, but there is no such restriction on number seeds and its predecessor (number fruits). It probably doesn't matter in practice, but it will be a little easier on the computer and scientifically more sensible too (again IMHO) to use Poisson for the number seeds node.