Hi,
I'm currently trying to replicate the following MCMCglmm model using INLA, but without success so far. The objective is to estimate the coefficients of effects of 'fixed effect' predictors on the response while controlling for non-independence among data points (species) due to their evolutionary relationships ('phylogeny'). According to a Brownian motion model, which I'm assuming, the covariance among data points (species) drops linearly and proportionally with increasing distance in evolutionary time.
A phylogeny in this case is an ultrametric bifurcating representation of evolutionary distances in time ('branch lengths') among species, with tips (the individual species) connected by internal nodes (which also connect to each other via other nodes), and a root node connecting everything. Branch lengths vary among each pair of connecting nodes and tips, but all tips are the same distance from the root.
#############################################
#Example MCMCglmm analysis setup on a simulated dataset#
#############################################
library(MCMCglmm);
data(bird.families);#This is a representation of the phylogenetic relationships among bird families (although I refer to 'species' elsewhere, in this example it's families)#
#Simulate some data
set.seed(123456)
phylo.effect<-MCMCglmm::rbv(bird.families, 1, nodes="TIPS")#"Random Generation of MVN Breeding Values and Phylogenetic Effects". "TIPS" are the families - we don't have trait values for internal nodes. The '1' here refers to the variance of the simulated values#
set.seed(23456)
phenotype<-phylo.effect+rnorm(dim(phylo.effect)[1], 0, 1)#Adding the non-phylogenetic component of trait variation#
test.data<-data.frame(phenotype=phenotype, taxon=row.names(phenotype))
#Create a sparse inverse matrix of the phylogeny#
Ainv<-MCMCglmm::inverseA(bird.families)$Ainv#This includes all tips (species) and internal nodes (for the latter we have no trait data)#
#Run the analysis#
prior<-list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)));
model<-MCMCglmm(phenotype~1, random=~taxon, ginverse=list(taxon=Ainv),
data=test.data, prior=prior);
This estimates the intercept and the phylogenetic and residual variance components. The estimate and uncertainty of the intercept in this model are modified due to the non-zero phylogenetic random effect for bird families.
I've tried several INLA model setups to replicate this, but without success. These include copying the animalINLA model setup, with either model="generic0" or model="generic2", and with the inverse matrix for the bird families input using Cmatrix (including only tips, but created by subsetting the Ainv object above to just tips, rather than inverting the matrix based on tips alone; according to MCMCglmm documentation this would seem to be okay). Here's the model:
inla(phenotype~f(spN, model="generic2", hyper = list(theta1 = list(param = c(0.5, 0.5), fixed = FALSE),theta2 = list(param = c(0.5, 0.5), fixed = FALSE)), constr = TRUE, Cmatrix = AinvSM) + 1,data=testdata,control.family = list(hyper = list(theta = list(initial = 10, fixed = TRUE))));
I'm totally guessing here, but I wonder if INLA is assuming that the bird families are equidistant in evolutionary time? I used numeric 1:N families as rownames in Cmatrix, and made an equivalent numeric variable in the data table (spN) to cross reference with it. But if INLA is assuming equidistance, what is the purpose of the Cmatrix?
I compared MCMCglmm with INLA (using the animalINLA setup with "generic2" as above) for the intercept and phylogeny random effect using 100 datasets simulated as above. I found that the intercept means were at least proportional (with r^2=0.5) but with much narrower credible intervals in INLA, almost exclusively non-overlapping with the true population value. For the phylogenetic random effect, there is a relationship between the MCMCglmm and INLA estimates (r^2~0.2) but it's more noisy and not proportional (slope~0.5).
If I've done something completely wrong please let me know!
I see there have been related questions previously, but so far I don't have the impression there's a conclusive answer.
Many thanks for your time!
All the best, Richard.