Hello,
I'm pretty new to VisTrails, but I'm really liking the interface and the ability to keep track of the modelling process! Thanks for creating this product :-)
I have a quick question about the R code used in the GLM portion of SAHM, though. I ran a very quick exploratory analysis using the GLM with the "Squared terms" option checked and "SelectBestPredSubset" set to AIC. However, when I looked at the glm_output.txt file for the final model, I was a bit confused about what's going on (this could also be due to my limited knowledge of R and certain commands). The output lists the following for the best model:
Results:
number covariates in final model : 4
Call:
glm(formula = response ~ I(c^2) + I(cl^2) + I(built600^2) + cti,
    family = model.family, data = dat, weights = weight, na.action = "na.exclude")
Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)   -4.626e+00  1.053e+00  -4.392 1.12e-05 ***
I(c^2)        -7.938e-04  8.491e-04  -0.935   0.3498
I(cl^2)        6.983e-05  3.580e-05   1.950   0.0511 .
I(built600^2)  6.771e+00  3.193e+00   2.121   0.0339 *
cti            1.645e-01  8.798e-02   1.870   0.0615 .
It was my understanding that for any quadratic term, the corresponding "linear" term also needs to be included in the model. So in this case, if I'm interested in the quadratic effects of c, cl, and built600, shouldn't there be 3 more parameters in the model, i.e.:
glm(formula = response ~ c + I(c^2) + cl + I(cl^2) + built600 + I(built600^2) + cti,
    family = model.family, data = dat, weights = weight, na.action = "na.exclude")
and then in the coefficient table, shouldn't there be 3 more parameters: c, cl and built600?
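To make the comparison concrete, here is a minimal sketch that fits both formulas side by side. It rests on several assumptions: the data are simulated (only the variable names match the output above), the family is taken to be binomial because the output reports z values, and the weights argument is left out. The real model.family, dat, and weight objects are set inside SAHM and are not shown in the output, so this illustrates the two model structures only, not how SAHM itself works.

```r
# Simulated stand-in data: the column names mirror the output above,
# but the values are random and purely illustrative (assumption).
set.seed(1)
n <- 200
dat <- data.frame(
  c        = runif(n, 0, 50),
  cl       = runif(n, 0, 200),
  built600 = runif(n, 0, 1),
  cti      = rnorm(n, 10, 3)
)
# Hypothetical presence/absence response that depends on cti only.
dat$response <- rbinom(n, 1, plogis(-2 + 0.15 * dat$cti))

# Model as reported in glm_output.txt: squared terms without linear terms.
# family = binomial is an assumption; SAHM passes its own model.family.
fit_sq <- glm(response ~ I(c^2) + I(cl^2) + I(built600^2) + cti,
              family = binomial, data = dat)

# Model with each linear term included alongside its squared term.
fit_full <- glm(response ~ c + I(c^2) + cl + I(cl^2) +
                  built600 + I(built600^2) + cti,
                family = binomial, data = dat)

# Compare which coefficients each model estimates, and their AIC values.
print(names(coef(fit_sq)))
print(names(coef(fit_full)))
print(AIC(fit_sq, fit_full))
```

The second fit has three additional coefficients (c, cl, and built600), which is what I would have expected to see in the coefficient table.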
Apologies if there is a simple answer to this and I'm totally missing something.
Thanks again,
Ryan