Hello,
I'm using TASSEL for genome-wide association. While reading about GWAS in general, and TASSEL in particular, I came up with the following three questions. Any input is welcome.
1) I would like to confirm that if I use GLM without covariates (population stratification, Q) in TASSEL, I'm actually applying a 'regular' linear regression, i.e. using the naive model. Correct?
2) If I have the same trait measured in different conditions, would you advise me to use a model of the form Y = mean + condition + other terms, or a 'combined trait', such as a difference, a ratio, etc.? I guess incorporating condition into the model will reduce power, since I have to correct for even more tests (number of SNPs x number of conditions), correct? But if I use a 'combined trait' and it is not normally distributed, can I apply the usual transformation strategies to normalize the data and then use GLM and MLM as usual?
3) Finally, do you know what the best tool is to generate power curves that make sense for the models used in TASSEL? Most available power estimation tools are for case-control GWAS, and I haven't been able to figure out whether tools like GWAPower (uses ANOVA and the associated F-test) or GWASpower/QT (also uses ANOVA and the associated F-test) are adequate for GLM and MLM. My understanding is that power depends on heritability, type I error rate, total sample size, number of SNPs used, linkage disequilibrium, and other covariates, but also on the method used. So if I calculate power for GLM, it won't be the same for MLM. Is this correct?
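To make question 3 concrete, this is the kind of back-of-the-envelope power calculation I have in mind. It is only a sketch under my own assumptions, not what TASSEL, GWAPower or GWASpower/QT actually compute: a single SNP tested by simple linear regression in unrelated individuals, a large-sample normal approximation to the test statistic, and a Bonferroni-corrected threshold. The function name and all parameters (`approx_power`, `var_explained`, `n_tests`) are made up for illustration, and nothing here accounts for kinship or population structure, so at best it would resemble GLM without covariates rather than MLM.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, var_explained, alpha, n_tests):
    """Approximate power to detect one SNP with a two-sided 1-df test.

    n             -- number of individuals (assumed unrelated)
    var_explained -- fraction of trait variance explained by the SNP (0 <= x < 1)
    alpha         -- family-wise type I error rate
    n_tests       -- number of tests used for the Bonferroni correction
    """
    std_normal = NormalDist()
    alpha_per_test = alpha / n_tests
    # Two-sided critical value on the z scale.
    z_crit = std_normal.inv_cdf(1 - alpha_per_test / 2)
    # Expected z statistic under the alternative (large-sample approximation).
    ncp = sqrt(n * var_explained / (1 - var_explained))
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


# Hypothetical numbers, chosen only to show how the inputs interact.
for n_tests in (1_000, 100_000):
    power = approx_power(n=500, var_explained=0.05, alpha=0.05, n_tests=n_tests)
    print(f"{n_tests} tests: power ~ {power:.3f}")
```

In this approximation power rises with sample size and variance explained, and falls as the number of tests grows, which is why I am asking how the choice of method (GLM vs. MLM) would change the picture.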
Basically, I would like to have a statistically meaningful way of selecting the number of SNPs to use in my analysis, since I have way too many SNPs with MAF > 5% and recall rate > 95%, and the correction for multiple testing with this many SNPs would make any possible signal disappear.
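For reference, this is the arithmetic behind my worry about multiple testing, both here and in question 2. It is a minimal sketch assuming a plain Bonferroni correction, which treats all tests as independent and is therefore likely too strict when SNPs are in linkage disequilibrium. The function name and the SNP and condition counts are hypothetical, not my real data.

```python
def bonferroni_threshold(alpha, n_snps, n_conditions=1):
    """Per-test p-value threshold when every SNP is tested in every condition."""
    return alpha / (n_snps * n_conditions)


# Hypothetical counts, only to show how fast the threshold shrinks.
for n_snps in (10_000, 100_000, 1_000_000):
    single = bonferroni_threshold(0.05, n_snps)
    three_conditions = bonferroni_threshold(0.05, n_snps, n_conditions=3)
    print(f"{n_snps} SNPs: {single:.2e} (1 condition), {three_conditions:.2e} (3 conditions)")
```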
Thank you,
Ines.