Hi,
I have some new questions on this thread. I understand why picking a transformation that results in a higher LOD score is a bad approach.
1) What about picking the transformation that yields the highest p-value when the Shapiro-Wilk test for normality is run on the transformed data (and/or the smallest departure from the reference line on an accompanying normal Q-Q plot)? That way we would go into the LOD score calculation with data that are as close to normal as reasonably possible. Does this sound like a viable approach?
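To make question 1 concrete, here is the kind of selection rule I have in mind, sketched in Python rather than R (the phenotype values and the candidate list are invented for illustration):

```python
# Sketch of question 1: among a few candidate transformations, keep the
# one whose transformed values give the highest Shapiro-Wilk p-value.
# Toy data only; real phenotypes would replace `pheno`.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pheno = rng.lognormal(mean=1.0, sigma=0.6, size=200)  # right-skewed toy phenotype

candidates = {
    "identity": lambda x: x,
    "log": np.log,    # requires strictly positive values
    "sqrt": np.sqrt,  # requires non-negative values
}

# Shapiro-Wilk p-value for each transformed version of the data
results = {name: stats.shapiro(f(pheno)).pvalue for name, f in candidates.items()}
best = max(results, key=results.get)
print(best, results[best])
```

With these toy (log-normal) data the log transformation wins, as expected; my question is whether selecting a transformation this way is statistically defensible in general.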
Also, in your last response in this thread, Karl, you mentioned the utility of forcing normality with the nqrank function in qtl1. I'm not sure I grasp the distinction between using nqrank to transform "genome-scale phenotypes like gene expression" and using it on any other phenotype. 2) Would it be acceptable to use nqrank to transform the general set of phenotypes for which neither a log transformation nor a square-root transformation produces a more normally distributed dataset? If not, maybe I need to better understand why this type of transformation should be reserved for genome-scale phenotypes?
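For clarity about what I mean in question 2: my understanding is that nqrank replaces each value with the normal quantile corresponding to its rank. Here is that idea sketched in Python (the function name and toy data are mine, not from the package):

```python
# Sketch of the rank-based "forced normality" idea from question 2:
# map each value to the standard-normal quantile of its mid-rank.
# This is my understanding of what nqrank does; names here are invented.
import numpy as np
from scipy import stats

def normal_quantile_transform(x):
    """Map values to normal scores via their ranks (ties get average ranks)."""
    x = np.asarray(x, dtype=float)
    ranks = stats.rankdata(x)           # ranks 1..n, averaged over ties
    quantiles = (ranks - 0.5) / len(x)  # mid-rank quantiles in (0, 1)
    return stats.norm.ppf(quantiles)    # standard normal scores

rng = np.random.default_rng(1)
skewed = rng.exponential(scale=2.0, size=100)  # toy skewed phenotype
transformed = normal_quantile_transform(skewed)
print(stats.shapiro(transformed).pvalue)
```

The transform preserves the ordering of the observations while making the marginal distribution essentially normal by construction, which is exactly why I wonder whether it is fair game for any hard-to-normalize phenotype.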
A log transformation did help normalize some of our study's phenotypes, but not others.
I would appreciate any insight you can provide on these two related sets of questions.
Take care,
Mark