Hi Tassel group,
Thanks for reading my post. I am new to Tassel, so please forgive me if I ask some stupid questions.
My questions are listed below:
1) Should I calculate the PCA and kinship after applying the MAF filter, or should I use the unfiltered genotype data to calculate them?
I noticed that the MLM workflow in Tassel is: load genotype → filter MAF → calculate PCA → join trait data → kinship calculation → MLM. The command line would look like this:
./run_pipeline.pl -fork1 -h 'my_geno.hmp.txt' -filterAlign -filterAlignMinFreq 0.05 -fork2 -PrincipalComponentsPlugin -covariance true -input1 -endPlugin -fork3 -r "phenos.txt" -fork4 -KinshipPlugin -input1 -method Centered_IBS -endPlugin -combine5 -input1 -input2 -input3 -intersect -combine6 -input4 -input5 -mlm -mlmVarCompEst P3D -mlmCompressionLevel Optimum
As far as I know, the genotype data would first be filtered using the taxa list and then filtered by the MAF requirement. Did I misunderstand something?
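To make question 1 concrete, here is a minimal sketch in plain Python (not Tassel code) of why I think the order matters. The toy genotypes, the `phenotyped_rows` list, and the function name are all made up for illustration; genotypes are coded as 0/1/2 copies of one allele, and 0.05 is the same threshold as -filterAlignMinFreq above.

```python
def minor_allele_freq(geno):
    """Return the minor allele frequency per site.

    geno is a list of taxa rows, each a list of 0/1/2 allele counts per site.
    """
    n_taxa = len(geno)
    n_sites = len(geno[0])
    maf = []
    for j in range(n_sites):
        # Frequency of the counted allele at site j (2 alleles per taxon).
        p = sum(row[j] for row in geno) / (2.0 * n_taxa)
        maf.append(min(p, 1.0 - p))
    return maf


# Hypothetical toy data: 6 taxa x 3 sites.
geno = [
    [0, 0, 2],
    [0, 0, 2],
    [0, 1, 0],
    [0, 0, 0],
    [2, 0, 0],
    [0, 0, 2],
]
# Hypothetical: only the first 4 taxa have phenotype data.
phenotyped_rows = [0, 1, 2, 3]
min_freq = 0.05

maf_all = minor_allele_freq(geno)
maf_subset = minor_allele_freq([geno[i] for i in phenotyped_rows])

keep_all = [m >= min_freq for m in maf_all]
keep_subset = [m >= min_freq for m in maf_subset]

print(keep_all)     # [True, True, True]
print(keep_subset)  # [False, True, True]
```

In this toy case the first site passes the MAF filter on all taxa but is monomorphic among the phenotyped taxa, so filtering by MAF before or after the taxa filter gives different site sets.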
2) I have a kinship dataset that includes around 600 lines, but only 300 of those lines have phenotype data. Can I still use the kinship data for association analysis if the kinship was calculated using the unfiltered data?
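To show what I mean in question 2, this is roughly the subsetting I imagine would be needed, written as a plain Python sketch. It is not Tassel code and I do not know whether Tassel's join does this internally; the taxa names, the kinship values, and the `subset_kinship` helper are all hypothetical.

```python
def subset_kinship(taxa, kinship, keep):
    """Keep only the rows and columns of a square kinship matrix for the taxa in keep.

    Taxa in keep that are missing from the kinship matrix are skipped.
    """
    index = {name: i for i, name in enumerate(taxa)}
    rows = [index[name] for name in keep if name in index]
    sub_taxa = [taxa[i] for i in rows]
    sub = [[kinship[i][j] for j in rows] for i in rows]
    return sub_taxa, sub


# Hypothetical kinship for 4 lines (stands in for my 600-line matrix).
taxa = ["A", "B", "C", "D"]
kinship = [
    [1.9, 0.1, -0.3, 0.0],
    [0.1, 2.1, 0.2, -0.4],
    [-0.3, 0.2, 1.7, 0.5],
    [0.0, -0.4, 0.5, 2.0],
]
# Hypothetical phenotyped lines; "Z" has no genotype and is dropped.
phenotyped = ["B", "D", "Z"]

sub_taxa, sub_kinship = subset_kinship(taxa, kinship, phenotyped)
print(sub_taxa)     # ['B', 'D']
print(sub_kinship)  # [[2.1, -0.4], [-0.4, 2.0]]
```

Note that the values themselves are unchanged by the subsetting: they still reflect allele frequencies from all of the original lines, which is exactly what I am unsure about.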
3) I found that the diagonal values of the kinship table generated by Tassel are not all the same. How should I interpret this? From my understanding, each diagonal value indicates the kinship of a line with itself, so I expected the values to be the same across all lines.
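For question 3, I tried a small sketch in plain Python to see whether unequal diagonals are expected. I am assuming here that Centered_IBS centers each genotype by twice the allele frequency and scales by 2 Σ p(1 - p); that formula is my assumption, not something I checked in the Tassel source, and the toy genotypes are made up.

```python
def centered_kinship(geno):
    """Kinship from centered genotypes (assumed formula, see the note above).

    geno is a list of taxa rows, each a list of 0/1/2 allele counts per site.
    """
    n_taxa = len(geno)
    n_sites = len(geno[0])
    # Allele frequency per site.
    p = [sum(row[j] for row in geno) / (2.0 * n_taxa) for j in range(n_sites)]
    # Center each genotype by its expected value 2p.
    w = [[row[j] - 2.0 * p[j] for j in range(n_sites)] for row in geno]
    # Scale by the summed expected variance across sites.
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    return [
        [sum(a * b for a, b in zip(w[i], w[k])) / scale for k in range(n_taxa)]
        for i in range(n_taxa)
    ]


# Hypothetical toy data: 3 homozygous lines and 1 fully heterozygous line.
geno = [
    [0, 0, 2, 2],
    [2, 0, 2, 0],
    [0, 2, 0, 0],
    [1, 1, 1, 1],
]

k = centered_kinship(geno)
diagonal = [k[i][i] for i in range(len(k))]
print([round(d, 3) for d in diagonal])
```

With this formula the diagonal values differ between lines in the toy data, and the heterozygous line gets the smallest value, so maybe the diagonal is not supposed to be constant. Is that the right way to read the Tassel output?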
I hope I have made myself clear. I look forward to your responses.
Thanks!
Joe