I'm having difficulty getting PLINK to calculate principal components (--pca) for a tped/tfam fileset from a nonmodel species. I've tried many times now and don't know what else to do differently, so I would really appreciate some pointers. This is similar to problems others in this group have had, but the posted solutions aren't working for me.
Full Command: plink --tfile all_emmax_format_12 --pca 5 --allow-extra-chr --chr-set 19
Error: Invalid chromosome code '96' on line 7728294 of .tped file.
(This is disallowed by your --chr-set/--autosome-num parameters. Check if the
problem is with your data, or your command line.)
I think --chr-set is supposed to be the number of autosomes (19 for my nonmodel species). I've also tried it with 3664 (the total number of scaffolds, including unassembled contigs), but as I understand it, --chr-set can't take a number that high. Is this correct?
My understanding is that I have to use --allow-extra-chr in this case.
The input tped and tfam look as they should. These files came from PLINK and I know the files are okay because I used them for EMMAX, which stops if there's any formatting issue.
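In case it helps with diagnosing this, here is a minimal sketch of how the chromosome codes in the first column of the .tped could be tallied, to see which numeric codes fall above the autosome count passed to --chr-set. This is only an illustration: the function names and the sample lines are made up, and it assumes the usual whitespace-delimited tped layout (chromosome, SNP ID, genetic distance, bp position, then genotypes).

```python
from collections import Counter


def tally_chrom_codes(lines):
    """Count how often each chromosome code (first column) appears."""
    counts = Counter()
    for line in lines:
        fields = line.split()  # handles both spaces and tabs
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def numeric_codes_above(counts, n_autosomes):
    """Return the purely numeric codes greater than n_autosomes, in numeric order."""
    flagged = (c for c in counts if c.isdigit() and int(c) > n_autosomes)
    return sorted(flagged, key=int)


# Made-up sample lines in tped layout (not my real data).
sample = [
    "1 snp1 0 1000 A A A G",
    "19 snp2 0 2000 C C C T",
    "96 snp3 0 3000 G G G G",
    "96 snp4 0 3500 T T T C",
]

counts = tally_chrom_codes(sample)
flagged = numeric_codes_above(counts, 19)
print(counts)
print(flagged)

# On the real file, the same functions could be applied to an open file handle:
# with open("all_emmax_format_12.tped") as fh:
#     counts = tally_chrom_codes(fh)
```

Reading the file line by line keeps memory use low, which matters for a .tped with millions of lines.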
Is it not possible to calculate PCA covariates using PLINK when there are this many scaffolds? Is that the problem?
Thanks,
Michael Nagle
PhD student, Molecular and Cellular Biology