PCA: Invalid chr code when using --chr-set and --allow-extra-chr


nag...@oregonstate.edu

May 18, 2018, 10:25:05 PM
to plink2-users
I'm having difficulty getting PLINK to run PCA on a tped/tfam fileset from a nonmodel species. I've tried many times now and don't know what else to change, and would really appreciate some pointers. This is similar to problems others have reported in this group, but the posted solutions aren't working for me.

Full Command: plink --tfile all_emmax_format_12 --pca 5 --allow-extra-chr --chr-set 19 
Error: Invalid chromosome code '96' on line 7728294 of .tped file.
(This is disallowed by your --chr-set/--autosome-num parameters.  Check if the
problem is with your data, or your command line.)

I think --chr-set is supposed to be the number of autosomes (19 for my nonmodel species), but I've also tried it with 3664 (the total number of scaffolds, including unassembled contigs). As I understand it, --chr-set can't take a number that high. Is this correct?

My understanding is that I have to use --allow-extra-chr in this case.

The input tped and tfam look as they should. These files came from PLINK and I know the files are okay because I used them for EMMAX, which stops if there's any formatting issue.

Is it not possible to calculate covariates using PLINK when there are so many scaffolds? Is that the problem?

Thanks,
Michael Nagle
PhD student, Molecular and Cellular Biology




Christopher Chang

May 18, 2018, 10:31:49 PM
to plink2-users
Scaffold names can’t be numeric. Change them to scaffold1, scaffold2, etc. and you should be fine.
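For readers who want to script the rename, here is a minimal Python sketch. It assumes the chromosome code is the first whitespace-delimited field on each line (as in .tped, .map and .bim files) and that every purely numeric code above the autosome count is a scaffold. The function name, the "scaffold" prefix default and the file names in the comment are illustrative, not part of PLINK. If some of those numbers denote sex chromosomes or mitochondria in your data, adjust the threshold logic accordingly.

```python
import re

# A purely numeric chromosome code at the start of a line, followed by whitespace.
_NUMERIC_CODE = re.compile(r"^(\d+)(?=\s)")


def rename_numeric_scaffolds(lines, n_autosomes, prefix="scaffold"):
    """Yield lines with numeric codes above n_autosomes turned into e.g. 'scaffold96'.

    Assumption: anything numbered above n_autosomes is a scaffold, not a
    sex chromosome or mitochondrial code.
    """
    for line in lines:
        match = _NUMERIC_CODE.match(line)
        if match and int(match.group(1)) > n_autosomes:
            # Prepending the prefix to the line renames only the first field.
            line = prefix + line
        yield line


def rename_file(in_path, out_path, n_autosomes):
    """Stream a .tped/.map/.bim file through the rename (paths are hypothetical)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(rename_numeric_scaffolds(src, n_autosomes))


# Example call with hypothetical file names:
# rename_file("all_emmax_format_12.tped", "renamed.tped", n_autosomes=19)
```

The .tfam file has no chromosome column, so only the .tped needs rewriting; the renamed copy must share its prefix with a copy of the .tfam for --tfile to find both.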

Luis Cueto

Oct 10, 2023, 12:01:22 PM
to plink2-users
Hi Christopher,
I am trapped in a loop. I have to add letters to the chromosome names to create the .ped file, but when I try to use that .ped file to run Admixture, it asks for integer-only chromosome names.
I am working with 1265 scaffolds.
Any recommendation? 

Christopher Chang

Oct 10, 2023, 12:10:49 PM
to plink2-users
0. You should prefer .bed+.bim+.fam to .ped+.map >99% of the time in 2023.  .ped+.map loses minor allele codes and is also far less efficient in both space and time.
1. Right before handing off the data to Admixture, you can use "--allow-extra-chr 0" to convert all the scaffold codes to "0", which Admixture should accept.
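As a rough illustration of point 1, the Python sketch below only assembles the PLINK command line; it does not run it. The fileset prefixes and the helper name are hypothetical, and it assumes a binary fileset read with --bfile and rewritten with --make-bed. Any other flags you normally need (for example --chr-set) would have to be added.

```python
import shlex


def plink_zero_extra_chr_command(in_prefix, out_prefix, plink="plink"):
    """Build a PLINK command that rewrites a fileset with extra chromosome codes set to 0.

    in_prefix and out_prefix are hypothetical .bed/.bim/.fam prefixes.
    """
    return [
        plink,
        "--bfile", in_prefix,        # input .bed + .bim + .fam
        "--allow-extra-chr", "0",    # treat unrecognized chromosome codes as 0
        "--make-bed",                # write a new binary fileset
        "--out", out_prefix,
    ]


cmd = plink_zero_extra_chr_command("scaffold_data", "admixture_input")
print(shlex.join(cmd))
```

To actually run it, the list could be passed to subprocess.run(cmd, check=True), after which the new .bed prefix would be the one handed to Admixture.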

Luis Cueto

Oct 10, 2023, 2:05:53 PM
to plink2-users

Thanks for your prompt reply!
It worked!