Doesn't --output-missing-genotype work for 0?

Sheng Luan

Jun 19, 2015, 1:04:15 AM
to plink2...@googlegroups.com
I set the missing genotype to 00 in the .ped file (e.g. CG GG 00). I then tried to set the missing genotype to 5 in the .raw file (0 1 2 5) using --recode A, but NA was still written as the missing value in the .raw file, even though --output-missing-genotype 5 was on the command line.



C:\plink_win64>plink --ped tsnpsMatrix.ped --map tsnpsMatrix.map --no-parents --no-sex --no-pheno --no-fid --geno 0.1 --maf 0.05 --output-missing-genotype 5 --recode A --out tsnpsMatrix

Christopher Chang

Jun 19, 2015, 1:10:37 AM
to plink2...@googlegroups.com, luan...@gmail.com
"--recode A" ignores the missing genotype setting and always uses 'NA', since the default '0' value doesn't work (it's a valid allele count).  So you're best off using e.g. sed to convert the NAs to whatever you want.

Sheng Luan

Jun 19, 2015, 1:19:33 AM
to plink2...@googlegroups.com, luan...@gmail.com
Sorry, I didn't express myself clearly. I want to set the missing genotype to 5 in the .raw file. Is that not possible?
Like this:
--output-missing-genotype 5 

On Friday, June 19, 2015 at 1:10:37 PM UTC+8, Christopher Chang wrote:

Christopher Chang

Jun 19, 2015, 1:20:25 AM
to plink2...@googlegroups.com, luan...@gmail.com
I already explained why "--recode A" ignores the --output-missing-genotype setting.  Use sed.

Sheng Luan

Jun 19, 2015, 1:46:18 AM
to plink2...@googlegroups.com, luan...@gmail.com
Thank you very much. I am new to plink and SNP data. Is sed a software package or a parameter in plink? Thank you!


On Friday, June 19, 2015 at 1:20:25 PM UTC+8, Christopher Chang wrote:

Christopher Chang

Jun 19, 2015, 1:59:38 AM
to plink2...@googlegroups.com, luan...@gmail.com
sed is a Unix utility that makes it easy to perform text replacement operations like changing all instances of "NA" to "5".  http://www.grymoire.com/Unix/Sed.html is a decent tutorial; the most important part is "The essential command: s for substitution".
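
As a small illustration of the s (substitute) command, here is a one-line sketch run on a made-up sample line rather than on real plink output. The trailing g makes sed replace every match on the line, not just the first one.

```shell
# Made-up sample line: three allele counts and one missing value.
# s/NA/5/g replaces every occurrence of "NA" on each input line with "5".
printf '0 1 NA 2\n' | sed 's/NA/5/g'
```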

http://www.mingw.org/wiki/msys provides one way to install sed and a Unix-style command prompt on Windows.  Once it is installed, you can run the following:

plink --ped tsnpsMatrix.ped --map tsnpsMatrix.map --no-parents --no-sex --no-pheno --no-fid --geno 0.1 --maf 0.05 --recode A --out tsnpsMatrix
cat tsnpsMatrix.raw | sed s/NA/5/g > tsnpsMatrix.raw5

tsnpsMatrix.raw5 will then contain the '5's you want.

One warning: if any of your SNP IDs, FIDs, or IIDs contain 'NA' in the middle, you will need to do a bit more work.
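
One possible way to handle that case is sketched below. It uses awk, another standard Unix utility, instead of sed, and it rests on two assumptions that should be checked against your own file: that the first line of the .raw file is a header, and that the genotype columns start at column 7 (after FID, IID, PAT, MAT, SEX and PHENOTYPE). The function name recode_missing is made up for this example.

```shell
# Hypothetical helper: replace missing genotypes with 5 in a .raw file.
# Assumes line 1 is the header and genotype columns start at column 7.
recode_missing() {
  awk '
    NR == 1 { print; next }        # leave the header (SNP IDs) untouched
    {
      for (i = 7; i <= NF; i++)    # skip the six leading ID/phenotype columns
        if ($i == "NA") $i = "5"   # only a field that is exactly "NA" is changed
      print
    }
  ' "$@"
}
```

It can then be used in place of the sed step, e.g. `recode_missing tsnpsMatrix.raw > tsnpsMatrix.raw5`. Because only whole fields equal to NA are replaced, an ID such as NA12878 is left as it is. Lines that awk modifies are written out with single spaces between fields.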

Sheng Luan

Jun 22, 2015, 10:24:53 PM
to plink2...@googlegroups.com, luan...@gmail.com
I have downloaded sed for Windows. It is very cool. Thank you very much.

On Friday, June 19, 2015 at 1:59:38 PM UTC+8, Christopher Chang wrote: