Error when I tried to use Irregularly-formatted PLINK text files

Xu Zhang

Apr 24, 2015, 5:52:35 PM
to plink2...@googlegroups.com
Hi all,

I downloaded the test data named 'toy' from the plink website. Then I deleted the first column in the ped file and used the command 'plink --file toy --no-fid --assoc --out toy' to analyze the data.

Then I got an error:

PLINK v1.90b1g 64-bit (10 Jun 2014)        https://www.cog-genomics.org/plink2
(C) 2005-2014 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to toy.log.
16302 MB RAM detected; reserving 8151 MB for main workspace.
Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.
Rescanning .ped file... 0%
Error: Half-missing call in .ped file at variant 1, line 1.

I also tried --no-parents, and it showed the same error.

When I tried the old version, v1.07, it worked well with --no-fid and --no-parents.

Is there a solution for data that doesn't have FID and parent information?

PS: When I add the FID and parent columns as 0, it works well, so I can keep working on my data. I am just curious why it showed this error.

Christopher Chang

Apr 24, 2015, 5:59:05 PM
to plink2...@googlegroups.com
The build you are using is almost one year old.  Check if this problem still occurs in an Apr 2015 build.

Xu Zhang

Apr 25, 2015, 2:19:12 AM
to plink2...@googlegroups.com
It works well! Thank you.

Hao-Xun Chang

Aug 26, 2015, 3:04:31 PM
to plink2-users
Hi all,

I encountered a similar problem, but my PLINK version is up to date:

PLINK v1.90b3v 64-bit (15 Jul 2015)        https://www.cog-genomics.org/plink2
(C) 2005-2015 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --file GWAS_FvNIS1

96730 MB RAM detected; reserving 48365 MB for main workspace.
Allocated 36273 MB successfully, after larger attempt(s) failed.
Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.
Rescanning .ped file... 0%
Error: Half-missing call in .ped file at variant 4, line 1.

Is there anything I should check? I appreciate any suggestions. Thank you very much.

Christopher Chang

Aug 26, 2015, 4:06:54 PM
to plink2-users
Hi,

Can you post or send me the first line of your .ped file?

Hao-Xun Chang

Aug 26, 2015, 5:33:58 PM
to plink2-users
Hello Dr. Chang,

I noticed that I used 0/1/2 numeric genotypes in my previous .ped file, which may be the cause of the error. After I reformatted my .ped file, the error disappeared, although a new error showed up. I'll ask about that in another thread. Thanks.

Maryiam Shöâeè

Sep 5, 2015, 4:48:32 PM
to plink2-users
Thank you for this post. It really helped. 

ryo...@unlv.edu

Mar 5, 2016, 7:37:13 PM
to plink2-users
I'm having a similar issue.

Version: PLINK v1.90b3.31 64-bit (3 Feb 2016)

Command used to make the .ped file:

plink --vcf ALL.chr15.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --recode --out ALL.chr15

I can post some of the.. or upload it to Dropbox. Any help would be great. Thank you in advance.

Christopher Chang

Mar 6, 2016, 2:16:00 AM
to plink2-users
I'll take a look if you upload it to Dropbox.

N

May 10, 2016, 3:52:59 AM
to plink2-users
Hi,
I'm getting the same error. I got the following message:

> plink --file combined --make-bed --out combined_bin
PLINK v1.90b3.36 64-bit (31 Mar 2016)      https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to combined_bin.log.
Options in effect:
  --file combined
  --make-bed
  --out combined_bin

32133 MB RAM detected; reserving 16066 MB for main workspace.
Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.
Rescanning .ped file... 0%
Error: Half-missing call in .ped file at variant 110298, line 1.

It's a big ped file, but it seems to be correct:

> cut -d' ' -f220596-220616 combined.ped | head -1
T A G C C 0 T A C A G T C C C C C C C T C

You can see that variant 110298 (110298 * 2 + 6 = 220602) is missing, but it should be accepted. Isn't that so?

Christopher Chang

May 10, 2016, 6:12:43 PM
to plink2-users
Missing calls are supposed to be coded as "0 0" in .ped files; "C 0" or "0 T" are half-missing.
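
To make the rule above concrete, here is a minimal Python sketch (not part of PLINK) that maps a variant number to its two allele columns and lists the half-missing calls on one .ped line. It assumes the six standard leading columns, whitespace-separated tokens, and "0" as the missing code; the function names and the sample line are made up for illustration. Applied to the output posted above, variant 110298 sits in columns 220601-220602, which read "0 T" if cut's fields line up with PLINK's tokens.

```python
def allele_columns(variant_index, leading_columns=6):
    """Return the 1-based columns holding the two alleles of a 1-based variant."""
    first = leading_columns + 2 * (variant_index - 1) + 1
    return first, first + 1


def half_missing_calls(ped_line, leading_columns=6, missing="0"):
    """Return 1-based variant numbers where exactly one allele is missing."""
    tokens = ped_line.split()          # split on any run of whitespace
    alleles = tokens[leading_columns:]
    bad = []
    for i in range(0, len(alleles) - 1, 2):
        a, b = alleles[i], alleles[i + 1]
        # "0 0" is a missing call, "C 0" or "0 T" is half-missing
        if (a == missing) != (b == missing):
            bad.append(i // 2 + 1)
    return bad


print(allele_columns(110298))  # (220601, 220602)

# Hypothetical line: variant 2 is properly missing, variant 3 is half-missing.
line = "1 S1 0 0 0 -9 T A 0 0 C 0 G G"
print(half_missing_calls(line))  # [3]
```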

Paula Andrea

Aug 18, 2016, 2:54:35 AM
to plink2-users
Hi Christopher,

Would you explain this a bit more, please? What should a missing call look like in the .ped file?
I have "0" for missing. Should it really be "0 0"? That didn't work for me.
I'm using the newest version as of now: PLINK v1.90b3.40 64-bit (16 Aug 2016)

Christopher Chang

Aug 18, 2016, 12:18:49 PM
to plink2-users
Yes, it should be "0 0".  Can you send me a small example fileset where "0 0" does not appear to be working?

Paula Andrea

Aug 18, 2016, 8:18:11 PM
to plink2-users

Thanks, Christopher.

Please find the test files attached; there are only 3 samples.
I used this command:
"/plink1.9/plink --file /data/output1/testplink/batch_1001.plink --cluster --allow-extra-chr --no-fid --no-parents --no-sex --no-pheno --out try3 --write-snplist --missing --distance ibs flat-missing square --pca"

With "0" as missing I get: Half-missing call in .ped file (please see the original ped file).
With "0 0" I get:
Error: Line 2 of .ped file has more tokens than expected.

What else could it be?
testplink.zip

Christopher Chang

Aug 18, 2016, 10:58:56 PM
to plink2-users
In original-batch_1001.plink.ped, the 0's already occur in pairs; you don't want to replace them with "0 0".

Your command line should work on the original .ped/.map pair if you remove --no-fid/--no-parents/--no-pheno/--no-sex; all of those columns are actually present, they're just zeroed out (well, FID is always 1).
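
One way to check whether any --no-* flag is needed is to count how many tokens precede the genotypes. The Python sketch below is only an illustration, with hypothetical function names and a made-up sample line. It assumes every variant in the .map file contributes exactly two allele tokens, and that each flag removes the columns its name suggests (--no-fid one, --no-parents two, --no-sex one, --no-pheno one).

```python
def expected_leading_columns(no_fid=False, no_parents=False,
                             no_sex=False, no_pheno=False):
    """Number of columns expected before the first allele."""
    columns = 6  # FID, IID, father ID, mother ID, sex, phenotype
    if no_fid:
        columns -= 1
    if no_parents:
        columns -= 2
    if no_sex:
        columns -= 1
    if no_pheno:
        columns -= 1
    return columns


def actual_leading_columns(ped_line, n_variants):
    """Tokens left over once two alleles per .map variant are accounted for."""
    return len(ped_line.split()) - 2 * n_variants


# Made-up line: all six leading columns present but zeroed out, two variants.
line = "1 sample_1 0 0 0 0 A A G G"
print(actual_leading_columns(line, n_variants=2))             # 6
print(expected_leading_columns())                             # 6
print(expected_leading_columns(no_fid=True, no_parents=True,
                               no_sex=True, no_pheno=True))   # 1
```

If the two counts disagree, the flags describe a different layout from the one in the file, and the leftover leading tokens would be read as alleles.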

Paula Andrea

Aug 19, 2016, 12:22:12 AM
to plink2-users
Great!!!! Thank you, Christopher. It works! :D

I thought I had to flag all the unavailable information using all of those --no-* parameters.

Have a great day!

Paula

Greg McInnes

Oct 12, 2018, 9:20:53 AM
to plink2-users
Hi Chris, I'm getting the same error, but I don't see any half-missing variants at the specified site.

$ plink --make-bed --file stl_pgrn --out stl_pgrn --missing
PLINK v1.90b5 64-bit (14 Nov 2017)             www.cog-genomics.org/plink/1.9/
(C) 2005-2017 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to stl_pgrn.log.
Options in effect:
  --file stl_pgrn
  --make-bed
  --missing
  --out stl_pgrn

16384 MB RAM detected; reserving 8192 MB for main workspace.
Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.
Rescanning .ped file... 75%
Error: Half-missing call in .ped file at variant 210, line 280.

$ head -n 280 stl_pgrn.ped | tail -n 1 | cut -f 420-450 -d' '
C G G T T A A 0 0 0 0 0 0 0 0 0 0 T T 0 0 0 0 0 0 0 0 G G C C

There are missing calls there, but they don't appear to be half-missing.

Thanks

Christopher Chang

Oct 12, 2018, 11:34:29 AM
to plink2-users
Hi,

Have you checked whether there are two consecutive spaces somewhere on that line?  That would account for the discrepancy between cut -d' ' and plink's tokenization.

If that isn't the problem, can you send me a file to replicate this with?
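
The discrepancy described above can be reproduced with a small Python sketch. It assumes, as the reply implies, that PLINK treats a run of spaces as one separator, while cut -d' ' treats every single space as a delimiter and therefore yields an empty field between two consecutive spaces. The helper name and the sample line are made up.

```python
def empty_cut_fields(line):
    """Return 1-based positions of the empty fields that cut -d' ' would see."""
    fields = line.rstrip("\n").split(" ")   # every single space is a delimiter
    return [i for i, field in enumerate(fields, start=1) if field == ""]


# Hypothetical .ped line with two spaces after the first genotype pair.
line = "FAM1 S1 0 0 1 -9 A A  0 0 G G"

print(len(line.split(" ")))    # 13 fields, counted the way cut -d' ' counts
print(len(line.split()))       # 12 tokens when runs of whitespace collapse
print(empty_cut_fields(line))  # [9]
```

Every field after the empty one is shifted by one position in the cut output, so a column range computed from the variant number no longer points at the genotype PLINK complains about.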

Shicheng Guo

Oct 18, 2018, 1:34:49 AM
to Christopher Chang, plink2...@googlegroups.com
Dear Chris, 

Here are the VCF files created from a CoreExome array. However, I found lots of 'dot' entries in the alternative allele column. I know the real alleles are not 'dot', so I want to update them using the 1000 Genomes VCF files.

Do you think there is a quick way to do this with PLINK?

image.png

Thanks.

Shicheng

Christopher Chang

Oct 18, 2018, 12:08:14 PM
to plink2-users
You can use Plink 2.0's --alt1-allele flag to fill these in from a reference VCF.

Anuj Kumar

Jan 29, 2019, 10:53:34 AM
to plink2-users
Hi Chang,

This is Anuj Kumar from the University of Arkansas. I am trying to run PLINK on a big data set for LD analysis. I created both files (.map and .ped) from a HapMap file in R. When I start PLINK with these files, this error comes up:
"Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.
Rescanning .ped file... 0%
Error: Line 1 of .ped file has fewer tokens than expected"
I tried to resolve this error but failed. Can you please help me find the issue? Here is a small sample of both files:
map file-
"1" "S1_1203" 0 1203
"1" "S1_1249" 0 1249
"1" "S1_1266" 0 1266
"1" "S1_1277" 0 1277
"1" "S1_1278" 0 1278
"1" "S1_1285" 0 1285
"1" "S1_1297" 0 1297
"1" "S1_1299" 0 1299
"1" "S1_1325" 0 1325
"1" "S1_1335" 0 1335
"1" "S1_1337" 0 1337
ped file-
"FAM1" "AP02_S3" "0" "0" "0" "24.43454508" "N" "N" "N" "N" "G" "N" "G" "T" "N" "N" "G"
"FAM2" "AP03_S4" "0" "0" "0" "17.66774924" "T" "A" "G" "T" "G" "0" "G" "T" "C" "G" "G"
"FAM3" "AP04_S1" "0" "0" "0" "13.41274635" "N" "N" "G" "N" "N" "N" "G" "T" "C" "G" "G"
"FAM4" "AP06_S6" "0" "0" "0" "14.67419332" "C" "C" "A" "C" "N" "G" "N" "N" "C" "G" "N"
"FAM5" "AP15_S6" "0" "0" "0" "12.80269602" "N" "A" "G" "T" "G" "0" "G" "T" "T" "T" "G"


Thanks for the help
Regards
Anuj

Christopher Chang

Jan 29, 2019, 11:37:58 AM
to plink2-users
You need to use a different method of creating your map and ped files.  They aren't supposed to have double-quotes, and I assume there's another mistake causing the "fewer tokens than expected" issue since you clearly aren't using a well-tested method of exporting to plink.
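
As a rough illustration of both points, the Python sketch below strips the double quotes from a line and compares the token count with what an 11-variant .map file implies (six leading columns plus two alleles per variant). The helper names are hypothetical, and the posted excerpt may have been shortened, so the numbers only describe the sample shown in the question.

```python
def strip_quotes(line):
    """Remove surrounding double quotes from every whitespace-separated token."""
    return " ".join(token.strip('"') for token in line.split())


def expected_token_count(n_variants, leading_columns=6):
    """Tokens a .ped line needs: leading columns plus two alleles per variant."""
    return leading_columns + 2 * n_variants


# First .ped line and variant count taken from the excerpt in the question.
ped_line = ('"FAM1" "AP02_S3" "0" "0" "0" "24.43454508" '
            '"N" "N" "N" "N" "G" "N" "G" "T" "N" "N" "G"')
tokens = strip_quotes(ped_line).split()

print(strip_quotes('"1" "S1_1203" 0 1203'))  # 1 S1_1203 0 1203
print(len(tokens))                           # 17
print(expected_token_count(11))              # 28
```

In this excerpt each line carries 11 allele tokens for 11 variants, i.e. one allele per variant rather than two, which would be consistent with the "fewer tokens than expected" message.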