Error: Line 1 of .ped file has more tokens than expected

Ana Marija

Nov 21, 2018, 4:53:46 PM
to plink2-users
Hello,

I am trying to convert a .ped/.map file pair into a .vcf file with:
module load plink/1.90
module load vcftools

plink --ped finalEDIC.chr1.pedc --map finalEDIC.chr1.map --recode vcf --out finalEDIC.chr1

But I am getting this error:
Rescanning .ped file... 0%
Error: Line 1 of .ped file has more tokens than expected.

The format of the .ped file is shown in the attachment.

Is there another way to convert .ped/.map to .vcf, or is there something wrong with this .ped format?

Thanks
Ana
Screen Shot 2018-11-21 at 3.28.30 PM.png

Christopher Chang

Nov 21, 2018, 9:32:46 PM
to plink2-users
There appear to be two issues here.

1. .ped files are normally assumed to contain 6, not 5, columns before the genotypes begin.  The .ped in your screenshot appears to be missing the phenotype (6th) column.
2. However, plink complained about seeing more, not fewer, tokens than expected in the line.  This implies a mismatch with the .map file: each .ped line is supposed to have 2n + 6 columns, where n is the number of lines in the .map file.  If the missing phenotype column was the only problem, there would be 2n + 5 columns, which would be fewer than the expected 2n + 6.

Ana Marija

Nov 21, 2018, 10:03:26 PM
to chrch...@gmail.com, plink2...@googlegroups.com
Hi,

Thank you so much for your reply. This is what I have:
number of columns in .ped:
$ awk '{print NF}' finalEDIC.chr1.ped | sort -nu | tail -n 1
78175

number of lines in .map file:
$ wc -l  finalEDIC.chr1.map
39085 finalEDIC.chr1.map

So n = 39085, and the expected number of columns in the .ped is 2n + 6 = 78176, one more than the 78175 found.

Do you know what I can do about this?

Thanks
Ana

--
You received this message because you are subscribed to the Google Groups "plink2-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plink2-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christopher Chang

Nov 22, 2018, 9:20:21 PM
to plink2-users
Okay, it looks like the second part of my previous message was incorrect: your lines do have exactly 2n + 5 tokens.  (There was an oversight in the .ped-validation code which caused an inappropriate error message to be printed here; this will be corrected in the next build.)  In this case, you should be able to use the --no-pheno flag to import this nonstandard .ped.
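
The column arithmetic above can be checked mechanically. The following is a minimal sketch, not part of plink: check_ped_tokens is a hypothetical helper name, and counting only non-blank .map lines toward n is an assumption about how plink reads the file.

```shell
# Hypothetical helper: compare the token count on line 1 of a .ped
# with the number of variants listed in the matching .map.
# Usage: check_ped_tokens FILE.ped FILE.map
check_ped_tokens() {
  ped=$1
  map=$2
  # Whitespace-delimited tokens on the first .ped line.
  tokens=$(awk 'NR == 1 { print NF; exit }' "$ped")
  # n = number of non-blank .map lines (assumption: blank lines are not variants).
  n=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$map")
  # Columns left over once the 2n genotype columns are accounted for.
  lead=$((tokens - 2 * n))
  case $lead in
    6) echo "ok: 6 leading columns (standard .ped)" ;;
    5) echo "5 leading columns: phenotype column may be missing, try --no-pheno" ;;
    *) echo "mismatch: $tokens tokens vs n=$n (tokens - 2n = $lead)" ;;
  esac
}
```

For example: check_ped_tokens finalEDIC.chr1.ped finalEDIC.chr1.map. A result other than 5 or 6 leading columns suggests that the .ped and .map disagree about the number of variants.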

Ana Marija

Nov 22, 2018, 9:32:56 PM
to chrch...@gmail.com, plink2...@googlegroups.com
Hi,

Thanks again. Where would I put that flag?
Maybe like this?

plink --ped --no-pheno finalEDIC.chr1.pedc --map finalEDIC.chr1.map --recode vcf --out finalEDIC.chr1

Christopher Chang

Nov 23, 2018, 12:00:09 AM
to plink2-users
Switch the order of the filename and the --no-pheno flag.

To understand why this is necessary, it may be useful to rewrite the corrected command line as follows:

plink \
  --ped finalEDIC.chr1.pedc \
  --no-pheno \
  --map finalEDIC.chr1.map \
  --recode vcf \
  --out finalEDIC.chr1

(Ending a line with '\' lets you split a command across several lines in most shells.)

Ana Marija

Nov 23, 2018, 9:57:11 PM
to Christopher Chang, plink2...@googlegroups.com
Hi,

I ran it as you suggested:
plink --ped finalEDIC.chr1.ped --no-pheno --map finalEDIC.chr1.map --recode vcf --out finalEDIC.chr1

but I got the same error again:
128952 MB RAM detected; reserving 64476 MB for main workspace.
Possibly irregular .ped line.  Restarting scan, assuming multichar alleles.

Rescanning .ped file... 0%
Error: Line 1 of .ped file has more tokens than expected.

Any suggestion would be appreciated.

Thanks
Ana

Christopher Chang

Nov 24, 2018, 7:40:33 PM
to plink2-users
I’d need to look at your .map file and the first line of your .ped to investigate this further.

Ana Marija

Nov 24, 2018, 7:56:19 PM
to Christopher Chang, plink2...@googlegroups.com
Hi,

My .map file is attached, and the first line of my .ped file is attached as well.

Thank you so much for looking into this!

finalEDIC.chr1.map
first

Christopher Chang

Nov 25, 2018, 11:35:24 PM
to plink2-users
Okay, the problem is that there are 34 blank lines scattered within the .map file; you'll need to investigate why they're there.
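
To see how blank lines change the token count that is expected, the .map file can be summarized as below. This is a sketch under assumptions: map_summary is a hypothetical helper name, and it assumes blank lines do not count as variants.

```shell
# Hypothetical helper: summarize a .map file and the token count it implies.
# Usage: map_summary FILE.map LEADING_COLUMNS   (6 normally, 5 with --no-pheno)
map_summary() {
  awk -v lead="$2" '
    NF == 0 { blank++ }   # whitespace-only line
    NF > 0  { n++ }       # line that describes a variant
    END {
      printf "lines=%d blank=%d variants=%d expected_tokens=%d\n",
             NR, blank, n, 2 * n + lead
    }' "$1"
}
```

With the numbers reported earlier in this thread (39085 lines, 34 of them blank), this would give variants=39051 and, with 5 leading columns, expected_tokens=78107, which is fewer than the 78175 tokens found on the .ped line.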

Ana Marija

Nov 26, 2018, 6:33:25 PM
to Christopher Chang, plink2...@googlegroups.com
Hi Christopher,

I made my .map file like this:

awk '   BEGIN { while ((getline <"bed_chr_22.bed") > 0) {REC[$4]=$0}}
    {print REC[$2]}' < finalEDIC.chr22.dat > out22
awk '{$2=$6=""; print $0}' out22 | awk '{ print $1 " " $3 " " $4 " " $2}' >tmp22
sed "s/chr/ /g" tmp22 > finalEDIC.chr22.map

I can see there are 7 empty lines in this file:
grep -cvE '\S' finalEDIC.chr22.map
7
and zero empty lines in finalEDIC.chr22.dat and bed_chr_22.bed.

I have attached samples showing what the .dat and .bed files look like.

Do you know why I might have this problem?

Thanks
Ana

bam.txt
dat.txt

Ana Marija

Nov 27, 2018, 1:14:18 PM
to Christopher Chang, plink2...@googlegroups.com
Hi Christopher,

I cleaned the .bed file (converted all tabs to spaces) with perl:
perl -p -i -e 's/\t/ /g' test.bed

Then I created the .map file again:
awk '   BEGIN { while ((getline <"test.bed") > 0) {REC[$4]=$0}}
    {print REC[$2]}' < test.chr1.dat > out1
awk '{$2=$6=""; print $0}' out1 | awk 'BEGIN {OFS = " "}{ print $1,$3,$4,$2}' >tmp1
sed "s/chr/ /g" tmp1 > test.chr1.map
sed -i 's/^ *//' test.chr1.map

but test.chr1.map again has 34 empty lines.

I printed the line numbers of those empty lines:
awk '(NF==0){print NR}' test.chr1.map
1955
2119
3431
4027
4777
4880
5648
8364
11562
13020
13735
14168
14988
17249
17914
20560
20888
21056
21504
24523
24623
24627
26025
26571
30692
31565
31620
33049
33738
33752
35166
36376
37621
37876

It seems those 34 empty lines already appear in the file out1, right after this awk command:
awk '   BEGIN { while ((getline <"test.bed") > 0) {REC[$4]=$0}}
    {print REC[$2]}' < test.chr1.dat > out1

test.chr1.dat always has 2 columns and zero empty lines, and test.bed always has 6 columns and zero empty lines.
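
One possible explanation: an empty output line does not require an empty input line. In the awk command above, REC[$2] is an empty string whenever an ID from column 2 of the .dat file never appeared in column 4 of the .bed file, and print then emits a blank line. The sketch below lists such IDs. missing_ids is a hypothetical helper name, and it assumes the .bed file is non-empty and that IDs must match exactly (same case, no stray characters).

```shell
# Hypothetical helper: list .dat IDs (column 2) that are absent from
# column 4 of the .bed, together with their line number in the .dat.
# Usage: missing_ids FILE.bed FILE.dat
missing_ids() {
  awk '
    NR == FNR { seen[$4] = 1; next }      # first file: remember every .bed ID
    !($2 in seen) { print FNR ": " $2 }   # second file: report unmatched IDs
  ' "$1" "$2"
}
```

If the line numbers printed by missing_ids test.bed test.chr1.dat match the blank line numbers in out1, the blank lines come from IDs that have no entry in the .bed file rather than from the later awk and sed steps.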

I will look into this further, but please let me know if you have any other ideas.

Thanks
Ana

Ruth Waineina

Aug 7, 2019, 5:20:38 AM
to plink2-users

Hello,
I am facing the same problem: an error saying that line 1 of the .ped file has more tokens than expected.

I would appreciate any help.

Waqas Khan

Apr 1, 2023, 10:34:17 PM
to plink2-users
.ped file 
ERROR:
Problem with line 1 in [ C:\Users\Shadow\Desktop\plink-1.07-dos\snp.ped ]
Expecting 6 + 2 * 5 = 16 columns, but found more
Screenshot (1113).png
.map file
Screenshot (1114).png
Can anyone explain why I am getting this error? The command should run: I have .ped and .map files in the expected format, but when I run them through plink I get this error.

Waqas Khan

Apr 2, 2023, 7:52:33 PM
to plink2-users
Thanks, I found the solution: it was my data. The number of SNPs in the .ped file was different.

Zuxi Cui

Apr 4, 2023, 11:53:52 AM
to Waqas Khan, plink2-users
Correct me if I’m wrong. It seems you have 5 SNPs in the map file.
However, you have more than 5*2=10 columns of nucleotides in the ped.

Terry 


Waqas Khan

Apr 4, 2023, 11:53:59 AM
to plink2-users
You were right about the data in the .ped file.
