No entries in pheno_M.txt correspond to loaded sample IDs.


Ana Marija

Aug 10, 2019, 4:32:20 PM
to plink2-users
Hi Chris,

I was running:

plink2 --threads 40 --vcf VCFchr22.vcf.gz --glm --pheno pheno_M.txt
--pheno-name pheno --out FINchr22

and I got:

PLINK v2.00a2LM 64-bit Intel (31 Jul 2019)
Options in effect:
  --glm
  --out FINchr22
  --pheno pheno_M.txt
  --pheno-name pheno
  --threads 40
  --vcf VCFchr22.vcf.gz

Hostname: kg15-9
Working directory: /cephfs/users/anamaria/bio
Start time: Sat Aug 10 14:59:02 2019

Random number seed: 1565467142
257920 MiB RAM detected; reserving 128960 MiB for main workspace.
Using up to 40 threads (change this with --threads).
--vcf: 70195 variants scanned.
--vcf: FINchr22-temporary.pgen + FINchr22-temporary.pvar +
FINchr22-temporary.psam written.
487409 samples (0 females, 0 males, 487409 ambiguous; 487409 founders) loaded
from FINchr22-temporary.psam.
70195 variants loaded from FINchr22-temporary.pvar.
Error: No entries in pheno_M.txt correspond to loaded sample IDs.

In pheno_M.txt I have 459032 subject IDs, and all of them can be found in
the .sample file, which has 487409 IDs. Can pheno_M.txt have fewer IDs than
the .sample file, or do they have to have the same number of IDs?

Thanks
Ana

Ana Marija

Aug 10, 2019, 5:42:46 PM
to plink2-users
I should mention that my pheno file looks like this and everything is
space delimited:

> a=read.table("pheno_M.txt", header=T)
> head(a)
FID IID pheno
1 1000017 1000017 -9
2 1000025 1000025 -9
3 1000038 1000038 1
4 1000042 1000042 -9
5 1000056 1000056 -9
6 1000074 1000074 -9

Ana Marija

Aug 11, 2019, 2:27:23 PM
to plink2-users
Hi Chris,

I am attaching the first 7 lines of my VCFchr22.vcf file.
Any idea what I am doing wrong here? I get the same error whether the
pheno file has space-separated FID, IID, and pheno columns or just
tab-separated IID and pheno columns.

Thanks
Ana
outvcf22.txt

Christopher Chang

Aug 11, 2019, 3:17:23 PM
to plink2-users
Have you tried reading what "plink2 --help vcf" says about IDs?

Ana Marija

Aug 11, 2019, 10:02:12 PM
to Christopher Chang, plink2-users
Hi Chris,

I did read this:

  --id-delim [d] : Normally parses single-delimiter sample IDs as
                   <FID><d><IID>, and double-delimiter IDs as
                   <FID><d><IID><d><SID>; default delimiter is '_'.
                   --id-delim can no longer be used with
                   --double-id/--const-fid; it will error out if any ID
                   lacks the delimiter.

If that is what you mean, I can see that the sample IDs in my VCF do use the default '_' delimiter.

What am I missing here?

This is how I created those VCF files:

plink2 --threads 8 --bgen ukb_imp_chr17_v3.bgen ref-first --sample
ukb44316_imp_chr17_v3_s487317.sample --extract extractTheseSNPs
--make-pgen --out ex17
plink2 --threads 8 --pgen ex17.pgen --psam ex17.psam --pvar ex17.pvar
--maf 0.01 --geno 0.05 --hwe 0.000001 --make-bpgen --out chr17
plink2 --threads 8 --bim chr17.bim --fam chr17.fam --pgen chr17.pgen
--export vcf bgz vcf-dosage=DS --out VCFchr17

Thanks
Ana
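
Note for readers: the --id-delim help text quoted above describes a simple splitting rule, which can be sketched in Python as follows. This is only an illustration of the rule as worded in the help text, not plink2's actual implementation. The names SampleId and parse_sample_id are hypothetical, and rejecting IDs with more than two delimiters is an assumption, since the help text does not cover that case.

```python
from typing import NamedTuple, Optional


class SampleId(NamedTuple):
    """Hypothetical container for the parts of a parsed sample ID."""
    fid: Optional[str]
    iid: str
    sid: Optional[str]


def parse_sample_id(raw: str, delim: str = "_") -> SampleId:
    """Split a sample ID following the rule stated in the --id-delim help text."""
    parts = raw.split(delim)
    if len(parts) == 2:
        # Single delimiter: <FID><d><IID>
        return SampleId(parts[0], parts[1], None)
    if len(parts) == 3:
        # Double delimiter: <FID><d><IID><d><SID>
        return SampleId(parts[0], parts[1], parts[2])
    if len(parts) == 1:
        # The help text says --id-delim errors out if any ID lacks the delimiter.
        raise ValueError(f"sample ID {raw!r} lacks delimiter {delim!r}")
    # Assumption: more than two delimiters is treated as an error in this sketch.
    raise ValueError(f"sample ID {raw!r} has too many {delim!r} delimiters")
```

Under this rule, a VCF sample ID such as 1000017_1000017 would yield FID 1000017 and IID 1000017, which is the form the IDs in pheno_M.txt take.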


Christopher Chang

Aug 11, 2019, 10:12:13 PM
to plink2-users
You used too old a plink2 build when performing the original --bgen import. But you still have all the information you need to use the old-style VCF IDs you have.

Christopher Chang

Aug 11, 2019, 10:15:58 PM
to plink2-users
Correction: if the .sample file had doubled IDs, ignore the --bgen import comment.

Ana Marija

Aug 11, 2019, 10:32:03 PM
to Christopher Chang, plink2-users
Hi Chris,

this is the version of plink2 I was using to create vcf files:
PLINK v2.00a2LM 64-bit Intel (31 Jul 2019)

My .sample files look like this (so yes, the first and second columns are the same):

ID_1 ID_2 missing sex
0 0 0 D
2743359 2743359 0 1
3055474 3055474 0 2
1804099 1804099 0 2
3971576 3971576 0 2

to create vcf files I was running this:

plink2 --threads 8 --bgen ukb_imp_chr17_v3.bgen ref-first --sample
ukb44316_imp_chr17_v3_s487317.sample --extract extractTheseSNPs
--make-pgen --out ex17
plink2 --threads 8 --pgen ex17.pgen --psam ex17.psam --pvar ex17.pvar
--maf 0.01 --geno 0.05 --hwe 0.000001 --make-bpgen --out chr17
plink2 --threads 8 --bim chr17.bim --fam chr17.fam --pgen chr17.pgen
--export vcf bgz vcf-dosage=DS --out VCFchr17

What else can be the issue?

And again I am running this with the same version of plink2 I was
using to create vcf files

plink2 --threads 40 --vcf VCFchr17.vcf.gz --pheno pheno_M.txt
--pheno-name pheno --glm --out FINchr17


Christopher Chang

Aug 12, 2019, 12:56:32 AM
to plink2-users
You've read what --id-delim does; why haven't you tried adding it to your command line yet?
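
Note for readers: one way to see why adding --id-delim matters is to compare the sample IDs in the VCF header against the IIDs in the phenotype file, first treating each whole VCF ID as the IID and then splitting it on the delimiter. The Python sketch below does that. It assumes the VCF sample IDs have the FID_IID form this thread implies (the attached header is not shown here, so the example IDs are illustrative), and the function names are hypothetical.

```python
from typing import Iterable, List, Optional, Set


def vcf_sample_ids(lines: Iterable[str]) -> List[str]:
    """Return the sample IDs from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # The first 9 tab-separated columns are fixed VCF fields; samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def pheno_iids(lines: Iterable[str]) -> Set[str]:
    """Collect the IID column of a whitespace-delimited phenotype file with a header row."""
    rows = iter(lines)
    header = [name.lstrip("#") for name in next(rows).split()]
    col = header.index("IID")
    return {fields[col] for fields in (row.split() for row in rows) if fields}


def count_matches(samples: Iterable[str], iids: Set[str],
                  delim: Optional[str] = None) -> int:
    """Count samples whose IID appears in iids.

    With delim=None the whole VCF sample ID is used as the IID; otherwise
    the ID is split on delim and the second field is used (FID<d>IID).
    """
    matched = 0
    for sample in samples:
        parts = sample.split(delim) if delim else [sample]
        iid = parts[1] if len(parts) >= 2 else parts[0]
        matched += iid in iids
    return matched


if __name__ == "__main__":
    # Illustrative data only: assumed FID_IID sample IDs, pheno rows as posted above.
    header = ("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
              "\t1000017_1000017\t1000025_1000025\n")
    pheno = ["FID IID pheno\n", "1000017 1000017 -9\n", "1000025 1000025 -9\n"]
    samples = vcf_sample_ids([header])
    iids = pheno_iids(pheno)
    print(count_matches(samples, iids))       # whole ID as IID: prints 0
    print(count_matches(samples, iids, "_"))  # split as FID_IID: prints 2
```

To run this against real files, the same functions can be fed open file handles, for example gzip.open("VCFchr17.vcf.gz", "rt") for the VCF and open("pheno_M.txt") for the phenotype file. A count of zero without splitting and a nonzero count with splitting would be consistent with the error reported at the top of this thread.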

