plink.prune files are all blank


geed...@gmail.com

Dec 6, 2016, 12:20:31 PM
to plink2-users
Hi,

I hope someone can help.

I have been using plink for various functions, but with my new dataset I keep getting blank files when I try to prune SNPs above an LD threshold.

The settings and output I am getting are as follows:


➜  plink ./plink --noweb --allow-no-sex --file GoodSNPs2batch_1 --indep-pairwise 50 5 0.5

Skipping web check... [ --noweb ]
Writing this text to log file [ plink.log ]
Analysis started: Tue Dec  6 17:11:41 2016

Options in effect:
        --noweb
        --allow-no-sex
        --file GoodSNPs2batch_1
        --indep-pairwise 50 5 0.5

2037 (of 2037) markers to be included from [ GoodSNPs2batch_1.map ]
Warning, found 84 individuals with ambiguous sex codes
Writing list of these individuals to [ plink.nosex ]
84 individuals read from [ GoodSNPs2batch_1.ped ]
0 individuals with nonmissing phenotypes
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
0 cases, 0 controls and 84 missing
0 males, 0 females, and 84 of unspecified sex
Before frequency and genotyping pruning, there are 2037 SNPs
84 founders and 0 non-founders found
Total genotyping rate in remaining individuals is 0.836676
0 SNPs failed missingness test ( GENO > 1 )
0 SNPs failed frequency test ( MAF < 0 )
After frequency and genotyping pruning, there are 2037 SNPs
After filtering, 0 cases, 0 controls and 84 missing
After filtering, 0 males, 0 females, and 84 of unspecified sex
Performing LD-based pruning...
Writing pruned-in SNPs to [ plink.prune.in ]
Writing pruned-out SNPs to [ plink.prune.out ]
Scanning from chromosome 0 to 0

Both the plink.prune.in and the plink.prune.out files are always blank. I have also tried another, smaller dataset which I know contains both linked and unlinked loci.

Is the issue likely due to missing data? The dataset has already been filtered to ensure that all loci have at least 75% of data present.

Apologies if this is a repeat post; I would be happy if someone could direct me to where this has been answered elsewhere.

Thanks.


Christopher Chang

Dec 6, 2016, 1:30:52 PM
to plink2-users
plink's LD-pruning routine does not consider "unplaced" (chromosome 0) variants.
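
The "Scanning from chromosome 0 to 0" line in the log above suggests that every marker in the .map file is coded as chromosome 0. A minimal sketch for checking this, assuming a standard 4-column .map file with the chromosome code in column 1 (the helper name and the demo file contents are made up for illustration):

```shell
# Count variants per chromosome code in a PLINK .map file (column 1).
# A code of 0 means "unplaced"; --indep-pairwise skips those variants.
count_by_chrom() {
  awk '{ n[$1]++ } END { for (c in n) print c, n[c] }' "$1" | sort -k1,1
}

# Tiny made-up .map file (chromosome, variant ID, cM, bp position).
demo_map=$(mktemp)
printf '0 snpA 0 0\n0 snpB 0 0\n1 snpC 0 1500\n' > "$demo_map"
chrom_counts=$(count_by_chrom "$demo_map")
echo "$chrom_counts"
rm -f "$demo_map"
```

If the only code reported is 0, the markers need real chromosome codes and positions before LD pruning can produce any output.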

ANUBHAB KHAN

May 18, 2017, 12:06:16 AM
to plink2-users
Hey,

Sorry, but I too get empty files, and I don't have any chromosome 0.
The log:

./plink --vcf rmheader.vcf --allow-extra-chr --out rmLD50 --indep-pairwise 50 5 0.5

PLINK v1.90b3.44 64-bit (17 Nov 2016)      https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to cat_var_depth20_Xfilt_rmLD50.log.
Options in effect:
  --allow-extra-chr
  --indep-pairwise 50 5 0.5
  --out cat_var_depth20_Xfilt_rmLD50
  --vcf cat_var_depth20_Xfilt.rmheader.vcf

6924847 MB RAM detected; reserving 3462423 MB for main workspace.
--vcf: cat_var_depth20_Xfilt_rmLD50-temporary.bed +
cat_var_depth20_Xfilt_rmLD50-temporary.bim +
cat_var_depth20_Xfilt_rmLD50-temporary.fam written.
22122337 variants loaded from .bim file.
4 people (0 males, 0 females, 4 ambiguous) loaded from .fam.
Ambiguous sex IDs written to cat_var_depth20_Xfilt_rmLD50.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 4 founders and 0 nonfounders present.
Calculating allele frequencies... done.
22122337 variants and 4 people pass filters and QC.
Note: No phenotypes present.
Pruned 2133289 variants from chromosome 27, leaving 120431.
Pruned 221 variants from chromosome 28, leaving 10.
Pruned 414 variants from chromosome 29, leaving 24.

William Gilks

May 18, 2017, 9:35:58 AM
to plink2-users
There could be something wrong with your ped file as indicated by: "After filtering, 0 cases, 0 controls and 84 missing"

Maybe change the phenotype value in the ped file (column 6) to "1", using:
 awk '{$6=1 ; print ;}' data.ped > newdata.ped
(You may have to specify in the awk command whether the input is tab- or space-delimited)
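
As a sketch of the delimiter point: with awk's default settings, assigning to a field rebuilds the line with single spaces, so a tab-delimited file would lose its tabs. Setting both the input and output separators keeps them. The helper name and the demo line below are made up for illustration:

```shell
# Set column 6 (phenotype) to 1 while keeping tab delimiters intact.
set_pheno_tab() {
  awk 'BEGIN { FS = OFS = "\t" } { $6 = 1; print }' "$1"
}

# One made-up, tab-delimited .ped line with two genotype columns.
demo_ped=$(mktemp)
printf 'F1\tI1\t0\t0\t0\t-9\tA A\tG G\n' > "$demo_ped"
fixed_line=$(set_pheno_tab "$demo_ped")
printf '%s\n' "$fixed_line"
rm -f "$demo_ped"
```

For a space-delimited file the original one-liner is sufficient, since awk's default separators already handle that case.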

Christopher Chang

May 18, 2017, 11:29:08 AM
to plink2-users
Hi Anubhab,

The log is ending at a weird place.  Can you send me a VCF file to replicate this with?  Thanks.

ANUBHAB KHAN

May 24, 2017, 9:36:02 PM
to plink2-users
Hi Christopher,

I am attaching the first 200 lines of my VCF. Could you please let me know what the error is?
Thanks a lot. Sorry for the delay.
test_200_depth20_Xfilt.rmheader.vcf

Christopher Chang

May 25, 2017, 11:45:37 AM
to plink2-users
This works properly for me, with the caveat that the output is useless because the variants don't have unique IDs.  You should either write a short script or use plink 2.0's --set-all-var-ids flag to give your variants unique IDs.
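
One possible form of the "short script" route, as a sketch only: an awk filter that overwrites the ID column of an uncompressed, tab-delimited VCF with CHROM_POS_REF_ALT. The helper name and the demo records are made up for illustration, and multiallelic sites would keep the comma in the ALT field.

```shell
# Replace the ID column (3) of each VCF data line with CHROM_POS_REF_ALT.
# Header lines (starting with '#') pass through unchanged.
assign_vcf_ids() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/ { print; next }
       { $3 = $1 "_" $2 "_" $4 "_" $5; print }' "$1"
}

# Made-up two-line VCF whose only variant has the placeholder ID ".".
demo_vcf=$(mktemp)
printf '#CHROM\tPOS\tID\tREF\tALT\ncontig1\t637\t.\tA\tG\n' > "$demo_vcf"
with_ids=$(assign_vcf_ids "$demo_vcf")
printf '%s\n' "$with_ids"
rm -f "$demo_vcf"
```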

ANUBHAB KHAN

May 26, 2017, 12:19:41 AM
to plink2-users
Okay, thanks a lot.

ANUBHAB KHAN

May 26, 2017, 6:57:13 AM
to plink2-users
Hi Christopher,

Now I have run into a new issue. The prune.in and prune.out files are created with entries that look like:

AANG03312636.1/637
AANG03312667.1/606
AANG03312690.1/4692
AANG03312690.1/4976
AANG03312690.1/6133
AANG03312690.1/6950
AANG03312720.1/545
AANG03312804.1/1893
AANG03312814.1/5193
AANG03312816.1/337


However, when I try to --extract the prune.in variants I get the following:

./plink --vcf rmheader.vcf --allow-extra-chr --out rmLD50 --extract rmLD50.prune.in --make-bed

PLINK v1.90b3.44 64-bit (17 Nov 2016)      https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to rmLD50.log.
Options in effect:
  --allow-extra-chr
  --extract rmLD50.prune.in
  --make-bed
  --out rmLD50
  --vcf cat_var_depth20_Xfilt.rmheader.vcf

6924847 MB RAM detected; reserving 3462423 MB for main workspace.
--vcf: rmLD50-temporary.bed + rmLD50-temporary.bim + rmLD50-temporary.fam
written.
22122337 variants loaded from .bim file.
4 people (0 males, 0 females, 4 ambiguous) loaded from .fam.
Ambiguous sex IDs written to rmLD50.nosex .
Error: No variants remaining after --extract.


However, I know that prune.out has 20428419 entries while prune.in has 1186467 entries. Could you please let me know if I am doing something wrong?

Christopher Chang

May 26, 2017, 11:27:43 AM
to plink2-users
You need to save the new variant IDs.  E.g.

plink2 --allow-extra-chr --vcf cat_var_depth20_Xfilt.rmheader.vcf --set-all-var-ids [...] --make-bed --out prefilter
plink2 --allow-extra-chr --bfile prefilter --extract rmLD50.prune.in --make-bed --out rmLD50

(You can also use plink 1.9 for the second command, but it will sometimes swap the ref/alt alleles, whereas 2.0 preserves their order.)
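
The "No variants remaining after --extract" error means that none of the IDs in the list matched an ID in the file being filtered. A rough way to check the overlap before running --extract, assuming a one-ID-per-line list and a .bim file with variant IDs in column 2 (the helper name and demo files are made up for illustration):

```shell
# Count how many IDs from a prune.in-style list (one ID per line)
# also appear in column 2 of a .bim file.
count_matching_ids() {
  awk 'NR == FNR { keep[$1]; next }
       ($2 in keep) { n++ }
       END { print n + 0 }' "$1" "$2"
}

# Made-up demo: two listed IDs, but only one exists in the .bim.
demo_dir=$(mktemp -d)
printf 'chr1_100_A_G\nchr1_200_C_T\n' > "$demo_dir/demo.prune.in"
printf 'chr1\tchr1_100_A_G\t0\t100\tG\tA\nchr1\t.\t0\t200\tT\tC\n' > "$demo_dir/demo.bim"
matched=$(count_matching_ids "$demo_dir/demo.prune.in" "$demo_dir/demo.bim")
echo "IDs found in the .bim: $matched"
rm -rf "$demo_dir"
```

A count of 0 would point to the list and the fileset using different ID schemes.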

ANUBHAB KHAN

May 29, 2017, 2:52:20 AM
to plink2-users
Hi Christopher,

Sorry for bothering you again but what are the parameters I need to set for --set-all-var-ids?
Could you explain with an example? Do I have to individually assign names to the variants?

Christopher Chang

May 29, 2017, 2:01:42 PM
to plink2-users
In bash,

--set-all-var-ids "@_#_\$r_\$a"

assigns names of the form [chromosome]_[position]_[reference allele]_[alternate allele] to all your variants.  I.e. '@' indicates where the chromosome code should go, '#' indicates where the position should go, '$r' and '$a' refer to the reference and alternate alleles, and backslashes are needed to tell bash to leave the '$'s alone (otherwise '$' has a special meaning).
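
A small illustration of the quoting point, which is about bash rather than plink: single quotes stop bash from expanding '$' at all, so the backslashes can be dropped. Both spellings below produce the same template string. The variable names are made up, and the file names in the commented-out command are placeholders.

```shell
# Backslash-escaped inside double quotes: bash passes the '$' through.
escaped="@_#_\$r_\$a"
# Single quotes: no expansion happens, so no backslashes are needed.
single='@_#_$r_$a'

printf '%s\n' "$escaped"
printf '%s\n' "$single"

# Either form could then be passed along, e.g. (placeholder file names):
# plink2 --vcf input.vcf --set-all-var-ids "$escaped" --make-bed --out prefilter
```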

ANUBHAB KHAN

Jun 1, 2017, 8:43:38 AM
to plink2-users
Thanks.
That works well.

O Kam

Dec 30, 2018, 11:00:57 PM
to plink2-users
Is there any way to include the rsID in the new ID construct, like [rsID]_[position]_[reference allele]_[alternate allele], for example to get IMPUTE2-style IDs? I checked the plink 1.9 and plink 2 documentation but didn't find any info on other allowed characters apart from @, #, $r and $a.

Christopher Chang

Dec 31, 2018, 12:54:02 PM
to plink2-users
No, because there's no way for plink to know the rsIDs. You should use your own script for that.

O Kam

Dec 31, 2018, 5:32:08 PM
to plink2-users
I am sorry, I was imprecise: I meant not the rsID but the "normal" ID which the --set-all-var-ids flag will overwrite, like [oldID]_[position]_[reference allele]_[alternate allele].

Christopher Chang

Dec 31, 2018, 8:02:15 PM
to plink2-users
There are no plans to support this; use a script.
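
A sketch of what such a script might look like for a binary fileset, rewriting only the .bim. It assumes column 2 is the existing ID, column 4 the position, column 6 the reference allele and column 5 the alternate allele; that allele order is an assumption, so swap $5 and $6 if your file differs. The helper name and the demo record are made up for illustration.

```shell
# Rewrite each .bim variant ID as [oldID]_[position]_[ref]_[alt].
# ASSUMPTION: column 6 = reference allele, column 5 = alternate allele.
rename_bim_ids() {
  awk 'BEGIN { OFS = "\t" } { $2 = $2 "_" $4 "_" $6 "_" $5; print }' "$1"
}

# Made-up single-variant .bim line.
demo_bim=$(mktemp)
printf '1\trs123\t0\t1000\tG\tA\n' > "$demo_bim"
renamed=$(rename_bim_ids "$demo_bim")
printf '%s\n' "$renamed"
rm -f "$demo_bim"
```

Since the variant order is unchanged, the rewritten .bim can stand in for the original alongside the same .bed and .fam files.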

Chris Chang

Oct 3, 2022, 2:44:50 PM
to Mahmoud ElSayed, plink2-users
The --set-all-var-ids documentation includes an example of a template string that can be passed to --set-all-var-ids/--set-missing-var-ids.

If you still have duplicate IDs after resetting the IDs, you can use --rm-dup to address the duplicates.
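
To see whether any duplicate IDs remain, a quick check along these lines may help. It is a sketch that assumes a .bim file with variant IDs in column 2; the helper name and demo records are made up for illustration.

```shell
# Print every variant ID that occurs more than once in a .bim file.
list_duplicate_ids() {
  awk '{ print $2 }' "$1" | sort | uniq -d
}

# Made-up demo: the first two records share an ID.
demo_bim=$(mktemp)
printf '1\tchr1_100_A_G\t0\t100\tG\tA\n1\tchr1_100_A_G\t0\t100\tG\tA\n1\tchr1_200_C_T\t0\t200\tT\tC\n' > "$demo_bim"
dups=$(list_duplicate_ids "$demo_bim")
printf '%s\n' "$dups"
rm -f "$demo_bim"
```

Empty output means every ID is already unique.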

On Sun, Oct 2, 2022 at 6:12 AM Mahmoud ElSayed <s-mahmou...@zewailcity.edu.eg> wrote:
Hi Christopher, 
I still don't know exactly what I should give as input for the "--set-all-var-ids" flag. Can you give me a real example, please?
My case is that I have a VCF file merged from multiple VCFs for different genes, and I am trying to do the indep-pairwise pruning. I have done the sorting and removed duplicates, but it now asks me to give unique variant IDs. Can you help me with that, please?

Thanks, 
