Error: Out of memory

Nooshin Abbasi

Jul 3, 2019, 2:18:07 PM

to plink2-users

Hello,

I am trying to merge the data for all chromosomes with the following command:

plink --merge-list all_files.txt --make-bed --out merged

When the "out of memory" error appeared, I added the --memory flag to assign more memory to the workspace, but I am still receiving the same error:

PLINK v1.90b4.6 64-bit (15 Aug 2017)
Options in effect:
--make-bed
--memory 200000
--merge-list all_files.txt
--out merged

Hostname: xxxx
Working directory: xxxx
Start time: Wed Jul 3 11:05:47 2019

Random number seed: 1562177147
257842 MB RAM detected; reserving 200000 MB for main workspace.
Allocated 6334 MB successfully, after larger attempt(s) failed.

Error: Out of memory. The --memory flag may be helpful.
Failed allocation size: 18276860288

End time: Wed Jul 3 11:07:37 2019

Could you please let me know why I receive this error and how I can solve it?

Thank you.

Nooshin

Nooshin Abbasi

Jul 3, 2019, 3:57:00 PM

to plink2-users

Alternatively, can I merge the VCF files first and then convert the result to binary files with the following command lines? (Although I guess this approach may require as much memory as merging the binary files does.)

grep '^#' chr1.filtered.vcf > merge.vcf
grep -h -v '^#' chr1.filtered.vcf chr2.filtered.vcf ... chr22.filtered.vcf >> merge.vcf

plink --vcf merge.vcf --make-bed --allow-extra-chr --out merge

Or could I merge the binary files of the chromosomes two at a time, iteratively, to build up all 22 chromosomes? (For example, 1 and 2 -> 12, 3 and 4 -> 34, then 12 and 34 -> 1234.)

Would either of these approaches help with the memory issue?

Looking forward to hearing from you.

Thank you

Christopher Chang

Jul 3, 2019, 7:21:16 PM

to plink2-users

The most likely reason for this error is the presence of a long variant ID. In particular, if your variant IDs look something like <chr>:<pos>_<ref>_<alt>, PLINK 1.9 will probably choke if you have any very long indels. This isn’t restricted to merging; you basically have to use PLINK 2.0 if you have any super-long variant IDs.
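One way to check whether long variant IDs are present is to scan each .bim file for its longest ID (column 2). The following is a minimal sketch, not a PLINK feature; the helper name longest_variant_id is made up for illustration.

```shell
# Print the length and text of the longest variant ID in a PLINK .bim file.
# Column 2 of a .bim file holds the variant ID.
# (Hypothetical helper; this is not a PLINK command.)
longest_variant_id() {
    awk '{ n = length($2); if (n > max) { max = n; id = $2 } }
         END { print max + 0, id }' "$1"
}
```

Calling it on one of the per-chromosome files, for example `longest_variant_id chr1.bim` (file name assumed), prints the maximum ID length followed by the ID itself. An unusually long ID that embeds the alleles of a long indel would match the pattern described above.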

While your first VCF-concatenation command lines should work, you probably want to use “bcftools concat” for the merge instead.
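A minimal sketch of the "bcftools concat" route, assuming the per-chromosome files are named chr1.filtered.vcf through chr22.filtered.vcf as in the commands quoted earlier, and that every file contains the same samples in the same order. The helper names and the file names vcf_list.txt and merge.vcf.gz are placeholders.

```shell
# Write the per-chromosome VCF names, in chromosome order, one per line.
# $1 is the list file to create.
write_vcf_list() {
    c=1
    while [ "$c" -le 22 ]; do
        echo "chr${c}.filtered.vcf"
        c=$((c + 1))
    done > "$1"
}

# Concatenate the listed VCFs into one bgzip-compressed VCF.
# -f reads input names from a file, -O z selects compressed VCF output,
# -o names the output file.
concat_vcfs() {
    bcftools concat -f "$1" -O z -o "$2"
}
```

Usage would be `write_vcf_list vcf_list.txt` followed by `concat_vcfs vcf_list.txt merge.vcf.gz`. The resulting file can then be given to --vcf in place of the hand-built merge.vcf.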

Nooshin Abbasi

Jul 4, 2019, 1:19:23 PM

to plink2-users

Thanks! I've upgraded; however, plink2 apparently does not support the --merge-list option. Could you please let me know what other options I should use?

I've seen an older post here in which you suggested using https://bitbucket.org/gavinband/bgen/wiki/cat-bgen; however, I doubt it works with my data, as they are in .bim/.bed/.fam format, not .bgen, right?

Christopher Chang

Jul 4, 2019, 5:36:16 PM

to plink2-users

For the merge, export to VCF and use the bcftools command I suggested.
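This suggestion can be sketched as three steps: export each per-chromosome fileset to VCF with PLINK 2.0, concatenate with bcftools, and import the result. The sketch below only prints the commands so they can be reviewed before anything runs. The function name print_merge_plan, the fileset prefix "chr", and the output name "merged" are assumptions, since the thread does not show the contents of all_files.txt.

```shell
# Print (do not run) the commands for an export -> concat -> import merge.
# $1 is a hypothetical per-chromosome fileset prefix,
# e.g. "chr" for chr1.bed/.bim/.fam through chr22.bed/.bim/.fam.
print_merge_plan() {
    prefix=$1
    inputs=""
    c=1
    while [ "$c" -le 22 ]; do
        # Step 1: export each binary fileset to a bgzip-compressed VCF.
        echo "plink2 --bfile ${prefix}${c} --export vcf bgz --out ${prefix}${c}"
        inputs="${inputs} ${prefix}${c}.vcf.gz"
        c=$((c + 1))
    done
    # Step 2: concatenate the per-chromosome VCFs in chromosome order.
    echo "bcftools concat -O z -o merged.vcf.gz${inputs}"
    # Step 3: import the merged VCF as a binary fileset.
    echo "plink2 --vcf merged.vcf.gz --make-bed --out merged"
}
```

For example, `print_merge_plan chr > merge_plan.sh` writes the 24 commands to a file that can be inspected and then run with `sh merge_plan.sh`. Sample IDs and .fam fields such as sex and phenotype may not survive the VCF round trip unchanged, so it is worth comparing the resulting .fam with the inputs.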

DJon

Feb 24, 2020, 12:19:17 PM

to plink2-users

Hi Christopher,

I have a similar memory issue when running PCA. I tried shortening the variant IDs, but I still get the same error, as follows:

PLINK v1.90b6.15 64-bit (21 Jan 2020) www.cog-genomics.org/plink/1.9/
Logging to /data/chromosomes/all/PCA_train.log.
Options in effect:
--bfile /data/chromosomes/all/everything_chr_edt
--extract /data/chromosomes/all/for_pca.prune.in
--keep /home/p_10_4/females_list
--out /data/chromosomes/all/PCA_train
--pca 4

773975 MB RAM detected; reserving 386987 MB for main workspace.
805426 variants loaded from .bim file.
488377 people (223477 males, 264811 females, 89 ambiguous) loaded from .fam.
Ambiguous sex IDs written to /data/chromosomes/all/PCA_train.nosex .
--extract: 421691 variants remaining.
--keep: 209940 people remaining.
Using up to 31 threads (change this with --threads).
Before main variant filters, 209940 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate in remaining samples is 0.965961.
421691 variants and 209940 people pass filters and QC.
Note: No phenotypes present.
Excluding 9911 variants on non-autosomes from relationship matrix calc.
Relationship matrix calculation complete.

Error: Out of memory. The --memory flag may be helpful.
Failed allocation size: 352598428800

When I check the job's memory usage on the HPC cluster:

State: FAILED (exit code 1)
Nodes: 1
Cores per node: 32
CPU Utilized: 35-16:12:19
CPU Efficiency: 80.14% of 44-12:24:32 core-walltime
Job Wall-clock time: 1-09:23:16
Memory Utilized: 246.33 GB
Memory Efficiency: 33.68% of 731.45 GB

So there appears to be enough memory. What could be the reason for this error? Is there another way to calculate PCA for all of these subjects with all of these variants?

When I ran it with a smaller number of subjects (10%), it worked without any issue.

Thanks.

Djon

Christopher Chang

Feb 24, 2020, 12:22:00 PM

to plink2-users

This is expected with hundreds of thousands of samples. Use plink 2.0's "--pca approx" instead in this setting.
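The numbers in the log are consistent with this: the failed allocation is exactly the size of a dense N x N matrix of 8-byte values for the N = 209940 samples remaining after --keep. A quick check with shell arithmetic follows. Reading this as the matrix PLINK 1.9 tries to allocate for the exact PCA is an inference from the numbers, not something stated in the thread.

```shell
# Sample count after --keep, taken from the log in the question.
n=209940
# Bytes needed for a dense n x n matrix of 8-byte (double-precision) values.
matrix_bytes=$((n * n * 8))
echo "$matrix_bytes"                     # 352598428800, the reported failed allocation size
# The same figure in MiB, for rough comparison with the 386987 MB workspace.
echo "$((matrix_bytes / 1024 / 1024))"  # 336264
```

Because the requirement grows with the square of the sample count, a 10% subset needs only about 1% of this, which fits with the smaller run succeeding. A PLINK 2.0 equivalent of the original command might look like `plink2 --bfile /data/chromosomes/all/everything_chr_edt --extract /data/chromosomes/all/for_pca.prune.in --keep /home/p_10_4/females_list --pca 4 approx --out /data/chromosomes/all/PCA_train`. This is a sketch carried over from the options in the log, so the exact modifiers should be checked against the PLINK 2.0 documentation.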
