plink2 Error: File write failure. [Maybe a bug]

Peng Cui

Mar 9, 2018, 12:43:18 PM
to plink2-users
Dear all,

I have run into a strange problem with plink2. I want to extract the SNPs in a region of interest from a pgen fileset and write them out in bed format using --make-bed. My commands are as follows.

The pgen fileset was first created from the bgen file:  plink2 --bgen ubk_imp_chr12_v2.bgen --sample ukb2990_imp_chrN_v2_s487395.sample --make-pgen --out ubk_chr12

Then: plink2 --pgen ubk_chr12.pgen --psam ubk_chr12.psam --pvar ubk_chr12.pvar --chr 12 --from-bp 26915671 --to-bp 28924209 --make-bed --out ENSG00000205693


However, the following is the output:


PLINK v2.00a2LM 64-bit Intel (7 Mar 2018)      www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ENSG00000205693.log.
Options in effect:
  --chr 12
  --from-bp 26915671
  --make-bed
  --out ENSG00000205693
  --pgen ubk_chr12.pgen
  --psam ubk_chr12.psam
  --pvar ubk_chr12.pvar
  --to-bp 28924209

Start time: Fri Mar  9 11:03:47 2018
128892 MB RAM detected; reserving 64446 MB for main workspace.
Using up to 16 threads (change this with --threads).
487409 samples (264362 females, 223033 males, 14 ambiguous; 487409 founders)
loaded from ubk_chr12.psam.
4431052 variants loaded from ubk_chr12.pvar.
Note: No phenotype data present.
64279 variants remaining after main filters.
Writing ENSG00000205693.bed ... 0%
Error: File write failure.
End time: Fri Mar  9 11:04:39 2018



I have checked the output file ENSG00000205693.bed. It is only 2 GB, and my cluster has plenty of disk space. So why does this happen? I don't know whether this is a plink2 bug or whether something else caused the error. Thanks in advance to anyone who can help!


Peng



Christopher Chang

Mar 9, 2018, 1:04:48 PM
to plink2-users
Hi,

Thanks for reporting this.  Yes, it's a plink2 bug; I forgot a "not" in the code which broke up >2GB write operations, so plink2 thought the first 2GB write failed when it succeeded.  Will post a fix later today.
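
For readers wondering how one missing "not" can produce exactly this symptom, here is a minimal Python sketch of a chunked writer. It is not plink2's actual code; the names `write_in_chunks` and `CHUNK` and the tiny chunk size are made up for illustration. Large writes are commonly split into pieces because a single write call can be limited to roughly 2 GB on some platforms.

```python
import io

CHUNK = 4  # stand-in for the ~2 GB piece size; tiny so the demo runs instantly


def write_in_chunks(out, data, buggy=False):
    """Write data to out in CHUNK-sized pieces; return True on success."""
    for start in range(0, len(data), CHUNK):
        piece = data[start:start + CHUNK]
        written = out.write(piece)
        ok = written == len(piece)
        if buggy:
            ok = not ok  # inverted check: a successful write looks like a failure
        if not ok:
            return False  # the caller would report "File write failure"
    return True


good = io.BytesIO()
bad = io.BytesIO()
print(write_in_chunks(good, b"0123456789"))              # True
print(write_in_chunks(bad, b"0123456789", buggy=True))   # False
print(len(good.getvalue()), len(bad.getvalue()))         # 10 4
```

The buggy variant leaves exactly one chunk in the output and then reports failure, which matches the 2 GB file plus "File write failure" reported above.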

Chris Franklin

Mar 19, 2018, 9:10:10 AM
to plink2-users
Hi,
I'm still having problems that I think relate to this issue.
Converting a UKB BGEN to pgen or bed format works fine with an old plink2.0 binary I have (PLINK v2.00aLM 64-bit Intel (2 Aug 2017)).

The Feb 11 release, PLINK v2.00a1LM 64-bit Intel (11 Feb 2018), converts to pgen format successfully, but fails the --make-bed conversion step after writing 2 GB, as above.

But the March 11 release gives "Segmentation fault (core dumped)" almost immediately, with both the --make-pgen and --make-bed conversion commands.
I've tried both:
PLINK v2.00a2LM 64-bit Intel (11 Mar 2018)
PLINK v2.00a2LM AVX2 Intel (11 Mar 2018)

Thanks,
Chris

Christopher Chang

Mar 19, 2018, 11:54:02 AM
to plink2-users
Hi,

Can you post the .log file from your latest failed run? Thanks.

Christopher Chang

Mar 19, 2018, 11:56:00 AM
to plink2-users
(or if the segfault interfered with logging, add the --debug flag and rerun)

Chris Franklin

Mar 19, 2018, 1:37:37 PM
to plink2-users

The log file:


PLINK v2.00a2LM AVX2 Intel (11 Mar 2018)
Options in effect:
  --bgen ukb_imp_chr22_v2b.bgen
  --debug
  --make-bed
  --memory 45000
  --out ukb_imp_chr22_v2b_bgen2bed
  --sample ukb_imp_v2b.sample

Hostname: compute19
Working directory: /data/scratch/user/chrisf/imputed
Start time: Mon Mar 19 17:29:46 2018

Random number seed: 1521480586
60397 MB RAM detected; reserving 45000 MB for main workspace.

Using up to 16 threads (change this with --threads).
--bgen: 445766 variants detected, format v1.2.


The error file (from submitting to SGE) has this:

/var/spool/gridengine/execd/compute19/job_scripts/2108712: line 87: 22535 Segmentation fault      (core dumped) /home/chrisf/plink_versions/avx2_20180311/plink2 --debug --memory 45000 --bgen ukb_imp_chr22_v2b.bgen --sample ukb_imp_v2b.sample --make-bed --out ukb_imp_chr22_v2b_bgen2bed


I get the same error and almost instant failure whether submitting to SGE or running directly on the cluster's head-node. An empty psam file is created then it dies.

Christopher Chang

Mar 19, 2018, 1:56:17 PM
to plink2-users
Thanks.  I've replicated the crash; will try to post a fix later today.

Christopher Chang

Mar 19, 2018, 8:58:16 PM
to plink2-users
Bugfix is posted; let me know if you run into any other problems.



Chris Franklin

Mar 20, 2018, 6:24:19 AM
to plink2-users
Thanks, but I'm still getting the same crash with the new version:

PLINK v2.00a2LM AVX2 Intel (19 Mar 2018)       www.cog-genomics.org/plink/2.0/

(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ukb_imp_chr22_v2b.log.

Options in effect:
  --bgen ukb_imp_chr22_v2b.bgen
  --debug
  --make-bed
  --memory 50000
  --out ukb_imp_chr22_v2b
  --sample ukb_imp_v2b.sample

Start time: Tue Mar 20 10:13:48 2018
64429 MB RAM detected; reserving 50000 MB for main workspace.
Allocated 37500 MB successfully, after larger attempt(s) failed.

Using up to 16 threads (change this with --threads).
--bgen: 445766 variants detected, format v1.2.
Segmentation fault (core dumped)

I tried it out on a couple of different chromosomes. It detects the variants, then gives up.
Might be some incompatibility with our cluster. But I'm not sure how to track that down.
Chris

Christopher Chang

Mar 20, 2018, 10:59:59 AM
to plink2-users
Hmm, I'll post a debug build later today for you to use if I'm unable to reproduce this.

Christopher Chang

Mar 20, 2018, 8:05:14 PM
to plink2-users
The 20 Mar build has a bunch of additional --debug logging during .sample file loading (where this crash is occurring); let me know when you have a chance to run it.

Chris Franklin

Mar 21, 2018, 7:20:42 AM
to plink2-users
Hi Chris, the debug output is below.
It looks like it's validating the .sample file happily but then dies when writing the .psam (which is created with zero bytes).
The top of my .sample file looks sane to me:
> ukb_imp_v2b.sample
ID_1 ID_2 missing
0 0 0
1183872 1183872 0

Thanks,
Chris

PLINK v2.00a2LM AVX2 Intel (20 Mar 2018)
Options in effect:
  --bgen /data/scratch/project/UKBiobank/Genomics/imputed/ukb_imp_chr22_v2b.bgen

  --debug
  --make-bed
  --memory 45000
  --out ukb_imp_chr22_v2b_bgen2bed
  --sample /data/scratch/project/UKBiobank/Genomics/imputed/ukb_imp_v2b.sample

Hostname: compute03
Working directory: /data/scratch/project/UKBiobank/Genomics/prep_2018_03_13/imputed
Start time: Wed Mar 21 10:39:20 2018

Random number seed: 1521628760

60397 MB RAM detected; reserving 45000 MB for main workspace.
Using up to 16 threads (change this with --threads).
--bgen: 445766 variants detected, format v1.2.
Entering OxSampleToPsam().
Standardizing load-buffer size.
Opened .sample file; linebuf_size = 2147483584.
ReadLineStream initialized.
Missing code initialized.
First scan complete; write_fid = 1.
ReadLineStream rewound.
First line reread.
ID_1 validated.
ID_2 validated.
MISSING validated.
ukb_imp_chr22_v2b_bgen2bed-temporary.psam opened for writing.
Write-buffer initialized, size = 2147483584
sex_col = 0, col_ct = 3.
Second line read.
0 0 0 validated.
Remaining .sample columns validated.  col_ct = 3.
All-missing-phenotype check completed.  uncertain_col_ct = 0
Rewind #2 complete.
Processing line 1.

Christopher Chang

Mar 21, 2018, 11:37:26 AM
to plink2-users
Okay, this is embarrassing; I was testing my handling of all the .sample column types yesterday, but I failed to test the no-sex/phenotype-columns-at-all case, and that's what was broken.  Fix is posted to GitHub; binaries will be posted later today, and I'll make sure to add a test for this case.

Chris Franklin

Mar 21, 2018, 12:13:31 PM
to plink2-users
Brilliant, thanks.

che...@gmail.com

Apr 1, 2018, 5:31:58 PM
to plink2-users
Hi all,

I seem to have a similar problem -- there exists an output .bed file that is 2GB and the cluster has sufficient disk volume. Are there any suggestions as to how I may proceed? May I assume that the run was successful and the 2GB .bed file is correct? Thank you in advance!

Jessica

---

PLINK v2.00aLM 64-bit Intel (5 Jan 2018)
Options in effect:
  --make-bed
  --memory 64000
  --out /oak/stanford/groups/mrivas/users/jwrchen/sandbox/array/ukb_imp_chr21_v2.mac1.hrc
  --pgen /scratch/PI/mrivas/ukbb/24983/imp_pgen_1372/ukb_imp_chr21_v2.mac1.hrc.pgen
  --psam /scratch/PI/mrivas/ukbb/24983/imp_pgen_24983/ukb24983_imp_chr21_v2.fam
  --pvar /scratch/PI/mrivas/ukbb/24983/imp_pgen_1372/ukb_imp_chr21_v2.mac1.hrc.pvar
  --threads 8

Hostname: sh-101-49.int
Working directory: /oak/stanford/groups/mrivas/users/jwrchen/sandbox/array
Start time: Sun Apr  1 14:24:25 2018

Random number seed: 1522617865
128656 MB RAM detected; reserving 64000 MB for main workspace.
Using up to 8 compute threads.
487409 samples (0 females, 0 males, 487409 ambiguous; 487409 founders) loaded
from /scratch/PI/mrivas/ukbb/24983/imp_pgen_24983/ukb24983_imp_chr21_v2.fam.
531237 variants loaded from
/scratch/PI/mrivas/ukbb/24983/imp_pgen_1372/ukb_imp_chr21_v2.mac1.hrc.pvar.
Note: No phenotype data present.
Writing
/oak/stanford/groups/mrivas/users/jwrchen/sandbox/array/ukb_imp_chr21_v2.mac1.hrc.bed
...
Error: File write failure.

Christopher Chang

Apr 1, 2018, 8:28:40 PM
to plink2-users
This bug was fixed on March 9; retry with a newer build.
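
On the question of whether the 2 GB .bed can be trusted: for a dataset of this size, almost certainly not. A PLINK 1 .bed file in the default variant-major layout has a 3-byte header followed by ceil(samples / 4) bytes per variant, so the expected size follows directly from the counts in the log. The sketch below computes it; the function names are hypothetical helpers, not part of plink2.

```python
import os


def expected_bed_bytes(n_samples, n_variants):
    """Size of a variant-major PLINK 1 .bed: 3 magic bytes + 2 bits per genotype."""
    bytes_per_variant = (n_samples + 3) // 4  # 4 genotypes per byte, rounded up
    return 3 + bytes_per_variant * n_variants


def bed_is_complete(bed_path, n_samples, n_variants):
    """True if the file on disk has exactly the expected size."""
    return os.path.getsize(bed_path) == expected_bed_bytes(n_samples, n_variants)


# Counts taken from the log above: 487409 samples, 531237 variants.
size = expected_bed_bytes(487409, 531237)
print(f"{size / 1e9:.1f} GB expected")  # 64.7 GB expected
```

A complete .bed for this run would be about 64.7 GB, so a 2 GB file is a truncated output and the conversion should be rerun with a fixed build.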

Ying Liu

Sep 10, 2018, 12:32:30 PM
to plink2-users
I am having the same segmentation fault problem, even though I am using the Sept 4 version (the latest). I do not know whether the problem is caused by the --sort-vars flag. I ran into it when running --make-bed from both bgen+sample and pgen files. Maybe I should not use this flag? The run completed and produced results when I left out --sort-vars, but I received the following warning:

(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to BMI_auto_5e-8_all_bfile.log.
Options in effect:
  --make-bed
  --out BMI_auto_5e-8_all_bfile
  --pfile BMI_auto_5e-8_all_pfile

Start time: Mon Sep 10 11:05:46 2018
258218 MiB RAM detected; reserving 129109 MiB for main workspace.
Using up to 72 threads (change this with --threads).
487409 samples (264362 females, 223033 males, 14 ambiguous; 487409 founders)
loaded from BMI_auto_5e-8_all_pfile.psam.
37012 variants loaded from BMI_auto_5e-8_all_pfile.pvar.
Note: No phenotype data present.
Warning: Variants are not sorted by position.  Consider rerunning with the
--sort-vars flag added to remedy this.
Writing BMI_auto_5e-8_all_bfile.bed ... done.
Writing BMI_auto_5e-8_all_bfile.bim ... done.
Writing BMI_auto_5e-8_all_bfile.fam ... done.
End time: Mon Sep 10 11:06:54 2018

So then I tried to use --sort-vars and had the following error. Below is the log: 

PLINK v2.00a2LM 64-bit Intel (4 Sep 2018)      www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to BMI_auto_5e-8_all_sorted_fromBGEN_bfile.log.
Options in effect:
  --bgen /gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/extracted/NAGWAS_BMI/BMI_allGWAS_auto_v3.bgen
  --import-dosage-certainty 0.9
  --make-bed
  --out BMI_auto_5e-8_all_sorted_fromBGEN_bfile
  --sample /gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/ukb24487_imp_chr22_v3_s487395.sample
  --sort-vars

Start time: Mon Sep 10 11:15:59 2018
258218 MiB RAM detected; reserving 129109 MiB for main workspace.
Using up to 72 threads (change this with --threads).
--bgen: 37012 variants detected, format v1.2.
487409 samples imported from .sample file to
BMI_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.psam .
--bgen: BMI_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pgen +
BMI_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pvar written.
487409 samples (264362 females, 223033 males, 14 ambiguous; 487409 founders)
loaded from BMI_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.psam.
37012 variants loaded from
BMI_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pvar.
Note: No phenotype data present.
Segmentation fault (core dumped)

=====================================
PLINK v2.00a2LM 64-bit Intel (4 Sep 2018)      www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to BMI_auto_5e-8_all_sorted_bfile.log.
Options in effect:
  --debug
  --make-bed
  --out BMI_auto_5e-8_all_sorted_bfile
  --pfile /gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/extracted/09062018/BMI_auto_5e-8_all_pfile
  --sort-vars

Start time: Mon Sep 10 11:11:37 2018
258218 MiB RAM detected; reserving 129109 MiB for main workspace.
Using up to 72 threads (change this with --threads).
487409 samples (264362 females, 223033 males, 14 ambiguous; 487409 founders)
loaded from
/gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/extracted/09062018/BMI_auto_5e-8_all_pfile.psam.
37012 variants loaded from
/gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/extracted/09062018/BMI_auto_5e-8_all_pfile.pvar.
Note: No phenotype data present.
Segmentation fault (core dumped)

Christopher Chang

Sep 10, 2018, 2:03:12 PM
to plink2-users
This --sort-vars bug was fixed on September 8th; update to a newer build.

Ying Liu

Sep 10, 2018, 3:15:03 PM
to plink2-users
I just tried again with the latest version, but see the following error message. 

PLINK v2.00a2LM 64-bit Intel (9 Sep 2018)      www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to WHR_auto_5e-8_all_sorted_fromBGEN_bfile.log.
Options in effect:
  --bgen /gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/extracted/NAGWAS_WHR/WHR_allGWAS_auto_v3.bgen
  --import-dosage-certainty 0.9
  --make-bed
  --out WHR_auto_5e-8_all_sorted_fromBGEN_bfile
  --sample /gpfs23/scratch/sbcs/liuy39/UKB/restore-rt-44863/download3/EGAD00010001474/ukb24487_imp_chr22_v3_s487395.sample
  --sort-vars

Start time: Mon Sep 10 13:45:29 2018
258218 MiB RAM detected; reserving 129109 MiB for main workspace.
Using up to 72 threads (change this with --threads).
--bgen: 33158 variants detected, format v1.2.
487409 samples imported from .sample file to
WHR_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.psam .
--bgen: WHR_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pgen +
WHR_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pvar written.
487409 samples (264362 females, 223033 males, 14 ambiguous; 487409 founders)
loaded from WHR_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.psam.
33158 variants loaded from
WHR_auto_5e-8_all_sorted_fromBGEN_bfile-temporary.pvar.
Note: No phenotype data present.
Writing WHR_auto_5e-8_all_sorted_fromBGEN_bfile.bim ... done.
Writing WHR_auto_5e-8_all_sorted_fromBGEN_bfile.fam ... done.
Error: Fixed-width .bed/.pgen output doesn't support sorting yet.  Generate a
regular sorted .pgen first, and then reformat it.
End time: Mon Sep 10 14:09:27 2018

Christopher Chang

Sep 10, 2018, 3:28:51 PM
to plink2-users
Replace --make-bed with --make-pgen in your first command; then follow up with --pfile + --make-bed to create the final .bed.  (Yes, this is a bit silly, and I'll enable --sort-vars + --make-bed at some point; but the first priority is to make this possible at all, with a smaller number of better-tested code paths.)
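
The two-step workaround can be scripted. The sketch below only builds the two argument lists, using flags that already appear in this thread, and leaves execution to the caller. The helper names `sorted_bed_commands` and `run_all`, the `_sorted_pfile` suffix for the intermediate fileset, and the input filenames in the demo are arbitrary choices, not plink2 conventions.

```python
import subprocess


def sorted_bed_commands(bgen, sample, out_prefix, extra_import_args=(), plink2="plink2"):
    """Return the two plink2 invocations: sorted .pgen first, then .bed."""
    tmp_prefix = out_prefix + "_sorted_pfile"  # intermediate fileset (arbitrary name)
    step1 = [plink2, "--bgen", bgen, "--sample", sample, *extra_import_args,
             "--sort-vars", "--make-pgen", "--out", tmp_prefix]
    step2 = [plink2, "--pfile", tmp_prefix, "--make-bed", "--out", out_prefix]
    return [step1, step2]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = sorted_bed_commands(
    "input.bgen", "input.sample", "result",
    extra_import_args=("--import-dosage-certainty", "0.9"),
)
for cmd in commands:
    print(" ".join(cmd))
# To execute for real (requires plink2 on PATH): run_all(commands)
```

Keeping the sorted .pgen as a named intermediate, rather than a temporary file, also makes it easy to rerun only the second step if the .bed export fails.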

Fabian Grammes

Nov 23, 2019, 4:35:55 PM
to plink2-users
Hi, I seem to have a similar problem with smaller files (223 samples, 56177 variants) using plink2 (20191122 release). I'm running

~/bin/plink2_linux_avx2_20191122/plink2  \
    --import-dosage just_a_test.plink.dosage noheader format=1 \
    --fam just_a_test.plink.fam \
    --map just_a_test.plink.map \
    --out just_a_test.OUT \
    --allow-extra-chr --sort-vars \
    --make-pgen --debug

and I get:
Segmentation fault

The temporary files are generated (just_a_test.OUT-temporary.pgen/psam/pvar), and I can replicate the error with the 10 Sep 2019 plink2 build. Any help is very much appreciated.

Thanks and cheers, Fabian

Christopher Chang

Nov 23, 2019, 6:20:29 PM
to plink2-users
Can you post a dataset that I can replicate this with?  I tried several variants of this command with the same dataset dimensions, and did not see a segfault.

Fabian Grammes

Nov 25, 2019, 6:23:42 AM
to plink2-users
Thanks! The data needed to reproduce the error is attached. By subsetting, I narrowed it down to line 55980, which appears to cause the segmentation fault, but I cannot spot any error there. Thanks again for taking a look at this!

Cheers, F
just_a_test.plink.dosage.subset
just_a_test.plink.map_rename
just_a_test.plink.fam.subset

Christopher Chang

Nov 25, 2019, 1:37:37 PM
to plink2-users
Thanks, this is sufficient.  Turns out the bug is in --sort-vars, not --import-dosage, and the problem wasn't noticed before since it was an off-by-one memory-allocation-size issue that only mattered when the number of contigs was a multiple of 16.

Bugfix is on GitHub; I'll post updated binaries tonight.
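
Why would an off-by-one allocation stay hidden unless a count is a multiple of 16? One common way this happens (a guess at the general pattern, not a description of plink2's actual code) is that allocation sizes get rounded up to a fixed granularity, so requesting n slots when n + 1 are needed is usually covered by the padding. The sketch below assumes a granularity of 16 slots; `round_up` and `overflows` are illustrative names.

```python
GRANULARITY = 16  # assumed rounding unit, chosen to match the "multiple of 16" remark


def round_up(n, unit=GRANULARITY):
    """Smallest multiple of unit that is >= n."""
    return (n + unit - 1) // unit * unit


def overflows(n_contigs):
    """True if code that needs n_contigs + 1 slots but requests only n_contigs
    would run past the end of the rounded-up allocation."""
    allocated = round_up(n_contigs)
    needed = n_contigs + 1
    return needed > allocated


print([n for n in range(1, 65) if overflows(n)])  # [16, 32, 48, 64]
```

Under that assumption, only datasets whose contig count lands exactly on a multiple of 16 would ever touch memory past the allocation, which is why such a bug can survive a lot of testing.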

潘梦宇

Oct 5, 2020, 10:12:46 AM
to plink2-users
Hi, I am using plink2 to convert my UKB bgen file, but the log file looks like this:


PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020)   www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ./chr_22_plink.log.
Options in effect:
  --bgen chrom_imp/ukb_imp_chr22_v3.bgen ref-first
  --make-bed
  --out ./chr_22_plink
  --oxford-single-chr 22
  --sample chrom_imp_sample/ukb32048_imp_chr22_v3_s487296.sample

Start time: Mon Oct  5 15:32:11 2020
773517 MiB RAM detected; reserving 386758 MiB for main workspace.
Using up to 64 threads (change this with --threads).
--bgen: 1255683 variants detected, format v1.2.
487409 samples imported from .sample file to ./chr_22_plink-temporary.psam .
--bgen: ./chr_22_plink-temporary.pgen + ./chr_22_plink-temporary.pvar written.
487409 samples (264302 females, 222994 males, 113 ambiguous; 487409 founders)
loaded from ./chr_22_plink-temporary.psam.
1255683 variants loaded from ./chr_22_plink-temporary.pvar.
Note: No phenotype data present.
Writing ./chr_22_plink.fam ... done.
Writing ./chr_22_plink.bim ... done.
Writing ./chr_22_plink.bed ... 78%
Error: File write failure: Input/output error.
End time: Mon Oct  5 15:46:46 2020

Could you help me solve this problem so that I can get the --make-bed output files?
Many thanks.


Ammy

Christopher Chang

Oct 5, 2020, 11:11:46 AM
to plink2-users
This is almost certainly because you don't have enough disk space.  (Note that --make-pgen usually generates a smaller file.)
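
A quick pre-flight check can catch this before a long run. The sketch below compares free space on the output filesystem (via Python's `shutil.disk_usage`) with the expected .bed size, which for the default variant-major layout is 3 bytes plus ceil(samples / 4) bytes per variant. The function name and the 5% safety margin are arbitrary assumptions, and the reported free space may not reflect per-user quotas.

```python
import shutil


def enough_space_for_bed(out_dir, n_samples, n_variants, margin=1.05):
    """True if out_dir's filesystem has room for the .bed plus a small margin."""
    needed = 3 + ((n_samples + 3) // 4) * n_variants
    free = shutil.disk_usage(out_dir).free
    return free >= needed * margin


# Counts from the log above: 487409 samples x 1255683 variants is about 153 GB.
print(enough_space_for_bed(".", 487409, 1255683))
```

If the check fails, writing a .pgen instead (as suggested above) or pointing --out at a larger filesystem are the obvious options.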

Karatuğ Ozan BİRCAN

Apr 14, 2021, 12:46:19 PM
to plink2-users
Hi everyone,

When I run plink2 in a Databricks notebook, I get the following error:

```
PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020)
Options in effect:
  --extract /dbfs/mnt/dev-karatug-prs/sc-admin/research/traits/Blood_Pressure/summary-statistics/Blood_Pressure_study2_b_cts/snps.tsv
  --keep-if SuperPop==EUR
  --make-pgen
  --out /dbfs/mnt/dev-karatug-prs/tmp/62472eb6-396d-4564-9e78-f497daa57a7b/chr1
  --pgen /dbfs/mnt/dev-karatug-prs/common/split/decompress_1kg/chr1.pgen
  --psam /dbfs/mnt/dev-karatug-prs/common/split/general.psam
  --pvar /dbfs/mnt/dev-karatug-prs/common/split/chr1.pvar.zst
  --rm-dup exclude-mismatch

Hostname: 0214-211807-ogres312-10-208-244-42
Working directory: /databricks/driver
Start time: Wed Apr 14 16:39:25 2021

Random number seed: 1618418365
30603 MiB RAM detected; reserving 15301 MiB for main workspace.
Using up to 4 compute threads.
2504 samples (1271 females, 1233 males; 2497 founders) loaded from
/dbfs/mnt/dev-karatug-prs/common/split/general.psam.
6468094 variants loaded from
/dbfs/mnt/dev-karatug-prs/common/split/chr1.pvar.zst.
2 categorical phenotypes loaded.
--extract: 2113695 variants remaining.
--rm-dup: Loading INFO field... done.
Note: 1 duplicate ID with inconsistent genotype data or variant information
detected by --rm-dup exclude-mismatch; all copies removed.
--rm-dup: 1 duplicated ID, 2 variants removed.
--keep-if: 2001 samples removed.
503 samples (263 females, 240 males; 503 founders) remaining after main
filters.
2113693 variants remaining after main filters.
Writing
/dbfs/mnt/dev-karatug-prs/tmp/62472eb6-396d-4564-9e78-f497daa57a7b/chr1.psam
... done.
Writing
/dbfs/mnt/dev-karatug-prs/tmp/62472eb6-396d-4564-9e78-f497daa57a7b/chr1.pvar
... done.
Writing
/dbfs/mnt/dev-karatug-prs/tmp/62472eb6-396d-4564-9e78-f497daa57a7b/chr1.pgen
...
Error: File write failure: Operation not supported.

End time: Wed Apr 14 16:39:41 2021

```

I don't think this is a disk space issue since I write this file to S3.
Thank you.

Best regards.

Christopher Chang

Apr 14, 2021, 12:57:32 PM
to plink2-users
When the .pgen writer is done, it seeks back to the beginning of the file to write the index.  I'm guessing from the error message that your system only supports sequential writes to S3.
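
The write pattern described here can be reproduced without plink2. The sketch below is a stand-in, not the real .pgen writer: it writes a placeholder header, appends a body, then seeks back to offset 0 to overwrite the header. Running `can_seek_back_and_write` (a made-up helper) against a directory shows whether that filesystem accepts the pattern. If it returns False, a likely workaround is to point --out at local disk and copy the finished files to the mount afterwards.

```python
import os
import tempfile


def can_seek_back_and_write(directory):
    """Try a header-backpatch write in directory; report whether it worked."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\0" * 8)        # placeholder for an index written last
            f.write(b"body bytes")    # sequential payload
            f.seek(0)                 # jump back to the start...
            f.write(b"INDEX001")      # ...and overwrite the placeholder
        with open(path, "rb") as f:
            return f.read() == b"INDEX001body bytes"
    except OSError:
        return False                  # e.g. "Operation not supported"
    finally:
        try:
            os.remove(path)           # clean up the probe file
        except OSError:
            pass


print(can_seek_back_and_write(tempfile.gettempdir()))
```

On an ordinary local filesystem this prints True; on a mount that only accepts sequential writes, the seek-and-overwrite step is where the failure would be expected.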

Karatuğ Ozan BİRCAN

Apr 14, 2021, 2:26:37 PM
to plink2-users
On the other hand, when I run the following, it works fine.

Does this also use .pgen writer?

plink2 --zst-decompress /dbfs/mnt/dev-karatug-prs/common/split/chr1.pgen.zst /dbfs/mnt/dev-karatug-prs/common/split/decompress_1kg/chr1.pgen

Christopher Chang

Apr 14, 2021, 2:31:27 PM
to plink2-users
That's a generic decompression operation; it doesn't use the specialized .pgen writer.

Mary Makarious

Jun 8, 2021, 4:25:19 PM
to plink2-users
Hello,
I'm running into a segmentation fault error at the .pvar stage of the pgen files being created.


PLINK v2.00a3LM AVX2 Intel (20 Apr 2021) www.cog-genomics.org/plink/2.0/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink_v2_reformatted_normalized.log.
Options in effect:
  --fa hg38.fa.gz
  --make-pgen
  --memory 1000000
  --normalize
  --out plink_v2_reformatted_normalized
  --pfile plink_v2_reformatted
  --sort-vars
  --threads 10


Start time: Tue Jun 8 15:45:30 2021
1547809 MiB RAM detected; reserving 1000000 MiB for main workspace.
Using up to 10 threads (change this with --threads).
10418 samples (0 females, 0 males, 10418 ambiguous; 10418 founders) loaded from
plink_v2_reformatted.psam.
158715874 variants loaded from plink_v2_reformatted.pvar.

Note: No phenotype data present.
--normalize: 1566188 variants changed.
Warning: Base-pair positions are now unsorted!
Writing plink_v2_reformatted_normalized_v2.pvar ... done.
Writing plink_v2_reformatted_normalized_v2.psam ... done.
Writing plink_v2_reformatted_normalized_v2.pgen ... 0%Segmentation fault



I'm working on an interactive node of a cluster with 1.5 TB of RAM (hence the --memory flag), so I do not think it is a memory issue.
From what I can tell, --sort-vars is necessary here, but using it in combination with --normalize triggers this crash.


Any insight would be really helpful!

Christopher Chang

Jun 8, 2021, 5:04:52 PM
to plink2-users
Hi,

This is definitely a bug.  Can you send me a dataset (could be much smaller, e.g. part of one chromosome) I can use to replicate this crash?

Christopher Chang

Jun 8, 2021, 8:38:47 PM
to plink2-users
Bugfix is now posted; let me know if you still see any problems.

Mary Makarious

Jun 9, 2021, 10:41:44 AM
to plink2-users
Hey Chris - thank you so much for your quick response and fix! 
Seems to be working fine on my end so far after downloading the June 8th 2021 dev build.

All the best, 
Mary