I tried the experiment you suggested; here are the results of four runs:
Run 1 | previous command but adding --randmem --seed 1 | failed
PLINK v2.00a3LM 64-bit Intel (8 Apr 2022)
Options in effect:
--allow-extra-chr
--chr 1-22 x xy
--debug
--extract ./EAS/extract_list.snplist ./req_files/ALL.BRAVO_TOPMed_Freeze_8.MAF0.0001.variants.wchr.list ./req_files/TOPMED_HRC_1KG_UKB.REP_functional_variants.wchr.list
--indiv-sort n
--keep ./req_files/Cohort.EAS.samples.txt
--maf 1e-30
--make-pgen erase-phase fill-missing-from-dosage pvar-cols=-xheader,-maybequal,-maybefilter,-maybeinfo,-maybecm
--memory 60000
--out ./EAS/intermediate
--output-chr 26
--pgen ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pgen
--psam ./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam
--pvar ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar
--randmem
--rm-dup exclude-mismatch
--seed 1
--sort-vars
Hostname: ip-172-24-83-240
Working directory: /EBS/dev/imputationSAS/rerun_clean
Start time: Mon Apr 18 09:09:53 2022
127355 MiB RAM detected; reserving 60000 MiB for main workspace.
Using up to 32 threads (change this with --threads).
44002 samples (21829 females, 21746 males, 427 ambiguous; 44002 founders)
loaded from
./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam.
307475576 variants loaded from
./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar.
Note: No phenotype data present.
--extract: 75157537 variants remaining.
Note: Skipping --rm-dup since no duplicate IDs are present.
--keep: 679 samples remaining.
679 samples (421 females, 254 males, 4 ambiguous; 679 founders) remaining after
main filters.
Calculating allele frequencies... done.
5574485 variants removed due to allele frequency threshold(s)
(--maf/--max-maf/--mac/--max-mac).
69583052 variants remaining after main filters.
--indiv-sort: 679 samples reordered.
Writing ./EAS/intermediate.pvar ... done.
Writing ./EAS/intermediate.psam ... done.
Writing ./EAS/intermediate.pgen ...
Error: Failed to unpack (0-based) variant #222501167 in .pgen file.
You can use --validate to check whether it is malformed.
* If it is malformed, you probably need to either re-download the file, or
address an error in the command that generated the input .pgen.
* If it appears to be valid, you have probably encountered a plink2 bug. If
you report the error on GitHub or the plink2-users Google group (make sure to
include the full .log file in your report), we'll try to address it.
write-index: 50593792
previous read-index: 222501161
block_widx: 0
g_debug_get_raw: 8
End time: Mon Apr 18 10:54:50 2022
________________
Run 2 | same as Run 1 | failed at a different position
PLINK v2.00a3LM 64-bit Intel (8 Apr 2022)
Options in effect:
--allow-extra-chr
--chr 1-22 x xy
--debug
--extract ./EAS/extract_list.snplist ./req_files/ALL.BRAVO_TOPMed_Freeze_8.MAF0.0001.variants.wchr.list ./req_files/TOPMED_HRC_1KG_UKB.REP_functional_variants.wchr.list
--indiv-sort n
--keep ./req_files/Cohort.EAS.samples.txt
--maf 1e-30
--make-pgen erase-phase fill-missing-from-dosage pvar-cols=-xheader,-maybequal,-maybefilter,-maybeinfo,-maybecm
--memory 60000
--out ./EAS/intermediate
--output-chr 26
--pgen ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pgen
--psam ./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam
--pvar ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar
--randmem
--rm-dup exclude-mismatch
--seed 1
--sort-vars
Hostname: ip-172-24-83-240
Working directory: /EBS/dev/imputationSAS/rerun_clean
Start time: Mon Apr 18 11:37:48 2022
127355 MiB RAM detected; reserving 60000 MiB for main workspace.
Using up to 32 threads (change this with --threads).
44002 samples (21829 females, 21746 males, 427 ambiguous; 44002 founders)
loaded from
./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam.
307475576 variants loaded from
./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar.
Note: No phenotype data present.
--extract: 75157537 variants remaining.
Note: Skipping --rm-dup since no duplicate IDs are present.
--keep: 679 samples remaining.
679 samples (421 females, 254 males, 4 ambiguous; 679 founders) remaining after
main filters.
Calculating allele frequencies... done.
5574485 variants removed due to allele frequency threshold(s)
(--maf/--max-maf/--mac/--max-mac).
69583052 variants remaining after main filters.
--indiv-sort: 679 samples reordered.
Writing ./EAS/intermediate.pvar ... done.
Writing ./EAS/intermediate.psam ... done.
Writing ./EAS/intermediate.pgen ...
Error: Failed to unpack (0-based) variant #211495665 in .pgen file.
You can use --validate to check whether it is malformed.
* If it is malformed, you probably need to either re-download the file, or
address an error in the command that generated the input .pgen.
* If it appears to be valid, you have probably encountered a plink2 bug. If
you report the error on GitHub or the plink2-users Google group (make sure to
include the full .log file in your report), we'll try to address it.
write-index: 48103424
previous read-index: 211495663
block_widx: 0
g_debug_get_raw: 8
End time: Mon Apr 18 13:21:25 2022
________________
Run 3 | previous command but adding --threads 1 | succeeded!
PLINK v2.00a3LM 64-bit Intel (8 Apr 2022)
Options in effect:
--allow-extra-chr
--chr 1-22 x xy
--debug
--extract ./EAS/extract_list.snplist ./req_files/ALL.BRAVO_TOPMed_Freeze_8.MAF0.0001.variants.wchr.list ./req_files/TOPMED_HRC_1KG_UKB.REP_functional_variants.wchr.list
--indiv-sort n
--keep ./req_files/Cohort.EAS.samples.txt
--maf 1e-30
--make-pgen erase-phase fill-missing-from-dosage pvar-cols=-xheader,-maybequal,-maybefilter,-maybeinfo,-maybecm
--memory 60000
--out ./EAS/intermediate
--output-chr 26
--pgen ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pgen
--psam ./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam
--pvar ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar
--randmem
--rm-dup exclude-mismatch
--seed 1
--sort-vars
--threads 1
Hostname: ip-172-24-83-240
Working directory: /EBS/dev/imputationSAS/rerun_clean
Start time: Mon Apr 18 14:38:38 2022
127355 MiB RAM detected; reserving 60000 MiB for main workspace.
Using 1 compute thread.
44002 samples (21829 females, 21746 males, 427 ambiguous; 44002 founders)
loaded from
./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam.
307475576 variants loaded from
./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar.
Note: No phenotype data present.
--extract: 75157537 variants remaining.
Note: Skipping --rm-dup since no duplicate IDs are present.
--keep: 679 samples remaining.
679 samples (421 females, 254 males, 4 ambiguous; 679 founders) remaining after
main filters.
Calculating allele frequencies... done.
5574485 variants removed due to allele frequency threshold(s)
(--maf/--max-maf/--mac/--max-mac).
69583052 variants remaining after main filters.
--indiv-sort: 679 samples reordered.
Writing ./EAS/intermediate.pvar ... done.
Writing ./EAS/intermediate.psam ... done.
Writing ./EAS/intermediate.pgen ... done.
End time: Mon Apr 18 16:40:48 2022
________________
Run 4 | same as Run 3 | failed at a new position
PLINK v2.00a3LM 64-bit Intel (8 Apr 2022)
Options in effect:
--allow-extra-chr
--chr 1-22 x xy
--debug
--extract ./EAS/extract_list.snplist ./req_files/ALL.BRAVO_TOPMed_Freeze_8.MAF0.0001.variants.wchr.list ./req_files/TOPMED_HRC_1KG_UKB.REP_functional_variants.wchr.list
--indiv-sort n
--keep ./req_files/Cohort.EAS.samples.txt
--maf 1e-30
--make-pgen erase-phase fill-missing-from-dosage pvar-cols=-xheader,-maybequal,-maybefilter,-maybeinfo,-maybecm
--memory 60000
--out ./EAS/intermediate
--output-chr 26
--pgen ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pgen
--psam ./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam
--pvar ./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar
--randmem
--rm-dup exclude-mismatch
--seed 1
--sort-vars
--threads 1
Hostname: ip-172-24-83-240
Working directory: /EBS/dev/imputationSAS/rerun_clean
Start time: Tue Apr 19 09:43:07 2022
127355 MiB RAM detected; reserving 60000 MiB for main workspace.
Using 1 compute thread.
44002 samples (21829 females, 21746 males, 427 ambiguous; 44002 founders)
loaded from
./EAS/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.wSex.psam.
307475576 variants loaded from
./req_files/Cohort.GT_hg38.pVCF.rgcpid.TOPMED_dosages.HDS.pvar.
Note: No phenotype data present.
--extract: 75157537 variants remaining.
Note: Skipping --rm-dup since no duplicate IDs are present.
--keep: 679 samples remaining.
679 samples (421 females, 254 males, 4 ambiguous; 679 founders) remaining after
main filters.
Calculating allele frequencies... done.
5574485 variants removed due to allele frequency threshold(s)
(--maf/--max-maf/--mac/--max-mac).
69583052 variants remaining after main filters.
--indiv-sort: 679 samples reordered.
Writing ./EAS/intermediate.pvar ... done.
Writing ./EAS/intermediate.psam ... done.
Writing ./EAS/intermediate.pgen ...
Error: Failed to unpack (0-based) variant #251348498 in .pgen file.
You can use --validate to check whether it is malformed.
* If it is malformed, you probably need to either re-download the file, or
address an error in the command that generated the input .pgen.
* If it appears to be valid, you have probably encountered a plink2 bug. If
you report the error on GitHub or the plink2-users Google group (make sure to
include the full .log file in your report), we'll try to address it.
write-index: 57147392
previous read-index: 251348497
block_widx: 0
g_debug_get_raw: 8
End time: Tue Apr 19 11:33:13 2022
________________
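In case it helps to compare the failures side by side, here is a minimal Python sketch that pulls the diagnostic numbers out of a log like the ones above. The function name `summarize` and the 65536 block size are my own assumptions for illustration; the logs do not state a block size, and this is not anything plink2 itself provides.

```python
import re

# Patterns for the diagnostic lines plink2 printed in the failed runs above.
FAILED_RE = re.compile(r"Failed to unpack \(0-based\) variant #(\d+)")
WRITE_RE = re.compile(r"^write-index: (\d+)", re.MULTILINE)
READ_RE = re.compile(r"^previous read-index: (\d+)", re.MULTILINE)

# Assumption: 65536 is only a guess at a write-block size; the logs do not say.
ASSUMED_BLOCK = 65536


def _first_int(pattern, text):
    """Return the first captured integer for pattern, or None if absent."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


def summarize(log_text):
    """Summarize one log; returns None when the log has no unpack error."""
    variant = _first_int(FAILED_RE, log_text)
    if variant is None:
        return None
    write_idx = _first_int(WRITE_RE, log_text)
    read_idx = _first_int(READ_RE, log_text)
    return {
        "failed_variant": variant,
        "write_index": write_idx,
        "previous_read_index": read_idx,
        # Distance between the failing variant and the last one read OK.
        "gap": None if read_idx is None else variant - read_idx,
        # Remainder against the assumed block size (0 means a multiple of it).
        "write_index_mod_block": (
            None if write_idx is None else write_idx % ASSUMED_BLOCK
        ),
    }


# Excerpt copied from Run 1 above.
run1 = """Error: Failed to unpack (0-based) variant #222501167 in .pgen file.
write-index: 50593792
previous read-index: 222501161
"""
print(summarize(run1))
```

Applied to the three failed runs, the gap between the failing variant and the previous read-index comes out as 6, 2 and 1, and the write-index is a multiple of 65536 each time. That may or may not be meaningful, but it seemed worth pointing out alongside block_widx being 0 in every failure.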
Let me know if there's anything more I can do to help!