Yes, that fixed it. Thank you. On a tangentially related note, I can use PLINK 1.9 to work on a binary VCF (BCF) stream:
$ bcftools view --no-version -Ob --compression-level 0 file.vcf | plink --bcf /dev/stdin --make-bed
PLINK v1.90b6.24 64-bit (6 Jun 2021) www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink.log.
Options in effect:
--bcf /dev/stdin
--make-bed
7448 MB RAM detected; reserving 3724 MB for main workspace.
--bcf: plink-temporary.bed + plink-temporary.bim + plink-temporary.fam written.
1 variant loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is exactly 1.
1 variant and 1 person pass filters and QC.
Note: No phenotypes present.
--make-bed to plink.bed + plink.bim + plink.fam ... done.
With PLINK 2.0, however, this is no longer possible:
$ bcftools view --no-version -Ob --compression-level 0 file.vcf | plink2 --bcf /dev/stdin --make-bed
PLINK v2.00a3LM AVX2 Intel (1 Jul 2021) www.cog-genomics.org/plink/2.0/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink2.log.
Options in effect:
--bcf /dev/stdin
--make-bed
Start time: Fri Jul 2 10:20:25 2021
7448 MiB RAM detected; reserving 3724 MiB for main workspace.
Using up to 8 compute threads.
--bcf: 1 variant scanned.
End time: Fri Jul 2 10:20:25 2021
No output is created. My guess is that PLINK 2.0 wants to read the file twice, which is not possible when the input is a stream.
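To illustrate what I mean by a stream not being readable twice, here is a small shell sketch. It says nothing about what PLINK 2.0 actually does internally. It only shows that a second pass over a pipe finds no data left, which would be consistent with my guess. The record names are made up.

```shell
# A pipe is not seekable: once the first pass has consumed it,
# a second pass over the same stream hits end-of-file immediately.
result=$(printf 'rec1\nrec2\n' | {
    first_pass=$(wc -l | tr -d ' ')    # reads the stream to the end
    second_pass=$(wc -l | tr -d ' ')   # nothing left to read
    echo "first=$first_pass second=$second_pass"
})
echo "$result"   # prints "first=2 second=0"
```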
I am writing a pipeline where I have to merge many VCFs and then convert the output to PLINK format, which leaves me with three workarounds:
1) Ask users (or the Docker running the pipeline) to have both PLINK 1.9 and PLINK 2.0 installed to perform this step (and other PLINK 2.0 steps down the line)
2) Generate a large, mostly temporary, uncompressed VCF on disk that I then convert to PLINK format with PLINK 2.0, increasing the disk requirements of the task
3) Generate a compressed VCF on disk that I then convert with PLINK 2.0, increasing the CPU requirements
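For reference, a minimal sketch of what workaround 2 might look like, assuming a temporary file on disk is acceptable. The function name `vcf_to_bed`, the temporary file name `merged.bcf`, and the output prefix are placeholders of mine, and `bcftools view` stands in for the actual merge step. I kept the same uncompressed BCF options as in the commands above and only redirected the output to a file with `-o`.

```shell
# Hypothetical helper: write an uncompressed BCF to a temporary directory
# so that plink2 receives a regular file instead of a stream.
vcf_to_bed() {
    in_vcf=$1        # input VCF (placeholder for the merged output)
    out_prefix=$2    # prefix handed to plink2 --out
    tmp_dir=$(mktemp -d) || return 1
    tmp_bcf=$tmp_dir/merged.bcf

    # Same bcftools options as in the piped commands, but written to disk.
    bcftools view --no-version -Ob --compression-level 0 -o "$tmp_bcf" "$in_vcf" &&
        plink2 --bcf "$tmp_bcf" --make-bed --out "$out_prefix"
    status=$?

    # Remove the temporary BCF whether or not the conversion succeeded.
    rm -rf "$tmp_dir"
    return "$status"
}
```

It would be called as `vcf_to_bed file.vcf plink2`, and the disk cost is the size of the uncompressed BCF for the duration of the conversion.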
Is there a reason PLINK 2.0 needs to read the VCF file twice while PLINK 1.9 does not?