Scoring with filtering

O Kam

Nov 18, 2018, 9:23:45 PM
to plink2-users
Hi!

I get a weird error with the --score command in plink2 (October 28 version): "Error: Invalid --score parameter '  '."
The full command was:
./plink2 \
  --vcf genofile dosage=DS \
  --rm-dup retain-mismatch \
  --not-chr X \
  --exclude-if-info "R2<=1" \
  --score scorefilename list-variants \
  --set-allvar-ids @:#[\$a.\$r] \
  --out outfilename

All parts should be functional per se, and the command runs correctly without the CHR/INFO filtering, but the required combination just doesn't work for some reason. I tried reordering the flags as per the manual, and a couple of other combinations, but it didn't help (and I couldn't tell from the error message which parameter was wrong). What am I missing?

O Kam 



Christopher Chang

Nov 18, 2018, 9:53:05 PM
Can you post the full .log file so I can try to replicate this? I can’t just copy the literal command line since the --set-all-var-ids flag is incorrect, and the actual issue may be a similarly minor detail that was lost in translation.

O Kam

Nov 18, 2018, 10:51:42 PM
I can get a screenshot (file export not allowed from server):

CapturePlink.JPG

Christopher Chang

Nov 18, 2018, 11:30:51 PM
That’s a totally different error message.

If you can post a small vcf file (could have just 2 fake variants) that exhibits the same problem, I can investigate this and explain what’s going on.

O Kam

Nov 19, 2018, 12:06:30 AM

CapturePlink2.JPG

Sorry, this is the correct log. Looks like flag "--set-all-var-ids @:#[\$a.\$r]" is not being read for some reason. 

Christopher Chang

Nov 19, 2018, 3:25:18 AM
Okay, this appears to be due to your shell’s handling of special characters. Try a different escaping/quoting scheme for your --set-all-var-ids argument.
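
As a rough illustration of what "a different escaping/quoting scheme" can mean in bash, the sketch below uses printf as a stand-in for plink2 so that the argument is printed exactly as the program would receive it. The template string is the one from the original post; nothing here is specific to plink2 itself.

```shell
# printf stands in for plink2: it prints the argument exactly as received.
printf '%s\n' '@:#[$a.$r]'      # single quotes: the shell changes nothing
printf '%s\n' "@:#[\$a.\$r]"    # double quotes: each $ must be escaped
printf '%s\n' "@:#[$a.$r]"      # double quotes, unescaped: $a and $r are expanded by the shell
```

The first two forms deliver the same literal template to the program. In the third, the shell substitutes its own variables a and r before the program ever sees the argument.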

O Kam

Nov 19, 2018, 9:31:39 AM
Thank you, that could be the issue: the earlier runs were executed locally (Linux), whereas this error occurs when running on a server that uses a Unix system. I'll check the quoting scheme.

O Kam

Nov 25, 2018, 9:09:47 PM
Has anyone else experienced issues with special characters? I have tried backslash-escaping all potentially reserved/special characters, and also double-quoting, but I keep getting the same error.

O Kam

Christopher Chang

Nov 25, 2018, 11:22:10 PM
plink does not support spaces/tabs in command-line arguments, but it shouldn't have a problem with other UTF-8 encoded characters. Can you clarify what problem you're having, and what shell you're using?

O Kam

Nov 26, 2018, 12:16:12 AM
The shell is bash, and the error is "Error: Invalid --score parameter '  '.", which occurs with the "--set-allvar-ids @:#[\$a.\$r]" flag when running the scoring routine.

Christopher Chang

Nov 26, 2018, 11:34:33 AM
Can you post a screenshot of the *input* command line, as well as what makes it into the log, since the problem here occurs between the two?

O Kam

Nov 26, 2018, 7:10:15 PM
Sure. Here are the script that runs plink, the plink log, and the server log.
plink_serverlog2018-11-27.JPG
plink_log2018-11-27.JPG
plink_script2_2018-11-27.JPG

Christopher Chang

Nov 26, 2018, 10:43:32 PM
Is there a space after the trailing '\' on the --score line? That backslash appears to be a different color in your editor, and adding that extra space reproduces the problem on my end.
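
To see why a single space after the trailing backslash matters, here is a small sketch. show_args is a hypothetical helper, not part of plink2: it prints each argument it receives in brackets. The broken command is built with printf so that the trailing space survives copy and paste.

```shell
# Print each received argument in brackets so stray whitespace is visible.
show_args() { for arg in "$@"; do printf '[%s]' "$arg"; done; printf '\n'; }

# Backslash directly before the newline: the command continues on the next line.
show_args --score scorefilename list-variants \
  --out outfilename

# Backslash followed by a space: the shell sees an escaped space, not a continuation.
broken=$(printf 'show_args --score scorefilename list-variants \\ ')
eval "$broken"
```

The first call receives all five arguments. The second receives a lone space as its final argument and ends there, which is consistent with the blank parameter quoted in the --score error. In a real script, the lines after the broken continuation would then be run as a separate command instead of being passed as flags.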

O Kam

Nov 27, 2018, 6:42:38 AM
Yes, there was!!! I corrected it, and it is now running and seems to read all of the flags correctly!!! Huge respect for the help and for noticing it. I didn't notice it, and also didn't realise that it would make all the difference!
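
Because this kind of whitespace is invisible in most editors, a quick check with grep can help. The sketch below writes a small stand-in script to a temporary file (the real script from the screenshots is not available) and reports every line where a backslash is followed by whitespace at the end of the line.

```shell
# Build a small example script; the trailing space on the second line is
# written inside quotes so it is not lost when this snippet is copied.
tmp=$(mktemp)
printf '%s\n' './plink2 \' '  --score scorefilename list-variants \ ' '  --out outfilename' > "$tmp"

# Report each line where a backslash is followed by whitespace at end of line.
grep -nE '\\[[:space:]]+$' "$tmp"

rm -f "$tmp"
```

Only line 2 is reported here. Since [[:space:]] also matches a carriage return, the same pattern flags continuation lines in scripts saved with Windows-style line endings.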

O Kam

Nov 27, 2018, 9:56:11 PM
The run finally finished, and all flags now seem to be read correctly. But I still get either the same error, "Error: no variants loaded from -temporary.pvar" (as in the first screenshot above, from Nov 19), or "Error: malformed .pgen file" (with INFO filter thresholds 1 and 0.9). That is, plink fails either while loading variants from temporary.pvar or while calculating allele frequencies. I thought the cause might be the input files or insufficient memory, but the same run without the INFO and chromosome filtering was successful, so that is probably not it...

Christopher Chang

Nov 27, 2018, 11:16:06 PM
“Malformed .pgen file” here means there’s a plink2 bug. Can you post a test dataset for me to replicate the problem with?

O Kam

Nov 28, 2018, 10:33:19 AM
I tested it with a smaller per-chromosome file (a single chromosome), and it seemed to run OK. The issue seems to occur with the larger genome-wide (GWA) genotype file, which was merged from the per-chromosome files.

Christopher Chang

Nov 28, 2018, 2:36:44 PM
Can you post the .log file of the failed run, so I can generate a dataset with the same dimensions and see if that's enough to replicate the bug?

O Kam

Nov 28, 2018, 4:40:15 PM
Sure. Here is a capture from the log file: 
slurmlog_error_log.JPG

Christopher Chang

Nov 29, 2018, 3:20:38 AM
Okay, the dataset dimensions were insufficient to replicate the crash on my end; I will probably need to send you a series of debug builds to get to the bottom of this.

To start, can you try splitting this run into three components:
1. plink2 --vcf merged.vcf.gz dosage=DS --not-chr X --out intermediate
intermission. plink2 --pfile intermediate --validate: does this fail?  If yes, we know the problem is with VCF import, and don't need to worry about the rest.  Check if adding "--threads 1" to the first step causes the problem to disappear.
2. plink2 --pfile intermediate --exclude-if-info "R2<=0.9" --freq: is this enough to trigger the crash?  If yes, does it crash at the same place each time?  Does the crash go away if "--threads 1" is added?
3. If --freq + --exclude-if-info wasn't enough to trigger the crash, check if your original --exclude-if-info + --set-all-var-ids + --rm-dup + --score triggers the crash again; if yes, check if the crash goes away with "--threads 1".
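
The first steps of this split could be scripted roughly as follows. This is only a sketch under stated assumptions: the run wrapper and the RUN variable are hypothetical conveniences, and the plink2 commands and file names are exactly the ones listed above. By default the script only prints the commands; setting RUN=1 executes them, assuming plink2 is on the PATH and merged.vcf.gz exists.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set RUN=1 in the environment to actually run plink2.
run() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Step 1: VCF import only.
run plink2 --vcf merged.vcf.gz dosage=DS --not-chr X --out intermediate

# Intermission: validate the imported fileset; stop here if it fails,
# since that would point at VCF import.
run plink2 --pfile intermediate --validate || exit 1

# Step 2: INFO filter plus allele frequencies.
run plink2 --pfile intermediate --exclude-if-info "R2<=0.9" --freq

# Step 2 again, single-threaded, to see whether the failure is thread-related.
run plink2 --pfile intermediate --exclude-if-info "R2<=0.9" --freq --threads 1
```

Keeping each stage as its own command makes it easier to tell which one fails and whether adding "--threads 1" changes the outcome.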

O Kam

Nov 29, 2018, 7:06:19 AM
Thank you! I'll try that now.

O Kam

Dec 2, 2018, 7:43:47 PM
I tried the first step, but the run didn't pass validation, and adding "--threads 1" didn't help: "Error: Invalid unconditional phased-dosages for (0-based) variant #3."

O Kam

Dec 3, 2018, 10:20:46 AM
So based on the first step, it seems like the problem is with VCF import. If so, is there any way to fix or debug the VCF import?

Christopher Chang

Dec 3, 2018, 10:41:46 AM
Yes, I'll try to send you a debug build later today.

O Kam

Dec 4, 2018, 9:28:12 PM
Great, thank you!

Christopher Chang

Dec 5, 2018, 11:50:45 AM
Linux 64-bit debug build is now posted to http://s3.amazonaws.com/plink2-assets/plink2_debug_20181205.zip ; try rerunning the first step with the --debug flag added. Sorry about the delay.

O Kam

Dec 7, 2018, 7:28:29 PM
Thank you! It looks like the run still doesn't go through (and adding "--threads 1" to the first step doesn't help either). The process yields the same error in the validation phase: "Error: Invalid unconditional phased-dosages for (0-based) variant #3." However, the first step seems to give some additional information between VCF scanning and conversion (screenshot attached), but I still don't understand what is wrong with the VCF reading.
plinklog_slurm.JPG
plinklog_slurm2.JPG

Christopher Chang

Dec 7, 2018, 9:21:25 PM
Is it possible for you to provide a complete version of the second screenshot/log?  I have a guess as to where the problem is (import of dosages associated with 0|1 or 1|0 phased hardcalls), but the additional details in the rest of the debug log should help narrow this down further.

O Kam

Dec 7, 2018, 9:41:38 PM
Sure. Attached is the full report in two parts. 
plinklog_slurm2full_pt1.JPG
plinklog_slurm2full_pt2.JPG

Christopher Chang

Dec 9, 2018, 4:46:38 PM
Thanks, fixed one bug which may or may not be the critical one. New debug build is at http://s3.amazonaws.com/plink2-assets/plink2_debug_20181209.zip .

O Kam

Dec 9, 2018, 9:27:21 PM
Great, thank you! I tried running it with thinned files, and so far it looks promising (the runs completed without errors and the output files look OK). I'm now setting it to run with the full genome-wide file.

O Kam

Dec 10, 2018, 9:01:57 PM
OK, the runs with the genome-wide files have finished. The first two steps went perfectly: the intermediate files were created and validated successfully. However, when I ran --freq + --exclude-if-info, it errored out ("Error: Malformed .pgen file."). I'm now rerunning from the pfile with the same flags plus "--debug" and "--threads 1" to see whether that helps and whether it crashes at the same place.

O Kam

Dec 10, 2018, 9:32:18 PM
The "--threads 1" (or "--debug") flag seems to work: the run went through successfully, so I'm now moving on to the next steps.

Christopher Chang

Dec 10, 2018, 10:46:55 PM
I’m primarily interested in what’s still broken; so, to clarify, if the VCF import is multithreaded, but you then run --freq + --exclude-if-info in single-threaded mode, does that crash? Or does --freq + --exclude-if-info only fail when multithreaded?

O Kam

Dec 10, 2018, 10:56:01 PM
In single-threaded mode, --freq + --exclude-if-info runs successfully; i.e., it crashes only when multithreaded. Also, in single-threaded mode it seems able to get through all of the later steps without crashing (though I am still checking all the outputs to verify that).

Christopher Chang

Dec 11, 2018, 5:59:47 PM
Ok, updated debug build intended to be used with the crashing multithreaded --freq + --exclude-if-info run is posted to http://s3.amazonaws.com/plink2-assets/plink2_debug_20181211.zip .

O Kam

Dec 11, 2018, 6:15:16 PM
Great, thank you!!! So this is a different build in which the multithreaded crash issue has been addressed? I'll test it.

Christopher Chang

Dec 11, 2018, 6:16:25 PM
No, this only adds debug-prints to try to get more information about the problem. I'm particularly interested in whether this is failing at the same variant each time, or if it randomly varies between runs.

O Kam

Dec 11, 2018, 6:20:11 PM
OK, I'll run it a couple of times in multithreaded mode with the existing intermediate file.

O Kam

Dec 26, 2018, 7:25:37 PM
I checked it in multithreaded mode from the intermediate file, and it looks like it fails at the same place each time (at 43%) with the same error ("Error: Malformed .pgen file."). With "--threads 1" it runs smoothly to the end and produces good results.

O Kam

Dec 26, 2018, 9:27:04 PM
Here is a screenshot of the error when run with the "--debug" flag.
CapturePlink3.JPG

O Kam

Dec 26, 2018, 9:29:23 PM
I ran it 5 times in total from the same intermediate file, and each time it gave an identical error report.

Christopher Chang

Dec 27, 2018, 10:27:28 PM
Okay, this is useful. I will try to post either an updated debug build or a bugfix tomorrow.

Christopher Chang

Dec 30, 2018, 1:51:49 PM

O Kam

Jan 1, 2019, 7:18:38 PM
Thank you! Is the debug build also supposed to work with VCFv4.2-formatted files, or only with VCFv4.3? I noticed that with a VCF file from another source, which is in VCFv4.2 format, the build still produces the same error (malformed .pgen file), even with just the --freq flag.

Christopher Chang

Jan 1, 2019, 8:41:23 PM
Debug build just has additional logging; can you post the new .log file?

VCF 4.2 vs. 4.3 shouldn't make a difference.

O Kam

Jan 1, 2019, 9:21:59 PM
Ok, .log file attached. 
outs.log

Christopher Chang

Jan 1, 2019, 9:30:11 PM
This is a different input file and machine; I had tweaked that debug build for the exact same context as your previous run.

O Kam

Jan 1, 2019, 10:21:51 PM
Yes, both the input file and the machine are different: this is another project running outside the cluster. (I'll do a couple of runs on the previous setup too, to test it.)

Christopher Chang

Jan 1, 2019, 11:21:58 PM
With that said, if you can ever send me an input file to replicate the crash with, I’ll probably be able to fix the bug within a day.

O Kam

Jan 2, 2019, 8:03:08 AM
OK, I'll try to make a minimal genotype input file that behaves the same way. I tried thinning or subsetting the GWA VCF, but it crashes with the same error in the process, so I have to come up with some other way...

Christopher Chang

Jan 2, 2019, 11:54:26 AM
Can you keep just the first e.g. 70000 lines of the single-sample VCF file? "cat 1.vcf | head -n 70000 > top_70k.vcf"

O Kam

Jan 2, 2019, 6:56:23 PM
It didn't fit within the file size limit, so I sent it via email.

Christopher Chang

Jan 2, 2019, 10:23:01 PM
Thanks, that was enough to replicate the crash, working on a fix now.

Christopher Chang

Jan 3, 2019, 12:20:57 AM
Bugfix is posted on the main website. Not sure whether this is the same issue you previously encountered, though, so the new binary still has debug logging aimed at your previous crashing run; send me the log if it still fails.

O Kam

Jan 3, 2019, 6:59:49 AM
Great! Thank you!