how to set pseudo-haploid and diploid mode in combination

Wanbo Li

Aug 14, 2018, 8:45:36 AM
to STITCH imputation
Hello Robert,

In your Nature Genetics paper, you wrote that "The first 38 rounds were in the faster pseudo-haploid mode and the final 2 rounds were in the slower but more accurate diploid mode" when analyzing the CONVERGE data. I am wondering how to set pseudo-haploid and diploid mode in combination. Could you give us an example?

Best,
Wanbo 

Robbie Davies

Aug 16, 2018, 11:02:51 AM
to li.wan...@gmail.com, STITCH imputation
Hey,

Thanks for the email. To achieve that functionality, set the method to "pseudoHaploid" and use the switchModelIteration variable. For the example above, one would set the value to 39, meaning the model switches at the start of the 39th iteration, so STITCH runs 38 of the faster pseudoHaploid iterations followed by 2 of the slower but more accurate diploid ones.

An example can be found here (latest commit 70406e7638ab096fdafafd8066ca89a64bd276b7)
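
For reference, a minimal sketch of what such a call might look like. It sets niterations to 40 explicitly so that the 38 + 2 split is visible; the chromosome name, file paths, K, nGen and nCores are placeholders for illustration, not values from the paper or from the linked example.

```r
library("STITCH")

niterations <- 40
switchModelIteration <- 39

# Iterations 1..38 run as pseudoHaploid, iterations 39..40 run as diploid
n_pseudoHaploid <- switchModelIteration - 1
n_diploid <- niterations - switchModelIteration + 1
stopifnot(n_pseudoHaploid == 38, n_diploid == 2)

STITCH(
    chr = "chr19",                  # placeholder chromosome name
    posfile = "pos.txt",            # placeholder path
    bamlist = "bamlist.txt",        # placeholder path
    outputdir = "./stitch_out/",    # placeholder path
    K = 4,                          # placeholder
    nGen = 100,                     # placeholder
    nCores = 1,
    method = "pseudoHaploid",
    niterations = niterations,
    switchModelIteration = switchModelIteration
)
```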

Best,
Robbie


Ryan Franckowiak

Feb 3, 2021, 2:09:49 PM
to STITCH imputation
Hi Robbie,

I am receiving an error message after switching STITCH to the "pseudoHaploid" method with the "switchModelIteration" variable set so that the last few iterations use the "diploid" method. More specifically, when I ran the analysis using the "diploid" method, it ran to completion. But when I repeated the exact same run with the "pseudoHaploid/diploid" approach, I received the following error message:

[2021-02-03 13:14:34] Start of iteration 26 
[2021-02-03 13:14:34] Error in forwardBackwardHaploid(sampleReads = sampleReads, eHapsCurrent_tc = eHapsCurrent_tc, : 
Mat::operator(): index out of bounds 
Error in check_mclapply_OK(single_iteration_results) : An error occured during STITCH. The first such error is above
In addition: Warning message:
In mclapply(sampleRanges, mc.cores = nCores, FUN = subset_of_complete_iteration, : all scheduled cores encountered errors in user code

The error always occurs at the 26th iteration regardless of what parameter settings I used. Below is the code I'm using:

STITCH(method = "pseudoHaploid",
       posfile = shad_posfile,
       bamlist = shad_bamlist,
       nCores = n_cores,
       nGen = shad_nGen,
       chr = shad_chr,
       K = shad_K,
       S = shad_S,
       use_bx_tag = TRUE,
       bxTagUpperLimit = 50000,
       switchModelIteration = 38,
       tempdir = tempdir,
       outputdir = outputdir)

I would appreciate any advice you could provide.

Thank you,
Ryan

Nate Edelman

Mar 1, 2022, 9:08:18 AM
to STITCH imputation
Hi! Did you ever figure this out? I have the same error.

Thanks
Nate

zhiqiang chen

Apr 14, 2023, 12:22:59 PM
to STITCH imputation
Did you solve the error? I also have the same error. 

Cheers
Zhiqiang

Robert Davies

Apr 17, 2023, 4:39:36 AM
to STITCH imputation
Hi,

Sorry I missed these previous messages. 

This is likely a bug in STITCH. By default, on the 25th iteration, STITCH tries to find really long reads that would be better modelled if they were split in two, reflecting heuristic difficulties or true recombinations that are otherwise hard to model. That would fit with the error showing up at the start of iteration 26. However, I have seen errors like this before and haven't had the time or data to properly debug them.

My recommendation is to turn this feature off by setting, for instance, splitReadIterations to -1.
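
Applied to the call posted earlier in the thread, the workaround might look like the sketch below. It assumes that the shad_* variables, n_cores, tempdir and outputdir are already defined as in that run; the only change is the added splitReadIterations argument.

```r
library("STITCH")

# Same call as the failing run, with read splitting turned off.
# Assumes shad_posfile, shad_bamlist, n_cores, shad_nGen, shad_chr,
# shad_K, shad_S, tempdir and outputdir exist in the session.
STITCH(
    method = "pseudoHaploid",
    posfile = shad_posfile,
    bamlist = shad_bamlist,
    nCores = n_cores,
    nGen = shad_nGen,
    chr = shad_chr,
    K = shad_K,
    S = shad_S,
    use_bx_tag = TRUE,
    bxTagUpperLimit = 50000,
    switchModelIteration = 38,
    splitReadIterations = -1,   # disables the read splitting done on iteration 25 by default
    tempdir = tempdir,
    outputdir = outputdir
)
```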

I find it a bit peculiar that it seems to crop up more with pseudoHaploid than with diploid. In general, since publishing the paper back in 2016, I've become less enamoured with the statistical model that pseudoHaploid uses. I think I would generally recommend just using the STITCH diploid model. One day I hope to write a merged version of QUILT and STITCH which would have a better statistical model, though I don't know when that might be.

Best,
Robbie

