support for PacBio reads?

Carlos Cano

Feb 9, 2013, 6:49:09 AM
to mosaik-...@googlegroups.com
Hello, 

I wonder whether MOSAIK can handle PacBio reads, although this technology is not explicitly supported (not currently listed as an option for the -st parameter in MosaikBuild). If not, are you planning to include support for PacBio in future releases? 

 Thanks, 

 -Carlos  

Wan-Ping Lee

Feb 11, 2013, 5:43:06 PM
to mosaik-...@googlegroups.com
Hi Carlos,

I have indeed tested MOSAIK on PacBio V. cholerae reads. The program runs; however, I needed to use non-default parameters, because otherwise the alignment rate is very low. I used the parameter set "-hs 10 -mmp 0.5 -act 15" instead of the default values "-hs 15 -mmp 0.15 -act 55".
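
To make the comparison concrete, here is a minimal Python sketch that lays out the two parameter sets as MosaikAligner argument lists. Only the flags -hs, -mmp and -act and their values come from this thread; the helper name aligner_args and the dictionary names are made up for illustration, and the input, output and reference arguments are deliberately left out.

```python
# Parameter sets quoted in this thread. Per the MosaikAligner startup log,
# -hs is the hash size, -mmp the maximum mismatch percent threshold and
# -act the alignment candidate threshold.
DEFAULT_PARAMS = {"-hs": "15", "-mmp": "0.15", "-act": "55"}
PACBIO_PARAMS = {"-hs": "10", "-mmp": "0.5", "-act": "15"}


def aligner_args(params):
    """Return a MosaikAligner argument list for the given flag/value pairs.

    Hypothetical helper: it only lays out the tuning flags. The required
    arguments (-in, -out, -ia, ...) still have to be appended by the caller.
    """
    args = ["MosaikAligner"]
    for flag, value in params.items():
        args.extend([flag, value])
    return args


print(" ".join(aligner_args(PACBIO_PARAMS)))
```

Keeping both sets side by side makes it easy to switch back to the defaults when aligning short reads.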

Best,
Wan-Ping


 -Carlos  

--
You received this message because you are subscribed to the Google Groups "mosaik-aligner" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mosaik-aligne...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Wan-Ping Lee
Postdoctoral Research Associate
Department of Biology, Boston College

Tel: +1 (617) 552-2922
wanpi...@bc.edu

Wan-Ping Lee

Feb 13, 2013, 12:39:08 PM
to mosaik-...@googlegroups.com
It seems that I forgot to answer the question about MosaikBuild. I used "-st 454" for PacBio datasets.

Best,
Wan-Ping


Carmen Navarro

Feb 25, 2013, 5:14:46 AM
to mosaik-...@googlegroups.com, wanpi...@bc.edu
Hi,

I have been trying the configuration you mentioned and the results are good. However, I wonder whether there is a way to improve the execution speed (e.g. a parameter configuration geared toward efficiency). For shorter reads Mosaik is very fast. I have tried running the software with several processors, but I get the same execution time with 1, 2, 4, 8 and 12 processors. Maybe the parallelization is not compatible with the length or error rate of PacBio reads?
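
One way to quantify this kind of observation is to compute speedup and parallel efficiency from wall-clock times. The Python sketch below is only an illustration: the function name parallel_efficiency is made up, and the timings are placeholder values standing in for a flat scaling curve, not measurements from this thread.

```python
def parallel_efficiency(timings):
    """Map processor count -> (speedup, efficiency) relative to the 1-processor run.

    timings: dict of processor count -> wall-clock seconds; must contain key 1.
    """
    base = timings[1]
    return {p: (base / t, base / t / p) for p, t in sorted(timings.items())}


# Placeholder timings (seconds) illustrating "same time for every processor count".
flat = {1: 100.0, 2: 100.0, 4: 100.0}
for procs, (speedup, eff) in parallel_efficiency(flat).items():
    print(f"{procs} processors: speedup {speedup:.2f}, efficiency {eff:.2f}")
```

A speedup that stays near 1 while efficiency drops as 1/p suggests that the run is dominated by a part that does not scale with the number of processors.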

Thanks in advance.

Carmen

Wan-Ping Lee

Feb 25, 2013, 5:25:47 PM
to mosaik-...@googlegroups.com
Hi,

I know that this parameter setting is very slow; I have the same issue here. I think the reason is -act 15, which means that any hash region larger than 15 bp triggers a Smith-Waterman (SW) alignment. You may try a larger value for -act to speed things up.
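
The effect of -act can be pictured with a toy model in Python. This is not MOSAIK's code: the function name, the strict "larger than" comparison (taken from the wording above) and the region lengths are all assumptions made for illustration.

```python
def regions_sent_to_sw(region_lengths, act):
    """Return the hash-region lengths (in bp) that would trigger Smith-Waterman.

    Toy model: a region qualifies when it is strictly larger than the
    alignment candidate threshold `act`.
    """
    return [length for length in region_lengths if length > act]


# Hypothetical hash-region lengths for one read.
lengths = [12, 18, 30, 60, 70]
print(len(regions_sent_to_sw(lengths, 15)), "SW calls with -act 15")
print(len(regions_sent_to_sw(lengths, 55)), "SW calls with -act 55")
```

In this model a lower threshold lets more short regions through, so more SW alignments are run per read, which is where the time goes.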

Ongoing development replaces the current banded SW with a SIMD SW. Initial tests show a two-fold speedup. However, I have not finished or released it yet.
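
For readers unfamiliar with the term, a banded Smith-Waterman fills only the cells of the dynamic-programming matrix that lie within a fixed distance of the main diagonal. The Python sketch below is a textbook-style illustration, not MOSAIK's implementation: the linear gap penalty, the scoring values and the function name are assumptions chosen for brevity.

```python
def banded_sw_score(a, b, band, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a vs b using only cells with |i - j| <= band."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        lo = max(1, i - band)        # first column inside the band
        hi = min(len(b), i + band)   # last column inside the band
        for j in range(lo, hi + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap       # gap in b
            left = curr[j - 1] + gap # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


# The match "CGT" lies 4 cells off the main diagonal, so a narrow band misses it.
print(banded_sw_score("AAAACGT", "CGT", band=1))
print(banded_sw_score("AAAACGT", "CGT", band=4))
```

Cells outside the band keep a score of 0, which in local alignment is equivalent to starting a fresh alignment, so the band only limits how far an alignment may drift from the diagonal. The work per read grows with the band width, which is one reason long reads are expensive.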


Best,
Wan-Ping

Huang Ke

Jun 11, 2015, 10:56:26 AM
to mosaik-...@googlegroups.com, wanpi...@bc.edu
Hi Wan-Ping,

I wanted to use Mosaik to align PacBio reads. I was excited to find this post, but I ran into a problem. When I ran MosaikAligner with the command "MosaikAligner -p 7 -in pacbio_4571661.mkb -out pacbio_4571661.mka -ia genome.dat -j genome_10 -hs 10 -mmp 0.5 -act 15 -annpe 2.1.78.pe.ann -annse 2.1.78.se.ann", it stopped with an error message:

MosaikAligner 2.2.26                                                 2014-03-28
Wan-Ping Lee & Michael Stromberg  Marth Lab, Boston College Biology Department
------------------------------------------------------------------------------

- Using the following alignment algorithm: all positions
- Using the following alignment mode: aligning reads to all possible locations
- Using a maximum mismatch percent threshold of 0.5
- Using a hash size of 10
- Using 7 processors
- Using a Smith-Waterman bandwidth of 3265
- Using an alignment candidate threshold of 15bp.
- Setting hash position threshold to 200
- Using a jump database for hashing. Storing keys & positions in memory.
- Using a homo-polymer gap open penalty of 4
- loading reference sequence... finished.
- loading jump key database into memory... finished.
- loading jump positions database into memory... finished.

Aligning read library (40908):

 0% [                                     ]
17% [=====>                               ]
17% [=====>                               ]   2,294.5 reads/s       ETA 14 s
17% [=====>                               ]   1,502.4 reads/s       ETA 22 s
17% [=====>                               ]   1,105.6 reads/s       ETA 30 s
17% [=====>                               ]     867.0 reads/s       ETA 39 s
17% [=====>                               ]     707.3 reads/s       ETA 47 s
17% [=====>                               ]     592.8 reads/s       ETA 57 s
Alignment score and position are not consensus.

Do you have any idea what causes this error? Is there a preprocessing step I need to run before starting the program?

Thanks

Ke