I'm preparing TBA and revised multiz alignments. To build the
multiple alignment, I first need to create pairwise alignments, and
I'm using lastz from Penn State for that. A few of the fly pairs run
for a really long time: droVir3 vs. droWil1, droBip vs. droAna3,
dp4 vs. droPer1, and droSim1 vs. droSec1. These pairs have
run for many days. I contacted Bob Harris for lastz parameters that
would speed things up, but the new processes have now run for several
days and I don't know when they will finish. Do you know of any groups
that use lastz and have finished running these pairs? I may need more time.
Thanks,
Minmei
--
Minmei Hou, Ph.D.
Assistant Professor
Department of Computer Science
Northern Illinois University
I know that Cactus uses lastz, as does UCSC's Multiz pipeline; maybe those guys can chime in here with tips. I'm not sure whether team Cactus has finished the flies, but I believe team Multiz has.
I'll chat with you off-list about a time extension. For everyone: limited extensions are possible; please contact me if you're not going to make the deadline.
d
We have noticed that there are some unmasked repeats in the flies that
have slowed down our Cactus pipeline. I can't say for sure that this
is your problem, but the self-masking option in the newest version of
lastz has helped us out considerably:
http://www.bx.psu.edu/~rsharris/lastz/newer/
http://www.bx.psu.edu/~rsharris/lastz/newer/README.lastz-1.03.02.html#adv_selfmasking
cheers
-Glenn
--Minmei
The self-masking process as described in the lastz README is simplified to make the description readable. I've sent a message to Minmei describing speedup details, mainly using parallelization, but with some other ideas too.
For others who may be interested in those details, I'll post them in a message to the lastz mailing list later today.
Bob H
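The speedup details themselves are not reproduced in this thread. As a generic illustration of the parallelization idea only, here is a minimal Python sketch that splits one pairwise alignment into independent jobs and runs them concurrently. It assumes the target has already been written out as one FASTA file per sequence and the query as one file per batch; the helper names (pack_batches, make_jobs, run_job, run_all) are hypothetical, and the plain "lastz target query" call stands in for whatever options a real run would use. It is not Bob's recipe.

```python
import itertools
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pack_batches(lengths, max_bases):
    """Group sequence names into batches of at most max_bases bases each.

    A sequence longer than max_bases ends up in a batch of its own.
    """
    batches, current, total = [], [], 0
    # Largest first, so long chromosomes are placed before small scaffolds.
    for name, length in sorted(lengths.items(), key=lambda kv: -kv[1]):
        if current and total + length > max_bases:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += length
    if current:
        batches.append(current)
    return batches


def make_jobs(target_files, query_files):
    """Every (target file, query file) pair is one independent job."""
    return list(itertools.product(target_files, query_files))


def run_job(job, out_dir):
    """Run one alignment job and save whatever lastz prints to stdout."""
    target, query = job
    out_path = Path(out_dir) / f"{Path(target).stem}__{Path(query).stem}.out"
    with open(out_path, "w") as out:
        # Assumption: bare invocation; add scoring/format options as needed.
        subprocess.run(["lastz", str(target), str(query)], stdout=out, check=True)
    return out_path


def run_all(jobs, out_dir, workers=4):
    """Run jobs concurrently; threads suffice because lastz does the work."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: run_job(job, out_dir), jobs))
```

Splitting by whole sequences is safe here because each pairwise alignment involves a single target sequence and a single query sequence, so the per-job outputs can simply be concatenated afterwards. On a cluster, the same job list could be handed to the scheduler instead of run_all.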
Perfect question for today! :) I think the easiest way to do the submission will be for teams to compress their submission, place it on the web, and then email me directly (off-list) both the link and an md5 (or sha1) checksum. If anyone is unable to place an archive on the web, contact me and we'll work out an alternative (dropbox, ftp, etc.). Eventually the data sets will be made public, but not immediately.
The form of the short write-up portion of the submission is still being composed; I'll ping the list and the submitters with it when it is complete.
d
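For the checksum requested above, one way to compute it is with Python's standard hashlib module. This is only a sketch: the function name file_digest and the archive name used in the note below are placeholders, and command-line tools such as md5sum or sha1sum would serve equally well.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

For example, file_digest("submission.tar.gz") would give the md5 of a (hypothetical) archive by that name, and file_digest("submission.tar.gz", "sha1") the sha1. The recipient can recompute the digest after download and compare the two strings to confirm the archive arrived intact.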