Good morning!
Living on the genomewiki are three excellent pages regarding whole genome alignments:
Together, they form a nice blueprint for how to create your own set of MAFs and liftOver chain files.
However, there is a subtle difference between the first link and the second two links, and I am wondering whether it is significant.
On the first page (as on
http://genomewiki.ucsc.edu/images/9/93/RunLastzChain_sh.txt) the chains are processed (and presumably sorted?) together by first converting each .psl file into a .chain file and then passing all of these chain files to chainMergeSort. The resulting all.chain file can then be further processed towards MAFs or liftOver chains.
On the second two pages (most relevantly the third), the output of chainMergeSort is passed to chainSplit. The resulting split chains are then concatenated with 'cat' and sorted with chainSort (despite the warning that chainSort is not suitable for large sets).
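To make the comparison concrete, here is a toy sketch of what I understand the two routes to do to the ordering of chains. It is only a model under my own assumptions: chains are reduced to (score, id) pairs with distinct scores, the function names are hypothetical, the split is an arbitrary round-robin rather than whatever chainSplit really does, and none of this is the actual implementation of the kent tools.

```python
import heapq

# Toy model: a "chain" is just (score, chain_id). Real chain records carry
# far more information; this only illustrates the ordering question.

def sort_chains(chains):
    # Full sort, highest score first.
    return sorted(chains, key=lambda c: -c[0])

def merge_sorted(per_file_chains):
    # Route 1: k-way merge of inputs that are each already sorted by score.
    return list(heapq.merge(*per_file_chains, key=lambda c: -c[0]))

def split_cat_sort(merged, n_buckets=3):
    # Route 2: split the merged chains into buckets (arbitrary round-robin
    # here, an assumption), concatenate the buckets, then do a full sort.
    buckets = [merged[i::n_buckets] for i in range(n_buckets)]
    concatenated = [c for bucket in buckets for c in bucket]
    return sort_chains(concatenated)

# Hypothetical per-file chains with distinct scores.
per_file = [
    sort_chains([(5000, "a1"), (1200, "a2"), (300, "a3")]),
    sort_chains([(9000, "b1"), (700, "b2")]),
    sort_chains([(4100, "c1"), (2500, "c2"), (150, "c3")]),
]

route1 = merge_sorted(per_file)
route2 = split_cat_sort(route1)
print(route1 == route2)
```

In this toy model both routes end up with the same score-ordered list, which is why I suspect the extra split, cat and sort steps do not change the final ordering. Whether that also holds for the real tools (tied scores, chain ID numbering, memory limits) is exactly what I am unsure about.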
Looking at my output, the first approach seems to work just fine. Nonetheless, I'd like to check before making final use of my newly constructed alignment files. Perhaps the difference arises because the second two links are aimed at different versions of a genome rather than at alignments between species?
Many thanks,
-- David
-------------------------------------------------------------------------------------
David Garfield, PhD
Furlong Group
European Molecular Biology Laboratory (EMBL)
Telephone +49 6221 387 8426
Snail Meyerhofstraße 1
D-69012 Heidelberg
Germany