A(nother) question regarding the construction of MAF files


David Garfield

May 9, 2014, 5:17:19 AM
to gen...@soe.ucsc.edu
Good morning, 

I have what I hope is a very basic question. I'm attempting to run phastCons on a subset of the melanogaster subgroup.
As input to this program, I've started with the .net.axt.gz files from the Browser, which were generated by the following method:

"Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. The pairwise alignments were then multiply aligned using the multiz program"

What I assume is happening here is that the .net.axt.gz files (the pairwise alignments?) are being converted to MAF files (axtToMaf) and then these are being fed to multiz (or, rather, autoMZ, which is now Roast?). In digging around some of the UCSC scripts, it is clear that autoMZ is looking for single coverage files (sing.maf). Roast would certainly like such files. 

My question is quite simple: in generating your MAF files, is there an additional step of projecting to single coverage (e.g. with single_cov2)?


Many thanks as always!

David

----




-------------------------------------------------------------------------------------
David Garfield, PhD
Furlong Group
European Molecular Biology Laboratory (EMBL)

Telephone    +49 6221 387 8426
Fax                 +49 6221 387 166
Snail Meyerhofstraße 1
D-69012 Heidelberg
Germany





Jonathan Casper

May 9, 2014, 7:18:24 PM
to David Garfield, gen...@soe.ucsc.edu

Hello David,

Thank you for your question about converting our .net.axt.gz files to MAF format. We start the alignment process by building a number of individual smaller alignments (chains), and then combining them into one larger single-coverage alignment (the net). This means that the .net.axt.gz files are already single coverage, so the MAF files generated by axtToMaf are also automatically single-coverage without needing an additional step. If you are interested, more information about chains and nets is available on our wiki at http://genomewiki.ucsc.edu/index.php/Chains_Nets, as well as on the description page of many Chain/Net tracks like this one: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=primateChainNet.
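For readers who want to confirm the single-coverage property on their own files before running multiz, the following is a minimal sketch, not part of the UCSC pipeline. It assumes the standard MAF layout, in which the first "s" line of each "a" block is the reference sequence with fields src, start (0-based), size, strand and srcSize. The function names are hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def reference_intervals(maf_lines):
    """Yield (src, start, end) on the + strand for the first 's' line of each block."""
    want_ref = False
    for line in maf_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            # A new alignment block starts; the next 's' line is the reference.
            want_ref = True
        elif fields[0] == "s" and want_ref:
            src = fields[1]
            start, size = int(fields[2]), int(fields[3])
            strand, src_size = fields[4], int(fields[5])
            if strand == "-":
                # MAF minus-strand starts are relative to the reverse complement.
                start = src_size - start - size
            yield src, start, start + size
            want_ref = False


def find_overlaps(maf_lines):
    """Return a list of (src, earlier_interval, later_interval) that overlap on the reference."""
    by_src = defaultdict(list)
    for src, start, end in reference_intervals(maf_lines):
        by_src[src].append((start, end))

    overlaps = []
    for src, intervals in by_src.items():
        intervals.sort()
        prev = intervals[0]
        for cur in intervals[1:]:
            if cur[0] < prev[1]:
                overlaps.append((src, prev, cur))
            if cur[1] > prev[1]:
                # Keep the interval that reaches furthest to the right.
                prev = cur
    return overlaps
```

An empty list from find_overlaps means no two blocks cover the same reference base, which is what single coverage requires. For a compressed file, the lines can be supplied with gzip.open(path, "rt") from the Python standard library.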

One of our engineers suggests that the following example of chain/net/MAF construction from the hg19/mm10 alignment might be helpful. Replace the path names with the local paths to your own chrom.sizes, 2bit and all.chain.gz files.

#!/bin/csh -efx

cd /hive/data/genomes/hg19/bed/lastzMm10.2012-03-07/axtChain

# Make nets ("noClass", i.e. without rmsk/class stats which are added later):
chainPreNet  hg19.mm10.all.chain.gz /scratch/data/hg19/chrom.sizes \
    /scratch/data/mm10/chrom.sizes stdout \
| chainNet  stdin -minSpace=1 /scratch/data/hg19/chrom.sizes \
    /scratch/data/mm10/chrom.sizes stdout /dev/null \
| netSyntenic stdin noClass.net

# Make liftOver chains:
netChainSubset -verbose=0 noClass.net hg19.mm10.all.chain.gz stdout \
| chainStitchId stdin stdout | gzip -c > hg19.mm10.over.chain.gz

# Make axtNet for download: one .axt per hg19 seq.
netSplit noClass.net net
cd ..
mkdir axtNet
foreach f (axtChain/net/*.net)
netToAxt $f axtChain/chain/$f:t:r.chain \
   /scratch/data/hg19/nib /scratch/data/mm10/nib stdout \
   | axtSort stdin stdout \
   | gzip -c > axtNet/$f:t:r.hg19.mm10.net.axt.gz
end

# Make mafNet for multiz: one .maf per hg19 seq.
mkdir mafNet
foreach f (axtNet/*.hg19.mm10.net.axt.gz)
   axtToMaf -tPrefix=hg19. -qPrefix=mm10. $f \
         /scratch/data/hg19/chrom.sizes /scratch/data/mm10/chrom.sizes \
         stdout \
   | gzip -c > mafNet/$f:t:r:r:r:r:r.maf.gz
end
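The output file names in the two foreach loops rely on csh variable modifiers: ":t" keeps only the last path component and each ":r" strips one trailing extension. As an illustration only, the sketch below approximates that behavior in Python using os.path. The helper names are hypothetical, and os.path.splitext may differ from csh in edge cases such as names that begin with a dot.

```python
import os.path


def csh_tail(path):
    """Approximate csh ':t': keep only the final path component."""
    return os.path.basename(path)


def csh_root(name, times=1):
    """Approximate csh ':r' applied `times` times: strip one extension per pass."""
    for _ in range(times):
        name = os.path.splitext(name)[0]
    return name


def axt_name(net_path):
    """Mirror axtNet/$f:t:r.hg19.mm10.net.axt.gz from the first loop."""
    return "axtNet/" + csh_root(csh_tail(net_path)) + ".hg19.mm10.net.axt.gz"


def maf_name(axt_path):
    """Mirror mafNet/$f:t:r:r:r:r:r.maf.gz from the second loop."""
    return "mafNet/" + csh_root(csh_tail(axt_path), times=5) + ".maf.gz"
```

Under these assumptions, axtChain/net/chr1.net maps to axtNet/chr1.hg19.mm10.net.axt.gz, and the five ":r" modifiers then remove .gz, .axt, .net, .mm10 and .hg19 in turn, leaving mafNet/chr1.maf.gz.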

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group


