canFam3 phastCons scores


Matt, Gavriel

Oct 3, 2019, 5:27:26 PM
to gen...@soe.ucsc.edu
To whom it may concern,

I would like to use conservation (phastCons) scores for the current dog (canFam3) genome assembly but I noticed that these are not available. I was wondering whether it would be possible to generate a multiple alignment of vertebrate genomes with the current dog (canFam3) genome assembly in order to generate these phastCons scores.

Thank you,
Gavriel Matt


Gavriel Matt
Postdoctoral Research Associate, Wang Lab
Washington University in St. Louis, Department of Genetics
4515 McKinley Avenue
Campus Box 8510, Room 5312
St. Louis, MO 63110


Daniel Schmelter

Oct 4, 2019, 7:36:00 PM
to Matt, Gavriel, gen...@soe.ucsc.edu

Hello Gavriel,

Thank you for your question about obtaining phastCons scores for the canFam3 genome.

If you want to build a vertebrate conservation track for canFam3 similar to the 7-species conservation track on human (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=cons7way), you will need to produce a multi-species MAF file as input to the phastCons program. Producing a multi-species MAF file is a somewhat complicated process; it is documented in broad terms here:

http://genomewiki.ucsc.edu/index.php/Conservation_Track#Multiple_Alignment
http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto

The multi-species MAF process begins with downloading or creating a pairwise MAF file for each of the species you are using. We have several of these files on our download server (http://hgdownload.soe.ucsc.edu/goldenPath/canFam3/), but if you want to use any additional species, the process for creating them is documented here:

http://genomewiki.ucsc.edu/index.php/DoBlastzChainNet.pl
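
Before combining pairwise files, it can help to confirm which assemblies each MAF actually contains. The following is a minimal Python sketch, not part of the UCSC pipeline: it assumes the usual MAF convention that each "s" line names its source as assembly.chrom (for example canFam3.chr1). The function name maf_assemblies and the two example blocks are made up for illustration.

```python
from collections import Counter


def maf_assemblies(lines):
    """Count MAF 's' (sequence) lines per assembly.

    Assumes source names follow the 'assembly.chrom' convention,
    e.g. 'canFam3.chr1'. Lines that are not 's' lines are ignored.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # An 's' line has 7 fields: s, src, start, size, strand, srcSize, text
        if len(fields) >= 7 and fields[0] == "s":
            assembly = fields[1].split(".", 1)[0]
            counts[assembly] += 1
    return counts


# Illustrative (made-up) pairwise MAF content with two alignment blocks.
example = """\
##maf version=1 scoring=blastz
a score=5000.0
s canFam3.chr1 100 10 + 122678785 ACGTACGTAC
s hg38.chr9    200 10 + 138394717 ACGTACGAAC

a score=4200.0
s canFam3.chr1 150 8 + 122678785 GGCATTCA
s hg38.chr9    250 8 + 138394717 GGCACTCA
""".splitlines()

counts = maf_assemblies(example)
print(dict(counts))
```

For a real file, the same function can be called on an open file handle (for gzipped downloads, one opened with gzip.open in text mode), since it only iterates over lines.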

Once you have your pairwise MAF files, you will need to run the MultiZ program to combine them into a multi-species MAF file. For a reference on how a 5-way MultiZ alignment was made (line 246) and how phastCons was run (line 703), see the makeDoc log of the process:

https://genome-source.gi.ucsc.edu/gitlist/kent.git/blob/master/src/hg/makeDb/doc/hg38/multiz5way.txt

After that process, you should have a multi-species MAF file to use as input to the phastCons program. You can generate the wig file yourself by following this guide from the phastCons authors:

http://compgen.cshl.edu/phast/phastCons-HOWTO.html
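
As a rough illustration of this last step, here is a hedged Python sketch that assembles a phastCons command line and reads back the fixedStep wiggle text that phastCons writes to standard output. The file names are hypothetical and the parameter values are placeholders rather than recommendations. The HOWTO above explains how to choose them and how to obtain the tree model (.mod) file. The helper names phastcons_command and parse_fixed_step are made up for this example.

```python
def phastcons_command(maf_path, model_path, target_coverage=0.3,
                      expected_length=45, rho=0.3):
    """Build an argument list for phastCons (the command is not run here).

    The numeric defaults are placeholders only; tune them as described
    in the phastCons HOWTO.
    """
    return [
        "phastCons",
        "--target-coverage", str(target_coverage),
        "--expected-length", str(expected_length),
        "--rho", str(rho),
        "--msa-format", "MAF",
        maf_path,
        model_path,
    ]


def parse_fixed_step(lines):
    """Yield (chrom, 1-based position, score) from fixedStep wiggle text.

    Only fixedStep sections are handled; variableStep is not supported.
    """
    chrom = pos = step = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "#")):
            continue
        if line.startswith("fixedStep"):
            # Header looks like: fixedStep chrom=chr1 start=101 step=1
            params = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = params["chrom"]
            pos = int(params["start"])
            step = int(params.get("step", 1))
        else:
            yield chrom, pos, float(line)
            pos += step


# Hypothetical input file names, for illustration only.
cmd = phastcons_command("canFam3.chr1.maf", "init.mod")
print(" ".join(cmd))

# Made-up wiggle text in the fixedStep layout.
wig = ["fixedStep chrom=chr1 start=101 step=1", "0.012", "0.250", "0.987"]
scores = list(parse_fixed_step(wig))
print(scores)
```

To actually run the command, the list could be passed to subprocess.run with stdout redirected to a .wig file, assuming phastCons is installed and on your PATH.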

As I mentioned, this is a complicated pipeline and these tools can be extremely tricky to use. Our team is slowly working on improving this, and we will be happy to provide support if you have any specific questions.

I hope this was helpful. If you have any more questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are publicly archived. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Warmly,
Daniel Schmelter
UCSC Genome Browser

