Dear UCSC staff:
I'm a Ph.D. candidate at Tsinghua University. I would like to explore conservation scores (phastCons scores) in some regulatory elements of tree shrew and rabbit.
However, the website does not currently offer phastCons scores that use the tree shrew or rabbit genome as the reference genome. In addition, although I found the valuable 447-way mammalian alignment (so amazing!), it doesn’t include the specific genome versions of these species that we used for our sequence data analysis:
The rabbit genome (OryCun3.0):
https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_013371645.1/
The tree shrew genome (Genome assembly KIZ version 2):
https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_033439345.1/
Could you give me some advice on this situation (any advice would be helpful)? Alternatively, could you help us build a new multiple alignment and calculate phastCons scores (which would be even more helpful) using these new genome versions? I know this is a resource-intensive and time-consuming process, so could you make a small-scale alignment first, if possible (e.g., human (hg38), mouse (mm10), pig (susScr11), rabbit (GCA_013371645.1), tree shrew (GCA_033439345.1), mouse lemur (micMur2))? Or one that includes some more mammals or vertebrates?
By the way, I don’t fully understand whether phastCons scores differ much when they are calculated from alignments of different scales. I would appreciate it if you could explain this further.
Thank you for your patience! Looking forward to your reply!
Best regards,
Lin Ou
Hello, Lin.
Thank you for your interest in the UCSC Genome Browser and for sending your inquiry.
At this time, we do not have plans to add a multiple alignment for the hg38, mm10, susScr11, rabbit (GCA_013371645.1), tree shrew (GCA_033439345.1), and micMur2 genomes. Pairwise alignments may be helpful, as they allow exploration of the sequences in common. These alignments are available on our download server:
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/
https://hgdownload.soe.ucsc.edu/goldenPath/mm10/liftOver/
https://hgdownload.soe.ucsc.edu/goldenPath/susScr11/liftOver/
https://hgdownload.soe.ucsc.edu/hubs/GCA/013/371/645/GCA_013371645.1/liftOver/
https://hgdownload.soe.ucsc.edu/hubs/GCA/033/439/345/GCA_033439345.1/liftOver
https://hgdownload.soe.ucsc.edu/goldenPath/micMur2/liftOver/
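The directory URLs above follow two patterns: native UCSC databases live under goldenPath/, while GenBank assemblies live under hubs/, with the nine accession digits split into three groups. As a minimal sketch in Python, the helper below rebuilds those URLs from an assembly name. The function name liftover_dir is hypothetical, and applying the same pattern to GCF_ (RefSeq) accessions is an assumption that is not confirmed in this thread.

```python
BASE = "https://hgdownload.soe.ucsc.edu"

def liftover_dir(assembly: str) -> str:
    """Build the liftOver download directory URL for an assembly name.

    Hypothetical helper: it only mirrors the URL patterns listed above.
    """
    if assembly.startswith(("GCA_", "GCF_")):  # GCF_ support is an assumption
        prefix, rest = assembly.split("_", 1)   # "GCA", "013371645.1"
        digits = rest.split(".")[0]             # "013371645"
        # Split the nine digits into three directory levels: 013/371/645
        groups = [digits[i:i + 3] for i in range(0, 9, 3)]
        return f"{BASE}/hubs/{prefix}/{'/'.join(groups)}/{assembly}/liftOver/"
    # Native UCSC database names such as hg38 or mm10
    return f"{BASE}/goldenPath/{assembly}/liftOver/"

if __name__ == "__main__":
    for name in ["hg38", "mm10", "susScr11",
                 "GCA_013371645.1", "GCA_033439345.1", "micMur2"]:
        print(liftover_dir(name))
```

The helper only constructs the directory URL; which chain files exist in each directory still has to be checked by browsing it.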
You could run a multiz alignment using these pairwise alignments. The following makedoc describes the operations and programs used to create the hg38 PhastCons 7-way alignment. Please note that this document is primarily intended for internal replication: https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/hg38.txt#L4416.
Each alignment model is computed from the available data. Alignments of different scales, such as a mammal alignment versus a vertebrate alignment, can produce substantially different scores, because including more species gives a wider evolutionary context for the scores. If you have questions about how phastCons works, we recommend reaching out to the Siepel Lab at Cold Spring Harbor Laboratory (CSHL): http://compgen.cshl.edu/phast/
There are tools online that you could use to extract a MAF file from a Cactus HAL alignment, such as cactus-hal2maf (https://github.com/ComparativeGenomicsToolkit/hal). We do not maintain cactus-hal2maf, so we cannot provide guidance on how to use the tool.
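Whichever route produces the MAF file (multiz or a HAL export), it may be worth checking that every expected assembly actually appears in the alignment blocks before computing scores. The following is a rough Python sketch, not a UCSC tool: the function name species_block_counts is hypothetical, and it assumes each "s" line names its source as assembly.sequence. Because accessions such as GCA_013371645.1 contain a dot themselves, the expected assembly names are passed in explicitly instead of being split at the first dot.

```python
from collections import Counter
from typing import Iterable

def species_block_counts(maf_lines: Iterable[str],
                         assemblies: Iterable[str]) -> Counter:
    """Count the MAF alignment blocks in which each assembly appears."""
    # Try longer names first so one name cannot shadow a longer one.
    names = sorted(assemblies, key=len, reverse=True)
    counts: Counter = Counter()
    seen: set = set()
    for line in maf_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":            # an "a" line starts a new block
            counts.update(seen)
            seen = set()
        elif fields[0] == "s" and len(fields) > 1:
            src = fields[1]             # assumed form: assembly.sequence
            for name in names:
                if src.startswith(name + "."):
                    seen.add(name)
                    break
    counts.update(seen)                 # flush the final block
    return counts

if __name__ == "__main__":
    # Toy input; the sequence name CM000001.1 is made up for illustration.
    sample = [
        "##maf version=1",
        "a score=100",
        "s hg38.chr1 100 5 + 248956422 ACGTA",
        "s mm10.chr4 200 5 + 156508116 ACGTA",
        "s GCA_013371645.1.CM000001.1 10 5 + 1000 ACGTA",
        "",
        "a score=50",
        "s hg38.chr1 200 4 + 248956422 ACGT",
        "s mm10.chr4 300 4 - 156508116 ACGT",
    ]
    print(species_block_counts(sample, ["hg38", "mm10", "GCA_013371645.1"]))
```

An assembly that is missing or appears in very few blocks will contribute little to the scores in those regions.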
You may also want to explore online resources such as Biostars (https://www.biostars.org/), where other researchers and experts can provide further guidance.
I hope this is helpful. If you have any further questions about UCSC Genome Browser tools or data, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Gerardo Perez
UCSC Genomics Institute