PhyloP or PhyloP46


Teresa Pillars

May 18, 2016, 4:31:55 PM
to gen...@soe.ucsc.edu
Hello,

I work in a lab that has recently switched from using PhyloP to PhyloP46 for base-wise conservation scoring. We are using PhyloP/46 in order to look at differences in humans at the base level. I am hoping you could clarify the differences between these two, including the change in scale, the change in database, and any other significant changes.

-Thank You! I look forward to your response

-Teresa Pillars

Yale DNA Lab
333 Cedar Street
New Haven, CT 06520

Matthew Speir

May 18, 2016, 4:58:28 PM
to Teresa Pillars, gen...@soe.ucsc.edu
Hello Teresa,

Thank you for your question about PhyloP conservation scores in the UCSC
Genome Browser.

Unfortunately, it's unclear what you're referring to as there are many
phyloP tracks in the UCSC Genome Browser. Perhaps you can provide more
detail about what phyloP data you were using such as where you
downloaded it from or how many and what organisms were in it?

Additionally, I would recommend looking into the 100-way vertebrate
conservation alignment for hg19, which includes a set of phyloP
conservation scores computed over the alignment. You can read more about
the track and how the data was produced here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons100way.
Additionally, you can download the phyloP data from our download server
here: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/phyloP100way/.

We also have a 100-way vertebrate alignment for the newest human genome
assembly, hg38/GRCh38, which also includes phyloP scores calculated over
the alignment. You can read more about the track here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=cons100way. And,
again, you can download the phyloP scores for the hg38 version of this
100-way alignment here:
http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP100way/.
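For readers who fetch score files from the download directories above, here is a minimal sketch of reading them. It assumes the files are plain-text fixedStep wiggle data (a declaration line such as `fixedStep chrom=chr1 start=100 step=1` followed by one score per line); check the directory listing for the actual file format before relying on this. The function name `parse_fixed_step` and the sample values are hypothetical, and the optional `span` parameter is ignored.

```python
import io


def parse_fixed_step(lines):
    """Yield (chrom, position, score) tuples from fixedStep wiggle text.

    Positions are 1-based, as in the wiggle format. This is an
    illustrative helper, not a UCSC tool; it ignores the optional
    'span' parameter.
    """
    chrom, pos, step = None, None, 1
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("fixedStep"):
            # Declaration line: key=value pairs after the keyword.
            fields = dict(tok.split("=", 1) for tok in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            continue
        if chrom is None:
            raise ValueError("score line found before any fixedStep declaration")
        yield chrom, pos, float(line)
        pos += step


# Made-up sample data, for illustration only.
example = io.StringIO("fixedStep chrom=chr1 start=100 step=1\n0.5\n-1.25\n2.0\n")
records = list(parse_fixed_step(example))
for chrom, pos, score in records:
    print(chrom, pos, score)
```

For a gzip-compressed file, the same function can be given a handle opened with `gzip.open(path, "rt")`.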

I hope this is helpful. If you have any further questions, please reply
to gen...@soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group

Teresa Pillars

May 20, 2016, 11:43:11 AM
to gen...@soe.ucsc.edu
Hello,

On 5/18 I had inquired about PhyloP and received a response from msp...@soe.ucsc.edu. Thank you for the timely response and helpful links. However, I think there may have been some miscommunication. 

In my original email I mentioned switching from PhyloP to PhyloP46. We use a pipeline which has many tools for analyzing our exome data including PhyloP conservation scores. Because our data comes from this pipeline we did not choose which organisms are put into our old PhyloP. In our new pipeline we switched to PhyloP46 and noticed significant changes.

These two conservation scores have different scaling and do not seem to be correlated.

Firstly, I hope you could clarify exactly what "PhyloP" is. From your response it seems as though PhyloP is one database of many organisms from which tracks are chosen. Would it be correct to say PhyloP46 is one of these tracks?

If this is true, could you explain the difference in scaling between our old PhyloP and PhyloP46? The conservation scores for our old PhyloP appear to range from about -8 to +6, while the PhyloP46 scores range from about 0 to 45.

When comparing the same locations in our old PhyloP and PhyloP46 there seemed to be no correlation between these two conservation scores. Could this be because of the different organisms in each set?
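The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: each score set is held in a dictionary keyed by a hypothetical `(chromosome, position)` tuple, the values are made up, and the function name `pearson_on_shared` is not part of any pipeline. It computes the Pearson correlation over positions present in both sets.

```python
import math


def pearson_on_shared(scores_a, scores_b):
    """Pearson correlation over the positions present in both score sets.

    Both arguments map a position key, here a hypothetical
    (chromosome, position) tuple, to a score.
    """
    shared = sorted(set(scores_a) & set(scores_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared positions")
    xs = [scores_a[key] for key in shared]
    ys = [scores_b[key] for key in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores are constant over the shared positions")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, for illustration only.
old_scores = {("chr1", 100): -2.0, ("chr1", 101): 0.0,
              ("chr1", 102): 2.0, ("chr1", 103): 4.0}
new_scores = {("chr1", 100): 1.0, ("chr1", 101): 2.0,
              ("chr1", 102): 3.0, ("chr1", 104): 9.0}
r = pearson_on_shared(old_scores, new_scores)
print(round(r, 6))
```

Pearson correlation is unchanged by a positive linear rescaling of either variable, so a difference in range between two score sets would not, on its own, remove a correlation between them.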

Can you confirm that these are both base-wise conservation scores?

I hope this helps clarify any confusion.

-Teresa Pillars 

Yale DNA Lab
333 Cedar Street
New Haven, CT 06520

Brian Lee

May 20, 2016, 2:44:23 PM
to Teresa Pillars, gen...@soe.ucsc.edu
Dear Teresa,

Thank you for using the UCSC Genome Browser and for your question about PhyloP files.

Our Track Description pages are the best place to look for explanations of the types of data that we display. The links shared in the previous response are a good start; however, it sounds as though you did not review them. Since you mentioned PhyloP46, here is the human hg19 assembly 46-way Conservation Track page: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons46way

Please review the information and find the reference papers at the bottom, where you can learn more about how phyloP represents conservation at individual locations. To help further, here is an example session comparing the hg19 46-way and 100-way Conservation tracks. It illustrates the relationship one would expect to see between the two sets, one of which represents a multiple alignment of 46 vertebrate species and the other of 100: http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=brianlee&hgS_otherUserSessionName=hg19.cons

Beyond our Track Documentation, we also have archived mailing list questions, where you can enter a question and look at previous answers from our support team that may provide additional insight. Here is an example search with answers you could review: https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!searchin/genome/what$20is$20phyloP

Please search our archives of previously answered questions before mailing the list again. Also, in future questions, please provide more information to help contextualize them. Please note as well that it may be useful to contact the authors of your pipeline regarding the possible impact of the changes you are introducing to it.

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to ge...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genomics Institute