BED scores


Elisabetta sauta

Feb 6, 2015, 10:53:30 AM
to gen...@soe.ucsc.edu
Hello,

I'm a PhD student and I'm using ENCODE ChIP-seq files. The scores in BED files are displayed as different shades of grey in the UCSC browser. How can I interpret this scale in a biological way? Is it like a probability of a transcription factor binding in that genome fragment?

Thanks in advance for your attention.

Elisabetta

Brian Lee

Feb 6, 2015, 12:44:02 PM
to Elisabetta sauta, gen...@soe.ucsc.edu
Dear Elisabetta,

Thank you for using the UCSC Genome Browser and for your question about ChIP-seq ENCODE scores.

You are correct to interpret a darker shade, which reflects a higher score, as stronger biological evidence that the transcription factor binds at that particular location. Here is a session that displays the Clustered Transcription Factor Binding Sites track (wgEncodeRegTfbsClusteredV3), and the underlying Uniform Peaks track (wgEncodeAwgTfbsUniform) used to create the clusters, produced by the ENCODE Analysis Working Group.
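As a rough illustration of how a BED score becomes a shade of grey, here is a minimal Python sketch. It assumes the nine score ranges listed in the UCSC BED format documentation (scores run from 0 to 1000, and higher scores are drawn darker); the bin boundaries and the helper name `score_to_shade` are assumptions for illustration and should be checked against that documentation, since individual tracks can be configured differently.

```python
# Upper score bound of each grey bin, lightest to darkest.
# ASSUMPTION: these boundaries follow the table in the BED format docs.
SHADE_UPPER_BOUNDS = [166, 277, 388, 499, 611, 722, 833, 944, 1000]


def score_to_shade(score):
    """Return a shade index from 0 (lightest) to 8 (darkest) for a BED score."""
    # BED scores are defined on 0-1000; clamp anything outside that range.
    clamped = max(0, min(1000, int(score)))
    for shade, upper in enumerate(SHADE_UPPER_BOUNDS):
        if clamped <= upper:
            return shade
    return len(SHADE_UPPER_BOUNDS) - 1


if __name__ == "__main__":
    for s in (100, 500, 1000):
        print(s, score_to_shade(s))
```

The shade is only a binned view of the score, so two peaks drawn in the same grey can still have different underlying scores.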


In summary, the AWG created a pipeline to uniformly process several hundred ChIP-seq files generated by the ENCODE project. That uniform processing resulted in comparable signal scores, viewable in the wgEncodeAwgTfbsUniform track, which were then used to generate the clustered scores in the wgEncodeRegTfbsClusteredV3 track, where a normalization factor was applied in an attempt to distribute the scores more evenly.
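To make the idea of a normalization factor concrete, here is a hedged Python sketch of one such scheme. It assumes the factor is the ratio of the maximum score (1000) to the signal value one standard deviation above the mean, with results capped at 1000. That reading is an assumption for illustration only, as are the function name `normalize_scores` and the use of the population standard deviation; the track description pages remain the authoritative source for the actual method.

```python
import statistics


def normalize_scores(signals, max_score=1000):
    """Scale raw signal values onto a 0..max_score range, capping at max_score.

    ASSUMPTION: the reference point is mean + 1 standard deviation; this is
    an illustrative scheme, not a confirmed copy of the ENCODE pipeline.
    """
    mean = statistics.mean(signals)
    sd = statistics.pstdev(signals)
    reference = mean + sd
    if reference <= 0:
        raise ValueError("signal values must have a positive mean + sd")
    factor = max_score / reference
    # Anything at or above the reference point saturates at max_score.
    return [min(max_score, round(value * factor)) for value in signals]


if __name__ == "__main__":
    # Hypothetical signal values, not taken from any ENCODE file.
    print(normalize_scores([10, 20, 30]))
```

Under a scheme like this, the strongest peaks all receive the maximum score, so the weaker peaks spread across more of the 0 to 1000 range instead of being compressed near zero.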

In the above session, only the factors JUN, JUNB, JUND, and MYC have been filtered to display. You can see that MYC has a dark score and several letters following the block, indicating all the cell types where binding of MYC has been observed. If you click the box for the MYC cluster, you will see the list of assays where evidence shows there is binding.
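The same kind of filtering can be done offline on a BED file. The following is a minimal sketch, assuming tab-separated lines whose first five columns are chrom, start, end, name, and score, with the factor name in the name column. The function name `filter_clusters` and the example rows are hypothetical and are not taken from the actual track.

```python
FACTORS_OF_INTEREST = {"JUN", "JUNB", "JUND", "MYC"}


def filter_clusters(bed_lines, factors=FACTORS_OF_INTEREST, min_score=0):
    """Keep BED records whose name is a factor of interest and score >= min_score."""
    kept = []
    for line in bed_lines:
        # Skip blank lines, comments, and browser/track header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end, name, score = fields[:5]
        if name in factors and int(score) >= min_score:
            kept.append((chrom, int(start), int(end), name, int(score)))
    return kept


# Hypothetical rows for illustration only.
example = [
    "chr1\t1000\t1400\tMYC\t1000",
    "chr1\t1200\t1500\tJUN\t310",
    "chr1\t5000\t5300\tCTCF\t870",
]

if __name__ == "__main__":
    for record in filter_clusters(example):
        print(record)
```

Passing `min_score=500`, for example, would keep only the darker clusters among the selected factors.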

Returning to the Browser display, you can see several individual "Uniform ...c-Myc" tracks displayed below the clusters track. Those are the separate wgEncodeAwgTfbsUniform tracks used to generate the processed, clustered summary in the wgEncodeRegTfbsClusteredV3 track for this MYC cluster. The individual uniformly processed scores were used to create the cluster score given to the MYC cluster. As with the MYC factor, you can also click the JUN factors, and you will see there is only one observed cell type where this data indicates the factor binds at this location. Similarly, below you will see the "Uniform... Jun" tracks that contributed to the clusters track.

Also note that some of the transcription factors, like MYC, have additional Factorbook motif information available to display. You can read more about that in the wgEncodeRegTfbsClusteredV3 track description.

For a complete understanding of how the scores were calculated you must read the Track Description pages for these two tracks.

See Methods section

See Methods section and Peak Calling:

If you have more questions about how the score is calculated after reviewing the track description pages, I suggest reviewing our mailing list archive of previously answered questions: https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!searchin/genome/score$20tfbs

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group
