bigBed file for ENCODE3 TF Cluster


Benjamin C. Hitz

Feb 18, 2021, 5:35:38 PM
to gen...@soe.ucsc.edu
Hi -
Was just wondering if the TF Cluster/ENCODE3 "super track" was just a bigBed or something fancier.
I found this: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/encRegTfbsClustered/

But not a random access version...

Thanks,
Ben
--
Benjamin C. Hitz * Director of Genomic Data Resources
ENCODE DCC * Dept. of Genetics
hi...@stanford.edu




Daniel Schmelter

Feb 19, 2021, 5:29:16 PM
to Benjamin C. Hitz, gen...@soe.ucsc.edu

Hello Benjamin,

Thank you for writing to the Genome Browser with your question about downloading TF Cluster data.

The main TF Cluster data is stored on our site as a SQL table in BED5+ format. Rather than being bigBed binaries, these SQL database tables use their first column, bin, as an index that allows fast access to regions within large datasets. The download directory you linked provides the SQL table in gzipped BED5+1 format, as described at the top of that page:

encRegTfbsClusteredWithCells.hg38.bed.gz
        Description: TFBS clusters together with input cell sources (BED 5+1 format: 
                        standard 5 fields of BED followed by comma-separated list of cell types)
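To illustrate the BED 5+1 layout described above, here is a minimal Python sketch that reads the gzipped file into typed records. It assumes the file is tab-separated with exactly six fields per line; the names `TfbsCluster`, `parse_line`, and `read_clusters` are hypothetical, chosen only for this example.

```python
import gzip
from typing import Iterator, List, NamedTuple


class TfbsCluster(NamedTuple):
    """One BED 5+1 record: five standard BED fields plus a cell-type list."""
    chrom: str
    start: int
    end: int
    name: str
    score: int
    cells: List[str]


def parse_line(line: str) -> TfbsCluster:
    """Parse one tab-separated BED 5+1 line (assumed to have six fields)."""
    chrom, start, end, name, score, cells = line.rstrip("\n").split("\t")
    return TfbsCluster(chrom, int(start), int(end), name, int(score),
                       cells.split(","))


def read_clusters(path: str) -> Iterator[TfbsCluster]:
    """Yield records from a gzipped BED 5+1 file, skipping blank lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

For example, `read_clusters("encRegTfbsClusteredWithCells.hg38.bed.gz")` would iterate over the downloaded file one record at a time without loading it all into memory.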

As also mentioned there, a few supporting metadata files contain information related to that data:

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/

  • encRegTfbsClusteredInputs.txt.gz
  • encRegTfbsClusteredSources.txt.gz

You can also see some of this information in the Table Schema:

http://genome.ucsc.edu/cgi-bin/hgTables?db=hg38&hgta_group=regulation&hgta_track=encRegTfbsClustered&hgta_table=encRegTfbsClustered&hgta_doSchema=describe+table+schema
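For readers curious how the bin column mentioned earlier can speed up range queries, the following Python sketch shows the standard five-level binning scheme as it is commonly described for UCSC tables. The function name `bin_from_range` is hypothetical, and the constants are an assumption that should be checked against the UCSC source before being relied on; the sketch also covers only coordinates below 512 Mb.

```python
# Offsets of the first bin at each level, from finest (128 kb) to coarsest (512 Mb).
# Assumed values for the standard scheme; verify against the UCSC source.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 = 128 kb
BIN_NEXT_SHIFT = 3    # each coarser level is 8 times wider


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is outside the standard binning scheme")
```

Because every feature is stored with the smallest bin that contains it, a range query only has to look in the handful of bins that can overlap the region, rather than scanning the whole table.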

If you specifically want bigBed format, you will need to use the bedToBigBed command-line utility. The process is described here (though you may need to cut out the 'bin' column):

https://genome.ucsc.edu/goldenPath/help/bigBed.html
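The preparation steps can be sketched in Python as follows: drop the leading bin column from table-dump rows, sort by chromosome and start position as bedToBigBed expects, and assemble the command line. The helper names are hypothetical, and the `-type` value is an assumption that must match the actual number of columns in your file (`bed5+1` corresponds to the download file described above).

```python
from typing import Iterable, List


def strip_bin(rows: Iterable[str]) -> List[str]:
    """Drop the first (bin) field from tab-separated table-dump rows."""
    return ["\t".join(row.rstrip("\n").split("\t")[1:]) for row in rows]


def sort_bed(rows: Iterable[str]) -> List[str]:
    """Sort BED rows by chromosome name, then numerically by start position."""
    def key(row: str):
        fields = row.split("\t")
        return fields[0], int(fields[1])
    return sorted(rows, key=key)


def bed_to_bigbed_cmd(bed_path: str, chrom_sizes: str, out_path: str,
                      bed_type: str = "bed5+1") -> List[str]:
    """Build the bedToBigBed argument list; bed_type is an assumed default."""
    return ["bedToBigBed", f"-type={bed_type}", bed_path, chrom_sizes, out_path]
```

After writing the sorted rows to a file, the command list can be passed to `subprocess.run(cmd, check=True)`, provided bedToBigBed is on your PATH and you have a chrom.sizes file for hg38. The bin column only needs removing if you start from the database table dump rather than from a plain BED file.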

Other data access options include the Table Browser, our public SQL server, and our JSON API.
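As one illustration of the JSON API route, here is a small Python sketch that builds a region query for this track and fetches the result. The base URL, the `/getData/track` endpoint, and the parameter names are assumptions based on the public API documentation, and the function names are hypothetical; please check the API help page before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"  # assumed base URL of the JSON API


def track_data_url(genome: str, track: str, chrom: str,
                   start: int, end: int) -> str:
    """Build a getData/track URL for one region (parameter names assumed)."""
    query = urlencode({"genome": genome, "track": track, "chrom": chrom,
                       "start": start, "end": end})
    return f"{API_BASE}/getData/track?{query}"


def fetch_track_data(genome: str, track: str, chrom: str,
                     start: int, end: int) -> dict:
    """Fetch and decode the JSON response for one region (needs network access)."""
    with urlopen(track_data_url(genome, track, chrom, start, end)) as response:
        return json.load(response)
```

A call such as `fetch_track_data("hg38", "encRegTfbsClustered", "chr1", 1000000, 1100000)` would request only the items in that window, which gives random access without converting anything to bigBed.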

I hope this was helpful. If you have any more questions, please reply-all to gen...@soe.ucsc.edu. All messages sent to that address are publicly archived. If your question includes sensitive data, please reply-all to genom...@soe.ucsc.edu.

All the best,

Daniel Schmelter
UCSC Genome Browser

