Enhancers and TFBS from UCSC


Manu Ferrando

Jan 14, 2020, 12:12:52 PM
to gen...@soe.ucsc.edu
Dear UCSC Genome Browser team,

My name is Manuel Ferrando Bernal, and I am a PhD student at the Institute of Evolutionary Biology in Barcelona (Spain).

I am very interested in the UCSC Genome Browser database. It is very useful when working with small data sets, but it can be a bit complicated when I need information about a large number of genes.

In particular, I have a list of nearly 1000 genes that are expressed in both testes and brain. I would like to get information, first, about the known enhancers active in those organs and, second, about the possible transcription factor binding sites (TFBS) of those enhancers. I know that this information should be possible to download, as was done in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6742485/
but I could not work out how to do it.

If you can help me I would be delighted, as I think your database is by far the most suitable for my study.


Thanks!
Yours,

Manu

Luis Nassar

Jan 17, 2020, 8:01:27 PM
to Manu Ferrando, UCSC Genome Browser Discussion List

Hello Manu,

Thank you for your interest in the Genome Browser.

There are several ways to extract regulatory information from the Genome Browser. We can describe a few approaches for finding and extracting these data, which will hopefully help narrow down your search.

The paper you linked uses some older methods, and more recent data are now available. The FirstEF algorithm it describes can be found here (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg16&g=firstEF), though better resources for identifying promoters are available. For the tissue expression of a gene, you can take a look at the GTEx data (http://genome.ucsc.edu/cgi-bin/hgc?db=hg38&g=gtexGene&i=MTOR).

We have a feature called track search which allows you to use keywords to search the available data tracks in an assembly. Searching for more general terms such as "enhancer" or "promoter" will yield many results, often due to data on multiple cell lines or tissue types. Clicking on these results will show you additional information in the track description page. If you would like to visualize these data you can select them, and click "view in Browser".

Many of these TFBS tracks are part of the ENCODE super-track (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeTfBindingSuper), which is a good place to look for this type of data. Additional regulatory data are available directly from the ENCODE portal (https://www.encodeproject.org/) and can be imported into the Genome Browser.

We would also like to share some data which you may find helpful:

For promoters/enhancers:

The Eukaryotic Promoter Database (EPD) - http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=epdNew
GeneHancer - http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=geneHancer
FANTOM5 (hub) - http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://fantom.gsc.riken.jp/5/datahub/hub.txt
Vista Enhancers - https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=https://portal.nersc.gov/dna/RD/ChIP-Seq/VISTA_enhancer_e/VistaEnhancerTrackHub/hub.txt
Note: Public hubs are denoted by a (hub). To see the descriptions click on the link and scroll down past the tracks image.
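
Once enhancer intervals from one of the tracks above have been exported in BED format, a gene list can be matched against them by position. The following is only a minimal sketch in Python, not a UCSC tool: the function names, the 50 kb default window, and the assumption that both inputs are tab-separated BED lines with a name in the fourth column are all illustrative. Proximity alone does not show that an enhancer regulates a gene, so treat the output as candidates.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse tab-separated BED lines into (chrom, start, end, name) tuples."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Fall back to a coordinate-based name when column 4 is absent.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        records.append((chrom, start, end, name))
    return records


def enhancers_near_genes(genes, enhancers, window=50_000):
    """Map each gene name to the enhancers overlapping gene +/- window.

    `window` is a hypothetical flank size in base pairs; choose it to
    suit the study.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in enhancers:
        by_chrom[chrom].append((start, end, name))

    hits = {}
    for chrom, g_start, g_end, gene in genes:
        lo, hi = max(0, g_start - window), g_end + window
        # Half-open intervals overlap when each starts before the other ends.
        hits[gene] = [
            name
            for start, end, name in by_chrom[chrom]
            if start < hi and end > lo
        ]
    return hits
```

For about 1000 genes a simple scan per chromosome like this is usually fast enough; for genome-wide work a dedicated interval tool would be a better choice.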

TFBS:

ReMap 2020 (hub) - https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hubUrl=http://remap.univ-amu.fr/storage/public/hubReMap2020UCSC/hub.txt
Ensembl Regulatory Build (hub) - https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://ftp.ensembl.org/pub/papers/regulation/hub.txt
ENCODE Analysis hub (hub) - https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt

Predicted binding sites based on sequence:

JASPAR TFBS (hub) - https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt
TFBS Conserved - https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=tfbsConsSites

Once you have found data you are interested in, you can use the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) or our downloads page (http://hgdownload.soe.ucsc.edu/downloads.html) to access or download the entire data set. Often the track description pages also include a "Data Access" section with direct links to the data.
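
For scripted access to a track over a region, a request can also be built programmatically. The sketch below assumes the UCSC REST API at api.genome.ucsc.edu and its /getData/track endpoint with genome, track, chrom, start and end parameters; that API is not mentioned in the message above, and the helper names are hypothetical. Please check the current API documentation, and confirm that the track you want is served through it, before relying on this.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the UCSC REST API (not taken from this thread).
API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom=None, start=None, end=None):
    """Build a /getData/track URL for a track, optionally limited to a region."""
    params = [("genome", genome), ("track", track)]
    if chrom is not None:
        params.append(("chrom", chrom))
        # A start/end pair is only meaningful together with a chromosome.
        if start is not None and end is not None:
            params.append(("start", start))
            params.append(("end", end))
    # Parameters are joined with semicolons here.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


def fetch_track_data(genome, track, chrom=None, start=None, end=None):
    """Download the JSON response for a track (this makes a network request)."""
    url = track_data_url(genome, track, chrom, start, end)
    with urlopen(url) as response:
        return json.load(response)
```

For example, fetch_track_data("hg19", "tfbsConsSites", "chr1", 100000, 200000) would request the TFBS Conserved track listed above for that region, assuming the track is available through the API. For whole-genome downloads the Table Browser or the downloads page remains the simpler route.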

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Lou Nassar
UCSC Genomics Institute

Training videos & resources: http://genome.ucsc.edu/training/index.html
Want to share the Browser with colleagues?
Host a workshop: http://bit.ly/ucscTraining

