Request to include poly(A) site annotation as official track to the genome browser

Ralf Schmidt

Oct 4, 2016, 12:02:27 PM
to gen...@soe.ucsc.edu
Dear UCSC Team!

We would like to propose including our latest annotation of mRNA 3' end cleavage sites (poly(A) sites) in the UCSC Genome Browser as an official track.
The annotated sites condense the results of a comprehensive, streamlined processing of publicly available 3' end sequencing data sets. In our publication we show that our catalog of 3' end processing sites is more accurate and comprehensive than other resources published so far, including the current poly(A) site track available in the UCSC Genome Browser (PolyA_DB). All details about our catalog and its comparison to earlier poly(A) site annotations are given in our study, recently published in Genome Research: http://genome.cshlp.org/content/26/8/1145

------------------------------

Poly(A) sites for the human genome hg19:
http://www.polyasite.unibas.ch/clusters/Homo_sapiens/8.0/clusters.bed

Poly(A) sites for the mouse genome mm10:
http://www.polyasite.unibas.ch/clusters/Mus_musculus/4.0/clusters.bed

------------------------------

General information:
In the publication, we show that our atlas of poly(A) sites covers more genes and terminal exons than previously published sets of poly(A) sites without increasing the number of poly(A) sites per organism, indicating a higher accuracy of our annotation. Moreover, in addition to the common practice of clustering closely spaced reads, the clustering of reads from diverse 3' end sequencing protocols also takes into account the occurrence and location of hexamer motifs (poly(A) signals) known to be crucial for 3' end processing. This allows for a more fine-grained annotation of poly(A) sites.

Data:
The data are in standard BED format, with a unique ID as the name of each poly(A) site. Each ID has a two-letter tag that indicates the location of the corresponding poly(A) site with respect to the GENCODE v19 annotation of protein-coding genes and lincRNAs. Because cleavage and polyadenylation are somewhat fuzzy, a poly(A) site may be characterized by a genomic region rather than a single position. The genomic position with the strongest support, the representative site, is given as the position in the ID field.
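
For readers who want to work with these files directly, the following is a minimal Python sketch of reading such a BED file. It assumes tab-separated lines with at least the six standard BED columns (chrom, start, end, name, score, strand). The exact layout of the ID in the name column is not spelled out in this message, so the sketch keeps it as an opaque string. `BedRecord` and `parse_bed` are illustrative names, not part of any published tool.

```python
from typing import Iterable, Iterator, NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str    # poly(A) site ID, kept as-is (layout not assumed)
    score: str   # kept as text; the numeric type is not assumed
    strand: str


def parse_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield one BedRecord per data line, skipping blank and header lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip empty lines, comments, and optional track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 BED columns, got {len(fields)}")
        chrom, start, end, name, score, strand = fields[:6]
        yield BedRecord(chrom, int(start), int(end), name, score, strand)
```

Since BED intervals are half-open, `end - start` gives the width of the region that characterizes a site.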

Methods:
The human set of poly(A) sites was created by processing data from 7 different 3' end sequencing protocols, the mouse set from 9 different protocols, each of which requires specific preprocessing. After preprocessing, we applied an in-house pipeline that integrates the diverse data to generate a comprehensive set of high-confidence poly(A) sites. A detailed description of the pipeline can be found in the supplementary material of the above-mentioned publication.


We are looking forward to hearing from you,

Kind regards,

Ralf Schmidt

--

Ralf Schmidt| PhD Student | Biozentrum, University of Basel | Klingelbergstrasse 50/70 | CH-4056 Basel
Phone: +41 61 267 18 66 | Email: ralf.s...@unibas.ch

Luvina Guruvadoo

Oct 5, 2016, 3:35:00 PM
to Ralf Schmidt, gen...@soe.ucsc.edu
Hello Ralf,

Thank you for your interest in the Genome Browser. You may be interested in creating a track hub: http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html. Here are some additional standard guidelines: http://genomewiki.ucsc.edu/index.php/Public_Hub_Guidelines

Once you have set up your track hub, you can contact us again and request to submit it as a public hub on our site: http://genome.ucsc.edu/cgi-bin/hgHubConnect
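
As a rough illustration of what setting up a hub involves, a track hub is a small set of plain-text files (hub.txt, genomes.txt, and one trackDb.txt per assembly) placed next to the bigBed data on a web server. The Python sketch below generates those three kinds of files. The hub name, labels, e-mail address, the `clusters.bb` file name, and the `bigBed 6` column count are all assumptions made for this example; the help pages linked above remain the authoritative description of the settings.

```python
from pathlib import Path
from typing import Dict, List


def build_hub_files(hub_name: str, short_label: str, long_label: str,
                    email: str, genomes: List[str]) -> Dict[str, str]:
    """Return a mapping of relative file path -> file content for a simple hub."""
    files: Dict[str, str] = {}
    files["hub.txt"] = "\n".join([
        f"hub {hub_name}",
        f"shortLabel {short_label}",
        f"longLabel {long_label}",
        "genomesFile genomes.txt",
        f"email {email}",
    ]) + "\n"
    # One stanza per assembly, separated by a blank line.
    files["genomes.txt"] = "\n".join(
        f"genome {g}\ntrackDb {g}/trackDb.txt\n" for g in genomes
    )
    for g in genomes:
        files[f"{g}/trackDb.txt"] = "\n".join([
            f"track {hub_name}",
            "bigDataUrl clusters.bb",   # hypothetical file name, relative to trackDb.txt
            f"shortLabel {short_label}",
            f"longLabel {long_label}",
            "type bigBed 6",            # assumes a six-column BED source
        ]) + "\n"
    return files


def write_hub(out_dir: str, files: Dict[str, str]) -> None:
    """Write the generated files below out_dir, creating subdirectories as needed."""
    for rel_path, text in files.items():
        path = Path(out_dir) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
```

Calling `build_hub_files(...)` with `["hg19", "mm10"]` and passing the result to `write_hub` would produce one hub.txt, one genomes.txt, and a trackDb.txt for each of the two assemblies.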

If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Regards,
Luvina

--
Luvina Guruvadoo
UCSC Genome Browser

http://genome.ucsc.edu




Ralf Schmidt

Oct 6, 2016, 11:39:14 AM
to gen...@soe.ucsc.edu
Hi Luvina,

Thank you very much for your response.

I had a look at track hubs. The problem with them is that users have to actively look for and connect to a hub before they can browse its data.

The good thing about the default track selection that comes with the Genome Browser is that users get an idea of what is available just by scrolling through the list. Moreover, the existing tracks for poly(A) site annotations (Poly(A) and PolyA-Seq) have not been updated for a long time (like PolyA_DB) and are not as comprehensive as our atlas (as shown in our publication).

Additionally, since we already host our annotations on our website (polyasite.unibas.ch), it would not make much sense for us to create an official track hub. Instead, it would be easier to include a link on our webpage that lets users browse the data as a track in the UCSC Browser.

I hope you understand my considerations. Do you think there is another way to present our data in your Genome Browser? The amount of data should not be a problem: only rather sparse BED files need to be included, and these are very small once converted to bigBed format.
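
As a sketch of the "link on our webpage" idea, the Genome Browser accepts the URL of a remote annotation file through the `hgt.customText` parameter of `hgTracks`, and the URL of a hub's hub.txt through `hubUrl`. The small Python helper below builds such links. The function names are made up for this example, and the hub.txt location in the usage note is hypothetical.

```python
from urllib.parse import urlencode

BROWSER_CGI = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def custom_track_link(db: str, data_url: str) -> str:
    """Build a browser link that loads data_url as a custom track on assembly db."""
    return BROWSER_CGI + "?" + urlencode({"db": db, "hgt.customText": data_url})


def hub_link(db: str, hub_url: str) -> str:
    """Build a browser link that connects the hub described by hub_url."""
    return BROWSER_CGI + "?" + urlencode({"db": db, "hubUrl": hub_url})
```

For instance, `custom_track_link("hg19", "http://www.polyasite.unibas.ch/clusters/Homo_sapiens/8.0/clusters.bed")` yields a link for the human file, and `hub_link("hg19", "http://example.org/hub/hub.txt")` would do the same for a hub hosted at that placeholder address.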

I'm looking forward to hearing from you again,

Best regards,

Ralf


--

Ralf Schmidt| PhD Student | Biozentrum, University of Basel | Klingelbergstrasse 50/70 | CH-4056 Basel
Phone: +41 61 267 18 66 | Email: ralf.s...@unibas.ch

From: Luvina Guruvadoo [luv...@soe.ucsc.edu]
Sent: Wednesday, October 5, 2016 21:34
To: Ralf Schmidt
Cc: gen...@soe.ucsc.edu
Subject: Re: [genome] Request to include poly(A) site annotation as official track to the genome browser

Cath Tyner

Oct 6, 2016, 1:21:56 PM
to Ralf Schmidt, gen...@soe.ucsc.edu
Hello Ralf,

Thank you for responding with follow-up questions regarding your request to add a native track to the UCSC Genome Browser. First of all, congratulations on this accomplishment! I believe that the UCSC Genome Browser community will be interested in accessing these data in the browser.

We currently do not have the resources to assess and/or add these data as a native track within the browser. However, most users of the UCSC Genome Browser are very familiar with loading public track hubs, which will annotate your data against the hg19 and mm10 assemblies that you are interested in, alongside all other native tracks.

While I understand that you are already presenting your data on polyasite.unibas.ch, adding an official public track hub in the genome browser will allow others to load your data alongside all other data and software features of the genome browser (including queries and downloads with the table browser tool, etc.). By displaying your data as a public hub, you will also retain control over how these data are presented, and it will be easier for you to maintain your data sets, data descriptions, etc.

For these reasons, I encourage you to consider the best approach for sharing your data within the UCSC Genome Browser by: 1) creating a track hub and 2) submitting your hub for official public listing in the browser.

One relatively new feature in the UCSC Genome Browser is the ability to add publicly accessible sessions. A quick and easy way to share your data in the genome browser would be to add your data as one or more custom tracks and then enable your browsing session to appear in the Public Sessions list within the browser. Public Session links will not expire, and can be used in publications or as a reference from your website.

More information: sessions and public sessions.

Please respond to this list if you have further questions!

Thank you again for your inquiry and for using the UCSC Genome Browser.
Please send new and follow-up questions to one of our UCSC Genome Browser mailing lists below:

  * Post to the Public Help Forum: Email gen...@soe.ucsc.edu or search the Public Archives
  * Post to the Mirror Help Forum: Email genome...@soe.ucsc.edu or search the Mirror Archives
  * Confidential/private help: Email genom...@soe.ucsc.edu

UCSC Genome Browser Announcements List (email alerts for new data & software):
  * Subscribe: Email genome-announce+subscribe@soe.ucsc.edu 
  * Unsubscribe: Email genome-announce+unsubscribe@soe.ucsc.edu

Join us on social media: Facebook, Twitter, WordPress Blog, YouTube

Enjoy,
Cath
Cath Tyner
UCSC Genome Browser, Software QA & User Support
UC Santa Cruz Genomics Institute

