knownCanonical table -- are these ensembl canonical transcripts?


Hansen, Adam Wesley

Jun 12, 2017, 5:31:34 PM
to gen...@soe.ucsc.edu

Hi,


I would appreciate clarification on how the transcripts in the knownCanonical table are defined to be 'canonical'.


More specifically, is the definition logic identical to or different from the Ensembl canonical transcript definition?


Thanks,


Adam Hansen

PhD Candidate

Human Genome Sequencing Center

Department of Molecular and Human Genetics

Baylor College of Medicine

Matthew Speir

Jun 13, 2017, 6:10:31 PM
to Hansen, Adam Wesley, gen...@soe.ucsc.edu
Hi Adam,

Thank you for your question about the definition of "canonical" in relation to our knownCanonical table.

The answer to this question depends on which assembly you are looking at. For assemblies that have a "UCSC Genes" track, e.g. hg19, mm9, or mm10, yes, the way we define "canonical" in knownCanonical is similar to the definition that Ensembl uses. You can read a response to a previous mailing list question for more details on how transcripts are selected for the knownCanonical table: https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/gGzjT5XvI8A/MowrY3UDAwAJ. The other question linked in that response also provides some good information on the process for generating the knownCanonical table for the "UCSC Genes" track.

However, if you are looking at hg38, the knownGene and knownCanonical tables are based on data from the GENCODE Genes project, https://www.gencodegenes.org/. In this case, the inclusion of a transcript in the knownCanonical table is based on the "tags" that GENCODE applies to the transcript. First, we look for "appris_principal" tags; when we fail to find those, we fall back to the "basic" tags; and if we find neither of these tags, we fall back to the knownCanonical method used for the "UCSC Genes" tracks above. You can find a description of how APPRIS labels transcripts with their "appris_principal" tags here: http://appris.bioinfo.cnio.es/#/help/database. (You will need to scroll down to the "Principal Isoform flags" section.) Additionally, if you would like more information about how GENCODE tags "basic" transcripts, you can see the descriptions of their various tags here: https://www.gencodegenes.org/gencode_tags.html.
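
To make the fallback order for hg38 concrete, here is a rough Python sketch. It is only an illustration of the order described above, not the actual UCSC build code. The function name, the input layout (a dict from transcript ID to its set of GENCODE tags, for one gene), and the transcript IDs are all hypothetical. The prefix match on "appris_principal" is an assumption so that numbered variants of the tag would also count. The sketch does not say which transcript wins when several share a tier, because that is not described here; it only returns the candidates.

```python
from typing import Dict, List, Set, Tuple


def canonical_candidates(tags_by_transcript: Dict[str, Set[str]]) -> Tuple[str, List[str]]:
    """Return (tier, transcript IDs) for one gene, following the fallback order.

    tags_by_transcript is a hypothetical input layout: transcript ID -> GENCODE tags.
    """
    # Tier 1: transcripts carrying an "appris_principal" tag.
    # Assumption: match by prefix so numbered variants are included.
    principal = [
        tid for tid, tags in tags_by_transcript.items()
        if any(tag.startswith("appris_principal") for tag in tags)
    ]
    if principal:
        return "appris_principal", principal

    # Tier 2: transcripts carrying the "basic" tag.
    basic = [tid for tid, tags in tags_by_transcript.items() if "basic" in tags]
    if basic:
        return "basic", basic

    # Tier 3: neither tag found; the "UCSC Genes" knownCanonical method would
    # be applied to all transcripts (that method is not implemented here).
    return "ucsc_genes_fallback", list(tags_by_transcript)


if __name__ == "__main__":
    # Hypothetical transcript IDs and tag sets, for illustration only.
    gene = {"tx1": {"basic"}, "tx2": {"basic", "appris_principal_1"}}
    print(canonical_candidates(gene))
```

Returning the tier name alongside the candidates makes it easy to see which rule selected a transcript when comparing against Ensembl's canonical set.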

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group

Hansen, Adam Wesley

Jun 14, 2017, 12:09:09 PM
to Matthew Speir, gen...@soe.ucsc.edu

Matthew, thank you for the detailed response!

