Canonical transcripts for ensembl and RefSeq hg19 latest version


Genetics Savvy

Aug 29, 2016, 10:36:32 AM
to gen...@soe.ucsc.edu
Hi, 
I need the canonical transcripts for hg19 (assembly: hg19) for Ensembl (track: GENCODE v24 lift37) and RefSeq (track: RefSeq Genes).

I didn't see the usual "knownCanonical" table in the "table" dropdown for these two. 
Could you please help?

Much thanks,
Savvy. 

Brian Lee

Aug 29, 2016, 12:22:25 PM
to Genetics Savvy, gen...@soe.ucsc.edu
Dear Savvy,

Thank you for using the UCSC Genome Browser and for your question about tables like knownCanonical for knownGene.

Before mailing our list in the future, please spend time searching our archives of previous answers, as we have limited resources to reply to users: https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome

There is no knownCanonical table available for the hg19 RefSeq (refSeq) genes track or GENCODE Gene V24lift37 (wgEncodeGencodeV24lift37) genes track.

Unlike hg19 UCSC Genes (knownGene), which is built by UCSC, these other gene prediction sources do not provide such tables. RefSeq, for example, does not produce an official set of "canonical" transcripts. You will have to do your own research and decide for yourself which transcript in these other gene prediction sets you wish to treat as the "canonical" transcript. For GENCODE, you must read the track description page, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeGencodeV24lift37, where you will learn about APPRIS, http://appris.bioinfo.cnio.es/, which attempts to annotate isoforms but does not, in the end, provide a canonical table for the genes in this set.

There has been much discussion of this topic in our archived mailing list. You must read it and interpret it independently for your research needs, and make your own final decision about how to go forward in your work. The UCSC mailing list cannot help you further in this area, as you will see in the archives.

The take-home message is that for RefSeq and GENCODE, bioinformaticians such as yourself must do their own work to decide which transcript to treat as canonical in each of these gene prediction datasets, because the sources of these data do not produce such a canonical list.
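As one illustration of what such a do-it-yourself rule might look like, here is a minimal Python sketch that picks, for each gene, the transcript with the longest coding sequence, breaking ties by total exonic length and then by transcript name. This is only an assumed heuristic for demonstration, not an official UCSC, RefSeq, or GENCODE definition of "canonical". The `Transcript` record, its field names, and the toy identifiers are hypothetical. The coordinates are assumed to be 0-based, half-open genePred-style intervals, with non-coding transcripts assumed to have cds_start equal to cds_end.

```python
from collections import namedtuple

# Hypothetical record; field names are illustrative, not a UCSC schema.
Transcript = namedtuple(
    "Transcript", "name gene exon_starts exon_ends cds_start cds_end"
)


def cds_length(tx):
    """Count coding bases: exon bases that overlap [cds_start, cds_end)."""
    total = 0
    for start, end in zip(tx.exon_starts, tx.exon_ends):
        total += max(0, min(end, tx.cds_end) - max(start, tx.cds_start))
    return total


def exonic_length(tx):
    """Total length of all exons, coding or not."""
    return sum(end - start for start, end in zip(tx.exon_starts, tx.exon_ends))


def pick_representatives(transcripts):
    """Return {gene: transcript name} under the assumed longest-CDS rule."""
    best = {}
    for tx in transcripts:
        key = (cds_length(tx), exonic_length(tx))
        current = best.get(tx.gene)
        if (
            current is None
            or key > current[0]
            # Equal lengths: prefer the alphabetically smaller name so the
            # result does not depend on input order.
            or (key == current[0] and tx.name < current[1].name)
        ):
            best[tx.gene] = (key, tx)
    return {gene: tx.name for gene, (_, tx) in best.items()}


# Toy data, invented for this example only.
EXAMPLE = [
    Transcript("NM_TOY_1", "GENE_A", [100, 300], [200, 400], 150, 350),
    Transcript("NM_TOY_2", "GENE_A", [100, 300, 500], [200, 400, 600], 150, 550),
    Transcript("NR_TOY_3", "GENE_B", [1000], [1200], 1200, 1200),
]

if __name__ == "__main__":
    for gene, name in sorted(pick_representatives(EXAMPLE).items()):
        print(gene, name)
```

Whether longest CDS, an APPRIS annotation, or some other criterion is appropriate depends on your research question, so treat the rule above as a placeholder to be replaced by your own.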

For future questions, do not reply to personal emails; rather, email gen...@soe.ucsc.edu, and only after reviewing our archives and other external resources. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to geno...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genomics Institute