Different gene versions in knownGene and kgTxInfo

Yuval Nevo

May 25, 2016, 11:14:04 AM
to gen...@soe.ucsc.edu
Hello UCSC Genome Browser team,
I am working with the current tables of the hg38 human assembly.
I am trying to merge information from the knownGene and kgTxInfo tables on the name field. However, while the knownGene table has 195178 unique names and the kgTxInfo table has 104178 unique names, the overlap contains only 9459 of them.
I did notice that in several cases there are version differences (for example, uc001aak.4 in the knownGene table appears as uc001aak.3 in the kgTxInfo table). It seems that the kgTxInfo table is not up to date...
Is there a reason for this low overlap of gene names? Can I ignore the version and merge by the first 8 characters of the name? Will a new kgTxInfo table be placed in the annotations directory soon?
Thanks a lot, Yuval.
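
A minimal sketch of how the overlap described above could be checked with and without the version suffix. It is only a diagnostic, not a recommended merge: matching uc001aak.4 to uc001aak.3 pairs two different versions of a record, so the joined annotation may be stale. The function names are hypothetical, and apart from uc001aak.4 and uc001aak.3 the IDs are made-up toy values, not real table contents.

```python
def strip_version(name: str) -> str:
    """Drop a trailing '.N' version suffix, e.g. 'uc001aak.4' -> 'uc001aak'."""
    base, sep, version = name.rpartition(".")
    # Only strip when there is a dot followed by digits; otherwise keep the name.
    return base if sep and version.isdigit() else name


def overlap_report(names_a, names_b):
    """Count IDs shared by two lists, exactly and after removing versions."""
    set_a, set_b = set(names_a), set(names_b)
    base_a = {strip_version(n) for n in set_a}
    base_b = {strip_version(n) for n in set_b}
    return {
        "exact": len(set_a & set_b),
        "versionless": len(base_a & base_b),
    }


# Toy name columns (assumed values for illustration only).
known_gene = ["uc001aak.4", "uc001aal.1", "uc057aaa.1"]
kg_tx_info = ["uc001aak.3", "uc001aal.1", "uc009zzz.2"]

report = overlap_report(known_gene, kg_tx_info)
print(report)  # {'exact': 1, 'versionless': 2}
```

Splitting on the last dot rather than taking the first 8 characters avoids assuming that every accession has the same length.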

Matthew Speir

May 27, 2016, 1:05:51 PM
to Yuval Nevo, gen...@soe.ucsc.edu
Hello Yuval,

Thank you for your questions about the kgTxInfo table.

This table was discontinued when we switched the UCSC Genes track from
our own gene models to those imported from GENCODE. You can read more
about that switch here:
http://genome.ucsc.edu/goldenPath/newsarch.html#062915. The table
should have been removed from the hg38 database when the new version of
the track was released, but, unfortunately, it looks like that never
happened. I have now removed the outdated kgTxInfo table.

Thank you for bringing this issue to our attention.

I hope this is helpful. If you have any further questions, please reply
to gen...@soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group