TNNI3

Christian...@radboudumc.nl

May 14, 2014, 8:26:29 AM

to gen...@soe.ucsc.edu

Dear Sir/Madam,

We use the resources of the UCSC Genome Browser a lot (and are very happy with them). However, we occasionally find small mistakes, particularly in the refGene annotation. I was wondering where we can send such information, and whether it is possible to get it corrected.

Our most recent finding concerns the annotation of the gene TNNI3 (NM_000363) according to RefSeq. The only transcript according to RefSeq has 7 exons and does not encode a valid protein.

However, after some puzzling we figured out that the more likely gene model has a shorter first exon and an additional small exon (8 exons in total), which seems to be supported by other data resources (we used Alamut).

To be precise, the first exon should start at chr19:55,668,947 rather than 55,668,935, and an additional exon should be introduced at chr19:55,668,664-55,668,676.

Please let me know whether you can use this information to make improvements.

Kind regards,

Christian Gilissen PhD

Department of Human Genetics (855)

Radboud University Medical Center

Geert Grooteplein 10

6525 GA Nijmegen, the Netherlands

Email. christian...@radboudumc.nl

Tel. +31 24 36 68940

http://genomicdisorders.nl

The Radboud university medical center is listed in the Commercial Register of the Chamber of Commerce under file number 41055629.

Steve Heitner

May 14, 2014, 6:10:25 PM

to Christian...@radboudumc.nl, gen...@soe.ucsc.edu

Hello, Christian.

Thank you for reporting this error. We perform our own alignments as described in the Methods section of the RefSeq Genes description page at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=refGene. In rare cases, mistakes are made such as the one you just illustrated.

According to the GenBank record at http://www.ncbi.nlm.nih.gov/nuccore/NM_000363?report=GenBank, there are indeed 8 exons in TNNI3.

If you examine the UCSC Genes, RefSeq Genes and GENCODE tracks side-by-side, you will see that UCSC Genes and GENCODE both contain exon 2 which occurs at chr19:55,668,664-55,668,676. The problem in the RefSeq Genes track stems from an apparent issue with exon 1. If you view chr19:55,668,933-55,668,960, you will see that exon 1 properly ends at 55,668,947 in both UCSC Genes and GENCODE. In RefSeq Genes, a stop codon correctly appears, but the exon erroneously continues beyond it. This is why the size of exon 1 is improperly reported and why the missing exon 2 never appears in the RefSeq Genes track.
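For readers who want to compare exon coordinates between tracks outside the browser, here is a minimal Python sketch. It assumes positions are written the way they appear in this thread (1-based, inclusive, with optional thousands separators); `parse_position` and `missing_exons` are hypothetical helper names, not part of any UCSC tool. The second position in the usage lines is the viewing window quoted above, used only as a stand-in interval.

```python
def parse_position(pos):
    """Parse a position such as 'chr19:55,668,664-55,668,676' into (chrom, start, end)."""
    chrom, span = pos.replace(" ", "").split(":")
    start, end = (int(x.replace(",", "")) for x in span.split("-"))
    return chrom, start, end


def missing_exons(reference, candidate):
    """Return the exons in `reference` that overlap no exon in `candidate`."""
    missing = []
    for chrom, start, end in reference:
        overlaps = any(
            c == chrom and s <= end and start <= e for c, s, e in candidate
        )
        if not overlaps:
            missing.append((chrom, start, end))
    return missing


exon2 = parse_position("chr19:55,668,664-55,668,676")
window = parse_position("chr19:55,668,933-55,668,960")

# Length in bases, assuming 1-based inclusive coordinates.
print(exon2[2] - exon2[1] + 1)  # 13

# Exon 2 does not overlap the exon 1 viewing window, so it is reported as missing.
print(missing_exons([exon2], [window]))
```

In practice the two lists would hold the full exon sets of one transcript from two different tracks.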

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group

--

Christian...@radboudumc.nl

May 15, 2014, 2:36:17 AM

to st...@soe.ucsc.edu, gen...@soe.ucsc.edu

Hi Steve,

Thanks very much for your rapid response.

Three further questions regarding this:

1. What is the procedure now? Will you fix this and, if so, within what time frame? (i.e., can we wait for you to fix this?)

2. What would you recommend to us for resolving this issue? Should we base our proteins on UCSC Genes instead?

3. We calculate this for all RefSeq genes, and we only looked into this case because we are interested in this gene. Would it be useful for you if we sent you a complete list of all transcripts that do not encode a valid protein? (As you said, there are not so many.)
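The thread does not say how "does not encode a valid protein" was determined. As a rough illustration only, the sketch below assumes the usual criteria for a spliced coding sequence: the length is a multiple of three, it starts with ATG, it ends with a stop codon, and it contains no in-frame stop before the end. The function name `cds_problems` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cds_problems(cds):
    """Return a list of reasons why `cds` does not look like a valid coding sequence.

    `cds` is the spliced coding sequence (exons joined, on the coding strand).
    An empty list means no problem was found under the assumed criteria.
    """
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not start with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("internal stop codon")
    return problems


print(cds_problems("ATGGCCTAA"))     # []
print(cds_problems("ATGTAAGCCTAA"))  # ['internal stop codon']
```

Running such a check over every transcript in an annotation would yield the kind of list mentioned in question 3; real pipelines may apply further criteria (for example selenoproteins or incomplete CDS flags) that this sketch ignores.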

Kind regards,

Christian

Matthew Speir

May 16, 2014, 3:41:07 PM

to Christian...@radboudumc.nl, st...@soe.ucsc.edu, gen...@soe.ucsc.edu

Hello Christian,

Thank you for your follow-up questions about the TNNI3 gene. Currently, the RefSeq track is generated through an automated pipeline. We download the transcripts from RefSeq, align them to the genome, and then filter the alignments according to the parameters described in the Methods section of the track description page, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=refGene. Since this track is not manually curated, it is unlikely that this issue with TNNI3 or other proteins will be fixed in the RefSeq Genes track. In the future, we hope to add a track that contains RefSeq's alignments to help resolve issues where our re-alignments are different from the RefSeq alignments. In the meantime, you could use the GENCODE Genes track, which does not have this issue with the TNNI3 protein. The GENCODE track, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeGencodeV19, contains manual annotations merged with evidence-based automated annotations.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group

--
