Number of exons in the human PEG10 gene in UCSC


Agrii Enginee

Jan 8, 2018, 11:49:31 AM
to gen...@soe.ucsc.edu
Hi,
I searched for the human PEG10 gene in NCBI, and it lists 2 exons for this gene.
Since I use UCSC for my work, I also searched for this gene in UCSC, and the output contains 3 exons. I think one may have been added by mistake!

1- chr7 94656324 94656580 NM_015068.3_exon_0_0_chr7_94656325_f
2- chr7 94663333 94664513 NM_015068.3_exon_1_0_chr7_94663334_f
3- chr7 94664513 94669695 NM_015068.3_exon_2_0_chr7_94664514_f

Perhaps exon 2 should start at 94663333 and end at 94669695, since there is no space for an intron between the last two exons.
Also, when I search for introns in UCSC, there is only one intron, at chr7 94656580 94663333.

Thank you.
Keyvan

Jairo Navarro Gonzalez

Jan 10, 2018, 2:36:44 PM
to Agrii Enginee, gen...@soe.ucsc.edu

Hello Keyvan,

Thank you for using the UCSC Genome Browser and for your inquiry.

The "exons" in the output are actually aligned blocks of sequence, i.e. regions of the genome that align without gaps to the transcript sequence. Usually, alignment gaps between these regions are caused by introns in the reference genome, so calling the aligned blocks "exons" is usually accurate. However, in this case, the second exon is interrupted by an anomalous alignment gap that happens to have length 0 on the reference genome. We internally store our coordinates as zero-based half-open coordinates, and you can find more information about our coordinate system from the following blog post:

http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/
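
As a rough illustration of the zero-based half-open convention, the sketch below converts the three block coordinates from the question into one-based fully-closed positions, which is where the 94656325, 94663334 and 94664514 in the exon names come from. The function names are made up for this example and are not part of any UCSC tool.

```python
# Minimal sketch: zero-based half-open <-> one-based fully-closed.
# Function names are hypothetical, chosen only for this illustration.

def to_one_based(start0, end0):
    """Convert a zero-based half-open interval to one-based fully-closed."""
    return start0 + 1, end0


def block_length(start0, end0):
    """Length of a zero-based half-open interval is simply end - start."""
    return end0 - start0


# Block coordinates as reported for NM_015068.3 (zero-based half-open).
blocks = [
    (94656324, 94656580),
    (94663333, 94664513),
    (94664513, 94669695),
]

for start0, end0 in blocks:
    start1, end1 = to_one_based(start0, end0)
    print(f"chr7:{start1}-{end1} length={block_length(start0, end0)}")
```

Only the start shifts by one; the end stays the same, because the half-open end already points one past the last base.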

You can see the zero-length false "intron" in question from the following MySQL query in the ncbiRefSeqCurated table for NM_015068.3:

mysql> select * from ncbiRefSeqCurated where name = "NM_015068.3";
+------+-------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------+-----------------------------+-------+-------+--------------+------------+------------+
| bin  | name        | chrom | strand | txStart  | txEnd    | cdsStart | cdsEnd   | exonCount | exonStarts                  | exonEnds                    | score | name2 | cdsStartStat | cdsEndStat | exonFrames |
+------+-------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------+-----------------------------+-------+-------+--------------+------------+------------+
| 1307 | NM_015068.3 | chr7  | +      | 94656324 | 94669695 | 94663556 | 94665682 |         3 | 94656324,94663333,94664513, | 94656580,94664513,94669695, |     0 | PEG10 | cmpl         | cmpl       | -1,0,1,    |
+------+-------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------+-----------------------------+-------+-------+--------------+------------+------------+

Note that the second exonEnd is equal to the third exonStart (94664513) -- there is an alignment gap of length 0 on the reference genome. Usually, this means that the reference genome is missing a base from the transcript, requiring an alignment gap. However, this case is even rarer, and if you click into the details page for NM_015068.3, there is a note:

protein translation is dependent on -1 ribosomal frameshift; isoform 1 is encoded by transcript variant 1.

So it seems like the reference genome is not missing a base after all, but rather a ribosomal hiccup (the -1 frameshift) is required for correct translation.
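
If you want exon counts that match NCBI, one possible approach is to merge aligned blocks that abut on the genome before counting. The sketch below does this for the exonStarts and exonEnds values shown in the query result; the helper names are hypothetical, and it assumes blocks are sorted and non-overlapping, as they are in this row.

```python
# Minimal sketch: find zero-length alignment gaps and merge abutting blocks.
# Helper names are hypothetical; the input strings are copied from the
# exonStarts / exonEnds columns for NM_015068.3.

def parse_coords(field):
    """Parse a comma-terminated coordinate list such as '1,2,3,'."""
    return [int(x) for x in field.split(",") if x]


def gaps_between(blocks):
    """Return the (start, end) of the gap between each pair of adjacent blocks."""
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:])]


def merge_abutting(blocks):
    """Merge blocks whose end equals the next block's start (zero-length gap)."""
    merged = []
    for start, end in blocks:
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


exon_starts = parse_coords("94656324,94663333,94664513,")
exon_ends = parse_coords("94656580,94664513,94669695,")
blocks = list(zip(exon_starts, exon_ends))

zero_length = [g for g in gaps_between(blocks) if g[0] == g[1]]
exons = merge_abutting(blocks)
introns = gaps_between(exons)

print("zero-length gaps:", zero_length)
print("merged exons:", exons)
print("introns:", introns)
```

With these inputs the merged list has two exons, and the single remaining intron spans 94656580 to 94663333, which matches what the intron search returned in the original question.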

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro 
UCSC Genomics Institute
