Hello Doron,
Thank you for using the UCSC Genome Browser and for submitting your question regarding mRNA sequence differences. Please see answers below to your two questions:
1) Because poly-A tails vary in length and can be quite long, they are always trimmed down to two bases in the browser. In the case of transcript uc008odn.1, three "A's" have been trimmed to our standard two-A poly-A tail.
2) UCSC Genes are built using a multi-step pipeline that takes various kinds of evidence into account for its annotations. You can read more about the methods here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm10&g=knownGene
In this case, the UCSC Genes pipeline started with the RefSeq mRNA NM_025653. It used information from other RefSeq mRNAs and ESTs at this position to adjust its prediction. The final predicted transcript, uc007yge.2, included a slightly longer 5' UTR than the starting mRNA. However, when you request the mRNA for this transcript from UCSC Genes, it provides you with the unchanged starting sequence of NM_025653.
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
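As a rough illustration of the trimming described in answer 1, the sketch below shortens a trailing run of A's to at most two bases. The function name `trim_poly_a` and the `keep` parameter are hypothetical, and this is not the UCSC pipeline's actual code; it only assumes that "trimming" means cutting the trailing A-run down to a fixed length.

```python
def trim_poly_a(seq: str, keep: int = 2) -> str:
    """Trim a trailing run of A's down to at most `keep` bases.

    Illustrative only: the name and the `keep` default are assumptions,
    not taken from the UCSC Genes pipeline.
    """
    # Remove every trailing A (upper or lower case) to find where the tail starts.
    body = seq.rstrip("Aa")
    tail_len = len(seq) - len(body)
    # Put back at most `keep` bases of the original tail.
    return body + seq[len(body):len(body) + min(tail_len, keep)]


if __name__ == "__main__":
    # A made-up sequence ending in three A's is shortened to end in two.
    print(trim_poly_a("GATTACGAAA"))
    # A tail shorter than `keep` is left unchanged.
    print(trim_poly_a("GATTACGA"))
```

Under this assumption, a sequence that ends in three A's loses one base, which matches the "three A's trimmed to two" description for uc008odn.1.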
Dear Cath,
Thank you very much for your reply and sorry for my late one. Please look at my questions/comments below your answers.
1) Because poly-A tails vary in length and can be quite long, they are always trimmed down to two bases in the browser. In the case of transcript uc008odn.1, three "A's" have been trimmed to our standard two-A poly-A tail.
Ok, I understand that the poly-A tail is trimmed to two bases. However, it is still not clear to me why in one case there are 2 "A's" and in the other 3 "A's". The following part of the sentence is not clear to me "three "A's" have been trimmed to our standard two-A poly-A tail.". What did you mean?
In addition, if the poly-A tail is always trimmed to two (or three?) bases, how come some transcripts have a much longer poly-A chain? For example, uc007hdm.2 has 26 "A's".
2) UCSC Genes are built using a multi-step pipeline that takes various kinds of evidence into account for its annotations. You can read more about the methods here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm10&g=knownGene
In this case, the UCSC Genes pipeline started with the RefSeq mRNA NM_025653. It used information from other RefSeq mRNAs and ESTs at this position to adjust its prediction. The final predicted transcript, uc007yge.2, included a slightly longer 5' UTR than the starting mRNA. However, when you request the mRNA for this transcript from UCSC Genes, it provides you with the unchanged starting sequence of NM_025653.
I think I understood this, but it is very confusing that the sum of the exon lengths is not the same as the length of the transcript. What do you think?
Thanks!
Doron
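The mismatch raised in point 2 can be checked by summing the exon lengths from the annotation and comparing the total with the length of the returned mRNA sequence. The sketch below assumes the exon coordinates come as comma-separated `exonStarts` and `exonEnds` strings with 0-based, half-open intervals, as in the genePred-style knownGene table. The function name and all coordinates are made up for illustration and are not the real values for uc007yge.2 or NM_025653.

```python
def exon_length_sum(exon_starts: str, exon_ends: str) -> int:
    """Sum exon lengths from comma-separated start/end coordinate lists.

    Assumes 0-based, half-open intervals, so each exon length is end - start.
    A trailing comma in either list is tolerated.
    """
    starts = [int(x) for x in exon_starts.split(",") if x]
    ends = [int(x) for x in exon_ends.split(",") if x]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds must have the same number of entries")
    return sum(end - start for start, end in zip(starts, ends))


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from any real transcript.
    annotated = exon_length_sum("1000,2000,3000,", "1150,2100,3300,")
    # Hypothetical length of the mRNA sequence returned for the same transcript.
    mrna_length = 540
    print("sum of exon lengths:", annotated)
    print("mRNA sequence length:", mrna_length)
    print("difference:", annotated - mrna_length)
```

A non-zero difference is what you would expect if the predicted transcript has a longer 5' UTR than the starting mRNA while the sequence returned is still the unchanged starting mRNA.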
Hi Doron,
Can you please post your follow-up question to our mailing list, gen...@soe.ucsc.edu? That way we can provide the best answer from our team. In addition, others with similar questions in the future will be able to search for and find the answer. Once the question is posted there, our team looks forward to providing an answer!
Cath Tyner
UC Santa Cruz Genomics Institute
On Thu, Feb 25, 2016 at 1:52 AM, Doron Lemze <doron...@gmail.com> wrote: