Discrepancy Between Splice Junction Annotations and GTEx RNA-Seq Coverage in UCSC Genome Browser


X L

Jan 10, 2026, 10:00:18 AM
to UCSC Genome Browser Public Support
Hi UCSC Developers,

I’ve noticed a discrepancy between the annotated splice junctions and the GTEx RNA-Seq coverage tracks in the UCSC Genome Browser. For example, at chr15:56,103,540–56,103,563 (hg38), which corresponds to a splice donor site in the RFX7 gene, the annotated junction appears to be shifted a few bases upstream relative to the GTEx RNA-Seq coverage. The annotation of the corresponding splice acceptor (chr15:56,102,217–56,102,283) also appears to be shifted a few bases upstream relative to the GTEx RNA-Seq coverage.

Could you please look into whether this offset reflects a known annotation issue?

Thank you for your time and support!

Best regards,

Xiao
Screenshot 2026-01-10 at 9.51.35 AM.png

Screenshot 2026-01-10 at 9.57.36 AM.png

Luis Nassar

Jan 14, 2026, 7:19:59 PM
to X L, UCSC Genome Browser Public Support

Hi, Xiao.

RFX7 contains a U12-type intron, which uses AT-AC splice junctions instead of the canonical GT-AG junctions found in most introns. Both GENCODE and RefSeq correctly annotate this intron. However, the genomic sequence at the donor site is ATAT, which caused STAR (the aligner used for GTEx data) to shift the alignment by two bases, incorrectly calling the junction at the second AT instead of the first. This results in the apparent offset you observed between the RNA-seq coverage and the annotation.
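To illustrate why a two-base offset like this can arise, here is a minimal sketch in Python. It uses a made-up toy sequence (not the real RFX7 sequence) and hypothetical helper names. It assumes an intron that begins with ATAT and is followed by an exon that begins with AT, and shows that sliding both intron boundaries by two bases leaves the spliced read sequence unchanged, so the read sequence alone cannot distinguish the two placements.

```python
def splice(seq, intron_start, intron_end):
    """Remove seq[intron_start:intron_end] (0-based, half-open) and join the flanks."""
    return seq[:intron_start] + seq[intron_end:]


def intron_ends(seq, intron_start, intron_end):
    """Return the first and last two bases of the intron (donor, acceptor)."""
    intron = seq[intron_start:intron_end]
    return intron[:2], intron[-2:]


# Toy sequence in transcript orientation. These bases are hypothetical and
# chosen only to reproduce the ATAT pattern described above.
exon1 = "GGCTTCAG"
intron = "ATATCCTTAACTTTTCCTTAAC"  # starts with ATAT, ends with AC
exon2 = "ATGGTCCA"                 # starts with AT
seq = exon1 + intron + exon2

start = len(exon1)
end = start + len(intron)

annotated = (start, end)         # AT ... AC boundaries
shifted = (start + 2, end + 2)   # both boundaries moved two bases downstream

spliced_annotated = splice(seq, *annotated)
spliced_shifted = splice(seq, *shifted)

print(intron_ends(seq, *annotated), spliced_annotated)
print(intron_ends(seq, *shifted), spliced_shifted)
```

Both placements give the same spliced sequence, but the first reads as an AT-AC intron and the second as an AT-AT intron. The sketch only demonstrates the sequence-level ambiguity; it says nothing about which junction the spliceosome actually uses.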



--
I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on our public forum. If your question includes sensitive information, you may send it instead to genom...@soe.ucsc.edu.

Lou Nassar
UCSC Genomics Institute

X L

Jan 15, 2026, 6:00:32 PM
to Luis Nassar, UCSC Genome Browser Public Support
Hi Luis,

Thank you for your input. I noticed that this junction is predominantly non-canonical (CC-AT), regardless of the aligner used—both HISAT2 and STAR yielded consistent results in my analysis. Could it be that the annotation pipeline is enforcing classification as an AT-AC–type intron, even when the experimental coverage supports a different splice site?

Best regards,
Xiao

acceptor.png
donor.png

Gerardo Perez

Jan 22, 2026, 12:44:04 PM
to X L, UCSC Genome Browser Public Support

Hello, Xiao.

Thank you for your follow-up question. This turned out to be an interesting case.

You may find the SpliceAI Wildtype track useful, as it can help find the correct splice site: https://genome.ucsc.edu/cgi-bin/hgTrackUi?&db=hg38&position=default&g=spliceAIWt

One of our engineers reviewed a limited amount of data to better understand this junction, including:

  • 189 ENCODE tissue poly(A) plus RNA-Seq experiments
  • ENCODE4 long reads
  • UCSC BLAT of GenBank mRNAs and splice ESTs

Here is what he observed.

Donor support

AT chr15:56,103,552–56,103,553

  • Annotated in GENCODE and RefSeq
  • Supported by some ENCODE poly(A) plus RNA-seq data
  • SpliceAI signal: 0.78

AT chr15:56,103,550–56,103,551

  • Supported by the majority of ENCODE poly(A) plus RNA-seq data
  • Supported by an ENCODE long-read transcript
  • Supported by BLAT of GenBank mRNAs and ESTs
  • SpliceAI signal: 0.14

Acceptor support

AC chr15:56,102,254–56,102,255

  • Annotated in GENCODE and RefSeq
  • Supported by some ENCODE poly(A) plus RNA-seq data
  • SpliceAI signal: 0.26

AT chr15:56,102,254–56,102,252

  • Supported by the majority of ENCODE poly(A) plus RNA-seq data
  • Supported by an ENCODE long-read transcript
  • Supported by BLAT of GenBank mRNAs and ESTs
  • No SpliceAI signal detected

Based on this, there may be two possible U12 introns present:

  • AT/AC (annotated): donor chr15:56,103,552–56,103,553 with acceptor chr15:56,102,254–56,102,255
  • AT/AT: donor chr15:56,103,550–56,103,551 with acceptor chr15:56,102,254–56,102,252

The AT/AC intron annotated by GENCODE and RefSeq appears to be the minority version, at least in your data and in what was reviewed here.
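When checking dinucleotides like these by hand, strand orientation is easy to get wrong. The donor coordinates quoted above are higher than the acceptor coordinates, which is what one would expect for a gene on the minus strand; that is an inference from the coordinates, not something stated in this thread. The following Python sketch, with hypothetical helper names and a made-up toy sequence rather than the real RFX7 bases, shows how an intron fetched as plus-strand reference sequence needs to be reverse-complemented before its donor and acceptor dinucleotides are read.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def intron_dinucleotides(plus_strand_intron, strand):
    """Return (donor, acceptor) dinucleotides in transcript orientation.

    plus_strand_intron is the intron as it reads on the reference (plus)
    strand; strand is "+" or "-" for the gene being examined.
    """
    seq = plus_strand_intron.upper()
    if strand == "-":
        seq = reverse_complement(seq)
    elif strand != "+":
        raise ValueError("strand must be '+' or '-'")
    return seq[:2], seq[-2:]


def classify(donor, acceptor):
    """Label a few well-known intron boundary types; anything else is 'other'."""
    known = {
        ("GT", "AG"): "GT-AG",
        ("GC", "AG"): "GC-AG",
        ("AT", "AC"): "AT-AC",
    }
    return known.get((donor, acceptor), "other (%s-%s)" % (donor, acceptor))


# Hypothetical plus-strand intron for a minus-strand gene. Read naively on the
# plus strand it starts with GT and ends with AT.
toy_plus_intron = "GTTAAGGAAAAGTTAAGGATAT"

donor, acceptor = intron_dinucleotides(toy_plus_intron, "-")
print(donor, acceptor, classify(donor, acceptor))
```

In this toy case the plus-strand view begins with GT and ends with AT, while the transcript-orientation view is an AT-AC intron, so the same bases can look like quite different junctions depending on which strand is read.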

For reference, here is a browser session showing the data that was examined: http://genome-test.gi.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Markd&hgS_otherUserSessionName=MLQ%2D36906

Cases like this are typically handled through manual review, with coordination between the GENCODE and RefSeq annotation groups. For a definitive annotation decision, we recommend reaching out directly to RefSeq and GENCODE.

I hope this is helpful. If you have any further questions about UCSC Genome Browser tools or data, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute


X L

Jan 31, 2026, 11:00:26 AM
to Gerardo Perez, UCSC Genome Browser Public Support
Hi Gerardo,

That’s interesting—thank you very much for your effort on the analysis.

Best,
Xiao