Question about IGV Display of PacBio Iso-Seq Reads in 3' UTR Region

Shicheng Guo

Mar 19, 2025, 3:36:57 AM

to igv-help

Hi IGV team and community,

I am analyzing PacBio Iso-Seq long-read RNA sequencing data using IGV and encountered an issue in the 3' UTR region that I need help understanding.

In my IGV alignment view, I noticed that many reads in the 3' UTR contain gaps in the middle. However, these gaps are not connected by split-read lines, which usually indicate a split alignment (e.g., exon-exon junctions). This confuses me because:

If these gaps are from the same read, I would expect IGV to display a connection line between split segments.
If they are from different reads, why are they aligned on the same row instead of being placed on separate rows?

Additionally, I noticed several "I" (insertion) symbols in the alignment. Given that insertions in PacBio long reads can be either sequencing errors or real RNA isoform variation, I want to confirm:

Are these real insertions relative to the reference genome, or could they be sequencing artifacts?
How does IGV determine whether to display a read as a single unit versus splitting it across multiple rows?

I would appreciate any insights on how IGV handles long-read RNA alignments, particularly in cases where gaps exist without split-read lines.

Thanks in advance for your help!

Best,

Shicheng

igv-help

Mar 20, 2025, 1:13:28 PM

to igv-help

Hi Shicheng,

First, by default the aligned reads in the IGV view are placed wherever there is room for them, so it is normal to see them next to each other on the same line. If you want only one per line, you can right-click on the track and select the "Full" mode, rather than the default "Expanded".
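To illustrate the "placed wherever there is room" idea, here is a minimal sketch of greedy row packing. It is only an illustration under stated assumptions, not IGV's actual code: the function name `pack_into_rows`, the `min_gap` parameter, and the read coordinates are all hypothetical.

```python
def pack_into_rows(reads, min_gap=0):
    """Greedy row packing: put each read in the first row that has room.

    reads: list of (name, start, end) tuples, end exclusive.
    Returns a list of rows, each a list of read names.
    """
    rows = []      # read names placed in each row
    row_ends = []  # rightmost end coordinate used so far in each row
    for name, start, end in sorted(reads, key=lambda r: r[1]):
        for i, last_end in enumerate(row_ends):
            # The read fits if it starts after the last read in this row ends.
            if start >= last_end + min_gap:
                rows[i].append(name)
                row_ends[i] = end
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([name])
            row_ends.append(end)
    return rows


# Hypothetical reads: read_C starts after read_A ends, so the two
# different reads share a row even though they are unrelated.
reads = [("read_A", 100, 400), ("read_B", 150, 500), ("read_C", 450, 900)]
rows = pack_into_rows(reads)
print(rows)
```

With this kind of packing, two separate reads can sit side by side on one row with empty space between them, which looks like a gap but has no connecting line because they are not the same read.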

Regarding the insertions, IGV displays all insertions that are included in the BAM record the same way; it cannot tell the cause of the insertion. If you are concerned about small insertions and deletions that may be artifacts of the long-read data, you can right-click on the track, select "Hide small indels", and set a threshold; any indels smaller than the threshold will not be displayed. This can also be set in View > Preferences to make it the default for all tracks that you load. Hiding small indels is a commonly used feature with long-read data, although I understand the quality of the data has improved considerably since the early days of long-read sequencing.
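As a rough sketch of what a small-indel threshold does, the snippet below parses a CIGAR string and keeps only the insertions and deletions at or above a threshold. This is an illustration of the idea, not IGV's implementation; the function name `visible_indels` and the example CIGAR string are made up for this example.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def visible_indels(cigar, threshold=0):
    """Return (op, length) for each I or D operation with length >= threshold.

    Indels shorter than the threshold are dropped, mirroring the
    "smaller than the threshold will not be displayed" behavior described above.
    """
    return [
        (op, int(length))
        for length, op in CIGAR_OP.findall(cigar)
        if op in "ID" and int(length) >= threshold
    ]


# Hypothetical CIGAR: 1 bp insertion, 2 bp deletion, 12 bp insertion,
# and a 500 bp N (skipped region), which is not an indel and is ignored here.
cigar = "50M1I30M2D20M12I40M500N60M"
all_indels = visible_indels(cigar)
large_indels = visible_indels(cigar, threshold=3)
print(all_indels)
print(large_indels)
```

Listing the indels recorded in the BAM this way shows what is there and how long each one is, but like IGV it cannot say whether a given insertion is real or an artifact.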