Hi IGV team and community,
I am analyzing PacBio Iso-Seq long-read RNA sequencing data using IGV and encountered an issue in the 3' UTR region that I need help understanding.
In my IGV alignment view, I noticed that many reads in the 3' UTR contain gaps in the middle. However, these gaps are not connected by split-read lines, which usually indicate a split alignment (e.g., exon-exon junctions). This confuses me because:
Additionally, I noticed several "I" (insertion) symbols in the alignment. Given that insertions in PacBio long reads can be either sequencing errors or real RNA isoform variation, I want to confirm:
I would appreciate any insights on how IGV handles long-read RNA alignments, particularly in cases where gaps exist without split-read lines.
Thanks in advance for your help!
Best,
Shicheng
Hi Shicheng,
First, by default the aligned reads in the IGV view are placed wherever there is room for them, so it is normal to see them next to each other on the same line. If you want only one per line, you can right-click on the track and select the "Full" mode, rather than the default "Expanded".
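To illustrate why two separate reads can end up side by side on one row and look like a single read with a gap, here is a minimal first-fit packing sketch. It is not IGV's actual code; the function name `pack_reads`, the `min_gap` parameter, and the read coordinates are all made up for the example.

```python
def pack_reads(reads, min_gap=1):
    """Place each read on the first row that has room for it.

    reads: list of (name, start, end) tuples.
    Returns a list of rows, each row being a list of reads.
    """
    rows = []       # reads assigned to each row
    row_ends = []   # end coordinate of the last read on each row
    for read in sorted(reads, key=lambda r: r[1]):
        name, start, end = read
        for i, last_end in enumerate(row_ends):
            # Room on this row: the read starts past the previous read's end.
            if start >= last_end + min_gap:
                rows[i].append(read)
                row_ends[i] = end
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([read])
            row_ends.append(end)
    return rows


# Hypothetical reads: read_a and read_b do not overlap, read_c overlaps both.
reads = [("read_a", 100, 400), ("read_b", 450, 900), ("read_c", 300, 700)]
rows = pack_reads(reads)
for i, row in enumerate(rows):
    print(i, [name for name, _, _ in row])
```

In this sketch read_a and read_b share a row with empty space between them. That space is not part of either alignment, which is why no connecting line is drawn across it.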
Regarding the insertions, IGV displays all insertions that are included in the BAM record the same way; it cannot tell the cause of the insertion. If you are concerned about small insertions and deletions that may be due to artifacts in the long-read data, you can right-click on the track and select "Hide small indels" and then set a threshold; any indels smaller than the threshold will not be displayed. This can also be set in View > Preferences to make it the default for all tracks that you load. Hiding small indels is a commonly used feature with long-read data, although I understand the quality of the data has much improved since the early days of long-read sequencing.
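As a rough illustration of what a small-indel threshold does, the sketch below parses a CIGAR string and keeps only the insertions and deletions that are at least as long as the threshold. This is not how IGV implements the option; the helper name `visible_indels` and the example CIGAR string are invented for the example.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def visible_indels(cigar, threshold):
    """Return (op, length) for each I or D at least `threshold` bases long."""
    return [
        (op, int(length))
        for length, op in CIGAR_OP.findall(cigar)
        if op in "ID" and int(length) >= threshold
    ]


# Hypothetical spliced long-read alignment with three indels and one intron.
cigar = "50M1I30M2D20M12I40M500N60M"
print(visible_indels(cigar, threshold=1))  # every indel is kept
print(visible_indels(cigar, threshold=3))  # only the 12 bp insertion is kept
```

Note that the 500N operation (a skipped region, used for splice junctions) is not an indel, so it is unaffected by the threshold.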
Helga