[FusionInspector] Ratio Spanning / Junction reads

Mario N

Jul 9, 2020, 5:47:11 AM

to STAR-Fusion

Hi,

I have a quick question about the ratio of spanning reads to junction reads. I'm using FusionInspector and I often see a large difference between the number of spanning reads and the number of junction reads.

Let's take an example: say I'm using paired-end sequencing with 75 bp reads and a 150 bp insert.

So my read pair should look like this:

* : reads

- : Insert

**********--------------------**********

The chance of the breakpoint falling within a read (75 + 75 bp, junction reads) or within the insert (150 bp, spanning reads) should be equal.

So if I understood correctly, the ratio of junction reads to spanning reads depends entirely on the length of the insert...

How can we explain a large imbalance between the two?

Best,

Mario

Brian Haas

Jul 9, 2020, 8:17:34 AM

to Mario N, STAR-Fusion

Hi Mario,

The read lengths and the position of the breakpoint in the transcript are two key factors that will impact the number of split/junction reads vs. spanning fragments (with other biases assumed equal). Once the reads get long enough, you'll only find split reads. Also, as the breakpoint gets closer to the termini of the transcript, the more split reads you'll have relative to spanning frags. This is assuming it's a real 'simple' fusion and you have good evidence. If it's an artifact or a complex fusion event, then it's a different story.
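The effect of read length and breakpoint position can be illustrated with a toy model. The sketch below is not how FusionInspector counts evidence; it simply assumes a single fusion transcript, a fixed fragment length, and one fragment starting at every possible position, then classifies each fragment. The function name `count_evidence` and all parameter values are hypothetical.

```python
def count_evidence(transcript_len, breakpoint, read_len, frag_len):
    """Count junction reads vs. spanning fragments on a toy fusion transcript.

    Assumptions (illustration only): fragments have a fixed length and start
    uniformly at every position; `breakpoint` is the boundary between base
    breakpoint-1 (left partner) and base breakpoint (right partner).
    Returns (junction_fragments, spanning_fragments).
    """
    junction = 0
    spanning = 0
    for start in range(transcript_len - frag_len + 1):
        end = start + frag_len
        read1 = (start, start + read_len)   # half-open interval [lo, hi)
        read2 = (end - read_len, end)
        # A read is a junction read if it has bases on both sides of the breakpoint.
        crosses = any(lo < breakpoint < hi for lo, hi in (read1, read2))
        if crosses:
            junction += 1
        # Spanning: read 1 lies fully left and read 2 fully right of the breakpoint.
        elif read1[1] <= breakpoint <= read2[0]:
            spanning += 1
    return junction, spanning


# 75 bp reads, 300 bp fragment (75 + 150 + 75), breakpoint mid-transcript
print(count_evidence(1000, 500, 75, 300))    # (148, 151)
# Reads long enough to overlap within the fragment: no spanning fragments
print(count_evidence(1000, 500, 151, 300))   # (299, 0)
# Breakpoint close to the 5' terminus: junction reads dominate
print(count_evidence(1000, 100, 75, 300))    # (74, 26)
```

Under these assumptions, the counts are roughly balanced for an interior breakpoint with short reads, shift entirely to junction reads once the mates overlap, and favor junction reads when the breakpoint sits near a transcript end. Real data adds variable fragment sizes, minimum anchor lengths, and coverage bias, so the numbers are only indicative.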

--
You received this message because you are subscribed to the Google Groups "STAR-Fusion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to star-fusion...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/star-fusion/2b17c48b-de64-441d-8926-8c7c7dff660bo%40googlegroups.com.

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

Mario N

Jul 9, 2020, 9:08:13 AM

to STAR-Fusion

Hello Brian,

Thanks for the quick answer.

What do you mean by "Once the reads get long enough, you'll only find split reads"? The probability depends on the lengths of the insert and the reads, so with longer reads we have a better chance of getting split reads, but we should still have some chance of getting spanning reads, shouldn't we?

Right, I didn't think of the termini of the transcript...

I'm looking for some generic guidelines to validate the fusions.

Sometimes I have a huge imbalance between split and spanning reads. Thanks to your answer I can see why there would be more split reads, but I still have trouble understanding how we can have more spanning reads than split reads. (I have a case with 2 split and 163 spanning reads, and I can't say whether it's an artifact or not.)

Best,

Brian Haas

Jul 9, 2020, 9:14:58 AM

to Mario N, STAR-Fusion

For the read length issue, eventually the reads will overlap each other within a paired-fragment, and so you won't have spans with gaps, and reads will cross the breakpoint.
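As a rough worked version of this point, write R for the read length and F for the full fragment length (both symbols are introduced here for illustration). The unsequenced gap between the two mates is g, and for a fragment that contains an interior breakpoint at a uniformly random position, the approximate chances are:

```latex
g = F - 2R, \qquad
P(\text{spanning}) \approx \frac{\max(0,\, g)}{F}, \qquad
P(\text{junction}) \approx \frac{\min(2R,\, F)}{F}
```

With R = 75 and F = 300 (two 75 bp reads around a 150 bp gap) this gives about 0.5 for each. Once 2R reaches F the gap vanishes, so the spanning term drops to zero and every fragment covering the breakpoint yields a junction read. This ignores minimum anchor lengths and fragment-size variation.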

If you're finding huge numbers of spanning reads with few breakpoint reads, it could be that the fusion breakpoint is unusual (involving an insertion at the breakpoint), or the breakpoint falls in an intronic region not being captured by the FusionInspector intron-shrinkage step. Other possibilities include low complexity sequence or difficult mapping at the breakpoint. More often than not, when you have tons of spanning fragments but no breakpoint read, I find it has to do with RT artifacts with highly expressed transcripts or alignments between seq-similar regions of the fusion partners.

hth,

~b

Mario N

Jul 9, 2020, 9:57:07 AM

to STAR-Fusion

That seems much clearer to me.

So most of the time, if a fusion has more spanning reads than split reads, it's probably an artifact...

Good to know. I always thought that a high combined count of spanning + split reads (> 30) made a fusion a good candidate...

Thanks for the enlightenment,

Best,

Mario

Brian Haas

Jul 9, 2020, 10:01:12 AM

to Mario N, STAR-Fusion

I would treat those with high numbers of spanning reads w/ few breakpoint reads as highly suspicious.
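One way to apply this rule of thumb is a simple screen over the fusion table. The sketch below is not part of FusionInspector; it assumes a tab-delimited table with `#FusionName`, `JunctionReadCount`, and `SpanningFragCount` columns (check the header of your own output file), and the function name `flag_suspicious`, the gene names, and both thresholds are hypothetical values chosen only for illustration.

```python
import csv
import io


def flag_suspicious(tsv_text, max_junction=2, min_spanning_per_junction=10):
    """Return names of fusions with many spanning fragments but few junction reads.

    Both thresholds are hypothetical illustration values, not recommendations
    from the thread or from FusionInspector.
    """
    flagged = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        junction = int(row["JunctionReadCount"])
        spanning = int(row["SpanningFragCount"])
        # Divide by at least 1 so zero junction reads does not raise an error.
        ratio = spanning / max(junction, 1)
        if junction <= max_junction and ratio >= min_spanning_per_junction:
            flagged.append(row["#FusionName"])
    return flagged


# Made-up rows; the first mirrors the 2 split / 163 spanning case from the thread.
example = (
    "#FusionName\tJunctionReadCount\tSpanningFragCount\n"
    "GENEA--GENEB\t2\t163\n"
    "GENEC--GENED\t40\t35\n"
    "GENEE--GENEF\t0\t3\n"
)
print(flag_suspicious(example))  # ['GENEA--GENEB']
```

A flag here only marks a candidate for closer inspection (for example, checking for sequence similarity between the partners or very high expression), not a definitive artifact call.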

I have some new updates I'm working on that will help w/ this analysis and shed some more insights, I hope.

best,

~b
