What to do when STAR-Fusion fails to find a fusion?


simp...@gmail.com

Mar 4, 2021, 10:34:57 AM
to STAR-Fusion
Hey Brian, thanks for all the hard work!
Question: is there a way you can think of to manually look for fusions that STAR-Fusion might miss? I ask because I recently came across two samples that I know for sure carry a particular fusion pair, both involving DUX4. One sample should contain IGH@-DUX4 and the other CIC-DUX4. Regardless of what I try, only FusionCatcher was able to pick up the CIC-DUX4 fusion, and the IGH@-DUX4 fusion remains elusive.

Is there a way you can think of to manually look for this, perhaps via split reads? 

thanks! 

Brian Haas

Mar 4, 2021, 11:02:34 AM
to simp...@gmail.com, STAR-Fusion
Hi,

Is it still missed even with the current version of STAR-Fusion?  Recent versions of STAR-Fusion and the corresponding CTAT genome libs include some customizations that further improve detection of these fusions.

The first thing I'd do to investigate it is to look at the file:
star_fusion_outdir/star-fusion.preliminary/star-fusion.junction_breakpts_to_genes.txt

and grep for the genes of interest to see if you find any reads mapped to those genes.

If no chimeric reads are being assigned to these genes, then STAR-Fusion will have no chance of flagging them.
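For anyone who prefers a script to grep, here is a minimal sketch of the same check. It assumes only that the file is plain text with gene names appearing somewhere on each line; it does not rely on any particular column layout, and the function name `find_gene_lines` is made up for illustration. Matching is by case-insensitive substring, as with `grep -i`, so "IGH" also catches names such as IGHV or IGHJ segments.

```python
def find_gene_lines(path, genes):
    """Return (line_number, line) pairs for lines mentioning any gene of interest.

    Matching is a case-insensitive substring search, like `grep -i`,
    so a query of "IGH" also matches names that merely contain "IGH".
    """
    wanted = [gene.upper() for gene in genes]
    hits = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            upper_line = line.upper()
            if any(gene in upper_line for gene in wanted):
                hits.append((line_number, line.rstrip("\n")))
    return hits


# Example call (path taken from the post above; adjust to your output directory):
# hits = find_gene_lines(
#     "star_fusion_outdir/star-fusion.preliminary/"
#     "star-fusion.junction_breakpts_to_genes.txt",
#     ["DUX4", "CIC", "IGH"],
# )
# for line_number, line in hits:
#     print(line_number, line, sep="\t")
```

An empty result for a gene means no chimeric reads were assigned to it, which is the situation described above.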

The other thing to do is to try using FusionInspector.  Because the IGH gene features are annotated as large spans, you'll need to set the max intron length parameter to a high value (like 1 or 2 million).

Finally - I'm always interested in examining these things myself out of curiosity.  If you're able to share reads with me privately (and I promise not to reshare), I'm happy to take a look.  

all the best,

~brian

--
You received this message because you are subscribed to the Google Groups "STAR-Fusion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to star-fusion...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/star-fusion/d01e6243-1d07-454f-afe8-60dde4722ffdn%40googlegroups.com.


--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

dario....@gmail.com

Mar 5, 2021, 12:00:08 AM
to STAR-Fusion
Those gene fusions are hard to identify. Another tool, LINX (for DNA structural variants), specifically mentions them:
  • For CIC-DUX4 and IGH-DUX4, the DUX4 end may map to a number of different chromosomal regions, including the telomeric ends of both chromosomes 10q and 4q and the hg19 alt contig GL000228.1.

Brian Haas

Mar 5, 2021, 8:36:40 AM
to dario....@gmail.com, STAR-Fusion
For DUX4, our hg38v22 CTAT genome lib works best.  The pseudogenes and paralogs of DUX4 are masked out of the confounding regions.  There are still some additional DUX4 paralogs that aren't masked out of our hg37v19 CTAT genome lib and that are partially confounding. I'll deal with that in the next-to-next release. ;-)  I've got the new release coming out in the next few days.

