Help with EML4-ALK fusion detection


Léo Biscassi

Aug 26, 2020, 4:08:24 PM
to STAR-Fusion
Hi everyone,
I'm having trouble detecting EML4-ALK gene fusions in several samples when I use the STAR-Fusion pipeline from the Docker image trinityctat/starfusion, as reported in this link [1]. I also tried running only FusionInspector against a commercial control [2], passing just the fusions that I expected, and those results are fine.

Here is what I've tried to solve this issue:

1) Run with max_sensitivity enabled
2) Run with full_Monty enabled
3) Change the parameters alignSplicedMateMapLmin and alignSplicedMateMapLminOverLmate to the STAR default values
4) Change the parameter alignSplicedMateMapLmin to its default value and alignSplicedMateMapLminOverLmate to 0.1, like the user in the GitHub issue did
5) Change the parameters peOverlapNbasesMin, alignSplicedMateMapLmin and alignSplicedMateMapLminOverLmate to the STAR default values

None of them worked.

I checked the output BAM file generated by the first STAR run in IGV, and I do have reads in the regions covered by the panels (Archer FusionPlex Solid Tumor and Illumina TruSight RNA Fusion). I then checked whether the candidate was filtered out in a step after the initial candidates given by STAR, but it is not even in the star-fusion.preliminary/star-fusion.fusion_candidates.preliminary file.
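The check described above can be scripted. This is only a minimal sketch: it assumes the preliminary file is tab-delimited and that the fusion name appears in some column in the `GENE1--GENE2` form used elsewhere in this thread. The function name `find_fusion_rows` is hypothetical and is not part of STAR-Fusion.

```python
import csv


def find_fusion_rows(tsv_path, left_gene, right_gene):
    """Return the rows of a tab-delimited file that name the given gene pair.

    Assumption: the fusion is written as LEFT--RIGHT in one of the columns.
    """
    target = f"{left_gene}--{right_gene}"
    hits = []
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip empty lines and header/comment lines starting with '#'
            if not row or row[0].startswith("#"):
                continue
            if target in row:
                hits.append(row)
    return hits
```

Calling `find_fusion_rows("star-fusion.preliminary/star-fusion.fusion_candidates.preliminary", "EML4", "ALK")` returns an empty list when the candidate never made it into the preliminary set, which is the situation described here.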

What else can I do to solve this issue?

Here is the raw data of my commercial positive control: https://1drv.ms/u/s!AuXrwYGmjO6jgbdsOu0rTQGxbzAxQA?e=vTBDT5
Expected fusions: EML4-ALK, CCDC6-RET and SLC34A2-ROS1. If you want, I can also share the pipeline results that I have.

Thanks in advance

Brian Haas

Aug 26, 2020, 6:13:54 PM
to Léo Biscassi, STAR-Fusion
Hi Leo,

I'll take a look and get back to you.  EML4--ALK is usually one of the easy ones to detect.  More soon,

~brian


--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Léo Biscassi

Aug 26, 2020, 6:27:51 PM
to Brian Haas, STAR-Fusion
Hi Brian,
Thanks a lot. If you need anything, let me know.

--
Léo Biscassi

Léo Biscassi

Aug 26, 2020, 6:34:27 PM
to Brian Haas, STAR-Fusion
Brian, just to let you know: the pipeline with standard parameter values catches the RET and ROS1 fusions.

Here are the results with standard parameter values: https://1drv.ms/u/s!AuXrwYGmjO6jgbdxuuuYaCBwDIxzVw?e=DvCQxm
And here are the FusionInspector results with standard parameter values (using the raw data that I've already sent): https://1drv.ms/u/s!AuXrwYGmjO6jgbdwLtgLRX_LTZ9neA?e=haKd5w

The funny thing is that when I run just FusionInspector, it finds 101 EML4-ALK junction reads.

Best regards,

Brian Haas

Aug 26, 2020, 6:42:48 PM
to Léo Biscassi, STAR-Fusion
interesting.  thanks for the extra info!  I'll let you know what I find.  There should be a good explanation for it.

best,

~b

Brian Haas

Aug 27, 2020, 9:58:13 AM
to Léo Biscassi, STAR-Fusion
Hi Leo,

From looking at the alignments in FusionInspector, there appear to be ~20 bases that get soft-clipped on the 'left' fastq reads, and I suspected that these non-matching bases (adaptor seqs? indexes?) might be confounding the STAR-Fusion algorithm.

I stripped off the first 22 bases of these reads and then reran them through STAR-Fusion, and it found the EML4--ALK just fine.

Attached are the EML4--ALK supporting fusion reads, the 'left fastq' that I trimmed the 5' 22 bases from, and a little python script for doing this fastq trimming.  STAR-Fusion results for them are included too.
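For readers without the attachment, the trimming step can be sketched roughly as follows. This is not the script from for_Leo.tar.gz; it is a minimal illustration that assumes standard four-line FASTQ records, and the names `open_maybe_gzip` and `trim_fastq_5prime` are hypothetical.

```python
import gzip


def open_maybe_gzip(path, mode):
    """Open a plain or gzip-compressed text file, chosen by the .gz extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def trim_fastq_5prime(in_path, out_path, n_bases=22):
    """Drop the first n_bases of every read and of its quality string.

    Assumes four-line FASTQ records. Returns the number of records written.
    Reads shorter than n_bases end up empty and may need filtering afterwards.
    """
    count = 0
    with open_maybe_gzip(in_path, "r") as src, open_maybe_gzip(out_path, "w") as dst:
        while True:
            header = src.readline()
            if not header:
                break  # end of file
            seq = src.readline().rstrip("\n")
            plus = src.readline()
            qual = src.readline().rstrip("\n")
            dst.write(header)
            dst.write(seq[n_bases:] + "\n")
            dst.write(plus)
            dst.write(qual[n_bases:] + "\n")
            count += 1
    return count
```

Only the 'left' fastq would be passed through it here, for example `trim_fastq_5prime("left.fq.gz", "left.trimmed.fq.gz", 22)`, where both file names are placeholders.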

Any idea what these ~20 5' bases might represent? Do you need to do adaptor trimming or something?

best,

~brian


for_Leo.tar.gz

Léo Biscassi

Aug 27, 2020, 12:14:41 PM
to Brian Haas, STAR-Fusion
Hi Brian,
Nice, maybe this explains the poor result with the Archer FusionPlex Solid Tumor panel. I know that we have GSP2 primers with molecular barcodes at this end, but at the same time, I have problems identifying this specific fusion with the Illumina RNA Fusion panel, which doesn't have this characteristic.

I didn't trim any sequence because I read in the STAR paper that the software is capable of doing that automatically. In this case, do you recommend trimming before starting the STAR-Fusion pipeline?

I'll check what I've said with the biologist and get back to you. I'll also look at the sequences and your script and reply with feedback.

Thank you!

Best,

Brian Haas

Aug 27, 2020, 12:31:02 PM
to Léo Biscassi, STAR-Fusion
I think there's an option in STAR to have it begin alignment at some position in the sequence, effectively doing trimming there. You might be able to pass that in as a parameter.  I'd have to dig into the usage for it.  Sometimes it's easier to write a script than it is to look through user documentation. ;-)
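The STAR option that matches this description is most likely `--clip5pNbases`, which takes the number of bases to clip from the 5' end of each mate. Below is a small sketch of building that argument fragment, assuming 22 bases for the left reads and none for the right. The helper name `star_clip5p_args` is hypothetical, and how to forward these arguments through STAR-Fusion is not shown here; that would need to be checked against its usage.

```python
def star_clip5p_args(left_bases, right_bases=0):
    """Build the STAR argument fragment that clips bases from the 5' end of each mate."""
    if left_bases < 0 or right_bases < 0:
        raise ValueError("clip lengths must be non-negative")
    # STAR accepts one value per mate: first the left reads, then the right reads
    return ["--clip5pNbases", str(left_bases), str(right_bases)]
```

For the case in this thread, `star_clip5p_args(22)` yields the fragment `--clip5pNbases 22 0`, which would be appended to the rest of a STAR command line. Whether clipping inside STAR behaves the same as trimming the fastq beforehand for fusion detection has not been verified in this thread.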