What’s an example of when to use --max_sensitivity vs --full_Monty?
- What output differences should we expect between the two flags?
- What do the following output columns mean and how are they useful?
- est_J
- est_S
- The annotation filter is supposed to remove red herring fusion predictions.
- Why is it that I still have fusions with annotation "GTEx_recurrent_StarF2019" when those are supposed red herrings?
- Are those false positives for cancer samples, or are they false positives for having a real fusion event? (https://github.com/FusionAnnotator/CTAT_HumanFusionLib/wiki#red-herrings-fusion-pairs-that-may-not-be-relevant-to-cancer-and-potential-false-positives)
- I didn’t find documentation to explain some values in the “annot” column. Can you please clarify what these mean (including examples for how to interpret the numbers)?
- "LOCAL_REARRANGEMENT:+/-:[some number]"
- "NEIGHBORS[some number]"
- "NEIGHBORS_OVERLAP:+/-:+/-:[some number]"
- What exactly is the difference between INTERCHROMOSOMAL[chromosome and some Mb] and LOCAL_REARRANGEMENT:+/-:[some number]?
- We have 1000s of Disease/CTRL samples and we hope to understand differential transcripts based on brain region + disease status.
- Are there any specific flags you recommend based on our project description?
- Are there recommendations for how to conduct downstream analyses with the abridged TSVs?
--
You received this message because you are subscribed to the Google Groups "STAR-Fusion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to star-fusion...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/star-fusion/e7bdd7ec-f578-46ff-a058-d6701b577ad7n%40googlegroups.com.
Thanks for the unbelievably quick reply! :)
With est_J/est_S, what confused me is that they seem redundant, since there are also the "JunctionReadCount" and "SpanningFragCount" columns. Apologies for not clarifying that.
1. What is the difference between "JunctionReadCount" and "est_J" (and likewise between "SpanningFragCount" and "est_S")?
2. With regards to the "red herring" stuff, I used the latest trinityctat/starfusion image, which uses STAR-Fusion version 1.10.0. The command I've used with that version is:
STAR-Fusion --left_fq {fastq_R1} --right_fq {fastq_R2} --genome_lib_dir {/path/to/genome/} --CPU 8 --output_dir {out_dir} --FusionInspector validate
Can you let me know why I would be getting these "false positives"? Are they only false positives in the cancer space, or would they be "false positives" in all disease spaces?
NOTE: I've attached a sample abridged TSV output file from STAR-Fusion as an example.
3. Outside of using more CPUs, are there any specific flags or options that will help with increasing speed of processing the large amount of data we have?
4. I meant to compare LOCAL_REARRANGEMENT with INTRACHROMOSOMAL, but you've answered my question by explaining LOCAL_REARRANGEMENT.
5. I'm debating whether the "--FusionInspector validate" flag is useful or not. Do you think it would be in my case? Better yet, are there specific use cases for this flag?
6. Since we are looking at Transcriptomics data, how can we detect whether "fusion candidates" are ACTUAL gene fusions vs trans-splicing occurring?
Thanks so much Brian! :)
Hello Brian,
MANY MANY CONGRATS on the pre-print! :)
I had 2 follow-up questions.
1) What's the difference in meaning between "NEIGHBORS" and "NEIGHBORS_OVERLAP" in the "annots" column of STAR-Fusion? Is it just that if the SHORTEST POSSIBLE base pair distance between two genes is less than 10K base pairs (0.00Mb), then its special category is "NEIGHBORS_OVERLAP"?
To clarify the "SHORTEST POSSIBLE" portion above, let's say you have Gene A that goes from chr1:1-10 and Gene B that goes from chr1:100-500 (both are on the positive strand). If you have a fusion between the two, the shortest distance between the genes is the start point of Gene B (100) minus the end point of Gene A (10). Please let me know if my understanding of "NEIGHBORS_OVERLAP" is wrong.
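To make the "SHORTEST POSSIBLE" distance above concrete, here is a minimal Python sketch of the calculation as I understand it. The function name `gene_gap` is made up for illustration, and this is only my interpretation of the distance, not necessarily how STAR-Fusion computes it internally.

```python
def gene_gap(a_start, a_end, b_start, b_end):
    """Shortest distance between two gene intervals on the same chromosome.

    Hypothetical helper: returns 0 when the two intervals overlap,
    otherwise the gap between the nearest ends.
    """
    return max(0, max(a_start, b_start) - min(a_end, b_end))


# Gene A at chr1:1-10 and Gene B at chr1:100-500, as in the example above.
gap = gene_gap(1, 10, 100, 500)
print(gap)  # start of Gene B (100) minus end of Gene A (10)
```

The order of the two genes does not matter in this sketch, and overlapping genes give a distance of 0.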
2) Along the same lines as the last question, I'm still trying to understand "[some number]" in ONE example such as: "LOCAL_REARRANGEMENT:+/-:[some number]". You mentioned previously that "[some number]" is the genetic distance between two genes. This doesn't make sense to me based on the following example.
I found the following fusion: AC010332.1--ZNF880, which had the following "annot" info: [""INTRACHROMOSOMAL[chr19:0.01Mb]"",""LOCAL_REARRANGEMENT:+:[6864]""]
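For reference, here is a rough Python sketch of how I am pulling the two numbers out of that annotation string so they can be compared side by side. The regular expressions are my own guess at the tag layout based on this one example, and treating the LOCAL_REARRANGEMENT number as base pairs is an assumption on my part.

```python
import json
import re

# Example cell copied from above; the doubled quotes come from CSV-style escaping.
raw = '[""INTRACHROMOSOMAL[chr19:0.01Mb]"",""LOCAL_REARRANGEMENT:+:[6864]""]'


def parse_annots(cell):
    """Return the list of annotation tags stored in one annotation cell."""
    # Undo the doubled quotes, then read the cell as a JSON list of strings.
    return json.loads(cell.replace('""', '"'))


tags = parse_annots(raw)

intra_mb = None  # distance from the INTRACHROMOSOMAL tag, in Mb
local_bp = None  # number from the LOCAL_REARRANGEMENT tag (assumed to be bp)
for tag in tags:
    m = re.fullmatch(r"INTRACHROMOSOMAL\[(\w+):([\d.]+)Mb\]", tag)
    if m:
        intra_mb = float(m.group(2))
    m = re.fullmatch(r"LOCAL_REARRANGEMENT:([+-]):\[(\d+)\]", tag)
    if m:
        local_bp = int(m.group(2))

# Convert the bp value to Mb, rounded to two decimals, for comparison.
print(tags)
print(local_bp, round(local_bp / 1e6, 2), intra_mb)
```

If the number really is in base pairs, 6864 bp is about 0.0069 Mb, which rounds to the 0.01Mb shown in the INTRACHROMOSOMAL tag.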