Hi @mattqdean
STAR does not need or use the "expected" insert length information for mapping, so you do not have to worry about it. :)
When mapping DNA-seq data, we can decide whether an alignment has an insert size that is too big or too small by comparing it with the "expected" distribution of insert sizes.
However, in RNA-seq the unsequenced portion of the insert between the mates can contain splice junctions.
This means that we cannot simply calculate the insert size from an alignment and compare it with the "expected" insert size: the distance between the mates on the genome may include introns that are not part of the fragment.
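
To illustrate the point with made-up numbers, here is a minimal Python sketch. It is not STAR code; the function names and coordinates are hypothetical, and it assumes 1-based inclusive coordinates with a single intron lying entirely in the unsequenced gap between the mates.

```python
def genomic_span(mate1_start, mate2_end):
    """Distance on the genome from the leftmost to the rightmost
    aligned base of the pair (1-based, inclusive coordinates)."""
    return mate2_end - mate1_start + 1


def fragment_length(mate1_start, mate2_end, introns_between):
    """Fragment length on the transcript: the genomic span minus the
    introns spliced out between the mates.

    introns_between is a list of (start, end) intron coordinates.
    In practice these are unknown when the junction falls in the
    unsequenced part of the insert, which is exactly the problem.
    """
    intron_total = sum(end - start + 1 for start, end in introns_between)
    return genomic_span(mate1_start, mate2_end) - intron_total


# Hypothetical pair: mate 1 starts at 1000, mate 2 ends at 6299.
span = genomic_span(1000, 6299)

# Assume a 5000-base intron (1200..6199) sits between the mates.
frag = fragment_length(1000, 6299, [(1200, 6199)])

print(f"genomic span: {span}, fragment length: {frag}")
```

The alignment alone gives only the genomic span (5300 here), while the actual fragment is 300 bases long, so comparing the span with an "expected" insert size would wrongly flag this pair.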
Cheers
Alex
On Monday, February 8, 2016 at 6:32:38 PM UTC-5, matt...@gmail.com wrote:
I have a set of TCGA tumor/normal paired RNA-seq samples. In the past I used RSeQC to determine the fragment length and insert size and to pick a "good" integer value for the minimum inner mate distance and mate standard deviation, which were then automatically fed to TopHat for each sample. Although this was a really convenient feature, the process takes too long.
I am wondering if STAR has a similar function for paired samples. I reviewed the manual and -help but could not find anything that touched on this issue. I've looked on SEQanswers and Biostars, but there don't seem to be many solutions or much documentation. So I'm curious whether it is even common to use this filter for variant calling or any other downstream analysis.