Slightly different SJs from first pass of twopassMode Basic versus manually running STAR 1-pass

Heather Geiger

Aug 20, 2025, 11:35:55 AM
to rna-star

I am currently trying to run STAR in 2-pass mode, following the instructions here:

https://github.com/broadinstitute/gtex-pipeline/blob/master/TOPMed_RNAseq_pipeline.md

Full code is in the GitHub issue here: https://github.com/alexdobin/STAR/issues/2656

However, we are currently facing an issue similar to the one described in issue #733, where the second pass of STAR runs much slower than the first pass. For us, the problem is even more severe than what is described in that issue (<1M reads/hour on the second pass).

Anyway, while we may eventually want to filter out some of the novel junctions as described in that issue, the first step is simply to see whether we can exactly reproduce the results of two-pass mapping with the option --twopassMode Basic, but using two separate STAR commands instead: one to run the first pass just to generate the initial SJ.out.tab file, then a second pass with --sjdbFileChrStartEnd pointing to the SJ.out.tab file from the first pass.

Following the instructions in the linked issue, I ran STAR 1-pass with standard parameters, without BAM/SAM output (--outSAMtype None), without --chim* options and without --outFilterType BySJout.
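To make the setup concrete, here is a minimal Python sketch of the two separate commands as I understand them. It only builds the argument lists and does not run STAR. The function name, the output prefixes, and the paths are hypothetical. The exact parameters we used are in the GitHub issue, not here.

```python
def build_manual_two_pass(genome_dir, reads, prefix, threads=8,
                          common_extra=(), pass2_extra=()):
    """Return (pass1_cmd, pass2_cmd) argv lists for a manual 2-pass STAR run.

    Assumption: pass 1 uses default parameters, with no --chim* options and
    no --outFilterType BySJout, and writes no alignments.
    """
    common = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        *common_extra,  # e.g. ("--readFilesCommand", "zcat") for gzipped reads
    ]
    pass1_prefix = prefix + "pass1_"  # hypothetical naming scheme
    pass1 = common + [
        "--outFileNamePrefix", pass1_prefix,
        "--outSAMtype", "None",  # only SJ.out.tab is needed from pass 1
    ]
    pass2 = common + [
        "--outFileNamePrefix", prefix + "pass2_",
        # Junctions discovered in pass 1 are inserted into the genome here.
        "--sjdbFileChrStartEnd", pass1_prefix + "SJ.out.tab",
        *pass2_extra,
    ]
    return pass1, pass2
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. Neither command contains --twopassMode, since the point is to replace it with two explicit runs.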

However, I am getting 54-78 junctions unique to each run (the SJ.out.tab from the standalone 1-pass run versus _STARpass1/SJ.out.tab from the first pass of 2-pass mode). That is not a lot, but I do not understand why the two files are not identical.
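For reference, a comparison like this can be sketched in Python as below. This is an illustrative version, not necessarily the exact script used. It keys each junction on the first four SJ.out.tab columns (chromosome, intron start, intron end, strand) and keeps the unique and multi-mapping read counts (columns 7 and 8) so the run-specific junctions can be inspected. The function names are hypothetical.

```python
import csv


def load_junctions(path):
    """Read a STAR SJ.out.tab file.

    Returns {(chrom, start, end, strand): (unique_reads, multimap_reads)}.
    """
    junctions = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            key = (row[0], int(row[1]), int(row[2]), int(row[3]))
            junctions[key] = (int(row[6]), int(row[7]))
    return junctions


def compare_junctions(a, b):
    """Return sorted lists of junction keys found only in a and only in b."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return only_a, only_b
```

Calling `load_junctions` on each of the two files and passing the results to `compare_junctions` gives the two lists of run-specific junctions. Looking up their read counts in the loaded dictionaries would show whether they are mostly low-support junctions.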

Any insights as to why this might be happening, and what we could do to fix it?
