Hi Alex,
I searched the web before posting here, but if I missed something, please point me to the right place.
Until recently, my team was using STAR version 2.4.k in two-pass mode, as below:
# run 1st pass
mkdir Pass1
cd Pass1
$STAR $CommonPars --genomeDir $GenomeDir --readFilesIn $Reads
cd ..
# make splice junctions database file out of SJ.out.tab, filter out non-canonical junctions
mkdir GenomeForPass2
cd GenomeForPass2
awk 'BEGIN {OFS="\t"; strChar[0]="."; strChar[1]="+"; strChar[2]="-";} {if($5>0){print $1,$2,$3,strChar[$4]}}' ../Pass1/SJ.out.tab > SJ.out.tab.Pass1.sjdb
# generate genome with junctions from the 1st pass
$STAR --genomeDir ./ --runMode genomeGenerate --genomeFastaFiles $GenomeFasta --sjdbFileChrStartEnd SJ.out.tab.Pass1.sjdb --sjdbOverhang 100 --runThreadN 12
cd ..
# run 2nd pass with the new genome
mkdir Pass2
cd Pass2
$STAR $CommonPars --genomeDir ../GenomeForPass2 --readFilesIn $Reads
I suggested moving to STAR 2.5.3 with --twopassMode Basic, as I want to use STAR-Fusion downstream.
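For concreteness, the single-command replacement would look roughly like the sketch below. It reuses the same $STAR, $CommonPars, $GenomeDir and $Reads variables as the script above and assumes $GenomeDir is an absolute path; the function name and the TwoPassBasic directory are placeholders chosen for illustration only.

```shell
# Sketch only: one STAR invocation with built-in two-pass mode.
# run_twopass_basic and TwoPassBasic are illustrative names, not part of STAR.
run_twopass_basic() (
    # The parentheses run the body in a subshell, so the cd does not leak out.
    mkdir -p TwoPassBasic && cd TwoPassBasic || exit 1
    # $CommonPars and $Reads are left unquoted on purpose so they split into
    # separate arguments, as in the original script.
    $STAR $CommonPars --genomeDir "$GenomeDir" --readFilesIn $Reads --twopassMode Basic
)
```

Calling run_twopass_basic would replace the Pass1, GenomeForPass2 and Pass2 steps with a single run.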
I looked at sjdbList.out.tab in both scenarios for the same sample, and the filtering criteria look different. Could you please help me better understand the difference?
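In case it helps to reproduce the comparison, here is a minimal sketch of one way to count the junctions that differ between the two lists. It assumes both files are tab-separated with chromosome, start, end and strand in the first four columns; the function name compare_sjdb and the paths in the usage comment are hypothetical.

```shell
# Sketch only: count junctions unique to each sjdbList.out.tab and shared by both.
# compare_sjdb is an illustrative helper, not part of STAR.
compare_sjdb() (
    tmp=$(mktemp -d) || exit 1
    # Keep chr, start, end, strand and sort in the C locale so comm sees a consistent order.
    cut -f1-4 "$1" | LC_ALL=C sort -u > "$tmp/manual"
    cut -f1-4 "$2" | LC_ALL=C sort -u > "$tmp/basic"
    echo "only_manual $(LC_ALL=C comm -23 "$tmp/manual" "$tmp/basic" | wc -l | tr -d ' ')"
    echo "only_basic $(LC_ALL=C comm -13 "$tmp/manual" "$tmp/basic" | wc -l | tr -d ' ')"
    echo "shared $(LC_ALL=C comm -12 "$tmp/manual" "$tmp/basic" | wc -l | tr -d ' ')"
    rm -rf "$tmp"
)

# Example call (both paths are assumptions about where each run wrote its list):
# compare_sjdb GenomeForPass2/sjdbList.out.tab path/to/basic_run/_STARgenome/sjdbList.out.tab
```

Dropping the `| wc -l | tr -d ' '` part of a line prints the junctions themselves instead of a count, which makes it easier to see which kinds of junctions appear in only one list.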
Thanks a ton.