Dear Yi,
Thank you very much for this script, which runs very well!
Just one comment: the automatic creation of .bam files from the .sam files given as input to the script takes a lot of time.
I have taken the liberty of modifying the script so that I can pass my .bam files as arguments. It verifies that each file has a valid header and exits if one does not.
The small modification to the script is attached.
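For anyone who would rather check the .bam files before calling the script, here is a minimal shell sketch of the same idea. It is not the attached modification: the helper name check_bam_headers is hypothetical, and it assumes samtools is installed and on the PATH.

```shell
# Hypothetical helper, not the attached modification: stop early when a BAM
# file is missing or its header cannot be read.
# Assumes samtools is installed and on PATH.
check_bam_headers() {
    for bam in "$@"; do
        if [ ! -f "$bam" ]; then
            echo "BAM file not found: $bam" >&2
            return 1
        fi
        # "samtools view -H" prints only the header; an @HD or @SQ line is expected.
        if ! samtools view -H "$bam" 2>/dev/null | grep -Eq '^@(HD|SQ)'; then
            echo "No valid header in: $bam" >&2
            return 1
        fi
    done
}
```

It could be called as: check_bam_headers sample1.bam sample2.bam || exit 1 (the file names here are placeholders).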
In addition, it sometimes happens that "IncLevel1" or "IncLevel2" in the rMATS output contains an "NA" value. When "rmats2sashimiplot.py" encounters this value, it raises an error (of course, it is not a numeric value) and exits, so the events after the first NA are not processed. I remove the lines containing NA before launching the script with the following awk commands:
# on "IncLevel1" column
awk -F'\t' -v value="NA" -v column="21" '
# skip any row whose IncLevel1 field contains NA
$column ~ value {++removed; next}
1 {print}
END {printf "%d lines removed\n", removed >"/dev/stderr"}
' <SE.MATS.ReadsOnTargetAndJunctionCounts.txt >SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA.txt
# on "IncLevel2" column
awk -F'\t' -v value="NA" -v column="22" '
# skip any row whose IncLevel2 field contains NA
$column ~ value {++removed; next}
1 {print}
END {printf "%d lines removed\n", removed >"/dev/stderr"}
' <SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA.txt >SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA_tmp.txt
mv SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA_tmp.txt SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA.txt
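The two passes above can also be folded into one. The following is only a sketch: the function name filter_na_rows is hypothetical, and it assumes the file is tab-separated with a header row that names the columns "IncLevel1" and "IncLevel2", so the column numbers do not need to be hard-coded.

```shell
# Hypothetical helper: drop rows whose IncLevel1 or IncLevel2 contains "NA".
# Columns are located by header name, assuming a tab-separated file with a header row.
# Usage: filter_na_rows input.txt output.txt
filter_na_rows() {
    awk -F'\t' '
    NR == 1 {
        # Locate the two columns from the header line.
        for (i = 1; i <= NF; i++) {
            if ($i == "IncLevel1") c1 = i
            if ($i == "IncLevel2") c2 = i
        }
        if (!c1 || !c2) {
            print "IncLevel1 or IncLevel2 not found in header" > "/dev/stderr"
            failed = 1
            exit 1
        }
        print
        next
    }
    # Values are comma-separated per replicate, so match NA as a whole item.
    ("," $c1 ",") ~ /,NA,/ || ("," $c2 ",") ~ /,NA,/ { removed++; next }
    { print }
    END { if (!failed) printf "%d lines removed\n", removed > "/dev/stderr" }
    ' "$1" > "$2"
}
```

For example: filter_na_rows SE.MATS.ReadsOnTargetAndJunctionCounts.txt SE.MATS.ReadsOnTargetAndJunctionCounts_without_NA.txt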
And then it works!
Thank you very much for this script, Yi! It's very, very useful.