Hi Malcolm,
At present, removing duplicates requires multiple steps.
1. --bamRemoveDuplicatesType UniqueIdentical or UniqueIdenticalNotMulti does not actually remove duplicates from the BAM file; it only marks them by setting the 0x400 bit in the SAM FLAG.
2. --bamRemoveDuplicatesType and --outWigType cannot be used simultaneously in one run.
So, to generate the signal files without duplicates:
$ STAR --runMode inputAlignmentsFromBAM --bamRemoveDuplicatesType UniqueIdenticalNotMulti --inputBAMfile Aligned.sortedByCoord.out.bam
This will generate Processed.out.bam. (Note that UniqueIdentical would also mark all multimappers as duplicates, so in the end you would not see the multimapping signal at all.)
$ samtools view -b -F0x400 Processed.out.bam > Processed.out.noDupl.bam
$ STAR --runMode inputAlignmentsFromBAM --outWigType bedGraph --outWigNorm RPM --inputBAMfile Processed.out.noDupl.bam
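For reference, the -F0x400 option in the samtools step drops every record whose FLAG has the duplicate bit (0x400, decimal 1024) set. A minimal shell sketch of that bit test follows; the helper name is_duplicate is hypothetical and is not part of STAR or samtools:

```shell
# is_duplicate FLAG: succeeds (exit 0) if the SAM FLAG has the 0x400
# (duplicate) bit set. Hypothetical helper, for illustration only.
is_duplicate() {
    [ $(( $1 & 0x400 )) -ne 0 ]
}

# Example FLAG values:
#   99   = paired, proper pair, mate on reverse strand, first in pair
#   1123 = the same read with the duplicate bit added (99 + 1024)
for flag in 99 1123; do
    if is_duplicate "$flag"; then
        echo "$flag: marked duplicate, dropped by -F0x400"
    else
        echo "$flag: kept"
    fi
done
```

Reads that are only marked (not filtered) stay in Processed.out.bam, which is why the separate samtools step is needed before generating the signal.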
If you want to generate the counts for splice junctions after removing duplicates, you can use the script extras/scripts/sjFromSAMcollapseUandM.awk from the STAR distribution:
$ samtools view Processed.out.noDupl.bam | awk -f extras/scripts/sjFromSAMcollapseUandM.awk > SJ.out.noDupl.tab
The output format:
chr start end Nunique Nmultiple
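Since the table is plain whitespace-separated text, it can be post-processed with awk in the usual way, for example to keep only junctions supported by a minimum number of uniquely mapped reads (column 4, Nunique). A small sketch; the helper name filter_sj and the threshold of 3 in the usage line are arbitrary illustrations, not STAR defaults:

```shell
# filter_sj MIN FILE: print junctions whose Nunique (column 4) is at least MIN.
# Hypothetical helper; columns are assumed to be: chr start end Nunique Nmultiple
filter_sj() {
    awk -v min="$1" '$4 >= min' "$2"
}

# usage:
#   filter_sj 3 SJ.out.noDupl.tab > SJ.out.noDupl.min3.tab
```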
Cheers
Alex