Option to output only the primary alignment


SHEN

Jun 8, 2015, 2:31:53 PM
to rna-...@googlegroups.com
Hello,

For multi-mapped reads with equal mapping scores, STAR has the option "--outSAMprimaryFlag OneBestScore", which marks one of the best-scoring alignments as primary. However, the other alignments are still written to the alignment BAM file. One can pick out the primary alignments with "-F 0x100", but I am wondering if there is an option that outputs only the primary alignment, like the "--best" option in "bowtie2". This would save a lot of space. Thanks very much.

Bests,

Yang

Alexander Dobin

Jun 10, 2015, 6:33:19 PM
to rna-...@googlegroups.com, shen.ya...@gmail.com
Hi Yang,

There is no way to do it at the moment; I already have this feature on my TODO list.
For now, you can pipe the STAR BAM output through "samtools view -F 0x100" to remove non-primary alignments on the fly.
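
To make the effect of -F 0x100 concrete, here is a minimal Python sketch of the same filtering logic applied to SAM text lines. It is only an illustration of what the flag test does, not a replacement for samtools; the function name drop_secondary and the example records are hypothetical.

```python
SECONDARY = 0x100  # SAM FLAG bit: secondary ("not primary") alignment

def drop_secondary(sam_lines):
    """Yield header lines and records whose FLAG lacks the 0x100 bit."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep header lines untouched
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & SECONDARY:
            yield line

# Hypothetical records: one read mapped to two loci with the same score.
# FLAG 272 = 256 (secondary) + 16 (reverse strand).
example = [
    "@HD\tVN:1.4",
    "read1\t0\t3R\t100\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2\tAS:i:98",
    "read1\t272\t3R\t900\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2\tAS:i:98",
]
kept = list(drop_secondary(example))
print(len(kept), "lines kept")
```

Note that the NH tag of the surviving record still reports the original number of hits, so downstream tools can still tell that the read was a multimapper.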

Cheers
Alex

SHEN

Jun 11, 2015, 9:46:56 AM
to rna-...@googlegroups.com, shen.ya...@gmail.com
Hi Alex,

Thanks for the update. I am looking forward to the new version.

Bests,

Yang

holger brandl

Sep 8, 2015, 7:13:39 AM
to rna-star, shen.ya...@gmail.com
Hello Alex,

I'm trying to filter for primary alignments as well, and having a built-in output filter would be a nice addition to a future STAR. 

According to the manual, for multimappers a random alignment is selected as primary from the alignments of equal quality. However, this does not seem to be the case when using the latest version, STAR_2.4.2a.


I've mapped some fruitfly data against BDGP5 using

STAR --genomeDir ${igenome}/Sequence/StarIndex --readFilesIn $fastqFile --runThreadN 6 --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --sjdbGTFfile $igenome/Annotation/Genes/genes.gtf --outFilterIntronMotifs RemoveNoncanonicalUnannotated --outFilterType BySJout --quantMode GeneCounts --outFilterMultimapNmax 4


STAR creates my BAM file, and when looking at it in IGV at a region (3R:7,779,682-7,786,669) containing 2 copies of the same gene (one on the forward and one on the reverse strand), it seems that STAR is not assigning the primary flag randomly but has a strong bias towards the first gene copy:


1) All alignments:


Almost all alignments have an NH of 2 and the corresponding alignments always have the same AS.
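
An observation like this can also be checked outside IGV. The following is a rough Python sketch, assuming single-end reads and SAM text input (for example from "samtools view"); the helper names and the example records are hypothetical. It lists the reads that have more than one alignment and whose alignments all carry the same AS value.

```python
from collections import defaultdict

def tags(fields):
    """Parse optional SAM fields (TAG:TYPE:VALUE) into a dict."""
    out = {}
    for f in fields[11:]:  # optional fields start at column 12
        tag, typ, val = f.split(":", 2)
        out[tag] = int(val) if typ == "i" else val
    return out

def as_by_read(sam_lines):
    """Map read name -> list of AS values over all of its alignments."""
    scores = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        scores[fields[0]].append(tags(fields)["AS"])
    return scores

def tied_reads(sam_lines):
    """Names of reads with several alignments that all share one AS."""
    return sorted(
        name for name, s in as_by_read(sam_lines).items()
        if len(s) > 1 and len(set(s)) == 1
    )

def rec(name, flag, pos, nh, score):
    # Hypothetical single-end record on 3R; SEQ and QUAL omitted as "*".
    return f"{name}\t{flag}\t3R\t{pos}\t3\t50M\t*\t0\t0\t*\t*\tNH:i:{nh}\tAS:i:{score}"

example = [
    rec("readA", 0, 100, 2, 98), rec("readA", 272, 900, 2, 98),  # tie
    rec("readB", 0, 120, 2, 98), rec("readB", 256, 920, 2, 90),  # no tie
    rec("readC", 0, 140, 1, 98),                                 # unique
]
print(tied_reads(example))
```

For paired-end data the grouping key would need to include the mate, since both mates share one read name.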


2) When displaying just the primary ones (using IGV's built-in filter), it looks as follows:


Obviously it's not random. Is this a bug or am I doing something wrong?
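
The imbalance can be put into numbers by counting primary alignments per gene copy. Below is a small Python sketch under stated assumptions: the interval boundaries for the two copies are made up for illustration (only the overall region comes from the post above), and count_primaries is a hypothetical helper, not part of any tool.

```python
def count_primaries(sam_lines, intervals):
    """Count primary alignments whose POS falls inside each interval.

    intervals: dict name -> (chrom, start, end), 1-based, inclusive.
    """
    counts = {name: 0 for name in intervals}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.split("\t")
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & (0x4 | 0x100 | 0x800):
            continue  # skip unmapped, secondary, and supplementary records
        for name, (c, start, end) in intervals.items():
            if chrom == c and start <= pos <= end:
                counts[name] += 1
    return counts

def rec(name, flag, pos):
    # Hypothetical two-hit record with equal alignment score.
    return f"{name}\t{flag}\t3R\t{pos}\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2\tAS:i:98"

# Assumed coordinates of the two gene copies (illustrative only).
intervals = {"copy1": ("3R", 7779700, 7782000),
             "copy2": ("3R", 7783000, 7786000)}

example = [
    rec("r1", 0, 7780000), rec("r1", 272, 7784000),
    rec("r2", 0, 7780100), rec("r2", 272, 7784100),
    rec("r3", 16, 7784200), rec("r3", 256, 7780200),
    rec("r4", 0, 7780300), rec("r4", 272, 7784300),
]
counts = count_primaries(example, intervals)
print(counts)
```

With a uniformly random choice among equal-scoring alignments, the two counts should be roughly equal on real data; a strong skew towards one copy would match what the IGV view suggests.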


In contrast, when using tophat's -g 1 option to apply the same filtering scheme, I get a well-balanced picture with proper random selection of the primary alignment in case of equal mapping scores:




Best regards,

Holger

Alexander Dobin

Sep 9, 2015, 4:46:46 PM
to rna-star, shen.ya...@gmail.com
Hi Holger,

The selection of the primary alignment is not truly random; I have to correct this statement in the manual.
The next release, due within a few days, will make it truly random and will also allow output of a fixed number of alignments for multimappers.
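
For readers wondering what "truly random" selection means here, the following Python sketch shows uniform random tie-breaking among the best-scoring alignments of one read. It is purely illustrative and says nothing about how STAR implements it; pick_primary, the dict layout, and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter

def pick_primary(alignments, rng):
    """Return the index of the primary alignment.

    Chooses uniformly at random among the alignments that share
    the best (highest) AS value.
    """
    best = max(a["AS"] for a in alignments)
    ties = [i for i, a in enumerate(alignments) if a["AS"] == best]
    return rng.choice(ties)

# Hypothetical read with two equally good hits and one worse hit.
alns = [
    {"locus": "copy1", "AS": 98},
    {"locus": "copy2", "AS": 98},
    {"locus": "elsewhere", "AS": 90},
]

rng = random.Random(0)  # fixed seed so repeated runs give the same picks
picks = Counter(alns[pick_primary(alns, rng)]["locus"] for _ in range(10000))
print(picks)
```

Over many reads, each tied locus should be chosen about equally often, and a lower-scoring alignment is never chosen. Passing a seeded generator keeps the choice reproducible between runs.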

Cheers
Alex

Yang Shen

Sep 9, 2015, 9:53:05 PM
to Alexander Dobin, rna-star
Hi Alex,

This is great news. Thanks so much.

Best,

Yang