Hi, we encountered a somewhat strange output from featureCounts with regard to the strand-specific options. Although the RNA-seq data were generated with a strand-specific protocol, a preliminary TopHat run had not indicated a strand-specific alignment, so I would assume that the strand information in the resulting BAM files is random. However, when featureCounts was run on Ensembl v75 using the Illumina stranded flag (`-s 2`), we actually got MORE reads assigned to genes than when leaving out this flag. I would have assumed that at least half of the reads would not be assigned to genes, given that the strand information of the reads is random while the strand in the annotation is not. So, for example, if there were 100 reads aligned to a gene, I would have expected ~50 reads on the plus strand and ~50 reads on the minus strand on average, and then only ~50 would be assigned to a gene on, say, the plus strand. Is the strand information ignored if the annotation is not ambiguous at a given locus?
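To make the expectation above concrete, here is a minimal sketch of the arithmetic I have in mind. The function name `expected_assigned` and its parameter are made up for illustration and have nothing to do with featureCounts internals; the only assumption is that each read independently lands on the gene's strand with a fixed probability.

```python
def expected_assigned(n_reads, frac_on_gene_strand=0.5):
    """Expected number of reads kept by a strand-aware count.

    Hypothetical helper: assumes every read overlapping the gene
    matches the gene's strand with probability `frac_on_gene_strand`
    (0.5 if the strand in the BAM file were truly random).
    """
    if not 0.0 <= frac_on_gene_strand <= 1.0:
        raise ValueError("frac_on_gene_strand must be between 0 and 1")
    return n_reads * frac_on_gene_strand


# The example from the text: 100 reads over a plus-strand gene,
# strand random, so about half should survive a stranded count.
print(expected_assigned(100))
```

If the strand in the BAM file were informative after all, `frac_on_gene_strand` would be close to 1 rather than 0.5, and almost no reads would be lost.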
### with strand information
[ajaffe@compute-085 Fibroblasts]$ featureCounts -A /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ensembl/chrAliases_GRCh37_to_hg19.csv -a /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ensembl/Homo_sapiens.GRCh37.75.gtf -o tmp_revstrand.counts -s 2 -p TopHat/2328-Dura/accepted_hits.bam
//================================= Running ==================================\\
|| ||
|| 25 chromosome name aliases are loaded. ||
|| Load annotation file /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ens ... ||
|| Number of features is 1306656 ||
|| Number of meta-features is 63677 ||
|| Number of chromosomes is 265 ||
|| ||
|| Process BAM file TopHat/2328-Dura/accepted_hits.bam... ||
|| Assign fragments (read pairs) to features... ||
|| Each fragment is counted once. ||
|| Found reads that are not properly paired. ||
|| (missing mate or the mate is not the next read) ||
|| 1830975 reads have missing mates. ||
|| Input was converted to a format accepted by featureCounts. ||
|| Total number of fragments is : 14393505 ||
|| Number of successfully assigned fragments is : 12708887 (88.3%) ||
|| Running time : 4.71 minutes ||
|| ||
|| Read assignment finished. ||
|| ||
### without strand information
[ajaffe@compute-085 Fibroblasts]$ featureCounts -A /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ensembl/chrAliases_GRCh37_to_hg19.csv -a /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ensembl/Homo_sapiens.GRCh37.75.gtf -o tmp_unstrand.counts -p TopHat/2328-Dura/accepted_hits.bam
//================================= Running ==================================\\
|| ||
|| 25 chromosome name aliases are loaded. ||
|| Load annotation file /dcs01/lieber/ajaffe/Brain/DLPFC_PolyA/Counts/Ens ... ||
|| Number of features is 1306656 ||
|| Number of meta-features is 63677 ||
|| Number of chromosomes is 265 ||
|| ||
|| Process BAM file TopHat/2328-Dura/accepted_hits.bam... ||
|| Assign fragments (read pairs) to features... ||
|| Each fragment is counted once. ||
|| Found reads that are not properly paired. ||
|| (missing mate or the mate is not the next read) ||
|| 1830975 reads have missing mates. ||
|| Input was converted to a format accepted by featureCounts. ||
|| Total number of fragments is : 14393505 ||
|| Number of successfully assigned fragments is : 12379037 (86.0%) ||
|| Running time : 4.64 minutes ||
|| ||
|| Read assignment finished. ||
|| ||
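For reference, the two summaries above can be compared directly. This is only a sketch that plugs in the fragment counts printed in the logs; the 50% figure is my own naive random-strand assumption, not something featureCounts reports.

```python
# Fragment counts copied from the two featureCounts logs above.
total_fragments = 14393505
assigned_stranded = 12708887    # run with -s 2
assigned_unstranded = 12379037  # run without -s


def percent(n, total):
    """Percentage of `total`, rounded to one decimal place."""
    return round(100.0 * n / total, 1)


# How many more fragments the stranded run assigned.
extra = assigned_stranded - assigned_unstranded

# Naive expectation (assumption): with random strand, roughly half of
# the fragments assigned in the unstranded run would survive -s 2.
naive_expected = 0.5 * assigned_unstranded

print("stranded:  ", percent(assigned_stranded, total_fragments), "%")
print("unstranded:", percent(assigned_unstranded, total_fragments), "%")
print("extra fragments with -s 2:", extra)
print("naive random-strand expectation:", naive_expected)
```

So the stranded run assigns about 330 thousand more fragments than the unstranded run, instead of the roughly 6.2 million I would have expected under the random-strand assumption.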