Thanks! I am 95% certain I am confused, but don't understand why I'm confused.
{
"read_files": "[ /dev/fd/63, /dev/fd/62]",
"expected_format": "ISR",
"compatible_fragment_ratio": 1.0,
"num_compatible_fragments": 2473144,
"num_assigned_fragments": 2473144,
"num_frags_with_consistent_mappings": 1678556,
"num_frags_with_inconsistent_or_orphan_mappings": 794588,
"MSF": 0,
"OSF": 0,
"ISF": 0,
"MSR": 0,
"OSR": 0,
"ISR": 1678556,
"SF": 348831,
"SR": 445757,
"MU": 0,
"OU": 0,
"IU": 0,
"U": 0
}
So these are definitely fr-firststrand or RF reads.
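For what it's worth, here is a rough sanity check of that reading, as a minimal Python sketch. The counts are copied from the JSON report above; the function name `summarize_lib_format` and the keys of the summary it returns are made up for illustration and are not part of any tool.

```python
import json

# Counts copied from the library-format JSON report above (subset of keys).
REPORT = json.loads("""
{
  "expected_format": "ISR",
  "num_assigned_fragments": 2473144,
  "num_frags_with_consistent_mappings": 1678556,
  "num_frags_with_inconsistent_or_orphan_mappings": 794588,
  "ISF": 0,
  "ISR": 1678556,
  "SF": 348831,
  "SR": 445757
}
""")


def summarize_lib_format(counts):
    """Hypothetical helper: summarize strand evidence from the counts."""
    paired = counts["ISF"] + counts["ISR"]   # inward-facing pairs, either strand
    orphans = counts["SF"] + counts["SR"]    # fragments counted as single-end
    return {
        # Share of inward-facing pairs that are ISR (None if there are no pairs).
        "isr_fraction_of_paired": counts["ISR"] / paired if paired else None,
        # Share of all assigned fragments consistent with the expected format.
        "consistent_fraction": (
            counts["num_frags_with_consistent_mappings"]
            / counts["num_assigned_fragments"]
        ),
        # Do SF + SR account for every inconsistent/orphan fragment?
        "orphans_match_inconsistent": (
            orphans == counts["num_frags_with_inconsistent_or_orphan_mappings"]
        ),
    }


summary = summarize_lib_format(REPORT)
for key, value in summary.items():
    print(key, value)
```

All of the properly paired fragments fall in the ISR bucket, and the SF and SR counts together account for every fragment reported as inconsistent or orphaned.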
I did run unstranded, sorry for not including it! Here is the blurb for that on a subset of the reads:
RNA-Seq QC report
-----------------------------------
>>>>>>> Input
bam file = align/small.bam
gff file = /projects/ngs/reference/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf
counting algorithm = proportional
protocol = non-strand-specific
>>>>>>> Reads alignment
reads aligned (left/right) = 710,176 / 690,655
read pairs aligned = 673,232
total alignments = 1,748,874
secondary alignments = 348,043
non-unique alignments = 0
aligned to genes = 781,605
ambiguous alignments = 88,959
no feature assigned = 448,909
not aligned = 106,419
SSP estimation (fwd/rev) = 0.97 / 0.03
So it's calling these fwd reads, but I aligned with hisat2 like this:
/projects/ngs/local/software/bb5CD48373/tools/bin/../../anaconda/envs/python2/bin/hisat2-align-s --wrapper basic-0 --new-summary -x /projects/ngs/reference/genomes/Hsapiens/hg38/hisat2/hg38 -p 16 --phred33 --rna-strandness RF --rg-id PrCa_09_PROSTATE_N_1_1 --rg PL:illumina --rg PU:PrCa_09_PROSTATE_N_1_1 --rg SM:PrCa_09_PROSTATE_N_1_1 --known-splicesite-infile /projects/ngs/reference/genomes/Hsapiens/hg38/rnaseq/ref-transcripts-splicesites.txt --novel-splicesite-outfile /projects/ngs/oncology/dev/Dev_1663_RNASeq_Bakeoff_2019/bcbio/work/align/PrCa_09_PROSTATE_N_1_1/PrCa_09_PROSTATE_N_1_1-novelsplicesites.bed -1 /tmp/29067.inpipe1 -2 /tmp/29067.inpipe2
The --rna-strandness RF flag is set there. On my slimmed-down file, here are the alignments:
1855293 + 0 in total (QC-passed reads + QC-failed reads)
348043 + 0 secondary
0 + 0 supplementary
571978 + 0 duplicates
1748874 + 0 mapped (94.26% : N/A)
1507250 + 0 paired in sequencing
753625 + 0 read1
753625 + 0 read2
1346464 + 0 properly paired (89.33% : N/A)
1356310 + 0 with itself and mate mapped
44521 + 0 singletons (2.95% : N/A)
1698 + 0 with mate mapped to a different chr
1214 + 0 with mate mapped to a different chr (mapQ>=5)
Do you spot where I am not understanding?
Best,
Rory