I used subjunc from Subread 1.4.3-p1 to align RNA-seq reads to the UCSC hg19 human reference genome. The alignment appears to have completed successfully.
However, I am not sure whether the result is reliable, because I used both paired and unpaired reads for the same sample. The original data are from an Illumina paired-end 101 bp run, but I cleaned and filtered the reads with Trimmomatic, which produced both paired and unpaired reads. Both sets were then passed to subjunc.
Example usage:
subjunc -T 16 --gzFASTQinput -i hg19_index_for_Subread -r trimmed_paired_1.fastq.gz trimmed_unpaired_1.fastq.gz -R trimmed_paired_2.fastq.gz trimmed_unpaired_2.fastq.gz -o out -u -H --BAMoutput
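In case it helps anyone check the input counts independently, here is a minimal Python sketch that counts the records in each gzipped FASTQ file passed to subjunc above. It assumes standard four-line FASTQ records (no wrapped sequence lines). The helper name count_fastq_reads is made up for illustration and is not part of Subread or Trimmomatic.

```python
import gzip
import os


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# File names as used in the subjunc command above; missing files are skipped.
input_files = [
    "trimmed_paired_1.fastq.gz",
    "trimmed_unpaired_1.fastq.gz",
    "trimmed_paired_2.fastq.gz",
    "trimmed_unpaired_2.fastq.gz",
]

for name in input_files:
    if os.path.exists(name):
        print(name, count_fastq_reads(name))
```

The two paired files should report the same number of reads, since Trimmomatic keeps mates together in its paired output.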
Can someone confirm whether subjunc uses both the paired and the unpaired reads when both are provided, or whether it discards the unpaired reads? I cannot find information on this in the Subread manual or in online forums.
Further, the summary that Subread prints after alignment uses the term 'fragment' (as in 'mapped fragments'), and I cannot reconcile the number of fragments with the number of reads. For example, for one sample the input had 9598056 read pairs (2 x 9598056 reads in total) plus a total of 3402944 unpaired reads (as per Tophat), but Subread's message stated that the input had 7781687 fragments (of which 95.7% could be mapped). In this context (both paired and unpaired data in the input), what does 'fragment' mean?
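To make the mismatch concrete, the sketch below computes the totals one would expect under three possible definitions of 'fragment'. These definitions are my own guesses, not taken from the Subread documentation, and the sketch assumes that 3402944 is the total across both unpaired files. None of the three totals equals the reported 7781687.

```python
# Counts quoted in the question above.
read_pairs = 9_598_056          # pairs in the paired files (one read per mate file)
unpaired_reads = 3_402_944      # assumed: total over both unpaired files
reported_fragments = 7_781_687  # from Subread's summary message

# Hypothetical definitions of "fragment" (guesses, not Subread's documented one).
candidates = {
    "read pairs only": read_pairs,
    "read pairs + unpaired reads": read_pairs + unpaired_reads,
    "every read counted separately": 2 * read_pairs + unpaired_reads,
}

for label, total in candidates.items():
    matches = total == reported_fragments
    print(f"{label}: {total} (matches reported count: {matches})")
```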
Thanks.