I have 75 bp single-end Drosophila RNA-seq reads. The FastQC report suggests the reads look good. But when I align the reads to the dm3 reference genome, only 39.16% of them map uniquely. A very high percentage of the reads (55.91%) are reported as unmapped because they were "too short". The alignment summary is below.
Started job on | May 03 12:44:09
Started mapping on | May 03 12:49:20
Finished on | May 03 15:09:26
Mapping speed, Million of reads per hour | 4.35
Number of input reads | 10156563
Average input read length | 75
UNIQUE READS:
Uniquely mapped reads number | 3977709
Uniquely mapped reads % | 39.16%
Average mapped length | 72.43
Number of splices: Total | 104631
Number of splices: Annotated (sjdb) | 0
Number of splices: GT/AG | 100227
Number of splices: GC/AG | 848
Number of splices: AT/AC | 299
Number of splices: Non-canonical | 3257
Mismatch rate per base, % | 1.02%
Deletion rate per base | 0.02%
Deletion average length | 1.15
Insertion rate per base | 0.00%
Insertion average length | 1.87
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 461916
% of reads mapped to multiple loci | 4.55%
Number of reads mapped to too many loci | 26928
% of reads mapped to too many loci | 0.27%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 55.91%
% of reads unmapped: other | 0.11%
I am not sure how to fix this issue. Any helpful advice would be much appreciated.
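For reference, here is a quick sanity check on the numbers above: a minimal Python sketch that parses the `key | value` lines of the summary, confirms that the mapped and unmapped percentages account for all input reads, and estimates how many reads fall into the "too short" category. The function names `parse_alignment_summary` and `percent` are hypothetical helpers written for this post, not part of any aligner, and the parsing assumes the summary keeps exactly the `key | value` layout shown above.

```python
def parse_alignment_summary(text):
    """Parse 'key | value' lines into a dict; lines without '|' are skipped."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as 'UNIQUE READS:'
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Return a percentage field such as '39.16%' as a float."""
    return float(stats[key].rstrip("%"))


# Subset of the summary pasted above (values copied verbatim).
SUMMARY = """\
Number of input reads | 10156563
Uniquely mapped reads % | 39.16%
% of reads mapped to multiple loci | 4.55%
% of reads mapped to too many loci | 0.27%
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 55.91%
% of reads unmapped: other | 0.11%
"""

stats = parse_alignment_summary(SUMMARY)

# Every input read should land in exactly one of these categories.
categories = [
    "Uniquely mapped reads %",
    "% of reads mapped to multiple loci",
    "% of reads mapped to too many loci",
    "% of reads unmapped: too many mismatches",
    "% of reads unmapped: too short",
    "% of reads unmapped: other",
]
total = sum(percent(stats, key) for key in categories)

n_input = int(stats["Number of input reads"])
# Approximate count, since the percentage is rounded to two decimals.
too_short = round(n_input * percent(stats, "% of reads unmapped: too short") / 100)

print(f"Categories sum to {total:.2f}%")
print(f"Approx. reads unmapped as 'too short': {too_short}")
```

So the categories are consistent with each other, and roughly 5.7 million of the 10.2 million input reads end up in the "too short" category.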