Started job on | Apr 22 23:19:11
Started mapping on | Apr 22 23:19:39
Finished on | Apr 23 00:01:11
Mapping speed, Million of reads per hour | 38.02
Number of input reads | 26318495
Average input read length | 183
UNIQUE READS:
Uniquely mapped reads number | 8665045
Uniquely mapped reads % | 32.92%
Average mapped length | 178.45
Number of splices: Total | 37848
Number of splices: Annotated (sjdb) | 1071
Number of splices: GT/AG | 25908
Number of splices: GC/AG | 1456
Number of splices: AT/AC | 31
Number of splices: Non-canonical | 10453
Mismatch rate per base, % | 0.36%
Deletion rate per base | 0.06%
Deletion average length | 1.15
Insertion rate per base | 0.01%
Insertion average length | 1.18
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 291859
% of reads mapped to multiple loci | 1.11%
Number of reads mapped to too many loci | 45962
% of reads mapped to too many loci | 0.17%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 65.42%
% of reads unmapped: other | 0.37%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
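The counts in the summary above can be cross-checked against the reported percentages. The following is a minimal sketch, assuming the log uses the `label | value` layout shown here; `star_field` is a hypothetical helper name, not part of STAR, and the temporary file only holds lines copied from the summary above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value that follows "label |" in a
# STAR Log.final.out-style summary (layout assumed to be "label | value").
star_field() {
    # $1 = exact label text, $2 = path to the log file
    awk -F'|' -v key="$1" '
        {
            label = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == key) { print value; exit }
        }' "$2"
}

# Demo on three lines copied from the summary above.
log=$(mktemp)
cat > "$log" <<'EOF'
Number of input reads | 26318495
Uniquely mapped reads number | 8665045
% of reads unmapped: too short | 65.42%
EOF

input=$(star_field "Number of input reads" "$log")
unique=$(star_field "Uniquely mapped reads number" "$log")

# Recompute the uniquely mapped percentage from the raw counts.
awk -v u="$unique" -v n="$input" \
    'BEGIN { printf "uniquely mapped: %.2f%%\n", 100 * u / n }'
# prints: uniquely mapped: 32.92%

rm -f "$log"
```

The recomputed value agrees with the 32.92% reported in the log, so the bulk of the remaining reads falls under "unmapped: too short" (65.42%).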
I went through a couple of posts and added the following parameters, which raised the uniquely mapped reads to 65%. How would this affect read counting before I proceed to downstream analysis?
STAR --runThreadN 8 \
  --genomeDir $TRANS_DATA \
  --readFilesIn <(gunzip -c ${names[${SLURM_ARRAY_TASK_ID}]}_R1_001_val_1.fq.gz) <(gunzip -c ${names[${SLURM_ARRAY_TASK_ID}]}_R2_001_val_2.fq.gz) \
  --outFilterScoreMinOverLread 0 \
  --outFilterMatchNminOverLread 0 \
  --outFilterMatchNmin 0 \
  --outFilterMismatchNmax 2 \
  --outSAMtype BAM SortedByCoordinate \
  --outFileNamePrefix ${names[${SLURM_ARRAY_TASK_ID}]}_extraparam \
  --quantMode GeneCounts
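One way to see how the relaxed filters change the counting is to summarize the `ReadsPerGene.out.tab` file that `--quantMode GeneCounts` writes for each run and compare the totals between the default and relaxed settings. The sketch below assumes the usual layout of that file: four `N_` summary rows followed by one row per gene, with column 2 holding the unstranded counts (columns 3 and 4 hold the strand-specific counts, so the right column depends on the library). `summarize_counts` is a hypothetical helper, and the demo file contains made-up toy values, not results from this dataset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the N_* summary rows and the total number of
# reads assigned to genes, using column 2 (unstranded counts) of a
# ReadsPerGene.out.tab-style file. Use column 3 or 4 for stranded libraries.
summarize_counts() {
    # $1 = path to a ReadsPerGene.out.tab file
    awk -F'\t' '
        $1 ~ /^N_/ { printf "%s\t%d\n", $1, $2; next }   # summary rows
        { assigned += $2 }                                # per-gene rows
        END { printf "assigned_to_genes\t%d\n", assigned }' "$1"
}

# Toy input with invented gene IDs and counts, for illustration only.
toy=$(mktemp)
printf 'N_unmapped\t100\t100\t100\n'    >  "$toy"
printf 'N_multimapping\t10\t10\t10\n'   >> "$toy"
printf 'N_noFeature\t20\t40\t25\n'      >> "$toy"
printf 'N_ambiguous\t5\t1\t2\n'         >> "$toy"
printf 'geneA\t30\t2\t28\n'             >> "$toy"
printf 'geneB\t12\t1\t11\n'             >> "$toy"

summarize_counts "$toy"
# last line printed: assigned_to_genes	42

rm -f "$toy"
```

Running the helper on the output of both runs (for example the file written with the `_extraparam` prefix and the one from the default run) shows whether the extra uniquely mapped reads end up assigned to genes or mostly land in `N_noFeature` and `N_ambiguous`. Since setting `--outFilterScoreMinOverLread` and `--outFilterMatchNminOverLread` to 0 lets much shorter alignments pass, that comparison is worth making before trusting the new counts.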