I am attempting to use Subread subjunc and featureCounts with small RNA sequencing data and will appreciate comments on appropriate usage of the programs.
--
My data is single-end 50 b-long reads obtained on the Illumina HiSeq 2500 platform with small RNA sequencing libraries (Illumina TruSeq Small RNA Sample Prep kit). It is trimmed to remove adapter and poor-quality sequences (the size of reads thus ranges from 15-50 b).
My main goal is to quantify microRNA (miR) expression. Mature miRs are ~18-22 b in length. They are generated from ~70-90 b-long precursor miRs (pre-miRs) which in turn are produced from primary miR transcripts (pri-miRs) which are like coding RNA transcripts in terms of length. Mature miRs can differ from each other at just one nucleotide position.
Different mature miRs (i.e., arising from different pri-miRs) can have the exact same sequence and a pri-miR can produce multiple different mature miRs. Finally, a pri-miR can be the product of multiple copies of a gene.
The reads in my data represent not only mature miRs but also
other small RNAs like tRNAs and snoRNAs as well as degradation fragments of coding RNAs (mRNAs), ribosomal RNAs, pre-miRs, pri-miRs, etc., because the biomaterial that were used were formalin-fixed tissues (formalin causes RNA degradation).
So, besides quantifying mature miR expression, I also want to check how many of the reads map to tRNAs, snoRNAs, rRNAs, mRNAs, etc.
--
Considering the above aspects, I am using Subread subjunc to map the reads against the hg19 human genome with:
subjunc -T 16 -i hg19 -r sample.fastq -d 15 -D 50 -B 16 -o sample.bam --BAMoutput
I find that ~70% of the 7-17 million reads of my samples get mapped. I then use featureCounts to obtain counts for miRs using a
GFF file from the miR sequence repository (miRBase):
featureCounts -T 16 -a hsa_mir.gff3 -t miRNA -f -O -M -d 15 -D 50 -o sample_mir_count.txt sample.bam--
I will appreciate if someone can comment on the use of subjunc and featureCounts for small RNA mapping and counting in the manner noted above fashion.
Thanks.