Too few uniquely mapped reads when aligning paired-end data


Sebastián Contreras

Mar 30, 2022, 9:04:21 AM
to rna-star

I am having a problem mapping my paired-end RNA-seq data. When I give both files (read 1 and read 2) as input, I get around 2% uniquely mapped reads and nearly 97% "too short". I ran BLAST on some of the unmapped reads, and they mostly match protein-coding genes. However, when I align each FASTQ file separately, the percentage of uniquely mapped reads rises to 85% in each run.


Could somebody give me some advice on this? Any idea what could be causing the issue, or how to solve it?

The files have been processed and appear to be of good quality (FastQC shows Phred scores above 30 along the reads). Each FASTQ file has around 50 million reads, each 100 bases long.

This is the command I used (STAR 2.5.4b):

 STAR  --genomeDir ../STARindex/ --readFilesIn file_1.fastq file_2.fastq --outFileNamePrefix alignment_STAR/output_file/output_file_ --outFilterMultimapNmax 1 --outReadsUnmapped Fastx --outSAMtype BAM SortedByCoordinate --twopassMode Basic --runThreadN 25


Best Regards,
Sebastian

Alexander Dobin

May 2, 2022, 2:53:42 PM
to rna-star
Hi Sebastian,

Sorry for the delayed reply.
This is likely caused by an inconsistency in the ordering of reads between the read1 and read2 files.
Have you trimmed the reads? If so, you need to make sure that whenever a read is removed from the read1 file, its mate is also removed from the read2 file.
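
One way to check for this kind of ordering problem is to walk both FASTQ files in parallel and compare the read names record by record. The following is only a minimal sketch, not part of STAR: the function names `read_ids` and `first_mismatch` are made up for illustration, and it assumes well-formed 4-line FASTQ records whose mate names are identical apart from an optional trailing `/1` or `/2` and any description after the first space.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of each record, assuming 4 lines per FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:  # only header lines carry the read name
                continue
            fields = line[1:].split()  # drop '@' and any description text
            if not fields:  # tolerate a blank line at the end of the file
                continue
            read_id = fields[0]
            # Assumption: mates may differ only by a trailing /1 or /2.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def first_mismatch(path1, path2):
    """Return (record_index, id1, id2) for the first unpaired record, else None."""
    pairs = zip_longest(read_ids(path1), read_ids(path2))
    for index, (id1, id2) in enumerate(pairs):
        # id1 or id2 is None when one file runs out of records first.
        if id1 != id2:
            return index, id1, id2
    return None
```

With the file names from the command above, `first_mismatch("file_1.fastq", "file_2.fastq")` would return `None` if the two files are in sync, or the 0-based index of the first record where the names diverge. A mismatch early in the files would be consistent with reads having been dropped from only one file during trimming.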