Hi Alex,
I'm comparing sequencing configurations to find the optimal NGS parameters for our samples. I started with 2 samples sequenced paired-end, 150 bp, at a depth of 20M reads, and processed the FASTQ files to simulate the other configurations: single-end (keeping only the first FASTQ file), 10M depth (keeping only the first 10M reads), and 100 bp and 50 bp read lengths (hard-trimming the 3' end of the reads).
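The downsampling and trimming described above could be sketched roughly as follows. This is only an illustration, assuming standard 4-line FASTQ records; the function names, parameters, and file names are hypothetical and not the exact commands that were run.

```python
import gzip
from itertools import islice


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def head_and_trim(in_path, out_path, max_reads=None, keep_bases=None):
    """Keep the first `max_reads` records and the first `keep_bases` bases of each read.

    Assumes every record spans exactly 4 lines (header, sequence, '+', qualities).
    Returns the number of records written.
    """
    written = 0
    with open_fastq(in_path, "rt") as src, open_fastq(out_path, "wt") as dst:
        while max_reads is None or written < max_reads:
            record = list(islice(src, 4))
            if len(record) < 4:
                break  # end of file (or a truncated final record)
            header, seq, plus, qual = (line.rstrip("\n") for line in record)
            if keep_bases is not None:
                # Hard-trim the 3' end: keep only the 5'-most bases and qualities.
                seq, qual = seq[:keep_bases], qual[:keep_bases]
            dst.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
            written += 1
    return written


# Hypothetical usage: single-end, 10M reads, 50 bp, from the first mate file.
# head_and_trim("sample_R1.fastq.gz", "sample_SE_10M_50bp.fastq.gz",
#               max_reads=10_000_000, keep_bases=50)
```

For the paired-end variants, the same `max_reads` and `keep_bases` would be applied to both mate files so that the mates stay in the same order.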
I applied the same processing to all of them: UMI extraction, mapping to the transcriptome with STAR, and quantification with RSEM.
From the results I have obtained, it seems that single-end yields more transcripts than paired-end, and that 50 bp yields more transcripts than 100 or 150 bp. Are all of those extra transcripts false alignments? This only happens in the alignment to the transcriptome, since the Aligned.sortedByCoord.out.bam file is much bigger in the paired-end analysis than in the single-end analysis.
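To make "yields more transcripts" concrete, here is a minimal sketch of one way such a count could be computed from RSEM output. It assumes the tab-separated `isoforms.results` file with its `expected_count` column, and it assumes a transcript counts as detected when `expected_count` is above a threshold; the function name, threshold, and file names are hypothetical.

```python
import csv


def count_detected(results_path, min_count=0.0):
    """Count rows of an RSEM results table whose expected_count exceeds min_count."""
    detected = 0
    with open(results_path, newline="") as handle:
        # RSEM results files are tab-separated with a header line.
        for row in csv.DictReader(handle, delimiter="\t"):
            if float(row["expected_count"]) > min_count:
                detected += 1
    return detected


# Hypothetical usage, comparing two of the configurations:
# print(count_detected("sample_PE_150bp.isoforms.results"))
# print(count_detected("sample_SE_50bp.isoforms.results"))
```

The same function should also work on `genes.results`, which has an `expected_count` column as well, to compare the number of detected genes.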
How does the alignment to the transcriptome work? Is it normal for a sample with fewer genes detected in the genome alignment to have more genes detected in the transcriptome alignment?
Thank you very much in advance.
Best regards,
Lluc