Hello Dr Foley,
I am Lakshmi, a faculty member in the Department of Pathology at Thomas Jefferson University. Our lab is trying to analyze data generated with the Smart-3SEQ protocol described in your publication. However, I am running into problems with the mapping percentage and the final exon/gene counts. It would be great if you could help me with a few questions.
(1) When I use STAR to align my FASTQ files (after read trimming with umi_homopolymer.py), I get an extremely low mapping rate. The best I have gotten so far is about 16%. I have tried changing the parameters to allow more mismatches, since these are short reads.
(2) I also tried aligning to the transcriptome (both the whole transcripts and only the last 250 bp from the 3' end). I am not sure whether this is a good strategy. What do you think?
(3) I am also trying to analyze the data that you used for the publication, to see where I am going wrong. One of the datasets gave me about 18% mapping and 5.6% exon counts. Is this what is expected? I can share what I have and the parameters I used.
We have made many attempts to improve the experimental protocol to get the best results out of it, so I want to make sure that I am not missing anything on the data analysis side. I can share more information about the exact steps if needed.
Any help would be greatly appreciated.
Thank you!
Lakshmi Kuttippurathu
STAR --outFilterMultimapNmax 1 --outFilterMismatchNmax 999 --clip3pAdapterMMp 0.2 --clip3pAdapterSeq AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
This is after creating the STAR reference with "--sjdbOverhang 67", because we typically use 76 nt reads but the first 8 bases are the NNNNNGGG of the 2S primer, so the effective read length is 68 and therefore the overhang length is one less. It seems to work better to let STAR remove the poly(A), so we let umi_homopolymer.py report the trimmed lengths for QC, but it doesn't actually trim the poly(A) from the sequences that go into STAR.
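The overhang arithmetic above can be sketched in shell roughly as follows. This is only an illustration: the file names (genome.fa, genes.gtf, star_index) are placeholders that do not come from the message, and the script prints the index-building command rather than running STAR, so the numbers can be checked before anything is built.

```shell
# Sketch: derive --sjdbOverhang from the read layout described above.
read_length=76      # sequenced read length in nt
primer_length=8     # NNNNNGGG from the 2S primer at the start of each read
effective_length=$((read_length - primer_length))   # 76 - 8 = 68
overhang=$((effective_length - 1))                  # 68 - 1 = 67

# Placeholder paths (assumptions, not taken from the original message).
genome_fasta="genome.fa"
annotation_gtf="genes.gtf"
index_dir="star_index"

# Assemble the command as a string and print it (dry run; nothing is executed).
index_cmd="STAR --runMode genomeGenerate --genomeDir ${index_dir} --genomeFastaFiles ${genome_fasta} --sjdbGTFfile ${annotation_gtf} --sjdbOverhang ${overhang}"
echo "${index_cmd}"
```

With a different read length or primer, only the first two variables would need to change.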
featureCounts -s 1 --read2pos 5
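For anyone trying to reproduce this, the two commands could be filled out with input and output arguments along these lines. In featureCounts, "-s 1" counts reads on the same strand as the feature and "--read2pos 5" reduces each read to its 5'-most base before counting. The paths, the sample prefix, and the choice to leave STAR's output as SAM are assumptions made for illustration, and the script only prints the commands instead of executing them.

```shell
# Sketch: print (not run) the alignment and counting commands with
# input/output arguments added. All paths below are placeholders.
index_dir="star_index"
reads_fastq="sample.trimmed.fastq"
prefix="sample_"
annotation_gtf="genes.gtf"
poly_a="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"   # adapter sequence copied from the STAR parameters above

align_cmd="STAR --genomeDir ${index_dir} --readFilesIn ${reads_fastq} --outFileNamePrefix ${prefix} --outFilterMultimapNmax 1 --outFilterMismatchNmax 999 --clip3pAdapterMMp 0.2 --clip3pAdapterSeq ${poly_a}"

# Without --outSAMtype, STAR writes its alignments to <prefix>Aligned.out.sam.
count_cmd="featureCounts -s 1 --read2pos 5 -a ${annotation_gtf} -o ${prefix}counts.txt ${prefix}Aligned.out.sam"

echo "${align_cmd}"
echo "${count_cmd}"
```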