Sequencing of SMART3 results

Justin Wong

Feb 21, 2019, 2:26:57 PM

to Smart-3SEQ

I also wanted to start a discussion on what sequencing conditions folks are using. The updated 1.9 protocol attached to this board makes some recommendations: single-end (SE) rather than paired-end (PE) reads, and no PhiX required, though it can be useful for troubleshooting. We have not run any sequencing yet at MD Anderson, but I am curious what others have done, or looked at, with regard to conditions on the sequencer.

We will most likely be sequencing on a NextSeq 500. Relatedly, are folks mostly using BaseSpace infrastructure for this, or running it as a stand-alone process? Any insight would be appreciated.

Thanks,

~Justin

Joe Foley

Feb 21, 2019, 3:14:49 PM

to smart...@googlegroups.com

For LCM FFPE libraries we always use 76-base single-end reads (the shortest option) on the NextSeq, with a 1% PhiX spike-in, which indeed is only there in case we need to troubleshoot. Since we're counting reads and not bases, the only advantage of longer reads would be slightly greater alignability, but that's a case of diminishing returns in general. For LCM FFPE in particular, the inserts are so short that longer reads are probably just going to give you more adapter sequence. See supplemental figure S30 in the preprint.
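To make the adapter point concrete, here is a minimal sketch of the arithmetic. It assumes the simplest possible model, in which every base read past the end of the insert is adapter, and it ignores any non-insert bases at the start of the read. The function name and the insert lengths are hypothetical and chosen only for illustration.

```python
def adapter_bases(read_length: int, insert_length: int) -> int:
    """Number of bases in a read that run past the insert into adapter.

    Simplified model (an assumption): the read starts at the first base of
    the insert, so anything beyond insert_length is adapter sequence.
    """
    return max(0, read_length - insert_length)


# Hypothetical insert lengths, not measurements from any real library.
for insert_length in (40, 76, 150):
    for read_length in (76, 150):
        wasted = adapter_bases(read_length, insert_length)
        print(f"insert={insert_length:>3}  read={read_length:>3}  adapter={wasted:>3}")
```

Under this model a short insert gains nothing from a longer read: the extra cycles are spent entirely on adapter.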

We never do paired-end reads: at best they would be a waste of money, and at worst they might cause the whole run to fail. Read 2 would start with 30 T's, which will prevent getting any usable data from the rest of the read and may even cause the sequencer to lose focus, which could prevent the clusters from getting index reads if those are performed afterward.
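If you do end up with Read 2 data and want to see how much of it is affected, a rough check along these lines might help. This is only a sketch: the function names, the threshold of 20, and the example sequences are all made up for illustration and are not part of the SMART-3SEQ pipeline.

```python
def leading_t_run(read: str) -> int:
    """Count how many consecutive T's a read starts with."""
    count = 0
    for base in read.upper():
        if base != "T":
            break
        count += 1
    return count


def fraction_poly_t(reads, min_run: int = 20) -> float:
    """Fraction of reads that begin with at least min_run T's.

    min_run=20 is an arbitrary illustrative threshold, not a recommendation.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if leading_t_run(read) >= min_run)
    return hits / len(reads)


# Hypothetical sequences, not real data.
example_reads = [
    "T" * 30 + "ACGTACGT",  # starts with a long poly-T run
    "ACGTTTTTACGT",         # T's are present but not at the start
    "T" * 12 + "GGCA",      # short run, below the threshold
]
print(fraction_poly_t(example_reads))
```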

LCM FFPE libraries give poor sequencing QC metrics because of short inserts, and especially molecules with no insert that escaped size selection, but a High Output flow cell for the NextSeq still yields a comfortable amount of data for 96 human samples.
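For a back-of-the-envelope sense of what that means per sample, here is a small sketch. The figure of 400 million reads is an assumption based on the nominal maximum for a NextSeq High Output flow cell, not a number from our runs, and `usable_fraction` is a hypothetical knob for discounting reads lost to poor QC. Real yields from LCM FFPE libraries will be lower than the nominal maximum.

```python
def reads_per_sample(
    total_reads: int,
    n_samples: int,
    phix_fraction: float = 0.01,
    usable_fraction: float = 1.0,
) -> int:
    """Estimate reads per sample, assuming perfectly even pooling.

    phix_fraction is the share of reads spent on the PhiX spike-in;
    usable_fraction (hypothetical) discounts reads lost to QC problems.
    """
    library_reads = total_reads * (1 - phix_fraction) * usable_fraction
    return round(library_reads / n_samples)


# Assumed nominal flow cell output; check the specification for your instrument.
NOMINAL_HIGH_OUTPUT_READS = 400_000_000

print(reads_per_sample(NOMINAL_HIGH_OUTPUT_READS, 96))
print(reads_per_sample(NOMINAL_HIGH_OUTPUT_READS, 96, usable_fraction=0.5))
```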

We don't use BaseSpace because entering metadata is frustrating and the analysis pipeline is all set up to run offline, so all we do is generate FASTQs and we don't need BaseSpace for that. Is there a demand for adapting the pipeline to run on BaseSpace?
