Can not sync read pairs, while explicit MID sequences are being used! (not supported, sorry)

handle...@gmail.com

Mar 24, 2016, 3:13:18 PM
to LotuS rRNA pipeline
Hi!

Really enjoying the LotuS rRNA pipeline and the options available in sdm. Hoping to incorporate it into our overall workflow for 16S amplicon analysis.

I currently get everything to work other than the read pairing. When I run the workflow on MiSeq data, I receive the following error:

"Can not sync read pairs, while explicit MID sequences are being used! (not supported, sorry)"

We are currently using the primers/strategy from Caporaso et al. (http://www.ncbi.nlm.nih.gov/pubmed/20534432?dopt=Abstract). This generates paired-end reads along with a separate index read.

Here is the sdm log output. Of note, this is only for a subset of samples from the full run, so the mapping file only has barcodes for a small portion of the samples, hence the large number of rejections. However, after trying a number of settings and reading through the documentation, I am unsure how to fix the issue where 0 reads are accepted for pair 2.

I assume this has to do with the error in the title of this post. Any suggestions on how to resolve? I have also included the first read from each file below in case it helps to look at the FASTQ file headers.

Many thanks in advance! Great work here!


----------------------------------------------------------------
Setting to Sanger fastq version (q offset = 33).

sdm 1.26 beta
Input File:  /mnt/data2/shandley/HIV/presti/oral/may2015-16s_S1_L001_R1_001.fastq.gz,/mnt/data2/shandley/HIV/presti/oral/may2015-16s_S1_L001_R2_001.fastq.gz
Output File: /mnt/data2/shandley/HIV/presti/oral/lotus_slout//tmpFiles//demulti.1.fna
Statistics of high quality reads

Reads processed: 17,206,582; 17,209,908 (pair 1;pair 2)
Rejected: 16,900,058; 17,209,908
Accepted: 306,524; 0 (0; 0 were 5'-trimmed)
Singletons among these: 306,524; 0
Bad Reads recovered with dereplication: 1,278
Short amplicon mode.
Looked for switched read pairs (3,326 detected)
Min/Avg/Max stats Pair 1
     - Seq Length : 170/170/170
     - Quality :   29/34.5028/38
     - Median Seq Length : 170, Quality : 35
     - Accum. Error 0.166821
Filtered due to:
  < min Seq length (170)  :                288,597; 0
       -after Quality trimming :           624,884; 0
  < avg Quality (27)  :                    1; 0
  < window (50 nt) avg. Quality (25)  :    1; 0
  > max Seq length (1000)  :               0; 0
  > (8) homo-nt run  :                     99; 0
  > (0) amb. Bases  :                      0; 0
  > (0.75) acc. errors :                   0; 0
  > (2.5) binomial est. errors :           89,072; 0
  -Barcode unidentified (max 0 errors) :   66,604,260; 17,209,908 (0 pairs failed)
    -reversed all barcodes

-------------------------------------------------------------------

Read 1
----------
@M00175:254:000000000-AEBBN:1:1101:15697:1331 1:N:0:1
TNCGTAGGTGGCGAGCGTTATCCGGAATGATTGGGCGTAAAGGGTGCGCAGGCGGCCCTGCAAGTCTGGAGTGAAACGCATGAGCTCAACTCATGCATGGCTTTGGAAACTGGAGGACTGGAGAGCAGGAGAGGGCGGTGGAACCCCATGTGTAGCGGTAAAATGCGCAGATATATGGAAGAATACCAGTGGTGAAGGCGGCCGCCTGGCCTGCTGCTGACGCCGAGGCACGAAAGCGTGGGGAGCAAAA
+
1#>>>1>>111BA1EFCFG0GFHF??/BFGGHB0A0EE///DBAAAAA/AECFCG/>>>F/0DF1BBG0EFC012EFC/<EC11<>BFCGB1B>BFD2GA00FGFBGDFBD1C0?/?FB//0<1<.<0>..CCC.A@CCCC.0;..9.;/;;000;-9.C.90000..---//;:///;/B-//99BF//9//-9///9---;->@;---:B--/9////---------9----9;/-----99--9B/-

Read 2
----------
@M00175:254:000000000-AEBBN:1:1101:15697:1331 2:N:0:1
NCTATTTGCTCCCCACGCTTTCGTGCCTCAGCGTCAGTAACAGGCCAGTCGTCCGCCTTCGCCACCGGTGTTCTTCAATATATCTACGCATCTCACCGCTACACATCGAGTTCCACCGCCCTCTCCTGCACTCAAGTCCTCCAGTTTCCAAAGCCATGCATGAGTTGAGCTCAACCGTTTCCCTCCAGACTTCCACGACCCCCCACGCACCCTTTACGCCCCATAATTCCGGACAACTACTCACACTCAC
+
#>>>AAF5DFFFGGGG?EECCGE4FFHBAC33AB2EA55555B2FFC13A11B1AABEEG1EE?11>@/EE1B4BB434B@@44B441>EA/?BBFG/>>/<3B3?3/BC1/1?2B0/<//<FGFC<110?1?011<<?@FC01>11<11>10>0<11<<=0<0==00/<0<00/:::.//0<0;.G/0;B000;0.-;--.---.-9-9-../99;.9;--.:/;;//;/-.-;..//9///9/;9/:9

Index Read
---------------
@M00175:254:000000000-AEBBN:1:1101:15697:1331 1:N:0:1
NAGAGATGTCGAA
+
#1111>@1DF11A
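
The three headers above share the same read ID (the text before the first space), which is what a paired-end run with a separate index read is expected to look like. A minimal Python sketch for checking that this holds for every record could look like the following. It is not part of LotuS or sdm, the function names are hypothetical, and it assumes strict four-line FASTQ records.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_ids(handle):
    """Yield the read ID of each record: header text before the first
    space, without the leading '@'. Assumes 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:  # header line of each 4-line FASTQ record
            yield line.rstrip("\n").split(" ", 1)[0][1:]


def first_mismatch(*handles):
    """Return (record_index, ids) for the first record whose read IDs
    differ across the given FASTQ handles, or None if all agree.
    A file that ends early shows up as None in the ids tuple."""
    streams = [read_ids(h) for h in handles]
    for n, ids in enumerate(zip_longest(*streams)):
        if len(set(ids)) != 1:
            return n, ids
    return None
```

Calling `first_mismatch(open_fastq(r1), open_fastq(r2), open_fastq(idx))` with the paths to the two read files and the index file would return `None` if the three files are in step, or the zero-based index of the first record where they diverge.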

Falk Hildebrand

Mar 28, 2016, 6:16:09 AM
to LotuS rRNA pipeline
Hey Scott,
thanks, I am glad you can use LotuS. I added the missing parts to the latest version of LotuS; please see the attached Linux binary. It still has to undergo some testing and will be in the official LotuS release in a few days, but I would be glad to get your feedback on whether this sdm (version 1.27) fixes the reported problems. Just replace the sdm in your lotus folder with this new one.

All the best & happy Easter,
Falk
[Attachment: sdm]

Scott Handley

Mar 28, 2016, 4:31:51 PM
to LotuS rRNA pipeline
Many thanks Falk! Everything seems to be working now. I have run several iterations with no problems.

Thanks!

Scott
