#split libraries error


Manil

May 31, 2017, 7:12:32 AM
to Qiime 1 Forum

Hello everyone!

I am trying to run the split libraries script (multiple_split_libraries_fastq.py -i kristel/joined_reads/ -o SEQ/), but I am getting this error message:

Stderr
Traceback (most recent call last):
  File "/usr/local/bin/split_libraries_fastq.py", line 365, in <module>
    main()
  File "/usr/local/bin/split_libraries_fastq.py", line 344, in main
    for fasta_header, sequence, quality, seq_id in seq_generator:
  File "/usr/local/lib/python2.7/dist-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode
    phred_offset=phred_offset):
  File "/usr/local/lib/python2.7/dist-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file
    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):
  File "/usr/local/lib/python2.7/dist-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq
    seqid)
skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: MISEQ-M2024:79:000000000-B439K:1:1119:10248:4830 1:N:0:GGAATACGTCTTTCCC 341F-785R-sample-39(dummy-primer)-Unknown Rescued: 341F-785R-sample-39-341F-785R-sample-39 (NM) (341F)-(785R) NS-NS MM-MM. This may be because you passed an incorrect value for phred_offset.


Could you please help me with this issue?


Manil

Embriette

May 31, 2017, 10:01:32 AM
to Qiime 1 Forum

Manil

May 31, 2017, 10:26:33 AM
to Qiime 1 Forum
Hi Embriette,

I had already looked at the links before I posted my question. For example, I tried:

multiple_split_libraries_fastq.py -i kristel/joined_reads/ -o kristel/SEQ --demultiplexing_method sampleid_by_file --include_input_dir_path --remove_filepath_in_name --phred_offset 33

and I got this:

Error in multiple_split_libraries_fastq.py: no such option: --phred_offset
 

Embriette

May 31, 2017, 10:29:38 AM
to Qiime 1 Forum
Hi Manil,

Your answer was buried in one of the previous threads:

"`multiple_split_libraries_fastq.py` has a parameter to accept in things like `phred_offset` through a parameter file.

-p, --parameter_fp
Path to the parameter file, which specifies changes to the default behavior of split_libraries_fastq.py. See http://www.qiime.org/documentation/file_formats.html#qiime-parameters [default: split_libraries_fastq.py defaults will be used]"

So, you'll need to make a parameters file indicating the phred offset for this to work with multiple_split_libraries_fastq.py.
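
As a rough illustration of what such a parameters file might contain, here is a minimal sketch in standalone Python 3 (not part of QIIME). It assumes the QIIME 1 parameters-file convention of `script_name:option value`, with the script name written without `.py`. The helper name `write_parameter_file`, the file name `params.txt`, and the offset value 33 are placeholders to adapt to your own data.

```python
from pathlib import Path


def write_parameter_file(path, phred_offset=33):
    """Write a one-line QIIME 1 parameters file for split_libraries_fastq.

    The offset value is an assumption; use the one that matches your reads.
    """
    # Format: <script name without .py>:<option name> <value>
    line = "split_libraries_fastq:phred_offset {}\n".format(phred_offset)
    Path(path).write_text(line)
    return line
```

Calling `write_parameter_file("params.txt")` would create the file, which could then be passed to multiple_split_libraries_fastq.py with `-p params.txt`.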

Embriette

Manil

Jun 2, 2017, 4:28:31 AM
to Qiime 1 Forum
Hi Embriette,

I still have the same problem. I created the parameter file (attached); could you please check it for me? Here is my command:

multiple_split_libraries_fastq.py -i joined_reads/ -o kristel/SEQ --demultiplexing_method sampleid_by_file --include_input_dir_path --remove_filepath_in_name -p joined_reads/param_file


I get the same error message again:

Stderr

Traceback (most recent call last):
  File "/usr/local/bin/split_libraries_fastq.py", line 365, in <module>
    main()
  File "/usr/local/bin/split_libraries_fastq.py", line 344, in main
    for fasta_header, sequence, quality, seq_id in seq_generator:
  File "/usr/local/lib/python2.7/dist-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode
    phred_offset=phred_offset):
  File "/usr/local/lib/python2.7/dist-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file
    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):
  File "/usr/local/lib/python2.7/dist-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq
    seqid)
skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: MISEQ-M2024:79:000000000-B439K:1:1119:10248:4830 1:N:0:GGAATACGTCTTTCCC 341F-785R-sample-39(dummy-primer)-Unknown Rescued: 341F-785R-sample-39-341F-785R-sample-39 (NM) (341F)-(785R) NS-NS MM-MM. This may be because you passed an incorrect value for phred_offset.

I also got an output folder containing seqs.fna.incomplete and a log file (attached).


Thank you

Manil

param_file
log_20170602003234.txt

Manil

Jun 2, 2017, 5:04:34 AM
to Qiime 1 Forum
It is working :) The problem was the .py in the script name inside the parameter file; once I removed it, the command ran.
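
For anyone hitting the same issue, the fix can be sketched as a small standalone Python 3 helper that strips a trailing `.py` from the script name in a parameter line. The function name `strip_py_suffix` and the sample line are hypothetical, since the contents of the attached param_file are not shown in this thread.

```python
def strip_py_suffix(line):
    """Return a parameters-file line with '.py' removed from the script name."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or ":" not in stripped:
        return line  # blank line, comment, or not a script:option entry
    script, rest = line.split(":", 1)
    if script.strip().endswith(".py"):
        script = script.strip()[:-3]  # drop the three characters of ".py"
    return script + ":" + rest


# Hypothetical "before" line; the real param_file may have looked different.
print(strip_py_suffix("split_libraries_fastq.py:phred_offset 33"))
# split_libraries_fastq:phred_offset 33
```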


Cheers !

Manil 

Embriette

Jun 2, 2017, 10:09:49 AM
to Qiime 1 Forum
Great! Glad you got it working!

Nilusha Malmuthuge

Jun 5, 2017, 3:24:49 PM
to Qiime 1 Forum
Hi Embriette,

I have the same problem.

I have 16S sequences (27F-519R) obtained from a MiSeq 250 bp run. After joining the paired ends with join_paired_ends.py, I ran split_libraries_fastq.py, but I got the following error.

split_libraries_fastq.py -i ~/Documents/URTmicrobiota/sequence/6B/first_seq.fastq -o ~/Documents/URTmicrobiota/sequence/6B/split_library_output -m ~/Documents/URTmicrobiota/sequence/map_B.txt --barcode_type 'not-barcoded' --sample_id 6B_ -r 1 -q 19

Traceback (most recent call last):
  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>
    main()
  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main
    for fasta_header, sequence, quality, seq_id in seq_generator:
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode
    phred_offset=phred_offset):
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file
    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):
  File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq
    seqid)
skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GGGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCTTAACACATGCAAGTCGAACGGTAGCAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGCTTGGGAATCTGGCTTATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCGTAATCTCTACGGAGTAAAGGGTGGGACCTTTTGGCCACCTGCCATAAGATGAGCCCAAGTGGGATTAGGTAGTTGGTGAGGTAAAGGCTCACCAAGCCGACGATCGCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGGGGAACCCTGATGCAGCCATGCCGCGTGAATGAAGAAGGCCGTCGGGGTGTAAAGTTCTTTCGGTGATGAGGAAGGAGTGAAGTTTAATAGACTTCATTATTGACGTTAGTCACAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATTC. This may be because you passed an incorrect value for phred_offset.


After going through the forum, I set --phred_offset to 33 and still got the same error (command and traceback below). I have attached my mapping file and part of the joined sequences. Could you please look into this as well?
Many thanks in advance.

split_libraries_fastq.py -i ~/Documents/URTmicrobiota/sequence/6B/first_seq.fastq -o ~/Documents/URTmicrobiota/sequence/6B/split_library_output -m ~/Documents/URTmicrobiota/sequence/map_B.txt --barcode_type 'not-barcoded' --sample_id 6B_ -r 1 -q 19 --phred_offset 33

Traceback (most recent call last):
  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>
    main()
  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main
    for fasta_header, sequence, quality, seq_id in seq_generator:
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode
    phred_offset=phred_offset):
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file
    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):
  File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq
    seqid)
skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GGGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCTTAACACATGCAAGTCGAACGGTAGCAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGCTTGGGAATCTGGCTTATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCGTAATCTCTACGGAGTAAAGGGTGGGACCTTTTGGCCACCTGCCATAAGATGAGCCCAAGTGGGATTAGGTAGTTGGTGAGGTAAAGGCTCACCAAGCCGACGATCGCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGGGGAACCCTGATGCAGCCATGCCGCGTGAATGAAGAAGGCCGTCGGGGTGTAAAGTTCTTTCGGTGATGAGGAAGGAGTGAAGTTTAATAGACTTCATTATTGACGTTAGTCACAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATTC. This may be because you passed an incorrect value for phred_offset.

map_B.txt
first_seq.fastq

Embriette

Jun 6, 2017, 12:11:02 PM
to Qiime 1 Forum
Hi Nilusha,

I tried running split_libraries_fastq.py on your example files with the phred offset set to 33 and to 64, and both runs failed.
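
When both offsets fail, it can help to check which offset the quality strings are actually consistent with. The following is a heuristic sketch in standalone Python 3, not QIIME's own detection logic. `guess_phred_offset` is a hypothetical helper, and it assumes you pass it only quality lines (every fourth line of a well-formed FASTQ file).

```python
def guess_phred_offset(quality_lines):
    """Guess the Phred offset (33 or 64) from FASTQ quality strings.

    Returns None when the characters do not settle the question.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest = min(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur with a 64-based offset.
        return 33
    if lowest >= 64:
        # Consistent with Phred+64, although very high-quality
        # Phred+33 data could look the same.
        return 64
    return None  # ASCII 59-63 only: ambiguous


# A fragment of a quality string posted later in this thread.
print(guess_phred_offset(["GGGE)GGGF?>DCGGC<AFFF*A:FFCF?FGGF5;"]))
```

If the guess agrees with the offset you already passed, the offset is probably not the real cause, and the structure of the file itself is worth inspecting.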

In this thread, where someone had a similar problem, removing the "problem read" solved the issue. You can try that approach as well; if you still get the error, I will ask a few other people for their input so we can help you solve this!
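
To locate a problem read without scrolling through the file, something like the following standalone Python 3 sketch could be used. It assumes plain four-line FASTQ records (no wrapped sequences) and only checks structure, not quality values. `find_malformed_records` is a hypothetical helper, not a QIIME function.

```python
import io


def find_malformed_records(handle):
    """Return (line_number, reason) pairs for records that look broken.

    Assumes each record occupies exactly four lines:
    @header, sequence, + separator, quality.
    """
    lines = [line.rstrip("\r\n") for line in handle]
    problems = []
    for start in range(0, len(lines), 4):
        record = lines[start:start + 4]
        line_number = start + 1  # 1-based line of the record's first line
        if len(record) < 4:
            problems.append((line_number, "truncated record"))
            break
        header, sequence, separator, quality = record
        if not header.startswith("@"):
            problems.append((line_number, "header does not start with '@'"))
        elif not separator.startswith("+"):
            problems.append((line_number, "third line does not start with '+'"))
        elif len(sequence) != len(quality):
            problems.append((line_number, "sequence and quality lengths differ"))
    return problems


# Tiny in-memory example: the second record has an extra '+' after its header.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\n+\nACGT\n+\nIIII\n")
print(find_malformed_records(example)[0])
```

Once one record is misaligned, every later record is reported too, so the first entry in the returned list is the one to look at.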

Thanks!

Embriette

Nilusha Malmuthuge

Jun 13, 2017, 12:26:41 PM
to Qiime 1 Forum
Thanks for looking into it, Embriette. I went through my join.fastq files and found an additional + line in every record after the 835th sequence. It looks like it was added by join_paired_ends.py, because I can't find it in my original sequences. I thought of removing it manually, but I have ~90k reads in each file.
Is this something that join_paired_ends.py does? I am using MacQIIME 1.9.
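
Rather than editing ~90k reads by hand, the extra line could be dropped with a short script. This is a minimal standalone Python 3 sketch, assuming the stray line is exactly `+` and always sits directly after the `@` header, as in the excerpt below. `drop_stray_plus_lines` and `repair_file` are hypothetical names, and the sketch does not explain why the joining step produced the line in the first place.

```python
def drop_stray_plus_lines(lines):
    """Drop a lone '+' line that appears where the sequence line should be."""
    cleaned = []
    position = 0  # 0 = header, 1 = sequence, 2 = separator, 3 = quality
    for line in lines:
        if position == 1 and line.rstrip("\r\n") == "+":
            continue  # stray '+' directly after the header: skip it
        cleaned.append(line)
        position = (position + 1) % 4
    return cleaned


def repair_file(src_path, dst_path):
    """Write a repaired copy of src_path to dst_path; the original is untouched."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(drop_stray_plus_lines(src))
```

Tracking the position inside each record, instead of matching on the first character, avoids mistaking a quality line that happens to start with `@` or `+` for a header or separator. Writing to a new file keeps the original available for comparison.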

Here are the join.fastq sequences I mentioned:
@M00833:558:000000000-B5H6B:1:2106:14051:10005 1:N:0:TGCTACATCA
+
AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTGCTCCTGGATTCAGCGGCGGACGGGTGAGTAATGCTTAGGAATCTGCCTATTAGTGGGGGACAACAGTTGGAAACGACTGCTAATACCGCATACGCCCTACGGGGGAAAGGAGGGGATCTTCGGACCTTTCGCTAATAGATGAGCCTAAGTCAGATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCTGTAGCGGGTCTGAGAGGATGATCCGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGCCTTTTGGTTGTAAAGCACTTTAAGCGAGGAGGAGGCTCTTCTAGTTAATACCTAGGATGAGTGGACGTTACTCGCAGAATAAGCACCGGCTAACTCTGTGCCAGCAGCCGCGGTAATAC
+
CCCCCGGGGGGFGGFGFGGGGGGGDCFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDEGGGGGGGGGGGEGCEGGGFFEGGGGGGGGGGDDEGGGGGGGCGGGGGGGGGGGGDFGEEGGGGGGGGFCFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGE)GGGF?>DCGGC<AFFF*A:FFCF?FGGF5;?FFEFGFC;;@FDFDFDFGGGGGFFFE7ED;5FGFCCGAFDD9>FAFGFFGGGGGGFCFFGGGGGGF8DFFGGGGCFGGGF@FCEGFEGFGGFCGGGGGGFGGGGGDGGDGGFFGGGGGGGGGECGGGGGGGGGFGGEFFECDFFGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGFCGGGGGGEGGFCGGGGGFFE8GGGGGGFGGGFFGGGGGGGCCCCC
@M00833:558:000000000-B5H6B:1:2106:19337:10018 1:N:0:TGCTACATCA
+
AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTGCTCCTGGATTCAGCGGCGGACGGGTGAGTAATGCTTAGGAATCTGCCTATTAGTGGGGGACAACAGTTGGAAACGACTGCTAATACCGCATACGCCCTACGGGGGAAAGGAGGGGATCTTCGGACCTTTCGCTAATAGATGAGCCTAAGTCAGATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCTGTAGCGGGTCTGAGAGGATGATCCGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGCCTTTTGGTTGTAAAGCACTTTAAGCGAGGAGGAGGCTCTTCTAGTTAATACCTAGGATGAGTGGACGTTACTCGCAGAATAAGCACCGGCTAACTCTGTGCCAGCCGCCGCGGTAATTC
+
CCCCCGGGGGGGGFFGFGGGGCGGGGGGGDGGFEGGGCGGGGGGGGGGGGGGGGDFGDGGGGGGDGGGGGFEGGFGGGGGDCGEGGGGGG@FGGGGGGGGGGGGGGGGGFFGGGGGG?BFGGF=DCG9<<DGGGGGGGGGGG;FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCCE5C56>EDGE>FFGFCEEEGGCFGGEG58FCFGGFCFGGGG6EGGG?9@CCFGG:<:FFGGGGGDGG:CGGGGG*C>GGGCCFDG=EEFFGC9979E=>D8GFFFFD;>DECGGCFGF9??>ECA;3+++(8D>C>C8ECCC;8B*<)FB2GGFDGFFGGGDE??GFD9F?FGF@2,GFCGGGGGGGF>GGGGFFDFF@FGGCC@C:@C@E9GFGGGGGGGGGGGGFGGGGGGAGFFCFEGGFGGGGGGGGGGGEFFFE8GGFDAGGGFGDGGGGGGGGGGGGGDGGGGGFFGGCGGGGGGGFFFFEGGEGFAACGGFFGGFFFEECC@FCFGGGGGCCCCC
@M00833:558:000000000-B5H6B:1:2106:17296:10023 1:N:0:TGCTACATCA
+
And here are my original sequences:
R1
@M00833:558:000000000-B5H6B:1:2106:14051:10005 1:N:0:TGCTACATCA
AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTGCTCCTGGATTCAGCGGCGGACGGGTGAGTAATGCTTAGGAATCTGCCTATTAGTGGGGGACAACAGTTGGAAACGACTGCTAATACCGCATACGCCCTACGGGGGAAAGGAGGGGATCTTCGGACCTTTCGCTAATAGATGAGCCTAAGTCAGATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCTGTAGCGGGTCTGAGAGGATGATCCGCCA
+
CCCCCGGGGGGFGGFGFGGGGGGGDCFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDEGGGGGGGGGGGEGCEGGGFFEGGGGGGGGGGDDEGGGGGGGCGGGGGGGGGGGGDFGEEGGGGGGGGFCFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGECGGFF?:CCGGC5AFFF*95,0CF?FG@F5:

R2
@M00833:558:000000000-B5H6B:1:2106:14051:10005 2:N:0:TGCTACATCA
GTATTACCGCGGCTGCTGGCACAGAGTTAGCCGGTGCTTATTCTGCGAGTAACGTCCACTCATCCTAGGTATTAACTAGAAGAGCCTCCTCCTCGCTTAAAGTGCTTTACAACCAAAAGGCCTTCTTCACACACGCGGCATGGCTGGATCAGGGTTGCCCCCATTGTCCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGGCCGTGTCTCAGTCCCAGTGTGGCGGATCATCCTCTCAGACCCGCTACAGCTCGTCGCCTTGGTAGGCCTTTACCCCACCAACTAGCTAATCTGAC
+
CCCCCGGGGGGGFFGGGFGGGGGG8EFFGGGGGCFGGEGGGGGGCFGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGFFDCEFFEGGFGGGGGGGGGCEGGGGGGGGGFFGGDGGDGGGGGFGGGGGGCFGGFGEFGECF@FGGGFCGGGGFFD8FGGGGGGFFCFGGGGGGFFGFAF>9DDFAGCCFGF5;DE7EFFFGGGGGFDFDFDF@;;CFGFEFF?;+;GE>)=>FF:A).0:*<+0A<D>1DGF0).8),452?G*>C797:?C8?)6CF;3:CE<B6<F)14)*6.56*1

Thanks again 
Nilu