Is it a bug? seqs.fna.incomplete (FastqParseError: Failed qual conversion for seq id)


Olena P

Feb 24, 2016, 6:29:24 PM
to Qiime 1 Forum
Dear All,

I am working with an established pipeline. Until today everything was fine. However, a problem has appeared when running split_libraries_fastq.py (for MANY fastqjoin.join.fastq files).

It gives me this error: FastqParseError: Failed qual conversion for seq id, and creates a seqs.fna.incomplete file, leaving the histogram.txt file empty too.

It does not work with fastqjoin.join.fastq files created by multiple_join_paired_ends.py (?), while 2-3 repetitions of joining the read ends (join_paired_ends.py) finally give a fastqjoin.join.fastq file that can be used in the split_libraries_fastq.py run.

This takes a lot of time, and it looks like a problem with QIIME (a bug?).

System information

==================

         Platform: darwin

   Python version: 2.7.10 |Anaconda 2.2.0 (x86_64)| (default, May 28 2015, 17:04:42)  [GCC 4.2.1 (Apple Inc. build 5577)]

Python executable: /macqiime/anaconda/bin/python


QIIME default reference information

===================================

For details on what files are used as QIIME's default references, see here:

 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2


Dependency versions

===================

          QIIME library version: 1.9.1

           QIIME script version: 1.9.1

qiime-default-reference version: 0.1.2

                  NumPy version: 1.9.2

                  SciPy version: 0.16.0

                 pandas version: 0.16.2

             matplotlib version: 1.4.3

            biom-format version: 2.1.4

                   h5py version: 2.4.0 (HDF5 version: 1.8.14)

                   qcli version: 0.1.1

                   pyqi version: 0.3.2

             scikit-bio version: 0.2.3

                 PyNAST version: 1.2.2

                Emperor version: 0.9.51

                burrito version: 0.9.1

       burrito-fillings version: 0.1.1

              sortmerna version: SortMeRNA version 2.0, 29/11/2014

              sumaclust version: SUMACLUST Version 1.0.00

                  swarm version: Swarm 1.2.19 [Jun  2 2015 14:40:16]

                          gdata: Installed.


QIIME config values

===================

For definitions of these settings and to learn how to configure QIIME, see here:

 http://qiime.org/install/qiime_config.html

 http://qiime.org/tutorials/parallel_qiime.html


                     blastmat_dir: None

      pick_otus_reference_seqs_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta

                         sc_queue: all.q

      topiaryexplorer_project_dir: None

     pynast_template_alignment_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta

                  cluster_jobs_fp: start_parallel_jobs.py

pynast_template_alignment_blastdb: None

assign_taxonomy_reference_seqs_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta

                     torque_queue: friendlyq

                    jobs_to_start: 1

                       slurm_time: None

            denoiser_min_per_core: 50

assign_taxonomy_id_to_taxonomy_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt

                         temp_dir: /tmp/

                     slurm_memory: None

                      slurm_queue: None

                      blastall_fp: blastall

                 seconds_to_sleep: 60


QIIME base install test results

===============================

.........

----------------------------------------------------------------------

Ran 9 tests in 0.030s


OK


(working after several repetitions with join_paired_ends.py)


$ join_paired_ends.py -f 03331202_lib84791_4374_1_1.fastq -r 03331202_lib84791_4374_1_2.fastq -o joined_03331202


$ split_libraries_fastq.py -i joined_03331202/fastqjoin.join.fastq -m 03331202.txt -o splitlib_03331202 --barcode_type 'not-barcoded' --sample_ids 03331202 


(not working after multiple_join_paired_ends.py)!


The same bash session, the same day:


$ split_libraries_fastq.py -i joined_03331202/fastqjoin.join.fastq -m 03331202.txt -o splitlib_03331202R --barcode_type 'not-barcoded' --sample_ids 03331202


Traceback (most recent call last):

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

    main()

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

    for fasta_header, sequence, quality, seq_id in seq_generator:

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

    phred_offset=phred_offset):

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

  File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

    seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id:




What is wrong? It is absolutely the same information and the same fastq files...


Could you please help me with an explanation and a suggestion? (It takes a lot of time to repeat samples one by one...)


Thanks in advance!
/Olena

Jamie Morton

Feb 24, 2016, 7:16:32 PM
to Qiime 1 Forum
Hi Olena,

Could you post the first few lines of your fastq file?

Jamie

Olena P

Feb 25, 2016, 2:10:30 AM
to Qiime 1 Forum
Hi Jamie,

Here are the first 2 lines of the fastq (read 1), fastq (read 2) and fastqjoin.join.fastq files.

I have also attached the log files from when it works (R-) and when it does not (RR-), creating seqs.fna and seqs.fna.incomplete files, respectively, from the same data!

first 2 lines:

$ head -n2 03331202_lib84791_4374_1_1.fastq

@HISEQ:400:HHYYTBCXX:1:1101:1883:2243 1:N:0:TAATGCGCTAATCTTA

GGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAACACATGCAAGTCGAACGGAGAATTTTATTTCGGTAGAATTCTTAGTGGCGAACGGGTGAGTAACGCGTAGGCAACCTACCCTTTAGACGGGGACAACATTCCGAAAGGAGTGCTAATACCGGATGTGATCATCTTGCCGCATGGCAGGACGAAGAAAGATGGCCTCTACAAGTAAGCTATCGCTAAAGGATGGGCCTGCGTCTGATTAGCTAGTTGGTAGTGTAACGGACTACCAAGGCGATGATCAGTAGCCGGTC

 

$ head -n2 03331202_lib84791_4374_1_2.fastq

@HISEQ:400:HHYYTBCXX:1:1101:1883:2243 2:N:0:TAATGCGCTAATCTTA

ATTACCGCGGCTGCTGGCACGTAGTTAGCCGTGGCTTCCTCGTTTACTACCGTCATTGCAATGCAATGTTCACACACTGCACGTTCGTCATAAACAACAGAGCTTTACAGACCGAAATCCTTCATCACTCACGCGGCGTTGCTCCGTCAGACTTTCGTCCATTGCGGAAGATTCCCCACTGCTGCCTCCCGTAGGAGTTTGGGCCGTGTCTCAGTCCAAATGTGGCCGTTCATCCTCTCCGACCGGCTACTCATCAGCCCCTTGGTAGTCCGTTACACTACCATCTCGCTATTCCGACCCA

 

$ head -n2 joined_03331202R/fastqjoin.join.fastq

@HISEQ:400:HHYYTBCXX:1:1101:3751:2232 1:N:0:TAATGCGCTAATCTTA

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCTACAGGCTTAACACATGCAAGTCGAGGGGCAGCATGGTCTTAGCTTGCTAAGGCTGATGGCGACCGGCGCACGGGTGAGTAACACGTATCCAACCTGCCGTCTACTCTTGGCCAGCCTTCTGAAAGGAAGATTAATCCAGGATGGGATCATGAGTTCACATGTCCGTATGATTAAAGGTATTTTCCGGTAGACGATGGGGATGCGTTCCATTAGATAGTAGGCGGGGTAACGGCCCACCTAGTCAACGATGGATAGGGGTTCTGAGAGGAAGGTCCCCCACATTGGAACTGAGACACGGTCCAAACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGGCGATGGCCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATAAAGGAATAAAGTCGGGTATGCATACCCGTTTGCATGTACTTTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT 

Looking forward to understanding why this is happening!

Regards,
Olena  
 

seqs.fna.incomplete
split_library_log_RR.txt
split_library_log_R.txt

Jamie Morton

Feb 25, 2016, 2:03:03 PM
to Qiime 1 Forum
Hi Olena,

Can you try to pass --phred_offset 33 to see if that resolves the issue?
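
As background to this suggestion: the offset can often be narrowed down from the quality characters themselves, because any character below ASCII 59 (';') cannot occur in a Phred+64 file. Below is a minimal sketch of that check, assuming Python 3. The helper name guess_phred_offset is hypothetical and is not part of QIIME or scikit-bio.

```python
def guess_phred_offset(quality_lines):
    """Return 33 if the quality strings can only be Phred+33, else None.

    Heuristic only: characters below ';' (ASCII 59) are not valid in
    Phred+64 encodings, so seeing one points to an offset of 33.
    When no such character is present the offset stays undetermined.
    """
    lowest = min(
        (ord(ch) for line in quality_lines for ch in line.strip()),
        default=None,
    )
    if lowest is None:
        return None  # no quality characters were supplied
    if lowest < 59:
        return 33
    return None  # could be either encoding; more reads are needed
```

Feeding it the quality lines (every fourth line) of a well-formed FASTQ file is enough; a result of None only means the sample did not settle the question.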

Jamie

Olena P

Feb 25, 2016, 3:36:27 PM
to Qiime 1 Forum
Hi Jamie!
Thank you for the suggestion! This was the first thing I tried... No, it is not working:

split_libraries_fastq.py -i joined_02972108RR/fastqjoin.join.fastq -m 02972108.txt -o splitlib_02972108RR --sample_ids 02972108 --barcode_type 'not-barcoded' --phred_offset '33'  

Traceback (most recent call last):

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

    main()

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

    for fasta_header, sequence, quality, seq_id in seq_generator:

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

    phred_offset=phred_offset):

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

  File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

    seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.


I have also tried phred_offset 64; it is not working either!

Maybe I have to do something with my sequencing files?


Thank you for trying to help me,


/Olena


TonyWalters

Feb 25, 2016, 8:34:06 PM
to Qiime 1 Forum
I'm not sure offhand, but there is something amiss with the fastq file format: the error should be listing the fastq sequence label line, but it's listing a sequence, as if the label line wasn't written:


skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.

Something perhaps went awry during the join_paired_ends.py step. Maybe use that sequence to grep for the lines before it to see what the label looks like? E.g.:
grep -B 5 "GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT" joined_02972108RR/fastqjoin.join.fastq

I'd suggest using that approach to see what happened to the label line in the fastq file. It may be that there is a bug in the stitching software that caused it to skip writing a label. Maybe go back to the original unstitched reads and see what the label looks like there, as that might be a problem too. I'm not sure of the best way to fix this though if that's the case; maybe just remove that read?

Olena P

Feb 26, 2016, 3:50:57 AM
to Qiime 1 Forum
Hi Tony!
Thank you for your suggestion!

I tried your suggestion:

grep -B 5 "GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT" joined_02972108RR/fastqjoin.join.fastq > test

Test:  AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT
+
DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD
@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT
+
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT

What do I do now? As I understand it, this does not replace my fastqjoin.join.fastq file automatically?

Regards,
Olena

TonyWalters

Feb 26, 2016, 11:32:26 AM
to Qiime 1 Forum
Hello Olena,

Something does seem to be awry with that sequence, in particular, the one with this label:
@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

The fastq format is supposed to look like this example:
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

but for that label above, it goes label, +, then sequence.
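
To make the expected layout concrete, here is a minimal sketch of a check that walks a FASTQ file four lines at a time and reports the first record that breaks the label / sequence / + / quality order. It assumes Python 3, a file with no blank lines, and exactly four lines per record. The function name first_malformed_record is hypothetical and is not a QIIME or scikit-bio API.

```python
from itertools import islice


def first_malformed_record(lines):
    """Return (record_number, label, reason) for the first bad record, or None.

    Only the first problem is reported, because one extra or missing line
    shifts every later record out of frame.
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    number = 0
    while True:
        record = list(islice(stripped, 4))
        if not record:
            return None  # reached the end without finding a problem
        number += 1
        if len(record) < 4:
            return number, record[0], "truncated record"
        label, seq, plus, qual = record
        if not label.startswith("@"):
            return number, label, "label does not start with '@'"
        if not plus.startswith("+"):
            return number, label, "third line does not start with '+'"
        if len(seq) != len(qual):
            return number, label, "sequence and quality lengths differ"
```

It could be run on an open file handle, for example first_malformed_record(open("joined_02972108RR/fastqjoin.join.fastq")). A record written as label, +, sequence would be reported with the reason "third line does not start with '+'".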

I don't know how it got messed up, and unfortunately, there are no automatic fixes for this sort of thing. Now that we know the offending label (hopefully the only one), we can check the original R1/R2 files (the ones used for join_paired_ends.py) and see what the reads look like there. Can you run the following grep commands and post the results?

grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq
(I want to see how the lines after this offending label look too)

grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" X
grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" Y

where X and Y are the R1 and R2 fastq files that were used as input for join_paired_ends.py.

Olena P

Feb 26, 2016, 3:32:02 PM
to Qiime 1 Forum
Hi Tony,

Here are the results from the 3 commands you asked me to run. I hope you will see the problem.

$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq

+

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

+

AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT

+

CDDDDGHIIHIEHIIIIIIHIHIIHIIIIIHIIIGHIIIIHHIIIIIIIIIIIIIIIIIHG@HIHHHHGGHIIIIIIIIGIIIHIIEF@GHHHIIIIIIIICHCFHHIIHHIIIHIIIIHGGEHIIIGHHIIIGHHCHDH<EEHIIIIFIIHIGHF=GDEHHGGGFDFHIIHIEHHHHFGEHHIHIFHHFEEECHHH?EE@E?G..BC<EHEHHHHAG.@HH-BFB@AHHHHICGHHFEDHHHIIGHH@@GHIF@E@EHGB-BGDHDHFHFE@#@-CH=H,5@#H-IHCGHEDHI@-?FHFF-C>+>?EGHGFAB.BG?HFFHEEHDHHHF-EGIIHGCFHEFDGEGF?EFHEHHGHGEHHGHHHHHHHIIIIIHHGIHIIHGGCDGD-IIIIHDHHIIIHF11FIIIIIIIIIHGC/IIIHIHIHHIIIHFDEHEHHHHEEG@FHIHHHHCHHGHFEIIIHHGEHHCHE@DDEIIHIHIIIIIIIHIHCGHECGIGIHHECCIGHIIIIIIIIIIIIIIIHIIIIIIIDCDDD



$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" 02972108_lib84810_4374_1_1.fastq 

@HISEQ:400:HHYYTBCXX:1:1101:13393:100095 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTA

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIEHGHHHHIEIIIGHIIHIIEHCC@EEH5CHCCHHEH?FHCH@

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGG

+

CDDDDGHIIHIEHIIIIIIHIHIIHIIIIIHIIIGHIIIIHHIIIIIIIIIIIIIIIIIHG@HIHHHHGGHIIIIIIIIGIIIHIIEF@GHHHIIIIIIIICHCFHHIIHHIIIHIIIIHGGEHIIIGHHIIIGHHCHDH<EEHIIIIFIIHIGHF=GDEHHGGGFDFHIIHIEHHHHFGEHHIHIFHHFEEECHHH?EE@E?G..BC<EHEHHHHAG.@HH-BFB@AHHHHICGHHFEDHHHIIGHH@@GHIF@E@EHGB-BGDHDHFHFE@-@-CH=H,5@HH--6--5>DHI@--8FE

@HISEQ:400:HHYYTBCXX:1:1101:14633:100112 1:N:0:CGGCTATGCAGGACGT


$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" 02972108_lib84810_4374_1_2.fastq 

@HISEQ:400:HHYYTBCXX:1:1101:13393:100095 2:N:0:CGGCTATGCAGGACGT

ATTACCGCGGCTGCTGGCACGGAGTTAGCCGATCCTTATTCGTACGATACCTTCAGACAGATACACGTATCTGCGTTTACCCTCGTACAAAAGCAGTTTACAACTCATAGAGCCGTCATCCTGCACGCGGCATGGCTGGTTCAGACTTGCGTCCATTGACCAATATTCCTCACTGCTGCCTCCCGTAGGAGTCTGGTCCGTGTCTCAGTACCAGTGTGGGGGGTTAACCTCTCAGTCCCCCTATGTATCGTCGCCTTGGTGAGCCGTTACCCCACCAACTAGCTAATACACCGCAGGCCCC

+

DDDDDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHIHHIIIIDHIIIIIII1DHHIHHHIHIIIIHIIIIHIIIIIIIHIHIIIIIIIIIIIHIIIIGHIIIIIIIIHIHIHHIIIIICHIIIIHCFDHIHIIIDHIIIICHIGFHIIIHHIIHIIHIIIGHGHIIIGGHHGIIHGIIHIGHHHHGBDEHIIHHIII-8@8@-@GEHIII--FHIHHH+5,-8@-6@?F@FG-@HHDGICE@-6-65-+>@G-@G@H-8@GHGHHHAHHHHH-@GEH####################

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 2:N:0:CGGCTATGCAGGACGT

ATTACCGCGGCTGCTGGCACGTAGTTAGCCGTCCCTTTCTGGTAAGCTACCGTCACAGTGTGAACTTTCCACTCTCACACTCGTTCTTGACTTACAACAGAGCTTTACGATCCGAACACCTTCTTCACTCACGCGGCGTTGCTCGGTCAGGGTTGCCCCCATTGCCGAAGATTCCCTACTGCTGCCTCCCGTAGGAGTCTGGGCCGTGTCTCAGTCCCAGTGTGGCCGATCACCCTCTCAGGTCGGCTATTTATCGTCGCATAGGTGAGCCTTTACCTCACCTACTCGCTAATACAACCCA

+

DDDCDIIIIIIIHIIIIIIIIIIIIIIIHGICCEHHIGIGCEHGCHIHIIIIIIIHIHIIEDD@EHCHHEGHHIIIEFHGHHCHHHHIHF@GEEHHHHEHEDFHIIIHHIHIHIII/CGHIIIIIIIIIF11FHIIIHHDHIIII-DGDCGGHIIHIGHHIIIIIHHHHHHHGHHEGHGHHEHFE?FGEGDFEHFCGHIIGE-FHHHDHEEHFFH?GB.BAFGHGE?>+>C-FFHF?--AG?EHGCHI#####################################################

@HISEQ:400:HHYYTBCXX:1:1101:14633:100112 2:N:0:CGGCTATGCAGGACGT


Really appreciate your help!


/Olena

TonyWalters

Feb 27, 2016, 6:46:03 PM
to Qiime 1 Forum
Okay Olena, thanks for bearing with us.

After testing join_paired_ends.py locally with fastq files created from the sequences you just posted, I was able to replicate the problem of the unwanted, added + lines. Here is the specific issue from your joined data:


grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq

+    <------------ THIS SHOULDN'T BE HERE

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD
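
For the specific pattern shown above (a lone + written between the label and the sequence), a workaround can be sketched in a few lines. This is only an illustration under the assumption that the stray + is the sole corruption in the file; it is not a QIIME feature and not a substitute for a fixed fastq-join. It assumes Python 3, and the name strip_stray_plus is hypothetical.

```python
def strip_stray_plus(lines):
    """Yield (label, seq, plus, qual) records, skipping a lone '+' that
    appears directly after a label line.

    A real sequence line can never be exactly '+', so the test is safe
    for well-formed records, which pass through unchanged.
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    for label in stripped:
        seq = next(stripped, None)
        if seq == "+":  # stray separator where the sequence should be
            seq = next(stripped, None)
        plus = next(stripped, None)
        qual = next(stripped, None)
        if qual is None:
            raise ValueError("truncated record after %r" % label)
        yield label, seq, plus, qual
```

Each yielded tuple can be written back out as four lines. It would be worth checking afterwards that every sequence and quality string have the same length before passing the result to split_libraries_fastq.py.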



I found that the problem happens when running the underlying software (fastq-join) directly. It seems we need to update the fastq-join software, which is part of the ea-utils package. Since you are on a Mac, I would install the brew software, following the instructions here:
http://brew.sh/

Then run:

brew update

Finally, run:
brew install ea-utils

Hopefully this will install the 1.1.2-806 version of the ea-utils software. We're going to need the path it is installed to, so that the system can find the fastq-join software. Can you copy the output that comes after running brew install ea-utils?

Olena P

Feb 28, 2016, 5:35:44 AM
to Qiime 1 Forum
Hi Tony!
I cannot install it by myself since I do not have administrator rights. I have already contacted our institutional service desk.

I will come back as soon as I have more information.

 Thank you for your help,

/Olena

Olena P

Mar 1, 2016, 3:30:19 AM
to Qiime 1 Forum
Dear Tony,

Since my work computer is serviced and supported by my home institution, they do not recommend installing "brew".
Are there any other ways to solve my problem?

When do you think a new update of the fastq-join software will be released?

Can I modify my sequencing data somehow to make it suitable for the current QIIME version?

I got more information from the company that produced the sequencing data.
They think that the number of reads was too high...

Can one run split_libraries first and then join the paired ends in QIIME instead, without changing the final result?


Regards
/olena

TonyWalters

Mar 1, 2016, 7:12:50 AM
to Qiime 1 Forum
Olena, if you have someone locally who has a decent amount of experience, you could have them try to compile the software from the source code: https://code.google.com/archive/p/ea-utils/source/default/source
It's going to take some work though, as they did not set up the install scripts to work well on a Mac (hence the suggestion for usage of brew).

Another option would be to bypass the join_paired_ends.py step, and just use one of the reads as input for split_libraries_fastq.py rather than the stitched reads; R1 usually has higher quality.

Olena P

Mar 1, 2016, 1:44:17 PM
to Qiime 1 Forum
Dear Tony,
Thank you very much for your suggestion. I will work on it. 

I got help from a bioinformatician who works with other software. He was able to stitch and process the files that failed in QIIME.

He stitched the paired ends with:

  --max-mismatch-density=0.25 --min-overlap=10 --max-overlap=300

and it worked fine.

I am wondering whether the problem with join_paired_ends.py in my situation can be solved by adjusting the joining method?

In QIIME 1.9.1, fastq-join (Erik Aronesty, 2011, ea-utils) is the default.

Can I use SeqPrep instead? Or will it not improve my situation?

I also have one important question to ask:
I got sequencing data from a HiSeq 2500 where some files have up to 550,000 sequence reads (approx. 330,000,000 bases), while the files that I usually run have about 200,000 - 300,000 sequence reads. Do you think this can affect the joining of paired ends, or does it not seem to be the problem?

I know that QIIME was primarily designed for MiSeq data, which usually has a lower number of reads.

Regards,
Olena


TonyWalters

Mar 1, 2016, 1:59:21 PM
to Qiime 1 Forum
Hello,

You could use SeqPrep instead (it won't have the issue of the + characters, as that appears to be something specific to particular versions of the ea-utils fastq-join software), but if the stitched reads you just generated are able to work with split_libraries_fastq.py without the error you were encountering before, I would go with those to be consistent with your prior pipeline for processing reads. The number of reads shouldn't be a factor (it just increases the amount of time/memory needed to process the data), but the shorter read length of HiSeq reads relative to MiSeq could affect the overlap size for the stitching software.

Olena P

Mar 1, 2016, 2:57:37 PM
to Qiime 1 Forum
Thank you for the answer.

I have tried
$ SeqPrep -f 02972108_lib84810_4374_1_1.fastq -r 02972108_lib84810_4374_1_2.fastq -1 output_unpaired_1.fastq -2 output_unpaired_2.fastq -s output_merged.fastq

Processing reads... \

Pairs Processed: 533880

Pairs Merged: 26911

Pairs With Adapters: 129

Pairs Discarded: 19

CPU Time Used (Minutes): 2.848939


split_libraries_fastq.py -i output_merged.fastq -o splitlib_02972108D -m 02972108.txt --sample_ids 02972108 --barcode_type 'not-barcoded'

Traceback (most recent call last):

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

    main()

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

    for fasta_header, sequence, quality, seq_id in seq_generator:

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

    phred_offset=phred_offset):

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 278, in process_fastq_single_end_read_file

    post_casava_v180 = is_casava_v180_or_later(fastq_read_f_line1)

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/parse.py", line 37, in is_casava_v180_or_later

    "Non-header line passed as input. Header must start with '@'."

AssertionError: Non-header line passed as input. Header must start with '@'.


NOT WORKING! It gives seqs.fna.incomplete. What did I do wrong?


$ head -n2 output_merged.fastq 

?DQ?8?.?_??k?????

%??P?;oF?ӝX?


Now I cannot read the output of the head command... How do I check output_merged.fastq?



Regards

/Olena


 

TonyWalters

Mar 1, 2016, 3:10:18 PM
to Qiime 1 Forum
Olena, I think your file is in gzipped format (the output of SeqPrep will be gzipped). Can you try renaming your output_merged.fastq file to output_merged.fastq.gz? This will let QIIME know the file type when reading it in.
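
A file's extension does not say whether it is compressed, but its first two bytes do: gzip data always starts with 0x1f 0x8b. A minimal sketch of that check, assuming Python 3; the helper names is_gzipped and open_fastq are hypothetical and not part of QIIME:

```python
import gzip


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 1f 8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def open_fastq(path):
    """Open a FASTQ file as text, decompressing on the fly when needed."""
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "r")
```

With this, is_gzipped("output_merged.fastq") would be expected to return True for a SeqPrep output file that prints as unreadable characters with head.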

Olena P

Mar 1, 2016, 4:01:55 PM
to Qiime 1 Forum
Dear Tony!

It is working now with the SeqPrep method!!!! And I can run the split_libraries_fastq.py command as well.

$ head -n2 output_merged.fastq

@HISEQ:400:HHYYTBCXX:1:1101:8983:2806 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCGTGGCTAAGACATGCAAGTCGAACGAGAGAATTGCTAGCTTGCTAATAATTCTCTAGTGGCGCACGGGTGAGTAACACGTGAGTAACCTGCCCCCGAGAGCGGGATAGCCCTGGGAAACTGGGATTAATACCGCATAGTATCGAAAGATTAAAGCAGCAATGCGCTTGGGGATGGGCTCGCGGCCTATTAGTTAGTTGGTGAGGTAACGGCTCACCAAGGCGATGCAGGGTAGCCGGTCTGAGAGGATGTCCGGACACACTGGAACTGAGACACGGTCCAGACACCTACGGGTGGCAGCAGTCGAGAATCATTCACAATGGGGGAAACCCTGATGGTGCGACGCCGCGTGGGGGAATGAAGGTCTTCGGATTGTAAACCCCTGTCATGTGGGAGCAAATTAAAAAGATAGTACCACAAGAGGAAGAGACGGCTAACTCTGTGCCAGCAGCCGCGGTAAT


split_libraries_fastq.py -i output_merged.fastq -o splitlib_02972108DD -m 02972108.txt --sample_ids 02972108 --barcode_type 'not-barcoded'


RESULT:
with join_paired_ends.py
Input file paths
Sequence read filepath: joined_02972108_R/fastqjoin.join.fastq (md5: 80d0088a69ae4b3aa8c82cd3380ad1d6)
Quality filter results
Total number of input sequences: 356536
Barcode not in mapping file: 0
Read too short after quality truncation: 1707
Count of N characters exceeds limit: 725
Illumina quality digit = 0: 0
Barcode errors exceed max: 0

Result summary (after quality filtering)
Median sequence length: 515.00
02972108 354104

Total number seqs written 354104
---


with SeqPrep method

Input file paths
Sequence read filepath: output_merged.fastq (md5: 73987c9ab4e0e7f5461a134c2148ee74)
Quality filter results
Total number of input sequences: 26911
Barcode not in mapping file: 0
Read too short after quality truncation: 1328
Count of N characters exceeds limit: 104
Illumina quality digit = 0: 0
Barcode errors exceed max: 0

Result summary (after quality filtering)
Median sequence length: 489.00
02972108 25479

Total number seqs written 25479



The result is not identical! Should I re-run all samples with the same method, or will it not affect the alpha rarefaction?

Thank you very much for your input and all suggestions! 

/Olena

TonyWalters

Mar 1, 2016, 5:06:59 PM
to Qiime 1 Forum
I wouldn't expect the results to be identical with two different algorithms. But you're down about an order of magnitude when using SeqPrep instead of fastq-join.
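
The size of that gap can be worked out from the numbers posted in this thread. This is just arithmetic on the reported counts, written as a short Python 3 sketch; the variable names are illustrative.

```python
# Counts copied from the SeqPrep run and the two split_libraries_fastq.py logs above.
pairs_processed = 533880      # SeqPrep "Pairs Processed"
pairs_merged = 26911          # SeqPrep "Pairs Merged"
fastq_join_written = 354104   # seqs written after fastq-join stitching
seqprep_written = 25479       # seqs written after SeqPrep stitching

# Share of read pairs that SeqPrep managed to merge with its default settings.
merged_fraction = pairs_merged / pairs_processed

# How many times more reads survive quality filtering with fastq-join.
fold_difference = fastq_join_written / seqprep_written

print("SeqPrep merged %.1f%% of pairs" % (100 * merged_fraction))
print("fastq-join kept %.1fx as many reads" % fold_difference)
```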

You will need to follow one of these three paths to get more reads retained:
1. Tweak the parameters for SeqPrep to get more reads retained. I'm not an expert on the particulars of SeqPrep though, so it may take some test runs on your end with different settings to see what works.
2. Get the updated fastq-join software installed and use that.
3. Skip the stitching step altogether and use just the R1 file.

-Tony

Olena P

Mar 2, 2016, 5:17:53 AM
to Qiime 1 Forum
Dear Tony,

Thank you for the suggestions.
I have decided to go for path 2.

You said that the fastq-join software, which is part of the ea-utils package, can be updated.

I finally installed ea-utils/1.1.2-806.

See output:

$  brew install homebrew/science/ea-utils

==> Tapping homebrew/science

Cloning into '/usr/local/Library/Taps/homebrew/homebrew-science'...

remote: Counting objects: 583, done.

remote: Compressing objects: 100% (582/582), done.

remote: Total 583 (delta 2), reused 66 (delta 0), pack-reused 0

Receiving objects: 100% (583/583), 481.71 KiB | 0 bytes/s, done.

Resolving deltas: 100% (2/2), done.

Checking connectivity... done.

Tapped 573 formulae (603 files, 1.5M)

==> Installing ea-utils from homebrew/science

==> Installing dependencies for homebrew/science/ea-utils: gsl

==> Installing homebrew/science/ea-utils dependency: gsl

==> Downloading https://homebrew.bintray.com/bottles/gsl-1.16.yosemite.bottle.2.

######################################################################## 100.0%

==> Pouring gsl-1.16.yosemite.bottle.2.tar.gz

🍺  /usr/local/Cellar/gsl/1.16: 245 files, 7.7M

==> Installing homebrew/science/ea-utils

==> Downloading https://homebrew.bintray.com/bottles-science/ea-utils-1.1.2-806.

######################################################################## 100.0%

==> Pouring ea-utils-1.1.2-806.yosemite.bottle.1.tar.gz

🍺  /usr/local/Cellar/ea-utils/1.1.2-806: 14 files, 760.7K



Looking forward to your answer.


Regards,

Olena


P.S. I might have accidentally sent a private reply to your email. Sorry for the spam.




Olena P

Mar 2, 2016, 6:58:52 AM
to Qiime 1 Forum
I have one thing to add.

Since my samples were sequenced on a HiSeq, I tried to use FLASH to stitch the same sample I used before.

I got this error:

Segmentation fault: 11


Could this issue be related to Python? I found that Python failures can occur after updating to Yosemite: https://discussions.apple.com/thread/6829612?start=0&tstart=0

Maybe the QIIME developers could release a new MacQIIME update that takes this problem into account...

Regards,
/Olena

TonyWalters

Mar 2, 2016, 10:13:03 AM
to Qiime 1 Forum
The executable file should be at this location now:
/usr/local/Cellar/ea-utils/1.1.2-806/fastq-join

We should be able to put the directory /usr/local/Cellar/ea-utils/1.1.2-806/ into the $PATH environment variable, so the fastq-join file will be found no matter where it is run from.

These commands should add that folder to $PATH:
echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc
source $HOME/.bashrc

After running these, can you type:
which fastq-join

and see if it's pointing to the expected filepath above?
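
Note that `which` reports only the first match on $PATH, so an older copy earlier in the list can hide a newer one. A minimal sketch for listing every match in lookup order follows. `find_all_on_path` is a hypothetical helper name, not a QIIME or Homebrew command; in bash, the builtin `type -a fastq-join` gives similar information.

```shell
# List every executable named "$1" found on $PATH, in lookup order.
# The first line printed is the copy the shell would actually run.
# find_all_on_path is a hypothetical helper name used for illustration.
find_all_on_path() {
    name="$1"
    old_ifs="$IFS"
    IFS=':'
    for dir in $PATH; do
        # An empty PATH entry means the current directory
        [ -z "$dir" ] && dir="."
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            # Drop a trailing slash on the directory to avoid "//" in the output
            printf '%s\n' "${dir%/}/$name"
        fi
    done
    IFS="$old_ifs"
}

find_all_on_path fastq-join
```

If the directory that was just added is not on the first line of the output, another copy is still taking precedence.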

Olena P

Mar 2, 2016, 10:30:04 AM
to Qiime 1 Forum
Hi Tony,

I have got this:

$ echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc

$ source $HOME/.bashrc

$ which fastq-join

/macqiime/bin/fastq-join


Do you think it will work now?

/Olena

Olena P

Mar 2, 2016, 10:44:19 AM
to Qiime 1 Forum
Hi, I ran the previous commands in a terminal with MacQIIME open.

After exiting MacQIIME, I got this:

$ echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc

$ source $HOME/.bashrc

$ which fastq-join

/usr/local/bin/fastq-join


Still not in the directory /usr/local/Cellar/ea-utils/1.1.2-806/

Should I do it in brew?


TonyWalters

Mar 2, 2016, 10:47:09 AM
to Qiime 1 Forum
No, it's still pointing to the older fastq-join file. Maybe we can just rename the old file and see if the default file used will be the new one.

Try this:
mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

If it gives you a file access error, try this instead (and you'll have to use your admin password):
sudo mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

Then try
which fastq-join

Olena P

Mar 2, 2016, 10:53:57 AM
to Qiime 1 Forum
Should I run this inside the MacQIIME environment or just in a plain terminal?

Sorry for the stupid question...

TonyWalters

Mar 2, 2016, 11:00:38 AM
to Qiime 1 Forum
Not a stupid question: the macqiime environment is part of this. I would run those commands after running macqiime, to make sure everything works once that environment is up.

Olena P

Mar 2, 2016, 11:27:38 AM
to Qiime 1 Forum
Well, this is what I got in the QIIME environment...


$ mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

mv: /macqiime/bin/fastq-join: No such file or directory


Now I do not understand... 


What happened with the directory?


Olena P

Mar 2, 2016, 11:30:31 AM
to Qiime 1 Forum

Sorry,

Of course, it is old_fastq-join now.



$ ls /macqiime/bin/old_fastq-join

/macqiime/bin/old_fastq-join


However, it does not show me a path when I type: which fast-join

TonyWalters

Mar 2, 2016, 11:40:25 AM
to Qiime 1 Forum
Okay, let's move the new version to the location of the old file.

Use this command:
sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Then try:
which fastq-join

Olena P

Mar 2, 2016, 11:47:38 AM
to Qiime 1 Forum

Hi!

I tried a Spotlight search, and it seems that I have 2 fastq-join files in 2 different places: one that I assume is the file I renamed, and another one.

How can that be?

Olena P

Mar 2, 2016, 11:53:07 AM
to Qiime 1 Forum
file with paths for 2 fast-join? 

2 fast-join.docx

TonyWalters

Mar 2, 2016, 11:58:41 AM
to Qiime 1 Forum


What did you get when you typed this command:
sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Olena P

Mar 2, 2016, 12:06:38 PM
to Qiime 1 Forum



$ sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Password:

cp: /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join: No such file or directory


TonyWalters

Mar 2, 2016, 12:12:21 PM
to Qiime 1 Forum
Rerun these commands:

brew update

brew install ea-utils

TonyWalters

Mar 2, 2016, 12:33:15 PM
to Qiime 1 Forum
Also, post the full output of those commands. When I installed ea-utils via brew, the directory listed at the end did indeed contain the fastq-join file.

Olena P

Mar 2, 2016, 12:49:56 PM
to Qiime 1 Forum

Hi Tony,


This is what I got:



$ brew update


Updated Homebrew from 3c896a6 to 2188945.

==> Updated Formulae

gdm                        macvim                     protobuf-swift           

 $ 

 $ brew install ea-utils

Warning: homebrew/science/ea-utils-1.1.2-806 already installed




TonyWalters

Mar 2, 2016, 12:51:38 PM
to Qiime 1 Forum
Can you search for this folder:
ea-utils-1.1.2-806

We need to figure out where this is on your system.

Olena P

Mar 2, 2016, 12:59:11 PM
to Qiime 1 Forum
It is in the Downloads folder.

When I open it, I can actually see fast-join.

Should I move the whole folder to the QIIME folder?

TonyWalters

Mar 2, 2016, 1:02:23 PM
to Qiime 1 Forum
For now, just move fastq-join to /macqiime/bin/ (there may be one with an extension, like fastq-join.cpp; we don't want that one, just fastq-join).

Olena P

Mar 2, 2016, 1:11:57 PM
to Qiime 1 Forum


It is actually not fastq-join, but fast-join.cpp and fast-join.t.

Please see the attached file.


Screen Shot 2016-03-02 at 19.07.32.png

TonyWalters

Mar 2, 2016, 1:21:46 PM
to Qiime 1 Forum
Okay, open a new terminal, type macqiime, and run this command:
find /. -name 'fastq-join' 2>/dev/null

Olena P

Mar 2, 2016, 1:25:26 PM
to Qiime 1 Forum

which fastq-join

/usr/local/bin/fastq-join

:~ $ cd /usr/local/Cellar/ea-utils/

:ea-utils $ ls

1.1.2-806


:ea-utils $ cd 1.1.2-806/

:1.1.2-806 $ ls

CHANGES              README

INSTALL_RECEIPT.json bin

MacQIIME LUMAC1159:1.1.2-806 $ cd bin

MacQIIME LUMAC1159:bin $ ls

alc             fastq-join      fastq-stats     sam-stats

determine-phred fastq-mcf       fastx-graph     varcall

fastq-clipper   fastq-multx     randomFQ


$ pwd

/usr/local/Cellar/ea-utils/1.1.2-806/bin



It seems like I found it.

Olena P

Mar 2, 2016, 1:29:06 PM
to Qiime 1 Forum

$ find /. -name 'fastq-join' 2>/dev/null

/./Users/cob-opr/Applications/QIIME_program/MacQIIME_1.9.1-20150604_OS10.7/macqiime/bin/fastq-join

/./usr/local/bin/fastq-join

/./usr/local/Cellar/ea-utils/1.1.2-806/bin/fastq-join



TonyWalters

Mar 2, 2016, 1:29:35 PM
to Qiime 1 Forum
Ah good, that does look correct.


We want to get /usr/local/Cellar/ea-utils/1.1.2-806/bin into the $PATH now (we missed the /bin/ part earlier).

Let's try this: open a new terminal, start macqiime, and run these commands:
echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/bin/:$PATH" >> $HOME/.bashrc
source $HOME/.bashrc
which fastq-join
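
One detail of the echo command above: because the string is in double quotes, $PATH is expanded when echo runs, so the whole current PATH is written into .bashrc, and each repeat of the command appends another line. That works, but a sketch that keeps $PATH literal in the file and skips duplicate lines might look like this. `prepend_path_in_rc` is a hypothetical helper name, not something provided by QIIME or Homebrew.

```shell
# Append an 'export PATH="<dir>:$PATH"' line to an rc file, at most once.
# prepend_path_in_rc is a hypothetical helper name used for illustration.
prepend_path_in_rc() {
    dir="$1"
    rc_file="$2"
    # \$PATH stays literal here, so it is expanded each time the rc file
    # is sourced, not at the moment this line is written.
    line="export PATH=\"$dir:\$PATH\""
    # Append only if the exact same line is not already in the file
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

With the paths from this thread, it would be called as `prepend_path_in_rc /usr/local/Cellar/ea-utils/1.1.2-806/bin "$HOME/.bashrc"`, followed by `source $HOME/.bashrc` and `which fastq-join` as before.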

Olena P

Mar 2, 2016, 1:37:03 PM
to Qiime 1 Forum

echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/bin/:$PATH" >> $HOME/.bashrc

 $ source $HOME/.bashrc

 $ which fastq-join

/usr/local/Cellar/ea-utils/1.1.2-806/bin//fastq-join


It seems to be in place :)

TonyWalters

Mar 2, 2016, 1:39:24 PM
to Qiime 1 Forum
Great, so now you can try running join_paired_ends.py with the default fastq-join option again, and then run the output through split_libraries_fastq.py like you did before.

Olena P

Mar 2, 2016, 2:33:16 PM
to Qiime 1 Forum
Dear Tony,
Unfortunately, it is not working.

The same problem:
 join_paired_ends.py -f desktop/OP/NG-8325_02972108_lib84810_4374_1_1.fastq -r desktop/OP/NG-8325_02972108_lib84810_4374_1_2.fastq -o desktop/OP/joined_NG-8325_02972108_lib84810_4374

~ $ split_libraries_fastq.py -i desktop/OP/joined_NG-8325_02972108_lib84810_4374/fastqjoin.join.fastq -m desktop/OP/02972108.txt -o desktop/OP/splitlib_02972108_HH/ --barcode_type 'not-barcoded' --sample_id 02972108

Traceback (most recent call last):

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

    main()

  File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

    for fasta_header, sequence, quality, seq_id in seq_generator:

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

    phred_offset=phred_offset):

  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

    parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

  File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

    seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.


:(

/Olena

Olena P

Mar 2, 2016, 2:39:35 PM
to Qiime 1 Forum
Could it be because of Python?

I got this error from another tool within the QIIME software (the FLASH command):

error Segmentation fault: 11


Thank you, Tony.

/Olena

TonyWalters

Mar 2, 2016, 2:43:11 PM
to Qiime 1 Forum
Okay, just to confirm, can you rerun:
which fastq-join

Just so we can be sure it's using the right one.

If so, then there is nothing further I can do to fix fastq-join, as it is not our software; QIIME just interfaces with it. You will have to use SeqPrep or just use the read 1 (R1) files as input for split_libraries_fastq.py.
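
A note on the traceback above: the "seq id" it prints is a nucleotide string rather than a read name, which may mean the parser has lost step with the four-line record layout, for example because of a truncated or malformed record in the joined file. This is only a guess from the message text. A minimal sketch for locating the first record whose layout looks wrong, assuming plain four-line FASTQ records with no wrapped sequence lines, could be the following. `check_fastq` is a hypothetical helper name, not a QIIME script.

```shell
# Report the first FASTQ record whose four-line layout looks wrong.
# Exit status is 0 if no problem is found, 1 otherwise.
# check_fastq is a hypothetical helper name used for illustration.
check_fastq() {
    awk '
        { rec = int((NR + 3) / 4) }   # 1-based record number of this line
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "record %d (line %d): header does not start with @\n", rec, NR
            bad = 1; exit
        }
        NR % 4 == 2 { seq_len = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "record %d (line %d): separator does not start with +\n", rec, NR
            bad = 1; exit
        }
        NR % 4 == 0 && length($0) != seq_len {
            printf "record %d (line %d): %d bases but %d quality characters\n", rec, NR, seq_len, length($0)
            bad = 1; exit
        }
        END {
            if (!bad && NR % 4 != 0) {
                printf "file ends in the middle of record %d\n", int((NR + 3) / 4)
                bad = 1
            }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

It would be run on the joined file from this thread as `check_fastq desktop/OP/joined_NG-8325_02972108_lib84810_4374/fastqjoin.join.fastq`. A clean result does not rule out a quality-encoding problem; it only shows that the record layout is consistent.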

Olena P

Mar 2, 2016, 3:14:38 PM
to Qiime 1 Forum

which fastq-join


/usr/local/Cellar/ea-utils/1.1.2-806/bin//fastq-join



It seems to be the correct one.

If I use only forward reads for the further steps, should I re-run all the sequences that have already been analysed with F+R?

I have approx 1000 and 300 are already done. 


Regards,
Olena

TonyWalters

Mar 2, 2016, 3:18:35 PM
to Qiime 1 Forum
It would be better if all sequences were processed in the same way.

You might look into multiple_split_libraries_fastq.py to help automate this process.
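
If only the R1 files are used, each run has the same shape as the split_libraries_fastq.py call shown earlier in the thread, so the commands can also be generated in a plain loop. The sketch below only prints the commands (a dry run) and does not use multiple_split_libraries_fastq.py. It assumes that R1 files end in _1.fastq, that the sample ID is the second underscore-separated field of the file name, and that a mapping file named after the sample ID sits in the same directory; all three are inferred from the single example in this thread. `print_split_commands` is a hypothetical helper name.

```shell
# Print one split_libraries_fastq.py command per R1 file in a directory.
# Dry run only: nothing is executed.
# print_split_commands is a hypothetical helper name used for illustration.
print_split_commands() {
    in_dir="$1"
    for r1 in "$in_dir"/*_1.fastq; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$r1" ] || continue
        # Assumption: sample ID is field 2 of the file name, e.g.
        # NG-8325_02972108_lib84810_4374_1_1.fastq -> 02972108
        sample_id=$(basename "$r1" | cut -d_ -f2)
        printf "split_libraries_fastq.py -i %s -m %s -o %s --barcode_type 'not-barcoded' --sample_id %s\n" \
            "$r1" "$in_dir/$sample_id.txt" "$in_dir/splitlib_$sample_id/" "$sample_id"
    done
}

print_split_commands desktop/OP
```

After checking the printed lines, they can be saved to a file and run with `sh`. Paths containing spaces would need extra quoting, which this sketch does not handle.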

Olena P

Mar 2, 2016, 3:41:50 PM
to Qiime 1 Forum
Hi Tony!

I managed to run multiple_join_paired_ends.py but not multiple_split_libraries_fastq.py. Hopefully I will manage that too.

Thank you for all your help! I also appreciate your time!


Kind regards,
Olena