Is it a bug? seqs.fna.incomplete (FastqParseError: Failed qual conversion for seq id)

Olena P

Feb 24, 2016, 6:29:24 PM

to Qiime 1 Forum

Dear All,

I am working with an established pipeline. Until today everything was fine. However, a problem has appeared when running split_libraries_fastq.py (for MANY fastqjoin.join.fastq files).

It gives me this error: FastqParseError: Failed qual conversion for seq id, and creates a seqs.fna.incomplete file, leaving the histogram.txt file empty too.

It does not work with fastqjoin.join.fastq files created by multiple_join_paired_ends.py (?), while after 2-3 repetitions of joining the read ends (join_paired_ends.py) I finally get a fastqjoin.join.fastq file that can be used in the split_libraries_fastq.py run.

This takes a lot of time, and it looks like a problem with QIIME (a bug?).

System information

==================

Platform: darwin

Python version: 2.7.10 |Anaconda 2.2.0 (x86_64)| (default, May 28 2015, 17:04:42) [GCC 4.2.1 (Apple Inc. build 5577)]

Python executable: /macqiime/anaconda/bin/python

QIIME default reference information

===================================

For details on what files are used as QIIME's default references, see here:

https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2

Dependency versions

===================

QIIME library version: 1.9.1

QIIME script version: 1.9.1

qiime-default-reference version: 0.1.2

NumPy version: 1.9.2

SciPy version: 0.16.0

pandas version: 0.16.2

matplotlib version: 1.4.3

biom-format version: 2.1.4

h5py version: 2.4.0 (HDF5 version: 1.8.14)

qcli version: 0.1.1

pyqi version: 0.3.2

scikit-bio version: 0.2.3

PyNAST version: 1.2.2

Emperor version: 0.9.51

burrito version: 0.9.1

burrito-fillings version: 0.1.1

sortmerna version: SortMeRNA version 2.0, 29/11/2014

sumaclust version: SUMACLUST Version 1.0.00

swarm version: Swarm 1.2.19 [Jun 2 2015 14:40:16]

gdata: Installed.

QIIME config values

===================

For definitions of these settings and to learn how to configure QIIME, see here:

http://qiime.org/install/qiime_config.html

http://qiime.org/tutorials/parallel_qiime.html

blastmat_dir: None

pick_otus_reference_seqs_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta

sc_queue: all.q

topiaryexplorer_project_dir: None

pynast_template_alignment_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta

cluster_jobs_fp: start_parallel_jobs.py

pynast_template_alignment_blastdb: None

assign_taxonomy_reference_seqs_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta

torque_queue: friendlyq

jobs_to_start: 1

slurm_time: None

denoiser_min_per_core: 50

assign_taxonomy_id_to_taxonomy_fp: /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt

temp_dir: /tmp/

slurm_memory: None

slurm_queue: None

blastall_fp: blastall

seconds_to_sleep: 60

QIIME base install test results

===============================

.........

----------------------------------------------------------------------

Ran 9 tests in 0.030s

OK

(working after several repetitions with join_paired_ends.py)

$ join_paired_ends.py -f 03331202_lib84791_4374_1_1.fastq -r 03331202_lib84791_4374_1_2.fastq -o joined_03331202

$ split_libraries_fastq.py -i joined_03331202/fastqjoin.join.fastq -m 03331202.txt -o splitlib_03331202 --barcode_type 'not-barcoded' --sample_ids 03331202

(not working after multiple_join_paired_ends.py)!

the same bash, the same day:

$ split_libraries_fastq.py -i joined_03331202/fastqjoin.join.fastq -m 03331202.txt -o splitlib_03331202R --barcode_type 'not-barcoded' --sample_ids 03331202

Traceback (most recent call last):

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

main()

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

for fasta_header, sequence, quality, seq_id in seq_generator:

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

phred_offset=phred_offset):

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id:

What is wrong? It is absolutely the same information and the same fastq files...

Could you please help me with an explanation and a suggestion? (It takes a lot of time to repeat samples one by one...)

Thanks in advance!

/Olena

Jamie Morton

Feb 24, 2016, 7:16:32 PM

to Qiime 1 Forum

Hi Olena,

Could you post the first few lines of your fastq file?

Jamie

Olena P

Feb 25, 2016, 2:10:30 AM

to Qiime 1 Forum

Hi Jamie,

Here are the first 2 lines of the fastq (read 1), fastq (read 2) and fastqjoin.join.fastq files.

I have also attached the log files from when it works (R) and when it does not (RR), which create seqs.fna and seqs.fna.incomplete files respectively, from the same data!

first 2 lines:

$ head -n2 03331202_lib84791_4374_1_1.fastq

@HISEQ:400:HHYYTBCXX:1:1101:1883:2243 1:N:0:TAATGCGCTAATCTTA

GGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAACACATGCAAGTCGAACGGAGAATTTTATTTCGGTAGAATTCTTAGTGGCGAACGGGTGAGTAACGCGTAGGCAACCTACCCTTTAGACGGGGACAACATTCCGAAAGGAGTGCTAATACCGGATGTGATCATCTTGCCGCATGGCAGGACGAAGAAAGATGGCCTCTACAAGTAAGCTATCGCTAAAGGATGGGCCTGCGTCTGATTAGCTAGTTGGTAGTGTAACGGACTACCAAGGCGATGATCAGTAGCCGGTC

$ head -n2 03331202_lib84791_4374_1_2.fastq

@HISEQ:400:HHYYTBCXX:1:1101:1883:2243 2:N:0:TAATGCGCTAATCTTA

ATTACCGCGGCTGCTGGCACGTAGTTAGCCGTGGCTTCCTCGTTTACTACCGTCATTGCAATGCAATGTTCACACACTGCACGTTCGTCATAAACAACAGAGCTTTACAGACCGAAATCCTTCATCACTCACGCGGCGTTGCTCCGTCAGACTTTCGTCCATTGCGGAAGATTCCCCACTGCTGCCTCCCGTAGGAGTTTGGGCCGTGTCTCAGTCCAAATGTGGCCGTTCATCCTCTCCGACCGGCTACTCATCAGCCCCTTGGTAGTCCGTTACACTACCATCTCGCTATTCCGACCCA

$ head -n2 joined_03331202R/fastqjoin.join.fastq

@HISEQ:400:HHYYTBCXX:1:1101:3751:2232 1:N:0:TAATGCGCTAATCTTA

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCTACAGGCTTAACACATGCAAGTCGAGGGGCAGCATGGTCTTAGCTTGCTAAGGCTGATGGCGACCGGCGCACGGGTGAGTAACACGTATCCAACCTGCCGTCTACTCTTGGCCAGCCTTCTGAAAGGAAGATTAATCCAGGATGGGATCATGAGTTCACATGTCCGTATGATTAAAGGTATTTTCCGGTAGACGATGGGGATGCGTTCCATTAGATAGTAGGCGGGGTAACGGCCCACCTAGTCAACGATGGATAGGGGTTCTGAGAGGAAGGTCCCCCACATTGGAACTGAGACACGGTCCAAACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGGCGATGGCCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATAAAGGAATAAAGTCGGGTATGCATACCCGTTTGCATGTACTTTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

Looking forward to understanding why this is happening!

Regards,

Olena

p.s. fastq files https://drive.google.com/open?id=0B4ZHFM2D8YtPVERJNmRRbVA5TXM

seqs.fna.incomplete

split_library_log_RR.txt

split_library_log_R.txt

Jamie Morton

Feb 25, 2016, 2:03:03 PM

to Qiime 1 Forum

Hi Olena,

Can you try to pass --phred_offset 33 to see if that resolves the issue?
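For reference, one rough way to sanity-check which offset a file uses is to look at the range of ASCII codes in its quality lines. The following is only a sketch under stated assumptions: `guess_phred_offset` is a hypothetical helper (not part of QIIME or scikit-bio), and the thresholds rest on the usual conventions that Phred+64 data never contains characters below ';' (code 59) and that Phred+33 Illumina data rarely goes above 'J' (code 74).

```python
def guess_phred_offset(quality_lines):
    """Guess the Phred offset (33 or 64) from FASTQ quality strings.

    Heuristic only (hypothetical helper, not a QIIME function):
    - any character with ASCII code below 59 (';') implies Phred+33;
    - otherwise any character above 74 ('J') suggests Phred+64;
    - otherwise the data is ambiguous and None is returned.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None  # nothing to inspect
    if min(codes) < 59:
        return 33
    if max(codes) > 74:
        return 64
    return None


if __name__ == "__main__":
    # Fragment of a quality line in the style of HiSeq output.
    sample = ["DDDDDIIIIIIHIIII-CHH@5+HHH#"]
    print(guess_phred_offset(sample))
```

A result of 33 here would only say that the offset is unlikely to be the cause of the error; it says nothing about whether the file layout itself is intact.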

Jamie

Olena P

Feb 25, 2016, 3:36:27 PM

to Qiime 1 Forum

Hi Jamie!

Thank you for the suggestion! That was the first thing I tried... No, it is not working:

split_libraries_fastq.py -i joined_02972108RR/fastqjoin.join.fastq -m 02972108.txt -o splitlib_02972108RR --sample_ids 02972108 --barcode_type 'not-barcoded' --phred_offset '33'

Traceback (most recent call last):

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

main()

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

for fasta_header, sequence, quality, seq_id in seq_generator:

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

phred_offset=phred_offset):

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.

I have tried with phred_offset 64 - not working!

Maybe I have to do something with my sequencing files?

Thank you for trying to help me,

/Olena

TonyWalters

Feb 25, 2016, 8:34:06 PM

to Qiime 1 Forum

I'm not sure offhand, but there is something amiss with the fastq file format: the error should be listing the fastq sequence label line, but it's listing a sequence, as if the label line wasn't written:

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.

Something perhaps went awry during the join_paired_ends.py step. Maybe use that sequence to grep for the lines before it to see what the label looks like? E.g.
grep -B 5 "GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT" joined_02972108RR/fastqjoin.join.fastq

I'd suggest using that approach to see what happened to the label line in the fastq file. It may be that there is a bug in the stitching software that caused it to skip writing a label. Maybe go back to the original unstitched reads and see what the label looks like there, as that might be a problem too. I'm not sure of the best way to fix this if that's the case, though. Maybe just remove that read?

Olena P

Feb 26, 2016, 3:50:57 AM

to Qiime 1 Forum

Hi Tony!

Thank you for your suggestion!

I tried you suggestion grep -B 5 "GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT" joined_02972108RR/fastqjoin.join.fastq > test

Test: AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

+

AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT

What should I do now? As I understand it, this does not replace my fastqjoin.join.fastq file automatically?

Regards,

Olena

TonyWalters

Feb 26, 2016, 11:32:26 AM

to Qiime 1 Forum

Hello Olena,

Something does seem to be awry with that sequence, in particular, the one with this label:

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

The fastq format is supposed to look like this example:
@SEQ_ID

GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT

+

!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

but for that label above, it goes label, +, then sequence.
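A check like this can also be automated instead of grepping by hand. Below is a minimal sketch, assuming every record occupies exactly four lines with no wrapped sequence or quality lines; `first_malformed_record` is a hypothetical helper, not something shipped with QIIME or scikit-bio.

```python
from itertools import islice


def first_malformed_record(lines):
    """Return (line_number, reason) for the first broken record, else None.

    Assumes the plain four-line layout: label, sequence, '+', quality.
    Scanning stops at the first problem, because the four-line framing
    cannot be trusted after that point.
    """
    it = iter(lines)
    line_no = 1  # 1-based number of the current record's label line
    while True:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            return None  # reached the end without finding a problem
        if len(record) < 4:
            return (line_no, "truncated record")
        label, seq, plus, qual = record
        if not label.startswith("@"):
            return (line_no, "label line does not start with '@'")
        if not plus.startswith("+"):
            return (line_no + 2, "third line does not start with '+'")
        if len(seq) != len(qual):
            return (line_no + 1, "sequence and quality lengths differ")
        line_no += 4


if __name__ == "__main__":
    # Second record mimics the layout seen in this thread: label, '+', sequence.
    demo = ["@r1", "ACGT", "+", "IIII", "@r2", "+", "ACGT", "+", "IIII"]
    print(first_malformed_record(demo))
```

Passing an open file handle instead of the demo list should work the same way, since the function only iterates over lines.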

I don't know how it got messed up, and unfortunately, there are no automatic fixes for this sort of thing. Now that we know the offending label (hopefully the only one), we can check the original R1/R2 files (the ones used for join_paired_ends.py), and see what the reads look like there. Can you run the following grep commands and post the results?

grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq

(I want to see how the lines after this offending label look too)

grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" X

grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" Y

where X and Y are the R1 and R2 fastq files that were used as input for join_paired_ends.py.

Olena P

Feb 26, 2016, 3:32:02 PM

to Qiime 1 Forum

Hi Tony,

Here are the results from the 3 commands you asked me to run. I hope you will see the problem.

$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq

+

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

+

AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT

+

CDDDDGHIIHIEHIIIIIIHIHIIHIIIIIHIIIGHIIIIHHIIIIIIIIIIIIIIIIIHG@HIHHHHGGHIIIIIIIIGIIIHIIEF@GHHHIIIIIIIICHCFHHIIHHIIIHIIIIHGGEHIIIGHHIIIGHHCHDH<EEHIIIIFIIHIGHF=GDEHHGGGFDFHIIHIEHHHHFGEHHIHIFHHFEEECHHH?EE@E?G..BC<EHEHHHHAG.@HH-BFB@AHHHHICGHHFEDHHHIIGHH@@GHIF@E@EHGB-BGDHDHFHFE@#@-CH=H,5@#H-IHCGHEDHI@-?FHFF-C>+>?EGHGFAB.BG?HFFHEEHDHHHF-EGIIHGCFHEFDGEGF?EFHEHHGHGEHHGHHHHHHHIIIIIHHGIHIIHGGCDGD-IIIIHDHHIIIHF11FIIIIIIIIIHGC/IIIHIHIHHIIIHFDEHEHHHHEEG@FHIHHHHCHHGHFEIIIHHGEHHCHE@DDEIIHIHIIIIIIIHIHCGHECGIGIHHECCIGHIIIIIIIIIIIIIIIHIIIIIIIDCDDD

$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" 02972108_lib84810_4374_1_1.fastq

@HISEQ:400:HHYYTBCXX:1:1101:13393:100095 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTA

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIEHGHHHHIEIIIGHIIHIIEHCC@EEH5CHCCHHEH?FHCH@

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGG

+

CDDDDGHIIHIEHIIIIIIHIHIIHIIIIIHIIIGHIIIIHHIIIIIIIIIIIIIIIIIHG@HIHHHHGGHIIIIIIIIGIIIHIIEF@GHHHIIIIIIIICHCFHHIIHHIIIHIIIIHGGEHIIIGHHIIIGHHCHDH<EEHIIIIFIIHIGHF=GDEHHGGGFDFHIIHIEHHHHFGEHHIHIFHHFEEECHHH?EE@E?G..BC<EHEHHHHAG.@HH-BFB@AHHHHICGHHFEDHHHIIGHH@@GHIF@E@EHGB-BGDHDHFHFE@-@-CH=H,5@HH--6--5>DHI@--8FE

@HISEQ:400:HHYYTBCXX:1:1101:14633:100112 1:N:0:CGGCTATGCAGGACGT

$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" 02972108_lib84810_4374_1_2.fastq

@HISEQ:400:HHYYTBCXX:1:1101:13393:100095 2:N:0:CGGCTATGCAGGACGT

ATTACCGCGGCTGCTGGCACGGAGTTAGCCGATCCTTATTCGTACGATACCTTCAGACAGATACACGTATCTGCGTTTACCCTCGTACAAAAGCAGTTTACAACTCATAGAGCCGTCATCCTGCACGCGGCATGGCTGGTTCAGACTTGCGTCCATTGACCAATATTCCTCACTGCTGCCTCCCGTAGGAGTCTGGTCCGTGTCTCAGTACCAGTGTGGGGGGTTAACCTCTCAGTCCCCCTATGTATCGTCGCCTTGGTGAGCCGTTACCCCACCAACTAGCTAATACACCGCAGGCCCC

+

DDDDDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHIHHIIIIDHIIIIIII1DHHIHHHIHIIIIHIIIIHIIIIIIIHIHIIIIIIIIIIIHIIIIGHIIIIIIIIHIHIHHIIIIICHIIIIHCFDHIHIIIDHIIIICHIGFHIIIHHIIHIIHIIIGHGHIIIGGHHGIIHGIIHIGHHHHGBDEHIIHHIII-8@8@-@GEHIII--FHIHHH+5,-8@-6@?F@FG-@HHDGICE@-6-65-+>@G-@G@H-8@GHGHHHAHHHHH-@GEH####################

@HISEQ:400:HHYYTBCXX:1:1101:14126:100028 2:N:0:CGGCTATGCAGGACGT

ATTACCGCGGCTGCTGGCACGTAGTTAGCCGTCCCTTTCTGGTAAGCTACCGTCACAGTGTGAACTTTCCACTCTCACACTCGTTCTTGACTTACAACAGAGCTTTACGATCCGAACACCTTCTTCACTCACGCGGCGTTGCTCGGTCAGGGTTGCCCCCATTGCCGAAGATTCCCTACTGCTGCCTCCCGTAGGAGTCTGGGCCGTGTCTCAGTCCCAGTGTGGCCGATCACCCTCTCAGGTCGGCTATTTATCGTCGCATAGGTGAGCCTTTACCTCACCTACTCGCTAATACAACCCA

+

DDDCDIIIIIIIHIIIIIIIIIIIIIIIHGICCEHHIGIGCEHGCHIHIIIIIIIHIHIIEDD@EHCHHEGHHIIIEFHGHHCHHHHIHF@GEEHHHHEHEDFHIIIHHIHIHIII/CGHIIIIIIIIIF11FHIIIHHDHIIII-DGDCGGHIIHIGHHIIIIIHHHHHHHGHHEGHGHHEHFE?FGEGDFEHFCGHIIGE-FHHHDHEEHFFH?GB.BAFGHGE?>+>C-FFHF?--AG?EHGCHI#####################################################

@HISEQ:400:HHYYTBCXX:1:1101:14633:100112 2:N:0:CGGCTATGCAGGACGT

Really appreciate your help!

/Olena

TonyWalters

Feb 27, 2016, 6:46:03 PM

to Qiime 1 Forum

Okay Olena, thanks for bearing with us.

After testing join_paired_ends.py locally with fastq files created from the sequences you just posted, I was able to replicate the problem of the unwanted, added + lines. Here is the specific issue from your joined data:

$ grep -A 4 -B 4 "@HISEQ:400:HHYYTBCXX:1:1101:14126:100028" joined_02972108RR/fastqjoin.join.fastq

+ <------------ THIS SHOULDN'T BE HERE

AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCTTAACACATGCAAGTCGAGGGGCAGCATAATGGATAGCAATATCTATGGTGGCGACCGGCGCACGGGTGCGTAACGCGTATGCAACCTACCTTTAACAGGGGGATAACACTGAGAAATTGGTACTAATACCCCATAATATCATAGAAGGCATCTTTTATGGTTGAAAATTCCGATGGTTAGAGATGGGCATGCGTTGTATTAGCTAGTTGGTGGGGTAACGGCTCACCAAGGCGACGATACATAGGGGGACTGAGAGGTTAACCCCCCACACTGGTACTGAGACACGGACCAGACTCCTACGGGAGGCAGCAGTGAGGAATATTGGTCAATGGACGCAAGTCTGAACCAGCCATGCCGCGTGCAGGATGACGGCTCTATGAGTTGTAAACTGCTTTTGTACGAGGGTAAACGCAGATACGTGTATCTGTCTGAAGGTATCGTACGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAAT

+

DDDDDIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEHIIIIIIIIIIIIIHIIIIIIIIIIIHIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIGIIIIHIHHIIIIHHHIIIIIIIIGHIIGFHIHII?FGHHIGHIEHIIHHICFHDHIIIIIIIIIIIGHIIHIIGE-CHHIIIIGHIIEHHHIIIIIIIIHHGHHHHIEIIIGHIIHIIEHECIGEHHCHGFHHEH?FHCH@5+HHHIHF--IIIHEG@-@8@8-IIIHHIIHEDBGHHHHGIHIIGHIIGHHGGIIIHGHGIIIHIIHIIHHIIIHFGIHCIIIIHDIIIHIHDFCHIIIIHCIIIIIHHIHIHIIIIIIIIHGIIIIHIIIIIIIIIIIHIHIIIIIIIHIIIIHIIIIHIHHHIHHD1IIIIIIIHDIIIIHHIHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD

I found that the problem happens when running the underlying software (fastq-join) directly. It seems we need to update the fastq-join software, which is part of the ea-utils package. Since you are on a mac, I would install the brew software, following the instructions here:
http://brew.sh/

Then run:

brew update

Finally, run:
brew install ea-utils

Hopefully this will install the 1.1.2-806 version of the ea-utils software. We're going to need the path it is installed to, so that the system can find the fastq-join software. Can you copy the output that comes after running brew install ea-utils?
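As a stopgap while the updated fastq-join is not available, the stray + lines could in principle be stripped from an already joined file. This is only a sketch and has not been run against real fastq-join output: it assumes that the only damage is a bare + line sitting directly between a label and its sequence, and that every record otherwise uses single-line sequence and quality fields. `drop_stray_plus_lines` is a hypothetical helper name.

```python
def drop_stray_plus_lines(lines):
    """Yield FASTQ lines, skipping a bare '+' that directly follows a label.

    Assumption (not verified on real data): records are label / sequence /
    '+' / quality on single lines, and the only corruption is an extra '+'
    line inserted between the label and the sequence.
    """
    it = iter(lines)
    for label in it:
        yield label
        seq = next(it, None)
        if seq is not None and seq.strip() == "+":
            seq = next(it, None)  # skip the stray separator line
        plus = next(it, None)
        qual = next(it, None)
        for line in (seq, plus, qual):
            if line is not None:
                yield line


if __name__ == "__main__":
    damaged = ["@r1\n", "+\n", "ACGT\n", "+\n", "IIII\n",
               "@r2\n", "ACGT\n", "+\n", "IIII\n"]
    print("".join(drop_stray_plus_lines(damaged)), end="")
```

It would be wise to write the result to a new file, keep the original, and confirm that the record count is unchanged before passing the repaired file to split_libraries_fastq.py.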

Olena P

Feb 28, 2016, 5:35:44 AM

to Qiime 1 Forum

Hi Tony!

I cannot install it by myself since I do not have administrator rights. I have already contacted our institutional service desk.

I will come back as soon as I have more information.

Thank you for your help,

/Olena

Olena P

Mar 1, 2016, 3:30:19 AM

to Qiime 1 Forum

Dear Tony,

Since my work computer is serviced and supported by my home institution, they do not recommend installing "brew".

Are there any other ways to solve my problem?

When do you think a new update for the fastq-join software will be released?

Can I modify my sequencing data somehow to make it suitable for the current QIIME version?

I got more information from the company which produced the sequencing data.

They think that the read number was too high...

Can one run split_libraries first and then join the paired ends in QIIME instead, without changing the final result?

Regards

/olena

TonyWalters

Mar 1, 2016, 7:12:50 AM

to Qiime 1 Forum

Olena, if you have someone locally who has a decent amount of experience, you could have them try to compile the software from the source code: https://code.google.com/archive/p/ea-utils/source/default/source
It's going to take some work though, as they did not set up the install scripts to work well on a Mac (hence the suggestion for usage of brew).

Another option would be to bypass the join_paired_ends.py step, and just use one of the reads as input for split_libraries_fastq.py rather than the stitched reads. R1 usually has higher quality.

Olena P

Mar 1, 2016, 1:44:17 PM

to Qiime 1 Forum

Dear Tony,

Thank you very much for your suggestion. I will work on it.

I got help from a bioinformatician who works with other software. He was able to stitch and process the files which had failed in QIIME.

He stitched the paired ends with:

--max-mismatch-density=0.25 --min-overlap=10 --max-overlap=300 and it worked fine.

I am wondering whether my problem with join_paired_ends.py can be solved by adjusting the joining method?

In QIIME 1.9.1, fastq-join (Erik Aronesty, 2011, ea-utils) is the default.

Can I use SeqPrep instead? Or will it not improve my situation?

I also have one important question to ask:

I got sequencing data from a HiSeq 2500 where some files have up to 550,000 sequence reads (approx. 330,000,000 sequence bases), while the files that I usually run have about 200,000 - 300,000 sequence reads. Do you think this can affect the joining of paired ends, or does it not seem to be the problem?

I know that the QIIME software was primarily designed for MiSeq data, which usually has a lower number of reads.

Regards,

Olena

TonyWalters

Mar 1, 2016, 1:59:21 PM

to Qiime 1 Forum

Hello,

You could use SeqPrep instead (it won't have the issue of the + characters, as that appears to be something specific to particular versions of the ea-utils fastq-join software), but if the stitched reads you just generated are able to work with split_libraries_fastq.py without the error you were encountering before, I would go with those to be consistent with your prior pipeline for processing reads. The number of reads shouldn't be a factor (it just increases the amount of time/memory needed to process the data), but the shorter read length of HiSeq reads relative to MiSeq could affect the overlap size for the stitching software.

Olena P

Mar 1, 2016, 2:57:37 PM

to Qiime 1 Forum

Thank you for the answer.

I have tried

$ SeqPrep -f 02972108_lib84810_4374_1_1.fastq -r 02972108_lib84810_4374_1_2.fastq -1 output_unpaired_1.fastq -2 output_unpaired_2.fastq -s output_merged.fastq

Processing reads... \

Pairs Processed: 533880

Pairs Merged: 26911

Pairs With Adapters: 129

Pairs Discarded: 19

CPU Time Used (Minutes): 2.848939

$ split_libraries_fastq.py -i output_merged.fastq -o splitlib_02972108D -m 02972108.txt --sample_ids 02972108 --barcode_type 'not-barcoded'

Traceback (most recent call last):

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

main()

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

for fasta_header, sequence, quality, seq_id in seq_generator:

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

phred_offset=phred_offset):

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 278, in process_fastq_single_end_read_file

post_casava_v180 = is_casava_v180_or_later(fastq_read_f_line1)

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/parse.py", line 37, in is_casava_v180_or_later

"Non-header line passed as input. Header must start with '@'."

AssertionError: Non-header line passed as input. Header must start with '@'.

NOT WORKING! It gives seqs.fna.incomplete. What did I do wrong?

$ head -n2 output_merged.fastq

?DQ?8?.?_??k?????

%??P?;oF?ӝX?

Now I cannot read the output of the head command... How do I check output_merged.fastq?

Regards

/Olena

TonyWalters

Mar 1, 2016, 3:10:18 PM

to Qiime 1 Forum

Olena, I think your file is in gzipped format (the output of SeqPrep will be gzipped). Can you try renaming your output_merged.fastq file to output_merged.fastq.gz? This will let QIIME know the file type when reading it in.
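If it is unclear whether a file is compressed, the first two bytes settle it: gzip files always start with the bytes 0x1f 0x8b, whatever the file extension says. A small sketch using only the Python standard library; `is_gzipped` and `open_fastq` are hypothetical helper names, not QIIME functions.

```python
import gzip


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def open_fastq(path):
    """Open a FASTQ file as text, decompressing on the fly when needed."""
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "r")
```

With a helper like this, the first label line can be read with `open_fastq(path).readline()` even when `head` on the raw file prints unreadable characters.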

Olena P

Mar 1, 2016, 4:01:55 PM

to Qiime 1 Forum

Dear Tony!

It is working now with the SeqPrep method!!!! And I can run the split_libraries_fastq.py command as well.

$ head -n2 output_merged.fastq

@HISEQ:400:HHYYTBCXX:1:1101:8983:2806 1:N:0:CGGCTATGCAGGACGT

AGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCGTGGCTAAGACATGCAAGTCGAACGAGAGAATTGCTAGCTTGCTAATAATTCTCTAGTGGCGCACGGGTGAGTAACACGTGAGTAACCTGCCCCCGAGAGCGGGATAGCCCTGGGAAACTGGGATTAATACCGCATAGTATCGAAAGATTAAAGCAGCAATGCGCTTGGGGATGGGCTCGCGGCCTATTAGTTAGTTGGTGAGGTAACGGCTCACCAAGGCGATGCAGGGTAGCCGGTCTGAGAGGATGTCCGGACACACTGGAACTGAGACACGGTCCAGACACCTACGGGTGGCAGCAGTCGAGAATCATTCACAATGGGGGAAACCCTGATGGTGCGACGCCGCGTGGGGGAATGAAGGTCTTCGGATTGTAAACCCCTGTCATGTGGGAGCAAATTAAAAAGATAGTACCACAAGAGGAAGAGACGGCTAACTCTGTGCCAGCAGCCGCGGTAAT

split_libraries_fastq.py -i output_merged.fastq -o splitlib_02972108DD -m 02972108.txt --sample_ids 02972108 --barcode_type 'not-barcoded'

RESULT:

with join_paired_ends.py

Input file paths

Sequence read filepath: joined_02972108_R/fastqjoin.join.fastq (md5: 80d0088a69ae4b3aa8c82cd3380ad1d6)

Quality filter results

Total number of input sequences: 356536

Barcode not in mapping file: 0

Read too short after quality truncation: 1707

Count of N characters exceeds limit: 725

Illumina quality digit = 0: 0

Barcode errors exceed max: 0

Result summary (after quality filtering)

Median sequence length: 515.00

02972108 354104

Total number seqs written 354104

---

with SeqPrep method

Input file paths

Sequence read filepath: output_merged.fastq (md5: 73987c9ab4e0e7f5461a134c2148ee74)

Quality filter results

Total number of input sequences: 26911

Barcode not in mapping file: 0

Read too short after quality truncation: 1328

Count of N characters exceeds limit: 104

Illumina quality digit = 0: 0

Barcode errors exceed max: 0

Result summary (after quality filtering)

Median sequence length: 489.00

02972108 25479

Total number seqs written 25479

The results are not identical! Should I re-run all samples with the same method, or will it not affect the alpha rarefaction?

Thank you very much for your input and all suggestions!

/Olena

TonyWalters

Mar 1, 2016, 5:06:59 PM

to Qiime 1 Forum

I wouldn't expect the results to be identical with two different algorithms. But you're down about an order of magnitude when using SeqPrep instead of fastq-join.
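To put rough numbers on that gap, the counts posted in this thread can be compared directly. The sketch below assumes that the fastq-join run started from the same 533880 read pairs that SeqPrep reported for this sample; the thread does not state that explicitly. Under that assumption SeqPrep merged only about 5% of the pairs, against roughly 67% for fastq-join.

```python
# Counts copied from the logs posted above for sample 02972108.
pairs_processed = 533880      # SeqPrep "Pairs Processed"
seqprep_merged = 26911        # SeqPrep "Pairs Merged"
seqprep_written = 25479       # seqs written after quality filtering (SeqPrep)
fastq_join_joined = 356536    # input sequences to split_libraries_fastq.py (fastq-join)
fastq_join_written = 354104   # seqs written after quality filtering (fastq-join)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: both joiners were given the same 533880 pairs.
seqprep_merge_rate = percent(seqprep_merged, pairs_processed)
fastq_join_merge_rate = percent(fastq_join_joined, pairs_processed)
fold_difference = fastq_join_written / seqprep_written

if __name__ == "__main__":
    print("SeqPrep merge rate:    %.1f%%" % seqprep_merge_rate)
    print("fastq-join merge rate: %.1f%%" % fastq_join_merge_rate)
    print("Fold difference in reads written: %.1f" % fold_difference)
```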

You will need to follow one of these three paths to get more reads retained:
1. Tweak the parameters for SeqPrep to get more reads retained. I'm not an expert on the particulars of SeqPrep though, so it may take some test runs on your end with different settings to see what works.

2. Get the updated fastq-join software installed and use that.

3. Skip the stitching step altogether and use just the R1 file.

-Tony

Olena P

Mar 2, 2016, 5:17:53 AM

to Qiime 1 Forum

Dear Tony,

Thank you for the suggestions.

I have decided to go for path 2.

You said that the fastq-join software, which is part of the ea-utils package, needs to be updated.

I finally installed ea-utils/1.1.2-806

See output:

$ brew install homebrew/science/ea-utils

==> Tapping homebrew/science

Cloning into '/usr/local/Library/Taps/homebrew/homebrew-science'...

remote: Counting objects: 583, done.

remote: Compressing objects: 100% (582/582), done.

remote: Total 583 (delta 2), reused 66 (delta 0), pack-reused 0

Receiving objects: 100% (583/583), 481.71 KiB | 0 bytes/s, done.

Resolving deltas: 100% (2/2), done.

Checking connectivity... done.

Tapped 573 formulae (603 files, 1.5M)

==> Installing ea-utils from homebrew/science

==> Installing dependencies for homebrew/science/ea-utils: gsl

==> Installing homebrew/science/ea-utils dependency: gsl

==> Downloading https://homebrew.bintray.com/bottles/gsl-1.16.yosemite.bottle.2.

######################################################################## 100.0%

==> Pouring gsl-1.16.yosemite.bottle.2.tar.gz

🍺 /usr/local/Cellar/gsl/1.16: 245 files, 7.7M

==> Installing homebrew/science/ea-utils

==> Downloading https://homebrew.bintray.com/bottles-science/ea-utils-1.1.2-806.

######################################################################## 100.0%

==> Pouring ea-utils-1.1.2-806.yosemite.bottle.1.tar.gz

🍺 /usr/local/Cellar/ea-utils/1.1.2-806: 14 files, 760.7K

Looking forward to your answer.

Regards,

Olena

P.S. I might have accidentally sent a private reply to your email. Sorry for the spam.

Olena P

Mar 2, 2016, 6:58:52 AM

to Qiime 1 Forum

I have one thing to add.

Since my samples were sequenced on a HiSeq, I tried to use FLASH to stitch the same sample I used before.

I got this error:

Segmentation fault: 11

Could this issue be related to Python? I found that Python failures can occur after updating to Yosemite: https://discussions.apple.com/thread/6829612?start=0&tstart=0

Maybe the QIIME developers could release a new update for MacQIIME that takes this problem into account...

Regards,

/Olena

TonyWalters

Mar 2, 2016, 10:13:03 AM

to Qiime 1 Forum

The executable file should be at this location now:
/usr/local/Cellar/ea-utils/1.1.2-806/fastq-join

We should be able to put the directory /usr/local/Cellar/ea-utils/1.1.2-806/ into the $PATH environment variable, so the fastq-join file will be found no matter where it is run.

These commands should add that folder to $PATH:
echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc

source $HOME/.bashrc

After running these, can you type:
which fastq-join

and see if it's pointing to the expected filepath above?
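
One detail of the echo command above may be worth knowing: inside double quotes the shell expands $PATH at the moment echo runs, so the line written to .bashrc contains whatever PATH happened to be at that time (for example, with or without the macqiime additions). A minimal sketch of the difference, assuming a scratch file in place of the real ~/.bashrc (rc_file is an illustrative name, not something from the thread):

```shell
# Scratch file standing in for ~/.bashrc (assumption: we do not touch the real file here)
rc_file="$(mktemp)"

# Double quotes: $PATH is expanded right now, so the current value is frozen into the file.
echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806:$PATH" >> "$rc_file"

# Single quotes: the literal text $PATH is written and only expanded when the file is sourced.
echo 'export PATH=/usr/local/Cellar/ea-utils/1.1.2-806:$PATH' >> "$rc_file"

# Line 1 shows the frozen PATH, line 2 still contains the literal $PATH
cat "$rc_file"
```

Either form puts the new directory first once the file is sourced; the single-quoted form just keeps the rest of PATH dynamic.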

Olena P

Mar 2, 2016, 10:30:04 AM

to Qiime 1 Forum

Hi Tony,

I have got this:

$ echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc

$ source $HOME/.bashrc

$ which fastq-join

/macqiime/bin/fastq-join

$

Do you think it will work now?

/Olena

Olena P

Mar 2, 2016, 10:44:19 AM

to Qiime 1 Forum

Hi, I ran the command in a terminal with macqiime open.

After I exited macqiime, I got this:

$ echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/:$PATH" >> $HOME/.bashrc

$ source $HOME/.bashrc

$ which fastq-join

/usr/local/bin/fastq-join

$

Still not in the directory /usr/local/Cellar/ea-utils/1.1.2-806/

Should I do it in brew?

TonyWalters

Mar 2, 2016, 10:47:09 AM

to Qiime 1 Forum

No, it's still pointing to the older fastq-join file. Maybe we can just rename the old file and see if the default file used will be the new one.

Try this:

mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

If it gives you a file access error, try this instead (and you'll have to use your admin password):

sudo mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

Then try
which fastq-join

Olena P

Mar 2, 2016, 10:53:57 AM

to Qiime 1 Forum

Should I run this inside qiime or just in the terminal?

Sorry for the stupid question...

TonyWalters

Mar 2, 2016, 11:00:38 AM

to Qiime 1 Forum

Not a stupid question; the macqiime environment is part of this. I would run those commands after starting macqiime, to make sure everything works once that environment is up.

Olena P

Mar 2, 2016, 11:27:38 AM

to Qiime 1 Forum

Well, this is what I got in the qiime environment...

$ mv /macqiime/bin/fastq-join /macqiime/bin/old_fastq-join

mv: /macqiime/bin/fastq-join: No such file or directory

Now I do not understand...

What happened to the directory?

Olena P

Mar 2, 2016, 11:30:31 AM

to Qiime 1 Forum

Sorry,

Of course, it is old_fastq-join now:

$ ls /macqiime/bin/old_fastq-join

/macqiime/bin/old_fastq-join

However, it does not show me a path when I type: which fast-join

TonyWalters

Mar 2, 2016, 11:40:25 AM

to Qiime 1 Forum

Okay, let's move the new version to the location of the old file.

Use this command:

sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Then try:

which fastq-join

Olena P

Mar 2, 2016, 11:47:38 AM

to Qiime 1 Forum

Hi!

I tried a Spotlight search, and it seems that I have 2 fastq-join files in 2 different places:

the file that I renamed, I am assuming

and

How can that be?

Olena P

Mar 2, 2016, 11:53:07 AM

to Qiime 1 Forum

file with paths for 2 fast-join?

2 fast-join.docx

TonyWalters

Mar 2, 2016, 11:58:41 AM

to Qiime 1 Forum

What did you get when you typed this command:

sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Olena P

Mar 2, 2016, 12:06:38 PM

to Qiime 1 Forum

$ sudo cp /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join /macqiime/bin/

Password:

cp: /usr/local/Cellar/ea-utils/1.1.2-806/fastq-join: No such file or directory

TonyWalters

Mar 2, 2016, 12:12:21 PM

to Qiime 1 Forum

Rerun these commands:

brew update

brew install ea-utils

TonyWalters

Mar 2, 2016, 12:33:15 PM

to Qiime 1 Forum

Also, post the full output of those commands. When I installed ea-utils via brew, the directory listed at the end did indeed contain the fastq-join file.

Olena P

Mar 2, 2016, 12:49:56 PM

to Qiime 1 Forum

Hi Tony,

This is what I got:

$ brew update

Updated Homebrew from 3c896a6 to 2188945.

==> Updated Formulae

gdm macvim protobuf-swift

$

$ brew install ea-utils

Warning: homebrew/science/ea-utils-1.1.2-806 already installed

TonyWalters

Mar 2, 2016, 12:51:38 PM

to Qiime 1 Forum

Can you search for this folder:
ea-utils-1.1.2-806

We need to figure out where this is on your system.

Olena P

Mar 2, 2016, 12:59:11 PM

to Qiime 1 Forum

It is in the Downloads folder.

When I open it, I can actually see fastq-join.

Should I move the whole folder to the qiime folder?

TonyWalters

Mar 2, 2016, 1:02:23 PM

to Qiime 1 Forum

For now just move fastq-join (there may be one with an extension like fastq-join.cpp, we don't want that one, we just want fastq-join) to /macqiime/bin/

Olena P

Mar 2, 2016, 1:11:57 PM

to Qiime 1 Forum

Actually, it is not fastq-join but fastq-join.cpp and fastq-join.t.

Please see the attached file.

Screen Shot 2016-03-02 at 19.07.32.png

TonyWalters

Mar 2, 2016, 1:21:46 PM

to Qiime 1 Forum

Okay, open a new terminal, type macqiime, and run this command:

find /. -name 'fastq-join' 2>/dev/null

Olena P

Mar 2, 2016, 1:25:26 PM

to Qiime 1 Forum

which fastq-join

/usr/local/bin/fastq-join

:~ $ cd /usr/local/Cellar/ea-utils/

:ea-utils $ ls

1.1.2-806

:ea-utils $ cd 1.1.2-806/

:1.1.2-806 $ ls

CHANGES README

INSTALL_RECEIPT.json bin

MacQIIME LUMAC1159:1.1.2-806 $ cd bin

MacQIIME LUMAC1159:bin $ ls

alc fastq-join fastq-stats sam-stats

determine-phred fastq-mcf fastx-graph varcall

fastq-clipper fastq-multx randomFQ

$ pwd

/usr/local/Cellar/ea-utils/1.1.2-806/bin

It seems like I found it.

Olena P

Mar 2, 2016, 1:29:06 PM

to Qiime 1 Forum

$ find /. -name 'fastq-join' 2>/dev/null

/./Users/cob-opr/Applications/QIIME_program/MacQIIME_1.9.1-20150604_OS10.7/macqiime/bin/fastq-join

/./usr/local/bin/fastq-join

/./usr/local/Cellar/ea-utils/1.1.2-806/bin/fastq-join
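
With three copies on disk, what matters is which one the shell finds first. find shows every copy, while which shows only the first match on $PATH. The sketch below lists all matches in lookup order; list_on_path is a hypothetical helper, and the two throwaway directories only stand in for the old and new install locations:

```shell
# Print every executable named "$1" found in the PATH directories, in lookup order.
list_on_path() {
    (
        IFS=:
        set -f                          # no globbing while PATH is being split
        for dir in $PATH; do
            [ -n "$dir" ] || dir=.      # an empty PATH entry means the current directory
            if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
                printf '%s\n' "$dir/$1"
            fi
        done
    )
}

# Demo: two fake fastq-join scripts in temporary directories (not the real tool)
demo="$(mktemp -d)"
mkdir -p "$demo/old/bin" "$demo/new/bin"
printf '#!/bin/sh\necho old\n' > "$demo/old/bin/fastq-join"
printf '#!/bin/sh\necho new\n' > "$demo/new/bin/fastq-join"
chmod +x "$demo/old/bin/fastq-join" "$demo/new/bin/fastq-join"

# The first line printed is the copy that would actually run
( PATH="$demo/new/bin:$demo/old/bin:$PATH"; list_on_path fastq-join )
```

In bash, the built-in `type -a fastq-join` gives a similar listing without any helper.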

TonyWalters

Mar 2, 2016, 1:29:35 PM

to Qiime 1 Forum

Ah good, that does look correct.

We want to get /usr/local/Cellar/ea-utils/1.1.2-806/bin into the $PATH now (we missed the /bin/ part earlier).

Let's try this: open a new terminal, start macqiime, and run:
echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/bin/:$PATH" >> $HOME/.bashrc

source $HOME/.bashrc
which fastq-join
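
Each run of the echo command above appends another export line to .bashrc, so repeating it leaves several PATH lines in the file, including the earlier one without /bin. A small sketch of an append that skips lines already present; append_once is a hypothetical helper and rc_file is a scratch file, not the real ~/.bashrc:

```shell
# Append a line to a file only if that exact line is not already in it.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match whole lines, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

rc_file="$(mktemp)"    # stand-in for ~/.bashrc (assumption)
export_line='export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/bin:$PATH'

append_once "$export_line" "$rc_file"
append_once "$export_line" "$rc_file"   # second call changes nothing
cat "$rc_file"
```

The trailing slash in the original command is harmless; it only produces the double slash seen later in the which output.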

Olena P

Mar 2, 2016, 1:37:03 PM

to Qiime 1 Forum

echo "export PATH=/usr/local/Cellar/ea-utils/1.1.2-806/bin/:$PATH" >> $HOME/.bashrc

$ source $HOME/.bashrc

$ which fastq-join

/usr/local/Cellar/ea-utils/1.1.2-806/bin//fastq-join

It seems to be in place :)

TonyWalters

Mar 2, 2016, 1:39:24 PM

to Qiime 1 Forum

Great, so now you can try running join_paired_ends.py with the default fastq-join option again, and running the output through split_libraries_fastq.py like you did before.

Olena P

Mar 2, 2016, 2:33:16 PM

to Qiime 1 Forum

Dear Tony,

Unfortunately, it is not working.

The same problem:

join_paired_ends.py -f desktop/OP/NG-8325_02972108_lib84810_4374_1_1.fastq -r desktop/OP/NG-8325_02972108_lib84810_4374_1_2.fastq -o desktop/OP/joined_NG-8325_02972108_lib84810_4374

~ $ split_libraries_fastq.py -i desktop/OP/joined_NG-8325_02972108_lib84810_4374/fastqjoin.join.fastq -m desktop/OP/02972108.txt -o desktop/OP/splitlib_02972108_HH/ --barcode_type 'not-barcoded' --sample_id 02972108

Traceback (most recent call last):

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 365, in <module>

main()

File "/macqiime/anaconda/bin/split_libraries_fastq.py", line 344, in main

for fasta_header, sequence, quality, seq_id in seq_generator:

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 239, in process_fastq_single_end_read_file_no_barcode

phred_offset=phred_offset):

File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/split_libraries_fastq.py", line 317, in process_fastq_single_end_read_file

parse_fastq(fastq_read_f, strict=False, phred_offset=phred_offset)):

File "/macqiime/anaconda/lib/python2.7/site-packages/skbio/parse/sequences/fastq.py", line 174, in parse_fastq

seqid)

skbio.parse.sequences._exception.FastqParseError: Failed qual conversion for seq id: GAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAAGAGAGGAGCTTGCTCTTCTTGGATGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTGCCTTGTAGCGGGGGATAACTATTGGGAACGATAGCTAATACCGCATAACAATGGATGACCCATGTCATTTATTTGAAAGGGGCAAATGCTCCACTACAAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAACGGCTCACCTAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCGGCAATGGGGGCAACCCTGACCGAGCAACGCCGCGTGAGTGAAGAAGGTGTTCGGATCGTAAAGCTCTGTTGTAAGTCAAGAACGAGTGTGAGAGTGGAAAGTTCACACTGTGACGGTAGCTTACCAGAAAGGGACGGCTAACTACGTGCCAGCAGCCGCGGTAAT. This may be because you passed an incorrect value for phred_offset.

:(

/Olena
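
One observation on the traceback above: the "seq id" quoted in the error is itself a long nucleotide sequence rather than a read name. That would be consistent with the four-line FASTQ records in fastqjoin.join.fastq being out of frame (for instance after a truncated or dropped line), although the thread does not confirm the cause. A rough way to test for this, assuming unwrapped four-line records; check_fastq is a hypothetical helper, not part of QIIME or ea-utils:

```shell
# Report the first place where a file stops looking like 4-line FASTQ records.
# Exit status is 0 for a well-formed file and 1 otherwise.
check_fastq() {
    awk '
        BEGIN { bad = 0 }
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: header does not start with @\n", NR; bad = 1; exit
        }
        NR % 4 == 2 { seq_len = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: separator does not start with +\n", NR; bad = 1; exit
        }
        NR % 4 == 0 && length($0) != seq_len {
            printf "line %d: quality length differs from sequence length\n", NR; bad = 1; exit
        }
        END {
            if (!bad && NR % 4 != 0) {
                printf "file ends with an incomplete record (%d lines)\n", NR; bad = 1
            }
            exit bad
        }
    ' "$1"
}

# Demo on two tiny made-up files
tmp="$(mktemp -d)"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > "$tmp/good.fastq"
# Same data with the header of read2 missing, so the second record is out of frame
printf '@read1\nACGT\n+\nIIII\nGGCC\n+\nIIII\n' > "$tmp/bad.fastq"

check_fastq "$tmp/good.fastq" && echo "good.fastq: OK"
check_fastq "$tmp/bad.fastq" || echo "bad.fastq: malformed"
```

If the joined file passes a check like this, the phred_offset hint in the error message becomes the more likely lead.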

Olena P

Mar 2, 2016, 2:39:35 PM

to Qiime 1 Forum

Could it be caused by Python?

I got this error from another tool within the qiime software (the FLASH command):

Segmentation fault: 11

Thank you, Tony.

/Olena

TonyWalters

Mar 2, 2016, 2:43:11 PM

to Qiime 1 Forum

Okay, just to confirm, can you rerun:

which fastq-join

Just so we can be sure it's using the right one.

If so, then there is nothing further I can do to fix fastq-join, as it is not our software; QIIME is just interfacing with it. You will have to use seqprep or just use the read 1 (R1) files as input for split_libraries_fastq.py.

Olena P

Mar 2, 2016, 3:14:38 PM

to Qiime 1 Forum

which fastq-join

/usr/local/Cellar/ea-utils/1.1.2-806/bin//fastq-join

It seems to be the correct one.

If I use only the forward reads for the further steps, should I re-run all the samples that have already been analysed with F+R?

I have approx. 1000 samples, and 300 are already done.

Regards,

Olena

TonyWalters

Mar 2, 2016, 3:18:35 PM

to Qiime 1 Forum

It would be better if all sequences were processed in the same way.

You might look into multiple_split_libraries_fastq.py to help automate this process.
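
The options of multiple_split_libraries_fastq.py are not shown in this thread. As a fallback, a plain shell loop over the forward-read files can generate the same split_libraries_fastq.py call used above, once per sample. This is a dry-run sketch that only prints the commands. It assumes file names laid out like the ones in this thread (sample ID as the second underscore-separated field, forward reads ending in _1.fastq, mapping file named after the sample ID); print_split_commands and the splitlib_<id>_R1 output name are hypothetical:

```shell
# Print (do not run) one split_libraries_fastq.py command per forward-read file.
# Paths containing spaces would need extra quoting in the printed command.
print_split_commands() {
    in_dir="$1"
    for r1 in "$in_dir"/*_1.fastq; do
        [ -e "$r1" ] || continue                      # nothing matched the pattern
        sample_id="$(basename "$r1" | cut -d_ -f2)"   # e.g. 02972108
        printf '%s\n' "split_libraries_fastq.py -i $r1 -m $in_dir/$sample_id.txt -o $in_dir/splitlib_${sample_id}_R1/ --barcode_type 'not-barcoded' --sample_id $sample_id"
    done
}

# Demo with empty placeholder files named like the ones in this thread
work="$(mktemp -d)"
: > "$work/NG-8325_02972108_lib84810_4374_1_1.fastq"
: > "$work/NG-8325_02972108_lib84810_4374_1_2.fastq"
print_split_commands "$work"
```

After checking the printed lines, they could be saved to a file and run with sh, so that all samples are processed with identical settings.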

Olena P

Mar 2, 2016, 3:41:50 PM

to Qiime 1 Forum

Hi Tony!

I managed to run multiple_join_paired_ends.py but not multiple_split_libraries_fastq.py. Hopefully I will manage that too.

Thank you for all your help! I also appreciate your time!

Kind regards,

Olena
