pick_open_reference_otus error


ddr...@uw.edu

Aug 9, 2015, 12:05:04 PM
to Qiime Forum
Hi, I keep getting this error when running pick_open_reference_otus.py. Any ideas or insight would be helpful.
test.fna contains 100 loci. My parameters file contains:

pick_otus:enable_rev_strand_match True

pick_otus:max_accepts 1

pick_otus:max_rejects 8

pick_otus:stepwords 8

pick_otus:word_length 8
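
For readers unfamiliar with the parameters file: each line has the form script:option value, and the pick_otus.py command in the traceback below shows how these entries end up as command-line options (--max_rejects 8, --enable_rev_strand_match, and so on). Here is a minimal Python 3 sketch of that correspondence. It is not QIIME's actual parser; params_to_flags is a hypothetical helper, and the handling of True, False and missing values is an assumption based only on that command line.

```python
PARAMS = """\
pick_otus:enable_rev_strand_match True
pick_otus:max_accepts 1
pick_otus:max_rejects 8
pick_otus:stepwords 8
pick_otus:word_length 8
"""


def params_to_flags(lines, script="pick_otus"):
    """Turn 'script:option value' lines into command-line arguments for one script."""
    flags = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if ":" not in parts[0]:
            continue  # not a 'script:option' entry
        target, option = parts[0].split(":", 1)
        if target != script:
            continue  # entry belongs to a different script
        value = parts[1].strip() if len(parts) > 1 else ""
        if value in ("", "True"):
            # Assumption: boolean options become bare flags.
            flags.append("--" + option)
        elif value == "False":
            continue  # Assumption: a False option is simply left off.
        else:
            flags.extend(["--" + option, value])
    return flags


if __name__ == "__main__":
    print(" ".join(params_to_flags(PARAMS.splitlines())))
```

Comparing the printed options with the "Command run was:" line of a failing step is a quick way to confirm that the parameters file was read the way you intended.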


Thanks,
Dan Drinan


ubuntu@ip-172-31-8-6:~/meiofauna$ pick_open_reference_otus.py -o otus -i out/test.fna -p params.txt -f
Traceback (most recent call last):
  File "/usr/local/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/usr/local/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 713, in pick_subsampled_open_reference_otus
    close_logger_on_success=False)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/util.py", line 122, in call_commands_serially
    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError: 

*** ERROR RAISED DURING STEP: Pick Reference OTUs
Command run was:
 pick_otus.py -i out/test.fna -o otus/step1_otus -r /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta -m uclust_ref --max_rejects 8 --stepwords 8 --enable_rev_strand_match --word_length 8 --max_accepts 1 --suppress_new_clusters
Command returned exit status: 1
Stdout:

Stderr
Traceback (most recent call last):
  File "/usr/local/bin/pick_otus.py", line 1004, in <module>
    main()
  File "/usr/local/bin/pick_otus.py", line 924, in main
    failure_path=failure_path)
  File "/usr/local/lib/python2.7/dist-packages/qiime/pick_otus.py", line 1912, in __call__
    HALT_EXEC=HALT_EXEC)
  File "/usr/local/lib/python2.7/dist-packages/bfillings/uclust.py", line 585, in get_clusters_from_fasta_filepath
    raise ApplicationError('Error running uclust. Possible causes are '
burrito.util.ApplicationError: Error running uclust. Possible causes are unsupported version (current supported version is v1.2.22) is installed or improperly formatted input file was provided

Tony Walters

Aug 9, 2015, 12:16:38 PM
to qiime...@googlegroups.com
Hello,

Can you post the output of print_qiime_config.py?

Are you running this on a cluster, or anywhere else where you might have limited access to the /tmp/ folder?

How much memory do you have on the system?

If you type:
uclust --version
on the command line, do you get version v1.2.22?

Can you check the input test.fna file with the validate_demultiplexed_fasta.py script (http://qiime.org/scripts/validate_demultiplexed_fasta.html)?


Tony Walters

Aug 9, 2015, 5:23:27 PM
to qiime...@googlegroups.com
One other question: could you clarify what you mean by 100 loci in the input sequence file? Are these 100 different genes? The default reference database (16S) would not be ideal if your data aren't 16S.

ddr...@uw.edu

Aug 10, 2015, 12:30:03 PM
to Qiime Forum
Thanks so much for your quick response. Here's what I found.

Output from print_qiime_config.py
System information
==================
         Platform:    linux2
   Python version:    2.7.3 (default, Aug  1 2012, 05:14:39)  [GCC 4.6.3]
Python executable:    /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2

Dependency versions
===================
          QIIME library version:    1.9.1
           QIIME script version:    1.9.1
qiime-default-reference version:    0.1.2
                  NumPy version:    1.9.2
                  SciPy version:    0.15.1
                 pandas version:    0.16.1
             matplotlib version:    1.4.3
            biom-format version:    2.1.4
                   h5py version:    2.5.0 (HDF5 version: 1.8.4)
                   qcli version:    0.1.1
                   pyqi version:    0.3.2
             scikit-bio version:    0.2.3
                 PyNAST version:    1.2.2
                Emperor version:    0.9.51
                burrito version:    0.9.1
       burrito-fillings version:    0.1.1
              sortmerna version:    SortMeRNA version 2.0, 29/11/2014
              sumaclust version:    SUMACLUST Version 1.0.00
                  swarm version:    Swarm 1.2.19 [May 26 2015 15:28:37]
                          gdata:    Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
 http://qiime.org/install/qiime_config.html
 http://qiime.org/tutorials/parallel_qiime.html

                     blastmat_dir:    /qiime_software/blast-2.2.22-release/data
      pick_otus_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue:    all.q
      topiaryexplorer_project_dir:    None
     pynast_template_alignment_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp:    start_parallel_jobs.py
pynast_template_alignment_blastdb:    None
assign_taxonomy_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue:    friendlyq
                    jobs_to_start:    1
                       slurm_time:    None
            denoiser_min_per_core:    50
assign_taxonomy_id_to_taxonomy_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir:    /home/ubuntu/temp/
                     slurm_memory:    None
                      slurm_queue:    None
                      blastall_fp:    /qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep:    1


Am I running this on a cluster or anything that could prevent access to /tmp?
I'm running everything on Amazon. I found the following files in /tmp, so I think I have the proper permissions (my username is ubuntu). The machine is small (1 GB of RAM), but I'm only using 100 sequence reads, so I assumed that wouldn't be a problem.
-rw-------  1 ubuntu ubuntu  19640 Aug  5 21:06 OtuPickerKaiqns.fasta
-rw-------  1 ubuntu ubuntu  19640 Aug  4 20:43 OtuPickerWQ17S1.fasta
-rw-rw-r--  1 ubuntu ubuntu    277 Aug  5 21:06 tmp2NrCZvM4nkOJsDdQXxsq.txt
-rw-rw-r--  1 ubuntu ubuntu    274 Aug  4 20:44 tmpBTAQKK1mV9mbBuaaZWQ7.txt
-rw-rw-r--  1 ubuntu ubuntu      0 Aug  5 21:06 tmpmyFgiZ0CoNe31pfegFOa.txt
-rw-rw-r--  1 ubuntu ubuntu      0 Aug  4 20:43 tmpWhQoZ8scWMKJitTD43So.txt
-rw-------  1 ubuntu ubuntu  15247 Aug  4 20:43 UclustExactMatchFilter6JUlM_.fasta
-rw-------  1 ubuntu ubuntu  15247 Aug  5 21:06 UclustExactMatchFilterg8goPS.fasta

uclust --version output
uclust v1.2.22q


Output from validate_demultiplexed_fasta.py
This gave me a big error. To clarify, test.fna has 100 sequence reads, not loci or genes (sorry for misspeaking on that). The reads are from 16S as well as 3 other genes. I'm wondering if part of the problem is the mapping file, which I've posted below the error. Any advice would be appreciated.

ubuntu@ip-172-31-8-6:~$ validate_demultiplexed_fasta.py -i meiofauna/out/test.fna -m meiofauna/mapping.tsv
Traceback (most recent call last):
  File "/usr/local/bin/validate_demultiplexed_fasta.py", line 113, in <module>
    main()
  File "/usr/local/bin/validate_demultiplexed_fasta.py", line 109, in main
    opts.suppress_barcode_checks, opts.suppress_primer_checks)
  File "/usr/local/lib/python2.7/dist-packages/qiime/validate_demultiplexed_fasta.py", line 590, in validate_fasta
    suppress_barcode_checks, suppress_primer_checks)
  File "/usr/local/lib/python2.7/dist-packages/qiime/validate_demultiplexed_fasta.py", line 426, in run_fasta_checks
    suppress_barcode_checks, suppress_primer_checks)
  File "/usr/local/lib/python2.7/dist-packages/qiime/validate_demultiplexed_fasta.py", line 40, in get_mapping_details
    process_id_map(mapping_f)
  File "/usr/local/lib/python2.7/dist-packages/qiime/check_id_map.py", line 182, in process_id_map
    added_demultiplex_field)
  File "/usr/local/lib/python2.7/dist-packages/qiime/check_id_map.py", line 822, in check_header
    desc_ix, bc_ix, linker_primer_ix, added_demultiplex_field)
  File "/usr/local/lib/python2.7/dist-packages/qiime/check_id_map.py", line 888, in check_header_required_fields
    if (header[curr_check] != header_checks[curr_check] and
IndexError: list index out of range



Mapping file
#SampleID   BarcodeSequence LinkerPrimerSequence        Description
b10c8c11b7  RB-EL6__MB-B1-MW2__MB-B1b-EL2__RB-B2-MW6
b9c9d2b5        B1-RB-EL5__MB-B1-MW5__MB-B1-EL5__RB-B2-MW2
b11c4b2 4B-EL2__MB-B1-DL2__RB-B2-DL1
c10b5c9c2   MB-B1a-EL2__RB-B2-MW2__MB-B1-MW5__4B-EL1
b2c3c8  RB-B2-DL1__4B-EL4__MB-B1-MW2
c11b7c6b11  MB-B1b-EL2__RB-B2-MW6__MB-B1-DL4__4B-EL2
b2c8b8  RB-B2-DL1__MB-B1-MW2__RB-B2-EL3
c2c6b3  4B-EL1__MB-B1-DL4__RB-B2-DL3
b3b10c7 RB-B2-DL3__RB-EL6__MB-B1-MW1
c3c5c10b4   4B-EL4__MB-B1-DL3__MB-B1a-EL2__RB-B2-DL6
b3c4b10 RB-B2-DL3__MB-B1-DL2__RB-EL6
c4b11b7 MB-B1-DL2__4B-EL2__RB-B2-MW6
b4b9c9  RB-B2-DL6__B1-RB-EL5__MB-B1-MW5
c5c3b4  MB-B1-DL3__RB-B2-EL3__RB-B2-DL6
b4c7b9  RB-B2-DL6__MB-B1-MW1__B1-RB-EL5
c6c2b3  MB-B1-DL4__4B-EL1__RB-B2-DL3
b5c10c2c4   RB-B2-MW2__MB-B1a-EL2__4B-EL1__MB-B1-DL2
c7b8b6c10   MB-B1-MW1__4B-EL4__RB-B2-MW5__MB-B1a-EL2
b6d2b11 RB-B2-MW5__MB-B1-EL5__4B-EL2__MB-B1-DL3
c8b10b5c11  MB-B1-MW2__RB-EL6__RB-B2-MW2__MB-B1b-EL2
b7c11b8c6   RB-B2-MW6__MB-B1b-EL2__RB-B2-EL3__MB-B1-DL4
c9b9b2d2        MB-B1-MW5__B1-RB-EL5__RB-B2-DL1__MB-B1-EL5

Greg Caporaso

Aug 10, 2015, 12:49:52 PM
to Qiime Forum
Hi Dan,
Could you try running on an Amazon instance with more memory? The 1 GB might be the limiting factor, as the full reference database needs to be loaded into memory for open-reference OTU picking. I'd recommend going with at least 8 GB of memory.

Greg
 

Tony Walters

Aug 10, 2015, 3:00:15 PM
to qiime...@googlegroups.com
Hello,

In addition to what Greg is saying, it may be worth setting the temp folder to be something other than /tmp/, in case there are restrictions on the size of the files. See this page for setting the temp folder in the environment: https://github.com/biocore/burrito-fillings/issues/55#issuecomment-91041061
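
To see what a candidate temp folder actually offers before pointing QIIME at it, something like the following may help. This is a minimal Python 3 sketch for inspecting the machine, not part of QIIME (which runs on Python 2.7 in this thread); describe_temp_dir is a hypothetical helper. With no argument it checks the directory that Python's tempfile module picks, which honors the TMPDIR environment variable.

```python
import shutil
import tempfile


def describe_temp_dir(path=None):
    """Report whether a directory is writable and how much free space it has."""
    path = path or tempfile.gettempdir()
    usage = shutil.disk_usage(path)
    try:
        # Create and delete a real file rather than trusting permission bits.
        with tempfile.NamedTemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False
    return {"path": path, "writable": writable, "free_gb": usage.free / 1e9}


if __name__ == "__main__":
    # Check the default location, then any directory you plan to use instead.
    print(describe_temp_dir())
```

Passing a path such as the temp_dir value from the print_qiime_config.py output checks that directory instead of the default.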

For the mapping file, you want to make sure everything is tab-separated. I can't tell from the pasted data whether that is the case for your mapping metadata, since tabs don't come through clearly.
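
A quick way to check this is to count the tab-separated fields on every line. The IndexError in the traceback above is raised while indexing the header, which would be consistent with a header that splits into fewer tab-separated fields than expected, though that is only a guess from the traceback. Below is a minimal Python 3 sketch; check_mapping_columns is a hypothetical helper and is no substitute for QIIME's own mapping file validation.

```python
import csv
import io


def check_mapping_columns(handle):
    """Return (header, problems); problems holds (line_number, field_count) pairs."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    problems = []
    for line_number, row in enumerate(reader, start=2):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) != len(header):
            problems.append((line_number, len(row)))
    return header, problems


if __name__ == "__main__":
    # Illustrative data only: a four-column header and a two-column row.
    example = io.StringIO(
        "#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\n"
        "b2c3c8\tRB-B2-DL1__4B-EL4__MB-B1-MW2\n"
    )
    header, problems = check_mapping_columns(example)
    print("header fields:", len(header))
    print("rows with a different field count:", problems)
```

A header that reports a single field usually means the file is space-separated rather than tab-separated.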

-Tony


ddr...@uw.edu

Aug 11, 2015, 1:56:30 PM
to Qiime Forum
Tony and Greg, thanks so much for your responses. I've since tried it on a new machine with 30 GB of memory, and I changed the temporary directory to one where I definitely have 'rwx' permissions. Unfortunately, I now get the following error. Any ideas?


ubuntu@amazon:~$ pick_open_reference_otus.py -o otus -i out/test.fna -p params.txt -f

Traceback (most recent call last):
  File "/usr/local/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/usr/local/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 1071, in pick_subsampled_open_reference_otus
    status_update_callback=status_update_callback)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 327, in align_and_tree
    close_logger_on_success=close_logger_on_success)

  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/util.py", line 122, in call_commands_serially
    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Filter alignment
Command run was:
 filter_alignment.py -o otus/pynast_aligned_seqs -i otus/pynast_aligned_seqs/rep_set_aligned.fasta
Command returned exit status: 1
Stdout:

Stderr
Traceback (most recent call last):
  File "/usr/local/bin/filter_alignment.py", line 155, in <module>
    main()
  File "/usr/local/bin/filter_alignment.py", line 108, in main
    raise ValueError("An empty fasta file was provided. "
ValueError: An empty fasta file was provided. Did the alignment complete sucessfully? Did PyNAST discard all sequences due to too-stringent minimum length or minimum percent ID settings?

Tony Walters

Aug 11, 2015, 2:05:20 PM
to qiime...@googlegroups.com
That is a different error than the previous one, indicating that OTU picking was completed. The representative sequences from OTU picking are failing to align.

What sort of sequences are these? Are they 16S? It looks like your version of the QIIME default reference is up to date (see https://qiime.wordpress.com/2015/04/16/bug-fix-qiime-1-9-0-pynast-default-reference-alignment/) so there shouldn't be an issue aligning reads for 16S data.
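
If it helps to look at the input before re-running, here is a minimal Python 3 sketch that summarizes a FASTA file: how many records it has, how many are shorter than a chosen length, and how many contain characters other than A, C, G, T or N. The names read_fasta and summarize_fasta are hypothetical, the min_length default of 100 is an arbitrary placeholder rather than any PyNAST setting, and valid IUPAC ambiguity codes are also counted in the non_acgtn total.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence text may span several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize_fasta(handle, min_length=100):
    """Count records, short records, and records with non-ACGTN characters."""
    total = short = odd = 0
    for _, seq in read_fasta(handle):
        total += 1
        if len(seq) < min_length:
            short += 1
        if set(seq.upper()) - set("ACGTN"):
            odd += 1
    return {"records": total, "short": short, "non_acgtn": odd}


if __name__ == "__main__":
    # Illustrative data only; open your own file instead of this string.
    demo = io.StringIO(">read1\nACGTACGT\n>read2\nAC-XT\n")
    print(summarize_fasta(demo, min_length=5))
```

A file where most records are very short, empty, or full of unexpected characters would be worth inspecting before aligning.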


ddr...@uw.edu

Aug 11, 2015, 3:39:19 PM
to Qiime Forum
You're absolutely right. There was a problem with my file of sequences. I tested it with some known sequences and the error went away. I appreciate your help.

Best,
Dan


