Setting SILVA as reference database for clustering


nlh15

May 17, 2016, 1:56:05 PM
to Qiime 1 Forum

Hello,

I am having some issues using SILVA 123 as the reference database for the script pick_open_reference_otus.py. I created a parameter file that looks correct to me (see attached), but I am unsure how I am supposed to direct the script to use SILVA. So far I have just typed in "SILVA 123"; I have SILVA version 123 downloaded and unzipped, but I am not getting the expected results, so I figured I'd check whether I was doing something wrong while setting up this script. Is there more to the installation of SILVA on my desktop? Do I have to enter the path to the directory where SILVA is located in the parameter file? Help is greatly appreciated!

// Nine


Parameter_file

Greg Caporaso

May 18, 2016, 12:34:35 PM
to Qiime 1 Forum
Hi Nine,
There are a few things you'll need to do if you'd like to use Silva as your reference database instead of Greengenes with pick_open_reference_otus.py. In the notes below, I'm using $SILVADIR to refer to my unzipped directory of Silva reference files (for me, that's /Users/caporaso/data/SILVA123_QIIME_release, but it will be different for you). 

First, you'll want to pass the attached parameters file. This specifies that Silva should be used for the taxonomy assignment and reference-based alignment steps in the workflow. Next, you'll want to pass:

-r $SILVADIR/rep_set/rep_set_all/97/97_otus.fasta

to pick_open_reference_otus.py to specify that Silva should be used as the reference database for alignment. I tested this locally, and my command looked like the following:

pick_open_reference_otus.py -i seqs1.fna -p silva-123-params.txt -r $SILVADIR/rep_set/rep_set_all/97/97_otus.fasta -o or-silva123-out

I've attached my seqs1.fna file here as well; it would be a good idea to test with this first to make sure you have everything correct, as it will only take a couple of minutes to run. Once it's working, you can swap in your sequences file.
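Note that $SILVADIR in the commands above is an ordinary shell variable, so it has to be set in your session before the commands will resolve; a minimal sketch (the path below is an example, not yours):

```shell
# $SILVADIR is a plain shell variable; set it once per session before
# running the workflow commands. Substitute your own unzipped location.
export SILVADIR="$HOME/SILVA123_QIIME_release"

# Confirm the reference file actually resolves before starting a long run:
if [ -f "$SILVADIR/rep_set/rep_set_all/97/97_otus.fasta" ]; then
  echo "reference found"
else
  echo "reference not found -- check where you unzipped the Silva release"
fi
```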

Finally, you should review the file Silva_123_notes.txt, and specifically see the note about "Consensus and Majority Taxonomies". In the parameters file attached here, I'm having you just pass the taxonomy_7_levels.txt file, but you might want to consider comparing to the corresponding consensus and majority files (you would make that change by editing the parameters file). 

Best,
Greg
seqs1.fna
silva-123-params.txt

nlh15

May 18, 2016, 8:56:40 PM
to Qiime 1 Forum
After following your instructions and running the script on the attached seqs1.fna file, I get the following error message:

 raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Assign taxonomy
Command run was:
 parallel_assign_taxonomy_uclust.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_REV_OTU/rep_set.fna -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_REV_OTU/uclust_assigned_taxonomy -T --jobs_to_start 1 --reference_seqs_fp $SILVADIR/rep_set/rep_set_all/97/97_otus.fasta --id_to_taxonomy_fp $SILVADIR/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt --rdp_max_memory 6000
Command returned exit status: 2
Stdout:

Stderr
Error in parallel_assign_taxonomy_uclust.py: option --reference_seqs_fp: file does not exist: '/rep_set/rep_set_all/97/97_otus.fasta'

Attached is my edited parameter file, and below is the full command as I attempted to run it:

uwfcedb@uwfcedb:~/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70$ pick_open_reference_otus.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/seqs1.fna -f -a -v -p /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/Parameter_file -r /home/uwfcedb/SILVA123_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_REV_OTU

My Silva release is the same one you described earlier, and it is located in my home directory; could that be an issue?
I really appreciate your help in this matter!

Greg Caporaso

May 19, 2016, 10:43:12 AM
to Qiime 1 Forum
Hello,
In your parameters file you'll need to replace $SILVADIR with /home/uwfcedb/SILVA123_QIIME_release (as you did on the command line). 
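If editing by hand feels error-prone, sed can do the substitution; a demo sketch on a scratch file (assumes GNU sed; in practice you would run the sed line directly on your own Parameter_file):

```shell
# Demo on a scratch copy: replace every literal "$SILVADIR" token with the
# real path. The single quotes and the backslash keep the shell and sed
# from expanding $SILVADIR as a variable.
printf 'assign_taxonomy:reference_seqs_fp $SILVADIR/rep_set/rep_set_all/97/97_otus.fasta\n' > params_demo.txt
sed -i 's|\$SILVADIR|/home/uwfcedb/SILVA123_QIIME_release|g' params_demo.txt
cat params_demo.txt   # every path is now absolute
rm params_demo.txt
```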

Greg

nlh15

May 24, 2016, 3:34:28 PM
to Qiime 1 Forum
Greg,

I replaced the $SILVADIR (my bad, rookie mistake) and attempted to run the script again, but this time it returned the following message:

uwfcedb@uwfcedb:~/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70$ pick_open_reference_otus.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/seqs1.fna -f -a -v -p /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/Parameter_file -r /home/uwfcedb/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU
Pick Reference OTUs
parallel_pick_otus_uclust_ref.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/seqs1.fna -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus -r /home/uwfcedb/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta -T --jobs_to_start 1 --enable_rev_strand_match
Generate full failures fasta file
filter_fasta.py -f /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/seqs1.fna -s /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/seqs1_failures.txt -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/failures.fasta
Pick rep set
pick_rep_set.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/seqs1_otus.txt -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/step1_rep_set.fna -f /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/seqs1.fna
Pick de novo OTUs on step1 failures
pick_otus.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/failures.fasta -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step4_otus/ -m uclust  --denovo_otu_id_prefix New.CleanUp.ReferenceOTU --enable_rev_strand_match
Merge OTU maps
cat /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/seqs1_otus.txt  /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step4_otus//failures_otus.txt > /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/final_otu_map.txt
Pick representative set for subsampled failures
pick_rep_set.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step4_otus//failures_otus.txt -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step4_otus//step4_rep_set.fna -f /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/step1_otus/failures.fasta
Make the otu table
make_otu_table.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/final_otu_map_mc2.txt -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/otu_table_mc2.biom
Assign taxonomy
parallel_assign_taxonomy_uclust.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/rep_set.fna -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/uclust_assigned_taxonomy -T --jobs_to_start 1 --reference_seqs_fp /home/uwfcedb/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta --id_to_taxonomy_fp /home/uwfcedb/SILVA123_QIIME_release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt --rdp_max_memory 6000
Traceback (most recent call last):
  File "/usr/local/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/usr/local/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 1030, in pick_subsampled_open_reference_otus
    status_update_callback=status_update_callback)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 232, in assign_tax
    close_logger_on_success=close_logger_on_success)
  File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/util.py", line 122, in call_commands_serially

    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Assign taxonomy
Command run was:
 parallel_assign_taxonomy_uclust.py -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/rep_set.fna -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/uclust_assigned_taxonomy -T --jobs_to_start 1 --reference_seqs_fp /home/uwfcedb/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta --id_to_taxonomy_fp /home/uwfcedb/SILVA123_QIIME_release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt --rdp_max_memory 6000

Command returned exit status: 2
Stdout:

Stderr
Error in parallel_assign_taxonomy_uclust.py: no such option: --rdp_max_memory

Am I failing to tell the script that I want to use Silva, or is there another issue causing it to bring up uclust again?

Again, thank you for all your help!

//Nine

Parameter_file

Greg Caporaso

May 24, 2016, 5:50:35 PM
to Qiime 1 Forum
Hi Nine, 
You just need to remove this line from your parameters file:

assign_taxonomy:rdp_max_memory 6000

Sorry, I mistakenly thought you were using the RDP classifier. 
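If you'd rather not edit by hand, filtering the line out with grep works too; a demo sketch on a scratch file (in practice you would run the grep line on your own parameters file):

```shell
# Demo on a scratch copy: drop the rdp_max_memory line, keep everything else.
printf 'assign_taxonomy:rdp_max_memory 6000\npick_otus:enable_rev_strand_match True\n' > params_demo.txt
grep -v 'rdp_max_memory' params_demo.txt > params_fixed.txt
cat params_fixed.txt   # only the enable_rev_strand_match line remains
rm params_demo.txt params_fixed.txt
```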

Greg

nlh15

May 26, 2016, 1:33:49 PM
to Qiime 1 Forum
Greg,

Thank you for all your help with this matter. I removed the assign_taxonomy:rdp_max_memory 6000 line and ran the script again; however, this time it returned another error:

*** ERROR RAISED DURING STEP: Filter alignment
Command run was:
 filter_alignment.py -o /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/pynast_aligned_seqs -i /home/uwfcedb/Desktop/3125Raw/ThesisExecuted/Reads/Filtered30_70/SILVA_enrev_OTU/pynast_aligned_seqs/rep_set_aligned.fasta --allowed_gap_frac 0.80 --entropy_threshold 0.10 --suppress_lane_mask_filter
Command returned exit status: 1
Stdout:

Stderr

Traceback (most recent call last):
  File "/usr/local/bin/filter_alignment.py", line 155, in <module>
    main()
  File "/usr/local/bin/filter_alignment.py", line 148, in main
    entropy_threshold=opts.entropy_threshold):
  File "/usr/local/lib/python2.7/dist-packages/qiime/filter_alignment.py", line 78, in apply_lane_mask_and_gap_filter
    entropy_mask = generate_lane_mask(aln, entropy_threshold)
  File "/usr/local/lib/python2.7/dist-packages/qiime/filter_alignment.py", line 146, in generate_lane_mask
    interpolation='nearest')
TypeError: percentile() got an unexpected keyword argument 'interpolation'

Any ideas on how to resolve this issue? Again, thank you for your help!

// Nine

Greg Caporaso

May 26, 2016, 3:49:52 PM
to Qiime 1 Forum
Hi Nine,
Can you run print_qiime_config.py and post the output please?

Greg

nlh15

May 26, 2016, 5:08:02 PM
to Qiime 1 Forum
Greg,

uwfcedb@uwfcedb:~$ print_qiime_config.py

System information
==================
         Platform:    linux2
   Python version:    2.7.6 (default, Jun 22 2015, 17:58:13)  [GCC 4.8.2]
Python executable:    /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.3

Dependency versions
===================
          QIIME library version:    1.9.1
           QIIME script version:    1.9.1
qiime-default-reference version:    0.1.3
                  NumPy version:    1.8.2
                  SciPy version:    0.17.0
                 pandas version:    0.18.0
             matplotlib version:    1.5.1
            biom-format version:    2.1.5
                   h5py version:    2.6.0 (HDF5 version: 1.8.11)
                   qcli version:    0.1.1
                   pyqi version:    0.3.2
             scikit-bio version:    0.2.3
                 PyNAST version:    1.2.2
                Emperor version:    0.9.51
                burrito version:    0.9.1
       burrito-fillings version:    0.1.1
              sortmerna version:    SortMeRNA version 2.0, 29/11/2014
              sumaclust version:    SUMACLUST Version 1.0.00
                  swarm version:    Swarm 1.2.19 [Apr 14 2016 14:29:24]
                          gdata:    Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
 http://qiime.org/install/qiime_config.html
 http://qiime.org/tutorials/parallel_qiime.html

                     blastmat_dir:    /home/uwfcedb/qiime_software/blast-2.2.22-release/data
      pick_otus_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue:    all.q
      topiaryexplorer_project_dir:    None
     pynast_template_alignment_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp:    start_parallel_jobs.py
pynast_template_alignment_blastdb:    None
assign_taxonomy_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue:    friendlyq
                    jobs_to_start:    1
                       slurm_time:    None
            denoiser_min_per_core:    50
assign_taxonomy_id_to_taxonomy_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir:    /tmp/
                     slurm_memory:    None
                      slurm_queue:    None
                      blastall_fp:    /home/uwfcedb/qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep:    1
 
I'm running a native full install; this is the output of print_qiime_config.py -tf:

QIIME full install test results
===============================
....FFF..F.................
======================================================================
FAIL: test_blast_supported_version (__main__.QIIMEDependencyFull)
blast is in path and version is supported
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/local/bin/print_qiime_config.py", line 470, in test_blast_supported_version
    % ('.'.join(map(str, acceptable_version)), version_string))
AssertionError: Unsupported blast version. 2.2.22 is required, but running 2.2.26.

======================================================================
FAIL: test_blastall_fp (__main__.QIIMEDependencyFull)
blastall_fp is set to a valid path
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/local/bin/print_qiime_config.py", line 449, in test_blastall_fp
    test_qiime_config_variable("blastall_fp", self.config, self, X_OK)
  File "/usr/local/bin/print_qiime_config.py", line 730, in test_qiime_config_variable
    (variable, fp))
AssertionError: blastall_fp set to an invalid file path: /home/uwfcedb/qiime_software/blast-2.2.22-release/bin/blastall

======================================================================
FAIL: test_blastmat_dir (__main__.QIIMEDependencyFull)
blastmat_dir is set to a valid path.
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/local/bin/print_qiime_config.py", line 236, in test_blastmat_dir
    test_qiime_config_variable("blastmat_dir", self.config, self)
  File "/usr/local/bin/print_qiime_config.py", line 730, in test_qiime_config_variable
    (variable, fp))
AssertionError: blastmat_dir set to an invalid file path: /home/uwfcedb/qiime_software/blast-2.2.22-release/data

======================================================================
FAIL: test_chimeraSlayer_install (__main__.QIIMEDependencyFull)
no obvious problems with ChimeraSlayer install
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/local/bin/print_qiime_config.py", line 429, in test_chimeraSlayer_install
    self.assertTrue(chim_slay, "ChimeraSlayer was not found in your $PATH")
AssertionError: ChimeraSlayer was not found in your $PATH

----------------------------------------------------------------------
Ran 27 tests in 0.959s

FAILED (failures=4)


Greg Caporaso

May 27, 2016, 6:55:54 PM
to Qiime 1 Forum
Hello,
This is an issue with your numpy version: QIIME 1.9.1 requires numpy >= 1.9.0, and you're running numpy 1.8.2. You'll need to update numpy to continue. Depending on how QIIME is installed, you may be able to run: pip install numpy --upgrade.
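To check whether the upgrade took, you can compare the installed version against the requirement; a small sketch (assumes plain X.Y.Z version strings and GNU sort's -V version sort):

```shell
# sort -V orders version strings numerically; if the smaller of the two
# versions is the required one, the installed version is new enough.
required="1.9.0"
installed=$(python -c "import numpy; print(numpy.__version__)" 2>/dev/null || echo "0")
if [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]; then
  echo "numpy $installed is new enough"
else
  echo "numpy $installed is too old; run: pip install numpy --upgrade"
fi
```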

If that doesn't work for you, and you don't have another way to upgrade numpy, we posted some new draft installation instructions today, which are available here. This installation procedure should be straightforward and would address those issues, so it might be worth trying.

Sorry for all of the trouble!

Best,
Greg

nlh15

Jun 2, 2016, 3:07:55 PM
to Qiime 1 Forum
Greg,

Thank you for all of your help; I believe I was able to upgrade numpy to a satisfactory version. Lastly, I was hoping to get your input on whether the script ran correctly (i.e., whether it used Silva as the reference or not). This is the output from my combined_seqs_otus.log file:

OtuPicker parameters:
Application:uclust
Similarity:0.97
chimeras_retention:union
enable_rev_strand_matching:True
exact:False
max_accepts:1
max_rejects:8
new_cluster_identifier:denovo
next_new_cluster_number:1
optimal:False
output_dir:/home/nlh15/Test3125/Clustered/step1_otus/POTU_Zsbm_
prefilter_identical_sequences:True
presort_by_abundance:True
save_uc_files:True
stable_sort:True
stepwords:8
suppress_new_clusters:True
suppress_sort:True
word_length:8
Reference seqs:/home/nlh15/SILVA123_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta
Num OTUs:844
Num new OTUs:0
Num failures:2465
Result path: /home/nlh15/Test3125/Clustered/step1_otus/POTU_Zsbm_/POTU_Zsbm_.0_otus.txt
OtuPicker parameters:
Application:uclust
Similarity:0.97
chimeras_retention:union
enable_rev_strand_matching:True
exact:False
max_accepts:1
max_rejects:8
new_cluster_identifier:denovo
next_new_cluster_number:1
optimal:False
output_dir:/home/nlh15/Test3125/Clustered/step1_otus/POTU_Zsbm_
prefilter_identical_sequences:True
presort_by_abundance:True
save_uc_files:True
stable_sort:True
stepwords:8
suppress_new_clusters:True
suppress_sort:True
word_length:8
Reference seqs:/home/nlh15/SILVA123_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta
Num OTUs:761
Num new OTUs:0
Num failures:2580
Result path: /home/nlh15/Test3125/Clustered/step1_otus/POTU_Zsbm_/POTU_Zsbm_.1_otus.txt
OtuPicker parameters:
Application:uclust
Similarity:0.97
chimeras_retention:union
enable_rev_strand_matching:True
exact:False
max_accepts:1
max_rejects:8
new_cluster_identifier:denovo
next_new_cluster_number:1

Again really appreciate your help and inputs!

//Nine

Greg Caporaso

Jun 6, 2016, 5:50:30 PM
to Qiime 1 Forum
Hello,
That looks right. 

Greg

nlh15

Jun 7, 2016, 2:56:22 PM
to Qiime 1 Forum
Thank you Greg, you have been a great help!

//Nine

daniel....@gmail.com

Aug 22, 2016, 7:59:11 PM
to Qiime 1 Forum
Hi Greg,
I just did what you explained above, including the Silva123_param.txt file, and my log file looks like this:

qiime_config values:
pick_otus_reference_seqs_fp /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
sc_queue all.q
pynast_template_alignment_fp /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
cluster_jobs_fp start_parallel_jobs.py
assign_taxonomy_reference_seqs_fp /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
torque_queue friendlyq
jobs_to_start 1
denoiser_min_per_core 50
assign_taxonomy_id_to_taxonomy_fp /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
temp_dir /tmp/
blastall_fp blastall
seconds_to_sleep 60

parameter file values:
align_seqs.py:template_fp /macqiime/SILVA123_QIIME-release/core_alignment/core_alignment_SILVA123.fasta
filter_alignment:allowed_gap_frac 0.80
filter_alignment:entropy_threshold 0.10
filter_alignment:suppress_lane_mask_filter True
assign_taxonomy:reference_seqs_fp /macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/99/99_otus.fasta
assign_taxonomy:id_to_taxonomy_fp /macqiime/SILVA123_QIIME-release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt
parallel:jobs_to_start 8
pick_otus:enable_rev_strand_match True


Is it OK that the qiime_config values still show the Greengenes reference? Should I change them to point to Silva 123, e.g.:

pick_otus_reference_seqs_fp /macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/97/97_otus.fasta
sc_queue all.q
pynast_template_alignment_fp /macqiime/SILVA123_QIIME-release/core_alignment/core_alignment_SILVA123.fasta
cluster_jobs_fp start_parallel_jobs.py
assign_taxonomy_reference_seqs_fp /macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/97/97_otus.fasta
torque_queue friendlyq
jobs_to_start 1
denoiser_min_per_core 50
assign_taxonomy_id_to_taxonomy_fp /macqiime/SILVA123_QIIME-release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt
temp_dir /tmp/
blastall_fp blastall

Thanks, 
Daniel

Greg Caporaso

Aug 23, 2016, 11:35:53 AM
to Qiime 1 Forum
Hi Daniel,
This looks right now. The qiime_config values are the defaults, but you're overriding them with the parameters file.

Greg

Daniel Laubitz

Aug 23, 2016, 12:26:03 PM
to Qiime 1 Forum
Thank you Greg!

Melanie Lloyd

Feb 16, 2017, 1:52:37 PM
to Qiime 1 Forum
Hi Greg,

I have a question about the specific SILVA files you're using in this code. Why is it that you use -r /rep_set_all/97/97_otus.fasta and assign_taxonomy:reference_seqs_fp $SILVADIR/rep_set/rep_set_all/99/99_otus.fasta?

What is the difference between the files in the 90, 94, 97, and 99 directories? 

Thanks!
Melanie

TonyWalters

Feb 16, 2017, 2:46:22 PM
to Qiime 1 Forum
Hello Melanie,

Those numbers represent the clustering identity of the reference reads, so there will be more reference reads at 99 than at 97 (but using them will take more resources). I can't speak to Daniel's specific reasoning here, but you might be able to differentiate certain reads at 99% identity that you wouldn't at 97% identity.

David Seminario

Apr 14, 2017, 10:02:20 AM
to Qiime 1 Forum

Good morning, colleagues. I have read your comments to solve my problem, but I cannot get this command to work; I would appreciate your help. Thank you very much.


I am currently using the silva128 database.


pick_open_reference_otus.py -i Split_Library_ITS/seqs.fna -p silva_128_parametr.txt -r SILVA_128_QIIME_release/rep_set/rep_set_18S_only/97/97_otus_18S.fasta -o silva_out



Parameter file (parameter.txt):


align_seqs.py:template_fp SILVA_128_QIIME_release/core_alignment/core_alignment_SILVA128.fna
filter_alignment:allowed_gap_frac 0.80
filter_alignment:entropy_threshold 0.10
filter_alignment:suppress_lane_mask_filter True
assign_taxonomy:reference_seqs_fp SILVA_128_QIIME_release/rep_set/rep_set_18S_only/97/97_otus_18S.fasta
assign_taxonomy:id_to_taxonomy_fp SILVA_128_QIIME_release/taxonomy/18S_only/97/majority_taxonomy_7_levels.txt
parallel:jobs_to_start 8
pick_otus:enable_rev_strand_match True



Error on Mac OS X:


MacQIIME MacBook-Pro-de-David:qiime_bernabe $ pick_open_reference_otus.py -i Split_Library_ITS/seqs.fna -p example2.txt -r $/macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/99/99_otus.fasta -o example2

Error in pick_open_reference_otus.py: option -r: file does not exist: '$/macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/99/99_otus.fasta'


If you need help with QIIME, see:

http://help.qiime.org

MacQIIME MacBook-Pro-de-David:qiime_bernabe $ pick_open_reference_otus.py -i Split_Library_ITS/seqs.fna -p example2.txt -r SILVA_128_QIIME_release/rep_set/rep_set_18S_only/97/97_otus_18S.fasta -o example2

Traceback (most recent call last):
  File "/macqiime/anaconda/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/macqiime/anaconda/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 1071, in pick_subsampled_open_reference_otus
    status_update_callback=status_update_callback)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 327, in align_and_tree
    close_logger_on_success=close_logger_on_success)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/workflow/util.py", line 122, in call_commands_serially
    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Filter alignment
Command run was:
 filter_alignment.py -o example2/pynast_aligned_seqs -i example2/pynast_aligned_seqs/rep_set_aligned.fasta --allowed_gap_frac 0.80 --entropy_threshold 0.10 --suppress_lane_mask_filter

Jai Ram Rideout

Apr 14, 2017, 5:07:12 PM
to Qiime 1 Forum
Hi David,

Here's the relevant part of your error message:

Error in pick_open_reference_otus.py: option -r: file does not exist: '$/macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/99/99_otus.fasta'


The file path $/macqiime/SILVA123_QIIME-release/rep_set/rep_set_all/99/99_otus.fasta does not exist. You'll need to find where your Silva reference sequences are located on your filesystem and provide that path to the script.
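A quick way to verify a candidate path before launching the (long-running) workflow, and to search for the file if the guess is wrong; a sketch (the REF value is just an example, and the find pattern assumes the standard QIIME-release file names):

```shell
# Check a candidate reference path without starting the workflow:
REF="/macqiime/SILVA_128_QIIME_release/rep_set/rep_set_18S_only/97/97_otus_18S.fasta"
if [ -f "$REF" ]; then
  echo "found: $REF"
else
  echo "missing: $REF"
  # Search the home directory for the file instead (may take a moment):
  find "$HOME" -name '97_otus_18S.fasta' 2>/dev/null
fi
```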

Best,
Jai

Patricia Bovio

May 2, 2017, 9:42:24 AM
to Qiime 1 Forum
Hi Greg,

I'm trying to use your method to analyze my sequences against the Silva database, using this parameter file:


pick_otus:enable_rev_strand_match True
align_seqs.py:template_fp $SILVADIR/core_alignment/core_alignment_SILVA123.fasta

filter_alignment:allowed_gap_frac 0.80
filter_alignment:entropy_threshold 0.10
filter_alignment:suppress_lane_mask_filter True
assign_taxonomy:reference_seqs_fp /home/lem/patricia_bovio/SILVA_128_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta
assign_taxonomy:id_to_taxonomy_fp /home/lem/patricia_bovio/SILVA_128_QIIME_release/taxonomy/taxonomy_all/97/majority_taxonomy_all_levels.txt

But I got the following error:

$ pick_open_reference_otus.py -i NCBI_chloroflex_200_2000bp_2_gedit -o otus4 -r /home/lem/patricia_bovio/SILVA_128_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta -p /home/lem/patricia_bovio/NCBI_chloroflex_200_2000bp/Analisis_completo_c_silva/otus4/silva_128_params_2.txt -f

Traceback (most recent call last):
  File "/home/lem/miniconda2/envs/qiime1/bin/pick_open_reference_otus.py", line 4, in <module>
    __import__('pkg_resources').run_script('qiime==1.9.1', 'pick_open_reference_otus.py')
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 744, in run_script
   
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 1499, in run_script
   
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 1071, in pick_subsampled_open_reference_otus
    status_update_callback=status_update_callback)
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 327, in align_and_tree
    close_logger_on_success=close_logger_on_success)
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime/workflow/util.py", line 122, in call_commands_serially

    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Filter alignment
Command run was:
 filter_alignment.py -o otus4/pynast_aligned_seqs -i otus4/pynast_aligned_seqs/rep_set_aligned.fasta --allowed_gap_frac 0.80 --entropy_threshold 0.10 --suppress_lane_mask_filter

Command returned exit status: 1
Stdout:

Stderr

Traceback (most recent call last):
  File "/home/lem/miniconda2/envs/qiime1/bin/filter_alignment.py", line 4, in <module>
    __import__('pkg_resources').run_script('qiime==1.9.1', 'filter_alignment.py')
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 744, in run_script
   
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 1499, in run_script
   
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/filter_alignment.py", line 155, in <module>
    main()
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/filter_alignment.py", line 148, in main
    entropy_threshold=opts.entropy_threshold):
  File "/home/lem/miniconda2/envs/qiime1/lib/python2.7/site-packages/qiime/filter_alignment.py", line 87, in apply_lane_mask_and_gap_filter
    raise ValueError("Positional filtering resulted in removal of all "
ValueError: Positional filtering resulted in removal of all alignment positions.

What could be the issue?
Thanks!
Patricia

Jai Ram Rideout

May 2, 2017, 4:20:29 PM
to Qiime 1 Forum
Hi Patricia,

This line in your parameters file isn't formatted correctly, so QIIME is using the default template alignment instead of the Silva template.

align_seqs.py:template_fp $SILVADIR/core_alignment/core_alignment_SILVA123.fasta

You'll need to drop .py from the script name. Try changing it to:

align_seqs:template_fp $SILVADIR/core_alignment/core_alignment_SILVA123.fasta

I'm not sure if environment variables are allowed in parameter files. If you're still having issues getting the template alignment to work, try specifying an absolute file path to your Silva template alignment instead of using $SILVADIR.
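The underlying issue can be seen in a two-line shell demo: the parameters file is read as plain text, so a $SILVADIR written inside it is never expanded, while expanding it at the time you write the file produces a path QIIME can open (the path below is an example):

```shell
export SILVADIR=/home/user/SILVA123_QIIME_release   # example path, not yours

# Single quotes: the token stays literal -- this is what QIIME sees when
# it reads a parameters file containing $SILVADIR.
echo 'align_seqs:template_fp $SILVADIR/core_alignment/core_alignment_SILVA123.fasta'

# Double quotes: the shell expands the variable now, so the written line
# would contain an absolute path instead.
echo "align_seqs:template_fp $SILVADIR/core_alignment/core_alignment_SILVA123.fasta"
```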

Best,
Jai

Samuel Major

May 4, 2017, 7:29:03 PM
to Qiime 1 Forum
Hello, 
So I'm trying to use a similar method of picking OTUs:
pick_open_reference_otus.py -i F16_16S/1_qlab/combined_seqs.fna -o F16_16S/5_SILVA/silva123_otus -r SILVA123_QIIME_release/rep_set/rep_set_16S_only/97/97_otus_16S.fasta -p F16_16S/silva-123-params.txt

With my Parameters file set as previously described:

pick_otus:enable_rev_strand_match True
align_seqs:template_fp SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta
filter_alignment:allowed_gap_frac 0.80
filter_alignment:entropy_threshold 0.10
filter_alignment:suppress_lane_mask_filter True
assign_taxonomy:reference_seqs_fp $SILVADIR/rep_set/rep_set_all/99/99_otus.fasta
assign_taxonomy:id_to_taxonomy_fp $SILVADIR/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt

The error I'm receiving, however, is:
File "/home/qiime/miniconda3/envs/qiime1/lib/python2.7/site-packages/burrito_fillings-0.1.1-py2.7.egg/bfillings/uclust.py", line 585, in get_clusters_from_fasta_filepath
burrito.util.ApplicationError: Error running uclust. Possible causes are unsupported version (current supported version is v1.2.22) is installed or improperly formatted input file was provided

but when I enter the command: uclust --version I get
--uclust v1.2.22q

I don't think the input file is improperly formatted, because I ran pick_open_reference_otus.py with the default settings on the same file and did not receive this error message. Is there something wrong with my parameters file?

Any insight into what the issue might be?
Thank you
Sam

Greg Caporaso

unread,
May 5, 2017, 3:31:55 PM5/5/17
to Qiime 1 Forum
Hi Sam,
I'm not sure if this is the issue, but can you try replacing your $SILVADIR environment variable with the absolute path that you're trying to reference with that environment variable? The environment variable might be getting handed off to uclust, which might not know how to expand it. If making that change doesn't work, can you reply back and include the log file that is generated when you run pick_open_reference_otus.py? 

Also, I think you may avoid a future error if you use an absolute path for align_seqs:template_fp (you're currently using a relative path).
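If you're unsure what the absolute path is, one way to build it is from the current working directory. A sketch, assuming the relative path from the parameters file above sits under the directory you run from:

```shell
# Build an absolute path for the Silva template from the current
# directory, then emit the parameters line to paste into the file.
SILVA_REL="SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta"
SILVA_ABS="$(pwd)/$SILVA_REL"
echo "align_seqs:template_fp $SILVA_ABS"
```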

Thanks!
Greg

Samuel Major

unread,
May 5, 2017, 6:55:39 PM5/5/17
to Qiime 1 Forum
Hi Greg, 
Thanks for the response. I'm getting a new error now...

pick_open_reference_otus.py -i F16_16S/1_qlab/combined_seqs.fna -o F16_16S/5_SILVA_uclust/silva123_otus -r SILVA123_QIIME_release/rep_set/rep_set_16S_only/97/97_otus_16S.fasta -p F16_16S/silva-123-params.txt

File "/home/qiime/miniconda3/envs/qiime1/lib/python2.7/site-packages/burrito-0.9.1-py2.7.egg/burrito/util.py", line 284, in __call__
IOError: [Errno 2] No such file or directory: '/tmp/tmpe5JGGanxjTimeK73xUxk.txt'

I've attached my parameters file and the log file.
Regarding the parameters file: assign_taxonomy uses the rep set and taxonomy files at 99% identity, but the -r flag points at the 97% sequences. I don't assume this is a problem, but could you explain this as well?

Thank you
Sam


silva-123-params.txt
log_20170505152449.txt

Greg Caporaso

unread,
May 8, 2017, 6:15:49 PM5/8/17
to Qiime 1 Forum
Hi Sam,
Can you try passing absolute file paths for the -i and -r parameters that you're providing to pick_open_reference_otus.py? uclust may be having an issue finding one or both of those files. 

The differences in the percent id reference files that you're providing here won't be causing that issue. Generally I use the same percent id as my clustering threshold for the OTU picking reference database, which is what you're doing here (i.e., you're clustering at 97%, and providing the 97% Silva OTUs). I generally use the same percent id reference database for taxonomy assignment, but using the 99% shouldn't cause any problems. It's not immediately clear if this would perform better or worse - it really needs to be studied - but in my experience you probably won't see a very big difference if you use the 97% or 99% OTU reference for taxonomy assignment. 

Ira Ayu Lestari

unread,
May 13, 2017, 10:43:50 AM5/13/17
to Qiime 1 Forum
Hi all,
How do I make the filter_alignment.py command run against the SILVA database?
I'm about to do OTU picking with pick_de_novo_otus.py, but since its default database is Greengenes, I run the steps separately (pick_otus.py --> assign_taxonomy.py --> make_otu_table.py --> align_seqs.py --> filter_alignment.py --> make_phylogeny.py) and use the -r or -t option to point to SILVA.
I see on the QIIME website that the filter_alignment.py default database is Greengenes.

Thank you for your help,

Best wishes,
Ira

Jai Ram Rideout

unread,
May 15, 2017, 7:12:24 PM5/15/17
to Qiime 1 Forum
Hi Ira,

You can accomplish this with a QIIME parameters file. Once you've written the parameters file you can supply it to pick_de_novo_otus.py via the -p option.
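To make that concrete, a minimal parameters file for pick_de_novo_otus.py might look like the following sketch. The settings mirror those used earlier in this thread; the /absolute/path/to prefix is a placeholder for wherever your Silva release is unzipped:

```
align_seqs:template_fp /absolute/path/to/SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta
filter_alignment:suppress_lane_mask_filter True
filter_alignment:entropy_threshold 0.10
filter_alignment:allowed_gap_frac 0.80
assign_taxonomy:reference_seqs_fp /absolute/path/to/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta
assign_taxonomy:id_to_taxonomy_fp /absolute/path/to/SILVA123_QIIME_release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt
```

Save it as, for example, silva-params.txt and pass it to pick_de_novo_otus.py with -p silva-params.txt.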

Best,
Jai

Sara Dunaj

unread,
Jun 6, 2017, 8:09:20 AM6/6/17
to Qiime 1 Forum
Hi,

I too am using the Silva database to pick OTUs, assign taxonomy and run alignments. Below are the details of my parameter file:

_seqs.py:template_fp    /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta

filter_alignment:allowed_gap_frac    0.80
filter_alignment:entropy_threshold    0.10
filter_alignment:suppress_lane_mask_filter    True
assign_taxonomy:reference_seqs_fp    /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta
assign_taxonomy:id_to_taxonomy_fp    /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt
parallel:jobs_to_start    1
pick_otus:enable_rev_strand_match    True

However, when I look at the log files for each step in the process, it looks like the default Greengenes database is being used for the OTU clustering and alignment steps. I have attached the overall log file and the alignment (rep_set) log files.

Also, when I tried to directly pass the SILVA file as the reference for pick_open_reference_otus.py, I got several errors:

MacQIIME Sara-Dunajs-MacBook-Pro:Split_Lib_FullReads_attachedBC_q20 $ pick_open_reference_otus.py -m usearch61 -p /Users/Sara_Jeanne/Desktop/QIIME/Silva_RefSet_PickOTUs_Param.txt -o 20170531_Full_Reads_otus_USEARCH61_SILVA99_RERUN2 -i seqs_silva99_chimeras_filtered.fna -r /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta

Traceback (most recent call last):
  File "/macqiime/anaconda/bin/pick_open_reference_otus.py", line 453, in <module>
    main()
  File "/macqiime/anaconda/bin/pick_open_reference_otus.py", line 432, in main
    minimum_failure_threshold=minimum_failure_threshold)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/workflow/pick_open_reference_otus.py", line 713, in pick_subsampled_open_reference_otus
    close_logger_on_success=False)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/workflow/util.py", line 122, in call_commands_serially
    raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:

*** ERROR RAISED DURING STEP: Pick Reference OTUs
Command run was:
 pick_otus.py -i seqs_silva99_chimeras_filtered.fna -o 20170531_Full_Reads_otus_USEARCH61_SILVA99_RERUN2/step1_otus -r /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta -m usearch61_ref --enable_rev_strand_match --suppress_new_clusters

Command returned exit status: 1
Stdout:

Stderr
Traceback (most recent call last):
  File "/macqiime/anaconda/bin/pick_otus.py", line 1004, in <module>
    main()
  File "/macqiime/anaconda/bin/pick_otus.py", line 897, in main
    otu_prefix=otu_prefix, HALT_EXEC=False)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/pick_otus.py", line 1800, in __call__
    HALT_EXEC=HALT_EXEC
  File "/macqiime/anaconda/lib/python2.7/site-packages/bfillings/usearch.py", line 1844, in usearch61_ref_cluster
    raise ApplicationError('Error running usearch61. Possible causes are '
burrito.util.ApplicationError: Error running usearch61. Possible causes are unsupported version (current supported version is usearch v6.1.544) is installed or improperly formatted input file was provided



Thank you for your help with this.

Sara

rep_set_log.txt
log_20170121155835.txt

Embriette

unread,
Jun 6, 2017, 11:58:02 AM6/6/17
to Qiime 1 Forum
Hi Sara,

A few things might be going on here, so let's tackle them one at a time. First, I noticed that the first line of your parameters file is written as follows:

_seqs.py:template_fp    /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta

Is this correct, or did you make a mistake when copy-pasting? If this is correct, the command cannot find your Silva reference because you haven't fully specified the parameter name (align_seqs). You also need to remove the ".py" ending. Try that and see if that solves your problem.

As for inputting directly into the open ref workflow, your error:

burrito.util.ApplicationError: Error running usearch61. Possible causes are unsupported version (current supported version is usearch v6.1.544) is installed or improperly formatted input file was provided

This indicates that you don't have the correct version of usearch installed. Can you verify which version you have? If you don't have v6.1.544, install it and then try again.

Thanks! Let us know if this works.

Embriette

Sara Dunaj

unread,
Jun 6, 2017, 1:47:22 PM6/6/17
to qiime...@googlegroups.com
Hi Embriette,

Thank you for your time and help with this; I really appreciate it. I have updated my parameter file (there was indeed a copy-paste error) and pasted it below:

pick_otus:enable_rev_strand_match True
align_seqs:template_fp /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/core_alignment/core_alignment_SILVA123.fasta

assign_taxonomy:reference_seqs_fp /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/rep_set/rep_set_all/99/99_otus.fasta
assign_taxonomy:id_to_taxonomy_fp /Users/Sara_Jeanne/Desktop/QIIME/SILVA123_QIIME_release/taxonomy/taxonomy_all/99/taxonomy_7_levels.txt

I also have both versions of usearch installed and have used them successfully in past runs:

Split_Lib_FullReads_attachedBC_q20 $ usearch -version
usearch v5.2.236

Split_Lib_FullReads_attachedBC_q20 $ usearch61 -version
usearch_i86osx32 v6.1.544

I am uncertain why it isn't working now. I have printed the current configuration below:

Split_Lib_FullReads_attachedBC_q20 $ print_qiime_config.py

System information
==================
         Platform:    darwin
   Python version:    2.7.10 |Anaconda 2.2.0 (x86_64)| (default, May 28 2015, 17:04:42)  [GCC 4.2.1 (Apple Inc. build 5577)]
Python executable:    /macqiime/anaconda/bin/python


QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:


Dependency versions
===================
          QIIME library version:    1.9.1
           QIIME script version:    1.9.1
qiime-default-reference version:    0.1.2
                  NumPy version:    1.9.2
                  SciPy version:    0.15.1
                 pandas version:    0.16.1
             matplotlib version:    1.4.3
            biom-format version:    2.1.4
                   h5py version:    2.4.0 (HDF5 version: 1.8.14)

                   qcli version:    0.1.1
                   pyqi version:    0.3.2
             scikit-bio version:    0.2.3
                 PyNAST version:    1.2.2
                Emperor version:    0.9.51
                burrito version:    0.9.1
       burrito-fillings version:    0.1.1
              sortmerna version:    SortMeRNA version 2.0, 29/11/2014
              sumaclust version:    SUMACLUST Version 1.0.00
                  swarm version:    Swarm 1.2.19 [Jun  2 2015 14:40:16]

                          gdata:    Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
 http://qiime.org/install/qiime_config.html
 http://qiime.org/tutorials/parallel_qiime.html

                     blastmat_dir:    None
      pick_otus_reference_seqs_fp:    /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue:    all.q
      topiaryexplorer_project_dir:    None
     pynast_template_alignment_fp:    /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp:    start_parallel_jobs.py
pynast_template_alignment_blastdb:    None

assign_taxonomy_reference_seqs_fp:    /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue:    friendlyq
                    jobs_to_start:    1
                       slurm_time:    None

            denoiser_min_per_core:    50
assign_taxonomy_id_to_taxonomy_fp:    /macqiime/anaconda/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir:    /tmp/
                     slurm_memory:    None
                      slurm_queue:    None
                      blastall_fp:    blastall
                 seconds_to_sleep:    60
MacQIIME MacBook-Pro:Split_Lib_FullReads_attachedBC_q20 $ which usearch
/macqiime/bin/usearch
MacQIIME MacBook-Pro:Split_Lib_FullReads_attachedBC_q20 $ usearch


USEARCH 5.2.236
(C) Copyright 2010-11 Robert C. Edgar, all rights reserved.
http://drive5.com/usearch

License:


Common commands
===============
Clustering de novo (default is global alignment):
  usearch -cluster seqs.sorted.fasta -uc results.uc -id 0.97 [-usersort]
    Specify -usersort if input is not sorted by length.
    Not recommended for OTU clustering. See manual.

Database search (default is local alignment):
  usearch -query q.fasta -evalue 0.01 -blast6out results.b6
    -db db.fasta | -udb db.udb [-threads n] | -wdb db.wdb

Search + clustering of seqs that don't match (default is global alignment):
  usearch -cluster seqs.sorted.fasta -db db.fasta -id 0.97 [-uc results.uc]
    [-seedsout seeds.fasta] [-consout cons.fasta]

Create udb or wdb database index:
  usearch -makeudb db.fasta -output db.udb
  usearch -makewdb db.fasta -output db.wdb

Dereplication, removing identical full-length sequences (does not search reverse strand):
  usearch -derep_fullseq -cluster input.fasta -seedsout nr.fasta [-bithash] [-sizeout]

Dereplication, removing identical sub-sequences:
  usearch -derep_subseq -cluster input.fasta -seedsout nr.fasta
    -w 32 -slots 40000003 [-sizeout]

Chimera detection (UCHIME ref. db. mode):
  usearch -uchime q.fasta [-db db.fasta] [-chimeras ch.fasta]
    [-nonchimeras good.fasta] [-uchimeout results.uch] [-uchimealns results.alns]

Chimera detection (UCHIME de novo mode):
  usearch -uchime amplicons.fasta [-chimeras ch.fasta] [-nonchimeras good.fasta]
     [-uchimeout results.uch] [-uchimealns results.alns]
  Input is estimated amplicons with integer abundances specified using ";size=N".

Sort sequences by length:
  usearch -sort seqs.fasta -output seqs.sorted.fasta
  usearch -mergesort seqs.fasta -output seqs.sorted.fasta [-split S]
    Use -mergesort if too big for -sort. S is partition size in Mb, default 1000.0.

Sort sequences by cluster size/abundance specified by ";size=N" in label:
  usearch -sortsize seqs.fasta -output seqs.sorted.fasta [-minsize n]

Output files
============
All formats are supported for clustering and searching.
-uc file           UCLUST format, tab-separated.
-blastout file     Human-readable verbose format similar to BLAST.
-blast6out file    Tab-separated, same as -outfmt 6 or -m8 option of NCBI BLAST.
-userout file      Tab-separated, fields specified by -userfields (see manual).
-seedsout file     FASTA file with cluster seeds, i.e. non-redundant version of input.
-consout file      FASTA file with consensus sequence for each cluster.
-fastapairs file   FASTA file with pair-wise alignments.

Search termination
==================
-maxaccepts N       Max accepted targets, 0=ignore, default 1.
-maxrejects N       Max rejected targets, 0=ignore, default 32.
-[no]usort          [Do not] test database sequences in U-sorted order. If -nousort is
                       specified, the entire database is searched and termination options
                       are ignored. Default is -usort.

Accept/reject criteria
======================
Criteria are combined with AND.

-id F
    Minimum identity, as a value 0.0 to 1.0, meaning 0% to 100% identity. No default value.
    The -iddef option specifies definiton of identity (see manual).

-evalue E
    Maximum E-value. Local alignments only. No default value.

-query[aln]fract F
    Minimum fraction of the query sequence covered by alignment. Default 0.0.

-target[aln]fract F
    Minimum fraction of the target sequence covered by alignment. Default 0.0.

-idprefix n / -idsuffix n
    First (last) n letters of the query must be identical to the target. Default 0.

-leftjust / -rightjust
    Left (right) terminal gaps cause reject. Recommended to use -idprefix if you
    use -leftjust or -idsuffix if you use -rightjust.

Alignment style
===============
-global             Default if -cluster is specified.
-local              Default if -query is specified.

Compressed index
================
-slots n           Size of compressed index table. Should be prime, e.g. 40000003.
                    Should also specify -w, typical is -w 16 to 32.

Misc.
=====
-quiet             Do not write progress messages to standard error.
-log filename      Write log file with information about parameters and performance.
-version           Show program version number and exit.
-help              This help.

See manual for more options.


Thank you very much,

Sara

Embriette

unread,
Jun 7, 2017, 5:57:35 PM6/7/17
to Qiime 1 Forum
Hi Sara,

Will you please send me your corrected parameters file so I can take a look?

Thanks!

Embriette

Sara Dunaj

unread,
Jun 15, 2017, 11:24:16 AM6/15/17
to qiime...@googlegroups.com
Hi Embriette,

My apologies for the delayed reply. I found the issue was partly due to using the 99% identity SILVA dataset; it ran successfully with the 97% identity dataset. I was also able to get it to run with the 99% identity dataset from SILVA 128.

Thank you,

Sara
Silva_RefSet_PickOTUs_Param.txt