!time pick_open_reference_otus.py -o otus/ -i slout/seqs.fna -p uc_fast_params.txt -a
qiime@qiime-190-virtual-box:~$ print_qiime_config.py -tf
System information
==================
Platform: linux2
Python version: 2.7.3 (default, Dec 18 2014, 19:10:20) [GCC 4.6.3]
Python executable: /usr/bin/python
QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2
Dependency versions
===================
QIIME library version: 1.9.1
QIIME script version: 1.9.1
qiime-default-reference version: 0.1.2
NumPy version: 1.9.2
SciPy version: 0.15.1
pandas version: 0.16.1
matplotlib version: 1.4.3
biom-format version: 2.1.4
h5py version: 2.4.0 (HDF5 version: 1.8.4)
qcli version: 0.1.1
pyqi version: 0.3.2
scikit-bio version: 0.2.3
PyNAST version: 1.2.2
Emperor version: 0.9.51
burrito version: 0.9.1
burrito-fillings version: 0.1.1
sortmerna version: SortMeRNA version 2.0, 29/11/2014
sumaclust version: SUMACLUST Version 1.0.00
swarm version: Swarm 1.2.19 [May 26 2015 13:50:14]
gdata: Installed.
RDP Classifier version (if installed): rdp_classifier-2.2.jar
Java version (if installed): 1.6.0_35
QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
http://qiime.org/install/qiime_config.html
http://qiime.org/tutorials/parallel_qiime.html
blastmat_dir: /qiime_software/blast-2.2.22-release/data
pick_otus_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
sc_queue: all.q
topiaryexplorer_project_dir: None
pynast_template_alignment_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
cluster_jobs_fp: start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
torque_queue: friendlyq
jobs_to_start: 1
slurm_time: None
denoiser_min_per_core: 50
assign_taxonomy_id_to_taxonomy_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
temp_dir: /tmp/
slurm_memory: None
slurm_queue: None
blastall_fp: /qiime_software/blast-2.2.22-release/bin/blastall
seconds_to_sleep: 1
QIIME full install test results
===============================
..........................F
======================================================================
FAIL: test_usearch_supported_version (__main__.QIIMEDependencyFull)
usearch is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/bin/print_qiime_config.py", line 650, in test_usearch_supported_version
"which components of QIIME you plan to use.")
AssertionError: usearch not found. This may or may not be a problem depending on which components of QIIME you plan to use.
----------------------------------------------------------------------
Ran 27 tests in 0.072s
FAILED (failures=1)
pick_open_reference_otus.py -o otus2/ -i slout/seqs10k.fna -p uc_fast_params.txt -aO 4
Hello Michael,

Thanks for getting in touch with us!

At the start, you mentioned that memory and CPU usage were minimal, but that was when usearch/uclust was messed up. Now that uclust is working, can you check memory again? Memory often fills up and overflows into swap, bringing everything to a crawl. This would explain why CPU usage was high at first and then dropped off after some time.
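One way to check memory while the OTU picking step runs is sketched below. It is only a rough sketch, assuming a Linux guest that exposes /proc/meminfo; the function name report_memory is made up for illustration, and summing MemFree, Buffers and Cached is an approximation of the memory that is still usable.

```shell
# Report roughly-free RAM and swap in use (in MiB) from a meminfo-style file.
# Defaults to /proc/meminfo; report_memory is a hypothetical helper name.
report_memory() {
    meminfo=${1:-/proc/meminfo}
    awk '
        $1 == "MemFree:"   { mem_free = $2 }
        $1 == "Buffers:"   { buffers = $2 }
        $1 == "Cached:"    { cached = $2 }
        $1 == "SwapTotal:" { swap_total = $2 }
        $1 == "SwapFree:"  { swap_free = $2 }
        END {
            # Values in /proc/meminfo are reported in kB.
            printf "free + buffers + cache: %d MiB\n", (mem_free + buffers + cached) / 1024
            printf "swap in use: %d MiB\n", (swap_total - swap_free) / 1024
        }
    ' "$meminfo"
}

# Usage (on the VM, in a second terminal while pick_otus.py is running):
#   report_memory
```

If the first number approaches zero while swap in use keeps growing, that would be consistent with the slowdown described above.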
You could also try using absolute file paths, like you mentioned. Several pages of the QIIME documentation demand this, but I've never had a problem with relative paths.

As for subsampling, vsearch --fastx_subsample seems like a perfect fit. Check it out:
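A minimal sketch of such a subsampling call follows. It assumes vsearch is installed separately (it is not part of the QIIME 1.9.1 configuration printed above); the wrapper name subsample_fasta and the fixed seed are illustrative choices, and the sample size of 10000 in the usage comment is only a guess based on the seqs10k.fna file name.

```shell
# Randomly subsample a FASTA file with vsearch.
# subsample_fasta is a hypothetical wrapper; vsearch must be on the PATH.
subsample_fasta() {
    input=$1
    output=$2
    n_seqs=$3
    if ! command -v vsearch >/dev/null 2>&1; then
        echo "vsearch not found on PATH" >&2
        return 127
    fi
    # --randseed fixes the random seed so the same subset is drawn on every run.
    vsearch --fastx_subsample "$input" \
            --sample_size "$n_seqs" \
            --randseed 1 \
            --fastaout "$output"
}

# Usage (file names taken from this thread, sample size assumed):
#   subsample_fasta slout/seqs.fna slout/seqs10k.fna 10000
```

Testing the workflow on a small subset first makes it cheaper to find out whether the parameters file and paths are right before committing to the full data set.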
Colin Brislawn
But you were right about uclust filling up the memory: it repeatedly crashes my browser when I run pick_open_reference_otus. Could that grind the program to a halt?
Colin
~/Desktop/miseq$ pick_open_reference_otus.py -o $PWD/otus/ -i $PWD/slout/seqs.fna -p $PWD/uc_fast_params.txt -f
Traceback (most recent call last):
File "/usr/local/bin/pick_open_reference_otus.py", line 453, in <module>
main()
File "/usr/local/bin/pick_open_reference_otus.py", line 432, in main
minimum_failure_threshold=minimum_failure_threshold)
File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/pick_open_reference_otus.py", line 713, in pick_subsampled_open_reference_otus
close_logger_on_success=False)
File "/usr/local/lib/python2.7/dist-packages/qiime/workflow/util.py", line 122, in call_commands_serially
raise WorkflowError(msg)
qiime.workflow.util.WorkflowError:
*** ERROR RAISED DURING STEP: Pick Reference OTUs
Command run was:
pick_otus.py -i /home/qiime/Desktop/miseq/slout/seqs.fna -o /home/qiime/Desktop/miseq/otus//step1_otus -r /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta -m uclust_ref --enable_rev_strand_match --suppress_new_clusters
Command returned exit status: 137
Stdout:
Stderr
Killed
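A note on reading that exit status: by shell convention, a status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9, SIGKILL. On Linux this is often, though not always, the kernel's out-of-memory killer, which would fit the memory problem discussed above; the log itself does not confirm the cause. The helper below is a hypothetical sketch for decoding such a status.

```shell
# Decode a shell exit status: values above 128 mean
# "terminated by signal (status - 128)".
# describe_exit_status is a hypothetical helper name.
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        echo "terminated by signal $signum (SIG$(kill -l "$signum"))"
    else
        echo "exited normally with status $status"
    fi
}

describe_exit_status 137
# prints: terminated by signal 9 (SIGKILL)
```

If the kernel log is readable on the VM, searching the output of dmesg for "out of memory" or "killed process" may show whether the OOM killer was responsible.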
count_seqs.py -i slout/seqs.fna
6703034 : Total
* Using this number, it becomes relatively easy to split the file in half with two Bash commands. Each sequence uses one line for the header and one line for the nucleotides, so the file has 2 x 6703034 lines and each half has 6703034 lines:
head -6703034 slout/seqs.fna > slout/seq1of2.fna
tail -6703034 slout/seqs.fna > slout/seq2of2.fna
* And finally the iterative approach:
pick_open_reference_otus.py -i $PWD/slout/seq1of2.fna,$PWD/slout/seq2of2.fna -o $PWD/otus_iter/
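The head/tail split above lands on a record boundary because the sequence count here is even. With an odd count, the same commands would cut between a header and its sequence. A sketch that computes the split point itself is shown below; it assumes every record is exactly one header line plus one sequence line (as stated above), and the function name split_fasta_in_half is made up for illustration.

```shell
# Split a two-line-per-record FASTA file into two halves at a record boundary.
# Assumes each record is exactly one header line plus one sequence line.
# split_fasta_in_half is a hypothetical helper name.
split_fasta_in_half() {
    input=$1
    out1=$2
    out2=$3
    total_lines=$(wc -l < "$input")
    if [ $((total_lines % 2)) -ne 0 ]; then
        echo "odd line count: file is not strictly two lines per record" >&2
        return 1
    fi
    records=$((total_lines / 2))
    # The first half gets ceil(records / 2) records, so nothing is lost or duplicated.
    first_lines=$(( (records + 1) / 2 * 2 ))
    head -n "$first_lines" "$input" > "$out1"
    # tail -n +K prints from line K to the end of the file.
    tail -n "+$((first_lines + 1))" "$input" > "$out2"
}

# Usage (file names from this thread):
#   split_fasta_in_half slout/seqs.fna slout/seq1of2.fna slout/seq2of2.fna
```

The function refuses to run on a file with an odd number of lines, which also catches FASTA files whose sequences are wrapped over several lines.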
Another word of caution: make sure you have enough hard-drive space available, or the VM will pause. My virtual machine has now reached about 50 GB. The problem with dynamically expanding VM hard drives is that they don't dynamically shrink again, so I'll have to clear it out (zero the free space) at some point.