
Problems with the “chimera checking sequences” pipeline using the usearch61 algorithm


Matías Di Paola

Jan 25, 2016, 10:03:27 AM
to Qiime 1 Forum

Because my personal computer does not have enough memory, I ran the latest QIIME AMI (ami-19118ff72) on Amazon EC2.


I want to run the “chimera checking sequences” step with the usearch61 algorithm.

After installing the correct version of usearch61 on the server, copying over the seqs.fna file produced by the split_library_output script, and downloading the reference database gg_13_5.fasta, I ran:



identify_chimeric_seqs.py -i seqs.fna -m usearch61 -o usearch_checked_chimeras/ -r gg_13_5.fasta


To my surprise, the output directory contained only two files:


seqs.fna_consensus_with_abundance.uc

seqs.fna_smallmem_clustered.log


There was no “.txt” file, so I continued with the pipeline using the “.uc” file:


filter_fasta.py -f seqs.fna -o seqs_chimeras_filtered.fna -s usearch_checked_chimeras/ seqs.fna_consensus_with_abundance.uc -n


The output file was:


seqs_chimeras_filtered.fna


When I tried to run the last step of the pipeline:


pick_otus.py -m usearch61 -i seqs_chimeras_filtered.fna -o usearch61_picked_otus/


I get the following error:


cogent.app.util.ApplicationError: Error running usearch61. Possible causes are unsupported version (current supported version is usearch v6.1.544) is installed or improperly formatted input file was provided
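
The error message names two possible causes, and the second one (an improperly formatted input file) can be checked without usearch. Below is a minimal Python sketch, assuming plain-text FASTA input. `summarize_fasta` is a hypothetical helper written for this thread, not part of QIIME, and the demo records are made up:

```python
def summarize_fasta(lines):
    """Count FASTA records and collect basic formatting problems.

    `lines` is any iterable of text lines (a list or an open file handle).
    Returns (number_of_records, list_of_problem_descriptions).
    """
    n_records = 0
    problems = []
    current_id = None   # identifier of the record being read
    current_len = 0     # number of sequence characters seen for it
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if current_id is not None and current_len == 0:
                problems.append("record %r has no sequence" % current_id)
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            if not current_id:
                problems.append("line %d: header without an identifier" % line_no)
            current_len = 0
            n_records += 1
        elif current_id is None:
            problems.append("line %d: sequence data before first header" % line_no)
        else:
            current_len += len(line)
    # Close the final record.
    if current_id is not None and current_len == 0:
        problems.append("record %r has no sequence" % current_id)
    if n_records == 0:
        problems.append("no FASTA records found")
    return n_records, problems


# Made-up demo input: the second record is deliberately empty.
demo = [">sampleA_1 some description", "ACGTACGT", ">sampleA_2", ""]
n_records, problems = summarize_fasta(demo)
print(n_records, problems)
```

To check a real file, pass an open file handle, for example `summarize_fasta(open("seqs_chimeras_filtered.fna"))`. A count of zero records, or a list of problems, would suggest looking at the filtering step before blaming pick_otus.py.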


Why is there no “.txt” file in the identify_chimeric_seqs.py output? Can I use the .uc file to continue with the pipeline, or do I have to switch to the uclust algorithm to pick the OTUs? I am not sure what kind of file the .uc is; searching the tutorial, I only found it described as an intermediate uclust (.uc) file.


I have attached 20 lines of each input file, plus the log file.


Thanks in advance for the help.

Matias

seqs_20lines.fna
seqs_chimera_filtered_20lines.fna
seqs.fna_consensus_with_abundance_20lines.uc
seqs.fna_smallmem_clustered.log

Daniel McDonald

Jan 25, 2016, 7:53:00 PM
to Qiime 1 Forum
Hi Matías,

Can you please send the output from the following commands?

print_qiime_config.py -t
which usearch

I'm a bit confused as well about the lack of the .txt file. 

The .uc file is the standard output file from usearch. However, the identify_chimeric_seqs.py script should be outputting a .txt file as I understand it. 
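
For reference, each line of a .uc file holds ten tab-separated fields: the first is a record type (for example S, H or C), the ninth is the query sequence label and the tenth is the target label. Below is a minimal Python sketch for inspecting one. `read_uc` and `count_record_types` are hypothetical helper names, and the three demo lines are invented for illustration rather than taken from the attached file:

```python
from collections import Counter

UC_FIELDS = 10  # a .uc record has ten tab-separated fields


def read_uc(lines):
    """Yield (record_type, query_label, target_label) for each .uc record."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < UC_FIELDS:
            raise ValueError(
                "expected %d tab-separated fields, got %d: %r"
                % (UC_FIELDS, len(fields), line)
            )
        # Field 1 = record type, field 9 = query label, field 10 = target label.
        yield fields[0], fields[8], fields[9]


def count_record_types(lines):
    """Count how many records of each type (S, H, C, ...) are present."""
    return Counter(rec_type for rec_type, _, _ in read_uc(lines))


# Invented demo records (not from the thread's attachment).
demo_lines = [
    "S\t0\t250\t*\t*\t*\t*\t*\tsampleA_1;size=3;\t*",
    "H\t0\t250\t99.2\t+\t0\t0\t250M\tsampleA_2;size=1;\tsampleA_1;size=3;",
    "C\t0\t2\t*\t*\t*\t*\t*\tsampleA_1;size=3;\t*",
]
print(count_record_types(demo_lines))
for rec_type, query, target in read_uc(demo_lines):
    print(rec_type, query, target)
```

Since the first field of every line is a single-letter record type rather than a sequence ID, a .uc file is unlikely to behave like a plain list of sequence IDs when passed to a script that expects one.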

Best,
Daniel

Matías Di Paola

Jan 26, 2016, 9:02:53 AM
to Qiime 1 Forum
Hi Daniel,

I have already ended my Amazon EC2 session, but I used the latest published AMI.




ubuntu@ip-172-31-1-162:~$ print_qiime_config.py -t

System information
==================
         Platform: linux2
   Python version: 2.7.3 (default, Aug  1 2012, 05:14:39)  [GCC 4.6.3]
Python executable: /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:

Dependency versions
===================
          QIIME library version: 1.9.1
           QIIME script version: 1.9.1
qiime-default-reference version: 0.1.2
                  NumPy version: 1.9.2
                  SciPy version: 0.15.1
                 pandas version: 0.16.1
             matplotlib version: 1.4.3
            biom-format version: 2.1.4
                   h5py version: 2.5.0 (HDF5 version: 1.8.4)
                   qcli version: 0.1.1
                   pyqi version: 0.3.2
             scikit-bio version: 0.2.3
                 PyNAST version: 1.2.2
                Emperor version: 0.9.51
                burrito version: 0.9.1
       burrito-fillings version: 0.1.1
              sortmerna version: SortMeRNA version 2.0, 29/11/2014
              sumaclust version: SUMACLUST Version 1.0.00
                  swarm version: Swarm 1.2.19 [May 26 2015 15:28:37]
                          gdata: Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:

                     blastmat_dir: /qiime_software/blast-2.2.22-release/data
      pick_otus_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp: start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue: friendlyq
                    jobs_to_start: 1
                       slurm_time: None
            denoiser_min_per_core: 50
assign_taxonomy_id_to_taxonomy_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir: /home/ubuntu/temp/
                     slurm_memory: None
                      slurm_queue: None
                      blastall_fp: /qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep: 1

QIIME base install test results
===============================
.........
----------------------------------------------------------------------
Ran 9 tests in 0.580s

OK

ubuntu@ip-172-31-1-162:~$ which usearch

/// No results

I downloaded usearch6.1.544_i86linux32 and copied it to the Amazon server.

Then:

sudo mv usearch* /usr/local/bin/usearch61
sudo chmod a+x /usr/local/bin/usearch61
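
The empty result from `which usearch` is expected if the binary was installed under the name usearch61, as in the two commands above; the error message from pick_otus.py also refers to it by that name. Below is a small sketch that checks both names on the current PATH. It uses only the standard library but needs Python 3.3 or later for `shutil.which`, so it would not run on the AMI's Python 2.7. `locate` is a hypothetical helper name:

```python
import os
import shutil


def locate(names, path=None):
    """Map each executable name to its full path, or None if not found.

    `path` is an optional search path string (directories separated by
    os.pathsep); when omitted, the PATH environment variable is used.
    """
    return {name: shutil.which(name, path=path) for name in names}


# Check both spellings on whatever PATH this interpreter sees.
for name, full_path in locate(["usearch", "usearch61"]).items():
    print(name, "->", full_path if full_path else "not found on PATH")
```

If usearch61 shows up as not found, the directory it was moved to is probably not on the PATH of the shell that runs the QIIME scripts.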

ubuntu@ip-172-31-1-162:~$ usearch61 
(C) Copyright 2010-12 Robert C. Edgar, all rights reserved.



For documentation, please visit:



I hope this helps
Cheers

Daniel McDonald

Jan 26, 2016, 6:20:00 PM
to Qiime 1 Forum
Okay, thanks. 

So I'm not sure why identify_chimeric_seqs.py isn't dumping out the expected .txt file. I suspect pick_otus.py is dying because filter_fasta.py doesn't (I think...) know how to handle a .uc file properly.

Inquiring more internally on this. Will follow up when I know more.

Best,
Daniel

Matías Di Paola

Jan 26, 2016, 9:52:19 PM
to Qiime 1 Forum
Thanks Daniel,

By the way, I haven't denoised the raw data (from 454). I only mention it because I am new to this kind of analysis and I don't know whether it has anything to do with this error.

Matias 

Daniel McDonald

Jan 27, 2016, 1:19:17 PM
to Qiime 1 Forum
That shouldn't be an issue.

Best,
Daniel

Daniel McDonald

Feb 1, 2016, 2:20:45 PM
to Qiime 1 Forum
Hi Matías,

I asked around, and our best guess is that identify_chimeric_seqs.py did not complete properly and terminated prematurely. Would it be possible to rerun and verify?
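
One way to verify a rerun is to list what actually landed in the output directory and confirm that a .txt file is present before moving on to filter_fasta.py. Below is a minimal Python sketch. `group_by_extension` and `has_txt_output` are hypothetical helpers, and the demo builds a throwaway directory containing only the two file names reported earlier in this thread:

```python
import os
import tempfile


def group_by_extension(out_dir):
    """Return {extension: [file names]} for the regular files in out_dir."""
    groups = {}
    for name in sorted(os.listdir(out_dir)):
        if os.path.isfile(os.path.join(out_dir, name)):
            # splitext keeps only the last extension, e.g. ".uc" or ".log".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            groups.setdefault(ext, []).append(name)
    return groups


def has_txt_output(out_dir):
    """True if the directory contains at least one .txt file."""
    return ".txt" in group_by_extension(out_dir)


# Demo: recreate the incomplete output described in the first post.
demo_dir = tempfile.mkdtemp()
for name in ("seqs.fna_consensus_with_abundance.uc",
             "seqs.fna_smallmem_clustered.log"):
    open(os.path.join(demo_dir, name), "w").close()

print(group_by_extension(demo_dir))
print("has .txt output:", has_txt_output(demo_dir))
```

On a real run, point both functions at the directory given to `-o` (usearch_checked_chimeras/ in the original command). If only the .uc and .log files appear again, the tail of the log file is the next place to look for signs of an early exit.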

Best,
Daniel

Matías Di Paola

Feb 2, 2016, 8:36:16 AM
to Qiime 1 Forum
Hi Daniel,

I will try again on a local server and let you know.

Regards,
Matias