Problems with the “chimera checking sequences” pipeline using the usearch61 algorithm


Matías Di Paola

Jan 25, 2016, 10:03:27 AM
to Qiime 1 Forum

Due to a lack of memory on my personal computer, I ran the latest QIIME AMI (ami-19118ff72) on Amazon EC2.


I want to perform the “chimera checking sequences” step with the usearch61 algorithm.

After installing the correct version of usearch61 on the server, copying over the seqs.fna file from the split_library_output step, and downloading the reference database gg_13_5.fasta, I ran:



identify_chimeric_seqs.py -i seqs.fna -m usearch61 -o usearch_checked_chimeras/ -r gg_13_5.fasta


To my surprise, there were only two output files:


seqs.fna_consensus_with_abundance.uc

seqs.fna_smallmem_clustered.log


There was no “.txt” file, so I continued with the pipeline using the “.uc” file:


filter_fasta.py -f seqs.fna -o seqs_chimeras_filtered.fna -s usearch_checked_chimeras/seqs.fna_consensus_with_abundance.uc -n


The output file was:


seqs_chimeras_filtered.fna


When I tried to run the last step of the pipeline:


pick_otus.py -m usearch61 -i seqs_chimeras_filtered.fna -o usearch61_picked_otus/


I got the following error:


cogent.app.util.ApplicationError: Error running usearch61. Possible causes are unsupported version (current supported version is usearch v6.1.544) is installed or improperly formatted input file was provided


Why is there no “.txt” file in the identify_chimeric_seqs.py output? Can I use the .uc file to continue with the pipeline, or do I have to switch to the uclust algorithm to pick the OTUs? I am not sure what kind of file the .uc is; from searching the tutorial, I found only that .uc files are intermediate uclust files.


I have attached 20 lines of each input file, plus the log file.


Thanks in advance for the help.

Matias

seqs_20lines.fna
seqs_chimera_filtered_20lines.fna
seqs.fna_consensus_with_abundance_20lines.uc
seqs.fna_smallmem_clustered.log

Daniel McDonald

Jan 25, 2016, 7:53:00 PM
to Qiime 1 Forum
Hi Matías,

Can you please send the output from the following commands?

print_qiime_config.py -t
which usearch

I'm a bit confused as well about the lack of the .txt file. 

The .uc file is the standard output file from usearch. However, the identify_chimeric_seqs.py script should be outputting a .txt file as I understand it. 
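For readers unfamiliar with the format, here is a minimal sketch of reading a .uc file in Python, assuming the usual 10-column tab-separated layout (record type, cluster number, length or size, percent identity, strand, two unused columns, alignment, query label, target label). The field names and the example records are made up for illustration; they are not taken from the attached files.

```python
from collections import Counter

# Illustrative names for the 10 tab-separated .uc columns (assumed layout).
UC_FIELDS = (
    "record_type", "cluster", "size_or_length", "pct_identity", "strand",
    "unused_1", "unused_2", "alignment", "query_label", "target_label",
)


def parse_uc_line(line):
    """Return one .uc record as a dict, or None for blank/comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    parts = line.split("\t")
    if len(parts) != len(UC_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(UC_FIELDS), len(parts)))
    return dict(zip(UC_FIELDS, parts))


def summarize_uc(lines):
    """Count records by type (S = seed/centroid, H = hit, C = cluster summary)."""
    records = (parse_uc_line(line) for line in lines)
    return Counter(rec["record_type"] for rec in records if rec is not None)


# Made-up example records, for illustration only.
example = [
    "# comment line",
    "S\t0\t250\t*\t*\t*\t*\t*\tsampleA_1;size=3;\t*",
    "H\t0\t250\t99.2\t+\t0\t0\t250M\tsampleA_2;size=1;\tsampleA_1;size=3;",
    "C\t0\t2\t*\t*\t*\t*\t*\tsampleA_1;size=3;\t*",
]
print(summarize_uc(example))
```

The point is only that a .uc file describes clusters (which read hit which centroid); it is not a list of chimeric sequence IDs.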

Best,
Daniel

Matías Di Paola

Jan 26, 2016, 9:02:53 AM
to Qiime 1 Forum
Hi Daniel,

I have already ended my Amazon EC2 session, but I used the latest published AMI.




ubuntu@ip-172-31-1-162:~$ print_qiime_config.py -t

System information
==================
         Platform: linux2
   Python version: 2.7.3 (default, Aug  1 2012, 05:14:39)  [GCC 4.6.3]
Python executable: /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:

Dependency versions
===================
          QIIME library version: 1.9.1
           QIIME script version: 1.9.1
qiime-default-reference version: 0.1.2
                  NumPy version: 1.9.2
                  SciPy version: 0.15.1
                 pandas version: 0.16.1
             matplotlib version: 1.4.3
            biom-format version: 2.1.4
                   h5py version: 2.5.0 (HDF5 version: 1.8.4)
                   qcli version: 0.1.1
                   pyqi version: 0.3.2
             scikit-bio version: 0.2.3
                 PyNAST version: 1.2.2
                Emperor version: 0.9.51
                burrito version: 0.9.1
       burrito-fillings version: 0.1.1
              sortmerna version: SortMeRNA version 2.0, 29/11/2014
              sumaclust version: SUMACLUST Version 1.0.00
                  swarm version: Swarm 1.2.19 [May 26 2015 15:28:37]
                          gdata: Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:

                     blastmat_dir: /qiime_software/blast-2.2.22-release/data
      pick_otus_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp: start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue: friendlyq
                    jobs_to_start: 1
                       slurm_time: None
            denoiser_min_per_core: 50
assign_taxonomy_id_to_taxonomy_fp: /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir: /home/ubuntu/temp/
                     slurm_memory: None
                      slurm_queue: None
                      blastall_fp: /qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep: 1

QIIME base install test results
===============================
.........
----------------------------------------------------------------------
Ran 9 tests in 0.580s

OK

ubuntu@ip-172-31-1-162:~$ which usearch

/// No results

I downloaded usearch6.1.544_i86linux32 and copied it to the Amazon server.

And then:

sudo mv usearch* /usr/local/bin/usearch61
sudo chmod a+x /usr/local/bin/usearch61
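Note that the binary here was installed under the name usearch61, so "which usearch" finding nothing is expected; the name to look up is usearch61. A minimal Python sketch of the same check, using only the standard library (the helper name describe is hypothetical):

```python
import shutil


def describe(name):
    """Report where an executable resolves on PATH, if anywhere."""
    path = shutil.which(name)
    return "%s -> %s" % (name, path if path else "not found on PATH")


# Check both the name that was queried and the name that was installed.
for name in ("usearch", "usearch61"):
    print(describe(name))
```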

ubuntu@ip-172-31-1-162:~$ usearch61 
(C) Copyright 2010-12 Robert C. Edgar, all rights reserved.



For documentation, please visit:



I hope this helps
Cheers

Daniel McDonald

Jan 26, 2016, 6:20:00 PM
to Qiime 1 Forum
Okay, thanks. 

So I'm not sure why identify_chimeric_seqs.py isn't dumping out the expected .txt file. I suspect that pick_otus.py is dying because filter_fasta.py doesn't (I think) know how to handle a .uc file properly.
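To illustrate that suspicion: assuming filter_fasta.py -s treats the first tab-delimited field of each line as a sequence ID (an assumption about its behavior, not something confirmed in this thread), a .uc file would yield record types such as S and H rather than real IDs. A rough sketch with made-up records:

```python
def first_fields(lines):
    """First tab-delimited field of each non-blank, non-comment line."""
    fields = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            fields.add(line.split("\t")[0])
    return fields


def fasta_ids(lines):
    """Sequence IDs: the first whitespace-delimited token after '>'."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


# Made-up records for illustration only (not taken from the attached files).
uc_lines = [
    "S\t0\t250\t*\t*\t*\t*\t*\tsampleA_1;size=3;\t*",
    "H\t0\t250\t99.2\t+\t0\t0\t250M\tsampleA_2;size=1;\tsampleA_1;size=3;",
]
fasta_lines = [
    ">sampleA_1 orig_bc=ACGT",
    "ACGTACGT",
    ">sampleA_2 orig_bc=ACGT",
    "ACGTTTGT",
]

ids_seen_by_filter = first_fields(uc_lines)
overlap = ids_seen_by_filter & fasta_ids(fasta_lines)
print(sorted(ids_seen_by_filter), sorted(overlap))
```

Under that assumption, none of the "IDs" read from the .uc file match any sequence in the FASTA file, so the filtering step would not remove the intended sequences.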

Inquiring more internally on this. Will follow up when I know more.

Best,
Daniel

Matías Di Paola

Jan 26, 2016, 9:52:19 PM
to Qiime 1 Forum
Thanks Daniel,

By the way, I haven't denoised the raw data (which comes from 454). I only mention it because I am new to this kind of analysis and I don't know whether it has something to do with this error.

Matias 

Daniel McDonald

Jan 27, 2016, 1:19:17 PM
to Qiime 1 Forum
That shouldn't be an issue.

Best,
Daniel

Daniel McDonald

Feb 1, 2016, 2:20:45 PM
to Qiime 1 Forum
Hi Matías,

I asked around, and our best guess is that identify_chimeric_seqs.py did not complete properly and terminated prematurely. Would it be possible to rerun and verify?
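One rough way to verify a rerun is to check the output directory for the expected .txt file before moving on. The sketch below assumes the file is named chimeras.txt, as in the QIIME chimera checking tutorial; this thread only says ".txt", so treat that name (and the helper missing_outputs) as assumptions. The demo uses a temporary directory holding just the two files that were reported.

```python
import os
import tempfile


def missing_outputs(out_dir, expected=("chimeras.txt",)):
    """Return the expected file names that are absent from out_dir."""
    present = set(os.listdir(out_dir))
    return [name for name in expected if name not in present]


# Demo: a temporary directory containing only the two files reported above.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("seqs.fna_consensus_with_abundance.uc",
                 "seqs.fna_smallmem_clustered.log"):
        open(os.path.join(demo_dir, name), "w").close()
    missing = missing_outputs(demo_dir)
    if missing:
        print("Run looks incomplete; missing:", ", ".join(missing))
    else:
        print("Expected outputs are present.")
```

In real use, out_dir would be the directory passed to -o (usearch_checked_chimeras/ in the original command).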

Best,
Daniel

Matías Di Paola

Feb 2, 2016, 8:36:16 AM
to Qiime 1 Forum
Hi Daniel,

I will try again on a local server and let you know.

Regards,
Matias