excluding sequences

Aga

Oct 30, 2012, 1:41:03 PM
to qiime...@googlegroups.com
Dear Qiime users,

I was trying to exclude some sequences from my repr_set_seqs file using the command below:

exclude_seqs_by_blast.py -i repr_set_seqspnpA.fna -d ref_seq_setpnpA.fna -o exclude_seqs/

I got the error below:

exclude_seqs_by_blast.py: error: Please check -o option: cannot write to output file

I tried several different output paths, but none of them worked. I would appreciate any suggestions.

Best
Aga

Tony Walters

Oct 30, 2012, 1:45:34 PM
to qiime...@googlegroups.com
Hello Aga,

Can you try running it like this:
exclude_seqs_by_blast.py -i repr_set_seqspnpA.fna -d ref_seq_setpnpA.fna -o excluded_seqs

Unlike the majority of the QIIME scripts, this one takes an output filepath (rather than a directory ending with the / character), so hopefully this change will fix it for you.

-Tony
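
To illustrate the file-versus-directory distinction described above, here is a minimal Python sketch of a check one could run on a value before passing it to -o. The helper name looks_like_directory is hypothetical and is not part of QIIME; it only flags values that end in a path separator or that already exist as a directory.

```python
import os


def looks_like_directory(output_path):
    """Return True if output_path would be read as a directory, not a file."""
    # A trailing separator (as in "exclude_seqs/") marks a directory.
    if output_path.endswith(("/", os.sep)):
        return True
    # An existing directory cannot be opened for writing as a file either.
    return os.path.isdir(output_path)


# The two -o values from this thread.
for candidate in ("exclude_seqs/", "excluded_seqs"):
    kind = "directory" if looks_like_directory(candidate) else "file path"
    print(f"{candidate!r} looks like a {kind}")
```

This is only a sanity check on the string and the local filesystem; it does not reproduce whatever validation the script itself performs.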


Kowalczyk, Agnieszka

Oct 30, 2012, 1:49:46 PM
to qiime...@googlegroups.com
Hi Tony,
I tried that and got this message:

Traceback (most recent call last):
  File "/macqiime/QIIME/bin/exclude_seqs_by_blast.py", line 245, in <module>
    main()
  File "/macqiime/QIIME/bin/exclude_seqs_by_blast.py", line 194, in main
    options.percent_aligned, DEBUG=DEBUG)
  File "/macqiime/lib/python2.7/site-packages/qiime/exclude_seqs_by_blast.py", line 118, in find_homologs
    DEBUG=DEBUG)
  File "/macqiime/lib/python2.7/site-packages/qiime/exclude_seqs_by_blast.py", line 81, in blast_genome
    blast_mat_root=blast_mat_root)
  File "/macqiime/lib/python2.7/site-packages/cogent/app/blast.py", line 665, in blast_seqs
    HALT_EXEC=HALT_EXEC)
  File "/macqiime/lib/python2.7/site-packages/cogent/app/blast.py", line 402, in __init__
    HALT_EXEC=HALT_EXEC)
  File "/macqiime/lib/python2.7/site-packages/cogent/app/blast.py", line 167, in __init__
    raise RuntimeError, blastmat_error_message
RuntimeError: BLAST cannot run if the BLASTMAT environment variable is not set.

Usually, the BLASTMAT environment variable points to the NCBI data directory,
which contains matrices like PAM30 and PAM70, etc.

Alternatively, you may create a .ncbirc file to define these variables.

From help file:

2) Create a .ncbirc file. In order for Standalone BLAST to operate, you
have will need to have a .ncbirc file that contains the following lines:

[NCBI] 
Data="path/data/"

Where "path/data/" is the path to the location of the Standalone BLAST
"data" subdirectory. For Example: 

Data=/root/blast/data

The data subdirectory should automatically appear in the directory where
the downloaded file was extracted. Please note that in many cases it may
be necessary to delimit the entire path including the machine name and
or the net work you are located on. Your systems administrator can help
you if you do not know the entire path to the data subdirectory.

Make sure that your .ncbirc file is either in the directory that you
call the Standalone BLAST program from or in your root directory.

Thanks
Aga

Tony Walters

Oct 30, 2012, 1:54:52 PM
to qiime...@googlegroups.com
That sounds like BLAST isn't installed or configured. Can you run print_qiime_config.py -t and post the results?
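
As a quick complement to that report, the settings named in the error message can also be inspected directly. The following is a rough Python sketch, assuming a Unix-like system; the function name blast_setup_report is hypothetical, and it only looks at the BLASTMAT environment variable, whether blastall is on the PATH, and whether a .ncbirc file exists in the home directory. It does not confirm that BLAST actually runs.

```python
import os
import shutil


def blast_setup_report(env=None, home=None):
    """Collect the settings the BLASTMAT error message refers to."""
    env = os.environ if env is None else env
    home = os.path.expanduser("~") if home is None else home
    return {
        # Value of BLASTMAT, or None when the variable is not set.
        "BLASTMAT": env.get("BLASTMAT"),
        # Full path to blastall if it is found on PATH, else None.
        "blastall": shutil.which("blastall", path=env.get("PATH")),
        # Whether a .ncbirc file sits in the home directory.
        "ncbirc_exists": os.path.isfile(os.path.join(home, ".ncbirc")),
    }


for name, value in blast_setup_report().items():
    print(f"{name}: {value}")
```

If BLASTMAT is None and no .ncbirc exists, that would be consistent with the RuntimeError shown earlier in the thread.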


Kowalczyk, Agnieszka

Oct 30, 2012, 2:18:13 PM
to qiime...@googlegroups.com
Yes, thank you. Please see the output below:

System information
==================
         Platform: darwin
   Python version: 2.7.1 (r271:86832, Dec 15 2011, 08:41:37)  [GCC 4.0.1 (Apple Inc. build 5493)]
Python executable: /macqiime/bin/python

Dependency versions
===================
                     PyCogent version: 1.5.1
                        NumPy version: 1.5.1
                   matplotlib version: 1.1.0
                  biom-format version: 0.9.3
                QIIME library version: 1.5.0
                 QIIME script version: 1.5.0
        PyNAST version (if installed): 1.1
RDP Classifier version (if installed): rdp_classifier-2.2.jar

QIIME config values
===================
                     blastmat_dir: None
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /macqiime/greengenes/core_set_aligned.fasta.imputed
                  cluster_jobs_fp: /macqiime/QIIME/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: None
                     torque_queue: friendlyq
              qiime_test_data_dir: None
   template_alignment_lanemask_fp: /macqiime/greengenes/lanemask_in_1s_and_0s
                    jobs_to_start: 1
                cloud_environment: False
                qiime_scripts_dir: /macqiime/QIIME/bin/
            denoiser_min_per_core: 50
                      working_dir: None
                    python_exe_fp: /macqiime/bin/python
                         temp_dir: /tmp/
                      blastall_fp: blastall
                 seconds_to_sleep: 60
assign_taxonomy_id_to_taxonomy_fp: None


running checks:

test_FastTree_supported_version (__main__.Qiime_config)
FastTree is in path and version is supported ... ok
test_INFERNAL_supported_version (__main__.Qiime_config)
INFERNAL is in path and version is supported ... ok
test_ParsInsert_supported_version (__main__.Qiime_config)
ParsInsert is in path and version is supported ... ok
test_R_supported_version (__main__.Qiime_config)
R is in path and version is supported ... FAIL
test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane. ... FAIL
test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported ... ok
test_blastall_fp (__main__.Qiime_config)
blastall_fp is set to a valid path ... ok
test_blastmat_dir (__main__.Qiime_config)
blastmat_dir is set to a valid path. ... ok
test_cdbtools_supported_version (__main__.Qiime_config)
cdbtools is in path and version is supported ... ok
test_cdhit_supported_version (__main__.Qiime_config)
cd-hit is in path and version is supported ... ok
test_chimeraSlayer_install (__main__.Qiime_config)
no obvious problems with ChimeraSlayer install ... ok
test_clearcut_supported_version (__main__.Qiime_config)
clearcut is in path and version is supported ... ok
test_cluster_jobs_fp (__main__.Qiime_config)
cluster_jobs_fp is set to a valid path and is executable ... ok
test_denoiser_supported_version (__main__.Qiime_config)
denoiser aligner is ready to use ... ok
test_for_obsolete_values (__main__.Qiime_config)
local qiime_config has no extra params ... ok
test_matplotlib_suported_version (__main__.Qiime_config)
maptplotlib version is supported ... ok
test_mothur_supported_version (__main__.Qiime_config)
mothur is in path and version is supported ... ok
test_muscle_supported_version (__main__.Qiime_config)
muscle is in path and version is supported ... ok
test_numpy_suported_version (__main__.Qiime_config)
numpy version is supported ... ok
test_pplacer_supported_version (__main__.Qiime_config)
pplacer is in path and version is supported ... ok
test_pynast_suported_version (__main__.Qiime_config)
pynast version is supported ... ok
test_pynast_template_alignment_blastdb_fp (__main__.Qiime_config)
pynast_template_alignment_blastdb, if set, is set to a valid path ... ok
test_pynast_template_alignment_fp (__main__.Qiime_config)
pynast_template_alignment, if set, is set to a valid path ... ok
test_python_exe_fp (__main__.Qiime_config)
python_exe_fp is set to a working python env ... ok
test_python_supported_version (__main__.Qiime_config)
python is in path and version is supported ... ok
test_qiime_scripts_dir (__main__.Qiime_config)
qiime_scripts_dir, if set, is set to a valid path ... ok
test_qiime_test_data_dir (__main__.Qiime_config)
qiime_test_data_dir, if set, is set to a valid path ... ok
test_raxmlHPC_supported_version (__main__.Qiime_config)
raxmlHPC is in path and version is supported ... ok
test_rtax_supported_version (__main__.Qiime_config)
rtax is in path and version is supported ... ok
test_temp_dir (__main__.Qiime_config)
temp_dir, if set, is set to a valid path ... ok
test_template_alignment_lanemask_fp (__main__.Qiime_config)
template_alignment_lanemask, if set, is set to a valid path ... ok
test_uclust_supported_version (__main__.Qiime_config)
uclust is in path and version is supported ... ok
test_usearch_supported_version (__main__.Qiime_config)
usearch is in path and version is supported ... FAIL
test_working_dir (__main__.Qiime_config)
working_dir, if set, is set to a valid path ... ok

======================================================================
FAIL: test_R_supported_version (__main__.Qiime_config)
R is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 689, in test_R_supported_version
    "which components of QIIME you plan to use.")
AssertionError: usearch not found. This may or may not be a problem depending on which components of QIIME you plan to use.

======================================================================
FAIL: test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 143, in test_ampliconnoise_install
    "$PYRO_LOOKUP_FILE variable is not set. See %s for help." % url)
AssertionError: $PYRO_LOOKUP_FILE variable is not set. See http://www.qiime.org/install/install.html#ampliconnoise-install for help.

======================================================================
FAIL: test_usearch_supported_version (__main__.Qiime_config)
usearch is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 668, in test_usearch_supported_version
    "which components of QIIME you plan to use.")
AssertionError: usearch not found. This may or may not be a problem depending on which components of QIIME you plan to use.

----------------------------------------------------------------------
Ran 34 tests in 1.955s

FAILED (failures=3)

Tony Walters

Oct 30, 2012, 2:38:33 PM
to qiime...@googlegroups.com
Hello again Aga,

That looks good, so can you check your .ncbirc file? It should be in your home directory and point to your BLAST data directory.

Mine, for instance, looks like this (yours will have different paths):
Data=/Users/tony/code/blast-2.2.22/data

-Tony
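
If the file turns out to be missing, it can be created by hand in any text editor. As an alternative, here is a minimal Python sketch that writes one using the two-line layout quoted from the BLAST help text earlier in the thread ([NCBI] followed by Data=...). The function name write_ncbirc is hypothetical, the data path is just the example from this message, and the demonstration writes into a throwaway directory rather than a real home directory.

```python
import os
import tempfile


def write_ncbirc(data_dir, home=None):
    """Write a minimal .ncbirc pointing BLAST at data_dir; return its path."""
    home = os.path.expanduser("~") if home is None else home
    ncbirc_path = os.path.join(home, ".ncbirc")
    # Refuse to overwrite an existing file; that one should be edited by hand.
    if os.path.exists(ncbirc_path):
        raise FileExistsError(f"{ncbirc_path} already exists")
    with open(ncbirc_path, "w") as handle:
        handle.write("[NCBI]\n")
        handle.write(f"Data={data_dir}\n")
    return ncbirc_path


# Demonstration in a temporary directory, so no real file is touched.
with tempfile.TemporaryDirectory() as scratch_home:
    path = write_ncbirc("/Users/tony/code/blast-2.2.22/data", home=scratch_home)
    with open(path) as handle:
        print(handle.read())
```

To use it for real, call write_ncbirc with the path to your own BLAST data directory and leave home unset.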


Kowalczyk, Agnieszka

Oct 30, 2012, 2:43:19 PM
to qiime...@googlegroups.com

This is a silly question but I have no idea how to check this.
Where should I look for this file? :)

Aga



Tony Walters

Oct 30, 2012, 2:49:54 PM
to qiime...@googlegroups.com
It's in your home directory (or should be; it's possible it doesn't exist there), so you could open up a terminal and type:
vim .ncbirc

or from a terminal in any directory
vim ~/.ncbirc

to see the contents.
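
For anyone who would rather not open an editor, the same check can be sketched in a few lines of Python. The helper name read_ncbirc_data_dir is hypothetical; it simply looks for a line of the form Data=... and reports whether the directory it names exists.

```python
import os


def read_ncbirc_data_dir(ncbirc_path):
    """Return the Data= value from a .ncbirc file, or None if it is absent."""
    if not os.path.isfile(ncbirc_path):
        return None
    with open(ncbirc_path) as handle:
        for line in handle:
            key, sep, value = line.partition("=")
            # Match "Data=..." regardless of case or surrounding spaces.
            if sep and key.strip().lower() == "data":
                return value.strip().strip('"')
    return None


ncbirc_path = os.path.expanduser("~/.ncbirc")
data_dir = read_ncbirc_data_dir(ncbirc_path)
if data_dir is None:
    print(f"No Data entry found in {ncbirc_path}")
else:
    print(f"Data={data_dir} (directory exists: {os.path.isdir(data_dir)})")
```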


Kowalczyk, Agnieszka

Oct 30, 2012, 2:51:50 PM
to qiime...@googlegroups.com
Great. Many thanks for that.

Aga


Kowalczyk, Agnieszka

Oct 30, 2012, 4:03:51 PM
to qiime...@googlegroups.com
Hi,
I found the file and put it where it should be. I ran the script again and got the files which I attached.
The script description says that I should have 4 files.
The output folder was not created either.

By the way, is there an easier way to throw away sequences which are rubbish and to analyse the good ones?
Thanks
Aga


ref_seq_setpnpA.fna.log
ref_seq_setpnpA.fna.nhr
ref_seq_setpnpA.fna.nin
ref_seq_setpnpA.fna.nsi
ref_seq_setpnpA.fna.nsq

Tony Walters

Oct 30, 2012, 5:25:10 PM
to qiime...@googlegroups.com
What's the command you're using now?

For a conservative approach, one could do closed-reference OTU picking (no new OTUs) to avoid sequences that are not bacterial/archaeal 16S.

-Tony


Kowalczyk, Agnieszka

Oct 30, 2012, 6:17:36 PM
to qiime...@googlegroups.com
I have sequences for functional genes that, after blastx, turned out to match a protein other than the targeted one.

I was trying to filter my OTU tables using the exclude script or the one for removing OTUs based on their ID.

I also have some issues with the biom format for my OTU tables.

After I installed the new QIIME, OTU tables are created in .biom format, which doesn't work for me. I also tried to convert the .txt OTU tables into .biom.

I attached one example of a .biom OTU table. How can I remove some of the OTUs and keep the ones that I want to analyse further?

Best
Aga

otu_table.biom

Tony Walters

Oct 30, 2012, 6:31:34 PM
to qiime...@googlegroups.com
Hello Aga,

Try creating a text file, for instance "otus_to_remove.txt", with one OTU ID you want to remove per line, like this:
11
125
737

Then run:
filter_otus_from_otu_table.py -i otu_table.biom -e otus_to_remove.txt -o filtered_otu_table.biom

Does this filter the table correctly?

-Tony
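
If the list of unwanted OTUs is long, typing the file by hand gets tedious. A minimal Python sketch for producing it, assuming the IDs are already available as a list; the function name write_otu_id_file is hypothetical and the three IDs are just the ones from the example above.

```python
def write_otu_id_file(otu_ids, path):
    """Write one OTU ID per line, the layout described above."""
    with open(path, "w") as handle:
        for otu_id in otu_ids:
            handle.write(f"{otu_id}\n")


# Example IDs taken from this message; replace them with your own.
write_otu_id_file(["11", "125", "737"], "otus_to_remove.txt")
```

The resulting otus_to_remove.txt can then be passed to filter_otus_from_otu_table.py with -e as in the command above.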


Kowalczyk, Agnieszka

Oct 31, 2012, 4:44:31 AM
to qiime...@googlegroups.com

Hi,

Sorry for my late reply. I have meetings all day today and will be available in the evening.

I did try to filter it, but there was a complaint about my OTU table format. Something about JSON? I can post the error message later today.

That is why I think the OTU table in .biom is incorrect. I did try to convert a .txt table into .biom, and that .biom table did not work either.

I will get back to this topic later on tonight.

Many thanks for your reply

Best
Aga
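
One way to narrow down a format complaint like this is to test whether the file parses as JSON at all. This is a rough Python sketch, assuming the table is in the JSON-based BIOM format; the function name describe_biom_json is hypothetical, the filename is the attachment name from this thread, and the check does not validate the table against the BIOM specification.

```python
import json


def describe_biom_json(path):
    """Return (ok, message) saying whether path parses as a JSON object."""
    try:
        with open(path) as handle:
            table = json.load(handle)
    except ValueError as error:
        # Covers malformed JSON as well as undecodable bytes.
        return False, f"not valid JSON: {error}"
    except OSError as error:
        return False, f"could not read file: {error}"
    if not isinstance(table, dict):
        return False, "valid JSON, but not a JSON object"
    return True, f"valid JSON with top-level keys: {sorted(table)}"


ok, message = describe_biom_json("otu_table.biom")
print(message)
```

A tab-separated .txt table that was merely renamed to .biom would fail this check, which may help distinguish a conversion problem from a problem in the filtering step.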
