QIIME 1.5.0 compare_categories.py patch

1,394 views
Skip to first unread message

Jai Ram Rideout

unread,
Sep 14, 2012, 4:11:05 PM9/14/12
to qiime...@googlegroups.com
QIIME users,

A patch has been issued to clear up two separate issues in QIIME 1.5.0's compare_categories.py script:

1) Users receive a KeyError when running the script with --method permanova.
2) Input metadata mapping files and/or distance matrices are sometimes reported as not containing any data, usually after they are opened/edited in a program such as Microsoft Excel.

You can download the patch from here:

https://raw.github.com/gist/3708104/7cfa736f15387bea059ffd4735656b5fd0c0db0a/patch_compare_categories.py

To run the patch, execute the following command:

python patch_compare_categories.py

This will download the fixed version of compare_categories.py and overwrite the existing version pointed to by qiime_scripts_dir in your QIIME config file.

NOTE: this patch will only work if you have read/write permissions in your QIIME scripts directory. This will cover most cases (i.e. Amazon EC2 instances and VirtualBox). If you do not have the correct permissions, your system administrator will need to download the fixed compare_categories.py script from https://raw.github.com/gist/3708104/e8107c997784e30a2de71aaae34de4de58811153/compare_categories.py and copy it over the original one in your QIIME scripts dir (making sure to match the file permissions exactly). Running the patch script with sudo will most likely not work, as your QIIME scripts dir will not be found.

SAMIK

unread,
Sep 15, 2012, 5:05:12 AM9/15/12
to qiime...@googlegroups.com
Hi Jai,

Thank you for the post. I tried to download the link, but it appears to open in another window.
Running the python patch_compare_categories.py resulted in following error:
python: can't open file 'patch_compare_categories.py': [Errno 2] No such file or directory 

did I miss something, how to download the files? I have the admin permission in my PC.

Thank you
Samik 

Daniel McDonald

unread,
Sep 17, 2012, 12:27:50 PM9/17/12
to qiime...@googlegroups.com
Hi Samik,

Could you try right-clicking on the link and selecting save as? Please
save it as patch_compare_categories.py. Once you've downloaded it,
open a terminal and change directories to the download location. The
recommended python command that Jai gave should work from there.

Let me know how it goes,
Daniel
> --
>
>
>

SAMIK BAGCHI

unread,
Sep 18, 2012, 7:13:47 AM9/18/12
to qiime...@googlegroups.com
Dear Daniel,

Thank you for your answer. But it's now showing following error:

sh: wget: command not found
Do you have write permissions to your QIIME scripts directory '/macqiime/QIIME/bin/'? If not, this patch will not work. You will need your sys admin to manually apply the patch by overwriting the file '/macqiime/QIIME/bin/compare_categories.py' with https://raw.github.com/gist/3708104/e8107c997784e30a2de71aaae34de4de58811153/compare_categories.py. They need to make sure that the permissions are the same as they were for the original file.

Note that i have administrative permission in my computer, thus i should be able to rewrite in QIIME directory.

Kind Regards
Samik  

--






--

Kind Regards

 
Samik Bagchi
Post-Doctoral Fellow
Water Desalination and Reuse Center
King Abdullah University of Science and Technology (KAUST)
Thuwal. 23955-6900 Kingdom of Saudi Arabia

Will Van Treuren

unread,
Sep 18, 2012, 10:19:11 AM9/18/12
to qiime...@googlegroups.com

Hi Samik,

I believe your error is caused by the fact that wget is not installed on your system.

To install, you might try this link:
http://mob.conduit.com/d8221e2f-6894-4535-9cf9-05c52de78780?device=66&skipLanding=true&cms=wordpress&url=http%3a%2f%2fwww.makeuseof.com%2ftag%2fwget-mac%2f

I am sure there are others out there as well.

hope this helps,
Will

--
 
 
 

Daniel McDonald

unread,
Sep 18, 2012, 10:19:32 AM9/18/12
to qiime...@googlegroups.com
Hey Samik,

Okay, lets try the direct download method then since you have
administrative privileges.

1) right-click on
https://raw.github.com/gist/3708104/e8107c997784e30a2de71aaae34de4de58811153/compare_categories.py
and choose save link as. If that URL is not clickable, then I suggest
opening up a terminal and downloading with "curl -O <URL>", where you
replace <URL> with the compare_categories.py URL.
2) Copy the downloaded script into your /macqiime/QIIME/bin/ (this may
take administrative privileges)
3) make the script executable with "chmod +x compare_categories.py"
(this may take administrative privileges as well)

Please let me know how it goes,
Daniel
> --
>
>
>

Jeff Werner

unread,
Sep 26, 2012, 10:30:56 AM9/26/12
to qiime...@googlegroups.com
To make things simpler for macqiime, the attached file will patch compare_categories.py in MacQIIME 1.5.0

--Jeff
CompareCagetoriesPatch_MacQIIME.tgz

Jeff Werner

unread,
Sep 26, 2012, 10:32:28 AM9/26/12
to qiime...@googlegroups.com
Just unzip it, cd to the created directory in the terminal, and run the script file as a user with sudo privileges. It just copies the script file into the qiime scripts directory and makes sure it's executable.
 -- Jeff

SAMIK BAGCHI

unread,
Oct 10, 2012, 5:09:41 AM10/10/12
to qiime...@googlegroups.com
Hi Jeff,

Thank you for providing the the patch script.

We run the script with our map file by choosing the method "rda" and it's working with all category expect SampleID. Here we have attached first few lines of our map file.
 

head combined_map_file.txt
#SampleID    BarcodeSequence    LinkerPrimerSequence    name    DOB    SRT    COD Removal    VSS    Sludge age    Description
A.2d.R1.0    ACACTGTTCATG    CAGTGYCAGCMGCCGCGGTAA    B2    4/18/11    2    351.5    636.6666667    NA    week_0_SRT_2d_Acclimated
A.2d.R2.0    ACCAGACGATGC    CAGTGYCAGCMGCCGCGGTAA    B3    4/18/11    2    360.5    663.3333333    NA    week_0_SRT_2d_Acclimated
A.10d.R1.0    ACGCTCATGGAT    CAGTGYCAGCMGCCGCGGTAA    B4    4/18/11    10    351    1286.666667    NA    week_0_SRT_10d_Acclimated
A.10d.R2.0    ACTCACGGTATG    CAGTGYCAGCMGCCGCGGTAA    B5    4/18/11    10    346.5    1336.666667    NA    week_0_SRT_10d_Acclimated
NA.2d.R1.0    AGACCGTCAGAC    CAGTGYCAGCMGCCGCGGTAA    B6    4/18/11    2    356.5    490    NA    week_0_SRT_2d_nonAcclimated
NA.2d.R2.0    AGCACGAGCCTA    CAGTGYCAGCMGCCGCGGTAA    B7    4/18/11    2    360.5    740    NA    week_0_SRT_2d_nonAcclimated
NA.10d.R1.0    ACAGACCACTCA    CAGTGYCAGCMGCCGCGGTAA    B8    4/18/11    10    367.5    1076.666667    NA    week_0_SRT_10d_nonAcclimated
NA.10d.R2.0    ACCAGCGACTAG    CAGTGYCAGCMGCCGCGGTAA    B9    4/18/11    10    366    983.3333333    0    week_0_SRT_10d_nonAcclimated
A.2d.R1.1    ACGGATCGTCAG    CAGTGYCAGCMGCCGCGGTAA    B10    5/9/11    2    375.5    596.6666667    0    week_4_SRT_2d_Acclimated

But we constantly get the following error when we chose -c as sampleID.


PERMDISP: Performs the PERMDISP statistical method on a distance matrix and mapping file using the HOST_SUBJECT_ID category. Then it outputs the results to the 'permdisp' directory. The full file path will be: ./permdisp/betadisper_results.txt
 compare_categories.py --method permdisp -i datasets/keyboard/unweighted_unifrac_dm.txt -m datasets/keyboard/map.txt -c HOST_SUBJECT_ID -o permdisp

RDA: Performs the RDA statistical method on a distance matrix and mapping file using the HOST_SUBJECT_ID category. Then it outputs the results to the 'rda' directory. The full file path will be: ./RDA/rda_results.txt and ./RDA/rda_plot.txt
 compare_categories.py --method rda -i datasets/keyboard/unweighted_unifrac_dm.txt -m datasets/keyboard/map.txt -c HOST_SUBJECT_ID -o rda

compare_categories.py: error: Category 'sampleID' not found in mapping file columns:

When we tried with method 'rda' and 'name' as the category, we get the plot for rda. Please refer the pdf attached. In this plot we could the names are appended with word 'factor' which doesnt exist in our map file. We are surprised to see this! Could this be a possible bug?


The same error appears when we tried with method 'anosim'.

Could you please let us know the possible reasons for this?

Appreciating your help as usual.

Kind Regards

Samik

Kind Regards

 
Samik Bagchi
Post-Doctoral Fellow
Water Desalination and Reuse Center (WDRC)
Building 4, Level 4, 4266-WS03

King Abdullah University of Science and Technology (KAUST)
Thuwal. 23955-6900 Kingdom of Saudi Arabia
Office: +966 02 505 4921, Mobile: +966546613742
rda_plot.pdf

Jeff Werner

unread,
Oct 10, 2012, 9:23:24 AM10/10/12
to qiime...@googlegroups.com
Hi Samik,

The error message seems to be implying that your mapping file doesn't have a header called "sampleID."  Does the mapping file include this exact string as a header, or is there a difference in spaces or capitalization?  If you're trying to use the initial "#Sample ID" column for this option, you could try adding a column to the mapping file with that "sampleID" header and putting copies of all the sample IDs in that column as well as the initial #Sample ID column.  But, since Sample IDs are unique for each sample, I wouldn't expect it to be a useful category for this analysis, so that is probably not what you're trying to do (?)

Cheers,
Jeff

Jeff Werner

unread,
Oct 10, 2012, 9:27:21 AM10/10/12
to qiime...@googlegroups.com
Hi again Samik,

Sorry, I just looked again and saw the copy of your mapping file!  You don't have a "sampleID" column - you do have a "SampleID" column. But, this may not be a useful category for many of those methods, since there's only one sample for each SampleID, which means it's not so much a category as it is a unique label.

Cheers,
Jeff

Jai Ram Rideout

unread,
Oct 10, 2012, 11:13:22 AM10/10/12
to qiime...@googlegroups.com
Hi Samik,

We do not support using a category with only unique entries (such as
SampleID) for these methods. The error message you're getting is
unintuitive and uninformative and will be fixed in the next QIIME
release, but regardless you will not be able to use SampleID for the
statistical methods in compare_categories.py.

-Jai
> --
>
>
>

SAMIK

unread,
Oct 13, 2012, 11:31:37 AM10/13/12
to qiime...@googlegroups.com
Hi Jeff and Jai,

Thank you for your kind response. sorry for delay in response, somehow your response has gone to my spam !!
You have well explained the point regarding the 'category'. if i understood well, I can use 'SRT' as a category since I have only 2 type of SRT and all sample fall within this!! But the output file that i received (attached in previous communication) after 'rda' is having some erroneous name, how to remove that? also I didn't receive any output by 'ANOSIM' analysis !!   am i missing something here?

Kind Regards
Samik

Jai Ram Rideout

unread,
Oct 14, 2012, 2:31:16 PM10/14/12
to qiime...@googlegroups.com
Hi Samik,

Can you please post the distance matrix, mapping file, and the exact
commands that you are running? I'll then be able to better help you.

Thanks,
Jai
> --
>
>
>

SAMIK

unread,
Oct 21, 2012, 7:55:15 AM10/21/12
to qiime...@googlegroups.com
Hi Jai,
thank you for extending your support. My objective in RDA is to see the effect of SRT (category) on the community dynamics. With compare_categories.py I'm getting the required output, but the problem is in the quality of output pdf which is not of presentable quality (attached). i want to have a nice sample representation (for eg as circle or square) and not as name in the output figure. Kindly suggest how to modify the result file.

please find the attached  distance matrix, mapping file, and the exact commands that i used:
compare_categories.py --method rda -i wf_bdiv_even146/unweighted_unifrac_dm.txt -m combined_map_file.txt -c SRT -o rda_srt

Similarly, i want a pairwise ANOSIM comparison for my sample but 'anosim' analysis with all category in attached map file resulted into combined anosim output. please suggest the required input for pairwise anosim. 

here is the commands:
compare_categories.py --method anosim -i wf_bdiv_even146/unweighted_unifrac_dm.txt -m combined_map_file.txt -c Description -o anosim_Description

Kind Regards
Samik
combined_map_file.txt
unweighted_unifrac_dm.txt
rda_plot.pdf

Jai Ram Rideout

unread,
Oct 21, 2012, 2:43:32 PM10/21/12
to qiime...@googlegroups.com
Hi Samik,

Our support for the configurability of the plots produced by RDA is
currently limited. If you'd like to modify how the RDA plot is
rendered (e.g. remove the factor label, change sample presentation,
etc.) you will need to use vegan::capscale in the R programming
language. This is the function that is being called by
compare_categories.py, and will give you full control of how RDA is
run and what type of plot is produced.

We currently do not support pairwise comparisons with ANOSIM, and I
cannot find this feature in R/vegan either. I believe the PRIMER
software package has this functionality, though it is not free
software. Alternatively, you might use filter_distance_matrix.py to
filter your distance matrix to only include samples from two of the
levels at a time in the SRT category, and then run ANOSIM over each of
those. You'd need to adjust your p-values for multiple comparisons
(e.g. Bonferroni correction).

If you would like these features to be implemented in a future QIIME
release, please submit a feature request here:

https://github.com/qiime/qiime/issues?state=open

In order to create an issue, you will need a GitHub account.

Hope this helps,
Jai
> --
>
>
>

SAMIK BAGCHI

unread,
Oct 21, 2012, 5:02:38 PM10/21/12
to qiime...@googlegroups.com
Hi Jai,

Thank you for nicely explaining the limitations. It'll definitely help me. I really liked your suggestions regarding pairwise anosim and I'll definitely like to see it in the next QIIME update.
It'll be nice of you to give me a link for PRIMER software coz the primer software available in our IT is about designing new primer.
waiting for your kind reply.

Kind Regards
Samik

Jai Ram Rideout

unread,
Oct 21, 2012, 7:48:52 PM10/21/12
to qiime...@googlegroups.com
Hi Samik,

Here's the link: http://www.primer-e.com/

-Jai
> --
>
>
>

Samik

unread,
Oct 22, 2012, 2:27:25 AM10/22/12
to qiime...@googlegroups.com
Hi Jai,

Thank you. I'll give you updates.

Sent from my iPad
> --
>
>
>

Jai Ram Rideout

unread,
Oct 24, 2012, 2:36:57 PM10/24/12
to qiime...@googlegroups.com
Hi Samik,

In case you're wanting to experiment with RDA in R (via
vegan::capscale), we have some code in QIIME that may be of use to
you. If you download our code (https://github.com/qiime/qiime), take a
look at qiime/support_files/R/loaddata.r. This file contains R
functions for loading QIIME files such as distance matrices, mapping
files, etc..

I also suggest looking at qiime/support_files/R/dbrda.r. This is the
file that is executed by compare_categories.py under the hood when you
tell it to run RDA. You could copy this file somewhere else and modify
how it is calling vegan::capscale or how it plots the results. Then
you can run your modified dbrda.r script (the script has usage
examples in comments at the beginning of the file) on your
QIIME-formatted files. If you end up finding a better/more desirable
way to generate the output, feel free to let us know or submit a pull
request to GitHub with your changes and we may include them in a
future release!

Hope this helps,
Jai
> --
>
>
>

Rachel S

unread,
Oct 24, 2012, 3:18:22 PM10/24/12
to qiime...@googlegroups.com
Hello Qiime Team,
Thank you for this timely patch! I have an additional problem-- my install is missing compare_categories.py  entirely (I get an error message with "command not found") and it is not in my scripts folder. If I download and install the patch, how can I make sure to point it to the correct destination? Are there any other dependencies needed for this script?

thank you very much for any advice,

Rachel

Jai Ram Rideout

unread,
Oct 24, 2012, 6:16:24 PM10/24/12
to qiime...@googlegroups.com
Hi Rachel,

What version of QIIME are you running? Can you please post the output
of print_qiime_config.py?

Thanks,
Jai
> --
>
>
>

Rachel S

unread,
Oct 26, 2012, 11:00:09 AM10/26/12
to qiime...@googlegroups.com
Hello,
I am using Qiime 1.4.0. Thank you for any help!

fulthorpe@fulthorpenix:~/qiime_software/rachel$ print_qiime_config.py

System information
==================
         Platform:    linux2
   Python version:    2.7.1 (r271:86832, Apr 20 2012, 15:13:40)  [GCC 4.4.3]
Python executable:    /home/fulthorpe/qiime_software/python-2.7.1-release/bin/python

Dependency versions
===================
                     PyCogent version:    1.5.1
                        NumPy version:    1.5.1
                   matplotlib version:    1.1.0
                QIIME library version:    1.4.0
                 QIIME script version:    1.4.0
        PyNAST version (if installed):    1.1
RDP Classifier version (if installed):    rdp_classifier-2.2.jar

QIIME config values
===================
                     blastmat_dir:    /home/fulthorpe/qiime_software/blast-2.2.22-release/data
      topiaryexplorer_project_dir:    None
     pynast_template_alignment_fp:    /home/fulthorpe/qiime_software/core_set_aligned.fasta.imputed
                  cluster_jobs_fp:    /home/fulthorpe/qiime_software/qiime-1.4.0-release/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb:    None
assign_taxonomy_reference_seqs_fp:    /home/fulthorpe/qiime_software/gg_otus-4feb2011-release/rep_set/gg_97_otus_4feb2011.fasta
                     torque_queue:    friendlyq
   template_alignment_lanemask_fp:    /home/fulthorpe/qiime_software/lanemask_in_1s_and_0s
                    jobs_to_start:    1
                cloud_environment:    False
                qiime_scripts_dir:    /home/fulthorpe/qiime_software/qiime-1.4.0-release/bin
            denoiser_min_per_core:    50
                      working_dir:    /tmp/
                    python_exe_fp:    /home/fulthorpe/qiime_software/python-2.7.1-release/bin/python
                         temp_dir:    /tmp/
                      blastall_fp:    /home/fulthorpe/qiime_software/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep:    60
assign_taxonomy_id_to_taxonomy_fp:    /home/fulthorpe/qiime_software/gg_otus-4feb2011-release/taxonomies/greengenes_tax_rdp_train.txt

Jai Ram Rideout

unread,
Oct 26, 2012, 11:14:41 AM10/26/12
to qiime...@googlegroups.com
Hi Rachel,

The compare_categories.py script is only available in QIIME 1.5.0, so
that is why the patch is failing. If you do end up upgrading to 1.5.0
in the future, we recommend running the patch.

-Jai
> --
>
>
>

Rachel S

unread,
Oct 26, 2012, 2:56:34 PM10/26/12
to qiime...@googlegroups.com
Hello,
Thank you for your help (obvious in the news release for 1.5.0!) I am now trying to upgrade to 1.5.0.

garlicscape

unread,
Feb 25, 2013, 11:31:23 PM2/25/13
to qiime...@googlegroups.com
I'm working on using dbRDA in R, but am getting different results than those from Qiime. I am a beginner in R so I'm sure I'm doing something wrong. To sort out what it is, I'm now using the provided R script for uploading qiime files for R (and after will use the script provided for running dbRDA in R).
But I'm confused by the funcion  load.qiime.taxon.table
What qiime derived file should this correspond to? Is it the otu table that is then transposed? I am not getting the output from this function I suspect I should... it looks like only taxon should be returned, but I'm getting a transposed matrix of all the otu_table data... is that right?
I changed the taxon column header in the otu_table to read "Consensus Lineage" because the script load.qiime.otu.table was not recognizing it with it's original name (taxa)
Also... why should "include.lineages" be false in load.qiime.otu.table?
Thanks,
Sarah

Jai Ram Rideout

unread,
Feb 25, 2013, 11:51:41 PM2/25/13
to qiime...@googlegroups.com
Hi Sarah,

What db-RDA commands are you using in R, and what QIIME commands are you comparing them to? Also, what version of QIIME are you using?

load.qiime.otu.table requires a table in "classic" OTU table format, so you'll need the taxonomy column named "ConsensusLineage" (without the quotes). For more information on how to convert your .biom OTU table to "classic" format, please see:


I believe load.qiime.taxon.table loads a taxa summary file, which is the output of summarize_taxa.py. This file is a tab-delimited file relating samples to taxonomic groups, with abundances as the data in the table. If you're trying to load an OTU table, you should use load.qiime.otu.table.

Note that db-RDA in QIIME takes a distance matrix as input, while vegan::capscale (the R equivalent) accepts either an OTU table or a distance matrix. If you provide an OTU table, it will use Euclidean distance to compute a distance matrix from your OTU table. Thus, you will likely see differences between QIIME and R because QIIME typically uses UniFrac or Bray-Curtis to generate distance matrices, which will differ from Euclidean distances.

To load a QIIME distance matrix in R, please see load.qiime.distance.matrix.

Hope this helps,
Jai


--
 
---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jai Ram Rideout

unread,
Feb 25, 2013, 11:54:55 PM2/25/13
to qiime...@googlegroups.com
Woops, forgot to answer one of your questions regarding include.lineages. The include.lineages parameter in load.qiime.otu.table has a default value of false, meaning that if the user doesn't provide a value for that parameter, it will be false. You can set include.lineages to true when calling the function and taxonomic information will be loaded with the rest of your OTU table. For example:

load.qiime.otu.table("otu_table.txt", include.lineages=TRUE)

-Jai

garlicscape

unread,
Feb 26, 2013, 10:38:11 AM2/26/13
to qiime...@googlegroups.com
yes i think that helps.

To answer some of your questions

The Qiime command I'm using is
compare_categories.py --method adonis -i beta/distance_matrix/weighted_unifrac_rarefaction_1727_29.txt -m NiwotB_D_map.txt -c Fertilizer,Plot -o stats -n 999
and when I use dbRDA I am inputting the weighted unifrac distance matrix, but there are still differences in output. But that's without using the Qiime scripts, perhaps my input files to R are not formatted properly.

I'm using Qiime 1.6. And in that version the otu_table.txt (which I did convert from .biom) has the column header "taxonomy" rather than "ConsensusLineage." 

I guess I don't understand which of the load.data scripts I actually need. I thought I needed the OTU table in order to get species scores from dbRDA in R. I also am not sure whether I should use load.distance.matrix or read.qiime.table, which result in similar R files, but one does not have headers I think. 

Sarah

garlicscape

unread,
Feb 26, 2013, 10:51:03 AM2/26/13
to qiime...@googlegroups.com
sorry, I wrote the qiime code with method adonis but used for dbRDA as well

Jai Ram Rideout

unread,
Feb 26, 2013, 1:27:21 PM2/26/13
to qiime...@googlegroups.com
Hi Sarah,

QIIME is using the R loading functions you've been looking at to load the distance matrix and mapping file, and then it runs vegan::capscale to perform db-RDA. To see the R script that QIIME is using to do this, please refer to the file Qiime/qiime/support_files/R/dbrda.r. Towards the bottom of the file you'll be able to see the vegan::capscale call.

The file also provides a working example of how to load a QIIME distance matrix and mapping file (load.qiime.distance.matrix and load.qiime.mapping.file, respectively). I think you'll need to use load.qiime.otu.table and pass that to vegan::capscale in order to obtain species scores (QIIME doesn't not provide this capability).

I noticed in your compare_categories.py command that you are providing two categories (Fertilizer and Plot). Currently, QIIME only supports a single category with db-RDA, so this may be why you are seeing different results, if you are passing multiple categories to vegan::capscale.

Hope this helps,
Jai

garlicscape

unread,
Feb 26, 2013, 4:26:15 PM2/26/13
to qiime...@googlegroups.com
yes, very much.
Just one more question, though this may be more suited for an R team... The script provided in qiime for loading the otu table, when include.lineages=TRUE, the lineages are not included as colnames for the OTU counts, but rather as a list at the end of the file, so that the whole resulting otu table is class=list not data.frame. I am wondering if this could pose a problem for dbRDA analyses in R and if I should change it so the taxonomies are the column headers for the otu table?

Jai Ram Rideout

unread,
Feb 26, 2013, 5:03:45 PM2/26/13
to qiime...@googlegroups.com
Hi Sarah,

The OTU table is samples x OTUs, with a taxonomy string being extra metadata associated with each OTU. load.qiime.otu.table will return a list containing the OTU table (as a data.frame without taxonomy strings) and the taxonomy strings as a separate column. Thus, you'll receive a list with two elements (the OTU table and taxonomy strings).

I think you'd want to pass just the OTU table to vegan::capscale.

-Jai

garlicscape

unread,
Feb 26, 2013, 5:41:21 PM2/26/13
to qiime...@googlegroups.com
ok, in that case I'm not sure how to include the taxonomy data for species scores, but I'll work on that later.

So, here I'll repost my long post:

So, I've run dbRDA in qiime, but I need to run it in R, in part for graphics and in part because I have two treatments which could be interacting.
I ran dbRDA on my own but got different results in R from the same analysis in Qiime. The total inertia is the same, but the constrained and unconstrained axes inertias are quite different (in some cases an order of magnitude different).

I am still a bit new to R, so I assumed I loaded my data incorrectly or something. I have been working with the R scripts provided by Qiime (I'm using Qiime1.6). I am not totally confident I have done everything correctly, but I am still getting different results from Qiime and from R for the same analysis. I'm wondering if anyone can tell me why.
These are my Qiime results from dbRDA:

              Inertia Proportion Rank
Total          0.3910     1.0000     
Constrained    0.1180     0.3017    1
Unconstrained  0.2730     0.6983    8
Inertia is squared Unknown distance 

Eigenvalues for constrained axes:
  CAP1 
0.1180 

Eigenvalues for unconstrained axes:
    MDS1     MDS2     MDS3     MDS4     MDS5     MDS6     MDS7     MDS8 
0.148233 0.052007 0.029738 0.019791 0.010449 0.007996 0.003089 0.001697 

and these are the R results:

              Inertia Proportion Rank
Total         0.39096    1.00000     
Constrained   0.01210    0.03095    1
Unconstrained 0.37886    0.96905    8
Inertia is squared Unknown distance 

Eigenvalues for constrained axes:
  CAP1 
0.0121 

Eigenvalues for unconstrained axes:
    MDS1     MDS2     MDS3     MDS4     MDS5     MDS6     MDS7     MDS8 
0.241903 0.053415 0.032399 0.022755 0.016641 0.006657 0.003601 0.001485 


I loaded all the functions in loaddata.r included in the qiime support_files into R.
I wasn't sure, though, how to use the dbrda.r script from support_files... so once I loaded the files I just ran dbrda using commands towards the bottom of the dbrda.r script.
Here is what I did:

#put functions loaddata.r into workspace

distmat<-load.qiime.distance.matrix(filepath="Desktop/Niwot/Analyses/R/Bacteria/DB/weighted_unifrac_rarefaction_1727_29.txt")

map<-load.qiime.mapping.file(filepath="Desktop/Niwot/Analyses/R/Bacteria/DB/NiwotB_D_map.txt")

otus<-load.qiime.otu.table(filepath="Desktop/Niwot/Analyses/R/Bacteria/DB/rarefaction_1727_29.txt",include.lineages=FALSE)

qiime.data<-remove.nonoverlapping.samples(map=map,otus=otus,distmat=distmat)


N=as.factor(map$Fertilizer)

factors.frame<-data.frame(N)

capscale.results <- capscale(as.dist(qiime.data$distmat) ~ N, factors.frame)


My Qiime command was: compare_categories.py --method adonis -i beta/distance_matrix/weighted_unifrac_rarefaction_1727_29.txt -m NiwotB_D_map.txt -c Fertilizer -o stats -n 999

Jai Ram Rideout

unread,
Feb 26, 2013, 5:47:44 PM2/26/13
to qiime...@googlegroups.com
Hi Sarah,

You are using the following line:

N=as.factor(map$Fertilizer)

I think this needs to be changed to:

N=as.factor((qiime.data$map[["Fertilizer"]]))

The difference is that you are using the mapping file data structure directly, while the QIIME script uses the qiime.data variable obtained from calling remove.nonoverlapping.samples. This function only keeps samples that are found in your mapping file, distance matrix, and OTU table. It also puts them into the same order. This is most likely why you're seeing different (incorrect) results using your script versus the one in QIIME.

Please let me know if this clears up the issue for you.

-Jai

garlicscape

unread,
Feb 26, 2013, 5:55:43 PM2/26/13
to qiime...@googlegroups.com
Yes! Thank you so much! 

Jai Ram Rideout

unread,
Feb 26, 2013, 6:37:28 PM2/26/13
to qiime...@googlegroups.com
Great, glad that fixed the issue!

Regarding species scores:

The vegan::capscale documentation refers to the "comm" argument as a community data frame, which is samples x species. This is the equivalent to an OTU table, where vegan calls one axis "species", and in QIIME we call it "OTUs", which are not necessarily species. Thus, I'm not sure that vegan::capscale will plot taxonomy strings; it will probably plot OTU identifiers. I've never tried out this functionality though, so I may be wrong.

-Jai
Reply all
Reply to author
Forward
0 new messages