But I still have singletons if I look within each sample separately.
I removed the singletons using the filter script, so I imagine that was done correctly. However, I'm wondering why I still have singletons when looking at each individual sample.
Anyway, is that correct for the analysis?
If I wanted to filter the singletons from the individual samples, what would I have to do?
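To make the distinction in the question concrete, here is a minimal Python sketch with a hypothetical toy table. It assumes the filter removed OTUs whose total count across the whole table is below 2, which is an assumption about the filter settings and not taken from the post. The function names are made up for illustration and are not QIIME scripts.

```python
# Toy OTU table (hypothetical counts): keys are OTUs, list positions are samples.
table = {
    "OTU_1": [1, 0, 0, 0],   # global singleton: total count over all samples is 1
    "OTU_2": [1, 1, 0, 3],   # total is 5, yet it has a count of 1 in two samples
    "OTU_3": [0, 4, 2, 0],
}

def drop_global_singletons(table, min_total=2):
    """Keep OTUs whose count summed over all samples is at least min_total."""
    return {otu: counts for otu, counts in table.items() if sum(counts) >= min_total}

def zero_per_sample_singletons(table):
    """Set every count of exactly 1 to 0, then drop OTUs left with no counts."""
    cleaned = {otu: [0 if c == 1 else c for c in counts]
               for otu, counts in table.items()}
    return {otu: counts for otu, counts in cleaned.items() if sum(counts) > 0}

filtered = drop_global_singletons(table)
per_sample = zero_per_sample_singletons(filtered)

print(filtered)    # OTU_2 survives and still shows a 1 in two samples
print(per_sample)  # those per-sample 1s are now 0
```

In this sketch an OTU can pass the table-wide filter and still appear exactly once in a given sample, which would be one way to end up with per-sample singletons after global singleton removal.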
But I don't really know how important it is to do all of this normalization. So I would like to know if there is something I have to do before running the metric scripts.
I guess for summarize_taxa_through_plots.py it is OK to use the BIOM table without any normalization?
I only have 4 samples, and I'm not sure how important it would be to rarefy or normalize the table before computing the diversity metrics, since, from my point of view, the samples are roughly in the same range and I don't have uneven sequence counts.
Looking at the rarefaction plots, the curves start to flatten at around 2000 sequences per sample.
And my last question is about alpha_diversity.py: should I run single_rarefaction.py before using that script? (I ask because for beta_diversity_through_plots.py it is part of step 1.)
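As an illustration of what rarefying to an even depth means, here is a minimal Python sketch that subsamples one sample's counts without replacement. It is not the implementation used by single_rarefaction.py. The counts are hypothetical, and the depth of 2000 is only borrowed from the plateau mentioned above, not a recommendation.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's OTU counts to `depth` reads without replacement."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of sequences in the sample")
    # Expand the counts into one entry per read, labelled by OTU index.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = random.Random(seed).sample(reads, depth)
    # Tally the picked reads back into per-OTU counts.
    out = [0] * len(counts)
    for i in picked:
        out[i] += 1
    return out

sample = [1200, 650, 150, 0, 3]   # hypothetical counts for one sample (2003 reads)
even = rarefy(sample, 2000)
print(sum(even))                  # total reads after subsampling
```

A sample with fewer reads than the chosen depth cannot be subsampled this way, which is why the sketch raises an error in that case.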
Error in CSS(opts$input_path, opts$out_path, opts$output_CSS_statistics) :
could not find function "load_biom"
Can I use collapse_samples.py as a normalization method?
I have another question: once I have the normalized otu_table and run beta_diversity_through_plots.py, what should I do with the -e argument? The first step of this command is to rarefy the OTU table, which I would rather not do since the table will already be normalized. I don't know if I can just run the command without that argument.
At the same time, I would like to ask how to make a heatmap using the abundances instead of the raw counts. Someone on the QIIME forum told me to use normalize_table.py for it, but I don't know how to do it.
I guess it would be better to use the normalized otu_table instead of the one with the raw counts?
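For reference, "abundances instead of counts" can be sketched as plain total-sum scaling, where each sample's counts are divided by that sample's total. This is a simple illustration with hypothetical data and a made-up function name. It is not the transformation normalize_table.py applies.

```python
def to_relative_abundance(table):
    """Divide each sample (column) by its total so every column sums to 1."""
    n_samples = len(next(iter(table.values())))
    totals = [sum(counts[j] for counts in table.values()) for j in range(n_samples)]
    return {
        # Guard against empty samples to avoid dividing by zero.
        otu: [c / totals[j] if totals[j] else 0.0 for j, c in enumerate(counts)]
        for otu, counts in table.items()
    }

# Hypothetical raw counts: keys are OTUs, list positions are samples.
counts = {"OTU_1": [10, 0], "OTU_2": [30, 5], "OTU_3": [60, 15]}
rel = to_relative_abundance(counts)
print(rel)
```

After this scaling, each sample's values sum to 1, so a heatmap built from them compares proportions rather than sequencing depth.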
install.packages(c('ape', 'biom', 'optparse', 'RColorBrewer', 'randomForest', 'vegan'))
source('http://bioconductor.org/biocLite.R')
biocLite(c('DESeq2', 'metagenomeSeq'))
If I run them separately, there are a few errors with install.packages. Running them all together, I also get a few errors at the end, plus warning messages.
Error in file(file, if (append) "a" else "w") :
cannot open the connection
ERROR: installing package DESCRIPTION failed for package ‘metagenomeSeq’
* removing ‘/home/eri/R/x86_64-pc-linux-gnu-library/3.3/metagenomeSeq’
The downloaded source packages are in
1: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) :
installation of package ‘DESeq2’ had non-zero exit status
2: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) :
installation of package ‘metagenomeSeq’ had non-zero exit status
3: installed directory not writable, cannot update packages 'rggobi', 'rgl',
I cannot copy all the text because it is so long, so this is just the last part of the installation output.
Once I have my normalized otu_table, I don't think I need to run make_otu_heatmap.py with the --absolute_abundance argument, since that normalizes the samples and they will already be normalized, do I?
As I have 1004 observations, the output is a huge tree, but if I filter the samples and keep just a few of them, I won't be showing the whole result. Any advice? :)
1. When you say "scaling everything", I don't know how to scale the heatmap with the arguments of the make_otu_heatmap.py script.
2. I did not use the Greengenes database, and the format of the taxonomy is something like this:
Once again I expected just "Blastocatella" and "Algoriphagus".
1. PCoA plots: after running beta_diversity_through_plots.py I only got the PCoA in three dimensions, but not the 2D plots or the histograms. Also, I don't know what the percentages on each of the three axes refer to.
2. Again, I should use the normalized otu_table for the beta_diversity.py command, shouldn't I?
I ran upgma_cluster.py and got a file with a .tre extension as output, but when I open the file I just see data, not a tree at all. I don't know if I have to open it with another program or do something else, as there are no more arguments to include for that command.
1. Do they refer to the genus level? I'm trying to understand the different alpha metrics I'm working with in order to explain the diversity in the samples.
2. Can I get the statistical significance (p-value) of the alpha metrics I'm interested in?
Should I work with the normalized otu_table for the rarefaction? I'm using the non-normalized table for the taxonomy and also for the rarefaction curves, and I'm now asking myself whether it might be more correct to work with the normalized one.
In the data set I'm working with, most of the samples reach the same number of seqs (around 8000), but for some of them the curve ends at around 3500 (using the non-normalized otu table). What does that mean in terms of the quality of the sampling? Does it interfere with the statistical analysis?