I am using the "Phylogenetic Distribution of Metagenomes" function on IMG/MER.
1> What cutoff should I normally choose: 30+%, 60%, or 90%?
2> The website says "phylogenetic distribution of genes for selected
metagenomes". Here, does "distribution of genes" mean all functional
genes, or just phylogenetic marker genes such as 16S rRNA?
3> When the genes are compared, which database do you use? COG? KO? What is the default database?
4> Also, when I selected "Estimated gene copies", the output on the
website shows gene numbers like 38(36). Which one is the raw gene count
and which one is the estimated gene count? However, when I download the
Excel sheet (see my attachment), there are no numbers inside the
parentheses. Whether I select "estimated gene copies" or "gene count",
I get the same results.
5> How does the website calculate the relative abundance (the percentage
for each taxonomic group)? For each sample, I manually divided the
number of genes found in a phylum by the total number of genes, but I
got a different percentage from the one shown on the website.
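To make the manual calculation concrete, it is roughly the following sketch. The phylum names and counts are made up for illustration and are not my real data, and the function name is my own.

```python
def relative_abundance(counts):
    """Return each taxon's share of the total gene count, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total gene count is zero")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical gene counts per phylum for one sample (not real data).
sample = {"Proteobacteria": 50, "Firmicutes": 30, "Unassigned": 20}

percentages = relative_abundance(sample)
for phylum, pct in percentages.items():
    print(f"{phylum}: {pct:.1f}%")
```

Here the denominator is the sum over every row, including unassigned genes. I do not know whether the website uses the same denominator, so that may be where the difference comes from.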
The last question is about "Genome Clustering".
For each method, such as PCA, PCoA, and nMDS, can you tell me which
distance matrix it uses? Is it Bray-Curtis or some other measure?
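To be clear about what I mean by Bray-Curtis, here is a minimal sketch of the standard dissimilarity between two samples. The count vectors are hypothetical, and I am not assuming this is what the website actually computes.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("samples must have the same number of taxa")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical gene counts for the same three taxa in two samples.
sample_a = [10, 20, 30]
sample_b = [20, 20, 20]

d = bray_curtis(sample_a, sample_b)
print(f"Bray-Curtis dissimilarity: {d:.4f}")
```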
Thanks,
Ben