Hi Qiime,
I got the output from assign_taxonomy.py and am trying to make sense of what the 3rd and 4th columns mean. I read http://qiime.org/scripts/assign_taxonomy.html and learned that:
The output of this step is an observation metadata mapping file of input sequence identifiers (1st column of output file) to taxonomy (2nd column) and quality score (3rd column). There may be method-specific information in subsequent columns.
I used the default classification method, UCLUST, but I could not find any explanation for the last two numbers on each line of the results (the 3rd and 4th columns):
denovo36730 k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Oscillospira; s__ 1.00 3
denovo36737 Unassigned 1.00 1
denovo36736 Unassigned 1.00 1
denovo36735 Unassigned 1.00 1
denovo36734 Unassigned 1.00 1
denovo36739 Unassigned 1.00 1
denovo36738 Unassigned 1.00 1
denovo35282 Unassigned 1.00 1
I can only guess that the 3rd column is either the
--min_consensus_fraction
Minimum fraction of database hits that must have a specific taxonomic assignment to assign that taxonomy to a query, only used for sortmerna and uclust methods [default: 0.51]
or
--similarity
Minimum percent similarity (expressed as a fraction between 0 and 1) to consider a database match a hit, only used for sortmerna and uclust methods [default: 0.9]
Since all the numbers are above 0.51 (some of mine are 0.67), I am putting my money on the first one, --min_consensus_fraction. Was my guess correct? And if so, what does the second number (the 4th column) mean? I even looked on the UCLUST website, but it is very hard to find any information there about the parameters as QIIME uses them.
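One way to check a guess like this is to tabulate which values actually occur in the last two columns. Below is a minimal Python sketch of that, not anything taken from QIIME itself. The function names parse_assignment_line and summarize are hypothetical, and the code assumes each line has an ID, a taxonomy string, and two trailing numbers, separated by tabs (or by plain whitespace, as in the pasted lines above). It makes no claim about what the columns mean; it only counts their values.

```python
from collections import Counter


def parse_assignment_line(line):
    """Split one output line into (seq_id, taxonomy, col3, col4).

    Assumption: four fields per line. If the line contains tabs, split on
    tabs; otherwise treat the first token as the ID, the last two tokens as
    the numeric columns, and everything in between as the taxonomy string.
    """
    line = line.rstrip("\n")
    if "\t" in line:
        seq_id, taxonomy, col3, col4 = line.split("\t")[:4]
    else:
        head, col3, col4 = line.rsplit(None, 2)
        seq_id, taxonomy = head.split(None, 1)
    return seq_id, taxonomy.strip(), float(col3), int(col4)


def summarize(lines):
    """Count how often each value appears in the 3rd and 4th columns."""
    col3_counts = Counter()
    col4_counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, _, col3, col4 = parse_assignment_line(line)
        col3_counts[col3] += 1
        col4_counts[col4] += 1
    return col3_counts, col4_counts


# A few of the lines pasted above, used as sample input.
sample = [
    "denovo36730 k__Bacteria; p__Firmicutes; c__Clostridia; "
    "o__Clostridiales; f__Ruminococcaceae; g__Oscillospira; s__ 1.00 3",
    "denovo36737 Unassigned 1.00 1",
    "denovo36736 Unassigned 1.00 1",
]

col3_counts, col4_counts = summarize(sample)
print("3rd column values:", dict(col3_counts))
print("4th column values:", dict(col4_counts))
```

Running the same summary over the full output file (for example by passing an open file handle to summarize) would show whether any 3rd-column value ever falls below 0.51 or 0.9, and whether the 4th column is always a small integer.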
If you have any idea what those numbers are, please let me know. Thanks a lot!
Huaiying