Hi,
I would like to run some basic analyses on a dataset that has already been curated.
My raw 454 sequences were already taxonomically assigned and clustered into OTUs, leaving me with a spreadsheet that lists each OTU as a number (first column), followed by the number of times that OTU was detected in each of the many environmental samples we are studying (columns 2 onward). The final column on the far right shows the taxonomic assignment obtained by BLASTing each representative sequence, which was also already done for me, written as Kingdom;Phylum;Class etc.
Example of my spreadsheet:
OTU NCarolinaBeach1 NCarolinaBeach2 NCarolinaBeach3 Taxonomy
1 346 290 130 Kingdom;Phylum;Class...
I also have a separate file listing each environmental sample (rows, example: NCarolinaBeach1) and associated environmental metadata (columns, example: temperature).
I would like to run these data through summarize_taxa_through_plots.py, but I am unsure how to create the input files.
a) How do I properly format an OTU table that incorporates not only each OTU, but also the number of times that particular OTU was found within a set of environmental samples?
b) What would my mapping (-m) file have to look like, and how can I include the metadata recorded at each environmental location/sample?
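To make questions a) and b) concrete, here is a minimal Python sketch of what I am guessing the conversion looks like. It assumes, from my reading of the QIIME documentation, that the tab-delimited OTU table wants "#OTU ID" as its first header and "taxonomy" as its last, and that the mapping file starts with "#SampleID", "BarcodeSequence" and "LinkerPrimerSequence" and ends with "Description". Those header names, the helper function names, and the temperature values are my assumptions or placeholders, not something I have confirmed, and the table may still need converting to BIOM afterwards.

```python
import csv
import io

# In-memory copy of my example spreadsheet (tab-separated here by assumption).
SPREADSHEET = (
    "OTU\tNCarolinaBeach1\tNCarolinaBeach2\tNCarolinaBeach3\tTaxonomy\n"
    "1\t346\t290\t130\tKingdom;Phylum;Class...\n"
)


def to_classic_otu_table(text):
    """Rename the first and last header cells; counts are left untouched."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    header[0] = "#OTU ID"    # assumed header name
    header[-1] = "taxonomy"  # assumed header name
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


def to_mapping_file(sample_metadata):
    """Build a mapping file from {sample: {metadata_name: value}}."""
    fields = sorted({name for meta in sample_metadata.values() for name in meta})
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Assumed layout: three fixed leading columns, metadata, Description last.
    writer.writerow(
        ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence"]
        + fields
        + ["Description"]
    )
    for sample, meta in sample_metadata.items():
        # Barcode and primer are left empty because my data are already demultiplexed.
        values = [str(meta.get(name, "NA")) for name in fields]
        writer.writerow([sample, "", ""] + values + [sample])
    return out.getvalue()


# Placeholder metadata; the temperatures are invented for illustration only.
METADATA = {
    "NCarolinaBeach1": {"temperature": 20.0},
    "NCarolinaBeach2": {"temperature": 21.0},
    "NCarolinaBeach3": {"temperature": 22.0},
}

otu_table_text = to_classic_otu_table(SPREADSHEET)
mapping_text = to_mapping_file(METADATA)
print(otu_table_text)
print(mapping_text)
```

If this layout is wrong, knowing which headers or columns need to change would already answer most of my question.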
For starters, I wish to see bar graphs showing the taxonomic makeup as a percentage (y-axis) within each environmental sample (x-axis).
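In case it helps to show what I mean by "percentage", this is roughly the calculation I have in mind, written as a small Python sketch rather than anything QIIME-specific. The first row uses the counts from my example; the second row and the taxon labels "TaxonA" and "TaxonB" are invented so that the arithmetic has something to work with, and the function name is hypothetical.

```python
from collections import defaultdict


def percent_by_taxon(samples, rows):
    """Return {sample: {taxon: percent of that sample's total reads}}.

    rows is a list of (taxon, counts) pairs, where counts lines up with samples.
    OTUs sharing the same taxon label are summed before dividing.
    """
    totals = [0] * len(samples)
    per_taxon = defaultdict(lambda: [0] * len(samples))
    for taxon, counts in rows:
        for i, count in enumerate(counts):
            per_taxon[taxon][i] += count
            totals[i] += count
    result = {}
    for i, sample in enumerate(samples):
        if totals[i] == 0:
            continue  # skip samples with no reads to avoid dividing by zero
        result[sample] = {
            taxon: 100.0 * values[i] / totals[i]
            for taxon, values in per_taxon.items()
        }
    return result


SAMPLES = ["NCarolinaBeach1", "NCarolinaBeach2", "NCarolinaBeach3"]
ROWS = [
    ("TaxonA", [346, 290, 130]),  # counts from my example row
    ("TaxonB", [654, 710, 870]),  # invented counts, for illustration only
]

percentages = percent_by_taxon(SAMPLES, ROWS)
for sample in SAMPLES:
    print(sample, percentages[sample])
```

Each sample's values add up to 100, and those are the numbers I would like stacked on the y-axis for each sample along the x-axis.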
Thanks in advance!
All the best,
Ed