Hi Allison,
Nothing special about my mapping file. It seems possible that something on the way into phyloseq won't agree with all of the conventions found in qiime mapping files. From the QIIME page on mapping files:
All metadata must be composed of only alphanumeric, underscore (“_”), period (“.”), minus sign (“-”), plus sign (“+”), percentage (“%”), space (“ ”), semicolon (“;”), colon (“:”), comma (“,”), and/or forward slash (“/”) characters. For missing data, write “NA”; do not leave blanks.
When I get to my other computer I'll try to remember to post a few lines anyway.
*****
I love QIIME for all of its rather simple automation, but I still don't love the plots or the customizations available for output. For things I want to get out in the near future, I have settled on Veusz for actually making plots (open source, cross platform, yes!! http://home.gna.org/veusz/). For colors, I am using the colorblind-friendly palette suggested by Wong (2011) after learning that several of my close colleagues have color-deficient vision. Really changes the way you think about trying to communicate your data...
Wong, B. (2011). Color blindness. Nature Methods, 8(6), 441–441. doi:10.1038/nmeth.1618
Also there are scripts for making taxa plots in ggplot (I hate the slashes through the legend) and for running ancom.R without the shiny interface, or for converting a text-based OTU table for use in shiny.
I didn't like the phyloseq or ggplot output enough to document it (yet), but I have found the ancom scripts useful and have incorporated them into my wiki on GitHub:
https://github.com/alk224/akutils-v1.2/wiki
If you want to play on your own with moving things into R, you probably already know that some packages simply don't like the way an OTU table or a QIIME mapping file is constructed. I used to use a series of awk and sed commands to shape the data as necessary, but that can be really tedious. I have found the program datamash (https://www.gnu.org/software/datamash/) to be very helpful in command-line manipulation of text files, and it has many other functions (including an example involving genome analysis at the homepage).
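If you need to parse a mapping file for a column by name (as you might when scripting with variables), you can use this one-liner to store the column number in the variable $factorcol. It checks only the header line and splits on tabs, so values containing spaces or matching the column name further down the file don't confuse it:
factorcol=$(awk -F'\t' -v factor="$factor" 'NR==1 { for (i=1; i<=NF; i++) if ($i == factor) { print i; exit } }' "$map")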
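As a rough usage sketch (not taken from my scripts), here is how that column number can be used to pull the values out with cut. The mapping file, sample names and the "Treatment" column are all made up for illustration, and the lookup shown is a version restricted to the header line and split on tabs, which assumes a tab-delimited mapping file.

```shell
# Build a tiny, made-up mapping file (tab-delimited, QIIME-style header).
map=$(mktemp)
printf '#SampleID\tBarcodeSequence\tTreatment\tDescription\n' > "$map"
printf 'S1\tAACC\tcontrol\tfirst sample\n' >> "$map"
printf 'S2\tGGTT\tburned\tsecond sample\n' >> "$map"

# Hypothetical column name to look up.
factor="Treatment"

# Find the column number of $factor, looking at the header line only.
factorcol=$(awk -F'\t' -v factor="$factor" 'NR==1 { for (i=1; i<=NF; i++) if ($i == factor) { print i; exit } }' "$map")

# Use the column number to extract the values, skipping the header line.
# cut splits on tabs by default, which matches the mapping file format.
factorvalues=$(cut -f "$factorcol" "$map" | tail -n +2)

echo "$factor is column $factorcol"
echo "$factorvalues"
```

If $factor is not present in the header, $factorcol comes back empty, so it is worth testing for that before passing it on to cut.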
Finally, I mentioned above that I'm not super happy with the phyloseq output. To be clear, phyloseq is great. I really like it, and that group does a super job at documentation and updates and communicating. It's R that I struggle with more. From the ggplot legend slashes (I have found no way to remove them) to more difficult issues where one command will pass a variable while another will not, or more complicated problems such as controlling the order of repeating color units in taxa plots, R feels to me like a place to be if you want to spend all of your time entering almost the same command. I realize that QIIME was trying to solve this issue by making customizable plots via matplotlib, but I'm just not willing to use them in publication either. I aim to become proficient at matplotlib (or some derivative) in the future because this package really seems to have the ultimate flexibility, but I am enjoying Veusz in the meantime. Because it has its own command-line interface, I will probably try to create some templates and see if I can use the command line to script some automated plots to life.
For everyone else, maybe you will find my script code useful in generating your own plots in R. Most of my scripts are written such that a bash script (.sh) is called, which then calls the R script (.r) via Rscript once all the major text formatting steps are completed. Maybe you will find that pattern useful as well, since it requires passing variables from script to script.
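To give an idea of the pattern, here is a stripped-down sketch of a bash wrapper that does a text formatting step and then hands variables to an R script. This is not one of my actual scripts: the file names, the "Treatment" factor and the summarize.r script are all invented for the example. The R side picks up the arguments with commandArgs(trailingOnly = TRUE).

```shell
# Hypothetical working directory and inputs, invented for this example.
workdir=$(mktemp -d)
map="$workdir/map.txt"
factor="Treatment"

# A tiny, made-up tab-delimited mapping file.
printf '#SampleID\tTreatment\nS1\tcontrol\nS2\tburned\n' > "$map"

# Text formatting step: drop the leading "#" from the header line
# so the first column is simply called SampleID once it is in R.
cleanmap="$workdir/map.clean.txt"
sed '1s/^#//' "$map" > "$cleanmap"

# Minimal R script that reads the variables passed on the command line.
cat > "$workdir/summarize.r" <<'EOF'
args <- commandArgs(trailingOnly = TRUE)
map <- read.delim(args[1], check.names = FALSE)
factor_name <- args[2]
# Count samples per level of the chosen factor.
print(table(map[[factor_name]]))
EOF

# Pass the cleaned file and the factor name to R, if Rscript is on the PATH.
if command -v Rscript >/dev/null 2>&1; then
    Rscript "$workdir/summarize.r" "$cleanmap" "$factor"
else
    echo "Rscript not found; skipping the R step" >&2
fi
```

Keeping the text shaping in bash and the plotting or statistics in R means the R script only ever sees files in the layout it expects, and everything that changes between runs arrives as a command-line argument.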