Hi Andrew,
We have also found that most OTU-level abundance tables are not normally distributed. In QIIME 1.8 and above, the new script group_significance.py replaces otu_category_significance.py, and it defaults to the Kruskal-Wallis test rather than ANOVA.
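To make the idea concrete, here is a minimal sketch of a per-OTU Kruskal-Wallis test using scipy.stats.kruskal. This is not what group_significance.py does internally; the toy table, the OTU names, the group labels and the helper kruskal_per_otu are all made up for illustration, and no multiple-comparison correction is applied.

```python
import numpy as np
from scipy.stats import kruskal

# Hypothetical toy table: rows are OTUs, columns are samples.
otu_table = np.array([
    [0, 1, 0, 2, 30, 45, 28, 51],   # OTU_1: clearly different between groups
    [5, 7, 6, 4, 6, 5, 7, 4],       # OTU_2: same values in both groups
], dtype=float)
otu_ids = ["OTU_1", "OTU_2"]

# Hypothetical sample metadata: one category label per column.
groups = np.array(["control"] * 4 + ["treated"] * 4)


def kruskal_per_otu(table, labels):
    """Return one uncorrected Kruskal-Wallis p-value per row of `table`."""
    categories = np.unique(labels)
    pvals = []
    for row in table:
        # Split this OTU's abundances by sample category.
        samples = [row[labels == c] for c in categories]
        _, p = kruskal(*samples)
        pvals.append(p)
    return np.array(pvals)


pvals = kruskal_per_otu(otu_table, groups)
for otu, p in zip(otu_ids, pvals):
    print(otu, p)
```

Note that scipy's kruskal cannot handle an OTU whose abundance is identical in every sample, so you would want to filter those rows out first, and with a real table you would also want to correct the p-values for the number of OTUs tested.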
Testing for significant differences between feature means across sample classes is a fundamentally different problem from testing for differences between samples via things like beta diversity. The transformation you describe (UniFrac/Jaccard) eliminates the features and seeks to embed the samples in some other space (rather than the feature space). The reason you may not be seeing individual differences with otu_category_significance.py but do see differences with beta_diversity.py is that no single feature is discriminatory between the samples; only through a combination of your features will you see differences between your samples. If you want a more quantitative look at which features are causing the separation between your samples, I would suggest looking at the feature importance scores from the supervised_learning.py script.
If you are concerned about the normality of your data and want to transform it, you can make those transformations using the biom API and Python functions, or you can convert your table to classic format, manipulate it in Excel or R, and then convert it back. Log transforms have worked well for me as a way to normalize (casting the -inf values that result from log(0) to 0).
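As a rough sketch of the log transform described above, assuming you have already pulled the counts out of your table into a plain NumPy array (the counts array and the log_transform helper here are made up, and base 10 is just my choice for readability):

```python
import numpy as np

# Hypothetical count matrix: rows are OTUs, columns are samples.
counts = np.array([
    [0, 10, 100],
    [1, 0, 1000],
], dtype=float)


def log_transform(table):
    """Log10-transform counts, casting the -inf from log(0) to 0."""
    # Silence the divide-by-zero warning that log10(0) would raise.
    with np.errstate(divide="ignore"):
        logged = np.log10(table)
    # Replace -inf (from zero counts) with 0.
    logged[np.isneginf(logged)] = 0.0
    return logged


transformed = log_transform(counts)
print(transformed)
```

One thing to keep in mind with this approach is that a count of 1 also maps to 0, so counts of 0 and 1 become indistinguishable after the transform.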
Hope this helps,
Will