Hi Brent,
First of all, analyzing the results of even a single test is non-trivial. If you are running a user-level test, a few heavy users can have a disproportionate impact on the test statistics. You can address this by repeatedly re-assigning users to the groups at random and recording how much the statistic varies (essentially a permutation test). For simple metrics such as visits and clicks, this variation can be calculated in closed form: the statistic is approximately Gaussian, so the p-value can be looked up from the observed effect size. For more complicated metrics such as CTR, you can either use an approximation or rely on simulation.
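To make the two routes above concrete, here is a rough Python sketch rather than a definitive implementation. It assumes each user is recorded as a (clicks, views) pair and that CTR means pooled clicks divided by pooled views. The function names (ctr, permutation_p_value, z_test_p_value) and the data in the demo are made up for illustration.

```python
import random
from statistics import NormalDist, mean, variance


def ctr(users):
    """Pooled CTR over a list of (clicks, views) pairs, one pair per user."""
    clicks = sum(c for c, _ in users)
    views = sum(v for _, v in users)
    return clicks / views if views else 0.0


def permutation_p_value(group_a, group_b, metric, n_perm=2000, seed=0):
    """Simulation route: shuffle users between groups and see how often the
    shuffled difference is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = metric(group_b) - metric(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random re-assignment of whole users
        diff = metric(pooled[n_a:]) - metric(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


def z_test_p_value(a, b):
    """Closed-form route for a simple per-user metric (e.g. visits per user):
    two-sided p-value from a Gaussian approximation with unequal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(b) - mean(a)) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Synthetic data, invented purely for the demo: (clicks, views) per user.
    rng = random.Random(42)
    control = [(rng.randint(0, 3), rng.randint(5, 50)) for _ in range(200)]
    variant = [(rng.randint(0, 4), rng.randint(5, 50)) for _ in range(200)]
    print(permutation_p_value(control, variant, ctr))
    print(z_test_p_value([v for _, v in control], [v for _, v in variant]))
```

Shuffling whole users, rather than individual clicks or views, is what keeps a heavy user's activity together, so the extra variance they introduce shows up in the simulated distribution.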
Coming back to your original question: n-way ANOVA, which is implemented in almost all statistical packages, should help you analyze your data. Bayesian methods can be sensitive to the choice of prior parameters, which may bias the results.
-Manju