First of all, thank you for this software. It is great, exactly what I need to show everyone at my think tank how to run simulations. It has the essential core functionality and can be installed without admin privileges.
However, the Hypothesis Test function is flawed and basically unusable. The problem is that it treats each additional recalculation as an actual new sample. So if you run a small number of simulations, it reports large p-values, but if you run a large number of simulations, it will always report absurdly tiny p-values, even for output variables that are extremely close together. But clearly, the question of whether two output variables are significantly different should not depend on the number of simulations you ran.
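To illustrate the effect (this is a minimal sketch, not the tool's actual code: `welch_p_value` is a hypothetical helper using a normal approximation to a two-sided Welch t-test, and the simulated "outputs" are just Gaussian draws whose true means differ by a practically negligible 0.01):

```python
import math
import random

def welch_p_value(xs, ys):
    """Two-sided Welch t-test p-value via the normal approximation
    (adequate for the large sample sizes used below)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    t = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # two-sided p from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

random.seed(1)
pvals = {}
# Two outputs whose true means differ by a negligible 0.01 (sd = 1):
# the same tiny difference flips from "not significant" to
# "overwhelmingly significant" purely by raising the run count.
for n in (100, 10_000, 1_000_000):
    a = [random.gauss(0.00, 1) for _ in range(n)]
    b = [random.gauss(0.01, 1) for _ in range(n)]
    pvals[n] = welch_p_value(a, b)
    print(f"n = {n:>9}: p = {pvals[n]:.3g}")
```

With enough runs, every fixed nonzero difference eventually becomes "significant", which is exactly the behaviour I am seeing: the test answers "did I run enough iterations?" rather than "are these outputs practically different?".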
In the attached sheet, Output 1 is 100 samples and Output 2 is 1000 samples drawn from the exact same input distributions. The output means are (by design) so close that anyone can see there is no meaningful difference, and the confidence intervals lie almost exactly on top of each other. And yet the tool reports a highly significant p-value for the 1000-sample run.
For future releases, please remove this misleading feature or replace it with something that compares the means and variances of the simulation outputs and gives the same result regardless of the chosen number of simulation runs.