Hi,
To be slightly more precise, the literature discusses several ways to reweight well-tempered metadynamics simulations. I know of at least three.
1. Bonomi, Barducci, and Parrinello, JCC 2009. This is slightly more complicated to implement than the next two since it requires histogramming. I have never used it directly, so perhaps it is better if Max comments on it.
2. Branduardi, Bussi, and Parrinello, JCTC 2012. This prescribes using the final bias potential from metadynamics to compute the reweighting factor. This estimator can be shown to be identical to the usual formula linking V to F in the limit of infinitely narrow Gaussians (which is usually a reasonable approximation). So, agreement between this estimator and the formula using V cannot tell you anything interesting!
3. Tiwary and Parrinello, JPCB 2014. This paper gives a slightly different prescription for the reweighting factors. I tried it on model systems and the results are very similar to those of 2. However, there might be cases where 3 is better than 2 or 2 is better than 3. Since equivalence with the formula using V is no longer automatic, differences in the results could indicate a problem. I am sure Pratyush can comment more about this.
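To make approach 2 concrete, here is a minimal Python/NumPy sketch of reweighting with the final bias. It is an illustration only, not code from any of the papers or from a specific package: the function names are hypothetical, `final_bias` is assumed to be the final bias potential already evaluated at each stored frame, and the weight is assumed to be proportional to exp(V/kT), with V and kT in the same energy units.

```python
import numpy as np

def final_bias_weights(final_bias, kT):
    """Normalized weights w_i proportional to exp(V(s_i) / kT).

    final_bias: final bias potential evaluated at each stored frame
                (assumed input, same energy units as kT).
    """
    v = np.asarray(final_bias, dtype=float)
    # Subtract the maximum before exponentiating to avoid overflow;
    # the constant cancels in the normalization.
    w = np.exp((v - v.max()) / kT)
    return w / w.sum()

def reweighted_free_energy(values, weights, bins, kT):
    """Free energy along any observable from a weighted histogram."""
    hist, edges = np.histogram(values, bins=bins, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Empty bins give an infinite free energy; silence the log(0) warning.
    with np.errstate(divide="ignore"):
        f = -kT * np.log(hist)
    # Shift so that the lowest populated bin is at zero.
    f -= f[np.isfinite(f)].min()
    return centers, f
```

The observable passed as `values` does not have to be the biased collective variable. If it is the biased variable itself, the result should essentially reproduce the usual estimate from V, which is why that comparison is not informative.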
Finally, in the case of non-well-tempered metadynamics there are fewer possibilities discussed in the literature. For sure, one way to analyze bias-exchange MD is discussed in Marinelli et al., PLoS Comput Biol 2009. I am (almost) sure that if you apply the prescription there to a "single replica bias exchange" simulation (that is, a normal MetaD simulation) you would again get a result identical to the usual -V formula. So, this should be similar to approach 2 discussed above.
Certainly the methods above all work correctly for an infinitely long and converged simulation. Thus, if they give different results, this might suggest some convergence problem. However, I don't think there is any guarantee that if they agree the result is correct.
The right way to compute statistical error is not to compare the same simulation analyzed with different approaches, but rather to compare simulations that are as independent as possible (or, as in block analysis, consecutive parts of the same simulation) analyzed with the same approach.
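As an illustration of the block-analysis idea, here is a minimal Python/NumPy sketch. It assumes you already have a per-frame observable and per-frame reweighting factors from one chosen approach; the function name is hypothetical, and estimating the error from the plain scatter of the block averages is a simplifying assumption (it ignores that blocks can carry different total weight).

```python
import numpy as np

def block_error(values, weights, n_blocks):
    """Weighted mean and its error estimated from consecutive blocks."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Consecutive (not interleaved) blocks, so that each block is a
    # contiguous piece of the trajectory.
    v_blocks = np.array_split(values, n_blocks)
    w_blocks = np.array_split(weights, n_blocks)
    # Analyze every block with the same estimator.
    means = np.array([np.average(v, weights=w)
                      for v, w in zip(v_blocks, w_blocks)])
    mean = np.average(values, weights=weights)
    # Standard error of the mean from the scatter of the block averages.
    err = means.std(ddof=1) / np.sqrt(n_blocks)
    return mean, err
```

In practice one would repeat this for several values of `n_blocks`: the estimated error typically grows as the blocks get longer and levels off once they are longer than the correlation time.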