On Tue, Aug 25, 2015 at 3:56 AM, Amogh Gudi <
klr...@gmail.com> wrote:
> Just to let you guys know, there is one important difference in the Mean
> Squared Error computation in "dataset_y_mse" channel of the LinearGaussian
> layer, and in the "dataset_objective" of the Linear layer (with
> use_abs_loss=false) for multivariate (multiple output labels) regression:
>
> In the LinearGaussian layer, the MSE is computed correctly as the squared
> difference between prediction and target labels averaged over all labels in
> one example, averaged over the whole dataset.
> rval['mse'] = T.sqr(state - targets).mean()
>
> In the Linear layer, the cost function (which I assume is trying to calculate
> MSE), is computed as the squared difference between prediction and target
> labels summed over all labels in one example, and then averaged over the
> whole dataset.
> T.sqr(Y - Y_hat).sum(axis=1).mean()
>
> I don't think what the Linear layer implements is standard. It doesn't
> really fit the definition of Sum Squared Error (which is summed over data
> samples, not over labels). Also, this computed error depends on the number
> of labels present in an example, which is annoying.
>
> Any comments?
The cost in the Linear layer is the negative log likelihood of the targets
under a Gaussian distribution with variance 1. This corresponds to a sum
across outputs and a mean across examples. It is annoying that the magnitude
of the cost depends on the number of outputs, so you do need to make sure
it's scaled correctly compared to other costs.
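For anyone following along, here is a small NumPy sketch (not from the
original code, just an illustration with made-up variable names) showing
that the two formulas above differ by exactly the number of labels per
example, and why the sum-across-outputs form matches a unit-variance
Gaussian log likelihood up to constants:

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples, n_labels = 4, 3
Y = rng.normal(size=(n_examples, n_labels))      # targets
Y_hat = rng.normal(size=(n_examples, n_labels))  # predictions

sq = (Y - Y_hat) ** 2

# LinearGaussian-style MSE: mean over all elements (labels and examples),
# mirroring T.sqr(state - targets).mean().
mse = sq.mean()

# Linear-style cost: sum over labels, then mean over examples,
# mirroring T.sqr(Y - Y_hat).sum(axis=1).mean().
cost = sq.sum(axis=1).mean()

# The two differ exactly by the number of labels per example.
assert np.isclose(cost, n_labels * mse)

# Under a Gaussian with variance 1, -log p(y | y_hat) for one example is
# 0.5 * sq.sum(axis=1) plus a constant, so the Linear cost is (up to the
# 0.5 factor and constants) the mean negative log likelihood.
```

So neither form is wrong per se; they just answer different questions, and
the sum-over-labels form needs rescaling if it is combined with other costs.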