setting weights in ParallelCriterion

Qingnan Fan

Aug 23, 2016, 5:13:46 AM

to torch7

Hi all,

I just looked at the code in ParallelCriterion.lua, and I'd like to know how the weights in ParallelCriterion influence the convergence of the training process. Do the weights need to sum to 1?

I know the weights influence the output (the weighted sum of all the criterion outputs) and the gradInput (each criterion's normal gradInput multiplied by its weight). So if a weight is < 1, the gradInput is smaller than the normally computed one. The only thing that matters for convergence is the gradInput, but if the gradInput is smaller, is that bad for convergence?
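To make the two effects described above concrete, here is a minimal sketch in plain Python. It is not the Torch code itself: the function names are made up for illustration, and a mean-squared-error criterion averaged over elements is assumed for every branch.

```python
# Illustrative sketch only: hypothetical helper names, MSE assumed per branch.

def mse(pred, target):
    """Mean squared error of one branch."""
    n = len(pred)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / n

def mse_grad(pred, target):
    """Gradient of the MSE above with respect to pred."""
    n = len(pred)
    return [2.0 * (p - t) / n for p, t in zip(pred, target)]

def parallel_forward(preds, targets, weights):
    """Output: weighted sum of the per-branch losses."""
    return sum(w * mse(p, t) for p, t, w in zip(preds, targets, weights))

def parallel_backward(preds, targets, weights):
    """gradInput: each branch's own gradient multiplied by its weight."""
    return [[w * g for g in mse_grad(p, t)]
            for p, t, w in zip(preds, targets, weights)]

preds = [[1.0, 2.0], [0.0]]
targets = [[0.0, 0.0], [1.0]]
weights = [0.5, 2.0]

loss = parallel_forward(preds, targets, weights)
grads = parallel_backward(preds, targets, weights)
```

With a weight of 0.5 the first branch's gradient is halved, and with a weight of 2.0 the second branch's gradient is doubled, relative to the unweighted criteria.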

Qingnan Fan

Aug 26, 2016, 5:04:23 AM

to torch7

Nobody replied, so let me answer it myself.

1. The weights don't need to sum to 1.

2. The overall scale of the weights only influences the speed of convergence. The ratio between the weights of the different criterions does matter, though, since it determines the direction in which the network is trained.
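A toy check of both points, as a sketch under stated assumptions: plain gradient descent (no weight decay, no adaptive optimizer) on a single scalar parameter pulled toward two targets by two squared-error losses. The `sgd` helper and the numbers are made up for illustration.

```python
import math

def sgd(weights, lr, steps=200, theta=10.0, targets=(0.0, 4.0)):
    """Minimize sum_i w_i * (theta - target_i)^2 with plain gradient descent."""
    for _ in range(steps):
        grad = sum(w * 2.0 * (theta - t) for w, t in zip(weights, targets))
        theta -= lr * grad
    return theta

# Halving every weight has the same effect as halving the learning rate.
a = sgd(weights=(0.5, 0.5), lr=0.1, steps=3)
b = sgd(weights=(1.0, 1.0), lr=0.05, steps=3)
same_trajectory = math.isclose(a, b, rel_tol=1e-9)

# Changing the ratio moves the solution itself:
# the minimizer is the weighted mean of the two targets.
equal_ratio = sgd(weights=(1.0, 1.0), lr=0.05)   # converges toward 2.0
skewed_ratio = sgd(weights=(3.0, 1.0), lr=0.05)  # converges toward 1.0
```

Under these assumptions a common scale factor is interchangeable with the learning rate, while the ratio decides where training ends up. With adaptive optimizers or weight decay the equivalence with the learning rate no longer holds exactly.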

Vislab

Aug 26, 2016, 5:39:29 AM

to torch7

1. It's always best practice to normalize the weights so that they sum to 1.
2. The ratios between the weights should be treated as another hyperparameter of your optimization problem and tuned to achieve the best performance.
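As a small sketch of the normalization suggested above (the `normalize` helper is hypothetical, written in plain Python): dividing each weight by the total makes the weights sum to 1 while leaving the ratios between them unchanged.

```python
def normalize(weights):
    """Rescale weights so they sum to 1, preserving their ratios."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

raw = [1.0, 3.0]             # ratio 1:3, chosen arbitrarily for illustration
normalized = normalize(raw)  # still ratio 1:3, now summing to 1
```

Only the ratio (here 1:3) is left to tune as a hyperparameter; the overall scale is fixed by the normalization.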
