How does the batch normalization layer work in multi-GPU mode?
Etienne Perot
Feb 23, 2016, 10:47:06 AM
to Caffe Users
Hello,
I would like to know how the BN layer accumulates mean and variance statistics in multi-GPU mode, and whether the same principle could be used when batches are too small, e.g. in low-budget scenarios like fully-convolutional mode. I am assuming gradient accumulation wouldn't do the job, since it is not analytically equivalent.
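To illustrate why per-GPU statistics are not analytically equivalent to full-batch statistics, here is a small NumPy sketch (the data and the two-way split are made up for illustration): averaging the variances each GPU computes over its own sub-batch is not the same as the variance of the whole batch, because it drops the spread between the sub-batch means.

```python
import numpy as np

rng = np.random.default_rng(0)
# one batch of 8 activation values, split across 2 hypothetical GPUs
x = rng.normal(loc=3.0, scale=2.0, size=8)
g0, g1 = x[:4], x[4:]

# each GPU computes BN statistics over its own sub-batch only
mean_per_gpu = (g0.mean() + g1.mean()) / 2
var_per_gpu = (g0.var() + g1.var()) / 2

# statistics over the full batch
full_mean, full_var = x.mean(), x.var()

# for equal splits the averaged means match the full-batch mean,
# but averaging sub-batch variances ignores the variance *between*
# the sub-batch means, so the variances differ
print(np.isclose(mean_per_gpu, full_mean))  # True
print(np.isclose(var_per_gpu, full_var))
```

By the law of total variance, the full-batch variance equals the average within-sub-batch variance plus the variance of the sub-batch means, which is why a synchronized reduction across GPUs is needed to recover the exact full-batch statistics.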
Prasanna S
Feb 10, 2017, 2:22:36 PM
to Caffe Users
I don't think mean and variance are learnable parameters, so the gradient update might not be called on them. I am also wondering about the same thing. If you find something, do let us know.
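That matches how BN running statistics typically work: they are accumulated with a moving average during the forward pass rather than updated by gradients. A minimal NumPy sketch of that idea (the class name and momentum value are illustrative, not Caffe's actual code):

```python
import numpy as np

class RunningStats:
    """Sketch: BN running mean/variance are maintained by an
    exponential moving average, not by gradient descent."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, batch):
        m, v = batch.mean(), batch.var()
        # update running statistics; no gradient is computed for these
        self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * m
        self.running_var = self.momentum * self.running_var + (1 - self.momentum) * v
        # training-time normalization uses the *batch* statistics
        return (batch - m) / np.sqrt(v + 1e-5)

rng = np.random.default_rng(0)
stats = RunningStats()
for _ in range(100):
    stats.forward(rng.normal(2.0, 1.0, size=32))
# after many batches, running_mean/running_var approach the
# data's true mean (2.0) and variance (1.0)
```

At inference time the layer would normalize with `running_mean` and `running_var` instead of the current batch's statistics, which is why they must be stored even though they receive no gradient.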