how is batch normalization layer working in multi-gpu mode?


Etienne Perot

Feb 23, 2016, 10:47:06 AM
to Caffe Users
Hello,

I would like to know how the BN layer accumulates mean and variance statistics in multi-GPU mode, and whether the same principle can be used when batches are too small, e.g. in low-memory-budget scenarios like fully-convolutional mode. I am assuming gradient accumulation would not do the job, since it is not analytically equivalent.
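To see why averaging per-GPU statistics (or accumulating gradients over small sub-batches) is not analytically equivalent to computing statistics over the full batch, here is a minimal NumPy sketch. The data is hypothetical; the point is the law of total variance: the full-batch variance equals the average of the per-device variances plus the variance of the per-device means, so naively averaging per-device variances underestimates the true batch variance whenever the sub-batch means differ.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two equal-sized "GPU" sub-batches of activations for one channel
# (hypothetical data for illustration).
a = rng.normal(0.0, 1.0, 8)
b = rng.normal(3.0, 2.0, 8)
full = np.concatenate([a, b])

# Per-device statistics, then averaged (what independent BN per GPU sees).
mean_avg = (a.mean() + b.mean()) / 2
var_avg = (a.var() + b.var()) / 2

# True statistics of the combined batch (what a synchronized BN would use).
mean_full = full.mean()
var_full = full.var()

# Means agree for equal-sized sub-batches, but variances do not:
# Var(full) = mean of per-device variances + variance of per-device means.
print(np.isclose(mean_avg, mean_full))  # True
print(np.isclose(var_avg, var_full))    # False in general
print(np.isclose(var_full, var_avg + np.var([a.mean(), b.mean()])))  # True
```

This is why some frameworks offer a "synchronized" BN that all-reduces the sufficient statistics (sum and sum of squares) across devices before normalizing, rather than averaging per-device results.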

Prasanna S

Feb 10, 2017, 2:22:36 PM
to Caffe Users
I don't think mean and variance are learnable parameters, so the solver's update might not be called on them at all. I am also wondering about the same thing. If you find something, do let us know.
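A minimal sketch of the point above: BN's stored mean and variance are typically updated by an exponential moving average inside the forward pass, not by the solver's gradient step, so no "update" in the learnable-parameter sense is ever applied to them. The momentum value and class name here are illustrative assumptions, not Caffe's actual implementation.

```python
import numpy as np

class BNRunningStats:
    """Sketch of BN running statistics (assumed moving-average scheme;
    not Caffe's actual code). mean/var are state, not learnable params."""

    def __init__(self, momentum=0.999, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, x, training=True):
        if training:
            m, v = x.mean(), x.var()
            # Updated by a moving average during the forward pass; the
            # solver never computes or applies a gradient for these.
            self.running_mean = (self.momentum * self.running_mean
                                 + (1 - self.momentum) * m)
            self.running_var = (self.momentum * self.running_var
                                + (1 - self.momentum) * v)
        else:
            # At test time, normalize with the accumulated statistics.
            m, v = self.running_mean, self.running_var
        return (x - m) / np.sqrt(v + self.eps)
```

Only the optional scale/shift (gamma/beta) of batch normalization are learnable in the usual sense; the mean/variance blobs are just accumulated state.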