batch norm


Emanuele Dalmasso

May 14, 2019, 8:00:50 AM
to DyNet Users
I implemented a batch norm layer for my network (in C++).
It seems to work fine in training, but now I have to save the batch statistics for inference.
At each batch I load a parameter into an expression and update it with an IIR filter (an exponential moving average, like the PyTorch default).
The problem is that I do not know the best way to write the final expression back into the parameter inside the model.
So basically I have a code snippet like the following:

 dynet::Expression oldRunningMean = parameter(cg, m_runningMean);
 dynet::Expression newRunningMean = oldRunningMean * (1 - batch_running_momentum) + mu * batch_running_momentum;

where m_runningMean is a dynet::Parameter
and mu is an expression holding the current batch mean.

The following line makes it possible to save the expression's value back into the parameter:
 m_runningMean.set_value(as_vector(newRunningMean.value()));

The problem is that I have to collect all the expressions whose values I want to read back, and make this call after the forward pass (to be efficient, I cannot make these calls during the graph-building phase).
This is not elegant, given that I have multiple classes representing different layer types, and I have to cycle through all of them to make this call.
Is there a different way of doing this? Something like setting a value on the parameter that is evaluated only at forward (or backward) time?




Graham Neubig

May 14, 2019, 9:15:53 AM
to Emanuele Dalmasso, DyNet Users
I don't think there's an easier way to do this, but what I would do myself is create a wrapper class, let's say `BatchNormWrapper`, with a static variable `instances` that keeps track of all instantiated batch norm layers. Then, when you need to update the running averages, you'd call `BatchNormWrapper::update_running_averages()`, which would loop through `instances` and update all of the running averages.
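
A rough, untested sketch of what that could look like (the constructor and the `track` method are just illustrative names; the only DyNet calls are `parameter`, `as_vector`, and `set_value`, taken from your snippet):

 #include <vector>
 #include <dynet/dynet.h>
 #include <dynet/expr.h>

 class BatchNormWrapper {
 public:
   BatchNormWrapper(dynet::Parameter running_mean, float momentum)
       : m_runningMean(running_mean), m_momentum(momentum) {
     // register every batch norm layer as it is constructed
     instances.push_back(this);
   }

   // called during graph building: builds the updated running-mean
   // expression and remembers it for later
   void track(dynet::ComputationGraph& cg, const dynet::Expression& mu) {
     dynet::Expression old_mean = dynet::parameter(cg, m_runningMean);
     m_newRunningMean = old_mean * (1 - m_momentum) + mu * m_momentum;
   }

   // called once after the forward pass, before the graph is renewed:
   // copies every layer's updated statistics back into its Parameter
   static void update_running_averages() {
     for (BatchNormWrapper* bn : instances)
       bn->m_runningMean.set_value(
           dynet::as_vector(bn->m_newRunningMean.value()));
   }

 private:
   dynet::Parameter m_runningMean;
   float m_momentum;
   dynet::Expression m_newRunningMean;
   static std::vector<BatchNormWrapper*> instances;
 };

 std::vector<BatchNormWrapper*> BatchNormWrapper::instances;

Each layer would call `track` wherever it currently builds newRunningMean, and the training loop would call `BatchNormWrapper::update_running_averages()` once right after the forward pass. Note that the stored expressions are only valid until the computation graph is renewed, so the update has to happen before that.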

Graham


Emanuele Dalmasso

May 14, 2019, 10:27:14 AM
to DyNet Users
Yes, a static list will probably be the best approach in this situation.
I just wanted to be sure the functionality wasn't already implemented; now I am sure.

Thank you very much,
Emanuele.
