Training Code Improvement

Bartosz Ludwiczuk

Mar 8, 2016, 9:11:41 AM
to CMU-OpenFace
Hi,
I see, Brandon, that you have no plans for big improvements to the OpenFace training code, but I have some ideas in mind.
The main aim is clearer code, faster training, and easier addition of data augmentation techniques. Most of the proposals are based on https://github.com/facebook/fb.resnet.torch, which I think is better than https://github.com/soumith/imagenet-multiGPU.torch.

So here are my proposals (a rough sketch of a few of them follows the list):
- use "cudnn.convert" instead of the "nn_to_cudnn" function
- use ":clearState()" instead of "sanitize"
- use the "shareGradient" idea from fb.resnet (it reduces GPU memory consumption in models that use Concat modules, enabling a bigger batch size)
- add in-place transfer functions (e.g. nn.ReLU(true)), which reduce memory consumption too
- add multi-GPU support, like in fb.resnet (they achieve near-linear scaling with 4 GPUs, 3.8x faster; much better than the multi-GPU in imagenet-multiGPU.torch)
- add the Transform class from fb.resnet for data augmentation
(Note: there is no clearState() call in fb.resnet. I tried to use it and it caused out-of-memory after each epoch. I think that was because it clears the gradInput tensors, which are shared by "shareGradient". We need to find a nice way to use it.)
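
A minimal sketch of what the replacements could look like (this assumes cutorch, cunn, and cudnn.torch are installed; the layer sizes and the 4-GPU device list are just for illustration):

    require 'cunn'
    require 'cudnn'

    local model = nn.Sequential()
    model:add(nn.SpatialConvolution(3, 16, 3, 3, 1, 1, 1, 1))
    model:add(nn.ReLU(true))   -- in-place: output overwrites the input buffer
    model:cuda()

    -- cudnn.convert swaps nn modules for their cudnn equivalents in
    -- place, replacing a hand-written nn_to_cudnn walk over the tree.
    cudnn.convert(model, cudnn)

    -- DataParallelTable splits each mini-batch along dimension 1
    -- across the listed GPUs, like the fb.resnet multi-GPU setup.
    local dpt = nn.DataParallelTable(1):add(model, {1, 2, 3, 4})

    -- clearState() drops output/gradInput buffers before serializing,
    -- replacing sanitize(); with gradInputs shared by "shareGradient"
    -- this needs care, per the note above.
    model:clearState()
    torch.save('model.t7', model)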


There are two ways to use these:
- integrate them into our code (I think that besides the data loading, which is different, the other pieces are pretty easy to integrate)
- migrate to the fb.resnet training code. This needs much more work, like:
  • triplet selection
  • triplet choosing
  • something like the OpenFaceOptim class
but it has pros too, like:
  • reading images from disk (as it is done now) or from a .t7 file (like cifar10)
  • faster image reading (based on the data-loading times in fb.resnet and imagenet-multiGPU.torch)
What do you think about it?

Brandon Amos

Mar 9, 2016, 2:03:59 PM
to Bartosz Ludwiczuk, CMU-OpenFace
Hi Bartosz,

Great suggestions!
It's amazing how much the Torch ecosystem has grown since
I implemented OpenFace: a lot of the features I had to get
from third parties or hacks (like sanitizing and nn/cudnn
conversions) are now available in more mainstream sources.
It makes a lot of sense to use these rather than
the versions I wrote.

I need to look closer at the fb.resnet code; it looks like
they use a lot of great techniques.
It's especially great to know about the data augmentations.
Some of these, like the PCA-based noise, should help with training.
I tried some random crops for face recognition networks
almost a year ago and they hurt the accuracy since cropping
conflicts with alignment.
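
For reference, a sketch of how those transforms compose; the function names are from my reading of fb.resnet's datasets/transforms.lua, the mean/std and PCA eigenvalues/eigenvectors are its predefined ImageNet stats, and crops are deliberately left out because of the alignment issue:

    require 'image'
    local t = require 'datasets/transforms'  -- run from the fb.resnet.torch root

    local meanstd = {
       mean = { 0.485, 0.456, 0.406 },
       std  = { 0.229, 0.224, 0.225 },
    }
    local pca = {
       eigval = torch.Tensor{ 0.2175, 0.0188, 0.0045 },
       eigvec = torch.Tensor{
          { -0.5675,  0.7192,  0.4009 },
          { -0.5808, -0.0045, -0.8140 },
          { -0.5836, -0.6948,  0.4203 },
       },
    }

    -- No RandomSizedCrop here: aligned faces should keep their geometry.
    local transform = t.Compose{
       t.Lighting(0.1, pca.eigval, pca.eigvec),  -- PCA-based color noise
       t.ColorNormalize(meanstd),
       t.HorizontalFlip(0.5),
    }

    local img = image.load('aligned/face.png', 3, 'float')  -- hypothetical path
    local augmented = transform(img)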

I can help with some of these. Do you want to manage development
in the GitHub issue tracker? I can create a milestone for this
and we can assign ourselves to the issues for development and
discussions.

-Brandon.


Bartosz Ludwiczuk

Mar 9, 2016, 4:37:36 PM
to CMU-OpenFace, melg...@gmail.com, ba...@cs.cmu.edu
OK, maybe I will first modify the current code by adding the most useful stuff. I will create an issue with all the steps, which will include:
  • cudnn conversion
  • Transform class
  • shareGradient
  • multi-GPU
  • clearState -> I need to test it with and without shareGradient
  • adding in-place activations to all nets
Besides the "Transform" class, the others should be pretty easy to integrate.
About "Transform":
  • This will introduce mean and std preprocessing (I think it is better to have it than not). Should we use predefined values like in fb.resnet, or calculate these values from the chosen dataset? (A rough sketch of the dataset option is below.)
  • What about batch-represent then? It will need the same values, and it needs to use the Transform class too.
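
If we go with dataset-specific values, here is a rough sketch (the file list is a placeholder, and averaging per-image stds only approximates the global std, but it should be close enough for normalization):

    require 'image'

    -- Placeholder list; in practice, the aligned training images.
    local trainPaths = { 'aligned/a.png', 'aligned/b.png', 'aligned/c.png' }

    local mean, std = torch.zeros(3), torch.zeros(3)
    for _, p in ipairs(trainPaths) do
       local img = image.load(p, 3, 'float')  -- 3 x H x W in [0, 1]
       for c = 1, 3 do
          mean[c] = mean[c] + img[c]:mean()
          std[c]  = std[c]  + img[c]:std()
       end
    end
    mean:div(#trainPaths)
    std:div(#trainPaths)
    print(mean, std)  -- plug these into the Transform's meanstd table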

Brandon Amos

Mar 9, 2016, 6:10:26 PM
to Bartosz Ludwiczuk, CMU-OpenFace
OK, the GitHub issue looks great. You can attach commits to it
by including the string "#106" somewhere in your commit messages.

> - This will introduce mean and std preprocessing (I think it is
> better to have it than not). Should we use predefined values like in
> fb.resnet, or calculate these values from the chosen dataset?
> - What about batch-represent then? It will need the same values, and
> it needs to use the Transform class too.

I agree, color normalization with mean/std should only help.
I think using predefined values like in the resnet code would
work well and would make development easiest.

Since the same code for this is needed in batch-represent and
openface_server in the Python library, what do you think of
introducing a shared 'openface' Torch module with functionality
all of these can use?
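
Something like this hypothetical sketch (the module name, file layout, and stats are placeholders to be decided):

    -- openface/init.lua: shared by the training code, batch-represent,
    -- and the openface_server used from the Python library.
    local openface = {}

    -- Predefined per-channel stats, fb.resnet-style placeholder values.
    openface.meanstd = {
       mean = { 0.485, 0.456, 0.406 },
       std  = { 0.229, 0.224, 0.225 },
    }

    -- Normalize a 3 x H x W float image in place.
    function openface.normalize(img)
       for c = 1, 3 do
          img[c]:add(-openface.meanstd.mean[c]):div(openface.meanstd.std[c])
       end
       return img
    end

    return openface

Then batch-represent and the server could both do "local openface = require 'openface'" and call openface.normalize(img) before each forward pass, so the values can never drift apart.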

-Brandon.