"Real-time" data augmentation

Nicolas Petitclerc

Feb 26, 2015, 1:13:10 PM

to caffe...@googlegroups.com

Hi,

I'm interested in doing "real-time" data augmentation - adding random rotation and scaling at run time so that every image seen by the CNN is different.

I wonder if there is a way to do so now? I have seen some threads about it (https://github.com/BVLC/caffe/pull/1070), but I haven't found a solution yet.

Thanks.

yr

Mar 7, 2015, 11:03:38 AM

to caffe...@googlegroups.com

Have a look here: https://github.com/ChenglongChen/caffe-windows. It offers some general augmentations.

npit

Jun 19, 2015, 6:14:39 AM

to caffe...@googlegroups.com

Any idea how to add them to an existing Caffe installation without breaking the rest of the code?

I don't think simply adding the layer would be enough (multi-scale classification, etc.).

Axel Angel

Jun 19, 2015, 7:59:09 AM

to caffe...@googlegroups.com

If I were you, I would try a custom layer. You can easily create a Python layer, put it just after the input, and do any transformations you want. The transformations can be random or reproducible (based on a seed). If you train the network for multiple epochs, the next time an image is fed, it's transformed differently, thus data-augmentation built-in.
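To make the idea concrete, here is a minimal sketch of a seeded random rotation and scaling that such a layer could apply to each image on every pass. It is plain Python with no Caffe dependency; the function names, the parameter defaults, and the nearest-neighbour sampling are assumptions chosen for illustration, not something taken from Caffe or from the repository linked above.

```python
import math
import random


def random_rotate_scale(image, rng, max_angle_deg=15.0, scale_range=(0.9, 1.1)):
    """Return a randomly rotated and scaled copy of a 2D image (list of rows).

    Uses inverse mapping with nearest-neighbour sampling around the image
    centre; output pixels that map outside the source are filled with 0.
    """
    height, width = len(image), len(image[0])
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0

    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map the output pixel back to its location in the source image.
            dx, dy = (x - cx) / scale, (y - cy) / scale
            src_x = int(round(cos_a * dx + sin_a * dy + cx))
            src_y = int(round(-sin_a * dx + cos_a * dy + cy))
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def augmented_epochs(images, num_epochs, seed=0):
    """Yield (epoch, augmented_images); transforms are redrawn every epoch."""
    rng = random.Random(seed)  # a fixed seed makes the whole sequence reproducible
    for epoch in range(num_epochs):
        yield epoch, [random_rotate_scale(img, rng) for img in images]
```

Because the generator keeps drawing from one random stream, each epoch draws a new angle and scale for every image, while rerunning with the same seed reproduces the exact sequence. In a real layer the same function would presumably be called from the forward pass on each image in the batch.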

npit

Jun 19, 2015, 8:26:33 AM

to caffe...@googlegroups.com

I am using the C++ sources.

> the next time an image is fed, it's transformed differently, thus data-augmentation built-in

That scenario would produce a single score per augmentation though, instead of averaging the scores of the augmentations for every image.

Axel Angel

Jun 19, 2015, 5:29:53 PM

to caffe...@googlegroups.com

I thought it was for training only because data-augmentation is for training… or am I wrong? So if it's for classification, why not write a small C++ program that applies your transformations? It should be quite short.

I don't really know. Maybe you could put it into the pipeline, but this again requires a custom transformation layer: copy your input n times, feed the copies into one copy of your pipeline with a pre-transformation layer, then merge the predictions. The first solution is easier.
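As a rough illustration of the "copy, transform, merge" idea, the sketch below averages class scores over several transformed copies of one image. It is plain Python; `predict` stands in for whatever runs the network on a single image, and `transforms`, `flip_horizontal`, and `identity` are hypothetical helpers, not Caffe API.

```python
def average_over_augmentations(image, transforms, predict):
    """Average class scores over several transformed copies of one image.

    `transforms` is a list of functions image -> image, and `predict` is a
    function image -> list of class scores. Both are placeholders here.
    """
    if not transforms:
        raise ValueError("need at least one transform")
    totals = None
    for transform in transforms:
        scores = predict(transform(image))
        if totals is None:
            totals = [0.0] * len(scores)
        for i, score in enumerate(scores):
            totals[i] += score
    return [total / len(transforms) for total in totals]


def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def identity(image):
    """Return the image unchanged, so the original counts as one of the copies."""
    return image
```

Calling `average_over_augmentations(img, [identity, flip_horizontal], predict)` yields one merged score vector per image rather than one score per augmentation. Whether a plain mean is the right merge (as opposed to, say, a maximum or a vote) is a design choice this sketch does not settle.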

npit

Jun 24, 2015, 4:27:15 AM

to caffe...@googlegroups.com

> I thought it was for training only because data-augmentation is for training… or am I wrong? So if it's for classification, why not write a small C++ program that applies your transformations? It should be quite short.

Yes, but that way I could only run classification "manually", instead of seeing the test results every K iterations to follow the training progress.

> I don't really know. Maybe you could put it into the pipeline, but this again requires a custom transformation layer: copy your input n times, feed the copies into one copy of your pipeline with a pre-transformation layer, then merge the predictions. The first solution is easier.