What does the transformation do?

Arka Sadhu

23 May 2017, 19:18:59

to Caffe Users

Hi

I am new to Caffe, and I am following the tutorial http://nbviewer.jupyter.org/github/BVLC/caffe/blob/tutorial/examples/completed/00-caffe-intro.ipynb.

In [14]

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension
transformer.set_mean('data', mu)            # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255)      # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR

After this, the transformer is used to preprocess the image:

transformed_image = transformer.preprocess('data', image)

I wanted to know why this is necessary. Also, how do I get back the original image from the transformed image? [It's OK if it's lossy.]

Thank you

Nate Ting

24 May 2017, 00:22:17

to Caffe Users

I think this has something to do with the way Caffe deals with input data; just follow the data structure Caffe wants to be fed.

Arka Sadhu

24 May 2017, 12:16:53

to Caffe Users

Just to confirm: this transformation will be required for every training image as well as every test image, right?

On Tuesday, May 23, 2017 at 9:22:17 PM UTC-7, Nate Ting wrote:

I think this has something to do with the way Caffe deals with input data; just follow the data structure Caffe wants to be fed.

Przemek D

25 May 2017, 02:28:20

to Caffe Users

Caffe expects data in a specific format: CHW dimension order (channel, height, width), BGR channel order, and a 0-255 range. caffe/io.py, however, uses skimage to load images, which returns HWC, RGB images in the 0-1 range. OpenCV would also load HWC, but already in BGR order and the 0-255 range. So depending on which library you use to load images, you will need different transformations to make the data compatible with Caffe. Additionally, you might want to do things like mean subtraction. Transformer is a convenience class that packs all those transformations so you can do them in a single call to preprocess(); you can perform the inverse transformation with deprocess().
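To make the individual steps concrete, here is a minimal NumPy-only sketch of what the tutorial's transformer settings amount to, together with the inverse. It assumes the steps are applied in the order transpose, channel swap, raw scale, mean subtraction (which is how I understand pycaffe's Transformer to work) and it skips resizing. `preprocess_sketch` and `deprocess_sketch` are hypothetical helper names, not Caffe functions, and the image and mean values are placeholders.

```python
import numpy as np

def preprocess_sketch(image, mean_bgr):
    """Emulate the tutorial's Transformer settings on an HWC, RGB, [0, 1] image.

    Hypothetical helper, not part of Caffe; no resizing is done here.
    """
    x = image.astype(np.float32)          # work on a float32 copy
    x = x.transpose(2, 0, 1)              # HWC -> CHW
    x = x[[2, 1, 0], :, :]                # RGB -> BGR
    x = x * 255.0                         # [0, 1] -> [0, 255]
    x = x - mean_bgr[:, None, None]       # subtract per-channel mean (BGR order)
    return x

def deprocess_sketch(blob, mean_bgr):
    """Undo preprocess_sketch: returns an HWC, RGB image in [0, 1]."""
    x = blob + mean_bgr[:, None, None]    # add the mean back
    x = x / 255.0                         # [0, 255] -> [0, 1]
    x = x[[2, 1, 0], :, :]                # BGR -> RGB (the swap is its own inverse)
    x = x.transpose(1, 2, 0)              # CHW -> HWC
    return x

# Placeholder data: a random "image" and made-up per-channel means.
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))                                  # HWC, RGB, values in [0, 1)
mean_bgr = np.array([104.0, 117.0, 123.0], dtype=np.float32)   # assumed values, BGR order

blob = preprocess_sketch(image, mean_bgr)
restored = deprocess_sketch(blob, mean_bgr)

print(blob.shape)                                  # (3, 4, 6)
print(np.allclose(restored, image, atol=1e-5))     # True, up to float32 rounding
```

The round trip is only lossy through float32 rounding here; with a real net, any resize done during preprocessing would not be undone by the inverse.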
And yes, if you trained your net on data that was preprocessed in some way, then you most likely need to apply the same transformation for testing/deployment as well. If you forget to subtract the mean or swap channels, you will simply get weird results, but forgetting the transposition might lead to shape mismatches etc.
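The difference between the two failure modes can be illustrated without Caffe. In this rough sketch a plain NumPy array stands in for `net.blobs['data'].data` (an assumed N x C x H x W buffer with batch size 1), and the image is random placeholder data:

```python
import numpy as np

# Stand-in for net.blobs['data'].data with batch size 1 (assumed N x C x H x W layout).
blob = np.zeros((1, 3, 4, 6), dtype=np.float32)

rng = np.random.default_rng(0)
image_hwc = rng.random((4, 6, 3)).astype(np.float32)   # HWC, as an image loader returns it

# Forgetting the transpose: shape (4, 6, 3) cannot be assigned into (3, 4, 6).
try:
    blob[0, ...] = image_hwc
    transpose_error = None
except ValueError as err:
    transpose_error = err
print("missing transpose raised an error:", transpose_error is not None)

# Forgetting the channel swap: the shape is fine, so nothing complains,
# but channels 0 and 2 end up exchanged relative to what the net expects.
chw_rgb = image_hwc.transpose(2, 0, 1)
chw_bgr = chw_rgb[[2, 1, 0], :, :]
blob[0, ...] = chw_rgb                                  # accepted without complaint
silently_wrong = not np.allclose(chw_rgb, chw_bgr)
print("missing channel swap silently changes the input:", silently_wrong)
```

The same reasoning applies to a forgotten mean subtraction or raw scale: the shapes still match, so the only symptom is worse predictions.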

Arka Sadhu

25 May 2017, 12:13:53

to Caffe Users

Thanks a lot, that does explain a lot.
