Caffe and 16 bit data


Fabio Maria Carlucci

Mar 16, 2016, 10:00:30 AM
to Caffe Users
I'm attempting to train a network on 16-bit PNG files, but the loss immediately explodes. I think this is because Caffe does not natively support 16-bit data - am I right?
I found this related older post: https://groups.google.com/forum/#!topic/caffe-users/Hexwdcop_Tw , but no solution is given there.
Has anything changed in the meantime? Any ideas?
Thanks,
Fabio M.

Joshua Slocum

Mar 17, 2016, 2:16:00 PM
to Caffe Users
You can scale your images by adding a scale statement into the transform_param block of your data layer. For example:


name: "CIFAR10_full"
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "models/MonoclonalNet/TRAIN_MEAN"
    scale: 0.00391
  }
  data_param {
    source: "models/MonoclonalNet/TRAIN"
    batch_size: 100
    backend: LMDB
  }
}


Where do you see images being loaded as 8-bit in the source code? I'm pretty sure it preserves 16-bit depth. 
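If it helps, here is a rough Python sketch of what I understand the transform to do - my reading of Caffe's DataTransformer is that it applies (pixel - mean) * scale per element, so treat that (and the stand-in mean value below) as an assumption:

```python
# Hypothetical sketch of Caffe's per-pixel transform: (pixel - mean) * scale.
# The mean here is a made-up stand-in; in Caffe it comes from the mean_file.
def transform(pixel, mean, scale=0.00391):  # 0.00391 ~= 1/256
    return (pixel - mean) * scale

# An 8-bit pixel lands comfortably inside [-1, 1] after the transform:
print(transform(255, 120.0))    # ~0.528
# A 16-bit pixel with the same scale comes out roughly 256x too large:
print(transform(65535, 120.0))  # ~255.8
```

So if 16-bit values really are reaching the transform, a scale tuned for 8-bit data would leave the inputs huge, which fits the exploding loss.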



On Wed, Mar 16, 2016 at 3:53 PM, Fabio Maria Carlucci <fabiom....@gmail.com> wrote:
Hi Joshua,
thanks for the reply.
No, I didn't set it to perform pixel value scaling - how would you do that?
I followed your suggestion and reduced the learning rate - the loss now stays small but oscillates up and down, and accuracy is still at chance level after 4000 iterations. I've looked at the Caffe code, and it seems to me that it loads all images at 8-bit depth. Do you know anything about this?
Thanks,
Fabio M.

On Wed, Mar 16, 2016 at 8:50 PM, Joshua Slocum <jfsl...@gmail.com> wrote:
Have you set your data layer to perform pixel value scaling? If not, then your pixel values are ~256x larger than normal, which could cause an exploding loss if you didn't reduce your learning rate. 




Fabio Maria Carlucci

Mar 18, 2016, 3:57:41 AM
to Caffe Users
Thanks! This scaling projects the data into the 0-255 range, but is it also being discretized down to 8 bits of information?

Looking at io.cpp (https://github.com/BVLC/caffe/blob/master/src/caffe/util/io.cpp ), images are loaded with cv::imread(filename, cv_read_flag); (line 78), which, according to the OpenCV docs, loads images as 8-bit unless the appropriate flag is given.

Fabio Maria Carlucci

Mar 18, 2016, 3:58:58 AM
to Caffe Users
P.S.  The cv_read_flag is simply: 
int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
                               CV_LOAD_IMAGE_GRAYSCALE);
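Neither of those flags requests the file's native bit depth. As a sketch of the fix - assuming OpenCV's legacy flag values (CV_LOAD_IMAGE_GRAYSCALE = 0, CV_LOAD_IMAGE_COLOR = 1, CV_LOAD_IMAGE_ANYDEPTH = 2; check your headers, this is an assumption) - a 16-bit-preserving load would OR in ANYDEPTH, here mimicked in Python:

```python
# OpenCV legacy imread flag values (from opencv2/highgui; treat the exact
# constants as an assumption and verify against your OpenCV version):
CV_LOAD_IMAGE_GRAYSCALE = 0
CV_LOAD_IMAGE_COLOR = 1
CV_LOAD_IMAGE_ANYDEPTH = 2  # keep the file's native bit depth

def cv_read_flag(is_color, keep_depth=False):
    # Mirrors the ternary in io.cpp, plus an opt-in native-depth path.
    flag = CV_LOAD_IMAGE_COLOR if is_color else CV_LOAD_IMAGE_GRAYSCALE
    if keep_depth:
        flag |= CV_LOAD_IMAGE_ANYDEPTH
    return flag

print(cv_read_flag(is_color=True))                   # 1 -> 8-bit color
print(cv_read_flag(is_color=True, keep_depth=True))  # 3 -> color, native depth
```

In the actual C++, that would mean OR-ing CV_LOAD_IMAGE_ANYDEPTH into cv_read_flag before the cv::imread call - and the downstream Datum packing would also need to cope with 16-bit values.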

Adrián Galdrán

Mar 18, 2016, 7:23:03 AM
to Caffe Users
Hi!

Just a suggestion: could the scale parameter be wrong? With scale: 0.00391 ≈ 1/256, 8-bit values end up in [0,1] internally. But with 16-bit data, that same scale leaves you well outside [0,1], and that could lead to the loss exploding.
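A quick sanity check of the ranges (assuming the scale is applied by plain multiplication):

```python
# Range check: the prototxt scale vs. an illustrative 16-bit-aware one.
scale = 0.00391         # ~1/256, the value from the prototxt above
print(255 * scale)      # ~0.997   -> 8-bit data stays inside [0, 1]
print(65535 * scale)    # ~256.24  -> 16-bit data does not
print(65535 / 65536.0)  # ~0.99998 -> with scale = 1/65536 instead
```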

Cheers,

Adrian